Recently, while building a collaborative drawing web application with WebSocket for one of my livestreams, I discovered just how efficient it is to document a WebSocket server using the AsyncAPI specification in a spec-first approach. But what exactly do I mean by “spec-first”? 🤔
What Do I Mean by Spec-First?
The spec-first API development approach involves designing the API using an API specification before implementing it. This method offers significant advantages, such as reducing the time needed to build the actual API, improving communication with stakeholders, and producing higher-quality APIs overall. But let’s save the deep dive into spec-first for another time and get back on track! 😄
So Why WebSocket and AsyncAPI Instead of OpenAPI?
OpenAPI isn’t ideal for my use case because it’s specifically designed for REST APIs. WebSocket, on the other hand, differs significantly from traditional HTTP. It provides a two-way communication channel over a single Transmission Control Protocol (TCP) connection, which OpenAPI doesn’t support.
In simpler terms, with a REST API you must send a request to receive a response, so imitating a WebSocket-style live connection would require repeatedly requesting updates from the server at intervals (a process known as polling). WebSocket does the opposite. It keeps the connection open between server and client, allowing the server to send data to the client without waiting for a request.
So, why would I use OpenAPI for that? Now you see why AsyncAPI is the better fit. Since WebSocket enables an event-driven connection between client and server, we need an API specification that supports this kind of interaction—and that’s where AsyncAPI comes in.
Let’s explore why combining AsyncAPI with WebSocket is such a powerful approach.
The Intersection
As I mentioned earlier, WebSocket enables an event-driven connection between client and server, meaning it operates asynchronously. AsyncAPI offers a standardized way to define these asynchronous APIs, making it a perfect match. This combination enhances real-time application development by ensuring consistent, reliable message formats and enabling instant, bidirectional data exchange between client and server.
Now, let’s dive deeper into this powerful intersection! 🐬
Clear and Concise Message Format and Event Types
Defining your WebSocket API with AsyncAPI allows you to leverage AsyncAPI’s schema definitions, ensuring a structured and consistent approach to handling data exchange across WebSocket connections. This reduces misunderstandings about message formats and event types, creating a smoother, more reliable communication flow.
Message Schema Validation
Because message payloads are described with schemas in your AsyncAPI document, your WebSocket API can validate real-time messages against those predefined schemas at runtime, helping to catch malformed messages early in development. This approach promotes better data consistency across your application.
Improved Architectural Planning
If, like me, you enjoy designing your API before implementation, using AsyncAPI with your WebSocket API supports an API-first approach. It enables you to thoughtfully design your API and identify message patterns early on, making it easier and faster to plan for scaling.
Leveraging the Tooling Ecosystem
As the industry standard for defining asynchronous APIs, AsyncAPI enables a robust ecosystem of tools, some of which are maintained by the AsyncAPI initiative. This includes capabilities like generating code in multiple languages, creating deployment-ready documentation, and setting up mock servers for development with tools like Microcks.
Now that you've seen some of the powerful things this intersection creates, let's take a look at the key concepts in AsyncAPI for our WebSocket API.
Key Concepts in AsyncAPI for WebSocket
If you've used WebSocket before, you’re likely familiar with the term channel (sometimes referred to as “topics” or “paths”), right? If not, here’s a quick overview: channels in WebSocket act as specific routes within the WebSocket connection, enabling messages to be sent and received across different sections of the connection.
For instance, if we have channels named general and members, messages can be sent and received on either of these channels independently. So, if I want to receive messages specifically from the members channel, I just need to listen to that channel, and I’ll get all incoming messages tagged for it. Channels help organize communication within a WebSocket, making it easy to manage different types of messages effectively.
Now let's look at what channels look like in an AsyncAPI document.
Channels
AsyncAPI channels allow us to establish bidirectional communication between message senders and receivers.
Channels in AsyncAPI are based on a simple idea: senders send messages with different contents addressed to different channels, and receivers subscribe to these channels to receive those messages. But AsyncAPI channels are more than just a message highway. They are made up of a number of different elements that work together to make communication between senders and receivers smooth. Some of these elements include:
- Address: An optional string that specifies the channel's address. This could be a topic name, routing key, event type, or path.
- Title: A friendly, descriptive title for the channel.
- Messages: The list of message types that can be sent to this channel, ready to be received by any subscriber at any time.
- Bindings: A set of WebSocket-specific info that customizes the connection details.
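As a rough sketch of how these elements fit together, here is how the general and members channels from the earlier example might be declared. The addresses, titles, and the announcement message are hypothetical placeholders for illustration only; they are not part of the chat API we build later in this post.

```yaml
channels:
  general:
    address: /general              # hypothetical address
    title: General channel
    messages:
      announcement:                # hypothetical message name
        $ref: '#/components/messages/announcement'
  members:
    address: /members              # hypothetical address
    title: Members channel
    messages:
      announcement:
        $ref: '#/components/messages/announcement'
    bindings:
      ws:
        method: GET                # WebSocket-specific detail: HTTP method used for the handshake
```

Each channel carries its own address, title, and set of messages, while the bindings section holds the protocol-specific details.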
Now that we've seen what makes up a WebSocket channel in AsyncAPI, let's take a look at the next key concept, which is messages.
Messages
I mean, really—what’s the point of it all? Don’t worry, this isn’t an existential crisis! 😄
I’m talking about data exchange! In an event-driven system, exchanging messages is at the core of everything we’re building. AsyncAPI helps us create a structured, consistent approach to handling this exchange across WebSocket connections.
In AsyncAPI, a message is the key mechanism by which information flows between senders and receivers via channels. And since messages are flexible, they can support all kinds of interaction patterns such as events, commands, requests, or responses.
Just like channels, WebSocket messages in AsyncAPI are made up of various elements, such as:
- Name: A friendly, descriptive name for the message.
- Summary: A short summary of what the message is about.
- Description: A verbose explanation of the message.
- Payload: The schema that defines the structure of the message body, including its properties and which of them are required.
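To picture how these elements sit together, here is a minimal sketch of a message definition. The greeting message and its text property are made up for illustration and do not appear in the chat API built later in this post.

```yaml
components:
  messages:
    greeting:                      # hypothetical message key
      name: greeting
      summary: A greeting sent when a user says hello
      description: >-
        A longer explanation of when the greeting is sent and how
        receivers are expected to handle it.
      payload:
        type: object
        properties:
          text:
            type: string
            description: The greeting text
        required:
          - text
```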
Now let's take a look at another key concept: operations.
Operations
Operations are one of my favorite parts of the AsyncAPI specification—and for good reason! They were part of the latest additions to the spec, making it possible to re-use a channel in ways that weren’t possible before.
In AsyncAPI, an operation defines the specific actions that can occur within a channel. Basically, it tells you if your application will be sending or receiving a message in that channel, making message flow clear and structured.
Operations are made up of a few important elements:
- Action: Either send or receive. send indicates the app will send a message to the channel, while receive means the app expects to receive a message.
- Channel: The specific channel where the operation will happen.
- Reply: The definition of the reply in a request-reply operation.
- Title: A friendly, descriptive name for the operation.
- Summary: A quick summary of what the operation is all about.
- Description: A more detailed explanation of the operation’s purpose.
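Since the reply element doesn't show up again in this post, here is a small sketch of an operation that uses all of the elements above, including a reply. The requestHistory operation, the history channel, and its historyRequest and historyResponse messages are all hypothetical names I'm assuming for illustration.

```yaml
operations:
  requestHistory:                  # hypothetical operation name
    title: Request chat history
    summary: Ask for recent messages
    description: Sends a history request and expects a reply containing recent messages.
    action: send
    channel:
      $ref: '#/channels/history'   # hypothetical channel
    messages:
      - $ref: '#/channels/history/messages/historyRequest'
    reply:
      channel:
        $ref: '#/channels/history'
      messages:
        - $ref: '#/channels/history/messages/historyResponse'
```

The reply block names the channel the response is expected on and which of that channel's messages count as a valid response.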
With operations, you get more control and clarity over message flow within each channel, making AsyncAPI even more powerful for building event-driven systems!
These three concepts are integral when documenting our WebSocket server using AsyncAPI.
The Complete Breakdown
Now that we’ve seen how AsyncAPI can streamline real-time communication and simplify managing WebSocket channels, let's take a closer look at what a complete AsyncAPI document would look like for a simple chat application, using the key concepts we've outlined.
Step 1 - Defining Basic Information About Our WebSocket API
First, we provide some essential information about our API, including the server details for client connections.
asyncapi: "3.0.0"

info:
  title: A simple chat application
  version: 1.0.0
  description: A simple real-time chat API using WebSocket protocol

servers:
  development:
    host: localhost:8787
    description: Development Websocket broker.
    protocol: ws
Step 2 - Defining Our WebSocket Channel
As we mentioned earlier, AsyncAPI channels enable bidirectional communication between senders and receivers. Let’s define our channel below:
channels:
  chat:
    address: /
    title: Users channel
Notice we haven’t included message details yet. To keep things organized, we’ll use components to define reusable messages and then reference them in our channel.
Step 3 - Creating a component
Components in AsyncAPI hold a set of reusable objects for different aspects of the AsyncAPI specification. An object defined in components has no effect on your API unless it is explicitly referenced from a property outside the components object.
Just like the rest of the key concepts I mentioned earlier, components have a set of fields that can be defined, such as the following:
- Messages: An object that holds reusable message objects.
- Channels: An object that holds reusable channel objects.
- Operations: An object that holds reusable operation objects.
- SecuritySchemes: An object that holds reusable security scheme objects.
- Schemas: An object that holds reusable schema objects.
Now, because we don't want our chat channel to look overwhelming and difficult to read, we are going to create our message in the components object.
components:
  messages:
    chat:
      description: A message sent in the chat room
      payload:
        type: object
        properties:
          messageId:
            type: string
            format: uuid
            description: Unique identifier for the message
          senderId:
            type: string
            description: ID of the user sending the message
          content:
            type: string
            maxLength: 1000
            description: The message content
          timestamp:
            type: string
            format: date-time
            description: Time when the message was sent
        required:
          - messageId
          - senderId
          - content
          - timestamp
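To make that schema a bit more concrete, a message can also carry an examples list showing what a valid payload might look like. The sketch below is only an illustration: the example name and every sample value (the UUID, the user ID, the text, the timestamp) are invented.

```yaml
components:
  messages:
    chat:
      # description and payload as defined above
      examples:
        - name: simpleChatMessage            # hypothetical example name
          summary: A short text message
          payload:
            messageId: 3f2b8c1e-5d4a-4c9b-8e7f-1a2b3c4d5e6f   # invented sample UUID
            senderId: user-42                                 # invented sample ID
            content: Hello, everyone!
            timestamp: '2024-01-01T12:00:00Z'
```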
This message structure includes required fields like messageId, senderId, content, and timestamp. Now, let’s link it to our channel.
Step 4 - Adding Messages to Our Channel and Referencing Components
To make the chat message available in our channel, we’ll add it to the channel's messages section and reference our defined component.
channels:
  chat:
    address: /
    title: Users channel
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'
With our message now tied to the channel, the final step is to specify the type of operation that can be performed within this channel. This structure allows for clear, consistent message flow and easy extensibility as your API grows!
Step 5 - Defining our chat channel Operation
The operation is critical to our API because it specifies what kind of action can be executed in a given channel. So now we need to create an operation for our chat channel, as follows:
operations:
  sendMessage:
    summary: Send a chat message
    description: Allows users to send messages to the chat room
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'
In the definition above, we created our first operation, called sendMessage, with a send action, and made it available in the chat channel. This basically means we've enabled connected clients to send a message to the chat channel, and not just any kind of message, but specifically the chatMessage.
If I attempt to reference a message that isn't included in the list of messages for the chat channel, as shown below...
operations:
  sendMessage:
    summary: Send a chat message
    description: Allows users to send messages to the chat room
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/hello'
This will fail because my chat channel has no message named hello, even if I have a hello message defined in my components.
A good thing to keep in the back of your mind when defining an operation is that every message you assign to it has to be available in the linked channel's messages.
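If I did want an operation to use the hello message, one way to fix the failing reference would be to expose it on the channel first. This is just a sketch: it assumes a hello message already exists under components, and the sendHello operation name is hypothetical.

```yaml
channels:
  chat:
    address: /
    title: Users channel
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'
      hello:                                       # expose the component message on the channel first
        $ref: '#/components/messages/hello'        # assumes this component message exists

operations:
  sendHello:                                       # hypothetical operation name
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/hello'     # now points at a message the channel declares
```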
Now that we've created our first operation, which allows us to send a message, we also need another operation that allows us to receive one. We do that almost the same way as for sending, except that instead of send, we use the receive action, as seen below.
operations:
  getMessage:
    summary: Receive a chat message
    description: Allows users to receive messages from the chat room
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'
With this implementation, we have a fully functional AsyncAPI document, but let's go a few steps further.
Step 6 - Reusing an Existing Message Component
Let’s say we want our server to notify users whenever someone joins or leaves the chat. How would we approach this?
First, we define the new message in our components section. This message will hold information about the user joining or leaving.
components:
  messages:
    chat:
      ...
    status:
      payload:
        type: object
        properties:
          userId:
            type: string
            description: ID of the user that joined or left
          type:
            type: string
            enum:
              - join
              - leave
          timestamp:
            type: string
            format: date-time
        required:
          - userId
          - type
          - timestamp
Here, we’ve created a new status message to capture details about users joining or leaving.
Next, let’s add this message to our channel, so our server can broadcast it as needed:
channels:
  chat:
    address: /
    title: Users channel
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'
      userStatus: # newly added channel message
        $ref: '#/components/messages/status'
Finally, we need to define two operations within our channel: one for notifying when a user joins (userJoin) and another for when they leave (userLeave). Here’s how:
operations:
  sendMessage:
    ...
  userJoin:
    summary: User join notification
    description: Notifies when a new user joins the chat room
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userStatus'
  userLeave:
    summary: User leave notification
    description: Notifies when a user leaves the chat room
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userStatus'
In this setup, both the userJoin and userLeave operations use the same userStatus message structure, saving time and reducing redundancy!
Step 7 - Adding Authentication to Our API
Securing our API is critical, and AsyncAPI supports defining security schemes to specify the authentication methods needed to connect.
The AsyncAPI Security Scheme object lets you define one or more of the available security scheme types, such as API key, HTTP authentication, HTTP API key, OAuth2, and so on.
Let's see how to declare a security scheme for our WebSocket server using the HTTP API key scheme. To secure the server, we first define the scheme in our components:
components:
  messages:
    ....
  securitySchemes:
    apiKeyHeader:
      type: httpApiKey
      in: header
      name: X-API-Key
      description: API key passed in header
Here, apiKeyHeader is our security scheme, specifying that the key should be included in the header under the name X-API-Key.
Now, let’s associate this security scheme with our WebSocket server so it requires authorization:
servers:
  development:
    host: localhost:8787
    description: Development Websocket broker.
    protocol: ws
    security: # newly added line
      - $ref: '#/components/securitySchemes/apiKeyHeader'
As you can see, we added a security property to the development server. Notice that I'm specifying it as an array: you can list multiple security schemes in the security property, and only one of them needs to be satisfied to authorize a connection. In our case we only need one, so let's roll with that.
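For illustration, here is a sketch of what listing more than one scheme could look like. The bearerToken scheme is a hypothetical addition that our chat API does not actually use; I'm assuming an HTTP bearer scheme purely to show a second entry in the array.

```yaml
servers:
  development:
    host: localhost:8787
    protocol: ws
    security:
      - $ref: '#/components/securitySchemes/apiKeyHeader'
      - $ref: '#/components/securitySchemes/bearerToken'   # hypothetical second scheme

components:
  securitySchemes:
    apiKeyHeader:
      type: httpApiKey
      in: header
      name: X-API-Key
    bearerToken:                  # hypothetical, not part of our chat API
      type: http
      scheme: bearer
      bearerFormat: JWT
```

With this setup, a client satisfying either scheme would be allowed to connect.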
Step 8 - Providing Protocol-Specific Information
Remember when we discussed bindings in the Channel section? These bindings allow us to add WebSocket-specific details to customize the connection.
For instance, if we want to allow users to connect to multiple rooms simultaneously and send messages to any of them, we need a more efficient approach than the traditional method. Typically, a channel with a parameter like /{roomId} would be created. However, this approach has a major drawback: for every room a user tries to join, a new connection has to be established, which doesn't align well with our use case. Instead, we can leverage channel bindings.
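For comparison, the parameter-based approach might look roughly like the sketch below. The room channel key is a hypothetical name; this is the approach we are choosing not to use, shown only to make the contrast clear.

```yaml
channels:
  room:                                   # hypothetical channel key
    address: '/{roomId}'
    parameters:
      roomId:
        description: The unique identifier of the chat room
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'
```

Because the room ID is part of the address, each room maps to a different address, and a client would need a separate connection per room.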
Since bindings are protocol-specific, we can tailor the implementation to WebSocket. Instead of relying on parameters, we’ll extend our chat channel by including roomIds as a query parameter, as shown below:
chat:
  address: /
  bindings:
    ws:
      query:
        type: object
        properties:
          roomIds:
            type: string
            description: Comma-separated list of chat room identifiers
            # one or more room IDs separated by commas, e.g. room1,room2,room3
            pattern: '^[a-zA-Z0-9-]+(,[a-zA-Z0-9-]+)*$'
        additionalProperties: false
By adding these bindings, users can establish a connection once to the / address and use the same connection to join multiple rooms by simply including the list of rooms in the roomIds query parameter, which would look like this: /?roomIds=room1,room2,room3. This approach allows a single connection to be used across various chat rooms, making it ideal for exchanging messages in multiple channels simultaneously.
Step 9 - Bringing Everything together
We've finally written a complete AsyncAPI document for our chat application, and this is what it looks like:
asyncapi: 3.0.0
info:
  title: Simple Chat API
  version: 1.0.0
  description: A simple real-time chat API using WebSocket protocol

servers:
  production:
    host: chat.example.com
    protocol: ws
    description: Production server
    security:
      - $ref: '#/components/securitySchemes/apiKeyHeader'

channels:
  chat:
    address: /
    bindings:
      ws:
        query:
          type: object
          properties:
            roomIds:
              type: string
              description: Comma-separated list of chat room identifiers
              # one or more room IDs separated by commas, e.g. room1,room2,room3
              pattern: '^[a-zA-Z0-9-]+(,[a-zA-Z0-9-]+)*$'
          additionalProperties: false

    messages:
      chatMessage:
        $ref: '#/components/messages/chat'

      userJoin:
        description: Event when a user joins the chat room
        $ref: '#/components/messages/status'

      userLeave:
        description: Event when a user leaves the chat room
        $ref: '#/components/messages/status'

operations:
  sendMessage:
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'
    summary: Send a chat message
    description: Allows users to send messages to the chat room

  getMessage:
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'
    summary: Receive chat messages
    description: Allows users to receive messages from the chat room

  userJoin:
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userJoin'
    summary: User join notification
    description: Notifies when a new user joins the chat room

  userLeave:
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userLeave'
    summary: User leave notification
    description: Notifies when a user leaves the chat room

components:
  messages:
    chat:
      description: A message sent in the chat room
      payload:
        type: object
        properties:
          messageId:
            type: string
            format: uuid
            description: Unique identifier for the message
          senderId:
            type: string
            description: ID of the user sending the message
          content:
            type: string
            maxLength: 1000
            description: The message content
          timestamp:
            type: string
            format: date-time
            description: Time when the message was sent
        required:
          - messageId
          - senderId
          - content
          - timestamp

    status:
      payload:
        type: object
        properties:
          userId:
            type: string
            description: ID of the user that joined or left
          type:
            type: string
            enum:
              - join
              - leave
          timestamp:
            type: string
            format: date-time
        required:
          - userId
          - type
          - timestamp

  securitySchemes:
    apiKeyHeader:
      type: httpApiKey
      in: header
      name: X-API-Key
      description: API key passed in header
And since we followed the spec-first approach, we can do a lot of interesting things with this document, such as:
- Generate Documentation: Using our AsyncAPI document above, we can automatically generate rich, interactive documentation that makes our API easy for anyone to understand and use. With tools like AsyncAPI Studio, you can visualize our API and view channel information, messages, and operations, all without leaving the browser.
- Code Generation: Using the AsyncAPI CLI and its generator templates, we can generate code in a number of languages directly from our AsyncAPI document. This means we can generate client or server code and models, speeding up the development process and reducing the risk of inconsistencies.
- API Contract Testing: Using our AsyncAPI document, we can perform some contract testing that ensures that our system remains aligned with its design, preventing unexpected behavior. With tools like Microcks, we can test and mock our API based on our AsyncAPI specification, so we're sure our API behaves as expected, even before it’s fully implemented.
Running the AsyncAPI CLI with the HTML template, using the command asyncapi generate fromTemplate ./asyncapi.yaml @asyncapi/html-template@3.0.0 --use-new-generator, gives us a fully functional, production-ready website for our API documentation. This generated site provides a visually appealing and interactive way to explore our AsyncAPI definition, as shown in the screenshot below.
Additionally, with the help of AsyncAPI Studio, you can easily preview your AsyncAPI document in a user-friendly interface. Simply click on this URL to explore the document live. This makes it even more convenient to review and refine your API definition in real-time!
Putting everything we've learnt together, we have our AsyncAPI document ready to go!
Conclusion
Documenting your WebSocket API with AsyncAPI brings clarity and structure to designing and managing your APIs. By standardizing message formats, channels, and operations, AsyncAPI simplifies the process of building scalable, consistent, and reliable APIs.
AsyncAPI's structured approach equips teams with a collaborative framework that enhances efficiency and reduces development friction, making it a cornerstone in modern API design.
References
- For a detailed walkthrough, refer to my livestream on building a chat application from scratch using the AsyncAPI specification.
- Dive deeper with this blog post on using WebSocket with AsyncAPI.
- Join the conversation and connect with the AsyncAPI community in our Slack workspace.