AsyncAPI & WebSocket: A Match Made from Heaven?

Azeez Elegbede

·18 min read

Recently, while building a collaborative drawing web application with WebSocket for one of my livestreams, I discovered just how efficient it is to document a WebSocket server using the AsyncAPI specification in a spec-first approach. But what exactly do I mean by “spec-first”? 🤔

What Do I Mean by Spec-First?

API spec first diagram

The spec-first API development approach involves designing the API using an API specification before implementing it. This method offers significant advantages, such as reducing the time needed to build the actual API, improving communication with stakeholders, and producing higher-quality APIs overall. But let’s save the deep dive into spec-first for another time and get back on track! 😄

So Why WebSocket and AsyncAPI Instead of OpenAPI?

Asyncapi-OpenAPI

OpenAPI isn’t ideal for my use case because it’s specifically designed for REST APIs. WebSocket, on the other hand, differs significantly from traditional HTTP. It provides a two-way communication channel over a single Transmission Control Protocol (TCP) connection, which OpenAPI doesn’t support.

In simpler terms, with a REST API you must send a request to receive a response, so approximating a WebSocket-style connection would require repeatedly pinging the server at intervals (a process known as polling). WebSocket does the opposite: it keeps the connection open between server and client, allowing the server to send data to the client without waiting for a request.

So, why would I use OpenAPI for that? Now you see why AsyncAPI is the better fit. Since WebSocket enables an event-driven connection between client and server, we need an API specification that supports this kind of interaction—and that’s where AsyncAPI comes in.

Let’s explore why combining AsyncAPI with WebSocket is such a powerful approach.

The Intersection

As I mentioned earlier, WebSocket enables an event-driven connection between client and server, meaning it operates asynchronously. AsyncAPI offers a standardized way to define these asynchronous APIs, making it a perfect match. This combination enhances real-time application development by ensuring consistent, reliable message formats and enabling instant, bi-directional data exchange between client and server.

Now, let’s dive deeper into this powerful intersection! 🐬

Clear and Concise Message Format and Event Types

Defining your WebSocket API with AsyncAPI allows you to leverage AsyncAPI’s schema definitions, ensuring a structured and consistent approach to handling data exchange across WebSocket connections. This reduces misunderstandings about message formats and event types, creating a smoother, more reliable communication flow.

Message Schema Validation

AsyncAPI allows your WebSocket API to validate real-time messages against predefined schemas at runtime, helping to catch errors early in the development stage. This approach promotes better data consistency across your application.

Improved Architectural Planning

If, like me, you enjoy designing your API before implementation, using AsyncAPI with your WebSocket API supports an API-first approach. It enables you to thoughtfully design your API and identify message patterns early on, making it easier and faster to plan for scaling.

Leveraging the Tooling Ecosystem

AsyncAPI Ecosystem

As the industry standard for defining asynchronous APIs, AsyncAPI enables a robust ecosystem of tools, some of which are maintained by the AsyncAPI Initiative. This includes capabilities like generating code in multiple languages, creating deployment-ready documentation, and setting up mock servers for development with tools like Microcks.

Now that you've seen some of the powerful things this intersection creates, let's take a look at the key concepts in AsyncAPI for our WebSocket API.

Key Concepts in AsyncAPI for WebSocket

If you've used WebSocket before, you’re likely familiar with the term channel (sometimes referred to as “topics” or “paths”), right? If not, here’s a quick overview: channels in WebSocket act as specific routes within the WebSocket connection, enabling messages to be sent and received across different sections of the connection.

For instance, if we have channels named general and members, messages can be sent and received on either of these channels independently. So, if I want to receive messages specifically from the members channel, I just need to listen to that channel, and I’ll get all incoming messages tagged for it. Channels help organize communication within a WebSocket, making it easy to manage different types of messages effectively.

Now let's look at what channels look like in an AsyncAPI document.

Channels

AsyncAPI channels allow us to establish bi-directional communication between message senders and receivers. Yeah, that's it.

Channels in AsyncAPI are based on a simple idea: senders send messages with different contents addressed to different channels, and receivers subscribe to these channels to receive those messages. But AsyncAPI channels are more than just a message highway. They are made up of a number of elements that work together to make communication between senders and receivers smooth. Some of these elements include:

  • Address: An optional string that specifies the channel's address. This could be a topic name, routing key, event type, or path.
  • Title: A friendly, descriptive title for the channel.
  • Messages: The list of message types that can be sent to this channel, ready to be received by any subscriber at any time.
  • Bindings: A set of WebSocket-specific info that customizes the connection details.
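
Putting those elements together, a channel entry might look like the sketch below. The channel name, the announcement message, and the token query parameter are illustrative assumptions rather than part of any real API; the chat application later in this post builds up a fuller version step by step.

```yaml
channels:
  general:                      # hypothetical channel name
    address: /general           # optional: the path clients connect to
    title: General channel      # friendly, descriptive title
    messages:
      announcement:             # hypothetical message, defined under components below
        $ref: '#/components/messages/announcement'
    bindings:
      ws:                       # WebSocket-specific connection details
        query:
          type: object
          properties:
            token:              # hypothetical query parameter
              type: string

components:
  messages:
    announcement:
      payload:
        type: string
```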

Now that we've seen how to describe a WebSocket channel in AsyncAPI, let's move on to the next key concept: messages.

Messages

I mean, really—what’s the point of it all? Don’t worry, this isn’t an existential crisis! 😄

I’m talking about data exchange! In an event-driven system, exchanging messages is at the core of everything we’re building. AsyncAPI helps us create a structured, consistent approach to handling this exchange across WebSocket connections.

In AsyncAPI, a message is the key mechanism by which information flows between senders and receivers via channels. And since messages are flexible, they can support all kinds of interaction patterns such as events, commands, requests, or responses.

Just like channels, WebSocket messages in AsyncAPI are also made up of various elements, such as:

  • Name: A friendly, descriptive name for the message.
  • Summary: A short summary of what the message is about.
  • Description: A verbose explanation of the message.
  • Payload: The schema that defines the structure and required properties of your message.

And here's an example of what messages look like in an AsyncAPI document.
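
As a minimal sketch of those elements, the fragment below declares a single reusable message. The values are illustrative assumptions that loosely mirror the chat message built in Step 3 later in this post, not a definitive definition.

```yaml
components:
  messages:
    chat:
      name: chat                                  # machine-friendly name (assumed value)
      summary: A chat message                     # short summary (assumed value)
      description: A message sent by a user in the chat room
      payload:                                    # schema for the message body
        type: object
        properties:
          content:
            type: string
            description: The message content
        required:
          - content
```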

Now let's take a look at another key concept: operations.

Operations

Operations are one of my favorite parts of the AsyncAPI specification, and for good reason! They became a standalone, top-level part of the document in AsyncAPI 3.0, making it possible to reuse a channel in ways that weren't possible before.

In AsyncAPI, an operation defines the specific actions that can occur within a channel. Basically, it tells you if your application will be sending or receiving a message in that channel, making message flow clear and structured.

Operations are made up of a few important elements:

  • Action: Either send or receive. send indicates that the app will send a message to the channel, while receive means the app expects to receive a message from it.
  • Channel: The specific channel where the operation will happen.
  • Reply: The definition of the reply in a request-reply operation.
  • Title: A friendly, descriptive name for the operation.
  • Summary: A quick summary of what the operation is all about.
  • Description: A more detailed explanation of the operation’s purpose.

And here's an example of an operation in an AsyncAPI document.
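
A sketch that touches each of the elements listed above, including a reply, could look like the following. The operation name requestHistory and the historyRequest and history messages are hypothetical; they are not part of the chat API built later in this post and would have to be declared in the chat channel's messages for the references to resolve.

```yaml
operations:
  requestHistory:                 # hypothetical operation name
    action: send                  # the app sends a message to the channel
    channel:
      $ref: '#/channels/chat'
    title: Request history
    summary: Ask for recent messages
    description: Sends a request and expects the recent chat history in reply
    messages:
      - $ref: '#/channels/chat/messages/historyRequest'   # hypothetical message
    reply:                        # request-reply: where and what the answer is
      channel:
        $ref: '#/channels/chat'
      messages:
        - $ref: '#/channels/chat/messages/history'        # hypothetical message
```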

With operations, you get more control and clarity over message flow within each channel—making AsyncAPI even more powerful for building real-time, event-driven systems!

These three concepts are integral when documenting our WebSocket server using AsyncAPI.

The Complete Breakdown

Now that we’ve seen how AsyncAPI can streamline real-time communication and simplify managing WebSocket channels, let's take a closer look at what a complete AsyncAPI document would look like for a simple chat application, using the key concepts we've outlined.

Step 1 - Defining Basic Information About Our WebSocket API

First, we provide some essential information about our API, including the server details for client connections.

asyncapi: "3.0.0"

info:
  title: A simple chat application
  version: 1.0.0
  description: A simple real-time chat API using WebSocket protocol

servers:
  development:
    host: localhost:8787
    description: Development Websocket broker.
    protocol: ws

Step 2 - Defining Our WebSocket Channel

As we mentioned earlier, AsyncAPI channels enable bi-directional communication between senders and receivers. Let’s define our channel below:

channels:
  chat:
    address: /
    title: Users channel

Notice we haven’t included message details yet. To keep things organized, we’ll use components to define reusable messages and then reference them in our channel.

Step 3 - Creating a Component

Components in AsyncAPI hold a set of reusable objects for different aspects of the AsyncAPI specification. An object defined under components has no effect on your API unless it is explicitly referenced from a property outside the components object.

Just like the key concepts I mentioned earlier, components are made up of a set of optional elements, such as the following:

  • Messages: An object that holds reusable message objects.
  • Channels: An object that holds reusable channel objects.
  • Operations: An object that holds reusable operation objects.
  • SecuritySchemes: An object that holds reusable security scheme objects.
  • Schemas: An object that holds reusable schema objects.

Now, because we don't want our #chat channel to look overwhelming and difficult to read, we are going to create our message in the components object.

components:
  messages:
    chat:
      description: A message sent in the chat room
      payload:
        type: object
        properties:
          messageId:
            type: string
            format: uuid
            description: Unique identifier for the message
          senderId:
            type: string
            description: ID of the user sending the message
          content:
            type: string
            maxLength: 1000
            description: The message content
          timestamp:
            type: string
            format: date-time
            description: Time when the message was sent
        # required sits next to properties, not inside it
        required:
          - messageId
          - senderId
          - content
          - timestamp

This message structure includes required fields like messageId, senderId, content, and timestamp. Now, let’s link it to our channel.

Step 4 - Adding Messages to Our Channel and Referencing Components

To make the chat message available in our channel, we’ll add it to the channel's messages section and reference our defined component.

channels:
  chat:
    address: /
    title: Users channel
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'

With our message now tied to the channel, the final step is to specify the type of operation that can be performed within this channel. This structure allows for clear, consistent message flow and easy extensibility as your API grows!

Step 5 - Defining Our Chat Channel Operation

The operation part is critical to our API because it specifies what kind of action can be executed in a given channel. So now we need to create an operation for our #chat channel, and we do that as follows:

operations:
  sendMessage:
    summary: Send a chat message
    description: Allows users to send messages to the chat room
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'

In the definition above, we created our first operation, called sendMessage, with a send action on the #chat channel. This means we've enabled connected clients to send a message to the #chat channel, and not just any kind of message, but specifically the chatMessage.

If I attempt to reference a message that isn't included in the list of messages for the #chat channel, as shown below...

operations:
  sendMessage:
    summary: Send a chat message
    description: Allows users to send messages to the chat room
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/hello'

This will fail because my #chat channel has no message called hello, even if I have a hello message defined in my message components.

A good thing to keep at the back of your mind when defining an operation is that every message you assign to the operation has to be available in the linked channel's messages.
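
One way to make the hello reference resolve, sketched under the assumption that a hello message exists under components.messages (it is hypothetical and not part of the chat API built in this post), is to list it in the channel first and only then reference it from the operation:

```yaml
channels:
  chat:
    address: /
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'
      hello:                                      # hypothetical message, now listed on the channel
        $ref: '#/components/messages/hello'

operations:
  sendMessage:
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/hello'    # resolves because the channel lists it
```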

Now that we've created our first operation, which allows us to send a message, we also need another operation that allows us to receive one. It looks almost the same as the send operation, except that the action is receive instead of send, as seen below.

operations:
  receiveMessage:
    summary: Receive a chat message
    description: Allows users to receive messages from the chat room
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'

With this implementation, we have a fully functional AsyncAPI document, but let's go a few steps further.

Step 6 - Reusing an Existing Message Component

Let’s say we want our server to notify users whenever someone joins or leaves the chat. How would we approach this?

First, we define the new message in our components section. This message will hold information about the user joining or leaving.

components:
  messages:
    chat:
      ...
    status:
      payload:
        type: object
        properties:
          userId:
            type: string
            description: ID of the user that joined or left
          type:
            type: string
            enum:
              - join
              - leave
          username:
            type: string
            description: Display name of the user
          timestamp:
            type: string
            format: date-time
        # required sits next to properties, not inside it
        required:
          - userId
          - username
          - timestamp

Here, we’ve created a new status message to capture details about users joining or leaving.

Next, let’s add this message to our channel, so our server can broadcast it as needed:

channels:
  chat:
    address: /
    title: Users channel
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'

      userStatus: # newly added channel message
        $ref: '#/components/messages/status'

Finally, we need to define two operations within our channel: one for notifying when a user joins (userJoin) and another for when they leave (userLeave). Here’s how:

operations:
  sendMessage:
    ...

  userJoin: # Newly added operation
    summary: User join notification
    description: Notifies when a new user joins the chat room
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userStatus'

  userLeave: # Newly added operation
    summary: User leave notification
    description: Notifies when a user leaves the chat room
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userStatus'

In this setup, both userJoin and userLeave operations use the same userStatus message structure, saving time and reducing redundancy!

Step 7 - Adding Authentication to Our API

Securing our API is critical, and AsyncAPI supports defining security schemes to specify the authentication methods needed to connect.

The AsyncAPI Security Scheme Object lets you declare one or more of the supported scheme types, such as API key, HTTP authentication, HTTP API key, OAuth2, and so on.

Let's see how to declare a security scheme for our WebSocket server using the HTTP API key scheme. We start by defining the scheme in our components:

components:
  messages:
    ...
  securitySchemes:
    apiKeyHeader:
      type: httpApiKey
      in: header
      name: X-API-Key
      description: API key passed in header

Here, apiKeyHeader is our security scheme, specifying that the key should be included in the header under the name X-API-Key.

Now, let’s associate this security scheme with our WebSocket server so it requires authorization:

servers:
  development:
    host: localhost:8787
    description: Development Websocket broker.
    protocol: ws
    security: # newly added line
      - $ref: '#/components/securitySchemes/apiKeyHeader'

As you can see, we added a security property to the development server. Notice that I'm specifying it as an array (hence the - $ref). That's because you can pass multiple security schemes in the security property, and only one of them needs to be satisfied to authorize a connection. In our case we only need one, so let's roll with that.
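
To illustrate the array form, here is a sketch with two alternatives. The bearerToken scheme is a hypothetical addition that is not used anywhere else in this post; with both entries listed, satisfying either one would be enough to authorize the connection.

```yaml
servers:
  development:
    host: localhost:8787
    protocol: ws
    security:
      - $ref: '#/components/securitySchemes/apiKeyHeader'
      - $ref: '#/components/securitySchemes/bearerToken'   # hypothetical second option

components:
  securitySchemes:
    apiKeyHeader:
      type: httpApiKey
      in: header
      name: X-API-Key
    bearerToken:              # hypothetical scheme, for illustration only
      type: http
      scheme: bearer
```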

Step 8 - Providing Protocol-Specific Information

Remember when we discussed bindings in the Channel section? These bindings allow us to add WebSocket-specific details to customize the connection.

For instance, if we want users to send messages to specific chat rooms, we could traditionally create a channel with a parameter like /{roomId}, which establishes a new connection for each room a user joins. However, this can lead to multiple connections, which we want to avoid. Instead, we’ll use channel bindings.
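
For comparison, the parameter-based alternative described above could be sketched roughly like this. The channel name chatRoom is an assumption made for illustration; this is the shape we are choosing not to use.

```yaml
channels:
  chatRoom:                    # hypothetical channel name
    address: /{roomId}         # one address, and so one connection, per room
    parameters:
      roomId:
        description: The unique identifier of the chat room
    messages:
      chatMessage:
        $ref: '#/components/messages/chat'
```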

Bindings are protocol-specific, so we can provide details unique to WebSocket. Rather than using parameters, we’ll use the #chat channel and pass the roomId in the query parameters, as shown below:

channels:
  chat:
    address: /
    bindings:
      ws:
        query:
          type: object
          properties:
            roomId:
              type: string
              description: The unique identifier of the chat room
              pattern: '^[a-zA-Z0-9-]+$'
          additionalProperties: false

By adding these bindings, users can connect once to the / address and use the same connection to join multiple rooms by simply updating the roomId query parameter, which would look like /?roomId={roomId}. This approach allows a single connection to be used across various chat rooms, making it ideal for chatting in multiple channels simultaneously.

Step 9 - Bringing Everything Together

We've finally written a complete AsyncAPI document for our chat application, and this is what it looks like:

asyncapi: 3.0.0
info:
  title: Simple Chat API
  version: 1.0.0
  description: A simple real-time chat API using WebSocket protocol

servers:
  production:
    host: chat.example.com
    protocol: ws
    description: Production server
    security:
      - $ref: '#/components/securitySchemes/apiKeyHeader'

channels:
  chat:
    address: /

    bindings:
      ws:
        query:
          type: object
          properties:
            roomId:
              type: string
              description: The unique identifier of the chat room
              pattern: '^[a-zA-Z0-9-]+$'
          additionalProperties: false

    messages:
      chatMessage:
        $ref: '#/components/messages/chat'

      userJoined:
        description: Event when a user joins the chat room
        $ref: '#/components/messages/status'

      userLeft:
        description: Event when a user leaves the chat room
        $ref: '#/components/messages/status'

operations:
  sendMessage:
    action: send
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'
    summary: Send a chat message
    description: Allows users to send messages to the chat room

  receiveMessage:
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/chatMessage'
    summary: Receive chat messages
    description: Allows users to receive messages from the chat room

  userJoin:
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userJoined'
    summary: User join notification
    description: Notifies when a new user joins the chat room

  userLeave:
    action: receive
    channel:
      $ref: '#/channels/chat'
    messages:
      - $ref: '#/channels/chat/messages/userLeft'
    summary: User leave notification
    description: Notifies when a user leaves the chat room

components:
  messages:
    chat:
      description: A message sent in the chat room
      payload:
        type: object
        properties:
          messageId:
            type: string
            format: uuid
            description: Unique identifier for the message
          senderId:
            type: string
            description: ID of the user sending the message
          content:
            type: string
            maxLength: 1000
            description: The message content
          timestamp:
            type: string
            format: date-time
            description: Time when the message was sent
        required:
          - messageId
          - senderId
          - content
          - timestamp

    status:
      payload:
        type: object
        properties:
          userId:
            type: string
            description: ID of the user status[join/leave]
          username:
            type: string
            description: Display name of the user
          timestamp:
            type: string
            format: date-time
        required:
          - userId
          - username
          - timestamp

  securitySchemes:
    apiKeyHeader:
      type: httpApiKey
      in: header
      name: X-API-Key
      description: API key passed in header

And since we followed the spec-first approach, we can do a lot of interesting things with this document, such as:

  • Generate Documentation: Using our AsyncAPI document above, we can automatically generate rich, interactive documentation that makes our API easy for anyone to understand and use. With tools like AsyncAPI Studio, you can visualize and interact with the API, and view channel information, messages, and operations, all without leaving the browser.
  • Code Generation: Using the AsyncAPI CLI, we can generate code in multiple languages, transforming our AsyncAPI document directly into client code, server code, and models. This speeds up development and reduces the risk of inconsistencies.
  • API Contract Testing: Using our AsyncAPI document, we can perform contract testing to ensure that our system remains aligned with its design, preventing unexpected behavior. With tools like Microcks, we can test and mock our API based on our AsyncAPI specification, so we're sure our API behaves as expected even before it's fully implemented.

After using the AsyncAPI CLI to generate documentation from the HTML template with the following command: asyncapi generate fromTemplate ./asyncapi.yaml @asyncapi/html-template@3.0.0 --use-new-generator, we get a fully functional production-ready website for our API documentation. This generated site provides a visually appealing and interactive way to explore our AsyncAPI definition, as shown in the screenshot below.

AsyncAPI preview screenshot

Additionally, with the help of AsyncAPI Studio, you can easily preview your AsyncAPI document in a user-friendly interface. Simply click on this URL to explore the document live. This makes it even more convenient to review and refine your API definition in real-time!

Putting everything we've learnt together, we have our AsyncAPI document ready to go!

Conclusion

Documenting your WebSocket API with AsyncAPI brings clarity and structure to designing and managing your APIs. By standardizing message formats, channels, and operations, AsyncAPI simplifies the process of building scalable, consistent, and reliable APIs.

AsyncAPI's structured approach equips teams with a collaborative framework that enhances efficiency and reduces development friction, making it a cornerstone in modern API design.

References

  • Feel free to check out my livestream for a walkthrough on building a chat application from scratch using the AsyncAPI specification.
  • Dive deeper with this blog post on using WebSocket with AsyncAPI.
  • Join the conversation and connect with the AsyncAPI community in our Slack workspace.