Wayland Overview
Most Linux users have heard the term "Wayland", maybe in the context of a windowing system, maybe compared against X11, but most likely accompanied by a few swears related to their applications not working on it. But what exactly is Wayland and how does it work?
It's a Protocol
It's not a graphics library, nor is it a compositor or a windowing system on its own. All it is is a set of agreed-upon messages that two applications can send back and forth, and those messages are focused on displaying and interacting with applications on the screen.
The "two applications" represent a client and server relationship. The client is the user application that wants to be displayed, and the server is your Wayland-based compositor responsible for displaying all of the clients correctly. Wayland is fairly new and isn't fully adopted yet, especially when compared to its predecessor, X11. That results in issues and bugs in the client and server implementations, which is a source of a lot of frustration among users. Whether we like it or not, I believe it is the "way of the future", and it mostly has benefits over the X11 windowing system for modern computing.
The Socket
In order to send messages, the client needs to connect to the server. It connects through a Unix socket which, since everything is a file descriptor on Linux, involves just reading from and writing to a ✌️"file"✌️ on the file system. The Unix socket that should be used is determined by the following:
- If the WAYLAND_SOCKET environment variable is set, treat it as a file descriptor number that's already open/established. The parent process must've already set it for the current process.
- If the WAYLAND_DISPLAY env var is set, concat it with XDG_RUNTIME_DIR to form the socket path.
- If the WAYLAND_DISPLAY env var is not set, concatenate "wayland-0" to the end of XDG_RUNTIME_DIR to form the socket path.
- If all of the above fails, then give up hope and find solace in a pint of ice cream.
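The path-based part of that lookup can be sketched in a few lines of C. This is only an illustration, not how libwayland-client does it internally: resolve_wayland_socket_path is a made-up helper name, it assumes the two pieces are joined with a "/", and it skips the WAYLAND_SOCKET case entirely because that variable names an already-open file descriptor rather than a path.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libwayland): writes the socket path a
 * client would try into `out`, following the lookup order described above.
 * Returns 0 on success, -1 if no path could be built.
 * WAYLAND_SOCKET is deliberately not handled here, since it refers to an
 * inherited file descriptor and not to a path on disk. */
int resolve_wayland_socket_path(char *out, size_t out_size)
{
    const char *runtime_dir = getenv("XDG_RUNTIME_DIR");
    const char *display = getenv("WAYLAND_DISPLAY");

    if (display == NULL)
        display = "wayland-0";   /* default socket name */

    if (runtime_dir == NULL)
        return -1;               /* nothing to join the socket name onto */

    /* Assumption: directory and socket name are joined with a single '/'. */
    int written = snprintf(out, out_size, "%s/%s", runtime_dir, display);
    if (written < 0 || (size_t)written >= out_size)
        return -1;               /* path did not fit in the buffer */

    return 0;
}
```

With XDG_RUNTIME_DIR set to /run/user/1000 and WAYLAND_DISPLAY unset, the helper fills the buffer with /run/user/1000/wayland-0. A real client would normally just call wl_display_connect(NULL) and let the library do this work.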
Note that multiple Wayland compositors can be running at the same time on different virtual terminals, and all of them must use a different socket. If you're running multiple compositors, you may need to adjust some of the above environment variables to get your clients to open within the correct compositor.
The Message
Messages that the server sends to the client are called events.
Messages that the client sends to the server are called requests.
A message is just a series of 64+ bits that take the following form:
The first 32 bits in all messages are an object ID representing the object that the message is operating on. The server refers to these objects as resources (wl_resource), and the client refers to these objects as proxies (wl_proxy), but ultimately they're the same thing.
The next 32 bits pack two 16-bit fields. The upper 16 bits contain the message length in bytes (header included), which is needed since the message data is of variable length. Different messages require different arguments to convey their meaning.
The lower 16 bits contain the opcode, which is the type of message that's being sent. Whoever receives the message uses a combination of the object ID and opcode to determine how to process that message.
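To make the header layout concrete, here's a rough sketch of decoding those first 64 bits in C. It assumes the wire layout of two 32-bit words in host byte order, where the second word carries the message size in its upper 16 bits and the opcode in its lower 16 bits. The wl_message_header struct and decode_header function are names I made up for illustration; libwayland doesn't expose anything called that.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical struct for illustration; libwayland does not expose this type. */
struct wl_message_header {
    uint32_t object_id; /* object the message operates on */
    uint16_t size;      /* total message size in bytes, header included */
    uint16_t opcode;    /* which request or event this is */
};

/* Decodes the 8-byte header at the start of `bytes`.
 * Returns 0 on success, -1 if fewer than 8 bytes are available. */
int decode_header(const uint8_t *bytes, size_t len, struct wl_message_header *out)
{
    uint32_t words[2];

    if (len < sizeof words)
        return -1;

    /* Copy instead of casting to avoid unaligned reads.
     * Assumption: both words are in host byte order. */
    memcpy(words, bytes, sizeof words);

    out->object_id = words[0];
    out->size = (uint16_t)(words[1] >> 16);       /* upper 16 bits */
    out->opcode = (uint16_t)(words[1] & 0xFFFFu); /* lower 16 bits */
    return 0;
}
```

Once the header is decoded, the remaining size minus 8 bytes are the arguments, and how to interpret them depends on the object's type and the opcode.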
Protocol Libraries
The valid messages and their arguments are defined inside Wayland XML spec files. There's a core Wayland protocol containing the absolute essential messages (usually found at /usr/share/wayland/wayland.xml), but many extensions also exist that compositors should implement.
If you want to browse the different protocols and their messages, this website shows the contents of all the common XML spec files in a pretty/visual form: https://wayland.app/protocols/
Rather than crafting messages on our own, libraries exist for the core protocol called libwayland-client and libwayland-server that provide us with functions for sending and receiving the different message types. To use protocol extensions, you'll need to generate your own functions using wayland-scanner, which takes a Wayland XML spec file and turns it into C source code and header files:
# Generates the client or server header file that your application should include and use.
wayland-scanner client-header < protocol.xml > client-protocol.h
wayland-scanner server-header < protocol.xml > server-protocol.h
# Generates the glue code that makes the header files work. Include this as a source file.
wayland-scanner private-code < protocol.xml > protocol-glue.c
The header files can be useful to look through so you know what functions and data types are available for you to use. The "glue code" is pretty much useless to inspect, but needs to be listed as a source file in your compilation command.
Functions and Interfaces
"So we have these header files, but how do we actually use them to send and receive messages?"
Great question! I'm glad you asked.
To send a message, you just call the appropriate function defined inside the header file, and supply it with the proper parameters.
To receive a message, you need to register a handler/callback function during program startup. Each object type has a struct defined that lists the different messages that can be received for it, so your job is to define an instance of that struct and implement handlers for each of its members. Server-side objects ("resources") call this struct an interface, whereas client-side objects ("proxies") call this struct a listener.
As an example, let's take a look at a snippet from the wl_surface object. Here's its XML definition:
<interface name="wl_surface" version="1">
<!-- Requests (client to server messages) -->
<request name="attach">
<arg name="buffer" type="object" interface="wl_buffer" allow-null="true"/>
<arg name="x" type="int"/>
<arg name="y" type="int"/>
</request>
<request name="damage">
<arg name="x" type="int"/>
<arg name="y" type="int"/>
<arg name="width" type="int"/>
<arg name="height" type="int"/>
</request>
<!-- ... -->
<!-- Events (server to client messages) -->
<event name="enter">
<arg name="output" type="object" interface="wl_output"/>
</event>
<event name="leave">
<arg name="output" type="object" interface="wl_output"/>
</event>
<!-- ... -->
</interface>
If we were to generate header files for this XML definition using wayland-scanner, they'd look something like this:
// -- client-protocol.h --
// Functions for sending requests
static inline void wl_surface_attach(
struct wl_surface *wl_surface,
struct wl_buffer *buffer,
int32_t x,
int32_t y
) {
// Code here
}
static inline void wl_surface_damage(
struct wl_surface *wl_surface,
int32_t x,
int32_t y,
int32_t width,
int32_t height,
) {
// Code here
}
// Listener for receiving events
struct wl_surface_listener {
void (*enter)(void *data, struct wl_surface *wl_surface, struct wl_output *output);
void (*leave)(void *data, struct wl_surface *wl_surface, struct wl_output *output);
};
// -- server-protocol.h --
// Functions for sending events
static inline void wl_surface_send_enter(
struct wl_resource *resource_,
struct wl_resource *output
) {
// Code here
}
static inline void wl_surface_send_leave(
struct wl_resource *resource_,
struct wl_resource *output
) {
// Code here
}
// Interface for receiving requests
struct wl_surface_interface {
void (*attach)(
struct wl_client *client,
struct wl_resource *resource,
struct wl_resource *buffer,
int32_t x,
int32_t y
);
void (*damage)(
struct wl_client *client,
struct wl_resource *resource,
int32_t x,
int32_t y,
int32_t width,
int32_t height
);
};
Note that the server functions include the word "send" in their name, but the client functions do not: wl_surface_send_enter() vs wl_surface_attach().
For the case of a Wayland server, the interface would be implemented and registered something like this:
static void handle_attach(
struct wl_client *client,
struct wl_resource *resource,
struct wl_resource *buffer,
int32_t x,
int32_t y
) {
// Code here
}
static void handle_damage(
struct wl_client *client,
struct wl_resource *resource,
int32_t x,
int32_t y,
int32_t width,
int32_t height
) {
// Code here
}
const struct wl_surface_interface my_surface_implementation = {
.attach = handle_attach,
.damage = handle_damage,
/* ... */
};
// Whenever an instance of wl_surface is created, you'd call set_implementation to
// bind that surface resource to your API's implementation:
wl_resource_set_implementation(resource, &my_surface_implementation, nullptr, nullptr);
It's worth noting that the first two parameters in both of the handle_*() functions are not listed in the protocol's spec. When the C bindings are generated, you'll always receive a reference to both the client and underlying resource as the first two parameters.
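The client side mirrors this with a listener. Here's a minimal sketch of what that could look like for the two wl_surface events from the XML snippet. To keep it compilable without libwayland, it redeclares the listener struct with opaque stand-in types, and dispatch_surface_event is a simplified stand-in I wrote to show the idea of picking a handler by opcode. The surface_state struct and the on_enter / on_leave handlers are likewise made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-ins so the sketch compiles without libwayland headers. */
struct wl_surface;
struct wl_output;

/* Same shape as the generated listener: one slot per event, in the order
 * the events appear in the XML (enter is opcode 0, leave is opcode 1). */
struct wl_surface_listener {
    void (*enter)(void *data, struct wl_surface *surface, struct wl_output *output);
    void (*leave)(void *data, struct wl_surface *surface, struct wl_output *output);
};

/* Hypothetical per-surface state, handed to every handler as `data`. */
struct surface_state {
    int outputs_entered;
};

static void on_enter(void *data, struct wl_surface *surface, struct wl_output *output)
{
    (void)surface;
    (void)output;
    ((struct surface_state *)data)->outputs_entered++;
}

static void on_leave(void *data, struct wl_surface *surface, struct wl_output *output)
{
    (void)surface;
    (void)output;
    ((struct surface_state *)data)->outputs_entered--;
}

const struct wl_surface_listener surface_listener = {
    .enter = on_enter,
    .leave = on_leave,
};

/* Simplified stand-in for the library's dispatcher (an assumption for
 * illustration): choose the handler from the event's opcode.
 * Returns 0 if a handler ran, -1 for an opcode this sketch doesn't know. */
int dispatch_surface_event(const struct wl_surface_listener *listener,
                           void *data, struct wl_surface *surface,
                           uint16_t opcode, struct wl_output *output)
{
    switch (opcode) {
    case 0: listener->enter(data, surface, output); return 0;
    case 1: listener->leave(data, surface, output); return 0;
    default: return -1;
    }
}
```

In a real client you wouldn't write the dispatcher yourself. You'd register the listener with the generated wl_surface_add_listener(surface, &surface_listener, &state) function, and the handlers would fire while the client dispatches events from the display. Just like on the server side, the first two handler parameters (the user data pointer and the proxy) don't appear in the XML.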
You can watch messages being sent between a Wayland application and your compositor by launching your application with the WAYLAND_DEBUG environment variable set to 1. The messages are written to stderr. For example, let's say you have an application called foot:
# To watch the messages in realtime:
WAYLAND_DEBUG=1 foot 2>&1
# To send them into a file for future inspection:
WAYLAND_DEBUG=1 foot > ./output 2>&1
Globals
For most Wayland objects, you'd create and use multiple instances throughout your interactions. For a select few object types, only a single global instance is created, and it's created when the compositor starts up. Here are the two main globals in the core protocol:
- wl_display - This always has object ID 1. It's used by the compositor to open the main socket, listen for clients, and manage a reference to wl_registry.
- wl_registry - Responsible for managing all the global objects within the compositor. When a client connects, the registry emits all the resources (objects) that are available, and the client chooses which ones it wants to bind to for use.
...and here's a list of the factory-like globals that are used to create instances of other objects. These are emitted to a newly-connected client by the registry.
- wl_compositor - Creates and manages all things related to surfaces, which are what application buffers are drawn into.
- wl_subcompositor - Creates and manages subsurfaces, which allow applications to have sections that are rendered independently from the rest of the application (ex: video players).
- wl_shm - Creates and manages all things related to shared memory for display buffers. This memory is only for CPU / RAM processing, which means it has good support but is slow compared to GPU / VRAM (which you'd get via the DMA-BUF extension).
- wl_seat - Creates and manages all input devices related to the active seat, such as keyboards, pointers, and touch devices.
- wl_output - Manages things related to the visible display for the compositor, which is a fancy way of saying "your monitor's resolution, rotation, scale, etc".
- wl_data_device_manager - Creates and manages objects related to cross-client interactions. This is useful for things like copy-paste and drag-and-drop, and ties closely to wl_seat.
There are dozens of non-global object types, so I won't be listing them all out here. As I cover them in the main devlog series, there'll be new notes pages published that go into more details on them.
If you're curious which Wayland protocols your compositor supports, you can install and run wayland-info, which is typically provided inside your Linux distro's wayland-utils package. For example, on Void Linux, I can install and then run it with:
# Install
xbps-install -Su wayland-utils
# Run
wayland-info
Summary
So to summarize the main takeaways that are worth committing to your brain:
- Wayland is a protocol that enables clients and display servers to communicate over a socket by specifying valid message formats.
- In addition to the core Wayland protocol, there are many extensions which can enable additional functionality.
- The valid message formats are defined in XML files, which can have C bindings generated for them via wayland-scanner.
- Each message operates on an object. Servers call these objects resources. Clients call these objects proxies.
- A message the server sends to a client is called an event. A message a client sends to the server is called a request.
- Servers process requests by implementing an interface. Clients process events by implementing a listener.
- There are a few global objects that the server manages inside its registry. When a client connects, the registry sends the available object types to the client, and the client chooses which ones it wants to bind to for use.
Further Reading
There are actually quite a few Wayland resources out there that all have differing degrees of information, but none seem to be good at covering everything:
- Wayland Protocol Explorer, which is helpful for navigating the different XML specs.
- The Wayland Client API or the Wayland Server API, which have some (but not enough) information on other functions/structs brought in with libwayland-client and libwayland-server.
- The "Wayland Book", written by the guy who made both Sway and wlroots (Drew DeVault). It's a work-in-progress, might be on hold, and shifts to having a larger focus on client applications rather than compositors. I found the first few chapters to be massively helpful though.