How DRI and DRM Work
Introduction
This page is intended as an introduction to what DRI and DRM are, how they fit into the overall structure of X, Mesa and so on, and why you should care about them if you're developing applications which make use of graphics acceleration. It's written with the SMedia Glamo chip of the Openmoko Neo FreeRunner in mind, but many of the concepts are generally applicable to all graphics hardware. However, more advanced hardware is generally a lot more complicated than the technology described here...
The Hardware
The graphics processor (GPU) is separate from the main CPU, and has its own memory. The graphics chip can't access the main system memory at all [1], but the graphics memory ("VRAM") can be mapped into the CPU's address space and then written to or read from. If we didn't care about accelerated graphics, we would just program the GPU to display a section of its VRAM on the LCD screen, and write what we wanted to display.
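As a very rough illustration of that unaccelerated path, here is a minimal sketch which maps the framebuffer through the Linux fbdev interface and fills the screen by writing pixels from the CPU. The device node and the 16bpp pixel format are assumptions, and error checking is omitted.

    /* Map the framebuffer (which, on a device like the FreeRunner, lives in
     * the Glamo's VRAM) into our address space and paint it blue. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <linux/fb.h>

    int main(void)
    {
        int fd = open("/dev/fb0", O_RDWR);

        struct fb_var_screeninfo var;
        struct fb_fix_screeninfo fix;
        ioctl(fd, FBIOGET_VSCREENINFO, &var);   /* resolution and depth */
        ioctl(fd, FBIOGET_FSCREENINFO, &fix);   /* stride and VRAM size */

        uint16_t *pixels = mmap(NULL, fix.smem_len, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);

        /* Assuming a 16bpp (RGB565) mode, paint every visible pixel blue. */
        for (unsigned y = 0; y < var.yres; y++)
            for (unsigned x = 0; x < var.xres; x++)
                pixels[y * (fix.line_length / 2) + x] = 0x001F;

        munmap(pixels, fix.smem_len);
        close(fd);
        return 0;
    }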
But we want to do some slightly more exciting things. To use acceleration features, a sequence of commands has to be written to a circular buffer of commands in VRAM. Circular in this sense means that when there is no more space left at the end of the buffer, writing begins again at the start of the buffer. Hopefully the buffer is long enough that the program can write a significant number of commands, then go away and do something else while they execute.
Commands submitted via this buffer (known as the "command queue", "ring buffer" or similar) generally need to refer to other buffers in VRAM. For instance, if we told the GPU to turn the entire screen blue, we would need to send a command sequence which described that a flood fill operation was to be performed, that it was to be done using the colour blue, and that the buffer to be filled with blue was the same buffer we previously told it to display on the LCD.
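To make the wrap-around behaviour concrete, here is a small sketch of how a driver might append commands to such a ring. The command encoding and the "fill" operation are invented for illustration; the Glamo's real command format is different.

    #include <stdint.h>

    struct ring {
        volatile uint32_t *base;   /* mapped VRAM holding the command queue */
        uint32_t size;             /* number of 32-bit slots in the ring */
        uint32_t head;             /* next slot the GPU will read (read back from hardware) */
        uint32_t tail;             /* next slot we will write */
    };

    static void ring_emit(struct ring *r, uint32_t word)
    {
        r->base[r->tail] = word;
        r->tail = (r->tail + 1) % r->size;   /* wrap around: this is what makes it circular */
        /* A real driver would also make sure tail never catches up with head,
         * and would poke a GPU register so the hardware knows new commands exist. */
    }

    /* Hypothetical "fill a buffer with a colour" command sequence. */
    static void emit_blue_fill(struct ring *r, uint32_t dest_handle,
                               uint32_t width, uint32_t height)
    {
        ring_emit(r, 0x01000003);             /* opcode: FILL, 3 operand words follow */
        ring_emit(r, dest_handle);            /* which VRAM buffer to fill */
        ring_emit(r, 0x001F);                 /* the colour blue, in RGB565 */
        ring_emit(r, (height << 16) | width); /* geometry, packed into one word */
    }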
Historical Situation
Previously, every application which used hardware acceleration had to contain a full implementation of the code required to allocate VRAM for the command queue, program the command queue processor (part of the GPU) to read commands from the newly allocated buffer, and keep updating its configuration every time new commands were written. There'd also be a lot of error-checking to be done to make sure everything went smoothly. In addition, each application had to handle memory management if it wanted to allocate more buffers in VRAM - for instance, to hold the contents of offscreen windows ready to be quickly copied onto the screen. This is a lot of programming, but it's the situation we currently have with the Glamo chip. The X server contains command queue handling code - both the old Kdrive-based server (Xglamo) and the more recent X.org driver, xf86-video-glamo - as does the accelerated version of mplayer.
It's pretty clear that multiple applications can't simultaneously use the acceleration - each would be trying to manage the single command queue and pool of VRAM in its own way, and the results could range from instability to outright catastrophe. This is one of the things DRI is here to fix.
DRM - Kernel Level Support
The Direct Rendering Manager, DRM, is the kernel part of the wider Direct Rendering Infrastructure (DRI). With DRM, all the command queue and VRAM handling is done by the kernel, and there's an ioctl interface through which userspace programs can ask it to do things. For example, a program might ask to allocate some VRAM, and the DRM will return a unique handle by which the program can refer to the newly allocated VRAM. The kernel, aware of the requests from the multiple programs, can coordinate memory management across them all. If the program needed to read or write its VRAM, it could ask the kernel to map the memory into the program's address space. The subset of the DRM ioctl interface which takes care of memory is called GEM [2].
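As an illustration of what this looks like from userspace, here is a sketch using the generic "dumb buffer" ioctls that current DRM drivers expose. A driver like Glamo's may provide its own allocation ioctl instead, but the shape of the exchange - ask for memory, receive an opaque handle, then ask for an mmap offset - is the same. Error checking is omitted.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <drm/drm.h>
    #include <drm/drm_mode.h>

    int main(void)
    {
        int fd = open("/dev/dri/card0", O_RDWR);

        /* Ask the kernel for a 480x640, 16bpp buffer.  We get back an opaque
         * handle plus the buffer's pitch and size - but never its address. */
        struct drm_mode_create_dumb create = { .width = 480, .height = 640, .bpp = 16 };
        ioctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &create);
        printf("handle %u, pitch %u, size %llu\n",
               create.handle, create.pitch, (unsigned long long)create.size);

        /* To touch the memory ourselves, ask for an mmap offset for the handle... */
        struct drm_mode_map_dumb map = { .handle = create.handle };
        ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB, &map);

        /* ...then map it into our address space and clear it. */
        void *vram = mmap(NULL, create.size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, map.offset);
        memset(vram, 0, create.size);
        munmap(vram, create.size);

        /* Tell the kernel we are finished with the buffer. */
        struct drm_mode_destroy_dumb destroy = { .handle = create.handle };
        ioctl(fd, DRM_IOCTL_MODE_DESTROY_DUMB, &destroy);

        close(fd);
        return 0;
    }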
Command queue handling is similar. If the program wanted to submit a sequence of commands, it could call another ioctl with its commands, and the kernel would add it to the command queue. Part of the beauty of all this is that only the kernel has to know where the objects in VRAM actually reside at any point, and it can move some of them out of VRAM if space becomes tight. Userspace programs just use their unique handles to refer to memory, and let the kernel take care of making sure that the correct buffers are in the right places at the right times.
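The submission ioctl itself is driver-specific, so the following sketch uses an invented structure and ioctl number purely to show the shape of the exchange; it continues the made-up command encoding from the ring buffer sketch above.

    #include <stdint.h>
    #include <sys/ioctl.h>

    #define FAKE_DRM_IOCTL_SUBMIT 0      /* placeholder: a real driver defines its own */

    struct fake_drm_submit {
        uint64_t commands;               /* userspace pointer to the command words */
        uint32_t num_commands;
        uint32_t buf_handle;             /* GEM handle of the buffer the commands draw into */
    };

    static int submit_blue_fill(int fd, uint32_t dest_handle)
    {
        /* The same hypothetical FILL command as before: opcode, destination
         * handle, colour, packed geometry. */
        uint32_t cmds[] = { 0x01000003, dest_handle, 0x001F, (640u << 16) | 480u };

        struct fake_drm_submit s = {
            .commands     = (uintptr_t)cmds,
            .num_commands = 4,
            .buf_handle   = dest_handle,
        };

        /* The kernel validates the commands, makes sure the destination buffer
         * is actually resident in VRAM, appends everything to the command
         * queue, and returns. */
        return ioctl(fd, FAKE_DRM_IOCTL_SUBMIT, &s);
    }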
Finally, there's a library ("libDRM") which exists to make handling DRM's ioctls a little easier.
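For example, here is about the smallest possible libDRM program: open the device (the driver name "glamo" is an assumption) and print the name and version of the kernel driver behind it.

    #include <stdio.h>
    #include <xf86drm.h>

    int main(void)
    {
        int fd = drmOpen("glamo", NULL);      /* or simply open("/dev/dri/card0", ...) */
        if (fd < 0)
            return 1;

        drmVersionPtr v = drmGetVersion(fd);  /* wraps the version-query ioctl */
        printf("driver %s %d.%d.%d\n", v->name,
               v->version_major, v->version_minor, v->version_patchlevel);

        drmFreeVersion(v);
        drmClose(fd);
        return 0;
    }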
EXA - X Server Acceleration
I mentioned that the X server used to be one of the programs which wanted to access the hardware to send command sequences. EXA is the acceleration architecture inside the X server: the driver provides hooks for operations such as solid fills and copies, and the server calls them when it can. With DRM in place, those hooks pass their commands down to the kernel through the ioctl interface, and the GEM interface is used to allocate VRAM for offscreen pixmaps.
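To give a feel for what that looks like inside the driver, here is a rough sketch of EXA's solid-fill hooks. The hook signatures follow the X server's EXA interface, but glamo_queue_fill() is a hypothetical stand-in for the driver's own command-building and submission code.

    #include "xf86.h"
    #include "exa.h"

    static Pixel fill_colour;

    /* Hypothetical helper: build a fill command for the rectangle and hand it
     * to the kernel through the DRM submission ioctl (see the earlier sketch). */
    static void glamo_queue_fill(PixmapPtr pixmap, Pixel colour,
                                 int x1, int y1, int x2, int y2)
    {
    }

    static Bool GlamoPrepareSolid(PixmapPtr pixmap, int alu, Pixel planemask, Pixel fg)
    {
        fill_colour = fg;        /* remember the colour for the Solid() calls */
        return TRUE;
    }

    static void GlamoSolid(PixmapPtr pixmap, int x1, int y1, int x2, int y2)
    {
        glamo_queue_fill(pixmap, fill_colour, x1, y1, x2, y2);
    }

    static void GlamoDoneSolid(PixmapPtr pixmap)
    {
        /* Nothing to do: the kernel flushes the command queue when it needs to. */
    }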
DRI - X Server Support
"DRI" could be taken to mean the overall infrastructure which makes accelerated rendering possible. The DRI interface, which is what this section is about, is specific to X. It consists of a handful of X requests by which X clients can request that the X server allows it to carry out accelerated rendering to a particular window. The client asks for a handle for the window, and uses that handle to tell the kernel to draw things, for instance using the 3D engine of the graphics hardware. When it's finished drawing a frame, the client asks the X server to swap the buffers (assuming, say, a double-buffered OpenGL context) so that the contents are visible on the screen.
DRI is just one way to use the underlying DRM framework. For instance, there are other graphics systems (such as Wayland) which also use DRM to access the hardware.
KMS - Kernel Modesetting
There's just one more piece to the puzzle, which is called KMS. This could be the subject of a whole new article, but here's a short overview. Previously, the X driver would directly program the display hardware, just as it had to program the command queue engine itself. With KMS, it can instead ask the kernel to set a certain display resolution, colour depth, or whatever. At the same time, the X driver can send the kernel its handle for a memory buffer which should be used as the contents of the screen. Since the kernel is in complete control of the actual programming of the hardware, it can take the display back and show something sensible in the case of, say, a kernel panic or X server crash.
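Here is a sketch of that from the driver's side, using libdrm's modesetting calls: find a connected output, register a previously allocated buffer (its GEM handle, width, height and pitch are assumed to come from something like the allocation sketch above) as a framebuffer, and ask the kernel to display it. Error checking is mostly omitted.

    #include <stdint.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    int show_buffer(int fd, uint32_t handle, uint32_t width, uint32_t height,
                    uint32_t pitch)
    {
        drmModeRes *res = drmModeGetResources(fd);

        /* Find the first connected connector and take its first listed mode. */
        drmModeConnector *conn = NULL;
        for (int i = 0; i < res->count_connectors; i++) {
            conn = drmModeGetConnector(fd, res->connectors[i]);
            if (conn && conn->connection == DRM_MODE_CONNECTED && conn->count_modes > 0)
                break;
            drmModeFreeConnector(conn);
            conn = NULL;
        }
        if (!conn)
            return -1;

        /* The encoder currently attached to the connector tells us which CRTC
         * (display controller) to program. */
        drmModeEncoder *enc = drmModeGetEncoder(fd, conn->encoder_id);

        /* Register our buffer as something the kernel may scan out (16bpp here)... */
        uint32_t fb_id;
        drmModeAddFB(fd, width, height, 16, 16, pitch, handle, &fb_id);

        /* ...and point the CRTC at it with the chosen mode. */
        return drmModeSetCrtc(fd, enc->crtc_id, fb_id, 0, 0,
                              &conn->connector_id, 1, &conn->modes[0]);
    }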
Conclusion
This mostly isn't as complicated as it sounds...! For examples of what programs which use DRM look like, take a look at the Glamo DRI tests. Despite the name, most of these just test DRM, and are independent of the higher-level DRI interfaces.
Footnotes
[1] This is the case for the Glamo chip at least. More advanced graphics hardware can usually be programmed, via something called a GART or GTT, to access areas of the main memory. Some graphics chips do not have any memory of their own, and do everything this way.
[2] Other interfaces are available. GEM was originally developed for Intel cards, and we've borrowed its interface for Glamo even though our memory management requirements are much simpler.
Hello, I find all this information very interesting. Anyway, I want to understand more about it, and about the Gallium project if you know something about that. In fact, where does the DRM kernel module reside? I mean, is there a kernel module and a video driver module that connects to it (through pipes?)? So there are at least two DRM modules, one for the kernel and another for the video driver (which are both part of DRI)? I'm making a mess of it in my head...
PS: sorry about my English (I'm Argentine), and thank you for all the information.
A good piece of information for someone like me, who has just dived into the X ocean.
To answer the questions from "axizhe", the DRM module resides entirely within the kernel, i.e. within Linux itself. Of course, non-Linux implementations exist as well, e.g. DRM for BSD. The video driver module, which runs in userspace as part of the X server, normally sends commands to it using ioctls. The commands from the X server will normally be pretty dull stuff like "move this rectangle from point A to point B", but other things apart from X can also send their commands to the DRM module - for example, Mesa might send some 3D rendering commands.
So, there are definitely at least two parts of the equation, one in the kernel and at least one in userspace. Only the kernel part is called "DRM", though. DRI, strictly speaking, refers to the X11 protocol which coordinates things in order to allow accelerated rendering to be done by client programs (i.e. anything other than the X server itself) without interfering with the X server's own (accelerated or non-accelerated) rendering. However, it's not too much of a stretch to use the name "DRI" for the overall framework that makes any of it work.
Excellent article. This is one of the few articles on Linux Graphics that actually makes things clearer.