Recently, GTK gained not one, but two new renderers: one for GL and one for Vulkan.

Since naming is hard, we reused existing names and called them “ngl” and “vulkan”. They are built from the same sources, therefore we also call them “unified” renderers.

But what is exciting about them?

A single source


As mentioned already, the two renderers are built from the same source. The code is modeled on the Vulkan API, with some abstractions to cover the differences between Vulkan and GL (more specifically, GL 3.3+ and GLES 3.0+). This lets us share much of the infrastructure for walking the scene graph, maintaining transforms and other state, and caching textures and glyphs, and it will make it easier to keep both renderers up to date and on par.

Could this unified approach be extended further, to cover a Metal-based renderer on macOS or a DirectX-based one on Windows? Possibly. The advantage of the Vulkan/GL combination is that they share basically the same shader language (GLSL, with some variations). That isn’t the case for Metal or DirectX. For those platforms, we either need to duplicate the shaders or use a translation tool like SPIRV-Cross.

If that is the kind of thing that excites you, help is welcome.

Implementation details


The old GL renderer uses simple shaders for each render node type and frequently resorts to offscreen rendering for more complex content. The unified renderers have (more capable) per-node shaders too, but instead of relying on offscreens, they also use a complex shader that interprets data from a buffer. In game programming, this approach is known as an ubershader.
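
To illustrate the idea, here is a purely conceptual sketch in GLib-flavored C, not GTK's actual data layout: the CPU side packs one small record per drawing operation into a per-frame buffer, and a single fragment shader branches on the operation kind when it consumes that buffer.

#include <glib.h>

typedef enum {
  OP_COLOR,           /* flat color fill */
  OP_TEXTURE,         /* sample from a texture slot */
  OP_LINEAR_GRADIENT  /* interpolate between color stops */
} OpKind;

typedef struct {
  guint32 kind;       /* which branch the ubershader takes */
  float   rect[4];    /* x, y, width, height in device pixels */
  float   color[4];   /* used by OP_COLOR */
  guint32 texture_id; /* used by OP_TEXTURE */
} Op;

/* Hypothetical helper: append the op for the current render node to the
 * per-frame array that is uploaded to the GPU in one go. */
static void
push_op (GArray *frame_ops, const Op *op)
{
  g_array_append_vals (frame_ops, op, 1);
}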

The unified renderer implementation is less optimized than the old GL renderer and has been written with a focus on correctness and maintainability. As a consequence, the new renderers handle much more varied render node trees correctly.

Here is a harmless-looking example:

repeat {
  bounds: 0 0 50 50;
  child: border {
    outline: 0 0 4.3 4.3;
    widths: 1.3;
  }
}

gl (left) and ngl (right), in a close-up view

New capabilities


We wouldn’t have done all this work if there weren’t some tangible benefits. Of course, there are new features and capabilities. Let’s look at some:

Antialiasing. A big problem with the old GL renderer is that it simply loses fine detail: if something is small enough to fall between the boundaries of a single row of pixels, it just disappears. In particular, this can affect underlines, such as mnemonics. The unified renderers handle such cases better by doing antialiasing. This not only preserves fine detail, it also prevents jagged outlines of primitives.
Close-up view of GL vs NGL
Fractional scaling. Antialiasing is also the basis that lets us handle fractional scales properly. If your 1200 × 800 window is set to be scaled to 125 %, the unified renderers will use a framebuffer of size 1500 × 1000 for it, instead of letting the compositor downscale a 2400 × 1600 image. Far fewer pixels, and a sharper image.

Arbitrary gradients. The old GL renderer handles linear, radial and conic gradients with up to 6 color stops. The unified renderers allow an unlimited number of color stops. They also apply antialiasing to gradients, so sharp transitions render as smooth lines rather than jagged edges.
A linear gradient with 64 color stops
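
For a sense of what that looks like in code, here is a minimal sketch using the public GSK API that builds a linear gradient node with 64 color stops, far beyond what the old GL renderer's gradient shaders handle. The bounds, direction and colors are made up for the example.

#include <gtk/gtk.h>

/* Build a linear gradient render node with 64 color stops. */
static GskRenderNode *
make_gradient_node (void)
{
  GskColorStop stops[64];

  for (int i = 0; i < 64; i++)
    {
      stops[i].offset = i / 63.0f;
      /* Alternate between two colors to get many hard transitions */
      if (i % 2 == 0)
        stops[i].color = (GdkRGBA) { 0.2, 0.4, 1.0, 1.0 };
      else
        stops[i].color = (GdkRGBA) { 1.0, 0.8, 0.0, 1.0 };
    }

  return gsk_linear_gradient_node_new (&GRAPHENE_RECT_INIT (0, 0, 256, 64),
                                       &GRAPHENE_POINT_INIT (0, 0),
                                       &GRAPHENE_POINT_INIT (256, 0),
                                       stops, G_N_ELEMENTS (stops));
}
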
Dmabufs. As a brief detour from the new renderers, we worked on dmabuf support and graphics offloading last fall. The new renderers support this and extend it: when asked to produce a texture via the render_texture API, they can create dmabufs (currently, only the Vulkan renderer does this).
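
As a rough sketch of that render_texture path (names follow the public GSK/GDK API; the viewport here is arbitrary), you can check whether the texture you got back is dmabuf-backed:

#include <gtk/gtk.h>

/* Render a node tree into a texture. With the Vulkan renderer, the
 * resulting GdkTexture can be backed by a dmabuf. */
static GdkTexture *
render_to_texture (GskRenderer *renderer, GskRenderNode *node)
{
  GdkTexture *texture;

  texture = gsk_renderer_render_texture (renderer, node,
                                         &GRAPHENE_RECT_INIT (0, 0, 256, 64));

  if (GDK_IS_DMABUF_TEXTURE (texture))
    g_print ("got a dmabuf-backed texture\n");

  return texture;
}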

Any sharp edges?


As is often the case, with new capabilities comes the potential for new gotchas. Here are some things to be aware of, as an app developer:

No more glshader nodes. Yes, they made for some fancy demos for 4.0, but they are very much tied to the old GL renderer, since they make assumptions about the GLSL API exposed by that renderer. Therefore, the new renderers don’t support them.

You have been warned in the docs:

If there is a problem, this function returns FALSE and reports an error. You should use this function before relying on the shader for rendering and use a fallback with a simpler shader or without shaders if it fails.
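
In practice that means guarding any use of GskGLShader with a compile check, roughly like this (shader and renderer stand in for the GskGLShader and GskRenderer you are using):

GError *error = NULL;

/* Try to compile the shader for the renderer in use. If this fails
 * (as it does with renderers that don't support GskGLShader), fall
 * back to drawing without the shader. */
if (gsk_gl_shader_compile (shader, renderer, &error))
  {
    /* safe to use a GskGLShaderNode with this renderer */
  }
else
  {
    g_warning ("GLShader not supported here: %s", error->message);
    g_clear_error (&error);
    /* draw a simpler fallback instead */
  }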


Thankfully, many uses of the glshader node are no longer necessary, since GTK has gained new features since 4.0, such as mask nodes and support for straight-alpha textures.

Fractional positions. The old GL renderer rounds things, so you could get away with handing it fractional positions. The new renderers place things exactly where you tell them. This can sometimes have unintended consequences, so be on the lookout and make sure that your positions are where they should be.

In particular, look out for cairo-style drawing where you place lines at half-pixel positions so that they fill exactly one row of pixels.
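
The idiom in question looks like this in cairo (cr, y and width stand in for your drawing context and geometry):

/* A 1-pixel horizontal line that fills exactly one row of pixels:
 * the path runs through the pixel centers at y + 0.5. */
cairo_set_line_width (cr, 1.0);
cairo_move_to (cr, 0, y + 0.5);
cairo_line_to (cr, width, y + 0.5);
cairo_stroke (cr);

That is still fine inside a cairo node; the thing to watch out for is carrying such half-pixel coordinates over into render node positions, where the new renderers will place them literally instead of rounding.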

Driver problems. The new renderers are using graphics drivers in new and different ways, so there is potential for triggering problems on that side.

Please file problems you see against GTK even if they look like driver issues, since it is useful for us to get an overview of how well (or badly) the new code works with the variety of drivers and hardware out there.

But is it faster?


No, the new renderers are not faster (yet).

The old GL renderer is heavily optimized for speed. It also uses much simpler shaders and does not do the math that is needed for features such as antialiasing. We want to make the new renderers faster eventually, but the new features and correctness make them very exciting even before we reach that goal. All of the GPU-based renderers are more than fast enough to render today’s GTK apps at 60 or 144 fps.

That being said, the Vulkan renderer comes close to matching, and sometimes surpasses, the old GL renderer in some unscientific benchmarks. The new GL renderer is slower, for reasons we have not tracked down yet.

New defaults


In the just-released 4.13.6 snapshot, we have made the ngl renderer the new default. This is a trial balloon: the renderers need wider testing with different apps to verify that they are ready for production. If significant problems appear, we can revert to the gl renderer for 4.14.

We decided not to make the Vulkan renderer the default yet, since it is behind the GL renderers in a few application integration aspects: the WebKit GTK4 port works with GL but not with Vulkan, and GtkGLArea and GtkMediaStream currently both produce GL textures that the Vulkan renderer can’t directly import. All of these issues will hopefully be addressed in the not-too-distant future, and then we will revisit the default renderer decision.

If you are using GTK on very old hardware, you may be better off with the old GL renderer, since it makes fewer demands on the GPU. You can override the renderer selection using the GSK_RENDERER environment variable:
GSK_RENDERER=gl

Future plans and possibilities


The new renderers are a good foundation to implement things that we’ve wanted to have for a long time, such as

  • Proper color handling (including HDR)
  • Path rendering on the GPU, possibly including glyph rendering
  • Off-the-main-thread rendering
  • Performance (on old and less powerful devices)

Some of these will be a focus of our work in the near and medium-term future.

Summary


The new renderers have some exciting features, with more to come.

Please try them out, and let us know what works and what doesn’t work for you.

blog.gtk.org/2024/01/28/new-re…


Some of us in the GTK team have spent the last month or so exploring the world of Linux kernel graphics APIs, in particular dmabufs. We are coming back from this adventure with some frustrations and some successes.

What is a dmabuf?


A dmabuf is a memory buffer in kernel space that is identified by a file descriptor. The idea is that you don’t have to copy lots of pixel data around, and instead just pass a file descriptor between kernel subsystems.

Reality is of course more complicated than this rosy picture: the memory may be device memory that is not accessible in the same way as ‘plain’ memory, and there may be more than one buffer (and more than one file descriptor), since graphics data is often split into planes (e.g. RGB and A may be separate, or Y and UV).

Why are dmabufs useful?


I’ve already mentioned that we hope to avoid copying the pixel data and feeding it through the GTK compositing pipeline (and with 4k video, that can be quite a bit of data for each frame).

The use cases where this kind of optimization matters are those where frequently changing content is displayed for a long time, such as

  • Video players
  • Virtual machines
  • Streaming
  • Screencasting
  • Games

In the best case, we may be able to avoid feeding the data through the compositing pipeline of the compositor as well, if the compositor supports direct scanout and the dmabuf is suitable for it. In particular on mobile systems, this may avoid using the GPU altogether, thereby reducing power consumption.

Details


GTK has already been using dmabufs since 4.0: when composing a frame, GTK translates all the render nodes (typically several for each widget) into GL commands and sends those to the GPU; Mesa then exports the resulting texture as a dmabuf and attaches it to our Wayland surface.

But if the only thing that is changing in your UI is the video content that is already in a dmabuf, it would be nice to avoid the detour through GL and just hand the data directly to the compositor, by giving it the file descriptor for the dmabuf.

Wayland has the concept of subsurfaces that let applications defer some of their compositing needs to the compositor: The application attaches a buffer to each (sub)surface, and it is the job of the compositor to combine them all together.

With what is now in git main, GTK will create subsurfaces as needed in order to pass dmabufs directly to the compositor. We can do this in two different ways: if nothing is drawn on top of the dmabuf (no rounded corners or overlaid controls), we can stack the subsurface above the main surface without changing any of the visuals.

This is the ideal case, since it enables the compositor to set up direct scanout, which gives us a zero-copy path from the video decoder to the display.

If there is content that gets drawn on top of the video, we may not get direct scanout, but we can still get the benefit of letting the compositor do the compositing, by placing the subsurface with the video below the main surface and poking a translucent hole in the main surface to let it peek through.

The round play button is what forces the subsurface to be placed below the main surface here.

GTK picks these modes automatically and transparently for each frame, without the application developer having to do anything. Once that play button appears in a frame, we place the subsurface below, and once the video is clipped by rounded corners, we stop offloading altogether. Of course, the advantages of offloading also disappear.

The graphics offload visualization in the GTK inspector shows these changes as they happen:

blog.gtk.org/files/2023/11/off…

Initially, the camera stream is not offloaded because the rounded corners clip it. The magenta outline indicates that the stream is offloaded to a subsurface below the main surface (because the video controls are on top of it). The golden outline indicates that the subsurface is above the main surface.

How do you use this?


GTK 4.14 will introduce a GtkGraphicsOffload widget, whose only job is to hint that GTK should try to offload the content of its child widget by attaching it to a subsurface, instead of letting GSK process it as it usually does.
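
In code, this amounts to wrapping the widget that displays the dmabuf content. A minimal sketch, assuming the content is exposed as a GdkPaintable and shown in a GtkPicture:

#include <gtk/gtk.h>

/* Wrap the widget that shows the video in a GtkGraphicsOffload, so GTK
 * may attach its content to a subsurface instead of compositing it. */
static GtkWidget *
make_offloaded_video (GdkPaintable *paintable)
{
  GtkWidget *picture = gtk_picture_new_for_paintable (paintable);

  return gtk_graphics_offload_new (picture);
}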

To create suitable content for offloading, the new GdkDmabufTextureBuilder wraps dmabufs in GdkTexture objects. Typical sources for dmabufs are PipeWire, Video4Linux or GStreamer. The dmabuf support in GStreamer will be much more solid in the upcoming 1.24 release.
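
Importing a dmabuf is then a matter of filling in the builder. Here is a minimal sketch for a single-plane buffer, where the file descriptor, size, fourcc, modifier and stride are assumed to come from the producer (error handling trimmed):

#include <gtk/gtk.h>

/* Wrap an existing single-plane dmabuf in a GdkTexture. The fd, size,
 * fourcc, modifier and stride come from whatever produced the dmabuf
 * (e.g. a video decoder); they are placeholders here. */
static GdkTexture *
wrap_dmabuf (GdkDisplay *display,
             int fd, unsigned width, unsigned height,
             guint32 fourcc, guint64 modifier, unsigned stride)
{
  GdkDmabufTextureBuilder *builder = gdk_dmabuf_texture_builder_new ();
  GdkTexture *texture;
  GError *error = NULL;

  gdk_dmabuf_texture_builder_set_display (builder, display);
  gdk_dmabuf_texture_builder_set_width (builder, width);
  gdk_dmabuf_texture_builder_set_height (builder, height);
  gdk_dmabuf_texture_builder_set_fourcc (builder, fourcc);
  gdk_dmabuf_texture_builder_set_modifier (builder, modifier);
  gdk_dmabuf_texture_builder_set_n_planes (builder, 1);
  gdk_dmabuf_texture_builder_set_fd (builder, 0, fd);
  gdk_dmabuf_texture_builder_set_offset (builder, 0, 0);
  gdk_dmabuf_texture_builder_set_stride (builder, 0, stride);

  texture = gdk_dmabuf_texture_builder_build (builder, NULL, NULL, &error);
  if (texture == NULL)
    g_warning ("Could not import dmabuf: %s", error->message);

  g_object_unref (builder);

  return texture;
}

Since GdkTexture implements GdkPaintable, the resulting texture can be handed straight to the GtkPicture in the previous sketch.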

When testing this code, we used the GtkMediaStream implementation for PipeWire by Georges Basile Stavracas Neto (found in pipewire-media-stream) and libmks by Christian Hergert and Bilal Elmoussaoui.

What are the limitations?


At the moment, graphics offload only works with Wayland on Linux. There is some hope that we may be able to implement similar things on macOS, but for now, this is Wayland-only. It also depends on the content being in dmabufs.

Applications that want to take advantage of this need to play along and avoid doing things that interfere with the use of subsurfaces, such as rounding the corners of the video content. The GtkGraphicsOffload docs have more details for developers on constraints and how to debug problems with graphics offload.

Summary


The GTK 4.14 release will have some interesting new capabilities for media playback. You can try it now, with the just-released 4.13.3 snapshot.

Please try it and let us know what does and doesn’t work for you.

blog.gtk.org/2023/11/15/introd…

