7 Jun 2017

GPU Support with OpenEmbedded (Introduction)

Synopsis

Traditionally, an embedded device with a couple of buttons and a 2x16 text display was considered state-of-the-art. These days, an increasing number of embedded projects use graphical displays, often touch-enabled, and this trend appears to be growing. If an embedded product is going to use a graphics system, it would be best if as much of the graphics processing as possible were offloaded from the CPU to the GPU.

Being able to quickly put together a basic image for an embedded device that includes accelerated graphics support is the starting point for more and more projects. Ideally the project's time should be spent developing the application which runs on the device, rather than on trying to build the basic image with functioning accelerated graphics.

Modern GPUs include multiple logical subunits for different jobs: multimedia units for video playback, compute units for computation offloading, rendering units for drawing, and many others. My primary interest is rendering on X11.

OpenEmbedded (OE) is a great tool for building and maintaining images for embedded devices (as well as for building and maintaining embedded distributions). In this series of articles I want to take a look at how well (or not) OE supports GPUs and GPU acceleration. GPU drivers and acceleration are huge topics, and I won't pretend to know or write much about them. Rather, I'll be looking at this topic from an "image building" point-of-view.

GPU Support Options

When a vendor ships a GPU, they usually provide some sort of software for it, but that software usually comes in the form of a binary blob exposed via a high-level API (such as OpenGL). From a software point-of-view, interfacing with a GPU requires many moving parts. On one side is the kernel, on the other side is the application itself; in between are many other components. When a vendor ships a binary blob, it is built against a specific version/branch of each of these components. This means that the moment you pick a specific board/SoC for your project, you are already locked into a specific kernel version for your product. Your product will forever be tied to that version, unless the GPU vendor decides to release a newer version of the blob for your given GPU. Worse still, even though the kernel you're being locked into says (for example) "3.10", in most cases you're forced to use your vendor's branch of "3.10", which really means: "at some point this was 3.10, but now (1000+ patches later) it could only be best described as '3.10-ish'".
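In OpenEmbedded terms, a vendor's BSP layer typically expresses this lock-in by pinning the kernel provider and version in its machine configuration. A minimal sketch (the "linux-vendor" recipe name is a hypothetical placeholder, not a recipe from any real BSP layer):

    # sketch of a BSP machine-configuration fragment;
    # "linux-vendor" is a hypothetical recipe name
    PREFERRED_PROVIDER_virtual/kernel = "linux-vendor"
    # pin the kernel to the vendor's "3.10-ish" branch
    PREFERRED_VERSION_linux-vendor = "3.10%"

Everything built for that machine then inherits the pinned kernel, which is exactly the lock-in described above.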

Many embedded projects like to use (or at least experiment with using) the PREEMPT_RT patch. But not every kernel release has an associated PREEMPT_RT patch. So if the kernel you're being forced to use doesn't have one, you'll either have to invest the effort in getting the closest PREEMPT_RT patch working with your specific kernel, or forgo using PREEMPT_RT altogether. In some cases, although your kernel might be advertised as a given version, and although there might be a PREEMPT_RT patch for that kernel version, the vendor patches that have been added make applying the PREEMPT_RT patch difficult.

Similarly, support for new features is being added to the kernel every day. If your GPU vendor is locking you into an older kernel, you'll either have to back-port the new features to the older kernel yourself, or not be able to take advantage of the new features in your product.

Another potential "gotcha" when using a GPU vendor's binary blob is device support. Sometimes a GPU vendor will decide to support only a specific OS (Android, and not Linux at all), a specific display server (Xorg vs Wayland vs Mir...), or a specific API (OpenGL vs OpenGL ES (1, 2, 3?) vs Vulkan...) in their binary blob (or some small subset thereof). In many companies, the people who develop the product aren't the same people who choose the board/SoC (and there might be no communication between these two groups), meaning the SoC gets chosen based on factors such as availability, size, or price without any consideration for how the product will need to be coded if such restrictions are in place.

There are also security implications of using older kernels...

...and the list continues.

An open-source GPU driver provides you with the most flexibility in choosing which version of which components you want to use in your product, as well as the most flexibility in how to implement your product. You can choose to use the pure upstream sources, or any variation thereof. You can decide to use OpenGL ES on X11, if that's what you prefer. It also lets you experiment with various projects the wider community is working on. Do you want to create a product that uses virtualization, accelerated graphics, PREEMPT_RT, and supports the latest TPM 2.0 devices? No problem. Want to try that with a binary blob that locks you into some version of a 3.4 kernel...? That might be a little more difficult. Your GPU vendor can't possibly predict what sort of product you'll want to create or how you'll want to create it.

In summary, there are two options: use the vendor-supplied binary blobs, which limit your flexibility, or use an open-source graphics driver and get to make more of the decisions yourself.

Open-Source GPU Projects

There are a number of projects whose goal is to create an open-source driver for a particular GPU family, for example:
  • nouveau (NVIDIA)
  • etnaviv (Vivante)
  • freedreno (Qualcomm Adreno)
  • lima (ARM Mali)

Additionally, Intel already provides and supports free and open-source drivers for the GPUs in their chipsets. Yay Intel! If only all companies who produce GPUs were so like-minded! For one thing, there would be no need for a write-up such as this one.

Note: not all open-source GPU projects provide support for every subunit or function a GPU implements, nor for every API; most of these projects are "works in progress". Having said that, however, most are quite mature, offer excellent capabilities (in some cases exceeding those of the vendor blobs!), and at the very least offer the ability to adapt the driver to your needs.



Why OpenEmbedded?

Getting the right versions of each of these components configured with the correct options, installing them to the correct locations, setting up a cross-compiler, cross-compiling all the code, and tweaking the image with the proper configuration files is not a trivial undertaking. Even assembling the right set of components isn't trivial, because the implementation details of how acceleration is achieved for different GPUs vary!

OpenEmbedded provides the metadata, the "recipes", that describe the low-level details of how to configure and build various components. It allows the user to focus on higher-level details, instead of getting bogged down in the minutiae of setting up sysroots for cross-compilation and making sure the compiler gets passed the right parameters. Do you want your image to include the "xdpyinfo" program? Just add it to the list. Do you want to build an image with musl instead of glibc? Just add the correct layer and set the variable indicating which C library to use. Then let OpenEmbedded handle the details; the commands you type are the same regardless.
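For instance, the two tweaks just mentioned might look something like this in a build's conf/local.conf (a minimal sketch; the xdpyinfo recipe and the TCLIBC variable come from current OpenEmbedded metadata, but check your release's documentation):

    # conf/local.conf
    # add the xdpyinfo program to any image being built
    IMAGE_INSTALL_append = " xdpyinfo"
    # build the whole image against musl instead of glibc
    TCLIBC = "musl"

With those lines in place, the same "bitbake <image>" command as before produces the adjusted image.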

There are, of course, other build systems for generating images. The point of these articles, however, is to survey the state of the art in graphics support with respect to OpenEmbedded. This is not meant to be a series of articles on the state of open-source graphics support in general, nor a comparison of graphics support across various build systems.

Summary

For each GPU family, I would like to write an article describing how to use OpenEmbedded to create two images: one using the vendor blob, and one using the open-source replacement. Ideally it would be easy for anyone to create either of these images, allowing users to quickly put together a base image with the GPU support of their choosing.
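To give a rough idea of what that choice might look like in practice: in OpenEmbedded, selecting between graphics stacks often comes down to pointing a few virtual providers at different recipes. A sketch for the open-source variant (the vendor recipe name in the comment is a hypothetical placeholder; actual names depend on the BSP layer in question):

    # conf/local.conf (sketch)
    # open-source stack: let Mesa provide the GL/EGL interfaces
    PREFERRED_PROVIDER_virtual/egl = "mesa"
    PREFERRED_PROVIDER_virtual/libgl = "mesa"
    PREFERRED_PROVIDER_virtual/libgles2 = "mesa"
    # vendor-blob stack: point the same providers at the vendor's recipe
    # instead, e.g. "libgles-vendor" (hypothetical name)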

Going further, I'd like to then run the same software on each image and provide performance statistics and general feedback.

Hopefully the information in these articles will:
  • provide concise information to help users get their images built and running easily and quickly
  • provide a comparison between the various GPU families and provide a software support matrix
  • help make it easy for developers to become involved in developing and debugging open-source graphics drivers

Caveat

As always, please try to remember that software is an ever-evolving entity. As I write this article (early June 2017) I try to be as correct as possible. But that doesn't mean I'm always correct, and it doesn't mean that what is correct right now is still correct an hour from now. So if you're reading these articles many years in the future, please remember that everything evolves, and there will be a time at which all of what's written here stops being true, or possible, or whatever.
