24 Jun 2017

etnaviv/vivante Update

Lucas Stach suggested I also run the test (glmark2-es2) with --off-screen. Here are those results from a recent build.

Note that, with etnaviv:
  • the test will sometimes segfault (usually somewhere in the [conditionals] tests)
  • the [ideas] test will sometimes cause a GPU hangcheck
  • the [terrain] test complains "etna_draw_vbo:199: compiled shaders are not okay"



wandboard        vivante                                    etnaviv
GL_VENDOR        Vivante Corporation                        etnaviv
GL_RENDERER      Vivante GC880                              Gallium 0.4 on Vivante GC880 rev 5106
GL_VERSION       OpenGL ES 3.0 V5.0.11.p8.41671             OpenGL ES 2.0 Mesa 17.1.1
/proc/loadavg    0.41 0.15 0.15                             0.26 0.50 0.15

                 (default)  --fullscreen  --off-screen      (default)  --fullscreen  --off-screen
glmark2 score      116          52            193              47          56            138
[build]            204         141            451              80         136            437
[build]            259         158            713              84         160            614
[texture]          201         114            523              76         108            294
[texture]          202         113            492              76         100            286
[texture]          199         109            478              75          97            259
[shading]          205         122            392              81         119            335
[shading]          170          61            184              70          76            160
[shading]          108          39            111              51          52             97
[shading]           81          30             83              42          39             69
[bump]             124          55            128              57          69            119
[bump]             230          72            281              78          98            236
[bump]             203          61            223              70          73            163
[effect2d]          62          15             68              33          26             47
[effect2d]          23           5             23              14           9             16
[pulsar]           185          84            414              75         102            372
[desktop]           31           9             31              15          10             18
[desktop]          104          33            122              42          39             71
[buffer]            48          32             53              30          33             46
[buffer]            49          32             52              29          32             47
[buffer]            55          35             62              36          38             51
[ideas]             44           4             44               3           5            100
[jellyfish]         61          19             62              29          23             38
[terrain]            1           0              1               2           1              2
[shadow]            92          29            107              38          34             62
[refract]           20           9             20              10           8             11
[conditionals]     209          77            383              69          78            169
[conditionals]      61          19             65              33          26             47
[conditionals]     204          76            347              65          76            159
[function]         114          36            130              49          47             90
[function]          34          11             35              30          24             42
[loop]             105          34            119              49          46             88
[loop]             105          34            119              49          45             88
[loop]              55          18             69              30          24             42

9 Jun 2017

GPU Support with OpenEmbedded (etnaviv/vivante)

Introduction

This series of articles assumes some familiarity with OpenEmbedded, but if you haven't used it before, hopefully you can still follow along. If you've never used OpenEmbedded and would like to give these build instructions a try, start off by reading and trying the examples in The Yocto Project's Quick Start Guide. Hopefully that will get you and your machine set up.

There are many SoCs that incorporate the Vivante GPU. I happen to have the Wandboard Dual, so that's the board I'll be using for my tests.

Build Setup

To begin an OpenEmbedded build (assuming your Linux build machine has all the necessary packages), we need to choose some place on our computer to which we have rwx access, and grab the necessary metadata. The first chunks of metadata contain all the base, generic things:

$ git clone git://git.openembedded.org/openembedded-core
$ git clone git://git.openembedded.org/meta-openembedded

Then we need to add BSP metadata. Most BSPs consist of one layer, but in the case of the Wandboard we need the generic freescale BSP layer (meta-freescale) and the BSP that specifically supports the Wandboard (meta-freescale-3rdparty). The Wandboard has an i.MX6 SoC on it which was made by Freescale. In 2015 NXP merged with Freescale. Although the i.MX6 is, technically, an NXP SoC, the layer retains the "freescale" name.

$ git clone https://git.yoctoproject.org/git/meta-freescale
$ git clone https://github.com/Freescale/meta-freescale-3rdparty

How did I possibly know I needed those two layers? By consulting the layer index. The layer index is where layer maintainers go to make their layers known, and it's a great place for users to find supported machines, software, distros, etc. If you're not working with the Wandboard, you'll need to consult the layer index to figure out which layers you need for your specific hardware.

Now we need the tool that uses all this metadata and actually performs the build:

$ git clone git://git.openembedded.org/bitbake

Now that we have all the pieces in place, we set up our shell:

$ . openembedded-core/oe-init-build-env build bitbake/
You had no conf/local.conf file. This configuration file has therefore been
created for you with some default values. You may wish to edit it to, for
example, select a different MACHINE (target hardware). See conf/local.conf
for more information as common configuration options are commented.

You had no conf/bblayers.conf file. This configuration file has therefore been
created for you with some default values. To add additional metadata layers
into your configuration please add entries to conf/bblayers.conf.

The Yocto Project has extensive documentation about OE including a reference
manual which can be found at:
    http://yoctoproject.org/documentation

For more information about OpenEmbedded see their website:
    http://www.openembedded.org/


### Shell environment set up for builds. ###

You can now run 'bitbake <target>'

Common targets are:
    core-image-minimal
    core-image-sato
    meta-toolchain
    meta-ide-support

You can also run generated qemu images with a command like 'runqemu qemux86'

This creates the build directory ("build") that was specified on the shell setup line. Now we tell bitbake about our additional layers:

$ bitbake-layers add-layer ../meta-freescale
Parsing recipes: 100% |#########################################################| Time: 0:00:12
Parsing of 925 .bb files complete (0 cached, 925 parsed). 1408 targets, 149 skipped, 0 masked, 0 errors.
$ bitbake-layers add-layer ../meta-freescale-3rdparty
Parsing recipes: 100% |#########################################################| Time: 0:00:07
Parsing of 962 .bb files complete (0 cached, 962 parsed). 1445 targets, 185 skipped, 0 masked, 0 errors.
$ bitbake-layers add-layer ../meta-openembedded/meta-oe
Parsing recipes: 100% |#########################################################| Time: 0:00:13
Parsing of 1614 .bb files complete (0 cached, 1614 parsed). 2226 targets, 265 skipped, 0 masked, 0 errors.

Vivante Build

By default, the "freescale" BSP layers assume the user wants to build an image using the vivante binary blob. This blob isn't "free", so in order to use it you have to agree to its EULA. To do that, read the "EULA" file you'll find at the top level of the meta-freescale BSP layer that was cloned earlier. Once you've read that file and agreed to it, you can proceed with this build. If you don't or can't agree to the EULA, skip ahead to the Etnaviv build.

When you set up your shell earlier, it created a boilerplate configuration file for you. From the "build" directory that was created during setup, open the conf/local.conf file with your favourite text editor and add the following lines at the top (be sure to leave the rest of the file as-is!):

ACCEPT_FSL_EULA = "1"
CORE_IMAGE_EXTRA_INSTALL += "openbox glmark2"
DISTRO_FEATURES_append = " opengl x11"
IMAGE_FEATURES += "x11"

Once that's done you can run your build:

$ MACHINE=wandboard bitbake core-image-full-cmdline
Parsing recipes: 100% |#########################################################| Time: 0:00:14
Parsing of 1614 .bb files complete (0 cached, 1614 parsed). 2226 targets, 249 skipped, 0 masked, 0 errors.
NOTE: Resolving any missing task queue dependencies

Build Configuration:
BB_VERSION        = "1.34.0"
BUILD_SYS         = "x86_64-linux"
NATIVELSBSTRING   = "opensuse-42.2"
TARGET_SYS        = "arm-oe-linux-gnueabi"
MACHINE           = "wandboard"
DISTRO            = "nodistro"
DISTRO_VERSION    = "nodistro.0"
TUNE_FEATURES     = "arm armv7a vfp thumb neon callconvention-hard cortexa9"
TARGET_FPU        = "hard"
meta              = "master:186882ca62bf683b93cd7a250963921b89ba071f"
meta-freescale    = "master:98d57b06d88cb22129bd417a9a3edbaf24612460"
meta-freescale-3rdparty = "master:fd3962a994b2f477d3e81fa7083f6b3d4e666df5"
meta-oe           = "master:41cf832cc9abd6f2293a6d612463a34a53a9a52a"

Initialising tasks: 100% |######################################################| Time: 0:00:04
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: Tasks Summary: Attempted 4369 tasks of which 2317 didn't need to be rerun and all succeeded.

From this successful build we'll find our device image located at:

tmp-glibc/deploy/images/wandboard/core-image-full-cmdline-wandboard.wic.gz

If you unzip this wic file, it can be dd'ed directly to a microSD card; the microSD card can then be inserted into the Wandboard, which can then be powered up.

$ gzip -d < tmp-glibc/deploy/images/wandboard/core-image-full-cmdline-wandboard.wic.gz > core-image-full-cmdline-wandboard.wic
$ su
Password:
# dd if=core-image-full-cmdline-wandboard.wic of=/dev/sdi bs=10M
21+1 records in
21+1 records out
222298112 bytes (222 MB, 212 MiB) copied, 85.6691 s, 2.6 MB/s
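As an aside, the decompress-then-dd steps above can also be done as a single pipeline, with no intermediate .wic file. This is only a sketch: /dev/sdX is a placeholder for your microSD card's device node; double-check it (e.g. with lsblk) before running, since dd will happily overwrite the wrong disk.

```shell
# Stream-decompress the image and write it straight to the card (run as root).
# /dev/sdX is a placeholder -- verify the device node before running!
gzip -dc tmp-glibc/deploy/images/wandboard/core-image-full-cmdline-wandboard.wic.gz \
  | dd of=/dev/sdX bs=10M
```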

I like to interact with embedded boards via a serial console. On the Wandboard, this means using a DE-9 serial cable. Once the board boots up I log in and run the benchmark application:

OpenEmbedded nodistro.0 wandboard /dev/ttymxc0

wandboard login: D-BUS per-session daemon address is: unix:abstract=/tmp/dbus-fd5NI0apUa,guid=772b87be1cee0f1d2acde6c25938e674
Using calibration data stored in /etc/pointercal.xinput
Invalid format 42060
unable to find device EETI eGalax Touch Screen
INFO: width=1920, height=1080
Obt-Message: Failed to open an Input Method
Openbox-Message: X server does not support locale.
Openbox-Message: Cannot set locale modifiers for the X server.

root
root@wandboard:~# uname -a
Linux wandboard 4.1.15-1.1.0-ga-wandboard+g8b015473d340 #1 SMP PREEMPT Wed Jun 7 23:42:49 EDT 2017 armv7l armv7l armv7l GNU/Linux
root@wandboard:~# export DISPLAY=:0
root@wandboard:~# glmark2-es2
=======================================================
    glmark2 2014.03
=======================================================
    OpenGL Information
    GL_VENDOR:     Vivante Corporation
    GL_RENDERER:   Vivante GC880
    GL_VERSION:    OpenGL ES 3.0 V5.0.11.p8.41671
=======================================================
[build] use-vbo=false: FPS: 206 FrameTime: 4.854 ms
[build] use-vbo=true: FPS: 246 FrameTime: 4.065 ms
[texture] texture-filter=nearest: FPS: 200 FrameTime: 5.000 ms
[texture] texture-filter=linear: FPS: 200 FrameTime: 5.000 ms
[texture] texture-filter=mipmap: FPS: 199 FrameTime: 5.025 ms
[shading] shading=gouraud: FPS: 205 FrameTime: 4.878 ms
[shading] shading=blinn-phong-inf: FPS: 170 FrameTime: 5.882 ms
[shading] shading=phong: FPS: 108 FrameTime: 9.259 ms
[shading] shading=cel: FPS: 81 FrameTime: 12.346 ms
[bump] bump-render=high-poly: FPS: 124 FrameTime: 8.065 ms
[bump] bump-render=normals: FPS: 220 FrameTime: 4.545 ms
[bump] bump-render=height: FPS: 203 FrameTime: 4.926 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 62 FrameTime: 16.129 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 23 FrameTime: 43.478 ms
[pulsar] light=false:quads=5:texture=false: FPS: 183 FrameTime: 5.464 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 31 FrameTime: 32.258 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] effect=shadow:windows=4: FPS: 103 FrameTime: 9.709 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 49 FrameTime: 20.408 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 49 FrameTime: 20.408 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 57 FrameTime: 17.544 ms
[ideas] speed=duration: FPS: 44 FrameTime: 22.727 ms
[jellyfish] <default>: FPS: 61 FrameTime: 16.393 ms
[terrain] <default>: FPS: 1 FrameTime: 1000.000 ms
[shadow] <default>: FPS: 92 FrameTime: 10.870 ms
[refract] <default>: FPS: 20 FrameTime: 50.000 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 209 FrameTime: 4.785 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 61 FrameTime: 16.393 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 203 FrameTime: 4.926 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 114 FrameTime: 8.772 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 34 FrameTime: 29.412 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 105 FrameTime: 9.524 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 105 FrameTime: 9.524 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 55 FrameTime: 18.182 ms
=======================================================
                                  glmark2 Score: 115
=======================================================

Some of the relevant packages in this build include:
  • libegl-mesa_2:17.1.1
  • libgles2-mesa_2:17.1.1
  • libgl-mesa_2:17.1.1
  • xserver-xorg_2:1.19.3
  • kernel-4.1.15-1.1.0-ga-wandboard+g8b015473d340
  • libc6_2.25
  • glmark2_2014.03+0+7215c0f337
  • cross-compiler: gcc-6.3.0

Etnaviv Build

Switching to a build that uses etnaviv isn't very hard. Keeping the bottom part of the configuration file as it was found, modify the top of conf/local.conf so that it looks like:

MACHINEOVERRIDES .= ":use-mainline-bsp"
CORE_IMAGE_EXTRA_INSTALL += "openbox glmark2"
DISTRO_FEATURES_append = " opengl x11"
IMAGE_FEATURES += "x11"

Since you're no longer using the binary blob, agreeing to the EULA is no longer required. Telling the build you want to switch to more "upstream" components is just a matter of adding the MACHINEOVERRIDES line.
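A note on the syntax: BitBake's ".=" operator appends its right-hand side with no separating space, which is why the appended string carries its own leading colon. A sketch of the effect (assuming the machine configuration has already populated MACHINEOVERRIDES):

```
# With MACHINE=wandboard the machine config ends up with something like:
#   MACHINEOVERRIDES = "...:wandboard"
# Appending with ".=" adds the text verbatim (no space inserted):
MACHINEOVERRIDES .= ":use-mainline-bsp"
# giving "...:wandboard:use-mainline-bsp", so recipes can key off the
# "use-mainline-bsp" override to select the mainline kernel and graphics stack.
```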

Building:

$ MACHINE=wandboard bitbake core-image-full-cmdline
Parsing recipes: 100% |#########################################################| Time: 0:00:17
Parsing of 1614 .bb files complete (0 cached, 1614 parsed). 2226 targets, 265 skipped, 0 masked, 0 errors.
NOTE: There are 199 recipes to be removed from sysroot wandboard, removing...
NOTE: Resolving any missing task queue dependencies

Build Configuration:
BB_VERSION        = "1.34.0"
BUILD_SYS         = "x86_64-linux"
NATIVELSBSTRING   = "opensuse-42.2"
TARGET_SYS        = "arm-oe-linux-gnueabi"
MACHINE           = "wandboard"
DISTRO            = "nodistro"
DISTRO_VERSION    = "nodistro.0"
TUNE_FEATURES     = "arm armv7a vfp thumb neon callconvention-hard"
TARGET_FPU        = "hard"
meta              = "master:186882ca62bf683b93cd7a250963921b89ba071f"
meta-freescale    = "master:98d57b06d88cb22129bd417a9a3edbaf24612460"
meta-freescale-3rdparty = "master:fd3962a994b2f477d3e81fa7083f6b3d4e666df5"
meta-oe           = "master:41cf832cc9abd6f2293a6d612463a34a53a9a52a"

Initialising tasks: 100% |######################################################| Time: 0:00:07
NOTE: Executing SetScene Tasks
NOTE: Executing RunQueue Tasks
NOTE: Tasks Summary: Attempted 4396 tasks of which 1407 didn't need to be rerun and all succeeded.

Unpack the wic file, dd it to a microSD card, and boot it up on the Wandboard:

OpenEmbedded nodistro.0 wandboard /dev/ttymxc0

wandboard login: Error: No calibratable devices found.
Obt-Message: Failed to open an Input Method
Openbox-Message: X server does not support locale.
Openbox-Message: Cannot set locale modifiers for the X server.

root
root@wandboard:~# uname -a
Linux wandboard 4.9.21-fslc+gb69ecd63c123 #1 SMP Thu Jun 8 02:34:26 EDT 2017 armv7l armv7l armv7l GNU/Linux
root@wandboard:~# export DISPLAY=:0
root@wandboard:~# glmark2-es2
=======================================================
    glmark2 2014.03
=======================================================
    OpenGL Information
    GL_VENDOR:     etnaviv
    GL_RENDERER:   Gallium 0.4 on Vivante GC880 rev 5106
    GL_VERSION:    OpenGL ES 2.0 Mesa 17.1.1
=======================================================
[build] use-vbo=false: FPS: 81 FrameTime: 12.346 ms
[build] use-vbo=true:[   59.956033] random: crng init done
 FPS: 91 FrameTime: 10.989 ms
[texture] texture-filter=nearest: FPS: 80 FrameTime: 12.500 ms
[texture] texture-filter=linear: FPS: 78 FrameTime: 12.821 ms
[texture] texture-filter=mipmap: FPS: 75 FrameTime: 13.333 ms
[shading] shading=gouraud: FPS: 87 FrameTime: 11.494 ms
[shading] shading=blinn-phong-inf: FPS: 68 FrameTime: 14.706 ms
[shading] shading=phong: FPS: 51 FrameTime: 19.608 ms
[shading] shading=cel: FPS: 42 FrameTime: 23.810 ms
[bump] bump-render=high-poly: FPS: 57 FrameTime: 17.544 ms
[bump] bump-render=normals: FPS: 74 FrameTime: 13.514 ms
[bump] bump-render=height: FPS: 66 FrameTime: 15.152 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 34 FrameTime: 29.412 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 14 FrameTime: 71.429 ms
[pulsar] light=false:quads=5:texture=false: FPS: 75 FrameTime: 13.333 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 15 FrameTime: 66.667 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] effect=shadow:windows=4: FPS: 42 FrameTime: 23.810 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 30 FrameTime: 33.333 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 29 FrameTime: 34.483 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 34 FrameTime: 29.412 ms
[ideas] speed=duration:[  253.131165] etnaviv-gpu 130000.gpu: hangcheck detected gpu lockup!
[  253.137434] etnaviv-gpu 130000.gpu:      completed fence: 11419
[  253.143413] etnaviv-gpu 130000.gpu:      active fence: 11420
[  253.150061] etnaviv-gpu 130000.gpu: hangcheck recover!
[  257.691146] etnaviv-gpu 130000.gpu: hangcheck detected gpu lockup!
[  257.697403] etnaviv-gpu 130000.gpu:      completed fence: 11420
[  257.703374] etnaviv-gpu 130000.gpu:      active fence: 11421
[  257.709247] etnaviv-gpu 130000.gpu: hangcheck recover!
[  263.931124] etnaviv-gpu 130000.gpu: hangcheck detected gpu lockup!
[  263.937380] etnaviv-gpu 130000.gpu:      completed fence: 11423
[  263.943352] etnaviv-gpu 130000.gpu:      active fence: 11425
[  263.949221] etnaviv-gpu 130000.gpu: hangcheck recover!
[  269.131129] etnaviv-gpu 130000.gpu: hangcheck detected gpu lockup!
[  269.137383] etnaviv-gpu 130000.gpu:      completed fence: 11425
[  269.143355] etnaviv-gpu 130000.gpu:      active fence: 11427
[  269.149324] etnaviv-gpu 130000.gpu: hangcheck recover!
 FPS: 0 FrameTime: inf ms
[jellyfish] <default>: FPS: 29 FrameTime: 34.483 ms
[terrain] <default>:error: compile failed!
etna_draw_vbo:199: compiled shaders are not okay
[... the previous two lines repeat many more times ...]
 FPS: 2 FrameTime: 500.000 ms
[shadow] <default>: FPS: 39 FrameTime: 25.641 ms
[refract] <default>: FPS: 10 FrameTime: 100.000 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 70 FrameTime: 14.286 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 33 FrameTime: 30.303 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 65 FrameTime: 15.385 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 50 FrameTime: 20.000 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 30 FrameTime: 33.333 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 49 FrameTime: 20.408 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 49 FrameTime: 20.408 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 30 FrameTime: 33.333 ms
=======================================================
                                  glmark2 Score: 47
=======================================================

Some of the relevant packages in this build include:
  • libegl-mesa_2:17.1.1
  • libgles2-mesa_2:17.1.1
  • libgl-mesa_2:17.1.1
  • xserver-xorg_2:1.19.3
  • kernel-image-4.9.21-fslc
  • libc6_2.25
  • glmark2_2014.03+0+7215c0f337
  • cross-compiler: gcc-6.3.0

Results

My test currently consists of running glmark2-es2 (i.e. the OpenGL ES2 version of glmark2). As of today, the etnaviv support isn't as full-featured as the binary blob. However, using etnaviv doesn't require any EULAs, and it lets you use a newer kernel. Thanks to how well the freescale layers are organized/maintained, switching between the two builds is quite easy.

Here's a side-by-side comparison:

             vivante                          etnaviv
GL_VENDOR    Vivante Corporation              etnaviv
GL_RENDERER  Vivante GC880                    Gallium 0.4 on Vivante GC880 rev 5106
GL_VERSION   OpenGL ES 3.0 V5.0.11.p8.41671   OpenGL ES 2.0 Mesa 17.1.1

glmark2(-es2) is a set of individual tests that are run back-to-back. They can be run individually, but calling "glmark2-es2" by itself simply invokes all of them sequentially.
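Incidentally, the FrameTime that glmark2 prints alongside each FPS value is just 1000/FPS in milliseconds. A quick sketch that recomputes it from the FPS field, using one result line from the run above:

```shell
# Recompute FrameTime (ms) from the FPS field of a glmark2 result line.
echo "[build] use-vbo=false: FPS: 206 FrameTime: 4.854 ms" |
awk '{ for (i = 1; i <= NF; i++) if ($i == "FPS:") fps = $(i + 1);
       printf "FrameTime: %.3f ms\n", 1000 / fps }'
# prints: FrameTime: 4.854 ms
```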

                    vivante   etnaviv
glmark2-es2 score     115       47
[build]               206       81
[build]               246       91
[texture]             200       80
[texture]             200       78
[texture]             199       75
[shading]             205       87
[shading]             170       68
[shading]             108       51
[shading]              81       42
[bump]                124       57
[bump]                220       74
[bump]                203       66
[effect2d]             62       34
[effect2d]             23       14
[pulsar]              183       75
[desktop]              31       15
[desktop]             103       42
[buffer]               49       30
[buffer]               49       29
[buffer]               57       34
[ideas]                44       (hangcheck)
[jellyfish]            61       29
[terrain]               1        2 (shader compile failed)
[shadow]               92       39
[refract]              20       10
[conditionals]        209       70
[conditionals]         61       33
[conditionals]        203       65
[function]            114       50
[function]             34       30
[loop]                105       49
[loop]                105       49
[loop]                 55       30

7 Jun 2017

GPU Support with OpenEmbedded (Introduction)

Synopsis

Not so long ago, an embedded device that included a couple of buttons and a 2x16 text display was considered state-of-the-art. These days, an increasing number of embedded projects are using graphical displays, often touch-enabled, and this trend appears to be growing. If an embedded product is going to use a graphics system, it would be best if as much of the graphics processing as possible were offloaded from the CPU to the GPU.

Being able to quickly put together a basic image for an embedded device that includes accelerated graphics support is the starting point for more and more projects. Ideally the project's time should be spent developing the application which runs on the device, rather than on trying to build the basic image with functioning accelerated graphics.

Modern GPUs include multiple logical subunits for different jobs: multimedia units for video playback, compute units for computation offloading, rendering units for drawing, and many others. My primary interest is with rendering on X11.

OpenEmbedded (OE) is a great tool for building and maintaining images for embedded devices (as well as for building and maintaining embedded distributions). In this series of articles I want to take a look at how well (or not) OE supports GPUs and GPU acceleration. GPU drivers and acceleration are huge topics, and I won't pretend to know or write much about them. Rather, I'll be looking at this topic from an "image building" point-of-view.

GPU Support Options

When a vendor ships a GPU, they usually provide some sort of software for it. But usually that software is in the form of a binary blob exposed via a high-level API (such as OpenGL). From a software point-of-view, interfacing with a GPU requires many moving parts. On the one side is the kernel, on the other side is the application itself; in between are many other components. When a vendor ships a binary blob, it is built against a specific version/branch of each of these components. This means that the moment you pick a specific board/SoC for your project, you are already locking yourself into a specific kernel version for your product. Your product will forever be locked to that version, unless the GPU vendor decides to release a newer version of the blob for your given GPU. Worse still, even though the kernel that you're being locked into says (for example) "3.10", in most cases you're forced to use your vendor's branch of "3.10". Which really means: "at some point this was 3.10, but now (1000+ patches later) it could only be best described as '3.10-ish'".

Many embedded projects like to use (or at least experiment with using) the PREEMPT_RT patch. But not every kernel release has an associated PREEMPT_RT patch. So if the kernel you're being forced to use doesn't have one, you'll either have to invest the effort in trying to get the closest PREEMPT_RT patch working with your specific kernel, or forgo using PREEMPT_RT altogether. In some cases, although your kernel might be advertised as a given version, and although there might be a PREEMPT_RT patch for that kernel version, the vendor patches that have been added make applying the PREEMPT_RT patch difficult.

Similarly, support for new features is being added to the kernel every day. If your GPU vendor is locking you into an older kernel, you'll either have to back-port the new features to the older kernel yourself, or not be able to take advantage of the new features in your product.

Another potential "gotcha" when using a GPU vendor's binary blob is device support. Sometimes a GPU vendor will decide to support only a specific OS (Android, and not Linux at all), or a specific display server (Xorg vs Wayland vs Mir...), or a specific API (OpenGL vs OpenGL ES (1, 2, 3?) vs Vulkan...) in their binary blob, or some small subset thereof. In many companies, the people who develop the product aren't the same people who choose the board/SoC (and there might be no communication between these two groups). Meaning the SoC gets chosen based on factors such as availability, size, or price without any consideration for how the product will need to be coded if such restrictions are in place.

There are also security implications of using older kernels...

...and the list continues.

An open-source GPU driver provides you with the most flexibility in choosing which version of which components you want to use in your product, as well as the most flexibility in how to implement your product. You can choose to use the pure upstream sources, or any variation thereof. You can decide to use OpenGL ES on X11, if that's what you prefer. As well, it lets you experiment with various projects the wider community is working on. Do you want to create a product that uses virtualization, accelerated graphics, PREEMPT_RT, and supports the latest TPM 2.0 devices? No problem. Want to try that with a binary blob that locks you into some version of a 3.4 kernel...? That might be a little more difficult. Your GPU vendor can't possibly predict what sort of product you'll want to create or how you'll want to create it.

In summary there are two options: use the vendor-supplied binary blobs which limit your flexibility, or use an open-source graphics driver and get to make more of the decisions yourself.

Open-Source GPU Projects

There are a number of projects whose goal is to create open-source drivers for a GPU family; etnaviv, which targets the Vivante GPUs used in this series, is one such project.
Additionally, Intel already provides and supports free and open-source drivers for the GPUs in their chipsets. Yay Intel! If only all companies who produce GPUs were so like-minded! For one thing, there would be no need for a write-up such as this one.

Note: not all open-source GPU projects provide support for every subunit or function a GPU implements nor provide support for every API (etc). Most of these projects are "works in progress". Having said that, however, most of these projects are quite mature and offer excellent capabilities (in some cases exceeding the capabilities of the vendor blobs!) and at least offer the ability to adapt to your needs.



Why OpenEmbedded?

Getting the right versions of each of these components configured with the correct options, installing them to the correct locations, setting up a cross-compiler, cross-compiling all the code, and tweaking them with proper configuration files in the image is not a trivial undertaking. Even just assembling the right set of components isn't trivial, because the implementation details of how acceleration is achieved vary from GPU to GPU!

OpenEmbedded provides the metadata, the "recipes", that describe the low-level details of how to configure and build various components. It allows the user to focus on higher-level details, instead of getting bogged down in the minutiae of setting up sysroots for cross-compilation and making sure the compiler gets passed the right parameters. Do you want your image to include the "xdpyinfo" program? Just add it to the list. Do you want to build an image with musl instead of glibc? Just add the correct layer and set the variable indicating which C library to use. Then let OpenEmbedded handle the details; the commands you type are the same regardless.
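To make the two examples above concrete, both are one-line settings in conf/local.conf. This is only a sketch: TCLIBC is the variable OE uses to select the C library, and the musl case assumes the appropriate layer is in your configuration.

```
# Include the xdpyinfo program in the image:
CORE_IMAGE_EXTRA_INSTALL += "xdpyinfo"

# Build the image against musl instead of glibc:
TCLIBC = "musl"
```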

There are, of course, other build systems for generating images. The point of this article, however, is to survey the state-of-the-art in graphics support with respect to OpenEmbedded. This is not meant to be a series of articles on the state of open-source graphics support in general nor a comparison of graphics support from various build systems.

Summary

For each GPU family, I would like to write an article describing how to use OpenEmbedded to create one of two images: one image using the vendor blob, and one image using the open-source replacement. Ideally it would be easy for anyone to create either image, allowing users to quickly build a base image with their choice of GPU support.

Going further, I'd like to then run the same software on each image and provide performance statistics and general feedback.

Hopefully the information in these articles will:
  • provide concise information to help users get their images built and running easily and quickly
  • provide a comparison between the various GPU families and provide a software support matrix
  • help make it easy for developers to become involved in developing and debugging open-source graphics drivers

Caveat

As always, please try to remember that software is an ever-evolving entity. As I write this article (early June 2017) I try to be as correct as possible. But that doesn't mean I'm always correct, and that doesn't mean that what is correct right now will still be correct an hour from now. So if you're reading these articles many years into the future, please try to remember that everything evolves, and there will be a time at which all of what's written here stops being true, or possible, or whatever.

1 Oct 2016

Dual-WAN

A quasi-review of the ASUS AC1900 RT-AC68U router and Bell's ZTE MF275R Turbo Hub (LTE). Other carriers also offer this same device.

As I mentioned in a previous post, getting satisfactory Internet access in the Canadian country-side can be a challenge, especially if you have specific needs and don't want to spend a huge fortune (just a small fortune, perhaps). [People living in medium to large-sized towns or cities in Canada generally have access to faster, cheaper Internet.]

On the one hand there's DSL, which is relatively cheap: plans are readily available with no caps and static IPs are available, but the speeds are slow. On the other hand, LTE is (in my specific case) >8x faster, but comes with caps, doesn't allow static IPs, and is more expensive.

How to decide between the two? Get both!

Having decided to get both, the next job is to decide how to manage your devices. A simple solution is to run two access points and make each of your devices log into one access point or the other. The ZTE is both an LTE modem and a dual-band WiFi access point all rolled into one device. A regular DSL setup would involve the phone line going to a DSL modem, then out from the modem to a router (which serves as an access point). But if you want the possibility of sharing both Internet connections between all your devices dynamically, a dual-WAN approach is better.

The devices you have in your home which connect to your router via WiFi are described as being on the LAN side of your router. The pathways which lead from your router to the Internet are described as being on the WAN side of your router. In the vast majority of cases there are many LAN devices and only one WAN connection. In my case I have two ways to get from my router to the Internet; this is described as a dual-WAN setup. Most routers expect the "usual" setup and therefore only provide one WAN connection. In order to run a dual-WAN configuration you have to own a router that specifically supports this topology. I own an ASUS AC1900 RT-AC68U router, which is one of the few that support dual-WAN and is the one with which I am most familiar. Other routers support dual-WAN too; you'll need to check their specs to verify whether they have this feature.

Let's say you do several Internet things with one device: you perform some large downloads from specific sites, and you also want to play some online games. For the downloads you'll want to use the DSL (which, although slower, will not incur large data costs since there's no cap). For gaming you'll want to use LTE. If both Internet connections are on separate routers, you'll need to switch which access point your device uses when performing each of these tasks. Plus you won't be able to perform both tasks at the same time; you'll need to do one task while connected to one access point, then switch to the other access point in order to perform the other task.

Another consideration is reliability. Although both technologies have pretty good uptimes, there will be times when one or the other might be down (especially DSL; it doesn't go down every week or even every month, but it does go down, and when it does it's very annoying). When one of your Internet connections goes down, you'll need to switch your devices so that they are all on the working access point. Not all devices are smart enough to switch automatically. The flip side of the same argument is: if you use a dual-WAN setup, you only ever have to program one access point (SSID) and password into all your devices and they'll all be able to access the Internet through either Internet connection (WAN interface) regardless of which service is up or down.

In my specific case, I'm using the ZTE MF275R Turbo Hub which I obtained through Bell Canada. If you read through this device's specs you'll find that it only allows 20 devices to be connected to it at a time. This doesn't sound too bad, but the device actually only allows up to 10 devices to be connected at any one time to each band. Since the device supports both 2.4GHz and 5GHz bands, it smudges the truth a little and claims up to 20 devices. So if you have more than 10 older WiFi devices, they can't all connect at the same time (since they'll all be using the 2.4GHz band and probably don't have support for the 5GHz band). My point being, another good reason to consider plugging your Turbo Hub into your router is that most routers don't have this artificial limitation, so you could connect more than 10 devices on any one band and route them all through your LTE Internet connection.

In other words, there are a number of good reasons to connect both Internet sources through the same router to feed your set of devices.

When the AC68U was introduced it did not support dual-WAN. But since its introduction, ASUS has been producing firmware updates, and somewhere along the way dual-WAN functionality was added to this device. The last time I checked, however, the "latest" manual still did not mention this functionality, so even if you download and read through their manual you might not believe this device supports the feature. There is also a chance that you might buy a "new" AC68U router only to find it loaded with old firmware, meaning you can't do dual-WAN "out of the box"; in this case you'll need to update the firmware yourself. In any case it's nice to see a manufacturer's stock firmware actually adding to the value of a device over time (instead of a rising trend among manufacturers to use firmware updates to take away features from the consumer!).

As part of its dual-WAN configuration, the AC68U allows the user to define a set of rules (up to 32) specifying which LAN device should use which WAN interface when connecting to which machines on the Internet. It's a great addition, but not perfect. The rules can only be defined in terms of source and destination numerical IP addresses. First of all, this means you have to configure the AC68U's DHCP server to give static IPs to your devices (not a big deal). Secondly, it only works if the machines to which you are connecting (on the Internet) have static IPs themselves (this is getting harder). Alternatively you can leave the destination IP blank, in which case the AC68U will fill in the destination with "all" when you add the rule. In this way you can specify that a given device (say, a Roku) will always use one specific WAN interface (e.g. the DSL) for all Internet traffic. For my purposes, I can live with these restrictions, but I could certainly see how some rules would be better described by source or destination port number/protocol, or DNS name.

29 Sep 2016

Internet in the Canadian Near-City

I love living in the country. But one of the few downsides is Internet connectivity: fewer choices, and poor quality for what few choices exist.

I don't have Fibre To The Home (FTTH) and never will. I'd be happy with Fibre To The Neighbourhood or Distribution Point, but that's probably never going to happen either. I don't even have cable as a choice. My only choices are: satellite, aDSL, and LTE/cell; and they're not great choices.

Satellites are a shared commodity and are over-provisioned to the point of uselessness. I had someone come and install satellite because of the promised 10Mbps. It was free for the first month; after about 20 days I called and asked them to remove it. It wasn't even worth finishing off the full free first month! Maybe at around 3am you might get something approaching 10Mbps. That'll last until about 6am. By 8am you're lucky to be getting 1-2Mbps, and from 10am until 3am the next morning you'd do better with a 2400 baud modem and a regular phone line!

With satellite the usage caps are pretty low, the costs are high (considering the low caps), and the performance is "good" only if you use the Internet through the night and don't need it during the day. Oh, and there's no option of simply paying to increase your cap; once you hit your cap, they throttle you down to speeds from the Internet's Pliocene age until your next billing cycle!

For the last 2 years (since I moved to my current address) I've been muddling along with aDSL. At best I get, maybe, 2Mbps down (if I'm lucky), and about 600Kbps up. Unlike satellite (or cable, if that were an option) it's not shared, so it's a pretty constant 1.5-2Mbps throughout the entire day. But here's the funny thing: every once in a while the performance plummets and I'm forced to contact my ISP for a remedy. DSL lines can be set to one of a couple of "profiles": there's a "go as fast as you can and ignore dropped packets" profile, a "minimize packet drops by going slower" profile, and a "make them pine for the days of dialup" profile. The annoying thing is, the carrier's network equipment is able to switch the profile without any human intervention! So every once in a while the network analyzes my line, decides my dropped-packet rate is too high, and switches me to a slower profile. Once I notice, I'm forced to log a performance ticket with my ISP, whose response is to log a ticket with the carrier, who (eventually) changes me back to the high-speed profile and everything is fine again (if you call 2Mbps "fine"). But, of course, that doesn't happen too quickly, so I usually have to go a couple of days with very slow lines before I'm boosted back up. That happens roughly 5 times a year.

With DSL it's easy and relatively cheap to get unlimited (or virtually unlimited) download caps, the cost is low-ish, but the performance isn't great. However, the speed is consistently not great, so you get used to it (until their equipment sets you to a lower profile). When I say the cost is low I'm not implying it's cheap by any stretch of the imagination. I'm speaking relative to the other Canadian choices; certainly not relative to what people in other countries pay.

Most providers offer different DSL packages. For a lot of money a person can get really fast DSL, but someone wanting to save a bit of cash can opt for slower speeds. The slowest package I can find is a 5/0.8Mbps package for $53 with a 50GB cap. Nobody offers a 2Mbps package, so even if I took this 5Mbps package I would already be paying for bandwidth I'm never going to realize, because the equipment on my street simply can't go that fast. As it turns out I needed a static IP and I wanted a plan with no cap, so the cheapest plan I could find to fit those criteria is a 10Mbps plan. So I'm paying for a 10Mbps connection, but by living in the country I'm not getting everything for which I'm paying.

Even though DSL isn't that great an option, ironically I'm "lucky" to have it. I'm the last person on my street to which the DSL lines extend. My neighbour, a couple houses down, doesn't have the option of DSL since these lines end at the junction box outside my home. My home is about 2.5 km from the neighbourhood junction, and that junction is about 4-5 km from the CO. So it's actually amazing that I even get DSL at all!

My final option is LTE (Internet over modern cell towers). Internet via cell towers has been around for many years (decades?) but it's an option I've never taken seriously due to the ridiculously low caps, the (traditionally) low speeds, and how we're gouged on cell data prices here in Canada. A recent conversation with a fellow country-living friend, however, had me reconsider. We've finally gotten to the point where LTE is, at least, worth considering. LTE offers much better speed than had previously been available via cell towers, so the data rate is now good. Again, everything's relative. For $60/month I can get roughly 16Mbps and a 5GB cap. How is this worth considering? If I put in a second DSL line and bonded them, I'd be paying over $140/month and my speed would only be roughly 4-5Mbps (but I'd have no cap). Spending $110/month on an LTE plan would give me up to 50GB (cap) at 16Mbps. Before LTE, doing Internet over cell towers would give you speeds in the low Mbps or high Kbps, have caps in the hundreds of MBs, and I won't even bother mentioning the price.
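To put those plan prices on a common footing, here's a quick back-of-the-envelope cost-per-GB comparison (just the arithmetic on the figures quoted above; awk does the math):

```shell
# Cost per GB of the two LTE plans mentioned above.
awk 'BEGIN {
    printf "LTE  $60/month,  5GB cap: $%.2f per GB\n",  60/5
    printf "LTE $110/month, 50GB cap: $%.2f per GB\n", 110/50
}'
```

The larger cap works out to $2.20/GB versus $12.00/GB, which is what makes the $110 plan worth weighing against a second bonded DSL line.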

Another major consideration is reliability, and it's not something you should ignore. The reliability of satellite is abysmal; the laws of physics simply dictate that weather is going to interfere. DSL is better since it's not as dependent on weather, but as a long-time DSL user I can attest to the fact that downtime will happen. Looking back through the tickets I've opened with my ISP over the years (I've been using DSL since the late 90's) I'd say that roughly 8-10 times a year you'll find yourself without Internet for at least a day, if not an entire weekend. I'm new to LTE so I can't give any feedback on its reliability. However, in theory, cell towers tend to have excellent uptime, so LTE should have fantastic uptime as well.

I like to think of myself as someone who lives in the country, but I describe myself as living near-city since I'm close enough to a major city that there are cell towers in my area and they've been upgraded to LTE. If I lived in even a largish town there might be the possibility of fibre, or at the very least fast DSL or cable (and I'd at least get everything for which I was paying). But since I do live close to LTE-capable cell towers, I categorize myself as a near-city country-living Canadian, as distinct from other country-living Canadians who don't live within cell-tower range, for whom the only options would be dial-up or satellite.

What is my solution? Currently I've kept my DSL line for any large downloads, and Netflix (it's more than adequate for standard definition streaming using a Roku 3). But I also have an LTE device for those situations where speed is essential but the download amount isn't going to be too high.

19 Aug 2016

Gerrit User Management for a Small Installation

Setting up Jenkins is a simple matter of downloading the latest .war file and java -jar'ing it. It comes with all the basics of what you need, including its own web server. So there's no need to fiddle with things like databases or web servers... if you don't want to. Most people at a given organization don't need accounts on their Jenkins instance. In most cases, only a couple people who are able to create and manage its various jobs need to log on. Most other people just want to see the status, maybe download the latest successful compile, or look at the logs of a recent failure. These are things most anonymous users can do.

Bugzilla isn't quite as easy to set up; you have to assemble most of the pieces yourself. It also doesn't ship with its own built-in web server (even though serving web pages is, really, its primary function, no?), so you have to integrate it with Apache or Nginx. For basic installations the defaults are fine, and it comes with functional user management and a simple database if you don't need "production" quality. Most people contributing to a project should have a Bugzilla account, and Bugzilla has good enough user management "out of the box", especially for a small installation.

Gerrit requires everyone who contributes to a repository to have an account. You wouldn't want an anonymous user to be able to make changes to your patch flow, would you? Plus you do want to track everyone who makes a change.

Sadly, Gerrit doesn't include any sort of built-in user management. Not even a dumb, "don't use this for production environments", user-management system (like Jenkins or Bugzilla have). Gerrit assumes, and requires you to use, an external identity management system (such as having your users log in with their Google or Facebook credentials via OpenID; a company-wide LDAP installation; or the user-management features of a web server).

If you're part of a large organization, which has a dedicated and capable IT team, these issues aren't of any concern to you. All you need to do is to decide that you want to use Gerrit. Setting it up and managing it is someone else's problem. But small companies can benefit from distributed code review too, and if nothing else, at its core Gerrit is a solid source code repository server.

With a small team there usually isn't a dedicated person responsible for managing servers. You have developers, you have sales people, you have a CEO, you have managers (there are always managers), and you have someone doing the financial stuff. But there's rarely a dedicated IT person who is able to set up a Linux machine and configure and manage various services (Bugzilla, Jenkins, Gerrit, etc). That job ends up falling to some developer who would rather be writing code than configuring servers.

The reasons why Gerrit doesn't do user management are obviously religious. Gerrit does include its own "don't use this for production installations" database (h2) and provides all the JDBC connectors you need to hook it up to any real database you can imagine. So if it's already doing database stuff, why not just add a user table? But it's even worse than that. Stock Gerrit doesn't even allow you to specify permissions at the user level, only at the group level. This means you have to create a group for every meaningful permission you want to assign. At a small-ish installation you end up with lots of groups, each of which contains only one person.

Fortunately there is an easy-enough-to-install plugin which creates a group for every user, so setting up a fine-grained permission scheme for a small team with a group of projects is relatively easy. It is awkward, though, that you still need to manage both users that are users and users that are groups.

Unfortunately there isn't an easy-enough-to-install add-on for user management. But if you fetch the Gerrit sources, you will find a perl script called fake_ldap.pl in the contrib folder. fake_ldap.pl makes it easy to generate a file which your Gerrit installation can query for basic information about your allowed users. It does require you to manage this file by hand, outside of your Gerrit system, but, in my experience, it provides the easiest way to manage the users of a small Gerrit installation.

26 Jun 2016

How To Setup JTAG with Galileo (the modern version)

A recent blog post from Olimex pointed to a document [1] showing how to debug the Intel Galileo board using a JTAG. The nice thing about the document is that it assumed the user would be building their own image using Bitbake/OpenEmbedded. The unfortunate part is that the Galileo BSP downloads from Intel are so ancient they have next-to-no chance of working on a recent, modern distro. Their instructions, however, do point this out (i.e. ...this procedure was performed on <some old version of> Ubuntu...), leaving the user little choice but to start by preparing a VM in which to perform the build!

Back when the Galileo board was released, Intel did a great job of supporting it by creating various layers to be used with OpenEmbedded: meta-clanton, meta-galileo, meta-intel-iot-devkit, meta-intel-iot-middleware, meta-intel-quark-fast, meta-intel-quark. But, as you can see, that support was a bit "scattered". On top of that, it doesn't look like meta-clanton was ever placed somewhere public; the only way to get it (and to build for the Galileo) was to download a massive BSP from Intel which included it. Over time this massive download was replaced by a smaller download, which then required you to run a script which would pull in all the sub-components as a separate step (which performed the massive download). Additionally, a fixup script needed to be run in order to clean up some of the build area before you could start your build. Attempting any of this procedure on a modern Linux development host is very likely to fail.

Fast-forward to today (June 26, 2016) and all you need to create an image for the Galileo is a distro layer, the basic OE meta layer, and meta-intel. Or, if you're using poky as your distro, you'll get the metadata as part of it.


Building An Image for the Galileo

$ mkdir /some/place
$ cd /some/place

$ mkdir layers
$ pushd layers
$ git clone git://git.yoctoproject.org/poky meta-poky
$ git clone git://git.yoctoproject.org/meta-intel
$ popd


$ . layers/meta-poky/oe-init-build-env galileo

Now, edit conf/local.conf so that
MACHINE ?= "intel-quark"
EXTRA_IMAGE_FEATURES ?= "debug-tweaks tools-debug tools-profile"

And edit conf/bblayers.conf to replace the part that says "meta-poky/meta-yocto-bsp" with "meta-intel".
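If you'd rather script those two edits than open an editor, something along these lines works (a sketch; it assumes you're sitting in the galileo build directory created by oe-init-build-env, and that your bblayers.conf entry ends in meta-poky/meta-yocto-bsp as it does when following the clone steps above):

```shell
# Append the machine and image-feature settings to conf/local.conf ...
cat >> conf/local.conf <<'EOF'
MACHINE ?= "intel-quark"
EXTRA_IMAGE_FEATURES ?= "debug-tweaks tools-debug tools-profile"
EOF

# ... and swap the meta-yocto-bsp layer entry for meta-intel in conf/bblayers.conf.
sed -i 's|meta-poky/meta-yocto-bsp|meta-intel|' conf/bblayers.conf
```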

Now run:
$ bitbake core-image-minimal

When bitbake starts it prints some build configuration information. For my build I saw:

Build Configuration:
BB_VERSION        = "1.31.0"
BUILD_SYS         = "x86_64-linux"
NATIVELSBSTRING   = "SUSELINUX-42.1"
TARGET_SYS        = "i586-poky-linux"
MACHINE           = "intel-quark"
DISTRO            = "poky"
DISTRO_VERSION    = "2.1+snapshot-20160622"
TUNE_FEATURES     = "m32 i586-nlp"
TARGET_FPU        = ""
meta   
meta-poky         = "master:6f0c5537e02c59e1c8f3b08f598dc049ff8ee098"
meta-intel        = "master:1b98ae6d7e10390c9ecb383432593644a524f9c8"


If your build fails, one thing you could try is to go to each of the layers and checkout the commits specified in the above information; then restart the build.

At the end of a successful build, continue with the following to create an SDcard image:

$ bitbake parted-native
$ wic create mkgalileodisk -e core-image-minimal

Look through the wic output; it will tell you where it has placed its artifact. Use dd to create your SDcard from the wic artifact:

# dd if=/var/tmp/wic/build/mkgalileodisk-<datetime>-mmcblk0.direct of=/dev/sdX bs=1M


Cross-GDB

Eventually you're going to use GDB, via openOCD, to debug your target. In order for this to work (in addition to openOCD) you're going to need two things:
  1. a gdbserver "stub" running on your target
  2. a cross-GDB running on your development machine
A cross-GDB is required because your native GDB only understands your native host's machine code and other CPU-specific information. A cross-GDB is built to run on your native host but to understand a different CPU architecture. A gdbserver stub is necessary on the target because you need some device-specific software running there which is able to interrupt the CPU, set breakpoints, etc. The cross-GDB program is large, capable of doing all the work required to perform source-level debugging, and presents the interface to the user. The stub is quite small and has just the minimum target-CPU-specific functionality required on the target.
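As a concrete sketch of how the two pieces fit together, here's the typical network-based flow (the program name, port, and target IP are hypothetical placeholders; when debugging over JTAG as described below, OpenOCD plays the role of the stub instead):

```
# On the target: run the program under the gdbserver stub, listening on TCP port 2345
root@target# gdbserver :2345 /usr/bin/myprog

# On the host: start the cross-GDB and attach to the stub
$ i586-poky-linux-gdb myprog
(gdb) target remote <target-ip>:2345
```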

Above, as part of your first build, I mentioned that you needed to adjust the EXTRA_IMAGE_FEATURES variable in your conf/local.conf file. One of the things that change does is include the gdbserver stub in your target image.

In order to build a native cross-GDB for your development host you'll need to generate an SDK for your image:

$ bitbake core-image-minimal -c populate_sdk

Once built, you then need to install the SDK. To do so, simply run the resulting SDK script which you'll find in ${TMPDIR}/deploy/sdk. The install script will ask you where you want to install the SDK; type in a path and press Enter, or simply press Enter to accept the default.

Once installed, source the SDK environment file:

$ . <SDK_INSTALL_LOCATION>/environment-setup-i586-nlp-32-poky-linux


OpenOCD

My recommendation is to get, build, and install the latest OpenOCD from sources:

$ mkdir <SOMEPLACE_TO_BUILD_OPENOCD>
$ cd <SOMEPLACE_TO_BUILD_OPENOCD>
$ git clone git://git.code.sf.net/p/openocd/code openocd
$ cd openocd
$ ./bootstrap
$ ./configure

At the end of ./configure'ing, the script will print out a list of all the dongles for which it can include support. Reasons why it can't include support for a particular dongle may include the lack of additional required libraries. If a particular dongle is marked as un-buildable and you want to build support for that dongle, you'll need to figure out the reason(s) why it can't presently be built (i.e. figure out which library it needs) and fix the deficiency (i.e. use your host distribution's package manager to install that library's -dev/-devel package). The ./configure script is pretty good at telling you which library/libraries are missing.

Once the configuration is done:

$ make -j
$ sudo make install


Connecting to the Galileo via JTAG and GDB

Two terminals are required for this part. In one terminal you'll run OpenOCD, and in the other you'll run the cross-GDB (or telnet).

To run OpenOCD you'll need to tell it which board you're connecting to and which dongle you're using. Obviously the board part stays the same, but the dongle part may be different if you're not using the same dongle(s) as me. Also, the order in which this information is given to OpenOCD is important: apparently you need to specify the dongle first, then the board.

In the following example I'm using the Segger j-link EDU:

# openocd -f interface/jlink.cfg -f board/quark_x10xx_board.cfg
Open On-Chip Debugger 0.10.0-dev-00322-g406f4d1 (2016-06-22-09:29)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
adapter speed: 4000 kHz
trst_only separate trst_push_pull
Info : No device selected, using first device.
Info : J-Link V9 compiled Apr 15 2014 19:08:28
Info : Hardware version: 9.00
Info : VTarget = 3.354 V
Info : clock speed 4000 kHz
Info : JTAG tap: quark_x10xx.cltap tap/device found: 0x0e681013 (mfg: 0x009 (Intel), part: 0xe681, ver: 0x0)
enabling core tap
Info : JTAG tap: quark_x10xx.cpu enabled




In this example I'm using the ARM-USB-OCD-H from Olimex:

# openocd -f interface/ftdi/olimex-arm-usb-ocd-h.cfg -f board/quark_x10xx_board.cfg
Open On-Chip Debugger 0.10.0-dev-00322-g406f4d1 (2016-06-22-09:29)
Licensed under GNU GPL v2
For bug reports, read
        http://openocd.org/doc/doxygen/bugs.html
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
adapter speed: 4000 kHz
trst_only separate trst_push_pull
Info : clock speed 4000 kHz
Info : JTAG tap: quark_x10xx.cltap tap/device found: 0x0e681013 (mfg: 0x009 (Intel), part: 0xe681, ver: 0x0)
enabling core tap
Info : JTAG tap: quark_x10xx.cpu enabled




Now, to communicate with and control the board via OpenOCD you need to open a second terminal. If you want to simply send commands to OpenOCD (such as to check or flash the board) you can simply use telnet:

$ telnet localhost 4444
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
>


If you want to debug the target via GDB then you need to startup the cross-GDB and connect it to OpenOCD from within GDB itself (note: the cross-GDB should be already on your $PATH, it comes from the SDK we built and installed earlier; if it's not on your PATH you may have forgotten to source the SDK's environment file, see above):

$ i586-poky-linux-gdb
Python Exception <class 'ImportError'> No module named 'operator':
i586-poky-linux-gdb: warning:
Could not load the Python gdb module from `sysroots/x86_64-pokysdk-linux/usr/share/gdb/python'.
Limited Python support is available from the _gdb module.
Suggest passing --data-directory=/path/to/gdb/data-directory.

GNU gdb (GDB) 7.11.0.20160511-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pokysdk-linux --target=i586-poky-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".

(gdb) target remote localhost:3333
Remote debugging using localhost:3333
Python Exception <class 'NameError'> Installation error: gdb.execute_unwinders function is missing:
0x00000000 in ?? ()
(gdb)





[1] Source Level Debug using OpenOCD/GDB/Eclipse on Intel Quark SoC X1000, sourcedebugusingopenocd_quark_appnote_330015_003-2.pdf