FB Graphics
FBGraphics (FBG) : Simple C 16, 24, 32 bpp generic graphics library with parallelism support and custom backend.
FBGraphics : Lightweight C 2D graphics API agnostic library with parallelism support

FBGraphics (FBG) is a simple C 24, 32 bpp (internal format) graphics library with parallelism and custom rendering backend support (graphics API agnostic).

The library is only two .c files on most use cases, the renderer agnostic library fbgraphics.c and one of the rendering backend found in custom_backend directory.

The library come with five backend (see custom_backend folder) :

Features :

The library is generic, most functions (including parallel ones) only manipulate buffers and you can build a custom rendering backend pretty easily with few functions call, see the custom_backend folder.

Table of Contents

About

FBGraphics was built to produce fullscreen pixels effects easily (think of Processing-like creative coding etc.) with non-accelerated framebuffer by leveraging multi-core processors, it is a bit like a software GPU but much less complex and featured, the initial target platform was a Raspberry PI 3B / NanoPI.

FBGraphics was extended to support any numbers of custom rendering backend; all graphics calls manipulate internal buffers and a simple interface allow to draw the result the way you want to.

FBGraphics can support low memory hardware such as GBA. It should be noted that all internal buffers are manipulated in 24/32 bpp so it has to convert to 16bpp on GBA.

An OpenGL rendering backend which use the GLFW library was created to demonstrate the custom backend feature, it allow to draw the non-accelerated FB Graphics buffer into an OpenGL context through a texture and thus allow to interwine 3D or 2D graphics produced with standard OpenGL calls with CPU-only graphics produced by FBGraphics draw calls.

An OpenGL ES 2.0 backend is also available with similar features, it target platforms with support for OpenGL ES 2.0 through fbdev (tested on Nano PI Fire 3 SBC) or Raspberry PI dispmanx and similar platforms, it wouldn't be hard to extend this for more OpenGL ES 2.0 platforms...

There is also a dispmanx backend targeting Raspberry PI, it have better performances than the OpenGL ES 2 backend on this platform and is recommended if you don't need 3D stuff.

FBGraphics was built so that it is possible to create any number of rendering context using different backend running at the same time while exploiting multi-core processors... the content of any rendering context can be transfered into other context through images when calling fbg_drawInto

FBGraphics framebuffer settings support 16, 24 (BGR/RGB), 32 bpp, 16 bpp mode is handled by converting from 24 bpp to 16 bpp upon drawing, page flipping mechanism is disabled in 16 bpp mode, 24 bpp is the fastest mode.

FBGraphics is lightweight and does not intend to be a fully featured graphics library, it provide a limited set of graphics primitive and a small set of useful functions to start doing computer graphics anywhere right away with or without multi-core support.

If you want to use the parallelism features with advanced graphics primitives, take a look at great libraries such as libgd, Adafruit GFX library or even ImageMagick which should be easy to integrate.

FBGraphics is fast but should be used with caution, display bounds checking is not implemented on most primitives, this allow raw performances at the cost of crashs if not careful.

Multi-core support is optional and is only enabled when FBG_PARALLEL C definition is present.

FBGraphics framebuffer backend support a mechanism known as page flipping, it allow fast double buffering by doubling the framebuffer virtual area, it is disabled by default because it is actually slower on some devices. You can enable it with a fbg_fbdevSetup call.

VSync is automatically enabled if supported.

Note : FBGraphics framebuffer backend does not let you setup the framebuffer, it expect the framebuffer to be configured prior launch with a command such as :

fbset -fb /dev/fb0 -g 512 240 512 240 24 -vsync high
setterm -cursor off > /dev/tty0

fbset should be available in your package manager.

Framebuffer Quickstart

The simplest example (no parallelism, without texts and images) :

#include <sys/stat.h>
#include <signal.h>
#include "fbg_fbdev.h"
#include "fbgraphics.h"
int keep_running = 1;
void int_handler(int dummy) {
keep_running = 0;
}
int main(int argc, char* argv[]) {
signal(SIGINT, int_handler);
struct _fbg *fbg = fbg_fbdevSetup("/dev/fb0", 0); // you can also directly use fbg_fbdevInit(); for "/dev/fb0", last argument mean that will not use page flipping mechanism for double buffering (it is actually slower on some devices!)
do {
fbg_clear(fbg, 0); // can also be replaced by fbg_fill(fbg, 0, 0, 0);
fbg_draw(fbg);
fbg_rect(fbg, fbg->width / 2 - 32, fbg->height / 2 - 32, 16, 16, 0, 255, 0);
fbg_pixel(fbg, fbg->width / 2, fbg->height / 2, 255, 0, 0);
fbg_flip(fbg);
} while (keep_running);
fbg_close(fbg);
return 0;
}

A simple quickstart example with most features (but no parallelism, see below) :

#include <sys/stat.h>
#include <signal.h>
#include "fbg_fbdev.h"
#include "fbgraphics.h"
int keep_running = 1;
void int_handler(int dummy) {
keep_running = 0;
}
int main(int argc, char* argv[]) {
signal(SIGINT, int_handler);
struct _fbg *fbg = fbg_fbdevInit();
struct _fbg_img *texture = fbg_loadImage(fbg, "texture.png");
struct _fbg_img *bb_font_img = fbg_loadImage(fbg, "bbmode1_8x8.png");
struct _fbg_font *bbfont = fbg_createFont(fbg, bb_font_img, 8, 8, 33);
do {
fbg_clear(fbg, 0);
fbg_draw(fbg);
// you can also use fbg_image(fbg, texture, 0, 0)
// but you must be sure that your image size fit on the display
fbg_imageClip(fbg, texture, 0, 0, 0, 0, fbg->width, fbg->height);
fbg_write(fbg, "Quickstart example\nFPS:", 4, 2);
fbg_write(fbg, fbg->fps_char, 32 + 8, 2 + 8);
fbg_rect(fbg, fbg->width / 2 - 32, fbg->height / 2 - 32, 16, 16, 0, 255, 0);
fbg_pixel(fbg, fbg->width / 2, fbg->height / 2, 255, 0, 0);
fbg_flip(fbg);
} while (keep_running);
fbg_freeImage(texture);
fbg_freeImage(bb_font_img);
fbg_freeFont(bbfont);
fbg_close(fbg);
return 0;
}

Note : Functions like fbg_clear or fbg_fpixel are fast functions, there is slower equivalent (but more parametrable) such as fbg_background or fbg_pixel, some functions variant also support transparency such as `fbg_pixela or fbg_recta.

Note : You can generate monospace bitmap fonts to be used with fbg_createFont function by using my monoBitmapFontCreator tool available here

Parallelism

Exploiting multiple cores with FBGraphics is really easy, first you have to prepare 3 functions (of which two are optional if you don't have any allocations to do) of the following definition :

// optional function
void *fragmentStart(struct _fbg *fbg) {
// typically used to allocate your per-thread data
// see full_example.c for more informations
return NULL; // return your user data here
}
void fragment(struct _fbg *fbg, struct _fragment_user_data *user_data) {
// this function will be executed by each threads
// you are free to call any FBG graphics primitive here
fbg_clear(fbg, 0);
// you are also free to fill each threads back buffer the way you want to
// fbg->task_id : thread identifier (starting at 1, 0 is reserved for the main thread)
// each threads will draw an horizontal line, the shade of the blue color will change based on the thread it is drawn from
int x = 0, y = 0;
for (y = fbg->task_id; y < fbg->height; y += 4) {
for (x = 0; x < fbg->width; x += 1) {
int i = (x + y * fbg->width) * 3;
fbg->back_buffer[i] = fbg->task_id * 85; // note : BGR format
fbg->back_buffer[i + 1] = 0;
fbg->back_buffer[i + 2] = 0;
}
}
// simple graphics primitive (4 blue rectangle which will be handled by different threads in parallel)
fbg_rect(fbg, fbg->task_id * 32, 0, 32, 32, 0, 0, 255);
}
// optional function
void fragmentStop(struct _fbg *fbg, struct _fragment_user_data *data) {
// typically used to free your per-thread data
// see full_example.c for more informations
}

Then you have to create a 'Fragment' which is a FBG multi-core task :

fbg_createFragment(fbg, fragmentStart, fragment, fragmentStop, 3);

Where :

And finally you just have to make a call to your fragment function in your drawing loop and call fbg_draw!

fragment(fbg, NULL);
fbg_draw(fbg, NULL);

fbg_draw will wait until all the data are received from all the threads then draw to screen

Note : This example will use 4 threads (including your app one) for drawing things on the screen but calling the fragment function in your drawing loop is totally optional, you could for example make use of threads for intensive drawing tasks and just use the main thread to draw the GUI or the inverse etc. it is up to you!

And that is all you have to do!

See simple_parallel_example.c and full_example.c for more informations.

Note : By default, the resulting buffer of each tasks are additively mixed into the main back buffer, you can override this behavior by specifying a mixing function as the last argument of fbg_draw such as :

// function called for each tasks in the fbg_draw function
void selectiveMixing(struct _fbg *fbg, unsigned char *buffer, int task_id) {
// fbg is the main fbg structure returned by fbg_customSetup calls and any backend setup calls
// buffer is the current task buffer
// task_id is the current task id
int j = 0;
for (j = 0; j < fbg->size; j += 1) {
fbg->back_buffer[j] = (fbg->back_buffer[j] > buffer[j]) ? fbg->back_buffer[j] : buffer[j];
}
}

Then you just have to specify it to the fbg_draw function :

fbg_draw(fbg, additiveMixing);

By using the mixing function, you can have different layers handled by different cores with different compositing rule, see compositing.c for an example of alpha blending compositing 2 layers running on their own cores.

Note : You can only create one Fragment per fbg instance, another call to fbg_createFragment will stop all tasks for the passed fbg context and will create a new set of tasks.

Note : On low performances platforms you may encounter performance issues at high resolution and with a high number of fragments, this is because all the threads buffer need to be mixed back onto the main thread before being displayed and at high resolution / threads count that is alot of pixels to process! You can see an alternative implementation using pure pthread in the custom_backend folder and dispmanx_pure_parallel.c but it doesn't have compositing. If your platform support some sort of SIMD instructions you could also do all the compositing using SIMD which should result in a 5x or more speed increase!

Technical implementation

FBGraphics threads come with their own fbg context data which is essentialy a copy of the actual fbg context, they make use of C atomic types.

Initially parallelism was implemented using liblfds library for its Ringbuffer and Freelist data structure.

Now parallelism has two implementation, liblfds and a custom synchronization mechanism which has the advantage to not require additional libraries and thus execute on more platforms.

You can still use the liblfds implementation using the FBG_LFDS define, it may be faster.

With liblfds

Each threads begin by fetching a pre-allocated buffer from a freelist, then the fragment function is called to fill that buffer, the thread then place the buffer into a ringbuffer data structure which will be fetched upon calling fbg_draw, the buffers are then mixed into the main back buffer and put back into the freelist.

Without liblfds

Each threads fragment function is called to fill the local buffer, each threads then wait till that buffer is consumed by the main thread upon calling fbg_draw, the buffers are then mixed into the main back buffer and fbg_draw wake up all threads.

Benchmark (framebuffer)

A simple unoptimized per pixels screen clearing with 4 cores on a Raspberry PI 3B : 30 FPS @ 1280x768 and 370 FPS @ 320x240

Note : Using the dispmanx backend a screen clearing + rectangle moving on a Raspberry PI 3B : 60 FPS @ 1920x1080

Full example

Fullscreen per pixels perlin noise with texture mapping and scrolling (unoptimized)

Device : Raspberry PI 3B ( Quad Core 1.2GHz )

Settings : 320x240

Cores used to draw graphics FPS
1 42 FPS
2 81 FPS
3 120 FPS

See screenshots below.

Tunnel example

Fullscreen texture-mapped and animated tunnel made of 40800 2px rectangles with motion blur (unoptimized)

Device : Raspberry PI 3B ( Quad Core 1.2GHz )

Settings : 320x240

Cores used to draw graphics FPS
1 36 FPS
2 69 FPS
3 99 FPS
4 66 FPS

Note : The framerate drop with 4 cores is due to the main thread being too busy which make all the other threads follow due to the synchronization.

See screenshots below.

Documentation

All usable functions and structures are documented in the fbgraphics.h file with Doxygen

The HTML documentation can be found in the docs directory.

Examples demonstrating all features are available in the examples directory.

Some effects come from my Open Processing sketches

Building

C11 standard should be supported by the C compiler.

All examples found in examples directory make use of the framebuffer device /dev/fb0 and can be built by typing make into the examples directory then run them by typing ./run_quickstart for example (this handle the framebuffer setup prior launch), you will need to compile liblfds for the parallelism features. (see below)

All examples were tested on a Raspberry PI 3B with framebuffer settings : 320x240 24 bpp

For the default build (no parallelism), FBGraphics come with a header file fbgraphics.h and a C file fbgraphics.c to be included / compiled / linked with your program plus one of the rendering backend found in custom_backend directory, you will also need to compile the lodepng.c library and nanojpeg.c library, see the examples directory for examples of Makefile.

For parallelism support, FBG_PARALLEL need to be defined.

If you need to use the slightly different parallelism implementation (see technical implementation section) you will need the liblfds library :

To compile liblfds parallel examples, just copy liblfds711.a / liblfds711.h file and liblfds711 directory into the examples directory then type make lfds711.

Note : FBGraphics with liblfds work on ARM64 platforms but you will need liblfds720 which is not yet released.

Executable size optimization

This library may be used for size optimized executable for things like demos

PNG and JPEG support can be disabled with the WITHOUT_JPEG and WITHOUT_PNG define.

See tiny makefile rule inside the custom_backend or examples folder for some compiler optimizations related to executable size.

Under Linux sstrip and UPX can be used to bring the size down even futher.

Rendering backend

See README into custom_backend folder

GLFW backend

See README into custom_backend folder

The GLFW backend was made to demonstrate how to write a backend but it is complete enough to be used by default.

The GLFW backend has a cool lightweight Lua example which setup a Processing-like environment making use of the parallelism feature of the library, allowing the user to prototype multithreaded graphical stuff without C code compilation through the Lua language.

OpenGL ES 2 backend

See README into custom_backend folder

GBA backend

See README into custom_backend folder

Screenshots

Full example screenshot with three threads
Tunnel with four threads
Earth with four threads
Flags of the world with four threads
Compositing with three threads

License

BSD, see LICENSE file