pcb-rnd knowledge pool
GTK4 rant: glib
gtk4r_glib by Tibor 'Igor2' Palinkas on 2021-12-02
Tags: insight, gtk4, rant, glib, OOP, C++
Abstract: Why I think glib is a terrible thing.
This rant is part of a series.
When I choose to use C and not C++, I do that because I do not want all the modern OOP complication C++ introduces. Don't get me wrong, I am not "writing assembly in C". I do want to write structured code, I do couple code and data, I do form stand-alone, independent modules of these things. I can simply solve the clear, reusable API and "object" question with structs, function pointers and function calls. I don't call them classes, I don't want multiple inheritance, I don't want all kinds of side effects, call-argument-type-dependent function call dispatching, etc.
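A minimal sketch of what such a plain C "object" could look like. All names here (shape_t, shape_init_rect, shape_total_area, ...) are hypothetical and made up for illustration; this is not code from pcb-rnd.

```c
#include <stddef.h>

/* Hypothetical example "object": plain data plus a function pointer. */
typedef struct shape_s shape_t;
struct shape_s {
	const char *name;
	double w, h;
	double (*area)(const shape_t *self); /* "method", set by the init function */
};

static double rect_area(const shape_t *self) { return self->w * self->h; }
static double tri_area(const shape_t *self) { return self->w * self->h / 2.0; }

/* "Constructors": ordinary functions filling in a caller-owned struct */
static void shape_init_rect(shape_t *s, double w, double h)
{
	s->name = "rect";
	s->w = w;
	s->h = h;
	s->area = rect_area;
}

static void shape_init_tri(shape_t *s, double w, double h)
{
	s->name = "tri";
	s->w = w;
	s->h = h;
	s->area = tri_area;
}

/* Generic code: works on any shape; the argument type is checked by the compiler */
static double shape_total_area(const shape_t *shapes, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += shapes[i].area(&shapes[i]);
	return sum;
}
```

Passing anything other than a shape_t pointer to shape_total_area() is diagnosed at compile time, with no cast macro involved.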
What glib is trying to do very hard is introduce a lot of what I don't want to have from C++ into C.
First it does introduce classes with macro magic. Since gtk widgets are classes, one can't avoid dealing with this. Instead of relying on plain old struct types and compile time C type checks, glib introduces a special cast that lets you convert a class into another class, if the hierarchy of classes (yeah, exactly the thing I didn't want to have) allows that.
For example you have a GtkButton C type, which, as a "class" for glib is really a GtkWidget. Or in plain English: a button is one of the possible widgets. Now if you want to have a generic function that for example hides a widget, any widget, including a button, that's gtk_widget_hide(), which takes a GtkWidget pointer as argument. You can pass a GtkButton pointer, because it's also a widget but since these are different types in C, you get a warning. So you glib-cast your button to widget:
GtkButton *btn = GTK_BUTTON(gtk_button_new_with_label("foo"));
gtk_widget_hide(GTK_WIDGET(btn));
Sounds elegant, right? Except that most checks will happen at run time instead of compile time. For example:
char *s = "foo";
gtk_widget_hide(GTK_WIDGET(s));
This will compile without warning even with -Wall, and you will discover the invalid cast only when the code happens to run (see also: the halting problem). Of course you wouldn't write such a stupid thing in your code, but it was only a minimal example. What happens in practice is that you create a lot of different glib objects and store their pointers in variables, structs, lists, hash tables, etc. If you have a few 100k sloc application like pcb-rnd, some variables may get mixed up by accident and a few 10k lines later you cast your string to a widget. Thanks to glib, you won't find this out at compile time, only at run time.
Glib insists on running the main loop for you so that it can implement poll(2) and emulate async signals. These signals are not to be confused with POSIX or UNIX signals. These signals are really just:
- a function pointer bound to an event, called back by glib; the call has:
- a pointer to the object the signal is about (a "self" or "this" as first argument)
- an arbitrary number of event-specific arguments (e.g. x and y coords for mouse events)
- a (void *) user data specified by the user at binding time
Using function pointers and user attachable callbacks is a good thing. I do that a lot in my non-glib C code too. It's also very good that a void * user data can be passed on: a lot of libraries forget to do that, rendering their API unusable - surprisingly common in scripting language implementations!
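The callback shape listed above can be sketched in plain C as follows. The names (mouse_src_t, mouse_bind(), mouse_emit(), click_stats_t) are hypothetical; the sketch only illustrates the "self, event arguments, user data" layout, not any glib API.

```c
#include <stddef.h>

/* Hypothetical event source with one bindable callback */
typedef struct mouse_src_s mouse_src_t;
typedef void (*mouse_cb_t)(mouse_src_t *self, int x, int y, void *user_data);

struct mouse_src_s {
	mouse_cb_t cb;   /* function pointer bound to the event */
	void *user_data; /* opaque pointer chosen by the user at binding time */
};

static void mouse_bind(mouse_src_t *src, mouse_cb_t cb, void *user_data)
{
	src->cb = cb;
	src->user_data = user_data;
}

/* Called by the event source when a click happens */
static void mouse_emit(mouse_src_t *src, int x, int y)
{
	if (src->cb != NULL)
		src->cb(src, x, y, src->user_data);
}

/* User side: state reached through user_data, no global variables needed */
typedef struct { int clicks; int last_x, last_y; } click_stats_t;

static void on_click(mouse_src_t *self, int x, int y, void *user_data)
{
	click_stats_t *st = user_data;

	(void)self; /* unused in this handler */
	st->clicks++;
	st->last_x = x;
	st->last_y = y;
}
```

Without the user_data pointer, on_click() would have to reach its click_stats_t through a global, which is the API problem mentioned above.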
The bad part is when the callback happens, the async part. It happens from the main loop. So for example you have a widget-destroy signal bound to your button, then your code decides to destroy a dialog box which in turn will destroy your button that lived in that dialog box. Your callback function segfaults, or you are just wondering why it ran, so you put a debugger on it. You run your application, you trigger the problem and the debugger stops in your callback function. Then you do a backtrace... and get unusable garbage.
Because what really happens is this: when you called the function that "did" destroy the dialog, that really did not destroy it, or did not destroy it fully. Instead: when you return from your functions and pass control back to glib's main loop, a lot of the things happen only then. With gtk, often in some "idle code". So when gtk gets to actually destroying your dialog, and the signal is really generated, you have already lost the stack for the place where you really called the destroy function!
Which means, in general: if you look at a random backtrace of your application, half of the time it will be totally useless because you are serving some async signal and all you see is some 30+ glib/gtk stack frames showing how the main loop ended up delivering your signal (function callback). But figuring out what triggered the signal, and where, is much harder.
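A toy model of this deferred delivery, assuming a simple idle queue. This is not how gtk is implemented internally; every name (idle_add(), main_loop_iterate(), dialog_t, close_dialog(), ...) is hypothetical. The running_in variable stands in for what a backtrace would show at the moment the handler runs.

```c
#include <stddef.h>
#include <string.h>

typedef void (*idle_fn_t)(void *data);
typedef struct { idle_fn_t fn; void *data; } idle_item_t;

#define MAX_IDLE 16
static idle_item_t idle_queue[MAX_IDLE];
static size_t idle_len;

static int idle_add(idle_fn_t fn, void *data)
{
	if (idle_len >= MAX_IDLE)
		return -1;
	idle_queue[idle_len].fn = fn;
	idle_queue[idle_len].data = data;
	idle_len++;
	return 0;
}

/* One loop iteration: run the items that were queued before it started */
static void main_loop_iterate(void)
{
	size_t i, n = idle_len;

	for (i = 0; i < n; i++)
		idle_queue[i].fn(idle_queue[i].data);
	/* keep items queued by the callbacks themselves for the next iteration */
	memmove(idle_queue, idle_queue + n, (idle_len - n) * sizeof(idle_item_t));
	idle_len -= n;
}

typedef struct {
	int destroyed;
	void (*on_destroy)(void *user_data); /* the "destroy signal" handler */
	void *user_data;
} dialog_t;

static const char *running_in = "main loop"; /* stand-in for the call stack */
static const char *handler_saw;              /* what the handler found there */

static void my_destroy_handler(void *user_data)
{
	(void)user_data;
	handler_saw = running_in;
}

/* The real work, executed later from the loop */
static void dialog_do_destroy(void *data)
{
	dialog_t *dlg = data;

	dlg->destroyed = 1;
	if (dlg->on_destroy != NULL)
		dlg->on_destroy(dlg->user_data);
}

/* The "destroy" call only schedules the destruction */
static void dialog_destroy(dialog_t *dlg)
{
	idle_add(dialog_do_destroy, dlg);
}

static void close_dialog(dialog_t *dlg)
{
	running_in = "close_dialog";
	dialog_destroy(dlg);
	running_in = "main loop"; /* we return before anything was destroyed */
}
```

After close_dialog() returns, the dialog is still alive; the handler runs only from main_loop_iterate() and records "main loop", not "close_dialog", as the place it was called from.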
User data: signal meets class
If you combine the above two, there is a funny dark corner: despite all the OOP cruft glib introduces, in the end your signal handler is called literally with (void *) user data. So you have no way to make compile-time checks on what you expect to pass there.
Which is normal for C. C++ programmers often tease C programmers saying people who choose C are just lazy and want to do everything in void * in the end. And indeed, for a generic callback function pointer, it's hard (but not impossible!) to do otherwise. But then, if it's really just a void * in the end, why do we need that 800k sloc glib around it?
Even worse: different signals have a different number and type of arguments. For example the widget destroy signal has only the self/this pointer to the widget and the void * user data, while a "gesture pressed" (mouse button click) has self/this, n_press (the press count), x, y and user data. But in the end you use the same g_signal_connect() to bind both.
How does it accept function pointers with totally different signatures? Well, you cast your function pointer using G_CALLBACK(), which is a plain cast to a generic function pointer type, so literally any function will do. Now if you mix up function names, you won't get a compile-time warning about passing a 6-argument function instead of a 2-argument function. You won't even get a run-time warning from G_CALLBACK() because it has no chance to detect that in C! All you get is a big fat undefined behavior: totally broken arguments at best, or, if you don't use many of your arguments, seemingly proper operation and then occasional random crashes at worst.
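The mechanism can be imitated in a few lines of standalone C. The sketch below uses hypothetical names (generic_cb_t, GENERIC_CB(), sig_connect(), emit_destroy(), emit_pressed()) and a deliberately tiny two-signal table; it is not glib's signal implementation. Only correctly matched bindings are executed here, because calling through a mismatched signature is undefined behavior.

```c
#include <stddef.h>

/* Generic function pointer type; the cast macro throws the signature away */
typedef void (*generic_cb_t)(void);
#define GENERIC_CB(f) ((generic_cb_t)(f))

typedef enum { SIG_DESTROY, SIG_PRESSED, SIG_MAX } sig_id_t;

typedef struct { generic_cb_t cb; void *user_data; } binding_t;
static binding_t bindings[SIG_MAX];

/* One connect function for every signal: it can not know the real signature */
static void sig_connect(sig_id_t sig, generic_cb_t cb, void *user_data)
{
	bindings[sig].cb = cb;
	bindings[sig].user_data = user_data;
}

/* Each emitter casts back to the signature it assumes the handler has */
static void emit_destroy(void *self)
{
	binding_t *b = &bindings[SIG_DESTROY];

	if (b->cb != NULL)
		((void (*)(void *, void *))b->cb)(self, b->user_data);
}

static void emit_pressed(void *self, int n_press, double x, double y)
{
	binding_t *b = &bindings[SIG_PRESSED];

	if (b->cb != NULL)
		((void (*)(void *, int, double, double, void *))b->cb)(self, n_press, x, y, b->user_data);
}

/* Handlers with different signatures */
static void on_destroy(void *self, void *user_data)
{
	(void)self;
	*(int *)user_data += 1;
}

static void on_pressed(void *self, int n_press, double x, double y, void *user_data)
{
	(void)self;
	(void)n_press;
	*(double *)user_data = x + y;
}

/*
 * This would compile without any diagnostic, yet emitting the signal
 * afterwards is undefined behavior (2-argument call into a 5-argument function):
 *
 *   sig_connect(SIG_DESTROY, GENERIC_CB(on_pressed), &some_double);
 */
```

A per-signal connect function taking the exact handler type would let the compiler reject the mixed-up binding; the single generic connect function is what makes that impossible.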
The glib project has no scope: there's not a simple, well defined, finite set of problems it tries to solve as a lib. I don't even consider it a lib. I consider it a programming environment.
It's really a huge random collection of mostly useless features. Once such a feature is added, it becomes very hard to remove it, so it's also an ever-growing code base. In the end it tries to convince you to use the g-version of everything, even instead of totally portable, standard C89 libc functions or types!
It has its own integer type, called gint, and even its own character type, gchar. It's unclear what problem these types are trying to solve: in C99, which is required for glib, you have both the standard C types of host-specific sizes (such as int) and standard fixed-size types in stdint.h.
Besides that, glib implements its own I/O, string handling, memory allocator (to make memory debugging with valgrind futile).
In the end it tries to partially replace C syntax with some OOP language implemented in macros and tries to replace libc with an even bigger spaceship.
Partly because of the above, partly for other mega-lib issues, I long ago removed glib from pcb-rnd (and librnd) and replaced it with minilibs. The only code that depends on glib is the gtk HID code, and only because gtk requires it.