While I was experimenting with Glom and a big MusicBrainz database, valgrind’s memcheck helped me to find lots of small memory leaks in Glom, libgda, and libepc, and one huge GValue leak in libgda. Viewing a table of 600,000 records initially took 2 minutes (or 2 hours in Glom 1.6) and 1000 MB of memory. It now takes 1 minute and 250 MB of memory, almost all of it during the execution of the SQL query.
That’s still awful, but it looks like libgda-4.0 will do this in about 1 second with 35 MB of memory, no more than when using the raw libpq API. That seems acceptable and makes me think that Glom is going to be fine with large tables. I don’t know what’s wrong with libgda-3.0. Maybe it’s not really using a database cursor when asked.
That had worried me for a while so I’m glad to know it will be OK. But I’ll wait for libgda-4.0’s API to settle down before switching Glom to it, to avoid disrupting the other developers working on Glom.
I also think that the 250 MB is being leaked every time I switch from Details to List view in Glom, but valgrind’s leak check seems to hit an endless loop, or is otherwise overloaded by this challenge, even when using fewer rows, so I’m not sure what the cause might be. If the leak is still there with libgda-4.0 then it should be easier to investigate.
There are still some other smaller leaks reported by valgrind, but I don’t believe most of them. For instance, it reports some leaks in std::string constructors even when the std::strings are temporary instances, not created by new.
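One possible reason why a std::string constructor can appear in a leak backtrace at all: on common implementations, a string too long for any small-string buffer takes its character data from the heap, even when the std::string object itself is a temporary. The following is only a minimal sketch to show that, not code from Glom or libgda. The counters, the HeapUse struct and the measure_temporary_string() function are hypothetical names invented for this illustration, and replacing the global operator new is just a convenient way to count allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical counters, only for this illustration.
static std::size_t g_allocs = 0;
static std::size_t g_frees = 0;

// Replace the global allocation functions so that every heap
// allocation made through operator new is counted.
void* operator new(std::size_t size) {
    ++g_allocs;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept {
    if (p) { ++g_frees; std::free(p); }
}

void operator delete(void* p, std::size_t) noexcept {
    if (p) { ++g_frees; std::free(p); }
}

struct HeapUse {
    std::size_t allocs;  // allocations made while the temporary existed
    std::size_t frees;   // deallocations made by the time it was destroyed
};

// Creates and destroys a temporary std::string of the given length
// (which must be greater than zero) and reports the heap traffic.
HeapUse measure_temporary_string(std::size_t length) {
    assert(length > 0);
    const std::size_t allocs_before = g_allocs;
    const std::size_t frees_before = g_frees;
    {
        // The temporary lives only until the end of this statement.
        // Reading a character into a volatile keeps the string observable.
        volatile char sink = std::string(length, 'x')[length - 1];
        (void)sink;
    }
    return {g_allocs - allocs_before, g_frees - frees_before};
}
```

Calling measure_temporary_string(1000) from main() should report at least one allocation and the same number of frees, so the temporary does use the heap but gives the memory back. For very short strings the result depends on the implementation, because many of them keep short strings inside the object itself.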
By the way, here is my valgrind command for leak detection:
G_SLICE=always-malloc G_DEBUG=gc-friendly GLIBCPP_FORCE_NEW=1 GLIBCXX_FORCE_NEW=1 valgrind --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=30 --suppressions=/somewhere/valgrind-python.supp --suppressions=/somewhere/gtk.suppression yourapp &> valgrind_output.txt
I found some of those environment variables, and Johan’s GTK+ valgrind suppressions file, on the live.gnome.org valgrind page. And here is the official Python valgrind suppressions file.
3 thoughts on “Big Tables in Glom: Leaks and libgda”
Last time I profiled libgda-3.0 the main reason for this tremendous performance problem was that the GdaDict holds a reference to all GdaQuery objects and this causes about 1 million g_weak_ref/g_weak_unref calls which searches through a GList with 1 million items each time. Huge performance waste but luckily fixed in libgda-4.0 (which is really super-fast!).
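To give a feeling for why one linear list search per call gets so expensive, here is a rough sketch in plain C++. It is not libgda or GLib code: std::list stands in for the GList, and count_search_steps() is a hypothetical function that just counts how many list nodes are visited when every entry is looked up and removed once, in the worst order.

```cpp
#include <cstddef>
#include <list>

// Builds a list of n entries, then removes each entry by searching for
// it from the head, the way a plain linked-list lookup would.
// Returns the total number of nodes visited.
std::size_t count_search_steps(std::size_t n) {
    std::list<std::size_t> refs;
    for (std::size_t i = 0; i < n; ++i) {
        refs.push_back(i);
    }

    std::size_t steps = 0;
    // Remove in reverse insertion order, so each search has to walk
    // to the end of the remaining list (the worst case).
    for (std::size_t target = n; target-- > 0;) {
        for (auto it = refs.begin(); it != refs.end(); ++it) {
            ++steps;
            if (*it == target) {
                refs.erase(it);
                break;
            }
        }
    }
    return steps;
}
```

In this worst case the total is n(n+1)/2 visited nodes, so 1,000 entries already need 500,500 steps, and 1 million entries would need roughly 5 × 10^11. The real order of lookups in libgda-3.0 may be kinder than that, so take this as an upper bound on the pattern rather than a measurement.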
Don’t most std::string implementations keep the string data in memory allocated on the heap, even if the string object is a temporary and therefore lives on the stack?
Not that I actually believe libstdc++’s std::string implementation has memory leaks.
Marius, I think that’s what the GLIBCPP_FORCE_NEW=1 and GLIBCXX_FORCE_NEW=1 are meant to avoid, to help valgrind.
Johannes, that’s interesting. By the way, how did you do this profile? The output of valgrind’s massif tool seems like it has become much harder to understand, at first glance. It no longer creates the nice graphs.