Wednesday, 21 March 2012

on the importance of doing nothing

I've been meaning to write about this for a while, but I've only just now been driven over the edge by having to go and basically run sed over code again for no good reason.

When you're programming, always make sure you question *why* things are done. Qt provides three functions, helpfully named qMalloc/qRealloc/qFree. Despite the 'q' in front of their names, these functions do absolutely nothing useful; they just wrap their stdlib friends. This was originally done so that the allocator inside Qt could be replaced (there are better ways to do that, but I won't get sidetracked from my central point), but in reality it doesn't have much use. That's why I'm trying to deprecate them.

Now, you might ask, "what impact could a simple function call have, anyway?" I'm glad you asked. Benchmark time (spoiler for the lazy: ~10% extra overhead for small allocation sizes, ~0-5% for larger allocation sizes).
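virgin:~/mallocbench% cat main.cpp
#include <QtCore>
#include <qtest.h>
#include <qcoreapplication.h>
#include <qdatetime.h>
#include <stdlib.h>

class MallocBenchmark : public QObject
{
    Q_OBJECT
private slots:
    void qtMalloc();
    void qtMalloc_data();
    void regularMalloc();
    void regularMalloc_data();
};

void MallocBenchmark::qtMalloc_data()
{
    // One row per allocation size, in bytes.
    QTest::addColumn<int>("size");
    QTest::newRow("1") << 1;
    QTest::newRow("10") << 10;
    QTest::newRow("100") << 100;
    QTest::newRow("10000") << 10000;
    QTest::newRow("1000000") << 1000000;
    QTest::newRow("10000000") << 10000000;
}

void MallocBenchmark::qtMalloc()
{
    QFETCH(int, size);

    // Time the Qt wrapper; free inside the loop so iterations don't leak.
    QBENCHMARK {
        void *p = ::qMalloc(size);
        ::qFree(p);
    }
}

void MallocBenchmark::regularMalloc_data()
{
    // Reuse the same sizes so both runs are comparable.
    qtMalloc_data();
}

void MallocBenchmark::regularMalloc()
{
    QFETCH(int, size);

    // Time the plain stdlib call for comparison.
    QBENCHMARK {
        void *p = malloc(size);
        free(p);
    }
}

QTEST_MAIN(MallocBenchmark)

#include "main.moc"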

And now, the results on my machine:
********* Start testing of MallocBenchmark *********
Config: Using QTest library 5.0.0, Qt 5.0.0
PASS   : MallocBenchmark::initTestCase()
RESULT : MallocBenchmark::qtMalloc():"1":
     0.000059 msecs per iteration (total: 62, iterations: 1048576)
RESULT : MallocBenchmark::qtMalloc():"10":
     0.000062 msecs per iteration (total: 66, iterations: 1048576)
RESULT : MallocBenchmark::qtMalloc():"100":
     0.000087 msecs per iteration (total: 92, iterations: 1048576)
RESULT : MallocBenchmark::qtMalloc():"10000":
     0.000083 msecs per iteration (total: 88, iterations: 1048576)
RESULT : MallocBenchmark::qtMalloc():"1000000":
     0.0043 msecs per iteration (total: 72, iterations: 16384)
RESULT : MallocBenchmark::qtMalloc():"10000000":
     0.0063 msecs per iteration (total: 52, iterations: 8192)
PASS   : MallocBenchmark::qtMalloc()
RESULT : MallocBenchmark::regularMalloc():"1":
     0.000053 msecs per iteration (total: 56, iterations: 1048576)
RESULT : MallocBenchmark::regularMalloc():"10":
     0.000051 msecs per iteration (total: 54, iterations: 1048576)
RESULT : MallocBenchmark::regularMalloc():"100":
     0.000082 msecs per iteration (total: 86, iterations: 1048576)
RESULT : MallocBenchmark::regularMalloc():"10000":
     0.000076 msecs per iteration (total: 80, iterations: 1048576)
RESULT : MallocBenchmark::regularMalloc():"1000000":
     0.0043 msecs per iteration (total: 71, iterations: 16384)
RESULT : MallocBenchmark::regularMalloc():"10000000":
     0.0060 msecs per iteration (total: 99, iterations: 16384)
PASS   : MallocBenchmark::regularMalloc()
PASS   : MallocBenchmark::cleanupTestCase()
Totals: 4 passed, 0 failed, 0 skipped
********* Finished testing of MallocBenchmark *********

Around 10% extra time per iteration on smaller allocation sizes, 0-5% on larger sizes (most likely explained by glibc falling back to using mmap for larger allocations, which is going to take an awfully long time compared to a single function call). These, obviously, aren't huge numbers. But remember: this is overhead you're taking for no reason at all. Don't do it. Your CPU cycles will thank me.
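For anyone who wants to check the percentages, here is a small sketch of the arithmetic. The timings are copied from the results above; the Sample struct and overheadPercent helper are names made up for this illustration. The overhead is simply (wrapper time - plain time) / plain time.

```cpp
#include <cstddef>

// Per-iteration timings in msecs, copied from the benchmark output above.
struct Sample {
    const char *row;     // allocation size label from the test data
    double qtMsecs;      // qtMalloc() result
    double plainMsecs;   // regularMalloc() result
};

const Sample kSamples[] = {
    {"1",        0.000059, 0.000053},
    {"10",       0.000062, 0.000051},
    {"100",      0.000087, 0.000082},
    {"10000",    0.000083, 0.000076},
    {"1000000",  0.0043,   0.0043},
    {"10000000", 0.0063,   0.0060},
};

const std::size_t kSampleCount = sizeof(kSamples) / sizeof(kSamples[0]);

// Relative overhead of the wrapper over plain malloc, in percent.
double overheadPercent(const Sample &s)
{
    return (s.qtMsecs - s.plainMsecs) / s.plainMsecs * 100.0;
}
```

Row by row that works out to roughly 11%, 22%, 6%, 9%, 0% and 5%, which is where the "around 10% for small sizes, 0-5% for large ones" summary comes from. It's a single run on one machine, so treat the individual figures as noisy.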


  1. Careful there, Window$ in general only allows freeing memory from the same DLL/EXE which allocated it, so the only reliable way to allow the application to free something allocated by the library on Window$ is to put the malloc and free functions in a common library. Yes, some (proprietary) platforms are broken like that.

  2. Trying again with fixed formatting ;)

    Kevin: I know about Windows' multiple heaps, but this wasn't created because of that problem, nor would it really help, given that it has never been used anywhere near universally throughout Qt. It shouldn't actually be an issue anyway, since you only run into it when you have multiple CRT heaps (which in the majority of cases shouldn't happen, unless you mix CRT versions or static/shared linking).

    Elsewhere (outside of Qt), I've seen this solved with a global operator new which called HeapAlloc(), passing GetProcessHeap() as the first argument (matched with a HeapFree() and GetProcessHeap() also). The same trick would be applicable to malloc/free I imagine, but we didn't do that, as we didn't use them.
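    A rough sketch of that global operator new trick, for the curious. This is not the code from the project mentioned above, just the general shape of it. The helper names processHeapAlloc/processHeapFree are made up, and the non-Windows branch (plain malloc/free) is an assumption added only so the sketch compiles elsewhere.

```cpp
#include <cstddef>
#include <new>

#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>
#endif

// Hypothetical helper: allocate from the single process-wide heap.
static void *processHeapAlloc(std::size_t size)
{
    if (size == 0)
        size = 1; // operator new must return a unique pointer even for 0 bytes
#ifdef _WIN32
    return HeapAlloc(GetProcessHeap(), 0, size);
#else
    return std::malloc(size); // assumption: stand-in for non-Windows builds
#endif
}

// Hypothetical helper: return memory to that same heap.
static void processHeapFree(void *ptr) noexcept
{
    if (!ptr)
        return;
#ifdef _WIN32
    HeapFree(GetProcessHeap(), 0, ptr);
#else
    std::free(ptr);
#endif
}

// Replace the global allocation functions so every module that links this
// in allocates from, and frees to, the process heap.
void *operator new(std::size_t size)
{
    if (void *p = processHeapAlloc(size))
        return p;
    throw std::bad_alloc();
}

void operator delete(void *ptr) noexcept
{
    processHeapFree(ptr);
}
```

    Because the process heap is shared by every module in the process, memory allocated in one DLL can be released from another, regardless of which CRT each one was built against.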