Tuesday, 23 September 2014

Wayland and Qt 5.4

Since nobody else has done the honors yet, I'm happy to announce that - as decided at the Qt Contributors Summit this year - support for running applications under a Wayland compositor will be seeing its initial release with Qt 5.4. That is, the QtWayland repository is finally going to stop sitting in the corner, sulking. :)

There are a few "buts", though.

Firstly, it should be noted that support for QWidget-based applications (and other desktop-based use cases) may be far from ideal, and quality may not be great. This is a consequence of most development on QtWayland having been driven from mobile/embedded viewpoints to date, and is not, in general, an inherent limitation of the windowing system. It's also something of a reflection on Wayland itself, which is only now starting to mature for desktop use (through xdg-shell and the like).

tl;dr: Think of this as a technical preview, keep your expectations realistic, and if you want to use it, expect to roll up your sleeves a bit and get dirty from time to time.

Secondly, the QtCompositor API in the QtWayland module (allowing you to write your own Wayland compositor) will not be seeing a release at this time. The API is not frozen, and has not seen the usual polish/quality that you might expect from Qt APIs. As this API is only of use to a limited number of people (those looking to implement an embedded/mobile device, typically, or write their own DE) this should not impact too many people.

tl;dr: If you want to write a compositor, you get to keep both pieces if it breaks. If you want to use applications under an existing Wayland compositor, you're fine.

Future work to QtWayland is largely an open story, but some obvious candidates come to mind:

  • Continued work on xdg-shell support
  • Plugin-based window decorations (to enable environment-specific look and feel): this has now landed in the 5.4 branch :)
  • Integration with the rest of Qt's autotests (I spent a while getting tests fixed or at least runnable under window-compositor, but it would be nice to automate this)
  • "Official" subsurface protocol support
If there's something you would like to see happen, here or not, you're more than welcome to pitch in. If you'd like to talk to the other people hacking on QtWayland, please pop by #qt-lighthouse on freenode, and talk to the folks there :-)

I'd also like to take a moment to thank everyone for their contributions to QtWayland. In particular, I'd like to say thanks to the following, in no particular order (and I'm extremely sorry if I've missed someone, please let me know and I'll happily add you to the list):
  • Kristian Høgsberg & Jesse Barnes, for their initial work on the port, sponsored by Intel,
  • Jørgen Lind, Samuel Rødal, Andy Nichols, Laszlo Agocs, and Paul Olav Tvete for continuing work on it excellently and admirably,
  • Nokia for sponsoring a good deal of the development up until their abrupt departure from the Qt world,
  • Digia for continuing to help out after Nokia left,
  • Andrew Knight, for ably shepherding problems encountered by Jolla for quite a long time,
  • Jolla for sponsoring a large chunk of work on QtWayland (past and present),
  • Gunnar Sletta for rewriting integration with rendering (especially QtQuick), removing a large number of bugs & improving performance,
  • Giulio Camuffo for numerous fixes, improvements and interaction with the wider Wayland community.
In conclusion, I'd like to note that I'm really happy to see this finally happen - I've wanted it for a very long time now - and for Wayland to keep moving on to bigger and better things. Hopefully, this release will achieve its intended result (that more eyes/hands get exposed to the code, and start to use it, and help out with it).

Friday, 12 September 2014

profiling is not understanding

When software goes slow, generally, the first reaction is to profile. This might be done through system tools (like Instruments on OS X, perf or valgrind on Linux, VTune, etc). This is fine and good, but having the output of a tool does not necessarily mean you understand what is going on.

This might seem like an obvious distinction, but all too often, efforts at improving performance focus on the small picture ("this thing here is slow") and not the bigger picture ("why is this so slow"). At Jolla, I had the pleasure of running into one such instance of this, together with Gunnar Sletta, my esteemed colleague and friend.

As those of you who are familiar with Jolla may know, we had been working on upgrading to a newer Qt release. This also involved quite a bit of work for us, both in properly upstreaming work we had done in the hurry towards the late-2013 release, and in isolating problems and fixing them properly in newer code (the new scenegraph renderer and the V4 JavaScript engine in particular have been an interesting ride to get both at once!).

As a part of this work, we noted that touch handling was quite slow (something which we had worked around for our initial release, but now wanted to solve properly). This was due to the touch driver on the Jolla delivering touchpoints faster than the display was updating. That is, while the display might be updating at 57 Hz (yes, the Jolla is weird, it doesn't do 60 Hz), we might be getting input events a lot more frequently than that.

This was, in turn, causing QtQuick to run touch processing (involving costly item traversals, as well as the actual processing of touch handling) a lot more frequently than the display was updating. As these took so much time, this in turn slowed rendering down, meaning even more touch handling was going on per frame. A really ugly situation.

Figure 1: Event tracing inside the Sailfish OS Compositor
Figure 1 demonstrates this happening at the compositor level. The bottom slice (titled "QThread") is the event delivery thread, responsible for reading events from evdev. The peaks there are - naturally - when events are being read in. The top slice is the GUI thread, and the high peaks there are touch events being processed and delivered to the right QtQuick item (in this case, a Wayland client, we'll get to that later). The middle slice is the compositor's scenegraph rendering (using QtQuick).

With the explanation out of the way, let's look at the details a bit more. It's obvious that the event thread is regularly delivering events at around-but-not-quite twice the display update rate. Our frame preparation on the GUI thread looks good, though, despite the too-frequent occurrence of event delivery, and the render thread is coping too.

But this isn't a major surprise - the compositor in this case is dead simple (just showing a fullscreen client). What about the client? Let's take a look at it over the same timeframe...

Figure 2: Event tracing for the client (Silica's component gallery, in this case)
Figure 2 focuses on two threads in the client: the render thread (top), and the GUI thread (bottom). Touch events are delivered on the GUI thread, and QtQuick processes them there while preparing the next frame for the render thread.

Here, it's very clear that touch processing is happening way too often, and worse than that, it's taking a very long time (each touch event's processing is taking ~4ms), not leaving much time for rendering - and this was on a completely unloaded device. In a more complicated client still, this impact would be much, much worse, leading to frame skipping (which we saw, on some other applications).
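To put rough numbers on that, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and the "two events per frame" figure is an assumption based on the "around-but-not-quite twice" rate mentioned above, not a measured value.

```cpp
#include <cassert>

// Time available to produce one frame, in milliseconds.
constexpr double frameBudgetMs(double refreshHz) {
    return 1000.0 / refreshHz;
}

// Fraction of a single frame's budget eaten by touch processing alone.
// eventsPerFrame and msPerEvent are inputs you would read off a trace.
double touchShareOfFrame(double refreshHz, double eventsPerFrame, double msPerEvent) {
    assert(refreshHz > 0.0);
    return (eventsPerFrame * msPerEvent) / frameBudgetMs(refreshHz);
}
```

At 57 Hz the budget is about 17.5 ms per frame, so touchShareOfFrame(57.0, 2.0, 4.0) comes out at roughly 0.46: close to half of every frame gone before any real rendering work starts, and that on an idle device.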

Going back to my original introduction here, if we had used traditional profiling techniques, we'd have seen that touch handling/preparation to render was taking a really long time. And we might have focused on optimizing that. Instead, thanks to some out-of-the-box thinking, we looked at the overall structure of application flow, and were able to see the real problem: doing extra work that wasn't necessary.

As an aside to this, I'm happy to announce that we worked out a neat solution: QtQuick no longer processes touch events immediately. Instead, it waits until it is about to prepare the next frame for display, and "compresses" them so that it only deals with the minimal number of sensible touch updates per frame. This should have no real impact on any hardware where touch delivery was occurring at a sensible rate, but for any hardware where touch was previously delivering too fast, this will no longer be a problem as of Qt 5.4.
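For the curious, the general idea of deferring and compressing touch input can be sketched in a few lines of plain C++. To be clear, this is not the actual QtQuick code: TouchEvent, TouchState and TouchCompressor are hypothetical names invented for this sketch, and the policy shown (merge consecutive moves per touch point, never drop a press or release) is just one plausible reading of "minimal number of sensible touch updates".

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration only; these are not Qt classes.
enum class TouchState { Pressed, Moved, Released };

struct TouchEvent {
    int id;            // touch point identifier (assumed non-negative)
    TouchState state;
    double x, y;
};

class TouchCompressor {
public:
    // Called whenever input arrives: buffer the event, do no real work yet.
    void enqueue(const TouchEvent &ev) {
        assert(ev.id >= 0);
        // Find the most recent pending event for the same touch point.
        for (auto it = pending_.rbegin(); it != pending_.rend(); ++it) {
            if (it->id != ev.id)
                continue;
            // A move directly following a move only changes the position,
            // so overwrite the stale one instead of queueing another.
            if (it->state == TouchState::Moved && ev.state == TouchState::Moved) {
                *it = ev;
                return;
            }
            break; // last event was a press or release: keep it as-is
        }
        pending_.push_back(ev);
    }

    // Called once, right before preparing the next frame: hand over
    // whatever survived compression and start with an empty buffer.
    std::vector<TouchEvent> flush() {
        std::vector<TouchEvent> out;
        out.swap(pending_);
        return out;
    }

private:
    std::vector<TouchEvent> pending_;
};
```

The point is that the expensive part (item traversal and delivery) now runs once per frame on the result of flush(), no matter how many move events the driver produced in between.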

(Thanks to Gunnar & myself for the fix, Carsten & Mikko for opening my eyes about performance tooling, and Jolla for sponsoring this work.

P.S. If you're looking for performance experts, Qt/QML/etc expertise or all round awesome, Gunnar and I are currently interested in hearing from you.)

Wednesday, 13 August 2014

sailing in search of fresh waters

I've had a long, quiet time on this blog over the past few years while I've been frantically helping Jolla to launch their self-named product: the Jolla. I've enjoyed (almost) every day I've been there: they really are a great bunch of people and the work has been plentiful and challenging.

But as the saying goes, "this too shall pass". Nothing lasts forever, and it's time for a change: after this week, I will be taking a break from Jolla to get some fresh perspective.

On the bright side, maybe I'll have some more time for writing now :)

If anyone is interested in getting a hold of a C++/Qt/QML/Linux expert with a focus on performance, expertise on mobile, and a wide range of knowledge across other areas who loves open source, please let me know.

Thursday, 24 October 2013

Every time you use CONFIG+=ordered, a kitten dies.

QMake users: public service announcement. If you use CONFIG+=ordered, please stop right now. If you don't, I'll hunt you down. I promise to god I will.

There is simply no reason to use this, ever. There are two reasons this might be in your project file:
  1. you have no idea what you are doing, and you copied it from somewhere else
  2. you have a target that needs to be built after another target, and you don't know any better
If you fit into category 1, then I hope you're turning red right now, because by using CONFIG+=ordered, you're effectively screwing over multicore builds of your code. See a very nice case of this here.

If you fit into category 2, then you're doing it wrong. You should specify dependencies between your targets properly like this:

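TEMPLATE = subdirs
SUBDIRS = src plugins tests docs
# plugins waits for src; tests waits for both src and plugins
plugins.depends = src
tests.depends = src plugins
# docs has no .depends entry, so nothing holds it back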

And then you'll have docs built whenever the build tool feels like it, and the rest built when their dependencies are built.

If you have subdirectories involved in this, then you need an extra level of indirection in your project, but it's still not rocket science:

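TEMPLATE = subdirs
# each entry gets its own name, a directory, and a make target
src_lib.subdir = src/lib
src_lib.target = sub-src-lib

src_plugins.subdir = src/plugins
src_plugins.target = sub-plugins
# build src/plugins only after src/lib
src_plugins.depends = sub-src-lib

SUBDIRS = src_lib src_plugins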

For those of you wondering why I sound frustrated about this, I've fixed so many instances of this by now that it's just getting old and tired, frankly. And I still keep running into more. That's countless minutes of wasted build time, all because of laziness boiling down to a single line. Please fix it.

Wednesday, 18 July 2012

Qt 5 and Android

Astute observers of the Qt 5 repositories may have noticed that for quite a while, patches have been trickling in from me allowing Qt 5 to compile on Android.

The goal in mind was to allow use of Qt on Android primarily in order to work at the system level (not using the regular Android display stack, but using Wayland on Android), tying in with Collabora's other work on Android. This work also doesn't preclude someone from e.g. implementing a platform plugin to allow Android applications to run natively on unhacked devices, similar to Necessitas on Qt 4 - and I'd very much like to see that happen upstream.

In terms of compilation, there is one approach currently upstreamed that involves using the NDK, see this wiki page for more information. You'll note it's quite easy to do a build yourself, something that was quite intentional, since I figure that the only way it's going to improve easily is if it is easy to hack on it. I'm sure the build & installation instructions can be more optimal still (like installing to /system/lib, etc) but it's a start. Contributions welcome. I should also take a moment to thank the Necessitas guys, their mkspecs provided a nice starting point.

I had started an alternative route of integrating Qt with Android image builds (so, check out the Android tree, repo sync, drop Qt in place, run 'make' and have it built & deployed for you), but unfortunately, my sponsored time to work on this ran out, and so I wasn't able to finish it. It's still an interesting area of work, and so, I do plan to try to continue it in my spare time.

In terms of actually using it, one area which is still a bit of a pain is a bug in the way bionic's linker handles R_ARM_COPY relocations: instead of looking up the symbol to copy in the shared libraries the binary depends on, it finds the binary's own symbol instead, meaning it doesn't really do any actual relocation.

The symptom of this is that your binary will crash on start due to things being zeroed out that really shouldn't be (like QObject::staticMetaObject in my case), depending on how it's been built. Thanks to Thiago for helping me nut that very difficult problem out. There is a patch pending on Android's gerrit instance, but I need to find the time to go rebase the patch and retest it to make sure it still works, although the code changes in the area look quite trivial.

For those of you who are visually oriented: I'm sorry, but there's not much to show here, because - as of yet - I don't have anything graphical running. Though in theory, it might be already possible to easily shoehorn Wayland libraries into the NDK using Pekka's work, and build QtWayland that way. But if anyone wants to talk Qt on Android, or better still, contribute, I'm all ears.

Massive kudos to Collabora for sponsoring my work on this!