Monday, 19 March 2012

Qt 5.1, aka: when QFileSystemWatcher might not be so useless

With 5.0 now being feature frozen, I thought I'd turn myself towards something I've been meaning to do (and talking about doing) for a very long time: since before the Qt Project launched, even before Qt Contributors Summit 2011, in fact. I thought I'd make QFileSystemWatcher more useful.

Let's review what QFileSystemWatcher offers: a way to add and remove paths for monitoring directories and files, a directoryChanged(path) and a fileChanged(path) signal. That's it. If you think about this for a moment, you realise just how broken the semantics of the signals are: what _is_ a 'change' anyway? I guess this is one of the reasons that QFileSystemWatcher has been called 'deprecated' (I fixed one of the others, a performance issue, a while back).

So I've been working on making the signals a bit less useless. In the longer run, I plan to call fileChanged and directoryChanged deprecated. They'll still be emitted (of course) to keep existing code working, but in addition, you'll have (subject to review):

pathCreated(path) - emitted when something is created inside a directory you are monitoring, or if something that didn't exist that you were monitoring for is created (more on this later)
pathDeleted(path) - emitted when something is deleted inside a directory you were monitoring, or something that you _were_ monitoring was deleted
fileModified(path) - emitted when a file you were monitoring is modified (attributes or contents)
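
For directories, signals like these can be approximated on top of today's directoryChanged by diffing directory listings. What follows is a minimal sketch in plain C++17, not the eventual Qt implementation; `Listing`, `Diff`, `snapshot`, and `diffListings` are hypothetical names made up for the illustration:

```cpp
#include <algorithm>
#include <filesystem>
#include <iterator>
#include <set>
#include <string>

namespace fs = std::filesystem;

using Listing = std::set<std::string>;

struct Diff {
    Listing created;  // names present now but not before
    Listing deleted;  // names present before but not now
};

// Record the names of the entries directly inside `dir`.
Listing snapshot(const fs::path& dir) {
    Listing names;
    for (const auto& entry : fs::directory_iterator(dir))
        names.insert(entry.path().filename().string());
    return names;
}

// Compare two listings of the same directory.
Diff diffListings(const Listing& before, const Listing& after) {
    Diff d;
    std::set_difference(after.begin(), after.end(), before.begin(), before.end(),
                        std::inserter(d.created, d.created.end()));
    std::set_difference(before.begin(), before.end(), after.begin(), after.end(),
                        std::inserter(d.deleted, d.deleted.end()));
    return d;
}
```

Taking a snapshot each time directoryChanged fires and diffing it against the previous one yields the names to report as created or deleted.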

I also have early plans to introduce a pathMoved(oldLocation, newLocation), but that one has a lot of caveats: it might only work on certain platforms, in certain phases of the moon, and only if you're very lucky - on many platforms, it will likely continue to be synthesised as a pathDeleted(oldPath) and pathCreated(newPath) (if you're watching the new location).

I also have vague ideas about introducing a more syntactically friendly API over the top of this, something like:

QPathMonitor pathMonitor(myPath);
connect(&pathMonitor, SIGNAL(deleted()), SLOT(watchedPathDeleted()));
... etc ...

but that's a bit farther away in that I haven't really thought it through, yet.

In working on the new signals on Windows, I stumbled across QTBUG-2331, which sort of proves just how useless the existing signals were: deleting a watched directory on Windows essentially never removed the watch, because it didn't notice the deletion. Hopefully, it will be fixed soon, because it's going to block the work on the new signals: otherwise, pathDeleted will never be emitted on a watched path on Windows, and that makes my unit tests sad.

I noted earlier that I have plans to emit pathCreated when you monitor a non-existent path. That's correct: I intend to allow just that. None of the native APIs (that I know of) allow for receiving events when you monitor a non-existent path, which is a bit sad, as it means we'll need to fall back to polling on a timer for those cases, but it's certainly much better than nothing.

Oh, and if you're curious, it seems like Linux has the best filesystem monitoring API, in the form of inotify. OS X/BSD comes in second with kqueue (though having to open file descriptors to do the actual monitoring is a bit crap, and not being able to get any sort of real fine-grained notifications on _what_ was added/removed in a directory is also painful). Windows is also incredibly painful due to running into many stupid limitations (like only some ~60 paths being able to be monitored per thread, so the backend spawns loads of threads if you monitor a lot of paths), and not being able to get even remotely useful signals without a lot of extra legwork. Not to mention the above bug, where signals are emitted before deletion.

OS X may get more love in the future, as there is talk about another contributor revisiting the state of the FSEvents backend (which has been disabled for a very long time, and the code removed in Qt 5, due to being massively buggy and unmaintained).

For Windows, ReadDirectoryChangesW might improve the situation, somewhat, but I certainly don't have the time to investigate it for 5.1, and I also lack the motivation, not being a Windows user myself. Contributions welcome?


  1. If QFileSystemWatcher has been deprecated, then what replaces it?

  2. @digitalsurgeon: nothing. I'm working on un-deprecating it.

  3. One thing that would be handy too is the ability to monitor a whole subtree - that is, I'd like to be told whenever any path that starts with 'x' is changed. I tried to implement this a while back for e.g. the project tree in Qt Creator (.qmlproject's, specifically), and it's really a pain on top of the current API.

    Not sure whether any of the underlying APIs support this directly, though. Anyhow, QFileSystemWatcher is really one of the few things in Qt which are really broken, so ++ for taking this up :)

  4. It's not supported on most APIs (inotify, for one, has no recursive mode), so I can't expose it directly. It's also not really feasible to expose a 'watch this recursively' option, because each watch consumes resources - more of an issue on OS X, where you can only have 256 FDs open at a time...

    It isn't _that_ hard to implement though (I've done it) and this API will make it even easier. I'll probably write an example once the new signals land.
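
    In the meantime, the directory-collection half of a recursive watch can be sketched as below. This is an illustration in plain C++17, not the implementation mentioned above; `collectWatchDirs` and `maxWatches` are hypothetical, with the cap standing in for per-process limits such as the file descriptor ceiling:

    ```cpp
    #include <cstddef>
    #include <filesystem>
    #include <system_error>
    #include <vector>

    namespace fs = std::filesystem;

    // Collect `root` and its subdirectories, stopping once `maxWatches`
    // directories have been gathered so the caller stays within its budget.
    std::vector<fs::path> collectWatchDirs(const fs::path& root, std::size_t maxWatches) {
        std::vector<fs::path> dirs;
        std::error_code ec;
        if (maxWatches == 0 || !fs::is_directory(root, ec))
            return dirs;
        dirs.push_back(root);

        fs::recursive_directory_iterator it(
            root, fs::directory_options::skip_permission_denied, ec);
        const fs::recursive_directory_iterator end;
        while (!ec && it != end && dirs.size() < maxWatches) {
            // Symlinked directories are skipped to avoid watching loops.
            if (it->is_directory(ec) && !it->is_symlink(ec))
                dirs.push_back(it->path());
            it.increment(ec);
        }
        return dirs;
    }
    ```

    Each returned directory would get its own watch, and directories created later have to be added as they appear.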

  5. Some time ago I tried to use QFileSystemWatcher to monitor a text file (in order to reread it when modified) but found that it didn't work if I edited the file in e.g. Vim. What I noticed was that saving the file in some text editors removed the watch. I'm not very knowledgeable in this area, but my guess is that Vim replaces the original file when saving, which removes the inotify watch.

    I wonder if this is a known issue. Will this be fixed in QFileSystemWatcher in the future, or was I supposed to use it in a different way?

  6. ...plans to emit pathCreated when you monitor a non-existent path, [--] need to fall back to polling on a timer for those cases...

    Instead of a timer could it be done with a "recursive" watcher?

    For example: I want to monitor path /a/b/c but that doesn't exist yet. Only /a exists but there's no /a/b. I set the watcher to monitor /a/b/c. Internally it would start monitoring /a. When /a/b appears, the watcher would internally check, if /a/b/c is also there. If not, it would continue monitoring /a/b. Finally when /a/b/c would be created, it would notify the user.

    This would probably need quite a lot of logic to tackle all the file system modification cases but no timer would be needed.

    Just a thought :)
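
    The first step of that idea, finding which directory to watch while the target is still missing, could look roughly like this. A sketch assuming C++17's std::filesystem; `nearestExistingAncestor` is a hypothetical helper:

    ```cpp
    #include <filesystem>
    #include <system_error>

    namespace fs = std::filesystem;

    // Walk up from `target` until something exists on disk. For /a/b/c with
    // only /a present this returns /a, the directory to watch for now.
    // Returns an empty path if nothing along the way exists.
    fs::path nearestExistingAncestor(const fs::path& target) {
        std::error_code ec;
        fs::path current = target;
        while (!current.empty()) {
            if (fs::exists(current, ec))
                return current;
            const fs::path parent = current.parent_path();
            if (parent == current)  // reached the root without finding anything
                break;
            current = parent;
        }
        return {};
    }
    ```

    Each time the watched ancestor reports a change, the watcher would call this again and move the watch one level closer until the target itself appears.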

  7. @Hans: correct, and this isn't just a thing with text editors.

    When you want to modify a file, you can't easily rewind back to, say, the start of the file, change a few bytes, and leave everything else. You need to write the whole thing. But obviously, while you're writing, things can go wrong. The user might run out of disk space, or worse: pull the power out of the computer. So the "reliable" way to save files is to open a new (temporary) file, save it there, then use rename(2) (on UNIX) to replace the original atomically - it's guaranteed to either leave the old there, or the new.

    Now, let's look at QFileSystemWatcher's documentation:
    "Note that QFileSystemWatcher stops monitoring files once they have been renamed or removed from disk, and directories once they have been removed from disk."

    So, with the old signals, you'd pretty much be SOL. With the new signals, you'd need to watch for pathDeleted(path) and pathCreated(path), as well as fileModified (for cases where the file is modified directly, without an intermediate temporary file).

    Hope this helps.
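
    For reference, the write-then-rename pattern looks roughly like this. A sketch assuming C++17 on a POSIX system, where std::filesystem::rename replaces an existing destination the way rename(2) does; `saveAtomically` and the ".tmp" suffix are made up for the example:

    ```cpp
    #include <filesystem>
    #include <fstream>
    #include <string>
    #include <system_error>

    namespace fs = std::filesystem;

    // Write `contents` to a temporary sibling of `target`, then rename it over
    // `target`. Readers see either the old file or the new one, never a mix.
    bool saveAtomically(const fs::path& target, const std::string& contents) {
        fs::path tmp = target;
        tmp += ".tmp";  // same directory, so the rename stays on one filesystem

        {
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            out << contents;
            out.flush();
            if (!out) {
                std::error_code ignored;
                fs::remove(tmp, ignored);  // best-effort cleanup
                return false;
            }
        }  // stream closed here, before the rename

        std::error_code ec;
        // After this call the name refers to a different file than before,
        // which is why a watch on the old file is lost.
        fs::rename(tmp, target, ec);
        if (ec) {
            std::error_code ignored;
            fs::remove(tmp, ignored);
        }
        return !ec;
    }
    ```

    A careful editor would also sync the data to disk before renaming; that step is left out to keep the sketch short.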

  8. @Aki: you could, with a fair chunk of extra state tracking in the engines. But watching a recursive tree that doesn't exist yet isn't something that I, at least, have had to do all that often. A much more common case is monitoring for a configuration file that isn't yet there, for instance.

    With that in mind, and keeping in mind you'd have to implement it for all platforms, I'd say it's probably not worth it.

  9. @Robin Burchell:
    Very informative, thanks a lot! I tried to work around it with a timer checking if files().isEmpty(), but obviously this is far from optimal.

    "So, with the old signals, you'd pretty much be SOL. With the new signals, you'd need to watch for pathDeleted(path) and pathCreated(path), as well as fileModified (for cases where the file is modified directly, without an intermediate temporary file)."

    Would it be possible to make QFileSystemWatcher do this instead of requiring the user to implement a solution with pathDeleted(path) and pathCreated(path)?

  10. @Hans: no. There's no real way to know what happened to the file cross-platform. It might have been deleted and created by something else. pathMoved(old, new) will _sometimes_ be able to give you this information when it is introduced, though (as it should track renames like this).

    Filesystems are messy, unfortunately.

  11. Can this class improve the nepomuk indexer behavior?

    1. Nope.

      It can only be fixed if/when the kernel provides a better interface for monitoring files. And I don't have the bandwidth to hack on the kernel right now.

      inotify is not recursive, and fanotify does not monitor file move events.

  12. How do you plan to implement this on MACOS?
    As you wrote kevent does not provide this information.

  13. Any more developments on this?
    I'm still keenly interested in a better QFileSystemWatcher! :)