Should the FUSE getattr operation always be serialised?

I'm implementing a FUSE filesystem intended to provide access via familiar POSIX calls to files which are actually stored behind a RESTful API. The filesystem caches files once they've been retrieved for the first time so that they are more readily available in subsequent accesses.

I'm running the filesystem in multi-threaded mode (which is FUSE's default) but I'm finding that the getattr calls seem to be serialised, even though other calls can take place in parallel.

When opening a file FUSE always calls getattr first, and the client I'm supporting needs the file size returned by this initial call to be accurate (I don't have any control over this behaviour). This means that if I don't have the file cached I need to actually get the information via the RESTful API calls. Sometimes these calls happen over a high latency network, with a round trip time of approximately 600ms.

As a result of the apparent sequential nature of the getattr call, any access to a file which is not currently cached will cause the entire filesystem to block any new operations while this getattr is serviced.

I've come up with a number of ways to work round this, but all seem ugly or long-winded. Really, I just want the getattr calls to run in parallel, as all the other calls seem to.

Looking at the source code, I don't see why getattr should behave like this. FUSE does lock the tree_lock mutex, but only for reading, and there are no writes happening at the same time.

For the sake of posting something simple in this question I've knocked up an incredibly basic implementation which just supports getattr and allows easy demonstration of the issue.

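#ifndef FUSE_USE_VERSION
#define FUSE_USE_VERSION 22
#endif

#include <fuse.h>
#include <unistd.h>  // sleep()
#include <iostream>

// Every getattr takes 5 seconds and then fails, to stand in for a slow backend.
static int GetAttr(const char *path, struct stat *stbuf)
{
    (void)stbuf;  // never filled in, the call always fails
    std::cout << "Before: " << path << std::endl;
    sleep(5);
    std::cout << "After: " << path << std::endl;
    return -1;  // reported as "Operation not permitted" (-EPERM)
}

// Static storage, so every handler other than getattr stays null.
static struct fuse_operations ops;

int main(int argc, char *argv[])
{
    ops.getattr = GetAttr;
    return fuse_main(argc, argv, &ops);
}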

Using a couple of terminals to call ls on two different paths at (roughly) the same time shows that the second getattr call only starts once the first has finished. This causes the second ls to take ~10 seconds instead of 5.

Terminal 1

$ date; sudo ls /mnt/cachefs/file1.ext; date
Tue Aug 27 16:56:34 BST 2013
ls: /mnt/cachefs/file1.ext: Operation not permitted
Tue Aug 27 16:56:39 BST 2013

Terminal 2

$ date; sudo ls /mnt/cachefs/file2.ext; date
Tue Aug 27 16:56:35 BST 2013
ls: /mnt/cachefs/file2.ext: Operation not permitted
Tue Aug 27 16:56:44 BST 2013

As you can see, the two date outputs from before the ls differ by only one second, but the two from after the ls differ by 5 seconds, which corresponds to the delay in the GetAttr function. This suggests that the second call is blocked somewhere deep in FUSE.

Output

$ sudo ./cachefs /mnt/cachefs -f -d
unique: 1, opcode: INIT (26), nodeid: 0, insize: 56
INIT: 7.10
flags=0x0000000b
max_readahead=0x00020000
   INIT: 7.8
   flags=0x00000000
   max_readahead=0x00020000
   max_write=0x00020000
   unique: 1, error: 0 (Success), outsize: 40
unique: 2, opcode: LOOKUP (1), nodeid: 1, insize: 50
LOOKUP /file1.ext
Before: /file1.ext
After: /file1.ext
   unique: 2, error: -1 (Operation not permitted), outsize: 16
unique: 3, opcode: LOOKUP (1), nodeid: 1, insize: 50
LOOKUP /file2.ext
Before: /file2.ext
After: /file2.ext
   unique: 3, error: -1 (Operation not permitted), outsize: 16

The above code and examples are nothing like the actual application or how the application is used, but they demonstrate the same behaviour. I haven't shown this in the example above, but I've found that once the getattr call completes, subsequent open calls are able to run in parallel, as I would have expected.

I've scoured the docs to try to explain this behaviour, and looked for anyone else reporting a similar experience, but can't seem to find anything. That may be because most implementations of getattr are so quick that you wouldn't notice or care if it was being serialised, or because I'm doing something silly in the configuration. I'm using version 2.7.4 of FUSE, so it's possible this was an old bug that has since been fixed.

If anyone has any insight into this it would be greatly appreciated!

c++
c
linux
multithreading
fuse
asked on Stack Overflow Aug 27, 2013 by abulford • edited Sep 23, 2013 by abulford

1 Answer

I signed up to the FUSE mailing list, posted my question and recently got the following response from Miklos Szeredi:

Lookup (i.e. first finding the file associated with a name) is serialized per directory. This is in the VFS (the common filesystem part in the kernel), so basically any filesystem is susceptible to this issue, not just fuse.

Many thanks to Miklos for his help. For the full thread see http://fuse.996288.n3.nabble.com/GetAttr-calls-being-serialised-td11741.html.

I had also noticed that the serialisation was per-directory, i.e. the above effect is seen if both files are in the same directory, but not if they are in separate directories. For my application this mitigation is enough: the clients of my filesystem do use directories so, while I might expect a lot of getattr calls in close succession, the likelihood of them all hitting the same directory is low enough for me not to worry about.
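If you want to check the per-directory behaviour without juggling terminals, a small harness along these lines could do it. This is only a sketch: TimeParallelStat is a name I've made up for illustration, and it simply calls stat() on each path from its own thread and records how long each call took.

```cpp
#include <sys/stat.h>

#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not part of FUSE): stat() every path on its own
// thread and return how long each call took, in milliseconds, in the same
// order as `paths`.
std::vector<long long> TimeParallelStat(const std::vector<std::string>& paths)
{
    std::vector<long long> elapsedMs(paths.size(), 0);
    std::vector<std::thread> workers;

    for (std::size_t i = 0; i < paths.size(); ++i) {
        // Each thread writes only to its own slot, so no locking is needed.
        workers.emplace_back([&paths, &elapsedMs, i] {
            struct stat st;
            const auto start = std::chrono::steady_clock::now();
            (void)stat(paths[i].c_str(), &st);  // result ignored, only timing matters
            const auto end = std::chrono::steady_clock::now();
            elapsedMs[i] = std::chrono::duration_cast<std::chrono::milliseconds>(
                               end - start).count();
        });
    }

    for (auto& worker : workers) {
        worker.join();
    }
    return elapsedMs;
}
```

Against the demo filesystem from the question, I would expect passing /mnt/cachefs/file1.ext and /mnt/cachefs/file2.ext to give one time near 5000 and one near 10000, matching the terminal output above. Two paths in separate directories (say /mnt/cachefs/dir1/file1.ext and /mnt/cachefs/dir2/file2.ext, both made-up names) should each come back at around 5000 if the serialisation really is per-directory.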

For those for whom this mitigation isn't enough, if your filesystem supports directory listing you may be able to take advantage of David Strauss' suggestion, which is to use the readdir call as a trigger to prime your cache:

In our file systems, we try to pre-fetch and cache the attribute information (which will inevitably be requested) during readdir so we don't have to hit the backend for each one.

Since the backend to my filesystem has no concept of directories I was unable to take advantage of his suggestion, but hopefully this will be helpful to others.
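For anyone who can use it, the idea might look roughly like the sketch below. I haven't run this in a real filesystem, and none of it is FUSE API: AttrCache, Prime and Lookup are names invented for illustration. It assumes your backend can hand back the attributes for every entry when it lists a directory.

```cpp
#include <sys/stat.h>

#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical attribute cache keyed by full path. Not part of FUSE.
class AttrCache
{
public:
    // Intended to be called from the readdir handler with everything the
    // backend returned for one directory listing.
    void Prime(const std::string& dir,
               const std::vector<std::pair<std::string, struct stat>>& entries)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        for (const auto& entry : entries) {
            cache_[Join(dir, entry.first)] = entry.second;
        }
    }

    // Intended to be called from the getattr handler. Returns true and fills
    // *out on a hit; on a miss the caller still has to ask the backend.
    bool Lookup(const std::string& path, struct stat* out) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = cache_.find(path);
        if (it == cache_.end()) {
            return false;
        }
        *out = it->second;
        return true;
    }

private:
    // Build "dir/name" without doubling the slash when dir is "/".
    static std::string Join(const std::string& dir, const std::string& name)
    {
        if (!dir.empty() && dir.back() == '/') {
            return dir + name;
        }
        return dir + "/" + name;
    }

    mutable std::mutex mutex_;  // handlers run on several threads
    std::unordered_map<std::string, struct stat> cache_;
};
```

The getattr handler would try Lookup first and only fall back to the slow backend call on a miss. A real implementation would also need some way to expire or invalidate entries, which this sketch leaves out.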

answered on Stack Overflow Sep 23, 2013 by abulford

User contributions licensed under CC BY-SA 3.0