Tuesday, October 25, 2005

System Calls for Attributes

The title is a link to Dominic Giampaolo's book Practical File System Design with the Be File System. The attribute-related material is in Chapter 5. From the book:

BFS stores the list of attributes associated with a file in an attribute directory (the attributes field of the bfs.inode structure). The directory is not part of the normal directory hierarchy but rather "hangs" on the side of the file. The named entries of the attribute directory point to the corresponding attribute value.
So each file has a hidden attribute directory, and the attributes themselves are files in that directory. The file name is the attribute name and the file contents are the attribute's value. This allows the inode data structure to be reused. As an optimization, small attributes are stored in the unused portion of the inode itself, which saves the disk head a trip to the attribute directory just to fetch attribute information.

A program can perform the following system calls on attributes:

  • Open attribute directory
  • Read attribute directory
  • Rewind attribute directory
  • Close attribute directory
  • Stat attribute
  • Remove attribute
  • Read attribute
  • Write attribute
An example system call looks like:
ssize_t fs_read_attr(int fd, const char *attribute, uint32 type, off_t pos, void *buf, size_t count);
The file descriptor indicates which file to operate on, the attribute name indicates which attribute to do the I/O to, the type indicates the type of data being written (integer, double, string, etc.), and the position specifies the offset into the attribute to do the I/O at.

The exact system calls for the above are

  • DIR *fs_open_attr_dir(char *path);
  • struct dirent *fs_read_attr_dir(DIR *dirp);
  • int fs_rewind_attr_dir(DIR *dirp);
  • int fs_close_attr_dir(DIR *dirp);
  • int fs_stat_attr(int fd, char *name, struct attr_info *info);
  • int fs_remove_attr(int fd, char *name);
  • ssize_t fs_read_attr(int fd, char *name, uint32 type, off_t pos, void *buffer, size_t count);
  • ssize_t fs_write_attr(int fd, char *name, uint32 type, off_t pos, void *buffer, size_t count);
Note the API style for the last four calls: both the file descriptor of the file that the attribute is associated with and the name of the attribute are required. Making attributes into full-fledged file descriptors would have made removing files considerably more complex, so attributes are not treated as file descriptors in their own right.
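To make the (file, attribute name, type, position) addressing concrete, here is a minimal in-memory sketch. It is not the BeOS implementation and does not call the real fs_* functions: struct mock_file, the mock_* functions, the size limits, and the choice to ignore the type on reads are all assumptions made for illustration, with a pointer to a mock file standing in for the file descriptor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t, off_t */

#define MAX_ATTRS 8      /* arbitrary limits chosen for this sketch */
#define MAX_NAME  32
#define MAX_VALUE 256

/* One named entry in a file's attribute "directory" (hypothetical layout). */
struct mock_attr {
    char     name[MAX_NAME];
    uint32_t type;
    size_t   size;
    char     value[MAX_VALUE];
    int      used;
};

/* Hypothetical stand-in for an open file: only its attributes are modelled. */
struct mock_file {
    struct mock_attr attrs[MAX_ATTRS];
};

static struct mock_attr *find_attr(struct mock_file *f, const char *name)
{
    for (int i = 0; i < MAX_ATTRS; i++)
        if (f->attrs[i].used && strcmp(f->attrs[i].name, name) == 0)
            return &f->attrs[i];
    return NULL;
}

/* Write count bytes at offset pos, creating the attribute if needed. */
ssize_t mock_write_attr(struct mock_file *f, const char *name, uint32_t type,
                        off_t pos, const void *buf, size_t count)
{
    assert(f && name && buf);
    if (pos < 0 || strlen(name) >= MAX_NAME)
        return -1;
    if ((size_t)pos > MAX_VALUE || count > MAX_VALUE - (size_t)pos)
        return -1;

    struct mock_attr *a = find_attr(f, name);
    if (!a) {
        for (int i = 0; i < MAX_ATTRS && !a; i++)
            if (!f->attrs[i].used)
                a = &f->attrs[i];
        if (!a)
            return -1;                 /* attribute directory is full */
        memset(a, 0, sizeof *a);       /* new values start zero-filled */
        strcpy(a->name, name);
        a->used = 1;
    }
    a->type = type;
    memcpy(a->value + pos, buf, count);
    if ((size_t)pos + count > a->size)
        a->size = (size_t)pos + count;
    return (ssize_t)count;
}

/* Read up to count bytes starting at offset pos; returns bytes copied. */
ssize_t mock_read_attr(struct mock_file *f, const char *name, uint32_t type,
                       off_t pos, void *buf, size_t count)
{
    (void)type;                        /* this sketch ignores the type on reads */
    struct mock_attr *a = find_attr(f, name);
    if (!a || pos < 0)
        return -1;
    if ((size_t)pos >= a->size)
        return 0;
    size_t n = a->size - (size_t)pos;
    if (n > count)
        n = count;
    memcpy(buf, a->value + pos, n);
    return (ssize_t)n;
}

/* Remove the named attribute; returns 0 on success, -1 if it is absent. */
int mock_remove_attr(struct mock_file *f, const char *name)
{
    struct mock_attr *a = find_attr(f, name);
    if (!a)
        return -1;
    a->used = 0;
    return 0;
}
```

Because every call names both the file and the attribute, the attribute never needs a handle of its own, which is the property the paragraph above describes.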

Attributes

There was once a great operating system called BeOS which sported an excellent filesystem called BFS (the Be File System). BFS was efficient, journalled, large (64-bit) and multithreaded, but its most compelling feature (i.e., the feature not found in most conventional operating systems) was attributes. Attributes are metadata associated with files on a BFS volume.

For example, MP3 files contain ID3 data identifying fields such as Artist and Title. JPEG files can contain IPTC data containing information such as the date of a photograph, the photographer, the shutter speed, the aperture, etc. Using attributes, these fields would be refactored into the filesystem, not stored in the file itself. That is, an MP3 file would not have an ID3 tag, but would instead be attributed with Artist and Title attributes.

Another example of attribute usage is storing the preferred application to view a file. For example, in Windows, I may designate that JPEG files should open in the Gimp. However, when I download my vacation photos (which are JPEGs) from my camera, I'd like the default double-click action to be to open them in the Picture Viewer (so I can get an easy slide show). I do not recall if Windows currently has the ability to open some files of a type in one application and other files of the same type in a different one. If it does indeed have this feature, that information would have to be stored in the registry (which brings a lot of woes, including loss of information on OS reinstall). Using attributes, each file could have a PreferredApp attribute that specifies which application to use. This attribute could be an OS-specified attribute (so that "power users" do not accidentally delete it).

SpoonFS will support arbitrary attributes on files. There will be a concept of system attributes (e.g., Date, Permission) and user attributes (e.g., Width and Height for JPEGs, Author for MP3 files). As well, attributes will be type-safe. For example, a Date attribute will expect a date to be entered according to the format provided by the OS locale.
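One way to picture type-safe attributes is a tagged value whose setter rejects text that does not parse as the declared type. The sketch below is only an illustration under stated assumptions: SpoonFS has no published API, so struct typed_attr, attr_set_from_text, and the three-type enum are hypothetical, and the date format is fixed to YYYY-MM-DD rather than taken from the OS locale.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical set of attribute types for this sketch. */
enum attr_type { ATTR_INT, ATTR_STRING, ATTR_DATE };

struct date { int year, month, day; };

struct typed_attr {
    const char    *name;
    enum attr_type type;
    int            is_system;   /* 1 for system attributes such as Date */
    union {
        long        i;
        char        s[64];
        struct date d;
    } v;
};

/* Parse "YYYY-MM-DD" (assumed format); 0 on success, -1 if malformed. */
static int parse_date(const char *text, struct date *out)
{
    int y, m, d;
    char extra;
    /* Exactly three fields must convert; trailing characters are rejected. */
    if (sscanf(text, "%d-%d-%d%c", &y, &m, &d, &extra) != 3)
        return -1;
    if (m < 1 || m > 12 || d < 1 || d > 31)
        return -1;
    out->year = y;
    out->month = m;
    out->day = d;
    return 0;
}

/* Store text into the attribute only if it matches the declared type.
 * Returns 0 on success, -1 on a type mismatch; on failure the old value
 * is left untouched. */
int attr_set_from_text(struct typed_attr *a, const char *text)
{
    assert(a && text);
    switch (a->type) {
    case ATTR_INT: {
        char *end;
        errno = 0;
        long v = strtol(text, &end, 10);
        if (end == text || *end != '\0' || errno == ERANGE)
            return -1;
        a->v.i = v;
        return 0;
    }
    case ATTR_STRING:
        if (strlen(text) >= sizeof a->v.s)
            return -1;
        strcpy(a->v.s, text);
        return 0;
    case ATTR_DATE:
        return parse_date(text, &a->v.d);
    }
    return -1;
}
```

With this shape, setting a Width attribute to "wide" or a Date attribute to "tomorrow" fails at the point of entry instead of storing an unusable value.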

Attributes present a very powerful paradigm; they are a first step toward turning a filesystem into a database. Attributes are indexed and queryable, so it is possible to quickly find files. As well, attributes allow the creation of metadata-only files. For example, consider the concept of a Contact in an e-mail program or IM application. In a traditional filesystem, a Contact may be implemented as a small text file that has fields listed one per line (e.g., Name, E-mail address, Birthday, etc.). In an attribute-based filesystem, each field could be stored as an attribute on a zero-byte file. In actuality, no inode needs to be allocated for the file itself; only metadata is stored. The name SpoonFS comes from this observation that no actual data file exists. Quoting from The Matrix:

There is no spoon

Similarly, there is no file (only metadata).
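The Contact example can be sketched as entries that carry nothing but name/value attributes, plus a query that selects entries by attribute value. Everything here is hypothetical (struct entry, entry_get, query_eq), values are plain strings for brevity, and the query is a linear scan; a real implementation would consult an index to get the fast lookups described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A single attribute: name and value, both strings in this sketch. */
struct kv {
    const char *name;
    const char *value;
};

/* A metadata-only "file": no data bytes, only attributes. */
struct entry {
    const struct kv *attrs;
    size_t           nattrs;
};

/* Return the value of the named attribute, or NULL if the entry lacks it. */
const char *entry_get(const struct entry *e, const char *name)
{
    for (size_t i = 0; i < e->nattrs; i++)
        if (strcmp(e->attrs[i].name, name) == 0)
            return e->attrs[i].value;
    return NULL;
}

/* Collect up to max_out entries whose attribute `name` equals `value`.
 * Linear scan stands in for an index lookup. Returns the match count. */
size_t query_eq(const struct entry *entries, size_t n,
                const char *name, const char *value,
                const struct entry **out, size_t max_out)
{
    assert(entries || n == 0);
    size_t found = 0;
    for (size_t i = 0; i < n && found < max_out; i++) {
        const char *v = entry_get(&entries[i], name);
        if (v && strcmp(v, value) == 0)
            out[found++] = &entries[i];
    }
    return found;
}
```

A contact list is then just an array of such entries, and "all contacts in the Work group" becomes a call like query_eq(contacts, n, "Group", "Work", hits, max).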

Saturday, October 22, 2005

SpoonFS Motivation

This is the first post to get some ideas regarding SpoonFS down. Traditionally, filesystems have been based on the notion of a hierarchy. There are two primitives in these hierarchical systems:
  • files (which are an abstraction for a sequence of bytes)
  • directories (which contain a collection of files)
Directories are called 'folders' in some simpler OSes. Files are named objects and the names are unique within a directory (i.e., you cannot have two files with the same name in the same directory). Over time, various applications have required the ability to build up databases of files that are queryable. E-mail clients, photo software, word processors: all of these applications end up building their own databases so that users may query their data orthogonally. However, this repetition across applications points to a deficiency in the design of current file systems. Application developers are building databases of files on top of hierarchical file systems; this common functionality should be refactored into the filesystem, yielding a filesystem that is itself a database of files. The hierarchy notion will be abandoned. The SpoonFS project will provide an implementation of this database filesystem. Features will include type-safe attributes, anonymous and named relations between files, and fast queries.