Lambda the Ultimate

USENIX2003: A Logic File System
started 7/24/2003; 12:59:57 AM - last post 8/1/2003; 10:02:35 AM

Ehud Lamm - USENIX2003: A Logic File System

7/24/2003; 12:59:57 AM (reads: 2811, responses: 3)

USENIX2003: A Logic File System

I'd like to submit one more summary of USENIX2003 talks: a logical file system. It my opinion, it was the best, the most innovative USENIX talk. Although the subject matter may seem remote from programming languages proper, that talk touches on several points that have been discussed on LtU:

SQL and database access languages in general. The talk shows how to access what is essentially a relational database using a query language based on propositional logic. Although the label 'propositional logic' seems intimidating, the query language is actually quite natural -- even more natural than the familiar "cd/ls". Of a particular importance is the ability to formulate a query incrementally, to refine it -- what the paper calls integration of query and navigation.
The paper demonstrates how to automatically group the results of a query in meaningful and the most general categories, so as to avoid overwhelming the user.
the view that a hierarchy is deficient in organizing and using large collections. OOP programmers come to understand this fact: thus we have AOP. The logic file system shows a different approach, which might be suitable in large software projects.
The author of the paper is currently working on "views" of a file (which the summary mentions at the end). For example, we can view and edit a sample of sections from a document file as if it were a separate document. This relates to a recent LtU discussion of "one file per class". The views provided by the file system show a different solution to the problem (other than the hidden types).
Finally. The summary states that the system was written in a special language -- improved Perl -- which is then translated into regular Perl. I guess that makes the paper directly relevant to LtU.

In general, the notion of scoping is common to both file systems and programming language's variables. The talk shows that we might view the variable scoping in a more general way: as a logical formula. I must also state that embedding a Prolog interpreter into a file system and viewing file paths as logical formulas are very insightful and inspiring. Cheers, Oleg

Another interesting USENIX summary from Oleg. Enjoy!

Posted to general by Ehud Lamm on 7/24/03; 1:06:40 AM

Ehud Lamm - Re: USENIX2003: A Logic File System

7/24/2003; 1:05:15 AM (reads: 1806, responses: 0)

A Logic File System. Yoann Padioleau and Olivier Ridoux, IRISA / University of Rennes. General Track, pp. 99-112 of the Proceedings http://www.usenix.org/events/usenix03/tech/padioleau.html

This is the most interesting, insightful, and innovative talk at this conference.

The goal of the talk is to combine the navigation and querying in a file system. Roughly speaking, the talk attempted to bring filesystems from the age of hierarchical and network databases into the age of relational -- and, moreover, deductive databases -- and yet maintain the familiar cd/ls/mkdir/mv interface, file paths and shell globs.

The current filesystems are built around the hierarchical model (links, if present, are only uni-directional. Therefore filesystems do not actually follow the network database model). Unfortunately, a hierarchy is a tree rather than a lattice: a hierarchy decomposes the system around one aspect disregarding the others. For example, the typical layout of the filesystem emphasizes the scope: /, /usr, /usr/local, /usr/local/share/ghostscript/7.05. Other aspects, e.g., library code, are smeared all over the place: /lib, /usr/lib, /usr/local/lib and /usr/local/share/ghostscript/7.05/lib. Locating files that are local to an installation is easy (/usr/local), locating all the library files is hard. Many interesting queries are difficult to execute or even to formulate. For example, to list all directories that contain only empty subdirectories, to "cd" to a directory with files foo.h or foo.c (disjunctive query), or to list all the files *.h or *.c in the given tree that are not in (sub)directories named Attic. To answer such queries, we have to write quite complex logical formulas, made more complex by the fact that me must keep the navigational structure in mind.

Many operating systems have tools that impose a relational layer on the file system -- glimpse, locate, find -- and offer a keyword-based search. However these tools often return a great number of results, which are not grouped in any particular way. For example, "locate sendmail.cf" may return a couple of file names. But "locate !sendmail.cf" (where ! stands for negation), if it were allowed, would list almost any file on the system. Clearly the user will find such a result unmanageable. The user would much prefer if the output were automatically grouped according to the most general and informative categories: the most general keywords that _will_ refine the search. Furthermore, the user should be able to navigate such a virtual directory of results using the familiar 'cd' and 'ls' commands, or reclassify a file with a 'mv' command. Finally, if the user adds or deletes a file, the virtual file tree should be updated automatically.

We would like to "cd /lib" and descend into the Directory that contains only library files, grouped into subdirectories "root", "usr", "local". We would like to "cd /bin|/sbin" and get to the root of the "filesystem tree" that contains only executable files (again, grouped into "subdirectories"). We may even do "cd !src" to get the view of the filesystem without any source files. The Logical File System described in this talk (LISFS for short, not to be confused with the Log File system) can accomplish all that -- now. The prototype implementation is fully functional and reasonably fast. An ls-benchmark shows hardly any performance overhead of far more expressive filesystem queries.

The logical filesystem LISFS takes a relational view: files are entities that are described by their properties (attributes). To locate a file, we execute a query. LISFS lets a user compose a query incrementally: "cd /query" executes the query and makes the corresponding collection current. "cd query1" refines the original query. Incidentally, this means that "cd a/b" and "cd b/a" are the same: the path separator stands for a conjunction, and therefore, is commutative. The closed-world assumption makes negation queries possible (e.g., "cd !/lib"). LISFS goes farther than the naive relational model. A user may create taxonomies of attribute values. For example, the user may create a category "lib" with sub-categories "shared-lib" and "static-lib" -- using the mkdir command. LISFS will infer that anything classified as a shared library is also a library. The inference rules can be more complex: we can define a category "dual-lib" for project's libraries that can be linked both statically and dynamically. When LISFS reifies the result of a query as a filesystem directory, the most general categories that partition the result become the names of subdirectories. LISFS thus lets us navigate the inference tree. The complex inference requires a logical engine -- and indeed the current version of LISFS includes a Prolog interpreter, as a part of the filesystem code! Truly a file path is a logical formula.

Like the semantic file system, LISFS has intrinsic attributes, whose values are derived from the content of a file (e.g., file's size or the resolution of a JPEG image). Unlike the semantic file system, a user can define his own attributes and organize them into arbitrary taxonomies.

The paper describes the algorithms, the implementation of directory operations, and the obvious optimizations (e.g., caching).

LISFS is implemented truly at the low, file system level, as a user-level NFS daemon. Therefore, none of the applications need to be modified in any way to take advantage of LISFS. For example, the wildcard expansion in a shell (glob) "just works".

LISFS is written with the help of a PerlFS toolkit. However, Yoann told me that the implementation language was not truly Perl. Yoann designed his own language with types and an ability to express post-conditions and invariants. That source language was then mechanically translated into Perl. Perhaps Yoann might wish to supply more details?

The only filesystem that comes close to LISFS in terms of integration of query and navigation was the file system of Apple Newton (so-called 'soup'). But NewtonScript never had any logical inference.

In the future research, Yoann plans to implement views of a file. We could then edit selected sections from a TeX document as if they were a separate file (Emacs' all-mode is a glimpse of that functionality). It seems that LISFS can be meaningfully integrated with Semantic Web.

Yoann was asked a question if a common person (without the knowledge of logic, etc) would be able to meaningfully navigate LISFS. Yoann gave an excellent answer: a common person can use Google. LISFS is not much different.

The prototype and more information are available at http://www.irisa.fr/LIS

Cimarron Taylor - Re: USENIX2003: A Logic File System

7/25/2003; 1:01:14 AM (reads: 1409, responses: 0)

It appears this project shares some of the goals of ReiserFS (http://www.namesys.com/whitepaper.html).

Leandro - Re: USENIX2003: A Logic File System

8/1/2003; 10:02:35 AM (reads: 932, responses: 0)

This looks like a saner, humbler, higher-level ReiserFS.

But it still seems like they didn't get the relational model properly, unduly speaking of it as simply a keyword search schema. Then, by not having the simplicity and expressiveness of the relational model available, they are forced to use a much more complicated theory to reinvent a part of the relational power.

Maybe it's only me playing the Date inside, but here my 2¢.