Specialized File/Disk Systems for Actor Environments

Hi Folks,

It occurs to me that every once in a while someone takes a radically different approach to file system design, to better match disk (or now SSD) I/O to the nature of the processing going on. Database file systems come to mind. Maybe the Hadoop file system (HDFS)?

It strikes me that large clouds of actors - as in Erlang - might better be supported/represented by something other than a standard file system. (This is in the context of starting to design a system that might run on Erlang on bare iron, which leads to the question of whether there are some storage optimizations to consider.)

Any thoughts, references, etc?

Thanks,

Miles Fidelman

Orleans

Well folks, for lack of any responses here, I went looking and found a couple of rather interesting approaches: notably Orleans (https://dotnet.github.io/orleans/Documentation/index.html), pitched as an "Actor Oriented Database", and an Erlang implementation of it (https://github.com/erleans/erleans).

Not exactly a file system, but a particularly nice approach to using file systems to manage large clouds of actors.

Streaming fsync

One feature I've always wanted in a filesystem is a streaming version of fsync() - one that would notify me when a given prefix of an append-only file had been synced to disk while I was still appending records to it. I believe most fsync() implementations block use of the file descriptor until the entire file sync is complete.
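For what it's worth, something close to this can be approximated in user space. The following is only a minimal sketch, not a real streaming fsync: it assumes that fsync() blocks just the calling thread, so a background thread can sync while the main thread keeps appending, and that everything written before an fsync() call starts is durable once that call returns. The class name `SyncedPrefixLog` and the `on_synced` callback are made up for illustration.

```python
import os
import threading


class SyncedPrefixLog:
    """Append-only log that reports how long a prefix is known to be synced.

    Hypothetical helper for illustration; not an existing library API.
    """

    def __init__(self, path, on_synced):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._on_synced = on_synced          # called with a byte offset
        self._cond = threading.Condition()
        self._written = os.fstat(self._fd).st_size
        self._synced = self._written
        self._closing = False
        self._thread = threading.Thread(target=self._sync_loop, daemon=True)
        self._thread.start()

    def append(self, record: bytes) -> int:
        """Append a record and return the file offset just past it."""
        with self._cond:
            view = memoryview(record)
            while view:                      # handle short writes
                n = os.write(self._fd, view)
                view = view[n:]
            self._written += len(record)
            end = self._written
            self._cond.notify()
        return end

    def _sync_loop(self):
        while True:
            with self._cond:
                while self._written == self._synced and not self._closing:
                    self._cond.wait()
                if self._written == self._synced:
                    return                   # closing and nothing left to sync
                target = self._written       # prefix written before fsync starts
            os.fsync(self._fd)               # lock released: appends continue
            with self._cond:
                self._synced = target
            self._on_synced(target)

    def close(self):
        with self._cond:
            self._closing = True
            self._cond.notify()
        self._thread.join()
        os.close(self._fd)
```

A writer would compare the offset returned by `append()` with the offsets passed to `on_synced` to decide when a given record can be acknowledged. The reported offset is a lower bound: the fsync() may well have flushed later appends too, but the sketch only claims what was written before the call began.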

Storage region types

For what it’s worth, one small feature I have found useful in my work is region-annotated types, where regions can correspond to sections of files on disk. The TLDR is that you can write code with normal data structures but have the values persist. Especially with SSDs, the distinction between memory and the file system basically disappears:

https://github.com/Morgan-Stanley/hobbes
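
To give a rough feel for the general idea of a value that lives in a file region, here is a small Python sketch. It is not hobbes code and says nothing about how hobbes implements regions; it just uses a memory-mapped file as a stand-in, and the `PersistentCounter` class and its 8-byte layout are assumptions made up for the example.

```python
import mmap
import os
import struct

# Assumed layout for this sketch: one little-endian unsigned 64-bit integer.
COUNTER = struct.Struct("<Q")


class PersistentCounter:
    """A counter whose storage is a memory-mapped region of a file."""

    def __init__(self, path):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            if os.fstat(fd).st_size < COUNTER.size:
                os.ftruncate(fd, COUNTER.size)   # a fresh region reads as zero
            self._map = mmap.mmap(fd, COUNTER.size)
        finally:
            os.close(fd)                         # the mapping keeps its own handle

    @property
    def value(self):
        return COUNTER.unpack_from(self._map, 0)[0]

    def increment(self):
        # Reads and writes go straight to the mapped region.
        COUNTER.pack_into(self._map, 0, self.value + 1)
        return self.value

    def close(self):
        self._map.flush()                        # push dirty pages to the file
        self._map.close()
```

Code using the counter reads like ordinary in-memory code, yet reopening the same path later yields the last stored value. A typed-region system would presumably generate this kind of plumbing from type annotations rather than have it written by hand.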

can you say a little more?

Thanks for this, and the link makes for interesting reading. But...

This kind of leaves out what's happening on the disk side (hard or SSD). What kind of file systems (or alternatives) are giving you the ability to map data to disk sections? And what kinds of algorithms are optimizing the placement of data?