Forest: A Language and Toolkit For Programming with Filestores

Report ID: TR-889-10
Author: Walker, David / Zhu, Kenny Q. / Fisher, Kathleen / Foster, Nate
Date: 2010-12
Pages: 28
Abstract:

Many applications use the file system as a simple persistent data store. This approach is expedient, but not robust. The correctness of such an application depends on the collection of files, directories, and symbolic links having a precise organization. Furthermore, these components must have acceptable values for a variety of file system attributes such as ownership, permissions, and timestamps. Unfortunately, current programming languages do not support documenting assumptions about the file system. In addition, actually loading data from disk requires writing tedious boilerplate code.

This paper describes Forest, a new domain-specific language embedded in Haskell for describing directory structures. Forest descriptions use a type-based metaphor to specify portions of the file system in a simple, declarative manner. Forest makes it easy to connect data on disk to an isomorphic representation in memory that programmers can manipulate as if it were any other data structure in their program. Forest also generates metadata that describes the degree to which the files on disk conform to the specification, making error detection easy. As a result, the system greatly narrows the divide between on-disk and in-memory representations of data. Forest leverages Haskell’s powerful generic programming infrastructure to make it easy for third-party developers to build tools that work for any Forest description. We illustrate the use of this infrastructure by building a number of useful tools, including a visualizer, a permission checker, and description-specific replacements for a number of standard shell tools.
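
The idea of pairing a declarative description with a loaded representation and conformance metadata can be illustrated with a small sketch in plain Haskell. This is not Forest's syntax or API: the types `Node`, `Spec`, `Rep`, and `Problem`, the function `load`, and the sample values are all hypothetical names chosen for illustration, and the "filestore" here is an in-memory tree rather than a real disk.

```haskell
-- Toy model only: every name below is hypothetical, not part of Forest.

-- An in-memory stand-in for a filestore (no real disk access).
data Node
  = File String             -- a file and its contents
  | Dir [(String, Node)]    -- a directory: entry name paired with its node
  deriving (Show, Eq)

-- A declarative description of the expected shape.
data Spec
  = SFile                   -- a regular file is expected here
  | SDir [(String, Spec)]   -- a directory with these named entries
  deriving (Show, Eq)

-- The loaded representation, shaped like the description.
data Rep
  = RFile String
  | RDir [(String, Rep)]
  | RMissing                -- placeholder where nothing usable was found
  deriving (Show, Eq)

-- Metadata recording where the store deviates from the description.
data Problem
  = Missing FilePath
  | WrongKind FilePath
  deriving (Show, Eq)

-- Load a node against a description, returning the representation
-- together with the list of conformance problems found along the way.
load :: Spec -> FilePath -> Maybe Node -> (Rep, [Problem])
load _ path Nothing                 = (RMissing, [Missing path])
load SFile _ (Just (File s))        = (RFile s, [])
load SFile path (Just (Dir _))      = (RMissing, [WrongKind path])
load (SDir _) path (Just (File _))  = (RMissing, [WrongKind path])
load (SDir fields) path (Just (Dir entries)) =
  let results  = [ (name, load spec (path ++ "/" ++ name) (lookup name entries))
                 | (name, spec) <- fields ]
      reps     = [ (name, rep) | (name, (rep, _)) <- results ]
      problems = concat [ ps | (_, (_, ps)) <- results ]
  in (RDir reps, problems)

-- A sample description and a store that only partly conforms to it.
projectSpec :: Spec
projectSpec = SDir [("README", SFile), ("src", SDir [("Main.hs", SFile)])]

sampleStore :: Node
sampleStore = Dir [("README", File "hello"), ("src", Dir [])]
```

Evaluating `load projectSpec "." (Just sampleStore)` yields a representation in which `README` is loaded and `src/Main.hs` is marked `RMissing`, alongside the problem list `[Missing "./src/Main.hs"]`. Returning the problems separately, rather than failing on the first mismatch, mirrors the abstract's point that metadata reports how far the store conforms while the rest of the data remains usable.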