Distributed, Garbage-Collected, Persistent, Virtual Address Spaces (thesis)

Report ID: TR-419-93
Author: Campos, Alvaro E.
Date: 1993-06-00
Pages: 116
Download Formats: PostScript
Abstract:

Most integrated programming environments were designed for uniprocessor systems, and they require substantial processing power and storage capacity. Despite the advantages they offer, such as flexibility and rapid response, their appetites have limited their use. The recent increases in work-station and network computing power provide an opportunity to overcome these limitations. This dissertation describes the design of a distributed, language-based, integrated environment and its implementation on current hardware. A distributed system allows users, who can access the system from different locations, to share values and computing power. Shared virtual-memory techniques are used to distribute the address space of EZ, a persistent, very high-level, string-processing programming and command language. Unlike other persistent systems, persistence pervades EZ and applies to both data and active objects. Distributed EZ runs on a loosely coupled multiprocessor, implemented as several work-stations connected by a network. A distributed virtual-memory manager provides a shared-memory programming paradigm even though there is no shared physical memory accessible by all work-stations. The virtual address space is distributed over the secondary storage devices of the individual processors. Managers cache pages into the physical memory of the work-stations that access them and maintain coherence among the multiple copies of a page. They use replication to permit multiple readers, but permit only one writer. A distributed, mark-and-sweep garbage collector, which works in concert with the memory manager, reclaims inaccessible objects in the distributed system. This collector is concurrent and real-time. The memory manager collaborates with the collector to avoid direct mutator assistance. The results show that it is feasible to build an effective distributed system by using shared virtual memory to distribute the persistent address space.
Performance remains a problem, due mainly to the high cost of network communication. The implementation of EZ's data structures conflicts with the techniques used to maintain cache coherence and causes too much interprocessor communication. The distributed, mark-and-sweep garbage collector works well, however, and it is especially effective when the system activity is distributed among several processors.
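
Illustrative sketch (not taken from the thesis): the abstract says the managers replicate pages to permit multiple readers but permit only one writer. The Python fragment below shows one common way such a policy can be tracked, using invalidation on write. The class name PageManager, its methods, and the single in-process directory are all hypothetical simplifications; the thesis describes distributed managers, and its actual protocol may differ.

```python
class PageManager:
    """Hypothetical directory: per page, the read-copy holders and the single writer."""

    def __init__(self):
        self.readers = {}  # page id -> set of node ids holding a read-only copy
        self.writer = {}   # page id -> node id holding the only writable copy

    def read_fault(self, page, node):
        """Grant `node` a read copy; a current writer is downgraded to a reader."""
        owner = self.writer.pop(page, None)
        holders = self.readers.setdefault(page, set())
        if owner is not None:
            holders.add(owner)
        holders.add(node)

    def write_fault(self, page, node):
        """Grant `node` the writable copy; return the nodes whose copies are invalidated."""
        invalidated = self.readers.pop(page, set()) - {node}
        owner = self.writer.get(page)
        if owner is not None and owner != node:
            invalidated.add(owner)
        self.writer[page] = node
        return invalidated
```

In this sketch every write fault invalidates all other copies of the page, so a page that is read and written alternately by different nodes generates a message exchange on each access. That is one plausible reading of the interprocessor communication cost the abstract reports, offered here as an assumption rather than as the thesis's own analysis.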