R  E  A  C  T  O  R

A  DISTRIBUTED  APPLICATION DEVELOPMENT  SYSTEM




Reactor represents a new generation of technology for building client/server, multi-tiered, and distributed applications. It combines an innovative web-based application development system with a rich and robust distributed infrastructure. This paper outlines the features and benefits of Reactor's high-productivity, distributed application development system.

The Development Environment

Most development environments force you to use the proprietary user interface included with the environment. At Critical Mass, Inc., we've decided to take a different approach. The user interface of our development environment is a standard web browser such as Netscape Navigator or Internet Explorer. At the heart of our development environment is a custom-built server driven by HTTP requests for browsing, building, and compiling. (Think of it as your personal HTTP server which knows about all your program files and their relationships.) Our reliance on a web browser as our development front-end makes it easy for new programmers to come up to speed and reduces training costs; after all, most users are familiar with web browsers. Moreover, the hypertext and networked nature of the web creates a rich visual metaphor for building and exploring distributed applications. For example, using Reactor, it is easy to embed references to external documentation, to point out relationships with other projects in your system, or to create links in a README file so that a coworker can easily find the external documentation for programs.

The Reactor Development Environment: Integrated into an off-the-shelf web browser, the Reactor development environment gives you the same look and feel whether you are programming on Unix or Windows. You can build and browse your programs, or even link your own HTML pages into your program sources.

Each file in your project and every command in Reactor maps to a URL in the virtual web space of the Reactor server. Hence, as web browsing technology becomes embedded further into the desktop metaphor, the Reactor user interface will become a natural part of the development desktop rather than a stand-alone entity running on the side. Essentially, a Reactor-based project will be just another explorable space residing on the desktop reached through standard desktop interaction mechanisms. Moreover, to aid programming efforts that require multiple developers, Reactor provides public and private repositories, and allows team members to browse each other's projects. You can embed a reference to a Reactor project in an e-mail message to a coworker. With Reactor, team communication is just a click away.

The Programming Language

The Modula-3 programming language is the foundation of Reactor's programming environment. The Modula-3 language was developed in the late 1980s by researchers at Digital's Systems Research Center and the Olivetti Research Center; both organizations have extensive experience in building sophisticated application and systems software. Modula-3 provides features that are essential to the development of large-scale systems, such as separate interfaces, garbage collection, objects, threads, and exceptions.

[Feature comparison table; the per-language entries are not recoverable.]
Languages compared: Smalltalk, Ada83, Ada95, Java, C++, C, Modula-3
Features compared:
  Interfaces
  Strict Separation of Impl. and Interface
  Garbage Collection
  Multiple Inheritance
  Objects
  Concurrency & Threads
  Generics
  Exceptions
  Unsafe Features
  Separation of Unsafe Code
  Dynamic Typing

The Reactor Development Language: Despite its 50-page definition, Modula-3, the heart of the Reactor programming environment, gives you more support than other languages you may have considered for your systems programming tasks.

The language designers sought a combination of features that would support the development of robust, long-lived systems while keeping the language small and comprehensible. The result is a language that is better suited for building robust distributed applications than other current languages. Modula-3 is more powerful than C++, but its language definition fits comfortably within about fifty pages of text. The next few sections describe some of the major features of the language.

True Separation of Interface and Implementation

The key structuring idea in building programs with Modula-3 is the strong separation of interface from implementation. An interface describes an abstraction. That is, it describes what something is capable of doing but does not describe how it is done. Interfaces provide the what, implementations provide the how.

This separation means that you are free to change the way an abstraction is implemented without having to worry about all the clients of that abstraction: no recompiling is necessary. As long as the interface remains unchanged, the clients are unaffected. Clients cannot accidentally become dependent on the way something is implemented and later break in mysterious ways when the implementation is changed.

Another benefit of separating interfaces from implementations is that it decouples the activities of the implementor of a client program from those of the API's provider. For example, client implementors can begin programming as soon as the interface is written. They don't need to wait for full implementations from the API's provider. Also, project managers can easily maintain control of an interface's evolution by restricting write access to it, without hindering the programmers' ability to write code.

The true separation of interfaces from implementations is a critical advantage for building large systems.
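As a minimal sketch of this separation, consider a hypothetical Counter abstraction (the names Counter, New, Increment, and Value are ours, chosen only for illustration). The interface declares an opaque type; the representation is revealed only inside the implementation module, so clients cannot come to depend on it.

```modula3
(* Counter.i3: the "what". Clients see only this file. *)
INTERFACE Counter;

TYPE T <: REFANY;   (* opaque type: the representation is hidden *)

PROCEDURE New (): T;
PROCEDURE Increment (c: T);
PROCEDURE Value (c: T): INTEGER;

END Counter.

(* Counter.m3: the "how". It can change freely as long as
   Counter.i3 stays the same. *)
MODULE Counter;

REVEAL T = BRANDED REF RECORD count: INTEGER := 0 END;

PROCEDURE New (): T =
  BEGIN
    RETURN NEW (T);
  END New;

PROCEDURE Increment (c: T) =
  BEGIN
    INC (c.count);
  END Increment;

PROCEDURE Value (c: T): INTEGER =
  BEGIN
    RETURN c.count;
  END Value;

BEGIN
END Counter.
```

A client that imports Counter can call Counter.New, Counter.Increment, and Counter.Value, but any attempt to touch the count field directly is rejected at compile time, because the revelation is not visible outside Counter.m3.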

Garbage Collection

Garbage collection automates the most difficult aspect of memory management by keeping track of references in your system, and freeing unused pieces of memory when there are no longer references to them. Memory is finite; thus, memory management is necessary in any long-running program written in any language. Most of the time, the programmer needs to do the hard work of determining when to free a piece of memory that is no longer needed. While this is an easy task for small programs, it is quite difficult to keep track of complex dependencies for large or long-running client/server systems. With garbage collection, it is the responsibility of the system to determine when a piece of memory is no longer in use. Automatic garbage collection makes it easier to create more robust programs. Programs built using manual storage management are often plagued by storage leaks and dangling references. Garbage collection eliminates these errors. Correct, efficient memory management is essential to building long-lived and robust applications. Automatic garbage collection makes it easy.

Historically, programmers have been suspicious of garbage-collected systems, feeling that they introduced too much overhead. Reactor's garbage collector has been tuned through years of intensive use, and performs quite well in client/server applications. Also, Reactor's state-of-the-art, concurrent, incremental collector supports the development of response-critical, multi-threaded applications.

Accompanying Reactor are a number of tools for analyzing and tuning memory usage and an open interface to tune the collector's behavior. In fact, many of the same tools were used in the development of Reactor, which is itself a client/server application.

No matter what garbage collector or memory management scheme you use, there are times when you need complete control of memory management. In Modula-3, you can use untraced references. Then, as the programmer, you are responsible for allocating and freeing the storage. This feature is essential when dealing with data structures allocated by foreign and legacy systems, where you must disallow automatic garbage collection.
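The difference between traced and untraced references can be sketched in a few lines. The type and variable names below are hypothetical; the language rules used (UNTRACED REF, and DISPOSE being available only in an UNSAFE module) come from the Modula-3 language definition.

```modula3
UNSAFE MODULE Main;   (* DISPOSE is permitted only in unsafe modules *)

IMPORT IO, Fmt;

TYPE
  Buffer    = ARRAY [0 .. 255] OF CHAR;
  TracedBuf = REF Buffer;            (* reclaimed by the collector *)
  ManualBuf = UNTRACED REF Buffer;   (* ignored by the collector *)

VAR
  t: TracedBuf := NEW (TracedBuf);
  m: ManualBuf := NEW (ManualBuf);
BEGIN
  t[0] := 'a';
  m[0] := 'b';
  IO.Put ("traced: " & Fmt.Char (t[0])
            & ", untraced: " & Fmt.Char (m[0]) & "\n");

  t := NIL;      (* nothing more to do: the collector frees the storage *)
  DISPOSE (m);   (* explicit free; DISPOSE also sets m to NIL *)
  <* ASSERT m = NIL *>
END Main.
```

Confining DISPOSE to modules marked UNSAFE keeps manual storage management visible and contained, while the rest of the program continues to rely on the collector.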

Objects

Objects combine state with operations on that state. Modula-3, Reactor's programming language, provides a simple single-inheritance object system much like Simula's, Smalltalk's or the original C++'s. Its object model is simpler than C++'s, making it easier to build more robust systems. Like Smalltalk or Java, objects reside in a garbage collected heap.
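A small sketch of the single-inheritance object model follows. Shape and Rect are hypothetical types used only for illustration: Rect inherits the name field from Shape and overrides its area method.

```modula3
MODULE Main;

IMPORT IO, Fmt;

TYPE
  Shape = OBJECT
    name: TEXT;
  METHODS
    area (): REAL := DefaultArea;   (* default method body *)
  END;

  Rect = Shape OBJECT               (* single inheritance from Shape *)
    w, h: REAL;
  OVERRIDES
    area := RectArea;
  END;

PROCEDURE DefaultArea (<*UNUSED*> self: Shape): REAL =
  BEGIN
    RETURN 0.0;
  END DefaultArea;

PROCEDURE RectArea (self: Rect): REAL =
  BEGIN
    RETURN self.w * self.h;
  END RectArea;

VAR
  s: Shape := NEW (Rect, name := "rectangle", w := 2.0, h := 3.0);
BEGIN
  (* The call is dispatched on the allocated type, so RectArea runs. *)
  IO.Put (s.name & " area = " & Fmt.Real (s.area ()) & "\n");
END Main.
```

The object is allocated on the traced heap with NEW and is never freed explicitly.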

Threads

Today's applications are quite different from those of just a few years ago. When C was invented, most applications were either batch-oriented or presented a simple line-oriented interface. Today's applications are often driven by multiple, asynchronous, external events; they must respond quickly at all times, and need access to other applications on the network. For instance, a server needs to respond in a timely fashion; otherwise clients will perceive it as having crashed. A client may need to invoke operations on multiple servers. This demanding environment has led to systems with multi-threaded architectures. Programming responsive, event-driven applications is simpler with threads. Reactor meets this need with language-level support for multi-threaded applications. Standard libraries and the run-time on all platforms are multi-threaded.
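The following sketch shows the language-level pieces involved: a thread is an object with an apply method, MUTEX is a built-in type, and LOCK is a statement. It assumes the standard Thread interface (Thread.Closure, Thread.Fork, Thread.Join); the Worker type and the counts are hypothetical.

```modula3
MODULE Main;

IMPORT Thread, IO, Fmt;

TYPE
  Worker = Thread.Closure OBJECT
    rounds: CARDINAL;
  OVERRIDES
    apply := Run;
  END;

VAR
  mu               := NEW (MUTEX);
  total  : INTEGER := 0;            (* shared state, protected by mu *)
  threads: ARRAY [1 .. 4] OF Thread.T;

PROCEDURE Run (self: Worker): REFANY =
  BEGIN
    FOR i := 1 TO self.rounds DO
      LOCK mu DO INC (total) END;   (* serialize updates to total *)
    END;
    RETURN NIL;
  END Run;

BEGIN
  FOR i := FIRST (threads) TO LAST (threads) DO
    threads[i] := Thread.Fork (NEW (Worker, rounds := 1000));
  END;
  FOR i := FIRST (threads) TO LAST (threads) DO
    EVAL Thread.Join (threads[i]);  (* wait for each worker to finish *)
  END;
  <* ASSERT total = 4000 *>
  IO.Put ("total = " & Fmt.Int (total) & "\n");
END Main.
```

Because every increment happens inside LOCK, the four workers cannot interleave their updates, and the final count is the same on every run.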

Exception Handling

One of the biggest problems in developing robust software is getting callers of a routine to check error codes returned by that routine. In any large software system, there may be numerous places where someone forgot to check the error status after calling some routine. Often the system will crash in some mysterious way unrelated to the original, missed error report. These failures are time bombs waiting to go off long after the system has been deployed.

Exception handling is an error signalling and handling technique. It ensures that callers can't accidentally forget to check if a routine completed successfully. When a routine encounters an error situation, it raises an exception. This causes the language run-time to see if any of the callers have indicated that they want to handle this exception. If so, then control is transferred back to the point where the program is prepared to handle the exception. If the program is unable to handle the exception, the program is terminated in a controlled fashion. When a routine is declared to raise a set of exceptions, the compiler checks all calls to that routine to see if they are in scopes that handle the raised exceptions. If there are some exceptions that the caller doesn't handle, then the compiler issues a warning. If you get rid of all such warnings in a program, your program will never raise an exception that would not be handled. This is yet another way Reactor helps create robust code.
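A minimal sketch of these rules, using a hypothetical SafeDiv routine: the RAISES clause is part of the procedure's signature, so the compiler can check every call site against the handlers in scope.

```modula3
MODULE Main;

IMPORT IO, Fmt;

EXCEPTION DivideByZero;

PROCEDURE SafeDiv (a, b: INTEGER): INTEGER RAISES {DivideByZero} =
  BEGIN
    IF b = 0 THEN RAISE DivideByZero END;
    RETURN a DIV b;
  END SafeDiv;

BEGIN
  TRY
    IO.Put (Fmt.Int (SafeDiv (10, 2)) & "\n");
    IO.Put (Fmt.Int (SafeDiv (1, 0)) & "\n");   (* raises DivideByZero *)
  EXCEPT
  | DivideByZero =>
      IO.Put ("division by zero was reported and handled\n");
  END;
END Main.
```

If the TRY statement were removed, the calls to SafeDiv would sit in a scope that does not handle DivideByZero, which is the situation the compiler warns about.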

Access to Unsafe Features

No matter how complete a safe interface is, sometimes you need access to the lower levels of the system. Modula-3 allows such access for system programming, for example, to tweak bits and bytes of memory locations. This kind of access comes in handy for integrating legacy systems, connecting to foreign or networked systems, or performing fast graphics operations. Indeed, unsafe features have been used to integrate many standard C APIs, such as X11, Win32, TCP, and OpenGL.

The Compilation System

Programming environments for languages such as C++ keep track of dependencies on a per-file basis. Typically, when a procedure declaration in a header file is changed, all compilation units that #include this header file must be recompiled, even if they are completely unaffected by the change. For large systems, this strategy results in unnecessary compilation that slows development. To avoid such delays, development teams often adopt strict controls as to when a header file can be modified. This, of course, hinders the developers. In a typical development group, such a scheme would require changed header files to be submitted to a master repository late in the day, and the entire system would be rebuilt overnight. Thus, group development becomes a batch process. To avoid the delays of the nightly batch process, most large C/C++ programming shops spread logically related parts of the system across seemingly unrelated header files in order to speed recompilation. The outcome is a system that is harder to understand, change, and maintain.

Reactor's compilation system provides a much more elegant and practical solution. In Reactor, cross-module dependencies are kept on a per-declaration basis. When a declaration (such as a type or a procedure signature) is changed in an interface, only those modules that depend on that particular declaration are recompiled. Typically, only a small number of modules are affected by a changed definition, resulting in much faster rebuild times, even when changing an item in an interface used by many other modules. Programming teams have much greater freedom in submitting, integrating, and testing changes.

Reactor's builder keeps track of dependencies between various modules automatically. Hence, writing makefiles is a breeze: all you have to do is list the modules and interfaces in your program, and the builder takes care of the rest.

Reactor's compiler also offers rapid compilation using a fast back-end that generates native object code directly, instead of generating C code. This, in conjunction with the minimal recompilation system, means that large systems can be quickly rebuilt after a change. No more hour-long waits because someone changed the globals.h file, and no more hassles with maintaining make dependencies.

The Infrastructure

Reactor provides a large set of reusable, thread-friendly, and portable libraries, some of which are highlighted in this section.

IO Framework

The IO framework provides an extensible set of types for doing stream-oriented input and output to screens, files, and network connections, all in a safe, multi-threaded environment. New custom IO stream types can be added easily.
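As a sketch of what stream-oriented means in practice, the hypothetical Greet procedure below is written against the abstract writer type Wr.T and therefore works unchanged with the terminal and with an in-memory buffer. It assumes the standard Wr, Stdio, and TextWr interfaces.

```modula3
MODULE Main;

IMPORT Wr, Stdio, TextWr, Text, Thread;

(* Works with any writer: a file, the terminal, a network connection,
   or an in-memory buffer. *)
PROCEDURE Greet (wr: Wr.T; name: TEXT)
  RAISES {Wr.Failure, Thread.Alerted} =
  BEGIN
    Wr.PutText (wr, "Hello, " & name & "\n");
  END Greet;

VAR
  buf := TextWr.New ();   (* a writer that accumulates a TEXT *)
BEGIN
  TRY
    Greet (Stdio.stdout, "terminal");
    Wr.Flush (Stdio.stdout);

    Greet (buf, "buffer");
    <* ASSERT Text.Equal (TextWr.ToText (buf), "Hello, buffer\n") *>
  EXCEPT
  | Wr.Failure, Thread.Alerted =>
      (* output failed; there is nothing sensible to do in this sketch *)
  END;
END Main.
```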

Simple Persistence ("Pickles")

Reactor provides a set of interfaces for writing and reading object state to and from disks or network streams. This facility, called "Pickles", is extremely simple to use. With no help from the programmer, Pickles preserve the shape (that is, the structure) of arbitrary, complex object graphs during save and load operations. Pickles are designed to be customizable so that applications can store objects in a form that is optimized based on their type. For example, a sparse array can be stored in a compact form on disk by writing a small procedure.
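The sketch below round-trips a small cyclic structure through a pickle held in memory. It assumes the Pickle interface as found in SRC Modula-3's libm3 (Pickle.Write taking a writer and a reference, Pickle.Read returning a REFANY); Reactor's interface may differ in detail. The Node type is hypothetical.

```modula3
MODULE Main;

IMPORT Pickle, TextWr, TextRd, Rd, Wr, Thread, IO;

TYPE
  Node = REF RECORD
    label: TEXT;
    next : Node := NIL;
  END;

VAR
  a  := NEW (Node, label := "a");
  b  := NEW (Node, label := "b");
  wr := TextWr.New ();
BEGIN
  a.next := b;
  b.next := a;   (* a cycle, so the graph is not a simple tree *)
  TRY
    Pickle.Write (wr, a);
    WITH rd   = TextRd.New (TextWr.ToText (wr)),
         copy = NARROW (Pickle.Read (rd), Node) DO
      <* ASSERT copy # a *>                 (* a fresh copy *)
      <* ASSERT copy.next.next = copy *>    (* the cycle is preserved *)
      IO.Put ("restored " & copy.label & " -> " & copy.next.label & "\n");
    END;
  EXCEPT
  | Pickle.Error, Rd.EndOfFile, Rd.Failure, Wr.Failure, Thread.Alerted =>
      IO.Put ("pickling failed\n");
  END;
END Main.
```

No per-type save or load code is written here; the shape of the graph, including the cycle, is reconstructed by the library.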

Lightweight Object Storage ("SmallDB")

SmallDB allows "pickled" objects to be stored in a recoverable fashion. If a crash occurs while the objects are being written to disk, the object state will be restored from the latest consistent snapshot the next time they are used. This kind of recoverable storage is vital in the development of robust servers.

Stable Object Storage

Stable object storage extends the lightweight object storage provided by Pickles and SmallDB to allow for recoverable storage of objects through logging and checkpointing. Updates to objects are logged to stable storage automatically. When the state of an object is restored from disk, the restoration process checks to see if a crash occurred before the entire state of the object was written to disk. If so, the state of the object is recovered from the log of modifications to the object.

High-level OS Interfaces

Reactor provides a set of high-level object-oriented interfaces to the underlying OS facilities such as files, processes, directories, terminals, and keyboards. The interfaces to these operating system functions are identical whether you are running on Windows or Unix. You can have the same piece of code on Unix and Win32 that uses operating system services without messy #ifdef statements.
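For instance, listing the current directory might look like the sketch below. It assumes the FS, Pathname, and OSError interfaces as found in SRC Modula-3's libm3 (FS.Iterate, the iterator's next and close methods, Pathname.Current, Pathname.Join); we have not confirmed that Reactor ships them under exactly these names.

```modula3
MODULE Main;

IMPORT FS, Pathname, OSError, IO;

VAR
  iter: FS.Iterator;
  name: TEXT;
BEGIN
  TRY
    iter := FS.Iterate (Pathname.Current);
    TRY
      WHILE iter.next (name) DO
        (* Join inserts the platform's own directory separator. *)
        IO.Put (Pathname.Join (Pathname.Current, name, NIL) & "\n");
      END;
    FINALLY
      iter.close ();
    END;
  EXCEPT
  | OSError.E => IO.Put ("cannot read the current directory\n");
  END;
END Main.
```

The program contains no platform test: path syntax and directory access are hidden behind the interfaces.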

Generic Collection Interfaces

A standard set of generic collections, such as lists, tables, sets, sequences, sorted lists, and priority queues, is provided with the library.
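The sketch below uses two ready-made instantiations of the generic table and sequence, assuming the TextIntTbl and IntSeq interfaces as found in SRC Modula-3's libm3. The keys and values are made up for illustration.

```modula3
MODULE Main;

IMPORT TextIntTbl, IntSeq, IO, Fmt;

VAR
  ages: TextIntTbl.T := NEW (TextIntTbl.Default).init ();
  seq                := NEW (IntSeq.T).init ();
  age : INTEGER;
BEGIN
  (* put returns TRUE if the key was already present; ignored here *)
  EVAL ages.put ("alice", 31);
  EVAL ages.put ("bob", 27);
  IF ages.get ("alice", age) THEN
    IO.Put ("alice is " & Fmt.Int (age) & "\n");
  END;

  seq.addhi (10);   (* append at the high end *)
  seq.addhi (20);
  seq.addlo (5);    (* prepend at the low end *)
  <* ASSERT seq.size () = 3 AND seq.get (0) = 5 *>
END Main.
```

Because the collections are generics, a table or sequence over a new element type is obtained by instantiating the same generic interface and module rather than by copying code.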

Safe & Portable Relational Database Interfaces

Reactor includes an interface for accessing relational databases. The default implementation is based on the ODBC API on Win32 and Unix to access most name-brand databases easily. The back-end of the relational database interface can easily be extended to connect to a database via its native API.

Safe & Portable TCP Interface

With Reactor's TCP interfaces, you can write programs using sockets. The TCP interfaces are safe, so you, as the programmer, do not need to worry about the interaction of multi-threading or memory management with the lower-level TCP libraries. Reactor's TCP interfaces are also portable, so the same code works whether you use Unix sockets or Win32's Winsock. Hence, interaction with Internet or Intranet TCP services is straightforward. For example, a simple multi-threaded web server written using the safe TCP interface takes less than a page of code.
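To make the "less than a page" claim concrete, here is a rough sketch of such a server. It assumes the TCP, IP, and ConnRW interfaces as found in SRC Modula-3 (TCP.NewConnector, TCP.Accept, TCP.Close, ConnRW.NewRd, ConnRW.NewWr); Reactor's interfaces may differ. The port number and the fixed reply are hypothetical, and the request itself is ignored.

```modula3
MODULE Main;

IMPORT TCP, IP, ConnRW, Rd, Wr, Thread;

TYPE
  Handler = Thread.Closure OBJECT
    conn: TCP.T;
  OVERRIDES
    apply := Serve;
  END;

(* Serve one connection: read the request line, send a fixed reply. *)
PROCEDURE Serve (self: Handler): REFANY =
  VAR
    rd := ConnRW.NewRd (self.conn);
    wr := ConnRW.NewWr (self.conn);
  BEGIN
    TRY
      EVAL Rd.GetLine (rd);   (* request line; ignored in this sketch *)
      Wr.PutText (wr, "HTTP/1.0 200 OK\r\n"
                        & "Content-Type: text/plain\r\n\r\n"
                        & "Hello from Modula-3\r\n");
      Wr.Flush (wr);
    EXCEPT
    | Rd.EndOfFile, Rd.Failure, Wr.Failure, Thread.Alerted =>
        (* the client went away; drop the connection *)
    END;
    TCP.Close (self.conn);
    RETURN NIL;
  END Serve;

VAR
  listener: TCP.Connector;
BEGIN
  TRY
    (* 8080 is an arbitrary port chosen for the example. *)
    listener := TCP.NewConnector (IP.Endpoint {IP.NullAddress, 8080});
    LOOP
      (* One thread per connection. *)
      EVAL Thread.Fork (NEW (Handler, conn := TCP.Accept (listener)));
    END;
  EXCEPT
  | IP.Error, Thread.Alerted =>
      (* could not listen or accept; give up *)
  END;
END Main.
```

The same source is intended to build on Unix and Win32, since nothing in it names a platform-specific socket call.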

Safe & Portable Web Server Construction Toolkit

Using Reactor's construction toolkit for HTTP, you can build customized web and proxy servers. Capitalizing on built-in support for concurrency and the portable TCP interface, you can build multi-threaded, dynamic web servers using the same code on Win32 and Unix platforms.

The Tools

In addition to the libraries, Reactor includes several useful utility programs, a few of which are described in this section.

ShowThread

ShowThread provides a graphical display of the current state of each thread running in an application. Thread information can be recorded to a file and replayed for later analysis.

ShowHeap and ShowNew

ShowHeap and ShowNew display heap allocation information to aid the programmer in finding memory usage patterns. This information can be recorded to file and replayed for analysis.

Stubgen

Stubgen reads a Modula-3 object description and generates the code necessary to create distributed instances of the object. Reactor's distributed object system is described in more detail later in this paper.

Stablegen

Stablegen reads an interface containing an object type and generates the code that is necessary to allow instances of that object type to be stored in a recoverable fashion.

Cbind

Modula-3 is a compiled language that can readily access existing C code. The usual approach is to create a Modula-3 interface that is implemented by a C module or library. Cbind is a tool for automatically generating interfaces to existing C libraries and programs. Cbind reads a C header file for an existing system and generates the corresponding Modula-3 interface. The ability to directly call existing libraries makes the integration of existing and legacy C applications much easier than in many other non-C systems.
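A hand-written binding of this kind might look like the sketch below. The interface name CLib is hypothetical, and the output of Cbind itself may be organized differently; the sketch only illustrates the underlying mechanism, the EXTERNAL pragma, together with the Ctypes interface from SRC Modula-3. The two routines are the C library's abs and toupper.

```modula3
(* CLib.i3: Modula-3 declarations for two C library routines. *)
UNSAFE INTERFACE CLib;

FROM Ctypes IMPORT int;

<* EXTERNAL *> PROCEDURE abs (x: int): int;       (* C: int abs(int) *)
<* EXTERNAL *> PROCEDURE toupper (c: int): int;   (* C: int toupper(int) *)

END CLib.

(* Main.m3: a module must be UNSAFE to import an unsafe interface. *)
UNSAFE MODULE Main;

IMPORT CLib, IO, Fmt;

BEGIN
  IO.Put ("abs(-42) = " & Fmt.Int (CLib.abs (-42)) & "\n");
  IO.Put ("toupper('a') = "
            & Fmt.Char (VAL (CLib.toupper (ORD ('a')), CHAR)) & "\n");
END Main.
```

In a larger system the unsafe import would normally be confined to one small wrapper module that exports a safe interface, so that the rest of the program stays safe.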

An Open Infrastructure For Building Distributed Applications

Network IO Framework

The network IO framework provides a set of high-level abstractions for sending and receiving messages across the network. The Message Stream interface is the simplest part to describe. A message stream is simply a type of IO stream that passes multi-byte chunks of data as atomic units.

The network IO framework currently supports the TCP/IP protocol. However, adding support for protocols such as IPX/SPX or NetBEUI is relatively straightforward.

Here is a short example for sending a "Hello World" message to the process listening on the IP port dest_port:

VAR
  conn := TCP.Connect(dest_port);
  msgWr := ConnMsgRW.NewWr(conn);
BEGIN
  Wr.PutText(msgWr, "Hello World\n");
  Wr.Flush(msgWr);
END

Network Objects

The Network Objects system is a robust mechanism for developing distributed applications. Central to the robustness of Network Objects is a sophisticated distributed garbage collector. This garbage collector simplifies one of the most difficult problems faced by developers of distributed applications: global memory management in the presence of machine and communication failures.

Network Objects allows a Modula-3 object to be handed to another process in such a way that the process receiving the object can operate on it as if it were local. The holder of a remote object can freely invoke operations on that object just as if it had created that object locally. Further, it can pass the object to other processes. Thus, the Network Objects system allows the development of not just simple client-server applications, but more general multi-tiered distributed applications.

In CORBA terms, Network Objects provides the equivalent of ORB functionality; however, Network Objects is much more tightly integrated with the Modula-3 language. Most CORBA implementations are layered on top of C++, a language which was not designed to host distributed programming. In contrast, the Modula-3 language designers made the easy and natural design of distributed applications one of their primary goals. Modula-3 programs are much easier to make distributed than their C++ counterparts.

The current implementation of Network Objects is built on the TCP framework described above. In addition, it is designed to make adaptation to specialized network protocols easy. For instance, it is relatively straightforward to add a new transport for Network Objects for CORBA IIOP or DCE RPC.

Putting It All Together: A Network Objects Example

In this section we outline the canonical example of a distributed program, namely that of a remotely accessible bank (ignoring possible communication failures). We first provide the interface to the bank:

INTERFACE Bank;
IMPORT NetObj;

TYPE
  T = NetObj.T OBJECT METHODS
    deposit (acct: AcctNum; amount: REAL) RAISES {BadAmount};
    withdraw (acct: AcctNum; amount: REAL) RAISES {BadAmount, InsufficientFunds};
    get_balance (acct: AcctNum): REAL;
  END;
  
TYPE
  AcctNum = [1..100];
  
EXCEPTION
  BadAmount;
  InsufficientFunds;
  
END Bank.

The Bank interface defines an object type Bank.T which inherits from the type NetObj.T; inheriting from NetObj.T makes an object eligible for distribution. The Bank.T type defines three operations: deposit, withdraw, and get_balance. The deposit and withdraw operations will raise BadAmount if the amount is less than zero. The withdraw operation also raises the InsufficientFunds exception if there aren't sufficient funds in the account to meet the requested withdrawal amount.

Continuing our example, we show how the bank server can be implemented. To save space, we show only the implementation of the deposit operation; the other operations can be implemented in a similar fashion.

MODULE Server;
IMPORT Bank, NetObj;

TYPE
  BankImpl = Bank.T OBJECT
    accounts : ARRAY Bank.AcctNum OF Account;
    lock     : MUTEX;
  OVERRIDES
    deposit     := Deposit;
    withdraw    := Withdraw;   (* not included *)
    get_balance := Balance;    (* not included *)
  END;

  Account = RECORD
    balance : REAL := 0.0;
  END;

PROCEDURE Deposit (self: BankImpl; acct: Bank.AcctNum; amount: REAL)
  RAISES {Bank.BadAmount} =
  BEGIN
    IF amount < 0.0 THEN RAISE Bank.BadAmount; END;
    LOCK self.lock DO
      WITH bal = self.accounts[acct].balance DO
        bal := bal + amount;
      END;
    END;
  END Deposit;

(* The implementations of "Withdraw" and "Balance" would go here. *)

VAR
  bank := NEW (BankImpl, lock := NEW (MUTEX));
BEGIN
  NetObj.Export ("LastNationalBank", bank);

  (* Here we might start other threads to audit or manage the bank.
     Finally, the program must wait until the bank is closed. *)
END Server.

The above example demonstrates that coding a simple server in Reactor is indeed simple. The code resembles quite closely the code that you would write if you were building a non-distributed version of the Bank interface. The primary difference is the need to make the bank object visible via the NetObj.Export operation. This operation binds a name ("LastNationalBank") in the global namespace to a local object (bank), allowing clients running on other machines to access it. The Network Objects run-time creates threads to handle incoming connections as they are needed. When incoming calls arrive for the object named "LastNationalBank", they are dispatched to our implementation.

Next we outline a simple client of the bank. This client receives the object representing the "LastNationalBank", deposits some money in an account via the deposit command and prints the final balance. Despite its simplicity, this client will work properly regardless of whether the bank is located in the same room, down the street, or across the county.

MODULE Client;
IMPORT Bank, NetObj;
IMPORT IO, Fmt;

CONST
  MyBank : TEXT         = "LastNationalBank";
  MyAcct : Bank.AcctNum = 99;

VAR
  bank: Bank.T;
BEGIN
  TRY
    bank := NetObj.Import (MyBank);
    bank.deposit (MyAcct, 125.00);
    WITH balance = bank.get_balance (MyAcct) DO
      IO.Put ("My account balance is " & Fmt.Real (balance) & "\n");
    END;
  EXCEPT
  | NetObj.Error   => IO.Put ("A network error occurred\n");
  | Bank.BadAmount => <* ASSERT FALSE *>
      (* BadAmount will not be raised for positive deposits. *)
  END;
END Client.

A Slightly More Realistic Example

In the example above, the Bank.T object provided operations for modifying accounts. We now extend the example to include separate Bank.Account objects which could implement different interest, penalty or usage policies. Bank.T provides operations for looking up accounts, while Bank.Account provides operations for depositing and withdrawing money. Here is the new Bank interface:

INTERFACE Bank;
IMPORT NetObj;

TYPE
  T = NetObj.T OBJECT METHODS
    findAccount (acct: AcctNum): Account;
  END;

TYPE
  Account = NetObj.T OBJECT METHODS
    deposit (amount: REAL) RAISES {BadAmount};
    withdraw (amount: REAL) RAISES {BadAmount, InsufficientFunds};
    get_balance (): REAL;
  END;

TYPE
  AcctNum = [1..100];

EXCEPTION
  BadAmount;
  InsufficientFunds;

END Bank.

We now sketch the implementation of a bank server that manages both banks and accounts:

MODULE Server;
IMPORT Bank, NetObj;

TYPE
  BankImpl = Bank.T OBJECT
    accounts : ARRAY Bank.AcctNum OF Account;
  OVERRIDES
    findAccount := FindAccount;
  END;

TYPE
  Account = Bank.Account OBJECT
    lock    : MUTEX;
    balance : REAL := 0.0;
  OVERRIDES
    deposit     := Deposit;
    withdraw    := Withdraw;   (* not included *)
    get_balance := Balance;    (* not included *)
  END;

PROCEDURE FindAccount (self: BankImpl; acct: Bank.AcctNum): Bank.Account =
  BEGIN
    RETURN self.accounts[acct];
  END FindAccount;

PROCEDURE Deposit (self: Account; amount: REAL) RAISES {Bank.BadAmount} =
  (* Deposit the money, making sure to serialize access with others
     trying to operate on this account. *)
  BEGIN
    IF amount < 0.0 THEN RAISE Bank.BadAmount; END;
    LOCK self.lock DO
      self.balance := self.balance + amount;
    END;
  END Deposit;

(* The implementations of "Withdraw" and "Balance" would go here. *)

PROCEDURE NewBank () : BankImpl =
  VAR b := NEW (BankImpl);
  BEGIN
    FOR i := FIRST (b.accounts) TO LAST (b.accounts) DO
      b.accounts[i] := NEW (Account, lock := NEW (MUTEX));
    END;
    RETURN b;
  END NewBank;

BEGIN
  NetObj.Export ("LastNationalBank", NewBank ());

  (* Here we might start other threads to audit or manage the bank.
     Finally, the program must wait until the bank is closed. *)
END Server.

In this version, the client gets an account object (of type Bank.Account) from the bank and then makes deposits, withdrawals, and checks balances on that account object. When the client finishes using that account object, it does not need to clean up after itself, unlike what would be required in most client/server or CORBA systems today. Instead, the garbage collector on the client will determine that the account object is no longer being used. Then, it will inform the garbage collector at the server that the account object has one less client. When the garbage collector on the server determines there are no more clients using this account object, it will reclaim the storage used by the Network Objects run-time.
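A client for this version can be sketched as follows. It is our adaptation of the earlier Client module to the new Bank interface, not code from the Reactor distribution, and it makes the same simplifications as the earlier example (for instance, ignoring most communication failures).

```modula3
MODULE Client;
IMPORT Bank, NetObj;
IMPORT IO, Fmt;

CONST
  MyBank : TEXT         = "LastNationalBank";
  MyAcct : Bank.AcctNum = 99;

VAR
  bank   : Bank.T;
  account: Bank.Account;
BEGIN
  TRY
    bank    := NetObj.Import (MyBank);
    account := bank.findAccount (MyAcct);   (* a remote account object *)
    account.deposit (125.00);
    IO.Put ("My account balance is "
              & Fmt.Real (account.get_balance ()) & "\n");
    (* No release call is made: once "account" is no longer referenced,
       the collectors on the client and server cooperate as described
       above. *)
  EXCEPT
  | NetObj.Error   => IO.Put ("A network error occurred\n");
  | Bank.BadAmount => <* ASSERT FALSE *>
      (* BadAmount will not be raised for positive deposits. *)
  END;
END Client.
```

Note that the account object returned by findAccount is used exactly like a local object, even though every method call on it is carried out at the server.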

Distributed garbage collection is key in creating robust client/server programs. Manually managing memory within a single program is hard enough; manual memory management across clients and servers when hardware failures are possible is nearly impossible. Without distributed garbage collection, the application designer would need to invent and follow conventions that allow a client to inform the server when it finishes using the account object. What would happen if the client crashed unexpectedly or failed to make that call? Slowly, the server would accumulate more and more objects that it couldn't free; eventually it would crash. By providing a robust distributed garbage collection protocol, Reactor dramatically eases the development of long-running client-server and distributed applications. Distributed garbage collection also simplifies the development of the middle layers in multi-tiered systems.

Future Directions

Network OLE/Distributed COM

As Network OLE becomes a major force in enterprise connectivity, we are planning an implementation of our distribution architecture on top of Network OLE. The goal of this effort is to allow remote OLE objects to be treated in a uniform fashion with Reactor programming language level objects.

Microsoft IIS & Netscape Communications Server

IIS is the Microsoft architecture for developing extensible Internet information servers (HTTP servers). As this architecture becomes finalized, we intend to provide a framework that allows HTTP server extensions to be written in Modula-3. This will allow extensions to be written in a safe and compiled programming language. We expect our system to perform better and be safer than current web-based solutions. Safety is critical as HTTP servers are expected to run continuously. Dynamically loaded extensions should not be able to corrupt a server's memory. Current solutions (such as Netscape's Communications Server) require a multi-process architecture where the server process keeps extensions in separate processes. Because they are separate, they pose no safety threat and are often killed to reclaim resources.

Advanced Distributed Services: Distributed Transactions

Based on the Network and Stable Objects platform, Critical Mass intends to release a set of portable, distributed services, such as distributed transactions performing two-phase commit and recovery.

Many Other Libraries

The Reactor distribution contains a large number of Modula-3 packages not mentioned here. These libraries also take advantage of Modula-3's state-of-the-art features. Some examples are a multi-threaded windowing toolkit called Trestle, packages that extend Trestle with animation and 3D graphics, and m3tk, a meta-programming library for Modula-3 programs.

The Reactor Architecture: Reactor's robust core is the basis for its tight integration with industry standards on the front-end and the back-end, making Reactor an ideal choice for serious systems architects and programmers.

Conclusions

Reactor is an environment for developing robust and distributed applications that can handle the requirements of today's businesses. With its state-of-the-art web-based user interface, training programmers to use Reactor is straightforward. Reactor greatly aids the development of large distributed programs with features such as distributed memory management, integral support for threads, Win32/Unix portability, and a rich collection of safe, multi-threaded libraries.


Copyright © 1996 Critical Mass, Inc. All Rights Reserved. Reactor is a trademark of Critical Mass, Inc.