Mostly Typed

For Hackers and Other Heretics

Experiences Building an OS in Rust

On Sunday, I’ll be presenting a paper at Programming Languages and Operating Systems (PLOS) on our experiences building Tock in Rust. While we (OS and language builders) often like to think of system and language design as separate problems, they are really not. Instead, language features or constraints often lead to very different system designs.

In other words, taking advantage of language features, like memory- and type-safety, is usually not as simple as rewriting your system in a language that offers those features.

We’ve been experiencing exactly this tension in the year we’ve spent so far designing and building Tock.

Tock is an operating system for microcontrollers that radically departs from the design of existing systems like TinyOS and Contiki. Traditionally, embedded systems were single application devices where the hardware and software were co-designed and their software was rarely updated, if ever. However, newer generations of embedded systems—smart watches, drones, thermostats and even medical devices—are platforms. They are frequently updated and may even allow third-party applications.

Supporting multi-programming on such devices, though, is not as simple as just porting Linux. Embedded systems have extremely constrained resource for which general purpose operating systems were simply not designed:

  • RAM on the order of 16KB-512KB

  • Power consumption less than 1uA

  • No virtual memory

  • Limited or no user interface (so hard crashes are even less tolerable)

Tock targets these new embedded platforms by supporting multi-programming through a combination of language-level isolation (i.e. Rust’s type-system) and a new hardware protection mechanism called MPU available on some new ARM Cortex-M microcontrollers.

We chose Rust, in particular, for several reasons:

  1. Rust is unique in providing type and memory safety without a runtime system (i.e. garbage collector).

  2. Rust explicitly separate between “trusted” and “untrusted” code via the unsafe keyword.

  3. We hope(d) we could co-opt compiler enforcement of ownership for resource management.

  4. Native support for closures was encouraging for replacing state-machines with callback-style code.

However, using these features with a traditional operating system design has proved more difficult than we expected. In particular, ownership has forced us to avoid concurrency in the kernel and include more code than we would have liked in the TCB.

In the end, we have been able to move ahead despite these challenges and are still happy with our choice of language. Our experience underscores the codepedence of language and system design and, in particular, the need to rethink kernel designs that leverage the compile-time safety guarantees of languages like Rust.

More detail is available in the paper, as well as a discussion of a language proposal which could address some of these challenges by exposing concurrency in the type system.