Roadmap

I use the *BSD operating systems as a platform for my personal research: the “integrated” nature of these operating system projects makes them a good fit for exploring many interesting ideas.

The diagram below is a visual representation of the work ahead, as of mid-2022.

This page will be revised occasionally.

[Diagram: Roadmap]

Roadmap Themes Over Time

2005-2008: Observability
(FreeBSD™) I had been trying to figure out what made interpreters “slow” when I came across the paper “Continuous Profiling: Where Have All the Cycles Gone?”, which describes how Digital’s Systems Research Center had used in-CPU hardware counters for system-wide performance measurements. This appeared to be just the kind of tool that I needed, so I wrote and contributed PmcTools (i.e., hwpmc(4), pmcstat(8) and pmccontrol(8)) to FreeBSD; please see my post PmcTools: Motivation and Future Steps.

My initial commit supported counting-mode access to in-CPU hardware performance counters—sufficient for instrumenting interpreters (please see hwpmc(4) for more information on the counting-mode PMCs). Subsequently I added support for profiling the kernel and user processes using sampling, for profiling dynamically loaded objects, and for capturing callchains.

To implement profiling in PmcTools I needed a way to “look inside” ELF objects in order to map the raw machine addresses captured by my hwpmc(4) driver to source code locations. Because there were no BSD-licensed libraries available at that time for manipulating ELF objects, I wrote libelf, a BSD-licensed implementation of the SVR4 ELF(3) API (FreeBSD commit).
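The address-to-symbol step of such a mapping can be sketched as a lookup in a table sorted by start address. This is an illustration only, written in Python rather than C: the symbol table is made up, the names `Symbol` and `resolve` are hypothetical, and it is not how pmcstat(8) or libelf are implemented. A real tool would read the symbols from the ELF symbol table and would need debug line information to reach source locations.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional


class Symbol(NamedTuple):
    start: int  # address of the first byte of the symbol
    size: int   # size of the symbol in bytes
    name: str


# Hypothetical symbol table, kept sorted by start address.
SYMBOLS = [
    Symbol(0x1000, 0x40, "main"),
    Symbol(0x1040, 0x100, "interp_loop"),
    Symbol(0x1200, 0x20, "gc_mark"),
]
STARTS = [s.start for s in SYMBOLS]


def resolve(addr: int) -> Optional[str]:
    """Return 'name+0xoffset' for a sampled address, or None if unmapped."""
    # Find the last symbol whose start address is <= addr.
    i = bisect_right(STARTS, addr) - 1
    if i < 0:
        return None
    sym = SYMBOLS[i]
    if addr >= sym.start + sym.size:
        return None  # the address falls in a gap between symbols
    return f"{sym.name}+{addr - sym.start:#x}"
```

Each lookup is a binary search, so resolving a large batch of sampled addresses stays cheap once the table has been sorted.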

2008-present: Code Sharing
(Elftoolchain) In 2008, my FreeBSD mentee Kai Wang and I started the Elftoolchain project (SVN r1). This project’s goals were to offer liberally licensed libraries and tools for program development, and to enable sharing of toolchain development effort across the open-source BSD operating system projects. I also have a long-term goal to write tools to patch running processes to add PmcTools instrumentation (libmc(3), isa(1)) on the fly.

The Elftoolchain project’s code is known to be present in FreeBSD, NetBSD, OpenBSD, RTEMS and Minix3. Interestingly, Minix3 may well be one of the world’s most widely used OSes, because it is apparently present (embedded) in the ‘management engine’ in recent Intel® CPUs.

The Next Steps: Integration, Refactoring, Enhancements and New Build Tools
(Elftoolchain, NetBSD) The work ahead would fall into the following broad categories:
  1. Improving code quality—improved tests, code reviews, refactoring etc.
  2. Changes to ease integration into BSD base systems: e.g. tooling to automate code imports, making Elftoolchain ELF definitions compatible with the kernel build environment, and so on.
  3. Integration: replacing GNU binutils utilities with their Elftoolchain equivalents.
  4. Improving compatibility with GNU binutils.
  5. New tools and libraries, as described in the next section.

The Work Ahead

Improving Quality

Documentation review
Study the guidelines at the Good Docs Project, and revise the project’s documentation accordingly. This would be an ongoing task.
GNU binutils compatibility review
A periodic check of option and behavior-level compatibility of Elftoolchain’s utilities with their GNU binutils equivalents.
Coverity® Scan
Periodic Coverity® Scan over Elftoolchain code, and fixes for any regressions reported.
Refactoring
Code reviews, refactoring shared code across utilities (e.g. ticket #578), changing the code to work at a higher level of abstraction (#609).
Change Test Framework
The current set of tests was written using the TET test framework from The Open Group™. TET is not BSD-licensed. Porting our tests to ATF or an equivalent BSD-licensed test framework would remove a dependency on non-BSD software (ticket: #270).
elfc(1) rework
elfc(1) is a YAML-based test tool that I wrote to generate ELF objects from textual descriptions. The rework would use a more descriptive notation than YAML in order to describe complex ELF objects.

Enhancements

libelf enhancements
Support compressed sections (ticket #594), new APIs (#591), better MIPS64 support (#559), etc.
-lsymtab
Symbol table handling, a common need for many tools.
-lpeg/peg
PEG parser generator for general parsing needs. Intended to be an easier-to-use alternative to GNU bison(1), but my real motivation is to understand how PEG parsers work.
Configuration/data notation
A notation to describe data, for use with isa(1), elfc(1) etc.
-ldemangler
(Correct) C++ demangling as a standalone library, driven off a formal grammar (ticket #595).
isa(1)
A tool to describe machine instruction sets.
ld(1) refactor, enhance
Use PEG-based parsing instead of bison(1)-generated parsers for parsing linker scripts. This removes a dependency on GNU bison(1). PEG-based parsers are usually easier to read.
as(1)
A machine assembler, an as(1) replacement.
libmc
Machine code parsing.
-lryu (Floating point printing)
Use (or implement afresh) a correct floating point to text converter for displaying floating point quantities (see “Ryū: fast float-to-string conversion”, Ulf Adams, PLDI 2018).
Build in src/
Changes to ease integration of Elftoolchain code into the NetBSD® base system.
Update libelf/libdwarf
Update to a recent Elftoolchain revision.
Kernel Build
Integrating Elftoolchain’s definitions into NetBSD kernel source.
Incorporate Utilities
Add a WITH_ELFTOOLCHAIN (or similar) build knob and incrementally add Elftoolchain components to the NetBSD base system.
Add tests to src/
This is to allow imported Elftoolchain tests to be run along with the other tests present in /usr/tests in NetBSD.
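Returning to the PEG items above (-lpeg/peg and the ld(1) rework): the sketch below shows, under stated assumptions, what hand-written PEG-style parsing looks like. It is not the planned -lpeg API, and the grammar (integer arithmetic with `+`, `-`, `*` and parentheses, no whitespace) is chosen purely for illustration. Each grammar rule becomes one function that returns a `(value, next_position)` pair or `None` on failure, and alternatives are tried in order, which is the defining trait of a PEG.

```python
from typing import Optional, Tuple

# A rule returns (value, next position), or None if it does not match.
Result = Optional[Tuple[int, int]]

DIGITS = "0123456789"


def number(s: str, i: int) -> Result:
    # Number <- [0-9]+
    j = i
    while j < len(s) and s[j] in DIGITS:
        j += 1
    return (int(s[i:j]), j) if j > i else None


def factor(s: str, i: int) -> Result:
    # Factor <- Number / '(' Expr ')'   (ordered choice: Number is tried first)
    r = number(s, i)
    if r is not None:
        return r
    if i < len(s) and s[i] == "(":
        r = expr(s, i + 1)
        if r is not None:
            val, j = r
            if j < len(s) and s[j] == ")":
                return val, j + 1
    return None


def term(s: str, i: int) -> Result:
    # Term <- Factor ('*' Factor)*
    r = factor(s, i)
    if r is None:
        return None
    val, i = r
    while i < len(s) and s[i] == "*":
        r = factor(s, i + 1)
        if r is None:
            break  # the repetition ends; the operator is left unconsumed
        val, i = val * r[0], r[1]
    return val, i


def expr(s: str, i: int) -> Result:
    # Expr <- Term (('+' / '-') Term)*
    r = term(s, i)
    if r is None:
        return None
    val, i = r
    while i < len(s) and s[i] in "+-":
        r = term(s, i + 1)
        if r is None:
            break
        val = val + r[0] if s[i] == "+" else val - r[0]
        i = r[1]
    return val, i


def parse(s: str) -> int:
    """Evaluate s, requiring that the whole input is consumed."""
    r = expr(s, 0)
    if r is None or r[1] != len(s):
        raise ValueError(f"parse error in {s!r}")
    return r[0]
```

Because the functions mirror the grammar rule for rule and no separate lexer is involved, the parser can be read side by side with its grammar. A generator such as the proposed peg tool would produce this kind of code from the grammar text.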

Infrastructure

Change Review
Deployment of a change review tool for pre-commit reviews.
Code Browser
A Trac-like code browser, but one that understands both Trac and Allura markup styles.
Trial builds
A way for changes to be built and tested across all of a project’s OSes prior to commit. Using Buildbot, perhaps.

Build Tool

This is a tool that takes the best ideas of Bazel, Buck and similar build tools and adapts these for (cross-)building a *BSD operating system, while allowing sharing of build cycles across developers.

Please see: BuildAutomation (Elftoolchain Wiki).

Auditable Builds

This is an enhancement to the build tool to allow end-users to cryptographically verify that:

  1. A given binary (or kernel, or dynamically loadable object) was built from a set of vetted sources …
  2. using a toolchain that was built in turn from vetted sources …
  3. and which ran on a similarly vetted kernel while building the binary …
  4. and that every byte of code or data in the final executable or dynamically loaded object can be accounted for.
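As a rough illustration of the first three points, a build could emit a record that binds the digest of a binary to digests of its sources, toolchain and kernel. The sketch below is an assumption-laden toy, not the design of the planned tool: the record layout and the names `make_record`, `record_id` and `verify` are hypothetical, and a real system would also need signatures, reproducible builds, and a way to establish which toolchain and kernel identifiers are trusted. It does not address point 4.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def sources_digest(sources: dict) -> str:
    """Digest over a {path: contents} mapping, independent of iteration order."""
    h = hashlib.sha256()
    for path in sorted(sources):
        h.update(path.encode() + b"\0" + digest(sources[path]).encode() + b"\n")
    return h.hexdigest()


def make_record(sources: dict, toolchain_id: str, kernel_id: str,
                binary: bytes) -> dict:
    """Hypothetical build record: what was built, from what, using what."""
    return {
        "sources": sources_digest(sources),
        "toolchain": toolchain_id,  # record_id() of the toolchain's own build
        "kernel": kernel_id,        # record_id() of the build host's kernel
        "binary": digest(binary),
    }


def record_id(record: dict) -> str:
    """Stable identifier for a record, usable as an input to later records."""
    return digest(json.dumps(record, sort_keys=True).encode())


def verify(record: dict, sources: dict, binary: bytes, trusted_ids: set) -> bool:
    """Check a record against vetted sources and a set of trusted record ids."""
    return (record["sources"] == sources_digest(sources)
            and record["binary"] == digest(binary)
            and record["toolchain"] in trusted_ids
            and record["kernel"] in trusted_ids)
```

Since a toolchain's identifier is itself the digest of a build record, records chain together: trusting one binary reduces to trusting the records it was built from.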

Please get in touch if this sounds interesting!

Future

PmcTools v2
Revise PmcTools, porting it to whichever operating system I would be using by the time I get around to this item. Use libmc (the planned library for machine code parsing/process instrumentation) for writing utilities to instrument running processes, etc.
Profile Guided Layout
Profile guided layout of object code, to improve its cache behavior at runtime.
Profiling Code For Energy Use
(Research project) As mentioned in passing in slide 46 of my 2009 ACM Bangalore presentation, it might be possible to profile code for energy use with a reasonable level of accuracy, even on stock hardware. The applications for such a profiler are varied: for example, it could help schedule workloads with high energy usage at times when energy is cheap, or it could find code ‘hot spots’ (no pun intended) that consume significant energy in embedded devices.

It would be quite a lot of work to make this tool a reality. If this kind of thing interests you, please get in touch.
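For the “Profile Guided Layout” item above, one simple form of the idea is to place the functions that account for most profile samples next to each other, so that hot code shares cache lines and pages. The sketch below assumes a per-function sample count such as a sampling profiler might report; the function name `layout`, the `hot_fraction` parameter and the sample data are hypothetical. Practical schemes usually also weigh call-graph edges, which this toy ignores.

```python
def layout(samples: dict, hot_fraction: float = 0.9):
    """Split functions into (hot, cold) lists, hottest first.

    Functions are added to the hot list, in decreasing order of sample
    count, until the hot list covers hot_fraction of all samples.
    """
    total = sum(samples.values())
    # Sort by descending sample count; break ties by name for determinism.
    ordered = sorted(samples, key=lambda fn: (-samples[fn], fn))
    hot, cold, covered = [], [], 0
    for fn in ordered:
        if total and covered < hot_fraction * total:
            hot.append(fn)
            covered += samples[fn]
        else:
            cold.append(fn)
    return hot, cold
```

A linker would then emit the hot list contiguously, for instance into its own section, and place the cold list elsewhere.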