This project involves writing a parser-generator for Parsing Expression Grammars (PEGs), suitable for incorporation into the BSD base system.

Currently, most BSD operating systems ship with BYacc, a parser-generator for LALR(1) grammars.  PEGs can be easier to write, and are often more readable than LALR grammars.
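As a rough illustration of the readability claim, the sketch below shows a small, hypothetical PEG (in the comment) together with a hand-written C recognizer of the kind a PEG parser-generator might emit. The grammar, the function names (parse_expr, parse_term, parse_number, peg_accepts) and the "return the position after the match, or NULL" convention are all assumptions made for this example; they are not part of the project description or of any existing tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical example grammar (an assumption for illustration only):
 *
 *   Expr   <- Term ('+' Term)*
 *   Term   <- Number / '(' Expr ')'
 *   Number <- [0-9]+
 *
 * Each parse_* function returns a pointer just past the text it
 * matched, or NULL if the rule does not match at that position.
 */

static const char *parse_expr(const char *s);

static const char *
parse_number(const char *s)
{
	if (!isdigit((unsigned char)*s))
		return NULL;
	while (isdigit((unsigned char)*s))
		s++;
	return s;
}

static const char *
parse_term(const char *s)
{
	const char *r;

	/* Ordered choice: try Number first; only if it fails try '(' Expr ')'. */
	if ((r = parse_number(s)) != NULL)
		return r;
	if (*s != '(')
		return NULL;
	if ((r = parse_expr(s + 1)) == NULL || *r != ')')
		return NULL;
	return r + 1;
}

static const char *
parse_expr(const char *s)
{
	const char *r, *next;

	if ((r = parse_term(s)) == NULL)
		return NULL;
	/* Greedy repetition: ('+' Term)* stops at the first failed iteration. */
	while (*r == '+' && (next = parse_term(r + 1)) != NULL)
		r = next;
	return r;
}

/* Returns 1 if the whole input matches Expr, 0 otherwise. */
int
peg_accepts(const char *input)
{
	const char *end;

	assert(input != NULL);
	end = parse_expr(input);
	return end != NULL && *end == '\0';
}
```

Each grammar rule maps onto one function, and the ordered choice operator `/` becomes a plain sequence of attempts, so there are no shift/reduce conflicts to resolve. A real generator would also need to build semantic values and report errors, which this sketch omits.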

The requirement that the parser-generator be suitable for incorporation into the BSD base system means that:

  1. The parser-generator should be usable during the bootstrapping phase of the OS build, for example, during the NetBSD cross-build process.

  2. The parser-generator should be written in a programming language that is available during the bootstrapping phase of the build (a dialect of C would be a safe choice).

  3. The code itself needs to be BSD-licensed.

Additionally, the parser-generator should support good error recovery. See Medeiros, Alvez & Mascarenhas (2020) for recent research in this area.

Resources