A PEG parser generator for the base system
This project involves writing a parser-generator for Parsing Expression Grammars, suitable for incorporation into the BSD base system.
Currently, most BSD operating systems ship with BYacc, a parser-generator for LALR(1) grammars. PEGs can be easier to write and are often more readable than LALR grammars.
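To illustrate how directly a PEG maps onto code, here is a minimal sketch in C of a hand-written parser of the kind a PEG parser-generator might emit. The grammar, the `peg_input` structure and all function names are hypothetical and chosen only for this example. The sketch does no memoization (so it is not a packrat parser) and no overflow checking.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical grammar, in PEG notation:
 *   Expr   <- Term   (('+' / '-') Term)*
 *   Term   <- Factor ('*' Factor)*
 *   Factor <- Number / '(' Expr ')'
 *   Number <- [0-9]+
 */
struct peg_input {
	const char *s;	/* NUL-terminated input */
	size_t pos;	/* current position in s */
};

static bool parse_expr(struct peg_input *in, long *out);

/* Number <- [0-9]+ */
static bool parse_number(struct peg_input *in, long *out)
{
	size_t start = in->pos;
	long v = 0;

	while (isdigit((unsigned char)in->s[in->pos]))
		v = v * 10 + (in->s[in->pos++] - '0');
	if (in->pos == start)
		return false;	/* no digits: fail without consuming */
	*out = v;
	return true;
}

/* Factor <- Number / '(' Expr ')'   (ordered choice) */
static bool parse_factor(struct peg_input *in, long *out)
{
	size_t start = in->pos;

	if (parse_number(in, out))
		return true;	/* first alternative matched */
	if (in->s[in->pos] == '(') {
		in->pos++;
		if (parse_expr(in, out) && in->s[in->pos] == ')') {
			in->pos++;
			return true;
		}
	}
	in->pos = start;	/* backtrack: every alternative failed */
	return false;
}

/* Term <- Factor ('*' Factor)* */
static bool parse_term(struct peg_input *in, long *out)
{
	long rhs;

	if (!parse_factor(in, out))
		return false;
	for (;;) {
		size_t save = in->pos;

		if (in->s[in->pos] != '*')
			return true;
		in->pos++;
		if (!parse_factor(in, &rhs)) {
			in->pos = save;	/* repetition stops here */
			return true;
		}
		*out *= rhs;
	}
}

/* Expr <- Term (('+' / '-') Term)* */
static bool parse_expr(struct peg_input *in, long *out)
{
	long rhs;

	if (!parse_term(in, out))
		return false;
	for (;;) {
		size_t save = in->pos;
		char op = in->s[in->pos];

		if (op != '+' && op != '-')
			return true;
		in->pos++;
		if (!parse_term(in, &rhs)) {
			in->pos = save;	/* repetition stops here */
			return true;
		}
		*out = (op == '+') ? *out + rhs : *out - rhs;
	}
}

/* Start <- Expr !.   (the whole input must be consumed) */
bool peg_eval(const char *s, long *out)
{
	struct peg_input in = { s, 0 };

	return parse_expr(&in, out) && in.s[in.pos] == '\0';
}
```

Each grammar rule becomes one function, and ordered choice becomes "try the first alternative, restore the position, try the next". Calling `peg_eval("(2+3)*4", &v)` returns true and stores 20 in `v`, while `peg_eval("2+", &v)` returns false because the trailing `+` is left unconsumed.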
The requirement that the parser-generator is suitable for incorporation into the BSD base system means that:
- The parser-generator should be usable during the bootstrapping phase of the OS build, for example, during the NetBSD cross-build process.
- The parser-generator should be written in a programming language that is available during the bootstrapping phase of the build (a dialect of C would be a safe choice).
- The code itself needs to be BSD-licensed.
Additionally, the parser-generator should support good error recovery. Please see Medeiros, Alvez & Mascarenhas (2020), listed under Resources, for recent research in this area.
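As a rough illustration of what error recovery means in practice, the following C sketch records the position of each syntax error and then skips ahead to a synchronization point so that parsing can continue. This is a simplified resynchronization scheme written for this example; it is not the algorithm from the cited paper. The grammar, the `rec_parser` structure and all function names are hypothetical, and whitespace is not handled.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical grammar:
 *   Prog <- Stmt* !.
 *   Stmt <- Ident '=' Number ';'
 */
#define MAX_ERRORS 8

struct rec_parser {
	const char *s;			/* NUL-terminated input */
	size_t pos;			/* current position in s */
	size_t nstmts;			/* statements parsed successfully */
	size_t nerrors;			/* syntax errors seen */
	size_t error_pos[MAX_ERRORS];	/* where the first few errors occurred */
};

static bool match_ident(struct rec_parser *p)
{
	size_t start = p->pos;

	while (isalpha((unsigned char)p->s[p->pos]))
		p->pos++;
	return p->pos > start;
}

static bool match_number(struct rec_parser *p)
{
	size_t start = p->pos;

	while (isdigit((unsigned char)p->s[p->pos]))
		p->pos++;
	return p->pos > start;
}

static bool match_char(struct rec_parser *p, char c)
{
	if (p->s[p->pos] != c)
		return false;
	p->pos++;
	return true;
}

/* Stmt <- Ident '=' Number ';'   (on failure, pos stays at the failing point) */
static bool parse_stmt(struct rec_parser *p)
{
	return match_ident(p) && match_char(p, '=') &&
	    match_number(p) && match_char(p, ';');
}

/* Record the failure position, then skip to just past the next ';'. */
static void recover(struct rec_parser *p)
{
	if (p->nerrors < MAX_ERRORS)
		p->error_pos[p->nerrors] = p->pos;
	p->nerrors++;
	while (p->s[p->pos] != '\0' && p->s[p->pos] != ';')
		p->pos++;
	if (p->s[p->pos] == ';')
		p->pos++;
}

/* Prog <- Stmt* !.   Returns the number of syntax errors found. */
size_t parse_prog(const char *s, struct rec_parser *p)
{
	p->s = s;
	p->pos = 0;
	p->nstmts = 0;
	p->nerrors = 0;
	while (p->s[p->pos] != '\0') {
		if (parse_stmt(p))
			p->nstmts++;
		else
			recover(p);
	}
	return p->nerrors;
}
```

For the input `a=1;b=;c=3;`, `parse_prog` reports one error at offset 6 (the missing number after `b=`) and still accepts the statements before and after it, rather than stopping at the first failure.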
Resources
- The Packrat Parsing and Parsing Expression Grammars Page, Bryan Ford. Resources related to PEG parsing.
- Automatic syntax error reporting and recovery in parsing expression grammars, Sérgio Queiroz de Medeiros, Gilney de Azevedo Alvez Junior, Fabio Mascarenhas, Science of Computer Programming, Volume 187, February 2020.