Yacc is a very useful tool. When it was invented, life was easier: it was most often used in projects where a single executable needed a single grammar. Following the best traditions, it was designed to be simple.

1. Problems

Later on, larger applications started to pose different requirements for the parser. A common one was multiple grammars in the same executable. One may say "on UNIX you should just use one process per task for simplicity, and then each process (and executable) will need to know only one language!". That may be true for some tasks, but not for others: for example if the task is to write a converter that needs to read various different file formats to emit a single output (like anytopnm(1)). Of course it could still be a horde of one-to-one converters dispatched by the original executable, but that sounds like unnecessary overhead and complication. Thus the need for reentrant parsers with separate namespaces arose. Yacc successors like byacc or bison have options for this.

Another problem is push vs. pull: the original approach was that the parsing was coordinated by yyparse(). It pulled tokens from the lexer and parsed until the end of the file or the end of the grammar. This worked well when developing awk(1), where the script source to parse is a single, finite file and the process doesn't need to do anything else until it is fully parsed. However, this is not the case these days: many parsers need to parse a stream of data, often infinite, often read from the network. It's also common that parsing needs a timeout. This is not possible if yyparse() has to read until the end (pull), but is easily possible if the process feeds the parser token by token (push); see the sketch further below.

Push parsing also eliminates the fancy options required to generate the bridge code between the parser and the lexer (e.g. the bison bridge in flex). It helps separate things: the lexer only needs to know how to tokenize, the parser only needs to know how to process a sequence of tokens, and they don't need to know anything about each other. This becomes important when the question of context structs comes up: if the process needs to parse multiple streams in parallel, neither the parser nor the lexer may use global variables. They both need to use context structures. Unfortunately the original yacc didn't do this, and it can be achieved only with extra options even with byacc or bison. To make matters worse, a complicated constellation of options and structs needs to be used, because in the pull model the grammar code needs to communicate with the lexer, so it needs to know what context struct to pass on.

A minor annoyance is file naming. In the Good Old Days it made sense that yacc invented the output file name: there was one grammar per project, and it was easy to integrate it in the build system with always the same name. Unfortunately, later on, when the fixed name was not enough, the solution was not to let the user specify the output file names but to invent options and mechanisms to generate them. This resulted in unnecessary complications and suboptimal file names for most projects that use multiple grammars.

2. Solutions with compatibility in mind

The solution offered by byacc and bison is a combination of:

- stay compatible with old .y files at default settings
- provide a large number of settings to enable different extra functionality like "pure parser" (reentrant parser) or push parser.

These extras are incompatible among different yacc successors and they tend to break over time.
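To make the pull vs. push difference from section 1 concrete, here is a minimal sketch in C. The pull half uses the classic yacc/lex globals (yyin, yyparse()); the push half uses made-up names (parser_new(), parser_push(), next_token()) purely for illustration - it is not the byacc or bison API, just the shape of the control flow.

    #include <stdio.h>

    /* Pull model: yyparse() drives the lexer and blocks until the input ends. */
    extern FILE *yyin;    /* global input stream shared with the lexer */
    int yyparse(void);    /* classic yacc entry point */

    int parse_pull(FILE *f)
    {
        yyin = f;
        return yyparse(); /* returns only after the whole input has been read */
    }

    /* Push model: the caller drives the loop and feeds one token at a time.
     * These names are hypothetical and only illustrate the idea. */
    typedef struct parser parser_t;
    parser_t *parser_new(void);
    int parser_push(parser_t *p, int token); /* nonzero = error */
    void parser_free(parser_t *p);
    int next_token(FILE *f);                 /* 0 = end of input */

    int parse_push(FILE *f)
    {
        parser_t *p = parser_new();
        int tok, err;

        /* The caller regains control after every token, so between two
         * iterations it can poll other streams or enforce a timeout. */
        do {
            tok = next_token(f);
            err = parser_push(p, tok);  /* token 0 tells the parser "end of input" */
        } while (tok != 0 && err == 0);

        parser_free(p);
        return err;
    }

The interesting property is the loop: between two parser_push() calls the application is free to do anything else, which is what makes timeouts and multiplexed streams possible.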
My main motivation for forking byacc was that one of my old projects, which had to use a set of fancy functions to get a push parser and context pointers, broke with new versions of bison.

3. Solutions with simplicity in mind

The solution byaccic chose is:

- drop backward compatibility with old .y files and old parser interfaces
- remove most of the options: implement only one API, a reentrant push parser with context pointers.
- remove location handling in favor of %struct (see differences.txt point 6)

Rationale:

- the reentrant push parser is required for the multi-grammar, multi-stream application...
- ... while it is not very hard to use in the simpler cases (single grammar and/or single stream with blocking reads) - it's just a question of a loop, typically 5..10 lines of code (see the sketch below)
- always having context pointers is not much overhead for the simple case either: just create a local variable of the given struct and pass its address on
- upgrading existing simple-case code to byaccic typically means adding 10..20 lines of code and removing some option settings...
- ... but the converted grammars will start to depend on byaccic - which is not a big difference if the grammar used anything beyond the oldest yacc options, because then it most probably already depended on a specific flavor of yacc (typically bison)
- output file names are specified directly on the command line and are not calculated by byaccic

Or in short: throw out the decades of API legacy; do only one, simple API but do it right, so it can be used in all cases without much overhead.
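For the simple case mentioned in the rationale, here is a sketch of what the blocking single-stream loop with a local context struct could look like. All names (struct my_ctx, yy_parser_init(), yy_lex(), yy_parse_push(), yy_parser_free()) are invented for illustration; the real names come from the generated parser.

    #include <stdio.h>

    struct my_ctx {     /* per-parse state instead of globals */
        FILE *in;
        int errors;
    };

    /* Hypothetical generated/handwritten interfaces, for illustration only. */
    void yy_parser_init(void **state, struct my_ctx *ctx);
    int  yy_lex(struct my_ctx *ctx);   /* 0 = end of input */
    int  yy_parse_push(void *state, int tok, struct my_ctx *ctx);
    void yy_parser_free(void *state);

    int parse_file(FILE *f)
    {
        struct my_ctx ctx = { f, 0 };  /* local variable, its address is passed on */
        void *state;
        int tok, err;

        yy_parser_init(&state, &ctx);
        do {                           /* the whole "push" overhead in the     */
            tok = yy_lex(&ctx);        /* simple blocking case: one short loop */
            err = yy_parse_push(state, tok, &ctx);
        } while (tok != 0 && err == 0);
        yy_parser_free(state);

        return err || ctx.errors;
    }

Converting old pull-style code is then mostly a matter of adding such a loop and moving former globals into the context struct.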