Parser And Diagnostics

miniR's parser is not only a grammar file. It is a pipeline:

pest parses source text into grammar pairs
builder code turns those pairs into the AST used by the interpreter
diagnostics code turns raw parser failures into R-facing error messages with source context and suggestions

This split is important because "the parser" often gets blamed for bugs that are actually in evaluation. The parser should answer syntax questions. The interpreter should answer semantic ones.

The Entry Point

src/parser.rs is the public parser boundary.

The core function is:

pub fn parse_program(input: &str) -> Result<ast::Expr, Box<ParseError>>

Its job is deliberately small:

invoke RParser::parse() against the top-level program rule
take the successful parse tree
pass it to build_program()
return a single AST expression for the whole program

That means the parser boundary is stable and easy to call from Session, tests, and tooling.

Grammar

The grammar lives in src/parser/r.pest.

This is where operator precedence and syntax shape are defined. It decides what counts as:

expressions
function definitions
calls and indexing
control-flow forms
assignment operators
special operators and pipes

When a bug is genuinely about precedence or tokenization, this is where it starts.

AST Construction

The parser does not hand the interpreter raw pest pairs.

src/parser/builder.rs converts grammar pairs into AST nodes under src/parser/ast.rs. That builder layer is where miniR turns syntax into a runtime-oriented tree that the evaluator can walk directly.

This is also where a few syntax rewrites happen. For example, help syntax such as:

?foo
methods?show

is lowered into a call to help() rather than being treated as a special runtime form forever.

That is a good example of the parser doing syntax work while leaving runtime behavior to the ordinary evaluator.

Diagnostics

Raw pest errors are not good user-facing error messages.

src/parser/diagnostics.rs converts them into ParseError, which carries:

the formatted message
line and column
the source line
an optional filename
an optional suggestion
byte offsets and span length for richer rendering

With the diagnostics feature enabled, the same structured error can render through miette for a more graphical report. Without that feature, miniR still has a plain-text fallback that keeps the useful context.

Why The Parse Errors Matter

miniR is intentionally trying to have better parse diagnostics than GNU R, not just equivalent ones.

A parse error should answer:

what token or construct was unexpected
where it happened
what the user probably meant

That is why the diagnostics layer tries to classify tokens and generate suggestions instead of dumping the raw grammar failure.

Parse-Only Versus Runtime Work

The parser should only answer syntax questions.

If the failure is about:

lazy evaluation
environments
S3 dispatch
replacement semantics
package loading

the fix almost certainly does not belong in the parser.

That distinction matters because CRAN compatibility work can otherwise drift into syntax hacks for bugs that are really semantic.

Parser Divergences

Some deliberate language divergences do start in the parser.

Examples include:

newline continuation in postfix chains such as x\n(y)
accepting if (...) ...\nelse ...
treating ** as a synonym for ^

Those are documented in the divergences docs because they are language choices, not parser accidents.

Where To Debug Parser Problems

Start in the parser layer when the symptom looks like:

precedence is wrong
a valid file fails before evaluation even begins
a token is classified or highlighted badly
a parse error points to the wrong span
?topic or similar syntax is rewritten incorrectly

The main files are:

src/parser/r.pest
src/parser.rs
src/parser/builder.rs
src/parser/diagnostics.rs

That combination is the real parser, not only the grammar file.