Core Parser Components

This document details the fundamental components that make up the BSharp parser infrastructure.

Public Parser API

Parser Struct

The main entry point for all parsing operations:

// Signatures only; method bodies omitted.
#[derive(Default)]
pub struct Parser;

impl Parser {
    pub fn new() -> Self;
    pub fn parse(&self, input: &str) -> Result<ast::CompilationUnit, String>;
}

Parser exposes a single parse() method that takes the source text and returns either the root ast::CompilationUnit or an error message as a String, hiding the underlying combinator implementation from callers.

Error System

ErrorTree (nom-supreme)

BSharp uses nom-supreme's ErrorTree for rich error diagnostics:

pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;

Key features:

  • Context Stack: Maintains parsing contexts via .context() calls
  • Position Tracking: Built-in span tracking for error locations
  • Rich Diagnostics: Tree structure shows complete parse failure path
  • Integration: Seamless with nom combinators

Error Helpers

Utility functions for enhanced error handling:

Location: src/bsharp_parser/src/helpers/

  • context(): Adds contextual information to parser errors
  • bws(): Whitespace-aware wrapper with error context
  • bdelimited(): Delimited parsing with cut on closing delimiter
  • cut(): Commits to parse branch, preventing misleading backtracking
  • Error recovery mechanisms for common parsing scenarios
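
To illustrate why cut() prevents misleading backtracking, here is a minimal std-only sketch. It does not use nom or the BSharp helpers: ParseErr, tag, alt, and parenthesized are hypothetical stand-ins written for this example. Once the opening delimiter has matched, a missing closing delimiter becomes a fatal error, so an enclosing alternative cannot silently fall through to another branch.

```rust
/// Hypothetical error type: recoverable errors allow backtracking, fatal ones do not.
#[derive(Debug, PartialEq)]
enum ParseErr {
    Recoverable(String),
    Fatal(String),
}

type PResult<'a, O> = Result<(&'a str, O), ParseErr>;

/// Matches the literal `t` at the start of the input.
fn tag<'a>(t: &'static str) -> impl FnMut(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(t) {
        Some(rest) => Ok((rest, &input[..t.len()])),
        None => Err(ParseErr::Recoverable(format!("expected `{t}`"))),
    }
}

/// Commits to the current branch: a recoverable error becomes fatal.
fn cut<'a, O>(
    mut parser: impl FnMut(&'a str) -> PResult<'a, O>,
) -> impl FnMut(&'a str) -> PResult<'a, O> {
    move |input| {
        parser(input).map_err(|e| match e {
            ParseErr::Recoverable(msg) => ParseErr::Fatal(msg),
            fatal => fatal,
        })
    }
}

/// Tries `first`; falls back to `second` only on a recoverable error.
fn alt<'a, O>(
    mut first: impl FnMut(&'a str) -> PResult<'a, O>,
    mut second: impl FnMut(&'a str) -> PResult<'a, O>,
) -> impl FnMut(&'a str) -> PResult<'a, O> {
    move |input| match first(input) {
        Err(ParseErr::Recoverable(_)) => second(input),
        other => other,
    }
}

/// Parses "(x)". After "(" and "x" have matched, a missing ")" is fatal.
fn parenthesized(input: &str) -> PResult<'_, &str> {
    let (rest, _) = tag("(")(input)?;
    let (rest, inner) = tag("x")(rest)?;
    let (rest, _) = cut(tag(")"))(rest)?;
    Ok((rest, inner))
}
```

Without the cut, alt(parenthesized, tag("(x")) would accept the input "(x" through its second branch and hide the missing ")"; with it, the caller sees the error at the point where the parse actually went wrong.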

Pretty Error Formatting

Location: src/bsharp_parser/src/syntax/errors.rs

pub fn format_error_tree(input: &str, error: &ErrorTree<Span<'_>>) -> String;

Produces rustc-like error messages with:

  • Line and column numbers
  • Source code context
  • Caret pointing to error location
  • Context stack showing parse path
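
The four elements listed above can be sketched with the standard library alone. This is not the implementation of format_error_tree: render_error is a hypothetical helper that takes a plain byte offset and a slice of context labels instead of an ErrorTree, and the exact output layout is an assumption.

```rust
/// Renders a message with line/column, the offending source line, and a caret.
/// `offset` is a byte offset into `input`, assumed to lie on a char boundary.
fn render_error(input: &str, offset: usize, message: &str, contexts: &[&str]) -> String {
    let offset = offset.min(input.len());
    let before = &input[..offset];

    // Lines are 1-based: count the newlines that precede the offset.
    let line_no = before.matches('\n').count() + 1;
    // The current line starts just after the previous newline (or at 0).
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    // ...and ends at the next newline (or at the end of the input).
    let line_end = input[offset..].find('\n').map_or(input.len(), |i| offset + i);
    // Columns are 1-based and counted in characters, not bytes.
    let column = input[line_start..offset].chars().count() + 1;
    let source_line = &input[line_start..line_end];

    let mut out = format!("error: {message}\n --> line {line_no}, column {column}\n");
    out.push_str(&format!("  | {source_line}\n"));
    out.push_str(&format!("  | {}^\n", " ".repeat(column - 1)));
    for ctx in contexts {
        out.push_str(&format!("  = while parsing {ctx}\n"));
    }
    out
}
```

The caret is aligned by character count, which matches what a reader sees for ASCII source; tabs and wide characters would need extra handling that this sketch leaves out.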

AST Foundation

CompilationUnit

The root node of every parsed C# file:

pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,
    pub using_directives: Vec<UsingDirective>,
    pub global_using_directives: Vec<GlobalUsingDirective>,
    pub declarations: Vec<TopLevelDeclaration>,
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>,
    pub top_level_statements: Vec<Statement>,
}

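Represents the complete structure of a C# source file. Alongside traditional using directives and declarations, it covers newer C# constructs: global using directives, file-scoped namespaces, and top-level statements.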

TopLevelDeclaration

Enum representing all possible top-level declarations:

pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}
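
A consumer of the AST typically walks the declarations by matching on this enum. The sketch below uses a cut-down copy with only two variants, and the payload structs and their fields (name, declarations) are assumptions made for the example; the real declaration types are not shown in this document.

```rust
// Stand-in payloads: the field names here are hypothetical.
struct ClassDeclaration {
    name: String,
}

struct NamespaceDeclaration {
    name: String,
    declarations: Vec<TopLevelDeclaration>,
}

// Reduced version of the enum, limited to the variants used below.
enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    Class(ClassDeclaration),
}

/// Collects namespace-qualified class names, recursing into nested namespaces.
fn class_names(decls: &[TopLevelDeclaration], prefix: &str, out: &mut Vec<String>) {
    for decl in decls {
        match decl {
            TopLevelDeclaration::Class(class) => {
                out.push(format!("{prefix}{}", class.name));
            }
            TopLevelDeclaration::Namespace(ns) => {
                let nested_prefix = format!("{prefix}{}.", ns.name);
                class_names(&ns.declarations, &nested_prefix, out);
            }
        }
    }
}
```

Because the match is exhaustive, adding a variant to the enum makes the compiler point at every traversal that still needs to handle it.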

Keyword Parsing

Keyword Module Organization

Location: src/bsharp_parser/src/keywords/

Keywords are organized by category in dedicated modules for maintainability and consistency:

src/bsharp_parser/src/keywords/
├── mod.rs                      # Keyword infrastructure
├── access_keywords.rs          # public, private, protected, internal
├── accessor_keywords.rs        # get, set, init, add, remove
├── type_keywords.rs            # class, struct, interface, enum, record
├── modifier_keywords.rs        # static, abstract, virtual, sealed
├── flow_control_keywords.rs    # if, else, switch, case, default
├── iteration_keywords.rs       # for, foreach, while, do
├── expression_keywords.rs      # new, this, base, typeof, sizeof
├── linq_query_keywords.rs      # from, where, select, orderby
└── ...

Keyword Parsing Strategy

Word Boundary Enforcement:

pub fn keyword(kw: &'static str) -> impl Fn(&str) -> BResult<&str, &str>;

The keyword() helper enforces [A-Za-z0-9_] word boundaries to prevent partial matches:

  • Rejects the keyword "int" when the input is "int32"
  • Ensures "class" doesn't match "classname"
  • Consistent across all keyword parsers
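
The boundary rule can be sketched without nom. The version below is an illustration only: it returns a plain Result with a String error rather than BResult, and is_word_char is a hypothetical helper. It matches the keyword text and then checks that the next character is not in [A-Za-z0-9_].

```rust
/// True for characters that may continue a word: [A-Za-z0-9_].
fn is_word_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Builds a parser that matches `kw` only when no word character follows it.
/// On success it returns (remaining input, matched keyword).
fn keyword(kw: &'static str) -> impl Fn(&str) -> Result<(&str, &str), String> {
    move |input: &str| {
        let rest = input
            .strip_prefix(kw)
            .ok_or_else(|| format!("expected keyword `{kw}`"))?;
        match rest.chars().next() {
            // "int32" starts with "int", but '3' continues the word.
            Some(c) if is_word_char(c) => {
                Err(format!("`{kw}` is only a prefix of a longer word"))
            }
            // End of input or a non-word character: a real keyword.
            _ => Ok((rest, &input[..kw.len()])),
        }
    }
}
```

With this sketch, keyword("int")("int x") succeeds and leaves " x" as the remaining input, while keyword("int")("int32") and keyword("class")("classname") both fail.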

Benefits:

  • Maintainability: Easy to find and update keyword parsers
  • Consistency: Uniform keyword parsing strategy
  • Bug Prevention: Avoids partial match issues
  • Centralization: Single source of truth for keywords

Parser Helpers

Context Management

Functions for maintaining parsing context:

pub fn context<I, O, F>(
    ctx: &'static str,
    parser: F
) -> impl FnMut(I) -> BResult<I, O>;

Wraps a parser with a label that is attached to any error the parser produces, so error messages show which construct was being parsed when the failure occurred.
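
A minimal std-only sketch of the idea follows. It is not the BSharp or nom-supreme implementation: ParseError, PResult, and digit are hypothetical names, and the error is a flat struct with a context stack rather than an ErrorTree. Each wrapper pushes its label as the error propagates outward, so the stack reads innermost first.

```rust
/// Hypothetical error type carrying a stack of context labels, innermost first.
#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    contexts: Vec<&'static str>,
}

type PResult<'a, O> = Result<(&'a str, O), ParseError>;

/// Wraps `parser` so that any error it returns is tagged with `ctx`.
fn context<'a, O, F>(ctx: &'static str, mut parser: F) -> impl FnMut(&'a str) -> PResult<'a, O>
where
    F: FnMut(&'a str) -> PResult<'a, O>,
{
    move |input| {
        parser(input).map_err(|mut e| {
            e.contexts.push(ctx);
            e
        })
    }
}

/// Leaf parser used for the demonstration: a single ASCII digit.
fn digit(input: &str) -> PResult<'_, char> {
    match input.chars().next() {
        Some(c) if c.is_ascii_digit() => Ok((&input[1..], c)),
        _ => Err(ParseError {
            message: "expected digit".to_string(),
            contexts: Vec::new(),
        }),
    }
}
```

Nesting context("class declaration", context("array size", digit)) and feeding it "x" yields an error whose context stack is ["array size", "class declaration"]; successful parses pass through unchanged.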

Parser Composition

Utilities for combining smaller parsers into larger ones:

  • Sequencing parsers with error propagation
  • Optional parsing with fallbacks
  • Alternative parsing with preference ordering
  • Repetition parsing with separators
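
Three of these patterns can be sketched in plain Rust. The combinators below are hypothetical, std-only stand-ins rather than the helpers the parser actually uses, and ident is a simplified identifier parser (it accepts a leading digit, which C# does not). Sequencing relies on the ? operator to propagate errors.

```rust
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Simplified identifier: one or more characters from [A-Za-z0-9_].
fn ident(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    if end == 0 {
        Err("expected identifier".to_string())
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

/// Optional parsing with a fallback: failure consumes nothing and yields `None`.
fn opt<'a, O>(
    mut parser: impl FnMut(&'a str) -> PResult<'a, O>,
) -> impl FnMut(&'a str) -> PResult<'a, Option<O>> {
    move |input| match parser(input) {
        Ok((rest, value)) => Ok((rest, Some(value))),
        Err(_) => Ok((input, None)),
    }
}

/// Repetition with separators: `item (sep item)*`, requiring at least one item.
fn separated_list1<'a, O>(
    sep: char,
    mut item: impl FnMut(&'a str) -> PResult<'a, O>,
) -> impl FnMut(&'a str) -> PResult<'a, Vec<O>> {
    move |input| {
        // Sequencing: `?` propagates the first item's error to the caller.
        let (mut rest, first) = item(input)?;
        let mut items = vec![first];
        while let Some(after_sep) = rest.strip_prefix(sep) {
            match item(after_sep) {
                Ok((next, value)) => {
                    items.push(value);
                    rest = next;
                }
                // Leave a dangling separator unconsumed for the caller to report.
                Err(_) => break,
            }
        }
        Ok((rest, items))
    }
}
```

Leaving a trailing separator unconsumed is a design choice of this sketch: the enclosing parser then fails on the "," itself, which tends to give a more precise error location.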

Whitespace and Comment Handling

Consistent handling of whitespace and comments throughout the parser:

  • Automatic whitespace skipping between tokens
  • Comment preservation for documentation tools
  • Preprocessor directive handling
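
Automatic skipping between tokens can be sketched as a wrapper that strips whitespace and comments before and after an inner parser. This is a std-only illustration under simplifying assumptions: skip_trivia, ws, and semicolon are hypothetical names, comments are discarded rather than preserved, and preprocessor directives are not modeled.

```rust
/// Returns `input` without leading whitespace, `//` comments, or `/* */` comments.
fn skip_trivia(mut input: &str) -> &str {
    loop {
        let trimmed = input.trim_start();
        if let Some(rest) = trimmed.strip_prefix("//") {
            // Line comment: drop everything up to (not including) the newline.
            input = rest.find('\n').map_or("", |i| &rest[i..]);
        } else if let Some(rest) = trimmed.strip_prefix("/*") {
            // Block comment: drop through "*/"; an unterminated one eats the rest.
            input = rest.find("*/").map_or("", |i| &rest[i + 2..]);
        } else {
            return trimmed;
        }
    }
}

/// Wraps `parser` so that surrounding whitespace and comments are skipped.
fn ws<'a, O>(
    mut parser: impl FnMut(&'a str) -> Result<(&'a str, O), String>,
) -> impl FnMut(&'a str) -> Result<(&'a str, O), String> {
    move |input| {
        let (rest, value) = parser(skip_trivia(input))?;
        Ok((skip_trivia(rest), value))
    }
}

/// Token parser used for the demonstration.
fn semicolon(input: &str) -> Result<(&str, char), String> {
    match input.strip_prefix(';') {
        Some(rest) => Ok((rest, ';')),
        None => Err("expected `;`".to_string()),
    }
}
```

Wrapping every token parser this way keeps grammar rules free of explicit whitespace handling.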

Node Structure Standards

Common Traits

All AST nodes implement standard traits:

  • Debug: For debugging and logging
  • PartialEq: For testing and comparison
  • Clone: For AST manipulation
  • Serialize/Deserialize: For JSON export/import
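
As an illustration of what the derived traits provide, the sketch below defines a tiny hypothetical node type (Expr is not part of the BSharp AST shown here) and derives the three standard-library traits. Serialize and Deserialize come from the external serde crate and are left out so the example stays std-only.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Identifier {
    name: String,
}

// Hypothetical expression node used only for this example.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Name(Identifier),
    Add(Box<Expr>, Box<Expr>),
}

/// Builds the tree for `a + b`.
fn sample() -> Expr {
    Expr::Add(
        Box::new(Expr::Name(Identifier { name: "a".to_string() })),
        Box::new(Expr::Name(Identifier { name: "b".to_string() })),
    )
}

/// Returns a modified copy; the original tree is left untouched (Clone).
fn with_right_operand(expr: &Expr, right: Expr) -> Expr {
    match expr.clone() {
        Expr::Add(left, _) => Expr::Add(left, Box::new(right)),
        other => other,
    }
}
```

PartialEq lets a parser test compare the produced tree against an expected tree with assert_eq!, and Debug supplies the readable diff when that comparison fails.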

Node Organization

AST nodes are organized hierarchically:

nodes/
├── declarations/     # Type and member declarations
├── expressions/      # All expression types
├── statements/       # All statement types
├── types/           # Type system representations
└── ...              # Other language constructs

Identifier Handling

Consistent identifier representation throughout the AST:

pub struct Identifier {
    pub name: String,
    // Additional metadata like source location
}

Type System Integration

Type Representation

The parser builds a complete representation of C# types:

  • Primitive types (int, string, bool, etc.)
  • Reference types (classes, interfaces)
  • Value types (structs, enums)
  • Generic types with constraints
  • Array and pointer types
  • Nullable types

Generic Support

Full support for C# generics:

  • Type parameters with constraints
  • Variance annotations (in, out)
  • Generic method declarations
  • Complex constraint combinations

Memory Management

Zero-Copy Parsing

Where possible, the parser avoids unnecessary string allocations:

  • String slices reference original input
  • Minimal cloning during parsing
  • Efficient error reporting without excessive allocation
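
The first point can be shown with a small std-only sketch. Token and tokenize are hypothetical and far simpler than the real parser: each token holds a slice of the original input rather than an owned String, so no text is copied during tokenizing.

```rust
/// A token that borrows its text from the input instead of owning a `String`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Token<'a> {
    text: &'a str,
    /// Byte offset of `text` within the input it was sliced from.
    offset: usize,
}

/// Splits `input` on whitespace without allocating any new strings.
/// (The returned `Vec` is the only allocation.)
fn tokenize(input: &str) -> Vec<Token<'_>> {
    input
        .split_whitespace()
        .map(|text| Token {
            text,
            // Both pointers refer to the same buffer, so their difference
            // is the token's byte offset in the input.
            offset: text.as_ptr() as usize - input.as_ptr() as usize,
        })
        .collect()
}
```

The lifetime parameter on Token ties every token to the input buffer, so the compiler rejects any attempt to use the tokens after the source text has been dropped.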

AST Ownership

Clear ownership semantics for AST nodes:

  • Parent nodes own their children
  • Shared references through navigation traits
  • No circular references in the AST structure

This foundation provides a robust base for parsing complex C# code while maintaining performance and usability.

2025-11-17 15:18:26 • commit: 03a4e25