BSharp C# Parser Documentation

BSharp is a C# parser and analysis toolkit written in Rust. It parses C# source code into an Abstract Syntax Tree (AST), runs analyses over that tree, and reports on code quality and structure.

What is BSharp?

BSharp consists of several key components:

  • Parser: A robust C# parser built using the nom parser combinator library
  • AST: A complete representation of C# language constructs
  • Analysis Framework: Tools for analyzing code structure, dependencies, and quality
  • CLI Tools: Command-line utilities for parsing, visualization, and analysis

Key Features

  • Complete C# Language Support: Supports modern C# features including:

    • Classes, structs, interfaces, records, enums
    • Methods, properties, fields, events, indexers
    • All statement types (if, for, while, switch, try-catch, etc.)
    • Expression parsing with operator precedence
    • Generic types and constraints
    • Attributes and modifiers
    • Preprocessor directives
  • Robust Error Handling: Custom error types with context information for debugging parse failures

  • Query API: Typed, ergonomic traversal of the AST via bsharp_analysis::framework::Query

  • Code Analysis: Built-in analyzers for:

    • Control flow analysis
    • Dependency tracking
    • Code metrics (complexity, maintainability)
    • Type analysis
    • Code quality assessment
  • Extensible Architecture: Modular design allowing easy extension of parsing and analysis capabilities

Architecture Overview

The codebase is organized into several main modules:

src/
├── bsharp_parser/    # Parser crate (expressions, statements, declarations, helpers)
├── bsharp_syntax/    # AST nodes and shared syntax types (re-exported by parser)
├── bsharp_analysis/  # Analysis framework and workspace loader
├── bsharp_cli/       # Command-line interface
└── bsharp_tests/     # External tests for parser/analysis/CLI

Key Components

Parser (src/bsharp_parser/, src/bsharp_syntax/)

  • Modular parser using nom combinators
  • Complete C# language support
  • Rich error diagnostics with ErrorTree
  • Keyword parsing organized by category
  • AST nodes follow PascalCase naming without 'Syntax' suffix

Workspace Loading (src/bsharp_analysis/src/workspace/)

  • Solution file (.sln) parsing
  • Project file (.csproj) parsing with XML
  • Transitive ProjectReference resolution
  • Source file discovery with glob patterns
  • Deterministic project ordering

Analysis Framework (src/bsharp_analysis/src/)

  • Pipeline-based architecture with phases
  • Extensible passes and rules system
  • Metrics collection (complexity, maintainability)
  • Control flow analysis
  • Dependency tracking
  • Code quality assessment

CLI Tools (src/bsharp_cli/)

  • parse - Parse C# and print textual AST tree
  • tree - Generate AST visualization (Mermaid/DOT)
  • analyze - Comprehensive code analysis
  • format - Format C# code using syntax emitters

Getting Started

The easiest way to get started is using the CLI tools:

# Parse a C# file and print textual AST tree
bsharp parse input.cs

# Generate AST visualization
bsharp tree input.cs --output ast.svg

# Analyze a project or solution
bsharp analyze MyProject.csproj --out report.json

# Format a file in-place (or a directory recursively)
bsharp format input.cs --write true

Formatter Quickstart

Use the built-in formatter from the CLI or integrate the Formatter directly.

Quick examples:

# Format a single file in-place
bsharp format Program.cs

# Print formatted output (do not write)
bsharp format Program.cs --write false

# Enable emission tracing to a JSONL file
bsharp format Program.cs --emit-trace --emit-trace-file format_trace.jsonl

Use Cases

BSharp is designed for:

  • Static Analysis Tools: Build custom analyzers for code quality, security, or style
  • Code Transformation: Parse, modify, and regenerate C# code
  • Language Tooling: Create IDE extensions, linters, or formatters
  • Educational Tools: Understand and visualize C# code structure
  • Migration Tools: Analyze legacy code for modernization efforts

This documentation will guide you through all aspects of using and extending BSharp.

2025-11-17 15:18:26 • commit: 03a4e25

Parser Overview

The BSharp parser transforms C# source code into a structured Abstract Syntax Tree (AST). Built using the nom parser combinator library, it provides a robust and extensible foundation for parsing modern C# syntax as part of the BSharp toolkit.

Architecture

The parser follows a modular architecture with clear separation of concerns. It serves as the frontend for tools that consume the AST (analysis, visualization, etc.):

Parser Infrastructure (src/bsharp_syntax/src/)

  • mod.rs: Public API and re-exports
  • ast.rs: Root AST node definitions (CompilationUnit, TopLevelDeclaration)
  • errors.rs: Error formatting utilities (format_error_tree)
  • parser_helpers.rs: Core parsing utilities (context, bws, keyword, etc.)
  • test_helpers.rs: Testing utilities (expect_ok, etc.)
  • nodes/: AST node definitions organized by category

Parser Implementations (src/bsharp_parser/src/)

The parsers are organized by language construct type:

  • expressions/: All expression parsing (literals, operators, method calls, etc.)
  • keywords/: Keyword parsing organized by category
  • types/: Type system parsing (primitives, generics, arrays, etc.)
  • helpers/: Declaration helpers and utilities
  • preprocessor/: Preprocessor directive parsing

AST Node Definitions (src/bsharp_syntax/src/)

Structured node definitions that mirror C# language constructs:

  • declarations/: All declaration node types
  • expressions/: All expression node types
  • statements/: All statement node types
  • types/: Type system node definitions

Parser Design Principles

1. Compositional Design

The parser is built from small, focused parser functions that combine to handle complex language constructs:

// Example: Method declaration combines multiple sub-parsers
fn parse_method_declaration(input: &str) -> BResult<&str, MethodDeclaration> {
    let (input, attributes) = parse_attributes(input.into())?;
    let (input, modifiers) = parse_modifiers(input.into())?;
    let (input, return_type) = parse_type(input.into())?;
    let (input, name) = parse_identifier(input.into())?;
    let (input, parameters) = parse_parameter_list(input.into())?;
    let (input, body) = opt(parse_block_statement)(input.into())?;
    // ... construct and return MethodDeclaration
}
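The same compositional idea can be shown without nom. The following is a minimal, standard-library-only sketch; `PResult`, `ws`, `identifier`, `symbol`, and `method_header` are hypothetical names for illustration and are not part of BSharp.

```rust
// Hypothetical result type: the remaining input plus the parsed value.
type PResult<'a, T> = Option<(&'a str, T)>;

// Skip leading whitespace.
fn ws(input: &str) -> &str {
    input.trim_start()
}

// Parse an ASCII identifier: [A-Za-z_][A-Za-z0-9_]*
fn identifier(input: &str) -> PResult<'_, &str> {
    let input = ws(input);
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    let first = input.chars().next()?;
    if end == 0 || first.is_ascii_digit() {
        return None;
    }
    Some((&input[end..], &input[..end]))
}

// Parse a fixed symbol such as "(" or ")".
fn symbol<'a>(sym: &str, input: &'a str) -> PResult<'a, ()> {
    ws(input).strip_prefix(sym).map(|rest| (rest, ()))
}

// `ReturnType Name()`: built purely by sequencing the small parsers above.
fn method_header(input: &str) -> PResult<'_, (&str, &str)> {
    let (input, return_type) = identifier(input)?;
    let (input, name) = identifier(input)?;
    let (input, ()) = symbol("(", input)?;
    let (input, ()) = symbol(")", input)?;
    Some((input, (return_type, name)))
}
```

Each step threads the remaining input into the next parser, and the `?` operator propagates the first failure, which mirrors the shape of the nom-based example.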

2. Error Recovery and Context

Custom error types provide detailed context about parse failures:

  • Location information (line, column)
  • Expected vs. actual input
  • Contextual error messages
  • Error recovery strategies

3. Extensibility

The modular design allows easy addition of new language features:

  • Add new expression types by extending the Expression enum
  • Implement new statement parsers following established patterns
  • Extend AST navigation traits for new analysis capabilities

Parsing Flow

1. Entry Point

Parsing begins via the public facade in src/bsharp_parser/src/facade.rs:

let parser = bsharp_parser::facade::Parser::new();
let cu = parser.parse(source)?;

2. Compilation Unit Parsing

The parser starts by parsing a CompilationUnit, which represents a complete C# source file:

  • Global attributes (assembly/module level)
  • Using directives
  • Top-level declarations (namespaces, classes, etc.)
  • File-scoped namespaces (C# 10+)
  • Top-level statements (C# 9+)

3. Recursive Descent

The parser uses recursive descent to handle nested structures:

  • Namespaces contain type declarations
  • Types contain member declarations
  • Methods contain statements
  • Statements contain expressions

Key Parser Features

Expression Parsing with Precedence

The expression parser handles operator precedence correctly:

  • Primary expressions (literals, identifiers, parentheses)
  • Unary operators (!, -, +, ++, --, etc.)
  • Binary operators with correct precedence and associativity
  • Conditional expressions (ternary operator)
  • Assignment expressions

Statement Parsing

Comprehensive support for all C# statement types:

  • Control flow: if, switch, for, foreach, while, do-while
  • Jump statements: break, continue, return, throw, goto
  • Exception handling: try-catch-finally
  • Resource management: using, lock
  • Local declarations and assignments

Declaration Parsing

Full support for C# type and member declarations:

  • Types: classes, structs, interfaces, records, enums, delegates
  • Members: methods, properties, fields, events, indexers, operators
  • Modifiers: access modifiers, static, abstract, virtual, override, etc.
  • Generics: type parameters, constraints, variance

Modern C# Features

Support for recent C# language additions:

  • Records (C# 9)
  • File-scoped namespaces (C# 10)
  • Top-level statements (C# 9)
  • Pattern matching enhancements
  • Nullable reference types

Error Handling Strategy

The parser uses a multi-layered error handling approach:

  1. Parse Errors: Detailed information about what went wrong during parsing
  2. Context Propagation: Errors include context about where in the parsing process they occurred
  3. Recovery Mechanisms: Ability to continue parsing after certain types of errors
  4. User-Friendly Messages: Clear, actionable error messages for developers

This design makes the parser robust and helpful for development and debugging. Code generation/compilation is out of scope for now; the parser and analysis crates form the current focus of the toolkit.



Core Parser Components

This document details the fundamental components that make up the BSharp parser infrastructure.

Public Parser API

Parser Struct

The main entry point for all parsing operations:

#[derive(Default)]
pub struct Parser;

impl Parser {
    pub fn new() -> Self
    pub fn parse(&self, input: &str) -> Result<ast::CompilationUnit, String>
}

The Parser provides a clean, simple interface that abstracts away the complexity of the underlying parsing implementation.

Error System

ErrorTree (nom-supreme)

BSharp uses nom-supreme's ErrorTree for rich error diagnostics:

pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;

Key features:

  • Context Stack: Maintains parsing contexts via .context() calls
  • Position Tracking: Built-in span tracking for error locations
  • Rich Diagnostics: Tree structure shows complete parse failure path
  • Integration: Seamless with nom combinators

Error Helpers

Utility functions for enhanced error handling:

Location: src/bsharp_parser/src/helpers/

  • context(): Adds contextual information to parser errors
  • bws(): Whitespace-aware wrapper with error context
  • bdelimited(): Delimited parsing with cut on closing delimiter
  • cut(): Commits to parse branch, preventing misleading backtracking
  • Error recovery mechanisms for common parsing scenarios

Pretty Error Formatting

Location: src/bsharp_parser/src/syntax/errors.rs

pub fn format_error_tree(input: &str, error: &ErrorTree<Span<'_>>) -> String;

Produces rustc-like error messages with:

  • Line and column numbers
  • Source code context
  • Caret pointing to error location
  • Context stack showing parse path
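To illustrate the line, column, and caret parts of such a message, here is a minimal sketch using only the standard library. It is not the implementation of format_error_tree; `line_col` and `render_error` are hypothetical helpers, and the output layout is an assumption chosen for readability.

```rust
// Hypothetical helper: convert a byte offset into a 1-based (line, column) pair.
// `offset` must lie on a character boundary of `input`.
fn line_col(input: &str, offset: usize) -> (usize, usize) {
    let before = &input[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}

// Hypothetical helper: render the message, the offending source line, and a caret.
fn render_error(input: &str, offset: usize, message: &str) -> String {
    let (line, col) = line_col(input, offset);
    let text = input.lines().nth(line - 1).unwrap_or("");
    let pad = " ".repeat(col - 1);
    format!("error: {message}\n --> {line}:{col}\n  | {text}\n  | {pad}^")
}
```

Calling `render_error(src, offset, "expected ';'")` yields a header line, the position, the source line, and a caret placed under the reported column.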

AST Foundation

CompilationUnit

The root node of every parsed C# file:

pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,
    pub using_directives: Vec<UsingDirective>,
    pub global_using_directives: Vec<GlobalUsingDirective>,
    pub declarations: Vec<TopLevelDeclaration>,
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>,
    pub top_level_statements: Vec<Statement>,
}

Represents the complete structure of a C# source file, supporting both traditional and modern C# features.

TopLevelDeclaration

Enum representing all possible top-level declarations:

pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}

Keyword Parsing

Keyword Module Organization

Location: src/bsharp_parser/src/keywords/

Keywords are organized by category in dedicated modules for maintainability and consistency:

src/bsharp_parser/src/keywords/
├── mod.rs                      # Keyword infrastructure
├── access_keywords.rs          # public, private, protected, internal
├── accessor_keywords.rs        # get, set, init, add, remove
├── type_keywords.rs            # class, struct, interface, enum, record
├── modifier_keywords.rs        # static, abstract, virtual, sealed
├── flow_control_keywords.rs    # if, else, switch, case, default
├── iteration_keywords.rs       # for, foreach, while, do
├── expression_keywords.rs      # new, this, base, typeof, sizeof
├── linq_query_keywords.rs      # from, where, select, orderby
└── ...

Keyword Parsing Strategy

Word Boundary Enforcement:

pub fn keyword(kw: &'static str) -> impl Fn(&str) -> BResult<&str, &str>;

The keyword() helper enforces [A-Za-z0-9_] word boundaries to prevent partial matches:

  • Correctly rejects "int" when parsing "int32"
  • Ensures "class" doesn't match "classname"
  • Consistent across all keyword parsers
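The boundary check can be sketched with the standard library alone. The function below is a hypothetical stand-in for the real keyword() helper (it takes the input directly and returns an Option rather than a BResult), shown only to make the rule concrete.

```rust
// Hypothetical stand-in for `keyword()`: match `kw` at the start of `input`,
// but only if the next character is not a word character [A-Za-z0-9_].
// Returns (remaining input, matched keyword) on success.
fn keyword<'a>(kw: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    let rest = input.strip_prefix(kw)?;
    match rest.chars().next() {
        // A trailing word character means `kw` is only the prefix of a longer word.
        Some(c) if c.is_ascii_alphanumeric() || c == '_' => None,
        _ => Some((rest, &input[..kw.len()])),
    }
}
```

With this rule, `keyword("int", "int x")` succeeds while `keyword("int", "int32")` and `keyword("class", "classname")` are rejected.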

Benefits:

  • Maintainability: Easy to find and update keyword parsers
  • Consistency: Uniform keyword parsing strategy
  • Bug Prevention: Avoids partial match issues
  • Centralization: Single source of truth for keywords

Parser Helpers

Context Management

Functions for maintaining parsing context:

pub fn context<I, O, F>(
    ctx: &'static str,
    parser: F
) -> impl FnMut(I) -> BResult<I, O>

Wraps parsers with contextual information that appears in error messages, making debugging much easier.
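As a rough illustration of how a context wrapper builds up a context stack, here is a standard-library-only sketch. `ParseError`, `PResult`, and `open_brace` are hypothetical, and this `context` is a simplified analogue written for the example, not the BSharp or nom-supreme function.

```rust
// Hypothetical error type: a message plus the contexts it passed through.
#[derive(Debug, PartialEq)]
struct ParseError {
    contexts: Vec<&'static str>,
    message: String,
}

type PResult<'a, O> = Result<(&'a str, O), ParseError>;

// Wrap a parser so that any error it returns is tagged with `ctx`.
fn context<'a, O>(
    ctx: &'static str,
    mut parser: impl FnMut(&'a str) -> PResult<'a, O>,
) -> impl FnMut(&'a str) -> PResult<'a, O> {
    move |input| {
        parser(input).map_err(|mut e| {
            e.contexts.push(ctx); // innermost context is pushed first
            e
        })
    }
}

// A tiny leaf parser used for the demonstration.
fn open_brace(input: &str) -> PResult<'_, ()> {
    match input.strip_prefix('{') {
        Some(rest) => Ok((rest, ())),
        None => Err(ParseError { contexts: vec![], message: "expected '{'".into() }),
    }
}
```

Nesting the wrapper, as in `context("method declaration", context("method body", open_brace))`, produces an error whose contexts read from the innermost parser outward.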

Parser Composition

Utilities for combining smaller parsers into larger ones:

  • Sequencing parsers with error propagation
  • Optional parsing with fallbacks
  • Alternative parsing with preference ordering
  • Repetition parsing with separators

Whitespace and Comment Handling

Consistent handling of whitespace and comments throughout the parser:

  • Automatic whitespace skipping between tokens
  • Comment preservation for documentation tools
  • Preprocessor directive handling

Node Structure Standards

Common Traits

All AST nodes implement standard traits:

  • Debug: For debugging and logging
  • PartialEq: For testing and comparison
  • Clone: For AST manipulation
  • Serialize/Deserialize: For JSON export/import

Node Organization

AST nodes are organized hierarchically:

nodes/
├── declarations/     # Type and member declarations
├── expressions/      # All expression types
├── statements/       # All statement types
├── types/           # Type system representations
└── ...              # Other language constructs

Identifier Handling

Consistent identifier representation throughout the AST:

pub struct Identifier {
    pub name: String,
    // Additional metadata like source location
}

Type System Integration

Type Representation

The parser builds a complete representation of C# types:

  • Primitive types (int, string, bool, etc.)
  • Reference types (classes, interfaces)
  • Value types (structs, enums)
  • Generic types with constraints
  • Array and pointer types
  • Nullable types

Generic Support

Full support for C# generics:

  • Type parameters with constraints
  • Variance annotations (in, out)
  • Generic method declarations
  • Complex constraint combinations

Memory Management

Zero-Copy Parsing

Where possible, the parser avoids unnecessary string allocations:

  • String slices reference original input
  • Minimal cloning during parsing
  • Efficient error reporting without excessive allocation

AST Ownership

Clear ownership semantics for AST nodes:

  • Parent nodes own their children
  • Shared references through navigation traits
  • No circular references in the AST structure
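The ownership model can be illustrated with a reduced tree. The `Stmt` enum below is a simplified, hypothetical subset written for this example (it is not the BSharp Statement type); it shows parents owning children through `Vec` and `Box`, with traversal done purely by borrowing.

```rust
// Simplified, hypothetical subset of statement nodes.
enum Stmt {
    Empty,
    Block(Vec<Stmt>),
    If {
        consequence: Box<Stmt>,
        alternative: Option<Box<Stmt>>,
    },
}

// Maximum nesting depth. The tree is only borrowed, so ownership stays with
// the parent nodes, and the absence of cycles guarantees termination.
fn max_depth(stmt: &Stmt) -> usize {
    match stmt {
        Stmt::Empty => 1,
        Stmt::Block(children) => 1 + children.iter().map(max_depth).max().unwrap_or(0),
        Stmt::If { consequence, alternative } => {
            let alt = alternative.as_deref().map_or(0, max_depth);
            1 + max_depth(consequence).max(alt)
        }
    }
}
```

Because every child has exactly one owner, dropping the root frees the whole tree without reference counting.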

This foundation provides a robust base for parsing complex C# code while maintaining performance and usability.


AST Structure

The BSharp AST (Abstract Syntax Tree) provides a complete, structured representation of C# source code. This document explains the organization and relationships between different AST node types.

AST Hierarchy

Root Node: CompilationUnit

Every parsed C# file results in a CompilationUnit, which serves as the root of the AST:

pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,        // [assembly: ...] attributes
    pub using_directives: Vec<UsingDirective>,          // using statements
    pub global_using_directives: Vec<GlobalUsingDirective>, // C# 10+ global using
    pub declarations: Vec<TopLevelDeclaration>,         // namespaces, types
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>, // C# 10+
    pub top_level_statements: Vec<Statement>,           // C# 9+ top-level code
}

This structure supports both traditional C# files and modern features like file-scoped namespaces and top-level statements.

Declaration Hierarchy

Top-Level Declarations

Top-level declarations represent constructs that can appear at the file or namespace level:

pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}

Type Declarations

Each type declaration contains comprehensive information about the type:

ClassDeclaration

pub struct ClassDeclaration {
    pub attributes: Vec<AttributeList>,
    pub modifiers: Vec<Modifier>,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub primary_constructor_parameters: Option<Vec<Parameter>>, // C# 12
    pub base_types: Vec<Type>,
    pub body_declarations: Vec<ClassBodyDeclaration>,
    pub documentation: Option<XmlDocumentationComment>,
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

MethodDeclaration

pub struct MethodDeclaration {
    pub modifiers: Vec<Modifier>,
    pub return_type: Type,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub parameters: Vec<Parameter>,
    pub body: Option<Statement>,                 // None for abstract/interface methods
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

Member Declarations

Class body declarations represent all possible class members:

pub enum ClassBodyDeclaration {
    Method(MethodDeclaration),
    Constructor(ConstructorDeclaration),
    Destructor(DestructorDeclaration),
    Property(PropertyDeclaration),
    Field(FieldDeclaration),
    Event(EventDeclaration),
    Indexer(IndexerDeclaration),
    Operator(OperatorDeclaration),
    NestedClass(ClassDeclaration),
    NestedStruct(StructDeclaration),
    NestedInterface(InterfaceDeclaration),
    NestedEnum(EnumDeclaration),
    NestedDelegate(DelegateDeclaration),
}

Expression Hierarchy

Expression Types

The expression system covers all C# expression types with proper precedence:

pub enum Expression {
    // Primary and names
    Literal(Literal),
    Variable(Identifier),

    // Object and member operations
    New(Box<NewExpression>),
    MemberAccess(Box<MemberAccessExpression>),
    Invocation(Box<InvocationExpression>),
    Indexing(Box<IndexingExpression>),
    Index(Box<IndexExpression>),
    Range(Box<RangeExpression>),

    // Lambda and anonymous methods
    Lambda(Box<LambdaExpression>),
    AnonymousMethod(Box<AnonymousMethodExpression>),

    // Keywords
    This,
    Base,

    // Operators
    Unary { op: UnaryOperator, expr: Box<Expression> },
    Binary { left: Box<Expression>, op: BinaryOperator, right: Box<Expression> },
    PostfixUnary { op: UnaryOperator, expr: Box<Expression> },
    Assignment(Box<AssignmentExpression>),

    // Patterns and type ops
    Pattern(Box<Pattern>),
    IsPattern { expression: Box<Expression>, pattern: Box<Pattern> },
    As { expression: Box<Expression>, target_type: Type },
    Cast { expression: Box<Expression>, target_type: Type },

    // Misc language features
    Conditional(Box<ConditionalExpression>),
    Query(Box<QueryExpression>),
    Await(Box<AwaitExpression>),
    Throw(Box<ThrowExpression>),
    Nameof(Box<NameofExpression>),
    Typeof(Box<TypeofExpression>),
    Sizeof(Box<SizeofExpression>),
    Default(Box<DefaultExpression>),
    StackAlloc(Box<StackAllocExpression>),
    Ref(Box<Expression>),
    Checked(Box<CheckedExpression>),
    Unchecked(Box<UncheckedExpression>),

    // With/collection expressions
    With { target: Box<Expression>, initializers: Vec<WithInitializerEntry> },
    Collection(Vec<CollectionElement>),

    // Composite forms
    AnonymousObject(AnonymousObjectCreationExpression),
    Tuple(TupleExpression),
    SwitchExpression(Box<SwitchExpression>),
}

Key helper structs:

pub struct SwitchExpression {
    pub expression: Expression,
    pub arms: Vec<SwitchExpressionArm>,
}

pub enum WithInitializerEntry {
    Property { name: String, value: Expression },
    Indexer { indices: Vec<Expression>, value: Expression },
}

pub enum CollectionElement {
    Expr(Expression),
    Spread(Expression),
}

Literal Types

Comprehensive support for C# literals:

pub enum Literal {
    Boolean(bool),
    Integer(String),          // Preserves original format
    FloatingPoint(String),    // Preserves original format
    Character(char),
    String(String),
    InterpolatedString(InterpolatedStringLiteral),
    Null,
    Default,
}
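Because Integer stores the literal text as written, a consumer that needs the numeric value has to interpret that text itself. The following sketch shows one way to do so; `integer_value` is a hypothetical helper, not part of BSharp, and it assumes the stored text may carry C# digit separators, a 0x or 0b prefix, and a U/L suffix.

```rust
// Hypothetical helper: interpret the preserved text of a C# integer literal.
// Returns None if the text is not a valid literal that fits in u64.
fn integer_value(text: &str) -> Option<u64> {
    // Remove digit separators such as the underscores in 1_000_000.
    let cleaned: String = text.chars().filter(|c| *c != '_').collect();
    // Remove an optional integer-type suffix (u, l, ul, lu in any case).
    let digits = cleaned.trim_end_matches(|c: char| matches!(c, 'u' | 'U' | 'l' | 'L'));

    if let Some(hex) = digits.strip_prefix("0x").or_else(|| digits.strip_prefix("0X")) {
        u64::from_str_radix(hex, 16).ok()
    } else if let Some(bin) = digits.strip_prefix("0b").or_else(|| digits.strip_prefix("0B")) {
        u64::from_str_radix(bin, 2).ok()
    } else {
        digits.parse::<u64>().ok()
    }
}
```

Keeping the original text in the AST lets formatters and diagnostics reproduce the literal exactly, while helpers like this one derive the value on demand.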

Statement Hierarchy

Statement Types

Complete coverage of C# statement types:

pub enum Statement {
    // Control flow
    If(IfStatement),
    Switch(SwitchStatement),
    For(ForStatement),
    ForEach(ForEachStatement),
    While(WhileStatement),
    DoWhile(DoWhileStatement),

    // Jump statements
    Break(BreakStatement),
    Continue(ContinueStatement),
    Return(ReturnStatement),
    Throw(ThrowStatement),
    Goto(GotoStatement),

    // Exception handling
    Try(TryStatement),

    // Resource management
    Using(UsingStatement),
    Lock(LockStatement),

    // Declarations and expressions
    LocalVariableDeclaration(LocalVariableDeclaration),
    ExpressionStatement(Expression),
    Block(Vec<Statement>),
    Empty,

    // Modern features
    LocalFunction(LocalFunctionStatement),
}

Control Flow Statements

Complex control flow statements contain nested structures:

IfStatement

pub struct IfStatement {
    pub condition: Expression,
    pub consequence: Box<Statement>,
    pub alternative: Option<Box<Statement>>,
}

TryStatement

pub struct TryStatement {
    pub body: Box<Statement>,
    pub catch_clauses: Vec<CatchClause>,
    pub finally_clause: Option<FinallyClause>,
}

Type System

Type Representation

The type system models all C# type constructs:

pub enum Type {
    // Primitive types
    Primitive(PrimitiveType),

    // Named types
    Named { name: Identifier, type_arguments: Vec<Type> },

    // Array types
    Array { element_type: Box<Type>, rank: usize },

    // Pointer types
    Pointer(Box<Type>),

    // Nullable types
    Nullable(Box<Type>),

    // Generic type parameters
    TypeParameter(Identifier),

    // Tuple types
    Tuple(Vec<Type>),
}

Generic Support

Full support for C# generics:

TypeParameter

pub struct TypeParameter {
    pub attributes: Vec<Attribute>,
    pub variance: Option<Variance>,      // in, out
    pub identifier: Identifier,
}

TypeParameterConstraint

pub enum TypeParameterConstraint {
    TypeConstraint { parameter: Identifier, constraint_type: Type },
    ConstructorConstraint(Identifier),    // new()
    ClassConstraint(Identifier),          // class
    StructConstraint(Identifier),         // struct
    UnmanagedConstraint(Identifier),      // unmanaged
}

AST Metadata

Attributes

Comprehensive attribute support:

pub struct Attribute {
    pub name: Identifier,
    pub arguments: Vec<AttributeArgument>,
}

pub enum AttributeArgument {
    Positional(Expression),
    Named { name: Identifier, value: Expression },
}

Modifiers

All C# modifiers are represented:

pub enum Modifier {
    // Access modifiers
    Public, Private, Protected, Internal, ProtectedInternal, PrivateProtected,

    // Other modifiers
    Static, Abstract, Virtual, Override, Sealed, New,
    Async, Unsafe, Volatile, Readonly, Const,
    Partial, Extern,
}

The AST maintains clear parent-child relationships while providing navigation capabilities through traits:

  • Ownership: Parent nodes own their children
  • Navigation: Traits provide methods to traverse and search the AST
  • Context: Nodes can access their containing context when needed

This structure provides a complete, navigable representation of C# code that supports both analysis and transformation scenarios.


Error Handling

BSharp implements a comprehensive error handling system that provides detailed context information for debugging parse failures.

Error Types

The parser uses ErrorTree from nom-supreme for structured error information:

pub type BResult<I, O> = nom::IResult<I, O, ErrorTree<I>>;

ErrorTree Structure

The ErrorTree type provides:

  • Context Stack: Hierarchical parsing context via .context() calls
  • Location: Span tracking for error positions
  • Error Tree: Complete parse failure path
  • Rich Diagnostics: Detailed error information for debugging

Error Recovery

The parser implements several error recovery strategies:

1. Malformed Syntax Recovery

When encountering malformed syntax, the parser attempts to skip to recovery points:

  • Semicolons (;)
  • Closing braces (})
  • End of input

1.a Declaration Error Recovery (Type Member Top-Level)

For type declarations (classes, structs, records, interfaces), malformed members are recovered using a lightweight, scope-aware helper:

  • Helper: skip_to_member_boundary_top_level()
  • Location: src/bsharp_parser/src/expressions/declarations/type_declaration_helpers.rs

Contract:

  • Only use from within a type body when a member parser fails.
  • Stops at the next safe boundary at top level of the current type:
    • Consumes a top-level ; and returns the slice after it.
    • Or stops at a top-level } without consuming it (so the caller can close the current body cleanly).
    • Returns an empty slice at EOF.
  • Depth-tracks (), [], {}, and a heuristic <> to avoid stopping inside expressions, attribute arguments, or generic argument lists.
  • Ignores boundary and bracket characters (;, }, and the tracked delimiters) that appear inside strings, character literals, and comments.

Limitations:

  • Angle-bracket tracking is heuristic and does not fully disambiguate generics from shift operators.
  • Verbatim/interpolated strings are not fully lexed here; this helper is intended for robust, not perfect, recovery.

Usage example (simplified):

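// Simplified member loop: `members` collects parsed members, `cur` is the remaining input.
loop {
    match member_parser(cur) {
        Ok((rest, member)) => { members.push(member); cur = rest; }
        Err(_) => {
            // Skip the malformed member; stop at EOF or when no progress is possible.
            let next = skip_to_member_boundary_top_level(cur);
            if next.is_empty() || next == cur { break; }
            cur = next;
        }
    }
}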
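To make the contract above concrete, here is a reduced sketch of a boundary skipper using only the standard library. `skip_to_boundary_sketch` is a hypothetical function written for this example and is not the BSharp helper: it tracks (), [], and {} depth and skips ordinary string literals, character literals, and comments, but it omits the heuristic <> tracking and does not handle verbatim or interpolated strings.

```rust
// Hypothetical, reduced version of a scope-aware boundary skipper.
// Consumes a depth-0 `;`, stops before a depth-0 `}`, returns "" at end of input.
fn skip_to_boundary_sketch(input: &str) -> &str {
    let bytes = input.as_bytes();
    let mut depth = 0usize;
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // Line comment: skip to the end of the line.
            b'/' if bytes.get(i + 1) == Some(&b'/') => {
                while i < bytes.len() && bytes[i] != b'\n' {
                    i += 1;
                }
                continue;
            }
            // Block comment: skip past the closing */ (or to end of input).
            b'/' if bytes.get(i + 1) == Some(&b'*') => {
                i += 2;
                while i < bytes.len() && !(bytes[i] == b'*' && bytes.get(i + 1) == Some(&b'/')) {
                    i += 1;
                }
                i = (i + 2).min(bytes.len());
                continue;
            }
            // String or char literal: skip to the matching quote, honoring backslash escapes.
            q @ (b'"' | b'\'') => {
                i += 1;
                while i < bytes.len() && bytes[i] != q {
                    if bytes[i] == b'\\' {
                        i += 1;
                    }
                    i += 1;
                }
            }
            b'(' | b'[' | b'{' => depth += 1,
            b')' | b']' => depth = depth.saturating_sub(1),
            b'}' if depth == 0 => return &input[i..], // leave `}` for the caller
            b'}' => depth -= 1,
            b';' if depth == 0 => return &input[i + 1..], // consume the `;`
            _ => {}
        }
        i += 1;
    }
    ""
}
```

Working on bytes is safe here because every delimiter of interest is ASCII, so the returned slices always start on a character boundary.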

1.b Namespace Body: Using-Directives Before Members

Inside a block-scoped namespace body, using directives are accepted before type and nested-namespace members.

  • Implementation: parse_namespace_declaration() scans for using immediately after the opening { and collects all consecutive directives before parsing members.
  • This ensures inputs like the following are parsed deterministically without interleaving usings with members:
namespace Outer {
    using System;
    namespace Inner {
        using System.Collections;
        class MyClass {}
    }
}

Contract and limitations:

  • Only leading using directives at the current namespace body level are collected.
  • Interleaving using directives among members is not supported yet (matches common style and avoids ambiguous recovery).
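The "collect leading directives, then parse members" behavior can be sketched as follows. `split_leading_usings` is a hypothetical, text-level helper for illustration only; the real parser works on parsed using directives rather than raw strings.

```rust
// Hypothetical sketch: split a namespace body into its leading `using`
// directives (as raw text) and the remaining member text.
fn split_leading_usings(body: &str) -> (Vec<&str>, &str) {
    let mut usings = Vec::new();
    let mut rest = body.trim_start();
    while let Some(after_kw) = rest.strip_prefix("using ") {
        // A directive ends at the next `;`; stop collecting if it is missing.
        let end = match after_kw.find(';') {
            Some(end) => end,
            None => break,
        };
        usings.push(after_kw[..end].trim());
        rest = after_kw[end + 1..].trim_start();
    }
    (usings, rest)
}
```

Collection stops at the first non-using item, so a using directive that appears after a member is left in the remaining text, matching the stated limitation.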

1.c File-Scoped Namespace

When parsing a file-scoped namespace, the parser also skips preprocessor directives following the namespace line before parsing members, mirroring the block-scoped behavior.

Preprocessor Directives and Trivia

Preprocessor directives (e.g., #pragma, #line) are treated as structured trivia, not AST declarations:

  • Parser entrypoints (e.g., parse_csharp_source()) skip directive lines anywhere they can appear at the compilation-unit level.
  • parse_preprocessor_directive() consumes the entire directive line including an optional trailing newline.
  • Current status: directives inside type and namespace bodies are planned to be skipped similarly; tests are tracked and temporarily ignored until this is integrated.

Example:

#pragma warning disable CS0168
namespace N {
    // class and members...
}

The directive is skipped and not present as a namespace member.
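A minimal sketch of the line-consuming behavior, using only the standard library, might look like this. `skip_directive_line` and `skip_leading_directives` are hypothetical names; they approximate what is described for parse_preprocessor_directive() but are not its implementation.

```rust
// Hypothetical sketch: if `input` begins with a directive line, return the
// text after that line (including its optional trailing newline).
fn skip_directive_line(input: &str) -> Option<&str> {
    // A directive may be indented, but `#` must be the first non-blank character.
    let trimmed = input.trim_start_matches(|c: char| c == ' ' || c == '\t');
    if !trimmed.starts_with('#') {
        return None;
    }
    match trimmed.find('\n') {
        Some(nl) => Some(&trimmed[nl + 1..]),
        None => Some(""), // directive on the last line, no trailing newline
    }
}

// Skip any number of consecutive directive lines.
fn skip_leading_directives(mut input: &str) -> &str {
    while let Some(rest) = skip_directive_line(input) {
        input = rest;
    }
    input
}
```

Applied to the example above, the #pragma line is consumed and parsing resumes at the namespace declaration.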

2. Context-Aware Errors

Errors include contextual information about the parsing context:

context("method declaration", parse_method_body)(input.into())

This provides clear error messages like "expected method body in method declaration context".

Helper Location: src/bsharp_parser/src/helpers/

3. Graceful Degradation

The parser continues parsing even after encountering errors, collecting multiple errors to provide comprehensive feedback.

Error Reporting

Errors are reported with:

  • Line and column numbers
  • Surrounding context
  • Suggestions for fixes
  • Parser state information

Common Error Scenarios

Syntax Errors

  • Missing semicolons
  • Unmatched braces
  • Invalid identifiers

Type Errors

  • Unknown type references
  • Generic constraint violations
  • Invalid type parameter usage

Declaration Errors

  • Conflicting modifiers
  • Missing required elements
  • Invalid access levels

Debugging Tips

  1. Use verbose error output to get detailed parser state
  2. Check recovery points when errors cascade
  3. Validate input syntax with simpler test cases first
  4. Use parser context to understand where parsing failed

Wrapper Expression Variants

For clarity, several operations are modeled as distinct expression variants in the AST:

  • New(NewExpression) for object creation
  • MemberAccess(MemberAccessExpression) for obj.Member
  • Invocation(InvocationExpression) for calls expr(args)
  • Indexing(IndexingExpression) and Index(IndexExpression)
  • Range(RangeExpression) for start..end
  • With { target, initializers } for record-like with-expressions
  • Collection(Vec<CollectionElement>) for collection expressions

Expression Parsing

BSharp implements a complete expression parser that handles all C# expression types with proper operator precedence and associativity.

Expression Hierarchy

The expression parser follows C#'s operator precedence rules:

  1. Primary Expressions (x, x.y, x[y], x(), etc.)
  2. Unary Expressions (+x, -x, !x, ~x, ++x, --x)
  3. Multiplicative (*, /, %)
  4. Additive (+, -)
  5. Shift (<<, >>)
  6. Relational (<, >, <=, >=, is, as)
  7. Equality (==, !=)
  8. Logical AND (&)
  9. Logical XOR (^)
  10. Logical OR (|)
  11. Conditional AND (&&)
  12. Conditional OR (||)
  13. Null Coalescing (??)
  14. Conditional (?:)
  15. Assignment (=, +=, -=, etc.)
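
The precedence levels above can be expressed as a binding-power table driven by a precedence-climbing loop. The following is a minimal, self-contained sketch and not BSharp's actual implementation: `binding_power`, `parse_expr`, and `parenthesize` are hypothetical names, tokens are assumed to be separated by whitespace, and only the left-associative binary levels 3 to 12 are covered.

```rust
/// Binding power for a subset of binary operators; larger binds tighter.
fn binding_power(op: &str) -> Option<u8> {
    Some(match op {
        "*" | "/" | "%" => 10,
        "+" | "-" => 9,
        "<<" | ">>" => 8,
        "<" | ">" | "<=" | ">=" => 7,
        "==" | "!=" => 6,
        "&" => 5,
        "^" => 4,
        "|" => 3,
        "&&" => 2,
        "||" => 1,
        _ => return None,
    })
}

/// Parses tokens starting at `pos`, consuming only operators whose
/// binding power is at least `min_bp`.
fn parse_expr(tokens: &[&str], pos: &mut usize, min_bp: u8) -> Option<String> {
    let first = *tokens.get(*pos)?;
    if binding_power(first).is_some() {
        return None; // an operator where an operand was expected
    }
    *pos += 1;
    let mut lhs = first.to_string();
    while let Some(&op) = tokens.get(*pos) {
        let bp = match binding_power(op) {
            Some(bp) if bp >= min_bp => bp,
            _ => break,
        };
        *pos += 1;
        // `bp + 1` makes operators of equal precedence left-associative.
        let rhs = parse_expr(tokens, pos, bp + 1)?;
        lhs = format!("({} {} {})", lhs, op, rhs);
    }
    Some(lhs)
}

/// Returns the fully parenthesized form of `source`, or `None` on error.
pub fn parenthesize(source: &str) -> Option<String> {
    let tokens: Vec<&str> = source.split_whitespace().collect();
    let mut pos = 0;
    let expr = parse_expr(&tokens, &mut pos, 0)?;
    if pos == tokens.len() {
        Some(expr)
    } else {
        None
    }
}
```

Printing the parenthesized form makes the grouping visible: `a + b * c` groups the multiplication first, while `a - b - c` groups to the left.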

Expression Types

Primary Expressions

Literals

  • Numeric: 42, 3.14, 0x1A
  • String: "hello", @"verbatim", $"interpolated {value}"
  • Character: 'a', '\n'
  • Boolean: true, false
  • Null: null

Identifiers and Member Access

variable          // Simple identifier
obj.property      // Member access
obj.method()      // Method invocation
obj[index]        // Indexer access

Note: In the AST, simple identifiers are represented by the Expression::Variable(Identifier) variant. Member access, invocation, and indexing are represented by dedicated wrapper variants (MemberAccess, Invocation, Indexing).
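
To show how these wrapper variants nest, here is a simplified stand-in for the AST. The `Expr` enum below is an assumption made for illustration: its field names and payloads are hypothetical and much smaller than the real node types, and only the variant names are taken from the note above.

```rust
/// Simplified stand-in for the expression AST (field names are hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Variable(String),
    MemberAccess { target: Box<Expr>, member: String },
    Invocation { callee: Box<Expr>, arguments: Vec<Expr> },
    Indexing { target: Box<Expr>, index: Box<Expr> },
}

/// Renders an expression back to C#-like source text.
pub fn render_expr(expr: &Expr) -> String {
    match expr {
        Expr::Variable(name) => name.clone(),
        Expr::MemberAccess { target, member } => {
            format!("{}.{}", render_expr(target), member)
        }
        Expr::Invocation { callee, arguments } => {
            let args: Vec<String> = arguments.iter().map(render_expr).collect();
            format!("{}({})", render_expr(callee), args.join(", "))
        }
        Expr::Indexing { target, index } => {
            format!("{}[{}]", render_expr(target), render_expr(index))
        }
    }
}

/// Builds the tree for `obj.method()[index]`: the outermost node is the
/// operation applied last, and the innermost node is the identifier.
pub fn sample() -> Expr {
    Expr::Indexing {
        target: Box::new(Expr::Invocation {
            callee: Box::new(Expr::MemberAccess {
                target: Box::new(Expr::Variable("obj".to_string())),
                member: "method".to_string(),
            }),
            arguments: Vec::new(),
        }),
        index: Box::new(Expr::Variable("index".to_string())),
    }
}
```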

Object Creation

new MyClass()                    // Constructor
new MyClass { Prop = value }     // Object initializer
new[] { 1, 2, 3 }               // Array initializer
new { Name = "John", Age = 30 }  // Anonymous object

Lambda Expressions

The parser supports various lambda syntax forms:

x => x * 2                      // Single parameter
(x, y) => x + y                 // Multiple parameters
() => DoSomething()             // No parameters
(int x, string y) => Process(x, y)  // Typed parameters
x => { return x * 2; }          // Block body
async x => await ProcessAsync(x) // Async lambda

Query Expressions (LINQ)

Complete LINQ query syntax support:

from item in collection
where item.IsValid
orderby item.Name
select item.Value

Supported clauses:

  • from - Data source
  • where - Filtering
  • select - Projection
  • orderby - Sorting
  • group by - Grouping
  • join - Joining
  • let - Variable introduction
  • into - Query continuation

Pattern Expressions

Modern C# pattern matching:

obj is int value           // Type pattern
obj is not null           // Negation pattern
obj is > 0 and < 100     // Relational patterns
obj is var x             // Var pattern

Switch Expressions

value switch
{
    1 => "one",
    2 => "two",
    _ => "other"
}

Operator Precedence Implementation

The expression entry point returns a Spanned<Expression>. Callers that do not need source spans can unwrap it to the inner node:

use bsharp_parser::parser::expressions::primary_expression_parser::parse_expression_spanned;
use bsharp_syntax::span::Span;

let result = parse_expression_spanned(Span::new(input))
    .map(|(rest, s)| (rest, s.node));

Error Handling in Expressions

The expression parser provides detailed error messages:

  • Operator precedence conflicts
  • Missing operands
  • Invalid syntax combinations
  • Type compatibility issues

Advanced Features

Null-Conditional Operators

obj?.Property        // Null-conditional member access
obj?[index]         // Null-conditional element access
obj?.Method()       // Null-conditional invocation

Throw Expressions

value ?? throw new ArgumentNullException()

Range and Index Expressions

array[^1]           // Index from end
array[1..5]         // Range
array[..^1]         // Range to index from end

With Expressions (Records)

person with { Name = "Updated" }

The expression parser is designed to be extensible, allowing for easy addition of new expression types as the C# language evolves.


See Also

  • Keywords and Tokens – keyword helpers, word boundaries, trivia handling for tokens used in expressions

Statement Parsing

BSharp provides comprehensive parsing for all C# statement types, from simple expressions to complex control flow constructs.

Statement Categories

1. Declaration Statements

Local Variable Declarations

int x = 5;
var name = "John";
const double PI = 3.14159;

Local Function Declarations

void LocalFunction(int parameter)
{
    // function body
}

T GenericLocalFunction<T>(T value) where T : class
{
    return value;
}

2. Expression Statements

Any expression followed by a semicolon:

x++;                    // Increment
Method();              // Method call
obj.Property = value;  // Assignment

3. Control Flow Statements

Conditional Statements

If Statements

if (condition)
    statement;

if (condition)
{
    // block
}
else if (otherCondition)
{
    // else if block
}
else
{
    // else block
}

Switch Statements

switch (expression)
{
    case constant1:
        statements;
        break;
    case constant2 when condition:
        statements;
        goto case constant1;
    default:
        statements;
        break;
}

Loop Statements

For Loops

for (int i = 0; i < 10; i++)
{
    // loop body
}

for (;;)  // infinite loop
{
    // body
}

Foreach Loops

foreach (var item in collection)
{
    // process item
}

foreach ((string key, int value) in dictionary)
{
    // deconstruction in foreach
}

While Loops

while (condition)
{
    // loop body
}

Do-While Loops

do
{
    // loop body
} while (condition);

Jump Statements

break;              // Break from loop/switch
continue;           // Continue to next iteration
return;             // Return from method
return value;       // Return with value
goto label;         // Jump to label
goto case 5;        // Jump to switch case
goto default;       // Jump to switch default

4. Exception Handling

Try-Catch-Finally

try
{
    // risky code
}
catch (SpecificException ex) when (ex.Code == 123)
{
    // specific exception handling
}
catch (Exception ex)
{
    // general exception handling
}
finally
{
    // cleanup code
}

Throw Statements

throw;                           // Rethrow current exception
throw new InvalidOperationException();
throw new CustomException("message");

5. Resource Management

Using Statements

using (var resource = new DisposableResource())
{
    // use resource
}

using var resource = new DisposableResource();
// resource disposed at end of scope

Lock Statements

lock (syncObject)
{
    // synchronized code
}

Fixed Statements

unsafe
{
    fixed (byte* ptr = array)
    {
        // work with fixed pointer
    }
}

6. Special Statements

Yield Statements

yield return value;     // Return value in iterator
yield break;           // End iterator

Checked/Unchecked Statements

checked
{
    // arithmetic overflow checking enabled
}

unchecked
{
    // arithmetic overflow checking disabled
}

Unsafe Statements

unsafe
{
    // unsafe code block
}

Statement Parsing Implementation

Use the spanned entrypoint and unwrap when spans are not needed:

use bsharp_parser::parser::statement_parser::parse_statement_ws_spanned;
use bsharp_syntax::span::Span;

let result = parse_statement_ws_spanned(Span::new(input))
    .map(|(rest, s)| (rest, s.node));

Block Statements

Block statements group multiple statements:

{
    int x = 5;
    Console.WriteLine(x);
    if (x > 0)
    {
        Console.WriteLine("Positive");
    }
}

Error Recovery

The statement parser implements robust error recovery:

  1. Statement-level recovery: Skip to next statement boundary (semicolon or brace)
  2. Block-level recovery: Skip to matching brace
  3. Context preservation: Maintain parsing context across errors
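
Statement-level and block-level recovery can be pictured as a scan for the next safe boundary. The sketch below is illustrative only and is not BSharp's recovery code: `skip_to_statement_boundary` is a hypothetical function, it assumes `start` lies on a character boundary, and it ignores braces and semicolons inside string literals and comments, which a real implementation would have to handle.

```rust
/// Returns the byte offset at which parsing can resume after an error at
/// `start`: just past the next `;` at the current nesting level, or just
/// past the `}` that closes a block opened after `start`. The scan stops
/// before a `}` that would close the enclosing block.
pub fn skip_to_statement_boundary(source: &str, start: usize) -> usize {
    let mut depth = 0usize;
    for (i, ch) in source[start..].char_indices() {
        match ch {
            '{' => depth += 1,
            '}' if depth == 0 => return start + i, // enclosing block ends here
            '}' => {
                depth -= 1;
                if depth == 0 {
                    return start + i + 1; // skipped a complete nested block
                }
            }
            ';' if depth == 0 => return start + i + 1,
            _ => {}
        }
    }
    source.len()
}
```

Leaving the enclosing `}` unconsumed lets the surrounding block parser close normally, which keeps a single bad statement from swallowing the rest of the block.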

Statement Attributes

Attributes can be applied to local function statements (C# 9+):

[Obsolete("Use NewMethod instead")]
void OldMethod() { }

[ConditionalAttribute("DEBUG")]
static void DebugMethod() { }

Top-Level Statements

Support for C# 9+ top-level statements:

// Program.cs
using System;

Console.WriteLine("Hello World!");
return 0;

The statement parser is designed to handle the full complexity of C# control flow while providing clear error messages and robust error recovery.


Declaration Parsing

BSharp implements comprehensive parsing for all C# declaration types, from simple variables to complex generic types with constraints.

Declaration Categories

1. Namespace Declarations

Traditional Namespace

namespace MyCompany.MyProject
{
    // namespace members
}

File-Scoped Namespace (C# 10+)

namespace MyCompany.MyProject;

// All following declarations belong to this namespace

Nested Namespaces

namespace Outer
{
    namespace Inner
    {
        // nested namespace content
    }
}

2. Type Declarations

Class Declarations

public class MyClass : BaseClass, IInterface1, IInterface2
{
    // class members
}

public abstract class AbstractClass
{
    public abstract void AbstractMethod();
}

public sealed class SealedClass
{
    // cannot be inherited
}

Interface Declarations

public interface IMyInterface : IBaseInterface
{
    void Method();
    int Property { get; set; }
    event Action SomeEvent;
}

public interface IGeneric<T> where T : class
{
    T GenericMethod<U>(U parameter) where U : struct;
}

Struct Declarations

public struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
    
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public readonly struct ReadOnlyPoint
{
    public readonly int X;
    public readonly int Y;
    
    public ReadOnlyPoint(int x, int y)
    {
        X = x;
        Y = y;
    }
}

Record Declarations

public record Person(string FirstName, string LastName);

public record class Employee(string FirstName, string LastName, string Department)
    : Person(FirstName, LastName);

public record struct Point(int X, int Y);

Enum Declarations

public enum Color
{
    Red,
    Green,
    Blue
}

[Flags]
public enum FileAccess : byte
{
    None = 0,
    Read = 1,
    Write = 2,
    Execute = 4,
    All = Read | Write | Execute
}

Delegate Declarations

public delegate void EventHandler(object sender, EventArgs e);
public delegate T GenericDelegate<T, U>(U parameter) where T : class;

3. Member Declarations

Field Declarations

private int field;
public readonly string ReadOnlyField;
public const double PI = 3.14159;
private static readonly List<string> StaticField = new();

Property Declarations

// Auto-implemented properties
public string Name { get; set; }
public int Age { get; private set; }
public bool IsValid { get; init; }

// Properties with backing fields
private string _description;
public string Description
{
    get => _description;
    set => _description = value?.Trim();
}

// Expression-bodied properties
public string FullName => $"{FirstName} {LastName}";

// Indexer properties
public string this[int index]
{
    get => items[index];
    set => items[index] = value;
}

Method Declarations

public void VoidMethod() { }
public int MethodWithReturnType() => 42;
public static T GenericMethod<T>(T parameter) where T : new() => new T();

// Async methods
public async Task<string> AsyncMethod()
{
    await Task.Delay(1000);
    return "result";
}

// Extension methods
public static class Extensions
{
    public static bool IsEmpty(this string str) => string.IsNullOrEmpty(str);
}

Constructor Declarations

public class MyClass
{
    public MyClass() { }                    // Default constructor
    public MyClass(string name) : this()   // Constructor chaining
    {
        Name = name;
    }
    
    static MyClass()                        // Static constructor
    {
        // Static initialization
    }
}

Destructor Declarations

public class Resource
{
    ~Resource()
    {
        // Cleanup code
    }
}

Note: In the AST, DestructorDeclaration.body is Option<Statement>:

// Some(Block(...)) for `{ ... }`, None for extern (i.e., `;` only)
pub struct DestructorDeclaration {
    pub name: Identifier,
    pub body: Option<Statement>,
}

Event Declarations

public event Action<string> SomethingHappened;

public event EventHandler<CustomEventArgs> CustomEvent
{
    add { customEvent += value; }
    remove { customEvent -= value; }
}

Operator Declarations

public static Point operator +(Point a, Point b)
{
    return new Point(a.X + b.X, a.Y + b.Y);
}

public static implicit operator string(Point p)
{
    return $"({p.X}, {p.Y})";
}

4. Generic Constraints

Type Parameter Constraints

public class Container<T> where T : class, IDisposable, new()
{
    // T must be a reference type, implement IDisposable, and have a parameterless constructor
}

public void Method<T, U>()
    where T : class
    where U : struct, IComparable<U>
{
    // Multiple constraint clauses
}

AST mapping for constraints:

// On type declarations (class/struct/interface/record)
pub struct ClassDeclaration {
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

// On methods
pub struct MethodDeclaration {
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

5. Modifiers and Attributes

Access Modifiers

  • public - Accessible everywhere
  • private - Accessible only within the containing type
  • protected - Accessible within class and derived classes
  • internal - Accessible within the same assembly
  • protected internal - Accessible within assembly or derived classes
  • private protected - Accessible within derived classes in the same assembly

Other Modifiers

  • static - Belongs to the type rather than instance
  • abstract - Must be overridden in derived classes
  • virtual - Can be overridden in derived classes
  • override - Overrides a virtual/abstract member
  • sealed - Cannot be overridden further
  • readonly - Can only be assigned in its declaration or in a constructor
  • const - Compile-time constant
  • async - Asynchronous method
  • unsafe - Contains unsafe code
  • extern - Implemented externally

Attributes

[Obsolete("Use NewMethod instead")]
public void OldMethod() { }

[DllImport("kernel32.dll")]
public static extern bool SetConsoleTitle(string title);

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class CustomAttribute : Attribute
{
    public string Description { get; set; }
}

6. Using Directives

using System;                           // Namespace using
using System.Collections.Generic;
using static System.Math;               // Static using
using Project = MyCompany.MyProject;    // Alias directive
global using System.Text;              // Global using (C# 10+)

Note: global using directives are stored at the compilation unit level in CompilationUnit.global_using_directives.

Declaration Parsing Implementation

The declaration parser uses a multi-stage approach:

  1. Modifier Parsing: Parse access modifiers and other keywords
  2. Declaration Type Detection: Determine what kind of declaration
  3. Specific Parser Dispatch: Route to specialized parser
  4. Member Collection: Gather all declaration components

fn parse_type_declaration(input: &str) -> BResult<&str, TypeDeclaration> {
    let (input, attributes) = many0(parse_attribute)(input.into())?;
    let (input, modifiers) = parse_modifiers(input.into())?;
    let (input, declaration) = alt((
        parse_class_declaration,
        parse_interface_declaration,
        parse_struct_declaration,
        parse_enum_declaration,
        parse_delegate_declaration,
        parse_record_declaration,
    ))(input.into())?;
    
    Ok((input, TypeDeclaration {
        attributes,
        modifiers,
        declaration,
    }))
}

Error Handling

The declaration parser provides comprehensive error reporting:

  • Modifier conflicts: Detecting incompatible modifier combinations
  • Constraint validation: Ensuring generic constraints are valid
  • Accessibility consistency: Verifying access level consistency
  • Syntax validation: Catching malformed declarations
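
As an example of the first check, modifier conflicts can be detected with a small lookup table. This is a hedged sketch, not the parser's real validation: `CONFLICTS` and `find_conflicts` are hypothetical names, and the table lists only a few pairs that C# rejects on any declaration.

```rust
/// Pairs of modifiers that C# does not allow together on one declaration.
/// Illustrative subset, not exhaustive.
const CONFLICTS: &[(&str, &str)] = &[
    ("abstract", "sealed"),
    ("abstract", "virtual"),
    ("virtual", "override"),
    ("const", "static"),
    ("const", "readonly"),
    ("public", "private"),
];

/// Returns every conflicting pair present in `modifiers`, in table order.
pub fn find_conflicts(modifiers: &[&str]) -> Vec<(&'static str, &'static str)> {
    CONFLICTS
        .iter()
        .copied()
        .filter(|(a, b)| modifiers.contains(a) && modifiers.contains(b))
        .collect()
}
```

Valid two-keyword access levels such as `protected internal` pass, because neither keyword appears as a pair in the table.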

Recovery for Malformed Members

When a member inside a type body fails to parse, the parser uses a scoped recovery strategy to skip to the next safe boundary without crossing the enclosing type's closing brace. See the dedicated section in Error Handling for details on skip_to_member_boundary_top_level() and its contract:

  • docs: docs/parser/error-handling.md (Declaration Error Recovery subsection)

XML Documentation

The parser handles XML documentation comments:

/// <summary>
/// Calculates the area of a rectangle.
/// </summary>
/// <param name="width">The width of the rectangle.</param>
/// <param name="height">The height of the rectangle.</param>
/// <returns>The area of the rectangle.</returns>
public double CalculateArea(double width, double height)
{
    return width * height;
}

The declaration parser is designed to handle the full complexity of C# type system while maintaining performance and providing detailed error diagnostics.


Type System

BSharp implements a comprehensive type system that accurately represents all C# type constructs, from primitive types to complex generic types with constraints.

Type Categories

1. Primitive Types

Built-in Value Types

bool        // Boolean type
byte        // 8-bit unsigned integer
sbyte       // 8-bit signed integer
short       // 16-bit signed integer
ushort      // 16-bit unsigned integer
int         // 32-bit signed integer
uint        // 32-bit unsigned integer
long        // 64-bit signed integer
ulong       // 64-bit unsigned integer
char        // 16-bit Unicode character
float       // 32-bit floating point
double      // 64-bit floating point
decimal     // 128-bit decimal

Special Types

object      // Base type of all types
string      // Immutable string type
void        // Absence of type (method returns)
dynamic     // Dynamic type
var         // Implicitly typed variable

2. Reference Types

Class Types

MyClass                 // Simple class reference
System.Collections.Generic.List<int>  // Generic class

Interface Types

IEnumerable<T>         // Generic interface
IDisposable            // Non-generic interface

Array Types

int[]                  // Single-dimensional array
int[,]                 // Multi-dimensional array
int[][]                // Jagged array
int[,,]                // Three-dimensional array

Delegate Types

Action                 // Parameterless action
Action<int>            // Action with parameter
Func<int, string>      // Function with return type
EventHandler<T>        // Event handler

3. Nullable Types

Nullable Value Types

int?                   // Nullable integer
DateTime?              // Nullable DateTime
bool?                  // Nullable boolean

Nullable Reference Types (C# 8+)

string?                // Nullable string
List<int>?             // Nullable list
MyClass?               // Nullable custom class

4. Generic Types

Type Parameters

T                      // Simple type parameter
TKey, TValue           // Multiple type parameters

Constructed Generic Types

List<int>              // Generic list of integers
Dictionary<string, object>  // Generic dictionary

Generic Constraints

T where T : class                    // Reference type constraint
T where T : struct                   // Value type constraint
T where T : new()                    // Constructor constraint
T where T : BaseClass                // Base class constraint
T where T : IInterface               // Interface constraint
T where T : class, IDisposable, new() // Multiple constraints

5. Tuple Types

Named Tuples

(int x, int y)         // Named tuple elements
(string name, int age) // Different element types

Unnamed Tuples

(int, string)          // Unnamed tuple elements

Nested Tuples

(int, (string, bool))  // Nested tuple structure

6. Pointer Types (Unsafe Context)

int*                   // Pointer to integer
char**                 // Pointer to pointer to char
void*                  // Void pointer

7. Function Pointer Types (C# 9+)

delegate*<int, string>              // Function pointer
delegate* managed<int, void>        // Managed function pointer
delegate* unmanaged<int, void>      // Unmanaged function pointer

Type Syntax Parsing

Basic Type Parsing

The type parser handles various syntactic forms:

fn parse_type(input: &str) -> BResult<&str, Type> {
    alt((
        parse_tuple_type,
        parse_function_pointer_type,
        parse_named_type,
        parse_primitive_type,
    ))(input.into())
}

Array Type Parsing

Array types have specific syntax rules:

int[]                  // T[]
int[,]                 // T[,]
int[,,]                // T[,,]
int[][]                // T[][] (jagged)
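
The rank-specifier syntax above can be parsed by counting commas between brackets. The following is a standalone sketch rather than BSharp's parser: `parse_rank_specifiers` is a hypothetical helper that is assumed to receive the text immediately after the element type.

```rust
/// Parses trailing array rank specifiers such as `[]`, `[,]`, or `[,,][]`.
/// Returns the rank of each specifier in source order plus the unparsed rest.
pub fn parse_rank_specifiers(input: &str) -> (Vec<u32>, &str) {
    let mut ranks = Vec::new();
    let mut rest = input;
    while let Some(inner) = rest.trim_start().strip_prefix('[') {
        let mut rank = 1u32;
        let mut tail = inner.trim_start();
        // Each comma adds one dimension: `[,]` has rank 2, `[,,]` rank 3.
        while let Some(next) = tail.strip_prefix(',') {
            rank += 1;
            tail = next.trim_start();
        }
        match tail.strip_prefix(']') {
            Some(after) => {
                ranks.push(rank);
                rest = after;
            }
            None => break, // not a rank specifier, e.g. `[5]`
        }
    }
    (ranks, rest)
}
```

A jagged array such as `int[][]` yields two rank-1 specifiers, while `int[,]` yields a single rank-2 specifier, which is the distinction the list above draws.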

Generic Type Parsing

Generic types require careful parsing of type arguments:

List<int>              // Simple generic
Dictionary<string, List<int>>  // Nested generics
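
One reason type arguments need care is that nested generics end in `>>`, which must be read as two closing brackets rather than a shift operator. The recursive sketch below shows one way to do that by consuming a single `>` at a time. It is an assumption-laden illustration: `TypeRef` and `parse_type_ref` are hypothetical, and arrays, nullable markers, and tuples are not handled.

```rust
/// Simplified type reference: a (possibly qualified) name plus type arguments.
#[derive(Debug, Clone, PartialEq)]
pub struct TypeRef {
    pub name: String,
    pub args: Vec<TypeRef>,
}

/// Parses `Name` or `Name<Arg, ...>` and returns the type plus the rest.
pub fn parse_type_ref(input: &str) -> Option<(TypeRef, &str)> {
    let input = input.trim_start();
    let end = input
        .find(|c: char| !(c.is_alphanumeric() || c == '_' || c == '.'))
        .unwrap_or(input.len());
    if end == 0 {
        return None; // no identifier at this position
    }
    let name = input[..end].to_string();
    let mut rest = input[end..].trim_start();
    let mut args = Vec::new();
    if let Some(after_lt) = rest.strip_prefix('<') {
        rest = after_lt;
        loop {
            let (arg, after_arg) = parse_type_ref(rest)?;
            args.push(arg);
            rest = after_arg.trim_start();
            if let Some(after_comma) = rest.strip_prefix(',') {
                rest = after_comma;
            } else {
                // Consume exactly one `>` so that `>>` closes two levels.
                rest = rest.strip_prefix('>')?;
                break;
            }
        }
    }
    Some((TypeRef { name, args }, rest))
}
```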

Nullable Type Parsing

Nullable types use special syntax:

int?                   // Nullable<int>
string?                // string with nullable annotation

Type Resolution

Qualified Names

Types can be fully qualified:

System.Collections.Generic.List<int>
MyNamespace.MyClass

Type Aliases

Using directives create type aliases:

using StringList = System.Collections.Generic.List<string>;

Global Type References

Global namespace references:

global::System.String  // Fully qualified from global namespace

Type Constraints

Constraint Types

  1. Reference Type: where T : class
  2. Value Type: where T : struct
  3. Constructor: where T : new()
  4. Base Class: where T : BaseClass
  5. Interface: where T : IInterface
  6. Type Parameter: where T : U

Constraint Combinations

Multiple constraints can be combined:

where T : class, IDisposable, new()

Constraint Validation

The parser validates constraint combinations:

  • class and struct are mutually exclusive
  • new() constraint must come last
  • Base class constraint must come before interface constraints
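
The first two rules can be checked purely syntactically, as in the sketch below. The third rule is left out because telling a base class from an interface requires name resolution, which this example does not attempt. `Constraint` and `validate_constraints` are hypothetical names and do not mirror the real `TypeParameterConstraintClause` type.

```rust
/// Simplified constraint kinds for one `where T : ...` clause.
#[derive(Debug, Clone, PartialEq)]
pub enum Constraint {
    Class,
    Struct,
    New,
    Type(String),
}

/// Checks that `class`/`struct` are not combined and that `new()` is last.
pub fn validate_constraints(constraints: &[Constraint]) -> Result<(), String> {
    if constraints.contains(&Constraint::Class) && constraints.contains(&Constraint::Struct) {
        return Err("`class` and `struct` are mutually exclusive".to_string());
    }
    if let Some(pos) = constraints.iter().position(|c| *c == Constraint::New) {
        if pos + 1 != constraints.len() {
            return Err("`new()` must be the last constraint".to_string());
        }
    }
    Ok(())
}
```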

Type Variance

Covariance and Contravariance

interface ICovariant<out T> { }     // Covariant
interface IContravariant<in T> { }  // Contravariant
interface IInvariant<T> { }         // Invariant

Advanced Type Features

Record Types

record Person(string Name, int Age);
record class Employee(string Name, int Age, string Department);
record struct Point(int X, int Y);

Pattern Types

Types used in pattern matching:

obj is string str          // Type pattern
obj is not null           // Negation pattern
obj is > 0 and < 100     // Relational pattern

Type System Implementation

The type system is implemented with a hierarchical structure:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum Type {
    Primitive(PrimitiveType),
    Named {
        name: Identifier,
        type_arguments: Option<Vec<Type>>,
    },
    Array {
        element_type: Box<Type>,
        dimensions: u32,
    },
    Nullable(Box<Type>),
    Tuple(Vec<(Option<Identifier>, Type)>),
    Pointer(Box<Type>),
    FunctionPointer {
        parameters: Vec<Type>,
        return_type: Box<Type>,
    },
}

Error Handling

The type parser provides detailed error messages for:

  • Invalid type syntax
  • Constraint violations
  • Generic parameter mismatches
  • Nullable context errors
  • Variance violations

Type Inference

While the parser doesn't perform type inference (that's the compiler's job), it correctly parses:

  • var declarations
  • Anonymous types
  • Implicitly typed arrays
  • Lambda parameter types

The type system parser is designed to accurately represent the full complexity of C#'s type system while maintaining performance and providing clear error diagnostics.


C# Feature Completeness Matrix

This document tracks the implementation status of C# language features in the BSharp parser.

Legend:

  • Fully Supported - Feature is completely implemented and tested
  • 🟡 Partial Support - Feature is partially implemented or has known limitations
  • ⚠️ Planned - Feature is planned but not yet implemented
  • Not Supported - Feature is not currently supported

C# 1.0 Features (2002)

Type Declarations

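| Feature | Status | Notes |
|---|---|---|
| Classes | Fully Supported | Full support including nested classes |
| Structs | Fully Supported | Full support |
| Interfaces | Fully Supported | Full support |
| Enums | Fully Supported | Full support including flags |
| Delegates | Fully Supported | Full support |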

Members

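| Feature | Status | Notes |
|---|---|---|
| Fields | Fully Supported | Public, private, protected, internal |
| Properties | Fully Supported | Get/set accessors |
| Methods | Fully Supported | Instance and static methods |
| Constructors | Fully Supported | Instance and static constructors |
| Destructors/Finalizers | Fully Supported | Full support |
| Events | Fully Supported | Full support |
| Indexers | Fully Supported | Full support |
| Operators | Fully Supported | Operator overloading |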

Statements

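| Feature | Status | Notes |
|---|---|---|
| if/else | Fully Supported | Full support |
| switch/case | Fully Supported | Traditional switch statements |
| for | Fully Supported | Full support |
| foreach | Fully Supported | Full support |
| while | Fully Supported | Full support |
| do-while | Fully Supported | Full support |
| break | Fully Supported | Full support |
| continue | Fully Supported | Full support |
| return | Fully Supported | Full support |
| throw | Fully Supported | Full support |
| try/catch/finally | Fully Supported | Full exception handling |
| using statement | Fully Supported | Resource management |
| lock | Fully Supported | Thread synchronization |
| goto | Fully Supported | Including goto case |
| checked/unchecked | Fully Supported | Overflow checking |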

Expressions

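| Feature | Status | Notes |
|---|---|---|
| Literals | Fully Supported | All literal types |
| Arithmetic operators | Fully Supported | +, -, *, /, % |
| Comparison operators | Fully Supported | ==, !=, <, >, <=, >= |
| Logical operators | Fully Supported | &&, \|\|, ! |
| Bitwise operators | Fully Supported | &, \|, ^, ~, <<, >> |
| Assignment operators | Fully Supported | =, +=, -=, etc. |
| Conditional operator | Fully Supported | ? : ternary |
| Member access | Fully Supported | . operator |
| Indexing | Fully Supported | [] operator |
| Method invocation | Fully Supported | Full support |
| Object creation | Fully Supported | new expressions |
| Array creation | Fully Supported | Single and multi-dimensional |
| Type casting | Fully Supported | (Type)expr |
| typeof | Fully Supported | Type information |
| sizeof | Fully Supported | Size of types |
| is operator | Fully Supported | Type testing |
| as operator | Fully Supported | Safe casting |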

Types

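| Feature | Status | Notes |
|---|---|---|
| Primitive types | Fully Supported | All built-in types |
| Arrays | Fully Supported | Single, multi-dimensional, jagged |
| Nullable value types | Fully Supported | T? syntax |
| Reference types | Fully Supported | Classes, interfaces, delegates |
| Value types | Fully Supported | Structs, enums |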

Modifiers

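| Feature | Status | Notes |
|---|---|---|
| Access modifiers | Fully Supported | public, private, protected, internal |
| static | Fully Supported | Full support |
| readonly | Fully Supported | Full support |
| const | Fully Supported | Full support |
| virtual | Fully Supported | Full support |
| override | Fully Supported | Full support |
| abstract | Fully Supported | Full support |
| sealed | Fully Supported | Full support |
| extern | Fully Supported | Full support |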

C# 2.0 Features (2005)

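| Feature | Status | Notes |
|---|---|---|
| Generics | Fully Supported | Full support including constraints |
| Generic constraints | Fully Supported | where T : class, struct, new(), etc. |
| Partial types | Fully Supported | partial keyword |
| Anonymous methods | Fully Supported | delegate { } syntax |
| Nullable types | Fully Supported | Nullable<T> and T? |
| Iterators | Fully Supported | yield return, yield break |
| Covariance/Contravariance | Fully Supported | in/out variance |
| Static classes | Fully Supported | Full support |
| Property accessors | Fully Supported | Different accessibility |
| Namespace aliases | Fully Supported | using Alias = Namespace |
| ?? operator | Fully Supported | Null-coalescing |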

C# 3.0 Features (2007)

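| Feature | Status | Notes |
|---|---|---|
| Auto-implemented properties | Fully Supported | { get; set; } |
| Object initializers | Fully Supported | new T { Prop = value } |
| Collection initializers | Fully Supported | new List<T> { 1, 2, 3 } |
| Anonymous types | Fully Supported | new { Name = "x" } |
| Extension methods | Fully Supported | this parameter |
| Lambda expressions | Fully Supported | x => x * 2 |
| Expression trees | Fully Supported | Parsing support |
| LINQ query syntax | Fully Supported | from x in y select z |
| Implicitly typed variables | Fully Supported | var keyword |
| Partial methods | Fully Supported | In partial classes |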

C# 4.0 Features (2010)

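| Feature | Status | Notes |
|---|---|---|
| Dynamic binding | Fully Supported | dynamic type |
| Named arguments | Fully Supported | Method(param: value) |
| Optional parameters | Fully Supported | Default parameter values |
| Generic covariance/contravariance | Fully Supported | Enhanced support |
| Embedded interop types | Fully Supported | no-pia |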

C# 5.0 Features (2012)

| Feature | Status | Notes |
|---|---|---|
| Async/await | Fully Supported | async and await keywords |
| Caller info attributes | Fully Supported | [CallerMemberName], etc. |

C# 6.0 Features (2015)

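| Feature | Status | Notes |
|---|---|---|
| Auto-property initializers | Fully Supported | public int X { get; set; } = 1; |
| Expression-bodied members | Fully Supported | => expr for methods/properties |
| using static | Fully Supported | Import static members |
| Null-conditional operator | Fully Supported | ?. and ?[] |
| String interpolation | Fully Supported | $"Hello {name}" |
| nameof operator | Fully Supported | nameof(variable) |
| Index initializers | Fully Supported | [index] = value |
| Exception filters | Fully Supported | catch (E) when (condition) |
| await in catch/finally | Fully Supported | Full support |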

C# 7.0 Features (2017)

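| Feature | Status | Notes |
|---|---|---|
| Out variables | Fully Supported | Method(out var x) |
| Tuples | Fully Supported | (int, string) syntax |
| Tuple deconstruction | Fully Supported | (var x, var y) = tuple |
| Pattern matching | Fully Supported | is patterns |
| Local functions | Fully Supported | Functions inside methods |
| Ref returns and locals | Fully Supported | ref keyword |
| Discards | Fully Supported | _ placeholder |
| Binary literals | Fully Supported | 0b1010 |
| Digit separators | Fully Supported | 1_000_000 |
| Throw expressions | Fully Supported | x ?? throw new E() |
| Expression-bodied constructors | Fully Supported | => expr syntax |
| Expression-bodied finalizers | Fully Supported | => expr syntax |
| Expression-bodied accessors | Fully Supported | get => expr |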

C# 7.1 Features (2017)

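| Feature | Status | Notes |
|---|---|---|
| Async main | Fully Supported | async Task Main() |
| Default literal expressions | Fully Supported | default without type |
| Inferred tuple names | Fully Supported | Automatic naming |
| Pattern matching on generics | Fully Supported | Full support |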

C# 7.2 Features (2017)

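| Feature | Status | Notes |
|---|---|---|
| ref readonly | Fully Supported | Read-only references |
| in parameters | Fully Supported | Pass by readonly reference |
| ref struct | Fully Supported | Stack-only structs |
| Non-trailing named arguments | Fully Supported | Mixed named/positional |
| private protected | Fully Supported | Access modifier |
| Leading underscores in numeric literals | Fully Supported | After a base prefix, e.g. 0x_FF, 0b_1010 |
| Conditional ref expressions | Fully Supported | ref in ternary |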

C# 7.3 Features (2018)

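| Feature | Status | Notes |
|---|---|---|
| Tuple equality | Fully Supported | == and != |
| Attributes on backing fields | Fully Supported | [field: Attribute] |
| Expression variables in initializers | Fully Supported | Full support |
| ref local reassignment | Fully Supported | Reassign ref locals |
| Stackalloc initializers | Fully Supported | stackalloc[] { 1, 2 } |
| Pattern-based fixed | Fully Supported | Custom fixed |
| Improved overload candidates | Fully Supported | Better resolution |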

C# 8.0 Features (2019)

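| Feature | Status | Notes |
|---|---|---|
| Nullable reference types | Fully Supported | string? annotations |
| Default interface methods | Fully Supported | Interface implementations |
| Pattern matching enhancements | Fully Supported | Switch expressions, property patterns |
| Switch expressions | Fully Supported | x switch { ... } |
| Property patterns | Fully Supported | { Prop: value } |
| Tuple patterns | Fully Supported | (1, 2) patterns |
| Positional patterns | Fully Supported | Deconstruction patterns |
| Using declarations | Fully Supported | using var x = ... |
| Static local functions | Fully Supported | static modifier |
| Disposable ref structs | Fully Supported | Pattern-based Dispose() on ref struct |
| Nullable reference types | Fully Supported | #nullable directives |
| Asynchronous streams | Fully Supported | IAsyncEnumerable<T> |
| Asynchronous disposable | Fully Supported | IAsyncDisposable |
| Indices and ranges | Fully Supported | ^ and .. operators |
| Null-coalescing assignment | Fully Supported | ??= operator |
| Unmanaged constructed types | Fully Supported | Generic constraints |
| Stackalloc in nested expressions | Fully Supported | Full support |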

C# 9.0 Features (2020)

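| Feature | Status | Notes |
|---|---|---|
| Records | Fully Supported | record keyword |
| Init-only setters | Fully Supported | init accessor |
| Top-level statements | Fully Supported | No Main method required |
| Pattern matching improvements | Fully Supported | Relational, logical patterns |
| Relational patterns | Fully Supported | > 0, <= 10 |
| Logical patterns | Fully Supported | and, or, not |
| Target-typed new | Fully Supported | new() without type |
| Covariant returns | Fully Supported | Override with derived type |
| Extension GetEnumerator | Fully Supported | foreach support |
| Lambda discard parameters | Fully Supported | (_, _) => expr |
| Attributes on local functions | Fully Supported | Full support |
| Module initializers | Fully Supported | [ModuleInitializer] |
| Partial methods with return | Fully Supported | Extended partial |
| Native integers | Fully Supported | nint, nuint |
| Function pointers | Fully Supported | delegate* syntax |
| Suppress emitting localsinit | Fully Supported | [SkipLocalsInit] |
| Target-typed conditional | Fully Supported | ? : inference |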

C# 10.0 Features (2021)

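| Feature | Status | Notes |
|---|---|---|
| Record structs | Fully Supported | record struct |
| Global using directives | Fully Supported | global using |
| File-scoped namespaces | Fully Supported | namespace X; |
| Extended property patterns | Fully Supported | Nested patterns |
| Constant interpolated strings | Fully Supported | const strings |
| Lambda improvements | Fully Supported | Natural types, attributes |
| Caller expression attribute | Fully Supported | [CallerArgumentExpression] |
| Improved definite assignment | Fully Supported | Better analysis |
| Allow AsyncMethodBuilder | Fully Supported | Custom builders |
| Record types with sealed ToString | Fully Supported | Sealed override |
| Assignment and declaration in same deconstruction | Fully Supported | Mixed syntax |
| Allow both assignment and declaration | Fully Supported | Full support |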

C# 11.0 Features (2022)

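| Feature | Status | Notes |
|---|---|---|
| Raw string literals | Fully Supported | """text""" |
| Generic attributes | Fully Supported | [Attr<T>] |
| UTF-8 string literals | Fully Supported | "text"u8 |
| Newlines in string interpolations | Fully Supported | Multi-line expressions |
| List patterns | Fully Supported | [1, 2, .., 10] |
| File-local types | Fully Supported | file class |
| Required members | Fully Supported | required modifier |
| Auto-default structs | Fully Supported | Default initialization |
| Pattern match Span<char> | Fully Supported | Constant patterns |
| Extended nameof scope | Fully Supported | More contexts |
| Numeric IntPtr | Fully Supported | Operators on IntPtr |
| ref fields | Fully Supported | In ref structs |
| scoped ref | Fully Supported | Lifetime annotations |
| Checked operators | Fully Supported | User-defined checked |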

C# 12.0 Features (2023)

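| Feature | Status | Notes |
|---|---|---|
| Primary constructors | Fully Supported | Full support for classes and structs |
| Collection expressions | Fully Supported | [1, 2, 3] and spread .. syntax |
| Inline arrays | Not Supported | Not yet implemented |
| Optional parameters in lambdas | Fully Supported | Full support |
| ref readonly parameters | Fully Supported | Full support |
| Alias any type | Fully Supported | using Alias = (int, string) |
| Experimental attribute | Fully Supported | [Experimental] |
| Interceptors | Not Supported | Not yet implemented |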

C# 13.0 Features (2024)

| Feature | Status | Notes |
|---|---|---|
| params collections | ⚠️ | Planned |
| New lock type | ⚠️ | Planned |
| New escape sequence \e | ⚠️ | Planned |
| Method group natural type | ⚠️ | Planned |
| Implicit indexer access | ⚠️ | Planned |
| ref and unsafe in iterators | ⚠️ | Planned |
| ref struct interfaces | ⚠️ | Planned |
| Allows ref struct types | ⚠️ | Planned |

C# 14.0 Features (2025 - .NET 10)

| Feature | Status | Notes |
| --- | --- | --- |
| Extension members | 🟡 | Parser + emitter for extension blocks; semantics planned |
| field keyword | ⚠️ | Planned - Field-backed properties |
| Null-conditional assignment | ⚠️ | Planned - ?. on left side of = |
| nameof unbound generics | ⚠️ | Planned - nameof(List<>) |
| Implicit Span<T> conversions | ⚠️ | Planned - First-class span support |
| Lambda parameter modifiers | ⚠️ | Planned - (out x) => ... without types |
| Partial constructors | ⚠️ | Planned - partial instance constructors |
| Partial events | ⚠️ | Planned - partial events |
| User-defined compound assignment | ⚠️ | Planned - Custom +=, -= operators |

Preprocessor Directives

| Feature | Status | Notes |
| --- | --- | --- |
| #if / #elif / #else / #endif | | Conditional compilation |
| #define / #undef | | Symbol definition |
| #warning / #error | | Compiler messages |
| #line | | Line number control |
| #region / #endregion | | Code folding |
| #pragma warning | | Warning control |
| #pragma checksum | | Debugging support |
| #nullable | | Nullable context |

Documentation Comments

| Feature | Status | Notes |
| --- | --- | --- |
| XML documentation | | /// and /** */ |
| <summary> | | Full support |
| <param> | | Full support |
| <returns> | | Full support |
| <exception> | | Full support |
| <see> / <seealso> | | Full support |
| <example> | | Full support |
| <code> / <c> | | Full support |
| <para> | | Full support |
| <list> | | Full support |
| <include> | | Full support |

Unsafe Code

| Feature | Status | Notes |
| --- | --- | --- |
| Pointers | | T* syntax |
| unsafe keyword | | Blocks and methods |
| fixed statement | | Pin managed objects |
| stackalloc | | Stack allocation |
| Function pointers | | delegate* (C# 9+) |
| sizeof operator | | Type sizes |
| Pointer arithmetic | | Full support |
| Address-of operator | | & operator |
| Indirection operator | | * operator |

Summary Statistics

Overall Completeness

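| Version | Features | Supported | Partial | Planned | Not Supported | Completion |
| --- | --- | --- | --- | --- | --- | --- |
| C# 1.0 | 80+ | 80+ | 0 | 0 | 0 | 100% |
| C# 2.0 | 11 | 11 | 0 | 0 | 0 | 100% |
| C# 3.0 | 10 | 10 | 0 | 0 | 0 | 100% |
| C# 4.0 | 5 | 5 | 0 | 0 | 0 | 100% |
| C# 5.0 | 2 | 2 | 0 | 0 | 0 | 100% |
| C# 6.0 | 10 | 10 | 0 | 0 | 0 | 100% |
| C# 7.0 | 13 | 13 | 0 | 0 | 0 | 100% |
| C# 7.1 | 4 | 4 | 0 | 0 | 0 | 100% |
| C# 7.2 | 7 | 7 | 0 | 0 | 0 | 100% |
| C# 7.3 | 7 | 7 | 0 | 0 | 0 | 100% |
| C# 8.0 | 18 | 18 | 0 | 0 | 0 | 100% |
| C# 9.0 | 17 | 17 | 0 | 0 | 0 | 100% |
| C# 10.0 | 12 | 12 | 0 | 0 | 0 | 100% |
| C# 11.0 | 13 | 13 | 0 | 0 | 0 | 100% |
| C# 12.0 | 7 | 6 | 0 | 0 | 1 | ~86% |
| C# 13.0 | 8 | 0 | 0 | 8 | 0 | 0% (Preview) |
| C# 14.0 | 9 | 0 | 0 | 9 | 0 | 0% (Preview) |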

Total: ~99% of released C# features supported (C# 1.0 - 12.0)


Testing Coverage

Test Organization

All parser tests are located in tests/parser/ with comprehensive coverage:

  • Expression tests: tests/parser/expressions/
  • Statement tests: tests/parser/statements/
  • Declaration tests: tests/parser/declarations/
  • Type tests: tests/parser/types/
  • Pattern matching tests: tests/parser/expressions/pattern_matching_tests.rs
  • Preprocessor tests: tests/parser/preprocessor/

Test Fixtures

Real-world C# projects in tests/fixtures/:

  • happy_path/: Valid, well-formed C# code
  • complex/: Complex real-world scenarios

Known Limitations

C# 12.0 Limitations

  1. Inline Arrays: Not yet implemented

    • Requires [InlineArray(n)] attribute support
    • Planned for future release
  2. Interceptors: Not yet implemented

    • Experimental feature in C# 12
    • May be implemented when feature stabilizes

C# 13.0 & 14.0 Status

Support for the C# 13.0 and 14.0 features listed above is planned for future implementation. The one exception is C# 14.0 extension members, where the parser and emitter already handle extension blocks and only semantic support remains planned.


Contributing

To add support for new C# features:

  1. Update AST nodes in src/syntax/nodes/
  2. Implement parser in src/parser/
  3. Add comprehensive tests in tests/parser/
  4. Update this matrix to reflect new support
  5. Document in relevant parser documentation

See Contributing Guide for details.


References

  • C# Language Specification: https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/
  • C# Version History: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/
  • Roslyn Source: https://github.com/dotnet/roslyn
  • Parser Implementation: src/parser/
  • Test Suite: tests/parser/

Last Updated: 2025-09-30
Parser Version: Current development version
Maintained By: BSharp Project Contributors

2025-11-17 15:18:26 • commit: 03a4e25

Keywords and Tokens

Keyword and token helpers used by the parser.


Keyword Pairs Macro

  • Location: src/bsharp_parser/src/keywords/mod.rs
  • Macro: define_keyword_pair! (macro_rules)
  • Generates two functions per keyword:
    • kw_<name>() – consumes the keyword with word boundary check
    • peek_<name>() – non-consuming peek with surrounding whitespace/comments tolerated
// Define a pair:
// define_keyword_pair!(kw_public, peek_public, "public");
#[macro_export]
macro_rules! define_keyword_pair {
    ($kw_fn:ident, $peek_fn:ident, $lit:literal) => {
        pub fn $kw_fn() -> impl FnMut($crate::syntax::span::Span) -> $crate::syntax::errors::BResult<&str> {
            use nom::Parser as _;
            (|i: $crate::syntax::span::Span| {
                nom::combinator::map(
                    nom::sequence::terminated(
                        nom_supreme::tag::complete::tag($lit),
                        nom::combinator::peek(nom::combinator::not(
                            nom::character::complete::satisfy(|c: char| c.is_alphanumeric() || c == '_'),
                        )),
                    ),
                    |s: $crate::syntax::span::Span| *s.fragment(),
                )
                .parse(i)
            })
        }
        pub fn $peek_fn() -> impl FnMut($crate::syntax::span::Span) -> $crate::syntax::errors::BResult<&str> {
            use nom::Parser as _;
            (|i: $crate::syntax::span::Span| {
                nom::combinator::peek(
                    nom::sequence::delimited(
                        $crate::syntax::comment_parser::ws,
                        nom::combinator::map(
                            nom::sequence::terminated(
                                nom_supreme::tag::complete::tag($lit),
                                nom::combinator::peek(nom::combinator::not(
                                    nom::character::complete::satisfy(|c: char| c.is_alphanumeric() || c == '_'),
                                )),
                            ),
                            |_| $lit,
                        ),
                        $crate::syntax::comment_parser::ws,
                    ),
                )
                .parse(i)
            })
        }
    };
}
  • Keyword modules live under src/bsharp_parser/src/keywords/ (e.g., access_keywords.rs, declaration_keywords.rs, linq_query_keywords.rs, type_keywords.rs).
  • Central keyword set: KEYWORDS in keywords/mod.rs, checked with is_keyword().
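
The word-boundary rule that both generated functions rely on can be illustrated without nom. The following is a minimal std-only sketch, not the macro's actual expansion: the names kw, peek_kw, and is_ident_continue are hypothetical, and unlike the real peek_*() functions this version skips only whitespace, not comments.

```rust
/// Returns true for characters that may continue an identifier.
fn is_ident_continue(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Consuming matcher: on success, returns the input that follows the keyword.
/// Fails when the keyword is immediately followed by an identifier character.
fn kw<'a>(input: &'a str, lit: &str) -> Option<&'a str> {
    let rest = input.strip_prefix(lit)?;
    match rest.chars().next() {
        // "publicity" starts with "public" but is not the keyword.
        Some(c) if is_ident_continue(c) => None,
        _ => Some(rest),
    }
}

/// Non-consuming matcher: skips leading whitespace, then tests for the keyword.
/// The caller's input is left untouched because only a bool is returned.
fn peek_kw(input: &str, lit: &str) -> bool {
    kw(input.trim_start(), lit).is_some()
}
```

Calling kw("public class C", "public") yields the remainder " class C", while kw("publicity", "public") fails because the boundary check rejects the trailing identifier characters.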

Token and Whitespace Helpers

  • Whitespace/comments: src/bsharp_parser/src/syntax/comment_parser.rs
    • ws() parses optional whitespace and comments
    • parse_whitespace_or_comments() returns the consumed span text
  • List parsing: src/bsharp_parser/src/syntax/list_parser.rs provides helpers for delimited/separated lists
  • Punctuation/tokens: Use nom_supreme::tag::complete::tag("...") with:
    • peek(not(satisfy(|c| ...))) for word boundaries on keywords
    • preceded/terminated/delimited and ws() to control surrounding trivia

Example token with trivia discipline:

use nom::{combinator::map, sequence::delimited, Parser};
use nom_supreme::tag::complete::tag;
use crate::syntax::comment_parser::ws;
use crate::syntax::errors::BResult;
use crate::syntax::span::Span;

// Parses a comma and discards any whitespace or comments on either side.
pub fn comma(i: Span) -> BResult<()> {
    map(delimited(ws, tag(","), ws), |_| ()).parse(i)
}

Usage Patterns

  • Prefer peek_*() when branching without consuming input (e.g., lookahead for statement kind).
  • After consuming a keyword with kw_*(), use cut() to prevent backtracking past the commitment.
  • Always wrap the top-level file parser with all_consuming so that trailing unparsed input is reported as an error.
  • Keep context labels short and specific.

Adding a New Keyword

  1. Pick the right module in keywords/ and add a define_keyword_pair! entry.
  2. If it's a reserved word, add it to KEYWORDS (for identifier filtering).
  3. Use kw_*()/peek_*() in parsers with ws() at boundaries.
  4. Add tests under src/bsharp_tests/src/parser/... for both positive and negative cases.

References

  • Keyword macro and modules: src/bsharp_parser/src/keywords/
  • Whitespace/comment parser: src/bsharp_parser/src/syntax/comment_parser.rs
  • Lists: src/bsharp_parser/src/syntax/list_parser.rs
  • Error formatting: src/bsharp_parser/src/syntax/errors.rs

Query API for AST traversal

The Query API is provided by the bsharp_syntax crate and re-exported by bsharp_analysis for convenience. It replaces the older navigation traits and is the current, supported way to traverse the AST; it is not deprecated.

Core types

  • NodeRef<'a>: a thin enum over AST nodes (CompilationUnit, Namespace, Class, Struct, Interface, Enum, Record, Delegate, Method, Statement, Expression, plus top-level items). Origin: bsharp_syntax::node::ast_node::NodeRef (re-exported as bsharp_analysis::framework::NodeRef).
  • Query<'a>: a fluent helper to enumerate descendants and select typed nodes. Origin: bsharp_syntax::query::Query (re-exported as bsharp_analysis::framework::Query).
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::CompilationUnit;
use bsharp_syntax::{ClassDeclaration, MethodDeclaration};

fn all_classes<'a>(cu: &'a CompilationUnit) -> Vec<&'a ClassDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<ClassDeclaration>()
        .collect()
}

fn all_methods<'a>(cu: &'a CompilationUnit) -> Vec<&'a MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<MethodDeclaration>()
        .collect()
}

Descendant enumeration

Query::descendants() walks the tree using Children implemented for NodeRef.

use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::CompilationUnit;
use bsharp_syntax::statements::Statement;

fn all_statements<'a>(cu: &'a CompilationUnit) -> Vec<&'a Statement> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<Statement>()
        .collect()
}

Filtering

Use filter_typed to filter by predicate.

use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::ClassDeclaration;

// cu: CompilationUnit
let public_classes: Vec<&ClassDeclaration> =
    Query::from(NodeRef::CompilationUnit(&cu))
        .filter_typed::<ClassDeclaration>(|c| c.modifiers.iter().any(|m| m.is_public()))
        .collect();

Best practices

  • Prefer Query for node enumeration across passes.
  • For hot path statement/expression analysis, use shared helpers (metrics::shared) or a small local walker when necessary.
  • Keep passes stateless and deterministic; feed inputs via AnalysisSession artifacts.

Implementation notes

The Children and Extract traits are implemented for common AST nodes, which lets Query::of::<T>() return strongly typed references. See:

  • src/bsharp_syntax/src/query/ for Children, Extract, Query.
  • src/bsharp_syntax/src/node/ast_node.rs for NodeRef.
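
To make the division of labour between the two traits concrete, here is a minimal, self-contained sketch on a toy tree. It is not the bsharp_syntax implementation: Unit, Class, Method, the children method, and the free function of are simplified stand-ins invented for illustration. One piece enumerates children of a node, the other recovers a typed reference, and a depth-first walk combines them.

```rust
// Hypothetical, simplified stand-ins for real AST types.
struct Method { name: String }
struct Class { name: String, methods: Vec<Method> }
struct Unit { classes: Vec<Class> }

#[derive(Clone, Copy)]
enum NodeRef<'a> {
    Unit(&'a Unit),
    Class(&'a Class),
    Method(&'a Method),
}

impl<'a> NodeRef<'a> {
    /// Plays the role of `Children`: lists the direct children of a node.
    fn children(self) -> Vec<NodeRef<'a>> {
        match self {
            NodeRef::Unit(u) => u.classes.iter().map(NodeRef::Class).collect(),
            NodeRef::Class(c) => c.methods.iter().map(NodeRef::Method).collect(),
            NodeRef::Method(_) => Vec::new(),
        }
    }
}

/// Plays the role of `Extract`: recovers a typed reference from a `NodeRef`.
trait Extract<'a>: Sized {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self>;
}

impl<'a> Extract<'a> for Class {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node { NodeRef::Class(c) => Some(c), _ => None }
    }
}

impl<'a> Extract<'a> for Method {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node { NodeRef::Method(m) => Some(m), _ => None }
    }
}

/// Pre-order walk over all descendants of `root`, keeping nodes of type `T`.
fn of<'a, T: Extract<'a> + 'a>(root: NodeRef<'a>) -> Vec<&'a T> {
    let mut out = Vec::new();
    let mut stack = root.children();
    stack.reverse(); // so that pop() visits children in source order
    while let Some(node) = stack.pop() {
        if let Some(typed) = T::extract(node) {
            out.push(typed);
        }
        let mut kids = node.children();
        kids.reverse();
        stack.extend(kids);
    }
    out
}
```

With this shape, of::<Method>(NodeRef::Unit(&unit)) returns every method in source order regardless of which class contains it, which is the behaviour the Query examples above depend on.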

Comment Parsing

BSharp implements comprehensive comment parsing for both regular comments and XML documentation comments, preserving them as part of the AST for documentation generation and analysis tools.

Comment Types

1. Single-Line Comments

Standard C++ style comments:

// This is a single-line comment
int x = 5; // End-of-line comment

2. Multi-Line Comments

Traditional C-style block comments:

/*
 * This is a multi-line comment
 * that spans several lines
 */
int y = 10; /* Inline block comment */

3. XML Documentation Comments

Single-Line XML Comments

/// <summary>
/// This method calculates the sum of two integers.
/// </summary>
/// <param name="a">The first integer.</param>
/// <param name="b">The second integer.</param>
/// <returns>The sum of a and b.</returns>
public int Add(int a, int b)
{
    return a + b;
}

Multi-Line XML Comments

/**
 * <summary>
 * This is a multi-line XML documentation comment.
 * It provides detailed information about the method.
 * </summary>
 * <param name="value">The input value to process.</param>
 * <returns>The processed result.</returns>
 */
public string ProcessValue(string value) { return value; }
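
The four comment forms above can be told apart by their opening characters alone. The sketch below shows one way to classify them; CommentKind and classify_comment are hypothetical names, not BSharp APIs. It assumes the usual C# convention that exactly three slashes or exactly two asterisks introduce a documentation comment, so //// and /**/ fall back to regular comments.

```rust
#[derive(Debug, PartialEq, Eq)]
enum CommentKind {
    SingleLine,  // // ...
    MultiLine,   // /* ... */
    XmlDocLine,  // /// ...
    XmlDocBlock, // /** ... */
}

/// Classifies the comment that starts at the beginning of `text`, if any.
fn classify_comment(text: &str) -> Option<CommentKind> {
    if text.starts_with("///") && !text.starts_with("////") {
        Some(CommentKind::XmlDocLine)
    } else if text.starts_with("//") {
        Some(CommentKind::SingleLine)
    } else if text.starts_with("/**")
        && !text.starts_with("/***")
        && !text.starts_with("/**/")
    {
        Some(CommentKind::XmlDocBlock)
    } else if text.starts_with("/*") {
        Some(CommentKind::MultiLine)
    } else {
        None
    }
}
```

The order of the checks matters: the more specific documentation prefixes are tested before the regular comment prefixes they extend.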

XML Documentation Structure

Standard XML Tags

Summary and Description

<summary>
Brief description of the member.
</summary>

<remarks>
Detailed remarks and additional information.
</remarks>

Parameters and Returns

<param name="parameterName">Description of the parameter.</param>
<returns>Description of the return value.</returns>

Exceptions

<exception cref="ArgumentNullException">
Thrown when the parameter is null.
</exception>

Examples

<example>
This example shows how to use the method:
<code>
var result = MyMethod("input");
Console.WriteLine(result);
</code>
</example>

See References

<see cref="RelatedMethod"/>
<seealso cref="AnotherClass"/>

Generic Type Parameters

<typeparam name="T">The type parameter.</typeparam>
<typeparamref name="T"/>

Custom XML Tags

The parser supports custom XML tags:

<custom attribute="value">
Custom content with <nested>elements</nested>.
</custom>

XML Documentation Parsing

XML Element Structure

The parser represents XML elements with:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlElement {
    pub name: Identifier,
    pub attributes: Vec<XmlAttribute>,
    pub children: Vec<XmlNode>,
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlAttribute {
    pub name: Identifier,
    pub value: String,
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum XmlNode {
    Element(XmlElement),
    Text(String),
    CData(String),
    Comment(String),
}

XML Documentation Comment

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlDocumentationComment {
    pub elements: Vec<XmlNode>,
}

Parsing XML Attributes

The parser handles XML attributes with various syntaxes:

<param name="value">Description</param>
<see cref="MyClass.MyMethod(int, string)"/>
<exception cref="System.ArgumentException">Error description</exception>

XML Content Parsing

The parser processes mixed content:

<summary>
This method processes <paramref name="input"/> and returns
<see cref="ProcessResult"/> containing the result.
</summary>

Comment Association

Declaration Comments

Comments are associated with their following declarations:

/// <summary>Class documentation</summary>
public class MyClass
{
    /// <summary>Method documentation</summary>
    public void MyMethod() { }
}

Member Comments

Each declaration can have associated documentation:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct MethodDeclaration {
    pub documentation: Option<XmlDocumentationComment>,
    // ... other fields
}

Advanced XML Features

CDATA Sections

The parser handles CDATA sections for literal content:

<example>
<![CDATA[
if (x < y && y > z)
{
    Console.WriteLine("Complex condition");
}
]]>
</example>

Nested XML Elements

Complex nested structures are supported:

<summary>
This method handles <see cref="List{T}"/> where T is
<typeparamref name="T"/> and implements <see cref="IComparable{T}"/>.
</summary>

XML Namespaces

The parser can handle XML namespaces in documentation:

<doc:summary xmlns:doc="http://schemas.microsoft.com/developer/documentation">
Namespaced documentation content.
</doc:summary>

Comment Preservation

Comment Tokens

Comments are preserved as tokens in the AST:

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CommentToken {
    SingleLine(String),
    MultiLine(String),
    XmlDocumentation(XmlDocumentationComment),
}

Position Information

Comments maintain position information:

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PositionedComment {
    pub comment: CommentToken,
    pub line: usize,
    pub column: usize,
}

Error Handling

XML Validation

The parser validates XML structure:

  • Well-formed XML: Proper opening and closing tags
  • Attribute syntax: Valid attribute name-value pairs
  • Nesting rules: Correct element nesting
  • Character escaping: Proper XML character escaping

Error Recovery

When XML is malformed, the parser attempts recovery:

  • Skip malformed elements: Continue parsing after errors
  • Preserve content: Keep as much content as possible
  • Error reporting: Provide detailed error locations

Integration with Analysis

Documentation Analysis

Comments are available for analysis tools:

impl XmlDocumentationComment {
    pub fn find_elements_by_name(&self, name: &str) -> Vec<&XmlElement> {
        // Find all elements with the given tag name
    }

    pub fn get_summary(&self) -> Option<String> {
        // Extract summary text
    }

    pub fn get_parameters(&self) -> Vec<(String, String)> {
        // Extract parameter documentation
    }
}
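
The method bodies above are left as outlines. As an illustration of how they might be filled in, here is a self-contained sketch over simplified types. It assumes plain String names instead of Identifier and only the Element and Text node kinds, and the helpers collect_named, append_text, and text_of are hypothetical, so treat it as one possible reading rather than the library's code.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct XmlAttribute { pub name: String, pub value: String }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct XmlElement {
    pub name: String, // simplified: the real type uses Identifier
    pub attributes: Vec<XmlAttribute>,
    pub children: Vec<XmlNode>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum XmlNode { Element(XmlElement), Text(String) }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct XmlDocumentationComment { pub elements: Vec<XmlNode> }

/// Recursively collects every element whose tag name equals `name`.
fn collect_named<'a>(nodes: &'a [XmlNode], name: &str, out: &mut Vec<&'a XmlElement>) {
    for node in nodes {
        if let XmlNode::Element(el) = node {
            if el.name == name {
                out.push(el);
            }
            collect_named(&el.children, name, out);
        }
    }
}

/// Appends the text of all nested text nodes, in document order.
fn append_text(nodes: &[XmlNode], buf: &mut String) {
    for node in nodes {
        match node {
            XmlNode::Text(t) => buf.push_str(t),
            XmlNode::Element(el) => append_text(&el.children, buf),
        }
    }
}

/// Returns the trimmed text content of an element.
fn text_of(el: &XmlElement) -> String {
    let mut buf = String::new();
    append_text(&el.children, &mut buf);
    buf.trim().to_string()
}

impl XmlDocumentationComment {
    pub fn find_elements_by_name(&self, name: &str) -> Vec<&XmlElement> {
        let mut out = Vec::new();
        collect_named(&self.elements, name, &mut out);
        out
    }

    /// Text of the first <summary> element, if present.
    pub fn get_summary(&self) -> Option<String> {
        self.find_elements_by_name("summary").first().map(|el| text_of(el))
    }

    /// (parameter name, description) pairs from <param name="..."> elements.
    pub fn get_parameters(&self) -> Vec<(String, String)> {
        self.find_elements_by_name("param")
            .into_iter()
            .filter_map(|el| {
                let attr = el.attributes.iter().find(|a| a.name == "name")?;
                Some((attr.value.clone(), text_of(el)))
            })
            .collect()
    }
}
```

A <param> element without a name attribute is skipped rather than reported, which keeps the sketch short; a real analyzer would more likely surface it as a documentation warning.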

Documentation Generation

The parsed XML documentation can be used for:

  • API documentation generation
  • IntelliSense information
  • Code analysis and quality checks
  • Documentation coverage reports

Performance Considerations

Lazy Parsing

XML documentation can be parsed lazily when needed:

#[derive(Debug, Clone)]
pub enum DocumentationState {
    Unparsed(String),
    Parsed(XmlDocumentationComment),
    Invalid(String, ParseError),
}

Memory Optimization

The parser optimizes memory usage by:

  • String interning: Reusing common XML tag names
  • Structured storage: Efficient representation of XML structure
  • On-demand parsing: Parse XML only when accessed

The comment parsing system ensures that all documentation and comments are preserved and available for analysis, while maintaining the performance characteristics needed for large codebases.


Preprocessor Directives

This parser treats preprocessor directives as trivia that can appear at safe boundaries (file start, between members inside namespaces and type bodies). We currently parse only a small subset explicitly and skip the rest.

What is parsed today

  • #pragma lines are parsed into PreprocessorDirective::Pragma { pragma: String }.
  • #line lines are parsed into PreprocessorDirective::Line { line: String }.
  • Any other line starting with # is recognized and consumed as PreprocessorDirective::Unknown { text: String } (the remainder of the line after #).

All directive parsers consume the optional trailing newline so the main parser can continue cleanly at the next token.
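
The behaviour described above can be sketched in a few lines of std-only Rust. The enum mirrors the three variants named in this section; everything else (parse_directive, keyword_args, and the choice to store the trimmed text after the keyword in pragma and line) is an assumption made for illustration, not the parser's actual code.

```rust
#[derive(Debug, PartialEq, Eq)]
enum PreprocessorDirective {
    Pragma { pragma: String },
    Line { line: String },
    Unknown { text: String },
}

/// If `body` starts with the directive keyword `kw` at a word boundary,
/// returns the text that follows it.
fn keyword_args<'a>(body: &'a str, kw: &str) -> Option<&'a str> {
    let rest = body.strip_prefix(kw)?;
    match rest.chars().next() {
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(rest),
    }
}

/// Parses one directive at the start of `input`.
/// Returns the directive and the input after the optional trailing newline.
fn parse_directive(input: &str) -> Option<(PreprocessorDirective, &str)> {
    let after_hash = input
        .trim_start_matches(|c: char| c == ' ' || c == '\t')
        .strip_prefix('#')?;
    // Take the rest of the line and consume the newline if there is one.
    let (line, rest) = match after_hash.find('\n') {
        Some(idx) => (&after_hash[..idx], &after_hash[idx + 1..]),
        None => (after_hash, ""),
    };
    let line = line.trim_end_matches('\r');
    let body = line.trim_start();
    let directive = if let Some(args) = keyword_args(body, "pragma") {
        PreprocessorDirective::Pragma { pragma: args.trim().to_string() }
    } else if let Some(args) = keyword_args(body, "line") {
        PreprocessorDirective::Line { line: args.trim().to_string() }
    } else {
        // Anything else keeps the remainder of the line after '#'.
        PreprocessorDirective::Unknown { text: line.to_string() }
    };
    Some((directive, rest))
}
```

Because the newline is consumed here, the caller can resume at the first character of the following line, which is the property the paragraph above describes.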

Where directives are skipped

Directives are treated as trivia and skipped at the safe boundaries named above: at file start, and between members inside namespaces and type bodies.

This skipping is centralized via parser/helpers/directives.rs: skip_preprocessor_directives().


2. Symbol Definition

#define and #undef

#define FEATURE_ENABLED
#define VERSION_2_0

#undef OLD_FEATURE

3. Diagnostic Directives

#warning

#warning This code is deprecated and will be removed in the next version

#error

#if UNSUPPORTED_PLATFORM
#error This platform is not supported
#endif

4. Line Directives

#line

#line 100 "OriginalFile.cs"
// Following code appears to come from line 100 of OriginalFile.cs

#line default
// Reset to actual file and line numbers

#line hidden
// Hide following lines from debugger

5. Region Directives

#region and #endregion

#region Private Methods
private void HelperMethod()
{
    // Implementation
}

private void AnotherHelper()
{
    // Implementation
}
#endregion

6. Pragma Directives

#pragma warning

#pragma warning disable CS0618
// Use of obsolete members
ObsoleteMethod();
#pragma warning restore CS0618

#pragma warning disable CS0162, CS0168
// Disable multiple warnings
#pragma warning restore CS0162, CS0168

#pragma checksum

#pragma checksum "file.cs" "{406EA660-64CF-4C82-B6F0-42D48172A799}" "checksum_bytes"

7. Nullable Context Directives

#nullable

#nullable enable
string? nullable = null;  // Nullable reference types enabled

#nullable disable
string notNullable = null;  // Warning disabled

#nullable restore
// Restore previous nullable context

Preprocessor Expression Evaluation

Symbols and Operators

Boolean Operators

#if DEBUG && !RELEASE           // AND and NOT
#if WINDOWS || LINUX || MACOS   // OR
#if (A && B) || (C && D)        // Grouping with parentheses

Equality Operators

#if DEBUG == true               // Equality against a Boolean literal
#if DEBUG != RELEASE            // Inequality between two symbols

Symbol Resolution

Symbols can be defined:

  1. Source code: #define SYMBOL
  2. Compiler flags: /define:SYMBOL
  3. Project settings: <DefineConstants>
  4. Environment: Predefined symbols

Predefined Symbols

Common predefined symbols:

#if NET5_0_OR_GREATER          // Framework version
#if WINDOWS                    // Platform
#if DEBUG                      // Configuration
#if X64                        // Architecture

Preprocessor AST Representation

Preprocessor Directive Node

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum PreprocessorDirective {
    If {
        condition: PreprocessorExpression,
        then_block: Vec<PreprocessorDirective>,
        elif_blocks: Vec<(PreprocessorExpression, Vec<PreprocessorDirective>)>,
        else_block: Option<Vec<PreprocessorDirective>>,
    },
    Define(String),
    Undef(String),
    Warning(String),
    Error(String),
    Line {
        line_number: Option<u32>,
        file_name: Option<String>,
        hidden: bool,
    },
    Region {
        name: String,
        content: Vec<PreprocessorDirective>,
    },
    Pragma {
        directive: String,
        arguments: Vec<String>,
    },
    Nullable(NullableDirective),
}

Preprocessor Expression

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum PreprocessorExpression {
    Symbol(String),
    Not(Box<PreprocessorExpression>),
    And(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Or(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Equal(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    NotEqual(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Parenthesized(Box<PreprocessorExpression>),
    Literal(String),
}
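
An expression tree of this shape can be evaluated against a set of defined symbols with a small recursive function. The sketch below repeats the enum without the serde derives so that it stands alone. The evaluate function is hypothetical, and it assumes that a Literal is true only when its text is "true", following C# semantics where the only literals are true and false.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PreprocessorExpression {
    Symbol(String),
    Not(Box<PreprocessorExpression>),
    And(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Or(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Equal(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    NotEqual(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Parenthesized(Box<PreprocessorExpression>),
    Literal(String),
}

/// Evaluates `expr`; a symbol is true exactly when it is in `defined`.
pub fn evaluate(expr: &PreprocessorExpression, defined: &HashSet<String>) -> bool {
    use PreprocessorExpression::*;
    match expr {
        Symbol(name) => defined.contains(name),
        // Assumption: any literal other than "true" counts as false.
        Literal(text) => text.as_str() == "true",
        Not(inner) => !evaluate(inner, defined),
        And(lhs, rhs) => evaluate(lhs, defined) && evaluate(rhs, defined),
        Or(lhs, rhs) => evaluate(lhs, defined) || evaluate(rhs, defined),
        Equal(lhs, rhs) => evaluate(lhs, defined) == evaluate(rhs, defined),
        NotEqual(lhs, rhs) => evaluate(lhs, defined) != evaluate(rhs, defined),
        Parenthesized(inner) => evaluate(inner, defined),
    }
}
```

With only DEBUG defined, the tree for DEBUG && !RELEASE evaluates to true and the tree for WINDOWS || LINUX evaluates to false.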

Conditional Compilation Processing

Block Structure

Conditional blocks create a tree structure:

#if CONDITION_A
    // Block A
    #if NESTED_CONDITION
        // Nested block
    #endif
#elif CONDITION_B
    // Block B
#else
    // Default block
#endif

Active Code Determination

The preprocessor determines which code blocks are active:

  1. Evaluate conditions: Process #if expressions
  2. Symbol lookup: Resolve defined symbols
  3. Block selection: Choose active code paths
  4. Nested processing: Handle nested conditionals
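
Once the conditions of one #if chain have been evaluated, block selection reduces to picking the first true condition. A minimal sketch follows; select_active_branch is a hypothetical helper that takes the evaluated #if and #elif conditions in source order and reports the #else block as the index one past the last condition.

```rust
/// Returns the index of the active branch of a single #if chain:
/// 0 for #if, 1.. for each #elif, and `conditions.len()` for #else.
/// Returns None when no branch is active.
fn select_active_branch(conditions: &[bool], has_else: bool) -> Option<usize> {
    conditions
        .iter()
        .position(|&active| active)
        .or(if has_else { Some(conditions.len()) } else { None })
}
```

Nested conditionals are handled by applying the same selection inside whichever branch was chosen; branches that are not selected are never descended into.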

Integration with Main Parser

Two-Phase Parsing

  1. Preprocessor Phase: Process directives and determine active code
  2. Main Parse Phase: Parse the active code sections

Conditional Code Exclusion

Inactive code blocks are:

  • Excluded from parsing: Not processed by main parser
  • Preserved in AST: Available for analysis tools
  • Marked as inactive: Flagged for tooling

Directive Preservation

All directives are preserved for:

  • Code formatting tools
  • Refactoring utilities
  • Documentation generation
  • Build system integration

Error Handling

Directive Validation

The parser validates:

  • Balanced conditionals: Every #if has matching #endif
  • Valid expressions: Preprocessor expressions are syntactically correct
  • Symbol definitions: #define follows naming rules
  • Pragma syntax: Pragma directives have valid format
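
The balanced-conditionals check, for example, needs only a stack of open #if lines. The following std-only sketch shows the idea. The function name check_balanced_conditionals is hypothetical, and the scan is deliberately naive: it looks at every line that starts with #, so directive-like text inside a verbatim string or block comment would be miscounted.

```rust
/// Verifies that every #if has a matching #endif and that #elif, #else,
/// and #endif never appear outside an open #if.
fn check_balanced_conditionals(source: &str) -> Result<(), String> {
    let mut open_ifs: Vec<usize> = Vec::new(); // line numbers of unclosed #if
    for (idx, line) in source.lines().enumerate() {
        let line_no = idx + 1;
        let rest = match line.trim_start().strip_prefix('#') {
            Some(rest) => rest,
            None => continue, // not a directive line
        };
        let word: String = rest
            .trim_start()
            .chars()
            .take_while(|c| c.is_ascii_alphabetic())
            .collect();
        match word.as_str() {
            "if" => open_ifs.push(line_no),
            "elif" | "else" if open_ifs.is_empty() => {
                return Err(format!("line {}: #{} without matching #if", line_no, word));
            }
            "endif" => {
                if open_ifs.pop().is_none() {
                    return Err(format!("line {}: #endif without matching #if", line_no));
                }
            }
            _ => {}
        }
    }
    match open_ifs.pop() {
        Some(line_no) => Err(format!("line {}: #if without matching #endif", line_no)),
        None => Ok(()),
    }
}
```

Recording the line number of each open #if lets the error point at the directive that was never closed rather than at the end of the file.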

Error Recovery

When encountering malformed directives:

  • Skip invalid directives: Continue parsing
  • Report detailed errors: Show directive location and issue
  • Maintain structure: Keep conditional block structure intact

Advanced Features

Nested Regions

#region Outer Region
    #region Inner Region
        // Nested region content
    #endregion
#endregion

Complex Pragma Directives

#pragma warning disable IDE0051 // Remove unused private members
#pragma warning restore IDE0051

#pragma nullable enable warnings
#pragma nullable disable annotations

Source Mapping

Line directives affect source mapping:

#line 1 "Generated.cs"
// This appears to come from Generated.cs line 1
var generated = true;
#line default
// Back to actual file location

Usage in Analysis

Conditional Code Analysis

Analysis tools can:

  • Detect dead code: Find code that's never compiled
  • Track feature flags: Analyze conditional compilation usage
  • Generate reports: Show compilation configurations

Symbol Tracking

Track symbol definitions and usage:

  • Definition locations: Where symbols are defined
  • Usage contexts: Where symbols are referenced
  • Scope analysis: Symbol visibility across files

Performance Considerations

Preprocessing Optimization

  • Symbol caching: Cache symbol resolution results
  • Lazy evaluation: Process conditionals only when needed
  • Memory efficiency: Minimize directive storage overhead

Integration Efficiency

  • Single-pass processing: Process directives during parsing
  • Minimal backtracking: Avoid reparsing conditional blocks
  • Incremental updates: Support for incremental parsing with directive changes

The preprocessor directive system ensures that all C# preprocessing features are supported while maintaining the ability to analyze and manipulate code across different compilation configurations.


Spans

This page explains how source spans are represented and returned during parsing.


Span Type

  • Type: bsharp_parser::syntax::span::Span<'a>
  • Alias: type Span<'a> = nom_locate::LocatedSpan<&'a str>;
  • Provides line/column offsets and byte positions for parser errors and mapping.
// src/bsharp_parser/src/syntax/span.rs
pub type Span<'a> = nom_locate::LocatedSpan<&'a str>;
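
To show what line and column information amounts to, here is a std-only sketch that derives a 1-based position from a byte offset. It does not use nom_locate, which tracks this incrementally while parsing; line_col is a hypothetical helper, and it counts columns in characters rather than bytes.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Returns None if the offset is out of range or splits a UTF-8 character.
fn line_col(source: &str, offset: usize) -> Option<(usize, usize)> {
    if offset > source.len() || !source.is_char_boundary(offset) {
        return None;
    }
    let before = &source[..offset];
    // Lines are numbered from 1; each preceding '\n' starts a new line.
    let line = before.matches('\n').count() + 1;
    // The current line starts just after the last preceding newline.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    Some((line, column))
}
```

Recomputing positions this way costs a scan of the preceding text on every call, which is one reason a located span that carries its position along is preferable inside a parser.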

Parsing With Spans

Use the parser facade to parse and also get a span table for top-level declarations.

use bsharp_parser::facade::Parser;

let source = std::fs::read_to_string("Program.cs")?;
let (cu, spans) = Parser::new().parse_with_spans(&source)?;
  • The return value is (CompilationUnit, SpanTable).
  • SpanTable maps top-level declarations to byte ranges for later mapping.

Error Reporting

Pretty error formatting uses Span to print line/column with context:

use bsharp_parser::syntax::errors::format_error_tree;

let msg = format_error_tree(&source, &error_tree);

See: docs/parser/error-handling.md for details.


Syntax Traits

Core traits used by AST types and formatting emitters.


AstNode

  • Path: bsharp_syntax::node::ast_node::AstNode
  • Implemented by all syntax node types for traversal and visualization.
pub trait AstNode: Any {
    fn as_any(&self) -> &dyn Any;
    fn children<'a>(&'a self, _push: &mut dyn FnMut(NodeRef<'a>)) {}
    fn node_kind(&self) -> &'static str { core::any::type_name::<Self>() }
    fn node_label(&self) -> String { format!("{} ({})", self.node_kind(), core::any::type_name::<Self>()) }
}

Helpers:

  • NodeRef<'a> alias to DynNodeRef<'a> for dynamic traversal.
  • push_child(push, node) to push typed children.

Emit and Emitter

  • Path: bsharp_syntax::emitters::emit_trait::{Emit, Emitter, EmitCtx}
  • Emit is implemented by nodes that can render themselves as C# code.
  • Emitter writes items to String (or writer) using a mutable EmitCtx.
pub trait Emit {
    fn emit<W: std::fmt::Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}

EmitCtx controls indentation, simple policies, and optional JSONL tracing.


Rendering Helpers

  • Graph renderers in bsharp_syntax::node::render::{to_text, to_mermaid, to_dot} operate on &impl AstNode.

See Also

  • docs/syntax/spans.md
  • docs/syntax/derive-macros.md
  • docs/syntax/formatter.md

Derive Macros

Procedural macros used by syntax nodes to implement traversal and visualization behavior.


#[derive(AstNode)]

  • Crate: bsharp_syntax_derive
  • Implements: bsharp_syntax::node::ast_node::AstNode for your struct/enum
  • Purpose: Auto-generates children() to enable dynamic traversal via NodeRef/DynNodeRef.

How it works

For each field, the macro emits code to push children appropriately:

  • Option<T>: pushes inner T if present
  • Vec<T>: iterates and pushes each T
  • Box<T>: borrows inner &T and pushes it
  • Other types: treated as AST nodes by default
  • Primitive-like types are skipped: bool, numbers, char, String, and internal primitive enums like PrimitiveType
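
Written out by hand, the rules above produce a children() body like the one in the sketch below. It uses a miniature trait rather than the real AstNode, and Ident, Literal, Call, and child_kinds are hypothetical names; the point is only how each field shape maps to a push.

```rust
// Miniature stand-in for the traversal contract.
trait AstNode {
    fn children<'a>(&'a self, push: &mut dyn FnMut(&'a dyn AstNode));
    fn kind(&self) -> &'static str;
}

struct Ident { text: String }
struct Literal { value: i64 }

struct Call {
    target: Box<Ident>,      // Box<T>: borrow the inner node
    type_arg: Option<Ident>, // Option<T>: push only when present
    args: Vec<Literal>,      // Vec<T>: push each element
    is_static: bool,         // primitive: skipped
}

impl AstNode for Ident {
    // `text` is primitive-like, so there is nothing to push.
    fn children<'a>(&'a self, _push: &mut dyn FnMut(&'a dyn AstNode)) {}
    fn kind(&self) -> &'static str { "Ident" }
}

impl AstNode for Literal {
    fn children<'a>(&'a self, _push: &mut dyn FnMut(&'a dyn AstNode)) {}
    fn kind(&self) -> &'static str { "Literal" }
}

// What a derive would be expected to generate for `Call`, written by hand.
impl AstNode for Call {
    fn children<'a>(&'a self, push: &mut dyn FnMut(&'a dyn AstNode)) {
        push(&*self.target);
        if let Some(type_arg) = &self.type_arg {
            push(type_arg);
        }
        for arg in &self.args {
            push(arg);
        }
        // `is_static` is a bool and is not traversed.
    }
    fn kind(&self) -> &'static str { "Call" }
}

/// Collects the kinds of a node's direct children, in field order.
fn child_kinds(node: &dyn AstNode) -> Vec<&'static str> {
    let mut kinds = Vec::new();
    node.children(&mut |child: &dyn AstNode| kinds.push(child.kind()));
    kinds
}
```

Children are pushed in field declaration order, so reordering fields in a struct changes traversal order; that is worth keeping in mind when output such as graph rendering depends on it.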

Excerpt from implementation (src/bsharp_syntax_derive/src/lib.rs):

#[proc_macro_derive(AstNode)]
pub fn derive_ast_node(input: TokenStream) -> TokenStream {
    // ...
    impl crate::node::ast_node::AstNode for #name {
        fn as_any(&self) -> &dyn ::core::any::Any { self }
        fn children<'a>(&'a self, push: &mut dyn FnMut(crate::node::ast_node::NodeRef<'a>)) {
            // Generated per-type based on fields
        }
    }
}

Helper routine decides how to push for common containers:

fn gen_push_for_type(ty: &Type, access: TokenStream) -> TokenStream {
    // Handles Option<T>, Vec<T>, Box<T>, or default to AST node push
}

Usage

Add the derive to your AST types in bsharp_syntax:

#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum Expression {
    Literal(Literal),
    Variable(Identifier),
    Invocation(Box<InvocationExpression>),
    // ...
}

This enables:

  • Graph rendering via to_text, to_mermaid, to_dot
  • Traversal via AstWalker/Visit or Query API (by way of NodeRef children)

Guidelines

  • Ensure child fields are typed as AST nodes or containers of AST nodes for traversal to work.
  • Keep primitive data out of traversal (the derive already skips standard primitives).
  • Favor Box<T> for recursive enum variants to keep sizes reasonable.

See Also

  • docs/syntax/traits.md – AstNode, NodeRef
  • docs/analysis/traversal-guide.md – traversal patterns
  • docs/development/query-cookbook.md – query examples

Formatter and Emitters

This page describes the formatting architecture in BSharp, implemented in the bsharp_syntax crate.


Overview

The formatter is an AST-driven emitter that produces the final C# text directly. There is no post-processing pass (no normalize_text): the output is exactly what emitters write.

  • Core types:
    • Formatter
    • FormatOptions
  • Emission is instrumentable via a JSONL trace for debugging and profiling.

FormatOptions

pub struct FormatOptions {
    pub indent_width: usize,                      // default: 4 spaces
    pub newline: &'static str,                    // "\n" or "\r\n"
    pub max_consecutive_blank_lines: u8,          // default: 1
    pub blank_line_between_members: bool,         // default: true
    pub ensure_final_newline: bool,               // default: true (emit one final newline if any content)
    pub trim_trailing_whitespace: bool,           // default: true
    pub instrument_emission: bool,                // default: false
    pub trace_file: Option<std::path::PathBuf>,   // optional JSONL output
    pub current_file: Option<std::path::PathBuf>, // helpful in messages
}
  • Newline mode is controlled by CLI --newline-mode or defaults to LF.
  • Emission tracing can be toggled via CLI --emit-trace or BSHARP_EMIT_TRACE=1.

Brace Style and Spacing Policy

  • Brace style: All containers and headers use Allman style

    • Header ends the line (e.g., namespace X, class C, void M())
    • Next line is an opening {, indented body, then closing } on its own line.
  • Spacing is centralized in simple policy helpers (see src/bsharp_syntax/src/emitters/policy.rs):

    • between_header_and_body_of_file → blank line between file header (e.g., file-scoped ns) and body
    • after_file_scoped_namespace_header → blank line after namespace X.Y;
    • between_using_blocks_and_declarations → blank line after using block before first declaration
    • between_top_level_declarations → single separator newline between top-level declarations
    • between_members → single separator newline between adjacent type members
    • between_block_items → optional extra newline inside a block when a control-flow block (if/for/while/do/switch/inner block) is followed by a declaration

Notes:

  • Policies are invoked from emitters; emitters themselves keep logic minimal and do not hardcode extra blank lines.
  • Interfaces, classes, structs, and records call between_members between members; the boolean blank_line_between_members toggles this globally.

End-of-file Newline

  • The CompilationUnit emitter ensures at most one final newline at EOF.
  • There are no per-statement trailing newlines at the root; separation is handled by policy functions.
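
The end-of-file rule can be illustrated at the string level. The sketch below is not BSharp's emitter code (the real emitter makes this decision while writing and has no post-processing pass); finish_output is a hypothetical helper that only demonstrates the "at most one final newline, and only after content" behaviour described above.

```rust
/// Hypothetical helper, not part of bsharp_syntax: trims any run of trailing
/// line breaks and, when requested, appends exactly one.
fn finish_output(mut text: String, newline: &str, ensure_final_newline: bool) -> String {
    // Drop every trailing '\n' or '\r' so that at most one newline can remain.
    let trimmed_len = text
        .trim_end_matches(|c: char| c == '\n' || c == '\r')
        .len();
    text.truncate(trimmed_len);

    // Empty output stays empty: a final newline is only added after content.
    if ensure_final_newline && !text.is_empty() {
        text.push_str(newline);
    }
    text
}
```

Passing "\r\n" as the newline argument mirrors the CRLF mode of FormatOptions::newline.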

Usage

use bsharp_syntax::{Formatter, FormatOptions};

let mut opts = FormatOptions::default();
opts.newline = "\n";
opts.max_consecutive_blank_lines = 1;
opts.blank_line_between_members = true;
opts.trim_trailing_whitespace = true;

let fmt = Formatter::new(opts);
let output = fmt.format_compilation_unit(&cu)?; // cu: CompilationUnit

Emission Trace (JSONL)

When instrumentation is enabled, the formatter emits a stream of JSON objects describing emission steps.

  • CLI integration:
    • --emit-trace to enable
    • --emit-trace-file <FILE> to write to a file (stdout by default)
    • Env var BSHARP_EMIT_TRACE=1 acts as a default toggle

The trace can be useful to:

  • Diagnose spacing/blank line decisions (look for action: "policy" with names like between_members, between_top_level_declarations, between_block_items)
  • Identify costly emission paths
  • Reproduce formatting anomalies

Typical actions include: enter_node, open_brace, close_brace, newline, space, token, and policy.


Integration with CLI

  • See bsharp format in docs/cli/format.md for options mapping to FormatOptions.
  • Files that fail to parse are skipped; a summary is printed.
  • With --write false on a single file input, the formatted output is printed to stdout.

Design Notes

  • Emitters are AST-driven to preserve structure while normalizing whitespace and layout based on policies.
  • The formatter avoids changing semantics and focuses on consistent style.
  • Options default to safe, conservative values and can be tuned via CLI.

Analysis Framework Overview

The BSharp analysis framework provides a comprehensive suite of tools for analyzing C# code at various levels of detail. It is built on top of the BSharp parser infrastructure and offers insights into code structure, quality, dependencies, and maintainability. These capabilities support standalone analysis tools and editor/CI integrations.

Analysis Architecture

The analysis framework is organized into specialized modules:

src/bsharp_analysis/src/
├── framework/        # pipeline, passes, registry, session, walker, query
├── passes/           # indexing, metrics, control_flow, dependencies, reporting
├── artifacts/        # symbols, cfg, dependencies
├── metrics/          # AstAnalysis data + shared helpers
├── rules/            # naming, semantic, control_flow_smells
└── report/           # AnalysisReport assembly

There is no separate quality module.

Analysis Capabilities

Control Flow Analysis

  • Path Analysis: Identify all possible execution paths through methods
  • Reachability: Detect unreachable code sections
  • Complexity Metrics: Calculate cyclomatic complexity and other flow-based metrics
  • Dead Code Detection: Find code that can never be executed

Dependency Analysis

  • Type Dependencies: Track relationships between types
  • Assembly Dependencies: Analyze external assembly usage
  • Circular Dependencies: Detect problematic dependency cycles
  • Coupling Metrics: Measure afferent and efferent coupling

Code Metrics

Comprehensive metrics collection across multiple dimensions:

Complexity Metrics

  • Cyclomatic Complexity
  • Cognitive Complexity
  • Nesting Depth
  • Method Length

Size Metrics

  • Lines of Code (LOC)
  • Source Lines of Code (SLOC)
  • Comment Lines
  • Method Count per Class

Maintainability Metrics

  • Maintainability Index
  • Technical Debt Indicators
  • Code Duplication Detection
  • Halstead Metrics

Rules

  • Naming Rules: Basic naming convention checks
  • Control Flow Smells: Simple flow-related smells (e.g., deep nesting warnings)

Type Analysis

  • Type Usage: Track how types are used throughout the codebase
  • Generic Analysis: Analyze generic type usage patterns
  • Inheritance Hierarchies: Map class and interface hierarchies
  • Interface Compliance: Validate interface implementations

Analysis Workflow

1. AST Preparation

All analysis begins with a parsed AST:

#![allow(unused)]
fn main() {
let parser = Parser::new();
let compilation_unit = parser.parse(source_code)?;
}

2. Pipeline

Use the framework pipeline with registered passes. Per-file runs populate typed artifacts; a final AnalysisReport summarizes metrics, control flow, and dependencies.

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::report::AnalysisReport;
use bsharp_parser::facade::Parser;

let parser = Parser::new();
let (cu, spans) = parser.parse_with_spans(source_code)?;
let ctx = AnalysisContext::new("file.cs", source_code);
let mut session = AnalysisSession::new(ctx, spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let report: AnalysisReport = AnalysisReport::from_session(&session);
}

3. Analysis Execution

The pipeline runs passes in phases:

  • Index → Metrics (local) → Global (CFG, deps) → Semantic rules → Reporting

Artifacts (e.g., AstAnalysis, ControlFlowIndex, DependencyGraph) are inserted into the AnalysisSession and consumed by reporting.
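
The phase ordering above can be sketched with simplified stand-ins. The Phase enum, Pass trait, and NamedPass struct below are assumptions made for illustration; the real AnalyzerPass and Phase definitions in bsharp_analysis may differ. The sketch only shows how deriving Ord on the phase enum plus a stable sort yields a deterministic execution order.

```rust
/// Simplified stand-in for the framework's phase enum (variant names assumed).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Index,
    LocalRules,
    Global,
    Semantic,
    Reporting,
}

/// Simplified stand-in for AnalyzerPass: only the id and the phase.
trait Pass {
    fn id(&self) -> &'static str;
    fn phase(&self) -> Phase;
}

struct NamedPass {
    id: &'static str,
    phase: Phase,
}

impl Pass for NamedPass {
    fn id(&self) -> &'static str {
        self.id
    }
    fn phase(&self) -> Phase {
        self.phase
    }
}

/// Returns pass ids in execution order: sorted by phase, keeping
/// registration order within a phase (sort_by_key is a stable sort).
fn execution_order(passes: &[Box<dyn Pass>]) -> Vec<&'static str> {
    let mut refs: Vec<&dyn Pass> = passes.iter().map(|p| &**p).collect();
    refs.sort_by_key(|p| p.phase());
    refs.iter().map(|p| p.id()).collect()
}
```

Because the sort is stable, two passes registered for the same phase run in the order they were registered.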

4. Results Processing

Analysis results are structured for easy consumption:

// Metrics results (report.metrics is an Option<AstAnalysis>)
if let Some(metrics) = &report.metrics {
    println!("Cyclomatic Complexity: {}", metrics.cyclomatic_complexity);
    println!("Lines of Code: {}", metrics.lines_of_code);
}

// Diagnostics
for d in &report.diagnostics.diagnostics {
    println!("{}: {}", d.code, d.message);
}

Analysis Registry and Passes

Analyses are implemented as AnalyzerPass implementations registered in an AnalyzerRegistry and executed by the AnalyzerPipeline. Local rulesets and semantic rulesets run alongside passes based on Phase.

Configuration and Customization

Analysis Configuration

Analyzers can be configured for different scenarios:

#![allow(unused)]
fn main() {
let config = AnalysisConfig {
    max_cyclomatic_complexity: 10,
    max_method_length: 50,
    enforce_naming_conventions: true,
    detect_code_smells: true,
    // ... other configuration options
};

let analyzer = MetricsAnalyzer::with_config(config);
}

Custom Rules

Extend analysis with custom rules:

#![allow(unused)]
fn main() {
let custom_analyzer = QualityAnalyzer::new()
    .add_rule(CustomRule::new("no-goto-statements"))
    .add_rule(CustomRule::new("max-parameters", 5))
    .add_rule(CustomRule::new("prefer-composition"));
}

Reporting Options

Flexible reporting formats:

#![allow(unused)]
fn main() {
// JSON output
let json_report = analyzer.analyze(&ast).to_json();

// XML output
let xml_report = analyzer.analyze(&ast).to_xml();

// Custom format
let custom_report = analyzer.analyze(&ast).format_with(custom_formatter);
}

Integration Points

CLI Integration

Analysis capabilities are exposed through the analyze command and configured via options (format, config, include/exclude, enable/disable passes and rulesets, severity overrides). See docs/cli/analyze.md for details.

Programmatic Usage

Direct integration in tools typically runs the pipeline and pulls artifacts from the session:

use std::fs;

use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::{AnalyzerPipeline, AnalysisSession};
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

// `path` is the location of a C# source file.
let source = fs::read_to_string(path)?;
let (cu, spans) = Parser::new().parse_with_spans(&source)?;
let mut session = AnalysisSession::new(AnalysisContext::new(path, &source), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
// Artifacts are looked up by type; a missing artifact yields None.
if let Some(ast) = session.artifacts.get::<AstAnalysis>() {
    println!("methods={} complexity={}", ast.total_methods, ast.cyclomatic_complexity);
}

Performance Characteristics

Analysis Performance

  • Incremental Analysis: Support for analyzing only changed parts
  • Parallel Processing: Multi-threaded analysis for large codebases
  • Memory Efficiency: Minimal memory overhead during analysis
  • Caching: Results caching for repeated analysis

Scalability

The framework scales from single files to large enterprise codebases:

  • Single file analysis: Sub-second performance
  • Medium projects (100+ files): Seconds to minutes
  • Large codebases (1000+ files): Minutes with parallel processing

This analysis framework provides the foundation for building sophisticated code quality tools, IDE integrations, and automated code review systems.


Analysis Pipeline

This document describes the analysis pipeline architecture, artifacts, rulesets, configuration toggles, and determinism guarantees in the BSharp analyzer.

Phases

The pipeline runs in deterministic phases (see src/bsharp_analysis/src/framework/pipeline.rs):

  • Index
    • Runs early passes like IndexingPass to populate core artifacts (SymbolIndex, NameIndex, FqnMap).
  • Local Rules
    • Runs per-file passes such as MetricsPass (Query-based) to compute artifacts like AstAnalysis.
    • Local rulesets run here as well; use bsharp_analysis::framework::Query for AST enumeration.
  • Global
    • Passes that aggregate information across the file (or project) after initial indexing.
  • Semantic
    • Rules and passes that require previously built artifacts (e.g., control flow, dependencies).
  • Reporting
    • Finalization phase that can synthesize report artifacts.

Each phase is explicitly selected in AnalyzerPipeline::run_for_file() using Phase discriminants. Pass and ruleset registration is driven by AnalyzerRegistry.

Artifacts

Artifacts are stored in the per-file AnalysisSession.artifacts and summarized into an AnalysisReport:

  • Symbols (src/bsharp_analysis/src/artifacts/symbols.rs)
    • SymbolIndex (by id and name), NameIndex (name frequencies), FqnMap (local name → FQNs).
  • Control Flow (src/bsharp_analysis/src/artifacts/cfg.rs)
    • ControlFlowIndex keyed per method; summarized to CfgSummary with total methods and smell counts.
  • Dependencies (src/bsharp_analysis/src/artifacts/dependencies.rs)
    • Graph keyed by symbols; summarized to node/edge counts.
  • Metrics (src/bsharp_analysis/src/artifacts/metrics.rs – AstAnalysis)
    • Basic metrics gathered during the local traversal.

Artifacts are optional in the final report; missing artifacts simply result in None summaries.

Rulesets and Passes

Rules implement the Rule trait and are grouped into logical rulesets. Passes implement AnalyzerPass and declare a Phase:

  • Rulesets are separated into Local vs. Semantic groups and executed during the respective phases.
  • Passes can be toggled individually by id.
  • The registry is created with AnalyzerRegistry::from_config(&AnalysisConfig) to honor config toggles.
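
As a rough illustration of id-based toggles, the sketch below filters a list of pass ids through a HashMap<String, bool> shaped like enable_passes. The helper names and the "enabled unless explicitly disabled" default are assumptions; AnalyzerRegistry::from_config may resolve missing entries differently.

```rust
use std::collections::HashMap;

/// Hypothetical helper: a pass runs unless the toggle map explicitly
/// disables it. The enabled-by-default rule is an assumption.
fn is_enabled(toggles: &HashMap<String, bool>, id: &str) -> bool {
    toggles.get(id).copied().unwrap_or(true)
}

/// Filters registered pass ids down to those that should run,
/// preserving registration order.
fn active_passes<'a>(registered: &[&'a str], toggles: &HashMap<String, bool>) -> Vec<&'a str> {
    registered
        .iter()
        .copied()
        .filter(|id| is_enabled(toggles, id))
        .collect()
}
```

The same lookup shape would apply to enable_rulesets, since both maps are keyed by id.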

Configuration

AnalysisConfig (src/bsharp_analysis/src/context.rs) controls thresholds and toggles:

  • Control flow thresholds
    • cf_high_complexity_threshold (default 10)
    • cf_deep_nesting_threshold (default 4)
  • Toggles
    • enable_rulesets: HashMap<String, bool>
    • enable_passes: HashMap<String, bool>
    • rule_severities: HashMap<String, DiagnosticSeverity>
  • Workspace filters
    • workspace.follow_refs: bool
    • workspace.include: Vec<String> (glob patterns)
    • workspace.exclude: Vec<String> (glob patterns)

CLI maps flags to these fields in src/bsharp_cli/src/commands/analyze.rs and supports TOML/JSON config files.

Workspace Analysis and Determinism

AnalyzerPipeline::run_workspace() and run_workspace_with_config():

  • Discover files deterministically by sorting absolute paths and deduping.
  • Analyze each file independently, then merge artifacts into a single AnalysisReport.
  • Diagnostics are sorted by file, line, column, then diagnostic code for stable output.
  • Workspace loader warnings/errors are merged into workspace_warnings (sorted, deduped).
  • When the parallel_analysis feature is enabled, files are analyzed in parallel but merged deterministically in path order.
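
The two ordering rules above (sorted, deduplicated paths and diagnostics ordered by file, line, column, code) can be sketched as follows. Diag is a simplified, hypothetical record; the real diagnostic type in bsharp_analysis carries more fields.

```rust
use std::path::PathBuf;

/// Simplified diagnostic record used only for this illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Diag {
    file: String,
    line: u32,
    column: u32,
    code: String,
}

/// Sorts and dedups discovered paths so discovery order never leaks into output.
fn stable_file_list(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    paths.sort();
    paths.dedup(); // adjacent duplicates only, hence the sort first
    paths
}

/// Orders diagnostics by file, line, column, then diagnostic code.
fn sort_diagnostics(diags: &mut [Diag]) {
    diags.sort_by(|a, b| {
        (&a.file, a.line, a.column, &a.code).cmp(&(&b.file, b.line, b.column, &b.code))
    });
}
```

Comparing tuples gives the lexicographic "file, then line, then column, then code" order without hand-written tie-breaking.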

Report Schema

AnalysisReport (src/bsharp_analysis/src/report/mod.rs) includes:

  • schema_version: u32 (currently 1)
  • diagnostics: DiagnosticCollection
  • metrics: Option<AstAnalysis>
  • cfg: Option<CfgSummary>
  • deps: Option<DependencySummary>
  • workspace_warnings: Vec<String>
  • workspace_errors: Vec<String> (reserved for future use)

The JSON shape is intentionally stable; tests use snapshots with path normalization to ensure cross-platform consistency.

Testing Guidance

  • Prefer deterministic fixtures under tests/fixtures/.
  • Normalize absolute paths in snapshots (see tests/integration/workspace_analysis_snapshot.rs).
  • For workspace filtering, use run_workspace_with_config() with include/exclude globs and snapshot the resulting report.

Analysis Traversal Guide

This guide explains how to traverse BSharp AST statements and expressions in analysis passes using the current framework.

  • Source files:
    • src/bsharp_analysis/src/framework/walker.rs
    • src/bsharp_analysis/src/framework/query/
    • src/bsharp_analysis/src/passes/*

Statement traversal

Use AstWalker for single-pass traversal with the Visit trait, or the Query API for typed filtering.

Example using AstWalker + Visit to count if statements:

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{AstWalker, Visit, NodeRef, AnalysisSession};

struct CountIfs { pub ifs: usize }
impl Visit for CountIfs {
    fn enter(&mut self, node: &NodeRef, _session: &mut AnalysisSession) {
        if let NodeRef::Statement(s) = node {
            if matches!(s, bsharp_syntax::statements::statement::Statement::If(_)) {
                self.ifs += 1;
            }
        }
    }
}
}

Expression traversal

Use Query for typed expression searches:

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::expressions::AwaitExpression;

let await_count = Query::from(NodeRef::CompilationUnit(&cu))
    .of::<AwaitExpression>()
    .count();
}

Putting it together

When analyzing methods, you typically:

  • Parse the compilation unit and build the analysis session.
  • For each method body (a Statement::Block), compute metrics by walking statements and expressions.

Example (from ControlFlowPass pattern):

#![allow(unused)]
fn main() {
use bsharp_analysis::artifacts::cfg::{ControlFlowIndex, MethodControlFlowStats};
use bsharp_syntax::statements::statement::Statement;

fn stats_for_method(body: Option<&Statement>) -> MethodControlFlowStats {
    let complexity = match body { Some(s) => 1 + decision_points(s), None => 1 };
    let max_nesting = calc_max_nesting(body, 0);
    let exit_points = count_exit_points(body);
    let statement_count = count_statements(body);
    MethodControlFlowStats { complexity, max_nesting, exit_points, statement_count }
}
}

See src/bsharp_analysis/src/metrics/shared.rs for helpers such as decision_points, max_nesting_of, and count_statements, and src/bsharp_analysis/src/passes/control_flow.rs for how they are used.


Tips

  • Keep walkers side-effect free; accumulate results in closures.
  • Prefer small, focused passes that use the walkers rather than embedding traversal in each pass.
  • If a construct is not being traversed, add it to the walker first to avoid duplicated traversal logic.

Control Flow Analysis

The control flow analysis system analyzes method control flow to calculate complexity metrics, detect control flow smells, and identify potential issues.


Overview

Location: src/bsharp_analysis/src/passes/control_flow.rs, src/bsharp_analysis/src/artifacts/cfg.rs

Control flow analysis provides:

  • Cyclomatic complexity calculation
  • Maximum nesting depth tracking
  • Exit point counting
  • Statement counting
  • Control flow smell detection

Control Flow Metrics

Cyclomatic Complexity

Definition: Number of linearly independent paths through a method

Calculation: CC = 1 + number of decision points

Decision Points:

  • if statements
  • case labels in switch
  • Loop statements (for, foreach, while, do-while)
  • catch clauses
  • Logical operators (&&, ||) in conditions
  • Ternary operators (?:)
  • Null-coalescing operators (??)

Example:

public void ProcessOrder(Order order) {  // CC = 1 (base)
    if (order == null) {                 // +1 = 2
        throw new ArgumentNullException();
    }
    
    if (order.IsValid) {                 // +1 = 3
        if (order.Amount > 1000) {       // +1 = 4
            ApplyDiscount(order);
        }
        SaveOrder(order);
    }
}
// Total CC = 4
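
The counting rule can be sketched on a toy statement tree. Stmt below is a hypothetical, heavily reduced enum (BSharp's real Statement type is much richer, and its decision_points helper also handles case labels, catch clauses, and operators); the sketch only shows the recursive "1 + decision points" shape.

```rust
/// Toy statement tree for illustration only.
enum Stmt {
    If { then_branch: Vec<Stmt>, else_branch: Vec<Stmt> },
    While(Vec<Stmt>),
    Other,
}

/// Counts decision points recursively: each `if` and each loop adds one.
fn decision_points(stmts: &[Stmt]) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::If { then_branch, else_branch } => {
                1 + decision_points(then_branch) + decision_points(else_branch)
            }
            Stmt::While(body) => 1 + decision_points(body),
            Stmt::Other => 0,
        })
        .sum()
}

/// CC = 1 + number of decision points.
fn cyclomatic_complexity(body: &[Stmt]) -> usize {
    1 + decision_points(body)
}
```

Modelling ProcessOrder as two top-level if statements, the second containing one nested if, gives three decision points and therefore a complexity of 4, matching the annotated example.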

Maximum Nesting Depth

Definition: Deepest level of nested control structures

Example:

public void Example() {
    if (condition1) {              // Depth 1
        while (condition2) {       // Depth 2
            if (condition3) {      // Depth 3
                DoSomething();
            }
        }
    }
}
// Max Nesting Depth = 3
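
A minimal sketch of the depth calculation, assuming a toy tree in which Control stands for any nesting construct (if, loop, and so on) and Leaf for a simple statement. The names are hypothetical and unrelated to BSharp's max_nesting_of helper.

```rust
/// Toy tree: `Control` is any nesting construct, `Leaf` is a simple statement.
enum Node {
    Control(Vec<Node>),
    Leaf,
}

/// Returns the deepest control-structure nesting reachable from `nodes`,
/// where `depth` is the nesting level of the enclosing block.
fn max_nesting(nodes: &[Node], depth: usize) -> usize {
    nodes
        .iter()
        .map(|n| match n {
            // Entering a control structure increases the depth by one.
            Node::Control(children) => max_nesting(children, depth + 1),
            Node::Leaf => depth,
        })
        .max()
        // An empty block still sits at the depth of its enclosing construct.
        .unwrap_or(depth)
}
```

For the Example method above (if containing while containing if), calling max_nesting on the method body with depth 0 yields 3.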

Exit Points

Definition: Number of points at which a method can exit

Counted:

  • return statements
  • throw statements
  • End of void method

Example:

public int Calculate(int x) {
    if (x < 0) {
        return -1;        // Exit point 1
    }
    if (x == 0) {
        return 0;         // Exit point 2
    }
    return x * 2;         // Exit point 3
}
// Total Exit Points = 3
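
One way to read the counting rules is sketched below on a toy tree. The enum S and both functions are hypothetical; in particular, treating "end of void method" as one extra exit whenever the body does not end in return or throw is an assumption about how the rule is applied, not a description of count_exit_points.

```rust
/// Toy statement tree; only the shapes needed for counting exits.
enum S {
    Return,
    Throw,
    If(Vec<S>),
    Other,
}

/// Counts explicit exits (`return` and `throw`) at any nesting level.
fn explicit_exits(stmts: &[S]) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            S::Return | S::Throw => 1,
            S::If(body) => explicit_exits(body),
            S::Other => 0,
        })
        .sum()
}

/// Adds one implicit exit when the body does not end in `return` or `throw`
/// (for example, a void method that falls off the end).
fn exit_points(stmts: &[S]) -> usize {
    let falls_off_end = !matches!(stmts.last(), Some(S::Return | S::Throw));
    explicit_exits(stmts) + usize::from(falls_off_end)
}
```

Modelling Calculate as two if blocks that each return, followed by a final return, gives three exit points, as in the example.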

Statement Count

Definition: Total number of statements in method body

Includes all statement types:

  • Expression statements
  • Declaration statements
  • Control flow statements
  • Jump statements

Control Flow Artifacts

MethodControlFlowStats

#![allow(unused)]
fn main() {
pub struct MethodControlFlowStats {
    pub complexity: usize,
    pub max_nesting: usize,
    pub exit_points: usize,
    pub statement_count: usize,
}
}

ControlFlowIndex

#![allow(unused)]
fn main() {
pub struct ControlFlowIndex {
    // Method identifier -> stats
    methods: HashMap<String, MethodControlFlowStats>,
}
}

CfgSummary

#![allow(unused)]
fn main() {
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_count: usize,
    pub deep_nesting_count: usize,
}
}

Control Flow Smells

High Complexity

Threshold: Configurable (default: 10)

Detection:

if stats.complexity > config.cf_high_complexity_threshold {
    session.diagnostics.add(
        DiagnosticCode::HighComplexity,
        format!("Method complexity {} exceeds threshold {}",
               stats.complexity, config.cf_high_complexity_threshold)
    );
}

Diagnostic:

warning[CF002]: High cyclomatic complexity
  --> src/OrderProcessor.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ complexity = 15 (threshold: 10)
   |
   = help: Consider breaking this method into smaller methods

Deep Nesting

Threshold: Configurable (default: 4)

Detection:

if stats.max_nesting > config.cf_deep_nesting_threshold {
    session.diagnostics.add(
        DiagnosticCode::DeepNesting,
        format!("Maximum nesting depth {} exceeds threshold {}",
               stats.max_nesting, config.cf_deep_nesting_threshold)
    );
}

Diagnostic:

warning[CF003]: Deep nesting detected
  --> src/Validator.cs:15:9
   |
15 |         if (condition1) {
   |         ^^ nesting depth = 5 (threshold: 4)
   |
   = help: Consider extracting nested logic into separate methods

Implementation

Analysis Pass

Location: src/bsharp_analysis/src/passes/control_flow.rs

#![allow(unused)]
fn main() {
pub struct ControlFlowPass;

impl AnalyzerPass for ControlFlowPass {
    fn id(&self) -> &'static str { "control_flow" }
    fn phase(&self) -> Phase { Phase::Semantic }
    
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        let mut index = ControlFlowIndex::new();
        
        // Analyze all methods in compilation unit
        for decl in &cu.declarations {
            analyze_declaration(decl, &mut index, session);
        }
        
        session.artifacts.cfg = Some(index);
    }
}
}

Method Analysis

#![allow(unused)]
fn main() {
fn analyze_method(
    method: &MethodDeclaration,
    index: &mut ControlFlowIndex,
    session: &mut AnalysisSession
) {
    let stats = calculate_stats(method.body.as_ref());
    
    // Check thresholds
    if stats.complexity > session.config.cf_high_complexity_threshold {
        session.diagnostics.add(/* high complexity diagnostic */);
    }
    
    if stats.max_nesting > session.config.cf_deep_nesting_threshold {
        session.diagnostics.add(/* deep nesting diagnostic */);
    }
    
    // Store in index
    index.add_method(&method.identifier.name, stats);
}
}

Stats Calculation

#![allow(unused)]
fn main() {
fn calculate_stats(body: Option<&Statement>) -> MethodControlFlowStats {
    let complexity = match body {
        Some(stmt) => 1 + count_decision_points(stmt),
        None => 1,
    };
    
    let max_nesting = calculate_max_nesting(body, 0);
    let exit_points = count_exit_points(body);
    let statement_count = count_statements(body);
    
    MethodControlFlowStats {
        complexity,
        max_nesting,
        exit_points,
        statement_count,
    }
}
}

Configuration

Thresholds

[analysis.control_flow]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4

CLI Usage

# Analyze with custom thresholds
bsharp analyze MyProject.csproj --config .bsharp.toml

# Enable control flow pass
bsharp analyze MyProject.csproj --enable-pass control_flow

Integration with Pipeline

Phase: Semantic

Control flow analysis runs in the Semantic phase after symbol indexing:

Phase::Index    -> Build SymbolIndex
Phase::Local    -> Collect metrics
Phase::Semantic -> Control flow analysis

Artifacts

Results stored in AnalysisSession:

#![allow(unused)]
fn main() {
session.artifacts.cfg = Some(ControlFlowIndex { ... });
}

Summarized in AnalysisReport:

#![allow(unused)]
fn main() {
report.cfg = Some(CfgSummary {
    total_methods: 87,
    high_complexity_count: 5,
    deep_nesting_count: 3,
});
}


References

  • Implementation: src/bsharp_analysis/src/passes/control_flow.rs
  • Artifacts: src/bsharp_analysis/src/artifacts/cfg.rs
  • Tests: src/bsharp_tests/src/analysis/control_flow/ (planned)

Dependency Analysis

The dependency analysis system tracks relationships between types, methods, and other symbols in C# code to identify coupling, circular dependencies, and architectural issues.


Overview

Location: src/bsharp_analysis/src/artifacts/dependencies.rs

The dependency analysis builds a directed graph of symbol relationships, where:

  • Nodes represent symbols (classes, interfaces, methods, etc.)
  • Edges represent dependencies (inheritance, method calls, field types, etc.)

Dependency Types

Type Dependencies

Inheritance:

public class Derived : Base { }  // Derived depends on Base

Interface Implementation:

public class MyClass : IInterface { }  // MyClass depends on IInterface

Field Types:

public class Container {
    private Helper helper;  // Container depends on Helper
}

Method Parameters and Return Types:

public Response Process(Request req) { }  // Process depends on Request and Response

Member Dependencies

Method Calls:

public void Caller() {
    Helper.DoSomething();  // Caller depends on Helper.DoSomething
}

Property Access:

var value = obj.Property;  // Depends on Property

Constructor Calls:

var instance = new MyClass();  // Depends on MyClass constructor

Dependency Graph Structure

DependencyGraph

#![allow(unused)]
fn main() {
pub struct DependencyGraph {
    // Symbol ID -> list of symbols it depends on
    dependencies: HashMap<SymbolId, Vec<SymbolId>>,
}
}

Operations

Adding Dependencies:

#![allow(unused)]
fn main() {
graph.add_dependency(from_symbol, to_symbol);
}

Querying Dependencies:

#![allow(unused)]
fn main() {
// Direct dependencies
let deps = graph.get_dependencies(symbol_id);

// Transitive dependencies
let all_deps = graph.get_transitive_dependencies(symbol_id);

// Reverse dependencies (who depends on this symbol)
let dependents = graph.get_dependents(symbol_id);
}

Circular Dependency Detection

Algorithm

The analysis uses depth-first search to detect cycles in the dependency graph:

  1. Start from each symbol
  2. Traverse dependencies depth-first
  3. Track visited nodes in current path
  4. If we revisit a node in current path, cycle detected
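
The four steps above can be sketched as a small depth-first search. This is an illustration under simplifying assumptions, not the code behind graph.find_cycles(): symbols are plain string names rather than SymbolId values, the Graph alias and function names are hypothetical, and only the first cycle found is returned.

```rust
use std::collections::{BTreeMap, HashSet};

/// Dependency edges: symbol -> symbols it depends on. A BTreeMap keeps the
/// iteration order, and therefore the reported cycle, deterministic.
type Graph = BTreeMap<&'static str, Vec<&'static str>>;

/// Returns the first cycle found as a path such as ["A", "B", "C", "A"].
fn find_cycle(graph: &Graph) -> Option<Vec<&'static str>> {
    let mut done = HashSet::new(); // nodes fully explored and known cycle-free
    for &start in graph.keys() {
        let mut path = Vec::new();
        if let Some(cycle) = visit(graph, start, &mut path, &mut done) {
            return Some(cycle);
        }
    }
    None
}

fn visit(
    graph: &Graph,
    node: &'static str,
    path: &mut Vec<&'static str>,
    done: &mut HashSet<&'static str>,
) -> Option<Vec<&'static str>> {
    if let Some(pos) = path.iter().position(|&n| n == node) {
        // Revisited a node on the current path: slice out the cycle.
        let mut cycle = path[pos..].to_vec();
        cycle.push(node);
        return Some(cycle);
    }
    if done.contains(node) {
        return None;
    }
    path.push(node);
    // Symbols without an entry are treated as having no dependencies.
    for &next in graph.get(node).map(|v| v.as_slice()).unwrap_or(&[]) {
        if let Some(cycle) = visit(graph, next, path, done) {
            return Some(cycle);
        }
    }
    path.pop();
    done.insert(node);
    None
}
```

The done set prevents re-exploring subgraphs that are already known to be cycle-free, which keeps the search linear in the number of nodes and edges.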

Example

public class A {
    private B b;  // A depends on B
}

public class B {
    private C c;  // B depends on C
}

public class C {
    private A a;  // C depends on A -> CYCLE: A -> B -> C -> A
}

Detection:

#![allow(unused)]
fn main() {
let cycles = graph.find_cycles();
for cycle in cycles {
    // Report diagnostic for circular dependency
    session.diagnostics.add(
        DiagnosticCode::CircularDependency,
        format!("Circular dependency detected: {:?}", cycle)
    );
}
}

Coupling Metrics

Afferent Coupling (Ca)

Definition: Number of types that depend on this type (incoming dependencies)

Interpretation:

  • High Ca = Many types depend on this type (responsibility)
  • Type is stable and hard to change

Efferent Coupling (Ce)

Definition: Number of types this type depends on (outgoing dependencies)

Interpretation:

  • High Ce = This type depends on many others
  • Type is unstable and sensitive to changes

Instability (I)

Formula: I = Ce / (Ca + Ce)

Range: 0.0 to 1.0

  • 0.0 = Maximally stable (only incoming dependencies)
  • 1.0 = Maximally unstable (only outgoing dependencies)

Example:

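let ca = graph.afferent_coupling(symbol_id);
let ce = graph.efferent_coupling(symbol_id);
// Guard against isolated symbols: Ca + Ce == 0 would otherwise yield NaN.
// Treating them as stable (0.0) is a convention chosen for this example.
let instability = if ca + ce == 0 {
    0.0
} else {
    ce as f64 / (ca + ce) as f64
};

if instability > 0.8 {
    // Highly unstable type - consider refactoring
}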

Dependency Summary

DependencySummary

#![allow(unused)]
fn main() {
pub struct DependencySummary {
    pub total_nodes: usize,
    pub total_edges: usize,
    pub circular_dependencies: usize,
    pub max_depth: usize,
}
}

Generated by: DependencyGraph::summarize()

Included in: AnalysisReport


Usage in Analysis Pipeline

Phase: Global

Dependency analysis runs in the Global phase after symbol indexing:

// In AnalyzerPipeline
Phase::Index    -> Build SymbolIndex
Phase::Global   -> Build DependencyGraph
Phase::Semantic -> Use dependencies for semantic analysis

Integration with Passes

DependencyPass (if implemented):

#![allow(unused)]
fn main() {
impl AnalyzerPass for DependencyPass {
    fn id(&self) -> &'static str { "dependencies" }
    fn phase(&self) -> Phase { Phase::Global }
    
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        let graph = build_dependency_graph(cu, &session.artifacts.symbols);
        session.artifacts.dependencies = Some(graph);
    }
}
}

Building Dependency Graph

From CompilationUnit

#![allow(unused)]
fn main() {
pub fn build_dependency_graph(
    cu: &CompilationUnit,
    symbols: &SymbolIndex
) -> DependencyGraph {
    let mut graph = DependencyGraph::new();
    
    // Visit all declarations
    for decl in &cu.declarations {
        match decl {
            TopLevelDeclaration::Class(class) => {
                analyze_class_dependencies(class, symbols, &mut graph);
            }
            // ... other declaration types
        }
    }
    
    graph
}
}

From Class Declaration

#![allow(unused)]
fn main() {
fn analyze_class_dependencies(
    class: &ClassDeclaration,
    symbols: &SymbolIndex,
    graph: &mut DependencyGraph
) {
    let class_symbol = symbols.lookup(&class.identifier.name);
    
    // Base types
    for base_type in &class.base_types {
        if let Some(base_symbol) = resolve_type(base_type, symbols) {
            graph.add_dependency(class_symbol, base_symbol);
        }
    }
    
    // Members
    for member in &class.body_declarations {
        analyze_member_dependencies(member, class_symbol, symbols, graph);
    }
}
}

Dependency Visualization

Dependency Matrix

Generate a matrix showing which types depend on which:

        A  B  C  D
    A   -  X  -  X
    B   -  -  X  -
    C   X  -  -  -
    D   -  -  -  -
  • Row A, Column B = X means A depends on B
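
A matrix like this can be derived from a plain edge list. The function below is a hypothetical sketch (the visualization tools are listed as not yet implemented), using type names as strings and "X" for "row depends on column".

```rust
/// Builds matrix rows from (from, to) dependency edges. `names` fixes the
/// row and column order; "X" marks "row depends on column".
fn dependency_matrix(names: &[&str], edges: &[(&str, &str)]) -> Vec<String> {
    names
        .iter()
        .map(|row| {
            let cells: Vec<&str> = names
                .iter()
                .map(|col| if edges.contains(&(*row, *col)) { "X" } else { "-" })
                .collect();
            // Row label, then one cell per column.
            format!("{}   {}", row, cells.join("  "))
        })
        .collect()
}
```

Called with the names A, B, C, D and the edges A→B, A→D, B→C, C→A, it reproduces the four rows shown above (without the column header line).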

Dependency Tree

MyApp
├── Services
│   ├── UserService
│   │   ├── IUserRepository
│   │   └── IEmailService
│   └── OrderService
│       ├── IOrderRepository
│       └── IPaymentService
└── Models
    ├── User
    └── Order

Diagnostics

Circular Dependency Warning

warning[DEP001]: Circular dependency detected
  --> src/ClassA.cs:3:5
   |
 3 |     private ClassB b;
   |             ^^^^^^ ClassA depends on ClassB
   |
   = note: Dependency cycle: ClassA -> ClassB -> ClassC -> ClassA

High Coupling Warning

warning[DEP002]: High efferent coupling detected
  --> src/GodClass.cs:1:14
   |
 1 | public class GodClass {
   |              ^^^^^^^^ depends on 25 other types
   |
   = help: Consider breaking this class into smaller, focused classes

Unstable Dependency Warning

warning[DEP003]: Stable type depends on unstable type
  --> src/StableClass.cs:5:5
   |
 5 |     private UnstableClass helper;
   |             ^^^^^^^^^^^^^ instability = 0.95
   |
   = note: Stable types (instability < 0.2) should not depend on unstable types (instability > 0.8)

Configuration

Thresholds

[analysis.dependencies]
max_efferent_coupling = 20
max_afferent_coupling = 10
max_instability = 0.8
warn_circular_dependencies = true

CLI Usage

# Analyze dependencies
bsharp analyze MyProject.csproj --enable-pass dependencies

# Generate dependency report
bsharp analyze MyProject.sln --out deps.json --format pretty-json

Future Enhancements

Planned Features

  1. Package-Level Dependencies

    • Track dependencies between namespaces/assemblies
    • Identify layering violations
  2. Dependency Metrics Dashboard

    • Visual dependency graphs
    • Coupling heatmaps
    • Trend analysis over time
  3. Architectural Rules

    • Define allowed/forbidden dependencies
    • Enforce layered architecture
    • Prevent specific coupling patterns
  4. Dependency Injection Analysis

    • Track DI container registrations
    • Verify dependency lifetimes
    • Detect missing registrations

Implementation Status

Current State:

  • Basic dependency graph structure defined
  • Integration with analysis pipeline planned
  • Circular dependency detection algorithm ready

TODO:

  • Implement full dependency extraction from AST
  • Add coupling metrics calculation
  • Create dependency visualization tools
  • Add comprehensive tests


References

  • Implementation: src/bsharp_analysis/src/artifacts/dependencies.rs
  • Tests: src/bsharp_tests/src/analysis/dependencies/ (planned)
  • Related Passes: src/bsharp_analysis/src/passes/ (when implemented)

Metrics Collection

The BSharp metrics system collects comprehensive code metrics during analysis to assess code complexity, size, and maintainability.


Overview

Location: src/bsharp_analysis/src/metrics/

The metrics system provides:

  • Basic Metrics - Lines of code, statement counts, declaration counts
  • Complexity Metrics - Cyclomatic complexity, cognitive complexity, nesting depth
  • Maintainability Metrics (planned) - Maintainability index, Halstead metrics

Architecture

Core Components

src/bsharp_analysis/src/metrics/
├── core.rs     # AstAnalysis data structure (aggregated counts)
└── shared.rs   # Helpers: decision_points, max_nesting_of, count_statements, etc.

How metrics are produced

  • MetricsPass runs in Phase::LocalRules and computes an AstAnalysis artifact using the Query API to enumerate declarations, plus lightweight walkers for statement counts.
  • Access AstAnalysis from AnalysisSession after running the pipeline.

Metric Types

1. Basic Metrics

AstAnalysis Structure:

#![allow(unused)]
fn main() {
pub struct AstAnalysis {
    // Size metrics
    pub total_lines: usize,
    pub code_lines: usize,
    pub comment_lines: usize,
    pub blank_lines: usize,
    
    // Declaration counts
    pub namespace_count: usize,
    pub class_count: usize,
    pub interface_count: usize,
    pub struct_count: usize,
    pub enum_count: usize,
    pub method_count: usize,
    pub property_count: usize,
    pub field_count: usize,
    
    // Statement counts
    pub statement_count: usize,
    pub expression_count: usize,
    
    // Complexity (aggregated)
    pub total_complexity: usize,
    pub max_complexity: usize,
    pub max_nesting_depth: usize,
}
}

2. Complexity Metrics

Cyclomatic Complexity

Definition: Number of linearly independent paths through code

Formula: CC = E - N + 2P

  • E = edges in control flow graph
  • N = nodes in control flow graph
  • P = connected components (usually 1)

Simplified: CC = 1 + number of decision points

Decision Points:

  • if, else if
  • case in switch
  • for, foreach, while, do-while
  • &&, || in conditions
  • catch clauses
  • ?: ternary operator
  • ?? null-coalescing operator

Example:

public void ProcessOrder(Order order) {  // CC = 1 (base)
    if (order == null) {                 // +1 = 2
        throw new ArgumentNullException();
    }
    
    if (order.IsValid) {                 // +1 = 3
        if (order.Amount > 1000) {       // +1 = 4
            ApplyDiscount(order);
        }
        SaveOrder(order);
    } else {                             // else doesn't add
        LogError(order);
    }
}
// Total CC = 4

Implementation:

#![allow(unused)]
fn main() {
pub fn cyclomatic_complexity(method: &MethodDeclaration) -> usize {
    let mut complexity = 1;  // Base complexity
    
    if let Some(body) = &method.body {
        complexity += count_decision_points(body);
    }
    
    complexity
}

fn count_decision_points(stmt: &Statement) -> usize {
    let mut count = 0;
    
    walk_statements(stmt, &mut |s| {
        match s {
            Statement::If(_) => count += 1,
            Statement::For(_) => count += 1,
            Statement::ForEach(_) => count += 1,
            Statement::While(_) => count += 1,
            Statement::DoWhile(_) => count += 1,
            Statement::Switch(sw) => {
                // Each case is a decision point
                count += sw.sections.len();
            }
            Statement::Try(try_stmt) => {
                // Each catch is a decision point
                count += try_stmt.catch_clauses.len();
            }
            _ => {}
        }
    });
    
    // Also count logical operators in expressions
    // count += count_logical_operators(stmt);
    
    count
}
}
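
The `count_logical_operators` step is only hinted at in the comment above. A rough sketch of what such a counter could look like, using a hypothetical, simplified `Expr` enum rather than the real BSharp expression AST:

```rust
/// Hypothetical, simplified expression tree (not the BSharp AST).
pub enum Expr {
    Literal,
    Binary { op: &'static str, left: Box<Expr>, right: Box<Expr> },
    Conditional { cond: Box<Expr>, then: Box<Expr>, otherwise: Box<Expr> },
}

/// Counts `&&`, `||`, `??` and `?:` occurrences; each adds one decision point.
pub fn count_logical_operators(expr: &Expr) -> usize {
    match expr {
        Expr::Literal => 0,
        Expr::Binary { op, left, right } => {
            // Only short-circuiting and null-coalescing operators count.
            let own = usize::from(matches!(*op, "&&" | "||" | "??"));
            own + count_logical_operators(left) + count_logical_operators(right)
        }
        Expr::Conditional { cond, then, otherwise } => {
            1 + count_logical_operators(cond)
                + count_logical_operators(then)
                + count_logical_operators(otherwise)
        }
    }
}
```

Arithmetic operators such as `+` contribute nothing, so `a && (b || c)` counts as two decision points.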

Thresholds:

  • 1-10: Simple, low risk
  • 11-20: Moderate complexity, moderate risk
  • 21-50: Complex, high risk
  • 51+: Very complex, very high risk; refactoring recommended

Cognitive Complexity

Definition: Measure of how difficult code is to understand

Increments:

  • +1 for each: if, else if, switch, for, foreach, while, do-while, catch, ?:, ??
  • +1 for each level of nesting (nested control structures)
  • +1 for each break or continue that jumps out of nested structure
  • +1 for each recursive call

Example:

public void Process(List<int> items) {
    if (items != null) {                 // +1 (if)
        foreach (var item in items) {    // +1 (loop) +1 (nesting) = +2
            if (item > 0) {              // +1 (if) +2 (nesting) = +3
                Process(item);           // +1 (recursion; no nesting penalty)
            }
        }
    }
}
// Total Cognitive Complexity = 1 + 2 + 3 + 1 = 7

Implementation:

#![allow(unused)]
fn main() {
pub fn cognitive_complexity(method: &MethodDeclaration) -> usize {
    let mut complexity = 0;
    
    if let Some(body) = &method.body {
        complexity = calculate_cognitive_complexity(body, 0);
    }
    
    complexity
}

fn calculate_cognitive_complexity(stmt: &Statement, nesting_level: usize) -> usize {
    let mut complexity = 0;
    
    match stmt {
        Statement::If(if_stmt) => {
            complexity += 1 + nesting_level;  // if + nesting penalty
            complexity += calculate_cognitive_complexity(&if_stmt.consequence, nesting_level + 1);
            if let Some(alt) = &if_stmt.alternative {
                complexity += calculate_cognitive_complexity(alt, nesting_level + 1);
            }
        }
        Statement::For(for_stmt) => {
            complexity += 1 + nesting_level;
            if let Some(body) = &for_stmt.body {
                complexity += calculate_cognitive_complexity(body, nesting_level + 1);
            }
        }
        // ... other statement types
        _ => {}
    }
    
    complexity
}
}
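
To make the nesting penalty concrete, here is a self-contained sketch over a hypothetical mini statement tree (the `Stmt` enum is an assumption for illustration, not the BSharp `Statement` type). It covers only the structural increments, not `break`/`continue` or recursion:

```rust
/// Hypothetical mini statement tree (not the BSharp AST).
pub enum Stmt {
    If { then: Vec<Stmt>, otherwise: Vec<Stmt> },
    Loop { body: Vec<Stmt> },
    Other,
}

/// Each control structure costs 1 plus its current nesting level.
pub fn cognitive(stmts: &[Stmt], nesting: usize) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::If { then, otherwise } => {
                1 + nesting
                    + cognitive(then, nesting + 1)
                    + cognitive(otherwise, nesting + 1)
            }
            Stmt::Loop { body } => 1 + nesting + cognitive(body, nesting + 1),
            Stmt::Other => 0,
        })
        .sum()
}
```

For an `if` containing a loop containing another `if`, the structural increments are 1 + 2 + 3 = 6.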

Nesting Depth

Definition: Maximum depth of nested control structures

Example:

public void Example() {
    if (condition1) {              // Depth 1
        while (condition2) {       // Depth 2
            if (condition3) {      // Depth 3
                for (int i = 0; i < 10; i++) {  // Depth 4
                    // Code here
                }
            }
        }
    }
}
// Max Nesting Depth = 4

Implementation:

#![allow(unused)]
fn main() {
pub fn max_nesting_depth(method: &MethodDeclaration) -> usize {
    method.body.as_ref()
        .map(|body| calculate_max_nesting(body, 0))
        .unwrap_or(0)
}

fn calculate_max_nesting(stmt: &Statement, current_depth: usize) -> usize {
    let mut max_depth = current_depth;
    
    match stmt {
        Statement::If(if_stmt) => {
            let then_depth = calculate_max_nesting(&if_stmt.consequence, current_depth + 1);
            max_depth = max_depth.max(then_depth);
            
            if let Some(alt) = &if_stmt.alternative {
                let else_depth = calculate_max_nesting(alt, current_depth + 1);
                max_depth = max_depth.max(else_depth);
            }
        }
        Statement::Block(stmts) => {
            for s in stmts {
                let depth = calculate_max_nesting(s, current_depth);
                max_depth = max_depth.max(depth);
            }
        }
        // ... other nesting statements
        _ => {}
    }
    
    max_depth
}
}
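
The "other nesting statements" elided above follow the same pattern. A compact, self-contained sketch with a hypothetical `Stmt` enum (an assumption for illustration, not the BSharp AST) that also handles loops:

```rust
/// Hypothetical mini statement tree (not the BSharp AST).
pub enum Stmt {
    If { then: Vec<Stmt>, otherwise: Vec<Stmt> },
    While(Vec<Stmt>),
    For(Vec<Stmt>),
    Other,
}

/// Returns the deepest nesting level reached, starting from `depth`.
pub fn max_nesting(stmts: &[Stmt], depth: usize) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::If { then, otherwise } => {
                max_nesting(then, depth + 1).max(max_nesting(otherwise, depth + 1))
            }
            Stmt::While(body) | Stmt::For(body) => max_nesting(body, depth + 1),
            Stmt::Other => depth,
        })
        .max()
        // An empty body still counts as being at the current depth.
        .unwrap_or(depth)
}
```

Applied to the `if` / `while` / `if` / `for` shape of the example above, this returns 4.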

Thresholds:

  • 1-3: Acceptable
  • 4-5: Consider refactoring
  • 6+: Refactor recommended

Planned: Maintainability Metrics

Maintainability Index

Definition: Composite metric indicating code maintainability

Formula (Microsoft version):

MI = MAX(0, (171 - 5.2 * ln(HV) - 0.23 * CC - 16.2 * ln(LOC)) * 100 / 171)

Where:

  • HV = Halstead Volume
  • CC = Cyclomatic Complexity
  • LOC = Lines of Code

Scale:

  • 85-100: Good maintainability (green)
  • 65-84: Moderate maintainability (yellow)
  • 0-64: Difficult to maintain (red)

Note: Maintainability Index is not implemented in the current codebase. This section outlines potential future work.

pub fn maintainability_index(
    halstead_volume: f64,
    cyclomatic_complexity: usize,
    lines_of_code: usize
) -> f64 {
    let hv_term = 5.2 * halstead_volume.ln();
    let cc_term = 0.23 * (cyclomatic_complexity as f64);
    let loc_term = 16.2 * (lines_of_code as f64).ln();
    
    let mi = 171.0 - hv_term - cc_term - loc_term;
    let normalized = (mi * 100.0 / 171.0).max(0.0);
    
    normalized
}
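
Note that `ln(0)` is negative infinity, so the function above returns infinity when the volume or line count is zero. A guarded variant is sketched below; clamping the inputs to at least 1 is an assumption made for this example, and the name `maintainability_index_guarded` is hypothetical:

```rust
/// Maintainability Index (0-100) with inputs clamped so `ln` never sees zero.
pub fn maintainability_index_guarded(
    halstead_volume: f64,
    cyclomatic_complexity: usize,
    lines_of_code: usize,
) -> f64 {
    // Assumption: treat empty or degenerate inputs as the smallest valid value.
    let hv = halstead_volume.max(1.0);
    let loc = lines_of_code.max(1) as f64;

    let raw = 171.0
        - 5.2 * hv.ln()
        - 0.23 * cyclomatic_complexity as f64
        - 16.2 * loc.ln();
    (raw * 100.0 / 171.0).max(0.0)
}
```

For example, a Halstead volume of 1000, cyclomatic complexity of 10 and 100 lines of code give an index of roughly 34.0.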

Planned: Halstead Metrics

Operators and Operands:

  • n1 = number of distinct operators
  • n2 = number of distinct operands
  • N1 = total number of operators
  • N2 = total number of operands

Derived Metrics:

  • Program Vocabulary: n = n1 + n2
  • Program Length: N = N1 + N2
  • Calculated Length: N' = n1 * log2(n1) + n2 * log2(n2)
  • Volume: V = N * log2(n)
  • Difficulty: D = (n1 / 2) * (N2 / n2)
  • Effort: E = D * V
  • Time to Program: T = E / 18 seconds
  • Bugs Delivered: B = V / 3000

Note: Halstead metrics are not implemented in the current codebase.

use std::collections::HashSet;

pub struct HalsteadMetrics {
    pub distinct_operators: usize,    // n1
    pub distinct_operands: usize,     // n2
    pub total_operators: usize,       // N1
    pub total_operands: usize,        // N2
    pub vocabulary: usize,            // n
    pub length: usize,                // N
    pub volume: f64,                  // V
    pub difficulty: f64,              // D
    pub effort: f64,                  // E
    pub time_to_program: f64,         // T
    pub bugs_delivered: f64,          // B
}

impl HalsteadMetrics {
    pub fn calculate(operators: &HashSet<String>, operands: &HashSet<String>,
                     op_count: usize, operand_count: usize) -> Self {
        let n1 = operators.len();
        let n2 = operands.len();
        let vocabulary = n1 + n2;
        let length = op_count + operand_count;

        // Guard the degenerate cases: log2(0) and division by zero.
        let volume = if vocabulary == 0 {
            0.0
        } else {
            (length as f64) * (vocabulary as f64).log2()
        };
        let difficulty = if n2 == 0 {
            0.0
        } else {
            (n1 as f64 / 2.0) * (operand_count as f64 / n2 as f64)
        };
        let effort = difficulty * volume;

        HalsteadMetrics {
            distinct_operators: n1,
            distinct_operands: n2,
            total_operators: op_count,
            total_operands: operand_count,
            vocabulary,
            length,
            volume,
            difficulty,
            effort,
            time_to_program: effort / 18.0,   // seconds
            bugs_delivered: volume / 3000.0,
        }
    }
}
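
The counting step that feeds these formulas is not shown above. A small sketch, assuming tokens have already been classified as operators or operands (the `Token` enum and both function names are hypothetical):

```rust
use std::collections::HashSet;

/// Hypothetical pre-classified token; a real analyzer would derive this from the AST.
pub enum Token<'a> {
    Operator(&'a str),
    Operand(&'a str),
}

/// Returns (n1, n2, N1, N2): distinct operators, distinct operands,
/// total operators, total operands.
pub fn halstead_counts(tokens: &[Token<'_>]) -> (usize, usize, usize, usize) {
    let mut operators = HashSet::new();
    let mut operands = HashSet::new();
    let (mut total_operators, mut total_operands) = (0usize, 0usize);
    for token in tokens {
        match token {
            Token::Operator(text) => {
                operators.insert(*text);
                total_operators += 1;
            }
            Token::Operand(text) => {
                operands.insert(*text);
                total_operands += 1;
            }
        }
    }
    (operators.len(), operands.len(), total_operators, total_operands)
}

/// Volume V = N * log2(n) and difficulty D = (n1 / 2) * (N2 / n2).
/// Returns `None` when there are no operands (D would divide by zero).
pub fn volume_and_difficulty(
    n1: usize,
    n2: usize,
    total_operators: usize,
    total_operands: usize,
) -> Option<(f64, f64)> {
    if n2 == 0 {
        return None;
    }
    let volume = (total_operators + total_operands) as f64 * ((n1 + n2) as f64).log2();
    let difficulty = (n1 as f64 / 2.0) * (total_operands as f64 / n2 as f64);
    Some((volume, difficulty))
}
```

For `a = b + b * c` the counts are n1 = 3, n2 = 3, N1 = 3, N2 = 4, giving V = 7 * log2(6) (about 18.09) and D = 2.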

Metrics Collection in the Pipeline

MetricsPass is registered in the analyzer registry and runs during Phase::LocalRules. It enumerates classes/structs/methods via Query and uses helpers from bsharp_analysis::metrics::shared to compute statement counts, decision points (cyclomatic complexity), and nesting.

#![allow(unused)]
fn main() {
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

let source = r#"public class C { public void M() { if (true) { } } }"#;
let (cu, spans) = Parser::new().parse_with_spans(source)?;
let mut session = AnalysisSession::new(AnalysisContext::new("file.cs", source), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let ast = session.artifacts.get::<AstAnalysis>().expect("AstAnalysis");
println!("classes={}, methods={}, ifs={}", ast.total_classes, ast.total_methods, ast.total_if_statements);
}

CLI Usage

Analyze Metrics

# Analyze single file
bsharp analyze MyFile.cs

# Analyze project
bsharp analyze MyProject.csproj --out metrics.json

# Analyze solution
bsharp analyze MySolution.sln --out metrics.json --format pretty-json

Example Output

{
  "schema_version": 1,
  "metrics": {
    "total_lines": 1250,
    "code_lines": 980,
    "comment_lines": 150,
    "blank_lines": 120,
    "class_count": 15,
    "method_count": 87,
    "total_complexity": 245,
    "max_complexity": 18,
    "max_nesting_depth": 5
  }
}

Thresholds and Warnings

Configuration

[analysis.metrics]
max_cyclomatic_complexity = 10
max_cognitive_complexity = 15
max_nesting_depth = 4
max_method_length = 50
min_maintainability_index = 65

Diagnostics

High Complexity Warning:

warning[MET001]: Method has high cyclomatic complexity
  --> src/OrderProcessor.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ complexity = 18 (threshold: 10)
   |
   = help: Consider breaking this method into smaller methods

Deep Nesting Warning:

warning[MET002]: Deep nesting detected
  --> src/Validator.cs:15:9
   |
15 |         if (condition1) {
   |         ^^ nesting depth = 5 (threshold: 4)
   |
   = help: Consider extracting nested logic into separate methods
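
A sketch of how the configured thresholds could be turned into the warnings above. The `MetricsThresholds` struct, the `check_method` function and the message layout are assumptions for illustration; only the codes MET001/MET002 and the default values come from this page. The comparison is assumed to be strict, so a value equal to the threshold passes:

```rust
/// Hypothetical subset of the `[analysis.metrics]` settings.
pub struct MetricsThresholds {
    pub max_cyclomatic_complexity: usize,
    pub max_nesting_depth: usize,
}

impl Default for MetricsThresholds {
    fn default() -> Self {
        // Values taken from the configuration example above.
        Self { max_cyclomatic_complexity: 10, max_nesting_depth: 4 }
    }
}

/// Returns one message per exceeded threshold.
pub fn check_method(
    name: &str,
    complexity: usize,
    nesting: usize,
    thresholds: &MetricsThresholds,
) -> Vec<String> {
    let mut warnings = Vec::new();
    if complexity > thresholds.max_cyclomatic_complexity {
        warnings.push(format!(
            "warning[MET001]: {}: complexity = {} (threshold: {})",
            name, complexity, thresholds.max_cyclomatic_complexity
        ));
    }
    if nesting > thresholds.max_nesting_depth {
        warnings.push(format!(
            "warning[MET002]: {}: nesting depth = {} (threshold: {})",
            name, nesting, thresholds.max_nesting_depth
        ));
    }
    warnings
}
```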

Programmatic Usage

Analyzing a Method

#![allow(unused)]
fn main() {
use bsharp::analysis::metrics::{cyclomatic_complexity, cognitive_complexity, max_nesting_depth};

let method = parse_method("public void MyMethod() { ... }");

let cc = cyclomatic_complexity(&method);
let cog = cognitive_complexity(&method);
let nesting = max_nesting_depth(&method);

println!("Cyclomatic Complexity: {}", cc);
println!("Cognitive Complexity: {}", cog);
println!("Max Nesting Depth: {}", nesting);
}

Analyzing a file via the pipeline

#![allow(unused)]
fn main() {
let (cu, spans) = Parser::new().parse_with_spans(source_code)?;
let mut session = AnalysisSession::new(AnalysisContext::new("file.cs", source_code), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let metrics = session.artifacts.get::<AstAnalysis>().expect("AstAnalysis");
println!("Classes: {}", metrics.total_classes);
println!("Methods: {}", metrics.total_methods);
println!("Cyclomatic Complexity: {}", metrics.cyclomatic_complexity);
}


References

  • Implementation: src/bsharp_analysis/src/metrics/
  • Pass: src/bsharp_analysis/src/passes/metrics.rs
  • Tests: src/bsharp_tests/src/analysis/metrics/ (planned)
  • Standards: ISO/IEC 25023 (Software Quality Metrics)

Type Analysis

The type analysis system provides insights into type usage, inheritance hierarchies, and type-related patterns in C# code.


Overview

Status: Planned (module not implemented yet)

Type analysis tracks:

  • Type definitions and their relationships
  • Inheritance hierarchies
  • Interface implementations
  • Generic type usage
  • Type references and dependencies

Type Information

Type Categories

Value Types:

  • Primitives (int, bool, double, etc.)
  • Structs
  • Enums

Reference Types:

  • Classes
  • Interfaces
  • Delegates
  • Arrays

Special Types:

  • Generic type parameters
  • Nullable types
  • Tuple types
  • Anonymous types

Inheritance Analysis

Class Hierarchies

Tracking Inheritance:

public class Animal { }
public class Mammal : Animal { }
public class Dog : Mammal { }

Hierarchy Representation:

Animal
└── Mammal
    └── Dog

Analysis:

#![allow(unused)]
fn main() {
pub struct InheritanceHierarchy {
    // Type -> Base Type
    base_types: HashMap<TypeId, TypeId>,
    // Type -> Derived Types
    derived_types: HashMap<TypeId, Vec<TypeId>>,
}

impl InheritanceHierarchy {
    pub fn get_base_type(&self, type_id: TypeId) -> Option<TypeId>;
    pub fn get_derived_types(&self, type_id: TypeId) -> &[TypeId];
    pub fn get_all_ancestors(&self, type_id: TypeId) -> Vec<TypeId>;
    pub fn get_all_descendants(&self, type_id: TypeId) -> Vec<TypeId>;
    pub fn inheritance_depth(&self, type_id: TypeId) -> usize;
}
}
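
Since the module is only planned, the signatures above have no bodies yet. One possible shape for them, as a sketch: `TypeId` is assumed here to be a plain integer handle, and the `add` method is hypothetical.

```rust
use std::collections::HashMap;

/// Assumption: `TypeId` is a plain integer handle; the planned API does not define it.
pub type TypeId = u32;

#[derive(Default)]
pub struct InheritanceHierarchy {
    base_types: HashMap<TypeId, TypeId>,
    derived_types: HashMap<TypeId, Vec<TypeId>>,
}

impl InheritanceHierarchy {
    /// Records that `derived` inherits directly from `base`.
    pub fn add(&mut self, derived: TypeId, base: TypeId) {
        self.base_types.insert(derived, base);
        self.derived_types.entry(base).or_default().push(derived);
    }

    pub fn get_base_type(&self, type_id: TypeId) -> Option<TypeId> {
        self.base_types.get(&type_id).copied()
    }

    pub fn get_derived_types(&self, type_id: TypeId) -> &[TypeId] {
        self.derived_types.get(&type_id).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Walks the base-type chain from nearest to farthest ancestor.
    pub fn get_all_ancestors(&self, type_id: TypeId) -> Vec<TypeId> {
        let mut ancestors = Vec::new();
        let mut current = type_id;
        while let Some(base) = self.get_base_type(current) {
            // Guard against malformed (cyclic) input.
            if base == type_id || ancestors.contains(&base) {
                break;
            }
            ancestors.push(base);
            current = base;
        }
        ancestors
    }

    /// Depth 0 means the type has no recorded base type.
    pub fn inheritance_depth(&self, type_id: TypeId) -> usize {
        self.get_all_ancestors(type_id).len()
    }
}
```

With Animal = 1, Mammal = 2 and Dog = 3, `get_all_ancestors(3)` yields `[2, 1]` and `inheritance_depth(3)` is 2.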

Interface Implementation

Tracking Implementations:

public interface IRepository { }
public interface IUserRepository : IRepository { }
public class UserRepository : IUserRepository { }

Analysis:

#![allow(unused)]
fn main() {
pub struct InterfaceImplementations {
    // Type -> Interfaces it implements
    implementations: HashMap<TypeId, Vec<TypeId>>,
    // Interface -> Types that implement it
    implementers: HashMap<TypeId, Vec<TypeId>>,
}
}

Generic Type Analysis

Type Parameters

Tracking Generic Definitions:

public class Container<T> where T : class { }
public class Repository<TEntity, TKey> where TEntity : class { }

Analysis:

#![allow(unused)]
fn main() {
pub struct GenericTypeInfo {
    pub type_parameters: Vec<TypeParameter>,
    pub constraints: Vec<TypeConstraint>,
}

pub struct TypeParameter {
    pub name: String,
    pub variance: Option<Variance>,  // in, out
}

pub struct TypeConstraint {
    pub parameter: String,
    pub kind: ConstraintKind,
}

pub enum ConstraintKind {
    Class,              // where T : class
    Struct,             // where T : struct
    New,                // where T : new()
    BaseType(TypeId),   // where T : BaseClass
    Interface(TypeId),  // where T : IInterface
}
}

Generic Type Usage

Tracking Instantiations:

var list = new List<int>();
var dict = new Dictionary<string, User>();

Analysis:

#![allow(unused)]
fn main() {
pub struct GenericInstantiation {
    pub generic_type: TypeId,
    pub type_arguments: Vec<TypeId>,
}

pub fn find_generic_instantiations(cu: &CompilationUnit) -> Vec<GenericInstantiation>;
}

Type Usage Patterns

Frequency Analysis

Most Used Types:

#![allow(unused)]
fn main() {
pub struct TypeUsageStats {
    pub type_references: HashMap<TypeId, usize>,
}

impl TypeUsageStats {
    pub fn most_used_types(&self, limit: usize) -> Vec<(TypeId, usize)>;
    pub fn usage_count(&self, type_id: TypeId) -> usize;
}
}
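
A possible implementation of the two planned methods, as a sketch. `TypeId` is assumed to be an integer handle, and the tie-breaking rule (lower id first) is a choice made here so the output is deterministic:

```rust
use std::collections::HashMap;

/// Assumption: `TypeId` is a plain integer handle.
pub type TypeId = u32;

pub struct TypeUsageStats {
    pub type_references: HashMap<TypeId, usize>,
}

impl TypeUsageStats {
    /// Number of recorded references; unknown types count as zero.
    pub fn usage_count(&self, type_id: TypeId) -> usize {
        self.type_references.get(&type_id).copied().unwrap_or(0)
    }

    /// Up to `limit` types, most referenced first.
    pub fn most_used_types(&self, limit: usize) -> Vec<(TypeId, usize)> {
        let mut entries: Vec<(TypeId, usize)> = self
            .type_references
            .iter()
            .map(|(id, count)| (*id, *count))
            .collect();
        // Highest count first; ties broken by id for a stable order.
        entries.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
        entries.truncate(limit);
        entries
    }
}
```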

Type Categories Distribution

#![allow(unused)]
fn main() {
pub struct TypeDistribution {
    pub class_count: usize,
    pub interface_count: usize,
    pub struct_count: usize,
    pub enum_count: usize,
    pub delegate_count: usize,
}
}

Type Metrics

Depth of Inheritance Tree (DIT)

Definition: Maximum depth from type to root of hierarchy

Example:

class A { }              // DIT = 0 (or 1 from Object)
class B : A { }          // DIT = 1 (or 2 from Object)
class C : B { }          // DIT = 2 (or 3 from Object)

Interpretation:

  • Low DIT (0-2): Simple hierarchy, easy to understand
  • Medium DIT (3-4): Moderate complexity
  • High DIT (5+): Complex hierarchy, may indicate over-engineering

Number of Children (NOC)

Definition: Number of immediate subclasses

Example:

class Animal { }
class Dog : Animal { }
class Cat : Animal { }
class Bird : Animal { }
// Animal has NOC = 3

Interpretation:

  • High NOC: Type is heavily reused (good abstraction or god class)
  • Low NOC: Specialized type or leaf in hierarchy

Lack of Cohesion of Methods (LCOM)

Definition: Measure of how well methods in a class are related

Simplified Calculation:

  • Count pairs of methods that don't share instance variables
  • High LCOM suggests class should be split
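
The simplified calculation can be sketched as follows, assuming each method has already been reduced to the set of instance fields it touches (the function name and input shape are hypothetical):

```rust
use std::collections::HashSet;

/// Counts method pairs that share no instance field.
/// `method_fields[i]` is the set of fields used by method `i`.
pub fn lcom_disjoint_pairs(method_fields: &[HashSet<&str>]) -> usize {
    let mut count = 0;
    for i in 0..method_fields.len() {
        for j in (i + 1)..method_fields.len() {
            if method_fields[i].is_disjoint(&method_fields[j]) {
                count += 1;
            }
        }
    }
    count
}
```

With three methods using `{x, y}`, `{y}` and `{z}`, two of the three pairs share nothing, so the result is 2.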

Type Compatibility Analysis

Assignability

Checking Compatibility:

pub fn is_assignable_to(from: &Type, to: &Type, context: &TypeContext) -> bool {
    // Check if 'from' type can be assigned to 'to' type
    // Considers inheritance, interface implementation, variance, etc.
    unimplemented!("planned: type analysis is not implemented yet")
}

Rules:

  • Derived type assignable to base type
  • Type assignable to implemented interface
  • Covariant/contravariant generic types
  • Nullable value types
  • Implicit conversions

Type Conversions

Tracking Conversions:

int x = 42;
long y = x;              // Implicit conversion
int z = (int)y;          // Explicit conversion (cast)

Analysis:

#![allow(unused)]
fn main() {
pub enum ConversionKind {
    Implicit,
    Explicit,
    UserDefined,
}

pub struct TypeConversion {
    pub from: TypeId,
    pub to: TypeId,
    pub kind: ConversionKind,
}
}

Nullable Reference Types Analysis

Nullability Tracking

C# 8+ Nullable Annotations:

string? nullable = null;      // Nullable reference
string nonNull = "value";     // Non-nullable reference

Analysis:

#![allow(unused)]
fn main() {
pub struct NullabilityInfo {
    pub is_nullable: bool,
    pub nullability_context: NullabilityContext,
}

pub enum NullabilityContext {
    Enabled,
    Disabled,
    Warnings,
}
}

Null Safety Diagnostics

Potential Null Reference:

warning[TYPE001]: Possible null reference
  --> src/UserService.cs:15:9
   |
15 |     user.Name = "John";
   |     ^^^^ 'user' may be null here
   |
   = help: Add null check or use null-conditional operator

Type Analysis in Pipeline

Integration

Type analysis is not part of the default registry yet. The intended phase is Semantic (after symbol indexing and global artifacts). This page outlines the planned scope.


Programmatic Usage

Analyzing Type Hierarchy

Planned APIs will expose hierarchy queries once implemented.

Finding Generic Instantiations

Planned helper(s) to enumerate generic instantiations will be documented here after implementation.


Future Enhancements

Planned Features

  1. Type Inference Tracking

    • Track var usage and inferred types
    • Analyze type inference patterns
  2. Variance Analysis

    • Detect variance violations
    • Suggest covariant/contravariant annotations
  3. Type Safety Metrics

    • Measure use of dynamic
    • Track unsafe casts
    • Nullable reference type coverage
  4. Design Pattern Detection

    • Identify common patterns (Factory, Strategy, etc.)
    • Detect anti-patterns

Implementation Status

Current State:

  • Type analysis module not implemented yet (see Overview)
  • Not part of the default analyzer registry

In Progress:

  • Full inheritance hierarchy analysis
  • Generic type instantiation tracking
  • Type usage statistics collection
  • Comprehensive test coverage

Planned:

  • Variance analysis
  • Type safety metrics
  • Design pattern detection based on type relationships


References

  • Implementation: Planned
  • Tests: Planned (under src/bsharp_tests/src/analysis/types/)
  • Related: docs/analysis/dependencies.md, docs/parser/ast-structure.md

Code Quality Analysis (Conceptual / Future Plan)

This document describes a future-facing design for quality analysis. The legacy quality module and QualityPass were removed from the codebase in the purge. Consider this document a proposal/reference for potential future work rather than current implementation.


Overview

Status: Not implemented. The legacy module was removed; this page documents future direction.

Quality analysis provides:

  • Code smell detection
  • Best practice validation
  • Design pattern recognition
  • Maintainability assessment
  • Technical debt identification

Code Smells

Method-Level Smells

Long Method

Description: Method with too many lines of code

Threshold: > 50 lines (configurable)

Example:

public void ProcessOrder(Order order) {
    // 150 lines of code...
}

Diagnostic:

warning[QUAL001]: Long method detected
  --> src/OrderService.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ method has 150 lines (threshold: 50)
   |
   = help: Consider breaking this method into smaller, focused methods

Refactoring:

  • Extract method
  • Decompose into smaller methods
  • Apply Single Responsibility Principle

Long Parameter List

Description: Method with too many parameters

Threshold: > 5 parameters (configurable)

Example:

public void CreateUser(string firstName, string lastName, string email, 
                      string phone, string address, string city, string zip) {
    // ...
}

Refactoring:

  • Introduce parameter object
  • Use builder pattern
  • Group related parameters into DTOs

Complex Conditional

Description: Deeply nested or complex conditional logic

Example:

if (user != null && user.IsActive && (user.Role == "Admin" || user.Role == "Manager") 
    && user.Department != null && user.Department.Budget > 10000) {
    // ...
}

Refactoring:

  • Extract condition to well-named method
  • Use guard clauses
  • Simplify boolean logic

Class-Level Smells

Large Class (God Class)

Description: Class with too many responsibilities

Indicators:

  • Too many methods (> 20)
  • Too many fields (> 10)
  • High cyclomatic complexity
  • Low cohesion

Example:

public class UserManager {
    // 50 methods handling user CRUD, authentication, authorization,
    // email sending, logging, caching, validation, etc.
}

Refactoring:

  • Split into multiple classes
  • Apply Single Responsibility Principle
  • Extract related functionality

Feature Envy

Description: Method uses more features of another class than its own

Example:

public class OrderProcessor {
    public decimal CalculateTotal(Order order) {
        decimal total = 0;
        foreach (var item in order.Items) {
            total += item.Price * item.Quantity;
        }
        total -= order.Discount;
        total += order.Tax;
        return total;
    }
}

Refactoring:

  • Move method to Order class
  • Method should be where the data is

Data Class

Description: Class with only fields and getters/setters, no behavior

Example:

public class User {
    public string Name { get; set; }
    public string Email { get; set; }
    public int Age { get; set; }
    // No methods, just data
}

Note: Sometimes acceptable for DTOs, but domain objects should have behavior

Code Organization Smells

Duplicate Code

Description: Identical or very similar code in multiple places

Detection:

  • Token-based comparison
  • AST structure comparison
  • Minimum clone size threshold
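
A naive sketch of token-based comparison with a minimum clone size: every window of `min_clone_size` consecutive tokens is used as a map key, and windows that occur more than once are reported. The function name is hypothetical, and a real detector would also normalize identifiers and merge overlapping windows:

```rust
use std::collections::HashMap;

/// Returns groups of start indices whose token windows are identical.
/// Only windows of exactly `min_clone_size` tokens are compared.
pub fn find_token_clones(tokens: &[&str], min_clone_size: usize) -> Vec<Vec<usize>> {
    // `windows(0)` panics, so reject degenerate sizes up front.
    if min_clone_size == 0 || tokens.len() < min_clone_size {
        return Vec::new();
    }
    let mut seen: HashMap<&[&str], Vec<usize>> = HashMap::new();
    for (start, window) in tokens.windows(min_clone_size).enumerate() {
        seen.entry(window).or_default().push(start);
    }
    let mut groups: Vec<Vec<usize>> = seen
        .into_values()
        .filter(|starts| starts.len() > 1)
        .collect();
    groups.sort();
    groups
}
```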

Refactoring:

  • Extract method
  • Extract class
  • Use inheritance or composition

Dead Code

Description: Code that is never executed

Examples:

  • Unreachable statements after return
  • Unused private methods
  • Unused fields
  • Conditions that are always true/false

Diagnostic:

warning[QUAL010]: Unreachable code detected
  --> src/Calculator.cs:15:9
   |
14 |     return result;
15 |     Console.WriteLine("Done");  // Never executed
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement

Magic Numbers

Description: Unexplained numeric literals in code

Example:

if (order.Total > 1000) {  // What does 1000 mean?
    ApplyDiscount(order, 0.1);  // What does 0.1 mean?
}

Refactoring:

const decimal BULK_ORDER_THRESHOLD = 1000m;
const decimal BULK_ORDER_DISCOUNT = 0.1m;

if (order.Total > BULK_ORDER_THRESHOLD) {
    ApplyDiscount(order, BULK_ORDER_DISCOUNT);
}

Best Practices

Naming Conventions

Rules:

  • Classes: PascalCase
  • Methods: PascalCase
  • Properties: PascalCase
  • Fields: camelCase with _ prefix for private
  • Constants: UPPER_CASE or PascalCase
  • Interfaces: PascalCase with I prefix

Violations:

warning[QUAL020]: Naming convention violation
  --> src/UserService.cs:5:17
   |
 5 |     private int UserCount;
   |                 ^^^^^^^^^ private field should use camelCase with _ prefix
   |
   = help: Rename to '_userCount'
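
The private-field rule and its rename suggestion could be checked along these lines. This is a simplified sketch with a hypothetical function name; it only normalizes the leading underscore and the case of the first letter:

```rust
/// Suggests a `_camelCase` name for a private field, or `None` if the name
/// already conforms. Names with no characters after the underscores are ignored.
pub fn suggest_private_field_name(name: &str) -> Option<String> {
    let bare = name.trim_start_matches('_');
    let mut chars = bare.chars();
    let first = chars.next()?;
    // Lowercase only the first character and keep the rest unchanged.
    let suggestion = format!("_{}{}", first.to_lowercase(), chars.as_str());
    if suggestion == name {
        None
    } else {
        Some(suggestion)
    }
}
```

For the field in the diagnostic above, `suggest_private_field_name("UserCount")` returns `Some("_userCount")`.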

Exception Handling

Anti-patterns:

Empty Catch Block:

try {
    RiskyOperation();
} catch (Exception) {
    // Silent failure - BAD!
}

Catching Generic Exception:

try {
    SpecificOperation();
} catch (Exception ex) {  // Too broad
    // ...
}

Best Practices:

  • Catch specific exceptions
  • Log exceptions
  • Don't swallow exceptions
  • Use finally for cleanup

Resource Management

Using Statement:

// Good
using (var file = File.OpenRead("data.txt")) {
    // Use file
}

// Better (C# 8+)
using var file = File.OpenRead("data.txt");
// Disposed at end of scope

Diagnostic:

warning[QUAL030]: IDisposable not properly disposed
  --> src/FileProcessor.cs:10:9
   |
10 |     var file = File.OpenRead("data.txt");
   |         ^^^^ should be wrapped in using statement

Design Patterns and Anti-Patterns

Detected Patterns

Singleton Pattern

Detection:

  • Private constructor
  • Static instance field
  • Public static accessor

Example:

public class Logger {
    private static Logger _instance;
    private Logger() { }
    
    public static Logger Instance {
        get {
            if (_instance == null) {
                _instance = new Logger();
            }
            return _instance;
        }
    }
}

Factory Pattern

Detection:

  • Method returning interface or base class
  • Creates different concrete types based on parameters

Anti-Patterns

God Object

Detection:

  • High number of methods and fields
  • Low cohesion
  • High coupling

Spaghetti Code

Detection:

  • High cyclomatic complexity
  • Deep nesting
  • Lack of structure

Lava Flow

Detection:

  • Dead code
  • Commented-out code
  • Unused variables/methods

Quality Metrics

Code Quality Score

Composite Score (0-100):

#![allow(unused)]
fn main() {
pub struct QualityScore {
    pub overall: f64,
    pub maintainability: f64,
    pub complexity: f64,
    pub duplication: f64,
    pub test_coverage: f64,
}
}

Calculation:

Overall = (Maintainability * 0.3) + 
          (Complexity * 0.3) + 
          (Duplication * 0.2) + 
          (TestCoverage * 0.2)
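
The weighted sum as a function, for a quick check of the arithmetic. The function name is hypothetical, and the test-coverage input of 68.0 in the note below is an assumed value chosen for illustration:

```rust
/// Weighted composite of the four component scores (each 0-100).
pub fn overall_quality(
    maintainability: f64,
    complexity: f64,
    duplication: f64,
    test_coverage: f64,
) -> f64 {
    maintainability * 0.3 + complexity * 0.3 + duplication * 0.2 + test_coverage * 0.2
}
```

With component scores of 68.0, 75.0 and 80.0 and an assumed test-coverage score of 68.0, the overall score is 20.4 + 22.5 + 16.0 + 13.6 = 72.5.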

Technical Debt

Estimation:

#![allow(unused)]
fn main() {
pub struct TechnicalDebt {
    pub total_issues: usize,
    pub estimated_hours: f64,
    pub debt_ratio: f64,  // debt / total development time
}
}

Calculation:

  • Each code smell assigned time cost
  • Sum all issues
  • Compare to total codebase size
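
A sketch of the estimation steps, assuming each detected issue already carries a time cost in hours and that the total development time is known. The `estimate_debt` function and the per-issue costs are hypothetical:

```rust
pub struct TechnicalDebt {
    pub total_issues: usize,
    pub estimated_hours: f64,
    pub debt_ratio: f64, // debt / total development time
}

/// Sums the per-issue costs and relates them to the total development time.
pub fn estimate_debt(issue_costs_hours: &[f64], total_development_hours: f64) -> TechnicalDebt {
    let estimated_hours: f64 = issue_costs_hours.iter().sum();
    // Assumption: report a ratio of 0.0 when the total time is unknown or zero.
    let debt_ratio = if total_development_hours > 0.0 {
        estimated_hours / total_development_hours
    } else {
        0.0
    };
    TechnicalDebt {
        total_issues: issue_costs_hours.len(),
        estimated_hours,
        debt_ratio,
    }
}
```

Three issues costing 0.5, 2.0 and 1.5 hours against 50 hours of development give 4.0 estimated hours and a debt ratio of 0.08.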

Quality Rules

Rule System

Rule Definition:

#![allow(unused)]
fn main() {
pub trait QualityRule {
    fn id(&self) -> &'static str;
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession);
}
}

Example Rule:

#![allow(unused)]
fn main() {
pub struct LongMethodRule {
    max_lines: usize,
}

impl QualityRule for LongMethodRule {
    fn id(&self) -> &'static str { "long_method" }
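    fn name(&self) -> &'static str { "Long Method" }
    // Required by the QualityRule trait defined above
    fn description(&self) -> &'static str { "Method exceeds the configured line threshold" }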
    
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession) {
        if let NodeRef::MethodDeclaration(method) = node {
            let line_count = count_lines(method);
            if line_count > self.max_lines {
                session.diagnostics.add(
                    DiagnosticCode::LongMethod,
                    format!("Method has {} lines (threshold: {})", 
                           line_count, self.max_lines)
                );
            }
        }
    }
}
}

Rule Categories

Maintainability Rules:

  • Long method
  • Long parameter list
  • Large class
  • Complex method

Reliability Rules:

  • Empty catch blocks
  • Null reference risks
  • Resource leaks
  • Unhandled exceptions

Security Rules:

  • SQL injection risks
  • XSS vulnerabilities
  • Hardcoded credentials
  • Insecure random

Performance Rules:

  • Inefficient loops
  • Unnecessary allocations
  • String concatenation in loops
  • Boxing/unboxing

Configuration

Quality Thresholds

[analysis.quality]
max_method_lines = 50
max_parameters = 5
max_class_methods = 20
max_cyclomatic_complexity = 10
max_nesting_depth = 4

[analysis.quality.rules]
long_method = "warning"
long_parameter_list = "warning"
god_class = "error"
empty_catch = "error"
magic_numbers = "info"

Severity Levels

  • Error: Must be fixed
  • Warning: Should be fixed
  • Info: Consider fixing
  • Hint: Suggestion for improvement

CLI Usage

Quality Analysis

# Analyze code quality
bsharp analyze MyProject.csproj --enable-ruleset quality

# Generate quality report
bsharp analyze MySolution.sln --out quality-report.json

# Filter by severity
bsharp analyze MyFile.cs --severity error,warning

Example Output

{
  "quality_score": {
    "overall": 72.5,
    "maintainability": 68.0,
    "complexity": 75.0,
    "duplication": 80.0
  },
  "technical_debt": {
    "total_issues": 45,
    "estimated_hours": 12.5,
    "debt_ratio": 0.08
  },
  "diagnostics": [
    {
      "code": "QUAL001",
      "severity": "warning",
      "message": "Long method detected",
      "file": "src/OrderService.cs",
      "line": 42,
      "column": 17
    }
  ]
}

Integration with Pipeline

Quality Ruleset

Registration:

#![allow(unused)]
fn main() {
// In AnalyzerRegistry
registry.add_ruleset(QualityRuleset {
    id: "quality",
    rules: vec![
        Box::new(LongMethodRule::new()),
        Box::new(LongParameterListRule::new()),
        Box::new(GodClassRule::new()),
        Box::new(EmptyCatchRule::new()),
        // ... more rules
    ],
});
}

Execution:

  • Rules run during Local or Semantic phase
  • Visitor pattern for AST traversal
  • Diagnostics collected in session
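
The execution model above (visit every node, let each rule inspect it, collect diagnostics in a session) can be sketched with a toy AST. Everything here (`Node`, `Session`, `ToyRule`, `walk`) is a simplified stand-in invented for illustration; the real `NodeRef` and `AnalysisSession` types are richer.

```rust
/// Toy AST with just enough shape to demonstrate traversal.
pub enum Node {
    Class { name: String, members: Vec<Node> },
    Method { name: String, body: Vec<Node> },
    Try { catch_is_empty: bool },
    Other,
}

/// Stand-in for the analysis session: only collects diagnostic codes.
#[derive(Default)]
pub struct Session {
    pub diagnostics: Vec<String>,
}

/// Stand-in for a rule: called once per visited node.
pub trait ToyRule {
    fn visit(&self, node: &Node, session: &mut Session);
}

pub struct EmptyCatch;

impl ToyRule for EmptyCatch {
    fn visit(&self, node: &Node, session: &mut Session) {
        if let Node::Try { catch_is_empty: true } = node {
            session.diagnostics.push("empty_catch".to_string());
        }
    }
}

/// Pre-order traversal: run every rule on the node, then recurse into children.
pub fn walk(node: &Node, rules: &[Box<dyn ToyRule>], session: &mut Session) {
    for rule in rules {
        rule.visit(node, session);
    }
    let children: &[Node] = match node {
        Node::Class { members, .. } => members.as_slice(),
        Node::Method { body, .. } => body.as_slice(),
        _ => &[],
    };
    for child in children {
        walk(child, rules, session);
    }
}
```

Keeping the traversal in `walk` and the checks in rules means a new rule never needs to know how the tree is iterated.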

Programmatic Usage

Running Quality Analysis

#![allow(unused)]
fn main() {
use bsharp::analysis::quality::QualityAnalyzer;

let parser = Parser::new();
let cu = parser.parse(source_code)?;

let analyzer = QualityAnalyzer::new();
let report = analyzer.analyze(&cu);

println!("Quality Score: {}", report.quality_score.overall);
println!("Issues Found: {}", report.diagnostics.len());
}

Custom Rules

#![allow(unused)]
fn main() {
use bsharp::analysis::quality::QualityRule;

struct CustomRule;

impl QualityRule for CustomRule {
    fn id(&self) -> &'static str { "custom_rule" }
    fn name(&self) -> &'static str { "Custom Rule" }
    
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession) {
        // Custom logic
    }
}

// Register custom rule
analyzer.add_rule(Box::new(CustomRule));
}

Future Enhancements

Planned Features

  1. Machine Learning-Based Detection

    • Learn from codebase patterns
    • Detect project-specific smells
  2. Refactoring Suggestions

    • Automated refactoring proposals
    • Preview refactoring impact
  3. Quality Trends

    • Track quality over time
    • Identify degradation
    • Measure improvement
  4. Team Metrics

    • Per-developer quality metrics
    • Code review insights
    • Best practice adoption


References

  • Standards: Clean Code (Robert C. Martin), Refactoring (Martin Fowler)
2025-11-17 15:18:26 • commit: 03a4e25

Passes and Rules Registry

This page summarizes the default analysis registry: which passes and rulesets are registered by default and when they run.


Default Registry

Source: src/bsharp_analysis/src/framework/registry.rs

Simplified summary based on default_registry(), in registration order:

  • Pass: indexing::IndexingPass (indexing/symbols)
  • Pass: pe_loader::PeLoaderPass (external PE metadata, if available)
  • Pass: metrics::MetricsPass (local metrics, Query-based)
  • Ruleset (local): rules::naming (naming conventions)
  • Ruleset (local): rules::semantic (baseline semantic checks, local)
  • Pass: control_flow::ControlFlowPass (control flow stats and diagnostics)
  • Pass: dependencies::DependenciesPass (dependency graph and summary)
  • Ruleset (semantic): control_flow_smells (consumes global artifacts)
  • Pass: reporting::ReportingPass (consolidates artifacts into the report)

Notes:

  • Each pass declares its own Phase (AnalyzerPass::phase()), e.g. MetricsPass runs in Phase::LocalRules.
  • Semantic rulesets (e.g., control_flow_smells) run after global artifacts are produced.

Phases

  • Index: Build indexes (symbols, FQNs) and load external metadata.
  • LocalRules: Run per-file local analyses (e.g., metrics) and baseline rules.
  • Global/Semantic: Build global artifacts (control flow, dependencies), then run semantic rules consuming them.
  • Reporting: Finalize results into AnalysisReport.
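
The phase ordering described above can be sketched as a stable sort over (pass ID, phase) pairs. The `Phase` variants follow the comment on `AnalyzerPass::phase()`; the `order_by_phase` helper is hypothetical, and the real pipeline may also take `depends_on()` into account.

```rust
/// Phases in execution order; deriving `Ord` makes the declaration order the sort order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Phase {
    Index,
    LocalRules,
    Global,
    Semantic,
    Reporting,
}

/// Orders pass IDs by phase. `sort_by_key` is stable, so passes that share
/// a phase keep their registration order.
pub fn order_by_phase(mut passes: Vec<(&'static str, Phase)>) -> Vec<&'static str> {
    passes.sort_by_key(|&(_, phase)| phase);
    passes.into_iter().map(|(id, _)| id).collect()
}
```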

Configuration: Enabling/Disabling

Toggles are driven by AnalysisConfig:

  • Passes: enable_passes[pass_id] = true|false
  • Rulesets: enable_rulesets[ruleset_id] = true|false
  • Severities: rule_severities["CODE"] = Error|Warning|Info|Hint

The CLI maps flags to these fields (see docs/cli/analyze.md).
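
A minimal sketch of how such toggle maps might be consulted is shown below. `ToyConfig` is a hypothetical stand-in for `AnalysisConfig`, and treating a missing entry as "enabled" is an assumption made for this example, not documented behavior.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the toggle maps on `AnalysisConfig`.
#[derive(Default)]
pub struct ToyConfig {
    pub enable_passes: HashMap<String, bool>,
    pub enable_rulesets: HashMap<String, bool>,
}

impl ToyConfig {
    /// Assumption: a pass without an explicit entry is enabled.
    pub fn pass_enabled(&self, id: &str) -> bool {
        self.enable_passes.get(id).copied().unwrap_or(true)
    }

    /// Assumption: a ruleset without an explicit entry is enabled.
    pub fn ruleset_enabled(&self, id: &str) -> bool {
        self.enable_rulesets.get(id).copied().unwrap_or(true)
    }
}
```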


IDs

  • Pass IDs (AnalyzerPass::id()):
    • passes.indexing
    • passes.pe_loader
    • passes.metrics
    • passes.control_flow
    • passes.dependencies
    • passes.reporting
  • Ruleset IDs depend on the ruleset constructors (e.g., naming, semantic, control_flow_smells).

References

  • src/bsharp_analysis/src/framework/registry.rs
  • src/bsharp_analysis/src/passes/*
  • src/bsharp_analysis/src/rules/*

Analysis Report Schema

The AnalysisReport summarizes diagnostics and artifacts produced by the analysis pipeline.


Struct

Source: src/bsharp_analysis/src/report/mod.rs

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_methods: usize,
    pub deep_nesting_methods: usize,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AnalysisReport {
    pub schema_version: u32,
    pub diagnostics: DiagnosticCollection,
    pub metrics: Option<AstAnalysis>,
    pub cfg: Option<CfgSummary>,
    pub deps: Option<DependencySummary>,
    pub workspace_warnings: Vec<String>,
    pub workspace_errors: Vec<String>,
}
}

Field Details

  • schema_version – current schema version (1)
  • diagnostics – all emitted diagnostics with codes, severities, locations
  • metrics – aggregated AstAnalysis when MetricsPass runs
  • cfg – summarized control flow stats when ControlFlowPass runs
  • deps – dependency summary when DependenciesPass runs
  • workspace_warnings – non-fatal workspace-level messages
  • workspace_errors – reserved for future use

Example (pretty JSON)

{
  "schema_version": 1,
  "diagnostics": {
    "diagnostics": [
      {
        "code": "CF002",
        "severity": "warning",
        "message": "High cyclomatic complexity",
        "file": "src/OrderProcessor.cs",
        "line": 42,
        "column": 17
      }
    ]
  },
  "metrics": {
    "total_classes": 15,
    "total_interfaces": 3,
    "total_structs": 2,
    "total_enums": 1,
    "total_records": 0,
    "total_delegates": 0,
    "total_methods": 87,
    "total_properties": 21,
    "total_fields": 12,
    "total_events": 0,
    "total_constructors": 15,
    "total_if_statements": 20,
    "total_for_loops": 5,
    "total_while_loops": 2,
    "total_switch_statements": 3,
    "total_try_statements": 1,
    "total_using_statements": 2,
    "cyclomatic_complexity": 245,
    "lines_of_code": 980,
    "max_nesting_depth": 5,
    "documented_methods": 0,
    "documented_classes": 0
  },
  "cfg": {
    "total_methods": 87,
    "high_complexity_methods": 5,
    "deep_nesting_methods": 3
  },
  "deps": {
    "nodes": 42,
    "edges": 120
  },
  "workspace_warnings": [],
  "workspace_errors": []
}

Where It Comes From

AnalysisReport::from_session(&session) collects:

  • metrics from session.artifacts.get::<AstAnalysis>()
  • cfg by summarizing the ControlFlowIndex artifact against thresholds
  • deps by summarizing DependencyGraph
  • diagnostics copied from session.diagnostics
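
To make "summarizing against thresholds" concrete, here is a small sketch that derives a `CfgSummary` from per-method statistics. `MethodStats` and `summarize` are hypothetical, the threshold parameters are supplied by the caller, and the sketch assumes that only values strictly above a threshold are counted.

```rust
/// Same fields as the `CfgSummary` struct shown earlier (serde derives omitted).
#[derive(Debug, Default, PartialEq)]
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_methods: usize,
    pub deep_nesting_methods: usize,
}

/// Hypothetical per-method entry, standing in for the ControlFlowIndex contents.
pub struct MethodStats {
    pub complexity: usize,
    pub max_nesting: usize,
}

/// Counts methods whose complexity or nesting is strictly above the given limits.
pub fn summarize(methods: &[MethodStats], max_complexity: usize, max_nesting: usize) -> CfgSummary {
    CfgSummary {
        total_methods: methods.len(),
        high_complexity_methods: methods.iter().filter(|m| m.complexity > max_complexity).count(),
        deep_nesting_methods: methods.iter().filter(|m| m.max_nesting > max_nesting).count(),
    }
}
```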

See Also

  • docs/cli/analyze.md – CLI options and examples
  • docs/analysis/pipeline.md – Where in the pipeline artifacts are produced

Writing an Analyzer Pass

This guide shows how to create a new analysis pass by implementing AnalyzerPass and registering it in the analysis pipeline.


Trait

Source: src/bsharp_analysis/src/framework/passes.rs

#![allow(unused)]
fn main() {
pub trait AnalyzerPass: Send + Sync + 'static {
    fn id(&self) -> &'static str;
    fn phase(&self) -> Phase;                 // Index | LocalRules | Global | Semantic | Reporting
    fn depends_on(&self) -> &'static [&'static str] { &[] }
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {}
}
}

Minimal Pass

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{AnalyzerPass, Phase, AnalysisSession};
use bsharp_syntax::CompilationUnit;

pub struct MyPass;

impl AnalyzerPass for MyPass {
    fn id(&self) -> &'static str { "passes.my_pass" }
    fn phase(&self) -> Phase { Phase::LocalRules }

    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        // Inspect `cu` and write results into `session.artifacts` or `session.diagnostics`
        // Example: count classes and log a note (pseudo)
        let mut count = 0usize;
        for _c in bsharp_analysis::framework::Query::from(cu).of::<bsharp_syntax::ClassDeclaration>() {
            count += 1;
        }
        // session.artifacts.insert(MyArtifact { class_count: count });
        // session.diagnostics.add(...);
    }
}
}

Registration

Add your pass to the default registry in src/bsharp_analysis/src/framework/registry.rs:

#![allow(unused)]
fn main() {
reg.register_pass(crate::passes::my_pass::MyPass);
}

Or, build a custom registry for experiments:

#![allow(unused)]
fn main() {
let mut reg = AnalyzerRegistry::default_registry();
reg.register_pass(MyPass);
AnalyzerPipeline::run_for_file(&cu, &mut session, &reg);
}

You can also toggle passes via AnalysisConfig.enable_passes["passes.my_pass"] = true|false (see configuration docs).


Tips

  • Keep passes small: Focus on one responsibility.
  • Prefer Query/AstWalker: Use Query for typed enumeration or AstWalker with Visit for custom traversal.
  • Write artifacts: Insert results with session.artifacts.insert(T) when they may be consumed later.
  • Determinism: Avoid non-deterministic ordering; use sorted maps/lists if needed.
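
The determinism tip can be illustrated with a sorted map: aggregating into a `BTreeMap` yields the same output order regardless of input order, whereas `HashMap` iteration order is unspecified. The `class_counts_by_file` helper is a hypothetical example, not part of BSharp.

```rust
use std::collections::BTreeMap;

/// Aggregates per-file counts and returns them sorted by file name,
/// so the result does not depend on the order of `entries`.
pub fn class_counts_by_file(entries: &[(&str, usize)]) -> Vec<(String, usize)> {
    let mut totals: BTreeMap<String, usize> = BTreeMap::new();
    for &(file, count) in entries {
        *totals.entry(file.to_string()).or_insert(0) += count;
    }
    // BTreeMap iterates in key order.
    totals.into_iter().collect()
}
```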

Writing a Ruleset

This guide shows how to define rules and bundle them into a RuleSet to be executed by the analysis pipeline.


Traits and Types

Source: src/bsharp_analysis/src/framework/rules.rs

#![allow(unused)]
fn main() {
pub enum RuleTarget { All, Declarations, Members, Statements, Expressions }

pub trait Rule: Send + Sync + 'static {
    fn id(&self) -> &'static str;
    fn category(&self) -> &'static str;
    fn applies_to(&self) -> RuleTarget { RuleTarget::All }
    fn visit(&self, _node: &NodeRef, _session: &mut AnalysisSession) {}
}

pub struct RuleSet { pub id: &'static str, pub rules: Vec<Box<dyn Rule>> }
}

Minimal Rule

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{Rule, RuleTarget, NodeRef, AnalysisSession};

pub struct NoEmptyCatch;

impl Rule for NoEmptyCatch {
    fn id(&self) -> &'static str { "QUAL010" }
    fn category(&self) -> &'static str { "quality" }
    fn applies_to(&self) -> RuleTarget { RuleTarget::Statements }

    fn visit(&self, node: &NodeRef, session: &mut AnalysisSession) {
        if let NodeRef::Statement(stmt) = node {
            if let bsharp_syntax::statements::statement::Statement::Try(t) = stmt {
                for c in &t.catches {
                    if c.block_is_empty() {
                        session.diagnostics.add(
                            bsharp_analysis::DiagnosticCode::from_static("QUAL010"),
                            "Empty catch block",
                            None,
                        );
                    }
                }
            }
        }
    }
}
}

Building a RuleSet

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::RuleSet;

pub fn ruleset() -> RuleSet {
    RuleSet::new("quality")
        .with_rule(NoEmptyCatch)
        // .with_rule(AnotherRule)
}
}

Register in the default registry (src/bsharp_analysis/src/framework/registry.rs) or construct a custom registry.

#![allow(unused)]
fn main() {
reg.register_ruleset(crate::rules::quality::ruleset());         // local rules
reg.register_semantic_ruleset(crate::rules::control_flow_smells::ruleset());
}

Rulesets can be enabled/disabled via AnalysisConfig.enable_rulesets["quality"] = true|false.


Tips

  • Choose RuleTarget thoughtfully to avoid unnecessary visits.
  • Emit diagnostics with specific codes and helpful messages.
  • Keep rules independent; accumulate state in AnalysisSession artifacts when needed.
  • Honor config toggles; only run if your ruleset is enabled.

Command Line Interface

The BSharp CLI provides command-line tools for parsing, analyzing, and visualizing C# code.


Installation

From Source

git clone https://github.com/mikserek/bsharp.git
cd bsharp
cargo build --release

The binary will be available at target/release/bsharp.

Add to PATH

# Linux/macOS
export PATH="$PATH:/path/to/bsharp/target/release"

# Windows
# Add to System Environment Variables

Command Structure

bsharp <COMMAND> [OPTIONS] <INPUT>

Global Options

--help, -h      Show help information
--version, -V   Show version information

Argument Files (@file)

All commands support argument files via @file syntax. Example:

bsharp @args.txt

Where args.txt contains one argument per line (comments and quoting follow standard shell parsing rules).
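
The sketch below shows one way @file expansion could work. It is deliberately simplified: it handles one argument per line, skips blank lines and lines starting with `#`, and does not implement quoting. `parse_arg_file` and `expand_args` are hypothetical names, and falling back to the literal argument when the file cannot be loaded is an assumption.

```rust
/// Splits an argument file into arguments: one per line, ignoring blank
/// lines and `#` comment lines. Quoting is not handled in this sketch.
pub fn parse_arg_file(content: &str) -> Vec<String> {
    content
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Replaces each `@path` argument with the arguments found in that file.
/// `load` returns the file contents, or `None` if the file cannot be read;
/// in that case the argument is passed through unchanged (an assumption).
pub fn expand_args<F>(args: &[&str], load: F) -> Vec<String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = Vec::new();
    for arg in args {
        match arg.strip_prefix('@').and_then(|path| load(path)) {
            Some(content) => out.extend(parse_arg_file(&content)),
            None => out.push(arg.to_string()),
        }
    }
    out
}
```

Passing the loader as a closure keeps the sketch testable without touching the file system; a real caller could pass `|p| std::fs::read_to_string(p).ok()`.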


Available Commands

parse

Parse C# source code and print a textual AST tree to stdout.

bsharp parse <INPUT>

See: Parse Command

tree

Generate a visualization of the Abstract Syntax Tree.

bsharp tree <INPUT> [--output <FILE>] [--format mermaid|dot]

Notes:

  • Default format is mermaid; output defaults to <input>.mmd.
  • For DOT/Graphviz, use --format dot (or graphviz); output defaults to <input>.dot.

See: Tree Visualization

analyze

Analyze C# code and generate a comprehensive analysis report.

bsharp analyze <INPUT> [OPTIONS]

See: Analysis Command

format

Format C# files using the built-in formatter and syntax emitters.

bsharp format <INPUT> [--write] [--newline-mode lf|crlf] [--max-consecutive-blank-lines <N>] \
  [--blank-line-between-members <BOOL>] [--trim-trailing-whitespace <BOOL>] \
  [--emit-trace] [--emit-trace-file <FILE>]

Notes:

  • <INPUT> can be a file or directory (recursively formats .cs files; skips hidden/bin/obj/target).
  • --write defaults to true; when false and a single file is given, the formatted output is printed to stdout.
  • Emission tracing can be enabled by --emit-trace or environment variable BSHARP_EMIT_TRACE=1.
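
The directory-walking rule in the first note can be sketched as a path filter. `should_format` is a hypothetical helper, and matching the directory names `bin`, `obj`, and `target` case-sensitively is an assumption of this sketch.

```rust
use std::path::Path;

/// Returns true for `.cs` files that are not inside a hidden, `bin`, `obj`,
/// or `target` directory (and are not hidden files themselves).
pub fn should_format(path: &Path) -> bool {
    let is_cs = path.extension().and_then(|ext| ext.to_str()) == Some("cs");
    let skipped = path.components().any(|component| {
        let name = component.as_os_str().to_string_lossy();
        // "." and ".." are navigation components, not hidden entries.
        let hidden = name.starts_with('.') && name != "." && name != "..";
        hidden || matches!(name.as_ref(), "bin" | "obj" | "target")
    });
    is_cs && !skipped
}
```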

See: Format Command


Common Usage Patterns

Quick Parse Check

# Check if file parses successfully
bsharp parse MyFile.cs

Generate AST for Inspection

# Capture the textual AST tree in a file
bsharp parse MyFile.cs > ast.txt

Visualize Code Structure

# Generate Mermaid diagram (default), writes MyClass.mmd
bsharp tree MyClass.cs

# Generate Graphviz DOT diagram
bsharp tree MyClass.cs --format dot --output diagram.dot

Analyze Project Quality

# Full analysis with report
bsharp analyze MyProject.csproj --out report.json --format pretty-json

Analyze Solution

# Analyze entire solution
bsharp analyze MySolution.sln --follow-refs true

Input Types

Single File

bsharp parse Program.cs

Project File (.csproj)

bsharp analyze MyProject.csproj

Solution File (.sln)

bsharp analyze MySolution.sln

Directory

bsharp analyze ./src

Output Formats

JSON (Compact)

bsharp analyze MyFile.cs --format json

Output: Single-line JSON, optimized for machine consumption

Pretty JSON

bsharp analyze MyFile.cs --format pretty-json

Output: Indented JSON, human-readable

Mermaid/DOT (Tree Command)

# Mermaid (default)
bsharp tree MyFile.cs --output diagram.mmd

# Graphviz DOT
bsharp tree MyFile.cs --format dot --output diagram.dot

Output: Mermaid (.mmd) or Graphviz DOT (.dot)


Error Handling

Parse Errors

$ bsharp parse InvalidSyntax.cs
Error: Parse failed at line 5, column 12
Expected ';' but found 'class'

public class MyClass
            ^

File Not Found

$ bsharp parse NonExistent.cs
Error: File not found: NonExistent.cs

Invalid Project

$ bsharp analyze Invalid.csproj
Error: Failed to parse project file: Invalid XML

Environment Variables

RUST_LOG

Control logging verbosity:

# Show debug-level logs and above
RUST_LOG=debug bsharp parse MyFile.cs

# Show only warnings and errors
RUST_LOG=warn bsharp analyze MyProject.csproj

# Show specific module logs
RUST_LOG=bsharp::parser=debug bsharp parse MyFile.cs

RUST_BACKTRACE

Enable stack traces on panic:

RUST_BACKTRACE=1 bsharp parse MyFile.cs

Performance Considerations

Large Files

For large files (> 10,000 lines), parsing may take several seconds:

# Monitor progress with info-level logging
RUST_LOG=info bsharp parse LargeFile.cs

Large Solutions

For solutions with many projects, use parallel analysis:

# Requires parallel_analysis feature
cargo build --release --features parallel_analysis
bsharp analyze LargeSolution.sln

Memory Usage

Memory usage scales with AST size. For very large codebases:

# Analyze incrementally by project (globstar makes ** match recursively in bash)
shopt -s globstar
for proj in **/*.csproj; do
    bsharp analyze "$proj" --out "$(basename "$proj" .csproj).json"
done

Integration with Other Tools

CI/CD Pipeline

# GitHub Actions example
- name: Analyze Code Quality
  run: |
    bsharp analyze MySolution.sln --out analysis.json
    # Upload analysis.json as artifact

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit

changed_files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.cs$')

for file in $changed_files; do
    if ! bsharp parse "$file" > /dev/null 2>&1; then
        echo "Parse error in $file"
        exit 1
    fi
done

Editor Integration

// VS Code tasks.json
{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "Analyze Current File",
            "type": "shell",
            "command": "bsharp",
            "args": [
                "analyze",
                "${file}",
                "--out",
                "${file}.analysis.json"
            ]
        }
    ]
}

Troubleshooting

Command Not Found

$ bsharp parse MyFile.cs
bsharp: command not found

Solution: Add bsharp to PATH or use full path:

/path/to/bsharp/target/release/bsharp parse MyFile.cs

Permission Denied

$ bsharp parse MyFile.cs
Permission denied

Solution: Make binary executable:

chmod +x /path/to/bsharp

Out of Memory

$ bsharp analyze HugeSolution.sln
Error: memory allocation failed

Solution: Analyze smaller subsets or increase system memory


Configuration Files

Analysis Configuration

Create .bsharp.toml in project root:

[analysis]
max_cyclomatic_complexity = 10
max_method_length = 50

[analysis.quality]
long_method = "warning"
god_class = "error"

[workspace]
follow_refs = true
include = ["**/*.cs"]
exclude = ["**/obj/**", "**/bin/**"]

Usage:

# Automatically loads .bsharp.toml from current directory
bsharp analyze MyProject.csproj

Shell Completion

Shell completion generation is currently not available in the CLI.


Examples

Example 1: Quick Syntax Check

# Check if all C# files in directory parse correctly
find . -name "*.cs" -exec bsharp parse {} \; 2>&1 | grep -i error

Example 2: Generate Documentation

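# Parse all files and capture each textual AST tree
# (parse writes to stdout, so redirect it; globstar makes ** recurse in bash)
shopt -s globstar
for file in src/**/*.cs; do
    bsharp parse "$file" > "${file}.ast.txt"
done

# Process the captured trees to generate documentation
# (custom script)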

Example 3: Code Quality Gate

#!/bin/bash
# quality-gate.sh

bsharp analyze MyProject.csproj --out report.json --format json

# Extract error count (diagnostics are nested under .diagnostics.diagnostics in the report)
errors=$(jq '.diagnostics.diagnostics | map(select(.severity == "error")) | length' report.json)

if [ "$errors" -gt 0 ]; then
    echo "Quality gate failed: $errors errors found"
    exit 1
fi

echo "Quality gate passed"

Example 4: Complexity Report

# Generate complexity report for all methods
bsharp analyze MySolution.sln --out complexity.json

# Extract high-complexity methods
jq '.diagnostics.diagnostics | map(select(.code == "MET001"))' complexity.json

CLI Architecture

Implementation

Location: src/bsharp_cli/

src/bsharp_cli/
├── src/
│   ├── main.rs         # CLI entry point, clap definitions
│   └── commands/
│       ├── mod.rs      # Command module exports
│       ├── parse.rs    # Parse command implementation
│       ├── tree.rs     # AST visualization command (Mermaid/DOT)
│       └── analyze.rs  # Analysis command
└── Cargo.toml

Command Pattern

Each command follows this pattern:

#![allow(unused)]
fn main() {
pub fn execute(input: PathBuf, /* other args */) -> Result<()> {
    // 1. Validate input
    // 2. Load/parse files
    // 3. Perform operation
    // 4. Generate output
    // 5. Handle errors
    Ok(())
}
}

Future Enhancements

Planned Features

  1. Interactive Mode

    • REPL for exploring AST
    • Interactive analysis
  2. Watch Mode

    • Monitor files for changes
    • Re-analyze on save
  3. Language Server

    • LSP implementation
    • IDE integration
  4. Web Interface

    • Browser-based visualization
    • Interactive reports


References

  • Implementation: src/bsharp_cli/
  • Commands: src/bsharp_cli/src/commands/
  • Clap Documentation: https://docs.rs/clap/

Parse Command

The parse command parses C# source code and prints a textual AST tree representation to stdout.


Usage

bsharp parse --input <INPUT> [--errors-json] [--emit-spans] [--no-color] [--lenient]

Arguments

<INPUT> (required)

  • Path to C# source file
  • Must have .cs extension
  • File must exist and be readable

Options

--errors-json

  • Print a machine-readable JSON error object to stdout on parse failure and exit non-zero
  • Disables pretty error output

See: Parse Errors JSON Output

--emit-spans

  • When used with --errors-json, include absolute and relative spans in the JSON under error.spans.
  • No effect unless --errors-json is set.

--no-color

  • Disable ANSI colors in pretty error output

--lenient

  • Enable best-effort recovery mode (default is strict)

Note: The --output option is currently not used; the command writes the textual tree to stdout.

Examples

Basic Parsing

# Parse and print textual AST tree to stdout
bsharp parse Program.cs

Batch Parsing

# Parse all C# files in a directory (prints textual trees)
for file in src/**/*.cs; do
    bsharp parse "$file"
done

Output

The command prints a human-readable textual tree describing the AST. For visualization outputs (Mermaid/DOT), use the tree command.


Error Handling

Parse Errors

$ bsharp parse InvalidSyntax.cs
Error: Parse failed

0: at line 5, in keyword "class":
public clas MyClass { }
       ^--- expected keyword "class"

1: in context "class declaration"

Error Information:

  • Line and column numbers
  • Context stack showing where parsing failed
  • Expected vs. actual input
  • Helpful error messages

Pretty error formatting

The parser integrates with the miette crate for rich, labeled diagnostics in pretty (non-JSON) mode. CLI parse errors are formatted from the underlying ErrorTree with spans and context information for easier debugging.

For programmatic formatting from parser code, see bsharp_parser::errors::to_miette_report which converts an ErrorTree to a miette::Report with source code attached.

File Errors

$ bsharp parse NonExistent.cs
Error: Failed to read file: NonExistent.cs
Caused by: No such file or directory (os error 2)

Use Cases

1. Syntax Validation

# Check if file has valid syntax
if bsharp parse MyFile.cs > /dev/null 2>&1; then
    echo "Syntax OK"
else
    echo "Syntax Error"
    exit 1
fi

2. AST Inspection

# Parse and page through the textual AST tree
bsharp parse MyClass.cs | less

3. Documentation Input

# Capture the textual tree and generate documentation using your own script
bsharp parse MyFile.cs > ast.txt
python generate_docs.py ast.txt > docs.md

4. Static Analysis

# Capture the textual tree and analyze it with a custom tool
bsharp parse MyFile.cs > ast.txt
./my-analyzer ast.txt

Performance

Parsing Speed

  • Small files (< 100 lines): < 10ms
  • Medium files (100-1000 lines): 10-100ms
  • Large files (1000-10000 lines): 100ms-1s
  • Very large files (> 10000 lines): 1-10s

Memory Usage

  • Memory usage scales linearly with file size
  • Typical: 1-5 MB per 1000 lines of code
  • Peak memory during AST construction

Integration

CI/CD Pipeline

# GitHub Actions
- name: Validate C# Syntax
  run: |
    find . -name "*.cs" | while read file; do
      bsharp parse "$file" || exit 1
    done

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit

git diff --cached --name-only --diff-filter=ACM | grep '\.cs$' | while read file; do
    if ! bsharp parse "$file" > /dev/null 2>&1; then
        echo "Parse error in $file"
        exit 1
    fi
done

Build Script

#!/bin/bash
# validate-syntax.sh

errors=0
for file in src/**/*.cs; do
    if ! bsharp parse "$file" > /dev/null 2>&1; then
        echo "ERROR: $file"
        ((errors++))
    fi
done

if [ $errors -gt 0 ]; then
    echo "Found $errors files with syntax errors"
    exit 1
fi

Comparison with Other Tools

vs. Roslyn

  • BSharp: Fast, standalone, JSON output
  • Roslyn: Full compiler, .NET required, complex API

vs. Tree-sitter

  • BSharp: C#-specific, complete AST
  • Tree-sitter: Multi-language, syntax tree only

Implementation

Location: src/bsharp_cli/src/commands/parse.rs

#![allow(unused)]
fn main() {
pub fn execute(
    input: PathBuf,
    output: Option<PathBuf>,
    errors_json: bool,
    no_color: bool,
    lenient: bool,
) -> Result<()> {
    // Read file, choose strict/lenient, parse, and print the textual AST tree to stdout.
    // The `output` argument is currently unused.
    // See the source file for detailed behavior and error formatting.
    Ok(())
}
}


References

  • Implementation: src/bsharp_cli/src/commands/parse.rs
  • Parser: src/bsharp_parser/src/
  • AST Definitions: src/bsharp_syntax/src/

Tree Visualization Command

The tree command generates a visualization of the Abstract Syntax Tree (AST) from C# source code in Mermaid or Graphviz DOT format.


Usage

bsharp tree <INPUT> [--output <FILE>] [--format mermaid|dot]

Arguments

<INPUT> (required)

  • Path to C# source file
  • Must have .cs extension

Options

--output, -o <FILE> (optional)

  • Output file path
  • Default: <input>.mmd for Mermaid, <input>.dot for DOT

--format <FORMAT> (optional)

  • One of: mermaid (default), dot (alias: graphviz)
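
The format aliases and default output names described above can be sketched as follows. `TreeFormat`, `parse_format`, and `default_output` are hypothetical names; the sketch assumes the default output replaces the input's `.cs` extension, which matches the `Program.cs` to `Program.mmd` example on this page.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TreeFormat {
    Mermaid,
    Dot,
}

/// Maps a `--format` value to a format; `graphviz` is an alias for `dot`.
pub fn parse_format(value: &str) -> Option<TreeFormat> {
    match value {
        "mermaid" => Some(TreeFormat::Mermaid),
        "dot" | "graphviz" => Some(TreeFormat::Dot),
        _ => None,
    }
}

/// Derives the default output path by swapping the input's extension.
pub fn default_output(input: &Path, format: TreeFormat) -> PathBuf {
    let ext = match format {
        TreeFormat::Mermaid => "mmd",
        TreeFormat::Dot => "dot",
    };
    input.with_extension(ext)
}
```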

Examples

Basic Visualization

# Generate Mermaid diagram (default)
bsharp tree Program.cs              # writes Program.mmd

# Generate Graphviz DOT diagram
bsharp tree Program.cs --format dot # writes Program.dot

# Specify output file
bsharp tree Program.cs --format dot --output ast-diagram.dot

View/Render

# Mermaid preview (e.g., VS Code Mermaid extension) or CLI renderer
# Graphviz render to PNG
dot -Tpng Program.dot -o Program.png

Output Formats

Mermaid

Outputs a simple top-level graph in Mermaid syntax (.mmd).

graph TD
n0["CompilationUnit\\nUsings: 1\\nDecls: 1"]
u0["Using using System;"]
n0 --> u0
d0["Class: Program"]
n0 --> d0

Graphviz DOT

Outputs a simple top-level graph in DOT syntax (.dot).

digraph AST {
  node [shape=box, fontname="Courier New"];
  n0 [label="CompilationUnit\\nUsings: 1\\nDecls: 1"];
  u0 [label="Using using System;"];
  n0 -> u0;
  d0 [label="Class: Program"];
  n0 -> d0;
}
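
For readers who want to see how a top-level graph like this could be produced, the following sketch emits similar DOT text from plain lists of usings and class names. It is not the actual `to_dot` renderer: `to_dot_sketch` and `escape` are hypothetical, the root label is simplified, and label escaping is limited to backslashes and double quotes.

```rust
/// Escapes characters that would terminate or corrupt a quoted DOT label.
fn escape(label: &str) -> String {
    label.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Emits a root node plus one child node and edge per using and per class.
pub fn to_dot_sketch(usings: &[&str], classes: &[&str]) -> String {
    let mut out = String::from("digraph AST {\n  node [shape=box];\n");
    out.push_str("  n0 [label=\"CompilationUnit\"];\n");
    for (i, using) in usings.iter().enumerate() {
        out.push_str(&format!(
            "  u{0} [label=\"Using {1}\"];\n  n0 -> u{0};\n",
            i,
            escape(using)
        ));
    }
    for (i, class) in classes.iter().enumerate() {
        out.push_str(&format!(
            "  d{0} [label=\"Class: {1}\"];\n  n0 -> d{0};\n",
            i,
            escape(class)
        ));
    }
    out.push_str("}\n");
    out
}
```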

Color Scheme

  • Gray - Root nodes (CompilationUnit)
  • Blue - Type declarations (Class, Interface, Struct)
  • Green - Member declarations (Method, Property, Field)
  • Yellow - Statements (If, For, While)
  • Orange - Expressions (Binary, Invocation)
  • Purple - Types (Primitive, Named, Generic)

Visualization Features

Node Information

Each node displays:

  • Node Type - AST node type name
  • Identifier - Name (for named nodes)
  • Additional Info - Modifiers, types, etc.

Tree Layout

  • Top-down - Root at top, leaves at bottom
  • Hierarchical - Parent-child relationships clear
  • Balanced - Nodes distributed evenly
  • Scalable - Adjusts to tree size

Use Cases

1. Understanding Code Structure

# Visualize complex class
bsharp tree ComplexClass.cs --output structure.mmd

2. Teaching/Documentation

# Generate diagrams for documentation
bsharp tree Example.cs --output docs/ast-example.mmd

3. Debugging Parser

# Verify parser output
bsharp tree TestCase.cs --output debug.mmd

4. Code Review

# Visualize changes
bsharp tree NewFeature.cs --output review.mmd

Limitations

Large Files

  • Files > 1000 lines may produce very large diagrams
  • Consider visualizing specific classes/methods only

Complex Nesting

  • Deeply nested structures may be hard to read
  • The rendered diagram may require horizontal scrolling

Performance

  • Generation time increases with AST size
  • Large files (> 5000 lines) may take several seconds

Advanced Usage

Selective Visualization

# Extract specific class and visualize
# (requires custom script to extract class)
extract-class.sh MyFile.cs MyClass > temp.cs
bsharp tree temp.cs --output MyClass-ast.mmd
rm temp.cs

Batch Generation

# Generate visualizations for all files (globstar makes ** recurse in bash)
shopt -s globstar
mkdir -p diagrams
for file in src/**/*.cs; do
    output="diagrams/$(basename "$file" .cs).mmd"
    bsharp tree "$file" --output "$output"
done

Integration with Documentation

# MyClass Documentation

## AST Structure

![AST Diagram](./diagrams/MyClass.svg)

The class structure shows...

Implementation

Location: src/bsharp_cli/src/commands/tree.rs

#![allow(unused)]
fn main() {
pub fn execute(args: Box<TreeArgs>) -> Result<()> {
    // Parses input in lenient mode, then writes Mermaid (.mmd) or DOT (.dot)
    // using bsharp_syntax::node::render::{to_mermaid, to_dot}.
    Ok(())
}
}

Renderer functions live in src/bsharp_syntax/src/node/render.rs:

#![allow(unused)]
fn main() {
to_mermaid(&ast);
to_dot(&ast);
}

Customization

Future Enhancements

  1. Interactive SVG

    • Click to expand/collapse nodes
    • Hover for details
    • Search functionality
  2. Export Formats

    • PNG/PDF export
    • PlantUML format
  3. Filtering

    • Show only specific node types
    • Hide implementation details
    • Focus on structure
  4. Styling

    • Custom color schemes
    • Font customization
    • Layout options

Troubleshooting

Diagram Too Large

Problem: The rendered diagram is too large to view

Solution:

  • Visualize smaller code sections
  • Use a viewer with zoom/pan
  • Export to PDF for printing

Overlapping Nodes

Problem: Nodes overlap in complex trees

Solution:

  • Render the diagram at larger dimensions
  • Simplify code structure
  • Use horizontal layout (future feature)

Missing Nodes

Problem: Some AST nodes not shown

Solution:

  • Check parser output with parse command
  • Report issue if nodes are missing


References

  • Implementation: src/bsharp_cli/src/commands/tree.rs
  • Formats: Mermaid or Graphviz DOT

Analyze Command

The analyze command performs comprehensive code analysis on C# files, projects, or solutions, generating detailed reports with diagnostics, metrics, and quality assessments.


Usage

bsharp analyze <INPUT> [OPTIONS]

Arguments

<INPUT> (required)

  • Path to C# source file (.cs)
  • Path to project file (.csproj)
  • Path to solution file (.sln)
  • Path to directory

Options

Output Options

--out <FILE>

  • Output file path for analysis report (JSON)
  • Default: stdout
  • Creates parent directories if needed

--format <FORMAT>

  • Output format: json (compact) or pretty-json (indented)
  • Default: pretty-json

Configuration

--config <FILE>

  • Path to analysis configuration file (JSON or TOML)
  • Overrides default settings
  • CLI flags override config file settings

See: Configuration Overview

Workspace Options

--follow-refs <BOOL>

  • Follow ProjectReference dependencies transitively
  • Default: true
  • Set to false to analyze only the specified project

--include <GLOB>...

  • Include only files matching glob patterns
  • Multiple patterns allowed
  • Example: --include "**/*Service.cs" "**/*Controller.cs"

--exclude <GLOB>...

  • Exclude files matching glob patterns
  • Multiple patterns allowed
  • Example: --exclude "**/obj/**" "**/bin/**" "**/Tests/**"

Analysis Control

--enable-ruleset <ID>...

  • Enable specific rulesets by ID
  • Multiple IDs allowed
  • Overrides config file
  • Example: --enable-ruleset naming quality

--disable-ruleset <ID>...

  • Disable specific rulesets by ID
  • Multiple IDs allowed
  • Example: --disable-ruleset experimental

--enable-pass <ID>...

  • Enable specific analysis passes by ID
  • Multiple IDs allowed
  • Example: --enable-pass indexing control_flow

--disable-pass <ID>...

  • Disable specific analysis passes by ID
  • Multiple IDs allowed
  • Example: --disable-pass dependencies

--severity <CODE=LEVEL>...

  • Override diagnostic severity for specific codes
  • Format: CODE=level where level is error, warning, info, or hint
  • Multiple overrides allowed
  • Example: --severity MET001=error QUAL010=warning
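
Parsing a single `CODE=level` override could look like the sketch below. `parse_override` is a hypothetical helper rather than the CLI's actual parser, and accepting levels case-insensitively is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error,
    Warning,
    Info,
    Hint,
}

/// Parses one `CODE=level` override such as `MET001=error`.
pub fn parse_override(arg: &str) -> Result<(String, Severity), String> {
    let (code, level) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected CODE=level, got '{}'", arg))?;
    if code.is_empty() {
        return Err(format!("missing diagnostic code in '{}'", arg));
    }
    // Assumption: levels are matched case-insensitively.
    let severity = match level.to_ascii_lowercase().as_str() {
        "error" => Severity::Error,
        "warning" => Severity::Warning,
        "info" => Severity::Info,
        "hint" => Severity::Hint,
        other => return Err(format!("unknown severity '{}'", other)),
    };
    Ok((code.to_string(), severity))
}
```

Returning a `Result` lets the caller report every malformed override instead of silently ignoring it.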

Legacy Options (Single File Mode)

--symbol <NAME>

  • Search for specific symbol by name
  • Only works in single-file mode
  • Prints symbol locations and information

Examples

Basic Analysis

# Analyze single file
bsharp analyze MyFile.cs

# Analyze project
bsharp analyze MyProject.csproj

# Analyze solution
bsharp analyze MySolution.sln

Output to File

# Save report to file
bsharp analyze MyProject.csproj --out report.json

# Compact JSON format
bsharp analyze MyProject.csproj --out report.json --format json

Using Configuration File

# Load config from file
bsharp analyze MyProject.csproj --config .bsharp.toml

# Config file with CLI overrides
bsharp analyze MyProject.csproj \
    --config .bsharp.toml \
    --enable-ruleset quality \
    --severity MET001=error

Workspace Filtering

# Analyze only service files
bsharp analyze MySolution.sln --include "**/*Service.cs"

# Exclude test files
bsharp analyze MySolution.sln --exclude "**/Tests/**"

# Multiple filters
bsharp analyze MySolution.sln \
    --include "src/**/*.cs" \
    --exclude "**/obj/**" "**/bin/**" "**/Tests/**"

Controlling Analysis

# Enable specific rulesets
bsharp analyze MyProject.csproj \
    --enable-ruleset naming quality control_flow

# Disable experimental features
bsharp analyze MyProject.csproj \
    --disable-ruleset experimental

# Enable/disable specific passes
bsharp analyze MyProject.csproj \
    --enable-pass indexing control_flow \
    --disable-pass dependencies

Severity Overrides

# Treat specific warnings as errors
bsharp analyze MyProject.csproj \
    --severity MET001=error \
    --severity QUAL001=error

# Downgrade specific errors to warnings
bsharp analyze MyProject.csproj \
    --severity CS0168=warning

Symbol Search (Single File)

# Find symbol in file
bsharp analyze MyFile.cs --symbol MyClass

# Output:
# Found symbol 'MyClass' at line 10, column 14

Analysis Modes

Single File Mode

Triggered when: Input is a .cs file

Behavior:

  • Parses single file
  • Runs analysis pipeline on CompilationUnit
  • Supports --symbol option for symbol search
  • Faster for quick checks

Example:

bsharp analyze Program.cs --out analysis.json

Workspace Mode

Triggered when: Input is .sln, .csproj, or directory

Behavior:

  • Loads entire workspace
  • Discovers all source files
  • Follows project references (if --follow-refs true)
  • Applies include/exclude filters
  • Analyzes all files deterministically
  • Aggregates results into single report

Example:

bsharp analyze MySolution.sln \
    --follow-refs true \
    --exclude "**/Tests/**" \
    --out workspace-analysis.json
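The two triggers can be summarised as a dispatch on the input path. This is a sketch of the rule as documented here, not the CLI's code. `AnalysisMode` and `detect_mode` are hypothetical names, and matching extensions case-sensitively is an assumption.

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
enum AnalysisMode {
    SingleFile,
    Workspace,
}

/// Pick the analysis mode from the input path, or `None` if unsupported.
fn detect_mode(input: &Path) -> Option<AnalysisMode> {
    // Directories always mean workspace mode (this touches the filesystem).
    if input.is_dir() {
        return Some(AnalysisMode::Workspace);
    }
    match input.extension().and_then(|ext| ext.to_str()) {
        Some("cs") => Some(AnalysisMode::SingleFile),
        Some("sln") | Some("csproj") => Some(AnalysisMode::Workspace),
        _ => None,
    }
}
```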

Configuration File Format

TOML Format

.bsharp.toml:

[analysis]
max_cyclomatic_complexity = 10
max_method_length = 50

[analysis.control_flow]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4

[analysis.quality]
long_method = "warning"
god_class = "error"
empty_catch = "error"

[workspace]
follow_refs = true
include = ["src/**/*.cs"]
exclude = ["**/obj/**", "**/bin/**", "**/Tests/**"]

[enable_rulesets]
naming = true
quality = true
control_flow = true

[enable_passes]
indexing = true
control_flow = true
dependencies = true

[rule_severities]
MET001 = "error"
QUAL001 = "warning"

JSON Format

.bsharp.json:

{
  "analysis": {
    "max_cyclomatic_complexity": 10,
    "max_method_length": 50,
    "control_flow": {
      "cf_high_complexity_threshold": 10,
      "cf_deep_nesting_threshold": 4
    }
  },
  "workspace": {
    "follow_refs": true,
    "include": ["src/**/*.cs"],
    "exclude": ["**/obj/**", "**/bin/**"]
  },
  "enable_rulesets": {
    "naming": true,
    "quality": true
  },
  "enable_passes": {
    "indexing": true,
    "control_flow": true
  },
  "rule_severities": {
    "MET001": "error",
    "QUAL001": "warning"
  }
}

Output Format

Analysis Report Structure

{
  "schema_version": 1,
  "diagnostics": {
    "items": [
      {
        "code": "MET001",
        "severity": "warning",
        "message": "Method has high cyclomatic complexity",
        "file": "src/OrderService.cs",
        "line": 42,
        "column": 17,
        "end_line": 85,
        "end_column": 5
      }
    ]
  },
  "metrics": {
    "total_lines": 1250,
    "code_lines": 980,
    "comment_lines": 150,
    "blank_lines": 120,
    "class_count": 15,
    "interface_count": 3,
    "method_count": 87,
    "total_complexity": 245,
    "max_complexity": 18,
    "max_nesting_depth": 5
  },
  "cfg": {
    "total_methods": 87,
    "high_complexity_count": 5,
    "deep_nesting_count": 3
  },
  "deps": {
    "total_nodes": 15,
    "total_edges": 42,
    "circular_dependencies": 0,
    "max_depth": 4
  },
  "workspace_warnings": [
    "Failed to parse project: MyBrokenProject.csproj"
  ]
}

Diagnostic Fields

  • code: Diagnostic code (e.g., MET001, QUAL010)
  • severity: error, warning, info, or hint
  • message: Human-readable description
  • file: Source file path
  • line/column: Start position
  • end_line/end_column: End position (optional)
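For tools that consume the report, the diagnostic fields map naturally onto a struct with optional end positions. The sketch below is a hypothetical consumer-side model, not a type exported by BSharp. `Diagnostic`, `count_with_severity` and `format_location` are illustrative names, and severity is kept as a plain string to mirror the JSON.

```rust
/// Consumer-side mirror of one entry in `diagnostics.items` (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Diagnostic {
    code: String,
    severity: String,
    message: String,
    file: String,
    line: u32,
    column: u32,
    end_line: Option<u32>,   // optional in the report
    end_column: Option<u32>, // optional in the report
}

/// Count diagnostics with the given severity, e.g. "error".
fn count_with_severity(items: &[Diagnostic], severity: &str) -> usize {
    items.iter().filter(|d| d.severity == severity).count()
}

/// Render `file:line:column`, adding the end position when both parts exist.
fn format_location(d: &Diagnostic) -> String {
    match (d.end_line, d.end_column) {
        (Some(el), Some(ec)) => format!("{}:{}:{}-{}:{}", d.file, d.line, d.column, el, ec),
        _ => format!("{}:{}:{}", d.file, d.line, d.column),
    }
}
```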

Metrics Fields

  • total_lines: Total lines including blank/comments
  • code_lines: Lines with actual code
  • comment_lines: Lines with comments
  • blank_lines: Empty lines
  • class_count: Number of classes
  • interface_count: Number of interfaces
  • method_count: Number of methods
  • total_complexity: Sum of all method complexities
  • max_complexity: Highest method complexity
  • max_nesting_depth: Deepest nesting level

Available Rulesets

Built-in Rulesets

naming - Naming convention rules

  • Class names: PascalCase
  • Method names: PascalCase
  • Field names: camelCase with _ prefix
  • Constant names: UPPER_CASE or PascalCase

quality - Code quality rules

  • Long method detection
  • Long parameter list
  • God class detection
  • Empty catch blocks
  • Magic numbers

control_flow - Control flow rules

  • High complexity warnings
  • Deep nesting warnings
  • Unreachable code detection

semantic - Semantic rules

  • Type checking
  • Null reference analysis
  • Resource leak detection

Available Passes

Built-in Passes

indexing (Phase: Index)

  • Builds symbol index
  • Creates name index
  • Generates FQN map

control_flow (Phase: Semantic)

  • Analyzes control flow
  • Calculates complexity metrics
  • Detects control flow smells

dependencies (Phase: Global)

  • Builds dependency graph
  • Detects circular dependencies
  • Calculates coupling metrics

reporting (Phase: Reporting)

  • Generates final report
  • Aggregates diagnostics
  • Summarizes artifacts
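Each pass belongs to a phase, which suggests a simple scheduling rule: order passes by phase, then by ID for determinism. The sketch below assumes the phases run in the order they are listed on this page (Index, Semantic, Global, Reporting). That order, the `Phase` enum and the `schedule` function are assumptions for illustration, not the pipeline's actual code.

```rust
/// Phases in the assumed execution order; the derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Index,
    Semantic,
    Global,
    Reporting,
}

/// Order pass IDs by phase first and by ID second.
fn schedule(mut passes: Vec<(&'static str, Phase)>) -> Vec<&'static str> {
    passes.sort_by_key(|&(id, phase)| (phase, id));
    passes.into_iter().map(|(id, _)| id).collect()
}
```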

Diagnostic Codes

Metrics (MET)

  • MET001: High cyclomatic complexity
  • MET002: Deep nesting detected
  • MET003: Long method
  • MET004: Long parameter list

Quality (QUAL)

  • QUAL001: Long method
  • QUAL002: Long parameter list
  • QUAL010: Empty catch block
  • QUAL020: Naming convention violation
  • QUAL030: Resource not disposed

Control Flow (CF)

  • CF001: Unreachable code
  • CF002: High complexity
  • CF003: Deep nesting

Dependencies (DEP)

  • DEP001: Circular dependency
  • DEP002: High coupling
  • DEP003: Unstable dependency

Performance

Analysis Speed

  • Single file (< 1000 lines): < 100ms
  • Small project (< 10 files): < 500ms
  • Medium project (10-50 files): 500ms-2s
  • Large solution (100+ files): 2-10s

Memory Usage

  • Scales with codebase size
  • Typical: 50-200 MB for medium projects
  • Artifacts cached in memory during analysis

Parallel Analysis

With parallel_analysis feature enabled:

cargo build --release --features parallel_analysis

With this feature enabled, files are analyzed in parallel, which mainly benefits large workspaces.


Integration

CI/CD Pipeline

# GitHub Actions
- name: Code Quality Analysis
  run: |
    bsharp analyze MySolution.sln \
      --out analysis.json \
      --format json \
      --severity MET001=error QUAL001=error
    
    # Check for errors
    errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' analysis.json)
    if [ "$errors" -gt 0 ]; then
      echo "Quality gate failed: $errors errors"
      exit 1
    fi

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit

# Read staged C# paths line by line so file names containing spaces survive
while IFS= read -r file; do
    result=$(bsharp analyze "$file" --format json 2>/dev/null)
    errors=$(echo "$result" | jq '.diagnostics.items | map(select(.severity == "error")) | length')

    # Treat missing output as zero errors
    if [ "${errors:-0}" -gt 0 ]; then
        echo "Analysis errors in $file"
        exit 1
    fi
done < <(git diff --cached --name-only --diff-filter=ACM | grep '\.cs$')

Quality Gate Script

#!/bin/bash
# quality-gate.sh

bsharp analyze MySolution.sln \
    --out report.json \
    --format json \
    --enable-ruleset naming quality control_flow \
    --severity MET001=error QUAL001=error

# Extract metrics
errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' report.json)
max_complexity=$(jq '.metrics.max_complexity' report.json)

echo "Errors: $errors"
echo "Max Complexity: $max_complexity"

if [ "$errors" -gt 0 ]; then
    echo "❌ Quality gate failed: $errors errors found"
    exit 1
fi

if [ "$max_complexity" -gt 15 ]; then
    echo "❌ Quality gate failed: complexity $max_complexity exceeds threshold 15"
    exit 1
fi

echo "✅ Quality gate passed"

Troubleshooting

Analysis Fails

$ bsharp analyze MyProject.csproj
Error: Failed to load workspace

Solutions:

  • Check project file is valid XML
  • Verify all referenced projects exist
  • Use --follow-refs false to skip references

Out of Memory

Error: memory allocation failed

Solutions:

  • Analyze smaller subsets with --include/--exclude
  • Disable expensive passes with --disable-pass
  • Increase system memory

Slow Analysis

Solutions:

  • Build with parallel_analysis feature
  • Exclude unnecessary files
  • Disable unused rulesets/passes


References

  • Implementation: src/bsharp_cli/src/commands/analyze.rs
  • Pipeline: src/bsharp_analysis/src/framework/pipeline.rs
  • Configuration: src/bsharp_analysis/src/context.rs
2025-11-17 15:18:26 • commit: 03a4e25

Format Command

The format command formats C# code using the built-in formatter and syntax emitters.


Usage

bsharp format <INPUT> [--write <BOOL>] [--print] [--newline-mode lf|crlf] \
  [--max-consecutive-blank-lines <N>] [--blank-line-between-members <BOOL>] \
  [--trim-trailing-whitespace <BOOL>] [--emit-trace] [--emit-trace-file <FILE>]

Arguments

<INPUT> (required)

  • Path to .cs file or directory
  • When a directory is given, formats all .cs files recursively
  • Hidden directories and bin/, obj/, target/ are skipped

Options

--write, -w <BOOL>

  • Write changes to files in-place
  • Default: true
  • When false and <INPUT> is a single file, the formatted content is printed to stdout
  • When false and formatting differences are found for multiple files, exits with code 2

--print

  • Always print formatted output for a single-file input and exit
  • Useful for piping to other tools; does not write to disk regardless of --write
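The interplay of --write and --print described above can be read as a small decision table. The sketch below encodes only what this page states. `Outcome` and `decide` are hypothetical names. Two cases are assumptions because the page does not specify them: the exit code 0 when a check run finds no differences, and --print with a directory input falling through to the --write rules.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    WriteInPlace,
    PrintToStdout,
    /// Check-only run over several files; carries the process exit code.
    Check { exit_code: i32 },
}

fn decide(write: bool, print: bool, single_file: bool, any_diff: bool) -> Outcome {
    // --print wins for single-file input, regardless of --write.
    if print && single_file {
        return Outcome::PrintToStdout;
    }
    if write {
        return Outcome::WriteInPlace;
    }
    if single_file {
        Outcome::PrintToStdout
    } else {
        // Documented: exit code 2 when differences are found (0 otherwise is assumed).
        Outcome::Check { exit_code: if any_diff { 2 } else { 0 } }
    }
}
```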

--newline-mode <MODE>

  • Newline mode: lf (default) or crlf

--max-consecutive-blank-lines <N>

  • Maximum consecutive blank lines to keep (default: 1)

--blank-line-between-members <BOOL>

  • Insert a blank line between type members (default: true)

--trim-trailing-whitespace <BOOL>

  • Trim trailing whitespace (default: true)
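The three text-level options (newline mode, maximum consecutive blank lines, trailing-whitespace trimming) can be illustrated on plain text. This sketch is not the BSharp formatter, which works on the AST. `WhitespaceOptions`, `NewlineMode` and `normalize_whitespace` are hypothetical names, and --blank-line-between-members is left out because it needs syntax information.

```rust
#[derive(Clone, Copy)]
enum NewlineMode {
    Lf,
    Crlf,
}

struct WhitespaceOptions {
    newline_mode: NewlineMode,
    max_consecutive_blank_lines: usize,
    trim_trailing_whitespace: bool,
}

fn normalize_whitespace(source: &str, opts: &WhitespaceOptions) -> String {
    let newline = match opts.newline_mode {
        NewlineMode::Lf => "\n",
        NewlineMode::Crlf => "\r\n",
    };
    let mut out = String::new();
    let mut blank_run = 0usize;
    // `lines()` accepts both "\n" and "\r\n" endings in the input.
    for raw in source.lines() {
        let line = if opts.trim_trailing_whitespace { raw.trim_end() } else { raw };
        if line.trim().is_empty() {
            blank_run += 1;
            if blank_run > opts.max_consecutive_blank_lines {
                continue; // drop blank lines beyond the limit
            }
        } else {
            blank_run = 0;
        }
        out.push_str(line);
        out.push_str(newline);
    }
    out
}
```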

--emit-trace

  • Enable emission tracing (JSONL) for debugging formatter behavior
  • Can also be enabled via environment variable BSHARP_EMIT_TRACE=1

--emit-trace-file <FILE>

  • Path to write the trace JSONL (defaults to stdout when omitted)

Examples

# Format a single file in-place
bsharp format Program.cs

# Print formatted output to stdout (do not write)
bsharp format Program.cs --write false

# Print formatted output to stdout without writing to disk, regardless of --write
bsharp format Program.cs --print

# Format a directory recursively
bsharp format src/

# Use CRLF newlines and avoid extra blank lines
bsharp format Program.cs --newline-mode crlf --max-consecutive-blank-lines 1

# Enable emission tracing to a file
bsharp format Program.cs --emit-trace --emit-trace-file format_trace.jsonl

Implementation

  • Command: src/bsharp_cli/src/commands/format.rs
  • Formatter: bsharp_syntax::Formatter with FormatOptions
  • Emission tracing is controlled by CLI flags or BSHARP_EMIT_TRACE and recorded as JSONL.
  • Files that fail to parse are skipped; a summary is printed and they are not modified.


Parse Errors JSON Output

When bsharp parse is run with --errors-json, parse failures are emitted as a single JSON object to stdout and the process exits with a non-zero code.


Schema

{
  "error": {
    "kind": "parse_error",
    "file": "<path>",
    "line": 0,
    "column": 0,
    "expected": "",
    "found": "",
    "line_text": "",
    "message": "<pretty formatted message>",
    "spans": {
      "abs": { "start": 0, "end": 1 },
      "rel": {
        "start": { "line": 0, "column": 0 },
        "end": { "line": 0, "column": 1 }
      }
    }
  }
}

  • kind – always parse_error for parse failures.
  • file – path of the file being parsed.
  • line, column – 1-based location of the deepest error span.
  • expected, found – reserved fields (currently empty strings).
  • line_text – the full source line at the error location.
  • message – multi-line pretty message formatted from the parser's error tree.
  • spans – present only when --emit-spans is provided; includes absolute byte range and relative line/column positions.
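The relationship between an absolute byte offset (as in spans.abs) and the 1-based line, column and line_text fields can be sketched as follows. This is an illustration, not the parser's code. `locate` is a hypothetical helper, and counting columns in bytes rather than characters is an assumption.

```rust
/// Map a byte offset to (1-based line, 1-based byte column, text of that line).
fn locate(source: &str, offset: usize) -> (usize, usize, &str) {
    let bytes = source.as_bytes();
    let offset = offset.min(bytes.len());
    // Start of the line: just after the previous '\n', or the start of the input.
    let line_start = bytes[..offset]
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    // End of the line: the next '\n', or the end of the input.
    let line_end = bytes[offset..]
        .iter()
        .position(|&b| b == b'\n')
        .map_or(bytes.len(), |i| offset + i);
    let line = 1 + bytes[..line_start].iter().filter(|&&b| b == b'\n').count();
    let column = 1 + (offset - line_start);
    let line_text = source[line_start..line_end].trim_end_matches('\r');
    (line, column, line_text)
}
```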

Example

bsharp parse Invalid.cs --errors-json | jq

{
  "error": {
    "kind": "parse_error",
    "file": "Invalid.cs",
    "line": 7,
    "column": 12,
    "expected": "",
    "found": "",
    "line_text": "public clas Program { }",
    "message": "0: at 7:12: expected keyword \"class\"\n  public clas Program { }\n           ^\nContexts:\n  - class declaration\n"
  }
}

Notes

  • In pretty (non-JSON) mode, errors are sent to stderr with optional ANSI colors (disable via --no-color or NO_COLOR=1).
  • --errors-json disables pretty errors and always prints the JSON object.

Workspace Loading

The BSharp workspace loader builds a workspace from a solution file (.sln), a project file (.csproj), or a directory in which it discovers either.


Overview

Location: src/bsharp_analysis/src/workspace/

The workspace loader:

  • Parses Visual Studio solution files (.sln)
  • Parses MSBuild project files (.csproj)
  • Discovers source files
  • Resolves project references
  • Handles multiple projects deterministically

Workspace Model

Core Types

use std::path::PathBuf;

pub struct Workspace {
    pub root: PathBuf,
    pub projects: Vec<Project>,
    pub solution: Option<Solution>,
    pub source_map: SourceMap,
}

pub struct Project {
    pub name: String,
    pub path: PathBuf,
    pub target_framework: String,
    pub output_type: String,
    pub files: Vec<ProjectFile>,
    pub references: Vec<ProjectRef>,
    pub package_references: Vec<PackageReference>,
    pub errors: Vec<String>,
}

pub struct Solution {
    pub name: String,
    pub path: PathBuf,
    pub projects: Vec<SolutionProject>,
}

Loading Workspaces

WorkspaceLoader API

// Signatures only; bodies are omitted in this overview.
pub struct WorkspaceLoader;

impl WorkspaceLoader {
    // Load from any path (auto-detects type)
    pub fn from_path(path: &Path) -> Result<Workspace>;

    // Load with options
    pub fn from_path_with_options(
        path: &Path,
        opts: WorkspaceLoadOptions
    ) -> Result<Workspace>;
}

pub struct WorkspaceLoadOptions {
    pub follow_refs: bool,  // Follow ProjectReference transitively
}

Loading from Solution File

use std::path::Path;

use bsharp_analysis::workspace::WorkspaceLoader;

let workspace = WorkspaceLoader::from_path(Path::new("MySolution.sln"))?;

println!("Loaded {} projects", workspace.projects.len());
for project in &workspace.projects {
    println!("  - {}: {} files", project.name, project.files.len());
}

Loading from Project File

let workspace = WorkspaceLoader::from_path(Path::new("MyProject.csproj"))?;

// Automatically follows ProjectReference if follow_refs = true
assert!(workspace.projects.len() >= 1);

Loading from Directory

let workspace = WorkspaceLoader::from_path(Path::new("./src"))?;

// Discovers .sln or .csproj files in directory

Solution File Parsing

Solution Format

Example .sln:

Microsoft Visual Studio Solution File, Format Version 12.00
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "MyApp", "MyApp\MyApp.csproj", "{GUID}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "MyLib", "MyLib\MyLib.csproj", "{GUID}"
EndProject

Parsing Implementation

Location: src/bsharp_analysis/src/workspace/sln/reader.rs

use std::fs;
use std::path::Path;

pub struct SolutionReader;

impl SolutionReader {
    pub fn read(path: &Path) -> Result<Solution> {
        let content = fs::read_to_string(path)?;
        Self::parse(&content, path)
    }

    fn parse(content: &str, base_path: &Path) -> Result<Solution> {
        // Outline of the steps only; see reader.rs for the body:
        // Parse solution format
        // Extract project entries
        // Resolve project paths
    }
}

Solution Structure

pub struct Solution {
    pub name: String,
    pub path: PathBuf,
    pub projects: Vec<SolutionProject>,
}

pub struct SolutionProject {
    pub name: String,
    pub path: PathBuf,
    pub type_guid: String,
    pub guid: String,
}
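To show how a Project(...) line from the example .sln maps onto the four SolutionProject fields, here is a std-only sketch. It is not the code in sln/reader.rs. `parse_project_line` is a hypothetical helper, the local `SolutionProject` simply mirrors the struct above, and two choices are assumptions: keeping the GUIDs with their braces, and converting backslashes to `/` without resolving the path against the solution directory.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
struct SolutionProject {
    name: String,
    path: PathBuf,
    type_guid: String,
    guid: String,
}

/// Parse `Project("{TYPE}") = "Name", "Rel\Path.csproj", "{GUID}"`.
fn parse_project_line(line: &str) -> Option<SolutionProject> {
    let line = line.trim();
    if !line.starts_with("Project(") {
        return None;
    }
    // After splitting on '"', every second piece lies inside a quoted string.
    let quoted: Vec<&str> = line.split('"').skip(1).step_by(2).collect();
    match quoted.as_slice() {
        [type_guid, name, path, guid] => Some(SolutionProject {
            name: (*name).to_string(),
            // .sln files use Windows separators; convert them for portability.
            path: PathBuf::from(path.replace('\\', "/")),
            type_guid: (*type_guid).to_string(),
            guid: (*guid).to_string(),
        }),
        _ => None,
    }
}
```

Lines such as `EndProject` or the format header yield `None` and can be skipped by the caller.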

Project File Parsing

Project Format

Example .csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <OutputType>Exe</OutputType>
  </PropertyGroup>
  
  <ItemGroup>
    <Compile Include="Program.cs" />
    <Compile Include="Utils.cs" />
  </ItemGroup>
  
  <ItemGroup>
    <ProjectReference Include="..\MyLib\MyLib.csproj" />
  </ItemGroup>
  
  <ItemGroup>
    <PackageReference Include="Newtonsoft.Json" Version="13.0.1" />
  </ItemGroup>
</Project>

Parsing Implementation

Location: src/bsharp_analysis/src/workspace/csproj/reader.rs

use std::fs;
use std::path::Path;

pub struct CsprojReader;

impl CsprojReader {
    pub fn read(path: &Path) -> Result<Project> {
        let content = fs::read_to_string(path)?;
        Self::parse(&content, path)
    }

    fn parse(content: &str, project_path: &Path) -> Result<Project> {
        // Outline of the steps only; see reader.rs for the body:
        // Parse XML
        // Extract properties (TargetFramework, OutputType)
        // Discover source files (Compile items)
        // Extract ProjectReference entries
        // Extract PackageReference entries
    }
}

Source File Discovery

Glob Patterns:

  • Default: **/*.cs (all C# files recursively)
  • Respects <Compile Include="..." /> items
  • Respects <Compile Remove="..." /> exclusions
  • Excludes obj/ and bin/ directories

Implementation:

use std::path::Path;

fn discover_source_files(project_dir: &Path) -> Vec<ProjectFile> {
    let pattern = project_dir.join("**/*.cs");
    let pattern = pattern.to_string_lossy();
    let mut files = Vec::new();

    // glob::glob fails only for a malformed pattern; each yielded entry is
    // itself a Result, so unreadable paths are skipped via flatten().
    let entries = glob::glob(&pattern).expect("glob pattern is valid");
    for path in entries.flatten() {
        // Skip obj/ and bin/
        if path.components().any(|c| c.as_os_str() == "obj" || c.as_os_str() == "bin") {
            continue;
        }

        files.push(ProjectFile {
            path,
            kind: ProjectFileKind::Compile,
        });
    }

    files
}
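The Compile Include and Compile Remove rules listed above can be thought of as set operations on the discovered files. The sketch below is a simplification and not BSharp's logic: `effective_sources` is a hypothetical function, items are treated as literal paths rather than globs, and applying all Include items before all Remove items is an assumption.

```rust
use std::collections::BTreeSet;

/// Combine discovered files with explicit Compile Include/Remove items.
/// The BTreeSet removes duplicates and keeps the result sorted.
fn effective_sources<'a>(
    discovered: &[&'a str],
    include: &[&'a str],
    remove: &[&'a str],
) -> Vec<String> {
    let mut files: BTreeSet<&str> = discovered.iter().copied().collect();
    files.extend(include.iter().copied());
    for path in remove {
        files.remove(path);
    }
    files.into_iter().map(str::to_string).collect()
}
```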

Project References

Transitive Resolution

follow_refs Option:

let opts = WorkspaceLoadOptions { follow_refs: true };
let workspace = WorkspaceLoader::from_path_with_options(path, opts)?;

Behavior:

  • Follows <ProjectReference> transitively
  • Loads all referenced projects
  • Avoids duplicates
  • Stays within workspace root
  • Deterministic ordering (sorted by path)

Example:

MyApp.csproj
  → MyLib.csproj
    → MyCore.csproj

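Result: all three projects are loaded. They are discovered in the order MyApp, MyLib, MyCore and then sorted by path, so with sibling project directories the final order is [MyApp, MyCore, MyLib].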

Implementation

use std::collections::{HashSet, VecDeque};
use std::path::Path;

fn follow_project_references(root: &Path, projects: &mut Vec<Project>) {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();

    // Seed the queue with the initially loaded projects
    for proj in projects.iter() {
        seen.insert(proj.path.clone());
        queue.push_back(proj.path.clone());
    }

    // BFS traversal
    while let Some(proj_path) = queue.pop_front() {
        let proj = match CsprojReader::read(&proj_path) {
            Ok(p) => p,
            Err(_) => continue,
        };
        // A project file path always has a parent directory
        let proj_dir = proj_path.parent().unwrap();

        for ref_path in proj.references.iter().map(|r| &r.path) {
            // Resolve relative to project directory. join() keeps any `..`
            // components, so the checks below assume a normalized path.
            let abs_path = proj_dir.join(ref_path);

            // Skip if outside root
            if !abs_path.starts_with(root) {
                continue;
            }

            // insert() returns true only for paths not seen before
            if seen.insert(abs_path.clone()) {
                queue.push_back(abs_path.clone());

                // Load and add project
                if let Ok(referenced_proj) = CsprojReader::read(&abs_path) {
                    projects.push(referenced_proj);
                }
            }
        }
    }

    // Sort for determinism
    projects.sort_by(|a, b| a.path.cmp(&b.path));
}
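The "stays within workspace root" and "avoids duplicates" rules both depend on comparing paths, and a ProjectReference such as `..\MyLib\MyLib.csproj` contains `..` components and Windows separators. The sketch below shows one way to normalize such a reference lexically before comparing. It is an assumption about how this could be done, not the loader's actual code. `normalize_lexically` and `resolve_reference` are hypothetical helpers, and symlinks are deliberately not resolved.

```rust
use std::path::{Component, Path, PathBuf};

/// Remove `.` and resolve `..` purely lexically (no filesystem access).
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let ends_with_name =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                if ends_with_name {
                    out.pop(); // `a/b/..` becomes `a`
                } else if !out.has_root() {
                    out.push(".."); // keep leading `..` on relative paths
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Resolve a ProjectReference Include value relative to the referencing project file.
fn resolve_reference(project_file: &Path, include: &str) -> PathBuf {
    let relative = include.replace('\\', "/");
    let project_dir = project_file.parent().unwrap_or_else(|| Path::new(""));
    normalize_lexically(&project_dir.join(relative))
}
```

After normalization, `starts_with(root)` rejects references that climb out of the root, and two different relative spellings of the same project compare equal.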

Source Map

Purpose

The SourceMap provides fast lookup of source files:

use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Signatures only; bodies are omitted in this overview.
pub struct SourceMap {
    files: HashMap<PathBuf, SourceFileInfo>,
}

impl SourceMap {
    pub fn get(&self, path: &Path) -> Option<&SourceFileInfo>;
    pub fn all_files(&self) -> Vec<&Path>;
}

Usage

let workspace = WorkspaceLoader::from_path(path)?;

// Look up file
if let Some(info) = workspace.source_map.get(Path::new("Program.cs")) {
    println!("Found in project: {}", info.project_name);
}

// Iterate all files
for file_path in workspace.source_map.all_files() {
    println!("File: {}", file_path.display());
}

Error Handling

Resilient Loading

Philosophy: Continue loading even if individual projects fail

// Failed projects recorded as stubs with errors
let workspace = WorkspaceLoader::from_path(sln_path)?;

for project in &workspace.projects {
    if !project.errors.is_empty() {
        eprintln!("Errors in {}: {:?}", project.name, project.errors);
    }
}

Error Types

use std::io;

pub enum WorkspaceError {
    IoError(io::Error),
    ParseError(String),
    InvalidPath(String),
    Unsupported(String),
}

CLI Integration

Analyze Command

# Analyze solution
bsharp analyze MySolution.sln

# Analyze project
bsharp analyze MyProject.csproj

# Follow references (default: true)
bsharp analyze MyProject.csproj --follow-refs true

# Don't follow references
bsharp analyze MyProject.csproj --follow-refs false

Filtering

# Include only specific files
bsharp analyze MySolution.sln --include "**/*Service.cs"

# Exclude test files
bsharp analyze MySolution.sln --exclude "**/Tests/**"

# Multiple patterns
bsharp analyze MySolution.sln \
    --include "src/**/*.cs" \
    --exclude "**/obj/**" "**/bin/**"

Deterministic Behavior

Guarantees

  1. Project Order: Always sorted by absolute path
  2. File Order: Always sorted within each project
  3. Deduplication: No duplicate projects or files
  4. Reproducible: Same input always produces same output

Implementation

use std::collections::HashSet;

// Sort projects
projects.sort_by(|a, b| a.path.cmp(&b.path));

// Deduplicate by path
let mut seen = HashSet::new();
projects.retain(|p| seen.insert(p.path.clone()));

// Sort files within each project
for project in &mut projects {
    project.files.sort_by(|a, b| a.path.cmp(&b.path));
}

Performance

Loading Speed

  • Small solution (1-5 projects): < 100ms
  • Medium solution (5-20 projects): 100-500ms
  • Large solution (20-100 projects): 500ms-2s

Memory Usage

  • Minimal: Only metadata loaded, not source content
  • Typical: 1-5 MB per solution

Optimization

  • Parallel project loading (with parallel_analysis feature)
  • Lazy source file reading
  • Efficient path canonicalization

Examples

Example 1: Load and Analyze

use std::fs;
use std::path::Path;

use bsharp_analysis::workspace::WorkspaceLoader;
use bsharp_parser::facade::Parser;

let workspace = WorkspaceLoader::from_path(Path::new("MySolution.sln"))?;

let parser = Parser::new();
for project in &workspace.projects {
    for file in &project.files {
        let source = fs::read_to_string(&file.path)?;
        match parser.parse(&source) {
            // The compilation unit is not used further in this example
            Ok(_cu) => println!("Parsed: {}", file.path.display()),
            Err(e) => eprintln!("Error in {}: {}", file.path.display(), e),
        }
    }
}

Example 2: Project Statistics

let workspace = WorkspaceLoader::from_path(path)?;

// `solution` is an Option, so avoid unwrap(): it may be absent,
// for example when the workspace was loaded from a single project
if let Some(solution) = &workspace.solution {
    println!("Solution: {}", solution.name);
}
println!("Projects: {}", workspace.projects.len());

let total_files: usize = workspace.projects.iter()
    .map(|p| p.files.len())
    .sum();
println!("Total files: {}", total_files);

for project in &workspace.projects {
    println!("  {}: {} files", project.name, project.files.len());
}

Example 3: Dependency Graph

let workspace = WorkspaceLoader::from_path(path)?;

println!("Project Dependencies:");
for project in &workspace.projects {
    if !project.references.is_empty() {
        println!("{}:", project.name);
        for ref_ in &project.references {
            println!("  → {}", ref_.name);
        }
    }
}

Testing

Test Fixtures

Location: tests/fixtures/

tests/fixtures/
├── happy_path/
│   ├── test.sln
│   ├── testApplication/
│   │   ├── testApplication.csproj
│   │   └── Program.cs
│   └── testDependency/
│       ├── testDependency.csproj
│       └── Library.cs
└── complex/
    └── ...

Test Examples

#[test]
fn test_load_solution() {
    let sln_path = PathBuf::from("tests/fixtures/happy_path/test.sln");
    let workspace = WorkspaceLoader::from_path(&sln_path).unwrap();

    assert_eq!(workspace.projects.len(), 2);
    assert!(workspace.solution.is_some());
}

#[test]
fn test_follow_references() {
    let proj_path = PathBuf::from("tests/fixtures/happy_path/testApplication/testApplication.csproj");
    let workspace = WorkspaceLoader::from_path(&proj_path).unwrap();

    // Should load both testApplication and testDependency
    assert_eq!(workspace.projects.len(), 2);
}

Future Enhancements

Planned Features

  1. NuGet Package Resolution

    • Resolve package references
    • Download packages if needed
    • Parse package assemblies
  2. MSBuild Integration

    • Full MSBuild evaluation
    • Property expansion
    • Target execution
  3. Multi-targeting Support

    • Handle multiple target frameworks
    • Conditional compilation
  4. Incremental Loading

    • Cache workspace metadata
    • Reload only changed projects


References

  • Implementation: src/bsharp_analysis/src/workspace/
  • Loader: src/bsharp_analysis/src/workspace/loader.rs
  • Solution Reader: src/bsharp_analysis/src/workspace/sln/reader.rs
  • Project Reader: src/bsharp_analysis/src/workspace/csproj/reader.rs
  • Model: src/bsharp_analysis/src/workspace/model.rs
  • Source Map: src/bsharp_analysis/src/workspace/source_map.rs
  • Tests: src/bsharp_tests/src/workspace/ and src/bsharp_tests/src/integration/

Configuration Overview

BSharp analysis can be configured via TOML or JSON files and by CLI flags that map to config fields.


Locations

  • Project root: .bsharp.toml or .bsharp.json
  • Custom path via bsharp analyze <INPUT> --config <FILE>

AnalysisConfig (fields)

Source: src/bsharp_analysis/src/context.rs

use std::collections::HashMap;

pub struct AnalysisConfig {
    // Control flow thresholds
    pub cf_high_complexity_threshold: usize, // default: 10
    pub cf_deep_nesting_threshold: usize,    // default: 4

    // Toggles and severities
    pub enable_rulesets: HashMap<String, bool>,
    pub enable_passes: HashMap<String, bool>,
    pub rule_severities: HashMap<String, DiagnosticSeverity>,

    // Workspace filters
    pub workspace: WorkspaceConfig,

    // Optional churn/PE settings (reserved/future)
    pub churn_enable: bool,
    pub churn_period_days: u32,
    pub churn_include_merges: bool,
    pub churn_max_commits: Option<u32>,
    pub pe_reference_paths: Vec<String>,
    pub pe_references: Vec<String>,
}

pub struct WorkspaceConfig {
    pub follow_refs: bool,
    pub include: Vec<String>,
    pub exclude: Vec<String>,
}

TOML Example

[analysis]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4

[enable_rulesets]
naming = true
semantic = true
control_flow_smells = true

[enable_passes]
# Quote IDs that contain dots; unquoted, TOML reads them as nested tables
"passes.metrics" = true
"passes.control_flow" = true
"passes.dependencies" = true

[rule_severities]
CF002 = "warning"
CF003 = "warning"

[workspace]
follow_refs = true
include = ["src/**/*.cs"]
exclude = ["**/obj/**", "**/bin/**"]

JSON Example

{
  "cf_high_complexity_threshold": 10,
  "cf_deep_nesting_threshold": 4,
  "enable_rulesets": {
    "naming": true,
    "semantic": true,
    "control_flow_smells": true
  },
  "enable_passes": {
    "passes.metrics": true,
    "passes.control_flow": true,
    "passes.dependencies": true
  },
  "rule_severities": {
    "CF002": "warning",
    "CF003": "warning"
  },
  "workspace": {
    "follow_refs": true,
    "include": ["src/**/*.cs"],
    "exclude": ["**/obj/**", "**/bin/**"]
  }
}

CLI Mapping

  • --enable-ruleset <ID> / --disable-ruleset <ID> → enable_rulesets[ID] = true|false
  • --enable-pass <ID> / --disable-pass <ID> → enable_passes[ID] = true|false
  • --severity CODE=level → rule_severities[CODE] = level (error|warning|info|hint)
  • --follow-refs <BOOL> → workspace.follow_refs
  • --include <GLOB>... → workspace.include
  • --exclude <GLOB>... → workspace.exclude
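The mapping from toggle flags to config maps amounts to inserting `true` or `false` under the given ID, with later flags overwriting earlier values. The sketch below illustrates that idea with std types only. `Toggles`, `ToggleFlag` and `apply_flags` are hypothetical names and do not reflect how the CLI is actually wired.

```rust
use std::collections::HashMap;

/// The two toggle maps from AnalysisConfig, reduced to what this sketch needs.
#[derive(Debug, Default)]
struct Toggles {
    enable_rulesets: HashMap<String, bool>,
    enable_passes: HashMap<String, bool>,
}

#[derive(Debug, Clone, Copy)]
enum ToggleFlag<'a> {
    EnableRuleset(&'a str),
    DisableRuleset(&'a str),
    EnablePass(&'a str),
    DisablePass(&'a str),
}

/// Apply CLI flags on top of values that may already come from a config file.
fn apply_flags(config: &mut Toggles, flags: &[ToggleFlag<'_>]) {
    for flag in flags {
        let (map, id, enabled) = match *flag {
            ToggleFlag::EnableRuleset(id) => (&mut config.enable_rulesets, id, true),
            ToggleFlag::DisableRuleset(id) => (&mut config.enable_rulesets, id, false),
            ToggleFlag::EnablePass(id) => (&mut config.enable_passes, id, true),
            ToggleFlag::DisablePass(id) => (&mut config.enable_passes, id, false),
        };
        map.insert(id.to_string(), enabled);
    }
}
```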

Tips

  • Prefer TOML for readability; JSON is supported for tool integration.
  • Thresholds influence CfgSummary counts in the final report.
  • Use unique IDs for passes/rulesets consistent with registry (see Passes & Rules).

Contributing to BSharp

Thank you for your interest in contributing to BSharp! This document provides guidelines for contributing to the project.

Development Setup

Prerequisites

  • Rust 1.70 or later
  • Git
  • A text editor or IDE with Rust support

Building the Project

  1. Clone the repository:
git clone https://github.com/mikserek/bsharp.git
cd bsharp
  2. Build the project:
cargo build
  3. Run tests:
cargo test
  4. Run the CLI tool:
cargo run -- --help

Parser Testing Best Practices

  • Prefer expect_ok(input, parse(input.into())) from syntax::test_helpers when asserting successful parses. It prints readable, rustc-like diagnostics on failure via format_error_tree.
  • Keep tests focused and minimal; add a separate negative test when ambiguity is possible (e.g., ternary vs ?. vs ??, range vs dot vs float).
  • For lookahead/disambiguation boundaries, add cases to tests/parser/expressions/lookahead_boundaries2_tests.rs.
  • For complex constructs (e.g., new with object/collection initializers), add positive and negative cases near tests/parser/expressions/new_expression_tests.rs and target_typed_new_tests.rs.
  • Invalid-input diagnostics: place small snapshot-style assertions in tests/parser/expressions/invalid_diagnostics_tests.rs that check for line/column and caret presence. Avoid overfitting on exact wording.
  • When adding delimited constructs (parentheses, brackets, braces), guard the closing delimiter with cut(...) once committed to that branch to prevent misleading backtracking.
  • Always wrap sub-parsers with bws(...) to ensure whitespace/comments are handled consistently.

Adding New Parser Test Files

  • In tests/parser/expressions/, simply add a new *_tests.rs file; it will be discovered by the existing integration test harness.
  • For declarations/statements/types, follow the existing directory structure under tests/parser/ and mimic module organization.
  • Keep tests deterministic and avoid relying on environment-specific paths or random data.

Project Structure

Understanding the codebase organization:

src/
├── bsharp_parser/    # Parser crate (expressions, statements, declarations, helpers)
├── bsharp_syntax/    # AST nodes and shared syntax types (re-exported by parser)
├── bsharp_analysis/  # Analysis framework and workspace loader
├── bsharp_cli/       # Command-line interface
└── bsharp_tests/     # External test crate

Code Style

Follow Rust conventions:

  • Use cargo fmt to format code
  • Use cargo clippy to check for common issues
  • Follow naming conventions (snake_case for functions, PascalCase for types)
  • Add documentation comments for public APIs

Testing

All contributions should include appropriate tests:

Parser Tests

IMPORTANT: Parser tests must live in an external test crate under src/bsharp_tests/src/, NOT in inline #[cfg(test)] modules.

#![allow(unused)]
fn main() {
// ✅ CORRECT: External test file
// src/bsharp_tests/src/parser/declarations/class_declaration_tests.rs

use bsharp::syntax::test_helpers::expect_ok;
use bsharp::parser::expressions::declarations::parse_class_declaration;

#[test]
fn test_parse_simple_class() {
    let input = "public class MyClass { }";
    let class = expect_ok(input, parse_class_declaration(input.into()));
    assert_eq!(class.identifier.name, "MyClass");
}
}

Analysis Tests

#![allow(unused)]
fn main() {
// tests/analysis/complexity_tests.rs

use bsharp::syntax::Parser;
use bsharp::analysis::metrics::cyclomatic_complexity;

#[test]
fn test_complexity_analysis() {
    let source = r#"
        public class Test {
            public void Method() {
                if (true) {
                    for (int i = 0; i < 10; i++) {
                        // complexity += 2
                    }
                }
            }
        }
    "#;
    
    let parser = Parser::new();
    let cu = parser.parse(source).unwrap();
    
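    // Find the method in `cu` and compute its complexity.
    // The lookup depends on the analysis API, so `complexity` is a placeholder
    // that must be replaced with the real call before this test can pass.
    let complexity: usize = todo!("compute via the analysis API");

    // Expected: 1 (base path) + 1 (if) + 1 (for)
    assert_eq!(complexity, 3);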
}
}
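The expected value of 3 in the test above follows from the usual counting rule: one base path plus one for each branching construct. The sketch below illustrates only that rule. `Stmt` and `count_complexity` are hypothetical stand-ins, not BSharp's AST or its `cyclomatic_complexity` function.

```rust
/// Hypothetical, simplified statement tree; the real BSharp AST is much richer.
enum Stmt {
    If { then_branch: Vec<Stmt>, else_branch: Vec<Stmt> },
    For { body: Vec<Stmt> },
    While { body: Vec<Stmt> },
    Other,
}

/// Counts decision points in a statement list, recursing into nested bodies.
fn decision_points(stmts: &[Stmt]) -> usize {
    stmts
        .iter()
        .map(|stmt| match stmt {
            Stmt::If { then_branch, else_branch } => {
                1 + decision_points(then_branch) + decision_points(else_branch)
            }
            Stmt::For { body } | Stmt::While { body } => 1 + decision_points(body),
            Stmt::Other => 0,
        })
        .sum()
}

/// Complexity of a method body: one base path plus one per decision point.
fn count_complexity(body: &[Stmt]) -> usize {
    1 + decision_points(body)
}

/// Mirrors the C# method in the test above: an `if` that contains a `for`.
fn sample_method_body() -> Vec<Stmt> {
    vec![Stmt::If {
        then_branch: vec![Stmt::For { body: vec![Stmt::Other] }],
        else_branch: vec![],
    }]
}
```

An empty method body has complexity 1, and each additional `if`, `for`, or `while` adds one.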

Documentation

  • Add rustdoc comments for public functions and types
  • Update this documentation when adding new features
  • Include examples in documentation

Adding New Language Features

When adding support for new C# language features:

  1. Define AST Nodes: Add node definitions in src/syntax/nodes/
  2. Implement Parser: Add parser in appropriate src/parser/ subdirectory
  3. Add Tests: Include comprehensive tests in tests/parser/ directory
  4. Update Traversal: Prefer the bsharp_analysis::framework::Query API for AST enumeration; for statement/expression-heavy logic, use shared helpers or a focused walker.
  5. Document: Add documentation for the new feature

Example process for adding a new expression type:

  1. Define the AST node:
#![allow(unused)]
fn main() {
// src/syntax/nodes/expressions/new_expression.rs
use serde::{Deserialize, Serialize};

#[derive(Debug, PartialEq, Clone, Serialize, Deserialize)]
pub struct NewExpression {
    pub keyword: String,  // "new"
    pub arguments: Vec<Expression>,
}
}
  2. Add to Expression enum:
#![allow(unused)]
fn main() {
// src/syntax/nodes/expressions/expression.rs
pub enum Expression {
    // ... existing variants
    New(NewExpression),
}
}
  3. Implement parser:
#![allow(unused)]
fn main() {
// src/parser/expressions/new_expression_parser.rs
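pub fn parse_new_expression(input: &str) -> BResult<&str, NewExpression> {
    // Placeholder so the skeleton compiles; replace with the real combinators.
    todo!("parse `new` followed by its argument list")
}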
}
  4. Add tests:
#![allow(unused)]
fn main() {
// tests/parser/expressions/new_expression_tests.rs
#[test]
fn test_parse_new_expression() {
    // Test implementation
}
}

Submitting Changes

Pull Request Process

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/new-feature
  3. Make your changes
  4. Run tests: cargo test
  5. Run formatting: cargo fmt
  6. Run clippy: cargo clippy
  7. Commit changes with clear messages
  8. Push to your fork
  9. Create a pull request

Commit Messages

Use clear, descriptive commit messages:

feat: add support for C# 11 file-scoped types

- Add parser for file-scoped type declarations
- Update AST to handle new syntax
- Add comprehensive tests
- Update documentation

Fixes #123

Pull Request Requirements

  • All tests must pass
  • Code must be formatted with cargo fmt
  • No clippy warnings
  • Include tests for new functionality
  • Update documentation if needed

Common Development Tasks

Adding a New Parser

  1. Define the AST node structure
  2. Implement the parser function
  3. Add the parser to the appropriate module
  4. Write comprehensive tests
  5. Update integration points

Extending Analysis

  1. Define analysis traits if needed
  2. Implement analyzer struct
  3. Add configuration options
  4. Write tests with various scenarios
  5. Update CLI integration

Debugging Parser Issues

Use these tools for debugging:

# Test specific parser with debug output
RUST_LOG=debug cargo test test_name -- --nocapture

# Run parser on test file (prints textual AST tree)
cargo run -- parse debug_cases/test.cs

# Check AST visualization
cargo run -- tree debug_cases/test.cs --output debug.svg

Getting Help

  • Check existing issues and documentation
  • Ask questions in GitHub issues
  • Join community discussions

Code of Conduct

  • Be respectful and inclusive
  • Focus on constructive feedback
  • Help others learn and grow
  • Maintain a positive environment

Thank you for contributing to BSharp!

2025-11-17 15:18:26 • commit: 03a4e25

Testing Guide

This document provides comprehensive guidance on testing in the BSharp project, covering test organization, best practices, and debugging strategies.


Test Organization Philosophy

External Test Structure

Critical Principle: Parser tests are external to implementation modules and live in a dedicated test crate.

src/bsharp_tests/
├── Cargo.toml               # Test crate manifest
└── src/
    ├── parser/
    │   ├── expressions/
    │   │   ├── expression_tests.rs
    │   │   ├── lambda_expression_tests.rs
    │   │   ├── pattern_matching_tests.rs
    │   │   ├── ambiguity_tests.rs
    │   │   ├── lookahead_boundaries2_tests.rs
    │   │   └── ...
    │   ├── statements/
    │   │   ├── if_statement_tests.rs
    │   │   ├── for_statement_tests.rs
    │   │   ├── expression_statement_tests.rs
    │   │   └── ...
    │   ├── declarations/
    │   │   ├── class_declaration_tests.rs
    │   │   ├── interface_declaration_parser_tests.rs
    │   │   ├── recovery_tests.rs
    │   │   └── ...
    │   ├── types/
    │   │   ├── type_tests.rs
    │   │   ├── advanced_type_tests.rs
    │   │   └── ...
    │   ├── preprocessor/
    │   │   └── ...
    │   └── keyword_parsers_tests.rs
    └── fixtures/
        ├── happy_path/
        └── complex/

Rationale:

  • Separation of Concerns: Test code separate from implementation
  • Compilation Efficiency: Tests build in their own crate, so editing a test does not recompile the parser crate
  • Organization: Clear structure mirrors parser organization
  • Maintainability: Easy to find and update tests

What NOT to Do:

#![allow(unused)]
fn main() {
// ❌ NEVER do this in src/parser/ files
#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_something() {
        // ...
    }
}
}

What to Do Instead:

#![allow(unused)]
fn main() {
// ✅ Create src/bsharp_tests/src/parser/expressions/my_feature_tests.rs
use bsharp::syntax::test_helpers::expect_ok;
use bsharp::parser::expressions::parse_my_feature;

#[test]
fn test_my_feature() {
    let input = "my feature syntax";
    let result = parse_my_feature(input.into());
    let ast = expect_ok(input, result);
    // assertions...
}
}

Test Helpers

expect_ok() - Readable Test Failures

Location: src/syntax/test_helpers.rs

Usage:

#![allow(unused)]
fn main() {
use bsharp::syntax::test_helpers::expect_ok;

#[test]
fn test_parse_class() {
    let input = "public class MyClass { }";
    let result = parse_class_declaration(input.into());
    let class = expect_ok(input, result);
    
    assert_eq!(class.identifier.name, "MyClass");
}
}

Benefits:

  • Automatic Error Formatting: Pretty-prints ErrorTree on failure
  • Readable Diagnostics: Shows parse failure context with caret
  • Panic on Failure: Test fails with clear error message

Error Output Example:

0: at line 1, in keyword "class":
public clas MyClass { }
       ^--- expected keyword "class"

1: in context "class declaration"
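To make the shape of this output concrete, here is a minimal sketch of how a caret line can be derived from a byte offset. `render_caret` is a hypothetical helper written for illustration; it is not BSharp's `format_error_tree`, which works on an ErrorTree and prints more context.

```rust
/// Renders a short diagnostic with a caret under byte `offset` of `source`.
/// Hypothetical helper for illustration only.
fn render_caret(source: &str, offset: usize, message: &str) -> String {
    // Clamp to the input and step back to a character boundary so slicing cannot panic.
    let mut offset = offset.min(source.len());
    while !source.is_char_boundary(offset) {
        offset -= 1;
    }

    // 1-based line number: count the newlines before the offset.
    let line_no = source[..offset].matches('\n').count() + 1;

    // Byte range of the line that contains the offset.
    let line_start = source[..offset].rfind('\n').map_or(0, |i| i + 1);
    let line_end = source[offset..].find('\n').map_or(source.len(), |i| offset + i);
    let line_text = &source[line_start..line_end];

    // Column in characters, used to pad the caret into position.
    let column = source[line_start..offset].chars().count();

    format!(
        "at line {line_no}:\n{line_text}\n{}^--- {message}",
        " ".repeat(column)
    )
}
```

Calling `render_caret("public clas MyClass { }", 7, "expected keyword \"class\"")` places the caret under the `c` of `clas`, matching the layout shown above.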

Other Test Helpers

parse_input_unwrap() - Unwrap parse result:

#![allow(unused)]
fn main() {
use bsharp_syntax::span::Span;
let (remaining, ast) = parse_input_unwrap(
    parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node))
);
assert_eq!(remaining, "");  // Verify full consumption
}

assert_parse_error() - Verify parse failures:

#![allow(unused)]
fn main() {
use bsharp_syntax::span::Span;
assert_parse_error(
    parse_expression_spanned(Span::new("invalid syntax")).map(|(rest, s)| (rest, s.node))
);
}

Parser Testing Best Practices

1. Prefer expect_ok() for Successful Parses

#![allow(unused)]
fn main() {
#[test]
fn test_if_statement() {
    let input = "if (x > 0) { return x; }";
    let stmt = expect_ok(input, parse_if_statement(input.into()));
    
    // Now assert on the AST structure
    match stmt {
        Statement::If(if_stmt) => {
            // Verify condition, consequence, etc.
        }
        _ => panic!("Expected IfStatement"),
    }
}
}

2. Keep Tests Focused and Minimal

Good:

#![allow(unused)]
fn main() {
#[test]
fn test_simple_lambda() {
    let input = "x => x * 2";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // Test one thing
}

#[test]
fn test_lambda_with_multiple_params() {
    let input = "(x, y) => x + y";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // Test another thing
}
}

Bad:

#![allow(unused)]
fn main() {
#[test]
fn test_all_lambda_forms() {
    // Testing too many things in one test
    // Hard to debug when it fails
}
}

3. Add Negative Tests for Ambiguity

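When a construct can be confused with a similar one (e.g., ternary vs ?. vs ??), add a separate test for each interpretation, plus negative tests for inputs that must be rejected: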

#![allow(unused)]
fn main() {
#[test]
fn test_ternary_vs_nullable() {
    // Valid ternary
    let input = "x ? y : z";
    expect_ok(input, parse_conditional_expression(input.into()));
    
    // Valid null-conditional (different test)
}

#[test]
fn test_null_conditional_operator() {
    let input = "obj?.Property";
    expect_ok(input, parse_postfix_expression(input.into()));
}
}

4. Test Lookahead/Disambiguation Boundaries

Location: tests/parser/expressions/lookahead_boundaries2_tests.rs

#![allow(unused)]
fn main() {
#[test]
fn test_range_vs_dot_vs_float() {
    // Range operator
    expect_ok("1..10", parse_range_expression("1..10".into()));
    
    // Member access
    expect_ok("obj.Method", parse_postfix_expression("obj.Method".into()));
    
    // Float literal
    expect_ok("3.14", parse_literal("3.14".into()));
}
}

5. Test Complex Constructs

For complex constructs like new expressions with initializers:

Location: tests/parser/expressions/new_expression_tests.rs

#![allow(unused)]
fn main() {
#[test]
fn test_new_with_object_initializer() {
    let input = "new Person { Name = \"John\", Age = 30 }";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}

#[test]
fn test_new_with_collection_initializer() {
    let input = "new List<int> { 1, 2, 3 }";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}

#[test]
fn test_target_typed_new() {
    let input = "new(42, \"test\")";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}
}

6. Test Invalid Input Diagnostics

Location: tests/parser/expressions/invalid_diagnostics_tests.rs

#![allow(unused)]
fn main() {
#[test]
fn test_unclosed_paren_diagnostic() {
    use bsharp_syntax::span::Span;
    let input = "(x + y";
    let result = parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node));
    assert!(result.is_err());
    // Optionally check error contains expected message
}
}

Guidelines:

  • Keep small snapshot-style assertions
  • Check for line/column and caret presence
  • Avoid overfitting on exact wording (may change)

7. Guard Closing Delimiters with cut()

When adding delimited constructs, ensure closing delimiters use cut():

#![allow(unused)]
fn main() {
use nom::combinator::cut;
use crate::syntax::parser_helpers::{bdelimited, bchar};

fn parse_parenthesized(input: &str) -> BResult<&str, Expression> {
    bdelimited(
        bchar('('),
        parse_expression,
        cut(bchar(')'))  // ✅ Prevents misleading backtracking
    )(input.into())
}
}
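The effect of cut() is easiest to see in a toy model. The sketch below does not use nom. It re-creates the recoverable/fatal distinction with a hypothetical `ParseError` enum, so every name in it (`ch`, `cut`, `digits`, `parenthesized`, `primary`) is illustrative only.

```rust
/// Toy error type mirroring the split between recoverable and fatal parse errors.
#[derive(Debug, PartialEq)]
enum ParseError {
    /// Recoverable: an enclosing alternative may try another branch.
    Backtrack(String),
    /// Fatal: produced by `cut`; no other alternative is tried.
    Committed(String),
}

type PResult<'a, T> = Result<(&'a str, T), ParseError>;

/// Matches a single expected character.
fn ch(expected: char, input: &str) -> PResult<'_, char> {
    match input.chars().next() {
        Some(c) if c == expected => Ok((&input[c.len_utf8()..], c)),
        _ => Err(ParseError::Backtrack(format!("expected '{expected}'"))),
    }
}

/// Upgrades a recoverable error to a committed one.
fn cut<'a, T>(result: PResult<'a, T>) -> PResult<'a, T> {
    result.map_err(|err| match err {
        ParseError::Backtrack(msg) => ParseError::Committed(msg),
        committed => committed,
    })
}

/// Parses one or more ASCII digits.
fn digits(input: &str) -> PResult<'_, &str> {
    let end = input.find(|c: char| !c.is_ascii_digit()).unwrap_or(input.len());
    if end == 0 {
        Err(ParseError::Backtrack("expected digits".to_string()))
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

/// `( digits )` with only the closing parenthesis guarded by `cut`.
fn parenthesized(input: &str) -> PResult<'_, &str> {
    let (rest, _) = ch('(', input)?; // not committed yet: failure may backtrack
    let (rest, value) = digits(rest)?;
    let (rest, _) = cut(ch(')', rest))?; // committed: a missing ')' is fatal
    Ok((rest, value))
}

/// Tries `parenthesized`, falling back to bare digits only on a recoverable error.
fn primary(input: &str) -> PResult<'_, &str> {
    match parenthesized(input) {
        Err(ParseError::Backtrack(_)) => digits(input),
        other => other,
    }
}
```

With this setup, `primary("(12")` reports the missing `)` instead of backtracking into the digits branch and reporting a misleading "expected digits" at the opening parenthesis.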

8. Wrap Sub-Parsers with bws()

Ensure whitespace/comments are handled consistently:

#![allow(unused)]
fn main() {
use crate::syntax::parser_helpers::bws;

fn parse_if_statement(input: &str) -> BResult<&str, Statement> {
    let (input, _) = bws(keyword("if"))(input.into())?;
    let (input, _) = bws(bchar('('))(input.into())?;
    let (input, condition) = bws(parse_expression)(input.into())?;
    // ...
}
}
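As a rough picture of what a whitespace-aware wrapper does, the following sketch skips whitespace and comments around a token parser. It is a standalone toy: `skip_trivia`, `ws_wrapped`, and `kw_if` are hypothetical names, and skipping trivia on both sides is an assumption here, not a statement about how bws() is implemented.

```rust
/// Skips whitespace, `// line` comments, and `/* block */` comments.
fn skip_trivia(mut input: &str) -> &str {
    loop {
        let trimmed = input.trim_start();
        if let Some(rest) = trimmed.strip_prefix("//") {
            // Line comment: drop everything up to (not including) the newline.
            input = rest.find('\n').map_or("", |i| &rest[i..]);
        } else if let Some(rest) = trimmed.strip_prefix("/*") {
            // Block comment: drop through the closing `*/`.
            // An unterminated comment consumes the rest of the input.
            input = rest.find("*/").map_or("", |i| &rest[i + 2..]);
        } else {
            return trimmed;
        }
    }
}

/// Wraps a token parser so that it tolerates trivia before and after the token.
fn ws_wrapped<'a, T>(
    parser: impl Fn(&'a str) -> Option<(&'a str, T)>,
) -> impl Fn(&'a str) -> Option<(&'a str, T)> {
    move |input: &'a str| {
        let (rest, value) = parser(skip_trivia(input))?;
        Some((skip_trivia(rest), value))
    }
}

/// Toy token parser: matches the literal `if`.
fn kw_if(input: &str) -> Option<(&str, &str)> {
    input.strip_prefix("if").map(|rest| (rest, "if"))
}
```

Wrapping every sub-parser the same way means no individual parser has to decide where comments and whitespace are allowed.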

Test Discovery and Execution

Running All Tests

cargo test

Running Specific Test Suites

# All parser tests
cargo test --test parser

# Specific module
cargo test --test parser expression_tests

# Specific test
cargo test --test parser test_lambda_expression

Running with Output

# Show println! output
cargo test -- --nocapture

# Run tests serially so printed output is not interleaved
cargo test -- --nocapture --test-threads=1

Running with Debug Logging

RUST_LOG=debug cargo test test_name -- --nocapture

Test Fixtures

Fixture Organization

tests/fixtures/
├── happy_path/           # Valid, well-formed C# projects
│   ├── testApplication/
│   │   ├── Program.cs
│   │   ├── testApplication.csproj
│   │   └── ...
│   └── testDependency/
│       └── ...
└── complex/              # Complex, real-world scenarios
    ├── testApplication/
    └── testDependency/

Using Fixtures in Tests

#![allow(unused)]
fn main() {
use std::fs;
use std::path::PathBuf;
use bsharp::syntax::Parser;

#[test]
fn test_parse_fixture() {
    let fixture_path = PathBuf::from("tests/fixtures/happy_path/testApplication/Program.cs");
    let source = fs::read_to_string(&fixture_path).unwrap();
    
    let parser = Parser::new();
    let result = parser.parse(&source);
    
    assert!(result.is_ok());
}
}
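Relative fixture paths like the one above resolve against the current working directory. One way to make them independent of where the test binary is launched is to anchor them at the crate root. The helper below is a sketch under that assumption; `fixture_path` is a hypothetical name, and the `tests/fixtures` prefix simply mirrors the path used in the example.

```rust
use std::env;
use std::path::PathBuf;

/// Resolves a fixture path relative to the crate root instead of the current directory.
fn fixture_path(relative: &str) -> PathBuf {
    // Cargo sets CARGO_MANIFEST_DIR when it runs tests; fall back to "." otherwise.
    let root = env::var("CARGO_MANIFEST_DIR").unwrap_or_else(|_| ".".to_string());
    PathBuf::from(root)
        .join("tests")
        .join("fixtures")
        .join(relative)
}
```

A test would then read `fixture_path("happy_path/testApplication/Program.cs")` instead of building the path by hand.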

Fixture Guidelines

  • Valid Code: Fixtures should be valid C# that compiles
  • Realistic: Use real-world patterns, not contrived examples
  • Documented: Add README.md explaining fixture purpose
  • Minimal: Keep fixtures as small as possible while testing feature

Snapshot Testing

Using insta for Snapshot Tests

Installation: Already included in Cargo.toml dev-dependencies

#![allow(unused)]
fn main() {
use insta::assert_json_snapshot;

#[test]
fn test_class_ast_structure() {
    let input = "public class MyClass { public int Field; }";
    let result = parse_class_declaration(input.into());
    let class = expect_ok(input, result);
    
    // Creates snapshot file on first run
    assert_json_snapshot!(class);
}
}

Reviewing Snapshots

# Review snapshot changes
cargo insta review

# Accept all changes
cargo insta accept

# Reject all changes
cargo insta reject

Snapshot Guidelines

  • Complex Structures: Use for complex AST structures
  • Regression Prevention: Catch unintended changes
  • Review Carefully: Always review snapshot diffs
  • Commit Snapshots: Include snapshot files in git

Debugging Test Failures

Strategy 1: Use expect_ok() Error Output

When a test fails, expect_ok() shows the parse error:

0: at line 1, in keyword "class":
public clas MyClass { }
       ^--- expected keyword "class"

Strategy 2: Add Debug Logging

#![allow(unused)]
fn main() {
#[test]
fn test_with_logging() {
    // try_init() avoids a panic when another test has already installed the logger
    let _ = env_logger::builder().is_test(true).try_init();
    
    use bsharp_syntax::span::Span;
    let input = "complex syntax";
    log::debug!("Parsing: {}", input);
    
    let result = parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node));
    log::debug!("Result: {:?}", result);
    
    expect_ok(input, result);
}
}

Run with:

RUST_LOG=debug cargo test test_with_logging -- --nocapture

Strategy 3: Test Smaller Components

If a complex parser fails, test its sub-parsers individually:

#![allow(unused)]
fn main() {
#[test]
fn test_method_declaration() {
    // Fails - too complex
    let input = "public async Task<int> Method(int x) { return x; }";
    expect_ok(input, parse_method_declaration(input.into()));
}

// Break it down:

#[test]
fn test_method_modifiers() {
    let input = "public async";
    expect_ok(input, parse_modifiers(input.into()));
}

#[test]
fn test_method_return_type() {
    let input = "Task<int>";
    expect_ok(input, parse_type(input.into()));
}

#[test]
fn test_method_parameters() {
    let input = "(int x)";
    expect_ok(input, parse_parameter_list(input.into()));
}
}

Strategy 4: Use Parser Debugging Tools

# Parse file and output JSON
cargo run -- parse debug_cases/test.cs --output debug.json

# Generate AST visualization
cargo run -- tree debug_cases/test.cs --output debug.svg

Strategy 5: Check Error Recovery

For declaration error recovery tests:

#![allow(unused)]
fn main() {
#[test]
fn test_recovery_from_malformed_member() {
    let input = r#"
    public class MyClass {
        public int ValidField;
        public invalid syntax here;  // Malformed
        public int AnotherValidField;  // Should recover
    }
    "#;
    
    let result = parse_class_declaration(input.into());
    // Should parse despite error
    assert!(result.is_ok());
}
}

Integration Testing

Workspace Loading Tests

#![allow(unused)]
fn main() {
use std::path::PathBuf;
use bsharp::workspace::WorkspaceLoader;

#[test]
fn test_load_solution() {
    let sln_path = PathBuf::from("tests/fixtures/happy_path/test.sln");
    let workspace = WorkspaceLoader::from_path(&sln_path).unwrap();
    
    assert_eq!(workspace.projects.len(), 2);
    assert!(workspace.solution.is_some());
}

#[test]
fn test_load_csproj() {
    let csproj_path = PathBuf::from("tests/fixtures/happy_path/testApplication/testApplication.csproj");
    let workspace = WorkspaceLoader::from_path(&csproj_path).unwrap();
    
    assert_eq!(workspace.projects.len(), 1);
}
}

Analysis Pipeline Tests

#![allow(unused)]
fn main() {
use bsharp::analysis::framework::pipeline::AnalyzerPipeline;
use bsharp::analysis::framework::session::AnalysisSession;

#[test]
fn test_analysis_pipeline() {
    let source = "public class Test { public void Method() { } }";
    let parser = Parser::new();
    let cu = parser.parse(source).unwrap();
    
    let mut session = AnalysisSession::new();
    AnalyzerPipeline::run_with_defaults(&cu, &mut session);
    
    let report = session.into_report();
    assert!(report.diagnostics.is_empty());  // No errors
}
}

Performance Testing

Benchmarking

#![allow(unused)]
fn main() {
#[test]
#[ignore]  // Run with --ignored flag
fn bench_parse_large_file() {
    use std::fs;
    use std::time::Instant;
    
    let source = fs::read_to_string("tests/fixtures/large_file.cs").unwrap();
    let parser = Parser::new();
    
    let start = Instant::now();
    let result = parser.parse(&source);
    let duration = start.elapsed();
    
    assert!(result.is_ok());
    println!("Parse time: {:?}", duration);
    
    // Assert reasonable performance
    assert!(duration.as_millis() < 1000, "Parse took too long");
}
}

Running Performance Tests

cargo test bench_ -- --ignored

Continuous Integration

CI Test Strategy

# .github/workflows/test.yml (example)
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: clippy, rustfmt
      - name: Run tests
        run: cargo test --all-features
      - name: Run clippy
        run: cargo clippy -- -D warnings
      - name: Check formatting
        run: cargo fmt -- --check

Test Coverage

Measuring Coverage

# Install tarpaulin
cargo install cargo-tarpaulin

# Run coverage
cargo tarpaulin --out Html --output-dir coverage

Coverage Goals

  • Parser Core: 90%+ coverage
  • Analysis Framework: 80%+ coverage
  • CLI Commands: 70%+ coverage
  • Workspace Loading: 80%+ coverage

Common Testing Patterns

Pattern 1: Positive and Negative Tests

#![allow(unused)]
fn main() {
#[test]
fn test_valid_syntax() {
    let input = "valid syntax";
    expect_ok(input, parse_feature(input.into()));
}

#[test]
fn test_invalid_syntax() {
    let input = "invalid syntax";
    assert!(parse_feature(input.into()).is_err());
}
}

Pattern 2: Boundary Testing

#![allow(unused)]
fn main() {
#[test]
fn test_empty_input() {
    assert!(parse_feature("".into()).is_err());
}

#[test]
fn test_minimal_input() {
    expect_ok("x", parse_feature("x".into()));
}

#[test]
fn test_maximal_input() {
    let input = "very complex nested structure...";
    expect_ok(input, parse_feature(input.into()));
}
}

Pattern 3: Equivalence Testing

#![allow(unused)]
fn main() {
#[test]
fn test_whitespace_insensitive() {
    let compact = "if(x){y;}";
    let spaced = "if (x) { y; }";
    
    let ast1 = expect_ok(compact, parse_if_statement(compact.into()));
    let ast2 = expect_ok(spaced, parse_if_statement(spaced.into()));
    
    assert_eq!(ast1, ast2);
}
}

Test Maintenance

When to Update Tests

  1. API Changes: Update tests when parser API changes
  2. Bug Fixes: Add regression tests for fixed bugs
  3. New Features: Add tests for new language features
  4. Refactoring: Ensure tests still pass after refactoring

Test Cleanup

  • Remove Duplicate Tests: Consolidate similar tests
  • Update Outdated Tests: Fix tests using deprecated APIs
  • Remove Dead Tests: Delete tests for removed features
  • Improve Names: Use descriptive test names

Test Documentation

#![allow(unused)]
fn main() {
/// Tests that lambda expressions with multiple parameters are parsed correctly.
/// 
/// This test verifies:
/// - Parameter list parsing
/// - Arrow token recognition
/// - Expression body parsing
#[test]
fn test_lambda_with_multiple_params() {
    let input = "(x, y) => x + y";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // ...
}
}

Summary

Testing Checklist

  • Tests live in the external test crate (src/bsharp_tests/src/), not inline
  • Use expect_ok() for readable failures
  • Keep tests focused and minimal
  • Add negative tests for ambiguity
  • Test lookahead/disambiguation boundaries
  • Test complex constructs thoroughly
  • Use cut() for closing delimiters
  • Wrap sub-parsers with bws()
  • Add fixtures for integration tests
  • Use snapshot tests for complex structures
  • Document test purpose and coverage

Resources

  • Test Helpers: src/syntax/test_helpers.rs
  • Example Tests: tests/parser/expressions/
  • Fixtures: tests/fixtures/
  • Contributing Guide: docs/development/contributing.md
  • Architecture: docs/development/architecture.md

Architecture Decisions

This document explains the key architectural decisions made in the BSharp project, their rationale, and their implications for contributors.


Core Design Philosophy

BSharp is designed as a modular, extensible C# parser and analysis toolkit written in Rust. The architecture prioritizes:

  1. Correctness - Accurate parsing of C# syntax
  2. Performance - Efficient parsing and analysis of large codebases
  3. Maintainability - Clear module boundaries and minimal coupling
  4. Extensibility - Easy addition of new language features and analyzers

Parser Architecture

Why nom Parser Combinators?

Decision: Use the nom parser combinator library as the foundation for parsing.

Rationale:

  • Composability: Small, focused parsers combine to handle complex syntax
  • Type Safety: Rust's type system catches parser errors at compile time
  • Performance: Zero-copy parsing with minimal allocations
  • Testability: Individual parser functions are easily unit tested
  • Maintainability: Declarative style is easier to understand than hand-written parsers

Trade-offs:

  • Learning curve for contributors unfamiliar with parser combinators
  • Error messages require additional work (addressed with nom-supreme)

Implementation:

  • Core parsing infrastructure: src/bsharp_parser/src/helpers/
  • Parser implementations: src/bsharp_parser/src/
  • All parsers return the BResult<I, O> type alias

Error Handling Strategy

Decision: Use nom-supreme::ErrorTree for all parser errors.

Rationale:

  • Rich Context: Tree structure preserves full parse failure path
  • Better Diagnostics: Context annotations via .context() method
  • Integration: Seamless integration with nom combinators
  • Debugging: Pretty-printing via format_error_tree()

Evolution:

  • Initially used custom BSharpParseError type
  • Migrated to ErrorTree for better diagnostics
  • Custom error type deprecated and removed

Implementation:

#![allow(unused)]
fn main() {
pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;
}

Helper Functions (in src/bsharp_parser/src/helpers/)

  • context() - Adds contextual information
  • cut() - Commits to parse branch (prevents misleading backtracking)
  • bws() - Whitespace-aware wrapper with error context
  • bdelimited() - Delimited parsing with cut on closing delimiter

Module Organization

Decision: Separate the parser crate from the syntax (AST) crate, and keep analysis in its own crate.

Structure:

src/
├── bsharp_parser/          # Parser implementations and public facade
│   ├── src/
│   │   ├── expressions/    # Expression parsers
│   │   ├── keywords/       # Keyword parsing (modularized)
│   │   ├── helpers/        # Parsing utilities (bws, cut, context, directives, ...)
│   │   ├── facade.rs       # Public Parser facade
│   │   └── ...
├── bsharp_syntax/          # AST node definitions and shared syntax types
│   └── src/                # (re-exported by bsharp_parser as `syntax`)
├── bsharp_analysis/        # Analysis framework and workspace
│   └── src/
└── bsharp_cli/             # CLI entry and subcommands

Rationale:

  • Separation of Concerns: Infrastructure vs implementation
  • Reusability: Helpers used across all parsers
  • API Clarity: syntax module is the public API
  • Testing: Infrastructure can be tested independently

Keyword Modularization

Decision: Organize keywords by category in dedicated modules.

Structure:

src/bsharp_parser/src/keywords/
├── mod.rs                      # Keyword infrastructure
├── access_keywords.rs          # public, private, protected, internal
├── accessor_keywords.rs        # get, set, init, add, remove
├── type_keywords.rs            # class, struct, interface, enum, record
├── modifier_keywords.rs        # static, abstract, virtual, sealed
├── flow_control_keywords.rs    # if, else, switch, case, default
├── iteration_keywords.rs       # for, foreach, while, do
├── expression_keywords.rs      # new, this, base, typeof, sizeof
├── linq_query_keywords.rs      # from, where, select, orderby
└── ...

Rationale:

  • Maintainability: Easy to find and update keyword parsers
  • Consistency: Uniform keyword parsing strategy
  • Word Boundaries: All keywords use keyword() helper for boundary checking
  • Prevents Bugs: Avoids partial matches (e.g., "int" vs "int32")

Implementation:

  • The keyword() function enforces word boundaries: a match is rejected when the next character is an identifier character ([A-Za-z0-9_])
  • Parsers grouped under src/bsharp_parser/src/keywords/
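The boundary rule can be illustrated without nom. The function below is a simplified stand-in, not the project's keyword() helper: `match_keyword` is a hypothetical name, and it returns an `Option` instead of a `BResult`.

```rust
/// Returns the remaining input if `input` starts with `word` and the match ends
/// at a word boundary, meaning the next character is not in `[A-Za-z0-9_]`.
fn match_keyword<'a>(word: &str, input: &'a str) -> Option<&'a str> {
    let rest = input.strip_prefix(word)?;
    match rest.chars().next() {
        // An identifier character right after the keyword means this is a longer name.
        Some(c) if c.is_ascii_alphanumeric() || c == '_' => None,
        _ => Some(rest),
    }
}
```

With this check, `int` matches in `int x` but not in `int32` or `integer`.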

AST Design

Naming Convention

Decision: Use PascalCase names without 'Syntax' suffix for all AST nodes.

Examples:

  • ClassDeclaration (not ClassDeclarationSyntax)
  • MethodDeclaration (not MethodDeclarationSyntax)
  • ExpressionStatement (not ExpressionStatementSyntax)
  • IfStatement (not IfStatementSyntax)

Rationale:

  • Clarity: Shorter, clearer names
  • Roslyn Inspiration: Mirrors Roslyn's structure where appropriate
  • Consistency: Uniform naming across entire codebase
  • Deliberate Choice: An explicit, project-wide design decision

Implications:

  • All AST node types follow this convention
  • Test code uses these names
  • Documentation uses these names
  • Breaking change from earlier versions with 'Syntax' suffix

AST Ownership Model

Decision: Parent nodes own their children; no circular references.

Structure:

#![allow(unused)]
fn main() {
pub struct ClassDeclaration {
    pub attributes: Vec<AttributeList>,
    pub modifiers: Vec<Modifier>,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub primary_constructor_parameters: Option<Vec<Parameter>>,
    pub base_types: Vec<Type>,
    pub body_declarations: Vec<ClassBodyDeclaration>,  // Owned
    pub documentation: Option<XmlDocumentationComment>,
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}
}

Rationale:

  • Rust Ownership: Leverages Rust's ownership system
  • Memory Safety: No reference cycles or lifetime complexity
  • Simplicity: Clear ownership semantics
  • Traversal: Navigation traits provide search without ownership issues

Trade-offs:

  • Cannot directly reference parent from child
  • Navigation requires traversal from root
  • Mitigated by AstNavigate and FindDeclarations traits
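The trade-off can be seen in a small model. In the sketch below, nodes own their children and carry no parent pointer, so the ancestors of a node are recovered by searching from the root. `Node` and `ancestors_of` are hypothetical and much simpler than the real AST and its navigation traits.

```rust
/// Simplified owned tree: each node owns its children and has no parent pointer.
struct Node {
    name: String,
    children: Vec<Node>,
}

impl Node {
    fn new(name: &str, children: Vec<Node>) -> Self {
        Node { name: name.to_string(), children }
    }
}

/// Finds `target` by walking down from `root` and returns the names of its
/// ancestors, outermost first. This stands in for a child-to-parent reference.
fn ancestors_of<'a>(root: &'a Node, target: &str) -> Option<Vec<&'a str>> {
    if root.name == target {
        return Some(Vec::new());
    }
    for child in &root.children {
        if let Some(mut path) = ancestors_of(child, target) {
            path.insert(0, root.name.as_str());
            return Some(path);
        }
    }
    None
}

/// A compilation unit with one class that has two methods.
fn sample_tree() -> Node {
    Node::new(
        "CompilationUnit",
        vec![Node::new(
            "Program",
            vec![Node::new("Main", vec![]), Node::new("Run", vec![])],
        )],
    )
}
```

The lookup costs a traversal, but the tree stays free of reference cycles and lifetime parameters.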

Zero-Copy Parsing

Decision: Minimize string allocations during parsing where possible.

Implementation:

  • String slices reference original input
  • Identifiers store String (owned) for convenience
  • Literals preserve original format as String

Rationale:

  • Performance: Reduces allocation overhead
  • Memory Efficiency: Lower memory footprint
  • Trade-off: Some allocations necessary for AST lifetime
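The split between borrowed slices during parsing and owned strings in the AST can be sketched as follows. `Token`, `lex_identifier`, `to_identifier`, and `parse_owned` are hypothetical names for illustration, and this `Identifier` is a one-field stand-in for the real type.

```rust
/// A token that borrows its text from the original source (no allocation).
#[derive(Debug, PartialEq)]
struct Token<'src> {
    text: &'src str,
}

/// An AST identifier that owns its name, so the tree can outlive the source buffer.
#[derive(Debug, PartialEq)]
struct Identifier {
    name: String,
}

/// Splits a leading ASCII identifier off `source` as a borrowed slice.
fn lex_identifier(source: &str) -> Option<(Token<'_>, &str)> {
    let end = source
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(source.len());
    // Identifiers must be non-empty and must not start with a digit.
    if end == 0 || source.as_bytes()[0].is_ascii_digit() {
        return None;
    }
    Some((Token { text: &source[..end] }, &source[end..]))
}

/// Copies the borrowed text into an owned AST node: the one allocation that is kept.
fn to_identifier(token: &Token<'_>) -> Identifier {
    Identifier { name: token.text.to_string() }
}

/// Parses from a temporary buffer; the returned identifier outlives that buffer.
fn parse_owned(source: String) -> Option<Identifier> {
    let (token, _rest) = lex_identifier(&source)?;
    Some(to_identifier(&token))
}
```

The token points into the input buffer, while the identifier returned by `parse_owned` remains valid after the buffer is dropped.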

Spans and Location Tracking

Decision: Track source locations via spans for precise diagnostics and tooling.

Implementation:

  • Span type based on nom_locate::LocatedSpan lives in src/bsharp_parser/src/syntax/span.rs and is re-exported through the public parser API.
  • The parser facade supports parse_with_spans() which returns both the AST and span table for mapping nodes back to source locations.
  • Error reporting uses spans to include line/column, highlighting ranges via format_error_tree().

Rationale:

  • Diagnostics: Accurate error locations and ranges.
  • Tooling: Enables IDE features, navigation, and source mapping.
  • Testing: Stable, comparable locations for snapshot tests.

See also: docs/syntax/spans.md.


Analysis Framework

Framework-Driven Architecture

Decision: Implement a pipeline-based analysis framework with passes, rules, and visitors.

Structure:

src/bsharp_analysis/src/
├── framework/        # Core analysis infrastructure
│   ├── pipeline.rs   # Analysis pipeline orchestration
│   ├── passes.rs     # Analysis pass trait and phases
│   ├── rules.rs      # Rule trait and rulesets
│   ├── walker.rs     # AST walker and visitor pattern
│   ├── registry.rs   # Analyzer registry
│   └── session.rs    # Analysis session and state
├── passes/           # Concrete analysis passes
├── rules/            # Concrete analysis rules
├── artifacts/        # Analysis artifacts (symbols, metrics, CFG)
└── ...

Rationale:

  • Extensibility: Easy to add new analyzers
  • Composability: Passes and rules compose via registry
  • Performance: Single-pass traversal for local rules
  • Configurability: Enable/disable passes and rules via config

Phases:

  1. Index - Symbol indexing and scope building
  2. Local - Single-pass local rules and metrics collection
  3. Global - Cross-file analysis (dependencies, etc.)
  4. Semantic - Type checking and semantic rules
  5. Reporting - Report generation and formatting
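One simple way to model this ordering is an enum whose variant order defines execution order. The sketch below assumes that design for illustration; `Phase`, `PassInfo`, and `execution_order` are hypothetical and may not match the types in framework/passes.rs.

```rust
/// The five phases listed above; the derive order of variants drives `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Index,
    Local,
    Global,
    Semantic,
    Reporting,
}

/// Minimal description of a registered pass.
struct PassInfo {
    name: &'static str,
    phase: Phase,
}

/// Orders passes by phase. The sort is stable, so passes registered in the
/// same phase keep their registration order.
fn execution_order(mut passes: Vec<PassInfo>) -> Vec<&'static str> {
    passes.sort_by_key(|pass| pass.phase);
    passes.into_iter().map(|pass| pass.name).collect()
}
```

Registering passes in any order then yields a deterministic schedule: all Index passes first, Reporting passes last.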

Visitor Pattern

Decision: Use visitor pattern for AST traversal.

Implementation:

#![allow(unused)]
fn main() {
pub trait Visit {
    fn enter(&mut self, node: &NodeRef, session: &mut AnalysisSession);
    fn exit(&mut self, node: &NodeRef, session: &mut AnalysisSession) {}
}

pub struct AstWalker {
    visitors: Vec<Box<dyn Visit>>,
}
}

Rationale:

  • Separation of Concerns: Traversal logic separate from analysis logic
  • Composability: Multiple visitors in single traversal
  • Performance: Single pass for multiple analyses
  • Extensibility: Easy to add new visitors
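The single-traversal idea can be sketched with a reduced version of these types. Everything below is a simplified assumption: `Node` replaces `NodeRef`, the session parameter is omitted, and visitors are borrowed rather than boxed so their results can be read after the walk.

```rust
/// Simplified stand-in for the AST; the real `NodeRef` covers many more kinds.
enum Node {
    Class(Vec<Node>),
    Method(Vec<Node>),
    If(Vec<Node>),
    Statement,
}

impl Node {
    fn children(&self) -> &[Node] {
        match self {
            Node::Class(c) | Node::Method(c) | Node::If(c) => c.as_slice(),
            Node::Statement => &[],
        }
    }
}

/// Visitor interface with a default no-op `exit`, as in the trait above.
trait Visit {
    fn enter(&mut self, node: &Node);
    fn exit(&mut self, _node: &Node) {}
}

/// Drives every registered visitor over the tree in one traversal.
struct AstWalker<'v> {
    visitors: Vec<&'v mut dyn Visit>,
}

impl<'v> AstWalker<'v> {
    fn walk(&mut self, node: &Node) {
        for visitor in self.visitors.iter_mut() {
            visitor.enter(node);
        }
        for child in node.children() {
            self.walk(child);
        }
        for visitor in self.visitors.iter_mut() {
            visitor.exit(node);
        }
    }
}

/// Counts method nodes.
#[derive(Default)]
struct MethodCounter {
    count: usize,
}

impl Visit for MethodCounter {
    fn enter(&mut self, node: &Node) {
        if matches!(node, Node::Method(_)) {
            self.count += 1;
        }
    }
}

/// Tracks the deepest nesting level reached.
#[derive(Default)]
struct DepthTracker {
    current: usize,
    max_depth: usize,
}

impl Visit for DepthTracker {
    fn enter(&mut self, _node: &Node) {
        self.current += 1;
        self.max_depth = self.max_depth.max(self.current);
    }
    fn exit(&mut self, _node: &Node) {
        self.current -= 1;
    }
}

/// Runs both analyses in a single pass and returns (method count, max depth).
fn analyze(root: &Node) -> (usize, usize) {
    let mut methods = MethodCounter::default();
    let mut depth = DepthTracker::default();
    {
        let mut walker = AstWalker { visitors: Vec::new() };
        walker.visitors.push(&mut methods);
        walker.visitors.push(&mut depth);
        walker.walk(root);
    }
    (methods.count, depth.max_depth)
}
```

Both visitors see every node exactly once, which is the property the Performance rationale relies on.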

Query API

Decision: Use a typed Query API over a minimal NodeRef to traverse the AST. This is the current traversal API; the term “legacy” only refers to older navigation traits that the Query API replaced.

Implementation:

  • NodeRef enumerates coarse node categories (compilation unit, namespaces, declarations, methods, statements, expressions), and now includes top-level items like file-scoped namespaces, using directives, global using directives, and global attributes.
  • Children provides child enumeration for NodeRef.
  • Extract<T> enables Query::of<T>() to yield typed nodes without extending NodeRef for every concrete type.
  • Macro helpers impl_extract_expr! and impl_extract_stmt! simplify adding Extract impls for expression/statement variants.
  • Location: src/bsharp_syntax/src/query/ (re-exported as bsharp_analysis::framework::Query)

Rationale:

  • Composability: Typed filters via Query::filter_typed.
  • Maintainability: Avoids wide trait surfaces and duplicated traversal.
  • Performance: Focused walkers remain available for hot paths.
  • Determinism: Traversal order and artifact hashing remain stable.
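
The interplay of a coarse `NodeRef`, child enumeration, and `Extract<T>` can be sketched with a two-type toy AST. Everything below is an assumption made for illustration: the real traits live in bsharp_syntax and may have different signatures, and whether the real query includes the root node is not stated in this document (this sketch does).

```rust
// Hypothetical sketch of the Extract idea with a two-node toy AST.

pub struct Method {
    pub name: String,
}

pub struct Class {
    pub name: String,
    pub methods: Vec<Method>,
    pub nested: Vec<Class>,
}

/// Coarse node categories, in the spirit of NodeRef.
#[derive(Clone, Copy)]
pub enum NodeRef<'a> {
    Class(&'a Class),
    Method(&'a Method),
}

impl<'a> NodeRef<'a> {
    /// Child enumeration (the role played by Children).
    pub fn children(self) -> Vec<NodeRef<'a>> {
        match self {
            NodeRef::Class(c) => c
                .methods
                .iter()
                .map(NodeRef::Method)
                .chain(c.nested.iter().map(NodeRef::Class))
                .collect(),
            NodeRef::Method(_) => Vec::new(),
        }
    }
}

/// Typed extraction: each concrete type knows how to pull itself out of a NodeRef.
pub trait Extract<'a>: Sized + 'a {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self>;
}

impl<'a> Extract<'a> for Class {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node {
            NodeRef::Class(c) => Some(c),
            _ => None,
        }
    }
}

impl<'a> Extract<'a> for Method {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node {
            NodeRef::Method(m) => Some(m),
            _ => None,
        }
    }
}

/// Depth-first, pre-order collection of every node of type T, root included.
pub fn of<'a, T: Extract<'a>>(root: NodeRef<'a>) -> Vec<&'a T> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        if let Some(t) = T::extract(node) {
            out.push(t);
        }
        // Push in reverse so children are visited in source order.
        let mut kids = node.children();
        kids.reverse();
        stack.extend(kids);
    }
    out
}
```

Supporting a new concrete type means adding one `Extract` impl; the traversal and the `NodeRef` categories stay unchanged.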

See also:

  • docs/parser/navigation.md (Query API overview)
  • docs/analysis/traversal-guide.md (using Query in passes)
  • docs/development/query-cookbook.md (recipes)

Formatting and Emitters

Decision: Implement formatting via an Emit trait with per-node emitters in bsharp_syntax.

Implementation:

  • Emit trait and emitters live under src/bsharp_syntax/src/emitters/ (e.g., emitters/declarations/*, emitters/expressions/*, emitters/statements/*).
  • Formatting is separated from parsing; emitters reconstruct code from AST with consistent whitespace and trivia handling.
  • Trivia and XML doc emitters are under emitters/trivia/.

Rationale:

  • Separation of Concerns: Parsing and formatting evolve independently.
  • Consistency: Centralized formatting rules for all nodes.
  • Extensibility: Adding a new node implies an Emit impl in a known location.

See also: docs/syntax/formatter.md.


Workspace Loading

Multi-Format Support

Decision: Support loading from .sln, .csproj, or directory.

Implementation:

#![allow(unused)]
fn main() {
pub struct WorkspaceLoader;

impl WorkspaceLoader {
    pub fn from_path(path: &Path) -> Result<Workspace>;
    pub fn from_path_with_options(path: &Path, opts: WorkspaceLoadOptions) -> Result<Workspace>;
}
}

Rationale:

  • Flexibility: Support different entry points
  • IDE Integration: Match IDE project loading behavior
  • Incremental Analysis: Load only what's needed

Features:

  • Solution file (.sln) parsing
  • Project file (.csproj) parsing with XML
  • Transitive ProjectReference following
  • Source file discovery with glob patterns
  • Deterministic project ordering
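
Transitive reference following with a deterministic result could look roughly like the sketch below. It assumes the reference graph is already available as a map of project path to referenced paths (the real loader reads this from .csproj XML), and `resolve_projects` is a hypothetical name.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical sketch: project paths are plain strings and the reference graph
// is given up front instead of being read from .csproj files.

/// Collects `root` and everything it references transitively.
/// BTreeSet keeps the result sorted, so the order does not depend on
/// discovery order; the visited check also makes reference cycles harmless.
pub fn resolve_projects(root: &str, refs: &BTreeMap<String, Vec<String>>) -> Vec<String> {
    let mut seen = BTreeSet::new();
    let mut pending = vec![root.to_string()];
    while let Some(project) = pending.pop() {
        if !seen.insert(project.clone()) {
            continue; // already visited
        }
        if let Some(children) = refs.get(&project) {
            pending.extend(children.iter().cloned());
        }
    }
    seen.into_iter().collect()
}
```

Sorted output is one simple way to get deterministic ordering; a real loader might prefer a topological order so dependencies come first.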

Error Resilience

Decision: Continue loading workspace even if individual projects fail.

Implementation:

  • Failed projects recorded as stubs with error messages
  • Workspace loading succeeds with partial results
  • Errors accessible via Project::errors field

Rationale:

  • Robustness: Don't fail entire workspace for one bad project
  • User Experience: Show what can be analyzed
  • Debugging: Error messages preserved for investigation
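
A minimal sketch of the stub approach follows. The `Project` and `Workspace` shapes and the `load_workspace` function are assumptions for illustration; only the idea of an `errors` field on each project comes from the text above.

```rust
// Hypothetical sketch: Project, Workspace and load_workspace are illustrative only.

#[derive(Debug)]
pub struct Project {
    pub path: String,
    pub files: Vec<String>,
    /// Non-empty when the project failed to load and is only a stub.
    pub errors: Vec<String>,
}

#[derive(Debug, Default)]
pub struct Workspace {
    pub projects: Vec<Project>,
}

/// Loads every project; a failure becomes a stub entry instead of aborting.
pub fn load_workspace<F>(paths: &[&str], mut load_project: F) -> Workspace
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    let mut ws = Workspace::default();
    for &path in paths {
        let project = match load_project(path) {
            Ok(files) => Project { path: path.to_string(), files, errors: Vec::new() },
            Err(msg) => Project { path: path.to_string(), files: Vec::new(), errors: vec![msg] },
        };
        ws.projects.push(project);
    }
    ws
}
```

The function itself never returns an error, so callers always get a workspace and decide per project whether the recorded errors matter.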

Testing Strategy

External Test Organization

Decision: Externalize tests; in the current workspace they live under src/bsharp_tests/ rather than inline #[cfg(test)] modules.

Structure:

src/bsharp_tests/src/
├── parser/
│   ├── expressions/
│   ├── statements/
│   ├── declarations/
│   └── types/
├── cli/
└── integration/

Rationale:

  • Separation: Test code separate from implementation
  • Organization: Clear structure mirrors crates
  • Compilation: Tests don't bloat production binaries

Note: A future migration to top-level tests/ may be considered.

Test Helpers

Decision: Provide expect_ok() helper for readable test failures.

Implementation:

#![allow(unused)]
fn main() {
pub fn expect_ok<T>(input: &str, result: BResult<&str, T>) -> T {
    match result {
        Ok((_, value)) => value,
        Err(e) => {
            eprintln!("{}", format_error_tree(&input, &e));
            panic!("Parse failed");
        }
    }
}
}

Rationale:

  • Diagnostics: Pretty-printed errors on failure
  • Debugging: Shows parse failure context
  • Consistency: Uniform test error reporting

Snapshot Testing

Decision: Use insta crate for snapshot testing.

Implementation:

  • Cargo.toml includes insta in dev-dependencies
  • Snapshot tests for complex AST structures
  • JSON serialization for comparison

Rationale:

  • Regression Prevention: Catch unintended AST changes
  • Review: Visual diff of AST changes
  • Maintenance: Update snapshots when intentional

Performance Considerations

Parallel Analysis

Decision: Optional parallel analysis via rayon feature.

Implementation:

[features]
parallel_analysis = ["rayon"]

Rationale:

  • Scalability: Faster analysis for large workspaces
  • Optional: Not required for single-file use cases
  • Trade-off: Adds dependency and complexity

Incremental Parsing

Decision: Not implemented yet; designed for future addition.

Future Design:

  • Cache parsed ASTs by file hash
  • Reparse only changed files
  • Incremental analysis based on change scope

Rationale:

  • Performance: Critical for IDE integration
  • Complexity: Requires careful cache invalidation
  • Priority: Deferred until core features stable
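
Since this feature is not implemented, the following is only a sketch of what "cache parsed ASTs by file hash" might look like. `AstCache` is a hypothetical type, the "AST" is a stand-in `String`, and the parse function is supplied by the caller.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical sketch of the "cache by file hash" idea; not an existing BSharp API.

pub struct AstCache {
    /// path -> (content hash, parsed result)
    entries: HashMap<String, (u64, String)>,
    /// Number of real parses performed, exposed for demonstration.
    pub parses: usize,
}

fn content_hash(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    hasher.finish()
}

impl AstCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), parses: 0 }
    }

    /// Returns the cached result when the content hash is unchanged, else reparses.
    pub fn get_or_parse(
        &mut self,
        path: &str,
        source: &str,
        parse: impl Fn(&str) -> String,
    ) -> String {
        let hash = content_hash(source);
        if let Some((cached_hash, ast)) = self.entries.get(path) {
            if *cached_hash == hash {
                return ast.clone();
            }
        }
        self.parses += 1;
        let ast = parse(source);
        self.entries.insert(path.to_string(), (hash, ast.clone()));
        ast
    }
}
```

`DefaultHasher` is fine for an in-memory cache, but its output is not guaranteed to be stable across Rust releases, so an on-disk cache would need a fixed hash function.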


CLI Design

Subcommand Structure

Decision: Use clap with subcommands for different operations.

Commands:

  • parse - Parse C# file to JSON
  • tree - Generate AST visualization (Mermaid/DOT)
  • analyze - Run analysis and generate report

Rationale:

  • Clarity: Each command has clear purpose
  • Extensibility: Easy to add new commands
  • Discoverability: --help shows all options
  • Consistency: Follows common CLI patterns

Output Formats

Decision: Support multiple output formats (JSON, pretty-JSON, SVG).

Implementation:

  • JSON for machine consumption
  • Pretty-JSON for human readability
  • SVG for visualization

Rationale:

  • Integration: JSON for tool integration
  • Debugging: Pretty-JSON for manual inspection
  • Visualization: SVG for understanding AST structure

Future Extensibility

Planned Enhancements

  1. Incremental Parsing

    • Cache parsed ASTs
    • Reparse only changed regions
    • Critical for IDE integration
  2. Language Server Protocol (LSP)

    • IDE integration
    • Real-time diagnostics
    • Code completion
  3. More Analysis Passes

    • Nullability analysis
    • Lifetime analysis
    • Security analysis
  4. Code Transformation

    • AST modification API
    • Code generation from AST
    • Refactoring support

Design for Extension

Principles:

  • Trait-Based: Use traits for extensibility points
  • Registry Pattern: Dynamic registration of analyzers
  • Configuration: Enable/disable features via config
  • Versioning: Stable API with clear versioning

Lessons Learned

What Worked Well

  1. Parser Combinators: Excellent for composability and testing
  2. Module Organization: Clear boundaries reduce coupling
  3. Error Context: ErrorTree provides excellent diagnostics
  4. External Tests: Clean separation improves maintainability

What We'd Do Differently

  1. Earlier Keyword Modularization: Should have organized keywords from the start
  2. Error Type Migration: Earlier adoption of ErrorTree would have saved refactoring
  3. Documentation: More inline documentation from the beginning

Recent Refactoring

Major refactoring improvements completed:

  • Expression precedence chain builder implemented
  • Statement group deduplication completed
  • Consistent error recovery with skip_to_member_boundary_top_level()
  • Whitespace handling standardization via bws() combinator
  • Keyword modularization by category

Contributing Guidelines

When adding new features, follow these architectural principles:

  1. Use Existing Patterns: Follow established parser patterns
  2. Add Tests: External tests under src/bsharp_tests/src/
  3. Document Decisions: Update this file for significant changes
  4. Error Context: Add .context() calls for debugging
  5. Naming Convention: PascalCase without 'Syntax' suffix
  6. Keyword Boundaries: Use keyword() helper for all keywords

See docs/development/contributing.md for detailed contribution guidelines.

2025-11-17 15:18:26 • commit: 03a4e25

Cookbooks

Short, task-focused examples and patterns.


Available Cookbooks

  • Query Cookbook
    • Practical Query API patterns for traversing the AST.
  • Parser Cookbook
    • Nom recipes: identifiers, lists, delimited blocks with cut, tokens with complete, and all_consuming file parsers.

When to use

  • You know the outcome you want and need a concise example.
  • You want to copy/paste a small starting point and adapt.

For deeper explanations, see:

  • docs/development/writing-parsers.md
  • docs/analysis/traversal-guide.md

Query Cookbook

Practical examples for using the Query API to traverse the AST.


Imports

#![allow(unused)]
fn main() {
// Option A (canonical): import directly from bsharp_syntax
use bsharp_syntax::node::ast_node::NodeRef;
use bsharp_syntax::query::Query;
use bsharp_syntax::{CompilationUnit, ClassDeclaration, MethodDeclaration};

// Option B (ergonomic in analysis code): re-exports via bsharp_analysis
// use bsharp_analysis::framework::{NodeRef, Query};
}

All classes in a file

#![allow(unused)]
fn main() {
fn all_classes(cu: &CompilationUnit) -> Vec<&ClassDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<ClassDeclaration>()
        .collect()
}
}

All methods in a class

#![allow(unused)]
fn main() {
fn all_methods_in_class(c: &ClassDeclaration) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::from(c))
        .of::<MethodDeclaration>()
        .collect()
}
}

Public methods only

#![allow(unused)]
fn main() {
use bsharp_syntax::modifiers::Modifier;

fn public_methods(cu: &CompilationUnit) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<MethodDeclaration>(|m| m.modifiers.iter().any(|mm| *mm == Modifier::Public))
        .collect()
}
}

Count await expressions

#![allow(unused)]
fn main() {
use bsharp_syntax::expressions::AwaitExpression;

fn await_count(cu: &CompilationUnit) -> usize {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<AwaitExpression>()
        .count()
}
}

Find invocations of a method name

#![allow(unused)]
fn main() {
use bsharp_syntax::expressions::{InvocationExpression, Expression};

fn invocations_of(cu: &CompilationUnit, name: &str) -> Vec<&InvocationExpression> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<InvocationExpression>(|inv| {
            // Match simple Variable(...) calls; extend for MemberAccess as needed
            match &*inv.expression {
                Expression::Variable(id) => id.name == name,
                _ => false,
            }
        })
        .collect()
}
}

Methods with deep nesting

#![allow(unused)]
fn main() {
use bsharp_syntax::statements::statement::Statement;

fn deeply_nested_methods(cu: &CompilationUnit, threshold: usize) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<MethodDeclaration>(|m| {
            if let Some(body) = &m.body {
                max_nesting(body, 0) > threshold
            } else {
                false
            }
        })
        .collect()
}

fn max_nesting(s: &Statement, cur: usize) -> usize {
    match s {
        Statement::If(i) => {
            let then_d = max_nesting(&i.consequence, cur + 1);
            let else_d = i.alternative.as_ref().map(|a| max_nesting(a, cur + 1)).unwrap_or(cur);
            then_d.max(else_d)
        }
        Statement::Block(stmts) => stmts.iter().map(|st| max_nesting(st, cur)).max().unwrap_or(cur),
        Statement::For(f) => max_nesting(&f.body, cur + 1),
        Statement::ForEach(f) => max_nesting(&f.body, cur + 1),
        Statement::While(w) => max_nesting(&w.body, cur + 1),
        Statement::DoWhile(d) => max_nesting(&d.body, cur + 1),
        _ => cur,
    }
}
}

Tips

  • Chain filters sparingly: Prefer a single filter_typed with a clear predicate.
  • Use NodeRef::from(x): Start from any AST node to scope queries.
  • Profile: For hot paths, consider a custom walker when you need full control.

Parser Cookbook

Practical recipes for nom-based parsers in bsharp_parser.


Spanned-first policy

  • All public parser entrypoints return Spanned<T> so callers have precise source ranges for AST nodes.
  • Internals should prefer spanned parsers as well to preserve spans through transformations.
  • When you only need the inner value, map via .node.

Examples:

#![allow(unused)]
fn main() {
// Prefer the spanned variant and map to inner node when spans are not needed
let (rest, expr) = nom::sequence::delimited(ws, parse_expression_spanned, ws)
    .map(|s| s.node)
    .parse(input)?;

// Lists of expressions: collect inner nodes
let (rest, args) = parse_delimited_list0(
    |i| delimited(ws, tok_l_paren(), ws).parse(i),
    |i| delimited(ws, parse_expression_spanned, ws).map(|s| s.node).parse(i),
    |i| delimited(ws, tok_comma(), ws).parse(i),
    |i| delimited(ws, tok_r_paren(), ws).parse(i),
    false,
    true,
).parse(input)?;
}

Parsable trait

  • For one-shot parsing of a type to Spanned<Self>, implement or use the crate’s Parsable abstraction (where available) instead of bespoke entrypoints.
  • This keeps a consistent contract across the parser and simplifies tests and tools that need spans.

Conventions

  • Use Span<'a> and BResult<'a, T> from bsharp_parser::syntax modules.
  • Prefer small, composable parsers and add context() labels.
  • Use cut() to avoid misleading backtracking after committing to a branch.
#![allow(unused)]
fn main() {
use bsharp_parser::syntax::span::Span;
use bsharp_parser::syntax::errors::BResult;
use nom::{IResult, branch::alt, bytes::complete::tag, character::complete as cc, combinator::{all_consuming, complete, map}, sequence::{delimited, preceded, terminated, tuple}};
use nom_supreme::ParserExt; // for .context(), .cut()
}

Identifier

#![allow(unused)]
fn main() {
fn identifier(input: Span) -> BResult<String> {
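    // very simplified: ASCII letters followed by letters/digits; no '_' or Unicode handling
    map(
        tuple((cc::alpha1, cc::alphanumeric0)),
        // the input is a Span, so both pieces are Spans; fragment() yields the matched &str
        |(h, t): (Span, Span)| format!("{}{}", h.fragment(), t.fragment())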
    ).context("identifier").parse(input)
}
}

Comma-Separated List

#![allow(unused)]
fn main() {
use nom::multi::separated_list0;

fn comma_sep<T, F>(item: F) -> impl FnMut(Span) -> BResult<Vec<T>>
where F: Fn(Span) -> BResult<T> {
    separated_list0(cc::multispace0.and(tag(",")).and(cc::multispace0), item)
}
}

Delimited Braces Block

#![allow(unused)]
fn main() {
fn lbrace(i: Span) -> BResult<()> { map(tag("{"), |_| ()).context("'{'").parse(i) }
fn rbrace(i: Span) -> BResult<()> { map(tag("}"), |_| ()).context("'}'").parse(i) }

fn block<T, F>(mut inner: F) -> impl FnMut(Span) -> BResult<Vec<T>>
where F: FnMut(Span) -> BResult<Vec<T>> {
    move |input| {
        delimited(
            lbrace.context("block start"),
            // committed after '{': a failure in the body is reported, not backtracked over
            // (wrapped in a closure because cut() takes its parser by value)
            (|i: Span| inner(i)).cut(),
            rbrace.cut().context("block end")
        ).parse(input)
    }
}
}

Using complete() for Tokens

#![allow(unused)]
fn main() {
use nom::bytes::streaming::take;
use nom::combinator::complete;

fn exactly_n(n: u8) -> impl FnMut(Span) -> BResult<Span<'_>> {
    move |input| complete(take(n)).context("exactly_n").parse(input)
}
}

all_consuming at File Level

#![allow(unused)]
fn main() {
use nom::combinator::all_consuming;

fn parse_file(input: Span) -> BResult<File> {
    all_consuming(file_parser).parse(input)
}
}

Precedence Chain Skeleton

#![allow(unused)]
fn main() {
fn primary(i: Span) -> BResult<Expr> { /* literals, names, parenthesized */ }
fn postfix(i: Span) -> BResult<Expr> { /* member access, invocation */ }
fn unary(i: Span) -> BResult<Expr> { /* + - ! ~ */ }
fn multiplicative(i: Span) -> BResult<Expr> { /* * / % */ }
fn additive(i: Span) -> BResult<Expr> { /* + - */ }
fn relational(i: Span) -> BResult<Expr> { /* < > <= >= */ }
fn equality(i: Span) -> BResult<Expr> { /* == != */ }
fn assignment(i: Span) -> BResult<Expr> { /* = += -= */ }

// Entry point used by statement parsers
fn expression(i: Span) -> BResult<Expr> { assignment(i) }
}
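
The skeleton above leaves the bodies empty. As a runnable illustration of the same layering, here is a std-only sketch that evaluates integer arithmetic instead of building an AST. It does not use nom or BSharp types; `Cursor` and `eval` are hypothetical names, and only four levels are shown.

```rust
// Hypothetical, std-only sketch of a precedence chain: each level parses the
// next-tighter level and then loops over its own operators.

struct Cursor<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Cursor<'a> {
    /// Skips spaces and returns the next byte without consuming it.
    fn peek(&mut self) -> Option<u8> {
        while self.pos < self.src.len() && self.src[self.pos] == b' ' {
            self.pos += 1;
        }
        self.src.get(self.pos).copied()
    }

    /// Consumes `c` if it is the next non-space byte.
    fn eat(&mut self, c: u8) -> bool {
        if self.peek() == Some(c) {
            self.pos += 1;
            true
        } else {
            false
        }
    }
}

// primary: number | '(' additive ')'
fn primary(c: &mut Cursor) -> Option<i64> {
    if c.eat(b'(') {
        let v = additive(c)?;
        return if c.eat(b')') { Some(v) } else { None };
    }
    c.peek()?; // skips spaces; fails at end of input
    let start = c.pos;
    while c.pos < c.src.len() && c.src[c.pos].is_ascii_digit() {
        c.pos += 1;
    }
    std::str::from_utf8(&c.src[start..c.pos]).ok()?.parse().ok()
}

// unary: '-' unary | primary
fn unary(c: &mut Cursor) -> Option<i64> {
    if c.eat(b'-') {
        Some(-unary(c)?)
    } else {
        primary(c)
    }
}

// multiplicative: unary (('*' | '/') unary)*
fn multiplicative(c: &mut Cursor) -> Option<i64> {
    let mut lhs = unary(c)?;
    loop {
        if c.eat(b'*') {
            lhs *= unary(c)?;
        } else if c.eat(b'/') {
            lhs = lhs.checked_div(unary(c)?)?;
        } else {
            return Some(lhs);
        }
    }
}

// additive: multiplicative (('+' | '-') multiplicative)*
fn additive(c: &mut Cursor) -> Option<i64> {
    let mut lhs = multiplicative(c)?;
    loop {
        if c.eat(b'+') {
            lhs += multiplicative(c)?;
        } else if c.eat(b'-') {
            lhs -= multiplicative(c)?;
        } else {
            return Some(lhs);
        }
    }
}

/// Entry point: starts at the loosest level and requires all input to be consumed.
pub fn eval(src: &str) -> Option<i64> {
    let mut c = Cursor { src: src.as_bytes(), pos: 0 };
    let v = additive(&mut c)?;
    if c.peek().is_none() { Some(v) } else { None }
}
```

The loop at each level makes binary operators left-associative, and the final check in `eval` plays the role that all_consuming plays for file parsers.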

Context Labels and Cuts

#![allow(unused)]
fn main() {
fn class_declaration(i: Span) -> BResult<ClassDecl> {
    preceded(
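        // simplified: production parsers should use the keyword helper so "classy" is not accepted
        tag("class").context("keyword 'class'"),
        tuple((
            // committed after the keyword: a missing name becomes a hard error
            preceded(cc::multispace1, identifier).cut().context("class name"),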
            // ... type params, base list
        ))
    ).context("class declaration").map(|(name, ..)| ClassDecl { name }).parse(i)
}
}

Tips

  • Whitespace: Prefer explicit multispace0/multispace1 at boundaries to avoid accidental greedy matches.
  • Error messages: Keep context() labels concise and domain-specific (e.g., "parameter list").
  • Backtracking: Insert cut() after committing to a branch to stop alt from swallowing errors.

Writing Tests

How to write and organize tests for BSharp.


Test Locations

  • Parser and analysis tests live under src/bsharp_tests/src/.
  • Prefer dedicated files per area, e.g.:
    • src/bsharp_tests/src/parser/expressions/...
    • src/bsharp_tests/src/parser/statements/...
    • src/bsharp_tests/src/analysis/...

Parser Tests

  • Use realistic C# snippets and assert AST shapes.
  • Prefer external test helpers (avoid inline #[cfg(test)] in parser modules).
#![allow(unused)]
fn main() {
// Example skeleton
#[test]
fn parses_simple_invocation() {
    let source = "class C { void M() { Foo(1); } }";
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(source).unwrap();
    // Use Query or pattern matching to verify nodes
}
}

Analysis Tests

  • Run AnalyzerPipeline::run_with_defaults and inspect artifacts:
    • AstAnalysis metrics
    • CFG summary
    • Dependency summary
#![allow(unused)]
fn main() {
#[test]
fn counts_methods() {
    let src = "class C { void A(){} void B(){} }";
    let (cu, spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    let mut session = bsharp_analysis::framework::AnalysisSession::new(
        bsharp_analysis::context::AnalysisContext::new("file.cs", src), spans);
    bsharp_analysis::framework::AnalyzerPipeline::run_with_defaults(&cu, &mut session);
    let metrics = session.artifacts.get::<bsharp_analysis::metrics::AstAnalysis>().unwrap();
    assert!(metrics.total_methods >= 2);
}
}

Tips

  • Names: Use descriptive test names; each file should focus on one area.
  • Fixtures: Keep sources small and focused; add comments for intent.
  • Determinism: Avoid relying on traversal order; query by type or match by name.

bsharp_tests Overview

Structure and conventions for the test crate.


Location

  • All tests live under src/bsharp_tests/src/.
  • Organize by domain:
    • parser/ for parsing-related tests
    • analysis/ for analysis pipeline tests

Running Tests

cargo test -p bsharp_tests

Conventions

  • Prefer descriptive file names and test names.
  • Keep fixtures small and focused.
  • Use Parser::parse_with_spans and AnalyzerPipeline::run_with_defaults in integration-style tests.

Extending Syntax (New Nodes)

How to add new AST node types to bsharp_syntax.


1. Define the Node

  • Add a struct or enum in the relevant module under src/bsharp_syntax/src/.
  • Derive bsharp_syntax_derive::AstNode so it participates in traversal and rendering.
#![allow(unused)]
fn main() {
#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct InterpolatedString {
    pub parts: Vec<InterpolatedPart>,
}

#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum InterpolatedPart {
    Text(String),
    Expr(Expression),
}
}

The derive implements AstNode and auto-generates children() that pushes nested nodes.
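
For intuition, a hand-written equivalent of what such a derive could generate is sketched below. The `AstNode` trait shown here is a simplified assumption (the real trait in bsharp_syntax may have a different shape), and `Expression` is reduced to two variants.

```rust
// Hypothetical sketch: a simplified AstNode trait and toy Expression type.

pub trait AstNode {
    fn kind(&self) -> &'static str;
    /// Pushes direct child nodes; primitive payloads such as String are skipped.
    fn children<'a>(&'a self, out: &mut Vec<&'a dyn AstNode>);
}

pub enum Expression {
    Name(String),
    Add(Box<Expression>, Box<Expression>), // Box keeps the recursive type sized
}

pub struct InterpolatedString {
    pub parts: Vec<InterpolatedPart>,
}

pub enum InterpolatedPart {
    Text(String),
    Expr(Expression),
}

impl AstNode for Expression {
    fn kind(&self) -> &'static str {
        match self {
            Expression::Name(_) => "Name",
            Expression::Add(..) => "Add",
        }
    }
    fn children<'a>(&'a self, out: &mut Vec<&'a dyn AstNode>) {
        if let Expression::Add(lhs, rhs) = self {
            out.push(&**lhs);
            out.push(&**rhs);
        }
    }
}

impl AstNode for InterpolatedPart {
    fn kind(&self) -> &'static str {
        "InterpolatedPart"
    }
    fn children<'a>(&'a self, out: &mut Vec<&'a dyn AstNode>) {
        // Text carries only a String payload, so it contributes no children.
        if let InterpolatedPart::Expr(e) = self {
            out.push(e);
        }
    }
}

impl AstNode for InterpolatedString {
    fn kind(&self) -> &'static str {
        "InterpolatedString"
    }
    fn children<'a>(&'a self, out: &mut Vec<&'a dyn AstNode>) {
        for part in &self.parts {
            out.push(part);
        }
    }
}

/// Counts all nodes reachable from `root`, including `root` itself.
pub fn count_nodes(root: &dyn AstNode) -> usize {
    let mut kids = Vec::new();
    root.children(&mut kids);
    1 + kids.into_iter().map(|k| count_nodes(k)).sum::<usize>()
}
```

Generic tools such as `count_nodes` only depend on the trait, which is why new node types pick up traversal-based features without extra wiring.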


2. Implement Emit (Optional)

If the node needs to be formatted back to C#, implement Emit in bsharp_syntax emitters.

#![allow(unused)]
fn main() {
impl crate::emitters::emit_trait::Emit for InterpolatedString {
    fn emit<W: std::fmt::Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError> {
        cx.token(w, "$")?;
        cx.bracketed(w, '"', '"', || {
            for p in &self.parts { p.emit(w, cx)?; }
            Ok(())
        })
    }
}
}

Add per-part emitters in the same or nearby module (e.g., emitters/expressions/...).


3. Wire Up Parser (in bsharp_parser)

  • Add a parser in src/bsharp_parser/src/expressions/... that constructs the new node.
  • Use Span-based parsers (bsharp_parser::syntax::span::Span).
  • On errors, rely on helpers and contexts so format_error_tree() is informative.

3a. Add Keywords & Tokens

  • Define keyword helpers using define_keyword_pair! in src/bsharp_parser/src/keywords/.
  • If the keyword is a new reserved word, add it to KEYWORDS (used for identifier filtering).
  • Use kw_*()/peek_*() in parsers, wrapped with ws() at boundaries, and insert .cut() after commitment.

See: docs/parser/keywords-and-tokens.md for the macro and examples.
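
The two concerns above, keyword boundaries and identifier filtering, can be illustrated without nom. The functions below are hypothetical stand-ins and do not reflect what define_keyword_pair! generates.

```rust
// Hypothetical, std-only sketch; the real helpers come from define_keyword_pair!.

/// Matches `kw` at the start of `input` only when it is not followed by an
/// identifier character, so "classy" does not match the keyword "class".
/// Returns the remaining input on success.
pub fn keyword<'a>(kw: &str, input: &'a str) -> Option<&'a str> {
    let rest = input.strip_prefix(kw)?;
    match rest.chars().next() {
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(rest),
    }
}

/// Reserved words are rejected as identifiers (the role of a KEYWORDS list).
pub fn is_identifier(word: &str, keywords: &[&str]) -> bool {
    let mut chars = word.chars();
    let starts_ok = matches!(chars.next(), Some(c) if c.is_alphabetic() || c == '_');
    starts_ok
        && chars.all(|c| c.is_alphanumeric() || c == '_')
        && !keywords.contains(&word)
}
```

A plain prefix match would accept "classy" as the keyword "class" followed by "y"; the boundary check is what prevents that.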


3b. Use Syntax Parsers (Whitespace/Lists)

  • Whitespace/comments: syntax/comment_parser.rs (ws(), parse_whitespace_or_comments())
  • Lists: syntax/list_parser.rs for delimited/separated lists
  • Tokens: prefer nom_supreme::tag::complete::tag() and compose with preceded/terminated/delimited and ws()

Example token with trivia:

#![allow(unused)]
fn main() {
use nom::{combinator::map, sequence::delimited};
use nom_supreme::tag::complete::tag;
use crate::syntax::comment_parser::ws;

map(delimited(ws, tag(","), ws), |_| ())
}

4. Tests (bsharp_tests)

  • Create tests under src/bsharp_tests/src/parser/... verifying the node appears in the AST.
  • Add formatter round-trip tests if Emit is implemented.
#![allow(unused)]
fn main() {
#[test]
fn interpolated_string_ast() {
    let src = r#"class C { void M(){ var s = $"x={x}"; } }"#;
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    // Use Query to find InterpolatedString once parser supports it
}
}

5. Visualization (Optional)

Graph views require no changes: to_text, to_mermaid, and to_dot use AstNode traversal.
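
As a rough picture of how a traversal-driven graph view can work, the sketch below renders a toy tree as a Mermaid flowchart. It is not the project's to_mermaid; the `Node` type and the output layout are assumptions.

```rust
// Hypothetical sketch: a toy tree and a Mermaid flowchart emitter.

pub struct Node {
    pub kind: &'static str,
    pub children: Vec<Node>,
}

/// Renders the tree as a Mermaid flowchart using pre-order node ids (n0, n1, ...).
pub fn to_mermaid(root: &Node) -> String {
    // Emits the node, then each child followed by the edge to it; returns the node's id.
    fn visit(node: &Node, next_id: &mut usize, out: &mut String) -> usize {
        let id = *next_id;
        *next_id += 1;
        out.push_str(&format!("  n{id}[\"{}\"]\n", node.kind));
        for child in &node.children {
            let child_id = visit(child, next_id, out);
            out.push_str(&format!("  n{id} --> n{child_id}\n"));
        }
        id
    }

    let mut out = String::from("graph TD\n");
    visit(root, &mut 0, &mut out);
    out
}
```

Because the emitter only needs a kind and a child list, any node type that participates in traversal is rendered without extra code.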


Tips

  • Box recursion: Use Box<T> for recursive enum variants.
  • Keep primitives out: Store String, bool, numbers as payload only; derive will skip them.
  • Naming: Use PascalCase node names; no Syntax suffix.

Writing Parsers

Guidelines for implementing parsers in bsharp_parser using nom and spans.


Spans & Result Type

  • Span: bsharp_parser::syntax::span::Span<'a> (alias of nom_locate::LocatedSpan<&'a str>)
  • Error type: nom_supreme::error::ErrorTree<Span<'a>>
  • Result alias: type BResult<'a, O> = IResult<Span<'a>, O, ErrorTree<Span<'a>>> in bsharp_parser::syntax::errors
#![allow(unused)]
fn main() {
use bsharp_parser::syntax::errors::BResult;
use bsharp_parser::syntax::span::Span;
}

Streaming vs Complete

nom provides streaming and complete variants of many parsers; the streaming variants return Incomplete when they run out of input. Use nom::combinator::complete(parser) to transform Incomplete into Error when you want a "complete input" behavior for a sub-parser (e.g., tokens, literals).

Example (from nom docs):

#![allow(unused)]
fn main() {
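use nom::bytes::streaming::take;
use nom::combinator::complete;
use nom::{error::ErrorKind, Err, Parser}; // Parser provides .parse()

let mut parser = complete(take(5u8));
assert_eq!(parser.parse("abcdefg"), Ok(("fg", "abcde")));
// Too little input: Incomplete is reported as a regular error.
// Naming the error type here also lets type inference succeed.
assert_eq!(parser.parse("abcd"), Err(Err::Error(("abcd", ErrorKind::Complete))));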
}

At the top level, wrap file parsers with nom::combinator::all_consuming to ensure the entire input is consumed:

#![allow(unused)]
fn main() {
use nom::combinator::all_consuming;
let mut parser = all_consuming(file_parser);
}

Error Contexts and Cuts

Use nom_supreme for structured errors and better messages:

  • context("label", p) to push human-readable frames.
  • cut(p) to prevent backtracking across critical boundaries and surface the right error.
  • Our pretty-printer format_error_tree(&source, &error_tree) renders the tree with line/column and context stack.
#![allow(unused)]
fn main() {
use nom::{branch::alt, sequence::{preceded, terminated}};
use nom_supreme::context::ContextError;
use nom_supreme::ParserExt; // for .context(), .cut()

fn identifier(input: Span) -> BResult<String> { /* ... */ }
fn lbrace(input: Span) -> BResult<()> { /* ... */ }
fn rbrace(input: Span) -> BResult<()> { /* ... */ }

fn block(input: Span) -> BResult<Vec<Stmt>> {
    preceded(
        lbrace.context("block: '{'"),
        terminated(statements, rbrace.cut().context("block: '}'"))
    ).parse(input)
}
}
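
Why cut changes the reported error can be shown with a small std-only model of recoverable and committed failures (nom names them Err::Error and Err::Failure). The types and functions below are hypothetical and much simpler than nom's.

```rust
// Hypothetical, std-only model of recoverable vs committed parse failures.

#[derive(Debug, PartialEq)]
pub enum ParseErr {
    Error(String),   // recoverable: an alternative may be tried
    Failure(String), // committed: propagates straight to the caller
}

pub type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

/// Turns a recoverable error into a committed one.
pub fn cut<'a, T>(r: PResult<'a, T>) -> PResult<'a, T> {
    r.map_err(|e| match e {
        ParseErr::Error(msg) => ParseErr::Failure(msg),
        failure => failure,
    })
}

fn tag<'a>(t: &str, input: &'a str, label: &str) -> PResult<'a, ()> {
    input
        .strip_prefix(t)
        .map(|rest| (rest, ()))
        .ok_or_else(|| ParseErr::Error(format!("expected {label}")))
}

/// block: '{' followed by a committed '}' (body omitted for brevity).
pub fn block(input: &str) -> PResult<'_, &'static str> {
    let (rest, _) = tag("{", input, "'{'")?;
    let (rest, _) = cut(tag("}", rest, "'}'"))?;
    Ok((rest, "block"))
}

/// name: the single letter 'x', standing in for an identifier parser.
pub fn name(input: &str) -> PResult<'_, &'static str> {
    let (rest, _) = tag("x", input, "name")?;
    Ok((rest, "name"))
}

/// alt-like choice: fall back to `name` only on a recoverable error.
pub fn statement(input: &str) -> PResult<'_, &'static str> {
    match block(input) {
        Err(ParseErr::Error(_)) => name(input),
        other => other,
    }
}
```

For the input "{x" the committed failure "expected '}'" reaches the caller. Without cut, the choice would fall through to `name` and report "expected name", which points at the wrong problem.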

Common Combinators

  • preceded(a, b), terminated(a, b), delimited(a, b, c)
  • alt((p1, p2, ...)) for alternatives
  • tuple((p1, p2, ...)) to sequence
  • separated_list0(sep, item) to parse comma-separated lists
  • map(p, f) to build AST nodes

Prefer small, focused parsers composed with these combinators.


Top-Level Entry Points

  • Keep clear entry points for precedence chains (e.g., primary → postfix → binary → assignment).
  • Use wrapper nodes for constructs like New, Invocation, MemberAccess, etc., to keep variants orthogonal in the AST (see bsharp_syntax::expressions::expression.rs).

Testing Parsers

  • Place tests in src/bsharp_tests/src/parser/....
  • Parse using Parser::new().parse_with_spans(&source) and assert expected AST shapes.
  • On failure, pretty-print errors with format_error_tree to diagnose.
#![allow(unused)]
fn main() {
#[test]
fn parses_expression_statement() {
    let src = "class C { void M(){ Foo(1); } }";
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    // Verify expected nodes using Query or pattern matching
}
}

Tips

  • Commit with cut after consuming a keyword so a later failure is reported directly instead of falling through to misleading alternatives.
  • Use complete for tokens/literals that must not be partial.
  • Use all_consuming at the file/compilation-unit level to reject trailing input.
  • Context labels: Be concise and specific; they surface in error messages and docs.

References

  • nom combinator complete: https://docs.rs/nom/8.0.0/nom/combinator/fn.complete.html
  • nom combinator all_consuming: https://docs.rs/nom/8.0.0/nom/combinator/fn.all_consuming.html

Spanned-first Parsers

This project follows a spanned-first policy for all parser entrypoints. Public parsers return Spanned<T> so every AST value carries precise source ranges for diagnostics, tooling, and downstream analysis.


Rationale

  • Rich diagnostics: precise byte and line/column ranges for errors and UI highlighting.
  • Uniform contract: tools and tests can rely on span presence everywhere.
  • Safer refactors: span plumbing is not an afterthought.
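
A minimal sketch of the idea follows. Only the `.node` field is taken from this document; the `start`/`end` byte offsets, the `map` helper, and `parse_int_spanned` are assumptions for illustration, and the real Spanned<T> may carry more information.

```rust
// Hypothetical sketch: the real Spanned<T> may also carry line/column data.

#[derive(Debug, Clone, PartialEq)]
pub struct Spanned<T> {
    pub node: T,
    pub start: usize, // byte offset, inclusive
    pub end: usize,   // byte offset, exclusive
}

impl<T> Spanned<T> {
    /// Transforms the payload while keeping the source range.
    pub fn map<U>(self, f: impl FnOnce(T) -> U) -> Spanned<U> {
        Spanned { node: f(self.node), start: self.start, end: self.end }
    }
}

/// Parses a run of ASCII digits at the start of `input`.
/// `base` is the byte offset of `input` within the whole source.
pub fn parse_int_spanned(input: &str, base: usize) -> Option<(Spanned<i64>, &str)> {
    let len = input.bytes().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return None;
    }
    let value = input[..len].parse().ok()?;
    Some((Spanned { node: value, start: base, end: base + len }, &input[len..]))
}
```

Callers that do not need the range read `.node`; callers that transform the value use `map` so the range survives the transformation.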

Usage Patterns

1) Prefer spanned entrypoints

#![allow(unused)]
fn main() {
// Prefer spanned variants
let (rest, s_expr) = parse_expression_spanned(input)?;
// Use inner value if spans are not needed at the call site
let expr = s_expr.node;
}

2) Map lists to inner nodes

#![allow(unused)]
fn main() {
use nom::sequence::delimited;

let (rest, args) = parse_delimited_list0(
    |i| delimited(ws, tok_l_paren(), ws).parse(i),
    |i| delimited(ws, parse_expression_spanned, ws).map(|s| s.node).parse(i),
    |i| delimited(ws, tok_comma(), ws).parse(i),
    |i| delimited(ws, tok_r_paren(), ws).parse(i),
    false,
    true,
).parse(input)?;
}

3) Statements

#![allow(unused)]
fn main() {
let (rest, s_stmt) = parse_statement_ws_spanned(input)?;
let stmt = s_stmt.node;
}

Implementing new parsers

  • Return Spanned<T> from public entrypoints.
  • Compose with existing spanned parsers to retain spans through transformations.
  • For adapters that must return unspanned values (e.g., legacy APIs), apply .map(|s| s.node) at the last possible boundary.
  • Use cut() after committing to a branch to produce focused errors.
  • Add context("...") labels on user-facing constructs.

Example:

#![allow(unused)]
fn main() {
use nom::sequence::delimited;
use nom_supreme::ParserExt;

pub fn parse_lambda_body(input: Span) -> BResult<LambdaBody> {
    nom::branch::alt((
        // block
        nom::combinator::map(parse_lambda_block_body, LambdaBody::Block),
        // expression
        nom::combinator::map(
            delimited(ws, parse_expression_spanned, ws).map(|s| s.node),
            LambdaBody::ExpressionSyntax,
        ),
    ))
    .context("lambda body")
    .parse(input)
}
}

Testing

  • Prefer helpers that accept/return Spanned<T> in new tests.
  • When asserting only structure, map to .node before comparison.
  • For diagnostics, use the existing pretty printers (see bsharp_parser::errors::format_error_tree and to_miette_report).

Migration Notes

  • Old unspanned entrypoints are deprecated; use their _spanned counterparts.
  • If a caller previously depended on unspanned types, add .map(|s| s.node).
  • For bulk changes: search for parse_expression( and parse_statement( and replace with spanned + .node mapping.

Compliance

Warning

This Compliance section is a work in progress. Content, mappings, and assertions may evolve and change between versions.

This section documents the Roslyn compliance pipeline and how we validate our bsharp_parser and bsharp_syntax against Roslyn’s structure tests.


Compliance Overview

This section describes the Roslyn compliance effort for the C# parser, using our Rust-based bsharp_parser and the bsharp_syntax AST. The goal is to automatically extract structural assertions from Roslyn tests and validate that our AST shape and key payloads match Roslyn’s expectations (normalized to our naming conventions: PascalCase, no "Syntax" suffix).

High-Level Flow

  • Source: Roslyn test files in roslyn_testing/roslyn_repo/src/Compilers/CSharp/Test/Syntax/Parsing/.
  • Extraction: A generator scans for UsingTree(...) blocks and parses the following DSL of N(SyntaxKind.X) nodes.
  • Translation: The extracted Roslyn tree is translated and normalized to our canonical kinds and structure.
  • Running: Tests are emitted into bsharp_compliance_testing, parsing provided C# snippets with bsharp_parser and comparing the actual AST with the expected structure.
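
The extraction step can be pictured with a small line-based parser for the node DSL. This is a simplified sketch, not the generator's code: `Expected` and `parse_dsl` are hypothetical, and only `N(SyntaxKind.X);`, braces, and `EOF();` are handled (no M(...) nodes or token text arguments).

```rust
// Hypothetical sketch of the extraction step; the real generator handles more forms.

#[derive(Debug, PartialEq)]
pub struct Expected {
    pub kind: String,
    pub children: Vec<Expected>,
}

/// Parses lines of the form `N(SyntaxKind.X);`, `{`, `}` and `EOF();` into a forest.
/// Returns None for unbalanced braces or unrecognized lines.
pub fn parse_dsl(src: &str) -> Option<Vec<Expected>> {
    // stack[0] collects top-level nodes; each `{` opens a child list
    // that is attached to the most recent node when the matching `}` arrives.
    let mut stack: Vec<Vec<Expected>> = vec![Vec::new()];
    for line in src.lines().map(str::trim).filter(|l| !l.is_empty()) {
        if line == "{" {
            stack.push(Vec::new());
        } else if line == "}" {
            let children = stack.pop()?;
            stack.last_mut()?.last_mut()?.children = children;
        } else if line == "EOF();" {
            break;
        } else {
            let kind = line.strip_prefix("N(SyntaxKind.")?.strip_suffix(");")?;
            stack.last_mut()?.push(Expected { kind: kind.to_string(), children: Vec::new() });
        }
    }
    if stack.len() == 1 { stack.pop() } else { None }
}
```

The resulting tree is what the translation step would then normalize before it is compared with the parsed AST.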

Core Components

  • bsharp_compliance (generator)

    • Reads Roslyn files and extracts structural expected trees.
    • Parses the Roslyn DSL (N(SyntaxKind.X), M(...), EOF()).
    • Normalizes kinds via kind_map.rs (e.g., RecordStructDeclaration → RecordDeclaration).
    • Emits Rust tests into bsharp_compliance_testing/src/generated/.
  • bsharp_compliance_testing (tests & asserts)

    • Contains custom structural assertions in custom_asserts/structure_assert.rs.
    • Walks real bsharp_syntax nodes to build a comparable ExpectedTree.
    • Compares node kind shapes and selected token payloads (e.g., identifier text).

Normalization Principles

  • Node names are PascalCase and omit Roslyn’s ...Syntax suffix.
  • Tokens/keywords are filtered from structure; identifier text is lifted where relevant.
  • Harness differences (Roslyn’s class-with-method wrappers vs. our top-level statements) are normalized at assert time when needed.
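
The first two principles can be sketched as two small functions. The mapping table in kind_map.rs is larger than this; `normalize_kind` and `is_structural` are hypothetical names, and lifting identifier text is not covered here.

```rust
// Hypothetical sketch; the real mapping lives in kind_map.rs and covers more kinds.

/// Normalizes a Roslyn kind or type name to the naming used for comparison.
pub fn normalize_kind(roslyn: &str) -> String {
    // Drop the "...Syntax" suffix used by Roslyn type names.
    let base = roslyn.strip_suffix("Syntax").unwrap_or(roslyn);
    // Collapse kinds that share a single node type on the Rust side.
    match base {
        "RecordStructDeclaration" => "RecordDeclaration".to_string(),
        other => other.to_string(),
    }
}

/// Tokens and keywords are left out of the structural comparison.
pub fn is_structural(kind: &str) -> bool {
    !(kind.ends_with("Token") || kind.ends_with("Keyword"))
}
```

Applying both functions to the expected tree before comparison keeps the assertion code free of Roslyn-specific naming.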

What This Validates

  • Structural presence and order of major nodes (CompilationUnit, declarations, using directives, type parameters, constraint clauses, etc.).
  • Selected payloads (e.g., IdentifierName.token_value).
  • Deeper constructs incrementally (e.g., TypeParameterConstraintClause, “allows ref struct” constraints, record primary parameter lists).

Roadmap

  • Expand kind mapping and walker coverage across more Roslyn suites.
  • Tighten token payload checks where meaningful.
  • Add targeted hand-authored structure tests for corner cases.

Compliance Guide

This guide explains how to write custom asserts for Roslyn compliance tests using our bsharp_compliance_testing helpers. It focuses on structural checks and optional diagnostics checks.

Where custom asserts live

  • File: roslyn_testing/bsharp_compliance_testing/src/custom_asserts/after_parse.rs
  • Entry points:
    • after_parse(...): lightweight per-case hook for structural or source-based assertions.
    • after_parse_with_expected(...): adds an optional diagnostics expectation integration.
  • Helper macro:
    • assert_when! { ... } — enables concise, per-case matching on module/file/method/index.

Using assert_when!

The macro lets you target a specific Roslyn case by module name, Roslyn filename, Roslyn test method name, and case index (0-based within the method).

Example for a Statement case:

// Import only `CaseData`: importing `after_parse` while also defining a
// function with that name in the same scope is a name clash.
use crate::custom_asserts::after_parse::CaseData;

pub fn after_parse(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    case: CaseData<'_>,
) {
    assert_when!(
        module = "statement_parsing_tests",
        roslyn_file = "StatementParsingTests",
        roslyn_method = "TestSwitchStatementWithNullableTypeInPattern3",
        idx = 2,
        Statement(ast, src) {
            // Add your targeted assertions here
            assert!(src.contains("switch"));
            // Optional: pattern-match on `ast` when you need structure checks
            // match ast { /* ... */ }
        }
    );
}

Example for a File case (full CompilationUnit available):

// Import only `CaseData`: importing `after_parse` while also defining a
// function with that name in the same scope is a name clash.
use crate::custom_asserts::after_parse::CaseData;

pub fn after_parse(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    case: CaseData<'_>,
) {
    assert_when!(
        module = "using_directive_parsing_tests",
        roslyn_file = "UsingDirectiveParsingTests",
        roslyn_method = "SimpleUsingDirectiveNamePointer",
        idx = 0,
        File(unit, src, original) {
            assert!(src.starts_with("using "));
            // `unit` is a &bsharp_syntax::ast::CompilationUnit
            // You can inspect its using directives or declarations if needed.
            assert!(!unit.using_directives.is_empty());
            let _ = original; // original Roslyn text when provided
        }
    );
}
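
To make the per-case targeting concrete, here is a minimal sketch of how a macro can gate a block on the four coordinates (module, file, method, index). It is not the real assert_when!: CaseId and when_case! are hypothetical names, and the Statement(...) and File(...) destructuring of the real macro is left out.

```rust
/// Hypothetical stand-in for the case coordinates passed to `after_parse`.
struct CaseId<'a> {
    module: &'a str,
    roslyn_file: &'a str,
    roslyn_method: &'a str,
    idx: usize,
}

/// Hypothetical macro (not the real `assert_when!`): runs the block only
/// when all four coordinates match, and evaluates to `true` if it ran.
macro_rules! when_case {
    ($case:expr,
     module = $m:expr,
     roslyn_file = $f:expr,
     roslyn_method = $r:expr,
     idx = $i:expr,
     $body:block) => {{
        let c = &$case;
        if c.module == $m && c.roslyn_file == $f && c.roslyn_method == $r && c.idx == $i {
            $body;
            true
        } else {
            false
        }
    }};
}
```

Because non-matching cases fall through without running the block, many such blocks can sit side by side in one hook and only the targeted case pays for its assertions.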

Diagnostics integration

If the generator attaches expected diagnostics, use after_parse_with_expected(...) to compare counts when diagnostics are supported by the build:

use crate::custom_asserts::after_parse::{after_parse_with_expected, CaseData};

pub fn my_integration(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    expected: Option<crate::custom_asserts::roslyn_asserts::ExpectedDiagnostics>,
    case: CaseData<'_>,
) {
    // Runs custom case asserts and then asserts diagnostics count when available
    after_parse_with_expected(module, roslyn_file, roslyn_method, idx, expected, case);
}

Notes:

  • When diagnostics support is disabled, the helper asserts with an explicit "unimplemented" fallback to avoid silent failures.
  • Keep asserts precise and self-contained; prefer checking concrete substrings or specific AST facts.
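
The "explicit fallback" idea can be sketched as follows. This is only an assumed shape: the ExpectedDiagnostics struct here has a single hypothetical count field, and check_diagnostics is an illustrative function, not the helper's real signature.

```rust
/// Hypothetical expectation type; the real `ExpectedDiagnostics` may differ.
struct ExpectedDiagnostics {
    count: usize,
}

/// `actual` is `None` when the build has no diagnostics support.
fn check_diagnostics(
    expected: Option<&ExpectedDiagnostics>,
    actual: Option<usize>,
) -> Result<(), String> {
    match (expected, actual) {
        // Nothing was expected, so there is nothing to check.
        (None, _) => Ok(()),
        // An expectation exists but cannot be verified: report it loudly
        // instead of passing silently.
        (Some(_), None) => {
            Err("unimplemented: diagnostics support is disabled in this build".to_string())
        }
        (Some(e), Some(n)) if e.count == n => Ok(()),
        (Some(e), Some(n)) => Err(format!("expected {} diagnostics, found {}", e.count, n)),
    }
}
```

A caller would typically turn the Err into a test failure, so a case with expected diagnostics never passes unverified.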

Best practices

  • Keep assertions small and focused. Use assert_when! blocks per case.
  • Avoid brittle assumptions: prefer checking presence/shape over exact token trivia.
  • Match our naming convention in any structure references (PascalCase, no Syntax suffix).
  • Fail fast with clear messages; do not silently swallow errors.

Generator

This document describes how the Roslyn structure test generator works in bsharp_compliance and how it produces executable tests for bsharp_compliance_testing.

Inputs

  • Roslyn source files under roslyn_testing/roslyn_repo/src/Compilers/CSharp/Test/Syntax/Parsing/.
  • The generator scans for UsingTree(...) calls and parses the immediately following Roslyn structure DSL composed of N(SyntaxKind.X) and EOF() entries (with M(...) ignored as "missing").
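
A naive version of the scan might look like the sketch below: it finds each UsingTree( call site and pairs it with the closest preceding var text = "..."; literal, if one exists. collect_cases is a hypothetical name, and the sketch deliberately ignores escaped quotes, verbatim and raw strings, and method boundaries.

```rust
/// For every `UsingTree(` call site, returns the closest preceding
/// `var text = "...";` literal, or `None` when there is none.
/// Escapes and verbatim/raw strings are not handled in this sketch.
fn collect_cases(source: &str) -> Vec<Option<String>> {
    const CALL: &str = "UsingTree(";
    const DECL: &str = "var text = \"";
    let mut cases = Vec::new();
    let mut from = 0;
    while let Some(rel) = source[from..].find(CALL) {
        let call_at = from + rel;
        // Search backwards from the call site for the nearest declaration.
        let snippet = source[..call_at].rfind(DECL).and_then(|decl_at| {
            let start = decl_at + DECL.len();
            source[start..call_at]
                .find("\";")
                .map(|len| source[start..start + len].to_string())
        });
        cases.push(snippet);
        from = call_at + CALL.len();
    }
    cases
}
```

A production scanner would need a real C# string-literal reader; the point here is only the "closest preceding snippet" pairing.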

Pipeline

  1. Scan and collect test methods

    • Locates Roslyn [Fact] methods and all UsingTree(...) call sites.
    • Captures the closest preceding var text = "..."; snippet as input source, when present.
  2. Parse structure DSL

    • Reads the DSL block following UsingTree(...) and constructs a nested ExpectedTree (ExpectedNode graph) mirroring the Roslyn node hierarchy.
    • Tolerates whitespace, comments, and missing markers (M(...)).
  3. Kind translation and normalization

    • generator/kind_map.rs maps Roslyn kinds to our canonical naming (PascalCase, no Syntax suffix).
    • Filters token/keyword nodes, lifting identifier text (IdentifierToken → parent IdentifierName.token_value).
    • Applies targeted renames (e.g., RecordStructDeclaration → RecordDeclaration).
  4. Emit tests

    • Writes Rust tests into bsharp_compliance_testing/src/generated/<module>.rs.
    • Each test parses the captured src with bsharp_parser and asserts structure via custom_asserts/structure_assert.rs.
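
Step 2 of the pipeline can be pictured with a line-based parser over the brace-nested DSL. This is a simplified sketch under several assumptions: DslNode, parse_dsl, and drop_missing are hypothetical names, each DSL entry sits on its own line, token text arguments are not captured, and M(...) entries are dropped together with their subtrees.

```rust
/// Hypothetical node type for this sketch.
#[derive(Debug, PartialEq)]
struct DslNode {
    kind: String,
    missing: bool,
    children: Vec<DslNode>,
}

/// Parses `N(SyntaxKind.X);` / `M(SyntaxKind.X);` lines with `{` / `}` nesting.
fn parse_dsl(dsl: &str) -> Vec<DslNode> {
    // stack[0] collects roots; each `{` opens a child list for the last node.
    let mut stack: Vec<Vec<DslNode>> = vec![Vec::new()];
    for raw in dsl.lines() {
        let line = raw.trim();
        if line == "{" {
            stack.push(Vec::new());
        } else if line == "}" && stack.len() > 1 {
            // Attach the finished child list to the node that preceded `{`.
            let children = stack.pop().unwrap();
            if let Some(parent) = stack.last_mut().and_then(|level| level.last_mut()) {
                parent.children = children;
            }
        } else if let Some((missing, rest)) = line
            .strip_prefix("N(SyntaxKind.")
            .map(|r| (false, r))
            .or_else(|| line.strip_prefix("M(SyntaxKind.").map(|r| (true, r)))
        {
            let kind: String = rest.chars().take_while(|c| c.is_alphanumeric()).collect();
            stack.last_mut().unwrap().push(DslNode { kind, missing, children: Vec::new() });
        }
        // `EOF();`, comments, and blank lines fall through and are skipped.
    }
    drop_missing(stack.swap_remove(0))
}

/// Removes nodes marked missing, including their subtrees.
fn drop_missing(nodes: Vec<DslNode>) -> Vec<DslNode> {
    nodes
        .into_iter()
        .filter(|n| !n.missing)
        .map(|mut n| {
            n.children = drop_missing(std::mem::take(&mut n.children));
            n
        })
        .collect()
}
```

Recording M(...) entries first and pruning them afterwards keeps their child blocks from being attached to the wrong parent.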

Assertions

  • Structure assertions build a comparable expected tree from our actual AST (bsharp_syntax) and compare:
    • Node kinds and order
    • Selected token payloads (e.g., IdentifierName.token_value)
  • Normalization in the assert layer adapts Roslyn’s harness (class + method) to our top-level statements when applicable.

Extending the Generator

  • Update generator/kind_map.rs to add or refine kind mappings.
  • Expand custom_asserts/structure_assert.rs to walk deeper AST areas (e.g., records, types, constraints).
  • Improve the DSL parser (generator/structure_dsl.rs) as new Roslyn DSL shapes appear.
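
Adding a kind mapping could be as small as one new match arm. The sketch below assumes a hypothetical normalize_kind function; only the RecordStructDeclaration rename comes from this document, and the real kind_map.rs may be organized differently.

```rust
/// Hypothetical helper: turns a Roslyn node type or kind name into the
/// canonical name used for structural comparison.
fn normalize_kind(roslyn_name: &str) -> String {
    // Drop the `Syntax` suffix (e.g. `ClassDeclarationSyntax`).
    let base = roslyn_name.strip_suffix("Syntax").unwrap_or(roslyn_name);
    // Targeted renames: new mappings would be added as further arms.
    match base {
        "RecordStructDeclaration" => "RecordDeclaration".to_string(),
        other => other.to_string(),
    }
}
```

Keeping the renames in a single match makes each mapping easy to review next to the Roslyn kind it replaces.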

Output Location

  • Generated files live under roslyn_testing/bsharp_compliance_testing/src/generated/.
  • Modules track Roslyn file groups, e.g. record_parsing.rs, using_directive_parsing_tests.rs.
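
The module names listed above look like snake_case forms of the Roslyn file names. Assuming that convention (the generator's actual naming logic is not shown here), a hypothetical module_file_name helper could derive them like this:

```rust
/// Hypothetical helper: converts a PascalCase Roslyn file stem into a
/// snake_case Rust module file name. Acronym runs (e.g. `LINQ`) would be
/// split letter by letter; this sketch does not special-case them.
fn module_file_name(roslyn_file: &str) -> String {
    let mut name = String::new();
    for (i, ch) in roslyn_file.chars().enumerate() {
        if ch.is_ascii_uppercase() && i > 0 {
            name.push('_');
        }
        name.push(ch.to_ascii_lowercase());
    }
    name.push_str(".rs");
    name
}
```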

dotscope Guide

VM Design

Emitter Design

Testing & Conformance

Roadmap

Open Questions

Glossary