BSharp C# Parser Documentation

BSharp is a comprehensive C# parser and analysis toolkit written in Rust. It provides a complete solution for parsing C# source code into an Abstract Syntax Tree (AST), performing various code analyses, and generating insights about code quality and structure.

What is BSharp?

BSharp consists of several key components:

Parser: A robust C# parser built using the nom parser combinator library
AST: A complete representation of C# language constructs
Analysis Framework: Tools for analyzing code structure, dependencies, and quality
CLI Tools: Command-line utilities for parsing, visualization, and analysis

Key Features

- Complete C# Language Support: Supports modern C# features including:
  - Classes, structs, interfaces, records, enums
  - Methods, properties, fields, events, indexers
  - All statement types (if, for, while, switch, try-catch, etc.)
  - Expression parsing with operator precedence
  - Generic types and constraints
  - Attributes and modifiers
  - Preprocessor directives
- Robust Error Handling: Custom error types with context information for debugging parse failures
- Query API: Typed, ergonomic traversal of the AST via bsharp_analysis::framework::Query
- Code Analysis: Built-in analyzers for:
  - Control flow analysis
  - Dependency tracking
  - Code metrics (complexity, maintainability)
  - Type analysis
  - Code quality assessment
- Extensible Architecture: Modular design allowing easy extension of parsing and analysis capabilities

Architecture Overview

The codebase is organized into several main modules:

src/
├── bsharp_parser/    # Parser crate (expressions, statements, declarations, helpers)
├── bsharp_syntax/    # AST nodes and shared syntax types (re-exported by parser)
├── bsharp_analysis/  # Analysis framework and workspace loader
├── bsharp_cli/       # Command-line interface
└── bsharp_tests/     # External tests for parser/analysis/CLI

Key Components

Parser (src/bsharp_parser/, src/bsharp_syntax/)

Modular parser using nom combinators
Complete C# language support
Rich error diagnostics with ErrorTree
Keyword parsing organized by category
AST nodes follow PascalCase naming without 'Syntax' suffix

Workspace Loading (src/bsharp_analysis/src/workspace/)

Solution file (.sln) parsing
Project file (.csproj) parsing with XML
Transitive ProjectReference resolution
Source file discovery with glob patterns
Deterministic project ordering

Analysis Framework (src/bsharp_analysis/src/)

Pipeline-based architecture with phases
Extensible passes and rules system
Metrics collection (complexity, maintainability)
Control flow analysis
Dependency tracking
Code quality assessment

CLI Tools (src/bsharp_cli/)

parse - Parse C# and print textual AST tree
tree - Generate AST visualization (Mermaid/DOT)
analyze - Comprehensive code analysis
format - Format C# code using syntax emitters

Getting Started

The easiest way to get started is using the CLI tools:

# Parse a C# file and print textual AST tree
bsharp parse input.cs

# Generate AST visualization
bsharp tree input.cs --output ast.svg

# Analyze a project or solution
bsharp analyze MyProject.csproj --out report.json

# Format a file in-place (or a directory recursively)
bsharp format input.cs --write true

Formatter Quickstart

Use the built-in formatter from the CLI or integrate the Formatter directly.

CLI usage and options: see Format Command
Formatter design and policies: see Formatter and Emitters

Quick examples:

# Format a single file in-place
bsharp format Program.cs

# Print formatted output (do not write)
bsharp format Program.cs --write false

# Enable emission tracing to a JSONL file
bsharp format Program.cs --emit-trace --emit-trace-file format_trace.jsonl

Use Cases

BSharp is designed for:

Static Analysis Tools: Build custom analyzers for code quality, security, or style
Code Transformation: Parse, modify, and regenerate C# code
Language Tooling: Create IDE extensions, linters, or formatters
Educational Tools: Understand and visualize C# code structure
Migration Tools: Analyze legacy code for modernization efforts

This documentation will guide you through all aspects of using and extending BSharp.

Parser Overview

The BSharp parser transforms C# source code into a structured Abstract Syntax Tree (AST). Built using the nom parser combinator library, it provides a robust and extensible foundation for parsing modern C# syntax as part of the BSharp toolkit.

Architecture

The parser follows a modular architecture with clear separation of concerns. It serves as the frontend for tools that consume the AST (analysis, visualization, etc.):

Parser Infrastructure (`src/bsharp_syntax/src/`)

mod.rs: Public API and re-exports
ast.rs: Root AST node definitions (CompilationUnit, TopLevelDeclaration)
errors.rs: Error formatting utilities (format_error_tree)
parser_helpers.rs: Core parsing utilities (context, bws, keyword, etc.)
test_helpers.rs: Testing utilities (expect_ok, etc.)
nodes/: AST node definitions organized by category

Parser Implementations (`src/bsharp_parser/src/`)

The parsers are organized by language construct type:

expressions/: All expression parsing (literals, operators, method calls, etc.)
keywords/: Keyword parsing organized by category
types/: Type system parsing (primitives, generics, arrays, etc.)
helpers/: Declaration helpers and utilities
preprocessor/: Preprocessor directive parsing

AST Node Definitions (`src/bsharp_syntax/src/`)

Structured node definitions that mirror C# language constructs:

declarations/: All declaration node types
expressions/: All expression node types
statements/: All statement node types
types/: Type system node definitions

Parser Design Principles

1. Compositional Design

The parser is built from small, focused parser functions that combine to handle complex language constructs:

// Example: Method declaration combines multiple sub-parsers
fn parse_method_declaration(input: &str) -> BResult<&str, MethodDeclaration> {
    let (input, attributes) = parse_attributes(input.into())?;
    let (input, modifiers) = parse_modifiers(input.into())?;
    let (input, return_type) = parse_type(input.into())?;
    let (input, name) = parse_identifier(input.into())?;
    let (input, parameters) = parse_parameter_list(input.into())?;
    let (input, body) = opt(parse_block_statement)(input.into())?;
    // ... construct and return MethodDeclaration
}

2. Error Recovery and Context

Custom error types provide detailed context about parse failures:

Location information (line, column)
Expected vs. actual input
Contextual error messages
Error recovery strategies

3. Extensibility

The modular design allows easy addition of new language features:

Add new expression types by extending the Expression enum
Implement new statement parsers following established patterns
Extend AST navigation traits for new analysis capabilities

Parsing Flow

1. Entry Point

Parsing begins via the public facade in src/bsharp_parser/src/facade.rs:

let parser = bsharp_parser::facade::Parser::new();
let cu = parser.parse(source)?;

2. Compilation Unit Parsing

The parser starts by parsing a CompilationUnit, which represents a complete C# source file:

Global attributes (assembly/module level)
Using directives
Top-level declarations (namespaces, classes, etc.)
File-scoped namespaces (C# 10+)
Top-level statements (C# 9+)

3. Recursive Descent

The parser uses recursive descent to handle nested structures:

Namespaces contain type declarations
Types contain member declarations
Methods contain statements
Statements contain expressions

Key Parser Features

Expression Parsing with Precedence

The expression parser handles operator precedence correctly:

Primary expressions (literals, identifiers, parentheses)
Unary operators (!, -, +, ++, --, etc.)
Binary operators with correct precedence and associativity
Conditional expressions (ternary operator)
Assignment expressions

Statement Parsing

Comprehensive support for all C# statement types:

Control flow: if, switch, for, foreach, while, do-while
Jump statements: break, continue, return, throw, goto
Exception handling: try-catch-finally
Resource management: using, lock
Local declarations and assignments

Declaration Parsing

Full support for C# type and member declarations:

Types: classes, structs, interfaces, records, enums, delegates
Members: methods, properties, fields, events, indexers, operators
Modifiers: access modifiers, static, abstract, virtual, override, etc.
Generics: type parameters, constraints, variance

Modern C# Features

Support for recent C# language additions:

Records (C# 9)
File-scoped namespaces (C# 10)
Top-level statements (C# 9)
Pattern matching enhancements
Nullable reference types

Error Handling Strategy

The parser uses a multi-layered error handling approach:

Parse Errors: Detailed information about what went wrong during parsing
Context Propagation: Errors include context about where in the parsing process they occurred
Recovery Mechanisms: Ability to continue parsing after certain types of errors
User-Friendly Messages: Clear, actionable error messages for developers

This design makes the parser robust and helpful for development and debugging. Code generation/compilation is out of scope for now; the parser and analysis crates form the current focus of the toolkit.

Core Parser Components

This document details the fundamental components that make up the BSharp parser infrastructure.

Public Parser API

Parser Struct

The main entry point for all parsing operations:

#[derive(Default)]
pub struct Parser;

// Signatures only; method bodies are elided.
impl Parser {
    pub fn new() -> Self { /* ... */ }
    pub fn parse(&self, input: &str) -> Result<ast::CompilationUnit, String> { /* ... */ }
}

The Parser provides a clean, simple interface that abstracts away the complexity of the underlying parsing implementation.

Error System

ErrorTree (nom-supreme)

BSharp uses nom-supreme's ErrorTree for rich error diagnostics:

pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;

Key features:

Context Stack: Maintains parsing contexts via .context() calls
Position Tracking: Built-in span tracking for error locations
Rich Diagnostics: Tree structure shows complete parse failure path
Integration: Seamless with nom combinators

Error Helpers

Utility functions for enhanced error handling:

Location: src/bsharp_parser/src/helpers/

context(): Adds contextual information to parser errors
bws(): Whitespace-aware wrapper with error context
bdelimited(): Delimited parsing with cut on closing delimiter
cut(): Commits to parse branch, preventing misleading backtracking
Error recovery mechanisms for common parsing scenarios

Pretty Error Formatting

Location: src/bsharp_parser/src/syntax/errors.rs

pub fn format_error_tree(input: &str, error: &ErrorTree<Span<'_>>) -> String;

Produces rustc-like error messages with:

Line and column numbers
Source code context
Caret pointing to error location
Context stack showing parse path

AST Foundation

CompilationUnit

The root node of every parsed C# file:

pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,
    pub using_directives: Vec<UsingDirective>,
    pub global_using_directives: Vec<GlobalUsingDirective>,
    pub declarations: Vec<TopLevelDeclaration>,
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>,
    pub top_level_statements: Vec<Statement>,
}

Represents the complete structure of a C# source file, supporting both traditional and modern C# features.

TopLevelDeclaration

Enum representing all possible top-level declarations:

pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}

Keyword Parsing

Keyword Module Organization

Location: src/bsharp_parser/src/keywords/

Keywords are organized by category in dedicated modules for maintainability and consistency:

src/bsharp_parser/src/keywords/
├── mod.rs                      # Keyword infrastructure
├── access_keywords.rs          # public, private, protected, internal
├── accessor_keywords.rs        # get, set, init, add, remove
├── type_keywords.rs            # class, struct, interface, enum, record
├── modifier_keywords.rs        # static, abstract, virtual, sealed
├── flow_control_keywords.rs    # if, else, switch, case, default
├── iteration_keywords.rs       # for, foreach, while, do
├── expression_keywords.rs      # new, this, base, typeof, sizeof
├── linq_query_keywords.rs      # from, where, select, orderby
└── ...

Keyword Parsing Strategy

Word Boundary Enforcement:

pub fn keyword(kw: &'static str) -> impl Fn(&str) -> BResult<&str, &str>;

The keyword() helper enforces [A-Za-z0-9_] word boundaries to prevent partial matches:

Rejects the keyword "int" when the input is "int32"
Ensures the keyword "class" does not match the identifier "classname"
Consistent across all keyword parsers

Benefits:

Maintainability: Easy to find and update keyword parsers
Consistency: Uniform keyword parsing strategy
Bug Prevention: Avoids partial match issues
Centralization: Single source of truth for keywords
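
The word-boundary rule can be illustrated without nom. The function below is a simplified stand-in, not the real keyword() helper: it takes the input directly and returns an Option instead of a BResult, and its exact shape is an assumption made for this sketch.

```rust
/// Hypothetical stand-in for the keyword() helper: matches `kw` at the start
/// of `input` only when the next character is not in [A-Za-z0-9_].
/// Returns (rest, matched) on success and None otherwise.
fn keyword<'a>(kw: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    let rest = input.strip_prefix(kw)?;
    // A following identifier character means we matched only part of a word.
    let continues_word = rest
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphanumeric() || c == '_');
    if continues_word {
        None
    } else {
        Some((rest, &input[..kw.len()]))
    }
}
```

With this check, "int x" matches the keyword int, while "int32 x" and "classname" are left for the identifier parser.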

Parser Helpers

Context Management

Functions for maintaining parsing context:

pub fn context<I, O, F>(
    ctx: &'static str,
    parser: F
) -> impl FnMut(I) -> BResult<I, O>

Wraps parsers with contextual information that appears in error messages, making debugging much easier.
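
To show how a context wrapper of this kind behaves, here is a minimal standalone sketch. ParseError, PResult, with_context, and open_brace are hypothetical names for illustration; the real helper works with nom-supreme's ErrorTree rather than the simplified error type used here.

```rust
/// Simplified error type for this sketch: a message plus a stack of contexts,
/// innermost first. The real parser uses nom-supreme's ErrorTree instead.
#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    contexts: Vec<&'static str>,
}

type PResult<'a, O> = Result<(&'a str, O), ParseError>;

/// Wraps `parser` so that any error it returns is tagged with `ctx`.
fn with_context<'a, O>(
    ctx: &'static str,
    parser: impl Fn(&'a str) -> PResult<'a, O>,
) -> impl Fn(&'a str) -> PResult<'a, O> {
    move |input| {
        parser(input).map_err(|mut e| {
            e.contexts.push(ctx);
            e
        })
    }
}

/// Toy leaf parser: expects the input to start with '{'.
fn open_brace(input: &str) -> PResult<'_, char> {
    match input.strip_prefix('{') {
        Some(rest) => Ok((rest, '{')),
        None => Err(ParseError {
            message: "expected '{'".to_string(),
            contexts: vec![],
        }),
    }
}
```

Nesting with_context("method declaration", with_context("method body", open_brace)) and feeding it input that does not start with a brace produces an error whose context stack reads "method body" then "method declaration", which is the kind of parse path a formatted error can display.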

Parser Composition

Utilities for combining smaller parsers into larger ones:

Sequencing parsers with error propagation
Optional parsing with fallbacks
Alternative parsing with preference ordering
Repetition parsing with separators

Whitespace and Comment Handling

Consistent handling of whitespace and comments throughout the parser:

Automatic whitespace skipping between tokens
Comment preservation for documentation tools
Preprocessor directive handling

Node Structure Standards

Common Traits

All AST nodes implement standard traits:

Debug: For debugging and logging
PartialEq: For testing and comparison
Clone: For AST manipulation
Serialize/Deserialize: For JSON export/import

Node Organization

AST nodes are organized hierarchically:

nodes/
├── declarations/     # Type and member declarations
├── expressions/      # All expression types
├── statements/       # All statement types
├── types/           # Type system representations
└── ...              # Other language constructs

Identifier Handling

Consistent identifier representation throughout the AST:

pub struct Identifier {
    pub name: String,
    // Additional metadata like source location
}

Type System Integration

Type Representation

The parser builds a complete representation of C# types:

Primitive types (int, string, bool, etc.)
Reference types (classes, interfaces)
Value types (structs, enums)
Generic types with constraints
Array and pointer types
Nullable types

Generic Support

Full support for C# generics:

Type parameters with constraints
Variance annotations (in, out)
Generic method declarations
Complex constraint combinations

Memory Management

Zero-Copy Parsing

Where possible, the parser avoids unnecessary string allocations:

String slices reference original input
Minimal cloning during parsing
Efficient error reporting without excessive allocation
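
As a small illustration of slices that reference the original input, the sketch below tokenizes identifier-like words without copying their text. Ident, identifiers, and borrows_from are hypothetical names and do not correspond to BSharp types; note that the AST's own Identifier stores an owned String.

```rust
/// Hypothetical token type: borrows its text from the source buffer instead
/// of owning a String.
#[derive(Debug, PartialEq)]
struct Ident<'src> {
    text: &'src str,
}

/// Splits `input` into identifier-like words without allocating per token;
/// only the Vec that holds the slices is allocated.
fn identifiers(input: &str) -> Vec<Ident<'_>> {
    input
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|s| !s.is_empty())
        .map(|text| Ident { text })
        .collect()
}

/// Returns true when `part` points into the memory occupied by `whole`.
fn borrows_from(whole: &str, part: &str) -> bool {
    let start = whole.as_ptr() as usize;
    let p = part.as_ptr() as usize;
    p >= start && p + part.len() <= start + whole.len()
}
```

The lifetime parameter ties every token to the source buffer, so the compiler rejects any attempt to keep tokens alive after the input has been dropped.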

AST Ownership

Clear ownership semantics for AST nodes:

Parent nodes own their children
Shared references through navigation traits
No circular references in the AST structure

This foundation provides a robust base for parsing complex C# code while maintaining performance and usability.

AST Structure

The BSharp AST (Abstract Syntax Tree) provides a complete, structured representation of C# source code. This document explains the organization and relationships between different AST node types.

AST Hierarchy

Root Node: CompilationUnit

Every parsed C# file results in a CompilationUnit, which serves as the root of the AST:

pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,        // [assembly: ...] attributes
    pub using_directives: Vec<UsingDirective>,          // using statements
    pub global_using_directives: Vec<GlobalUsingDirective>, // C# 10+ global using
    pub declarations: Vec<TopLevelDeclaration>,         // namespaces, types
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>, // C# 10+
    pub top_level_statements: Vec<Statement>,           // C# 9+ top-level code
}

This structure supports both traditional C# files and modern features like file-scoped namespaces and top-level statements.

Declaration Hierarchy

Top-Level Declarations

Top-level declarations represent constructs that can appear at the file or namespace level:

pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}

Type Declarations

Each type declaration contains comprehensive information about the type:

ClassDeclaration

pub struct ClassDeclaration {
    pub attributes: Vec<AttributeList>,
    pub modifiers: Vec<Modifier>,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub primary_constructor_parameters: Option<Vec<Parameter>>, // C# 12
    pub base_types: Vec<Type>,
    pub body_declarations: Vec<ClassBodyDeclaration>,
    pub documentation: Option<XmlDocumentationComment>,
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

MethodDeclaration

pub struct MethodDeclaration {
    pub modifiers: Vec<Modifier>,
    pub return_type: Type,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub parameters: Vec<Parameter>,
    pub body: Option<Statement>,                 // None for abstract/interface methods
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

Member Declarations

Class body declarations represent all possible class members:

pub enum ClassBodyDeclaration {
    Method(MethodDeclaration),
    Constructor(ConstructorDeclaration),
    Destructor(DestructorDeclaration),
    Property(PropertyDeclaration),
    Field(FieldDeclaration),
    Event(EventDeclaration),
    Indexer(IndexerDeclaration),
    Operator(OperatorDeclaration),
    NestedClass(ClassDeclaration),
    NestedStruct(StructDeclaration),
    NestedInterface(InterfaceDeclaration),
    NestedEnum(EnumDeclaration),
    NestedDelegate(DelegateDeclaration),
}

Expression Hierarchy

Expression Types

The expression system covers all C# expression types with proper precedence:

pub enum Expression {
    // Primary and names
    Literal(Literal),
    Variable(Identifier),

    // Object and member operations
    New(Box<NewExpression>),
    MemberAccess(Box<MemberAccessExpression>),
    Invocation(Box<InvocationExpression>),
    Indexing(Box<IndexingExpression>),
    Index(Box<IndexExpression>),
    Range(Box<RangeExpression>),

    // Lambda and anonymous methods
    Lambda(Box<LambdaExpression>),
    AnonymousMethod(Box<AnonymousMethodExpression>),

    // Keywords
    This,
    Base,

    // Operators
    Unary { op: UnaryOperator, expr: Box<Expression> },
    Binary { left: Box<Expression>, op: BinaryOperator, right: Box<Expression> },
    PostfixUnary { op: UnaryOperator, expr: Box<Expression> },
    Assignment(Box<AssignmentExpression>),

    // Patterns and type ops
    Pattern(Box<Pattern>),
    IsPattern { expression: Box<Expression>, pattern: Box<Pattern> },
    As { expression: Box<Expression>, target_type: Type },
    Cast { expression: Box<Expression>, target_type: Type },

    // Misc language features
    Conditional(Box<ConditionalExpression>),
    Query(Box<QueryExpression>),
    Await(Box<AwaitExpression>),
    Throw(Box<ThrowExpression>),
    Nameof(Box<NameofExpression>),
    Typeof(Box<TypeofExpression>),
    Sizeof(Box<SizeofExpression>),
    Default(Box<DefaultExpression>),
    StackAlloc(Box<StackAllocExpression>),
    Ref(Box<Expression>),
    Checked(Box<CheckedExpression>),
    Unchecked(Box<UncheckedExpression>),

    // With/collection expressions
    With { target: Box<Expression>, initializers: Vec<WithInitializerEntry> },
    Collection(Vec<CollectionElement>),

    // Composite forms
    AnonymousObject(AnonymousObjectCreationExpression),
    Tuple(TupleExpression),
    SwitchExpression(Box<SwitchExpression>),
}

Key helper structs:

pub struct SwitchExpression {
    pub expression: Expression,
    pub arms: Vec<SwitchExpressionArm>,
}

pub enum WithInitializerEntry {
    Property { name: String, value: Expression },
    Indexer { indices: Vec<Expression>, value: Expression },
}

pub enum CollectionElement {
    Expr(Expression),
    Spread(Expression),
}

Literal Types

Comprehensive support for C# literals:

pub enum Literal {
    Boolean(bool),
    Integer(String),          // Preserves original format
    FloatingPoint(String),    // Preserves original format
    Character(char),
    String(String),
    InterpolatedString(InterpolatedStringLiteral),
    Null,
    Default,
}

Statement Hierarchy

Statement Types

Complete coverage of C# statement types:

pub enum Statement {
    // Control flow
    If(IfStatement),
    Switch(SwitchStatement),
    For(ForStatement),
    ForEach(ForEachStatement),
    While(WhileStatement),
    DoWhile(DoWhileStatement),

    // Jump statements
    Break(BreakStatement),
    Continue(ContinueStatement),
    Return(ReturnStatement),
    Throw(ThrowStatement),
    Goto(GotoStatement),

    // Exception handling
    Try(TryStatement),

    // Resource management
    Using(UsingStatement),
    Lock(LockStatement),

    // Declarations and expressions
    LocalVariableDeclaration(LocalVariableDeclaration),
    ExpressionStatement(Expression),
    Block(Vec<Statement>),
    Empty,

    // Modern features
    LocalFunction(LocalFunctionStatement),
}

Control Flow Statements

Complex control flow statements contain nested structures:

IfStatement

pub struct IfStatement {
    pub condition: Expression,
    pub consequence: Box<Statement>,
    pub alternative: Option<Box<Statement>>,
}

TryStatement

pub struct TryStatement {
    pub body: Box<Statement>,
    pub catch_clauses: Vec<CatchClause>,
    pub finally_clause: Option<FinallyClause>,
}

Type System

Type Representation

The type system models all C# type constructs:

pub enum Type {
    // Primitive types
    Primitive(PrimitiveType),

    // Named types
    Named { name: Identifier, type_arguments: Vec<Type> },

    // Array types
    Array { element_type: Box<Type>, rank: usize },

    // Pointer types
    Pointer(Box<Type>),

    // Nullable types
    Nullable(Box<Type>),

    // Generic type parameters
    TypeParameter(Identifier),

    // Tuple types
    Tuple(Vec<Type>),
}

Generic Support

Full support for C# generics:

TypeParameter

pub struct TypeParameter {
    pub attributes: Vec<Attribute>,
    pub variance: Option<Variance>,      // in, out
    pub identifier: Identifier,
}

TypeParameterConstraint

pub enum TypeParameterConstraint {
    TypeConstraint { parameter: Identifier, constraint_type: Type },
    ConstructorConstraint(Identifier),    // new()
    ClassConstraint(Identifier),          // class
    StructConstraint(Identifier),         // struct
    UnmanagedConstraint(Identifier),      // unmanaged
}

AST Metadata

Attributes

Comprehensive attribute support:

pub struct Attribute {
    pub name: Identifier,
    pub arguments: Vec<AttributeArgument>,
}

pub enum AttributeArgument {
    Positional(Expression),
    Named { name: Identifier, value: Expression },
}

Modifiers

All C# modifiers are represented:

pub enum Modifier {
    // Access modifiers
    Public, Private, Protected, Internal, ProtectedInternal, PrivateProtected,

    // Other modifiers
    Static, Abstract, Virtual, Override, Sealed, New,
    Async, Unsafe, Volatile, Readonly, Const,
    Partial, Extern,
}

The AST maintains clear parent-child relationships while providing navigation capabilities through traits:

Ownership: Parent nodes own their children
Navigation: Traits provide methods to traverse and search the AST
Context: Nodes can access their containing context when needed

This structure provides a complete, navigable representation of C# code that supports both analysis and transformation scenarios.
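
The ownership and navigation points above can be sketched on a deliberately tiny AST. Expr, Walk, and variable_names are hypothetical and far smaller than the BSharp node types and traits; the sketch only shows how owned children (Box, Vec) combine with a borrowing traversal trait.

```rust
/// Tiny stand-in AST (hypothetical). Each parent owns its children through
/// Box or Vec, so the structure cannot contain cycles.
enum Expr {
    Literal(i64),
    Variable(String),
    Binary { left: Box<Expr>, op: char, right: Box<Expr> },
    Invocation { callee: Box<Expr>, args: Vec<Expr> },
}

/// Hypothetical navigation trait: visits a node and all of its descendants
/// in pre-order, borrowing the tree rather than taking ownership.
trait Walk {
    fn walk<'a>(&'a self, visit: &mut dyn FnMut(&'a Expr));
}

impl Walk for Expr {
    fn walk<'a>(&'a self, visit: &mut dyn FnMut(&'a Expr)) {
        visit(self);
        match self {
            Expr::Literal(_) | Expr::Variable(_) => {}
            Expr::Binary { left, right, .. } => {
                left.walk(visit);
                right.walk(visit);
            }
            Expr::Invocation { callee, args } => {
                callee.walk(visit);
                for arg in args {
                    arg.walk(visit);
                }
            }
        }
    }
}

/// Collects the names of all variables referenced in an expression.
fn variable_names(root: &Expr) -> Vec<&str> {
    let mut names = Vec::new();
    root.walk(&mut |e| {
        if let Expr::Variable(name) = e {
            names.push(name.as_str());
        }
    });
    names
}
```

Because the traversal only borrows, several analyses can walk the same tree in turn, and the returned names stay valid for as long as the tree itself.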

Error Handling

BSharp implements a comprehensive error handling system that provides detailed context information for debugging parse failures.

Error Types

The parser uses ErrorTree from nom-supreme for structured error information:

pub type BResult<I, O> = nom::IResult<I, O, ErrorTree<I>>;

ErrorTree Structure

The ErrorTree type provides:

Context Stack: Hierarchical parsing context via .context() calls
Location: Span tracking for error positions
Error Tree: Complete parse failure path
Rich Diagnostics: Detailed error information for debugging

Error Recovery

The parser implements several error recovery strategies:

1. Malformed Syntax Recovery

When encountering malformed syntax, the parser attempts to skip to recovery points:

Semicolons (;)
Closing braces (})
End of input

1.a Declaration Error Recovery (Type Member Top-Level)

For type declarations (classes, structs, records, interfaces), malformed members are recovered using a lightweight, scope-aware helper:

Helper: skip_to_member_boundary_top_level()
Location: src/bsharp_parser/src/expressions/declarations/type_declaration_helpers.rs

Contract:

- Only use from within a type body when a member parser fails.
- Stops at the next safe boundary at the top level of the current type:
  - Consumes a top-level ; and returns the slice after it.
  - Or stops at a top-level } without consuming it (so the caller can close the current body cleanly).
  - Returns an empty slice at EOF.
- Depth-tracks (), [], {}, and a heuristic <> to avoid stopping inside expressions, attribute arguments, or generic argument lists.
- Ignores delimiter characters that appear inside strings, chars, and comments.

Limitations:

Angle-bracket tracking is heuristic and does not fully disambiguate generics from shift operators.
Verbatim/interpolated strings are not fully lexed here; this helper is intended for robust, not perfect, recovery.

Usage example (simplified):

match member_parser(cur) {
    Ok((rest, member)) => { members.push(member); cur = rest; }
    Err(_) => {
        let next = skip_to_member_boundary_top_level(cur);
        if next.is_empty() || next == cur { break; }
        cur = next;
    }
}
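
For readers who want to see the contract in code, here is a reduced sketch of a scope-aware skip. It is not the real skip_to_member_boundary_top_level(): the name skip_to_boundary is hypothetical, angle brackets are not tracked at all, and only regular string literals, char literals, and // and /* */ comments are recognized.

```rust
/// Simplified sketch of a scope-aware skip (assumption: plain &str input).
/// Consumes a top-level ';', stops before a top-level '}', and returns ""
/// when the end of input is reached first.
fn skip_to_boundary(input: &str) -> &str {
    let bytes = input.as_bytes();
    let mut depth = 0usize;
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // Skip a string or char literal, honoring backslash escapes.
            q @ (b'"' | b'\'') => {
                i += 1;
                while i < bytes.len() && bytes[i] != q {
                    if bytes[i] == b'\\' {
                        i += 1;
                    }
                    i += 1;
                }
            }
            // Skip a line comment up to (not including) the newline.
            b'/' if bytes.get(i + 1) == Some(&b'/') => {
                while i < bytes.len() && bytes[i] != b'\n' {
                    i += 1;
                }
                continue;
            }
            // Skip a block comment including its closing */.
            b'/' if bytes.get(i + 1) == Some(&b'*') => {
                i += 2;
                while i + 1 < bytes.len() && !(bytes[i] == b'*' && bytes[i + 1] == b'/') {
                    i += 1;
                }
                i += 1;
            }
            b'(' | b'[' | b'{' => depth += 1,
            b')' | b']' if depth > 0 => depth -= 1,
            b'}' if depth > 0 => depth -= 1,
            // Top-level '}' closes the type body: stop without consuming it.
            b'}' => return &input[i..],
            // Top-level ';' ends the broken member: consume it.
            b';' if depth == 0 => return &input[i + 1..],
            _ => {}
        }
        i += 1;
    }
    ""
}
```

Working on bytes is safe here because every delimiter is ASCII, so each returned slice starts on a character boundary. A semicolon inside parentheses, a string, or a comment is never treated as a boundary.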

1.b Namespace Body: Using-Directives Before Members

Inside a block-scoped namespace body, using directives are accepted before type and nested-namespace members.

Implementation: parse_namespace_declaration() scans for using immediately after the opening { and collects all consecutive directives before parsing members.
This ensures inputs like the following are parsed deterministically without interleaving usings with members:

namespace Outer {
    using System;
    namespace Inner {
        using System.Collections;
        class MyClass {}
    }
}

Contract and limitations:

Only leading using directives at the current namespace body level are collected.
Interleaving using directives among members is not supported yet (matches common style and avoids ambiguous recovery).

1.c File-Scoped Namespace

When parsing a file-scoped namespace, the parser also skips preprocessor directives following the namespace line before parsing members, mirroring the block-scoped behavior.

Preprocessor Directives and Trivia

Preprocessor directives (e.g., #pragma, #line) are treated as structured trivia, not AST declarations:

Parser entrypoints (e.g., parse_csharp_source()) skip directive lines anywhere they can appear at the compilation-unit level.
parse_preprocessor_directive() consumes the entire directive line including an optional trailing newline.
Current status: skipping directives inside type and namespace bodies is planned but not yet integrated; the corresponding tests are tracked and temporarily ignored until it is.

Example:

#pragma warning disable CS0168
namespace N {
    // class and members...
}

The directive is skipped and not present as a namespace member.
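
A minimal sketch of this skipping behavior, assuming a plain &str input: skip_directive_lines is a hypothetical helper, not parse_preprocessor_directive(), and it does not validate directive names or skip blank lines between directives.

```rust
/// Hypothetical stand-in for directive skipping: drops leading lines whose
/// first non-blank character is '#', including the trailing newline when
/// one is present. Input that does not start with a directive is returned
/// unchanged.
fn skip_directive_lines(mut input: &str) -> &str {
    loop {
        let trimmed = input.trim_start_matches(|c: char| c == ' ' || c == '\t');
        if !trimmed.starts_with('#') {
            return input;
        }
        // Consume the whole directive line; the newline is optional at EOF.
        input = match trimmed.find('\n') {
            Some(nl) => &trimmed[nl + 1..],
            None => "",
        };
    }
}
```

Applied to the example above, the function returns the text starting at "namespace N {", so the declaration parser never sees the #pragma line.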

2. Context-Aware Errors

Errors include contextual information about the parsing context:

context("method declaration", parse_method_body)(input.into())

This provides clear error messages like "expected method body in method declaration context".

Helper Location: src/bsharp_parser/src/helpers/

3. Graceful Degradation

The parser continues parsing even after encountering errors, collecting multiple errors to provide comprehensive feedback.

Error Reporting

Errors are reported with:

Line and column numbers
Surrounding context
Suggestions for fixes
Parser state information

Common Error Scenarios

Syntax Errors

Missing semicolons
Unmatched braces
Invalid identifiers

Type Errors

Unknown type references
Generic constraint violations
Invalid type parameter usage

Declaration Errors

Conflicting modifiers
Missing required elements
Invalid access levels

Debugging Tips

Use verbose error output to get detailed parser state
Check recovery points when errors cascade
Validate input syntax with simpler test cases first
Use parser context to understand where parsing failed

Wrapper Expression Variants

For clarity, several operations are modeled as distinct expression variants in the AST:

New(NewExpression) for object creation
MemberAccess(MemberAccessExpression) for obj.Member
Invocation(InvocationExpression) for calls expr(args)
Indexing(IndexingExpression) and Index(IndexExpression)
Range(RangeExpression) for start..end
With { target, initializers } for record-like with-expressions
Collection(Vec<CollectionElement>) for collection expressions

Expression Parsing

BSharp implements a complete expression parser that handles all C# expression types with proper operator precedence and associativity.

Expression Hierarchy

The expression parser follows C#'s operator precedence rules, listed here from highest (binds tightest) to lowest:

Primary Expressions (x, x.y, x[y], x(), etc.)
Unary Expressions (+x, -x, !x, ~x, ++x, --x)
Multiplicative (*, /, %)
Additive (+, -)
Shift (<<, >>)
Relational (<, >, <=, >=, is, as)
Equality (==, !=)
Logical AND (&)
Logical XOR (^)
Logical OR (|)
Conditional AND (&&)
Conditional OR (||)
Null Coalescing (??)
Conditional (?:)
Assignment (=, +=, -=, etc.)
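
One common way to implement such a table is precedence climbing. The sketch below is an illustration under stated assumptions, not BSharp's expression parser: it covers binary operators only (no unary, primary, or ?: forms), expects whitespace-separated tokens, and the names precedence, parse_expr, and group are hypothetical.

```rust
/// Binding power for a subset of C# binary operators; higher binds tighter.
/// The bool marks right-associative operators.
fn precedence(op: &str) -> Option<(u8, bool)> {
    Some(match op {
        "*" | "/" | "%" => (12, false),
        "+" | "-" => (11, false),
        "<<" | ">>" => (10, false),
        "<" | ">" | "<=" | ">=" => (9, false),
        "==" | "!=" => (8, false),
        "&" => (7, false),
        "^" => (6, false),
        "|" => (5, false),
        "&&" => (4, false),
        "||" => (3, false),
        "??" => (2, true),
        "=" | "+=" | "-=" => (1, true),
        _ => return None,
    })
}

/// Precedence climbing over tokens. Operands are single tokens; the result
/// is a fully parenthesized string that shows the grouping.
/// Panics on malformed input; a real parser would return an error instead.
fn parse_expr(tokens: &[&str], pos: &mut usize, min_prec: u8) -> String {
    let mut lhs = tokens[*pos].to_string();
    *pos += 1;
    while *pos < tokens.len() {
        let op = tokens[*pos];
        let (prec, right_assoc) = match precedence(op) {
            Some(p) if p.0 >= min_prec => p,
            _ => break,
        };
        *pos += 1;
        // Left-associative operators require a strictly tighter right side.
        let next_min = if right_assoc { prec } else { prec + 1 };
        let rhs = parse_expr(tokens, pos, next_min);
        lhs = format!("({lhs} {op} {rhs})");
    }
    lhs
}

/// Convenience wrapper: tokenizes on whitespace and groups the expression.
fn group(src: &str) -> String {
    let tokens: Vec<&str> = src.split_whitespace().collect();
    parse_expr(&tokens, &mut 0, 1)
}
```

For example, group("a + b * c") groups the multiplication first, and group("x = y = z") nests to the right because assignment is right-associative.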

Expression Types

Primary Expressions

Literals

Numeric: 42, 3.14, 0x1A
String: "hello", @"verbatim", $"interpolated {value}"
Character: 'a', '\n'
Boolean: true, false
Null: null

Identifiers and Member Access

variable          // Simple identifier
obj.property      // Member access
obj.method()      // Method invocation
obj[index]        // Indexer access

Note: In the AST, simple identifiers are represented by the Expression::Variable(Identifier) variant. Member access, invocation, and indexing are represented by dedicated wrapper variants (MemberAccess, Invocation, Indexing).

Object Creation

new MyClass()                    // Constructor
new MyClass { Prop = value }     // Object initializer
new[] { 1, 2, 3 }                // Implicitly typed array creation
new { Name = "John", Age = 30 }  // Anonymous object

Lambda Expressions

The parser supports various lambda syntax forms:

x => x * 2                      // Single parameter
(x, y) => x + y                 // Multiple parameters
() => DoSomething()             // No parameters
(int x, string y) => Process(x, y)  // Typed parameters
x => { return x * 2; }          // Block body
async x => await ProcessAsync(x) // Async lambda

Query Expressions (LINQ)

Complete LINQ query syntax support:

from item in collection
where item.IsValid
orderby item.Name
select item.Value

Supported clauses:

from - Data source
where - Filtering
select - Projection
orderby - Sorting
group by - Grouping
join - Joining
let - Variable introduction
into - Query continuation

Pattern Expressions

Modern C# pattern matching:

obj is int value           // Declaration pattern
obj is not null           // Negation pattern
obj is > 0 and < 100     // Relational patterns
obj is var x             // Var pattern

Switch Expressions

value switch
{
    1 => "one",
    2 => "two",
    _ => "other"
}

Operator Precedence Implementation

The expression entrypoint is spanned-first. Callers can unwrap the Spanned<Expression> when they do not need spans:

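use bsharp_parser::parser::expressions::primary_expression_parser::parse_expression_spanned;
use bsharp_syntax::span::Span;

fn main() {
    // Any C# expression source; this input is only an example.
    let input = "a + b * c";

    // Parse to a Spanned<Expression>, then keep only the inner node.
    let result = parse_expression_spanned(Span::new(input))
        .map(|(rest, s)| (rest, s.node));
}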

Error Handling in Expressions

The expression parser provides detailed error messages:

Operator precedence conflicts
Missing operands
Invalid syntax combinations
Type compatibility issues

Advanced Features

Null-Conditional Operators

obj?.Property        // Null-conditional member access
obj?[index]         // Null-conditional element access
obj?.Method()       // Null-conditional invocation

Throw Expressions

value ?? throw new ArgumentNullException()

Range and Index Expressions

array[^1]           // Index from end
array[1..5]         // Range
array[..^1]         // Range to index from end

With Expressions (Records)

person with { Name = "Updated" }

The expression parser is designed to be extensible, allowing for easy addition of new expression types as the C# language evolves.

Statement Parsing

BSharp provides comprehensive parsing for all C# statement types, from simple expressions to complex control flow constructs.

Statement Categories

1. Declaration Statements

Local Variable Declarations

int x = 5;
var name = "John";
const double PI = 3.14159;

Local Function Declarations

void LocalFunction(int parameter)
{
    // function body
}

T GenericLocalFunction<T>(T value) where T : class
{
    return value;
}

2. Expression Statements

Any expression followed by a semicolon:

x++;                    // Increment
Method();              // Method call
obj.Property = value;  // Assignment

3. Control Flow Statements

Conditional Statements

If Statements

if (condition)
    statement;

if (condition)
{
    // block
}
else if (otherCondition)
{
    // else if block
}
else
{
    // else block
}

Switch Statements

switch (expression)
{
    case constant1:
        statements;
        break;
    case constant2 when condition:
        statements;
        goto case constant1;
    default:
        statements;
        break;
}

Loop Statements

For Loops

for (int i = 0; i < 10; i++)
{
    // loop body
}

for (;;)  // infinite loop
{
    // body
}

Foreach Loops

foreach (var item in collection)
{
    // process item
}

foreach ((string key, int value) in dictionary)
{
    // deconstruction in foreach
}

While Loops

while (condition)
{
    // loop body
}

Do-While Loops

do
{
    // loop body
} while (condition);

Jump Statements

break;              // Break from loop/switch
continue;           // Continue to next iteration
return;             // Return from method
return value;       // Return with value
goto label;         // Jump to label
goto case 5;        // Jump to switch case
goto default;       // Jump to switch default

4. Exception Handling

Try-Catch-Finally

try
{
    // risky code
}
catch (SpecificException ex) when (ex.Code == 123)
{
    // specific exception handling
}
catch (Exception ex)
{
    // general exception handling
}
finally
{
    // cleanup code
}

Throw Statements

throw;                           // Rethrow current exception
throw new InvalidOperationException();
throw new CustomException("message");

5. Resource Management

Using Statements

using (var resource = new DisposableResource())
{
    // use resource
}

using var resource = new DisposableResource();
// resource disposed at end of scope

Lock Statements

lock (syncObject)
{
    // synchronized code
}

Fixed Statements

unsafe
{
    fixed (byte* ptr = array)
    {
        // work with fixed pointer
    }
}

6. Special Statements

Yield Statements

yield return value;     // Return value in iterator
yield break;           // End iterator

Checked/Unchecked Statements

checked
{
    // arithmetic overflow checking enabled
}

unchecked
{
    // arithmetic overflow checking disabled
}

Unsafe Statements

unsafe
{
    // unsafe code block
}

Statement Parsing Implementation

Use the spanned entrypoint and unwrap when spans are not needed:

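use bsharp_parser::parser::statement_parser::parse_statement_ws_spanned;
use bsharp_syntax::span::Span;

fn main() {
    // Any C# statement source; this input is only an example.
    let input = "int x = 5;";

    // Parse to a spanned statement, then keep only the inner node.
    let result = parse_statement_ws_spanned(Span::new(input))
        .map(|(rest, s)| (rest, s.node));
}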

Block Statements

Block statements group multiple statements:

{
    int x = 5;
    Console.WriteLine(x);
    if (x > 0)
    {
        Console.WriteLine("Positive");
    }
}

Error Recovery

The statement parser implements robust error recovery:

Statement-level recovery: Skip to next statement boundary (semicolon or brace)
Block-level recovery: Skip to matching brace
Context preservation: Maintain parsing context across errors
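The first two recovery strategies can be sketched over a plain string. This is a deliberately naive illustration, not BSharp's recovery code: the function names are hypothetical, and the sketch ignores braces and semicolons that appear inside string literals or comments.

```rust
/// Statement-level recovery: return the byte index just past the next
/// statement boundary (`;` or `}`), or the input length if none is found.
/// `from` must be a valid char boundary within `src`.
fn skip_to_statement_boundary(src: &str, from: usize) -> usize {
    src[from..]
        .find(|c: char| c == ';' || c == '}')
        .map(|i| from + i + 1)
        .unwrap_or(src.len())
}

/// Block-level recovery: `open` is the byte index of a `{`. Returns the
/// index just past its matching `}`, or None if the block is unterminated.
fn skip_to_matching_brace(src: &str, open: usize) -> Option<usize> {
    let mut depth = 0usize;
    for (i, c) in src[open..].char_indices() {
        match c {
            '{' => depth += 1,
            '}' => {
                // A `}` before any `{` means `open` did not point at a block.
                depth = depth.checked_sub(1)?;
                if depth == 0 {
                    return Some(open + i + 1);
                }
            }
            _ => {}
        }
    }
    None
}
```

For the input int x = ; y = 2; the statement-level helper returns 9, so parsing can resume at y = 2;. For { a; { b; } c; } d; the block-level helper returns Some(16), the position just after the outer closing brace.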

Statement Attributes

Attributes can precede declaration statements such as local functions (C# 9+):

[Obsolete("Use NewMethod instead")]
void OldMethod() { }

[ConditionalAttribute("DEBUG")]
static void DebugMethod() { }

Top-Level Statements

Support for C# 9+ top-level statements:

// Program.cs
using System;

Console.WriteLine("Hello World!");
return 0;

The statement parser is designed to handle the full complexity of C# control flow while providing clear error messages and robust error recovery.

Declaration Parsing

BSharp implements comprehensive parsing for all C# declaration types, from simple variables to complex generic types with constraints.

Declaration Categories

1. Namespace Declarations

Traditional Namespace

namespace MyCompany.MyProject
{
    // namespace members
}

File-Scoped Namespace (C# 10+)

namespace MyCompany.MyProject;

// All following declarations belong to this namespace

Nested Namespaces

namespace Outer
{
    namespace Inner
    {
        // nested namespace content
    }
}

2. Type Declarations

Class Declarations

public class MyClass : BaseClass, IInterface1, IInterface2
{
    // class members
}

public abstract class AbstractClass
{
    public abstract void AbstractMethod();
}

public sealed class SealedClass
{
    // cannot be inherited
}

Interface Declarations

public interface IMyInterface : IBaseInterface
{
    void Method();
    int Property { get; set; }
    event Action SomeEvent;
}

public interface IGeneric<T> where T : class
{
    T GenericMethod<U>(U parameter) where U : struct;
}

Struct Declarations

public struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
    
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public readonly struct ReadOnlyPoint
{
    public readonly int X;
    public readonly int Y;
    
    public ReadOnlyPoint(int x, int y)
    {
        X = x;
        Y = y;
    }
}

Record Declarations

public record Person(string FirstName, string LastName);

public record class Employee(string FirstName, string LastName, string Department)
    : Person(FirstName, LastName);

public record struct Point(int X, int Y);

Enum Declarations

public enum Color
{
    Red,
    Green,
    Blue
}

[Flags]
public enum FileAccess : byte
{
    None = 0,
    Read = 1,
    Write = 2,
    Execute = 4,
    All = Read | Write | Execute
}

Delegate Declarations

public delegate void EventHandler(object sender, EventArgs e);
public delegate T GenericDelegate<T, U>(U parameter) where T : class;

3. Member Declarations

Field Declarations

private int field;
public readonly string ReadOnlyField;
public const double PI = 3.14159;
private static readonly List<string> StaticField = new();

Property Declarations

// Auto-implemented properties
public string Name { get; set; }
public int Age { get; private set; }
public bool IsValid { get; init; }

// Properties with backing fields
private string _description;
public string Description
{
    get => _description;
    set => _description = value?.Trim();
}

// Expression-bodied properties
public string FullName => $"{FirstName} {LastName}";

// Indexer properties
public string this[int index]
{
    get => items[index];
    set => items[index] = value;
}

Method Declarations

public void VoidMethod() { }
public int MethodWithReturnType() => 42;
public static T GenericMethod<T>(T parameter) where T : new() => new T();

// Async methods
public async Task<string> AsyncMethod()
{
    await Task.Delay(1000);
    return "result";
}

// Extension methods
public static class Extensions
{
    public static bool IsEmpty(this string str) => string.IsNullOrEmpty(str);
}

Constructor Declarations

public class MyClass
{
    public MyClass() { }                    // Default constructor
    public MyClass(string name) : this()   // Constructor chaining
    {
        Name = name;
    }
    
    static MyClass()                        // Static constructor
    {
        // Static initialization
    }
}

Destructor Declarations

public class Resource
{
    ~Resource()
    {
        // Cleanup code
    }
}

Note: In the AST, DestructorDeclaration.body is Option<Statement>:

// Some(Block(...)) for `{ ... }`, None for extern (i.e., `;` only)
pub struct DestructorDeclaration {
    pub name: Identifier,
    pub body: Option<Statement>,
}

Event Declarations

public event Action<string> SomethingHappened;

public event EventHandler<CustomEventArgs> CustomEvent
{
    add { customEvent += value; }
    remove { customEvent -= value; }
}

Operator Declarations

public static Point operator +(Point a, Point b)
{
    return new Point(a.X + b.X, a.Y + b.Y);
}

public static implicit operator string(Point p)
{
    return $"({p.X}, {p.Y})";
}

4. Generic Constraints

Type Parameter Constraints

public class Container<T> where T : class, IDisposable, new()
{
    // T must be a reference type, implement IDisposable, and have a parameterless constructor
}

public void Method<T, U>()
    where T : class
    where U : struct, IComparable<U>
{
    // Multiple constraint clauses
}

AST mapping for constraints:

// On type declarations (class/struct/interface/record)
pub struct ClassDeclaration {
    // ... other fields omitted
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

// On methods
pub struct MethodDeclaration {
    // ... other fields omitted
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

5. Modifiers and Attributes

Access Modifiers

public - Accessible everywhere
private - Accessible only within the containing type
protected - Accessible within class and derived classes
internal - Accessible within the same assembly
protected internal - Accessible within assembly or derived classes
private protected - Accessible within derived classes in the same assembly

Other Modifiers

static - Belongs to the type rather than instance
abstract - Must be overridden in derived classes
virtual - Can be overridden in derived classes
override - Overrides a virtual/abstract member
sealed - Cannot be overridden further
readonly - Can only be assigned at its declaration or in a constructor
const - Compile-time constant
async - Asynchronous method
unsafe - Contains unsafe code
extern - Implemented externally

Attributes

[Obsolete("Use NewMethod instead")]
public void OldMethod() { }

[DllImport("kernel32.dll")]
public static extern bool SetConsoleTitle(string title);

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class CustomAttribute : Attribute
{
    public string Description { get; set; }
}

6. Using Directives

using System;                           // Namespace using
using System.Collections.Generic;
using static System.Math;               // Static using
using Project = MyCompany.MyProject;    // Alias directive
global using System.Text;              // Global using (C# 10+)

Note: global using directives are stored at the compilation unit level in CompilationUnit.global_using_directives.

Declaration Parsing Implementation

The declaration parser uses a multi-stage approach:

Modifier Parsing: Parse access modifiers and other keywords
Declaration Type Detection: Determine what kind of declaration
Specific Parser Dispatch: Route to specialized parser
Member Collection: Gather all declaration components

fn parse_type_declaration(input: &str) -> BResult<&str, TypeDeclaration> {
    let (input, attributes) = many0(parse_attribute)(input.into())?;
    let (input, modifiers) = parse_modifiers(input.into())?;
    let (input, declaration) = alt((
        parse_class_declaration,
        parse_interface_declaration,
        parse_struct_declaration,
        parse_enum_declaration,
        parse_delegate_declaration,
        parse_record_declaration,
    ))(input.into())?;

    Ok((input, TypeDeclaration {
        attributes,
        modifiers,
        declaration,
    }))
}

Error Handling

The declaration parser provides comprehensive error reporting:

Modifier conflicts: Detecting incompatible modifier combinations
Constraint validation: Ensuring generic constraints are valid
Accessibility consistency: Verifying access level consistency
Syntax validation: Catching malformed declarations

Recovery for Malformed Members

When a member inside a type body fails to parse, the parser uses a scoped recovery strategy to skip to the next safe boundary without crossing the enclosing type's closing brace. See the dedicated section in Error Handling for details on skip_to_member_boundary_top_level() and its contract:

docs: docs/parser/error-handling.md (Declaration Error Recovery subsection)
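The idea of scoped recovery can be sketched as follows. This is not the skip_to_member_boundary_top_level() function itself, whose exact contract is described in the error-handling documentation; the name skip_malformed_member, the string-based input, and the choice of boundaries are assumptions for illustration. The sketch also ignores braces inside string literals and comments.

```rust
/// Starting at `from` inside a type body, return the byte index where
/// member parsing may resume:
/// - just past a `;` seen at nesting depth 0,
/// - just past the `}` that closes the malformed member's own block, or
/// - AT (not past) a `}` seen at depth 0, because that brace belongs to
///   the enclosing type and must stay available to the type parser.
fn skip_malformed_member(body: &str, from: usize) -> usize {
    let mut depth = 0usize;
    for (i, c) in body[from..].char_indices() {
        match c {
            '{' => depth += 1,
            // Enclosing type's closing brace: stop without consuming it.
            '}' if depth == 0 => return from + i,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    return from + i + 1; // end of the member's block
                }
            }
            ';' if depth == 0 => return from + i + 1,
            _ => {}
        }
    }
    body.len()
}
```

For the body text " int = 5; void M() { } }" the sketch returns 9, right after the broken field. For "garbage }" it returns 8 and leaves the closing brace unconsumed, so the enclosing type can still terminate normally.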

XML Documentation

The parser handles XML documentation comments:

/// <summary>
/// Calculates the area of a rectangle.
/// </summary>
/// <param name="width">The width of the rectangle.</param>
/// <param name="height">The height of the rectangle.</param>
/// <returns>The area of the rectangle.</returns>
public double CalculateArea(double width, double height)
{
    return width * height;
}

The declaration parser is designed to handle the full complexity of C# type system while maintaining performance and providing detailed error diagnostics.

Type System

BSharp implements a comprehensive type system that accurately represents all C# type constructs, from primitive types to complex generic types with constraints.

Type Categories

1. Primitive Types

Built-in Value Types

bool        // Boolean type
byte        // 8-bit unsigned integer
sbyte       // 8-bit signed integer
short       // 16-bit signed integer
ushort      // 16-bit unsigned integer
int         // 32-bit signed integer
uint        // 32-bit unsigned integer
long        // 64-bit signed integer
ulong       // 64-bit unsigned integer
char        // 16-bit Unicode character
float       // 32-bit floating point
double      // 64-bit floating point
decimal     // 128-bit decimal

Special Types

object      // Base type of all types
string      // Immutable string type
void        // Absence of type (method returns)
dynamic     // Dynamic type
var         // Implicitly typed variable

2. Reference Types

Class Types

MyClass                 // Simple class reference
System.Collections.Generic.List<int>  // Generic class

Interface Types

IEnumerable<T>         // Generic interface
IDisposable            // Non-generic interface

Array Types

int[]                  // Single-dimensional array
int[,]                 // Multi-dimensional array
int[][]                // Jagged array
int[,,]                // Three-dimensional array

Delegate Types

Action                 // Parameterless action
Action<int>            // Action with parameter
Func<int, string>      // Function with return type
EventHandler<T>        // Event handler

3. Nullable Types

Nullable Value Types

int?                   // Nullable integer
DateTime?              // Nullable DateTime
bool?                  // Nullable boolean

Nullable Reference Types (C# 8+)

string?                // Nullable string
List<int>?             // Nullable list
MyClass?               // Nullable custom class

4. Generic Types

Type Parameters

T                      // Simple type parameter
TKey, TValue           // Multiple type parameters

Constructed Generic Types

List<int>              // Generic list of integers
Dictionary<string, object>  // Generic dictionary

Generic Constraints

T where T : class                    // Reference type constraint
T where T : struct                   // Value type constraint
T where T : new()                    // Constructor constraint
T where T : BaseClass                // Base class constraint
T where T : IInterface               // Interface constraint
T where T : class, IDisposable, new() // Multiple constraints

5. Tuple Types

Named Tuples

(int x, int y)         // Named tuple elements
(string name, int age) // Different element types

Unnamed Tuples

(int, string)          // Unnamed tuple elements

Nested Tuples

(int, (string, bool))  // Nested tuple structure

6. Pointer Types (Unsafe Context)

int*                   // Pointer to integer
char**                 // Pointer to pointer to char
void*                  // Void pointer

7. Function Pointer Types (C# 9+)

delegate*<int, string>              // Function pointer
delegate* managed<int, void>        // Managed function pointer
delegate* unmanaged<int, void>      // Unmanaged function pointer

Type Syntax Parsing

Basic Type Parsing

The type parser handles various syntactic forms:

fn parse_type(input: &str) -> BResult<&str, Type> {
    // alt tries each parser in order and returns the first success
    alt((
        parse_tuple_type,
        parse_function_pointer_type,
        parse_named_type,
        parse_primitive_type,
    ))(input.into())
}

Array Type Parsing

Array types have specific syntax rules:

int[]                  // T[]
int[,]                 // T[,]
int[,,]                // T[,,]
int[][]                // T[][] (jagged)

Generic Type Parsing

Generic types require careful parsing of type arguments:

List<int>              // Simple generic
Dictionary<string, List<int>>  // Nested generics
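The main difficulty with nested generics is that each argument may itself carry an argument list, and that a closing >> must be treated as two separate closers. A minimal recursive-descent sketch, independent of BSharp's nom-based parser, is shown below; the TypeRef struct and the function names are hypothetical.

```rust
/// Simplified type reference: a (possibly dotted) name plus type arguments.
#[derive(Debug, PartialEq)]
struct TypeRef {
    name: String,
    args: Vec<TypeRef>,
}

fn skip_ws(src: &[char], mut pos: usize) -> usize {
    while pos < src.len() && src[pos].is_whitespace() {
        pos += 1;
    }
    pos
}

/// Parse `Name` or `Name<Arg, Arg, ...>` starting at `pos`.
/// Returns the parsed type and the position just after it.
fn parse_type_ref(src: &[char], mut pos: usize) -> Option<(TypeRef, usize)> {
    pos = skip_ws(src, pos);
    let start = pos;
    while pos < src.len() && (src[pos].is_alphanumeric() || src[pos] == '_' || src[pos] == '.') {
        pos += 1;
    }
    if pos == start {
        return None; // expected a type name
    }
    let name: String = src[start..pos].iter().collect();

    let mut args = Vec::new();
    pos = skip_ws(src, pos);
    if pos < src.len() && src[pos] == '<' {
        pos += 1;
        loop {
            // Each argument is itself a full type, so recurse.
            let (arg, next) = parse_type_ref(src, pos)?;
            args.push(arg);
            pos = skip_ws(src, next);
            match src.get(pos) {
                Some(',') => pos += 1,
                // Working per character means `>>` closes two lists in turn.
                Some('>') => {
                    pos += 1;
                    break;
                }
                _ => return None, // unterminated argument list
            }
        }
    }
    Some((TypeRef { name, args }, pos))
}

/// Parse a complete type reference; trailing input is an error.
fn parse(src: &str) -> Option<TypeRef> {
    let chars: Vec<char> = src.chars().collect();
    let (ty, pos) = parse_type_ref(&chars, 0)?;
    if skip_ws(&chars, pos) == chars.len() { Some(ty) } else { None }
}
```

Calling parse("Dictionary<string, List<int>>") returns a Dictionary node with two arguments, the second of which is a List node holding int. A lexer that emits >> as a single shift token would need to split it in this position, which is one reason generic argument lists need care.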

Nullable Type Parsing

Nullable types use special syntax:

int?                   // Nullable<int>
string?                // string with nullable annotation

Type Resolution

Qualified Names

Types can be fully qualified:

System.Collections.Generic.List<int>
MyNamespace.MyClass

Type Aliases

Using directives create type aliases:

using StringList = System.Collections.Generic.List<string>;

Global Type References

Global namespace references:

global::System.String  // Fully qualified from global namespace

Type Constraints

Constraint Types

Reference Type: where T : class
Value Type: where T : struct
Constructor: where T : new()
Base Class: where T : BaseClass
Interface: where T : IInterface
Type Parameter: where T : U

Constraint Combinations

Multiple constraints can be combined:

where T : class, IDisposable, new()

Constraint Validation

The parser validates constraint combinations:

class and struct are mutually exclusive
new() constraint must come last
Base class constraint must come before interface constraints
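The three ordering rules above can be expressed as a small check over one constraint clause. This sketch is illustrative only: the Constraint enum and validate function are hypothetical, and it assumes each constraint has already been classified as a base class or an interface, which a purely syntactic pass cannot always decide on its own.

```rust
/// One constraint inside a `where T : ...` clause (hypothetical model).
#[derive(Debug, Clone, PartialEq)]
enum Constraint {
    Class,             // where T : class
    Struct,            // where T : struct
    BaseClass(String), // where T : SomeBase
    Interface(String), // where T : ISomething
    New,               // where T : new()
}

/// Check the three rules listed above for a single constraint clause.
fn validate(constraints: &[Constraint]) -> Result<(), String> {
    // Rule 1: `class` and `struct` are mutually exclusive.
    let has_class = constraints.contains(&Constraint::Class);
    let has_struct = constraints.contains(&Constraint::Struct);
    if has_class && has_struct {
        return Err("`class` and `struct` are mutually exclusive".to_string());
    }

    // Rule 2: `new()` must come last.
    if let Some(pos) = constraints.iter().position(|c| *c == Constraint::New) {
        if pos != constraints.len() - 1 {
            return Err("`new()` must be the last constraint".to_string());
        }
    }

    // Rule 3: a base class must precede every interface constraint.
    let first_interface = constraints
        .iter()
        .position(|c| matches!(c, Constraint::Interface(_)));
    let last_base = constraints
        .iter()
        .rposition(|c| matches!(c, Constraint::BaseClass(_)));
    if let (Some(i), Some(b)) = (first_interface, last_base) {
        if b > i {
            return Err("base class must come before interfaces".to_string());
        }
    }
    Ok(())
}
```

Under these assumptions, the clause class, IDisposable, new() validates successfully, while class, struct or new(), class are rejected with a message naming the violated rule.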

Type Variance

Covariance and Contravariance

interface ICovariant<out T> { }     // Covariant
interface IContravariant<in T> { }  // Contravariant
interface IInvariant<T> { }         // Invariant

Advanced Type Features

Record Types

record Person(string Name, int Age);
record class Employee(string Name, int Age, string Department);
record struct Point(int X, int Y);

Pattern Types

Types used in pattern matching:

obj is string str          // Declaration pattern
obj is not null           // Negation pattern
obj is > 0 and < 100     // Relational pattern

Type System Implementation

The type system is implemented with a hierarchical structure:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum Type {
    Primitive(PrimitiveType),
    Named {
        name: Identifier,
        type_arguments: Option<Vec<Type>>,
    },
    Array {
        element_type: Box<Type>,
        dimensions: u32,
    },
    Nullable(Box<Type>),
    Tuple(Vec<(Option<Identifier>, Type)>),
    Pointer(Box<Type>),
    FunctionPointer {
        parameters: Vec<Type>,
        return_type: Box<Type>,
    },
}
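As an illustration of how such a hierarchical enum is consumed, the sketch below renders a type tree back to C# syntax. It uses a simplified copy of the enum rather than the real bsharp_syntax types: String replaces Identifier and PrimitiveType, the FunctionPointer variant is left out, and dimensions is assumed to hold the array rank (1 for int[], 2 for int[,]), which the listing above does not state.

```rust
// Simplified mirror of the Type enum; see the assumptions in the lead-in.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Primitive(String),
    Named { name: String, type_arguments: Option<Vec<Type>> },
    Array { element_type: Box<Type>, dimensions: u32 },
    Nullable(Box<Type>),
    Tuple(Vec<(Option<String>, Type)>),
    Pointer(Box<Type>),
}

/// Render a type tree as C# source text.
fn to_csharp(ty: &Type) -> String {
    match ty {
        Type::Primitive(name) => name.clone(),
        Type::Named { name, type_arguments } => match type_arguments {
            Some(args) => {
                let rendered: Vec<String> = args.iter().map(to_csharp).collect();
                format!("{}<{}>", name, rendered.join(", "))
            }
            None => name.clone(),
        },
        Type::Array { .. } => {
            // C# writes rank specifiers outermost-first:
            // `int[][,]` is a one-dimensional array of `int[,]`.
            let mut suffix = String::new();
            let mut current = ty;
            while let Type::Array { element_type, dimensions } = current {
                // A rank-n specifier has n - 1 commas between the brackets.
                let commas = ",".repeat(dimensions.saturating_sub(1) as usize);
                suffix.push_str(&format!("[{}]", commas));
                current = &**element_type;
            }
            format!("{}{}", to_csharp(current), suffix)
        }
        Type::Nullable(inner) => format!("{}?", to_csharp(inner)),
        Type::Tuple(elements) => {
            let parts: Vec<String> = elements
                .iter()
                .map(|(name, element)| match name {
                    Some(n) => format!("{} {}", to_csharp(element), n),
                    None => to_csharp(element),
                })
                .collect();
            format!("({})", parts.join(", "))
        }
        Type::Pointer(inner) => format!("{}*", to_csharp(inner)),
    }
}
```

For example, a Nullable wrapping Named { name: "List", type_arguments: Some([int]) } renders as List<int>?, and an Array of rank 1 whose element is an Array of rank 2 over int renders as int[][,].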

Error Handling

The type parser provides detailed error messages for:

Invalid type syntax
Constraint violations
Generic parameter mismatches
Nullable context errors
Variance violations

Type Inference

While the parser doesn't perform type inference (that's the compiler's job), it correctly parses:

var declarations
Anonymous types
Implicitly typed arrays
Lambda parameter types

The type system parser is designed to accurately represent the full complexity of C#'s type system while maintaining performance and providing clear error diagnostics.

C# Feature Completeness Matrix

This document tracks the implementation status of C# language features in the BSharp parser.

Legend:

✅ Fully Supported - Feature is completely implemented and tested
🟡 Partial Support - Feature is partially implemented or has known limitations
⚠️ Planned - Feature is planned but not yet implemented
❌ Not Supported - Feature is not currently supported

C# 1.0 Features (2002)

Type Declarations

Feature	Status	Notes
Classes	✅	Full support including nested classes
Structs	✅	Full support
Interfaces	✅	Full support
Enums	✅	Full support including flags
Delegates	✅	Full support

Members

Feature	Status	Notes
Fields	✅	Public, private, protected, internal
Properties	✅	Get/set accessors
Methods	✅	Instance and static methods
Constructors	✅	Instance and static constructors
Destructors/Finalizers	✅	Full support
Events	✅	Full support
Indexers	✅	Full support
Operators	✅	Operator overloading

Statements

Feature	Status	Notes
`if`/`else`	✅	Full support
`switch`/`case`	✅	Traditional switch statements
`for`	✅	Full support
`foreach`	✅	Full support
`while`	✅	Full support
`do-while`	✅	Full support
`break`	✅	Full support
`continue`	✅	Full support
`return`	✅	Full support
`throw`	✅	Full support
`try`/`catch`/`finally`	✅	Full exception handling
`using` statement	✅	Resource management
`lock`	✅	Thread synchronization
`goto`	✅	Including goto case
`checked`/`unchecked`	✅	Overflow checking

Expressions

Feature	Status	Notes
Literals	✅	All literal types
Arithmetic operators	✅	`+`, `-`, `*`, `/`, `%`
Comparison operators	✅	`==`, `!=`, `<`, `>`, `<=`, `>=`
Logical operators	✅	`&&`, `||`, `!`
Bitwise operators	✅	`&`, `|`, `^`, `~`, `<<`, `>>`
Assignment operators	✅	`=`, `+=`, `-=`, etc.
Conditional operator	✅	`? :` ternary
Member access	✅	`.` operator
Indexing	✅	`[]` operator
Method invocation	✅	Full support
Object creation	✅	`new` expressions
Array creation	✅	Single and multi-dimensional
Type casting	✅	`(Type)expr`
`typeof`	✅	Type information
`sizeof`	✅	Size of types
`is` operator	✅	Type testing
`as` operator	✅	Safe casting

Types

Feature	Status	Notes
Primitive types	✅	All built-in types
Arrays	✅	Single, multi-dimensional, jagged
Nullable value types	✅	`T?` syntax
Reference types	✅	Classes, interfaces, delegates
Value types	✅	Structs, enums

Modifiers

Feature	Status	Notes
Access modifiers	✅	public, private, protected, internal
`static`	✅	Full support
`readonly`	✅	Full support
`const`	✅	Full support
`virtual`	✅	Full support
`override`	✅	Full support
`abstract`	✅	Full support
`sealed`	✅	Full support
`extern`	✅	Full support

C# 2.0 Features (2005)

Feature	Status	Notes
Generics	✅	Full support including constraints
Generic constraints	✅	`where T : class`, `struct`, `new()`, etc.
Partial types	✅	`partial` keyword
Anonymous methods	✅	`delegate { }` syntax
Nullable types	✅	`Nullable<T>` and `T?`
Iterators	✅	`yield return`, `yield break`
Covariance/Contravariance	✅	`in`/`out` variance
Static classes	✅	Full support
Property accessors	✅	Different accessibility
Namespace aliases	✅	`using Alias = Namespace`
`??` operator	✅	Null-coalescing

C# 3.0 Features (2007)

Feature	Status	Notes
Auto-implemented properties	✅	`{ get; set; }`
Object initializers	✅	`new T { Prop = value }`
Collection initializers	✅	`new List<T> { 1, 2, 3 }`
Anonymous types	✅	`new { Name = "x" }`
Extension methods	✅	`this` parameter
Lambda expressions	✅	`x => x * 2`
Expression trees	✅	Parsing support
LINQ query syntax	✅	`from x in y select z`
Implicitly typed variables	✅	`var` keyword
Partial methods	✅	In partial classes

C# 4.0 Features (2010)

Feature	Status	Notes
Dynamic binding	✅	`dynamic` type
Named arguments	✅	`Method(param: value)`
Optional parameters	✅	Default parameter values
Generic covariance/contravariance	✅	Enhanced support
Embedded interop types	✅	`no-pia`

C# 5.0 Features (2012)

Feature	Status	Notes
Async/await	✅	`async` and `await` keywords
Caller info attributes	✅	`[CallerMemberName]`, etc.

C# 6.0 Features (2015)

Feature	Status	Notes
Auto-property initializers	✅	`public int X { get; set; } = 1;`
Expression-bodied members	✅	`=> expr` for methods/properties
`using static`	✅	Import static members
Null-conditional operator	✅	`?.` and `?[]`
String interpolation	✅	`$"Hello {name}"`
`nameof` operator	✅	`nameof(variable)`
Index initializers	✅	`[index] = value`
Exception filters	✅	`catch (E) when (condition)`
`await` in catch/finally	✅	Full support

C# 7.0 Features (2017)

Feature	Status	Notes
Out variables	✅	`Method(out var x)`
Tuples	✅	`(int, string)` syntax
Tuple deconstruction	✅	`(var x, var y) = tuple`
Pattern matching	✅	`is` patterns
Local functions	✅	Functions inside methods
Ref returns and locals	✅	`ref` keyword
Discards	✅	`_` placeholder
Binary literals	✅	`0b1010`
Digit separators	✅	`1_000_000`
Throw expressions	✅	`x ?? throw new E()`
Expression-bodied constructors	✅	`=> expr` syntax
Expression-bodied finalizers	✅	`=> expr` syntax
Expression-bodied accessors	✅	`get => expr`

C# 7.1 Features (2017)

Feature	Status	Notes
Async main	✅	`async Task Main()`
Default literal expressions	✅	`default` without type
Inferred tuple names	✅	Automatic naming
Pattern matching on generics	✅	Full support

C# 7.2 Features (2017)

Feature	Status	Notes
`ref readonly`	✅	Read-only references
`in` parameters	✅	Pass by readonly reference
`ref struct`	✅	Stack-only structs
Non-trailing named arguments	✅	Mixed named/positional
`private protected`	✅	Access modifier
Leading underscores in numeric literals	✅	`0x_FF`, `0b_1010` (after the base prefix)
Conditional `ref` expressions	✅	`ref` in ternary

C# 7.3 Features (2018)

Feature	Status	Notes
Tuple equality	✅	`==` and `!=`
Attributes on backing fields	✅	`[field: Attribute]`
Expression variables in initializers	✅	Full support
`ref` local reassignment	✅	Reassign ref locals
Stackalloc initializers	✅	`stackalloc[] { 1, 2 }`
Pattern-based `fixed`	✅	Custom fixed
Improved overload candidates	✅	Better resolution

C# 8.0 Features (2019)

Feature	Status	Notes
Nullable reference types	✅	`string?` annotations
Default interface methods	✅	Interface implementations
Pattern matching enhancements	✅	Switch expressions, property patterns
Switch expressions	✅	`x switch { ... }`
Property patterns	✅	`{ Prop: value }`
Tuple patterns	✅	`(1, 2)` patterns
Positional patterns	✅	Deconstruction patterns
Using declarations	✅	`using var x = ...`
Static local functions	✅	`static` modifier
Disposable ref structs	✅	`IDisposable` on ref struct
Nullable context directives	✅	`#nullable` directives
Asynchronous streams	✅	`IAsyncEnumerable<T>`
Asynchronous disposable	✅	`IAsyncDisposable`
Indices and ranges	✅	`^` and `..` operators
Null-coalescing assignment	✅	`??=` operator
Unmanaged constructed types	✅	Generic constraints
Stackalloc in nested expressions	✅	Full support

C# 9.0 Features (2020)

Feature	Status	Notes
Records	✅	`record` keyword
Init-only setters	✅	`init` accessor
Top-level statements	✅	No Main method required
Pattern matching improvements	✅	Relational, logical patterns
Relational patterns	✅	`> 0`, `<= 10`
Logical patterns	✅	`and`, `or`, `not`
Target-typed `new`	✅	`new()` without type
Covariant returns	✅	Override with derived type
Extension `GetEnumerator`	✅	foreach support
Lambda discard parameters	✅	`(_, _) => expr`
Attributes on local functions	✅	Full support
Module initializers	✅	`[ModuleInitializer]`
Partial methods with return	✅	Extended partial
Native integers	✅	`nint`, `nuint`
Function pointers	✅	`delegate*` syntax
Suppress emitting localsinit	✅	`[SkipLocalsInit]`
Target-typed conditional	✅	`? :` inference

C# 10.0 Features (2021)

Feature	Status	Notes
Record structs	✅	`record struct`
Global using directives	✅	`global using`
File-scoped namespaces	✅	`namespace X;`
Extended property patterns	✅	Nested patterns
Constant interpolated strings	✅	`const` strings
Lambda improvements	✅	Natural types, attributes
Caller expression attribute	✅	`[CallerArgumentExpression]`
Improved definite assignment	✅	Better analysis
Allow `AsyncMethodBuilder`	✅	Custom builders
Record types with sealed `ToString`	✅	Sealed override
Assignment and declaration in same deconstruction	✅	Mixed syntax
Allow both assignment and declaration	✅	Full support

C# 11.0 Features (2022)

Feature	Status	Notes
Raw string literals	✅	`"""text"""`
Generic attributes	✅	`[Attr<T>]`
UTF-8 string literals	✅	`"text"u8`
Newlines in string interpolations	✅	Multi-line expressions
List patterns	✅	`[1, 2, .., 10]`
File-local types	✅	`file class`
Required members	✅	`required` modifier
Auto-default structs	✅	Default initialization
Pattern match `Span<char>`	✅	Constant patterns
Extended `nameof` scope	✅	More contexts
Numeric IntPtr	✅	Operators on IntPtr
`ref` fields	✅	In ref structs
`scoped` ref	✅	Lifetime annotations
Checked operators	✅	User-defined checked

C# 12.0 Features (2023)

Feature	Status	Notes
Primary constructors	✅	Full support for classes and structs
Collection expressions	✅	`[1, 2, 3]` and spread `..` syntax
Inline arrays	❌	Not yet implemented
Optional parameters in lambdas	✅	Full support
`ref readonly` parameters	✅	Full support
Alias any type	✅	`using Alias = (int, string)`
Experimental attribute	✅	`[Experimental]`
Interceptors	❌	Not yet implemented

C# 13.0 Features (2024)

Feature	Status	Notes
`params` collections	⚠️	Planned
New lock type	⚠️	Planned
New escape sequence `\e`	⚠️	Planned
Method group natural type	⚠️	Planned
Implicit indexer access	⚠️	Planned
`ref` and `unsafe` in iterators	⚠️	Planned
`ref struct` interfaces	⚠️	Planned
Allows `ref struct` types	⚠️	Planned

C# 14.0 Features (2025 - .NET 10)

Feature	Status	Notes
Extension members	🟡	Parser + emitter for `extension` blocks; semantics planned
`field` keyword	⚠️	Planned - Field-backed properties
Null-conditional assignment	⚠️	Planned - `?.` on left side of `=`
`nameof` unbound generics	⚠️	Planned - `nameof(List<>)`
Implicit `Span<T>` conversions	⚠️	Planned - First-class span support
Lambda parameter modifiers	⚠️	Planned - `(out x) => ...` without types
Partial constructors	⚠️	Planned - `partial` instance constructors
Partial events	⚠️	Planned - `partial` events
User-defined compound assignment	⚠️	Planned - Custom `+=`, `-=` operators

Preprocessor Directives

Feature	Status	Notes
`#if` / `#elif` / `#else` / `#endif`	✅	Conditional compilation
`#define` / `#undef`	✅	Symbol definition
`#warning` / `#error`	✅	Compiler messages
`#line`	✅	Line number control
`#region` / `#endregion`	✅	Code folding
`#pragma warning`	✅	Warning control
`#pragma checksum`	✅	Debugging support
`#nullable`	✅	Nullable context

Documentation Comments

Feature	Status	Notes
XML documentation	✅	`///` and `/** */`
`<summary>`	✅	Full support
`<param>`	✅	Full support
`<returns>`	✅	Full support
`<exception>`	✅	Full support
`<see>` / `<seealso>`	✅	Full support
`<example>`	✅	Full support
`<code>` / `<c>`	✅	Full support
`<para>`	✅	Full support
`<list>`	✅	Full support
`<include>`	✅	Full support

Unsafe Code

Feature	Status	Notes
Pointers	✅	`T*` syntax
`unsafe` keyword	✅	Blocks and methods
`fixed` statement	✅	Pin managed objects
`stackalloc`	✅	Stack allocation
Function pointers	✅	`delegate*` (C# 9+)
`sizeof` operator	✅	Type sizes
Pointer arithmetic	✅	Full support
Address-of operator	✅	`&` operator
Indirection operator	✅	`*` operator

Summary Statistics

Overall Completeness

Version	Features	Supported	Planned	Not Supported	Completion
C# 1.0	80+	80+	0	0	100%
C# 2.0	11	11	0	0	100%
C# 3.0	10	10	0	0	100%
C# 4.0	5	5	0	0	100%
C# 5.0	2	2	0	0	100%
C# 6.0	10	10	0	0	100%
C# 7.0	13	13	0	0	100%
C# 7.1	4	4	0	0	100%
C# 7.2	7	7	0	0	100%
C# 7.3	7	7	0	0	100%
C# 8.0	18	18	0	0	100%
C# 9.0	17	17	0	0	100%
C# 10.0	12	12	0	0	100%
C# 11.0	13	13	0	0	100%
C# 12.0	8	6	0	2	75%
C# 13.0	8	0	8	0	0% (Preview)
C# 14.0	9	0	9	0	0% (Preview)

Total: ~99% of released C# features supported (C# 1.0 - 12.0)

Testing Coverage

Test Organization

All parser tests are located in tests/parser/ with comprehensive coverage:

Expression tests: tests/parser/expressions/
Statement tests: tests/parser/statements/
Declaration tests: tests/parser/declarations/
Type tests: tests/parser/types/
Pattern matching tests: tests/parser/expressions/pattern_matching_tests.rs
Preprocessor tests: tests/parser/preprocessor/

Test Fixtures

Real-world C# projects in tests/fixtures/:

happy_path/: Valid, well-formed C# code
complex/: Complex real-world scenarios

Known Limitations

C# 12.0 Limitations

Inline Arrays: Not yet implemented
- Requires [InlineArray(n)] attribute support
- Planned for future release
Interceptors: Not yet implemented
- Experimental feature in C# 12
- May be implemented when feature stabilizes

C# 13.0 & 14.0 Status

Apart from partial support for C# 14.0 extension members (parser and emitter only), C# 13.0 and 14.0 features are not yet implemented. They are planned for future implementation as they stabilize in the official .NET releases.

Contributing

To add support for new C# features:

Update AST nodes in src/syntax/nodes/
Implement parser in src/parser/
Add comprehensive tests in tests/parser/
Update this matrix to reflect new support
Document in relevant parser documentation

See Contributing Guide for details.

References

C# Language Specification: https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/
C# Version History: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/
Roslyn Source: https://github.com/dotnet/roslyn
Parser Implementation: src/parser/
Test Suite: tests/parser/

Last Updated: 2025-09-30
Parser Version: Current development version
Maintained By: BSharp Project Contributors

Keywords and Tokens

Keyword and token helpers used by the parser.

Keyword Pairs Macro

Location: src/bsharp_parser/src/keywords/mod.rs
Macro: define_keyword_pair! (macro_rules)
Generates two functions per keyword:
- kw_<name>() – consumes the keyword with word boundary check
- peek_<name>() – non-consuming peek with surrounding whitespace/comments tolerated

#![allow(unused)]
fn main() {
// Define a pair:
// define_keyword_pair!(kw_public, peek_public, "public");
#[macro_export]
macro_rules! define_keyword_pair {
    ($kw_fn:ident, $peek_fn:ident, $lit:literal) => {
        pub fn $kw_fn() -> impl FnMut($crate::syntax::span::Span) -> $crate::syntax::errors::BResult<&str> {
            use nom::Parser as _;
            (|i: $crate::syntax::span::Span| {
                nom::combinator::map(
                    nom::sequence::terminated(
                        nom_supreme::tag::complete::tag($lit),
                        nom::combinator::peek(nom::combinator::not(
                            nom::character::complete::satisfy(|c: char| c.is_alphanumeric() || c == '_'),
                        )),
                    ),
                    |s: $crate::syntax::span::Span| *s.fragment(),
                )
                .parse(i)
            })
        }
        pub fn $peek_fn() -> impl FnMut($crate::syntax::span::Span) -> $crate::syntax::errors::BResult<&str> {
            use nom::Parser as _;
            (|i: $crate::syntax::span::Span| {
                nom::combinator::peek(
                    nom::sequence::delimited(
                        $crate::syntax::comment_parser::ws,
                        nom::combinator::map(
                            nom::sequence::terminated(
                                nom_supreme::tag::complete::tag($lit),
                                nom::combinator::peek(nom::combinator::not(
                                    nom::character::complete::satisfy(|c: char| c.is_alphanumeric() || c == '_'),
                                )),
                            ),
                            |_| $lit,
                        ),
                        $crate::syntax::comment_parser::ws,
                    ),
                )
                .parse(i)
            })
        }
    };
}
}
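The word-boundary rule that both generated functions rely on can be illustrated without nom. The following is a minimal std-only sketch, not the macro's output: `keyword` and `peek_keyword` are hypothetical stand-ins, and the peek variant here skips only leading whitespace, whereas the real peek_<name>() also tolerates comments.

```rust
// Std-only sketch; `keyword` and `peek_keyword` are hypothetical stand-ins,
// not the functions generated by define_keyword_pair!.

/// Consumes `lit` from the front of `input` and returns the remainder, but only
/// when the next character cannot continue an identifier (word boundary check).
fn keyword<'a>(input: &'a str, lit: &str) -> Option<&'a str> {
    let rest = input.strip_prefix(lit)?;
    match rest.chars().next() {
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(rest),
    }
}

/// Non-consuming lookahead: skips leading whitespace (comments are not handled
/// in this sketch) and reports whether the keyword comes next.
fn peek_keyword(input: &str, lit: &str) -> bool {
    keyword(input.trim_start(), lit).is_some()
}
```

With this rule, "public class" matches the keyword public while "publicity" and "public_api" do not, which is why identifiers that merely start with a keyword are not misparsed.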

Keyword modules live under src/bsharp_parser/src/keywords/ (e.g., access_keywords.rs, declaration_keywords.rs, linq_query_keywords.rs, type_keywords.rs).
Central keyword set: KEYWORDS in keywords/mod.rs, checked via is_keyword().

Token and Whitespace Helpers

Whitespace/comments: src/bsharp_parser/src/syntax/comment_parser.rs
- ws() parses optional whitespace and comments
- parse_whitespace_or_comments() returns the consumed span text
List parsing: src/bsharp_parser/src/syntax/list_parser.rs provides helpers for delimited/separated lists
Punctuation/tokens: Use nom_supreme::tag::complete::tag("...") with:
- peek(not(satisfy(|c| ...))) for word boundaries on keywords
- preceded/terminated/delimited and ws() to control surrounding trivia

Example token with trivia discipline:

use nom::{combinator::map, sequence::delimited, Parser};
use nom_supreme::tag::complete::tag;
use crate::syntax::comment_parser::ws;
use crate::syntax::errors::BResult;
use crate::syntax::span::Span;

// Consumes a comma together with any surrounding whitespace/comments.
pub fn comma(i: Span) -> BResult<()> {
    map(delimited(ws, tag(","), ws), |_| ()).parse(i)
}

Usage Patterns

Prefer peek_*() when branching without consuming input (e.g., lookahead for statement kind).
After consuming a keyword with kw_*(), use cut() to prevent backtracking past the commitment.
Always wrap the top-level file parser in all_consuming so that unparsed trailing input is reported as an error.
Keep context labels short and specific.

Adding a New Keyword

Pick the right module in keywords/ and add a define_keyword_pair! entry.
If it's a reserved word, add it to KEYWORDS (for identifier filtering).
Use kw_*()/peek_*() in parsers with ws() at boundaries.
Add tests under src/bsharp_tests/src/parser/... for both positive and negative cases.
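To make the identifier-filtering step concrete, here is a minimal std-only sketch of how a reserved-word table can be consulted when parsing identifiers. The abbreviated KEYWORDS table and the `identifier` function are illustrative assumptions; the real KEYWORDS set and is_keyword() live in keywords/mod.rs and may differ in shape.

```rust
// Abbreviated, illustrative keyword table; the real KEYWORDS set is larger.
const KEYWORDS: &[&str] = &["class", "public", "return", "void"];

fn is_keyword(word: &str) -> bool {
    KEYWORDS.contains(&word)
}

/// Hypothetical identifier parser: returns (identifier, rest) and rejects
/// reserved words, empty input, and words that start with a digit.
fn identifier(input: &str) -> Option<(&str, &str)> {
    let end = input
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    let (word, rest) = input.split_at(end);
    let first = word.chars().next()?;
    if first.is_ascii_digit() || is_keyword(word) {
        return None;
    }
    Some((word, rest))
}
```

A positive and a negative test for a new keyword would then check that the bare keyword is rejected as an identifier while a longer word sharing its prefix is still accepted.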

References

Keyword macro and modules: src/bsharp_parser/src/keywords/
Whitespace/comment parser: src/bsharp_parser/src/syntax/comment_parser.rs
Lists: src/bsharp_parser/src/syntax/list_parser.rs
Error formatting: src/bsharp_parser/src/syntax/errors.rs

Query API for AST traversal

The Query API is provided by the bsharp_syntax crate and re-exported by bsharp_analysis for convenience. It replaces the older navigation traits and is the current, non-deprecated way to traverse the AST.

Core types

NodeRef<'a>: a thin enum over AST nodes (CompilationUnit, Namespace, Class, Struct, Interface, Enum, Record, Delegate, Method, Statement, Expression, plus top-level items). Origin: bsharp_syntax::node::ast_node::NodeRef (re-exported as bsharp_analysis::framework::NodeRef).
Query<'a>: a fluent helper to enumerate descendants and select typed nodes. Origin: bsharp_syntax::query::Query (re-exported as bsharp_analysis::framework::Query).

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::CompilationUnit;
use bsharp_syntax::{ClassDeclaration, MethodDeclaration};

fn all_classes<'a>(cu: &'a CompilationUnit) -> Vec<&'a ClassDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<ClassDeclaration>()
        .collect()
}

fn all_methods<'a>(cu: &'a CompilationUnit) -> Vec<&'a MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<MethodDeclaration>()
        .collect()
}
}

Descendant enumeration

Query::descendants() walks the tree using Children implemented for NodeRef.

use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::CompilationUnit;
use bsharp_syntax::statements::Statement;

// Collects every Statement reachable from the compilation unit.
fn all_statements<'a>(cu: &'a CompilationUnit) -> Vec<&'a Statement> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<Statement>()
        .collect()
}
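The idea behind descendant enumeration can be sketched independently of the BSharp crates. The example below is a simplified illustration, not the actual Query implementation: the `Node` enum, `descendants`, and `method_names` are hypothetical names, and the walk is a plain pre-order traversal driven by a `children()` accessor, with typed selection done by a pattern match.

```rust
// Hypothetical miniature AST; the real NodeRef covers many more node kinds.
#[derive(Debug)]
enum Node {
    Class { name: String, members: Vec<Node> },
    Method { name: String },
}

impl Node {
    /// Direct children of this node (the role played by a Children impl).
    fn children(&self) -> &[Node] {
        match self {
            Node::Class { members, .. } => members.as_slice(),
            Node::Method { .. } => &[],
        }
    }
}

/// Pre-order walk over `root` and all of its descendants.
fn descendants(root: &Node) -> Vec<&Node> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        out.push(node);
        // Push in reverse so children are visited in source order.
        stack.extend(node.children().iter().rev());
    }
    out
}

/// Typed selection: keep only method nodes and return their names.
fn method_names(root: &Node) -> Vec<&str> {
    descendants(root)
        .into_iter()
        .filter_map(|n| match n {
            Node::Method { name } => Some(name.as_str()),
            _ => None,
        })
        .collect()
}
```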

Filtering

Use filter_typed to filter by predicate.

use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::ClassDeclaration;

// `cu` is a CompilationUnit produced by the parser.
let public_classes: Vec<&ClassDeclaration> =
    Query::from(NodeRef::CompilationUnit(&cu))
        .filter_typed::<ClassDeclaration>(|c| c.modifiers.iter().any(|m| m.is_public()))
        .collect();

Best practices

Prefer Query for node enumeration across passes.
For hot path statement/expression analysis, use shared helpers (metrics::shared) or a small local walker when necessary.
Keep passes stateless and deterministic; feed inputs via AnalysisSession artifacts.

Implementation notes

The Children/Extract traits are implemented for common AST nodes, enabling Query::of<T>() to return strong types. See:

src/bsharp_syntax/src/query/ for Children, Extract, Query.
src/bsharp_syntax/src/node/ast_node.rs for NodeRef.

Comment Parsing

BSharp implements comprehensive comment parsing for both regular comments and XML documentation comments, preserving them as part of the AST for documentation generation and analysis tools.

Comment Types

1. Single-Line Comments

Standard C++ style comments:

// This is a single-line comment
int x = 5; // End-of-line comment

2. Multi-Line Comments

Traditional C-style block comments:

/*
 * This is a multi-line comment
 * that spans several lines
 */
int y = 10; /* Inline block comment */

3. XML Documentation Comments

Single-Line XML Comments

/// <summary>
/// This method calculates the sum of two integers.
/// </summary>
/// <param name="a">The first integer.</param>
/// <param name="b">The second integer.</param>
/// <returns>The sum of a and b.</returns>
public int Add(int a, int b)
{
    return a + b;
}

Multi-Line XML Comments

/**
 * <summary>
 * This is a multi-line XML documentation comment.
 * It provides detailed information about the method.
 * </summary>
 * <param name="value">The input value to process.</param>
 * <returns>The processed result.</returns>
 */
public string ProcessValue(string value) { }

XML Documentation Structure

Standard XML Tags

Summary and Description

<summary>
Brief description of the member.
</summary>

<remarks>
Detailed remarks and additional information.
</remarks>

Parameters and Returns

<param name="parameterName">Description of the parameter.</param>
<returns>Description of the return value.</returns>

Exceptions

<exception cref="ArgumentNullException">
Thrown when the parameter is null.
</exception>

Examples

<example>
This example shows how to use the method:
<code>
var result = MyMethod("input");
Console.WriteLine(result);
</code>
</example>

See References

<see cref="RelatedMethod"/>
<seealso cref="AnotherClass"/>

Generic Type Parameters

<typeparam name="T">The type parameter.</typeparam>
<typeparamref name="T"/>

Custom XML Tags

The parser supports custom XML tags:

<custom attribute="value">
Custom content with <nested>elements</nested>.
</custom>

XML Documentation Parsing

XML Element Structure

The parser represents XML elements with:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlElement {
    pub name: Identifier,
    pub attributes: Vec<XmlAttribute>,
    pub children: Vec<XmlNode>,
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlAttribute {
    pub name: Identifier,
    pub value: String,
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum XmlNode {
    Element(XmlElement),
    Text(String),
    CData(String),
    Comment(String),
}
}

XML Documentation Comment

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlDocumentationComment {
    pub elements: Vec<XmlNode>,
}
}

Parsing XML Attributes

The parser handles XML attributes with various syntaxes:

<param name="value">Description</param>
<see cref="MyClass.MyMethod(int, string)"/>
<exception cref="System.ArgumentException">Error description</exception>

XML Content Parsing

The parser processes mixed content:

<summary>
This method processes <paramref name="input"/> and returns
<see cref="ProcessResult"/> containing the result.
</summary>

Comment Association

Declaration Comments

Comments are associated with their following declarations:

/// <summary>Class documentation</summary>
public class MyClass
{
    /// <summary>Method documentation</summary>
    public void MyMethod() { }
}

Member Comments

Each declaration can have associated documentation:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct MethodDeclaration {
    pub documentation: Option<XmlDocumentationComment>,
    // ... other fields
}
}

Advanced XML Features

CDATA Sections

The parser handles CDATA sections for literal content:

<example>
<![CDATA[
if (x < y && y > z)
{
    Console.WriteLine("Complex condition");
}
]]>
</example>

Nested XML Elements

Complex nested structures are supported:

<summary>
This method handles <see cref="List{T}"/> where T is
<typeparamref name="T"/> and implements <see cref="IComparable{T}"/>.
</summary>

XML Namespaces

The parser can handle XML namespaces in documentation:

<doc:summary xmlns:doc="http://schemas.microsoft.com/developer/documentation">
Namespaced documentation content.
</doc:summary>

Comment Preservation

Comment Tokens

Comments are preserved as tokens in the AST:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CommentToken {
    SingleLine(String),
    MultiLine(String),
    XmlDocumentation(XmlDocumentationComment),
}
}

Position Information

Comments maintain position information:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PositionedComment {
    pub comment: CommentToken,
    pub line: usize,
    pub column: usize,
}
}

Error Handling

XML Validation

The parser validates XML structure:

Well-formed XML: Proper opening and closing tags
Attribute syntax: Valid attribute name-value pairs
Nesting rules: Correct element nesting
Character escaping: Proper XML character escaping

Error Recovery

When XML is malformed, the parser attempts recovery:

Skip malformed elements: Continue parsing after errors
Preserve content: Keep as much content as possible
Error reporting: Provide detailed error locations

Integration with Analysis

Documentation Analysis

Comments are available for analysis tools:

#![allow(unused)]
fn main() {
impl XmlDocumentationComment {
    pub fn find_elements_by_name(&self, name: &str) -> Vec<&XmlElement> {
        // Find all elements with the given tag name
    }
    
    pub fn get_summary(&self) -> Option<String> {
        // Extract summary text
    }
    
    pub fn get_parameters(&self) -> Vec<(String, String)> {
        // Extract parameter documentation
    }
}
}
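The helper signatures above leave their bodies open. One possible way to fill them in is sketched below under simplifying assumptions: element names are plain String values instead of Identifier, attributes are (name, value) tuples, only Element and Text nodes are modelled, and the helpers are free functions over a slice of top-level nodes. None of this is confirmed to match the actual implementation.

```rust
// Simplified, hypothetical versions of the XML node types.
#[derive(Debug, Clone, PartialEq)]
enum XmlNode {
    Element(XmlElement),
    Text(String),
}

#[derive(Debug, Clone, PartialEq)]
struct XmlElement {
    name: String,
    attributes: Vec<(String, String)>,
    children: Vec<XmlNode>,
}

/// Appends all text inside `el`, including text of nested elements.
fn collect_text(el: &XmlElement, out: &mut String) {
    for child in &el.children {
        match child {
            XmlNode::Text(t) => out.push_str(t),
            XmlNode::Element(e) => collect_text(e, out),
        }
    }
}

/// Top-level elements whose tag name equals `name`.
fn elements_named<'a>(nodes: &'a [XmlNode], name: &str) -> Vec<&'a XmlElement> {
    nodes
        .iter()
        .filter_map(|n| match n {
            XmlNode::Element(e) if e.name == name => Some(e),
            _ => None,
        })
        .collect()
}

/// Trimmed text of the first <summary> element, if any.
fn get_summary(nodes: &[XmlNode]) -> Option<String> {
    let el = elements_named(nodes, "summary").into_iter().next()?;
    let mut text = String::new();
    collect_text(el, &mut text);
    Some(text.trim().to_string())
}

/// (parameter name, description) pairs from <param name="..."> elements.
fn get_parameters(nodes: &[XmlNode]) -> Vec<(String, String)> {
    elements_named(nodes, "param")
        .into_iter()
        .filter_map(|el| {
            let name = el.attributes.iter().find(|(k, _)| k == "name")?.1.clone();
            let mut text = String::new();
            collect_text(el, &mut text);
            Some((name, text.trim().to_string()))
        })
        .collect()
}
```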

Documentation Generation

The parsed XML documentation can be used for:

API documentation generation
IntelliSense information
Code analysis and quality checks
Documentation coverage reports

Performance Considerations

Lazy Parsing

XML documentation can be parsed lazily when needed:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub enum DocumentationState {
    Unparsed(String),
    Parsed(XmlDocumentationComment),
    Invalid(String, ParseError),
}
}
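The state transitions implied by this enum can be sketched as follows. This is an illustration only: `DocState`, `ensure_parsed`, and the toy `parse_tags` function (which merely collects opening tag names) are hypothetical, and the error payload is a plain String rather than ParseError.

```rust
// Hypothetical stand-in for DocumentationState with simplified payloads.
#[derive(Debug, Clone, PartialEq)]
enum DocState {
    Unparsed(String),
    Parsed(Vec<String>),
    Invalid(String, String),
}

/// Toy parser: collects opening tag names and fails on an unclosed '<'.
fn parse_tags(raw: &str) -> Result<Vec<String>, String> {
    let mut tags = Vec::new();
    let mut rest = raw;
    while let Some(start) = rest.find('<') {
        let after = &rest[start + 1..];
        let end = after
            .find('>')
            .ok_or_else(|| format!("unclosed tag at byte {}", raw.len() - rest.len() + start))?;
        let inner = &after[..end];
        if !inner.starts_with('/') {
            let name = inner.trim_end_matches('/').split_whitespace().next().unwrap_or("");
            tags.push(name.to_string());
        }
        rest = &after[end + 1..];
    }
    Ok(tags)
}

impl DocState {
    /// Parses on first access and caches the outcome; later calls are cheap.
    fn ensure_parsed(&mut self) -> Option<&[String]> {
        if let DocState::Unparsed(raw) = &mut *self {
            let raw = std::mem::take(raw);
            *self = match parse_tags(&raw) {
                Ok(tags) => DocState::Parsed(tags),
                Err(msg) => DocState::Invalid(raw, msg),
            };
        }
        match &*self {
            DocState::Parsed(tags) => Some(tags.as_slice()),
            _ => None,
        }
    }
}
```

Keeping the raw text in the Invalid variant means malformed documentation is still available to tools even though no structured form exists.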

Memory Optimization

The parser optimizes memory usage by:

String interning: Reusing common XML tag names
Structured storage: Efficient representation of XML structure
On-demand parsing: Parse XML only when accessed

The comment parsing system ensures that all documentation and comments are preserved and available for analysis, while maintaining the performance characteristics needed for large codebases.

Preprocessor Directives

This parser treats preprocessor directives as trivia that can appear at safe boundaries (file start, between members inside namespaces and type bodies). We currently parse only a small subset explicitly and skip the rest.

What is parsed today

#pragma lines are parsed into PreprocessorDirective::Pragma { pragma: String }.
#line lines are parsed into PreprocessorDirective::Line { line: String }.
Any other line starting with # is recognized and consumed as PreprocessorDirective::Unknown { text: String } (the remainder of the line after #).

All directive parsers consume the optional trailing newline so the main parser can continue cleanly at the next token.
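The classification described above can be sketched without nom. The enum mirrors the three variants named in this section; the `directive` and `strip_word` functions are hypothetical, and the assumption that the Pragma and Line payloads hold the text after the directive keyword is mine, not confirmed by the source.

```rust
#[derive(Debug, PartialEq)]
enum PreprocessorDirective {
    Pragma { pragma: String },
    Line { line: String },
    Unknown { text: String },
}

/// Strips `word` from the front of `body` when it is followed by a word boundary.
fn strip_word<'a>(body: &'a str, word: &str) -> Option<&'a str> {
    let rest = body.strip_prefix(word)?;
    match rest.chars().next() {
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(rest.trim()),
    }
}

/// Parses one directive line and returns it with the input that follows the
/// optional trailing newline. Returns None when the line is not a directive.
fn directive(input: &str) -> Option<(PreprocessorDirective, &str)> {
    let after_hash = input
        .trim_start_matches(|c: char| c == ' ' || c == '\t')
        .strip_prefix('#')?;
    let (body, rest) = match after_hash.find('\n') {
        Some(i) => (&after_hash[..i], &after_hash[i + 1..]),
        None => (after_hash, ""),
    };
    let body = body.trim();
    let parsed = if let Some(p) = strip_word(body, "pragma") {
        PreprocessorDirective::Pragma { pragma: p.to_string() }
    } else if let Some(l) = strip_word(body, "line") {
        PreprocessorDirective::Line { line: l.to_string() }
    } else {
        PreprocessorDirective::Unknown { text: body.to_string() }
    };
    Some((parsed, rest))
}
```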

Where directives are skipped

Directives are treated as trivia and skipped at the safe boundaries described above: at file start and between members inside namespaces and type bodies.

This skipping is centralized in skip_preprocessor_directives() (parser/helpers/directives.rs).


2. Symbol Definition

#define and #undef

#define FEATURE_ENABLED
#define VERSION_2_0

#undef OLD_FEATURE

3. Diagnostic Directives

#warning

#warning This code is deprecated and will be removed in the next version

#error

#if UNSUPPORTED_PLATFORM
#error This platform is not supported
#endif

4. Line Directives

#line

#line 100 "OriginalFile.cs"
// Following code appears to come from line 100 of OriginalFile.cs

#line default
// Reset to actual file and line numbers

#line hidden
// Hide following lines from debugger

5. Region Directives

#region and #endregion

#region Private Methods
private void HelperMethod()
{
    // Implementation
}

private void AnotherHelper()
{
    // Implementation
}
#endregion

6. Pragma Directives

#pragma warning

#pragma warning disable CS0618
// Use of obsolete members
ObsoleteMethod();
#pragma warning restore CS0618

#pragma warning disable CS0162, CS0168
// Disable multiple warnings
#pragma warning restore CS0162, CS0168

#pragma checksum

#pragma checksum "file.cs" "{406EA660-64CF-4C82-B6F0-42D48172A799}" "checksum_bytes"

7. Nullable Context Directives

#nullable

#nullable enable
string? nullable = null;  // Nullable reference types enabled

#nullable disable
string notNullable = null;  // Warning disabled

#nullable restore
// Restore previous nullable context

Preprocessor Expression Evaluation

Symbols and Operators

Boolean Operators

#if DEBUG && !RELEASE           // AND and NOT
#if WINDOWS || LINUX || MACOS   // OR
#if (A && B) || (C && D)        // Grouping with parentheses

Equality Operators

#if DEBUG == true               // Equality against a boolean literal
#if DEBUG != RELEASE            // Inequality between two symbols

Symbol Resolution

Symbols can be defined:

Source code: #define SYMBOL
Compiler flags: /define:SYMBOL
Project settings: <DefineConstants>
Environment: Predefined symbols

Predefined Symbols

Common predefined symbols:

#if NET5_0_OR_GREATER          // Framework version
#if WINDOWS                    // Platform
#if DEBUG                      // Configuration
#if X64                        // Architecture

Preprocessor AST Representation

Preprocessor Directive Node

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum PreprocessorDirective {
    If {
        condition: PreprocessorExpression,
        then_block: Vec<PreprocessorDirective>,
        elif_blocks: Vec<(PreprocessorExpression, Vec<PreprocessorDirective>)>,
        else_block: Option<Vec<PreprocessorDirective>>,
    },
    Define(String),
    Undef(String),
    Warning(String),
    Error(String),
    Line {
        line_number: Option<u32>,
        file_name: Option<String>,
        hidden: bool,
    },
    Region {
        name: String,
        content: Vec<PreprocessorDirective>,
    },
    Pragma {
        directive: String,
        arguments: Vec<String>,
    },
    Nullable(NullableDirective),
}
}

Preprocessor Expression

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum PreprocessorExpression {
    Symbol(String),
    Not(Box<PreprocessorExpression>),
    And(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Or(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Equal(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    NotEqual(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Parenthesized(Box<PreprocessorExpression>),
    Literal(String),
}
}

Conditional Compilation Processing

Block Structure

Conditional blocks create a tree structure:

#if CONDITION_A
    // Block A
    #if NESTED_CONDITION
        // Nested block
    #endif
#elif CONDITION_B
    // Block B
#else
    // Default block
#endif

Active Code Determination

The preprocessor determines which code blocks are active:

Evaluate conditions: Process #if expressions
Symbol lookup: Resolve defined symbols
Block selection: Choose active code paths
Nested processing: Handle nested conditionals
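The four steps above can be sketched as a small evaluator. This is an illustrative, std-only example: `PpExpr`, `sym`, `eval`, and `active_branch` are hypothetical names, the expression type is a reduced form of the enum shown earlier, and a missing condition is used to model #else.

```rust
use std::collections::HashSet;

// Reduced, hypothetical form of a preprocessor expression.
enum PpExpr {
    Symbol(String),
    Literal(bool),
    Not(Box<PpExpr>),
    And(Box<PpExpr>, Box<PpExpr>),
    Or(Box<PpExpr>, Box<PpExpr>),
}

/// Convenience constructor for a symbol reference.
fn sym(name: &str) -> PpExpr {
    PpExpr::Symbol(name.to_string())
}

/// Evaluates a condition: a symbol is true exactly when it is defined.
fn eval(expr: &PpExpr, defined: &HashSet<&str>) -> bool {
    match expr {
        PpExpr::Symbol(s) => defined.contains(s.as_str()),
        PpExpr::Literal(b) => *b,
        PpExpr::Not(e) => !eval(e, defined),
        PpExpr::And(l, r) => eval(l, defined) && eval(r, defined),
        PpExpr::Or(l, r) => eval(l, defined) || eval(r, defined),
    }
}

/// Index of the first active branch of an #if/#elif/#else chain.
/// `None` as a condition models #else, which is always taken when reached.
fn active_branch(conditions: &[Option<PpExpr>], defined: &HashSet<&str>) -> Option<usize> {
    conditions.iter().position(|c| match c {
        Some(expr) => eval(expr, defined),
        None => true,
    })
}
```

Nested conditionals would apply the same selection recursively to the body of whichever branch is active.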

Integration with Main Parser

Two-Phase Parsing

Preprocessor Phase: Process directives and determine active code
Main Parse Phase: Parse the active code sections

Conditional Code Exclusion

Inactive code blocks are:

Excluded from parsing: Not processed by main parser
Preserved in AST: Available for analysis tools
Marked as inactive: Flagged for tooling

Directive Preservation

All directives are preserved for:

Code formatting tools
Refactoring utilities
Documentation generation
Build system integration

Error Handling

Directive Validation

The parser validates:

Balanced conditionals: Every #if has matching #endif
Valid expressions: Preprocessor expressions are syntactically correct
Symbol definitions: #define follows naming rules
Pragma syntax: Pragma directives have valid format

Error Recovery

When encountering malformed directives:

Skip invalid directives: Continue parsing
Report detailed errors: Show directive location and issue
Maintain structure: Keep conditional block structure intact

Advanced Features

Nested Regions

#region Outer Region
    #region Inner Region
        // Nested region content
    #endregion
#endregion

Complex Pragma Directives

#pragma warning disable IDE0051 // Remove unused private members
#pragma warning restore IDE0051

#nullable enable warnings       // Nullable contexts use #nullable, not #pragma
#nullable disable annotations

Source Mapping

Line directives affect source mapping:

#line 1 "Generated.cs"
// This appears to come from Generated.cs line 1
var generated = true;
#line default
// Back to actual file location

Usage in Analysis

Conditional Code Analysis

Analysis tools can:

Detect dead code: Find code that's never compiled
Track feature flags: Analyze conditional compilation usage
Generate reports: Show compilation configurations

Symbol Tracking

Track symbol definitions and usage:

Definition locations: Where symbols are defined
Usage contexts: Where symbols are referenced
Scope analysis: Symbol visibility across files

Performance Considerations

Preprocessing Optimization

Symbol caching: Cache symbol resolution results
Lazy evaluation: Process conditionals only when needed
Memory efficiency: Minimize directive storage overhead

Integration Efficiency

Single-pass processing: Process directives during parsing
Minimal backtracking: Avoid reparsing conditional blocks
Incremental updates: Support for incremental parsing with directive changes

The preprocessor directive system ensures that all C# preprocessing features are supported while maintaining the ability to analyze and manipulate code across different compilation configurations.

Spans

This page explains how source spans are represented and returned during parsing.

Span Type

Type: bsharp_parser::syntax::span::Span<'a>
Alias: type Span<'a> = nom_locate::LocatedSpan<&'a str>;
Provides line/column offsets and byte positions for parser errors and mapping.

#![allow(unused)]
fn main() {
// src/bsharp_parser/src/syntax/span.rs
pub type Span<'a> = nom_locate::LocatedSpan<&'a str>;
}

Parsing With Spans

Use the parser facade to parse and also get a span table for top-level declarations.

use bsharp_parser::facade::Parser;

// Assumes an enclosing function that returns a Result, so `?` can propagate errors.
let source = std::fs::read_to_string("Program.cs")?;
let (cu, spans) = Parser::new().parse_with_spans(&source)?;

The return value is (CompilationUnit, SpanTable).
SpanTable maps top-level declarations to byte ranges for later mapping.
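To show what "later mapping" from byte ranges can look like, here is a minimal std-only sketch. The `line_col` and `slice` helpers are hypothetical and not part of the BSharp API; they only illustrate how a byte offset or byte range relates to line/column positions and source text.

```rust
/// 1-based line and 1-based byte column of `offset` within `source`.
/// Panics if `offset` is out of range or not on a char boundary.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let prefix = &source[..offset];
    let line = prefix.matches('\n').count() + 1;
    let line_start = prefix.rfind('\n').map_or(0, |i| i + 1);
    (line, offset - line_start + 1)
}

/// Hypothetical helper: the source text covered by a byte range, which is the
/// kind of lookup a table of byte ranges makes possible.
fn slice(source: &str, range: std::ops::Range<usize>) -> &str {
    &source[range]
}
```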

Error Reporting

Pretty error formatting uses Span to print line/column with context:

#![allow(unused)]
fn main() {
use bsharp_parser::syntax::errors::format_error_tree;

let msg = format_error_tree(&source, &error_tree);
}

See: docs/parser/error-handling.md for details.

Syntax Traits

Core traits used by AST types and formatting emitters.

AstNode

Path: bsharp_syntax::node::ast_node::AstNode
Implemented by all syntax node types for traversal and visualization.

#![allow(unused)]
fn main() {
pub trait AstNode: Any {
    fn as_any(&self) -> &dyn Any;
    fn children<'a>(&'a self, _push: &mut dyn FnMut(NodeRef<'a>)) {}
    fn node_kind(&self) -> &'static str { core::any::type_name::<Self>() }
    fn node_label(&self) -> String { format!("{} ({})", self.node_kind(), core::any::type_name::<Self>()) }
}
}

Helpers:

NodeRef<'a> alias to DynNodeRef<'a> for dynamic traversal.
push_child(push, node) to push typed children.

Emit and Emitter

Path: bsharp_syntax::emitters::emit_trait::{Emit, Emitter, EmitCtx}
Emit is implemented by nodes that can render themselves as C# code.
Emitter writes items to String (or writer) using a mutable EmitCtx.

#![allow(unused)]
fn main() {
pub trait Emit {
    fn emit<W: std::fmt::Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}
}

EmitCtx controls indentation, simple policies, and optional JSONL tracing.

Rendering Helpers

Graph renderers in bsharp_syntax::node::render::{to_text, to_mermaid, to_dot} operate on &impl AstNode.

Derive Macros

Procedural macros used by syntax nodes to implement traversal and visualization behavior.

`#[derive(AstNode)]`

Crate: bsharp_syntax_derive
Implements: bsharp_syntax::node::ast_node::AstNode for your struct/enum
Purpose: Auto-generates children() to enable dynamic traversal via NodeRef/DynNodeRef.

How it works

For each field, the macro emits code to push children appropriately:

Option<T>: pushes inner T if present
Vec<T>: iterates and pushes each T
Box<T>: borrows inner &T and pushes it
Other types: treated as AST nodes by default
Primitive-like types are skipped: bool, numbers, char, String, and internal primitive enums like PrimitiveType

Excerpt from implementation (src/bsharp_syntax_derive/src/lib.rs):

#![allow(unused)]
fn main() {
#[proc_macro_derive(AstNode)]
pub fn derive_ast_node(input: TokenStream) -> TokenStream {
    // ...
    impl crate::node::ast_node::AstNode for #name {
        fn as_any(&self) -> &dyn ::core::any::Any { self }
        fn children<'a>(&'a self, push: &mut dyn FnMut(crate::node::ast_node::NodeRef<'a>)) {
            // Generated per-type based on fields
        }
    }
}
}

Helper routine decides how to push for common containers:

#![allow(unused)]
fn main() {
fn gen_push_for_type(ty: &Type, access: TokenStream) -> TokenStream {
    // Handles Option<T>, Vec<T>, Box<T>, or default to AST node push
}
}
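What the derive generates for each field kind can be pictured by writing the expansion by hand. The sketch below uses a simplified, hypothetical `AstNode` trait (children are pushed as `&dyn AstNode` rather than NodeRef) and made-up node types, so it shows the per-field rules for Box<T>, Vec<T>, Option<T>, and skipped primitives rather than the macro's literal output.

```rust
// Simplified, hypothetical trait; the real one pushes NodeRef values.
trait AstNode {
    fn kind(&self) -> &'static str;
    fn children<'a>(&'a self, _push: &mut dyn FnMut(&'a dyn AstNode)) {}
}

struct Name(String);
struct Literal(i64);

impl AstNode for Name {
    fn kind(&self) -> &'static str { "Name" }
}
impl AstNode for Literal {
    fn kind(&self) -> &'static str { "Literal" }
}

struct Call {
    callee: Box<Name>,
    args: Vec<Literal>,
    trailing: Option<Literal>,
    is_checked: bool,
}

// Hand-written equivalent of what a derive could emit for `Call`.
impl AstNode for Call {
    fn kind(&self) -> &'static str { "Call" }
    fn children<'a>(&'a self, push: &mut dyn FnMut(&'a dyn AstNode)) {
        push(&*self.callee);            // Box<T>: borrow the inner T
        for arg in &self.args {         // Vec<T>: push each element
            push(arg);
        }
        if let Some(t) = &self.trailing { // Option<T>: push only if present
            push(t);
        }
        // is_checked: primitive field, skipped
    }
}

/// Collects the kinds of a node's direct children.
fn child_kinds(node: &dyn AstNode) -> Vec<&'static str> {
    let mut out = Vec::new();
    node.children(&mut |c| out.push(c.kind()));
    out
}
```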

Usage

Add the derive to your AST types in bsharp_syntax:

#![allow(unused)]
fn main() {
#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum Expression {
    Literal(Literal),
    Variable(Identifier),
    Invocation(Box<InvocationExpression>),
    // ...
}
}

This enables:

Graph rendering via to_text, to_mermaid, to_dot
Traversal via AstWalker/Visit or Query API (by way of NodeRef children)

Guidelines

Ensure child fields are typed as AST nodes or containers of AST nodes for traversal to work.
Keep primitive data out of traversal (the derive already skips standard primitives).
Favor Box<T> for recursive enum variants to keep sizes reasonable.

Formatter and Emitters

This page describes the formatting architecture in BSharp, implemented in the bsharp_syntax crate.

Overview

The formatter is an AST-driven emitter that produces the final C# text directly. There is no post-processing pass (no normalize_text): the output is exactly what emitters write.

Core types:
- Formatter
- FormatOptions
Emission is instrumentable via a JSONL trace for debugging and profiling.

FormatOptions

#![allow(unused)]
fn main() {
pub struct FormatOptions {
    pub indent_width: usize,                      // default: 4 spaces
    pub newline: &'static str,                    // "\n" or "\r\n"
    pub max_consecutive_blank_lines: u8,          // default: 1
    pub blank_line_between_members: bool,         // default: true
    pub ensure_final_newline: bool,               // default: true (emit one final newline if any content)
    pub trim_trailing_whitespace: bool,           // default: true
    pub instrument_emission: bool,                // default: false
    pub trace_file: Option<std::path::PathBuf>,   // optional JSONL output
    pub current_file: Option<std::path::PathBuf>, // helpful in messages
}
}

Newline mode is controlled by CLI --newline-mode or defaults to LF.
Emission tracing can be toggled via CLI --emit-trace or BSHARP_EMIT_TRACE=1.

Brace Style and Spacing Policy

Brace style: All containers and headers use Allman style
- Header ends the line (e.g., namespace X, class C, void M())
- Next line is an opening {, indented body, then closing } on its own line.
Spacing is centralized in simple policy helpers (see src/bsharp_syntax/src/emitters/policy.rs):
- between_header_and_body_of_file → blank line between file header (e.g., file-scoped ns) and body
- after_file_scoped_namespace_header → blank line after namespace X.Y;
- between_using_blocks_and_declarations → blank line after using block before first declaration
- between_top_level_declarations → single separator newline between top-level declarations
- between_members → single separator newline between adjacent type members
- between_block_items → optional extra newline inside a block when a control-flow block (if/for/while/do/switch/inner block) is followed by a declaration

Notes:

Policies are invoked from emitters; emitters themselves keep logic minimal and do not hardcode extra blank lines.
Interfaces, classes, structs, and records call between_members between members; the boolean blank_line_between_members toggles this globally.
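The interplay of Allman braces and the between_members policy can be sketched in a few lines. This is a simplified illustration: the `Options` struct carries only three of the FormatOptions fields, and `emit_class` and this `between_members` signature are hypothetical, not the functions in policy.rs.

```rust
// Reduced, hypothetical subset of FormatOptions.
struct Options {
    indent_width: usize,
    newline: &'static str,
    blank_line_between_members: bool,
}

/// Policy hook: emits the separator between two adjacent members.
fn between_members(out: &mut String, opts: &Options) {
    if opts.blank_line_between_members {
        out.push_str(opts.newline);
    }
}

/// Emits a class in Allman style: header, then `{` on its own line,
/// indented members, then `}` on its own line.
fn emit_class(name: &str, members: &[&str], opts: &Options) -> String {
    let indent = " ".repeat(opts.indent_width);
    let mut out = String::new();
    out.push_str("class ");
    out.push_str(name);
    out.push_str(opts.newline);
    out.push('{');
    out.push_str(opts.newline);
    for (i, member) in members.iter().enumerate() {
        if i > 0 {
            between_members(&mut out, opts); // spacing decided by policy only
        }
        out.push_str(&indent);
        out.push_str(member);
        out.push_str(opts.newline);
    }
    out.push('}');
    out.push_str(opts.newline);
    out
}
```

Because the emitter never writes blank lines itself, flipping blank_line_between_members changes the layout in one place.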

End-of-file Newline

The CompilationUnit emitter ensures at most one final newline at EOF.
There are no per-statement trailing newlines at the root; separation is handled by policy functions.

Usage

#![allow(unused)]
fn main() {
use bsharp_syntax::{Formatter, FormatOptions};

let mut opts = FormatOptions::default();
opts.newline = "\n";
opts.max_consecutive_blank_lines = 1;
opts.blank_line_between_members = true;
opts.trim_trailing_whitespace = true;

let fmt = Formatter::new(opts);
let output = fmt.format_compilation_unit(&cu)?; // cu: CompilationUnit
}

Emission Trace (JSONL)

When instrumentation is enabled, the formatter emits a stream of JSON objects describing emission steps.

CLI integration:
- --emit-trace to enable
- --emit-trace-file <FILE> to write to a file (stdout by default)
- Env var BSHARP_EMIT_TRACE=1 acts as a default toggle

The trace can be useful to:

Diagnose spacing/blank line decisions (look for action: "policy" with names like between_members, between_top_level_declarations, between_block_items)
Identify costly emission paths
Reproduce formatting anomalies

Typical actions include: enter_node, open_brace, close_brace, newline, space, token, and policy.

Integration with CLI

See bsharp format in docs/cli/format.md for options mapping to FormatOptions.
Files that fail to parse are skipped; a summary is printed.
With --write false on a single file input, the formatted output is printed to stdout.

Design Notes

Emitters are AST-driven to preserve structure while normalizing whitespace and layout based on policies.
The formatter avoids changing semantics and focuses on consistent style.
Options default to safe, conservative values and can be tuned via CLI.

Analysis Framework Overview

The BSharp analysis framework provides a comprehensive suite of tools for analyzing C# code at various levels of detail. It is built on top of the BSharp parser infrastructure and offers insights into code structure, quality, dependencies, and maintainability. These capabilities support standalone analysis tools and editor/CI integrations.

Analysis Architecture

The analysis framework is organized into specialized modules:

src/bsharp_analysis/src/
├── framework/        # pipeline, passes, registry, session, walker, query
├── passes/           # indexing, metrics, control_flow, dependencies, reporting
├── artifacts/        # symbols, cfg, dependencies
├── metrics/          # AstAnalysis data + shared helpers
├── rules/            # naming, semantic, control_flow_smells
├── report/           # AnalysisReport assembly
└── (no quality module)

Analysis Capabilities

Control Flow Analysis

Path Analysis: Identify all possible execution paths through methods
Reachability: Detect unreachable code sections
Complexity Metrics: Calculate cyclomatic complexity and other flow-based metrics
Dead Code Detection: Find code that can never be executed

Dependency Analysis

Type Dependencies: Track relationships between types
Assembly Dependencies: Analyze external assembly usage
Circular Dependencies: Detect problematic dependency cycles
Coupling Metrics: Measure afferent and efferent coupling

Code Metrics

Comprehensive metrics collection across multiple dimensions:

Complexity Metrics

Cyclomatic Complexity
Cognitive Complexity
Nesting Depth
Method Length

Size Metrics

Lines of Code (LOC)
Source Lines of Code (SLOC)
Comment Lines
Method Count per Class

Maintainability Metrics

Maintainability Index
Technical Debt Indicators
Code Duplication Detection
Halstead Metrics

Rules

Naming Rules: Basic naming convention checks
Control Flow Smells: Simple flow-related smells (e.g., deep nesting warnings)

Type Analysis

Type Usage: Track how types are used throughout the codebase
Generic Analysis: Analyze generic type usage patterns
Inheritance Hierarchies: Map class and interface hierarchies
Interface Compliance: Validate interface implementations

Analysis Workflow

1. AST Preparation

All analysis begins with a parsed AST:

#![allow(unused)]
fn main() {
use bsharp_parser::facade::Parser;

let parser = Parser::new();
let compilation_unit = parser.parse(source_code)?;
}

2. Pipeline

Use the framework pipeline with registered passes. Per-file runs populate typed artifacts; a final AnalysisReport summarizes metrics, control flow, and dependencies.

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::report::AnalysisReport;
use bsharp_parser::facade::Parser;

let parser = Parser::new();
let (cu, spans) = parser.parse_with_spans(source_code)?;
let ctx = AnalysisContext::new("file.cs", source_code);
let mut session = AnalysisSession::new(ctx, spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let report: AnalysisReport = AnalysisReport::from_session(&session);
}

3. Analysis Execution

The pipeline runs passes in phases:

Index → Metrics (local) → Global (CFG, deps) → Semantic rules → Reporting

Artifacts (e.g., AstAnalysis, ControlFlowIndex, DependencyGraph) are inserted into the AnalysisSession and consumed by reporting.

4. Results Processing

Analysis results are structured for easy consumption:

#![allow(unused)]
fn main() {
// Metrics results (report.metrics is an Option<AstAnalysis>)
if let Some(metrics) = &report.metrics {
    println!("Cyclomatic Complexity: {}", metrics.cyclomatic_complexity);
    println!("Lines of Code: {}", metrics.lines_of_code);
}

// Diagnostics
for d in &report.diagnostics.diagnostics {
    println!("{}: {}", d.code, d.message);
}
}

Analysis Registry and Passes

Analyses are implemented as AnalyzerPass implementations registered in an AnalyzerRegistry and executed by the AnalyzerPipeline. Local rulesets and semantic rulesets run alongside passes based on Phase.

Configuration and Customization

Analysis Configuration

Analyzers can be configured for different scenarios:

#![allow(unused)]
fn main() {
let config = AnalysisConfig {
    max_cyclomatic_complexity: 10,
    max_method_length: 50,
    enforce_naming_conventions: true,
    detect_code_smells: true,
    // ... other configuration options
};

let analyzer = MetricsAnalyzer::with_config(config);
}

Custom Rules

Extend analysis with custom rules:

#![allow(unused)]
fn main() {
let custom_analyzer = QualityAnalyzer::new()
    .add_rule(CustomRule::new("no-goto-statements"))
    .add_rule(CustomRule::new("max-parameters", 5))
    .add_rule(CustomRule::new("prefer-composition"));
}

Reporting Options

Flexible reporting formats:

#![allow(unused)]
fn main() {
// JSON output
let json_report = analyzer.analyze(&ast).to_json();

// XML output
let xml_report = analyzer.analyze(&ast).to_xml();

// Custom format
let custom_report = analyzer.analyze(&ast).format_with(custom_formatter);
}

Integration Points

CLI Integration

Analysis capabilities are exposed through the analyze command and configured via options (format, config, include/exclude, enable/disable passes and rulesets, severity overrides). See docs/cli/analyze.md for details.

Programmatic Usage

Direct integration in tools typically runs the pipeline and pulls artifacts from the session:

#![allow(unused)]
fn main() {
use std::fs;

use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::{AnalyzerPipeline, AnalysisSession};
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

let source = fs::read_to_string(path)?;
let (cu, spans) = Parser::new().parse_with_spans(&source)?;
let mut session = AnalysisSession::new(AnalysisContext::new(path, &source), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
if let Some(ast) = session.artifacts.get::<AstAnalysis>() {
    println!("methods={} complexity={}", ast.total_methods, ast.cyclomatic_complexity);
}
}

Performance Characteristics

Analysis Performance

Incremental Analysis: Support for analyzing only changed parts
Parallel Processing: Multi-threaded analysis for large codebases
Memory Efficiency: Minimal memory overhead during analysis
Caching: Results caching for repeated analysis

Scalability

The framework scales from single files to large enterprise codebases:

Single file analysis: Sub-second performance
Medium projects (100+ files): Seconds to minutes
Large codebases (1000+ files): Minutes with parallel processing

This analysis framework provides the foundation for building sophisticated code quality tools, IDE integrations, and automated code review systems.

Analysis Pipeline

This document describes the analysis pipeline architecture, artifacts, rulesets, configuration toggles, and determinism guarantees in the BSharp analyzer.

Phases

The pipeline runs in deterministic phases (see src/bsharp_analysis/src/framework/pipeline.rs):

Index
- Runs early passes like IndexingPass to populate core artifacts (SymbolIndex, NameIndex, FqnMap).
Local Rules
- Runs per-file passes such as MetricsPass (Query-based) to compute artifacts like AstAnalysis.
- Local rulesets run here as well; use bsharp_analysis::framework::Query for AST enumeration.
Global
- Passes that aggregate information across the file (or project) after initial indexing.
Semantic
- Rules and passes that require previously built artifacts (e.g., control flow, dependencies).
Reporting
- Finalization phase that can synthesize report artifacts.

Each phase is explicitly selected in AnalyzerPipeline::run_for_file() using Phase discriminants. Pass and ruleset registration is driven by AnalyzerRegistry.
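The phase ordering can be pictured with a small sketch. The Phase enum below mirrors the list above, while PassInfo and execution_order are hypothetical names used only for this illustration; the real pipeline selects passes per phase inside run_for_file() and may be structured differently.

```rust
/// Mirror of the phase list; declaration order defines execution order
/// through the derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Phase {
    Index,
    LocalRules,
    Global,
    Semantic,
    Reporting,
}

/// Hypothetical registry entry: a pass id plus the phase it declares.
pub struct PassInfo {
    pub id: &'static str,
    pub phase: Phase,
}

/// Returns pass ids in the order they would run. The sort is stable, so
/// passes within one phase keep their registration order.
pub fn execution_order(passes: &[PassInfo]) -> Vec<&'static str> {
    let mut ordered: Vec<&PassInfo> = passes.iter().collect();
    ordered.sort_by_key(|p| p.phase);
    ordered.into_iter().map(|p| p.id).collect()
}
```

Because the order depends only on the declared phase and the registration order, repeated runs over the same registry visit passes in the same sequence.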

Artifacts

Artifacts are stored in the per-file AnalysisSession.artifacts and summarized into an AnalysisReport:

Symbols (src/bsharp_analysis/src/artifacts/symbols.rs)
- SymbolIndex (by id and name), NameIndex (name frequencies), FqnMap (local name → FQNs).
Control Flow (src/bsharp_analysis/src/artifacts/cfg.rs)
- ControlFlowIndex keyed per method; summarized to CfgSummary with total methods and smell counts.
Dependencies (src/bsharp_analysis/src/artifacts/dependencies.rs)
- Graph keyed by symbols; summarized to node/edge counts.
Metrics (src/bsharp_analysis/src/metrics/core.rs → AstAnalysis)
- Basic metrics gathered during the local traversal.

Artifacts are optional in the final report; missing artifacts simply result in None summaries.
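A type-keyed store is one common way to get this "present or None" behaviour. The sketch below is an assumption about the general shape, not the actual AnalysisSession.artifacts implementation; ArtifactStore, the two placeholder artifact structs, and summarize are hypothetical.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical artifact store keyed by the artifact's Rust type.
#[derive(Default)]
pub struct ArtifactStore {
    items: HashMap<TypeId, Box<dyn Any>>,
}

impl ArtifactStore {
    /// Inserts (or replaces) the artifact of type `T`.
    pub fn insert<T: Any>(&mut self, value: T) {
        self.items.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Returns the artifact of type `T`, or `None` if no pass produced it.
    pub fn get<T: Any>(&self) -> Option<&T> {
        self.items
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

/// Placeholder artifacts, reduced to one field each for the sketch.
pub struct ControlFlowIndex {
    pub method_count: usize,
}
pub struct DependencyGraph {
    pub edge_count: usize,
}

/// A missing artifact simply becomes `None` in the summary tuple
/// (control-flow method count, dependency edge count).
pub fn summarize(store: &ArtifactStore) -> (Option<usize>, Option<usize>) {
    (
        store.get::<ControlFlowIndex>().map(|c| c.method_count),
        store.get::<DependencyGraph>().map(|d| d.edge_count),
    )
}
```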

Rulesets and Passes

Rules implement the Rule trait and are grouped into logical rulesets. Passes implement AnalyzerPass and declare a Phase:

Rulesets are separated into Local vs. Semantic groups and executed during the respective phases.
Passes can be toggled individually by id.
The registry is created with AnalyzerRegistry::from_config(&AnalysisConfig) to honor config toggles.
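The toggle lookup can be sketched as a filter over registered pass ids. The AnalysisConfig below keeps only the enable_passes map, and both the enabled_passes helper and the rule that an unlisted pass stays enabled are assumptions made for this example rather than documented behaviour of AnalyzerRegistry::from_config.

```rust
use std::collections::HashMap;

/// Cut-down stand-in for the real config (only the pass toggles).
pub struct AnalysisConfig {
    pub enable_passes: HashMap<String, bool>,
}

/// Keeps every registered pass id that is not explicitly disabled.
/// Assumption: a pass that is absent from the map stays enabled.
pub fn enabled_passes<'a>(registered: &[&'a str], config: &AnalysisConfig) -> Vec<&'a str> {
    registered
        .iter()
        .copied()
        .filter(|id| config.enable_passes.get(*id).copied().unwrap_or(true))
        .collect()
}
```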

Configuration

AnalysisConfig (src/bsharp_analysis/src/context.rs) controls thresholds and toggles:

Control flow thresholds
- cf_high_complexity_threshold (default 10)
- cf_deep_nesting_threshold (default 4)
Toggles
- enable_rulesets: HashMap<String, bool>
- enable_passes: HashMap<String, bool>
- rule_severities: HashMap<String, DiagnosticSeverity>
Workspace filters
- workspace.follow_refs: bool
- workspace.include: Vec<String> (glob patterns)
- workspace.exclude: Vec<String> (glob patterns)

CLI maps flags to these fields in src/bsharp_cli/src/commands/analyze.rs and supports TOML/JSON config files.

Workspace Analysis and Determinism

AnalyzerPipeline::run_workspace() and run_workspace_with_config():

Discover files deterministically by sorting absolute paths and deduping.
Analyze each file independently, then merge artifacts into a single AnalysisReport.
Diagnostics are sorted by file, line, column, then diagnostic code for stable output.
Workspace loader warnings/errors are merged into workspace_warnings (sorted, deduped).
When the parallel_analysis feature is enabled, files are analyzed in parallel but merged deterministically in path order.
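The ordering guarantees listed above come down to two sorts. The following sketch shows them on simplified types; Diagnostic here is a reduced, hypothetical struct, and the helper names are not taken from the pipeline source.

```rust
use std::path::PathBuf;

/// Reduced diagnostic carrying only the fields used as sort keys.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Diagnostic {
    pub file: String,
    pub line: u32,
    pub column: u32,
    pub code: String,
}

/// Sorts discovered paths and removes duplicates so that the analysis
/// order does not depend on directory enumeration order.
pub fn stable_file_order(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    paths.sort();
    paths.dedup();
    paths
}

/// Orders diagnostics by file, then line, then column, then code.
pub fn sort_diagnostics(diags: &mut [Diagnostic]) {
    diags.sort_by(|a, b| {
        (&a.file, a.line, a.column, &a.code).cmp(&(&b.file, b.line, b.column, &b.code))
    });
}
```

With both sorts in place, results produced in parallel can be merged in any completion order and still serialize identically.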

Report Schema

AnalysisReport (src/bsharp_analysis/src/report/mod.rs) includes:

schema_version: u32 (currently 1)
diagnostics: DiagnosticCollection
metrics: Option<AstAnalysis>
cfg: Option<CfgSummary>
deps: Option<DependencySummary>
workspace_warnings: Vec<String>
workspace_errors: Vec<String> (reserved for future use)

The JSON shape is intentionally stable; tests use snapshots with path normalization to ensure cross-platform consistency.

Testing Guidance

Prefer deterministic fixtures under tests/fixtures/.
Normalize absolute paths in snapshots (see tests/integration/workspace_analysis_snapshot.rs).
For workspace filtering, use run_workspace_with_config() with include/exclude globs and snapshot the resulting report.

Analysis Traversal Guide

This guide explains how to traverse BSharp AST statements and expressions in analysis passes using the current framework.

Source files:
- src/bsharp_analysis/src/framework/walker.rs
- src/bsharp_analysis/src/framework/query/
- src/bsharp_analysis/src/passes/*

Statement traversal

Use AstWalker for single-pass traversal with the Visit trait, or the Query API for typed filtering.

Example using AstWalker + Visit to count if statements:

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{AstWalker, Visit, NodeRef, AnalysisSession};

struct CountIfs { pub ifs: usize }
impl Visit for CountIfs {
    fn enter(&mut self, node: &NodeRef, _session: &mut AnalysisSession) {
        if let NodeRef::Statement(s) = node {
            if matches!(s, bsharp_syntax::statements::statement::Statement::If(_)) {
                self.ifs += 1;
            }
        }
    }
}
}

Expression traversal

Use Query for typed expression searches:

#![allow(unused)]
fn main() {
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::expressions::AwaitExpression;

let await_count = Query::from(NodeRef::CompilationUnit(&cu))
    .of::<AwaitExpression>()
    .count();
}

Putting it together

When analyzing methods, you typically:

Parse the compilation unit and build the analysis session.
For each method body (a Statement::Block), compute metrics by walking statements and expressions.

Example (from ControlFlowPass pattern):

#![allow(unused)]
fn main() {
use bsharp_analysis::artifacts::cfg::{ControlFlowIndex, MethodControlFlowStats};
use bsharp_syntax::statements::statement::Statement;

fn stats_for_method(body: Option<&Statement>) -> MethodControlFlowStats {
    let complexity = match body { Some(s) => 1 + decision_points(s), None => 1 };
    let max_nesting = calc_max_nesting(body, 0);
    let exit_points = count_exit_points(body);
    let statement_count = count_statements(body);
    MethodControlFlowStats { complexity, max_nesting, exit_points, statement_count }
}
}

See src/bsharp_analysis/src/metrics/shared.rs for helpers like decision_points, max_nesting_of, count_statements and src/bsharp_analysis/src/passes/control_flow.rs for usage.

Tips

Keep walkers side-effect free; accumulate results in the visitor (as CountIfs does above) or in closures.
Prefer small, focused passes that use the walkers rather than embedding traversal in each pass.
If a construct is not being traversed, add it to the walker first to avoid duplicated traversal logic.

Control Flow Analysis

The control flow analysis system analyzes method control flow to calculate complexity metrics, detect control flow smells, and identify potential issues.

Overview

Location: src/bsharp_analysis/src/passes/control_flow.rs, src/bsharp_analysis/src/artifacts/cfg.rs

Control flow analysis provides:

Cyclomatic complexity calculation
Maximum nesting depth tracking
Exit point counting
Statement counting
Control flow smell detection

Control Flow Metrics

Cyclomatic Complexity

Definition: Number of linearly independent paths through a method

Calculation: CC = 1 + number of decision points

Decision Points:

if statements
case labels in switch
Loop statements (for, foreach, while, do-while)
catch clauses
Logical operators (&&, ||) in conditions
Ternary operators (?:)
Null-coalescing operators (??)

Example:

public void ProcessOrder(Order order) {  // CC = 1 (base)
    if (order == null) {                 // +1 = 2
        throw new ArgumentNullException();
    }
    
    if (order.IsValid) {                 // +1 = 3
        if (order.Amount > 1000) {       // +1 = 4
            ApplyDiscount(order);
        }
        SaveOrder(order);
    }
}
// Total CC = 4

Maximum Nesting Depth

Definition: Deepest level of nested control structures

Example:

public void Example() {
    if (condition1) {              // Depth 1
        while (condition2) {       // Depth 2
            if (condition3) {      // Depth 3
                DoSomething();
            }
        }
    }
}
// Max Nesting Depth = 3
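The depth calculation can be sketched on a toy statement tree. The Stmt enum below is a hypothetical simplification (the real AST lives in bsharp_syntax), and max_nesting is an illustrative helper rather than the function used by ControlFlowPass.

```rust
/// Toy statement tree used only for this sketch.
pub enum Stmt {
    /// Any non-nesting statement, e.g. an expression statement.
    Simple,
    /// A `{ ... }` block; it adds no depth by itself.
    Block(Vec<Stmt>),
    /// Control structures: their body is one level deeper.
    If(Box<Stmt>),
    While(Box<Stmt>),
}

/// Returns the deepest control-structure nesting reached below `stmt`,
/// given that `stmt` itself sits at `depth`.
pub fn max_nesting(stmt: &Stmt, depth: usize) -> usize {
    match stmt {
        Stmt::Simple => depth,
        Stmt::Block(items) => items
            .iter()
            .map(|s| max_nesting(s, depth))
            .max()
            .unwrap_or(depth),
        Stmt::If(body) | Stmt::While(body) => max_nesting(body, depth + 1),
    }
}
```

Applied to a tree shaped like the C# example (if, then while, then if), the function returns 3.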

Exit Points

Definition: Number of points at which a method can exit

Counted:

return statements
throw statements
End of void method

Example:

public int Calculate(int x) {
    if (x < 0) {
        return -1;        // Exit point 1
    }
    if (x == 0) {
        return 0;         // Exit point 2
    }
    return x * 2;         // Exit point 3
}
// Total Exit Points = 3
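Counting explicit exit points is a plain recursive sum. The sketch below uses a hypothetical toy Stmt enum and counts only return and throw statements; the implicit exit at the end of a void method, which the list above also mentions, is not modeled here.

```rust
/// Toy statement tree used only for this sketch.
pub enum Stmt {
    Return,
    Throw,
    /// Any statement that neither exits nor contains other statements.
    Other,
    Block(Vec<Stmt>),
    /// An `if` with its body (the condition is irrelevant for counting).
    If(Box<Stmt>),
}

/// Counts explicit exit points (return and throw) below `stmt`.
pub fn count_exit_points(stmt: &Stmt) -> usize {
    match stmt {
        Stmt::Return | Stmt::Throw => 1,
        Stmt::Other => 0,
        Stmt::Block(items) => items.iter().map(count_exit_points).sum::<usize>(),
        Stmt::If(body) => count_exit_points(body),
    }
}
```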

Statement Count

Definition: Total number of statements in method body

Includes all statement types:

Expression statements
Declaration statements
Control flow statements
Jump statements

Control Flow Artifacts

MethodControlFlowStats

#![allow(unused)]
fn main() {
pub struct MethodControlFlowStats {
    pub complexity: usize,
    pub max_nesting: usize,
    pub exit_points: usize,
    pub statement_count: usize,
}
}

ControlFlowIndex

#![allow(unused)]
fn main() {
pub struct ControlFlowIndex {
    // Method identifier -> stats
    methods: HashMap<String, MethodControlFlowStats>,
}
}

CfgSummary

#![allow(unused)]
fn main() {
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_count: usize,
    pub deep_nesting_count: usize,
}
}
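A summary of this shape can be derived from the per-method stats by counting entries above each threshold. The sketch below repeats the two structs so that it is self-contained; the summarize function and its parameters are assumptions for illustration, not the signature used in cfg.rs.

```rust
use std::collections::HashMap;

pub struct MethodControlFlowStats {
    pub complexity: usize,
    pub max_nesting: usize,
    pub exit_points: usize,
    pub statement_count: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_count: usize,
    pub deep_nesting_count: usize,
}

/// Hypothetical summarizer: a method counts as a smell only when it is
/// strictly above the threshold, matching the `>` comparisons used by
/// the detection snippets in this document.
pub fn summarize(
    methods: &HashMap<String, MethodControlFlowStats>,
    high_complexity_threshold: usize,
    deep_nesting_threshold: usize,
) -> CfgSummary {
    CfgSummary {
        total_methods: methods.len(),
        high_complexity_count: methods
            .values()
            .filter(|s| s.complexity > high_complexity_threshold)
            .count(),
        deep_nesting_count: methods
            .values()
            .filter(|s| s.max_nesting > deep_nesting_threshold)
            .count(),
    }
}
```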

Control Flow Smells

High Complexity

Threshold: Configurable (default: 10)

Detection:

#![allow(unused)]
fn main() {
let threshold = config.cf_high_complexity_threshold;
if stats.complexity > threshold {
    session.diagnostics.add(
        DiagnosticCode::HighComplexity,
        format!("Method complexity {} exceeds threshold {}",
               stats.complexity, threshold)
    );
}
}

Diagnostic:

warning[CF002]: High cyclomatic complexity
  --> src/OrderProcessor.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ complexity = 15 (threshold: 10)
   |
   = help: Consider breaking this method into smaller methods

Deep Nesting

Threshold: Configurable (default: 4)

Detection:

#![allow(unused)]
fn main() {
let threshold = config.cf_deep_nesting_threshold;
if stats.max_nesting > threshold {
    session.diagnostics.add(
        DiagnosticCode::DeepNesting,
        format!("Maximum nesting depth {} exceeds threshold {}",
               stats.max_nesting, threshold)
    );
}
}

Diagnostic:

warning[CF003]: Deep nesting detected
  --> src/Validator.cs:15:9
   |
15 |         if (condition1) {
   |         ^^ nesting depth = 5 (threshold: 4)
   |
   = help: Consider extracting nested logic into separate methods

Implementation

Analysis Pass

Location: src/bsharp_analysis/src/passes/control_flow.rs

#![allow(unused)]
fn main() {
pub struct ControlFlowPass;

impl AnalyzerPass for ControlFlowPass {
    fn id(&self) -> &'static str { "control_flow" }
    fn phase(&self) -> Phase { Phase::Semantic }
    
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        let mut index = ControlFlowIndex::new();
        
        // Analyze all methods in compilation unit
        for decl in &cu.declarations {
            analyze_declaration(decl, &mut index, session);
        }
        
        session.artifacts.cfg = Some(index);
    }
}
}

Method Analysis

#![allow(unused)]
fn main() {
fn analyze_method(
    method: &MethodDeclaration,
    index: &mut ControlFlowIndex,
    session: &mut AnalysisSession
) {
    let stats = calculate_stats(method.body.as_ref());
    
    // Check thresholds
    if stats.complexity > session.config.cf_high_complexity_threshold {
        session.diagnostics.add(/* high complexity diagnostic */);
    }
    
    if stats.max_nesting > session.config.cf_deep_nesting_threshold {
        session.diagnostics.add(/* deep nesting diagnostic */);
    }
    
    // Store in index
    index.add_method(&method.identifier.name, stats);
}
}

Stats Calculation

#![allow(unused)]
fn main() {
fn calculate_stats(body: Option<&Statement>) -> MethodControlFlowStats {
    let complexity = match body {
        Some(stmt) => 1 + count_decision_points(stmt),
        None => 1,
    };
    
    let max_nesting = calculate_max_nesting(body, 0);
    let exit_points = count_exit_points(body);
    let statement_count = count_statements(body);
    
    MethodControlFlowStats {
        complexity,
        max_nesting,
        exit_points,
        statement_count,
    }
}
}

Configuration

Thresholds

[analysis.control_flow]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4

CLI Usage

# Analyze with custom thresholds
bsharp analyze MyProject.csproj --config .bsharp.toml

# Enable control flow pass
bsharp analyze MyProject.csproj --enable-pass control_flow

Integration with Pipeline

Phase: Semantic

Control flow analysis runs in the Semantic phase after symbol indexing:

Phase::Index      -> Build SymbolIndex
Phase::LocalRules -> Collect metrics
Phase::Semantic   -> Control flow analysis

Artifacts

Results stored in AnalysisSession:

#![allow(unused)]
fn main() {
session.artifacts.cfg = Some(ControlFlowIndex { ... });
}

Summarized in AnalysisReport:

#![allow(unused)]
fn main() {
report.cfg = Some(CfgSummary {
    total_methods: 87,
    high_complexity_count: 5,
    deep_nesting_count: 3,
});
}

Analysis Pipeline - Pipeline integration
Metrics Collection - Related metrics
Code Quality - Quality rules
Traversal Guide - AST traversal

References

Implementation: src/bsharp_analysis/src/passes/control_flow.rs
Artifacts: src/bsharp_analysis/src/artifacts/cfg.rs
Tests: src/bsharp_tests/src/analysis/control_flow/ (planned)

Dependency Analysis

The dependency analysis system tracks relationships between types, methods, and other symbols in C# code to identify coupling, circular dependencies, and architectural issues.

Overview

Location: src/bsharp_analysis/src/artifacts/dependencies.rs

The dependency analysis builds a directed graph of symbol relationships, where:

Nodes represent symbols (classes, interfaces, methods, etc.)
Edges represent dependencies (inheritance, method calls, field types, etc.)

Dependency Types

Type Dependencies

Inheritance:

public class Derived : Base { }  // Derived depends on Base

Interface Implementation:

public class MyClass : IInterface { }  // MyClass depends on IInterface

Field Types:

public class Container {
    private Helper helper;  // Container depends on Helper
}

Method Parameters and Return Types:

public Response Process(Request req) { }  // Process depends on Request and Response

Member Dependencies

Method Calls:

public void Caller() {
    Helper.DoSomething();  // Caller depends on Helper.DoSomething
}

Property Access:

var value = obj.Property;  // Depends on Property

Constructor Calls:

var instance = new MyClass();  // Depends on MyClass constructor

Dependency Graph Structure

DependencyGraph

#![allow(unused)]
fn main() {
pub struct DependencyGraph {
    // Symbol ID -> list of symbols it depends on
    dependencies: HashMap<SymbolId, Vec<SymbolId>>,
}
}

Operations

Adding Dependencies:

#![allow(unused)]
fn main() {
graph.add_dependency(from_symbol, to_symbol);
}

Querying Dependencies:

#![allow(unused)]
fn main() {
// Direct dependencies
let deps = graph.get_dependencies(symbol_id);

// Transitive dependencies
let all_deps = graph.get_transitive_dependencies(symbol_id);

// Reverse dependencies (who depends on this symbol)
let dependents = graph.get_dependents(symbol_id);
}
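The three queries can be sketched on top of the adjacency map shown earlier. Everything below is an illustrative assumption: SymbolId is modeled as a plain u32, and the method bodies are one possible implementation, not the code in dependencies.rs.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the real symbol identifier type.
pub type SymbolId = u32;

#[derive(Default)]
pub struct DependencyGraph {
    // Symbol ID -> list of symbols it depends on
    dependencies: HashMap<SymbolId, Vec<SymbolId>>,
}

impl DependencyGraph {
    pub fn add_dependency(&mut self, from: SymbolId, to: SymbolId) {
        self.dependencies.entry(from).or_default().push(to);
    }

    /// Direct dependencies, in insertion order.
    pub fn get_dependencies(&self, id: SymbolId) -> &[SymbolId] {
        self.dependencies.get(&id).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Everything reachable from `id`. If `id` is part of a cycle, the
    /// result contains `id` itself.
    pub fn get_transitive_dependencies(&self, id: SymbolId) -> BTreeSet<SymbolId> {
        let mut seen = BTreeSet::new();
        let mut stack: Vec<SymbolId> = self.get_dependencies(id).to_vec();
        while let Some(next) = stack.pop() {
            // `insert` returns false for already-visited nodes, which
            // keeps the walk finite on cyclic graphs.
            if seen.insert(next) {
                stack.extend_from_slice(self.get_dependencies(next));
            }
        }
        seen
    }

    /// Reverse lookup: every symbol that depends directly on `id`.
    pub fn get_dependents(&self, id: SymbolId) -> BTreeSet<SymbolId> {
        self.dependencies
            .iter()
            .filter(|(_, targets)| targets.contains(&id))
            .map(|(from, _)| *from)
            .collect()
    }
}
```

The reverse lookup scans the whole map; a graph that answers this query often would more likely maintain a second, inverted map.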

Circular Dependency Detection

Algorithm

The analysis uses depth-first search to detect cycles in the dependency graph:

Start from each symbol
Traverse dependencies depth-first
Track visited nodes in current path
If we revisit a node in current path, cycle detected
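The steps above can be sketched as a recursive depth-first search that keeps the current path explicitly. The Graph alias and the function names are hypothetical, nodes are plain string names for readability, and the sketch returns only the first cycle it finds rather than all of them.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Node name -> names it depends on (hypothetical simplified graph).
pub type Graph = BTreeMap<&'static str, Vec<&'static str>>;

/// Returns the first cycle found, as a path whose first and last node
/// are the same, or `None` if the graph is acyclic.
pub fn find_cycle(graph: &Graph) -> Option<Vec<&'static str>> {
    let mut done = BTreeSet::new();
    for &start in graph.keys() {
        let mut path = Vec::new();
        if let Some(cycle) = visit(graph, start, &mut path, &mut done) {
            return Some(cycle);
        }
    }
    None
}

fn visit(
    graph: &Graph,
    node: &'static str,
    path: &mut Vec<&'static str>,
    done: &mut BTreeSet<&'static str>,
) -> Option<Vec<&'static str>> {
    // Revisiting a node that is on the current path means a cycle.
    if let Some(pos) = path.iter().position(|n| *n == node) {
        let mut cycle = path[pos..].to_vec();
        cycle.push(node);
        return Some(cycle);
    }
    // Nodes fully explored earlier cannot start a new cycle.
    if done.contains(node) {
        return None;
    }
    path.push(node);
    for &next in graph.get(node).map(Vec::as_slice).unwrap_or(&[]) {
        if let Some(cycle) = visit(graph, next, path, done) {
            return Some(cycle);
        }
    }
    path.pop();
    done.insert(node);
    None
}
```

The done set keeps the search linear in the size of the graph, because each node is fully explored at most once.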

Example

public class A {
    private B b;  // A depends on B
}

public class B {
    private C c;  // B depends on C
}

public class C {
    private A a;  // C depends on A -> CYCLE: A -> B -> C -> A
}

Detection:

#![allow(unused)]
fn main() {
let cycles = graph.find_cycles();
for cycle in cycles {
    // Report diagnostic for circular dependency
    session.diagnostics.add(
        DiagnosticCode::CircularDependency,
        format!("Circular dependency detected: {:?}", cycle)
    );
}
}

Coupling Metrics

Afferent Coupling (Ca)

Definition: Number of types that depend on this type (incoming dependencies)

Interpretation:

High Ca = Many types depend on this type (responsibility)
Type is stable and hard to change

Efferent Coupling (Ce)

Definition: Number of types this type depends on (outgoing dependencies)

Interpretation:

High Ce = This type depends on many others
Type is unstable and sensitive to changes

Instability (I)

Formula: I = Ce / (Ca + Ce)

Range: 0.0 to 1.0

0.0 = Maximally stable (only incoming dependencies)
1.0 = Maximally unstable (only outgoing dependencies)

Example:

#![allow(unused)]
fn main() {
let ca = graph.afferent_coupling(symbol_id);
let ce = graph.efferent_coupling(symbol_id);
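// Guard against Ca + Ce == 0 (an isolated symbol), which would yield NaN;
// such a symbol is treated as maximally stable here.
let total = ca + ce;
let instability = if total == 0 { 0.0 } else { ce as f64 / total as f64 };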

if instability > 0.8 {
    // Highly unstable type - consider refactoring
}
}

Dependency Summary

DependencySummary

#![allow(unused)]
fn main() {
pub struct DependencySummary {
    pub total_nodes: usize,
    pub total_edges: usize,
    pub circular_dependencies: usize,
    pub max_depth: usize,
}
}

Generated by: DependencyGraph::summarize()

Included in: AnalysisReport

Usage in Analysis Pipeline

Phase: Global

Dependency analysis runs in the Global phase after symbol indexing:

In AnalyzerPipeline:

Phase::Index    -> Build SymbolIndex
Phase::Global   -> Build DependencyGraph
Phase::Semantic -> Use dependencies for semantic analysis

Integration with Passes

DependencyPass (if implemented):

#![allow(unused)]
fn main() {
impl AnalyzerPass for DependencyPass {
    fn id(&self) -> &'static str { "dependencies" }
    fn phase(&self) -> Phase { Phase::Global }
    
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        let graph = build_dependency_graph(cu, &session.artifacts.symbols);
        session.artifacts.dependencies = Some(graph);
    }
}
}

Building Dependency Graph

From CompilationUnit

#![allow(unused)]
fn main() {
pub fn build_dependency_graph(
    cu: &CompilationUnit,
    symbols: &SymbolIndex
) -> DependencyGraph {
    let mut graph = DependencyGraph::new();
    
    // Visit all declarations
    for decl in &cu.declarations {
        match decl {
            TopLevelDeclaration::Class(class) => {
                analyze_class_dependencies(class, symbols, &mut graph);
            }
            // ... other declaration types
        }
    }
    
    graph
}
}

From Class Declaration

#![allow(unused)]
fn main() {
fn analyze_class_dependencies(
    class: &ClassDeclaration,
    symbols: &SymbolIndex,
    graph: &mut DependencyGraph
) {
    let class_symbol = symbols.lookup(&class.identifier.name);
    
    // Base types
    for base_type in &class.base_types {
        if let Some(base_symbol) = resolve_type(base_type, symbols) {
            graph.add_dependency(class_symbol, base_symbol);
        }
    }
    
    // Members
    for member in &class.body_declarations {
        analyze_member_dependencies(member, class_symbol, symbols, graph);
    }
}
}

Dependency Visualization

Dependency Matrix

Generate a matrix showing which types depend on which:

        A  B  C  D
    A   -  X  -  X
    B   -  -  X  -
    C   X  -  -  -
    D   -  -  -  -

Row A, Column B = X means A depends on B
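A matrix like this can be rendered directly from an adjacency map. The sketch below is a hypothetical helper (dependency_matrix is not an existing BSharp function) that uses single-character type names so the columns line up without extra padding logic.

```rust
use std::collections::BTreeMap;

/// Renders one header row plus one row per type. A cell is "X" when the
/// row type depends on the column type and "-" otherwise.
pub fn dependency_matrix(deps: &BTreeMap<char, Vec<char>>, names: &[char]) -> Vec<String> {
    let header: Vec<String> = names.iter().map(|n| n.to_string()).collect();
    let mut rows = vec![format!("    {}", header.join("  "))];
    for &from in names {
        let cells: Vec<&str> = names
            .iter()
            .map(|to| {
                let depends = deps.get(&from).map_or(false, |targets| targets.contains(to));
                if depends { "X" } else { "-" }
            })
            .collect();
        rows.push(format!("{}   {}", from, cells.join("  ")));
    }
    rows
}
```

Feeding it the edges A→B, A→D, B→C and C→A reproduces the matrix shown above.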

Dependency Tree

MyApp
├── Services
│   ├── UserService
│   │   ├── IUserRepository
│   │   └── IEmailService
│   └── OrderService
│       ├── IOrderRepository
│       └── IPaymentService
└── Models
    ├── User
    └── Order

Diagnostics

Circular Dependency Warning

warning[DEP001]: Circular dependency detected
  --> src/ClassA.cs:3:5
   |
 3 |     private ClassB b;
   |             ^^^^^^ ClassA depends on ClassB
   |
   = note: Dependency cycle: ClassA -> ClassB -> ClassC -> ClassA

High Coupling Warning

warning[DEP002]: High efferent coupling detected
  --> src/GodClass.cs:1:14
   |
 1 | public class GodClass {
   |              ^^^^^^^^ depends on 25 other types
   |
   = help: Consider breaking this class into smaller, focused classes

Unstable Dependency Warning

warning[DEP003]: Stable type depends on unstable type
  --> src/StableClass.cs:5:5
   |
 5 |     private UnstableClass helper;
   |             ^^^^^^^^^^^^^ instability = 0.95
   |
   = note: Stable types (instability < 0.2) should not depend on unstable types (instability > 0.8)

Configuration

Thresholds

[analysis.dependencies]
max_efferent_coupling = 20
max_afferent_coupling = 10
max_instability = 0.8
warn_circular_dependencies = true

CLI Usage

# Analyze dependencies
bsharp analyze MyProject.csproj --enable-pass dependencies

# Generate dependency report
bsharp analyze MyProject.sln --out deps.json --format pretty-json

Future Enhancements

Planned Features

Package-Level Dependencies
- Track dependencies between namespaces/assemblies
- Identify layering violations
Dependency Metrics Dashboard
- Visual dependency graphs
- Coupling heatmaps
- Trend analysis over time
Architectural Rules
- Define allowed/forbidden dependencies
- Enforce layered architecture
- Prevent specific coupling patterns
Dependency Injection Analysis
- Track DI container registrations
- Verify dependency lifetimes
- Detect missing registrations

Implementation Status

Current State:

Basic dependency graph structure defined
Integration with analysis pipeline planned
Circular dependency detection algorithm ready

TODO:

Implement full dependency extraction from AST
Add coupling metrics calculation
Create dependency visualization tools
Add comprehensive tests

Analysis Pipeline - How dependency analysis fits in the pipeline
Control Flow Analysis - Related analysis type
Metrics Collection - Coupling metrics
Architecture Decisions - Design rationale

References

Implementation: src/bsharp_analysis/src/artifacts/dependencies.rs
Tests: src/bsharp_tests/src/analysis/dependencies/ (planned)
Related Passes: src/bsharp_analysis/src/passes/ (when implemented)

Metrics Collection

The BSharp metrics system collects comprehensive code metrics during analysis to assess code complexity, size, and maintainability.

Overview

Location: src/bsharp_analysis/src/metrics/

The metrics system provides:

Basic Metrics - Lines of code, statement counts, declaration counts
Complexity Metrics - Cyclomatic complexity, cognitive complexity, nesting depth
Maintainability Metrics - Maintainability index, Halstead metrics

Architecture

Core Components

src/bsharp_analysis/src/metrics/
├── core.rs     # AstAnalysis data structure (aggregated counts)
└── shared.rs   # Helpers: decision_points, max_nesting_of, count_statements, etc.

How metrics are produced

MetricsPass runs in Phase::LocalRules and computes an AstAnalysis artifact using the Query API to enumerate declarations, plus lightweight walkers for statement counts.
Access AstAnalysis from AnalysisSession after running the pipeline.

Metric Types

1. Basic Metrics

AstAnalysis Structure:

#![allow(unused)]
fn main() {
pub struct AstAnalysis {
    // Size metrics
    pub total_lines: usize,
    pub code_lines: usize,
    pub comment_lines: usize,
    pub blank_lines: usize,
    
    // Declaration counts
    pub namespace_count: usize,
    pub class_count: usize,
    pub interface_count: usize,
    pub struct_count: usize,
    pub enum_count: usize,
    pub method_count: usize,
    pub property_count: usize,
    pub field_count: usize,
    
    // Statement counts
    pub statement_count: usize,
    pub expression_count: usize,
    
    // Complexity (aggregated)
    pub total_complexity: usize,
    pub max_complexity: usize,
    pub max_nesting_depth: usize,
}
}

2. Complexity Metrics

Cyclomatic Complexity

Definition: Number of linearly independent paths through code

Formula: CC = E - N + 2P

E = edges in control flow graph
N = nodes in control flow graph
P = connected components (usually 1)

Simplified: CC = 1 + number of decision points
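To see that the two formulas agree, consider a method containing a single if without an else. Its control flow graph has three nodes (condition, then-body, exit) and three edges (condition to body, condition to exit, body to exit), so E - N + 2P = 3 - 3 + 2 = 2, which equals 1 plus one decision point. The helper below is a small sketch of the graph formula; its name is hypothetical.

```rust
/// Cyclomatic complexity from control flow graph counts: E - N + 2P.
/// The terms are reordered so the unsigned subtraction happens last; for
/// a valid graph E + 2P is never smaller than N.
pub fn cyclomatic_from_graph(edges: usize, nodes: usize, components: usize) -> usize {
    edges + 2 * components - nodes
}
```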

Decision Points:

if, else if
case in switch
for, foreach, while, do-while
&&, || in conditions
catch clauses
?: ternary operator
?? null-coalescing operator

Example:

public void ProcessOrder(Order order) {  // CC = 1 (base)
    if (order == null) {                 // +1 = 2
        throw new ArgumentNullException();
    }
    
    if (order.IsValid) {                 // +1 = 3
        if (order.Amount > 1000) {       // +1 = 4
            ApplyDiscount(order);
        }
        SaveOrder(order);
    } else {                             // else doesn't add
        LogError(order);
    }
}
// Total CC = 4

Implementation:

#![allow(unused)]
fn main() {
pub fn cyclomatic_complexity(method: &MethodDeclaration) -> usize {
    let mut complexity = 1;  // Base complexity
    
    if let Some(body) = &method.body {
        complexity += count_decision_points(body);
    }
    
    complexity
}

fn count_decision_points(stmt: &Statement) -> usize {
    let mut count = 0;
    
    walk_statements(stmt, &mut |s| {
        match s {
            Statement::If(_) => count += 1,
            Statement::For(_) => count += 1,
            Statement::ForEach(_) => count += 1,
            Statement::While(_) => count += 1,
            Statement::DoWhile(_) => count += 1,
            Statement::Switch(sw) => {
                // Each case is a decision point
                count += sw.sections.len();
            }
            Statement::Try(try_stmt) => {
                // Each catch is a decision point
                count += try_stmt.catch_clauses.len();
            }
            _ => {}
        }
    });
    
    // Also count logical operators in expressions
    // count += count_logical_operators(stmt);
    
    count
}
}
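
The listing above leaves the logical-operator count commented out. A minimal sketch of what such a counter could look like follows, assuming a toy `Expr` type; BSharp's real expression AST is richer and its variant names may differ.

```rust
/// Toy expression tree assumed for this sketch only.
enum Expr {
    Ident(&'static str),
    And(Box<Expr>, Box<Expr>),                    // a && b
    Or(Box<Expr>, Box<Expr>),                     // a || b
    NullCoalesce(Box<Expr>, Box<Expr>),           // a ?? b
    Conditional(Box<Expr>, Box<Expr>, Box<Expr>), // c ? a : b
}

/// Counts the expression-level decision points: &&, ||, ?? and ?:.
fn count_logical_operators(expr: &Expr) -> usize {
    match expr {
        Expr::Ident(_) => 0,
        Expr::And(l, r) | Expr::Or(l, r) | Expr::NullCoalesce(l, r) => {
            1 + count_logical_operators(l) + count_logical_operators(r)
        }
        Expr::Conditional(c, t, e) => {
            1 + count_logical_operators(c)
                + count_logical_operators(t)
                + count_logical_operators(e)
        }
    }
}
```

A condition such as `a && (b || c)` contributes two decision points under this scheme.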

Thresholds:

1-10: Simple, low risk
11-20: Moderate complexity, moderate risk
21-50: Complex, high risk
51+: Very complex, very high risk - refactor recommended

Cognitive Complexity

Definition: Measure of how difficult code is to understand

Increments:

+1 for each: if, else if, switch, for, foreach, while, do-while, catch, ?:, ??
+1 for each level of nesting (nested control structures)
+1 for each break or continue that jumps out of nested structure
+1 for each recursive call

Example:

public void Process(List<int> items) {
    if (items != null) {                 // +1 (if)
        foreach (var item in items) {    // +1 (loop) +1 (nesting) = +2
            if (item > 0) {              // +1 (if) +2 (nesting) = +3
                Process(item);           // +1 (recursion, no nesting penalty)
            }
        }
    }
}
// Total Cognitive Complexity = 1 + 2 + 3 + 1 = 7

Implementation:

#![allow(unused)]
fn main() {
pub fn cognitive_complexity(method: &MethodDeclaration) -> usize {
    let mut complexity = 0;
    
    if let Some(body) = &method.body {
        complexity = calculate_cognitive_complexity(body, 0);
    }
    
    complexity
}

fn calculate_cognitive_complexity(stmt: &Statement, nesting_level: usize) -> usize {
    let mut complexity = 0;
    
    match stmt {
        Statement::If(if_stmt) => {
            complexity += 1 + nesting_level;  // if + nesting penalty
            complexity += calculate_cognitive_complexity(&if_stmt.consequence, nesting_level + 1);
            if let Some(alt) = &if_stmt.alternative {
                complexity += calculate_cognitive_complexity(alt, nesting_level + 1);
            }
        }
        Statement::For(for_stmt) => {
            complexity += 1 + nesting_level;
            if let Some(body) = &for_stmt.body {
                complexity += calculate_cognitive_complexity(body, nesting_level + 1);
            }
        }
        // ... other statement types
        _ => {}
    }
    
    complexity
}
}
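
To make the nesting penalty concrete, here is a self-contained sketch over a toy statement tree. The `Stmt` type is an assumption of this example, and only the structural increment and the nesting penalty are modelled; `else` branches, `break`/`continue` and recursion are left out.

```rust
/// Toy statement tree, assumed for illustration only.
enum Stmt {
    If(Vec<Stmt>),   // body of an `if`
    Loop(Vec<Stmt>), // body of a for/foreach/while
    Other,           // any statement that adds no complexity
}

/// Each control structure costs 1 plus its current nesting level.
fn cognitive(stmts: &[Stmt], nesting: usize) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::If(body) | Stmt::Loop(body) => 1 + nesting + cognitive(body, nesting + 1),
            Stmt::Other => 0,
        })
        .sum()
}
```

An `if` containing a loop containing another `if` scores 1 + 2 + 3 = 6 with this function.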

Nesting Depth

Definition: Maximum depth of nested control structures

Example:

public void Example() {
    if (condition1) {              // Depth 1
        while (condition2) {       // Depth 2
            if (condition3) {      // Depth 3
                for (int i = 0; i < 10; i++) {  // Depth 4
                    // Code here
                }
            }
        }
    }
}
// Max Nesting Depth = 4

Implementation:

#![allow(unused)]
fn main() {
pub fn max_nesting_depth(method: &MethodDeclaration) -> usize {
    method.body.as_ref()
        .map(|body| calculate_max_nesting(body, 0))
        .unwrap_or(0)
}

fn calculate_max_nesting(stmt: &Statement, current_depth: usize) -> usize {
    let mut max_depth = current_depth;
    
    match stmt {
        Statement::If(if_stmt) => {
            let then_depth = calculate_max_nesting(&if_stmt.consequence, current_depth + 1);
            max_depth = max_depth.max(then_depth);
            
            if let Some(alt) = &if_stmt.alternative {
                let else_depth = calculate_max_nesting(alt, current_depth + 1);
                max_depth = max_depth.max(else_depth);
            }
        }
        Statement::Block(stmts) => {
            for s in stmts {
                let depth = calculate_max_nesting(s, current_depth);
                max_depth = max_depth.max(depth);
            }
        }
        // ... other nesting statements
        _ => {}
    }
    
    max_depth
}
}
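
The same recursion can be tried in isolation on a toy tree. The `Stmt` type below is hypothetical and collapses every nesting construct into one variant; it is a sketch of the idea rather than the BSharp walker.

```rust
/// Toy statement tree, assumed for illustration only.
enum Stmt {
    Nested(Vec<Stmt>), // if / while / for / ... together with its body
    Simple,            // any non-nesting statement
}

/// Deepest nesting level reached below `depth`.
fn max_nesting(stmts: &[Stmt], depth: usize) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::Nested(body) => max_nesting(body, depth + 1),
            Stmt::Simple => depth,
        })
        .max()
        .unwrap_or(depth) // an empty body still counts at its own depth
}
```

Four constructs nested inside one another, as in the C# example above, give a depth of 4.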

Thresholds:

1-3: Acceptable
4-5: Consider refactoring
6+: Refactor recommended

Planned: Maintainability Metrics

Maintainability Index

Definition: Composite metric indicating code maintainability

Formula (Microsoft version):

MI = MAX(0, (171 - 5.2 * ln(HV) - 0.23 * CC - 16.2 * ln(LOC)) * 100 / 171)

Where:

HV = Halstead Volume
CC = Cyclomatic Complexity
LOC = Lines of Code

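Scale (bands used in this document; Visual Studio's own bands for the 0-100 index are 20-100 green, 10-19 yellow, 0-9 red):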

85-100: Good maintainability (green)
65-84: Moderate maintainability (yellow)
0-64: Difficult to maintain (red)

Note: Maintainability Index is not implemented in the current codebase. This section outlines potential future work.

pub fn maintainability_index(
    halstead_volume: f64,
    cyclomatic_complexity: usize,
    lines_of_code: usize,
) -> f64 {
    // ln() is undefined at 0, so clamp both logarithm inputs to at least 1.
    let hv_term = 5.2 * halstead_volume.max(1.0).ln();
    let cc_term = 0.23 * (cyclomatic_complexity as f64);
    let loc_term = 16.2 * (lines_of_code.max(1) as f64).ln();

    // Rescale from the original 0-171 range to 0-100 and floor at 0.
    let mi = 171.0 - hv_term - cc_term - loc_term;
    (mi * 100.0 / 171.0).max(0.0)
}

Planned: Halstead Metrics

Operators and Operands:

n1 = number of distinct operators
n2 = number of distinct operands
N1 = total number of operators
N2 = total number of operands
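
How the four base counts might be gathered from a token stream can be sketched as follows. The `TokenKind` classification and the function name are assumptions of this example; deciding which C# tokens count as operators or operands is a design choice the source does not specify.

```rust
use std::collections::HashSet;

/// Token classification assumed for this sketch.
#[derive(Clone, Copy)]
enum TokenKind {
    Operator,
    Operand,
}

/// Returns (n1, n2, N1, N2) for a classified token stream.
fn halstead_counts(tokens: &[(TokenKind, &str)]) -> (usize, usize, usize, usize) {
    let mut distinct_operators = HashSet::new();
    let mut distinct_operands = HashSet::new();
    let (mut total_operators, mut total_operands) = (0, 0);
    for &(kind, text) in tokens {
        match kind {
            TokenKind::Operator => {
                distinct_operators.insert(text);
                total_operators += 1;
            }
            TokenKind::Operand => {
                distinct_operands.insert(text);
                total_operands += 1;
            }
        }
    }
    (
        distinct_operators.len(),
        distinct_operands.len(),
        total_operators,
        total_operands,
    )
}
```

For the statement `x = x + 1`, with `=` and `+` as operators, this yields n1 = 2, n2 = 2, N1 = 2, N2 = 3.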

Derived Metrics:

Program Vocabulary: n = n1 + n2
Program Length: N = N1 + N2
Calculated Length: N' = n1 * log2(n1) + n2 * log2(n2)
Volume: V = N * log2(n)
Difficulty: D = (n1 / 2) * (N2 / n2)
Effort: E = D * V
Time to Program: T = E / 18 seconds
Bugs Delivered: B = V / 3000

Note: Halstead metrics are not implemented in the current codebase.

#![allow(unused)]
fn main() {
pub struct HalsteadMetrics {
    pub distinct_operators: usize,    // n1
    pub distinct_operands: usize,     // n2
    pub total_operators: usize,       // N1
    pub total_operands: usize,        // N2
    pub vocabulary: usize,            // n
    pub length: usize,                // N
    pub volume: f64,                  // V
    pub difficulty: f64,              // D
    pub effort: f64,                  // E
    pub time_to_program: f64,         // T
    pub bugs_delivered: f64,          // B
}

use std::collections::HashSet;

impl HalsteadMetrics {
    pub fn calculate(operators: &HashSet<String>, operands: &HashSet<String>,
                     op_count: usize, operand_count: usize) -> Self {
        let n1 = operators.len();
        let n2 = operands.len();
        let vocabulary = n1 + n2;               // n
        let length = op_count + operand_count;  // N

        // Empty input (no operators and operands, or no operands at all) yields
        // non-finite values (NaN or infinity); callers should handle that case.
        let volume = (length as f64) * (vocabulary as f64).log2();
        let difficulty = (n1 as f64 / 2.0) * (operand_count as f64 / n2 as f64);
        let effort = difficulty * volume;
        let time = effort / 18.0;
        let bugs = volume / 3000.0;
        
        HalsteadMetrics {
            distinct_operators: n1,
            distinct_operands: n2,
            total_operators: op_count,
            total_operands: operand_count,
            vocabulary,
            length,
            volume,
            difficulty,
            effort,
            time_to_program: time,
            bugs_delivered: bugs,
        }
    }
}
}

Metrics Collection in the Pipeline

MetricsPass is registered in the analyzer registry and runs during Phase::LocalRules. It enumerates classes/structs/methods via Query and uses helpers from bsharp_analysis::metrics::shared to compute statement counts, decision points (cyclomatic complexity), and nesting.

#![allow(unused)]
fn main() {
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

let source = r#"public class C { public void M() { if (true) { } } }"#;
let (cu, spans) = Parser::new().parse_with_spans(source)?;
let mut session = AnalysisSession::new(AnalysisContext::new("file.cs", source), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let ast = session.artifacts.get::<AstAnalysis>().expect("AstAnalysis");
println!("classes={}, methods={}, ifs={}", ast.total_classes, ast.total_methods, ast.total_if_statements);
}

CLI Usage

Analyze Metrics

# Analyze single file
bsharp analyze MyFile.cs

# Analyze project
bsharp analyze MyProject.csproj --out metrics.json

# Analyze solution
bsharp analyze MySolution.sln --out metrics.json --format pretty-json

Example Output

{
  "schema_version": 1,
  "metrics": {
    "total_lines": 1250,
    "code_lines": 980,
    "comment_lines": 150,
    "blank_lines": 120,
    "class_count": 15,
    "method_count": 87,
    "total_complexity": 245,
    "max_complexity": 18,
    "max_nesting_depth": 5
  }
}

Thresholds and Warnings

Configuration

[analysis.metrics]
max_cyclomatic_complexity = 10
max_cognitive_complexity = 15
max_nesting_depth = 4
max_method_length = 50
min_maintainability_index = 65

Diagnostics

High Complexity Warning:

warning[MET001]: Method has high cyclomatic complexity
  --> src/OrderProcessor.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ complexity = 18 (threshold: 10)
   |
   = help: Consider breaking this method into smaller methods

Deep Nesting Warning:

warning[MET002]: Deep nesting detected
  --> src/Validator.cs:15:9
   |
15 |         if (condition1) {
   |         ^^ nesting depth = 5 (threshold: 4)
   |
   = help: Consider extracting nested logic into separate methods
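
How the configured thresholds could turn into the two warnings above is sketched below. `MetricThresholds` and `check_method` are hypothetical names, and the strict greater-than comparison is an assumption inferred from the sample diagnostics (18 against a threshold of 10, 5 against 4).

```rust
/// Mirrors two keys of the [analysis.metrics] table; hypothetical type.
struct MetricThresholds {
    max_cyclomatic_complexity: usize,
    max_nesting_depth: usize,
}

/// Returns one message per exceeded threshold, using the codes shown above.
fn check_method(cc: usize, nesting: usize, t: &MetricThresholds) -> Vec<String> {
    let mut warnings = Vec::new();
    if cc > t.max_cyclomatic_complexity {
        warnings.push(format!(
            "MET001: complexity = {} (threshold: {})",
            cc, t.max_cyclomatic_complexity
        ));
    }
    if nesting > t.max_nesting_depth {
        warnings.push(format!(
            "MET002: nesting depth = {} (threshold: {})",
            nesting, t.max_nesting_depth
        ));
    }
    warnings
}
```

A method sitting exactly at a threshold produces no warning under this comparison.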

Programmatic Usage

Analyzing a Method

#![allow(unused)]
fn main() {
use bsharp_analysis::metrics::{cyclomatic_complexity, cognitive_complexity, max_nesting_depth};

// parse_method is a placeholder for however the caller obtains a MethodDeclaration.
let method = parse_method("public void MyMethod() { ... }");

let cc = cyclomatic_complexity(&method);
let cog = cognitive_complexity(&method);
let nesting = max_nesting_depth(&method);

println!("Cyclomatic Complexity: {}", cc);
println!("Cognitive Complexity: {}", cog);
println!("Max Nesting Depth: {}", nesting);
}

Analyzing a file via the pipeline

#![allow(unused)]
fn main() {
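use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

// source_code is a &str holding the contents of the C# file.
let (cu, spans) = Parser::new().parse_with_spans(source_code)?;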
let mut session = AnalysisSession::new(AnalysisContext::new("file.cs", source_code), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let metrics = session.artifacts.get::<AstAnalysis>().expect("AstAnalysis");
println!("Classes: {}", metrics.total_classes);
println!("Methods: {}", metrics.total_methods);
println!("Cyclomatic Complexity: {}", metrics.cyclomatic_complexity);
}

Analysis Pipeline - How metrics fit in the pipeline
Control Flow Analysis - Related complexity analysis
Code Quality - Quality assessment using metrics
Architecture - Design decisions

References

Implementation: src/bsharp_analysis/src/metrics/
Pass: src/bsharp_analysis/src/passes/metrics.rs
Tests: src/bsharp_tests/src/analysis/metrics/ (planned)
Standards: ISO/IEC 25023 (Software Quality Metrics)

Type Analysis

The type analysis system provides insights into type usage, inheritance hierarchies, and type-related patterns in C# code.

Overview

Status: Planned (module not implemented yet)

Type analysis tracks:

Type definitions and their relationships
Inheritance hierarchies
Interface implementations
Generic type usage
Type references and dependencies

Type Information

Type Categories

Value Types:

Primitives (int, bool, double, etc.)
Structs
Enums

Reference Types:

Classes
Interfaces
Delegates
Arrays

Special Types:

Generic type parameters
Nullable types
Tuple types
Anonymous types

Inheritance Analysis

Class Hierarchies

Tracking Inheritance:

public class Animal { }
public class Mammal : Animal { }
public class Dog : Mammal { }

Hierarchy Representation:

Animal
└── Mammal
    └── Dog

Analysis:

#![allow(unused)]
fn main() {
pub struct InheritanceHierarchy {
    // Type -> Base Type
    base_types: HashMap<TypeId, TypeId>,
    // Type -> Derived Types
    derived_types: HashMap<TypeId, Vec<TypeId>>,
}

impl InheritanceHierarchy {
    pub fn get_base_type(&self, type_id: TypeId) -> Option<TypeId>;
    pub fn get_derived_types(&self, type_id: TypeId) -> &[TypeId];
    pub fn get_all_ancestors(&self, type_id: TypeId) -> Vec<TypeId>;
    pub fn get_all_descendants(&self, type_id: TypeId) -> Vec<TypeId>;
    pub fn inheritance_depth(&self, type_id: TypeId) -> usize;
}
}
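
Since the module is still planned, the following is only a sketch of how the queries above might behave. It uses type names (`String`) in place of `TypeId`, adds a hypothetical `add` method for populating the maps, and assumes the hierarchy is acyclic, as valid C# requires.

```rust
use std::collections::HashMap;

/// Type names stand in for the `TypeId` used above (an assumption of this sketch).
#[derive(Default)]
struct InheritanceHierarchy {
    base_types: HashMap<String, String>,
    derived_types: HashMap<String, Vec<String>>,
}

impl InheritanceHierarchy {
    /// Records `derived : base`. Hypothetical helper, not in the planned API above.
    fn add(&mut self, derived: &str, base: &str) {
        self.base_types.insert(derived.to_string(), base.to_string());
        self.derived_types
            .entry(base.to_string())
            .or_default()
            .push(derived.to_string());
    }

    /// Walks the base chain from the nearest ancestor to the root.
    fn get_all_ancestors(&self, type_name: &str) -> Vec<String> {
        let mut ancestors = Vec::new();
        let mut current = type_name;
        while let Some(base) = self.base_types.get(current) {
            ancestors.push(base.clone());
            current = base.as_str();
        }
        ancestors
    }

    /// Number of ancestors, counted from the hierarchy's own root.
    fn inheritance_depth(&self, type_name: &str) -> usize {
        self.get_all_ancestors(type_name).len()
    }

    /// Immediate subclasses; empty for leaf or unknown types.
    fn get_derived_types(&self, type_name: &str) -> &[String] {
        self.derived_types
            .get(type_name)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }
}
```

With `Mammal : Animal` and `Dog : Mammal` recorded, the ancestors of `Dog` are `Mammal` then `Animal`, and its depth is 2.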

Interface Implementation

Tracking Implementations:

public interface IRepository { }
public interface IUserRepository : IRepository { }
public class UserRepository : IUserRepository { }

Analysis:

#![allow(unused)]
fn main() {
pub struct InterfaceImplementations {
    // Type -> Interfaces it implements
    implementations: HashMap<TypeId, Vec<TypeId>>,
    // Interface -> Types that implement it
    implementers: HashMap<TypeId, Vec<TypeId>>,
}
}

Generic Type Analysis

Type Parameters

Tracking Generic Definitions:

public class Container<T> where T : class { }
public class Repository<TEntity, TKey> where TEntity : class { }

Analysis:

#![allow(unused)]
fn main() {
pub struct GenericTypeInfo {
    pub type_parameters: Vec<TypeParameter>,
    pub constraints: Vec<TypeConstraint>,
}

pub struct TypeParameter {
    pub name: String,
    pub variance: Option<Variance>,  // in, out
}

pub struct TypeConstraint {
    pub parameter: String,
    pub kind: ConstraintKind,
}

pub enum ConstraintKind {
    Class,              // where T : class
    Struct,             // where T : struct
    New,                // where T : new()
    BaseType(TypeId),   // where T : BaseClass
    Interface(TypeId),  // where T : IInterface
}
}

Generic Type Usage

Tracking Instantiations:

var list = new List<int>();
var dict = new Dictionary<string, User>();

Analysis:

#![allow(unused)]
fn main() {
pub struct GenericInstantiation {
    pub generic_type: TypeId,
    pub type_arguments: Vec<TypeId>,
}

pub fn find_generic_instantiations(cu: &CompilationUnit) -> Vec<GenericInstantiation>;
}

Type Usage Patterns

Frequency Analysis

Most Used Types:

#![allow(unused)]
fn main() {
pub struct TypeUsageStats {
    pub type_references: HashMap<TypeId, usize>,
}

impl TypeUsageStats {
    pub fn most_used_types(&self, limit: usize) -> Vec<(TypeId, usize)>;
    pub fn usage_count(&self, type_id: TypeId) -> usize;
}
}
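
One possible shape for these two queries is sketched below, again with type names standing in for `TypeId`. The `record` method and the tie-breaking by name are assumptions added so the example is deterministic.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct TypeUsageStats {
    type_references: HashMap<String, usize>,
}

impl TypeUsageStats {
    /// Notes one reference to `type_name`. Hypothetical helper for this sketch.
    fn record(&mut self, type_name: &str) {
        *self.type_references.entry(type_name.to_string()).or_insert(0) += 1;
    }

    /// Reference count, or 0 for a type that was never seen.
    fn usage_count(&self, type_name: &str) -> usize {
        self.type_references.get(type_name).copied().unwrap_or(0)
    }

    /// The `limit` most referenced types, highest count first.
    fn most_used_types(&self, limit: usize) -> Vec<(String, usize)> {
        let mut entries: Vec<(String, usize)> = self
            .type_references
            .iter()
            .map(|(name, count)| (name.clone(), *count))
            .collect();
        // Ties are broken by name so the order does not depend on hash iteration.
        entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        entries.truncate(limit);
        entries
    }
}
```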

Type Categories Distribution

#![allow(unused)]
fn main() {
pub struct TypeDistribution {
    pub class_count: usize,
    pub interface_count: usize,
    pub struct_count: usize,
    pub enum_count: usize,
    pub delegate_count: usize,
}
}

Type Metrics

Depth of Inheritance Tree (DIT)

Definition: Maximum depth from type to root of hierarchy

Example:

class A { }              // DIT = 0 (or 1 from Object)
class B : A { }          // DIT = 1 (or 2 from Object)
class C : B { }          // DIT = 2 (or 3 from Object)

Interpretation:

Low DIT (0-2): Simple hierarchy, easy to understand
Medium DIT (3-4): Moderate complexity
High DIT (5+): Complex hierarchy, may indicate over-engineering

Number of Children (NOC)

Definition: Number of immediate subclasses

Example:

class Animal { }
class Dog : Animal { }
class Cat : Animal { }
class Bird : Animal { }
// Animal has NOC = 3

Interpretation:

High NOC: Type is heavily reused (good abstraction or god class)
Low NOC: Specialized type or leaf in hierarchy

Lack of Cohesion of Methods (LCOM)

Definition: Measure of how well methods in a class are related

Simplified Calculation:

Count pairs of methods that don't share instance variables
High LCOM suggests class should be split
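
The simplified calculation can be sketched as a pair count. The input representation (one set of field names per method) and the function name are assumptions of this example; it implements only the "pairs that share nothing" count described above, not the full LCOM family of metrics.

```rust
use std::collections::HashSet;

/// `methods[i]` is the set of instance fields that method i reads or writes.
/// Returns the number of method pairs with no field in common.
fn unrelated_method_pairs(methods: &[HashSet<&str>]) -> usize {
    let mut count = 0;
    for i in 0..methods.len() {
        for j in (i + 1)..methods.len() {
            if methods[i].is_disjoint(&methods[j]) {
                count += 1;
            }
        }
    }
    count
}
```

For three methods touching `{a}`, `{a, b}` and `{c}`, two of the three pairs share no field, so the count is 2.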

Type Compatibility Analysis

Assignability

Checking Compatibility:

#![allow(unused)]
fn main() {
pub fn is_assignable_to(from: &Type, to: &Type, context: &TypeContext) -> bool {
    // Check if 'from' type can be assigned to 'to' type.
    // Considers inheritance, interface implementation, variance, etc.
    todo!("planned: type analysis is not implemented yet")
}
}

Rules:

Derived type assignable to base type
Type assignable to implemented interface
Covariant/contravariant generic types
Nullable value types
Implicit conversions

Type Conversions

Tracking Conversions:

int x = 42;
long y = x;              // Implicit conversion
int z = (int)y;          // Explicit conversion (cast)

Analysis:

#![allow(unused)]
fn main() {
pub enum ConversionKind {
    Implicit,
    Explicit,
    UserDefined,
}

pub struct TypeConversion {
    pub from: TypeId,
    pub to: TypeId,
    pub kind: ConversionKind,
}
}

Nullable Reference Types Analysis

Nullability Tracking

C# 8+ Nullable Annotations:

string? nullable = null;      // Nullable reference
string nonNull = "value";     // Non-nullable reference

Analysis:

#![allow(unused)]
fn main() {
pub struct NullabilityInfo {
    pub is_nullable: bool,
    pub nullability_context: NullabilityContext,
}

pub enum NullabilityContext {
    Enabled,
    Disabled,
    Warnings,
}
}

Null Safety Diagnostics

Potential Null Reference:

warning[TYPE001]: Possible null reference
  --> src/UserService.cs:15:9
   |
15 |     user.Name = "John";
   |     ^^^^ 'user' may be null here
   |
   = help: Add null check or use null-conditional operator

Type Analysis in Pipeline

Integration

Type analysis is not part of the default registry yet. The intended phase is Semantic (after symbol indexing and global artifacts). This page outlines the planned scope.

Programmatic Usage

Analyzing Type Hierarchy

Planned APIs will expose hierarchy queries once implemented.

Finding Generic Instantiations

Planned helper(s) to enumerate generic instantiations will be documented here after implementation.

Future Enhancements

Planned Features

Type Inference Tracking
- Track var usage and inferred types
- Analyze type inference patterns
Variance Analysis
- Detect variance violations
- Suggest covariant/contravariant annotations
Type Safety Metrics
- Measure use of dynamic
- Track unsafe casts
- Nullable reference type coverage
Design Pattern Detection
- Identify common patterns (Factory, Strategy, etc.)
- Detect anti-patterns

Implementation Status

Current State:

Basic type tracking infrastructure in place
Type analysis module not yet registered in the default analysis registry
Foundation for inheritance and generic analysis established

In Progress:

Full inheritance hierarchy analysis
Generic type instantiation tracking
Type usage statistics collection
Comprehensive test coverage

Planned:

Variance analysis
Type safety metrics
Design pattern detection based on type relationships

Analysis Pipeline - Pipeline integration
Dependency Analysis - Type dependencies
Metrics Collection - Type-related metrics
AST Structure - Type representations

References

Implementation: Planned
Tests: Planned (under src/bsharp_tests/src/analysis/types/)
Related: docs/analysis/dependencies.md, docs/parser/ast-structure.md

Code Quality Analysis (Conceptual / Future Plan)

This document describes a forward-looking design for quality analysis. The legacy quality module and QualityPass have been removed from the codebase, so treat this page as a proposal and reference for potential future work rather than a description of the current implementation.

Overview

Status: Not implemented. The legacy module was removed; this page documents future direction.

Quality analysis provides:

Code smell detection
Best practice validation
Design pattern recognition
Maintainability assessment
Technical debt identification

Code Smells

Method-Level Smells

Long Method

Description: Method with too many lines of code

Threshold: > 50 lines (configurable)

Example:

public void ProcessOrder(Order order) {
    // 150 lines of code...
}

Diagnostic:

warning[QUAL001]: Long method detected
  --> src/OrderService.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ method has 150 lines (threshold: 50)
   |
   = help: Consider breaking this method into smaller, focused methods

Refactoring:

Extract method
Decompose into smaller methods
Apply Single Responsibility Principle

Long Parameter List

Description: Method with too many parameters

Threshold: > 5 parameters (configurable)

Example:

public void CreateUser(string firstName, string lastName, string email, 
                      string phone, string address, string city, string zip) {
    // ...
}

Refactoring:

Introduce parameter object
Use builder pattern
Group related parameters into DTOs

Complex Conditional

Description: Deeply nested or complex conditional logic

Example:

if (user != null && user.IsActive && (user.Role == "Admin" || user.Role == "Manager") 
    && user.Department != null && user.Department.Budget > 10000) {
    // ...
}

Refactoring:

Extract condition to well-named method
Use guard clauses
Simplify boolean logic

Class-Level Smells

Large Class (God Class)

Description: Class with too many responsibilities

Indicators:

Too many methods (> 20)
Too many fields (> 10)
High cyclomatic complexity
Low cohesion

Example:

public class UserManager {
    // 50 methods handling user CRUD, authentication, authorization,
    // email sending, logging, caching, validation, etc.
}

Refactoring:

Split into multiple classes
Apply Single Responsibility Principle
Extract related functionality

Feature Envy

Description: Method uses more features of another class than its own

Example:

public class OrderProcessor {
    public decimal CalculateTotal(Order order) {
        decimal total = 0;
        foreach (var item in order.Items) {
            total += item.Price * item.Quantity;
        }
        total -= order.Discount;
        total += order.Tax;
        return total;
    }
}

Refactoring:

Move method to Order class
Method should be where the data is

Data Class

Description: Class with only fields and getters/setters, no behavior

Example:

public class User {
    public string Name { get; set; }
    public string Email { get; set; }
    public int Age { get; set; }
    // No methods, just data
}

Note: Sometimes acceptable for DTOs, but domain objects should have behavior

Code Organization Smells

Duplicate Code

Description: Identical or very similar code in multiple places

Detection:

Token-based comparison
AST structure comparison
Minimum clone size threshold
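
A naive version of token-based comparison with a minimum clone size can be sketched as a sliding window over the token stream. The function name and the exact-match strategy are assumptions; a real detector would typically normalise identifiers and merge overlapping matches.

```rust
use std::collections::HashMap;

/// Returns (first_start, repeat_start) for every window of `min_len` tokens
/// that exactly repeats an earlier window.
fn find_clones(tokens: &[&str], min_len: usize) -> Vec<(usize, usize)> {
    let mut clones = Vec::new();
    if min_len == 0 {
        return clones; // windows(0) would panic
    }
    let mut first_seen: HashMap<&[&str], usize> = HashMap::new();
    for (start, window) in tokens.windows(min_len).enumerate() {
        match first_seen.get(window) {
            Some(&first) => clones.push((first, start)),
            None => {
                first_seen.insert(window, start);
            }
        }
    }
    clones
}
```

Raising `min_len` trades recall for fewer trivial matches, which is the role of the minimum clone size threshold.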

Refactoring:

Extract method
Extract class
Use inheritance or composition

Dead Code

Description: Code that is never executed

Examples:

Unreachable statements after return
Unused private methods
Unused fields
Conditions that are always true/false

Diagnostic:

warning[QUAL010]: Unreachable code detected
  --> src/Calculator.cs:15:9
   |
14 |     return result;
15 |     Console.WriteLine("Done");  // Never executed
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement

Magic Numbers

Description: Unexplained numeric literals in code

Example:

if (order.Total > 1000) {  // What does 1000 mean?
    ApplyDiscount(order, 0.1);  // What does 0.1 mean?
}

Refactoring:

const decimal BULK_ORDER_THRESHOLD = 1000m;
const decimal BULK_ORDER_DISCOUNT = 0.1m;

if (order.Total > BULK_ORDER_THRESHOLD) {
    ApplyDiscount(order, BULK_ORDER_DISCOUNT);
}

Best Practices

Naming Conventions

Rules:

Classes: PascalCase
Methods: PascalCase
Properties: PascalCase
Fields: camelCase with _ prefix for private
Constants: UPPER_CASE or PascalCase
Interfaces: PascalCase with I prefix

Violations:

warning[QUAL020]: Naming convention violation
  --> src/UserService.cs:5:17
   |
 5 |     private int UserCount;
   |                 ^^^^^^^^^ private field should use camelCase with _ prefix
   |
   = help: Rename to '_userCount'
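
The private-field rule and its rename suggestion could be checked along these lines. The function name is hypothetical and the sketch covers only the "camelCase with _ prefix" rule from the list above.

```rust
/// Suggests a `_camelCase` name for a private field, or returns None when the
/// name already conforms (or has no letters to work with).
fn suggest_private_field_name(name: &str) -> Option<String> {
    let bare = name.trim_start_matches('_');
    let mut chars = bare.chars();
    let first = chars.next()?;
    let conforms =
        name.starts_with('_') && !name.starts_with("__") && first.is_lowercase();
    if conforms {
        return None;
    }
    // Lower-case the first letter and add a single underscore prefix.
    Some(format!("_{}{}", first.to_lowercase(), chars.as_str()))
}
```

Applied to the field in the diagnostic, `UserCount` yields the suggestion `_userCount`, while `_userCount` itself passes.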

Exception Handling

Anti-patterns:

Empty Catch Block:

try {
    RiskyOperation();
} catch (Exception) {
    // Silent failure - BAD!
}

Catching Generic Exception:

try {
    SpecificOperation();
} catch (Exception ex) {  // Too broad
    // ...
}

Best Practices:

Catch specific exceptions
Log exceptions
Don't swallow exceptions
Use finally for cleanup

Resource Management

Using Statement:

// Good
using (var file = File.OpenRead("data.txt")) {
    // Use file
}

// Better (C# 8+)
using var file = File.OpenRead("data.txt");
// Disposed at end of scope

Diagnostic:

warning[QUAL030]: IDisposable not properly disposed
  --> src/FileProcessor.cs:10:9
   |
10 |     var file = File.OpenRead("data.txt");
   |         ^^^^ should be wrapped in using statement

Design Patterns and Anti-Patterns

Detected Patterns

Singleton Pattern

Detection:

Private constructor
Static instance field
Public static accessor

Example:

public class Logger {
    private static Logger _instance;
    private Logger() { }
    
    public static Logger Instance {
        get {
            if (_instance == null) {
                _instance = new Logger();
            }
            return _instance;
        }
    }
}

Factory Pattern

Detection:

Method returning interface or base class
Creates different concrete types based on parameters

Anti-Patterns

God Object

Detection:

High number of methods and fields
Low cohesion
High coupling

Spaghetti Code

Detection:

High cyclomatic complexity
Deep nesting
Lack of structure

Lava Flow

Detection:

Dead code
Commented-out code
Unused variables/methods

Quality Metrics

Code Quality Score

Composite Score (0-100):

#![allow(unused)]
fn main() {
pub struct QualityScore {
    pub overall: f64,
    pub maintainability: f64,
    pub complexity: f64,
    pub duplication: f64,
    pub test_coverage: f64,
}
}

Calculation:

Overall = (Maintainability * 0.3) + 
          (Complexity * 0.3) + 
          (Duplication * 0.2) + 
          (TestCoverage * 0.2)
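
The weighted sum can be written out as a small method. This is a sketch of the proposed calculation only; it assumes every component is already normalised to a 0-100 scale where higher is better.

```rust
/// Component scores on a 0-100 scale; field names follow the struct above.
struct QualityScore {
    maintainability: f64,
    complexity: f64,
    duplication: f64,
    test_coverage: f64,
}

impl QualityScore {
    /// Weighted sum using the 0.3 / 0.3 / 0.2 / 0.2 weights from the formula.
    fn overall(&self) -> f64 {
        self.maintainability * 0.3
            + self.complexity * 0.3
            + self.duplication * 0.2
            + self.test_coverage * 0.2
    }
}
```

For component scores of 68, 75, 80 and 68 the overall score comes out at 72.5.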

Technical Debt

Estimation:

#![allow(unused)]
fn main() {
pub struct TechnicalDebt {
    pub total_issues: usize,
    pub estimated_hours: f64,
    pub debt_ratio: f64,  // debt / total development time
}
}

Calculation:

Each code smell assigned time cost
Sum all issues
Compare to total codebase size

Quality Rules

Rule System

Rule Definition:

#![allow(unused)]
fn main() {
pub trait QualityRule {
    fn id(&self) -> &'static str;
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession);
}
}

Example Rule:

#![allow(unused)]
fn main() {
pub struct LongMethodRule {
    max_lines: usize,
}

impl QualityRule for LongMethodRule {
    fn id(&self) -> &'static str { "long_method" }
    fn name(&self) -> &'static str { "Long Method" }
    fn description(&self) -> &'static str { "Method exceeds the configured line threshold" }
    
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession) {
        if let NodeRef::MethodDeclaration(method) = node {
            let line_count = count_lines(method);
            if line_count > self.max_lines {
                session.diagnostics.add(
                    DiagnosticCode::LongMethod,
                    format!("Method has {} lines (threshold: {})", 
                           line_count, self.max_lines)
                );
            }
        }
    }
}
}

Rule Categories

Maintainability Rules:

Long method
Long parameter list
Large class
Complex method

Reliability Rules:

Empty catch blocks
Null reference risks
Resource leaks
Unhandled exceptions

Security Rules:

SQL injection risks
XSS vulnerabilities
Hardcoded credentials
Insecure random

Performance Rules:

Inefficient loops
Unnecessary allocations
String concatenation in loops
Boxing/unboxing

Configuration

Quality Thresholds

[analysis.quality]
max_method_lines = 50
max_parameters = 5
max_class_methods = 20
max_cyclomatic_complexity = 10
max_nesting_depth = 4

[analysis.quality.rules]
long_method = "warning"
long_parameter_list = "warning"
god_class = "error"
empty_catch = "error"
magic_numbers = "info"

Severity Levels

Error: Must be fixed
Warning: Should be fixed
Info: Consider fixing
Hint: Suggestion for improvement

CLI Usage

Quality Analysis

# Analyze code quality
bsharp analyze MyProject.csproj --enable-ruleset quality

# Generate quality report
bsharp analyze MySolution.sln --out quality-report.json

# Filter by severity
bsharp analyze MyFile.cs --severity error,warning

Example Output

{
  "quality_score": {
    "overall": 72.5,
    "maintainability": 68.0,
    "complexity": 75.0,
    "duplication": 80.0,
    "test_coverage": 68.0
  },
  "technical_debt": {
    "total_issues": 45,
    "estimated_hours": 12.5,
    "debt_ratio": 0.08
  },
  "diagnostics": [
    {
      "code": "QUAL001",
      "severity": "warning",
      "message": "Long method detected",
      "file": "src/OrderService.cs",
      "line": 42,
      "column": 17
    }
  ]
}

Integration with Pipeline

Quality Ruleset

Registration:

#![allow(unused)]
fn main() {
// In AnalyzerRegistry
registry.add_ruleset(QualityRuleset {
    id: "quality",
    rules: vec![
        Box::new(LongMethodRule::new()),
        Box::new(LongParameterListRule::new()),
        Box::new(GodClassRule::new()),
        Box::new(EmptyCatchRule::new()),
        // ... more rules
    ],
});
}

Execution:

Rules run during the LocalRules or Semantic phase
Visitor pattern for AST traversal
Diagnostics collected in session

Programmatic Usage

Running Quality Analysis

#![allow(unused)]
fn main() {
use bsharp::analysis::quality::QualityAnalyzer;

let parser = Parser::new();
let cu = parser.parse(source_code)?;

let analyzer = QualityAnalyzer::new();
let report = analyzer.analyze(&cu);

println!("Quality Score: {}", report.quality_score.overall);
println!("Issues Found: {}", report.diagnostics.len());
}

Custom Rules

#![allow(unused)]
fn main() {
use bsharp::analysis::quality::QualityRule;

struct CustomRule;

impl QualityRule for CustomRule {
    fn id(&self) -> &'static str { "custom_rule" }
    fn name(&self) -> &'static str { "Custom Rule" }
    fn description(&self) -> &'static str { "Example custom rule" }
    
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession) {
        // Custom logic
    }
}

// Register custom rule
analyzer.add_rule(Box::new(CustomRule));
}

Future Enhancements

Planned Features

Machine Learning-Based Detection
- Learn from codebase patterns
- Detect project-specific smells
Refactoring Suggestions
- Automated refactoring proposals
- Preview refactoring impact
Quality Trends
- Track quality over time
- Identify degradation
- Measure improvement
Team Metrics
- Per-developer quality metrics
- Code review insights
- Best practice adoption

Analysis Pipeline - Pipeline integration
Metrics Collection - Quality metrics
Control Flow Analysis - Complexity analysis
Architecture - Design decisions

References

Standards: Clean Code (Robert C. Martin), Refactoring (Martin Fowler)

Passes and Rules Registry

This page summarizes the default analysis registry: which passes and rulesets are registered by default and when they run.

Default Registry

Source: src/bsharp_analysis/src/framework/registry.rs

Simplified summary based on default_registry():

Pass: indexing::IndexingPass - indexing/symbols
Pass: pe_loader::PeLoaderPass - external PE metadata (if available)
Pass: metrics::MetricsPass - local metrics (Query-based)
Ruleset (local): rules::naming - naming conventions
Ruleset (local): rules::semantic - baseline semantic checks (local)
Pass: control_flow::ControlFlowPass - control flow stats and diagnostics
Pass: dependencies::DependenciesPass - dependency graph & summary
Ruleset (semantic): control_flow_smells - consumes global artifacts
Pass: reporting::ReportingPass - consolidate artifacts into report

Notes:

Each pass declares its own Phase (AnalyzerPass::phase()), e.g. MetricsPass runs in Phase::LocalRules.
Semantic rulesets (e.g., control_flow_smells) run after global artifacts are produced.

Phases

Index: Build indexes (symbols, FQNs) and load external metadata.
LocalRules: Run per-file local analyses (e.g., metrics) and baseline rules.
Global/Semantic: Build global artifacts (control flow, dependencies), then run semantic rules consuming them.
Reporting: Finalize results into AnalysisReport.
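
The phase ordering can be illustrated with a small scheduling sketch. The `Phase` enum below is an assumption: its variant names follow the list above with Global and Semantic split into two variants, and the real enum in bsharp_analysis may be defined differently.

```rust
/// Phase names follow the list above; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Index,
    LocalRules,
    Global,
    Semantic,
    Reporting,
}

/// Orders (id, phase) pairs by phase, keeping registration order within a phase.
fn schedule(mut passes: Vec<(&'static str, Phase)>) -> Vec<&'static str> {
    passes.sort_by_key(|&(_, phase)| phase); // sort_by_key is a stable sort
    passes.into_iter().map(|(id, _)| id).collect()
}
```

Because the sort is stable, two passes registered for the same phase keep the order in which they were added.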

Configuration: Enabling/Disabling

Toggles are driven by AnalysisConfig:

Passes: enable_passes[pass_id] = true|false
Rulesets: enable_rulesets[ruleset_id] = true|false
Severities: rule_severities["CODE"] = Error|Warning|Info|Hint

The CLI maps flags to these fields (see docs/cli/analyze.md).
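
As a rough sketch of how these toggles might be consulted, the following stand-in models the three maps with plain `HashMap`s. The struct below is not the real `AnalysisConfig`; the helper method names are hypothetical, and treating a missing entry as "enabled" is an assumption made only for this example.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Error,
    Warning,
    Info,
    Hint,
}

// Stand-in for illustration; the real AnalysisConfig has more fields.
#[derive(Default)]
struct AnalysisConfig {
    enable_passes: HashMap<String, bool>,
    enable_rulesets: HashMap<String, bool>,
    rule_severities: HashMap<String, Severity>,
}

impl AnalysisConfig {
    // Assumption: an id without an explicit entry counts as enabled.
    fn pass_enabled(&self, id: &str) -> bool {
        self.enable_passes.get(id).copied().unwrap_or(true)
    }

    fn ruleset_enabled(&self, id: &str) -> bool {
        self.enable_rulesets.get(id).copied().unwrap_or(true)
    }

    // Falls back to the rule's own default severity when no override exists.
    fn severity_for(&self, code: &str, default: Severity) -> Severity {
        self.rule_severities.get(code).copied().unwrap_or(default)
    }
}

fn main() {
    let mut config = AnalysisConfig::default();
    config.enable_passes.insert("passes.dependencies".to_string(), false);
    config.rule_severities.insert("MET001".to_string(), Severity::Error);

    println!("dependencies: {}", config.pass_enabled("passes.dependencies"));
    println!("metrics: {}", config.pass_enabled("passes.metrics"));
    println!("naming: {}", config.ruleset_enabled("naming"));
    println!("MET001: {:?}", config.severity_for("MET001", Severity::Warning));
}
```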

IDs

Pass IDs (AnalyzerPass::id()):
- passes.indexing
- passes.pe_loader
- passes.metrics
- passes.control_flow
- passes.dependencies
- passes.reporting
Ruleset IDs depend on the ruleset constructors (e.g., naming, semantic, control_flow_smells).

References

src/bsharp_analysis/src/framework/registry.rs
src/bsharp_analysis/src/passes/*
src/bsharp_analysis/src/rules/*

Analysis Report Schema

The AnalysisReport summarizes diagnostics and artifacts produced by the analysis pipeline.

Struct

Source: src/bsharp_analysis/src/report/mod.rs

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_methods: usize,
    pub deep_nesting_methods: usize,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AnalysisReport {
    pub schema_version: u32,
    pub diagnostics: DiagnosticCollection,
    pub metrics: Option<AstAnalysis>,
    pub cfg: Option<CfgSummary>,
    pub deps: Option<DependencySummary>,
    pub workspace_warnings: Vec<String>,
    pub workspace_errors: Vec<String>,
}

Field Details

schema_version – current schema version (1)
diagnostics – all emitted diagnostics with codes, severities, locations
metrics – aggregated AstAnalysis when MetricsPass runs
cfg – summarized control flow stats when ControlFlowPass runs
deps – dependency summary when DependenciesPass runs
workspace_warnings – non-fatal workspace-level messages
workspace_errors – reserved for future use

Example (pretty JSON)

{
  "schema_version": 1,
  "diagnostics": {
    "diagnostics": [
      {
        "code": "CF002",
        "severity": "warning",
        "message": "High cyclomatic complexity",
        "file": "src/OrderProcessor.cs",
        "line": 42,
        "column": 17
      }
    ]
  },
  "metrics": {
    "total_classes": 15,
    "total_interfaces": 3,
    "total_structs": 2,
    "total_enums": 1,
    "total_records": 0,
    "total_delegates": 0,
    "total_methods": 87,
    "total_properties": 21,
    "total_fields": 12,
    "total_events": 0,
    "total_constructors": 15,
    "total_if_statements": 20,
    "total_for_loops": 5,
    "total_while_loops": 2,
    "total_switch_statements": 3,
    "total_try_statements": 1,
    "total_using_statements": 2,
    "cyclomatic_complexity": 245,
    "lines_of_code": 980,
    "max_nesting_depth": 5,
    "documented_methods": 0,
    "documented_classes": 0
  },
  "cfg": {
    "total_methods": 87,
    "high_complexity_methods": 5,
    "deep_nesting_methods": 3
  },
  "deps": {
    "nodes": 42,
    "edges": 120
  },
  "workspace_warnings": [],
  "workspace_errors": []
}

Where It Comes From

AnalysisReport::from_session(&session) collects:

metrics from session.artifacts.get::<AstAnalysis>()
cfg by summarizing the ControlFlowIndex artifact against thresholds
deps by summarizing DependencyGraph
diagnostics copied from session.diagnostics
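
To make "summarizing against thresholds" concrete, here is a minimal sketch. The `CfgSummary` fields match the struct above, but `MethodStats` and `summarize` are hypothetical, the real `ControlFlowIndex` layout may differ, and comparing with strictly-greater-than is an assumption.

```rust
#[derive(Debug, Default, PartialEq)]
struct CfgSummary {
    total_methods: usize,
    high_complexity_methods: usize,
    deep_nesting_methods: usize,
}

/// Hypothetical per-method record; the real ControlFlowIndex may differ.
struct MethodStats {
    complexity: usize,
    max_nesting: usize,
}

/// Counts methods whose values exceed the thresholds
/// (assumption: strictly greater than).
fn summarize(
    methods: &[MethodStats],
    complexity_threshold: usize,
    nesting_threshold: usize,
) -> CfgSummary {
    CfgSummary {
        total_methods: methods.len(),
        high_complexity_methods: methods
            .iter()
            .filter(|m| m.complexity > complexity_threshold)
            .count(),
        deep_nesting_methods: methods
            .iter()
            .filter(|m| m.max_nesting > nesting_threshold)
            .count(),
    }
}

fn main() {
    let methods = [
        MethodStats { complexity: 3, max_nesting: 1 },
        MethodStats { complexity: 18, max_nesting: 5 },
        MethodStats { complexity: 11, max_nesting: 2 },
    ];
    println!("{:?}", summarize(&methods, 10, 4));
}
```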

docs/cli/analyze.md – CLI options and examples
docs/analysis/pipeline.md – Where in the pipeline artifacts are produced

Writing an Analyzer Pass

This guide shows how to create a new analysis pass by implementing AnalyzerPass and registering it in the analysis pipeline.

Trait

Source: src/bsharp_analysis/src/framework/passes.rs

pub trait AnalyzerPass: Send + Sync + 'static {
    fn id(&self) -> &'static str;
    fn phase(&self) -> Phase;                 // Index | LocalRules | Global | Semantic | Reporting
    fn depends_on(&self) -> &'static [&'static str] { &[] }
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {}
}

Minimal Pass

use bsharp_analysis::framework::{AnalyzerPass, Phase, AnalysisSession};
use bsharp_syntax::CompilationUnit;

pub struct MyPass;

impl AnalyzerPass for MyPass {
    fn id(&self) -> &'static str { "passes.my_pass" }
    fn phase(&self) -> Phase { Phase::LocalRules }

    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        // Inspect `cu` and write results into `session.artifacts` or `session.diagnostics`
        // Example: count classes and log a note (pseudo)
        let mut count = 0usize;
        for _c in bsharp_analysis::framework::Query::from(cu).of::<bsharp_syntax::ClassDeclaration>() {
            count += 1;
        }
        // session.artifacts.insert(MyArtifact { class_count: count });
        // session.diagnostics.add(...);
    }
}

Registration

Add your pass to the default registry in src/bsharp_analysis/src/framework/registry.rs:

reg.register_pass(crate::passes::my_pass::MyPass);

Or, build a custom registry for experiments:

let mut reg = AnalyzerRegistry::default_registry();
reg.register_pass(MyPass);
AnalyzerPipeline::run_for_file(&cu, &mut session, &reg);

You can also toggle passes via AnalysisConfig.enable_passes["passes.my_pass"] = true|false (see configuration docs).

Tips

Keep passes small: Focus on one responsibility.
Prefer Query/AstWalker: Use Query for typed enumeration or AstWalker with Visit for custom traversal.
Write artifacts: Insert results with session.artifacts.insert(T) when they may be consumed later.
Determinism: Avoid non-deterministic ordering; use sorted maps/lists if needed.
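
The determinism tip can be illustrated with a short sketch: results gathered in a `HashMap` have no stable iteration order, so they are passed through a `BTreeMap` before being emitted. The function name and the per-file class count are hypothetical examples, not part of the framework.

```rust
use std::collections::{BTreeMap, HashMap};

/// Returns (file, count) pairs sorted by file name, so the output order is the
/// same on every run regardless of HashMap iteration order.
fn sorted_counts(per_file: &HashMap<String, usize>) -> Vec<(String, usize)> {
    let ordered: BTreeMap<&String, &usize> = per_file.iter().collect();
    ordered
        .into_iter()
        .map(|(file, count)| (file.clone(), *count))
        .collect()
}

fn main() {
    let mut per_file = HashMap::new();
    per_file.insert("src/B.cs".to_string(), 2);
    per_file.insert("src/A.cs".to_string(), 1);
    for (file, count) in sorted_counts(&per_file) {
        println!("{file}: {count}");
    }
}
```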

Writing a Ruleset

This guide shows how to define rules and bundle them into a RuleSet to be executed by the analysis pipeline.

Traits and Types

Source: src/bsharp_analysis/src/framework/rules.rs

pub enum RuleTarget { All, Declarations, Members, Statements, Expressions }

pub trait Rule: Send + Sync + 'static {
    fn id(&self) -> &'static str;
    fn category(&self) -> &'static str;
    fn applies_to(&self) -> RuleTarget { RuleTarget::All }
    fn visit(&self, _node: &NodeRef, _session: &mut AnalysisSession) {}
}

pub struct RuleSet { pub id: &'static str, pub rules: Vec<Box<dyn Rule>> }

Minimal Rule

use bsharp_analysis::framework::{Rule, RuleTarget, NodeRef, AnalysisSession};

pub struct NoEmptyCatch;

impl Rule for NoEmptyCatch {
    fn id(&self) -> &'static str { "QUAL010" }
    fn category(&self) -> &'static str { "quality" }
    fn applies_to(&self) -> RuleTarget { RuleTarget::Statements }

    fn visit(&self, node: &NodeRef, session: &mut AnalysisSession) {
        if let NodeRef::Statement(stmt) = node {
            if let bsharp_syntax::statements::statement::Statement::Try(t) = stmt {
                for c in &t.catches {
                    if c.block_is_empty() {
                        session.diagnostics.add(
                            bsharp_analysis::DiagnosticCode::from_static("QUAL010"),
                            "Empty catch block",
                            None,
                        );
                    }
                }
            }
        }
    }
}

Building a RuleSet

use bsharp_analysis::framework::RuleSet;

pub fn ruleset() -> RuleSet {
    RuleSet::new("quality")
        .with_rule(NoEmptyCatch)
        // .with_rule(AnotherRule)
}

Registration

Register local rulesets with register_ruleset and semantic rulesets with register_semantic_ruleset:

reg.register_ruleset(crate::rules::quality::ruleset());         // local rules
reg.register_semantic_ruleset(crate::rules::control_flow_smells::ruleset());

Rulesets can be enabled/disabled via AnalysisConfig.enable_rulesets["quality"] = true|false.

Tips

Choose RuleTarget thoughtfully to avoid unnecessary visits.
Emit diagnostics with specific codes and helpful messages.
Keep rules independent; accumulate state in AnalysisSession artifacts when needed.
Honor config toggles; only run if your ruleset is enabled.
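
The first tip is easier to see with a sketch of how a dispatcher might use `RuleTarget` to skip nodes a rule does not care about. `RuleTarget` mirrors the enum shown earlier; `NodeKind` and `target_accepts` are hypothetical names, and the real pipeline may filter differently.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum RuleTarget {
    All,
    Declarations,
    Members,
    Statements,
    Expressions,
}

// Hypothetical coarse classification of an AST node.
#[derive(Clone, Copy, PartialEq, Eq)]
enum NodeKind {
    Declaration,
    Member,
    Statement,
    Expression,
}

/// Returns true when a rule with the given target should visit this node.
fn target_accepts(target: RuleTarget, kind: NodeKind) -> bool {
    matches!(
        (target, kind),
        (RuleTarget::All, _)
            | (RuleTarget::Declarations, NodeKind::Declaration)
            | (RuleTarget::Members, NodeKind::Member)
            | (RuleTarget::Statements, NodeKind::Statement)
            | (RuleTarget::Expressions, NodeKind::Expression)
    )
}

fn main() {
    // A Statements-only rule (like NoEmptyCatch) is never called for expressions.
    println!("{}", target_accepts(RuleTarget::Statements, NodeKind::Statement));
    println!("{}", target_accepts(RuleTarget::Statements, NodeKind::Expression));
}
```

A narrow target lets the dispatcher reject most nodes with a cheap comparison before the rule's `visit` body runs.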

Command Line Interface

The BSharp CLI provides command-line tools for parsing, analyzing, and visualizing C# code.

Installation

From Source

git clone https://github.com/mikserek/bsharp.git
cd bsharp
cargo build --release

The binary will be available at target/release/bsharp.

Add to PATH

# Linux/macOS
export PATH="$PATH:/path/to/bsharp/target/release"

# Windows
# Add to System Environment Variables

Command Structure

bsharp <COMMAND> [OPTIONS] <INPUT>

Global Options

--help, -h      Show help information
--version, -V   Show version information

Argument Files (@file)

All commands support argument files via @file syntax. Example:

bsharp @args.txt

Where args.txt contains one argument per line (comments and quoting follow standard shell parsing rules).
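
A simplified sketch of @file expansion follows. It is not the CLI's implementation: `expand_args` is a hypothetical helper, it handles only the one-argument-per-line case, and it deliberately ignores the comment and quoting rules mentioned above. The file reader is passed in as a closure so the example runs without touching the disk.

```rust
use std::io;

/// Replaces each `@path` argument with the non-empty lines of that file.
/// Simplification: no comment or quote handling.
fn expand_args<F>(args: &[&str], read_file: F) -> io::Result<Vec<String>>
where
    F: Fn(&str) -> io::Result<String>,
{
    let mut out = Vec::new();
    for arg in args {
        match arg.strip_prefix('@') {
            Some(path) => {
                let text = read_file(path)?;
                out.extend(
                    text.lines()
                        .map(str::trim)
                        .filter(|line| !line.is_empty())
                        .map(String::from),
                );
            }
            None => out.push((*arg).to_string()),
        }
    }
    Ok(out)
}

fn main() {
    // In a real tool the closure would be `|p| std::fs::read_to_string(p)`.
    let args = expand_args(&["@args.txt", "--format", "json"], |_path: &str| {
        Ok("analyze\nMySolution.sln\n".to_string())
    })
    .unwrap();
    println!("{args:?}");
}
```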

Available Commands

parse

Parse C# source code and print a textual AST tree to stdout.

bsharp parse <INPUT>

See: Parse Command

tree

Generate a visualization of the Abstract Syntax Tree.

bsharp tree <INPUT> [--output <FILE>] [--format mermaid|dot]

Notes:

Default format is mermaid; output defaults to <input>.mmd.
For DOT/Graphviz, use --format dot (or graphviz); output defaults to <input>.dot.

See: Tree Visualization

analyze

Analyze C# code and generate a comprehensive analysis report.

bsharp analyze <INPUT> [OPTIONS]

See: Analysis Command

format

Format C# files using the built-in formatter and syntax emitters.

bsharp format <INPUT> [--write] [--newline-mode lf|crlf] [--max-consecutive-blank-lines <N>] \
  [--blank-line-between-members <BOOL>] [--trim-trailing-whitespace <BOOL>] \
  [--emit-trace] [--emit-trace-file <FILE>]

Notes:

<INPUT> can be a file or directory (recursively formats .cs files; skips hidden/bin/obj/target).
--write defaults to true; when false and a single file is given, the formatted output is printed to stdout.
Emission tracing can be enabled by --emit-trace or environment variable BSHARP_EMIT_TRACE=1.
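
To give a feel for what the whitespace-related flags control, here is a text-level sketch. It is not the BSharp formatter, which works from the AST and syntax emitters; `normalize` is a hypothetical function that only mimics the effect of --newline-mode, --max-consecutive-blank-lines, and --trim-trailing-whitespace on plain text.

```rust
/// Applies newline mode, blank-line limit, and trailing-whitespace trimming.
/// The result always ends with a newline (a choice made for this sketch).
fn normalize(text: &str, max_blank: usize, trim_trailing: bool, crlf: bool) -> String {
    let newline = if crlf { "\r\n" } else { "\n" };
    let mut out = String::new();
    let mut blank_run = 0usize;
    // `lines()` accepts both "\n" and "\r\n" endings in the input.
    for line in text.lines() {
        let line = if trim_trailing { line.trim_end() } else { line };
        if line.trim().is_empty() {
            blank_run += 1;
            if blank_run > max_blank {
                continue; // drop blank lines beyond the limit
            }
        } else {
            blank_run = 0;
        }
        out.push_str(line);
        out.push_str(newline);
    }
    out
}

fn main() {
    let input = "class A   \n\n\n\n{\n}\n";
    print!("{}", normalize(input, 1, true, false));
}
```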

See: Format Command

Common Usage Patterns

Quick Parse Check

# Check if file parses successfully
bsharp parse MyFile.cs

Generate AST for Inspection

# Textual AST tree, saved via shell redirection
bsharp parse MyFile.cs > ast.txt

Visualize Code Structure

# Generate Mermaid diagram (default), writes MyClass.mmd
bsharp tree MyClass.cs

# Generate Graphviz DOT diagram
bsharp tree MyClass.cs --format dot --output diagram.dot

Analyze Project Quality

# Full analysis with report
bsharp analyze MyProject.csproj --out report.json --format pretty-json

Analyze Solution

# Analyze entire solution
bsharp analyze MySolution.sln --follow-refs true

Input Types

Single File

bsharp parse Program.cs

Project File (.csproj)

bsharp analyze MyProject.csproj

Solution File (.sln)

bsharp analyze MySolution.sln

Output Formats

JSON (Compact)

bsharp analyze MyFile.cs --format json

Output: Single-line JSON, optimized for machine consumption

Pretty JSON

bsharp analyze MyFile.cs --format pretty-json

Output: Indented JSON, human-readable

Mermaid/DOT (Tree Command)

# Mermaid (default)
bsharp tree MyFile.cs --output diagram.mmd

# Graphviz DOT
bsharp tree MyFile.cs --format dot --output diagram.dot

Output: Mermaid (.mmd) or Graphviz DOT (.dot)

Error Handling

Parse Errors

$ bsharp parse InvalidSyntax.cs
Error: Parse failed at line 5, column 12
Expected ';' but found 'class'

public class MyClass
            ^

File Not Found

$ bsharp parse NonExistent.cs
Error: File not found: NonExistent.cs

Invalid Project

$ bsharp analyze Invalid.csproj
Error: Failed to parse project file: Invalid XML

Environment Variables

RUST_LOG

Control logging verbosity:

# Show all logs
RUST_LOG=debug bsharp parse MyFile.cs

# Show only warnings and errors
RUST_LOG=warn bsharp analyze MyProject.csproj

# Show specific module logs (the filter target is a crate/module path)
RUST_LOG=bsharp_parser=debug bsharp parse MyFile.cs

RUST_BACKTRACE

Enable stack traces on panic:

RUST_BACKTRACE=1 bsharp parse MyFile.cs

Performance Considerations

Large Files

For large files (> 10,000 lines), parsing may take several seconds:

# Monitor progress with info-level logging
RUST_LOG=info bsharp parse LargeFile.cs

Large Solutions

For solutions with many projects, use parallel analysis:

# Requires parallel_analysis feature
cargo build --release --features parallel_analysis
bsharp analyze LargeSolution.sln

Memory Usage

Memory usage scales with AST size. For very large codebases:

# Analyze incrementally by project (bash; globstar enables the ** pattern)
shopt -s globstar
for proj in **/*.csproj; do
    bsharp analyze "$proj" --out "$(basename "$proj" .csproj).json"
done

Integration with Other Tools

CI/CD Pipeline

# GitHub Actions example
- name: Analyze Code Quality
  run: |
    bsharp analyze MySolution.sln --out analysis.json
    # Upload analysis.json as artifact

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit

changed_files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.cs$')

for file in $changed_files; do
    if ! bsharp parse "$file" > /dev/null 2>&1; then
        echo "Parse error in $file"
        exit 1
    fi
done

Editor Integration

// VS Code tasks.json
{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "Analyze Current File",
            "type": "shell",
            "command": "bsharp",
            "args": [
                "analyze",
                "${file}",
                "--out",
                "${file}.analysis.json"
            ]
        }
    ]
}

Troubleshooting

Command Not Found

$ bsharp parse MyFile.cs
bsharp: command not found

Solution: Add bsharp to PATH or use full path:

/path/to/bsharp/target/release/bsharp parse MyFile.cs

Permission Denied

$ bsharp parse MyFile.cs
Permission denied

Solution: Make binary executable:

chmod +x /path/to/bsharp

Out of Memory

$ bsharp analyze HugeSolution.sln
Error: memory allocation failed

Solution: Analyze smaller subsets or increase system memory

Configuration Files

Analysis Configuration

Create .bsharp.toml in project root:

[analysis]
max_cyclomatic_complexity = 10
max_method_length = 50

[analysis.quality]
long_method = "warning"
god_class = "error"

[workspace]
follow_refs = true
include = ["**/*.cs"]
exclude = ["**/obj/**", "**/bin/**"]

Usage:

# Automatically loads .bsharp.toml from current directory
bsharp analyze MyProject.csproj

Shell Completion

Shell completion generation is currently not available in the CLI.

Examples

Example 1: Quick Syntax Check

# Check if all C# files in directory parse correctly
find . -name "*.cs" -exec bsharp parse {} \; 2>&1 | grep -i error

Example 2: Generate Documentation

# Parse all files and extract class/method names
for file in src/**/*.cs; do
    bsharp parse "$file" --output "${file}.json"
done

# Process JSON to generate documentation
# (custom script)

Example 3: Code Quality Gate

#!/bin/bash
# quality-gate.sh

bsharp analyze MyProject.csproj --out report.json --format json

# Extract error count (.diagnostics is an object wrapping the diagnostic array)
errors=$(jq '[.diagnostics[] | arrays | .[] | select(.severity == "error")] | length' report.json)

if [ "$errors" -gt 0 ]; then
    echo "Quality gate failed: $errors errors found"
    exit 1
fi

echo "Quality gate passed"

Example 4: Complexity Report

# Generate complexity report for all methods
bsharp analyze MySolution.sln --out complexity.json

# Extract high-complexity methods (.diagnostics is an object wrapping the diagnostic array)
jq '[.diagnostics[] | arrays | .[] | select(.code == "MET001")]' complexity.json

CLI Architecture

Implementation

Location: src/bsharp_cli/

src/bsharp_cli/
├── src/
│   ├── main.rs         # CLI entry point, clap definitions
│   └── commands/
│       ├── mod.rs      # Command module exports
│       ├── parse.rs    # Parse command implementation
│       ├── tree.rs     # AST visualization command (Mermaid/DOT)
│       └── analyze.rs  # Analysis command
└── Cargo.toml

Command Pattern

Each command follows this pattern:

pub fn execute(input: PathBuf, /* other args */) -> Result<()> {
    // 1. Validate input
    // 2. Load/parse files
    // 3. Perform operation
    // 4. Generate output
    // 5. Handle errors
    Ok(())
}

Future Enhancements

Planned Features

Interactive Mode
- REPL for exploring AST
- Interactive analysis
Watch Mode
- Monitor files for changes
- Re-analyze on save
Language Server
- LSP implementation
- IDE integration
Web Interface
- Browser-based visualization
- Interactive reports

Parse Command - Detailed parse command documentation
Tree Visualization - AST visualization
Analysis Pipeline - Analysis internals

References

Implementation: src/bsharp_cli/
Commands: src/bsharp_cli/src/commands/
Clap Documentation: https://docs.rs/clap/

Parse Command

The parse command parses C# source code and prints a textual AST tree representation to stdout.

Usage

bsharp parse --input <INPUT> [--errors-json] [--emit-spans] [--no-color] [--lenient]

Arguments

<INPUT> (required)

Path to C# source file
Must have .cs extension
File must exist and be readable

Options

--errors-json

Print a machine-readable JSON error object to stdout on parse failure and exit non-zero
Disables pretty error output

See: Parse Errors JSON Output

--emit-spans

When used with --errors-json, include absolute and relative spans in the JSON under error.spans.
No effect unless --errors-json is set.

--no-color

Disable ANSI colors in pretty error output

--lenient

Enable best-effort recovery mode (default is strict)

Note: The --output option is currently not used; the command writes the textual tree to stdout.

Examples

Basic Parsing

# Parse and print textual AST tree to stdout
bsharp parse Program.cs

Batch Parsing

# Parse all C# files in a directory (prints textual trees)
for file in src/**/*.cs; do
    bsharp parse "$file"
done

Output

The command prints a human-readable textual tree describing the AST. For visualization outputs (Mermaid/DOT), use the tree command.

Error Handling

Parse Errors

$ bsharp parse InvalidSyntax.cs
Error: Parse failed

0: at line 5, in keyword "class":
public clas MyClass { }
       ^--- expected keyword "class"

1: in context "class declaration"

Error Information:

Line and column numbers
Context stack showing where parsing failed
Expected vs. actual input
Helpful error messages
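
Line and column numbers like the ones above are typically derived from a byte offset into the source. The sketch below shows one common way to do that; `line_col` is a hypothetical helper, not the function BSharp uses, and it reports columns in bytes rather than characters.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns are counted in bytes; offsets past the end are clamped.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let offset = offset.min(source.len());
    let before = &source.as_bytes()[..offset];
    // Line number: one more than the number of newlines before the offset.
    let line = before.iter().filter(|&&b| b == b'\n').count() + 1;
    // Start of the current line: just after the last newline, or 0.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    (line, offset - line_start + 1)
}

fn main() {
    let source = "namespace Demo;\n\npublic clas MyClass { }\n";
    let offset = source.find("clas").unwrap();
    let (line, column) = line_col(source, offset);
    println!("line {line}, column {column}");
}
```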

Pretty error formatting

The parser integrates with the miette crate for rich, labeled diagnostics in pretty (non-JSON) mode. CLI parse errors are formatted from the underlying ErrorTree with spans and context information for easier debugging.

For programmatic formatting from parser code, see bsharp_parser::errors::to_miette_report which converts an ErrorTree to a miette::Report with source code attached.

File Errors

$ bsharp parse NonExistent.cs
Error: Failed to read file: NonExistent.cs
Caused by: No such file or directory (os error 2)

Use Cases

1. Syntax Validation

# Check if file has valid syntax
if bsharp parse MyFile.cs > /dev/null 2>&1; then
    echo "Syntax OK"
else
    echo "Syntax Error"
    exit 1
fi

2. AST Inspection

# Parse and inspect AST structure
bsharp parse MyClass.cs --output ast.json
jq '.declarations[0].Class.name.name' ast.json

3. Documentation Input

# Parse C# and generate documentation using your own script
bsharp parse MyFile.cs --output ast.json
python generate_docs.py ast.json > docs.md

4. Static Analysis

# Parse and analyze with custom tool
bsharp parse MyFile.cs --output ast.json
./my-analyzer ast.json

Performance

Parsing Speed

Small files (< 100 lines): < 10ms
Medium files (100-1000 lines): 10-100ms
Large files (1000-10000 lines): 100ms-1s
Very large files (> 10000 lines): 1-10s

Memory Usage

Memory usage scales linearly with file size
Typical: 1-5 MB per 1000 lines of code
Peak memory during AST construction

Integration

CI/CD Pipeline

# GitHub Actions
- name: Validate C# Syntax
  run: |
    find . -name "*.cs" | while read file; do
      bsharp parse "$file" || exit 1
    done

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit

git diff --cached --name-only --diff-filter=ACM | grep '\.cs$' | while read file; do
    if ! bsharp parse "$file" > /dev/null 2>&1; then
        echo "Parse error in $file"
        exit 1
    fi
done

Build Script

#!/bin/bash
# validate-syntax.sh

errors=0
for file in src/**/*.cs; do
    if ! bsharp parse "$file" > /dev/null 2>&1; then
        echo "ERROR: $file"
        ((errors++))
    fi
done

if [ $errors -gt 0 ]; then
    echo "Found $errors files with syntax errors"
    exit 1
fi

Comparison with Other Tools

vs. Roslyn

BSharp: Fast, standalone, JSON output
Roslyn: Full compiler, .NET required, complex API

vs. Tree-sitter

BSharp: C#-specific, complete AST
Tree-sitter: Multi-language, syntax tree only

Implementation

Location: src/bsharp_cli/src/commands/parse.rs

pub fn execute(
    input: PathBuf,
    output: Option<PathBuf>,
    errors_json: bool,
    no_color: bool,
    lenient: bool,
) -> Result<()> {
    // Read file, choose strict/lenient, parse, and print the textual tree to stdout
    // (`output` is currently unused; see the note under Options).
    // See the source file for detailed behavior and error formatting.
    Ok(())
}

CLI Overview - General CLI usage
Tree Visualization - Visualize parsed AST
AST Structure - AST node reference
Error Handling - Parse error details

References

Implementation: src/bsharp_cli/src/commands/parse.rs
Parser: src/bsharp_parser/src/
AST Definitions: src/bsharp_syntax/src/

Tree Visualization Command

The tree command generates a visualization of the Abstract Syntax Tree (AST) from C# source code in Mermaid or Graphviz DOT format.

Usage

bsharp tree <INPUT> [--output <FILE>] [--format mermaid|dot]

Arguments

<INPUT> (required)

Path to C# source file
Must have .cs extension

Options

--output, -o <FILE> (optional)

Output file path
Default: <input>.mmd for Mermaid, <input>.dot for DOT

--format <FORMAT> (optional)

One of: mermaid (default), dot (alias: graphviz)

Examples

Basic Visualization

# Generate Mermaid diagram (default)
bsharp tree Program.cs              # writes Program.mmd

# Generate Graphviz DOT diagram
bsharp tree Program.cs --format dot # writes Program.dot

# Specify output file
bsharp tree Program.cs --format dot --output ast-diagram.dot

View/Render

# Mermaid preview (e.g., VS Code Mermaid extension) or CLI renderer
# Graphviz render to PNG
dot -Tpng Program.dot -o Program.png

Output Formats

Mermaid

Outputs a simple top-level graph in Mermaid syntax (.mmd).

graph TD
n0["CompilationUnit\\nUsings: 1\\nDecls: 1"]
u0["Using using System;"]
n0 --> u0
d0["Class: Program"]
n0 --> d0

Graphviz DOT

Outputs a simple top-level graph in DOT syntax (.dot).

digraph AST {
  node [shape=box, fontname="Courier New"];
  n0 [label="CompilationUnit\\nUsings: 1\\nDecls: 1"];
  u0 [label="Using using System;"];
  n0 -> u0;
  d0 [label="Class: Program"];
  n0 -> d0;
}
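
For readers who want to see how such DOT text can be produced, here is a minimal sketch over a toy tree. `Node`, `tree_to_dot`, and `emit` are hypothetical names for illustration; the real renderer in bsharp_syntax works on the actual AST and may number and label nodes differently.

```rust
// Toy tree for illustration; not a bsharp_syntax type.
struct Node {
    label: String,
    children: Vec<Node>,
}

/// Renders the tree as a Graphviz digraph with one box per node.
fn tree_to_dot(root: &Node) -> String {
    let mut out = String::from("digraph AST {\n  node [shape=box];\n");
    let mut next_id = 0usize;
    emit(root, &mut next_id, &mut out);
    out.push_str("}\n");
    out
}

/// Emits `node` and its subtree in pre-order and returns the id it was given.
fn emit(node: &Node, next_id: &mut usize, out: &mut String) -> usize {
    let id = *next_id;
    *next_id += 1;
    // Escape backslashes and quotes so the label stays a valid DOT string.
    let label = node.label.replace('\\', "\\\\").replace('"', "\\\"");
    out.push_str(&format!("  n{id} [label=\"{label}\"];\n"));
    for child in &node.children {
        let child_id = emit(child, next_id, out);
        out.push_str(&format!("  n{id} -> n{child_id};\n"));
    }
    id
}

fn main() {
    let tree = Node {
        label: "CompilationUnit".to_string(),
        children: vec![
            Node { label: "Using System".to_string(), children: vec![] },
            Node { label: "Class: Program".to_string(), children: vec![] },
        ],
    };
    print!("{}", tree_to_dot(&tree));
}
```

The printed text can be rendered with Graphviz in the same way as the command's own output, for example by piping it to `dot -Tpng`.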

Color Scheme

Gray - Root nodes (CompilationUnit)
Blue - Type declarations (Class, Interface, Struct)
Green - Member declarations (Method, Property, Field)
Yellow - Statements (If, For, While)
Orange - Expressions (Binary, Invocation)
Purple - Types (Primitive, Named, Generic)

Visualization Features

Node Information

Each node displays:

Node Type - AST node type name
Identifier - Name (for named nodes)
Additional Info - Modifiers, types, etc.

Tree Layout

Top-down - Root at top, leaves at bottom
Hierarchical - Parent-child relationships clear
Balanced - Nodes distributed evenly
Scalable - Adjusts to tree size

Use Cases

1. Understanding Code Structure

# Visualize complex class
bsharp tree ComplexClass.cs --output structure.mmd

2. Teaching/Documentation

# Generate diagrams for documentation
bsharp tree Example.cs --output docs/ast-example.mmd

3. Debugging Parser

# Verify parser output
bsharp tree TestCase.cs --output debug.mmd

4. Code Review

# Visualize changes
bsharp tree NewFeature.cs --output review.mmd

Limitations

Large Files

Files > 1000 lines may produce very large diagrams
Consider visualizing specific classes/methods only

Complex Nesting

Deeply nested structures may be hard to read
Rendered diagrams may require horizontal scrolling

Performance

Generation time increases with AST size
Large files (> 5000 lines) may take several seconds

Advanced Usage

Selective Visualization

# Extract specific class and visualize
# (requires custom script to extract class)
extract-class.sh MyFile.cs MyClass > temp.cs
bsharp tree temp.cs --output MyClass-ast.mmd
rm temp.cs

Batch Generation

# Generate visualizations for all files
for file in src/**/*.cs; do
    output="diagrams/$(basename "$file" .cs).mmd"
    bsharp tree "$file" --output "$output"
done

Integration with Documentation

# MyClass Documentation

## AST Structure

![AST Diagram](./diagrams/MyClass.svg)

The class structure shows...

Implementation

Location: src/bsharp_cli/src/commands/tree.rs

pub fn execute(args: Box<TreeArgs>) -> Result<()> {
    // Parses input in lenient mode, then writes Mermaid (.mmd) or DOT (.dot)
    // using bsharp_syntax::node::render::{to_mermaid, to_dot}.
    Ok(())
}

Renderer functions live in src/bsharp_syntax/src/node/render.rs:

to_mermaid(&ast);
to_dot(&ast);

Customization

Future Enhancements

Interactive SVG
- Click to expand/collapse nodes
- Hover for details
- Search functionality
Export Formats
- PNG/PDF export
- PlantUML format
Filtering
- Show only specific node types
- Hide implementation details
- Focus on structure
Styling
- Custom color schemes
- Font customization
- Layout options

Troubleshooting

Diagram Too Large

Problem: Rendered diagram is too large to view

Solution:

Visualize smaller code sections
Render to SVG and use a viewer with zoom/pan
Export to PDF for printing

Overlapping Nodes

Problem: Nodes overlap in complex trees

Solution:

Increase SVG dimensions
Simplify code structure
Use horizontal layout (future feature)

Missing Nodes

Problem: Some AST nodes not shown

Solution:

Check parser output with parse command
Report issue if nodes are missing

CLI Overview - General CLI usage
Parse Command - Parse textual AST tree
AST Structure - AST node reference

References

Implementation: src/bsharp_cli/src/commands/tree.rs
Formats: Mermaid or Graphviz DOT

Analyze Command

The analyze command performs comprehensive code analysis on C# files, projects, or solutions, generating detailed reports with diagnostics, metrics, and quality assessments.

Usage

bsharp analyze <INPUT> [OPTIONS]

Arguments

<INPUT> (required)

Path to C# source file (.cs)
Path to project file (.csproj)
Path to solution file (.sln)
Path to directory

Options

Output Options

--out <FILE>

Output file path for analysis report (JSON)
Default: stdout
Creates parent directories if needed

--format <FORMAT>

Output format: json (compact) or pretty-json (indented)
Default: pretty-json

Configuration

--config <FILE>

Path to analysis configuration file (JSON or TOML)
Overrides default settings
CLI flags override config file settings
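
The precedence described here (CLI flag, then config file, then built-in default) can be sketched with `Option` chaining. `resolve` is a hypothetical helper, not part of the CLI; the default values used in the example are the ones documented on this page.

```rust
/// Picks the CLI value if given, otherwise the config-file value,
/// otherwise the built-in default.
fn resolve<T>(cli: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(file).unwrap_or(default)
}

fn main() {
    // --follow-refs not passed; config file sets it to false; default is true.
    let follow_refs = resolve(None, Some(false), true);
    // --format json passed; config file asks for pretty-json.
    let format = resolve(Some("json"), Some("pretty-json"), "pretty-json");
    println!("follow_refs={follow_refs} format={format}");
}
```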

See: Configuration Overview

Workspace Options

--follow-refs <BOOL>

Follow ProjectReference dependencies transitively
Default: true
Set to false to analyze only specified project

--include <GLOB>...

Include only files matching glob patterns
Multiple patterns allowed
Example: --include "**/*Service.cs" "**/*Controller.cs"

--exclude <GLOB>...

Exclude files matching glob patterns
Multiple patterns allowed
Example: --exclude "**/obj/**" "**/bin/**" "**/Tests/**"

Analysis Control

--enable-ruleset <ID>...

Enable specific rulesets by ID
Multiple IDs allowed
Overrides config file
Example: --enable-ruleset naming quality

--disable-ruleset <ID>...

Disable specific rulesets by ID
Multiple IDs allowed
Example: --disable-ruleset experimental

--enable-pass <ID>...

Enable specific analysis passes by ID
Multiple IDs allowed
Example: --enable-pass passes.indexing passes.control_flow

--disable-pass <ID>...

Disable specific analysis passes by ID
Multiple IDs allowed
Example: --disable-pass passes.dependencies

--severity <CODE=LEVEL>...

Override diagnostic severity for specific codes
Format: CODE=level where level is error, warning, info, or hint
Multiple overrides allowed
Example: --severity MET001=error QUAL010=warning
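
A minimal sketch of parsing one CODE=LEVEL override is shown below. `parse_override` and the local `Severity` enum are stand-ins for illustration; accepting levels case-insensitively and the exact error messages are assumptions, not documented CLI behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Error,
    Warning,
    Info,
    Hint,
}

/// Parses a single `CODE=LEVEL` argument such as `MET001=error`.
fn parse_override(arg: &str) -> Result<(String, Severity), String> {
    let (code, level) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected CODE=LEVEL, got '{arg}'"))?;
    if code.is_empty() {
        return Err(format!("missing diagnostic code in '{arg}'"));
    }
    // Assumption: levels are matched case-insensitively.
    let severity = match level.to_ascii_lowercase().as_str() {
        "error" => Severity::Error,
        "warning" => Severity::Warning,
        "info" => Severity::Info,
        "hint" => Severity::Hint,
        other => return Err(format!("unknown severity '{other}'")),
    };
    Ok((code.to_string(), severity))
}

fn main() {
    for arg in ["MET001=error", "QUAL010=warning", "MET001"] {
        println!("{arg} -> {:?}", parse_override(arg));
    }
}
```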

Legacy Options (Single File Mode)

--symbol <NAME>

Search for specific symbol by name
Only works in single-file mode
Prints symbol locations and information

Examples

Basic Analysis

# Analyze single file
bsharp analyze MyFile.cs

# Analyze project
bsharp analyze MyProject.csproj

# Analyze solution
bsharp analyze MySolution.sln

Output to File

# Save report to file
bsharp analyze MyProject.csproj --out report.json

# Compact JSON format
bsharp analyze MyProject.csproj --out report.json --format json

Using Configuration File

# Load config from file
bsharp analyze MyProject.csproj --config .bsharp.toml

# Config file with CLI overrides
bsharp analyze MyProject.csproj \
    --config .bsharp.toml \
    --enable-ruleset quality \
    --severity MET001=error

Workspace Filtering

# Analyze only service files
bsharp analyze MySolution.sln --include "**/*Service.cs"

# Exclude test files
bsharp analyze MySolution.sln --exclude "**/Tests/**"

# Multiple filters
bsharp analyze MySolution.sln \
    --include "src/**/*.cs" \
    --exclude "**/obj/**" "**/bin/**" "**/Tests/**"

Controlling Analysis

# Enable specific rulesets
bsharp analyze MyProject.csproj \
    --enable-ruleset naming quality control_flow

# Disable experimental features
bsharp analyze MyProject.csproj \
    --disable-ruleset experimental

# Enable/disable specific passes
bsharp analyze MyProject.csproj \
    --enable-pass passes.indexing passes.control_flow \
    --disable-pass passes.dependencies

Severity Overrides

# Treat specific warnings as errors
bsharp analyze MyProject.csproj \
    --severity MET001=error \
    --severity QUAL001=error

# Downgrade specific errors to warnings
bsharp analyze MyProject.csproj \
    --severity CS0168=warning

Symbol Search (Single File)

# Find symbol in file
bsharp analyze MyFile.cs --symbol MyClass

# Output:
# Found symbol 'MyClass' at line 10, column 14

Analysis Modes

Single File Mode

Triggered when: Input is a .cs file

Behavior:

Parses single file
Runs analysis pipeline on CompilationUnit
Supports --symbol option for symbol search
Faster for quick checks

Example:

bsharp analyze Program.cs --out analysis.json

Workspace Mode

Triggered when: Input is .sln, .csproj, or directory

Behavior:

Loads entire workspace
Discovers all source files
Follows project references (if --follow-refs true)
Applies include/exclude filters
Analyzes all files deterministically
Aggregates results into single report

Example:

bsharp analyze MySolution.sln \
    --follow-refs true \
    --exclude "**/Tests/**" \
    --out workspace-analysis.json
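
The mode selection described in this section can be sketched as a function of the input path. `Mode` and `mode_for` are hypothetical; the sketch decides by extension only and treats a path without an extension as a directory, whereas a real implementation would presumably check the file system.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum Mode {
    SingleFile,
    Workspace,
}

/// Chooses the analysis mode from the input path's extension.
/// Returns None for extensions the analyze command does not document.
fn mode_for(input: &Path) -> Option<Mode> {
    match input.extension().and_then(|ext| ext.to_str()) {
        Some("cs") => Some(Mode::SingleFile),
        Some("sln") | Some("csproj") => Some(Mode::Workspace),
        // Assumption: a path with no extension is a directory.
        None => Some(Mode::Workspace),
        Some(_) => None,
    }
}

fn main() {
    for input in ["Program.cs", "MySolution.sln", "MyProject.csproj", "src", "notes.txt"] {
        println!("{input}: {:?}", mode_for(Path::new(input)));
    }
}
```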

Configuration File Format

TOML Format

.bsharp.toml:

[analysis]
max_cyclomatic_complexity = 10
max_method_length = 50

[analysis.control_flow]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4

[analysis.quality]
long_method = "warning"
god_class = "error"
empty_catch = "error"

[workspace]
follow_refs = true
include = ["src/**/*.cs"]
exclude = ["**/obj/**", "**/bin/**", "**/Tests/**"]

[enable_rulesets]
naming = true
quality = true
control_flow = true

[enable_passes]
"passes.indexing" = true
"passes.control_flow" = true
"passes.dependencies" = true

[rule_severities]
MET001 = "error"
QUAL001 = "warning"

JSON Format

.bsharp.json:

{
  "analysis": {
    "max_cyclomatic_complexity": 10,
    "max_method_length": 50,
    "control_flow": {
      "cf_high_complexity_threshold": 10,
      "cf_deep_nesting_threshold": 4
    }
  },
  "workspace": {
    "follow_refs": true,
    "include": ["src/**/*.cs"],
    "exclude": ["**/obj/**", "**/bin/**"]
  },
  "enable_rulesets": {
    "naming": true,
    "quality": true
  },
  "enable_passes": {
    "passes.indexing": true,
    "passes.control_flow": true
  },
  "rule_severities": {
    "MET001": "error",
    "QUAL001": "warning"
  }
}

Output Format

Analysis Report Structure

{
  "schema_version": 1,
  "diagnostics": {
    "items": [
      {
        "code": "MET001",
        "severity": "warning",
        "message": "Method has high cyclomatic complexity",
        "file": "src/OrderService.cs",
        "line": 42,
        "column": 17,
        "end_line": 85,
        "end_column": 5
      }
    ]
  },
  "metrics": {
    "total_lines": 1250,
    "code_lines": 980,
    "comment_lines": 150,
    "blank_lines": 120,
    "class_count": 15,
    "interface_count": 3,
    "method_count": 87,
    "total_complexity": 245,
    "max_complexity": 18,
    "max_nesting_depth": 5
  },
  "cfg": {
    "total_methods": 87,
    "high_complexity_count": 5,
    "deep_nesting_count": 3
  },
  "deps": {
    "total_nodes": 15,
    "total_edges": 42,
    "circular_dependencies": 0,
    "max_depth": 4
  },
  "workspace_warnings": [
    "Failed to parse project: MyBrokenProject.csproj"
  ]
}

Diagnostic Fields

code: Diagnostic code (e.g., MET001, QUAL010)
severity: error, warning, info, or hint
message: Human-readable description
file: Source file path
line/column: Start position
end_line/end_column: End position (optional)

Metrics Fields

total_lines: Total lines including blank/comments
code_lines: Lines with actual code
comment_lines: Lines with comments
blank_lines: Empty lines
class_count: Number of classes
interface_count: Number of interfaces
method_count: Number of methods
total_complexity: Sum of all method complexities
max_complexity: Highest method complexity
max_nesting_depth: Deepest nesting level

Available Rulesets

Built-in Rulesets

naming - Naming convention rules

Class names: PascalCase
Method names: PascalCase
Field names: camelCase with _ prefix
Constant names: UPPER_CASE or PascalCase
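
To make the naming conventions concrete, the checks could look roughly like the following. This is a hedged sketch: the function names are hypothetical, only ASCII identifiers are considered, and the real ruleset may accept or reject edge cases differently.

```rust
/// PascalCase: an ASCII uppercase letter followed by letters or digits only.
pub fn is_pascal_case(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_uppercase() => chars.all(|c| c.is_ascii_alphanumeric()),
        _ => false,
    }
}

/// `_camelCase`: an underscore, a lowercase letter, then letters or digits only.
pub fn is_underscore_camel_case(name: &str) -> bool {
    match name.strip_prefix('_') {
        Some(rest) => {
            let mut chars = rest.chars();
            match chars.next() {
                Some(c) if c.is_ascii_lowercase() => {
                    chars.all(|c| c.is_ascii_alphanumeric())
                }
                _ => false,
            }
        }
        None => false,
    }
}

/// UPPER_CASE: starts with an uppercase letter; the rest is uppercase
/// letters, digits, or underscores.
pub fn is_upper_case(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_uppercase() => {
            chars.all(|c| c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_')
        }
        _ => false,
    }
}

/// Constants may use either UPPER_CASE or PascalCase.
pub fn is_valid_constant_name(name: &str) -> bool {
    is_upper_case(name) || is_pascal_case(name)
}
```

With these definitions, `OrderService` passes the class and method check, `_orderCount` passes the field check, and both `MAX_RETRIES` and `MaxRetries` are accepted as constant names.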

quality - Code quality rules

Long method detection
Long parameter list
God class detection
Empty catch blocks
Magic numbers

control_flow - Control flow rules

High complexity warnings
Deep nesting warnings
Unreachable code detection

semantic - Semantic rules

Type checking
Null reference analysis
Resource leak detection

Available Passes

Built-in Passes

indexing (Phase: Index)

Builds symbol index
Creates name index
Generates FQN map

control_flow (Phase: Semantic)

Analyzes control flow
Calculates complexity metrics
Detects control flow smells

dependencies (Phase: Global)

Builds dependency graph
Detects circular dependencies
Calculates coupling metrics

reporting (Phase: Reporting)

Generates final report
Aggregates diagnostics
Summarizes artifacts

Diagnostic Codes

Metrics (MET)

MET001: High cyclomatic complexity
MET002: Deep nesting detected
MET003: Long method
MET004: Long parameter list

Quality (QUAL)

QUAL001: Long method
QUAL002: Long parameter list
QUAL010: Empty catch block
QUAL020: Naming convention violation
QUAL030: Resource not disposed

Control Flow (CF)

CF001: Unreachable code
CF002: High complexity
CF003: Deep nesting

Dependencies (DEP)

DEP001: Circular dependency
DEP002: High coupling
DEP003: Unstable dependency

Performance

Analysis Speed

Single file (< 1000 lines): < 100ms
Small project (< 10 files): < 500ms
Medium project (10-50 files): 500ms-2s
Large solution (100+ files): 2-10s

Memory Usage

Scales with codebase size
Typical: 50-200 MB for medium projects
Artifacts cached in memory during analysis

Parallel Analysis

With parallel_analysis feature enabled:

cargo build --release --features parallel_analysis

Files are then analyzed in parallel, which is significantly faster for large workspaces.

Integration

CI/CD Pipeline

# GitHub Actions
- name: Code Quality Analysis
  run: |
    bsharp analyze MySolution.sln \
      --out analysis.json \
      --format json \
      --severity MET001=error QUAL001=error
    
    # Check for errors
    errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' analysis.json)
    if [ "$errors" -gt 0 ]; then
      echo "Quality gate failed: $errors errors"
      exit 1
    fi

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit

# Read staged .cs files line by line so paths containing spaces stay intact
while IFS= read -r file; do
    result=$(bsharp analyze "$file" --format json 2>/dev/null)
    # If the analyzer produced no usable JSON, jq prints nothing and the count defaults to 0
    errors=$(echo "$result" | jq '.diagnostics.items | map(select(.severity == "error")) | length' 2>/dev/null)

    if [ "${errors:-0}" -gt 0 ]; then
        echo "Analysis errors in $file"
        exit 1
    fi
done < <(git diff --cached --name-only --diff-filter=ACM | grep '\.cs$')

Quality Gate Script

#!/bin/bash
# quality-gate.sh

bsharp analyze MySolution.sln \
    --out report.json \
    --format json \
    --enable-ruleset naming quality control_flow \
    --severity MET001=error QUAL001=error

# Extract metrics
errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' report.json)
max_complexity=$(jq '.metrics.max_complexity' report.json)

echo "Errors: $errors"
echo "Max Complexity: $max_complexity"

if [ "$errors" -gt 0 ]; then
    echo "❌ Quality gate failed: $errors errors found"
    exit 1
fi

if [ "$max_complexity" -gt 15 ]; then
    echo "❌ Quality gate failed: complexity $max_complexity exceeds threshold 15"
    exit 1
fi

echo "✅ Quality gate passed"
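
The same gate logic can be expressed in Rust for tools that consume the report programmatically. This is a sketch only: `Severity`, `Diagnostic`, and `quality_gate` are hypothetical names that mirror the report fields, and deserializing the JSON (for example with a JSON library) is left out.

```rust
/// Mirrors the `severity` values listed under Diagnostic Fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error,
    Warning,
    Info,
    Hint,
}

/// Hypothetical, trimmed-down view of one entry in `diagnostics.items`.
pub struct Diagnostic {
    pub code: String,
    pub severity: Severity,
}

/// Fail when any diagnostic is an error or when the highest method
/// complexity exceeds the given limit, as in the shell script above.
pub fn quality_gate(
    diagnostics: &[Diagnostic],
    max_complexity: u32,
    complexity_limit: u32,
) -> Result<(), String> {
    let errors = diagnostics
        .iter()
        .filter(|d| d.severity == Severity::Error)
        .count();
    if errors > 0 {
        return Err(format!("{errors} errors found"));
    }
    if max_complexity > complexity_limit {
        return Err(format!(
            "complexity {max_complexity} exceeds threshold {complexity_limit}"
        ));
    }
    Ok(())
}
```

A caller would map `Err(..)` to a non-zero exit code, matching the behavior of `quality-gate.sh`.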

Troubleshooting

Analysis Fails

$ bsharp analyze MyProject.csproj
Error: Failed to load workspace

Solutions:

Check project file is valid XML
Verify all referenced projects exist
Use --follow-refs false to skip references

Out of Memory

Error: memory allocation failed

Solutions:

Analyze smaller subsets with --include/--exclude
Disable expensive passes with --disable-pass
Increase system memory

Slow Analysis

Solutions:

Build with parallel_analysis feature
Exclude unnecessary files
Disable unused rulesets/passes

CLI Overview - General CLI usage
Analysis Pipeline - Analysis internals
Metrics Collection - Metrics details
Code Quality - Quality rules
Report Schema - Output JSON layout
Configuration Overview - Config fields and examples

References

Implementation: src/bsharp_cli/src/commands/analyze.rs
Pipeline: src/bsharp_analysis/src/framework/pipeline.rs
Configuration: src/bsharp_analysis/src/context.rs

Format Command

The format command formats C# code using the built-in formatter and syntax emitters.

Usage

bsharp format <INPUT> [--write <BOOL>] [--print] [--newline-mode lf|crlf] \
  [--max-consecutive-blank-lines <N>] [--blank-line-between-members <BOOL>] \
  [--trim-trailing-whitespace <BOOL>] [--emit-trace] [--emit-trace-file <FILE>]

Arguments

<INPUT> (required)

Path to .cs file or directory
When a directory is given, formats all .cs files recursively
Hidden directories and bin/, obj/, target/ are skipped

Options

--write, -w <BOOL>

Write changes to files in-place
Default: true
When false and <INPUT> is a single file, the formatted content is printed to stdout
When false and formatting differences are found for multiple files, exits with code 2

--print

Always print formatted output for a single-file input and exit
Useful for piping to other tools; does not write to disk regardless of --write

--newline-mode <MODE>

Newline mode: lf (default) or crlf

--max-consecutive-blank-lines <N>

Maximum consecutive blank lines to keep (default: 1)

--blank-line-between-members <BOOL>

Insert a blank line between type members (default: true)

--trim-trailing-whitespace <BOOL>

Trim trailing whitespace (default: true)

--emit-trace

Enable emission tracing (JSONL) for debugging formatter behavior
Can also be enabled via environment variable BSHARP_EMIT_TRACE=1

--emit-trace-file <FILE>

Path to write the trace JSONL (defaults to stdout when omitted)
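
To illustrate how the whitespace-related options interact, here is a minimal sketch of a post-processing step. It is not the BSharp formatter: `WhitespaceOptions` and `normalize_whitespace` are hypothetical names, and the sketch handles only newline mode, blank-line limits, and trailing whitespace.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NewlineMode {
    Lf,
    Crlf,
}

/// Hypothetical subset of the formatter options described above.
pub struct WhitespaceOptions {
    pub newline_mode: NewlineMode,
    pub max_consecutive_blank_lines: usize,
    pub trim_trailing_whitespace: bool,
}

/// Apply the three whitespace options to `text`.
/// The result always ends with a newline (an assumption of this sketch).
pub fn normalize_whitespace(text: &str, opts: &WhitespaceOptions) -> String {
    let newline = match opts.newline_mode {
        NewlineMode::Lf => "\n",
        NewlineMode::Crlf => "\r\n",
    };
    let mut out = String::new();
    let mut blank_run = 0usize;
    // `lines()` splits on `\n` and also strips a trailing `\r`.
    for line in text.lines() {
        let line = if opts.trim_trailing_whitespace {
            line.trim_end()
        } else {
            line
        };
        if line.trim().is_empty() {
            blank_run += 1;
            // Drop blank lines beyond the configured maximum.
            if blank_run > opts.max_consecutive_blank_lines {
                continue;
            }
        } else {
            blank_run = 0;
        }
        out.push_str(line);
        out.push_str(newline);
    }
    out
}
```

With the documented defaults (`lf`, at most one blank line, trimming enabled), three consecutive blank lines collapse into one and trailing spaces are removed.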

Examples

# Format a single file in-place
bsharp format Program.cs

# Print formatted output to stdout (do not write)
bsharp format Program.cs --write false

# Print formatted output to stdout without writing to disk, regardless of --write
bsharp format Program.cs --print

# Format a directory recursively
bsharp format src/

# Use CRLF newlines and avoid extra blank lines
bsharp format Program.cs --newline-mode crlf --max-consecutive-blank-lines 1

# Enable emission tracing to a file
bsharp format Program.cs --emit-trace --emit-trace-file format_trace.jsonl

Implementation

Command: src/bsharp_cli/src/commands/format.rs
Formatter: bsharp_syntax::Formatter with FormatOptions
Emission tracing is controlled by CLI flags or BSHARP_EMIT_TRACE and recorded as JSONL.
Files that fail to parse are skipped; a summary is printed and they are not modified.

Parse Errors JSON Output

When bsharp parse is run with --errors-json, parse failures are emitted as a single JSON object to stdout and the process exits with a non-zero code.

Schema

{
  "error": {
    "kind": "parse_error",
    "file": "<path>",
    "line": 0,
    "column": 0,
    "expected": "",
    "found": "",
    "line_text": "",
    "message": "<pretty formatted message>",
    "spans": {
      "abs": { "start": 0, "end": 1 },
      "rel": {
        "start": { "line": 0, "column": 0 },
        "end": { "line": 0, "column": 1 }
      }
    }
  }
}

kind – always parse_error for parse failures.
file – path of the file being parsed.
line, column – 1-based location of the deepest error span.
expected, found – reserved fields (currently empty strings).
line_text – the full source line at the error location.
message – multi-line pretty message formatted from the parser's error tree.
spans – present only when --emit-spans is provided; includes absolute byte range and relative line/column positions.
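
The relationship between an absolute byte offset and the reported `line`, `column`, and `line_text` values can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, positions are 1-based, and columns are counted in characters rather than bytes, which may differ from what the parser actually does.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Returns None when the offset is out of range or splits a character.
pub fn line_col(source: &str, offset: usize) -> Option<(usize, usize)> {
    if !source.is_char_boundary(offset) {
        return None;
    }
    let before = &source[..offset];
    // Lines are counted by the newlines that precede the offset.
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    // Assumption: columns count characters, not bytes.
    let column = before[line_start..].chars().count() + 1;
    Some((line, column))
}

/// Return the full source line containing `offset`, without its line ending.
pub fn line_text(source: &str, offset: usize) -> Option<&str> {
    if !source.is_char_boundary(offset) {
        return None;
    }
    let start = source[..offset].rfind('\n').map_or(0, |i| i + 1);
    let end = source[offset..]
        .find('\n')
        .map_or(source.len(), |i| offset + i);
    Some(source[start..end].trim_end_matches('\r'))
}
```

For the source `"ab\ncd"`, offset 3 maps to line 2, column 1, and `line_text` returns `"cd"`.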

Example

bsharp parse Invalid.cs --errors-json | jq

{
  "error": {
    "kind": "parse_error",
    "file": "Invalid.cs",
    "line": 7,
    "column": 12,
    "expected": "",
    "found": "",
    "line_text": "public clas Program { }",
    "message": "0: at 7:12: expected keyword \"class\"\n  public clas Program { }\n           ^\nContexts:\n  - class declaration\n"
  }
}

Notes

In pretty (non-JSON) mode, errors are sent to stderr with optional ANSI colors (disable via --no-color or NO_COLOR=1).
--errors-json disables pretty errors and always prints the JSON object.

Workspace Loading

The BSharp workspace loading system provides comprehensive support for loading C# projects and solutions, including solution files (.sln), project files (.csproj), and directory-based discovery.

Overview

Location: src/bsharp_analysis/src/workspace/

The workspace loader:

Parses Visual Studio solution files (.sln)
Parses MSBuild project files (.csproj)
Discovers source files
Resolves project references
Handles multiple projects deterministically

Workspace Model

Core Types

pub struct Workspace {
    pub root: PathBuf,
    pub projects: Vec<Project>,
    pub solution: Option<Solution>,
    pub source_map: SourceMap,
}

pub struct Project {
    pub name: String,
    pub path: PathBuf,
    pub target_framework: String,
    pub output_type: String,
    pub files: Vec<ProjectFile>,
    pub references: Vec<ProjectRef>,
    pub package_references: Vec<PackageReference>,
    pub errors: Vec<String>,
}

pub struct Solution {
    pub name: String,
    pub path: PathBuf,
    pub projects: Vec<SolutionProject>,
}

Loading Workspaces

WorkspaceLoader API

pub struct WorkspaceLoader;

impl WorkspaceLoader {
    // Load from any path (auto-detects type)
    pub fn from_path(path: &Path) -> Result<Workspace>;
    
    // Load with options
    pub fn from_path_with_options(
        path: &Path, 
        opts: WorkspaceLoadOptions
    ) -> Result<Workspace>;
}

pub struct WorkspaceLoadOptions {
    pub follow_refs: bool,  // Follow ProjectReference transitively
}

Loading from Solution File

use std::path::Path;

use bsharp_analysis::workspace::WorkspaceLoader;

let workspace = WorkspaceLoader::from_path(Path::new("MySolution.sln"))?;

println!("Loaded {} projects", workspace.projects.len());
for project in &workspace.projects {
    println!("  - {}: {} files", project.name, project.files.len());
}

Loading from Project File

let workspace = WorkspaceLoader::from_path(Path::new("MyProject.csproj"))?;

// Automatically follows ProjectReference if follow_refs = true
assert!(workspace.projects.len() >= 1);

Loading from Directory

let workspace = WorkspaceLoader::from_path(Path::new("./src"))?;

// Discovers .sln or .csproj files in directory

Solution File Parsing

Solution Format

Example .sln:

Microsoft Visual Studio Solution File, Format Version 12.00
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "MyApp", "MyApp\MyApp.csproj", "{GUID}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "MyLib", "MyLib\MyLib.csproj", "{GUID}"
EndProject

Parsing Implementation

Location: src/bsharp_analysis/src/workspace/sln/reader.rs

pub struct SolutionReader;

impl SolutionReader {
    pub fn read(path: &Path) -> Result<Solution> {
        let content = fs::read_to_string(path)?;
        Self::parse(&content, path)
    }
    
    fn parse(content: &str, base_path: &Path) -> Result<Solution> {
        // Parse solution format
        // Extract project entries
        // Resolve project paths
    }
}

Solution Structure

pub struct Solution {
    pub name: String,
    pub path: PathBuf,
    pub projects: Vec<SolutionProject>,
}

pub struct SolutionProject {
    pub name: String,
    pub path: PathBuf,
    pub type_guid: String,
    pub guid: String,
}
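
As an illustration of the "extract project entries" step, a single `Project(...)` line from the example .sln could be parsed along these lines. This is a simplified sketch, not the actual `SolutionReader`: `ProjectEntry` and `parse_project_line` are hypothetical, the parser assumes no commas inside quoted fields, and converting backslashes to forward slashes is an assumption about path resolution.

```rust
/// Hypothetical flat view of one `Project(...)` line in a .sln file.
#[derive(Debug, PartialEq, Eq)]
pub struct ProjectEntry {
    pub type_guid: String,
    pub name: String,
    pub path: String,
    pub guid: String,
}

/// Parse a line of the form
/// `Project("{TYPE}") = "Name", "Rel\Path.csproj", "{GUID}"`.
/// Returns None for any other line (for example `EndProject`).
pub fn parse_project_line(line: &str) -> Option<ProjectEntry> {
    let rest = line.trim().strip_prefix("Project(\"")?;
    let (type_guid, rest) = rest.split_once("\")")?;
    let rest = rest.trim_start().strip_prefix('=')?;

    // Assumption: quoted fields never contain commas.
    let mut fields = rest.split(',').map(|f| f.trim().trim_matches('"'));
    let name = fields.next()?;
    let path = fields.next()?;
    let guid = fields.next()?;
    if fields.next().is_some() {
        return None;
    }

    let strip_braces = |s: &str| s.trim_matches(|c| c == '{' || c == '}').to_string();
    Some(ProjectEntry {
        type_guid: strip_braces(type_guid),
        name: name.to_string(),
        // Solution files use Windows separators; normalize for portability.
        path: path.replace('\\', "/"),
        guid: strip_braces(guid),
    })
}
```

Applied to the first `Project` line of the example solution, this yields the name `MyApp` and the path `MyApp/MyApp.csproj`; the resulting relative path would still need to be joined with the solution directory.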

Project File Parsing

Project Format

Example .csproj:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <OutputType>Exe</OutputType>
  </PropertyGroup>
  
  <ItemGroup>
    <Compile Include="Program.cs" />
    <Compile Include="Utils.cs" />
  </ItemGroup>
  
  <ItemGroup>
    <ProjectReference Include="..\MyLib\MyLib.csproj" />
  </ItemGroup>
  
  <ItemGroup>
    <PackageReference Include="Newtonsoft.Json" Version="13.0.1" />
  </ItemGroup>
</Project>

Parsing Implementation

Location: src/bsharp_analysis/src/workspace/csproj/reader.rs

pub struct CsprojReader;

impl CsprojReader {
    pub fn read(path: &Path) -> Result<Project> {
        let content = fs::read_to_string(path)?;
        Self::parse(&content, path)
    }
    
    fn parse(content: &str, project_path: &Path) -> Result<Project> {
        // Parse XML
        // Extract properties (TargetFramework, OutputType)
        // Discover source files (Compile items)
        // Extract ProjectReference entries
        // Extract PackageReference entries
    }
}

Source File Discovery

Glob Patterns:

Default: **/*.cs (all C# files recursively)
Respects <Compile Include="..." /> items
Respects <Compile Remove="..." /> exclusions
Excludes obj/ and bin/ directories

Implementation:

fn discover_source_files(project_dir: &Path) -> Vec<ProjectFile> {
    let pattern = project_dir.join("**/*.cs");
    let mut files = Vec::new();
    
    // glob::glob returns a Result, so unwrap the iterator before looping over its entries
    for entry in glob::glob(pattern.to_str().unwrap()).expect("invalid glob pattern") {
        let path = entry.unwrap();
        
        // Skip obj/ and bin/
        if path.components().any(|c| c.as_os_str() == "obj" || c.as_os_str() == "bin") {
            continue;
        }
        
        files.push(ProjectFile {
            path,
            kind: ProjectFileKind::Compile,
        });
    }
    
    files
}

Project References

Transitive Resolution

follow_refs Option:

let opts = WorkspaceLoadOptions { follow_refs: true };
let workspace = WorkspaceLoader::from_path_with_options(path, opts)?;

Behavior:

Follows <ProjectReference> transitively
Loads all referenced projects
Avoids duplicates
Stays within workspace root
Deterministic ordering (sorted by path)

Example:

MyApp.csproj
  → MyLib.csproj
    → MyCore.csproj

Result: [MyApp, MyLib, MyCore]

Implementation

fn follow_project_references(root: &Path, projects: &mut Vec<Project>) {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    
    // Add initial projects
    for proj in projects.iter() {
        seen.insert(proj.path.clone());
        queue.push_back(proj.path.clone());
    }
    
    // BFS traversal
    while let Some(proj_path) = queue.pop_front() {
        let proj = match CsprojReader::read(&proj_path) {
            Ok(p) => p,
            Err(_) => continue,
        };
        
        for ref_path in proj.references.iter().map(|r| &r.path) {
            // Resolve relative to project directory
            let abs_path = proj_path.parent().unwrap().join(ref_path);
            
            // Skip if outside root
            if !abs_path.starts_with(root) {
                continue;
            }
            
            // insert() returns true only for paths not seen before
            if seen.insert(abs_path.clone()) {
                queue.push_back(abs_path.clone());
                
                // Load and add project
                if let Ok(referenced_proj) = CsprojReader::read(&abs_path) {
                    projects.push(referenced_proj);
                }
            }
        }
    }
    
    // Sort for determinism
    projects.sort_by(|a, b| a.path.cmp(&b.path));
}
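
One detail the traversal above glosses over is that a joined path such as `MyApp/../MyLib/MyLib.csproj` still contains `..`, so both the `starts_with(root)` check and the duplicate check work best on a normalized path. The sketch below shows one way to do this lexically, without touching the file system. The function names are hypothetical, and lexical normalization ignores symlinks; `std::fs::canonicalize` is an alternative when the files are known to exist.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `.` and `..` components without accessing the file system.
pub fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let last_is_normal =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                let at_root = matches!(
                    out.components().next_back(),
                    Some(Component::RootDir | Component::Prefix(_))
                );
                if last_is_normal {
                    out.pop();
                } else if !at_root {
                    // Leading `..` of a relative path cannot be resolved; keep it.
                    out.push("..");
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Resolve a ProjectReference `Include` value relative to the referencing
/// project and return it only if it stays inside the workspace root.
pub fn resolve_reference(root: &Path, project_path: &Path, reference: &str) -> Option<PathBuf> {
    let project_dir = project_path.parent()?;
    // .csproj files typically use Windows separators.
    let relative = reference.replace('\\', "/");
    let resolved = normalize_lexically(&project_dir.join(relative));
    if resolved.starts_with(root) {
        Some(resolved)
    } else {
        None
    }
}
```

With a root of `ws` and a project at `ws/MyApp/MyApp.csproj`, the reference `..\MyLib\MyLib.csproj` resolves to `ws/MyLib/MyLib.csproj`, while `..\..\Outside\Lib.csproj` is rejected because it leaves the root.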

Source Map

Purpose

The SourceMap provides fast lookup of source files:

pub struct SourceMap {
    files: HashMap<PathBuf, SourceFileInfo>,
}

impl SourceMap {
    pub fn get(&self, path: &Path) -> Option<&SourceFileInfo>;
    pub fn all_files(&self) -> Vec<&Path>;
}

Usage

let workspace = WorkspaceLoader::from_path(path)?;

// Look up file
if let Some(info) = workspace.source_map.get(Path::new("Program.cs")) {
    println!("Found in project: {}", info.project_name);
}

// Iterate all files
for file_path in workspace.source_map.all_files() {
    println!("File: {}", file_path.display());
}
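
For readers who want to see how such a map could be backed, here is a minimal self-contained sketch. It is not the BSharp implementation: the `insert` method, the single-field `SourceFileInfo`, and the sorted result of `all_files` are assumptions made for illustration, with sorting chosen to match the deterministic ordering described elsewhere in this document.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical per-file metadata; the real type may carry more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SourceFileInfo {
    pub project_name: String,
}

#[derive(Debug, Default)]
pub struct SourceMap {
    files: HashMap<PathBuf, SourceFileInfo>,
}

impl SourceMap {
    /// Register a file (an assumed helper, not part of the documented API).
    pub fn insert(&mut self, path: PathBuf, info: SourceFileInfo) {
        self.files.insert(path, info);
    }

    /// Constant-time lookup on average, keyed by the exact path used at insertion.
    pub fn get(&self, path: &Path) -> Option<&SourceFileInfo> {
        self.files.get(path)
    }

    /// All known paths, sorted so that iteration order is reproducible.
    pub fn all_files(&self) -> Vec<&Path> {
        let mut paths: Vec<&Path> = self.files.keys().map(PathBuf::as_path).collect();
        paths.sort();
        paths
    }
}
```

Because `HashMap` iteration order is unspecified, sorting inside `all_files` is what keeps repeated runs consistent in this sketch.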

Error Handling

Resilient Loading

Philosophy: Continue loading even if individual projects fail

// Failed projects recorded as stubs with errors
let workspace = WorkspaceLoader::from_path(sln_path)?;

for project in &workspace.projects {
    if !project.errors.is_empty() {
        eprintln!("Errors in {}: {:?}", project.name, project.errors);
    }
}

Error Types

pub enum WorkspaceError {
    IoError(io::Error),
    ParseError(String),
    InvalidPath(String),
    Unsupported(String),
}
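
An error enum like this is usually paired with `Display`, `std::error::Error`, and a `From<io::Error>` conversion so that `?` works on I/O calls. The following is a sketch of that standard pattern, not a copy of BSharp's code: the message wording and the `read_project_file` helper are assumptions.

```rust
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum WorkspaceError {
    IoError(io::Error),
    ParseError(String),
    InvalidPath(String),
    Unsupported(String),
}

impl fmt::Display for WorkspaceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WorkspaceError::IoError(e) => write!(f, "I/O error: {e}"),
            WorkspaceError::ParseError(msg) => write!(f, "parse error: {msg}"),
            WorkspaceError::InvalidPath(p) => write!(f, "invalid path: {p}"),
            WorkspaceError::Unsupported(what) => write!(f, "unsupported input: {what}"),
        }
    }
}

impl std::error::Error for WorkspaceError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            WorkspaceError::IoError(e) => Some(e),
            _ => None,
        }
    }
}

// Lets `?` convert io::Error into WorkspaceError automatically.
impl From<io::Error> for WorkspaceError {
    fn from(e: io::Error) -> Self {
        WorkspaceError::IoError(e)
    }
}

/// Hypothetical helper: read a .sln or .csproj file, rejecting other inputs.
pub fn read_project_file(path: &std::path::Path) -> Result<String, WorkspaceError> {
    match path.extension().and_then(|e| e.to_str()) {
        Some("sln") | Some("csproj") => Ok(std::fs::read_to_string(path)?),
        _ => Err(WorkspaceError::Unsupported(path.display().to_string())),
    }
}
```

A missing `App.csproj` surfaces as `WorkspaceError::IoError`, while an input such as `notes.txt` is reported as `Unsupported` without touching the file system.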

CLI Integration

Analyze Command

# Analyze solution
bsharp analyze MySolution.sln

# Analyze project
bsharp analyze MyProject.csproj

# Follow references (default: true)
bsharp analyze MyProject.csproj --follow-refs true

# Don't follow references
bsharp analyze MyProject.csproj --follow-refs false

Filtering

# Include only specific files
bsharp analyze MySolution.sln --include "**/*Service.cs"

# Exclude test files
bsharp analyze MySolution.sln --exclude "**/Tests/**"

# Multiple patterns
bsharp analyze MySolution.sln \
    --include "src/**/*.cs" \
    --exclude "**/obj/**" "**/bin/**"

Deterministic Behavior

Guarantees

Project Order: Always sorted by absolute path
File Order: Always sorted within each project
Deduplication: No duplicate projects or files
Reproducible: Same input always produces same output

Implementation

// Sort projects
projects.sort_by(|a, b| a.path.cmp(&b.path));

// Deduplicate by path
let mut seen = HashSet::new();
projects.retain(|p| seen.insert(p.path.clone()));

// Sort files within each project
for project in &mut projects {
    project.files.sort_by(|a, b| a.path.cmp(&b.path));
}

Performance

Loading Speed

Small solution (1-5 projects): < 100ms
Medium solution (5-20 projects): 100-500ms
Large solution (20-100 projects): 500ms-2s

Memory Usage

Minimal: Only metadata loaded, not source content
Typical: 1-5 MB per solution

Optimization

Parallel project loading (with parallel_analysis feature)
Lazy source file reading
Efficient path canonicalization

Examples

Example 1: Load and Analyze

use std::fs;
use std::path::Path;

use bsharp_analysis::workspace::WorkspaceLoader;
use bsharp_parser::facade::Parser;

let workspace = WorkspaceLoader::from_path(Path::new("MySolution.sln"))?;

let parser = Parser::new();
for project in &workspace.projects {
    for file in &project.files {
        let source = fs::read_to_string(&file.path)?;
        match parser.parse(&source) {
            // The compilation unit is not used further in this example
            Ok(_cu) => println!("Parsed: {}", file.path.display()),
            Err(e) => eprintln!("Error in {}: {}", file.path.display(), e),
        }
    }
}

Example 2: Project Statistics

let workspace = WorkspaceLoader::from_path(path)?;

// `solution` is an Option and may be None (for example, when loading from a .csproj)
if let Some(solution) = &workspace.solution {
    println!("Solution: {}", solution.name);
}
println!("Projects: {}", workspace.projects.len());

let total_files: usize = workspace.projects.iter()
    .map(|p| p.files.len())
    .sum();
println!("Total files: {}", total_files);

for project in &workspace.projects {
    println!("  {}: {} files", project.name, project.files.len());
}

Example 3: Dependency Graph

let workspace = WorkspaceLoader::from_path(path)?;

println!("Project Dependencies:");
for project in &workspace.projects {
    if !project.references.is_empty() {
        println!("{}:", project.name);
        for ref_ in &project.references {
            println!("  → {}", ref_.name);
        }
    }
}

Testing

Test Fixtures

Location: tests/fixtures/

tests/fixtures/
├── happy_path/
│   ├── test.sln
│   ├── testApplication/
│   │   ├── testApplication.csproj
│   │   └── Program.cs
│   └── testDependency/
│       ├── testDependency.csproj
│       └── Library.cs
└── complex/
    └── ...

Test Examples

#[test]
fn test_load_solution() {
    let sln_path = PathBuf::from("tests/fixtures/happy_path/test.sln");
    let workspace = WorkspaceLoader::from_path(&sln_path).unwrap();
    
    assert_eq!(workspace.projects.len(), 2);
    assert!(workspace.solution.is_some());
}

#[test]
fn test_follow_references() {
    let proj_path = PathBuf::from("tests/fixtures/happy_path/testApplication/testApplication.csproj");
    let workspace = WorkspaceLoader::from_path(&proj_path).unwrap();
    
    // Should load both testApplication and testDependency
    assert_eq!(workspace.projects.len(), 2);
}

Future Enhancements

Planned Features

NuGet Package Resolution
- Resolve package references
- Download packages if needed
- Parse package assemblies
MSBuild Integration
- Full MSBuild evaluation
- Property expansion
- Target execution
Multi-targeting Support
- Handle multiple target frameworks
- Conditional compilation
Incremental Loading
- Cache workspace metadata
- Reload only changed projects

CLI Overview - CLI integration
Analysis Pipeline - Using workspace in analysis
Architecture - Design decisions

References

Implementation: src/bsharp_analysis/src/workspace/
Loader: src/bsharp_analysis/src/workspace/loader.rs
Solution Reader: src/bsharp_analysis/src/workspace/sln/reader.rs
Project Reader: src/bsharp_analysis/src/workspace/csproj/reader.rs
Model: src/bsharp_analysis/src/workspace/model.rs
Source Map: src/bsharp_analysis/src/workspace/source_map.rs
Tests: src/bsharp_tests/src/workspace/ and src/bsharp_tests/src/integration/

Configuration Overview

BSharp analysis can be configured via TOML or JSON files and by CLI flags that map to config fields.

Locations

Project root: .bsharp.toml or .bsharp.json
Custom path via bsharp analyze <INPUT> --config <FILE>

AnalysisConfig (fields)

Source: src/bsharp_analysis/src/context.rs

pub struct AnalysisConfig {
    // Control flow thresholds
    pub cf_high_complexity_threshold: usize, // default: 10
    pub cf_deep_nesting_threshold: usize,    // default: 4

    // Toggles and severities
    pub enable_rulesets: HashMap<String, bool>,
    pub enable_passes: HashMap<String, bool>,
    pub rule_severities: HashMap<String, DiagnosticSeverity>,

    // Workspace filters
    pub workspace: WorkspaceConfig,

    // Optional churn/PE settings (reserved/future)
    pub churn_enable: bool,
    pub churn_period_days: u32,
    pub churn_include_merges: bool,
    pub churn_max_commits: Option<u32>,
    pub pe_reference_paths: Vec<String>,
    pub pe_references: Vec<String>,
}

pub struct WorkspaceConfig {
    pub follow_refs: bool,
    pub include: Vec<String>,
    pub exclude: Vec<String>,
}

TOML Example

[analysis]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4

[enable_rulesets]
naming = true
semantic = true
control_flow_smells = true

[enable_passes]
"passes.metrics" = true
"passes.control_flow" = true
"passes.dependencies" = true

[rule_severities]
CF002 = "warning"
CF003 = "warning"

[workspace]
follow_refs = true
include = ["src/**/*.cs"]
exclude = ["**/obj/**", "**/bin/**"]

JSON Example

{
  "cf_high_complexity_threshold": 10,
  "cf_deep_nesting_threshold": 4,
  "enable_rulesets": {
    "naming": true,
    "semantic": true,
    "control_flow_smells": true
  },
  "enable_passes": {
    "passes.metrics": true,
    "passes.control_flow": true,
    "passes.dependencies": true
  },
  "rule_severities": {
    "CF002": "warning",
    "CF003": "warning"
  },
  "workspace": {
    "follow_refs": true,
    "include": ["src/**/*.cs"],
    "exclude": ["**/obj/**", "**/bin/**"]
  }
}

CLI Mapping

--enable-ruleset <ID> / --disable-ruleset <ID> → enable_rulesets[ID] = true|false
--enable-pass <ID> / --disable-pass <ID> → enable_passes[ID] = true|false
--severity CODE=level → rule_severities[CODE] = level (error|warning|info|hint)
--follow-refs <BOOL> → workspace.follow_refs
--include <GLOB>... → workspace.include
--exclude <GLOB>... → workspace.exclude
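
As an illustration of the `--severity CODE=level` mapping, the argument could be split and validated as follows. This is a hedged sketch: `parse_severity_override` and `apply_overrides` are hypothetical helpers, the local `DiagnosticSeverity` enum stands in for the real type, and accepting levels case-insensitively is an assumption.

```rust
use std::collections::HashMap;

/// Stand-in for the severity type referenced by `rule_severities`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagnosticSeverity {
    Error,
    Warning,
    Info,
    Hint,
}

/// Parse one `CODE=level` argument, e.g. `MET001=error`.
pub fn parse_severity_override(arg: &str) -> Result<(String, DiagnosticSeverity), String> {
    let (code, level) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected CODE=level, got `{arg}`"))?;
    let code = code.trim();
    if code.is_empty() {
        return Err(format!("missing diagnostic code in `{arg}`"));
    }
    // Assumption: levels are matched case-insensitively.
    let severity = match level.trim().to_ascii_lowercase().as_str() {
        "error" => DiagnosticSeverity::Error,
        "warning" => DiagnosticSeverity::Warning,
        "info" => DiagnosticSeverity::Info,
        "hint" => DiagnosticSeverity::Hint,
        other => return Err(format!("unknown severity `{other}`")),
    };
    Ok((code.to_string(), severity))
}

/// Apply several overrides to a `rule_severities`-style map; later entries win.
pub fn apply_overrides(
    args: &[&str],
    rule_severities: &mut HashMap<String, DiagnosticSeverity>,
) -> Result<(), String> {
    for arg in args {
        let (code, severity) = parse_severity_override(arg)?;
        rule_severities.insert(code, severity);
    }
    Ok(())
}
```

Passing `["MET001=error", "QUAL001=warning"]` would populate the map with the same entries as the `[rule_severities]` table in the configuration examples.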

Tips

Prefer TOML for readability; JSON is supported for tool integration.
Thresholds influence CfgSummary counts in the final report.
Use unique IDs for passes/rulesets consistent with registry (see Passes & Rules).

Contributing to BSharp

Thank you for your interest in contributing to BSharp! This document provides guidelines for contributing to the project.

Development Setup

Prerequisites

Rust 1.70 or later
Git
A text editor or IDE with Rust support

Building the Project

Clone the repository:

git clone https://github.com/mikserek/bsharp.git
cd bsharp

Build the project:

cargo build

Run tests:

cargo test

Run the CLI tool:

cargo run -- --help

Parser Testing Best Practices

Prefer expect_ok(input, parse(input.into())) from syntax::test_helpers when asserting successful parses. It prints readable, rustc-like diagnostics on failure via format_error_tree.
Keep tests focused and minimal; add a separate negative test when ambiguity is possible (e.g., ternary vs ?. vs ??, range vs dot vs float).
For lookahead/disambiguation boundaries, add cases to tests/parser/expressions/lookahead_boundaries2_tests.rs.
For complex constructs (e.g., new with object/collection initializers), add positive and negative cases near tests/parser/expressions/new_expression_tests.rs and target_typed_new_tests.rs.
Invalid-input diagnostics: place small snapshot-style assertions in tests/parser/expressions/invalid_diagnostics_tests.rs that check for line/column and caret presence. Avoid overfitting on exact wording.
When adding delimited constructs (parentheses, brackets, braces), guard the closing delimiter with cut(...) once committed to that branch to prevent misleading backtracking.
Always wrap sub-parsers with bws(...) to ensure whitespace/comments are handled consistently.

Adding New Parser Test Files

In tests/parser/expressions/, simply add a new *_tests.rs file; it will be discovered by the existing integration test harness.
For declarations/statements/types, follow the existing directory structure under tests/parser/ and mimic module organization.
Keep tests deterministic and avoid relying on environment-specific paths or random data.

Project Structure

Understanding the codebase organization:

src/
├── parser/           # Parser implementations (expressions, statements, etc.)
├── syntax/           # Parser infrastructure (AST nodes, helpers, errors)
├── analysis/         # Code analysis framework
├── workspace/        # Solution and project file loading
├── cli/              # Command-line interface
└── lib.rs           # Library entry point

Code Style

Follow Rust conventions:

Use cargo fmt to format code
Use cargo clippy to check for common issues
Follow naming conventions (snake_case for functions, PascalCase for types)
Add documentation comments for public APIs

Testing

All contributions should include appropriate tests:

Parser Tests

IMPORTANT: Parser tests must live in the external test crate under src/bsharp_tests/src/, not in inline #[cfg(test)] modules.

// ✅ CORRECT: External test file
// tests/parser/declarations/class_declaration_tests.rs

use bsharp::syntax::test_helpers::expect_ok;
use bsharp::parser::expressions::declarations::parse_class_declaration;

#[test]
fn test_parse_simple_class() {
    let input = "public class MyClass { }";
    let class = expect_ok(input, parse_class_declaration(input.into()));
    assert_eq!(class.identifier.name, "MyClass");
}

Analysis Tests

// tests/analysis/complexity_tests.rs

use bsharp::syntax::Parser;
use bsharp::analysis::metrics::cyclomatic_complexity;

#[test]
fn test_complexity_analysis() {
    let source = r#"
        public class Test {
            public void Method() {
                if (true) {
                    for (int i = 0; i < 10; i++) {
                        // complexity += 2
                    }
                }
            }
        }
    "#;
    
    let parser = Parser::new();
    let cu = parser.parse(source).unwrap();
    
    // Find the method and calculate complexity
    // (implementation details depend on analysis API)
    
    assert_eq!(complexity, 3);
}

Documentation

Add rustdoc comments for public functions and types
Update this documentation when adding new features
Include examples in documentation

Adding New Language Features

When adding support for new C# language features:

Define AST Nodes: Add node definitions in src/syntax/nodes/
Implement Parser: Add parser in appropriate src/parser/ subdirectory
Add Tests: Include comprehensive tests in tests/parser/ directory
Update Traversal: Prefer the bsharp_analysis::framework::Query API for AST enumeration; for statement/expression-heavy logic, use shared helpers or a focused walker.
Document: Add documentation for the new feature

Example process for adding a new expression type:

Define the AST node:

// src/syntax/nodes/expressions/new_expression.rs
#[derive(Debug, PartialEq, Clone, Serialize, Deserialize)]
pub struct NewExpression {
    pub keyword: String,  // "new"
    pub arguments: Vec<Expression>,
}

Add to Expression enum:

// src/syntax/nodes/expressions/expression.rs
pub enum Expression {
    // ... existing variants
    New(NewExpression),
}

Implement parser:

// src/parser/expressions/new_expression_parser.rs
pub fn parse_new_expression(input: &str) -> BResult<&str, NewExpression> {
    // Parser implementation
}

Add tests:

// tests/parser/expressions/new_expression_tests.rs
#[test]
fn test_parse_new_expression() {
    // Test implementation
}

Submitting Changes

Pull Request Process

Fork the repository
Create a feature branch: git checkout -b feature/new-feature
Make your changes
Run tests: cargo test
Run formatting: cargo fmt
Run clippy: cargo clippy
Commit changes with clear messages
Push to your fork
Create a pull request

Commit Messages

Use clear, descriptive commit messages:

feat: add support for C# 11 file-scoped types

- Add parser for file-scoped type declarations
- Update AST to handle new syntax
- Add comprehensive tests
- Update documentation

Fixes #123

Pull Request Requirements

All tests must pass
Code must be formatted with cargo fmt
No clippy warnings
Include tests for new functionality
Update documentation if needed

Common Development Tasks

Adding a New Parser

Define the AST node structure
Implement the parser function
Add the parser to the appropriate module
Write comprehensive tests
Update integration points

Extending Analysis

Define analysis traits if needed
Implement analyzer struct
Add configuration options
Write tests with various scenarios
Update CLI integration

Debugging Parser Issues

Use these tools for debugging:

# Test specific parser with debug output
RUST_LOG=debug cargo test test_name -- --nocapture

# Run parser on test file (prints textual AST tree)
cargo run -- parse debug_cases/test.cs

# Check AST visualization
cargo run -- tree debug_cases/test.cs --output debug.svg

Getting Help

Check existing issues and documentation
Ask questions in GitHub issues
Join community discussions

Code of Conduct

Be respectful and inclusive
Focus on constructive feedback
Help others learn and grow
Maintain a positive environment

Thank you for contributing to BSharp!

Testing Guide

This document provides comprehensive guidance on testing in the BSharp project, covering test organization, best practices, and debugging strategies.

Test Organization Philosophy

External Test Structure

Critical Principle: Parser tests are external to implementation modules and live in a dedicated test crate.

src/bsharp_tests/
├── Cargo.toml               # Test crate manifest
└── src/
    ├── parser/
    │   ├── expressions/
    │   │   ├── expression_tests.rs
    │   │   ├── lambda_expression_tests.rs
    │   │   ├── pattern_matching_tests.rs
    │   │   ├── ambiguity_tests.rs
    │   │   ├── lookahead_boundaries2_tests.rs
    │   │   └── ...
    │   ├── statements/
    │   │   ├── if_statement_tests.rs
    │   │   ├── for_statement_tests.rs
    │   │   ├── expression_statement_tests.rs
    │   │   └── ...
    │   ├── declarations/
    │   │   ├── class_declaration_tests.rs
    │   │   ├── interface_declaration_parser_tests.rs
    │   │   ├── recovery_tests.rs
    │   │   └── ...
    │   ├── types/
    │   │   ├── type_tests.rs
    │   │   ├── advanced_type_tests.rs
    │   │   └── ...
    │   ├── preprocessor/
    │   │   └── ...
    │   └── keyword_parsers_tests.rs
    └── fixtures/
        ├── happy_path/
        └── complex/

Rationale:

Separation of Concerns: Test code separate from implementation
Compilation Efficiency: Test code and test-only dependencies stay out of the parser crates
Organization: Clear structure mirrors parser organization
Maintainability: Easy to find and update tests

What NOT to Do:

#![allow(unused)]
fn main() {
// ❌ NEVER do this in parser implementation files (src/bsharp_parser/)
#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_something() {
        // ...
    }
}
}

What to Do Instead:

#![allow(unused)]
fn main() {
// ✅ Create src/bsharp_tests/src/parser/expressions/my_feature_tests.rs
use bsharp::syntax::test_helpers::expect_ok;
use bsharp::parser::expressions::parse_my_feature;

#[test]
fn test_my_feature() {
    let input = "my feature syntax";
    let result = parse_my_feature(input.into());
    let ast = expect_ok(input, result);
    // assertions...
}
}

Test Helpers

`expect_ok()` - Readable Test Failures

Location: src/syntax/test_helpers.rs

Usage:

#![allow(unused)]
fn main() {
use bsharp::syntax::test_helpers::expect_ok;

#[test]
fn test_parse_class() {
    let input = "public class MyClass { }";
    let result = parse_class_declaration(input.into());
    let class = expect_ok(input, result);
    
    assert_eq!(class.identifier.name, "MyClass");
}
}

Benefits:

Automatic Error Formatting: Pretty-prints ErrorTree on failure
Readable Diagnostics: Shows parse failure context with caret
Panic on Failure: Test fails with clear error message

Error Output Example:

0: at line 1, in keyword "class":
public clas MyClass { }
       ^--- expected keyword "class"

1: in context "class declaration"

Other Test Helpers

parse_input_unwrap() - Unwrap parse result:

#![allow(unused)]
fn main() {
use bsharp_syntax::span::Span;
let (remaining, ast) = parse_input_unwrap(
    parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node))
);
assert_eq!(remaining, "");  // Verify full consumption
}

assert_parse_error() - Verify parse failures:

#![allow(unused)]
fn main() {
use bsharp_syntax::span::Span;
assert_parse_error(
    parse_expression_spanned(Span::new("invalid syntax")).map(|(rest, s)| (rest, s.node))
);
}

Parser Testing Best Practices

1. Prefer `expect_ok()` for Successful Parses

#![allow(unused)]
fn main() {
#[test]
fn test_if_statement() {
    let input = "if (x > 0) { return x; }";
    let stmt = expect_ok(input, parse_if_statement(input.into()));
    
    // Now assert on the AST structure
    match stmt {
        Statement::If(if_stmt) => {
            // Verify condition, consequence, etc.
        }
        _ => panic!("Expected IfStatement"),
    }
}
}

2. Keep Tests Focused and Minimal

Good:

#![allow(unused)]
fn main() {
#[test]
fn test_simple_lambda() {
    let input = "x => x * 2";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // Test one thing
}

#[test]
fn test_lambda_with_multiple_params() {
    let input = "(x, y) => x + y";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // Test another thing
}
}

Bad:

#![allow(unused)]
fn main() {
#[test]
fn test_all_lambda_forms() {
    // Testing too many things in one test
    // Hard to debug when it fails
}
}

3. Add Negative Tests for Ambiguity

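Where two constructs share a prefix (for example, `?` in a ternary expression and `?.` in a null-conditional access), add one test per interpretation, plus a test for input that neither parser should accept: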

#![allow(unused)]
fn main() {
#[test]
fn test_ternary_vs_nullable() {
    // Valid ternary
    let input = "x ? y : z";
    expect_ok(input, parse_conditional_expression(input.into()));
    
    // The null-conditional form is covered by its own test below
}

#[test]
fn test_null_conditional_operator() {
    let input = "obj?.Property";
    expect_ok(input, parse_postfix_expression(input.into()));
}
}

4. Test Lookahead/Disambiguation Boundaries

Location: src/bsharp_tests/src/parser/expressions/lookahead_boundaries2_tests.rs

#[test]
fn test_range_vs_dot_vs_float() {
    // Range operator
    let range = "1..10";
    expect_ok(range, parse_range_expression(range.into()));

    // Member access
    let member = "obj.Method";
    expect_ok(member, parse_postfix_expression(member.into()));

    // Float literal
    let float = "3.14";
    expect_ok(float, parse_literal(float.into()));
}

5. Test Complex Constructs

For complex constructs like new expressions with initializers:

Location: src/bsharp_tests/src/parser/expressions/new_expression_tests.rs

#![allow(unused)]
fn main() {
#[test]
fn test_new_with_object_initializer() {
    let input = "new Person { Name = \"John\", Age = 30 }";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}

#[test]
fn test_new_with_collection_initializer() {
    let input = "new List<int> { 1, 2, 3 }";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}

#[test]
fn test_target_typed_new() {
    let input = "new(42, \"test\")";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}
}

6. Test Invalid Input Diagnostics

Location: src/bsharp_tests/src/parser/expressions/invalid_diagnostics_tests.rs

#![allow(unused)]
fn main() {
#[test]
fn test_unclosed_paren_diagnostic() {
    use bsharp_syntax::span::Span;
    let input = "(x + y";
    let result = parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node));
    assert!(result.is_err());
    // Optionally check error contains expected message
}
}

Guidelines:

Keep small snapshot-style assertions
Check for line/column and caret presence
Avoid overfitting on exact wording (may change)
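
The guidelines above can be illustrated with a small, std-only sketch. The `render_diagnostic` function is a hypothetical stand-in for whatever renders parse errors in the project; only the assertion style is the point. The test checks the location and the caret, and matches a single keyword from the message instead of the full sentence.

```rust
/// Hypothetical renderer: prints the offending line with a caret under
/// the 1-based `column`. Not part of BSharp; used only for illustration.
fn render_diagnostic(source: &str, line: usize, column: usize, message: &str) -> String {
    let text = source.lines().nth(line.saturating_sub(1)).unwrap_or("");
    let padding = " ".repeat(column.saturating_sub(1));
    format!("at line {line}, column {column}:\n{text}\n{padding}^--- {message}")
}

#[test]
fn unclosed_paren_diagnostic_has_location_and_caret() {
    let rendered = render_diagnostic("(x + y", 1, 7, "expected ')'");

    // Location and caret are stable properties worth asserting on.
    assert!(rendered.contains("line 1"));
    assert!(rendered.contains('^'));

    // Match a keyword from the message, not the exact wording.
    assert!(rendered.contains("expected"));
}
```

Assertions of this kind keep passing when the message is reworded, but still fail if the location or the caret disappears from the output.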

7. Guard Closing Delimiters with `cut()`

When adding delimited constructs, ensure closing delimiters use cut():

#![allow(unused)]
fn main() {
use nom::combinator::cut;
use crate::syntax::parser_helpers::{bdelimited, bchar};

fn parse_parenthesized(input: &str) -> BResult<&str, Expression> {
    bdelimited(
        bchar('('),
        parse_expression,
        cut(bchar(')'))  // ✅ Prevents misleading backtracking
    )(input.into())
}
}
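
Why `cut()` matters is easier to see in a reduced model. The sketch below does not use nom; `ParseErr`, `ch`, `alt2`, and `parenthesized` are hypothetical names that imitate nom's split between recoverable errors and failures. Once the opening parenthesis has matched, a missing closing parenthesis becomes a failure, so an enclosing alternative does not silently try another branch.

```rust
/// Simplified stand-in for nom's error modes: `Error` is recoverable,
/// `Failure` is not.
#[derive(Debug, PartialEq)]
enum ParseErr {
    Error(String),
    Failure(String),
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

/// Matches a single expected character.
fn ch<'a>(expected: char) -> impl Fn(&'a str) -> PResult<'a, char> {
    move |input: &'a str| match input.chars().next() {
        Some(c) if c == expected => Ok((&input[c.len_utf8()..], c)),
        _ => Err(ParseErr::Error(format!("expected '{expected}'"))),
    }
}

/// Upgrades a recoverable error to a failure, in the spirit of `cut`.
fn cut<'a, T>(p: impl Fn(&'a str) -> PResult<'a, T>) -> impl Fn(&'a str) -> PResult<'a, T> {
    move |input| match p(input) {
        Err(ParseErr::Error(msg)) => Err(ParseErr::Failure(msg)),
        other => other,
    }
}

/// Tries `first` and falls back to `second` only on a recoverable error.
fn alt2<'a, T>(
    first: impl Fn(&'a str) -> PResult<'a, T>,
    second: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, T> {
    move |input| match first(input) {
        Err(ParseErr::Error(_)) => second(input),
        other => other,
    }
}

/// Parses "(x)"; the closing parenthesis is guarded by `cut`.
fn parenthesized<'a>(input: &'a str) -> PResult<'a, char> {
    let (input, _) = ch('(')(input)?;
    let (input, inner) = ch('x')(input)?;
    let (input, _) = cut(ch(')'))(input)?;
    Ok((input, inner))
}
```

In this model, `parenthesized("(x")` returns a `Failure`, so `alt2(parenthesized, ...)` stops there and reports the missing `)` at the position where it was expected.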

8. Wrap Sub-Parsers with `bws()`

Ensure whitespace/comments are handled consistently:

#![allow(unused)]
fn main() {
use crate::syntax::parser_helpers::bws;

fn parse_if_statement(input: &str) -> BResult<&str, Statement> {
    let (input, _) = bws(keyword("if"))(input.into())?;
    let (input, _) = bws(bchar('('))(input.into())?;
    let (input, condition) = bws(parse_expression)(input.into())?;
    // ...
}
}
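
As a rough picture of what a whitespace-aware wrapper does, here is a std-only sketch. It is not the project's `bws()` implementation: `skip_trivia`, `ws`, and `tag` are hypothetical names, and the sketch assumes that only leading whitespace and comments are skipped.

```rust
/// Skips whitespace, `// line` comments and `/* block */` comments.
fn skip_trivia(mut input: &str) -> &str {
    loop {
        let trimmed = input.trim_start();
        if let Some(rest) = trimmed.strip_prefix("//") {
            // Line comment: drop everything up to the end of the line.
            input = rest.find('\n').map_or("", |i| &rest[i..]);
        } else if let Some(rest) = trimmed.strip_prefix("/*") {
            // Block comment: drop everything up to and including "*/".
            match rest.find("*/") {
                Some(i) => input = &rest[i + 2..],
                // Unterminated comment: leave it for the parser to report.
                None => return trimmed,
            }
        } else {
            return trimmed;
        }
    }
}

/// Wraps `parser` so that leading trivia is consumed before it runs.
fn ws<'a, T>(
    parser: impl Fn(&'a str) -> Option<(&'a str, T)>,
) -> impl Fn(&'a str) -> Option<(&'a str, T)> {
    move |input| parser(skip_trivia(input))
}

/// Matches a literal token and returns (remaining input, matched text).
fn tag<'a>(token: &'static str) -> impl Fn(&'a str) -> Option<(&'a str, &'a str)> {
    move |input: &'a str| {
        input
            .strip_prefix(token)
            .map(|rest| (rest, &input[..token.len()]))
    }
}
```

Wrapping every sub-parser the same way means a comment between `if` and `(` is handled in one place rather than in each parser.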

Test Discovery and Execution

Running All Tests

cargo test

Running Specific Test Suites

# All parser tests
cargo test --test parser

# Specific module
cargo test --test parser expression_tests

# Specific test
cargo test --test parser test_lambda_expression

Running with Output

# Show println! output
cargo test -- --nocapture

# Run tests on a single thread so printed output is not interleaved
cargo test -- --nocapture --test-threads=1

Running with Debug Logging

RUST_LOG=debug cargo test test_name -- --nocapture

Test Fixtures

Fixture Organization

tests/fixtures/
├── happy_path/           # Valid, well-formed C# projects
│   ├── testApplication/
│   │   ├── Program.cs
│   │   ├── testApplication.csproj
│   │   └── ...
│   └── testDependency/
│       └── ...
└── complex/              # Complex, real-world scenarios
    ├── testApplication/
    └── testDependency/

Using Fixtures in Tests

#![allow(unused)]
fn main() {
use std::fs;
use std::path::PathBuf;

#[test]
fn test_parse_fixture() {
    let fixture_path = PathBuf::from("tests/fixtures/happy_path/testApplication/Program.cs");
    let source = fs::read_to_string(&fixture_path).unwrap();
    
    let parser = Parser::new();
    let result = parser.parse(&source);
    
    assert!(result.is_ok());
}
}

Fixture Guidelines

Valid Code: Fixtures should be valid C# that compiles
Realistic: Use real-world patterns, not contrived examples
Documented: Add README.md explaining fixture purpose
Minimal: Keep fixtures as small as possible while testing feature

Snapshot Testing

Using `insta` for Snapshot Tests

Installation: Already included in Cargo.toml dev-dependencies

#![allow(unused)]
fn main() {
use insta::assert_json_snapshot;

#[test]
fn test_class_ast_structure() {
    let input = "public class MyClass { public int Field; }";
    let result = parse_class_declaration(input.into());
    let class = expect_ok(input, result);
    
    // Creates snapshot file on first run
    assert_json_snapshot!(class);
}
}

Reviewing Snapshots

# Review snapshot changes
cargo insta review

# Accept all changes
cargo insta accept

# Reject all changes
cargo insta reject

Snapshot Guidelines

Complex Structures: Use for complex AST structures
Regression Prevention: Catch unintended changes
Review Carefully: Always review snapshot diffs
Commit Snapshots: Include snapshot files in git

Debugging Test Failures

Strategy 1: Use `expect_ok()` Error Output

When a test fails, expect_ok() shows the parse error:

0: at line 1, in keyword "class":
public clas MyClass { }
       ^--- expected keyword "class"

Strategy 2: Add Debug Logging

#![allow(unused)]
fn main() {
#[test]
fn test_with_logging() {
    // try_init() avoids a panic when another test has already set up the logger
    let _ = env_logger::builder().is_test(true).try_init();
    
    use bsharp_syntax::span::Span;
    let input = "complex syntax";
    log::debug!("Parsing: {}", input);
    
    let result = parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node));
    log::debug!("Result: {:?}", result);
    
    expect_ok(input, result);
}
}

Run with:

RUST_LOG=debug cargo test test_with_logging -- --nocapture

Strategy 3: Test Smaller Components

If a complex parser fails, test its sub-parsers individually:

#![allow(unused)]
fn main() {
#[test]
fn test_method_declaration() {
    // Fails - too complex
    let input = "public async Task<int> Method(int x) { return x; }";
    expect_ok(input, parse_method_declaration(input.into()));
}

// Break it down:

#[test]
fn test_method_modifiers() {
    let input = "public async";
    expect_ok(input, parse_modifiers(input.into()));
}

#[test]
fn test_method_return_type() {
    let input = "Task<int>";
    expect_ok(input, parse_type(input.into()));
}

#[test]
fn test_method_parameters() {
    let input = "(int x)";
    expect_ok(input, parse_parameter_list(input.into()));
}
}

Strategy 4: Use Parser Debugging Tools

# Parse file and output JSON
cargo run -- parse debug_cases/test.cs --output debug.json

# Generate AST visualization
cargo run -- tree debug_cases/test.cs --output debug.svg

Strategy 5: Check Error Recovery

For declaration error recovery tests:

#![allow(unused)]
fn main() {
#[test]
fn test_recovery_from_malformed_member() {
    let input = r#"
    public class MyClass {
        public int ValidField;
        public invalid syntax here;  // Malformed
        public int AnotherValidField;  // Should recover
    }
    "#;
    
    let result = parse_class_declaration(input.into());
    // Should parse despite error
    assert!(result.is_ok());
}
}

Integration Testing

Workspace Loading Tests

#![allow(unused)]
fn main() {
use std::path::PathBuf;

use bsharp::workspace::WorkspaceLoader;

#[test]
fn test_load_solution() {
    let sln_path = PathBuf::from("tests/fixtures/happy_path/test.sln");
    let workspace = WorkspaceLoader::from_path(&sln_path).unwrap();
    
    assert_eq!(workspace.projects.len(), 2);
    assert!(workspace.solution.is_some());
}

#[test]
fn test_load_csproj() {
    let csproj_path = PathBuf::from("tests/fixtures/happy_path/testApplication/testApplication.csproj");
    let workspace = WorkspaceLoader::from_path(&csproj_path).unwrap();
    
    assert_eq!(workspace.projects.len(), 1);
}
}

Analysis Pipeline Tests

#![allow(unused)]
fn main() {
use bsharp::analysis::framework::pipeline::AnalyzerPipeline;
use bsharp::analysis::framework::session::AnalysisSession;

#[test]
fn test_analysis_pipeline() {
    let source = "public class Test { public void Method() { } }";
    let parser = Parser::new();
    let cu = parser.parse(source).unwrap();
    
    let mut session = AnalysisSession::new();
    AnalyzerPipeline::run_with_defaults(&cu, &mut session);
    
    let report = session.into_report();
    assert!(report.diagnostics.is_empty());  // No errors
}
}

Performance Testing

Benchmarking

#![allow(unused)]
fn main() {
#[test]
#[ignore]  // Run with --ignored flag
fn bench_parse_large_file() {
    use std::fs;
    use std::time::Instant;
    
    let source = fs::read_to_string("tests/fixtures/large_file.cs").unwrap();
    let parser = Parser::new();
    
    let start = Instant::now();
    let result = parser.parse(&source);
    let duration = start.elapsed();
    
    assert!(result.is_ok());
    println!("Parse time: {:?}", duration);
    
    // Assert reasonable performance
    assert!(duration.as_millis() < 1000, "Parse took too long");
}
}

Running Performance Tests

cargo test bench_ -- --ignored

Continuous Integration

CI Test Strategy

# .github/workflows/test.yml (example)
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - name: Run tests
        run: cargo test --all-features
      - name: Run clippy
        run: cargo clippy -- -D warnings
      - name: Check formatting
        run: cargo fmt -- --check

Test Coverage

Measuring Coverage

# Install tarpaulin
cargo install cargo-tarpaulin

# Run coverage
cargo tarpaulin --out Html --output-dir coverage

Coverage Goals

Parser Core: 90%+ coverage
Analysis Framework: 80%+ coverage
CLI Commands: 70%+ coverage
Workspace Loading: 80%+ coverage

Common Testing Patterns

Pattern 1: Positive and Negative Tests

#![allow(unused)]
fn main() {
#[test]
fn test_valid_syntax() {
    let input = "valid syntax";
    expect_ok(input, parse_feature(input.into()));
}

#[test]
fn test_invalid_syntax() {
    let input = "invalid syntax";
    assert!(parse_feature(input.into()).is_err());
}
}

Pattern 2: Boundary Testing

#![allow(unused)]
fn main() {
#[test]
fn test_empty_input() {
    assert!(parse_feature("".into()).is_err());
}

#[test]
fn test_minimal_input() {
    expect_ok("x", parse_feature("x".into()));
}

#[test]
fn test_maximal_input() {
    let input = "very complex nested structure...";
    expect_ok(input, parse_feature(input.into()));
}
}

Pattern 3: Equivalence Testing

#![allow(unused)]
fn main() {
#[test]
fn test_whitespace_insensitive() {
    let compact = "if(x){y;}";
    let spaced = "if (x) { y; }";
    
    let ast1 = expect_ok(compact, parse_if_statement(compact.into()));
    let ast2 = expect_ok(spaced, parse_if_statement(spaced.into()));
    
    assert_eq!(ast1, ast2);
}
}

Test Maintenance

When to Update Tests

API Changes: Update tests when parser API changes
Bug Fixes: Add regression tests for fixed bugs
New Features: Add tests for new language features
Refactoring: Ensure tests still pass after refactoring

Test Cleanup

Remove Duplicate Tests: Consolidate similar tests
Update Outdated Tests: Fix tests using deprecated APIs
Remove Dead Tests: Delete tests for removed features
Improve Names: Use descriptive test names

Test Documentation

#![allow(unused)]
fn main() {
/// Tests that lambda expressions with multiple parameters are parsed correctly.
/// 
/// This test verifies:
/// - Parameter list parsing
/// - Arrow token recognition
/// - Expression body parsing
#[test]
fn test_lambda_with_multiple_params() {
    let input = "(x, y) => x + y";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // ...
}
}

Summary

Testing Checklist

Tests in the external test crate (src/bsharp_tests/), not inline
Use expect_ok() for readable failures
Keep tests focused and minimal
Add negative tests for ambiguity
Test lookahead/disambiguation boundaries
Test complex constructs thoroughly
Use cut() for closing delimiters
Wrap sub-parsers with bws()
Add fixtures for integration tests
Use snapshot tests for complex structures
Document test purpose and coverage

Resources

Test Helpers: src/syntax/test_helpers.rs
Example Tests: src/bsharp_tests/src/parser/expressions/
Fixtures: src/bsharp_tests/src/fixtures/
Contributing Guide: docs/development/contributing.md
Architecture: docs/development/architecture.md

Architecture Decisions

This document explains the key architectural decisions made in the BSharp project, their rationale, and their implications for contributors.

Core Design Philosophy

BSharp is designed as a modular, extensible C# parser and analysis toolkit written in Rust. The architecture prioritizes:

Correctness - Accurate parsing of C# syntax
Performance - Efficient parsing and analysis of large codebases
Maintainability - Clear module boundaries and minimal coupling
Extensibility - Easy addition of new language features and analyzers

Parser Architecture

Why nom Parser Combinators?

Decision: Use the nom parser combinator library as the foundation for parsing.

Rationale:

Composability: Small, focused parsers combine to handle complex syntax
Type Safety: Rust's type system catches parser errors at compile time
Performance: Zero-copy parsing with minimal allocations
Testability: Individual parser functions are easily unit tested
Maintainability: Declarative style is easier to understand than hand-written parsers

Trade-offs:

Learning curve for contributors unfamiliar with parser combinators
Error messages require additional work (addressed with nom-supreme)

Implementation:

Core parsing infrastructure: src/bsharp_parser/src/helpers/
Parser implementations: src/bsharp_parser/src/
All parsers return BResult<I, O> type alias

Error Handling Strategy

Decision: Use ErrorTree from the nom-supreme crate (nom_supreme::error::ErrorTree) for all parser errors.

Rationale:

Rich Context: Tree structure preserves full parse failure path
Better Diagnostics: Context annotations via .context() method
Integration: Seamless integration with nom combinators
Debugging: Pretty-printing via format_error_tree()

Evolution:

Initially used custom BSharpParseError type
Migrated to ErrorTree for better diagnostics
Custom error type deprecated and removed

Implementation:

use nom::IResult;
use nom_supreme::error::ErrorTree;

pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;

Helper Functions (in src/bsharp_parser/src/helpers/)

context() - Adds contextual information
cut() - Commits to parse branch (prevents misleading backtracking)
bws() - Whitespace-aware wrapper with error context
bdelimited() - Delimited parsing with cut on closing delimiter

Module Organization

Decision: Separate the parser crate from the syntax (AST) crate, and keep analysis in its own crate.

Structure:

src/
├── bsharp_parser/          # Parser implementations and public facade
│   ├── src/
│   │   ├── expressions/    # Expression parsers
│   │   ├── keywords/       # Keyword parsing (modularized)
│   │   ├── helpers/        # Parsing utilities (bws, cut, context, directives, ...)
│   │   ├── facade.rs       # Public Parser facade
│   │   └── ...
├── bsharp_syntax/          # AST node definitions and shared syntax types
│   └── src/                # (re-exported by bsharp_parser as `syntax`)
├── bsharp_analysis/        # Analysis framework and workspace
│   └── src/
└── bsharp_cli/             # CLI entry and subcommands

Rationale:

Separation of Concerns: Infrastructure vs implementation
Reusability: Helpers used across all parsers
API Clarity: syntax module is the public API
Testing: Infrastructure can be tested independently

Keyword Modularization

Decision: Organize keywords by category in dedicated modules.

Structure:

src/bsharp_parser/src/keywords/
├── mod.rs                      # Keyword infrastructure
├── access_keywords.rs          # public, private, protected, internal
├── accessor_keywords.rs        # get, set, init, add, remove
├── type_keywords.rs            # class, struct, interface, enum, record
├── modifier_keywords.rs        # static, abstract, virtual, sealed
├── flow_control_keywords.rs    # if, else, switch, case, default
├── iteration_keywords.rs       # for, foreach, while, do
├── expression_keywords.rs      # new, this, base, typeof, sizeof
├── linq_query_keywords.rs      # from, where, select, orderby
└── ...

Rationale:

Maintainability: Easy to find and update keyword parsers
Consistency: Uniform keyword parsing strategy
Word Boundaries: All keywords use keyword() helper for boundary checking
Prevents Bugs: Avoids partial matches (e.g., "int" vs "int32")

Implementation:

keyword() matches only when the keyword is not immediately followed by an identifier character ([A-Za-z0-9_])
Parsers grouped under src/bsharp_parser/src/keywords/
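
The boundary rule can be sketched without nom. The function below is a minimal, std-only illustration and not the project's `keyword()` helper; its name, argument order, and return type are assumptions made for the example.

```rust
/// Returns the remaining input if `input` starts with `kw` and the keyword
/// is not immediately followed by an identifier character.
fn keyword<'a>(kw: &str, input: &'a str) -> Option<&'a str> {
    let rest = input.strip_prefix(kw)?;
    match rest.chars().next() {
        // "int32" must not be read as the keyword "int" followed by "32".
        Some(c) if c.is_ascii_alphanumeric() || c == '_' => None,
        // End of input, whitespace, or punctuation ends the keyword.
        _ => Some(rest),
    }
}
```

With this check, `keyword("int", "int x")` succeeds and leaves `" x"`, while `keyword("int", "int32")` is rejected.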

AST Design

Naming Convention

Decision: Use PascalCase names without 'Syntax' suffix for all AST nodes.

Examples:

ClassDeclaration (not ClassDeclarationSyntax)
MethodDeclaration (not MethodDeclarationSyntax)
ExpressionStatement (not ExpressionStatementSyntax)
IfStatement (not IfStatementSyntax)

Rationale:

Clarity: Shorter, clearer names
Roslyn Inspiration: Mirrors Roslyn's structure where appropriate
Consistency: Uniform naming across entire codebase
User Preference: Explicit design decision by the project maintainers

Implications:

All AST node types follow this convention
Test code uses these names
Documentation uses these names
Breaking change from earlier versions with 'Syntax' suffix

AST Ownership Model

Decision: Parent nodes own their children; no circular references.

Structure:

#![allow(unused)]
fn main() {
pub struct ClassDeclaration {
    pub attributes: Vec<AttributeList>,
    pub modifiers: Vec<Modifier>,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub primary_constructor_parameters: Option<Vec<Parameter>>,
    pub base_types: Vec<Type>,
    pub body_declarations: Vec<ClassBodyDeclaration>,  // Owned
    pub documentation: Option<XmlDocumentationComment>,
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}
}

Rationale:

Rust Ownership: Leverages Rust's ownership system
Memory Safety: No reference cycles or lifetime complexity
Simplicity: Clear ownership semantics
Traversal: Navigation traits provide search without ownership issues

Trade-offs:

Cannot directly reference parent from child
Navigation requires traversal from root
Mitigated by AstNavigate and FindDeclarations traits

Zero-Copy Parsing

Decision: Minimize string allocations during parsing where possible.

Implementation:

String slices reference original input
Identifiers store String (owned) for convenience
Literals preserve original format as String

Rationale:

Performance: Reduces allocation overhead
Memory Efficiency: Lower memory footprint
Trade-off: Some allocations necessary for AST lifetime
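
A small sketch of this split, assuming a simplified identifier grammar: scanning borrows from the input, and the only allocation happens when the AST node is built. `scan_identifier`, `parse_identifier`, and this `Identifier` struct are illustrative and not the project's definitions.

```rust
/// Borrowed scan: returns (remaining input, identifier text) without allocating.
fn scan_identifier(input: &str) -> Option<(&str, &str)> {
    let first = input.chars().next()?;
    if !(first.is_alphabetic() || first == '_') {
        return None;
    }
    // The identifier ends at the first character that cannot continue it.
    let end = input
        .char_indices()
        .find(|&(_, c)| !(c.is_alphanumeric() || c == '_'))
        .map_or(input.len(), |(i, _)| i);
    Some((&input[end..], &input[..end]))
}

/// Owned AST node: it can outlive the source text it was parsed from.
#[derive(Debug, PartialEq)]
struct Identifier {
    name: String,
}

/// The single allocation happens here, when the node is constructed.
fn parse_identifier(input: &str) -> Option<(&str, Identifier)> {
    let (rest, text) = scan_identifier(input)?;
    Some((rest, Identifier { name: text.to_owned() }))
}
```

Because `Identifier` owns its name, the resulting AST carries no lifetime parameter tied to the source buffer.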

Spans and Location Tracking

Decision: Track source locations via spans for precise diagnostics and tooling.

Implementation:

Span type based on nom_locate::LocatedSpan lives in src/bsharp_parser/src/syntax/span.rs and is re-exported through the public parser API.
The parser facade supports parse_with_spans() which returns both the AST and span table for mapping nodes back to source locations.
Error reporting uses spans to include line/column, highlighting ranges via format_error_tree().

Rationale:

Diagnostics: Accurate error locations and ranges.
Tooling: Enables IDE features, navigation, and source mapping.
Testing: Stable, comparable locations for snapshot tests.
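
For readers new to span tracking, the following std-only sketch shows the basic mapping from a byte offset to a line and column. The project relies on nom_locate for this; `line_col` is a hypothetical helper written only to show the arithmetic.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes. Panics if `offset` is past the end
/// of `source` or not on a character boundary.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    // Every newline before the offset starts a new line.
    let line = before.matches('\n').count() + 1;
    // The current line starts right after the last newline, or at 0.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    (line, column)
}
```

A diagnostic renderer can combine such a pair with the text of the offending line to place a caret under the error position.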

See also: docs/syntax/spans.md.

Analysis Framework

Framework-Driven Architecture

Decision: Implement a pipeline-based analysis framework with passes, rules, and visitors.

Structure:

src/bsharp_analysis/src/
├── framework/        # Core analysis infrastructure
│   ├── pipeline.rs   # Analysis pipeline orchestration
│   ├── passes.rs     # Analysis pass trait and phases
│   ├── rules.rs      # Rule trait and rulesets
│   ├── walker.rs     # AST walker and visitor pattern
│   ├── registry.rs   # Analyzer registry
│   └── session.rs    # Analysis session and state
├── passes/           # Concrete analysis passes
├── rules/            # Concrete analysis rules
├── artifacts/        # Analysis artifacts (symbols, metrics, CFG)
└── ...

Rationale:

Extensibility: Easy to add new analyzers
Composability: Passes and rules compose via registry
Performance: Single-pass traversal for local rules
Configurability: Enable/disable passes and rules via config

Phases:

Index - Symbol indexing and scope building
Local - Single-pass local rules and metrics collection
Global - Cross-file analysis (dependencies, etc.)
Semantic - Type checking and semantic rules
Reporting - Report generation and formatting

Visitor Pattern

Decision: Use visitor pattern for AST traversal.

Implementation:

#![allow(unused)]
fn main() {
pub trait Visit {
    fn enter(&mut self, node: &NodeRef, session: &mut AnalysisSession);
    fn exit(&mut self, node: &NodeRef, session: &mut AnalysisSession) {}
}

pub struct AstWalker {
    visitors: Vec<Box<dyn Visit>>,
}
}

Rationale:

Separation of Concerns: Traversal logic separate from analysis logic
Composability: Multiple visitors in single traversal
Performance: Single pass for multiple analyses
Extensibility: Easy to add new visitors
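
The composability point, several visitors sharing one traversal, can be shown with a reduced model. The `Node` enum, the two-method `Visit` trait without a session argument, and both visitors below are simplified stand-ins, not the project's types.

```rust
enum Node {
    Class(Vec<Node>),
    Method,
}

trait Visit {
    fn enter(&mut self, node: &Node);
    fn exit(&mut self, _node: &Node) {}
}

/// Walks the tree once, notifying every visitor on the way in and out.
fn walk(node: &Node, visitors: &mut [&mut dyn Visit]) {
    for visitor in visitors.iter_mut() {
        visitor.enter(node);
    }
    if let Node::Class(members) = node {
        for member in members {
            walk(member, visitors);
        }
    }
    for visitor in visitors.iter_mut() {
        visitor.exit(node);
    }
}

/// Counts method nodes.
#[derive(Default)]
struct MethodCounter {
    methods: usize,
}

impl Visit for MethodCounter {
    fn enter(&mut self, node: &Node) {
        if let Node::Method = node {
            self.methods += 1;
        }
    }
}

/// Tracks the deepest nesting level seen during the walk.
#[derive(Default)]
struct MaxDepth {
    current: usize,
    max: usize,
}

impl Visit for MaxDepth {
    fn enter(&mut self, _node: &Node) {
        self.current += 1;
        self.max = self.max.max(self.current);
    }
    fn exit(&mut self, _node: &Node) {
        self.current -= 1;
    }
}
```

Both analyses run during the same call to `walk`, and neither visitor knows about the other or about how children are enumerated.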

Query API

Decision: Use a typed Query API over a minimal NodeRef to traverse the AST. This is the current traversal API; the term “legacy” only refers to older navigation traits that the Query API replaced.

Implementation:

NodeRef enumerates coarse node categories (compilation unit, namespaces, declarations, methods, statements, expressions), and now includes top-level items like file-scoped namespaces, using directives, global using directives, and global attributes.
Children provides child enumeration for NodeRef.
Extract<T> enables Query::of<T>() to yield typed nodes without extending NodeRef for every concrete type.
Macro helpers impl_extract_expr! and impl_extract_stmt! simplify adding Extract impls for expression/statement variants.
Location: src/bsharp_syntax/src/query/ (re-exported as bsharp_analysis::framework::Query)

Rationale:

Composability: Typed filters via Query::filter_typed.
Maintainability: Avoids wide trait surfaces and duplicated traversal.
Performance: Focused walkers remain available for hot paths.
Determinism: Traversal order and artifact hashing remain stable.
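
The relationship between a coarse `NodeRef`, child enumeration, and typed extraction can be sketched in a few lines. Everything below is a hypothetical miniature: the two-variant `NodeRef`, the `children` method, the `Extract` signature, and the free function `query_of` are assumptions for illustration and do not reproduce the real Query API.

```rust
struct Method {
    name: String,
}

struct Class {
    name: String,
    methods: Vec<Method>,
}

/// Coarse, untyped handle to a node (the real NodeRef has more categories).
#[derive(Clone, Copy)]
enum NodeRef<'a> {
    Class(&'a Class),
    Method(&'a Method),
}

impl<'a> NodeRef<'a> {
    /// Child enumeration lives in one place, next to the node categories.
    fn children(self) -> Vec<NodeRef<'a>> {
        match self {
            NodeRef::Class(class) => class.methods.iter().map(NodeRef::Method).collect(),
            NodeRef::Method(_) => Vec::new(),
        }
    }
}

/// Typed view: each concrete node type knows how to pull itself out of a NodeRef.
trait Extract<'a>: Sized + 'a {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self>;
}

impl<'a> Extract<'a> for Class {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node {
            NodeRef::Class(class) => Some(class),
            _ => None,
        }
    }
}

impl<'a> Extract<'a> for Method {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node {
            NodeRef::Method(method) => Some(method),
            _ => None,
        }
    }
}

/// Depth-first enumeration of every node of type `T` under `root`.
fn query_of<'a, T: Extract<'a>>(root: NodeRef<'a>) -> Vec<&'a T> {
    let mut found = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        found.extend(T::extract(node));
        // Push children in reverse so they are visited in source order.
        stack.extend(node.children().into_iter().rev());
    }
    found
}
```

Adding a new typed query in this model means writing one `Extract` impl; the traversal code does not change.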

See also:

docs/parser/navigation.md (Query API overview)
docs/analysis/traversal-guide.md (using Query in passes)
docs/development/query-cookbook.md (recipes)

Formatting and Emitters

Decision: Implement formatting via an Emit trait with per-node emitters in bsharp_syntax.

Implementation:

Emit trait and emitters live under src/bsharp_syntax/src/emitters/ (e.g., emitters/declarations/*, emitters/expressions/*, emitters/statements/*).
Formatting is separated from parsing; emitters reconstruct code from AST with consistent whitespace and trivia handling.
Trivia and XML doc emitters are under emitters/trivia/.

Rationale:

Separation of Concerns: Parsing and formatting evolve independently.
Consistency: Centralized formatting rules for all nodes.
Extensibility: Adding a new node implies an Emit impl in a known location.
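
A per-node emitter can be pictured as follows. This is a sketch under assumptions: the `Emit` signature shown here (a `String` sink plus an indent level), the four-space indent, and the `Field` and `Class` structs are invented for the example and may differ from the trait in bsharp_syntax.

```rust
use std::fmt::{self, Write};

/// Hypothetical emitter trait: each node writes its own source form.
trait Emit {
    fn emit(&self, out: &mut String, indent: usize) -> fmt::Result;
}

struct Field {
    ty: String,
    name: String,
}

struct Class {
    name: String,
    fields: Vec<Field>,
}

impl Emit for Field {
    fn emit(&self, out: &mut String, indent: usize) -> fmt::Result {
        let pad = "    ".repeat(indent);
        writeln!(out, "{pad}public {} {};", self.ty, self.name)
    }
}

impl Emit for Class {
    fn emit(&self, out: &mut String, indent: usize) -> fmt::Result {
        let pad = "    ".repeat(indent);
        writeln!(out, "{pad}public class {}", self.name)?;
        writeln!(out, "{pad}{{")?;
        // Children decide their own layout; the parent only supplies the indent.
        for field in &self.fields {
            field.emit(out, indent + 1)?;
        }
        writeln!(out, "{pad}}}")
    }
}
```

Because layout decisions sit in the emitters, the same AST always produces the same text regardless of how the original source was spaced.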

See also: docs/syntax/formatter.md.

Workspace Loading

Multi-Format Support

Decision: Support loading from .sln, .csproj, or directory.

Implementation:

#![allow(unused)]
fn main() {
pub struct WorkspaceLoader;

impl WorkspaceLoader {
    pub fn from_path(path: &Path) -> Result<Workspace>;
    pub fn from_path_with_options(path: &Path, opts: WorkspaceLoadOptions) -> Result<Workspace>;
}
}

Rationale:

Flexibility: Support different entry points
IDE Integration: Match IDE project loading behavior
Incremental Analysis: Load only what's needed

Features:

Solution file (.sln) parsing
Project file (.csproj) parsing with XML
Transitive ProjectReference following
Source file discovery with glob patterns
Deterministic project ordering
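
Following references transitively while keeping the result deterministic might look like the sketch below. It assumes project names as plain strings and alphabetical ordering; `transitive_projects` is a hypothetical function, and the real loader may order projects differently.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Collects `root` and everything it references, directly or indirectly.
/// A BTreeSet keeps the result sorted, so the order is the same on every run,
/// and doubles as the visited set that stops reference cycles.
fn transitive_projects(root: &str, references: &BTreeMap<&str, Vec<&str>>) -> Vec<String> {
    let mut seen = BTreeSet::new();
    let mut pending = vec![root];
    while let Some(project) = pending.pop() {
        // `insert` returns false for projects that were already visited.
        if seen.insert(project) {
            if let Some(deps) = references.get(project) {
                pending.extend(deps.iter().copied());
            }
        }
    }
    seen.into_iter().map(str::to_owned).collect()
}
```

Sorted output makes reports and snapshot tests reproducible even when the file system lists projects in a different order.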

Error Resilience

Decision: Continue loading workspace even if individual projects fail.

Implementation:

Failed projects recorded as stubs with error messages
Workspace loading succeeds with partial results
Errors accessible via Project::errors field

Rationale:

Robustness: Don't fail entire workspace for one bad project
User Experience: Show what can be analyzed
Debugging: Error messages preserved for investigation
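
The stub approach can be sketched as follows. The `Project` struct, its fields other than `errors`, and both functions are simplified assumptions; `load_project` merely simulates a failure for names ending in `.broken`.

```rust
#[derive(Debug)]
struct Project {
    name: String,
    files: Vec<String>,
    errors: Vec<String>,
}

/// Stand-in for reading a project file; fails for names ending in ".broken".
fn load_project(name: &str) -> Result<Project, String> {
    if name.ends_with(".broken") {
        Err(format!("{name}: could not parse project file"))
    } else {
        Ok(Project {
            name: name.to_owned(),
            files: vec![format!("{name}/Program.cs")],
            errors: Vec::new(),
        })
    }
}

/// Loads every project, replacing failures with stubs that carry the error.
fn load_workspace(names: &[&str]) -> Vec<Project> {
    names
        .iter()
        .map(|name| {
            load_project(name).unwrap_or_else(|error| Project {
                name: (*name).to_owned(),
                files: Vec::new(),
                errors: vec![error],
            })
        })
        .collect()
}
```

Callers still receive one entry per project, so reports can list a failed project alongside the ones that loaded.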

Testing Strategy

External Test Organization

Decision: Externalize tests; in the current workspace they live under src/bsharp_tests/ rather than inline #[cfg(test)] modules.

Structure:

src/bsharp_tests/src/
├── parser/
│   ├── expressions/
│   ├── statements/
│   ├── declarations/
│   └── types/
├── cli/
└── integration/

Rationale:

Separation: Test code separate from implementation
Organization: Clear structure mirrors crates
Compilation: Test code and test-only dependencies stay out of the production crates

Note: A future migration to top-level tests/ may be considered.

Test Helpers

Decision: Provide expect_ok() helper for readable test failures.

Implementation:

#![allow(unused)]
fn main() {
pub fn expect_ok<T>(input: &str, result: BResult<&str, T>) -> T {
    match result {
        Ok((_, value)) => value,
        Err(e) => {
            eprintln!("{}", format_error_tree(&input, &e));
            panic!("Parse failed");
        }
    }
}
}

Rationale:

Diagnostics: Pretty-printed errors on failure
Debugging: Shows parse failure context
Consistency: Uniform test error reporting

Snapshot Testing

Decision: Use insta crate for snapshot testing.

Implementation:

Cargo.toml includes insta in dev-dependencies
Snapshot tests for complex AST structures
JSON serialization for comparison

Rationale:

Regression Prevention: Catch unintended AST changes
Review: Visual diff of AST changes
Maintenance: Update snapshots when intentional

Performance Considerations

Parallel Analysis

Decision: Optional parallel analysis via rayon feature.

Implementation:

[features]
parallel_analysis = ["rayon"]

Rationale:

Scalability: Faster analysis for large workspaces
Optional: Not required for single-file use cases
Trade-off: Adds dependency and complexity

Incremental Parsing

Decision: Defer incremental parsing for now; the design leaves room to add it later.

Future Design:

Cache parsed ASTs by file hash
Reparse only changed files
Incremental analysis based on change scope

Rationale:

Performance: Critical for IDE integration
Complexity: Requires careful cache invalidation
Priority: Deferred until core features stable
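
Since this feature is not implemented, the following is only a sketch of the first bullet of the future design, caching by file hash. `ParseCache`, `content_hash`, and the use of the standard library's `DefaultHasher` are assumptions chosen to keep the example self-contained.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

fn content_hash(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    hasher.finish()
}

/// Caches one parse result per path, keyed by the hash of the file contents.
struct ParseCache<T> {
    entries: HashMap<String, (u64, T)>,
    parses: usize, // number of real parses, useful for tests and metrics
}

impl<T: Clone> ParseCache<T> {
    fn new() -> Self {
        ParseCache { entries: HashMap::new(), parses: 0 }
    }

    /// Returns the cached result when the contents are unchanged,
    /// otherwise calls `parse` and stores the new result.
    fn get_or_parse(&mut self, path: &str, source: &str, parse: impl Fn(&str) -> T) -> T {
        let hash = content_hash(source);
        if let Some((cached_hash, ast)) = self.entries.get(path) {
            if *cached_hash == hash {
                return ast.clone();
            }
        }
        self.parses += 1;
        let ast = parse(source);
        self.entries.insert(path.to_owned(), (hash, ast.clone()));
        ast
    }
}
```

Invalidation here is purely content-based; the harder part noted above, deciding which analyses to rerun after a change, is outside this sketch.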

CLI Design

Subcommand Structure

Decision: Use clap with subcommands for different operations.

Commands:

parse - Parse C# file to JSON
tree - Generate AST visualization (Mermaid/DOT)
analyze - Run analysis and generate report

Rationale:

Clarity: Each command has clear purpose
Extensibility: Easy to add new commands
Discoverability: --help shows all options
Consistency: Follows common CLI patterns
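
The subcommand layout can be pictured as one enum variant per command. The sketch below is hand-rolled with the standard library only, so it is not the clap definition used by the CLI. The Command enum, its input field, and parse_args are assumptions for illustration.

```rust
/// One variant per subcommand listed above; the payload shape is an assumption.
#[derive(Debug, PartialEq)]
enum Command {
    Parse { input: String },
    Tree { input: String },
    Analyze { input: String },
}

/// Maps `<subcommand> <input>` to a Command (the program name is already stripped).
fn parse_args(args: &[&str]) -> Result<Command, String> {
    match args {
        [name, input] => {
            let input = input.to_string();
            match *name {
                "parse" => Ok(Command::Parse { input }),
                "tree" => Ok(Command::Tree { input }),
                "analyze" => Ok(Command::Analyze { input }),
                other => Err(format!("unknown subcommand: {other}")),
            }
        }
        _ => Err("usage: <parse|tree|analyze> <input>".to_string()),
    }
}
```

Adding a command then means adding one variant and one match arm, which is the extensibility property the rationale refers to.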

Output Formats

Decision: Support multiple output formats (JSON, pretty-JSON, SVG).

Implementation:

JSON for machine consumption
Pretty-JSON for human readability
SVG for visualization

Rationale:

Integration: JSON for tool integration
Debugging: Pretty-JSON for manual inspection
Visualization: SVG for understanding AST structure

Future Extensibility

Planned Enhancements

Incremental Parsing
- Cache parsed ASTs
- Reparse only changed regions
- Critical for IDE integration
Language Server Protocol (LSP)
- IDE integration
- Real-time diagnostics
- Code completion
More Analysis Passes
- Nullability analysis
- Lifetime analysis
- Security analysis
Code Transformation
- AST modification API
- Code generation from AST
- Refactoring support

Design for Extension

Principles:

Trait-Based: Use traits for extensibility points
Registry Pattern: Dynamic registration of analyzers
Configuration: Enable/disable features via config
Versioning: Stable API with clear versioning
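
A minimal sketch of how the trait-based, registry, and configuration principles fit together. The Analyzer trait, the two analyzers, and the Registry type here are hypothetical; the actual extension points in bsharp_analysis may look different.

```rust
use std::collections::HashSet;

/// Hypothetical extension point: anything that can analyze source text.
trait Analyzer {
    fn id(&self) -> &'static str;
    fn run(&self, source: &str) -> usize;
}

struct LineCount;
impl Analyzer for LineCount {
    fn id(&self) -> &'static str { "line_count" }
    fn run(&self, source: &str) -> usize { source.lines().count() }
}

struct ClassCount;
impl Analyzer for ClassCount {
    fn id(&self) -> &'static str { "class_count" }
    fn run(&self, source: &str) -> usize { source.matches("class ").count() }
}

/// Registry with dynamic registration and config-style disabling by id.
#[derive(Default)]
struct Registry {
    analyzers: Vec<Box<dyn Analyzer>>,
    disabled: HashSet<&'static str>,
}

impl Registry {
    fn register(&mut self, analyzer: Box<dyn Analyzer>) {
        self.analyzers.push(analyzer);
    }

    fn disable(&mut self, id: &'static str) {
        self.disabled.insert(id);
    }

    /// Runs every enabled analyzer; results are in registration order.
    fn run_all(&self, source: &str) -> Vec<(&'static str, usize)> {
        self.analyzers
            .iter()
            .filter(|a| !self.disabled.contains(a.id()))
            .map(|a| (a.id(), a.run(source)))
            .collect()
    }
}
```

New analyzers only need to implement the trait and be registered; nothing in the registry changes.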

Lessons Learned

What Worked Well

Parser Combinators: Excellent for composability and testing
Module Organization: Clear boundaries reduce coupling
Error Context: ErrorTree provides excellent diagnostics
External Tests: Clean separation improves maintainability

What We'd Do Differently

Earlier Keyword Modularization: Should have organized keywords from start
Error Type Migration: Earlier adoption of ErrorTree would have saved refactoring
Documentation: More inline documentation from the beginning

Recent Refactoring

Major refactoring improvements completed:

Expression precedence chain builder implemented
Statement group deduplication completed
Consistent error recovery with skip_to_member_boundary_top_level()
Whitespace handling standardization via bws() combinator
Keyword modularization by category

Contributing Guidelines

When adding new features, follow these architectural principles:

Use Existing Patterns: Follow established parser patterns
Add Tests: Place external tests in the bsharp_tests crate (src/bsharp_tests/src/)
Document Decisions: Update this file for significant changes
Error Context: Add .context() calls for debugging
Naming Convention: PascalCase without 'Syntax' suffix
Keyword Boundaries: Use keyword() helper for all keywords

See docs/development/contributing.md for detailed contribution guidelines.

Cookbooks

Short, task-focused examples and patterns.

Available Cookbooks

Query Cookbook
- Practical Query API patterns for traversing the AST.
Parser Cookbook
- Nom recipes: identifiers, lists, delimited blocks with cut, tokens with complete, and all_consuming file parsers.

When to use

You know the outcome you want and need a concise example.
You want to copy/paste a small starting point and adapt.

For deeper explanations, see:

docs/development/writing-parsers.md
docs/analysis/traversal-guide.md

Query Cookbook

Practical examples for using the Query API to traverse the AST.

Imports

#![allow(unused)]
fn main() {
// Option A (canonical): import directly from bsharp_syntax
use bsharp_syntax::node::ast_node::NodeRef;
use bsharp_syntax::query::Query;
use bsharp_syntax::{CompilationUnit, ClassDeclaration, MethodDeclaration};

// Option B (ergonomic in analysis code): re-exports via bsharp_analysis
// use bsharp_analysis::framework::{NodeRef, Query};
}

All classes in a file

#![allow(unused)]
fn main() {
fn all_classes(cu: &CompilationUnit) -> Vec<&ClassDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<ClassDeclaration>()
        .collect()
}
}

All methods in a class

#![allow(unused)]
fn main() {
fn all_methods_in_class(c: &ClassDeclaration) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::from(c))
        .of::<MethodDeclaration>()
        .collect()
}
}

Public methods only

#![allow(unused)]
fn main() {
use bsharp_syntax::modifiers::Modifier;

fn public_methods(cu: &CompilationUnit) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<MethodDeclaration>(|m| m.modifiers.iter().any(|mm| *mm == Modifier::Public))
        .collect()
}
}

Count await expressions

#![allow(unused)]
fn main() {
use bsharp_syntax::expressions::AwaitExpression;

fn await_count(cu: &CompilationUnit) -> usize {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<AwaitExpression>()
        .count()
}
}

Find invocations of a method name

#![allow(unused)]
fn main() {
use bsharp_syntax::expressions::{InvocationExpression, Expression};

fn invocations_of(cu: &CompilationUnit, name: &str) -> Vec<&InvocationExpression> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<InvocationExpression>(|inv| {
            // Match simple Variable(...) calls; extend for MemberAccess as needed
            match &*inv.expression {
                Expression::Variable(id) => id.name == name,
                _ => false,
            }
        })
        .collect()
}
}

Methods with deep nesting

#![allow(unused)]
fn main() {
use bsharp_syntax::statements::statement::Statement;

fn deeply_nested_methods(cu: &CompilationUnit, threshold: usize) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<MethodDeclaration>(|m| {
            if let Some(body) = &m.body {
                max_nesting(body, 0) > threshold
            } else {
                false
            }
        })
        .collect()
}

fn max_nesting(s: &Statement, cur: usize) -> usize {
    match s {
        Statement::If(i) => {
            let then_d = max_nesting(&i.consequence, cur + 1);
            let else_d = i.alternative.as_ref().map(|a| max_nesting(a, cur + 1)).unwrap_or(cur);
            then_d.max(else_d)
        }
        Statement::Block(stmts) => stmts.iter().map(|st| max_nesting(st, cur)).max().unwrap_or(cur),
        Statement::For(f) => max_nesting(&f.body, cur + 1),
        Statement::ForEach(f) => max_nesting(&f.body, cur + 1),
        Statement::While(w) => max_nesting(&w.body, cur + 1),
        Statement::DoWhile(d) => max_nesting(&d.body, cur + 1),
        _ => cur,
    }
}
}

Tips

Chain filters sparingly: Prefer a single filter_typed with a clear predicate.
Use NodeRef::from(x): Start from any AST node to scope queries.
Profile: For hot paths, consider a custom walker when you need full control.

Parser Cookbook

Practical recipes for nom-based parsers in bsharp_parser.

Spanned-first policy

All public parser entrypoints return Spanned<T> so callers have precise source ranges for AST nodes.
Internals should prefer spanned parsers as well to preserve spans through transformations.
When you only need the inner value, map via .node.

Examples:

#![allow(unused)]
fn main() {
// Prefer the spanned variant and map to inner node when spans are not needed
let (rest, expr) = nom::sequence::delimited(ws, parse_expression_spanned, ws)
    .map(|s| s.node)
    .parse(input)?;

// Lists of expressions: collect inner nodes
let (rest, args) = parse_delimited_list0(
    |i| delimited(ws, tok_l_paren(), ws).parse(i),
    |i| delimited(ws, parse_expression_spanned, ws).map(|s| s.node).parse(i),
    |i| delimited(ws, tok_comma(), ws).parse(i),
    |i| delimited(ws, tok_r_paren(), ws).parse(i),
    false,
    true,
).parse(input)?;
}

Parsable trait

For one-shot parsing of a type to Spanned<Self>, implement or use the crate’s Parsable abstraction (where available) instead of bespoke entrypoints.
This keeps a consistent contract across the parser and simplifies tests and tools that need spans.

Conventions

Use Span<'a> and BResult<'a, T> from bsharp_parser::syntax modules.
Prefer small, composable parsers and add context() labels.
Use cut() to avoid misleading backtracking after committing to a branch.

#![allow(unused)]
fn main() {
use bsharp_parser::syntax::span::Span;
use bsharp_parser::syntax::errors::BResult;
use nom::{IResult, branch::alt, bytes::complete::tag, character::complete as cc, combinator::{all_consuming, complete, map}, sequence::{delimited, preceded, terminated, tuple}};
use nom_supreme::ParserExt; // for .context(), .cut()
}

Identifier

#![allow(unused)]
fn main() {
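fn identifier(input: Span) -> BResult<String> {
    // very simplified: (letter|_) (letter|digit|_)*
    use nom::{combinator::recognize, multi::many0_count};
    map(
        // recognize() returns the whole matched span, so no manual concatenation is needed
        recognize(tuple((
            alt((cc::alpha1, tag("_"))),
            many0_count(alt((cc::alphanumeric1, tag("_")))),
        ))),
        |s: Span| s.fragment().to_string(),
    ).context("identifier").parse(input)
}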
}

Comma-Separated List

#![allow(unused)]
fn main() {
use nom::multi::separated_list0;

fn comma_sep<T, F>(item: F) -> impl FnMut(Span) -> BResult<Vec<T>>
where F: Fn(Span) -> BResult<T> {
    separated_list0(cc::multispace0.and(tag(",")).and(cc::multispace0), item)
}
}

Delimited Braces Block

#![allow(unused)]
fn main() {
fn lbrace(i: Span) -> BResult<()> { map(tag("{"), |_| ()).context("'{'").parse(i) }
fn rbrace(i: Span) -> BResult<()> { map(tag("}"), |_| ()).context("'}'").parse(i) }

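fn block<T, F>(mut inner: F) -> impl FnMut(Span) -> BResult<Vec<T>>
where F: FnMut(Span) -> BResult<Vec<T>> {
    use nom::combinator::cut;
    move |input| {
        delimited(
            lbrace.context("block start"),
            // after '{' we are committed: inner failures become hard errors.
            // The closure borrows `inner` so the outer parser stays reusable.
            cut(|i| inner(i)),
            // a missing '}' is reported instead of backtracking out of the block
            cut(rbrace).context("block end"),
        ).parse(input)
    }
}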
}

Using complete() for Tokens

#![allow(unused)]
fn main() {
use nom::bytes::streaming::take;
use nom::combinator::complete;

fn exactly_n(n: u8) -> impl FnMut(Span) -> BResult<Span<'_>> {
    move |input| complete(take(n)).context("exactly_n").parse(input)
}
}

all_consuming at File Level

#![allow(unused)]
fn main() {
use nom::combinator::all_consuming;

fn parse_file(input: Span) -> BResult<File> {
    all_consuming(file_parser).parse(input)
}
}

Precedence Chain Skeleton

#![allow(unused)]
fn main() {
// Bodies are placeholders; each level parses its operands with the level above it.
fn primary(i: Span) -> BResult<Expr> { todo!("literals, names, parenthesized") }
fn postfix(i: Span) -> BResult<Expr> { todo!("member access, invocation") }
fn unary(i: Span) -> BResult<Expr> { todo!("+ - ! ~") }
fn multiplicative(i: Span) -> BResult<Expr> { todo!("* / %") }
fn additive(i: Span) -> BResult<Expr> { todo!("+ -") }
fn relational(i: Span) -> BResult<Expr> { todo!("< > <= >=") }
fn equality(i: Span) -> BResult<Expr> { todo!("== !=") }
fn assignment(i: Span) -> BResult<Expr> { todo!("= += -=") }

// Entry point used by statement parsers
fn expression(i: Span) -> BResult<Expr> { assignment(i) }
}
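
To make the chain concrete, here is a small self-contained sketch with just three levels (primary, multiplicative, additive) over plain &str input. It deliberately avoids nom and the real Expression type: Expr, PResult, and binary_level are toy names made up for this example.

```rust
/// Toy AST for the sketch (not the bsharp_syntax Expression type).
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Binary(Box<Expr>, char, Box<Expr>),
}

/// Remaining input plus the parsed expression, or None on failure.
type PResult<'a> = Option<(&'a str, Expr)>;

/// Tightest level: integer literals and parenthesized expressions.
fn primary(i: &str) -> PResult<'_> {
    let i = i.trim_start();
    if let Some(rest) = i.strip_prefix('(') {
        let (rest, e) = additive(rest)?;
        return Some((rest.trim_start().strip_prefix(')')?, e));
    }
    let end = i.find(|c: char| !c.is_ascii_digit()).unwrap_or(i.len());
    let n = i[..end].parse().ok()?;
    Some((&i[end..], Expr::Num(n)))
}

/// Shared left-associative loop: parse `next`, then fold `op next` pairs.
fn binary_level<'a>(i: &'a str, ops: &[char], next: fn(&str) -> PResult<'_>) -> PResult<'a> {
    let (mut rest, mut lhs) = next(i)?;
    loop {
        let trimmed = rest.trim_start();
        let Some(op) = trimmed.chars().next().filter(|c| ops.contains(c)) else {
            return Some((rest, lhs));
        };
        // Operators are single ASCII characters, so slicing at byte 1 is safe.
        let (after, rhs) = next(&trimmed[1..])?;
        lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        rest = after;
    }
}

fn multiplicative(i: &str) -> PResult<'_> { binary_level(i, &['*', '/', '%'], primary) }
fn additive(i: &str) -> PResult<'_> { binary_level(i, &['+', '-'], multiplicative) }
```

Because additive parses its operands with multiplicative, `1 + 2 * 3` groups as `1 + (2 * 3)` without any explicit precedence table.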

Context Labels and Cuts

#![allow(unused)]
fn main() {
fn class_declaration(i: Span) -> BResult<ClassDecl> {
    preceded(
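        // require whitespace after the keyword so "class Foo" parses and "classFoo" does not
        terminated(tag("class"), cc::multispace1).context("keyword 'class'"),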
        tuple((
            identifier.cut().context("class name"),
            // ... type params, base list
        ))
    ).context("class declaration").map(|(name, ..)| ClassDecl { name }).parse(i)
}
}

Tips

Whitespace: Prefer explicit multispace0/multispace1 at boundaries to avoid accidental greedy matches.
Error messages: Keep context() labels concise and domain-specific (e.g., "parameter list").
Backtracking: Insert cut() after committing to a branch to stop alt from swallowing errors.
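
The backtracking tip is easier to see in a stripped-down model. The sketch below re-creates the distinction between a recoverable error and a committed failure using only the standard library. PErr, keyword, cut, and declaration are toy definitions that imitate nom's behavior and are not part of bsharp_parser.

```rust
/// Mirrors nom's split between recoverable errors and hard failures.
#[derive(Debug, PartialEq)]
enum PErr {
    Error(String),   // recoverable: an alt-style parser may try the next branch
    Failure(String), // committed: alternatives must not be tried
}

type PResult<'a, T> = Result<(&'a str, T), PErr>;

fn keyword<'a>(kw: &str, i: &'a str) -> PResult<'a, ()> {
    i.strip_prefix(kw)
        .map(|rest| (rest, ()))
        .ok_or_else(|| PErr::Error(format!("expected '{kw}'")))
}

fn name(i: &str) -> PResult<'_, String> {
    let i = i.trim_start();
    let end = i.find(|c: char| !c.is_alphanumeric()).unwrap_or(i.len());
    if end == 0 {
        return Err(PErr::Error("expected name".into()));
    }
    Ok((&i[end..], i[..end].to_string()))
}

/// Upgrades a recoverable error to a failure, like nom's `cut`.
fn cut<'a, T>(r: PResult<'a, T>) -> PResult<'a, T> {
    r.map_err(|e| match e {
        PErr::Error(m) => PErr::Failure(m),
        f => f,
    })
}

fn class_decl(i: &str) -> PResult<'_, String> {
    let (i, _) = keyword("class", i)?; // may still backtrack here
    cut(name(i)) // committed once the keyword matched
}

fn struct_decl(i: &str) -> PResult<'_, String> {
    let (i, _) = keyword("struct", i)?;
    cut(name(i))
}

/// Tries branches in order; only a recoverable error moves on to the next one.
fn declaration(i: &str) -> PResult<'_, String> {
    match class_decl(i) {
        Err(PErr::Error(_)) => struct_decl(i),
        other => other,
    }
}
```

For the input `class {` the reported problem is "expected name" rather than "expected 'struct'", which is the kind of error the cut is meant to surface.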

Writing Tests

How to write and organize tests for BSharp.

Test Locations

Parser and analysis tests live under src/bsharp_tests/src/.
Prefer dedicated files per area, e.g.:
- src/bsharp_tests/src/parser/expressions/...
- src/bsharp_tests/src/parser/statements/...
- src/bsharp_tests/src/analysis/...

Parser Tests

Use realistic C# snippets and assert AST shapes.
Prefer external test helpers (avoid inline #[cfg(test)] in parser modules).

#![allow(unused)]
fn main() {
// Example skeleton
#[test]
fn parses_simple_invocation() {
    let source = "class C { void M() { Foo(1); } }";
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(source).unwrap();
    // Use Query or pattern matching to verify nodes
}
}

Analysis Tests

Run AnalyzerPipeline::run_with_defaults and inspect artifacts:
- AstAnalysis metrics
- CFG summary
- Dependency summary

#![allow(unused)]
fn main() {
#[test]
fn counts_methods() {
    let src = "class C { void A(){} void B(){} }";
    let (cu, spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    let mut session = bsharp_analysis::framework::AnalysisSession::new(
        bsharp_analysis::context::AnalysisContext::new("file.cs", src), spans);
    bsharp_analysis::framework::AnalyzerPipeline::run_with_defaults(&cu, &mut session);
    let metrics = session.artifacts.get::<bsharp_analysis::metrics::AstAnalysis>().unwrap();
    assert!(metrics.total_methods >= 2);
}
}

Tips

Names: Use descriptive test names; each file should focus on one area.
Fixtures: Keep sources small and focused; add comments for intent.
Determinism: Avoid relying on traversal order; query by type or match by name.

bsharp_tests Overview

Structure and conventions for the test crate.

Location

All tests live under src/bsharp_tests/src/.
Organize by domain:
- parser/ for parsing-related tests
- analysis/ for analysis pipeline tests

Running Tests

cargo test -p bsharp_tests

Conventions

Prefer descriptive file names and test names.
Keep fixtures small and focused.
Use Parser::parse_with_spans and AnalyzerPipeline::run_with_defaults in integration-style tests.

Extending Syntax (New Nodes)

How to add new AST node types to bsharp_syntax.

1. Define the Node

Add a struct or enum in the relevant module under src/bsharp_syntax/src/.
Derive bsharp_syntax_derive::AstNode so it participates in traversal and rendering.

#![allow(unused)]
fn main() {
#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct InterpolatedString {
    pub parts: Vec<InterpolatedPart>,
}

#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum InterpolatedPart {
    Text(String),
    Expr(Expression),
}
}

The derive implements AstNode and auto-generates children() that pushes nested nodes.

2. Implement Emit (Optional)

If the node needs to be formatted back to C#, implement Emit in bsharp_syntax emitters.

#![allow(unused)]
fn main() {
impl crate::emitters::emit_trait::Emit for InterpolatedString {
    fn emit<W: std::fmt::Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError> {
        cx.token(w, "$")?;
        cx.bracketed(w, '"', '"', || {
            for p in &self.parts { p.emit(w, cx)?; }
            Ok(())
        })
    }
}
}

Add per-part emitters in the same or nearby module (e.g., emitters/expressions/...).

3. Wire Up Parser (in `bsharp_parser`)

Add a parser in src/bsharp_parser/src/expressions/... that constructs the new node.
Use Span-based parsers (bsharp_parser::syntax::span::Span).
On errors, rely on helpers and contexts so format_error_tree() is informative.

3a. Add Keywords & Tokens

Define keyword helpers using define_keyword_pair! in src/bsharp_parser/src/keywords/.
If it is a new reserved word, add it to KEYWORDS (used for identifier filtering).
Use kw_*()/peek_*() in parsers, wrapped with ws() at boundaries, and insert .cut() after commitment.

See: docs/parser/keywords-and-tokens.md for the macro and examples.

3b. Use Syntax Parsers (Whitespace/Lists)

Whitespace/comments: syntax/comment_parser.rs (ws(), parse_whitespace_or_comments())
Lists: syntax/list_parser.rs for delimited/separated lists
Tokens: prefer nom_supreme::tag::complete::tag() and compose with preceded/terminated/delimited and ws()

Example token with trivia:

#![allow(unused)]
fn main() {
use nom::{combinator::map, sequence::delimited};
use nom_supreme::tag::complete::tag;
use crate::syntax::comment_parser::ws;

map(delimited(ws, tag(","), ws), |_| ())
}

4. Tests (`bsharp_tests`)

Create tests under src/bsharp_tests/src/parser/... verifying the node appears in the AST.
Add formatter round-trip tests if Emit is implemented.

#![allow(unused)]
fn main() {
#[test]
fn interpolated_string_ast() {
    let src = r#"class C { void M(){ var s = $"x={x}"; } }"#;
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    // Use Query to find InterpolatedString once parser supports it
}
}

5. Visualization (Optional)

Graph views require no changes: to_text, to_mermaid, and to_dot use AstNode traversal.

Tips

Box recursion: Use Box<T> for recursive enum variants.
Keep primitives out: Store String, bool, numbers as payload only; derive will skip them.
Naming: Use PascalCase node names; no Syntax suffix.
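
The first two tips can be illustrated with a hand-written children() on a toy enum. This is only an approximation of what the derive is described as generating. The Expr type and the count_nodes helper below are invented for the example and are not bsharp_syntax definitions.

```rust
/// Toy nodes for illustration; not the bsharp_syntax definitions.
#[derive(Debug)]
enum Expr {
    Literal(i64),              // primitive payload: contributes no child nodes
    Name(String),              // primitive payload: contributes no child nodes
    Add(Box<Expr>, Box<Expr>), // Box gives the recursive variant a finite size
    Call { callee: Box<Expr>, args: Vec<Expr> },
}

impl Expr {
    /// Hand-written stand-in for a derived children():
    /// nested nodes are pushed, primitive payloads are skipped.
    fn children(&self) -> Vec<&Expr> {
        match self {
            Expr::Literal(_) | Expr::Name(_) => Vec::new(),
            Expr::Add(l, r) => vec![l.as_ref(), r.as_ref()],
            Expr::Call { callee, args } => {
                let mut out = vec![callee.as_ref()];
                out.extend(args.iter());
                out
            }
        }
    }

    /// Generic traversal that relies only on children().
    fn count_nodes(&self) -> usize {
        1 + self.children().iter().map(|c| c.count_nodes()).sum::<usize>()
    }
}
```

Any traversal written against children(), such as count_nodes here, keeps working when new variants are added, as long as children() covers them.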

Writing Parsers

Guidelines for implementing parsers in bsharp_parser using nom and spans.

Spans & Result Type

Span: bsharp_parser::syntax::span::Span<'a> (alias of nom_locate::LocatedSpan<&'a str>)
Error type: nom_supreme::error::ErrorTree<Span<'a>>
Result alias: type BResult<'a, O> = IResult<Span<'a>, O, ErrorTree<Span<'a>>> in bsharp_parser::syntax::errors

#![allow(unused)]
fn main() {
use bsharp_parser::syntax::errors::BResult;
use bsharp_parser::syntax::span::Span;
}

Streaming vs Complete

nom ships both streaming and complete variants of many parsers. When a sub-parser built from a streaming variant should treat end of input as a plain failure (e.g., tokens, literals), wrap it in nom::combinator::complete(parser), which turns Incomplete into Error.

Example (from nom docs):

#![allow(unused)]
fn main() {
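use nom::bytes::streaming::take;
use nom::combinator::complete;
use nom::Parser; // brings .parse() into scope

// The input and error types are spelled out so the snippet type-checks on its own
let mut parser = complete(take::<_, &str, nom::error::Error<&str>>(5u8));
assert_eq!(parser.parse("abcdefg"), Ok(("fg", "abcde")));
assert!(parser.parse("abcd").is_err());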
}

At the top level, wrap file parsers with nom::combinator::all_consuming to ensure the entire input is consumed:

#![allow(unused)]
fn main() {
use nom::combinator::all_consuming;
let mut parser = all_consuming(file_parser);
}
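
For readers new to the distinction, the following sketch models the three outcomes with the standard library only. PErr, take_streaming, complete, and all_consuming are toy re-creations that imitate the nom combinators of the same names; they are not nom's implementation.

```rust
/// Toy version of the outcomes a streaming parser can report.
#[derive(Debug, PartialEq)]
enum PErr {
    Incomplete(usize), // streaming: this many more bytes might make it succeed
    Error(&'static str),
}

/// (remaining input, matched output) on success.
type PResult<'a> = Result<(&'a str, &'a str), PErr>;

/// Streaming-style take: too little input is "incomplete", not an error.
/// Slices by byte index, so the sketch assumes ASCII input.
fn take_streaming(n: usize, i: &str) -> PResult<'_> {
    if i.len() < n {
        Err(PErr::Incomplete(n - i.len()))
    } else {
        Ok((&i[n..], &i[..n]))
    }
}

/// Imitates nom::combinator::complete: Incomplete becomes a regular error.
fn complete(r: PResult<'_>) -> PResult<'_> {
    r.map_err(|e| match e {
        PErr::Incomplete(_) => PErr::Error("complete"),
        other => other,
    })
}

/// Imitates nom::combinator::all_consuming: success requires empty remaining input.
fn all_consuming(r: PResult<'_>) -> PResult<'_> {
    match r {
        Ok((rest, _)) if !rest.is_empty() => Err(PErr::Error("trailing input")),
        other => other,
    }
}
```

The two wrappers address different problems: complete decides what running out of input means, while all_consuming decides what leftover input means.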

Error Contexts and Cuts

Use nom_supreme for structured errors and better messages:

context("label", p) to push human-readable frames.
cut(p) to prevent backtracking across critical boundaries and surface the right error.
Our pretty-printer format_error_tree(&source, &error_tree) renders the tree with line/column and context stack.

#![allow(unused)]
fn main() {
use nom::{branch::alt, sequence::{preceded, terminated}};
use nom_supreme::context::ContextError;
use nom_supreme::ParserExt; // for .context(), .cut()

fn identifier(input: Span) -> BResult<String> { /* ... */ }
fn lbrace(input: Span) -> BResult<()> { /* ... */ }
fn rbrace(input: Span) -> BResult<()> { /* ... */ }

fn block(input: Span) -> BResult<Vec<Stmt>> {
    preceded(
        lbrace.context("block: '{'"),
        terminated(statements, rbrace.cut().context("block: '}'"))
    ).parse(input)
}
}

Common Combinators

preceded(a, b), terminated(a, b), delimited(a, b, c)
alt((p1, p2, ...)) for alternatives
tuple((p1, p2, ...)) to sequence
separated_list0(sep, item) to parse comma-separated lists
map(p, f) to build AST nodes

Prefer small, focused parsers composed with these combinators.

Top-Level Entry Points

Keep clear entry points for precedence chains (e.g., primary → postfix → binary → assignment).
Use wrapper nodes for constructs like New, Invocation, MemberAccess, etc., to keep variants orthogonal in the AST (see bsharp_syntax::expressions::expression.rs).

Testing Parsers

Place tests in src/bsharp_tests/src/parser/....
Parse using Parser::new().parse_with_spans(&source) and assert expected AST shapes.
On failure, pretty-print errors with format_error_tree to diagnose.

#![allow(unused)]
fn main() {
#[test]
fn parses_expression_statement() {
    let src = "class C { void M(){ Foo(1); } }";
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    // Verify expected nodes using Query or pattern matching
}
}

Tips

Commit with cut after consuming a keyword so that the real error is reported instead of a misleading alternative.
Use complete for tokens/literals that must not be partial.
Use all_consuming at the file/compilation-unit level to reject trailing input.
Context labels: Be concise and specific; they surface in error messages and docs.

References

nom combinator complete: https://docs.rs/nom/8.0.0/nom/combinator/fn.complete.html
nom combinator all_consuming: https://docs.rs/nom/8.0.0/nom/combinator/fn.all_consuming.html

Spanned-first Parsers

This project follows a spanned-first policy for all parser entrypoints. Public parsers return Spanned<T> so every AST value carries precise source ranges for diagnostics, tooling, and downstream analysis.

Rationale

Rich diagnostics: precise byte and line/column ranges for errors and UI highlighting.
Uniform contract: tools and tests can rely on span presence everywhere.
Safer refactors: span plumbing is not an afterthought.
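
A rough picture of what a spanned value buys you, using only the standard library. Only the node field name is taken from this documentation. The span field as a byte range, the map helper, and parse_number_spanned are assumptions made for this sketch.

```rust
use std::ops::Range;

/// Hypothetical shape: `node` matches the field used in this guide;
/// the byte-range `span` field is an assumption for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Range<usize>,
}

impl<T> Spanned<T> {
    /// Transforms the value while keeping the source range.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Spanned<U> {
        Spanned { node: f(self.node), span: self.span }
    }
}

/// Toy spanned parser: reads ASCII digits at `offset` and records their byte range.
fn parse_number_spanned(source: &str, offset: usize) -> Option<Spanned<u32>> {
    let tail = &source[offset..];
    let len = tail.find(|c: char| !c.is_ascii_digit()).unwrap_or(tail.len());
    let node = tail[..len].parse().ok()?;
    Some(Spanned { node, span: offset..offset + len })
}
```

Because map carries the range along, a later diagnostic can still point at the original text even after the value has been transformed.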

Usage Patterns

1) Prefer spanned entrypoints

#![allow(unused)]
fn main() {
// Prefer spanned variants
let (rest, s_expr) = parse_expression_spanned(input)?;
// Use inner value if spans are not needed at the call site
let expr = s_expr.node;
}

2) Map lists to inner nodes

#![allow(unused)]
fn main() {
use nom::sequence::delimited;

let (rest, args) = parse_delimited_list0(
    |i| delimited(ws, tok_l_paren(), ws).parse(i),
    |i| delimited(ws, parse_expression_spanned, ws).map(|s| s.node).parse(i),
    |i| delimited(ws, tok_comma(), ws).parse(i),
    |i| delimited(ws, tok_r_paren(), ws).parse(i),
    false,
    true,
).parse(input)?;
}

3) Statements

#![allow(unused)]
fn main() {
let (rest, s_stmt) = parse_statement_ws_spanned(input)?;
let stmt = s_stmt.node;
}

Implementing new parsers

Return Spanned<T> from public entrypoints.
Compose with existing spanned parsers to retain spans through transformations.
For adapters that must return unspanned values (e.g., legacy APIs), apply .map(|s| s.node) at the last possible boundary.
Use cut() after committing to a branch to produce focused errors.
Add context("...") labels on user-facing constructs.

Example:

#![allow(unused)]
fn main() {
use nom::sequence::delimited;
use nom_supreme::ParserExt;

pub fn parse_lambda_body(input: Span) -> BResult<LambdaBody> {
    nom::branch::alt((
        // block
        nom::combinator::map(parse_lambda_block_body, LambdaBody::Block),
        // expression
        nom::combinator::map(
            delimited(ws, parse_expression_spanned, ws).map(|s| s.node),
            LambdaBody::ExpressionSyntax,
        ),
    ))
    .context("lambda body")
    .parse(input)
}
}

Testing

Prefer helpers that accept/return Spanned<T> in new tests.
When asserting only structure, map to .node before comparison.
For diagnostics, use the existing pretty printers (see bsharp_parser::errors::format_error_tree and to_miette_report).

Migration Notes

Old unspanned entrypoints are deprecated; use their _spanned counterparts.
If a caller previously depended on unspanned types, add .map(|s| s.node).
For bulk changes: search for parse_expression( and parse_statement( and replace with spanned + .node mapping.

Compliance

Warning

This Compliance section is a work in progress. Content, mappings, and assertions may evolve and change between versions.

This section documents the Roslyn compliance pipeline and how we validate our bsharp_parser and bsharp_syntax against Roslyn’s structure tests.

Start with the high-level Overview
Learn how to write Custom Asserts
Understand the Generator

Compliance Overview

This section describes the Roslyn compliance effort for the C# parser, using our Rust-based bsharp_parser and the bsharp_syntax AST. The goal is to automatically extract structural assertions from Roslyn tests and validate that our AST shape and key payloads match Roslyn’s expectations (normalized to our naming conventions: PascalCase, no "Syntax" suffix).

High-Level Flow

Source: Roslyn test files in roslyn_testing/roslyn_repo/src/Compilers/CSharp/Test/Syntax/Parsing/.
Extraction: A generator scans for UsingTree(...) blocks and parses the following DSL of N(SyntaxKind.X) nodes.
Translation: The extracted Roslyn tree is translated and normalized to our canonical kinds and structure.
Running: Tests are emitted into bsharp_compliance_testing, parsing provided C# snippets with bsharp_parser and comparing the actual AST with the expected structure.

Core Components

bsharp_compliance (generator)
- Reads Roslyn files and extracts structural expected trees.
- Parses the Roslyn DSL (N(SyntaxKind.X), M(...), EOF()).
- Normalizes kinds via kind_map.rs (e.g., RecordStructDeclaration → RecordDeclaration).
- Emits Rust tests into bsharp_compliance_testing/src/generated/.
bsharp_compliance_testing (tests & asserts)
- Contains custom structural assertions in custom_asserts/structure_assert.rs.
- Walks real bsharp_syntax nodes to build a comparable ExpectedTree.
- Compares node kind shapes and selected token payloads (e.g., identifier text).

Normalization Principles

Node names are PascalCase and omit Roslyn’s ...Syntax suffix.
Tokens/keywords are filtered from structure; identifier text is lifted where relevant.
Harness differences (Roslyn’s class-with-method wrappers vs. our top-level statements) are normalized at assert time when needed.
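
The first two principles can be sketched as a pair of small functions. This is not the contents of kind_map.rs. The names normalize_kind and is_structural are hypothetical, and the suffix test for tokens and keywords is an assumed heuristic.

```rust
/// Maps a Roslyn kind or type name to the canonical name used in comparisons.
fn normalize_kind(roslyn: &str) -> String {
    // Drop Roslyn's "...Syntax" suffix when present.
    let base = roslyn.strip_suffix("Syntax").unwrap_or(roslyn);
    match base {
        // Targeted rename given as an example in Core Components.
        "RecordStructDeclaration" => "RecordDeclaration".to_string(),
        other => other.to_string(),
    }
}

/// Tokens and keywords are dropped from the structural comparison.
/// The suffix check is an assumed heuristic, not the project's actual rule.
fn is_structural(kind: &str) -> bool {
    !(kind.ends_with("Token") || kind.ends_with("Keyword"))
}
```

Keeping renames in a single match makes it easy to see which Roslyn kinds are intentionally merged.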

What This Validates

Structural presence and order of major nodes (CompilationUnit, declarations, using directives, type parameters, constraint clauses, etc.).
Selected payloads (e.g., IdentifierName.token_value).
Deeper constructs incrementally (e.g., TypeParameterConstraintClause, “allows ref struct” constraints, record primary parameter lists).

Roadmap

Expand kind mapping and walker coverage across more Roslyn suites.
Tighten token payload checks where meaningful.
Add targeted hand-authored structure tests for corner cases.

Compliance Guide

This guide explains how to write custom asserts for Roslyn compliance tests using our bsharp_compliance_testing helpers. It focuses on structural checks and optional diagnostics checks.

Where custom asserts live

File: roslyn_testing/bsharp_compliance_testing/src/custom_asserts/after_parse.rs
Entry points:
- after_parse(...): lightweight per-case hook for structural or source-based assertions.
- after_parse_with_expected(...): adds an optional diagnostics expectation integration.
Helper macro:
- assert_when! { ... } — enables concise, per-case matching on module/file/method/index.

Using `assert_when!`

The macro lets you target a specific Roslyn case by module name, Roslyn filename, Roslyn test method name, and case index (0-based within the method).

Example for a Statement case:

#![allow(unused)]
fn main() {
// Import only CaseData: importing after_parse would clash with the function defined below
use crate::custom_asserts::after_parse::CaseData;

pub fn after_parse(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    case: CaseData<'_>,
) {
    assert_when!(
        module = "statement_parsing_tests",
        roslyn_file = "StatementParsingTests",
        roslyn_method = "TestSwitchStatementWithNullableTypeInPattern3",
        idx = 2,
        Statement(ast, src) {
            // Add your targeted assertions here
            assert!(src.contains("switch"));
            // Optional: pattern-match on `ast` when you need structure checks
            // match ast { /* ... */ }
        }
    );
}
}

Example for a File case (full CompilationUnit available):

#![allow(unused)]
fn main() {
// Import only CaseData: importing after_parse would clash with the function defined below
use crate::custom_asserts::after_parse::CaseData;

pub fn after_parse(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    case: CaseData<'_>,
) {
    assert_when!(
        module = "using_directive_parsing_tests",
        roslyn_file = "UsingDirectiveParsingTests",
        roslyn_method = "SimpleUsingDirectiveNamePointer",
        idx = 0,
        File(unit, src, original) {
            assert!(src.starts_with("using "));
            // `unit` is a &bsharp_syntax::ast::CompilationUnit
            // You can inspect its using directives or declarations if needed.
            assert!(unit.using_directives.len() >= 1);
            let _ = original; // original Roslyn text when provided
        }
    );
}
}

Diagnostics integration

If the generator attaches expected diagnostics, use after_parse_with_expected(...) to compare counts when diagnostics are supported by the build:

#![allow(unused)]
fn main() {
use crate::custom_asserts::after_parse::{after_parse_with_expected, CaseData};

pub fn my_integration(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    expected: Option<crate::custom_asserts::roslyn_asserts::ExpectedDiagnostics>,
    case: CaseData<'_>,
) {
    // Runs custom case asserts and then asserts diagnostics count when available
    after_parse_with_expected(module, roslyn_file, roslyn_method, idx, expected, case);
}
}

Notes:

When diagnostics support is disabled, the helper asserts with an explicit "unimplemented" fallback to avoid silent failures.
Keep asserts precise and self-contained; prefer checking concrete substrings or specific AST facts.

Best practices

Keep assertions small and focused. Use assert_when! blocks per case.
Avoid brittle assumptions: prefer checking presence/shape over exact token trivia.
Match our naming convention in any structure references (PascalCase, no Syntax suffix).
Fail fast with clear messages; do not silently swallow errors.

Generator

This document describes how the Roslyn structure test generator works in bsharp_compliance and how it produces executable tests for bsharp_compliance_testing.

Inputs

Roslyn source files under roslyn_testing/roslyn_repo/src/Compilers/CSharp/Test/Syntax/Parsing/.
The generator scans for UsingTree(...) calls and parses the immediately following Roslyn structure DSL composed of N(SyntaxKind.X) and EOF() entries (with M(...) ignored as "missing").

Pipeline

Scan and collect test methods
- Locates Roslyn [Fact] methods and all UsingTree(...) call sites.
- Captures the closest preceding var text = "..."; snippet as input source, when present.
Parse structure DSL
- Reads the DSL block following UsingTree(...) and constructs a nested ExpectedTree (ExpectedNode graph) mirroring the Roslyn node hierarchy.
- Tolerates whitespace, comments, and missing markers (M(...)).
Kind translation and normalization
- generator/kind_map.rs maps Roslyn kinds to our canonical naming (PascalCase, no Syntax suffix).
- Filters token/keyword nodes, lifting identifier text (IdentifierToken → parent IdentifierName.token_value).
- Applies targeted renames (e.g., RecordStructDeclaration → RecordDeclaration).
Emit tests
- Writes Rust tests into bsharp_compliance_testing/src/generated/<module>.rs.
- Each test parses the captured src with bsharp_parser and asserts structure via custom_asserts/structure_assert.rs.
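
The "Parse structure DSL" step can be approximated in a few lines. The sketch below assumes the usual one-entry-per-line layout of the Roslyn DSL, with `{` and `}` on their own lines. It is not the code in generator/structure_dsl.rs, and the ExpectedNode fields shown are assumptions.

```rust
/// Assumed minimal shape of a node in the expected tree.
#[derive(Debug, PartialEq)]
struct ExpectedNode {
    kind: String,
    children: Vec<ExpectedNode>,
}

/// Parses lines like `N(SyntaxKind.X);`, `{`, `}` into a tree.
/// `M(...)` lines, `EOF();`, and anything else are skipped in this sketch.
fn parse_dsl(dsl: &str) -> Option<ExpectedNode> {
    // Stack of open sibling lists; the bottom list holds the root.
    let mut stack: Vec<Vec<ExpectedNode>> = vec![Vec::new()];
    for line in dsl.lines().map(str::trim) {
        if let Some(rest) = line.strip_prefix("N(SyntaxKind.") {
            // The kind name ends at ',' (token text follows) or ')'.
            let end = rest.find(|c: char| c == ',' || c == ')')?;
            stack.last_mut()?.push(ExpectedNode {
                kind: rest[..end].to_string(),
                children: Vec::new(),
            });
        } else if line == "{" {
            stack.push(Vec::new());
        } else if line == "}" {
            // Attach the finished sibling list to the most recent parent node.
            let children = stack.pop()?;
            stack.last_mut()?.last_mut()?.children = children;
        }
    }
    if stack.len() != 1 {
        return None; // unbalanced braces
    }
    stack.pop()?.into_iter().next()
}
```

Kind translation and token filtering would then run over the resulting tree as a separate pass, which keeps the DSL reader independent of naming decisions.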

Assertions

Structure assertions build a comparable expected tree from our actual AST (bsharp_syntax) and compare:
- Node kinds and order
- Selected token payloads (e.g., IdentifierName.token_value)
Normalization in the assert layer adapts Roslyn’s harness (class + method) to our top-level statements when applicable.

Extending the Generator

Update generator/kind_map.rs to add or refine kind mappings.
Expand custom_asserts/structure_assert.rs to walk deeper AST areas (e.g., records, types, constraints).
Improve the DSL parser (generator/structure_dsl.rs) as new Roslyn DSL shapes appear.

Output Location

Generated files live under roslyn_testing/bsharp_compliance_testing/src/generated/.
Modules track Roslyn file groups, e.g. record_parsing.rs, using_directive_parsing_tests.rs.
