BSharp C# Parser Documentation
BSharp is a comprehensive C# parser and analysis toolkit written in Rust. It provides a complete solution for parsing C# source code into an Abstract Syntax Tree (AST), performing various code analyses, and generating insights about code quality and structure.
What is BSharp?
BSharp consists of several key components:
- Parser: A robust C# parser built using the nom parser combinator library
- AST: A complete representation of C# language constructs
- Analysis Framework: Tools for analyzing code structure, dependencies, and quality
- CLI Tools: Command-line utilities for parsing, visualization, and analysis
Key Features
- Complete C# Language Support: Supports modern C# features including:
  - Classes, structs, interfaces, records, enums
  - Methods, properties, fields, events, indexers
  - All statement types (if, for, while, switch, try-catch, etc.)
  - Expression parsing with operator precedence
  - Generic types and constraints
  - Attributes and modifiers
  - Preprocessor directives
- Robust Error Handling: Custom error types with context information for debugging parse failures
- Query API: Typed, ergonomic traversal of the AST via bsharp_analysis::framework::Query
- Code Analysis: Built-in analyzers for:
  - Control flow analysis
  - Dependency tracking
  - Code metrics (complexity, maintainability)
  - Type analysis
  - Code quality assessment
- Extensible Architecture: Modular design allowing easy extension of parsing and analysis capabilities
Architecture Overview
The codebase is organized into several main modules:
src/
├── bsharp_parser/ # Parser crate (expressions, statements, declarations, helpers)
├── bsharp_syntax/ # AST nodes and shared syntax types (re-exported by parser)
├── bsharp_analysis/ # Analysis framework and workspace loader
├── bsharp_cli/ # Command-line interface
└── bsharp_tests/ # External tests for parser/analysis/CLI
Key Components
Parser (src/bsharp_parser/, src/bsharp_syntax/)
- Modular parser using nom combinators
- Complete C# language support
- Rich error diagnostics with ErrorTree
- Keyword parsing organized by category
- AST nodes follow PascalCase naming without 'Syntax' suffix
Workspace Loading (src/bsharp_analysis/src/workspace/)
- Solution file (.sln) parsing
- Project file (.csproj) parsing with XML
- Transitive ProjectReference resolution
- Source file discovery with glob patterns
- Deterministic project ordering
Analysis Framework (src/bsharp_analysis/src/)
- Pipeline-based architecture with phases
- Extensible passes and rules system
- Metrics collection (complexity, maintainability)
- Control flow analysis
- Dependency tracking
- Code quality assessment
CLI Tools (src/bsharp_cli/)
- parse - Parse C# and print textual AST tree
- tree - Generate AST visualization (Mermaid/DOT)
- analyze - Comprehensive code analysis
- format - Format C# code using syntax emitters
Getting Started
The easiest way to get started is using the CLI tools:
# Parse a C# file and print textual AST tree
bsharp parse input.cs
# Generate AST visualization
bsharp tree input.cs --output ast.svg
# Analyze a project or solution
bsharp analyze MyProject.csproj --out report.json
# Format a file in-place (or a directory recursively)
bsharp format input.cs --write true
Formatter Quickstart
Use the built-in formatter from the CLI or integrate the Formatter directly.
- CLI usage and options: see Format Command
- Formatter design and policies: see Formatter and Emitters
Quick examples:
# Format a single file in-place
bsharp format Program.cs
# Print formatted output (do not write)
bsharp format Program.cs --write false
# Enable emission tracing to a JSONL file
bsharp format Program.cs --emit-trace --emit-trace-file format_trace.jsonl
Use Cases
BSharp is designed for:
- Static Analysis Tools: Build custom analyzers for code quality, security, or style
- Code Transformation: Parse, modify, and regenerate C# code
- Language Tooling: Create IDE extensions, linters, or formatters
- Educational Tools: Understand and visualize C# code structure
- Migration Tools: Analyze legacy code for modernization efforts
This documentation will guide you through all aspects of using and extending BSharp.
Parser Overview
The BSharp parser transforms C# source code into a structured Abstract Syntax Tree (AST). Built using the nom parser combinator library, it provides a robust and extensible foundation for parsing modern C# syntax as part of the BSharp toolkit.
Architecture
The parser follows a modular architecture with clear separation of concerns. It serves as the frontend for tools that consume the AST (analysis, visualization, etc.):
Parser Infrastructure (src/bsharp_syntax/src/)
- mod.rs: Public API and re-exports
- ast.rs: Root AST node definitions (CompilationUnit, TopLevelDeclaration)
- errors.rs: Error formatting utilities (format_error_tree)
- parser_helpers.rs: Core parsing utilities (context, bws, keyword, etc.)
- test_helpers.rs: Testing utilities (expect_ok, etc.)
- nodes/: AST node definitions organized by category
Parser Implementations (src/bsharp_parser/src/)
The parsers are organized by language construct type:
- expressions/: All expression parsing (literals, operators, method calls, etc.)
- keywords/: Keyword parsing organized by category
- types/: Type system parsing (primitives, generics, arrays, etc.)
- helpers/: Declaration helpers and utilities
- preprocessor/: Preprocessor directive parsing
AST Node Definitions (src/bsharp_syntax/src/)
Structured node definitions that mirror C# language constructs:
- declarations/: All declaration node types
- expressions/: All expression node types
- statements/: All statement node types
- types/: Type system node definitions
Parser Design Principles
1. Compositional Design
The parser is built from small, focused parser functions that combine to handle complex language constructs:
// Example: Method declaration combines multiple sub-parsers
fn parse_method_declaration(input: &str) -> BResult<&str, MethodDeclaration> {
    let (input, attributes) = parse_attributes(input.into())?;
    let (input, modifiers) = parse_modifiers(input.into())?;
    let (input, return_type) = parse_type(input.into())?;
    let (input, name) = parse_identifier(input.into())?;
    let (input, parameters) = parse_parameter_list(input.into())?;
    let (input, body) = opt(parse_block_statement)(input.into())?;
    // ... construct and return MethodDeclaration
}
2. Error Recovery and Context
Custom error types provide detailed context about parse failures:
- Location information (line, column)
- Expected vs. actual input
- Contextual error messages
- Error recovery strategies
3. Extensibility
The modular design allows easy addition of new language features:
- Add new expression types by extending the Expression enum
- Implement new statement parsers following established patterns
- Extend AST navigation traits for new analysis capabilities
Parsing Flow
1. Entry Point
Parsing begins via the public facade in src/bsharp_parser/src/facade.rs:
let parser = bsharp_parser::facade::Parser::new();
let cu = parser.parse(source)?;
2. Compilation Unit Parsing
The parser starts by parsing a CompilationUnit, which represents a complete C# source file:
- Global attributes (assembly/module level)
- Using directives
- Top-level declarations (namespaces, classes, etc.)
- File-scoped namespaces (C# 10+)
- Top-level statements (C# 9+)
3. Recursive Descent
The parser uses recursive descent to handle nested structures:
- Namespaces contain type declarations
- Types contain member declarations
- Methods contain statements
- Statements contain expressions
Key Parser Features
Expression Parsing with Precedence
The expression parser handles operator precedence correctly:
- Primary expressions (literals, identifiers, parentheses)
- Unary operators (!, -, +, ++, --, etc.)
- Binary operators with correct precedence and associativity
- Conditional expressions (ternary operator)
- Assignment expressions
Statement Parsing
Comprehensive support for all C# statement types:
- Control flow: if, switch, for, foreach, while, do-while
- Jump statements: break, continue, return, throw, goto
- Exception handling: try-catch-finally
- Resource management: using, lock
- Local declarations and assignments
Declaration Parsing
Full support for C# type and member declarations:
- Types: classes, structs, interfaces, records, enums, delegates
- Members: methods, properties, fields, events, indexers, operators
- Modifiers: access modifiers, static, abstract, virtual, override, etc.
- Generics: type parameters, constraints, variance
Modern C# Features
Support for recent C# language additions:
- Records (C# 9)
- File-scoped namespaces (C# 10)
- Top-level statements (C# 9)
- Pattern matching enhancements
- Nullable reference types
Error Handling Strategy
The parser uses a multi-layered error handling approach:
- Parse Errors: Detailed information about what went wrong during parsing
- Context Propagation: Errors include context about where in the parsing process they occurred
- Recovery Mechanisms: Ability to continue parsing after certain types of errors
- User-Friendly Messages: Clear, actionable error messages for developers
This design makes the parser robust and helpful for development and debugging. Code generation/compilation is out of scope for now; the parser and analysis crates form the current focus of the toolkit.
See Also
- Keywords and Tokens – keyword helpers, word boundaries, trivia handling
Core Parser Components
This document details the fundamental components that make up the BSharp parser infrastructure.
Public Parser API
Parser Struct
The main entry point for all parsing operations:
#[derive(Default)]
pub struct Parser;

// Signatures only; bodies omitted.
impl Parser {
    pub fn new() -> Self;
    pub fn parse(&self, input: &str) -> Result<ast::CompilationUnit, String>;
}
The Parser provides a clean, simple interface that abstracts away the complexity of the underlying parsing implementation.
Error System
ErrorTree (nom-supreme)
BSharp uses nom-supreme's ErrorTree for rich error diagnostics:
pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;
Key features:
- Context Stack: Maintains parsing contexts via .context() calls
- Position Tracking: Built-in span tracking for error locations
- Rich Diagnostics: Tree structure shows complete parse failure path
- Integration: Seamless with nom combinators
Error Helpers
Utility functions for enhanced error handling:
Location: src/bsharp_parser/src/helpers/
- context(): Adds contextual information to parser errors
- bws(): Whitespace-aware wrapper with error context
- bdelimited(): Delimited parsing with cut on closing delimiter
- cut(): Commits to parse branch, preventing misleading backtracking
- Error recovery mechanisms for common parsing scenarios
Pretty Error Formatting
Location: src/bsharp_parser/src/syntax/errors.rs
pub fn format_error_tree(input: &str, error: &ErrorTree<Span<'_>>) -> String;
Produces rustc-like error messages with:
- Line and column numbers
- Source code context
- Caret pointing to error location
- Context stack showing parse path
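To make the line, column, and caret items above concrete, here is a minimal standard-library sketch of how a byte offset can be turned into that kind of report. The names `line_col` and `render_error` are hypothetical and are not part of BSharp; the real `format_error_tree` works on an `ErrorTree` and also prints the context stack.

```rust
/// Convert a byte offset into 1-based (line, column) numbers.
/// Columns are counted in characters, not bytes.
fn line_col(src: &str, offset: usize) -> (usize, usize) {
    let before = &src[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}

/// Render a small rustc-like report: message, location, source line, caret.
fn render_error(src: &str, offset: usize, msg: &str) -> String {
    let (line, col) = line_col(src, offset);
    let text = src.lines().nth(line - 1).unwrap_or("");
    let mut out = format!("error: {msg}\n --> {line}:{col}\n  | {text}\n  | ");
    out.push_str(&" ".repeat(col - 1)); // pad so the caret sits under the column
    out.push('^');
    out
}

fn main() {
    let src = "class A {\n  int x\n}";
    // Byte 17 is the end of the second line, where a ';' is missing.
    println!("{}", render_error(src, 17, "expected ';'"));
}
```

The offset must fall on a character boundary, which holds for offsets produced by a parser that only stops between characters.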
AST Foundation
CompilationUnit
The root node of every parsed C# file:
pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,
    pub using_directives: Vec<UsingDirective>,
    pub global_using_directives: Vec<GlobalUsingDirective>,
    pub declarations: Vec<TopLevelDeclaration>,
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>,
    pub top_level_statements: Vec<Statement>,
}
Represents the complete structure of a C# source file, supporting both traditional and modern C# features.
TopLevelDeclaration
Enum representing all possible top-level declarations:
pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}
Keyword Parsing
Keyword Module Organization
Location: src/bsharp_parser/src/keywords/
Keywords are organized by category in dedicated modules for maintainability and consistency:
src/bsharp_parser/src/keywords/
├── mod.rs # Keyword infrastructure
├── access_keywords.rs # public, private, protected, internal
├── accessor_keywords.rs # get, set, init, add, remove
├── type_keywords.rs # class, struct, interface, enum, record
├── modifier_keywords.rs # static, abstract, virtual, sealed
├── flow_control_keywords.rs # if, else, switch, case, default
├── iteration_keywords.rs # for, foreach, while, do
├── expression_keywords.rs # new, this, base, typeof, sizeof
├── linq_query_keywords.rs # from, where, select, orderby
└── ...
Keyword Parsing Strategy
Word Boundary Enforcement:
pub fn keyword(kw: &'static str) -> impl Fn(&str) -> BResult<&str, &str>;
The keyword() helper enforces [A-Za-z0-9_] word boundaries to prevent partial matches:
- Correctly rejects "int" when parsing "int32"
- Ensures "class" doesn't match "classname"
- Consistent across all keyword parsers
Benefits:
- Maintainability: Easy to find and update keyword parsers
- Consistency: Uniform keyword parsing strategy
- Bug Prevention: Avoids partial match issues
- Centralization: Single source of truth for keywords
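The word-boundary rule can be illustrated without nom. The function below is a hypothetical stand-in written only against the standard library; `match_keyword` is not BSharp's `keyword()` helper, and it returns an `Option` instead of a `BResult`. It only shows the boundary check described above.

```rust
/// Match `kw` at the start of `input`, but only if the next character is
/// not in [A-Za-z0-9_]. Returns `(rest, matched)` on success, mirroring
/// the (remaining input, output) order used by nom parsers.
fn match_keyword<'a>(kw: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    let rest = input.strip_prefix(kw)?;
    // A following identifier character means we only matched the prefix
    // of a longer word (for example "class" inside "classname").
    let at_boundary = match rest.chars().next() {
        Some(c) => !(c.is_ascii_alphanumeric() || c == '_'),
        None => true, // end of input is a valid boundary
    };
    if at_boundary {
        Some((rest, &input[..kw.len()]))
    } else {
        None
    }
}

fn main() {
    println!("{:?}", match_keyword("class", "class Foo"));
    println!("{:?}", match_keyword("class", "classname"));
    println!("{:?}", match_keyword("int", "int32"));
}
```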
Parser Helpers
Context Management
Functions for maintaining parsing context:
pub fn context<I, O, F>(
    ctx: &'static str,
    parser: F
) -> impl FnMut(I) -> BResult<I, O>
Wraps parsers with contextual information that appears in error messages, making debugging much easier.
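The idea of a context wrapper can be sketched with plain closures. Everything in this example is an assumption made for illustration: `ParseError`, `PResult`, and `digit` are hypothetical, and the `context` shown here is a standard-library imitation rather than the BSharp helper, which builds on nom-supreme's `ErrorTree`.

```rust
/// Hypothetical error type: a message plus the contexts it passed through.
#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    contexts: Vec<&'static str>, // innermost context first
}

type PResult<'a, T> = Result<(&'a str, T), ParseError>;

/// Wrap `parser` so that any error it returns is tagged with `ctx`.
fn context<'a, T>(
    ctx: &'static str,
    mut parser: impl FnMut(&'a str) -> PResult<'a, T>,
) -> impl FnMut(&'a str) -> PResult<'a, T> {
    move |input| {
        parser(input).map_err(|mut e| {
            e.contexts.push(ctx);
            e
        })
    }
}

/// Tiny leaf parser: accepts a single ASCII digit.
fn digit(input: &str) -> PResult<'_, char> {
    match input.chars().next() {
        Some(c) if c.is_ascii_digit() => Ok((&input[1..], c)),
        _ => Err(ParseError { message: "expected digit".into(), contexts: vec![] }),
    }
}

fn main() {
    let mut rank = context("array rank", context("integer literal", digit));
    println!("{:?}", rank("7]"));
    println!("{:?}", rank("x]"));
}
```

Because each wrapper appends its label on the way out, a failure deep inside nested parsers surfaces with the full path that led to it.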
Parser Composition
Utilities for combining smaller parsers into larger ones:
- Sequencing parsers with error propagation
- Optional parsing with fallbacks
- Alternative parsing with preference ordering
- Repetition parsing with separators
Whitespace and Comment Handling
Consistent handling of whitespace and comments throughout the parser:
- Automatic whitespace skipping between tokens
- Comment preservation for documentation tools
- Preprocessor directive handling
Node Structure Standards
Common Traits
All AST nodes implement standard traits:
- Debug: For debugging and logging
- PartialEq: For testing and comparison
- Clone: For AST manipulation
- Serialize/Deserialize: For JSON export/import
Node Organization
AST nodes are organized hierarchically:
nodes/
├── declarations/ # Type and member declarations
├── expressions/ # All expression types
├── statements/ # All statement types
├── types/ # Type system representations
└── ... # Other language constructs
Identifier Handling
Consistent identifier representation throughout the AST:
pub struct Identifier {
    pub name: String,
    // Additional metadata like source location
}
Type System Integration
Type Representation
The parser builds a complete representation of C# types:
- Primitive types (int, string, bool, etc.)
- Reference types (classes, interfaces)
- Value types (structs, enums)
- Generic types with constraints
- Array and pointer types
- Nullable types
Generic Support
Full support for C# generics:
- Type parameters with constraints
- Variance annotations (in, out)
- Generic method declarations
- Complex constraint combinations
Memory Management
Zero-Copy Parsing
Where possible, the parser avoids unnecessary string allocations:
- String slices reference original input
- Minimal cloning during parsing
- Efficient error reporting without excessive allocation
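As an illustration of the first point, a parser function can hand back slices of the original input rather than fresh `String`s. The function `take_identifier` is hypothetical and uses a simplified identifier rule; it is only meant to show that the returned name borrows from the source text.

```rust
/// Split a leading identifier off `input` without allocating.
/// Returns `(rest, identifier)`; both are slices of `input`.
fn take_identifier(input: &str) -> Option<(&str, &str)> {
    let first = input.chars().next()?;
    if !(first.is_alphabetic() || first == '_') {
        return None;
    }
    // The identifier ends at the first character that cannot continue it.
    let end = input
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    Some((&input[end..], &input[..end]))
}

fn main() {
    let src = String::from("count = 1");
    let (rest, name) = take_identifier(&src).expect("starts with an identifier");
    // `name` points into `src`; no bytes were copied.
    println!("name={name:?} rest={rest:?} same_buffer={}", name.as_ptr() == src.as_ptr());
}
```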
AST Ownership
Clear ownership semantics for AST nodes:
- Parent nodes own their children
- Shared references through navigation traits
- No circular references in the AST structure
This foundation provides a robust base for parsing complex C# code while maintaining performance and usability.
AST Structure
The BSharp AST (Abstract Syntax Tree) provides a complete, structured representation of C# source code. This document explains the organization and relationships between different AST node types.
AST Hierarchy
Root Node: CompilationUnit
Every parsed C# file results in a CompilationUnit, which serves as the root of the AST:
pub struct CompilationUnit {
    pub global_attributes: Vec<GlobalAttribute>,                       // [assembly: ...] attributes
    pub using_directives: Vec<UsingDirective>,                         // using statements
    pub global_using_directives: Vec<GlobalUsingDirective>,            // C# 10+ global using
    pub declarations: Vec<TopLevelDeclaration>,                        // namespaces, types
    pub file_scoped_namespace: Option<FileScopedNamespaceDeclaration>, // C# 10+
    pub top_level_statements: Vec<Statement>,                          // C# 9+ top-level code
}
This structure supports both traditional C# files and modern features like file-scoped namespaces and top-level statements.
Declaration Hierarchy
Top-Level Declarations
Top-level declarations represent constructs that can appear at the file or namespace level:
pub enum TopLevelDeclaration {
    Namespace(NamespaceDeclaration),
    FileScopedNamespace(FileScopedNamespaceDeclaration),
    Class(ClassDeclaration),
    Struct(StructDeclaration),
    Record(RecordDeclaration),
    Interface(InterfaceDeclaration),
    Enum(EnumDeclaration),
    Delegate(DelegateDeclaration),
    GlobalAttribute(GlobalAttribute),
}
Type Declarations
Each type declaration contains comprehensive information about the type:
ClassDeclaration
pub struct ClassDeclaration {
    pub attributes: Vec<AttributeList>,
    pub modifiers: Vec<Modifier>,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub primary_constructor_parameters: Option<Vec<Parameter>>, // C# 12
    pub base_types: Vec<Type>,
    pub body_declarations: Vec<ClassBodyDeclaration>,
    pub documentation: Option<XmlDocumentationComment>,
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}
MethodDeclaration
pub struct MethodDeclaration {
    pub modifiers: Vec<Modifier>,
    pub return_type: Type,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub parameters: Vec<Parameter>,
    pub body: Option<Statement>, // None for abstract/interface methods
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}
Member Declarations
Class body declarations represent all possible class members:
pub enum ClassBodyDeclaration {
    Method(MethodDeclaration),
    Constructor(ConstructorDeclaration),
    Destructor(DestructorDeclaration),
    Property(PropertyDeclaration),
    Field(FieldDeclaration),
    Event(EventDeclaration),
    Indexer(IndexerDeclaration),
    Operator(OperatorDeclaration),
    NestedClass(ClassDeclaration),
    NestedStruct(StructDeclaration),
    NestedInterface(InterfaceDeclaration),
    NestedEnum(EnumDeclaration),
    NestedDelegate(DelegateDeclaration),
}
Expression Hierarchy
Expression Types
The expression system covers all C# expression types with proper precedence:
pub enum Expression {
    // Primary and names
    Literal(Literal),
    Variable(Identifier),

    // Object and member operations
    New(Box<NewExpression>),
    MemberAccess(Box<MemberAccessExpression>),
    Invocation(Box<InvocationExpression>),
    Indexing(Box<IndexingExpression>),
    Index(Box<IndexExpression>),
    Range(Box<RangeExpression>),

    // Lambda and anonymous methods
    Lambda(Box<LambdaExpression>),
    AnonymousMethod(Box<AnonymousMethodExpression>),

    // Keywords
    This,
    Base,

    // Operators
    Unary { op: UnaryOperator, expr: Box<Expression> },
    Binary { left: Box<Expression>, op: BinaryOperator, right: Box<Expression> },
    PostfixUnary { op: UnaryOperator, expr: Box<Expression> },
    Assignment(Box<AssignmentExpression>),

    // Patterns and type ops
    Pattern(Box<Pattern>),
    IsPattern { expression: Box<Expression>, pattern: Box<Pattern> },
    As { expression: Box<Expression>, target_type: Type },
    Cast { expression: Box<Expression>, target_type: Type },

    // Misc language features
    Conditional(Box<ConditionalExpression>),
    Query(Box<QueryExpression>),
    Await(Box<AwaitExpression>),
    Throw(Box<ThrowExpression>),
    Nameof(Box<NameofExpression>),
    Typeof(Box<TypeofExpression>),
    Sizeof(Box<SizeofExpression>),
    Default(Box<DefaultExpression>),
    StackAlloc(Box<StackAllocExpression>),
    Ref(Box<Expression>),
    Checked(Box<CheckedExpression>),
    Unchecked(Box<UncheckedExpression>),

    // With/collection expressions
    With { target: Box<Expression>, initializers: Vec<WithInitializerEntry> },
    Collection(Vec<CollectionElement>),

    // Composite forms
    AnonymousObject(AnonymousObjectCreationExpression),
    Tuple(TupleExpression),
    SwitchExpression(Box<SwitchExpression>),
}
Key helper structs:
pub struct SwitchExpression {
    pub expression: Expression,
    pub arms: Vec<SwitchExpressionArm>,
}

pub enum WithInitializerEntry {
    Property { name: String, value: Expression },
    Indexer { indices: Vec<Expression>, value: Expression },
}

pub enum CollectionElement {
    Expr(Expression),
    Spread(Expression),
}
Literal Types
Comprehensive support for C# literals:
pub enum Literal {
    Boolean(bool),
    Integer(String),       // Preserves original format
    FloatingPoint(String), // Preserves original format
    Character(char),
    String(String),
    InterpolatedString(InterpolatedStringLiteral),
    Null,
    Default,
}
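Because Integer keeps the literal text as written, a consumer that needs the numeric value has to interpret that text itself. The helper below is a hypothetical sketch of such a consumer, not a BSharp API. It assumes the common C# forms (decimal, `0x` hex, `0b` binary, `_` digit separators, and `U`/`L` suffixes) and ignores which integer type the suffix selects.

```rust
/// Interpret the preserved text of a C# integer literal as a u64.
/// Returns None for text that is not a valid literal or does not fit.
fn integer_literal_value(text: &str) -> Option<u64> {
    // Drop type suffixes such as U, L, UL, or lu.
    let body = text.trim_end_matches(|c: char| matches!(c, 'u' | 'U' | 'l' | 'L'));
    // Remove digit separators and normalize the case of prefixes and hex digits.
    let digits: String = body
        .chars()
        .filter(|&c| c != '_')
        .collect::<String>()
        .to_ascii_lowercase();
    if let Some(hex) = digits.strip_prefix("0x") {
        u64::from_str_radix(hex, 16).ok()
    } else if let Some(bin) = digits.strip_prefix("0b") {
        u64::from_str_radix(bin, 2).ok()
    } else {
        digits.parse().ok()
    }
}

fn main() {
    for text in ["42", "0x1A", "1_000_000", "0b1010", "42UL"] {
        println!("{text} -> {:?}", integer_literal_value(text));
    }
}
```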
Statement Hierarchy
Statement Types
Complete coverage of C# statement types:
pub enum Statement {
    // Control flow
    If(IfStatement),
    Switch(SwitchStatement),
    For(ForStatement),
    ForEach(ForEachStatement),
    While(WhileStatement),
    DoWhile(DoWhileStatement),

    // Jump statements
    Break(BreakStatement),
    Continue(ContinueStatement),
    Return(ReturnStatement),
    Throw(ThrowStatement),
    Goto(GotoStatement),

    // Exception handling
    Try(TryStatement),

    // Resource management
    Using(UsingStatement),
    Lock(LockStatement),

    // Declarations and expressions
    LocalVariableDeclaration(LocalVariableDeclaration),
    ExpressionStatement(Expression),
    Block(Vec<Statement>),
    Empty,

    // Modern features
    LocalFunction(LocalFunctionStatement),
}
Control Flow Statements
Complex control flow statements contain nested structures:
IfStatement
pub struct IfStatement {
    pub condition: Expression,
    pub consequence: Box<Statement>,
    pub alternative: Option<Box<Statement>>,
}
TryStatement
pub struct TryStatement {
    pub body: Box<Statement>,
    pub catch_clauses: Vec<CatchClause>,
    pub finally_clause: Option<FinallyClause>,
}
Type System
Type Representation
The type system models all C# type constructs:
pub enum Type {
    // Primitive types
    Primitive(PrimitiveType),
    // Named types
    Named { name: Identifier, type_arguments: Vec<Type> },
    // Array types
    Array { element_type: Box<Type>, rank: usize },
    // Pointer types
    Pointer(Box<Type>),
    // Nullable types
    Nullable(Box<Type>),
    // Generic type parameters
    TypeParameter(Identifier),
    // Tuple types
    Tuple(Vec<Type>),
}
Generic Support
Full support for C# generics:
TypeParameter
pub struct TypeParameter {
    pub attributes: Vec<Attribute>,
    pub variance: Option<Variance>, // in, out
    pub identifier: Identifier,
}
TypeParameterConstraint
pub enum TypeParameterConstraint {
    TypeConstraint { parameter: Identifier, constraint_type: Type },
    ConstructorConstraint(Identifier), // new()
    ClassConstraint(Identifier),       // class
    StructConstraint(Identifier),      // struct
    UnmanagedConstraint(Identifier),   // unmanaged
}
AST Metadata
Attributes
Comprehensive attribute support:
pub struct Attribute {
    pub name: Identifier,
    pub arguments: Vec<AttributeArgument>,
}

pub enum AttributeArgument {
    Positional(Expression),
    Named { name: Identifier, value: Expression },
}
Modifiers
All C# modifiers are represented:
pub enum Modifier {
    // Access modifiers
    Public,
    Private,
    Protected,
    Internal,
    ProtectedInternal,
    PrivateProtected,

    // Other modifiers
    Static,
    Abstract,
    Virtual,
    Override,
    Sealed,
    New,
    Async,
    Unsafe,
    Volatile,
    Readonly,
    Const,
    Partial,
    Extern,
}
Navigation and Relationships
The AST maintains clear parent-child relationships while providing navigation capabilities through traits:
- Ownership: Parent nodes own their children
- Navigation: Traits provide methods to traverse and search the AST
- Context: Nodes can access their containing context when needed
This structure provides a complete, navigable representation of C# code that supports both analysis and transformation scenarios.
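Since parents own their children, a traversal is a plain recursive function over borrowed nodes. The sketch below uses a deliberately reduced `Statement` enum as a stand-in for the real one, and `count_where` is a hypothetical helper, not one of BSharp's navigation traits or its Query API.

```rust
/// Reduced stand-in for the statement AST: enough variants to show nesting.
#[derive(Debug, Clone, PartialEq)]
enum Statement {
    If { consequence: Box<Statement>, alternative: Option<Box<Statement>> },
    While { body: Box<Statement> },
    Block(Vec<Statement>),
    Return,
    Empty,
}

/// Count every statement in the subtree (including `stmt`) matching `pred`.
fn count_where(stmt: &Statement, pred: &dyn Fn(&Statement) -> bool) -> usize {
    let here = usize::from(pred(stmt));
    let in_children: usize = match stmt {
        Statement::If { consequence, alternative } => {
            count_where(consequence, pred)
                + alternative.as_deref().map_or(0, |alt| count_where(alt, pred))
        }
        Statement::While { body } => count_where(body, pred),
        Statement::Block(stmts) => stmts.iter().map(|s| count_where(s, pred)).sum(),
        Statement::Return | Statement::Empty => 0,
    };
    here + in_children
}

fn main() {
    let body = Statement::Block(vec![
        Statement::If {
            consequence: Box::new(Statement::Block(vec![Statement::Return])),
            alternative: Some(Box::new(Statement::Return)),
        },
        Statement::While { body: Box::new(Statement::Block(vec![Statement::Empty])) },
        Statement::Return,
    ]);
    let returns = count_where(&body, &|s: &Statement| matches!(s, Statement::Return));
    println!("return statements: {returns}");
}
```

Borrowing with `&Statement` leaves ownership with the parent, so the same tree can be walked by several analyses in turn.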
Error Handling
BSharp implements a comprehensive error handling system that provides detailed context information for debugging parse failures.
Error Types
The parser uses ErrorTree from nom-supreme for structured error information:
pub type BResult<I, O> = nom::IResult<I, O, ErrorTree<I>>;
ErrorTree Structure
The ErrorTree type provides:
- Context Stack: Hierarchical parsing context via .context() calls
- Location: Span tracking for error positions
- Error Tree: Complete parse failure path
- Rich Diagnostics: Detailed error information for debugging
Error Recovery
The parser implements several error recovery strategies:
1. Malformed Syntax Recovery
When encountering malformed syntax, the parser attempts to skip to recovery points:
- Semicolons (;)
- Closing braces (})
- End of input
1.a Declaration Error Recovery (Type Member Top-Level)
For type declarations (classes, structs, records, interfaces), malformed members are recovered using a lightweight, scope-aware helper:
- Helper: skip_to_member_boundary_top_level()
- Location: src/bsharp_parser/src/expressions/declarations/type_declaration_helpers.rs
Contract:
- Only use from within a type body when a member parser fails.
- Stops at the next safe boundary at top level of the current type:
  - Consumes a top-level ; and returns the slice after it.
  - Or stops at a top-level } without consuming it (so the caller can close the current body cleanly).
  - Returns an empty slice at EOF.
- Depth-tracks (), [], {}, and a heuristic <> to avoid stopping inside expressions, attribute arguments, or generic argument lists.
- Ignores control characters inside strings, chars, and comments.
Limitations:
- Angle-bracket tracking is heuristic and does not fully disambiguate generics from shift operators.
- Verbatim/interpolated strings are not fully lexed here; this helper is intended for robust, not perfect, recovery.
Usage example (simplified):
match member_parser(cur) {
    Ok((rest, member)) => {
        members.push(member);
        cur = rest;
    }
    Err(_) => {
        let next = skip_to_member_boundary_top_level(cur);
        if next.is_empty() || next == cur {
            break;
        }
        cur = next;
    }
}
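For readers who want to see the contract in action, here is a rough standard-library approximation of such a helper. It is an assumption-laden sketch, not the code in type_declaration_helpers.rs: the name `skip_to_member_boundary` is hypothetical, it leaves out the `<>` heuristic entirely, and it only understands ordinary string and char literals plus `//` and `/* */` comments.

```rust
/// Skip to the next member boundary at depth 0: just past a `;`, or at a `}`
/// (left unconsumed). Returns an empty slice if neither is found.
fn skip_to_member_boundary(input: &str) -> &str {
    let bytes = input.as_bytes();
    let mut depth: usize = 0;
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // String or char literal: jump to the closing quote, honoring escapes.
            quote @ (b'"' | b'\'') => {
                i += 1;
                while i < bytes.len() && bytes[i] != quote {
                    if bytes[i] == b'\\' {
                        i += 1; // skip the escaped character
                    }
                    i += 1;
                }
            }
            // Line comment: skip to the end of the line.
            b'/' if bytes.get(i + 1) == Some(&b'/') => {
                while i < bytes.len() && bytes[i] != b'\n' {
                    i += 1;
                }
                continue;
            }
            // Block comment: skip to the closing "*/".
            b'/' if bytes.get(i + 1) == Some(&b'*') => {
                i += 2;
                while i + 1 < bytes.len() && !(bytes[i] == b'*' && bytes[i + 1] == b'/') {
                    i += 1;
                }
                i += 1; // now on '/', the shared `i += 1` below steps past it
            }
            b'(' | b'[' | b'{' => depth += 1,
            b')' | b']' => depth = depth.saturating_sub(1),
            b'}' if depth == 0 => return &input[i..],
            b'}' => depth -= 1,
            b';' if depth == 0 => return &input[i + 1..],
            _ => {}
        }
        i += 1;
    }
    &input[input.len()..]
}

fn main() {
    println!("{:?}", skip_to_member_boundary("int x = f(a; b); int y; }"));
    println!("{:?}", skip_to_member_boundary("garbage } class B {}"));
}
```

Working on bytes is safe here because every character the loop reacts to is ASCII, so the returned slices always start on a character boundary.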
1.b Namespace Body: Using-Directives Before Members
Inside a block-scoped namespace body, using directives are accepted before type and nested-namespace members.
- Implementation: parse_namespace_declaration() scans for using immediately after the opening { and collects all consecutive directives before parsing members.
- This ensures inputs like the following are parsed deterministically without interleaving usings with members:
namespace Outer {
using System;
namespace Inner {
using System.Collections;
class MyClass {}
}
}
Contract and limitations:
- Only leading using directives at the current namespace body level are collected.
- Interleaving using directives among members is not supported yet (matches common style and avoids ambiguous recovery).
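The "leading directives only" rule can be mimicked on raw text. The function below is a hypothetical simplification written for illustration; `split_leading_usings` is not how parse_namespace_declaration() is implemented, and it treats everything between `using` and the next `;` as the directive's target without validating it.

```rust
/// Collect consecutive `using ...;` directives at the start of a namespace
/// body and return them together with the remaining, unparsed text.
fn split_leading_usings(body: &str) -> (Vec<&str>, &str) {
    let mut usings = Vec::new();
    let mut rest = body.trim_start();
    while let Some(after_kw) = rest.strip_prefix("using") {
        // Require whitespace after the keyword so `usingFoo` is not misread.
        if !after_kw.starts_with(char::is_whitespace) {
            break;
        }
        let Some(end) = after_kw.find(';') else { break };
        usings.push(after_kw[..end].trim());
        rest = after_kw[end + 1..].trim_start();
    }
    (usings, rest)
}

fn main() {
    let body = "\n  using System;\n  using System.Collections;\n  class MyClass {}\n  using Late;\n";
    let (usings, rest) = split_leading_usings(body);
    println!("usings: {usings:?}");
    println!("members start at: {:?}", rest.lines().next());
}
```

The trailing `using Late;` is left in the remainder, which matches the stated limitation that interleaved directives are not collected.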
1.c File-Scoped Namespace
When parsing a file-scoped namespace, the parser also skips preprocessor directives following the namespace line before parsing members, mirroring the block-scoped behavior.
Preprocessor Directives and Trivia
Preprocessor directives (e.g., #pragma, #line) are treated as structured trivia, not AST declarations:
- Parser entrypoints (e.g., parse_csharp_source()) skip directive lines anywhere they can appear at the compilation-unit level.
- parse_preprocessor_directive() consumes the entire directive line including an optional trailing newline.
- Current status: directives inside type and namespace bodies are planned to be skipped similarly; tests are tracked and temporarily ignored until this is integrated.
Example:
#pragma warning disable CS0168
namespace N {
// class and members...
}
The directive is skipped and not present as a namespace member.
2. Context-Aware Errors
Errors include contextual information about the parsing context:
context("method declaration", parse_method_body)(input.into())
This provides clear error messages like "expected method body in method declaration context".
Helper Location: src/bsharp_parser/src/helpers/
3. Graceful Degradation
The parser continues parsing even after encountering errors, collecting multiple errors to provide comprehensive feedback.
Error Reporting
Errors are reported with:
- Line and column numbers
- Surrounding context
- Suggestions for fixes
- Parser state information
Common Error Scenarios
Syntax Errors
- Missing semicolons
- Unmatched braces
- Invalid identifiers
Type Errors
- Unknown type references
- Generic constraint violations
- Invalid type parameter usage
Declaration Errors
- Conflicting modifiers
- Missing required elements
- Invalid access levels
Debugging Tips
- Use verbose error output to get detailed parser state
- Check recovery points when errors cascade
- Validate input syntax with simpler test cases first
- Use parser context to understand where parsing failed
Wrapper Expression Variants
For clarity, several operations are modeled as distinct expression variants in the AST:
- New(NewExpression) for object creation
- MemberAccess(MemberAccessExpression) for obj.Member
- Invocation(InvocationExpression) for calls expr(args)
- Indexing(IndexingExpression) and Index(IndexExpression)
- Range(RangeExpression) for start..end
- With { target, initializers } for record-like with-expressions
- Collection(Vec<CollectionElement>) for collection expressions
Expression Parsing
BSharp implements a complete expression parser that handles all C# expression types with proper operator precedence and associativity.
Expression Hierarchy
The expression parser follows C#'s operator precedence rules, listed here from highest to lowest precedence:
- Primary Expressions (x, x.y, x[y], x(), etc.)
- Unary Expressions (+x, -x, !x, ~x, ++x, --x)
- Multiplicative (*, /, %)
- Additive (+, -)
- Shift (<<, >>)
- Relational (<, >, <=, >=, is, as)
- Equality (==, !=)
- Logical AND (&)
- Logical XOR (^)
- Logical OR (|)
- Conditional AND (&&)
- Conditional OR (||)
- Null Coalescing (??)
- Conditional (?:)
- Assignment (=, +=, -=, etc.)
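One common way to turn a table like this into code is precedence climbing. The sketch below is a generic, standard-library illustration over integers and a handful of the left-associative binary levels (multiplicative, additive, shift, and the bitwise operators). It is not BSharp's nom-based expression parser, all names in it are hypothetical, and it evaluates directly instead of building an AST.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i64),
    Op(&'static str, u8), // operator text and its precedence level
}

// Higher level binds tighter; two-character operators are listed first.
const OPS: &[(&str, u8)] = &[
    ("<<", 4), (">>", 4),
    ("*", 6), ("/", 6), ("%", 6),
    ("+", 5), ("-", 5),
    ("&", 3), ("^", 2), ("|", 1),
];

fn tokenize(src: &str) -> Result<Vec<Tok>, String> {
    let bytes = src.as_bytes();
    let (mut out, mut i) = (Vec::new(), 0);
    while i < bytes.len() {
        if bytes[i].is_ascii_whitespace() {
            i += 1;
        } else if bytes[i].is_ascii_digit() {
            let start = i;
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            let n: i64 = src[start..i].parse().map_err(|_| "integer out of range".to_string())?;
            out.push(Tok::Num(n));
        } else if let Some(&(op, prec)) = OPS.iter().find(|(op, _)| src[i..].starts_with(*op)) {
            out.push(Tok::Op(op, prec));
            i += op.len();
        } else {
            return Err(format!("unexpected input at byte {i}"));
        }
    }
    Ok(out)
}

struct Climber { toks: Vec<Tok>, pos: usize }

impl Climber {
    fn operand(&mut self) -> Result<i64, String> {
        match self.toks.get(self.pos).copied() {
            Some(Tok::Num(n)) => { self.pos += 1; Ok(n) }
            other => Err(format!("expected operand, found {other:?}")),
        }
    }

    /// Parse operators whose level is at least `min_prec`.
    fn expr(&mut self, min_prec: u8) -> Result<i64, String> {
        let mut lhs = self.operand()?;
        while let Some(Tok::Op(op, prec)) = self.toks.get(self.pos).copied() {
            if prec < min_prec {
                break;
            }
            self.pos += 1;
            // `prec + 1` makes the operator left-associative.
            let rhs = self.expr(prec + 1)?;
            lhs = apply(op, lhs, rhs)?;
        }
        Ok(lhs)
    }
}

// Plain arithmetic for brevity; overflow handling is out of scope here.
fn apply(op: &str, l: i64, r: i64) -> Result<i64, String> {
    Ok(match op {
        "/" | "%" if r == 0 => return Err("division by zero".into()),
        "*" => l * r, "/" => l / r, "%" => l % r,
        "+" => l + r, "-" => l - r,
        "<<" => l << r, ">>" => l >> r,
        "&" => l & r, "^" => l ^ r, "|" => l | r,
        _ => return Err(format!("unknown operator {op}")),
    })
}

fn eval(src: &str) -> Result<i64, String> {
    let mut c = Climber { toks: tokenize(src)?, pos: 0 };
    let value = c.expr(0)?;
    if c.pos != c.toks.len() {
        return Err("trailing input".into());
    }
    Ok(value)
}

fn main() {
    for src in ["1 + 2 * 3", "1 << 2 + 1", "6 & 3 | 8 ^ 1", "10 - 4 - 3"] {
        println!("{src} = {:?}", eval(src));
    }
}
```

Right-associative levels such as assignment or `??` would recurse with `prec` instead of `prec + 1`.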
Expression Types
Primary Expressions
Literals
- Numeric: 42, 3.14, 0x1A
- String: "hello", @"verbatim", $"interpolated {value}"
- Character: 'a', '\n'
- Boolean: true, false
- Null: null
Identifiers and Member Access
variable // Simple identifier
obj.property // Member access
obj.method() // Method invocation
obj[index] // Indexer access
Note: In the AST, simple identifiers are represented by the Expression::Variable(Identifier) variant. Member access, invocation, and indexing are represented by dedicated wrapper variants (MemberAccess, Invocation, Indexing).
Object Creation
new MyClass() // Constructor
new MyClass { Prop = value } // Object initializer
new[] { 1, 2, 3 } // Array initializer
new { Name = "John", Age = 30 } // Anonymous object
Lambda Expressions
The parser supports various lambda syntax forms:
x => x * 2 // Single parameter
(x, y) => x + y // Multiple parameters
() => DoSomething() // No parameters
(int x, string y) => Process(x, y) // Typed parameters
x => { return x * 2; } // Block body
async x => await ProcessAsync(x) // Async lambda
Query Expressions (LINQ)
Complete LINQ query syntax support:
from item in collection
where item.IsValid
orderby item.Name
select item.Value
Supported clauses:
- from - Data source
- where - Filtering
- select - Projection
- orderby - Sorting
- group by - Grouping
- join - Joining
- let - Variable introduction
- into - Query continuation
Pattern Expressions
Modern C# pattern matching:
obj is int value // Type pattern
obj is not null // Negation pattern
obj is > 0 and < 100 // Relational patterns
obj is var x // Var pattern
Switch Expressions
value switch
{
1 => "one",
2 => "two",
_ => "other"
}
Operator Precedence Implementation
The expression entrypoint is spanned-first. Callers can unwrap the Spanned<Expression> when they do not need spans:
use bsharp_parser::parser::expressions::primary_expression_parser::parse_expression_spanned;
use bsharp_syntax::span::Span;

let result = parse_expression_spanned(Span::new(input))
    .map(|(rest, s)| (rest, s.node));
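The "spanned-first" pattern can be sketched without the BSharp crates. Everything below is hypothetical: the `Spanned` struct is a stand-in with assumed `start`/`end` byte offsets, and `parse_identifier_spanned` is a toy parser, not the library's API. The point is only that the span-free variant is a thin `map` over the spanned one.

```rust
/// Hypothetical stand-in for a node paired with its source span.
#[derive(Debug, Clone, PartialEq)]
struct Spanned<T> {
    node: T,
    start: usize, // byte offset of the first character
    end: usize,   // byte offset one past the last character
}

/// Toy spanned parser: skips leading whitespace, then reads one identifier.
fn parse_identifier_spanned(input: &str) -> Result<(&str, Spanned<String>), String> {
    let trimmed = input.trim_start();
    let start = input.len() - trimmed.len();
    let starts_ok = trimmed
        .chars()
        .next()
        .map_or(false, |c| c.is_alphabetic() || c == '_');
    if !starts_ok {
        return Err(format!("expected identifier at byte {}", start));
    }
    let len = trimmed
        .find(|c: char| !(c.is_alphanumeric() || c == '_'))
        .unwrap_or(trimmed.len());
    let spanned = Spanned {
        node: trimmed[..len].to_string(),
        start,
        end: start + len,
    };
    Ok((&trimmed[len..], spanned))
}

/// Span-free wrapper: same parse, with the span discarded.
fn parse_identifier(input: &str) -> Result<(&str, String), String> {
    parse_identifier_spanned(input).map(|(rest, s)| (rest, s.node))
}
```

Keeping the spanned function as the single source of truth means diagnostics can always recover positions, while callers that only need the node pay no extra complexity.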
Error Handling in Expressions
The expression parser provides detailed error messages:
- Operator precedence conflicts
- Missing operands
- Invalid syntax combinations
- Type compatibility issues
Advanced Features
Null-Conditional Operators
obj?.Property // Null-conditional member access
obj?[index] // Null-conditional element access
obj?.Method() // Null-conditional invocation
Throw Expressions
value ?? throw new ArgumentNullException()
Range and Index Expressions
array[^1] // Index from end
array[1..5] // Range
array[..^1] // Range to index from end
With Expressions (Records)
person with { Name = "Updated" }
The expression parser is designed to be extensible, allowing for easy addition of new expression types as the C# language evolves.
See Also
- Keywords and Tokens – keyword helpers, word boundaries, trivia handling for tokens used in expressions
Statement Parsing
BSharp provides comprehensive parsing for all C# statement types, from simple expressions to complex control flow constructs.
Statement Categories
1. Declaration Statements
Local Variable Declarations
int x = 5;
var name = "John";
const double PI = 3.14159;
Local Function Declarations
void LocalFunction(int parameter)
{
// function body
}
T GenericLocalFunction<T>(T value) where T : class
{
return value;
}
2. Expression Statements
Any expression followed by a semicolon:
x++; // Increment
Method(); // Method call
obj.Property = value; // Assignment
3. Control Flow Statements
Conditional Statements
If Statements
if (condition)
statement;
if (condition)
{
// block
}
else if (otherCondition)
{
// else if block
}
else
{
// else block
}
Switch Statements
switch (expression)
{
case constant1:
statements;
break;
case constant2 when condition:
statements;
goto case constant1;
default:
statements;
break;
}
Loop Statements
For Loops
for (int i = 0; i < 10; i++)
{
// loop body
}
for (;;) // infinite loop
{
// body
}
Foreach Loops
foreach (var item in collection)
{
// process item
}
foreach ((string key, int value) in dictionary)
{
// deconstruction in foreach
}
While Loops
while (condition)
{
// loop body
}
Do-While Loops
do
{
// loop body
} while (condition);
Jump Statements
break; // Break from loop/switch
continue; // Continue to next iteration
return; // Return from method
return value; // Return with value
goto label; // Jump to label
goto case 5; // Jump to switch case
goto default; // Jump to switch default
4. Exception Handling
Try-Catch-Finally
try
{
// risky code
}
catch (SpecificException ex) when (ex.Code == 123)
{
// specific exception handling
}
catch (Exception ex)
{
// general exception handling
}
finally
{
// cleanup code
}
Throw Statements
throw; // Rethrow current exception
throw new InvalidOperationException();
throw new CustomException("message");
5. Resource Management
Using Statements
using (var resource = new DisposableResource())
{
// use resource
}
using var resource = new DisposableResource();
// resource disposed at end of scope
Lock Statements
lock (syncObject)
{
// synchronized code
}
Fixed Statements
unsafe
{
fixed (byte* ptr = array)
{
// work with fixed pointer
}
}
6. Special Statements
Yield Statements
yield return value; // Return value in iterator
yield break; // End iterator
Checked/Unchecked Statements
checked
{
// arithmetic overflow checking enabled
}
unchecked
{
// arithmetic overflow checking disabled
}
Unsafe Statements
unsafe
{
// unsafe code block
}
Statement Parsing Implementation
Use the spanned entrypoint and unwrap when spans are not needed:
use bsharp_parser::parser::statement_parser::parse_statement_ws_spanned;
use bsharp_syntax::span::Span;

let result = parse_statement_ws_spanned(Span::new(input))
    .map(|(rest, s)| (rest, s.node));
Block Statements
Block statements group multiple statements:
{
int x = 5;
Console.WriteLine(x);
if (x > 0)
{
Console.WriteLine("Positive");
}
}
Error Recovery
The statement parser implements robust error recovery:
- Statement-level recovery: Skip to next statement boundary (semicolon or brace)
- Block-level recovery: Skip to matching brace
- Context preservation: Maintain parsing context across errors
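Statement-level recovery can be pictured as a scan for the next synchronization point. The function below is a minimal sketch under stated assumptions: the name `skip_to_statement_boundary` is hypothetical, `from` must be a valid character boundary within `source`, and string literals and comments are not handled (a real implementation would need to skip over them).

```rust
/// Returns the byte offset at which parsing could resume after a failure
/// that started at `from`.
fn skip_to_statement_boundary(source: &str, from: usize) -> usize {
    for (offset, ch) in source[from..].char_indices() {
        match ch {
            // A semicolon ends the broken statement: resume just after it.
            ';' => return from + offset + 1,
            // A brace starts or ends a block: resume at the brace itself so
            // the caller can handle the block structure.
            '{' | '}' => return from + offset,
            _ => {}
        }
    }
    // No boundary found: resume at end of input.
    source.len()
}
```

For the input `int x = ; y = 2;` with a failure at offset 0, the function returns 9, the offset just past the first semicolon, so the second statement can still be parsed.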
Statement Attributes
C# 9+ allows attributes on local function declarations, which appear in statement position:
[Obsolete("Use NewMethod instead")]
void OldMethod() { }
[ConditionalAttribute("DEBUG")]
static void DebugMethod() { }
Top-Level Statements
Support for C# 9+ top-level statements:
// Program.cs
using System;
Console.WriteLine("Hello World!");
return 0;
The statement parser is designed to handle the full complexity of C# control flow while providing clear error messages and robust error recovery.
Declaration Parsing
BSharp implements comprehensive parsing for all C# declaration types, from simple variables to complex generic types with constraints.
Declaration Categories
1. Namespace Declarations
Traditional Namespace
namespace MyCompany.MyProject
{
// namespace members
}
File-Scoped Namespace (C# 10+)
namespace MyCompany.MyProject;
// All following declarations belong to this namespace
Nested Namespaces
namespace Outer
{
namespace Inner
{
// nested namespace content
}
}
2. Type Declarations
Class Declarations
public class MyClass : BaseClass, IInterface1, IInterface2
{
// class members
}
public abstract class AbstractClass
{
public abstract void AbstractMethod();
}
public sealed class SealedClass
{
// cannot be inherited
}
Interface Declarations
public interface IMyInterface : IBaseInterface
{
void Method();
int Property { get; set; }
event Action SomeEvent;
}
public interface IGeneric<T> where T : class
{
T GenericMethod<U>(U parameter) where U : struct;
}
Struct Declarations
public struct Point
{
public int X { get; set; }
public int Y { get; set; }
public Point(int x, int y)
{
X = x;
Y = y;
}
}
public readonly struct ReadOnlyPoint
{
public readonly int X;
public readonly int Y;
public ReadOnlyPoint(int x, int y)
{
X = x;
Y = y;
}
}
Record Declarations
public record Person(string FirstName, string LastName);
public record class Employee(string FirstName, string LastName, string Department)
: Person(FirstName, LastName);
public record struct Point(int X, int Y);
Enum Declarations
public enum Color
{
Red,
Green,
Blue
}
[Flags]
public enum FileAccess : byte
{
None = 0,
Read = 1,
Write = 2,
Execute = 4,
All = Read | Write | Execute
}
Delegate Declarations
public delegate void EventHandler(object sender, EventArgs e);
public delegate T GenericDelegate<T, U>(U parameter) where T : class;
3. Member Declarations
Field Declarations
private int field;
public readonly string ReadOnlyField;
public const double PI = 3.14159;
private static readonly List<string> StaticField = new();
Property Declarations
// Auto-implemented properties
public string Name { get; set; }
public int Age { get; private set; }
public bool IsValid { get; init; }
// Properties with backing fields
private string _description;
public string Description
{
get => _description;
set => _description = value?.Trim();
}
// Expression-bodied properties
public string FullName => $"{FirstName} {LastName}";
// Indexer properties
public string this[int index]
{
get => items[index];
set => items[index] = value;
}
Method Declarations
public void VoidMethod() { }
public int MethodWithReturnType() => 42;
public static T GenericMethod<T>(T parameter) where T : new() => new T();
// Async methods
public async Task<string> AsyncMethod()
{
await Task.Delay(1000);
return "result";
}
// Extension methods
public static class Extensions
{
public static bool IsEmpty(this string str) => string.IsNullOrEmpty(str);
}
Constructor Declarations
public class MyClass
{
public MyClass() { } // Default constructor
public MyClass(string name) : this() // Constructor chaining
{
Name = name;
}
static MyClass() // Static constructor
{
// Static initialization
}
}
Destructor Declarations
public class Resource
{
~Resource()
{
// Cleanup code
}
}
Note: In the AST, DestructorDeclaration.body is Option<Statement>:
// Some(Block(...)) for `{ ... }`, None for extern (i.e., `;` only)
pub struct DestructorDeclaration {
    pub name: Identifier,
    pub body: Option<Statement>,
}
Event Declarations
public event Action<string> SomethingHappened;
public event EventHandler<CustomEventArgs> CustomEvent
{
add { customEvent += value; }
remove { customEvent -= value; }
}
Operator Declarations
public static Point operator +(Point a, Point b)
{
return new Point(a.X + b.X, a.Y + b.Y);
}
public static implicit operator string(Point p)
{
return $"({p.X}, {p.Y})";
}
4. Generic Constraints
Type Parameter Constraints
public class Container<T> where T : class, IDisposable, new()
{
// T must be a reference type, implement IDisposable, and have a parameterless constructor
}
public void Method<T, U>()
where T : class
where U : struct, IComparable<U>
{
// Multiple constraint clauses
}
AST mapping for constraints:
// On type declarations (class/struct/interface/record)
pub struct ClassDeclaration {
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}

// On methods
pub struct MethodDeclaration {
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}
5. Modifiers and Attributes
Access Modifiers
- public - Accessible everywhere
- private - Accessible only within the same class
- protected - Accessible within class and derived classes
- internal - Accessible within the same assembly
- protected internal - Accessible within assembly or derived classes
- private protected - Accessible within derived classes in the same assembly
Other Modifiers
- static - Belongs to the type rather than instance
- abstract - Must be overridden in derived classes
- virtual - Can be overridden in derived classes
- override - Overrides a virtual/abstract member
- sealed - Cannot be overridden further
- readonly - Can only be assigned during initialization
- const - Compile-time constant
- async - Asynchronous method
- unsafe - Contains unsafe code
- extern - Implemented externally
Attributes
[Obsolete("Use NewMethod instead")]
public void OldMethod() { }
[DllImport("kernel32.dll")]
public static extern bool SetConsoleTitle(string title);
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class CustomAttribute : Attribute
{
public string Description { get; set; }
}
6. Using Directives
using System; // Namespace using
using System.Collections.Generic;
using static System.Math; // Static using
using Project = MyCompany.MyProject; // Alias directive
global using System.Text; // Global using (C# 10+)
Note: global using directives are stored at the compilation unit level in CompilationUnit.global_using_directives.
Declaration Parsing Implementation
The declaration parser uses a multi-stage approach:
- Modifier Parsing: Parse access modifiers and other keywords
- Declaration Type Detection: Determine what kind of declaration
- Specific Parser Dispatch: Route to specialized parser
- Member Collection: Gather all declaration components
fn parse_type_declaration(input: &str) -> BResult<&str, TypeDeclaration> {
    let (input, attributes) = many0(parse_attribute)(input.into())?;
    let (input, modifiers) = parse_modifiers(input.into())?;
    let (input, declaration) = alt((
        parse_class_declaration,
        parse_interface_declaration,
        parse_struct_declaration,
        parse_enum_declaration,
        parse_delegate_declaration,
        parse_record_declaration,
    ))(input.into())?;

    Ok((input, TypeDeclaration {
        attributes,
        modifiers,
        declaration,
    }))
}
Error Handling
The declaration parser provides comprehensive error reporting:
- Modifier conflicts: Detecting incompatible modifier combinations
- Constraint validation: Ensuring generic constraints are valid
- Accessibility consistency: Verifying access level consistency
- Syntax validation: Catching malformed declarations
Recovery for Malformed Members
When a member inside a type body fails to parse, the parser uses a scoped recovery strategy to skip to the next safe boundary without crossing the enclosing type's closing brace. See the dedicated section in Error Handling for details on skip_to_member_boundary_top_level() and its contract:
- docs: docs/parser/error-handling.md (Declaration Error Recovery subsection)
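The idea of scoped recovery can be sketched as a brace-depth scan. This is an independent illustration, not the contract of skip_to_member_boundary_top_level(): the function name below is hypothetical, `from` is assumed to be a valid character boundary inside a type body, and string literals and comments are ignored for brevity.

```rust
/// Starting inside a type body at `from`, returns the offset where member
/// parsing could resume, never moving past the enclosing type's `}`.
fn skip_to_member_boundary(source: &str, from: usize) -> usize {
    let mut depth = 0usize;
    for (offset, ch) in source[from..].char_indices() {
        let at = from + offset;
        match ch {
            '{' => depth += 1,
            // Closing brace of the enclosing type: stop in front of it.
            '}' if depth == 0 => return at,
            '}' => {
                depth -= 1;
                // A nested block just closed: the malformed member is over.
                if depth == 0 {
                    return at + 1;
                }
            }
            // A top-level semicolon also ends the malformed member.
            ';' if depth == 0 => return at + 1,
            _ => {}
        }
    }
    source.len()
}
```

Tracking depth is what keeps the skip local: braces opened inside the broken member are balanced before the scan stops, and an unmatched `}` is treated as belonging to the enclosing type and left unconsumed.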
XML Documentation
The parser handles XML documentation comments:
/// <summary>
/// Calculates the area of a rectangle.
/// </summary>
/// <param name="width">The width of the rectangle.</param>
/// <param name="height">The height of the rectangle.</param>
/// <returns>The area of the rectangle.</returns>
public double CalculateArea(double width, double height)
{
return width * height;
}
The declaration parser is designed to handle the full complexity of C# type system while maintaining performance and providing detailed error diagnostics.
Type System
BSharp implements a comprehensive type system that accurately represents all C# type constructs, from primitive types to complex generic types with constraints.
Type Categories
1. Primitive Types
Built-in Value Types
bool // Boolean type
byte // 8-bit unsigned integer
sbyte // 8-bit signed integer
short // 16-bit signed integer
ushort // 16-bit unsigned integer
int // 32-bit signed integer
uint // 32-bit unsigned integer
long // 64-bit signed integer
ulong // 64-bit unsigned integer
char // 16-bit Unicode character
float // 32-bit floating point
double // 64-bit floating point
decimal // 128-bit decimal
Special Types
object // Base type of all types
string // Immutable string type
void // Absence of type (method returns)
dynamic // Dynamic type
var // Implicitly typed variable
2. Reference Types
Class Types
MyClass // Simple class reference
System.Collections.Generic.List<int> // Generic class
Interface Types
IEnumerable<T> // Generic interface
IDisposable // Non-generic interface
Array Types
int[] // Single-dimensional array
int[,] // Multi-dimensional array
int[][] // Jagged array
int[,,] // Three-dimensional array
Delegate Types
Action // Parameterless action
Action<int> // Action with parameter
Func<int, string> // Function with return type
EventHandler<T> // Event handler
3. Nullable Types
Nullable Value Types
int? // Nullable integer
DateTime? // Nullable DateTime
bool? // Nullable boolean
Nullable Reference Types (C# 8+)
string? // Nullable string
List<int>? // Nullable list
MyClass? // Nullable custom class
4. Generic Types
Type Parameters
T // Simple type parameter
TKey, TValue // Multiple type parameters
Constructed Generic Types
List<int> // Generic list of integers
Dictionary<string, object> // Generic dictionary
Generic Constraints
T where T : class // Reference type constraint
T where T : struct // Value type constraint
T where T : new() // Constructor constraint
T where T : BaseClass // Base class constraint
T where T : IInterface // Interface constraint
T where T : class, IDisposable, new() // Multiple constraints
5. Tuple Types
Named Tuples
(int x, int y) // Named tuple elements
(string name, int age) // Different element types
Unnamed Tuples
(int, string) // Unnamed tuple elements
Nested Tuples
(int, (string, bool)) // Nested tuple structure
6. Pointer Types (Unsafe Context)
int* // Pointer to integer
char** // Pointer to pointer to char
void* // Void pointer
7. Function Pointer Types (C# 9+)
delegate*<int, string> // Function pointer
delegate* managed<int, void> // Managed function pointer
delegate* unmanaged<int, void> // Unmanaged function pointer
Type Syntax Parsing
Basic Type Parsing
The type parser handles various syntactic forms:
fn parse_type(input: &str) -> BResult<&str, Type> {
    alt((
        parse_tuple_type,
        parse_function_pointer_type,
        parse_named_type,
        parse_primitive_type,
    ))(input.into())
}
Array Type Parsing
Array types have specific syntax rules:
int[] // T[]
int[,] // T[,]
int[,,] // T[,,]
int[][] // T[][] (jagged)
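As a rough illustration of these rules, the sketch below reads the rank specifiers that follow an element type and reports the rank of each one. The function name `parse_array_ranks` is hypothetical and the code is independent of BSharp's parser; it assumes the input is only the suffix after the element type.

```rust
/// Parses a sequence of rank specifiers such as `[]`, `[,]`, or `[][]`.
/// Returns one rank per specifier, or None if the suffix is malformed.
fn parse_array_ranks(suffix: &str) -> Option<Vec<u32>> {
    let mut ranks = Vec::new();
    let mut rest = suffix.trim();
    while !rest.is_empty() {
        let after_open = rest.strip_prefix('[')?;
        let close = after_open.find(']')?;
        let inner = &after_open[..close];
        // In a type, only commas (and whitespace) may sit between brackets.
        if !inner.chars().all(|c| c == ',' || c.is_whitespace()) {
            return None;
        }
        // N commas separate N + 1 dimensions.
        let commas = inner.chars().filter(|&c| c == ',').count() as u32;
        ranks.push(commas + 1);
        rest = after_open[close + 1..].trim_start();
    }
    Some(ranks)
}
```

A single specifier with commas describes one multi-dimensional array, while repeated specifiers describe a jagged array, so `[,]` yields `[2]` and `[][]` yields `[1, 1]`.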
Generic Type Parsing
Generic types require careful parsing of type arguments:
List<int> // Simple generic
Dictionary<string, List<int>> // Nested generics
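Nested type arguments are naturally handled by recursion. The following is a minimal recursive-descent sketch under stated assumptions: `TypeName` and `parse_type_name` are hypothetical names, only dotted names with optional type argument lists are recognized, and the code works directly on characters rather than on BSharp's nom-based infrastructure.

```rust
/// Hypothetical, simplified representation of a possibly generic type name.
#[derive(Debug, PartialEq)]
struct TypeName {
    name: String,
    arguments: Vec<TypeName>,
}

/// Parses `Name` or `Name<Arg, ...>` and returns the remaining input.
fn parse_type_name(input: &str) -> Option<(TypeName, &str)> {
    let input = input.trim_start();
    let end = input
        .find(|c: char| !(c.is_alphanumeric() || c == '_' || c == '.'))
        .unwrap_or(input.len());
    if end == 0 {
        return None; // no name where one was expected
    }
    let name = input[..end].to_string();
    let mut rest = input[end..].trim_start();
    let mut arguments = Vec::new();
    if let Some(after_lt) = rest.strip_prefix('<') {
        rest = after_lt;
        loop {
            // Each argument is itself a type name, so recurse.
            let (argument, tail) = parse_type_name(rest)?;
            arguments.push(argument);
            let tail = tail.trim_start();
            if let Some(next) = tail.strip_prefix(',') {
                rest = next;
            } else if let Some(next) = tail.strip_prefix('>') {
                rest = next;
                break;
            } else {
                return None; // unterminated argument list
            }
        }
    }
    Some((TypeName { name, arguments }, rest))
}
```

Because each `>` is consumed individually, the `>>` at the end of `Dictionary<string, List<int>>` closes two argument lists instead of being read as a shift operator, which is one of the details that makes generic parsing delicate.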
Nullable Type Parsing
Nullable types use special syntax:
int? // Nullable<int>
string? // string with nullable annotation
Type Resolution
Qualified Names
Types can be fully qualified:
System.Collections.Generic.List<int>
MyNamespace.MyClass
Type Aliases
Using directives create type aliases:
using StringList = System.Collections.Generic.List<string>;
Global Type References
Global namespace references:
global::System.String // Fully qualified from global namespace
Type Constraints
Constraint Types
- Reference Type: where T : class
- Value Type: where T : struct
- Constructor: where T : new()
- Base Class: where T : BaseClass
- Interface: where T : IInterface
- Type Parameter: where T : U
Constraint Combinations
Multiple constraints can be combined:
where T : class, IDisposable, new()
Constraint Validation
The parser validates constraint combinations:
- class and struct are mutually exclusive
- new() constraint must come last
- Base class constraint must come before interface constraints
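The three rules can be expressed as a small check over an ordered constraint list. This is a sketch only: the `Constraint` enum and `validate_constraints` are hypothetical, and the enum assumes base classes and interfaces have already been told apart, which in practice requires information beyond plain syntax.

```rust
/// Hypothetical constraint kinds for one `where T : ...` clause.
#[derive(Debug, Clone, PartialEq)]
enum Constraint {
    Class,
    Struct,
    New,
    BaseClass(String),
    Interface(String),
}

/// Checks the three ordering and exclusivity rules listed above.
fn validate_constraints(constraints: &[Constraint]) -> Result<(), String> {
    let has_class = constraints.contains(&Constraint::Class);
    let has_struct = constraints.contains(&Constraint::Struct);
    if has_class && has_struct {
        return Err("class and struct are mutually exclusive".to_string());
    }

    if let Some(pos) = constraints.iter().position(|c| *c == Constraint::New) {
        if pos != constraints.len() - 1 {
            return Err("new() constraint must come last".to_string());
        }
    }

    let first_interface = constraints
        .iter()
        .position(|c| matches!(c, Constraint::Interface(_)));
    let last_base = constraints
        .iter()
        .rposition(|c| matches!(c, Constraint::BaseClass(_)));
    if let (Some(interface), Some(base)) = (first_interface, last_base) {
        if base > interface {
            return Err("base class must come before interfaces".to_string());
        }
    }
    Ok(())
}
```

The clause `where T : class, IDisposable, new()` passes all three checks, while a list that places `new()` ahead of an interface is rejected by the second rule.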
Type Variance
Covariance and Contravariance
interface ICovariant<out T> { } // Covariant
interface IContravariant<in T> { } // Contravariant
interface IInvariant<T> { } // Invariant
Advanced Type Features
Record Types
record Person(string Name, int Age);
record class Employee(string Name, int Age, string Department);
record struct Point(int X, int Y);
Pattern Types
Types used in pattern matching:
obj is string str // Type pattern
obj is not null // Negation pattern
obj is > 0 and < 100 // Relational pattern
Type System Implementation
The type system is implemented with a hierarchical structure:
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum Type {
    Primitive(PrimitiveType),
    Named {
        name: Identifier,
        type_arguments: Option<Vec<Type>>,
    },
    Array {
        element_type: Box<Type>,
        dimensions: u32,
    },
    Nullable(Box<Type>),
    Tuple(Vec<(Option<Identifier>, Type)>),
    Pointer(Box<Type>),
    FunctionPointer {
        parameters: Vec<Type>,
        return_type: Box<Type>,
    },
}
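One way to see how such a recursive enum is consumed is a function that renders a type back to C# syntax. The sketch below uses a reduced, self-contained copy of the enum (strings instead of `Identifier`, no serde derives, and no `Primitive` or `FunctionPointer` variants); the `render` function is hypothetical and not part of BSharp.

```rust
/// Reduced, hypothetical copy of the type enum for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Named { name: String, type_arguments: Option<Vec<Type>> },
    Array { element_type: Box<Type>, dimensions: u32 },
    Nullable(Box<Type>),
    Tuple(Vec<(Option<String>, Type)>),
    Pointer(Box<Type>),
}

/// Renders a type as C# source text.
fn render(ty: &Type) -> String {
    match ty {
        Type::Named { name, type_arguments } => match type_arguments {
            Some(args) => {
                let args: Vec<String> = args.iter().map(render).collect();
                format!("{}<{}>", name, args.join(", "))
            }
            None => name.clone(),
        },
        Type::Array { .. } => {
            // C# writes rank specifiers outermost first, after the
            // innermost element type.
            let mut specifiers = String::new();
            let mut current = ty;
            while let Type::Array { element_type, dimensions } = current {
                specifiers.push('[');
                specifiers.push_str(&",".repeat(dimensions.saturating_sub(1) as usize));
                specifiers.push(']');
                current = &**element_type;
            }
            format!("{}{}", render(current), specifiers)
        }
        Type::Nullable(inner) => format!("{}?", render(inner)),
        Type::Tuple(elements) => {
            let parts: Vec<String> = elements
                .iter()
                .map(|(name, ty)| match name {
                    Some(name) => format!("{} {}", render(ty), name),
                    None => render(ty),
                })
                .collect();
            format!("({})", parts.join(", "))
        }
        Type::Pointer(inner) => format!("{}*", render(inner)),
    }
}
```

Boxing the inner type in `Array`, `Nullable`, and `Pointer` is what lets the enum nest to arbitrary depth, so a nullable `List<int>` is simply a `Nullable` wrapping a `Named` type with one type argument.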
Error Handling
The type parser provides detailed error messages for:
- Invalid type syntax
- Constraint violations
- Generic parameter mismatches
- Nullable context errors
- Variance violations
Type Inference
While the parser doesn't perform type inference (that's the compiler's job), it correctly parses:
- var declarations
- Anonymous types
- Implicitly typed arrays
- Lambda parameter types
The type system parser is designed to accurately represent the full complexity of C#'s type system while maintaining performance and providing clear error diagnostics.
C# Feature Completeness Matrix
This document tracks the implementation status of C# language features in the BSharp parser.
Legend:
- ✅ Fully Supported - Feature is completely implemented and tested
- 🟡 Partial Support - Feature is partially implemented or has known limitations
- ⚠️ Planned - Feature is planned but not yet implemented
- ❌ Not Supported - Feature is not currently supported
C# 1.0 Features (2002)
Type Declarations
| Feature | Status | Notes |
|---|---|---|
| Classes | ✅ | Full support including nested classes |
| Structs | ✅ | Full support |
| Interfaces | ✅ | Full support |
| Enums | ✅ | Full support including flags |
| Delegates | ✅ | Full support |
Members
| Feature | Status | Notes |
|---|---|---|
| Fields | ✅ | Public, private, protected, internal |
| Properties | ✅ | Get/set accessors |
| Methods | ✅ | Instance and static methods |
| Constructors | ✅ | Instance and static constructors |
| Destructors/Finalizers | ✅ | Full support |
| Events | ✅ | Full support |
| Indexers | ✅ | Full support |
| Operators | ✅ | Operator overloading |
Statements
| Feature | Status | Notes |
|---|---|---|
| if/else | ✅ | Full support |
| switch/case | ✅ | Traditional switch statements |
| for | ✅ | Full support |
| foreach | ✅ | Full support |
| while | ✅ | Full support |
| do-while | ✅ | Full support |
| break | ✅ | Full support |
| continue | ✅ | Full support |
| return | ✅ | Full support |
| throw | ✅ | Full support |
| try/catch/finally | ✅ | Full exception handling |
| using statement | ✅ | Resource management |
| lock | ✅ | Thread synchronization |
| goto | ✅ | Including goto case |
| checked/unchecked | ✅ | Overflow checking |
Expressions
| Feature | Status | Notes |
|---|---|---|
| Literals | ✅ | All literal types |
| Arithmetic operators | ✅ | +, -, *, /, % |
| Comparison operators | ✅ | ==, !=, <, >, <=, >= |
| Logical operators | ✅ | &&, \|\|, ! |
| Bitwise operators | ✅ | &, \|, ^, ~, <<, >> |
| Assignment operators | ✅ | =, +=, -=, etc. |
| Conditional operator | ✅ | ? : ternary |
| Member access | ✅ | . operator |
| Indexing | ✅ | [] operator |
| Method invocation | ✅ | Full support |
| Object creation | ✅ | new expressions |
| Array creation | ✅ | Single and multi-dimensional |
| Type casting | ✅ | (Type)expr |
| typeof | ✅ | Type information |
| sizeof | ✅ | Size of types |
| is operator | ✅ | Type testing |
| as operator | ✅ | Safe casting |
Types
| Feature | Status | Notes |
|---|---|---|
| Primitive types | ✅ | All built-in types |
| Arrays | ✅ | Single, multi-dimensional, jagged |
| Nullable value types | ✅ | T? syntax |
| Reference types | ✅ | Classes, interfaces, delegates |
| Value types | ✅ | Structs, enums |
Modifiers
| Feature | Status | Notes |
|---|---|---|
| Access modifiers | ✅ | public, private, protected, internal |
| static | ✅ | Full support |
| readonly | ✅ | Full support |
| const | ✅ | Full support |
| virtual | ✅ | Full support |
| override | ✅ | Full support |
| abstract | ✅ | Full support |
| sealed | ✅ | Full support |
| extern | ✅ | Full support |
C# 2.0 Features (2005)
| Feature | Status | Notes |
|---|---|---|
| Generics | ✅ | Full support including constraints |
| Generic constraints | ✅ | where T : class, struct, new(), etc. |
| Partial types | ✅ | partial keyword |
| Anonymous methods | ✅ | delegate { } syntax |
| Nullable types | ✅ | Nullable<T> and T? |
| Iterators | ✅ | yield return, yield break |
| Covariance/Contravariance | ✅ | Delegate variance (in/out annotations arrived in C# 4.0) |
| Static classes | ✅ | Full support |
| Property accessors | ✅ | Different accessibility |
| Namespace aliases | ✅ | using Alias = Namespace |
| ?? operator | ✅ | Null-coalescing |
C# 3.0 Features (2007)
| Feature | Status | Notes |
|---|---|---|
| Auto-implemented properties | ✅ | { get; set; } |
| Object initializers | ✅ | new T { Prop = value } |
| Collection initializers | ✅ | new List<T> { 1, 2, 3 } |
| Anonymous types | ✅ | new { Name = "x" } |
| Extension methods | ✅ | this parameter |
| Lambda expressions | ✅ | x => x * 2 |
| Expression trees | ✅ | Parsing support |
| LINQ query syntax | ✅ | from x in y select z |
| Implicitly typed variables | ✅ | var keyword |
| Partial methods | ✅ | In partial classes |
C# 4.0 Features (2010)
| Feature | Status | Notes |
|---|---|---|
| Dynamic binding | ✅ | dynamic type |
| Named arguments | ✅ | Method(param: value) |
| Optional parameters | ✅ | Default parameter values |
| Generic covariance/contravariance | ✅ | Enhanced support |
| Embedded interop types | ✅ | no-pia |
C# 5.0 Features (2012)
| Feature | Status | Notes |
|---|---|---|
| Async/await | ✅ | async and await keywords |
| Caller info attributes | ✅ | [CallerMemberName], etc. |
C# 6.0 Features (2015)
| Feature | Status | Notes |
|---|---|---|
| Auto-property initializers | ✅ | public int X { get; set; } = 1; |
| Expression-bodied members | ✅ | => expr for methods/properties |
| using static | ✅ | Import static members |
| Null-conditional operator | ✅ | ?. and ?[] |
| String interpolation | ✅ | $"Hello {name}" |
| nameof operator | ✅ | nameof(variable) |
| Index initializers | ✅ | [index] = value |
| Exception filters | ✅ | catch (E) when (condition) |
| await in catch/finally | ✅ | Full support |
C# 7.0 Features (2017)
| Feature | Status | Notes |
|---|---|---|
| Out variables | ✅ | Method(out var x) |
| Tuples | ✅ | (int, string) syntax |
| Tuple deconstruction | ✅ | (var x, var y) = tuple |
| Pattern matching | ✅ | is patterns |
| Local functions | ✅ | Functions inside methods |
| Ref returns and locals | ✅ | ref keyword |
| Discards | ✅ | _ placeholder |
| Binary literals | ✅ | 0b1010 |
| Digit separators | ✅ | 1_000_000 |
| Throw expressions | ✅ | x ?? throw new E() |
| Expression-bodied constructors | ✅ | => expr syntax |
| Expression-bodied finalizers | ✅ | => expr syntax |
| Expression-bodied accessors | ✅ | get => expr |
C# 7.1 Features (2017)
| Feature | Status | Notes |
|---|---|---|
| Async main | ✅ | async Task Main() |
| Default literal expressions | ✅ | default without type |
| Inferred tuple names | ✅ | Automatic naming |
| Pattern matching on generics | ✅ | Full support |
C# 7.2 Features (2017)
| Feature | Status | Notes |
|---|---|---|
| ref readonly | ✅ | Read-only references |
| in parameters | ✅ | Pass by readonly reference |
| ref struct | ✅ | Stack-only structs |
| Non-trailing named arguments | ✅ | Mixed named/positional |
| private protected | ✅ | Access modifier |
| Leading underscores in numeric literals | ✅ | 0x_FF, 0b_1010 |
| Conditional ref expressions | ✅ | ref in ternary |
C# 7.3 Features (2018)
| Feature | Status | Notes |
|---|---|---|
| Tuple equality | ✅ | == and != |
| Attributes on backing fields | ✅ | [field: Attribute] |
| Expression variables in initializers | ✅ | Full support |
| ref local reassignment | ✅ | Reassign ref locals |
| Stackalloc initializers | ✅ | stackalloc[] { 1, 2 } |
| Pattern-based fixed | ✅ | Custom fixed |
| Improved overload candidates | ✅ | Better resolution |
C# 8.0 Features (2019)
| Feature | Status | Notes |
|---|---|---|
| Nullable reference types | ✅ | string? annotations |
| Default interface methods | ✅ | Interface implementations |
| Pattern matching enhancements | ✅ | Switch expressions, property patterns |
| Switch expressions | ✅ | x switch { ... } |
| Property patterns | ✅ | { Prop: value } |
| Tuple patterns | ✅ | (1, 2) patterns |
| Positional patterns | ✅ | Deconstruction patterns |
| Using declarations | ✅ | using var x = ... |
| Static local functions | ✅ | static modifier |
| Disposable ref structs | ✅ | IDisposable on ref struct |
| Nullable reference types | ✅ | #nullable directives |
| Asynchronous streams | ✅ | IAsyncEnumerable<T> |
| Asynchronous disposable | ✅ | IAsyncDisposable |
| Indices and ranges | ✅ | ^ and .. operators |
| Null-coalescing assignment | ✅ | ??= operator |
| Unmanaged constructed types | ✅ | Generic constraints |
| Stackalloc in nested expressions | ✅ | Full support |
C# 9.0 Features (2020)
| Feature | Status | Notes |
|---|---|---|
| Records | ✅ | record keyword |
| Init-only setters | ✅ | init accessor |
| Top-level statements | ✅ | No Main method required |
| Pattern matching improvements | ✅ | Relational, logical patterns |
| Relational patterns | ✅ | > 0, <= 10 |
| Logical patterns | ✅ | and, or, not |
| Target-typed new | ✅ | new() without type |
| Covariant returns | ✅ | Override with derived type |
| Extension GetEnumerator | ✅ | foreach support |
| Lambda discard parameters | ✅ | (_, _) => expr |
| Attributes on local functions | ✅ | Full support |
| Module initializers | ✅ | [ModuleInitializer] |
| Partial methods with return | ✅ | Extended partial |
| Native integers | ✅ | nint, nuint |
| Function pointers | ✅ | delegate* syntax |
| Suppress emitting localsinit | ✅ | [SkipLocalsInit] |
| Target-typed conditional | ✅ | ? : inference |
C# 10.0 Features (2021)
| Feature | Status | Notes |
|---|---|---|
| Record structs | ✅ | record struct |
| Global using directives | ✅ | global using |
| File-scoped namespaces | ✅ | namespace X; |
| Extended property patterns | ✅ | Nested patterns |
| Constant interpolated strings | ✅ | const strings |
| Lambda improvements | ✅ | Natural types, attributes |
| Caller expression attribute | ✅ | [CallerArgumentExpression] |
| Improved definite assignment | ✅ | Better analysis |
| Allow AsyncMethodBuilder | ✅ | Custom builders |
| Record types with sealed ToString | ✅ | Sealed override |
| Assignment and declaration in same deconstruction | ✅ | Mixed syntax |
| Allow both assignment and declaration | ✅ | Full support |
C# 11.0 Features (2022)
| Feature | Status | Notes |
|---|---|---|
| Raw string literals | ✅ | """text""" |
| Generic attributes | ✅ | [Attr<T>] |
| UTF-8 string literals | ✅ | "text"u8 |
| Newlines in string interpolations | ✅ | Multi-line expressions |
| List patterns | ✅ | [1, 2, .., 10] |
| File-local types | ✅ | file class |
| Required members | ✅ | required modifier |
| Auto-default structs | ✅ | Default initialization |
| Pattern match Span<char> | ✅ | Constant patterns |
| Extended nameof scope | ✅ | More contexts |
| Numeric IntPtr | ✅ | Operators on IntPtr |
| ref fields | ✅ | In ref structs |
| scoped ref | ✅ | Lifetime annotations |
| Checked operators | ✅ | User-defined checked |
C# 12.0 Features (2023)
| Feature | Status | Notes |
|---|---|---|
| Primary constructors | ✅ | Full support for classes and structs |
| Collection expressions | ✅ | [1, 2, 3] and spread .. syntax |
| Inline arrays | ❌ | Not yet implemented |
| Optional parameters in lambdas | ✅ | Full support |
| ref readonly parameters | ✅ | Full support |
| Alias any type | ✅ | using Alias = (int, string) |
| Experimental attribute | ✅ | [Experimental] |
| Interceptors | ❌ | Not yet implemented |
C# 13.0 Features (2024)
| Feature | Status | Notes |
|---|---|---|
| params collections | ⚠️ | Planned |
| New lock type | ⚠️ | Planned |
| New escape sequence \e | ⚠️ | Planned |
| Method group natural type | ⚠️ | Planned |
| Implicit indexer access | ⚠️ | Planned |
| ref and unsafe in iterators | ⚠️ | Planned |
| ref struct interfaces | ⚠️ | Planned |
| Allows ref struct types | ⚠️ | Planned |
C# 14.0 Features (2025 - .NET 10)
| Feature | Status | Notes |
|---|---|---|
| Extension members | 🟡 | Parser + emitter for extension blocks; semantics planned |
| field keyword | ⚠️ | Planned - Field-backed properties |
| Null-conditional assignment | ⚠️ | Planned - ?. on left side of = |
| nameof unbound generics | ⚠️ | Planned - nameof(List<>) |
| Implicit Span<T> conversions | ⚠️ | Planned - First-class span support |
| Lambda parameter modifiers | ⚠️ | Planned - (out x) => ... without types |
| Partial constructors | ⚠️ | Planned - partial instance constructors |
| Partial events | ⚠️ | Planned - partial events |
| User-defined compound assignment | ⚠️ | Planned - Custom +=, -= operators |
Preprocessor Directives
| Feature | Status | Notes |
|---|---|---|
| #if / #elif / #else / #endif | ✅ | Conditional compilation |
| #define / #undef | ✅ | Symbol definition |
| #warning / #error | ✅ | Compiler messages |
| #line | ✅ | Line number control |
| #region / #endregion | ✅ | Code folding |
| #pragma warning | ✅ | Warning control |
| #pragma checksum | ✅ | Debugging support |
| #nullable | ✅ | Nullable context |
Documentation Comments
| Feature | Status | Notes |
|---|---|---|
| XML documentation | ✅ | /// and /** */ |
| <summary> | ✅ | Full support |
| <param> | ✅ | Full support |
| <returns> | ✅ | Full support |
| <exception> | ✅ | Full support |
| <see> / <seealso> | ✅ | Full support |
| <example> | ✅ | Full support |
| <code> / <c> | ✅ | Full support |
| <para> | ✅ | Full support |
| <list> | ✅ | Full support |
| <include> | ✅ | Full support |
Unsafe Code
| Feature | Status | Notes |
|---|---|---|
| Pointers | ✅ | T* syntax |
| unsafe keyword | ✅ | Blocks and methods |
| fixed statement | ✅ | Pin managed objects |
| stackalloc | ✅ | Stack allocation |
| Function pointers | ✅ | delegate* (C# 9+) |
| sizeof operator | ✅ | Type sizes |
| Pointer arithmetic | ✅ | Full support |
| Address-of operator | ✅ | & operator |
| Indirection operator | ✅ | * operator |
Summary Statistics
Overall Completeness
| Version | Features | Supported | Partial | Planned | Not Supported | Completion |
|---|---|---|---|---|---|---|
| C# 1.0 | 80+ | 80+ | 0 | 0 | 0 | 100% |
| C# 2.0 | 11 | 11 | 0 | 0 | 0 | 100% |
| C# 3.0 | 10 | 10 | 0 | 0 | 0 | 100% |
| C# 4.0 | 5 | 5 | 0 | 0 | 0 | 100% |
| C# 5.0 | 2 | 2 | 0 | 0 | 0 | 100% |
| C# 6.0 | 10 | 10 | 0 | 0 | 0 | 100% |
| C# 7.0 | 13 | 13 | 0 | 0 | 0 | 100% |
| C# 7.1 | 4 | 4 | 0 | 0 | 0 | 100% |
| C# 7.2 | 7 | 7 | 0 | 0 | 0 | 100% |
| C# 7.3 | 7 | 7 | 0 | 0 | 0 | 100% |
| C# 8.0 | 18 | 18 | 0 | 0 | 0 | 100% |
| C# 9.0 | 17 | 17 | 0 | 0 | 0 | 100% |
| C# 10.0 | 12 | 12 | 0 | 0 | 0 | 100% |
| C# 11.0 | 13 | 13 | 0 | 0 | 0 | 100% |
| C# 12.0 | 7 | 6 | 0 | 0 | 1 | ~86% |
| C# 13.0 | 8 | 0 | 0 | 8 | 0 | 0% (Preview) |
| C# 14.0 | 9 | 0 | 1 | 8 | 0 | 0% (Preview) |
Total: ~99% of released C# features supported (C# 1.0 - 12.0)
Testing Coverage
Test Organization
All parser tests are located in tests/parser/ with comprehensive coverage:
- Expression tests: tests/parser/expressions/
- Statement tests: tests/parser/statements/
- Declaration tests: tests/parser/declarations/
- Type tests: tests/parser/types/
- Pattern matching tests: tests/parser/expressions/pattern_matching_tests.rs
- Preprocessor tests: tests/parser/preprocessor/
Test Fixtures
Real-world C# projects in tests/fixtures/:
- happy_path/: Valid, well-formed C# code
- complex/: Complex real-world scenarios
Known Limitations
C# 12.0 Limitations
- Inline Arrays: Not yet implemented
  - Requires [InlineArray(n)] attribute support
  - Planned for future release
- Interceptors: Not yet implemented
  - Experimental feature in C# 12
  - May be implemented when feature stabilizes
C# 13.0 & 14.0 Status
Support for C# 13.0 and 14.0 is still in development. Apart from partial parser and emitter support for C# 14 extension members, these features are planned for future implementation.
Contributing
To add support for new C# features:
- Update AST nodes in src/syntax/nodes/
- Implement parser in src/parser/
- Add comprehensive tests in tests/parser/
- Update this matrix to reflect new support
- Document in relevant parser documentation
See Contributing Guide for details.
References
- C# Language Specification: https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/
- C# Version History: https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/
- Roslyn Source: https://github.com/dotnet/roslyn
- Parser Implementation: src/parser/
- Test Suite: tests/parser/
Last Updated: 2025-09-30
Parser Version: Current development version
Maintained By: BSharp Project Contributors
Keywords and Tokens
Keyword and token helpers used by the parser.
Keyword Pairs Macro
- Location: src/bsharp_parser/src/keywords/mod.rs
- Macro: define_keyword_pair! (macro_rules)
- Generates two functions per keyword:
  - kw_<name>() – consumes the keyword with word boundary check
  - peek_<name>() – non-consuming peek with surrounding whitespace/comments tolerated

```rust
// Define a pair:
// define_keyword_pair!(kw_public, peek_public, "public");
#[macro_export]
macro_rules! define_keyword_pair {
    ($kw_fn:ident, $peek_fn:ident, $lit:literal) => {
        pub fn $kw_fn() -> impl FnMut($crate::syntax::span::Span) -> $crate::syntax::errors::BResult<&str> {
            use nom::Parser as _;
            (|i: $crate::syntax::span::Span| {
                nom::combinator::map(
                    nom::sequence::terminated(
                        nom_supreme::tag::complete::tag($lit),
                        nom::combinator::peek(nom::combinator::not(
                            nom::character::complete::satisfy(|c: char| c.is_alphanumeric() || c == '_'),
                        )),
                    ),
                    |s: $crate::syntax::span::Span| *s.fragment(),
                )
                .parse(i)
            })
        }

        pub fn $peek_fn() -> impl FnMut($crate::syntax::span::Span) -> $crate::syntax::errors::BResult<&str> {
            use nom::Parser as _;
            (|i: $crate::syntax::span::Span| {
                nom::combinator::peek(
                    nom::sequence::delimited(
                        $crate::syntax::comment_parser::ws,
                        nom::combinator::map(
                            nom::sequence::terminated(
                                nom_supreme::tag::complete::tag($lit),
                                nom::combinator::peek(nom::combinator::not(
                                    nom::character::complete::satisfy(|c: char| c.is_alphanumeric() || c == '_'),
                                )),
                            ),
                            |_| $lit,
                        ),
                        $crate::syntax::comment_parser::ws,
                    ),
                )
                .parse(i)
            })
        }
    };
}
```

- Keyword modules live under src/bsharp_parser/src/keywords/ (e.g., access_keywords.rs, declaration_keywords.rs, linq_query_keywords.rs, type_keywords.rs).
- Central keyword set: KEYWORDS in keywords/mod.rs, checked with is_keyword().
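The word-boundary rule behind the generated functions can be illustrated without nom. The following is a minimal std-only sketch, not the macro's actual expansion: `consume_keyword` and `peek_keyword` are hypothetical names, and unlike the real `peek_*` functions this version skips only whitespace, not comments.

```rust
/// Returns the rest of the input if `input` starts with `keyword` and the
/// keyword is not immediately followed by an identifier character.
fn consume_keyword<'a>(input: &'a str, keyword: &str) -> Option<&'a str> {
    let rest = input.strip_prefix(keyword)?;
    match rest.chars().next() {
        // "publicity" must not match the keyword "public".
        Some(c) if c.is_alphanumeric() || c == '_' => None,
        _ => Some(rest),
    }
}

/// Non-consuming variant: skips leading whitespace, then checks for the
/// keyword. The caller's input is left untouched.
fn peek_keyword(input: &str, keyword: &str) -> bool {
    consume_keyword(input.trim_start(), keyword).is_some()
}
```

The boundary check is what keeps identifiers such as `publicity` or `class_name` from being split into a keyword plus a trailing fragment.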
Token and Whitespace Helpers
- Whitespace/comments: src/bsharp_parser/src/syntax/comment_parser.rs
  - ws() parses optional whitespace and comments
  - parse_whitespace_or_comments() returns the consumed span text
- List parsing: src/bsharp_parser/src/syntax/list_parser.rs provides helpers for delimited/separated lists
- Punctuation/tokens: Use nom_supreme::tag::complete::tag("...") with:
  - peek(not(satisfy(|c| ...))) for word boundaries on keywords
  - preceded/terminated/delimited and ws() to control surrounding trivia
Example token with trivia discipline:
```rust
// `Parser` must be in scope for the `.parse(i)` call below.
use nom::{combinator::map, sequence::delimited, Parser};
use nom_supreme::tag::complete::tag;
use crate::syntax::comment_parser::ws;
use crate::syntax::errors::BResult;
use crate::syntax::span::Span;

/// Parses a comma, consuming whitespace and comments on both sides.
pub fn comma(i: Span) -> BResult<()> {
    map(delimited(ws, tag(","), ws), |_| ()).parse(i)
}
```
Usage Patterns
- Prefer peek_*() when branching without consuming input (e.g., lookahead for statement kind).
- After consuming a keyword with kw_*(), use cut() to prevent backtracking past the commitment.
- Always wrap the top-level file parser with all_consuming.
- Keep context labels short and specific.
Adding a New Keyword
- Pick the right module in keywords/ and add a define_keyword_pair! entry.
- If it's a reserved word, add it to KEYWORDS (for identifier filtering).
- Use kw_*() / peek_*() in parsers with ws() at boundaries.
- Add tests under src/bsharp_tests/src/parser/... for both positive and negative cases.
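Identifier filtering, mentioned in the second step, might look roughly like the sketch below. The `KEYWORDS` contents here are a small hypothetical subset and `as_identifier` is an illustrative helper, not a function from the crate.

```rust
/// Hypothetical subset of reserved words; the real `KEYWORDS` set is larger.
const KEYWORDS: &[&str] = &["class", "public", "return", "void"];

/// True if `word` is a reserved word and so cannot be a plain identifier.
fn is_keyword(word: &str) -> bool {
    KEYWORDS.contains(&word)
}

/// Accepts `word` as an identifier unless it is reserved.
fn as_identifier(word: &str) -> Option<&str> {
    if is_keyword(word) {
        None
    } else {
        Some(word)
    }
}
```

A new reserved word that is missing from the set would be accepted as an identifier, which is why the negative test cases in the last step matter.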
References
- Keyword macro and modules: src/bsharp_parser/src/keywords/
- Whitespace/comment parser: src/bsharp_parser/src/syntax/comment_parser.rs
- Lists: src/bsharp_parser/src/syntax/list_parser.rs
- Error formatting: src/bsharp_parser/src/syntax/errors.rs
Query API for AST traversal
The Query API is provided by the bsharp_syntax crate and re-exported by bsharp_analysis for convenience. It replaces older navigation traits, but the Query API itself is current and not deprecated.
Core types
- NodeRef<'a>: a thin enum over AST nodes (CompilationUnit, Namespace, Class, Struct, Interface, Enum, Record, Delegate, Method, Statement, Expression, plus top-level items). Origin: bsharp_syntax::node::ast_node::NodeRef (re-exported as bsharp_analysis::framework::NodeRef).
- Query<'a>: a fluent helper to enumerate descendants and select typed nodes. Origin: bsharp_syntax::query::Query (re-exported as bsharp_analysis::framework::Query).

```rust
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::CompilationUnit;
use bsharp_syntax::{ClassDeclaration, MethodDeclaration};

fn all_classes<'a>(cu: &'a CompilationUnit) -> Vec<&'a ClassDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<ClassDeclaration>()
        .collect()
}

fn all_methods<'a>(cu: &'a CompilationUnit) -> Vec<&'a MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<MethodDeclaration>()
        .collect()
}
```
Descendant enumeration
Query::descendants() walks the tree using Children implemented for NodeRef.
```rust
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::CompilationUnit;
use bsharp_syntax::statements::Statement;

fn all_statements<'a>(cu: &'a CompilationUnit) -> Vec<&'a Statement> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<Statement>()
        .collect()
}
```
Filtering
Use filter_typed to filter by predicate.
```rust
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::ClassDeclaration;

// `cu` is a previously parsed CompilationUnit.
let public_classes: Vec<&ClassDeclaration> = Query::from(NodeRef::CompilationUnit(&cu))
    .filter_typed::<ClassDeclaration>(|c| c.modifiers.iter().any(|m| m.is_public()))
    .collect();
```
Best practices
- Prefer Query for node enumeration across passes.
- For hot path statement/expression analysis, use shared helpers (metrics::shared) or a small local walker when necessary.
- Keep passes stateless and deterministic; feed inputs via AnalysisSession artifacts.
Implementation notes
The Children/Extract traits are implemented for common AST nodes, enabling Query::of<T>() to return strong types. See:
- src/bsharp_syntax/src/query/ for Children, Extract, Query.
- src/bsharp_syntax/src/node/ast_node.rs for NodeRef.
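To make the division of labour concrete, here is a std-only sketch of the same idea on a toy tree: one function enumerates descendants (the role of Children) and another selects a single node kind (the role of Extract). The `Node` enum and both function names are hypothetical and are not part of bsharp_syntax.

```rust
/// Toy AST standing in for the real node types (hypothetical).
enum Node {
    Namespace { name: String, members: Vec<Node> },
    Class { name: String, members: Vec<Node> },
    Method { name: String },
}

/// Depth-first, pre-order walk over `root` and all of its descendants.
fn descendants(root: &Node) -> Vec<&Node> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        out.push(node);
        if let Node::Namespace { members, .. } | Node::Class { members, .. } = node {
            // Push in reverse so children are visited in source order.
            stack.extend(members.iter().rev());
        }
    }
    out
}

/// Typed selection: keep only classes and return their names.
fn class_names(root: &Node) -> Vec<&str> {
    descendants(root)
        .into_iter()
        .filter_map(|n| match n {
            Node::Class { name, .. } => Some(name.as_str()),
            _ => None,
        })
        .collect()
}
```

An explicit stack keeps the walk iterative, which avoids deep recursion on heavily nested input.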
Comment Parsing
BSharp implements comprehensive comment parsing for both regular comments and XML documentation comments, preserving them as part of the AST for documentation generation and analysis tools.
Comment Types
1. Single-Line Comments
Standard C++ style comments:
// This is a single-line comment
int x = 5; // End-of-line comment
2. Multi-Line Comments
Traditional C-style block comments:
/*
* This is a multi-line comment
* that spans several lines
*/
int y = 10; /* Inline block comment */
3. XML Documentation Comments
Single-Line XML Comments
/// <summary>
/// This method calculates the sum of two integers.
/// </summary>
/// <param name="a">The first integer.</param>
/// <param name="b">The second integer.</param>
/// <returns>The sum of a and b.</returns>
public int Add(int a, int b)
{
return a + b;
}
Multi-Line XML Comments
/**
* <summary>
* This is a multi-line XML documentation comment.
* It provides detailed information about the method.
* </summary>
* <param name="value">The input value to process.</param>
* <returns>The processed result.</returns>
*/
public string ProcessValue(string value) { }
XML Documentation Structure
Standard XML Tags
Summary and Description
<summary>
Brief description of the member.
</summary>
<remarks>
Detailed remarks and additional information.
</remarks>
Parameters and Returns
<param name="parameterName">Description of the parameter.</param>
<returns>Description of the return value.</returns>
Exceptions
<exception cref="ArgumentNullException">
Thrown when the parameter is null.
</exception>
Examples
<example>
This example shows how to use the method:
<code>
var result = MyMethod("input");
Console.WriteLine(result);
</code>
</example>
See References
<see cref="RelatedMethod"/>
<seealso cref="AnotherClass"/>
Generic Type Parameters
<typeparam name="T">The type parameter.</typeparam>
<typeparamref name="T"/>
Custom XML Tags
The parser supports custom XML tags:
<custom attribute="value">
Custom content with <nested>elements</nested>.
</custom>
XML Documentation Parsing
XML Element Structure
The parser represents XML elements with:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlElement {
    pub name: Identifier,
    pub attributes: Vec<XmlAttribute>,
    pub children: Vec<XmlNode>,
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlAttribute {
    pub name: Identifier,
    pub value: String,
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum XmlNode {
    Element(XmlElement),
    Text(String),
    CData(String),
    Comment(String),
}
```
XML Documentation Comment
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct XmlDocumentationComment {
    pub elements: Vec<XmlNode>,
}
```
Parsing XML Attributes
The parser handles XML attributes with various syntaxes:
<param name="value">Description</param>
<see cref="MyClass.MyMethod(int, string)"/>
<exception cref="System.ArgumentException">Error description</exception>
XML Content Parsing
The parser processes mixed content:
<summary>
This method processes <paramref name="input"/> and returns
<see cref="ProcessResult"/> containing the result.
</summary>
Comment Association
Declaration Comments
Comments are associated with their following declarations:
/// <summary>Class documentation</summary>
public class MyClass
{
/// <summary>Method documentation</summary>
public void MyMethod() { }
}
Member Comments
Each declaration can have associated documentation:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct MethodDeclaration {
    pub documentation: Option<XmlDocumentationComment>,
    // ... other fields
}
```
Advanced XML Features
CDATA Sections
The parser handles CDATA sections for literal content:
<example>
<![CDATA[
if (x < y && y > z)
{
Console.WriteLine("Complex condition");
}
]]>
</example>
Nested XML Elements
Complex nested structures are supported:
<summary>
This method handles <see cref="List{T}"/> where T is
<typeparamref name="T"/> and implements <see cref="IComparable{T}"/>.
</summary>
XML Namespaces
The parser can handle XML namespaces in documentation:
<doc:summary xmlns:doc="http://schemas.microsoft.com/developer/documentation">
Namespaced documentation content.
</doc:summary>
Comment Preservation
Comment Tokens
Comments are preserved as tokens in the AST:
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CommentToken {
    SingleLine(String),
    MultiLine(String),
    XmlDocumentation(XmlDocumentationComment),
}
```
Position Information
Comments maintain position information:
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PositionedComment {
    pub comment: CommentToken,
    pub line: usize,
    pub column: usize,
}
```
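How the line and column values are obtained is not shown here. One common approach, sketched below under the assumption that only a byte offset into the source is known, is to count newlines before the offset. `line_and_column` is a hypothetical helper, and the 1-based numbering is an assumption rather than documented behaviour.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns count characters, not bytes. `offset` must lie on a char boundary.
fn line_and_column(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    // Every newline before the offset starts a new line.
    let line = before.matches('\n').count() + 1;
    // The current line starts just after the last newline, or at 0.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    (line, column)
}
```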
Error Handling
XML Validation
The parser validates XML structure:
- Well-formed XML: Proper opening and closing tags
- Attribute syntax: Valid attribute name-value pairs
- Nesting rules: Correct element nesting
- Character escaping: Proper XML character escaping
Error Recovery
When XML is malformed, the parser attempts recovery:
- Skip malformed elements: Continue parsing after errors
- Preserve content: Keep as much content as possible
- Error reporting: Provide detailed error locations
Integration with Analysis
Documentation Analysis
Comments are available for analysis tools:
```rust
impl XmlDocumentationComment {
    pub fn find_elements_by_name(&self, name: &str) -> Vec<&XmlElement> {
        // Find all elements with the given tag name
    }

    pub fn get_summary(&self) -> Option<String> {
        // Extract summary text
    }

    pub fn get_parameters(&self) -> Vec<(String, String)> {
        // Extract parameter documentation
    }
}
```
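The method bodies above are elided. As an illustration only, a recursive lookup and a summary extractor could be written as follows. The types are simplified stand-ins (plain `String` names, no attributes), and `summary_text` is a hypothetical free function rather than the crate's `get_summary`.

```rust
/// Simplified stand-ins for the documented types.
#[derive(Debug, PartialEq)]
struct XmlElement {
    name: String,
    children: Vec<XmlNode>,
}

#[derive(Debug, PartialEq)]
enum XmlNode {
    Element(XmlElement),
    Text(String),
}

/// Collects every element named `name`, searching nested elements too.
fn find_elements_by_name<'a>(nodes: &'a [XmlNode], name: &str) -> Vec<&'a XmlElement> {
    let mut found = Vec::new();
    for node in nodes {
        if let XmlNode::Element(el) = node {
            if el.name == name {
                found.push(el);
            }
            found.extend(find_elements_by_name(&el.children, name));
        }
    }
    found
}

/// Concatenates the direct text children of the first `summary` element.
fn summary_text(nodes: &[XmlNode]) -> Option<String> {
    let summary = find_elements_by_name(nodes, "summary").into_iter().next()?;
    let text: String = summary
        .children
        .iter()
        .filter_map(|child| match child {
            XmlNode::Text(t) => Some(t.as_str()),
            XmlNode::Element(_) => None,
        })
        .collect();
    Some(text.trim().to_string())
}
```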
Documentation Generation
The parsed XML documentation can be used for:
- API documentation generation
- IntelliSense information
- Code analysis and quality checks
- Documentation coverage reports
Performance Considerations
Lazy Parsing
XML documentation can be parsed lazily when needed:
```rust
#[derive(Debug, Clone)]
pub enum DocumentationState {
    Unparsed(String),
    Parsed(XmlDocumentationComment),
    Invalid(String, ParseError),
}
```
Memory Optimization
The parser optimizes memory usage by:
- String interning: Reusing common XML tag names
- Structured storage: Efficient representation of XML structure
- On-demand parsing: Parse XML only when accessed
The comment parsing system ensures that all documentation and comments are preserved and available for analysis, while maintaining the performance characteristics needed for large codebases.
Preprocessor Directives
This parser treats preprocessor directives as trivia that can appear at safe boundaries (file start, between members inside namespaces and type bodies). We currently parse only a small subset explicitly and skip the rest.
What is parsed today
- #pragma lines are parsed into PreprocessorDirective::Pragma { pragma: String }.
- #line lines are parsed into PreprocessorDirective::Line { line: String }.
- Any other line starting with # is recognized and consumed as PreprocessorDirective::Unknown { text: String } (the remainder of the line after #).
All directive parsers consume the optional trailing newline so the main parser can continue cleanly at the next token.
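The three cases and the newline handling can be sketched without nom. This is an illustration of the behaviour described above, not the parser's code: `Directive` mirrors the variant shapes listed, and `parse_directive` is a hypothetical name.

```rust
#[derive(Debug, PartialEq)]
enum Directive {
    Pragma { pragma: String },
    Line { line: String },
    Unknown { text: String },
}

/// Parses one directive at the start of `input`.
/// Returns the directive and the input remaining after the optional newline.
fn parse_directive(input: &str) -> Option<(Directive, &str)> {
    let after_hash = input.strip_prefix('#')?;
    // Split at the first newline; the newline itself is consumed.
    let (body, rest) = match after_hash.find('\n') {
        Some(i) => (&after_hash[..i], &after_hash[i + 1..]),
        None => (after_hash, ""),
    };
    let body = body.trim(); // also drops a trailing '\r'
    // The directive name is the first word; everything else is its argument.
    let (name, args) = body.split_once(char::is_whitespace).unwrap_or((body, ""));
    let directive = match name {
        "pragma" => Directive::Pragma { pragma: args.trim().to_string() },
        "line" => Directive::Line { line: args.trim().to_string() },
        _ => Directive::Unknown { text: body.to_string() },
    };
    Some((directive, rest))
}
```

Because the newline is consumed with the directive, the caller can resume at the first character of the next line.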
Where directives are skipped
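Directives are treated as trivia and skipped at these locations:
- At the start of a file
- Between members inside namespaces
- Between members inside type bodies
This skipping is centralized in parser/helpers/directives.rs via skip_preprocessor_directives().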
2. Symbol Definition
#define and #undef
```csharp
#define FEATURE_ENABLED
#define VERSION_2_0
#undef OLD_FEATURE
```
3. Diagnostic Directives
#warning
#warning This code is deprecated and will be removed in the next version
#error
#if UNSUPPORTED_PLATFORM
#error This platform is not supported
#endif
4. Line Directives
#line
#line 100 "OriginalFile.cs"
// Following code appears to come from line 100 of OriginalFile.cs
#line default
// Reset to actual file and line numbers
#line hidden
// Hide following lines from debugger
5. Region Directives
#region and #endregion
#region Private Methods
private void HelperMethod()
{
// Implementation
}
private void AnotherHelper()
{
// Implementation
}
#endregion
6. Pragma Directives
#pragma warning
#pragma warning disable CS0618
// Use of obsolete members
ObsoleteMethod();
#pragma warning restore CS0618
#pragma warning disable CS0162, CS0168
// Disable multiple warnings
#pragma warning restore CS0162, CS0168
#pragma checksum
#pragma checksum "file.cs" "{406EA660-64CF-4C82-B6F0-42D48172A799}" "checksum_bytes"
7. Nullable Context Directives
#nullable
#nullable enable
string? nullable = null; // Nullable reference types enabled
#nullable disable
string notNullable = null; // Warning disabled
#nullable restore
// Restore previous nullable context
Preprocessor Expression Evaluation
Symbols and Operators
Boolean Operators
#if DEBUG && !RELEASE // AND and NOT
#if WINDOWS || LINUX || MACOS // OR
#if (A && B) || (C && D) // Grouping with parentheses
Equality Operators
#if DEBUG == true // Equality with a boolean literal
#if DEBUG != RELEASE // Inequality between two symbols
Symbol Resolution
Symbols can be defined:
- Source code: #define SYMBOL
- Compiler flags: /define:SYMBOL
- Project settings: <DefineConstants>
- Environment: Predefined symbols
Predefined Symbols
Common predefined symbols:
#if NET5_0_OR_GREATER // Framework version
#if WINDOWS // Platform
#if DEBUG // Configuration
#if X64 // Architecture
Preprocessor AST Representation
Preprocessor Directive Node
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum PreprocessorDirective {
    If {
        condition: PreprocessorExpression,
        then_block: Vec<PreprocessorDirective>,
        elif_blocks: Vec<(PreprocessorExpression, Vec<PreprocessorDirective>)>,
        else_block: Option<Vec<PreprocessorDirective>>,
    },
    Define(String),
    Undef(String),
    Warning(String),
    Error(String),
    Line {
        line_number: Option<u32>,
        file_name: Option<String>,
        hidden: bool,
    },
    Region {
        name: String,
        content: Vec<PreprocessorDirective>,
    },
    Pragma {
        directive: String,
        arguments: Vec<String>,
    },
    Nullable(NullableDirective),
}
```
Preprocessor Expression
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum PreprocessorExpression {
    Symbol(String),
    Not(Box<PreprocessorExpression>),
    And(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Or(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Equal(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    NotEqual(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Parenthesized(Box<PreprocessorExpression>),
    Literal(String),
}
```
Conditional Compilation Processing
Block Structure
Conditional blocks create a tree structure:
#if CONDITION_A
// Block A
#if NESTED_CONDITION
// Nested block
#endif
#elif CONDITION_B
// Block B
#else
// Default block
#endif
Active Code Determination
The preprocessor determines which code blocks are active:
- Evaluate conditions: Process #if expressions
- Symbol lookup: Resolve defined symbols
- Block selection: Choose active code paths
- Nested processing: Handle nested conditionals
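The first two steps, condition evaluation and symbol lookup, amount to a small recursive evaluator over a set of defined symbols. The sketch below uses a reduced copy of the expression enum (only symbols, negation, and the two boolean operators), and `evaluate` is a hypothetical function name.

```rust
use std::collections::HashSet;

/// Reduced version of the documented expression enum.
enum PreprocessorExpression {
    Symbol(String),
    Not(Box<PreprocessorExpression>),
    And(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
    Or(Box<PreprocessorExpression>, Box<PreprocessorExpression>),
}

/// Evaluates `expr` against the set of currently defined symbols.
fn evaluate(expr: &PreprocessorExpression, defined: &HashSet<&str>) -> bool {
    use PreprocessorExpression::*;
    match expr {
        // A symbol is true exactly when it has been defined.
        Symbol(name) => defined.contains(name.as_str()),
        Not(inner) => !evaluate(inner, defined),
        And(lhs, rhs) => evaluate(lhs, defined) && evaluate(rhs, defined),
        Or(lhs, rhs) => evaluate(lhs, defined) || evaluate(rhs, defined),
    }
}
```

Block selection then reduces to picking the first branch whose condition evaluates to true, falling back to the else branch if there is one.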
Integration with Main Parser
Two-Phase Parsing
- Preprocessor Phase: Process directives and determine active code
- Main Parse Phase: Parse the active code sections
Conditional Code Exclusion
Inactive code blocks are:
- Excluded from parsing: Not processed by main parser
- Preserved in AST: Available for analysis tools
- Marked as inactive: Flagged for tooling
Directive Preservation
All directives are preserved for:
- Code formatting tools
- Refactoring utilities
- Documentation generation
- Build system integration
Error Handling
Directive Validation
The parser validates:
- Balanced conditionals: Every #if has matching #endif
- Valid expressions: Preprocessor expressions are syntactically correct
- Symbol definitions: #define follows naming rules
- Pragma syntax: Pragma directives have valid format
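The balanced-conditionals check is essentially a stack discipline. Here is a minimal line-based sketch: it assumes directives start a line and ignores directive-like text inside strings or comments. `check_balanced` is a hypothetical helper, and reporting only a line number is a simplification of real error reporting.

```rust
/// Checks that `#if`/`#elif`/`#else`/`#endif` lines are properly nested.
/// On failure, returns the 1-based line number of the offending directive
/// (for an unclosed block, the line of its `#if`).
fn check_balanced(source: &str) -> Result<(), usize> {
    // Stack of open `#if` directives: (line number, whether `#else` was seen).
    let mut open: Vec<(usize, bool)> = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        let line_no = idx + 1;
        let rest = match line.trim_start().strip_prefix('#') {
            Some(rest) => rest,
            None => continue,
        };
        match rest.split_whitespace().next().unwrap_or("") {
            "if" => open.push((line_no, false)),
            "elif" => match open.last() {
                Some((_, false)) => {}
                _ => return Err(line_no), // no open #if, or #elif after #else
            },
            "else" => match open.last_mut() {
                Some((_, seen_else)) if !*seen_else => *seen_else = true,
                _ => return Err(line_no), // no open #if, or a second #else
            },
            "endif" => {
                if open.pop().is_none() {
                    return Err(line_no);
                }
            }
            _ => {}
        }
    }
    // Anything left on the stack was never closed.
    match open.first() {
        Some((line_no, _)) => Err(*line_no),
        None => Ok(()),
    }
}
```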
Error Recovery
When encountering malformed directives:
- Skip invalid directives: Continue parsing
- Report detailed errors: Show directive location and issue
- Maintain structure: Keep conditional block structure intact
Advanced Features
Nested Regions
#region Outer Region
#region Inner Region
// Nested region content
#endregion
#endregion
Complex Pragma Directives
#pragma warning disable IDE0051 // Remove unused private members
#pragma warning restore IDE0051
#nullable enable warnings
#nullable disable annotations
Source Mapping
Line directives affect source mapping:
#line 1 "Generated.cs"
// This appears to come from Generated.cs line 1
var generated = true;
#line default
// Back to actual file location
Usage in Analysis
Conditional Code Analysis
Analysis tools can:
- Detect dead code: Find code that's never compiled
- Track feature flags: Analyze conditional compilation usage
- Generate reports: Show compilation configurations
Symbol Tracking
Track symbol definitions and usage:
- Definition locations: Where symbols are defined
- Usage contexts: Where symbols are referenced
- Scope analysis: Symbol visibility across files
Performance Considerations
Preprocessing Optimization
- Symbol caching: Cache symbol resolution results
- Lazy evaluation: Process conditionals only when needed
- Memory efficiency: Minimize directive storage overhead
Integration Efficiency
- Single-pass processing: Process directives during parsing
- Minimal backtracking: Avoid reparsing conditional blocks
- Incremental updates: Support for incremental parsing with directive changes
The preprocessor directive system ensures that all C# preprocessing features are supported while maintaining the ability to analyze and manipulate code across different compilation configurations.
Spans
This page explains how source spans are represented and returned during parsing.
Span Type
- Type:
bsharp_parser::syntax::span::Span<'a> - Alias:
type Span<'a> = nom_locate::LocatedSpan<&'a str>; - Provides line/column offsets and byte positions for parser errors and mapping.
#![allow(unused)] fn main() { // src/bsharp_parser/src/syntax/span.rs pub type Span<'a> = nom_locate::LocatedSpan<&'a str>; }
Parsing With Spans
Use the parser facade to parse and also get a span table for top-level declarations.
```rust
use bsharp_parser::facade::Parser;

let source = std::fs::read_to_string("Program.cs")?;
let (cu, spans) = Parser::new().parse_with_spans(&source)?;
```

- The return value is (CompilationUnit, SpanTable).
- SpanTable maps top-level declarations to byte ranges for later mapping.
Error Reporting
Pretty error formatting uses Span to print line/column with context:
```rust
use bsharp_parser::syntax::errors::format_error_tree;

let msg = format_error_tree(&source, &error_tree);
```
See: docs/parser/error-handling.md for details.
Syntax Traits
Core traits used by AST types and formatting emitters.
AstNode
- Path: bsharp_syntax::node::ast_node::AstNode
- Implemented by all syntax node types for traversal and visualization.

```rust
pub trait AstNode: Any {
    fn as_any(&self) -> &dyn Any;

    fn children<'a>(&'a self, _push: &mut dyn FnMut(NodeRef<'a>)) {}

    fn node_kind(&self) -> &'static str {
        core::any::type_name::<Self>()
    }

    fn node_label(&self) -> String {
        format!("{} ({})", self.node_kind(), core::any::type_name::<Self>())
    }
}
```
Helpers:
- NodeRef<'a> alias to DynNodeRef<'a> for dynamic traversal.
- push_child(push, node) to push typed children.
Emit and Emitter
- Path: bsharp_syntax::emitters::emit_trait::{Emit, Emitter, EmitCtx}
- Emit is implemented by nodes that can render themselves as C# code.
- Emitter writes items to String (or writer) using a mutable EmitCtx.

```rust
pub trait Emit {
    fn emit<W: std::fmt::Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}
```
EmitCtx controls indentation, simple policies, and optional JSONL tracing.
Rendering Helpers
- Graph renderers in bsharp_syntax::node::render::{to_text, to_mermaid, to_dot} operate on &impl AstNode.
See Also
- docs/syntax/spans.md
- docs/syntax/derive-macros.md
- docs/syntax/formatter.md
Derive Macros
Procedural macros used by syntax nodes to implement traversal and visualization behavior.
#[derive(AstNode)]
- Crate: bsharp_syntax_derive
- Implements: bsharp_syntax::node::ast_node::AstNode for your struct/enum
- Purpose: Auto-generates children() to enable dynamic traversal via NodeRef/DynNodeRef.
How it works
For each field, the macro emits code to push children appropriately:
- Option<T>: pushes inner T if present
- Vec<T>: iterates and pushes each T
- Box<T>: borrows inner &T and pushes it
- Other types: treated as AST nodes by default
- Primitive-like types are skipped: bool, numbers, char, String, and internal primitive enums like PrimitiveType
Excerpt from implementation (src/bsharp_syntax_derive/src/lib.rs):
```rust
#[proc_macro_derive(AstNode)]
pub fn derive_ast_node(input: TokenStream) -> TokenStream {
    // ...
    impl crate::node::ast_node::AstNode for #name {
        fn as_any(&self) -> &dyn ::core::any::Any { self }

        fn children<'a>(&'a self, push: &mut dyn FnMut(crate::node::ast_node::NodeRef<'a>)) {
            // Generated per-type based on fields
        }
    }
}
```
Helper routine decides how to push for common containers:
```rust
fn gen_push_for_type(ty: &Type, access: TokenStream) -> TokenStream {
    // Handles Option<T>, Vec<T>, Box<T>, or default to AST node push
}
```
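To show what the per-field rules amount to, here is a hand-written equivalent for a toy struct. All types below are hypothetical stand-ins, including this simplified `NodeRef`, and the real generated code may differ in detail.

```rust
/// Toy node types; the real derive works on bsharp_syntax nodes.
struct Expr {
    text: String,
}

struct Block {
    statements: Vec<Expr>,
}

struct IfStatement {
    condition: Box<Expr>,
    then_block: Block,
    else_block: Option<Block>,
    is_unsafe: bool, // primitive: skipped by traversal
}

/// Borrowed view over any node, standing in for `NodeRef`.
enum NodeRef<'a> {
    Expr(&'a Expr),
    Block(&'a Block),
}

impl IfStatement {
    /// Roughly what a derived `children()` would do for this struct.
    fn children<'a>(&'a self, push: &mut dyn FnMut(NodeRef<'a>)) {
        push(NodeRef::Expr(&*self.condition)); // Box<T>: borrow the inner value
        push(NodeRef::Block(&self.then_block)); // plain node field
        if let Some(block) = &self.else_block {
            push(NodeRef::Block(block)); // Option<T>: push only if present
        }
        // `is_unsafe` is a bool, so nothing is pushed for it.
    }
}

impl Block {
    fn children<'a>(&'a self, push: &mut dyn FnMut(NodeRef<'a>)) {
        for statement in &self.statements {
            push(NodeRef::Expr(statement)); // Vec<T>: push each element
        }
    }
}
```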
Usage
Add the derive to your AST types in bsharp_syntax:
```rust
#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum Expression {
    Literal(Literal),
    Variable(Identifier),
    Invocation(Box<InvocationExpression>),
    // ...
}
```
This enables:
- Graph rendering via to_text, to_mermaid, to_dot
- Traversal via AstWalker/Visitor
- Query API (by way of NodeRef children)
Guidelines
- Ensure child fields are typed as AST nodes or containers of AST nodes for traversal to work.
- Keep primitive data out of traversal (the derive already skips standard primitives).
- Favor Box<T> for recursive enum variants to keep sizes reasonable.
See Also
- docs/syntax/traits.md – AstNode, NodeRef
- docs/analysis/traversal-guide.md – traversal patterns
- docs/development/query-cookbook.md – query examples
Formatter and Emitters
This page describes the formatting architecture in BSharp, implemented in the bsharp_syntax crate.
Overview
The formatter is an AST-driven emitter that produces the final C# text directly. There is no post-processing pass (no normalize_text): the output is exactly what emitters write.
- Core types:
  - Formatter
  - FormatOptions
- Emission is instrumentable via a JSONL trace for debugging and profiling.
FormatOptions
```rust
pub struct FormatOptions {
    pub indent_width: usize,                      // default: 4 spaces
    pub newline: &'static str,                    // "\n" or "\r\n"
    pub max_consecutive_blank_lines: u8,          // default: 1
    pub blank_line_between_members: bool,         // default: true
    pub ensure_final_newline: bool,               // default: true (emit one final newline if any content)
    pub trim_trailing_whitespace: bool,           // default: true
    pub instrument_emission: bool,                // default: false
    pub trace_file: Option<std::path::PathBuf>,   // optional JSONL output
    pub current_file: Option<std::path::PathBuf>, // helpful in messages
}
```

- Newline mode is controlled by CLI --newline-mode or defaults to LF.
- Emission tracing can be toggled via CLI --emit-trace or BSHARP_EMIT_TRACE=1.
Brace Style and Spacing Policy
- Brace style: All containers and headers use Allman style
  - Header ends the line (e.g., namespace X, class C, void M())
  - Next line is an opening {, indented body, then closing } on its own line.
- Spacing is centralized in simple policy helpers (see src/bsharp_syntax/src/emitters/policy.rs):
  - between_header_and_body_of_file → blank line between file header (e.g., file-scoped ns) and body
  - after_file_scoped_namespace_header → blank line after namespace X.Y;
  - between_using_blocks_and_declarations → blank line after using block before first declaration
  - between_top_level_declarations → single separator newline between top-level declarations
  - between_members → single separator newline between adjacent type members
  - between_block_items → optional extra newline inside a block when a control-flow block (if/for/while/do/switch/inner block) is followed by a declaration
Notes:
- Policies are invoked from emitters; emitters themselves keep logic minimal and do not hardcode extra blank lines.
- Interfaces, classes, structs, and records call between_members between members; the boolean blank_line_between_members toggles this globally.
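The Allman rule is simple enough to sketch in a few lines. The example below is illustrative only: `Ctx` is a hypothetical, much-reduced stand-in for EmitCtx, `emit_block` is not a function from the crate, and it writes LF newlines unconditionally.

```rust
use std::fmt::Write;

/// Minimal emission context that tracks indentation only.
struct Ctx {
    indent: usize,
    indent_width: usize,
}

impl Ctx {
    /// Writes one indented line followed by a newline.
    fn line(&self, out: &mut String, text: &str) {
        let pad = " ".repeat(self.indent * self.indent_width);
        writeln!(out, "{pad}{text}").unwrap();
    }
}

/// Emits `header`, then an Allman-style braced body produced by `body`.
fn emit_block(
    out: &mut String,
    cx: &mut Ctx,
    header: &str,
    body: impl FnOnce(&mut String, &mut Ctx),
) {
    cx.line(out, header); // header ends the line
    cx.line(out, "{"); // opening brace on its own line
    cx.indent += 1;
    body(out, cx);
    cx.indent -= 1;
    cx.line(out, "}"); // closing brace on its own line
}
```

Nesting falls out of the indentation counter, so a class containing a method needs no layout logic beyond calling `emit_block` inside the body closure.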
End-of-file Newline
- The CompilationUnit emitter ensures at most one final newline at EOF.
- There are no per-statement trailing newlines at the root; separation is handled by policy functions.
Usage
```rust
use bsharp_syntax::{Formatter, FormatOptions};

let mut opts = FormatOptions::default();
opts.newline = "\n";
opts.max_consecutive_blank_lines = 1;
opts.blank_line_between_members = true;
opts.trim_trailing_whitespace = true;

let fmt = Formatter::new(opts);
let output = fmt.format_compilation_unit(&cu)?; // cu: CompilationUnit
```
Emission Trace (JSONL)
When instrumentation is enabled, the formatter emits a stream of JSON objects describing emission steps.
- CLI integration:
  - --emit-trace to enable
  - --emit-trace-file <FILE> to write to a file (stdout by default)
  - Env var BSHARP_EMIT_TRACE=1 acts as a default toggle
The trace can be useful to:
- Diagnose spacing/blank line decisions (look for action: "policy" with names like between_members, between_top_level_declarations, between_block_items)
- Identify costly emission paths
- Reproduce formatting anomalies
Typical actions include: enter_node, open_brace, close_brace, newline, space, token, and policy.
Integration with CLI
- See bsharp format in docs/cli/format.md for options mapping to FormatOptions.
- Files that fail to parse are skipped; a summary is printed.
- With --write false on a single file input, the formatted output is printed to stdout.
Design Notes
- Emitters are AST-driven to preserve structure while normalizing whitespace and layout based on policies.
- The formatter avoids changing semantics and focuses on consistent style.
- Options default to safe, conservative values and can be tuned via CLI.
Analysis Framework Overview
The BSharp analysis framework provides a comprehensive suite of tools for analyzing C# code at various levels of detail. It is built on top of the BSharp parser infrastructure and offers insights into code structure, quality, dependencies, and maintainability. These capabilities support standalone analysis tools and editor/CI integrations.
Analysis Architecture
The analysis framework is organized into specialized modules:
src/bsharp_analysis/src/
├── framework/ # pipeline, passes, registry, session, walker, query
├── passes/ # indexing, metrics, control_flow, dependencies, reporting
├── artifacts/ # symbols, cfg, dependencies
├── metrics/ # AstAnalysis data + shared helpers
├── rules/ # naming, semantic, control_flow_smells
├── report/ # AnalysisReport assembly
└── (no quality module)
Analysis Capabilities
Control Flow Analysis
- Path Analysis: Identify all possible execution paths through methods
- Reachability: Detect unreachable code sections
- Complexity Metrics: Calculate cyclomatic complexity and other flow-based metrics
- Dead Code Detection: Find code that can never be executed
Dependency Analysis
- Type Dependencies: Track relationships between types
- Assembly Dependencies: Analyze external assembly usage
- Circular Dependencies: Detect problematic dependency cycles
- Coupling Metrics: Measure afferent and efferent coupling
Code Metrics
Comprehensive metrics collection across multiple dimensions:
Complexity Metrics
- Cyclomatic Complexity
- Cognitive Complexity
- Nesting Depth
- Method Length
Size Metrics
- Lines of Code (LOC)
- Source Lines of Code (SLOC)
- Comment Lines
- Method Count per Class
Maintainability Metrics
- Maintainability Index
- Technical Debt Indicators
- Code Duplication Detection
- Halstead Metrics
Rules
- Naming Rules: Basic naming convention checks
- Control Flow Smells: Simple flow-related smells (e.g., deep nesting warnings)
Type Analysis
- Type Usage: Track how types are used throughout the codebase
- Generic Analysis: Analyze generic type usage patterns
- Inheritance Hierarchies: Map class and interface hierarchies
- Interface Compliance: Validate interface implementations
Analysis Workflow
1. AST Preparation
All analysis begins with a parsed AST:
let parser = Parser::new();
let compilation_unit = parser.parse(source_code)?;
2. Pipeline
Use the framework pipeline with registered passes. Per-file runs populate typed artifacts; a final AnalysisReport summarizes metrics, control flow, and dependencies.
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::report::AnalysisReport;
use bsharp_parser::facade::Parser;

let parser = Parser::new();
let (cu, spans) = parser.parse_with_spans(source_code)?;
let ctx = AnalysisContext::new("file.cs", source_code);
let mut session = AnalysisSession::new(ctx, spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
let report: AnalysisReport = AnalysisReport::from_session(&session);
3. Analysis Execution
The pipeline runs passes in phases:
- Index → Metrics (local) → Global (CFG, deps) → Semantic rules → Reporting
Artifacts (e.g., AstAnalysis, ControlFlowIndex, DependencyGraph) are inserted into the AnalysisSession and consumed by reporting.
4. Results Processing
Analysis results are structured for easy consumption:
// Metrics results
println!("Cyclomatic Complexity: {}", metrics.cyclomatic_complexity);
println!("Lines of Code: {}", metrics.lines_of_code);

// Diagnostics
for d in &report.diagnostics.diagnostics {
    println!("{}: {}", d.code, d.message);
}
Analysis Registry and Passes
Analyses are implemented as AnalyzerPass implementations registered in an AnalyzerRegistry and executed by the AnalyzerPipeline. Local rulesets and semantic rulesets run alongside passes based on Phase.
Configuration and Customization
Analysis Configuration
Analyzers can be configured for different scenarios:
let config = AnalysisConfig {
    max_cyclomatic_complexity: 10,
    max_method_length: 50,
    enforce_naming_conventions: true,
    detect_code_smells: true,
    // ... other configuration options
};
let analyzer = MetricsAnalyzer::with_config(config);
Custom Rules
Extend analysis with custom rules:
let custom_analyzer = QualityAnalyzer::new()
    .add_rule(CustomRule::new("no-goto-statements"))
    .add_rule(CustomRule::new("max-parameters", 5))
    .add_rule(CustomRule::new("prefer-composition"));
Reporting Options
Flexible reporting formats:
// JSON output
let json_report = analyzer.analyze(&ast).to_json();

// XML output
let xml_report = analyzer.analyze(&ast).to_xml();

// Custom format
let custom_report = analyzer.analyze(&ast).format_with(custom_formatter);
Integration Points
CLI Integration
Analysis capabilities are exposed through the analyze command and configured via options (format, config, include/exclude, enable/disable passes and rulesets, severity overrides). See docs/cli/analyze.md for details.
Programmatic Usage
Direct integration in tools typically runs the pipeline and pulls artifacts from the session:
use std::fs;

use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::{AnalyzerPipeline, AnalysisSession};
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

// Read and parse the file, keeping spans for diagnostics
let source = fs::read_to_string(path)?;
let (cu, spans) = Parser::new().parse_with_spans(&source)?;

// Run the default passes, then pull the metrics artifact from the session
let mut session = AnalysisSession::new(AnalysisContext::new(path, &source), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);
if let Some(ast) = session.artifacts.get::<AstAnalysis>() {
    println!("methods={} complexity={}", ast.total_methods, ast.cyclomatic_complexity);
}
Performance Characteristics
Analysis Performance
- Incremental Analysis: Support for analyzing only changed parts
- Parallel Processing: Multi-threaded analysis for large codebases
- Memory Efficiency: Minimal memory overhead during analysis
- Caching: Results caching for repeated analysis
Scalability
The framework scales from single files to large enterprise codebases:
- Single file analysis: Sub-second performance
- Medium projects (100+ files): Seconds to minutes
- Large codebases (1000+ files): Minutes with parallel processing
This analysis framework provides the foundation for building sophisticated code quality tools, IDE integrations, and automated code review systems.
Analysis Pipeline
This document describes the analysis pipeline architecture, artifacts, rulesets, configuration toggles, and determinism guarantees in the B# analyzer.
Phases
The pipeline runs in deterministic phases (see src/bsharp_analysis/src/framework/pipeline.rs):
- Index
  - Runs early passes like IndexingPass to populate core artifacts (SymbolIndex, NameIndex, FqnMap).
- Local Rules
  - Runs per-file passes such as MetricsPass (Query-based) to compute artifacts like AstAnalysis.
  - Local rulesets run here as well; use bsharp_analysis::framework::Query for AST enumeration.
- Global
  - Passes that aggregate information across the file (or project) after initial indexing.
- Semantic
  - Rules and passes that require previously built artifacts (e.g., control flow, dependencies).
- Reporting
  - Finalization phase that can synthesize report artifacts.
Each phase is explicitly selected in AnalyzerPipeline::run_for_file() using Phase discriminants. Pass and ruleset registration is driven by AnalyzerRegistry.
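The phase ordering described above can be pictured with a small, self-contained sketch. The Phase variants follow the names used in this documentation, but PassInfo, execution_order, and the tie-break by pass id are assumptions made for illustration; the real AnalyzerPipeline and AnalyzerRegistry may order passes differently within a phase.

```rust
/// Phases in execution order; deriving `Ord` makes the declaration order
/// the sort order. Variant names follow the docs, the type is a stand-in.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Index,
    LocalRules,
    Global,
    Semantic,
    Reporting,
}

/// Hypothetical registration record (not the real `AnalyzerPass` trait).
struct PassInfo {
    id: &'static str,
    phase: Phase,
}

/// Returns pass ids in execution order: by phase first, then by id, so the
/// result does not depend on registration order (the tie-break is an assumption).
fn execution_order(mut passes: Vec<PassInfo>) -> Vec<&'static str> {
    passes.sort_by(|a, b| a.phase.cmp(&b.phase).then(a.id.cmp(b.id)));
    passes.into_iter().map(|p| p.id).collect()
}
```

Sorting on a derived ordering keeps the phase sequence in one place: adding a phase only requires inserting a variant at the right position.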
Artifacts
Artifacts are stored in the per-file AnalysisSession.artifacts and summarized into an AnalysisReport:
- Symbols (src/bsharp_analysis/src/artifacts/symbols.rs)
  - SymbolIndex (by id and name), NameIndex (name frequencies), FqnMap (local name → FQNs).
- Control Flow (src/bsharp_analysis/src/artifacts/cfg.rs)
  - ControlFlowIndex keyed per method; summarized to CfgSummary with total methods and smell counts.
- Dependencies (src/bsharp_analysis/src/artifacts/dependencies.rs)
  - Graph keyed by symbols; summarized to node/edge counts.
- Metrics (src/bsharp_analysis/src/artifacts/metrics.rs → AstAnalysis)
  - Basic metrics gathered during the local traversal.
Artifacts are optional in the final report; missing artifacts simply result in None summaries.
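One common way to get "optional, typed artifacts" is a store keyed by type, where a lookup for an artifact that no pass produced simply yields None. The sketch below illustrates that idea only; ArtifactStore and the two reduced summary structs are hypothetical stand-ins, not the actual AnalysisSession.artifacts implementation.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical type-keyed store; the real artifact container may differ.
#[derive(Default)]
struct ArtifactStore {
    items: HashMap<TypeId, Box<dyn Any>>,
}

impl ArtifactStore {
    /// Inserts (or replaces) the single artifact of type `T`.
    fn insert<T: Any>(&mut self, value: T) {
        self.items.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Returns the artifact of type `T`, or `None` if no pass produced it.
    fn get<T: Any>(&self) -> Option<&T> {
        self.items.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }
}

// Reduced, illustrative artifact types (not the real definitions).
struct CfgSummary {
    total_methods: usize,
}
struct DependencySummary {
    nodes: usize,
}
```

With this shape, a report builder can call get for each artifact type and map a missing entry directly to a None summary.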
Rulesets and Passes
Rules implement the Rule trait and are grouped into logical rulesets. Passes implement AnalyzerPass and declare a Phase:
- Rulesets are separated into Local vs. Semantic groups and executed during the respective phases.
- Passes can be toggled individually by id.
- The registry is created with AnalyzerRegistry::from_config(&AnalysisConfig) to honor config toggles.
Configuration
AnalysisConfig (src/bsharp_analysis/src/context.rs) controls thresholds and toggles:
- Control flow thresholds
  - cf_high_complexity_threshold (default 10)
  - cf_deep_nesting_threshold (default 4)
- Toggles
  - enable_rulesets: HashMap<String, bool>
  - enable_passes: HashMap<String, bool>
  - rule_severities: HashMap<String, DiagnosticSeverity>
- Workspace filters
  - workspace.follow_refs: bool
  - workspace.include: Vec<String> (glob patterns)
  - workspace.exclude: Vec<String> (glob patterns)
CLI maps flags to these fields in src/bsharp_cli/src/commands/analyze.rs and supports TOML/JSON config files.
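As a rough illustration of how such thresholds and toggles could be consulted, here is a reduced stand-in for AnalysisConfig. The field names and the two default values come from the list above; the usize threshold type, the is_pass_enabled helper, and the "enabled unless explicitly disabled" policy are assumptions, not confirmed behavior.

```rust
use std::collections::HashMap;

/// Reduced stand-in for `AnalysisConfig`; only a few documented fields appear.
struct AnalysisConfig {
    cf_high_complexity_threshold: usize,
    cf_deep_nesting_threshold: usize,
    enable_passes: HashMap<String, bool>,
}

impl Default for AnalysisConfig {
    fn default() -> Self {
        Self {
            cf_high_complexity_threshold: 10, // documented default
            cf_deep_nesting_threshold: 4,     // documented default
            enable_passes: HashMap::new(),
        }
    }
}

impl AnalysisConfig {
    /// ASSUMPTION: a pass with no explicit entry is treated as enabled.
    fn is_pass_enabled(&self, id: &str) -> bool {
        self.enable_passes.get(id).copied().unwrap_or(true)
    }
}
```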
Workspace Analysis and Determinism
AnalyzerPipeline::run_workspace() and run_workspace_with_config():
- Discover files deterministically by sorting absolute paths and deduping.
- Analyze each file independently, then merge artifacts into a single AnalysisReport.
- Diagnostics are sorted by file, line, column, then diagnostic code for stable output.
- Workspace loader warnings/errors are merged into workspace_warnings (sorted, deduped).
- When the parallel_analysis feature is enabled, files are analyzed in parallel but merged deterministically in path order.
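The ordering rules above amount to two small normalizations. The following sketch shows one way to express them; Diag and its field names are hypothetical, and the functions are illustrations rather than the pipeline's actual code.

```rust
/// Hypothetical diagnostic record; the field names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Diag {
    file: String,
    line: u32,
    column: u32,
    code: String,
}

/// Sorts by file, line, column, then diagnostic code, as described above.
fn sort_diagnostics(diags: &mut [Diag]) {
    diags.sort_by(|a, b| {
        (&a.file, a.line, a.column, &a.code).cmp(&(&b.file, b.line, b.column, &b.code))
    });
}

/// Sorts and dedups discovered paths so discovery order does not matter.
fn normalize_paths(mut paths: Vec<String>) -> Vec<String> {
    paths.sort();
    paths.dedup();
    paths
}
```

Because both steps use a total order on plain values, per-file results can be produced in any order (including in parallel) and still merge to the same output.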
Report Schema
AnalysisReport (src/bsharp_analysis/src/report/mod.rs) includes:
- schema_version: u32 (currently 1)
- diagnostics: DiagnosticCollection
- metrics: Option<AstAnalysis>
- cfg: Option<CfgSummary>
- deps: Option<DependencySummary>
- workspace_warnings: Vec<String>
- workspace_errors: Vec<String> (reserved for future use)
The JSON shape is intentionally stable; tests use snapshots with path normalization to ensure cross-platform consistency.
Testing Guidance
- Prefer deterministic fixtures under tests/fixtures/.
- Normalize absolute paths in snapshots (see tests/integration/workspace_analysis_snapshot.rs).
- For workspace filtering, use run_workspace_with_config() with include/exclude globs and snapshot the resulting report.
Analysis Traversal Guide
This guide explains how to traverse BSharp AST statements and expressions in analysis passes using the current framework.
- Source files:
  - src/bsharp_analysis/src/framework/walker.rs
  - src/bsharp_analysis/src/framework/query/
  - src/bsharp_analysis/src/passes/*
Statement traversal
Use AstWalker for single-pass traversal with the Visit trait, or the Query API for typed filtering.
Example using AstWalker + Visit to count if statements:
use bsharp_analysis::framework::{AstWalker, Visit, NodeRef, AnalysisSession};

struct CountIfs {
    pub ifs: usize,
}

impl Visit for CountIfs {
    fn enter(&mut self, node: &NodeRef, _session: &mut AnalysisSession) {
        if let NodeRef::Statement(s) = node {
            if matches!(s, bsharp_syntax::statements::statement::Statement::If(_)) {
                self.ifs += 1;
            }
        }
    }
}
Expression traversal
Use Query for typed expression searches:
use bsharp_analysis::framework::{NodeRef, Query};
use bsharp_syntax::expressions::AwaitExpression;

let await_count = Query::from(NodeRef::CompilationUnit(&cu))
    .of::<AwaitExpression>()
    .count();
Putting it together
When analyzing methods, you typically:
- Parse the compilation unit and build the analysis session.
- For each method body (a Statement::Block), compute metrics by walking statements and expressions.
Example (from ControlFlowPass pattern):
use bsharp_analysis::artifacts::cfg::{ControlFlowIndex, MethodControlFlowStats};
use bsharp_syntax::statements::statement::Statement;

fn stats_for_method(body: Option<&Statement>) -> MethodControlFlowStats {
    let complexity = match body {
        Some(s) => 1 + decision_points(s),
        None => 1,
    };
    let max_nesting = calc_max_nesting(body, 0);
    let exit_points = count_exit_points(body);
    let statement_count = count_statements(body);
    MethodControlFlowStats { complexity, max_nesting, exit_points, statement_count }
}
See src/bsharp_analysis/src/metrics/shared.rs for helpers like decision_points, max_nesting_of, count_statements and src/bsharp_analysis/src/passes/control_flow.rs for usage.
Tips
- Keep walkers side-effect free; accumulate results in closures.
- Prefer small, focused passes that use the walkers rather than embedding traversal in each pass.
- If a construct is not being traversed, add it to the walker first to avoid duplicated traversal logic.
Control Flow Analysis
The control flow analysis system analyzes method control flow to calculate complexity metrics, detect control flow smells, and identify potential issues.
Overview
Location: src/bsharp_analysis/src/passes/control_flow.rs, src/bsharp_analysis/src/artifacts/cfg.rs
Control flow analysis provides:
- Cyclomatic complexity calculation
- Maximum nesting depth tracking
- Exit point counting
- Statement counting
- Control flow smell detection
Control Flow Metrics
Cyclomatic Complexity
Definition: Number of linearly independent paths through a method
Calculation: CC = 1 + number of decision points
Decision Points:
- if statements
- case labels in switch
- Loop statements (for, foreach, while, do-while)
- catch clauses
- Logical operators (&&, ||) in conditions
- Ternary operators (?:)
- Null-coalescing operators (??)
Example:
public void ProcessOrder(Order order) { // CC = 1 (base)
    if (order == null) { // +1 = 2
        throw new ArgumentNullException();
    }
    if (order.IsValid) { // +1 = 3
        if (order.Amount > 1000) { // +1 = 4
            ApplyDiscount(order);
        }
        SaveOrder(order);
    }
}
// Total CC = 4
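To make the counting rule concrete, here is a minimal sketch over a toy AST that also counts logical operators in conditions. The Expr and Stmt types are invented for this example and are much simpler than the bsharp_syntax nodes; the sketch shows the counting rule only, not the ControlFlowPass implementation.

```rust
/// Toy expression type for illustration; not the bsharp_syntax types.
enum Expr {
    Leaf,                          // identifier, literal, call, ...
    Logical(Box<Expr>, Box<Expr>), // `&&` or `||`
}

/// Toy statement type covering only what the example needs.
enum Stmt {
    Simple,
    If { cond: Expr, then: Vec<Stmt>, otherwise: Vec<Stmt> },
    While { cond: Expr, body: Vec<Stmt> },
}

/// Each `&&` / `||` adds one decision point.
fn logical_ops(e: &Expr) -> usize {
    match e {
        Expr::Leaf => 0,
        Expr::Logical(l, r) => 1 + logical_ops(l) + logical_ops(r),
    }
}

/// Counts decision points recursively through nested bodies.
fn decision_points(stmts: &[Stmt]) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::Simple => 0,
            Stmt::If { cond, then, otherwise } => {
                1 + logical_ops(cond) + decision_points(then) + decision_points(otherwise)
            }
            Stmt::While { cond, body } => 1 + logical_ops(cond) + decision_points(body),
        })
        .sum()
}

/// CC = 1 + number of decision points.
fn cyclomatic_complexity(body: &[Stmt]) -> usize {
    1 + decision_points(body)
}
```

Modelling ProcessOrder as two top-level if statements, the second containing a nested if, yields 1 + 3 = 4, matching the annotated example.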
Maximum Nesting Depth
Definition: Deepest level of nested control structures
Example:
public void Example() {
    if (condition1) { // Depth 1
        while (condition2) { // Depth 2
            if (condition3) { // Depth 3
                DoSomething();
            }
        }
    }
}
// Max Nesting Depth = 3
Exit Points
Definition: Number of points at which a method can exit
Counted:
- return statements
- throw statements
- End of a void method
Example:
public int Calculate(int x) {
    if (x < 0) {
        return -1; // Exit point 1
    }
    if (x == 0) {
        return 0; // Exit point 2
    }
    return x * 2; // Exit point 3
}
// Total Exit Points = 3
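The counting rule can be sketched over a toy statement type. Everything here is hypothetical (Stmt, count_jumps, exit_points), and the handling of the implicit exit is an assumption: it is counted only when a void method's body does not already end in return or throw.

```rust
/// Toy statement type for illustration; not the bsharp_syntax types.
enum Stmt {
    Simple,
    Return,
    Throw,
    If { then: Vec<Stmt>, otherwise: Vec<Stmt> },
}

/// Counts explicit `return` and `throw` statements, including nested ones.
fn count_jumps(stmts: &[Stmt]) -> usize {
    stmts
        .iter()
        .map(|s| match s {
            Stmt::Return | Stmt::Throw => 1,
            Stmt::If { then, otherwise } => count_jumps(then) + count_jumps(otherwise),
            Stmt::Simple => 0,
        })
        .sum()
}

/// ASSUMPTION: a void method gets one extra exit point when control can
/// fall off the end, i.e. its last statement is not `return` or `throw`.
fn exit_points(body: &[Stmt], returns_void: bool) -> usize {
    let falls_off_end =
        returns_void && !matches!(body.last(), Some(Stmt::Return | Stmt::Throw));
    count_jumps(body) + usize::from(falls_off_end)
}
```

For the Calculate example (two guarded returns plus a final return) this gives 3.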
Statement Count
Definition: Total number of statements in method body
Includes all statement types:
- Expression statements
- Declaration statements
- Control flow statements
- Jump statements
Control Flow Artifacts
MethodControlFlowStats
pub struct MethodControlFlowStats {
    pub complexity: usize,
    pub max_nesting: usize,
    pub exit_points: usize,
    pub statement_count: usize,
}
ControlFlowIndex
pub struct ControlFlowIndex {
    // Method identifier -> stats
    methods: HashMap<String, MethodControlFlowStats>,
}
CfgSummary
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_count: usize,
    pub deep_nesting_count: usize,
}
Control Flow Smells
High Complexity
Threshold: Configurable (default: 10)
Detection:
if stats.complexity > config.cf_high_complexity_threshold {
    session.diagnostics.add(
        DiagnosticCode::HighComplexity,
        format!(
            "Method complexity {} exceeds threshold {}",
            stats.complexity, config.cf_high_complexity_threshold
        ),
    );
}
Diagnostic:
warning[CF002]: High cyclomatic complexity
  --> src/OrderProcessor.cs:42:17
   |
42 |     public void ProcessOrder(Order order) {
   |                 ^^^^^^^^^^^^ complexity = 15 (threshold: 10)
   |
   = help: Consider breaking this method into smaller methods
Deep Nesting
Threshold: Configurable (default: 4)
Detection:
if stats.max_nesting > config.cf_deep_nesting_threshold {
    session.diagnostics.add(
        DiagnosticCode::DeepNesting,
        format!(
            "Maximum nesting depth {} exceeds threshold {}",
            stats.max_nesting, config.cf_deep_nesting_threshold
        ),
    );
}
Diagnostic:
warning[CF003]: Deep nesting detected
  --> src/Validator.cs:15:9
   |
15 |         if (condition1) {
   |         ^^ nesting depth = 5 (threshold: 4)
   |
   = help: Consider extracting nested logic into separate methods
Implementation
Analysis Pass
Location: src/bsharp_analysis/src/passes/control_flow.rs
pub struct ControlFlowPass;

impl AnalyzerPass for ControlFlowPass {
    fn id(&self) -> &'static str { "control_flow" }
    fn phase(&self) -> Phase { Phase::Semantic }

    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        let mut index = ControlFlowIndex::new();

        // Analyze all methods in compilation unit
        for decl in &cu.declarations {
            analyze_declaration(decl, &mut index, session);
        }

        session.artifacts.cfg = Some(index);
    }
}
Method Analysis
fn analyze_method(
    method: &MethodDeclaration,
    index: &mut ControlFlowIndex,
    session: &mut AnalysisSession
) {
    let stats = calculate_stats(method.body.as_ref());

    // Check thresholds
    if stats.complexity > session.config.cf_high_complexity_threshold {
        session.diagnostics.add(/* high complexity diagnostic */);
    }
    if stats.max_nesting > session.config.cf_deep_nesting_threshold {
        session.diagnostics.add(/* deep nesting diagnostic */);
    }

    // Store in index
    index.add_method(&method.identifier.name, stats);
}
Stats Calculation
fn calculate_stats(body: Option<&Statement>) -> MethodControlFlowStats {
    let complexity = match body {
        Some(stmt) => 1 + count_decision_points(stmt),
        None => 1,
    };
    let max_nesting = calculate_max_nesting(body, 0);
    let exit_points = count_exit_points(body);
    let statement_count = count_statements(body);

    MethodControlFlowStats {
        complexity,
        max_nesting,
        exit_points,
        statement_count,
    }
}
Configuration
Thresholds
[analysis.control_flow]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4
CLI Usage
# Analyze with custom thresholds
bsharp analyze MyProject.csproj --config .bsharp.toml
# Enable control flow pass
bsharp analyze MyProject.csproj --enable-pass control_flow
Integration with Pipeline
Phase: Semantic
Control flow analysis runs in the Semantic phase after symbol indexing:
Phase::Index -> Build SymbolIndex
Phase::Local -> Collect metrics
Phase::Semantic -> Control flow analysis
Artifacts
Results stored in AnalysisSession:
session.artifacts.cfg = Some(ControlFlowIndex { ... });
Summarized in AnalysisReport:
report.cfg = Some(CfgSummary {
    total_methods: 87,
    high_complexity_count: 5,
    deep_nesting_count: 3,
});
Related Documentation
- Analysis Pipeline - Pipeline integration
- Metrics Collection - Related metrics
- Code Quality - Quality rules
- Traversal Guide - AST traversal
References
- Implementation: src/bsharp_analysis/src/passes/control_flow.rs
- Artifacts: src/bsharp_analysis/src/artifacts/cfg.rs
- Tests: src/bsharp_tests/src/analysis/control_flow/ (planned)
Dependency Analysis
The dependency analysis system tracks relationships between types, methods, and other symbols in C# code to identify coupling, circular dependencies, and architectural issues.
Overview
Location: src/bsharp_analysis/src/artifacts/dependencies.rs
The dependency analysis builds a directed graph of symbol relationships, where:
- Nodes represent symbols (classes, interfaces, methods, etc.)
- Edges represent dependencies (inheritance, method calls, field types, etc.)
Dependency Types
Type Dependencies
Inheritance:
public class Derived : Base { } // Derived depends on Base
Interface Implementation:
public class MyClass : IInterface { } // MyClass depends on IInterface
Field Types:
public class Container {
    private Helper helper; // Container depends on Helper
}
Method Parameters and Return Types:
public Response Process(Request req) { } // Process depends on Request and Response
Member Dependencies
Method Calls:
public void Caller() {
    Helper.DoSomething(); // Caller depends on Helper.DoSomething
}
Property Access:
var value = obj.Property; // Depends on Property
Constructor Calls:
var instance = new MyClass(); // Depends on MyClass constructor
Dependency Graph Structure
DependencyGraph
pub struct DependencyGraph {
    // Symbol ID -> list of symbols it depends on
    dependencies: HashMap<SymbolId, Vec<SymbolId>>,
}
Operations
Adding Dependencies:
graph.add_dependency(from_symbol, to_symbol);
Querying Dependencies:
// Direct dependencies
let deps = graph.get_dependencies(symbol_id);

// Transitive dependencies
let all_deps = graph.get_transitive_dependencies(symbol_id);

// Reverse dependencies (who depends on this symbol)
let dependents = graph.get_dependents(symbol_id);
Circular Dependency Detection
Algorithm
The analysis uses depth-first search to detect cycles in the dependency graph:
- Start from each symbol
- Traverse dependencies depth-first
- Track the nodes on the current path
- If a node already on the current path is reached again, a cycle has been detected
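The steps above can be sketched as a small depth-first search. This is an illustration under stated assumptions, not the find_cycles implementation: SymbolId is a plain u32 stand-in, the graph is a BTreeMap so iteration order is deterministic, and the function returns only the first cycle it encounters.

```rust
use std::collections::{BTreeMap, HashSet};

type SymbolId = u32; // stand-in; the real SymbolId type may differ

/// Returns one cycle as a path `[a, b, ..., a]`, or `None` if the graph is acyclic.
fn find_cycle(graph: &BTreeMap<SymbolId, Vec<SymbolId>>) -> Option<Vec<SymbolId>> {
    let mut seen = HashSet::new(); // every node the search has entered so far
    let mut path = Vec::new();     // nodes on the current DFS path
    for &start in graph.keys() {
        if let Some(cycle) = visit(start, graph, &mut seen, &mut path) {
            return Some(cycle);
        }
    }
    None
}

fn visit(
    node: SymbolId,
    graph: &BTreeMap<SymbolId, Vec<SymbolId>>,
    seen: &mut HashSet<SymbolId>,
    path: &mut Vec<SymbolId>,
) -> Option<Vec<SymbolId>> {
    if let Some(pos) = path.iter().position(|&n| n == node) {
        // The node is already on the current path: slice out the cycle.
        let mut cycle = path[pos..].to_vec();
        cycle.push(node);
        return Some(cycle);
    }
    if !seen.insert(node) {
        return None; // fully explored earlier without finding a cycle
    }
    path.push(node);
    for &next in graph.get(&node).into_iter().flatten() {
        if let Some(cycle) = visit(next, graph, seen, path) {
            return Some(cycle);
        }
    }
    path.pop();
    None
}
```

Keeping a separate set of already-entered nodes avoids re-exploring shared subgraphs, so each node and edge is processed once apart from the linear scan of the current path.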
Example
public class A {
    private B b; // A depends on B
}
public class B {
    private C c; // B depends on C
}
public class C {
    private A a; // C depends on A -> CYCLE: A -> B -> C -> A
}
Detection:
let cycles = graph.find_cycles();
for cycle in cycles {
    // Report diagnostic for circular dependency
    session.diagnostics.add(
        DiagnosticCode::CircularDependency,
        format!("Circular dependency detected: {:?}", cycle)
    );
}
Coupling Metrics
Afferent Coupling (Ca)
Definition: Number of types that depend on this type (incoming dependencies)
Interpretation:
- High Ca = Many types depend on this type (responsibility)
- Type is stable and hard to change
Efferent Coupling (Ce)
Definition: Number of types this type depends on (outgoing dependencies)
Interpretation:
- High Ce = This type depends on many others
- Type is unstable and sensitive to changes
Instability (I)
Formula: I = Ce / (Ca + Ce)
Range: 0.0 to 1.0
- 0.0 = Maximally stable (only incoming dependencies)
- 1.0 = Maximally unstable (only outgoing dependencies)
Example:
let ca = graph.afferent_coupling(symbol_id);
let ce = graph.efferent_coupling(symbol_id);

// Guard against isolated symbols (ca + ce == 0), which would otherwise yield NaN
let instability = if ca + ce == 0 {
    0.0
} else {
    ce as f64 / (ca + ce) as f64
};

if instability > 0.8 {
    // Highly unstable type - consider refactoring
}
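For a self-contained view of the three coupling metrics, the sketch below computes them from a plain adjacency map. The map shape and the function names are assumptions made for illustration (they are not DependencyGraph methods), and each dependency list is assumed to contain no duplicates.

```rust
use std::collections::BTreeMap;

/// Efferent coupling (Ce): how many types `ty` depends on.
/// Assumes each dependency list holds no duplicates.
fn efferent(graph: &BTreeMap<&str, Vec<&str>>, ty: &str) -> usize {
    graph.get(ty).map_or(0, |deps| deps.len())
}

/// Afferent coupling (Ca): how many types depend on `ty`.
fn afferent(graph: &BTreeMap<&str, Vec<&str>>, ty: &str) -> usize {
    graph
        .values()
        .filter(|deps| deps.iter().any(|d| *d == ty))
        .count()
}

/// I = Ce / (Ca + Ce); `None` for an isolated type, where the ratio is undefined.
fn instability(ca: usize, ce: usize) -> Option<f64> {
    if ca + ce == 0 {
        None
    } else {
        Some(ce as f64 / (ca + ce) as f64)
    }
}
```

Returning Option makes the isolated-type case explicit to the caller instead of silently producing NaN or an arbitrary default.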
Dependency Summary
DependencySummary
pub struct DependencySummary {
    pub total_nodes: usize,
    pub total_edges: usize,
    pub circular_dependencies: usize,
    pub max_depth: usize,
}
Generated by: DependencyGraph::summarize()
Included in: AnalysisReport
Usage in Analysis Pipeline
Phase: Global
Dependency analysis runs in the Global phase after symbol indexing:
// In AnalyzerPipeline
Phase::Index    -> Build SymbolIndex
Phase::Global   -> Build DependencyGraph
Phase::Semantic -> Use dependencies for semantic analysis
Integration with Passes
DependencyPass (if implemented):
impl AnalyzerPass for DependencyPass {
    fn id(&self) -> &'static str { "dependencies" }
    fn phase(&self) -> Phase { Phase::Global }

    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        let graph = build_dependency_graph(cu, &session.artifacts.symbols);
        session.artifacts.dependencies = Some(graph);
    }
}
Building Dependency Graph
From CompilationUnit
pub fn build_dependency_graph(
    cu: &CompilationUnit,
    symbols: &SymbolIndex
) -> DependencyGraph {
    let mut graph = DependencyGraph::new();

    // Visit all declarations
    for decl in &cu.declarations {
        match decl {
            TopLevelDeclaration::Class(class) => {
                analyze_class_dependencies(class, symbols, &mut graph);
            }
            // ... other declaration types
        }
    }

    graph
}
From Class Declaration
fn analyze_class_dependencies(
    class: &ClassDeclaration,
    symbols: &SymbolIndex,
    graph: &mut DependencyGraph
) {
    let class_symbol = symbols.lookup(&class.identifier.name);

    // Base types
    for base_type in &class.base_types {
        if let Some(base_symbol) = resolve_type(base_type, symbols) {
            graph.add_dependency(class_symbol, base_symbol);
        }
    }

    // Members
    for member in &class.body_declarations {
        analyze_member_dependencies(member, class_symbol, symbols, graph);
    }
}
Dependency Visualization
Dependency Matrix
Generate a matrix showing which types depend on which:
    A  B  C  D
A   -  X  -  X
B   -  -  X  -
C   X  -  -  -
D   -  -  -  -
- Row A, Column B = X means A depends on B
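A matrix like this can be rendered directly from an adjacency map. The sketch below is hypothetical (dependency_matrix is not an existing BSharp function) and assumes every type appears as a key, even if it has no dependencies; it emits one row string per type with single-space separators.

```rust
use std::collections::BTreeMap;

/// Renders one row per type; `X` marks "row depends on column", `-` otherwise.
/// ASSUMPTION: every type is present as a key in `graph`.
fn dependency_matrix(graph: &BTreeMap<&str, Vec<&str>>) -> Vec<String> {
    // BTreeMap keys iterate in sorted order, so rows and columns line up.
    let names: Vec<&str> = graph.keys().copied().collect();
    names
        .iter()
        .map(|row| {
            let cells: Vec<&str> = names
                .iter()
                .map(|col| {
                    if graph[row].iter().any(|d| d == col) { "X" } else { "-" }
                })
                .collect();
            format!("{} {}", row, cells.join(" "))
        })
        .collect()
}
```

Fed the edges A -> B, A -> D, B -> C, and C -> A, this reproduces the four rows of the matrix shown above.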
Dependency Tree
MyApp
├── Services
│   ├── UserService
│   │   ├── IUserRepository
│   │   └── IEmailService
│   └── OrderService
│       ├── IOrderRepository
│       └── IPaymentService
└── Models
    ├── User
    └── Order
Diagnostics
Circular Dependency Warning
warning[DEP001]: Circular dependency detected
--> src/ClassA.cs:3:5
|
3 | private ClassB b;
| ^^^^^^ ClassA depends on ClassB
|
= note: Dependency cycle: ClassA -> ClassB -> ClassC -> ClassA
High Coupling Warning
warning[DEP002]: High efferent coupling detected
 --> src/GodClass.cs:1:14
  |
1 | public class GodClass {
  |              ^^^^^^^^ depends on 25 other types
  |
  = help: Consider breaking this class into smaller, focused classes
Unstable Dependency Warning
warning[DEP003]: Stable type depends on unstable type
--> src/StableClass.cs:5:5
|
5 | private UnstableClass helper;
| ^^^^^^^^^^^^^ instability = 0.95
|
= note: Stable types (instability < 0.2) should not depend on unstable types (instability > 0.8)
Configuration
Thresholds
[analysis.dependencies]
max_efferent_coupling = 20
max_afferent_coupling = 10
max_instability = 0.8
warn_circular_dependencies = true
CLI Usage
# Analyze dependencies
bsharp analyze MyProject.csproj --enable-pass dependencies
# Generate dependency report
bsharp analyze MyProject.sln --out deps.json --format pretty-json
Future Enhancements
Planned Features
- Package-Level Dependencies
  - Track dependencies between namespaces/assemblies
  - Identify layering violations
- Dependency Metrics Dashboard
  - Visual dependency graphs
  - Coupling heatmaps
  - Trend analysis over time
- Architectural Rules
  - Define allowed/forbidden dependencies
  - Enforce layered architecture
  - Prevent specific coupling patterns
- Dependency Injection Analysis
  - Track DI container registrations
  - Verify dependency lifetimes
  - Detect missing registrations
Implementation Status
Current State:
- Basic dependency graph structure defined
- Integration with analysis pipeline planned
- Circular dependency detection algorithm ready
TODO:
- Implement full dependency extraction from AST
- Add coupling metrics calculation
- Create dependency visualization tools
- Add comprehensive tests
Related Documentation
- Analysis Pipeline - How dependency analysis fits in the pipeline
- Control Flow Analysis - Related analysis type
- Metrics Collection - Coupling metrics
- Architecture Decisions - Design rationale
References
- Implementation: src/bsharp_analysis/src/artifacts/dependencies.rs
- Tests: src/bsharp_tests/src/analysis/dependencies/ (planned)
- Related Passes: src/bsharp_analysis/src/passes/ (when implemented)
Metrics Collection
The BSharp metrics system collects comprehensive code metrics during analysis to assess code complexity, size, and maintainability.
Overview
Location: src/bsharp_analysis/src/metrics/
The metrics system provides:
- Basic Metrics - Lines of code, statement counts, declaration counts
- Complexity Metrics - Cyclomatic complexity, cognitive complexity, nesting depth
- Maintainability Metrics - Maintainability index, Halstead metrics
Architecture
Core Components
src/bsharp_analysis/src/metrics/
├── core.rs # AstAnalysis data structure (aggregated counts)
└── shared.rs # Helpers: decision_points, max_nesting_of, count_statements, etc.
How metrics are produced
- MetricsPass runs in Phase::LocalRules and computes an AstAnalysis artifact using the Query API to enumerate declarations, plus lightweight walkers for statement counts.
- Access AstAnalysis from AnalysisSession after running the pipeline.
Metric Types
1. Basic Metrics
AstAnalysis Structure:
pub struct AstAnalysis {
    // Size metrics
    pub total_lines: usize,
    pub code_lines: usize,
    pub comment_lines: usize,
    pub blank_lines: usize,

    // Declaration counts
    pub namespace_count: usize,
    pub class_count: usize,
    pub interface_count: usize,
    pub struct_count: usize,
    pub enum_count: usize,
    pub method_count: usize,
    pub property_count: usize,
    pub field_count: usize,

    // Statement counts
    pub statement_count: usize,
    pub expression_count: usize,

    // Complexity (aggregated)
    pub total_complexity: usize,
    pub max_complexity: usize,
    pub max_nesting_depth: usize,
}
2. Complexity Metrics
Cyclomatic Complexity
Definition: Number of linearly independent paths through code
Formula: CC = E - N + 2P
- E = edges in control flow graph
- N = nodes in control flow graph
- P = connected components (usually 1)
Simplified: CC = 1 + number of decision points
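A quick worked check shows that the two forms agree. For a single if without else, the control flow graph has three nodes (condition, then-block, exit) and three edges (condition to then-block, condition to exit, then-block to exit), so CC = 3 - 3 + 2 * 1 = 2, which equals 1 + 1 decision point. The helper names below are illustrative only.

```rust
/// McCabe's formula over a control flow graph: CC = E - N + 2P.
/// Returns `None` when the counts cannot describe a valid graph
/// (the result would be negative).
fn cc_from_graph(edges: usize, nodes: usize, components: usize) -> Option<usize> {
    (edges + 2 * components).checked_sub(nodes)
}

/// The simplified form: one base path plus one per decision point.
fn cc_from_decisions(decision_points: usize) -> usize {
    1 + decision_points
}
```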
Decision Points:
- if, else if
- case in switch
- for, foreach, while, do-while
- &&, || in conditions
- catch clauses
- ?: ternary operator
- ?? null-coalescing operator
Example:
public void ProcessOrder(Order order) { // CC = 1 (base)
    if (order == null) { // +1 = 2
        throw new ArgumentNullException();
    }
    if (order.IsValid) { // +1 = 3
        if (order.Amount > 1000) { // +1 = 4
            ApplyDiscount(order);
        }
        SaveOrder(order);
    } else { // else doesn't add
        LogError(order);
    }
}
// Total CC = 4
Implementation:
pub fn cyclomatic_complexity(method: &MethodDeclaration) -> usize {
    let mut complexity = 1; // Base complexity

    if let Some(body) = &method.body {
        complexity += count_decision_points(body);
    }

    complexity
}

fn count_decision_points(stmt: &Statement) -> usize {
    let mut count = 0;

    walk_statements(stmt, &mut |s| {
        match s {
            Statement::If(_) => count += 1,
            Statement::For(_) => count += 1,
            Statement::ForEach(_) => count += 1,
            Statement::While(_) => count += 1,
            Statement::DoWhile(_) => count += 1,
            Statement::Switch(sw) => {
                // Each case is a decision point
                count += sw.sections.len();
            }
            Statement::Try(try_stmt) => {
                // Each catch is a decision point
                count += try_stmt.catch_clauses.len();
            }
            _ => {}
        }
    });

    // Also count logical operators in expressions
    // count += count_logical_operators(stmt);

    count
}
Thresholds:
- 1-10: Simple, low risk
- 11-20: Moderate complexity, moderate risk
- 21-50: Complex, high risk
- 51+: Very complex, very high risk; refactoring recommended
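The bands translate naturally into a match on ranges. This is a minimal sketch with hypothetical names (Risk, risk_band); a value of exactly 50 is treated as the top of the 21-50 band, and 0 is folded into the lowest band.

```rust
/// Risk bands corresponding to the thresholds listed above.
#[derive(Debug, PartialEq, Eq)]
enum Risk {
    Low,
    Moderate,
    High,
    VeryHigh,
}

/// Maps a cyclomatic complexity value to its band.
fn risk_band(cc: usize) -> Risk {
    match cc {
        0..=10 => Risk::Low,
        11..=20 => Risk::Moderate,
        21..=50 => Risk::High,
        _ => Risk::VeryHigh,
    }
}
```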
Cognitive Complexity
Definition: Measure of how difficult code is to understand
Increments:
- +1 for each: if, else if, switch, for, foreach, while, do-while, catch, ?:, ??
- +1 for each level of nesting (nested control structures)
- +1 for each break or continue that jumps out of nested structure
- +1 for each recursive call
Example:
public void Process(List<int> items) {
    if (items != null) { // +1 (if)
        foreach (var item in items) { // +1 (loop) +1 (nesting) = +2
            if (item > 0) { // +1 (if) +2 (nesting) = +3
                Process(item); // +1 (recursion; the nesting penalty applies to control structures only)
            }
        }
    }
}
// Total Cognitive Complexity = 1 + 2 + 3 + 1 = 7
Implementation:
pub fn cognitive_complexity(method: &MethodDeclaration) -> usize {
    let mut complexity = 0;

    if let Some(body) = &method.body {
        complexity = calculate_cognitive_complexity(body, 0);
    }

    complexity
}

fn calculate_cognitive_complexity(stmt: &Statement, nesting_level: usize) -> usize {
    let mut complexity = 0;

    match stmt {
        Statement::If(if_stmt) => {
            complexity += 1 + nesting_level; // if + nesting penalty
            complexity += calculate_cognitive_complexity(&if_stmt.consequence, nesting_level + 1);
            if let Some(alt) = &if_stmt.alternative {
                complexity += calculate_cognitive_complexity(alt, nesting_level + 1);
            }
        }
        Statement::For(for_stmt) => {
            complexity += 1 + nesting_level;
            if let Some(body) = &for_stmt.body {
                complexity += calculate_cognitive_complexity(body, nesting_level + 1);
            }
        }
        // ... other statement types
        _ => {}
    }

    complexity
}
Nesting Depth
Definition: Maximum depth of nested control structures
Example:
public void Example() {
    if (condition1) { // Depth 1
        while (condition2) { // Depth 2
            if (condition3) { // Depth 3
                for (int i = 0; i < 10; i++) { // Depth 4
                    // Code here
                }
            }
        }
    }
}
// Max Nesting Depth = 4
Implementation:
pub fn max_nesting_depth(method: &MethodDeclaration) -> usize {
    method.body.as_ref()
        .map(|body| calculate_max_nesting(body, 0))
        .unwrap_or(0)
}

fn calculate_max_nesting(stmt: &Statement, current_depth: usize) -> usize {
    let mut max_depth = current_depth;

    match stmt {
        Statement::If(if_stmt) => {
            let then_depth = calculate_max_nesting(&if_stmt.consequence, current_depth + 1);
            max_depth = max_depth.max(then_depth);
            if let Some(alt) = &if_stmt.alternative {
                let else_depth = calculate_max_nesting(alt, current_depth + 1);
                max_depth = max_depth.max(else_depth);
            }
        }
        Statement::Block(stmts) => {
            for s in stmts {
                let depth = calculate_max_nesting(s, current_depth);
                max_depth = max_depth.max(depth);
            }
        }
        // ... other nesting statements
        _ => {}
    }

    max_depth
}
Thresholds:
- 1-3: Acceptable
- 4-5: Consider refactoring
- 6+: Refactor recommended
Planned: Maintainability Metrics
Maintainability Index
Definition: Composite metric indicating code maintainability
Formula (Microsoft version):
MI = MAX(0, (171 - 5.2 * ln(HV) - 0.23 * CC - 16.2 * ln(LOC)) * 100 / 171)
Where:
- HV = Halstead Volume
- CC = Cyclomatic Complexity
- LOC = Lines of Code
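Scale (these bands are the ones commonly cited for the original, unnormalized index; Visual Studio applies 20-100 green, 10-19 yellow, and 0-9 red to the normalized 0-100 value):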
- 85-100: Good maintainability (green)
- 65-84: Moderate maintainability (yellow)
- 0-64: Difficult to maintain (red)
Note: Maintainability Index is not implemented in the current codebase. This section outlines potential future work.
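// Sketch only (not implemented in the codebase).
// Assumes halstead_volume > 0 and lines_of_code > 0: ln(0) is -inf,
// which would push the result to +inf.
pub fn maintainability_index(
    halstead_volume: f64,
    cyclomatic_complexity: usize,
    lines_of_code: usize,
) -> f64 {
    let hv_term = 5.2 * halstead_volume.ln();
    let cc_term = 0.23 * (cyclomatic_complexity as f64);
    let loc_term = 16.2 * (lines_of_code as f64).ln();
    let mi = 171.0 - hv_term - cc_term - loc_term;
    // Normalize to the 0-100 scale and clamp negative values to 0.
    (mi * 100.0 / 171.0).max(0.0)
}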
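As a worked example of the formula and scale above, the following self-contained sketch computes the normalized index and maps it to the listed bands. The `MiBand` enum, the `classify` helper, and the clamping of inputs below 1 are assumptions made for illustration; none of this exists in the codebase. For HV = 1000, CC = 10, LOC = 100 the raw value is roughly 171 - 35.92 - 2.30 - 74.60 = 58.18, which normalizes to about 34.0 and therefore lands in the red band.

```rust
/// Maintainability band, following the scale listed above (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum MiBand {
    Good,      // 85-100
    Moderate,  // 65-84
    Difficult, // 0-64
}

/// Normalized Maintainability Index on a 0-100 scale.
/// Inputs below 1 are clamped to 1 so that `ln` stays finite
/// (an assumption of this sketch, not part of the formula).
pub fn maintainability_index(
    halstead_volume: f64,
    cyclomatic_complexity: usize,
    lines_of_code: usize,
) -> f64 {
    let hv = halstead_volume.max(1.0);
    let loc = (lines_of_code as f64).max(1.0);
    let raw = 171.0 - 5.2 * hv.ln() - 0.23 * (cyclomatic_complexity as f64) - 16.2 * loc.ln();
    (raw * 100.0 / 171.0).max(0.0)
}

/// Maps a normalized index to one of the three bands.
pub fn classify(mi: f64) -> MiBand {
    if mi >= 85.0 {
        MiBand::Good
    } else if mi >= 65.0 {
        MiBand::Moderate
    } else {
        MiBand::Difficult
    }
}
```

Clamping keeps an empty method at the maximum score of 100 instead of producing an infinite value; a real implementation might prefer to skip such methods entirely.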
Planned: Halstead Metrics
Operators and Operands:
- n1 = number of distinct operators
- n2 = number of distinct operands
- N1 = total number of operators
- N2 = total number of operands
Derived Metrics:
- Program Vocabulary: n = n1 + n2
- Program Length: N = N1 + N2
- Calculated Length: N' = n1 * log2(n1) + n2 * log2(n2)
- Volume: V = N * log2(n)
- Difficulty: D = (n1 / 2) * (N2 / n2)
- Effort: E = D * V
- Time to Program: T = E / 18 seconds
- Bugs Delivered: B = V / 3000
Note: Halstead metrics are not implemented in the current codebase.
use std::collections::HashSet;

pub struct HalsteadMetrics {
    pub distinct_operators: usize, // n1
    pub distinct_operands: usize,  // n2
    pub total_operators: usize,    // N1
    pub total_operands: usize,     // N2
    pub vocabulary: usize,         // n
    pub length: usize,             // N
    pub volume: f64,               // V
    pub difficulty: f64,           // D
    pub effort: f64,               // E
    pub time_to_program: f64,      // T
    pub bugs_delivered: f64,       // B
}

impl HalsteadMetrics {
    pub fn calculate(
        operators: &HashSet<String>,
        operands: &HashSet<String>,
        op_count: usize,
        operand_count: usize,
    ) -> Self {
        let n1 = operators.len();
        let n2 = operands.len();
        let vocabulary = n1 + n2;
        let length = op_count + operand_count;
        // Guard the degenerate cases: log2(0) and division by zero would yield NaN or inf.
        let volume = if vocabulary == 0 {
            0.0
        } else {
            (length as f64) * (vocabulary as f64).log2()
        };
        let difficulty = if n2 == 0 {
            0.0
        } else {
            (n1 as f64 / 2.0) * (operand_count as f64 / n2 as f64)
        };
        let effort = difficulty * volume;
        HalsteadMetrics {
            distinct_operators: n1,
            distinct_operands: n2,
            total_operators: op_count,
            total_operands: operand_count,
            vocabulary,
            length,
            volume,
            difficulty,
            effort,
            time_to_program: effort / 18.0,
            bugs_delivered: volume / 3000.0,
        }
    }
}
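To make the counting step concrete, here is a minimal sketch that derives n1, n2, N1 and N2 from a pre-classified token stream and then applies the Volume and Difficulty formulas. The `Token` enum and the three helper functions are hypothetical and only illustrate the arithmetic; how BSharp would classify tokens is not specified in this document. For the expression `a = b + c * b` there are three distinct operators (=, +, *) used three times and three distinct operands (a, b, c) used four times, so V = 7 * log2(6), about 18.09, and D = (3 / 2) * (4 / 3) = 2.

```rust
use std::collections::HashSet;

/// Token classification used by this sketch (hypothetical; not a BSharp type).
pub enum Token<'a> {
    Operator(&'a str),
    Operand(&'a str),
}

/// Returns (n1, n2, N1, N2) for a token stream.
pub fn halstead_counts(tokens: &[Token]) -> (usize, usize, usize, usize) {
    let mut operators = HashSet::new();
    let mut operands = HashSet::new();
    let (mut total_ops, mut total_operands) = (0, 0);
    for t in tokens {
        match t {
            Token::Operator(s) => {
                operators.insert(*s);
                total_ops += 1;
            }
            Token::Operand(s) => {
                operands.insert(*s);
                total_operands += 1;
            }
        }
    }
    (operators.len(), operands.len(), total_ops, total_operands)
}

/// Volume V = N * log2(n); returns 0.0 for an empty vocabulary.
pub fn volume(n1: usize, n2: usize, total_ops: usize, total_operands: usize) -> f64 {
    let n = n1 + n2;
    if n == 0 {
        return 0.0;
    }
    ((total_ops + total_operands) as f64) * (n as f64).log2()
}

/// Difficulty D = (n1 / 2) * (N2 / n2); returns 0.0 when there are no operands.
pub fn difficulty(n1: usize, n2: usize, total_operands: usize) -> f64 {
    if n2 == 0 {
        return 0.0;
    }
    (n1 as f64 / 2.0) * (total_operands as f64 / n2 as f64)
}
```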
Metrics Collection in the Pipeline
MetricsPass is registered in the analyzer registry and runs during Phase::LocalRules. It enumerates classes/structs/methods via Query and uses helpers from bsharp_analysis::metrics::shared to compute statement counts, decision points (cyclomatic complexity), and nesting.
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

let source = r#"public class C { public void M() { if (true) { } } }"#;
let (cu, spans) = Parser::new().parse_with_spans(source)?;
let mut session = AnalysisSession::new(AnalysisContext::new("file.cs", source), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);

let ast = session.artifacts.get::<AstAnalysis>().expect("AstAnalysis");
println!(
    "classes={}, methods={}, ifs={}",
    ast.total_classes, ast.total_methods, ast.total_if_statements
);
CLI Usage
Analyze Metrics
# Analyze single file
bsharp analyze MyFile.cs
# Analyze project
bsharp analyze MyProject.csproj --out metrics.json
# Analyze solution
bsharp analyze MySolution.sln --out metrics.json --format pretty-json
Example Output
{
"schema_version": 1,
"metrics": {
"total_lines": 1250,
"code_lines": 980,
"comment_lines": 150,
"blank_lines": 120,
"class_count": 15,
"method_count": 87,
"total_complexity": 245,
"max_complexity": 18,
"max_nesting_depth": 5
}
}
Thresholds and Warnings
Configuration
[analysis.metrics]
max_cyclomatic_complexity = 10
max_cognitive_complexity = 15
max_nesting_depth = 4
max_method_length = 50
min_maintainability_index = 65
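The sketch below shows one way such thresholds could be applied to per-method measurements to produce warning messages. `MetricThresholds`, `MethodMetrics`, and `check` are hypothetical names introduced for illustration; only the key names and default values are taken from the configuration above, and `min_maintainability_index` is left out because the Maintainability Index is not implemented.

```rust
/// Threshold values mirroring the `[analysis.metrics]` keys above (hypothetical type).
pub struct MetricThresholds {
    pub max_cyclomatic_complexity: usize,
    pub max_cognitive_complexity: usize,
    pub max_nesting_depth: usize,
    pub max_method_length: usize,
}

impl Default for MetricThresholds {
    fn default() -> Self {
        MetricThresholds {
            max_cyclomatic_complexity: 10,
            max_cognitive_complexity: 15,
            max_nesting_depth: 4,
            max_method_length: 50,
        }
    }
}

/// Per-method measurements (hypothetical shape).
pub struct MethodMetrics {
    pub name: String,
    pub cyclomatic_complexity: usize,
    pub cognitive_complexity: usize,
    pub nesting_depth: usize,
    pub length: usize,
}

/// Returns one message per metric that exceeds its threshold.
pub fn check(m: &MethodMetrics, t: &MetricThresholds) -> Vec<String> {
    let checks = [
        ("cyclomatic complexity", m.cyclomatic_complexity, t.max_cyclomatic_complexity),
        ("cognitive complexity", m.cognitive_complexity, t.max_cognitive_complexity),
        ("nesting depth", m.nesting_depth, t.max_nesting_depth),
        ("method length", m.length, t.max_method_length),
    ];
    checks
        .iter()
        .filter(|(_, value, limit)| value > limit)
        .map(|(label, value, limit)| {
            format!("{}: {} = {} (threshold: {})", m.name, label, value, limit)
        })
        .collect()
}
```

A value equal to its threshold is treated as acceptable here; only values strictly above the limit are reported.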
Diagnostics
High Complexity Warning:
warning[MET001]: Method has high cyclomatic complexity
--> src/OrderProcessor.cs:42:17
|
42 | public void ProcessOrder(Order order) {
| ^^^^^^^^^^^^ complexity = 18 (threshold: 10)
|
= help: Consider breaking this method into smaller methods
Deep Nesting Warning:
warning[MET002]: Deep nesting detected
--> src/Validator.cs:15:9
|
15 | if (condition1) {
| ^^ nesting depth = 5 (threshold: 4)
|
= help: Consider extracting nested logic into separate methods
Programmatic Usage
Analyzing a Method
use bsharp::analysis::metrics::{cyclomatic_complexity, cognitive_complexity, max_nesting_depth};

// `parse_method` is a placeholder for however the MethodDeclaration is obtained.
let method = parse_method("public void MyMethod() { ... }");

let cc = cyclomatic_complexity(&method);
let cog = cognitive_complexity(&method);
let nesting = max_nesting_depth(&method);

println!("Cyclomatic Complexity: {}", cc);
println!("Cognitive Complexity: {}", cog);
println!("Max Nesting Depth: {}", nesting);
Analyzing a file via the pipeline
use bsharp_analysis::context::AnalysisContext;
use bsharp_analysis::framework::pipeline::AnalyzerPipeline;
use bsharp_analysis::framework::session::AnalysisSession;
use bsharp_analysis::metrics::AstAnalysis;
use bsharp_parser::facade::Parser;

let (cu, spans) = Parser::new().parse_with_spans(source_code)?;
let mut session = AnalysisSession::new(AnalysisContext::new("file.cs", source_code), spans);
AnalyzerPipeline::run_with_defaults(&cu, &mut session);

let metrics = session.artifacts.get::<AstAnalysis>().expect("AstAnalysis");
println!("Classes: {}", metrics.total_classes);
println!("Methods: {}", metrics.total_methods);
println!("Cyclomatic Complexity: {}", metrics.cyclomatic_complexity);
Related Documentation
- Analysis Pipeline - How metrics fit in the pipeline
- Control Flow Analysis - Related complexity analysis
- Code Quality - Quality assessment using metrics
- Architecture - Design decisions
References
- Implementation: src/bsharp_analysis/src/metrics/
- Pass: src/bsharp_analysis/src/passes/metrics.rs
- Tests: src/bsharp_tests/src/analysis/metrics/ (planned)
- Standards: ISO/IEC 25023 (Software Quality Metrics)
Type Analysis
The type analysis system provides insights into type usage, inheritance hierarchies, and type-related patterns in C# code.
Overview
Status: Planned (module not implemented yet)
Type analysis tracks:
- Type definitions and their relationships
- Inheritance hierarchies
- Interface implementations
- Generic type usage
- Type references and dependencies
Type Information
Type Categories
Value Types:
- Primitives (int, bool, double, etc.)
- Structs
- Enums
Reference Types:
- Classes
- Interfaces
- Delegates
- Arrays
Special Types:
- Generic type parameters
- Nullable types
- Tuple types
- Anonymous types
Inheritance Analysis
Class Hierarchies
Tracking Inheritance:
public class Animal { }
public class Mammal : Animal { }
public class Dog : Mammal { }
Hierarchy Representation:
Animal
└── Mammal
└── Dog
Analysis:
use std::collections::HashMap;

pub struct InheritanceHierarchy {
    // Type -> Base Type
    base_types: HashMap<TypeId, TypeId>,
    // Type -> Derived Types
    derived_types: HashMap<TypeId, Vec<TypeId>>,
}

// Planned API surface; bodies are intentionally left unimplemented.
impl InheritanceHierarchy {
    pub fn get_base_type(&self, type_id: TypeId) -> Option<TypeId> { todo!() }
    pub fn get_derived_types(&self, type_id: TypeId) -> &[TypeId] { todo!() }
    pub fn get_all_ancestors(&self, type_id: TypeId) -> Vec<TypeId> { todo!() }
    pub fn get_all_descendants(&self, type_id: TypeId) -> Vec<TypeId> { todo!() }
    pub fn inheritance_depth(&self, type_id: TypeId) -> usize { todo!() }
}
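Since the module is only planned, the following is a rough, self-contained sketch of how the upward queries could work for single inheritance. `TypeId` is modelled as a plain `u32` and `add_edge` is a hypothetical builder method; both are assumptions for illustration rather than the planned design.

```rust
use std::collections::HashMap;

/// Stand-in for the planned `TypeId` (assumption: a small copyable id).
pub type TypeId = u32;

#[derive(Default)]
pub struct InheritanceHierarchy {
    base_types: HashMap<TypeId, TypeId>,
    derived_types: HashMap<TypeId, Vec<TypeId>>,
}

impl InheritanceHierarchy {
    /// Records `derived : base` in both maps (hypothetical helper).
    pub fn add_edge(&mut self, derived: TypeId, base: TypeId) {
        self.base_types.insert(derived, base);
        self.derived_types.entry(base).or_default().push(derived);
    }

    pub fn get_base_type(&self, type_id: TypeId) -> Option<TypeId> {
        self.base_types.get(&type_id).copied()
    }

    pub fn get_derived_types(&self, type_id: TypeId) -> &[TypeId] {
        self.derived_types
            .get(&type_id)
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }

    /// Walks base links upward; the nearest ancestor comes first.
    pub fn get_all_ancestors(&self, type_id: TypeId) -> Vec<TypeId> {
        let mut out = Vec::new();
        let mut current = type_id;
        while let Some(base) = self.get_base_type(current) {
            // Stop on malformed (cyclic) input instead of looping forever.
            if base == type_id || out.contains(&base) {
                break;
            }
            out.push(base);
            current = base;
        }
        out
    }

    /// Number of ancestors, not counting the implicit `Object` root.
    pub fn inheritance_depth(&self, type_id: TypeId) -> usize {
        self.get_all_ancestors(type_id).len()
    }
}
```

With Animal = 0, Mammal = 1 and Dog = 2, registering the two edges gives Dog the ancestors Mammal and Animal and a depth of 2, while the length of `get_derived_types(0)` is the number of immediate subclasses of Animal.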
Interface Implementation
Tracking Implementations:
public interface IRepository { }
public interface IUserRepository : IRepository { }
public class UserRepository : IUserRepository { }
Analysis:
pub struct InterfaceImplementations {
    // Type -> Interfaces it implements
    implementations: HashMap<TypeId, Vec<TypeId>>,
    // Interface -> Types that implement it
    implementers: HashMap<TypeId, Vec<TypeId>>,
}
Generic Type Analysis
Type Parameters
Tracking Generic Definitions:
public class Container<T> where T : class { }
public class Repository<TEntity, TKey> where TEntity : class { }
Analysis:
pub struct GenericTypeInfo {
    pub type_parameters: Vec<TypeParameter>,
    pub constraints: Vec<TypeConstraint>,
}

pub struct TypeParameter {
    pub name: String,
    pub variance: Option<Variance>, // in, out
}

pub struct TypeConstraint {
    pub parameter: String,
    pub kind: ConstraintKind,
}

pub enum ConstraintKind {
    Class,             // where T : class
    Struct,            // where T : struct
    New,               // where T : new()
    BaseType(TypeId),  // where T : BaseClass
    Interface(TypeId), // where T : IInterface
}
Generic Type Usage
Tracking Instantiations:
var list = new List<int>();
var dict = new Dictionary<string, User>();
Analysis:
pub struct GenericInstantiation {
    pub generic_type: TypeId,
    pub type_arguments: Vec<TypeId>,
}

// Planned signature; not implemented yet.
pub fn find_generic_instantiations(cu: &CompilationUnit) -> Vec<GenericInstantiation> {
    todo!()
}
Type Usage Patterns
Frequency Analysis
Most Used Types:
pub struct TypeUsageStats {
    pub type_references: HashMap<TypeId, usize>,
}

// Planned API surface; bodies are intentionally left unimplemented.
impl TypeUsageStats {
    pub fn most_used_types(&self, limit: usize) -> Vec<(TypeId, usize)> { todo!() }
    pub fn usage_count(&self, type_id: TypeId) -> usize { todo!() }
}
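A possible shape for these two queries is sketched below, again with `TypeId` modelled as a `u32` and a hypothetical `record` method for collecting references. Ties are broken by type id so that the result does not depend on `HashMap` iteration order; that tie-breaking rule is a choice of this sketch.

```rust
use std::collections::HashMap;

/// Stand-in for the planned `TypeId` (assumption).
pub type TypeId = u32;

#[derive(Default)]
pub struct TypeUsageStats {
    pub type_references: HashMap<TypeId, usize>,
}

impl TypeUsageStats {
    /// Counts one more reference to `type_id` (hypothetical helper).
    pub fn record(&mut self, type_id: TypeId) {
        *self.type_references.entry(type_id).or_insert(0) += 1;
    }

    pub fn usage_count(&self, type_id: TypeId) -> usize {
        self.type_references.get(&type_id).copied().unwrap_or(0)
    }

    /// Highest counts first; equal counts are ordered by ascending id.
    pub fn most_used_types(&self, limit: usize) -> Vec<(TypeId, usize)> {
        let mut entries: Vec<(TypeId, usize)> = self
            .type_references
            .iter()
            .map(|(id, count)| (*id, *count))
            .collect();
        entries.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
        entries.truncate(limit);
        entries
    }
}
```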
Type Categories Distribution
pub struct TypeDistribution {
    pub class_count: usize,
    pub interface_count: usize,
    pub struct_count: usize,
    pub enum_count: usize,
    pub delegate_count: usize,
}
Type Metrics
Depth of Inheritance Tree (DIT)
Definition: Maximum depth from type to root of hierarchy
Example:
class A { } // DIT = 0 (or 1 from Object)
class B : A { } // DIT = 1 (or 2 from Object)
class C : B { } // DIT = 2 (or 3 from Object)
Interpretation:
- Low DIT (0-2): Simple hierarchy, easy to understand
- Medium DIT (3-4): Moderate complexity
- High DIT (5+): Complex hierarchy, may indicate over-engineering
Number of Children (NOC)
Definition: Number of immediate subclasses
Example:
class Animal { }
class Dog : Animal { }
class Cat : Animal { }
class Bird : Animal { }
// Animal has NOC = 3
Interpretation:
- High NOC: Type is heavily reused (good abstraction or god class)
- Low NOC: Specialized type or leaf in hierarchy
Lack of Cohesion of Methods (LCOM)
Definition: Measure of how well methods in a class are related
Simplified Calculation:
- Count pairs of methods that don't share instance variables
- High LCOM suggests class should be split
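The simplified calculation can be sketched as follows, assuming the set of instance fields touched by each method has already been collected. The function name and the input shape are hypothetical; this counts disjoint method pairs only and is not one of the formal LCOM variants.

```rust
use std::collections::HashSet;

/// Counts method pairs that have no instance field in common.
/// `methods` holds, per method, the names of the fields it reads or writes.
pub fn lcom_disjoint_pairs(methods: &[HashSet<&str>]) -> usize {
    let mut disjoint = 0;
    for i in 0..methods.len() {
        for j in (i + 1)..methods.len() {
            if methods[i].is_disjoint(&methods[j]) {
                disjoint += 1;
            }
        }
    }
    disjoint
}
```

For three methods touching {a, b}, {b} and {c}, the first two share b, while the third shares nothing with either, so the count is 2.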
Type Compatibility Analysis
Assignability
Checking Compatibility:
pub fn is_assignable_to(from: &Type, to: &Type, context: &TypeContext) -> bool {
    // Check if 'from' type can be assigned to 'to' type
    // Considers inheritance, interface implementation, variance, etc.
    todo!() // planned; not implemented yet
}
Rules:
- Derived type assignable to base type
- Type assignable to implemented interface
- Covariant/contravariant generic types
- Nullable value types
- Implicit conversions
Type Conversions
Tracking Conversions:
int x = 42;
long y = x; // Implicit conversion
int z = (int)y; // Explicit conversion (cast)
Analysis:
pub enum ConversionKind {
    Implicit,
    Explicit,
    UserDefined,
}

pub struct TypeConversion {
    pub from: TypeId,
    pub to: TypeId,
    pub kind: ConversionKind,
}
Nullable Reference Types Analysis
Nullability Tracking
C# 8+ Nullable Annotations:
string? nullable = null; // Nullable reference
string nonNull = "value"; // Non-nullable reference
Analysis:
pub struct NullabilityInfo {
    pub is_nullable: bool,
    pub nullability_context: NullabilityContext,
}

pub enum NullabilityContext {
    Enabled,
    Disabled,
    Warnings,
}
Null Safety Diagnostics
Potential Null Reference:
warning[TYPE001]: Possible null reference
--> src/UserService.cs:15:9
|
15 | user.Name = "John";
| ^^^^ 'user' may be null here
|
= help: Add null check or use null-conditional operator
Type Analysis in Pipeline
Integration
Type analysis is not part of the default registry yet. The intended phase is Semantic (after symbol indexing and global artifacts). This page outlines the planned scope.
Programmatic Usage
Analyzing Type Hierarchy
Planned APIs will expose hierarchy queries once implemented.
Finding Generic Instantiations
Planned helper(s) to enumerate generic instantiations will be documented here after implementation.
Future Enhancements
Planned Features
- Type Inference Tracking
  - Track var usage and inferred types
  - Analyze type inference patterns
- Variance Analysis
  - Detect variance violations
  - Suggest covariant/contravariant annotations
- Type Safety Metrics
  - Measure use of dynamic
  - Track unsafe casts
  - Nullable reference type coverage
- Design Pattern Detection
  - Identify common patterns (Factory, Strategy, etc.)
  - Detect anti-patterns
Implementation Status
Current State:
- Basic type tracking infrastructure in place
- Type analysis module integrated with analysis framework
- Foundation for inheritance and generic analysis established
In Progress:
- Full inheritance hierarchy analysis
- Generic type instantiation tracking
- Type usage statistics collection
- Comprehensive test coverage
Planned:
- Variance analysis
- Type safety metrics
- Design pattern detection based on type relationships
Related Documentation
- Analysis Pipeline - Pipeline integration
- Dependency Analysis - Type dependencies
- Metrics Collection - Type-related metrics
- AST Structure - Type representations
References
- Implementation: Planned
- Tests: Planned (under src/bsharp_tests/src/analysis/types/)
- Related: docs/analysis/dependencies.md, docs/parser/ast-structure.md
Code Quality Analysis (Conceptual / Future Plan)
This document describes a future-facing design for quality analysis. The legacy quality module and QualityPass have been removed from the codebase. Treat this document as a proposal and reference for potential future work, not as a description of the current implementation.
Overview
Status: Not implemented. The legacy module was removed; this page documents future direction.
Quality analysis provides:
- Code smell detection
- Best practice validation
- Design pattern recognition
- Maintainability assessment
- Technical debt identification
Code Smells
Method-Level Smells
Long Method
Description: Method with too many lines of code
Threshold: > 50 lines (configurable)
Example:
public void ProcessOrder(Order order) {
// 150 lines of code...
}
Diagnostic:
warning[QUAL001]: Long method detected
--> src/OrderService.cs:42:17
|
42 | public void ProcessOrder(Order order) {
| ^^^^^^^^^^^^ method has 150 lines (threshold: 50)
|
= help: Consider breaking this method into smaller, focused methods
Refactoring:
- Extract method
- Decompose into smaller methods
- Apply Single Responsibility Principle
Long Parameter List
Description: Method with too many parameters
Threshold: > 5 parameters (configurable)
Example:
public void CreateUser(string firstName, string lastName, string email,
string phone, string address, string city, string zip) {
// ...
}
Refactoring:
- Introduce parameter object
- Use builder pattern
- Group related parameters into DTOs
Complex Conditional
Description: Deeply nested or complex conditional logic
Example:
if (user != null && user.IsActive && (user.Role == "Admin" || user.Role == "Manager")
&& user.Department != null && user.Department.Budget > 10000) {
// ...
}
Refactoring:
- Extract condition to well-named method
- Use guard clauses
- Simplify boolean logic
Class-Level Smells
Large Class (God Class)
Description: Class with too many responsibilities
Indicators:
- Too many methods (> 20)
- Too many fields (> 10)
- High cyclomatic complexity
- Low cohesion
Example:
public class UserManager {
// 50 methods handling user CRUD, authentication, authorization,
// email sending, logging, caching, validation, etc.
}
Refactoring:
- Split into multiple classes
- Apply Single Responsibility Principle
- Extract related functionality
Feature Envy
Description: Method uses more features of another class than its own
Example:
public class OrderProcessor {
public decimal CalculateTotal(Order order) {
decimal total = 0;
foreach (var item in order.Items) {
total += item.Price * item.Quantity;
}
total -= order.Discount;
total += order.Tax;
return total;
}
}
Refactoring:
- Move method to Order class
- Method should be where the data is
Data Class
Description: Class with only fields and getters/setters, no behavior
Example:
public class User {
public string Name { get; set; }
public string Email { get; set; }
public int Age { get; set; }
// No methods, just data
}
Note: Sometimes acceptable for DTOs, but domain objects should have behavior
Code Organization Smells
Duplicate Code
Description: Identical or very similar code in multiple places
Detection:
- Token-based comparison
- AST structure comparison
- Minimum clone size threshold
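As an illustration of token-based comparison with a minimum clone size, the sketch below slides a fixed-size window over a token sequence and reports windows that repeat an earlier one. The function name and the representation of tokens as string slices are assumptions; a production detector would normally normalize identifiers and literals first and merge adjacent matches into longer clones.

```rust
use std::collections::HashMap;

/// Reports `(first_start, repeat_start)` for every window of `min_len`
/// tokens that is identical to an earlier window.
pub fn find_clones(tokens: &[&str], min_len: usize) -> Vec<(usize, usize)> {
    let mut clones = Vec::new();
    // `windows(0)` would panic, and short inputs cannot contain a clone.
    if min_len == 0 || tokens.len() < min_len {
        return clones;
    }
    // Maps each window's contents to the index where it was first seen.
    let mut seen: HashMap<&[&str], usize> = HashMap::new();
    for (start, window) in tokens.windows(min_len).enumerate() {
        let first = seen.get(window).copied();
        match first {
            Some(first_start) => clones.push((first_start, start)),
            None => {
                seen.insert(window, start);
            }
        }
    }
    clones
}
```

Raising `min_len` reduces noise from short, incidental repeats at the cost of missing small clones.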
Refactoring:
- Extract method
- Extract class
- Use inheritance or composition
Dead Code
Description: Code that is never executed
Examples:
- Unreachable statements after return
- Unused private methods
- Unused fields
- Conditions that are always true/false
Diagnostic:
warning[QUAL010]: Unreachable code detected
--> src/Calculator.cs:15:9
|
14 | return result;
15 | Console.WriteLine("Done"); // Never executed
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ unreachable statement
Magic Numbers
Description: Unexplained numeric literals in code
Example:
if (order.Total > 1000) { // What does 1000 mean?
ApplyDiscount(order, 0.1); // What does 0.1 mean?
}
Refactoring:
const decimal BULK_ORDER_THRESHOLD = 1000m;
const decimal BULK_ORDER_DISCOUNT = 0.1m;
if (order.Total > BULK_ORDER_THRESHOLD) {
ApplyDiscount(order, BULK_ORDER_DISCOUNT);
}
Best Practices
Naming Conventions
Rules:
- Classes: PascalCase
- Methods: PascalCase
- Properties: PascalCase
- Fields: camelCase with _ prefix for private
- Constants: UPPER_CASE or PascalCase
- Interfaces: PascalCase with I prefix
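A few of these rules can be expressed as simple string predicates, sketched below. The function names are hypothetical and the checks are deliberately simplified to ASCII letters and digits, so identifiers containing underscores elsewhere or non-ASCII characters are reported as non-conforming; the existing rules::naming ruleset may well work differently.

```rust
/// True for names like `UserService`: uppercase first letter, then ASCII alphanumerics.
pub fn is_pascal_case(name: &str) -> bool {
    let mut chars = name.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_uppercase())
        && chars.all(|c| c.is_ascii_alphanumeric())
}

/// True for private-field names like `_userCount`.
pub fn is_private_field_name(name: &str) -> bool {
    match name.strip_prefix('_') {
        Some(rest) => {
            let mut chars = rest.chars();
            matches!(chars.next(), Some(c) if c.is_ascii_lowercase())
                && chars.all(|c| c.is_ascii_alphanumeric())
        }
        None => false,
    }
}

/// True for interface names like `IRepository`: `I` followed by a PascalCase name.
pub fn is_interface_name(name: &str) -> bool {
    name.strip_prefix('I').map_or(false, is_pascal_case)
}

/// Suggests a conforming private-field name, e.g. `UserCount` -> `_userCount`.
pub fn suggest_private_field_name(name: &str) -> String {
    let trimmed = name.trim_start_matches('_');
    let mut chars = trimmed.chars();
    match chars.next() {
        Some(first) => format!("_{}{}", first.to_ascii_lowercase(), chars.as_str()),
        None => String::from("_"),
    }
}
```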
Violations:
warning[QUAL020]: Naming convention violation
--> src/UserService.cs:5:17
|
5 | private int UserCount;
| ^^^^^^^^^ private field should use camelCase with _ prefix
|
= help: Rename to '_userCount'
Exception Handling
Anti-patterns:
Empty Catch Block:
try {
RiskyOperation();
} catch (Exception) {
// Silent failure - BAD!
}
Catching Generic Exception:
try {
SpecificOperation();
} catch (Exception ex) { // Too broad
// ...
}
Best Practices:
- Catch specific exceptions
- Log exceptions
- Don't swallow exceptions
- Use finally for cleanup
Resource Management
Using Statement:
// Good
using (var file = File.OpenRead("data.txt")) {
// Use file
}
// Better (C# 8+)
using var file = File.OpenRead("data.txt");
// Disposed at end of scope
Diagnostic:
warning[QUAL030]: IDisposable not properly disposed
--> src/FileProcessor.cs:10:9
|
10 | var file = File.OpenRead("data.txt");
| ^^^^ should be wrapped in using statement
Design Patterns and Anti-Patterns
Detected Patterns
Singleton Pattern
Detection:
- Private constructor
- Static instance field
- Public static accessor
Example:
public class Logger {
private static Logger _instance;
private Logger() { }
public static Logger Instance {
get {
if (_instance == null) {
_instance = new Logger();
}
return _instance;
}
}
}
Factory Pattern
Detection:
- Method returning interface or base class
- Creates different concrete types based on parameters
Anti-Patterns
God Object
Detection:
- High number of methods and fields
- Low cohesion
- High coupling
Spaghetti Code
Detection:
- High cyclomatic complexity
- Deep nesting
- Lack of structure
Lava Flow
Detection:
- Dead code
- Commented-out code
- Unused variables/methods
Quality Metrics
Code Quality Score
Composite Score (0-100):
pub struct QualityScore {
    pub overall: f64,
    pub maintainability: f64,
    pub complexity: f64,
    pub duplication: f64,
    pub test_coverage: f64,
}
Calculation:
Overall = (Maintainability * 0.3) +
(Complexity * 0.3) +
(Duplication * 0.2) +
(TestCoverage * 0.2)
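The weighted sum can be written out directly, as in this sketch. The `from_components` constructor is hypothetical, and clamping each component to the 0-100 range is an added assumption so that the overall score stays on the same scale. With maintainability 68, complexity 75, duplication 80 and test coverage 68, the result is 20.4 + 22.5 + 16.0 + 13.6 = 72.5.

```rust
pub struct QualityScore {
    pub overall: f64,
    pub maintainability: f64,
    pub complexity: f64,
    pub duplication: f64,
    pub test_coverage: f64,
}

impl QualityScore {
    /// Builds a score from component scores, each expected in 0..=100
    /// where higher is better (hypothetical constructor).
    pub fn from_components(
        maintainability: f64,
        complexity: f64,
        duplication: f64,
        test_coverage: f64,
    ) -> Self {
        // Clamp out-of-range inputs (assumption of this sketch).
        let m = maintainability.clamp(0.0, 100.0);
        let c = complexity.clamp(0.0, 100.0);
        let d = duplication.clamp(0.0, 100.0);
        let t = test_coverage.clamp(0.0, 100.0);
        // Weights taken from the calculation above; they sum to 1.0.
        let overall = m * 0.3 + c * 0.3 + d * 0.2 + t * 0.2;
        QualityScore {
            overall,
            maintainability: m,
            complexity: c,
            duplication: d,
            test_coverage: t,
        }
    }
}
```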
Technical Debt
Estimation:
pub struct TechnicalDebt {
    pub total_issues: usize,
    pub estimated_hours: f64,
    pub debt_ratio: f64, // debt / total development time
}
Calculation:
- Each code smell assigned time cost
- Sum all issues
- Compare to total codebase size
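The steps above might be sketched as follows. The `estimate_debt` function and its inputs are hypothetical: it assumes each detected issue already carries an estimated remediation cost in hours and that some estimate of total development time is available as the denominator.

```rust
pub struct TechnicalDebt {
    pub total_issues: usize,
    pub estimated_hours: f64,
    pub debt_ratio: f64, // debt / total development time
}

/// `issue_costs_hours`: estimated remediation cost per detected issue.
/// `total_development_hours`: estimated effort behind the codebase (assumed input).
pub fn estimate_debt(issue_costs_hours: &[f64], total_development_hours: f64) -> TechnicalDebt {
    let estimated_hours: f64 = issue_costs_hours.iter().sum();
    // Avoid dividing by zero when no development-time estimate is available.
    let debt_ratio = if total_development_hours > 0.0 {
        estimated_hours / total_development_hours
    } else {
        0.0
    };
    TechnicalDebt {
        total_issues: issue_costs_hours.len(),
        estimated_hours,
        debt_ratio,
    }
}
```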
Quality Rules
Rule System
Rule Definition:
pub trait QualityRule {
    fn id(&self) -> &'static str;
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn check(&self, node: &NodeRef, session: &mut AnalysisSession);
}
Example Rule:
pub struct LongMethodRule {
    max_lines: usize,
}

impl QualityRule for LongMethodRule {
    fn id(&self) -> &'static str { "long_method" }
    fn name(&self) -> &'static str { "Long Method" }
    // Required by the trait above, which declares no default for `description`.
    fn description(&self) -> &'static str { "Method with too many lines of code" }

    fn check(&self, node: &NodeRef, session: &mut AnalysisSession) {
        if let NodeRef::MethodDeclaration(method) = node {
            let line_count = count_lines(method);
            if line_count > self.max_lines {
                session.diagnostics.add(
                    DiagnosticCode::LongMethod,
                    format!("Method has {} lines (threshold: {})", line_count, self.max_lines)
                );
            }
        }
    }
}
Rule Categories
Maintainability Rules:
- Long method
- Long parameter list
- Large class
- Complex method
Reliability Rules:
- Empty catch blocks
- Null reference risks
- Resource leaks
- Unhandled exceptions
Security Rules:
- SQL injection risks
- XSS vulnerabilities
- Hardcoded credentials
- Insecure random
Performance Rules:
- Inefficient loops
- Unnecessary allocations
- String concatenation in loops
- Boxing/unboxing
Configuration
Quality Thresholds
[analysis.quality]
max_method_lines = 50
max_parameters = 5
max_class_methods = 20
max_cyclomatic_complexity = 10
max_nesting_depth = 4
[analysis.quality.rules]
long_method = "warning"
long_parameter_list = "warning"
god_class = "error"
empty_catch = "error"
magic_numbers = "info"
Severity Levels
- Error: Must be fixed
- Warning: Should be fixed
- Info: Consider fixing
- Hint: Suggestion for improvement
CLI Usage
Quality Analysis
# Analyze code quality
bsharp analyze MyProject.csproj --enable-ruleset quality
# Generate quality report
bsharp analyze MySolution.sln --out quality-report.json
# Filter by severity
bsharp analyze MyFile.cs --severity error,warning
Example Output
{
"quality_score": {
"overall": 72.5,
"maintainability": 68.0,
"complexity": 75.0,
"duplication": 80.0
},
"technical_debt": {
"total_issues": 45,
"estimated_hours": 12.5,
"debt_ratio": 0.08
},
"diagnostics": [
{
"code": "QUAL001",
"severity": "warning",
"message": "Long method detected",
"file": "src/OrderService.cs",
"line": 42,
"column": 17
}
]
}
Integration with Pipeline
Quality Ruleset
Registration:
// In AnalyzerRegistry
registry.add_ruleset(QualityRuleset {
    id: "quality",
    rules: vec![
        Box::new(LongMethodRule::new()),
        Box::new(LongParameterListRule::new()),
        Box::new(GodClassRule::new()),
        Box::new(EmptyCatchRule::new()),
        // ... more rules
    ],
});
Execution:
- Rules run during Local or Semantic phase
- Visitor pattern for AST traversal
- Diagnostics collected in session
Programmatic Usage
Running Quality Analysis
use bsharp::analysis::quality::QualityAnalyzer;

let parser = Parser::new();
let cu = parser.parse(source_code)?;

let analyzer = QualityAnalyzer::new();
let report = analyzer.analyze(&cu);

println!("Quality Score: {}", report.quality_score.overall);
println!("Issues Found: {}", report.diagnostics.len());
Custom Rules
use bsharp::analysis::quality::QualityRule;

struct CustomRule;

impl QualityRule for CustomRule {
    fn id(&self) -> &'static str { "custom_rule" }
    fn name(&self) -> &'static str { "Custom Rule" }
    // Required by the QualityRule trait, which declares no default for `description`.
    fn description(&self) -> &'static str { "Project-specific custom rule" }

    fn check(&self, node: &NodeRef, session: &mut AnalysisSession) {
        // Custom logic
    }
}

// Register custom rule
analyzer.add_rule(Box::new(CustomRule));
Future Enhancements
Planned Features
- Machine Learning-Based Detection
  - Learn from codebase patterns
  - Detect project-specific smells
- Refactoring Suggestions
  - Automated refactoring proposals
  - Preview refactoring impact
- Quality Trends
  - Track quality over time
  - Identify degradation
  - Measure improvement
- Team Metrics
  - Per-developer quality metrics
  - Code review insights
  - Best practice adoption
Related Documentation
- Analysis Pipeline - Pipeline integration
- Metrics Collection - Quality metrics
- Control Flow Analysis - Complexity analysis
- Architecture - Design decisions
References
- Standards: Clean Code (Robert C. Martin), Refactoring (Martin Fowler)
Passes and Rules Registry
This page summarizes the default analysis registry: which passes and rulesets are registered by default and when they run.
Default Registry
Source: src/bsharp_analysis/src/framework/registry.rs
// Simplified summary based on default_registry()
- Pass: indexing::IndexingPass // indexing/symbols
- Pass: pe_loader::PeLoaderPass // external PE metadata (if available)
- Pass: metrics::MetricsPass // local metrics (Query-based)
- Ruleset (local): rules::naming // naming conventions
- Ruleset (local): rules::semantic // baseline semantic checks (local)
- Pass: control_flow::ControlFlowPass // control flow stats and diagnostics
- Pass: dependencies::DependenciesPass // dependency graph & summary
- Ruleset (semantic): control_flow_smells // consumes global artifacts
- Pass: reporting::ReportingPass // consolidate artifacts into report
Notes:
- Each pass declares its own Phase (AnalyzerPass::phase()), e.g. MetricsPass runs in Phase::LocalRules.
- Semantic rulesets (e.g., control_flow_smells) run after global artifacts are produced.
Phases
- Index: Build indexes (symbols, FQNs) and load external metadata.
- LocalRules: Run per-file local analyses (e.g., metrics) and baseline rules.
- Global/Semantic: Build global artifacts (control flow, dependencies), then run semantic rules consuming them.
- Reporting: Finalize results into AnalysisReport.
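To illustrate how declared phases can determine execution order, the sketch below sorts pass ids by a stand-in `Phase` enum whose variant order follows the list above. The enum and the `execution_order` function are assumptions for illustration; the actual scheduling logic of AnalyzerPipeline, including any handling of depends_on, is not shown in this document.

```rust
/// Stand-in for the framework's `Phase`; variant order encodes execution order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Phase {
    Index,
    LocalRules,
    Global,
    Semantic,
    Reporting,
}

/// Returns pass ids ordered by phase, keeping registration order within a phase.
pub fn execution_order(passes: &[(&'static str, Phase)]) -> Vec<&'static str> {
    let mut sorted: Vec<(&'static str, Phase)> = passes.to_vec();
    // `sort_by_key` is stable, so passes in the same phase keep their relative order.
    sorted.sort_by_key(|(_, phase)| *phase);
    sorted.into_iter().map(|(id, _)| id).collect()
}
```

Relying on a stable sort means registration order remains the tie-breaker inside a phase, which keeps runs deterministic.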
Configuration: Enabling/Disabling
Toggles are driven by AnalysisConfig:
- Passes: enable_passes[pass_id] = true|false
- Rulesets: enable_rulesets[ruleset_id] = true|false
- Severities: rule_severities["CODE"] = Error|Warning|Info|Hint
The CLI maps flags to these fields (see docs/cli/analyze.md).
IDs
- Pass IDs (AnalyzerPass::id()):
  - passes.indexing
  - passes.pe_loader
  - passes.metrics
  - passes.control_flow
  - passes.dependencies
  - passes.reporting
- Ruleset IDs depend on the ruleset constructors (e.g., naming, semantic, control_flow_smells).
References
- src/bsharp_analysis/src/framework/registry.rs
- src/bsharp_analysis/src/passes/*
- src/bsharp_analysis/src/rules/*
Analysis Report Schema
The AnalysisReport summarizes diagnostics and artifacts produced by the analysis pipeline.
Struct
Source: src/bsharp_analysis/src/report/mod.rs
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct CfgSummary {
    pub total_methods: usize,
    pub high_complexity_methods: usize,
    pub deep_nesting_methods: usize,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AnalysisReport {
    pub schema_version: u32,
    pub diagnostics: DiagnosticCollection,
    pub metrics: Option<AstAnalysis>,
    pub cfg: Option<CfgSummary>,
    pub deps: Option<DependencySummary>,
    pub workspace_warnings: Vec<String>,
    pub workspace_errors: Vec<String>,
}
Field Details
- schema_version – current schema version (1)
- diagnostics – all emitted diagnostics with codes, severities, locations
- metrics – aggregated AstAnalysis when MetricsPass runs
- cfg – summarized control flow stats when ControlFlowPass runs
- deps – dependency summary when DependenciesPass runs
- workspace_warnings – non-fatal workspace-level messages
- workspace_errors – reserved for future use
Example (pretty JSON)
{
"schema_version": 1,
"diagnostics": {
"diagnostics": [
{
"code": "CF002",
"severity": "warning",
"message": "High cyclomatic complexity",
"file": "src/OrderProcessor.cs",
"line": 42,
"column": 17
}
]
},
"metrics": {
"total_classes": 15,
"total_interfaces": 3,
"total_structs": 2,
"total_enums": 1,
"total_records": 0,
"total_delegates": 0,
"total_methods": 87,
"total_properties": 21,
"total_fields": 12,
"total_events": 0,
"total_constructors": 15,
"total_if_statements": 20,
"total_for_loops": 5,
"total_while_loops": 2,
"total_switch_statements": 3,
"total_try_statements": 1,
"total_using_statements": 2,
"cyclomatic_complexity": 245,
"lines_of_code": 980,
"max_nesting_depth": 5,
"documented_methods": 0,
"documented_classes": 0
},
"cfg": {
"total_methods": 87,
"high_complexity_methods": 5,
"deep_nesting_methods": 3
},
"deps": {
"nodes": 42,
"edges": 120
},
"workspace_warnings": [],
"workspace_errors": []
}
Where It Comes From
AnalysisReport::from_session(&session) collects:
- metrics from session.artifacts.get::<AstAnalysis>()
- cfg by summarizing the ControlFlowIndex artifact against thresholds
- deps by summarizing DependencyGraph
- diagnostics copied from session.diagnostics
Related
- docs/cli/analyze.md – CLI options and examples
- docs/analysis/pipeline.md – Where in the pipeline artifacts are produced
Writing an Analyzer Pass
This guide shows how to create a new analysis pass by implementing AnalyzerPass and registering it in the analysis pipeline.
Trait
Source: src/bsharp_analysis/src/framework/passes.rs
pub trait AnalyzerPass: Send + Sync + 'static {
    fn id(&self) -> &'static str;
    fn phase(&self) -> Phase; // Index | LocalRules | Global | Semantic | Reporting
    fn depends_on(&self) -> &'static [&'static str] { &[] }
    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {}
}
Minimal Pass
use bsharp_analysis::framework::{AnalyzerPass, Phase, AnalysisSession};
use bsharp_syntax::CompilationUnit;

pub struct MyPass;

impl AnalyzerPass for MyPass {
    fn id(&self) -> &'static str { "passes.my_pass" }
    fn phase(&self) -> Phase { Phase::LocalRules }

    fn run(&self, cu: &CompilationUnit, session: &mut AnalysisSession) {
        // Inspect `cu` and write results into `session.artifacts` or `session.diagnostics`
        // Example: count classes and log a note (pseudo)
        let mut count = 0usize;
        for _c in bsharp_analysis::framework::Query::from(cu).of::<bsharp_syntax::ClassDeclaration>() {
            count += 1;
        }
        // session.artifacts.insert(MyArtifact { class_count: count });
        // session.diagnostics.add(...);
    }
}
Registration
Add your pass to the default registry in src/bsharp_analysis/src/framework/registry.rs:
reg.register_pass(crate::passes::my_pass::MyPass);
Or, build a custom registry for experiments:
let mut reg = AnalyzerRegistry::default_registry();
reg.register_pass(MyPass);
AnalyzerPipeline::run_for_file(&cu, &mut session, &reg);
You can also toggle passes via AnalysisConfig.enable_passes["passes.my_pass"] = true|false (see configuration docs).
Tips
- Keep passes small: Focus on one responsibility.
- Prefer Query/AstWalker: Use Query for typed enumeration or AstWalker with Visit for custom traversal.
- Write artifacts: Insert results with session.artifacts.insert(T) when they may be consumed later.
- Determinism: Avoid non-deterministic ordering; use sorted maps/lists if needed.
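The determinism tip can be illustrated with standard-library collections alone. In this sketch, per-class counts gathered in a `HashMap` (whose iteration order is unspecified) are converted into a list ordered by key before being stored or reported; the function name and the idea of keying by class name are assumptions for illustration.

```rust
use std::collections::{BTreeMap, HashMap};

/// Converts unordered per-class counts into a list sorted by class name.
pub fn sorted_counts(counts: &HashMap<String, usize>) -> Vec<(String, usize)> {
    // A BTreeMap iterates in key order, so the output is reproducible across runs.
    let ordered: BTreeMap<&String, &usize> = counts.iter().collect();
    ordered
        .into_iter()
        .map(|(name, count)| (name.clone(), *count))
        .collect()
}
```

Accumulating directly into a `BTreeMap` inside the pass would avoid the conversion step; the two-step form is shown because existing code often starts from a `HashMap`.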
Writing a Ruleset
This guide shows how to define rules and bundle them into a RuleSet to be executed by the analysis pipeline.
Traits and Types
Source: src/bsharp_analysis/src/framework/rules.rs
pub enum RuleTarget { All, Declarations, Members, Statements, Expressions }

pub trait Rule: Send + Sync + 'static {
    fn id(&self) -> &'static str;
    fn category(&self) -> &'static str;
    fn applies_to(&self) -> RuleTarget { RuleTarget::All }
    fn visit(&self, _node: &NodeRef, _session: &mut AnalysisSession) {}
}

pub struct RuleSet {
    pub id: &'static str,
    pub rules: Vec<Box<dyn Rule>>
}
Minimal Rule
use bsharp_analysis::framework::{Rule, RuleTarget, NodeRef, AnalysisSession};

pub struct NoEmptyCatch;

impl Rule for NoEmptyCatch {
    fn id(&self) -> &'static str { "QUAL010" }
    fn category(&self) -> &'static str { "quality" }
    fn applies_to(&self) -> RuleTarget { RuleTarget::Statements }

    fn visit(&self, node: &NodeRef, session: &mut AnalysisSession) {
        if let NodeRef::Statement(stmt) = node {
            if let bsharp_syntax::statements::statement::Statement::Try(t) = stmt {
                for c in &t.catches {
                    if c.block_is_empty() {
                        session.diagnostics.add(
                            bsharp_analysis::DiagnosticCode::from_static("QUAL010"),
                            "Empty catch block",
                            None,
                        );
                    }
                }
            }
        }
    }
}
Building a RuleSet
use bsharp_analysis::framework::RuleSet;

pub fn ruleset() -> RuleSet {
    RuleSet::new("quality")
        .with_rule(NoEmptyCatch)
        // .with_rule(AnotherRule)
}
Register in the default registry (src/bsharp_analysis/src/framework/registry.rs) or construct a custom registry.
reg.register_ruleset(crate::rules::quality::ruleset()); // local rules
reg.register_semantic_ruleset(crate::rules::control_flow_smells::ruleset());
Rulesets can be enabled/disabled via AnalysisConfig.enable_rulesets["quality"] = true|false.
Tips
- Choose RuleTarget thoughtfully to avoid unnecessary visits.
- Emit diagnostics with specific codes and helpful messages.
- Keep rules independent; accumulate state in AnalysisSession artifacts when needed.
- Honor config toggles; only run if your ruleset is enabled.
Command Line Interface
The BSharp CLI provides command-line tools for parsing, analyzing, and visualizing C# code.
Installation
From Source
git clone https://github.com/mikserek/bsharp.git
cd bsharp
cargo build --release
The binary will be available at target/release/bsharp.
Add to PATH
# Linux/macOS
export PATH="$PATH:/path/to/bsharp/target/release"
# Windows
# Add to System Environment Variables
Command Structure
bsharp <COMMAND> [OPTIONS] <INPUT>
Global Options
--help, -h Show help information
--version, -V Show version information
Argument Files (@file)
All commands support argument files via @file syntax. Example:
bsharp @args.txt
Where args.txt contains one argument per line (comments and quoting follow standard shell parsing rules).
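The expansion of an `@file` argument can be sketched as follows. This is a simplified illustration, not the CLI's actual implementation: it treats each non-empty line as one argument and skips lines starting with `#`, but it does not handle quoting. The function names `args_from_file_text` and `expand_args` are hypothetical, and the file reader is passed in as a closure so the logic stays independent of the file system.

```rust
use std::io;

/// Splits the text of an argument file into arguments, one per line.
/// Simplification: blank lines and `#` comment lines are skipped; quoting is NOT handled.
pub fn args_from_file_text(text: &str) -> Vec<String> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}

/// Replaces every `@path` argument with the arguments loaded through `read`.
/// Arguments that do not start with `@` are passed through unchanged.
pub fn expand_args<F>(args: &[String], mut read: F) -> io::Result<Vec<String>>
where
    F: FnMut(&str) -> io::Result<String>,
{
    let mut out = Vec::new();
    for arg in args {
        match arg.strip_prefix('@') {
            Some(path) => out.extend(args_from_file_text(&read(path)?)),
            None => out.push(arg.clone()),
        }
    }
    Ok(out)
}
```

In a real program the closure would typically be `|p| std::fs::read_to_string(p)`.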
Available Commands
parse
Parse C# source code and print a textual AST tree to stdout.
bsharp parse <INPUT>
See: Parse Command
tree
Generate a visualization of the Abstract Syntax Tree.
bsharp tree <INPUT> [--output <FILE>] [--format mermaid|dot]
Notes:
- Default format is mermaid; output defaults to <input>.mmd.
- For DOT/Graphviz, use --format dot (or graphviz); output defaults to <input>.dot.
See: Tree Visualization
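The default output naming described in the notes can be sketched with the standard library alone. The type `TreeFormat` and the functions `parse_format` and `default_output` are hypothetical names chosen for this illustration; the sketch assumes the default path is obtained by replacing the input's extension, which matches the `MyClass.cs` to `MyClass.mmd` example in this documentation.

```rust
use std::path::{Path, PathBuf};

/// Output formats accepted by `--format` (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TreeFormat {
    Mermaid,
    Dot,
}

/// Parses a `--format` value; `graphviz` is treated as an alias for `dot`.
pub fn parse_format(value: &str) -> Option<TreeFormat> {
    match value {
        "mermaid" => Some(TreeFormat::Mermaid),
        "dot" | "graphviz" => Some(TreeFormat::Dot),
        _ => None,
    }
}

/// Derives the default output path by swapping the input's extension.
pub fn default_output(input: &Path, format: TreeFormat) -> PathBuf {
    let ext = match format {
        TreeFormat::Mermaid => "mmd",
        TreeFormat::Dot => "dot",
    };
    input.with_extension(ext)
}
```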
analyze
Analyze C# code and generate a comprehensive analysis report.
bsharp analyze <INPUT> [OPTIONS]
See: Analysis Command
format
Format C# files using the built-in formatter and syntax emitters.
bsharp format <INPUT> [--write] [--newline-mode lf|crlf] [--max-consecutive-blank-lines <N>] \
[--blank-line-between-members <BOOL>] [--trim-trailing-whitespace <BOOL>] \
[--emit-trace] [--emit-trace-file <FILE>]
Notes:
- <INPUT> can be a file or directory (recursively formats .cs files; skips hidden/bin/obj/target).
- --write defaults to true; when false and a single file is given, the formatted output is printed to stdout.
- Emission tracing can be enabled by --emit-trace or environment variable BSHARP_EMIT_TRACE=1.
See: Format Command
Common Usage Patterns
Quick Parse Check
# Check if file parses successfully
bsharp parse MyFile.cs
Generate AST for Inspection
# Pretty-printed JSON
bsharp parse MyFile.cs --output ast.json
Visualize Code Structure
# Generate Mermaid diagram (default), writes MyClass.mmd
bsharp tree MyClass.cs
# Generate Graphviz DOT diagram
bsharp tree MyClass.cs --format dot --output diagram.dot
Analyze Project Quality
# Full analysis with report
bsharp analyze MyProject.csproj --out report.json --format pretty-json
Analyze Solution
# Analyze entire solution
bsharp analyze MySolution.sln --follow-refs true
Input Types
Single File
bsharp parse Program.cs
Project File (.csproj)
bsharp analyze MyProject.csproj
Solution File (.sln)
bsharp analyze MySolution.sln
Directory
bsharp analyze ./src
Output Formats
JSON (Compact)
bsharp analyze MyFile.cs --format json
Output: Single-line JSON, optimized for machine consumption
Pretty JSON
bsharp analyze MyFile.cs --format pretty-json
Output: Indented JSON, human-readable
Mermaid/DOT (Tree Command)
# Mermaid (default)
bsharp tree MyFile.cs --output diagram.mmd
# Graphviz DOT
bsharp tree MyFile.cs --format dot --output diagram.dot
Output: Mermaid (.mmd) or Graphviz DOT (.dot)
Error Handling
Parse Errors
$ bsharp parse InvalidSyntax.cs
Error: Parse failed at line 5, column 12
Expected ';' but found 'class'
public class MyClass
^
File Not Found
$ bsharp parse NonExistent.cs
Error: File not found: NonExistent.cs
Invalid Project
$ bsharp analyze Invalid.csproj
Error: Failed to parse project file: Invalid XML
Environment Variables
RUST_LOG
Control logging verbosity:
# Show all logs
RUST_LOG=debug bsharp parse MyFile.cs
# Show only warnings and errors
RUST_LOG=warn bsharp analyze MyProject.csproj
# Show specific module logs
RUST_LOG=bsharp::parser=debug bsharp parse MyFile.cs
RUST_BACKTRACE
Enable stack traces on panic:
RUST_BACKTRACE=1 bsharp parse MyFile.cs
Performance Considerations
Large Files
For large files (> 10,000 lines), parsing may take several seconds:
# Monitor progress with debug logging
RUST_LOG=info bsharp parse LargeFile.cs
Large Solutions
For solutions with many projects, use parallel analysis:
# Requires parallel_analysis feature
cargo build --release --features parallel_analysis
bsharp analyze LargeSolution.sln
Memory Usage
Memory usage scales with AST size. For very large codebases:
# Analyze incrementally by project (bash; globstar enables the ** pattern)
shopt -s globstar
for proj in **/*.csproj; do
  bsharp analyze "$proj" --out "$(basename "$proj" .csproj).json"
done
Integration with Other Tools
CI/CD Pipeline
# GitHub Actions example
- name: Analyze Code Quality
run: |
bsharp analyze MySolution.sln --out analysis.json
# Upload analysis.json as artifact
Pre-commit Hook
#!/bin/bash
# .git/hooks/pre-commit
changed_files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.cs$')
for file in $changed_files; do
if ! bsharp parse "$file" > /dev/null 2>&1; then
echo "Parse error in $file"
exit 1
fi
done
Editor Integration
// VS Code tasks.json
{
"version": "2.0.0",
"tasks": [
{
"label": "Analyze Current File",
"type": "shell",
"command": "bsharp",
"args": [
"analyze",
"${file}",
"--out",
"${file}.analysis.json"
]
}
]
}
Troubleshooting
Command Not Found
$ bsharp parse MyFile.cs
bsharp: command not found
Solution: Add bsharp to PATH or use full path:
/path/to/bsharp/target/release/bsharp parse MyFile.cs
Permission Denied
$ bsharp parse MyFile.cs
Permission denied
Solution: Make binary executable:
chmod +x /path/to/bsharp
Out of Memory
$ bsharp analyze HugeSolution.sln
Error: memory allocation failed
Solution: Analyze smaller subsets or increase system memory
Configuration Files
Analysis Configuration
Create .bsharp.toml in project root:
[analysis]
max_cyclomatic_complexity = 10
max_method_length = 50
[analysis.quality]
long_method = "warning"
god_class = "error"
[workspace]
follow_refs = true
include = ["**/*.cs"]
exclude = ["**/obj/**", "**/bin/**"]
Usage:
# Automatically loads .bsharp.toml from current directory
bsharp analyze MyProject.csproj
Shell Completion
Shell completion generation is currently not available in the CLI.
Examples
Example 1: Quick Syntax Check
# Check if all C# files in directory parse correctly
find . -name "*.cs" -exec bsharp parse {} \; 2>&1 | grep -i error
Example 2: Generate Documentation
# Parse all files and extract class/method names
for file in src/**/*.cs; do
bsharp parse "$file" --output "${file}.json"
done
# Process JSON to generate documentation
# (custom script)
Example 3: Code Quality Gate
#!/bin/bash
# quality-gate.sh
bsharp analyze MyProject.csproj --out report.json --format json
# Extract error count
errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' report.json)
if [ "$errors" -gt 0 ]; then
echo "Quality gate failed: $errors errors found"
exit 1
fi
echo "Quality gate passed"
Example 4: Complexity Report
# Generate complexity report for all methods
bsharp analyze MySolution.sln --out complexity.json
# Extract high-complexity methods
jq '.diagnostics.items | map(select(.code == "MET001"))' complexity.json
CLI Architecture
Implementation
Location: src/bsharp_cli/
src/bsharp_cli/
├── src/
│ ├── main.rs # CLI entry point, clap definitions
│ └── commands/
│ ├── mod.rs # Command module exports
│ ├── parse.rs # Parse command implementation
│ ├── tree.rs # AST visualization command (Mermaid/DOT)
│ └── analyze.rs # Analysis command
└── Cargo.toml
Command Pattern
Each command follows this pattern:
pub fn execute(input: PathBuf, /* other args */) -> Result<()> {
    // 1. Validate input
    // 2. Load/parse files
    // 3. Perform operation
    // 4. Generate output
    // 5. Handle errors
    Ok(())
}
Future Enhancements
Planned Features
- Interactive Mode
  - REPL for exploring AST
  - Interactive analysis
- Watch Mode
  - Monitor files for changes
  - Re-analyze on save
- Language Server
  - LSP implementation
  - IDE integration
- Web Interface
  - Browser-based visualization
  - Interactive reports
Related Documentation
- Parse Command - Detailed parse command documentation
- Tree Visualization - AST visualization
- Analysis Pipeline - Analysis internals
References
- Implementation: src/bsharp_cli/
- Commands: src/bsharp_cli/src/commands/
- Clap Documentation: https://docs.rs/clap/
--emit-spans
- When used with --errors-json, include absolute and relative spans in the JSON under error.spans.
- No effect unless --errors-json is set.
Parse Command
The parse command parses C# source code and prints a textual AST tree representation to stdout.
Usage
bsharp parse --input <INPUT> [--errors-json] [--emit-spans] [--no-color] [--lenient]
Arguments
<INPUT> (required)
- Path to C# source file
- Must have .cs extension
- File must exist and be readable
Options
--errors-json
- Print a machine-readable JSON error object to stdout on parse failure and exit non-zero
- Disables pretty error output
--no-color
- Disable ANSI colors in pretty error output
--lenient
- Enable best-effort recovery mode (default is strict)
Note: The --output option is currently not used; the command writes the textual tree to stdout.
Examples
Basic Parsing
# Parse and print textual AST tree to stdout
bsharp parse Program.cs
Batch Parsing
# Parse all C# files in a directory (prints textual trees)
for file in src/**/*.cs; do
bsharp parse "$file"
done
Output
The command prints a human-readable textual tree describing the AST. For visualization outputs (Mermaid/DOT), use the tree command.
Error Handling
Parse Errors
$ bsharp parse InvalidSyntax.cs
Error: Parse failed
0: at line 5, in keyword "class":
public clas MyClass { }
^--- expected keyword "class"
1: in context "class declaration"
Error Information:
- Line and column numbers
- Context stack showing where parsing failed
- Expected vs. actual input
- Helpful error messages
Pretty error formatting
The parser integrates with the miette crate for rich, labeled diagnostics in pretty (non-JSON) mode. CLI parse errors are formatted from the underlying ErrorTree with spans and context information for easier debugging.
For programmatic formatting from parser code, see bsharp_parser::errors::to_miette_report which converts an ErrorTree to a miette::Report with source code attached.
File Errors
$ bsharp parse NonExistent.cs
Error: Failed to read file: NonExistent.cs
Caused by: No such file or directory (os error 2)
Use Cases
1. Syntax Validation
# Check if file has valid syntax
if bsharp parse MyFile.cs > /dev/null 2>&1; then
echo "Syntax OK"
else
echo "Syntax Error"
exit 1
fi
2. AST Inspection
# Parse and inspect AST structure
bsharp parse MyClass.cs --output ast.json
jq '.declarations[0].Class.name.name' ast.json
3. Documentation Input
# Parse C# and generate documentation using your own script
bsharp parse MyFile.cs --output ast.json
python generate_docs.py ast.json > docs.md
4. Static Analysis
# Parse and analyze with custom tool
bsharp parse MyFile.cs --output ast.json
./my-analyzer ast.json
Performance
Parsing Speed
- Small files (< 100 lines): < 10ms
- Medium files (100-1000 lines): 10-100ms
- Large files (1000-10000 lines): 100ms-1s
- Very large files (> 10000 lines): 1-10s
Memory Usage
- Memory usage scales linearly with file size
- Typical: 1-5 MB per 1000 lines of code
- Peak memory during AST construction
Integration
CI/CD Pipeline
# GitHub Actions
- name: Validate C# Syntax
run: |
find . -name "*.cs" | while read file; do
bsharp parse "$file" || exit 1
done
Pre-commit Hook
#!/bin/bash
# .git/hooks/pre-commit
git diff --cached --name-only --diff-filter=ACM | grep '\.cs$' | while read file; do
if ! bsharp parse "$file" > /dev/null 2>&1; then
echo "Parse error in $file"
exit 1
fi
done
Build Script
#!/bin/bash
# validate-syntax.sh
errors=0
for file in src/**/*.cs; do
if ! bsharp parse "$file" > /dev/null 2>&1; then
echo "ERROR: $file"
((errors++))
fi
done
if [ $errors -gt 0 ]; then
echo "Found $errors files with syntax errors"
exit 1
fi
Comparison with Other Tools
vs. Roslyn
- BSharp: Fast, standalone, JSON output
- Roslyn: Full compiler, .NET required, complex API
vs. Tree-sitter
- BSharp: C#-specific, complete AST
- Tree-sitter: Multi-language, syntax tree only
Implementation
Location: src/bsharp_cli/src/commands/parse.rs
pub fn execute(
    input: PathBuf,
    output: Option<PathBuf>,
    errors_json: bool,
    no_color: bool,
    lenient: bool,
) -> Result<()> {
    // Read file, choose strict/lenient, parse, and write <input>.json by default
    // See the source file for detailed behavior and error formatting.
    Ok(())
}
Related Documentation
- CLI Overview - General CLI usage
- Tree Visualization - Visualize parsed AST
- AST Structure - AST node reference
- Error Handling - Parse error details
References
- Implementation: src/bsharp_cli/src/commands/parse.rs
- Parser: src/bsharp_parser/src/
- AST Definitions: src/bsharp_syntax/src/
Tree Visualization Command
The tree command generates a visualization of the Abstract Syntax Tree (AST) from C# source code in Mermaid or Graphviz DOT format.
Usage
bsharp tree <INPUT> [--output <FILE>] [--format mermaid|dot]
Arguments
<INPUT> (required)
- Path to C# source file
- Must have .cs extension
Options
--output, -o <FILE> (optional)
- Output file path
- Default: <input>.mmd for Mermaid, <input>.dot for DOT
--format <FORMAT> (optional)
- One of: mermaid (default), dot (alias: graphviz)
Examples
Basic Visualization
# Generate Mermaid diagram (default)
bsharp tree Program.cs # writes Program.mmd
# Generate Graphviz DOT diagram
bsharp tree Program.cs --format dot # writes Program.dot
# Specify output file
bsharp tree Program.cs --format dot --output ast-diagram.dot
View/Render
# Mermaid preview (e.g., VS Code Mermaid extension) or CLI renderer
# Graphviz render to PNG
dot -Tpng Program.dot -o Program.png
Output Formats
Mermaid
Outputs a simple top-level graph in Mermaid syntax (.mmd).
graph TD
n0["CompilationUnit\\nUsings: 1\\nDecls: 1"]
u0["Using using System;"]
n0 --> u0
d0["Class: Program"]
n0 --> d0
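How a tree turns into the Mermaid text above can be sketched with a tiny recursive emitter. This is an illustration only and not the renderer in `bsharp_syntax`; the type `SketchNode` and the function `render_mermaid` are hypothetical, and the node ids are supplied by the caller. Double quotes in labels are replaced with Mermaid's `#quot;` entity so the quoted label stays well-formed.

```rust
/// A minimal tree node for this sketch (not a BSharp AST type).
pub struct SketchNode {
    pub id: &'static str,
    pub label: &'static str,
    pub children: Vec<SketchNode>,
}

/// Renders the tree as a top-down Mermaid graph.
pub fn render_mermaid(root: &SketchNode) -> String {
    let mut out = String::from("graph TD\n");
    emit(root, &mut out);
    out
}

/// Emits the node definition, then each child's subtree followed by the edge to it.
fn emit(node: &SketchNode, out: &mut String) {
    // Escape double quotes so the label stays inside its quoted string.
    let label = node.label.replace('"', "#quot;");
    out.push_str(&format!("{}[\"{}\"]\n", node.id, label));
    for child in &node.children {
        emit(child, out);
        out.push_str(&format!("{} --> {}\n", node.id, child.id));
    }
}
```

Emitting each child before its edge reproduces the ordering of the sample output: definition of `u0`, then `n0 --> u0`, then `d0`, then `n0 --> d0`.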
Graphviz DOT
Outputs a simple top-level graph in DOT syntax (.dot).
digraph AST {
node [shape=box, fontname="Courier New"];
n0 [label="CompilationUnit\\nUsings: 1\\nDecls: 1"];
u0 [label="Using using System;"];
n0 -> u0;
d0 [label="Class: Program"];
n0 -> d0;
}
Color Scheme
- Gray - Root nodes (CompilationUnit)
- Blue - Type declarations (Class, Interface, Struct)
- Green - Member declarations (Method, Property, Field)
- Yellow - Statements (If, For, While)
- Orange - Expressions (Binary, Invocation)
- Purple - Types (Primitive, Named, Generic)
Visualization Features
Node Information
Each node displays:
- Node Type - AST node type name
- Identifier - Name (for named nodes)
- Additional Info - Modifiers, types, etc.
Tree Layout
- Top-down - Root at top, leaves at bottom
- Hierarchical - Parent-child relationships clear
- Balanced - Nodes distributed evenly
- Scalable - Adjusts to tree size
Use Cases
1. Understanding Code Structure
# Visualize complex class
bsharp tree ComplexClass.cs --output structure.mmd
2. Teaching/Documentation
# Generate diagrams for documentation
bsharp tree Example.cs --output docs/ast-example.mmd
3. Debugging Parser
# Verify parser output
bsharp tree TestCase.cs --output debug.mmd
4. Code Review
# Visualize changes
bsharp tree NewFeature.cs --output review.mmd
Limitations
Large Files
- Files > 1000 lines may produce very large diagrams
- Consider visualizing specific classes/methods only
Complex Nesting
- Deeply nested structures may be hard to read
- The rendered diagram may require horizontal scrolling
Performance
- Generation time increases with AST size
- Large files (> 5000 lines) may take several seconds
Advanced Usage
Selective Visualization
# Extract specific class and visualize
# (requires custom script to extract class)
extract-class.sh MyFile.cs MyClass > temp.cs
bsharp tree temp.cs --output MyClass-ast.mmd
rm temp.cs
Batch Generation
# Generate visualizations for all files
for file in src/**/*.cs; do
  output="diagrams/$(basename "$file" .cs).mmd"
  bsharp tree "$file" --output "$output"
done
Integration with Documentation
# MyClass Documentation
## AST Structure

The class structure shows...
Implementation
Location: src/bsharp_cli/src/commands/tree.rs
pub fn execute(args: Box<TreeArgs>) -> Result<()> {
    // Parses input in lenient mode, then writes Mermaid (.mmd) or DOT (.dot)
    // using bsharp_syntax::node::render::{to_mermaid, to_dot}.
    Ok(())
}
Renderer functions live in src/bsharp_syntax/src/node/render.rs:
to_mermaid(&ast);
to_dot(&ast);
Customization
Future Enhancements
- Interactive SVG
  - Click to expand/collapse nodes
  - Hover for details
  - Search functionality
- Export Formats
  - PNG/PDF export
  - DOT format for Graphviz
  - PlantUML format
- Filtering
  - Show only specific node types
  - Hide implementation details
  - Focus on structure
- Styling
  - Custom color schemes
  - Font customization
  - Layout options
Troubleshooting
SVG Too Large
Problem: Generated SVG is too large to view
Solution:
- Visualize smaller code sections
- Use SVG viewer with zoom/pan
- Export to PDF for printing
Overlapping Nodes
Problem: Nodes overlap in complex trees
Solution:
- Increase SVG dimensions
- Simplify code structure
- Use horizontal layout (future feature)
Missing Nodes
Problem: Some AST nodes not shown
Solution:
- Check parser output with parse command
- Report issue if nodes are missing
Related Documentation
- CLI Overview - General CLI usage
- Parse Command - Parse textual AST tree
- AST Structure - AST node reference
References
- Implementation: src/bsharp_cli/src/commands/tree.rs
- Formats: Mermaid or Graphviz DOT
Analyze Command
The analyze command performs comprehensive code analysis on C# files, projects, or solutions, generating detailed reports with diagnostics, metrics, and quality assessments.
Usage
bsharp analyze <INPUT> [OPTIONS]
Arguments
<INPUT> (required)
- Path to C# source file (.cs)
- Path to project file (.csproj)
- Path to solution file (.sln)
- Path to directory
Options
Output Options
--out <FILE>
- Output file path for analysis report (JSON)
- Default: stdout
- Creates parent directories if needed
--format <FORMAT>
- Output format: json (compact) or pretty-json (indented)
- Default: pretty-json
Configuration
--config <FILE>
- Path to analysis configuration file (JSON or TOML)
- Overrides default settings
- CLI flags override config file settings
Workspace Options
--follow-refs <BOOL>
- Follow ProjectReference dependencies transitively
- Default: true
- Set to false to analyze only specified project
--include <GLOB>...
- Include only files matching glob patterns
- Multiple patterns allowed
- Example:
--include "**/*Service.cs" "**/*Controller.cs"
--exclude <GLOB>...
- Exclude files matching glob patterns
- Multiple patterns allowed
- Example:
--exclude "**/obj/**" "**/bin/**" "**/Tests/**"
Analysis Control
--enable-ruleset <ID>...
- Enable specific rulesets by ID
- Multiple IDs allowed
- Overrides config file
- Example:
--enable-ruleset naming quality
--disable-ruleset <ID>...
- Disable specific rulesets by ID
- Multiple IDs allowed
- Example:
--disable-ruleset experimental
--enable-pass <ID>...
- Enable specific analysis passes by ID
- Multiple IDs allowed
- Example:
--enable-pass indexing control_flow
--disable-pass <ID>...
- Disable specific analysis passes by ID
- Multiple IDs allowed
- Example:
--disable-pass dependencies
--severity <CODE=LEVEL>...
- Override diagnostic severity for specific codes
- Format: CODE=level where level is error, warning, info, or hint
- Multiple overrides allowed
- Example:
--severity MET001=error QUAL010=warning
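Parsing one `CODE=level` override can be sketched as below. The enum `Severity` and the function `parse_severity_override` are hypothetical names for this illustration and may differ from the types used in the BSharp sources; the sketch also assumes levels are matched case-sensitively.

```rust
/// Severity levels accepted in a `CODE=level` override (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error,
    Warning,
    Info,
    Hint,
}

/// Splits `CODE=level` at the first `=` and validates both halves.
pub fn parse_severity_override(arg: &str) -> Result<(String, Severity), String> {
    let (code, level) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected CODE=level, got '{}'", arg))?;
    if code.is_empty() {
        return Err(format!("missing diagnostic code in '{}'", arg));
    }
    let severity = match level {
        "error" => Severity::Error,
        "warning" => Severity::Warning,
        "info" => Severity::Info,
        "hint" => Severity::Hint,
        other => return Err(format!("unknown severity level '{}'", other)),
    };
    Ok((code.to_string(), severity))
}
```

Rejecting unknown levels early gives the user an error at the command line instead of a silently ignored override.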
Legacy Options (Single File Mode)
--symbol <NAME>
- Search for specific symbol by name
- Only works in single-file mode
- Prints symbol locations and information
Examples
Basic Analysis
# Analyze single file
bsharp analyze MyFile.cs
# Analyze project
bsharp analyze MyProject.csproj
# Analyze solution
bsharp analyze MySolution.sln
Output to File
# Save report to file
bsharp analyze MyProject.csproj --out report.json
# Compact JSON format
bsharp analyze MyProject.csproj --out report.json --format json
Using Configuration File
# Load config from file
bsharp analyze MyProject.csproj --config .bsharp.toml
# Config file with CLI overrides
bsharp analyze MyProject.csproj \
--config .bsharp.toml \
--enable-ruleset quality \
--severity MET001=error
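The precedence rule (CLI flags override config file settings) can be sketched as a merge over a map of toggles. The function `resolve_toggles` is a hypothetical helper, and applying disable flags after enable flags is an assumption of this sketch; the documentation does not state what happens when the same ID is passed to both.

```rust
use std::collections::BTreeMap;

/// Applies CLI enable/disable flags on top of toggles loaded from a config file.
/// Assumption: when an ID appears in both lists, the disable flag wins.
pub fn resolve_toggles(
    from_config: &BTreeMap<String, bool>,
    cli_enable: &[&str],
    cli_disable: &[&str],
) -> BTreeMap<String, bool> {
    // Start from the config file values, then let CLI flags overwrite them.
    let mut resolved = from_config.clone();
    for id in cli_enable {
        resolved.insert((*id).to_string(), true);
    }
    for id in cli_disable {
        resolved.insert((*id).to_string(), false);
    }
    resolved
}
```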
Workspace Filtering
# Analyze only service files
bsharp analyze MySolution.sln --include "**/*Service.cs"
# Exclude test files
bsharp analyze MySolution.sln --exclude "**/Tests/**"
# Multiple filters
bsharp analyze MySolution.sln \
--include "src/**/*.cs" \
--exclude "**/obj/**" "**/bin/**" "**/Tests/**"
Controlling Analysis
# Enable specific rulesets
bsharp analyze MyProject.csproj \
--enable-ruleset naming quality control_flow
# Disable experimental features
bsharp analyze MyProject.csproj \
--disable-ruleset experimental
# Enable/disable specific passes
bsharp analyze MyProject.csproj \
--enable-pass indexing control_flow \
--disable-pass dependencies
Severity Overrides
# Treat specific warnings as errors
bsharp analyze MyProject.csproj \
--severity MET001=error \
--severity QUAL001=error
# Downgrade specific errors to warnings
bsharp analyze MyProject.csproj \
--severity CS0168=warning
Symbol Search (Single File)
# Find symbol in file
bsharp analyze MyFile.cs --symbol MyClass
# Output:
# Found symbol 'MyClass' at line 10, column 14
Analysis Modes
Single File Mode
Triggered when: Input is a .cs file
Behavior:
- Parses single file
- Runs analysis pipeline on CompilationUnit
- Supports --symbol option for symbol search
- Faster for quick checks
Example:
bsharp analyze Program.cs --out analysis.json
Workspace Mode
Triggered when: Input is .sln, .csproj, or directory
Behavior:
- Loads entire workspace
- Discovers all source files
- Follows project references (if --follow-refs true)
- Applies include/exclude filters
- Analyzes all files deterministically
- Aggregates results into single report
Example:
bsharp analyze MySolution.sln \
--follow-refs true \
--exclude "**/Tests/**" \
--out workspace-analysis.json
Configuration File Format
TOML Format
.bsharp.toml:
[analysis]
max_cyclomatic_complexity = 10
max_method_length = 50
[analysis.control_flow]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4
[analysis.quality]
long_method = "warning"
god_class = "error"
empty_catch = "error"
[workspace]
follow_refs = true
include = ["src/**/*.cs"]
exclude = ["**/obj/**", "**/bin/**", "**/Tests/**"]
[enable_rulesets]
naming = true
quality = true
control_flow = true
[enable_passes]
indexing = true
control_flow = true
dependencies = true
[rule_severities]
MET001 = "error"
QUAL001 = "warning"
JSON Format
.bsharp.json:
{
"analysis": {
"max_cyclomatic_complexity": 10,
"max_method_length": 50,
"control_flow": {
"cf_high_complexity_threshold": 10,
"cf_deep_nesting_threshold": 4
}
},
"workspace": {
"follow_refs": true,
"include": ["src/**/*.cs"],
"exclude": ["**/obj/**", "**/bin/**"]
},
"enable_rulesets": {
"naming": true,
"quality": true
},
"enable_passes": {
"indexing": true,
"control_flow": true
},
"rule_severities": {
"MET001": "error",
"QUAL001": "warning"
}
}
Output Format
Analysis Report Structure
{
"schema_version": 1,
"diagnostics": {
"items": [
{
"code": "MET001",
"severity": "warning",
"message": "Method has high cyclomatic complexity",
"file": "src/OrderService.cs",
"line": 42,
"column": 17,
"end_line": 85,
"end_column": 5
}
]
},
"metrics": {
"total_lines": 1250,
"code_lines": 980,
"comment_lines": 150,
"blank_lines": 120,
"class_count": 15,
"interface_count": 3,
"method_count": 87,
"total_complexity": 245,
"max_complexity": 18,
"max_nesting_depth": 5
},
"cfg": {
"total_methods": 87,
"high_complexity_count": 5,
"deep_nesting_count": 3
},
"deps": {
"total_nodes": 15,
"total_edges": 42,
"circular_dependencies": 0,
"max_depth": 4
},
"workspace_warnings": [
"Failed to parse project: MyBrokenProject.csproj"
]
}
Diagnostic Fields
- code: Diagnostic code (e.g., MET001, QUAL010)
- severity: error, warning, info, or hint
- message: Human-readable description
- file: Source file path
- line/column: Start position
- end_line/end_column: End position (optional)
Metrics Fields
- total_lines: Total lines including blank/comments
- code_lines: Lines with actual code
- comment_lines: Lines with comments
- blank_lines: Empty lines
- class_count: Number of classes
- interface_count: Number of interfaces
- method_count: Number of methods
- total_complexity: Sum of all method complexities
- max_complexity: Highest method complexity
- max_nesting_depth: Deepest nesting level
Available Rulesets
Built-in Rulesets
naming - Naming convention rules
- Class names: PascalCase
- Method names: PascalCase
- Field names: camelCase with _ prefix
- Constant names: UPPER_CASE or PascalCase
quality - Code quality rules
- Long method detection
- Long parameter list
- God class detection
- Empty catch blocks
- Magic numbers
control_flow - Control flow rules
- High complexity warnings
- Deep nesting warnings
- Unreachable code detection
semantic - Semantic rules
- Type checking
- Null reference analysis
- Resource leak detection
Available Passes
Built-in Passes
indexing (Phase: Index)
- Builds symbol index
- Creates name index
- Generates FQN map
control_flow (Phase: Semantic)
- Analyzes control flow
- Calculates complexity metrics
- Detects control flow smells
dependencies (Phase: Global)
- Builds dependency graph
- Detects circular dependencies
- Calculates coupling metrics
reporting (Phase: Reporting)
- Generates final report
- Aggregates diagnostics
- Summarizes artifacts
Diagnostic Codes
Metrics (MET)
- MET001: High cyclomatic complexity
- MET002: Deep nesting detected
- MET003: Long method
- MET004: Long parameter list
Quality (QUAL)
- QUAL001: Long method
- QUAL002: Long parameter list
- QUAL010: Empty catch block
- QUAL020: Naming convention violation
- QUAL030: Resource not disposed
Control Flow (CF)
- CF001: Unreachable code
- CF002: High complexity
- CF003: Deep nesting
Dependencies (DEP)
- DEP001: Circular dependency
- DEP002: High coupling
- DEP003: Unstable dependency
Performance
Analysis Speed
- Single file (< 1000 lines): < 100ms
- Small project (< 10 files): < 500ms
- Medium project (10-50 files): 500ms-2s
- Large solution (100+ files): 2-10s
Memory Usage
- Scales with codebase size
- Typical: 50-200 MB for medium projects
- Artifacts cached in memory during analysis
Parallel Analysis
With parallel_analysis feature enabled:
cargo build --release --features parallel_analysis
Files analyzed in parallel, significantly faster for large workspaces.
Integration
CI/CD Pipeline
# GitHub Actions
- name: Code Quality Analysis
run: |
bsharp analyze MySolution.sln \
--out analysis.json \
--format json \
--severity MET001=error QUAL001=error
# Check for errors
errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' analysis.json)
if [ "$errors" -gt 0 ]; then
echo "Quality gate failed: $errors errors"
exit 1
fi
Pre-commit Hook
#!/bin/bash
# .git/hooks/pre-commit
changed_files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.cs$')
for file in $changed_files; do
result=$(bsharp analyze "$file" --format json 2>/dev/null)
errors=$(echo "$result" | jq '.diagnostics.items | map(select(.severity == "error")) | length')
if [ "$errors" -gt 0 ]; then
echo "Analysis errors in $file"
exit 1
fi
done
Quality Gate Script
#!/bin/bash
# quality-gate.sh
bsharp analyze MySolution.sln \
--out report.json \
--format json \
--enable-ruleset naming quality control_flow \
--severity MET001=error QUAL001=error
# Extract metrics
errors=$(jq '.diagnostics.items | map(select(.severity == "error")) | length' report.json)
max_complexity=$(jq '.metrics.max_complexity' report.json)
echo "Errors: $errors"
echo "Max Complexity: $max_complexity"
if [ "$errors" -gt 0 ]; then
echo "❌ Quality gate failed: $errors errors found"
exit 1
fi
if [ "$max_complexity" -gt 15 ]; then
echo "❌ Quality gate failed: complexity $max_complexity exceeds threshold 15"
exit 1
fi
echo "✅ Quality gate passed"
Troubleshooting
Analysis Fails
$ bsharp analyze MyProject.csproj
Error: Failed to load workspace
Solutions:
- Check project file is valid XML
- Verify all referenced projects exist
- Use --follow-refs false to skip references
Out of Memory
Error: memory allocation failed
Solutions:
- Analyze smaller subsets with --include/--exclude
- Disable expensive passes with --disable-pass
- Increase system memory
Slow Analysis
Solutions:
- Build with parallel_analysis feature
- Exclude unnecessary files
- Disable unused rulesets/passes
Related Documentation
- CLI Overview - General CLI usage
- Analysis Pipeline - Analysis internals
- Metrics Collection - Metrics details
- Code Quality - Quality rules
- Report Schema - Output JSON layout
- Configuration Overview - Config fields and examples
References
- Implementation: src/bsharp_cli/src/commands/analyze.rs
- Pipeline: src/bsharp_analysis/src/framework/pipeline.rs
- Configuration: src/bsharp_analysis/src/context.rs
Format Command
The format command formats C# code using the built-in formatter and syntax emitters.
Usage
bsharp format <INPUT> [--write <BOOL>] [--print] [--newline-mode lf|crlf] \
[--max-consecutive-blank-lines <N>] [--blank-line-between-members <BOOL>] \
[--trim-trailing-whitespace <BOOL>] [--emit-trace] [--emit-trace-file <FILE>]
Arguments
<INPUT> (required)
- Path to .cs file or directory
- When a directory is given, formats all .cs files recursively
- Hidden directories and bin/, obj/, target/ are skipped
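The directory rules above can be sketched with the standard library. This is an illustration under stated assumptions, not the code in `format.rs`: "hidden" is taken to mean a name starting with `.`, the extension check is case-sensitive, and the names `should_skip_dir`, `is_csharp_file`, and `collect_cs_files` are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns true when a directory name should be skipped during recursion.
/// Assumption: hidden directories are those whose name starts with a dot.
pub fn should_skip_dir(name: &str) -> bool {
    name.starts_with('.') || matches!(name, "bin" | "obj" | "target")
}

/// Returns true for paths whose extension is exactly `cs`.
pub fn is_csharp_file(path: &Path) -> bool {
    path.extension().and_then(|e| e.to_str()) == Some("cs")
}

/// Recursively collects `.cs` files under `dir`, visiting entries in sorted order.
pub fn collect_cs_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    let mut entries = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<Vec<PathBuf>>>()?;
    // Sorting makes the traversal order independent of the file system.
    entries.sort();
    for path in entries {
        if path.is_dir() {
            let skip = path
                .file_name()
                .and_then(|n| n.to_str())
                .map_or(false, should_skip_dir);
            if !skip {
                collect_cs_files(&path, out)?;
            }
        } else if is_csharp_file(&path) {
            out.push(path);
        }
    }
    Ok(())
}
```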
Options
--write, -w <BOOL>
- Write changes to files in-place
- Default: true
- When false and <INPUT> is a single file, the formatted content is printed to stdout
- When false and formatting differences are found for multiple files, exits with code 2
--print
- Always print formatted output for a single-file input and exit
- Useful for piping to other tools; does not write to disk regardless of --write
--newline-mode <MODE>
- Newline mode: lf (default) or crlf
--max-consecutive-blank-lines <N>
- Maximum consecutive blank lines to keep (default: 1)
--blank-line-between-members <BOOL>
- Insert a blank line between type members (default: true)
--trim-trailing-whitespace <BOOL>
- Trim trailing whitespace (default: true)
--emit-trace
- Enable emission tracing (JSONL) for debugging formatter behavior
- Can also be enabled via environment variable BSHARP_EMIT_TRACE=1
--emit-trace-file <FILE>
- Path to write the trace JSONL (defaults to stdout when omitted)
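What the three whitespace-related options mean can be shown with a text-only sketch. It is not the BSharp formatter, which emits code from the AST; it only demonstrates the effect of the newline mode, the blank-line cap, and trailing-whitespace trimming on already formatted text. `WhitespaceOptions` and `normalize_whitespace` are hypothetical names, and `--blank-line-between-members` is left out because it needs syntax information.

```rust
/// Options mirroring three of the CLI flags (hypothetical type for this sketch).
pub struct WhitespaceOptions {
    pub crlf: bool,
    pub max_consecutive_blank_lines: usize,
    pub trim_trailing_whitespace: bool,
}

/// Normalizes newlines, caps runs of blank lines, and optionally trims line ends.
pub fn normalize_whitespace(source: &str, opts: &WhitespaceOptions) -> String {
    let newline = if opts.crlf { "\r\n" } else { "\n" };
    let mut out = String::new();
    let mut blank_run = 0usize;
    // `lines()` accepts both LF and CRLF input and strips the line terminator.
    for raw in source.lines() {
        let line = if opts.trim_trailing_whitespace { raw.trim_end() } else { raw };
        if line.trim().is_empty() {
            blank_run += 1;
            if blank_run > opts.max_consecutive_blank_lines {
                continue; // drop blank lines beyond the configured maximum
            }
        } else {
            blank_run = 0;
        }
        out.push_str(line);
        out.push_str(newline);
    }
    out
}
```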
Examples
# Format a single file in-place
bsharp format Program.cs
# Print formatted output to stdout (do not write)
bsharp format Program.cs --write false
# Force printing formatted output even if --write is not set
bsharp format Program.cs --print
# Format a directory recursively
bsharp format src/
# Use CRLF newlines and avoid extra blank lines
bsharp format Program.cs --newline-mode crlf --max-consecutive-blank-lines 1
# Enable emission tracing to a file
bsharp format Program.cs --emit-trace --emit-trace-file format_trace.jsonl
Implementation
- Command: src/bsharp_cli/src/commands/format.rs
- Formatter: bsharp_syntax::Formatter with FormatOptions
- Emission tracing is controlled by CLI flags or BSHARP_EMIT_TRACE and recorded as JSONL.
- Files that fail to parse are skipped; a summary is printed and they are not modified.
Related Documentation
Parse Errors JSON Output
When bsharp parse is run with --errors-json, parse failures are emitted as a single JSON object to stdout and the process exits with a non-zero code.
Schema
{
"error": {
"kind": "parse_error",
"file": "<path>",
"line": 0,
"column": 0,
"expected": "",
"found": "",
"line_text": "",
"message": "<pretty formatted message>",
"spans": {
"abs": { "start": 0, "end": 1 },
"rel": {
"start": { "line": 0, "column": 0 },
"end": { "line": 0, "column": 1 }
}
}
}
}
- kind – always parse_error for parse failures.
- file – path of the file being parsed.
- line, column – 1-based location of the deepest error span.
- expected, found – reserved fields (currently empty strings).
- line_text – the full source line at the error location.
- message – multi-line pretty message formatted from the parser's error tree.
- spans – present only when --emit-spans is provided; includes absolute byte range and relative line/column positions.
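To relate the absolute byte range in spans to the 1-based line and column fields, a consumer can recompute the position from the source text. The sketch below is illustrative only: `line_column` is a hypothetical helper, and it assumes columns are counted in bytes, which the schema above does not state.

```rust
/// Converts a byte offset into a 1-based (line, column) pair.
/// Assumption: columns count bytes, not characters.
pub fn line_column(source: &str, byte_offset: usize) -> (usize, usize) {
    let bytes = source.as_bytes();
    // Clamp so an out-of-range offset maps to the end of the input.
    let offset = byte_offset.min(bytes.len());
    let before = &bytes[..offset];

    // Line number = newlines seen so far + 1.
    let line = before.iter().filter(|&&b| b == b'\n').count() + 1;
    // Start of the current line = byte after the last newline, or 0.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);

    (line, offset - line_start + 1)
}
```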
Example
bsharp parse Invalid.cs --errors-json | jq
{
"error": {
"kind": "parse_error",
"file": "Invalid.cs",
"line": 7,
"column": 12,
"expected": "",
"found": "",
"line_text": "public clas Program { }",
"message": "0: at 7:12: expected keyword \"class\"\n public clas Program { }\n ^\nContexts:\n - class declaration\n"
}
}
Notes
- In pretty (non-JSON) mode, errors are sent to stderr with optional ANSI colors (disable via --no-color or NO_COLOR=1).
- --errors-json disables pretty errors and always prints the JSON object.
Workspace Loading
The BSharp workspace loader builds a workspace from a solution file (.sln), a project file (.csproj), or a directory that it scans for either.
Overview
Location: src/bsharp_analysis/src/workspace/
The workspace loader:
- Parses Visual Studio solution files (.sln)
- Parses MSBuild project files (.csproj)
- Discovers source files
- Resolves project references
- Handles multiple projects deterministically
Workspace Model
Core Types
pub struct Workspace {
    pub root: PathBuf,
    pub projects: Vec<Project>,
    pub solution: Option<Solution>,
    pub source_map: SourceMap,
}

pub struct Project {
    pub name: String,
    pub path: PathBuf,
    pub target_framework: String,
    pub output_type: String,
    pub files: Vec<ProjectFile>,
    pub references: Vec<ProjectRef>,
    pub package_references: Vec<PackageReference>,
    pub errors: Vec<String>,
}

pub struct Solution {
    pub name: String,
    pub path: PathBuf,
    pub projects: Vec<SolutionProject>,
}
Loading Workspaces
WorkspaceLoader API
pub struct WorkspaceLoader;

impl WorkspaceLoader {
    // Load from any path (auto-detects type)
    pub fn from_path(path: &Path) -> Result<Workspace>;

    // Load with options
    pub fn from_path_with_options(
        path: &Path,
        opts: WorkspaceLoadOptions
    ) -> Result<Workspace>;
}

pub struct WorkspaceLoadOptions {
    pub follow_refs: bool, // Follow ProjectReference transitively
}
Loading from Solution File
use std::path::Path;
use bsharp_analysis::workspace::WorkspaceLoader;

let workspace = WorkspaceLoader::from_path(Path::new("MySolution.sln"))?;
println!("Loaded {} projects", workspace.projects.len());

for project in &workspace.projects {
    println!(" - {}: {} files", project.name, project.files.len());
}
Loading from Project File
let workspace = WorkspaceLoader::from_path(Path::new("MyProject.csproj"))?;

// Automatically follows ProjectReference if follow_refs = true
assert!(workspace.projects.len() >= 1);
Loading from Directory
let workspace = WorkspaceLoader::from_path(Path::new("./src"))?;
// Discovers .sln or .csproj files in directory
Solution File Parsing
Solution Format
Example .sln:
Microsoft Visual Studio Solution File, Format Version 12.00
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "MyApp", "MyApp\MyApp.csproj", "{GUID}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "MyLib", "MyLib\MyLib.csproj", "{GUID}"
EndProject
Parsing Implementation
Location: src/bsharp_analysis/src/workspace/sln/reader.rs
pub struct SolutionReader;

impl SolutionReader {
    pub fn read(path: &Path) -> Result<Solution> {
        let content = fs::read_to_string(path)?;
        Self::parse(&content, path)
    }

    fn parse(content: &str, base_path: &Path) -> Result<Solution> {
        // Parse solution format
        // Extract project entries
        // Resolve project paths
    }
}
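The "extract project entries" step can be sketched with plain string operations on a single Project(...) line. This is an illustrative sketch, not the code in sln/reader.rs: `SlnEntry` and `parse_project_line` are hypothetical names, and the sketch assumes no field contains a comma.

```rust
/// Hypothetical record for one `Project(...)` line of a .sln file.
#[derive(Debug, PartialEq)]
pub struct SlnEntry {
    pub type_guid: String,
    pub name: String,
    pub relative_path: String,
    pub guid: String,
}

/// Parses `Project("{TYPE}") = "Name", "Rel\Path.csproj", "{GUID}"`.
/// Returns None for any other line (for example `EndProject`).
pub fn parse_project_line(line: &str) -> Option<SlnEntry> {
    let rest = line.trim().strip_prefix("Project(\"")?;
    let (type_guid, rest) = rest.split_once("\")")?;
    let (_, rest) = rest.split_once('=')?;

    // Three comma-separated, double-quoted fields follow the `=`.
    let mut fields = rest.split(',').map(|f| f.trim().trim_matches('"'));
    let name = fields.next()?;
    let path = fields.next()?;
    let guid = fields.next()?;

    Some(SlnEntry {
        type_guid: type_guid.to_string(),
        name: name.to_string(),
        // .sln files use backslashes; normalize so the path joins on any host.
        relative_path: path.replace('\\', "/"),
        guid: guid.to_string(),
    })
}
```

Resolving the project path is then a matter of joining `relative_path` onto the directory that contains the solution file.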
Solution Structure
pub struct Solution {
    pub name: String,
    pub path: PathBuf,
    pub projects: Vec<SolutionProject>,
}

pub struct SolutionProject {
    pub name: String,
    pub path: PathBuf,
    pub type_guid: String,
    pub guid: String,
}
Project File Parsing
Project Format
Example .csproj:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
<OutputType>Exe</OutputType>
</PropertyGroup>
<ItemGroup>
<Compile Include="Program.cs" />
<Compile Include="Utils.cs" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\MyLib\MyLib.csproj" />
</ItemGroup>
<ItemGroup>
<PackageReference Include="Newtonsoft.Json" Version="13.0.1" />
</ItemGroup>
</Project>
Parsing Implementation
Location: src/bsharp_analysis/src/workspace/csproj/reader.rs
pub struct CsprojReader;

impl CsprojReader {
    pub fn read(path: &Path) -> Result<Project> {
        let content = fs::read_to_string(path)?;
        Self::parse(&content, path)
    }

    fn parse(content: &str, project_path: &Path) -> Result<Project> {
        // Parse XML
        // Extract properties (TargetFramework, OutputType)
        // Discover source files (Compile items)
        // Extract ProjectReference entries
        // Extract PackageReference entries
    }
}
Source File Discovery
Glob Patterns:
- Default: **/*.cs (all C# files recursively)
- Respects <Compile Include="..." /> items
- Respects <Compile Remove="..." /> exclusions
- Excludes obj/ and bin/ directories
Implementation:
fn discover_source_files(project_dir: &Path) -> Vec<ProjectFile> {
    let pattern = project_dir.join("**/*.cs");
    let mut files = Vec::new();

    // glob::glob returns a Result; an invalid pattern yields no files
    let Ok(entries) = glob::glob(&pattern.to_string_lossy()) else {
        return files;
    };

    // Unreadable entries are skipped rather than panicking
    for path in entries.flatten() {
        // Skip obj/ and bin/
        if path.components().any(|c| c.as_os_str() == "obj" || c.as_os_str() == "bin") {
            continue;
        }

        files.push(ProjectFile {
            path,
            kind: ProjectFileKind::Compile,
        });
    }
    files
}
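One detail worth noting: checking every component of the full path also excludes files when the project itself lives under a directory named bin or obj. A possible refinement, sketched here with a hypothetical helper `is_build_output`, is to test only the components below the project directory.

```rust
use std::path::{Component, Path};

/// Returns true when `file` lies under an `obj` or `bin` directory
/// *inside* `project_dir`. Components above the project are ignored.
pub fn is_build_output(project_dir: &Path, file: &Path) -> bool {
    // Fall back to the full path if `file` is not under `project_dir`.
    let relative = file.strip_prefix(project_dir).unwrap_or(file);
    relative.components().any(|component| {
        matches!(component, Component::Normal(name) if name == "obj" || name == "bin")
    })
}
```

For a project at /work/bin/MyApp, this keeps src/Program.cs while still skipping obj/Debug/Gen.cs.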
Project References
Transitive Resolution
follow_refs Option:
let opts = WorkspaceLoadOptions { follow_refs: true };
let workspace = WorkspaceLoader::from_path_with_options(path, opts)?;
Behavior:
- Follows <ProjectReference> transitively
- Loads all referenced projects
- Avoids duplicates
- Stays within workspace root
- Deterministic ordering (sorted by path)
Example:
MyApp.csproj
→ MyLib.csproj
→ MyCore.csproj
Result (sorted by path): [MyApp, MyCore, MyLib]
Implementation
fn follow_project_references(root: &Path, projects: &mut Vec<Project>) {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();

    // Add initial projects
    for proj in projects.iter() {
        seen.insert(proj.path.clone());
        queue.push_back(proj.path.clone());
    }

    // BFS traversal
    while let Some(proj_path) = queue.pop_front() {
        let proj = match CsprojReader::read(&proj_path) {
            Ok(p) => p,
            Err(_) => continue,
        };

        for ref_path in proj.references.iter().map(|r| &r.path) {
            // Resolve relative to project directory
            let abs_path = proj_path.parent().unwrap().join(ref_path);

            // Skip if outside root
            if !abs_path.starts_with(root) {
                continue;
            }

            // Enqueue only if not already seen
            if seen.insert(abs_path.clone()) {
                queue.push_back(abs_path.clone());

                // Load and add project
                if let Ok(referenced_proj) = CsprojReader::read(&abs_path) {
                    projects.push(referenced_proj);
                }
            }
        }
    }

    // Sort for determinism
    projects.sort_by(|a, b| a.path.cmp(&b.path));
}
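The "stays within workspace root" check depends on how the joined path is spelled. Path::starts_with compares components literally, so a reference such as ..\..\outside\X.csproj still begins with the root's components until the .. segments are resolved. The sketch below shows one way to resolve them without touching the filesystem; `normalize_lexically` is a hypothetical helper, and whether the loader normalizes this way (or canonicalizes on disk) is not stated here.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `.` and `..` components purely lexically (no filesystem access).
pub fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let last_is_normal =
                    matches!(out.components().next_back(), Some(Component::Normal(_)));
                let at_root = matches!(
                    out.components().next_back(),
                    Some(Component::RootDir | Component::Prefix(_))
                );
                if last_is_normal {
                    // `a/b/..` becomes `a`
                    out.pop();
                } else if !at_root {
                    // Keep leading `..` in relative paths; `/..` stays `/`
                    out.push("..");
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

Normalizing before both the root check and the `seen` lookup also prevents the same project from being recorded twice under different spellings.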
Source Map
Purpose
The SourceMap provides fast lookup of source files:
pub struct SourceMap {
    files: HashMap<PathBuf, SourceFileInfo>,
}

impl SourceMap {
    pub fn get(&self, path: &Path) -> Option<&SourceFileInfo>;
    pub fn all_files(&self) -> Vec<&Path>;
}
Usage
let workspace = WorkspaceLoader::from_path(path)?;

// Look up file
if let Some(info) = workspace.source_map.get(Path::new("Program.cs")) {
    println!("Found in project: {}", info.project_name);
}

// Iterate all files
for file_path in workspace.source_map.all_files() {
    println!("File: {}", file_path.display());
}
Error Handling
Resilient Loading
Philosophy: Continue loading even if individual projects fail
// Failed projects recorded as stubs with errors
let workspace = WorkspaceLoader::from_path(sln_path)?;

for project in &workspace.projects {
    if !project.errors.is_empty() {
        eprintln!("Errors in {}: {:?}", project.name, project.errors);
    }
}
Error Types
pub enum WorkspaceError {
    IoError(io::Error),
    ParseError(String),
    InvalidPath(String),
    Unsupported(String),
}
CLI Integration
Analyze Command
# Analyze solution
bsharp analyze MySolution.sln
# Analyze project
bsharp analyze MyProject.csproj
# Follow references (default: true)
bsharp analyze MyProject.csproj --follow-refs true
# Don't follow references
bsharp analyze MyProject.csproj --follow-refs false
Filtering
# Include only specific files
bsharp analyze MySolution.sln --include "**/*Service.cs"
# Exclude test files
bsharp analyze MySolution.sln --exclude "**/Tests/**"
# Multiple patterns
bsharp analyze MySolution.sln \
--include "src/**/*.cs" \
--exclude "**/obj/**" "**/bin/**"
Deterministic Behavior
Guarantees
- Project Order: Always sorted by absolute path
- File Order: Always sorted within each project
- Deduplication: No duplicate projects or files
- Reproducible: Same input always produces same output
Implementation
// Sort projects
projects.sort_by(|a, b| a.path.cmp(&b.path));

// Deduplicate by path
let mut seen = HashSet::new();
projects.retain(|p| seen.insert(p.path.clone()));

// Sort files within each project
for project in &mut projects {
    project.files.sort_by(|a, b| a.path.cmp(&b.path));
}
Performance
Loading Speed
- Small solution (1-5 projects): < 100ms
- Medium solution (5-20 projects): 100-500ms
- Large solution (20-100 projects): 500ms-2s
Memory Usage
- Minimal: Only metadata loaded, not source content
- Typical: 1-5 MB per solution
Optimization
- Parallel project loading (with parallel_analysis feature)
- Lazy source file reading
- Efficient path canonicalization
Examples
Example 1: Load and Analyze
use std::fs;
use std::path::Path;
use bsharp_analysis::workspace::WorkspaceLoader;
use bsharp_parser::facade::Parser;

let workspace = WorkspaceLoader::from_path(Path::new("MySolution.sln"))?;
let parser = Parser::new();

for project in &workspace.projects {
    for file in &project.files {
        let source = fs::read_to_string(&file.path)?;
        match parser.parse(&source) {
            // The compilation unit is not used further in this example
            Ok(_cu) => println!("Parsed: {}", file.path.display()),
            Err(e) => eprintln!("Error in {}: {}", file.path.display(), e),
        }
    }
}
Example 2: Project Statistics
let workspace = WorkspaceLoader::from_path(path)?;

// `solution` is an Option; it may be None, for example when loading a single .csproj
if let Some(solution) = &workspace.solution {
    println!("Solution: {}", solution.name);
}
println!("Projects: {}", workspace.projects.len());

let total_files: usize = workspace.projects.iter()
    .map(|p| p.files.len())
    .sum();
println!("Total files: {}", total_files);

for project in &workspace.projects {
    println!(" {}: {} files", project.name, project.files.len());
}
Example 3: Dependency Graph
let workspace = WorkspaceLoader::from_path(path)?;

println!("Project Dependencies:");
for project in &workspace.projects {
    if !project.references.is_empty() {
        println!("{}:", project.name);
        for ref_ in &project.references {
            println!(" → {}", ref_.name);
        }
    }
}
Testing
Test Fixtures
Location: tests/fixtures/
tests/fixtures/
├── happy_path/
│ ├── test.sln
│ ├── testApplication/
│ │ ├── testApplication.csproj
│ │ └── Program.cs
│ └── testDependency/
│ ├── testDependency.csproj
│ └── Library.cs
└── complex/
└── ...
Test Examples
#[test]
fn test_load_solution() {
    let sln_path = PathBuf::from("tests/fixtures/happy_path/test.sln");
    let workspace = WorkspaceLoader::from_path(&sln_path).unwrap();

    assert_eq!(workspace.projects.len(), 2);
    assert!(workspace.solution.is_some());
}

#[test]
fn test_follow_references() {
    let proj_path = PathBuf::from("tests/fixtures/happy_path/testApplication/testApplication.csproj");
    let workspace = WorkspaceLoader::from_path(&proj_path).unwrap();

    // Should load both testApplication and testDependency
    assert_eq!(workspace.projects.len(), 2);
}
Future Enhancements
Planned Features
- NuGet Package Resolution
  - Resolve package references
  - Download packages if needed
  - Parse package assemblies
- MSBuild Integration
  - Full MSBuild evaluation
  - Property expansion
  - Target execution
- Multi-targeting Support
  - Handle multiple target frameworks
  - Conditional compilation
- Incremental Loading
  - Cache workspace metadata
  - Reload only changed projects
Related Documentation
- CLI Overview - CLI integration
- Analysis Pipeline - Using workspace in analysis
- Architecture - Design decisions
References
- Implementation: src/bsharp_analysis/src/workspace/
- Loader: src/bsharp_analysis/src/workspace/loader.rs
- Solution Reader: src/bsharp_analysis/src/workspace/sln/reader.rs
- Project Reader: src/bsharp_analysis/src/workspace/csproj/reader.rs
- Model: src/bsharp_analysis/src/workspace/model.rs
- Source Map: src/bsharp_analysis/src/workspace/source_map.rs
- Tests: src/bsharp_tests/src/workspace/ and src/bsharp_tests/src/integration/
Configuration Overview
BSharp analysis can be configured via TOML or JSON files and by CLI flags that map to config fields.
Locations
- Project root: .bsharp.toml or .bsharp.json
- Custom path via bsharp analyze <INPUT> --config <FILE>
AnalysisConfig (fields)
Source: src/bsharp_analysis/src/context.rs
pub struct AnalysisConfig {
    // Control flow thresholds
    pub cf_high_complexity_threshold: usize, // default: 10
    pub cf_deep_nesting_threshold: usize,    // default: 4

    // Toggles and severities
    pub enable_rulesets: HashMap<String, bool>,
    pub enable_passes: HashMap<String, bool>,
    pub rule_severities: HashMap<String, DiagnosticSeverity>,

    // Workspace filters
    pub workspace: WorkspaceConfig,

    // Optional churn/PE settings (reserved/future)
    pub churn_enable: bool,
    pub churn_period_days: u32,
    pub churn_include_merges: bool,
    pub churn_max_commits: Option<u32>,
    pub pe_reference_paths: Vec<String>,
    pub pe_references: Vec<String>,
}

pub struct WorkspaceConfig {
    pub follow_refs: bool,
    pub include: Vec<String>,
    pub exclude: Vec<String>,
}
TOML Example
[analysis]
cf_high_complexity_threshold = 10
cf_deep_nesting_threshold = 4
[enable_rulesets]
naming = true
semantic = true
control_flow_smells = true
[enable_passes]
"passes.metrics" = true
"passes.control_flow" = true
"passes.dependencies" = true
[rule_severities]
CF002 = "warning"
CF003 = "warning"
[workspace]
follow_refs = true
include = ["src/**/*.cs"]
exclude = ["**/obj/**", "**/bin/**"]
JSON Example
{
"cf_high_complexity_threshold": 10,
"cf_deep_nesting_threshold": 4,
"enable_rulesets": {
"naming": true,
"semantic": true,
"control_flow_smells": true
},
"enable_passes": {
"passes.metrics": true,
"passes.control_flow": true,
"passes.dependencies": true
},
"rule_severities": {
"CF002": "warning",
"CF003": "warning"
},
"workspace": {
"follow_refs": true,
"include": ["src/**/*.cs"],
"exclude": ["**/obj/**", "**/bin/**"]
}
}
CLI Mapping
- --enable-ruleset <ID> / --disable-ruleset <ID> → enable_rulesets[ID] = true|false
- --enable-pass <ID> / --disable-pass <ID> → enable_passes[ID] = true|false
- --severity CODE=level → rule_severities[CODE] = level (error|warning|info|hint)
- --follow-refs <BOOL> → workspace.follow_refs
- --include <GLOB>... → workspace.include
- --exclude <GLOB>... → workspace.exclude
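As an illustration of the --severity CODE=level mapping, the sketch below splits one argument into a rule code and a severity. It is not the CLI's actual parsing code: `Severity` and `parse_severity_arg` are hypothetical names (the real type is DiagnosticSeverity), and accepting levels case-insensitively is an assumption.

```rust
/// Hypothetical mirror of the four documented severity levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error,
    Warning,
    Info,
    Hint,
}

/// Parses one `CODE=level` argument, e.g. `CF002=warning`.
pub fn parse_severity_arg(arg: &str) -> Result<(String, Severity), String> {
    let (code, level) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected CODE=level, got `{arg}`"))?;
    if code.is_empty() {
        return Err(format!("missing rule code in `{arg}`"));
    }

    // Assumption: levels are matched case-insensitively.
    let severity = match level.to_ascii_lowercase().as_str() {
        "error" => Severity::Error,
        "warning" => Severity::Warning,
        "info" => Severity::Info,
        "hint" => Severity::Hint,
        other => return Err(format!("unknown severity `{other}`")),
    };
    Ok((code.to_string(), severity))
}
```

Each successfully parsed pair would then be inserted into rule_severities, replacing any value loaded from the config file for the same code.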
Tips
- Prefer TOML for readability; JSON is supported for tool integration.
- Thresholds influence CfgSummary counts in the final report.
- Use unique IDs for passes/rulesets consistent with registry (see Passes & Rules).
Contributing to BSharp
Thank you for your interest in contributing to BSharp! This document provides guidelines for contributing to the project.
Development Setup
Prerequisites
- Rust 1.70 or later
- Git
- A text editor or IDE with Rust support
Building the Project
- Clone the repository:
git clone https://github.com/mikserek/bsharp.git
cd bsharp
Parser Testing Best Practices
- Prefer expect_ok(input, parse(input.into())) from syntax::test_helpers when asserting successful parses. It prints readable, rustc-like diagnostics on failure via format_error_tree.
- Keep tests focused and minimal; add a separate negative test when ambiguity is possible (e.g., ternary vs ?. vs ??, range vs dot vs float).
- For lookahead/disambiguation boundaries, add cases to tests/parser/expressions/lookahead_boundaries2_tests.rs.
- For complex constructs (e.g., new with object/collection initializers), add positive and negative cases near tests/parser/expressions/new_expression_tests.rs and target_typed_new_tests.rs.
- Invalid-input diagnostics: place small snapshot-style assertions in tests/parser/expressions/invalid_diagnostics_tests.rs that check for line/column and caret presence. Avoid overfitting on exact wording.
- When adding delimited constructs (parentheses, brackets, braces), guard the closing delimiter with cut(...) once committed to that branch to prevent misleading backtracking.
- Always wrap sub-parsers with bws(...) to ensure whitespace/comments are handled consistently.
Adding New Parser Test Files
- In tests/parser/expressions/, simply add a new *_tests.rs file; it will be discovered by the existing integration test harness.
- For declarations/statements/types, follow the existing directory structure under tests/parser/ and mimic module organization.
- Keep tests deterministic and avoid relying on environment-specific paths or random data.
- Build the project:
cargo build
- Run tests:
cargo test
- Run the CLI tool:
cargo run -- --help
Project Structure
Understanding the codebase organization:
src/
├── parser/ # Parser implementations (expressions, statements, etc.)
├── syntax/ # Parser infrastructure (AST nodes, helpers, errors)
├── analysis/ # Code analysis framework
├── workspace/ # Solution and project file loading
├── cli/ # Command-line interface
└── lib.rs # Library entry point
Code Style
Follow Rust conventions:
- Use cargo fmt to format code
- Use cargo clippy to check for common issues
- Follow naming conventions (snake_case for functions, PascalCase for types)
- Add documentation comments for public APIs
Testing
All contributions should include appropriate tests:
Parser Tests
IMPORTANT: Parser tests must live in an external test crate under src/bsharp_tests/src/, NOT inline #[cfg(test)] modules.
// ✅ CORRECT: External test file
// tests/parser/declarations/class_declaration_tests.rs
use bsharp::syntax::test_helpers::expect_ok;
use bsharp::parser::expressions::declarations::parse_class_declaration;

#[test]
fn test_parse_simple_class() {
    let input = "public class MyClass { }";
    let class = expect_ok(input, parse_class_declaration(input.into()));
    assert_eq!(class.identifier.name, "MyClass");
}
Analysis Tests
// tests/analysis/complexity_tests.rs
use bsharp::syntax::Parser;
use bsharp::analysis::metrics::cyclomatic_complexity;

#[test]
fn test_complexity_analysis() {
    let source = r#"
        public class Test {
            public void Method() {
                if (true) {
                    for (int i = 0; i < 10; i++) {
                        // base 1 + if + for = 3
                    }
                }
            }
        }
    "#;

    let parser = Parser::new();
    let cu = parser.parse(source).unwrap();

    // Find the method and calculate complexity
    // (implementation details depend on analysis API;
    // `complexity` stands for the value computed for Method)
    assert_eq!(complexity, 3);
}
Documentation
- Add rustdoc comments for public functions and types
- Update this documentation when adding new features
- Include examples in documentation
Adding New Language Features
When adding support for new C# language features:
- Define AST Nodes: Add node definitions in src/syntax/nodes/
- Implement Parser: Add parser in appropriate src/parser/ subdirectory
- Add Tests: Include comprehensive tests in tests/parser/ directory
- Update Traversal: Prefer the bsharp_analysis::framework::Query API for AST enumeration; for statement/expression-heavy logic, use shared helpers or a focused walker.
- Document: Add documentation for the new feature
Example process for adding a new expression type:
- Define the AST node:
// src/syntax/nodes/expressions/new_expression.rs
#[derive(Debug, PartialEq, Clone, Serialize, Deserialize)]
pub struct NewExpression {
    pub keyword: String, // "new"
    pub arguments: Vec<Expression>,
}
- Add to Expression enum:
// src/syntax/nodes/expressions/expression.rs
pub enum Expression {
    // ... existing variants
    New(NewExpression),
}
- Implement parser:
// src/parser/expressions/new_expression_parser.rs
pub fn parse_new_expression(input: &str) -> BResult<&str, NewExpression> {
    // Parser implementation
}
- Add tests:
// tests/parser/expressions/new_expression_tests.rs
#[test]
fn test_parse_new_expression() {
    // Test implementation
}
Submitting Changes
Pull Request Process
- Fork the repository
- Create a feature branch: git checkout -b feature/new-feature
- Make your changes
- Run tests: cargo test
- Run formatting: cargo fmt
- Run clippy: cargo clippy
- Commit changes with clear messages
- Push to your fork
- Create a pull request
Commit Messages
Use clear, descriptive commit messages:
feat: add support for C# 11 file-scoped types
- Add parser for file-scoped type declarations
- Update AST to handle new syntax
- Add comprehensive tests
- Update documentation
Fixes #123
Pull Request Requirements
- All tests must pass
- Code must be formatted with cargo fmt
- No clippy warnings
- Include tests for new functionality
- Update documentation if needed
Common Development Tasks
Adding a New Parser
- Define the AST node structure
- Implement the parser function
- Add the parser to the appropriate module
- Write comprehensive tests
- Update integration points
Extending Analysis
- Define analysis traits if needed
- Implement analyzer struct
- Add configuration options
- Write tests with various scenarios
- Update CLI integration
Debugging Parser Issues
Use these tools for debugging:
# Test specific parser with debug output
RUST_LOG=debug cargo test test_name -- --nocapture
# Run parser on test file (prints textual AST tree)
cargo run -- parse debug_cases/test.cs
# Check AST visualization
cargo run -- tree debug_cases/test.cs --output debug.svg
Getting Help
- Check existing issues and documentation
- Ask questions in GitHub issues
- Join community discussions
Code of Conduct
- Be respectful and inclusive
- Focus on constructive feedback
- Help others learn and grow
- Maintain a positive environment
Thank you for contributing to BSharp!
Testing Guide
This document provides comprehensive guidance on testing in the BSharp project, covering test organization, best practices, and debugging strategies.
Test Organization Philosophy
External Test Structure
Critical Principle: Parser tests are external to implementation modules and live in a dedicated test crate.
src/bsharp_tests/
├── Cargo.toml # Test crate manifest
└── src/
├── parser/
│ ├── expressions/
│ │ ├── expression_tests.rs
│ │ ├── lambda_expression_tests.rs
│ │ ├── pattern_matching_tests.rs
│ │ ├── ambiguity_tests.rs
│ │ ├── lookahead_boundaries2_tests.rs
│ │ └── ...
│ ├── statements/
│ │ ├── if_statement_tests.rs
│ │ ├── for_statement_tests.rs
│ │ ├── expression_statement_tests.rs
│ │ └── ...
│ ├── declarations/
│ │ ├── class_declaration_tests.rs
│ │ ├── interface_declaration_parser_tests.rs
│ │ ├── recovery_tests.rs
│ │ └── ...
│ ├── types/
│ │ ├── type_tests.rs
│ │ ├── advanced_type_tests.rs
│ │ └── ...
│ ├── preprocessor/
│ │ └── ...
│ └── keyword_parsers_tests.rs
└── fixtures/
├── happy_path/
└── complex/
Rationale:
- Separation of Concerns: Test code separate from implementation
- Compilation Efficiency: Tests don't bloat production binary
- Organization: Clear structure mirrors parser organization
- Maintainability: Easy to find and update tests
What NOT to Do:
// ❌ NEVER do this in src/parser/ files
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_something() {
        // ...
    }
}
What to Do Instead:
// ✅ Create tests/parser/expressions/my_feature_tests.rs
use bsharp::syntax::test_helpers::expect_ok;
use bsharp::parser::expressions::parse_my_feature;

#[test]
fn test_my_feature() {
    let input = "my feature syntax";
    let result = parse_my_feature(input.into());
    let ast = expect_ok(input, result);
    // assertions...
}
Test Helpers
expect_ok() - Readable Test Failures
Location: src/syntax/test_helpers.rs
Usage:
use bsharp::syntax::test_helpers::expect_ok;

#[test]
fn test_parse_class() {
    let input = "public class MyClass { }";
    let result = parse_class_declaration(input.into());
    let class = expect_ok(input, result);
    assert_eq!(class.identifier.name, "MyClass");
}
Benefits:
- Automatic Error Formatting: Pretty-prints ErrorTree on failure
- Readable Diagnostics: Shows parse failure context with caret
- Panic on Failure: Test fails with clear error message
Error Output Example:
0: at line 1, in keyword "class":
public clas MyClass { }
^--- expected keyword "class"
1: in context "class declaration"
Other Test Helpers
parse_input_unwrap() - Unwrap parse result:
use bsharp_syntax::span::Span;

let (remaining, ast) = parse_input_unwrap(
    parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node))
);
assert_eq!(remaining, ""); // Verify full consumption
assert_parse_error() - Verify parse failures:
use bsharp_syntax::span::Span;

assert_parse_error(
    parse_expression_spanned(Span::new("invalid syntax")).map(|(rest, s)| (rest, s.node))
);
Parser Testing Best Practices
1. Prefer expect_ok() for Successful Parses
#[test]
fn test_if_statement() {
    let input = "if (x > 0) { return x; }";
    let stmt = expect_ok(input, parse_if_statement(input.into()));

    // Now assert on the AST structure
    match stmt {
        Statement::If(if_stmt) => {
            // Verify condition, consequence, etc.
        }
        _ => panic!("Expected IfStatement"),
    }
}
2. Keep Tests Focused and Minimal
Good:
#[test]
fn test_simple_lambda() {
    let input = "x => x * 2";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // Test one thing
}

#[test]
fn test_lambda_with_multiple_params() {
    let input = "(x, y) => x + y";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // Test another thing
}
Bad:
#[test]
fn test_all_lambda_forms() {
    // Testing too many things in one test
    // Hard to debug when it fails
}
3. Add Negative Tests for Ambiguity
When a construct is ambiguous, add tests for both valid and invalid cases:
#[test]
fn test_ternary_vs_nullable() {
    // Valid ternary
    let input = "x ? y : z";
    expect_ok(input, parse_conditional_expression(input.into()));

    // Valid null-conditional (different test)
}

#[test]
fn test_null_conditional_operator() {
    let input = "obj?.Property";
    expect_ok(input, parse_postfix_expression(input.into()));
}
4. Test Lookahead/Disambiguation Boundaries
Location: tests/parser/expressions/lookahead_boundaries2_tests.rs
#[test]
fn test_range_vs_dot_vs_float() {
    // Range operator
    expect_ok("1..10", parse_range_expression("1..10"));

    // Member access
    expect_ok("obj.Method", parse_postfix_expression("obj.Method"));

    // Float literal
    expect_ok("3.14", parse_literal("3.14"));
}
5. Test Complex Constructs
For complex constructs like new expressions with initializers:
Location: tests/parser/expressions/new_expression_tests.rs
#[test]
fn test_new_with_object_initializer() {
    let input = "new Person { Name = \"John\", Age = 30 }";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}

#[test]
fn test_new_with_collection_initializer() {
    let input = "new List<int> { 1, 2, 3 }";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}

#[test]
fn test_target_typed_new() {
    let input = "new(42, \"test\")";
    let expr = expect_ok(input, parse_new_expression(input.into()));
    // Verify structure
}
6. Test Invalid Input Diagnostics
Location: tests/parser/expressions/invalid_diagnostics_tests.rs
#[test]
fn test_unclosed_paren_diagnostic() {
    use bsharp_syntax::span::Span;

    let input = "(x + y";
    let result = parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node));
    assert!(result.is_err());
    // Optionally check error contains expected message
}
Guidelines:
- Keep small snapshot-style assertions
- Check for line/column and caret presence
- Avoid overfitting on exact wording (may change)
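A wording-independent check along these lines might look like the sketch below. `has_location_and_caret` is a hypothetical helper, and it assumes the pretty message contains the location as line:column (as in the 7:12 example message shown for --errors-json) and a line whose first non-space character is a caret.

```rust
/// Returns true if `message` mentions `line:column` and contains a caret line.
/// Deliberately ignores the wording of the diagnostic itself.
pub fn has_location_and_caret(message: &str, line: usize, column: usize) -> bool {
    let location = format!("{line}:{column}");

    // Require that the match is not part of a longer number (7:1 vs 7:12).
    let mentions_location = message.match_indices(location.as_str()).any(|(start, found)| {
        let digit_before = message[..start]
            .chars()
            .next_back()
            .map_or(false, |c| c.is_ascii_digit());
        let digit_after = message[start + found.len()..]
            .chars()
            .next()
            .map_or(false, |c| c.is_ascii_digit());
        !digit_before && !digit_after
    });

    let has_caret_line = message
        .lines()
        .any(|l| l.trim_start().starts_with('^'));

    mentions_location && has_caret_line
}
```

A test can then assert `has_location_and_caret(&rendered, 1, 7)` without pinning the exact "expected ..." text.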
7. Guard Closing Delimiters with cut()
When adding delimited constructs, ensure closing delimiters use cut():
use nom::combinator::cut;
use crate::syntax::parser_helpers::{bdelimited, bchar};

fn parse_parenthesized(input: &str) -> BResult<&str, Expression> {
    bdelimited(
        bchar('('),
        parse_expression,
        cut(bchar(')')) // ✅ Prevents misleading backtracking
    )(input.into())
}
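To see why committing matters, here is a small std-only model of the idea. It does not use nom or BSharp's helpers: `ParseError`, `expect_char`, `cut`, `parenthesized`, `bare_word`, and `primary` are all hypothetical names that mimic the behavior of a recoverable error versus a committed failure.

```rust
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// Recoverable: an enclosing alternative may try the next branch.
    Backtrack(String),
    /// Committed: propagate immediately, no further alternatives.
    Fatal(String),
}

fn expect_char(input: &str, expected: char) -> Result<&str, ParseError> {
    match input.strip_prefix(expected) {
        Some(rest) => Ok(rest),
        None => Err(ParseError::Backtrack(format!("expected '{expected}'"))),
    }
}

/// Upgrades a recoverable error to a committed one.
fn cut<T>(result: Result<T, ParseError>) -> Result<T, ParseError> {
    result.map_err(|e| match e {
        ParseError::Backtrack(msg) => ParseError::Fatal(msg),
        fatal => fatal,
    })
}

/// Parses `( digits )`. Once `(` is consumed, a missing `)` is fatal.
fn parenthesized(input: &str) -> Result<(&str, &str), ParseError> {
    let rest = expect_char(input, '(')?;
    let end = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
    let (digits, rest) = rest.split_at(end);
    let rest = cut(expect_char(rest, ')'))?;
    Ok((rest, digits))
}

fn bare_word(input: &str) -> Result<(&str, &str), ParseError> {
    let end = input.find(|c: char| !c.is_ascii_alphabetic()).unwrap_or(input.len());
    if end == 0 {
        return Err(ParseError::Backtrack("expected word".to_string()));
    }
    let (word, rest) = input.split_at(end);
    Ok((rest, word))
}

/// Tries `parenthesized`, falling back to `bare_word` only on Backtrack.
pub fn primary(input: &str) -> Result<(&str, &str), ParseError> {
    match parenthesized(input) {
        Err(ParseError::Backtrack(_)) => bare_word(input),
        other => other,
    }
}
```

Without the `cut` call, the input `(42` would fall through to `bare_word` and report "expected word" at the opening parenthesis, hiding the real problem: the missing `)`.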
8. Wrap Sub-Parsers with bws()
Ensure whitespace/comments are handled consistently:
use crate::syntax::parser_helpers::bws;

fn parse_if_statement(input: &str) -> BResult<&str, Statement> {
    let (input, _) = bws(keyword("if"))(input.into())?;
    let (input, _) = bws(bchar('('))(input.into())?;
    let (input, condition) = bws(parse_expression)(input.into())?;
    // ...
}
Test Discovery and Execution
Running All Tests
cargo test
Running Specific Test Suites
# All parser tests
cargo test --test parser
# Specific module
cargo test --test parser expression_tests
# Specific test
cargo test --test parser test_lambda_expression
Running with Output
# Show println! output
cargo test -- --nocapture
# Run tests one at a time so their output is not interleaved
cargo test -- --nocapture --test-threads=1
Running with Debug Logging
RUST_LOG=debug cargo test test_name -- --nocapture
Test Fixtures
Fixture Organization
tests/fixtures/
├── happy_path/ # Valid, well-formed C# projects
│ ├── testApplication/
│ │ ├── Program.cs
│ │ ├── testApplication.csproj
│ │ └── ...
│ └── testDependency/
│ └── ...
└── complex/ # Complex, real-world scenarios
├── testApplication/
└── testDependency/
Using Fixtures in Tests
use std::fs;
use std::path::PathBuf;
use bsharp_parser::facade::Parser;

#[test]
fn test_parse_fixture() {
    let fixture_path = PathBuf::from("tests/fixtures/happy_path/testApplication/Program.cs");
    let source = fs::read_to_string(&fixture_path).unwrap();

    let parser = Parser::new();
    let result = parser.parse(&source);

    assert!(result.is_ok());
}
Fixture Guidelines
- Valid Code: Fixtures should be valid C# that compiles
- Realistic: Use real-world patterns, not contrived examples
- Documented: Add README.md explaining fixture purpose
- Minimal: Keep fixtures as small as possible while testing feature
Snapshot Testing
Using insta for Snapshot Tests
Installation: Already included in Cargo.toml dev-dependencies
use insta::assert_json_snapshot;

#[test]
fn test_class_ast_structure() {
    let input = "public class MyClass { public int Field; }";
    let result = parse_class_declaration(input.into());
    let class = expect_ok(input, result);

    // Creates snapshot file on first run
    assert_json_snapshot!(class);
}
Reviewing Snapshots
# Review snapshot changes
cargo insta review
# Accept all changes
cargo insta accept
# Reject all changes
cargo insta reject
Snapshot Guidelines
- Complex Structures: Use for complex AST structures
- Regression Prevention: Catch unintended changes
- Review Carefully: Always review snapshot diffs
- Commit Snapshots: Include snapshot files in git
Debugging Test Failures
Strategy 1: Use expect_ok() Error Output
When a test fails, expect_ok() shows the parse error:
0: at line 1, in keyword "class":
public clas MyClass { }
^--- expected keyword "class"
Strategy 2: Add Debug Logging
use bsharp_syntax::span::Span;

#[test]
fn test_with_logging() {
    env_logger::init(); // Initialize logger
    let input = "complex syntax";
    log::debug!("Parsing: {}", input);
    let result = parse_expression_spanned(Span::new(input)).map(|(rest, s)| (rest, s.node));
    log::debug!("Result: {:?}", result);
    expect_ok(input, result);
}
Run with:
RUST_LOG=debug cargo test test_with_logging -- --nocapture
Strategy 3: Test Smaller Components
If a complex parser fails, test its sub-parsers individually:
#[test]
fn test_method_declaration() {
    // Fails - too complex
    let input = "public async Task<int> Method(int x) { return x; }";
    expect_ok(input, parse_method_declaration(input.into()));
}

// Break it down:
#[test]
fn test_method_modifiers() {
    let input = "public async";
    expect_ok(input, parse_modifiers(input.into()));
}

#[test]
fn test_method_return_type() {
    let input = "Task<int>";
    expect_ok(input, parse_type(input.into()));
}

#[test]
fn test_method_parameters() {
    let input = "(int x)";
    expect_ok(input, parse_parameter_list(input.into()));
}
Strategy 4: Use Parser Debugging Tools
# Parse file and output JSON
cargo run -- parse debug_cases/test.cs --output debug.json
# Generate AST visualization
cargo run -- tree debug_cases/test.cs --output debug.svg
Strategy 5: Check Error Recovery
For declaration error recovery tests:
#[test]
fn test_recovery_from_malformed_member() {
    let input = r#"
        public class MyClass {
            public int ValidField;
            public invalid syntax here; // Malformed
            public int AnotherValidField; // Should recover
        }
    "#;
    let result = parse_class_declaration(input.into());
    // Should parse despite error
    assert!(result.is_ok());
}
Integration Testing
Workspace Loading Tests
use std::path::PathBuf;
use bsharp::workspace::WorkspaceLoader;

#[test]
fn test_load_solution() {
    let sln_path = PathBuf::from("tests/fixtures/happy_path/test.sln");
    let workspace = WorkspaceLoader::from_path(&sln_path).unwrap();
    assert_eq!(workspace.projects.len(), 2);
    assert!(workspace.solution.is_some());
}

#[test]
fn test_load_csproj() {
    let csproj_path = PathBuf::from("tests/fixtures/happy_path/testApplication/testApplication.csproj");
    let workspace = WorkspaceLoader::from_path(&csproj_path).unwrap();
    assert_eq!(workspace.projects.len(), 1);
}
Analysis Pipeline Tests
use bsharp::analysis::framework::pipeline::AnalyzerPipeline;
use bsharp::analysis::framework::session::AnalysisSession;

#[test]
fn test_analysis_pipeline() {
    let source = "public class Test { public void Method() { } }";
    let parser = Parser::new();
    let cu = parser.parse(source).unwrap();
    let mut session = AnalysisSession::new();
    AnalyzerPipeline::run_with_defaults(&cu, &mut session);
    let report = session.into_report();
    assert!(report.diagnostics.is_empty()); // No errors
}
Performance Testing
Benchmarking
use std::fs;
use std::time::Instant;

#[test]
#[ignore] // Run with --ignored flag
fn bench_parse_large_file() {
    let source = fs::read_to_string("tests/fixtures/large_file.cs").unwrap();
    let parser = Parser::new();
    let start = Instant::now();
    let result = parser.parse(&source);
    let duration = start.elapsed();
    assert!(result.is_ok());
    println!("Parse time: {:?}", duration);
    // Assert reasonable performance
    assert!(duration.as_millis() < 1000, "Parse took too long");
}
Running Performance Tests
cargo test bench_ -- --ignored
Continuous Integration
CI Test Strategy
# .github/workflows/test.yml (example)
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - name: Run tests
        run: cargo test --all-features
      - name: Run clippy
        run: cargo clippy -- -D warnings
      - name: Check formatting
        run: cargo fmt -- --check
Test Coverage
Measuring Coverage
# Install tarpaulin
cargo install cargo-tarpaulin
# Run coverage
cargo tarpaulin --out Html --output-dir coverage
Coverage Goals
- Parser Core: 90%+ coverage
- Analysis Framework: 80%+ coverage
- CLI Commands: 70%+ coverage
- Workspace Loading: 80%+ coverage
Common Testing Patterns
Pattern 1: Positive and Negative Tests
#[test]
fn test_valid_syntax() {
    let input = "valid syntax";
    expect_ok(input, parse_feature(input.into()));
}

#[test]
fn test_invalid_syntax() {
    let input = "invalid syntax";
    assert!(parse_feature(input.into()).is_err());
}
Pattern 2: Boundary Testing
#[test]
fn test_empty_input() {
    assert!(parse_feature("".into()).is_err());
}

#[test]
fn test_minimal_input() {
    let input = "x";
    expect_ok(input, parse_feature(input.into()));
}

#[test]
fn test_maximal_input() {
    let input = "very complex nested structure...";
    expect_ok(input, parse_feature(input.into()));
}
Pattern 3: Equivalence Testing
#[test]
fn test_whitespace_insensitive() {
    let compact = "if(x){y;}";
    let spaced = "if (x) { y; }";
    let ast1 = expect_ok(compact, parse_if_statement(compact.into()));
    let ast2 = expect_ok(spaced, parse_if_statement(spaced.into()));
    assert_eq!(ast1, ast2);
}
Test Maintenance
When to Update Tests
- API Changes: Update tests when parser API changes
- Bug Fixes: Add regression tests for fixed bugs
- New Features: Add tests for new language features
- Refactoring: Ensure tests still pass after refactoring
Test Cleanup
- Remove Duplicate Tests: Consolidate similar tests
- Update Outdated Tests: Fix tests using deprecated APIs
- Remove Dead Tests: Delete tests for removed features
- Improve Names: Use descriptive test names
Test Documentation
/// Tests that lambda expressions with multiple parameters are parsed correctly.
///
/// This test verifies:
/// - Parameter list parsing
/// - Arrow token recognition
/// - Expression body parsing
#[test]
fn test_lambda_with_multiple_params() {
    let input = "(x, y) => x + y";
    let expr = expect_ok(input, parse_lambda_expression(input.into()));
    // ...
}
Summary
Testing Checklist
- Tests in tests/ directory, not inline
- Use expect_ok() for readable failures
- Keep tests focused and minimal
- Add negative tests for ambiguity
- Test lookahead/disambiguation boundaries
- Test complex constructs thoroughly
- Use cut() for closing delimiters
- Wrap sub-parsers with bws()
- Add fixtures for integration tests
- Use snapshot tests for complex structures
- Document test purpose and coverage
Resources
- Test Helpers: src/syntax/test_helpers.rs
- Example Tests: tests/parser/expressions/
- Fixtures: tests/fixtures/
- Contributing Guide: docs/development/contributing.md
- Architecture: docs/development/architecture.md
Architecture Decisions
This document explains the key architectural decisions made in the BSharp project, their rationale, and their implications for contributors.
Core Design Philosophy
BSharp is designed as a modular, extensible C# parser and analysis toolkit written in Rust. The architecture prioritizes:
- Correctness - Accurate parsing of C# syntax
- Performance - Efficient parsing and analysis of large codebases
- Maintainability - Clear module boundaries and minimal coupling
- Extensibility - Easy addition of new language features and analyzers
Parser Architecture
Why nom Parser Combinators?
Decision: Use the nom parser combinator library as the foundation for parsing.
Rationale:
- Composability: Small, focused parsers combine to handle complex syntax
- Type Safety: Rust's type system catches parser errors at compile time
- Performance: Zero-copy parsing with minimal allocations
- Testability: Individual parser functions are easily unit tested
- Maintainability: Declarative style is easier to understand than hand-written parsers
Trade-offs:
- Learning curve for contributors unfamiliar with parser combinators
- Error messages require additional work (addressed with nom-supreme)
Implementation:
- Core parsing infrastructure: src/bsharp_parser/src/helpers/
- Parser implementations: src/bsharp_parser/src/
- All parsers return the BResult<I, O> type alias
Error Handling Strategy
Decision: Use nom-supreme's ErrorTree for all parser errors.
Rationale:
- Rich Context: Tree structure preserves full parse failure path
- Better Diagnostics: Context annotations via the .context() method
- Integration: Seamless integration with nom combinators
- Debugging: Pretty-printing via format_error_tree()
Evolution:
- Initially used a custom BSharpParseError type
- Migrated to ErrorTree for better diagnostics
- Custom error type deprecated and removed
Implementation:
pub type BResult<I, O> = IResult<I, O, ErrorTree<I>>;
Helper Functions (in src/bsharp_parser/src/helpers/)
- context() - Adds contextual information
- cut() - Commits to parse branch (prevents misleading backtracking)
- bws() - Whitespace-aware wrapper with error context
- bdelimited() - Delimited parsing with cut on closing delimiter
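To make the role of cut() concrete, here is a minimal, std-only sketch of the idea. It does not use nom or the real BSharp helpers; ParseErr, literal, parenthesized and atom are hypothetical names introduced only for illustration. A recoverable Error lets an alternative branch run, while a committed Failure is reported immediately.

```rust
// Hypothetical model of recoverable vs. committed parse errors (not the nom API).
#[derive(Debug, PartialEq)]
enum ParseErr {
    Error(String),   // recoverable: a caller may try another branch
    Failure(String), // committed: propagate without trying alternatives
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

/// Match a literal prefix and return (remaining input, matched slice).
fn literal<'a>(expected: &'static str, input: &'a str) -> PResult<'a, &'a str> {
    match input.strip_prefix(expected) {
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(ParseErr::Error(format!("expected {:?}", expected))),
    }
}

/// Turn a recoverable error into a committed one (the role cut() plays).
fn cut<'a, T>(result: PResult<'a, T>) -> PResult<'a, T> {
    result.map_err(|e| match e {
        ParseErr::Error(msg) => ParseErr::Failure(msg),
        other => other,
    })
}

/// Parses "(x)". Once "(" has matched, later errors are committed.
fn parenthesized(input: &str) -> PResult<'_, &str> {
    let (rest, _) = literal("(", input)?;
    let (rest, inner) = cut(literal("x", rest))?;
    let (rest, _) = cut(literal(")", rest))?;
    Ok((rest, inner))
}

/// Tries `parenthesized`, falling back to a bare "x" only on recoverable errors.
fn atom(input: &str) -> PResult<'_, &str> {
    match parenthesized(input) {
        Err(ParseErr::Error(_)) => literal("x", input),
        other => other,
    }
}
```

With this shape, the input "(x" reports the missing ")" instead of a confusing error from the fallback branch.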
Module Organization
Decision: Separate the parser crate from the syntax (AST) crate, and keep analysis in its own crate.
Structure:
src/
├── bsharp_parser/ # Parser implementations and public facade
│ ├── src/
│ │ ├── expressions/ # Expression parsers
│ │ ├── keywords/ # Keyword parsing (modularized)
│ │ ├── helpers/ # Parsing utilities (bws, cut, context, directives, ...)
│ │ ├── facade.rs # Public Parser facade
│ │ └── ...
├── bsharp_syntax/ # AST node definitions and shared syntax types
│ └── src/ # (re-exported by bsharp_parser as `syntax`)
├── bsharp_analysis/ # Analysis framework and workspace
│ └── src/
└── bsharp_cli/ # CLI entry and subcommands
Rationale:
- Separation of Concerns: Infrastructure vs implementation
- Reusability: Helpers used across all parsers
- API Clarity: syntax module is the public API
- Testing: Infrastructure can be tested independently
Keyword Modularization
Decision: Organize keywords by category in dedicated modules.
Structure:
src/bsharp_parser/src/keywords/
├── mod.rs # Keyword infrastructure
├── access_keywords.rs # public, private, protected, internal
├── accessor_keywords.rs # get, set, init, add, remove
├── type_keywords.rs # class, struct, interface, enum, record
├── modifier_keywords.rs # static, abstract, virtual, sealed
├── flow_control_keywords.rs # if, else, switch, case, default
├── iteration_keywords.rs # for, foreach, while, do
├── expression_keywords.rs # new, this, base, typeof, sizeof
├── linq_query_keywords.rs # from, where, select, orderby
└── ...
Rationale:
- Maintainability: Easy to find and update keyword parsers
- Consistency: Uniform keyword parsing strategy
- Word Boundaries: All keywords use the keyword() helper for boundary checking
- Prevents Bugs: Avoids partial matches (e.g., the keyword "int" matching the start of the identifier "int32")
Implementation:
- keyword() function enforces [A-Za-z0-9_] word boundaries
- Parsers grouped under src/bsharp_parser/src/keywords/
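The boundary rule can be sketched without nom. The function below is a hypothetical, std-only stand-in for the keyword() helper (the real helper works on spans and returns a BResult); it only shows the check that the character after the keyword is not in [A-Za-z0-9_].

```rust
/// Match `kw` at the start of `input`, but only when it is not immediately
/// followed by another identifier character ([A-Za-z0-9_]).
/// Returns the remaining input on success.
fn keyword<'a>(kw: &str, input: &'a str) -> Option<&'a str> {
    let rest = input.strip_prefix(kw)?;
    match rest.chars().next() {
        // The keyword is only a prefix of a longer identifier: reject.
        Some(c) if c.is_ascii_alphanumeric() || c == '_' => None,
        // End of input or a non-identifier character: accept.
        _ => Some(rest),
    }
}
```

Under this rule, "int x" matches the keyword int, while "int32 x" does not.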
AST Design
Naming Convention
Decision: Use PascalCase names without 'Syntax' suffix for all AST nodes.
Examples:
- ClassDeclaration (not ClassDeclarationSyntax)
- MethodDeclaration (not MethodDeclarationSyntax)
- ExpressionStatement (not ExpressionStatementSyntax)
- IfStatement (not IfStatementSyntax)
Rationale:
- Clarity: Shorter, clearer names
- Roslyn Inspiration: Mirrors Roslyn's structure where appropriate
- Consistency: Uniform naming across entire codebase
- User Preference: Explicit design decision
Implications:
- All AST node types follow this convention
- Test code uses these names
- Documentation uses these names
- Breaking change from earlier versions with 'Syntax' suffix
AST Ownership Model
Decision: Parent nodes own their children; no circular references.
Structure:
pub struct ClassDeclaration {
    pub attributes: Vec<AttributeList>,
    pub modifiers: Vec<Modifier>,
    pub name: Identifier,
    pub type_parameters: Option<Vec<TypeParameter>>,
    pub primary_constructor_parameters: Option<Vec<Parameter>>,
    pub base_types: Vec<Type>,
    pub body_declarations: Vec<ClassBodyDeclaration>, // Owned
    pub documentation: Option<XmlDocumentationComment>,
    pub constraints: Option<Vec<TypeParameterConstraintClause>>,
}
Rationale:
- Rust Ownership: Leverages Rust's ownership system
- Memory Safety: No reference cycles or lifetime complexity
- Simplicity: Clear ownership semantics
- Traversal: Navigation traits provide search without ownership issues
Trade-offs:
- Cannot directly reference parent from child
- Navigation requires traversal from root
- Mitigated by the AstNavigate and FindDeclarations traits
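A small, std-only sketch of what "navigation requires traversal from root" means in practice. MethodDecl, ClassDecl and find_method_path are hypothetical, heavily simplified stand-ins, not the real AST types or navigation traits. Because children hold no parent pointer, the search records the path while descending.

```rust
// Hypothetical, simplified nodes: parents own children, no back-references.
#[derive(Debug)]
struct MethodDecl {
    name: String,
}

#[derive(Debug)]
struct ClassDecl {
    name: String,
    methods: Vec<MethodDecl>, // owned children
    nested: Vec<ClassDecl>,   // owned children
}

/// Depth-first search from `class`; returns the chain of class names
/// (outermost first) that leads to a method called `method`.
fn find_method_path<'a>(class: &'a ClassDecl, method: &str) -> Option<Vec<&'a str>> {
    if class.methods.iter().any(|m| m.name == method) {
        return Some(vec![class.name.as_str()]);
    }
    for child in &class.nested {
        if let Some(mut path) = find_method_path(child, method) {
            path.insert(0, class.name.as_str());
            return Some(path);
        }
    }
    None
}

/// Builds `Outer { Main(); Inner { Run(); } }` for demonstration.
fn sample_tree() -> ClassDecl {
    ClassDecl {
        name: "Outer".to_string(),
        methods: vec![MethodDecl { name: "Main".to_string() }],
        nested: vec![ClassDecl {
            name: "Inner".to_string(),
            methods: vec![MethodDecl { name: "Run".to_string() }],
            nested: Vec::new(),
        }],
    }
}
```

The "parent" of a method is recovered from the recorded path rather than from a pointer stored in the child.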
Zero-Copy Parsing
Decision: Minimize string allocations during parsing where possible.
Implementation:
- String slices reference original input
- Identifiers store String (owned) for convenience
- Literals preserve original format as String
Rationale:
- Performance: Reduces allocation overhead
- Memory Efficiency: Lower memory footprint
- Trade-off: Some allocations necessary for AST lifetime
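The trade-off can be illustrated with a std-only sketch: scanning works on borrowed slices of the input, and an allocation happens only when a node that must outlive the source buffer is built. split_identifier, Identifier and parse_identifier below are hypothetical and much simpler than the real parser.

```rust
/// Split a leading identifier off `input` without allocating.
/// Both halves of the result are slices of the original string.
fn split_identifier(input: &str) -> Option<(&str, &str)> {
    let starts_ok = input
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
    if !starts_ok {
        return None;
    }
    let end = input
        .char_indices()
        .find(|&(_, c)| !(c.is_ascii_alphanumeric() || c == '_'))
        .map(|(i, _)| i)
        .unwrap_or(input.len());
    Some(input.split_at(end))
}

/// Hypothetical AST node: owns its name so it does not borrow the source.
#[derive(Debug, PartialEq)]
struct Identifier {
    name: String,
}

/// The only allocation happens here, when the node is constructed.
fn parse_identifier(input: &str) -> Option<(Identifier, &str)> {
    let (name, rest) = split_identifier(input)?;
    Some((Identifier { name: name.to_owned() }, rest))
}
```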
Spans and Location Tracking
Decision: Track source locations via spans for precise diagnostics and tooling.
Implementation:
- Span type based on nom_locate::LocatedSpan lives in src/bsharp_parser/src/syntax/span.rs and is re-exported through the public parser API.
- The parser facade supports parse_with_spans(), which returns both the AST and a span table for mapping nodes back to source locations.
- Error reporting uses spans to include line/column, highlighting ranges via format_error_tree().
Rationale:
- Diagnostics: Accurate error locations and ranges.
- Tooling: Enables IDE features, navigation, and source mapping.
- Testing: Stable, comparable locations for snapshot tests.
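As a rough illustration of how a span maps back to a line and column, the std-only sketch below uses a hypothetical ByteSpan (a byte range) instead of the real LocatedSpan-based type. It assumes offsets fall on character boundaries.

```rust
/// Hypothetical span: a half-open byte range into the source text.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ByteSpan {
    start: usize,
    end: usize,
}

/// 1-based (line, column) for a byte offset; the column counts characters.
/// Panics if `offset` is not on a character boundary.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    (line, column)
}

/// Span of the first occurrence of `needle`, if any.
fn span_of(source: &str, needle: &str) -> Option<ByteSpan> {
    source.find(needle).map(|start| ByteSpan {
        start,
        end: start + needle.len(),
    })
}
```

A diagnostic can then slice the source with the span to highlight the range and call line_col on the start offset for the position.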
See also: docs/syntax/spans.md.
Analysis Framework
Framework-Driven Architecture
Decision: Implement a pipeline-based analysis framework with passes, rules, and visitors.
Structure:
src/analysis/
├── framework/ # Core analysis infrastructure
│ ├── pipeline.rs # Analysis pipeline orchestration
│ ├── passes.rs # Analysis pass trait and phases
│ ├── rules.rs # Rule trait and rulesets
│ ├── walker.rs # AST walker and visitor pattern
│ ├── registry.rs # Analyzer registry
│ └── session.rs # Analysis session and state
├── passes/ # Concrete analysis passes
├── rules/ # Concrete analysis rules
├── artifacts/ # Analysis artifacts (symbols, metrics, CFG)
└── ...
Rationale:
- Extensibility: Easy to add new analyzers
- Composability: Passes and rules compose via registry
- Performance: Single-pass traversal for local rules
- Configurability: Enable/disable passes and rules via config
Phases:
- Index - Symbol indexing and scope building
- Local - Single-pass local rules and metrics collection
- Global - Cross-file analysis (dependencies, etc.)
- Semantic - Type checking and semantic rules
- Reporting - Report generation and formatting
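One way to picture the phase ordering is an enum whose declaration order is the execution order. The sketch below is std-only and hypothetical (Phase, PassInfo and schedule are illustration names, not the framework's types); it sorts registered passes by phase while keeping registration order within a phase.

```rust
/// Phases in execution order; deriving `Ord` uses declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Phase {
    Index,
    Local,
    Global,
    Semantic,
    Reporting,
}

/// Hypothetical pass descriptor: a name plus the phase it runs in.
#[derive(Debug, Clone, PartialEq)]
struct PassInfo {
    name: &'static str,
    phase: Phase,
}

/// Order passes by phase. `sort_by_key` is stable, so passes in the same
/// phase keep the order in which they were registered.
fn schedule(mut passes: Vec<PassInfo>) -> Vec<&'static str> {
    passes.sort_by_key(|p| p.phase);
    passes.into_iter().map(|p| p.name).collect()
}
```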
Visitor Pattern
Decision: Use visitor pattern for AST traversal.
Implementation:
pub trait Visit {
    fn enter(&mut self, node: &NodeRef, session: &mut AnalysisSession);
    fn exit(&mut self, node: &NodeRef, session: &mut AnalysisSession) {}
}

pub struct AstWalker {
    visitors: Vec<Box<dyn Visit>>,
}
Rationale:
- Separation of Concerns: Traversal logic separate from analysis logic
- Composability: Multiple visitors in single traversal
- Performance: Single pass for multiple analyses
- Extensibility: Easy to add new visitors
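The "multiple visitors in a single traversal" point can be sketched with std only. The Node enum, Walker, MethodCounter and NestingDepth below are hypothetical simplifications (no NodeRef, no session); they only show that enter/exit hooks let several analyses share one walk.

```rust
// Hypothetical, simplified tree; the real walker operates on NodeRef.
enum Node {
    Class { members: Vec<Node> },
    Method { body: Vec<Node> },
    If { then_branch: Vec<Node> },
    Statement,
}

trait Visit {
    fn enter(&mut self, node: &Node);
    fn exit(&mut self, _node: &Node) {}
}

struct Walker<'v> {
    visitors: Vec<&'v mut dyn Visit>,
}

impl<'v> Walker<'v> {
    /// Depth-first walk; every visitor sees every node exactly once.
    fn walk(&mut self, node: &Node) {
        for v in self.visitors.iter_mut() {
            v.enter(node);
        }
        let children: &[Node] = match node {
            Node::Class { members } => members.as_slice(),
            Node::Method { body } => body.as_slice(),
            Node::If { then_branch } => then_branch.as_slice(),
            Node::Statement => &[],
        };
        for child in children {
            self.walk(child);
        }
        for v in self.visitors.iter_mut() {
            v.exit(node);
        }
    }
}

#[derive(Default)]
struct MethodCounter {
    count: usize,
}

impl Visit for MethodCounter {
    fn enter(&mut self, node: &Node) {
        if let Node::Method { .. } = node {
            self.count += 1;
        }
    }
}

#[derive(Default)]
struct NestingDepth {
    current: usize,
    max: usize,
}

impl Visit for NestingDepth {
    fn enter(&mut self, node: &Node) {
        if let Node::If { .. } = node {
            self.current += 1;
            self.max = self.max.max(self.current);
        }
    }
    fn exit(&mut self, node: &Node) {
        if let Node::If { .. } = node {
            self.current -= 1;
        }
    }
}

/// Runs both visitors in one traversal; returns (method count, max `if` depth).
fn analyze(root: &Node) -> (usize, usize) {
    let mut methods = MethodCounter::default();
    let mut depth = NestingDepth::default();
    let mut walker = Walker { visitors: Vec::new() };
    walker.visitors.push(&mut methods);
    walker.visitors.push(&mut depth);
    walker.walk(root);
    (methods.count, depth.max)
}
```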
Query API
Decision: Use a typed Query API over a minimal NodeRef to traverse the AST. This is the current traversal API; the term “legacy” only refers to older navigation traits that the Query API replaced.
Implementation:
- NodeRef enumerates coarse node categories (compilation unit, namespaces, declarations, methods, statements, expressions), and now includes top-level items like file-scoped namespaces, using directives, global using directives, and global attributes.
- Children provides child enumeration for NodeRef.
- Extract<T> enables Query::of::<T>() to yield typed nodes without extending NodeRef for every concrete type.
- Macro helpers impl_extract_expr! and impl_extract_stmt! simplify adding Extract impls for expression/statement variants.
- Location: src/bsharp_syntax/src/query/ (re-exported as bsharp_analysis::framework::Query)
Rationale:
- Composability: Typed filters via Query::filter_typed.
- Maintainability: Avoids wide trait surfaces and duplicated traversal.
- Performance: Focused walkers remain available for hot paths.
- Determinism: Traversal order and artifact hashing remain stable.
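The interplay of a coarse NodeRef, child enumeration, and a typed Extract step can be approximated in a few lines of std-only Rust. Everything below is a hypothetical miniature (two node kinds, a Vec-returning of), not the real bsharp_syntax API; it is meant only to show why a typed query does not need one NodeRef variant per concrete type to be useful.

```rust
// Hypothetical miniature of the NodeRef / Extract / Query idea.
struct Method {
    name: String,
}

struct Class {
    name: String,
    methods: Vec<Method>,
    nested: Vec<Class>,
}

/// Coarse, copyable reference to any node kind.
#[derive(Clone, Copy)]
enum NodeRef<'a> {
    Class(&'a Class),
    Method(&'a Method),
}

impl<'a> NodeRef<'a> {
    /// Child enumeration in document order.
    fn children(self) -> Vec<NodeRef<'a>> {
        match self {
            NodeRef::Class(c) => c
                .methods
                .iter()
                .map(NodeRef::Method)
                .chain(c.nested.iter().map(NodeRef::Class))
                .collect(),
            NodeRef::Method(_) => Vec::new(),
        }
    }
}

/// Typed view: try to see a `NodeRef` as a concrete node type.
trait Extract<'a>: Sized + 'a {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self>;
}

impl<'a> Extract<'a> for Class {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node {
            NodeRef::Class(c) => Some(c),
            _ => None,
        }
    }
}

impl<'a> Extract<'a> for Method {
    fn extract(node: NodeRef<'a>) -> Option<&'a Self> {
        match node {
            NodeRef::Method(m) => Some(m),
            _ => None,
        }
    }
}

struct Query<'a> {
    root: NodeRef<'a>,
}

impl<'a> Query<'a> {
    fn from(root: NodeRef<'a>) -> Self {
        Query { root }
    }

    /// All nodes of type `T` under the root, in pre-order.
    fn of<T: Extract<'a>>(&self) -> Vec<&'a T> {
        let mut found = Vec::new();
        let mut stack = vec![self.root];
        while let Some(node) = stack.pop() {
            if let Some(typed) = T::extract(node) {
                found.push(typed);
            }
            // Push children reversed so they pop in document order.
            let mut kids = node.children();
            kids.reverse();
            stack.extend(kids);
        }
        found
    }
}
```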
See also:
- docs/parser/navigation.md (Query API overview)
- docs/analysis/traversal-guide.md (using Query in passes)
- docs/development/query-cookbook.md (recipes)
Formatting and Emitters
Decision: Implement formatting via an Emit trait with per-node emitters in bsharp_syntax.
Implementation:
- Emit trait and emitters live under src/bsharp_syntax/src/emitters/ (e.g., emitters/declarations/*, emitters/expressions/*, emitters/statements/*).
- Formatting is separated from parsing; emitters reconstruct code from AST with consistent whitespace and trivia handling.
- Trivia and XML doc emitters are under emitters/trivia/.
Rationale:
- Separation of Concerns: Parsing and formatting evolve independently.
- Consistency: Centralized formatting rules for all nodes.
- Extensibility: Adding a new node implies an Emit impl in a known location.
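A per-node emitter can be pictured as a trait that writes formatted text into a buffer. The sketch below is std-only and hypothetical: the trait signature, Field, Class and format_class are assumptions made for illustration and are not the real Emit trait in bsharp_syntax.

```rust
use std::fmt::{self, Write};

/// Hypothetical, simplified emitter trait: write this node as C# source.
trait Emit {
    fn emit(&self, out: &mut String, indent: usize) -> fmt::Result;
}

struct Field {
    ty: String,
    name: String,
}

struct Class {
    name: String,
    fields: Vec<Field>,
}

impl Emit for Field {
    fn emit(&self, out: &mut String, indent: usize) -> fmt::Result {
        writeln!(out, "{}public {} {};", "    ".repeat(indent), self.ty, self.name)
    }
}

impl Emit for Class {
    fn emit(&self, out: &mut String, indent: usize) -> fmt::Result {
        let pad = "    ".repeat(indent);
        writeln!(out, "{}public class {}", pad, self.name)?;
        writeln!(out, "{}{{", pad)?;
        // Children are emitted one level deeper, so indentation rules
        // live in one place instead of in each caller.
        for field in &self.fields {
            field.emit(out, indent + 1)?;
        }
        writeln!(out, "{}}}", pad)
    }
}

/// Convenience wrapper returning the formatted source.
fn format_class(class: &Class) -> String {
    let mut out = String::new();
    class.emit(&mut out, 0).expect("writing to a String cannot fail");
    out
}
```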
See also: docs/syntax/formatter.md.
Workspace Loading
Multi-Format Support
Decision: Support loading from .sln, .csproj, or directory.
Implementation:
pub struct WorkspaceLoader;

// Signatures only; bodies omitted.
impl WorkspaceLoader {
    pub fn from_path(path: &Path) -> Result<Workspace>;
    pub fn from_path_with_options(path: &Path, opts: WorkspaceLoadOptions) -> Result<Workspace>;
}
Rationale:
- Flexibility: Support different entry points
- IDE Integration: Match IDE project loading behavior
- Incremental Analysis: Load only what's needed
Features:
- Solution file (.sln) parsing
- Project file (.csproj) parsing with XML
- Transitive ProjectReference following
- Source file discovery with glob patterns
- Deterministic project ordering
Error Resilience
Decision: Continue loading workspace even if individual projects fail.
Implementation:
- Failed projects recorded as stubs with error messages
- Workspace loading succeeds with partial results
- Errors accessible via the Project::errors field
Rationale:
- Robustness: Don't fail entire workspace for one bad project
- User Experience: Show what can be analyzed
- Debugging: Error messages preserved for investigation
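The stub-on-failure behaviour can be sketched with std only. Project, load_project and load_workspace below are hypothetical stand-ins (the fake loader simply rejects anything that is not a .csproj path); the point is that a failed project becomes a record with errors rather than aborting the whole load.

```rust
/// Hypothetical project record. A failed load keeps its path and errors.
#[derive(Debug)]
struct Project {
    path: String,
    sources: Vec<String>,
    errors: Vec<String>,
}

/// Stand-in for real project parsing: only ".csproj" paths succeed.
fn load_project(path: &str) -> Result<Vec<String>, String> {
    if path.ends_with(".csproj") {
        Ok(vec![format!("{}/Program.cs", path.trim_end_matches(".csproj"))])
    } else {
        Err(format!("unsupported project file: {}", path))
    }
}

/// Load every project; failures become stubs instead of aborting the load.
fn load_workspace(paths: &[&str]) -> Vec<Project> {
    paths
        .iter()
        .map(|&path| match load_project(path) {
            Ok(sources) => Project {
                path: path.to_string(),
                sources,
                errors: Vec::new(),
            },
            Err(message) => Project {
                path: path.to_string(),
                sources: Vec::new(),
                errors: vec![message],
            },
        })
        .collect()
}
```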
Testing Strategy
External Test Organization
Decision: Externalize tests; in the current workspace they live under src/bsharp_tests/ rather than inline #[cfg(test)] modules.
Structure:
src/bsharp_tests/src/
├── parser/
│ ├── expressions/
│ ├── statements/
│ ├── declarations/
│ └── types/
├── cli/
└── integration/
Rationale:
- Separation: Test code separate from implementation
- Organization: Clear structure mirrors crates
- Compilation: Tests don't bloat production binaries
Note: A future migration to top-level tests/ may be considered.
Test Helpers
Decision: Provide expect_ok() helper for readable test failures.
Implementation:
pub fn expect_ok<T>(input: &str, result: BResult<&str, T>) -> T {
    match result {
        Ok((_, value)) => value,
        Err(e) => {
            eprintln!("{}", format_error_tree(&input, &e));
            panic!("Parse failed");
        }
    }
}
Rationale:
- Diagnostics: Pretty-printed errors on failure
- Debugging: Shows parse failure context
- Consistency: Uniform test error reporting
Snapshot Testing
Decision: Use insta crate for snapshot testing.
Implementation:
- Cargo.toml includes insta in dev-dependencies
- Snapshot tests for complex AST structures
- JSON serialization for comparison
Rationale:
- Regression Prevention: Catch unintended AST changes
- Review: Visual diff of AST changes
- Maintenance: Update snapshots when intentional
Performance Considerations
Parallel Analysis
Decision: Optional parallel analysis via the rayon-backed parallel_analysis feature.
Implementation:
[features]
parallel_analysis = ["rayon"]
Rationale:
- Scalability: Faster analysis for large workspaces
- Optional: Not required for single-file use cases
- Trade-off: Adds dependency and complexity
Incremental Parsing
Decision: Not implemented yet; designed for future addition.
Future Design:
- Cache parsed ASTs by file hash
- Reparse only changed files
- Incremental analysis based on change scope
Rationale:
- Performance: Critical for IDE integration
- Complexity: Requires careful cache invalidation
- Priority: Deferred until core features stable
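Since this feature is not implemented, the following is only a sketch of the "cache parsed ASTs by file hash" idea, using std alone. ParsedFile, ParseCache and get_or_parse are hypothetical names, and the "parse" step is a placeholder that counts lines. DefaultHasher is fine for an in-memory cache, but its output is not guaranteed stable across Rust releases, so a persistent cache would need a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a parsed AST.
#[derive(Debug, Clone, PartialEq)]
struct ParsedFile {
    line_count: usize,
}

/// Maps a file path to (content hash, cached parse result).
#[derive(Default)]
struct ParseCache {
    entries: HashMap<String, (u64, ParsedFile)>,
    parses: usize, // how many real parses were performed
}

fn content_hash(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    hasher.finish()
}

impl ParseCache {
    /// Return the cached result when the content hash matches,
    /// otherwise "parse" again and refresh the entry.
    fn get_or_parse(&mut self, path: &str, source: &str) -> ParsedFile {
        let hash = content_hash(source);
        if let Some((cached_hash, ast)) = self.entries.get(path) {
            if *cached_hash == hash {
                return ast.clone();
            }
        }
        self.parses += 1;
        let ast = ParsedFile { line_count: source.lines().count() };
        self.entries.insert(path.to_string(), (hash, ast.clone()));
        ast
    }
}
```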
CLI Design
Subcommand Structure
Decision: Use clap with subcommands for different operations.
Commands:
- parse - Parse C# file to JSON
- tree - Generate AST visualization (Mermaid/DOT)
- analyze - Run analysis and generate report
Rationale:
- Clarity: Each command has clear purpose
- Extensibility: Easy to add new commands
- Discoverability: --help shows all options
- Consistency: Follows common CLI patterns
Output Formats
Decision: Support multiple output formats (JSON, pretty-JSON, SVG).
Implementation:
- JSON for machine consumption
- Pretty-JSON for human readability
- SVG for visualization
Rationale:
- Integration: JSON for tool integration
- Debugging: Pretty-JSON for manual inspection
- Visualization: SVG for understanding AST structure
Future Extensibility
Planned Enhancements
- Incremental Parsing
  - Cache parsed ASTs
  - Reparse only changed regions
  - Critical for IDE integration
- Language Server Protocol (LSP)
  - IDE integration
  - Real-time diagnostics
  - Code completion
- More Analysis Passes
  - Nullability analysis
  - Lifetime analysis
  - Security analysis
- Code Transformation
  - AST modification API
  - Code generation from AST
  - Refactoring support
Design for Extension
Principles:
- Trait-Based: Use traits for extensibility points
- Registry Pattern: Dynamic registration of analyzers
- Configuration: Enable/disable features via config
- Versioning: Stable API with clear versioning
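The trait-plus-registry combination can be sketched as follows. This is a std-only illustration with hypothetical names (Analyzer, LineCount, ClassCount, Registry); the real registry and configuration types are not shown here.

```rust
use std::collections::BTreeMap;

/// Hypothetical extension point: an analyzer reports one number per source.
trait Analyzer {
    fn name(&self) -> &'static str;
    fn run(&self, source: &str) -> usize;
}

struct LineCount;

impl Analyzer for LineCount {
    fn name(&self) -> &'static str {
        "line_count"
    }
    fn run(&self, source: &str) -> usize {
        source.lines().count()
    }
}

struct ClassCount;

impl Analyzer for ClassCount {
    fn name(&self) -> &'static str {
        "class_count"
    }
    fn run(&self, source: &str) -> usize {
        // Naive textual count; a real analyzer would inspect the AST.
        source.matches("class ").count()
    }
}

/// Analyzers are registered dynamically and can be disabled by name.
#[derive(Default)]
struct Registry {
    analyzers: Vec<Box<dyn Analyzer>>,
    disabled: Vec<String>,
}

impl Registry {
    fn register(&mut self, analyzer: Box<dyn Analyzer>) {
        self.analyzers.push(analyzer);
    }

    fn disable(&mut self, name: &str) {
        self.disabled.push(name.to_string());
    }

    /// Run every enabled analyzer; a BTreeMap keeps the output order stable.
    fn run_all(&self, source: &str) -> BTreeMap<&'static str, usize> {
        self.analyzers
            .iter()
            .filter(|a| !self.disabled.iter().any(|d| d.as_str() == a.name()))
            .map(|a| (a.name(), a.run(source)))
            .collect()
    }
}
```

New analyzers plug in by implementing the trait and calling register; nothing in the registry changes.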
Lessons Learned
What Worked Well
- Parser Combinators: Excellent for composability and testing
- Module Organization: Clear boundaries reduce coupling
- Error Context: ErrorTree provides excellent diagnostics
- External Tests: Clean separation improves maintainability
What We'd Do Differently
- Earlier Keyword Modularization: Should have organized keywords from start
- Error Type Migration: Earlier adoption of ErrorTree would have saved refactoring
- Documentation: More inline documentation from the beginning
Recent Refactoring
Major refactoring improvements completed:
- Expression precedence chain builder implemented
- Statement group deduplication completed
- Consistent error recovery with skip_to_member_boundary_top_level()
- Whitespace handling standardization via the bws() combinator
- Keyword modularization by category
Contributing Guidelines
When adding new features, follow these architectural principles:
- Use Existing Patterns: Follow established parser patterns
- Add Tests: External tests in tests/ directory
- Document Decisions: Update this file for significant changes
- Error Context: Add .context() calls for debugging
- Naming Convention: PascalCase without 'Syntax' suffix
- Keyword Boundaries: Use keyword() helper for all keywords
See docs/development/contributing.md for detailed contribution guidelines.
Cookbooks
Short, task-focused examples and patterns.
Available Cookbooks
- Query Cookbook
  - Practical Query API patterns for traversing the AST.
- Parser Cookbook
  - Nom recipes: identifiers, lists, delimited blocks with cut, tokens with complete, and all_consuming file parsers.
When to use
- You know the outcome you want and need a concise example.
- You want to copy/paste a small starting point and adapt.
For deeper explanations, see:
- docs/development/writing-parsers.md
- docs/analysis/traversal-guide.md
Query Cookbook
Practical examples for using the Query API to traverse the AST.
Imports
// Option A (canonical): import directly from bsharp_syntax
use bsharp_syntax::node::ast_node::NodeRef;
use bsharp_syntax::query::Query;
use bsharp_syntax::{CompilationUnit, ClassDeclaration, MethodDeclaration};

// Option B (ergonomic in analysis code): re-exports via bsharp_analysis
// use bsharp_analysis::framework::{NodeRef, Query};
All classes in a file
fn all_classes(cu: &CompilationUnit) -> Vec<&ClassDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<ClassDeclaration>()
        .collect()
}
All methods in a class
fn all_methods_in_class(c: &ClassDeclaration) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::from(c))
        .of::<MethodDeclaration>()
        .collect()
}
Public methods only
use bsharp_syntax::modifiers::Modifier;

fn public_methods(cu: &CompilationUnit) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<MethodDeclaration>(|m| m.modifiers.iter().any(|mm| *mm == Modifier::Public))
        .collect()
}
Count await expressions
use bsharp_syntax::expressions::AwaitExpression;

fn await_count(cu: &CompilationUnit) -> usize {
    Query::from(NodeRef::CompilationUnit(cu))
        .of::<AwaitExpression>()
        .count()
}
Find invocations of a method name
use bsharp_syntax::expressions::{InvocationExpression, Expression};

fn invocations_of(cu: &CompilationUnit, name: &str) -> Vec<&InvocationExpression> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<InvocationExpression>(|inv| {
            // Match simple Variable(...) calls; extend for MemberAccess as needed
            match &*inv.expression {
                Expression::Variable(id) => id.name == name,
                _ => false,
            }
        })
        .collect()
}
Methods with deep nesting
use bsharp_syntax::statements::statement::Statement;

fn deeply_nested_methods(cu: &CompilationUnit, threshold: usize) -> Vec<&MethodDeclaration> {
    Query::from(NodeRef::CompilationUnit(cu))
        .filter_typed::<MethodDeclaration>(|m| {
            if let Some(body) = &m.body {
                max_nesting(body, 0) > threshold
            } else {
                false
            }
        })
        .collect()
}

fn max_nesting(s: &Statement, cur: usize) -> usize {
    match s {
        Statement::If(i) => {
            let then_d = max_nesting(&i.consequence, cur + 1);
            let else_d = i.alternative.as_ref().map(|a| max_nesting(a, cur + 1)).unwrap_or(cur);
            then_d.max(else_d)
        }
        Statement::Block(stmts) => stmts.iter().map(|st| max_nesting(st, cur)).max().unwrap_or(cur),
        Statement::For(f) => max_nesting(&f.body, cur + 1),
        Statement::ForEach(f) => max_nesting(&f.body, cur + 1),
        Statement::While(w) => max_nesting(&w.body, cur + 1),
        Statement::DoWhile(d) => max_nesting(&d.body, cur + 1),
        _ => cur,
    }
}
Tips
- Chain filters sparingly: Prefer a single filter_typed with a clear predicate.
- Use NodeRef::from(x): Start from any AST node to scope queries.
- Profile: For hot paths, consider a custom walker when you need full control.
Parser Cookbook
Practical recipes for nom-based parsers in bsharp_parser.
Spanned-first policy
- All public parser entrypoints return Spanned<T> so callers have precise source ranges for AST nodes.
- Internals should prefer spanned parsers as well to preserve spans through transformations.
- When you only need the inner value, map via .node.
Examples:
// Prefer the spanned variant and map to inner node when spans are not needed
let (rest, expr) = nom::sequence::delimited(ws, parse_expression_spanned, ws)
    .map(|s| s.node)
    .parse(input)?;

// Lists of expressions: collect inner nodes
let (rest, args) = parse_delimited_list0(
    |i| delimited(ws, tok_l_paren(), ws).parse(i),
    |i| delimited(ws, parse_expression_spanned, ws).map(|s| s.node).parse(i),
    |i| delimited(ws, tok_comma(), ws).parse(i),
    |i| delimited(ws, tok_r_paren(), ws).parse(i),
    false,
    true,
).parse(input)?;
Parsable trait
- For one-shot parsing of a type to Spanned<Self>, implement or use the crate’s Parsable abstraction (where available) instead of bespoke entrypoints.
- This keeps a consistent contract across the parser and simplifies tests and tools that need spans.
Conventions
- Use Span<'a> and BResult<'a, T> from bsharp_parser::syntax modules.
- Prefer small, composable parsers and add context() labels.
- Use cut() to avoid misleading backtracking after committing to a branch.
use bsharp_parser::syntax::span::Span;
use bsharp_parser::syntax::errors::BResult;
use nom::{
    IResult,
    branch::alt,
    bytes::complete::tag,
    character::complete as cc,
    combinator::{all_consuming, complete, map},
    sequence::{delimited, preceded, terminated, tuple},
};
use nom_supreme::ParserExt; // for .context(), .cut()
Identifier
fn identifier(input: Span) -> BResult<String> {
    // very simplified: one or more letters followed by letters/digits
    // (underscores are not handled in this sketch)
    map(
        tuple((cc::alpha1, cc::alphanumeric0)),
        |(h, t): (&str, &str)| format!("{}{}", h, t)
    ).context("identifier").parse(input)
}
Comma-Separated List
use nom::multi::separated_list0;

fn comma_sep<T, F>(item: F) -> impl FnMut(Span) -> BResult<Vec<T>>
where
    F: Fn(Span) -> BResult<T>,
{
    separated_list0(cc::multispace0.and(tag(",")).and(cc::multispace0), item)
}
Delimited Braces Block
use nom::combinator::cut;

fn lbrace(i: Span) -> BResult<()> {
    map(tag("{"), |_| ()).context("'{'").parse(i)
}

fn rbrace(i: Span) -> BResult<()> {
    map(tag("}"), |_| ()).context("'}'").parse(i)
}

fn block<T, F>(mut inner: F) -> impl FnMut(Span) -> BResult<Vec<T>>
where
    F: FnMut(Span) -> BResult<Vec<T>>,
{
    move |input| {
        delimited(
            lbrace.context("block start"),
            // once '{' has matched, commit: errors inside the block (including a
            // missing '}') are reported instead of being backtracked over.
            // The closure borrows `inner` so it is not moved out of this FnMut.
            cut(|i| inner(i)),
            rbrace.cut().context("block end")
        ).parse(input)
    }
}
Using complete() for Tokens
use nom::bytes::streaming::take;
use nom::combinator::complete;

fn exactly_n(n: u8) -> impl FnMut(Span) -> BResult<Span<'_>> {
    move |input| complete(take(n)).context("exactly_n").parse(input)
}
all_consuming at File Level
use nom::combinator::all_consuming;

fn parse_file(input: Span) -> BResult<File> {
    all_consuming(file_parser).parse(input)
}
Precedence Chain Skeleton
fn primary(i: Span) -> BResult<Expr> { /* literals, names, parenthesized */ }
fn postfix(i: Span) -> BResult<Expr> { /* member access, invocation */ }
fn unary(i: Span) -> BResult<Expr> { /* + - ! ~ */ }
fn multiplicative(i: Span) -> BResult<Expr> { /* * / % */ }
fn additive(i: Span) -> BResult<Expr> { /* + - */ }
fn relational(i: Span) -> BResult<Expr> { /* < > <= >= */ }
fn equality(i: Span) -> BResult<Expr> { /* == != */ }
fn assignment(i: Span) -> BResult<Expr> { /* = += -= */ }

// Entry point used by statement parsers
fn expression(i: Span) -> BResult<Expr> { assignment(i) }
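To show how such a chain produces the right tree shape, here is a runnable, std-only miniature with just two levels (additive over multiplicative over primary). It does not use nom, spans or the real Expr type; Expr, render and parse are hypothetical helpers. Each level parses its operands with the next-tighter level and folds operators left to right.

```rust
/// Hypothetical mini-AST for the sketch.
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Binary(Box<Expr>, char, Box<Expr>),
}

/// primary := number | "(" additive ")"
fn primary(input: &str) -> Option<(Expr, &str)> {
    let input = input.trim_start();
    if let Some(rest) = input.strip_prefix('(') {
        let (expr, rest) = additive(rest)?;
        let rest = rest.trim_start().strip_prefix(')')?;
        return Some((expr, rest));
    }
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let value: i64 = input[..end].parse().ok()?;
    Some((Expr::Num(value), &input[end..]))
}

/// multiplicative := primary (("*" | "/") primary)*   (left-associative)
fn multiplicative(input: &str) -> Option<(Expr, &str)> {
    let (mut lhs, mut rest) = primary(input)?;
    loop {
        let trimmed = rest.trim_start();
        let op = match trimmed.chars().next() {
            Some(c @ ('*' | '/')) => c,
            _ => return Some((lhs, rest)),
        };
        let (rhs, after) = primary(&trimmed[1..])?;
        lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        rest = after;
    }
}

/// additive := multiplicative (("+" | "-") multiplicative)*
fn additive(input: &str) -> Option<(Expr, &str)> {
    let (mut lhs, mut rest) = multiplicative(input)?;
    loop {
        let trimmed = rest.trim_start();
        let op = match trimmed.chars().next() {
            Some(c @ ('+' | '-')) => c,
            _ => return Some((lhs, rest)),
        };
        let (rhs, after) = multiplicative(&trimmed[1..])?;
        lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        rest = after;
    }
}

/// Entry point: the whole input must be consumed.
fn parse(input: &str) -> Option<Expr> {
    let (expr, rest) = additive(input)?;
    if rest.trim().is_empty() { Some(expr) } else { None }
}

/// Fully parenthesized rendering, to make the tree shape visible.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Num(n) => n.to_string(),
        Expr::Binary(l, op, r) => format!("({} {} {})", render(l), op, render(r)),
    }
}
```

Adding a level (for example relational) follows the same pattern: a new function that calls the next-tighter level for its operands.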
Context Labels and Cuts
fn class_declaration(i: Span) -> BResult<ClassDecl> {
    preceded(
        tag("class").context("keyword 'class'"),
        tuple((
            identifier.cut().context("class name"),
            // ... type params, base list
        ))
    )
    .context("class declaration")
    .map(|(name, ..)| ClassDecl { name })
    .parse(i)
}
Tips
- Whitespace: Prefer explicit multispace0/multispace1 at boundaries to avoid accidental greedy matches.
- Error messages: Keep context() labels concise and domain-specific (e.g., "parameter list").
- Backtracking: Insert cut() after committing to a branch to stop alt from swallowing errors.
Writing Tests
How to write and organize tests for BSharp.
Test Locations
- Parser and analysis tests live under src/bsharp_tests/src/.
- Prefer dedicated files per area, e.g.:
  - src/bsharp_tests/src/parser/expressions/...
  - src/bsharp_tests/src/parser/statements/...
  - src/bsharp_tests/src/analysis/...
Parser Tests
- Use realistic C# snippets and assert AST shapes.
- Prefer external test helpers (avoid inline #[cfg(test)] in parser modules).
// Example skeleton
#[test]
fn parses_simple_invocation() {
    let source = "class C { void M() { Foo(1); } }";
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(source).unwrap();
    // Use Query or pattern matching to verify nodes
}
Analysis Tests
- Run AnalyzerPipeline::run_with_defaults and inspect artifacts:
  - AstAnalysis metrics
  - CFG summary
  - Dependency summary
#![allow(unused)] fn main() { #[test] fn counts_methods() { let src = "class C { void A(){} void B(){} }"; let (cu, spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap(); let mut session = bsharp_analysis::framework::AnalysisSession::new( bsharp_analysis::context::AnalysisContext::new("file.cs", src), spans); bsharp_analysis::framework::AnalyzerPipeline::run_with_defaults(&cu, &mut session); let metrics = session.artifacts.get::<bsharp_analysis::metrics::AstAnalysis>().unwrap(); assert!(metrics.total_methods >= 2); } }
Tips
- Names: Use descriptive test names; each file should focus on one area.
- Fixtures: Keep sources small and focused; add comments for intent.
- Determinism: Avoid relying on traversal order; query by type or match by name.
bsharp_tests Overview
Structure and conventions for the test crate.
Location
- All tests live under `src/bsharp_tests/src/`.
- Organize by domain:
  - `parser/` for parsing-related tests
  - `analysis/` for analysis pipeline tests
Running Tests
cargo test -p bsharp_tests
Conventions
- Prefer descriptive file names and test names.
- Keep fixtures small and focused.
- Use `Parser::parse_with_spans` and `AnalyzerPipeline::run_with_defaults` in integration-style tests.
Extending Syntax (New Nodes)
How to add new AST node types to bsharp_syntax.
1. Define the Node
- Add a struct or enum in the relevant module under `src/bsharp_syntax/src/`.
- Derive `bsharp_syntax_derive::AstNode` so it participates in traversal and rendering.

```rust
#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct InterpolatedString {
    pub parts: Vec<InterpolatedPart>,
}

#[derive(bsharp_syntax_derive::AstNode, Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum InterpolatedPart {
    Text(String),
    Expr(Expression),
}
```
The derive implements AstNode and auto-generates children() that pushes nested nodes.
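As a rough mental model of what the derive produces, the following hand-written sketch implements a `children()` method for the same two types. The `Node` trait, its `kind()` method, and the stand-in `Expression` type are assumptions made for illustration; the real `AstNode` trait in `bsharp_syntax` may be shaped differently.

```rust
// Hypothetical traversal trait; not the actual bsharp_syntax AstNode.
trait Node {
    fn kind(&self) -> &'static str;
    // Nested AST nodes only; primitive payloads such as `String` are not children.
    fn children(&self) -> Vec<&dyn Node>;
}

// Stand-in for the real expression type.
#[allow(dead_code)]
struct Expression {
    text: String,
}

#[allow(dead_code)]
enum InterpolatedPart {
    Text(String),
    Expr(Expression),
}

struct InterpolatedString {
    parts: Vec<InterpolatedPart>,
}

impl Node for Expression {
    fn kind(&self) -> &'static str {
        "Expression"
    }
    fn children(&self) -> Vec<&dyn Node> {
        Vec::new()
    }
}

impl Node for InterpolatedPart {
    fn kind(&self) -> &'static str {
        "InterpolatedPart"
    }
    fn children(&self) -> Vec<&dyn Node> {
        match self {
            // Primitive payload: skipped.
            InterpolatedPart::Text(_) => Vec::new(),
            // Nested node: pushed.
            InterpolatedPart::Expr(e) => vec![e as &dyn Node],
        }
    }
}

impl Node for InterpolatedString {
    fn kind(&self) -> &'static str {
        "InterpolatedString"
    }
    fn children(&self) -> Vec<&dyn Node> {
        self.parts.iter().map(|p| p as &dyn Node).collect()
    }
}

// Depth-first walk that records node kinds, as a visualizer or query might.
fn collect_kinds(node: &dyn Node, out: &mut Vec<&'static str>) {
    out.push(node.kind());
    for child in node.children() {
        collect_kinds(child, out);
    }
}
```

A generic walker like `collect_kinds` never needs to know about `InterpolatedString` specifically, which is why new nodes that derive the trait show up in traversal without further wiring.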
2. Implement Emit (Optional)
If the node needs to be formatted back to C#, implement Emit in bsharp_syntax emitters.
```rust
impl crate::emitters::emit_trait::Emit for InterpolatedString {
    fn emit<W: std::fmt::Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError> {
        cx.token(w, "$")?;
        cx.bracketed(w, '"', '"', || {
            for p in &self.parts {
                p.emit(w, cx)?;
            }
            Ok(())
        })
    }
}
```
Add per-part emitters in the same or nearby module (e.g., emitters/expressions/...).
3. Wire Up Parser (in bsharp_parser)
- Add a parser in `src/bsharp_parser/src/expressions/...` that constructs the new node.
- Use `Span`-based parsers (`bsharp_parser::syntax::span::Span`).
- On errors, rely on helpers and contexts so `format_error_tree()` is informative.
3a. Add Keywords & Tokens
- Define keyword helpers using `define_keyword_pair!` in `src/bsharp_parser/src/keywords/`.
- If the keyword is a new reserved word, add it to `KEYWORDS` (identifier filtering).
- Use `kw_*()`/`peek_*()` in parsers, wrapped with `ws()` at boundaries, and insert `.cut()` after commitment.
See: docs/parser/keywords-and-tokens.md for the macro and examples.
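The interaction between a reserved-word table and identifier parsing can be sketched with the standard library alone. The table contents and the `word`, `identifier`, and `keyword` helpers below are hypothetical; the real `KEYWORDS` list and the `kw_*()` helpers live in `bsharp_parser`.

```rust
// Hypothetical reserved-word table, much shorter than a real one.
const KEYWORDS: &[&str] = &["class", "struct", "record", "void", "return"];

// Split a leading word of the form [A-Za-z_][A-Za-z0-9_]* off `input`.
fn word(input: &str) -> Option<(&str, &str)> {
    let first = input.chars().next()?;
    if !(first.is_ascii_alphabetic() || first == '_') {
        return None;
    }
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    Some((&input[..end], &input[end..]))
}

// Identifier parser: a word that is not reserved.
fn identifier(input: &str) -> Option<(&str, &str)> {
    let (w, rest) = word(input)?;
    if KEYWORDS.contains(&w) {
        None
    } else {
        Some((w, rest))
    }
}

// Keyword parser: matches only a whole word, so `classy` is not `class`.
fn keyword<'a>(kw: &str, input: &'a str) -> Option<&'a str> {
    let (w, rest) = word(input)?;
    if w == kw {
        Some(rest)
    } else {
        None
    }
}
```

Forgetting to add a new reserved word to the table would let `identifier` accept it, so a declaration keyword could be misread as a name.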
3b. Use Syntax Parsers (Whitespace/Lists)
- Whitespace/comments: `syntax/comment_parser.rs` (`ws()`, `parse_whitespace_or_comments()`)
- Lists: `syntax/list_parser.rs` for delimited/separated lists
- Tokens: prefer `nom_supreme::tag::complete::tag()` and compose with `preceded`/`terminated`/`delimited` and `ws()`
Example token with trivia:
```rust
use nom::{combinator::map, sequence::delimited};
use nom_supreme::tag::complete::tag;
use crate::syntax::comment_parser::ws;

map(delimited(ws, tag(","), ws), |_| ())
```
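For readers unfamiliar with trivia handling, the sketch below shows in plain Rust what "a token with trivia on both sides" means. `skip_trivia` and `token` are hypothetical helpers, not the project's `ws()` implementation, and they cover only whitespace, `//` comments, and `/* */` comments.

```rust
// Drop leading whitespace, `// line` comments, and `/* block */` comments.
fn skip_trivia(mut input: &str) -> &str {
    loop {
        let trimmed = input.trim_start();
        if let Some(rest) = trimmed.strip_prefix("//") {
            // A line comment runs to the end of the line (or of the input).
            input = rest.find('\n').map_or("", |i| &rest[i + 1..]);
        } else if let Some(rest) = trimmed.strip_prefix("/*") {
            match rest.find("*/") {
                Some(i) => input = &rest[i + 2..],
                // Unterminated block comment: leave it for the caller to report.
                None => return trimmed,
            }
        } else {
            return trimmed;
        }
    }
}

// Match `tok` with trivia allowed on both sides, returning the remaining input.
fn token<'a>(tok: &str, input: &'a str) -> Option<&'a str> {
    skip_trivia(input).strip_prefix(tok).map(skip_trivia)
}
```

Consuming trailing trivia inside the token parser means the next parser always starts on a meaningful character, which keeps individual parsers free of whitespace handling.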
4. Tests (bsharp_tests)
- Create tests under `src/bsharp_tests/src/parser/...` verifying the node appears in the AST.
- Add formatter round-trip tests if `Emit` is implemented.

```rust
#[test]
fn interpolated_string_ast() {
    let src = r#"class C { void M(){ var s = $"x={x}"; } }"#;
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    // Use Query to find InterpolatedString once parser supports it
}
```
5. Visualization (Optional)
Graph views require no changes: to_text, to_mermaid, and to_dot use AstNode traversal.
Tips
- Box recursion: Use `Box<T>` for recursive enum variants.
- Keep primitives out: Store `String`, `bool`, and numbers as payload only; the derive will skip them.
- Naming: Use PascalCase node names; no `Syntax` suffix.
Writing Parsers
Guidelines for implementing parsers in bsharp_parser using nom and spans.
Spans & Result Type
- Span: `bsharp_parser::syntax::span::Span<'a>` (alias of `nom_locate::LocatedSpan<&'a str>`)
- Error type: `nom_supreme::error::ErrorTree<Span<'a>>`
- Result alias: `type BResult<'a, O> = IResult<Span<'a>, O, ErrorTree<Span<'a>>>` in `bsharp_parser::syntax::errors`

```rust
use bsharp_parser::syntax::errors::BResult;
use bsharp_parser::syntax::span::Span;
```
Streaming vs Complete
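Many nom parsers come in `streaming` and `complete` variants; a streaming parser returns `Err::Incomplete` when more input could change the result. Use `nom::combinator::complete(parser)` to transform `Incomplete` into `Error` when you want "complete input" behavior for a sub-parser (e.g., tokens, literals).
Example (adapted from the nom docs):

```rust
use nom::bytes::streaming::take;
use nom::combinator::complete;
use nom::Parser; // brings `.parse()` into scope

// The explicit error type lets inference succeed without naming a concrete error value.
let mut parser = complete(take::<_, &str, nom::error::Error<&str>>(5u8));
assert_eq!(parser.parse("abcdefg"), Ok(("fg", "abcde")));
assert!(parser.parse("abcd").is_err());
```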
At the top level, wrap file parsers with nom::combinator::all_consuming to ensure the entire input is consumed:
```rust
use nom::combinator::all_consuming;

let mut parser = all_consuming(file_parser);
```
Error Contexts and Cuts
Use nom_supreme for structured errors and better messages:
- `context("label", p)` to push human-readable frames.
- `cut(p)` to prevent backtracking across critical boundaries and surface the right error.
- Our pretty-printer `format_error_tree(&source, &error_tree)` renders the tree with line/column and context stack.

```rust
use nom::{branch::alt, sequence::{preceded, terminated}};
use nom_supreme::context::ContextError;
use nom_supreme::ParserExt; // for .context(), .cut()

fn identifier(input: Span) -> BResult<String> { /* ... */ }
fn lbrace(input: Span) -> BResult<()> { /* ... */ }
fn rbrace(input: Span) -> BResult<()> { /* ... */ }

fn block(input: Span) -> BResult<Vec<Stmt>> {
    preceded(
        lbrace.context("block: '{'"),
        terminated(statements, rbrace.cut().context("block: '}'"))
    )
    .parse(input)
}
```
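To show what rendering "with line/column and context stack" involves, here is a std-only sketch that maps a byte offset to a position and prints a context stack under it. `line_col` and `render` are hypothetical helpers and the output format is invented; neither reflects the actual signature or layout of `format_error_tree`.

```rust
// Convert a byte offset into a 1-based (line, column), counting columns in characters.
fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let (mut line, mut col) = (1, 1);
    for (i, ch) in source.char_indices() {
        if i >= offset {
            break;
        }
        if ch == '\n' {
            line += 1;
            col = 1;
        } else {
            col += 1;
        }
    }
    (line, col)
}

// Render a message plus its context labels (innermost first) under a position header.
fn render(source: &str, offset: usize, message: &str, contexts: &[&str]) -> String {
    let (line, col) = line_col(source, offset);
    let mut out = format!("error at {line}:{col}: {message}");
    for ctx in contexts {
        out.push_str(&format!("\n  in {ctx}"));
    }
    out
}
```

Each `context(...)` label added in a parser becomes one `in ...` line in a report of this kind, which is why short, domain-specific labels read best.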
Common Combinators
- `preceded(a, b)`, `terminated(a, b)`, `delimited(a, b, c)`
- `alt((p1, p2, ...))` for alternatives
- `tuple((p1, p2, ...))` to sequence
- `separated_list0(sep, item)` to parse comma-separated lists
- `map(p, f)` to build AST nodes
Prefer small, focused parsers composed with these combinators.
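As an illustration of how small parsers compose, the sketch below re-implements the behavior of a `separated_list0`-style combinator over plain string slices. It is std-only, the helper names are hypothetical, and it is meant to convey semantics rather than mirror nom's implementation.

```rust
// Parse a run of ASCII digits into a number, returning (rest, value).
fn number(input: &str) -> Option<(&str, u32)> {
    let end = input.find(|c: char| !c.is_ascii_digit()).unwrap_or(input.len());
    let value = input[..end].parse().ok()?;
    Some((&input[end..], value))
}

// Zero or more `item`s separated by `sep`; a trailing separator is left unconsumed.
fn separated_list0<'a, T>(
    mut input: &'a str,
    sep: &str,
    item: impl Fn(&'a str) -> Option<(&'a str, T)>,
) -> (&'a str, Vec<T>) {
    let mut items = Vec::new();
    // The first item is optional: an empty list is a success.
    match item(input) {
        Some((rest, value)) => {
            items.push(value);
            input = rest;
        }
        None => return (input, items),
    }
    // Each further element must be a separator followed by an item.
    while let Some(after_sep) = input.strip_prefix(sep) {
        match item(after_sep) {
            Some((rest, value)) => {
                items.push(value);
                input = rest;
            }
            None => break,
        }
    }
    (input, items)
}
```

Note that the combinator only commits to a separator once the item after it has parsed, so `"1,2,"` yields two items and leaves the dangling comma for the caller to reject or accept.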
Top-Level Entry Points
- Keep clear entry points for precedence chains (e.g., primary → postfix → binary → assignment).
- Use wrapper nodes for constructs like `New`, `Invocation`, `MemberAccess`, etc., to keep variants orthogonal in the AST (see `bsharp_syntax::expressions::expression.rs`).
Testing Parsers
- Place tests in `src/bsharp_tests/src/parser/...`.
- Parse using `Parser::new().parse_with_spans(&source)` and assert expected AST shapes.
- On failure, pretty-print errors with `format_error_tree` to diagnose.

```rust
#[test]
fn parses_expression_statement() {
    let src = "class C { void M(){ Foo(1); } }";
    let (cu, _spans) = bsharp_parser::facade::Parser::new().parse_with_spans(src).unwrap();
    // Verify expected nodes using Query or pattern matching
}
```
Tips
- Return early with cut after consuming a keyword to avoid misleading alternatives.
- Use complete for tokens/literals that must not be partial.
- all_consuming at file/compilation-unit to ban trailing garbage.
- Context labels: Be concise and specific; they surface in error messages and docs.
References
- nom combinator `complete`: https://docs.rs/nom/8.0.0/nom/combinator/fn.complete.html
- nom combinator `all_consuming`: https://docs.rs/nom/8.0.0/nom/combinator/fn.all_consuming.html
Spanned-first Parsers
This project follows a spanned-first policy for all parser entrypoints. Public parsers return Spanned<T> so every AST value carries precise source ranges for diagnostics, tooling, and downstream analysis.
Rationale
- Rich diagnostics: precise byte and line/column ranges for errors and UI highlighting.
- Uniform contract: tools and tests can rely on span presence everywhere.
- Safer refactors: span plumbing is not an afterthought.
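A minimal sketch of what a spanned wrapper can look like follows. Only the `node` field name comes from this page; the `span` field, the `map` and `text` methods, and `identifier_spanned` are assumptions for illustration and may not match the real `Spanned<T>` in BSharp.

```rust
use std::ops::Range;

// Hypothetical spanned wrapper: a value plus the byte range it was parsed from.
#[derive(Debug, Clone, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Range<usize>,
}

impl<T> Spanned<T> {
    // Transform the inner value while keeping the original source range.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Spanned<U> {
        Spanned { node: f(self.node), span: self.span }
    }

    // Slice the source text this value was parsed from.
    fn text<'s>(&self, source: &'s str) -> &'s str {
        &source[self.span.clone()]
    }
}

// Parse an identifier starting at byte `start` and record where it came from.
fn identifier_spanned(source: &str, start: usize) -> Option<Spanned<String>> {
    let tail = source.get(start..)?;
    let len = tail
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(tail.len());
    if len == 0 {
        return None;
    }
    Some(Spanned { node: tail[..len].to_string(), span: start..start + len })
}
```

Because `map` carries the range along, a transformation from a raw token to an AST value does not lose the location needed later for diagnostics.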
Usage Patterns
1) Prefer spanned entrypoints
```rust
// Prefer spanned variants
let (rest, s_expr) = parse_expression_spanned(input)?;
// Use inner value if spans are not needed at the call site
let expr = s_expr.node;
```
2) Map lists to inner nodes
```rust
use nom::sequence::delimited;

let (rest, args) = parse_delimited_list0(
    |i| delimited(ws, tok_l_paren(), ws).parse(i),
    |i| delimited(ws, parse_expression_spanned, ws).map(|s| s.node).parse(i),
    |i| delimited(ws, tok_comma(), ws).parse(i),
    |i| delimited(ws, tok_r_paren(), ws).parse(i),
    false,
    true,
)
.parse(input)?;
```
3) Statements
```rust
let (rest, s_stmt) = parse_statement_ws_spanned(input)?;
let stmt = s_stmt.node;
```
Implementing new parsers
- Return `Spanned<T>` from public entrypoints.
- Compose with existing spanned parsers to retain spans through transformations.
- For adapters that must return unspanned values (e.g., legacy APIs), apply `.map(|s| s.node)` at the last possible boundary.
- Use `cut()` after committing to a branch to produce focused errors.
- Add `context("...")` labels on user-facing constructs.
Example:
```rust
use nom::sequence::delimited;
use nom_supreme::ParserExt;

pub fn parse_lambda_body(input: Span) -> BResult<LambdaBody> {
    nom::branch::alt((
        // block
        nom::combinator::map(parse_lambda_block_body, LambdaBody::Block),
        // expression
        nom::combinator::map(
            delimited(ws, parse_expression_spanned, ws).map(|s| s.node),
            LambdaBody::ExpressionSyntax,
        ),
    ))
    .context("lambda body")
    .parse(input)
}
```
Testing
- Prefer helpers that accept/return `Spanned<T>` in new tests.
- When asserting only structure, map to `.node` before comparison.
- For diagnostics, use the existing pretty printers (see `bsharp_parser::errors::format_error_tree` and `to_miette_report`).
Migration Notes
- Old unspanned entrypoints are deprecated; use their `_spanned` counterparts.
- If a caller previously depended on unspanned types, add `.map(|s| s.node)`.
- For bulk changes: search for `parse_expression(` and `parse_statement(` and replace them with the spanned variant plus a `.node` mapping.
Compliance
This Compliance section is a work in progress. Content, mappings, and assertions may evolve and change between versions.
This section documents the Roslyn compliance pipeline and how we validate our bsharp_parser and bsharp_syntax against Roslyn’s structure tests.
- Start with the high-level Overview
- Learn how to write Custom Asserts
- Understand the Generator
Compliance Overview
This section describes the Roslyn compliance effort for the C# parser, using our Rust-based bsharp_parser and the bsharp_syntax AST. The goal is to automatically extract structural assertions from Roslyn tests and validate that our AST shape and key payloads match Roslyn’s expectations (normalized to our naming conventions: PascalCase, no "Syntax" suffix).
High-Level Flow
- Source: Roslyn test files in `roslyn_testing/roslyn_repo/src/Compilers/CSharp/Test/Syntax/Parsing/`.
- Extraction: A generator scans for `UsingTree(...)` blocks and parses the DSL of `N(SyntaxKind.X)` nodes that follows each one.
- Translation: The extracted Roslyn tree is translated and normalized to our canonical kinds and structure.
- Running: Tests are emitted into `bsharp_compliance_testing`, parsing the provided C# snippets with `bsharp_parser` and comparing the actual AST with the expected structure.
Core Components
- `bsharp_compliance` (generator)
  - Reads Roslyn files and extracts structural expected trees.
  - Parses the Roslyn DSL (`N(SyntaxKind.X)`, `M(...)`, `EOF()`).
  - Normalizes kinds via `kind_map.rs` (e.g., `RecordStructDeclaration` → `RecordDeclaration`).
  - Emits Rust tests into `bsharp_compliance_testing/src/generated/`.
- `bsharp_compliance_testing` (tests & asserts)
  - Contains custom structural assertions in `custom_asserts/structure_assert.rs`.
  - Walks real `bsharp_syntax` nodes to build a comparable `ExpectedTree`.
  - Compares node kind shapes and selected token payloads (e.g., identifier text).
Normalization Principles
- Node names are PascalCase and omit Roslyn’s `...Syntax` suffix.
- Tokens/keywords are filtered from structure; identifier text is lifted where relevant.
- Harness differences (Roslyn’s class-with-method wrappers vs. our top-level statements) are normalized at assert time when needed.
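The first two principles can be sketched in a few lines of std-only Rust. The rename table below holds only the single mapping mentioned on this page, and the function names and the "ends with Token/Keyword" heuristic are assumptions, not the contents of `kind_map.rs`.

```rust
// Hypothetical rename table with the one mapping named in this document.
const RENAMES: &[(&str, &str)] = &[("RecordStructDeclaration", "RecordDeclaration")];

// Map a Roslyn kind or class name to a canonical node name.
fn normalize_kind(roslyn: &str) -> String {
    let base = roslyn.strip_suffix("Syntax").unwrap_or(roslyn);
    RENAMES
        .iter()
        .find(|(from, _)| *from == base)
        .map_or(base, |(_, to)| *to)
        .to_string()
}

// Assumed heuristic: kinds ending in `Token` or `Keyword` are not structural.
fn is_structural(roslyn: &str) -> bool {
    !(roslyn.ends_with("Token") || roslyn.ends_with("Keyword"))
}

// Normalize a flat list of kinds, dropping tokens and keywords.
fn normalize_all(kinds: &[&str]) -> Vec<String> {
    kinds
        .iter()
        .copied()
        .filter(|k| is_structural(k))
        .map(normalize_kind)
        .collect()
}
```

Keeping renames in a data table rather than in code makes it cheap to add mappings as new Roslyn suites are brought in.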
What This Validates
- Structural presence and order of major nodes (CompilationUnit, declarations, using directives, type parameters, constraint clauses, etc.).
- Selected payloads (e.g., `IdentifierName.token_value`).
- Deeper constructs incrementally (e.g., `TypeParameterConstraintClause`, “allows ref struct” constraints, record primary parameter lists).
Roadmap
- Expand kind mapping and walker coverage across more Roslyn suites.
- Tighten token payload checks where meaningful.
- Add targeted hand-authored structure tests for corner cases.
Compliance Guide
This guide explains how to write custom asserts for Roslyn compliance tests using our bsharp_compliance_testing helpers. It focuses on structural checks and optional diagnostics checks.
Where custom asserts live
- File: `roslyn_testing/bsharp_compliance_testing/src/custom_asserts/after_parse.rs`
- Entry points:
  - `after_parse(...)`: lightweight per-case hook for structural or source-based assertions.
  - `after_parse_with_expected(...)`: adds an optional diagnostics expectation integration.
- Helper macro:
  - `assert_when! { ... }`: enables concise, per-case matching on module/file/method/index.
Using assert_when!
The macro lets you target a specific Roslyn case by module name, Roslyn filename, Roslyn test method name, and case index (0-based within the method).
Example for a Statement case:
```rust
// Import only CaseData: importing `after_parse` as well would clash with the function defined below.
use crate::custom_asserts::after_parse::CaseData;

pub fn after_parse(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    case: CaseData<'_>,
) {
    assert_when!(
        module = "statement_parsing_tests",
        roslyn_file = "StatementParsingTests",
        roslyn_method = "TestSwitchStatementWithNullableTypeInPattern3",
        idx = 2,
        Statement(ast, src) {
            // Add your targeted assertions here
            assert!(src.contains("switch"));
            // Optional: pattern-match on `ast` when you need structure checks
            // match ast { /* ... */ }
        }
    );
}
```
Example for a File case (full CompilationUnit available):
```rust
// Import only CaseData: importing `after_parse` as well would clash with the function defined below.
use crate::custom_asserts::after_parse::CaseData;

pub fn after_parse(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    case: CaseData<'_>,
) {
    assert_when!(
        module = "using_directive_parsing_tests",
        roslyn_file = "UsingDirectiveParsingTests",
        roslyn_method = "SimpleUsingDirectiveNamePointer",
        idx = 0,
        File(unit, src, original) {
            assert!(src.starts_with("using "));
            // `unit` is a &bsharp_syntax::ast::CompilationUnit
            // You can inspect its using directives or declarations if needed.
            assert!(unit.using_directives.len() >= 1);
            let _ = original; // original Roslyn text when provided
        }
    );
}
```
Diagnostics integration
If the generator attaches expected diagnostics, use after_parse_with_expected(...) to compare counts when diagnostics are supported by the build:
```rust
use crate::custom_asserts::after_parse::{after_parse_with_expected, CaseData};

pub fn my_integration(
    module: &str,
    roslyn_file: &str,
    roslyn_method: &str,
    idx: usize,
    expected: Option<crate::custom_asserts::roslyn_asserts::ExpectedDiagnostics>,
    case: CaseData<'_>,
) {
    // Runs custom case asserts and then asserts diagnostics count when available
    after_parse_with_expected(module, roslyn_file, roslyn_method, idx, expected, case);
}
```
Notes:
- When diagnostics support is disabled, the helper asserts with an explicit "unimplemented" fallback to avoid silent failures.
- Keep asserts precise and self-contained; prefer checking concrete substrings or specific AST facts.
Best practices
- Keep assertions small and focused. Use `assert_when!` blocks per case.
- Avoid brittle assumptions: prefer checking presence/shape over exact token trivia.
- Match our naming convention in any structure references (PascalCase, no `Syntax` suffix).
- Fail fast with clear messages; do not silently swallow errors.
Generator
This document describes how the Roslyn structure test generator works in bsharp_compliance and how it produces executable tests for bsharp_compliance_testing.
Inputs
- Roslyn source files under `roslyn_testing/roslyn_repo/src/Compilers/CSharp/Test/Syntax/Parsing/`.
- The generator scans for `UsingTree(...)` calls and parses the immediately following Roslyn structure DSL composed of `N(SyntaxKind.X)` and `EOF()` entries (with `M(...)` ignored as "missing").
Pipeline
- Scan and collect test methods
  - Locates Roslyn `[Fact]` methods and all `UsingTree(...)` call sites.
  - Captures the closest preceding `var text = "...";` snippet as input source, when present.
- Parse structure DSL
  - Reads the DSL block following `UsingTree(...)` and constructs a nested `ExpectedTree` (`ExpectedNode` graph) mirroring the Roslyn node hierarchy.
  - Tolerates whitespace, comments, and missing markers (`M(...)`).
- Kind translation and normalization
  - `generator/kind_map.rs` maps Roslyn kinds to our canonical naming (PascalCase, no `Syntax` suffix).
  - Filters token/keyword nodes, lifting identifier text (`IdentifierToken` → parent `IdentifierName.token_value`).
  - Applies targeted renames (e.g., `RecordStructDeclaration` → `RecordDeclaration`).
- Emit tests
  - Writes Rust tests into `bsharp_compliance_testing/src/generated/<module>.rs`.
  - Each test parses the captured `src` with `bsharp_parser` and asserts structure via `custom_asserts/structure_assert.rs`.
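The DSL-parsing step can be pictured with a small line-oriented reader. This is a sketch under stated assumptions: `ExpectedNode` is named above, but its fields, the `missing` flag, the choice to drop missing nodes together with their subtrees, and the one-entry-per-line input shape are all illustrative guesses rather than the generator's actual code.

```rust
// Hypothetical expected-tree node; field names are assumptions.
#[derive(Debug, PartialEq)]
struct ExpectedNode {
    kind: String,
    missing: bool,
    children: Vec<ExpectedNode>,
}

// Extract (`X`, is_missing) from `N(SyntaxKind.X ...)` or `M(SyntaxKind.X ...)`.
fn marker(line: &str) -> Option<(&str, bool)> {
    let (inner, missing) = if let Some(rest) = line.strip_prefix("N(SyntaxKind.") {
        (rest, false)
    } else {
        (line.strip_prefix("M(SyntaxKind.")?, true)
    };
    let end = inner.find(|c: char| c == ',' || c == ')')?;
    Some((&inner[..end], missing))
}

// Drop nodes marked missing, recursively.
fn prune_missing(nodes: Vec<ExpectedNode>) -> Vec<ExpectedNode> {
    nodes
        .into_iter()
        .filter(|n| !n.missing)
        .map(|mut n| {
            n.children = prune_missing(std::mem::take(&mut n.children));
            n
        })
        .collect()
}

// Build a forest from the DSL text; `{` opens the children of the previous node.
fn parse_dsl(dsl: &str) -> Option<Vec<ExpectedNode>> {
    // stack[0] holds the roots; each deeper level holds children being collected.
    let mut stack: Vec<Vec<ExpectedNode>> = vec![Vec::new()];
    for raw in dsl.lines() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with("//") {
            continue; // blank lines and comments are tolerated
        }
        if line.starts_with("EOF(") {
            break;
        }
        if line == "{" {
            stack.last()?.last()?; // a block must follow a node
            stack.push(Vec::new());
        } else if line == "}" {
            if stack.len() < 2 {
                return None;
            }
            let children = stack.pop()?;
            stack.last_mut()?.last_mut()?.children = children;
        } else {
            let (kind, missing) = marker(line)?;
            stack.last_mut()?.push(ExpectedNode {
                kind: kind.to_string(),
                missing,
                children: Vec::new(),
            });
        }
    }
    // Unbalanced braces leave extra levels on the stack.
    if stack.len() == 1 {
        stack.pop().map(prune_missing)
    } else {
        None
    }
}
```

The resulting forest still contains token kinds such as `IdentifierToken`; in the pipeline described above those would be filtered or lifted during the normalization step that follows.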
Assertions
- Structure assertions build a comparable expected tree from our actual AST (`bsharp_syntax`) and compare:
  - Node kinds and order
  - Selected token payloads (e.g., `IdentifierName.token_value`)
- Normalization in the assert layer adapts Roslyn’s harness (class + method) to our top-level statements when applicable.
Extending the Generator
- Update `generator/kind_map.rs` to add or refine kind mappings.
- Expand `custom_asserts/structure_assert.rs` to walk deeper AST areas (e.g., records, types, constraints).
- Improve the DSL parser (`generator/structure_dsl.rs`) as new Roslyn DSL shapes appear.
Output Location
- Generated files live under `roslyn_testing/bsharp_compliance_testing/src/generated/`.
- Modules track Roslyn file groups, e.g. `record_parsing.rs`, `using_directive_parsing_tests.rs`.