Graph-sitter performs advanced static analysis to build a rich graph representation of your codebase. This pre-computation step analyzes dependencies, references, types, and control flow to enable fast and reliable code manipulation operations.

Graph-sitter is built on top of Tree-sitter and rustworkx, and implements most language server features from scratch.

Graph-sitter is open source. Check out the source code to learn more!

The Codebase Graph

At the heart of Graph-sitter is a comprehensive graph representation of your code. When you initialize a Codebase, it performs static analysis to construct a rich graph structure connecting code elements:

# Initialize and analyze the codebase
from graph_sitter import Codebase
codebase = Codebase("./")

# Access pre-computed relationships
function = codebase.get_symbol("process_data")
print(f"Dependencies: {function.dependencies}")  # Instant lookup
print(f"Usages: {function.usages}")  # No parsing needed

Building the Graph

Graph-sitter’s graph construction happens in two stages:

  1. AST Parsing: We use Tree-sitter as our foundation for parsing code into Abstract Syntax Trees. Tree-sitter provides fast, reliable parsing across multiple languages.

  2. Multi-file Graph Construction: Custom parsing logic, implemented in rustworkx and Python, analyzes these ASTs to construct a more sophisticated graph structure. This graph captures relationships between symbols, files, imports, and more.
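The two stages above can be illustrated with a small, self-contained sketch. This is not Graph-sitter's implementation: it swaps in Python's standard-library `ast` module for Tree-sitter and a plain dictionary for a rustworkx graph, and the `SOURCES` data, `parse_files`, and `build_graph` are hypothetical names made up for the illustration. It also resolves calls by bare function name only, which a real resolver would not do.

```python
import ast
from collections import defaultdict

# Hypothetical in-memory "codebase": file name -> source text.
SOURCES = {
    "utils.py": "def clean(x):\n    return x.strip()\n",
    "main.py": (
        "from utils import clean\n"
        "\n"
        "def process_data(raw):\n"
        "    return clean(raw)\n"
    ),
}


def parse_files(sources):
    """Stage 1: parse each file into its own AST, independently of the others."""
    return {name: ast.parse(text, filename=name) for name, text in sources.items()}


def build_graph(trees):
    """Stage 2: look across all ASTs and record cross-file call edges."""
    # Map each top-level function name to the file that defines it.
    defined_in = {}
    for name, tree in trees.items():
        for node in tree.body:
            if isinstance(node, ast.FunctionDef):
                defined_in[node.name] = name

    # (file, function) -> set of (file, function) that it calls.
    edges = defaultdict(set)
    for name, tree in trees.items():
        for func in tree.body:
            if not isinstance(func, ast.FunctionDef):
                continue
            for node in ast.walk(func):
                # Only plain-name calls such as clean(raw) are resolved here;
                # method calls such as x.strip() are skipped.
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callee = node.func.id
                    if callee in defined_in:
                        edges[(name, func.name)].add((defined_in[callee], callee))
    return dict(edges)


graph = build_graph(parse_files(SOURCES))
print(graph)
```

The point of the split is that stage 1 needs only one file at a time, while stage 2 needs every file's AST before it can connect a call in `main.py` to a definition in `utils.py`.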

Performance Through Pre-computation

Pre-computing a rich index lets Graph-sitter make operations that are central to refactoring and code analysis very fast:

  • Finding all usages of a symbol
  • Detecting circular dependencies
  • Analyzing dependency graphs
  • Tracing call graphs
  • Static analysis-based code retrieval for RAG
  • …etc.

Pre-parsing the codebase enables constant-time lookups rather than requiring re-parsing or real-time analysis.
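As a rough illustration of why pre-computation helps, the sketch below inverts a set of dependency edges once, so that "find all usages" becomes a dictionary lookup (constant time on average), and then runs a depth-first search over the same edges to detect a circular dependency. The `DEPENDENCIES` data and the `build_usage_index` and `find_cycle` helpers are hypothetical and exist only for this example; they are not part of Graph-sitter's API.

```python
from collections import defaultdict

# Hypothetical dependency edges: symbol -> symbols it depends on.
DEPENDENCIES = {
    "process_data": {"clean", "validate"},
    "validate": {"clean"},
    "clean": set(),
    "a": {"b"},
    "b": {"a"},
}


def build_usage_index(dependencies):
    """Invert the dependency edges once so usages become a dict lookup."""
    usages = defaultdict(set)
    for symbol, deps in dependencies.items():
        for dep in deps:
            usages[dep].add(symbol)
    return dict(usages)


def find_cycle(dependencies):
    """Return one dependency cycle as a list of symbols, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {symbol: WHITE for symbol in dependencies}
    path = []

    def visit(symbol):
        color[symbol] = GRAY
        path.append(symbol)
        for dep in sorted(dependencies.get(symbol, ())):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the path closes a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[symbol] = BLACK
        return None

    for symbol in sorted(dependencies):
        if color[symbol] == WHITE:
            found = visit(symbol)
            if found:
                return found
    return None


USAGES = build_usage_index(DEPENDENCIES)  # built once, up front
print(USAGES["clean"])                    # lookup only, no re-parsing
print(find_cycle(DEPENDENCIES))
```

Building the index costs one pass over every edge, but each later usage query avoids re-scanning the codebase, which is the trade-off the paragraph above describes.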

Multi-Language Support

One of Codegen’s core principles is that many programming tasks are fundamentally similar across languages.

Currently, Graph-sitter supports:

Learn about how Graph-sitter handles language specifics in the Language Support guide.

We’ve started with these ecosystems but designed our architecture to be extensible. The graph-based approach provides a consistent interface across languages while handling language-specific details under the hood.

Build with Us

Graph-sitter is just getting started, and we’re excited about the possibilities ahead. We enthusiastically welcome contributions from the community, whether it’s:

  • Adding support for new languages
  • Implementing new analysis capabilities
  • Improving performance
  • Expanding the API
  • Adding new transformations
  • Improving documentation

Check out our community guide to get involved!