P1
Project Idea: Rebuilding Aider with Jac and Agentic Object-Spatial Programming#
This project proposes enhancing Aider's architecture using Jac's object-spatial programming paradigm and byLLM (by <llm>) features to optimize multi-file editing, token handling, and repository understanding through graph-based representations.
Core Concepts#
Aider's Functionality#
Aider assists developers by:
- Understanding Code Structure: Analyzing the existing codebase to build a mental model.
- Responding to Prompts: Taking user requests (e.g., "add a new feature," "fix this bug," "refactor this code").
- Generating Code Changes: Producing code snippets or entire file modifications.
- Applying Changes: Integrating the generated code into the existing codebase.
- Iterative Refinement: Allowing users to review, accept, or reject changes, and provide further instructions.
Jac's Object-Spatial Programming#
In Jac, a codebase can be represented as a graph.
- Nodes: Represent files, classes, functions, variables, comments, and other code constructs. Each node can have properties (e.g., file path, function signature, variable type, code content).
- Edges: Represent relationships between these constructs (e.g., a function calls another function, a class contains a method, a file imports another file, a variable is defined in a function).
- Walkers: Can traverse this code graph to understand its structure, identify relevant sections for a given prompt, and even apply modifications.
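The node, edge, and walker roles described above can be illustrated outside Jac as well. The following is a minimal Python sketch, not Jac code and not part of Aider; `CodeGraph`, `walk`, and the sample file and function names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class CodeGraph:
    """Hypothetical in-memory code graph: nodes carry properties, edges carry a relation type."""
    nodes: dict = field(default_factory=dict)  # name -> properties
    edges: dict = field(default_factory=lambda: defaultdict(list))  # name -> [(relation, target)]

    def add_node(self, name, **props):
        self.nodes[name] = props

    def add_edge(self, source, relation, target):
        self.edges[source].append((relation, target))


def walk(graph, start, relations):
    """Walker-style traversal: visit nodes reachable from start via the given relation types."""
    visited, queue = [start], deque([start])
    while queue:
        current = queue.popleft()
        for relation, target in graph.edges.get(current, []):
            if relation in relations and target not in visited:
                visited.append(target)
                queue.append(target)
    return visited


# Hypothetical two-file codebase.
graph = CodeGraph()
graph.add_node("app.py", kind="file")
graph.add_node("config.py", kind="file")
graph.add_node("main", kind="function", signature="main() -> None")
graph.add_node("load_config", kind="function", signature="load_config(path: str) -> dict")
graph.add_edge("app.py", "contains", "main")
graph.add_edge("app.py", "imports", "config.py")
graph.add_edge("config.py", "contains", "load_config")
graph.add_edge("main", "calls", "load_config")

# Follow only "contains" and "calls" edges, ignoring "imports".
reachable = walk(graph, "app.py", {"contains", "calls"})
```

Restricting the traversal to chosen relation types is the point of the sketch: the same graph can answer different questions depending on which edges the walker is allowed to follow.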
Jac's byLLM (by <llm>)#
The by <llm> feature allows Jac to delegate complex reasoning and generation tasks to Large Language Models.
- Natural Language Understanding: An LLM can interpret user prompts for Aider.
- Code Generation: An LLM can generate code snippets based on the prompt and the context derived from the code graph.
- Decision Making: An LLM can decide which parts of the code graph are most relevant to a user's request.
Understanding Aider's Current Architecture#
Aider's success comes from several key architectural decisions:
- Tree-sitter Based Repository Map: Uses tree-sitter to parse source code into ASTs, extracting symbol definitions and references to create a concise map of the codebase.
- Graph Ranking Algorithm: Analyzes dependencies between files as a graph, using ranking algorithms to select the most relevant portions of the repo_map that fit within token budgets (default 1k tokens via --map-tokens).
- Multiple Edit Formats: Supports different edit formats (diff, whole, udiff, diff-fenced, editor-*) optimized for different LLM capabilities and use cases.
- Architect/Editor Mode: Separates high-level reasoning (architect) from detailed code editing (editor), allowing optimal model pairing (e.g., o1 + GPT-4o).
- Context Management: Dynamically adjusts repo_map size based on chat state, expanding when no files are in context, contracting when specific files are being edited.
- Git Integration: Automatic commits, diff management, and change tracking through git integration.
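To make the ranking-plus-budget idea concrete, here is a simplified Python sketch. It is not Aider's implementation: it uses a plain PageRank-style power iteration over a hypothetical file reference graph, and the file names, token costs, and the `rank_files` and `select_within_budget` helpers are assumptions made for illustration.

```python
def rank_files(references, damping=0.85, iterations=50):
    """PageRank-style scores for files; references maps file -> files it references."""
    files = sorted(set(references) | {t for ts in references.values() for t in ts})
    score = {f: 1.0 / len(files) for f in files}
    for _ in range(iterations):
        new = {f: (1.0 - damping) / len(files) for f in files}
        for source in files:
            # Files that reference nothing spread their score evenly over all files.
            targets = references.get(source) or files
            share = damping * score[source] / len(targets)
            for target in targets:
                new[target] += share
        score = new
    return score


def select_within_budget(score, token_cost, budget):
    """Take files in descending score order, skipping any that would exceed the budget."""
    chosen, used = [], 0
    for name in sorted(score, key=score.get, reverse=True):
        if used + token_cost[name] <= budget:
            chosen.append(name)
            used += token_cost[name]
    return chosen


# Hypothetical repository: each file lists the files it references.
references = {
    "api.py": ["models.py", "utils.py"],
    "cli.py": ["api.py", "utils.py"],
    "models.py": ["utils.py"],
}
token_cost = {"api.py": 400, "cli.py": 300, "models.py": 500, "utils.py": 350}

scores = rank_files(references)
selected = select_within_budget(scores, token_cost, budget=1000)
```

In this toy graph the most widely referenced file ranks highest and is selected first; files that no longer fit in the remaining budget are skipped rather than truncated.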
Jac OSP Value Proposition#
Where Jac's Object-Spatial Programming can enhance this proven architecture:
- Richer Graph Representation: While Aider uses file-level dependency graphs for ranking, OSP can represent finer-grained relationships (function calls, variable usage, type dependencies) as first-class spatial relationships.
- Advanced Traversal Patterns: Jac walkers can implement sophisticated traversal algorithms for context gathering that go beyond simple graph ranking.
- Multi-File Change Coordination: OSP's spatial relationships can help ensure consistency across complex refactoring that spans multiple files.
- Query-Based Context Selection: Instead of purely algorithmic ranking, enable semantic queries like "find all functions that handle user authentication" through spatial traversal.
- Incremental Graph Updates: Maintain a live graph representation that updates as code changes, enabling more sophisticated change impact analysis.
Proposed OSP Mode Implementation#
Building upon Aider's proven architecture, we propose specific enhancements using Jac's object-spatial programming:
- Enhanced Repository Graph with OSP:
  - Extend Aider's existing tree-sitter based repo_map with Jac's spatial graph capabilities.
  - Create a dynamic, queryable codebase graph where nodes represent code entities (files, classes, functions, variables) and edges represent relationships (calls, imports, dependencies, inheritance).
  - Use Jac's native graph traversal to optimize the current graph ranking algorithm that selects relevant portions of the repository map.
- Intelligent Context Retrieval with Spatial Walkers:
  - Implement specialized walkers that can traverse the codebase graph to gather contextually relevant information.
  - ContextGatheringWalker - navigates from a starting point to collect related code entities based on semantic distance and dependency relationships.
  - ImpactAnalysisWalker - determines which files/functions would be affected by proposed changes.
  - DependencyWalker - maps function call chains and import dependencies for better context understanding.
- byLLM-Optimized Architect and Editor Pipeline:
  - Enhance Aider's existing architect/editor mode with OSP-optimized prompting.
  - ArchitectLLM - analyzes user requests and proposes high-level changes using the spatial graph context.
  - EditorLLM - translates architectural decisions into specific code edits, optimized for Aider's existing edit formats (diff, whole, udiff).
  - Use Jac's by <llm> syntax to create specialized abilities for different aspects of code understanding and generation.
- Enhanced Token Budget Management:
  - Implement dynamic token allocation using spatial graph analysis.
  - TokenBudgetOptimizer - walker that prioritizes which parts of the repo_map to include based on relevance scores derived from graph centrality and user context.
  - Smart context windowing that expands/contracts based on the complexity of the requested changes.
- Multi-File Change Coordination:
  - ChangeCoordinator - walker that ensures consistency across multi-file edits.
  - Implements change propagation logic to maintain code integrity across file boundaries.
  - Validates that changes in one file don't break contracts expected by dependent files.
- OSP-Enhanced Mode:
  - Introduce a new /genius mode that leverages full OSP capabilities.
  - This mode builds and maintains a live graph of the codebase, updating it as changes are made.
  - Provides advanced querying capabilities like "find all functions that depend on this API" or "show me the call chain for this error path".
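As one illustration of the coordination idea, the check that an edit does not break what dependents expect can be reduced to comparing call edges against the symbols that still exist after the edit. The sketch below is a Python approximation under that assumption; `find_broken_callers` and the payment-related symbol names are hypothetical, and a real ChangeCoordinator walker would also need to compare signatures and types.

```python
def find_broken_callers(calls, definitions_after_edit):
    """Return (caller, callee) pairs whose callee no longer exists after an edit.

    calls maps a caller symbol to the symbols it calls; definitions_after_edit
    is the set of symbols still defined once the proposed change is applied.
    """
    broken = []
    for caller, callees in sorted(calls.items()):
        if caller not in definitions_after_edit:
            continue  # the caller itself was removed, so there is nothing to report
        for callee in callees:
            if callee not in definitions_after_edit:
                broken.append((caller, callee))
    return broken


# Hypothetical call graph before the edit.
calls = {
    "checkout": ["charge_card", "send_receipt"],
    "refund": ["charge_card"],
    "charge_card": [],
    "send_receipt": [],
}

# Proposed edit: rename charge_card to charge_payment without touching its callers.
definitions_after_edit = {"checkout", "refund", "charge_payment", "send_receipt"}

broken = find_broken_callers(calls, definitions_after_edit)
```

Each reported pair names a call site that the edit would have to update, which is the kind of signal a coordinating walker could feed back into the editing loop.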
Key OSP Optimizations for Aider's Core Components#
1. Repository Map Enhancement#
Proposed OSP Mode Implementation#
Building upon Aider's existing chat modes (code, architect, ask, help), introduce a new /genius mode.
Integration with Existing Aider Commands#
Extend existing commands with spatial awareness:
# Enhanced /add command with spatial context
/add --spatial file.py # Automatically add related files based on dependencies
# Enhanced repository understanding
/genius find "error handling" # Find all error handling code
/genius trace payment_flow # Trace payment processing dependencies
/genius impact "database schema change" # Analyze impact of schema changes
# Context expansion based on spatial relationships
/expand --spatial # Add files related to current context via graph traversal
Expected Benefits and Improvements#
1. Better Multi-File Editing#
- Current Issue: Aider sometimes misses dependent files that need updates when making complex changes
- OSP Solution: Spatial traversal ensures all related code is identified and considered
- Example: When refactoring an API, automatically identify all client code that needs updates
2. Optimized Token Usage#
- Current Issue: Fixed token budgets may include irrelevant code or miss important context
- OSP Solution: Dynamic context selection based on spatial relevance and user intent
- Example: For authentication changes, prioritize auth-related code over unrelated utilities
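One way to picture dynamic context selection is to score each file by its graph distance from the files being edited and then fill the budget greedily by relevance per token. This Python sketch assumes that scoring rule; the decay factor, the file names, and the `hop_distances` and `allocate_context` helpers are illustrative assumptions, not an existing Aider or Jac API.

```python
from collections import deque


def hop_distances(neighbors, seeds):
    """Breadth-first hop count from the seed files to every reachable file."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, []):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


def allocate_context(neighbors, seeds, token_cost, budget, decay=0.5):
    """Greedily pick files by relevance per token, where relevance = decay ** hops."""
    distance = hop_distances(neighbors, seeds)
    density = {f: (decay ** d) / token_cost[f] for f, d in distance.items()}
    chosen, used = [], 0
    for name in sorted(density, key=lambda f: (-density[f], f)):
        if used + token_cost[name] <= budget:
            chosen.append(name)
            used += token_cost[name]
    return chosen


# Hypothetical dependency edges; report.py and charts.py are unrelated to auth.
neighbors = {
    "auth.py": ["session.py", "users.py"],
    "session.py": ["crypto.py"],
    "report.py": ["charts.py"],
}
token_cost = {
    "auth.py": 200, "session.py": 300, "users.py": 250,
    "crypto.py": 400, "report.py": 150, "charts.py": 100,
}

context = allocate_context(neighbors, ["auth.py"], token_cost, budget=800)
```

Files that are unreachable from the edited file never enter the candidate set, so the unrelated reporting code costs no tokens even though it is small.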
3. Enhanced Code Understanding#
- Current Issue: Limited to file-level dependencies and tree-sitter symbol extraction
- OSP Solution: Function-level dependencies, usage patterns, and semantic relationships
- Example: Understanding that error handling functions are related even across different modules
4. Improved Change Impact Analysis#
- Current Issue: Manual identification of files that might be affected by changes
- OSP Solution: Automated impact analysis using graph traversal
- Example: Changing a database model automatically identifies all services, controllers, and tests that use it
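Impact analysis of this kind amounts to following dependency edges in reverse from the changed entity. A minimal Python sketch, assuming a file-level dependency map is already available; `impacted_by` and the file names are hypothetical.

```python
from collections import defaultdict, deque


def impacted_by(depends_on, changed):
    """Return everything that transitively depends on `changed`, nearest first."""
    dependents = defaultdict(list)
    for source, targets in depends_on.items():
        for target in targets:
            dependents[target].append(source)  # invert each dependency edge

    impacted, queue, seen = [], deque([changed]), {changed}
    while queue:
        current = queue.popleft()
        for dependent in sorted(dependents.get(current, [])):
            if dependent not in seen:
                seen.add(dependent)
                impacted.append(dependent)
                queue.append(dependent)
    return impacted


# Hypothetical project: each file lists the files it depends on.
depends_on = {
    "order_service.py": ["models/order.py"],
    "order_controller.py": ["order_service.py"],
    "test_order_service.py": ["order_service.py"],
    "billing.py": ["models/invoice.py"],
}

impacted = impacted_by(depends_on, "models/order.py")
```

The result lists the service first, then the controller and test that reach the model only through the service, while the unrelated billing module is left out.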
Performance Benchmarks and Metrics#
To validate the OSP enhancements, we would measure:
- Multi-File Change Success Rate: Percentage of complex changes that don't require manual fixes
- Context Relevance Score: How often included context is actually used in generated changes
- Token Efficiency: Amount of relevant context per token in the budget
- Change Completeness: Percentage of dependent changes automatically identified
- User Satisfaction: Reduced number of iterations needed to complete complex tasks
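Several of these metrics can be written as simple set ratios. The definitions below are one possible formalization, not an established benchmark; the function names and sample data are assumptions for illustration.

```python
def context_relevance(included, used):
    """Fraction of included context items that the generated change actually used."""
    included, used = set(included), set(used)
    return len(included & used) / len(included) if included else 0.0


def change_completeness(identified, required):
    """Fraction of files that truly needed changes which were identified automatically."""
    identified, required = set(identified), set(required)
    return len(identified & required) / len(required) if required else 1.0


def token_efficiency(token_cost, used):
    """Share of the spent tokens that went to context the change actually used."""
    total = sum(token_cost.values())
    relevant = sum(cost for name, cost in token_cost.items() if name in used)
    return relevant / total if total else 0.0


# Hypothetical outcome of a single editing session.
included = ["auth.py", "session.py", "utils.py", "report.py"]
used = ["auth.py", "session.py"]
required = ["auth.py", "session.py", "users.py"]
token_cost = {"auth.py": 200, "session.py": 300, "utils.py": 250, "report.py": 250}

relevance = context_relevance(included, used)
completeness = change_completeness(used, required)
efficiency = token_efficiency(token_cost, used)
```

Context relevance behaves like precision and change completeness like recall, so reporting both guards against a system that scores well on one by sacrificing the other.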
Practical Implementation Strategy#
Phase 1: Proof of Concept#
# Basic spatial graph representation
node CodeFile {
    has path: str;
    has content: str;
    has symbols: list[str];
    has imports: list[str];
}
edge FileRelation {
    has relation_type: str;  # imports, calls, references
}
walker BasicSpatialMapper {
    can build_file_graph(repo_path: str) -> 'SpatialGraph';
    can find_related_files(target_file: str) -> list[str];
}
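For prototyping before any Jac integration exists, the two mapper operations declared above can be approximated in plain Python using the standard library's ast module. This sketch is an assumption-laden stand-in, not the proposed Jac walker: it handles Python sources only, takes an in-memory path-to-source mapping instead of a repo_path so that it stays self-contained, resolves absolute imports only, and assumes every path ends in ".py".

```python
import ast


def extract_imports(source):
    """Return the module names imported by a piece of Python source code."""
    modules = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.append(node.module)
            # "from pkg import mod" may refer to the module pkg.mod.
            modules.extend(f"{node.module}.{alias.name}" for alias in node.names)
    return sorted(set(modules))


def build_file_graph(sources):
    """Map each file to the other files in `sources` that it imports."""
    # Strip the ".py" suffix and turn path separators into dots.
    module_to_path = {path[:-3].replace("/", "."): path for path in sources}
    return {
        path: [module_to_path[m] for m in extract_imports(text) if m in module_to_path]
        for path, text in sources.items()
    }


def find_related_files(graph, target_file):
    """Files that the target imports plus files that import the target."""
    imported = set(graph.get(target_file, []))
    importers = {path for path, deps in graph.items() if target_file in deps}
    return sorted(imported | importers)


# Hypothetical three-file project held in memory.
sources = {
    "app.py": "import config\nfrom services import billing\n",
    "config.py": "import os\n",
    "services/billing.py": "import config\n",
}

file_graph = build_file_graph(sources)
related = find_related_files(file_graph, "config.py")
```

Imports that do not resolve to a file in the mapping, such as standard library modules, are dropped, so the graph only contains edges between files of the project itself.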
Phase 2: Integration with Aider#
- Create Jac plugin that integrates with Aider's existing repo_map generation
- Implement basic spatial queries for context enhancement
- Add /genius command support to Aider's CLI
Phase 3: byLLM Optimization#
- Implement LLM-powered context selection and summarization
- Enhance architect/editor pipeline with spatial context
- Add intelligent token budget allocation
Phase 4: Advanced Features#
- Multi-file change coordination
- Impact analysis and change propagation
- Performance optimization and caching
Getting Started#
For contributors interested in this project:
- Understand Aider's Architecture: Study the repo_map implementation, graph ranking algorithms, and architect/editor modes
- Learn Jac OSP: Familiarize yourself with Jac's spatial programming concepts, walkers, and graph traversal
- Experiment with byLLM: Practice using Jac's by <llm> features for code analysis tasks
- Start Small: Begin with a simple spatial graph representation of a small codebase
- Benchmark Early: Establish baseline measurements for context relevance and change success rates
Advantages of the OSP-Enhanced Approach#
- Builds on Proven Architecture: Leverages Aider's successful tree-sitter based repo_map and graph ranking algorithms while enhancing them with OSP capabilities.
- Optimized Token Management: Smart context selection using spatial graph analysis ensures maximum relevance within token budgets.
- Multi-File Coherence: OSP's spatial relationships help maintain consistency across complex multi-file changes.
- Scalable Architecture: A graph-based approach can restrict traversal to the neighborhood of the code being edited instead of scanning the repository linearly, which should scale better as repositories grow.
- Enhanced Architect/Editor Pipeline: byLLM optimization of Aider's existing two-stage approach for better reasoning and editing separation.
- Advanced Query Capabilities: Spatial queries enable sophisticated code understanding ("find all callers", "trace data flow", "impact analysis").
- Extensible Framework: New walkers and capabilities can be easily added for different types of code analysis and modification patterns.
This project would demonstrate Jac's capabilities in enhancing existing AI-powered development tools while providing significant improvements to multi-file editing, context management, and repository understanding. The OSP approach offers a principled way to represent and navigate complex codebases that could benefit the broader AI-assisted development ecosystem.
References and Further Reading#
- Aider Documentation - Understanding current architecture and capabilities
- Repository Map with Tree-sitter - Aider's approach to code context
- Architect/Editor Mode - Separating reasoning from editing
- Jac Documentation - Object-Spatial Programming concepts
- byLLM Framework - AI integration in Jac