Symbol Table Building#
The symbol table building process is one of the first major steps in Jaclang's compilation pipeline. This document explains how symbol tables are constructed during the compilation process through the SymTabBuildPass
.
The SymTabBuildPass#
The SymTabBuildPass
is a compiler pass responsible for creating the initial structure of symbol tables and populating them with symbols from declarations. It traverses the Abstract Syntax Tree (AST) and creates a symbol table for each scope in the program.
flowchart TD
AST[AST from Parser] --> SymTabBuildPass
SymTabBuildPass --> ScopesCreated[Scopes Created]
SymTabBuildPass --> SymbolsInserted[Symbols Inserted]
SymTabBuildPass --> SymbolTableTree[Symbol Table Tree]
style AST fill:#2d3748,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style SymTabBuildPass fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style ScopesCreated fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style SymbolsInserted fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style SymbolTableTree fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
Key Mechanisms#
Scope Stack Management#
The SymTabBuildPass
maintains a stack of symbol tables to track the current scope during traversal:
def before_pass(self) -> None:
"""Before pass."""
self.cur_sym_tab: list[UniScopeNode] = []
def push_scope_and_link(self, key_node: uni.UniScopeNode) -> None:
"""Push scope."""
if not len(self.cur_sym_tab):
self.cur_sym_tab.append(key_node)
else:
self.cur_sym_tab.append(self.cur_scope.link_kid_scope(key_node=key_node))
def pop_scope(self) -> UniScopeNode:
"""Pop scope."""
return self.cur_sym_tab.pop()
@property
def cur_scope(self) -> UniScopeNode:
"""Return current scope."""
return self.cur_sym_tab[-1]
This stack-based approach allows the pass to: 1. Create nested scopes that mirror the program's scope structure 2. Link child scopes to their parent scopes 3. Ensure symbols are inserted into the correct scope
Scope Creation#
The pass creates a new scope for every construct that introduces a scope in the language:
flowchart TB
AST[AST Traversal] --> EnterNode[Enter Node]
EnterNode --> IsScope{Is Scope\nNode?}
IsScope -->|Yes| PushScope[Push New Scope]
IsScope -->|No| Continue[Continue Traversal]
PushScope --> ProcessNode[Process Node]
ProcessNode --> ExitNode[Exit Node]
ExitNode --> WasScope{Was Scope\nNode?}
WasScope -->|Yes| PopScope[Pop Scope]
WasScope -->|No| NextNode[Next Node]
Continue --> ExitNode
PopScope --> NextNode
NextNode --> EnterNode
style AST fill:#2d3748,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style IsScope fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style WasScope fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style PushScope fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style PopScope fill:#f56565,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style EnterNode fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style ExitNode fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style ProcessNode fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style Continue fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style NextNode fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
Scopes are created for: - Modules - Architypes (object, node, edge, walker) - Abilities (methods and functions) - Control structures (if, for, while, etc.) - Block statements - Lambda expressions - Comprehensions
For example, here's how scopes are created for architypes:
def enter_architype(self, node: uni.Architype) -> None:
self.push_scope_and_link(node)
assert node.parent_scope is not None
node.parent_scope.def_insert(node, access_spec=node, single_decl="architype")
def exit_architype(self, node: uni.Architype) -> None:
self.pop_scope()
Symbol Insertion#
As the pass traverses the AST, it inserts symbols into the current scope for various declarations:
sequenceDiagram
participant AST as AST Node
participant Pass as SymTabBuildPass
participant Scope as Current Scope
participant Symbol as Symbol
AST->>Pass: Process declaration
Pass->>Scope: def_insert(node, access_spec, single_decl)
Scope->>Scope: insert(node, access_spec, single, force_overwrite)
Scope->>Symbol: Create Symbol
Symbol->>AST: Set node.sym reference
Scope->>Scope: Update names_in_scope
style AST fill:#2d3748,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style Pass fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style Scope fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style Symbol fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
Symbols are inserted for: - Global variables - Import statements - Architype declarations - Ability (method and function) declarations - Parameters - Has variables - Enum declarations
For example, here's how global variables are processed:
def exit_global_vars(self, node: uni.GlobalVars) -> None:
for i in self.get_all_sub_nodes(node, uni.Assignment):
for j in i.target.items:
if isinstance(j, uni.AstSymbolNode):
j.sym_tab.def_insert(j, access_spec=node, single_decl="global var")
else:
self.ice("Expected name type for globabl vars")
Special Symbol Handling#
The pass also handles special symbols that have implicit definitions:
def enter_ability(self, node: uni.Ability) -> None:
self.push_scope_and_link(node)
assert node.parent_scope is not None
node.parent_scope.def_insert(node, access_spec=node, single_decl="ability")
if node.is_method:
node.sym_tab.def_insert(uni.Name.gen_stub_from_node(node, "self"))
node.sym_tab.def_insert(
uni.Name.gen_stub_from_node(
node, "super", set_name_of=node.owner_method
)
)
In this example, self
and super
are automatically inserted for methods.
Symbol Table Building Flow#
The overall flow of the symbol table building process is as follows:
graph TD
ParsedAST[Parsed AST] --> A[Begin SymTabBuildPass]
A --> B[Create Module SymTable]
B --> C{For each node\nin AST}
C -->|AST Traversal| D{Is Scope Node?}
D -->|Yes| E[Create New Scope\nLink to Parent]
D -->|No| F{Is Declaration?}
F -->|Yes| G[Insert Symbol]
F -->|No| H[Continue Traversal]
E --> I[Process Node Contents]
I --> J[Pop Scope on Exit]
G --> H
J --> H
H --> C
C -->|Done| K[Complete Symbol Table Tree]
style ParsedAST fill:#2d3748,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style A fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style B fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style C fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style D fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style E fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style F fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style G fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style H fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style I fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style J fill:#f56565,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style K fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
Symbol Resolution in SymTabBuildPass#
During this phase, resolution of symbols is not yet complete. The pass only:
- Creates the symbol table structure
- Inserts symbols from declarations
- Establishes parent-child relationships between scopes
Subsequent passes will: - Link symbol uses to their declarations (DefUsePass) - Connect symbol tables across module boundaries (SymTabLinkPass) - Validate symbol references and perform type checking
Example: Building Symbol Tables for a Simple Program#
Consider this example Jac program:
node Person {
has name: str;
can greet {
print("Hello, " + self.name);
}
}
can say_hi(p: Person) {
p.greet();
}
Here's how the SymTabBuildPass
would process this program:
- Create a module symbol table
- Encounter
node Person
→ Create a new scope, insert "Person" in the module scope - Encounter
has name
→ Insert "name" in the Person scope - Encounter
can greet
→ Create a new scope, insert "greet" in the Person scope - Inside
greet
, insert implicit "self" symbol - Exit
greet
scope → Pop scope - Exit
Person
scope → Pop scope - Encounter
can say_hi
→ Create a new scope, insert "say_hi" in the module scope - Encounter parameter
p
→ Insert "p" in the say_hi scope - Exit
say_hi
scope → Pop scope
The resulting symbol table structure:
flowchart TD
ModuleTable["Module SymTable
- Person: Symbol(ARCHITYPE)
- say_hi: Symbol(ABILITY)"]
ModuleTable --> PersonTable["Person SymTable
- name: Symbol(HAS)
- greet: Symbol(ABILITY)"]
ModuleTable --> SayHiTable["say_hi SymTable
- p: Symbol(PARAM)"]
PersonTable --> GreetTable["greet SymTable
- self: Symbol(PARAM)"]
style ModuleTable fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style PersonTable fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style SayHiTable fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
style GreetTable fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
Conclusion#
The SymTabBuildPass
is a foundational component of Jaclang's compilation pipeline. It creates the symbol table structure that subsequent passes will use for name resolution, type checking, and code generation.
Understanding this pass is essential for: - Adding new language constructs that introduce scopes - Modifying symbol handling behavior - Debugging symbol resolution issues - Implementing new semantic analyses that depend on symbol tables