Symbol Table Structure#

Symbol tables in Jaclang are hierarchical data structures designed to track symbols (identifiers) and their relationships throughout the compilation process. This document explains the core components of the symbol table system and how they interact.

Core Components#

Jaclang's symbol table system consists of the following key classes:

classDiagram
    class UniScopeNode {
        +str nix_name
        +UniNode nix_owner
        +UniScopeNode parent_scope
        +list[UniScopeNode] kid_scope
        +dict[str, Symbol] names_in_scope
        +list[InheritedSymbolTable] inherited_scope
        +lookup(name, deep)
        +insert(node, access_spec, single, force_overwrite)
        +find_scope(name)
        +link_kid_scope(key_node)
        +def_insert(node, access_spec, single_decl, force_overwrite)
        +chain_def_insert(node_list)
        +use_lookup(node, sym_table)
        +chain_use_lookup(node_list)
    }

    class Symbol {
        +list[NameAtom] defn
        +list[NameAtom] uses
        +SymbolAccess access
        +UniScopeNode parent_tab
        +decl
        +sym_name
        +sym_type
        +sym_dotted_name
        +fetch_sym_tab
        +add_defn(node)
        +add_use(node)
    }

    class InheritedSymbolTable {
        +UniScopeNode base_symbol_table
        +bool load_all_symbols
        +list[str] symbols
        +lookup(name, deep)
    }

    class NameAtom {
        +str value
        +Symbol sym
        +UniNode name_of
        +sym_name
        +sym_category
    }

    UniScopeNode --> Symbol : names_in_scope
    UniScopeNode --> InheritedSymbolTable : inherited_scope
    UniScopeNode --> UniScopeNode : parent_scope/kid_scope
    Symbol --> NameAtom : defn/uses
    Symbol --> UniScopeNode : parent_tab
    InheritedSymbolTable --> UniScopeNode : base_symbol_table
    NameAtom --> Symbol : sym

UniScopeNode#

UniScopeNode represents a scope in the program, such as a module, function, class, or block. Each scope has its own symbol table containing the symbols defined within it.

Key attributes:
- nix_name: Name of the scope
- nix_owner: The AST node this scope belongs to
- parent_scope: Reference to the parent scope
- kid_scope: List of child scopes
- names_in_scope: Dictionary mapping symbol names to their Symbol objects
- inherited_scope: List of inherited symbol tables

Key methods:
- lookup(name, deep): Looks up a symbol by name, optionally searching parent scopes
- insert(node, access_spec, single, force_overwrite): Inserts a symbol into the scope
- find_scope(name): Finds a child scope by name
- link_kid_scope(key_node): Links a child scope to this scope
- def_insert(node, access_spec, single_decl, force_overwrite): Inserts a symbol definition
- chain_def_insert(node_list): Inserts a chain of symbols (e.g., for member access)
- use_lookup(node, sym_table): Looks up a symbol use
- chain_use_lookup(node_list): Looks up a chain of symbol uses
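
The insertion parameters can be illustrated with a small Python sketch. It is not the Jaclang implementation: SketchScope is a hypothetical stand-in, definitions are plain strings rather than AST nodes, and the meaning given to single (report a collision) and force_overwrite (discard earlier definitions) is an assumption inferred from the parameter names.

```python
class SketchScope:
    """Hypothetical stand-in for UniScopeNode; each name maps to its definitions."""

    def __init__(self, name):
        self.name = name
        self.names_in_scope: dict[str, list[str]] = {}

    def insert(self, sym_name, defn, single=False, force_overwrite=False):
        """Record a definition and return the colliding earlier one, if any."""
        existing = self.names_in_scope.get(sym_name)
        # Assumption: `single` asks for a collision report on redefinition.
        collision = existing[-1] if single and existing else None
        if force_overwrite or existing is None:
            # Assumption: `force_overwrite` discards earlier definitions.
            self.names_in_scope[sym_name] = [defn]
        else:
            existing.append(defn)
        return collision


scope = SketchScope("module")
first = scope.insert("x", "x@line1")                  # new name
second = scope.insert("x", "x@line5")                 # plain redefinition
clash = scope.insert("x", "x@line9", single=True)     # collision is reported
after_clash = list(scope.names_in_scope["x"])
reset = scope.insert("x", "x@line12", force_overwrite=True)
```

In this sketch a collision is reported to the caller rather than raised, so a compiler pass could decide whether a redefinition is an error.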

Symbol#

Symbol represents a named entity in the program and keeps track of every definition and every use of that name.

Key attributes:
- defn: List of definition name atoms
- uses: List of use name atoms
- access: Access level of the symbol (public, private, etc.)
- parent_tab: The symbol table this symbol belongs to

Key properties:
- decl: The first definition of the symbol
- sym_name: The name of the symbol
- sym_type: The type of the symbol (var, function, class, etc.)
- sym_dotted_name: The fully qualified name of the symbol (including module path)
- fetch_sym_tab: The symbol table associated with this symbol (if applicable)
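
To make this bookkeeping concrete, here is a minimal Python sketch of a symbol collecting definitions and uses. SketchAtom and SketchSymbol are hypothetical stand-ins rather than Jaclang classes, and the line field is an assumption added only to tell occurrences apart.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchAtom:
    """Hypothetical stand-in for NameAtom: one occurrence of a name."""
    value: str
    line: int  # assumed field, not part of the documented NameAtom
    sym: Optional["SketchSymbol"] = None


class SketchSymbol:
    """Hypothetical stand-in for Symbol."""

    def __init__(self, defn: SketchAtom) -> None:
        self.defn: list[SketchAtom] = []
        self.uses: list[SketchAtom] = []
        self.add_defn(defn)

    @property
    def decl(self) -> SketchAtom:
        # The first recorded definition acts as the declaration.
        return self.defn[0]

    @property
    def sym_name(self) -> str:
        return self.decl.value

    def add_defn(self, atom: SketchAtom) -> None:
        self.defn.append(atom)
        atom.sym = self  # back-link from the atom to its symbol

    def add_use(self, atom: SketchAtom) -> None:
        self.uses.append(atom)
        atom.sym = self


first = SketchAtom("count", line=1)
sym = SketchSymbol(first)
sym.add_defn(SketchAtom("count", line=4))  # a later redefinition
read = SketchAtom("count", line=7)
sym.add_use(read)
```

Because every atom links back to its symbol, a tool holding any single occurrence of a name can reach the declaration and all other occurrences.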

InheritedSymbolTable#

InheritedSymbolTable represents a symbol table that is inherited from another scope, such as when importing symbols from a module.

Key attributes:
- base_symbol_table: The base symbol table being inherited from
- load_all_symbols: Whether all symbols should be loaded (e.g., for "import *")
- symbols: List of specific symbols to inherit

Key methods:
- lookup(name, deep): Looks up a symbol by name, respecting symbol filtering
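
The filtering can be sketched in a few lines of Python. SketchTable and SketchInherited are hypothetical stand-ins, symbols are represented by plain strings, and the deep parameter is left out for brevity; the exact behaviour of the real lookup is not shown in this document.

```python
class SketchTable:
    """Hypothetical base table holding name -> symbol entries."""

    def __init__(self, names):
        self.names_in_scope = dict(names)

    def lookup(self, name):
        return self.names_in_scope.get(name)


class SketchInherited:
    """Hypothetical stand-in for InheritedSymbolTable."""

    def __init__(self, base, load_all_symbols=False, symbols=None):
        self.base_symbol_table = base
        self.load_all_symbols = load_all_symbols
        self.symbols = list(symbols or [])

    def lookup(self, name):
        # Only names that were imported are visible through this view.
        if self.load_all_symbols or name in self.symbols:
            return self.base_symbol_table.lookup(name)
        return None


math_mod = SketchTable({"sqrt": "sym:sqrt", "pi": "sym:pi"})
selective = SketchInherited(math_mod, symbols=["sqrt"])    # selective import
wildcard = SketchInherited(math_mod, load_all_symbols=True)  # "import *"
```

Here selective exposes only sqrt, while wildcard exposes every name the base table defines.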

NameAtom#

NameAtom represents a named reference in the code, which can be either a definition or a use of a symbol.

Key attributes:
- value: The actual name string
- sym: Reference to the associated Symbol object
- name_of: The AST node this name belongs to
- sym_name: The name of the symbol
- sym_category: The category of the symbol (variable, function, class, etc.)

Symbol Table Hierarchy#

Symbol tables are organized hierarchically to reflect the scope structure of the program:

graph TD
    Module["Module SymTable"] --> GlobalScope["Global Scope"]
    Module --> Architype1["Architype SymTable"]
    Module --> Architype2["Architype SymTable"]
    Module --> Function["Function SymTable"]
    Architype1 --> Method1["Method SymTable"]
    Architype1 --> Method2["Method SymTable"]
    Function --> Block["Block SymTable"]
    Block --> IfBlock["If Block SymTable"]
    Block --> ForLoop["For Loop SymTable"]

    style Module fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style GlobalScope fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style Architype1 fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style Architype2 fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style Function fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style Method1 fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style Method2 fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style Block fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style IfBlock fill:#6b7280,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style ForLoop fill:#6b7280,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0

- The module symbol table is at the top level
- Each architype (object, node, edge, walker) has its own symbol table
- Functions, methods, and blocks have their own nested symbol tables
- Control structures like if statements and loops have their own symbol tables
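
The nesting described above can be sketched as a tree of parent and child links. SketchScope and its path helper are hypothetical; the sketch also assumes that find_scope searches direct children only, which the text does not state.

```python
class SketchScope:
    """Hypothetical stand-in for UniScopeNode, keeping only the tree links."""

    def __init__(self, name, parent_scope=None):
        self.name = name
        self.parent_scope = parent_scope
        self.kid_scope = []
        if parent_scope is not None:
            # Linking happens at construction time in this sketch.
            parent_scope.kid_scope.append(self)

    def find_scope(self, name):
        # Assumption: only direct children are searched.
        for kid in self.kid_scope:
            if kid.name == name:
                return kid
        return None

    def path(self):
        """Hypothetical helper: dotted path from the module down to this scope."""
        parts = []
        scope = self
        while scope is not None:
            parts.append(scope.name)
            scope = scope.parent_scope
        return ".".join(reversed(parts))


module = SketchScope("main")
person = SketchScope("Person", module)      # architype scope
greet = SketchScope("greet", person)        # method scope
helper = SketchScope("helper", module)      # function scope
if_block = SketchScope("if_stmt", helper)   # control-structure scope
```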

Symbol Lookup Process#

When looking up a symbol by name, the process follows these steps:

flowchart TD
    Start([Start Lookup]) --> A{Symbol in\ncurrent scope?}
    A -->|Yes| ReturnSymbol([Return Symbol])
    A -->|No| B{Check inherited\nsymbol tables}
    B -->|Found| ReturnSymbol
    B -->|Not Found| C{parent_scope\nexists and deep=True?}
    C -->|Yes| D[Move to parent scope]
    D --> A
    C -->|No| ReturnNone([Return None])

    style Start fill:#2d3748,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style A fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style B fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style C fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style D fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style ReturnSymbol fill:#38a169,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style ReturnNone fill:#f56565,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0

1. Check if the symbol exists in the current scope's names_in_scope
2. If not, check all inherited symbol tables through inherited_scope
3. If still not found and deep=True, recursively check the parent scope
4. If no symbol is found, return None
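
These steps can be sketched in Python as follows. SketchScope is a hypothetical stand-in, symbols are plain strings, and inherited tables are modelled as ordinary scopes searched without climbing their own parents, which is an assumption of this sketch.

```python
class SketchScope:
    """Hypothetical stand-in for UniScopeNode, reduced to what lookup needs."""

    def __init__(self, name, parent_scope=None):
        self.name = name
        self.parent_scope = parent_scope
        self.names_in_scope = {}
        self.inherited_scope = []  # here: plain SketchScope objects

    def lookup(self, name, deep=True):
        # Step 1: the current scope's own names.
        if name in self.names_in_scope:
            return self.names_in_scope[name]
        # Step 2: inherited tables (assumed not to climb their parents).
        for inherited in self.inherited_scope:
            found = inherited.lookup(name, deep=False)
            if found is not None:
                return found
        # Step 3: climb only when asked to and a parent exists.
        if deep and self.parent_scope is not None:
            return self.parent_scope.lookup(name, deep)
        # Step 4: nothing matched.
        return None


lib = SketchScope("lib")
lib.names_in_scope["helper"] = "lib.helper"

module = SketchScope("module")
module.names_in_scope["x"] = "module.x"
module.inherited_scope.append(lib)

func = SketchScope("func", module)
func.names_in_scope["x"] = "func.x"  # shadows module.x
```

With this setup, func.lookup("x") finds the inner definition first, and func.lookup("helper") succeeds only when deep is true, because helper is reachable solely through the parent's inherited table.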

Symbol Access Levels#

Jaclang supports different access levels for symbols:

from enum import IntEnum


class SymbolAccess(IntEnum):
    """Symbol access."""

    EXTERNAL = -1  # Used for builtins
    PUBLIC = 0
    PRIVATE = 1
    PROTECTED = 2

Access levels determine visibility of symbols across scopes and are used for enforcing access control in the language.
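
The text does not spell out the enforcement rule, so the following Python sketch assumes conventional semantics: public and external symbols are visible everywhere, private symbols only inside their owning scope, and protected symbols inside the owning scope and its subtypes. The is_accessible function and its parameters are hypothetical.

```python
from enum import IntEnum


class SymbolAccess(IntEnum):
    """Access levels, mirroring the enum shown above."""

    EXTERNAL = -1
    PUBLIC = 0
    PRIVATE = 1
    PROTECTED = 2


def is_accessible(access, same_scope, from_subtype=False):
    """Hypothetical visibility rule; the real checks may differ."""
    if access in (SymbolAccess.EXTERNAL, SymbolAccess.PUBLIC):
        return True
    if access == SymbolAccess.PRIVATE:
        return same_scope
    # PROTECTED: the owning scope or a subtype of it.
    return same_scope or from_subtype


outside_public = is_accessible(SymbolAccess.PUBLIC, same_scope=False)
outside_private = is_accessible(SymbolAccess.PRIVATE, same_scope=False)
subtype_protected = is_accessible(
    SymbolAccess.PROTECTED, same_scope=False, from_subtype=True
)
```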

Symbol Types#

Symbols are categorized into different types using the SymbolType enum:

from enum import IntEnum


class SymbolType(IntEnum):
    """Symbol type."""

    VAR = 0
    GLOBAL = 1
    HAS = 2
    PARAM = 3
    ARCHITYPE = 4
    ABILITY = 5
    ENUM = 6
    ENUM_VAL = 7
    MODULE = 8
    IMPORT = 9

This categorization tells the compiler what role each symbol plays during compilation, for example distinguishing an architype field declared with has (HAS) from a parameter (PARAM) or an ability (ABILITY).

Practical Example#

Consider the following Jac code:

node Person {
    has name: str, age: int;

    can greet {
        print("Hello, my name is " + self.name);
    }
}

walker PersonVisitor {
    can visit_person(p: Person) {
        p.greet();
    }
}

The symbol table structure would look like:

flowchart TD
    ModuleTable["Module SymTable
    - Person: Symbol(ARCHITYPE)
    - PersonVisitor: Symbol(ARCHITYPE)"]

    ModuleTable --> PersonTable["Person SymTable
    - name: Symbol(HAS)
    - age: Symbol(HAS)
    - greet: Symbol(ABILITY)"]

    ModuleTable --> VisitorTable["PersonVisitor SymTable
    - visit_person: Symbol(ABILITY)"]

    PersonTable --> GreetTable["greet SymTable
    - self: Symbol(PARAM)"]

    VisitorTable --> VisitTable["visit_person SymTable
    - p: Symbol(PARAM)"]

    style ModuleTable fill:#3182ce,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style PersonTable fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style VisitorTable fill:#805ad5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style GreetTable fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0
    style VisitTable fill:#4fd1c5,stroke:#e2e8f0,stroke-width:2px,color:#e2e8f0

Each symbol in the tables would contain:
- References to its definitions
- References to all uses throughout the code
- Information about its type and access level
- Links to its parent symbol table

This hierarchical structure enables efficient name resolution, type checking, and code generation during compilation.
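
As a rough illustration of name resolution over this example, the Python sketch below rebuilds the tables by hand and resolves member accesses such as p.greet. Everything in it is hypothetical: SketchScope, define, and resolve_member are illustration names, symbols are reduced to (symbol type, declared type) pairs, and resolving a member through the declared type of its base is an assumed strategy, not a description of chain_use_lookup.

```python
class SketchScope:
    """Hypothetical scope: names map to (symbol type, declared type or None)."""

    def __init__(self, name, parent_scope=None):
        self.name = name
        self.parent_scope = parent_scope
        self.kid_scope = {}
        self.names_in_scope = {}
        if parent_scope is not None:
            parent_scope.kid_scope[name] = self

    def define(self, name, sym_type, declared_type=None):
        self.names_in_scope[name] = (sym_type, declared_type)

    def lookup(self, name):
        """Return (owning scope, entry), climbing parents; None if absent."""
        scope = self
        while scope is not None:
            if name in scope.names_in_scope:
                return scope, scope.names_in_scope[name]
            scope = scope.parent_scope
        return None


module = SketchScope("module")
module.define("Person", "ARCHITYPE")
module.define("PersonVisitor", "ARCHITYPE")

person = SketchScope("Person", module)
person.define("name", "HAS", "str")
person.define("age", "HAS", "int")
person.define("greet", "ABILITY")

greet = SketchScope("greet", person)
greet.define("self", "PARAM", "Person")  # declared type is an assumption

visitor = SketchScope("PersonVisitor", module)
visitor.define("visit_person", "ABILITY")

visit = SketchScope("visit_person", visitor)
visit.define("p", "PARAM", "Person")


def resolve_member(scope, base, member):
    """Resolve `base.member` from `scope` via the base's declared type."""
    found = scope.lookup(base)
    if found is None:
        return None
    _, (_, declared_type) = found
    type_hit = scope.lookup(declared_type) if declared_type else None
    if type_hit is None:
        return None
    owner, _ = type_hit
    type_scope = owner.kid_scope.get(declared_type)
    if type_scope is None:
        return None
    return type_scope.names_in_scope.get(member)


p_greet = resolve_member(visit, "p", "greet")
self_name = resolve_member(greet, "self", "name")
```

Resolving p.greet walks from the visit_person scope up to the module to find Person, then down into the Person scope to find greet, which is why both the parent and child links are needed.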