Jac Compiler Error Handling Subsystem#
1. Introduction#
This document outlines the design for a new error handling subsystem for the Jac compiler. The goal is to create a more robust, maintainable, and developer-friendly system for reporting, managing, and diagnosing errors and warnings encountered during the various compilation phases (parsing, semantic analysis, code generation, etc.).
The key objectives are: - Centralization: A single source of truth for all error and warning definitions (codes, message templates). - Clarity: Consistent and informative error messages for the user. - Maintainability: Easier to add, modify, and track error types. - Extensibility: Ability to add richer diagnostic information (e.g., fix-it hints) in the future. - Tooling Support: Facilitate better integration with IDEs and other development tools.
2. Current System Overview#
The existing error handling mechanism relies on an Alert
class (jaclang.compiler.passes.transform.Alert
).
- Each
Transform
(base class for compiler passes) maintains its ownerrors_had
andwarnings_had
lists. - Passes report errors using
log_error(message_string, ...)
andlog_warning(message_string, ...)
. - These methods create
Alert
objects, storing the raw message string, code location (CodeLocInfo
), and the originating pass type. - Alerts from individual passes are also aggregated into global
errors_had
andwarnings_had
lists in theJacProgram
instance.
While functional, this system has limitations: - Error messages are decentralized (defined as raw strings within each pass). - No standardized error codes, making it harder to categorize or programmatically handle specific errors. - Less flexibility for internationalization or complex message formatting.
3. Proposed Design#
The new design introduces a centralized error registry, standardized error codes, and an enhanced error reporting structure.
3.1. Error Code Enumeration and Registry#
We will define a central enumeration for all possible compiler errors and warnings. Each member of this enumeration will represent a unique error type.
Location: jaclang.compiler.errors.JacErrorCode.py
(New File)
# jaclang.compiler.errors.JacErrorCode.py
from enum import Enum, auto
class JacErrorCode(Enum):
# Lexer/Parser Errors (L000-L999)
UNEXPECTED_TOKEN = auto() # msg: "Unexpected token '{token_value}' of type {token_type}, expected {expected_tokens}."
UNTERMINATED_STRING_LITERAL = auto() # msg: "Unterminated string literal."
INVALID_ESCAPE_SEQUENCE = auto() # msg: "Invalid escape sequence '\\{char}' in string literal."
MALFORMED_NUMERIC_LITERAL = auto() # msg: "Malformed numeric literal: {details}."
SYNTAX_ERROR_NEAR_TOKEN = auto() # msg: "Syntax error near '{token}'."
MISSING_EXPECTED_TOKEN = auto() # msg: "Missing expected token: {expected_token}."
# ... more parser errors
# Semantic Errors (S000-S999)
DUPLICATE_SYMBOL_DECL = auto() # msg: "Symbol '{name}' already declared in this scope at {original_loc}."
SYMBOL_NOT_FOUND = auto() # msg: "Symbol '{name}' not found in the current scope or accessible scopes."
TYPE_MISMATCH = auto() # msg: "Type mismatch: Expected type '{expected_type}', but got '{actual_type}' for '{construct_name}'."
INCOMPATIBLE_OPERANDS = auto() # msg: "Incompatible operands for operator '{operator}': '{left_type}' and '{right_type}'."
ASSIGN_TO_READ_ONLY = auto() # msg: "Cannot assign to read-only symbol '{name}'."
ACCESS_TO_PRIVATE_MEMBER = auto() # msg: "Cannot access private member '{member_name}' of '{owner_name}' from this context."
ACCESS_TO_PROTECTED_MEMBER = auto() # msg: "Cannot access protected member '{member_name}' of '{owner_name}' from this context."
NO_SUCH_ABILITY = auto() # msg: "Object of type '{type_name}' does not have ability '{ability_name}'."
INCORRECT_ARGUMENT_COUNT = auto() # msg: "Function/Ability '{callable_name}' expects {expected_count} arguments, but got {actual_count}."
ARGUMENT_TYPE_MISMATCH = auto() # msg: "Argument '{arg_name}' of '{callable_name}' expects type '{expected_type}', but got '{actual_type}'."
IMPORT_ERROR = auto() # msg: "Could not import module '{module_name}': {reason}."
CIRCULAR_IMPORT = auto() # msg: "Circular import detected involving module '{module_name}'."
# ... more semantic errors
# Code Generation Errors (C000-C999)
CODEGEN_UNSUPPORTED_FEATURE = auto() # msg: "Feature '{feature_name}' is not yet supported by the code generator."
BYTECODE_GENERATION_FAILED = auto() # msg: "Bytecode generation failed for '{construct_name}': {details}."
# ... more codegen errors
# Internal Compiler Errors (I000-I999)
ICE_UNEXPECTED_NODE_TYPE = auto() # msg: "Internal Compiler Error: Unexpected AST node type '{node_type}' in pass '{pass_name}'."
ICE_MISSING_SYMBOL_INFO = auto() # msg: "Internal Compiler Error: Missing symbol information for '{name}'."
# ... more internal errors
# Warnings (W000-W999)
UNUSED_VARIABLE = auto() # msg: "Variable '{name}' is declared but never used."
UNUSED_IMPORT = auto() # msg: "Import '{module_name}' is unused."
POTENTIAL_NULL_DEREFERENCE = auto() # msg: "Potential null dereference of '{variable_name}'."
DEPRECATED_FEATURE_USED = auto() # msg: "Feature '{feature_name}' is deprecated and will be removed in a future version. {suggestion}"
# ... more warnings
def __str__(self) -> str:
return self.name
Location: jaclang.compiler.errors.JacErrorStrings.py
(New File)
# jaclang.compiler.errors.JacErrorStrings.py
from jaclang.compiler.errors.JacErrorCode import JacErrorCode
# Using a dictionary to map error codes to message templates
# This allows for easier updates and potential future localization
ERROR_MESSAGE_TEMPLATES: dict[JacErrorCode, str] = {
# Lexer/Parser Errors
JacErrorCode.UNEXPECTED_TOKEN: "Unexpected token '{token_value}' of type {token_type}, expected {expected_tokens}.",
JacErrorCode.UNTERMINATED_STRING_LITERAL: "Unterminated string literal.",
JacErrorCode.INVALID_ESCAPE_SEQUENCE: "Invalid escape sequence '\\\\{char}' in string literal.",
JacErrorCode.MALFORMED_NUMERIC_LITERAL: "Malformed numeric literal: {details}.",
JacErrorCode.SYNTAX_ERROR_NEAR_TOKEN: "Syntax error near '{token}'.",
JacErrorCode.MISSING_EXPECTED_TOKEN: "Missing expected token: {expected_token}.",
# Semantic Errors
JacErrorCode.DUPLICATE_SYMBOL_DECL: "Symbol '{name}' already declared in this scope at {original_loc}.",
JacErrorCode.SYMBOL_NOT_FOUND: "Symbol '{name}' not found in the current scope or accessible scopes.",
JacErrorCode.TYPE_MISMATCH: "Type mismatch: Expected type '{expected_type}', but got '{actual_type}' for '{construct_name}'.",
JacErrorCode.INCOMPATIBLE_OPERANDS: "Incompatible operands for operator '{operator}': '{left_type}' and '{right_type}'.",
JacErrorCode.ASSIGN_TO_READ_ONLY: "Cannot assign to read-only symbol '{name}'.",
JacErrorCode.ACCESS_TO_PRIVATE_MEMBER: "Cannot access private member '{member_name}' of '{owner_name}' from this context.",
JacErrorCode.ACCESS_TO_PROTECTED_MEMBER: "Cannot access protected member '{member_name}' of '{owner_name}' from this context.",
JacErrorCode.NO_SUCH_ABILITY: "Object of type '{type_name}' does not have ability '{ability_name}'.",
JacErrorCode.INCORRECT_ARGUMENT_COUNT: "Function/Ability '{callable_name}' expects {expected_count} arguments, but got {actual_count}.",
JacErrorCode.ARGUMENT_TYPE_MISMATCH: "Argument '{arg_name}' of '{callable_name}' expects type '{expected_type}', but got '{actual_type}'.",
JacErrorCode.IMPORT_ERROR: "Could not import module '{module_name}': {reason}.",
JacErrorCode.CIRCULAR_IMPORT: "Circular import detected involving module '{module_name}'.",
# Code Generation Errors
JacErrorCode.CODEGEN_UNSUPPORTED_FEATURE: "Feature '{feature_name}' is not yet supported by the code generator.",
JacErrorCode.BYTECODE_GENERATION_FAILED: "Bytecode generation failed for '{construct_name}': {details}.",
# Internal Compiler Errors
JacErrorCode.ICE_UNEXPECTED_NODE_TYPE: "Internal Compiler Error: Unexpected AST node type '{node_type}' in pass '{pass_name}'.",
JacErrorCode.ICE_MISSING_SYMBOL_INFO: "Internal Compiler Error: Missing symbol information for '{name}'.",
# Warnings
JacErrorCode.UNUSED_VARIABLE: "Variable '{name}' is declared but never used.",
JacErrorCode.UNUSED_IMPORT: "Import '{module_name}' is unused.",
JacErrorCode.POTENTIAL_NULL_DEREFERENCE: "Potential null dereference of '{variable_name}'.",
JacErrorCode.DEPRECATED_FEATURE_USED: "Feature '{feature_name}' is deprecated and will be removed in a future version. {suggestion}",
}
3.2. Enhanced Alert Structure#
The Alert
class will be modified (or a new class, e.g., JacAlert
, will be created) to include the error code and a dictionary of contextual arguments for message formatting.
Location: jaclang.compiler.passes.transform.py
(Modify existing Alert
)
# jaclang.compiler.passes.transform.py (Modified Alert class)
# ... (other imports)
from jaclang.compiler.errors.JacErrorCode import JacErrorCode
from jaclang.compiler.errors.JacErrorStrings import ERROR_MESSAGE_TEMPLATES
# ...
class Alert:
"""Alert interface for compiler errors and warnings."""
def __init__(
self,
code: JacErrorCode,
loc: CodeLocInfo,
from_pass: Type[Transform],
args: Optional[dict[str, Any]] = None,
) -> None:
"""Initialize alert."""
self.code: JacErrorCode = code
self.loc: CodeLocInfo = loc
self.from_pass: Type[Transform] = from_pass
self.args: dict[str, Any] = args if args else {}
self.msg: str = self._format_message()
def _format_message(self) -> str:
"""Format the message using the template and arguments."""
template = ERROR_MESSAGE_TEMPLATES.get(self.code)
if not template:
return f"Internal Compiler Error: No message template for error code {self.code.name}"
try:
return template.format(**self.args)
except KeyError as e:
return f"Internal Compiler Error: Missing argument {e} for error {self.code.name} in template: '{template}'"
# __str__, __repr__, as_log, pretty_print methods remain largely the same,
# but will use self.msg (the formatted message) instead of an initial raw message.
# Example modification for __str__
def __str__(self) -> str:
"""Return string representation of alert."""
return (
f"[{self.code.name}] {self.loc.mod_path}, line {self.loc.first_line},"
f" col {self.loc.col_start}: {self.msg}"
)
def as_log(self) -> str:
"""Return the alert as a single line log."""
file_path: str = self.loc.mod_path
if file_path == "":
# There are error messages without file references.
return f"[{self.code.name}] {self.msg}"
line: int = self.loc.first_line
column: int = self.loc.col_start
return f"{file_path}:{line}:{column}: [{self.code.name}] {self.msg}"
def pretty_print(self) -> str:
"""Pretty prints the Alert to show the alert with source location."""
pretty_dump = pretty_print_source_location(
self.loc.mod_path,
self.loc.orig_src.code, # Assumes orig_src is available on CodeLocInfo
self.loc.first_line,
self.loc.pos_start,
self.loc.pos_end,
)
if pretty_dump != "":
pretty_dump = "\\n" + pretty_dump
return self.as_log() + pretty_dump
# ... (rest of Transform class)
3.3. Reporting Errors/Warnings in Passes#
Compiler passes will use the new system to report errors:
# Example usage within a Transform subclass
# from jaclang.compiler.errors.JacErrorCode import JacErrorCode
# ... inside a pass method ...
# self.log_error(
# JacErrorCode.SYMBOL_NOT_FOUND,
# args={"name": "my_var"},
# node_override=current_ast_node # Optional: if different from self.cur_node
# )
#
# self.log_warning(
# JacErrorCode.UNUSED_VARIABLE,
# args={"name": "temp_var"},
# node_override=variable_declaration_node
# )
The log_error
and log_warning
methods in Transform
will be updated:
# jaclang.compiler.passes.transform.py (Modified Transform class methods)
# ...
def log_error(
self,
code: JacErrorCode,
args: Optional[dict[str, Any]] = None,
node_override: Optional[UniNode] = None,
) -> None:
"""Pass Error."""
alrt = Alert(
code,
self.cur_node.loc if not node_override else node_override.loc,
self.__class__,
args,
)
self.errors_had.append(alrt)
self.prog.errors_had.append(alrt) # JacProgram accumulates these
self.logger.error(alrt.as_log())
def log_warning(
self,
code: JacErrorCode,
args: Optional[dict[str, Any]] = None,
node_override: Optional[UniNode] = None,
) -> None:
"""Pass Warning."""
alrt = Alert(
code,
self.cur_node.loc if not node_override else node_override.loc,
self.__class__,
args,
)
self.warnings_had.append(alrt)
self.prog.warnings_had.append(alrt) # JacProgram accumulates these
self.logger.warning(alrt.as_log())
# ...
3.4. Mermaid Diagrams#
3.4.1. Error Information Structure (Class Diagram)#
classDiagram
class JacErrorCode {
<<enumeration>>
UNEXPECTED_TOKEN
DUPLICATE_SYMBOL_DECL
SYMBOL_NOT_FOUND
TYPE_MISMATCH
UNUSED_VARIABLE
# ... other codes
}
class ERROR_MESSAGE_TEMPLATES {
<<dictionary>>
JacErrorCode: str_template
}
class Alert {
+JacErrorCode code
+CodeLocInfo loc
+Type[Transform] from_pass
+dict~str, Any~ args
+str msg // Formatted message
+__init__(code, loc, from_pass, args)
-_format_message() str
+as_log() str
+pretty_print() str
}
class CodeLocInfo {
+str mod_path
+int first_line
+int col_start
# ... other location details (pos_start, pos_end, orig_src)
}
JacErrorCode <--o Alert : has a
ERROR_MESSAGE_TEMPLATES ..> Alert : used by
CodeLocInfo <--o Alert : has a
3.4.2. Error Reporting Workflow (Sequence Diagram)#
sequenceDiagram
participant Pass as CompilerPass
participant TransformBase as Transform (Base Class)
participant AlertClass as Alert
participant ErrorRegistry as JacErrorCode / ErrorStrings
participant JacProgramInstance as JacProgram
Pass->>TransformBase: log_error(JacErrorCode.SYMBOL_NOT_FOUND, args={"name":"x"}, node)
TransformBase->>AlertClass: create Alert(code, loc, self.__class__, args)
AlertClass->>ErrorRegistry: get_template(JacErrorCode.SYMBOL_NOT_FOUND)
ErrorRegistry-->>AlertClass: return "Symbol '{name}' not found..."
AlertClass-->>TransformBase: return formatted_alert_obj
TransformBase->>TransformBase: self.errors_had.append(formatted_alert_obj)
TransformBase->>JacProgramInstance: self.prog.errors_had.append(formatted_alert_obj)
TransformBase->>Logging: self.logger.error(formatted_alert_obj.as_log())
3.4.3. Component Interaction (Component Diagram)#
graph TD
UserCode[Jac Source Code] --> CompilerNode[Jac Compiler]
subgraph Compiler
Input --> JacParser[Parser / Lexer]
JacParser --> AST[Abstract Syntax Tree]
AST --> SemanticPasses[Semantic Analysis Passes]
SemanticPasses --> IR[Intermediate Representation]
IR --> CodeGenPasses[Code Generation Passes]
CodeGenPasses --> Output[Bytecode / Other Output]
JacParser -- reports to --> ErrorSubsystem
SemanticPasses -- reports to --> ErrorSubsystem
CodeGenPasses -- reports to --> ErrorSubsystem
subgraph ErrorSubsystem [Error Handling Subsystem]
ES_Alert[Alert Class]
ES_Codes[JacErrorCode Enum]
ES_Strings[Error Message Templates]
ES_Reporting[Reporting Logic in Transform]
end
ErrorSubsystem --> JacProgram["JacProgram (Error Aggregation)"]
JacProgram --> UserOutput[User-facing Error/Warning Display]
end
ES_Codes -->|defines keys for| ES_Strings
ES_Strings -->|provides templates to| ES_Alert
ES_Codes -->|provides codes to| ES_Alert
ES_Reporting -->|creates| ES_Alert
4. Integration with Compiler Passes#
- All existing compiler passes (subclasses of
Transform
) will need to be updated to use the newlog_error
andlog_warning
signatures (i.e., passingJacErrorCode
andargs
dictionary instead of a raw string). - The
JacProgram
class will continue to aggregateAlert
objects in itserrors_had
andwarnings_had
lists. No major changes are expected here, as it already storesAlert
objects. - The
CodeLocInfo
object, which is part ofUniNode.loc
, must reliably provideorig_src.code
for thepretty_print
functionality to work correctly. This should be ensured during AST construction and transformation.
5. Benefits of the New System#
- Standardization: Error codes provide a clear, unambiguous way to identify specific issues.
- Maintainability: Error messages are centrally managed, making them easier to update, correct, or translate. Adding new errors is a structured process.
- Improved Diagnostics: Formatted messages with contextual arguments can provide more precise and helpful information to users. Error codes can be looked up in documentation.
- Reduced Redundancy: Eliminates scattered string literals for error messages across the codebase.
- Testability: Easier to write tests that check for specific error codes rather than matching fragile message strings.
- Foundation for Advanced Features: Paves the way for features like:
- More detailed error reports (e.g., with "fix-it" suggestions).
- Selective error suppression or promotion (e.g., treating certain warnings as errors).
- Better IDE integration for displaying errors.
6. Future Enhancements (Optional)#
- Error Recovery: Implement mechanisms for certain passes (especially the parser) to attempt recovery after an error to report multiple issues in one compilation run.
- Structured Diagnostic Information: Extend
Alert
orargs
to carry more structured diagnostic data beyond simple string placeholders. - Localization: The centralized string templates make it easier to add support for multiple languages for error messages.
- Documentation Links: Error messages could include URLs pointing to detailed explanations or examples for specific error codes.
- Fix-it Hints: Provide concrete suggestions or code snippets to help users fix errors.
- Severity Levels: Introduce more granular severity levels beyond just error/warning if needed.
This design provides a solid foundation for a more professional and manageable error handling system within the Jac compiler.