Type Inference#
This document describes the type inference system in Jac, which allows the compiler to automatically determine the types of expressions and variables without explicit annotations.
Motivation#
Type inference provides the best of both worlds: the safety of a statically typed language with the concise syntax of a dynamically typed language. In Jac, type inference helps:
- Reduce verbosity and boilerplate code
- Make code more readable while maintaining type safety
- Avoid redundant type annotations when types are obvious
Basic Principles#
Type inference in Jac follows these core principles:
- Local Inference: Types are inferred from local context, like initialization expressions
- Bidirectional Propagation: Types can flow both from expressions to variables and from expected contexts to expressions
- Progressive Refinement: Types are refined as more information becomes available during compilation
- Defaulting Rules: When inference is ambiguous, sensible defaults are chosen
Inference Mechanisms#
graph TD
A[Type Inference Sources] --> B[Initialization Inference]
A --> C[Contextual Inference]
A --> D[Flow-sensitive Inference]
A --> E[Return Type Inference]
B --> B1["x = 42 ➝ x inferred as int"]
C --> C1["func(x) where func expects int ➝ constrains x to int"]
D --> D1["if x: ... ➝ constrains x to be bool-convertible"]
E --> E1["return x ➝ constraints x to match function return type"]
Initialization Inference#
The simplest form of inference occurs for local variables with initializers:
x = 42; // x is inferred as int
y = 3.14; // y is inferred as float
z = "hello"; // z is inferred as str
items = []; // items is inferred as list[any]
For collections, the element type is inferred from the elements:
nums = [1, 2, 3]; // nums is inferred as list[int]
mixed = [1, "two", 3.0]; // mixed is inferred as list[int|str|float]
pairs = {"a": 1, "b": 2}; // pairs is inferred as dict[str, int]
Contextual Inference#
Types can be inferred from how variables are used:
def process(items: list[int]) -> int:
return sum(items);
// x is inferred to be list[int] from the context of the function call
x = [];
process(x); // Constrains x to list[int]
Flow-sensitive Inference#
Types can be refined based on control flow:
x = get_value(); // Initial type might be Any or a union type
if isinstance(x, str):
// Within this block, x is known to be a str
print(x.upper());
else:
// Here, x's type excludes str
// ...
Return Type Inference#
For functions without explicit return types, the return type is inferred:
The Inference Algorithm#
Type inference works in several phases:
1. Collection Phase#
In this phase, the compiler: - Identifies all variables and expressions - Records explicit type annotations - Creates placeholder types for variables without annotations - Generates type variables for generic type parameters
2. Constraint Generation Phase#
The compiler analyzes expressions and statements to generate constraints:
flowchart TD
A[AST Walk] --> B[Generate Type Variables]
B --> C[Generate Constraints]
C --> D[Constraint Solving]
C --> C1["Assignment: lhs = rhs ➝ rhs <: lhs"]
C --> C2["BinaryOp: a + b ➝ Constraints on a, b, result"]
C --> C3["FunctionCall: f(a, b) ➝ a <: param1, b <: param2"]
C --> C4["Return: return e ➝ e <: func_return_type"]
Example constraints:
- Assignment: x = e
generates typeof(e) <: typeof(x)
- Binary operation: a + b
generates type constraints based on operation rules
- Function call: f(a, b)
generates typeof(a) <: typeof(param1)
, typeof(b) <: typeof(param2)
3. Constraint Solving Phase#
The constraint solver attempts to find the most specific types that satisfy all constraints. This typically involves:
- Constructing a constraint graph
- Applying constraint propagation
- Resolving type variables to concrete types
- Defaulting remaining ambiguous types
# Simplified constraint solving algorithm
def solve_constraints(constraints, type_vars):
changed = True
while changed:
changed = False
for constraint in constraints:
# Apply constraint propagation rules
result = apply_constraint(constraint, type_vars)
if result:
changed = True
# Apply defaulting rules for any remaining ambiguous types
for var in type_vars:
if is_ambiguous(var):
type_vars[var] = default_type_for(var)
4. Type Application Phase#
Finally, the inferred types are attached to AST nodes and made available to the type checker and code generator.
Handling Special Cases#
Type Inference for Containers#
For containers, inference considers both the variable declaration and element types:
// Inferred as list[int]
x = [1, 2, 3];
// Inferred as list[int|str]
y = [1, "two", 3];
// If z is annotated, this constrains the elements:
z: list[float] = []; // z must contain floats
Generic Function Type Inference#
For generic functions, type parameters are inferred from arguments:
can first_item<T>(items: list[T]) -> T {
return items[0];
}
// T inferred as int
x = first_item([1, 2, 3]);
// T inferred as str
y = first_item(["a", "b", "c"]);
Inference with Union Types#
Union types are inferred when needed:
Bidirectional Type Checking#
Jac uses bidirectional type checking, which combines:
- Type inference: Determining the type of an expression bottom-up
- Type checking: Verifying an expression against an expected type top-down
graph TD
A[Bidirectional Type Checking] --> B[Inference Mode]
A --> C[Checking Mode]
B --> B1["Infer expression type from its parts"]
B --> B2["Bottom-up: e ⇒ T"]
C --> C1["Check if expression matches expected type"]
C --> C2["Top-down: Γ ⊢ e : T"]
D[Expression] --> B
C --> D
Example:
// In checking mode: x's type known to be int
x: int = ...;
// In inference mode: expression's type calculated
calc_result = complex_expression();
// Bidirectional: x's type constrains the expression
x = some_expression();
Handling Ambiguity#
When type inference is ambiguous, Jac applies defaulting rules:
- Empty collections default to element type
any
- Numeric literals without context default to
int
- When multiple valid types exist, the most specific common supertype is chosen
Interaction with Gradual Typing#
Jac supports gradual typing, where some parts of the program have explicit types while others rely on inference:
// Explicitly typed
x: int = 42;
// Inferred
y = x + 10;
// Partially explicit (list of ints)
z: list[int] = [];
Error Handling in Type Inference#
When inference encounters conflicting constraints, it produces clear error messages:
test.jac:10:12 - Type Error
Cannot infer consistent type for variable 'x'
Conflicting constraints:
- x must be int (from assignment at line 5)
- x must be str (from function call at line 10)
Performance Considerations#
Type inference can be computationally expensive. To maintain good compiler performance:
- Incremental Inference: Only re-infer types for code that has changed
- Caching: Cache inference results for common expressions
- Early Stopping: Stop inference when sufficient information is available
- Constraint Simplification: Simplify constraint sets before solving