Lexer Tokens

Code Example

Runnable Example in Jac and JacLib

# Lexer tokens - Builtin type keywords

with entry {
    # These lexer tokens are used as type annotations
    # They are keywords that represent builtin types

    x: str = "string";
    y: int = 42;
    z: float = 3.14;
    lst: list = [1, 2, 3];
    tup: tuple = (1, 2);
    s: set = {1, 2};
    d: dict = {"key": "value"};
    b: bool = True;

    print(x, y, z, lst, tup, s, d, b);
    # Note: These are tokenized specially so they can be used as types
    # See builtin_types.jac for more comprehensive type usage examples
}
from __future__ import annotations
from jaclang.runtimelib.builtin import *
x: str = 'string'
y: int = 42
z: float = 3.14
lst: list = [1, 2, 3]
tup: tuple = (1, 2)
s: set = {1, 2}
d: dict = {'key': 'value'}
b: bool = True
print(x, y, z, lst, tup, s, d, b)
Jac Grammar Snippet
TYP_STRING: "str"
TYP_INT: "int"
TYP_FLOAT: "float"
TYP_LIST: "list"
TYP_TUPLE: "tuple"
TYP_SET: "set"
TYP_DICT: "dict"
TYP_BOOL: "bool"
TYP_BYTES: "bytes"
TYP_ANY: "any"
TYP_TYPE: "type"

// Keywords ---------------------------------------------------------------- //

KW_LET: "let"
KW_ABSTRACT: "abs"
KW_CLASS: "class"
KW_OBJECT: "obj"
KW_ENUM: "enum"
KW_NODE: "node"
KW_VISIT: "visit"
KW_SPAWN: "spawn"
KW_WITH: "with"
KW_LAMBDA: "lambda"
KW_ENTRY: "entry"
KW_EXIT: "exit"
KW_IMPORT: "import"
KW_INCLUDE: "include"
KW_FROM: "from"
KW_AS: "as"
KW_EDGE: "edge"
KW_WALKER: "walker"
KW_ASYNC: "async"
KW_AWAIT: "await"
KW_FLOW: "flow"
KW_WAIT: "wait"
KW_TEST: "test"
KW_IMPL: "impl"
KW_SEM: "sem"
KW_ASSERT: "assert"
KW_IF: "if"
KW_ELIF: "elif"
KW_ELSE: "else"
KW_FOR: "for"
KW_TO: "to"
KW_BY: "by"
KW_WHILE: "while"
KW_CONTINUE: "continue"
KW_BREAK: "break"
KW_DISENGAGE: "disengage"
KW_YIELD: "yield"
KW_SKIP: "skip"
KW_REPORT: "report"
KW_RETURN: "return"
KW_DELETE: "del"
KW_TRY: "try"
KW_EXCEPT: "except"
KW_FINALLY: "finally"
KW_RAISE: "raise"
KW_IN: "in"
KW_IS: "is"
KW_PRIV: "priv"
KW_PUB: "pub"
KW_PROT: "protect"
KW_HAS: "has"
KW_GLOBAL: "glob"
KW_CAN: "can"
KW_DEF: "def"
KW_STATIC: "static"
KW_OVERRIDE: "override"
KW_MATCH: "match"
KW_CASE: "case"

KW_INIT: "init"
KW_POST_INIT: "postinit"

KW_HERE: "here"
KW_VISITOR: "visitor"
KW_SELF: "self"
KW_SUPER: "super"
KW_ROOT: "root"

KW_NIN.1: /\bnot\s+in\b/
KW_ISN.1: /\bis\s+not\b/
KW_AND.1: /&&|and/
KW_OR.1:  /\|\||or/
NOT: "not" // TODO:AST: Rename to KW_NOT

// Literals ---------------------------------------------------------------- //

STRING: /(r?b?|b?r?)("[^"\r\n]*"|'[^'\r\n]*')/
       | /(r?b?|b?r?)("""(.|\r|\n)*?"""|'''(.|\r|\n)*?''')/

NULL.1: "None"
BOOL.1: /True|False/
FLOAT: /(\d+(\.\d*)|\.\d+)([eE][+-]?\d+)?|\d+([eE][-+]?\d+)/
HEX.1: /0[xX][0-9a-fA-F_]+/
BIN.1: /0[bB][01_]+/
OCT.1: /0[oO][0-7_]+/
INT: /[0-9][0-9_]*/


// Identifier -------------------------------------------------------------- //

KWESC_NAME: /<>[a-zA-Z_][a-zA-Z0-9_]*/
NAME: /[a-zA-Z_][a-zA-Z0-9_]*/


// Object-Spatial Operators -------------------------------------------------- //

ARROW_BI: "<-->"
ARROW_L: "<--"
ARROW_R: "-->"
ARROW_L_P1: "<-:"
ARROW_R_P2: ":->"
ARROW_L_P2: ":<-"
ARROW_R_P1: "->:"
CARROW_BI: "<++>"
CARROW_L: "<++"
CARROW_R: "++>"
CARROW_L_P1: "<+:"
CARROW_R_P2: ":+>"
CARROW_L_P2: ":<+"
CARROW_R_P1: "+>:"


// Assignment Operator ----------------------------------------------------- //

EQ: "="
WALRUS_EQ: ":="

ADD_EQ: "+="
SUB_EQ: "-="
MUL_EQ: "*="
DIV_EQ: "/="
MOD_EQ: "%="
MATMUL_EQ: "@="
STAR_POW_EQ: "**="
FLOOR_DIV_EQ: "//="

BW_AND_EQ: "&="
BW_OR_EQ: "|="
BW_XOR_EQ: "^="
LSHIFT_EQ: "<<="
RSHIFT_EQ: ">>="


// Arithmetic -------------------------------------------------------------- //

EE: "=="
LT: "<"
GT: ">"
LTE: "<="
GTE: ">="
NE: "!="

PLUS: "+"
MINUS: "-"
STAR_MUL: "*"
DIV: "/"
MOD: "%"
STAR_POW: "**"
FLOOR_DIV: "//"
DECOR_OP: "@"

BW_AND: "&"
BW_OR: "|"
BW_XOR: "^"
BW_NOT: "~"
LSHIFT: "<<"
RSHIFT: ">>"

// Other Operator ---------------------------------------------------------- //

A_PIPE_FWD: ":>"
A_PIPE_BKWD: "<:"
PIPE_FWD: "|>"
PIPE_BKWD: "<|"
DOT_FWD: ".>"
DOT_BKWD: "<."


// ************************************************************************* //
// Comments and Whitespace                                                   //
// ************************************************************************* //

COMMENT: /#\*(.|\n|\r)*?\*#|#.*/
WS.-2: /[ \t\f\r\n]/+
%ignore COMMENT
%ignore WS

Description

Builtin type keywords are special tokens in Jac that represent fundamental data types. These keywords are recognized by the lexer and can be used both as type annotations and as runtime type objects.

What are Builtin Type Keywords?

When you write code in Jac, the lexer (the part of the compiler that reads your code) recognizes certain words as special type keywords. These keywords represent the basic building blocks of data in your programs.
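
To make the idea concrete, here is a minimal sketch of how a lexer might tell type keywords apart from ordinary identifiers. It is written in Python and is not Jac's actual lexer: the token names and the identifier pattern come from the grammar snippet above, while `classify_words` and the keyword table are hypothetical helpers introduced only for illustration.

```python
import re

# Token names taken from the grammar snippet; the lookup table itself is a toy.
TYPE_KEYWORDS = {
    "str": "TYP_STRING",
    "int": "TYP_INT",
    "float": "TYP_FLOAT",
    "list": "TYP_LIST",
    "tuple": "TYP_TUPLE",
    "set": "TYP_SET",
    "dict": "TYP_DICT",
    "bool": "TYP_BOOL",
    "bytes": "TYP_BYTES",
    "any": "TYP_ANY",
    "type": "TYP_TYPE",
}

# Same identifier pattern as the NAME terminal in the grammar snippet.
NAME_PATTERN = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def classify_words(source: str) -> list[tuple[str, str]]:
    """Return (token_name, text) pairs for every identifier-like word."""
    tokens = []
    for match in NAME_PATTERN.finditer(source):
        word = match.group()
        # A whole-word hit in the keyword table wins over the generic NAME.
        tokens.append((TYPE_KEYWORDS.get(word, "NAME"), word))
    return tokens


tokens = classify_words("count: int = total")
longer = classify_words("integer")
```

Here `int` is classified as `TYP_INT`, while `count`, `total`, and `integer` stay plain `NAME` tokens because only an exact whole-word match counts as a keyword. The sketch ignores string literals, comments, and operators, which a real lexer has to handle.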

The Eight Builtin Type Keywords

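The grammar snippet above also defines type tokens for bytes, any, and type, which this example does not use. Lines 7-14 demonstrate the other eight builtin type keywords used as type annotations: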

| Keyword | Type | Example Value | Line |
|---------|------|---------------|------|
| str | String (text) | "string" | 7 |
| int | Integer (whole number) | 42 | 8 |
| float | Floating-point (decimal) | 3.14 | 9 |
| list | List (ordered collection) | [1, 2, 3] | 10 |
| tuple | Tuple (immutable sequence) | (1, 2) | 11 |
| set | Set (unique values) | {1, 2} | 12 |
| dict | Dictionary (key-value pairs) | {"key": "value"} | 13 |
| bool | Boolean (true/false) | True | 14 |

Type Annotation Syntax

The pattern for declaring a variable with a type annotation is:

name: type = value;

For example, line 7 shows x: str = "string", which means:

- x is the variable name
- str is the type annotation (telling Jac this should be a string)
- "string" is the value being assigned

How the Lexer Treats These Keywords

Lines 17-18 explain an important detail: these keywords are "tokenized specially" by the lexer. This means the lexer gives them special treatment so they can serve two purposes:

graph TD
    A[Builtin Type Keyword] --> B[Used as Type Annotation]
    A --> C[Used as Runtime Type Object]
    B --> D["Example: x: int = 5"]
    C --> E["Example: type(x) == int"]

Purpose 1: Type Annotations

Type annotations provide compile-time type information. They tell Jac (and developers reading the code) what type of data a variable should hold:

| Declaration | What It Means |
|-------------|---------------|
| x: str | x should hold string values |
| y: int | y should hold integer values |
| z: float | z should hold floating-point values |

Purpose 2: Runtime Type Objects

The same keywords can also be used at runtime as type objects. For example, you can use them with type() checks, type conversions, or as values in your code.
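
The three runtime uses mentioned above can be sketched as follows. The example is in Python, the language of the generated JacLib listing. The variable names and the `converters` table are hypothetical, and whether every form carries over unchanged to Jac source is an assumption.

```python
x = 5

# Type check, as in the diagram's example: type(x) == int
is_int = type(x) == int

# Type conversion: calling the type object builds a new value.
parsed = int("42")
as_text = str(3.14)

# Type objects are ordinary values and can be stored in containers.
converters = {"count": int, "ratio": float, "label": str}
raw = {"count": "7", "ratio": "0.5", "label": 10}
converted = {key: converters[key](value) for key, value in raw.items()}
```

After this runs, `converted` holds an integer, a float, and a string, each produced by the type object stored under the matching key.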

Where These Keywords Appear

These builtin type keywords can be used in several contexts:

| Context | Example | Lines |
|---------|---------|-------|
| Variable declarations | x: str = "hello" | 7-14 |
| Function parameters | def greet(name: str) {...} | - |
| Return type annotations | def get_age() -> int {...} | - |
| Class attributes | has name: str | - |
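
Since the example file only covers variable declarations, here is a rough Python sketch of the other three contexts. The function bodies, the `Person` class, and the sample values are placeholders that do not appear in the source. Using a dataclass as the counterpart of a Jac `has` attribute is an assumption made for illustration.

```python
from dataclasses import dataclass


def greet(name: str) -> str:
    """Function parameter and return value both annotated with str."""
    return "Hello, " + name


def get_age() -> int:
    """Return type annotated with int; the value 30 is a placeholder."""
    return 30


@dataclass
class Person:
    # Assumed Python counterpart of Jac attributes such as `has name: str`
    name: str
    age: int


person = Person(name="Ada", age=get_age())
greeting = greet(person.name)
```

The same type names serve in all three positions, so the forms learned for variable declarations carry over unchanged.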

Complete Example Breakdown

Line 16 prints all the variables, demonstrating that:

- Each variable holds a value of its declared type
- The type annotations don't prevent the code from running
- All the builtin types work together in a single program
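
One way to see that annotations do not block execution is to look at the Python form of the program, where an annotation is not checked when the assignment runs. The sketch below shows Python behavior only. The mismatched value is a deliberately wrong, hypothetical example, and Jac's own type checking may well report such a mismatch before the code ever runs.

```python
# The annotation says int, but Python does not check it at runtime.
y: int = "not an integer"

# The variable simply holds whatever was assigned.
actual_type = type(y)
matches_annotation = isinstance(y, int)
```

Here `actual_type` is `str` and `matches_annotation` is `False`, yet no exception is raised. Catching the mismatch is the job of a type checker, not of the running program.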

Related Information

Line 18 points to builtin_types.jac for more comprehensive examples of how to use these types. This file (lexer_tokens.jac) focuses specifically on showing that these are special keywords recognized by the lexer, not just regular identifiers.

Why Special Tokenization Matters

By tokenizing these keywords specially, Jac can:

1. Provide better error messages when types are misused
2. Enable type checking and inference
3. Allow these words to be used both as types and values
4. Reserve these words so they can't be used as variable names in most contexts