The ion programming language
- Frontend: lexing + parsing + AST
- Analysis: semantic analysis, optimization (optional)
- Backend: codegen (bytecode or native code)
- Interpreter: interpreting bytecode or walking the AST
<name>  Nonterminal (rule defined elsewhere)
"text"  Terminal (literal keyword or symbol)
::=     Defines a rule
|       Choice (OR)
()      Grouping
?       Optional (0 or 1 occurrence)
*       Repetition (0 or more occurrences)
+       Repetition (1 or more occurrences)
::= (<function_decl> | <struct_decl> | <variable_decl>)*
::= "{" ()* "}"
::= ( | | )
<variable_decl> ::= "var" ":" (()? ("=" )) | (() ("=" )?) ";"
// var test: int;
// var test := 5;
// var test: int = 5;
<function_decl> ::= "fn" "(" <param_list>? ")" "->" <return_type>
<param_list> ::= ("," )*
::= ":"
<return_type> ::= | "(" <type_list> ")" | "void"
<type_list> ::= ("," )*
/* fn get_value(a: int, b: int) -> void {} */
<struct_decl> ::= "struct" "{" (<struct_member>)* "}"
<struct_member> ::= ":" ";"
<primitive_type> ::= "int" | "float" | "bool" | "string"
::= | | <if_else> | | |
::= "=" ";"
::= | <member_access> | <array_access>
// test = 4
<member_access> ::= <member_access> "." <member_access> |
<array_access> ::= ("[" "]")+
<return_stmt> ::= "return" <return_value>? ";"
<return_value> ::= | "(" ("," )* ")"
<if_stmt> ::= "if" "(" ")" ("else" )?
<while_stmt> ::= "while" "(" ")"
// └── Logical (||, &&)
//     └── Comparison (==, !=, <, >, etc.)
//         └── Additive (+, -, &, |, ^) (BinaryOp)
//             └── Multiplicative (*, /, %, <<, >>) (BinaryOp)
//                 └── Unary (+, -, !, ~, &, *)
//                     └── Primary (literals, identifiers, etc.)
::=
::= (("||" | "&&") )
::= (("==" | "!=" | "<" | "<=" | ">" | ">=") )
::= (("+" | "-" | "|" | "^") )*
::= (("*" | "/" | "%" | "<<" | ">>") )
::= ("+" | "-" | "!" | "~" | "&" | "*") |
::= | | "(" ")" | <function_call> | <member_access> | <array_access>
<function_call> ::= "(" <expression_list>? ")"
<expression_list> ::= ("," )*
::= <integer_literal> | <float_literal> | <string_literal> | <bool_literal>
<integer_literal> ::= e.g. (-1, 0, 1, 2, 3, ...)
<float_literal> ::= e.g. (-1.01, 0.00, 1.01, 2.02, 3.03, ...)
<string_literal> ::= e.g. ("Hello", "World")
<bool_literal> ::= "true" | "false"
::= e.g. (name, test, foo, bar)
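The precedence levels in the grammar above map onto a recursive-descent parser with one function per level, where each level calls the next-tighter one for its operands. The following is a minimal sketch in C under some loud assumptions: it works on raw text instead of ion tokens, handles integer literals only, evaluates directly instead of building AST nodes, and leaves out unary `&` and `*` because there is no memory model here. Every name in it (`Cursor`, `match`, `parse_logical`, `eval_expression`, and so on) is hypothetical and not taken from the ion source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical cursor over raw source text; a real front end would walk tokens.
typedef struct { const char* p; } Cursor;

static long parse_logical(Cursor* c);

// Consumes `op` if it comes next, unless it is directly followed by `reject`
// (so "|" does not swallow half of "||"). Pass 0 for no lookahead check.
static int match(Cursor* c, const char* op, char reject) {
    while (isspace((unsigned char)*c->p)) c->p++;
    size_t n = strlen(op);
    if (strncmp(c->p, op, n) != 0) return 0;
    if (reject != 0 && c->p[n] == reject) return 0;
    c->p += n;
    return 1;
}

// Primary: integer literal or parenthesized expression.
static long parse_primary(Cursor* c) {
    if (match(c, "(", 0)) {
        long v = parse_logical(c);
        int closed = match(c, ")", 0);
        assert(closed && "expected ')'");
        (void)closed;
        return v;
    }
    char* end = NULL;
    long v = strtol(c->p, &end, 10);
    assert(end != c->p && "expected integer literal");
    c->p = end;
    return v;
}

// Unary: prefix operators bind tighter than any binary operator.
static long parse_unary(Cursor* c) {
    if (match(c, "+", 0)) return +parse_unary(c);
    if (match(c, "-", 0)) return -parse_unary(c);
    if (match(c, "!", 0)) return !parse_unary(c);
    if (match(c, "~", 0)) return ~parse_unary(c);
    return parse_primary(c);
}

static long parse_multiplicative(Cursor* c) {
    long v = parse_unary(c);
    for (;;) {
        if      (match(c, "*", 0))  v *= parse_unary(c);
        else if (match(c, "/", 0))  v /= parse_unary(c);
        else if (match(c, "%", 0))  v %= parse_unary(c);
        else if (match(c, "<<", 0)) v <<= parse_unary(c);
        else if (match(c, ">>", 0)) v >>= parse_unary(c);
        else return v;
    }
}

static long parse_additive(Cursor* c) {
    long v = parse_multiplicative(c);
    for (;;) {
        if      (match(c, "+", 0))   v += parse_multiplicative(c);
        else if (match(c, "-", 0))   v -= parse_multiplicative(c);
        else if (match(c, "|", '|')) v |= parse_multiplicative(c);
        else if (match(c, "^", 0))   v ^= parse_multiplicative(c);
        else return v;
    }
}

static long parse_comparison(Cursor* c) {
    long v = parse_additive(c);
    for (;;) {
        // Two-character operators are tried before their one-character prefixes.
        if      (match(c, "==", 0)) v = (v == parse_additive(c));
        else if (match(c, "!=", 0)) v = (v != parse_additive(c));
        else if (match(c, "<=", 0)) v = (v <= parse_additive(c));
        else if (match(c, ">=", 0)) v = (v >= parse_additive(c));
        else if (match(c, "<", 0))  v = (v <  parse_additive(c));
        else if (match(c, ">", 0))  v = (v >  parse_additive(c));
        else return v;
    }
}

// Logical: loosest level; || and && share it, as in the grammar above.
static long parse_logical(Cursor* c) {
    long v = parse_comparison(c);
    for (;;) {
        if      (match(c, "||", 0)) { long r = parse_comparison(c); v = (v || r); }
        else if (match(c, "&&", 0)) { long r = parse_comparison(c); v = (v && r); }
        else return v;
    }
}

long eval_expression(const char* source) {
    Cursor c = { source };
    return parse_logical(&c);
}
```

Each binary level is a loop rather than a recursive call, which gives left associativity. This sketch always evaluates the right-hand side of `||` and `&&`, so it does not model short-circuiting, and it does not guard against division by zero.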
- Language Features
  - [] Out of order compilation (removes the need for function prototypes and forward declarations)
  - [] Parse multiple files (How do you even build the AST? Do you build it separately per file and just connect the pieces somehow?)
  - [] Defer
  - [] Functions
  - [] Structs
  - [] LHS Expression Access
  - [] Arrays
  - [] Proper copying of data when passed to a function or assigned
  - [] Typechecking
    - [] Type Inference
  - Generic Parameters
// Ultimately I think a symbol is just an IonToken, a kind, and a bool for whether it's resolved yet.
// Basically you want to record every use of a symbol and the context in which it's being used.
/*
// This is for non-declarations of course
typedef enum IonSymbolKind {
    ION_SYMBOL_IDENT,         // the symbol in question is being used as an identifier
    ION_SYMBOL_TYPE,          // the symbol in question is being used as a type
    ION_SYMBOL_FUNCTION_CALL  // the symbol in question is being used as a function call
} IonSymbolKind;

// So the last thing you have to do is just have a mapping:
//   StructDeclaration    -> ION_SYMBOL_TYPE
//   Variable Declaration -> ION_SYMBOL_IDENT
//   FunctionCall_SE      -> ION_SYMBOL_FUNCTION_CALL

typedef struct IonSymbol {
    IonSymbolKind kind;
    IonNode* decl; // NULLPTR if not resolved yet

    // This is resolved if and only if I can look up the symbol and that declaration matches the context.
    // One nice thing about the symbols is that you can only ever have one instance of a symbol because
    // we only allow one context per symbol. For example, a symbol can't be marked as having
    // a TYPE context and then later resolve to a function decl. This is not allowed.
} IonSymbol;

typedef struct IonSymbolTable {
    CKG_HashMap(CKG_StringView, IonSymbol)* symbols;
    struct IonSymbolTable* parent;
} IonSymbolTable;

// At the end of the second pass, iterate through the symbols:
IonSymbolTable table = ...;
for (int i = 0; i < table.symbols.meta.capacity; i++) {
    if (entries[i].filled && !entries[i].dead) {
        CKG_StringView str = entries[i].key;
        IonSymbol s = entries[i].value;
        if (s.decl == NULLPTR) {
            printf("Unresolved symbol: %.*s\n", (int)str.length, str.data);
            exit(-1);
        }
    }
}
*/
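The idea in the notes above (record each use with its context, let declarations fill in `decl` whenever they show up, then scan for leftovers) can be tried out without the CKG hash map. The following is a minimal sketch, assuming a fixed-size array with linear lookup, a single scope with no `parent` chain, and `const void*` standing in for `IonNode*`. The names `Symbol`, `SymbolTable`, `symbol_use`, `symbol_declare`, and `first_unresolved` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SYMBOL_IDENT, SYMBOL_TYPE, SYMBOL_FUNCTION_CALL } SymbolKind;

typedef struct {
    const char* name;
    SymbolKind  kind;  // context in which the name is used
    const void* decl;  // NULL until a matching declaration is seen
} Symbol;

#define MAX_SYMBOLS 64

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

static Symbol* symbol_find(SymbolTable* t, const char* name) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) return &t->entries[i];
    }
    return NULL;
}

// Returns the entry for `name`, creating an unresolved one on first sight.
// Returns NULL if the name was already seen in a different context.
Symbol* symbol_use(SymbolTable* t, const char* name, SymbolKind kind) {
    Symbol* s = symbol_find(t, name);
    if (s != NULL) return s->kind == kind ? s : NULL;
    assert(t->count < MAX_SYMBOLS);
    s = &t->entries[t->count++];
    s->name = name;
    s->kind = kind;
    s->decl = NULL;
    return s;
}

// Records a declaration; works whether the uses came before or after it.
// Returns 0 on a context mismatch or a redeclaration, 1 otherwise.
int symbol_declare(SymbolTable* t, const char* name, SymbolKind kind, const void* decl) {
    Symbol* s = symbol_use(t, name, kind);
    if (s == NULL || s->decl != NULL) return 0;
    s->decl = decl;
    return 1;
}

// End-of-pass check: name of the first unresolved symbol, or NULL if none remain.
const char* first_unresolved(const SymbolTable* t) {
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].decl == NULL) return t->entries[i].name;
    }
    return NULL;
}
```

Because a use and a declaration both go through `symbol_use`, the order in which the compiler encounters them does not matter, which is the property out-of-order compilation needs. Scoped lookup would add a walk up the `parent` chain on a miss.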
- Backend Features
  - [] Tree walk interpreter
  - [?] Transpile to C
  - [?] FFI to call native functions
    - [] If a function definition is marked as foreign, then a call to it requires a proc address lookup into a DLL of a C library.
      foreign fn printf(fmt: string) -> int;
  - [?] ByteCode
  - [?] LLVM
- Author Notes:
- How do we handle:
- Builtins
- Runtime Type Info (I really like the TypeAny void* and a type kind in jai)
- Syscalls/Foreign Calls
- Core/Runtime
- Imports
- Variadic Arguments
- Maybe some type concept or type union? (number -> float|int|uint)
- Spread operator on slices and arrays
- Regression testing (it needs to be easy to add ion source as a test and have that be tested)
- Zero initialization! But also all types need to be able to initialize to zero, so string needs to be null or "", etc.
- We need enums!
- We need to be able to genuinely cast to pointer types.
- Get away from the interpreter and transpile to C so pointer types are free (then we can research how to actually do them)
- function pointer calls
- typedef
- memory compatible (I have no real idea how to do this other than just looking at the type defs?)
How should for loops actually work?
- Should they only iterate over ranges?
- Should there be like two different for loops? one for ranges and one normal one?
// inclusive; got this from jai
for (i : 0..15) {}
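If the C transpile backend goes ahead, an inclusive range loop has a direct lowering, which is one argument for supporting at least the range form. The following is a small sketch of that lowering, assuming the loop variable becomes a C `long` and the bounds are compile-time integers. `emit_range_for` is a hypothetical helper name, not part of ion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Writes the C loop header for `for (<var> : <first>..<last>)` into `out`.
// The ion range is inclusive, so the C condition uses `<=`.
// Returns the number of characters written, or -1 if `out` is too small.
int emit_range_for(char* out, size_t out_size, const char* var, long first, long last) {
    assert(out != NULL && var != NULL);
    int n = snprintf(out, out_size, "for (long %s = %ld; %s <= %ld; %s++)",
                     var, first, var, last, var);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

For `for (i : 0..15) {}` this produces `for (long i = 0; i <= 15; i++)`. A real lowering would also need to decide what happens when the upper bound is the largest representable value, since `<=` never becomes false there.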
- Builtins:
// These could be function prototypes in the
// actual language. However, at the end of the day you must call a compiler builtin
// that doesn't really have the typechecking. So this is just a wrapper:
fn len(Iterable iter) -> number;
fn print(arg []number) -> number;
fn println(arg []number) -> number;
fn type_info(...) -> TypeInfo;
fn sizeof(...) -> int;
- Variadic Arguments / Spread Operator:
fn sum(nums ...number) -> number {
    accumulator: number;
    for (num : nums) {
        // maybe implicit index if you want?
        accumulator += num;
    }
    return accumulator;
}
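One way the `sum` example above could lower to C is to pass the variadic arguments as a slice (pointer plus length), so the callee is an ordinary function and the call site packs its arguments into a temporary array. This is a sketch under assumptions: `number` is mapped to `double`, and `NumberSlice` and the `SUM` macro are hypothetical names standing in for what a compiler would generate.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical lowered form of `nums ...number`: a pointer plus a length.
typedef struct {
    const double* data;
    size_t        len;
} NumberSlice;

// Lowered body of `fn sum(nums ...number) -> number`.
double sum(NumberSlice nums) {
    assert(nums.len == 0 || nums.data != NULL);
    double accumulator = 0.0; // zero-initialized, as the notes ask for
    for (size_t i = 0; i < nums.len; i++) {
        accumulator += nums.data[i];
    }
    return accumulator;
}

// Call-site packing: builds a temporary array with a C99 compound literal
// and derives the length from its size. Needs at least one argument.
#define SUM(...) \
    sum((NumberSlice){ (const double[]){ __VA_ARGS__ }, \
                       sizeof((const double[]){ __VA_ARGS__ }) / sizeof(double) })
```

With this shape, the spread operator on an existing slice is just passing that slice through unchanged, with no repacking needed.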
// Runtime type
fn do_anything(args ...any) -> void {
    for (arg : args) {
        switch (arg.type) {
            case int: {
            } break;
            case string: {
            } break;
            case float: {
            } break;
        }
    }
}
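The `any` parameter above matches the "void* and a type kind" shape mentioned in the author notes. The following is a minimal sketch of what that could look like in C, assuming three primitive kinds and `double` for ion's `float`. `TypeKind`, `Any`, and `describe` are hypothetical names, and the formatting exists only so the switch has an observable result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { TYPE_INT, TYPE_FLOAT, TYPE_STRING } TypeKind;

// Hypothetical runtime `any`: a type tag plus an untyped pointer to the value.
typedef struct {
    TypeKind    kind;
    const void* data;
} Any;

// Formats one value into `out` by switching on its runtime type tag.
// Returns the snprintf result, or -1 for an unknown tag.
int describe(Any arg, char* out, size_t out_size) {
    assert(arg.data != NULL);
    switch (arg.kind) {
    case TYPE_INT:    return snprintf(out, out_size, "int:%d", *(const int*)arg.data);
    case TYPE_FLOAT:  return snprintf(out, out_size, "float:%.2f", *(const double*)arg.data);
    case TYPE_STRING: return snprintf(out, out_size, "string:%s", (const char*)arg.data);
    }
    return -1;
}

// Lowered form of `fn do_anything(args ...any)`: prints each argument and
// returns how many had a recognized type tag.
size_t do_anything(const Any* args, size_t count) {
    size_t handled = 0;
    char line[64];
    for (size_t i = 0; i < count; i++) {
        if (describe(args[i], line, sizeof line) >= 0) {
            puts(line);
            handled++;
        }
    }
    return handled;
}
```

The tag is what makes the cast from `void*` safe, so the compiler would have to emit it at every call site that passes a value as `any`.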
// Really I need to learn more about the data segment, object files, linking/relocations