LexicalAnalyser is a lexer (tokenizer) that checks whether a source file is accepted by a given deterministic finite-state automaton (DFSA). It outputs each lexeme with its corresponding token type, or reports that the source was not accepted by the given DFSA.
Because LexicalAnalyser identifies tokens by their final states, the automaton must use the following final states for each token type:
| Token | FinalState |
|---|---|
| ID | 1 |
| KEYWORD | 12 |
| ARTH OP | 2, 11 |
| REL OP | 5, 9, 10 |
| STRING | 13 |
| INT | 4 |
| REAL | 3, 8 |
| COMMENT | 14 |
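
As an illustration only (this is not the tool's own C# code), the table above can be read as a lookup from final state to token type. The names `TOKEN_BY_FINAL_STATE` and `token_type` are hypothetical, and the underscore spelling (`ARTH_OP`, `REL_OP`) follows the sample output further down rather than the table headers.

```python
# Final-state-to-token mapping copied from the table above.
# Names are illustrative; they are not taken from the LexicalAnalyser source.
TOKEN_BY_FINAL_STATE = {
    1: "ID",
    12: "KEYWORD",
    2: "ARTH_OP", 11: "ARTH_OP",
    5: "REL_OP", 9: "REL_OP", 10: "REL_OP",
    13: "STRING",
    4: "INT",
    3: "REAL", 8: "REAL",
    14: "COMMENT",
}


def token_type(final_state):
    """Return the token name for a final state, or None if the state is not listed."""
    return TOKEN_BY_FINAL_STATE.get(final_state)
```

With this mapping, a lexeme whose run ends in state 4 would be reported as `INT`, and one ending in state 3 or 8 as `REAL`.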
- Number of states
- Alphabets
- Initial state
- Final states, separated by spaces
- Language operators, separated by spaces
- Language keywords, separated by spaces
- Transitions, written as `StartState Symbol EndState` (from the 8th line to the end of the file)
- Comments, written as `# Comment` (starting from line 8)
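
The transition and comment lines described above could be read along these lines. This is a rough sketch rather than the tool's actual parser: it assumes the three fields are separated by whitespace, that states are integers (as in the sample output), and that a comment occupies a whole line starting with `#`. The function names are hypothetical.

```python
def parse_transition_line(line):
    """Parse 'StartState Symbol EndState'; return None for blank lines and '#' comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    parts = stripped.split()
    if len(parts) != 3:
        raise ValueError(f"expected 'StartState Symbol EndState', got: {line!r}")
    start, symbol, end = parts
    return int(start), symbol, int(end)


def parse_transitions(lines):
    """Build a {(state, symbol): next_state} table from the transition lines."""
    table = {}
    for line in lines:
        parsed = parse_transition_line(line)
        if parsed is not None:
            start, symbol, end = parsed
            table[(start, symbol)] = end
    return table
```

For example, `parse_transitions(["# digits", "0 1 4", "4 . 15"])` yields a table with the two entries `(0, "1") -> 4` and `(4, ".") -> 15`.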
- Simple Language Automaton: automaton file example
- Source Code: simple source code file example that is accepted by the automaton above
Please make sure that the .NET Core runtime is installed before running LexicalAnalyser; if it is not, visit https://aka.ms/dotnet-core-applaunch
merzak-x@PR3C1S10N:~$ ./LexicalAnalyser SimpleLanguageAutomaton.test SourceCode.test
Output result
merzak-x@PR3C1S10N:~$ ./LexicalAnalyser SimpleLanguageAutomaton.test SourceCode.test
Automaton [SimpleLanguageAutomaton] :
E = {0, 1, 4, 2, 15, 3, 6, 7, 8, 5, 9, 10, 11} ;
A = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, _, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, <, >, (, ), ", +, -, *, /, =} ;
q₀ = 0 ;
F = {1, 2, 3, 4, 5, 8, 9, 10, 11} ;
✓ <KEYWORD,{BEGIN}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <STRING,{some quite long string with ¢ħæræŧ€rß +°c §.-?}>
✓ <KEYWORD,{IF}>
✓ <REAL,{13.4}>
✓ <REL_OP,{>=}>
✓ <INT,{77}>
✓ <KEYWORD,{THEN}>
✓ <ID,{variable}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{-}>
✓ <INT,{98}>
✓ <KEYWORD,{ELSE}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <REAL,{17.54E^485512}>
✓ <INT,{1}>
✓ <INT,{-2}>
✓ <INT,{9}>
✓ <REL_OP,{=}>
✓ <INT,{8}>
✓ <INT,{8}>
✓ <ARTH_OP,{/}>
✓ <INT,{9}>
✓ <ID,{var}>
✓ <ARTH_OP,{*}>
✓ <INT,{-1}>
✓ <REL_OP,{<=}>
✓ <INT,{77}>
✓ <ARTH_OP,{/}>
✓ <INT,{18}>
✓ <ID,{test}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{+}>
✓ <INT,{98}>
✓ <ARTH_OP,{+}>
✓ <INT,{1}>
✓ <KEYWORD,{END}>
✓ The source file `SourceCode.test` is accepted by the automaton's described language !
Process finished with exit code 0.
Output result with -q
merzak-x@PR3C1S10N:~$ dotnet LexicalAnalyser.dll "/home/merzak-x/EMSI/C#/Projects/LexicalAnalyser/lib/examples/SimpleLanguageAutomaton.test" "/home/merzak-x/EMSI/C#/Projects/LexicalAnalyser/lib/examples/SourceCode.test" -q
✓ <KEYWORD,{BEGIN}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <STRING,{some quite long string with ¢ħæræŧ€rß +°c §.-?}>
✓ <KEYWORD,{IF}>
✓ <REAL,{13.4}>
✓ <REL_OP,{>=}>
✓ <INT,{77}>
✓ <KEYWORD,{THEN}>
✓ <ID,{variable}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{-}>
✓ <INT,{98}>
✓ <KEYWORD,{ELSE}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <REAL,{17.54E^485512}>
✓ <INT,{1}>
✓ <INT,{-2}>
✓ <INT,{9}>
✓ <REL_OP,{=}>
✓ <INT,{8}>
✓ <INT,{8}>
✓ <ARTH_OP,{/}>
✓ <INT,{9}>
✓ <ID,{var}>
✓ <ARTH_OP,{*}>
✓ <INT,{-1}>
✓ <REL_OP,{<=}>
✓ <INT,{77}>
✓ <ARTH_OP,{/}>
✓ <INT,{18}>
✓ <ID,{test}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{+}>
✓ <INT,{98}>
✓ <ARTH_OP,{+}>
✓ <INT,{1}>
✓ <KEYWORD,{END}>
✓ The source file `SourceCode.test` is accepted by the automaton's described language !
Process finished with exit code 0.
Output result with -v
merzak-x@PR3C1S10N:~$ dotnet LexicalAnalyser.dll "/home/merzak-x/EMSI/C#/Projects/LexicalAnalyser/lib/examples/SimpleLanguageAutomaton.test" "/home/merzak-x/EMSI/C#/Projects/LexicalAnalyser/lib/examples/SourceCode.test" -v
Automaton [SimpleLanguageAutomaton] :
E = {0, 1, 4, 2, 15, 3, 6, 7, 8, 5, 9, 10, 11} ;
A = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, _, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, <, >, (, ), ", +, -, *, /, =} ;
q₀ = 0 ;
F = {1, 2, 3, 4, 5, 8, 9, 10, 11} ;
Source code :
```
(* test comment *)
BEGIN
var = "some quite long string with ¢ħæræŧ€rß +°c §.-?";
IF 13.4 >= 77
THEN
variable = 99 - 98;
ELSE
var=17.54E^485512
1-2;9=8;8/9;
var*-1<=77/18;
test=99+98+1;
END
```
✓ <KEYWORD,{BEGIN}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <STRING,{some quite long string with ¢ħæræŧ€rß +°c §.-?}>
✓ <KEYWORD,{IF}>
✓ <REAL,{13.4}>
✓ <REL_OP,{>=}>
✓ <INT,{77}>
✓ <KEYWORD,{THEN}>
✓ <ID,{variable}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{-}>
✓ <INT,{98}>
✓ <KEYWORD,{ELSE}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <REAL,{17.54E^485512}>
✓ <INT,{1}>
✓ <INT,{-2}>
✓ <INT,{9}>
✓ <REL_OP,{=}>
✓ <INT,{8}>
✓ <INT,{8}>
✓ <ARTH_OP,{/}>
✓ <INT,{9}>
✓ <ID,{var}>
✓ <ARTH_OP,{*}>
✓ <INT,{-1}>
✓ <REL_OP,{<=}>
✓ <INT,{77}>
✓ <ARTH_OP,{/}>
✓ <INT,{18}>
✓ <ID,{test}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{+}>
✓ <INT,{98}>
✓ <ARTH_OP,{+}>
✓ <INT,{1}>
✓ <KEYWORD,{END}>
✓ The source file `SourceCode.test` is accepted by the automaton's described language !
Process finished with exit code 0.
Output result with -vv
merzak-x@PR3C1S10N:~$ dotnet LexicalAnalyser.dll "/home/merzak-x/EMSI/C#/Projects/LexicalAnalyser/lib/examples/SimpleLanguageAutomaton.test" "/home/merzak-x/EMSI/C#/Projects/LexicalAnalyser/lib/examples/SourceCode.test" -vv
Automaton [SimpleLanguageAutomaton] :
E = {0, 1, 4, 2, 15, 3, 6, 7, 8, 5, 9, 10, 11} ;
A = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, _, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, <, >, (, ), ", +, -, *, /, =} ;
Transitions: {
σ(0, a) = 1
σ(0, b) = 1
σ(0, c) = 1
σ(0, d) = 1
σ(0, e) = 1
σ(0, f) = 1
σ(0, g) = 1
σ(0, h) = 1
σ(0, i) = 1
σ(0, j) = 1
σ(0, k) = 1
σ(0, l) = 1
σ(0, m) = 1
σ(0, n) = 1
σ(0, o) = 1
σ(0, p) = 1
σ(0, q) = 1
σ(0, r) = 1
σ(0, s) = 1
σ(0, t) = 1
σ(0, u) = 1
σ(0, v) = 1
σ(0, w) = 1
σ(0, x) = 1
σ(0, y) = 1
σ(0, z) = 1
σ(0, _) = 1
σ(1, 0) = 1
σ(1, 1) = 1
σ(1, 2) = 1
σ(1, 3) = 1
σ(1, 4) = 1
σ(1, 5) = 1
σ(1, 6) = 1
σ(1, 7) = 1
σ(1, 8) = 1
σ(1, 9) = 1
σ(1, a) = 1
σ(1, b) = 1
σ(1, c) = 1
σ(1, d) = 1
σ(1, e) = 1
σ(1, f) = 1
σ(1, g) = 1
σ(1, h) = 1
σ(1, i) = 1
σ(1, j) = 1
σ(1, k) = 1
σ(1, l) = 1
σ(1, m) = 1
σ(1, n) = 1
σ(1, o) = 1
σ(1, p) = 1
σ(1, q) = 1
σ(1, r) = 1
σ(1, s) = 1
σ(1, t) = 1
σ(1, u) = 1
σ(1, v) = 1
σ(1, w) = 1
σ(1, x) = 1
σ(1, y) = 1
σ(1, z) = 1
σ(1, _) = 1
σ(0, 0) = 4
σ(0, 1) = 4
σ(0, 2) = 4
σ(0, 3) = 4
σ(0, 4) = 4
σ(0, 5) = 4
σ(0, 6) = 4
σ(0, 7) = 4
σ(0, 8) = 4
σ(0, 9) = 4
σ(0, -) = 2
σ(2, 0) = 4
σ(2, 1) = 4
σ(2, 2) = 4
σ(2, 3) = 4
σ(2, 4) = 4
σ(2, 5) = 4
σ(2, 6) = 4
σ(2, 7) = 4
σ(2, 8) = 4
σ(2, 9) = 4
σ(4, 0) = 4
σ(4, 1) = 4
σ(4, 2) = 4
σ(4, 3) = 4
σ(4, 4) = 4
σ(4, 5) = 4
σ(4, 6) = 4
σ(4, 7) = 4
σ(4, 8) = 4
σ(4, 9) = 4
σ(4, .) = 15
σ(15, 0) = 3
σ(15, 1) = 3
σ(15, 2) = 3
σ(15, 3) = 3
σ(15, 4) = 3
σ(15, 5) = 3
σ(15, 6) = 3
σ(15, 7) = 3
σ(15, 8) = 3
σ(15, 9) = 3
σ(3, 0) = 3
σ(3, 1) = 3
σ(3, 2) = 3
σ(3, 3) = 3
σ(3, 4) = 3
σ(3, 5) = 3
σ(3, 6) = 3
σ(3, 7) = 3
σ(3, 8) = 3
σ(3, 9) = 3
σ(3, e) = 6
σ(6, ^) = 7
σ(7, 0) = 8
σ(7, 1) = 8
σ(7, 2) = 8
σ(7, 3) = 8
σ(7, 4) = 8
σ(7, 5) = 8
σ(7, 6) = 8
σ(7, 7) = 8
σ(7, 8) = 8
σ(7, 9) = 8
σ(8, 0) = 8
σ(8, 1) = 8
σ(8, 2) = 8
σ(8, 3) = 8
σ(8, 4) = 8
σ(8, 5) = 8
σ(8, 6) = 8
σ(8, 7) = 8
σ(8, 8) = 8
σ(8, 9) = 8
σ(0, =) = 5
σ(0, <) = 9
σ(0, >) = 10
σ(9, =) = 5
σ(9, >) = 5
σ(10, =) = 5
σ(0, +) = 11
σ(0, *) = 11
σ(0, /) = 11
} ;
q₀ = 0 ;
F = {1, 2, 3, 4, 5, 8, 9, 10, 11} ;
Source code :
```
(* test comment *)
BEGIN
var = "some quite long string with ¢ħæræŧ€rß +°c §.-?";
IF 13.4 >= 77
THEN
variable = 99 - 98;
ELSE
var=17.54E^485512
1-2;9=8;8/9;
var*-1<=77/18;
test=99+98+1;
END
```
✓ <KEYWORD,{BEGIN}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <STRING,{some quite long string with ¢ħæræŧ€rß +°c §.-?}>
✓ <KEYWORD,{IF}>
✓ <REAL,{13.4}>
✓ <REL_OP,{>=}>
✓ <INT,{77}>
✓ <KEYWORD,{THEN}>
✓ <ID,{variable}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{-}>
✓ <INT,{98}>
✓ <KEYWORD,{ELSE}>
✓ <ID,{var}>
✓ <REL_OP,{=}>
✓ <REAL,{17.54E^485512}>
✓ <INT,{1}>
✓ <INT,{-2}>
✓ <INT,{9}>
✓ <REL_OP,{=}>
✓ <INT,{8}>
✓ <INT,{8}>
✓ <ARTH_OP,{/}>
✓ <INT,{9}>
✓ <ID,{var}>
✓ <ARTH_OP,{*}>
✓ <INT,{-1}>
✓ <REL_OP,{<=}>
✓ <INT,{77}>
✓ <ARTH_OP,{/}>
✓ <INT,{18}>
✓ <ID,{test}>
✓ <REL_OP,{=}>
✓ <INT,{99}>
✓ <ARTH_OP,{+}>
✓ <INT,{98}>
✓ <ARTH_OP,{+}>
✓ <INT,{1}>
✓ <KEYWORD,{END}>
✓ The source file `SourceCode.test` is accepted by the automaton's described language !
Process finished with exit code 0.
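
To make the transition listing above more concrete, here is a minimal sketch of how a single lexeme can be run through such a table. It is an illustration in Python, not the tool's implementation: it copies only the digit and `.` transitions printed by `-vv` (states 0, 4, 15 and 3) and the final-state set `F`, and the name `accepting_state` is hypothetical.

```python
DIGITS = "0123456789"

# Subset of the transitions printed with -vv (integer and real literals only).
TRANSITIONS = {}
for d in DIGITS:
    TRANSITIONS[(0, d)] = 4    # first digit of a number
    TRANSITIONS[(4, d)] = 4    # further digits of an integer
    TRANSITIONS[(15, d)] = 3   # first digit after the decimal point
    TRANSITIONS[(3, d)] = 3    # further fractional digits
TRANSITIONS[(4, ".")] = 15     # decimal point after an integer part

FINAL_STATES = {1, 2, 3, 4, 5, 8, 9, 10, 11}  # F as printed in the output above


def accepting_state(lexeme, start=0):
    """Follow the transitions for each character; return the final state reached, or None."""
    state = start
    for ch in lexeme:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return None  # no transition defined for this symbol
    return state if state in FINAL_STATES else None
```

Under these assumptions, `13.4` follows 0 → 4 → 4 → 15 → 3 and ends in state 3 (a `REAL` final state), `77` ends in state 4 (`INT`), and `13.` stops in state 15, which is not final, so it is rejected.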
LexicalAnalyser v1.4: https://github.com/MERZAK-X/LexicalAnalyser
Usage: dotnet LexicalAnalyser.dll [[Automaton] [Sourcecode]] [-v|-vv|-q] [--help]
Arguments:
Automaton Path to the Automaton's file
Source Path to the source code file to be analysed
Options:
-v, -vv Verbose level, 1 or 2 respectively, if not set 0
-q Quiet (verbose level -1), only display results
--help Display this help and exit
Examples:
./LexicalAnalyser SimpleLanguageAutomaton.test SourceCode.test -v
dotnet LexicalAnalyser.dll lib/examples/SimpleLanguageAutomaton.test lib/examples/SourceCode.test
Documentation: https://git.io/JfNf4
Copyright (C) 2020 "NUL-X"