Rholang Lexer/Parser with Diagnostic API and informative errors #1015
Description
RhoLP - RhoLang Lexer/Parser
Current state: the Interpreter/Web-Compiler uses a front-end (lexer, parser) generated automatically by BNFC. It has no diagnostic API and often generates uninformative errors.
Idea: do NOT replace the cup/jflex interpreter front-end with a hand-written one. Instead, when the cup/jflex front-end reports an error, additionally run a hand-written lexer/parser (front-end only, not a full interpreter) to produce informative errors.
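The fallback flow described above could be wired roughly as follows. This is only a sketch: `FallbackFrontEnd`, `Outcome`, and the two functional parameters are hypothetical names standing in for the generated cup/jflex front-end and the hand-written diagnostic pass; neither is an actual RChain or RhoLP API.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch of "run the hand-written front-end only after the generated one fails".
final class FallbackFrontEnd {

    /** Result of one compile attempt: success flag plus human-readable diagnostics. */
    static final class Outcome {
        final boolean ok;
        final List<String> diagnostics;

        Outcome(boolean ok, List<String> diagnostics) {
            this.ok = ok;
            this.diagnostics = diagnostics;
        }
    }

    /**
     * @param generatedFrontEnd stand-in for the cup/jflex front-end; assumed to throw on a syntax error
     * @param diagnosticPass    stand-in for the hand-written lexer/parser; returns informative messages
     */
    static Outcome compile(String source,
                           Consumer<String> generatedFrontEnd,
                           Function<String, List<String>> diagnosticPass) {
        try {
            // Happy path: the generated front-end stays the only code that runs.
            generatedFrontEnd.accept(source);
            return new Outcome(true, List.of());
        } catch (RuntimeException e) {
            // Error path: re-scan the same source to obtain better messages.
            List<String> detailed = diagnosticPass.apply(source);
            if (detailed.isEmpty()) {
                // Nothing better to say, so keep the original message.
                return new Outcome(false, List.of(String.valueOf(e.getMessage())));
            }
            return new Outcome(false, detailed);
        }
    }
}
```

With this shape the hand-written code never affects programs that already compile; it only adds cost on the error path.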
This bounty issue was created for the development epic (RHOL-1027) = RHOL-1029 + RHOL-1030 + RHOL-1031.
Project RhoLP sources.
Part I: Lexer (36 codepoints)
- Lexer skeleton: Diagnostics API (12 codepoints)
  - Standard error format, error codes
  - Error/warning message database
  - One scan, multiple diagnostic messages
- Non-existent literals handling (12 codepoints)
  - Integer problems: integer literals that are too big, absent hex/binary formats ('0xFF', '0b1010')
  - Floating-point literals: '42.42e-42f'
  - Char literals: 'A', '\uFFFF'
- Non-existent token types (12 codepoints)
  - Absent operators: '->', '%', '&', '&&', '^', etc.
  - Absent keywords: 'do', 'int', 'this', etc.
  - Absent UTF support
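To make the "one scan, multiple diagnostic messages" item concrete, here is a minimal sketch of a scanner that keeps going after the first problem and collects every diagnostic with a code and a position. All class and method names (`MiniDiagnosticLexer`, `Diagnostic`, `scan`) are hypothetical. The operator, Unicode, and absent-keyword error codes are copied from the Example/Demo output in this issue; `lexer.err.literal.int-too-big` is an invented code, and the 64-bit signed range is an assumption. Hex, binary, floating-point, and char literals are not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the RhoLP implementation.
final class MiniDiagnosticLexer {

    /** One diagnostic in a uniform format: severity, code, message, position. */
    static final class Diagnostic {
        final String severity;
        final String code;
        final String message;
        final int line;    // 1-based
        final int column;  // 1-based, counted in code points

        Diagnostic(String severity, String code, String message, int line, int column) {
            this.severity = severity;
            this.code = code;
            this.message = message;
            this.line = line;
            this.column = column;
        }

        @Override
        public String toString() {
            return severity + " " + code + " [" + line + ", " + column + "]: " + message;
        }
    }

    // Illustrative subsets only; a real message database would be larger.
    private static final Set<String> ABSENT_KEYWORDS = Set.of("do", "int", "this", "type");
    private static final String ABSENT_OPERATOR_CHARS = "#%&^";

    private static boolean isAsciiDigit(int c) {
        return c >= '0' && c <= '9';
    }

    private static boolean isAsciiLetter(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    /** Scans the whole input once and returns every diagnostic found. */
    static List<Diagnostic> scan(String src) {
        List<Diagnostic> out = new ArrayList<>();
        int line = 1, col = 1, i = 0;
        while (i < src.length()) {
            int cp = src.codePointAt(i);
            if (cp == '\n') {
                line++;
                col = 1;
                i++;
            } else if (isAsciiLetter(cp)) {
                int start = i;
                while (i < src.length()
                        && (isAsciiLetter(src.charAt(i)) || isAsciiDigit(src.charAt(i)))) {
                    i++;
                }
                String word = src.substring(start, i);
                if (ABSENT_KEYWORDS.contains(word)) {
                    out.add(new Diagnostic("NOTE", "lexer.note.identifier-like-absent-keyword",
                            "identifier '" + word + "' like absent keyword, may cause confusion",
                            line, col));
                }
                col += i - start;
            } else if (isAsciiDigit(cp)) {
                int start = i;
                while (i < src.length() && isAsciiDigit(src.charAt(i))) {
                    i++;
                }
                String digits = src.substring(start, i);
                try {
                    Long.parseLong(digits); // assumption: integers are 64-bit signed
                } catch (NumberFormatException e) {
                    out.add(new Diagnostic("ERROR", "lexer.err.literal.int-too-big", // invented code
                            "integer literal '" + digits + "' does not fit in 64 bits", line, col));
                }
                col += i - start;
            } else {
                if (cp > 127) {
                    out.add(new Diagnostic("ERROR", "lexer.err.non-existent.unicode.identifiers",
                            "there is no Unicode support: '" + new String(Character.toChars(cp))
                                    + "', codepoint = " + cp, line, col));
                } else if (ABSENT_OPERATOR_CHARS.indexOf(cp) >= 0) {
                    out.add(new Diagnostic("ERROR", "lexer.err.non-existent.operator",
                            "there is no operator '" + (char) cp + "'", line, col));
                }
                i += Character.charCount(cp); // skip a full code point, including surrogate pairs
                col++;
            }
        }
        return out;
    }
}
```

Calling `MiniDiagnosticLexer.scan(source)` returns all findings from a single pass, so a printer can group or sort them afterwards instead of the lexer aborting at the first error.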
Part II: Parser
TBD
Benefit to RChain
1. The Interpreter and Web-Compiler will be more user-friendly in error situations
2. This hand-written lexer/parser can resolve the following issues
  - Confusing error message around ellipsis: RHOL-501
  - Rholang interpreter errors should have a uniform structure: RHOL-488
  - better diagnostics for large integers please: RHOL-575
  - Error message for incorrect usage of % vs %% is not helpful: RHOL-592
  - Need consistent error messages around method invocation: RHOL-497
  - "Errors received during evaluation" not useful: RHOL-662
  - Parser does not understand floating point numbers: RHOL-256
  - compiler error message usability: RHOL-301
Example/Demo
import net.golovach.rholp.*;
import net.golovach.rholp.log.*;
import java.util.List;

public class Demo {
    public static void main(String[] args) {
        // Scala-like input that is not valid Rholang, used to trigger diagnostics
        String content =
            "type T = Functor[({ type λ[α] = Map[Int, α] })#λ]";

        // The listener receives diagnostics from the lexer and prints them
        DiagnosticListener listener = new DiagnosticCollapsedPrinter();
        RhoLexer lexer = new RhoLexer(content, listener);
        List<RhoTokenType> tokens = lexer.scanAll();
    }
}
NOTE
Error code: lexer.note.identifier-like-absent-keyword
Message: identifier 'type' like absent keyword, may cause confusion
Line/Column: [1, 1]
----------
type T = Functor[({ type λ[α] = Map[Int, α] })#λ]
^^^^
ERROR
Error code: lexer.err.non-existent.unicode.identifiers
Messages:
there is no Unicode support: 'λ', codepoint = 955, char[] = '\u03BB'
there is no Unicode support: 'α', codepoint = 945, char[] = '\u03B1'
Line/Column: [1, 26], [1, 28], [1, 42], [1, 48]
----------
type T = Functor[({ type λ[α] = Map[Int, α] })#λ]
                         ^ ^             ^     ^
ERROR
Error code: lexer.err.non-existent.operator
Message: there is no operator '#'
Line/Column: [1, 47]
----------
type T = Functor[({ type λ[α] = Map[Int, α] })#λ]
                                              ^
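The marker lines in the output above can be derived directly from the reported Line/Column pairs. The following sketch shows one way to do that; `CaretUnderline` and `underline` are hypothetical names, and it assumes 1-based columns counted in code points and a monospaced display.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the RhoLP API.
final class CaretUnderline {

    /**
     * Builds a marker line with '^' under each reported position.
     *
     * @param columns 1-based start columns on the same source line
     * @param width   number of columns each marker spans (e.g. 4 for 'type')
     */
    static String underline(List<Integer> columns, int width) {
        // The marker line only needs to reach the end of the right-most span.
        int end = 0;
        for (int c : columns) {
            end = Math.max(end, c - 1 + width);
        }
        char[] buf = new char[end];
        Arrays.fill(buf, ' ');
        for (int c : columns) {
            for (int k = 0; k < width; k++) {
                buf[c - 1 + k] = '^';
            }
        }
        return new String(buf);
    }
}
```

For the three reports above, the calls would be `underline(List.of(1), 4)`, `underline(List.of(26, 28, 42, 48), 1)`, and `underline(List.of(47), 1)`.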
Budget and Objective
Estimated Budget of Task: $[5400] for Part I (Lexer)
Estimated Timeline Required to Complete the Task: [3 weeks]
How will we measure completion? [example: committed library ready to integrate with the Interpreter + Web-Compiler]