Tags: ml4ai/skema
Generic MathML Errors using Parser Lookahead (some endpoints updated) (… …#386)

### Changes
- Generic MathML Parser Error update:
  - Added tag level errors to `generic_mathml.rs` parser: `<mi>`, `<mn>`, `<msup>`, `<msub>`, `<msqrt>`, `<mfrac>`, `<mrow>`, `<munder>`, `<mover>`, `<msubsup>`, `<mtext>`, `<mstyle>`, `<mspace>`, `<mo>`.
  - `/mathml/ast-graph` endpoint now shows these errors.
- First Order ODE Parser Error update:
  - Updated `ParseError` messages using the `context` combinator, removing the previous macro usage.
  - The generic MathML errors were excluded as this parser uses `interpreted_mathml.rs`, which doesn't encounter those errors at the `math_expression` level.
  - `/pmml/equations-to-amr` and `/latex/equations-to-amr` are passing on these errors from `skema-rs`.

### Notes
- Lookahead Algorithm:
  - Solved the problem of adding tag level parse errors by implementing a lookahead in the parser.
  - In `math_expression`, instead of using `alt` for multiple branches of parsers, the following steps were adopted:
    1. Grab the content of the next tag.
    2. If it is an open tag, call the appropriate parser. If the parser fails, we can immediately stop execution with [`cut`](https://tikv.github.io/doc/nom/combinator/fn.cut.html) because of the lookahead knowledge.
    3. If the tag is a close tag, return an `Error` instead of a `Failure`. A `Failure` cuts the execution, but returning an `Error` allows the parent combinator to continue using parsers on the remaining input.
  - This approach enables `many0` and other combinators to work as expected. When we run out of things (like math expressions) for `many0` to match (that is, we encounter a close tag), we return an `Error`, allowing the parent combinator to continue. However, as long as we know there is an expression to match (an open tag), we can guarantee that if the internal parser (for `<mi>`, `<mo>`, etc.) fails, it was due to bad input.

### Testing
- `cargo test` and `cargo clippy` passing.
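
The lookahead steps described in the Notes can be illustrated with a small, dependency-free sketch. This is not the code in `generic_mathml.rs`, and it does not use nom: the `ParseErr` enum, `peek_tag`, `element`, the local `cut` and `many0` helpers, and the two-tag `Expr` type are all hypothetical stand-ins that only mirror the recoverable `Error` versus unrecoverable `Failure` distinction. Treating an unknown open tag as a `Failure` is also an assumption of this sketch.

```rust
/// Hand-rolled stand-in for the Error/Failure split (not nom's own types).
#[derive(Debug, PartialEq)]
enum ParseErr {
    Error(String),   // recoverable: the parent combinator may carry on
    Failure(String), // unrecoverable: bad input, stop parsing
}

type PResult<'a, T> = Result<(&'a str, T), ParseErr>;

#[derive(Debug, PartialEq)]
enum Expr {
    Mi(String),
    Mn(String),
}

/// Step 1: look at the next tag without consuming any input.
/// Returns (is_close_tag, tag_name).
fn peek_tag(input: &str) -> Option<(bool, &str)> {
    let rest = input.trim_start().strip_prefix('<')?;
    let end = rest.find('>')?;
    let body = &rest[..end];
    match body.strip_prefix('/') {
        Some(name) => Some((true, name)),
        None => Some((false, body)),
    }
}

/// Parses `<tag>text</tag>`. On its own it only reports recoverable Errors.
fn element<'a>(input: &'a str, tag: &str) -> PResult<'a, String> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let rest = input
        .trim_start()
        .strip_prefix(open.as_str())
        .ok_or_else(|| ParseErr::Error(format!("expected {open}")))?;
    let end = rest
        .find('<')
        .ok_or_else(|| ParseErr::Error(format!("unterminated {open}")))?;
    let text = &rest[..end];
    let rest = rest[end..]
        .strip_prefix(close.as_str())
        .ok_or_else(|| ParseErr::Error(format!("expected {close}")))?;
    Ok((rest, text.to_string()))
}

/// Plays the role of `cut`: promotes a recoverable Error to a Failure.
fn cut<T>(r: Result<T, ParseErr>) -> Result<T, ParseErr> {
    r.map_err(|e| match e {
        ParseErr::Error(msg) => ParseErr::Failure(msg),
        failure => failure,
    })
}

/// Steps 2 and 3: dispatch on the peeked tag instead of trying alternatives.
fn math_expression(input: &str) -> PResult<'_, Expr> {
    match peek_tag(input) {
        // Open tag: we know which parser must succeed, so its failure is final.
        Some((false, "mi")) => cut(element(input, "mi")).map(|(r, t)| (r, Expr::Mi(t))),
        Some((false, "mn")) => cut(element(input, "mn")).map(|(r, t)| (r, Expr::Mn(t))),
        // Assumption of this sketch: an unknown open tag is bad input.
        Some((false, other)) => Err(ParseErr::Failure(format!("unsupported tag <{other}>"))),
        // Close tag: nothing left to match here, let the parent continue.
        Some((true, name)) => Err(ParseErr::Error(format!("reached </{name}>"))),
        None => Err(ParseErr::Error("no tag found".to_string())),
    }
}

/// Repeats `math_expression` until it reports a recoverable Error.
fn many0(mut input: &str) -> PResult<'_, Vec<Expr>> {
    let mut items = Vec::new();
    loop {
        match math_expression(input) {
            Ok((rest, expr)) => {
                items.push(expr);
                input = rest;
            }
            Err(ParseErr::Error(_)) => return Ok((input, items)),
            Err(failure) => return Err(failure),
        }
    }
}

/// Parent combinator: `<mrow>` followed by zero or more expressions.
fn mrow(input: &str) -> PResult<'_, Vec<Expr>> {
    let rest = input
        .trim_start()
        .strip_prefix("<mrow>")
        .ok_or_else(|| ParseErr::Error("expected <mrow>".to_string()))?;
    let (rest, items) = many0(rest)?;
    let rest = rest
        .trim_start()
        .strip_prefix("</mrow>")
        .ok_or_else(|| ParseErr::Failure("expected </mrow>".to_string()))?;
    Ok((rest, items))
}
```

With this sketch, `mrow("<mrow><mi>x</mi><mn>2</mn></mrow>")` succeeds because the closing `</mrow>` only ends the `many0` loop, while `mrow("<mrow><mi>x</mn></mrow>")` stops with a `Failure` that names the `<mi>` parser rather than a generic "no alternative matched" message.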