stream.hls: refactor and fix attribute list parser#5125
Merged
bastimeyer merged 2 commits intoFeb 7, 2023
Conversation
- Rewrite and fix attribute list regex - Be more strict when parsing multiple attributes and expect the mandatory comma separator, but be lenient about spaces surrounding attributes (off-spec), similar to the old implementation - Add missing support for signed decimal floating point numbers - Fix invalid range of hexadecimal sequences and allow uppercase prefix - Fix character range of enumerated strings being too narrow - Fix character range of quoted strings - Discard entire attribute list on parsing failure and emit warning - Add unit tests
- Turn methods into classmethods - Emit warning messages on parsing failure - Add unit tests
adfc61a to
9e9c0a6
Compare
Member
|
@bastimeyer Is there any further testing you want to do or is this ready to go? Code looks good. |
gravyboat
approved these changes
Feb 6, 2023
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This fixes and cleans up the M3U8Parser's attribute list regex+parser and some individual attribute parser methods, and it adds missing unit tests.
The old attribute list regex was incorrect and was also missing some stuff. This is now fixed and it's a bit more strict when parsing the attribute list, but it's not doing that 100% according to the spec either, because the previous one also wasn't (and shouldn't, to be able to support broken/invalid playlists).
https://www.rfc-editor.org/rfc/rfc8216#section-4.2
One key difference here is that the entire attribute list gets discarded if there's a parsing issue with one of the attributes in the attributes list. Previously, this was not the case and the already parsed attributes of a specific tag were returned and it kept parsing via
re.finditer(), which could lead to invalid results. That's also the reason, why the comma separator has been added to the regex.It also adds a couple of warning log messages, which can be useful, even though the input string is not included in the log message. I didn't add warnings for everything, though.
But before I'm opening this for merging, I will check and see if this breaks the parsing of actual production playlists.