@ghtjr410 ghtjr410 commented Jan 8, 2026

Add JSONL format support for datasets

  • Add Dataset.fromJsonl() methods for parsing JSONL files and strings
  • Add DatasetParser.parseJsonl() with line-by-line parsing
  • Update ClasspathDatasetResolver and FileDatasetResolver to detect .jsonl extension
  • Add jsonl() attribute to @DatasetSource annotation
  • Add tests and sample.jsonl resource files
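For readers unfamiliar with the format, the line-by-line approach named in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `JsonlLines`, `parse`, and the `lineParser` callback are hypothetical names, and the actual JSON decoding is left to whatever per-line parser the caller supplies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of this PR: shows the general shape of
// line-by-line JSONL handling using only the standard library.
final class JsonlLines {

    private JsonlLines() {}

    /**
     * Splits JSONL content on line breaks, skips blank lines, and hands each
     * remaining line to lineParser. A failure is rethrown with the 1-based
     * line number so a malformed record is easy to locate.
     */
    static <T> List<T> parse(String content, Function<String, T> lineParser) {
        List<T> records = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, \r); -1 keeps trailing empty
        // strings so the line numbers stay aligned with the input.
        String[] lines = content.split("\\R", -1);
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].strip();
            if (line.isEmpty()) {
                continue; // tolerate blank lines and a trailing newline
            }
            try {
                records.add(lineParser.apply(line));
            } catch (RuntimeException e) {
                throw new IllegalArgumentException(
                        "Invalid JSONL record on line " + (i + 1), e);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String jsonl = "{\"inputs\":{\"question\":\"What is AI?\"}}\n"
                + "\n"
                + "{\"inputs\":{\"question\":\"What is ML?\"}}\n";
        // The identity function stands in for a real JSON parser here.
        List<String> records = parse(jsonl, line -> line);
        System.out.println(records.size()); // prints 2
    }
}
```

Reporting the line number in the exception is a design choice of this sketch; whether the PR's parser does the same is not stated in the description.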

Closes #21

@fkapsahili fkapsahili left a comment

@ghtjr410 Thanks a lot for the contribution! I just added 1-2 comments regarding the unit tests. Would you like to have a look at those? Thanks!

Can you maybe add a few more test cases with nested JSONL inputs/outputs, to make sure that the parsing also works in more complex scenarios? I think the following JSON example would be a good starting point:

@Test
void shouldParseJsonWithNestedInputsOutputs() {
    String json = """
            {
              "name": "complex-qa",
              "examples": [
                {
                  "inputs": {
                    "question": "What is AI?",
                    "context": ["AI is artificial intelligence"]
                  },
                  "expectedOutputs": {
                    "answer": "Artificial intelligence",
                    "confidence": 0.9
                  },
                  "metadata": {
                    "source": "wikipedia"
                  }
                }
              ]
            }
            """;
    var dataset = Dataset.fromJson(json);
    // assertions on the nested inputs/outputs would go here
}
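
As a rough illustration of what a nested JSONL fixture for such tests might look like, the sketch below writes the example above as one record per line. The per-line schema (`inputs`, `expectedOutputs`, `metadata` at the top level of each line) is an assumption carried over from the JSON example, not something confirmed by this PR; the second record is made-up test data, and `NestedJsonlSample` is a hypothetical name.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical fixture, not taken from the PR: nested objects and arrays
// must stay on a single line, because each JSONL line is one record.
final class NestedJsonlSample {

    private NestedJsonlSample() {}

    // Assumed per-line schema, mirroring one entry of "examples" in the JSON form.
    static final String NESTED_JSONL = """
            {"inputs": {"question": "What is AI?", "context": ["AI is artificial intelligence"]}, "expectedOutputs": {"answer": "Artificial intelligence", "confidence": 0.9}, "metadata": {"source": "wikipedia"}}
            {"inputs": {"question": "What is ML?", "context": ["ML is machine learning"]}, "expectedOutputs": {"answer": "Machine learning", "confidence": 0.8}, "metadata": {"source": "illustrative"}}
            """;

    /** Returns the non-blank lines of the fixture, one string per record. */
    static List<String> records() {
        return NESTED_JSONL.lines()
                .filter(line -> !line.isBlank())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (String record : records()) {
            System.out.println(record);
        }
    }
}
```

A test along the lines requested here could pass `NESTED_JSONL` to `Dataset.fromJsonl()` and assert on the nested values; the exact accessors depend on the `Dataset` API, which is not shown in this conversation.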

I think these tests could benefit from 1-2 more complex dataset examples, with some nested outputs here too.

@fkapsahili fkapsahili added the enhancement New feature or request label Jan 9, 2026

Development

Successfully merging this pull request may close these issues.

Support JSONL in Dataset and @DatasetSource annotation #7

2 participants