refactor(datasets): add compress_level parameter to write_image() and set it to 1 #2135

Merged
CarolinePascal merged 3 commits into main from refactor/write_image_compress_level
Oct 8, 2025

@imstevenpmwork imstevenpmwork commented Oct 7, 2025

This PR stems from the conversation in: #1959

Rationale

Why is compression not critical at this step?
We aim to preserve as much raw image information as possible, as these images are intermediate artifacts. They will later be compressed during video encoding at the end of each episode, where compression efficiency potentially matters more.

How was the compression level chosen?
The optimal compression level depends on the entropy characteristics of the images. However, since our main goal here is speed rather than file size, a low compression level is preferred to minimize CPU overhead during frequent writes.
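
As a rough illustration of the entropy point (not code from this PR), the sketch below uses Python's standard `zlib` module as a stand-in for the PNG encoder, since Pillow documents the PNG `compress_level` option as the zlib compression level. The synthetic `blocky` and `noisy` buffers and the `compression_ratio` helper are assumptions made for the example, not real dataset frames.

```python
import random
import zlib


def compression_ratio(data: bytes, level: int = 1) -> float:
    """Return compressed size divided by raw size (lower means better compression)."""
    return len(zlib.compress(data, level)) / len(data)


WIDTH, HEIGHT = 320, 240

# Low-entropy stand-in: raw RGB bytes of an image made of flat 8x8 colour blocks.
blocky = bytes(
    channel
    for y in range(HEIGHT)
    for x in range(WIDTH)
    for channel in ((x // 8) * 8 % 256, (y // 8) * 8 % 256, 128)
)

# High-entropy stand-in: seeded pseudo-random bytes, comparable to heavy sensor noise.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(len(blocky)))

print(f"blocky ratio at level 1: {compression_ratio(blocky):.3f}")
print(f"noisy ratio at level 1:  {compression_ratio(noisy):.3f}")
```

The low-entropy buffer shrinks substantially even at level 1, while the noisy buffer barely changes, which is why no single level is optimal for every camera feed.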

Why compress_level=1 instead of 0?
Although 0 uses the least CPU for compression, it can paradoxically result in slower overall performance due to the larger output files. Writing significantly larger files increases I/O time, often offsetting any CPU gains.
Setting compress_level=1 provides a better balance between CPU usage and disk throughput.
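
A minimal way to sanity-check this trade-off, assuming zlib behaves like the PNG encoder's deflate stage, is to compare output size and CPU time across levels 0, 1, and 6 (the previous default named in the linked issue). The `measure` helper and the synthetic `frame` are hypothetical, and the sketch deliberately measures only compression, not disk I/O, so it shows the size side of the argument rather than end-to-end write time.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple[int, float]:
    """Compress `data` once at `level`; return (compressed bytes, elapsed seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


WIDTH, HEIGHT = 320, 240

# Synthetic RGB frame made of flat 8x8 colour blocks (an assumption, not real data).
frame = bytes(
    channel
    for y in range(HEIGHT)
    for x in range(WIDTH)
    for channel in ((x // 8) * 8 % 256, (y // 8) * 8 % 256, 128)
)

# Level 0 stores the data uncompressed, 1 is the fastest real compression,
# and 6 is zlib's usual default.
results = {level: measure(frame, level) for level in (0, 1, 6)}

print(f"raw size: {len(frame)} bytes")
for level, (size, seconds) in results.items():
    print(f"level {level}: {size} bytes in {seconds * 1000:.2f} ms")
```

Level 0 output is slightly larger than the raw buffer because of container overhead, so every byte still has to reach the disk. Timings vary by machine and image content, so it is worth running this on representative frames before drawing conclusions.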

Future Work

As suggested in the original ticket, compression and encoding parameters (e.g., format, compression level, codec options) should eventually be exposed to users for fine-grained control. This will be addressed in a future PR, although it is not currently a priority.
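
One possible shape for such user-facing settings is sketched below. Everything here is hypothetical: `ImageWriteOptions`, its field names, and `save_kwargs` are invented for illustration and are not part of this PR or the current API. The only grounded details are the default of 1 chosen in this PR and the 0 to 9 range of the PNG `compress_level` option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageWriteOptions:
    """Hypothetical user-facing settings for intermediate image writes."""

    image_format: str = "PNG"
    compress_level: int = 1  # matches the default chosen in this PR

    def __post_init__(self) -> None:
        # zlib-style levels run from 0 (no compression) to 9 (smallest output).
        if not 0 <= self.compress_level <= 9:
            raise ValueError(
                f"compress_level must be between 0 and 9, got {self.compress_level}"
            )

    def save_kwargs(self) -> dict[str, object]:
        """Return keyword arguments in the form an image save call could accept."""
        return {"format": self.image_format, "compress_level": self.compress_level}


options = ImageWriteOptions()
print(options.save_kwargs())
```

Validating the level at construction time keeps bad values from surfacing later, in the middle of a recording session.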

@imstevenpmwork imstevenpmwork self-assigned this Oct 7, 2025
@imstevenpmwork imstevenpmwork added the enhancement, dataset, refactor, and performance labels Oct 7, 2025
@imstevenpmwork imstevenpmwork linked an issue Oct 7, 2025 that may be closed by this pull request

@CarolinePascal CarolinePascal left a comment

Just a small comment: wouldn't this be the perfect occasion to add a docstring to this method? c:

@imstevenpmwork

> Just a small comment: wouldn't this be the perfect occasion to add a docstring to this method? c:

Done in: d52473b

@CarolinePascal CarolinePascal left a comment

LGTM!

@CarolinePascal CarolinePascal self-requested a review October 8, 2025 17:03
@CarolinePascal CarolinePascal merged commit 9a49e57 into main Oct 8, 2025
10 checks passed
@CarolinePascal CarolinePascal deleted the refactor/write_image_compress_level branch October 8, 2025 18:06
aDaikiKamata pushed a commit to aDaikiKamata/lerobot that referenced this pull request Oct 20, 2025
… set it to 1 (huggingface#2135)

* refactor(datasets): add compress_level parameter to write_image() and set it to 1

* docs(dataset): add docs to write_image()
annarborace01 pushed a commit to annarborace01/lerobot that referenced this pull request Nov 16, 2025
nepyope pushed a commit that referenced this pull request Nov 21, 2025
XHAKA3456 pushed a commit to XHAKA3456/lerobot that referenced this pull request Dec 12, 2025
massu2002 pushed a commit to massu2002/lerobot that referenced this pull request Dec 17, 2025
sandhya-cb pushed a commit to sandhya-cb/lerobot-clutterbot that referenced this pull request Jan 28, 2026
Development

Successfully merging this pull request may close these issues.

write_image() is slow due to default compress_level=6