kneeslap downloads an S3 object in concurrent chunks and writes the data to stdout in order.
Using `aws s3 cp` to write to stdout was slower than writing to disk, presumably
because when writing to a file the download is chunked and the parts are written
concurrently. I could not find a way to achieve similar performance when streaming
to stdout (even after changing `max_concurrent_requests` and `multipart_chunksize`).
If you know of a setting that enables this, making kneeslap redundant, please
let me know!
`kneeslap s3://bucket/key` will download the object and write it to stdout.
- You can use `pv` to see performance: `kneeslap s3://bucket/key | pv -X`.
- The `-connections` flag sets how many connections are made to the object; the default is 32.
- The `-chunksize` flag sets the chunk size; the default is 16MB.
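
Under the hood, the idea is just HTTP range requests reassembled in order. Below is a minimal sketch of that technique in Go using the AWS SDK for Go v2; it illustrates the approach, not kneeslap's actual implementation, and the bucket/key values are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"io"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	bucket, key := "my-bucket", "my-key" // hypothetical placeholders
	chunkSize := int64(16 << 20)         // 16MB, matching the -chunksize default
	connections := 32                    // matching the -connections default

	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		panic(err)
	}
	client := s3.NewFromConfig(cfg)

	// HeadObject tells us the total size, so we can plan the byte ranges.
	head, err := client.HeadObject(ctx, &s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		panic(err)
	}
	size := aws.ToInt64(head.ContentLength)
	nChunks := (size + chunkSize - 1) / chunkSize

	// One buffered channel per chunk: downloads finish in any order,
	// but the writer below drains the channels strictly in order.
	results := make([]chan []byte, nChunks)
	sem := make(chan struct{}, connections) // caps concurrent range requests

	for i := int64(0); i < nChunks; i++ {
		results[i] = make(chan []byte, 1)
		go func(i int64) {
			sem <- struct{}{}
			defer func() { <-sem }()

			start := i * chunkSize
			end := min(start+chunkSize, size) - 1
			out, err := client.GetObject(ctx, &s3.GetObjectInput{
				Bucket: aws.String(bucket),
				Key:    aws.String(key),
				Range:  aws.String(fmt.Sprintf("bytes=%d-%d", start, end)),
			})
			if err != nil {
				panic(err)
			}
			defer out.Body.Close()

			buf, err := io.ReadAll(out.Body)
			if err != nil {
				panic(err)
			}
			results[i] <- buf
		}(i)
	}

	// Emit chunks to stdout in order as they become available.
	for i := int64(0); i < nChunks; i++ {
		if _, err := os.Stdout.Write(<-results[i]); err != nil {
			panic(err)
		}
	}
}
```

Note that a sketch like this can buffer an unbounded number of completed-but-unwritten chunks if stdout is slow; a real implementation would cap that buffering as well as the connection count.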
A good example of when to use this is if you're doing on-the-fly (de)compression with zstd.
For example, you could upload an object with:
```sh
tar -cvf - file1 file2 | zstd --sparse -c -3 -T0 | aws s3 cp - s3://bucket/key.zstd
```

And download it with:

```sh
kneeslap s3://bucket/key | zstd -d | tar -Szxf -
```

This would be comparable to `aws s3 cp s3://bucket/key - | zstd -d | tar -Szxf -`, but there should be a pretty significant speed increase (at least on very large files...).