Benchmark Script for Pipelines #1150
bfineran
left a comment
LGTM overall - could you add an example of the expected output format either to the PR or readme?
Added to the README!
rahul-tuli
left a comment
The code looks very close; I had a few comments and nits, mostly around docstrings.
Additionally, something to think about: we generally support YAML for all our recipes and other configs (like the deepsparse server), so would it make sense to use YAML over JSON here too for consistency? We could also support both, and I'm in favor of that, but I really think we should at least support YAML.
GHA failures unrelated, merging

Adding a script for benchmarking pipelines, which reports the amount of compute time spent in each phase of the pipeline. This allows us to identify bottlenecks outside the engine, in pre- or post-processing.
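To illustrate the idea of per-phase timing, here is a minimal sketch in plain Python. It is not the script added in this PR: the stage names and the three lambda stages are hypothetical stand-ins for a pipeline's pre-processing, engine call, and post-processing, and the only technique shown is accumulating wall-clock time per stage with `time.perf_counter`.

```python
import time
from collections import defaultdict


def run_timed(stages, sample, totals):
    """Run each (name, fn) stage in order, adding its wall-clock time to totals."""
    value = sample
    for name, fn in stages:
        start = time.perf_counter()
        value = fn(value)
        totals[name] += time.perf_counter() - start
    return value


# Hypothetical stand-ins for the three phases of a pipeline.
stages = [
    ("pre_process", lambda text: text.lower().split()),
    ("engine_forward", lambda tokens: [len(t) for t in tokens]),
    ("post_process", lambda scores: max(scores)),
]

totals = defaultdict(float)
for _ in range(100):
    result = run_timed(stages, "Benchmark The Whole Pipeline", totals)

# Report each phase's total time and its share of the overall time.
total_time = sum(totals.values())
for name, seconds in totals.items():
    share = 100 * seconds / total_time if total_time else 0.0
    print(f"{name}: {seconds * 1e3:.3f} ms ({share:.1f}%)")
```

A phase whose share is large relative to the engine phase is a candidate bottleneck outside the engine.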
Example Usage
Based on the pipeline argument, the script will infer what type of data to generate or search for (see `get_input_schema_type` for details).
Config File Documentation
Configurations for generating or loading data to the pipeline are specified as JSON, documented in the README. A quick summary:
* Set `data_type` to "real" to pull text or image data from `data_folder`, or set it to "dummy" to use randomly generated data.
    * In dummy mode, string lengths are set with `gen_sequence_length` and image shapes are set by `input_image_shape`.
* `max_string_length` will truncate input text if >0; set to -1 for no truncation.
* `data_folder` is a path to either images (.jpg, .jpeg, .gif) or text (.txt) files, to be read in real mode.
* Set `recursive_search` to true to recursively search `data_folder`.
* Arguments to `pipeline.Pipeline()` can be added to `pipeline_kwargs`.
* Arguments to `Pipeline.input_schema()` can be added to `input_schema_kwargs`.

Example:

    {
        "data_type": "dummy",
        "gen_sequence_length": 100,
        "input_image_shape": [500, 500, 3],
        "data_folder": "/home/sadkins/imagenette2-320/",
        "recursive_search": true,
        "max_string_length": -1,
        "pipeline_kwargs": {},
        "input_schema_kwargs": {}
    }

Testing
Added unit tests to `test_pipeline_benchmark.py`, and also manually tested the following pipelines:

    deepsparse.benchmark_pipeline text_classification zoo:nlp/sentiment_analysis/distilbert-none/pytorch/huggingface/sst2/pruned90-none -c tests/test_data/pipeline_bench_config.json
    deepsparse.benchmark_pipeline image_classification zoo:cv/classification/resnet_v1-50_2x/pytorch/sparseml/imagenet/base-none -c tests/test_data/pipeline_bench_config.json
    deepsparse.benchmark_pipeline text_generation zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base_quant-none -c tests/test_data/pipeline_bench_config.json
    deepsparse.benchmark_pipeline yolo zoo:cv/detection/yolov5-l/pytorch/ultralytics/coco/pruned_quant-aggressive_95 -c tests/test_data/pipeline_bench_config.json
    deepsparse.benchmark_pipeline question_answering zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni -c tests/test_data/pipeline_bench_config.json
    deepsparse.benchmark_pipeline token_classification zoo:nlp/token_classification/distilbert-none/pytorch/huggingface/conll2003/pruned90-none -c tests/test_data/pipeline_bench_config.json
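The config file passed with `-c` in the commands above can be sanity-checked before a run. The following is a minimal sketch using only the keys documented in this PR; the helpers `check_config`, `truncate`, and `dummy_text` are hypothetical and are not part of the benchmark script. In particular, treating `gen_sequence_length` as a character count is an assumption made for illustration.

```python
import json
import random
import string

# Keys and values copied from the example config in this PR description.
config = {
    "data_type": "dummy",
    "gen_sequence_length": 100,
    "input_image_shape": [500, 500, 3],
    "data_folder": "/home/sadkins/imagenette2-320/",
    "recursive_search": True,
    "max_string_length": -1,
    "pipeline_kwargs": {},
    "input_schema_kwargs": {},
}


def check_config(cfg):
    """Hypothetical check: data_type must be one of the two documented modes."""
    if cfg["data_type"] not in ("dummy", "real"):
        raise ValueError(f"unsupported data_type: {cfg['data_type']!r}")
    return cfg


def truncate(text, max_string_length):
    """Truncate when the limit is > 0; -1 means no truncation, as documented."""
    return text[:max_string_length] if max_string_length > 0 else text


def dummy_text(length, seed=0):
    """Assumption: a dummy string is `length` random lowercase characters."""
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


# Serialize to JSON and read it back, as a config file on disk would be.
serialized = json.dumps(check_config(config), indent=4)
roundtrip = json.loads(serialized)

sample = dummy_text(roundtrip["gen_sequence_length"])
print(len(truncate(sample, roundtrip["max_string_length"])))
```

Writing `serialized` to a file gives a config in the same shape as the example shown under Config File Documentation.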