This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Benchmark Script for Pipelines #1150

Merged: bfineran merged 54 commits into main from pipeline-benchmark on Aug 17, 2023

Conversation

@Satrat Satrat commented Jul 27, 2023

Adding a script for benchmarking pipelines, which reports the amount of compute time spent in each phase of the pipeline. This allows us to identify bottlenecks outside the engine, in pre- or post-processing.
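
To illustrate the idea of per-phase timing, here is a minimal sketch, not the code added in this PR. The `PhaseTimer` class, the phase names, and the stand-in stages in `run_once` are all hypothetical; the point is only that each phase is timed separately so that time spent outside the engine becomes visible.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Runs even if the timed block returns or raises.
            self.totals[name] += time.perf_counter() - start

    def percentages(self):
        total = sum(self.totals.values())
        if total == 0:
            return {name: 0.0 for name in self.totals}
        return {name: 100.0 * t / total for name, t in self.totals.items()}


def run_once(timer, raw_input):
    # Stand-in stages; a real pipeline would tokenize, run the engine, decode.
    with timer.phase("pre_process"):
        tokens = raw_input.lower().split()
    with timer.phase("engine_forward"):
        scores = [len(t) for t in tokens]
    with timer.phase("post_process"):
        return max(scores) if scores else 0


timer = PhaseTimer()
for _ in range(100):
    run_once(timer, "Benchmarking pipelines finds bottlenecks")
for name, pct in timer.percentages().items():
    print(f"{name}: {pct:.1f}% of measured time")
```

Reporting percentages alongside totals makes it easy to see at a glance whether pre- or post-processing dominates the engine time.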

Example Usage

deepsparse.benchmark_pipeline "text_classification" "zoo:nlp/sentiment_analysis/distilbert-none/pytorch/huggingface/sst2/pruned90-none" -c "config.json"

Based on the pipeline argument, the script infers what type of data to generate or search for (see get_input_schema_type for details).
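
As a rough illustration of that inference step, the sketch below maps a task name to an input kind. It is not the body of get_input_schema_type, whose actual signature and rules are not shown here; the function `infer_input_kind`, the task groupings, and the returned labels are assumptions based only on the pipelines listed under Testing.

```python
# Hypothetical groupings for illustration; the real logic lives in
# get_input_schema_type and may differ.
TEXT_TASKS = {"text_classification", "token_classification", "text_generation"}
IMAGE_TASKS = {"image_classification", "yolo"}
QUESTION_TASKS = {"question_answering"}


def infer_input_kind(task: str) -> str:
    """Return which kind of input data a task needs (assumed labels)."""
    normalized = task.lower().replace("-", "_")
    if normalized in IMAGE_TASKS:
        return "image"
    if normalized in TEXT_TASKS:
        return "text"
    if normalized in QUESTION_TASKS:
        return "question"
    raise ValueError(f"No known input kind for task: {task!r}")


for task in ("text_classification", "yolo", "question_answering"):
    print(task, "->", infer_input_kind(task))
```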

Config File Documentation

Configurations for generating or loading data to the pipeline are specified as JSON, documented in the README. A quick summary:

  • Set data_type to "real" to pull text or image data from data_folder, or to "dummy" to use randomly generated data
  • In dummy mode, string lengths are set with gen_sequence_length and image shapes are set by input_image_shape
  • In real mode, max_string_length truncates input text if >0; set it to -1 for no truncation
  • data_folder is a path to either image (.jpg, .jpeg, .gif) or text (.txt) files, to be read in real mode
  • Set recursive_search to true to recursively search data_folder
  • Additional keyword arguments to pipeline.Pipeline() can be added to pipeline_kwargs
  • Additional keyword arguments to Pipeline.input_schema() can be added to input_schema_kwargs
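
The real-mode options above can be sketched as follows. This is an illustration under stated assumptions rather than the script's implementation: `find_files` and `truncate_text` are hypothetical helpers, and treating any non-positive max_string_length as "no truncation" is a guess beyond the documented -1.

```python
from pathlib import Path

# Extensions taken from the config summary above.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif"}
TEXT_EXTENSIONS = {".txt"}


def find_files(data_folder, extensions, recursive_search=False):
    """List files under data_folder whose suffix is in extensions."""
    root = Path(data_folder)
    candidates = root.rglob("*") if recursive_search else root.glob("*")
    return sorted(
        p for p in candidates if p.is_file() and p.suffix.lower() in extensions
    )


def truncate_text(text, max_string_length=-1):
    """Truncate only when the limit is positive (assumed: <= 0 means no limit)."""
    if max_string_length > 0:
        return text[:max_string_length]
    return text


print(truncate_text("a long review of a film", 6))
print(truncate_text("a long review of a film", -1))
```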

Example:

{
    "data_type": "dummy",
    "gen_sequence_length": 100,
    "input_image_shape": [500,500,3],
    "data_folder": "/home/sadkins/imagenette2-320/",
    "recursive_search": true,
    "max_string_length": -1, 
    "pipeline_kwargs": {},
    "input_schema_kwargs": {}
} 
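
A minimal sketch of how a config like this could drive dummy-data generation. The helpers `load_config`, `dummy_text`, and `dummy_image` are hypothetical, as is representing an image as a flat byte buffer; only the key names come from the example above.

```python
import json
import math
import random
import string


def load_config(text):
    """Parse the JSON config and check the one field both modes depend on."""
    config = json.loads(text)
    if config.get("data_type") not in ("real", "dummy"):
        raise ValueError('data_type must be "real" or "dummy"')
    return config


def dummy_text(length, rng):
    """Random string of exactly `length` characters."""
    return "".join(rng.choices(string.ascii_letters + " ", k=length))


def dummy_image(shape, rng):
    """Random pixel values as a flat byte buffer (assumed representation)."""
    return bytes(rng.getrandbits(8) for _ in range(math.prod(shape)))


config = load_config(
    '{"data_type": "dummy", "gen_sequence_length": 100,'
    ' "input_image_shape": [500, 500, 3]}'
)
rng = random.Random(0)  # seeded so repeated benchmark runs see the same inputs
text = dummy_text(config["gen_sequence_length"], rng)
image = dummy_image(config["input_image_shape"], rng)
print(len(text), len(image))  # prints: 100 750000
```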

Testing

Added unit tests to test_pipeline_benchmark.py, and also manually tested the following pipelines:

  • text_classification: deepsparse.benchmark_pipeline text_classification zoo:nlp/sentiment_analysis/distilbert-none/pytorch/huggingface/sst2/pruned90-none -c tests/test_data/pipeline_bench_config.json
  • image_classification: deepsparse.benchmark_pipeline image_classification zoo:cv/classification/resnet_v1-50_2x/pytorch/sparseml/imagenet/base-none -c tests/test_data/pipeline_bench_config.json
  • text_generation: deepsparse.benchmark_pipeline text_generation zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base_quant-none -c tests/test_data/pipeline_bench_config.json
  • yolo: deepsparse.benchmark_pipeline yolo zoo:cv/detection/yolov5-l/pytorch/ultralytics/coco/pruned_quant-aggressive_95 -c tests/test_data/pipeline_bench_config.json
  • question_answering: deepsparse.benchmark_pipeline question_answering zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni -c tests/test_data/pipeline_bench_config.json
  • token_classification: deepsparse.benchmark_pipeline token_classification zoo:nlp/token_classification/distilbert-none/pytorch/huggingface/conll2003/pruned90-none -c tests/test_data/pipeline_bench_config.json

@Satrat Satrat changed the title WIP: Benchmark Script for Pipelines Benchmark Script for Pipelines Jul 28, 2023
@Satrat Satrat marked this pull request as ready for review July 28, 2023 21:13

@bfineran bfineran left a comment

Contributor

LGTM overall - could you add an example of the expected output format either to the PR or readme?

Satrat commented Aug 4, 2023

Author

> LGTM overall - could you add an example of the expected output format either to the PR or readme?

Added to the README!

Comment thread src/deepsparse/benchmark/benchmark_pipeline.py
bfineran previously approved these changes Aug 10, 2023

@rahul-tuli rahul-tuli left a comment

Member

The code looks very close; I had a few comments and nits, mostly around docstrings.

Additionally, something to think about: since we generally support YAML for all our recipes and other configs (like deepsparse server), would it make sense to use YAML over JSON here too, for consistency? We could also support both. I'm in favor of that, but I really think we should at least support YAML.
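
One way supporting both formats could look, as a sketch only: dispatch on the file extension. The function `load_benchmark_config` is hypothetical and is not part of this PR; the YAML branch assumes the third-party PyYAML package is installed and uses its `yaml.safe_load`.

```python
import json
from pathlib import Path


def load_benchmark_config(path):
    """Load a benchmark config from a .json, .yaml, or .yml file."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(path.read_text())
    if suffix in {".yaml", ".yml"}:
        # Imported lazily so JSON-only users need no extra dependency.
        import yaml  # PyYAML, third-party

        return yaml.safe_load(path.read_text())
    raise ValueError(f"Unsupported config extension: {suffix!r}")
```

Dispatching on the extension keeps existing JSON configs working unchanged while letting YAML configs follow the convention used elsewhere.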

Comment threads: src/deepsparse/benchmark/helpers.py (5); src/deepsparse/benchmark/data_creation.py (5, of which 3 outdated)

@rahul-tuli rahul-tuli left a comment

@Satrat Satrat requested a review from bfineran August 17, 2023 20:46
@bfineran

Contributor

GHA failures unrelated, merging

@bfineran bfineran merged commit 545348b into main Aug 17, 2023
@bfineran bfineran deleted the pipeline-benchmark branch August 17, 2023 21:08