⚡️ Speed up function make_increasing_ohlc by 102% #106

Open · 1 commit to merge into base: main

@codeflash-ai codeflash-ai bot commented May 24, 2025

📄 102% (1.02x) speedup for make_increasing_ohlc in plotly/figure_factory/_ohlc.py

⏱️ Runtime: 2.12 milliseconds → 1.05 milliseconds (best of 380 runs)
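
The headline figure can be checked against the two runtimes. The snippet below is a small worked calculation, assuming the percentage is the time saved divided by the optimized runtime; that definition is an assumption inferred from the numbers, not something stated in this comment.

```python
original_ms = 2.12   # runtime before the change, as reported above
optimized_ms = 1.05  # runtime after the change, as reported above

# Assumed definition: time saved relative to the optimized runtime.
speedup = (original_ms - optimized_ms) / optimized_ms
# Plain ratio of the two runtimes, for comparison.
ratio = original_ms / optimized_ms

print(f"speedup: {speedup:.2f}x ({speedup:.0%})")  # speedup: 1.02x (102%)
print(f"runtime ratio: {ratio:.2f}x")              # runtime ratio: 2.02x
```

Under that reading, "102% (1.02x)" and "about half the runtime" describe the same measurement.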

📝 Explanation and details

Here’s the optimized version of your program. The bulk of the runtime is spent in utils.flatten, so the best speedup comes from avoiding repeated flattening and instead building the already-flat data in-place while separating the increasing sticks. This avoids both the intermediate nested lists and the overhead of flatten. You can also avoid storing and separating decrease data, since only the increasing sticks are relevant. All other logic and signatures are left unchanged.

All required comments are preserved.


Key optimizations and details:

  • Avoids both building nested lists and then flattening, instead builds already-flat flat_increase_x and flat_increase_y in one pass while processing the data.
  • Avoids storing unnecessary decrease data.
  • Hoists the repeated date minimum calculation out of the per-bar loop.
  • Preserves function signatures and docstrings exactly.
  • Comments are only modified for clarity around the optimized section.

In the benchmark above this roughly halves the runtime (2.12 ms to 1.05 ms), and the saving matters most for large inputs. Let me know if you have performance constraints around decreasing traces (they can be optimized similarly if needed).
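
The optimized code itself is not reproduced in this comment, so the following is only a minimal sketch of the idea described above, not plotly's implementation. All function names are hypothetical. It contrasts a build-nested-then-flatten shape with a single pass that extends flat lists directly, and it computes the tick width once before the loop. The 0.2 offset for undated bars and the divide-by-5 spacing for dated bars are assumptions chosen to match the expectations in the generated tests below.

```python
from datetime import datetime


def stick_half_width(dates, default=0.2):
    """Horizontal tick length, computed once rather than once per bar."""
    if dates is None:
        return default
    return min(later - earlier for earlier, later in zip(dates, dates[1:])) / 5


def increasing_nested_then_flat(open_, high, low, close, dates=None):
    """Baseline shape: one 7-point list per bar, flattened afterwards."""
    step = stick_half_width(dates)
    xs = dates if dates is not None else range(len(open_))
    nested_x, nested_y = [], []
    for i in range(len(open_)):
        o, c = open_[i], close[i]
        if c is not None and c > o:  # keep increasing bars only
            x = xs[i]
            nested_x.append([x - step, x, x, x, x, x + step, None])
            nested_y.append([o, o, high[i], low[i], c, c, None])
    flat_x = [v for stick in nested_x for v in stick]
    flat_y = [v for stick in nested_y for v in stick]
    return flat_x, flat_y


def increasing_single_pass(open_, high, low, close, dates=None):
    """Same output, but the flat lists are extended directly in one loop."""
    step = stick_half_width(dates)
    xs = dates if dates is not None else range(len(open_))
    flat_x, flat_y = [], []
    for i in range(len(open_)):
        o, c = open_[i], close[i]
        if c is not None and c > o:
            x = xs[i]
            flat_x.extend((x - step, x, x, x, x, x + step, None))
            flat_y.extend((o, o, high[i], low[i], c, c, None))
    return flat_x, flat_y


# Three bars: increasing, decreasing, and one with a missing close.
prices = ([1, 3, 2], [4, 4, 5], [0, 1, 1], [3, 2, None])
days = [datetime(2024, 1, 1), datetime(2024, 1, 3), datetime(2024, 1, 4)]
print(increasing_single_pass(*prices)[1])  # [1, 1, 4, 0, 3, 3, None]
print(increasing_single_pass(*prices, dates=days)[0][:2])
```

Both functions index the inputs by position, so mismatched list lengths still fail with an IndexError instead of being silently truncated as they would be with `zip`.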

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 38 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 88.9% |
🌀 Generated Regression Tests Details
from datetime import datetime, timedelta

# imports
import pytest  # used for our unit tests
# function to test
from plotly.figure_factory._ohlc import make_increasing_ohlc

# unit tests

# ---- Basic Test Cases ----

def test_single_increasing_point():
    # One point, increasing (close > open)
    open_ = [1]
    high = [2]
    low = [0]
    close = [3]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_single_decreasing_point():
    # One point, decreasing (close <= open)
    open_ = [2]
    high = [3]
    low = [1]
    close = [2]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_two_points_mixed():
    # Two points, one increasing, one decreasing
    open_ = [1, 3]
    high = [2, 4]
    low = [0, 2]
    close = [3, 2]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_all_increasing_multiple_points():
    # Multiple points, all increasing
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [3, 4, 5]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_dates_are_used_for_x_axis():
    # Test with datetime dates, 2 points, both increasing
    open_ = [1, 2]
    high = [3, 4]
    low = [0, 1]
    close = [4, 5]
    d0 = datetime(2024, 1, 1)
    d1 = datetime(2024, 1, 2)
    dates = [d0, d1]
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output
    # The x values should be datetime objects +/- timedelta
    date_dif_min = (d1 - d0) / 5
    # First stick x values
    expected_x0 = [d0 - date_dif_min, d0, d0, d0, d0, d0 + date_dif_min, None]
    assert result["x"][:7] == expected_x0
    # Second stick x values
    expected_x1 = [d1 - date_dif_min, d1, d1, d1, d1, d1 + date_dif_min, None]
    assert result["x"][7:14] == expected_x1

def test_custom_name_and_line_kwargs():
    # Test passing custom name and line color
    open_ = [1]
    high = [2]
    low = [0]
    close = [3]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates, name="Bull", line={'color': 'red'}); result = codeflash_output

def test_text_kwarg_override():
    # Test passing custom text kwarg
    open_ = [1]
    high = [2]
    low = [0]
    close = [3]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates, text="mytext"); result = codeflash_output

# ---- Edge Test Cases ----

def test_empty_lists():
    # All lists empty
    open_ = []
    high = []
    low = []
    close = []
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_all_none_close():
    # close is all None, should skip all
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [None, None, None]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_some_none_close():
    # Some close values are None, some increasing
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [4, None, 5]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_all_decreasing():
    # All points are decreasing (close <= open)
    open_ = [3, 2, 1]
    high = [4, 3, 2]
    low = [2, 1, 0]
    close = [2, 1, 0]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_equal_open_close():
    # open == close, should be treated as decreasing (not increasing)
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [1, 2, 3]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_dates_not_sorted():
    # Dates are not sorted, should still work
    open_ = [1, 2]
    high = [2, 3]
    low = [0, 1]
    close = [3, 4]
    d0 = datetime(2024, 1, 2)
    d1 = datetime(2024, 1, 1)
    dates = [d0, d1]
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_non_numeric_inputs():
    # The function does no type checking, so a non-numeric close value
    # fails at the close > open comparison
    open_ = [1, 2]
    high = [2, 3]
    low = [0, 1]
    close = ["a", 4]  # "a" cannot be ordered against the int 1
    dates = None
    # Should raise TypeError when comparing "a" > 1
    with pytest.raises(TypeError):
        make_increasing_ohlc(open_, high, low, close, dates)


def test_dates_length_mismatch():
    # Dates length does not match open/high/low/close
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [3, 4, 5]
    dates = [datetime(2024, 1, 1), datetime(2024, 1, 2)]
    # Should raise IndexError because dates is shorter than the price lists
    with pytest.raises(IndexError):
        make_increasing_ohlc(open_, high, low, close, dates)

def test_dates_with_duplicates():
    # Dates have duplicate values
    open_ = [1, 2]
    high = [2, 3]
    low = [0, 1]
    close = [3, 4]
    d = datetime(2024, 1, 1)
    dates = [d, d]
    # date_dif_min will be timedelta(0), so x values will be identical
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

# ---- Large Scale Test Cases ----

def test_large_all_increasing():
    # 1000 points, all increasing
    n = 1000
    open_ = list(range(n))
    high = [x + 1 for x in open_]
    low = [x - 1 for x in open_]
    close = [x + 2 for x in open_]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_large_mixed_increasing_decreasing():
    # 1000 points, alternating increasing and decreasing
    n = 1000
    open_ = [i for i in range(n)]
    high = [i + 2 for i in range(n)]
    low = [i - 2 for i in range(n)]
    close = [open_[i] + 3 if i % 2 == 0 else open_[i] - 3 for i in range(n)]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output
    # Check last increasing stick (index is last even)
    last_even = n - 2 if n % 2 == 0 else n - 1
    idx = ((n // 2 + n % 2) - 1) * 7

def test_large_with_dates():
    # 500 points, all increasing, with dates
    n = 500
    open_ = list(range(n))
    high = [x + 1 for x in open_]
    low = [x - 1 for x in open_]
    close = [x + 2 for x in open_]
    base_date = datetime(2024, 1, 1)
    dates = [base_date + timedelta(days=i) for i in range(n)]
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output
    # Check first stick x values
    date_dif_min = timedelta(days=1) / 5
    expected_x0 = [base_date - date_dif_min, base_date, base_date, base_date, base_date, base_date + date_dif_min, None]
    assert result["x"][:7] == expected_x0

def test_large_all_decreasing():
    # 1000 points, all decreasing
    n = 1000
    open_ = list(range(n, 2*n))
    high = [x + 1 for x in open_]
    low = [x - 1 for x in open_]
    close = [x - 2 for x in open_]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import datetime

# imports
import pytest
# function to test
from plotly.figure_factory._ohlc import make_increasing_ohlc

# unit tests

# --- Basic Test Cases ---

def test_basic_increasing_single():
    # Single increasing bar, no dates
    open_ = [1]
    high = [2]
    low = [0.5]
    close = [1.5]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_basic_increasing_multiple():
    # Multiple bars, only some increasing
    open_ = [1, 2, 2]
    high = [2, 3, 3]
    low = [0.5, 1.5, 1.5]
    close = [1.5, 1.5, 2.5]  # only first and last are increasing
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_basic_with_dates():
    # Using datetime objects for dates
    base = datetime.datetime(2020, 1, 1)
    dates = [base + datetime.timedelta(days=i) for i in range(3)]
    open_ = [1, 2, 2]
    high = [2, 3, 3]
    low = [0.5, 1.5, 1.5]
    close = [1.5, 3, 1.5]
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output
    # The x values should be datetimes or None

def test_basic_name_kwarg():
    # Pass a custom name, showlegend should be True
    open_ = [1, 2]
    high = [3, 4]
    low = [0, 1]
    close = [2, 1]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates, name="CustomName"); result = codeflash_output

def test_basic_line_kwarg_override():
    # Pass a custom line dict, should override default
    open_ = [1, 2]
    high = [3, 4]
    low = [0, 1]
    close = [2, 3]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates, line=dict(color="red", width=5)); result = codeflash_output

# --- Edge Test Cases ---

def test_edge_no_increasing():
    # All closes <= opens, should result in empty x/y
    open_ = [2, 3, 4]
    high = [2.5, 3.5, 4.5]
    low = [1.5, 2.5, 3.5]
    close = [2, 2, 4]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_edge_all_increasing():
    # All bars increasing
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [2, 3, 4]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_edge_empty_lists():
    # Empty input lists
    open_ = []
    high = []
    low = []
    close = []
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_edge_input_length_mismatch():
    # Lists of different lengths should raise an error
    open_ = [1, 2]
    high = [2, 3]
    low = [0, 1]
    close = [1.5]
    dates = None
    with pytest.raises(IndexError):
        make_increasing_ohlc(open_, high, low, close, dates)

def test_edge_dates_length_mismatch():
    # Dates list shorter than data lists should raise error
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [1.5, 2.5, 3.5]
    dates = [datetime.datetime(2020, 1, 1), datetime.datetime(2020, 1, 2)]
    with pytest.raises(IndexError):
        make_increasing_ohlc(open_, high, low, close, dates)

def test_edge_none_close():
    # close=None should be ignored for that bar
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [2, None, 4]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_edge_negative_values():
    # Negative and zero values
    open_ = [-2, 0, -1]
    high = [0, 1, 0]
    low = [-3, -1, -2]
    close = [-1, 1, -1]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_edge_duplicate_dates():
    # Duplicate dates (should still work, but date_dif_min will be zero)
    d = datetime.datetime(2020, 1, 1)
    open_ = [1, 2]
    high = [2, 3]
    low = [0, 1]
    close = [2, 3]
    dates = [d, d]
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output


def test_edge_all_none():
    # All None in close
    open_ = [1, 2, 3]
    high = [2, 3, 4]
    low = [0, 1, 2]
    close = [None, None, None]
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

# --- Large Scale Test Cases ---

def test_large_scale_1000_points():
    # 1000 bars, half increasing, half not
    n = 1000
    open_ = [i for i in range(n)]
    high = [i + 2 for i in range(n)]
    low = [i - 1 for i in range(n)]
    close = [i + 1 if i % 2 == 0 else i - 1 for i in range(n)]  # 500 increasing
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_large_scale_1000_dates():
    # 1000 bars, all increasing, with dates
    n = 1000
    base = datetime.datetime(2021, 1, 1)
    open_ = [i for i in range(n)]
    high = [i + 1 for i in range(n)]
    low = [i - 1 for i in range(n)]
    close = [i + 2 for i in range(n)]
    dates = [base + datetime.timedelta(days=i) for i in range(n)]
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_large_scale_performance():
    # Large input, but only a few increasing
    n = 1000
    open_ = [1000] * n
    high = [1001] * n
    low = [999] * n
    close = [999] * n
    # Only last bar is increasing
    close[-1] = 1001
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output

def test_large_scale_all_none():
    # Large input, all close values None
    n = 1000
    open_ = [i for i in range(n)]
    high = [i + 1 for i in range(n)]
    low = [i - 1 for i in range(n)]
    close = [None] * n
    dates = None
    codeflash_output = make_increasing_ohlc(open_, high, low, close, dates); result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
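
The closing comment describes comparing the original and optimized outputs on identical inputs. As a rough illustration of that kind of check, and not a description of Codeflash's actual mechanism, the sketch below runs two callables on the same argument tuples and reports the cases where the results, or the exception types raised, differ. Every name in it is hypothetical, and the two flatten functions are toy stand-ins.

```python
from itertools import chain


def run_case(func, args):
    """Return ("ok", value) or ("raised", exception type) for one call."""
    try:
        return ("ok", func(*args))
    except Exception as exc:  # broad on purpose: the outcome itself is compared
        return ("raised", type(exc))


def find_mismatches(original, optimized, cases):
    """List the argument tuples on which the two callables behave differently."""
    return [
        args
        for args in cases
        if run_case(original, args) != run_case(optimized, args)
    ]


# Toy stand-ins for an original and an optimized implementation.
def flatten_original(nested):
    return [item for sub in nested for item in sub]


def flatten_optimized(nested):
    return list(chain.from_iterable(nested))


# The last case is not nested, so both versions raise TypeError.
cases = [([[1, 2], [3]],), ([],), ([[None], []],), ([1, 2],)]
print(find_mismatches(flatten_original, flatten_optimized, cases))  # []
```

Comparing exception types as well as return values mirrors the tests above that wrap the call in `pytest.raises`.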

To edit these changes, `git checkout codeflash/optimize-make_increasing_ohlc-mb2bo46m` and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label May 24, 2025
@codeflash-ai codeflash-ai bot requested a review from misrasaurabh1 May 24, 2025 14:25