🚀 The production-ready and incredibly fast Python library to support stock statistics and indicators, based on `pandas.DataFrame`

kaelzhang/stock-pandas

stock-pandas inherits and extends pandas.DataFrame to support:

  • Stock Statistics
  • Stock Indicators, including:
    • Trend-following momentum indicators, such as MA, EMA, MACD, BBI, TR, ATR, HV
    • Dynamic support and resistance indicators, such as BOLL, BBW
    • Over-bought / over-sold indicators, such as KDJ, RSI
    • Other indicators, such as LLV, HHV
    • For more indicators, feel free to open a feature request, fork the project and send a pull request, or extend stock-pandas yourself; see the Advanced Sections below.
  • Cumulating kline data based on a given time frame, so that real-time data updates are easy to handle.
  • Managing calculation lookback: it always uses the least data necessary to calculate an indicator, and automatically fulfills indicators for newly appended rows when needed.
  • A calculation engine and parser re-implemented in Rust, delivering up to a 4.8× performance improvement versus running the same metric computations directly in pandas.

stock-pandas makes automated trading much easier. It requires:

  • Python >= 3.10
  • pandas >= 1.0.0 (for now)

With the help of stock-pandas and mplfinance, we could easily draw something like:

The code example is available here.

Install

For now, before installing stock-pandas in your environment, make sure a g++ compiler is installed:

# With yum, for CentOS, Amazon Linux, etc
yum install gcc-c++

# With apt-get, for Ubuntu
apt-get install g++

# For macOS, install XCode commandline tools
xcode-select --install

If you use Docker with a Dockerfile based on the official python image,

FROM python:3.10

...

the default python:3.10 image already contains g++, so there is no need to install g++ separately.

Install stock-pandas

pip install stock-pandas

A conda-forge recipe is also available, so you can also use

conda install -c conda-forge stock-pandas

Usage

from stock_pandas import StockDataFrame

# or
import stock_pandas as spd

We also have some annotated examples in the example directory; you can use JupyterLab or Jupyter Notebook to play with them.

StockDataFrame

StockDataFrame inherits from pandas.DataFrame, so if you are familiar with pandas.DataFrame, you are already ready to use stock-pandas.

import pandas as pd
stock = StockDataFrame(pd.read_csv('stock.csv'))

As we know, we can use [], which is called pandas indexing (a.k.a. __getitem__ in Python), to select lower-dimensional slices. In addition to indexing by colname (a column name of the DataFrame), we can also index by directives.

stock[directive] # Gets a pandas.Series

stock[[directive0, directive1]] # Gets a StockDataFrame

Here is an example of the most basic indexing using [directive]:

stock = StockDataFrame({
    'open' : ...,
    'high' : ...,
    'low'  : ...,
    'close': [5, 6, 7, 8, 9]
})

stock['ma:2']

# 0    NaN
# 1    5.5
# 2    6.5
# 3    7.5
# 4    8.5
# Name: ma:2,close, dtype: float64

Which prints the 2-period simple moving average on column "close".
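
To make the numbers above concrete, here is a minimal plain-Python sketch of a simple moving average. It does not use stock-pandas and is not the library's implementation; the helper name `simple_moving_average` is hypothetical. It reproduces the values shown for `ma:2`, with NaN where the window is not yet full.

```python
import math


def simple_moving_average(values, period):
    """Return the `period`-period simple moving average as a list.

    Positions that do not yet have a full window are filled with NaN.
    """
    result = []
    for i in range(len(values)):
        if i + 1 < period:
            result.append(math.nan)
        else:
            window = values[i + 1 - period:i + 1]
            result.append(sum(window) / period)
    return result


close = [5, 6, 7, 8, 9]
sma2 = simple_moving_average(close, 2)
print(sma2)  # [nan, 5.5, 6.5, 7.5, 8.5]
```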

Parameters

  • date_col Optional[str] = None If set, the column named date_col will be converted and set as the DatetimeIndex of the data frame
  • to_datetime_kwargs dict = {} the keyword arguments to be passed to pandas.to_datetime(). It only takes effect if date_col is specified.
  • time_frame str | TimeFrame | None = None time frame of the stock. For now, only the following time frames are supported:
    • '1m' or TimeFrame.m1
    • '3m' or TimeFrame.m3
    • '5m' or TimeFrame.m5
    • '15m' or TimeFrame.m15
    • '30m' or TimeFrame.m30
    • '1h' or TimeFrame.H1
    • '2h' or TimeFrame.H2
    • '4h' or TimeFrame.H4
    • '6h' or TimeFrame.H6
    • '8h' or TimeFrame.H8
    • '12h' or TimeFrame.H12
    • '1d' or TimeFrame.D1
    • '3d' or TimeFrame.D3
    • '1W' or TimeFrame.W1
    • '1M' or TimeFrame.M1
    • '1Y' or TimeFrame.Y1

stock.exec(directive: str, create_column: bool=False) -> np.ndarray

Executes the given directive and returns a numpy ndarray according to the directive.

stock['ma:5'] # returns a Series

stock.exec('ma:5', create_column=True) # returns a numpy ndarray
# This will only calculate without creating a new column in the dataframe
stock.exec('ma:20')

The difference between stock[directive] and stock.exec(directive) is that

  • the former will create a new column for the result of directive as a cache for later use, while stock.exec(directive) does not unless we pass the parameter create_column as True
  • the former one accepts other pandas indexing targets, while stock.exec(directive) only accepts a valid stock-pandas directive string
  • the former one returns a pandas.Series or StockDataFrame object while the latter one returns an np.ndarray

stock.alias(alias: str, name: str) -> None

Defines a column alias or a directive alias.

  • alias str the alias name
  • name str the name of an existing column or the directive string

# Some plotting libraries such as `mplfinance` require a column named `Open` (capitalized),
# but that is fine: we can create an alias.
stock.alias('Open', 'open')

stock.alias('buy_point', 'kdj.j < 0')

stock.get_column(key: str) -> pd.Series

Directly gets the column value by key, returns a pandas Series.

If the given key is an alias name, it will return the value of corresponding original column.

If the column is not found, a KeyError will be raised.

stock = StockDataFrame({
    'open' : ...,
    'high' : ...,
    'low'  : ...,
    'close': [5, 6, 7, 8, 9]
})

stock.get_column('close')
# 0    5
# 1    6
# 2    7
# 3    8
# 4    9
# Name: close, dtype: int64

try:
    stock.get_column('Close')
except KeyError as e:
    print(e)

    # KeyError: column "Close" not found

stock.alias('Close', 'close')

stock.get_column('Close')
# The same as `stock.get_column('close')`

stock.append(other, *args, **kwargs) -> StockDataFrame

Appends rows of other to the end of caller, returning a new object.

This method has nearly the same behavior as pandas.DataFrame.append(), but it returns an instance of StockDataFrame and applies date_col to the newly appended row(s) if possible.

stock.rolling_calc(size, on, apply, forward, fill) -> np.ndarray

Since 0.27.0

Applies a 1-D function along the given column or directive on.

  • size int the size of the rolling window
  • on str | Directive the column or directive along which the function should be applied
  • apply Callable[[np.ndarray], Any] the 1-D function to apply
  • forward? bool = False whether to look forward to form each rolling window; by default (False) it looks backward
  • fill? Any = np.nan the value used to fill positions where there are not enough items to form a rolling window

stock.rolling_calc(5, 'open', max)

# whose return value equals
stock['hhv:5@open'].to_numpy()
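
The parameters above can be illustrated with a plain-Python sketch of a rolling calculation. This is an assumption-laden illustration, not the library's code: the free function `rolling_calc` below is hypothetical, works on lists instead of numpy arrays, and assumes that a backward window ends at the current item while a forward window starts at it.

```python
import math


def rolling_calc(values, size, apply, forward=False, fill=math.nan):
    """Apply `apply` to each rolling window of `size` items.

    Assumption: backward windows end at index i, forward windows start at i.
    Positions without a full window receive `fill`.
    """
    result = []
    for i in range(len(values)):
        if forward:
            window = values[i:i + size]
        else:
            window = values[max(0, i + 1 - size):i + 1]
        result.append(apply(window) if len(window) == size else fill)
    return result


opens = [1, 3, 2, 5, 4]
backward = rolling_calc(opens, 3, max)               # [nan, nan, 3, 5, 5]
forward = rolling_calc(opens, 3, max, forward=True)  # [3, 5, 5, nan, nan]
```

With `max` as the function and a backward window, the result matches what a highest-value indicator such as `hhv` describes.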

stock.cumulate() -> StockDataFrame

Cumulates the current data frame stock based on its time frame setting, and returns a new StockDataFrame.

StockDataFrame(one_minute_kline_data_frame, time_frame='5m').cumulate()

# And you will get a 5-minute kline data

See Cumulation and DatetimeIndex for details.

stock.cum_append(other) -> StockDataFrame

Appends other to the end of the current data frame stock, applies cumulation, and returns a new StockDataFrame.

The following code is equivalent to the cumulate() example above:

StockDataFrame(time_frame='5m').cum_append(one_minute_kline_data_frame)

See Cumulation and DatetimeIndex for details.

stock.fulfill() -> self

Since 1.2.0

Fulfills all stock indicator columns. By default, adding new rows to a StockDataFrame will not update the stock indicators of the new rows.

Stock indicators are only updated when the stock indicator column is accessed or stock.fulfill() is called.

Check the test cases for details

directive_stringify(directive_str) -> str

Since 0.30.0

Removed in 4.0.0

Please use StockDataFrame.directive_stringify instead

stock.directive_stringify(directive: str) -> str

Since 0.26.0

Removed in 4.0.0

Please use StockDataFrame.directive_stringify instead

StockDataFrame.directive_stringify(directive: str) -> str

New in 4.0.0

The classmethod to get the full name of the directive which is also the actual column name of the data frame

StockDataFrame.directive_stringify('kdj.j')
# "kdj.j"

StockDataFrame.directive_stringify('kdj.j:9,3,2,100@high,close,close')
# "kdj.j:,,2,100.0@,close"

# <- default args are omitted to save space

StockDataFrame.directive_lookback(directive: str) -> int

New in 5.2.0

The classmethod to get the lookback period of a directive, which indicates the minimum number of data points required to calculate the indicator.

This is useful for:

  • Determining how much historical data is needed before an indicator produces valid results
  • Understanding the data requirements when combining multiple indicators

StockDataFrame.directive_lookback('ma:20')
# 19

StockDataFrame.directive_lookback('boll')
# 19 (default period 20)

# Compound directive: lookback accumulates across nested expressions
# repeat:5 needs 4 extra points, boll.upper (period=20) needs 19
# Total: 4 + 19 = 23
StockDataFrame.directive_lookback('repeat:5@(close > boll.upper)')
# 23
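
The arithmetic in the comments above can be written out as a tiny sketch. The helper `window_lookback` is hypothetical and only mirrors the rule stated in the examples (a window of N periods needs N - 1 earlier data points, and nested windows add up); it does not call stock-pandas.

```python
def window_lookback(period):
    """A rolling window of `period` items needs `period - 1` earlier items."""
    return period - 1


# ma:20 looks back 19 data points
ma20_lookback = window_lookback(20)

# repeat:5 wrapped around boll.upper (period 20): lookbacks accumulate
nested_lookback = window_lookback(5) + window_lookback(20)

print(ma20_lookback, nested_lookback)  # 19 23
```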

StockDataFrame.define_command(...) -> None

StockDataFrame.define_command(
    name: str,
    definition: CommandDefinition
) -> None

The classmethod to define a new customized command, which is shared by all instances.

Cumulation and DatetimeIndex

Suppose we have a csv file containing kline data of a stock in 1-minute time frame

csv = pd.read_csv(csv_path)

print(csv)
                   date   open   high    low  close    volume
0   2020-01-01 00:00:00  329.4  331.6  327.6  328.8  14202519
1   2020-01-01 00:01:00  330.0  332.0  328.0  331.0  13953191
2   2020-01-01 00:02:00  332.8  332.8  328.4  331.0  10339120
3   2020-01-01 00:03:00  332.0  334.2  330.2  331.0   9904468
4   2020-01-01 00:04:00  329.6  330.2  324.9  324.9  13947162
5   2020-01-01 00:04:00  329.6  330.2  324.8  324.8  13947163    <- There is an update of
                                                                    2020-01-01 00:04:00
...
16  2020-01-01 00:16:00  333.2  334.8  331.2  334.0  12428539
17  2020-01-01 00:17:00  333.0  333.6  326.8  333.6  15533405
18  2020-01-01 00:18:00  335.0  335.2  326.2  327.2  16655874
19  2020-01-01 00:19:00  327.0  327.2  322.0  323.0  15086985

Note that duplicate records with the same timestamp are not cumulated: all records except the latest one are discarded.

stock = StockDataFrame(
    csv,
    date_col='date',
    # Which is equivalent to `time_frame=TimeFrame.m5`
    time_frame='5m'
)

print(stock)
                      open   high    low  close    volume
2020-01-01 00:00:00  329.4  331.6  327.6  328.8  14202519
2020-01-01 00:01:00  330.0  332.0  328.0  331.0  13953191
2020-01-01 00:02:00  332.8  332.8  328.4  331.0  10339120
2020-01-01 00:03:00  332.0  334.2  330.2  331.0   9904468
2020-01-01 00:04:00  329.6  330.2  324.9  324.9  13947162
2020-01-01 00:04:00  329.6  330.2  324.8  324.8  13947163
...
2020-01-01 00:16:00  333.2  334.8  331.2  334.0  12428539
2020-01-01 00:17:00  333.0  333.6  326.8  333.6  15533405
2020-01-01 00:18:00  335.0  335.2  326.2  327.2  16655874
2020-01-01 00:19:00  327.0  327.2  322.0  323.0  15086985

As you may have noticed, the data frame now has a DatetimeIndex.

But it will not become 5-minute kline data unless we cumulate it, and new frames are only cumulated if you add them with stock.cum_append(them).

stock_5m = stock.cumulate()

print(stock_5m)

Now we get 5-minute kline data:

                      open   high    low  close      volume
2020-01-01 00:00:00  329.4  334.2  324.8  324.8  62346461.0
2020-01-01 00:05:00  325.0  327.8  316.2  322.0  82176419.0
2020-01-01 00:10:00  323.0  327.8  314.6  327.6  74409815.0
2020-01-01 00:15:00  330.0  335.2  322.0  323.0  82452902.0
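
The first cumulated row above can be reproduced by hand. The following plain-Python sketch is not the library's implementation; the function `cumulate` and its tuple layout are assumptions made for illustration. It keeps only the latest record per timestamp, then merges 1-minute bars into 5-minute bars using first open, highest high, lowest low, last close and summed volume.

```python
from datetime import datetime


def cumulate(rows, minutes=5):
    """Merge (timestamp, open, high, low, close, volume) rows into bars.

    Assumes `minutes` divides 60. Later duplicates of a timestamp win.
    """
    latest = {}
    for row in rows:
        latest[row[0]] = row  # a later record overwrites an earlier one

    bars = {}
    for ts in sorted(latest):
        _, open_, high, low, close, volume = latest[ts]
        start = ts.replace(
            minute=ts.minute - ts.minute % minutes, second=0, microsecond=0
        )
        if start not in bars:
            bars[start] = {
                'open': open_, 'high': high, 'low': low,
                'close': close, 'volume': volume,
            }
        else:
            bar = bars[start]
            bar['high'] = max(bar['high'], high)
            bar['low'] = min(bar['low'], low)
            bar['close'] = close
            bar['volume'] += volume
    return bars


rows = [
    (datetime(2020, 1, 1, 0, 0), 329.4, 331.6, 327.6, 328.8, 14202519),
    (datetime(2020, 1, 1, 0, 1), 330.0, 332.0, 328.0, 331.0, 13953191),
    (datetime(2020, 1, 1, 0, 2), 332.8, 332.8, 328.4, 331.0, 10339120),
    (datetime(2020, 1, 1, 0, 3), 332.0, 334.2, 330.2, 331.0, 9904468),
    (datetime(2020, 1, 1, 0, 4), 329.6, 330.2, 324.9, 324.9, 13947162),
    # An update of 00:04:00: it replaces the previous record
    (datetime(2020, 1, 1, 0, 4), 329.6, 330.2, 324.8, 324.8, 13947163),
]

first_bar = cumulate(rows)[datetime(2020, 1, 1, 0, 0)]
print(first_bar)
```

The printed bar has open 329.4, high 334.2, low 324.8, close 324.8 and volume 62346461, which agrees with the first row of the cumulated output.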

For more details and about how to get full control of everything, check the online Google Colab notebook here.

Syntax of directive

See here for details

directive Example

Here are several use cases of directives used as column names:

# The middle band of bollinger bands
#   which is actually a 20-period (default) moving average
stock['boll']

# kdj j less than 0
# This returns a series of bool type
stock['kdj.j < 0']

# kdj %K cross up kdj %D
stock['kdj.k // kdj.d']

# 5-period simple moving average
stock['ma:5']

# 10-period simple moving average on (@) open prices
stock['ma:10@open']

# Dataframe of 5-period, 10-period, 30-period ma
stock[[
    'ma:5',
    'ma:10',
    'ma:30'
]]

# Which means we use the default values of the first and the second parameters,
# and specify the third parameter (for macd.signal)
stock['macd.signal:,,10']

# We must wrap a parameter which is a nested command or directive
stock['increase:3@(ma:20@close)']

# stock-pandas has a powerful directive parser,
# so we could even write directives like this:
stock['''
repeat
    :   5
    @   (
            close > boll.upper
        )
''']

Built-in Commands of Indicators

Document syntax explanation:

  • param0 int which means param0 is a required parameter of type int.
  • param1? str='close' which means parameter param1 is optional with default value 'close'.

Actually, all parameters of a command are of string type, so the int here means an integer-like string.

ma, simple Moving Averages

ma:<period>@<on>

Gets the period-period simple moving average on the column named on.

The abbreviation SMA is ambiguous: it can stand for either simple moving average or smoothed moving average.

So stock-pandas uses ma for simple moving average and smma for smoothed moving average.

  • period int (required)
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# which is equivalent to `stock['ma:5@close']`
stock['ma:5']

stock['ma:10@open']

Advanced usage

# The 5-period moving average for the upper bollinger band
stock['ma:5@(boll.upper:21,2@close)']

# The change rate of the series above ↑
stock['change@(ma:5@(boll.upper:21,2@close))']

ema, Exponential Moving Average

ema:<period>@<on>

Gets the Exponential Moving Average, also known as the Exponential Weighted Moving Average.

The arguments of this command are the same as those of ma.

  • period int (required)
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# which is equivalent to `stock['ema:5@close']`
stock['ema:5']

stock['ema:10@open']
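
For readers unfamiliar with the indicator, here is a minimal plain-Python sketch of an exponential moving average. It assumes the common smoothing factor alpha = 2 / (period + 1) and seeds the average with the first value; stock-pandas may use a different seeding or adjustment, so treat the function `exponential_moving_average` as hypothetical.

```python
def exponential_moving_average(values, period):
    """EMA with alpha = 2 / (period + 1), seeded with the first value.

    Assumption: this seeding is illustrative and may differ from the library.
    """
    alpha = 2 / (period + 1)
    result = []
    for value in values:
        if not result:
            result.append(float(value))
        else:
            result.append(alpha * value + (1 - alpha) * result[-1])
    return result


ema3 = exponential_moving_average([1, 2, 3], 3)
print(ema3)  # [1.0, 1.5, 2.25]
```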

macd, Moving Average Convergence Divergence

macd:<fast_period>,<slow_period>@<on>
macd.signal:<fast_period>,<slow_period>,<signal_period>@<on>
macd.histogram:<fast_period>,<slow_period>,<signal_period>@<on>

  • fast_period? int=12 fast period (short period). Defaults to 12
  • slow_period? int=26 slow period (long period). Defaults to 26
  • signal_period? int=9 signal period. Defaults to 9
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# macd
stock['macd']
stock['macd.dif']

# macd signal band, which is a shortcut for stock['macd.signal']
stock['macd.s']
stock['macd.signal']
stock['macd.dea']

# macd histogram band, which is equivalent to stock['macd.h']
stock['macd.histogram']
stock['macd.h']
stock['macd.macd']
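
The relationship between the three MACD series can be sketched in plain Python. This is a textbook formulation under stated assumptions, not the library's code: the EMA uses alpha = 2 / (period + 1) seeded with the first value, and the histogram is taken as the plain difference between the two lines (some charting tools scale it by 2). The function names are hypothetical.

```python
def ema(values, period):
    """EMA with alpha = 2 / (period + 1), seeded with the first value."""
    alpha = 2 / (period + 1)
    result = []
    for value in values:
        previous = result[-1] if result else float(value)
        result.append(alpha * value + (1 - alpha) * previous)
    return result


def macd(close, fast_period=12, slow_period=26, signal_period=9):
    """Return (dif, signal, histogram) lists of the same length as `close`."""
    fast = ema(close, fast_period)
    slow = ema(close, slow_period)
    dif = [f - s for f, s in zip(fast, slow)]
    signal = ema(dif, signal_period)
    # Assumption: unscaled histogram; some platforms multiply it by 2
    histogram = [d - s for d, s in zip(dif, signal)]
    return dif, signal, histogram


rising = list(range(1, 61))
dif, signal, histogram = macd(rising)
print(dif[-1] > 0)  # True: the fast EMA stays above the slow EMA in an uptrend
```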

boll, BOLLinger bands

boll:<period>@<on>
boll.upper:<period>,<times>@<on>
boll.lower:<period>,<times>@<on>

  • period? int=20
  • times? float=2.
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# boll
stock['boll']

# bollinger upper band, a shortcut for stock['boll.upper']
stock['boll.u']
stock['boll.upper']

# bollinger lower band, which is equivalent to stock['boll.l']
stock['boll.lower']
stock['boll.l']

bbw, Bollinger Band Width

bbw:<period>@<on>

  • period? int=20
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# Bollinger band width
stock['bbw']
# i.e.
stock['bbw:20']

# which is equivalent to
(stock['boll.upper'] - stock['boll.lower']) / stock['boll']

# and is also equivalent to
stock['(boll.upper - boll.lower) / boll']
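
The band-width identity above can be checked with a small plain-Python sketch of Bollinger Bands. It assumes the middle band is a simple moving average and that the bands sit `times` population standard deviations away from it; whether stock-pandas uses the population or the sample standard deviation is not stated here, so this is an assumption, and the function names are hypothetical.

```python
import math
import statistics


def bollinger(values, period=20, times=2.0):
    """Return (middle, upper, lower) lists; NaN until a full window exists."""
    middle, upper, lower = [], [], []
    for i in range(len(values)):
        if i + 1 < period:
            middle.append(math.nan)
            upper.append(math.nan)
            lower.append(math.nan)
            continue
        window = values[i + 1 - period:i + 1]
        mean = sum(window) / period
        # Assumption: population standard deviation
        deviation = statistics.pstdev(window)
        middle.append(mean)
        upper.append(mean + times * deviation)
        lower.append(mean - times * deviation)
    return middle, upper, lower


def band_width(middle, upper, lower):
    """(upper - lower) / middle for each position."""
    return [(u - l) / m for m, u, l in zip(middle, upper, lower)]


middle, upper, lower = bollinger([1, 2, 3, 4], period=4)
bbw = band_width(middle, upper, lower)
print(middle[-1], upper[-1], lower[-1], bbw[-1])
```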

rsv, Raw Stochastic Value

rsv:<period>@<high>,<low>,<close>

Calculates the raw stochastic value which is often used to calculate KDJ

  • period int (required)
  • high? str='high' The column name for high prices. Defaults to 'high'
  • low? str='low' The column name for low prices. Defaults to 'low'
  • close? str='close' The column name for close prices. Defaults to 'close'

# Uses default columns (high, low, close)
stock['rsv:9']

# Specify custom columns
stock['rsv:9@high,low,close']

kdj, a variety of stochastic oscillator

A variant of the Stochastic Oscillator indicator created by Dr. George Lane, which follows the formula:

RSV = rsv(period_rsv)
%K = ema(RSV, period_k)
%D = ema(%K, period_d)
%J = 3 * %K - 2 * %D

Here, ema is the exponentially weighted moving average with init_value as its initial value.

Note that this calculation formula differs from the one on Wikipedia, but it is more popular and more widely used in the industry.

Directive Arguments:

kdj.k:<period_rsv>,<period_k>,<init_value>@<high>,<low>,<close>
kdj.d:<period_rsv>,<period_k>,<period_d>,<init_value>@<high>,<low>,<close>
kdj.j:<period_rsv>,<period_k>,<period_d>,<init_value>@<high>,<low>,<close>

  • period_rsv? int=9 The period for calculating RSV, which is used for %K
  • period_k? int=3 The period for calculating the EMA of RSV, which is used for %K
  • period_d? int=3 The period for calculating the EMA of %K, which is used for %D
  • init_value? float=50.0 The initial value for calculating ema. Trading software from different vendors usually uses different initial values, typically 0.0, 50.0 or 100.0.
  • high? str='high' The column name for high prices. Defaults to 'high'
  • low? str='low' The column name for low prices. Defaults to 'low'
  • close? str='close' The column name for close prices. Defaults to 'close'

# The %D series of KDJ
stock['kdj.d']
# which is equivalent to
stock['kdj.d:9,3,3,50.0@high,low,close']

# The KDJ series with parameters 9, 9, and 9
stock[['kdj.k:9,9,50.0', 'kdj.d:9,9,9,50.0', 'kdj.j:9,9,9,50.0']]
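
The formula above can be sketched in plain Python. Several details here are assumptions rather than facts about stock-pandas: RSV is computed as 100 * (close - lowest low) / (highest high - lowest low), the smoothing weight is 1 / period (the convention common in trading software), incomplete windows use whatever data is available, and a flat window yields an RSV of 0. The function `kdj` is hypothetical.

```python
def kdj(high, low, close, period_rsv=9, period_k=3, period_d=3,
        init_value=50.0):
    """Return (%K, %D, %J) lists following RSV -> %K -> %D -> %J."""
    k_values, d_values, j_values = [], [], []
    prev_k = prev_d = init_value
    for i in range(len(close)):
        start = max(0, i + 1 - period_rsv)  # assumption: partial windows
        highest = max(high[start:i + 1])
        lowest = min(low[start:i + 1])
        spread = highest - lowest
        rsv = (close[i] - lowest) / spread * 100 if spread else 0.0
        # Assumption: smoothing weight of 1 / period
        k = prev_k + (rsv - prev_k) / period_k
        d = prev_d + (k - prev_d) / period_d
        k_values.append(k)
        d_values.append(d)
        j_values.append(3 * k - 2 * d)
        prev_k, prev_d = k, d
    return k_values, d_values, j_values


k, d, j = kdj(
    high=[10, 10, 10], low=[0, 0, 0], close=[5, 10, 0], period_rsv=1
)
print(k, d, j)
```

With an RSV of 50 on the first bar and an initial value of 50.0, %K, %D and %J all start at 50; afterwards %J swings more widely than %K and %D, which is why it is often used for thresholds such as `kdj.j < 0`.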

kdjc, another variety of stochastic oscillator

Removed in 4.x

Unlike kdj, kdjc used the close value instead of the high and low values to calculate rsv, which made the indicator more sensitive than kdj.

The arguments of kdjc were the same as those of kdj.

# after 5.0.0
stock['kdj.j@close,close']

# which is equivalent to
stock['kdj.j:9,3,3,50.0@close,close,close']

rsi, Relative Strength Index

rsi:<period>@<on>

Calculates the N-period RSI (Relative Strength Index)

  • period int The period to calculate RSI. period should be an int larger than 1
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# Uses default close column
stock['rsi:14']

# Calculate RSI on a different column
stock['rsi:14@open']
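
As an illustration of what an N-period RSI measures, here is a plain-Python sketch using Wilder's smoothing of average gains and losses. Whether stock-pandas uses exactly this smoothing is an assumption; the function `rsi` below is hypothetical and independent of the library.

```python
import math


def rsi(values, period):
    """Wilder-style RSI; NaN until `period` price changes are available."""
    result = [math.nan] * len(values)
    if len(values) <= period:
        return result

    changes = [values[i] - values[i - 1] for i in range(1, len(values))]
    gains = [max(change, 0.0) for change in changes]
    losses = [max(-change, 0.0) for change in changes]

    # Seed with the plain average of the first `period` changes
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    for i in range(period, len(values)):
        if i > period:
            avg_gain = (avg_gain * (period - 1) + gains[i - 1]) / period
            avg_loss = (avg_loss * (period - 1) + losses[i - 1]) / period
        if avg_loss == 0:
            result[i] = 100.0
        else:
            result[i] = 100 - 100 / (1 + avg_gain / avg_loss)
    return result


zigzag = rsi([1, 2, 1, 2, 1], 2)
uptrend = rsi([1, 2, 3, 4, 5], 2)
print(zigzag[2:], uptrend[-1])
```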

bbi, Bull and Bear Index

bbi:<a>,<b>,<c>,<d>@<on>

Calculates indicator BBI (Bull and Bear Index) which is the average of ma:3, ma:6, ma:12, ma:24 by default

  • a? int=3
  • b? int=6
  • c? int=12
  • d? int=24
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# Uses default parameters
stock['bbi']

# Custom parameters
stock['bbi:5,10,20,30@close']
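
Since BBI is described above as the average of four simple moving averages, it can be sketched in a few lines of plain Python. The helpers `sma` and `bbi` are hypothetical illustrations and do not call stock-pandas.

```python
import math


def sma(values, period):
    """Simple moving average; NaN until a full window exists."""
    return [
        math.nan if i + 1 < period
        else sum(values[i + 1 - period:i + 1]) / period
        for i in range(len(values))
    ]


def bbi(values, a=3, b=6, c=12, d=24):
    """Average of the four moving averages at each position."""
    averages = [sma(values, period) for period in (a, b, c, d)]
    return [sum(items) / 4 for items in zip(*averages)]


close = list(range(1, 25))  # 1, 2, ..., 24
result = bbi(close)
print(result[-1])  # 18.875 = (23 + 21.5 + 18.5 + 12.5) / 4
```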

atr, the Average True Range

atr:<period>@<high>,<low>,<close>

Calculate the ATR (Average True Range)

  • period? int=14 The period to calculate the moving average of the true ranges. Defaults to 14
  • high? str='high' The column name for high prices. Defaults to 'high'
  • low? str='low' The column name for low prices. Defaults to 'low'
  • close? str='close' The column name for close prices. Defaults to 'close'

# Uses default period and columns
stock['atr']

# Custom period
stock['atr:20']

tr, the True Range

New in 5.1.0

tr@<high>,<low>,<close>

Calculate the TR (True Range).

# the True Range
stock['tr']

# Actually atr:14 is the 14-period moving average of tr
stock['atr:14']

# which is equivalent to
stock['ma:14@(tr)']
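
The equivalence above (ATR as a moving average of TR) can be sketched in plain Python. The true range follows the usual definition: the largest of high - low, |high - previous close| and |low - previous close|. How the first bar, which has no previous close, is handled is an assumption here (it falls back to high - low), and the function names are hypothetical.

```python
import math


def true_range(high, low, close):
    """True range per bar; the first bar falls back to high - low."""
    result = []
    for i in range(len(close)):
        span = high[i] - low[i]
        if i == 0:
            result.append(span)  # assumption: no previous close available
        else:
            previous = close[i - 1]
            result.append(
                max(span, abs(high[i] - previous), abs(low[i] - previous))
            )
    return result


def average_true_range(high, low, close, period=14):
    """Simple moving average of the true range."""
    tr = true_range(high, low, close)
    return [
        math.nan if i + 1 < period
        else sum(tr[i + 1 - period:i + 1]) / period
        for i in range(len(tr))
    ]


high, low, close = [10, 12, 11], [8, 9, 7], [9, 11, 8]
tr = true_range(high, low, close)                     # [2, 3, 4]
atr = average_true_range(high, low, close, period=2)  # [nan, 2.5, 3.5]
```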

llv, Lowest of Low Values

llv:<period>@<on>

Gets the lowest of low prices in N periods

  • period int (required)
  • on? str='low' The column or directive on which the calculation should be based. Defaults to 'low'

# The 10-period lowest prices
stock['llv:10']

# The 10-period lowest close prices
stock['llv:10@close']

hhv, Highest of High Values

hhv:<period>@<on>

Gets the highest of high prices in N periods. The arguments of hhv are the same as those of llv.

  • period int (required)
  • on? str='high' The column or directive on which the calculation should be based. Defaults to 'high'

# The 10-period highest prices
stock['hhv:10']

# The 10-period highest close prices
stock['hhv:10@close']

donchian, Donchian Channels

donchian:<period>@<high>,<low>
donchian.upper:<period>@<high>
donchian.lower:<period>@<low>

Gets the Donchian channels, the historical view of price volatility by charting a security's highest and lowest prices over a set period

  • period int (required)
  • high? str='high' The column to calculate highest high values, defaults to 'high'
  • low? str='low' The column to calculate lowest low values, defaults to 'low'

# Donchian middle channel with default columns
stock['donchian:20']

# Donchian upper channel
stock['donchian.upper:20']

# Donchian lower channel
stock['donchian.lower:20']
# Donchian middle channel
stock['donchian']
stock['donchian.middle']

# Donchian upper channel, a shortcut for stock['donchian.upper']
stock['donchian.u']
stock['donchian.upper']

# Donchian lower channel, which is equivalent to stock['donchian.l']
stock['donchian.lower']
stock['donchian.l']
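
Donchian channels combine the two ideas behind hhv and llv. The plain-Python sketch below assumes the usual definition: the upper channel is the highest high of the window, the lower channel is the lowest low, and the middle channel is their average. The function `donchian` is hypothetical and not the library's code.

```python
import math


def donchian(high, low, period):
    """Return (middle, upper, lower) lists; NaN until a full window exists."""
    middle, upper, lower = [], [], []
    for i in range(len(high)):
        if i + 1 < period:
            middle.append(math.nan)
            upper.append(math.nan)
            lower.append(math.nan)
            continue
        highest = max(high[i + 1 - period:i + 1])  # same idea as hhv
        lowest = min(low[i + 1 - period:i + 1])    # same idea as llv
        upper.append(highest)
        lower.append(lowest)
        middle.append((highest + lowest) / 2)
    return middle, upper, lower


middle, upper, lower = donchian([1, 3, 2, 5], [0, 1, 1, 2], period=2)
print(middle[1:], upper[1:], lower[1:])
# [1.5, 2.0, 3.0] [3, 3, 5] [0, 1, 1]
```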

hv, Historical Volatility

hv:<period>,<time_frame>,<trading_days>@<on>

Gets the historical volatility, the statistical measure of the dispersion of returns for a security or index over a period of time

  • period int (required)
  • time_frame? string type of TimeFrame, '1m', '3m', etc. Defaults to the time frame of the StockDataFrame
  • trading_days? int=252 trading days in a year. Defaults to 252; for cryptocurrencies, 365 should be used.
  • on? str='close' The column or directive on which the calculation should be based. Defaults to 'close'

# 10-period historical volatility for 15-minute data based on 365 yearly trading days
stock['hv:10,15m,365']

# Uses default time_frame and trading_days
stock['hv:10']

Built-in Commands for Statistics

column

Removed in 5.0.0

# A bool-type series indicates whether the current price is higher than the upper bollinger band

# Before 5.0.0
stock['column:close > boll.upper']

# Since 5.0.0, we could just do as follows
stock['close > boll.upper']

increase

increase:<repeat>,<direction>@<on>

Gets a bool-type series, each item of which is True if the value of indicator on has increased over the last repeat periods.

  • repeat? int=1
  • direction? 1 | -1 the direction of "increase". -1 means decreasing
  • on str | (Directive) the directive of an indicator or the column name of the StockDataFrame on which the calculation should be based

For example:

# Which means whether the `ma:20,close` line
# (a.k.a. 20-period simple moving average on column `'close'`)
# has been increasing repeatedly for 3 times (maybe 3 days)
stock['increase:3@(ma:20@close)']

# If the close price has been decreasing repeatedly for 5 times (maybe 5 days)
stock['increase:5,-1@close']
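
The semantics described above can be sketched in plain Python. This is an interpretation, not the library's code: an item is True when each of the last `repeat` steps moved strictly in the given direction, and items without enough history are False. The function `increase` is hypothetical.

```python
def increase(values, repeat=1, direction=1):
    """True where the last `repeat` steps all moved in `direction`."""
    result = []
    for i in range(len(values)):
        if i < repeat:
            result.append(False)  # not enough history yet
            continue
        steps = [
            (values[j] - values[j - 1]) * direction > 0
            for j in range(i - repeat + 1, i + 1)
        ]
        result.append(all(steps))
    return result


rising = increase([1, 2, 3, 2, 3, 4, 5], repeat=2)
falling = increase([5, 4, 3, 4], repeat=2, direction=-1)
print(rising)   # [False, False, True, False, False, True, True]
print(falling)  # [False, False, True, False]
```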

style

style:<style>@<open>,<close>

Gets a bool-type series indicating whether the candlestick of each period is of the given style.

  • style 'bullish' | 'bearish' (required)
  • open? str='open' The column name for open prices. Defaults to 'open'
  • close? str='close' The column name for close prices. Defaults to 'close'

# Uses default open and close columns
stock['style:bullish']

# Specify custom columns
stock['style:bearish@open,close']

repeat

repeat:<repeat>@<bool_directive>

The repeat command first gets the result of the directive bool_directive, and then detects whether True is repeated repeat times in a row.

  • repeat? int=1 which should be larger than 0
  • bool_directive str | (Directive) the directive which should return a series of bools. Can be a column name or a directive wrapped in parentheses.

# Whether the bullish candlestick repeats for 3 periods (maybe 3 days)
stock['repeat:3@(style:bullish)']

# Repeat check on a comparison expression
stock['repeat:5@(close > ma:20)']
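
A plain-Python sketch of the repeat check: it tracks the length of the current run of True values and reports whether that run has reached the required length. The function `repeat` is a hypothetical illustration, not the library's implementation.

```python
def repeat(flags, times=1):
    """True where the last `times` items, including this one, are all True."""
    result = []
    streak = 0
    for flag in flags:
        streak = streak + 1 if flag else 0
        result.append(streak >= times)
    return result


bullish = [True, True, False, True, True, True]
print(repeat(bullish, 2))  # [False, True, False, False, True, True]
```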

change

change:<period>@<on>

Percentage change between the current and a prior element on a certain series

Computes the percentage change from the immediately previous element by default. This is useful in comparing the percentage of change in a time series of prices.

  • period? int=2 2 means we compute with the start value and the end value of a 2-period window.
  • on str | (Directive) the directive of an indicator or the column name of the StockDataFrame on which the calculation should be based

# Percentage change of 20-period simple moving average
stock['change@(ma:20)']

# Percentage change with custom period
stock['change:5@close']

# Percentage change of a column
stock['change@close']
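
The window convention above (a period of 2 compares each value with the immediately previous one) can be sketched in plain Python. The result is expressed as a fraction, like `pct_change` in pandas; whether stock-pandas returns a fraction or a percentage is an assumption here, and the function `change` is hypothetical.

```python
import math


def change(values, period=2):
    """Relative change between the first and last value of each window."""
    result = []
    for i in range(len(values)):
        if i + 1 < period:
            result.append(math.nan)
        else:
            start = values[i + 1 - period]
            result.append(values[i] / start - 1)
    return result


close = [100, 110, 121]
step = change(close)            # roughly [nan, 0.1, 0.1]
wide = change(close, period=3)  # roughly [nan, nan, 0.21]
```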

Operators

left operator right

Operator: //

Whether left crosses through right from below right to above it, which we call a "cross up".

# Also known as "golden crosses"
stock['macd // macd.signal']

Note: since 4.0.0, the // operator is used instead of / for two reasons:

  • To avoid potential conflicts with future division operations
  • To maintain consistency with \\, since in strings we need to write '\\'

Operator: \\

Whether left crosses down through right.

# Also known as "death crosses"
stock['macd \\ macd.signal']

Note that in the example above we must escape the backslash, hence the double backslashes '\\'.

Operator: ><

whether left crosses right, either up or down.

Operator: < | <= | == | >= | >

For a certain record of the same time, whether the value of left is less than / less than or equal to / equal to / larger than or equal to / larger than the value of right.
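
The cross operators can be illustrated with a plain-Python sketch. It assumes that a cross up happens when left was at or below right on the previous record and is strictly above it on the current one, and symmetrically for a cross down; the exact treatment of ties in stock-pandas is not specified here. All function names are hypothetical.

```python
def cross_up(left, right):
    """True where `left` moves from at-or-below `right` to above it."""
    result = [False] * len(left)
    for i in range(1, len(left)):
        result[i] = left[i - 1] <= right[i - 1] and left[i] > right[i]
    return result


def cross_down(left, right):
    """True where `left` moves from at-or-above `right` to below it."""
    result = [False] * len(left)
    for i in range(1, len(left)):
        result[i] = left[i - 1] >= right[i - 1] and left[i] < right[i]
    return result


def cross(left, right):
    """True where `left` crosses `right` in either direction."""
    return [
        up or down
        for up, down in zip(cross_up(left, right), cross_down(left, right))
    ]


left, right = [1, 3, 1], [2, 2, 2]
print(cross_up(left, right))    # [False, True, False]
print(cross_down(left, right))  # [False, False, True]
print(cross(left, right))       # [False, True, True]
```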

Errors

from stock_pandas import (
    DirectiveSyntaxError,
    DirectiveValueError
)

DirectiveSyntaxError

Raised if there is a syntax error in the given directive.

stock['''
repeat
    :   5
    @   (
            close >> boll.upper
        )
''']

DirectiveSyntaxError might print some messages like this:

File "<string>", line 5, column 26

   repeat
       :   5
       @   (
>              close >> boll.upper
           )

                     ^
DirectiveSyntaxError: unexpected token ">>"

DirectiveValueError

Raised if

  • there is an unknown command name
  • something is wrong about the command arguments
  • etc.

About Pandas Copy-on-Write (CoW) Mode

Since 1.3.0, stock-pandas supports pandas copy-on-write mode.

You can enable pandas copy-on-write mode by setting pd.options.mode.copy_on_write = True

or using the environment variable:

export STOCK_PANDAS_COW=1

Advanced Sections

How to Define a New Command

How to extend stock-pandas and support more indicators

This section is recommended only for contributors, not for normal users, because the API might change in the future.

Since 4.0.0, stock-pandas uses structured classes instead of tuples and dicts to define commands, making the API more explicit and type-safe.

from stock_pandas import (
    StockDataFrame,
    CommandPreset,
    CommandArgType,
    CommandArg,
    CommandDefinition,
    ReturnType
)

To add a new indicator to stock-pandas, use the StockDataFrame.define_command class method:

# Define a new command
StockDataFrame.define_command(
    'new-indicator',
    CommandDefinition(
        preset=CommandPreset(
            formula=formula,
            lookback=lookback_function,
            args=args_list,
            series=series_list
        ),
        sub_commands=sub_commands_dict,
        aliases=aliases_dict
    )
)

For a simple indicator, such as simple moving average, you could check the implementation here.

CommandFormula: formula(*args, *series) -> ReturnType

formula is a Callable[..., Tuple[ndarray, int]].

The formula receives arguments in the following order:

  1. Regular arguments (from CommandPreset.args) - these are specified with : in directive syntax
  2. Series arguments (from CommandPreset.series) - these are specified with @ in directive syntax and can be column names or directives

The formula returns ReturnType, which is Tuple[ndarray, int]:

  • The first item is the calculated result as a numpy ndarray.
  • The second item is the minimum periods needed to calculate the indicator.

Note: The df and s parameters are no longer passed to the formula. Series arguments are automatically resolved to numpy arrays before being passed to the formula.

CommandArg: Defining Command Arguments

CommandArg is a dataclass that defines each argument of a command:

CommandArg(
    default=default_value,
    coerce=coerce_function
)

  • default Optional[CommandArgType]: The default value for the argument. None indicates that it is a required argument.
  • coerce Callable[[CommandArgType], CommandArgType]: A raisable callable that validates the input, coerces the type, and returns the validated value. If a default value is provided and the user doesn't specify a value, the coerce function will be skipped.

Example:

# Regular arguments (specified with `:` in directive syntax)
args_ma = [
    # Required period argument with type coercion
    CommandArg(
        # Setting `default` to `None` indicates a required argument
        default=None,
        # with type coercion
        coerce=period_to_int
    )
]

# Series arguments (specified with `@` in directive syntax)
series_ma = [
    # Optional 'on' argument with default value
    CommandArg(default='close')
]
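
The example above refers to a coerce function named period_to_int without showing it. As a rough idea of what such a "raisable callable" could look like, here is a hypothetical stand-in. The real period_to_int in stock-pandas may behave differently, for example by raising DirectiveValueError instead of ValueError.

```python
def period_to_int(value):
    """Hypothetical coerce function: validate and convert a period argument.

    Raises ValueError if `value` is not a positive integer-like value.
    """
    try:
        period = int(value)
    except (TypeError, ValueError):
        raise ValueError(f'period must be an integer, got {value!r}')

    if period <= 0:
        raise ValueError(f'period must be larger than 0, got {period}')

    return period


print(period_to_int('5'))  # 5
```

Because all directive parameters arrive as strings, the coerce step is the natural place to both convert the type and reject invalid values.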

CommandDefinition: Complete Command Definition

CommandDefinition is a dataclass that combines all aspects of a command:

CommandDefinition(
    preset=command_preset,
    sub_commands=sub_commands_dict,
    aliases=aliases_dict
)

  • preset Optional[CommandPreset]: The main command preset. None indicates that only sub commands exist (e.g., kdj has only kdj.k, kdj.d, kdj.j).
  • sub_commands Optional[Dict[str, CommandPreset]]: A dict declaring sub commands, such as boll.upper. None indicates no sub commands.
  • aliases Optional[Dict[str, Optional[str]]]: A dict declaring shortcuts or aliases for commands. None indicates no aliases.

Example with sub commands and aliases:

StockDataFrame.define_command(
    'macd',
    CommandDefinition(
        preset=CommandPreset(
            formula=macd_formula,
            lookback=lookback_macd,
            args=args_macd,
            series=series_close
        ),
        sub_commands=dict(
            signal=CommandPreset(
                formula=macd_signal_formula,
                lookback=lookback_macd_signal,
                args=args_macd_all,
                series=series_close
            ),
            histogram=CommandPreset(
                formula=macd_histogram_formula,
                lookback=lookback_macd_signal,
                args=args_macd_all,
                series=series_close
            )
        ),
        aliases=dict(
            s='signal',       # macd.s is alias for macd.signal
            h='histogram',    # macd.h is alias for macd.histogram
            dif=None,         # macd.dif is alias for macd (main command)
            dea='signal',
            macd='histogram'
        )
    )
)

When an alias value is None, it means the alias refers to the main command. For example, macd.dif is an alias for macd itself.

How to Isolate Custom Commands

By default, when you call StockDataFrame.define_command(), the command is registered globally for the StockDataFrame class and all its instances. This is because COMMANDS and DIRECTIVES_CACHE are class variables that are shared across all instances.

If you need to create isolated command spaces for different use cases, you can create a subclass and override these class variables:

from stock_pandas import StockDataFrame, DirectiveCache

class MyStockDataFrame(StockDataFrame):
    # Create an independent copy of COMMANDS
    COMMANDS = StockDataFrame.COMMANDS.copy()
    # Create an independent directive cache
    DIRECTIVES_CACHE = DirectiveCache()

Now you can define commands specifically for MyStockDataFrame without affecting the base StockDataFrame class:

# Define a custom command for MyStockDataFrame only
MyStockDataFrame.define_command('my_indicator', my_command_definition)

# This command is only available in MyStockDataFrame
my_stock = MyStockDataFrame(data)
my_stock['my_indicator']  # Works!

# But not available in the base StockDataFrame
stock = StockDataFrame(data)
stock['my_indicator']  # Raises error!

Why This Works

This isolation works due to Python's class variable mechanism:

  1. COMMANDS Dictionary: By calling .copy(), you create a shallow copy of the commands dictionary. This means the subclass has its own dictionary instance, so any modifications (adding/removing commands) won't affect the parent class.

  2. DIRECTIVES_CACHE: By creating a new DirectiveCache() instance, the subclass maintains its own cache of parsed directives. This ensures that:

    • Directives are parsed and cached independently for each class
    • Changes to command definitions in one class don't cause cache inconsistencies in another
  3. Class Method Resolution: When you call MyStockDataFrame.define_command(), Python resolves cls.COMMANDS to MyStockDataFrame.COMMANDS, which is the independent copy you created.
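
The mechanism described in the list above is plain Python behavior and can be demonstrated without stock-pandas. In this sketch, `Registry` is a hypothetical stand-in for a class with a shared COMMANDS dictionary; it shows that a subclass with its own copy is isolated, while a subclass without one writes into the parent's dictionary.

```python
class Registry:
    # Class variable shared by the class and, by default, its subclasses
    COMMANDS = {'ma': 'built-in'}

    @classmethod
    def define_command(cls, name, definition):
        # `cls.COMMANDS` resolves to the nearest class that defines COMMANDS
        cls.COMMANDS[name] = definition


class IsolatedRegistry(Registry):
    # An independent shallow copy: changes stay in this subclass
    COMMANDS = Registry.COMMANDS.copy()


class SharedRegistry(Registry):
    # No override: this subclass mutates the parent's dictionary
    pass


IsolatedRegistry.define_command('custom', 'isolated')
SharedRegistry.define_command('leaky', 'shared')

print('custom' in Registry.COMMANDS)  # False
print('leaky' in Registry.COMMANDS)   # True
```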

Use Cases

This pattern is useful when:

  • Testing: You want to add test-specific indicators without polluting the global command space
  • Multiple Strategies: Different trading strategies require different custom indicators
  • Library Development: You're building a library on top of stock-pandas and want to provide additional indicators without affecting users who don't need them
  • Experimentation: You want to try different command implementations without risking interference with production code

Example with multiple isolated classes:

class StrategyA_DataFrame(StockDataFrame):
    COMMANDS = StockDataFrame.COMMANDS.copy()
    DIRECTIVES_CACHE = DirectiveCache()

class StrategyB_DataFrame(StockDataFrame):
    COMMANDS = StockDataFrame.COMMANDS.copy()
    DIRECTIVES_CACHE = DirectiveCache()

# Each class can have its own custom commands
StrategyA_DataFrame.define_command('custom_a', definition_a)
StrategyB_DataFrame.define_command('custom_b', definition_b)

Development

First, install conda (recommended), and generate a conda environment for this project

conda create -n stock-pandas python=3.12

conda activate stock-pandas

# Install requirements
make install

# Build python ext (C++)
make build-ext

# Run unit tests
make test
