A practical Python SDK for stock analysis with:
- baseline (no tools)
- single-agent (all tools)
- multi-agent with 3 patterns:
  - orchestrator + specialists + critic
  - sequential pipeline
  - parallel specialists + aggregator
- critic strategies for orchestrator mode:
  - strict-rewrite (default)
  - no-rewrite
  - soft-gated
  - dual-draft
  - minimal-rewrite
  - auto (question-aware strategy selector)
- evaluator and benchmark runner
This project is extracted from mp3_assignment_chenlei1_dnngong2.ipynb and packaged for reuse.
- Tooling for price performance, market status, top movers, company overview, news sentiment, and local SQL lookup.
- Switchable data-source providers: alphavantage, yahoo, hybrid.
- SDK APIs for run_single_agent, run_multi_agent, run_evaluator, and run_full_evaluation.
- CLI (stock-agents) for quick usage.
- Environment-variable-only secrets loading (no hardcoded keys).
User Question
|
v
[Orchestrator]
|- decides active specialists
|- generates sub-tasks
v
┌───────────────────────────────────────────────────────────┐
│ Specialists (tool-scoped) │
│ • Market Specialist -> tickers/price/status/movers│
│ • Fundamental Specialist -> overview/sql/tickers │
│ • Sentiment Specialist -> news/sql │
└───────────────────────────────────────────────────────────┘
|
v
[Synthesis]
|- merges specialist outputs into one draft answer
v
[Critic]
|- checks missing fields / contradictions / support
|- outputs: confidence, issues, corrected final answer (strategy-dependent)
v
Final Answer (+ agent_results, elapsed_sec, architecture)
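The tool scoping in the diagram above can be pictured as filtering one shared tool map per specialist. This is a minimal sketch, not the SDK's implementation: the tool keys are the short labels from the diagram, and `SPECIALIST_TOOLS` and `scope_tools` are hypothetical names introduced for illustration.

```python
# Hypothetical tool keys, copied from the labels in the diagram above.
SPECIALIST_TOOLS = {
    "market": ["tickers", "price", "status", "movers"],
    "fundamental": ["overview", "sql", "tickers"],
    "sentiment": ["news", "sql"],
}


def scope_tools(tool_functions, specialist):
    """Return only the tools the given specialist is allowed to call."""
    allowed = SPECIALIST_TOOLS[specialist]
    return {name: fn for name, fn in tool_functions.items() if name in allowed}


# Stand-in tool map; real tools would call a data provider or the local DB.
all_tools = {
    name: (lambda name=name: f"{name} result")
    for names in SPECIALIST_TOOLS.values()
    for name in names
}

sentiment_tools = scope_tools(all_tools, "sentiment")
print(sorted(sentiment_tools))  # ['news', 'sql']
```

Keeping each specialist's tool list small is one plausible way to limit irrelevant tool calls; the actual scoping rules live in the package source.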
Critic strategy layer (orchestrator only):
+------------------------------+
Draft A --------------->| strict-rewrite |--> final = critic rewrite
+------------------------------+
+------------------------------+
Draft A + critic score->| no-rewrite |--> final = Draft A
+------------------------------+
+------------------------------+
Draft A + critic score->| soft-gated |--> low risk: keep Draft A
| |--> high risk: use critic rewrite
+------------------------------+
+------------------------------+
Draft A + issues ------>| minimal-rewrite |--> patch only risky/missing parts
+------------------------------+
Draft A + Draft B ----->| dual-draft |--> critic scores both drafts
| |--> pick higher-scoring draft
+------------------------------+
+------------------------------+
Question text --------->| auto |--> choose strict/no/soft by complexity
+------------------------------+
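As an illustration of the soft-gated row above, the sketch below keeps Draft A when risk looks low and switches to the critic rewrite otherwise. It rests on assumptions: `CriticReview` and `soft_gate` are hypothetical names, "high risk" is taken to mean low critic confidence or many issues, and the threshold values are borrowed from the CLI example flags rather than from documented defaults.

```python
from dataclasses import dataclass, field


@dataclass
class CriticReview:
    """Hypothetical container for what the critic returns."""
    confidence: float                       # critic confidence in the draft, 0.0 to 1.0
    issues: list = field(default_factory=list)
    rewrite: str = ""                       # critic's corrected answer, if any


def soft_gate(draft, review, conf_threshold=0.60, issue_threshold=3):
    """Keep the draft when risk looks low; otherwise use the critic rewrite.

    Assumption: high risk = confidence below threshold OR issue count at/above threshold.
    """
    high_risk = (
        review.confidence < conf_threshold
        or len(review.issues) >= issue_threshold
    )
    # Fall back to the draft if the critic produced no rewrite.
    return review.rewrite if high_risk and review.rewrite else draft


ok = CriticReview(confidence=0.9, rewrite="Rewritten answer")
risky = CriticReview(confidence=0.4, issues=["missing P/E"], rewrite="Rewritten answer")
print(soft_gate("Draft A", ok))     # Draft A
print(soft_gate("Draft A", risky))  # Rewritten answer
```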
User Question
|
v
[Agent 1: Market]
|
v
[Agent 2: Fundamental] (sees Agent 1 output)
|
v
[Agent 3: Sentiment] (sees Agent 1 + 2 outputs)
|
v
[Aggregator]
|
v
Final Answer
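The sequential pattern above can be sketched as a loop in which each agent receives the outputs of every earlier agent. The function and the stub agents below are hypothetical; in the real package each step would presumably run an LLM tool-call loop.

```python
def run_pipeline(question, agents, aggregate):
    """Run agents in order; each one sees the outputs of all earlier agents."""
    outputs = []
    for name, agent in agents:
        result = agent(question, list(outputs))  # pass a copy of prior outputs
        outputs.append((name, result))
    return aggregate(question, outputs)


def make_stub(name):
    """Stand-in agent that only reports how much context it received."""
    return lambda question, prior: f"{name} saw {len(prior)} prior output(s)"


agents = [(n, make_stub(n)) for n in ("market", "fundamental", "sentiment")]
final = run_pipeline(
    "How is AAPL doing?",
    agents,
    aggregate=lambda q, outs: " | ".join(text for _, text in outs),
)
print(final)
```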
User Question
├── [Market Specialist] ─┐
├── [Fundamental Spec.] ─┼──> [Aggregator] -> Final Answer
└── [Sentiment Spec.] ──┘
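The fan-out and aggregate shape above can be sketched with a thread pool. This is an assumption about mechanics only: the README does not say how the SDK runs specialists concurrently, and `run_parallel` and the stub specialists are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(question, specialists, aggregate):
    """Run every specialist on the same question concurrently, then aggregate."""
    with ThreadPoolExecutor(max_workers=len(specialists)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in specialists.items()}
        # Collect results by name so the aggregator does not depend on finish order.
        results = {name: fut.result() for name, fut in futures.items()}
    return aggregate(question, results)


# Stand-in specialists; real ones would call an LLM with scoped tools.
specialists = {
    "market": lambda q: "price up",
    "fundamental": lambda q: "P/E known",
    "sentiment": lambda q: "news neutral",
}

answer = run_parallel(
    "How is AAPL doing?",
    specialists,
    aggregate=lambda q, res: "; ".join(f"{k}: {res[k]}" for k in sorted(res)),
)
print(answer)  # fundamental: P/E known; market: price up; sentiment: news neutral
```

Unlike the sequential pipeline, no specialist sees another's output here, so the aggregator is the only place where results are combined.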
- Python 3.10+
- A local stocks.db file with table stocks (columns expected in the notebook assignment).
  - You can generate it with the built-in build-db command (see below).
cd stock_analysis_agents
python -m pip install -e .
Install the Streamlit web app extras if you want the browser UI:
python -m pip install -e ".[app]"
Export these in your shell before running:
export OPENAI_API_KEY="your_openai_key"
export ALPHAVANTAGE_API_KEY="your_alpha_vantage_key"
# Optional overrides
export STOCK_AGENTS_MODEL="gpt-4o-mini"
export STOCK_AGENTS_MODEL_SMALL="gpt-4o-mini"
export STOCK_AGENTS_MODEL_LARGE="gpt-4o"
export STOCK_AGENTS_DB_PATH="/absolute/path/to/stocks.db"
export STOCK_AGENTS_DATA_PROVIDER="hybrid" # alphavantage | yahoo | hybrid
export STOCK_AGENTS_STRUCTURED_LOG="1" # optional: 1 enables JSONL logs
export STOCK_AGENTS_LOG_PATH="./stock_agents_events.jsonl" # optional
Provider notes:
- alphavantage: full endpoint coverage in this SDK (requires ALPHAVANTAGE_API_KEY).
- yahoo: good for price + overview; market-status/movers/news sentiment return error (unsupported).
- hybrid: Alpha Vantage for status/movers/news, Yahoo for price, overview fallback.
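The provider notes can be restated as a small capability matrix, which is handy for checking up front whether a question's tools are covered. The tool keys and the `unsupported_tools` helper are hypothetical labels for illustration; the matrix only encodes what the notes say and makes no claim about how hybrid routes individual calls.

```python
# Hypothetical tool labels for the five provider-backed tools named in this README.
ALL_TOOLS = {"price", "overview", "market_status", "movers", "news_sentiment"}

# Capability matrix restating the provider notes.
SUPPORT = {
    "alphavantage": ALL_TOOLS,
    "yahoo": {"price", "overview"},
    "hybrid": ALL_TOOLS,
}


def unsupported_tools(provider, needed):
    """Return the tools in `needed` that the provider cannot serve, sorted."""
    return sorted(set(needed) - SUPPORT[provider])


print(unsupported_tools("yahoo", ["price", "movers", "news_sentiment"]))
# ['movers', 'news_sentiment']
print(unsupported_tools("hybrid", ["price", "movers"]))
# []
```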
If you have a companies dataset in .csv or .xlsx, build the local sqlite DB:
stock-agents build-db --input /path/to/sp500_companies.csv
or:
stock-agents build-db --input /path/to/companies.xlsx --db-path /path/to/stocks.db
Expected input columns (case-insensitive, common variants supported):
- symbol/ticker
- shortname/company
- sector
- industry
- exchange
- optional marketcap (used to derive Large/Mid/Small)
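Case-insensitive header matching with variants could look roughly like the sketch below. The alias table and `normalize_headers` are hypothetical, and which spelling counts as canonical (for example symbol versus ticker) is an assumption; the Large/Mid/Small derivation is left out because its cut-offs are not stated here.

```python
# Hypothetical alias table: canonical name -> accepted lowercase header variants.
ALIASES = {
    "symbol": {"symbol", "ticker"},
    "shortname": {"shortname", "company"},
    "sector": {"sector"},
    "industry": {"industry"},
    "exchange": {"exchange"},
    "marketcap": {"marketcap"},
}


def normalize_headers(headers):
    """Map raw headers to canonical names, ignoring case and surrounding spaces."""
    mapping = {}
    for raw in headers:
        key = raw.strip().lower()
        for canonical, variants in ALIASES.items():
            if key in variants:
                mapping[raw] = canonical
    return mapping  # unrecognized headers are simply left out


print(normalize_headers(["Ticker", " Company ", "Sector", "Notes"]))
# {'Ticker': 'symbol', ' Company ': 'shortname', 'Sector': 'sector'}
```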
stock-agents -h
stock-agents ask -h
stock-agents eval -h
stock-agents eval-strategies -h
stock-agents build-db -h
Run the Streamlit app from the project root:
streamlit run src/stock_analysis_agents/app_streamlit.py
The app provides:
- A chat workspace for stock-analysis questions.
- Single-agent and multi-agent mode selection.
- Orchestrator, sequential pipeline, and parallel specialist patterns.
- Critic strategy controls for orchestrator mode.
- Data provider, model, API key, and local stocks.db controls.
- Per-answer diagnostics with tool usage, resolved follow-up questions, critic output, and agent summaries.
You can configure credentials either in the sidebar or through environment variables:
export OPENAI_API_KEY="your_openai_key"
export ALPHAVANTAGE_API_KEY="your_alpha_vantage_key"
export STOCK_AGENTS_DATA_PROVIDER="hybrid"
export STOCK_AGENTS_DB_PATH="/absolute/path/to/stocks.db"
For a lighter setup, choose yahoo as the data provider in the sidebar. Yahoo mode supports price and overview tools without an Alpha Vantage key, while market status, movers, and news sentiment require Alpha Vantage-backed modes.
stock-agents ask "What is the P/E ratio of Apple (AAPL)?"
Default output is a human-readable summary. Use --json for raw structured output and --trace to print detailed tool-call logs.
stock-agents ask "Top 3 semiconductor stocks by 1-year return" --arch multi --multi-arch orchestrator
stock-agents ask "Top 3 semiconductor stocks by 1-year return" --arch multi --multi-arch pipeline
stock-agents ask "Top 3 semiconductor stocks by 1-year return" --arch multi --multi-arch parallel
stock-agents ask "What is Apple's P/E ratio?" --arch multi --multi-arch orchestrator --critic-strategy strict-rewrite
stock-agents ask "What is Apple's P/E ratio?" --arch multi --multi-arch orchestrator --critic-strategy no-rewrite
stock-agents ask "What is Apple's P/E ratio?" --arch multi --multi-arch orchestrator --critic-strategy soft-gated
stock-agents ask "What is Apple's P/E ratio?" --arch multi --multi-arch orchestrator --critic-strategy dual-draft
stock-agents ask "What is Apple's P/E ratio?" --arch multi --multi-arch orchestrator --critic-strategy minimal-rewrite
stock-agents ask "What is Apple's P/E ratio?" --arch multi --multi-arch orchestrator --critic-strategy auto
The ask output now includes a Critic Diagnostics section (strategy, rewrite applied, gate, draft choice, critic confidence/issues).
# Manual global thresholds
stock-agents ask "Which tech stocks dropped this month but grew this year?" \
--arch multi --multi-arch orchestrator --critic-strategy soft-gated \
--soft-gate-conf-threshold 0.60 --soft-gate-issue-threshold 3
# Enable manual stratified thresholds (easy/medium/hard groups)
stock-agents ask "Which tech stocks dropped this month but grew this year?" \
--arch multi --multi-arch orchestrator --critic-strategy soft-gated \
--soft-gate-stratified-thresholds
# Data-driven global threshold (from historical .xlsx files)
stock-agents ask "Which tech stocks dropped this month but grew this year?" \
--arch multi --multi-arch orchestrator --critic-strategy soft-gated \
--soft-gate-data-driven global --soft-gate-history-glob "./results_*.xlsx"
# Data-driven stratified thresholds (easy/medium/hard each learned separately)
stock-agents ask "Which tech stocks dropped this month but grew this year?" \
--arch multi --multi-arch orchestrator --critic-strategy soft-gated \
--soft-gate-data-driven stratified --soft-gate-history-glob "./results_*.xlsx"
stock-agents ask "Compare the 1-year returns of AAPL, MSFT, GOOGL" --arch single
stock-agents ask "Are US markets open right now?" --provider alphavantage
stock-agents ask "Compare AAPL and MSFT 1-year return" --provider yahoo
stock-agents ask "What is the P/E ratio of Apple (AAPL)?" --provider yahoo --json
stock-agents eval --model gpt-4o-mini --multi-arch orchestrator --output results_sdk_mini_orch.xlsx
stock-agents eval --model gpt-4o-mini --multi-arch pipeline --output results_sdk_mini_pipeline.xlsx
stock-agents eval --model gpt-4o-mini --multi-arch parallel --output results_sdk_mini_parallel.xlsx
stock-agents eval --model gpt-4o --multi-arch orchestrator --critic-strategy soft-gated --output results_sdk_4o_orch_soft.xlsx
stock-agents eval --model gpt-4o --multi-arch orchestrator --critic-strategy dual-draft --output results_sdk_4o_orch_dual.xlsx
stock-agents eval --model gpt-4o --multi-arch orchestrator --critic-strategy no-rewrite --output results_sdk_4o_orch_norewrite.xlsx
stock-agents eval --model gpt-4o --multi-arch orchestrator --critic-strategy minimal-rewrite --output results_sdk_4o_orch_minrewrite.xlsx
stock-agents eval --model gpt-4o --multi-arch orchestrator --critic-strategy auto --output results_sdk_4o_orch_auto.xlsx
# Explicit run_config path
stock-agents eval --model gpt-4o --output results_sdk_4o.xlsx \
--run-config-path ./artifacts/run_config_4o.json
# Run with data-driven stratified soft-gated thresholds
stock-agents eval --model gpt-4o --multi-arch orchestrator --critic-strategy soft-gated \
--soft-gate-data-driven stratified --soft-gate-history-glob "./results_*.xlsx" \
--output results_sdk_4o_soft_stratified_learned.xlsx
Each evaluation xlsx now includes:
- Results (per-question scores + MA diagnostics columns)
- Summary (Q3-style accuracy by architecture and difficulty)
- Calibration (confidence-vs-score calibration metrics)
Each evaluation also writes run_config.json:
- default path: <output_stem>_run_config.json
- custom path: --run-config-path /your/path/run_config.json
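The default naming rule can be reproduced with pathlib, for example to locate a run's config from its xlsx name. The helper name is hypothetical, and placing the config in the same directory as the output file is an assumption.

```python
from pathlib import Path


def default_run_config_path(output):
    """Derive <output_stem>_run_config.json next to the output file (assumed location)."""
    out = Path(output)
    return out.with_name(f"{out.stem}_run_config.json")


print(default_run_config_path("results_sdk_4o.xlsx"))
# results_sdk_4o_run_config.json
```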
stock-agents eval-strategies --model gpt-4o --output-prefix results_strategy_compare
stock-agents eval-strategies --model gpt-4o --strategies strict-rewrite,no-rewrite,soft-gated,dual-draft,minimal-rewrite,auto
stock-agents eval-strategies --model gpt-4o \
--strategies soft-gated \
--soft-gate-data-driven global \
--soft-gate-history-glob "./results_*.xlsx" \
--output-prefix results_strategy_compare_soft_global
# Optional run_config path template per strategy
stock-agents eval-strategies --model gpt-4o \
--run-config-path "./configs/run_config_{strategy}.json"
This command writes one xlsx per strategy plus a consolidated CSV summary:
- results_strategy_compare_<strategy>.xlsx
- results_strategy_compare_summary.csv
- and one config per run: default <output_stem>_run_config.json
from stock_analysis_agents import (
load_settings,
make_client,
make_data_provider,
FinanceTools,
build_tool_function_map,
run_multi_agent,
)
settings = load_settings()
client = make_client(settings)
provider = make_data_provider(settings.data_provider, settings.alphavantage_api_key)
tools = FinanceTools(provider=provider, db_path=settings.db_path)
func_map = build_tool_function_map(tools)
out = run_multi_agent(
client=client,
model=settings.active_model,
tool_functions=func_map,
question="For the top 3 semiconductor stocks by 1-year return, what are their P/E ratios?",
verbose=False,
architecture="orchestrator", # orchestrator | pipeline | parallel
critic_strategy="auto", # strict-rewrite | no-rewrite | soft-gated | dual-draft | minimal-rewrite | auto
)
print(out["final_answer"])
print(out["diagnostics"])
stock_analysis_agents/
├── pyproject.toml
├── README.md
├── src/
│ └── stock_analysis_agents/
│ ├── __init__.py
│ ├── app_streamlit.py # Streamlit web frontend
│ ├── cli.py # CLI entrypoint: stock-agents
│ ├── config.py # env-based settings
│ ├── llm.py # OpenAI client factory
│ ├── models.py # shared dataclasses
│ ├── providers.py # data-source abstraction layer
│ ├── tools.py # tool wrappers + local DB queries
│ ├── schemas.py # tool schemas for function calling
│ ├── agent_runner.py # reusable tool-call loop
│ ├── baseline.py # no-tool baseline agent
│ ├── single_agent.py # single-agent architecture
│ ├── multi_agent.py # orchestrator / pipeline / parallel multi-agent patterns
│ ├── evaluator.py # LLM-as-judge scoring
│ ├── benchmark.py # fixed benchmark question set
│ ├── evaluation.py # batch runner + xlsx output
│ └── db_builder.py # build stocks.db from csv/xlsx
└── tests/
├── test_imports.py
└── test_db_builder.py
- All secrets are loaded from environment variables.
- query_local_db only allows SELECT statements for safety.
- run_specialist_agent limits tool calls per turn to reduce oversized API payload failures.
- Database builder supports both .csv and .xlsx inputs.
- Structured logs (JSONL) are disabled by default. Enable with:
  - STOCK_AGENTS_STRUCTURED_LOG=1
  - optional STOCK_AGENTS_LOG_PATH=/path/to/events.jsonl
  - event types include: agent.*, multi_agent.*, evaluation.*
  - each line is a JSON object, suitable for pandas/duckdb/ELK ingestion
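Because each log line is a standalone JSON object, the log can be summarized with the standard library alone. The sketch below counts events by type. The `event` key and the sample event names are assumptions for illustration, since only the agent.*, multi_agent.*, and evaluation.* prefixes are documented here.

```python
import json
from collections import Counter


def count_event_types(lines):
    """Count events by type from an iterable of JSONL lines, skipping blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record.get("event", "unknown")] += 1  # "event" key is an assumption
    return counts


# Hypothetical sample lines; real logs come from STOCK_AGENTS_LOG_PATH.
sample = [
    '{"event": "agent.start"}',
    '{"event": "multi_agent.critic"}',
    "",
    '{"event": "agent.start"}',
]
print(count_event_types(sample))
```

Passing an open file handle instead of the sample list works the same way, since a text file iterates line by line.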