Table of contents
- 1. Synthetic uncertain points
- 2. MIT-BIH preprocessing
- 3. MIT-BIH uncertain points
- 4. RadioML 2016.10A uncertain points
- 5. Puiseux Test
- 6. Local Analysis test
- 7. Post-Processing Synthetic Data
- 8. Post-Processing Real Data (MIT-BIH)
- 9. Post-Processing Radio Data (RadioML 2016.10A)
- 10. Newton–Puiseux Evidence & Triage (MIT-BIH)
- 11. Newton–Puiseux Evidence & Triage (RadioML 2016.10A)
This repository contains the complete, reproducible codebase that accompanies the paper “Newton–Puiseux Analysis for Interpretability and Calibration of Complex-Valued Neural Networks”.
Published in Neural Networks, Volume 195, 2026, Article 108172.
View on ScienceDirect
It implements our end-to-end pipeline – from data preprocessing through CVNN training to Newton–Puiseux-based local analysis – across three settings: (1) a controlled synthetic benchmark, (2) the MIT-BIH arrhythmia corpus, and (3) the RadioML 2016.10A wireless modulation dataset.
If you use this code or refer to our results, please cite:
Piotr Migus,
Newton–Puiseux analysis for interpretability and calibration of complex-valued neural networks,
Neural Networks, Volume 195, 2026, Article 108172.
https://doi.org/10.1016/j.neunet.2025.108172
@article{Migus2026NPAnalysis,
title = {Newton--Puiseux analysis for interpretability and calibration of complex-valued neural networks},
author = {Piotr Migus},
journal = {Neural Networks},
volume = {195},
pages = {108172},
year = {2026},
issn = {0893-6080},
doi = {10.1016/j.neunet.2025.108172},
url = {https://www.sciencedirect.com/science/article/pii/S0893608025010524}
}

This repository provides the full codebase and reproducible scripts for our Newton–Puiseux framework, which enhances interpretability and calibration of complex-valued neural networks (CVNNs). We demonstrate the approach on a controlled synthetic dataset and two real-data settings: the MIT-BIH arrhythmia corpus and the RadioML 2016.10A wireless modulation dataset.
```text
├── mit-bih/ # Raw MIT-BIH dataset
├── radio-data/ # External RadioML files
├── mit_bih_pre/ # Preprocessing scripts for MIT‑BIH
│ └── pre_pro.py # Signal filtering and feature extraction
├── src/ # Core library modules
│ ├── post_processing.py # Post‑processing (common)
│ ├── find_up_synthetic.py # Uncertainty mining on synthetic data
│ ├── find_up_real.py # Uncertainty mining on MIT‑BIH data
│ ├── find_up_radio.py # Uncertainty mining on RadioML data
│ ├── local_analysis.py # Local surrogate + Puiseux wrapper
│ └── puiseux.py # Newton‑Puiseux solver
├── up_synth/ # Synthetic dataset training and evaluation
│ └── up_synthetic.py
├── local_analysis_synth_test/ # Tests for local analysis (synthetic)
│ └── local_analysis_synth_test.py
├── puiseux_test/ # Tests for Puiseux solver
│ └── puiseux_test.py
├── post_processing_synth/ # Post‑processing for synthetic data
│ └── post_processing_synth.py
├── up_real/ # MIT‑BIH CVNN training and evaluation
│ └── up_real.py
├── up_radio/ # RadioML 2016.10A CVNN training and evaluation
│ └── up_radio.py
├── post_processing_real/ # Post‑processing for MIT‑BIH data
│ └── post_processing_real.py
├── post_processing_radio/ # Post‑processing for RadioML 2016.10A
│ └── post_processing_radio.py
├── NP-analysis_real/ # Newton–Puiseux evidence & triage (MIT-BIH)
│ └── NP-analysis_real.py
├── NP-analysis_radio/ # Newton–Puiseux evidence & triage (RadioML 2016.10A)
│ └── NP-analysis_radio.py
└── README.md # This file
```
The MIT-BIH Arrhythmia Database is not stored in this repository.
Manual download
- Go to https://physionet.org/content/mitdb/1.0.0/
- Download all files and unzip them into `mit-bih/`.
License notice:
MIT-BIH data are released under the PhysioNet open-access license.
By downloading the files you agree to its terms.
The RadioML 2016.10A dataset is not stored in this repository.
Manual download
- Go to the DeepSig datasets page (RadioML 2016.10A) and download the archive `RML2016.10a.tar.bz2` (it contains `RML2016.10a_dict.pkl`), or download the `.pkl` file directly.
- After download, extract (if needed) and place the file into `radio-data/`. Expected path: the code will look for `radio-data/RML2016.10a_dict.pkl`.
License notice (RadioML 2016.10A):
Released by DeepSig under CC BY‑NC‑SA 4.0 (Attribution‑NonCommercial‑ShareAlike).
By downloading/using the dataset you agree to these terms. See the dataset page and license for details.
- Clone the repository:

```bash
git clone https://github.com/piotrmgs/puiseux-cvnn.git
cd puiseux-cvnn
```

- Create a virtual environment and install dependencies:

```bash
conda env create -f environment.yml
conda activate puiseux-cvnn
```
```bash
# 1 Run the complete synthetic pipeline (≃ 30 s on CPU)
python -m up_synth.up_synthetic

# 2 Generate Newton–Puiseux analysis for uncertain points
python -m post_processing_synth.post_processing_synth
```

To generate synthetic data, train a CVNN on it, and identify uncertain points, run:

```bash
python -m up_synth.up_synthetic
```

This script performs the following steps:
- Data generation: Creates a controlled synthetic dataset in $\mathbb{C}^2$ (default: 200 samples per class).
- Train/Test split: Splits data 80/20 into training and test sets.
- Model construction: Builds a `SimpleComplexNet` with 2 complex input features, 16 hidden units, and 2 output classes.
- Training: Optimizes the model for 50 epochs using Adam (learning rate $10^{-3}$).
- Evaluation: Computes and prints test accuracy.
- Uncertainty mining: Flags test points where the maximum softmax probability falls below the threshold (default: 0.1).
- CSV export: Saves uncertain samples to `up_synth/uncertain_synthetic.csv`, including feature vectors, true labels, and model probabilities.
- Model checkpoint: Saves trained model parameters to `up_synth/model_parameters.pt`.
- Visualization: Generates and saves a PCA scatter plot showing all test samples colored by confidence, with uncertain points highlighted (e.g., `up_synth/uncertainty_plot.png`).
Outputs:
- `up_synth/uncertain_synthetic.csv` — CSV file listing indices, complex inputs, labels, and softmax probabilities of uncertain points.
- `up_synth/model_parameters.pt` — PyTorch state dict of the trained network.
- `up_synth/uncertainty_plot.png` — PCA visualization of test-set confidence.
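If you want to inspect these artifacts outside the pipeline, they load with standard pandas/PyTorch calls. A minimal sketch (file names are taken from the list above; everything else is illustrative):

```python
import pandas as pd
import torch

# Uncertain points flagged by up_synth.up_synthetic
df = pd.read_csv("up_synth/uncertain_synthetic.csv")
print(f"{len(df)} uncertain points; columns: {list(df.columns)}")

# Trained weights as a plain state dict; pass it to
# SimpleComplexNet().load_state_dict(...) to reuse the model.
state = torch.load("up_synth/model_parameters.pt", map_location="cpu")
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
```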
To load, filter, segment, and visualize ECG recordings from the MIT-BIH Arrhythmia Database, run:
```bash
python -m mit_bih_pre.pre_pro
```

This script performs the following steps:
- Record loading: Reads raw signal and annotation files (`.dat`, `.hea`, `.atr`) from `mit-bih/` for records 100–107.
- Bandpass filtering: Applies a zero-phase Butterworth filter (0.5–40 Hz, 2nd order) to each channel to remove baseline wander and high-frequency noise.
- Segmentation: Extracts 128-sample windows around each R peak (50 samples before the peak) for beats labeled 'N' (normal) and 'V' (PVC).
- Hilbert transform: Computes analytic signals per channel and concatenates real and imaginary parts into feature vectors.
- Normalization: Standardizes all features using `StandardScaler`.
- Train/Test split: Splits the data 80/20 (stratified by label).
- Tensor conversion: Converts the full normalized dataset to PyTorch tensors (optional for downstream ML pipelines).
- Visualizations: Generates and saves plots in `mit_bih_pre/`:
  - `class_distribution.png` — histogram of 'N' vs. 'V' samples
  - `signal_comparison_<record>_class_<N|V>.png` — raw vs. filtered signal with R-peak marker
  - `hilbert_transform_<record>_class_<N|V>.png` — filtered signal and Hilbert envelope
  - `spectrogram_<record>_class_<N|V>.png` — time–frequency spectrogram
  - `correlation_matrix.png` — feature correlation heatmap (first 50 features)
  - `tsne_visualization.png` — 2D t-SNE embedding of the feature space
Outputs:
- All figures listed above saved to `mit_bih_pre/`
- In-memory variables `X`, `y`, `X_norm`, `X_train`, `X_test`, `y_train`, `y_test` (accessible within the script)
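The core signal steps above are standard SciPy operations. The following minimal sketch is not the repo's `pre_pro.py` verbatim; the sampling rate (360 Hz for MIT-BIH) and window constants follow the description above, and it shows the filter-then-Hilbert feature construction for one beat:

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

FS = 360  # MIT-BIH sampling rate (Hz)

def bandpass(sig, low=0.5, high=40.0, fs=FS, order=2):
    """Zero-phase Butterworth bandpass (filtfilt), as described above."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, sig)

def beat_window_features(channel, r_peak, pre=50, width=128):
    """128-sample window around an R peak -> [Re, Im] of the analytic signal."""
    seg = channel[r_peak - pre : r_peak - pre + width]
    analytic = hilbert(seg)
    return np.concatenate([analytic.real, analytic.imag])

# Toy usage on a random trace standing in for one filtered ECG channel:
trace = bandpass(np.random.randn(3600))
feats = beat_window_features(trace, r_peak=1000)
print(feats.shape)  # (256,)
```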
To train a CVNN with cross‑patient K‑Fold CV, calibrate it, and automatically flag uncertain predictions on the held‑out TEST split, run:
```bash
# GPU if available
python -m up_real.up_real

# Force CPU (optional)
python -m up_real.up_real --cpu
```

What this does
- Cross-patient K-Fold training (`--folds`, default 10): splits patient records so that no subject appears in both TRAIN and TEST within a fold; trains `SimpleComplexNet` on complex features (`complex_stats`).
- Per-fold calibration (controlled by `--calibration`: `temperature` [default], `isotonic`, or `none`): applies calibration on the validation split of each fold and evaluates on the fold's TEST.
  - If `temperature`, saves `T_calib_fold{fold}.pt`.
- Multi-calibration evaluation (optional, via `--calibs`): runs a panel of calibrators on the same fold logits (e.g., `temperature`, `isotonic`, `platt`, `beta`, `vector`, `none`) and writes comparative metrics per fold.
- Global reliability curves: writes baseline RAW and calibrated reliability diagrams aggregated across folds.
- Automatic threshold selection on VALIDATION (full-model stage; see the sketch after this list):
  - Builds a (τ, δ) grid, where τ is the minimum confidence (max probability) to accept a prediction and δ is the minimum margin between the top-2 class probabilities.
  - If `--sensitivity` (default on), saves the grid to `sens_grid.csv` and heatmaps `sens_full_*`; also computes a "knee" score proxy.
  - Selects `(τ*, δ*)` using an exact review budget (`--review_budget`, default 10 samples) via `select_thresholds_budget_count`. The chosen pair and stats are written to `sens_full.csv`.
- Uncertain-point detection on TEST: flags samples with low confidence and/or small margin (i.e., `max_prob < τ*` and/or `margin < δ*`) and saves them to `uncertain_full.csv`.
- Artifacts: per-fold training curves, confusion matrix + ROC, overall training history across folds, reliability diagrams (RAW vs. calibrated), and uncertainty histograms (confidence and margin).
Outputs (files)
- Logs & meta
  - `run.log` — full run log (folds, timings, resources).
  - `run_meta.json` — versions and device info for reproducibility.
- Per-fold artifacts (filenames include the fold index):
  - training curves (e.g., `training_history_fold{fold}.png`),
  - confusion matrix & ROC (e.g., `confusion_fold{fold}.png`, `roc_fold{fold}.png`),
  - `scaler_fold{fold}.pkl`,
  - (if `--calibration temperature`) `T_calib_fold{fold}.pt`.
- Cross-fold (CV) metrics
  - `cv_metrics_per_fold.csv` — RAW vs. calibrated (ECE, NLL, Brier, Acc, AUC) per fold.
  - `cv_metrics_summary.csv` — mean ± 95% CI for ECE/NLL/Brier across folds.
  - `predictions_all_folds.csv` — row-wise predictions with calibrated probabilities and the top-2 margin.
- Multi-calibration panel (if `--calibs` is non-empty; by default several methods are evaluated):
  - `cv_metrics_per_fold_multi.csv` — per-fold metrics for each method in `--calibs`.
  - `cv_metrics_summary_multi.csv` — cross-fold means and 95% CIs by method.
- Global visualizations
  - `calibration_curve_RAW.png` and `calibration_curve_{TS|ISO|CAL}.png` — reliability diagrams pre/post calibration.
  - `uncertainty_histogram.png` — distribution of `max_prob` on TEST with τ* marker.
  - `uncertainty_margin_hist.png` — distribution of the top-2 margin with δ* marker.
  - `complex_pca_scatter.png` — PCA of complex features over all records.
- Full-model stage
  - `best_model_full.pt` — best weights after full retraining.
  - `scaler_full.pkl` — StandardScaler for the full-model pipeline.
  - (if `--calibration temperature`) `T_calib.pt` — temperature for the full model.
  - `sens_grid.csv`, `sens_full.csv`, and sensitivity heatmaps `sens_full_*`.
- Uncertain points
  - `uncertain_full.csv` — flagged TEST samples with columns: `index` (row id in TEST), `X` (feature vector), `true_label`, `p1`, `p2`.
Key CLI switches
- `--calibration {temperature,isotonic,none}` — per-fold and full-model calibration (default: temperature).
- `--calibs` — comma-separated list of calibration methods to evaluate in addition to the operational one (default: `temperature,isotonic,platt,beta,vector,none`).
- `--sensitivity` / `--no-sensitivity` — enable/disable the τ/δ grid & heatmaps (on by default).
- `--review_budget <int>` — exact count of validation samples allowed for manual review; used to pick `(τ*, δ*)` (default: 10).
- `--select_mode {capture,budget,risk,knee}` + `--capture_target`, `--max_abstain`, `--target_risk` — configure how the grid is scored/summarized (the final selection still uses `--review_budget`).
- `--epochs`, `--lr`, `--batch_size`, `--folds`, `--cpu`, `--seed` — standard training controls.
Example
```bash
# 10-fold CV on GPU (if available), temperature scaling, exact review budget of 20 samples:
python -m up_real.up_real \
    --data_folder mit-bih \
    --output_folder up_real \
    --epochs 10 --lr 1e-3 --batch_size 128 --folds 10 \
    --calibration temperature \
    --review_budget 20 \
    --sensitivity
```

This section mirrors the MIT-BIH pipeline but runs on RadioML 2016.10A.
Make sure you have downloaded the dataset as described in Datasets, so that `radio-data/RML2016.10a_dict.pkl` is present.
Run
```bash
# GPU if available
python -m up_radio.up_radio

# Force CPU (optional)
python -m up_radio.up_radio --cpu
```

What this does
- Subset selection (RadioML 2016.10A): filters by modulation classes (
--mods, default:BPSK QPSK) and SNR range (--snr_low/--snr_high, default: 5…15 dB). - Complex‑aware features: converts raw IQ windows into compact STFT‑based statistics via
prepare_complex_input(method='stft_stats'). - Stratified K‑Fold on samples (
--folds, default 10): trainsSimpleComplexNetper fold with a train/val split inside each fold; standardization fitted on TRAIN only and saved per fold. - Probability calibration on VALIDATION (operational method chosen by
--calibration; default here: platt for binary tasks). The same calibrator is applied on TEST for metrics and plots. - Multi‑calibration sweep (optional;
--calibs): evaluates a panel (temperature,isotonic,platt,beta,vector,none) on the same TEST logits and logs per‑method metrics. - Cross‑fold aggregation: reliability diagrams for RAW vs CAL (the chosen method), learning curves across folds, confusion matrices & ROC curves per fold.
- Full‑model stage: retrains on a fresh Train/Val/Test split; performs (τ, δ) sensitivity analysis on VALIDATION (if
--sensitivity), selects (τ*, δ*) with an exact review budget (--review_budget), then plots TEST histograms forp_max(with τ*) and top‑2 margin (with δ*). - Uncertainty export: flags TEST samples with
(p_max < τ*)OR(margin < δ*)and writes them touncertain_full.csv.
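The exact statistics behind `prepare_complex_input(method='stft_stats')` live in the repo; the sketch below only illustrates the general shape of such a feature map (reduce a complex IQ window to two complex numbers, i.e., 4 real inputs) and is an assumption, not the repo's implementation:

```python
import numpy as np
from scipy.signal import stft

def stft_stats_features(iq, fs=1.0, nperseg=32):
    """Map a complex IQ window to C^2: the mean of the STFT coefficients
    and a complex 'spread' (std of real + j*std of imag). Illustrative only;
    the repo's stft_stats may use different statistics."""
    _, _, Z = stft(iq, fs=fs, nperseg=nperseg, return_onesided=False)
    z1 = Z.mean()
    z2 = Z.real.std() + 1j * Z.imag.std()
    # 4 real inputs for the CVNN: [Re z1, Im z1, Re z2, Im z2]
    return np.array([z1.real, z1.imag, z2.real, z2.imag], dtype=np.float32)

# One RadioML-like window: 128 complex baseband samples
iq = (np.random.randn(128) + 1j * np.random.randn(128)).astype(np.complex64)
print(stft_stats_features(iq))
```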
Outputs
- Logs & metadata
  - `run.log` — run log (fold splits, timings, memory).
  - `run_meta.json` — Python/NumPy/PyTorch versions and device.
- Per-fold artifacts (file names are suffixed with the fold index):
  - training curves from `save_plots` (e.g., `training_history_fold{fold}.png`),
  - confusion matrix & ROC: `confusion_fold{fold}.png`, `roc_fold{fold}.png`,
  - `scaler_fold{fold}.pkl` — StandardScaler fitted on TRAIN,
  - if `--calibration temperature`: `T_calib_fold{fold}.pt`.
- Cross-fold (CV) metrics
  - `cv_metrics_per_fold.csv` — per-fold RAW vs. CAL metrics: ECE, NLL, Brier, Acc, AUC (CAL columns are tagged with the chosen method, e.g., `CAL_PLATT`).
  - `cv_metrics_summary.csv` — mean and 95% CI half-widths for ECE/NLL/Brier across folds.
  - `predictions_all_folds.csv` — per-sample calibrated probabilities with `p_max` and top-2 margin; column keys are suffixed with the CAL method (e.g., `p1_CAL_PLATT`, `margin_CAL_PLATT`).
- Multi-calibration panel (if `--calibs` non-empty)
  - `cv_metrics_per_fold_multi.csv` — per-fold metrics for each method in `--calibs`.
  - `cv_metrics_summary_multi.csv` — cross-fold means and 95% CIs by method.
- Global visualizations
  - `calibration_curve_RAW.png` and `calibration_curve_<CAL>.png` (e.g., `calibration_curve_PLATT.png`),
  - `complex_pca_scatter.png` — PCA on `stft_stats` features,
  - `uncertainty_histogram.png` — TEST distribution of `p_max` with τ* marker,
  - `uncertainty_margin_hist.png` — TEST distribution of top-2 margin with δ* marker.
- Full-model artifacts
  - `best_model_full.pt`, `scaler_full.pkl`,
  - if `--calibration temperature`: `T_calib.pt`,
  - sensitivity products: `sens_grid.csv`, `sens_full.csv`, and heatmaps `sens_full_*`.
- Uncertain points
  - `uncertain_full.csv` with columns: `index`, `X` (scaled feature vector), `true_label`, `p1`, `p2`.
Key CLI switches
- Dataset filters: `--mods <list>` (e.g., `BPSK QPSK 8PSK`), `--snr_low <int>`, `--snr_high <int>`.
- Calibration (operational): `--calibration {temperature,isotonic,platt,beta,vector,none}` (default: platt).
- Calibration (evaluation panel): `--calibs` — comma-separated methods to compare on TEST (default includes several).
- Sensitivity & thresholds: `--sensitivity` / `--no-sensitivity`, `--review_budget <int>` (exact-count selection of (τ*, δ*); default 10), and scoring knobs: `--select_mode {capture,budget,risk,knee}`, `--capture_target`, `--max_abstain`, `--target_risk`.
- Training: `--epochs`, `--lr`, `--batch_size`, `--folds`, `--seed`, `--cpu`.
To compute and save Newton-Puiseux series expansions for a sample polynomial, run:
```bash
python -m puiseux_test.puiseux_test
```

This script performs the following steps:
- Polynomial definition: Specifies a symbolic two-variable polynomial `f(x, y)` in SymPy (a default example is included and editable in the code).
- Initial branch computation: Uses `initial_branches` and polygon utilities to identify starting terms.
- Puiseux expansion: Invokes `puiseux_expansions(f, x, y, max_terms=5)` to compute fractional-power series up to 5 terms (see the usage sketch below).
- Console output: Prints each expansion to `stdout` with separators.
- File export: Writes formatted expansions into `puiseux_test/puiseux_expansions.txt`, including headers, individual expansions, and a footer.
Outputs:
- `puiseux_test/puiseux_expansions.txt` — text file containing the list of computed Puiseux series, complete with numeric evaluation of each term.
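A minimal usage sketch. The `puiseux_expansions(f, x, y, max_terms=5)` signature is the one quoted above; the import path is an assumption based on the repository layout (`src/puiseux.py`):

```python
import sympy as sp

# Import path assumed from the repository tree; adjust if needed.
from src.puiseux import puiseux_expansions

x, y = sp.symbols("x y")
# A small bivariate polynomial with a singular point at the origin.
f = y**2 - x**3 - x**2

for expansion in puiseux_expansions(f, x, y, max_terms=5):
    print(expansion)
    print("-" * 40)
```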
To perform local polynomial approximation and Newton-Puiseux analysis on uncertain synthetic points, run:
```bash
python -m local_analysis_synth_test.local_analysis_synth_test
```

This script executes the following pipeline:
- Model & data loading: Reads `up_synth/uncertain_synthetic.csv` for uncertain points and loads the pretrained `SimpleComplexNet` weights from `up_synth/model_parameters.pt`.
- Point iteration: Loops over each uncertain point (`xstar`) from the CSV.
- Local surrogate fitting: Constructs a degree-4 polynomial approximation `F̂` of the logit difference around `xstar` using `local_poly_approx_complex`, sampling `n_samples=200` within a cube of radius `delta=0.1` in ℝ⁴ (see the sketch at the end of this section).
- Puiseux analysis:
  - `benchmark_local_poly_approx_and_puiseux`: Times and factors the surrogate, returning its symbolic expression and initial Puiseux expansions (around the origin).
  - `puiseux_uncertain_point`: Computes full Puiseux series anchored back at `xstar` with precision 4.
- Quality evaluation: Measures approximation fidelity with `evaluate_poly_approx_quality` (RMSE, MAE, correlation ρ, sign-agreement) on 300 fresh perturbations.
- Result export: For each point, writes a report `local_analysis_synth_test/benchmark_point<idx>.txt` containing:
  - Timing breakdown
  - Final polynomial expression
  - Puiseux expansions at the origin and at `xstar`
  - Approximation metrics
- Console logging: Prints progress, loaded point data, timings, metrics, and final Puiseux expansions to `stdout`.
Outputs:
- `local_analysis_synth_test/benchmark_point<idx>.txt` — detailed report per uncertain point.
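For intuition, here is a self-contained stand-in for the surrogate-fitting step. The repo's `local_poly_approx_complex` works with complex monomials and extra safeguards; this sketch only shows the generic "sample in a δ-cube, least-squares fit a degree-4 polynomial" idea:

```python
import numpy as np
from itertools import combinations_with_replacement

def fit_local_poly(f, xstar, delta=0.1, n_samples=200, degree=4, seed=0):
    """Least-squares degree-4 surrogate of a scalar f: R^4 -> R around xstar,
    sampled in a cube of radius delta. Illustrative stand-in only."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-delta, delta, size=(n_samples, 4))
    y = np.array([f(xstar + dx) for dx in X])
    # Monomial basis: all coordinate products up to total degree `degree`.
    cols, terms = [np.ones(n_samples)], [()]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(4), d):
            cols.append(np.prod(X[:, list(idx)], axis=1))
            terms.append(idx)
    A = np.stack(cols, axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return terms, coef

# Toy usage: surrogate of a smooth "logit difference".
f = lambda z: np.sin(z[0]) * z[1] - z[2] ** 2 + 0.1 * z[3]
terms, coef = fit_local_poly(f, xstar=np.zeros(4))
print(len(terms), "monomials; first coefficients:", coef[:5].round(3))
```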
To analyze uncertain synthetic points and generate local explanations and robustness analyses, run:

```bash
python -m post_processing_synth.post_processing_synth
```

This script performs the following steps:
- Model & data loading: Loads the pretrained `SimpleComplexNet` and its weights (`up_synth/model_parameters.pt`) and reads uncertain points from `up_synth/uncertain_synthetic.csv`.
- Local polynomial fitting: Computes a degree-4 surrogate `F̂` at each uncertain point using `local_poly_approx_complex` (`delta=0.05`, `n_samples=300`).
- Approximation quality: Evaluates RMSE, MAE, Pearson correlation, and sign-agreement via `evaluate_poly_approx_quality`.
- Puiseux expansions: Calculates local Puiseux series at each point (`puiseux_uncertain_point`) and interprets them with `interpret_puiseux_expansions`.
- Adversarial robustness (see the sketch after this list):
  - Identifies promising directions via `find_adversarial_directions`.
  - Measures class-flip radii with `test_adversarial_impact` and plots robustness curves (`plot_robustness_curve`).
- Local explanations:
  - Computes LIME explanations (`compute_lime_explanation`).
  - Computes SHAP values (`compute_shap_explanation`).
- 2D decision contours: Generates contour plots fixing pairs of dimensions with `plot_local_contour_2d`.
- Report generation: For each point, produces `post_processing_synth/benchmark_point<idx>.txt` with:
  - Approximation metrics
  - Puiseux expressions & interpretations
  - Robustness analysis table
  - LIME & SHAP feature attributions
  - Paths to saved contour and robustness plots
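The flip-radius measurement in the robustness step can be pictured as a 1-D search along each candidate direction. An illustrative stand-in for `test_adversarial_impact` (the repo version iterates over phase-selected directions and also records data for the robustness plots):

```python
import numpy as np

def flip_radius(predict, xstar, direction, r_max=1.0, tol=1e-4):
    """Smallest step t along `direction` at which the predicted class
    changes, found by bisection. Assumes a single flip within [0, r_max];
    illustrative sketch, not the repo's implementation."""
    d = direction / np.linalg.norm(direction)
    c0 = predict(xstar)
    if predict(xstar + r_max * d) == c0:
        return np.inf  # no flip within the probed range
    lo, hi = 0.0, r_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predict(xstar + mid * d) == c0:
            lo = mid
        else:
            hi = mid
    return hi

# Toy usage with a linear decision rule in R^4:
predict = lambda z: int(z[0] + 0.5 * z[1] > 0.3)
print(flip_radius(predict, np.zeros(4), np.array([1.0, 0, 0, 0])))  # ~0.3
```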
Outputs:
- `post_processing_synth/benchmark_point<idx>.txt` — comprehensive local analysis report per point.
- `post_processing_synth/robustness_curves_point<idx>.png` — robustness plots.
- `post_processing_synth/contour_point<idx>_fix_dim=[...].png` — local decision boundary visualizations.
This step consumes the artifacts produced by Section 3 (MIT‑BIH uncertain points) and performs local Newton–Puiseux analysis, robustness probes, LIME/SHAP explanations, calibration comparisons with confidence intervals, and sensitivity summaries.
Prerequisites
- You have already run the MIT-BIH pipeline (Section 3) and produced at least:
  - `up_real/best_model_full.pt`
  - `up_real/scaler_full.pkl`
  - `up_real/uncertain_full.csv`
  - (optional) `up_real/T_calib.pt` (if temperature scaling was used)
  - (optional) `up_real/sens_grid.csv`, `up_real/cv_metrics_per_fold_multi.csv`
- The MIT-BIH Arrhythmia data are available locally (see Datasets ⚠️).
Run
```bash
# From the repo root
python -m post_processing_real.post_processing_real
```

What this does
- Load model & artifacts
  - Loads `best_model_full.pt`, optional `T_calib.pt` (temperature), and `scaler_full.pkl`.
  - Loads uncertain anchors from `uncertain_full.csv`.
  - Builds a background set (up to 512 windows) for LIME/SHAP and ensures consistent scaling via the saved scaler.
- (τ, δ) sensitivity summary (if `sens_grid.csv` is present)
  - Parses the grid and writes detailed and summarized reports:
    - `sensitivity_detailed.csv` (full grid values),
    - `sensitivity_summary.txt` and `sensitivity_extra.txt` (aggregates incl. correlations, medians, budget-constrained bests).
- Calibration comparison with 95% CI
  - If `cv_metrics_per_fold_multi.csv` exists, compiles per-method summaries:
    - `comparative_table.csv` (mean ± CI for ECE/NLL/Brier/Acc/AUC),
    - `calibration_ci_report.txt` (readable CI table),
    - `calibration_stats_tests.csv` (pairwise Wilcoxon tests; skipped if SciPy is not installed),
    - `calibration_winrate_vs_none.csv` (fold-wise win-rate vs. NONE).
- Per-anchor local analysis. For each uncertain ECG window:
  - Kink diagnostics: non-holomorphicity probe around the anchor (`kink_score`) and fraction of modReLU "kinks"; saves sweeps over `kink_eps`.
  - Robust local surrogate: degree-4 complex polynomial fit with outlier/weighting safeguards; reports condition, rank, kept ratio.
  - Quality metrics: RMSE, MAE, Pearson correlation, sign-agreement against the CVNN.
  - Newton–Puiseux expansions + interpretation for local branches.
  - Robustness along phase-selected directions (class change & flip radius) with plots.
  - LIME & SHAP explanations on consistently scaled C² features.
  - 2D local contours of the decision boundary for fixed dim pairs (1,3) and (0,2).
  - Resource benchmark: timing/memory of the Puiseux pipeline vs. gradient saliency.
- Aggregate reports & calibration CI table
  - Builds 5-fold CI tables for multiple calibrators (`none`, `platt`, `isotonic`, `beta`, `vector`, `temperature`), plus an ablation `none_T0` (uncalibrated at inference) if a temperature file is present.
  - Summarizes kink prevalence and its effect on fit quality and residual statistics.
  - Optional sweep of ECE sensitivity to branch-multiplicity mis-estimation.
- Dominant-ratio summary (see the sketch after this list)
  - Extracts dominant Puiseux coefficients per anchor (max |c₂| and |c₄| across branches) and computes the dominant-ratio proxy `r_dom = sqrt(|c₂|/|c₄|)`.
  - Writes a consolidated CSV for downstream evidence building (see Section 10): `post_processing_real/dominant_ratio_summary.csv`.
Outputs (saved to post_processing_real/)
- Logs
  - `post_processing_real.log` — progress, warnings, file provenance.
- Sensitivity (τ, δ)
  - `sensitivity_detailed.csv`, `sensitivity_summary.txt`, `sensitivity_extra.txt`.
- Calibration comparisons (from CV panel)
  - `comparative_table.csv`
  - `calibration_ci_report.txt`
  - `calibration_stats_tests.csv` (if SciPy available)
  - `calibration_winrate_vs_none.csv`
- Per-anchor artifacts (for each point i)
  - `benchmark_point<i>.txt` — comprehensive report (kink, fit, metrics, Puiseux, robustness, LIME/SHAP, resources).
  - `robustness_curves_point<i>.png`
  - `contour_point<i>_fix_dim=[1,3].png`, `contour_point<i>_fix_dim=[0,2].png`
  - `kink_sweep_point<i>.csv`
  - `resource_point<i>.txt`
- Aggregate summaries
  - `kink_summary.csv` — fractions of kink/active/inactive across points.
  - `resource_summary.csv` — Puiseux vs. saliency (time/memory).
  - `local_fit_summary.csv` — kept ratio, cond(A), degree, RMSE, sign-agreement, residual stats.
  - `kink_global_summary.txt` — prevalence and effects on fits/residuals.
  - `dominant_ratio_summary.csv` — per-anchor table with `point`, `c2_max_abs`, `c4_max_abs`, and the dominant-ratio proxy `r_dom = sqrt(|c2|/|c4|)`; consumed by `NP-analysis_real.py` (Section 10).
- Calibration CI (local 5-fold)
  - `calibration_folds_raw.csv`
  - `calibration_ci_table.csv`
  - `calibration_ci_report.txt` (same name as above; the latest run overwrites)
- Branch multiplicity sensitivity
  - `branch_multiplicity_sensitivity.csv`
This step mirrors the MIT‑BIH post‑processing but uses artifacts from the RadioML pipeline (Section 4). It performs local Newton–Puiseux analysis in C² (two complex features → 4 real inputs), robustness probes, LIME/SHAP explanations, calibration comparisons with 95% CIs (plus Wilcoxon and win‑rate), sensitivity summaries for (τ, δ), and resource benchmarking.
Prerequisites
- You have already run the RadioML pipeline (Section 4) and produced at least:
  - `up_radio/best_model_full.pt`
  - `up_radio/scaler_full.pkl`
  - `up_radio/uncertain_full.csv`
  - (optional) `up_radio/T_calib.pt` (if temperature scaling was used)
  - (optional) `up_radio/sens_grid.csv`, `up_radio/cv_metrics_per_fold_multi.csv`
- The RadioML dataset is available locally as described in Datasets ⚠️ (place `RML2016.10a_dict.pkl` in `radio-data/`).
Run
```bash
# From the repo root (module form)
python -m post_processing_radio.post_processing_radio
```

What this does
- Load model & artifacts (C²): loads `best_model_full.pt` (infers input/hidden dims from the weights), optional `T_calib.pt`, `scaler_full.pkl`, and uncertain anchors from `uncertain_full.csv`. Verifies that the model's first layer accepts a 4-real input (C²).
- Background for LIME/SHAP: rebuilds a background set from raw IQ windows using `prepare_complex_input(method='stft_stats')` to obtain C² features; applies the saved scaler for consistency.
- (τ, δ) sensitivity summary (if `sens_grid.csv` is present). Writes:
  - `sensitivity_detailed.csv` (the full grid),
  - `sensitivity_summary.txt` and `sensitivity_extra.txt` (aggregates, correlations, budget-constrained best).
- Calibration comparison with 95% CI. If `cv_metrics_per_fold_multi.csv` exists, compiles:
  - `comparative_table.csv` (mean ± CI for ECE/NLL/Brier/Acc/AUC),
  - `calibration_ci_report.txt` (human-readable CI table),
  - `calibration_stats_tests.csv` (pairwise Wilcoxon; skipped if SciPy is missing),
  - `calibration_winrate_vs_none.csv` (fold-wise win-rate vs. NONE).
- Per-anchor local analysis (for each uncertain point):
  - Kink diagnostics (modReLU neighborhood): non-holomorphicity probe and kink fraction (skipped if no modReLU-like activations).
  - Robust local surrogate: degree-4 complex polynomial with outlier/weighting safeguards; reports kept ratio, cond(A), rank.
  - Quality metrics: RMSE, MAE, Pearson correlation, sign-agreement vs. the CVNN.
  - Newton–Puiseux expansions + interpretation for local branches.
  - Robustness along phase-selected directions (class change & flip radius) with plots.
  - LIME & SHAP explanations on consistently scaled C² features.
  - 2D local contours of the decision boundary for fixed dim pairs (1,3) and (0,2).
  - Resource benchmark: Puiseux pipeline vs. gradient saliency (time/memory).
- Calibration CI table (local 5-fold on RadioML): builds per-method mean ± 95% CI for `none`, `platt`, `isotonic`, `beta`, `vector`, `temperature` using raw model probabilities (`T=None`) as the baseline, with:
  - `calibration_folds_raw.csv` (per-fold values),
  - `calibration_ci_table.csv` and `calibration_ci_report.txt` (summaries).
- Branch-multiplicity sensitivity: saves `branch_multiplicity_sensitivity.csv` showing ECE sensitivity to multiplicity mis-estimation (via `sweep_multiplicity_misestimation`).
- Dominant-ratio summary
  - Extracts dominant Puiseux coefficients per anchor (max |c₂| and |c₄| across branches) and computes the dominant-ratio proxy `r_dom = sqrt(|c₂|/|c₄|)`.
  - Writes a consolidated CSV for downstream evidence building (see Section 11): `post_processing_radio/dominant_ratio_summary.csv`.
Outputs (saved to post_processing_radio/)
- Logs
  - `post_processing_radio.log`
- Sensitivity (τ, δ)
  - `sensitivity_detailed.csv`, `sensitivity_summary.txt`, `sensitivity_extra.txt`
- Calibration comparisons (from CV panel)
  - `comparative_table.csv`
  - `calibration_ci_report.txt`
  - `calibration_stats_tests.csv` (if SciPy available)
  - `calibration_winrate_vs_none.csv`
- Per-anchor artifacts (for each point i)
  - `benchmark_point<i>.txt`
  - `robustness_curves_point<i>.png`
  - `contour_point<i>_fix_dim=[1,3].png`, `contour_point<i>_fix_dim=[0,2].png`
  - `kink_sweep_point<i>.csv`
  - `resource_point<i>.txt`
- Aggregate summaries
  - `kink_summary.csv`, `kink_global_summary.txt`
  - `resource_summary.csv`
  - `local_fit_summary.csv`
  - `dominant_ratio_summary.csv` — per-anchor table with `point`, `c2_max_abs`, `c4_max_abs`, and the dominant-ratio proxy `r_dom = sqrt(|c2|/|c4|)`; consumed by `NP-analysis_radio.py` (Section 11).
- Calibration CI (local 5-fold)
  - `calibration_folds_raw.csv`
  - `calibration_ci_table.csv`
  - `calibration_ci_report.txt`
- Branch multiplicity
  - `branch_multiplicity_sensitivity.csv`
Build a compact, publication-ready summary of Newton–Puiseux evidence by joining uncertain anchors with per-anchor post-processing reports and (optionally) dominant-ratio estimates. Produces correlation stats, triage PR curves (AUPRC), head-to-head flip-rate summaries vs. XAI baselines, figures, and a short Markdown report.
Prerequisites
- You have completed Section 3 (MIT-BIH uncertain points) and Section 8 (Post-Processing Real Data), which should produce at least:
  - `up_real/uncertain_full.csv`
  - `post_processing_real/benchmark_point<i>.txt` (per-anchor reports)
  - (optional) `post_processing_real/dominant_ratio_summary.csv` — if absent, the script falls back to `r_dom_pred` parsed from the TXT reports; if that is also missing but `|c2|`, `|c4|` are available, it computes `r_dom ≈ sqrt(|c2|/|c4|)` automatically.
Run
```bash
# From the repo root (the hyphenated folder name cannot be imported with -m,
# so run the script directly)
python NP-analysis_real/NP-analysis_real.py
```

What this does
- Load & join
  - Loads anchors from `up_real/uncertain_full.csv` and assigns a stable `point` index `0..N-1`.
  - Parses `post_processing_real/benchmark_point<i>.txt` for:
    - Kink diagnostics: `frac_kink`, `frac_active`, `frac_inactive`.
    - Local fit: `kept_ratio`, `cond`, `rank`, `n_monomials`, `degree_used`, `retry`.
    - Approx. quality: `RMSE`, `MAE`, `Pearson`, `Sign_Agreement`, residual moments.
    - Robustness: per-direction flip radii and `min_flip_radius` (mapped to `r_flip_obs`).
    - Timings/footprint: `puiseux_time_s`, `saliency_ms`, CPU/GPU memory, `saliency_grad_norm`.
    - (optional) `r_dom_pred` and axis-baseline sweep flips: `flip_grad`, `flip_lime`, `flip_shap`.
  - Merges `post_processing_real/dominant_ratio_summary.csv` if present (normalized to columns: `c2_max_abs`, `c4_max_abs`, `r_dom`, `r_flip`).
  - If `r_dom` is missing but `|c2|`, `|c4|` exist, computes `r_dom = sqrt(|c2|/|c4|)`.
- Correlation & error summary
  - Computes `MAE(|r_dom - r_flip_obs|)`, Pearson and Spearman correlations.
  - Saves a scatter with reference line and a robust regression slope: `NP-analysis_real/figures/scatter_rdom_vs_rflip.png`
  - If `frac_kink` exists, saves: `NP-analysis_real/figures/scatter_kink_vs_rflip.png`
- Triage analysis (AUPRC; see the sketch after this list)
  - Defines an anchor as fragile if `r_flip_obs ≤ BUDGET`, with `BUDGET = 0.02` (edit inside the script to adjust).
  - Builds a PR curve and AUPRC for ranking by `|c4|` (dominant quartic term magnitude); also reports F1-max and the threshold at F1-max. Files:
    - `NP-analysis_real/pr_by_abs_c4.csv`
    - `NP-analysis_real/figures/pr_curve_by_abs_c4.png`
  - Compares additional scores when available: `1/r_grad`, `1/r_lime`, `1/r_shap` (derived from flip radii), `grad_norm` (from saliency logs), and writes a comparison table: `NP-analysis_real/triage_compare_summary.csv` (per-score PR curves are saved as `pr_by_<prefix>.csv`/`.png`, e.g., `pr_by_per_grad.csv`, `figures/pr_by_per_grad.png`).
- Head-to-head flip rates
  - Reports the share of anchors with `r ≤ BUDGET` for Puiseux vs. XAI baselines, plus median radii: `NP-analysis_real/xai_vs_puiseux_summary.csv`
- Joined evidence table + one-pager
  - Saves the full joined table: `NP-analysis_real/evidence_anchors_joined.csv`
  - Saves a compact correlation/PR summary: `NP-analysis_real/corr_summary.csv`
  - Writes a Markdown one-pager with key numbers and figure pointers: `NP-analysis_real/np_evidence_report.md`
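The PR/AUPRC computation follows the standard scikit-learn recipe. A hedged sketch of the triage logic described above (the function name and toy data are mine, not the repo's):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, average_precision_score

BUDGET = 0.02  # fragility budget used in the script

def triage_auprc(r_flip_obs, score):
    """PR analysis for ranking anchors by a fragility score (e.g. |c4|):
    an anchor counts as 'fragile' if its observed flip radius <= BUDGET."""
    fragile = (np.asarray(r_flip_obs) <= BUDGET).astype(int)
    prec, rec, _ = precision_recall_curve(fragile, score)
    auprc = average_precision_score(fragile, score)
    f1 = 2 * prec * rec / np.clip(prec + rec, 1e-12, None)
    return auprc, f1.max()

# Toy usage: large |c4| should rank the truly fragile anchors first.
r = np.array([0.005, 0.1, 0.015, 0.3, 0.01])
c4 = np.array([40.0, 2.0, 25.0, 1.0, 30.0])
auprc, f1_max = triage_auprc(r, c4)
print(f"AUPRC={auprc:.3f}, F1-max={f1_max:.3f}")
```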
Outputs (folder NP-analysis_real/)
- `evidence_anchors_joined.csv` — anchors + benchmark + dominant-ratio (joined).
- `corr_summary.csv` — MAE/ρ stats, PR summary, F1-max & threshold.
- `triage_compare_summary.csv` — AUPRC by score (`|c4|`, `1/r_*`, `grad_norm`).
- `xai_vs_puiseux_summary.csv` — hit-rates ≤ `BUDGET` and median radii by method.
- `pr_by_abs_c4.csv`, `figures/pr_curve_by_abs_c4.png` — main PR curve.
- `pr_by_<prefix>.csv`, `figures/pr_by_<prefix>.png` — extra PR curves (if available).
- `figures/scatter_rdom_vs_rflip.png` — `r_dom` vs. `r_flip_obs`.
- `figures/scatter_kink_vs_rflip.png` — `frac_kink` vs. `r_flip_obs` (if available).
- `np_evidence_report.md` — textual summary (ready to paste into an appendix/supplement).
Notes
- If `dominant_ratio_summary.csv` is missing, the script uses `r_dom_pred` from the TXT reports or re-computes `r_dom` from available `|c2|`, `|c4|`.
- To change the fragility budget for triage, edit `BUDGET` near the top of the script.
Same evidence-building and triage pipeline as above, but for RadioML 2016.10A artifacts.
Prerequisites
- You have completed Section 4 (RadioML uncertain points) and Section 9 (Post-Processing Radio Data), which should produce at least:
  - `up_radio/uncertain_full.csv`
  - `post_processing_radio/benchmark_point<i>.txt`
  - (optional) `post_processing_radio/dominant_ratio_summary.csv` — if absent, the script uses `r_dom_pred` parsed from the TXT reports; if that is also missing but `|c2|`, `|c4|` are available, it computes `r_dom ≈ sqrt(|c2|/|c4|)` automatically.
Run
```bash
# From the repo root (run the script directly; the hyphenated folder name
# cannot be imported with -m)
python NP-analysis_radio/NP-analysis_radio.py
```

What this does
- Join & normalize: anchors + benchmark TXT + optional dominant-ratio.
- Correlation & error: `MAE(|r_dom − r_flip_obs|)`, Pearson/Spearman, scatter plot(s).
- Triage (AUPRC): PR for `|c4|` + optional scores (`1/r_grad`, `1/r_lime`, `1/r_shap`, `grad_norm`), with CSVs and PNGs.
- Head-to-head flip-rates vs. XAI baselines.
- Markdown one-pager with key metrics and figure references.
Outputs (folder NP-analysis_radio/)
- `evidence_anchors_joined.csv`
- `corr_summary.csv`
- `triage_compare_summary.csv`
- `xai_vs_puiseux_summary.csv`
- `pr_by_abs_c4.csv`, `figures/pr_curve_by_abs_c4.png`
- `pr_by_<prefix>.csv`, `figures/pr_by_<prefix>.png` (if available)
- `figures/scatter_rdom_vs_rflip.png`
- `figures/scatter_kink_vs_rflip.png` (if available)
- `np_evidence_report.md`
Notes
- The fragility budget is `BUDGET = 0.02` (edit in the script to change it).
- The script auto-detects `up_radio/` and `post_processing_radio/` relative to its location and writes results into `NP-analysis_radio/`.
This project is released under the MIT License. See LICENSE for details.
MIT-BIH Arrhythmia Database
ECG recordings are redistributed under the PhysioNet open-access license.
Please ensure compliance with the original terms: https://physionet.org/content/mitdb/1.0.0/
RadioML 2016.10A
This dataset is released by DeepSig under the Creative Commons Attribution‑NonCommercial‑ShareAlike 4.0 International (CC BY‑NC‑SA 4.0) license.
The dataset is not redistributed in this repository. Please ensure compliance with the original terms and cite the dataset appropriately (see the DeepSig datasets page for details).
For questions or contributions, please open an issue or contact Piotr Migus at [email protected].