Enhanced with Timestamp Support for Vision Latency Correction
VAPE MK53 is a state-of-the-art real-time 6-DOF (6 Degrees of Freedom) pose estimation system designed specifically for aircraft tracking. It combines classical robotics techniques with modern deep learning to achieve robust, low-latency pose estimation with proper handling of vision processing delays.
- 🚁 Real-time Aircraft Tracking: Specialized for aircraft pose estimation with 14 viewpoint-specific anchors
- ⏱️ Timestamp-Aware Processing: Canonical VIO/SLAM approach for handling vision latency
- 🧠 Enhanced Unscented Kalman Filter: Variable-dt prediction with fixed-lag buffer for out-of-sequence measurements
- 🎯 Multi-threaded Architecture: Optimized for both low-latency display (30 FPS) and accurate processing
- 🔧 Physics-Based Filtering: Rate limiting prevents impossible orientation/position jumps
- 📊 Adaptive Viewpoint Selection: Intelligent switching between 14 pre-computed viewing angles
```text
┌─────────────────┐          ┌──────────────────┐
│   MainThread    │          │ ProcessingThread │
│    (30 FPS)     │          │    (Variable)    │
│                 │          │                  │
│ • Camera capture│          │ • YOLO detection │
│ • Timestamp     │          │ • Feature match  │
│ • Visualization │          │ • Pose estimation│
│ • UKF prediction│          │ • UKF update     │
└────────┬────────┘          └────────┬─────────┘
         │                            │
         └────── Queues + Locks ──────┘
                       │
                ┌─────────────┐
                │  Enhanced   │
                │     UKF     │
                │ (Timestamp- │
                │   Aware)    │
                └─────────────┘
```
- MainThread: High-frequency capture and display (30 FPS) with immediate timestamp recording
- ProcessingThread: AI-heavy computation (YOLO + SuperPoint + LightGlue + PnP) with timestamp-aware updates
- Enhanced UKF: Handles measurements at their correct historical times with variable-dt motion models (the thread handoff is sketched below)
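The handoff between threads can be sketched as a timestamped queue; the names `frame_queue`, `capture_loop`, and `processing_loop` below are illustrative, not the actual implementation:

```python
import queue
import time

import cv2

frame_queue = queue.Queue(maxsize=2)  # tiny queue: drop frames rather than add display latency

def capture_loop(cap: cv2.VideoCapture):
    """MainThread side: stamp each frame the instant it is grabbed."""
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        t_capture = time.monotonic()  # monotonic clock is immune to wall-clock jumps
        try:
            frame_queue.put_nowait((frame, t_capture))
        except queue.Full:
            pass  # never stall the 30 FPS display loop

def processing_loop(ukf):
    """ProcessingThread side: the capture timestamp travels with the frame."""
    while True:
        frame, t_capture = frame_queue.get()
        # ... YOLO + SuperPoint + LightGlue + PnP produce a pose measurement,
        # then: ukf.update_with_timestamp(measurement, t_capture)
```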
Unlike traditional pose estimation systems, which ignore vision processing latency, VAPE MK53 implements the canonical VIO/SLAM approach:
- Immediate Timestamp Capture: `t_capture = time.monotonic()` is recorded the moment each frame is obtained
- Latency-Corrected Updates: UKF processes measurements at their actual capture time, not processing time
- Fixed-Lag Buffer: 200-frame history enables handling of out-of-sequence measurements
- Variable-dt Motion Model: Adapts to actual time intervals instead of assuming fixed frame rates
State Vector (16D):
```python
# [0:3]   - Position (x, y, z)
# [3:6]   - Velocity (vx, vy, vz)
# [6:9]   - Acceleration (ax, ay, az)
# [9:13]  - Quaternion (qx, qy, qz, qw)
# [13:16] - Angular velocity (wx, wy, wz)
```

Key Features:
- dt-Scaled Process Noise: `Q_scaled = Q * dt + Q * (dt²) * 0.5` (see the sketch after this list)
- Quaternion Normalization: Prevents numerical drift
- Rate Limiting: Physics-based constraints prevent impossible motions
- Robust Covariance: SVD fallback for numerical stability
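As a minimal sketch of the first two safeguards, assuming a constant base noise matrix `Q_base` and the 16D state layout above (function names are illustrative):

```python
import numpy as np

def scale_process_noise(Q_base: np.ndarray, dt: float) -> np.ndarray:
    # Uncertainty grows with the actual elapsed time, not per frame:
    # Q_scaled = Q*dt + Q*dt²/2
    return Q_base * dt + Q_base * (dt ** 2) * 0.5

def normalize_quaternion(state: np.ndarray) -> np.ndarray:
    # Re-normalize the quaternion block [9:13] after every predict/update
    q = state[9:13]
    n = np.linalg.norm(q)
    state[9:13] = q / n if n > 1e-9 else np.array([0.0, 0.0, 0.0, 1.0])
    return state
```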
- YOLO v8: Custom-trained on aircraft ("iha" class)
- Adaptive Thresholding: 0.30 → 0.20 → 0.10 confidence cascade
- Largest-Box Selection: Focuses on the primary aircraft target (see the cascade sketch below)
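The cascade could look roughly like the following sketch against the ultralytics API (the helper name and the box-area selection details are assumptions):

```python
from ultralytics import YOLO

model = YOLO("best.pt")  # custom aircraft detector ("iha" class)

def detect_aircraft(frame):
    # Relax the confidence threshold until something is found: 0.30 -> 0.20 -> 0.10
    for conf in (0.30, 0.20, 0.10):
        boxes = model(frame, conf=conf, verbose=False)[0].boxes
        if len(boxes) > 0:
            # Largest-box selection: assume the primary target dominates the frame
            return max(boxes, key=lambda b: float(b.xywh[0][2] * b.xywh[0][3]))
    return None  # nothing detected at any threshold
```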
- SuperPoint: CNN-based keypoint detector (up to 2048 keypoints)
- LightGlue: Attention-based feature matching with early termination
- 14 Viewpoint Anchors: Pre-computed reference images for different viewing angles (extraction and matching are sketched below)
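With the public lightglue package, the extract-and-match step looks roughly like this (image paths and device handling are illustrative):

```python
import torch
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

device = "cuda" if torch.cuda.is_available() else "cpu"
extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)
matcher = LightGlue(features="superpoint").eval().to(device)  # early termination built in

anchor = load_image("NE.png").to(device)    # one of the 14 pre-computed anchors
frame = load_image("frame.png").to(device)  # current camera frame (or aircraft crop)

feats0, feats1 = extractor.extract(anchor), extractor.extract(frame)
matches01 = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, matches01 = [rbd(x) for x in (feats0, feats1, matches01)]  # drop batch dim

matches = matches01["matches"]  # (K, 2) index pairs into the two keypoint sets
pts_anchor = feats0["keypoints"][matches[..., 0]]
pts_frame = feats1["keypoints"][matches[..., 1]]
```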
- EPnP + RANSAC: Initial pose estimation with outlier rejection
- VVS Refinement: Virtual Visual Servoing for sub-pixel accuracy
- Temporal Consistency: Viewpoint selection with failure recovery (an OpenCV sketch of the PnP and refinement steps follows)
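In OpenCV terms, that pipeline is roughly the following sketch (the thresholds and helper signature are illustrative):

```python
import cv2
import numpy as np

def estimate_pose(object_pts: np.ndarray, image_pts: np.ndarray, K: np.ndarray, dist=None):
    # object_pts: (N, 3) anchor 3D points; image_pts: (N, 2) matched pixels
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_pts, image_pts, K, dist,
        flags=cv2.SOLVEPNP_EPNP,   # fast initial estimate
        reprojectionError=8.0,     # RANSAC outlier threshold in pixels
        iterationsCount=100,
    )
    if not ok or inliers is None:
        return None
    # Virtual Visual Servoing refinement on the inlier set
    rvec, tvec = cv2.solvePnPRefineVVS(
        object_pts[inliers.ravel()], image_pts[inliers.ravel()], K, dist, rvec, tvec
    )
    return rvec, tvec, len(inliers)
```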
```python
viewpoints = ['NE', 'NW', 'SE', 'SW', 'E', 'W', 'N', 'S',
              'NE2', 'NW2', 'SE2', 'SW2', 'SU', 'NU']
```

- Temporal Consistency: Stick with the working viewpoint
- Adaptive Search: Switch when current viewpoint fails
- Quality Metrics: Match count, inlier count, reprojection error (the selection policy is sketched below)
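A possible shape for that policy (the `match_anchor` callback and the quality thresholds are assumptions, not the shipped values):

```python
def select_viewpoint(current_vp, viewpoints, match_anchor):
    """Try last frame's working viewpoint first; search the rest only on failure."""
    order = [current_vp] + [vp for vp in viewpoints if vp != current_vp]
    for vp in order:
        r = match_anchor(vp)  # dict with num_matches, num_inliers, reproj_error
        if (r["num_matches"] >= 10 and r["num_inliers"] >= 6
                and r["reproj_error"] < 3.0):
            return vp, r  # good enough: stick with this anchor next frame
    return None, None  # every viewpoint failed; caller triggers recovery
```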
Python Version: 3.11+
Hardware Requirements:
- NVIDIA GPU with CUDA 12.2+ (recommended)
- 8GB+ RAM
- USB camera or video input
```bash
# Core Dependencies
pip install torch==2.6.0+cu124 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# Computer Vision & AI
pip install "ultralytics>=8.0.0"
pip install lightglue
pip install "opencv-python>=4.8.0"

# Scientific Computing
pip install "numpy>=1.24.0"
pip install "scipy>=1.11.0"

# Utilities
pip install "matplotlib>=3.7.0"
```

- YOLO Model: `best.pt` (trained aircraft detection model)
- Anchor Images: 14 viewpoint reference images (NE.png, NW.png, etc.)
- Input Video: Your aircraft footage for processing
```bash
# Real-time webcam processing
python3 VAPE_MK53_3.py --webcam --show

# Video file processing with feature visualization
python3 VAPE_MK53_3.py --video_file your_video.mp4 --show

# Image sequence processing
python3 VAPE_MK53_3.py --image_dir ./images/ --save_output

# Custom rate limiting for different scenarios
python3 VAPE_MK53_3.py --video_file fast_maneuvers.mp4 --max_rotation_dps 60 --max_position_mps 3.0
```

```text
# Input Sources (required, mutually exclusive)
--webcam               # Use webcam input
--video_file PATH      # Process video file
--image_dir PATH       # Process image sequence

# Visualization Options
--show                 # Show SuperPoint keypoint detections
--save_output          # Save pose data to JSON file

# UKF Tuning Parameters
--max_rotation_dps     # Maximum rotation rate (default: 30°/s)
--max_position_mps     # Maximum position speed (default: 1.5 m/s)
```

```python
# Handheld/Walking Around Aircraft
kf.set_rate_limits(max_rotation_dps=30.0, max_position_mps=1.5)

# Fast Movements/Drone Footage
kf.set_rate_limits(max_rotation_dps=60.0, max_position_mps=3.0)

# Stable Tripod/Fixed Camera
kf.set_rate_limits(max_rotation_dps=15.0, max_position_mps=0.5)
```

- Main Window: Video with 3D coordinate axes overlaid on the aircraft
- Feature Window (with `--show`): SuperPoint keypoint visualization
- Console Output: Timing, viewpoint selection, and rejection statistics
```json
{
  "frame": 42,
  "success": true,
  "position": [x, y, z],
  "quaternion": [qx, qy, qz, qw],
  "kf_position": [x_filtered, y_filtered, z_filtered],
  "kf_quaternion": [qx_f, qy_f, qz_f, qw_f],
  "num_inliers": 25,
  "viewpoint_used": "NW",
  "capture_time": 1234567.890
}
```

```text
🕒 Frame captured at t=1234.567
🔬 Processing latency: 125.3ms
🎯 Total system latency: 167.8ms (capture→display)
⏭️ UKF predicting forward: 0.083s for proper temporal fusion
```
- Main Thread: 30 FPS (display)
- Processing Thread: 5-15 FPS (AI processing)
- System Latency: 100-200ms (capture to pose update)
- Memory Usage: ~2GB GPU, ~1GB RAM
```text
t=0.000: Frame captured, t_capture recorded
t=0.033: Frame sent to processing queue
t=0.080: YOLO detection completes
t=0.120: Feature matching finishes
t=0.125: UKF.update_with_timestamp(measurement, t_capture=0.000)
         ↳ Filter predicts back to t=0.000
         ↳ Applies measurement at correct time
         ↳ Fast-forwards to t=0.125 for display
```
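A simplified sketch of the fixed-lag bookkeeping behind this (class and method names are illustrative; the real filter also stores enough to replay intermediate updates):

```python
from collections import deque
import bisect

class FixedLagBuffer:
    """Keeps the last N filter snapshots so late measurements can be replayed."""

    def __init__(self, maxlen: int = 200):
        self.snapshots = deque(maxlen=maxlen)  # (t, state, cov), ordered by time

    def push(self, t, state, cov):
        self.snapshots.append((t, state, cov))

    def snapshot_before(self, t_meas):
        # Newest snapshot taken at or before the measurement's capture time;
        # the filter rewinds here, applies the update, then replays forward.
        times = [s[0] for s in self.snapshots]
        i = bisect.bisect_right(times, t_meas)
        return self.snapshots[i - 1] if i > 0 else None
```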
- Generate 33 Sigma Points around current state estimate
- Propagate through Motion Model (constant acceleration)
- Recombine with Weights to get predicted mean and covariance
- dt-Scaled Process Noise reflects uncertainty growth over time
- Quaternion Normalization prevents numerical drift (sigma-point generation is sketched below)
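A minimal sketch of the sigma-point generation with the SVD fallback mentioned earlier (using the common κ = 3 − n heuristic; the weight computation is omitted):

```python
import numpy as np

def generate_sigma_points(x: np.ndarray, P: np.ndarray, kappa: float = 3.0 - 16):
    """Standard UKF spread: for the 16D state, 2*16 + 1 = 33 points."""
    n = x.shape[0]
    scaled = (n + kappa) * P
    try:
        S = np.linalg.cholesky(scaled)
    except np.linalg.LinAlgError:
        # SVD fallback when P loses positive-definiteness (robust covariance)
        U, s, _ = np.linalg.svd(scaled)
        S = U @ np.diag(np.sqrt(s))
    points = [x] + [x + S[:, i] for i in range(n)] + [x - S[:, i] for i in range(n)]
    return np.asarray(points)  # shape (2n + 1, n)
```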
```python
# Orientation rate limiting
max_angle_change = max_rotation_dps * dt
if angle_diff > max_angle_change:
    reject_measurement("Orientation jump too large")

# Position rate limiting
max_distance = max_position_mps * dt
if movement_distance > max_distance:
    reject_measurement("Position jump too large")
```

1. YOLO Detection Failures

```text
🚫 No aircraft detected in frame
```
- Check lighting conditions
- Verify aircraft is clearly visible
- Try different confidence thresholds
2. Excessive Rejections
```text
🚫 Frame 147: Rejected (Orientation Jump: 34.6° > 30°)
⚠️ Exceeded 5 consecutive rejections. Re-initializing KF.
```
- Increase rate limits for faster movements
- Check for motion blur or poor lighting
- Verify anchor images match aircraft type
3. GPU Memory Issues
```text
CUDA out of memory
```
- Reduce video resolution
- Use CPU mode: set `device = 'cpu'`
- Close other GPU applications
4. Missing Anchor Images
```text
FileNotFoundError: Required anchor image not found: NE.png
```
- Ensure all 14 viewpoint images are present
- Check file naming convention matches exactly
For accurate pose estimation, replace the default camera intrinsics in `_get_camera_intrinsics()`:
```python
def _get_camera_intrinsics(self):
    # Replace with your camera's calibration data
    fx, fy, cx, cy = 1460.10150, 1456.48915, 604.85462, 328.64800
    K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]], dtype=np.float32)
    return K, None  # Add distortion coefficients if needed
```

This system implements cutting-edge techniques from:
- Visual-Inertial Odometry (VIO)
- Simultaneous Localization and Mapping (SLAM)
- Real-time Computer Vision
- Robust State Estimation
- Timestamp-Aware Pose Estimation: Proper handling of vision processing latency
- Multi-threaded UKF Architecture: Optimized for both accuracy and latency
- Adaptive Viewpoint Management: Robust to viewing angle changes
- Physics-Based Measurement Validation: Prevents impossible state transitions
```python
# In MainThread.run() - Monitor capture timing
t_capture = time.monotonic()
print(f"🕒 CAPTURE: Frame {self.frame_count} at t={t_capture:.3f}")

# In ProcessingThread._process_frame() - Track latency
latency_ms = (time.monotonic() - t_capture) * 1000
print(f"🔬 PROCESS: Frame {frame_id}, latency={latency_ms:.1f}ms")

# In UKF.update_with_timestamp() - Monitor filter decisions
if t_meas >= self.t_state:
    print(f"⏭️ UKF: Predicting forward {dt:.3f}s")
else:
    print(f"⏮️ UKF: Out-of-sequence {abs(dt)*1000:.1f}ms late")
```

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "VAPE MK53 Debug",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}/VAPE_MK53_3.py",
      "args": ["--video_file", "test_video.mp4", "--show"],
      "console": "integratedTerminal",
      "justMyCode": false
    }
  ]
}
```

This project builds upon and extends SuperGlue by Magic Leap, Inc. The original SuperGlue components are licensed under the terms provided by Magic Leap.
- Multi-threaded timestamp-aware architecture
- Enhanced Unscented Kalman Filter with variable-dt
- Aircraft-specific YOLO integration
- Viewpoint management system
- Physics-based measurement validation
- Real-time performance optimizations
If you use this work in your research, please cite:
```bibtex
@software{vape_mk53_2025,
  title={VAPE MK53: Real-time 6-DOF Aircraft Pose Estimator with Timestamp Support},
  author={[Your Name]},
  year={2025},
  url={https://github.com/[your-repo]/VAPE_MK53}
}
```

Core Deep Learning Components:
```bibtex
@inproceedings{sarlin20superglue,
  title={SuperGlue: Learning Feature Matching with Graph Neural Networks},
  author={Sarlin, Paul-Edouard and DeTone, Daniel and Malisiewicz, Tomasz and Rabinovich, Andrew},
  booktitle={CVPR},
  year={2020}
}

@inproceedings{detone2018superpoint,
  title={SuperPoint: Self-Supervised Interest Point Detection and Description},
  author={DeTone, Daniel and Malisiewicz, Tomasz and Rabinovich, Andrew},
  booktitle={CVPR Deep Learning for Visual SLAM Workshop},
  year={2018}
}

@inproceedings{lindenberger2023lightglue,
  title={LightGlue: Local Feature Matching at Light Speed},
  author={Lindenberger, Philipp and Sarlin, Paul-Edouard and Pollefeys, Marc},
  booktitle={ICCV},
  year={2023}
}
```

Contributions are welcome! Areas of interest:
- Additional aircraft viewpoint anchors
- Performance optimizations
- Extended camera support
- Improved motion models
- Better visualization options
For questions, issues, or collaboration opportunities, please open an issue on GitHub.
Note: This system represents state-of-the-art real-time pose estimation combining classical robotics (Enhanced UKF) with modern deep learning (YOLO, SuperPoint, LightGlue). The timestamp-aware architecture follows canonical VIO/SLAM practices used in production robotics systems.