content="BFRB dashboard, Helios wrist device, IMU time-series, TOF heatmap, sensor data visualization, interactive dashboard">

BFRB Sensor Data Dashboard

This dashboard provides a comprehensive overview of sensor recordings from the Helios wrist device, including model performance and feature analysis. Current Status: Binary detection achieves strong performance (F1: 0.92), but gesture classification remains challenging (F1: 0.54) due to overlapping feature distributions and difficult class separation. The analysis reveals that gesture classes 2, 3, and 6 are the weakest performers, requiring targeted feature engineering and data augmentation. Use the filters to drill down to sequences or see dataset-wide averages.

Sequence Gesture Distribution

This plot shows the distribution of gesture classes across all recorded sequences. It helps identify class imbalance and highlights which gestures are most common in the dataset.
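
A minimal sketch of how this distribution could be computed, assuming a per-sequence summary file and column names (processed/sequences.csv, gesture) that are illustrative rather than the project's actual layout:

# Minimal sketch (pandas): count sequences per gesture class to inspect imbalance.
import pandas as pd

seq = pd.read_csv("processed/sequences.csv")          # one row per sequence (assumed file)
counts = seq["gesture"].value_counts().sort_index()   # sequences per gesture class (assumed column)

print(counts)
print("Imbalance ratio (max/min):", counts.max() / counts.min())
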
Total Sequences
Gesture Classes
Missing TOF Rows
Missing Thermopile Rows

Dataset Summary


            

Visualizations

IMU plot will appear here when a sequence is selected.
This interactive plot displays the IMU (Inertial Measurement Unit) time-series data for the selected sequence, allowing you to explore movement patterns and sensor signals in detail.
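
A minimal plotting sketch for this IMU view; the file path, sequence_id value, and raw accelerometer column names (acc_x, acc_y, acc_z) are assumptions for illustration:

# Minimal sketch (matplotlib): plot accelerometer channels for one sequence.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("processed/imu.csv")                 # hypothetical processed IMU file
one = df[df["sequence_id"] == "SEQ_000123"]           # hypothetical sequence id

fig, ax = plt.subplots(figsize=(10, 3))
for col in ["acc_x", "acc_y", "acc_z"]:               # assumed raw channel names
    ax.plot(one.index, one[col], label=col)
ax.set_xlabel("sample")
ax.set_ylabel("acceleration")
ax.legend()
plt.show()
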
TOF heatmap will appear here for dataset averages or selected sequences.
This heatmap visualizes Time-of-Flight (TOF) sensor data, showing spatial patterns and sensor coverage for either the dataset average or a specific sequence.
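
A minimal sketch of how a sequence-averaged TOF heatmap could be built; the file path, the tof_ column prefix, and the square pixel grid are assumptions, not the device's confirmed layout:

# Minimal sketch (numpy/matplotlib): average TOF pixel columns over a sequence
# and render them as a heatmap.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("processed/tof.csv")                        # hypothetical processed TOF file
one = df[df["sequence_id"] == "SEQ_000123"]                  # hypothetical sequence id
tof_cols = [c for c in one.columns if c.startswith("tof_")]  # assumed pixel-column prefix

frame = one[tof_cols].mean(axis=0).to_numpy()                # sequence-averaged pixel values
side = int(np.sqrt(frame.size))                              # assumes a square grid, e.g. 8x8

plt.imshow(frame.reshape(side, side), cmap="viridis")
plt.colorbar(label="mean TOF reading")
plt.title("Sequence-averaged TOF heatmap")
plt.show()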

Model Performance

Latest model training and evaluation results are shown below. Key Findings: The binary detection model achieves strong accuracy (F1: 0.92), but gesture classification performance is suboptimal (F1: 0.54) due to significant feature overlap between gesture classes. The analysis shows that classes 2, 3, and 6 are the biggest challenges with F1 scores of 0.40, 0.45, and 0.45 respectively. This suggests the current feature set doesn't provide sufficient separation for these gesture types. Metrics and confusion matrices reflect subject-grouped cross-validation and the impact of recent feature engineering and stacking meta-model updates.
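
A minimal sketch of the subject-grouped cross-validation setup referenced here; X, y, and subject_ids stand for the engineered feature matrix, gesture labels, and per-sequence subject IDs, and RandomForestClassifier is only a placeholder for the actual stacked ensemble:

# Minimal sketch (scikit-learn): group folds by subject so no subject appears
# in both the training and test split of any fold.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                # placeholder feature matrix
y = rng.integers(0, 8, size=200)              # placeholder gesture labels (classes 0-7)
subject_ids = rng.integers(0, 10, size=200)   # placeholder subject grouping

cv = GroupKFold(n_splits=5)
scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0),
    X, y, groups=subject_ids, cv=cv, scoring="f1_macro",
)
print("Per-fold macro F1:", scores.round(3), "mean:", scores.mean().round(3))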

Binary F1 (Full Data): 0.92
Gesture F1 (Full Data): 0.54
Overall Score (Full Data): 0.73
Binary F1 (IMU Only): 0.91
Gesture F1 (IMU Only): 0.37
Overall Score (IMU Only): 0.64

Evaluation Visualizations

Gesture Confusion Matrix (Full Data)

Latest from today's run (September 5, 2025, 12:10):
[[455  22  29  17  25  24  38  28]
 [ 15 336  35  88  10  17  76  60]
 [ 31  48 243 114  77  84  15  26]
 [ 29 107 114 270  24  39  23  34]
 [ 21  15  76  14 365 121   9  19]
 [ 13  13  51  18  83 440   8  14]
 [ 64  73   7  20   7   7 282 180]
 [ 33  49   8  18   1   8 166 357]]
This confusion matrix visualizes the gesture classification model's predictions using all available sensors. It highlights which gesture classes are most often confused and where the model performs best or struggles.
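
The per-class scores in the next subsection can be derived directly from this matrix. A minimal numpy sketch; the printed values should reproduce the reported per-class F1 scores (about 0.70 for class 0, 0.40 for class 2):

# Minimal sketch (numpy): per-class precision, recall, and F1 from the matrix above.
import numpy as np

cm = np.array([
    [455,  22,  29,  17,  25,  24,  38,  28],
    [ 15, 336,  35,  88,  10,  17,  76,  60],
    [ 31,  48, 243, 114,  77,  84,  15,  26],
    [ 29, 107, 114, 270,  24,  39,  23,  34],
    [ 21,  15,  76,  14, 365, 121,   9,  19],
    [ 13,  13,  51,  18,  83, 440,   8,  14],
    [ 64,  73,   7,  20,   7,   7, 282, 180],
    [ 33,  49,   8,  18,   1,   8, 166, 357],
])

tp = np.diag(cm).astype(float)
precision = tp / cm.sum(axis=0)     # correct predictions / all predictions of that class
recall = tp / cm.sum(axis=1)        # correct predictions / all true members of that class
f1 = 2 * precision * recall / (precision + recall)

for cls, score in enumerate(f1):
    print(f"Class {cls}: F1 = {score:.2f}")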

Per-Class F1 Scores (Gesture Classification)

Detailed performance breakdown by gesture class from today's run:
Class 0: 0.70 (highest)
Class 1: 0.52
Class 2: 0.40 (lowest)
Class 3: 0.45
Class 4: 0.59
Class 5: 0.64
Class 6: 0.45
Class 7: 0.53
These scores show which gesture classes need the most improvement. Classes 2, 3, and 6 are the weakest performers.

Results Summary & Analysis

Current Performance: Using all available sensors (IMU, thermopile, TOF) yields strong binary classification (F1: 0.92), but gesture classification remains significantly below target (F1: 0.54 vs target ≥0.898). This results in an overall score of 0.73, which fails to meet competition requirements.

Root Causes of Low Gesture F1:

  • Feature Overlap: The hardest gesture classes (2, 3, 6) show substantial overlap in feature distributions, making them difficult to distinguish
  • Class Imbalance: Some gesture classes have fewer training examples, leading to poorer generalization
  • Feature Limitations: Current statistical features (mean, std, RMS, etc.) may not capture the temporal patterns unique to each gesture (see the sketch after this list)
  • Sensor Integration: While additional sensors help, their features may not be optimally combined
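
To illustrate the feature-limitation point above, a minimal sketch of the current style of per-window summary statistics; window is an assumed (n_samples, 3) accelerometer array, and the example shows that reordering the samples leaves these features unchanged, so temporal structure is discarded:

# Minimal sketch (numpy): mean/std/RMS summary features per accelerometer axis.
import numpy as np

def stat_features(window: np.ndarray) -> np.ndarray:
    """Concatenate mean, std, and RMS for each axis of an (n_samples, 3) window."""
    mean = window.mean(axis=0)
    std = window.std(axis=0)
    rms = np.sqrt((window ** 2).mean(axis=0))
    return np.concatenate([mean, std, rms])

# Two windows with the same samples in a different order yield identical features:
rng = np.random.default_rng(0)
window = rng.normal(size=(100, 3))
shuffled = rng.permutation(window)                        # shuffle samples along time
print(np.allclose(stat_features(window), stat_features(shuffled)))  # True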

Critical Issues by Gesture Class:

  • Class 2: F1 = 0.40 (lowest performer) - needs significant feature engineering
  • Class 3: F1 = 0.45 - shows confusion with multiple other classes
  • Class 6: F1 = 0.45 - high misclassification rate
  • Classes 0, 4, 5: Best performers (F1: 0.70, 0.59, 0.64) - can serve as reference

Future Directions & Next Steps:

  • Advanced Feature Engineering: Develop temporal features, frequency domain analysis, and gesture-specific patterns
  • Data Augmentation: Generate synthetic examples for underrepresented classes (2, 3, 6); a simple jitter/scale sketch follows this list
  • Temporal Modeling: Implement RNNs, LSTMs, or attention mechanisms to capture sequence dynamics
  • Feature Selection: Identify and prioritize features that best separate the hard classes
  • Ensemble Optimization: Fine-tune stacking weights and explore alternative ensemble methods
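
A minimal sketch of the jitter-and-scale style of augmentation mentioned above; the noise level and scale range are illustrative starting points, not tuned values:

# Minimal sketch (numpy): perturb existing sensor windows to create synthetic
# examples for the minority gesture classes.
import numpy as np

def augment_window(window: np.ndarray, rng: np.random.Generator,
                   noise_std: float = 0.02, scale_range: float = 0.1) -> np.ndarray:
    """Return a perturbed copy of an (n_samples, n_channels) sensor window."""
    scale = 1.0 + rng.uniform(-scale_range, scale_range, size=(1, window.shape[1]))
    noise = rng.normal(0.0, noise_std, size=window.shape)
    return window * scale + noise

rng = np.random.default_rng(0)
original = rng.normal(size=(100, 3))                        # placeholder IMU window
synthetic = [augment_window(original, rng) for _ in range(5)]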

IMU-only models perform well for binary detection but struggle with gesture recognition, highlighting the value of additional sensors but also the need for better multi-modal feature integration.

Binary Classification Details
Class 0 (Non-gesture): F1: 0.94, Precision: 0.94, Recall: 0.94
Class 1 (Gesture): F1: 0.92, Precision: 0.91, Recall: 0.92

Support: 3038 non-gestures, 5113 gestures. Accuracy: 0.93, Macro-averaged F1: 0.93

Feature Analysis: Hardest Gesture Classes

Below are boxplots and histograms for selected features, comparing gesture classes 2, 3, 6, and 7. Each graph is explained to help you interpret which features best separate the hardest classes and guide further feature engineering.

Summary: Across these plots, the distributions for the different gesture classes overlap substantially, so the current features give the model little basis for separating the classes. To improve classification, we need to engineer new features, or find better representations of the data, so that the classes become more distinct in these plots.

acc_mag_mean

acc_mag_mean boxplot
Boxplot: Shows the distribution of mean acceleration magnitude for each gesture class, highlighting differences and overlap between classes.
acc_mag_mean histogram
Histogram: Displays the frequency of mean acceleration magnitude values, helping to visualize class separation and feature usefulness.

acc_x_mean

acc_x_mean boxplot
Boxplot: Shows the distribution of mean acceleration in the X direction for each gesture class.
acc_x_mean histogram
Histogram: Displays the frequency of mean acceleration X values, useful for spotting class overlap or separation.

acc_y_mean

acc_y_mean boxplot
Boxplot: Shows the distribution of mean acceleration in the Y direction for each gesture class.
acc_y_mean histogram
Histogram: Displays the frequency of mean acceleration Y values, useful for identifying feature separation.

acc_z_mean

acc_z_mean boxplot
Boxplot: Shows the distribution of mean acceleration in the Z direction for each gesture class.
acc_z_mean histogram
Histogram: Displays the frequency of mean acceleration Z values, useful for visualizing class differences and overlap.
For a full set of feature plots, see feature_analysis_plots/ in the project directory.

Next Steps & Future Directions

Based on the current analysis, here are the prioritized actions to improve gesture classification performance and achieve the target F1 score of ≥0.898:

🔧 Immediate Actions (High Priority)

  • Feature Engineering: Develop temporal features (autocorrelation, zero-crossing rates, peak analysis); a sketch of these, together with the spectral features below, follows this list
  • Frequency Analysis: Add spectral features to capture motion rhythms
  • Class 2, 3, 6 Focus: Targeted feature development for weakest classes
  • Data Augmentation: Generate synthetic examples for underrepresented gestures
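
A minimal sketch of the temporal and spectral features listed above (zero-crossing rate, lag-1 autocorrelation, dominant frequency, spectral energy); the sampling rate and the example signal are illustrative assumptions:

# Minimal sketch (numpy): temporal and frequency-domain descriptors for one channel,
# e.g. acceleration magnitude.
import numpy as np

def temporal_spectral_features(x: np.ndarray, fs: float = 50.0) -> dict:
    """Return a few temporal and spectral descriptors for a 1-D signal."""
    x = x - x.mean()
    zero_cross_rate = np.mean(np.abs(np.diff(np.sign(x))) > 0)   # fraction of sign changes
    autocorr_lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]             # lag-1 autocorrelation
    spectrum = np.abs(np.fft.rfft(x)) ** 2                       # power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    dominant_freq = freqs[np.argmax(spectrum[1:]) + 1]           # skip the DC bin
    spectral_energy = spectrum[1:].sum()
    return {
        "zero_cross_rate": zero_cross_rate,
        "autocorr_lag1": autocorr_lag1,
        "dominant_freq_hz": dominant_freq,
        "spectral_energy": spectral_energy,
    }

rng = np.random.default_rng(0)
acc_mag = np.sin(2 * np.pi * 3 * np.arange(200) / 50.0) + 0.1 * rng.normal(size=200)
print(temporal_spectral_features(acc_mag))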

🧠 Advanced Modeling (Medium Priority)

  • Temporal Models: Implement RNNs/LSTMs for sequence understanding (see the sketch after this list)
  • Attention Mechanisms: Focus on important time steps in gestures
  • Multi-modal Fusion: Better integration of IMU, TOF, and thermopile data
  • Ensemble Optimization: Fine-tune stacking weights and model combinations
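
A minimal sketch of a temporal model along these lines: a small LSTM classifier over raw sensor windows (PyTorch). The 8-class output matches this project's gesture count; the 7-channel input, hidden size, and window length are assumptions:

# Minimal sketch (PyTorch): classify a window of raw sensor samples with an LSTM.
import torch
import torch.nn as nn

class GestureLSTM(nn.Module):
    def __init__(self, n_channels: int = 7, hidden: int = 64, n_classes: int = 8):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, num_layers=1, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels) -> logits: (batch, n_classes)
        _, (h_n, _) = self.lstm(x)          # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])

model = GestureLSTM()
logits = model(torch.randn(4, 200, 7))      # 4 windows, 200 samples, 7 channels (assumed)
print(logits.shape)                         # torch.Size([4, 8])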

📊 Analysis & Validation (Ongoing)

  • Feature Importance: Identify which features contribute most to classification (a permutation-importance sketch follows this list)
  • Error Analysis: Deep dive into misclassification patterns
  • Cross-validation: Ensure robust performance across subjects
  • Ablation Studies: Test impact of different feature sets
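
A minimal sketch of one way to rank feature importance (permutation importance); X, y, and the random-forest model are placeholders, and in practice this would run on the project's feature matrix and stacked ensemble inside the subject-grouped CV splits:

# Minimal sketch (scikit-learn): permutation importance on a held-out split.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 16))              # placeholder feature matrix
y = rng.integers(0, 8, size=400)            # placeholder gesture labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

result = permutation_importance(model, X_te, y_te, scoring="f1_macro",
                                n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
print("Most important feature indices:", ranking[:5])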

🎯 Success Metrics & Milestones

Current baseline: Gesture F1 = 0.54, Overall = 0.73. Gesture F1 must rise from 0.54 to ≥0.898 to reach the target, a relative improvement of roughly 66%.

About this Web App

Overview

This dashboard visualizes processed recordings from the Helios wrist device. It provides a dataset summary, interactive IMU time-series plots, and TOF heatmaps for either the dataset average or individual sequences.

How to use

  1. Use "View Mode" to switch between dataset-average and per-sequence views.
  2. Apply the gesture or subject filters to narrow the selection; the sequence selector updates accordingly.
  3. In per-sequence mode, pick a sequence to render IMU time-series and the sequence-averaged TOF heatmap.
  4. If TOF appears empty, check console logs for detected TOF column names or run the preprocessing step to regenerate processed/ files.

Future directions