EEG Seizure Detection Project

This web application allows you to explore EEG (Electroencephalogram) data and visualize spectrograms, providing insights into the detection of seizures using machine learning techniques. Throughout the page, we present both the raw data and the results of models that have been pre-trained to identify seizure patterns. Below, we provide a detailed overview of each section and how to interpret the information presented.

EEG Spectrograms Section:

In the EEG Spectrograms section, you can explore the raw EEG data by selecting different datasets from the dropdown menu labeled "Select Dataset." The available datasets include training, testing, and validation data used to develop and evaluate the seizure detection models. Each dataset consists of EEG recordings that have been processed into spectrogram images. Spectrograms visually represent the temporal patterns of brain activity over time and frequency, making it easier to identify distinguishing features between seizure and non-seizure events.

You can navigate through the individual spectrogram images using the slider controls beneath each image. The "Non-Seizure" panel shows spectrograms of EEG segments where no seizures were detected, while the "Seizure" panel shows spectrograms from segments where seizures occurred. This section provides a direct view of the data used to train and test the models, helping you understand the visual differences that the model analyzes when making its predictions.

Model Results Section:

The Model Results section displays the performance metrics of the convolutional neural network (CNN) models that were trained to distinguish between seizure and non-seizure events. These models have been pre-trained using the datasets presented earlier. The dropdown menu labeled "Select Run" allows you to choose from different model runs, each corresponding to a different training session or configuration of the model.

Accuracy and Loss Plot: The plots on the left show the model's performance over time. The Accuracy Plot indicates how well the model correctly identifies seizure and non-seizure events throughout the training process, while the Loss Plot shows the error rate of the model, which ideally decreases as the model learns from the data. These plots help visualize the model's learning curve and its ability to generalize from the training data to new, unseen data.

Confusion Matrix: The Confusion Matrix provides a detailed breakdown of the model's predictions versus the actual labels. It shows how many instances were correctly classified as either seizure or non-seizure (true positives and true negatives) and how many were misclassified (false positives and false negatives). This matrix is essential for evaluating the reliability and accuracy of the model in a clinical context.

Misclassified Images: The Misclassified Images section highlights examples where the model made incorrect predictions. These images are crucial for understanding the limitations of the current model and identifying specific cases where the model may need further refinement. By examining these examples, you can gain insights into potential patterns or characteristics that may be confusing the model.

Training Log: The Training Log provides a comprehensive summary of the model's training parameters and performance metrics. It includes details such as test accuracy, average training and validation accuracy, and loss values, along with model settings like batch size, number of epochs, learning rate, and whether data augmentation or shuffling was applied. This log helps replicate the model's training process and understand its performance characteristics.

Explanation of the Code:

This section describes the Python code used to train the convolutional neural network (CNN) models for detecting seizures from EEG data. The code begins by setting up the environment and defining parameters such as image size, batch size, number of epochs, and learning rate. These parameters are crucial for controlling how the model learns from the data.

Next, the code defines functions to load images from specific folders, such as 'seizure' and 'non-seizure', within training, validation, and testing datasets. It reads the images, converts them to a format suitable for machine learning, and assigns appropriate labels to them. After loading, the data is normalized to ensure consistency in model training. Normalization scales the pixel values between 0 and 1, which helps the model learn more efficiently.

The neural network is constructed using several layers. The initial convolutional layers are designed to extract important features from the EEG spectrogram images. Each of these layers is followed by a max-pooling layer, which reduces the size of the extracted feature maps and helps the model focus on the most significant patterns. After feature extraction, the data is flattened and passed through fully connected (dense) layers that make the final prediction on whether the EEG data indicates a seizure or not.

To train the model, the code uses the specified optimizer and loss function, and applies data augmentation techniques when enabled. The model is trained over several epochs, with real-time feedback on its performance through accuracy and loss metrics. The training process is monitored using callbacks to save the best model and stop training early if it does not improve, reducing the chances of overfitting. After training, the model is evaluated on unseen test data to check its performance, and important metrics like accuracy and loss are calculated and displayed. The code also plots graphs of accuracy and loss over time and generates a confusion matrix to visualize the model's performance.

Detailed Model Architecture:

The model architecture is designed to analyze EEG spectrogram images for seizure detection. It consists of several layers, each playing a distinct role in learning and decision-making. The first layer is an input layer that takes the EEG images in a specified format. This layer is followed by two convolutional layers, which act like filters scanning through the images to detect specific patterns that are characteristic of seizure activity. The first convolutional layer has 16 filters, while the second has 32 filters, both using a 3x3 kernel size. These filters allow the model to capture fine-grained details from the spectrogram images.

After each convolutional layer, a max-pooling layer is used to reduce the dimensionality of the feature maps, effectively summarizing the most important features detected by the convolutional layers. This step helps in reducing the computational load and the risk of overfitting. The data is then flattened, meaning it is transformed into a one-dimensional array to be fed into the dense layers. The dense layers, which are fully connected neural layers, further process the extracted features and make the final decision on whether the EEG signal indicates a seizure. The final dense layer has two nodes corresponding to the two possible outcomes: seizure or non-seizure. This layer uses a softmax activation function to output probabilities for each class, allowing the model to assign a confidence score to its predictions.

Model Performance and Evaluation:

The performance of the convolutional neural network (CNN) model is evaluated using several key metrics to understand its effectiveness in detecting seizures from EEG data. Accuracy is a primary metric, representing the proportion of correctly classified instances out of all instances in the dataset. During our tests, the accuracy varied between 80% and 95%, reflecting the model’s ability to generalize from the training data to new, unseen data. This variation can be attributed to different training configurations, such as changes in learning rate, batch size, or the number of epochs.

Loss is another critical metric, providing insight into the model’s error during training and validation. It measures the difference between the predicted outputs and the actual labels. A lower loss value indicates a better fit of the model to the data. The training and validation loss are tracked over time to ensure that the model is learning effectively without overfitting to the training data. Ideally, the loss should decrease as the model learns, reaching a stable point where it can generalize well to new data.

The confusion matrix is a powerful tool for evaluating the model’s performance in more detail. It provides a breakdown of true positives, false positives, true negatives, and false negatives. In our context, true positives represent correctly detected seizures, while true negatives are correctly identified non-seizure events. False positives occur when the model incorrectly identifies a seizure where there is none, and false negatives occur when it fails to detect an actual seizure. By analyzing the confusion matrix, we can assess not just the overall accuracy but also the reliability of the model in different clinical scenarios. For example, minimizing false negatives is crucial in a clinical setting, where failing to detect a seizure could have serious consequences.

In summary, these performance metrics offer a comprehensive view of how well the CNN model is functioning. They help us understand the strengths and weaknesses of the model, providing valuable feedback for further optimization and development.

Model Training Performance - Comparing the Last Two Runs:

In our recent experiments with the EEG seizure detection model, we conducted two separate training runs using identical datasets and model configurations. Despite keeping the dataset and model parameters consistent, we observed some interesting differences in performance metrics between the two runs, which highlights the nuances and complexities inherent in training deep learning models.

For both runs, we used a batch size of 64, which defines how many samples the model processes before updating its internal parameters. The number of epochs, or full training cycles through the entire dataset, was also kept constant at 100. Additionally, we set the initial learning rate to 0.001, a parameter that controls how much the model's weights are adjusted with respect to the loss gradient. No data augmentation was applied, meaning the model was trained solely on the original images without any artificial alterations or transformations, such as flips or rotations. Finally, we enabled shuffling for both runs, ensuring that the order of training samples was randomized at each epoch to help the model generalize better.

Despite these identical settings, we observed variations in the performance metrics. In Run 9, the model achieved a test accuracy of 96.67%, with an average training accuracy of 82.89% and an average validation accuracy of 93.04%. The corresponding training and validation losses were 0.4323 and 0.3520 respectively, indicating a relatively good fit with a low margin of error. Conversely, Run 10 showed a test accuracy of 90.00%, with an average training accuracy of 73.06% and an average validation accuracy of 85.78%. The training and validation losses for this run were higher, at 0.5729 and 0.5109, respectively. This suggests that while the model still performed reasonably well, it did not generalize as effectively as in Run 9.

These differences, while appearing significant, are actually a common occurrence in the field of machine learning. Neural networks, especially deep learning models, are influenced by various stochastic factors, such as the random initialization of weights or the random order of data presented during training. Even with identical settings, these random elements can cause slight variations in model performance. The lower test accuracy and higher losses in Run 10, for example, could be the result of less favorable initial conditions or subtle differences in how the model encountered the data during each epoch.

Understanding these variations is crucial as it underscores the importance of running multiple experiments and not relying solely on a single run to judge a model's effectiveness. It also highlights the necessity of careful parameter tuning and the value of techniques like cross-validation to ensure that the model is genuinely learning patterns that are generalizable beyond the specific dataset it was trained on.

Overall, these results remind us of the inherently probabilistic nature of machine learning. The key takeaway is to embrace the variability as part of the learning process, continually experimenting with different configurations and techniques to achieve the most robust and reliable outcomes. We encourage you to explore these concepts further by experimenting with different settings or datasets, and to share your findings or questions as you engage with this evolving tool.

Challenges and Future Directions

While our convolutional neural network (CNN) model has shown promising results in detecting seizures from EEG data, several challenges still need to be addressed to make this tool more robust and clinically relevant. One significant challenge is optimizing the model's hyperparameters, such as the learning rate, batch size, and number of epochs. These parameters directly influence the model's ability to learn from the data and generalize to new, unseen examples. Fine-tuning them to find the optimal balance between underfitting and overfitting remains an ongoing process.

Another critical challenge is addressing data imbalance. In our current dataset, there may be more non-seizure instances than seizure instances, which can lead to a model biased towards predicting non-seizure events. To mitigate this, we are exploring advanced data augmentation techniques and alternative sampling strategies to create a more balanced dataset that can improve the model's accuracy across different scenarios.

We also plan to enhance the model's robustness by incorporating additional layers or alternative architectures, such as residual networks (ResNets) or long short-term memory networks (LSTMs). These architectures could potentially capture more complex patterns in the EEG data and lead to better seizure detection. Additionally, we are looking into integrating domain adaptation techniques to make the model more adaptable to data from different sources or equipment, as this is often a significant hurdle in deploying machine learning models in real-world clinical settings.

Future iterations of the model will focus on improving interpretability. This includes developing techniques that can provide insights into which features or parts of the EEG signal contribute most to the model's decisions. Understanding these aspects is crucial for gaining trust and acceptance from clinicians who rely on the model's outputs for making patient care decisions. We also plan to explore real-time processing capabilities, aiming to make the model faster and more efficient so that it can be integrated into real-time monitoring systems.

Conclusion

This project underscores the immense potential of deep learning in medical diagnostics, particularly in detecting neurological conditions like epilepsy. By leveraging convolutional neural networks, we have created a tool that can accurately differentiate between seizure and non-seizure events in EEG data, providing a non-invasive and reliable method for clinicians to use in their decision-making processes. As we continue refining our models and addressing the challenges mentioned, our ultimate goal is to develop a tool that not only achieves high accuracy but also offers interpretability and real-time processing capabilities. This will enhance its utility in various clinical settings and contribute to better patient outcomes.

We also acknowledge the importance of building on the work of others. The EEG data used in this project were derived from the CHB-MIT Scalp EEG database, and the spectrogram images were processed to fit our specific use case. As we move forward, we hope to collaborate with more data sources and researchers to continuously improve and validate our models.

We invite you, our viewers and users, to experiment with the displayed results and explore how different settings impact the model's performance. Your feedback and questions are invaluable, and we encourage you to reach out with any thoughts or inquiries you might have about this application or the broader implications of using machine learning in healthcare.

By refining these sections, we've added more depth and clarity to our current challenges and our future ambitions, as well as reinforcing our commitment to transparency and collaboration in advancing medical AI tools.

For more information on the original CHB-MIT Scalp EEG Database, visit the official website: CHB-MIT Scalp EEG Database.

Appendix: Python Code for Model Training


import os
import numpy as np
from PIL import Image
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras import Input
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns
import time

# Set random seeds for reproducibility
# np.random.seed(42)
# tf.random.set_seed(42)

# Parameters
img_size = (280, 274)  # Image size from the study

# batch_size = 16        # original paper
# batch_size = 32        # 
batch_size = 64          # better results

# epochs = 15            # original paper
# epochs = 30            #
# epochs = 60            # better results
epochs = 100             # 

# initial_lr = 0.01      # original paper
# initial_lr = 0.005      # Updated learning rate 
initial_lr = 0.001      # better results
# initial_lr = 0.0001      # Updated learning rate

# augmentation = True   # worse results
augmentation = False   # better results
shuffling = True      # better results
# shuffling = False      # worse results

# Define base directory
base_dir = os.path.expanduser('~/bionichaos-site/RhythmScan/static/data/dataset')
train_dir = os.path.join(base_dir, 'train')
val_dir = os.path.join(base_dir, 'val')
test_dir = os.path.join(base_dir, 'test')

# Function to load images from a folder
def load_images_from_folder(folder, label):
    images = []
    labels = []
    if not os.path.exists(folder):
        print(f"Directory {folder} does not exist.")
        return np.array(images), np.array(labels)
    for filename in os.listdir(folder):
        img_path = os.path.join(folder, filename)
        try:
            img = Image.open(img_path).convert('RGB')
            img = img.resize(img_size)
            img_array = np.array(img)
            images.append(img_array)
            labels.append(label)
        except Exception as e:
            print(f"Error loading image {img_path}: {e}")
    return np.array(images), np.array(labels)

# Load images
print("Loading and processing Train set images:")
seizure_train_images, seizure_train_labels = load_images_from_folder(os.path.join(train_dir, 'seizure'), 1)
non_seizure_train_images, non_seizure_train_labels = load_images_from_folder(os.path.join(train_dir, 'non seizure'), 0)

print("Loading and processing Validation set images:")
seizure_val_images, seizure_val_labels = load_images_from_folder(os.path.join(val_dir, 'seizure'), 1)
non_seizure_val_images, non_seizure_val_labels = load_images_from_folder(os.path.join(val_dir, 'non seizure'), 0)

print("Loading and processing Test set images:")
seizure_test_images, seizure_test_labels = load_images_from_folder(os.path.join(test_dir, 'seizure'), 1)
non_seizure_test_images, non_seizure_test_labels = load_images_from_folder(os.path.join(test_dir, 'non seizure'), 0)

# Combine datasets
X_train = np.concatenate((seizure_train_images, non_seizure_train_images), axis=0)
y_train = np.concatenate((seizure_train_labels, non_seizure_train_labels), axis=0)

X_val = np.concatenate((seizure_val_images, non_seizure_val_images), axis=0)
y_val = np.concatenate((seizure_val_labels, non_seizure_val_labels), axis=0)

X_test = np.concatenate((seizure_test_images, non_seizure_test_images), axis=0)
y_test = np.concatenate((seizure_test_labels, non_seizure_test_labels), axis=0)

# Confirm that both 0 and 1 labels are present
print("Unique labels in y_train:", np.unique(y_train))
print("Unique labels in y_val:", np.unique(y_val))
print("Unique labels in y_test:", np.unique(y_test))

# Print out some labels to confirm correctness
print("First 10 y_train labels:", y_train[:10])
print("First 10 y_val labels:", y_val[:10])
print("First 10 y_test labels:", y_test[:10])

# Confirm dataset shapes
print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"X_val shape: {X_val.shape}")
print(f"y_val shape: {y_val.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"y_test shape: {y_test.shape}")

# Normalize and reshape data
X_train = X_train / 255.0
X_val = X_val / 255.0
X_test = X_test / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 2)
y_val = to_categorical(y_val, 2)
y_test = to_categorical(y_test, 2)

# Define the model
model = Sequential([
    Input(shape=(img_size[0], img_size[1], 3)),
    Conv2D(16, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(2, activation='softmax')
])

# Compile the model
# optimizer = tf.keras.optimizers.SGD(learning_rate=initial_lr, momentum=0.9, decay=1e-6)
optimizer = tf.keras.optimizers.Adam(learning_rate=initial_lr)
optimizer_name = 'Adam'
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

# Data augmentation (only if augmentation is set to True)
if augmentation:
    datagen = ImageDataGenerator(
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        horizontal_flip=True
    )
else:
    datagen = ImageDataGenerator()

# Prepare data generators
train_generator = datagen.flow(X_train, y_train, batch_size=batch_size, shuffle=shuffling)
val_generator = datagen.flow(X_val, y_val, batch_size=batch_size, shuffle=shuffling)

# Callbacks for saving the best model and early stopping
model_checkpoint = ModelCheckpoint(os.path.expanduser('~/bionichaos-site/RhythmScan/best_model.keras'), monitor='val_loss', save_best_only=True)
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Track training time
start_time = time.time()

# Train the model
history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=epochs,
    steps_per_epoch=len(X_train) // batch_size,
    validation_steps=len(X_val) // batch_size,
    callbacks=[model_checkpoint, early_stopping]
)

# Calculate training time per epoch
training_time_per_epoch = (time.time() - start_time) / epochs

# Save the final model
model.save(os.path.expanduser('~/bionichaos-site/RhythmScan/seizure_detection_final_model.keras'))

# Evaluate the model
val_loss, val_acc = model.evaluate(X_test, y_test)
print(f'Test accuracy: {val_acc * 100:.2f}%')

# Model size
model_size = os.path.getsize(os.path.expanduser('~/bionichaos-site/RhythmScan/seizure_detection_final_model.keras')) / (1024 * 1024)  # in MB

# Calculate averages for accuracy and loss
avg_train_acc = np.mean(history.history['accuracy'])
avg_val_acc = np.mean(history.history['val_accuracy'])
avg_train_loss = np.mean(history.history['loss'])
avg_val_loss = np.mean(history.history['val_loss'])

# Output results
print(f'Average Training Accuracy: {avg_train_acc * 100:.2f}%')
print(f'Average Validation Accuracy: {avg_val_acc * 100:.2f}%')
print(f'Average Training Loss: {avg_train_loss:.4f}')
print(f'Average Validation Loss: {avg_val_loss:.4f}')
print(f'Training time per epoch: {training_time_per_epoch:.2f} seconds')
print(f'Model size: {model_size:.2f} MB')

# Plot accuracy and loss over epochs
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train')
plt.plot(history.history['val_accuracy'], label='Validation')
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='upper left')

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train')
plt.plot(history.history['val_loss'], label='Validation')
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper left')

# Save the figure
timestamp = time.strftime("%Y%m%d-%H%M%S")
save_dir = os.path.expanduser(f'~/bionichaos-site/RhythmScan/static/results_{timestamp}')
os.makedirs(save_dir, exist_ok=True)

plt.savefig(os.path.join(save_dir, f'accuracy_loss_{timestamp}.png'))
plt.show()

# Predict the classes for the test set
y_pred = model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true_classes = np.argmax(y_test, axis=1)

# Find misclassified indices
misclassified_indices = np.where(y_pred_classes != y_true_classes)[0]

# Plot the confusion matrix
conf_matrix = confusion_matrix(y_true_classes, y_pred_classes)
plt.figure(figsize=(10, 7))
sns.heatmap(conf_matrix, annot=True, fmt="d", cmap="Blues", xticklabels=['Non-Seizure', 'Seizure'], yticklabels=['Non-Seizure', 'Seizure'])
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')

# Save the confusion matrix figure
plt.savefig(os.path.join(save_dir, f'confusion_matrix_{timestamp}.png'))
plt.show()

# Save the terminal output to a log file
log_file_path = os.path.join(save_dir, f'training_log_{timestamp}.txt')
output_text = f"""Test accuracy: {val_acc * 100:.2f}%
Average Training Accuracy: {avg_train_acc * 100:.2f}%
Average Validation Accuracy: {avg_val_acc * 100:.2f}%
Average Training Loss: {avg_train_loss:.4f}
Average Validation Loss: {avg_val_loss:.4f}
Training time per epoch: {training_time_per_epoch:.2f} seconds
Model size: {model_size:.2f} MB

# Model Settings:
Batch Size: {batch_size}
Epochs: {epochs}
Initial Learning Rate: {initial_lr}
Augmentation: {augmentation}
Shuffling: {shuffling}
Optimizer: {optimizer_name}
"""

# Save log text to the file
with open(log_file_path, 'w') as log_file:
    log_file.write(output_text)


# Plot the first 10 misclassified images
num_images_to_display = 10
plt.figure(figsize=(15, 15))
for i, index in enumerate(misclassified_indices[:num_images_to_display]):
    plt.subplot(5, 2, i + 1)
    plt.imshow(X_test[index])
    plt.title(f"True label: {y_true_classes[index]}, Predicted: {y_pred_classes[index]}")
    plt.axis('off')

# Save misclassified images plot
plt.tight_layout()
plt.savefig(os.path.join(save_dir, f'misclassified_images_{timestamp}.png'))
plt.show()

EEG Seizure Detection Project

Explore the Dataset

Data Overview

EEG Spectrograms

Model Results

Accuracy and Loss Plot

Confusion Matrix

Misclassified Images

Training Log

Detailed Overview: Understanding and Improving Model Performance in Seizure Detection using CNNs