Using BreastMNIST: A Simpler Dataset for Deep Learning
BreastMNIST, a subset of the MedMNIST collection, is a publicly available binary classification dataset of
breast ultrasound images. Binary classification plays a critical role in medical imaging because it reduces a
complex diagnostic task to a single decision, allowing models to focus on the key differences between normal
and pathological conditions. Unlike many other MedMNIST subsets, which involve multi-class classification
tasks, BreastMNIST focuses solely on distinguishing benign from malignant breast tumors. This simplicity makes
it an excellent starting point for anyone venturing into medical image processing and deep learning.
In this blog, we will explore why BreastMNIST is an optimal choice for beginners in medical AI research, walk
through the initial steps of dataset handling, and discuss the potential improvements in AI agent integration
for processing similar datasets in the future.
Why Choose BreastMNIST?
- Binary Classification: With only two classes (benign or malignant), BreastMNIST reduces the
complexity of training models, allowing researchers to focus on improving model performance without the
added challenges of multi-class data.
- Compact Dataset: It is one of the smallest datasets in the MedMNIST suite, making it easy
to download, preprocess, and visualize without requiring extensive computational resources.
- High Performance Potential: As shown in the comparison table below, BreastMNIST supports
impressive results across various architectures. Even at a low 28x28 resolution, benchmark
performance is competitive, making the dataset well-suited for algorithm testing and rapid prototyping.
| Methods | AUC | ACC |
| --- | --- | --- |
| ResNet-18 (28x28) | 0.901 | 0.863 |
| ResNet-18 (224x224) | 0.891 | 0.833 |
| ResNet-50 (28x28) | 0.857 | 0.812 |
| ResNet-50 (224x224) | 0.866 | 0.842 |
| AutoML Vision (Google) | 0.919 | 0.861 |
| Auto-Keras | 0.871 | 0.831 |
| auto-sklearn | 0.836 | 0.803 |
As seen in the table, ResNet-18 on 28x28 images achieves the highest accuracy (0.863) among the manually
trained models, while Google AutoML Vision achieves the best AUC (0.919) overall. Notably, increasing the image
size to 224x224 did not improve performance: it hurt ResNet-18 and helped ResNet-50 only modestly, challenging
the conventional expectation that higher resolution yields better results. This finding suggests that the
additional pixels provide little meaningful new information for this specific task, and it raises broader
questions about the diminishing returns of increased resolution on small datasets. For BreastMNIST, the task's
simplicity and the dataset's design likely mean performance saturates at low resolution, emphasizing the need
to match model input size to the inherent characteristics of the data. These insights could influence future
dataset curation and model development strategies.
Workflow: Processing and Training BreastMNIST
- Dataset Download and Exploration: The dataset is freely available on the MedMNIST website or through Zenodo. No registration is required, ensuring easy access.
- Visualization: Start by visualizing the 28x28 grayscale images to understand the data
distribution. A simple Python script using libraries like Matplotlib can display these images. This step
helps ensure that the data is correctly loaded and provides insights into class imbalances, if any. Once
visualized, the dataset can be further reviewed for artifacts or inconsistencies.
- Model Training: Using a lightweight convolutional neural network (CNN) such as ResNet-18 or
a similar architecture, you can quickly achieve high performance. These models, optimized for simplicity and
computational efficiency, are well-suited for BreastMNIST's binary classification task, enabling robust
predictions with minimal resource requirements. Training metrics such as accuracy, AUC, loss curves, and
confusion matrices should be visualized to evaluate the model’s robustness. Additionally, exporting these
metrics to a JSON file can streamline analysis and facilitate sharing results.
- Interactive Results: An interactive blog format would allow readers to scroll through
sample images and view training metrics dynamically. Scrollable image galleries let users explore many
samples without interruption, while embedded visualizations such as feature-activation heatmaps and plots
comparing training and validation losses provide comprehensive insight into the model's behavior. Hover
tooltips on individual data points, animated transitions between visualizations, and filtering options for
specific metrics or data subsets can make the analysis more engaging and precise.
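As a minimal sketch of the visualization step, the snippet below uses a random stand-in array shaped like BreastMNIST (28x28 grayscale images with binary labels) so that it runs without the download; with the `medmnist` package installed, the real training split can be loaded via `medmnist.BreastMNIST(split="train", download=True)`, and the label meanings are listed in `medmnist.INFO`. The class-count check is the quickest way to spot the imbalance mentioned above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Stand-in for BreastMNIST: 28x28 grayscale images with binary labels.
# With the real dataset these would come from
#   medmnist.BreastMNIST(split="train", download=True)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(64, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 2, size=64)

# Check class balance before training (BreastMNIST is imbalanced).
counts = np.bincount(labels, minlength=2)
print(f"class 0: {counts[0]}, class 1: {counts[1]}")

# Display a small grid of samples with their labels.
fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for ax, img, lab in zip(axes.flat, images, labels):
    ax.imshow(img, cmap="gray")
    ax.set_title(f"label={lab}")
    ax.axis("off")
fig.tight_layout()
fig.savefig("breastmnist_samples.png")
```

Inspecting such a grid alongside the class counts confirms the data loaded correctly before any training begins.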
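The training step can likewise be sketched in PyTorch. The network below is a hypothetical two-layer CNN (not ResNet-18), trained on a random stand-in batch purely to show the loop structure and the JSON metrics export suggested above; with the real dataset, `x` and `y` would come from a `DataLoader` over the BreastMNIST training split.

```python
import json
import torch
import torch.nn as nn

# Hypothetical lightweight CNN for 28x28 grayscale input, 2 classes.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # -> 16x14x14
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # -> 32x7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random stand-in batch; replace with a DataLoader over BreastMNIST.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 2, (32,))

history = []
for epoch in range(3):
    optimizer.zero_grad()
    logits = model(x)
    loss = loss_fn(logits, y)
    loss.backward()
    optimizer.step()
    acc = (logits.argmax(dim=1) == y).float().mean().item()
    history.append({"epoch": epoch, "loss": loss.item(), "accuracy": acc})

# Export metrics for later analysis and sharing, as suggested above.
with open("metrics.json", "w") as f:
    json.dump(history, f, indent=2)
```

The exported JSON can then feed loss-curve plots or an interactive results page without rerunning training.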
Explaining the Image Size Dilemma
Why does increasing image resolution from 28x28 to 224x224 fail to significantly boost performance? There are
several potential reasons:
- Redundant Features: The additional pixels in higher resolutions may not provide meaningful
new information for the task.
- Overfitting Risk: Larger input sizes increase the parameter count, making the model prone
to overfitting, especially with limited training data.
- Preprocessing Alignment: The original MedMNIST dataset was curated with 28x28 images.
MedMNIST+ now provides larger versions of the dataset, including resolutions of 64x64, 128x128, and 224x224
for 2D images, as well as 64x64x64 for 3D images. These larger sizes serve as standardized benchmarks for
medical foundation models, which are generalized deep learning architectures pre-trained on large and
diverse medical datasets. These models can be fine-tuned for specific tasks, making them valuable for
advancing the performance of domain-specific applications like BreastMNIST. The original design of
BreastMNIST with 28x28 images, however, remains advantageous for computational efficiency and simplicity in
binary classification tasks.
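The overfitting point can be made concrete with back-of-the-envelope arithmetic. For a simple CNN that flattens its final feature map into a fully connected head (an assumed architecture for illustration, not a measurement of ResNet), the head's parameter count grows with the square of the input side:

```python
def fc_head_params(input_side, channels=32, pools=2, num_classes=2):
    """Parameters in a flatten -> linear head after `pools` 2x2 max-pools.

    Assumes padded convolutions, so only pooling shrinks the spatial size
    (each pool halves the side, integer division).
    """
    side = input_side
    for _ in range(pools):
        side //= 2
    return channels * side * side * num_classes + num_classes  # weights + biases

p28 = fc_head_params(28)    # 28 -> 14 -> 7:    32*7*7*2 + 2   = 3138
p224 = fc_head_params(224)  # 224 -> 112 -> 56: 32*56*56*2 + 2 = 200706
print(p28, p224, p224 / p28)  # the 224x224 head is ~64x larger
```

Architectures with global average pooling, such as ResNet, sidestep this particular growth, but larger inputs still mean more activations to compute and more fine-grained detail for a model to overfit on a dataset this small.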
Novel Training Methods for BreastMNIST
To further enhance the performance of models trained on BreastMNIST, novel training methods could be explored.
These include:
- Self-Supervised Learning: Utilizing unlabeled data alongside BreastMNIST could allow models
to learn generalizable features before fine-tuning on labeled data. Techniques such as contrastive learning
or masked autoencoders could be applied to extract meaningful representations.
- Domain-Specific Transfer Learning: Instead of generic pretrained models, leveraging models
fine-tuned on similar medical imaging tasks may improve performance. For example, initializing with weights
from a model trained on mammography datasets could provide a head start for BreastMNIST.
- Curriculum Learning: Introducing examples in an order of increasing complexity might help
the model learn better. Simpler, high-confidence examples could be presented first, gradually moving to more
challenging cases.
- Data Augmentation Strategies: Beyond conventional techniques like rotation and flipping,
advanced methods like GAN-based augmentation or mixup could generate realistic synthetic samples to increase
data diversity.
- Explainable AI (XAI) Integration: Training methods that prioritize interpretability, such
as attention-based mechanisms or prototype learning, could improve both performance and trustworthiness in
medical AI applications.
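Of the augmentation strategies above, mixup is the simplest to sketch. The snippet below implements the standard mixup formulation in NumPy on a random stand-in batch: a mixing coefficient is drawn from a Beta distribution, and each image and its one-hot label are blended with a randomly chosen partner from the same batch.

```python
import numpy as np

def mixup(images, onehot, alpha=0.2, rng=None):
    """Blend each example with a random partner from the same batch."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(images))   # partner indices
    mixed_x = lam * images + (1 - lam) * images[perm]
    mixed_y = lam * onehot + (1 - lam) * onehot[perm]
    return mixed_x, mixed_y, lam

# Stand-in batch shaped like BreastMNIST: 16 grayscale 28x28 images, 2 classes.
rng = np.random.default_rng(0)
x = rng.random((16, 28, 28))
y = np.eye(2)[rng.integers(0, 2, 16)]
mx, my, lam = mixup(x, y, rng=rng)
```

Because the mixed labels are soft (each row still sums to 1), the training loss must accept label distributions rather than class indices, e.g. a cross-entropy computed against probabilities.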
By exploring these approaches, we could establish more robust and efficient training pipelines for BreastMNIST.
Among them, self-supervised learning and domain-specific transfer learning are the most practical to implement
immediately: the former leverages unlabeled data to improve feature extraction, while the latter, initialized
from models trained on related breast-imaging tasks, can accelerate performance gains. Together they balance
practicality and innovation for advancing medical imaging research.
Conclusion
BreastMNIST stands out as an excellent dataset for binary classification in medical imaging, offering simplicity
and computational efficiency without compromising on performance. Through innovative training methods,
thoughtful exploration of dataset characteristics, and advancements in model architecture, researchers can
unlock its full potential. The insights gained from working with BreastMNIST can guide broader medical AI
applications, paving the way for enhanced diagnostic tools and improved healthcare outcomes. We encourage
researchers and developers to explore this dataset and contribute to the growing field of medical imaging
research.