Accidental Classifier
A small CNN that classifies musical accidentals — sharp, flat, and natural — from cropped grayscale images of engraved scores.
It is trained on crops extracted from the PrIMuS dataset and achieves 100% accuracy on a held-out test set of 750 samples.
Confusion matrix (rows=true, cols=predicted):
| | flat | natural | sharp |
|---|---|---|---|
| flat | 250 | 0 | 0 |
| natural | 0 | 250 | 0 |
| sharp | 0 | 0 | 250 |
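The reported accuracy can be checked directly from the confusion matrix. The sketch below uses plain Python and the counts listed above. The helper names `accuracy` and `per_class_metrics` are illustrative and are not taken from eval.py.

```python
# Confusion matrix from the README (rows = true class, columns = predicted class).
LABELS = ["flat", "natural", "sharp"]
CONFUSION = [
    [250, 0, 0],
    [0, 250, 0],
    [0, 0, 250],
]


def accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_metrics(cm, labels):
    """Return {label: (precision, recall)} computed from the matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_pos = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum: predicted as this class
        actual = sum(cm[i])                    # row sum: truly this class
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        metrics[label] = (precision, recall)
    return metrics


print(accuracy(CONFUSION))
print(per_class_metrics(CONFUSION, LABELS))
```

With every off-diagonal cell at zero, precision and recall are 1.0 for all three classes.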
Architecture
| Component | Details |
|---|---|
| Input | 1 × 40 × 40 grayscale |
| Backbone | 3 conv blocks (16 → 32 → 64 channels), each with BatchNorm + ReLU + MaxPool |
| Head | Dropout → FC(1600, 64) → ReLU → Dropout → FC(64, 3) |
| Parameters | ~115 k |
| Output | 3-class logits (flat / natural / sharp) |
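The FC input size of 1600 follows from the input resolution and the three pooling steps. The arithmetic below is a sketch under two assumptions the table does not state: each convolution preserves height and width (for example a 3 × 3 kernel with padding 1), and each MaxPool halves them.

```python
def flattened_features(input_size=40, channels=(16, 32, 64), pool=2):
    """Trace the spatial size through the conv blocks and return the flattened length.

    Assumes size-preserving convolutions and a pooling factor of `pool` per block;
    neither detail is stated in the architecture table.
    """
    size = input_size
    for _ in channels:
        size //= pool  # 40 -> 20 -> 10 -> 5
    return channels[-1] * size * size


def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


flat = flattened_features()
head = linear_params(flat, 64) + linear_params(64, 3)
print(flat, head)  # 1600 102659
```

Under these assumptions the head alone holds about 102.7 k parameters, so most of the model's weights sit in the first fully connected layer.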
train.py also saves an ONNX export for inference via OpenCV's dnn module.
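Inference through OpenCV's dnn module might look like the sketch below. The ONNX file name and the preprocessing are assumptions: the README does not say where the export is written or how inputs are normalised, so the `onnx_path` argument and the [0, 1] scaling are placeholders to adjust against train.py.

```python
LABELS = ["flat", "natural", "sharp"]  # output order from the architecture table


def label_from_logits(logits):
    """Return the label whose logit is largest."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best]


def classify_crop(onnx_path, image_path):
    """Classify one crop with OpenCV's dnn module.

    Assumptions (not confirmed by the README): the export expects a
    1 x 1 x 40 x 40 float input scaled to [0, 1], with no further normalisation.
    """
    import cv2  # imported lazily so label_from_logits works without OpenCV

    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise FileNotFoundError(image_path)

    net = cv2.dnn.readNetFromONNX(onnx_path)
    # blobFromImage resizes to 40 x 40 and adds the batch and channel dimensions.
    blob = cv2.dnn.blobFromImage(image, scalefactor=1.0 / 255.0, size=(40, 40))
    net.setInput(blob)
    logits = net.forward().flatten().tolist()
    return label_from_logits(logits)
```

Calling `classify_crop` with the path of the exported model and a crop image returns one of the three label strings. If train.py normalises inputs differently, the `scalefactor` and `mean` arguments of `blobFromImage` need to match.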
Repository layout
- train.py: Training pipeline (CNN + augmentation + weighted sampling)
- eval.py: Evaluate saved checkpoint on the held-out test set
- extract_accidentals.py: Extract accidental crops from PrIMuS MEI files via Verovio
- extract_fast.py: Faster single-font variant of the extractor
- explore_dataset.py: Explore PrIMuS agnostic encodings and image statistics
- segment_test.py: Connected-component symbol segmentation experiment
- model/accidental_classifier.pt: Saved PyTorch weights (best validation accuracy)
Pipeline
1. Extract crops
Render each PrIMuS MEI file with Verovio, locate accidental glyphs (SMuFL codepoints E260/E261/E262) in the SVG, rasterize with cairosvg, and crop each symbol into crops/{flat,natural,sharp}/.
python extract_fast.py
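The glyph-locating step can be illustrated with the standard library alone. This is a sketch, not the extractor's actual code. It assumes the rendered SVG references glyphs through `<use>` elements whose href is `#E260` or `#E260-<suffix>`, and that positions are given as `x`/`y` attributes. Verovio versions differ here, and some place the position in a `transform` attribute instead. The sample SVG is hand-written for illustration.

```python
import xml.etree.ElementTree as ET

# SMuFL codepoints for the three accidentals.
GLYPH_TO_LABEL = {"E260": "flat", "E261": "natural", "E262": "sharp"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def find_accidentals(svg_text):
    """Return (label, x, y) for every <use> element that references an accidental glyph.

    x and y are the raw attribute values, before any parent transforms are applied.
    """
    root = ET.fromstring(svg_text)
    hits = []
    for elem in root.iter():
        # Strip the XML namespace, e.g. "{http://www.w3.org/2000/svg}use" -> "use".
        if elem.tag.rsplit("}", 1)[-1] != "use":
            continue
        href = elem.get(XLINK_HREF) or elem.get("href") or ""
        code = href.lstrip("#").split("-")[0].upper()
        if code in GLYPH_TO_LABEL:
            x = float(elem.get("x", 0))
            y = float(elem.get("y", 0))
            hits.append((GLYPH_TO_LABEL[code], x, y))
    return hits


# Tiny hand-written SVG standing in for Verovio output (illustrative only).
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <use xlink:href="#E262-abc" x="120" y="340"/>
  <use xlink:href="#E050-abc" x="10" y="300"/>
  <use xlink:href="#E260" x="480" y="365"/>
</svg>"""

print(find_accidentals(SAMPLE_SVG))
```

The treble clef reference (E050) is skipped, and only the sharp and the flat are returned. Turning these positions into pixel crops still requires applying the SVG's transforms and the rasterization scale.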
2. Train
Loads all crops into RAM, applies a stratified train/val/test split (seed 42), trains with data augmentation (random affine + erasing) and class-balanced sampling, and saves the best checkpoint.
python train.py
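Class-balanced sampling is commonly done by weighting each sample with the inverse of its class frequency. The sketch below shows the idea with the standard library. The class counts are hypothetical, and whether train.py computes its weights exactly this way is an assumption.

```python
import random
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each sample by 1 / (size of its class) so every class has equal total weight."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Hypothetical, deliberately imbalanced label list.
labels = ["sharp"] * 600 + ["flat"] * 300 + ["natural"] * 100
weights = balanced_sample_weights(labels)

# Draw with replacement; each class should appear about equally often.
rng = random.Random(42)
drawn = rng.choices(labels, weights=weights, k=30000)
print(Counter(drawn))
```

In PyTorch, per-sample weights like these are typically passed to `torch.utils.data.WeightedRandomSampler`, which draws training indices with replacement in the same manner.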
3. Evaluate
Reproduces the identical test split and reports accuracy, per-class metrics, the confusion matrix, and sample misclassifications.
python eval.py
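Reproducing the identical test split works only if the split is deterministic for a given seed. The sketch below shows one way to build a seeded, stratified split in plain Python. Only the seed (42) comes from this README. The fractions and the function name `stratified_split` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_frac=0.15, test_frac=0.15, seed=42):
    """Split sample indices into train/val/test with equal class proportions.

    The fractions are placeholders; only the seed is taken from the README.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):  # fixed class order keeps the split reproducible
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = int(len(indices) * test_frac)
        n_val = int(len(indices) * val_frac)
        test.extend(indices[:n_test])
        val.extend(indices[n_test:n_test + n_val])
        train.extend(indices[n_test + n_val:])
    return train, val, test


labels = ["flat"] * 100 + ["natural"] * 100 + ["sharp"] * 100
first = stratified_split(labels)
second = stratified_split(labels)
print(first[2] == second[2])  # True
```

Because the training and evaluation scripts would both call the same function with the same seed and the same file ordering, the test indices match. A change in file ordering between runs would break this, so sorting the file list before splitting is a sensible precaution.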
Dependencies
- Python 3.10+
- PyTorch
- torchvision
- NumPy
- Pillow
- OpenCV (cv2), for ONNX verification and segmentation experiments
- Verovio (Python bindings), for crop extraction only
- cairosvg, for crop extraction only
Data
Training data is derived from PrIMuS (Calvo-Zaragoza & Rizo, 2018). The extracted crops and raw dataset are not included in this repository due to size.
License
This project is released into the public domain.