Add README with architecture, pipeline, and usage docs
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent afc16c2bbb
commit 85c8cfd8bb
1 changed file with 82 additions and 1 deletion
README.md
@@ -1,2 +1,83 @@
# accidentals
# Accidental Classifier

A small CNN that classifies musical accidentals (**sharp**, **flat**, and **natural**) from cropped grayscale images of engraved scores.

It is trained on crops extracted from the [PrIMuS dataset](https://grfia.dlsi.ua.es/primus/) and achieves **100% accuracy** on a held-out test set (750 samples).

```
Confusion matrix (rows=true, cols=predicted):
           flat  natural  sharp
flat        250        0      0
natural       0      250      0
sharp         0        0    250
```

## Architecture

| | |
|---|---|
| **Input** | 1 × 40 × 40 grayscale |
| **Backbone** | 3 conv blocks (16 → 32 → 64 channels), each with BatchNorm + ReLU + MaxPool |
| **Head** | Dropout → FC(1600, 64) → ReLU → Dropout → FC(64, 3) |
| **Parameters** | ~115 k |
| **Output** | 3-class logits (flat / natural / sharp) |
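
The 1600 in `FC(1600, 64)` follows from the input size and the three pooling stages, which the short calculation below reproduces. It is only a sketch: the 3 × 3 kernels with padding 1, the bias terms, and the 2 × 2 pooling are assumptions that the README does not state, and the helper names are made up for illustration.

```python
# Shape and parameter arithmetic for the architecture table.
# Assumptions (not stated in the README): 3x3 kernels with padding 1 and
# bias terms, 2x2 max-pooling, BatchNorm with a learnable scale and shift.

def conv_block_params(c_in, c_out, k=3):
    """Parameters of one conv + BatchNorm block (hypothetical helper)."""
    conv = c_in * c_out * k * k + c_out  # kernel weights + bias
    bn = 2 * c_out                       # BatchNorm scale and shift
    return conv + bn


def linear_params(n_in, n_out):
    """Parameters of one fully connected layer (hypothetical helper)."""
    return n_in * n_out + n_out


side = 40                     # input is 1 x 40 x 40
channels = [1, 16, 32, 64]    # channel progression from the table
total = 0
for c_in, c_out in zip(channels, channels[1:]):
    total += conv_block_params(c_in, c_out)
    side //= 2                # each MaxPool halves height and width

flat = channels[-1] * side * side   # features entering the first FC layer
total += linear_params(flat, 64) + linear_params(64, 3)

print(side, flat, total)  # prints: 5 1600 126179
```

The flattened size of 64 × 5 × 5 = 1600 matches the head. Under these assumptions the parameter tally is 126,179, somewhat above the ~115 k quoted in the table, so the actual layers presumably differ in some detail such as kernel size or bias terms.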

`train.py` also saves an ONNX export of the model for inference via OpenCV's `dnn` module.
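
Inference on the ONNX export through OpenCV might look roughly like the sketch below. The class order comes from the Output row of the table; the preprocessing (resize to 40 × 40, scale pixels to [0, 1]) and both function names are assumptions, since the README does not describe how `train.py` normalizes its input.

```python
CLASSES = ("flat", "natural", "sharp")  # order taken from the Output row


def label_from_logits(logits):
    """Return the class name with the largest logit."""
    best = max(range(len(CLASSES)), key=lambda i: logits[i])
    return CLASSES[best]


def classify_crop(image_path, onnx_path):
    """Classify one crop with OpenCV's dnn module (hypothetical helper)."""
    import cv2  # imported lazily so label_from_logits works without OpenCV

    net = cv2.dnn.readNetFromONNX(onnx_path)
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(image_path)
    # Assumed preprocessing: resize to 40x40 and scale pixels to [0, 1].
    blob = cv2.dnn.blobFromImage(img, scalefactor=1.0 / 255.0, size=(40, 40))
    net.setInput(blob)
    logits = net.forward()[0]  # one row of three logits
    return label_from_logits(logits)
```

A call would pass a crop image and the path of the exported model. The README does not name the ONNX file, so that path has to be taken from `train.py`.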

## Repository layout

```
train.py                     Training pipeline (CNN + augmentation + weighted sampling)
eval.py                      Evaluate saved checkpoint on the held-out test set
extract_accidentals.py       Extract accidental crops from PrIMuS MEI files via Verovio
extract_fast.py              Faster single-font variant of the extractor
explore_dataset.py           Explore PrIMuS agnostic encodings and image statistics
segment_test.py              Connected-component symbol segmentation experiment
model/
  accidental_classifier.pt   Saved PyTorch weights (best validation accuracy)
```

## Pipeline

### 1. Extract crops

Render each PrIMuS MEI file with [Verovio](https://www.verovio.org/), locate accidental glyphs (SMuFL codepoints E260/E261/E262) in the SVG, rasterize with cairosvg, and crop each symbol into `crops/{flat,natural,sharp}/`.

```bash
python extract_fast.py
```
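
The glyph-lookup step can be sketched with the standard library alone. This is an illustration, not the extractor's actual code: it assumes the rendered SVG references glyphs through `<use>` elements whose `href` starts with the SMuFL codepoint and that positions are plain numeric `x`/`y` attributes, which may not hold for every Verovio version. The function name and the sample SVG are made up.

```python
import xml.etree.ElementTree as ET

# SMuFL codepoints for the three accidentals named in the text.
GLYPH_CLASSES = {"E260": "flat", "E261": "natural", "E262": "sharp"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def find_accidentals(svg_text):
    """Return (class_name, x, y) for each accidental glyph reference."""
    hits = []
    for elem in ET.fromstring(svg_text).iter():
        # Strip the XML namespace and keep only <use> elements.
        if elem.tag.rsplit("}", 1)[-1] != "use":
            continue
        href = elem.get(XLINK_HREF) or elem.get("href") or ""
        code = href.lstrip("#")[:4].upper()
        if code in GLYPH_CLASSES:
            x = float(elem.get("x", 0))
            y = float(elem.get("y", 0))
            hits.append((GLYPH_CLASSES[code], x, y))
    return hits


# Hand-written sample; E0A4 is a notehead and should be ignored.
sample = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <use xlink:href="#E262-abc" x="100" y="200"/>
  <use xlink:href="#E0A4-abc" x="150" y="220"/>
  <use xlink:href="#E260-abc" x="300" y="180"/>
</svg>"""

print(find_accidentals(sample))
# prints: [('sharp', 100.0, 200.0), ('flat', 300.0, 180.0)]
```

Each hit would then be mapped to pixel coordinates in the rasterized image and cropped into the matching class directory.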

### 2. Train

`train.py` loads all crops into RAM, applies a stratified train/val/test split (seed 42), trains with data augmentation (random affine + erasing) and class-balanced sampling, and saves the checkpoint with the best validation accuracy.

```bash
python train.py
```
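A seeded stratified split and class-balanced sample weights can be sketched as follows. Only the seed (42) comes from the README; the 70/15/15 fractions, the function names, and the toy label counts are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=42):
    """Split sample indices into train/val/test, preserving class ratios.

    The fractions are an assumption; the README only states the seed.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_train = int(len(idxs) * fractions[0])
        n_val = int(len(idxs) * fractions[1])
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def balanced_weights(labels, indices):
    """Per-sample weights inversely proportional to class frequency."""
    counts = Counter(labels[i] for i in indices)
    return [1.0 / counts[labels[i]] for i in indices]


# Toy, deliberately imbalanced label list.
labels = ["flat"] * 100 + ["natural"] * 300 + ["sharp"] * 600
train, val, test = stratified_split(labels)
weights = balanced_weights(labels, train)
```

Because the split depends only on the labels and the seed, calling `stratified_split` again yields the same test indices, which is what lets an evaluation script rebuild the identical test set. In a PyTorch pipeline, weights like these would typically be passed to `torch.utils.data.WeightedRandomSampler`; whether `train.py` does exactly that is not stated here.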

### 3. Evaluate

`eval.py` reproduces the identical test split and reports accuracy, per-class metrics, the confusion matrix, and sample misclassifications.

```bash
python eval.py
```
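As a rough illustration of how accuracy and per-class metrics fall out of a confusion matrix, the sketch below uses the matrix reported near the top of this README. The `metrics` function is hypothetical and is not taken from `eval.py`.

```python
CLASSES = ["flat", "natural", "sharp"]

# Confusion matrix from the reported results (rows = true, cols = predicted).
matrix = [
    [250, 0, 0],
    [0, 250, 0],
    [0, 0, 250],
]


def metrics(matrix):
    """Return overall accuracy and {class: (precision, recall)}."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    per_class = {}
    for i, name in enumerate(CLASSES):
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = matrix[i][i] / predicted if predicted else 0.0
        recall = matrix[i][i] / actual if actual else 0.0
        per_class[name] = (precision, recall)
    return correct / total, per_class


accuracy, per_class = metrics(matrix)
print(accuracy)   # prints: 1.0
print(per_class)  # every class has precision 1.0 and recall 1.0
```

With all 750 test samples on the diagonal, accuracy, precision, and recall are all 1.0, consistent with the 100% figure quoted above.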

## Dependencies

- Python 3.10+
- PyTorch
- torchvision
- NumPy
- Pillow
- OpenCV (`cv2`), for ONNX verification and segmentation experiments
- Verovio (Python bindings), for crop extraction only
- cairosvg, for crop extraction only

## Data

Training data is derived from [PrIMuS](https://grfia.dlsi.ua.es/primus/) (Calvo-Zaragoza & Rizo, 2018). The extracted crops and raw dataset are not included in this repository due to size.

## License

This project is released into the public domain.