Musical Genre Classification

This project discusses three models for musical genre classification: one CNN model and two CRNN models. It is built on PyTorch. Project report here.

Getting Started:

Audio Files

  • We use an enhanced GTZAN dataset, which contains 72 full songs and 1000 30-second audio tracks.
  • There are 10 genres, indexed as follows:
{0: 'pop',
 1: 'metal',
 2: 'disco',
 3: 'blues',
 4: 'reggae',
 5: 'classical',
 6: 'rock',
 7: 'hiphop',
 8: 'country',
 9: 'jazz'}
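
As a small illustration of how this mapping might be used, the sketch below turns a vector of 10 class scores into a genre name. The helper name `decode_prediction` and the reverse lookup `GENRE_TO_INDEX` are hypothetical and are not taken from the project code; only the index-to-genre table comes from the list above.

```python
# Index-to-genre table copied from the list above.
GENRES = {0: 'pop', 1: 'metal', 2: 'disco', 3: 'blues', 4: 'reggae',
          5: 'classical', 6: 'rock', 7: 'hiphop', 8: 'country', 9: 'jazz'}

# Reverse lookup (hypothetical helper), useful when building labels from folder names.
GENRE_TO_INDEX = {name: idx for idx, name in GENRES.items()}


def decode_prediction(scores):
    """Return the genre name whose score is highest.

    `scores` is any sequence of 10 numbers (logits or probabilities),
    one per class, in the index order of GENRES.
    """
    if len(scores) != len(GENRES):
        raise ValueError(f"expected {len(GENRES)} scores, got {len(scores)}")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return GENRES[best_index]


if __name__ == "__main__":
    example_scores = [0.1, 0.0, 0.2, 0.1, 0.0, 2.5, 0.3, 0.0, 0.1, 0.4]
    print(decode_prediction(example_scores))
```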

Log Mel-Spectrogram Datasets

We offer three pre-processed datasets; you can also generate your own using Build Dataset Handmade.ipynb or Build Dataset.ipynb. Download here.

  • Pure GTZAN Dataset (128×128 chunks, 7000 in total)
  • Mixed Dataset I (128×128 chunks, 12370 in total)
  • Mixed Dataset II (256×256 chunks, 4533 in total)
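
To make the "chunks" above concrete, here is a minimal sketch of how a log mel-spectrogram could be cut into square chunks. It assumes the spectrogram has already been computed as a 2-D array of shape (n_mels, n_frames) and that trailing frames which do not fill a whole chunk are dropped. The function name `split_into_chunks` is hypothetical, and the notebooks may use a different procedure (for example, overlapping windows).

```python
import numpy as np


def split_into_chunks(log_mel, size=128):
    """Split a (size, n_frames) log mel-spectrogram into square chunks.

    Returns an array of shape (n_chunks, size, size). Frames left over
    at the end that do not fill a complete chunk are discarded.
    """
    n_mels, n_frames = log_mel.shape
    if n_mels != size:
        raise ValueError(f"expected {size} mel bands, got {n_mels}")
    n_chunks = n_frames // size
    trimmed = log_mel[:, :n_chunks * size]
    # (size, n_chunks * size) -> (size, n_chunks, size) -> (n_chunks, size, size)
    return trimmed.reshape(size, n_chunks, size).transpose(1, 0, 2)


if __name__ == "__main__":
    # Stand-in data in place of a real spectrogram: 128 mel bands, 300 frames.
    fake_log_mel = np.random.randn(128, 300)
    chunks = split_into_chunks(fake_log_mel, size=128)
    print(chunks.shape)
```

Passing `size=256` with a 256-band spectrogram would produce chunks of the shape used by the third dataset.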

Training

  • Define the parameters in Paras.py
  • Run train.py to train a model
  • Training logs are saved in the log folder (loss and accuracy vs. epoch on the training and validation sets)

Test

  • Use music_dealer.py to predict the genre components of a full song; see genre_predictor.ipynb and music_dealer.py for details
  • Test results are saved in the log folder
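
One plausible way to obtain "genre components" for a full song is to classify each chunk and average the per-chunk class probabilities. The sketch below shows that idea only; it is an assumption, not a description of what music_dealer.py actually does. The function name `genre_components` and the example logits are made up for illustration.

```python
import numpy as np

# Genre names in index order, as listed in the Audio Files section.
GENRES = ['pop', 'metal', 'disco', 'blues', 'reggae',
          'classical', 'rock', 'hiphop', 'country', 'jazz']


def genre_components(chunk_logits, genres=GENRES):
    """Average softmax probabilities over all chunks of one song.

    `chunk_logits` has shape (n_chunks, n_genres): one row of raw model
    outputs per chunk. Returns a dict {genre: share}, sorted by share
    in descending order; the shares sum to 1.
    """
    logits = np.asarray(chunk_logits, dtype=float)
    # Numerically stable softmax along the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    mean_probs = probs.mean(axis=0)
    order = np.argsort(mean_probs)[::-1]
    return {genres[int(i)]: float(mean_probs[int(i)]) for i in order}


if __name__ == "__main__":
    # Made-up logits for a song with three chunks.
    logits = np.zeros((3, 10))
    logits[0, 6] = 4.0  # chunk 0 looks like rock
    logits[1, 6] = 3.0  # chunk 1 looks like rock
    logits[2, 1] = 4.0  # chunk 2 looks like metal
    for genre, share in genre_components(logits).items():
        print(f"{genre:10s} {share:.3f}")
```

Averaging probabilities keeps information about uncertain chunks; a simple majority vote over per-chunk argmax labels would be an alternative design.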

Result

Test on frames

  • Accuracy
                 CNN Model   CRNN-I Model   CRNN-II Model
Test Set         88.05%      85.08%         88.45%
Validation Set   86.89%      83.05%         82.67%
  • Confusion Matrix

Test on full songs

30 songs are used for testing. Samples:

Thanks: