Algorithm overview for the BirdCLEF 2021 - Birdcall Identification data:
- data split of audio recordings:
- train: 56863 recordings
- validation: 2997 recordings
- test: 3014 recordings
- outputs for 397 bird species
- pretrained Inception neural network backbone
- recording-level metrics
(based on Deep CNN framework for audio event recognition using weakly labeled web data):
- MultiCategoryAccuracy: a value of 1 for a recording if its top-scoring species is in either the primary or secondary bird labels; a value of 0 otherwise.
- CrossentropyModified: species in a recording's secondary bird labels are effectively ignored; the remaining species are used for its loss computation.
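The two metrics can be sketched in plain Python (a minimal illustration with hypothetical function names, not the repository's implementation; it also shows where the fixed pos_weight hyperparameter would enter the loss):

```python
import math

def multi_category_accuracy(scores, primary, secondary):
    """1.0 if the top-scoring species is a primary or secondary label, else 0.0."""
    top = max(range(len(scores)), key=lambda i: scores[i])
    return 1.0 if top in primary or top in secondary else 0.0

def crossentropy_modified(scores, primary, secondary, pos_weight=1.0):
    """Mean binary cross-entropy over all species, skipping secondary labels.

    pos_weight multiplies the loss terms for positive (primary) labels;
    it corresponds to the fixed hyperparameter and defaults to 1.0.
    """
    loss, n = 0.0, 0
    for i, p in enumerate(scores):
        if i in secondary:
            continue  # secondary labels are effectively ignored
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # clip for numerical stability
        if i in primary:
            loss += -pos_weight * math.log(p)
        else:
            loss += -math.log(1.0 - p)
        n += 1
    return loss / n
```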
- training augmentation (the code for the random masks of the time/frequency slices is based on the documentation for Tensorflow IO's audio package):
- for each recording longer than 5 seconds, remove a random amount of audio from the beginning
- apply a mask to a random time slice
- apply a mask to a random frequency slice
- randomly reduce the range of the decibel levels
- add waveform from another recording within the batch
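The spectrogram-level augmentations above can be sketched with NumPy (a hypothetical standalone version; the actual code uses TensorFlow IO-style time/frequency masking, and the leading-audio trim happens on the waveform before the spectrogram is computed, so it is omitted here):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment(spec, other_spec, max_time_mask=87, max_freq_mask=108, max_db_reduce=0.2):
    """Apply masking, dB-range reduction, and in-batch mixing to a (freq, time) dB spectrogram."""
    spec = spec.copy()
    n_freq, n_time = spec.shape
    floor_db = spec.min()
    # mask a random time slice
    w = int(rng.integers(0, max_time_mask + 1))
    t0 = int(rng.integers(0, max(n_time - w, 1)))
    spec[:, t0:t0 + w] = floor_db
    # mask a random frequency slice
    h = int(rng.integers(0, max_freq_mask + 1))
    f0 = int(rng.integers(0, max(n_freq - h, 1)))
    spec[f0:f0 + h, :] = floor_db
    # randomly reduce the decibel range by raising the floor
    new_floor = floor_db + rng.uniform(0.0, max_db_reduce) * (spec.max() - floor_db)
    spec = np.maximum(spec, new_floor)
    # mix in another recording from the same batch
    lam = rng.uniform(0.0, 0.5)
    return (1.0 - lam) * spec + lam * other_spec
```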
- tunable hyperparameters:
- (initial) learning rate (of Adam optimizer)
- number of training batches to process before updating the weights
- maximum time slice mask
- maximum frequency slice mask
- maximum decibel threshold reduction
- dropout rate for final layer of Inception
- fixed hyperparameter:
- weight multiplied to loss for features with positive labels
- recording-level accuracy on the validation data (accuracy on a blind test set would likely be lower):
- 0.7781
Set up Amazon SageMaker AI:
- In the left panel, click on Domains.
- Click on Create Domain.
- Select Set up for single user (Quick setup). Click on Set up.
- After the environment is set up, click on Open Studio.
- Click on JupyterLab.
- Click on Create JupyterLab space.
- Enter a name for the space. Leave Sharing as Private. Click on Create space.
- Set Storage (GB) to 90. Set Instance to ml.t3.large.
Open AWS Service Quotas:
- Click on AWS Services.
- Search for and select Amazon SageMaker.
- Search for and select ml.g5.4xlarge for spot training job usage.
- Click on Request increase at account level.
- For Increase quota value, enter 3.
- Click on Request.
- Install the Kaggle CLI (pip install kaggle).
- Download a Kaggle API token from Settings -> Account -> API and save it to ~/.kaggle/kaggle.json.
- Agree to the Kaggle birdsong data rules.
- Download the Kaggle birdsong data and copy it into the SageMaker bucket (requires access to the S3 bucket):
  kaggle competitions download -c birdclef-2021
  aws s3 cp birdclef-2021.zip s3://sagemaker-{REGION}-{ACCOUNT}/
- From SageMaker Studio, open the previously created JupyterLab space.
- Click on Run space.
- After the space is running, click on Open JupyterLab.
- In JupyterLab, click on the Git icon.
- Enter the repository's URL: https://github.com/toddstep/birdclef.git. Click on Clone.
- Run the Birdclef Training Notebook.
- Update the Test Scores Analysis and Deploy Notebook with the tun.latest_tuning_job.job_name from the Birdclef Training Notebook.
- The tuning notebook can be shut down and the space stopped while the tuning runs.
- Wait for the hyperparameter tuning job to complete.
- Tuning jobs can be monitored at Training | Amazon SageMaker AI in the console.
- Best tuning job:
- Hyperparameter values:
- learning_rate: 0.0009533691065423334
- num_batch_accum: 2
- time_mask_param: 87
- freq_mask_param: 108
- reduce_db_param: 0.21920253534419515
- feat_drop_rate: 0.578526285895281
- pos_weight: 1.0 (fixed)
- 29 training epochs (best epoch chosen using an accuracy precision of 0.01)
- Go to SageMaker Studio.
- Open previously created JupyterLab space
- Estimate optimal threshold level for a 1% targeted false-positive rate using Test Scores Analysis. Ideally, this would be done on a dataset not previously used in tuning the model.
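Picking a threshold for a 1% targeted false-positive rate can be sketched as taking the 99th percentile of the scores assigned to absent species (a hypothetical helper; the Test Scores Analysis notebook does the actual estimation):

```python
import numpy as np

def threshold_for_fpr(negative_scores, target_fpr=0.01):
    """Threshold at which roughly target_fpr of negative (absent-species) scores exceed it."""
    return float(np.quantile(negative_scores, 1.0 - target_fpr))

# example: with negative scores spread uniformly on [0, 1],
# the estimated threshold lands close to 0.99
scores = np.linspace(0.0, 1.0, 10001)
print(threshold_for_fpr(scores))
```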
- Run Deploy Notebook
- See Bird Frontend