Pod.Cast was developed in July 2019 by Microsoft employees Akash Mahajan, Prakruti Gogia, and Nithya Govindarajan, who volunteered for Orcasound through Microsoft’s annual hackathon. Since then, the tool has been used to label many rounds of open data with help from bioacousticians like Scott Veirs. Ongoing development is led by Akash and Prakruti, mainly through AI4Earth & OneWeek hackathons.
Pod.Cast uses a machine learning model to accelerate annotation of biological signals in audio data through web-based crowdsourcing. The software lets a user:
- Predict the time bounds of signals in a recording based on a machine learning model.
- Visualize the audio data as a spectrogram and listen via a web-based playback UI that is synchronized with the spectrogram.
- Validate the predicted annotations and optionally add annotations manually, collaborating with a “crowd” of other human annotators.
- Save annotations from each labeling “round” in a .tsv file.
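The predict-then-save steps above can be sketched roughly as follows. This is a minimal illustration, not the Pod.Cast implementation: it assumes the model emits one call probability per fixed-length window, and the function names, the 0.5 threshold, the 1-second window, and the .tsv column names are all hypothetical.

```python
import csv
import io


def predictions_to_segments(probs, window_s=1.0, threshold=0.5):
    """Merge consecutive above-threshold windows into (start_s, end_s) pairs.

    Assumption: probs[i] is the model's probability for the window that
    starts at i * window_s seconds.
    """
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i * window_s  # a candidate signal begins here
        elif p < threshold and start is not None:
            segments.append((start, i * window_s))  # the signal ended
            start = None
    if start is not None:
        # The recording ended while a signal was still active.
        segments.append((start, len(probs) * window_s))
    return segments


def write_tsv(segments, wav_filename, out):
    """Write one row per predicted segment (hypothetical column names)."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["wav_filename", "start_time_s", "duration_s"])
    for start, end in segments:
        writer.writerow([wav_filename, f"{start:.2f}", f"{end - start:.2f}"])


# Usage: six 1-second windows of made-up model output.
probs = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6]
segments = predictions_to_segments(probs)
buf = io.StringIO()
write_tsv(segments, "example.wav", buf)
print(buf.getvalue())
```

In a workflow like this, a human annotator would then confirm, adjust, or reject each predicted row before the round's file is saved.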
Open-source code & open data
- orcalabel-podcast repo (GitHub)
- Pod.Cast training and test data (metadata in GitHub wiki, with links to open audio data in AWS/S3)
- Pod.Cast with Orcasound data and a killer whale call model. (VGG-ish model trained first on global killer whale calls from the Watkins Marine Mammal Library, and then on multiple rounds of Southern Resident Killer Whale call data from Orcasound hydrophones.)
Collaborators & contributions
- Lead developers
  - Akash Mahajan (2019+, Microsoft*)
  - Prakruti Gogia (2019+, Microsoft*)
- Other developers
  - Nithya Govindarajan (2019, Microsoft*)
- Annotators / beta-testers
  - Scott Veirs (2019+, Beam Reach, Orcasound)
  - Val Veirs (2020+, Beam Reach, Orcasound)
* Note: this project is volunteer-driven and is not an official Microsoft product.
Support & credits:
- Microsoft Garage on Twitter (organizers of MS Hack 2019)
- Watkins Marine Mammal Sound Database, Woods Hole Oceanographic Institution (non-commercial academic or personal use of labeled and unlabeled killer whale recordings for initial model training and annotation round #1)
- audio-annotator was forked for the frontend code; audio-annotator uses wavesurfer.js for rendering/playing audio.
- AI for Orcas project page at Orcasound
- Add links to blog posts: out on the WWW, here, and/or at orcasound.net (posts tagged there with the podcast tag)?