BirdVox-scaper-10k: a synthetic dataset for multilabel species classification of flight calls from 10-second audio recordings (Q6365)

From MaRDI portal
Revision as of 15:10, 20 February 2025 by Importer (created a new item)
Dataset published at Zenodo repository.

    Statements

    BirdVox-scaper-10k: a synthetic dataset for multilabel species classification of flight calls from 10-second audio recordings
    =============================================================================================
    Version 1.0, September 2019.

    Created By
    -------------
    Elizabeth Mendoza (1), Vincent Lostanlen (2, 3, 4), Justin Salamon (3, 4), Andrew Farnsworth (2), Steve Kelling (2), and Juan Pablo Bello (3, 4).

    (1): Forest Hills High School, New York, NY, USA
    (2): Cornell Lab of Ornithology, Cornell University, Ithaca, NY, USA
    (3): Center for Urban Science and Progress, New York University, New York, NY, USA
    (4): Music and Audio Research Lab, New York University, New York, NY, USA

    https://wp.nyu.edu/birdvox

    Description
    --------------
    The BirdVox-scaper-10k dataset contains 9983 artificial soundscapes. Each soundscape lasts exactly ten seconds and contains one or several avian flight calls from up to 30 different species of New World warblers (Parulidae). Alongside each audio file, we include an annotation file describing the start time and end time of each flight call in the corresponding soundscape, as well as the species of warbler it belongs to.

    To synthesize the soundscapes in BirdVox-scaper-10k, we mixed natural sounds from various pre-recorded sources. First, we extracted isolated recordings of flight calls containing little or no background noise from the CLO-43SD dataset [1]. Second, we extracted 10-second empty acoustic scenes from the BirdVox-DCASE-20k dataset [2]. These acoustic scenes contain various sources of real-world background noise, including biophony (insects) and anthropophony (vehicles), yet are guaranteed to be devoid of any flight calls. Lastly, we filled each acoustic scene by mixing it with flight calls sampled at random.

    Although BirdVox-scaper-10k does not consist of natural recordings, we have taken several measures to ensure the plausibility of each synthesized soundscape, from both qualitative and quantitative standpoints.

    The BirdVox-scaper-10k dataset can be used, among other things, for the research, development, and testing of bioacoustic classification models.

    For details on the hardware of ROBIN recording units, we refer the reader to [2].

    [1] J. Salamon, J. Bello. Fusing shallow and deep learning for bioacoustic bird species classification. Proc. IEEE ICASSP, 2017.
    [2] V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, and J. Bello. BirdVox-full-night: a dataset and benchmark for avian flight call detection. Proc. IEEE ICASSP, 2018.
    [3] J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck, and S. Kelling. Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring. PLoS One, 2016.

    @inproceedings{lostanlen2018icassp,
      title = {BirdVox-full-night: a dataset and benchmark for avian flight call detection},
      author = {Lostanlen, Vincent and Salamon, Justin and Farnsworth, Andrew and Kelling, Steve and Bello, Juan Pablo},
      booktitle = {Proc. IEEE ICASSP},
      year = {2018},
      published = {IEEE},
      venue = {Calgary, Canada},
      month = {April},
    }
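The mixing step in the description (an empty 10-second scene filled with flight calls sampled at random, plus an annotation of start time, end time, and species per call) can be illustrated with a minimal sketch. This is not the authors' pipeline. The sample rate, the signal-to-noise scaling, the function names `mix_calls` and `multi_hot`, and the species labels are all assumptions made for illustration, and the signals are synthetic stand-ins for CLO-43SD clips and BirdVox-DCASE-20k scenes.

```python
import numpy as np

SAMPLE_RATE = 22050    # assumed; the description does not state a sample rate
SCENE_DURATION = 10.0  # seconds, as stated in the description


def mix_calls(background, calls, rng, snr_db=10.0):
    """Mix (species, waveform) pairs into a copy of `background` at random onsets.

    Returns the mixture and a sorted list of (start_time, end_time, species) events.
    The SNR-based gain is an illustrative assumption, not the dataset's recipe.
    """
    mixture = np.asarray(background, dtype=np.float64).copy()
    # RMS of the empty scene, used to scale every call to the requested SNR.
    bg_rms = np.sqrt(np.mean(mixture ** 2)) + 1e-12
    events = []
    for species, call in calls:
        call = np.asarray(call, dtype=np.float64)
        call_rms = np.sqrt(np.mean(call ** 2)) + 1e-12
        gain = bg_rms * 10.0 ** (snr_db / 20.0) / call_rms
        # Pick an onset such that the whole call fits inside the scene.
        onset = int(rng.integers(0, len(mixture) - len(call) + 1))
        mixture[onset:onset + len(call)] += gain * call
        start = onset / SAMPLE_RATE
        end = (onset + len(call)) / SAMPLE_RATE
        events.append((start, end, species))
    return mixture, sorted(events)


def multi_hot(events, species_list):
    """Turn event annotations into a multilabel (multi-hot) target vector."""
    label = np.zeros(len(species_list), dtype=np.int64)
    for _, _, species in events:
        label[species_list.index(species)] = 1
    return label


# Synthetic stand-ins: a noise scene and two short tonal "calls".
rng = np.random.default_rng(0)
n = int(SAMPLE_RATE * SCENE_DURATION)
background = 0.01 * rng.standard_normal(n)
t = np.arange(int(0.1 * SAMPLE_RATE)) / SAMPLE_RATE
calls = [
    ("species_a", np.sin(2 * np.pi * 7000 * t)),  # hypothetical label
    ("species_b", np.sin(2 * np.pi * 8000 * t)),  # hypothetical label
]
mixture, events = mix_calls(background, calls, rng)
label = multi_hot(events, ["species_a", "species_b", "species_c"])
```

Each soundscape then pairs a fixed-length waveform with a list of timed, species-labeled events. Collapsing those events into one binary vector per clip gives the multilabel classification target that the dataset title refers to.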
    9 February 2019
    1.0

    Identifiers
