BirdVox-296h: a large-scale dataset for detection and classification of flight calls (Q6392)

From MaRDI portal
Dataset published at Zenodo repository.

    Statements

BirdVox 296 hours dataset (BirdVox-296h)
========================================
Version 2.1, May 2022.

Created By
----------
Andrew Farnsworth (1), Benjamin Mark Van Doren (1), Steve Kelling (1), Vincent Lostanlen (2), Justin Salamon (3), Aurora Cramer (4), Juan Pablo Bello (4)

(1): Cornell Lab of Ornithology (CLO)
(2): Laboratoire des Sciences du Numérique de Nantes (LS2N), CNRS
(3): Adobe Research
(4): New York University

https://wp.nyu.edu/birdvox

Description
-----------
The BirdVox-296h dataset contains 148 audio recordings, each two hours in duration. These recordings come from ROBIN autonomous recording units, placed near Ithaca, NY, USA during the fall of 2015. They were captured by nine different sensors, originally numbered 1, 2, 3, 4, 5, 6, 7, 8, and 10. Ornithologist Andrew Farnsworth used the Raven software to pinpoint and label every avian flight call in time and frequency. He found 26138 sound events, of which 21546 are flight calls from Passeriformes. Of those, 13385 are identifiable in terms of family, and 8669 are identifiable in terms of both family and species. The annotation process took over 600 hours.

The dataset can be used, among other things, for the research, development, and testing of machine listening models for bird migration monitoring.

Data Files
----------
The BirdVox-296h_wav folder contains 148 recordings as WAV files, sampled at 24 kHz, with a single channel (mono). Each recording lasts exactly two hours and is named according to the following format:

YYYY-MM-DD_hh-mm-ss_unitUU.wav

where Y means year, M means month, D means day, h means hour, m means minute, and s means second. This timestamp corresponds to the start time of the recording file, expressed in Coordinated Universal Time (UTC). The field UU contains two digits corresponding to the identifier of the autonomous recording unit (i.e., bioacoustic sensor). UU is equal to 01, 02, 03, 04, 05, 06, 07, 08, or 10.
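The naming scheme above can be decoded mechanically. The following is a minimal sketch using only the Python standard library; the helper `parse_recording_name`, the `RecordingInfo` type, and the example file name in the usage note are hypothetical and not part of the dataset or of any BirdVox software.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

# Sensor identifiers listed in the README for the UU field.
VALID_UNITS = {"01", "02", "03", "04", "05", "06", "07", "08", "10"}

# The README states that each recording lasts exactly two hours.
RECORDING_DURATION = timedelta(hours=2)

# Pattern for YYYY-MM-DD_hh-mm-ss_unitUU.wav
_NAME_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})_unit(\d{2})\.wav$"
)


class RecordingInfo(NamedTuple):
    """Start time (UTC) and sensor identifier decoded from a file name."""

    start_utc: datetime
    unit: str

    @property
    def end_utc(self) -> datetime:
        # Derived from the stated two-hour duration, not read from the audio.
        return self.start_utc + RECORDING_DURATION


def parse_recording_name(filename: str) -> RecordingInfo:
    """Hypothetical helper: decode a BirdVox-296h recording file name."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    stamp, unit = match.groups()
    if unit not in VALID_UNITS:
        raise ValueError(f"unknown sensor unit: {unit!r}")
    # The timestamp is expressed in UTC, so attach the UTC time zone.
    start = datetime.strptime(stamp, "%Y-%m-%d_%H-%M-%S").replace(
        tzinfo=timezone.utc
    )
    return RecordingInfo(start_utc=start, unit=unit)
```

For instance, calling `parse_recording_name("2015-09-12_02-00-00_unit03.wav")` on this made-up name yields a start time of 02:00:00 UTC on 12 September 2015 for unit 03, with `end_utc` two hours later.
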
Note that 09 is absent from the list because sensor 09 failed during the acquisition campaign.

Metadata Files
--------------
The BirdVox-296h_csv-annotations folder contains CSV files, one for each audio file. The columns of each CSV file are:

ID,Time (s),Frequency (Hz),Taxonomy Code,Fine Label,Medium Label,Coarse Label

Taxonomy Code is compliant with the BirdVoxClassify software: github.com/BirdVox/BirdVoxClassify

Fine Label, Medium Label, and Coarse Label most often correspond to species, family, and order, respectively.

The BirdVox-296h_gps-coordinates.csv file contains the approximate GPS coordinates of all nine sensors, with latitudes and longitudes rounded to two decimal places.

Conditions of Use
-----------------
Dataset created by Andrew Farnsworth, Steve Kelling, Vincent Lostanlen, Justin Salamon, Aurora Cramer, and Juan Pablo Bello.

The BirdVox-296h dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license:
https://creativecommons.org/licenses/by/4.0/

The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, Cornell Lab of Ornithology is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the BirdVox-296h dataset or any part of it.

Feedback
--------
Please help us improve BirdVox-296h by sending your feedback to: vincent.lostanlen@ls2n.fr and af27@cornell.edu

In case of a problem, please include as many details as possible.

Acknowledgements
----------------
Jessie Barry, Ian Davies, Tom Fredericks, Jeff Gerbracht, Sara Keen, Holger Klinck, Anne Klingensmith, Ray Mack, Peter Marchetto, Ed Moore, Matt Robbins, Ken Rosenberg, and Chris Tessaglia-Hymes.

We acknowledge that the land on which the data was collected is the unceded territory of the Cayuga nation, which is part of the Haudenosaunee (Iroquois) confederacy.
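
As a usage sketch for the annotation files described under Metadata Files, the following reads one CSV into typed records with the Python standard library. It assumes that each file starts with a header row carrying exactly the column names listed in the README and that the Time and Frequency columns parse as decimal numbers; neither is confirmed here. The `read_annotations` helper, the `Annotation` type, and the rows in `SAMPLE_CSV` are hypothetical placeholders, not real dataset content.

```python
import csv
import io
from typing import List, NamedTuple, TextIO

# Column names as listed in the README.
COLUMNS = [
    "ID", "Time (s)", "Frequency (Hz)", "Taxonomy Code",
    "Fine Label", "Medium Label", "Coarse Label",
]

# Made-up placeholder rows for illustration only (not real annotations).
SAMPLE_CSV = (
    "ID,Time (s),Frequency (Hz),Taxonomy Code,Fine Label,Medium Label,Coarse Label\n"
    "0,12.5,7000.0,code-a,species-a,family-a,order-a\n"
    "1,40.25,3500.0,code-b,species-b,family-a,order-a\n"
)


class Annotation(NamedTuple):
    event_id: str
    time_s: float
    frequency_hz: float
    taxonomy_code: str
    fine: str
    medium: str
    coarse: str


def read_annotations(handle: TextIO) -> List[Annotation]:
    """Hypothetical helper: read one annotation CSV into typed records."""
    reader = csv.DictReader(handle)
    # Assumption: the file has a header row with the README column names.
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [
        Annotation(
            event_id=row["ID"],
            # Assumption: numeric columns are plain decimal numbers.
            time_s=float(row["Time (s)"]),
            frequency_hz=float(row["Frequency (Hz)"]),
            taxonomy_code=row["Taxonomy Code"],
            fine=row["Fine Label"],
            medium=row["Medium Label"],
            coarse=row["Coarse Label"],
        )
        for row in reader
    ]


annotations = read_annotations(io.StringIO(SAMPLE_CSV))
```

With a real file, the same function would be called on a handle opened with `open(path, newline="")`, which is the mode the `csv` module recommends.
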
    19 December 2021
    2.1
