Abstract
Artificial intelligence (AI) shows considerable potential for biodiversity conservation. However, high-quality, regionally customized datasets remain scarce, particularly for urban environments. Existing large-scale bird image datasets rarely focus on endangered species endemic to specific geographic regions, and they seldom account for the interplay between urban and natural environmental contexts. This paper therefore introduces Macao-ebird, a dataset designed to advance AI-driven recognition and conservation of endangered bird species in Macao. The dataset comprises two subsets: (1) Macao-ebird-cls, a classification dataset of 7341 images covering 24 bird species, with an emphasis on endangered and vulnerable species native to Macao; and (2) Macao-ebird-det, an object detection dataset produced through AI-agent-assisted labeling with Grounding DINO (DINO: DETR with improved denoising anchor boxes), which significantly reduces manual annotation effort while maintaining high-quality bounding-box annotations. We validate the dataset’s utility through baseline experiments with the You Only Look Once (YOLO) v8–v12 series, achieving a mean average precision at an intersection-over-union threshold of 0.5 (mAP50) of up to 0.984. Macao-ebird addresses critical gaps in existing datasets by focusing on region-specific endangered species and complex urban–natural environments, providing a benchmark for AI applications in avian conservation.
| Field | Value |
|---|---|
| Original language | English |
| Article number | 84 |
| Journal | Data |
| Volume | 10 |
| Issue number | 6 |
| DOIs | |
| Publication status | Published - Jun 2025 |
Keywords
- Macao-ebird dataset
- YOLO
- classification
- detection
- endangered bird
- grounding DINO
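
The abstract reports detection quality as mAP50, which counts a predicted bounding box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code; the function names `iou` and `is_match_at_50` and the `(x_min, y_min, x_max, y_max)` box convention are assumptions made for illustration.

```python
from typing import Sequence

# Assumed box convention (hypothetical): (x_min, y_min, x_max, y_max)
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when a prediction would count as a hit under the IoU >= 0.5 rule."""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    print(is_match_at_50((0, 0, 10, 8), truth))   # tight prediction
    print(is_match_at_50((5, 0, 15, 10), truth))  # prediction shifted by half a box
```

A full mAP50 figure additionally ranks predictions by confidence, computes average precision per class from the resulting precision-recall curve, and averages over classes; that part is omitted here.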