Combining Acoustic and Semantic Similarity for Acoustic Scene Retrieval
Abstract
Automatic retrieval of acoustic scenes in large audio collections is a challenging task due to the complex structure of these sounds. A robust and flexible retrieval system should address both the acoustic and the semantic aspects of these sounds, as well as how to combine them. In this study, we introduce an acoustic scene retrieval system that uses a combined acoustic and semantic similarity method. To address the acoustic aspects of sound scenes, we use a cascaded convolutional neural network (CNN) with a gated recurrent unit (GRU). The acoustic similarity is computed in the feature space using the Euclidean distance, and the semantic similarity is obtained using the path similarity measure of WordNet. Two evaluation datasets, TAU Urban Acoustic Scenes 2019 and TUT Urban Acoustic Scenes 2018, are used to compare the performance of the proposed retrieval system with results from the literature and with a developed baseline. Results show that incorporating semantic similarity improves the mean average precision (mAP) and precision-at-k (P@k) scores.
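To illustrate the idea of fusing the two similarity measures, the sketch below computes a Euclidean-distance-based acoustic similarity between embeddings and a WordNet path similarity between scene labels, then combines them with a weight. The distance-to-similarity mapping, the weight `alpha`, and the function names are assumptions for illustration; the paper's exact fusion scheme is not described in the abstract.

```python
import numpy as np
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def acoustic_similarity(query_emb, item_emb):
    """Map the Euclidean distance between two embeddings to a similarity score.

    The 1 / (1 + d) conversion is an assumed choice, not taken from the paper.
    """
    dist = np.linalg.norm(query_emb - item_emb)
    return 1.0 / (1.0 + dist)

def semantic_similarity(query_label, item_label):
    """WordNet path similarity between the first noun synsets of two scene labels."""
    q_syn = wn.synsets(query_label, pos=wn.NOUN)
    i_syn = wn.synsets(item_label, pos=wn.NOUN)
    if not q_syn or not i_syn:
        return 0.0
    sim = q_syn[0].path_similarity(i_syn[0])
    return sim if sim is not None else 0.0

def combined_similarity(query_emb, item_emb, query_label, item_label, alpha=0.5):
    """Weighted combination of acoustic and semantic similarity (alpha is hypothetical)."""
    return (alpha * acoustic_similarity(query_emb, item_emb)
            + (1.0 - alpha) * semantic_similarity(query_label, item_label))

# Usage example: score one retrieved item against a query
query_emb, item_emb = np.random.rand(128), np.random.rand(128)
print(combined_similarity(query_emb, item_emb, "park", "street"))
```

In a retrieval setting, such a combined score would be computed for every item in the collection and used to rank the results, with metrics such as mAP and P@k evaluated over the ranked list.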