Audio Based Violent Scene Classification Using Ensemble Learning
Abstract
In this paper, we address the problem of violent scene detection. Although the visual signal has been widely used to detect violent scenes in video data, the audio modality has not been explored as thoroughly. Moreover, in some scenarios, such as video surveillance, the visual modality can be missing or unusable due to environmental conditions. We therefore use the audio modality of video data to decide whether a video scene is violent. To this end, we propose an ensemble learning method that classifies video scenes as "violent" or "non-violent". We provide empirical analyses of different audio features and classifiers. The best classification performance is obtained with the Random Forest algorithm combined with the zero-crossing rate (ZCR) feature. We evaluate on the MediaEval Violent Scene Detection task dataset and obtain 66% on the official metric, MAP@100, which is superior to results reported in the literature.
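To make the two quantities named above concrete, the following is a minimal sketch of a zero-crossing rate computation for one audio frame and of average precision over the top 100 items of a ranked list. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the treatment of zero-valued samples as non-negative, and the normalization of average precision by the number of relevant items found within the cutoff are all choices made here, and the official evaluation may normalize differently.

```python
from typing import Sequence


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def average_precision_at_k(ranked_labels: Sequence[int], k: int = 100) -> float:
    """Average precision over the top-k items of a ranking.

    ranked_labels holds ground-truth labels (1 = violent, 0 = non-violent),
    ordered by decreasing classifier confidence.
    Assumption: the sum of precisions is divided by the number of violent
    items found within the top k; other normalizations exist.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0
```

Under this sketch, MAP@100 would be the mean of `average_precision_at_k` over the evaluated movies, and the per-frame ZCR values would be aggregated per scene before being passed to a classifier such as a Random Forest.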