Kangkang Zhang, Tong Liu, Muxun Liu, Aoqiang Li, Yanhong Xiao, Walter Metzner, and Ying Liu
Accurate classification of call sequence structures across behavioural contexts is essential for the analysis of vocal syntax. However, an effective, automated program for classifying call sequences from large numbers of recorded sound files is still lacking. Here, we employed three machine learning algorithms (Logistic Regression, Support Vector Machine (SVM) and Decision Trees) to classify call sequences of social vocalizations of greater horseshoe bats (Rhinolophus ferrumequinum) in aggressive and distress contexts. All three algorithms achieved high classification accuracy (Logistic Regression 98%, SVM 97% and Decision Trees 96%). The algorithms also identified three of the most important features for the classification: the transitions between two adjacent syllables, the probability of occurrence of syllables at each position of a sequence, and the characteristics of a sequence. The results of statistical analysis also supported the classifications produced by the algorithms. The study provides the first efficient method for data mining of call sequences and for exploring possible linguistic parameters in animal communication. It suggests the presence of song-like syntax in the social vocalizations emitted within a non-breeding context in a bat species.
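As an illustration of two of the features named above (transitions between adjacent syllables and the probability of each syllable at each position), the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the syllable labels, the toy sequences, and the function names `transition_probs` and `positional_probs` are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def transition_probs(sequences):
    """Estimate P(next syllable | current syllable) from adjacent syllable pairs."""
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            pair_counts[(current, following)] += 1
            first_counts[current] += 1
    return {
        pair: count / first_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


def positional_probs(sequences):
    """Estimate P(syllable | position), using every sequence that reaches that position."""
    position_counts = {}
    for seq in sequences:
        for index, syllable in enumerate(seq):
            position_counts.setdefault(index, Counter())[syllable] += 1
    return {
        index: {syl: n / sum(counts.values()) for syl, n in counts.items()}
        for index, counts in position_counts.items()
    }


# Hypothetical toy data: each inner list is one call sequence of syllable labels.
aggressive = [["A", "B", "B"], ["A", "B", "C"]]
distress = [["C", "C", "A"], ["C", "A", "A"]]

for label, sequences in (("aggressive", aggressive), ("distress", distress)):
    print(label, transition_probs(sequences), positional_probs(sequences))
```

Per-context tables like these could be flattened into numeric vectors and passed to a classifier such as logistic regression; how the features were actually encoded in the study is not stated in the abstract.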