Emotional expressions serve multiple functions, including communication and behavioural/physiological responses. Facial expressions like fear and disgust, for example, have both communicative and functional roles for the person expressing them (Shariff & Tracy, 2011). Some of these functions are shared across species, as reflected in similarities in facial morphology and in cortical areas related to muscle control (Kret, Prochazkova, Sterck & Clay, 2020). Research has demonstrated that humans can recognise facial expressions of closely related species like chimpanzees and bonobos, presumably due to these similarities (Preuschoft, 2000; Burrows, Waller, Parr & Bonar, 2006; Müri, 2016). Humans can also categorise facial expressions of bonobos and Barbary macaques to varying degrees (Maréchal et al., 2017; Filippi et al., 2017a; Kret & Van Berlo, 2021). Similarly, a recent study on cross-species auditory perception found that the accuracy of emotional state inference decreases as species become more phylogenetically distant (Fritz et al., 2018), highlighting the importance of inter-species similarity in such inferences. However, this research has also revealed stark individual differences in how well people recognise the emotional expressions of other species, and has shown that some species are easier to read than others.
The human ability to infer emotions across species is not, however, limited to closely related species. For example, humans can infer emotional states from the facial expressions of dogs, even though dogs share fewer similarities with humans than chimpanzees do (Bloom & Friedman, 2013). Dogs can also discriminate human facial expressions (Müller, Schmitt, Barber & Huber, 2015; Albuquerque et al., 2016), and similar abilities have been observed in other domesticated animals such as cats, goats, and horses (Smith et al., 2016; Nawroth et al., 2018; Quaranta, d’Ingeo, Amoruso, & Siniscalchi, 2020). This evidence suggests that shared environment and familiarity between humans and domesticated animals are important factors in cross-species emotion perception (Hare & Tomasello, 2005; Nagasawa, Mogi & Kikusui, 2009).
However, only a few studies have explored the relative importance of similarity and familiarity in cross-species emotion perception. In a study where humans were asked to categorise emotions from facial expressions of dogs and chimpanzees, they showed higher accuracy for dogs' facial expressions than for chimpanzees' (Sullivan, Kim, Vinicius & Harris, 2022). Similarly, wild bonnet macaques responded to alarm calls from other species only if they had prior experience with those species, while urban monkeys not exposed to these other species did not recognise the alarm calls (Ramakrishnan & Coss, 2000). These studies seem to suggest a larger role for familiarity than for similarity in cross-species emotion recognition.
However, limitations remain in cross-species emotion recognition research. Accurately identifying or measuring the internal state of the expressor, which is essential for true "emotion recognition", is extremely challenging. Even for humans, it is nearly impossible to identify or quantify emotional states with complete accuracy (Barrett et al., 2019; Maréchal et al., 2019), and this difficulty is even more pronounced when dealing with nonverbal species (Waller & Micheletta, 2013; Barrett et al., 2019). Additionally, there is an ongoing debate regarding the nature of emotions. Alternative models suggest that emotions are socially constructed based on previous experiences and are unique to each situation (Lindquist & Barrett, 2008; Barrett, 2011, 2014), rather than adhering to a basic or discrete emotion approach (Colombetti, 2009; Barrett, Gendron & Huang, 2009). This debate challenges the validity of previous studies on cross-species emotion recognition. Moreover, previous studies most often used facial expressions as indicators of emotional states, even though facial expressions are just one component of a larger context that includes bodily and environmental cues in real-life situations (Kret, Stekelenburg, Roelofs & De Gelder, 2013). Each species has its own sensory channels and specific body parts that play significant roles in communication, which can disrupt cross-species emotion "recognition", especially when observers are unfamiliar with these cues. For example, when humans and dogs view emotional expressions, dogs tend to focus more on the ears than on other facial parts and more on the body than on the head, whereas humans viewing dog images tend to focus on the face, particularly the eye region, rather than on parts such as the ears, which are important in dog communication (Correia-Caeiro, Guo & Mills, 2021).
The above-mentioned limitations hinder our understanding of, and scientific progress on, cross-species emotion perception. In the current study, we investigated human perception of bonobo emotional expressions and tried to overcome these limitations by measuring inter-rater reliability among the human raters for each image, instead of calculating accuracy. Moreover, to account for the importance of bodily expressions in conveying emotional information, half of the stimulus images showed only the face, while the other half displayed the whole body without facial information. Specifically, human participants rated images of bonobo facial and bodily expressions depicting emotionally charged or neutral situations. Bonobos, being closely related to humans and sharing facial muscles and behavioural characteristics, were chosen as an ideal species to examine the role of morphological similarity in inferring emotional states (Preuschoft, 2000; Burrows et al., 2006; Prüfer et al., 2012; Müri, 2016). We focused in particular on the bared-teeth display and the open-mouth display (laughter) and their corresponding bodily expressions, as they resemble human smiles and laughter and are used in distinct social contexts. The bared-teeth display is observed during socio-sexual interactions with high social tension, while the open-mouth display is exclusive to positive social interactions like social play (Waller & Dunbar, 2005; Palagi, 2006, 2008; Vlaeyen et al., 2022). To investigate the role of morphological similarity and familiarity (expertise) in cross-species emotion perception, we recruited participants with varying levels of experience with bonobos, ranging from novices to experts. Participants were asked to rate the valence, intensity, and category of the emotional state depicted in the images, and the inter-rater reliability scores were used as a measure of cross-species emotion perception.
We hypothesised that expertise, rather than morphological similarity, would contribute to higher inter-rater reliability, predicting that experts who are familiar with bonobo expressions would show higher inter-rater reliability scores than other groups (see also Van Berlo, Bionda & Kret, 2020). We expected the effect of expertise to be especially pronounced for expressions produced in negatively charged emotional situations, such as bared-teeth displays, as they are morphologically similar to the human smile, which is often perceived as positive. We also explored the relative importance of the face and body in conveying emotional states.
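To make the idea of scoring agreement rather than accuracy concrete, the following minimal sketch shows one way a per-image agreement score could be computed for the emotion-category ratings and then averaged within each expertise group. It is an illustration only: the text above does not specify which reliability statistic was used, so the modal-agreement measure, the function names, the group labels, and the toy ratings are all assumptions.

```python
from collections import Counter


def modal_agreement(labels):
    """Proportion of raters who chose the most common category for one image.

    Assumed, simple agreement measure; not necessarily the statistic
    used in the study.
    """
    if not labels:
        raise ValueError("need at least one rating per image")
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def mean_agreement_by_group(ratings):
    """Average per-image modal agreement within each rater group.

    ratings: dict mapping group name -> dict mapping image id -> list of
    category labels (one label per rater).
    """
    return {
        group: sum(modal_agreement(labels) for labels in images.values()) / len(images)
        for group, images in ratings.items()
    }


# Hypothetical toy data: four raters per group, two images.
ratings = {
    "novice": {
        "img1": ["happy", "fear", "happy", "neutral"],
        "img2": ["fear", "happy", "neutral", "neutral"],
    },
    "expert": {
        "img1": ["fear", "fear", "fear", "happy"],
        "img2": ["neutral", "neutral", "neutral", "neutral"],
    },
}

scores = mean_agreement_by_group(ratings)
print(scores)
```

Because the score depends only on how consistently raters label the same image, it requires no assumption about the bonobo's true internal state. A chance-corrected statistic such as Fleiss' kappa, or an intraclass correlation for the continuous valence and intensity ratings, would be a natural alternative under the same design.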