Investigating the Impact of Background Noise on Group Decision-Making Using an Individual-Weighted Voting Model

doi:10.21203/rs.3.rs-4868818/v1

This work is licensed under a CC BY 4.0 License

Conceptualizing and measuring communication success is crucial for evaluating hearing interventions, as many hearing-impaired individuals struggle in interactive communication settings. However, no established methods exist to directly assess communication success in the context of hearing impairment and intervention. This study introduces a new perspective on communication success, defining it as the effective exchange of information between interlocutors. Experiments were conducted with ten triads (N = 30) to examine the impact of loud background noise on decision-making using a general-knowledge decision-making task in triadic groups. Participants answered questions twice, both before and after group discussions, under noise conditions of 78 dB and 48 dB SPL. A group decision model was applied to quantify the relative influence of group members on each other’s post-discussion decisions, formalized as a set of model weights. Four statistics were used to summarize the results across groups: overall weight change, self-weighting, weight equality, and weight similarity. Results showed that background noise significantly altered the overall weight participants gave to each other’s prior decisions, but self-weighting, weight equality, and weight similarity were not affected by the noise condition. This methodology offers a new tool for assessing the communicative consequences of hearing loss, providing insights beyond traditional hearing tests.

Biological sciences/Psychology/Human behaviour

Biological sciences/Psychology

Group decision-making

communication

task-oriented dialogue

confidence

hearing impairment

An estimated 1.57 billion people worldwide suffer from some degree of hearing loss, with around 400 million experiencing moderate to severe hearing loss¹. Existing clinical tools used to measure a patient’s hearing status commonly include measures of pure-tone sensitivity and speech intelligibility in stationary noise, aiming to diagnose hearing loss in isolated, passive listening scenarios. However, the situations in which hearing-impaired listeners struggle the most are very different from these clinical settings. Such situations typically involve the need to communicate interactively with other people, which is known to be particularly difficult for individuals with hearing disabilities^2–4. While a person's ability to communicate relies strongly on their ability to hear, it is not solely determined by it. Passive listening tests do not reflect the interactive nature of communication and may ultimately be unsuitable for quantifying the capacity for successful communication⁵.

The disregard of the interactive elements of communication has its roots in the source-message-channel-receiver model of communication, also known as the linear model of communication⁶. In this model, problems induced by hearing loss are considered to be solely due to a degradation of the signal on the receiver side. Today, it is well known that passively listening to speech is not people's default or natural mode of communication. Barnlund (1970) criticized the linear model of communication and proposed the transactional model of communication as an alternative. In this model, encoding and decoding the message is a simultaneous process for which both the sender and receiver are collectively responsible⁷. Various later studies provided empirical evidence supporting this approach to modelling communication. Schober and Clark (1989) showed that individuals who were listening in on a conversation without participating had worse comprehension than those engaged in that same conversation⁸. Bavelas et al. (2000) found that listener behavior plays a crucial role in shaping the narrative of a monologue. Speakers narrating a story to a distracted listener were judged to be worse storytellers, when evaluated by naïve observers, than speakers narrating to fully engaged, non-distracted listeners⁹.

These findings cannot be explained by the linear model of communication. If communication were a one-way process, comprehension should not depend on participation, and the quality of a story should not be affected by the listener. Instead, these findings support the view that interaction is a crucial element of successful communication, as elements of the interactive process continuously shape the message itself^10,11. While the linear model of communication has contributed immensely to a better understanding of how to restore hearing ability, understanding communication ability requires other evaluation paradigms that acknowledge the interactive nature of communication. To date, despite established accounts of the interactive nature of dialogue, very few attempts have been made to quantify communication ability in the context of hearing impairment. A quantitative measure of communication ability would be a valuable tool for revealing communication difficulties not adequately captured by non-interactive listening tests. A recent consensus paper noted that the development of such a measure is "of utmost importance" in hearing rehabilitation⁵.

Recent years have seen an increased interest in studying the relationship between conversational behavior signals and communication-inhibiting effects such as background noise and hearing impairment. It has been shown that interlocutors adapt to background noise by moving closer to each other and increasing their speech level, both for normal-hearing^12,13 and hearing-impaired individuals¹⁴. Gesturing can also aid communication in noise, with listeners benefitting from iconic gesture cues in background noise¹³. However, it remains unclear to what extent conversational behavior alone can signify whether a communication scenario was successful. Conversational behavior is highly dependent on context, with factors such as group size, noise type and task material influencing the behaviors employed by interlocutors^15,16. Even if an interlocutor’s behavior changes, such as in response to background noise, their ability to communicate is not necessarily impeded. For example, in the loud setting of a bar, a patron and the bartender generally manage to agree on the drink order and payment amount, despite having to lean closer or talk more loudly. While analyses of conversational behavior can help understand the adaptive mechanisms people use to overcome communicative difficulties, they do not necessarily inform about the attainment of communicative goals¹⁷.

For at least some everyday conversations, like the exchange with the bartender, the communicative goal can be considered in terms of information exchange. In these cases, communication is the process by which a group of people attempts to share information with each other. One way to quantify and analyze information exchange is through a combination of individual and group decision-making tasks^18–24. In this framework, participants first make individual decisions about a query, followed by a group discussion round and a second decision round. In the second decision round, participants can update their decisions based on what they learned during the discussion (i.e., based on the information that was exchanged). Studies have shown that successful information exchange is not guaranteed, with factors such as group size^18,19, individual differences in task ability^20–22, and linguistic coordination^10,25 affecting group decision-making. However, this framework has not yet been used to investigate the impact of acoustical conditions, such as background noise. Acoustical conditions can negatively impact communication, so it is plausible that they would also affect group decision-making. If this is the case, group decision-making tasks could be used as a quantitative measure of communication ability in naturalistic, face-to-face interactions similar to the environments where hearing-impaired individuals struggle the most.

In this study, we used the group decision-making framework to quantify information exchange in triadic interactions between normal-hearing interlocutors conversing in two different levels of background noise. We employed a formalized model of the decision-making process to infer how much individual group members contributed to each other’s post-conversation decisions. We adapted an existing model of group decision-making, the so-called confidence weighted majority voting (CWMV) model²³, to analyze data from a previous study on group decision-making in the presence of background noise²⁶. This model accommodates any group size and is a generalized model that encompasses multiple common strategies employed by groups during decision-making^24,27. For this study, the existing CWMV model was slightly simplified to allow for an analytical solution to the maximum likelihood estimator and was modified to accommodate individual, non-consensus post-conversation responses. This modification allowed the model to predict one-way information flow, such that one member could influence another to change their decision without the reverse necessarily being true. Formally, this was implemented by adding separate weights for each ordered pair of participants within a given group. This addition allows for modelling how individual group members respond differently to interventions, such as background noise in this context. For example, even if a participant cannot pick up information from other group members, they could still share whatever information they possess with the rest of the group. Conversely, some people might be less inclined to speak in a noisy environment due to the added effort, thus sharing less information but still picking up and using information shared by other group members. These types of individual differences in behavior can be accounted for in the extended model presented here.

Based on this adapted model of individual decision-making before and after group discussions, we proposed four summary statistics, all of which can be derived from the estimated interpersonal weights of the decision model. The first measures the overall change in participants’ weights between the two noise conditions. The second statistic measures the increase in self-weighting, i.e. participants’ tendency to rely more on their own previously held information and less on information from others. The third statistic measures the change towards uniform weighting, i.e. the extent to which members equalize the weighting of each other’s contributions. The fourth and final summary statistic captures the differences between individual members’ weights, i.e. whether they become more similar in their weighting of contributions between the two conditions.

Participants & Experimental setup

Ten triads (30 participants) took part in the study. Participants were between 20 and 35 years old and reported having normal hearing. The experiment was conducted in Danish, and all participants were native Danish speakers. The majority of subjects were students at the Technical University of Denmark. When organizing the triads, emphasis was placed on creating mixed-gender groups and ensuring that all three participants were strangers to each other prior to the experiment. However, due to scheduling difficulties, these criteria had to be relaxed. As a result, three triads ended up being same-gender groups, and two triads included pairs of individuals who were acquainted prior to the experiment. The experiment lasted about 2.5 hours, and participants were offered hourly monetary compensation for their participation. All experiments were approved by the Science-Ethics Committee for the Capital Region of Denmark (reference H-16036391), and were carried out in accordance with relevant guidelines and regulations. All participants provided informed consent prior to participation in the experiment.

During the experiments, participants were seated facing each other in an equilateral triangle, approximately 1.5 m apart. Background noise was played back via an array of eight loudspeakers (Dynaudio BM6P) placed at a distance of 2.4 m from the center. The loudspeakers were driven by a sonible d:24 amplifier, and each one played a Danish monologue²⁸, resulting in spatially distributed multi-talker noise. The monologues lasted approximately 90 seconds each and were looped for the duration of the conversation. The noise was presented at a combined sound pressure level (SPL) of either 48 dB or 78 dB, referred to as the “quiet” and the “noisy” conditions, respectively. The simultaneous presentation of multiple masking speech sources rendered them individually unintelligible in both conditions. Behind the loudspeakers, a circular black curtain fully enclosed the participant area to minimize visual distractions.

Task

Each participant went through three main phases of the experiment, as visualized in Fig. 1. First, participants were asked a series of 28 general knowledge questions on a given topic. The topics of the general knowledge questions and subsequent conversations were Hollywood movies (Which of these two movies is oldest?), Copenhagen landmarks (Which of these two places is closest to the city center?), and European countries (Which of these two countries has the larger population?). For each question, two response alternatives were given, each accompanied by a visual illustration and a label. The 28 questions were presented on a touch-screen tablet and included all unique paired combinations of 8 items (e.g., 8 Hollywood movies).
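The count of 28 questions follows from pairing 8 items without repetition, i.e. 8 × 7 / 2 = 28. A minimal sketch of how such a question list could be enumerated is shown here; the item labels are placeholders and not the items used in the study.

```python
from itertools import combinations

# Placeholder labels (assumption): the actual eight items per topic are not listed here.
items = [f"item_{i}" for i in range(1, 9)]

# Every unordered pair of distinct items yields one two-alternative question.
questions = list(combinations(items, 2))

print(len(questions))  # number of unique pairs of 8 items
```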

Participants were instructed to select one of the two options and to provide a confidence level, expressed as a percentage between 50% and 100%, with 50% indicating no preference for either option, and 100% indicating absolute certainty in the decision. They were instructed to interpret the scale as indicating their estimated probability of having answered the question correctly. After answering the 28 questions, the participants discussed the questions they had just answered in their triad. They were instructed to view the task as a collaborative effort and were told to aim to improve both their own performance and that of their group members. This was intended to encourage participants to share their own beliefs and to ask for assistance on questions where they were unsure, thereby facilitating a free exchange of information. To further aid the discussion, each participant was given a sheet displaying the eight items that appeared in the preceding question round. Once a 10-minute time limit was reached, or the conversation concluded naturally, the sheet was removed and participants individually answered the same 28 questions again, without talking. At the end of the round, participants received feedback in the form of a percent correct score on their pre- and post-discussion responses.

Prior to the main experiment, the group performed a short trial round on a separate topic not included in the main experiment. This trial round familiarized them with the task and interface and helped them overcome any initial awkwardness in the conversations. For each topic, there were two lists of 28 questions, one list for each noise condition. Thus, this process was repeated six times, once for each of the three topics and in each of the two noise conditions. The order of topics and conditions was randomized between groups, with the restriction that the same topic would never appear twice in a row. A brief break was included after the third or fourth round of questions.

Group decision model

The group decision model employed was based on the confidence weighted majority voting (CWMV) model²³. The model was originally used in the context of a perceptual task where participants estimated probabilities of biased coin flip sequences. Adapting it to the general-knowledge paradigm was straightforward, as participants were asked to submit confidence ratings based on their estimated probability of being correct. The model predicts a group's combined confidence rating, $c_{g}$, in a binary decision scenario where each individual member's prior confidence is known. The group confidence is assumed to be reached through a consensus decision. Given prior confidence ratings, $c_{i}$, from $M$ individuals, $C_{g}$ is predicted to be:

$$C_{g}\sim N\left(k\sum_{i=1}^{M}C_{i}^{\beta},\,\sigma\right) \tag{1}$$

The confidence ratings $C_{g}$ and $C_{i}$ are measured in log-odds units, such that $C=\ln\left(\frac{c}{1-c}\right)$, where $c$ is a confidence rating on a bounded scale from 0 to 1. Here, $c=0\Rightarrow C=-\infty$ indicates maximal confidence in one option, $c=1\Rightarrow C=\infty$ indicates maximal confidence in the other option, and $c=0.5\Rightarrow C=0$ indicates no preference for either option. The parameters $k$, $\beta$ and $\sigma$ are free parameters that control the shape of the probability distribution of posterior confidences given a set of $M$ prior confidence ratings. Note that the definition given here differs slightly from the one originally proposed, as the errors are assumed to be normally distributed around $C_{g}$, and not $c_{g}$. Modelling the errors this way mitigates the truncation problems introduced by making $c_{p}\in\left[0;1\right]$ normally distributed, as pointed out by the authors of the original study²³.

While the CWMV model was originally intended for cases where a deliberating group makes a single consensus decision, this was not the case in the present study. To allow for individual posterior decisions, $\:{C}_{g}$ was replaced with $\:{C}_{j}^{p}$, representing the posterior (indicated by the superscript $\:p$) confidence of member $\:j$. Furthermore, different weights were included for each pair of group members by adding indices $\:i$ and $\:j$ to the free parameter $\:k$. For simplicity, the $\:\beta\:\:$parameter was dropped from the model. This allowed us to derive an analytical solution to the maximum likelihood estimator (MLE) of the model parameters, eliminating the need for numerical methods when inferring model parameters. The resulting group decision model used in this study was thus:

$$C_{j}^{p}\sim N\left(\sum_{i\in\{a,b,c\}}k_{j,i}C_{i},\,\sigma\right) \tag{2}$$

Here, $\:i$ has been converted to a categorical variable representing the three group members, $\:a$, $\:b$, and $\:c$. This change clarifies that we explicitly looked at triads, though the model is still, in principle, applicable to any group size. In this model, the free parameter $\:{k}_{j,i}$ acts as a weighting factor on the initial confidence ratings $\:{C}_{i}$. The weighting factor $\:{k}_{j,i}$ controls how much participant $\:j$ is influenced by group member $\:i$'s prior confidence rating when making their own posterior decision, or, in other words, how much information they obtained from member$\:\:i$.

Using individual weights for each group member allows the model to account for the fact that individual members might gain more or less information from each other due to factors like hearing status, susceptibility to noise, personality factors, etc. The model defined by Eq. (2) can thus be considered an extension of CWMV for cases where consensus decisions are not enforced, and where individual differences in impact on the final decision are accounted for.
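To illustrate how separate weights per ordered pair allow one-way influence, the following minimal sketch evaluates the mean of Eq. (2) for a single trial. All numbers, and the helper names `predicted_posterior` and `log_odds_to_probability`, are hypothetical and chosen only for illustration.

```python
import math

def predicted_posterior(k_j, priors):
    """Mean of Eq. (2): weighted sum of the members' prior log-odds."""
    return sum(k * c for k, c in zip(k_j, priors))

def log_odds_to_probability(C):
    """Inverse of the log-odds transform C = ln(c / (1 - c))."""
    return 1.0 / (1.0 + math.exp(-C))

# Hypothetical prior log-odds C_a, C_b, C_c on one trial.
priors = [2.0, -1.0, 0.5]

# Hypothetical weights, stored as weights[j] = [k_{j,a}, k_{j,b}, k_{j,c}].
weights = {
    "a": [1.0, 0.0, 0.0],        # a ignores b and c
    "b": [0.5, 0.5, 0.0],        # b is influenced by a, but not the reverse
    "c": [1 / 3, 1 / 3, 1 / 3],  # c weights all members equally
}

for member, k_j in weights.items():
    C_post = predicted_posterior(k_j, priors)
    print(member, round(C_post, 3), round(log_odds_to_probability(C_post), 3))
```

With these example values, member b's predicted posterior log-odds has the opposite sign of b's prior, i.e. b is predicted to switch to the option favored by a, while a's predicted posterior is unchanged.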

The model weights $\mathbf{k}$ were estimated using an MLE. Given $N$ trials of prior and posterior confidences from three group members, an MLE for the weight vector $\mathbf{k}_{\mathbf{j}}$ of group member $j$ can be derived from the following system of equations (see supplementary materials for derivation details):

$$\sum_{n=1}^{N}\left(\begin{bmatrix}C_{a,n}\\ C_{b,n}\\ C_{c,n}\end{bmatrix}\begin{bmatrix}C_{a,n}\\ C_{b,n}\\ C_{c,n}\end{bmatrix}^{T}\begin{bmatrix}k_{j,a}\\ k_{j,b}\\ k_{j,c}\end{bmatrix}\right)=\sum_{n=1}^{N}\begin{bmatrix}C_{a,n}C_{j,n}^{p}\\ C_{b,n}C_{j,n}^{p}\\ C_{c,n}C_{j,n}^{p}\end{bmatrix} \tag{3}$$

For clarity, the summation operators are taken to act on each row separately. $\:{C}_{i,n}$ denotes the prior confidence of member $\:i$ on the $\:n$'th trial, and $\:{C}_{j,n}^{p}$ is the posterior confidence of member $\:j$ on the $\:n$'th trial. Given observations of $\:C$ and $\:{C}_{j}^{p}$, this system of equations can be solved for $\:{\mathbf{k}}_{\mathbf{j}}=\:{\left[\begin{array}{ccc}{k}_{j,a}&\:{k}_{j,b}&\:{k}_{j,c}\end{array}\right]}^{T}$, the weight vector of a given member $\:j$.
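Eq. (3) has the form of the normal equations of an ordinary least-squares problem without an intercept, which is what a Gaussian likelihood with constant $\sigma$ leads to. The following is a minimal sketch of solving it with the standard library only; the function names and the synthetic data are assumptions made for illustration, and a linear algebra library could be used instead of the hand-written solver.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: prior confidences are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def estimate_weights(priors, posterior):
    """Solve Eq. (3) for one member j.

    priors: list of N triples [C_a, C_b, C_c] (prior log-odds per trial).
    posterior: list of N values C_j^p (member j's posterior log-odds).
    """
    dims = range(3)
    # Left-hand side: sum over trials of the outer product C C^T.
    A = [[sum(p[r] * p[c] for p in priors) for c in dims] for r in dims]
    # Right-hand side: sum over trials of C * C_j^p.
    b = [sum(p[r] * y for p, y in zip(priors, posterior)) for r in dims]
    return solve_linear(A, b)

# Synthetic, noise-free check with hypothetical weights (not study data).
rng = random.Random(0)
true_k = [0.7, 0.2, 0.1]
priors = [[rng.uniform(-4.6, 4.6) for _ in range(3)] for _ in range(28)]
posterior = [sum(k * c for k, c in zip(true_k, p)) for p in priors]
k_hat = estimate_weights(priors, posterior)
print([round(k, 6) for k in k_hat])
```

Because the synthetic posteriors are generated without noise, the estimate should reproduce the generating weights up to floating-point error. With real data, the solver raises an error if the priors of two members are perfectly collinear across trials.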

When estimating weights using data from the experiment, trials with a confidence rating of 100% were first truncated to 99% to prevent infinite values when converting the confidence ratings to the log-odds domain. This effectively limited the magnitude of the confidence scale in the log-odds domain to $\pm\ln\left(\frac{0.99}{1-0.99}\right)\approx\pm4.60$.
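The truncation and log-odds conversion can be sketched as follows. The coding of the two response alternatives as 0 and 1, and the mapping of a 50–100% rating onto the 0–1 scale depending on the chosen option, are assumptions made for this illustration.

```python
import math

def to_log_odds(choice, confidence_pct, cap_pct=99.0):
    """Convert a binary choice and a 50-100% confidence rating to signed log-odds.

    choice: 0 or 1, indicating which option was selected (coding is an assumption).
    confidence_pct: confidence in the selected option, between 50 and 100.
    cap_pct: ratings above this value are truncated to avoid infinite log-odds.
    """
    if choice not in (0, 1):
        raise ValueError("choice must be 0 or 1")
    if not 50.0 <= confidence_pct <= 100.0:
        raise ValueError("confidence must lie between 50 and 100")
    p = min(confidence_pct, cap_pct) / 100.0
    # Map onto the 0-1 scale: values above 0.5 favor option 1, below 0.5 option 0.
    c = p if choice == 1 else 1.0 - p
    return math.log(c / (1.0 - c))

print(to_log_odds(1, 100.0))  # capped at ln(0.99 / 0.01)
print(to_log_odds(0, 100.0))  # same magnitude, opposite sign
print(to_log_odds(1, 50.0))   # no preference
```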

When using Eq. (3) to estimate weights from observed data, the weights are assumed to be invariant across multiple decisions; $\:N$ distinct decisions are used to estimate each group member's weight $\:{\mathbf{k}}_{\mathbf{j}}$. However, the conditions under which communication happens may impact the weights, so that the members of a given group might apply different weights towards each other depending on the conditions. For example, background noise can reduce the audibility of other group members, making their utterances less clear to the listener(s). This could, in turn, cause the listener to reduce their weight towards others, as the cues they shared were less salient or judged to be less reliable. The weight vectors can thus act as a quantitative measure of the dynamics by which information is exchanged in the group, and they can be compared across different conditions to explore how these dynamics are affected by an intervention.

Weight distances

To make quantitative claims about the effect of an intervention on the information exchange weights, a meaningful measure of distances between weights is required. Here, we used the inverse cosine (arccosine) of the cosine similarity, referred to here as the cosine distance or angular distance, to quantify the distance between weight vectors. Results are reported in radians, corresponding to the angle between two weight vectors in the three-dimensional space of the decision weights. A cosine distance of zero radians between two weight vectors (i.e., the vectors are parallel and share the same sign) indicates that, if the set of initial confidence ratings is held constant, those two vectors represent identical posterior decisions in terms of binary choices. Similarly, the smaller the angular distance between two vectors, the higher the similarity between the posterior decisions they are derived from. The angular distance can also be used to quantify the relative weight towards individual group members. This is done by finding the distance between a weight vector and individual axes in $k$-space.

These two different ways to use the angular distance are illustrated in Fig. 2. Two hypothetical weight vectors, $\:{\mathbf{k}}_{\mathbf{i}}$ and $\:{\mathbf{k}}_{\mathbf{j}}$, belonging to members $\:i$ and $\:j$, are shown in blue and red, respectively. The notation $\:D\left(\cdot\:,\cdot\:\right)$ is used to refer to the angular distance between two weight vectors, measured in radians. The bold line shows the angular distance between the two weights, i.e. $\:D\left({\mathbf{k}}_{\mathbf{i}},{\mathbf{k}}_{\mathbf{j}}\right)$. The three axes in Fig. 2 can each be thought of as "belonging" to a specific group member, i.e., the $\:{k}_{b}$-axis belongs to member $\:b$, as this dimension represents the weight towards member $\:b$. Defining $\:\widehat{\mathbf{m}}$ as the unit vector parallel to some member $\:m$'s axis, $\:D\left({\mathbf{k}}_{\mathbf{i}},\widehat{\mathbf{m}}\right)\:$measures the distance between member $\:i$'s weight vector and member $\:m$'s axis. When $\:D\left({\mathbf{k}}_{\mathbf{i}},\widehat{\mathbf{m}}\right)\to\:0$ rad, the posterior (binary) decisions made by member $\:i$ will approach $\:m$'s prior decisions. The confidence values may be scaled by some constant; this would correspond to changing the magnitude of the weight vector.

Assuming that weights are non-negative, the maximum possible value of $\:D\left({\mathbf{k}}_{\mathbf{i}},\widehat{\mathbf{m}}\right)$ would be $\:\frac{{\pi\:}}{2}$ rad, which would occur only if $\:{\mathbf{k}}_{\mathbf{i}}$ is orthogonal to $\:\widehat{\mathbf{m}}$, i.e., if member $\:i$'s weight towards $\:m$ is zero. This would occur if $\:i$ completely ignores any information shared by $\:m$ when making their posterior decisions. A negative weight towards a member can only occur if the information shared by that member is "inverted" before it is integrated into the posterior answer. This would most likely occur only if participants believed they were being deliberately deceived by another member, or if they for some other reason believed the other member to be consistently more likely to be wrong than right. In the experiment presented in this study, we assumed that such behavior would not take place, as the task was explicitly collaborative. We thus assumed that negative weights would only occur as statistical anomalies.
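The angular distance $D(\cdot,\cdot)$ described above can be sketched as below. The function name and the example vectors are hypothetical; the clamping of the cosine to the interval from -1 to 1 is an implementation detail added here to guard against floating-point rounding.

```python
import math

def angular_distance(u, v):
    """Angle in radians between two weight vectors (arccosine of cosine similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angular distance is undefined for a zero vector")
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)

axis_b = [0.0, 1.0, 0.0]  # unit vector along member b's axis

# Zero weight towards b: the weight vector is orthogonal to b's axis.
print(angular_distance([0.8, 0.0, 0.3], axis_b))

# Scaling a weight vector changes its magnitude but not its direction.
print(angular_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))

# A negative weight towards b gives an angle larger than pi / 2.
print(angular_distance([0.8, -0.2, 0.3], axis_b) > math.pi / 2)
```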

Weight distance summary statistics

The possible directions of weight vectors spanned by non-negative weights are illustrated as the gray hemisphere in Fig. 2. The weights provide an abstract representation of how information is exchanged between individuals in a particular group. To facilitate comparison across multiple groups, four summary statistics were defined based on the information exchange weights and the angular distances between them. These summary statistics – overall weight change, self-weighting, weight equality and weight similarity, introduced separately in the following – are each associated with a different view on what constitutes successful information exchange, providing complementary perspectives on how to interpret the weights estimated using the decision model.

The first summary statistic, overall weight change, was quantified as $\:D\left({\mathbf{k}}_{\mathbf{N}},{\mathbf{k}}_{\mathbf{Q}}\right)$, where $\:{\mathbf{k}}_{\mathbf{N}}$ and $\:{\mathbf{k}}_{\mathbf{Q}}$ denote the noise and quiet condition weights, respectively, for any given participant. This statistic was motivated by the idea that the quiet condition may be thought of as representing an “ideal” communication scenario, where no inhibitive effects on communication are present. Participants were thus assumed to use the weights that came naturally to them, given their individual personality traits and the groups' social dynamics. In this view, any substantial change in weights away from the quiet condition would represent a detriment to the information exchange process, as different weights than those achieved in quiet would indicate that different posterior choices would follow.

The second summary statistic, self-weighting, was defined using the relative weight towards oneself, i.e. $D\left(\mathbf{k}_{\mathbf{a}},\widehat{\mathbf{a}}\right)$ for the self-weighting of some group member $a$. A low value of $D\left(\mathbf{k}_{\mathbf{a}},\widehat{\mathbf{a}}\right)$ indicates a high degree of self-weighting. Self-weighting is particularly interesting as its magnitude depends on how much new information the participant receives during the experiment. For example, consider a hypothetical “impossible” communication scenario, where the noise is imagined to be so loud that there is no way for participants to exchange any information. In such a scenario, each group member would be forced to simply repeat their prior decisions in the post-conversation round. This would result in weights that are equal to one towards oneself and zero towards others. Each participant’s weight vector would thus be parallel to their own axis. As the noise level gradually increases from quiet to infinite noise, one might expect an equally gradual increase in self-weighting, reflecting that information from other group members gradually becomes harder to obtain or less reliable as the noise increases. If the noise level used in this study is loud enough to impact information exchange negatively, self-weighting should be higher in noise.

The third summary statistic used was weight equality. Defining the uniform weight $\widehat{\mathbf{u}}=\left[1\quad 1\quad 1\right]$, the distance $D\left(\mathbf{k},\widehat{\mathbf{u}}\right)$ was used to quantify any given weight’s distance to this uniform weighting. A low value of $D\left(\mathbf{k},\widehat{\mathbf{u}}\right)$ thus indicated high weight equality. This statistic is motivated by mathematical considerations about the optimal weights that interacting agents can use to combine information in decision-making tasks^27,29. In the original CWMV model, this observation is one of the motivations for transforming the raw confidence ratings into log-odds²³. Assuming that the confidence ratings $c$ provided by participants reflect their probability of being correct on a given trial, the ideal value of the weight vector $\mathbf{k}$ in the present model would be a uniform weight, since the log-odds transformation of the confidence ratings is already performed in the model via $C=\ln\left(\frac{c}{1-c}\right)$. Under these assumptions, the uniform weight represents the weight that an ideal observer would use, and non-uniform weights are interpreted as representing non-ideal information exchange. If noise impacts information exchange negatively, weight equality should thus be expected to be higher in quiet conditions.

The fourth and final summary statistic used was weight similarity. Weight similarity was quantified by the distance between the weights of each pair of individuals in a group, i.e. $D\left(\mathbf{k}_{\mathbf{a}},\mathbf{k}_{\mathbf{b}}\right)$, $D\left(\mathbf{k}_{\mathbf{a}},\mathbf{k}_{\mathbf{c}}\right)$ and $D\left(\mathbf{k}_{\mathbf{b}},\mathbf{k}_{\mathbf{c}}\right)$. Lower values of these distances indicate that members used more similar weights. Weight similarity may be related to successful information exchange, as similar weights would mean that group members are making similar posterior decisions. One route by which such decision similarity can occur is if 1) group members successfully share with each other all relevant cues that they use to make their prior decision, and 2) the validity of each shared cue is judged similarly by each group member when making the posterior decision. In this view, weights that are close together will be indicative of both successful exchange of information and collective agreement on the validity of the exchanged information. Thus, the closer together members’ weights were, the more successful the exchange of information. If the noise impacts information exchange negatively, weight similarity should thus be expected to be higher in the quiet condition.
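The four summary statistics can be sketched for a single group as follows. The data layout (one dictionary of weight vectors per condition), the function names and the example weights are assumptions made for illustration and do not reproduce the study's analysis code or data.

```python
import math
from itertools import combinations

def angular_distance(u, v):
    """Angle in radians between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norms)))

AXES = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.0, 0.0, 1.0]}
UNIFORM = [1.0, 1.0, 1.0]

def summary_statistics(k_quiet, k_noise):
    """Compute the four statistics from dicts mapping member -> weight vector."""
    stats = {
        # Distance between each member's noise and quiet weights.
        "overall_weight_change": {
            m: angular_distance(k_noise[m], k_quiet[m]) for m in AXES
        }
    }
    for label, k in (("quiet", k_quiet), ("noise", k_noise)):
        # Distance to one's own axis (low value = high self-weighting).
        stats[f"self_weighting_{label}"] = {
            m: angular_distance(k[m], AXES[m]) for m in AXES
        }
        # Distance to the uniform weight (low value = high weight equality).
        stats[f"weight_equality_{label}"] = {
            m: angular_distance(k[m], UNIFORM) for m in AXES
        }
        # Pairwise distances between members (low value = high similarity).
        stats[f"weight_similarity_{label}"] = {
            (m, n): angular_distance(k[m], k[n])
            for m, n in combinations(sorted(AXES), 2)
        }
    return stats

# Hypothetical extreme case: uniform weights in quiet, pure self-weighting in noise.
k_quiet = {m: list(UNIFORM) for m in AXES}
k_noise = {m: list(AXES[m]) for m in AXES}
stats = summary_statistics(k_quiet, k_noise)
for name, values in stats.items():
    print(name, {key: round(val, 3) for key, val in values.items()})
```

In this constructed example, the self-weighting distance is zero in noise, the weight equality distance is zero in quiet, and the pairwise distances in noise equal $\pi/2$, matching the "impossible" communication scenario described in the text.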

Statistical analysis

The four weight change statistics were compared between the two conditions using permutation tests. For individual-level analysis, permuted samples were created by randomly shuffling the noise and quiet labels 10,000 times for each participant’s confidence ratings. New weights were estimated in each permuted sample, and permuted summary statistics were calculated using these weights. All tests were two-tailed, except for the test of the overall weight change statistic, which was one-tailed, since the statistic in question was non-negative by definition. For population-level analysis, permutation tests were performed using the median of the permuted weight change statistics from the individual-level analysis.
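A minimal sketch of a label-shuffling permutation test of this kind is given below. It is generic: the `statistic` callable stands in for "re-estimate the weights in each condition and compute a summary statistic". The toy statistic, the pooled shuffling scheme, and the add-one correction of the p-value are assumptions made for illustration and are not taken from the study.

```python
import random

def permutation_p_value(trials_quiet, trials_noise, statistic, n_perm=10_000, seed=0):
    """One-tailed permutation test for a non-negative statistic.

    trials_quiet, trials_noise: per-condition trial data for one participant.
    statistic: callable taking (quiet_trials, noise_trials) and returning a float.
    Returns the observed statistic and the proportion of permuted statistics
    that are at least as large (with an add-one correction).
    """
    rng = random.Random(seed)
    observed = statistic(trials_quiet, trials_noise)
    pooled = list(trials_quiet) + list(trials_noise)
    n_quiet = len(trials_quiet)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign the condition labels
        permuted = statistic(pooled[:n_quiet], pooled[n_quiet:])
        if permuted >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)

def abs_mean_difference(x, y):
    """Toy stand-in statistic: absolute difference between condition means."""
    return abs(sum(x) / len(x) - sum(y) / len(y))

# Hypothetical data with a clear difference between conditions.
observed, p_value = permutation_p_value(
    [0.0] * 20, [1.0] * 20, abs_mean_difference, n_perm=1000
)
print(observed, p_value)
```

For a two-tailed test on a signed statistic, the comparison would instead be made on absolute values.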

Estimated information exchange weights

The weight vectors estimated by the model are reported in the supplementary materials and are summarized in Fig. 3. Each of the ten subplots represents a single group, with each color (blue, red and green) representing a given member within that group. Each corner of the triangular grid represents a point where the weight vector is nonzero only for the member indicated by the color of the circle. The center of the diagram represents the uniform weight, where all three components of the weight vector have the same magnitude. Each weight is represented by a colored, triangular marker. This two-dimensional representation of the weights is achieved by normalizing each weight to sum to one. Compared to the visualization shown in Fig. 2, the diagrams in Fig. 3 correspond to a two-dimensional planar slice through the points $k_{a}=1$, $k_{b}=1$, and $k_{c}=1$, i.e. the plane where $k_{a}+k_{b}+k_{c}=1$. The visual distance between two weight vectors in Fig. 3 is a good indicator of the three-dimensional cosine distances between those weights, as the two-dimensional plane shown in Fig. 3 corresponds roughly to the grey hemisphere in Fig. 2 along which the cosine distances are measured. In Fig. 3, dark-shaded weights show the noise condition, and light-shaded weights show the quiet condition. The closer that a given weight is to a particular corner, the higher the relative weight towards that member. For example, in group C in the quiet condition, the red member has a high relative weight towards themself, blue has a slightly lower relative weight towards red, and green has a much lower relative weight towards red.
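
The normalization and projection onto the ternary diagram can be sketched as follows. The corner layout, the function names, and the handling of a non-positive sum are assumptions made for illustration; the plotting code used for Fig. 3 is not described in the text.

```python
import math

# Corner layout of the ternary diagram (an assumption made for illustration).
CORNERS = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (0.5, math.sqrt(3) / 2),
}


def ternary_xy(k_a, k_b, k_c):
    """Normalize a weight vector to sum to one and map it to 2-D coordinates."""
    total = k_a + k_b + k_c
    if total <= 0:
        # Normalizing to a unit sum is only meaningful for a positive total.
        raise ValueError("weights must have a positive sum to be normalized")
    shares = {"a": k_a / total, "b": k_b / total, "c": k_c / total}
    x = sum(shares[m] * CORNERS[m][0] for m in CORNERS)
    y = sum(shares[m] * CORNERS[m][1] for m in CORNERS)
    return x, y


def inside_triangle(k_a, k_b, k_c):
    """True when the normalized weight falls within the triangle."""
    return (k_a + k_b + k_c) > 0 and min(k_a, k_b, k_c) >= 0


uniform_xy = ternary_xy(2.0, 2.0, 2.0)      # maps to the center of the diagram
negative_xy = ternary_xy(1.0, 1.0, -0.2)    # falls just below the a-b edge
```

A weight with one slightly negative component lands just outside the edge opposite the corresponding corner, while a weight placed entirely on one member lands on that member's corner.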

Some weights are shown outside the bounds of the ternary plots, indicating negative weights. In theory, this would mean that the member in question is deliberately going against the decisions of the member towards whom they have a negative weight. All the negative weights visible in Fig. 3 are close to zero (i.e., close to an edge of the diagram), indicating that these are most likely statistical anomalies and that the weight is more properly interpreted as simply being close to zero. However, three weights have large enough negative components to not be visible in Fig. 3. One of these weights occurred in group H (green subject, noise condition) and two in group J (red and green subjects, both in the noise condition). The individuals who were given large negative weights (H-green and J-blue) both submitted very low confidence ratings in the relevant condition (average $|C|=0.21$) compared to the overall average confidence rating across all individuals (average $|C|=0.82$). When confidence ratings are low, the model is free to assign large weights without substantial consequences for the model prediction. Even these large negative weights are thus potentially explained as statistical anomalies resulting from an overrepresentation of low-confidence trials. Nevertheless, these groups were considered potential outlier groups, as these large negative weights might have a disproportionate impact on weight change statistics. While these groups were not omitted from analysis, care was taken to check any significant population-level effects with and without these groups included.

Weight change statistics

The four weight change summary statistics were derived from the estimated weights of each participant. The contrast between conditions was analyzed at the individual level (or pairs, for the similarity statistic) and at the population level, i.e., by pooling the outcomes of all participants together. The results are shown in Fig. 4. The uppermost panel shows the overall weight change, which inherently measures the distance between conditions. The three subsequent panels show the self-weighting, weight equality and weight similarity statistics. For these, the difference between the noise and quiet conditions is reported, with positive values indicating larger distances in noise. Note that this means that positive values indicate lower self-weighting, weight equality, and weight similarity in noise, respectively, as these statistics should be interpreted as being inversely related to the distance metrics they are based on (e.g., a lower distance to one’s own axis indicates higher self-weighting).

The distributions shown in each of the four panels show the result of permutation tests with N = 10,000 permuted samples of the confidence ratings. One test was performed for each of the thirty participants (or participant pairs, for the similarity statistic), which are identified in Fig. 4 by color and group labels A-J (corresponding to the labelling in Fig. 3). The distributions shown represent the expected distribution of each statistic for a given participant, under the null hypothesis that the condition had no effect on a given statistic. The far-right grey distribution, identified by the ‘pop’ label, shows the population-level result: the distribution of the median statistic across all participants under the null assumption that the condition did not impact participants’ weights. The markers that accompany each distribution show the observed statistic in the unpermuted data, with the style of the marker indicating the level of significance of the permutation test in question.
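
The population-level null distribution can be sketched from the individual-level permutation results. The sketch assumes that the i-th permuted statistic of every participant is combined into the i-th null median; this pairing, the function name, and the toy numbers are illustrative assumptions.

```python
import statistics


def population_permutation_p(observed_stats, permuted_stats, two_tailed=True):
    """Compare the observed median with medians of permuted statistics.

    observed_stats: one observed statistic per participant.
    permuted_stats: one list per participant, all of equal length, holding
                    that participant's permuted statistics.
    """
    observed_median = statistics.median(observed_stats)
    n_perm = len(permuted_stats[0])
    # One null median per permutation index, taken across participants.
    null_medians = [
        statistics.median([per_participant[i] for per_participant in permuted_stats])
        for i in range(n_perm)
    ]
    if two_tailed:
        n_extreme = sum(abs(m) >= abs(observed_median) for m in null_medians)
    else:
        n_extreme = sum(m >= observed_median for m in null_medians)
    return observed_median, n_extreme / n_perm


# Toy example: three participants, four permutations each.
observed_stats = [1.0, 2.0, 3.0]
permuted_stats = [
    [0.0, 0.5, 2.5, 0.1],
    [0.2, 0.4, 3.0, 0.3],
    [0.1, 0.6, 2.0, 0.2],
]
pop_median, pop_p = population_permutation_p(
    observed_stats, permuted_stats, two_tailed=False
)
```

In this toy example the null medians are 0.1, 0.5, 2.5 and 0.2, so one of the four reaches the observed median of 2.0.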

The overall weight change between conditions was found to be significant for 12 participants, belonging to six distinct groups. The population-level median change was $0.424$ rad (bootstrapped 95% CI using the percentile method: [0.351, 0.605]) and was strongly significant (none of the $N = 10{,}000$ permuted samples reached the observed value, i.e. $p < 0.0001$). This finding was robust to the omission of the potential outlier groups H and J ($p = 0.0064$, $N = 10{,}000$ when these groups were omitted).
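
A percentile-method bootstrap interval for a median can be sketched as follows. This is a generic illustration, assuming that resampling is done over participants with replacement; the function name and the example values are hypothetical and do not reproduce the study's data.

```python
import random
import statistics


def bootstrap_median_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = medians[round(alpha / 2 * n_boot)]
    upper = medians[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical per-participant statistics (in radians).
sample = [0.12, 0.25, 0.31, 0.38, 0.42, 0.45, 0.51, 0.60, 0.74, 0.88, 1.05]
ci_lower, ci_upper = bootstrap_median_ci(sample, n_boot=2_000)
```

The interval bounds are simply the 2.5th and 97.5th percentiles of the resampled medians, so the interval need not be symmetric around the observed median.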

Changes in self-weighting were significant for ten participants. However, the direction of change varied, with four participants showing higher self-weighting in noise, while six had higher self-weighting in quiet. The population-level median change was $-0.0486$ rad (95% CI: [-0.169, 0.0557]), but this difference was not significant ($p = 0.242$, $N = 10{,}000$).

Weight equality changed significantly between conditions for nine participants, with six showing increased equality in quiet and three showing increased equality in noise. The population-level median change was $0.0893$ rad (95% CI: [-0.0517, 0.161]), indicating a general tendency for higher equality in quiet ($p = 0.0123$, $N = 10{,}000$). However, this effect was not robust to the omission of the potential outlier groups H and J ($p = 0.417$, $N = 10{,}000$).

Weight similarity changed significantly between conditions for five participant pairs, with four showing increased similarity in quiet and one showing increased similarity in noise. The population-level median change was $0.0859$ rad (95% CI: [0.0159, 0.2303]), indicating a general tendency for higher similarity in quiet, but the effect was not significant ($p = 0.0815$, $N = 10{,}000$).

To better understand how the individual differences in weights manifested in each group, qualitative assessments were made by comparing the significant changes in the overall weight change shown in the first panel of Fig. 4 with the more detailed view of each group’s weights shown in Fig. 3. In groups A, B and C, no significant overall change was observed. In group D, the blue and red members showed a significant change in their weights. In Fig. 3-D, it appears that all three members move away from blue in the noise condition. In group E, the only significant change is observed in green, who shows a large increase in self-weighting in quiet. In group F, no significant overall changes were observed, but green significantly increased their self-weighting in noise. In group G, all members change their weight in the same direction, towards lower weighting of red in noise. In group H, all members increase their tendency towards weight equality. In group I, all members decrease their weight towards green and increase the weight towards blue, suggesting that the changes might be driven by green dropping out of the conversation and/or blue taking on a leading role. Finally, in group J, no clear tendency is observed. Blue increases their self-weighting in quiet, and red increases their weight equality in quiet, although the latter is probably mainly an effect of the anomalous negative weight observed for this participant in noise.

Main findings

While several previous studies have shown how noise impacts the conversational behavior of interlocutors^12,14,26,30, this study introduces a new way of framing communication success by analyzing group decision-making, specifically through the decision weights individuals use to integrate information from each other into their post-conversation decisions. Although the estimated weights varied widely across participants and groups, some general observations were still possible.

First, the model almost exclusively estimated positive weights, both in noise and in quiet conditions. This indicates that the task successfully elicited conversations that led to collaborative information exchange, as positive weights suggest that participants’ post-discussion decisions converged towards a compromise between their prior decisions.

Second, one of the proposed summary statistics, the overall weight change between conditions, was significantly larger than what would be expected by chance at the population level. The observed median difference between conditions was 0.424 radians, corresponding to approximately 27% of the maximal possible change of $\frac{\pi}{2}$ within the positive octant of the weight space. A change of $\frac{\pi}{2}$ corresponds to moving a weight from one cardinal axis to another, i.e. from one corner of the triangular plots shown in Fig. 3 to another. This magnitude of change is analogous to the change in weight that would occur if a participant were to completely abandon their own prior beliefs in favor of adopting someone else’s decisions entirely. Based on this analogy, one interpretation of the observed median overall difference of 0.424 radians could be that the presence of noise perturbs the decision-making process to a degree that causes around one in four decisions to change (when controlling for the confidence level, as the model does). Note that such a claim is not directly observable in the present framework, as the exact same questions could not be repeated in the two conditions. This interpretation thus only holds to the extent that the questions used in this study can be assumed to represent a common class of dilemmas for groups to resolve. Regardless, the observation that the median overall difference was larger than what would be expected by chance indicates that noise generally impacted the relative weighting scheme applied by participants. The effect was very clear for some groups and participants (e.g., groups G, H and I), while others did not seem to change their weights in response to the noise (e.g., groups A, B, and C). This suggests that even young normal-hearing subjects exhibit individual differences in noise susceptibility during collaborative information exchange and decision-making.
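
As a quick check of the proportion quoted above, using only the values given in the text:

$$\frac{0.424}{\pi/2} = \frac{0.848}{\pi} \approx 0.27$$

That is, the observed median change covers roughly 27% of the angle between two cardinal axes, which is the basis for the "around one in four" reading.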

Of the three remaining summary statistics, only weight equality was significantly impacted by the noise condition. However, this finding was not robust to the omission of the two groups with high negative weights (H and J), indicating no strong evidence for a general effect of noise on weight equality. It is noteworthy that all three statistics nonetheless changed in the direction predicted by the assumptions motivating their inclusion in this study: self-weighting was higher in noise, and weight equality and similarity were higher in quiet. This suggests there may be some merit to the underlying ideas motivating these statistics. The effects of noise on these outcomes might become clearer if the experimental design were modified to specifically target these individual statistics. For example, adjusting the instructions and/or feedback given to participants to emphasize the importance of making similar decisions could promote weight equality and similarity. Previous research has shown that changing instructions can impact group decision-making outcomes^31,32, and with more specific instructions, the contrast in weights between conditions may become clearer.

The four proposed summary statistics used in this study are specific instances of a broader class of potentially meaningful measures that can be derived from the information exchange weights. Similarly, the specific intervention considered here – acoustic background noise – is not the only factor that could impact decision weights. Factors such as room acoustics, hearing assistive devices, and communication medium would likely also affect the information integration and decision process and thus the estimated model weights. The weights and summary statistics proposed here could be used to gauge the impact of such interventions or environments on the ability to exchange information.

The methodology and experiment presented here demonstrate that information exchange can be quantified using a group decision-making task and a formal group decision model. However, it remains unclear what direction of changes can be expected when communication is made more difficult. Different groups may follow different weight change patterns in response to communicative interventions or inhibitions. It is also possible that different types of interventions could lead to different changes in decision-making behavior. Care should be taken to define a priori what changes are expected based on the intervention in question.

Model limitations

While the extended CWMV model presented here provides a starting point for model-based quantification of information exchange between individuals in a conversation, there are some notable limitations in its current form.

First, the model assumes that the combination of information from the three group members is unaffected by the agreement status of the group. A trial where the group starts by agreeing is assumed to use the same weights as a trial where there is initial disagreement. Although the disagreement itself is implicitly coded by the sign of the confidence ratings, it is likely that a majority vote has some benefit beyond the mere summing of confidence ratings^32. For example, two agreeing members, who are each half as confident as the third disagreeing member, may still generally obtain an advantage by reinforcing each other’s positions. This effect is not captured by the model, which assumes that posterior confidences are not directly affected by whether the prior confidences used for prediction belong to a majority or a minority vote.
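
The example can be made concrete with a small worked case. Assume, purely for illustration, that the prediction depends on a weighted sum of the signed prior confidences and that all three members receive the same weight $k$; the confidence values below are hypothetical. With two agreeing members at $+0.5$ and one dissenting member at $-1.0$:

$$k\,(+0.5) + k\,(+0.5) + k\,(-1.0) = 0$$

Under these assumptions the two agreeing members exactly cancel the dissenting member, so no majority advantage appears in the combined value.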

Second, the model does not take into account the time dynamics of the conversation, which are likely to influence how information is exchanged. For example, there may be an additionally persuasive effect of being the first group member to speak about a particular question. If the first speaker's arguments are very compelling, remaining speakers may refrain from voicing their disagreement, thinking it unnecessary. In such a scenario, the first speaker would not have access to the other members' prior information when making the posterior decision, while the remaining group members would have access to both their own and the first speaker's information. Adapting the model to account for such effects would require some form of input related to the timing of individual utterances.

A third limitation of the model is that it does not account for individual biases in using the confidence scale. While it seems fair to assume that a participant’s submitted confidence rating is proportional to their expressed confidence (i.e., the conviction with which they present their beliefs during the following conversation), this relationship is likely to vary between individuals. In its current form, the model combines the submitted confidences under the assumption that they consistently reflect the expressed confidences. If two members have the same expressed confidence, but one uses the scale more conservatively, the model attempts to account for this by increasing weights towards the conservative member. Thus, estimated weights are systematically skewed towards members whose expressed confidence is stronger than their submitted confidence. The estimated weights therefore reflect both the degree of information exchange between group members and individual group members' relative bias in using the confidence scale. While the current approach cannot account for individual scale use bias, we posit that such biases are largely unaffected by the noise condition, as there is evidence that the expression of confidence is consistent across a wide variety of cognitive tasks^33. This suggests that, despite the impact of individual scale use on the estimated weights, differences in weights observed between conditions can still be attributed to the information exchange process being affected by the noise.
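
To make the skew concrete, suppose (as a simplifying assumption, not a result of the study) that member $b$'s submitted confidence is a scaled-down version of their expressed confidence, $C_b^{\text{sub}} = s\,C_b^{\text{expr}}$ with $0 < s < 1$, and that $b$'s contribution to another member's posterior is $k_b\,C_b^{\text{expr}}$. Then

$$k_b\,C_b^{\text{expr}} = \frac{k_b}{s}\,C_b^{\text{sub}}$$

so a model fitted to the submitted ratings would recover the weight $k_b/s > k_b$. With $s = 0.5$, for instance, the estimated weight would be twice the underlying one.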

Experimental limitations

There are also experimental aspects of the current task implementation that may contribute to unexplained variance in the weight estimates. One example is the memory capacity required to solve the task. Since the 28 questions in each list are discussed simultaneously, participants need to keep all these questions and the exchanged information in mind while making their post-discussion decisions. There is likely some information loss between the conversation and the posterior decision, especially related to new information obtained from other members. This could result in participants regressing back towards their own prior beliefs if they forget what they learned in the discussion round.

Discussing a long list of related items also has another drawback: the transitivity of the relationship between items. For example, in a list containing items A, B, and C, if one group member knows that A > B and another knows that B > C, they will be able to infer A > C deductively by combining their individually held information, even if both were previously agnostic about the relationship between A and C. This would result in high posterior confidence even though the prior confidences were zero, which the current model cannot account for.

Using different elicitation stimuli than general knowledge questions might also increase model precision. Using general knowledge questions to elicit confidence ratings risks creating a situation where one or more participants have little or nothing to contribute. Using perceptual stimuli or some kind of preconditioning of participants may be a more robust approach^25,34. Nevertheless, there are certain benefits of using general knowledge questions: they are straightforward to implement, require no preconditioning of participants to elicit a confidence rating, and generally seemed to produce lively and engaged discussions in the groups observed in this study.

Conclusion

In conclusion, the methodology presented in this study provides a promising foundation for developing a framework to assess communication success. The benefits of successful group decision-making are not only instrumental, facilitating better joint decisions, but also intrinsic, as group interactions are an integral part of a healthy social life. Quantifying an individual’s capacity to participate in group decision-making may offer a meaningful way to evaluate the quality of a communication scenario.

Author Contribution

I.Ö. wrote the main manuscript text and prepared figures 1-4. A.A., T.M., and T.D. jointly supervised the work. All authors reviewed the manuscript.

Acknowledgement

We would like to acknowledge Valeska Slomianka for assisting with data collection, Alejandro Saurí Suares for designing the task interface used in the experiment, and Riccardo Fusaroli for his valuable input to an early version of the manuscript.

Data Availability

The confidence and decision data gathered from participants in this study is publicly available from DTU Data at DOI 10.11583/DTU.25163816.

References

1. Haile, L. M. et al. Hearing loss prevalence and years lived with disability, 1990–2019: findings from the Global Burden of Disease Study 2019. The Lancet 397, 996–1009 (2021).
2. Nicoras, R., Gotowiec, S., Hadley, L. V., Smeds, K. & Naylor, G. Conversation success in one-to-one and group conversation: a group concept mapping study of adults with normal and impaired hearing. International Journal of Audiology 62, 868–876 (2023).
3. Kiessling, J. et al. Candidature for and delivery of audiological services: Special needs of older people. International Journal of Audiology 42 Suppl 2, 2S92-101 (2003).
4. Holman, J. A., Drummond, A., Hughes, S. E. & Naylor, G. Hearing impairment and daily-life fatigue: a qualitative study. International Journal of Audiology 58, 408–416 (2019).
5. Carlile, S. & Keidser, G. Conversational Interaction Is the Brain in Action: Implications for the Evaluation of Hearing and Hearing Interventions. Ear & Hearing 41, 56S-67S (2020).
6. Berlo, D. K. The Process of Communication: An Introduction to Theory and Practice. (Holt, Rinehart and Winston, Inc, New York Chicago San Francisco Atlanta Dallas Montreal Toronto London Sydney, 1960).
7. Barnlund, D. C. A Transactional Model of Communication. in Language Behavior 43–61 (De Gruyter, 1970).
8. Schober, M. F. & Clark, H. H. Understanding by addressees and overhearers. Cognitive Psychology 21, 211–232 (1989).
9. Bavelas, J. B., Coates, L. & Johnson, T. Listeners as co-narrators. Journal of Personality and Social Psychology 79, 941–952 (2000).
10. Fusaroli, R. & Tylén, K. Investigating Conversational Dynamics: Interactive Alignment, Interpersonal Synergy, and Collective Task Performance. Cognitive Science 40, 145–171 (2016).
11. Garrod, S. & Pickering, M. J. Joint Action, Interactive Alignment, and Dialog. Topics in Cognitive Science 1, 292–304 (2009).
12. Miles, K. et al. Behavioral dynamics of conversation, (mis)communication and coordination in noisy environments. Scientific Reports 13, 20271 (2023).
13. Dohen, M. & Roustan, B. Co-production of speech and pointing gestures in clear and perturbed interactive tasks: multimodal designation strategies. in Interspeech 2017 – 18th Annual Conference of the International Speech Communication Association (Stockholm, Sweden, 2017).
14. Hadley, L. V., Brimijoin, W. O. & Whitmer, W. M. Speech, movement, and gaze behaviours during dyadic conversation in noise. Scientific Reports 9, 10451 (2019).
15. Hadley, L. V., Whitmer, W. M., Brimijoin, W. O. & Naylor, G. Conversation in small groups: Speaking and listening strategies depend on the complexities of the environment and group. Psychonomic Bulletin & Review 28, 632–640 (2021).
16. Watson, S., Sørensen, A. J. M. & MacDonald, E. N. The effect of conversational task on turn taking in dialogue. Proceedings of the International Symposium on Auditory and Audiological Research (Proc. ISAAR) 7, (2019).
17. O’Connell, D. C., Kowal, S. & Kaltenbacher, E. Turn-taking: A critical analysis of the research tradition. Journal of Psycholinguistic Research 19, 345–373 (1990).
18. Fay, N., Garrod, S. & Carletta, J. Group Discussion as Interactive Dialogue or as Serial Monologue: The Influence of Group Size. Psychological Science 11, 481–486 (2000).
19. Keshmirian, A., Deroy, O. & Bahrami, B. Many heads are more utilitarian than one. Cognition 220, 104965 (2022).
20. Mahmoodi, A. et al. Equality bias impairs collective decision-making across cultures. Proceedings of the National Academy of Sciences 112, 3835–3840 (2015).
21. Bang, D. et al. Does interaction matter? Testing whether a confidence heuristic can replace interaction in collective decision-making. Consciousness and Cognition 26, 13–23 (2014).
22. Bahrami, B. et al. Optimally Interacting Minds. Science 329, 1081–1085 (2010).
23. Meyen, S., Sigg, D. M. B., Luxburg, U. von & Franz, V. H. Group decisions based on confidence weighted majority voting. Cogn. Research 6, 18 (2021).
24. Koriat, A. When Are Two Heads Better than One and Why? Science 336, 360–362 (2012).
25. Dideriksen, C., Christiansen, M. H., Tylén, K., Dingemanse, M. & Fusaroli, R. Quantifying the Interplay of Conversational Devices in Building Mutual Understanding.
26. Örnólfsson, I., May, T., Ahrens, A. & Dau, T. How noise impacts decision-making in triadic conversations. in Proceedings of the 10th Convention of the European Acoustics Association Forum Acusticum 2023 429–432 (European Acoustics Association, Turin, Italy, 2024). doi:10.61782/fa.2023.0720.
27. Grofman, B., Owen, G. & Feld, S. L. Thirteen theorems in search of the truth. Theory and Decision 15, 261–278 (1983).
28. Ahrens, A. & Lund, K. D. Auditory spatial analysis in reverberant multi-talker environments with congruent and incongruent audio-visual room information. The Journal of the Acoustical Society of America 152, 1586–1594 (2022).
29. Marshall, J. A. R., Brown, G. & Radford, A. N. Individual Confidence-Weighting and Group Decision-Making. Trends in Ecology & Evolution 32, 636–645 (2017).
30. Sørensen, A. J. M. & Fereczkowski, M. Effects of noise and L2 on the timing of turn taking in conversation. Proceedings of the International Symposium on Auditory and Audiological Research (Proc. ISAAR) 7, (2019).
31. Hall, J. & Watson, W. H. The Effects of a Normative Intervention on Group Decision-Making Performance. Human Relations 23, 299–317 (1970).
32. Schulz-Hardt, S. & Mojzisch, A. How to achieve synergy in group decision making: Lessons to be learned from the hidden profile paradigm. European Review of Social Psychology 23, 305–343 (2012).
33. Pallier, G. et al. The Role of Individual Differences in the Accuracy of Confidence Judgments. The Journal of General Psychology 129, 257–299 (2002).
34. Bang, D. et al. Confidence matching in group decision-making. Nature Human Behaviour 1, 0117 (2017).

No competing interests reported.

SupplementaryMaterials.docx

Reviewers invited by journal
18 Oct, 2024
Editor assigned by journal
15 Oct, 2024
Editor invited by journal
20 Aug, 2024
Submission checks completed at journal
17 Aug, 2024
First submitted to journal
06 Aug, 2024

Status:

Version 1
