Background: Gurage refers both to pockets of South Ethiosemitic language speakers living in the Gurage Zone of the Southern Nations, Nationalities and Peoples’ Regional State, and to the set of languages and dialects spoken in the Gurage Zone, the Silt’e Zone, and the Oromo region. The Gurage language clusters are grouped into two major branches: Gunnӓn-Gurage (Hetzron, 1977), which includes North Gurage and West Gurage, and East Gurage, which we may call Dumi-Gurage in parallel with Gunnӓn-Gurage. The words Gunnӓn and Dumi etymologically refer to ‘hair’ or ‘head’ in Gunnӓn-Gurage and East Gurage, respectively. However, Zay has two different terms: dabasa for ‘hair’ and oxәt for ‘head’.
The Gurage languages are enclosed by the Cushitic language Afan Oromo in the North, North East, and South East, Libido (also called Mareqo) in the East, K’abeena in the North West, Hadiyya in the South West, Alaba in the South, and the Omotic language Yemsa in the West. Fellman (1993, p. 673) metaphorically described the location of Gurage as “a tiny Semitic island floating in a vast Cushitic sea”. Zay is completely enclosed by Afan Oromo, as it is located in the Oromo Regional State, specifically on the Island of Lake Zway. Figure 1 shows the Gurage languages and the surrounding languages: the Omotic language Yemsa and the Cushitic languages Oromo, Hadiyya, and Alaba.
Other Gurage languages that should be mentioned are Galila, which was spoken around Lake Wenchi in the Ambo area of the Oromo region, and two extinct languages: +Gafat, a North Gurage language (cf. Faber, 1997, p. 6), and +Mesmes, a Peripheral West Gurage language.
Gurage as a linguistic group has been complex to classify. The linguistic features used to characterize each of the sub-groups of the Gurage languages were diverse and sometimes incorrect (Demeke, 2001). The linguistic complexity has been attributed to several factors. One factor is the origin of the speakers, who came into the Gurage land from different sources and at different times, bringing languages that in due time intermingled with the languages already spoken in the area (Menuta, 2015, p. 11). The second factor is language contact with the neighboring Omotic language Yemsa and the Cushitic languages of Oromo and Sidamo (Leslau, 1952), such as Hadiyya, Alaba, Qabena, and Mareqo. Amharic, the national official language of Ethiopia, and Classical Arabic, the language of Islam, were also in contact with the Gurage languages. Furthermore, there is contact among the Gurage languages themselves (Nurga, 2021). In addition to the linguistic complexity, the methodology used by scholars who worked on the classification of Gurage languages posed a challenge to accuracy and consistency. Some scholars based their classification on lexical items (Hudson, 2013; Bender, 1971, 1966), while others used shared innovations and morphological affixes (Demeke, 2001; Hetzron, 1972; Leslau, 1969). Another source of inaccuracy in the classification of Gurage languages was the lack of data on each of the language varieties; hence, some language varieties were not sampled in the investigations.
Objective: The objective of this study is to revisit the classification of Gurage languages and to characterize each sub-group with linguistic features. It also attempts to correlate linguistic distance with geographical distance, since language variation is influenced by areal distance in addition to other variables such as language contact. Finally, it compares the findings with pre-existing classifications.
Significance: The study will help to document the existing works, thereby acknowledging the scholars who contributed to our knowledge of Gurage language classification, show the existing mismatches, and provide an alternative classification based on phonetic distance and language-area-specific morphemes. It can serve as a base for future research and refinements.
Previous studies: As there are several studies on the description of Gurage languages, we focus on literature dealing with intelligibility tests that to some extent show language grouping and language distance, and on literature directly concerned with language classification. The major works on intelligibility include Gutt (1980), Ahland (2010), and Menuta (2015).
Ahland (2010) compares Kistane, Chaha, and Silt’e, representing North Gurage, West Gurage, and East Gurage, respectively. He reports that intelligibility among the varieties is low. In contrast, Hetzron (1997, p. 536) reports that communication among Gunnӓn Gurage speakers is easy. Ahland (2010) compared eleven Gunnӓn Gurage languages using an intelligibility test. He used the test result as a means of classification and grouped the languages into four: the Kistane-group (Kistane and Dobbi); Mesqan; the Sebatbet-group (Chaha, Gura, Gumer, Ezha, Geto, and Muher with sub-groups Aklil and Dessa); and the Inor-group (Inor, Ener, Endegagn, and Mesmes). The problem with Ahland’s (2010) grouping is the use of intelligibility tests for language classification. Intelligibility is affected by inter-lingual comprehension, frequency of contact among speakers, and intergroup relationships. What is more, the author’s test results were not strictly bi-directional, because not every group of speakers tested took the test for every language variety compared. The terminology used for Sebatbet also seems misleading, because Sebatbet includes the Inor-group linguistically as well as sociologically.
Menuta (2014, 2015) compares inherent intelligibility, using lexical items, phonological features, and morphological affixes, across six purposefully sampled Gurage language varieties: Kistane (North Gurage); Chaha (Central West Gurage); Inor (Peripheral West Gurage); Mesqan, which is grouped genetically as West Gurage but lies, together with Dobbi, in the eastern part of Gurage geographically; Muher, which is controversially grouped as West Gurage by some authors (Demeke, 2001; Leslau, 1969) and as North Gurage by others (Hetzron, 1972); and Wolane, representing the East Gurage languages. Menuta (2015) also conducts intelligibility tests among the languages. As the purpose of that study was to cluster the languages based on the ease or difficulty of inter-group communication, it did not attempt to provide their classification.
Studies pertinent to the classification of Gurage languages are those dealing with the classification of Semitic languages in general and Ethiosemitic languages in particular. When it comes to the classification of Ethiosemitic languages, there are two competing views. The first view, the traditional theory of the origin of Ethiosemitic, assumes that Ethiosemitic languages emerged from South Arabia (Tadesse, 2009, p. 11; Hetzron, 1972, p. 17; Ullendorff, 1960, p. 34). The second view, also called the Africanist or alternative theory, assumes that the Ethiosemitic languages were in Ethiopia together with other Afro-Asiatic languages, namely the Cushitic and Omotic languages (Tadesse, 2009, p. 11; Murtonen, 1967, p. 74; Hudson, 2000, pp. 78-79; 2002, p. 1770). The proponents of the latter view base their argument on the principle of the wave theory, which assumes that, just as waves travel from high-pressure to low-pressure areas, languages spread from high-diversity to low-diversity areas. According to this claim, we find more diverse language varieties in the Ethiosemitic branches than in the other Semitic-speaking areas. However, no substantial linguistic evidence has been provided as to which of the Ethiosemitic languages is the proto-language and how the other Ethiosemitic languages and/or other Semitic languages branched out from it. The language mentioned as a proto-form for the Ethiosemitic languages from the Africanists’ perspective is Ge’ez. Fellman (1996, p. 205), for instance, states that all Ethiosemitic languages “… are descendants of a Proto Ethiopic most closely resembling Ge'ez”. Thus, he groups the Ethiosemitic languages by treating Ge’ez as a proto-form that branches into Northern Ethiopic (Tigre and Tigrinya) and Southern Ethiopic, which in turn branches into three: Western (Gafat and West Gurage), Central (Amharic and Argoba), and Eastern (East Gurage and Harari).
Fellman (1996) had no data of his own; he based his classification on reviews of Cohen (1931), Polotsky (1949), Leslau (1960), Goldenberg (1968), and Hetzron (1972). Thus, no empirical, data-driven, and well-established linguistic evidence has been provided for considering Ge’ez a proto-Ethiopic (cf. Feleke, 2021).
In the present study, we adhere to the traditional theory regarding the origin of Gurage languages and focus on linguistic distance and its correlation with geographical distance. The most notable works on the classification of Ethiosemitic languages based on the traditional theory include Bender (1966, 1971), Leslau (1969), Hetzron (1972), and Goldenberg (1977), as well as more recent works (Demeke, 2001; Kitchen et al., 2009; Hudson, 2013; Feleke et al., 2020; Feleke, 2021). The most cited work on Ethiosemitic language classification is Hetzron (1972), shown in Figure 2. The other works reviewed are follow-up classifications that argued against Hetzron (1972) and provided alternative proposals.
Hetzron’s (1972) classification shows the Gurage languages emerging along two routes: the Eastern branch of Transversal South Ethiopic, which yields East Gurage (the Dumi-group), and Outer South Ethiopic, which yields the Gunnӓn-Gurage group. From the perspective of the traditional theory, the origin of the Gurage languages from these two directions, probably at different times, is relatively uncontested; however, their internal classification has been controversial, ever-changing, and not yet settled. Studies that attempted to amend the works of Cohen (1931), Leslau (1969), and Hetzron (1972, 1977) are the following.
Fellman (1996) used literature from different authors and provided the alternative classification discussed above. Goldenberg (1977, 2013) discusses the problems in the internal classification of Semitic languages and the inter-language contacts that influenced the classification. Demeke (2001) regrouped the Gurage languages into an Eastern group (East Gurage, which consists of Wolane and Silt’e as sisters, plus Harari and Zay, two different languages that are sisters to the East Gurage group) and an Outer South Ethiopic group, which branches into the GMS group and the West Gurage group. The GMS group, whose name is supposed to be the initials of Gafat, Gogot, Mesqan, and Soddo, with G representing two language varieties (Gafat and Gogot), is divided into Gafat and North Gurage, which in turn branches into the AMCM (affirmative main clause marker) group (Gogot and Soddo) and Mesqan. West Gurage is divided into Central West Gurage (CWG) and Peripheral West Gurage (PWG), consistent with Hetzron (1972), but the languages assigned to each group differ slightly. The CWG consists of Muher and its sisters (Ezha, Chaha, Gumer, Gura, and Ennemor, the alternative name of Inor), and the PWG consists of Geyto, Endegeñ, Ener, and Mesmes. The other groupings being relatively similar, Demeke (2001) differs from Hetzron (1972) in the grouping of the following language varieties:
Table 1: Differences in findings of Hetzron (1972) and Demeke (2001)
| Languages | Hetzron, 1972 | Demeke, 2001 |
|---|---|---|
| Mesqan | West Gurage | North Gurage |
| Muher | North Gurage | West Gurage |
| Inor (Ennemor) | PWG | CWG |
The reasons Demeke (2001) provided for regrouping Mesqan with North Gurage were (1) that it has only two tenses, similar to Kistane and Dobbi, a fact also reported by Hetzron (1972), and (2) the presence of gemination in the perfective forms of verb patterns A, B, and C in Kistane and Mesqan; according to the author, this feature is lacking in the West Gurage languages. Gemination in the perfective verbs, however, is also a feature of some West Gurage languages such as Ezha, Muher, and partly Gumer. Muher is grouped with the West Gurage languages based on verb root patterns (Demeke, 2001, p. 79). Demeke (2001) did not provide any strong morphological evidence for grouping Inor (Ennemor) with Central West Gurage rather than Peripheral West Gurage.
The Kitchen et al. (2009) classification is not comprehensive and considers only 15 selected Ethiosemitic languages. These authors group Mesmes, a Peripheral West Gurage language, with Kistane (a North Gurage language); Zay and Wolane belong to the same group, but Silt’e is neither in the sample nor in the group. The other Gurage languages clustered together are Chaha, Gyeto, Mesqan, and Inor.
Hudson (2013) used lexical comparison and intelligibility tests to classify Ethiosemitic languages. Similar to Hetzron (1977), he groups Gurage languages into Gunnӓn Gurage (Kistane, Mesqan, Muher, Chaha, and Inor) and East Gurage (Silt’e and Zay). Several Gurage varieties are missing from his classification.
Feleke et al. (2020) compare some Gurage languages and South Ethiosemitic languages based on phonetic distance, lexical distance, functional distance, and perceptual distance. They grouped the Gurage languages into five: (1) Silt’e (Wolane and Zay are not included in the sample), (2) Kistane (Dobbi is not included in the sample), (3) Inor and Endegagn (Ener and Gyeto are not included in the sample), (4) Gumer, Gura, Ezha, and Chaha, and (5) Mesqan and Muher. Though this study contributes to grouping the Gurage languages with their closer relatives, it does not cover all the language varieties. Because multiple criteria are used, the authors have to draw conclusions from both the intelligibility tests (functional and probably perceptual distance) and the phonetic and lexical comparisons (structural distance). It is not clear how the authors reconciled the differences. Kistane, for instance, appears as a sister to Mesqan and Muher (p. 16a), as a sister to Inor, Endegagn, Mesqan, Muher, Gumer, Gura, Ezha, and Chaha (p. 16b), as a sister to Mesqan (p. 16c), and as a sister to Silt’e (p. 16d).
A more recent study of the classification of Ethiosemitic is Feleke (2021), which is based on phonetic and lexical distance computation. Its findings are broadly similar to Hetzron (1972), with some differences. It places Silt’e, Wolane, and Zay in one group, and all the Gunnӓn-Gurage languages in another. Within the latter group, Inor, Endegagn, and Gyeto are sisters to the North Gurage languages (Kistane and Dobbi) and the West Gurage languages, namely Muher, Mesqan, Gumer, Gura, Ezha, and Chaha (Feleke, 2021, p. 8).
The repeated back-and-forth in Gurage language classification, caused by differences in method, shortage of data on each language variety, and advances in computing language variation, necessitates further study. The present study provides data for phonetic and morphological comparisons while building on the previous works. This approach will help provide a more comprehensive and reliable classification of Gurage varieties by offering linguistic features that characterize each sub-grouping.
Methodology: In scope, the study includes all the Gurage language varieties except the extinct +Gafat and +Mesmes, and Galila, for which we could not access linguistic data. We used 255 lexical items (Appendix A) to compare the phonetic distance of the language varieties. We had four informants, two males and two females, at each language variety site; hence, a total of 60 informants. The data were collected over three years (2019-2022).
Data were analyzed using a mixed-methods approach. In the quantitative method, string variables (lexical items) were converted into numeric variables and then computed with Cog 1.3.6.1 and Gabmap. In Cog 1.3.6.1, phonetic distance is measured by aligning the words compared and using a threshold assigned to sounds based on their sonority value. The sonority scale values used as thresholds are given in Table 2:
Table 2: Sonority Scale for vowels and consonants
| Name | Type | Description | Sonority |
|---|---|---|---|
| Pre-nasal | Segment | m, n, ŋ | 1 |
| Stop | Consonant | Manner: stop, nasal | 2 |
| Affricate | Consonant | Manner: affricate | 3 |
| Fricative | Consonant | Manner: fricative, lateral | 4 |
| Nasal | Consonant | Nasal: + | 5 |
| Trill | Consonant | Manner: trill | 6 |
| Lateral | Consonant | Lateral: + | 7 |
| Flap | Consonant | Manner: flap | 8 |
| Glide | Segment | j, ɥ, ɰ, w | 9 |
| Non-syllabic vowel | Vowel | [Syllabic: -] | 9 |
| Close vowel | Vowel | [Syllabic: +, Height: close] | 10 |
| Near-close vowel | Vowel | [Syllabic: +, Height: near-close] | 11 |
| Close-mid vowel | Vowel | [Syllabic: +, Height: close-mid] | 12 |
| Mid vowel | Vowel | [Syllabic: +, Height: mid] | 13 |
| Open-mid vowel | Vowel | [Syllabic: +, Height: open-mid] | 14 |
| Near-open vowel | Vowel | [Syllabic: +, Height: near-open] | 15 |
| Open vowel | Vowel | [Syllabic: +, Height: open] | 16 |
The system aligns the word pairs and/or multi-word sets and assigns them the values given above for computation. The aligned words are also grouped as cognates and non-cognates, as in the alignments for ‘house’ below:
In this alignment, we have three groups: Cognate-1 with bet and bid ‘house’; Cognate-2 with gar ‘house’; and the non-cognates bejt and ge ‘house’. Were it not for the software application used, one might align bejt ‘house’ with Cognate-1 and ge with Cognate-2. It seems that the application treated /jt/ of bejt as a single grapheme, similar to an affricate; otherwise, ge ‘house’ would have been represented as ge - - rather than ge-. Such groupings may influence the output concerning the cognate groups and the statistics, as affricates and non-affricates have different sonority values.
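The effect of segmentation on alignment can be illustrated with a minimal sketch. It assumes plain unit-cost Levenshtein alignment as a simplified stand-in, which is not claimed to be the algorithm or the sonority-based scoring that Cog applies; the function names `align` and `normalized_distance` are hypothetical. Words are passed as lists of segments so that the two possible segmentations of bejt can be compared.

```python
def align(a, b, gap="-"):
    """Globally align two segment lists with unit-cost edits.

    Returns (aligned_a, aligned_b, number_of_edits).
    Assumption: plain Levenshtein costs, not Cog's own scoring.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimum number of edits turning a[:i] into b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            top.append(a[i - 1])
            bottom.append(gap)
            i -= 1
        else:
            top.append(gap)
            bottom.append(b[j - 1])
            j -= 1
    top.reverse()
    bottom.reverse()
    return top, bottom, cost[n][m]


def normalized_distance(a, b):
    """Edits divided by alignment length (0.0 for two empty words)."""
    top, _, edits = align(a, b)
    return edits / len(top) if top else 0.0


if __name__ == "__main__":
    # bejt segmented as four segments versus three (with "jt" as one unit).
    for other in (["b", "e", "j", "t"], ["b", "e", "jt"]):
        top, bottom, _ = align(["b", "e", "t"], other)
        print(" ".join(top), "|", " ".join(bottom),
              "|", round(normalized_distance(["b", "e", "t"], other), 3))
```

Under these assumptions, the same word pair receives a different alignment and a different distance depending on whether jt is treated as one segment or two, which is the kind of sensitivity described above.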
Cog 1.3.6.1 provides the results of the computation in several forms, including a matrix of the percentage of shared words and dendrograms. Two approaches are used to analyze linguistic distance with dendrograms: the unweighted pair group method with arithmetic mean (UPGMA) and neighbor-joining (NJ). The two approaches differ in that UPGMA models the evolution of languages by average linkage, whereas NJ assumes minimum evolution. They also differ in the phylogenetic trees they produce: UPGMA yields rooted trees whose tips are equidistant from the root, owing to the assumption of equal rates of evolution, whereas NJ yields unrooted trees whose branch lengths vary in proportion to the amount of change in each language. Cog 1.3.6.1 also provides network graphs. Although we focus on minimum evolutionary change, and hence on a synchronic description, we provide both the UPGMA and NJ trees in the present study.
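How average linkage produces a rooted tree with equidistant tips can be shown with a short, self-contained UPGMA sketch. It is an illustration only, not the implementation used in Cog; the function name `upgma`, the variety labels A to D, and the distance values are all hypothetical placeholders rather than results of this study.

```python
from itertools import combinations


def upgma(names, pairwise):
    """Cluster with UPGMA (average linkage).

    names: list of variety labels.
    pairwise: dict mapping frozenset({a, b}) to a distance.
    Returns (tree, merges): tree is a nested tuple; merges lists
    (subtree, height) in merge order, where height is half the
    average distance between the two clusters joined.
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (tree, size)
    d = {frozenset((i, j)): pairwise[frozenset((names[i], names[j]))]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=lambda p: d[p])
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        merges.append(((tree_i, tree_j), d[pair] / 2))
        # Distance to the new cluster is the size-weighted mean, which
        # equals the arithmetic mean over all member pairs.
        for k in clusters:
            d[frozenset((next_id, k))] = (
                n_i * d[frozenset((i, k))] + n_j * d[frozenset((j, k))]
            ) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


if __name__ == "__main__":
    # Hypothetical distances (in percent) between four placeholder varieties.
    toy = {frozenset(p): v for p, v in {
        ("A", "B"): 10, ("C", "D"): 20,
        ("A", "C"): 60, ("A", "D"): 70,
        ("B", "C"): 62, ("B", "D"): 66,
    }.items()}
    tree, merges = upgma(["A", "B", "C", "D"], toy)
    print(tree)
    for subtree, height in merges:
        print(subtree, height)
```

Because every merge is placed at half the average inter-cluster distance, all tips end up at the same distance from the root. NJ drops this equal-rate assumption, which is why its branch lengths can differ from tip to tip.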
Gabmap, like Cog 1.3.6.1, provides phonetic distance. However, Gabmap has additional features, as it compares the linguistic distance and the geographical distance of the languages compared. It also provides output in the form of plots, bar graphs, and regression lines showing the fit between linguistic and geographical distances. In Gabmap, dendrograms are produced with more advanced statistics, using fuzzy clustering rather than plain cluster analysis. Fuzzy clustering makes otherwise unstable clustering output more stable (cf. §2.1.3). We used both Cog 1.3.6.1 and Gabmap, as they support one another.
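The comparison of linguistic and geographical distance can be sketched as follows. This is a minimal illustration under stated assumptions: it uses great-circle (haversine) distance and a simple linear least-squares fit, which is not necessarily the model Gabmap fits; the function names, site labels, coordinates, and linguistic distances are hypothetical placeholders, not data from this study.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))  # 6371 km: mean Earth radius


def linear_fit(xs, ys):
    """Least-squares slope, intercept, and Pearson r for paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical sites (lat, lon) and hypothetical pairwise linguistic distances.
    sites = {"A": (8.00, 38.00), "B": (8.10, 38.20), "C": (7.80, 38.40)}
    ling = {("A", "B"): 0.12, ("A", "C"): 0.35, ("B", "C"): 0.30}
    geo = [haversine_km(sites[a], sites[b]) for a, b in ling]
    slope, intercept, r = linear_fit(geo, list(ling.values()))
    print(f"slope={slope:.4f} per km, intercept={intercept:.3f}, r={r:.3f}")
```

Pairwise distances are not independent observations, so the significance of such a correlation is usually assessed with a permutation procedure such as the Mantel test rather than with an ordinary test on r.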
We used qualitative data, the morphological affixes, to substantiate the results obtained from phonetic similarities. We focused on the affixes that characterize the different language clusters rather than on the affixes that are equally shared across the language varieties.