A Multimodal Corpus-based Study on Co-speech Metaphorical Gestures in Political Speeches

With the assistance of the multimedia annotation software ELAN, this study adopts a corpus-based approach to explore the inner mechanisms and characteristics of multimodal meaning construction in co-speech metaphorical gestures in political speeches. The findings show that three patterns can be identified in the multimodal meaning-making process: “Metonymy within metaphor”, “Metonymy-prominent” and “A-ING IS B-ING”. Metonymy is crucial in multimodal meaning construction: it forms a similarity relation between the source domain and the target domain in the first pattern, fulfils its referential function in the second pattern, and motivates the metaphorical mapping process level upon level in the third pattern. Verbs, verbal phrases and nouns are the lexical categories that most frequently accompany metaphorical gestures, which reveals the importance of metaphorical gestures in highlighting the language focus.


Introduction
Metaphor is not just an ornament of language but a fundamental means by which people conceptualize the world. Since the publication of Metaphors We Live By by Lakoff and Johnson in 1980, studies of the working mechanism of metaphor in language, carried out in the light of cognitive linguistics and conceptual metaphor theory, have been very fruitful. Following from conceptual metaphor theory, which holds that human conceptual structure derives from the sensorimotor system [1], gestures are thought to manifest metaphorical thinking just as speech does [2][3][4]. Evidence has shown that co-speech gesture is an inherent part of the language system, which means that gestures also work as signs that convey thoughts. In the 21st century, as the rapid development of science and technology moves human communication into a multimodal era, research on metaphor has begun to shift from an exclusive focus on verbal text to discourses in which language is but one communicative mode among others, such as pictures, sound and smell. This shift led to Forceville's proposal of the concept of pictorial/multimodal metaphor, which expands the study of metaphor to a multimodal and cross-disciplinary scale. Previous studies show that current research on metaphor mainly focuses on four aspects: (1) the influence of genre in multimodal metaphor studies, including advertisements and cartoons; (2) its distinctive features; (3) the relation between metaphor and metonymy; (4) the characterization of metaphorical similarity.
In terms of metaphoric gesture studies, much has been achieved in concrete contexts, such as teaching and interviews, and in relating the study of metaphor and gesture to other disciplines, such as semiotics, neuroscience and psychology. Fewer studies, however, touch upon the cognitive mechanism of multimodal metaphor and its functions in multimodal meaning construction, let alone speech-gesture synchronization in political speeches in particular. As an important form of public speech, political speech serves to clarify viewpoints and advance claims about a country's internal affairs and diplomatic relations. The use of spoken language in political speech-gesture synchronization is inherently a process of multimodal communication, involving not only the oral production of sound and its aural reception, but also the production of various kinds of bodily motion in space, which the addressee perceives visually. In the multimodal discourse of political speeches, the verbal and gestural modalities interact dynamically to manifest the ongoing abstract thinking process. On this account, this paper uses the annotation tool ELAN 4.6.2 to explore the inner mechanisms and characteristics of multimodal meaning construction in political speeches and the functions of metaphorics in speech-gesture synchronization. It aims not only to enrich the study of multimodal metaphor, but also to provide implications for language teaching and learning in China with regard to improving students' multimodal metaphorical abilities.

Research Design
The data of the self-compiled corpus come from six political speeches delivered by American President Barack Obama: the inaugural addresses of 2008 and 2013 and four national addresses from 2010 to 2013. The speech videos, in which the verbal mode and the gestural mode are juxtaposed, constitute the multimodal corpus. The speeches were videotaped in a naturalistic setting with spontaneous gestures; their total length is 300 minutes, with 34,696 words and a total of 2,000 gestures.
In each political speech, the patterns of multimodal meaning construction in metaphorical gestures are identified first. The verbal language that co-occurs with the metaphorical gestures is then annotated, especially the target words of the gestures, and the pattern of the multimodal meaning construction process is marked for each gesture. The percentages of the patterns, as well as the frequencies of the lexical categories of the target words within these patterns, are obtained through quantitative statistics; the characteristics of the multimodal meaning-making process are then described on the basis of qualitative analysis.
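The co-occurrence step described above can be approximated programmatically. The following is a minimal sketch under stated assumptions, not the procedure used in this study: it assumes the annotations are stored in ELAN's XML-based .eaf format with time-aligned tiers, and the tier names `gesture` and `speech`, the sample document with its time values, and the helper names `read_tier` and `co_occurring` are all hypothetical. It pairs each gesture annotation with the speech annotations whose time spans overlap it.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written EAF-like document (hypothetical tiers and timings, for illustration only).
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1800"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="1200"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="1600"/>
    <TIME_SLOT TIME_SLOT_ID="ts5" TIME_VALUE="2500"/>
    <TIME_SLOT TIME_SLOT_ID="ts6" TIME_VALUE="3000"/>
  </TIME_ORDER>
  <TIER TIER_ID="gesture">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>metaphoric</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="speech">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>larger</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts5" TIME_SLOT_REF2="ts6">
        <ANNOTATION_VALUE>challenges</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tier(root, tier_id):
    """Return (start_ms, end_ms, value) tuples for one time-aligned tier."""
    # Map each time-slot ID to its value in milliseconds.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    spans = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            value = ann.findtext("ANNOTATION_VALUE", default="")
            spans.append((start, end, value))
    return spans


def co_occurring(gestures, words):
    """Pair each gesture with the words whose time span overlaps it."""
    pairs = []
    for g_start, g_end, g_label in gestures:
        overlapping = [w for w_start, w_end, w in words
                       if w_start < g_end and g_start < w_end]
        pairs.append((g_label, overlapping))
    return pairs


root = ET.fromstring(SAMPLE_EAF)
pairs = co_occurring(read_tier(root, "gesture"), read_tier(root, "speech"))
print(pairs)  # [('metaphoric', ['larger'])]
```

In this toy document only the word "larger" overlaps the gesture in time, so it is the only candidate target word returned; deciding whether an overlapping word is actually the metaphorical target remains a manual, qualitative judgement.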

The Distribution and Features of Gestures in Obama's Speeches
Four types of co-speech gestures classified by McNeill [5], namely metaphorics, deictics, beats and iconics, can be identified in the six selected speeches; their distribution is shown in Table 1. Beats are rhythmic movements that mark words or phrases regarded as important for the expression of meaning; they do not refer to any abstract concept. Deictics include concrete pointing and abstract pointing, according to whether the referents are present or not. Iconics depict only concrete mental images and do not involve cross-domain mappings. Although metaphorics are far less frequent than beats, their characteristics and functions are far more prominent than those of the other gesture types: metaphorical gestures can express abstract concepts by means of cross-domain mappings between a source domain and a target domain, and most often the source domain is depicted in gesture.

Metaphorical Gestures Plus Metaphorical Utterances
Metaphorical utterances here refer to utterances that involve cross-domain mappings in which both the source domain and the target domain are presented in the verbal mode. Of the 300 metaphorical gestures in total, 252 (84%) accompany metaphorical utterances. According to their characteristics, we further classify these metaphorical gestures into ontological metaphorics, orientational metaphorics and metonymics.
(1) Ontological Metaphorics
Our experiences of substances and physical objects provide a further foundation for comprehension and give rise to ontological metaphors: our experiences of physical objects, especially our own bodies, provide the basis for a wide variety of ways of viewing emotions, activities, events, ideas, etc. as physical substances or entities [1]. By this means, we can quantify them, refer to them, categorize them, group them and, finally, reason about them [1]. On a certain level, one can argue that any gesture made when there is no concrete referent in the given context is metaphoric by virtue of representing an ontological metaphor, which represents something abstract as concrete. Discourse-structuring gestures that highlight different parts of a logical argument can be seen as representing the speaker's mental spaces in the form of physical spaces [8]. Such a gesture can be seen as expressing an ontological metaphorical mapping of the form ABSTRACT AS CONCRETE, in which the source domain is no more specific than a DISTINCT SPACE.
Among the 300 examples of metaphorics accompanied by metaphorical utterances, there are 186 (62%) ontological metaphorics, which makes them the most frequent type of metaphorics. For example, in Obama's 2013 national address, when saying "making even bigger cuts to things like education and job training", he rotates his left arm and raises it higher and higher with his fist tightly clenched, which indicates an object that grows bigger and harder. This gestural movement reflects the speaker's inner metaphorical thinking process. In the speaker's mind, the abstract "cuts" are likened to an object that has size and shape. First, through a metonymic mapping, the speaker makes a gesture that resembles a hard object in shape and, through the gesticulation, shows the change in its size. Second, the similarity between the concrete object represented by the gesture (source domain) and the "cuts" (target domain) provides the foundation for the cross-domain metaphorical mapping. The inner cognitive mechanism of this ontological metaphoric can thus be inferred: the speaker views the abstract "cuts" as a concrete physical entity; through experience and understanding of physical entities, the source-domain gesture provides a basis for apprehending the abstract verbal concept, and the speaker describes it in material terms by referring to and quantifying the concrete object. Another example comes from Obama's 2010 national address. When saying "From the day I took office, I've been told that addressing our larger challenges is too ambitious", Obama extends his forearms with palms facing inward, forming a space between his two palms. The target word here is "larger", which describes the abstract noun "challenges". "Challenges" is viewed as an object and quantified as "larger", which can be seen clearly in the increasing physical distance between the two palms.
In this ontological metaphoric, the speaker uses a concrete object to refer to the abstract "challenges"; through quantification, the growing size of the object is embodied in the increasing space between his palms. In this case, President Obama uses two different modalities (verbal and gestural) to express the same conceptual metaphor: CHALLENGES-AS-OBJECTS.
(2) Orientational Metaphorics
In an orientational metaphor, the structure of an abstract concept is concretized in terms of our experience of spatial orientation, such as in-out, on-off, front-back, up-down, deep-shallow and central-peripheral. "Spatial orientation arises from the fact that we have bodies of the sort we have and that they function as they do in our physical environment. Orientational metaphors give a concept a spatial orientation, for example, HAPPY IS UP" [1]. Orientational metaphors have a foundation in our cultural and physical experience. As with ontological metaphors, we find that orientational metaphors exist not only in verbal language but also in speech-gesture synchronization, where the source domain and the target domain are embodied in two different modalities.
In all the 300 metaphorical gestures that are accompanied by metaphorical utterances, there are 102 orientational metaphorics, a frequency of 34%. For instance, in Obama's 2013 national address, when saying "And that's why we need to build new ladders of opportunity into the middle class for all who are willing to climb them", Obama moves his left arm, palm down, from a lower position to a higher one. The inner cognitive mechanism is as follows. According to people's social and physical experience, status is closely related to social power: the higher the status, the more social power or authority a person holds. The upward movement of the gesture yields the concrete concept of "climb"; through the metaphorical cross-domain mapping, we then obtain the abstract concept of rising in social status by grasping opportunities. The target concept here is the implicit "improve social status".
(3) Metonymics
Unlike metaphor, which is mainly a way of conceiving of one thing in terms of another and whose primary function is comprehension, metonymy has a primarily referential function; in other words, it allows us to use one entity to stand for another. According to the classification of Lakoff [1].
Based on the corpus data, we find that metonymy exists not only in verbal utterances but also in the mappings between the gestural mode and the verbal mode. Unlike the multimodal metaphorical mapping process, which involves two different semantic domains, the multimodal metonymic mapping process operates within a single semantic field; it, too, is grounded in our experience. Moreover, the grounding of metonymic concepts is generally more apparent than that of metaphoric concepts. In this paper, we refer to metonymic gestures as metonymics. A metonymic mapping involves the mental highlighting or activation of one (sub)domain over another, whereas a metaphorical mapping bridges the distance between entities experienced as belonging to two different domains; nevertheless, it is now widely held that the two tropes interact with each other or exist on a continuum [7]. We therefore treat metonymics as a subtype of metaphorics in this paper. In the self-compiled multimodal corpus, a total of 9 metonymics accompany metaphorical utterances. For example, in Obama's 2010 national address, when saying "I'm interested in protecting our economy", Obama extends both arms with his palms curved in an arc, as if defending something. The target word here is the abstract concept "protecting". In our experience of the world, to protect somebody or something we use our hands to keep them from getting hurt. Here, therefore, it is not a similarity between the source domain (gesture) and the target domain (verbal language) that builds the mapping, but the referential function of the gesture itself. Moreover, the source and the target belong to the same semantic field.

Metaphorical Gestures Plus Non-metaphorical Utterances
Metaphorics are gestures whose referents are abstract ideas and which involve cross-domain mappings from the gestural mode to the verbal mode. According to the conceptual metaphor theory of Lakoff and Johnson, metaphorics are not simple duplications of metaphorical utterances; the gestural mode can express metaphorical thought on its own, that is, the utterances that metaphorical gestures accompany are not always metaphorical [1]. In the corpus, a total of 48 metaphorical gestures (16%) accompany non-metaphorical utterances. We further divide these metaphorics into temporal metaphorics, ontological metaphorics and metonymics.
(1) Temporal Metaphorics
People's experience of the spatial domain provides a basis for their cognitive activities. Generally speaking, human cognition of both the physical and the non-physical world starts from the spatial domain, extends to the temporal domain, and then to other abstract cognitive domains. Of the 48 metaphorical gestures that accompany non-metaphorical utterances, 39 are temporal metaphorics, accounting for 82%. In Obama's 2010 national address, when saying "Tomorrow, I visit Tampa", he points his forefinger to the front, which means "tomorrow is ahead". Similarly, in his 2011 national address, when saying "the future is ours to win", Obama extends his left arm to the front with the palm vertically down, with the meaning "future is in front". Both examples embody the same temporal conceptual metaphor: "FUTURE-IS-IN-FRONT". Apart from this temporal conceptual metaphor, we also find another one in the self-compiled corpus: "the past is on the left and the future is on the right". In the 2013 national address, when saying "jeopardize the promise of a secure retirement for future generations", Obama stretches out his arm and points his forefinger to the right, meaning "the future is on the right". By contrast, when saying "we can't afford another so-called economic 'expansion' like the one from the last decade" in his 2010 national address, Obama moves his left arm from its normal position to the left side, meaning "the last decade is on the left". This kind of temporal conceptual metaphor stems from the influence of writing habits. The left side holds what has already been written, which indicates the past; the right side represents what has not yet been written, which indicates the future.

(2) Metonymics and Ontological Metaphorics
In total, two metonymic gestures accompany non-metaphorical utterances. In these two cases, the multimodal metonymic mapping process dominates the multimodal meaning construction. For instance, in Obama's 2013 inaugural address, when saying "we will act, not only to create new jobs", Obama extends his right palm; the target word here is the abstract word "create". In our experience of the physical world, when we create something concrete, we use our hands. In this utterance, what is to be created is the abstract "new jobs", so the source gesture has a referential function, and the gesture and its verbal target belong to the same semantic field. As for ontological metaphorics, in Obama's 2011 national address, when saying "We share common hopes and a common creed", he moves his right hand horizontally from left to right; the target word here is the abstract concept "common". "Hopes" are likened to physical objects, which can be referred to and quantified. Through the horizontal movement of the hand in the air, we obtain the concrete concept of "the same" or "equal" by a metonymic mapping; then, through the similarity relation between the source domain and the target domain, the abstract meaning of "common" is concretized in the multimodal mapping process.

Multimodal Meaning Construction in Speech-gesture Synchronization

The Patterns of Multimodal Meaning Making Process
(1) "Metonymy within Metaphor" Multimodal Meaning Making Process
In Obama's 2008 inaugural address, when saying "the long, rugged path towards prosperity and freedom", Obama extends his left arm with the palm down while making a transitional movement in the air, from which the audience can easily form the image of a concrete path. The target domain here is an abstract "path", which means "a hard and difficult process that leads people to prosperity and freedom". To understand this metaphorical gesture, people undergo a two-step thinking process. The first step is metonymic: through the gesticulation, we acquire the concept of a concrete road (source domain). The second step is metaphorical: people arrive at the abstract "path" (target domain) on the basis of the similarity relation between the source domain and the target domain. The target domain "path" is in the verbal mode, while the source domain is presented in gesture (visual mode); the two modes interact to complete the cross-modal mapping in this metaphorical gesture (see Figure 1). In this pattern, a metonymic mapping process is embedded within a (complex) multimodal metaphorical mapping process. Of the 250 metaphorical gestures, 183 (73%) belong to this first interactive pattern.
(2) "Metonymy-prominent" Multimodal Meaning Making Process
Since metonymy and metaphor exist on a continuum, metonymics are classified as one type of metaphorics. Among all the metaphorical gestures in the self-compiled corpus, the 11 (4%) metonymics all belong to the second pattern. In Obama's 2008 inaugural address, when saying "They fought and died", Obama extends his left arm with his fist tightly clenched. The target word here is the abstract verb "fought", which means that "the ancestors made arduous efforts in building this great nation". In people's experience of the physical world, in early times those who fought their enemies usually used a clenched fist as a weapon to hit or defeat them. Through the clenched fist, people therefore obtain the concrete concept of "fighting" or "combating" by a metonymic process. Here it is not a similarity between the source domain (gesture) and the target domain (verbal language) that builds the mapping, but the referential function of the gesture itself; moreover, the source and the target belong to the same semantic field. See Figure 2 for the general multimodal metonymic mapping process.
(3) "A-ING IS B-ING" Multimodal Meaning Making Process
Forceville and Urios-Aparisi state [6]: "It would be better to conceive of metaphor as A-ING IS B-ING, since metaphor is always metaphor in action". In the self-compiled multimodal corpus, we also find that the interaction between metaphor and metonymy is dynamic in nature: in the process of multimodal meaning construction, the metaphorical mapping process is promoted by the metonymic mapping process level upon level. In total, 52 (23%) metaphorical gestures belong to this pattern. In Obama's 2011 national address, when saying "will grow the economy", Obama uses a complex and dynamic gesture: he extends both arms with palms facing inward and moves them from the middle towards opposite sides, so that the physical distance between his palms increases. The gesticulation presents a dynamic process that embodies several levels of meaning; the direction of the gestural motion and the increasing distance between the two palms represent several levels of metonymy.
First, people obtain the concrete concept of a dynamic "grow" from the physical expansion of the space between the two palms through the metonymic process; then, through the similarity relation between the source domain and the target domain, the abstract concept of "grow" is obtained. Second, an abstract concept of the dynamic process of growth, namely that the "economy is becoming better", is obtained through the concrete outward movement of the two palms. Because the gesticulation is vivid and dynamic, the whole cross-modal mapping process is typically multidimensional, and people can easily grasp the meaning of the abstract concept "grow" from multiple streams of information. In this interactive pattern, the metaphorical process is pushed forward level by level on the basis of the metonymic process. Figure 3 shows the general cross-modal mapping process in this interactive pattern.

Characteristics of the Patterns in Multimodal Meaning Construction
To further probe the characteristics and inner cognitive mechanisms of the patterns in the multimodal meaning-making process, we carried out a statistical analysis of the distribution and characteristics of the three patterns involved in multimodal meaning construction in speech-gesture synchronization (see Table 3 and Table 4). The statistics clearly show that the first interactive pattern, "metonymy within metaphor", dominates the multimodal meaning construction process. According to the annotations of the 177 metaphorical gestures in this pattern, 78 (44%) have verbs or verbal phrases as their target words, with 50 verbs (64%) and 28 verbal phrases (36%). For example, in Obama's 2013 national address, when saying "Opens the doors of opportunity to every child across this great nation", Obama makes a gesture of pushing doors open; the target here is the abstract "opens the doors", which means "creating more opportunities". Through the first step, the metonymic mapping, people obtain the concrete concept of "opening the door"; then, through the similarity relation between the source-domain gesture and the target-domain verbal language, we obtain the meaning of the abstract target concept.
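The proportions reported above for the first pattern can be recomputed from the raw counts. The sketch below uses only the figures stated in this paragraph (177 gestures in the pattern, 50 verb targets and 28 verbal-phrase targets); the function name `share` and the rounding to whole percentages are assumptions made for illustration.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


pattern_total = 177   # gestures annotated in the "metonymy within metaphor" pattern
verbs = 50            # target words that are verbs
verbal_phrases = 28   # target words that are verbal phrases
verbal_targets = verbs + verbal_phrases

print(share(verbal_targets, pattern_total))   # 44: verbal targets within the pattern
print(share(verbs, verbal_targets))           # 64: verbs among the verbal targets
print(share(verbal_phrases, verbal_targets))  # 36: verbal phrases among the verbal targets
```

Note that the 64% and 36% figures are shares of the 78 verbal targets, not of the 177 gestures in the pattern.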
The second most frequent interactive pattern in the multimodal meaning construction process is the "A-ING IS B-ING" pattern. Of the 52 gestures in this pattern, 47 (89%) have target words that are verbs (57%) or verbal phrases (32%). For example, in Obama's 2008 inaugural address, when saying "those ideals still light the world", Obama uses a dynamic and complex gesture: he moves his right palm from a lower position to a higher one and repeats the movement several times. On the one hand, the upward direction of the gesture gives the audience the concrete concept of "light" through the first step, the metonymic process; then, through the second step, the metaphorical mapping based on the similarity between the source field and the target field, we obtain the abstract notion that "the ideals benefit people". On the other hand, through the same two-step mapping process, the concrete repetition of the gesture yields the abstract meaning that "those ideals have benefited people continuously since ancient times". In this interactive pattern, the concrete concepts obtained from the metonymic process are mapped onto the target domain through the metaphorical mapping process, so that the features of the abstract target ideas are captured from multiple angles.

The third pattern, with a distribution of 9 (4%) in multimodal meaning construction, is the "metonymy-predominant" multimodal meaning-making process. Among the 9 metonymics, 6 have verbs as their target words, such as the aforementioned gesture that accompanies the non-metaphorical utterance "they fought and died", whose target word is "fought"; the other 3 have nouns as their target words, including "builder" and "hit". Under this working mechanism, the source meaning and the target meaning lie within the same semantic field, and it is the referential function of the gesture that builds the multimodal meaning. In other words, the source-domain gesture highlights or activates the meaning of the target domain, but not through a similarity relation between the two domains.

Conclusions
The present study combines quantitative statistics with qualitative analysis of data annotated with ELAN 4.6.2 to explore the inner cognitive mechanisms of the multimodal meaning-making process. From the discussion above, we can draw the following conclusions.
First, there are three patterns in the multimodal meaning-making process: the "metonymy within metaphor" interactive pattern, the "metonymy-prominent" multimodal meaning-making process and the "A-ING IS B-ING" multimodal meaning-making process. Metonymy plays an important role in multimodal meaning making: it can fulfil its referential function, help to form a similarity relation between the source field and the target field, or motivate the metaphorical mapping process level upon level. The "metonymy within metaphor" interactive pattern is the most frequently used pattern in multimodal meaning construction, because the target lexical categories in this pattern are more comprehensive than those of the other patterns.
Second, verbs, verbal phrases and nouns are the lexical categories of the words (metaphorical keywords) that most frequently accompany metaphorical gestures, which shows the function of gestures in highlighting the language focus. For the gesturer/speaker, metaphorical gestures help to express thoughts fully and to arouse the audience's resonance through the ex-bodiment of ideas "on the fly". For the observer/addressee, the juxtaposition of the visual mode (gesture) and the verbal mode (speech) makes it easier to activate their experience of the abstract verbal referents. Moreover, metaphorics provide hints that direct the addressee's attention to the language focus (metaphorical keywords).
Third, the results also indicate that the use of metaphorical gestures supports the view of cognitive linguistics that metaphor is a fundamental aspect of conceptual organization. It clearly illustrates the basic claim of cognitive semantics, namely that meaning is embodied and that abstract conceptions are grounded in perceptual and motor experience. Co-occurring metaphorical gestures can provide independent evidence supporting the semantic analysis of particular elements and expressions. In foreign language learning and teaching in particular, metaphorical competence is crucial for speakers both to express their ideas fully and to decode the multimodal meanings behind metaphorical gestures. Hence, in future foreign language teaching in China, more attention should be paid to improving students' multimodal metaphorical abilities, which helps to enhance "multiliteracy" and to achieve greater effectiveness in communication.
Finally, since the corpus is confined to six speeches by American President Barack Obama, the comparison of gestures used by different politicians still requires further research. In addition, more empirical studies, for example on how different addressees interpret a metaphorical gesture, could be very illuminating for identifying the patterns and characteristics of multimodal meaning construction. The theory of multimodal metaphor is relatively new and still maturing, so future researchers may further develop and refine the present theories.