Corpus-based Priming for Inverse Translation Training

Training students to become competent translators out of their mother tongue is a challenging objective. Yet for Chinese undergraduate English majors, inverse translation is a necessary skill and an indispensable curricular component. In pedagogical contexts, teachers and students of translation practice have generally found dictionaries to be of limited use, as the explanations or answers offered are often de-contextualized, outdated, misleading or simply wrong. In contrast, corpora can offer more in-situ reference for the struggling translator. This area is rather under-explored, especially in China, where little research has focused on corpus-based priming in inverse translation training. Presenting a case study of 50 Chinese undergraduate students majoring in English as they complete three rounds of translation of the same ST with or without different reference materials as tool kits, this paper explores whether, how and what types of corpora can be used in the translation classroom to improve the quality of students' inverse translation practice. Evidence from the tentative experiment confirms that a corpus-based preparatory activation session prior to inverse translation serves to prepare students in terms of linguistic capacity and knowledge base for the task at hand. However, students might place too much importance on the technical aspects of the ST and be implicitly influenced by the reference material more in their translation of technical terms than in their translation of the more general words and phrases in the original texts. Grammatical nuances and creative writing are also areas in which the priming effect is weak.


Introduction
According to Mona Baker, corpora refer to collections of texts "held in machine-readable form and capable of being analyzed automatically or semi-automatically in a variety of ways" (1995: 225). In the context of translation practice and translation pedagogy, three major types of corpus are identified: parallel corpus, multilingual corpus and comparable corpus (Baker, 1995: 230). Further elaboration of corpora typologies includes the distinction between monolingual and bilingual/multilingual corpora (Kenny 2001:57).
Monolingual corpora in the form of introductory specialized texts written originally in the target language are often regarded as a form of orientation for translators approaching their specialized translation tasks. Teague (1993:162) has pointed out that scientific and technical translators should read "orientation" texts to acquire subject knowledge as the first stage towards delivering high-quality translations. Maier and Massardier-Kenny (1993:151) place research skills as the most important component of their model for specialized translator training. Similarly, Durieux (2007) finds preliminary documentary research necessary for translators of technical texts to familiarize themselves with the topic before approaching the translation task.
As Borja mentions, the Internet itself can be considered a corpus, "the largest one in existence and the broadest in scope" (2007:3). As the host of various corpora, the Internet has been widely used by professional translators to prepare translation-related subject knowledge. Studies reveal that finding appropriate and relevant reference on the Internet might be a challenge for professional as well as non-professional translators, since many texts are "of doubtful quality as regards authority, subject matter and language use" (Sánchez-Gijón, 2009: 114). The inexhaustible nature of Internet content also poses the problem of cognitive overload for translators trying to orient themselves towards the subject while working under time constraints.
It is generally accepted that corpora offer useful resources for professional translators engaging in specialized translation, "providing information on terminology, phraseology and textual features" (Borja 2007:13). Gallego-Hernández believes translators may enhance their translation accuracy and efficiency simply by "reading different words, chunks or segments that occur frequently in the various texts in a corpus" (2015). Word frequency lists, n-gram frequency lists, concordances, and collocation profiles as different forms of corpus linguistics tools may be "used to solve problems related to a variety of topics, such as terminology, phraseology, general language, source-text comprehension, and genre and discourse features" (Gallego-Hernández, 2015).
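For readers less familiar with the corpus tools named above, the following is a minimal sketch of three of them: a word frequency list, an n-gram frequency list, and a keyword-in-context concordance. It is an illustration only and not the tooling used in any of the cited studies. The function names (tokenize, ngram_counts, concordance) and the sample sentences are invented for this example, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_counts(tokens, n):
    """Count contiguous sequences of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines: up to `width` tokens on each side of every hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Invented sample sentences standing in for a target-language corpus.
sample = (
    "Diffused light lowers contrast. Top light is rarely used for portraits. "
    "Soft light flatters pale subjects, while hard light adds contrast."
)
tokens = tokenize(sample)
word_freq = Counter(tokens)          # word frequency list
bigrams = ngram_counts(tokens, 2)    # 2-gram frequency list
kwic = concordance(tokens, "light", width=2)

for line in kwic:
    print(line)
```

A trainee scanning the concordance lines for "light" sees at a glance which modifiers and verbs the word keeps company with, which is the kind of contextual evidence a dictionary entry rarely provides.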
Beyond the arena of professional translation, studies on the use of text corpora for pedagogical purposes have also been done, including some empirical ones. Tiayon (2004) has explored the use of parallel corpora in translation teaching and learning, especially in non-mother tongue situations. Giampieri's study highlights the benefits of corpus analysis in aiding students to make learned and informed translation decisions. The study by Laursen and Pellón (2012) demonstrates the use of comparable text corpora and concordance software for classroom training of specialized translation between Spanish and Danish. According to their findings, comparable corpora prove more helpful than parallel corpora in that the former reveal contrastive features specific to source and target languages as well as genre-specific contrasts, thus facilitating the production of a functional translation. Sharkas (2013) has gauged the extent to which reading target language introductory specialized texts may help translation students produce accurate translations of scientific texts.
Some of the key issues remain unanswered. What sort of enhancement can a corpus-based approach bring to translation trainees? What role do corpora play in the classroom of translation training? How should trainees be guided in their use of corpora? What are the targets for using corpora in translation training? What types of corpora would be most helpful for achieving such targets? Many educators and students intuitively believe that the use of corpora facilitates a focused information search and provides quicker and more reliable answers to lexical obstacles and subject knowledge gaps encountered in translation, especially in inverse translation. The current study takes the intuition a step further, seeking empirical evidence on whether corpora work as a priming tool that helps students identify language conventions and enhances translation accuracy and overall quality.

Inverse Translation Training
The activity of translating one's mother tongue into a second language or foreign language has been described in various terms, such as translating into a non-native language or L2, or inverse translation. The current study adopts the term "inverse translation" as it is widely used and easy to understand.
Teaching students to perform inverse translation incurs a different set of challenges, as it involves producing a text in a language in which students are likely to be less proficient. Students are much more likely to generate erroneous output by the lexical or grammatical standards of the target language alone. Corpora in the target language might play an even bigger role in helping students produce adequate inverse translations. In recent years, many scholars have researched and advocated the use of electronic corpora for translator training in inverse translation. Attempts to incorporate corpus-based activities in translator training classes have been made and studied for pedagogical insight. Rodríguez-Inés (2014) has tried and tested various corpus-based language and translation exercises on searching for natural equivalents, appropriate collocations, frequency data etc., highlighting issues that inverse translation entails. The focus on corpora in inverse translation training as explored by such scholars often lies in corpus-retrieved texts in the target language relevant to the translation task at hand.
Inverse translation throws the spotlight onto the translator, who is tasked to produce a text in a language other than his/her mother tongue. Novice translators might still struggle to express an idea in the target language in a grammatically correct and natural way, so the risk of producing a totally unreadable translation is particularly high, as their foreign language proficiency might not be up to par, let alone the challenge of rendering the original textual or stylistic nuances. The students trained for inverse translation have to take center stage in our exploration of translation pedagogy, since the design of any corpus-based training approach hinges largely on their language proficiency, learning habits and expectations. If corpus-based exercises are vital components to be incorporated in the classroom, how should the exercises be designed to suit the nature and demands of inverse translation and the students involved? This is an area rather under-explored, especially in China, where research specifically on why, how and what corpora can be incorporated in inverse translation training has yet to be done. Wang (2008), one of the leading academics in translation studies in China, has conducted a review of Corpus-based Translation Studies and its progress in recent decades, in which the use of corpora in translation teaching was mentioned briefly, but nothing very practical or inspirational for a translation trainer or trainee could be distilled from the macro-perspective research except that the "development of new methodologies, rational hypotheses and their confirmations, and consilience in practical studies are the main points for the progress of Corpus-based Translation Studies and other research areas" (2008). As an effort to narrow such a vast research gap, the current tentative study takes inverse translation training as the field of focus for our discussion of corpus-based translation studies.

Priming
Like corpus-based study, priming is a concept and practice that originated outside the field of translation pedagogy. First used by Feldman and Weld (cited in McDonough & Trofimovich, 2011), the term "priming" refers to a state of attention preparedness for perception. Lashley (cited in McDonough & Trofimovich 2011) then further developed the concept in the field of psycholinguistics and used it to describe internal activation or readiness of linguistic elements in speech production. In the early 1960s, Segal and Cofer (cited in McDonough & Trofimovich 2011) published a study demonstrating that when language users are exposed to a list of words, they are more likely to reuse these words to perform a subsequent task, and referred to this phenomenon as priming. Since then, priming has been used as an important experimental paradigm for exploring the cognitive aspects of language learning and use, and has become increasingly popular in applied linguistics studies.
The term priming is now widely used to describe situations in which prior language exposure influences subsequent language processing, in the sense of facilitating or interfering with subsequent language comprehension or production, and it is believed to be an implicit process that occurs with little awareness. Psycholinguists frequently use priming as an experimental technique to examine how language input influences learners' comprehension and production of the L2. Since it involves introducing new material before a lesson takes place, priming can also be viewed as a teaching approach to prepare students for an activity with which they usually have difficulty. Horner and Henson (2008) have pointed out, "priming may be of enormous value for educational purposes, as it may increase the learning speed and growth, as well as positively influence a learner's motivation on a certain task." The overlapping pedagogical concerns of L2 teaching and inverse translation training mean that "priming" could possibly be employed as a means of facilitating the discussion and practice in inverse translation training. In the current study, the term priming is used in the sense of a preparatory pedagogical approach for the translation class and is subject to examination.

Methodology
This study aims to explore whether and how corpora can be used as a priming tool in translation training to help students identify language conventions and enhance their translation accuracy and overall quality. Inspired by Hala Sharkas's (2013) suggestion, the current study tries to answer the question by evaluating and comparing the accuracy of students' translations produced with different resources as reference following different procedural sequences, while cross-matching their errors against such resources.
To assess the effectiveness of corpus-based priming in inverse translation training, two groups comprising 50 Chinese undergraduate students majoring in English (translation and interpreting stream) were instructed to translate four short photo captions from a photography book from Chinese to English. The subjects were students taking the course "Translation Ⅱ" (instructed by the author) in the second semester of their third-year studies who had finished the course "Translation I" (also instructed by the author) in the previous semester. All 22 students from Class 1 form Group A, while all 28 students from Class 2 form Group B. The average scores in the final test for the course "Translation I" were 80.5 (out of 100) for Class 1 and 81 for Class 2, indicating a comparable level of language and translation proficiency.
The source texts comprise three photo technique descriptions and one photo title which is more of a poetic summary of the photographer's inspiration. A total of 14 lexical units (marked in bold letters) from ST 1, ST 2 and ST 3 are marked out as keywords for analysis of lexical translation errors. Students are not expected to produce translations using the exact wordings of those in the reference translation, but TT expressions that deviate from the semantics of the marked meaning units in the ST are counted as errors. Non-translation without corresponding meaning compensation is also counted as an error. A list of acceptable translations (see Table 1) is included to better illustrate what might be deemed accurate or the opposite, though a certain degree of discretion on the part of the author is unavoidable, as translation between languages is seldom a clean-cut business of 1+1=2. The list is not intended to be exhaustive, as students could be really inventive in their translation, be it right or wrong. As for translation quality in terms of grammatical accuracy, each clause of the Chinese sentences in the three short photo technique descriptions is considered a syntactic unit, and the translation of these units is subject to statistical analysis of syntactic translation errors. A total of 14 such units are identified (as marked by the "/" sign) in ST 1-3. Failure to produce a grammatically correct linguistic structure representing a given meaning unit is counted as one syntactic error. The photo title in the original texts is mainly intended for measuring the effect of corpus-based independent research with regard to stylistic choices, as compared with the more technical source texts in the group.
Diffused light produces small contrast, rendering the subject less three-dimensional, but it is particularly apt for bringing out the color details of white or pale subjects.

一般而言顶光不宜拍人物，/此片例外。/小孩头部黑色头发借顶光之助勾划分离了深色背景。/孩子脸上满载欢乐，/滚滚沙尘，/增添气氛。
Reference Translation: Generally speaking, top light is not for people shots, except it works in this image. The black hair on the children's heads is picked up by the top light from the dark background. There's great joy beaming from the children's faces, while the rolling dust adds to the atmosphere.

冲冠一怒为"红颜"
This line of poetic expression takes inspiration from the old narrative poem "Yuanyuan Qu", which describes the protagonist General Wu Sangui flying into a rage when he heard that his beloved mistress Chen Yuanyuan had been abducted by the enemy troops. It literally means "one wave of headgear-lifting anger propelled him, all for the sake of the fair-faced one." However, in the ST here the photographer is borrowing the line to describe the look of the bursting red petals of a flower in full bloom. There is huge room for a creative interpretation of the ST, as the poetic line is now used as the title for a photographic work. It is not realistic to gauge lexical or syntactic errors in the translation of the line here, and therefore it is intended for the study of students' stylistic and cultural awareness in their independent search for corpus reference.
The two groups were first instructed to translate without referring to dictionaries or any other research resources, producing the first batch of unaided independent translations. Then both groups were given two corpus-based comparable texts on photographic techniques to read and were allowed to use whatever reference they could find online or offline, be it dictionaries, encyclopedias, search portals, professional websites or online corpora. The order of these two steps differed between the groups. Group A was allowed to conduct independent research first and produce the second translation version before they were given the comparable texts to read and asked to produce the third version, whereas Group B was provided with the comparable texts first and asked to produce the second translation version before they were allowed to conduct independent research and produce the third translation version. The two comparable reference texts were selected by the author after a ST-targeted process of corpora search, comparison and sifting.
They can be retrieved at the following web pages: https://www.lightstalking.com/types-of-lighting/ and https://www.canva.cn/learn/beginners-guide-natural-lightuse-take-great-photos/. One is in English and the other in Chinese. Though not translations of each other, both texts cover the same topic of photography lighting, with professional terms pertinent to the ST subject matter mentioned and explained. For the purpose of distinguishing corpus-based priming from non-corpus-based priming, students were required to report what research tools they had used for their independent research and to leave some comment or feedback at the end of their translation.
The hypothesis is that the translation versions produced after reading the given comparable text corpora or after independent corpus-based referencing will show a lower rate of translation errors than the first versions; such a result would confirm the effectiveness of corpus-based priming in helping students produce better translations. The different procedural arrangements for the two groups in the experiment are intended to compare the efficacy of guided corpus-based translation priming with that of students' independent research in translation practice.
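The comparison described in this section can be sketched in a few lines of code. The sketch below is an assumption about how such a tally might be computed, not the author's actual procedure: the paper does not state whether improvement is measured in percentage points or in relative terms, so percentage points are assumed here. The function names and the error counts are hypothetical and are not data from the study; only the figure of 14 lexical units comes from the text.

```python
def error_rate(errors_per_student, units_per_student):
    """Share of assessed units rendered wrongly, pooled over all students."""
    total_units = units_per_student * len(errors_per_student)
    return sum(errors_per_student) / total_units

def improvement(first_rate, later_rate):
    """Drop in error rate between two rounds, in percentage points (assumed measure)."""
    return (first_rate - later_rate) * 100

LEXICAL_UNITS = 14  # keyword units marked in ST 1-3, as described above

# Hypothetical error counts for four students; NOT data from the study.
round1 = [7, 6, 8, 7]
round2 = [4, 5, 3, 4]

r1 = error_rate(round1, LEXICAL_UNITS)
r2 = error_rate(round2, LEXICAL_UNITS)
gain = improvement(r1, r2)

print(f"Round 1 error rate: {r1:.1%}")
print(f"Round 2 error rate: {r2:.1%}")
print(f"Improvement: {gain:.1f} percentage points")
```

The same two functions would apply to the 14 syntactic units, and running them separately for each group and each round yields the figures needed to compare the two procedural sequences.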

Findings and Discussion
Comparing both groups' 1st and 2nd translation versions, the analysis shows a significant drop in lexical translation error rates in the second versions, with an average accuracy improvement of 19% in Group A and 33% in Group B. Improvements in syntactic formulations are less substantial, but still present. More in-depth analysis reveals that while the overall lexical accuracy is enhanced, the improvement in technical terminology is more evident than in general words and phrases. The figures indicate that both independent research and guided corpus-based priming do help students identify previous mistakes and enhance translation quality. In their feedback, the participants reported that using the guided corpora had helped them gain a much firmer grasp of the technical aspects of the content they were tasked to translate and locate the right terminology faster. Though the feedback by itself does not amount to statistical proof of a definite relation between guided corpora priming and improved inverse translation quality, it shows where students' focus of attention lies in tackling the task. One interesting finding from the experiment is that students who were instructed to read the corpus-based comparable texts before the second round of translation (Group B, see Table 3) show a greater leap in translation quality than those who were asked to conduct their own independent referencing for the same round of test (Group A, see Table 2). The comparison of both groups' 3rd translations with their 1st versions likewise indicates a more significant quality improvement in Group B than in Group A. Another interesting observation is that, for Group B participants, the improvement of lexical accuracy in the translation of photography terminology is most evident in their 2nd translation versions, while the improvement of lexical accuracy in the translation of general words is most significant in their 3rd translation versions.
The phenomenon can be explained through the figures and participants' feedback: Group B students would focus on tackling the technical terms first with the help of the targeted corpora, saving energy for tweaking the more general linguistic challenges in the third round of translation, which was preceded by independent research. In contrast, with a different sequence of pre-translation research, students in Group A would spend too much time and energy finding equivalents for the technical lexicon in the ST while conducting their independent research, neglecting the translation of general words and phrases. Quite a lot of them reported difficulties in locating the right terminology while conducting independent research. As their attention was largely placed on the photography terminology, the final round of guided corpora reading would simply be used as confirmation or negation of their previous research results. Almost 86% of Group A participants are found to have revised their correct translation of "顶光" (top light) in the 2nd version from "top light" into a more cumbersome expression "light from above" (as it appears in the given corpora) in the 3rd translation, which shows students' lack of confidence in their own research and their reliance on the corpora reference they are provided with.
Based on the final translation results, we may infer that guided corpus-based priming works better if it is administered before students engage in independent research. Students' research would be much more targeted and efficient after initial exposure to authentic texts dealing with the subject matter, and they would have more confidence to grapple with the rest of the linguistic challenges in subsequent revisions, inspired by the corpora. Such learning and tweaking processes prove most beneficial when the students are not plunged into a blind search for equivalent terms and expressions at the beginning of the task, but are guided with corpus-based research approaches or given targeted corpora for reference, in which they get to know word uses in context and consolidate subject knowledge for better rendition of meaning.
Expressions with similar semantic meanings can pose a huge challenge for trainee translators translating into a foreign language. Students would either be trapped by their limited reserve of active vocabulary or baffled by the potential choices of similar expressions. Corpus-based priming prompts students to evaluate expressions used in context, in relation to the subject matter and text genre. Just as Rodríguez-Inés (2014) has concluded, in inverse translation training, electronic corpora can be of great assistance in helping students produce translations that "sound natural". As an implicit cognitive phenomenon, priming for inverse translation practice works in line with the principle that prior experience with language shapes subsequent language use. Therefore, the choice of corpus material for the priming session is of particular importance, especially for beginners learning to perform inverse translation, who might struggle to locate the right corpora in their independent research session or simply have no idea of the possible help to be gained from referring to corpora. Almost all students participating in the current experiment have used some form of corpus consultation (through search engines like Baidu, Bing and Google, or corpus platforms like Linguee.com or glosbe.com) in their independent research, but the degree of help varies from person to person, highlighting the need for training in "corpus navigation" (Zanettin, 2001: 179). The corpus material to be used for research purposes should be originally written in the target language. This is often neglected by students, who might take whatever comes up in the search result as reliable reference so long as it is written in the target language. Students should be made aware of the need to check the authenticity of the corpus material and to "experience" the technical language and terminology from a contextualized point of view.
As for the translation of more literary or creative writing, as in the case of the photo caption ST (a word-play on a folk saying) in our experiment, the samples collected reflect only a minor degree of translation quality improvement over the different procedures in both groups. Many of the students chose to stick with their initial translations even after later sessions of corpus consultation and independent research. It is possible that the priming fails in this case of translating creative writing because the given corpora reference covers a subject matter unrelated to the ST caption and belongs to a totally different text genre. The creative dimension of the task makes it harder not only for students to navigate the vast sea of potential corpora but also for researchers to quantify or even judge the quality of their translation. However, it does not mean that corpus-based priming will not work with literary or creative translation. Again, it is just a matter of whether the right corpus reference can be located and utilized.
While the hypothesis of this study might not be proved conclusively by the experiment, the results collected do suggest that targeted corpus reference and corpus-based research can work as an effective priming tool for students to produce more accurate and natural translations in inverse translation practice, especially when a particular technical subject is involved. The benefits lie in faster location and more accurate use of terminology in the TT as well as fewer syntactic mistakes, though individual variations exist.

Conclusions
To answer the questions raised at the beginning of the paper, mainly about whether, how and what types of corpora can be used in the classroom of translation training for quality improvement in students' inverse translation practice, the author has conducted a small-scale study using corpus-based material as a preparatory tool, with the intended effect of lowering lexical and syntactic errors in the primed translation.
The implications of the experiment for translation training are multifaceted. First, the findings have shown that exposing students to corpus-based reading material relevant to the subject matter of the translation task helps to prime them for the daunting task of inverse translation, which entails producing a target text written in their foreign language. Corpus-based preparatory reading allows students a glimpse of word use and collocations in context, speeding up terminology and subject matter research. According to the experiment findings, the priming works best in helping them locate the right technical terms, as this is very likely where their attention is focused. For more general words and phrases, students are also likely to be implicitly influenced in their choice of lexicon, though the figures show a lesser degree of influence as compared to the translation of technical terminology. While the overall error rate in lexical and syntactic representation is lowered with the help of guided corpus material exposure and independent research, the level of quality enhancement varies with the point in the process at which these reference materials are used, which is a crucial consideration for both translation teaching and learning.
In line with such findings, a further implication for reflection has to do with how we employ corpus-based reference materials for translation training purposes. The potential priming effect exerted on beginners in inverse translation is reflected in the experiment, highlighting the need for guidance in corpora navigation and analysis. Participants in the current experiment have engaged in a largely passive corpus-based priming session, where they rely on and trust the corpora provided by the teacher more than other reference they find by themselves. This is evidence of a lack of confidence as well as experience in independent research for translation. Students might feel overwhelmed by the vast sea of information online or offline and be distracted by the sheer volume of corpora. The next step in exploiting the pedagogical potential of corpus-based priming would be to help students develop corpus research competencies, not only in developing search skills for identifying useful materials, but also in analyzing textual nuances for greater help in inverse translation as a form of writing in the foreign language.
To sum up, evidence from the tentative experiment confirms that a corpus-based preparatory activation session (reading SL and TL comparable corpora in this case) prior to inverse translation serves to warm up and reinforce students' linguistic capacity and knowledge base for the task at hand, shoring up confidence as well. However, students might place too much emphasis on the technical aspects of the ST and be implicitly influenced more in the translation of technical terms than in that of the more general words and phrases. Grammatical nuances and creative writing are also areas in which the priming effect is weak. Therefore, it is safe to say that corpus-based priming is more suitable for training students in translating technical writings from their mother tongue to a foreign language.
There are quite a lot of limitations to the current study. Although some quantitative data are obtained in the study to indicate a positive relation between corpus-based priming and enhanced quality in the final translation, the statistics are not comprehensive or conclusive enough, as only part of the ST lexicon was taken as the subject of study. The assumption of unit-for-unit correspondence in lexicon and syntax between the ST and TT is also problematic. The attempt to quantify grammatical errors as translation errors is, again, not completely fallacy-proof. The benefits and downsides of using comparable corpora as compared with parallel or monolingual corpora are another issue that remains unexplored. As a tentative attempt to seek empirical evidence for corpus-based priming, the study has done its part to confirm the use of such an activation session for translation training purposes, distinguishing areas of strength and weakness in the priming effect and pinpointing the importance of guidance in corpus navigation. More in-depth studies are needed before we can see the whole picture of the issue.