Language Processing and Translation

2017

Chapter 5

Language processing and translation

Moritz Schaeffer
Johannes Gutenberg University of Mainz

Michael Carl
Renmin University of China

The current chapter reviews studies which investigate the behavioural differences during reading and writing for translation and other non-translational language use. This chapter further argues that eye movement measures imported from Psychology are not well suited to describe the unique co-occurrence of reading and writing during written translation. In order to address these shortcomings, one existing measure (the Eye-Key Span, Dragsted & Hansen 2008; Dragsted 2010), which describes how reading and writing activities are coordinated, is further tested by replicating existing findings with more language combinations and participants. A second, novel measure (the probability that source text reading and target text writing overlap in time) is used in conjunction with the Eye-Key Span to test predictions from an existing model of the translation process (Schaeffer & Carl 2013a). Finally, one new feature (HCross) is introduced with which an existing model of bilingual memory (Hartsuiker et al. 2004) is extended.

Moritz Schaeffer & Michael Carl. Language processing and translation. In Silvia Hansen-Schirra, Oliver Czulo & Sascha Hofmann (eds.), Empirical modelling of translation and interpreting, 117–154. Berlin: Language Science Press. DOI:10.5281/zenodo.1090958

1 Translation and non-translational language processing

There is a long tradition of studying the differences between original texts written in one language and texts translated from a different language – in terms of the product of translation, i.e., in corpora of the final (published) texts (e.g. Hansen-Schirra et al. 2012). Corpus-based translation studies have the great advantage that the data which led to the formulation of theoretical insights is ecologically valid to a high degree: the texts used in corpora such as the CroCo corpus (Hansen-Schirra et al. 2012) are published texts and have therefore been produced in situations which are real and natural. Experimental studies, on the other hand, often manipulate source texts (henceforth ST) and the STs are normally far shorter than those in real life situations (ranging from single words, to single sentences and short texts of approximately 150 words). In addition to the unnatural characteristics of the STs, participants are often not allowed to use reference material such as dictionaries or glossaries and typically do not have access to the internet. Further increasing the unnatural conditions of experimental studies is the fact that participants translate knowing that their reaction times or keystrokes and/or eye movements are recorded and the simple presence of a researcher may further impinge on the process of translation.

However, the shortcoming of corpus-based translation studies is that it is difficult to attribute observed effects to particular aspects of the translation process, given that the source of information is typically the frequency of a particular item in the final product. The factors which led to the observed result of the process remain hidden in the dialogue between ST reading and target text (henceforth TT) reading and writing and interaction with other information sources.

The current study therefore aims to provide insights into the cognitive processes which occur during translation by first reviewing existing studies which compare translational and non-translational language use and by comparing the effect of two tasks (monolingual copying and translation) on two behavioural measures. One of these behavioural measures was first proposed by Dragsted & Hansen (2008) and Dragsted (2010), and the second behavioural measure is novel. The two measures take into account both eye movements on the source text and typing activity.
The eye-key span (Dragsted & Hansen 2008; Dragsted 2010) describes the temporal distance between a first reading of a particular word and the first keystroke which contributed to the translation of that particular ST word. It can be seen as a relatively late indicator: many intervening processes between a first reading and the first keystroke can and typically do occur during translation, while fewer occur during monolingual copying. The second measure is the probability that ST reading and TT typing occur (at least partially) at the same time. It is an indicator of cognitive effort: the less likely the co-occurrence of these two processes, the more effortful the process. The more likely it is that reading and writing overlap in time, the less effortful is the process as a whole at that time. These two measures take into account one aspect of the nature of the translation process which it shares with few other tasks, apart from monolingual copying: the direct relationship between read input and written output.

Jakobsen argued that with the introduction of eye tracking and keylogging into translation process research the hope was that

…eye data would provide evidence pertaining identifiably to source-text reading so that source-text comprehension processes could be studied separately from text-production processes and could be compared with other reading processes that were not part of a translation process. (Jakobsen 2011: 41)

Very few studies have systematically compared the cognitive processes during non-translational language use with those that occur during translation. The current chapter will review the studies which have done so and will provide new evidence which addresses shortcomings in existing studies.

2 Reaction times and eye movements during translation

2.1 Reaction times per clause

Shreve et al.
(1993) compared reading times in three tasks and groups: reading for later translation by translation students, reading for later monolingual paraphrasing by students of English and reading for comprehension by students in psychology. Reading times were measured per clause (including re-reading) and normalised by the number of words in each clause. Results from principal component analysis of the reading times showed that, at least on the basis of these behavioural measures, none of the four factors of the principal component analysis distinguished reading for translation clearly from the other two tasks. However, reading for translation was overall more similar to reading for monolingual paraphrasing than to reading for comprehension. The authors further point out that there was more variation in how translators read for translation while the other two groups of participants approached their tasks more homogeneously. The paraphrasing and translation groups were also asked to indicate post-task the nature and number of problems in the clauses they identified in their reading. The expectation was that the number of problems identified post-task would correlate with reading times. This was not the case. Although the authors do not interpret their findings in this way, it is entirely possible to argue that post-task identification of problems might not accurately reflect the processes which occurred during reading, given that they are produced off-line. One other reason might be the fact that reading times per clause might not accurately reflect actual reading times, which might show the expected effect locally rather than globally.

2.2 Reaction times per word

In a series of studies, Bajo and colleagues (Macizo & Bajo 2004; 2006; Ruiz et al. 2008) employed more sensitive behavioural measures, i.e., reaction times per word using the self-paced reading paradigm.
In all three studies, a similar experimental design was used: masked self-paced reading is the sequential presentation of single words which is controlled via button press by the participant, so that subsequent button presses are used to measure reaction times per word. The interval between two successive button presses is taken as an indication of the time needed to process the currently displayed word. These studies therefore address the concerns raised in relation to the study by Shreve et al. (1993). Bajo and colleagues (Macizo & Bajo 2004; 2006; Ruiz et al. 2008) refer to the model proposed by Seleskovitch (1976) who argued that translation is normally carried out sequentially in that the first step is source text comprehension and only when this is complete and only once the source material is “deverbalised” can reformulation in the target language begin. Opposed to this sequential view is the assumption that representations specific to the target language (TL) are activated at the same time as source language (SL) representations are activated (horizontally and in parallel). The vertical model by Seleskovitch (1976) is essentially what in machine translation would be called an interlingual model. It is the highest level in the Vauquois triangle (Vauquois 1968) (see Figure 1), where transfer occurs at a language-independent interlingual representation, common to all languages.

[Figure 1: The Vauquois triangle of translation based on Vauquois (1968). Labels in the figure: Source Text, Target Text, Direct translation; Syntactic analysis, Syntactic structure, Syntactic transfer, Syntactic generation; Semantic analysis, Semantic structure, Semantic transfer, Semantic generation; Interlingua.]

The studies by Bajo and colleagues were designed to test the Seleskovitch model. Participants in all three studies carried out two tasks: reading for comprehension and reading for translation.
Participants were not overtly producing the translation while reading – they were asked to orally produce the translation after having read the sentence (for translation). The expectation in all three studies was that the manipulation of the stimuli would elicit an effect only in the reading for translation condition, because of a) increased working memory load due to the added effort related to online translation and b) because the assumption was that during reading for comprehension the TL would not be activated and TL-specific manipulations would not have an effect on source text (ST) reading. In the 2004 study, Macizo and Bajo manipulated both working memory load and the availability of pragmatic cues. The stimuli consisted of object relative sentences such as “The judge that the reporter interviewed dismissed the charge at the end of the hearing.” The authors argued that working memory load would be particularly high for the verbs of the main clause and the relative clauses, because in object relative clauses, the thematic roles of the first two constituents (judge and reporter in the example) can only be assigned retrospectively once the subordinate verb (interviewed) is read. Pragmatic cues consisted of verbs which were either more or less predictable based on the previous context. It is, for example, more predictable that a reporter interviews than that a reporter admires or it is more predictable that a judge dismisses a charge than that he drives a car. In addition to testing the sequential versus parallel view of translation, Macizo & Bajo (2004) tested the predictions of the Revised Hierarchical Model (RHM) of bilingual memory (Kroll & Stewart 1994) which predicts that backward translation (BT, from L2 into L1) is faster than forward translation (FT, from L1 into L2), because L2 lexical representations have stronger connections to their L1 equivalents than to shared conceptual representations. 
Translation from L2 into L1 is therefore predicted to use the faster lexical routes and translation from L1 into L2 is mediated by the less direct conceptual connections. However, during translation, both routes are always activated – one is simply faster than the other. The predictions based on the sequential/parallel model and the RHM are therefore that an effect appears only in the reading for translation condition and that FT, because it is more conceptually mediated than BT, is especially susceptible to the manipulation of pragmatic cues. These predictions are clearly borne out by the evidence: reaction times were significantly slower during reading for translation, particularly during FT and particularly for the constructions which require retrospective assignment of thematic roles and therefore high working memory load, supporting the parallel activation of SL and TL representations during reading for translation. In addition, more predictable verbs were read significantly faster than less predictable verbs in FT, but not in BT, supporting the predictions of the RHM.

Further support for the co-activation of SL and TL representations during reading for translation was provided by the two subsequent studies (Macizo & Bajo 2006; Ruiz et al. 2008). In both studies, participants also read single sentences for comprehension and for translation in a self-paced reading paradigm. In the 2006 study, the stimuli for experiments 1a and b consisted of interlingual homographs which created an ambiguity only if they were translated: the Spanish word presente is not ambiguous in Spanish (it can only refer to the present time), but it is ambiguous when translated into English, given that present can refer both to a gift and the present time. In experiments 1a and b, the number of words intervening between the ambiguous homograph and the disambiguating context was manipulated so that working memory load was a factor in the design.
In experiments 2a and b, cognates were used. The manipulation in experiments 1a and b was expected to result in inhibition only when the reading purpose was translation and particularly when the working memory load was high, but not when the reading purpose was comprehension alone. The prediction for experiments 2a and b was that the presence of cognates would facilitate processing. Both of these predictions were designed to lend further support to the hypothesis that activation of the TL during ST reading is task-dependent. Again, the predictions were confirmed in this study. The 2004 study by Macizo and Bajo only employed professional translators, but the 2006 study by the same authors replicated the effects found in professional translators with innocent bilinguals who had no professional translation experience: interlingual homographs, the working memory manipulation and cognates resulted in the same pattern of results, suggesting that the mechanisms underlying the task-dependent co-activation of SL and TL are not a function of expertise, but co-extensive with bilingualism.

The 2008 study by Ruiz et al., again, employed essentially the same experimental design as the previous two studies. TL-specific aspects were manipulated here: the frequency of critical SL items was kept constant while the frequency of their TL equivalents was either high or low (experiment 1). Experiment 2 manipulated the congruence of the word order in the ST with that in the TT: in the SL (Spanish), adjectives can either precede the noun they modify or they can be placed after it, while in the TL (English) they can only precede it. Only professional translators participated in this study and working memory load was not manipulated. Results were as predicted, in that the manipulations only had a significant effect on reaction times when the reading purpose was translation, but not when the reading purpose was comprehension only.
All three studies by Bajo and colleagues support the horizontal model of translation. All three studies show that co-activation of SL and TL is task-dependent. In all three studies by Bajo and colleagues, the results are interpreted in terms of Grosjean’s (1997) language mode continuum, which predicts that, depending on the context of language use, a bilingual’s two languages are activated to varying degrees. At one extreme is the monolingual mode, in which mainly one language is active and at the other extreme is the bilingual mode in which both languages are active.

2.3 Complete texts and eye movements

Jakobsen & Hvelplund Jensen (2008) investigated essentially the same question as all the studies presented thus far, but employed an eye tracker. In this study, there were four tasks: reading for comprehension, reading for translation, reading while speaking a translation and reading while writing a translation. The expectation was that the task would have an effect on eye movements. The authors found significantly more fixations on the whole ST in reading for later translation than reading for comprehension, reading while speaking a translation had significantly more fixations than reading for translation and reading while typing a translation had significantly more fixations than reading while speaking a translation.

Further support for task-dependent co-activation of two linguistic systems comes from the study by Hvelplund Jensen et al. (2009). The manipulation in this study is very similar to the one by Ruiz et al. (2008), in that it investigates the congruence of word order. In the study by Jensen et al., the stimuli consisted of complete Danish texts which were translated into English. In the critical declarative clauses, embedded in the longer texts, the subject either preceded (SV) or followed the verb (VS).
When translating these clauses into English, participants had to invert the order of verb and subject for the VS clauses, but not for the SV clauses. As in the study by Ruiz et al. (2008) the expectation was that it would be more difficult to process the incongruent clauses than the congruent ones. Results confirmed this. Jensen et al. employed an eye tracker and so the dependent variable was total reading time on the phrases. Total reading time is the sum total of all fixations on the area of interest. During translation, participants (professional translators) looked longer at clauses which had an incongruent word order than at clauses with a congruent word order. Evidence that this effect is task-dependent came from a follow-up study (Winther Balling et al. 2014) which employed the same stimuli as in the previous study, but in this case, the participants were either Danish-English bilinguals or English-Danish bilinguals and they were asked to read for comprehension only. The participants were therefore asked to read in their L1 and L2 respectively. The rationale for the follow-up study was to make sure that the effect observed in the 2009 study was in fact task-dependent and not due to the fact that VS clauses are inherently more difficult to process when reading for comprehension in either L1 or L2. The manipulation (VS vs. SV) had no effect on total reading time during reading for comprehension in either L1 or L2.

One question which is relevant in this context is how early the effect of the co-activation of the two linguistic systems during translation appears. The study by Shreve et al. (1993) employed a very late measure (reading latency of a complete clause), the studies by Bajo and colleagues employed a more sensitive measure (reaction time per word). The studies by Balling and colleagues (2009; 2014) employed total reading time on a phrase.
Total reading time, given that it is the sum total of fixations on a particular region of text, is not informative regarding the time course of the effect.

2.4 Early and late eye-movement measures

Schaeffer et al. (2017) employed more fine-grained eye movement measures than previous studies, but otherwise, the design was similar to previous research. Professional translators read for comprehension and translated single sentences. The manipulation consisted of the number of target words which were equivalent to a single source word. Half of the stimuli contained items which had a one-to-one equivalence (the likelihood that an ST word was translated using just one TT word was high) and the other half contained one-to-many equivalences (the likelihood that an ST word was translated into more than one TT word was high). Global analyses showed that average fixation durations were 20 ms longer during reading for translation than during reading for comprehension. Participants made on average 16 fixations more per sentence during reading for translation and the number of regressions also doubled, as did total reading time. The significant increase in all these eye-movement measures confirms and extends findings from earlier studies discussed above, i.e., that during reading for translation, co-activation of the two linguistic systems increases effort from early on (duration of single fixations) and into later processes (total reading time and regressions). That co-activation occurs very early during the process is further supported by the fact that the manipulation had a significant effect on first fixation durations: when it was likely that an ST word would be translated using more than one word, participants spent 23 ms longer on this word when they were to translate it afterwards, but not when they only had to read it for comprehension. First fixation durations describe the time readers spend on a word the first time they encounter it.
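To make the contrast between an early and a late measure concrete, the following minimal sketch computes first fixation duration and total reading time for one word from a chronological list of fixations. The `Fixation` record, its field names and the sample durations are hypothetical and chosen for illustration only; they do not reproduce the data format or the analysis pipeline of any of the studies cited here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fixation:
    word_id: int      # index of the fixated ST word (hypothetical field)
    duration_ms: int  # fixation duration in milliseconds


def first_fixation_duration(fixations: List[Fixation], word_id: int) -> Optional[int]:
    """Duration of the first fixation that ever lands on the word (early measure)."""
    for fix in fixations:
        if fix.word_id == word_id:
            return fix.duration_ms
    return None  # the word was never fixated


def total_reading_time(fixations: List[Fixation], word_id: int) -> int:
    """Sum of all fixation durations on the word, including re-reading (late measure)."""
    return sum(fix.duration_ms for fix in fixations if fix.word_id == word_id)


# Invented fixation sequence: word 1 is fixated twice (second time after a regression).
sample = [
    Fixation(0, 210),
    Fixation(1, 180),
    Fixation(0, 95),
    Fixation(2, 240),
    Fixation(1, 130),
]
```

With this invented sequence, word 1 has a first fixation duration of 180 ms and a total reading time of 310 ms. The two values diverge as soon as a word is re-read, which is why total reading time alone says little about the time course of an effect.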
The critical items which were likely to be translated using more than one TT word necessarily introduced lexical items which, when translated back into the SL, had no direct equivalents (see examples 1a and 1b below) in the context in which they appeared. It is therefore likely that, in the context in which they appeared, the one-to-many items did not share semantic representations across the two languages to the same degree as did the one-to-one items. First fixation durations on one-to-one items were not significantly different from first fixation durations on either kind of item during reading for comprehension. This pattern of results suggests that if the overlap in terms of lexico-semantic representations between SL and TL items is high, as in the case of one-to-one items, then translators are able to exploit the effects of co-activation and (initial) processing is similar during reading for comprehension and reading for translation. If, however, the semantic overlap is smaller, as in the case of one-to-many items, co-activation has an inhibiting effect on reading for translation, but not on reading for comprehension.

(1) a. One-to-many
       ‘The water in the bottle is low…’
       In the bottle is not any more much water…
       In der Flasche ist nicht mehr viel Wasser…
    b. One-to-one
       ‘The water in the bottle is bad…’
       The water in the bottle is bad…
       Das Wasser in der Flasche ist schlecht…

Further support for the early activation of TL-specific representations during ST reading comes from a corpus-based eye movement study (Schaeffer et al. 2016). This study was designed to test a model proposed by Schaeffer & Carl (2013b). While the studies by Bajo and colleagues and Balling et al. described above contrasted a sequential and parallel model of translation, the model by Schaeffer and Carl argued that translation is best represented by both early, parallel and late, sequential processes.
Schaeffer and Carl hypothesised that early automatic priming processes activate semantic and syntactic representations which are shared by the SL and the TL and later, more conscious, essentially monolingual vertical processes monitor the output from the early processes. Shared syntactic representations are defined in terms of the shared syntax account (Hartsuiker et al. 2004) and shared semantic representations are defined in terms of the Distributed Feature Model (de Groot 1992). In line with these models, Schaeffer and Carl argue that “shared representations are accessed very early during the process” (Schaeffer & Carl 2013b: 174) and that during the early stages “there is no conscious control over how source and target are aligned cognitively” (Schaeffer & Carl 2013b: 173).

In order to test the possibility that the automatic cognitive alignment has an observable effect on early eye movement measures and that these primed, shared representations serve as a basis for later processes, Schaeffer et al. (2016: 189) quantify the syntactic similarity (in terms of word order) of the source and the target texts and the variation of word translation realizations. The metric termed Cross (Carl et al. 2016: 26) describes the relative word order differences between the ST and the TT. If the word order is identical in two segments, then the Cross value for each word is 1. If, say, the equivalent of the first ST word is aligned to the sixth TT word, then the Cross value is 6. If, however, the distortion is in the opposite direction, i.e., if the sixth TT word is aligned to the first ST word, then the Cross value is -5. The Cross value can be computed by counting how many TT words need to be progressively or regressively counted in order to arrive at the equivalent of a given ST word. It is then termed CrossS.
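The counting procedure for CrossS can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not part of the source: every ST word is aligned to exactly one TT word, so the alignment can be written as a list of 1-based TT positions in ST order. The function name `cross_s` is hypothetical, and the treatment of unaligned words and many-to-many alignment groups in the TPR-DB is not reproduced here.

```python
from typing import List


def cross_s(tt_positions: List[int]) -> List[int]:
    """Relative TT steps needed to reach the equivalent of each ST word.

    tt_positions[i] is the 1-based TT position aligned to the (i+1)-th ST word.
    Assumes strictly one-to-one alignments (an illustrative simplification).
    """
    values = []
    previous = 0  # counting starts just before the first TT word
    for position in tt_positions:
        values.append(position - previous)  # positive: progressive, negative: regressive
        previous = position
    return values


identical_order = cross_s([1, 2, 3, 4])         # same word order in ST and TT
distorted_order = cross_s([6, 1, 2, 3, 4, 5])   # first ST word rendered as sixth TT word
```

Under these assumptions, identical word order yields 1 for every word, and aligning the first ST word to the sixth TT word yields 6. In this sketch a value of -5 arises for the following ST word, which requires counting back from TT position 6 to position 1; whether this corresponds exactly to the convention in Carl et al. (2016) is an assumption of the illustration.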
But the Cross value can also be computed by counting the number of ST words which need to be read progressively or regressively in order to arrive at the equivalent of a given TT word. This is then termed CrossT. CrossS can be seen as a process by which the ST is cognitively aligned with the TT, while CrossT describes a process which aligns the TT with the ST. The computation of CrossS progresses in a linear and sequential manner through the ST and finds aligned TT items, while CrossT progresses in a linear and sequential manner through the TT and finds aligned ST items. The variation in terms of TT realizations of a particular ST item is computed by counting how many different TT items, which are aligned to the same ST item, there are in a corpus of a number of translations of the same ST. On the basis of the probabilities of each of these TT realizations, the distribution of these probabilities is then calculated. This is then expressed as word translation entropy (HTra) (Carl et al. 2016: 31) if the variation underlying this metric is lexical in nature, and it is termed syntactic entropy (Bangalore et al. 2016) if the underlying variation is syntactic in nature.

Schaeffer et al. (2016) find that both word translation entropy (HTra) and syntactic distortion (CrossS) have a significant positive effect on first fixation durations and total reading time. It is therefore likely that the effect of CrossS and HTra on first fixation durations represents early, automatic cognitive alignment, which is less effortful in the case of ST items for which the overlap between ST and TT representations in terms of syntax and lexico-semantics, respectively, is greater (low HTra and Cross values). The study by Bangalore et al. (2016) found that syntactic entropy had a significant positive effect on total reading time of source text segments. The studies by Bangalore et al. (2016) and Schaeffer et al.
(2016) found evidence of the above in the TPR-DB (Carl et al. 2016), which is a large database containing eye-movement and keylogging data in relation to several translations of the same source texts into a large number of target languages. The data for the study by Schaeffer et al. (2016) consisted of 42,211 English ST words translated into six different target languages and the data for the study by Bangalore et al. (2016) consisted of 26,139 words translated from English into three different target languages. While the large number of languages and the sizeable amount of material warrants confidence in the results, it should be stressed that a non-negligible amount of variation could not be explained with the predictors in the model presented by Schaeffer et al. (2016). In other words, while the model could make predictions with a certain degree of confidence, a possibly large number of variables which impact eye movements during translation remains unknown.

To sum up, it is likely that task-dependent co-activation occurs early (horizontally) and that later processes use the output from these relatively automatic processes in the relatively vertical processes. The time needed to process a particular ST item is likely to be a function of the degree of overlap between ST and TT syntax and/or semantics.

3 Automatic translation

The studies reviewed so far have found that co-activation during translation is task-dependent. However, there is evidence which suggests that activation of translation equivalents is automatic even if participants are explicitly asked to ignore verbal stimuli (Wu & Thierry 2012; Wu et al. 2013). In the 2012 study by Wu and Thierry, participants were asked to perform a go/no-go task in which they had to respond with a button press to the presentation of shapes (circles or squares) while electrophysiological data were recorded. Half of the trials consisted of words. Participants were told to ignore the words and only respond to the shapes.
Unbeknown to the participants, 30% of the word trials consisted of English words, which, when translated into Chinese, were homophonous with the Chinese words for circle or square. Behavioural responses to the critical items showed that Chinese-English bilinguals were not likely to make more erroneous responses to the critical items (English words which when translated into Chinese sounded like either circle or square) than to control items (English words which were unrelated to the Chinese sounds for circle and square). However, ERP results (results from the recorded electrophysiological data) showed that the manipulation resulted in an N200 effect. The N200 effect is normally observed in situations in which conflicts of a linguistic or non-linguistic nature are the underlying cause. What the study by Wu and Thierry thus shows is that, although the Chinese-English bilinguals were told to ignore all the word trials and only respond to shapes, Chinese translations of the English words were nevertheless activated automatically and early (200–300 ms). The fact that this did not translate into a motor response and increased erroneous responses to critical word trials shows that the Chinese-English bilinguals were not necessarily aware of the co-activation and/or inhibited the Chinese equivalents. This interpretation is in line with the Inhibitory Control (IC) model (Green 2003) which predicts that the non-target language, i.e., the language which is not intended to be used in a given task, is inhibited to varying degrees. The 2013 study by Wu et al. showed very similar effects in an eye movement study. It is therefore reasonable to think that the failure to find a co-activation effect during reading for comprehension in the studies by Bajo and colleagues and Balling et al. is due to the fact that the behavioural dependent variables are not sensitive enough to detect (inhibited) co-activation during reading for comprehension.
3.1 Independent translation routes

García (2015) reviews 21 cases of pathologies in bilinguals who presented with disorders which affected their translation behaviour. Though limited, this evidence makes exciting neurofunctional predictions regarding the relationship between languages in bilinguals. The most interesting of these hypotheses is that “Lexical translation routes are independent from those supporting monolingual production” (García 2015: 131). In other words, the suggestion is that there are connections or networks which are exclusively used for translation and not for monolingual language use. The evidence regarding this hypothesis comes from patients who were e.g. unable to spontaneously use one of their languages, but were able to translate from or into it. If it is confirmed that some form of translation route is independent of monolingual language use, this would explain how translators and interpreters are able to navigate the competing demands of a linguistic system which is essentially non-selective and which inhibits the SL to some degree, while still allowing it to be used for reading or listening and while activating the TL only rather than also the SL for production. This argument must remain speculative, given current evidence, but, should it find further support, it is entirely possible to argue that the unique and repeated exposure to translation or interpreting tasks may strengthen and possibly expand the nature of these translation routes which are independent of monolingual language use and which are co-extensive with bilingualism. It is further possible to hypothesise that these routes are likely faster than those routes which are also active during monolingual language use, because they do not face the competing demands emerging from an essentially non-selective system which needs to inhibit the non-intended language system.
In addition to the speed and strength of these translation routes, a third hypothesis may be articulated: it is possible that lexical items which are translated very frequently in the same way (low HTra) may result in better established translation routes than items which are translated in different ways when encountered in context. In other words, the strength and availability of these routes may be a function of their semantic overlap. So far, only reception-related processes have been considered, but, as will be shown in the remaining sections, translation also has an effect on typing behaviour.

4 Monolingual text production and translation

Very few studies have systematically studied the difference between monolingual writing and typing during translation – in terms of the cognitive process and on the basis of behavioural data (as mentioned above, corpus-based translation studies have investigated the differences between original and translated texts successfully and extensively). The studies by Immonen (Immonen & Mäkisalo 2010; Immonen 2006; 2011) are a notable exception. Immonen (2006) had 18 Finnish professional translators carry out two tasks. In the first task, participants wrote a short original text in their L1 (Finnish): an informative presentation based on a brochure which was a guide for those planning a career in the European Commission. The second task consisted of a translation of a text from English (L2) into Finnish (L1). The ST for the translation task was similar in register and topic – it was a text about the unity of the EU and had been used in exams for translators applying for a post at the EU. No particular brief was given for the translation task apart from the requirement that the translation should be of publishable quality. Both tasks were recorded with the keylogging software Translog (Jakobsen & Schou 1999).
One obvious difference between writing an original text and translation was that, at least on the basis of the raw means, participants spent proportionally more time drafting during original production (73%) than during translation (63%). Participants also spent less time revising after writing the original text (11%) than after drafting the translation (24%). Immonen (2006: 323) classified all pauses according to where on the linguistic hierarchy they occurred: preceding a paragraph, preceding a sentence, within a clause, preceding a word, at a compound boundary within a word, preceding a syllable within a word and within a word other than at the compound or syllable boundary. Of course, a pause preceding a paragraph is also a pause preceding a word and a sentence, but Immonen always assigned a pause to the highest possible level of the hierarchy of linguistic categories. So a pause at the beginning of a paragraph is a pause preceding a paragraph (the highest rank), not a sentence or a word.

Immonen found that the distribution of pause lengths was similar in original writing and translation in that the higher up in the linguistic hierarchy the pauses occurred, the longer they were in both tasks. However, pauses within a word (both at the syllable boundary and elsewhere word medially) were significantly longer during translation than during original text production. Pauses between words were also significantly longer during translation than during original text production. At the sentence and paragraph boundaries, by contrast, pauses during original text production were significantly longer than during translation. Immonen (2006: 333) argues that macro-level planning may be the driving force behind the longer pauses during original text production at the higher levels of the linguistic hierarchy, given that pauses between paragraphs and sentences are mainly used for this kind of planning.
During translation, macro-level planning may be less important. Decisions between a number of possible lexical items and between different word orders or other syntactic choices may be more effortful during translation than during original text production and hence lead to longer pauses at the lower levels of the linguistic hierarchy, where these choices are relevant. 28 professional translators participated in the study by Immonen (2011). Participants carried out the same tasks as in the previous study. A very similar pause classification as that in the previous study was used. In the 2011 study, Immonen defines a processing unit by comparing the pause lengths at the different levels of the linguistic hierarchy for each participant. If the pause lengths to adjacent levels of the linguistic hierarchy did not significantly differ from each other, then they were grouped together. Results showed that grouped processing units were very different in the two tasks. Immonen (2011: 243) thus concludes that “processing units in translation cannot be predicted from the profile in monolingual text production”. Immonen clusters the different linguistic levels into three further groups according to what kind of processing takes place: textual (paragraphs and sentences), lexical and syntactic (clauses, phrases and words) and word medial processes. In terms of textual processing, monolingual processing and translation were not significantly different. 
The most interesting difference between the two tasks was in terms of syntactic processing: on the basis of the clustering, Immonen (2011) concludes that in monolingual text production, “the weight of syntactic level processing is carried by clauses and words” (244) while in translation, “the emphasis of syntactic processing is on phrases and words.” (245) At the textual level, processing clusters were more varied during monolingual text production than during translation, while in the syntactic clusters, the opposite was the case. Immonen therefore suggests that control over processes is stronger at lower levels during translation and that processing occurs in smaller units during translation.

The studies by Immonen compared writing of an original text with translation. However, copying may be a better comparison, given that a copyist, like a translator, has no control over the content of the text that is being produced. Carl & Dragsted (2012) show, on the basis of an implemented model, that copying can be very similar to translation. Carl and Dragsted show that the model by John (1996) predicts the time a copyist needs to produce a segment with an error rate of less than 5% when the segment is easy to comprehend. However, John’s model does not predict extensive re-reading, while the examples Carl and Dragsted provide show that copyists do exhibit such behaviour when the segment is difficult to comprehend. Translation by professionals of easy segments can also be predicted with an error rate of less than 5% by John’s model, while translation of segments which are difficult exceeds the production time predicted by John’s model.
In sum, the study by Carl and Dragsted suggests that copying may provide a good contrast to translation because it involves coordination of input and output in a similar manner to how eye movements and typing activity need to be coordinated during translation, so that a difference in the behaviour may be attributed to the involvement of two linguistic systems rather than one. The next sections will show that traditional eye movement measures are not adequate for the description of the extensive re-reading behaviour typical for translation, as observed by the studies discussed so far.

5 Beyond the first run

The dependent variables typically employed in eye movement studies of reading are all based on the assumption that a reader moves from left to right (or from right to left in languages such as Hebrew) in a fairly linear manner. The fundamental criterion for defining dependent variables is what is called a first run. A first run describes a more or less sequential progression through the sentence. A first run is interrupted when a regression to an earlier word is made. All early eye movement measures are defined in relation to the first run: the first fixation duration is the duration of the first fixation on word_n, which ends when the eyes move to an earlier word_{n-m} or to a later word_{n+m}, or when word_n itself is refixated. The probability that a word is skipped is also defined on the basis of a first run, i.e., if word_n is fixated, word_{n+1} is not, while word_{n+m} is, then word_{n+1} is defined as a skipped word, even if it is fixated in a later run. The same applies to gaze duration: this measure is the sum of all fixations on word_n before a word_{n±m} is fixated. Later eye movement measures typically include the spillover duration, i.e., the time spent on (a number of) word(s)_{n+m}, the probability of a regression, the second pass duration and total reading time.
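The first-run logic behind these measures can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not the procedure of any particular eye-tracking package: fixations are assumed to be already mapped to word indices, reading runs left to right, and the names `Fixation` and `first_run_measures` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    word: int      # index of the fixated word (assumed already mapped)
    duration: int  # fixation duration in ms

def first_run_measures(fixations, n_words):
    """Per-word first fixation duration, gaze duration, skip flag and total reading time."""
    ffd = [None] * n_words      # first fixation duration (first run only)
    gaze = [None] * n_words     # gaze duration (first run only)
    skipped = [False] * n_words
    total = [0] * n_words       # total reading time, insensitive to order
    furthest = -1               # right-most word fixated so far
    open_word = None            # word whose first-run gaze is still accumulating
    for fix in fixations:
        w = fix.word
        total[w] += fix.duration
        if w > furthest:
            # Words jumped over on the way to w count as skipped,
            # even if they are fixated in a later run.
            for s in range(furthest + 1, w):
                skipped[s] = True
            ffd[w] = gaze[w] = fix.duration
            furthest = w
            open_word = w
        elif w == open_word:
            gaze[w] += fix.duration   # refixation before leaving the word
        else:
            open_word = None          # regression or later-run fixation
    return {"ffd": ffd, "gaze": gaze, "skipped": skipped, "total": total}

# Illustrative (invented) fixation sequence over a four-word sentence.
example = [Fixation(0, 200), Fixation(1, 180), Fixation(1, 120),
           Fixation(3, 250), Fixation(2, 300), Fixation(3, 150)]
result = first_run_measures(example, n_words=4)
```

In this invented sequence, the third word is skipped in the first run but still accumulates total reading time from a later regression, which is exactly the kind of late event that first-run measures do not register.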
Probability of a regression in refers to a situation in which an eye movement is made from a wordn+m to a wordn – so here again, a regression in is defined as a deviation from a linear, more or less sequential progression through the sentence. The second pass duration consists of the sum of fixations which were registered during the second run – if there was one. Total reading time, however, is entirely insensitive to the sequence of eye movement events and simply describes the sum of all fixations on a word irrespectively of when they occurred. The measures described above have also been applied to areas of interest covering several words. The eye movement measures described above have been very useful for the description of early effects of the text that is being read on how it is processed. However, previous studies (e.g. Jakobsen & Hvelplund Jensen 2008; Schaeffer et al. 2016) have found that reading for translation is especially intense during the later stages of reading. This may have several reasons. On the most basic level, it may have to do with the fact that reading for comprehension is often investigated using single sentences which normally do not form a coherent text: when single sentences are presented one at a time, rereading of earlier text is of course impossible, resulting thus in potentially fewer late eye movement events. During translation, a number of other processes co-occur which may result in more and later eye movement events: typing and the presence of two texts (the ST and the emerging TT). In addition, the ST and the TT are of course in two different languages. During translation, reading occurs typically as a succession of eye movements in the source text followed by eye movements on the target text and shifts from one text to the other are relatively frequent, as is re-reading of already read source and target text (e.g. Jakobsen & Hvelplund Jensen 2008; Hvelplund Jensen 2011). 
A very rough indication of the importance of late events during reading for translation may be the average total reading time. Kliegl et al. (2004) report a mean total reading time per word during reading for comprehension of 245ms (SD = 48). A subset of the TPR-DB shows that during (monolingual) copying the mean total reading time per word on the source text is 797ms (SD = 1068). During translation, however, the mean total reading time per word on the source text is 1577ms (SD = 5824). There have been attempts (Hyönä et al. 2003) to develop late eye movement measures which are more adequate for the description of global text processing. However, these eye movement measures, while extending the ones described above, still depart from a first run and, crucially, cannot do justice to the complexities of translation, because they involve one rather than two intimately related texts and because they do not take the relationship of eye movements to typing behaviour into account. The next section will describe an eye movement measure which addresses these shortcomings.

6 The eye-key span

Dragsted (Dragsted & Hansen 2008; Dragsted 2010) developed the eye-key span (EKS) in reference to the ear-voice span which is used to describe the distance between input and output during simultaneous interpreting, typically measured in words or seconds (e.g. Defrancq 2015). While translators do not have the same time pressure as simultaneous interpreters, it is nevertheless the case that translators have to coordinate input and output similarly to copyists and simultaneous interpreters. The eye-key span describes the time that elapses between the first or last time an ST word is fixated before the first key is pressed which contributed to the production of the equivalent TT word(s) (Dragsted 2010: 51). Dragsted & Hansen (2008) found that difficult words result in longer eye-key spans than easy words.
The difficulty of the words is described in terms of the number of alternative translations different translators produced for the same source text words. Easy words were translated the same way by all translators and difficult words were translated differently by nearly all translators in the sample. However, only three ST words were analysed and only eight translators participated in the study. Dragsted (2010) also found that professional translators have a shorter eye-key span than student translators. The next sections will present analyses from the TPR-DB, which were designed to replicate and extend the findings from Dragsted (Dragsted & Hansen 2008; Dragsted 2010).

6.1 The dependent variable for experiment 1a and 1b

The EKS was calculated from the first fixation. Only the drafting phase was included, i.e., both orientation and revision were excluded from the analysis. Figure 2 visualises the eye-key span for the ST word “flaring” in the segment “His withdrawal comes in the wake of fighting flaring up again in Darfur…” which has been translated into German.

Figure 2: Progression graph exemplifying the eye-key span from first fixation on the ST word to first keystroke of the equivalent expression.

In Figure 2, the horizontal axis represents time in ms. The left vertical axis represents the ST and the right vertical axis the TT. Blue dots are fixations on the ST, keystrokes are black (insertions) and red (deletions), while fixations on the TT are green diamonds. The first fixation on the ST word “flaring” occurs at around the time of 487,000 during a first, relatively linear reading of the segment. The segment is read again in a far less linear manner before TT production of this segment begins around the time 542,000. The eye-key span (EKS) for this word is therefore roughly 55 seconds.
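The arithmetic behind the Figure 2 example can be written out as a short sketch. The function name `eye_key_span` and its input format are assumptions made for illustration; they do not reproduce the TPR-DB implementation. The first fixation time (487,000) and the first keystroke time (542,000) are the approximate values read off Figure 2; all other timestamps are invented.

```python
def eye_key_span(st_fixation_times, tt_keystroke_times, from_first_fixation=True):
    """Time (ms) from an ST-word fixation to the first keystroke of its aligned TT item.

    from_first_fixation=True  -> span from the first fixation on the ST word
    from_first_fixation=False -> span from the last fixation before typing starts
    """
    first_key = min(tt_keystroke_times)
    # Only fixations that precede the first keystroke can open the span.
    earlier = [t for t in st_fixation_times if t <= first_key]
    if not earlier:
        return None  # the word was never fixated before it was typed
    start = min(earlier) if from_first_fixation else max(earlier)
    return first_key - start

# "flaring": first fixation around 487,000 ms, typing starts around 542,000 ms.
# The intermediate fixation times and the second keystroke are made up.
fixations_on_flaring = [487_000, 520_000, 538_500]
keystrokes_for_translation = [542_000, 542_180]

eks_first = eye_key_span(fixations_on_flaring, keystrokes_for_translation)
eks_last = eye_key_span(fixations_on_flaring, keystrokes_for_translation,
                        from_first_fixation=False)
```

Measured from the first fixation, as in experiments 1a and 1b, the span is about 55 seconds; measured from the last fixation before typing, it would be far shorter, which is why the choice of starting point has to be reported.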
From a first contact with the word, the translator needs to re-read the ST segment twice before they are in a position to produce an equivalent TT item. The aim of experiment 1a was firstly to replicate the findings from Dragsted & Hansen (2008) and Dragsted (2010) in a larger sample involving more participants and target languages. Secondly, the aim was to find factors which can predict the EKS during translation. The aim of experiment 1b was to test how the EKS during copying differs from the EKS during translation.

6.2 Experiment 1a: Data, participants and procedure

For experiment 1a, the following studies were used: ACS08, BD08, BD13, BML12, KTHJ08, MS12, NJ12, SG12. The SL for all these studies is English and the TLs are Danish, Spanish, Chinese, Hindi, and German. Together, these constitute 12,474 ST words, 3,242 unique ST items, 108 participants and 12 different STs. The task was always translation.

6.3 Data Analysis

For all the analyses in the present study, R (R Core Team 2014) and the lme4 (Bates et al. 2014) and languageR (Baayen 2013) packages were used to fit (generalised) linear mixed-effects models ((G)LMEMs). To test for significance, the R package lmerTest (Kuznetsova et al. 2014) was used, which implements ANOVA for mixed-effects models using the Satterthwaite approximation to estimate degrees of freedom. Data points which were more than 2.5 standard deviations above or below a participant’s mean for the dependent variable were excluded. This resulted in the exclusion of less than 4% of the data. The dependent variable (EKS) was transformed with the natural logarithm because it was not normally distributed. The final LMEM for the EKS had the following random variables: item, participant, text and target language. The predictors were:
• TokS.sg represents the number of words in a given ST segment.
• LenSWord represents the number of characters in a given ST word.
• The segments in each ST are numbered sequentially. STsegment represents this.
• The different texts in the TPR-DB are of comparable length (around 150 words), but they are not comparable in terms of the number of segments in each text. STseg_nbr therefore represents the number of sentences in each text.
• Given that Cross values can be either positive or negative, the absolute values of CrossS were used for this analysis.
• The only categorical variable in the analysis was whether participants were students or professionals.
• The variable HCross is calculated in the same way as HTra, but represents something different. HCross is determined on the basis of the probability that a given ST word has a particular Cross value. Given that there is considerable variance in the word orders of different translations of the same ST, HCross describes the distribution of these probabilities. The higher the value, the less likely it is that a number of different translations of the same ST item will have the same Cross value. This metric therefore represents both lexical and syntactic aspects in one value, given that, if the word order is different it is also likely that different lexical items are chosen.

Collinearity was assessed by inspecting variance inflation factors for the predictors; all values were low (<1.2), indicating that collinearity between predictors was not a problem. Initially, HTra was also in the model and it had a significant positive effect on EKS. However, its variance inflation factor was relatively high (1.96) and HTra was therefore excluded from the final model. Table 1 lists the effects of the predictor variables on EKS and Figure 3 visualises these effects.
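To make the HCross predictor more tangible, the following sketch computes an entropy over the Cross values that different translators produced for one ST word. It assumes that "calculated in the same way as HTra" means Shannon entropy with a base-2 logarithm over relative frequencies; the function name `hcross` and the example values are hypothetical.

```python
import math
from collections import Counter

def hcross(cross_values):
    """Entropy (bits) of the Cross values observed for one ST word across translations.

    Assumption: Shannon entropy over relative frequencies, base-2 logarithm.
    """
    n = len(cross_values)
    counts = Counter(cross_values)
    # p * log2(1/p) is the contribution of each distinct Cross value.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Invented Cross values for one ST word as produced by four translators.
same_order = hcross([1, 1, 1, 1])      # every translator chose the same word order
two_orders = hcross([1, 1, 2, 2])      # two equally frequent word orders
all_different = hcross([1, 2, 3, 4])   # every translator chose a different order
```

The value is zero when all translators realise the same relative word order and grows as their choices diverge, which matches the description that higher values make identical Cross values across translations less likely.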
Table 1: LMEM results for the effect of TokS.sg, LenSWord, STsegment, STseg_nbr, abs(Cross), Student and HCross on EKS (experiment 1a)

Predictor           Estimate          SE              t         p
Intercept           9.680             4.771 × 10^-1   20.288    1.71             ***
TokS.sg             1.187 × 10^-2     1.812 × 10^-3   6.550     6.95 × 10^-11    ***
LenSWord            5.259 × 10^-2     5.209 × 10^-3   10.095    <2.00 × 10^-16   ***
STsegment           1.753 × 10^-1     7.432 × 10^-3   23.582    <2.00 × 10^-16   ***
STseg_nbr           −8.620 × 10^-2    2.576 × 10^-2   −3.347    0.00295          **
abs(Cross)          1.542 × 10^-2     5.116 × 10^-3   3.014     0.00258          **
StudentYes          6.050 × 10^-1     2.383 × 10^-1   2.539     0.04770          *
HCross              1.989 × 10^-1     3.511 × 10^-2   5.666     1.62 × 10^-8     ***
StudentYes:HCross   −9.894 × 10^-2    4.112 × 10^-2   −2.406    0.01617          *

Figure 3: Visualisation of the effects of the predictor variables on EKS (experiment 1a)

6.4 Results of experiment 1a

The number of words in a segment (TokS.sg) had a positive effect on EKS. This might not be too surprising, given that if a translator first reads the whole segment before translating it, the EKS is naturally longer for longer segments. The number of characters in a word (LenSWord) had a positive effect. Word frequency also had a similar and highly significant effect on EKS, but only when word length was not included. This is not surprising, given that these two variables covary to a high degree. That word length or frequency should result in longer EKS is to be expected, given that it is more difficult to process long rare words than short frequent words. The sequential numbering of segments in the ST (STsegment) had a positive effect on EKS. The likelihood that a word situated further to the end is fixated long before it is translated may lead to this effect. The number of segments in a given ST (STseg_nbr) had a negative effect on EKS. This effect is to be seen in relation to the number of words in a segment. Given that all texts had a comparable length, longer segments, which were associated with longer EKS, result in fewer segments per text.
The length and number of segments in a text can therefore be seen as an indicator of the difficulty in translating it: the longer the segments, the more effortful. CrossS had a positive effect on EKS. Again, this would be expected, given that CrossS describes the distance (in number of words) between an ST item and the TT item to which it is aligned. The coordination of reading and writing is less effortful when ST and TT follow the same word order as opposed to a situation where they do so to a lesser extent. HCross had a positive effect on EKS. The higher the number of different, possible word orders, the more effortful is the coordination of reading and writing. This result extends those found in the study by Bangalore et al. (2016). However, in the latter study, the dependent variable was the total reading time on a segment. The current results localise the effect on a word level. Students had longer EKS than professionals. This suggests that the coordination of reading and writing while translating, in addition to all the other processes which take place during translation, is something which is acquired through practice. Additional analyses showed that HCross had an effect on the EKS of both students (t = 4.1, p < .001) and professionals (t = 5.7, p < .001). In addition, there was an interaction between HCross and professional status: HCross had a stronger effect on the EKS of professional translators than on that of students.

6.5 Experiment 1b: Data, participants and procedure

The study by Carl & Dragsted (2012) showed that when the text is easy to copy or translate, the behaviour in these two tasks is very similar. As pointed out earlier, traditional eye movement measures do not adequately capture the behaviour during translation. EKS may be one measure which can capture the effort that is associated with the coordination of reading and writing.
The same data that was used in the previous analysis was compared to data gathered during monolingual copying. One additional study was included here (HLR13), which does not have any information regarding the professional status of participants and was therefore not part of experiment 1a. The data comprised 24,684 ST words, 5,111 unique ST items, 158 participants, 15 different texts and the 5 TLs as in experiment 1a in addition to Estonian and English (for the copying task). The same random variables as those in experiment 1a were used. Outliers (< 4%) were determined in the same way as in the previous experiment.

6.6 Results of experiment 1b

Table 2 lists the effects of the predictors retained from experiment 1a, which were similar to those in the previous experiment: word length (LenSWord) had a positive effect and so did the position of a sentence in the text (STsegment). The effect of STseg_nbr remained negative after the inclusion of the monolingual copying data. CrossS was not included in this model, because for monolingual copying all Cross values are constant, i.e. 1. The number of words in a segment (TokS.sg) was only marginally significant after the inclusion of the data from the copying task and was therefore excluded. Figure 3 visualises the effects. As would be expected, the EKS is considerably shorter during copying (~3 seconds) as compared to translation (~60 seconds). However, the fact that there is an EKS during copying of non-negligible length suggests that copying and translation share a process which consists of coordinating reading and writing, at least to some degree, and the longer EKS during translation can therefore be seen as resulting from the involvement of two different linguistic systems.
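Because the EKS was log-transformed, the model estimates reported in Table 2 live on the log scale. The following sketch back-transforms the intercept (7.908) and the TaskTranslation estimate (2.715) to milliseconds. It assumes that the EKS was measured in ms before the transformation and it ignores all other predictors and the random effects, so the figures are rough orientation values rather than model predictions for any actual word.

```python
import math

# Fixed-effect estimates on the log(EKS) scale, taken from Table 2.
INTERCEPT = 7.908         # reference level: copying
TASK_TRANSLATION = 2.715  # additive shift for translation on the log scale

# Back-transform to the original scale (assumed to be milliseconds).
copy_ms = math.exp(INTERCEPT)
translation_ms = math.exp(INTERCEPT + TASK_TRANSLATION)

# An additive effect on the log scale is a multiplicative effect on the raw scale.
ratio = math.exp(TASK_TRANSLATION)

print(f"copying:     {copy_ms / 1000:.1f} s")
print(f"translation: {translation_ms / 1000:.1f} s")
print(f"translation EKS is about {ratio:.0f} times the copying EKS")
```

Under these assumptions the copying baseline comes out at just under three seconds, in line with the approximate value given in the text; the translation figure depends additionally on the remaining predictors, which are set to zero here.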
Table 2: LMEM results for the effect of LenSWord, STsegment, STseg_nbr and Task on EKS (experiment 1b)

Predictor         Estimate          SE              t         p
Intercept         7.908             6.443 × 10^-1   12.273    0.00149          **
LenSWord          4.913 × 10^-2     4.204 × 10^-3   11.686    <2.00 × 10^-16   ***
STsegment         1.927 × 10^-1     6.010 × 10^-3   32.057    <2.00 × 10^-16   ***
STseg_nbr         −9.776 × 10^-2    2.223 × 10^-2   −4.398    6.73 × 10^-5     ***
TaskTranslation   2.715             6.781 × 10^-1   4.004     0.04232          *

6.7 Concurrent ST reading and TT typing

EKS is only a rough measure which describes the temporal distance between a first contact with a word and the first keystroke which contributes to the production of an aligned TT item. What happens within this time frame remains unknown. In order to describe how a translator arrives at a translation for a given ST item by shifting visual attention between the ST and the emerging TT, eye movement measures different from those used traditionally need to be developed. One such measure describes the probability that the ST is read while the TT is being produced. Schaeffer & Carl (2013a: 184) argued that “instances of concurrent reading and writing during translation are indicative of automatic processes and shared representations.” In other words, the hypothesis was that, if the activation of shared semantic and/or syntactic representations results in a TT which is acceptable to target norms, and the monitor does not interrupt the tight coupling of reading and writing, the process as a whole is relatively automatic and ST reading may occur concurrently with TT production – at least to some degree. Experiment 2a was designed to test this hypothesis.

6.8 Experiment 2a: Data, Participants and procedure

The data for experiment 2a and 2b was essentially the same as that used in the previous experiments. However, for this experiment, the .pu files were used. A .pu file represents the information on the basis of a production unit (PU).
A PU is defined as a sequence of coherent keystrokes. The boundaries between different PUs are determined by the pauses between keystrokes: a pause of more than 1000ms constitutes a PU boundary. Carl & Kay (2011) found that at pause values below 1000ms, the resulting PUs were less linguistically plausible, i.e. they were more likely to divide individual words and alignment units (aligned ST and TT items). At higher pause values, the number of PUs per text was very small, so that the pause value of 1000ms was adopted in the TPR-DB as defining the boundaries between PUs. There were a total of 21,973 PUs, 110 participants and the same 5 TLs as in the previous experiments. The task was always translation.

6.9 The dependent variable for experiment 2a and 2b

The dependent variable for experiment 2a and 2b is binomial. It expresses the probability that the ST is fixated during TT production, i.e., during a PU. Figure 4 exemplifies this. The progression graph in Figure 4 shows the translation of the ST words “..investments in the Sudanese…” Striped boxes visualise PUs. There are two PUs in this graph: the translation of “investments in the” and “Sudanese”. During the first PU, while the translator is typing “die” (the), they already look at the next ST item (“Sudanese”). There are two fixations on this word before it is then typed in the second PU.

6.10 Results of experiment 2a

Given that the dependent variable for this experiment was binomial, generalised linear mixed-effects models (GLMEMs) were used. GLMEMs for the concurrent ST reading and TT writing had the following random variables: participant, study and TL. Item was not included as a random variable, because, of course, PUs are not the same across participants. When text was included as a random variable, the models did not converge. It accounted for the smallest amount of variance and was therefore excluded.
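The PU segmentation described in section 6.8 and the binomial dependent variable described in section 6.9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the TPR-DB code: keystrokes are reduced to timestamps, ST fixations to (start, end) intervals in ms, and the function names and example data are hypothetical.

```python
def segment_production_units(keystroke_times, pause_ms=1000):
    """Group keystroke timestamps (ms) into PUs; a pause > pause_ms opens a new PU."""
    units = []
    for t in sorted(keystroke_times):
        if units and t - units[-1][-1] <= pause_ms:
            units[-1].append(t)   # pause short enough: same PU
        else:
            units.append([t])     # first keystroke, or pause above threshold
    # Represent each PU by the times of its first and last keystroke.
    return [(u[0], u[-1]) for u in units]

def st_fixated_during(units, st_fixations):
    """1 if any ST fixation interval overlaps the PU, else 0 (the binomial outcome)."""
    return [
        int(any(f_start <= pu_end and f_end >= pu_start
                for f_start, f_end in st_fixations))
        for pu_start, pu_end in units
    ]

# Invented data: six keystrokes and two ST fixations.
keystrokes = [0, 200, 450, 1600, 1750, 3000]
st_fixations = [(300, 520), (2000, 2400)]

units = segment_production_units(keystrokes)
concurrent = st_fixated_during(units, st_fixations)
```

In the invented data, only the first PU overlaps with an ST fixation; the second fixation falls into a typing pause and therefore does not count as concurrent reading.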
The predictors were: professional status, i.e., whether a participant was a student or a professional. CrossT represents the distance in number of words between the TT and the ST, as counted while progressing in a linear and sequential fashion through the TT while searching for aligned ST items. The CrossT value for PUs is the average CrossT value for all the words in the PU. STsegment is the sequential numbering of segments in a given text and PuSTnbr is the number of ST words in a given PU.

Figure 4: Progression graph showing concurrent ST reading and TT production

Figure 5: Visualisation of the effect of predictor variables on the probability that ST reading occurs during a PU (experiment 2a)

Table 3: The effect of the predictor variables on the probability that ST reading occurs during a PU (experiment 2a)

Predictor           Estimate   SE     z value   Pr(>|z|)
Intercept           −0.80      0.32   −2.57     0.0102           *
StudentYes          −0.46      0.23   −2.03     0.0425           *
CrossT              −0.16      0.03   −5.22     1.75 × 10^-7     ***
STsegment           −0.06      0.01   −6.827    8.66 × 10^-12    ***
PuSTnbr             0.27       0.01   31.44     <2.00 × 10^-16   ***
StudentYes:CrossT   0.064      0.03   2.078     0.0377           *

Table 3 and Figure 5 show that as translators progress in the target text, they are less likely to read the ST while typing (the effect of STsegment). Concurrent ST reading and TT writing may be an indicator of the degree of co-activation of the two linguistic systems. Very much like during simultaneous interpreting, the translator processes input in one language at the same time as output is produced in a different language. The fact that this is more likely at the beginning of a text than towards the end may suggest several things: on the one hand, it may mean that, as the translator progresses in the text, they move closer towards the monolingual end of the bilingual continuum (Grosjean 1997).
It may, however, also mean that at the beginning of a translation, the translators need to engage in more concurrent reading and writing in order to co-activate relevant task schemas and semantic fields relevant to the text. The facilitation observed in all relevant traditional eye movement measures in the study by Schaeffer et al. (2016) would support such a view: towards the end of the text, the process is less effortful, because the translator is in a more monolingual mode and the extra demands emerging from co-activation are smaller. The fact that the number of ST words in a PU (PuSTnbr) has such a large effect on concurrent ST reading is hardly surprising: the longer a PU is the more likely it is that a translator will fixate the ST at least once. Both CrossS and CrossT had a significant and negative effect when entered separately. When both were entered, the model did not converge. CrossT had a stronger effect than CrossS and CrossS was therefore dropped. The negative effect of CrossT on concurrent reading and writing suggests that when the word order is similar in a stretch of ST and TT, processes are likely to be more automatic than when the word order is very different. Concurrent ST reading and TT typing is an early measure which also describes how well integrated the process is as a whole, i.e. how horizontal/parallel it is. The fact that CrossT had such a large effect on the dependent variable suggests that when the syntax in the ST and the TT is likely to overlap to a high degree (low CrossT values), then primed, shared syntactic representations serve as the basis for TT production. In addition, there was an interaction between professional status and CrossT such that when the CrossT values were very low, professionals were more likely to read and write concurrently. For higher CrossT values, on the other hand, professionals were less likely to read and write at the same time (see Figure 6).
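The interaction between professional status and CrossT can be illustrated by plugging the fixed-effect estimates from Table 3 into the inverse logit. This is a rough sketch under several assumptions: the predictors are taken to be on their raw scale (the text does not state whether they were centred or scaled), random effects are ignored, and STsegment and PuSTnbr are arbitrarily fixed at 1. The function name `p_concurrent` is hypothetical.

```python
import math

# Fixed-effect estimates (log-odds scale) from Table 3.
COEF = {
    "Intercept": -0.80,
    "StudentYes": -0.46,
    "CrossT": -0.16,
    "STsegment": -0.06,
    "PuSTnbr": 0.27,
    "StudentYes:CrossT": 0.064,
}

def p_concurrent(student, cross_t, st_segment=1, pu_st_nbr=1):
    """Probability that the ST is fixated during a PU (fixed effects only)."""
    eta = (COEF["Intercept"]
           + COEF["StudentYes"] * student
           + COEF["CrossT"] * cross_t
           + COEF["STsegment"] * st_segment
           + COEF["PuSTnbr"] * pu_st_nbr
           + COEF["StudentYes:CrossT"] * student * cross_t)
    return 1 / (1 + math.exp(-eta))  # inverse logit

# CrossT value at which the student and professional lines intersect:
# -0.46 + 0.064 * CrossT = 0
crossover = -COEF["StudentYes"] / COEF["StudentYes:CrossT"]

for cross_t in (1, 4, 8, 12):
    print(cross_t,
          round(p_concurrent(student=0, cross_t=cross_t), 3),   # professional
          round(p_concurrent(student=1, cross_t=cross_t), 3))   # student
```

Under these assumptions the two groups cross at a CrossT value of roughly 7: below it, professionals are the more likely group to read and write concurrently, above it the students are, which is the pattern the interaction describes.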
Figure 7 shows the distribution of CrossT values in the data. It is very obvious that the lower CrossT values are much more frequent. In other words, professionals are most of the time more likely to read and write concurrently, but when the text becomes more difficult, they are more sensitive to this than students are and they are more likely to work sequentially, i.e., more monolingually – of course not entirely monolingually, though.

Figure 6: Interaction of CrossT and professional status (experiment 2a)

Figure 7: Distribution of average CrossT values in PUs (experiment 2a)

6.11 Experiment 2b: Data, Participants and procedure

The data for experiment 2b was identical to the data in experiment 2a apart from the fact that the same copying data that was used in experiment 1b was also added. There were 28,226 PUs, 153 participants, 12 texts, 8 studies and 6 TLs. The tasks were translation and copying.

6.12 The dependent variable for experiment 2b

The dependent variable for experiment 2b was identical to the one in experiment 2a.

6.13 Results of experiment 2b

Table 4 summarises the effects of the predictor variables on the probability that some concurrent reading occurs during a PU and Figure 8 visualises these effects. The effect of both the position of a segment within a text (STsegment) and the number of ST words in a PU (PuSTnbr) in experiment 2b was similar to the effect in experiment 2a. The likelihood that concurrent ST reading and TT typing occurs was significantly higher during copying than during translation. This suggests that, while both copying and translation share some aspects, the involvement of two linguistic systems makes a more automated and horizontal process less likely.
Table 4: The effect of predictor variables on both translation and copying (experiment 2b)

                  Estimate   SE     z value   Pr(>|z|)
Intercept          0.35      0.55    0.63     0.527
STsegment         −0.07      0.01   −9.43     <2.00 × 10^−16 ***
PuSTnbr            0.27      0.01   38.53     <2.00 × 10^−16 ***
TaskTranslation   −1.62      0.61   −2.64     0.008 **

Figure 8: Visualisation of the effect of predictor variables on the probability that the ST is fixated during a PU (experiment 2b)

7 General discussion

Previous research has aimed to show that the target language is activated during source text reading and that translation is a horizontal process rather than a sequential, vertical process (Macizo & Bajo 2004; 2006; Ruiz et al. 2008; Jakobsen & Hvelplund Jensen 2008; Hvelplund Jensen et al. 2009; Winther Balling et al. 2014). What these studies have shown is that co-activation is task-dependent, at least if behaviour is observed. However, there is a large body of evidence which suggests that inhibition plays an important role in bilinguals as such (Kroll et al. 2015) and in translation (Macizo et al. 2010), and there is considerable evidence which suggests that not only is lexical access fundamentally non-selective, but production is also affected by competition between the two languages of the bilingual (de Groot & Starreveld 2015). Grosjean (1997) argued that a bilingual’s two languages are more or less active depending on the context. The studies reviewed here are consistent with this. It is very likely that translation increases the co-activation of the two linguistic systems to a high degree.
Rather than pitting the horizontal view of translation against the vertical one, the model proposed by Schaeffer & Carl (2013a) argued that translation is best understood as involving both early and late processes: early, relatively automatic processes which are highly bilingual in nature, and late processes which are more monolingual.

This chapter further argues that traditional eye movement measures cannot adequately describe the processes which are unique to the task of translation. The eye-key span (Dragsted & Hansen 2008; Dragsted 2010) and the degree to which ST reading and TT typing co-occur are measures that address this shortcoming. Schaeffer & Carl (2013a: 184) predicted that concurrent ST reading and TT typing is evidence of the activation of shared representations and automatic processing. The results presented here support this view.

Both the early and the late processes during translation are likely to be modulated by the degree to which SL and TL items share representation. The DFM (de Groot 1992) suggests that semantic overlap between two lexical items is a matter of degree. This model receives support from two eye movement studies (Schaeffer et al. 2016; 2017). The shared syntax account (Hartsuiker et al. 2004) predicts that syntax, if similar across languages, shares the same representation. The effect of word order differences (HCross) on the eye-key span and the effect of word order differences on the likelihood that ST reading and TT typing occur concurrently lend support to the shared syntax account. The measure HCross, introduced in the current chapter, lends further support to this notion and extends it in that it shows that when the word order in the ST and the TT is dissimilar, the eye-key span (EKS) is also shorter and fewer different word orders are observed. The shared syntax account is very well suited to explain priming effects.
However, when the choice of lexical item leads to required changes in the syntactic structure (and word order), and the more different word orders (and syntactic structures) are possible, these possibilities compete for selection and inhibit the translation process, resulting in a longer EKS. The shared syntax account predicts priming effects when the syntax is shared across the ST and TL, but makes no predictions about cases in which the degree of syntactic overlap is small. The present study quantifies and predicts the effects of such a situation in the form of the HCross metric.

The following sentence from the data may serve as an example: “As a result, full-time leaders, bureaucrats, or artisans are rarely supported by hunter-gatherer societies.” In the database, there are 26 translations into German of this text. In the appendix, we list seven versions which all use a different lexical item for the verb phrase [are supported]. The verb [supported] has a very high HCross value (3.57). Only one translation [schätzen (appreciate)] out of the seven shares a combinatorial node with the source, because [schätzen], just like [supported], is in the passive voice. All other lexical choices require additional changes in the syntactic structure of the target language sentence; some of the underlying syntactic choices are depicted in Figure 9. Differences in syntactic choices result in changes in word order.

There is the possibility that there are translation routes in bilinguals which are independent from monolingual processing routes. It was hypothesised that these might be faster, because they might be less susceptible to the competing demands of co-activation and inhibition, and that their strength and breadth might be modulated by practice. Both the eye-key span and concurrent reading and writing are modulated by expertise.
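The intuition behind the HCross metric referred to above can be illustrated with a short Python sketch. It assumes that HCross is a Shannon entropy (base 2) over the relative word-order positions (Cross values) observed for one ST word across alternative translations; this definition is stated here as an assumption rather than taken from the text above, and the function name word_order_entropy and the Cross values are invented for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def word_order_entropy(cross_values: Sequence[int]) -> float:
    """Shannon entropy (bits) of the observed Cross values for one ST word."""
    counts = Counter(cross_values)
    total = len(cross_values)
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Invented Cross values for one ST word across four translations each.
same_order = [1, 1, 1, 1]          # every translator chose the same word order
mostly_same = [1, 1, 1, -2]        # one translator deviated
all_different = [1, 2, -3, 4]      # every translator chose a different order

for label, values in [
    ("same_order", same_order),
    ("mostly_same", mostly_same),
    ("all_different", all_different),
]:
    print(label, round(word_order_entropy(values), 3))
```

On this reading, a value of zero means that all translators produced the same relative word order, and the value grows as more, and more evenly distributed, word orders are observed for the same ST word.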
This modulation by expertise could be seen as an indication that extended exposure to the task strengthens and widens access to independent translation routes, though this must remain speculative at present.

Finally, it is likely that the mechanisms underlying translation are shared to some degree by monolingual copying. Very few studies have systematically compared translation to monolingual language use. The existing findings are promising and both reading for comprehension and monolingual copying seem to be good contrasts. It further seems necessary to develop eye movement measures, such as the eye-key span and concurrent reading and writing, which do justice to the complexities of translation, particularly when the later processes are investigated. These later processes seem particularly relevant, simply because they so evidently distinguish reading for or while translating from reading for comprehension, while paraphrasing and while copying.

Figure 9: Item (support) with a high HCross value (3.57). Different lexical choices (into German) lead to different syntax, which in turn results in large differences in word order (see appendix). Based on the shared syntax account (Hartsuiker et al. 2004). In the shared syntax account, there is a shared conceptual level, a language node, a lemma node and combinatorial nodes. In this case, the overlap for combinatorial nodes is minimal (only [schätzen] shares two combinatorial nodes with [support]).

Appendix

ST As a result, full-time leaders, bureaucrats, or artisans are rarely supported by hunter-gatherer societies.
TT1 Folglich werden Führungspersönlichkeiten, Bürokraten oder Handwerker nur selten von Jägern und Sammlern geschätzt. (Passive)
TT2 Deshalb gibt es in Jäger-und Sammlergesellschaften meistens keine Personen, die nur Anführer, Bürokraten oder Kunsthandwerker sind. (Dummy Subject)
TT3 Daher unterhalten Jäger-Sammler-Gesellschaften nur selten hauptberufliche Anführer, Bürokraten oder Handwerker. (Active)
TT4 Daher ist in Jäger-und Sammlergesellschaften auch kein Platz für Anführer, Bürokraten oder Handwerker, die ansonsten keine Aufgaben übernehmen. (Copula)
TT5 Daher kommen in Jäger-und-Sammler-Gesellschaften kaum Bürokraten, Handwerker oder Personen vor, die ihre gesamte Zeit als Anführer verbringen. (Active)
TT6 Dies ist der Grund dafür, dass man hier auch kaum Personen in ständiger Führungsposition und Künstler findet. (Dummy Subject)
TT7 Dementsprechend leisten sich solche Gesellschaften auch selten den Luxus, Berufspolitiker, Bürokraten oder Kunsthandwerker zu unterhalten. (Reflexive)

References

Baayen, R. Harald. 2013. languageR: Data sets and functions with analyzing linguistic data: A practical introduction to statistics. Tech. rep. http://cran.r-project.org/package=languageR.
Bangalore, Srinivas, Bergljot Behrens, Michael Carl, Maheshwar Ghankot, Arndt Heilmann, Jean Nitzke, Moritz Schaeffer & Annegret Sturm. 2016. Syntactic variance and priming effects in translation. In Michael Carl, Srinivas Bangalore & Moritz Schaeffer (eds.), New directions in empirical translation process research: Exploring the CRITT TPR-DB, 211–238. Cham: Springer.
Bates, Douglas, Martin Maechler, Ben Bolker & Steven Walker. 2014. lme4: Linear mixed-effects models using Eigen and S4. http://cran.r-project.org/package=lme4.
Carl, Michael & Barbara Dragsted. 2012. Inside the monitor model: Processes of default and challenged translation production. Translation: Corpora, Computation, Cognition 2(1). 127–145.
Carl, Michael & Martin Kay. 2011. Gazing and typing activities during translation: A comparative study of translation units of professional and student translators. Meta 56(4). 952–975.
Carl, Michael, Moritz Schaeffer & Srinivas Bangalore. 2016. The CRITT translation process research database. In Michael Carl, Srinivas Bangalore & Moritz Schaeffer (eds.), New directions in empirical translation process research: Exploring the CRITT TPR-DB, 13–54. Cham: Springer.
de Groot, Annette M. B. 1992. Determinants of word translation. Journal of Experimental Psychology: Learning, Memory, and Cognition 18(5). 1001–1018.
de Groot, Annette M. B. & Peter A. Starreveld. 2015. Parallel language activation in bilinguals’ word production and its modulating factors: A review and computer simulations. In John Schwieter (ed.), The Cambridge handbook of bilingual processing, 389–415. Cambridge: Cambridge University Press.
Defrancq, Bart. 2015. Corpus-based research into the presumed effects of short EVS. Interpreting 17(1). 26–45.
Dragsted, Barbara. 2010. Coordination of reading and writing processes in translation: An eye on uncharted territory. In Gregory M. Shreve & Erik Angelone (eds.), Translation and cognition, 41–62. Amsterdam: Benjamins.
Dragsted, Barbara & Inge Gorm Hansen. 2008. Comprehension and production in translation: A pilot study on segmentation and the coordination of reading and writing processes. In Susanne Göpferich, Arnt Lykke Jakobsen & Inger M. Mees (eds.), Looking at eyes: Eye-tracking studies of reading and translation processing, vol. 36, 9–29.
García, Adolfo M. 2015. Translating with an injured brain: Neurolinguistic aspects of translation as revealed by bilinguals with cerebral lesions. Meta 60(1). 112–134.
Green, David W. 2003. Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition 1(2). 67–81.
Grosjean, François. 1997. The bilingual individual. Interpreting - International Journal of Research and Practice in Interpreting 2. 163–187.
Hansen, Gyde. 2008. The dialogue in translation process research. In Translation and cultural diversity: Selected proceedings of the XVIII FIT world congress, 386–397. Shanghai: Foreign Language Press.
Hansen-Schirra, Silvia, Stella Neumann & Erich Steiner. 2012. Cross-linguistic corpora for the study of translations: Insights from the language pair English–German. Berlin & Boston: de Gruyter.
Hartsuiker, Robert J., Martin J. Pickering & Eline Veltkamp. 2004. Is syntax separate or shared between languages? Cross-linguistic syntactic priming in Spanish–English bilinguals. Psychological Science 15(6). 409–414.
Hvelplund Jensen, Kristian Tangsgaard. 2011. Allocation of cognitive resources in translation: An eye-tracking and key-logging study. Copenhagen Business School dissertation.
Hvelplund Jensen, Kristian Tangsgaard, Annette C. Sjørup & Laura Winther Balling. 2009. Effects of L1 syntax on L2 translation. In Fabio Alves, Susanne Göpferich & Inger M. Mees (eds.), Methodology, technology and innovation in translation process research: A tribute to Arnt Lykke Jakobsen, 319–336. Copenhagen: Samfundslitteratur.
Hyönä, Jukka, Robert F. Lorch & Mike Rinck. 2003. Eye movement measures to study global text processing. In Jukka Hyönä, Ralf Radach & Heiner Deubel (eds.), The mind’s eye: Cognitive and applied aspects of eye movement research, 313–334. Amsterdam: Elsevier.
Immonen, Sini. 2006. Translation as a writing process: Pauses in translation versus monolingual text production. Target 18(2). 313–336.
Immonen, Sini. 2011. Unravelling the processing units of translation. Across Languages and Cultures: A Multidisciplinary Journal for Translation and Interpreting Studies 12(2). 235–257.
Immonen, Sini & Jukka Mäkisalo. 2010. Pauses reflecting the processing of syntactic units in monolingual text production and translation. Hermes – Journal of Language and Communication Studies 44(44). 45–61.
Jakobsen, Arnt Lykke. 2011. Tracking translators’ keystrokes and eye movements with Translog. In Cecilia Alvstad, Adelina Hild & Elisabet Tiselius (eds.), Methods and strategies of process research: Integrative approaches in translation studies. Amsterdam: Benjamins.
Jakobsen, Arnt Lykke & Kristian Tangsgaard Hvelplund Jensen. 2008. Eye movement behaviour across four different types of reading task. In Susanne Göpferich, Arnt Lykke Jakobsen & Inger M. Mees (eds.), Looking at eyes: Eye-tracking studies of reading and translation processing, vol. 36, 103–124. Copenhagen: Samfundslitteratur.
Jakobsen, Arnt Lykke & Lasse Schou. 1999. Translog documentation. In Gyde Hansen (ed.), Probing the process in translation: Methods and results, 1–36. Frederiksberg: Samfundslitteratur.
John, Bonnie E. 1996. Typist: A theory of performance in skilled typing. Human-Computer Interaction 11(4). 321–355.
Kliegl, Reinhold, Ellen Grabner, Martin Rolfs & Ralf Engbert. 2004. Length, frequency, and predictability effects of words on eye movements in reading. European Journal of Cognitive Psychology 16(1-2). 262–284.
Kroll, Judith F., Jason W. Gullifer, Rhonda McClain & Eleonora Rossi. 2015. Selection and control in bilingual comprehension and production. In John W. Schwieter (ed.), The Cambridge handbook of bilingual processing, 485–507. Cambridge: Cambridge University Press.
Kroll, Judith F. & Erika Stewart. 1994. Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language 33(2). 149–174.
Kuznetsova, Alexandra, Rune Haubo Bojesen Christensen & Per Bruun Brockhoff. 2014. lmertest: Tests for Random and Fixed Effects for Linear Mixed Effect Models (lmer Objects of lme4 Package). R package version 2.0-6. http://www.cran.r-project.org/package=lmerTest/.
Macizo, Pedro & Mª Teresa Bajo. 2004. When translation makes the difference: Sentence processing in reading and translation. Psicológica 25. 181–205.
Macizo, Pedro & Mª Teresa Bajo. 2006. Reading for repetition and reading for translation: Do they involve the same processes? Cognition 99(1). 1–34.
Macizo, Pedro, Mª Teresa Bajo & María Cruz Martín. 2010. Inhibitory processes in bilingual language comprehension: Evidence from Spanish–English interlexical homographs. Journal of Memory and Language 63(2). 232–244.
R Core Team. 2014. A language and environment for statistical computing. Vienna. http://R-project.org/.
Ruiz, Carmen, Natalia Paredes, Pedro Macizo & Mª Teresa Bajo. 2008. Activation of lexical and syntactic target language properties in translation. Acta psychologica 128(3). 490–500.
Schaeffer, Moritz & Michael Carl. 2013a. Shared representations and the translation process: A recursive model. Translation and Interpreting Studies 8(2). 169–190.
Schaeffer, Moritz & Michael Carl. 2013b. Shared representations and the translation process: a recursive model. Translation and Interpreting Studies. The Journal of the American Translation and Interpreting Studies Association 8(2). 169–190.
Schaeffer, Moritz, Barbara Dragsted, Laura Winther Balling & Michael Carl. 2016. Word translation entropy: Evidence of early target language activation during reading for translation. In Michael Carl, Srinivas Bangalore & Moritz Schaeffer (eds.), New directions in empirical translation process research: Exploring the CRITT TPR-DB, 183–210. Cham: Springer.
Schaeffer, Moritz, Kevin Paterson, Victoria A. McGowan, Sarah J. White & Kirsten Malmkjær. 2017. Reading for translation. In Arnt Lykke Jakobsen & Bartolome Mesa-Lao (eds.), Translation in transition, 18–54. Amsterdam: Benjamins.
Seleskovitch, Danica. 1976. Interpretation: A psychological approach to translating. In Richard W. Brislin (ed.), Translation: Applications and research, 92–116. New York: Gardner.
Shreve, Gregory M., Christina Schaffner, Joseph H. Danks & Jennifer Griffin. 1993. Is there a special kind of “reading” for translation? Target 5(1). 21–41.
Vauquois, Bernard. 1968. A survey of formal grammars and algorithms for recognition and transformation in machine translation. In A.J.H. Morrell (ed.), Proceedings of the IFIP Congress-68, 254–260. Edinburgh: North-Holland.
Winther Balling, Laura, Kristian Tangsgaard Hvelplund Jensen & Annette Camilla Sjørup. 2014. Evidence of parallel processing during translation. Meta 59(2). 234–259.
Wu, Yan Jing, Filipe Cristino, Charles Leek & Guillaume Thierry. 2013. Nonselective lexical access in bilinguals is spontaneous and independent of input monitoring: Evidence from eye tracking. Cognition 129(2). 418–425.
Wu, Yan Jing & Guillaume Thierry. 2012. Unconscious translation during incidental foreign language processing. NeuroImage 59(4). 3468–3473.