Anke Schulz
Abstract
This paper describes work in progress on a corpus-based study, comparing seemingly similar
registers in two languages: English and German newsgroup texts, collected in the Bremen
Translation Corpus. Systemic Functional Grammar (SFG, Halliday 1994 [1985]) provides a
theoretical framework for categorizing empirical findings. I will focus on three systems of the
finite verbal group, i.e. tense, modality and polarity, to describe these registers on the textual
and the interpersonal metafunction levels.
The use of tense, apart from its logical function, can also reflect the textual
metafunction, since different tenses are preferred in spoken and written discourse (see Biber
et al. 1999 or Duden 2005). Do newsgroup text authors favour one tense over the other? And
is tense choice the same in both languages? On the interpersonal metafunction level, I
analyse the form and function of modal auxiliaries, and look at how modal auxiliaries
combine with process types. I then investigate the position of the syntactic negation markers
in English and German clauses.
My aim is to provide a thorough description of the realization of finite verbal groups
in English and German newsgroup texts as a preliminary step towards research on variation
in parallel translations using the Bremen Translation Corpus.
The theoretical background provided by SFG is explained in the next section. Section
2 will give a description of the design and purpose of my corpus, the Bremen Translation
Corpus (BTC), which consists of newsgroup texts in English and in German and five parallel
translations of each text into the other language. In that section I also give a brief outline of
how I processed the data.
In section 3, some first exploratory results will be presented. Part 3.1 focuses on the
frequency of different tense forms in the original newsgroup texts. I will argue that, apart
from the logical function of tense forms, the frequency of tense forms reflects to a certain
extent whether we are dealing with written or spoken texts, making tense a feature of the
textual metafunction as well as of the interpersonal metafunction. As a basis for the transfer
comparison of the distribution of tense forms in different registers, I refer to the Longman
Grammar of Spoken and Written English (LGSWE) (Biber et al. 1999). The LGSWE uses a 40
million word corpus, the LSWE corpus, to compare tense form distribution in different written
and spoken registers. With an investigation of the tense form distribution in my corpus of
newsgroup texts I hope to show the affinity of these texts to either written or spoken
discourse.
Section 3.2 then deals with modality, more specifically with the function of modal
auxiliaries in my English and German texts, comparing the results from the two language
corpora. Do English and German authors express modality by the same means, i.e. modal
auxiliaries, and to the same extent? I will also look at the kind of process types (following the
Cardiff Grammar categorisation) that occur in combination with a modal auxiliary.
In section 3.3 I am concerned with the position of syntactic negation markers in the English
and German clause complexes. Where in a clause are negation markers typically placed in my
newsgroup text corpus?
A conclusion is provided in section 4, with an evaluation of the work presented here as
well as an outlook into future work following in section 5.
comparing English and German. Following Caffarel et al. (2004: 15), I will take the practical
heuristic approach in applying the method of transfer comparison:
However, the type of approach where no assumptions are made based on other
languages and where the description of the lexicogrammatical system is built up
from observations of discursive instances takes a considerable amount of time, so
as a practical heuristic, it may be helpful to model the description of one language
on the description of another - this is the method of transfer comparison [...].
(Caffarel et al. 2004: 15)
I will describe the realization of finite verbal groups in the corpus of English and German
newsgroup texts, taking descriptions of English as the starting point. The focus will be on the
textual and interpersonal metafunctions of SFG.
Traditional grammars, e.g. Biber et al. (1999) or Duden (2005), see tense as being
used differently in spoken and written discourse. In addition to their function on the ideational
and interpersonal levels, we can therefore consider finite verbal groups as a component of the
textual metafunction in SFG. Among other things, the interpersonal metafunction is reflected
in the use of modal operators and negation markers. With these the authors of the newsgroup
texts reveal their stance towards the problems they discuss. Due to limitations of space the
ideational metafunction will not be considered in this paper.
In their paper Metafunctional profile of the grammar of German, Steiner & Teich
(2004) describe modality and polarity in German, though they only dedicate one page to it.
Still they postulate a difference in the realization of modality between English and German:
[Figure: Design of the Bremen Translation Corpus. Original German and original English newsgroup texts on two topics (eating disorder, relationship problems), each with five parallel translations (Translation 1 to Translation 5).]
The need for a corpus of parallel translations has been expressed before, e.g. by Mauranen
(2002: 166):
One problem with translation corpora that has been pointed out by Malmkjær
(1998) is that they only provide one translation solution for every SL instance,
which conceals the variation in translations that would ensue if we had available
versions of the same source text by different translators.
With a corpus of just one translation of one original text, as for example in the English-Norwegian Parallel Corpus (see Johansson & Hofland 1994), it is not possible to answer
questions such as: how much variation occurs when different people translate the same text?
Where in the text/sentence/clause does the variation occur? Can variation in parallel
translations be explained by differences in the language systems, or is it due to differences in
language use? The BTC was built to make linguistic research on all these aspects
(and more) possible. The following examples (1, 2) show two sentences from the corpus and the five
corresponding parallel translations:
(1) Ich denk-e mir halt,
    I think-1SG.PRS.TR me-PRON.1SG.DAT just-ADV,
    I just think,
    jed-e bereits vergeben-e Frau
    every-DET.3SG.F already spoken_for-ADJ.3SG.F.NOM woman-NN.NOM
    every woman who is already spoken for
    müsste bei mein-en Annäherungsversuch-en
    should-AUX.SBJV at my-DET.DAT.PL advances-NN.DAT.PL
    should immediately put a stop
    sofort Einhalt gebieten.
    immediately put_a_stop_to-VB
    to my advances.
A I just think that this woman has to reject my advances
B I just think that every unavailable woman must immediately stop my approaches
C I had thought that every already-taken women must immediately back off at my
attempts to get closer
D I think that every woman who is already attached should react to advances
immediately
E I think every woman who's spoken for has always left me hanging when I tried
something.
C Klar will ich die negativ-en
  Of_course want_to-AUX.IND I the-DET.ACC.PL negative-ADJ.ACC.PL
  Gewohnheit-en irgendwann mal aufgeben.
  habits-NN.ACC.PL sometime or other give_up-VB.
  Of course I want to give up the negative habits sooner or later.
D Ach ja, ich möchte schon gern
  Well yes, I want_to-AUX.IND certainly-ADV like_to-ADV
  eines Tages die negativ-en Gewohnheit-en
  one day the-DET.ACC.PL negative-ADJ.ACC.PL habits-NN.ACC.PL
  aufgeben.
  give_up-VB.
  Well, certainly I want to give up the negative habits someday.
As a starting point I focus only on the original texts to see whether the use of tense, modality
and polarity is different in English and in German. The results can serve in future research as
grounds for a comparison of the parallel translations.
In addition to everything that could be worth investigating in the parallel translations, the
original newsgroup texts are interesting in themselves in terms of mode (Halliday & Hasan
1989). They can be described as written-as-if-spoken: the texts show features of spoken
discourse, e.g. interjections and discourse particles, but the discourse is written and
distributed in written form on the internet. Beißwenger & Storrer (to appear: 14) note that
"[s]ince this dichotomy is crucial for the categorization in speech and text corpora, it is
difficult to decide whether CMC discourse should form part of text or speech corpora." The
research presented here is meant to investigate the affinity of CMC to written and to spoken
discourse.
[Figure: System network for the annotation of the English verbal group. System labels: TENSE-TYPE, PRESENT-TYPE, PROGRESSIVE-TYPE, PAST-TYPE. Feature labels: simple-present, present, present-perfect, progressive, simple-progressive, perfect-progressive, simple-past, past, tense, past-perfect, past-progressive, future, simple, perfect, will, be-going-to, modal, to-clause, non-finite, ing-clause, imperative.]
[Figure: System network for the annotation of the German verbal group. System labels: TENSE-TYPE, PRESENT-TYPE, PAST-TYPE, MODAL-TYPE. Feature labels: einfaches-praesens, present, praesens-perfekt, konjunktiv, praesens, perfekt, einfaches-praeteritum, past, praeteritum-perfekt, konjunktiv_, tense, praeteritum, perfekt_, einfaches-futur, future, futur-perfekt, konditional, modal, 1-wuerde, 2-wuerde-perfekt, praesens_, praeteritum_, infinitiv, non-finite, zu-infinitiv, partizip-perfekt, imperative.]
As shown in table 1, English newsgroup texts have a higher proportion of non-finite clauses,
17 %, compared with 5 % in German. The proportion of clauses with a modal auxiliary is
slightly lower in English, 9 %, than in German, 12 %. In the following tables, the
number in brackets is the total number of clauses displaying a feature. A sum of more than
100 % results from rounding the percentages to whole numbers.
Clause type        English         German
Tensed             (658) 73 %      (530) 79 %
Modal auxiliary    (78)   9 %      (78)  12 %
Non-finite         (153) 17 %      (30)   5 %
Minor clause       (17)   2 %      (29)   5 %
For the investigation of tense, we will concentrate on the tensed clauses: 658 in English and
530 in German. With the method of transfer comparison, I try to apply to both my English
and German texts what an English grammar, i.e. LGSWE (Biber et al. 1999), says about the
frequency of tense forms in written and spoken discourse. Unfortunately, I do not have any
data available about the frequency of tense forms in different registers in German, which
would be essential for valid conclusions. I assume, however, that present, past and future
tenses have the same function of logically connecting the discourse to the context in both
English and German. Table 2 displays the results of the computer-assisted manual annotation
with Systemic Coder using the system networks shown in Figures 1 and 2.
Tense type             English         German
Simple Present         (376) 57 %      (315) 59 %
Present Perfect        (39)   6 %      (106) 20 %
Present Progressive    (37)   6 %      --
Simple Past            (187) 28 %      (74)  14 %
Past Perfect           (6)    1 %      (3)    1 %
Past Progressive       (8)    1 %      --
Future                 (5)    1 %      (23)   4 %
Subjunctive mood       --              (9)    2 %
academic prose) in the LSWE Corpus, also show a strong preference for present tense forms
(simple present + present perfect + present progressive). The English texts have 69 % present
tense forms in the finite verbal groups; in the German counterpart, present tense forms amount
to 79 %. With only 30 % past tense verbs in the English texts and only 15 % in the German
texts, we cannot identify a strong preference for past tense verbs in the newsgroup texts.
Therefore, these newsgroup texts can be said to be more similar to conversation than to
written narration (fiction) if we look at the use of tense in isolation.
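The grouping of tense forms into present and past shares can be retraced from the raw counts in Table 2. The following is only a minimal sketch: the dictionary layout, the label strings and the function name `group_share` are illustrative assumptions and not part of the Systemic Coder output. Computed from raw counts, the shares come out within about one percentage point of the figures quoted above; small differences arise depending on whether one rounds before or after summing.

```python
# Raw counts of tensed clauses taken from Table 2 (English: 658, German: 530).
# The label strings are hypothetical and chosen only for this illustration.
TENSE_COUNTS = {
    "en": {
        "simple present": 376,
        "present perfect": 39,
        "present progressive": 37,
        "simple past": 187,
        "past perfect": 6,
        "past progressive": 8,
        "future": 5,
    },
    "de": {
        "simple present": 315,
        "present perfect": 106,
        "simple past": 74,
        "past perfect": 3,
        "future": 23,
        "subjunctive": 9,
    },
}


def group_share(counts: dict, keyword: str) -> float:
    """Return the percentage of clauses whose tense label contains `keyword`."""
    total = sum(counts.values())
    matched = sum(n for label, n in counts.items() if keyword in label)
    return 100 * matched / total


for language, counts in TENSE_COUNTS.items():
    present = group_share(counts, "present")
    past = group_share(counts, "past")
    print(f"{language}: present {present:.1f} %, past {past:.1f} %")
```

Keeping the raw counts rather than the rounded percentages makes it easy to regroup the categories later, for example when the corpus is extended.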
While the second most frequent tense in English is the simple past, the second most
frequent tense in German is the present perfect. Again, this reflects the similarity to
spoken discourse in German. According to the standard German grammar, Duden (2005: 519-20),
the simple past is the unmarked tense for narration in written discourse in German, while the
present perfect is the tense that is typically chosen for narration in spoken discourse.
In addition, the results show gaps in the language systems: German does not mark verbs for
aspect, while English has no formally distinguishable subjunctive mood. German marks
modal auxiliaries for present and past tense, and English does not.
Remaining fully aware of the dangers involved in comparing a 4,500 word corpus with
a 40 million word corpus, if we convert our results to the frequency in one million words,
German newsgroup texts (G NG) seem to be more similar to English conversation than the
English newsgroup texts (E NG). We also see that there are differences between the use of
tenses in German and English newsgroup texts. The results presented here, however, must be
verified with more data from a much larger corpus before any conclusions can be drawn about
their significance. My exploratory research suggests that it might be worth carrying out such a
comparison of English and German use of tenses in a relatively new register.
Figure 4: Frequency of present tense, past tense and modal verbs across registers, including
English and German newsgroup texts, adapted from Biber et al. 1999: 456
[Chart: series Present, Past and Modal; categories Conv, Fict, E NG and G NG.]
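The conversion to a frequency per one million words mentioned above is a simple scaling. The sketch below shows the arithmetic only; the function name `per_million` is hypothetical. Treating 4,500 words as the size of a single sub-corpus, and using the 78 clauses with a modal auxiliary from Table 1 as the count, are assumptions made purely for illustration; the LGSWE counts verbs rather than clauses, so the result is not directly comparable.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Scale a raw count to its expected frequency per one million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


# Assumption for illustration: one sub-corpus holds 4,500 words.
ASSUMED_CORPUS_SIZE = 4_500

# 78 clauses with a modal auxiliary (Table 1), scaled to one million words.
modal_per_million = per_million(78, ASSUMED_CORPUS_SIZE)
print(f"{modal_per_million:.0f} per million words")
```

With a corpus this small, a difference of a single clause shifts the normalized value by more than 200 per million, which is one way of seeing why the comparison has to be treated with caution.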
3.2 Modality
The next aspect that catches the eye when exploring the BTC is the divergence between the
parallel translations of modal auxiliaries. If we think back to example 1, the German modal
auxiliary müsste 'must' (subjunctive mood) has been translated into English as has to (once),
must (twice) and should (once), and in the last translation it appears without a modal
auxiliary. Again I start with an exploration of the originals before investigating the parallel
translations. With only 9 % (English) and 12 % (German) of all clauses in the sub-corpora on
relationship problems carrying a modal auxiliary (that is, 78 clauses in each language), the
study is easily feasible. The results, however, must naturally be treated with caution due to the
limited size of the corpus. Once again the clauses are manually annotated with support from
the Systemic Coder according to the main functions of modality (Halliday 1994 [1985]) and
using the system network shown in figure 5. The results are shown in table 3 below.
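The divergence between the five translations of example 1 can be tallied mechanically. The following is a rough sketch and not the annotation procedure used in this study: the list of modal expressions is an assumed, non-exhaustive one, and the names `MODAL_PATTERN` and `first_modal` are hypothetical. The sentences are translations A to E of example 1.

```python
import re
from collections import Counter

# Translations A-E of example 1, as given in the corpus excerpt.
TRANSLATIONS = {
    "A": "I just think that this woman has to reject my advances",
    "B": "I just think that every unavailable woman must immediately stop my approaches",
    "C": "I had thought that every already-taken women must immediately back off "
         "at my attempts to get closer",
    "D": "I think that every woman who is already attached should react to advances "
         "immediately",
    "E": "I think every woman who's spoken for has always left me hanging when I "
         "tried something.",
}

# Assumed, non-exhaustive list of English modal expressions.
MODAL_PATTERN = re.compile(
    r"\b(has to|have to|ought to|must|should|can|could|may|might|shall|will|would)\b",
    re.IGNORECASE,
)


def first_modal(sentence: str):
    """Return the first modal expression in the sentence, or None if there is none."""
    match = MODAL_PATTERN.search(sentence)
    return match.group(1).lower() if match else None


tally = Counter(first_modal(text) for text in TRANSLATIONS.values())
for modal, count in tally.most_common():
    print(modal, count)
```

A string match of this kind only finds the form; deciding whether a modal expresses modalization or modulation still requires manual annotation.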
[Figure 5: System network for modality. MODALITY-TYPE: modalization (probability, usuality) or modulation (obligation, inclination).]
Modality type      English    German
Modalization       70 %       49 %
 -probability      87 %       100 %
 -usuality         13 %       0 %
Modulation         30 %       51 %
 -obligation       48 %       60 %
 -inclination      52 %       40 %
We can see that in the German texts the clauses are evenly distributed between modalization
(probability and usuality) and modulation (obligation and inclination). In English, however,
about two thirds of the clauses (70 %) display modalization, while only about one third (30 %) show
some kind of modulation. We might say that the German writers have a stronger sense of
obligation or inclination, i.e. they think more about what they should do, or what other people
should do. We might even claim that the German writers tend to think that other people
should be doing something (60 % obligation) rather than feeling the necessity to act
themselves (40 % inclination).
Another difference lies in the subtypes of modalization: whereas in English texts 13 %
of modal auxiliaries express usuality, in the German texts there are none. Either, by
coincidence, none of these specific German texts actually expressed usuality, or maybe this is
a sign that in the German language system modal auxiliaries cannot express usuality. The
results, however, seem to support the statement in Steiner & Teich (2004: 151):
Process type                     English    German
action                           56 %       51 %
relational                       16 %       11 %
mental                           11 %       26 %
three-role-cognition (verbal)    12 %       9 %
influential                      5 %        3 %
3.3 Polarity
The investigation of polarity, or negation markers, is of a slightly different nature. It is
inspired by Sinclair (1991) and his statement about where the phrasal verb set in was most
likely to appear in a clause complex: "A number of the clauses are subordinate. With the
samples available, it is not possible to assign status in every case, and there are some of clear
main clauses; but I think the tendency to lower status should be noted." (Sinclair 1991: 74).
While I carried out the above research, I had the impression that markers of syntactic negation
(no, not, never and contracted forms such as aren't, doesn't, couldn't, wouldn't in English, or
nicht, nichts, nie, kein*, niemals in German) appeared most often in subordinate clauses. To
test this hypothesis, all negated clauses in English and German from the sub-corpora on
relationship problems were annotated using the system network shown in figure 6 below.
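Candidate clauses for this annotation can be pre-selected by searching for the negation markers listed above. The sketch below is one possible way of doing so and is not the procedure used in this study; the names `NEGATION_PATTERNS` and `find_negation_markers` are hypothetical. It matches word forms only, so it cannot tell whether a marker sits in a simple, main, co-ordinated or subordinate clause, and it will also return uses of no that are not syntactic negation.

```python
import re

# Word-level patterns for the negation markers named in the text.
# kein* is rendered as "kein" plus any inflectional ending.
NEGATION_PATTERNS = {
    "en": re.compile(r"\b(?:no|not|never)\b|\b\w+n['’]t\b", re.IGNORECASE),
    "de": re.compile(r"\b(?:nicht|nichts|nie|niemals|kein\w*)\b", re.IGNORECASE),
}


def find_negation_markers(sentence: str, language: str) -> list:
    """Return all negation markers found in the sentence, in order of occurrence."""
    return NEGATION_PATTERNS[language].findall(sentence)


print(find_negation_markers("I don't know why he never calls", "en"))
print(find_negation_markers("Ich habe keine Zeit und gehe nie aus", "de"))
```

The hits can then be passed to manual annotation, where the clause status of each negated clause is assigned.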
[Figure 6: System network for the position of negation. System label: CLAUSE-COMPLEX-TYPE. Feature labels: negation, simple-clause, clause-complex, main-clause, co-ordination, sub-ordination.]
The results are shown in table 7 for the 78 sentences in English and 86 sentences in German
carrying negative polarity. We see that in the English newsgroup texts less than one third of
all negation markers appear in a sentence consisting of a simple clause. In the German
newsgroup texts, less than a quarter of all syntactic negations are in a simple clause.
Position                English    German
Simple clause           28 %       22 %
Main clause             32 %       31 %
Co-ordinated clause     27 %       33 %
Sub-ordinated clause    41 %       36 %
4. Conclusion
In my exploration of the Bremen Translation Corpus, Systemic Functional Grammar has been
a valuable theoretical background for an analysis of tense, modality and polarity. On the level
of the textual metafunction, I have shown that differences in the use of tense can reflect
whether we are dealing with spoken or written discourse. This is an interesting feature when
studying a fairly new register such as newsgroup texts from the internet. With a frequency of
69 % (English) and 79 % (German) of present tense verbs, newsgroup texts show their
proximity to conversation, which also displays a preference for present tense verbs (Biber et
al. 1999: 457). But differences can also be detected between the English and German
newsgroup texts; the second most frequently used tense in the English texts is the simple past,
whereas in the German texts it is the present perfect.
On the level of the interpersonal metafunction, modality and polarity provide us with
categories to study how writers express their stance towards the problems they discuss in
these newsgroup texts. While in the German texts the distribution between modalization and
modulation is fairly even, in the English texts more than two thirds of all modal auxiliaries
express modalization. Another finding that may point to a difference between the systems of
modal auxiliaries in English and German is that in the German sub-corpus not one modalizing
auxiliary expressed usuality. The process types modified most frequently in both languages
are action processes, followed in the English sub-corpus by a fairly even distribution
between relational, mental and three-role-cognition (verbal) processes. In the German sub-corpus, however, mental processes are clearly the second most frequently modified processes.
The research on polarity is of a different nature. I study the position of negation markers in
the clause (complex). There seems to be a tendency for syntactic negation to appear in clause
complexes rather than simple clauses, and syntactic negation seems most often to be placed
within a subordinate clause.
not, how can we analyse process types in German, if we want to carry out contrastive studies
based on SFG?
The major problem has been the small size of the two sub-corpora of original texts on
relationship problems in the BTC. With such small numbers, no statements can be made about
the relevance of the results, or even about their truthfulness. The research presented here has
constituted what Sinclair (1991: 137) has called only the first dipping of an inquisitive toe
into the vast pool of language texts. The limited size of the data has, however, made
exploratory research feasible on a number of different aspects, thus revealing interesting
features in the newsgroup texts and in the comparison of English and German.
As a first step in the future, the corpus of original newsgroup texts will be extended to
10,000 words in English and also 10,000 words in German. According to Biber (1995), "[...]
it is possible to represent the distributions of many core linguistic features, both within and
across registers, based on relatively short text samples (as short as 1,000 words) and relatively
few texts from each register (as few as ten texts)" (Biber 1995: 131).
Since a comparison of registers is not the aim of this research project, I feel confident
that ten times 1,000 words in each language will suffice to provide reliable results in an
investigation of the lexicogrammatical systems of German and English, while still remaining
a feasible size for computer-assisted manual annotation. Furthermore, the results from the
BTC must be compared to results from a reference corpus, e.g. the British National Corpus
(BNC) for English and the Cosmas II for German. This procedure will enable us to see more
clearly how newsgroup texts from the internet differ from other registers.
Work then has to be undertaken to develop guidelines for the annotation of tenses,
process types and modality in both languages to make results comparable. When detailed
descriptions of the finite verbal groups in the two languages represented in the BTC have
been completed, we can start to analyse how these phrases are translated. My ultimate aim is
to study what kind of variation exists between the five parallel translations, and how this
variation can be explained.
References
Beißwenger, Michael & Angelika Storrer (to appear). Corpora of Computer-mediated Communication. In Anke Lüdeling & Merja Kytö (eds.) Corpus Linguistics. An International Handbook. Series: Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science. Berlin: Mouton de
Gruyter. http://www.michael-beisswenger.de/pub/hsk-corpora.pdf (12.03.2008)
Biber, Douglas (1995). Dimensions of Register Variation: A Cross-Linguistic Comparison.
Cambridge: Cambridge University Press.
Biber, Douglas, et al. (eds.) (1999). Longman Grammar of Spoken and Written English.
Harlow: Pearson Education Limited.
Caffarel, Alice & J.R. Martin & Christian Matthiessen (eds.) (2004). Language Typology: A
Functional Perspective. Amsterdam & Philadelphia: John Benjamins.
Dudenredaktion (2005). Die Grammatik, 7. Auflage. Mannheim & Leipzig & Wien &
Zuerich: Dudenverlag.
Fawcett, R. (in preparation). The Functional Semantics Handbook: Analyzing English at the
level of meaning. London: Equinox.
Halliday, M.A.K. (1994). An Introduction to Functional Grammar. London: Edward Arnold.
Halliday, M. A. K. & Ruqaiya Hasan (1989). Language, Context and Text: Aspects of
language in a social-semiotic perspective. Oxford: Oxford University Press.
Johansson, Stig & Knut Hofland (1994). Towards an English-Norwegian parallel corpus in
U. Fries, G. Tottie and P. Schneider (eds.) Creating and using English language
corpora. Amsterdam: Rodopi: 25-37.
Malmkjær, K. (1998). Love thy Neighbour: Will Parallel Corpora Endear Linguists to
Translators? Meta 43:4, 534-541.
Martinek, Zdenek & Les Siegrist (1998). WConcord: Concordancer for Windows. Version
3.0. Institut fuer Sprach- und Literaturwissenschaft, Technische Universitaet Darmstadt.
Mauranen, Anna (2002). Will translationese ruin a contrastive study? Languages in
Contrast 2:2, 161-185.
O'Donnell, Mick (2005). Systemic Coder - a Text Markup tool. Version 4.68.
http://www.wagsoft.com/Coder/ (12.03.2008)
Sinclair, John (1991). Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Steiner, Erich & Elke Teich (2004). Metafunctional Profile of the Grammar of German in
Caffarel, Alice & J.R. Martin & Christian Matthiessen (eds.) Language Typology: A
Functional Perspective. Amsterdam & Philadelphia: John Benjamins: 139-184.
Teich, Elke (2001). Contrast and commonality between English and German in system and
text: a methodology for the investigation of cross-linguistic variation in translations and
multilingually comparable texts. Universitaet des Saarlandes, Philosophische Fakultaet
II: Habilitationsschrift.
Zitzen, Michaela (2004). Topic Shift Markers in asynchronous and synchronous Computer-mediated communication (CMC). PhD thesis, Universitaet Duesseldorf.
http://docserv.uni-duesseldorf.de/servlets/DerivateServlet/Derivate-2771/771.pdf
(12.03.2008)