Original field study: Patrick Boylan and Nadia Mari, Iniziativa Didattica Studentesca 40/82-83,
La Sapienza University of Rome, 1983. See end note for a full explanation.
The present paper was co-authored and circulated in photocopied form in 1983
and then, entirely rewritten by Patrick Boylan, presented in the form which follows
at the 6th International Pragmatics Association Congress, Reims, 1996.
BEING ONE OF THE GROUP
Patrick Boylan and Nadia Mari
0. Abstract/Introduction
This paper analyzes how second-language learners and native speakers interact in small groups. It focuses
on how apparently marginal behavior, such as eye and head movement, may contribute to determining
each participant's status as an ‘insider’ or ‘outsider’. Data is taken from spontaneous small-group
conversations among American and Italian college students, surreptitiously filmed and analyzed by a
research team composed of teachers and students of English as a Second Language (henceforth ESL) in
Rome.
To what degree, then, are eye/head movements culturally distinctive? What messages do they convey?
How can we be sure we have understood them? Can adapting our eye/head movements improve
communication? The paper argues that by studying conversations as ‘texts’, such questions cannot fully
be answered. It then proposes the notion of ‘enactment of intent’ as the basic ‘unit’ making up
conversations, and outlines an innovative research procedure based on experiential knowledge
(phronesis), better suited to ascertaining the local meaning of somatic as well as verbal and prosodic
messages. This, it claims, is the kind of knowledge that learners of a second language (henceforth L2)
need and that conversation analysis (henceforth CA) should consider investigating.
1. Aims
The aims of the present research project were:
a. to observe and record spontaneous conversational interaction in informal settings between
self-selected members of culturally heterogeneous small groups, specifically, between
mixed-gender American and Italian university students and teachers surreptitiously filmed as
they chatted together informally in English after a lecture/debate;
b. to take one of the features of the video-recorded interaction — here, gaze and head
movement — and, using classical research methods (Kendon 1967, Birdwhistell 1970, Allen
& Guy 1974, Argyle & Cook 1976), to verify, together with the participants:
♦ if each cultural group has a ‘typical somatic behavior’ in conversing and, if so,
♦ whether a participant can be more successful in communicating with members of the other
culture by adopting their conversational behavior;
c. to determine — this time with only the Italian participants, all of whom were ESL students
— whether the kind of knowledge classical research methods furnish gives them a better
insight into the communicative competence they wish to acquire, i.e., native-like
conversational skill in multi-party multi-cultural encounters;
d. if the response to (c.) is negative, to determine what kind of research questions are not
currently being asked in the field of conversation analysis (but that the ESL learners deem of
interest) and to propose a methodology for answering those questions.
2. Experimental task and setup
Fifteen American third-year university students studying art history in Italy were invited by fifteen Italian
ESL students to a lecture/debate in English on cultural differences between their two countries, held at the
University of Rome. After the debate, the students were free to mill and converse in the lecture hall while
a lawn picnic was being set up outside. A video camera was present in the room and was constantly
pointed at the table up front where the lecturer and the discussion moderator sat. The camera was
equipped with an Eibl-Eibesfeldt lens, i.e., a hidden mirror arrangement which permitted filming
anywhere in the room unnoticed; audio came from a tiny radio microphone concealed on one of the
students. By rotating the hidden mirror, the cameraman was able to keep the camera constantly focused
on the ‘wired’ student as she wandered from one group to another engaging in conversation. To make
sure the student acted as naturally as possible, she was told, when the lecture began, that the microphone
batteries had gone dead and thus there would be no audio recording; since the microphone was sewn
inside her dress, she had to leave it there. One wall of the lecture room had a panel mirror so that the faces
of any group standing in front of it could be filmed either directly or as a reflection. Unfortunately, no
group spontaneously formed in front of the mirror; thus in some cases, gaze must be reconstructed
through the interactional dynamics. All students were informed afterwards of the video recording and
consent was obtained for its use.
The reader is now invited to turn to Appendix 1 and inspect the drawings (called a ‘storyboard’ in cinema
jargon) of a fragment of the filmed conversation. Appendix 2 gives a verbal description of the
movements; Appendix 3, the transcription conventions.
3. Previous studies; new directions
3.1 Typical research into conversational interaction centers on culturally homogeneous dyads (see Allen &
Guy 1974:54 for a justification of the preference for dyads). Less studied are triads (Kerbrat-Orecchioni
1990-4), large groups (Edelsky 1981) and, least of all, small groups (Berrier 1997:326). While studies of
intercultural interaction are now fashionable in our multi-cultural societies, they also concern dyads or
triads (Orletti 1992, Shea 1994, Jensen et al. 1995) and only rarely small groups (Berrier 1995).
Yet conversational competence in small-group situations is what learners of any L2 seem to find most
difficult. In paired exchanges with members of the other culture (service encounters or one-to-one
socializing), they automatically get individualized attention from their interlocutor, however begrudging it
may be. In large groups, L2 learners usually have the option of creating a dyadic relationship to get
individualized attention and to isolate themselves from the crowd. Not so in multi-cultural small group
conversations: here they are mostly on their own and, in order to cope, need "a superior know-how"
(Kerbrat-Orecchioni 1997:10). They either ‘fit in’ or get left out.
Unfortunately, current language teaching methodology gives L2 learners neither the know-how Kerbrat-Orecchioni speaks of, nor the intellectual tools to acquire such know-how empirically in real-life
situations (Boylan 1998). Thus it is common for even moderately fluent L2 learners to report that, in
small group conversations with native speakers, they often feel stifled, ignored or, worse yet,
condescendingly listened to during the occasional lulls in the ‘real’ conversation. In other words, these L2
learners do not feel accepted as ‘one of the group’. But, of course, how could they be? They have no idea
of what it takes to gain a "footing" (Goffman 1981) in the new culture.
3.2 What can we do to prepare L2 students to understand and assimilate the dynamics of the
conversational interactions in which they find themselves?
A first step would be to determine what indeed is required to gain a footing in a given foreign culture.
Frake (1964) suggests, rather convincingly, that this involves learning to feel as ‘real’ what is ‘real’ for
one's interlocutors. There is no doubt, in fact, that in multi-cultural group conversations L2 learners will
tend to be ignored if the topics they raise — ‘real’ to people back home — appear pointless to the group
or out of place (see Tenny 1989 for rules of topic appropriateness). The same goes if these L2 learners, by
lack of appropriate reaction, treat as pointless or irrelevant the topics that the rest of the group finds
‘really’ significant, controversial, humorous, scandalous or whatever.
Indeed, to use a metaphor from transplant surgery, can we really blame the group for rejecting a body felt
as foreign? After all, why should the group take seriously people who don't take seriously things that
matter? It may be instructive but it is certainly not pleasurable to let an outsider, by her apparent
disregard, call into question what one always thought mattered; and it must be remembered that
conversation (fun) is different from discussion (work) precisely because it operates on the pleasure
principle. If, on top of it all, the outsider makes conversation management a chore — by sending
confusing signals through ‘strange’ eye and head movements or by maintaining a speaking style
diametrically opposite from her interlocutors on such culturally indicative scales as physical-contact ~
distance, one-speaker-at-a-time ~ multiple-flooring, explicitness ~ allusiveness (Hall 1966,
Jensen et al. 1995) — she can hardly complain at being left out of the conversation. She has in fact opted
out by not having opted in.
In a perfectly just world, of course, she would not have to ‘opt in.’ She and her conversational partners
would want to learn from each other in order to create a terrain of authentically shared values and
creative diversity (see the special issue of Pragmatics 4/3, September 1994). This is not, however, the
world that many L2 learners live in.
What L2 learners need, then, is the capacity to meet their interlocutors a little more than half-way
culturally. This does not mean memorizing lists of conversational topics and Do's & Don'ts. Actors get
quickly unmasked. It means something much simpler but much more radical: sharing, at least in part, the
existential value system at the bottom of the other culture. To be truly ‘one of the group’, L2 learners
must truly feel they are.
3.3 Thus, the basic task facing L2 teachers is to get learners to internalize their interlocutors' culture as a
value system or Weltanschauung (Boylan 1998). Doing so will allow them to forget the lists of rules and
play by ear. Far from being cultural imperialism, this technique permits the L2 learner to communicate
her values, as a member of a different culture deserving respect, much more effectively and
convincingly than if she spoke as an ‘outsider’. In addition, it gives the L2 learner a negotiating edge,
since she controls the flow of information. (These points are developed in Boylan, forthcoming/a.)
But getting learners to internalize a second language does not simply mean getting them to playact. It
presupposes a thorough contrastive study of the cultures and communication modalities involved. In fact,
to return to the topic at hand, one way to help L2 learners ‘enter into’ the new language and culture is to
get them to formulate (and then try to imitate) how people manage conversations in the target culture —
including how they use their eyes and heads — as the expression of a different, felt Weltanschauung.
Kendon (1967) reported long ago that eye and head movements are our principal signaling devices in
conversation. Allen & Guy (1974:142) confirm his assertion: "Eye contact is an important supplement to
the conversational encounter. It is a good indicator of the degree to which the channel is open or closed,
[...] the degree of bonding and the level of attention in the listening mode." Participation, bond of
solidarity, interest: these are precisely the qualities we just mentioned as essential for ‘being one of the
group.’
3.4 Several classification schemes have been devised over the years — see Section 1 b; for the most
recent, see Pezzato & Poggi 1998 — which give a research group interested in replicating a typical
‘candid camera’ ethnographic experimentation a framework for ordering observations of eye and head
movement. What remains to be seen, however, is the usefulness of these schemes to L2 learners or, for
that matter, to anyone interested in communication. Classical research methods measure, for example,
phenomena like the length of time a typical subject spends gazing at an interlocutor when speaking or
when listening. But these observations are seldom related to what the recorded phenomena mean.
Average "length of gaze" is an interesting concept; but when does a glance — in a foreign culture, or
even in one’s own culture — become ‘fleeting’ or ‘overly insistent’? What non-average "length of gaze"
enhances participation, creates bonds of solidarity, manifests interest or, on the contrary, appears brazen,
standoffish or bored? These are the findings that people who use language need to have. More
importantly, they need to know what procedures will permit them to learn these things in the culture in
which they may one day have to live and work, if no studies have been made yet.
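By way of illustration only, the kind of figure that classical methods yield can be sketched in a few lines of Python. The annotation format, the names GAZE_INTERVALS and mean_gaze_by_role, and all the timings are assumptions invented for this sketch; none of them come from the recordings discussed in this paper.

```python
from collections import defaultdict

# Hypothetical annotations, invented for illustration:
# (participant, role while gazing, gaze start in seconds, gaze end in seconds)
GAZE_INTERVALS = [
    ("A", "speaking", 0.0, 1.2),
    ("A", "listening", 2.0, 5.5),
    ("B", "speaking", 1.0, 1.8),
    ("B", "listening", 6.0, 9.0),
    ("A", "speaking", 10.0, 11.0),
]


def mean_gaze_by_role(intervals):
    """Return the mean gaze duration, in seconds, for each role."""
    durations = defaultdict(list)
    for _participant, role, start, end in intervals:
        durations[role].append(end - start)
    return {role: sum(values) / len(values) for role, values in durations.items()}


averages = mean_gaze_by_role(GAZE_INTERVALS)
for role, seconds in sorted(averages.items()):
    print(f"{role}: {seconds:.2f} s")
```

The averages obtained this way say how long gazes last while speaking or listening. They do not say whether any single gaze was felt by the participants as fleeting or overly insistent, which is the gap noted above.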
The fundamental problem, then, is how to assign meaning to behavior. And given that communication
is multi-modal and holistic (Poggi 1997), not only must we take note of eye and head movements as
potentially meaningful behavior, but we must also record behavior in every other modality (verbal
included) and weave all the data together into a single fabric. Note that the fabric thus obtained is not the
‘meaning’ of the communicative event, but simply its material substratum, no more than the
documentation of an automobile accident (photos, declarations, measurements) is the ‘event’ determining
a claimant's legal rights. In the case of an accident, legal rights derive from an interpretation of available
documentation by an insurance expert or a judge. They alone are authorized to decide what really
happened, by using applicable law and jurisprudence to link intentionality, behavior, concurrent
circumstances and social responsibility into a legally-binding whole (the ‘event’). In the same way, the
fabric of a conversation that we weave when we analyze a video recording, is not the ‘event’ but rather
the material to which we assign a sense in the very act of weaving it, thereby creating the ‘event’
(Gadamer 1960). Like a judge, rightly or wrongly we determine what the conversational ‘event’ was
through the hermeneutics, not of law and jurisprudence, but of psychology and cultural practices. It is not
clear, however, what authorizes us to pass judgments.
Let us therefore look for a moment at how we go about ‘making sense’ of talk-in-interaction. This will
allow us to answer some of the crucial methodological questions that Grossen & Orvig (1998:153-154)
raise concerning the validity of clinical interviews, and which apply to any exchange that involves
‘judging’ what an interlocutor means.
4. General considerations: making sense of a conversational event
4.1 A key step in making sense of any event is defining the units that go to make it up. This of course
depends on the traits we have selected to foreground in viewing the event: in any empirical science, what
we discover depends on what we have chosen to notice — or, more exactly, not to notice. Indeed, as
Gadamer (1960) argues, researchers (such as ourselves in this very paper) may be said to take de facto a
social and ethical stand in finding ‘interesting’ certain phenomena which they then elaborate as
knowledge.
4.2 For Berrier (1997), revisiting Sacks et al. (1974), the basic unit of conversational interaction is the
utterance, defined as "any verbal attempt at turn-taking" (p. 326). However, Ford et al. (1996) question
whether the floor is in fact held or ceded on purely verbal grounds. Real-life conversational turns, they
assert (p. 449), are "constellations of convergence and divergence" of verbal/prosodic/somatic signals, the
weightings of which co-determine possible turn junctures dynamically. In this perspective, then,
conversations should be seen as composed not of utterances but rather of moves (Goffman 1981) to gain
or hold the floor by whatever means available, verbal, prosodic or somatic.
4.2.1 The two definitions just given offer useful insights. However, both follow a consolidated but
questionable tradition: that of treating conversations as ‘texts’, i.e., bounded, finalized semantic
representations, transferable onto paper for easy inspection and dividable into verbal (or
verbal/prosodic/somatic) units with the stroke of a pen. This approach, we hold, while applicable to text
productions proper (a sonnet, a joke, ritual insulting) is misleading when applied to conversation.
Conversations are not simply ‘texts’ — neither verbal texts, nor multi-modal texts (i.e., concurrent
verbal/prosodic/somatic realizations pictured as a musical score: Poggi 1997), nor semiotic texts (e.g., in
the way some semiologists attempt to reduce culture to a text). This is because a ‘text’, by definition, is a
framed representation to be viewed from without, while conversations (like cultures) are boundless
events, the sense of which can only be experienced from within. Conversations (and cultures) produce
‘texts’ as their residue. While much can be learned by studying that residue, to fully grasp a conversation
(or a culture), a procedure other than textual reconstruction is needed. Let us briefly clarify these
assertions.
4.2.2 When one delineates (‘frames’) an object in order to observe it as a ‘text’, one endows it with
meaning potential. Pop Art did that with soup cans. It is also what CA does in framing conversation
fragments as ‘adjacency pairs’ or ‘side sequences’.
If a stretch of speech has been conventionally bounded and finalized by the speakers (even
unconsciously) — for example, a ‘side sequence’ — and if the subsequent framing operation by the
linguist respects those boundaries and puts that finalization into perspective, then textual analysis of the
framed object is both practicable and instructive. If on the other hand, as in (Pop) Art, the framing does
not correspond to the way whoever created the object sees it, but rather to how the framer sees it — i.e.,
to the personal mental world the framer projects upon the object — then the analysis will end up telling us
more about the framer than about the object. This, too, can be a perfectly legitimate and instructive
operation, if we are interested in the framer’s ‘perceptions’ or metadiscourse. It must not, however, be
confused with scientific inquiry, which requires us to accept the dictates of the object as other in order
to come to grips with it (7.1.8).
The first question to resolve, then, is whether conversations, as such, are bounded and finalized and thus
frameable as ‘texts’. If they are, we can then move on to defining procedures that guarantee that our
framing respects the object. If they are not, we have a more difficult task — discovering a non-textual
approach for coming to grips with them.
4.2.3 Conversations, as Garfinkel (1967) argues convincingly, are on-going, situated, collegial,
intentional attempts at existential meaning-making (or meaning-confirmation); they are the very stuff
out of which our conscious lives are made. That is why Aristotle, long before ethnomethodology,
considered societies to be communities of discourse more than communities of people (Lo Piparo
1996:40). In this perspective, therefore, the stretches of talk ordinarily called ‘conversations’ — e.g., the
talk that occurs between the lifting and the lowering of a phone receiver — are simply fragments of a
single, uninterrupted conversation punctuating the lifetime of individuals and their community (and still
unachieved when both pass away). Chats on the phone will of course have, like all conversational
fragments, recognizable textual features (phone calls are partly ritual, as CA has shown) and even goals.
But insofar as the call is ‘conversation’, it will not be a ‘text’. Thus, a specific kind of non-textual
competence is needed for analyzing a conversation as a researcher, as well as for co-creating one as a
participant.
Travelers, for example, know that, while mastering a community's conversational rituals is sufficient for
service exchanges (e.g., knowing "Excuse me; how do I get to..." is sufficient for asking directions), it is
rarely sufficient for entering into communion, through conversation, with the members of that community
— no more than mastering the rituals of prayer in a given religion puts one in communion with the god
being prayed to. Conversation is different from ‘information exchanges’ precisely because it is an
‘entering into communion’. As such it requires making a community's existential value system one's own,
in order to spontaneously recognize and react to the units of intent that make up ‘free talk’ in that
community. Recognizing existential units of intent as an outsider is not easy. This is why it is harder, in a
foreign language, to learn how to converse as ‘one of the group’ than to create sophisticated forms of
bounded, explicitly finalized discourse (i.e., ‘texts’: academic papers, sales talks, literary reviews...), the
structures of which can be thoroughly described by teachers or learned from handbooks.
4.2.4 If conversation proper is not a ‘text’, then what is it concretely? In this paper, it will be treated as an
‘intentional event’ — specifically, as an on-going, situated enactment of a collegial will to make sense
of ordinary experience, accompanied by verbal/prosodic/somatic improvisations. Thus, while any verbal
interaction has textual features which must certainly be accounted for, we shall consider the basic units
of conversation to be neither physical objects (‘utterances’) nor discourse functions (‘turns’) nor logical
constructs (‘felicitous acts’) but rather intuitively-grasped enactments of will, or ‘stances’. A stance is
the perceived intent behind a basic behavioral pattern.
How can such a subjective criterion be scientific? For now we simply appeal to the ordinary experience of
the reader. In general, when we encounter manifestations of intent — a strange movement made by a
passer-by in a dark alley or human-like cooing coming from the apartment next door — we recognize
them as willful acts on the strength of intuition alone, i.e., without explicitly calculating the degree of
their deviation from randomly generated acts of similar nature. In other words, an intent ‘makes itself felt’
to us clearly and distinctly, even if we are not sure of what it means. Of course at times we are mistaken:
we project intentionality on phenomena we later find have none; thus we discover the pitfalls of intuition
and learn to practice preventive strategies, like inner as well as outer listening. Whatever level of
expertise we reach, however, sensing intentionality still remains the soul of our conversational ability. It
is what conversation analysts rely on all the time when they confidently assert the ‘meaning’ of other
people’s talk.
4.2.5 Treating conversations as interplays of ‘intentional events’ — and not exclusively as verbal or
verbal/prosodic/somatic ‘texts’ — is not only closer to our experience of them but also more useful. First,
it permits us to use a wider range of data in our textual analyses of the frameable parts of a conversation
(to be specified in 4.5). This allows us to explain interactions like a TV talk show on politics in which the
skirting of a certain issue by A provokes a protest from A’s rival B (reticence is, in fact, an ‘intentional
event’); or one in which A respectfully lets B talk at length without interruption (deference also
constitutes an ‘intentional event’); or one in which A sadistically sits in silence letting his expensive
clothes show as B, stammering, constantly tugs at his overly-short jacket cuffs (this normal exhibition of
self by A is just as much an ‘intentional event’ as was the act of putting those clothes on at home); or,
finally, one in which A chooses to remain in silence after B, having made a terrible admission,
relinquishes his turn (if A’s silence lasts a long time, the turn goes back to B; if B has nothing to add, his
silence, with A’s, becomes a simultaneous silent turn). Occurrences such as these constitute genuine
communicative acts by A, through doing nothing (see Hall 1966 and Watzlawick et al. 1967 on covert
communication; novelists, too, regularly note occurrences like these in describing conversations). And yet
they are not "verbal attempts at turn-taking" nor even, strictly speaking, multi-modal attempts. If,
however, we consider conversations to be made up of units of intent, then we can treat them structurally,
in order, as: alternating turns in a side sequence, alternating turns with immediate turn relinquishment,
backchanneling, and an alternating turn followed by a simultaneous turn.
More importantly, by considering experiential investigation through cultural assimilation as a valid
research tool for grasping units of intent (the procedure we shall suggest in Section 7), we can gain
access to forms of communication that remain obscure if analyzed as ‘text’. An obvious example is found in
certain Native American cultural practices:
When used in a special way by Blackfeet, the term 'listening' refers to a form of
communication that is unique to them; when enacted in its special way, 'listening'
connects participants intimately to a specific physical place. [...] 'Listening' this
way can involve the listener in an intense, efficacious, and complex set of
communicative acts in which one is not speaking, discussing, or disclosing, but
sitting quietly, watching, and feeling-the-place [communally], through all the
senses. — Carbaugh (1998)
Indeed, the complementary heuristic we shall be proposing at the end of this paper can give investigators
better access to any intercultural interaction (native/non-native, boss/worker, woman/man...). Only by
learning how to sense, represent and introject the Weltanschauung of their conversational interlocutors
can investigators grasp the ‘enactments of intent’ that direct the flow of the conversation in which they
seek to participate.
4.3 This paper, then, will view conversations as boundless meaning-making events, moved forward by
bounded sequences of stances, i.e., perceived minimal enactments of intentionality. (See ahead for the
bounding criteria.) It is possible to sense a stance without fully grasping the intent behind it, as we have
seen, simply by noting a suggestive pattern of behavior. But to understand a stance fully, we must grasp
the intent giving it shape within the context of both the whole conversation and the particular
sequence in which it appears. (See Gumperz 1992, 1995, on contextualization.) As a unit of effect —
what we are ‘summoned’ to feel — a stance is the equivalent, in the realm of intentionality, to a ‘speech
act’ in the realm of discursive thought (Austin 1962: see ahead). As a unit of perceived expression, a
stance is a bundle of signals, i.e., triggering occurrences. Examples of signals: stammering, keeping
silent, letting one's expensive clothes show... Example of a stance: while B stammers, A seems to keep
purposefully silent, letting his expensive clothes show... (which is seen as a sadistic stance only if the
dynamic of the whole conversation is grasped). The basic unit of conversations is thus not the signal but
the bundle of signals (triggering occurrences) within which each occurrence may acquire its presumed
intentional value — i.e., the stance.
4.4 The Oxford Ordinary Language philosophers, and many pragmaticists with them, have adopted a
different, quasi-legal perspective in defining the basic units of communication. To converse is to
‘felicitously’ do something with words and/or gestures (Austin 1962). This is clearly an
oversimplification, as Goffman (1981) has pointed out: people do much more with language than
accomplish speech acts; they indulge in phatic communion, for instance, or talk to keep from thinking.
Thus, while Austin's ‘speech acts’ may constitute the building blocks of the world of discursive talk, the
world in which some philosophers choose to live, they do not qualify as the building blocks of all talk,
much less conversation. Moreover, a definition of the basic units of communication should establish at
least some correlation between units of expression and units of effect. But speech act theory can do so
only with propositions, not with the co-constructions and multi-purpose utterance-turns typical of
conversation (see Goffman 1981 for examples). Nonetheless, Austin's basic notion of speech as ‘doing’
(1. ‘creating a value’; 2. ‘modifying/advancing a state of affairs’) seems worth keeping. In fact,
applied to enactments of intent, the first sense defines ‘stance’; the second, ‘sequence of stances’.
But what exactly constitutes a ‘sequence’? Taking inspiration from Kerbrat-Orecchioni (1997), Colas &
Vion (1998) define the minimal unit of conversation as a multi-phase contribution. (Also see the notion
of ‘mutual adjustment’ in Clark & Wilkes-Gibbs 1986, ‘contribution’ in Clark & Schaefer 1989, and
‘exchange’ in Kerbrat-Orecchioni 1994.) Each ‘contribution’ has a three-part structure: A makes a verbal
offering; B acknowledges the offering; A acknowledges the acknowledgment (explicitly or tacitly).
This schema elegantly explains how a whole series of utterances can constitute a single functional unit.
Nonetheless, as worded, it clarifies conversational exchanges uniquely as exchanges of information —
more precisely, as epistemic or deontic transactions (a tribute to the Austinian tradition?). Our objective,
on the other hand, is to clarify how conversation works as an alogical world-building practice. Thus, we
shall borrow the term, conserve its primitive sense ("finalized transaction made up of a series of turns"),
but define it teleologically as a "perceived claim to value" (Stuart Hall, lecture). Contributions make a
conversation advance, as political struggles make a nation advance. They modify a state of affairs. The
entire storyboard in Appendix 1, for example, represents a single contribution: PHIL’s defense of an
American idiom (a claim to value), with acknowledgment by the GROUP and PHIL’s acknowledgment
of the acknowledgment.
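The nesting of units proposed so far (signals bundled into stances, stances sequenced into a contribution) can be pictured, purely as an illustrative sketch, as a small data structure. The class and field names are hypothetical, as is the label given to the contribution; the stance is the talk-show example from 4.3. The sketch only records a perceived intent once an analyst has supplied one; it makes no attempt to compute it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Signal:
    """A triggering occurrence in one modality (hypothetical representation)."""
    participant: str
    modality: str  # "verbal", "prosodic" or "somatic"
    description: str


@dataclass
class Stance:
    """A bundle of signals perceived as enacting one intent."""
    signals: List[Signal]
    # None means the stance has been sensed but its intent not yet grasped.
    perceived_intent: Optional[str] = None


@dataclass
class Contribution:
    """A bounded sequence of stances seen as one claim to value."""
    claim_to_value: str
    stances: List[Stance] = field(default_factory=list)

    def unresolved(self) -> List[Stance]:
        """Stances that have been sensed but not yet assigned an intent."""
        return [s for s in self.stances if s.perceived_intent is None]


# The example from 4.3: while B stammers, A keeps silent and lets his clothes show.
stance = Stance(signals=[
    Signal("B", "verbal", "stammering"),
    Signal("A", "verbal", "keeping silent"),
    Signal("A", "somatic", "letting expensive clothes show"),
])
contribution = Contribution(
    claim_to_value="talk-show exchange (hypothetical label)",
    stances=[stance],
)

before = len(contribution.unresolved())  # intent sensed, not yet understood
# Only once the dynamic of the whole conversation is grasped can an intent be assigned.
stance.perceived_intent = "sadistic display"
after = len(contribution.unresolved())
```

Leaving perceived_intent empty by default mirrors the point made in 4.3 that a stance can be sensed before the intent behind it is understood.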
We shall not consider here units superior to the ‘contribution’. Our thesis is that intentionality drives
conversations. Larger units (cognitive ‘enhancements’) are derivative.
4.5 Drawing on Goffman's notions of ‘frames’ and ‘moves’ (1974, 1981), Schank's notion of ‘scripts’
(revisited in Schank & Leake 1989) and the existentialist notion of ‘will affirming itself’ (Sartre 1943),
we have defined stances as the basic units of intent which go to make up contributions and thus
conversations. But what are they in practice?
For the moment the reader can get some idea by glancing once again at the conversational fragment given
in Appendix 1 in the form of a cinema storyboard. Clearly, in order to choose a video sequence for the
storyboard, the authors of this paper must have had some idea of what had struck the participants as a
‘claim to value’ (or ‘contribution’) during the conversation. Debriefing revealed that PHIL’s irruption
into the conversation to comment on the expression "Give me a break!" had struck everyone. Next, the
authors had to define the beginning and end of that contribution and divide it into stances. Those choices
were not automatic. A video tape is a continuous flow of visual and audio phenomena: some principle
must have therefore guided the authors not only in subdividing that flow, before and after PHIL’s
guffaws, into a coherent unit with a start and a finish (the ‘contribution’), but also in subdividing that
contribution into single units to be pictured as ‘frames’, i.e., the six boxes making up the storyboard.
Whatever that principle was — and we shall try to describe it further on — Appendix 1 illustrates it.
4.6 We may therefore define a ‘frame’ as the pictorial representation of a stance. It is a bundle of
individual/collective signals (e.g., changes in gaze and head position) seen as enacting a personal/group
intent. This is represented pictorially by keeping all visual elements intact from one frame to the other,
except those in which change is seen as potentially significant and indicative of intent (see Note 6). Since
all other changes are perceived by the participants as either background or as noise, they need not be
pictured. As a rule of thumb, frames — like stances themselves — represent the shortest sequences (or
‘flashes’) that a Video Editor might cut from the tape of the interaction to make a brief TV spot to
publicize the conversation, like spots made for films. Minimal units give enough (not necessarily all) of
an utterance or gesture to make it seem like the concrete, intelligible expression of some will directed to
moving events forward.
The storyboard makes it clear that, in any case, the minimal units of a conversation are neither
utterances/utterance-turns, nor speech acts, nor transactions. The frames were not chosen on linguistic
grounds (one contains no utterance and one contains two that work as a single enactment) or to represent
discourse functions (although all, including the silence in frame 6, perform one or more ‘illocutionary
acts’ and frame 2 contains a topic/comment exchange). The criterion used was existential: every frame
had to capture one minimal aspect of the overall attempt (on the part of the participants in the
conversation both as individuals and as a group) to ‘get a hold on things,’ and thus to create or reinforce a
world-view. To borrow an expression from an Italian theoretician of literature inspired by Althusser,
every frame shows, in miniature, an "attempt to implement a project of acting on the world" (Liborio
1979:9, our transl.).
4.7 Now that we have ‘units’ suitable for analyzing conversations as existential meaning-making events,
we may proceed to examine a segment of our video recording. We propose to: 1. identify all
phenomenological regularities using traditional nomenclature (‘posture’...) and procedure (distributional
analysis...) and, simultaneously, 2. assign meaning to these regularities using our new nomenclature
(‘stance’...) and a yet-to-be-defined method for grasping experientially the overall sense of the
conversation.
The need for such a method is evident. We have asserted, in fact, that signals can be perceived as
intentional only within a stance; a stance can be perceived as such only within a contribution; and a
contribution, only within the development of the conversation-up-to-that-point. Moreover, we have
asserted that a conversation is not frameable and can be grasped only by experiencing it. These assertions
would seem to imply that to be able to recognize a ‘stance’ in the verbal/prosodic/somatic ‘posture’ of a
person, we must actively participate in the conversation in which the posture is struck; what is more, we
must be a bona fide member of the interaction, able to ‘get the feel’ of ‘what’s going on’ and, insofar as
possible, to work out meanings dynamically with the group.
If instead we try to make sense of a conversation from the outside, we may fail to notice many of the
stances and contributions as perceived by the participants: but these are what determine (subjectively)
the development of the conversation! The risk is especially great with covert stances (4.2.5). In addition,
we may misinterpret the various movements we notice by projecting onto them our own world of values
(‘projective interpretation’), imagining ‘stances’ and ‘contributions’ where there are none (for the
participants). In other words, we may manage to make sense of the conversation for us, but fail to capture
what is going on "from the natives’ point of view" (Malinowski 1944).
To conclude, it would seem that there is no way we can reliably establish the idiosyncratic meanings of
the interactions we recorded. Current methodology can legitimately uncover only their prototypical
meanings by using psychological/cultural universals to frame the interactions (e.g., Grice’s conversation
‘maxims’, Goffman’s conversation ‘system requirements’, Sacks’ ‘turns’, Brown & Levinson’s ‘facework’...).
This dilemma, called the hermeneutic circle, has various solutions (Gadamer 1960:312 seq.); we shall
propose ours in Section 7. In carrying out the present experiment, however, we chose to take a shortcut in
order to expedite analysis. We substituted actual participation in the conversation with ‘virtual
participation’ obtained by debriefing the participants and then attempting to view the video with their
eyes (as though we were living it as they had lived it). How well this procedure works is what we shall
now see.
5. Hypotheses, predictions
5.1 The present research hypothesized that:
1. at least a few kinds of eye/head movements would be identifiable as either typically
American or typically Italian; moreover it would be possible to grasp the idiosyncratic
meaning of these movements by vicariously ‘willing the enactment’ in which they appear, on
the basis of how the participants reported having lived that enactment;
2. participants who used meaningfully and appropriately the eye/head movements typical of a
given culture would be perceived as ‘insiders’ by the members of that culture: they would be
included in the web of intersecting gazes uniting the members of the culture in phatic
communion and would receive through gazes, although to a lesser extent, more invitations to
speak and greater attention when they spoke.
5.2 To test these extremely complex hypotheses, we formulated two simple, narrowly-focused
predictions dealing with how one value, assent, is communicated through eye/head movement. On the
basis of informal observations made during a pre-encounter held with the American students, we
predicted that, in our video-recorded encounter:
1. in a majority of the enactments of assent, most Americans would nod amply and
continually; most Italians would use brief, contained nods or jerk their heads backward
while opening their mouths slightly (individuality would be expressed, if at all, by varying the
other signals bundled with the nod or jerk);
2. during each enactment of assent, participants would gaze principally at whoever
gesticulated as they did, regardless of that person’s nationality. This means, given the first
prediction, that during most of the choruses of assent typical of friendly group conversations,
the Americans would gaze at whoever was nodding amply (in practice, at their fellow
Americans, plus any Italian adopting such behavior) and vice versa, creating de facto two
‘phatic communion’ groups based on communicative style.
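The two predictions can be read as simple tallies over coded observations. The following is only a sketch of how such a tally might be set up, not the procedure the research team used: the record layout, the style labels and the participant codes (A1, I1, etc.) are all hypothetical, and the sample data is invented solely to make the example runnable.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssentObservation:
    """One participant's coded behavior in one enactment of assent.

    Hypothetical coding scheme, assumed for illustration only.
    """
    enactment: int              # index of the enactment of assent
    participant: str            # anonymous participant code
    nationality: str            # "US" or "IT"
    style: str                  # e.g. "ample_nod", "brief_nod", "head_jerk"
    gaze_target: Optional[str]  # participant gazed at principally, if any


def style_shares(observations):
    """Prediction 1: share of each head-movement style within each nationality."""
    counts = defaultdict(Counter)
    for obs in observations:
        counts[obs.nationality][obs.style] += 1
    return {
        nat: {style: n / sum(c.values()) for style, n in c.items()}
        for nat, c in counts.items()
    }


def same_style_gaze_rate(observations):
    """Prediction 2: fraction of gazes aimed at someone using the gazer's own style."""
    style_of = {(o.enactment, o.participant): o.style for o in observations}
    matches = total = 0
    for obs in observations:
        target_style = style_of.get((obs.enactment, obs.gaze_target))
        if target_style is None:
            continue  # no gaze coded, or target not coded in this enactment
        total += 1
        matches += target_style == obs.style
    return matches / total if total else None


# Invented toy data (NOT taken from the video recording).
TOY = [
    AssentObservation(1, "A1", "US", "ample_nod", "A2"),
    AssentObservation(1, "A2", "US", "ample_nod", "A1"),
    AssentObservation(1, "I1", "IT", "brief_nod", "I2"),
    AssentObservation(1, "I2", "IT", "head_jerk", "A1"),
]

print(style_shares(TOY))
print(same_style_gaze_rate(TOY))
```

Under this reading, prediction 1 would be supported if the share of ample nods exceeded one half among the Americans and the combined share of brief nods and head jerks exceeded one half among the Italians; prediction 2 would be supported by a same-style gaze rate well above what chance pairing would give.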
6. Results
6.1 The data gathered was initially compared with classical studies on gaze and head positions (see
Section 1.b) to give the student-researchers practice in noting and collocating visual cues. The
nomenclature and data categories for describing eye/head movement in Kendon (1967) and Argyle &
Cook (1976) were found applicable to our data; no observations disconfirmed these authors’ findings.
Allen & Guy's (1974) findings matched observations, too, except for the following conjecture that left us
flabbergasted (p. 137):
We do not claim that the various head movements make any distinctive
communication, and for the most part they do not. [...] If head tosses are not part
of the signal system, then what part do they play? We think it is a form of exercise
which is inherently rewarding to the actor. It also provides variety and interest for
the partner because he has a mobile object of attention.
Commenting on Birdwhistell's (1970) stimulating text would require a separate article. Pezzato and
Poggi's (1998) typology of gaze was not available for verification.
6.2 As for confirmation of our two predictions, results were disappointing. No regularities of the kind
predicted were found. The causes: there were too many dependent variables involved; the video
recording was much too brief; and ‘typical’ gestures, we discovered, are not exhibited as regularly as we
had thought they would be. As a consequence, we were not able to furnish quantitative evidence for our
two hypotheses. We could only try to verify them through case-by-case studies of ‘presumably’
culturally-marked behavior (not necessarily the predicted eye/head movements) associated with
‘apparent’ inclusion/exclusion (not necessarily manifested as predicted, i.e., through a communion of
gaze). In other words, we had to fall back on the hermeneutic practices we had hoped to avoid having to
use exclusively. The example which follows shows the kind of ‘knowledge’ we were able to obtain by
examining one of the frames we created for our storyboard (frame 4; duration: 0.5 seconds). We framed
that particular one-half-second sequence of video because in it we vaguely sensed various enactments of
intentionality, assent among them. Let us now try to give them a more specific meaning.
6.3 In frame 4, EMILY and BOB nod at PHIL, while BARBARA turns toward him and starts to nod
(amply in frame 5). The Italians finish turning their heads toward PHIL and remain immobile; NADIA,
immobile, smiles faintly (EMANUELA and MINO, too?).
The group stance captured in this frame seems to be one of general assent. So why are the postures
different? One explanation is that there are culturally-different ways of enacting assent: by ‘nodding’
(USA) and by a more reserved style, ‘immobility-with-a-polite-smile’ (Italy), one which we had failed to
predict for the Italians. But are the Italians really assenting? Is not their immobility simply due to the fact
that PHIL is addressing his question to the Americans (by implication and by gazing at BOB)? And is not
immobility the way any bona fide member of the group-as-a-whole would behave, when not questioned
directly? In that case, there is indeed a cultural division here, but not because of different styles of
assenting. The Italians are simply awaiting their turn.
On the other hand, it is not really clear that PHIL’s question was in fact directed only to the Americans.
‘Keying into’ the spirit of the conversation, through debriefing, we felt the question to be fairly open. And
in any case the Italians still could have shrugged, cocked their heads or expressed bystander assent
through semi-nodding (back-channeling). In other words, the evident division here into two cultural
sub-groups is not simply due to turn assignments. The Italian sub-group is manifesting disinterest through
lack of somatic participation (although their gaze is on PHIL, in unison with the Americans).
Or perhaps not. Additional debriefing made us sense two intentionalities in the Italians’ stance: assent to
PHIL’s implicit assertion that "Idioms can certainly be curious"; and non-assent to PHIL’s implicit claim
that "Being American makes an idiom OK" (we still felt "Give me a break!" to be illogical). If this is so,
then the Italians are indeed participating: they are enacting only partial assent. Note that we are not
confirming our initial explanation: ‘nodding’ and ‘immobility-with-a-polite-smile’ are not two
culturally-different ways of expressing the same assent. They are enactments of different kinds of assent, i.e.,
different positions on the acceptability of a certain idiom (undoubtedly due to culture, but that is another
question). Our second prediction is therefore disconfirmed with respect to gaze and unverifiable as to the
creation of sub-groups by gestural affinity.
As for the first prediction (head movements as culturally typical, at least among Americans), our data
offered no real confirmations. Classical studies of assent, using larger samplings (Argyle & Cook 1976),
tell us that, in fact, most of the (American) subjects in the investigators' experiment nodded to say "yes".
On closer examination, however, this finding turns out to be fairly worthless: it does not tell us who
nodded, how and in what circumstances. Nodding in relaxed laboratory conditions does not make
nodding universal: had they felt intimidated by a touchy question, Argyle & Cook’s subjects might have
simply smiled to assent (like our Italian students here). Most of all, the findings do not tell us the intent of
the subjects who ‘nodded to assent’. Perhaps assent was not the message they intended to communicate
at all. For example, debriefing revealed that BARBARA's stance (frame 5), facing PHIL head on and
nodding, was one of confrontation; she was not assenting but saying: "OK, you've spoken; now let me get
on with chatting up BOB; you've already stopped me twice!" One could even look for finer shadings.
EMILY's tilting nod differs from BOB's ‘typical’ straight nod: she may therefore mean something other
than "yes" or, if "yes", she may be adding a touch of personal warmth. (This would disconfirm our
prediction that ample straight nodding is standard among Americans, with idiosyncrasies appearing in the
bundled signals, e.g., squinting, which in fact EMILY does). EMILY was unable to tell us later what she
had meant.
To conclude, in frame 4 it is probable that the Italian students, although they did not nod or jerk their
heads back, were assenting in some way; but it is not easy to say in what way or how they would have
enacted assent if the implied question had been simply "Aren’t idioms curious!?" It is also probable that
one or more of the Americans who nodded meant something other than assent (but what?) and that the
ample nod used by one of them is not so typical of Americans as we had thought (but why did we think
so?). This is all the ‘knowledge’ we were able to obtain: not very much and not very certain.
Yet the web of subtle signals creating the group stance in frame 4 is, we would argue, both perceived and
‘understood’ (unconsciously) by everyone. In making up both group and individual stances, these signals
acquire connotations of intent. Sensing them is part of the conversational competence that an L2 learner,
or any speaker, must acquire (4.2.3). Not only is classical (distributional) research methodology unable to
define them, it does not, we claim, constitute a valid means of ever understanding them in situ.
6.4 Paradoxically, our major research finding was that the kind of study we had initially undertaken could
not furnish the kind of knowledge we were seeking. Even if we had had a longer video tape to examine, it
could only have allowed us to confirm (or disconfirm) that, in general: 1. Americans nod more amply
than Italians; 2. Italians who nod amply get looked at more often by Americans. These would be
interesting findings, of course. But even using debriefing to add qualitative data to these quantitative
findings, we still would be unable to tell L2 learners how to nod when conversing with Americans.
Certainly not all the time, nor for every assent, nor always amply. When, then, should they nod? In what
circumstances, in what manner, to what effect? With what intent?
Thus, our failure to find answers to our research questions turned out to be relatively unimportant as soon
as we discovered that the questions we had asked were relatively unimportant. The debriefing technique
used to give meaning to our analyses had had the effect of making us realize how much we were ignoring
(or misunderstanding) in our video-recorded conversations by considering them simply as texts with
behavioral regularities to be catalogued. Debriefing alone, however, could not furnish the knowledge we
sought. Another research tool — or even paradigm — was felt to be necessary.
6.5 To see in what terms, let us leave our research questions aside and examine another frame with the
sole purpose of trying to assign meaning to the eye/head movements we perceive. In frame 6 PHIL's
posture can be labeled head droop with hands in pockets. But what is his stance? What is that posture
saying in (and to) the group?
6.5.1 If we were to forget the considerations raised in Section 4.7, we might be tempted to indulge in the
kind of projective interpretation that is typical of second-rate literary critics: we would project upon
frame 6 our vision of what Americans like PHIL must be like and then look around for American cultural
icons that ‘prove’ we are right. For example, our minds filled with Norman Rockwell paintings and John
Ford films, we might interpret PHIL's stance as expressing typical American small-town informality (and
as our Sociocultural Questionnaire showed, PHIL had in fact lived in a small town). Or perhaps we might
have declared, with equal certitude, that PHIL's stance marks typical American male shyness in the
company of women (according to the movie cowboy stereotype PHIL's posture suggests). Or,
remembering the "I-have-spoken" American ‘Indian’ stereotype, we might have interpreted PHIL's
posture as a culturally-marked turn-relinquishment cue (although this is contradicted by BARBARA who,
in frame 3, seems to stick her hands in her pockets as a bid for the floor in order to speak to BOB).
6.5.2 To avoid the arbitrariness of superficial hermeneutic analyses like these, we might assign to PHIL's
posture only the very general meaning it has in any circumstance (its ‘dictionary meaning’). Birdwhistell
(1970) claims that ‘downward’ always indicates some kind of ‘diminishment’. So to play safe, we might
want to limit ourselves to saying that PHIL is signaling an ‘easing-up’. But while such an affirmation is
undoubtedly true, it tells us almost nothing. Is the ‘easing-up’ a proclamation of greater informality, a
manifestation of embarrassment after having guffawed before women, or a signal of turn-relinquishment
(to repeat the three previous hypotheses)? Or is it something else again?
According to our premises (Section 4), there can be no answer if we study a gesture in isolation. PHIL's
posture will become a ‘stance’ only when we see it as part of a contribution, i.e., as part of a perceived
enactment of individual/collective historical will.
6.5.3 The concept of "game" (Goffman 1981) would seem to offer just such a perspective. Games, in fact,
link a series of events temporally and causally. This allows us to avoid the trap of studying a
phenomenon in isolation; it also gives every move historical density. Moreover, games simplify human
activity: behavior is explained as the rational maximization of gains, a principle that characterizes human
activity (at least in part) universally. This allows us to avoid the trap of cultural stereotyping.
All of PHIL’s actions in frames 1-6 can, in fact, be viewed as game-like responses to a stimulus in the
previous frame. In frames 1 and 2 MINO expresses surprise at the strangeness of the American expression
"Give me a break!" If we hypothesize that calling into question the reasonableness of people's language,
especially by a foreign speaker of that language, is tantamount to calling into question the reasonableness
of their culture, then we can read PHIL's movements in frames 2 and 4 as a challenge to MINO's attack.
PHIL is using his voice and gaze to rally his fellow Americans around the flag. In frames 4 and 5, PHIL's
compatriots nod their assent while the Italians at least acquiesce by their immobility. Thus, in frame 6
PHIL is responding with a self-congratulatory stance: he is taking a modest bow for his victory over
MINO.
But this ‘game model’, to work, requires creating a ‘story’ based exclusively on rules of competition that
ignore other psychological and cultural drives possibly at work. Above all, it views the interaction as a
‘text’, the sense of which is reconstructed from the outside — just like the other interpretations. It is
therefore just as (potentially) arbitrary.
6.5.4 What we need is to be able to interact with the events (the interplays of minds and wills) that
produce meaning in a conversation. We could then test interactively the sense that a ‘word+tone+gesture’
acquires reactively. Hermeneutics (6.5.1), inductive systematics (6.5.2) and formalism (6.5.3) are clearly not
interactive. A mixed hermeneutic/empirical method, that uses debriefing to countercheck interpretations,
comes closer. It is, in fact, the method used in this project (6.3). Still, it does not offer experimental
validation of falsifiable claims, like experimental science. Debriefing gives, not ‘proofs’, but clues that
require interpretation as much as the conversation itself.
The reader will recall, for example, the sense we assigned to the Italian students’ immobile heads in frame
4. Debriefing (supposedly) revealed that the students’ intended message was only partial assent. But
were the students sincere and accurate in recalling, during debriefing, their intentional states? Perhaps if
we had given more weight to certain clues (their hesitations and hypercriticism), we might have come up
with a less charitable but possibly more realistic interpretation of their immobility during PHIL’s call for
consensus in frame 4. We might have concluded that, having such poor conversational skills in English,
the Italian students simply did not understand what PHIL wanted and therefore limited themselves to a
polite smile. In other words, the intent to distance themselves from PHIL’s call to rally around American
English was something that the Italians felt only during debriefing, as a rationalization. This
interpretation does not change the conclusions we reached in 6.3; but it shows how unreliable debriefing
can be.
In short, debriefing requires subjects with a rare capacity for minute recall, can provoke false memories,
works only if participants are always willing and available to be debriefed (PHIL, for example, wasn't)
and requires interpretation anyway. What we needed, then, was a technique that would allow us to
participate in the genesis of meaning as it happens. For if conversations are simply texts, then they only
need to be analyzed to be understood. But if conversations are events, then they must be lived to be
understood.
7. Lessons learned
7.1 Three lessons were learned from the present experimentation. The most important was a clarification
of the notion of sense in discourse. This question may seem futile to a lay person who sees conversations
as: 1. purely epistemic transactions (information exchanges), 2. conducted entirely through words
(frameable as text), 3. by interlocutors who are culturally/psychologically unproblematic (for themselves
and for the observer). But as soon as that person engages in ‘small talk’ within some group as an
outsider (by age, nationality, social class, tastes/interests, etc.) — and fails to ‘key in’ — these three
idealizations quickly crumble. Indeed, not even we, as bilingual insiders, were able to make complete
sense of the behavior in our videos (analyzed postmortem as ‘texts’) until we hit upon an alternative
research method for CA. Let us now try to describe it.
We just recalled how debriefing showed us the limits of studying conversations as ‘texts’; and yet
debriefing itself can be quite unreliable, due to memory lapses and false recollections. Why not then, we
started asking ourselves, turn the researcher into a conversationalist trained to assimilate her
interlocutors’ Weltanschauung, appropriate their stances, and then debrief herself on the spot?
Memory would no longer be a problem and, more important, the researcher could experiment with the
sense of the stances she feels she has grasped. This is what ethnographers and psychotherapists do all the
time. We would only have to shift the focus from words/deeds as revelatory of a cultural system
(ethnography) or a psyche (psychotherapy), to words/deeds as revelatory of the existential meaning-making
event we call conversation. In fact, this is what conversation analysts do all the time, too,
except that, in practicing introspection to ascertain the probable meaning of someone else’s utterances,
they do not always use systematic strategies to limit projection and accept the ‘dictates of the object’ (see
4.2.2).
Our alternative CA research paradigm therefore sees the analyst as a co-conversationalist, someone who
creates existential meaning-making events conjointly with her informants as a peer, i.e., adhering to their
values (but see 7.1.3) and thereby learning to interpret the events from within their culture. Like an
ethnographer, the researcher agrees to be a ‘dumb but willing learner’ in the eyes of her interlocutors.
Like a therapist, she uses transference and countertransference as her investigative tools. Unlike either,
she does not seek to fit her interactions with her interlocutors into a system. Her holistic, collegial
procedure (like conversation itself) aims at obtaining, not ‘laws’ or ‘models’, but rather transient, situated,
non-formalized intuitions of the intentionalities enacted.
At first glance, this kind of knowledge may not seem ‘scientific’. It is in fact a non-epistemic variety that
Aristotle calls phronesis (Nicomachean Ethics, VI, 1140a), translated generically as ‘wisdom’ but best
rendered by ‘discernment that generates rules of procedure’. It is knowledge as ‘sure’ as that of any
social science. An example will make its specificity clearer. Phronesis is the knowledge of, say, political
history that a top-rate diplomat has: he ‘sees’ history in the choices to be made...and is usually right; his
books on history, however, are often judged by Academia as accurate but ‘impressionistic’. (Note that
phronesis is not practical expertise, i.e., techne in Aristotle’s trilogy: the diplomat may be a brilliant
analyst but a poor negotiator.) Epistemic knowledge, on the other hand, is the knowledge of political
history that a top-rate historian has: in his books he can demonstrate his assertions...and is usually right;
but his capacity to ‘see’ history in the making may be judged by seasoned diplomats as ‘schematic’ and
‘inaccurate’. (Note that epistemic knowledge is not necessarily theoretical: the historian, if pedantic,
may infer facts brilliantly from available data but have no theories to offer. Note, too, that it goes together
with technical know-how: every great historian is also a paleographer.) Universities traditionally teach
episteme and techne, not phronesis, and thus produce knowledgeable graduates who don’t know how to
use their knowledge. As we shall suggest in our conclusions, the kind of student-led research that we are
proposing here may provide an answer. In any case, let us now try to justify experiential/procedural
knowledge (phronesis) as a valid research tool in conversation analysis.
7.1.1 As is well known, Saussure distinguished two objects of linguistic inquiry: parole, i.e., what people
effectively say and mean using language, and langue, i.e., the semantic/linguistic forms inferable from
what people have actually said and presumably meant. He then proceeded, as is equally well known, to
treat only the latter and three generations of linguists have followed him. (Among the isolated exceptions
was Saussure's co-editor Bally himself.) But what many linguists seem not to have noticed is that parole
is not the mere application of langue to a specific context. Instead, langue is an abstraction from parole
and a partial abstraction at that: a frail skeleton that gives only a hint of what the real body, parole, is
like. Parole is "the sum of individual cases" (Saussure, in De Mauro 1972:30,395, our transl.) and thus
cannot be reduced to langue.
One would have thought that pragmaticists, intent on studying real acts of speech, might have redressed
the imbalance that has favored the study of langue. To some extent they have, of course. But this has not
led to the study of parole as immanence, i.e., as discourse produced and apprehended through phronesis.
Indeed, some pragmaticists seek to create, as it were, a langue of pragmatic effects — thus, once again, a
‘disembodiment’ of parole. Their efforts are commendable in that they enable us to put handles on
regularities. But communication specialists — discourse analysts, international negotiators, L2 learners
— need to come to grips with what is unrepeatable in a communicative event, not simply with what is
generalizable. If a musicologist were to show us that melodies in every part of the globe can be reduced to
a few combinations of notes, recursively expandable, she would undoubtedly help us to understand why
music is a universal language. But if we were musicians looking for new styles, we would still need to
know what makes the traditional music of Dakar or Bali or Sofia unique.
The majority of pragmaticists, of course, do study conversational interactions as parole; but they tend to
treat those interactions as ‘texts’, seldom as ‘events’. (For an example of an ‘event’ approach, see
Contento 1998.) Like historians, they carefully transcribe the exchange (usually only the words,
however); take note of settings, roles, and relationships; then use these materials to explain how the ‘text’
— which, we claim, is not the event — develops. To discover meaning on a deeper level, some even
proceed to analyze the ‘text’ hermeneutically: they reconstruct it, as literary critics do with a novel.
7.1.2 But why take for models the methods of historians and literary critics, specialists in the study of
words imprisoned on a page? Pragmatics, after all, deals with live communication. Surely we can learn
from specialists in the experiential/procedural comprehension of discourse events: sociologists
engaged in participant observation, ethnographers in the field, group therapists, social workers, diplomats,
trial lawyers, even the archeologists of Lejre, Denmark (Bibby 1970), who, to grasp the meaning of the
Neolithic tools they had unearthed, created a prehistoric-like camp in which they dressed and lived as the
users of those tools presumably did. Indeed, their example illustrates admirably the central,
neo-Saussurean thesis of this paper: language as parole is not ‘words’, nor even ‘words+tones+gestures’,
but the enactment of a historical will in a communicative event. Words, tools, empires are but the
residue of that will — the meager formalizations studied by linguistics, anthropology and history
(4.2.1). Parole itself can be grasped only experientially, by making the will that created it one’s own.
7.1.3 The professionals just listed all possess heuristics for grasping their interlocutor's meaning from her
standpoint, heuristics which vary considerably. For example, testing the real values of one's interlocutor
by provoking her verbally is something diplomats and trial lawyers do commonly, social workers and
group therapists do less, and participant observers or ethnographers only rarely. On the other hand,
participant observers and ethnographers usually adapt to the value system of the host population, group
therapists and social workers remain themselves while creating a group identity with their interlocutors,
and diplomats and trial lawyers usually keep their identities and their distance.
7.1.4 We hypothesize that it is possible to use combinations of these heuristics in order to grasp the
meaning of conversational behavior from the point of view of an insider. To get an idea of what we mean
concretely by ‘heuristic’, the reader may now turn to Appendix 4 and inspect one of the modules of the
training program in participant observation that we developed subsequently for our student-researchers.
This article will not go into further detail in describing the acquisition of the various heuristics. Our aim
here is to describe the kind of knowledge such heuristics can give. In Section 7.1 we defined it as
phronesis: but what does that mean in practice? What, for example, would one of our student-researchers
have learned if she had used the participant observation heuristic in the conversation pictured in our
storyboard? Let us imagine that we are that student-researcher. We are an ESL learner with only a
rudimentary knowledge of English, standing where NADIA is standing in the pictures.
7.1.5 So how would we perceive the conversational event and PHIL’s vocal, prosodic, and somatic
behavior? The answer is easy. Even with no training in any heuristic, we could not help but sense PHIL’s
behavior as a ‘contribution’ (see Note 8). Even if his words are unclear, his intent ("Hear this!") imposes
itself upon us, as does the group’s intent ("Hear him!"). With appropriate training, on the other hand, we
would be more attuned. For example, if we had internalized reciprocal rousing as a value (i.e., what
young American males typically do when they greet each other boisterously or when they slap each
other’s hands after some kind of victory), we would intuitively feel PHIL’s stance as rousing and
his /HUH?/ as a call, more than a question; our reaction might be to tilt our head up with a smile and with
an exclamation already forming on our lips. This is what BOB is doing in frame 4 and his gesture is
greeted with a twinkle in PHIL’s eye. But what if the sense of PHIL’s stance — for him — is not
rousing, as it is for the group? What if, let’s say, PHIL has heartburn and, for some reason, has
guffawed to express pain? In that case our glance and smile would be met with a wave of the hand ("No,
that’s just me."); if we persevere, with a reprimand ("Forget it, forget it..."). In other words, by acting
coherently with PHIL’s culture, we would get him at least to correct us (and not simply ‘look through’
us, as often happens in native/non-native encounters). This is what we mean by experimenting with stances in
order to grasp the sense they have for the ‘natives’. The insight gained is what allows us to analyze the
videos of our conversations more competently, seeing stances and contributions as they were seen.
7.1.6 But what do we do if we did not participate experimentally in a conversation we now wish to
analyze? In that case we simply replicate the situation, like the archeologists of Lejre. This is, in fact, how
we finally answered our question about PHIL’s posture in frame 6. So what does PHIL’s head droop with
hands in pockets mean? Students in a later year agreed to try behaving like PHIL when
conversing with American friends. Whenever someone expressed perplexity over some fact of life, they
would erupt with a guffaw, claim normality ("That’s life!"), rouse consensus (/HUH?/), muse a moment
and then go into the downward posture. Moreover, they would do so while temporarily adopting PHIL’s
value system (reconstructed from our Questionnaire).
What PHIL’s head movement surely meant, according to our veteran experimenters, was something we
might label "swallowing food for thought". In fact, after musing a moment over the fact for which they
had claimed normality, the experimenters needed to break with their thoughts. Letting their heads droop
(then brusquely rise: not pictured) helped them to do so; furthermore, stuffing their hands into their
pockets was a way of steeling themselves for the break and returning to the group in a battle-ready stance.
7.1.7 As for the validity of our ‘game’ hypothesis (according to which PHIL’s posture is a self-congratulatory stance, i.e., "taking a modest bow": 6.5.3), it was judged dubious by our experimenters.
They felt victorious in having normalized a perplexity, not victorious over whoever had expressed it.
They did feel ‘strategic interaction’ with the group as a whole, however, which is in fact what Goffman
meant. We are therefore considering developing a game-theoretical explanatory apparatus fed with data
from our simulations (as in experimental microeconomics) to sift out the competitive behavioral constants
from the communicative practices we observe in multi-cultural encounters.
7.1.8 As for the first two of our three projective interpretations ("typical American sign of informality";
"cowboy shyness in female company": 6.5.1), our experimenters judged them parochial. After all, PHIL
certainly felt his posture as normal, not "typically American" and "highly informal" (as it would be seen
in Italy). And, in fact, "normal" is how our experimenters felt that posture in their reenactments, thanks to
cultural assimilation. Moreover, while PHIL’s body language in female company might be seen as "shy"
in Italy, our experimenters lived it as ordinary self-control. Clearly, then, the first two projective
interpretations tell us more about their (Italian) framers’ mentality than about their object: PHIL’s posture
as meaningful to him and his compatriots (see 4.2.2.). An element of truth was recognized, however, in
the third interpretation: "solemn American-’Indian’-like turn-relinquishment". In fact, the
experimenters reported feeling under pressure after having roused consensus — all eyes were upon them
(just as in our storyboard, frame 4) — and so they gazed downward to ‘think in peace’ about what they
had said. Kendon (1967) rightly notes that a downward gaze while thinking liberates speakers from their
interlocutors’ stare and does not count as a turn relinquishment. But our more introspective experimenters
reported that their intent to muse contained a hidden intent to cede the floor (in contrast with a
concurrent, more conscious intent to keep the floor in case they thought of some quip to make). In frames
5 and 6 of our storyboard, PHIL’s stance seems to share this ambivalence: for EMILY, PHIL’s stance
says he has ceded the turn to MINO (she gazes at MINO who apparently looks blank; so she looks away
but not at PHIL); for the others PHIL is still claiming the floor.
7.1.9 In conclusion, our final answer is that, in frame 6, PHIL is signaling a break with his thoughts
(and a quasi-relinquishment of turn), together with a brusque return to invigilating the
conversation; his is a retake-control stance.
The experimenters added that, as Italians, they would never put their hands in their pockets to muse or to
conclude their thoughts although, after having adopted PHIL’s Weltanschauung, they felt natural doing so
in English. Indeed, the gesture gave them a greater feeling of self-possession and self-control, two of
PHIL’s values they had chosen to internalize to be more in tune with their American conversation
partners. And feeling those values more, they found that they communicated better with their partners.
Our second hypothesis (5.1) thus receives an unexpected (albeit partial) confirmation. Adopting PHIL’s
hands-in-pockets stance every so often actually helped these Italian students to feel — and to be — ‘one of
the group’. But their success, let us quickly add, does not prove that ‘similar conversational behavior’ and
‘acceptance as an insider’ are mechanically linked. Something more subtle had occurred in the
interpersonal and group dynamics. As one of the student-experimenters reported: "My American friends
did not consider me ‘closer’ because I put my hands in my pockets; but by putting my hands in my
pockets, I considered myself ‘closer’ to them... and so they treated me that way."
7.2 A second lesson was learned from the research project illustrated in this paper: a reciprocal
relationship between research and teaching can be extremely enriching, not only at the graduate level (as
is currently practiced), but at the undergraduate level as well.
Normally, undergraduate students are not considered motivated enough to want to learn the basic
concepts and tools of a discipline through conducting a research project, since this means a lot of self-study (terminology, etc.) to save class time for original investigations. Thus, university systems have
students study, for four years, the rote ‘factual’ knowledge of all the various disciplines. Then if they go
on to graduate studies, students get to study the discipline ‘hands on’ by carrying out research on a topic
of interest.
This educational practice, we suggest, may actually be counterproductive: knowledge is either ‘hands on’
or it is just words. Wouldn't it be better to have biology graduates with perhaps less encyclopedic
knowledge of history and language but with real knowledge (phronesis) of what documenting a historical
fact and learning a natural language mean? The same applies, naturally, to history or language students
with respect to biology.
In any case, it is worth noting that the research questions asked in the present study — and so the
perspective given to the study of conversational interaction (what we "chose to notice": 4.1) — were
dictated by the needs and curiosities of the ESL students who materially carried out the
project. In other words, without the bi-directional interaction between teaching and research, this would
have been a paper on statistically relevant concordances between gaze and head movements in some
sample population.
7.3 Finally, the research procedure described here calls into question ‘obvious’ contradictions like those
said to exist between theoretical research, applied research and technological implementation. The very
terms are full of ambiguity and mask issues worthier of discussion (for instance, the interplay with
experiential knowledge, phronesis).
Was this project, then, an example of ‘applied research’? It did in fact spring from a real need to know
expressed by the participating L2 teachers and students. Still, the methodological issues that this project
raised (and that are discussed throughout this paper) were, if anything, theoretical. The question, then, is
not whether we need more theoretical or more applied research. The question is: whose needs should we
try to satisfy better, by creating new elaborations and procedures that result in what? Only when we fail to
answer that question — when our ends become simply furthering some tradition or innovation (or simply
some interest) — do the mental elaborations called "theoretical" seem disjointed from those called
"applied" or "technological". The ends of the research project presented here were to furnish L2 students
and instructors — in a university system which had no consideration for second language studies as an
academic discipline, and provided little or no support for teaching and research in that field — with tools
to acquire (and to help others acquire) intercultural conversational competence of the kind needed in
today’s global village. The parole-centered, experiential investigative apparatus hypothesized here was
created in response to that need.
____________________________________________________________________________________
NOTE
The project reported here, Spontaneous Conversation in English, was financed with funds for
extra-curricular student activities (Iniziativa Didattica Studentesca 40, 1982-83) provided by the
University of Rome La Sapienza. This paper is a rewriting by P. Boylan of N. Mari's 1983 eight-page unpublished report on eye/head movement in the video-recorded interaction. The other
student reports, also to be rewritten for publication, treat other aspects of kinesics (G. De Lorenzo,
P. Noce), prosody (I. Avvenente, A. Lamanna, S. Meli), oral syntax (F. Franchi, G. Panico), and
conversation routines (E. Agostini, N. De Cillis, G. Runcio). Thanks to Adam Kendon for
suggestions on the final draft of this paper. Belated thanks to the student volunteers who, in 1983,
helped with the staging (A. Roselli), lighting (M. Cassano), filming (G. Concetti), sound recording
(G. Saporaro), editing (C. Mosticone) and transportation (N. Fioravanti). A very warm grazie to
the students from Pitzer College, Temple University and Trinity College and to their teachers,
respectively L. Marquis, E. Miller and P. De Martino.
___________________________________________________________________________________
References
Allen, D.E., Guy, R.F. (1974) Conversation analysis. Mouton: The Hague.
Argyle, M., Cook, M. (1976) Gaze and mutual gaze. Cambridge: Cambridge University Press.
Austin, John L. (1962) How to do things with words. Oxford: Oxford University Press.
Berrier, Astrid (1995) Au-delà de l'approche communicative. Saint-Laurent: Tré-carré.
— (1997) Four-party conversation and gender. Pragmatics 7/3: 325-366.
Bibby, Geoffrey (1970) An experiment with time. Horizon XII: 96-101.
Birdwhistell, R.L. (1970) Kinesics and context. Philadelphia: University of Pennsylvania.
Boylan, Patrick (1998) Learning languages as 'culture' with CALL. In L. Calvi & W. Geerts (eds.), CALL, Culture and
Language Curriculum. London: Springer, 60-72.
— (forthcoming/a) La comunicazione interculturale. Napoli: Edizioni Scientifiche Italiane.
— (forthcoming/b) To be or not to be: success or failure in intercultural communication. Acts, 8th Congress of SIETAR
Europe, Bath, 1998.
Carbaugh, Donal (1998) 'Just listen': Listening and landscape among the Blackfeet. Panel at 6th International Pragmatics
Conference, Reims.
Castelfranchi, C., Parisi, D. (1980) Linguaggio, conoscenze e scopi. Bologna: Il Mulino.
Ciliberti, Anna (1994) Manuale di glottodidattica. Florence: La Nuova Italia.
Clark, H.H., Schaefer, E.F. (1989) Contributing to discourse. Cognitive science 13/2: 259-294.
Clark, H.H., Wilkes-Gibbs, C. (1986) Referring as a collaborative process. Cognition 22: 1-39.
Colas, A., Vion, M. (1998) Ajustement au destinataire en tâche de communication référentielle. Panel at 6th International
Pragmatics Conference, Reims, 1998.
Contento, Silvana (1998) Ideological divergence and communicative convergence. Panel at 6th International Pragmatics
Conference, Reims.
De Mauro, Tullio (1972) Note. In F. de Saussure, Corso di linguistica generale. Bari: Laterza, 365-456.
Edelsky, Carole (1981) Who's got the floor? Language in society 10: 383-421.
Ford, C.E., Fox, B.A., Thompson, S.A. (1996) Practices in the construction of turns: the 'TCU' revisited. Pragmatics 6/3: 427-454.
Frake, Charles (1964) How to ask for a drink in Subanun. American Anthropologist 66: 127-132.
Gadamer, Hans Georg (1960) Wahrheit und Methode. Tübingen: Mohr (Paul Siebeck).
Garfinkel, Harold (1967) Studies in Ethnomethodology. Englewood Cliffs: Prentice-Hall.
Goffman, Erving (1974) Frame analysis. New York: Harper.
— (1981) Forms of talk. Philadelphia: University of Pennsylvania Press.
Grossen, M., Salazar Orvig, A. (1998) Clinical interviews as conversational interactions. Pragmatics 8/2: 149-154.
Gumperz, John (1992) Contextualization and understanding. In A. Duranti and C. Goodwin, Rethinking context. Cambridge:
Cambridge University Press, 119-252.
— (1995) Mutual inferencing in conversation. In I. Marková, C.F. Graumann, K. Foppa (eds.) Mutualities in dialogue.
Cambridge: Cambridge University Press, 101-123.
Hall, Edward T. (1966) The hidden dimension. New York: Doubleday.
Jensen, A.A., Jaeger, K., Lorentsen, A. (eds.) (1995) Intercultural competence. vol. II. Aalborg: Aalborg University Press.
Kendon, Adam (1967) Some functions of gaze in social interaction. Acta Psychologica 26: 22-63.
Kerbrat-Orecchioni, Catherine (1990-4) Les interactions verbales. 3 vols. Paris: Colin.
— (1997) A multilevel approach in the study of talk-in-interaction. Pragmatics 7/1: 1-20.
Lo Piparo, Franco (1996) Aristotle. In H. Stammerjohann (ed.) Lexicon grammaticorum. Tübingen: Niemeyer, 39-42.
Liborio, Mariantonia (1979) La costituzione del testo. Napoli: Istituto Universitario Orientale.
Malinowski, Bronislaw (1944) A scientific theory of culture. Chapel Hill: University of North Carolina.
Mondada, Lorenza (1998) Therapy interactions. Pragmatics 8/2: 155-165.
Orletti, Franca (1992) Modalità epistemica e epistemologia in Medina. In A. Giacalone Ramat & G. Crocco Galeas (eds.)
From pragmatics to syntax. Tübingen: Narr, 365-384.
Poggi, Isabella (1997) La partitura di Totò. In I. Poggi & E. Magno Caldognetto (eds.) Mani che parlano. Gesti e psicologia
della comunicazione. Padova: Unipress, 161-173.
Pezzato, N., Poggi, I. (1998) Lexicon of gaze. Panel at 6th International Pragmatics Conference, Reims.
Sacks, H., Schegloff, E.A., Jefferson, G. (1974) A simplest systematics for the organization of turn-taking for
conversation. Language 50/4: 696-735.
Sartre, Jean-Paul (1943) L'être et le néant. Paris: Gallimard - nrf.
Schank, R.C., Leake, D.B. (1989) Creativity and learning in a case-based explainer. Artificial intelligence 40: 353-385.
Shea, David P. (1994) Perspective and production: structuring conversational participation across cultural borders. Pragmatics
4/3: 357-389.
Tenny, Yvette J. (1989) Predicting conversational reports of a personal event. Cognitive science 13/2: 213-233.
Van Dijk, Teun (1982) Episodes as units of discourse analysis. In D. Tannen (ed.) Analyzing discourse: text and talk.
Washington, D.C: Georgetown University Press, 177-195.
Watzlawick, P., Beavin, J.H., Jackson, D.D. (1967) Pragmatics of human communication. New York: Norton.
Send comments or questions to:
Patrick Boylan
Department of Linguistics
University of Rome III
via Ostiense 236,
00146 Rome, Italy
Tel. (+39) 06 491 973
Fax: (+39) 06 233 213 106
Appendix 1: Storyboard of a fragment of a multi-cultural multi-party conversation
Appendix 2: Description of the Storyboard (Contribution) — Duration: 6 seconds
Participants, clockwise from far left (numbers give position on an imaginary dial):
9: PHIL / USA; 10: EMILY / USA; 12: NADIA / Italy; 2: BOB / USA; 3: BARBARA / USA (a young
ESL teacher in Rome; she knew only Emanuela); 6: EMANUELA / Italy; 7: MINO / Italy
FRAME 1 (1 second): MINO is speaking to BOB, while gazing at him. The OTHERS are looking at
MINO but with their heads pointing toward the center of the circle, thus obliquely in most cases. Only
EMANUELA turns her head to look at MINO directly.
MINO: you SAID , +GIVE me a BREAK —
Note: "Give me a break!" is an idiomatic American English expression meaning "That’s enough!"
FRAME 2 (1 second): MINO is repeating his utterance. EMANUELA is turning her head back to the
center and probably glancing at BOB, as are the OTHERS (except NADIA). BOB is lowering his eyes,
smiling, and emitting a very soft, constrained chuckle. After MINO finishes speaking, PHIL intervenes
with a guffaw and attracts NADIA's gaze.
MINO: GIVE me a BREAK —
PHIL: HA +HA !
FRAME 3 (0.5 seconds): Now EMILY is looking at PHIL, too, without moving her head. So is BOB
who, lifting his eyes, is passing from a low chuckle to a chortle. PHIL is cocking his head toward MINO
while uttering something out of the corner of his mouth. BARBARA, sticking her hands in her dress
pockets, is turning her head toward BOB.
PHIL: =THAT’s −aMERican .
FRAME 4 (0.5 seconds): PHIL, turning his head back to the center of the circle (and looking directly at
BOB), is uttering an even louder guffaw mixed with a monosyllabic utterance. BOB is nodding and
uttering another low, less constrained chuckle. EMILY is cocking her head toward PHIL and nodding
slightly with a wide smile, closing her eyes partially. Both MINO and EMANUELA finish turning their
heads toward PHIL and remain immobile. NADIA, immobile, has a faint smile. BARBARA, who was
turning her body to face BOB directly, brusquely begins turning back to her original position while
smiling.
PHIL: HUH ?
FRAME 5 (1 second): PHIL is lowering his eyes and repeating MINO's utterance (with a citation tone).
EMILY is turning her gaze toward MINO who is lowering his head (and gaze?) slightly. BARBARA,
finishing turning 180°, is now facing PHIL head on. She nods amply.
PHIL: +GIVE me a BREA::K .
FRAME 6 (2 seconds): EMILY, after glancing at BARBARA, is turning her head back and upward,
looking into space. NADIA is beginning to smile more broadly as her eyes follow the direction EMILY is
indicating with her glance. PHIL, smiling with a closed mouth now, lowers his head, stoops his shoulders
while moving them slightly, and sticks his hands in his pockets.
GROUP: ‘ ‘
Appendix 3: Transcription conventions
Utterances are aligned along a horizontal time bar to show overlapping/chaining. Paralinguistic
realizations are given in the description in Appendix 2. Silence is indicated by apostrophes: ' ' (2
seconds); a prolonged phoneme, with colons: No::: (0.3 seconds). A pretonic syllable is indicated by
caps and a tonic syllable by underlined caps. A pretonic or tonic higher than usual for an utterance-type
is marked by a plus sign + ; if lower by a minus sign – ; if much lower by a double minus sign = . The
tonic syllable can have one of six possible tone movements, indicated by using six punctuation marks:
1. Fall from middle or middle-high to very low (the conclusive tone): .
2. The same as above but with a fall to middle low (non conclusive tone): —
3. Middle or middle high tone slightly rising at the end (introductory phrase tone): ...
4. Middle high to (very) low, then sharply to very high (question tone): ? (??)
5. Rise from middle high to high and then sharp fall to very low (exclamatory tone): !
6. Middle low or very low tone slightly rising at the end (parenthetical tone): ,
"," can also follow "." or "!" to attenuate affirmations or express worried surprise.
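For readers who wish to process the transcripts by machine, the six tone marks above can be treated as a small lookup table. The following is only a minimal sketch under our own assumptions: the names TONE_MARKS and final_tone are hypothetical, the long dash is assumed to be the character U+2014, and nothing of the kind was used in the original project.

```python
# Illustrative sketch only: TONE_MARKS and final_tone are hypothetical names,
# not part of the original study. Labels follow the six tones of Appendix 3.
TONE_MARKS = {
    ".": "conclusive",
    "\u2014": "non-conclusive",      # the long dash used in the storyboard (assumed U+2014)
    "...": "introductory phrase",
    "?": "question",
    "!": "exclamatory",
    ",": "parenthetical",
}


def final_tone(utterance: str) -> str:
    """Return the tone label for the mark that closes a transcribed utterance."""
    text = utterance.rstrip()
    # A "," placed after "." or "!" attenuates the preceding tone
    # rather than marking a parenthetical tone of its own.
    attenuated = len(text) >= 2 and text[-1] == "," and text[-2] in ".!"
    if attenuated:
        text = text[:-1]
    # Try the three-character mark first so that "..." is not read as ".".
    for mark in sorted(TONE_MARKS, key=len, reverse=True):
        if text.endswith(mark):
            label = TONE_MARKS[mark]
            return label + " (attenuated)" if attenuated else label
    raise ValueError(f"no tone mark at end of: {utterance!r}")
```

Applied to the storyboard, final_tone("HUH ?") gives "question" and final_tone("+GIVE me a BREA::K .") gives "conclusive". Silences and pitch-height signs (+, −, =) are deliberately left out of this sketch.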
Appendix 4: Preparation for field research / Participant observation: Empathy
The module below seeks to enhance the student-researcher’s capacity for empathy, usually considered a "rather vague notion"
(Mondada 1998:159) or, at best, a ‘gift of nature’. It is taken from a training program developed between 1984 and 1991 as part of
an ‘alternative’ English course at the Teachers’ College of the University of Rome. The course taught English in a perspective
akin to what is now called Intercultural Communication (Jensen et al. 1995). Further details may be found in Boylan
(forthcoming/b).
1. Observe foreign ‘twin’; playact her or him
Change: subjects define target values, see how their ‘self’ hinders perception
Tools: ethnographic checklists/practices à la Malinowski
2. Formulate intuited values as maxims
Change: from an epistemic to a volitional stance
Tools: Stanislavski's State of "I am" and Through Action
3. Divest (existentially)
Change: from willfulness to anomie
Tools: Bracketing à la Husserl
4. Invest (existentially)
Change: from anomie to new willfulness
Tools: Guided associations (Freud) using maxims
5. Act and verify
Change: new needs, intents, perceptions
Tools: Simulations with colleagues, then real-life interaction, first in controlled
situations; subsequent debriefing and, if necessary, reformulation of target values.