
Phonetic convergence in college roommates

2012, Journal of Phonetics

Previous studies have found that talkers converge or diverge in phonetic form during a single conversational session or as a result of long-term exposure to a particular linguistic environment. In the current study, five pairs of previously unacquainted male roommates were recorded at four time intervals during the academic year. Phonetic convergence over time was assessed using a perceptual similarity test and measures of vowel spectra. There were distinct patterns of phonetic convergence during the academic year across roommate pairs, and perceptual detection of convergence varied for different linguistic items. In addition, phonetic convergence correlated moderately with roommates' self-reported closeness. These findings suggest that phonetic convergence in college roommates is variable and moderately related to the strength of a relationship.

Journal of Phonetics 40 (2012) 190–197

Jennifer S. Pardo (a,*), Rachel Gibbons (b), Alexandra Suppes (c), Robert M. Krauss (b)

(a) Department of Psychology, Montclair State University, 1 Normal Avenue, Montclair, NJ 07043, United States
(b) Department of Psychology, Columbia College, Columbia University, United States
(c) Department of Public Health, Weill Cornell Medical College, United States

Article history: Received 20 March 2010; received in revised form 22 September 2011; accepted 1 October 2011; available online 19 October 2011.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

The acoustic–phonetic form of a word varies widely both between and within talkers. Production of the same word across talkers differs according to anatomy, sex, age, dialect, and region of residence. In contrast, variability in an individual talker’s production of a word on different occasions is less noticeable in everyday conversation.
Much of this variability can be attributed to semantic and pragmatic impact on usage. However, talkers have also been found to vary acoustic–phonetic form with very recent exposure to another talker and after prolonged exposure to a particular linguistic environment. In particular, talkers have been found to become more similar in acoustic–phonetic form to a model or to an ambient linguistic environment, exhibiting phonetic convergence or gestural drift (e.g., Babel, 2010; Evans & Iverson, 2007; Goldinger, 1998; Namy, Nygaard, & Sauerteig, 2002; Pardo, 2006; Sancier & Fowler, 1997). Missing from the literature is an understanding of the dynamics of phonetic convergence in a pair of talkers who interact for longer than a single experimental session. Thus, the current study examined phonetic convergence in previously unacquainted college roommates through the academic year and related measures of phonetic convergence to measures of perceived closeness.

* Corresponding author: J.S. Pardo. Tel.: +1 973 655 7924. doi:10.1016/j.wocn.2011.10.001

1.1. Speech accommodation

Communication Accommodation Theory proposes that individuals use language to achieve a desired social distance between the self and interacting partners (Shepard, Giles, & Le Poire, 2001). Accordingly, convergence refers to the ways in which a talker adjusts speaking style to become more similar to an interacting partner, whereas divergence refers to changes in speaking style that result in reduced similarity to a partner. The changes initially observed in speech included attributes measured over long stretches of dialog such as accent, speaking rate, intensity, low-frequency band variation, pause frequency, and utterance length (e.g., Giles, Coupland, & Coupland, 1991; Gregory, 1990; Gregory, Dagan, & Webster, 1997; Gregory & Webster, 1996; Natale, 1975).
The reasons proposed for employing accommodation are varied, but the most prevalent is the similarity attraction hypothesis, which claims that individuals try to be more similar to those to whom they are attracted (Byrne, 1971). This proposal has evoked many hypothetical functions for convergence: it could result from a need to gain approval from the interacting partner (Street & Giles, 1982), from a concern that the interaction is carried out smoothly (Gallois, Giles, Jones, Cargile, & Ota, 1995), or from an effort to increase one’s own intelligibility during the interaction (Triandis & Triandis, 1960). Divergence, on the other hand, can be used to accentuate individual differences or to display disdain for another individual (Bourhis & Giles, 1977; Shepard et al., 2001).

Although Communication Accommodation Theory was developed in the context of studies that employed social settings, convergence and mimicry have also been examined in more restricted laboratory settings. For example, talkers who were asked to repeat recorded words sampled from another talker produced utterances that were more similar to those of the sample talker than their baseline utterances (Goldinger, 1998; Namy et al., 2002; Shockley, Sabadini, & Fowler, 2004; but see Vallabha & Tuller, 2004). In these studies, talkers were first recorded producing a baseline series of words prompted by a list, and then were instructed to listen to the same series of words produced by another talker and to repeat each word immediately. Imitation of word forms was assessed by asking a separate set of listeners to judge the similarity of the baseline and the shadowed utterances to the model utterances. Across multiple studies using this technique, shadowing talkers were found to converge (i.e., imitate or become more similar to) the model talkers that they heard. Shockley et al.
(2004) also related perceived convergence to variation in an acoustic–phonetic attribute (voice onset time: VOT). Thus, shadowers were not only heard as sounding more similar to the model, but they also increased VOTs of shadowed tokens when shadowing a model token that had a lengthened VOT.

In a study that examined phonetic variation with long-term exposure to different linguistic environments, Sancier and Fowler (1997) recorded a native Brazilian Portuguese and English (L2) speaker on three occasions: once after spending four months in the US before leaving for Brazil, again just after returning to the US after spending 2.5 months in Brazil, and once more after spending four months in the US. They found that the talker’s Brazilian Portuguese utterances were judged by Brazilian Portuguese listeners to be more accented after the talker had spent four months in the US compared to utterances produced after spending 2.5 months in Brazil. In addition, the talker’s VOTs in both Brazilian Portuguese and English had lengthened after her stays in the US, converging toward the average VOTs in English. This finding was surprising because adjustments were made in the language that the talker had not been using while in the US, indicating a process that is super-ordinate to a particular language.

1.2. Vowel convergence

Other studies have extended phonetic convergence to measures of vowel formants (Babel, 2009, 2010; Evans & Iverson, 2007; Pardo, 2010; Pardo, Cajori Jay, & Krauss, 2010). For example, Babel (2009, 2010) has assessed variation in the first and second formants of vowels in shadowing tasks. Overall, intertalker distances in vowel formants were reduced from baseline to shadowed tokens for some of the vowels, and the degree of adjustment was related to the talkers’ implicit attitudes toward the race or nationality of the model talker.
In a longer-term study of accent change similar to that of Sancier and Fowler (1997), Evans and Iverson (2007) reported that Northern British English students shifted pronunciation of some of their vowels as a result of spending up to two years at school in Southern England. After only three months, some of the talkers had started to shift some of their vowels, but after one and two years, most of the talkers had adopted the shifted dialectal variants. Moreover, their rated degree of southern accentedness also became stronger over the course of two years in school. In both approaches, vowels were found to change, but the only indication that the changes were perceptible derives from ratings of accentedness, which relates to global dialect convergence rather than individual phonetic convergence.

In two studies of phonetic convergence during conversational interaction, Pardo (2010; Pardo et al., 2010) found that talkers converged on some vowels while diverging or not changing on other vowels, and that the degree of vowel convergence/divergence was related to the role of the talker. Furthermore, Pardo et al. found that the degree of vowel convergence (measured as reduction in inter-talker distances) was moderately related to the perceived convergence of receivers to givers (r(10) = −0.59, p = 0.04). However, there were other patterns of perceived convergence that were not readily attributable to vowel formants or to articulation rates, which indicates that perceived phonetic convergence is likely to result from multidimensional impressions of phonetic variation.

1.3. Conversational convergence

The ability to converge to another talker’s word pronunciation or to a linguistic environment suggests detailed perceptual resolution and closely coupled perception and production.
However, it is necessary to delineate the factors that modulate this process in more natural settings of language use, such as during a conversational interaction (Pardo, 2006, 2010; Pardo et al., 2010). Although interacting talkers have been found to converge in acoustic–phonetic form, the degree of convergence was subtle and was consistently influenced by the sex of the pair of talkers and a talker’s role in the conversation. Indeed, talkers converged on some acoustic–phonetic dimensions at the same time that they diverged on others (Pardo, 2010; Pardo et al., 2010; see also Bilous & Krauss, 1988). Therefore, phonetic convergence is not an automatic consequence of detailed perceptual resolution, but has variable effects on different speech attributes and is unconsciously modulated by each talker’s interpretation of the situation. At present, accommodation has been established as a prevalent yet variable phenomenon in the speech of interacting talkers (Giles et al., 1991; Shepard et al., 2001). With respect to dialect acquisition and change, Labov (1986) has found that individuals employ different vowel variants across different social settings, and that the use of local dialect markers is related to an individual’s attitudes toward the area (Labov, 1972; see also Babel, 2010; Eckert, 1989). Moreover, because the location of dialect boundaries coincides with geographical boundaries that reduce opportunities for direct social interaction, Labov (1974) has proposed that dialect variation and change result from opportunities for direct social interaction. Yet, despite an explicit desire to accommodate to an ambient linguistic environment, long-term opportunities for social interaction with native speakers, and an unconscious tendency to imitate speech, most talkers fail to eradicate a foreign accent or to lose all markers of a regional dialect. 
Even though the talkers in previous studies of long-term change made adjustments that were perceptible as changes in relative accentedness, they never sounded fully Southern or unaccented (Evans & Iverson, 2007; Sancier & Fowler, 1997). In order to understand the limitations of these processes, it is necessary to examine phonetic convergence at the level of individual pairs of talkers. To date, the dynamics of linguistic variation that result from continued contact with the same individual have not been studied empirically.

1.4. The current study

The present study attempts to fill this gap by examining phonetic convergence among talkers who interact on a daily basis, college roommates. In order to assess phonetic convergence, speech samples were collected from previously unacquainted college roommates at four intervals during the academic year and were used to elicit measures of perceptual similarity from a separate set of listeners. In addition, measures of item duration and vowel spectra were collected and compared to perceptual assessments of phonetic convergence.

According to Communication Accommodation Theory and findings of phonetic convergence during conversational interaction, the roommates ought to exhibit phonetic convergence relatively early in the academic year. If phonetic convergence follows a similar trajectory to the gestural drift observed by Sancier and Fowler (1997), then convergence should increase prior to winter break and decrease after the roommates return from winter break. However, the current study is being conducted on individuals who do not undergo cross-language alternation, so the roommates might not show a decrease in convergence after returning from winter break. If vowel convergence follows a trajectory similar to that reported by Evans and Iverson (2007), then only some of the talkers should demonstrate vowel convergence after the first three months of cohabitation.
However, because the current study examined vowel convergence in roommates as opposed to dialectal changes in vowel formants, it is possible that vowel convergence might emerge earlier than expected in the current study. In addition, if phonetic convergence is related to a talker’s attitude toward their roommate, then the degree of phonetic convergence should be related to the roommates’ reported feelings of closeness, which were measured at the end of the first semester (Babel, 2009, 2010; Gallois et al., 1995; Street & Giles, 1982).

2. Method

2.1. Participants

A total of 10 male Columbia College undergraduates (5 pairs of roommates, aged 19–21) provided speech samples. All talkers were native English speakers with no reported hearing or speech disorders. The talkers received compensation at a rate of $10 per hour. Table 1 displays the place of origin for each talker in each pair.

A total of 30 members of the Columbia University undergraduate population provided perceptual similarity judgments of excerpts from the recordings. All listeners were native English speakers with no reported hearing or speech disorders. The listeners received course credit in exchange for completion of the task.

2.2. Materials

2.2.1. Recordings

At all time intervals, the roommates provided 5 sets of American English vowels embedded in hVd/t words in the carrier sentence, Say ___ again (mixed with filler items). In addition, each talker provided two utterances of two sentences prompted by printed sheets, She had your dark suit in greasy wash water all year and Don’t ask me to carry an oily rag like that. These sentences were chosen because they include phrases that are phonologically diverse, that include the four point vowels, and that exhibit variation across US dialect regions (dark suit, greasy, oily rag, and wash water).
For example, Clopper and Pisoni (2004) reported that New England talkers were more likely than others to produce an r-less form of dark and Southern talkers were more likely to produce a voiced fricative in greasy.

2.2.2. Relationship quality

In order to assess the quality of the roommates’ relationship, they completed a Roommate Relationship questionnaire at the end of their first semester together. The questionnaire included requests to estimate the amount of time spent with their roommate throughout the first semester, the number of hours per week the two spent in the same room, and the number of meals eaten together per week. Other questions asked about the quality of their relationship and how much they liked their roommate. Specific measures included the closeness they felt toward their roommate (on a 7-point scale), the amount in common they felt they shared with their roommate (on a 7-point scale), and the Inclusion of Other in the Self Scale (IOS; Aron, Aron, & Smollan, 1992). The IOS uses a series of seven paired circles that range from showing no overlap to showing almost complete overlap. Each talker selected the circle combination depicting the overlap that best represents the closeness they feel to their roommate.

2.2.3. Similarity test

To compose items for the similarity test, four key phrases were excised from the sentences produced at each time period: dark suit, greasy, oily rag, and wash water. The second repetition of each item was chosen in all but two cases, in which the first utterance was used (a cough interrupted the second repetition of one word phrase, and a mispronunciation affected another).

2.3. Procedure

2.3.1. Recordings

Roommate pairs were recruited during the summer before they would spend an academic year living together.
Using a list provided by Columbia University Housing Services, the study recruited only those roommate pairs who did not enter the housing lottery together, typically indicating that they had no prior relationship. At the time of their first recording, the talkers were further questioned about any prior contact with their roommates, and none of the pairs had communicated previously. Speech samples were collected at four time intervals throughout the academic year during which the roommates cohabitated: T1 recordings were sampled in late August, before the roommates had met; T2 recordings were sampled in late October; T3 recordings were sampled in December at the end of the first semester and just prior to winter break; and T4 recordings were sampled in January when the roommates returned from winter break, before they resumed interaction. Both members of each pair completed a recording session individually at each time point. The questionnaire assessing the quality of the roommates’ relationship was administered after the recording session at T3 in December. One pair did not return for the T4 recording session; therefore, their phonetic convergence measures from T2 and T3 were excluded from the statistical analyses, but were used to relate phonetic convergence at T3 to closeness ratings taken at T3. The recordings were obtained via a head-mounted AKG microphone connected to a Superscope PSD300 CD digital recorder.

2.3.2. Perceptual tests

To assess phonetic convergence, an AXB similarity test was presented to a separate set of listeners. The listening test comprised trials that were designed to assess phonetic convergence at each time interval by asking a listener to judge similarity in pronunciation between the roommates’ speech at T2 (October), T3 (December), and T4 (January), relative to their pronunciation at T1 (August).
On each trial, a listener heard three repetitions of the same word or phrase in which an utterance from one talker (X) that was produced at either T2, T3, or T4 was flanked by two versions of the same phrase spoken by their roommate (A and B). One of the flanking items was the baseline utterance produced at T1, before the talkers had interacted, and the other flanking item was the corresponding utterance from the relevant time interval (T2, T3, or T4). On half the trials, utterances from one member of a roommate pair were used as X-items, and the other half of the trials used the utterances from the other member of a roommate pair as X-items. The order of presentation of T1 and T2/T3/T4 items was counterbalanced so that T1 items were presented in position A on half of the trials and in position B on the other half of the trials. Thus, at the second time interval, if the roommates sounded more similar to each other than they had prior to meeting, the T2 item of one talker should be chosen as sounding more like the T2 item of his roommate than that talker’s T1 item.

In order to assess phonetic convergence, listeners were asked to determine which of the two flanking items (A or B) sounded more similar to the middle item (X) in terms of its pronunciation. Each trial began 1000 ms after a listener indicated a response, and the items in each trial were presented at 200 ms ISI. The presentation of trials was blocked by roommate to keep the speaker in the X position the same throughout a block. The AXB test was presented over Sennheiser HD 280 Pro headphones connected to Macintosh computers running PsyScope, and responses were collected via keyboard using the 1 (first item) and 0 (last item) keys.

3. Results

3.1.
Perceptual assessment

If phonetic convergence between roommate pairs occurred, then the utterances produced by both roommates at later time intervals should sound more similar to each other than the baseline utterances produced at T1. Therefore, responses to the AXB test trials were scored as the percentage of trials on which a later time interval utterance (T2, T3, or T4) was judged to be more similar to the X-item (the roommate’s T2, T3, or T4) than the baseline utterance (T1). This procedure yielded measures corresponding to the percentage of trials that each listener detected convergence for each of the ten talkers producing each of the four phrases at each of the three time comparisons in the AXB test (except that one of the pairs did not provide utterances at T4). The data were submitted to a repeated measures analysis of variance to assess the effects of Time Interval (T2, T3, and T4), Pair (1–4), and Phrase (dark suit, greasy, oily rag, or wash water).

In comparison to speech produced at T1, listeners detected phonetic convergence in roommate pairs at T2 in October (55%), T3 in December (56%), and at T4 in January (56%). The main effect of Time Interval was not significant, indicating no differences in convergence across these intervals (F(2, 58) = 1.345, p = 0.269, ηp² = 0.044). Despite the lack of difference across the intervals, the 95% confidence intervals from the analysis indicated greater than chance detection (50%) at all intervals. Each pair of talkers exhibited different levels of phonetic convergence, with all but one pair showing significant convergence overall (Pair 1 = 57%, Pair 2 = 52% ns, Pair 3 = 59%, Pair 4 = 55%; F(3, 87) = 6.577, p < 0.001, ηp² = 0.185).
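The AXB scoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis code: the tuple layout, the function names (`detected_convergence`, `percent_convergence`), and the example responses are all hypothetical. A trial counts as a detection of convergence when the listener picks the flanking item that is not the roommate's T1 baseline.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not from the paper):
# (listener_id, later_interval, baseline_position, chosen_position)
# baseline_position is the flank ("A" or "B") that held the roommate's T1 item;
# the other flank held the roommate's later-interval item (T2, T3, or T4).

def detected_convergence(baseline_position, chosen_position):
    """True when the listener chose the later-interval item over the T1 baseline."""
    return chosen_position != baseline_position

def percent_convergence(responses):
    """Percentage of trials with detected convergence per (listener, interval) cell."""
    by_cell = defaultdict(list)
    for listener, interval, baseline_pos, chosen in responses:
        by_cell[(listener, interval)].append(
            detected_convergence(baseline_pos, chosen)
        )
    return {cell: 100.0 * sum(hits) / len(hits) for cell, hits in by_cell.items()}

# Made-up responses for illustration; T1 appears in position A on half of each
# listener's trials and in position B on the other half (counterbalancing).
responses = [
    ("L01", "T2", "A", "B"),
    ("L01", "T2", "B", "B"),
    ("L01", "T2", "A", "B"),
    ("L01", "T2", "B", "A"),
    ("L02", "T2", "A", "A"),
    ("L02", "T2", "B", "A"),
]
scores = percent_convergence(responses)
print(scores)
```

Values above 50% in a cell indicate that the later-interval item was chosen more often than chance would predict, which is the sense in which the percentages in this section are read.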
Detection of phonetic convergence differed across the phrases, with greater levels of phonetic convergence on wash water (58%) and oily rag (57%) than on dark suit (54%) and greasy (53%; F(3, 87) = 7.369, p < 0.001, ηp² = 0.203; 95% confidence intervals from the analysis indicated greater than chance detection for all phrases and confirmed the observed differences between phrases).

Examining the data more closely, the roommate pairs showed different patterns of phonetic convergence across the time intervals. As shown in Fig. 1, two of the pairs increased in convergence from T2 in October to T3 in December and decreased at T4 in January, when they had just returned from winter break. Pair 3 showed the opposite pattern, with the highest levels of convergence at T2 and T4 and a decrease at T3, and Pair 2 did not converge until T4. The interaction between Time Interval and Pair was significant (F(6, 174) = 2.897, p = 0.01, ηp² = 0.091; 95% confidence intervals from the analysis indicated greater than chance detection for all pairs at all time intervals except T2 and T3 for Pair 2). Although not included in the statistical analyses, the convergence measures for Pair 5 showed a marked decline from T2 (58%) to T3 (46%).

Fig. 1. Phonetic convergence of each roommate pair varies over the course of the academic year. Convergence is measured as percent detection of increased similarity at each time interval. All values differ from 50% chance detection, except for Pair 2 at T2 and T3.

Finally, each pair of talkers demonstrated a distinct pattern of phrase-dependent phonetic convergence. As shown in Fig. 2, listeners detected convergence on oily rag and wash water for Pair 1; dark suit and wash water for Pair 2; dark suit, oily rag, and wash water for Pair 3, and greasy and oily rag for Pair 4. The interaction between Pair and Phrase was significant (F(9, 261) = 11.963, p < 0.001, ηp² = 0.292; error bars depict 95% confidence intervals). The interactions between Time Interval & Phrase and Time Interval, Pair, & Phrase were not significant (p = 0.115, 0.263; ηp² = 0.056, 0.039).

Fig. 2. Each roommate pair shows a distinct pattern of phonetic convergence across different phrases. Convergence is measured as percent detection of increased similarity averaged across time intervals. Error bars depict 95% confidence intervals.

The location of origin and perceived convergence of each individual talker within a pair are presented in Table 1, averaged across time intervals. Six of the talkers in this study originated from locations in New York State, with other talkers from New Jersey, Tennessee, Florida, and Korea. The pairs with talkers from the most distinct regions were pairs 2, 3, and 4. Overall, pair 3 demonstrated the greatest and most evenly balanced levels of perceived convergence. Pairs 1 and 4 demonstrated the next highest levels of phonetic convergence, but their convergence was asymmetrical. (In the case of Pair 4, this asymmetry could be due to the fact that talker A was born in Korea, although he reported that he was a native English speaker, but Pair 1 showed a similar pattern.) Pairs 2 and 5 demonstrated the lowest levels of convergence overall. Therefore, phonetic convergence levels were not consistently related to distance in region of origin, as those pairs from the most distinct regions exhibited both the highest and lowest levels of convergence.

Table 1
Location of origin and convergence data for individuals participating in the recording sessions.

Pair   Talker A            Talker B
1      Peekskill, NY       New York, NY
2      Syracuse, NY        Miami, FL
3      Huntingdon, TN      New York, NY
4      Seoul, Korea (b)    Pleasant Valley, NY
5      Syosset, NY         Brick, NJ

Values for the columns Perceived convergence of A to B, Perceived convergence of B to A, and Vowel distance change (a), in extracted order: 54 59 41 53 59 52 59 −18 −115 51 58 −14 53 51 37

Location of origin corresponds to the place where the individual lived for the longest period of time prior to participating in this experiment. Perceived convergence has been averaged across the time intervals.
(a) Vowel distance changes are in normalized F1 by F2 Hz space. The Euclidean distances between paired talkers in the time 1 session were subtracted from paired talker distances in the subsequent sessions and then averaged. Negative values indicate a reduction in paired vowel distances from the first to the subsequent sessions.
(b) This individual lived in Fullerton, CA for 4 years prior to participating in this experiment.

Because each talker pair converged on different phrases, it is unlikely that the measure of phonetic convergence simply reflected a shift toward sounding more like New Yorkers. For example, New Englanders typically produce dark suit without the post-vocalic r and greasy with a voiceless fricative (Clopper & Pisoni, 2004). However, in this group, all talkers produced dark suit with post-vocalic r consistently, so the perceived convergence on dark suit in pairs 2 and 4 and lack of convergence in the other pairs is most likely due to other phonetic attributes. This could be due to the fact that most talkers were from New York state, in which the r-less form would be found among members of lower socio-economic groups than are typically found attending Columbia University (Labov, 2006a). Furthermore, most of the talkers produced the voiceless version of greasy consistently, except for one talker from pair 2 and one from pair 4. Note that the only pair of talkers that was found to converge on greasy was pair 4. Because the listeners did not detect convergence in greasy for pair 2, but did detect convergence in greasy for pair 4 when both pairs differed in voicing, there must be other phonetic attributes that influenced perceptual similarity.

3.2.
Duration analyses

In order to begin to identify potential acoustic–phonetic attributes that talkers might have converged on, the duration of the AXB items was analyzed. Moreover, it is possible that talkers were using more formal/careful speech on their first visit to the laboratory, and so the T1 items that were used as baselines would be distinct in duration from the items recorded at later time intervals. The item durations decreased on average from T1 to T2/T3/T4, indicating potential usage of more casual forms at later time intervals (T2: 16 ms; T3: −5 ms; T4: 29 ms; F(2, 32) = 5.54, p = 0.009, ηp² = 0.257). However, the average differences are relatively small and there was variation in the direction and degree of difference in duration across items and pairs. Some items for some pairs at some time intervals were actually longer in duration than the T1 items.

In order to determine whether listeners were responding to average duration when making their perceptual similarity judgments, the differences in duration from T1 to T2/T3/T4 and the phonetic convergence data were submitted to a correlational analysis. The correlation between the perceived convergence of each item at each time interval in each pair and the difference in duration between T1 and the relevant counterpart at each time interval in each pair was not significant (r(46) = −0.08, p > 0.05). Therefore, despite the small but significant reduction in duration from T1 to later time periods, the pattern of phonetic convergence as detected by the listeners was unrelated to the pattern of duration differences. This finding echoes the previously reported lack of relationship between duration and phonetic convergence for male talkers, and the failure to find a relationship between articulation rates and phonetic convergence (Pardo, 2010; Pardo et al., 2010).

3.3.
Vowel measures

In addition to the items used in the perceptual similarity tests, each talker also produced 5 repetitions of the full set of American English vowels embedded in hVt/d words in a carrier sentence. Because the items used to assess phonetic convergence also contained all of the point vowels (/i/, /æ/, /ɑ/, /u/), the repetitions of the point vowels from the full set were analyzed for comparison with the perceptual measures. The averages of the first and second formants across the vocalic portion of each vowel token were derived using Praat estimates (Boersma & Weenink; www.praat.org). Then, the estimates were normalized in order to reduce the impact of anatomical differences between talkers, yielding normalized measures of F1 and F2. The normalization routine preserves dialectal and idiolectal differences while projecting each talker’s formant values into a common acoustic space (see Labov, 2006b, 2006c; Nearey, 1989). The data were scaled in a single batch using the North Carolina State University Linguistics Program’s online utility with the Labov ANAE extrinsic setting (Thomas & Kendall, 2007; Thomas, Kendall, Yeager-Dror, & Kretzschmar, 2007).

In order to simplify presentation of the vowel measures, the average normalized F1 by F2 measures from T1 (August) and T3 (December) for each talker are plotted in four panels in Fig. 3. Each panel corresponds to a single vowel, /i/, /æ/, /ɑ/, or /u/. The filled bullets depict averages from T1, and the open bullets depict averages from T3. Each talker pair is represented by the same bullet shape. Arrows connect the vowels of the same talker from T1 to T3.

Across all panels in Fig. 3, the change in vowel formants from T1 to T3 is extremely complex. The arrows do not all move in the same direction, nor do they converge toward a single point for any of the panels.
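As a rough illustration of how per-talker vowel averages translate into the vowel distance change defined in the note to Table 1, the sketch below averages tokens per talker, takes the Euclidean distance between roommates in F1 by F2 space, and subtracts the T1 distance from each later distance before averaging. The function names and all formant values are hypothetical, and the normalization step is assumed to have been applied already.

```python
import math
from statistics import mean

def mean_formants(tokens):
    """Average (F1, F2) over a talker's repetitions of one vowel."""
    return (mean(t[0] for t in tokens), mean(t[1] for t in tokens))

def pair_distance(talker_a, talker_b):
    """Euclidean distance between two talkers' (F1, F2) averages."""
    return math.dist(talker_a, talker_b)

def distance_change(pair_by_session, baseline="T1"):
    """Mean of (later distance - baseline distance) across the later sessions.

    pair_by_session maps a session label to (talker_a_mean, talker_b_mean).
    Negative results indicate that the pair moved closer together.
    """
    base = pair_distance(*pair_by_session[baseline])
    later = [
        pair_distance(*means) - base
        for session, means in pair_by_session.items()
        if session != baseline
    ]
    return mean(later)

# Hypothetical normalized formant averages (Hz) for one pair and one vowel.
example_pair = {
    "T1": ((300.0, 2300.0), (300.0, 2200.0)),  # distance 100
    "T2": ((300.0, 2260.0), (300.0, 2200.0)),  # distance 60
    "T3": ((330.0, 2240.0), (300.0, 2200.0)),  # distance 50
}
change = distance_change(example_pair)
print(change)  # -45.0
```

In this made-up case the pair ends up 45 Hz closer on average than at T1, so the sign convention matches the table note: negative means reduced distance.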
In the panels depicting /i/ and /u/, some of the talkers appear to be moving toward the center of the vowel space (mainly increasing normalized F1 and decreasing normalized F2), but others move in the opposite or different directions. If the talkers had all been converging toward a local New York City dialect, then the formants should have started to shift in a more uniform fashion for at least one of the vowels (Evans & Iverson, 2007). Unfortunately, the roommates also do not appear to be moving closer to each other in their vowel formants from T1 to T3. The only instances in which a pair of roommates’ vowels moved closer together from T1 to T3 were pair 4’s average /u/ and /ɑ/ vowels.

In order to analyze the changes in vowel formants, each roommate pair’s vowels were first converted to Euclidean distances. These paired distances were submitted to a mixed-design Analysis of Variance to test for the within-subjects effects of Time Interval and Vowel, and the between-subjects effect of Pair. The Euclidean distance between the roommate pairs’ vowels decreased from T1 (205) to T2 (171), then increased at T3 (212), and decreased again at T4 (189), and the pattern was significant (F(3, 60) = 5.394, p < 0.05; ηp² = 0.212; 95% confidence intervals from the analysis indicated that T1 and T3 differed from T2 and T4). However, each roommate pair differed in their distances over time. Fig. 4 displays the differences in roommate pair distances from T1 to the later time intervals. Negative differences indicate a reduction in vowel distances over time. As the figure shows, pairs 2, 3, and 4 either reduced vowel distances or showed no overall change from T1 to later time intervals, and pairs 1 and 5 showed the opposite pattern. The interaction between Time Interval and Pair was significant (F(12, 60) = 4.595, p < 0.05; ηp² = 0.479). The average differences in distances over time are shown in the right column of Table 1, alongside the perceived convergence for each talker in each pair. Although there appears to be a relationship between perceived convergence and average change in vowel distances, there are not enough pairs in the current study to establish a reliable pattern (r(3) = 0.64, ns).

Fig. 3. (a)–(d) Normalized vowel formants show inconsistent changes from T1 (August) to T3 (December). Each panel plots a single vowel. Averages at T1 are depicted with filled bullets, and averages at T3 are depicted with open bullets. Roommate pairs are plotted with bullets of the same shape. Arrows indicate direction of movement for each individual talker from T1 to T3.

Fig. 4. The Euclidean distances between roommates’ vowels changed from T1 (August) to T2 (October), and only some pairs demonstrated additional changes at T3 (December) and T4 (January). Each bullet represents the difference in roommate vowel distances from T1 to T2, T3, or T4.

3.4. Relationship quality

The last set of analyses assesses the relationship between the roommates’ perceived convergence and relationship quality. Roommate relationship quality was assessed at T3 using several survey questions, including number of waking hours spent in the same room per week, number of meals eaten together per week, seven-point scales measuring closeness and amount in common, and the IOS (Aron et al., 1992). Data on number of meals shared per week were excluded because each roommate indicated that the number was zero. There were significant correlations among the data for three of the measures (amount in common, closeness, and IOS), but number of hours per week spent together did not correlate with any of these measures. Therefore, a composite of the three correlated relationship measures was formed by averaging their ratings, yielding one composite closeness measure for each individual.
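As a rough illustration of the composite measure and of the kind of correlation used in this section, the sketch below averages three ratings into one composite score per talker and computes a Pearson correlation together with its degrees of freedom (n − 2). The function names and the ratings are hypothetical and are not the study's data or analysis scripts; with 10 talkers, such a correlation has 8 degrees of freedom.

```python
import math


def composite_closeness(amount_in_common, closeness, ios):
    """Average three ratings into one composite score for one talker."""
    return (amount_in_common + closeness + ios) / 3.0


def pearson_r(xs, ys):
    """Return (r, degrees_of_freedom) for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length sequences with n >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy), n - 2


# Hypothetical ratings for four talkers (illustration only).
ratings = [(4, 5, 6), (2, 3, 1), (6, 6, 6), (3, 4, 2)]
composites = [composite_closeness(*r) for r in ratings]

# Hypothetical convergence scores for the same four talkers.
convergence = [0.55, 0.48, 0.60, 0.50]
r, df = pearson_r(composites, convergence)
```

Averaging is only sensible when the component ratings share a comparable scale and correlate with one another, which is the rationale the text gives for leaving hours spent together out of the composite.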
In order to determine whether an individual’s closeness to his roommate was related to his own convergence to his roommate, independent of whether his roommate shared his closeness sentiments or whether his roommate displayed convergence, the estimates, ratings, and composite Closeness measures were correlated with the individual convergence measures from each of the 10 talkers at T3 in December. This analysis revealed a significant correlation between rated closeness and convergence at T3 (r(8) = 0.54, p = 0.05) and a modest correlation between the composite Closeness index (closeness, amount in common, and IOS rating) and convergence at T3 (r(8) = 0.36, p = 0.15). No other correlations with the convergence data at T3 were significant (nor did the relationship quality measures taken at T3 correlate with perceived convergence at any of the other time intervals). These findings suggest that a talker’s degree of phonetic convergence to a roommate after approximately 3.5 months of cohabitation is related to positive levels of reported closeness to his roommate at that time.

4. Discussion

This study examined phonetic convergence in five pairs of previously unacquainted college roommates at three time intervals during the academic year. Each member of each pair provided multiple samples of phrases that were submitted to AXB perceptual listening tests. In addition, each roommate provided measures of their perceived closeness, collected in December after the roommates had been cohabitating for approximately 3.5 months. As suggested by previous research, most roommates converged in perceived phonetic form within the first time interval, after approximately 1.5 months of cohabitation; perceived phonetic convergence was evident for the same pairs at the second time interval, after approximately 3.5 months; and all roommates converged by the last time interval, after returning from winter break.
Measures of item duration differences and vowel spectra were not related to perceived convergence. The degree and patterns of convergence varied across pairs, phrases, and measures. These findings are compatible with proposals that follow from Communication Accommodation Theory (Shepard et al., 2001). For example, Gallois et al. (1995) suggested that convergence may result from a desire to make an interaction flow more smoothly, a desire that can be reasonably attributed to the roommates in this study. Moreover, studies of social interaction have shown that other forms of behavioral mimicry lead to impressions of smoother interactions and greater liking for a partner (Chartrand & Bargh, 1999; Chartrand, Maddux, & Lakin, 2005; Lakin & Chartrand, 2003). Therefore, it is likely that phonetic convergence of the kind observed in individual conversational interactions and in the current study reflects a desire to decrease social distance and to induce a smooth interaction and mutual liking. Additional research is necessary to detail the manner in which such factors interface with speech perception and production to evoke convergence in phonetic form. The current study found variable patterns of phonetic convergence over the course of the academic year, both across different pairs and different utterances. Although previous findings based on a single talker who moved between two different linguistic environments (Sancier & Fowler, 1997) suggested that convergence should be reduced after winter break, only two of the five pairs showed this pattern, and one pair only showed significant convergence after returning from winter break. One important difference between these two studies was the methods that were employed. 
The perceptual similarity test used in the current study is a sensitive measure of paired talker phonetic similarity, whereas an accentedness rating task likely focuses on the impact of the intonation pattern of a second language environment on a talker’s native language. The acoustic attribute that was measured by Sancier and Fowler (VOT) was selected because it is a phonological variant that is known to vary in its distribution between the two languages. However, VOT is also correlated with speaking rate, a relatively coarse-grain attribute (Miller & Grosjean, 1981). Likewise, Evans and Iverson (2007) reported changes in accentedness ratings and vowel formants for college students in England. Unfortunately, the students in the current study did not demonstrate similarly consistent shifts in vowel formants, which is probably due to the fact that most of them were from the New York state area and had already been attending Columbia University for at least one year. They all produced r-full utterances of dark suit, and variability in fricative voicing in greasy was not related to perceived convergence. The current results suggest that adjustments in perceived phonetic repertoire follow a trajectory that differs from that of accentedness or VOT patterns, appearing more resistant to decay across breaks in exposure. However, this could be due to the fact that the talker in the study by Sancier and Fowler was alternating between different languages with large phonetic differences. In order to determine whether within-language phonetic convergence differs from gestural drift between languages, it will be necessary to manipulate the range of variation that a talker explores within their own language. 
Previous studies of communication accommodation focused on attributes that were measured in much longer time-scales, including accent, speaking rate, intensity, low-frequency band variation, pause frequency, and utterance length (Giles et al., 1991; Gregory, 1990; Gregory et al., 1997; Gregory & Webster, 1996; Natale, 1975). In most of these reports, paired talkers were found to converge in the particular attribute that was measured. However, Pardo et al. (2010) failed to find evidence of convergence in speaking rate at the same time that talkers converged phonetically, and roommates in the current study did not converge on vowel spectra consistently. In a now classic study by Bilous and Krauss (1988), talkers were found to converge in some attributes at the same time that they diverged in others (e.g., average utterance length; frequency of pauses, laughter, interruptions, and back-channels). These findings lead to the intriguing possibility that each individual talker might converge on a unique set of acoustic–phonetic attributes while diverging, varying randomly, or remaining neutral on others. If that is the case, then measurements of individual acoustic attributes will yield inconsistent patterns. Therefore, a complete understanding of phonetic convergence is unlikely to result from acoustic analyses alone. Without a perceptual similarity assessment, a failure to find convergence in a particular acoustic–phonetic attribute cannot be interpreted. Moreover, a finding of increased similarity in any acoustic–phonetic attribute must be interpreted against the background of a talker’s complete phonetic repertoire. At this point, the most effective way to assess these relationships is to rely on the judgments of ordinary listeners.
Because ordinary perception integrates multiple dimensions simultaneously, a carefully designed perceptual similarity test provides a global assessment of phonetic convergence without committing to a single acoustic–phonetic attribute that might not be used consistently by every talker on every occasion.

It is important to point out that the overall level of detected convergence in college roommates was modest, even after 3.5 months of relatively continuous cohabitation. These findings align with those found in studies of convergence in shadowing tasks and during conversational interaction (Babel, 2010; Goldinger, 1998; Namy et al., 2002; Pardo, 2006; Pardo et al., 2010; Shockley et al., 2004). If phonetic convergence is automatically evoked by perceptual resolution of phonetic forms that goad imitative production (Fowler et al., 2003; Pickering & Garrod, 2004), or by automatic activation of relatively recent episodic traces (Goldinger, 1998), then any two talkers who live together should exhibit phonetic convergence with a high degree of fidelity, and the level of convergence should not vary across different utterances. The fact that talkers never match acoustic–phonetic attributes exactly despite a putative drive toward parity calls into question accounts that rely on a direct perception-production link in determining the phonetic form of utterances (Fowler et al., 2003; Pickering & Garrod, 2004). These findings indicate that phonetic convergence in naturalistic settings is not an automatic consequence of a direct perception-production link (Pardo et al., 2010), and that the social and situational factors at play evoke variability in phonetic form that is convergent, neutral, and divergent.

Despite its limitations, the current study offers a useful paradigm for examining phonetic variation that results from natural social interaction.
To varying degrees, talkers have been found to imitate a model in shadowing tasks, to converge toward a conversational partner during a single interaction, and to show longer-term adjustments in phonetic repertoire as a result of continued contact with the same talker. Additional research is needed to extend the current findings both to female pairs and to mixed-sex cohabitation settings. Ultimately, a complete account of variability in phonetic repertoire across both individuals and groups will examine not only individual pairs, but communities of talkers interacting across multiple settings.

Acknowledgments

Completion of this paper was supported in part by Grant #0545133 from the National Science Foundation to Jennifer Pardo at Barnard College. The authors are indebted to Robert Remez, Isabel Cajori Jay, and the reviewers for their role in the completion of this project.

References

Aron, A., Aron, E. N., & Smollan, D. (1992). Inclusion of other in the self scale and the structure of interpersonal closeness. Journal of Personality and Social Psychology, 63, 596–612.
Babel, M. (2009). Phonetic and social selectivity in speech accommodation. Doctoral Dissertation. Berkeley: University of California.
Babel, M. (2010). Dialect convergence and divergence in New Zealand English. Language in Society, 39, 437–456.
Bilous, F. R., & Krauss, R. M. (1988). Dominance and accommodation in the conversational behaviours of same- or mixed-gender dyads. Language & Communication, 8, 183–194.
Bourhis, R. Y., & Giles, H. (1977). The language of intergroup distinctiveness. In: H. Giles (Ed.), Language, ethnicity, and intergroup relations (pp. 119–135). London: Academic.
Byrne, D. (1971). The attraction paradigm. New York: Academic Press.
Chartrand, T. L., & Bargh, J. A. (1999). The chameleon effect: The perception-behavior link and social interaction. Journal of Personality and Social Psychology, 76, 893–910.
Chartrand, T. L., Maddux, W. W., & Lakin, J. L. (2005). Beyond the perception-behavior link: The ubiquitous utility and motivational moderators of nonconscious mimicry. In: R. R. Hassin, J. S. Uleman, & J. A. Bargh (Eds.), The new unconscious (pp. 334–361). New York, NY: Oxford University Press.
Clopper, C. G., & Pisoni, D. B. (2004). Some acoustic cues for the perceptual categorization of American English regional dialects. Journal of Phonetics, 32, 111–140.
Eckert, P. (1989). Jocks and burnouts: Social categories and identity in the high school. New York: Teachers College Press.
Evans, B. G., & Iverson, P. (2007). Plasticity in vowel perception and production: A study of accent change in young adults. Journal of the Acoustical Society of America, 121, 3814–3826.
Gallois, C., Giles, H., Jones, E., Cargile, A. C., & Ota, H. (1995). Accommodating intercultural encounters: Elaborations and extensions. In: R. Wiseman (Ed.), Intercultural communication theory (pp. 115–147). Thousand Oaks, CA: Sage.
Giles, H., Coupland, J., & Coupland, N. (1991). Contexts of accommodation: Developments in applied sociolinguistics. Cambridge: Cambridge University Press.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105(2), 251–279.
Gregory, S. W. (1990). Analysis of fundamental frequency reveals covariation in interview partners’ speech. Journal of Nonverbal Behavior, 14, 237–251.
Gregory, S. W., Dagan, K., & Webster, S. (1997). Evaluating the relation of vocal accommodation in conversational partners’ fundamental frequencies to perceptions of communication quality. Journal of Nonverbal Behavior, 21, 23–43.
Gregory, S. W., & Webster, S. (1996). A nonverbal signal in voices of interview partners effectively predicts communication accommodation and social status predictions. Journal of Personality & Social Psychology, 70, 1231–1240.
Labov, W. (1972). The recent history of some dialect markers on the island of Martha’s Vineyard, Mass. In: L. M. Davis (Ed.), Studies in linguistics in honor of Raven I. McDavid Jr. Alabama: University of Alabama Press.
Labov, W. (1974). Linguistic change as a form of communication. In: A. Silverstein (Ed.), Human communication: Theoretical explorations (pp. 221–256). Hillsdale, NJ: Lawrence Erlbaum Associates.
Labov, W. (1986). Sources of inherent variation in the speech process. In: J. S. Perkell, & D. H. Klatt (Eds.), Invariance and variability in the speech processes (pp. 402–425). New Jersey: Lawrence Erlbaum Associates.
Labov, W. (2006a). The social stratification of English in New York City (2nd ed.). Cambridge: Cambridge University Press.
Labov, W. (2006b). The Atlas of North American English. New York: Mouton.
Labov, W. (2006c). A sociolinguistic perspective on sociophonetic research. Journal of Phonetics, 34, 500–515.
Lakin, J. L., & Chartrand, T. L. (2003). Using unconscious behavioral mimicry to create affiliation and rapport. Psychological Science, 14, 334–339.
Miller, J. L., & Grosjean, F. (1981). How the components of speaking rate influence the perception of phonetic segments. Journal of Experimental Psychology: Human Perception and Performance, 7, 208–215.
Namy, L. L., Nygaard, L. C., & Sauerteig, D. (2002). Gender differences in vocal accommodation: The role of perception. Journal of Language and Social Psychology, 21, 422–432.
Natale, M. (1975). Convergence of mean vocal intensity in dyadic communication as a function of social desirability. Journal of Personality & Social Psychology, 32, 790–804.
Nearey, T. M. (1989). Static, dynamic, and relational properties in vowel perception. Journal of the Acoustical Society of America, 85, 2088–2113.
Pardo, J. S. (2006). On phonetic convergence during conversational interaction. Journal of the Acoustical Society of America, 119(4), 2382–2392.
Pardo, J. S. (2010). Expressing oneself in conversational interaction. To appear. In: E. Morsella (Ed.), Expressing oneself/expressing one’s self (pp. 183–196). Taylor & Francis.
Pardo, J. S., Cajori Jay, I., & Krauss, R. M. (2010). Conversational role influences speech imitation. Attention, Perception, & Psychophysics, 72, 2254–2264.
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral & Brain Sciences, 27, 169–190.
Sancier, M. L., & Fowler, C. A. (1997). Gestural drift in a bilingual speaker of Brazilian Portuguese and English. Journal of Phonetics, 25, 421–436.
Shepard, C. A., Giles, H., & Le Poire, B. A. (2001). Communication accommodation theory. In: W. P. Robinson, & H. Giles (Eds.), The new handbook of language and social psychology. Chichester, UK: John Wiley & Sons, Ltd.
Shockley, K., Sabadini, L., & Fowler, C. A. (2004). Imitation in shadowing words. Perception & Psychophysics, 66(3), 422–429.
Street, R. L., & Giles, H. (1982). Speech accommodation theory: A social cognitive approach to language and speech behavior. In: M. Roloff, & C. Berger (Eds.), Social cognition and communication. Beverly Hills, CA: Sage.
Thomas, E. R., & Kendall, T. (2007). NORM: The vowel normalization and plotting suite. [Online resource: http://ncslaap.lib.ncsu.edu/tools/norm/].
Thomas, E. R., Kendall, T., Yeager-Dror, M., & Kretzschmar, W. (2007). Two things sociolinguists should know: Software packages for vowel normalization, and accessing linguistic atlas data. In Proceedings of the workshop at new ways of analyzing variation (NWAV) (Vol. 36). Pennsylvania, PA: University of Pennsylvania.
Triandis, H. C., & Triandis, L. M. (1960). Race, social class, religion, and nationality as determinants of social distance. Journal of Abnormal and Social Psychology, 61, 110–118.
Vallabha, G. K., & Tuller, B. (2004). Perceptuomotor bias in the imitation of steady-state vowels. Journal of the Acoustical Society of America, 116, 1184–1197.