Classifying Particle Semantics in English Verb-Particle Constructions


Paul Cook
Department of Computer Science
University of Toronto
Toronto, ON M5S 3G4
Canada
[email protected]

Suzanne Stevenson
Department of Computer Science
University of Toronto
Toronto, ON M5S 3G4
Canada
[email protected]

Abstract

Previous computational work on learning the semantic properties of verb-particle constructions (VPCs) has focused on their compositionality, and has left unaddressed the issue of which meaning of the component words is being used in a given VPC. We develop a feature space for use in classification of the sense contributed by the particle in a VPC, and test this on VPCs using the particle up. The features that capture linguistic properties of VPCs that are relevant to the semantics of the particle outperform linguistically uninformed word co-occurrence features in our experiments on unseen test VPCs.

1 Introduction

A challenge in learning the semantics of multiword expressions (MWEs) is their varying degrees of compositionality—the contribution of each component word to the overall semantics of the expression. MWEs fall on a range from fully compositional (i.e., each component contributes its meaning, as in frying pan) to non-compositional or idiomatic (as in hit the roof). Because of this variation, researchers have explored automatic methods for learning whether, or the degree to which, an MWE is compositional (e.g., Lin, 1999; Bannard et al., 2003; McCarthy et al., 2003; Fazly et al., 2005).

However, such work leaves unaddressed the basic issue of which of the possible meanings of a component word is contributed when the MWE is (at least partly) compositional. Words are notoriously ambiguous, so that even if it can be determined that an MWE is compositional, its meaning is still unknown, since the actual semantic contribution of the components is yet to be determined. We address this problem in the domain of verb-particle constructions (VPCs) in English, a rich source of MWEs.

VPCs combine a verb with any of a finite set of particles, as in jump up, figure out, or give in. Particles such as up, out, or in, with their literal meaning based in physical spatial relations, show a variety of metaphorical and aspectual meaning extensions, as exemplified here for the particle up:

(1a) The sun just came up. [vertical spatial movement]

(1b) She walked up to him. [movement toward a goal]

(1c) Drink up your juice! [completion]

(1d) He curled up into a ball. [reflexive movement]

Cognitive linguistic analysis, as in Lindner (1981), can provide the basis for elaborating this type of semantic variation.

Given such a sense inventory for a particle, our goal is to automatically determine its meaning when used with a given verb in a VPC. We classify VPCs according to their particle sense, using statistical features that capture the semantic and syntactic properties of verbs and particles. We contrast these with simple word co-occurrence features, which are often used to indicate the semantics of a target word. In our experiments, we focus on VPCs using the particle up because it is highly frequent and has a wide range of meanings. However, it is worth emphasizing that our feature space draws on general properties of VPCs, and is not specific to this particle.

A VPC may be ambiguous, with its particle occurring in more than one sense; in contrast to (1a), come up may use up in a goal-oriented sense as in The deadline is coming up.

While our long-term goal is token classification (disambiguation) of a VPC in context, following other work on VPCs (e.g., Bannard et al., 2003; McCarthy et al., 2003), we begin here with the task of type classification. Given our use of features which capture the statistical behaviour relevant to a VPC across a corpus, we assume that the outcome of type classification yields the predominant sense of the particle in the VPC. Predominant sense identification is a useful component of sense disambiguation of word tokens (McCarthy et al., 2004), and we presume our VPC type classification work will form the basis for later token disambiguation.

Section 2 continues the paper with a discussion of the features we developed for particle sense classification. Section 3 first presents some brief cognitive linguistic background, followed by the sense classes of up used in our experiments. Sections 4 and 5 discuss our experimental set-up and results, Section 6 related work, and Section 7 our conclusions.

2 Features Used in Classification

The following subsections describe the two sets of features we investigated. The linguistic features are motivated by specific semantic and syntactic properties of verbs and VPCs, while the word co-occurrence features are more general.

2.1 Linguistically Motivated Features

2.1.1 Slot Features

We hypothesize that the semantic contribution of a particle when combined with a given verb is related to the semantics of that verb. That is, the particle contributes the same meaning when combining with any of a semantic class of verbs.[1] For example, the VPCs drink up, eat up and gobble up all draw on the completion sense of up; the VPCs puff out, spread out and stretch out all draw on the extension sense of out. The prevalence of these patterns suggests that features which have been shown to be effective for the semantic classification of verbs may be useful for our task.

We adopt simple syntactic “slot” features which have been successfully used in automatic semantic classification of verbs (Joanis and Stevenson, 2003). The features are motivated by the fact that semantic properties of a verb are reflected in the syntactic expression of the participants in the event the verb describes. The slot features encode the relative frequencies of the syntactic slots—subject, direct and indirect object, object of a preposition—that the arguments and adjuncts of a verb appear in. We calculate the slot features over three contexts: all uses of a verb; all uses of the verb in a VPC with the target particle (up in our experiments); and all uses of the verb in a VPC with any of a set of high frequency particles (to capture its semantics when used in VPCs in general).

[1] Villavicencio (2005) observes that verbs from a semantic class will form VPCs with similar sets of particles. Here we are hypothesizing further that VPCs formed from verbs of a semantic class draw on the same meaning of the given particle.
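
As a concrete illustration of the slot features, the following is a minimal sketch, not the ExtractVerb-based implementation used in the paper (see Section 4.2). The corpus representation (triples of verb lemma, co-occurring particle if any, and the slots filled by the verb's arguments and adjuncts) and the set HIGH_FREQ of high frequency particles are assumptions for the example.

    from collections import Counter

    SLOTS = ["subject", "direct_object", "indirect_object", "object_of_prep"]

    def slot_features(clauses, verb, particle=None, particles=None):
        """Relative frequency of each syntactic slot for the arguments and
        adjuncts of `verb`, optionally restricted to the verb's uses in a
        VPC with one target particle, or with any of a set of particles.

        `clauses` is an assumed list of (verb, particle_or_None, arg_slots).
        """
        counts = Counter()
        for v, p, arg_slots in clauses:
            if v != verb:
                continue
            if particle is not None and p != particle:
                continue
            if particles is not None and p not in particles:
                continue
            counts.update(s for s in arg_slots if s in SLOTS)
        total = sum(counts.values()) or 1
        return [counts[s] / total for s in SLOTS]

    # Three contexts, concatenated into one feature vector for the verb:
    # slot_features(clauses, "drink")                        # all uses
    # slot_features(clauses, "drink", particle="up")         # VPCs with up
    # slot_features(clauses, "drink", particles=HIGH_FREQ)   # VPCs with any
    #                                                        # frequent particle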

2.1.2 Particle Features

Two types of features are motivated by properties specific to the semantics and syntax of particles and VPCs. First, Wurmbrand (2000) notes that compositional particle verbs in German (a somewhat related phenomenon to English VPCs) allow the replacement of their particle with semantically similar particles. We extend this idea, hypothesizing that when a verb combines with a particle such as up in a particular sense, the pattern of usage of that verb in VPCs using all other particles may be indicative of the sense of the target particle (in this case up) when combined with that verb. To reflect this observation, we count the relative frequency of any occurrence of the verb used in a VPC with each of a set of high frequency particles.

Second, one of the striking syntactic properties of VPCs is that they can often occur in either the joined configuration (2a) or the split configuration (2b):

(2a) Drink up your milk! He walked out quickly.

(2b) Drink your milk up! He walked quickly out.

Bolinger (1971) notes that the joined construction may be more favoured when the sense of the particle is not literal. To encode this, we calculate the relative frequency of the verb co-occurring with the particle up with each of a small range of numbers of words between the verb and up, reflecting varying degrees of verb-particle separation.
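
Both particle feature groups reduce to relative-frequency counts over extracted VPC instances, as in this minimal sketch. The particle set, the separation cap MAX_SEP (the exact range used in the paper was lost in extraction), and the (verb, particle, separation) triple format are all assumptions.

    from collections import Counter

    HIGH_FREQ = ["up", "out", "in", "down", "off", "on", "away", "back"]  # assumed
    MAX_SEP = 5  # assumed cap on words between verb and particle

    def particle_profile(vpc_instances, verb):
        """Relative frequency of `verb` forming a VPC with each
        high-frequency particle."""
        counts = Counter(p for v, p, sep in vpc_instances if v == verb)
        total = sum(counts[p] for p in HIGH_FREQ) or 1
        return [counts[p] / total for p in HIGH_FREQ]

    def separation_profile(vpc_instances, verb, particle="up"):
        """Relative frequency of each degree of verb-particle separation
        (0 = joined, as in (2a); >0 = split, as in (2b))."""
        seps = Counter(min(sep, MAX_SEP)
                       for v, p, sep in vpc_instances
                       if v == verb and p == particle)
        total = sum(seps.values()) or 1
        return [seps[d] / total for d in range(MAX_SEP + 1)]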
2.2 Word Co-occurrence Features

We also explore the use of general context features, in the form of word co-occurrence frequency vectors, which have been used in numerous approaches to determining the semantics of a target word. Note, however, that unlike the task of word sense disambiguation, which examines the context of a target word token to be disambiguated, here we are looking at aggregate contexts across all instances of a target VPC, in order to perform type classification.

We adopt very simple word co-occurrence features (WCFs), calculated as the frequency of any (non-stoplist) word within a certain window left and right of the target. We noted above that the target particle semantics is related both to the semantics of the verb it co-occurs with, and to the occurrence of the verb across VPCs with different particles. Thus we not only calculate the WCFs of the target VPC (a given verb used with the particle up), but also the WCFs of the verb itself, and the verb used in a VPC with any of the high frequency particles. These WCFs give us a very general means for determining semantics, whose performance we can contrast with our linguistic features.

3 Particle Semantics and Sense Classes

We give some brief background on cognitive grammar and its relation to particle semantics, and then turn to the semantic analysis of up that we draw on as the basis for the sense classes in our experiments.

3.1 Cognitive Grammar and Schemas

Some linguistic studies consider many VPCs to be idiomatic, but do not give a detailed account of the semantic similarities between them (Bolinger, 1971; Fraser, 1976; Jackendoff, 2002). In contrast, work in cognitive linguistics has claimed that many so-called idiomatic expressions draw on the compositional contribution of (at least some of) their components (Lindner, 1981; Morgan, 1997; Hampe, 2000). In cognitive grammar (Langacker, 1987), non-spatial concepts are represented as spatial relations. Key terms from this framework are:

Trajector (TR): The object which is conceptually foregrounded.

Landmark (LM): The object against which the TR is foregrounded.

Schema: An abstract conceptualization of an experience. Here we focus on schemas depicting a TR, LM and their relationship in both the initial configuration and the final configuration communicated by some expression.

[Figure 1: Schema for Vertical up. (Diagram showing the TR moving away from the LM between the initial and final configurations.)]

The semantic contribution of a particle in a VPC corresponds to a schema. For example, in sentence (3), the TR is the balloon and the LM is the ground the balloon is moving away from.

(3) The balloon floated up.

The schema describing the semantic contribution of the particle in the above sentence is shown in Figure 1, which illustrates the relationship between the TR and LM in the initial and final configurations.

3.2 The Senses of up

Lindner (1981) identifies a set of schemas for each of the particles up and out, and groups VPCs according to which schema is contributed by their particle. Here we describe the four senses of up identified by Lindner.

3.2.1 Vertical up (Vert-up)

In this schema (shown above in Figure 1), the TR moves away from the LM in the direction of increase along a vertically oriented axis. This includes prototypical spatial upward movement such as that in sentence (3), as well as upward movement along an abstract vertical axis as in sentence (4).

(4) The price of gas jumped up.

In Lindner's analysis, this sense also includes extensions of upward movement where a vertical path or posture is still salient. Note that in some of these senses, the notion of verticality is metaphorical; the contribution of such senses to a VPC may not be considered compositional in a traditional analysis. Some of the most common sense extensions are given below, with a brief justification as to why verticality is still salient.
Up as a path into perceptual field. Spatially high objects are generally easier to perceive. Examples: show up, spring up, whip up.

Up as a path into mental field. Here up encodes a path for mental as opposed to physical objects. Examples: dream up, dredge up, think up.

Up as a path into a state of activity. Activity is prototypically associated with an erect posture. Examples: get up, set up, start up.

3.2.2 Goal-Oriented up (Goal-up)

[Figure 2: Schema for Goal-Oriented up. (Diagram showing the TR approaching a goal LM between the initial and final configurations.)]

Here the TR approaches a goal LM; movement is not necessarily vertical (see Figure 2). Prototypical examples are walk up and march up. This category also includes extensions into the social domain (kiss up and suck up), as well as extensions into the domain of time (come up and move up), as in:

(5a) The intern kissed up to his boss.

(5b) The deadline is coming up quickly.

3.2.3 Completive up (Cmpl-up)

Cmpl-up is a sub-sense of Goal-up in which the goal represents an action being done to completion. This sense shares its schema with Goal-up (Figure 2), but it is considered as a separate sense since it corresponds to uses of up as an aspectual marker. Examples of Cmpl-up are: clean up, drink up, eat up, finish up and study up.

3.2.4 Reflexive up (Refl-up)

[Figure 3: Schema for Reflexive up. (Diagram in which the LM and TR are the same object, shown in the initial and final configurations.)]

Reflexive up is a sub-sense of Goal-up in which the sub-parts of the TR are approaching each other. The schema for Refl-up is shown in Figure 3; it is unique in that the TR and LM are the same object. Examples of Refl-up are: bottle up, connect up, couple up, curl up and roll up.

3.3 The Sense Classes for Our Study

Adopting a cognitive linguistic perspective, we assume that all uses of a particle make some compositional contribution of meaning to a VPC. In this work, we classify target VPCs according to which of the above senses of up is contributed to the expression. For example, the expressions jump up and pick up are designated as being in the class Vert-up since up in these VPCs has the vertical sense, while clean up and drink up are designated as being in the class Cmpl-up since up here has the completive sense. The relations among the senses of up can be shown in a “schematic network” (Langacker, 1987). Figure 4 shows a simplification of such a network in which we connect more similar senses with shorter edges. This type of analysis allows us to alter the granularity of our classification in a linguistically motivated fashion by combining closely related senses. Thus we can explore the effect of different sense granularities on classification.

[Figure 4: Simplified schematic network for up, linking the senses Vertical up, Goal-Oriented up, Completive up and Reflexive up.]

4 Materials and Methods

4.1 Experimental Expressions

We created a list of English VPCs using up, based on a list of VPCs made available by McIntyre (2001) and a list of VPCs compiled by two human judges. The judges then filtered this list to include only VPCs which they both agreed were valid, resulting in a final list of 389 VPCs. From this list, training, verification and test sets of sixty VPCs each are randomly selected. Note that the expense of manually annotating the data (as described below) prevents us from using larger datasets in this initial investigation. The experimental sets are chosen such that each includes the same proportion of verbs across three frequency bands, so that the sets do not differ in frequency distribution of the verbs. (We use frequency of the verbs, rather than the VPCs, since many of our features are based on the verb of the expression, and moreover, VPC frequency is approximate.) The verification data is used in exploration of the feature space and selection of final features to use in testing; the test set is held out for final testing of the classifiers.
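
Such a frequency-matched split amounts to stratified sampling over frequency bands. The sketch below illustrates one way to do it, with VPCs represented as (verb, particle) pairs; the equal-sized bands and the sampling details are assumptions, not the authors' exact procedure.

    import random

    def stratified_split(vpcs, verb_freq, n_bands=3, set_sizes=(60, 60, 60)):
        """Split VPCs into train/verification/test sets with the same
        proportion of verbs from each frequency band."""
        ranked = sorted(vpcs, key=lambda vpc: verb_freq[vpc[0]])
        band_size = len(ranked) // n_bands
        bands = [ranked[i * band_size:(i + 1) * band_size]
                 for i in range(n_bands)]
        sets = [[] for _ in set_sizes]
        for band in bands:
            random.shuffle(band)
            for out, size in zip(sets, set_sizes):
                take = size // n_bands   # 20 per band for 60-item sets
                out.extend(band[:take])
                del band[:take]
        return sets  # train, verification, test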
Each VPC in each dataset is annotated by the two human judges according to which of the four senses of up identified in Section 3.2 is contributed to the VPC. As noted in Section 1, VPCs may be ambiguous with respect to their particle sense. Since our task here is type classification, the judges identify the particle sense of a VPC in its predominant usage, in their assessment. The observed inter-annotator agreement is … for each dataset. The unweighted observed kappa scores are …, … and … for the training, verification and test sets respectively.

4.2 Calculation of the Features

We extract our features from the 100M word British National Corpus (BNC, Burnard, 2000). VPCs are identified using a simple heuristic based on part-of-speech tags, similar to one technique used by Baldwin (2005). A use of a verb is considered a VPC if it occurs with a particle (tagged AVP) within a six word window to the right. Over a random sample of 113 VPCs thus extracted, we found 88% to be true VPCs, somewhat below the performance of Baldwin's (2005) best extraction method, indicating potential room for improvement.
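
This heuristic is simple enough to sketch directly. The version below assumes sentences represented as (token, tag) pairs with BNC-style CLAWS tags and a verb-tag test on the "VV" prefix; those representational details are assumptions, but the AVP-within-six-words rule is as described above.

    def extract_vpcs(tagged_sentence, window=6):
        """Return (verb, particle, separation) triples for each verb
        followed by a particle (tagged AVP) within `window` words."""
        vpcs = []
        for i, (word, tag) in enumerate(tagged_sentence):
            if not tag.startswith("VV"):   # assumed verb-tag test
                continue
            for j in range(i + 1, min(i + 1 + window, len(tagged_sentence))):
                p_word, p_tag = tagged_sentence[j]
                if p_tag == "AVP":         # BNC tag for adverbial particles
                    vpcs.append((word, p_word, j - i - 1))
                    break
        return vpcs

    # extract_vpcs([("drink", "VVB"), ("your", "DPS"),
    #               ("milk", "NN1"), ("up", "AVP")])
    # -> [("drink", "up", 2)]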
The slot and particle features are calculated using a modified version of the ExtractVerb software provided by Joanis and Stevenson (2003), which runs over the BNC pre-processed using Abney's (1991) Cass chunker.

To compute the word co-occurrence features (WCFs), we first determine the relative frequency of all words which occur within a five word window left and right of any of the target expressions in the training data. From this list we eliminate the most frequent 1% of words as a stoplist and then use the next n most frequent words as “feature words”. For each “feature word”, we then calculate its relative frequency of occurrence within the same five word window of the target expressions in all datasets. We use n = 200 and n = 500 to create feature sets WCF200 and WCF500 respectively.
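
The WCF construction can be sketched in a few lines; the helper that collects the words in the five-word windows around each occurrence of a target is assumed to exist.

    from collections import Counter

    def feature_words(train_context_words, n):
        """Drop the most frequent 1% of context words as a stoplist,
        then take the next n most frequent words (n = 200 or 500)."""
        ranked = [w for w, _ in Counter(train_context_words).most_common()]
        stoplist_size = len(ranked) // 100
        return ranked[stoplist_size:stoplist_size + n]

    def wcf_vector(target_context_words, fwords):
        """Relative frequency of each feature word in the aggregated
        context windows of one target expression."""
        counts = Counter(target_context_words)
        total = sum(counts.values()) or 1
        return [counts[w] / total for w in fwords]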

4.3 Experimental Classes

Table 1 shows the distribution of senses in each dataset. Each of the training and verification sets has only one VPC corresponding to Goal-up. Recall that Goal-up shares a schema with Cmpl-up, and is therefore very close to it in meaning, as indicated spatially in Figure 4. We therefore merge Goal-up and Cmpl-up into a single sense, to provide more balanced classes.

                       #VPCs in Sense Class
    Sense Class     Train   Verification   Test
    Vert-up           24         33         27
    Goal-up            1          1          3
    Cmpl-up           20         23         22
    Refl-up           15          3          8

Table 1: Frequency of items in each sense class.

Since we want to see how our features perform on differing granularities of sense classes, we run each experiment as both a 3-way and 2-way classification task. In the 3-way task, the sense classes correspond to the meanings Vert-up, Goal-up merged with Cmpl-up (as noted above), and Refl-up, as shown in Table 2.

                             #VPCs in Sense Class
    Sense Class           Train   Verification   Test
    Vert-up                 24         33         27
    Goal-up + Cmpl-up       21         24         25
    Refl-up                 15          3          8

Table 2: Frequency of items in each class for the 3-way task.

In the 2-way task, we further merge the classes corresponding to Goal-up/Cmpl-up with that of Refl-up, as shown in Table 3. We choose to merge these classes because (as illustrated in Figure 4) Refl-up is a sub-sense of Goal-up, and moreover, all three of these senses contrast with Vert-up, in which increase along a vertical axis is the salient property. It is worth emphasizing that the 2-way task is not simply a classification between literal and non-literal up—Vert-up includes extensions of up in which the increase along a vertical axis is metaphorical.

                                       #VPCs in Sense Class
    Sense Class                     Train   Verification   Test
    Vert-up                           24         33         27
    Goal-up + Cmpl-up + Refl-up       36         27         33

Table 3: Frequency of items in each class for the 2-way task.

4.4 Evaluation Metrics and Classifier Software

The variation in the frequency of the sense classes of up across the datasets makes the true distribution of the classes difficult to estimate. Furthermore, there is no obvious informed baseline for this task. Therefore, we make the assumption that the true distribution of the classes is uniform, and use the chance accuracy 1/m as the baseline (where m is the number of classes—in our experiments, either 2 or 3). Accordingly, our measure of classification accuracy should weight each class evenly. Therefore, we report the average per-class accuracy, which gives equal weight to each class.
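
A minimal sketch of this measure (the function name is ours):

    def per_class_accuracy(gold, predicted):
        """Mean over classes of within-class accuracy, weighting each
        class equally regardless of its size."""
        accs = []
        for c in set(gold):
            indices = [i for i, g in enumerate(gold) if g == c]
            accs.append(sum(predicted[i] == c for i in indices) / len(indices))
        return sum(accs) / len(accs)

Under this measure a uniform random guesser scores 1/m in expectation, which is the chance baseline (1/2 or 1/3) used below.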
For classification we use LIBSVM (Chang and Lin, 2001), an implementation of a support-vector machine. We set the input parameters, cost and gamma, using 10-fold cross-validation on the training data. In addition, we assign a weight to each class to eliminate the effects of the variation in class size on the classifier.
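
An equivalent setup can be sketched with scikit-learn's wrapper around LIBSVM; the parameter grid here is an assumption for illustration, not the grid used in the paper.

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    # class_weight="balanced" weights classes inversely to their frequency,
    # mirroring the per-class weighting described above; cost (C) and gamma
    # are tuned by 10-fold cross-validation on the training data.
    classifier = GridSearchCV(
        SVC(kernel="rbf", class_weight="balanced"),
        param_grid={"C": [2 ** k for k in range(-5, 16, 2)],
                    "gamma": [2 ** k for k in range(-15, 4, 2)]},
        cv=10,
    )
    # classifier.fit(X_train, y_train)
    # predictions = classifier.predict(X_test)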
Note that our choice of accuracy measure and weighting of classes in the classifier is necessary given our assumption of a uniform random baseline. Since the accuracy values we report incorporate this weighting, these results cannot be compared to a baseline of always choosing the most frequent class.

5 Experimental Results

We present experimental results for both Ver(ification) and unseen Test data, on each set of features, individually and in combination. All experiments are run on both the 2-way and 3-way sense classification, which have a chance baseline of 50% and 33%, respectively.

5.1 Experiments Using the Linguistic Features

The results for experiments using the features that capture semantic and syntactic properties of verbs and VPCs are summarized in Table 4, and discussed in turn below.

                           3-way Task      2-way Task
    Features               Ver   Test      Ver   Test
    Slots                   41    51        53    67
    Particles               37    33        65    47
    Slots + Particles       54    54        59    63

Table 4: Accuracy (%) using linguistic features.

5.1.1 Slot Features

Experiments using the slot features alone test whether features that tap into semantic information about a verb are sufficient to determine the appropriate sense class of a particle when that verb combines with it in a VPC. Although accuracy on the test data is well above the baseline in both the 2-way and 3-way tasks, for verification data the increase over the baseline is minimal. The class corresponding to sense Refl-up in the 3-way task is relatively small, which means that a small variation in classification on these verbs may lead to a large variation in accuracy. However, we find that the difference in accuracy across the datasets is not due to performance on VPCs in this sense class. Although these features show promise for our task, the variation across the datasets indicates the limitations of our small sample sizes.

5.1.2 Particle Features

We also examine the performance of the particle features on their own, since to the best of our knowledge, no such features have been used before in investigating VPCs. The results are disappointing, with only the verification data on the 2-way task showing substantially higher accuracy than the baseline. An analysis of errors reveals no consistent explanation, suggesting again that the variation may be due to small sample sizes.

5.1.3 Slot + Particle Features

We hypothesize that the combination of the slot features with the particle features will give an increase in performance over either set of linguistic features used individually, given that they tap into differing properties of verbs and VPCs. We find that the combination does indeed give more consistent performance across verification and test data than either feature set used individually.

We analyze the errors made using slot and particle features separately, and find that they tend to classify different sets of verbs incorrectly. Therefore, we conclude that these feature sets are at least somewhat complementary. By combining these complementary feature sets, the classifier is better able to generalise across different datasets.

5.2 Experiments Using WCFs

Our goal was to compare the more knowledge-rich slot and particle features to an alternative feature set, the WCFs, which does not rely on linguistic analysis of the semantics and syntax of verbs and VPCs. Recall that we experiment with both 200 feature words, WCF200, and 500 feature words, WCF500, as shown in Table 5. Most of the experiments using WCFs perform worse than the corresponding experiment using all the linguistic features. It appears that the linguistically motivated features are better suited to our task than simple word context features.

                 3-way Task      2-way Task
    Features     Ver   Test      Ver   Test
    WCF200        45    42        59    51
    WCF500        38    34        55    48

Table 5: Accuracy (%) using WCFs.

5.3 Linguistic Features and WCFs Combined

Although the WCFs on their own perform worse than the linguistic features, we find that the linguistic features and WCFs are at least somewhat complementary since they tend to classify different verbs incorrectly. We hypothesize that, as with the slot and particle features, the different types of information provided by the linguistic features and WCFs may improve performance in combination. We therefore combine the linguistic features with each of the WCF200 and WCF500 features; see Table 6. However, contrary to our hypothesis, for the most part, the experiments using the full combination of features give accuracies the same or below that of the corresponding experiment using just the linguistic features. We surmise that these very different types of features—the linguistic features and WCFs—must be providing conflicting rather than complementary information to the classifier, so that no improvement is attained.

                    3-way Task      2-way Task
    Features        Ver   Test      Ver   Test
    Combined200      53    45        63    53
    Combined500      54    46        65    49

Table 6: Accuracy (%) combining linguistic features with WCFs.

5.4 Discussion of Results

The best performance across the datasets is attained using all the linguistic features. The linguistically uninformed WCFs perform worse on their own, and do not consistently help (and in some cases hurt) the performance of the linguistic features when combined with them. We conclude then that linguistically based features are motivated for this task. Note that the features are still quite simple, and straightforward to extract from a corpus—i.e., linguistically informed does not mean expensive (although the slot features do require access to chunked text).

Interestingly, in determining the semantic nearest neighbor of German particle verbs, Schulte im Walde (2005) found that WCFs that are restricted to the arguments of the verb outperform simple window-based co-occurrence features. Although her task is quite different from ours, similarly restricting our WCFs may enable them to encode more linguistically-relevant information.

The accuracies we achieve with the linguistic features correspond to a 30–31% reduction in error rate over the chance baseline for the 3-way task, and an 18–26% reduction in error rate for the 2-way task. Although we expected that the 2-way task might be easier, since it requires less fine-grained distinctions, it is clear that combining senses that have some motivation for being treated separately comes at a price.
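
These reductions follow from comparing each accuracy against the chance error rate. A worked check, using the Slots + Particles accuracies from Table 4 and the chance baselines of 33% and 50% (the formula is a standard one and our reconstruction, but it approximately reproduces the reported figures):

    reduction in error rate = (accuracy − baseline) / (1 − baseline)

    3-way task: (0.54 − 0.33) / (1 − 0.33) ≈ 0.31 on both Ver and Test
    2-way task: (0.59 − 0.50) / 0.50 = 0.18 (Ver); (0.63 − 0.50) / 0.50 = 0.26 (Test)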
The reductions in error rate that we achieve with our best features are quite respectable for a first attempt at addressing this problem, but more work clearly remains. There is a relatively high variability in performance across the verification and test sets, indicating that we need a larger number of experimental expressions to be able to draw firmer conclusions. Even if our current results extend to larger datasets, we intend to explore other feature approaches, such as word co-occurrence features for specific syntactic slots as suggested above, in order to improve the performance.

6 Related Work

The semantic compositionality of VPC types has recently received increasing attention. McCarthy et al. (2003) use several measures to automatically rate the overall compositionality of a VPC. Bannard (2005), extending work by Bannard et al. (2003), instead considers the extent to which the verb and particle each contribute semantically to the VPC. In contrast, our work assumes that the particle of every VPC contributes compositionally to its meaning. We draw on cognitive linguistic analysis that posits a rich set of literal and metaphorical meaning possibilities of a particle, which has been previously overlooked in computational work on VPCs.

In this first investigation of particle meaning in VPCs, we choose to focus on type-based classification, partly due to the significant extra expense of manually annotating sufficient numbers of tokens in text. As noted earlier, though, VPCs can take on different meanings, indicating a shortcoming of type-based work. Patrick and Fletcher (2005) classify VPC tokens, considering each as compositional, non-compositional or not a VPC. Again, however, it is important to recognize which of the possible meaning components is being contributed. In this vein, Uchiyama et al. (2005) tackle token classification of Japanese compound verbs (similar to VPCs) as aspectual, spatial, or adverbial. In the future, we aim to extend the scope of our work, to determine the meaning of a particle in a VPC token, along the lines of our sense classes here. This will almost certainly require semantic classification of the verb token (Lapata and Brew, 2004), similar to our approach here of using the semantic class of a verb type as indicative of the meaning of a particle type.

Particle semantics has clear relations to preposition semantics. Some research has focused on the sense disambiguation of specific prepositions (e.g., Alam, 2004), while other work has classified preposition tokens according to their semantic role (O'Hara and Wiebe, 2003). Moreover, two large lexical resources of preposition senses are currently under construction, The Preposition Project (Litkowski, 2005) and PrepNet (Saint-Dizier, 2005). These resources were not suitable as the basis for our sense classes because they do not address the range of metaphorical extensions that a preposition/particle can take on, but future work may enable larger scale studies of the type needed to adequately address VPC semantics.

7 Conclusions

While progress has recently been made in techniques for assessing the compositionality of VPCs, work thus far has left unaddressed the problem of determining the particular meaning of the components. We focus here on the semantic contribution of the particle—a part of speech whose semantic complexity and range of metaphorical meaning extensions have been largely overlooked in prior computational work. Drawing on work within cognitive linguistics, we annotate a set of 180 VPCs according to the sense class of the particle up, our experimental focus in this initial investigation. We develop features that capture linguistic properties of VPCs that are relevant to the semantics of particles, and show that they outperform linguistically uninformed word co-occurrence features, achieving around a 20–30% reduction in error rate over a chance baseline. Areas of on-going work include development of a broader range of features, consideration of methods for token-based semantic determination, and creation of larger experimental datasets.

References

S. Abney. 1991. Parsing by chunks. In R. Berwick, S. Abney, and C. Tenny, editors, Principle-Based Parsing: Computation and Psycholinguistics, p. 257–278. Kluwer Academic Publishers.

Y. S. Alam. 2004. Decision trees for sense disambiguation of prepositions: Case of over. In HLT-NAACL 2004: Workshop on Computational Lexical Semantics, p. 52–59.

T. Baldwin. 2005. The deep lexical acquisition of English verb-particle constructions. Computer Speech and Language, Special Issue on Multiword Expressions, 19(4):398–414.

C. Bannard. 2005. Learning about the meaning of verb-particle constructions from corpora. Computer Speech and Language, Special Issue on Multiword Expressions, 19(4):467–478.

C. Bannard, T. Baldwin, and A. Lascarides. 2003. A statistical approach to the semantics of verb-particles. In Proceedings of the ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, p. 65–72.

D. Bolinger. 1971. The Phrasal Verb in English. Harvard University Press.

L. Burnard. 2000. The British National Corpus Users Reference Guide. Oxford University Computing Services.

C.-C. Chang and C.-J. Lin. 2001. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

A. Fazly, R. North, and S. Stevenson. 2005. Automatically distinguishing literal and figurative usages of highly polysemous verbs. In Proceedings of the ACL-2005 Workshop on Deep Lexical Acquisition.

B. Fraser. 1976. The Verb-Particle Combination in English. Academic Press.

B. Hampe. 2000. Facing up to the meaning of ‘face up to’: A cognitive semantico-pragmatic analysis of an English verb-particle construction. In A. Foolen and F. van der Leek, editors, Constructions in Cognitive Linguistics. Selected Papers from the Fifth International Cognitive Linguistics Conference, p. 81–101. John Benjamins Publishing Company.

R. Jackendoff. 2002. English particle constructions, the lexicon, and the autonomy of syntax. In N. Dehe, R. Jackendoff, A. McIntyre, and S. Urban, editors, Verb-Particle Explorations. Mouton de Gruyter.

E. Joanis and S. Stevenson. 2003. A general feature space for automatic verb classification. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL-2003), p. 163–170.

R. W. Langacker. 1987. Foundations of Cognitive Grammar: Theoretical Prerequisites, volume 1. Stanford University Press, Stanford.

M. Lapata and C. Brew. 2004. Verb class disambiguation using informative priors. Computational Linguistics, 30(1):45–73.

D. Lin. 1999. Automatic identification of non-compositional phrases. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, p. 317–324.

S. Lindner. 1981. A lexico-semantic analysis of English verb particle constructions with out and up. Ph.D. thesis, University of California, San Diego.

K. C. Litkowski. 2005. The Preposition Project. In Proceedings of the Second ACL-SIGSEM Workshop on the Linguistic Dimensions of Prepositions and their Use in Computational Linguistics Formalisms and Applications.

D. McCarthy, B. Keller, and J. Carroll. 2003. Detecting a continuum of compositionality in phrasal verbs. In Proceedings of the ACL-SIGLEX Workshop on Multiword Expressions: Analysis, Acquisition and Treatment.

D. McCarthy, R. Koeling, J. Weeds, and J. Carroll. 2004. Finding predominant word senses in untagged text. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, p. 280–287.

A. McIntyre. 2001. The particle verb list. http://www.uni-leipzig.de/~angling/mcintyre/pv.list.pdf.

P. S. Morgan. 1997. Figuring out figure out: Metaphor and the semantics of the English verb-particle construction. Cognitive Linguistics, 8(4):327–357.

T. O'Hara and J. Wiebe. 2003. Preposition semantic classification via Penn Treebank and FrameNet. In Proceedings of CoNLL-2003, p. 79–86.

J. Patrick and J. Fletcher. 2005. Classifying verb-particle constructions by verb arguments. In Proceedings of the Second ACL-SIGSEM Workshop on the Linguistic Dimensions of Prepositions and their Use in Computational Linguistics Formalisms and Applications, p. 200–209.

P. Saint-Dizier. 2005. PrepNet: a framework for describing prepositions: Preliminary investigation results. In Proceedings of the Sixth International Workshop on Computational Semantics (IWCS'05), p. 145–157.

S. Schulte im Walde. 2005. Exploring features to identify semantic nearest neighbours: A case study on German particle verbs. In Proceedings of the International Conference on Recent Advances in Natural Language Processing.

K. Uchiyama, T. Baldwin, and S. Ishizaki. 2005. Disambiguating Japanese compound verbs. Computer Speech and Language, Special Issue on Multiword Expressions, 19(4):497–512.

A. Villavicencio. 2005. The availability of verb-particle constructions in lexical resources: How much is enough? Computer Speech and Language, Special Issue on Multiword Expressions, 19(4):415–432.

S. Wurmbrand. 2000. The structure(s) of particle verbs. Master's thesis, McGill University.
