Division of Informatics, University of Edinburgh
Institute for Communicating and Collaborative Systems
Towards A Computational Model Of Poetry Generation
by
Hisar Manurung, Graeme Ritchie, Henry Thompson
Informatics Research Report EDI-INF-RR-0015
Division of Informatics
http://www.informatics.ed.ac.uk/
May 2000
Towards A Computational Model of Poetry Generation
Hisar Maruli Manurung, Graeme Ritchie, Henry Thompson
Division of Informatics
University of Edinburgh
80 South Bridge, Edinburgh EH1 1HN
[email protected],
[email protected],
[email protected]
Abstract
In this paper we describe the difficulties of poetry generation, particularly in contrast to traditional informative natural
language generation. We then point out deficiencies of previous attempts at poetry generation, and propose a stochastic
hillclimbing search model which addresses these deficiencies. We present both conceptual and implemented details of
the most important aspects of such a model, the evaluation and evolution functions. Finally, we report and discuss results
of our preliminary implementation work.
1 Motivation
Poetry is a unique artifact of human natural language production, with the distinctive feature of having a strong
unity between its content and its form. The creation of
poetry is a task that requires intelligence, expert mastery
over world and linguistic knowledge, and creativity. Although some research has been devoted to creative language tasks such as story generation, poetry writing has not been afforded the same attention. It is the aim of this research to fill that gap, and to shed some light on what often seems to be one of the most enigmatic and mysterious forms of artistic expression.
Furthermore, poetry possesses certain characteristics
that render traditional natural language generation (NLG)
systems, which are geared towards a strictly informative
goal, unsuitable due to architectural rigidness.
Lastly, although not readily obvious, there are potential applications for computer generated poetry, such as
the increasingly large industry of electronic entertainment
and interactive fiction, the commercial greeting card poetry genre, and perhaps even the odd pop music lyric or
two.
2 Poetry Generation
2.1 What Is Poetry?
Regarding poetry the artifact, Levin (1962) states that “In
poetry the form of the discourse and its meaning are fused
into a higher unity.” This definition highlights the point of
a strong interaction between semantics, syntax and lexis.
Boulton (1982) reiterates this point, claiming that it is
misleading to separate the physical and intellectual form
of a poem so far as to ask, “What does it mean?”. The
poem means itself.
These are rather esoteric quotations, and one would
expect these quotes to be referring to “high-brow” poetry.
However, we claim that this unity is inherent in simpler
forms of poetry, for example, Hilaire Belloc’s The Lion
(Daly, 1984):
The Lion, the Lion, he dwells in the waste,
He has a big head and a very small waist;
But his shoulders are stark, and his jaws they are grim,
And a good little child will not play with him.
Essentially unity here means that the poem “works”
due to a combination of features at the surface level (the
rhyming of waste and waist, grim and him, the repetition
of The lion, the lion to fit the rhythm), and semantics (description of a lion as told to a child).
Of the many special characteristics that poetic form
possesses, among the most essential are: rhythm, rhyme,
and figurative language.
As for the process of writing poetry, it is often claimed
to proceed in a much more flexible manner than other
writing processes. There is often no well-defined communicative goal, save for a few vague concepts such as
“wintery weather” or “a scary lion”. Furthermore, a human could begin writing a poem inspired by a particular
concept, or scenario, but end up writing a poem about an
altogether different topic.
This specification of loose constraints fits with Sharples (1996) and Boden (1990), who claim that while a writer needs to accept the constraints of goals, plans, and schemas, creative writing requires the breaking of these constraints. Yet these constraints are still necessary, as they allow for the recognition and exploitation of opportunities.
Sharples (1996) models the writing process as that of
creative design, involving a cycle of analysis, known as
reflection, and synthesis, known as engagement. This process is analogous to our iterative process of evaluation and
evolution, and ties in with the concept of unity between
content and form: during the reflection phase, when looking at an intermediate draft of the poem on paper, a poet
may come to realize the opportunities of surface features
that can be exploited, which enables further content to be
explored upon subsequent engagement phases.
2.2 Previous Attempts
Most previous attempts at poetry generation are “hobbyist experiments” that are available on the World Wide
Web, such as The Poetry Creator, ELUAR, and Pujangga,
the two exceptions in the published literature being RACTER and PROSE (Hartman, 1996). RACTER is also the only computer program with a published poetry anthology, “The Policeman’s Beard is Half Constructed”, which appeared in 1984.
All of these attempts were essentially “party trick”-type programs, in the mould of ELIZA (Weizenbaum,
1966). Typically, the generation process simply consisted
of randomly choosing words from a hand-crafted lexicon
to fill in the gaps provided by a template-based grammar.
However, several clever tricks and heuristics were employed on top of the randomness to give the appearance
of coherence and poeticness, such as: (1) assigning ad-hoc “emotional categories”, e.g. {ethereality, philosophy,
nature, love, dynamism} in ELUAR, and {romantic, patriotic, wacky, moderate} in Pujangga, (2) choosing lexical
items repetitively to give a false sense of coherence, e.g.
RACTER, (3) constructing highly elaborate sentence templates, often to the point that the resulting poetry would
have to be attributed more to the human writer than to the
program.
This is a representative output from ELUAR:
Sparkles of whiteness fly in my eyes,
The moan of stars swang branches of trees,
The heart of time sings in the snowy night.
Seconds of Eternity fly in grass,
The Clock of rain turns,
Death of the Apples,
The Equinox penetrates the words.
The two great deficiencies of these attempts were that they took no account whatsoever either of semantics (the systems were not trying to convey any message) or of poetic form, e.g. rhythm, rhyme, and figurative language.
2.3 What Makes It Difficult?
If we are to develop a poetry generation system which
overcomes the two deficiencies mentioned above, what
difficulties do we run into, particularly in comparison to
conventional NLG systems?
1. In conventional, informative NLG systems, the starting point is a given message, or communicative goal,
and the task is to produce a string of text that conveys that message according to the linguistic resources available. In poetry, however, there may
not be a well-defined message to be conveyed (see
section 2.1).
2. The generation process is commonly decomposed
into stages of content determination, text planning,
and surface realisation (Reiter, 1994). We claim
this approach is unsuitable for the task of poetry
generation because it introduces problems of architectural rigidness (cf. De Smedt et al. (1996)) which
are exacerbated by the unity of poetry, where interdependencies between semantics, syntax, and lexis
are at their strongest.
3. If our poetry generator is to create texts which satisfy the multitude of phonetic, syntactic and semantic
constraints, it must have a very rich supply of resources, namely: a wide coverage grammar which
allows for paraphrasing, a rich lexicon which supplies phonetic information, and a knowledge-base
if we hope to produce coherent poems.
4. One of the main difficulties lies in the objective
evaluation of the output text. The question of measuring text quality arises for existing NLG systems,
but is much more pronounced in evaluating poetry:
how does one objectively evaluate whether something is a poem or not?
When writing about Masterman’s haiku producer,
Boden (1990) states that readers of poetry are prepared to do considerable interpretative work, and
the more the audience is prepared to contribute in
responding to a work of art, the more chance there
is that a computer’s performance may be acknowledged as aesthetically valuable. Hence readers of
computer-generated text will be more tolerant in
their assessment of poetry than of prose.
This sounds encouraging for doing work in poetry
generation. However, this observation also implies
that it could be too easy to program the computer
production of poetry: precisely because poetry readers are prepared to do interpretative work, it would
be all too easy to pass off random word-salad output
as Truly Genuine Poetry (whatever that may be).
The first three points mentioned above are of a more
technical nature, while the last one is more conceptual,
perhaps even philosophical. Currently, we do not have
much to say on this last point, except that we hope to
adopt an objective and empirical evaluation methodology
similar to that of Binsted et al. (1997).
2.4 Limiting our Poetry
The main characteristics which we look for in our target generated poetry are highly regular occurrences of syntactic and phonetic patterns, such as metre, rhyme, and alliteration. These are easily identifiable, and one could say
we are adopting a “classic” view of poetry. Furthermore,
we will only offer a relatively simple and straightforward
treatment of semantics (see section 4.3) and of constructing the poem’s content. The verse in section 2.1 typifies
these attributes.
3 A Stochastic Hillclimbing Model
In an attempt to address the difficulties raised in Section 2.3, we propose to model poetry generation as an explicit search, where a state in the search space is a possible
text with all its underlying representation, and a “move”
in the space can occur at any level of representation, from
semantics all the way down to phonetics. This blurs the
conventional divisions of content determination, text planning, and surface realisation, and is actually readopting
what De Smedt et al. (1996) call an integrated architecture, which goes against recent developments in NLG, but
seems a necessary decision when considering poetry.
The problem, of course, is navigating the prohibitively
large search space. Our proposed solution is to employ
a stochastic hillclimbing search, not merely for its relatively efficient performance, but especially since the creative element of poetry generation seems perfectly suited
to a process with some element of randomness to it.
Our stochastic hillclimbing search model is an evolutionary algorithm, which is basically an iteration of two
phases, evaluation and evolution, applied to an ordered set
(the population) of candidate solutions (the individuals).
This approach is quite analogous to (Mellish et al.,
1998), an experiment in using stochastic search for text
planning, but in our research we extend it to the whole
NLG process.
3.1 Evaluation
Arguably the most crucial aspect of a stochastic search is
the evaluation scheme which lets the system know what
a desirable solution is. Below we present an informal description, not necessarily exhaustive, of the features that
our evaluation functions must look for in a poem. A description of the actual evaluators in our currently implemented system can be found in Section 4.5.
1. Phonetics: One of the most obvious things to look
for in a poem is the presence of a regular phonetic
form, e.g. rhyme, metre, alliteration. This information can be derived from a pronunciation dictionary.
One possible evaluation method is to specify a “target phonetic form” as input, i.e. the ideal phonetic form that a candidate solution should possess,
and to then score a candidate solution based on how
closely it matches the target form.
For example, we could provide the system with the
following target form (here w means a syllable with
weak stress, s a syllable with strong stress, and
(a) and (b) would determine the rhyme scheme,
e.g. aabba), which effectively means we are requesting it to generate a limerick:
w,s,w,w,s,w,w,s(a)
w,s,w,w,s,w,w,s(a)
w,s,w,w,s(b)
w,s,w,w,s(b)
w,s,w,w,s,w,w,s(a)
Alternatively, we could specify a set of these target
forms, thus feeding the system with knowledge of
existing poetry forms: the quintain, haiku, rondeau,
sestina, etc., and allow the evaluation function to
reward candidate solutions that are found gravitating closely towards one of those patterns. This is
a more flexible alternative, but would probably not
be as informed a heuristic, as the definition of the
goal becomes less focussed.
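To make the target-form notation concrete, the following Python sketch parses lines such as w,s,w,w,s(a) into a stress pattern plus a rhyme label. This is purely illustrative: the described system is implemented in Java, and the function name and data representation here are ours, not the authors'.

```python
def parse_target_line(line):
    """Split a target-form line spec into (stress_pattern, rhyme_label).
    The rhyme label is the letter in trailing parentheses, if any."""
    rhyme = None
    if line.endswith(")"):
        line, _, tail = line.partition("(")
        rhyme = tail.rstrip(")")
    stresses = [tok for tok in line.split(",") if tok]
    return stresses, rhyme

# The limerick target form given above:
limerick = [
    "w,s,w,w,s,w,w,s(a)",
    "w,s,w,w,s,w,w,s(a)",
    "w,s,w,w,s(b)",
    "w,s,w,w,s(b)",
    "w,s,w,w,s,w,w,s(a)",
]
spec = [parse_target_line(l) for l in limerick]
# The rhyme scheme falls out as the sequence of labels:
print([rhyme for _, rhyme in spec])  # ['a', 'a', 'b', 'b', 'a']
```

A set of such parsed specs is exactly the kind of input the more flexible, multi-form alternative would iterate over.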
2. Linguistics: Aside from phonetic patterns, there
are other, more subtle, features to look for in a poem:
lexical choice, where the evaluation could reward
the usage of interesting collocations and words marked
as “poetic”, syntax, where reward could be given
to usage of interesting syntactic constructs, e.g. inverse word and clause order, topicalization, and rhetoric, where evaluation would score the usage of figurative language constructs such as metonymy.
3. Semantics: Even more abstract would be a mechanism for evaluating the semantics of a certain candidate. Again, we could specify a “target semantics”
and score a candidate’s semantics relative to this
target. Unlike conventional NLG, though, this target semantics is not viewed as a message that must
be conveyed, but rather as a “pool of ideas”, from
which the system can draw inspiration. The system
could choose to convey more or less than the given
semantics (cf. approximate generation in Nicolov
(1998)).
Story generation issues such as narrative structure
and interestingness are beyond the scope of this research.
Having analysed the three points above, it seems that
to devise an evaluation function, the following three issues
must be tackled:
• How to identify the presence of a feature: with
the possible exception of figurative language, it is
reasonably straightforward to observe the features.
Most of them are represented directly in the data
structure, e.g. phonetic form, lexical choice, syntactic structure, semantic interpretation.
• How to quantify a feature: yielding a numerical
measure for the occurrence of a poetic feature sounds
like a very naive idea. Nonetheless, we believe that
it is the only way to mechanically and objectively
guide the stochastic search to producing poem-like
texts.
Above we have mentioned a score-relative-to-target
strategy for both phonetics and semantics. This
seems to be the most concrete method of evaluation, and is what we have chosen to implement in
our current system. Certain features, however, most
notably those considered to be preferences as opposed to constraints, do not lend themselves easily
towards this strategy. Furthermore, as mentioned
above, we would sometimes like the flexibility of
allowing the system to operate unguided by such a
specific target.
A naive alternative scoring method is to maintain
a tally of points for every occurrence of a feature
encountered in a text. This resembles a greedy algorithm search heuristic. For example: applied to
the feature of alliteration, if we scored positively for
each word that appeared in a line starting with the
same phoneme, the final output could become ridiculously riddled with redundant repetitions of rewordings. This might be good for generating tongue-twister-like sentences, but any literary critic would
baulk at these results. However, at the moment this
is how we implement evaluation of such features,
and although we do not intend to go deep into literary theory, we hope to develop a more sophisticated
approach.
For now our aim is to facilitate a modular approach
to the evaluation of features, so that each particular type of feature will have its own corresponding “evaluator function”. This will allow for more
sophisticated approaches and techniques to be easily added in the future.
Apart from a modular approach, we also aim to
parameterize the behaviour of these evaluation functions, e.g. allow a user to set the coefficients and
weighting factors that determine the calculation of
a certain score. A very interesting prospect is the
interfacing of these coefficients with empirical data
obtained from statistical literary analysis, or stylometry.
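The naive tally just described can be sketched in a few lines of Python. This is our own illustrative approximation: a real evaluator would use initial phonemes from the pronunciation dictionary rather than initial letters.

```python
from collections import Counter

def alliteration_tally(line):
    """Naive tally scoring: one point per word sharing the line's most
    common initial letter (a stand-in for the initial phoneme)."""
    initials = [w[0].lower() for w in line.split() if w]
    if not initials:
        return 0
    _, count = Counter(initials).most_common(1)[0]
    return count if count > 1 else 0

# Rewards exactly the pathological case the text warns about:
print(alliteration_tally("ridiculously riddled with redundant repetitions"))  # 4
print(alliteration_tally("the cat sat quietly"))  # 0
```

As the text notes, a hillclimber guided by such a tally will happily maximise it, which is why this scoring is flagged as a placeholder awaiting a more sophisticated approach.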
• Weighting across features: assuming we have obtained numerical scores for each of the features we
are considering, how do we combine them? As
in the previous point about parameterizing coefficients of a particular evaluator, we propose to treat
the weighting across features in a similar fashion.
This parameterization could possibly allow a choice
between, say, a preference for rigidly structured poetry and a preference for a more contemporary content-driven poem.
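Such a parameterized weighting across features might look like the following Python sketch; the feature names and coefficients are purely illustrative, not values from the actual system.

```python
def combined_score(feature_scores, weights):
    """Weighted combination of per-feature evaluator scores (all in [0,1]),
    normalised by the total weight so the result stays in [0,1]."""
    total_w = sum(weights.get(f, 0.0) for f in feature_scores)
    if total_w == 0:
        return 0.0
    return sum(weights.get(f, 0.0) * s for f, s in feature_scores.items()) / total_w

# A "rigidly structured" preference weights phonetic features heavily ...
strict = {"metre": 0.6, "rhyme": 0.3, "semantics": 0.1}
# ... while a content-driven preference inverts that.
content = {"metre": 0.1, "rhyme": 0.1, "semantics": 0.8}

scores = {"metre": 0.9, "rhyme": 0.4, "semantics": 0.2}
print(combined_score(scores, strict), combined_score(scores, content))
```

Interfacing with stylometric data would then amount to fitting these coefficient dictionaries empirically rather than setting them by hand.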
3.2 Evolution
After evaluating a set of candidate solutions and choosing
a subset of candidates with the best score, we must then
create new variations of them through “mutation”. This
process can be seen as applying a collection of operators
on the chosen candidates. We introduce here three conceptual types of operators, before describing our currently
implemented operators in Section 4.6:
• Add: “John walked”→“John walked to the store”
• Delete: “John likes Jill and Mary”→“John likes Jill”
• Change: “John walked”→“John lumbered”
Due to our integrated architecture, these mutations
may occur at different underlying levels of representation of the text. Because these different levels are all
interdependent, the operators must take special care to
preserve consistency when performing mutation. For example, if the addition of “to the store” is viewed mainly
as a syntactic addition of a prepositional phrase, the operator would have to update the semantics to reflect this,
for instance by adding destination(w,shop). In contrast, if it is viewed primarily as a semantic addition, the
operator would have to realize these semantics, one option being the use of a prepositional phrase. Our practice
of introducing semantics via a flexible “semantic pool”
addresses this issue (see Section 4.3).
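The consistency requirement can be illustrated with a deliberately simplified Python sketch. This is our own toy representation, not the authors' LTAG machinery: an individual pairs a surface string with its flat semantics, and an add operator must update both in one step so the levels never drift apart.

```python
def add_pp(individual, prep, noun, proposition):
    """Add a prepositional phrase to the surface form and the matching
    semantic literal in the same operation, preserving consistency."""
    individual["surface"] = individual["surface"] + f" {prep} the {noun}"
    individual["semantics"].append(proposition)
    return individual

ind = {"surface": "John walked", "semantics": ["john(x)", "walk(w,x)"]}
add_pp(ind, "to", "store", "destination(w,shop)")
print(ind["surface"])    # John walked to the store
print(ind["semantics"])  # ['john(x)', 'walk(w,x)', 'destination(w,shop)']
```

In the real system the "surface" side is a derivation-tree operation rather than string concatenation, but the invariant is the same: no level of representation is modified in isolation.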
Another issue is that these operators can perform nonmonotonic modifications on the structures of the candidate solutions, hence our grammar formalism must allow
for this.
As it is probably too optimistic to rely on pure random mutation to lead us to a decent poem, we would also like to introduce several heuristic-rich operators. These heuristics would be the encoding of “how to write poetry” guidelines, such as “use ‘little’ words to ‘pad’ sentences when trying to fit the metre”, and “don’t use words with few rhymes at the end of a line”. These ‘smarter’ operators, however, seem to go against stochastic search traditions wherein the operators are deliberately knowledge-poor, relying on the stochastic nature to lead us to the solution. Here, we are adding informedness of heuristics to the whole stochastic process, somewhat analogous to sampling bias in stochastic search.
4 Implementation
We are currently in the process of implementing our stochastic
search model in Java. In this section we will first briefly
discuss our choice of representation for grammar, lexicon
and semantics, before describing properties of the architecture and the current implementation of our evaluation
and evolution functions.
4.1 Grammar Formalism
Our choice of representation is Lexicalized Tree Adjoining Grammar, or LTAG. For reasons of space, however,
we will not explain this formalism in depth. See Joshi
and Schabes (1992) for details.
In Tree Adjoining Grammar, a derivation tree is a kind
of meta-level tree that records operations performed on
elementary trees. In particular, nodes of a derivation tree
do not signify phrase structure in any way, but rather
the process of adjoining and substitution of elementary
phrase structure trees. Nodes of a derivation tree are labelled by references to elementary trees, and edges are
labelled by the address at which the elementary tree of
the child node substitutes or adjoins into the elementary
tree of the parent (see Figure 1).
The root node of the derivation tree would introduce
the verb, and its two siblings would introduce the subject and object noun phrases. The edges would signify at
which NP node the child gets substituted into, effectively
informing which is the subject and which is the object.
The common way to deal with TAG trees is by repeatedly performing adjunction and substitution, while
having a derivation tree as a record-keeping ’roadmap’ of
how the tree is derived. However, since we can change
and delete portions of our text, the derived tree is problematic since there is no way to “un-adjoin” subtrees. We
must always refer back to the derivation tree. In effect,
there is no point in maintaining the derived tree throughout the generation process. Instead, the derivation tree
becomes our primary data structure, and everything else
can be derived from it on demand.
When our operators are said to perform adjunction
and / or substitution, they are simply recording the operation in the derivation tree, not actually performing it.
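As a rough Python sketch of this idea (our own simplification; the actual system is in Java and stores full LTAG elementary trees), the derivation tree can serve as the sole data structure, with everything else derived from it on demand, and "un-adjoining" reduced to deleting a node.

```python
class DerivationNode:
    """A derivation-tree node: an elementary-tree reference plus the flat
    semantic literals it introduces. Edges (children) are keyed by the
    address at which the child substitutes or adjoins."""
    def __init__(self, tree_name, semantics):
        self.tree_name = tree_name
        self.semantics = semantics   # literals introduced by this node
        self.children = {}           # address -> DerivationNode

    def attach(self, address, child):
        self.children[address] = child

    def all_semantics(self):
        """Derive the individual's semantics on demand by walking the tree."""
        sem = list(self.semantics)
        for child in self.children.values():
            sem.extend(child.all_semantics())
        return sem

root = DerivationNode("walked", ["walk(w,x)"])
root.attach("1", DerivationNode("John", ["john(x)"]))
root.attach("2", DerivationNode("to-campus", ["dest(w,c)", "campus(c)"]))
print(root.all_semantics())
```

Deleting the node at address "2" removes both the adjoined material and its semantics in one step, which is exactly the nonmonotonic modification the operators need.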
[Figure 1: Making the LTAG derivation tree our main data structure. The derivation tree records one node per lexicalized elementary tree (walk(w,x), john(x), dest(w,c) with campus(c), and fast(w)), with edges labelled by addresses such as I@1, A@2, and [email protected]; the corresponding derived tree realizes the surface form “John quickly walked to campus.” with semantics walk(w,x), john(x), dest(w,c), campus(c), fast(w).]

LTAG has the following advantages for our work:

• The adjunction operation in LTAG allows flexible incremental generation, such as subsequent insertion of modifiers to further refine the message. This is required as the system builds texts incrementally through an iterative process.

• LTAG provides an extended domain of locality which allows for predicate-argument structures and feature agreements over a structured span of text that need not be contiguous at the surface. This potentially allows for the coupling of poetic features such as rhyming across lines.

• LTAG provides an elegant mechanism for our nonmonotonicity requirement through the use of the derivation tree. It keeps all syntax and semantics locally integrated at each node, and allows nonmonotonic modification of content simply by deleting or replacing the corresponding node.

• We also adopt an extension to the formalism, Synchronous Tree Adjoining Grammar (STAG), which has proven useful for paraphrasing purposes (Dras, 1999).

4.2 Linguistic Resources

At the moment we are still using a very small hand-crafted grammar and lexicon. Like most TAG-based systems, the grammar is a collection of elementary trees, and the lexicon is a collection of words that specify which elementary trees they can anchor. The lexicon also provides phonetic information and lexical stress, which is extracted from the CMU Pronunciation Dictionary.

A typical lexical entry looks something like this:

Orthography: fried
Elementary Tree(s): ITV
Signature: F,Frier,Fried
Semantics: fry(F,Frier,Fried)
Phonetic Spelling: f,r,ay1,d

whereas a typical grammar entry looks something like this:

[Figure 2: Grammar entry ITV. The elementary tree is rooted in S[X], with substitution nodes NP[Y] and NP[Z] around the anchoring verb V[X][Y][Z].]

When binding a lexical entry to an elementary tree, argument structure is preserved by unifying the signature of the lexical entry with the signature of its preterminal node. In the above case: (X=F, Y=Frier, Z=Fried).

4.3 Semantics
For our semantics, we follow Stone and Doran (1997) in
using an ontologically promiscuous flat-semantics (Hobbs,
1985). The semantics of a given individual is simply the
conjunction of all the semantics introduced by each lexical item, given adjustments for unification of argument
structure.
Each individual is associated with a “semantic pool”,
which is simply a collection of propositions. Unlike the
communicative goals of a traditional NLG system, the
generator is under no commitment to realize these semantics. The relationship between an individual’s semantic
pool and its derivation tree’s semantics is very flexible. In
particular, there is no subsumption relationship either way
between them.
Furthermore, in the beginning (the initialisation phase) all semantic pools are initialised with a copy of the target semantics. This is ultimately what we hope our resulting
semantics. This is ultimately what we hope our resulting
poem to “be about”. But as time progresses, each individual in a population can evolve and mutate its own semantic pool. Therefore each individual not only differs in
form, but also in content.
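The copy semantics here matter: because there is no subsumption relationship, each individual's pool can diverge freely from the target. A minimal Python sketch (toy data and naming of our own):

```python
# Each individual's semantic pool starts as an independent copy of the
# target semantics, then evolves on its own.
target_semantics = ["john(1)", "mary(2)", "love(5,6,7)"]
population = [{"pool": list(target_semantics)} for _ in range(4)]

# A mutated individual may drift away from the target in content ...
population[0]["pool"].append("luke(9)")

# ... while the target itself, and the other individuals, are untouched.
print(population[0]["pool"])
print(target_semantics)
```

The appearance of Luke in Example 1 of Section 5 is exactly this kind of drift, introduced by the semantic explorer operator.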
4.4 Integrated, Incremental Generation
As mentioned above, unity of poetry demands that semantic, syntactic, and lexical information are available at
every step of decision making. This calls for an integrated
architecture, where there is no explicit decomposition of
the process. In our implementation, this is reflected by the
fact that individuals are complete structures which maintain all this information, and the semantic, syntactic and
lexical operation functions can be applied in any order.
Like most stochastic search algorithms, the process
starts with an initialisation phase. Provided with the input
of a target semantics and target phonetic form as mentioned in Section 3.1, it then creates a collection of individuals, each corresponding to a minimally complete utterance that more or less conveys the target semantics.
What follows is a process of incrementally modifying the utterance to satisfy the target phonetic form while
simultaneously attempting to maintain an approximation
of the target semantics, as follows:
• During the evaluation phase, a collection of separate evaluators will analyse each individual for its
surface form, phonetic information, and semantics,
and assign a score (see Section 4.5).
• After every individual has been scored, the set is
sorted. The higher ranked individuals spawn “children”, copies of themselves which are mutated during the next phase. These children replace the lower
ranked individuals, thus maintaining the set size.
• During the evolution phase, a collection of separate
operators will be applied randomly on the aforementioned children (see Section 4.6).
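The three phases above can be sketched as a small Python loop. This is a toy stand-in for the actual Java system: the individuals and operators here are deliberately trivial (a bare syllable count mutated by ±1), chosen only to show the evaluate/sort/spawn/mutate cycle.

```python
import random

def evolve(population, evaluate, mutate, generations, seed=0):
    """Score and sort the individuals, let the top half spawn mutated
    copies ("children") that replace the bottom half, and repeat."""
    rng = random.Random(seed)
    for _ in range(generations):
        population.sort(key=evaluate, reverse=True)
        half = len(population) // 2
        children = [mutate(dict(p), rng) for p in population[:half]]
        population[half:] = children        # set size stays constant
    return max(population, key=evaluate)

# Toy stand-ins (ours, not the authors'): an individual is just a syllable
# count, scored by closeness to a target of eight syllables.
def evaluate(ind):
    return -abs(ind["syllables"] - 8)

def mutate(child, rng):
    child["syllables"] += rng.choice([-1, 1])   # random add/delete
    return child

population = [{"syllables": n} for n in (1, 3, 5, 12)]
best = evolve(population, evaluate, mutate, generations=40)
print(best["syllables"])
```

Note that because the top-ranked parents survive each generation unchanged, the best score never regresses, which is what makes the hillclimbing label apt.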
Semantics: john(j), walk(w,j), sleep(s,j)
Surface: John walked. John slept.

Operator: semantic addition (destination of walk: store)

Semantics: john(j), walk(w,j), sleep(s,j), store(st), destination(w,st)
Surface: John walked to the store. John slept.

Operator: lexical choice (syno/hypo/hyper-nym: walk -> run, lumber, march, ...)

Semantics: john(j), walk(w,j), sleep(s,j), store(st), destination(w,st)
Surface: John lumbered to the store. John slept.

Operator: pronominalization (john -> he)

Semantics: john(j), walk(w,j), sleep(s,j), store(st), destination(w,st)
Surface: John lumbered to the store. He slept.

Operator: clause re-ordering (topicalization)

Semantics: john(j), walk(w,j), sleep(s,j), store(st), destination(w,st)
Surface: To the store John lumbered. He slept.

A long, long series of operations...

into the bookshop john did slowly lumber,
inside he fell into a peaceful slumber.

Figure 3: Idealized diagram of a stochastic search
4.5 Evaluators
Unfortunately, at the moment we have only implemented
an evaluator for rhythm: the metre evaluator. It works
by first dividing the stress pattern of a given utterance
into metrical feet of descending/falling rhythm. For instance, the line “There /once was a /man from Ma/dras” has a stress pattern of (w,s,w,w,s,w,w,s). This can
be divided into feet as (w),(s,w,w),(s,w,w),(s). In
other words, this line consists of a single upbeat (the weak
syllable before the first strong syllable), followed by 2
dactyls (a classical poetry unit consisting of a strong syllable followed by two weak ones), and ended with a strong
beat.
The evaluator compares the metrical configuration of
an individual with the target phonetic form by first comparing their number of feet, penalizing those that are either
too short or too long. Since each foot, with the exception
of the upbeat, contains exactly one strong syllable, this
effectively evaluates how close they match in number of
strong syllables. It then compares the number of weak
syllables between each corresponding foot, once again
penalizing the discrepancies, but the penalty coefficient
we impose here is less than that of the strong syllables.
This provides a natural formalization of the heuristic that
strong syllables dominate the effect of a line’s metre, and
a surplus or missing weak syllable here and there is quite
acceptable. For example, (2) sounds more like (1) than
(3) does:
(1) The /curfew /tolls the /knell of /parting /day
(2) The /curfew /tolls the /knell of the /parting /day
(3) The /curfew /tolls the /knell of /long /parting /day
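The foot division and penalty comparison can be sketched in Python as follows. The penalty coefficients are illustrative, not the authors' actual values, but the sketch reproduces the judgement above: line (2) is scored closer to line (1) than line (3) is.

```python
def to_feet(stresses):
    """Divide a stress pattern into an optional upbeat plus one foot per
    strong syllable, e.g. wswwswws -> (w)(sww)(sww)(s)."""
    feet, current = [], []
    for syl in stresses:
        if syl == "s" and current:   # a strong syllable opens a new foot
            feet.append(current)
            current = []
        current.append(syl)
    if current:
        feet.append(current)
    return feet

def metre_penalty(candidate, target, strong_pen=1.0, weak_pen=0.25):
    """Mismatched foot counts (i.e. strong syllables) are penalised heavily;
    per-foot differences in weak-syllable counts are penalised lightly."""
    cf, tf = to_feet(candidate), to_feet(target)
    penalty = strong_pen * abs(len(cf) - len(tf))
    for a, b in zip(cf, tf):
        penalty += weak_pen * abs(a.count("w") - b.count("w"))
    return penalty

line1 = list("wswswswsws")    # The curfew tolls the knell of parting day
line2 = list("wswswswwsws")   # ... the knell of the parting day
line3 = list("wswswswssws")   # ... the knell of long parting day
print(metre_penalty(line2, line1), metre_penalty(line3, line1))  # 0.25 1.5
```

Line (2) only adds a surplus weak syllable inside one foot, while line (3) adds a whole extra strong beat, so the strong-syllable penalty dominates exactly as the heuristic intends.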
4.6 Operators
We have currently implemented the following operators:
• Semantic explorer: this operator works with the semantic pool of an individual. Currently it just introduces random propositions into the pool, but with
the help of a knowledge-base, it could introduce
propositions that are conceptually related to what already exists in the pool.
• Semantic realizer: This operator is one of the most
important ones: it interfaces between the semantic
pool and the actual built structure. The semantic
realizer will randomly select a proposition from the
pool and attempt to realize it by:
– Selecting all lexical items that can convey the
proposition,
– For each lexical item, selecting all elementary
trees that can be anchored by it,
– For each elementary tree, selecting all nodes
in the derivation tree where it can be applied
(either adjoined or substituted),
– Building a list of all these possible nodes and
choosing one at random, and inserting the new
lexicalized elementary tree at that position.
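The enumeration the semantic realizer performs can be sketched as follows; the lexicon, tree names, and node addresses are toy data of our own, standing in for the real LTAG resources.

```python
import random

# Toy resources: proposition -> words conveying it; word -> trees it anchors.
LEXICON = {"walk": ["walked", "strolled"], "fast": ["quickly"]}
TREES = {"walked": ["IV"], "strolled": ["IV"], "quickly": ["Adv-VP"]}

def realizer_options(proposition, open_nodes):
    """Enumerate every (lexical item, elementary tree, attachment node)
    triple that could realize the chosen proposition."""
    options = []
    for word in LEXICON.get(proposition, []):   # items conveying it
        for tree in TREES[word]:                # trees each can anchor
            for node in open_nodes:             # derivation-tree positions
                options.append((word, tree, node))
    return options

rng = random.Random(0)
options = realizer_options("walk", ["S.1", "S.2"])
print(len(options))           # 2 words x 1 tree x 2 nodes = 4
choice = rng.choice(options)  # the new lexicalized tree is inserted here
```

Choosing uniformly from the flattened option list is what makes the realizer a stochastic rather than a deterministic component.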
• Syntactic paraphraser: This operator works by randomly selecting an elementary tree in an individual’s
derivation tree and trying to apply a suitable paraphrase pair in the manner of Dras (1999). Since all
adjunction and substitution information is kept relative to a node's parent in the derivation tree, adjusting for paraphrases (i.e. changing an elementary tree at a certain derivation tree node) is a simple
matter of replacing the elementary tree and updating the addressing of the children.
For example, if paraphrasing a sentence from active
to passive form, this would involve exchanging the
“Active Transitive Verb” elementary tree at the root
to “Passive Transitive Verb”, and updating the substitution addresses of the subject and object noun
phrases so that the subject now moves to the end of
the verb and the object moves to the front.
5 Examples and Discussion
While our system is still in a very early stage of implementation, particularly in terms of evaluators, operators,
and linguistic resources, we already have some sample
output to present.
For comparison, we first describe our previous attempt
at implementing poetry generation, reported in Manurung
(1999). This was not a stochastic search model but exhaustively produced all possible paraphrases using chart
generation, while simultaneously pruning portions of the
search space which were deemed ill-formed from an early
stage. It also worked with the target specification of both
phonetic form and semantics.
As an example, given the target semantics {cat(c),
dead(c), bread(b), gone(b), eat(e,c,b), past(e)} and the target form shown in Section 3.1, but disregarding the rhyme
scheme, the chart generator could produce, among others,
the following “limerick”:
the cat is the cat which is dead;
the bread which is gone is the bread;
the cat which consumed
the bread is the cat
which gobbled the bread which is gone
Since this system does not have the equivalent of a semantic explorer operation (Section 4.6), the output semantics is always subsumed by the target semantics. Moreover, because the chart generator rules out ill-formed subconstituents during the bottom-up construction, the output form always matches the target form exactly. There are no partial solutions or imperfect poems.
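The subsumption property can be illustrated with a minimal sketch, modelling a semantic expression as a set of predicate tuples; this encoding is an assumption made for illustration, not the system’s actual representation.

```python
def subsumed(output_sem, target_sem):
    """True iff every output predicate also occurs in the target."""
    return output_sem <= target_sem  # set inclusion

# Target semantics of the limerick example, as predicate tuples.
target = {("cat", "c"), ("dead", "c"), ("bread", "b"),
          ("gone", "b"), ("eat", "e", "c", "b"), ("past", "e")}
# A chart-generated output can only repeat target predicates ...
output = {("cat", "c"), ("dead", "c"), ("bread", "b"), ("gone", "b")}
# ... so subsumption always holds, whereas the converse need not.
print(subsumed(output, target))  # → True
```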
In Example 1 below of our stochastic model, “Luke” appears despite not being mentioned in the target semantics. This is due to the semantic explorer operator, which randomly introduces new predicates into the semantic pool. By and large, though, the output does approximate the target semantics. Unfortunately, predicate-argument structure and agreement are currently not considered, and this limits the treatment of semantics to a rather trivial account.
The target metre is not precisely followed, but the resulting form is arguably comparable with what a human
might produce given the same task.
The resulting score is obtained from the metre evaluator (Section 4.5), out of a maximum of 1.0.
Example 1:
Input
Target semantics:
{john(1), mary(2), dog(3), bottle(4),
love(5,6,7), slow(8), smile(9,10)}
Output (Score: 0.893)
Surface:
the bottle was loved by Luke
a bottle was loved by a dog
Target form:
w,s,w,w,s,w,w,s,
w,s,w,w,s,w,w,s
Stress:
w,s,w,w,s,w,s,
w,s,w,w,s,w,w,s
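One plausible way to score such an example is to take one minus the normalised edit distance between the produced stress pattern and the target form. This is only a sketch under that assumption; the actual evaluator of Section 4.5 may weight operations differently, so the sketch need not reproduce the 0.893 reported above.

```python
def edit_distance(a, b):
    """Standard dynamic-programming Levenshtein distance."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,            # deletion
                          d[i][j - 1] + 1,            # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a)][len(b)]

def metre_score(stress, target):
    """1.0 for a perfect match, decreasing with each mismatch."""
    return 1.0 - edit_distance(stress, target) / max(len(target), 1)

# Example 1: target form (16 syllables) vs. produced stress (15).
target = list("wswwswws" "wswwswws")
stress = list("wswwsws"  "wswwswws")   # one weak syllable short
print(metre_score(stress, target))     # → 0.9375
```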
In Example 2, given no semantic input at all, the system
will still produce output, but since the semantic explorer
operator is still purely random, it resembles word salad.
Furthermore, the second sentence is the ungrammatical
“ran”. This is because this generation effort was terminated prematurely by manual intervention, and at the last
iteration before termination the semantic realizer chose to
introduce the verb “ran”, leaving its subject empty.
Example 2:
Input
Target semantics:
none
Output (Score: 0.845)
Surface:
a warm distinctive season humble mellow smiled refreshingly slowly. ran.
Target form:
w,s,w,s,w,s,w,s,w,s,
w,s,w,s,w,s,w,s,w,s
Stress:
w,s,w,s,w,s,w,s,w,s,
w,s,w,s,w,w,s,w,s
Although these results can hardly be called poems, they nevertheless succeed in showing how the stochastic hillclimbing search model manages to produce text that satisfies the given constraints, something very difficult for a random word-salad generator to achieve.
6 Related Work
The work reported in this paper is similar in some sense to, among others: NITROGEN (Langkilde and Knight, 1998), a generator that employs a generate-and-test model, using a knowledge-poor symbolic generator to produce candidate solutions and ranking them based on corpus-based information; and SPUD (Stone and Doran, 1996), a TAG-based generator that exploits opportunities arising between syntax and semantics, allowing generation of collocations and idiomatic constructs.
7 Conclusion
Poetry generation differs from traditional informative generation due to poetry’s unity, which essentially means the satisfaction of interdependent constraints on semantics, syntax, and lexis. Despite our implementation being at a very early stage, the sample output succeeds in showing how the stochastic hillclimbing search model manages to produce text that satisfies these constraints.
Acknowledgements
The first author of this paper is being supported for his
postgraduate studies by the World Bank QUE Project, Faculty of Computer Science, Universitas Indonesia. We would also like to thank the anonymous reviewers for their comments on this paper.
References
Kim Binsted, Helen Pain, and Graeme Ritchie. Children’s
evaluation of computer-generated punning riddles.
Pragmatics and Cognition, 5(2):309–358, 1997.
Margaret A. Boden. The Creative Mind: Myths & Mechanisms. Weidenfeld and Nicolson, London, 1990.
Marjorie Boulton. The Anatomy of Poetry. Routledge and
Kegan Paul, London, 1982.
Audrey Daly. Animal Poems. Ladybird Books, Loughborough, 1984.
Koenraad De Smedt, Helmut Horacek, and Michael Zock.
Architectures for natural language generation: Problems and perspectives. In Giovanni Adorni and Michael Zock, editors, Trends in Natural Language Generation: An Artificial Intelligence Perspective, number 1036 in Springer Lecture Notes in Artificial Intelligence, pages 17–46. Springer-Verlag, Berlin, 1996.
Mark Dras. Tree Adjoining Grammar and the Reluctant
Paraphrasing of Text. PhD thesis, Macquarie University, Australia, 1999.
Charles O. Hartman. Virtual Muse: Experiments in Computer Poetry. Wesleyan University Press, 1996.
Jerry Hobbs. Ontological promiscuity. In Proceedings of
the 23rd Annual Meeting of the Association for Computational Linguistics, pages 61–69, Chicago, Illinois,
1985. The Association for Computational Linguistics.
Aravind K. Joshi and Yves Schabes. Tree adjoining grammars and lexicalized grammars. In Maurice Nivat and
Andreas Podelski, editors, Tree Automata and Languages. Elsevier Science, 1992.
Irene Langkilde and Kevin Knight. The practical value
of n-grams in generation. In Proceedings of the Ninth
International Workshop on Natural Language Generation, Niagara-on-the-Lake, Ontario, 1998.
Samuel R. Levin. Linguistic Structures in Poetry. Number 23 in Janua Linguarum. ’s-Gravenhage, 1962.
Hisar Maruli Manurung. A chart generator for rhythm
patterned text. In Proceedings of the First International
Workshop on Literature in Cognition and Computer,
Tokyo, 1999.
Chris Mellish, Alistair Knott, Jon Oberlander, and Mick
O’Donnell. Experiments using stochastic search for
text planning. In Proceedings of the Ninth International
Workshop on Natural Language Generation, Niagara-on-the-Lake, Ontario, 1998.
Nicolas Nicolov. Approximate Text Generation from Non-Hierarchical Representations in a Declarative Framework. PhD thesis, Department of Artificial Intelligence,
University of Edinburgh, 1998.
Ehud Reiter. Has a consensus on NL generation appeared? and is it psycholinguistically plausible? In
Proceedings of the Seventh International Natural Language Generation Workshop, pages 163–170, Kennebunkport, Maine, 1994. Springer-Verlag.
Mike Sharples. An account of writing as creative design.
In Michael Levy and Sarah Ransdell, editors, The Science of Writing: Theories, Methods, Individual Differences and Applications. Lawrence Erlbaum, 1996.
Matthew Stone and Christine Doran. Paying heed to collocations. In Proceedings of the Eighth International
Workshop on Natural Language Generation, pages 91–
100, Brighton, 1996.
Matthew Stone and Christine Doran. Sentence planning
as description using tree adjoining grammar. In Proceedings of the 35th Annual Meeting of the Association
for Computational Linguistics, pages 198–205, Madrid, Spain, 1997. The Association for Computational
Linguistics.
Joseph Weizenbaum. Eliza - a computer program for the
study of natural language communication between man
and machine. Communications of the ACM, 9(1):36–
45, 1966.