CONTEXTS IN CONTEXT
Patrick J. Hayes
IHMC, University of West Florida
INTRODUCTION
The purpose of this note is lexicographic
rather than theoretical: to tease apart some
of the meanings of the word 'context' as it is
used in the technical literature. I will
distinguish four different kinds of context
and the presuppositions that lie behind
them. I hope to clarify what I believe to be
a difference between two intellectual
traditions, each of which brings a different
collection of unspoken assumptions.
Speaking about unspoken assumptions is a
tricky business. Readers who already
possess one set of assumptions may find the
business of explicating them tedious and
elementary, while those who do not may
find it ludicrous or incomprehensible. Some
readers may interpret everything one writes
in only one sense, thus rendering the
discussion unintelligible. Also, a proper
survey of these issues is beyond the scope of
this short paper, so my accounts will be
somewhat simplified and may seem like
caricatures. If any of the following seems
either pathetically obvious or outrageously
incorrect, please consider the possibility
that others may view things differently. I
do not mean here to advocate any of these
various opinions or methodologies in
preference to the others, but merely to try to
clarify these potentially misleading
differences. The lexicographer's task is
simply to report, not to judge.
TWO APPROACHES TO LANGUAGE
Most of the literature on language agrees in
broad terms with the following general
picture. People use language to communicate,
and have some kind of internal
representation of what it is that they mean
to communicate and understand others to be
communicating to them. The external
language is used for communication, while
the internal representation encodes beliefs
and supports mental processing. Language
comprehension and production establish
connections between the external language
and the internal representational code.
Unlike external language, the internal code
cannot be publicly observed, so there are
many ideas about what its structure might
be. Almost the entire literature of
psycholinguistics and NL work in AI is
concerned one way or another with
hypotheses about the internal code. Fodor
argues that this ‘language of thought’ must
have a productive syntax, but Johnson-Laird
and others think of it as consisting of
‘mental models’ which are more similar to
diagrammatic images. It might consist of
many rather isolated systems (spatial,
visual, etc.), or may all be representable in a
single logical notation, as AI often assumes.[1]
To avoid taking sides on any of these
disputes, I will here use the deliberately
artificial terms ‘EL’ to refer to the external
language, i.e. the normal subject-matter of
linguistics, and ‘IC’ to refer to the internal
code, i.e. the internal mental representation
of the result of the language comprehension
process.
The first intellectual tradition, that of
semantic linguistics, focuses on EL. Following
the early work of Montague, its ambition is
to provide a semantics for EL in the form of a
model theory defined directly on the syntax
of EL sentences provided by linguistic grammars. This semantics should fit naturally onto the syntactic structure assigned to sentences of EL by a grammatical theory. The linguistic syntax is central to this tradition; for example, any systematic difficulty in making a semantics properly conform to syntactic structure is considered to be a criticism of the grammatical theory. On this view, the challenge provided by the contextual sensitivity of EL is to elegantly incorporate extra structure (representing the contextually important information) into the semantic theory, so that the meaning of larger syntactic forms can still be regarded as somehow formed from those of their syntactic parts plus this extra structure, while conforming to the constraints of the currently accepted grammatical theory.

[1] More exactly, a logical notation is used to represent the content of the IC. From an AI perspective, this is the minimal hypothesis (at the 'epistemological level', McCarthy and Hayes 1969), and further details of representation types, modularization, etc., represent further hypotheses about how the mental content might be organized.
Much of the modern work in linguistic semantics is concerned to identify suitable structural devices that preserve compositionality in this way, such as Kamp's 'discourse representation structures' (1990) and ter Meulen's 'chronoscopes' (1995).[2]
Since all native users of language have reliable intuitions about correct and incorrect uses of NL sentences in various settings, these proposed theoretical accounts of contextual meaning can be tested against linguistic intuitions in much the way that judgments of syntactic naturalness are used to test proposed grammatical theories, fitting smoothly into the usual linguistic methodology. While Montague's original vision of a model theory for English now seems somewhat simplified, the central concern of this area is still a referential semantic theory which connects EL to the world it describes.
There are many differences within this
large field of study, but all work in this
tradition shares some common themes and
attitudes. It has a central concern with the syntax of EL and uses grammatical criteria to isolate particular research topics, and it is only peripherally concerned with language use. It has a methodological aversion to psychological theorizing or any concern with details of hypothesized internal codes or processes.[3] The inferences whose validity the semantics describes are thought of here as relationships between EL sentences. All of this is in marked contrast to the other approach to language, which might be called the psycho-computational tradition, and which is primarily concerned with the internal representation.

[2] It may be worth emphasizing, for readers in the AI tradition, that such structures are not thought of as data structures to be used in a computational process of comprehension.
This more cognitive perspective treats an EL utterance not as having a content, but as producing a content in the mind of the hearer. The meaning, here, is thought of as something to be extracted from the utterance of the sentence in a context. This tradition concentrates on the processes by which external language is understood and the end product of those processes, so that EL syntax is thought of as playing a role in the facilitation of comprehension rather than reflecting an objective structure. This tradition is often also concerned with the role of IC in other, nonlinguistic, aspects of cognition. Inference is considered to be a relation between propositional structures in IC, so the proper place for a semantics, on this view, is to provide an account of meaning for the end product of the process of linguistic comprehension, i.e. the internal code, rather than the external communication language.[4] AI work on knowledge representation, for example, has a central concern with the semantics of KR formalisms. In this tradition there is no particular methodological requirement to focus on the grammar of EL, although it is often of considerable practical importance.[5]

[3] For example, Kamp (1990, p. 32): "At present there isn't much that we can say with certainty about the way in which the human mind represents and processes information ... there is little hope that this situation will significantly improve in the foreseeable future... So theorizing about these matters...is something one had better stay away from. ... A theory of attitude reports ought to be independent of any specific assumptions about the organization of mental states and the mechanisms which transform them."

[4] On this view, the grammatical structure of EL might emerge simply as a by-product of the processes of language production and comprehension.
There is a further difference between the
two traditions in what might be called
theoretical attitude. Semantic linguistics
seeks a theory of the structure of language
which is as universal and simple as
possible. This leads to a preference for
sparse contextual structures containing just
sufficient structure to work, and for
subsuming as many linguistic phenomena as
possible into a common framework. The
methodological pressures of computational
modeling lead in rather different directions,
since the process of comprehension may
involve information from many sources, and
use this information in ways that depend in
part on the source. These distinctions, and
the complexities they introduce, can seem
like unnecessary clutter to someone working
in the first tradition.
CODES AREN’T CONTEXTUAL
The grammatical form of an EL utterance
directly encodes only what linguistics calls
the ‘character’, i.e. that part of the meaning
that can be understood from the sentence in
isolation. The character of the sentence “He
saw him” is, roughly, that two male
creatures exist and one saw the other at some
time in the past. A full comprehension of
that utterance in a communicative context
would use the context to determine the
intended referents of ‘he’ and ‘him’,
resulting in a content which is better expressed as a ground assertion with no quantifiers, explicitly naming the people being described. This decontextualized content is what IC must be able to encode; it is supposed to represent the final output of the comprehension process, where the full content is represented and stored for later use. The task of the IC is to represent this full content, and to provide a vehicle for connecting it to other mental processes. This means that while the process of comprehension uses contextual clues to discover the full content of the message, the IC itself cannot rely on contextual clues to provide meaning in the way that our external, communicative language does, precisely because its role is to encode the content which results from deciphering those clues, and to preserve this content and make it available to subsequent mental operations (in AI, typically thought of as inferences) after the relevant context is no longer available. The IC should be able to represent information which is completely decontextual in the sense in which linguistic semantics uses 'contextual'. It cannot use indexical or anaphoric devices, since it must be capable of representing the result of resolving indexicals and anaphora; it should have no lexical ambiguity, since its role is to express the result of lexical disambiguation (Woods 1985); and so on for any contextually sensitive aspect of EL meanings. This has been the standard ambition and usual assumption in the second tradition (although full lexical disambiguation has been questioned, as discussed below).

[5] This sketch of these rival traditions deliberately emphasizes their differences rather than their similarities, because the differences are the chief barriers to communication; but there are many points of contact between them, especially in more recent work. Linguistic semantics, like linguistics in general, is often motivated by a perceived psychological relevance of the structures it hypothesizes (but rarely accepts the experimental discipline required in psychology or the attention to implementation detail necessary in AI), and computational psycholinguistics sometimes feels itself to have relevance for grammatical theories in linguistics (but rarely pays attention to the syntactic details that interest linguists).
This point sometimes seems elementary and
obvious to those working in this tradition,
but incomprehensible or ridiculous to
linguistic semanticians. To illustrate it,
consider the consequences of allowing the IC
to contain indexicals such as 'IC-now'. If the comprehension process encodes the meaning of the present as 'IC-now', then the content of a sentence meaning would also be indexical: the meaning of the EL sentence "It's raining," spoken and understood on Monday as referring to the time of utterance, would be represented using 'IC-now', and would therefore mean that it is raining on Tuesday when accessed on the following day. Clearly, part of the comprehension process for such indexicals involves using the context of the utterance to find their non-indexical meaning and then encoding that decontextualized meaning in the internal representational code.[6]
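The point can be made concrete with a small sketch (my own, in Python; every name in it is invented for illustration). Comprehension must resolve 'now' against the utterance context before anything is stored:

    import datetime

    memory = []   # stored IC contents

    def comprehend(el_sentence, utterance_time):
        # Resolve the indexical against the context of utterance.
        # Storing ('raining', 'IC-now') instead would change the fact's
        # meaning every time the memory was later accessed.
        if el_sentence == "It's raining.":
            memory.append(('raining', utterance_time))

    def recall(fact, access_time):
        # The stored fact still refers to the time of speaking,
        # however late it is accessed.
        predicate, t = fact
        return f"{predicate} on {t}, recalled on {access_time}"

    comprehend("It's raining.", datetime.date(1997, 11, 3))   # a Monday
    print(recall(memory[0], datetime.date(1997, 11, 4)))      # Tuesday: still Monday's rain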
Some care is needed here. While the result of comprehension must be fully decontextualized, the process of language comprehension itself might make use of internal codes which are contextually sensitive.[7] This said, however, the differences between the two traditions are often evident. In the linguistic tradition much recent work has been concerned with the development of under-specified logics intended to encode partial utterance meaning. It is easy to find examples where this lack of contextual precision extends well beyond a single sentence. ("He saw him yesterday." "Who?" "Bob." "Oh hell, what can we do?" "No, Harry saw Bob. Bob didn't see him, thank God.") These pose particularly acute difficulties for a semantic theory which must define meanings attached to a syntax whose largest unit is the sentence.[8] In AI work, the results of
accumulating (and revising) information about the intended meaning are more naturally modeled simply by building progressively more detailed hypothetical meanings in an IC which need have no special connection to the EL syntax. For example, the pre-contextual meaning of "He saw him" might be represented (oversimplifying somewhat) as (exists x y)( (Male x) & (Male y) & (See x y) ), thereby encoding the character of the EL sentence as the content of a fragment of IC; but this content needs no special logics for its expression.

[6] This decontextualized meaning may not be a calendar date, but it must be something which, when accessed later, will still refer to the time of speaking rather than the time of access.

[7] I am grateful to an anonymous reviewer of an earlier draft for this observation.

[8] A methodological concern with the sentence as a basic unit, and hence with the importance of context effects which transcend sentence boundaries, is characteristic of the linguistic tradition.
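The refinement process itself can be sketched in the same spirit (again my own illustration; the list-structured notation and the resolve function are inventions, not an implemented system):

    # Character of "He saw him": two males, one saw the other.
    meaning = ['exists', ['x', 'y'],
               ['and', ['Male', 'x'], ['Male', 'y'], ['See', 'x', 'y']]]

    def resolve(meaning, bindings):
        # Substitute discovered referents for quantified variables;
        # fully bound variables drop out of the quantifier prefix.
        _, variables, body = meaning
        def subst(term):
            if isinstance(term, list):
                return [subst(t) for t in term]
            return bindings.get(term, term)
        remaining = [v for v in variables if v not in bindings]
        new_body = subst(body)
        return new_body if not remaining else ['exists', remaining, new_body]

    print(resolve(meaning, {'x': 'Harry'}))                # partially resolved
    print(resolve(meaning, {'x': 'Harry', 'y': 'Bob'}))    # ground content

Each successive dialogue move simply supplies more bindings; no under-specified logic is needed to state the intermediate meanings.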
PHYSICAL AND LINGUISTIC CONTEXTS
Consider two people in a garden, where one
says: “Look at those roses. Aren't they
beautiful?" Clearly, 'they' refers to the roses mentioned in the first sentence, and these roses are, in some sense, part of the context which determines the content of the second sentence. However, there are two ways to understand what this means. In one, the context is taken to be the actual physical surroundings of the conversants: the garden itself (or perhaps part of it). Call this a physical context. In the other, the context is provided by the words in the first sentence; we might call this the linguistic context. If we consider the two sentences as a narrative, 'they' is anaphoric; but if we think of the physical context, 'they' can be regarded as an indexical: it would be just as appropriate, and convey exactly the same meaning, for the speaker to indicate the roses by a gesture or gaze and say: "Aren't they beautiful?", without any linguistic introduction of the topic of flowers.[9]
This contrast between physical and topic contexts is often blurred, or deliberately ignored, in computational linguistics, which uses the term 'common ground' (Clark & Carlson 1981) of a conversation to mean the objects and topics which have been somehow introduced into the range of attention of the participants, plus their mutual beliefs about these objects. Things can be in the common ground either by having been previously mentioned or by being part of the physical surroundings, or indeed by any other means. Linguistic semantics has no need to distinguish these: they play the same role in the subsequent interpretation of sentences by providing a set of possible referents for pronouns, and the semantic constraints arising from shared beliefs apply in either case. One theory neatly subsumes both, so they can be identified.

[9] Readers with the grammatical sensitivity of many linguists may regard this as ungrammatical, preferring 'Aren't those beautiful?' precisely because it has a lexical indicator of the indexical. Nevertheless, the first sentence would be comprehensible in that contextual setting, which is what the second tradition is most interested in.
However, from the cognitive perspective there is an important difference between physical and linguistic contexts. The former, unlike the latter, are liable to change as time passes for nonlinguistic reasons, and once in the past they cannot be recovered. A conversation can return to an earlier topic, using a stack of common ground representations (or perhaps a single common ground itself having a stack-like organization) which can be stored in memory; but when people are speaking about some ongoing process in their physical surroundings, the past cannot be resuscitated for further indexical reference, so the distinction between the anaphoric and indexical usages illustrated above may be crucial. Consider for example the question, "Do you see that?", with no preliminary introduction. The content of the question can be successfully determined only if the referent is visible in the physical environment. If it was a shooting star, the hearer who fails to see it at the time has no way to compute the meaning later.[10] Even more extreme examples of the difference are provided by short command phrases such as "Look out!" or "Stop!", which have only a trivial linguistic structure and can play no role in a narrative, but may convey vitally important information about the physical context.
(It may be objected that whether or not something is physically present, it can only be thought of as part of the mutual context if it has been somehow introduced into the common ground,[11] but this seems not to be true for physical contexts. While narratives and third-person accounts of conversations usually take care to first introduce a topic and subsequently refer to it, EL is often used in a natural conversation to comment on something in the physical context which has not been discussed previously, relying on the listener's ability to discover the intended referent from clues in the immediate environment during the comprehension process itself. As well as warning shouts, this is illustrated by examples like: "Have you noticed the Chinese vases over there?", where part of the intended meaning is precisely to direct the hearer's attention to a part of the physical context which is not yet in the common ground.)

[10] Subsequent conversation may introduce the topic, but the information available to the hearer from being subsequently told that a shooting star had been present is quite different from that obtained from seeing it, and the comprehension process at the time of hearing the question is quite different.

[11] I am grateful to a reviewer for making this suggestion, even though it is wrong.
From the AI perspective, the key difference
between physical and linguistic contexts is
that the information in them must be
accessed by different mechanisms, which
must be sensitive to the different natural
structures the two kinds of context have.
Linguistic contexts can be represented in
memory, stored and accessed later,
corresponding to changes in conversational
topic. These often seem to obey a stack-like
organization, where it is natural to return to
the previous topic, whatever it was. In
contrast, the meaning of many indexicals can be determined only by perception, and requires close attention to the fleeting nature of physical situations, ordered by the relentless passage of time; but it does not require any internal record of what has been said previously. Consider a transcription of a conversation between two people walking in a garden, or a narrative description of it. To follow this, the reader needs to construct a representation of what is going on in the situation being described and modify this representation as the narrative proceeds, to fix the referents of the words spoken by the conversants. However, they themselves needed no such representation, since they could see the garden itself: they were already in the ongoing dynamic situation which the reader must somehow model while reading their words. Different plants came into their view simply as a result of their walking. This is particularly clear in a third-person narrative, which can contain explicit information given to the reader by the author: "'Aren't they lovely?', said Fanny as she looked happily at the roses". A conversation has no author, so such information about Fanny's gazing and state of happiness could be obtained only by looking at Fanny, if one were actually listening to her (or by being Fanny). One might characterize the physical context of a conversation as that part of the common ground in which the conversants are situated, and the linguistic context as the part that is situated in them.
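The two access mechanisms can be caricatured in code (a sketch of my own; all names are invented): a linguistic context is a stored stack of topics, while a physical context can only be sampled at the moment of utterance.

    class LinguisticContext:
        # Topics are stored: push a new one, pop back to the previous one.
        def __init__(self):
            self.stack = []
        def push_topic(self, referents):
            self.stack.append(referents)
        def pop_topic(self):
            return self.stack.pop()
        def current(self):
            return self.stack[-1]

    class PhysicalContext:
        # Referents come from perceiving the scene *now*; there is no
        # record, and a past scene cannot be re-perceived.
        def __init__(self, scene):
            self.scene = scene
        def perceive(self):
            return set(self.scene)

    talk = LinguisticContext()
    talk.push_topic({'the roses'})            # "Look at those roses."
    garden = PhysicalContext({'rose-bed', 'path'})
    print(talk.current(), garden.perceive())
    garden.scene = {'fountain', 'path'}       # the walk continues; the roses are gone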
This distinction between physical contexts,
which have a linear temporal ordering and
provide referents for indexical terms, and
linguistic contexts, which have a stack-like
nested structure and provide referents for
anaphoric expressions, is familiar (see for
example the distinction between 'deep' and
'surface' anaphora in Hankamer & Sag
1976). It is less widely appreciated, however, that in one intellectual tradition it is natural to introduce a general category which subsumes them both, while in the other it is natural, and indeed may be essential, to keep them separate.
CONCEPTUAL CONTEXTS
In both cases so far we are talking of a
context for an utterance or a sentence. A
different sense of context means, roughly,
the set of background assumptions needed to
fix the meaning of an ambiguous word or
concept. The different senses of ‘bank’ in
English, for example, include the side of a
river, an institution for safely storing cash,
and a tilting action of an aircraft. There is much experimental evidence that when we hear a sentence, the normal process of comprehension considers all these possibilities and selects the appropriate one in about half a second, presumably somehow using information from the linguistic and physical contexts.
However, it is not exactly clear what constitutes a single 'sense' of a word in EL. In the linguistic tradition, all that is necessary is that the context-structures involved in the semantic theory provide enough information to support any linguistically relevant distinctions, most notably to keep track of pronoun bindings and to distinguish between different syntactic classes. However, if one wishes to encode enough of the content of the meaning of a word in the IC to support a reasonably rich level of subsequent cognitive inference (that is, if the IC is going to be of any actual subsequent use) then things get
more difficult. Attempts to formalize such
multiple senses often reveal a greater degree
of polysemy than is suggested by ordinary
lexical classification. For example, even if
we restrict ourselves to financial banks, a
bank may be a legal institution, a
corporation, a legal agent, a building, a
collection of buildings, the interior of a
building, or even such things as an
architectural style. A building may be a
physical object which encloses a large space,
or it may be that space itself. As one tries to
capture these meanings by writing ‘axioms’
which would be true of them, the meanings
tend to separate into finer and finer shades.
Is the fitted carpet in the foyer of a bank building inside it or part of it? Is a room in a building a part of the building, or is it an inhabitable space enclosed by the building? It has proven very difficult to give convincing logical theories which adequately capture word meanings with sufficient precision to support useful subsequent inferences: there seems to be an "explosion of meaning" (Guha 1990) which taxes our representational abilities. It also seems to be beyond the resolving ability of most linguistic accounts of meaning, which generally assume a much coarser level of lexical ambiguity than that which AI seems to require. Most linguistic semantics would give 'bank' three or four meanings rather than thirty or forty, especially when the distinctions between the meanings seem to have no linguistic significance. This explosion only becomes a problem for the processes that must use the IC to draw subsequent conclusions.
Such problems have led some to conclude that a deductive framework is fundamentally inappropriate as a way to express content; others, however (Guha 1990, Singh et al. 1995), have partially abandoned the goal of totally disambiguating every utterance, and developed contextual logics to give a coherent account of deduction in the presence of polysemy. Here, a context is thought of as a kind of semantic index to enable the same logical symbol to be used with several different, but related, meanings. Rather than having to distinguish bank-single-building, bank-building-interior, etc., these logics would instead allow a single term bank-building which can be reinterpreted in various 'contexts' to have different shades of meaning. The role of context here is fundamentally conceptual: to disambiguate the vocabulary sufficiently to enable useful inferences to be made, so I will call these conceptual contexts. Notice that conceptual contexts do not eliminate the fine distinctions, but have been proposed rather as a way to keep them properly organized. These contexts are part of the representational IC, not outside it helping to define its meaning.
Physical and topic contexts are found in
nature, but conceptual contexts are a formal
device. It is natural therefore to ask
whether the latter can be used to model or
describe the former. If so, the process of
meaning resolution in NL comprehension
might be performed by inference in a
suitable contextual logic (Buvac 1996a explores this possibility). One way to judge
the usefulness of this proposal is to ask
what the natural structure of conceptual
contexts must be. Just as the fact that we live
in time forces physical contexts into a linear
ordering, and our ability to return to a
previous topic requires linguistic contexts to
have a stack-like organization, we might
expect that conceptual contexts must fit into
some kind of overall structure in order to
provide conceptual distinctions. It is not so
easy to describe the intended structure of
conceptual contexts, but it seems unlikely to
be similar to those already considered.
Both physical and topic contexts have a fairly large scope, typically extending over several sentences. The 'common ground' is supposed to include potential referents for all pronouns in a conversation and all the shared beliefs which might be relevant to pronoun resolution, for example. Moreover, they are dynamic, changing as a conversation proceeds, and one of the roles of EL is to move topics in and out of the narrative context. None of this is true of conceptual contexts. These change, if at all, at a much slower rate: the mental lexicon is largely established in childhood while a language is being learned. Also, the scope of a conceptual context can be very narrow indeed. Quite natural sentences can use many different senses of a single word. For example, each token of the word 'in' in the following sentence has a slightly different sense: "The pen in the tray in the top drawer of the desk in my office was sitting in water a while ago." These all refer to spatial inclusion in one way or another, but the ways are all different and have different deductive properties, so must be somehow distinguishable in the IC.[12] If conceptual contexts are how such distinctions are recorded, then they must cluster onto sentence meanings like barnacles on a rock, rather than obeying any straightforward linear or hierarchical organization.
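The deductive differences noted in footnote 12 can be sketched using the ist notation introduced in the next section (the context names in-container and in-medium are my own inventions, and the axiom is purely illustrative):

    ist(in-container, (forall x y z)( ((In x y) & (In y z)) implies (In x z) ))

Within the container sense, the pen in the tray in the drawer is thereby in the drawer; but since no axiom licenses chaining an in-container fact with an in-medium fact, an object in a bowl which is itself in liquid is not thereby in the liquid.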
One might object that while their role in disambiguation is local and untidy, the contexts themselves have a natural structure, perhaps reflecting a concept hierarchy or obeying rules of prototypicality. But even here it is very difficult to distinguish any simple structure. All the straightforward early ideas of concept structure (for example, as feature lists) have proven to be inadequate, and the best that psychology can now offer seems to be little more than the observation that different word senses might be organized in families, distinguishable by their satisfying different logical theories. Considered as sets of logical sentences, these 'microtheories' overlap in complex ways. Some of them will contain others, and some groups of theories will have some, but not all, assumptions in common. However, there is no reason whatever to expect them to be temporally ordered. They have nothing particularly to do with time or narrative sequence. And although (like any set of subsets) they can be partially ordered by inclusion, this partial order would not seem to be at all closely related to the topic structure of narrative contexts, for the reasons just discussed: a single line of reasoning, or even a single assertion, will often involve concepts defined in many different conceptual contexts.[13]

[12] For example, some are transitive, others not. To be in a container in a place implies being in that place; to be an object in a bowl in liquid does not imply being in the liquid, but to be an object in liquid in a bowl does imply being in the bowl. Herskowitz (1986) documents the polysemic intricacies arising from the use of spatial prepositions in English.
CONTEXT LOGICS AND DEDUCTIVE
CONTEXTS
Several authors have proposed adapting
first-order predicate logic (the lingua franca
of knowledge representation in AI) to
include some contextual sensitivity. Guha
(1991), McCarthy (1993), Buvac (1996b) and
others are seeking a general-purpose logic of
contexts in which a logical sentence can be
asserted only relative to a context, and
inference rules are provided for ‘moving’
between contexts. The relation between a context and a sentence is expressed by a special sentence of the form ist(k, phi) (read as 'phi is true in k'). Of course, like any other sentence, this too can be asserted only in a context; but this construction allows some contexts to mention others, and to make explicit assertions about what is implicit in assertions made in them. Contexts may have local assumptions assumed to be true within them (in this respect they closely resemble the familiar conditional-proof subderivation constructions used in natural deduction logics) and they may also reinterpret some of the vocabulary, so that for example a binary relation may be replaced by a unary property in a subcontext where some parameters are fixed. We might call this kind of context a deductive context: it contains a set of assumptions and specifies a vocabulary which allows inferences to proceed without being hampered by conditional qualifications; a way, as it were, of packaging a set of assumptions and temporary changes to one's vocabulary, like a new suit of axiomatic clothing. McCarthy gives the plausible example of moving from a context in which relations are temporally sensitive into one representing a particular time, where the temporal parameters are considered constant and can therefore be ignored, thus presumably making inferences simpler. McCarthy suggests that counterfactual and fictional reasoning can be expressed in this way, so that one might have a context sherlock-holmes-stories within which the fictional characters are assumed to exist, so that

    ist(sherlock-holmes-stories,
        (exist t)(Past(t) & Resides(Watson, India, t)))

is simply true, i.e. true in our context, even though there is in fact no such person as Watson.

[13] Even here there is scope for computational variation. One view of microtheories imagines them as mostly used for handling exceptional cases, giving a picture of a large common context like a flat savanna, with relatively sparse context/sub-context trees. A very different picture sees almost every assertion as made relative to a microtheory, the total forming a dense structure more like a rain forest or a mangrove swamp (Guha 1990).
In McCarthy's original account, the logic
makes no a priori commitments to what
connections might be possible between the
vocabularies and assumptions of one context
and another. The context logic developed by
Buvac (1996; Buvac, Buvac & Mason 1995)
has a pair of rules which intuitively are
used to enter and leave a subcontext from a
larger context. The entrance rule translates
an assertion about a contextual truth into a
simple truth stated within that context:
(Entrance) From ist(k, phi) infer k: phi
Since all assertions are made relative to a
context, there is here assumed to be an
unspoken but implicit global context relative
to which any simple assertion is being
made; this applies to any inference rule, in
fact. The exit rule refers explicitly to this
outer context:
(Exit) From k: phi infer k': ist(k, phi)
Since the context name k' used here is free, this might be interpreted as meaning that from the sentence phi derived in context k, we can infer that ist(k, phi) is true in any context; but this would render the logic vacuous, since we could use it to transfer any assertion from any context to any other simply by entering a subcontext from the first and then exiting to the second.
McCarthy and Buvac talk of using this rule and its converse to enter a context, perform some inferences, and then exit back to the original context. The intended usage is clearly that k' in this rule is a 'supercontext' of k, and that all context invocations form a nested recursive structure rather like that of subderivations in ND proofs, or the dynamic stacks which are used to make operational sense of recursive subroutine calls in programming languages.[14] (More recent accounts of these logics make this stack structure explicit by labeling with sequences of context names. The Appendix gives an alternative natural-deduction style formalization.)
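The intended discipline can be sketched procedurally (my own simplification in Python, not Buvac's formal system; sentences are just strings and tuples):

    class ContextStack:
        def __init__(self):
            self.stack = ['global']
            self.truths = {'global': set()}      # sentences asserted per context

        def assert_(self, phi):
            self.truths.setdefault(self.stack[-1], set()).add(phi)

        def enter(self, k):
            # Entrance: each ist(k, phi) asserted here licenses phi inside k.
            outer = self.stack[-1]
            self.stack.append(k)
            for s in list(self.truths.get(outer, set())):
                if isinstance(s, tuple) and s[0] == 'ist' and s[1] == k:
                    self.assert_(s[2])

        def exit(self):
            # Exit: phi derived in k becomes ist(k, phi) one level up,
            # in the supercontext, not in an arbitrary context k'.
            k = self.stack.pop()
            for phi in list(self.truths.get(k, set())):
                self.assert_(('ist', k, phi))

    ctx = ContextStack()
    ctx.assert_(('ist', 'sherlock-holmes-stories', 'Resides(Watson, India)'))
    ctx.enter('sherlock-holmes-stories')     # Resides(Watson, India) holds here
    ctx.exit()                               # back to 'global'

Note that exit here returns assertions only to the supercontext on the stack; this is the restriction that blocks the vacuous reading of the free k' discussed above.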
This stack-like organization suggests that these logics may be useful for describing linguistic contexts, but not for conceptual or physical contexts. Consider for example describing the change in physical contexts relevant to a conversation during a walk through a garden. As the conversants turn a corner, they are confronted by a rose-bed which was previously invisible. This presumably requires a context change, and so we must either enter a new context or exit to an older one, as these are the only kinds of context shift available. But since this situation is new, exiting will be of no use, so entering a new context is the only option. As the walk proceeds, these contexts will accumulate on the context stack, like ghosts of the way the world was in the past. There is no way (unless the conversants retrace their steps) to get rid of these earlier contexts; the exit rule simply takes us back to the immediately preceding context (now in the past), and the logic provides no mechanism for moving to any other set of assumptions. For conceptual contexts the picture seems even less appropriate, since a cluster of different but overlapping conceptual contexts may be needed to distinguish the several occurrences of a single relation symbol in a single assertion.

[14] The 'universal' interpretation seems analogous to the rule of necessitation in modal logics: from p infer Np, which can be understood as saying that a tautology must be true in any possible world. However, this rule is valid only in a simple logic with no mechanisms for introducing temporary hypotheses, since then any sentence in a correct derivation must indeed be a logical theorem and hence necessarily true (if the logic is sound). In context logics, however, the inferences performed in the inner context and extracted to the outer one are typically intended to be correct only locally to that context.
Buvac (1996) applies this framework of contextual logic to analyze dialog structure. To do this, however, it is necessary to assume that the meanings of the utterances have already been processed into the form of logical sentences. This seems an odd assumption, since that comprehension process is itself contextually located. To be sure, one might expect that the resulting IC would refer to aspects of the context in which the EL was originally interpreted. The IC resulting from the comprehension of a narrative might refer to the sequence of events the narrative describes, for example. But this description need not itself be contextually located: in fact, the very process of comprehension would normally be expected to eliminate that contextual sensitivity, transforming a narrative text like "John got up. He had breakfast. He went out...." into a representation with a meaning expressed more accurately by: "At some time t John got up, and at time t+1 he had breakfast, and at time t+2 he went out, and...". But this is simply a conjunction, where the narrative time order is made explicit in the content of the IC itself, and it does not require any special deductive machinery. (It is instructive to contrast this with the treatment in ter Meulen (1995), where the narrative structure seems to be intimately involved in the deductive process. ter Meulen, however, is clearly in the linguistic tradition; she is talking of inferences between English sentences, not logical sentences, and the contextual narrative machinery is closely connected with such linguistic delicacies as past and progressive tense inflexions, etc. The logics do not have these syntactic subtleties, so their meaning must have been already decoded and made explicit in the logical sentences (or lost altogether); but this is only possible, as ter Meulen's work clearly illustrates, by taking the relevant narrative contexts into account.)
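In the logical notation used earlier, that decontextualized content might be rendered (the predicate names are mine, purely for illustration) as

    (exists t)( GetUp(John, t) & HaveBreakfast(John, t+1) & GoOut(John, t+2) )

an ordinary conjunction in which the ordering is carried by the time terms themselves rather than by any contextual machinery.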
CONCLUSION
The four senses of 'context' distinguished here seem clearly to be distinct notions: the physical setting of a conversation; the topic of a conversation or narrative; indices for recording fine conceptual distinctions; and finally, temporary packages of deductive assumptions. There may well be other kinds of 'context'. How plausible it is to treat some or all of them as similar, so that a single account can be found which subsumes several of them, seems to depend on one's theoretical assumptions and background. The linguistic tradition typically merges physical and linguistic contexts into a common ground and ignores conceptual and deductive contexts. The computational tradition is more sensitive to the distinctions, but if one thinks of deduction as a universal cognitive-modeling mechanism then it would be natural to hope that deductive contexts might subsume all the others.

A useful general-purpose device is to ask, for each kind of context, what its natural structure seems to be. What purely mathematical structure would describe the significant relations on the set of all such contexts? This question seems to have different answers for the four kinds of context considered here, suggesting that there may be no very useful single notion of 'context', but that, like most other English words, its meaning can only be properly understood when the context is made clear.
REFERENCES
Buvac, S. Ambiguity via Formal Theory of Context, in Semantic Ambiguity and Underspecification, ed. Peters & van Deemter, CSLI Lecture Notes (1996a)

Buvac, S. Quantificational Logic of Context. Proc. 14th Int. Joint Conf. on AI (1996b)

Buvac, S., Buvac, V. & Mason, I.A. Metamathematics of Contexts. Fundamenta Informaticae 23(3) (1995)

Clark, H.H. & Carlson, T.B. Context for Comprehension, in Attention and Performance IX, ed. Long & Baddeley, pp 313-330, Erlbaum (1981)

Grosz, B.J. & Sidner, C.L. Attention, Intention and the Structure of Discourse. Computational Linguistics 12: 175-204 (1986)

Guha, R.V. Contexts: A Formalization and Some Applications. Doctoral dissertation, Stanford University (1991)

Guha, R.V. Micro-Theories and Contexts in CYC, Part 1: Basic Issues. MCC Technical Report ACTR-CYC-129-90 (1990)

Hankamer, J. & Sag, I.A. Deep and Surface Anaphora. Linguistic Inquiry 7: 391-426 (1976)

Herskowitz, A. Language and Spatial Cognition. Cambridge University Press (1986)

Kamp, H. Prolegomena to a Structural Theory of Belief and Other Attitudes, in Propositional Attitudes, ed. Anderson & Owens, CSLI Lecture Notes 20 (1990)

McCarthy, J. Notes on Formalizing Context. Proc. Thirteenth International Joint Conference on Artificial Intelligence (1993)

McCarthy, J. & Buvac, S. Formalizing Context. Technical Note STAN-CS-TN-94-13, Stanford University (1994)

McCarthy, J. & Hayes, P.J. Some Philosophical Problems from the Standpoint of Artificial Intelligence, in Machine Intelligence 4, ed. Meltzer & Michie, Edinburgh University Press (1969)

ter Meulen, A. Content in Context. MIT Press (1995)

Singh, N., Tawakol, O. & Genesereth, M. A Name-Space Context Graph for Multi-Context, Multi-Agent Systems. Working Notes of the AAAI Fall Symposia (1995)

Woods, W. What's in a Link?, in Readings in Knowledge Representation, ed. Brachman & Levesque, Morgan Kaufmann (1985)
APPENDIX: A NATURAL-DEDUCTION
LOGIC OF CONTEXTS
Consider propositional logic extended recursively
to include sentences of the form ist(k, phi) where
phi is a sentence and k is a constant symbol from a
set C of context names disjoint from the rest of the
vocabulary. Conjunction and disjunction are
understood to be symmetric. Define a derivation to
be a sequence of items, where an item is either a
sentence or a derivation labeled with a context
name.
We write k:[a...b] to mean any derivation labeled with k whose first and last items are a and b respectively, and k:[...a...] to mean any derivation containing a. The following rules apply to any items which occur in derivations with the same label, and the conclusion is added to the end of one of those derivations.
_______________
and:     from (a & b) infer a
         from a, b infer (a & b)
_______________
or:      from (a or b), k:[a...c], k':[b...c] infer c
         from a infer (a or b)
_______________
implies: from a, (a implies b) infer b
         from k:[a...b] infer (a implies b)
_______________
ist:     from k:[...a...] infer ist(k, a)
         from ist(k, a) infer k:[a]
_______________
not:     from k:[a...(b & not b)] infer not a
There are also two special 'structural' rules:
_______________
hyp:     infer k:[a], if k does not occur elsewhere in the proof.
_______________
import:  k:[...a...k':[...]...] can be transformed to k:[...a...k':[...a]...]
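As an illustration (a worked example of my own, not part of the original formulation), the rules allow modus ponens to be performed inside a context and the result exported:

    1. ist(k, p)              premise
    2. ist(k, p implies q)    premise
    3. k:[ p ]                from 1, by ist
    4. k:[ p implies q ]      from 2, by ist
    5. k:[ p implies q, q ]   from 3 and 4, by implies: the rule applies to items in co-labeled derivations, extending derivation 4
    6. ist(k, q)              from 5, by ist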
This version of contextual logic restricts the
contextual sensitivity to making temporary
assumptions (although it allows these to be re-used
in other co-labeled sub-derivations without
needing to be repeated there). More elaborate
kinds of contextual restriction would require a
more elaborate syntax, and this use of contexts, particularly in the negation and implication-introduction rules, would then be restricted to the simple case.
It is straightforward to extend the usual completeness, compactness, etc. proofs to this logic, as it is equivalent to a standard ND formulation of propositional logic. (The quantifiers pose no special problem but are omitted here to save space.) The point of exhibiting it here is to emphasize that these deductive contexts play an essentially proof-theoretic role, and need have no special connection to the process of thought, even if that process is expected to produce proofs which are valid in this logic. A theorem-prover for this logic need not be closely related to the contextual inclusion relation, but could proceed, for example, by backward chaining. The ist rules need not be interpreted as entry and exit processes, and doing so imposes an unnecessary extra burden on the intended meaning.