Intensional Query Answering: An Application of Partial Evaluation

1993, Workshops in Computing

We consider intensional answers to be logical formulas expressing sufficient conditions for objects to belong to the usual answer to a query addressed to a knowledge base. We show that in the SLDNF-resolution framework, complete and procedurally complete sets of intensional answers can be generated by using partial evaluation. Specific treatments of recursion and negation are also presented.

Giuseppe De Giacomo
Dipartimento di Informatica e Sistemistica
Università di Roma “La Sapienza”
Via Salaria 113, 00198 Roma, Italia
e-mail: [email protected]

1 Introduction

Intensional answers are responses that provide an abstract description of the conventional answer to a query addressed to a knowledge base. They are expected to “provide compact and intuitive characterizations of sets of facts, making explicit why a specific set of facts answers the query instead of just which facts belong to the answer” ([PR89]). Various studies have investigated this kind of answer (e.g., [CD86], [CD88], [Corella84], [Demolombe92], [DFI91], [Imielinski87], [Motro89], [Motro91], [MY90], [PR89], [PRZ91], [SM88]). Following [CD86], [PR89], [PRZ91], we consider intensional answers to be logical formulas expressing sufficient conditions for objects to belong to the conventional answer to a query.

We assume knowledge bases to be, essentially, programs whose proof procedure is SLDNF-resolution, as in [Lloyd87]. Partial evaluation (PE) for programs in the SLDNF-resolution framework is defined in [LS91]. Although it is usually considered an optimization technique, we use it for quite a different aim in this paper. We show that, given a program P and a query Q(X), a new program P′ = P ∪ {q(X) ← Q(X)} (where q is a predicate symbol not occurring in P) can be defined such that to every PE of q(X) in P′ there corresponds a complete set of intensional answers to Q(X) in P.
Furthermore, each set SIA of intensional answers computed in this way is procedurally equivalent to the original query Q(X), i.e., the conventional answers that can be computed from SIA in P are exactly those that can be computed from Q(X) in P.

Having pointed out this correspondence, we have a tool to produce intensional answers for a very general class of queries and programs, i.e., for every query in every program intended to run under SLDNF-resolution. Therefore, in principle, we can deal with function symbols, recursion and negation, something usually ruled out by other approaches to intensional query answering.

Specifically, we suggest a simple but quite effective way to return intensional answers when recursion is involved. Notice first of all that by PE we can obtain recursion-free intensional answers for a query involving recursive predicate symbols. On the other hand, if we cannot remove a recursive predicate symbol p from an intensional answer, then we return, together with the intensional answer, an auxiliary definition for p. This is a specialized definition that is general enough to cover the meaning of p in the context of the intensional answers in which it appears. Note that, if a recursive predicate symbol p′ other than p shows up in the auxiliary definition for p, then we return an auxiliary definition for p′ as well. The pair <SIA, AD>, where SIA is a set of intensional answers to a query Q(X), and AD is a set of auxiliary definitions for SIA, can be interpreted as the implicit representation of the infinite set of all the intensional answers to Q(X) which can be inferred from SIA using the axioms corresponding to AD.

With regard to negation, we remind the reader that if a negative literal is found at a certain point of the PE process, then either it is completely evaluated, or the atom in the negative literal is partially evaluated and the definition obtained is added to the PE to be returned (e.g., see [BL90]).
We could follow a similar approach in the generation of intensional answers, returning auxiliary definitions for the atoms in the negative literals that cannot be evaluated. Yet, this would be quite unsatisfactory, because we would lose the “interactions” between the positive part and the negative part of an intensional answer. To avoid this problem, we make some additional logical transformations. Roughly, we consider the completions of such auxiliary definitions, negate both sides of the equivalences, perform some logical manipulations on the right-hand sides, and replace the negative literals in the intensional answers by the proper instances of the corresponding right-hand sides of the equivalences obtained.

The rest of the paper is organized as follows. After recalling some preliminary notions in the next section, the basic results are presented in Section 3. Our treatments of recursion and negation are described in Sections 4 and 5, respectively. Conclusions and further work end the paper. Due to lack of space only sketches of the proofs are reported. The full proofs appear in [DeGiacomo92].

2 Preliminaries

In this section we introduce some basic definitions, the knowledge bases considered, intensional answers and partial evaluation. We assume that the reader is familiar with the standard theoretical results of logic programming (cf. [Lloyd87]).¹

¹ We mainly use the same notation as [Lloyd87] except that we denote sequences of terms by a single capital letter. A few other differences are pointed out when encountered.

As usual, a program is a finite set of statements of the form A ← W, where A is an atom and W is a first order formula. All the statements of a program P which have the same predicate symbol p in their head form the definition of p in P. Statements whose bodies are conjunctions of literals are called program clauses or just clauses. A program whose statements are program clauses is called a normal program.
The completion of a program P, denoted as comp(P), is the collection of completed definitions of the predicate symbols in P together with Clark’s equality theory.

Definition Let P and P′ be two programs, G and G′ two goals with the same free variables. We say that P ∪ {G} and P′ ∪ {G′} are procedurally equivalent if the following holds:
1. P ∪ {G} has an SLDNF-refutation with computed answer θ iff P′ ∪ {G′} does.
2. P ∪ {G} has a finitely failed SLDNF-tree iff P′ ∪ {G′} does. ✷

When we talk about SLDNF-resolution for non-normal programs we refer to the corresponding normal forms obtained by applying Lloyd & Topor’s transformations (cf. [LT84], [Lloyd87]).

Definition Let S be a set of predicate symbol definitions. We call comp′(S) the set of the corresponding completed definitions together with Clark’s equality theory. ✷

Notice that, given a program P, comp′(P) is the subset of comp(P) formed by the completed definitions of the predicate symbols explicitly defined in P (i.e., the predicate symbols appearing in the head of a statement of P). To further clarify the concept let us see an example.

Example Consider the following program P = {p(x) ← r(x) ∧ s(x), r(a) ←}. Then comp(P) is
{∀x(p(x) ↔ r(x) ∧ s(x)), ∀x(r(x) ↔ (x = a)), ∀x(∼s(x))}
while comp′(P) is
{∀x(p(x) ↔ r(x) ∧ s(x)), ∀x(r(x) ↔ (x = a))}. ✷

We consider a knowledge base KB essentially constituted by a program divided into two strata, IDB and EDB.
• IDB is a program such that the predicate symbols defined therein may depend upon predicate symbols defined in EDB. We call such a program the intensional program of the knowledge base KB.
• EDB is a program such that no predicate symbol defined therein depends upon predicate symbols defined in IDB. We call such a program the extensional program of the knowledge base KB.
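The two conditions on the strata can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding (not from the paper) in which each clause is reduced to its head predicate symbol and the predicate symbols of its body; the names `defined_predicates`, `dependencies` and `respects_strata` are illustrative only.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: a clause is (head predicate, body predicates).
# Arguments are dropped because only predicate dependencies matter here.
Clause = Tuple[str, List[str]]


def defined_predicates(program: List[Clause]) -> Set[str]:
    """Predicate symbols explicitly defined, i.e. those covered by comp'."""
    return {head for head, _ in program}


def dependencies(program: List[Clause], pred: str) -> Set[str]:
    """All predicate symbols that `pred` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [pred]
    while stack:
        current = stack.pop()
        for head, body in program:
            if head == current:
                for q in body:
                    if q not in seen:
                        seen.add(q)
                        stack.append(q)
    return seen


def respects_strata(idb: List[Clause], edb: List[Clause]) -> bool:
    """True if no predicate defined in EDB depends on one defined in IDB."""
    idb_defined = defined_predicates(idb)
    whole = idb + edb
    return all(dependencies(whole, p).isdisjoint(idb_defined)
               for p in defined_predicates(edb))


# The program P of the example above: p(x) <- r(x) & s(x), r(a) <-
P = [("p", ["r", "s"]), ("r", [])]
explicitly_defined = defined_predicates(P)   # s is used but not defined

good_split = respects_strata([("p", ["r", "s"])], [("r", []), ("s", [])])
bad_split = respects_strata([("p", ["r", "s"])], [("r", ["p"]), ("s", [])])
```

In the example program, `s` occurs only in a body, which is why comp(P) contains ∀x(∼s(x)) while comp′(P) has no formula for `s`.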
Typically, IDB and EDB are intended to model the intensional knowledge and the extensional knowledge of KB respectively.² We say that an intensional program IDB is a normal intensional program if it is a normal program. In the same way, we say that an extensional program EDB is a normal extensional program if it is a normal program. A query to a knowledge base can be any first order formula.³

² Note that there are no restrictions on the form of the statements in either IDB or EDB.
³ In [Lloyd87] a query is a goal. Let ← W be a goal; we call “query” the first order formula W.

We now turn our attention to intensional answers. We adopt the same definitions as in [CD86], [PR89], [PRZ91], adapting them to the SLDNF-resolution framework. Let IDB be the intensional program of a knowledge base KB, and Q(X) a query whose free variables are X. Since in the present paper we do not use integrity constraints to generate intensional answers, we are actually considering a special kind of intensional answers defined as follows.

Definition A first order formula Ai(X), whose free variables are X, is an intensional answer for Q(X) (wrt KB) if
comp′(IDB) |= ∀X(Ai(X) → Q(X)). ✷

Obviously not all the intensional answers are interesting, e.g., we can drop intensional answers which are variants of the query, those inconsistent wrt comp′(IDB), and those subsumed by other ones.

Definition A set SIA of intensional answers for Q(X) (wrt KB) is complete if
comp′(IDB) |= ∀X((⋁_{Ai ∈ SIA} Ai(X)) ↔ Q(X)). ✷

Since SLDNF-resolution is sound but not complete in general, it makes sense to introduce the notion of a set of intensional answers that is complete from the procedural point of view.

Definition A set SIA of intensional answers for Q(X) (wrt KB) is procedurally complete if for every possible extensional program EDB of KB,
IDB ∪ EDB ∪ {← ⋁_{Ai ∈ SIA} Ai(X)} and IDB ∪ EDB ∪ {← Q(X)}
are procedurally equivalent. ✷

We finish the preliminary section by introducing partial evaluation (PE).⁴
The formal notions and results described here are from [LS91]. We refer to normal programs and normal goals only. It is convenient to use slightly more general definitions of SLDNF-derivation and SLDNF-tree than those given in [Lloyd87]. In [Lloyd87], an SLDNF-derivation is either infinite, successful or failed. We also allow it to be incomplete, in the sense that at any step we are allowed simply not to select any literal and terminate the derivation. Likewise, in an SLDNF-tree we may neglect to unfold a goal.

Definition A resultant is a first order formula of the form Q1 ← Q2, where Qi (i = 1, 2) is either absent or a conjunction of literals. Any variables in Q1 or Q2 are assumed to be universally quantified in front of the resultant. ✷

Definition Let P be a normal program, G a normal goal ← Q, and G0 = G, G1, ..., Gn an SLDNF-derivation of P ∪ {G}, where the sequence of substitutions is θ1, ..., θn and Gn is ← Qn. Let θ be the restriction of θ1 ... θn to the variables in G. Then we say the derivation has length n with computed answer θ and resultant Qθ ← Qn.⁵ ✷

Definition Let P be a normal program, A an atom, and T a (not necessarily complete) SLDNF-tree for P ∪ {← A}. Let G1, ..., Gr be a set of (non-root) goals in T such that each non-failed branch of T contains exactly one of them. Let Ri (i = 1, ..., r) be the resultant of the derivation from ← A down to Gi associated with the branch leading to Gi.
• The set of resultants π = {R1, ..., Rr} is a PE of A in P. These resultants have the following form:
Ri = Aθi ← Qi (i = 1, ..., r),
where we have assumed Gi = ← Qi.
• Let A = {A1, ..., As} be a finite set of atoms, and πi (i = 1, ..., s) a PE of Ai in P. Then Π = π1 ∪ ... ∪ πs is a PE of A in P.
• Let P′ be the normal program resulting from P when the definitions therein of the predicate symbols in A are replaced by a PE of A in P. Then P′ is a PE of P wrt A.
✷

The next two theorems are the main results on partial evaluation.

Definition Let S be a set of first order formulas and A a finite set of atoms. We say S is A-closed if each atom in S containing a predicate symbol occurring in A is an instance of an atom in A. ✷

⁴ Recently, in the context of logic programming, it has been proposed to replace the name partial evaluation with the name partial deduction, leaving the original name to denote the optimization-oriented use of such machinery. In this paper we stick to the name partial evaluation in conformity with [LS91] and [BL90], whose results are extensively used.
⁵ Note that, if n = 0, the resultant is Q ← Q.

Definition Let A be a finite set of atoms. We say A is independent if no pair of atoms in A have a common instance. ✷

Theorem 1 (Lloyd, Shepherdson) Let P be a normal program, W a closed first order formula, A a finite set of atoms, and P′ a PE of P wrt A such that P′ ∪ {W} is A-closed. If W is a logical consequence of comp(P′), then it is a logical consequence of comp(P), i.e.,
comp(P′) |= W ⇒ comp(P) |= W.

Theorem 2 (Lloyd, Shepherdson) Let P be a normal program, G a normal goal, A a finite, independent set of atoms, and P′ a PE of P wrt A such that P′ ∪ {G} is A-closed.⁶ Then P ∪ {G} and P′ ∪ {G} are procedurally equivalent.

Note that the PE of a program wrt a goal is not directly defined. However, there are procedures (e.g., [BL90]) that, given a program P and a goal G, compute a set of atoms A and a PE of the program P wrt A such that the original program and the partially evaluated program are procedurally equivalent wrt the goal G.

3 Intensional query answering by partial evaluation

The intensional program IDB of a knowledge base KB is an “open program”, i.e., a program for which some predicate symbol definitions are missing; hence it should be considered more as a collection of predicate symbol definitions than as a running program.
It is clear that for IDB the completion comp(IDB) does not make sense, while comp′(IDB) does. The partial evaluation theorems seen in the previous section are not directly useful in dealing with intensional programs. Here, we give analogous theorems more suitable for such programs. First, we need the next definition, taken from [BL90].

Definition Let L be a set of predicate symbols. We say that a literal is L-selectable if its predicate symbol is in L. We say that an SLDNF-tree is L-compatible if the predicate symbol of each selected literal in the tree (including subsidiary refutations and trees) is in L. ✷

Let IDB be a normal intensional program of a knowledge base KB, LIDB the set of predicate symbols defined in IDB, A a finite set of LIDB-selectable atoms, and IDB′ a PE of IDB wrt A obtained from an LIDB-compatible SLDNF-tree, such that IDB′ is A-closed. The following two theorems hold.

Theorem 3 Let W be a first order formula which is A-closed. Then
comp′(IDB′) |= W ⇒ comp′(IDB) |= W.

Sketch of the proof By Theorem 1, for every normal extensional program EDB: comp(IDB′ ∪ EDB) |= W ⇒ comp(IDB ∪ EDB) |= W. Now, the thesis is proved by contradiction, showing that, if it failed, there would exist an EDB, namely EDB* = {A ← A : the predicate symbol in A is not defined in IDB, and an instance of A occurs in the body of a program clause in IDB}, for which Theorem 1 would not hold. ✷

⁶ In Theorem 2, the closedness condition can be replaced by the coveredness condition (cf. [LS91]).

Theorem 4 Let G be a normal goal which is A-closed. If A is independent, then for every possible normal extensional program EDB of KB, IDB ∪ EDB ∪ {G} and IDB′ ∪ EDB ∪ {G} are procedurally equivalent.

Sketch of the proof From the definition of PE it is obvious that IDB′ ∪ EDB is a PE of IDB ∪ EDB wrt A. By Theorem 2 the thesis follows. ✷

We are now ready to describe the first results on generating intensional answers by using partial evaluation.

1) Let ← W be a normal goal.
We define a new predicate symbol q (i.e., a predicate symbol not appearing in IDB or W) by
q(X) ← W
where X are the free variables occurring in W, and we add this new definition to IDB, getting IDB^q = IDB ∪ {q(X) ← W}.

2) Let LIDB^q be the set of the predicate symbols defined in IDB^q. We choose a PE π of q(X) in IDB^q obtained from an LIDB^q-compatible SLDNF-tree for IDB^q ∪ {← q(X)}. Let π be
q(X)θ1 ← W1
...
q(X)θr ← Wr
where θi = {Xi/Ti}, Xi are the variables in X instantiated by θi, and Ti are terms.

3) The completed definition for q given by these resultants can be written as follows:
∀X(q(X) ↔ ∃Y1((X1 = T1) ∧ W1) ∨ ... ∨ ∃Yr((Xr = Tr) ∧ Wr)) (1)
where Yi are the free variables in (Xi = Ti) ∧ Wi other than those in X, and Xi = Ti is a loose notation for (x_i^1 = t_i^1) ∧ ... ∧ (x_i^n = t_i^n) (supposing Xi to be the sequence x_i^1 ... x_i^n).

4) The disjuncts in the above formula
∃Y1((X1 = T1) ∧ W1)
...
∃Yr((Xr = Tr) ∧ Wr)
can be regarded as intensional answers. Furthermore, the set formed by these intensional answers is complete and procedurally complete, as the following theorems show.

Theorem 5 The formulas at step 4 of the process above form a complete set of intensional answers for the query W (wrt KB).

Sketch of the proof By Theorem 3 it can be shown that (1) is a logical consequence of comp′(IDB^q). Then, considering the axiom ∀X(q(X) ↔ W) in comp′(IDB^q), the formula resulting from (1) by replacing q(X) with W can be proved to be a logical consequence of comp′(IDB), hence the thesis follows. ✷

Theorem 6 The set of intensional answers obtained by the process above is procedurally complete.

Sketch of the proof Let IDB^q′ be the PE of IDB^q wrt {q(X)}. By Theorem 4, for every possible EDB of KB, IDB^q′ ∪ EDB ∪ {← q(X)} is procedurally equivalent to IDB^q ∪ EDB ∪ {← q(X)}, which, in turn, is procedurally equivalent to IDB ∪ EDB ∪ {← W}.
On the other hand, IDB ∪ EDB ∪ {← ⋁_{i=1}^{r} ∃Yi((Xi = Ti) ∧ Wi)}, once transformed into normal form, and assuming for the predicate symbol “=” the standard procedural meaning “unifiable”, can be shown to be procedurally equivalent to IDB^q′ ∪ EDB ∪ {← q(X)}, regardless of EDB. Hence the thesis follows. ✷

Example Consider the following fragment of the intensional program IDB of a knowledge base.
publication_bonus(x, 50) ← conference_publication(x, y)
publication_bonus(x, 100) ← conference_publication(x, y) ∧ major_conference(y)
publication_bonus(x, 150) ← journal_publication(x, y)
major_conference(x) ← sponsor(x, ACM)
major_conference(x) ← sponsor(x, IEEE)
major_conference(x) ← accepted_rate(x, y) ∧ (y ≤ 0.2)
...
Suppose we want the answer to the query ← ∃y(publication_bonus(x, y) ∧ (y ≥ 100)), i.e., “Which are the papers that get a publication bonus greater than or equal to 100?”.

1) We define a new predicate symbol q as
q(x) ← publication_bonus(x, y) ∧ (y ≥ 100).
Let IDB^q be IDB ∪ {q(x) ← publication_bonus(x, y) ∧ (y ≥ 100)}.

2) We choose a PE π of q(x) in IDB^q obtained from an LIDB^q-compatible SLDNF-tree. Let such a tree be the one in Figure 1, and π the PE associated with the non-failing leaves of such a tree, i.e.,
q(x) ← conference_publication(x, z) ∧ sponsor(z, ACM)
q(x) ← conference_publication(x, z) ∧ sponsor(z, IEEE)
q(x) ← conference_publication(x, z) ∧ accepted_rate(z, z′) ∧ (z′ ≤ 0.2)
q(x) ← journal_publication(x, z).

← q(x)
  ← pb(x, y) ∧ (y ≥ 100)
    {y/50}: ← cp(x, z) ∧ (50 ≥ 100)
      fail
    {y/100}: ← cp(x, z) ∧ mc(z) ∧ (100 ≥ 100)
      ← cp(x, z) ∧ mc(z)
        ← cp(x, z) ∧ s(z, ACM)
        ← cp(x, z) ∧ s(z, IEEE)
        ← cp(x, z) ∧ ar(z, z′) ∧ (z′ ≤ 0.2)
    {y/150}: ← jp(x, z) ∧ (150 ≥ 100)
      ← jp(x, z)
Figure 1: The SLDNF-tree used for the partial evaluation.
3) The completed definition of q in IDB^q is
∀x(q(x) ↔ ∃z(conference_publication(x, z) ∧ sponsor(z, ACM))
  ∨ ∃z(conference_publication(x, z) ∧ sponsor(z, IEEE))
  ∨ ∃z∃z′(conference_publication(x, z) ∧ accepted_rate(z, z′) ∧ (z′ ≤ 0.2))
  ∨ ∃z(journal_publication(x, z))).

4) The disjuncts in the right-hand part of such a formula form a complete and procedurally complete set of intensional answers, which can be read as “Papers published in an ACM conference, papers published in an IEEE conference, papers published in a conference whose accepted rate is less than or equal to 0.2, and papers published in a journal.” ✷

The process above is not completely specified, since we are free to choose any PE π of q in IDB^q at step 2. The quality of the intensional answers returned strongly depends on this choice of π, which in turn substantially depends on the selection rule for the related SLDNF-tree. While we do not directly address this issue in this paper, the problem of finding a “good” selection rule is one of the most crucial for effective intensional query answering by means of partial evaluation.

The termination of the above process depends again on the selection rule used in the generation of the PE π. Such a selection rule should build finite (incomplete) SLDNF-trees. Conditions on the selection rules, dealing with the termination of the partial evaluation, can be found in the related literature (e.g., [vanHarmelen89]).

4 Dealing with recursion

The basic method presented in the previous section allows one, in principle, to return intensional answers for every query in every logic program. In particular, it does not rule out recursion.
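Before recursion is considered, the basic method of Section 3 can be made concrete on the publication-bonus example. The following is a minimal sketch, not the procedure of [LS91] or [BL90]: it assumes function-free terms, no negation, variables written as strings starting with `?`, and a leftmost selection rule restricted to IDB-defined atoms and ground comparisons. The encoding, the names `partial_eval`, `unify`, `rename`, and the `depth` bound are all illustrative assumptions.

```python
import itertools


def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def subst(atom, s):
    return (atom[0],) + tuple(walk(t, s) for t in atom[1:])


def unify(goal_atom, head, s):
    """Unify two function-free atoms; return the extended substitution or None."""
    if goal_atom[0] != head[0] or len(goal_atom) != len(head):
        return None
    s = dict(s)
    for x, y in zip(goal_atom[1:], head[1:]):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(y):        # bind clause variables first, keeping goal names
            s[y] = x
        elif is_var(x):
            s[x] = y
        else:
            return None
    return s


BUILTINS = {"ge": lambda a, b: a >= b, "le": lambda a, b: a <= b}
_fresh = itertools.count(1)


def rename(clause):
    """Rename the variables of a clause apart with a fresh numeric suffix."""
    n = next(_fresh)
    r = lambda a: (a[0],) + tuple(f"{t}_{n}" if is_var(t) else t for t in a[1:])
    head, body = clause
    return r(head), [r(a) for a in body]


def partial_eval(idb, head, body, s=None, depth=10):
    """Return the resultants (head, residual body) of an incomplete derivation tree."""
    s = {} if s is None else s
    defined = {h[0] for h, _ in idb}
    for i, atom in enumerate(body):
        a = subst(atom, s)
        if a[0] in BUILTINS and not any(is_var(t) for t in a[1:]):
            if not BUILTINS[a[0]](*a[1:]):
                return []                                  # failed branch
            return partial_eval(idb, head, body[:i] + body[i + 1:], s, depth)
        if a[0] in defined and depth > 0:                  # L_IDB-selectable atom
            out = []
            for clause in idb:
                h, b = rename(clause)
                s2 = unify(a, h, s)
                if s2 is not None:
                    out += partial_eval(idb, head, body[:i] + b + body[i + 1:],
                                        s2, depth - 1)
            return out
    return [(subst(head, s), [subst(a, s) for a in body])]  # nothing selectable


IDB = [
    (("publication_bonus", "?x", 50), [("conference_publication", "?x", "?y")]),
    (("publication_bonus", "?x", 100), [("conference_publication", "?x", "?y"),
                                        ("major_conference", "?y")]),
    (("publication_bonus", "?x", 150), [("journal_publication", "?x", "?y")]),
    (("major_conference", "?x"), [("sponsor", "?x", "ACM")]),
    (("major_conference", "?x"), [("sponsor", "?x", "IEEE")]),
    (("major_conference", "?x"), [("accepted_rate", "?x", "?y"), ("le", "?y", 0.2)]),
]

resultants = partial_eval(IDB, ("q", "?x"),
                          [("publication_bonus", "?x", "?y"), ("ge", "?y", 100)])
```

Under these assumptions `resultants` contains four entries, one per non-failing leaf of Figure 1, each with head `q(?x)`; the existentially quantified variables carry generated names such as `?y_2`. The `depth` bound is a crude stand-in for a terminating selection rule: once it is exhausted, IDB atoms are simply left in the residual body.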
Obviously, such intensional answers should be expressed in a language that is known by the user.⁷ If recursive predicate symbols (i.e., predicate symbols which appear in a loop in the dependency graph of a program) are allowed to appear in the intensional program of a knowledge base, then it could be impossible to obtain a complete set of intensional answers in which no recursive predicate symbol unknown to the user appears. In this case, no satisfactory set of intensional answers could be returned. The next example shows the problem arising when recursion cannot be eliminated, and hints at how it can be tackled.

⁷ We assume that the user knows a set of predicate symbols which includes those defined in the extensional program of the knowledge base, and all constants and function symbols.

Example Consider the following fragment of the intensional program of a knowledge base:
collateral_line_relative(x, y) ← ancestor(x, z) ∧ ancestor(y, z)
ancestor(x, y) ← parent(x, y)
ancestor(x, y) ← parent(x, z) ∧ ancestor(z, y)
...
and suppose we want intensional answers for the query:
← collateral_line_relative(x, y)
Possible complete sets of intensional answers are
{∃z(ancestor(x, z) ∧ ancestor(y, z))}
or
{∃z(parent(x, z) ∧ ancestor(y, z)),
 ∃z∃z′(parent(x, z′) ∧ ancestor(z′, z) ∧ ancestor(y, z))}
or, also,
{∃z(parent(x, z) ∧ parent(y, z)),
 ∃z∃z′(parent(x, z′) ∧ ancestor(z′, z) ∧ parent(y, z)),
 ∃z∃z′(parent(x, z) ∧ parent(y, z′) ∧ ancestor(z′, z)),
 ∃z∃z′∃z′′(parent(x, z′) ∧ ancestor(z′, z) ∧ parent(y, z′′) ∧ ancestor(z′′, z))}
etc. As we can see, we cannot eliminate the predicate symbol ancestor in the set of intensional answers returned. Now, if the meaning of ancestor is known by the user, then the most intuitive set of answers is probably the first one, being the simplest.
But, if the meaning of ancestor is not known (e.g., the user may not be clear on whether or not his wife’s grandfather is his ancestor), none of the above sets is satisfactory, because ancestor appears in each of them. We need some kind of definition giving the meaning of ancestor in the context of the set of intensional answers returned. For instance we may return:
{∃z(ancestor(x, z) ∧ ancestor(y, z))}
ancestor(x, y) ← parent(x, y)
ancestor(x, y) ← parent(x, z) ∧ ancestor(z, y).
In this way, asking “which are the collateral-line relatives?” we get an answer such as “the individuals that have a common ancestor, where an ancestor is a parent or a parent of an ancestor”. ✷

In view of the observations in the above example, we propose to answer a query by a set SIA of intensional answers and a set RD of definitions for the recursive predicate symbols occurring in the answer which are somehow marked unknown.⁸ Notice that, if other predicate symbols marked unknown appear in such definitions, then their definitions are included in RD as well.⁹ To formalize the set RD we now introduce the notion of a set of auxiliary definitions.

Let IDB be the intensional program of a knowledge base KB, LIDB the set of predicate symbols defined in IDB, Q(X) a query whose free variables are X, SIA a set of intensional answers Ai(X) (i = 1, ..., n) for Q(X), L a subset of LIDB, and AD a set of definitions for the predicate symbols in L. Then, let AL be a set of atoms, one for each predicate symbol in L,¹⁰ such that SIA is AL-closed, and let comp′(AD)inst be the instance of comp′(AD) such that the atoms on the left-hand sides of the completed definitions therein coincide (modulo variants) with the corresponding atoms in AL.

Definition We say AD is a set of auxiliary definitions¹¹ for SIA wrt L if:
1. comp′(IDB) |= comp′(AD)inst, and
2. comp′(AD)inst is AL-closed. ✷

Notice that a set AD of auxiliary definitions always exists.
In fact, the IDB definitions of the predicate symbols in L form such a set. But the definitions in AD are not necessarily those in IDB. Intuitively, they can be a “specialized” version of those which are general enough to cover the meaning of each atom occurring in the answer returned (i.e., wrt the atoms in the answer, the definitions in AD retain the same meaning as those in the intensional program).

We could also require the auxiliary definitions in AD to be usable, instead of the corresponding definitions in IDB, to evaluate the intensional answers in SIA without losing correct answers, or at least computed answers. Such a property is quite “severe”, since, to enforce it, we should return auxiliary definitions that are general enough to cover the meaning of the predicate symbols in L not only in the context of SIA and AD, but also throughout the whole evaluation of each intensional answer in SIA. Indeed, if a predicate symbol p ∉ L, which depends on a predicate symbol p′ ∈ L, appears in some atoms of SIA ∪ AD, then in choosing the generality of the auxiliary definition for p′ we should consider the occurrences of p′ arising from the evaluation of these atoms as well.

⁸ We may consider a predicate symbol to be marked unknown either generally (e.g., because its meaning is not known by the user) or more specifically, wrt the formulas in which it appears.
⁹ In very unfortunate cases, the set of definitions RD may almost coincide with the whole intensional program.
¹⁰ We assume that to each predicate symbol p in L there corresponds just one atom, and hence we have one logical equivalence involving p in comp′(AD)inst, which may be thought of as the logical definition of p in the context of SIA and AD. We could also assume that an independent set of atoms corresponds to p. This would entail that in comp′(AD)inst there would be a distinct logical equivalence involving p for each such atom; therefore the idea of a single logical definition of p in the context of SIA and AD should be replaced by the idea of a logical definition of p in the context of a single intensional answer of SIA or statement of AD in which it appears. In this paper we stick to the first assumption; nevertheless the results shown here extend straightforwardly to the case where the second assumption is adopted.
¹¹ Single auxiliary definitions are defined just as elements of AD.

On the other hand, the formalization of the notion of set of auxiliary definitions above is sufficient to give a nice characterization of the pair <SIA, AD>, as shown below.

The intensional answers in SIA have the same status as queries, while the set AD of auxiliary definitions is an (open) program. How does the pair <SIA, AD> relate to the original notion of intensional answers? The pair <SIA, AD> can be interpreted as the implicit representation of the infinite set of all the intensional answers for Q(X) which can be inferred from the intensional answers in SIA using the axioms of comp′(AD)inst. Indeed, the pair <SIA, AD> may be thought of as representing the infinite set of all the formulas χij(X) (i = 1, ..., n; j = 1, 2, ...) such that
comp′(AD)inst |= ∀(χij(X) → Ai(X)). (2)
Note that χij(X) (j = 1, 2, ...) are intensional answers to Ai(X) wrt the intensional program AD. By definition of a set of auxiliary definitions, the following holds:
comp′(IDB) |= comp′(AD)inst. (3)
From (2) and (3) we get
comp′(IDB) |= ∀(χij(X) → Ai(X)). (4)
Now, for Ai(X) we have
comp′(IDB) |= ∀(Ai(X) → Q(X)). (5)
Hence, from (4) and (5),
comp′(IDB) |= ∀(χij(X) → Q(X)), (6)
that is, χij(X) (i = 1, ..., n; j = 1, 2, ...) are intensional answers to Q(X) wrt KB.
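As an illustration of the chain (2) to (6), consider the collateral-line example of this section. The instance below is a worked sketch under the assumption that SIA contains the single answer A₁ and that AD consists of the two IDB clauses for ancestor; the particular formulas χ₁₁ and χ₁₂ are chosen here for illustration and are only two members of the infinite set represented by the pair.

```latex
% Assumed instance: Q(x,y) = collateral_line_relative(x,y),
% S_IA = {A_1}, AD = the two clauses defining ancestor.
\begin{align*}
A_1(x,y)       &= \exists z\,(\mathit{ancestor}(x,z) \wedge \mathit{ancestor}(y,z)),\\
\chi_{11}(x,y) &= \exists z\,(\mathit{parent}(x,z) \wedge \mathit{parent}(y,z)),\\
\chi_{12}(x,y) &= \exists z\,\exists z'\,(\mathit{parent}(x,z') \wedge \mathit{parent}(z',z)
                  \wedge \mathit{parent}(y,z)).
\end{align*}
% comp'(AD) contains
%   \forall x\,\forall y\,(ancestor(x,y) \leftrightarrow parent(x,y)
%        \vee \exists z\,(parent(x,z) \wedge ancestor(z,y))),
% so parent(u,v) implies ancestor(u,v), and
% parent(u,w) together with parent(w,v) implies ancestor(u,v). Hence, as in (2):
\begin{align*}
\mathit{comp}'(AD)_{inst} &\models \forall\,(\chi_{1j}(x,y) \rightarrow A_1(x,y)), \quad j = 1, 2,\\
\intertext{and, since $\mathit{comp}'(IDB) \models \forall\,(A_1(x,y) \rightarrow Q(x,y))$ as in (5),}
\mathit{comp}'(IDB) &\models \forall\,(\chi_{1j}(x,y) \rightarrow Q(x,y)), \quad j = 1, 2.
\end{align*}
```

Read informally, χ₁₁ says that x and y share a parent, and χ₁₂ that a grandparent of x is a parent of y; both are sufficient conditions for being collateral-line relatives.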
Turning to the problem of how to compute a set of auxiliary definitions, assuming IDB to be normal, it can be shown that a PE Π of AL in IDB, obtained from an LIDB-compatible SLDNF-tree, and such that SIA ∪ Π is AL-closed, is a set of auxiliary definitions for SIA wrt L. When AD is computed by PE, unfolding the intensional answers in SIA using program clauses in AD leads to new sets of intensional answers SIA′ which preserve completeness and procedural completeness, as the following theorem shows.

Theorem 7 Let SIA be a complete and procedurally complete set of intensional answers, and AD a set of auxiliary definitions for SIA wrt L obtained as a PE of AL. Then every set SIA′ of intensional answers derived by SLDNF-resolution from SIA using program clauses in AD is complete and procedurally complete.

Sketch of the proof By the sub-derivation lemma in [LS91], and Lemma 4.12 in [LS91], it follows that an SLDNF-tree built using resultants in AD can be expanded into an SLDNF-tree built using only program clauses in IDB. Now, consider the query given by the disjunction of the intensional answers in SIA, and let ans(X) be the query introduced by its transformation into normal form. It can be shown that every SIA′ can be computed as a PE of ans(X) in the transformed intensional program. Hence, by Theorem 5 and Theorem 6, the thesis follows. ✷

Now that we have characterized the notion of a set of auxiliary definitions, we can employ it to clarify the idea presented at the beginning of the section. We answer a query with a set SIA of intensional answers and a set RD of auxiliary definitions for the recursive predicate symbols marked unknown appearing in SIA or in RD itself. Notice that, by Theorem 7, if an auxiliary definition D ∈ RD of some predicate symbol p is in fact not recursive, then (assuming, for now, that p does not occur in a negative literal) we may unfold the corresponding positive literals in SIA and RD, and drop D from RD.
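Deciding which definitions must stay in RD amounts to finding the predicate symbols that lie on a loop of the dependency graph. The following is a minimal sketch under simplifying assumptions that are not in the paper: clauses are abstracted to (head predicate, body predicates), and the IDB definitions themselves are used as auxiliary definitions (which, as noted above, is always admissible). The names `recursive_predicates` and `auxiliary_predicates` are illustrative.

```python
def recursive_predicates(program):
    """Predicate symbols that appear in a loop of the dependency graph."""
    graph = {}
    for head, body in program:
        graph.setdefault(head, set()).update(body)

    def reachable(start):
        seen, stack = set(), list(graph.get(start, ()))
        while stack:
            p = stack.pop()
            if p not in seen:
                seen.add(p)
                stack.extend(graph.get(p, ()))
        return seen

    return {p for p in graph if p in reachable(p)}


def auxiliary_predicates(answer_preds, program, unknown):
    """Symbols needing a definition in RD: recursive, marked unknown, and
    occurring in the answers or in a definition already placed in RD."""
    recursive = recursive_predicates(program)
    rd, todo = set(), list(answer_preds)
    while todo:
        p = todo.pop()
        if p in recursive and p in unknown and p not in rd:
            rd.add(p)
            for head, body in program:
                if head == p:
                    todo.extend(body)      # look inside the definition just added
    return rd


# The collateral-line fragment of this section, abstracted to predicate symbols.
idb = [
    ("collateral_line_relative", ["ancestor", "ancestor"]),
    ("ancestor", ["parent"]),
    ("ancestor", ["parent", "ancestor"]),
]
rec = recursive_predicates(idb)
rd_if_unknown = auxiliary_predicates({"ancestor"}, idb, unknown={"ancestor"})
rd_if_known = auxiliary_predicates({"ancestor"}, idb, unknown=set())
```

When `ancestor` is marked unknown its definition is returned with the answer; when the user already knows it, RD stays empty and the first answer set of the example suffices.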
An algorithm to compute SIA and RD, based on partial evaluation, can be adapted from the one in [BL90]. The underlying idea is to build the set of atoms to be partially evaluated at run time, while computing SIA and RD.

5 More about negation

The notion of PE is directly derived from the notion of SLDNF-tree. Therefore, negation during the PE process is treated in a somewhat limited way. In fact:
1. A negative literal can be selected only if it is ground.
2. If a ground negative literal is selected, then it is either completely evaluated (if possible), or not evaluated at all.

Similarly to what is proposed in the literature on partial evaluation (e.g., [BL90]), we can generate an answer to a query W constituted by a set SIA of intensional answers and a set of auxiliary definitions for predicate symbols marked unknown occurring in the answer, partitioned into two subsets RD and ND. RD concerns recursive predicate symbols occurring in either positive or negative literals of the answer, whereas ND concerns the non-recursive ones occurring in negative literals. Supposing IDB and W to be normal, partial evaluation can be used to generate such an answer. Actually, the algorithm mentioned in the previous section can easily be modified to compute SIA, RD and ND.

The problem with such an approach is that, in the formulas of SIA, RD and ND, the “interactions” (i.e., possible simplifications) between the information in the positive literals and that in the negative literals are lost, because the latter is embedded in separate definitions. We need to recover such interactions if the answer is to be effective.

Now, for each predicate symbol p in a negative literal there is an auxiliary definition in ND to which corresponds a logical equivalence in comp′(ND)inst of the form:
∀X(p(T(X)) ↔ ∃Y(F(X, Y))), (7)
where T(X) denotes a tuple of terms, X the variables therein, and Y the variables, other than those in X, which are free in F.
We may negate both sides of such an equivalence, getting:
    ∀X(∼p(T(X)) ↔ ∼∃Y(F(X, Y))).   (8)
The literals of SIA ∪ RD ∪ ND in which p occurs must be instances of p(T(X)), so we may replace them with the proper instances of the right-hand side of (7) or (8). Obviously, when such an expansion of a negative literal is applied, the formulas obtained are logically equivalent to the original ones, but they may not be procedurally equivalent. Hence, while no correct answers are lost or gained, the same is not true, in general, for the computed answers.
The idea of negating both sides of the completed definitions and replacing the negative literals by the right-hand sides of the equivalences obtained is related to constructive negation ([Chan88], [Chan89], [Przymusinski89]), and has been used to treat negation during the partial evaluation process in [CW89]. Here we want to apply such a treatment of negation off-line wrt the partial evaluation process, so as to retain the notions and the results in [LS91]. Moreover, our aim is to expand the negative literals in such a way as not to lose computed answers.
We now present a method for such an expansion. For every formula in SIA and RD we recursively apply expansion steps, defined by the following sequence of transformations, until no more expansion steps are possible.
1. We substitute atoms in the positive and negative literals of the formula with the right-hand sides of the corresponding instances of the completed definitions in comp′(ND)inst.
2. Equalities in the formula are treated as follows.
   (a) We replace equalities whose terms unify by the equalities corresponding to their mgu (if the mgu is the empty substitution then the equality is eliminated), and we eliminate the conjunctions in which there is an equality whose terms do not unify. The result of such a transformation is logically equivalent to the original formula, by Clark's Lemma (cf. [Clark78], also Lemma 15.2 in [Lloyd87]).
   (b) We eliminate the equalities in which one of the terms is an existentially quantified variable, by means of the following logical equivalence: ∃y((x = y) ∧ B) ↔ B{y/x}.
3. We push the (existential) quantifiers to the right as much as possible, eliminating the redundant ones.
4. We move negation all the way inward, stopping in front of the existential quantifiers, by means of the usual logical equivalences.
A few things must be pointed out. First, a formula resulting from the above process is logically equivalent to the original one. Second, such a process always terminates, since the definitions in ND are non-recursive. Third, at the end of such a process, ND is no longer needed and can be eliminated. Furthermore, although we do not yet have a complete proof, it seems that a kind of procedural containment holds, that is, if G is a goal and G′ the goal resulting from processing G as above, then for every extensional program EDB:
1. If IDB ∪ EDB ∪ {G} has an SLDNF-refutation with computed answer θ, then so does IDB ∪ EDB ∪ {G′}.
2. If IDB ∪ EDB ∪ {G} has a finitely failed SLDNF-tree, then so does IDB ∪ EDB ∪ {G′}.
Notice that, if the intensional program of the knowledge base is not a normal program, then by normalizing it with Lloyd & Topor's transformations in order to apply partial evaluation, we introduce new predicate symbols¹² that are obviously unknown (i.e., they are meaningless to the user). By the method sketched here, such predicate symbols can always be replaced by a meaningful formula.
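Step 4 of the expansion procedure can be illustrated with a small Python sketch. It is not the paper's implementation: the tuple encoding of formulas and the names `atom`, `neg` and `push_negation` are assumptions made for illustration only. The function applies double-negation elimination and De Morgan's laws, and leaves a negation in place when it reaches an existential quantifier, as the step prescribes.

```python
def atom(name, *args):
    """Build an atomic formula, e.g. atom("np2", "x")."""
    return ("atom", name, args)


def neg(f):
    """Build the negation of formula f."""
    return ("not", f)


def push_negation(f):
    """Move negation inward, stopping in front of existential quantifiers.

    Formulas are nested tuples (an encoding assumed for this sketch):
      ("atom", name, args), ("not", F), ("and", F, G), ("or", F, G),
      ("exists", var, F)
    """
    tag = f[0]
    if tag == "not":
        g = f[1]
        if g[0] == "not":                      # ~~F  ==>  F
            return push_negation(g[1])
        if g[0] == "and":                      # ~(F & G)  ==>  ~F | ~G
            return ("or", push_negation(neg(g[1])), push_negation(neg(g[2])))
        if g[0] == "or":                       # ~(F | G)  ==>  ~F & ~G
            return ("and", push_negation(neg(g[1])), push_negation(neg(g[2])))
        if g[0] == "exists":                   # stop here, but normalize inside
            return neg(("exists", g[1], push_negation(g[2])))
        return f                               # negated atom: nothing to do
    if tag in ("and", "or"):
        return (tag, push_negation(f[1]), push_negation(f[2]))
    if tag == "exists":
        return ("exists", f[1], push_negation(f[2]))
    return f


# ~unhappy(x), once unhappy(x) has been replaced by ~np2(x), is ~~np2(x).
simplified = push_negation(neg(neg(atom("np2", "x"))))
```

In this encoding `simplified` is the plain atom np2(x). Universal quantifiers and implications are omitted because, after normalization, the formulas handled by the procedure contain only the connectives listed in the docstring.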
¹² New predicate symbols are introduced to eliminate the negated existentially quantified (universally quantified) formulas.
Example Consider the following intensional program IDB:
    should_visit(x, y) ← serves(y, z) ∧ likes(x, z)
    happy(x) ← frequents(x, y) ∧ should_visit(x, y)
    very_happy(x) ← ∀y(frequents(x, y) → should_visit(x, y))
    unhappy(x) ← ∀y(frequents(x, y) → ∼should_visit(x, y)),
the following extensional program EDB (schema):
    frequents(DRINKER, PUB)
    serves(PUB, BEER)
    likes(DRINKER, BEER),
and the query "Who are the drinkers that are neither unhappy nor very happy?", that is:
    ← ∼unhappy(x) ∧ ∼very_happy(x).
First notice that the last two statements must be transformed into normal form:
    very_happy(x) ← ∼np1(x)
    np1(x) ← frequents(x, y) ∧ ∼should_visit(x, y)
    unhappy(x) ← ∼np2(x)
    np2(x) ← frequents(x, y) ∧ should_visit(x, y).
The only possible set of intensional answers computed by the basic method is the one consisting of the query itself. To it we may add the following set ND of auxiliary definitions:
    very_happy(x) ← ∼np1(x)
    np1(x) ← frequents(x, y) ∧ ∼should_visit(x, y)
    unhappy(x) ← ∼np2(x)
    np2(x) ← frequents(x, y) ∧ serves(y, z) ∧ likes(x, z).
Now we proceed to the expansions.
We expand (in parallel, for the sake of brevity) both ∼unhappy(x) and ∼very_happy(x):
    ∼unhappy(x) ∧ ∼very_happy(x)   (original goal)
    np2(x) ∧ np1(x)   (first expansion step)
    ∃y(frequents(x, y) ∧ should_visit(x, y)) ∧ ∃y(frequents(x, y) ∧ ∼∃z(serves(y, z) ∧ likes(x, z)))   (second expansion step)
    ∃y(frequents(x, y) ∧ ∃z(serves(y, z) ∧ likes(x, z))) ∧ ∃y(frequents(x, y) ∧ ∼∃z(serves(y, z) ∧ likes(x, z)))   (third expansion step)
The last formula is a nice intensional answer, i.e., "The drinkers who visit both a pub where a beer they like is served and a pub where no beer they like is served." ✷
6 Conclusions
In this paper we have presented a set of tools, based on PE, to generate intensional answers in the SLDNF-resolution framework, allowing function symbols, recursion, and negation. The results stated on the application of PE techniques to the generation of intensional answers and auxiliary definitions do not refer to any particular PE. It would be interesting to investigate ways of choosing a PE that are specific to intensional answering, such as heuristics that make the resulting intensional answers more "intuitive", or selection rules that use integrity constraints to prune away inconsistent goals.
Regardless of the PE chosen, the PE process tends to destroy the structure of the program to which it is applied. There is, however, no reason to preserve the structure of the original program: such a structure is normally hidden from the user, and is too general, in the sense that it does not reflect the particular query asked. Nevertheless, if the structure of the user's knowledge is at hand, it could be used to re-express the intensional answers in a language that is more familiar to the user. Hence, another issue to explore is the use of additional components, usually considered for modelling structural aspects of a knowledge base (e.g., taxonomies and integrity constraints), to improve the quality of the intensional answers.
Finally, our work may be considered a first step toward a program transformation approach to intensional answering, and it could be naturally extended using other program transformation techniques. Moreover, such an approach can be applied to other kinds of non-conventional query answering. For instance, PE can be used for both "Knowledge query answering" [MY90] and, adding folding techniques, "Intelligent query answering" [Imielinski87].
Acknowledgements
I am grateful to J. W. Lloyd, who supervised me during the early stages of this research, and to M. Lenzerini, who gave me precious advice and supported me throughout the work.
References
[BL90] K. Benkerimi and J. W. Lloyd. A Partial Evaluation Procedure for Logic Programs. In Proc. of North American Conf. on Logic Programming, S. K. Debray and M. Hermenegildo eds., pp. 343-358, Austin, MIT Press, 1990.
[Clark78] K. L. Clark. Negation as Failure. In Logic and Data Bases, H. Gallaire and J. Minker eds., pp. 293-322, Plenum Press, 1978.
[CD86] L. Cholvy and R. Demolombe. Querying a Rule Base. In Proc. 1st Int. Conf. on Expert Database Systems, pp. 365-371, Charleston, South Carolina, April 1986.
[CD88] F. Cuppens and R. Demolombe. Cooperative Answering: A Methodology to Provide Intelligent Access to Databases. In Proc. 2nd Int. Conf. on Expert Database Systems, pp. 333-353, Tysons Corner, Virginia, April 1988.
[Chan88] D. Chan. Constructive Negation Based on the Completed Database. In Proc. of 5th International Conference and Symposium on Logic Programming, R. A. Kowalski and K. A. Bowen eds., pp. 111-125, MIT Press, 1988.
[Chan89] D. Chan. An Extension of Constructive Negation and its Application in Coroutining. In Proc. of North American Conf. on Logic Programming, E. Lusk and R. Overbeek eds., pp. 477-493, MIT Press, 1989.
[Corella84] F. Corella. Semantic Retrieval and Levels of Abstraction. In Proc. 1st Int. Workshop on Expert Database Systems, pp. 397-420, Kiawah Island, South Carolina, October 1984.
[CW89] D. Chan and M. Wallace. A Treatment of Negation During Partial Evaluation. In Meta-Programming in Logic Programming, H. D. Abramson and M. H. Rogers eds., pp. 299-317, MIT Press, 1989. (Proc. Meta88).
[Demolombe92] R. Demolombe. A Strategy for the Computation of Conditional Answers. In Proc. ECAI'92, to appear.
[DFI91] R. Demolombe, L. Farinas del Cerro, T. Imielinski (eds.). Proc. Workshop on Nonstandard Queries and Answers, Toulouse, France, July 1991.
[DeGiacomo92] G. De Giacomo. Intensional Query Answering by Partial Evaluation. Technical Report, Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza". In preparation.
[Imielinski87] T. Imielinski. Intelligent Query Answering in Rule Based Systems. The Journal of Logic Programming, 4(3):229-257, September 1987.
[Lloyd87] J. W. Lloyd. Foundations of Logic Programming (2nd edition). Springer-Verlag, 1987.
[LS91] J. W. Lloyd and J. C. Shepherdson. Partial Evaluation in Logic Programming. The Journal of Logic Programming, 11(3&4):217-242, October/November 1991.
[LT84] J. W. Lloyd and R. W. Topor. Making Prolog More Expressive. The Journal of Logic Programming, 1(3):225-240, 1984.
[Motro89] A. Motro. Using Integrity Constraints to Provide Intensional Answers to Relational Queries. In Proc. 15th Int. Conf. on Very Large Data Bases, pp. 237-246, Amsterdam, August 1989.
[Motro91] A. Motro. Intensional Answers to Database Queries. Technical Report, Department of Information and Software Systems Engineering, George Mason University, Fairfax, Virginia, 1991.
[MY90] A. Motro and Q. Yuan. Querying Database Knowledge. In Proc. of ACM SIGMOD-90, pp. 173-183, 1990.
[PR89] A. Pirotte and D. Roelants. Constraints for Improving the Generation of Intensional Answers in a Deductive Database. In Proc. 5th Int. Conf. on Data Engineering, pp. 652-659, Los Angeles, California, February 1989.
[PRZ91] A. Pirotte, D. Roelants, E. Zimanyi. Controlled Generation of Intensional Answers. IEEE Trans. on Knowledge and Data Engineering, Vol. 3, No. 2, pp. 221-236, June 1991.
[Przymusinski89] T. C. Przymusinski. On Constructive Negation in Logic Programming. In Proc. of North American Conf. on Logic Programming, E. Lusk and R. Overbeek eds., pp. 1-19 (addendum), MIT Press, 1989.
[SM88] C. Shum and R. Muntz. Implicit Representation for Extensional Answers. In Proc. 2nd Int. Conf. on Expert Database Systems, pp. 257-273, Tysons Corner, Virginia, April 1988.
[vanHarmelen89] F. van Harmelen. The Limitations of Partial Evaluation. In Logic-Based Knowledge Representation, P. Jackson, H. Reichgelt, F. van Harmelen eds., pp. 87-111, MIT Press, 1989.