Rough sets theory for multicriteria decision analysis

Benedetto Matarazzo

2001, European Journal of Operational Research


European Journal of Operational Research 129 (2001) 1–47
www.elsevier.com/locate/dsw

Invited Review

Rough sets theory for multicriteria decision analysis

Salvatore Greco a, Benedetto Matarazzo a, Roman Slowinski b,*

a Faculty of Economics, University of Catania, Corso Italia 55, 95129 Catania, Italy
b Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 3a, 60-965 Poznan, Poland

Received 1 July 1999; accepted 22 December 1999

Abstract

The original rough set approach has proved to be very useful in dealing with inconsistency problems following from information granulation. It operates on a data table composed of a set U of objects (actions) described by a set Q of attributes. Its basic notions are: the indiscernibility relation on U, lower and upper approximation of either a subset or a partition of U, dependence and reduction of attributes from Q, and decision rules derived from lower approximations and boundaries of subsets identified with decision classes. The original rough set idea fails, however, when preference orders of attribute domains (criteria) are to be taken into account. More precisely, it cannot handle inconsistencies following from violation of the dominance principle. This inconsistency is characteristic of preferential information used in multicriteria decision analysis (MCDA) problems, like sorting, choice or ranking. In order to deal with this kind of inconsistency, a number of methodological changes to the original rough sets theory are necessary. The main change is the substitution of the indiscernibility relation by a dominance relation, which permits approximation of ordered sets in multicriteria sorting. To approximate preference relations in multicriteria choice and ranking problems, another change is necessary: substitution of the data table by a pairwise comparison table, where each row corresponds to a pair of objects described by binary relations on particular criteria.
In all those MCDA problems, the new rough set approach ends with a set of decision rules playing the role of a comprehensive preference model. It is more general than the classical functional or relational model and it is more understandable for the users because of its natural syntax. In order to work out a recommendation in one of the MCDA problems, we propose exploitation procedures of the set of decision rules. Finally, some other recently obtained results are given: rough approximations by means of similarity relations, rough set handling of missing data, comparison of the rough set model with Sugeno and Choquet integrals, and results on equivalence of a decision rule preference model and a conjoint measurement model which is neither additive nor transitive. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Multicriteria decision analysis; Rough sets; Classification; Sorting; Choice; Ranking; Decision rules; Conjoint measurement

* Corresponding author. Tel.: +48-61-6652-375; fax: +48-61-8771-525. E-mail address: [email protected] (R. Slowinski).
0377-2217/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.
PII: S0377-2217(00)00167-3

1. Introduction

The rough sets theory introduced by Pawlak (1982, 1991) has often proved to be an excellent mathematical tool for the analysis of a vague description of objects (called actions in decision problems). The adjective vague, referring to the quality of information, means inconsistency or ambiguity which follows from information granulation. The rough sets philosophy is based on the assumption that with every object of the universe there is associated a certain amount of information (data, knowledge), expressed by means of some attributes used for object description. Objects having the same description are indiscernible (similar) with respect to the available information.
The indiscernibility relation thus generated constitutes a mathematical basis of the rough sets theory; it induces a partition of the universe into blocks of indiscernible objects, called elementary sets, that can be used to build knowledge about a real or abstract world. The use of the indiscernibility relation results in information granulation.

Any subset X of the universe may be expressed in terms of these blocks either precisely (as a union of elementary sets) or approximately only. In the latter case, the subset X may be characterized by two ordinary sets, called lower and upper approximations. A rough set is defined by means of these two approximations, which coincide in the case of an ordinary set. The lower approximation of X is composed of all the elementary sets included in X (whose elements, therefore, certainly belong to X), while the upper approximation of X consists of all the elementary sets which have a non-empty intersection with X (whose elements, therefore, may belong to X). Obviously, the difference between the upper and lower approximation constitutes the boundary region of the rough set, whose elements cannot be characterized with certainty as belonging or not to X, using the available information. The information about objects from the boundary region is, therefore, inconsistent or ambiguous. The cardinality of the boundary region states, moreover, to what extent it is possible to express X in exact terms, on the basis of the available information. For this reason, this cardinality may be used as a measure of vagueness of the information about X.
The rough sets theory, dealing with representation and processing of vague information, presents a series of intersections and complements with respect to many other theories and mathematical techniques handling imperfect information, like probability theory, the Dempster–Shafer theory of evidence, fuzzy sets theory, discriminant analysis and mereology (see Dubois and Prade, 1990, 1992; Krusinska et al., 1992; Pawlak, 1985a,b; Polkowski and Skowron, 1994; Skowron and Grzymala-Busse, 1994; Slowinski, 1995).

Some important characteristics of the rough set approach make it a particularly interesting tool in a number of problems and concrete applications. With respect to the input information, it is possible to deal with both quantitative and qualitative data, and inconsistencies need not be removed prior to the analysis. With reference to the output information, it is possible to acquire a posteriori information regarding the relevance of particular attributes and their subsets to the quality of approximation considered in the problem at hand, without any additional inter-attribute preference information. Moreover, the final result in the form of "if ..., then ..." decision rules, using the most relevant attributes, is easy to interpret.

Several attempts have already been made to use the rough sets theory for decision support (Pawlak and Slowinski, 1994; Slowinski, 1993b). The original rough set approach is not able, however, to deal with preference-ordered attribute domains and decision classes. Solving this problem was crucial for application of the rough set approach to multicriteria decision analysis (MCDA).

Why does this application seem so important? The answer is connected with the nature of the input preferential information available in MCDA and of the output of the analysis. As to the input, the rough set approach requires a set of examples, which is also convenient for acquisition of preferential information from decision makers (DMs).
Very often in MCDA, this information has to be given in terms of preference model parameters, like importance weights, substitution ratios and various thresholds. Giving such information requires a great cognitive effort of the DM. It is generally acknowledged that people prefer to make exemplary decisions rather than to explain them in terms of specific parameters. For this reason, the idea of inferring preference models from exemplary decisions provided by the DM is very attractive. Furthermore, the exemplary decisions may be inconsistent because of limited discriminatory power of criteria and because of hesitation of the DM (see, e.g., Roy, 1989). These inconsistencies cannot be considered as a simple error or noise. They can convey important information that should be taken into account in the construction of the DM's preference model. The rough set approach is intended to deal with inconsistency and this is another argument for its application to MCDA. Finally, the output of the analysis, i.e. the model of preferences in terms of decision rules, seems very convenient for decision support because it is transparent and speaks the same language as the DM.

Let us explain briefly why the original rough set approach is not able to deal with inconsistencies coming from consideration of criteria, i.e. attributes with preference-ordered domains (scales), like product quality, market share, debt ratio. Consider, for example, two firms, A and B, evaluated for assessment of bankruptcy risk by a set of criteria including the "debt ratio" (total debt/total assets). If firm A has a low value while firm B has a high value of the debt ratio, and evaluations of these firms on other attributes are equal, then, from the bankruptcy risk point of view, firm A dominates firm B. Suppose, however, that firm A has been assigned by a DM to a class of higher risk than firm B.
This is obviously inconsistent with the dominance principle. Within the original rough set approach, the two firms will be considered as just discernible and no inconsistency will be stated.

For this reason, Greco et al. (1995, 1997a, 1998, 1999c,l) have proposed an extension of the rough sets theory that is able to deal with inconsistencies typical of exemplary decisions in MCDA problems. This innovation is mainly based on substitution of the indiscernibility relation by a dominance relation in the rough approximation of decision classes. An important consequence of this fact is the possibility of inferring from exemplary decisions the preference model in terms of decision rules, being logical statements of the type "if ..., then ...". The separation of certain and doubtful knowledge about the DM's preferences is done by distinguishing different kinds of decision rules, depending on whether they are induced from lower approximations of decision classes or from the boundaries of these classes, composed of inconsistent examples that do not observe the dominance principle. Such a preference model is more general than the classical functional models considered within Multi-Attribute Utility Theory (MAUT) or relational models considered, for example, in outranking methods.

The paper is organized as follows. In Section 2, a general view of the rough set approach is given. In Section 3, two extensions of the classical rough set approach based on generalizations of the basic concept of indiscernibility are presented: the first is the similarity relation, being only reflexive and not necessarily symmetric and transitive; the second is a specific indiscernibility relation handling missing values in objects' description, which is transitive but neither reflexive nor symmetric. In Section 4, we introduce a distinction between classification and sorting problems. The sorting problem involves preference orders on domains of considered attributes (criteria) and among decision classes.
To deal with multicriteria sorting problems, a rough set approximation based on dominance is proposed in this section. Furthermore, in order to handle missing values in multicriteria sorting problems, a specific dominance relation is proposed. In Section 5, choice and ranking problems are considered. They are based on pairwise comparisons of objects, so the rough set approach concerns in this case the approximation of a binary preference relation by specific dominance relations. These dominance relations can be multigraded, when the preferences with respect to the considered criteria are cardinal, or without any degree of preference, when the preferences with respect to the criteria are ordinal. Section 6 presents some results about the equivalence between preference models of conjoint measurement and preference models expressed in terms of decision rules induced from rough approximations. Section 7 groups conclusions.

2. A general view of rough sets

2.1. Data table and indiscernibility relation

For algorithmic reasons, the information regarding the objects is supplied in the form of a data table, whose separate rows refer to distinct objects (actions), and whose columns refer to different attributes considered. Each cell of this table indicates an evaluation (quantitative or qualitative) of the object placed in that row by means of the attribute in the corresponding column.

Formally, a data table is the 4-tuple $S = \langle U, Q, V, f \rangle$, where $U$ is a finite set of objects (universe), $Q = \{q_1, q_2, \ldots, q_m\}$ is a finite set of attributes, $V_q$ is the domain of the attribute $q$, $V = \bigcup_{q \in Q} V_q$ and $f : U \times Q \to V$ is a total function such that $f(x, q) \in V_q$ for each $q \in Q$, $x \in U$, called information function.

Therefore, each object $x$ of $U$ is described by a vector (string)

\[ \mathrm{Des}_Q(x) = [f(x, q_1), f(x, q_2), \ldots, f(x, q_m)], \]

called description of $x$ in terms of the evaluations of the attributes from $Q$; it represents the available information about $x$.

To every (non-empty) subset of attributes $P$ is associated an indiscernibility relation on $U$, denoted by $I_P$:

\[ I_P = \{(x, y) \in U \times U : f(x, q) = f(y, q) \ \forall q \in P\}. \]

If $(x, y) \in I_P$, it is said that the objects $x$ and $y$ are P-indiscernible. Clearly, the indiscernibility relation thus defined is an equivalence relation (reflexive, symmetric and transitive). The family of all the equivalence classes of the relation $I_P$ is denoted by $U|I_P$ and the equivalence class containing an element $x \in U$ by $I_P(x)$. The equivalence classes of the relation $I_P$ are called P-elementary sets. If $P = Q$, the Q-elementary sets are called atoms.

2.2. Approximations

Let $S$ be a data table, $X$ a non-empty subset of $U$ and $\emptyset \neq P \subseteq Q$. The P-lower approximation and the P-upper approximation of $X$ in $S$ are defined, respectively, by:

\[ \underline{P}(X) = \{x \in U : I_P(x) \subseteq X\}, \qquad \overline{P}(X) = \bigcup_{x \in X} I_P(x). \]

The elements of $\underline{P}(X)$ are all and only those objects $x \in U$ which belong to the equivalence classes generated by the indiscernibility relation $I_P$, contained in $X$; the elements of $\overline{P}(X)$ are all and only those objects $x \in U$ which belong to the equivalence classes generated by the indiscernibility relation $I_P$, containing at least one object $x$ belonging to $X$. In other words, $\underline{P}(X)$ is the largest union of the P-elementary sets included in $X$, while $\overline{P}(X)$ is the smallest union of the P-elementary sets containing $X$.
· The P-boundary of $X$ in $S$, denoted by $Bn_P(X)$, is: $Bn_P(X) = \overline{P}(X) - \underline{P}(X)$.
· The following relation holds: $\underline{P}(X) \subseteq X \subseteq \overline{P}(X)$.

Therefore, if an object $x$ belongs to $\underline{P}(X)$, it is certainly also an element of $X$, while if $x$ belongs to $\overline{P}(X)$, it may belong to the set $X$. $Bn_P(X)$ constitutes the "doubtful region" of $X$: nothing can be said with certainty about the belonging of its elements to the set $X$.

The following relation, called complementarity property, is satisfied: $\underline{P}(X) = U - \overline{P}(U - X)$.
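
As a minimal sketch of the definitions above (an illustration added here, not code from the paper), the partition $U|I_P$ and the two approximations can be computed directly from a data table. The dictionary `TABLE` transcribes Table 1 of Section 2.6; the function names `elementary_sets`, `lower_approximation` and `upper_approximation` are hypothetical helpers chosen for this illustration.

```python
from collections import defaultdict

# Table 1 of Section 2.6: students 1..6, attributes A1..A3, global evaluation A4.
TABLE = {
    1: {"A1": "good", "A2": "good", "A3": "bad", "A4": "good"},
    2: {"A1": "medium", "A2": "bad", "A3": "bad", "A4": "bad"},
    3: {"A1": "medium", "A2": "bad", "A3": "bad", "A4": "good"},
    4: {"A1": "bad", "A2": "bad", "A3": "bad", "A4": "bad"},
    5: {"A1": "medium", "A2": "good", "A3": "good", "A4": "bad"},
    6: {"A1": "good", "A2": "bad", "A3": "good", "A4": "good"},
}


def elementary_sets(table, attrs):
    """Return the partition U|I_P as a list of sets (the P-elementary sets)."""
    blocks = defaultdict(set)
    for x, row in table.items():
        # Objects with the same description on attrs fall into the same block.
        blocks[tuple(row[q] for q in attrs)].add(x)
    return list(blocks.values())


def lower_approximation(table, attrs, X):
    """Union of the P-elementary sets included in X."""
    return {x for block in elementary_sets(table, attrs) if block <= X for x in block}


def upper_approximation(table, attrs, X):
    """Union of the P-elementary sets having a non-empty intersection with X."""
    return {x for block in elementary_sets(table, attrs) if block & X for x in block}


P = ["A1", "A2", "A3"]
U = set(TABLE)
X = {x for x, row in TABLE.items() if row["A4"] == "good"}  # {1, 3, 6}

lower = lower_approximation(TABLE, P, X)
upper = upper_approximation(TABLE, P, X)
boundary = upper - lower

print(sorted(lower), sorted(upper), sorted(boundary))  # [1, 6] [1, 2, 3, 6] [2, 3]
# Complementarity property: lower(X) = U - upper(U - X)
print(lower == U - upper_approximation(TABLE, P, U - X))  # True
```

The printed sets agree with the values reported for $X = \{1, 3, 6\}$ in the example of Section 2.6. Grouping objects by their tuple of attribute values is only one possible way to build $U|I_P$; it avoids comparing every pair of objects.
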
If the P-boundary of $X$ is empty, $Bn_P(X) = \emptyset$, then the set $X$ is an ordinary (exact) set with respect to $P$, that is, it may be expressed as the union of a certain number of P-elementary sets; otherwise, if $Bn_P(X) \neq \emptyset$, the set $X$ is an approximate (rough) set with respect to $P$ and may be characterized by means of the approximations $\underline{P}(X)$ and $\overline{P}(X)$. The family of all the sets $X \subseteq U$ having the same P-lower and P-upper approximations is called a rough set.

The following ratio defines an accuracy of the approximation of $X$, $X \neq \emptyset$, by means of the attributes from $P$:

\[ \alpha_P(X) = \frac{|\underline{P}(X)|}{|\overline{P}(X)|}, \]

where $|Y|$ indicates the cardinality of a (finite) set $Y$. Obviously, $0 \le \alpha_P(X) \le 1$; if $\alpha_P(X) = 1$, $X$ is an ordinary (exact) set with respect to $P$; if $\alpha_P(X) < 1$, $X$ is a rough (vague) set with respect to $P$.

Another ratio defines a quality of the approximation of $X$ by means of the attributes from $P$:

\[ \gamma_P(X) = \frac{|\underline{P}(X)|}{|X|}. \]

The quality $\gamma_P(X)$ represents the relative frequency of the objects correctly classified by means of the attributes from $P$. Moreover, $0 \le \alpha_P(X) \le \gamma_P(X) \le 1$, and $\gamma_P(X) = 0$ iff $\alpha_P(X) = 0$, while $\gamma_P(X) = 1$ iff $\alpha_P(X) = 1$.

The definition of approximations of a subset $X \subseteq U$ can be extended to a classification, i.e. a partition $Y = \{Y_1, \ldots, Y_n\}$ of $U$. Subsets $Y_i$, $i = 1, \ldots, n$, are disjoint classes of $Y$. By P-lower (P-upper) approximation of $Y$ in $S$ we mean the sets $\underline{P}(Y) = \{\underline{P}(Y_1), \ldots, \underline{P}(Y_n)\}$ and $\overline{P}(Y) = \{\overline{P}(Y_1), \ldots, \overline{P}(Y_n)\}$, respectively. The coefficient

\[ \gamma_P(Y) = \frac{\sum_{i=1}^{n} |\underline{P}(Y_i)|}{|U|} \]

is called quality of the approximation of classification $Y$ by the set of attributes $P$, or in short, quality of classification. It expresses the ratio of all P-correctly classified objects to all objects in the system.

The main preoccupation of the rough sets theory is the approximation of subsets or partitions of $U$, representing knowledge about $U$, with other sets or partitions built up using available information about $U$.
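
The three ratios defined above can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the paper: `TABLE` transcribes Table 1 of Section 2.6, and the helper names (`accuracy`, `quality`, `quality_of_classification`) are assumptions made for the example. Exact fractions are used so that the results can be compared with hand calculations.

```python
from collections import defaultdict
from fractions import Fraction

# Table 1 of Section 2.6: students 1..6, attributes A1..A3, global evaluation A4.
TABLE = {
    1: {"A1": "good", "A2": "good", "A3": "bad", "A4": "good"},
    2: {"A1": "medium", "A2": "bad", "A3": "bad", "A4": "bad"},
    3: {"A1": "medium", "A2": "bad", "A3": "bad", "A4": "good"},
    4: {"A1": "bad", "A2": "bad", "A3": "bad", "A4": "bad"},
    5: {"A1": "medium", "A2": "good", "A3": "good", "A4": "bad"},
    6: {"A1": "good", "A2": "bad", "A3": "good", "A4": "good"},
}


def elementary_sets(table, attrs):
    """Partition U|I_P: objects with equal values on attrs share a block."""
    blocks = defaultdict(set)
    for x, row in table.items():
        blocks[tuple(row[q] for q in attrs)].add(x)
    return list(blocks.values())


def lower(table, attrs, X):
    return {x for b in elementary_sets(table, attrs) if b <= X for x in b}


def upper(table, attrs, X):
    return {x for b in elementary_sets(table, attrs) if b & X for x in b}


def accuracy(table, attrs, X):
    """alpha_P(X) = |lower| / |upper|; X must be non-empty."""
    return Fraction(len(lower(table, attrs, X)), len(upper(table, attrs, X)))


def quality(table, attrs, X):
    """gamma_P(X) = |lower| / |X|."""
    return Fraction(len(lower(table, attrs, X)), len(X))


def quality_of_classification(table, attrs, classes):
    """gamma_P(Y) = sum of |lower(Y_i)| over the classes, divided by |U|."""
    return Fraction(sum(len(lower(table, attrs, Y)) for Y in classes), len(table))


P = ["A1", "A2", "A3"]
good = {x for x, row in TABLE.items() if row["A4"] == "good"}  # {1, 3, 6}
bad = set(TABLE) - good                                         # {2, 4, 5}

alpha_good = accuracy(TABLE, P, good)
gamma_good = quality(TABLE, P, good)
gamma_partition = quality_of_classification(TABLE, P, [good, bad])
print(alpha_good, gamma_good, gamma_partition)  # 1/2 2/3 2/3
```

For the set of "good" students the sketch gives $\alpha_P(X) = 1/2 \le \gamma_P(X) = 2/3$, in line with the inequality $0 \le \alpha_P(X) \le \gamma_P(X) \le 1$; the quality of classification is $4/6$ because students 2 and 3 fall in the boundary of both classes.
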
From the viewpoint of a particular object $x \in U$, it may be interesting, however, to use the available information to assess the degree of its membership to a subset $X$ of $U$. The subset $X$ can be identified with a concept of knowledge to be approximated. Using the rough set approach one can calculate the membership function $\mu_X^P(x)$ (rough membership function) as

\[ \mu_X^P(x) = \frac{|X \cap I_P(x)|}{|I_P(x)|}. \]

The value of $\mu_X^P(x)$ may be interpreted analogously to conditional probability and may be understood as the degree of certainty (credibility) to which $x$ belongs to $X$. Observe that the value of the membership function is calculated from the available data, and not subjectively assumed, as is the case for membership functions of fuzzy sets.

Between the rough membership function and the approximations of $X$ the following relationships hold:

\[ \underline{P}(X) = \{x \in U : \mu_X^P(x) = 1\}, \qquad \overline{P}(X) = \{x \in U : \mu_X^P(x) > 0\}, \]
\[ Bn_P(X) = \{x \in U : 0 < \mu_X^P(x) < 1\}, \qquad \underline{P}(U - X) = \{x \in U : \mu_X^P(x) = 0\}. \]

In the rough sets theory there is, therefore, a close link between vagueness (granularity) connected with rough approximation of sets and uncertainty connected with rough membership of objects to sets.

2.3. Dependence and reduction of attributes

A very important concept for concrete applications is that of dependence of attributes. Intuitively, a set of attributes $T \subseteq Q$ totally depends on a set of attributes $P \subseteq Q$ (notation $P \to T$) if all the values of the attributes from $T$ are uniquely determined by the values of the attributes from $P$, that is, if a functional dependence exists between evaluations by the attributes from $P$ and by the attributes from $T$. In other words, the partition generated by the attributes from $P$ is at least as "fine" as that generated by the attributes from $T$, so that it is sufficient to use the attributes from $P$ to build the partition $U|I_T$. Formally, $T$ totally depends on $P$ iff $I_P \subseteq I_T$.
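
Two of the notions just introduced, the rough membership function of Section 2.2 and the inclusion test $I_P \subseteq I_T$, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `TABLE` transcribes Table 1 of Section 2.6, and `indiscernibility_class`, `rough_membership` and `totally_depends` are hypothetical names. The inclusion $I_P \subseteq I_T$ is checked through the equivalent condition $I_P(x) \subseteq I_T(x)$ for every $x \in U$.

```python
from fractions import Fraction

# Table 1 of Section 2.6: students 1..6, attributes A1..A3, global evaluation A4.
TABLE = {
    1: {"A1": "good", "A2": "good", "A3": "bad", "A4": "good"},
    2: {"A1": "medium", "A2": "bad", "A3": "bad", "A4": "bad"},
    3: {"A1": "medium", "A2": "bad", "A3": "bad", "A4": "good"},
    4: {"A1": "bad", "A2": "bad", "A3": "bad", "A4": "bad"},
    5: {"A1": "medium", "A2": "good", "A3": "good", "A4": "bad"},
    6: {"A1": "good", "A2": "bad", "A3": "good", "A4": "good"},
}


def indiscernibility_class(table, attrs, x):
    """I_P(x): all objects with the same values as x on every attribute in attrs."""
    return {y for y, row in table.items() if all(row[q] == table[x][q] for q in attrs)}


def rough_membership(table, attrs, X, x):
    """mu_X^P(x) = |X ∩ I_P(x)| / |I_P(x)|, computed from the data only."""
    block = indiscernibility_class(table, attrs, x)
    return Fraction(len(X & block), len(block))


def totally_depends(table, T, P):
    """P -> T iff every P-elementary set lies inside a T-elementary set."""
    return all(
        indiscernibility_class(table, P, x) <= indiscernibility_class(table, T, x)
        for x in table
    )


P = ["A1", "A2", "A3"]
X = {x for x, row in TABLE.items() if row["A4"] == "good"}  # {1, 3, 6}

memberships = {x: rough_membership(TABLE, P, X, x) for x in TABLE}
print(memberships[1], memberships[2], memberships[4])  # 1 1/2 0

print(totally_depends(TABLE, ["A3"], ["A1", "A2"]))  # True
print(totally_depends(TABLE, ["A4"], P))             # False
```

Students 2 and 3 obtain the membership value 1/2, which places them in the boundary, while the values 1 and 0 identify the lower approximations of $X$ and of $U - X$. In this table $A_3$ is determined by $\{A_1, A_2\}$, whereas the global evaluation $A_4$ does not totally depend on $P$ because students 2 and 3 share a description but differ on $A_4$.
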
Therefore, $T$ is totally (partially) dependent on $P$ if all (some) elements of the universe $U$ may be univocally assigned to classes of the partition $U|I_T$, using only the attributes from $P$.

Another issue of great practical importance is that of "superfluous" data in a data table. Superfluous data can be eliminated, in fact, without deteriorating the information contained in the original table.

Let $P \subseteq Q$ and $p \in P$. It is said that attribute $p$ is superfluous in $P$ if $I_P = I_{P - \{p\}}$; otherwise, $p$ is indispensable in $P$. The set $P$ is independent (orthogonal) if all its attributes are indispensable. The subset $P'$ of $P$ is a reduct of $P$ (denotation $\mathrm{Red}(P)$) if $P'$ is independent and $I_{P'} = I_P$.

A reduct of $P$ may also be defined with respect to an approximation of a partition $Y$ of $U$. It is then called Y-reduct of $P$ (denotation $\mathrm{Red}_Y(P)$) and specifies a minimal subset $P'$ of $P$ which keeps the quality of classification unchanged, i.e. $\gamma_{P'}(Y) = \gamma_P(Y)$. In other words, the attributes that do not belong to a Y-reduct of $P$ are superfluous with respect to the classification $Y$ of objects from $U$.

More than one Y-reduct (or reduct) of $P$ may exist in a data table. The set containing all the indispensable attributes of $P$ is known as the Y-core. Formally

\[ \mathrm{Core}_Y(P) = \bigcap \mathrm{Red}_Y(P). \]

Obviously, since the Y-core is the intersection of all the Y-reducts of $P$, it is included in every Y-reduct of $P$. It is the most important subset of attributes of $Q$, because none of its elements can be removed without deteriorating the quality of classification.

The calculation of all the reducts is fairly complex (see Bazan et al., 1994; Kryszkiewicz and Rybinski, 1996; Skowron and Rauszer, 1992; Susmaga, 1998). Nevertheless, in many practical applications it is not necessary to calculate all the reducts, but only some of them. For example, in Slowinski et al. (1988), the following heuristic procedure has been used to obtain the "most satisfactory" reduct.
Starting from single attributes, the one with the greatest quality of classification is chosen; then to the chosen attribute, another attribute is appended that gives the greatest increase to the quality of classification for the pair of attributes; then yet another attribute is appended to the pair giving the greatest increase to the quality of classification for the triple, and so on, until the maximal quality is reached by a subset of attributes. At the end of this procedure, it should be verified whether the obtained subset is minimal, i.e. whether no attribute can be eliminated from this subset without changing the quality. Then, for further analysis, it is often sufficient to take into consideration a reduced data table, where the set $Q$ of attributes is confined to the "most satisfactory" reduct.

2.4. Decision table and decision rules

If in a data table the attributes of set $Q$ are divided into condition attributes (set $C \neq \emptyset$) and decision attributes (set $D \neq \emptyset$), $C \cup D = Q$ and $C \cap D = \emptyset$, such a table is called a decision table. The decision attributes induce a partition of $U$ deduced from the indiscernibility relation $I_D$ in a way that is independent of the condition attributes. D-elementary sets are called decision classes. There is a tendency to reduce the set $C$ while keeping all important relationships between $C$ and $D$, in order to make decisions on the basis of a smaller amount of information. When the set of condition attributes is replaced by one of its reducts, the quality of approximation of the classification induced by the decision attributes does not deteriorate.

Since the tendency is to underline the functional dependencies between condition and decision attributes, a decision table may also be seen as a set of decision rules.
These are logical statements (implications) of the type "if ..., then ...", where the antecedent (condition part) specifies values assumed by one or more condition attributes (description of C-elementary sets) and the consequence (decision part) specifies an assignment to one or more decision classes (description of D-elementary sets). Therefore, the syntax of a rule is the following:

if $f(x, q_1)$ is equal to $r_{q_1}$ and $f(x, q_2)$ is equal to $r_{q_2}$ and ... $f(x, q_p)$ is equal to $r_{q_p}$, then $x$ belongs to $Y_{j_1}$ or $Y_{j_2}$ or ... $Y_{j_k}$,

where $\{q_1, q_2, \ldots, q_p\} \subseteq C$, $(r_{q_1}, r_{q_2}, \ldots, r_{q_p}) \in V_{q_1} \times V_{q_2} \times \cdots \times V_{q_p}$ and $Y_{j_1}, Y_{j_2}, \ldots, Y_{j_k}$ are some decision classes of the considered classification (D-elementary sets). If the consequence is univocal, i.e. $k = 1$, then the rule is exact, otherwise it is approximate or ambiguous.

An object $x \in U$ supports decision rule $r$ if its description matches both the condition part and the decision part of the rule. We also say that decision rule $r$ covers object $x$ if it matches at least the condition part of the rule. Each decision rule is characterized by its strength, defined as the number of objects supporting the rule. In the case of approximate rules, the strength is calculated for each possible decision class separately.

Let us observe that exact rules are supported only by objects from the lower approximation of the corresponding decision class. Approximate rules are supported, in turn, only by objects from the boundaries of the corresponding decision classes.

Procedures for generation of decision rules from a decision table use an inductive learning principle. The objects are considered as examples of decisions. In order to induce decision rules with a unique consequent assignment to a D-elementary set, the examples belonging to the D-elementary set are called positive and all the others negative. A decision rule is discriminant if it is consistent, i.e. distinguishes positive examples from negative ones, and minimal, i.e.
removing any attribute from the condition part gives a rule covering also negative objects. It may also be interesting to look for partly discriminant rules. These are rules that, besides positive examples, could cover a limited number of negative ones. They are characterized by a coefficient, called level of confidence, telling to what extent the rule is consistent, i.e. what is the ratio of the number of positive examples (supporting the rule) to the number of all examples covered by the rule.

Generation of decision rules from decision tables is a complex task and a number of procedures have been proposed to solve it (see, for example, Grzymala-Busse, 1992, 1997; Mienko et al., 1996a,b; Skowron, 1993; Skowron and Polkowski, 1997; Slowinski and Stefanowski, 1992; Stefanowski, 1998; Ziarko and Shan, 1994; Slowinski et al., 2000). The existing induction algorithms use one of the following strategies:
(a) generation of a minimal set of rules covering all objects from a decision table;
(b) generation of an exhaustive set of rules consisting of all possible rules for a decision table;
(c) generation of a set of "strong" decision rules, even partly discriminant, covering relatively many objects each but not necessarily all objects from the decision table.

2.5. Fuzzy measures and rough sets

From a formal point of view, the quality of classification satisfies the properties of set functions called fuzzy measures. As observed by Grabisch (1997), fuzzy measures constitute a useful tool for modeling the importance of coalitions. In Greco et al. (1998a), fuzzy measures have been used to assess a relative value of information supplied by each attribute and to analyze the interactions among attributes, based on the quality of classification calculated from the rough set approach. Let us explain this point in more detail.

Let $N = \{1, 2, \ldots, n\}$ be a finite set, whose elements could be players in a game, criteria in a multicriteria decision problem, attributes in a data table, etc., and let $P(N)$ denote the power set of $N$, i.e. the set of all subsets of $N$. A fuzzy measure on $N$ is a set function $\mu : P(N) \to [0, 1]$ satisfying the following axioms:
1. $\mu(\emptyset) = 0$, $\mu(N) = 1$,
2. $A \subseteq B$ implies $\mu(A) \le \mu(B)$ for all $A, B \in P(N)$.

In the following, the first axiom is relaxed by considering the condition $\mu(N) \le 1$ instead of $\mu(N) = 1$.

Within game theory, the function $\mu(A)$ is called characteristic function and represents the payoff obtained by the coalition $A \subseteq N$ in a cooperative game (Shapley, 1953; Banzhaf, 1965); in a multicriteria decision problem, $\mu(A)$ can be interpreted as the conjoint importance of the criteria from $A \subseteq N$ (Grabisch, 1994, 1996).

Some indices have been introduced in game theory as specific solutions of cooperative games. The most important are the Shapley value and the Banzhaf value. The Shapley value (Shapley, 1953) for every $i \in N$ is defined by

\[ \phi_S(i) = \sum_{K \subseteq N - \{i\}} \frac{(n - |K| - 1)!\,|K|!}{n!} \left[ \mu(K \cup \{i\}) - \mu(K) \right]. \]

The Banzhaf value (Banzhaf, 1965) for every $i \in N$ is defined by

\[ \phi_B(i) = \frac{1}{2^{n-1}} \sum_{K \subseteq N - \{i\}} \left[ \mu(K \cup \{i\}) - \mu(K) \right]. \]

The Shapley value and the Banzhaf value can be interpreted as specific kinds of weighted average contribution of element $i$ alone to all coalitions. Let us recall that in the case of $\phi_S(i)$ the value of $\mu(N)$ is shared among the elements of $N$, i.e. $\sum_{i=1}^{n} \phi_S(i) = \mu(N)$ (which equals 1 under the normalization $\mu(N) = 1$), while an analogous property does not hold for $\phi_B(i)$.

The Shapley and Banzhaf values have also been proposed to represent the average importance of particular criteria within multicriteria decision analysis, when fuzzy measures are used for the conjoint importance of criteria (Murofushi, 1992).

In addition to the indices concerning particular criteria, other indices have been proposed to measure the interaction between pairs of criteria.
Interaction indices have been suggested by Murofushi and Soneda (1993) and Roubens (1996) with respect to the Shapley value and the Banzhaf value, respectively. The Murofushi–Soneda interaction index for elements $i, j \in N$ is defined by

\[ I_{MS}(i, j) = \sum_{K \subseteq N - \{i, j\}} \frac{(n - |K| - 2)!\,|K|!}{(n - 1)!} \left[ \mu(K \cup \{i, j\}) - \mu(K \cup \{i\}) - \mu(K \cup \{j\}) + \mu(K) \right]. \]

The Roubens interaction index for elements $i, j \in N$ is defined by

\[ I_R(i, j) = \frac{1}{2^{n-2}} \sum_{K \subseteq N - \{i, j\}} \left[ \mu(K \cup \{i, j\}) - \mu(K \cup \{i\}) - \mu(K \cup \{j\}) + \mu(K) \right]. \]

The interaction indices $I_{MS}(i, j)$ and $I_R(i, j)$ can be interpreted as specific kinds of average added values resulting from putting $i$ and $j$ together in each possible coalition. The following cases can happen:
· $I_{MS}(i, j) > 0$ ($I_R(i, j) > 0$): $i$ and $j$ are complementary,
· $I_{MS}(i, j) < 0$ ($I_R(i, j) < 0$): $i$ and $j$ are substitutive,
· $I_{MS}(i, j) = 0$ ($I_R(i, j) = 0$): $i$ and $j$ are independent.

The definition of interaction indices can be extended from non-ordered pairs $i, j \in N$ to any subset $A \subseteq N$, $A \neq \emptyset$. Extensions of interaction indices in this sense have been proposed by Grabisch (1996) and Roubens (1996), with respect to the Shapley index and the Banzhaf index, respectively. The Shapley interaction index of elements from $A \subseteq N$ is defined by

\[ I_S(A) = \sum_{K \subseteq N - A} \frac{(n - |K| - |A|)!\,|K|!}{(n - |A| + 1)!} \sum_{L \subseteq A} (-1)^{|A| - |L|} \mu(L \cup K). \]

The Banzhaf interaction index of elements from $A \subseteq N$ is defined by

\[ I_B(A) = \frac{1}{2^{n - |A|}} \sum_{K \subseteq N - A} \sum_{L \subseteq A} (-1)^{|A| - |L|} \mu(K \cup L). \]

In addition to the interaction indices, another concept useful for the interpretation of the fuzzy measures is the Möbius representation of $\mu$, i.e. the set function $m : P(N) \to \mathbb{R}$ defined by

\[ m(A) = \sum_{B \subseteq A} (-1)^{|A - B|} \mu(B) \]

for any $A \subseteq N$. Within the Dempster–Shafer theory of evidence (Shafer, 1976), $m(A)$ is interpreted as basic probability assignment.

The relations between fuzzy measures $\mu$, interaction indices $I_S$ and $I_B$ and the Möbius representation $m$ have been extensively studied in Grabisch (1997), Dennemberg and Grabisch (1996), Roubens (1996) and Grabisch and Roubens (1997).
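
The indices of this section can be sketched in Python as follows. This is an illustrative sketch, not code from the paper. As an assumption made for the example, the measure `MU` is the quality of classification $\gamma_K(Y)$ of each subset $K$ of the condition attributes $A_1, A_2, A_3$ of Table 1 (Section 2.6), with $A_4$ as decision; these eight values were computed by hand from that table and do not appear in the paper. Note that $\mu(N) = 2/3 \le 1$, i.e. the relaxed form of the first axiom applies. The function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations
from math import factorial

N = ("A1", "A2", "A3")

# Assumed fuzzy measure: quality of classification of Table 1 (hand-computed).
MU = {
    frozenset(): F(0),
    frozenset({"A1"}): F(1, 2),
    frozenset({"A2"}): F(0),
    frozenset({"A3"}): F(0),
    frozenset({"A1", "A2"}): F(2, 3),
    frozenset({"A1", "A3"}): F(2, 3),
    frozenset({"A2", "A3"}): F(1, 2),
    frozenset(N): F(2, 3),
}


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def shapley(mu, players, i):
    n = len(players)
    others = [p for p in players if p != i]
    return sum(
        F(factorial(n - len(K) - 1) * factorial(len(K)), factorial(n))
        * (mu[K | {i}] - mu[K])
        for K in subsets(others)
    )


def banzhaf(mu, players, i):
    n = len(players)
    others = [p for p in players if p != i]
    return sum(mu[K | {i}] - mu[K] for K in subsets(others)) / 2 ** (n - 1)


def pair_terms(mu, players, i, j):
    """Yield (K, mu(K+ij) - mu(K+i) - mu(K+j) + mu(K)) for K in N - {i, j}."""
    others = [p for p in players if p not in (i, j)]
    for K in subsets(others):
        yield K, mu[K | {i, j}] - mu[K | {i}] - mu[K | {j}] + mu[K]


def interaction_ms(mu, players, i, j):
    """Murofushi-Soneda interaction index of the pair (i, j)."""
    n = len(players)
    return sum(
        F(factorial(n - len(K) - 2) * factorial(len(K)), factorial(n - 1)) * d
        for K, d in pair_terms(mu, players, i, j)
    )


def interaction_roubens(mu, players, i, j):
    """Roubens interaction index of the pair (i, j)."""
    n = len(players)
    return sum(d for _, d in pair_terms(mu, players, i, j)) / 2 ** (n - 2)


def mobius(mu, A):
    """Moebius representation m(A) = sum over B in A of (-1)^{|A-B|} mu(B)."""
    A = frozenset(A)
    return sum((-1) ** (len(A) - len(B)) * mu[B] for B in subsets(A))


print([str(shapley(MU, N, i)) for i in N])   # ['4/9', '1/9', '1/9']
print([str(banzhaf(MU, N, i)) for i in N])   # ['1/2', '1/6', '1/6']
print(interaction_ms(MU, N, "A2", "A3"))     # 1/6
print(interaction_ms(MU, N, "A1", "A2"))     # -1/6
print(mobius(MU, {"A1", "A2"}))              # 1/6
```

Under this assumed measure the Shapley values sum to $\mu(N) = 2/3$ while the Banzhaf values sum to $5/6$, which illustrates that only the Shapley value shares $\mu(N)$ among the elements. The positive index for $A_2, A_3$ and the negative one for $A_1, A_2$ would be read, following the cases listed above, as complementarity and substitutivity, respectively.
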
obius representation m can be used within rough set analysis to study Interaction indices IS and IB and M the relative value of the information supplied by dierent attributes (Greco et al., 1998a). Considering the quality of classi®cation as a fuzzy measure, we conclude that: 1. the Shapley value /S i and the Banzhaf value /B i can be interpreted as measures of the contribution of attribute i 1; . . . ; n to the quality of approximation of the considered classi®cation, 2. the Murofushi±Soneda interaction index IMS i; j and Roubens interaction index IR i; j can be interpreted as the average conjoint contribution of the non-ordered pair of attributes i; j 1; . . . ; n; i 6 j, to the quality of classi®cation when adjoined to all sets K C such that K \ fi; jg £, 3. the Shapley interaction index IS A and the Banzhaf interaction index IB A can be interpreted as the average conjoint contribution of the subset of attributes A C to the quality of classi®cation when adjoined to all sets B C such that B \ A £, 4. the M obius representation m A of l can be interpreted as the conjoint contribution of the subset of attributes A C to the quality of classi®cation. All of these indices are useful to study the informational dependence among the considered attributes and to choose the best reduct. 2.6. An example Let us consider an example inspired by the example of evaluation in a high school proposed by Grabisch (1994). The director of the school wants to assign students to two classes: bad and good. To ®x classi®cation rules the director is asked to give some examples. The examples concern six students described by means of four attributes (see Table 1): · A1 , level in Mathematics, · A2 , level in Physics, 10 S. Greco et al. 
· $A_3$, level in Literature,
· $A_4$, global evaluation (decision class).

Table 1
Data table with examples of classification

Student   A1       A2     A3     A4
1         good     good   bad    good
2         medium   bad    bad    bad
3         medium   bad    bad    good
4         bad      bad    bad    bad
5         medium   good   good   bad
6         good     bad    good   good

The components of the data table $S$ are: $U = \{1, 2, 3, 4, 5, 6\}$; $Q = \{A_1, A_2, A_3, A_4\}$; $V_1 = \{\text{bad}, \text{medium}, \text{good}\}$; $V_2 = \{\text{good}, \text{bad}\}$; $V_3 = \{\text{good}, \text{bad}\}$; $V_4 = \{\text{good}, \text{bad}\}$; and the information function $f(x,q)$, taking values $f(1, A_1) = \text{good}$, $f(1, A_2) = \text{good}$, and so on.

Observe that each student has a different description in terms of the attributes $A_1$, $A_2$, $A_3$ and $A_4$, so they can be distinguished (discerned) by means of the information supplied by the four attributes. Formally, the indiscernibility relation based on all four attributes is $I_Q = \{(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)\}$ and, therefore, there are no two distinct students $x$ and $y$ such that $(x,y) \in I_Q$. However, students 2 and 3 are indiscernible with respect to the attributes from $P = \{A_1, A_2, A_3\}$, since they have the same values on the three attributes. Formally, the indiscernibility relation based on $P$ is $I_P = \{(1,1), (2,2), (2,3), (3,2), (3,3), (4,4), (5,5), (6,6)\}$. Similarly, students 2, 3 and 4 are indiscernible with respect to the attributes from $P' = \{A_2, A_3\}$, and so on, considering all the possible subsets of attributes from $Q$.

Each $P \subseteq Q$ determines a partition $U|I_P$ that groups in the corresponding equivalence classes the objects having the same description in terms of the attributes from $P$; e.g., for $P' = \{A_2, A_3\}$, $U|I_{P'} = \{\{1\}, \{2,3,4\}, \{5\}, \{6\}\}$, and thus $\{1\}$, $\{2,3,4\}$, $\{5\}$, $\{6\}$ are the $P'$-elementary sets.

Suppose that, using the set of attributes $P = \{A_1, A_2, A_3\}$, we wish to approximate the set $X$ of students having a global evaluation ``good'', i.e. $X = \{1, 3, 6\}$.
Since $U|I_P = \{\{1\}, \{2,3\}, \{4\}, \{5\}, \{6\}\}$, the result is

$$\underline{P}(X) = \{1, 6\}, \quad \overline{P}(X) = \{1, 2, 3, 6\}, \quad Bn_P(X) = \{2, 3\}.$$

The answer to the question whether it is possible to describe $X$ by means of the information supplied by the attributes from $P$ is not univocal. Observe that the $P$-boundary $Bn_P(X)$ is not empty: students 2 and 3, belonging to the $P$-boundary, have the same description in terms of the attributes considered; however, student 2 is globally bad while student 3 is good. Nevertheless, the $P$-lower approximation of $X$, $\underline{P}(X)$, is also not empty and consists of students 1 and 6, whose descriptions are different from those of all the students not belonging to $X$.

Summing up, in intuitive terms, it may be said that, on the basis of the information supplied by the attributes from $P$:
· students 1 and 6, from the $P$-lower approximation of $X$, certainly belong to the set of ``good'' students,
· students 1, 2, 3 and 6, from the $P$-upper approximation of $X$, could belong to the set of ``good'' students,
· students 2 and 3, from the $P$-boundary of $X$, represent cases of uncertain membership to the set of ``good'' students.

Using the same set of attributes $P = \{A_1, A_2, A_3\}$, the approximation of the set $Y$ of students having global evaluation ``bad'', i.e. $Y = \{2, 4, 5\}$, gives the following result:

$$\underline{P}(Y) = \{4, 5\}, \quad \overline{P}(Y) = \{2, 3, 4, 5\}, \quad Bn_P(Y) = \{2, 3\}.$$

Let us consider now the following subsets of $Q$: $P = \{A_1, A_2, A_3\}$, $R = \{A_1, A_2\}$, $T = \{A_1, A_3\}$, $W = \{A_2, A_3\}$. It is easy to observe that $I_R = I_P$ and $I_T = I_P$, while $I_W \neq I_P$. This means that $R$ and $T$ are reducts of $P$, while $W$ is not. In other words, $R$ and $T$ are minimal subsets of $P$ that induce the same partition of the elements of $U$ as the set of attributes $P$.
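The approximations and the reduct check above can be reproduced with a few lines of code. This is a minimal Python sketch under stated assumptions: the dictionary `TABLE1` encodes Table 1, and the names `partition` and `approximations` are illustrative helpers, not part of the original method description.

```python
from collections import defaultdict

# Table 1: student -> attribute values (A4 is the global evaluation).
TABLE1 = {
    1: {"A1": "good",   "A2": "good", "A3": "bad",  "A4": "good"},
    2: {"A1": "medium", "A2": "bad",  "A3": "bad",  "A4": "bad"},
    3: {"A1": "medium", "A2": "bad",  "A3": "bad",  "A4": "good"},
    4: {"A1": "bad",    "A2": "bad",  "A3": "bad",  "A4": "bad"},
    5: {"A1": "medium", "A2": "good", "A3": "good", "A4": "bad"},
    6: {"A1": "good",   "A2": "bad",  "A3": "good", "A4": "good"},
}


def partition(table, attrs):
    """Return the elementary sets U|I_P as a set of frozensets."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return {frozenset(block) for block in classes.values()}


def approximations(table, attrs, X):
    """Return (lower, upper, boundary) of X with respect to attrs."""
    lower, upper = set(), set()
    for block in partition(table, attrs):
        if block <= X:      # block entirely inside X
            lower |= block
        if block & X:       # block shares at least one object with X
            upper |= block
    return lower, upper, upper - lower


P = ["A1", "A2", "A3"]
X = {obj for obj, row in TABLE1.items() if row["A4"] == "good"}
lower, upper, boundary = approximations(TABLE1, P, X)
print(sorted(lower), sorted(upper), sorted(boundary))

# R and T induce the same partition as P, while W does not.
for name, attrs in {"R": ["A1", "A2"], "T": ["A1", "A3"], "W": ["A2", "A3"]}.items():
    print(name, partition(TABLE1, attrs) == partition(TABLE1, P))
```

Comparing partitions, rather than the relations themselves, is enough here because an equivalence relation and its partition determine each other.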
It can also be observed that in the core of $P$, defined by $R \cap T$, there is attribute $A_1$, which is thus indispensable for the approximation of the class of ``good'' students (and also for the class of ``bad'' students), while other attributes from $R$ and $T$ may be mutually exchanged.

If in the set of attributes $Q$, condition attributes $C = \{A_1, A_2, A_3\}$ and decision attribute $D = \{A_4\}$ were distinguished, the data table could be seen as a decision table. In order to explain the evaluations of the decision attribute in terms of the evaluations of the condition attributes, one can represent the data table as a set of decision rules. Such a representation of Table 1 gives the following rules:
1. if $f(x, A_1) = \text{good}$ and $f(x, A_2) = \text{good}$ and $f(x, A_3) = \text{bad}$, then $f(x, A_4) = \text{good}$ (or, in linguistic terms, ``if the level in Mathematics is good and the level in Physics is good and the level in Literature is bad, then the student is good''),
2. if $f(x, A_1) = \text{medium}$ and $f(x, A_2) = \text{bad}$ and $f(x, A_3) = \text{bad}$, then $f(x, A_4) = \text{bad}$,
3. if $f(x, A_1) = \text{medium}$ and $f(x, A_2) = \text{bad}$ and $f(x, A_3) = \text{bad}$, then $f(x, A_4) = \text{good}$,
4. if $f(x, A_1) = \text{bad}$ and $f(x, A_2) = \text{bad}$ and $f(x, A_3) = \text{bad}$, then $f(x, A_4) = \text{bad}$,
5. if $f(x, A_1) = \text{medium}$ and $f(x, A_2) = \text{good}$ and $f(x, A_3) = \text{good}$, then $f(x, A_4) = \text{bad}$,
6. if $f(x, A_1) = \text{good}$ and $f(x, A_2) = \text{bad}$ and $f(x, A_3) = \text{good}$, then $f(x, A_4) = \text{good}$.

The above set of rules may then be reduced by induction, obtaining a more concise representation of the decision table (within parentheses there are the objects supporting the corresponding rule):
(1') if $f(x, A_1) = \text{good}$, then $f(x, A_4) = \text{good}$, (1, 6)
(2') if $f(x, A_1) = \text{bad}$, then $f(x, A_4) = \text{bad}$, (4)
(3') if $f(x, A_1) = \text{medium}$ and $f(x, A_2) = \text{good}$, then $f(x, A_4) = \text{bad}$, (5)
(4') if $f(x, A_1) = \text{medium}$ and $f(x, A_2) = \text{bad}$, then $f(x, A_4) = \text{good}$ or $\text{bad}$. (2, 3)

Observe that rules (1'), (2') and (3') have a univocal consequence and therefore these are exact rules, while rule (4') does not have a univocal consequence and for this reason it is an approximate rule.
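The supports listed for the reduced rules can be verified mechanically. The sketch below is illustrative only: it assumes Table 1 is encoded as tuples `(A1, A2, A3, A4)` and represents each rule as a pair of hypothetical structures, a dictionary of elementary conditions and a set of admissible decisions.

```python
# Table 1 as student -> (A1, A2, A3, A4).
TABLE1 = {
    1: ("good", "good", "bad", "good"),
    2: ("medium", "bad", "bad", "bad"),
    3: ("medium", "bad", "bad", "good"),
    4: ("bad", "bad", "bad", "bad"),
    5: ("medium", "good", "good", "bad"),
    6: ("good", "bad", "good", "good"),
}
INDEX = {"A1": 0, "A2": 1, "A3": 2, "A4": 3}

# Reduced rules: (conditions, admissible decisions on A4).
RULES = {
    "1'": ({"A1": "good"}, {"good"}),
    "2'": ({"A1": "bad"}, {"bad"}),
    "3'": ({"A1": "medium", "A2": "good"}, {"bad"}),
    "4'": ({"A1": "medium", "A2": "bad"}, {"good", "bad"}),
}


def matches(row, conditions):
    """True if the object satisfies every elementary condition of the rule."""
    return all(row[INDEX[attr]] == value for attr, value in conditions.items())


def support(rule):
    """Objects matching the condition part whose decision the rule admits."""
    conditions, decisions = rule
    return {obj for obj, row in TABLE1.items()
            if matches(row, conditions) and row[INDEX["A4"]] in decisions}


def is_exact(rule):
    """A rule is exact when its consequence is univocal."""
    return len(rule[1]) == 1


def is_consistent(rule):
    """True if no object matches the conditions but violates the decision."""
    conditions, decisions = rule
    return all(row[INDEX["A4"]] in decisions
               for row in TABLE1.values() if matches(row, conditions))


for name, rule in RULES.items():
    kind = "exact" if is_exact(rule) else "approximate"
    print(name, sorted(support(rule)), kind, is_consistent(rule))
```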
Finally, the quality of approximation, interaction indices $I_S$ and $I_B$ and the Möbius representation of all subsets of attributes in $C$ were calculated. Their values are presented in Table 2.

Table 2
Quality of approximation, Möbius representation and interaction indices

Attributes      Quality   Möbius   Shapley   Banzhaf
{A1}            0.5       0.5      0.44      0.5
{A2}            0         0        0.11      0.17
{A3}            0         0        0.11      0.17
{A1, A2}        0.67      0.17     -0.17     -0.17
{A1, A3}        0.67      0.17     -0.17     -0.17
{A2, A3}        0.5       0.5      0.17      0.17
{A1, A2, A3}    0.67      -0.67    -0.67     -0.67

The results presented in Table 2 can be interpreted as follows:
1. The second column shows the quality of approximation for the considered subset of attributes.
2. The third column presents the Möbius representation and gives a measure of the conjoint contribution of the corresponding subset of attributes to the quality of classification. The negative value corresponding to $\{A_1, A_2, A_3\}$ should be read as a measure of the information redundancy in the conjoint contribution of the three attributes.
3. The fourth column shows the Shapley interaction index: more precisely, the first three values are the Shapley values and can be interpreted as measures of importance of the corresponding attributes in the rough approximation. One can notice a relatively great importance of $A_1$ in comparison to $A_2$ and $A_3$. Furthermore, $A_2$ and $A_3$ are complementary, while $A_1$ and $A_2$, as well as $A_1$ and $A_3$, are substitutive. Finally, there is redundancy between $A_1$, $A_2$ and $A_3$, as pointed out by the negative value of the corresponding interaction index.
4. The fifth column presents the Banzhaf interaction index, which has an interpretation analogous to that of the Shapley interaction index.

3. Generalization of the indiscernibility relation

As mentioned above, the classical definitions of lower and upper approximations are based on the use of the binary indiscernibility relation, which is an equivalence relation.
In this case, the sets to be approximated and the relation used for this approximation are both crisp. Generalizations consisting in approximation of fuzzy sets with a fuzzy indiscernibility relation have been considered by Dubois and Prade (1990, 1992), Slowinski (1993a), Slowinski and Stefanowski (1989, 1994, 1996), Yao (1996, 1998) and Inuiguchi and Tanino (2000). Nevertheless, this approach is still based on the use of the indiscernibility relation. Further generalizations replacing the indiscernibility relation by a weaker binary similarity relation have considerably extended the capacity of the rough set approach, since in the least demanding case the similarity relation requires reflexivity only, relaxing the assumptions of symmetry and transitivity (see Slowinski and Vanderpooten, 1997, 2000).

3.1. Similarity

Indiscernibility implies an impossibility to distinguish two objects of $U$ having the same description in terms of the attributes from $Q$. This relation induces equivalence classes on $U$, which constitute the basic granules of knowledge. In reality, due to the imprecision of data describing the objects, small differences are often not considered significant for the purpose of discrimination. This situation may be formally modeled by considering similarity or tolerance relations (see, e.g., Nieminen, 1988; Lin, 1989; Marcus, 1994; Polkowski et al., 1995; Skowron and Stepaniuk, 1995; Slowinski and Vanderpooten, 1997, 2000; Yao and Wong, 1995).

In general, similarity relations $R$ do not generate partitions on $U$; the information regarding similarity may be represented using similarity classes for each object $x \in U$. Precisely, the similarity class of $x$, denoted by $R(x)$, consists of the set of objects which are similar to $x$:

$$R(x) = \{y \in U : yRx\}.$$

It is obvious that an object $y$ may be similar to both $x$ and $z$, while $z$ is not similar to $x$, i.e. $y \in R(x)$ and $y \in R(z)$, but $z \notin R(x)$, $x, y, z \in U$.
The similarity relation is of course reflexive (each object is similar to itself). Slowinski and Vanderpooten (1997, 2000) have proposed a similarity relation which is only reflexive. Abandoning the transitivity requirement is easily justified, recalling, for example, Luce's paradox of the cups of tea (Luce, 1956). As for symmetry, one should notice that $yRx$, which means ``$y$ is similar to $x$'', is directional; there is a subject $y$ and a referent $x$, and in general this is not equivalent to the proposition ``$x$ is similar to $y$'', as maintained by Tversky (1977). This is quite evident when the similarity relation is defined in terms of a percentage difference between numerical evaluations of the objects, calculated with respect to the evaluation of the referent object. Therefore, the symmetry of the similarity relation should not be imposed, and it then makes sense to consider the inverse relation of $R$, denoted by $R^{-1}$, where $xR^{-1}y$ means again ``$y$ is similar to $x$''; $R^{-1}(x)$, $x \in U$, is the class of referent objects to which $x$ is similar:

$$R^{-1}(x) = \{y \in U : xRy\}.$$

Given a subset $X \subseteq U$ and a similarity relation $R$ on $U$, an object $x \in U$ is said to be non-ambiguous in each of the two following cases:
· $x$ belongs to $X$ without ambiguity, that is $x \in X$ and $R^{-1}(x) \subseteq X$; such objects are also called positive;
· $x$ does not belong to $X$ without ambiguity ($x$ clearly does not belong to $X$), that is $x \in U - X$ and $R^{-1}(x) \subseteq U - X$ (or $R^{-1}(x) \cap X = \emptyset$); such objects are also called negative.

The objects which are neither positive nor negative are said to be ambiguous. A more general definition of lower and upper approximation may thus be offered (see Slowinski and Vanderpooten, 2000).
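The directional character of similarity can be illustrated with a small numeric case. The following Python sketch is purely hypothetical: the four objects, their evaluations and the 10% threshold are invented for illustration, and `similar(y, x)` encodes one possible percentage-difference relation computed with respect to the referent $x$.

```python
# Hypothetical numerical evaluations of four objects.
VALUES = {"a": 100.0, "b": 111.0, "c": 150.0, "d": 160.0}
THRESHOLD = 0.10  # assumed tolerance: 10% of the referent's evaluation


def similar(y, x):
    """y R x: subject y is similar to referent x (difference relative to x)."""
    return abs(VALUES[y] - VALUES[x]) <= THRESHOLD * VALUES[x]


def similarity_class(x):
    """R(x): the objects that are similar to x."""
    return {y for y in VALUES if similar(y, x)}


def inverse_class(x):
    """R^{-1}(x): the referent objects to which x is similar."""
    return {y for y in VALUES if similar(x, y)}


def status(x, X):
    """Classify x as positive, negative or ambiguous with respect to X."""
    referents = inverse_class(x)
    if x in X and referents <= X:
        return "positive"
    if x not in X and not (referents & X):
        return "negative"
    return "ambiguous"


X = {"a", "c", "d"}
print(similar("a", "b"), similar("b", "a"))  # the relation is not symmetric
for obj in sorted(VALUES):
    print(obj, sorted(inverse_class(obj)), status(obj, X))
```

In this toy case object `a` belongs to `X` but is similar to `b`, which lies outside `X`, so `a` is ambiguous even though `b` is not similar to `a`.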
Let $X \subseteq U$ and $R$ be a reflexive binary relation defined on $U$; the lower approximation of $X$, denoted by $\underline{R}(X)$, and the upper approximation of $X$, denoted by $\overline{R}(X)$, are defined, respectively, as:

$$\underline{R}(X) = \{x \in U : R^{-1}(x) \subseteq X\}, \qquad \overline{R}(X) = \bigcup_{x \in X} R(x).$$

It may be demonstrated that the key property $\underline{R}(X) \subseteq X \subseteq \overline{R}(X)$ still holds, and that

$$\underline{R}(X) = U - \overline{R}(U - X) \quad \text{(complementarity property)}$$

and

$$\overline{R}(X) = \{x \in U : R^{-1}(x) \cap X \neq \emptyset\}.$$

Moreover, the definitions proposed are the only ones that correctly characterize the set of positive objects (lower approximation) and the set of positive or ambiguous objects (upper approximation) when a similarity relation is reflexive, but not necessarily symmetric nor transitive.

Using a similarity relation one is able to induce decision rules from a decision table. The syntax of a rule is the following:

if $f(x, q_1)$ is similar to $r_{q_1}$ and $f(x, q_2)$ is similar to $r_{q_2}$ and ... $f(x, q_p)$ is similar to $r_{q_p}$, then $x$ belongs to $Y_{j_1}$ or $Y_{j_2}$ or ... $Y_{j_k}$,

where $\{q_1, q_2, \ldots, q_p\} \subseteq C$, $(r_{q_1}, r_{q_2}, \ldots, r_{q_p}) \in V_{q_1} \times V_{q_2} \times \cdots \times V_{q_p}$ and $Y_{j_1}, Y_{j_2}, \ldots, Y_{j_k}$ are some classes of the considered classification ($D$-elementary sets). If $k = 1$ then the rule is exact or certain, otherwise it is approximate or uncertain. Procedures for generation of decision rules adapt the scheme described in Section 2.4. One such procedure has been proposed by Krawiec et al. (1998); it involves a similarity relation that is learned from data.

Let us add that, recently, Greco et al. (1998g, 2000) proposed a fuzzy extension of the similarity, i.e. rough approximation of fuzzy sets by means of fuzzy similarity relations.

3.2. Missing values

The classical rough set approach based on the use of indiscernibility relations requires the data table to be complete, i.e. without missing values on condition attributes describing the objects. In practice, however, the data tables are very often incomplete. To deal with these cases, we proposed in Greco et al.
(1999e) and Greco et al. (1999f) an extension of the rough set methodology to the analysis of incomplete data tables. The extended indiscernibility relation between two objects is considered as a directional statement where a subject is compared to a referent object. We require that the referent object has no missing values. The extended rough set approach maintains all good characteristics of its original version. It also boils down to the original approach when there are no missing values. The rules induced from the rough approximations defined according to the extended relation verify a suitable property: they are robust in the sense that each rule is supported by at least one object with no missing value on the condition attributes represented in the rule.

For any two objects $x, y \in U$, we are considering a directional comparison of $y$ to $x$; object $y$ is called subject and object $x$, referent. We say that subject $y$ is indiscernible with referent $x$, with respect to condition attributes from $P \subseteq C$ (denoted $y I_P x$), if for every $q \in P$ the following conditions are met: (1) $f(x,q) \neq *$, (2) $f(x,q) = f(y,q)$ or $f(y,q) = *$, where $*$ denotes a missing value. The above means that the referent object considered for indiscernibility with respect to $P$ should have no missing values on attributes from set $P$. The binary relation $I_P$ is not necessarily reflexive and also not necessarily symmetric. However, $I_P$ is transitive.

For each $P \subseteq C$ let us define a set of objects having no missing values on attributes from $P$:

$$U_P = \{x \in U : f(x,q) \neq * \text{ for each } q \in P\}.$$

For each $x \in U$ and for each $P \subseteq C$ let $I_P(x) = \{y \in U : y I_P x\}$ denote the class of objects indiscernible with $x$. Given $X \subseteq U$ and $P \subseteq C$, we define the lower and upper approximation of $X$ with respect to $P$ as follows:

$$\underline{I}_P(X) = \{x \in U_P : I_P(x) \subseteq X\}, \qquad (1)$$

$$\overline{I}_P(X) = \{x \in U_P : I_P(x) \cap X \neq \emptyset\}. \qquad (2)$$

Let $X_P = X \cap U_P$. The rough approximation defined as above satisfies the following properties:
· (Rough inclusion). For each $X \subseteq U$ and for each $P \subseteq C$: $\underline{I}_P(X) \subseteq X_P \subseteq \overline{I}_P(X)$.
· (Complementarity).
For each $X \subseteq U$ and for each $P \subseteq C$: $\underline{I}_P(X) = U_P - \overline{I}_P(U - X)$.

Let us observe that a very useful property of lower approximation within classical rough sets theory is that if an object $x \in U$ belongs to the lower approximation of $X$ with respect to $P \subseteq C$, then $x$ belongs also to the lower approximation of $X$ with respect to $R \subseteq C$ when $P \subseteq R$ (this is a kind of monotonicity property). However, definition (1) does not satisfy this property of lower approximation, because it is possible that $f(x,q) \neq *$ for all $q \in P$ but $f(x,q) = *$ for some $q \in R - P$. This is quite problematic for some key concepts of the rough sets theory, like quality of approximation, reduct and core.

Therefore, another definition of lower approximation should be considered to restore the concepts of quality of approximation, reduct and core in the case of missing values. Given $X \subseteq U$ and $P \subseteq C$, this definition is the following:

$$\underline{I}^{*}_P(X) = \bigcup_{R \subseteq P} \underline{I}_R(X). \qquad (3)$$

$\underline{I}^{*}_P(X)$ is called the cumulative $P$-lower approximation of $X$ because it includes all the objects belonging to all $R$-lower approximations of $X$, where $R \subseteq P$.

It can be shown that another type of indiscernibility relation, denoted by $I^{*}_P$, permits a direct definition of the cumulative $P$-lower approximation in a usual way. For each $x, y \in U$ and for each $P \subseteq C$, $y I^{*}_P x$ means that $f(x,q) = f(y,q)$ or $f(x,q) = *$ and/or $f(y,q) = *$, for every $q \in P$. Let $I^{*}_P(x) = \{y \in U : y I^{*}_P x\}$ for each $x \in U$ and for each $P \subseteq C$. $I^{*}_P$ is reflexive and symmetric but not transitive (Kryszkiewicz, 1998). Greco et al. (1999e) proved that definition (3) is equivalent to the following definition:

$$\underline{I}^{*}_P(X) = \{x \in U^{*}_P : I^{*}_P(x) \subseteq X\},$$

where $U^{*}_P = \{x \in U : f(x,q) \neq * \text{ for at least one } q \in P\}$. Using $I^{*}_P$ we can give a definition of the $P$-upper approximation of $X$, complementary to $\underline{I}^{*}_P(X)$:

$$\overline{I}^{*}_P(X) = \{x \in U^{*}_P : I^{*}_P(x) \cap X \neq \emptyset\}. \qquad (4)$$

For each $X \subseteq U$, let $X^{*}_P = X \cap U^{*}_P$. Let us remark that $x \in U^{*}_P$ if and only if there exists $R \neq \emptyset$ such that $R \subseteq P$ and $x \in U_R$.
Rough approximations $\underline{I}^{*}_P(X)$ and $\overline{I}^{*}_P(X)$ satisfy the following properties:
· (Rough inclusion). For each $X \subseteq U$ and for each $P \subseteq C$: $\underline{I}^{*}_P(X) \subseteq X^{*}_P \subseteq \overline{I}^{*}_P(X)$.
· (Complementarity). For each $X \subseteq U$ and for each $P \subseteq C$: $\underline{I}^{*}_P(X) = U^{*}_P - \overline{I}^{*}_P(U - X)$.
· (Monotonicity of the accuracy of approximation). For each $X \subseteq U$ and for each $P, R \subseteq C$ such that $P \subseteq R$, the following inclusion holds: $\underline{I}^{*}_P(X) \subseteq \underline{I}^{*}_R(X)$. Furthermore, if $U^{*}_P = U^{*}_R$, the following inclusion is also true: $\overline{I}^{*}_P(X) \supseteq \overline{I}^{*}_R(X)$.

Due to the property of monotonicity, when augmenting a set of attributes $P$, we get a lower approximation of $X$ that is at least of the same cardinality. Thus, we can restore for the case of missing values the following key concepts of the rough sets theory: accuracy and quality of approximation, reduct and core. These concepts have the same definitions as those given in Sections 2.2 and 2.3 but they use the rough approximations $\underline{I}^{*}_P(X)$ and $\overline{I}^{*}_P(X)$.

3.2.1. An example

The illustrative example presented in this section is intended to explain the concepts introduced in Section 3.2. The director of the school must give a global evaluation to some students. This evaluation should be based on the level in Mathematics, Physics and Literature. However, not all the students have passed all three exams and, therefore, there are some missing values. The director gave the examples of evaluation as shown in Table 3.

The following lower and upper approximations can be calculated from Table 3:

$$\underline{I}_C(\text{good}) = \{6\}, \quad \overline{I}_C(\text{good}) = \{6\}, \quad \underline{I}_C(\text{bad}) = \{1\}, \quad \overline{I}_C(\text{bad}) = \{1\},$$

$$\underline{I}^{*}_C(\text{good}) = \{2, 6\}, \quad \overline{I}^{*}_C(\text{good}) = \{2, 3, 4, 6\}, \quad \underline{I}^{*}_C(\text{bad}) = \{1, 5\}, \quad \overline{I}^{*}_C(\text{bad}) = \{1, 3, 4, 5\}.$$

The quality of approximation of the partition of $U$ using attributes from $C$ is equal to 0.67. There are two reducts: $Red_1 = \{\text{Mathematics}, \text{Physics}\}$ and $Red_2 = \{\text{Mathematics}, \text{Literature}\}$. The intersection of $Red_1$ and $Red_2$ constitutes the core, i.e. $Core = \{\text{Mathematics}\}$.
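The approximations of this example can be recomputed directly from the two relations. The sketch below is a hedged illustration, not the authors' implementation: it assumes that Table 3 reads as encoded in `TABLE3`, with `None` standing for a missing value and students 2, 3, 4 and 5 each lacking one grade, and all function names are hypothetical.

```python
# Assumed reading of Table 3:
# student -> ((Mathematics, Physics, Literature), global evaluation).
TABLE3 = {
    1: (("medium", "bad", "bad"), "bad"),
    2: (("good", "medium", None), "good"),
    3: (("medium", None, "medium"), "bad"),
    4: ((None, "medium", "medium"), "good"),
    5: ((None, "good", "bad"), "bad"),
    6: (("good", "medium", "bad"), "good"),
}
C = (0, 1, 2)  # indices of the three condition attributes


def indiscernible(y, x, P):
    """Directional y I_P x: the referent x must be complete on P."""
    fx, fy = TABLE3[x][0], TABLE3[y][0]
    return all(fx[q] is not None and fy[q] in (None, fx[q]) for q in P)


def indiscernible_star(y, x, P):
    """Symmetric y I*_P x: values agree wherever both are known."""
    fx, fy = TABLE3[x][0], TABLE3[y][0]
    return all(fx[q] is None or fy[q] is None or fx[q] == fy[q] for q in P)


def complete_on(P):
    """U_P: objects with no missing value on P."""
    return {x for x, (f, _) in TABLE3.items() if all(f[q] is not None for q in P)}


def known_on(P):
    """U*_P: objects with at least one known value on P."""
    return {x for x, (f, _) in TABLE3.items() if any(f[q] is not None for q in P)}


def approximations(X, P, relation, universe):
    """Return (lower, upper) of X, restricted to the given universe."""
    lower, upper = set(), set()
    for x in universe:
        cls = {y for y in TABLE3 if relation(y, x, P)}
        if cls <= X:
            lower.add(x)
        if cls & X:
            upper.add(x)
    return lower, upper


GOOD = {x for x, (_, d) in TABLE3.items() if d == "good"}
BAD = set(TABLE3) - GOOD

for label, X in (("good", GOOD), ("bad", BAD)):
    print(label, approximations(X, C, indiscernible, complete_on(C)),
          approximations(X, C, indiscernible_star, known_on(C)))

lower_good, _ = approximations(GOOD, C, indiscernible_star, known_on(C))
lower_bad, _ = approximations(BAD, C, indiscernible_star, known_on(C))
quality = (len(lower_good) + len(lower_bad)) / len(TABLE3)
print(round(quality, 2))
```

The quality is computed here from the cumulative lower approximations, which is the variant the text uses to restore quality of approximation, reduct and core.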
Table 3
Student evaluations with missing values

Student   Mathematics   Physics   Literature   Global evaluation
1         medium        bad       bad          bad
2         good          medium    *            good
3         medium        *         medium       bad
4         *             medium    medium       good
5         *             good      bad          bad
6         good          medium    bad          good

The following minimal exact rules can be induced from Table 3 (within parentheses there are objects supporting the corresponding decision rule):
1. ``if Mathematics is good and Physics is medium, then the student is good'' (students 2, 4, 6)
2. ``if Mathematics is medium and Literature is bad, then the student is bad'' (students 1, 5)

It is also possible to induce the following minimal approximate rule from Table 3:
3. ``if Mathematics is medium and Literature is medium, then the student is bad or good'' (students 3, 4)

We claim that decision rules induced from an incomplete data table according to our approach are robust in the following sense: among objects supporting a given decision rule there is at least one object matching exactly all elementary conditions of the rule. This is a distinctive feature in comparison with the approach proposed by Kryszkiewicz (1998). Her approach is based on the concept of a possible ``completion'' of an incomplete data table $S = \langle U, Q, V, f \rangle$. The completion is understood as a complete data table $S' = \langle U, Q, V, f' \rangle$ obtained by substitution of each missing value by some possible value from the domain of the corresponding attribute. According to Kryszkiewicz's approach, a decision rule ``if $f(x, q_1) = r_{q_1}$ and $f(x, q_2) = r_{q_2}$ and ... $f(x, q_p) = r_{q_p}$, then $x$ is assigned to class $Cl_t$'' is certain (exact) if for any possible completion of data table $S$ the implication expressed by the rule is true. A certain decision rule is called optimal (minimal) if no elementary condition can be eliminated from the condition part of the rule. Let us observe that some of the decision rules obtained using Kryszkiewicz's approach may not be robust, i.e.
certain and founded on a real object present in the data table. Consider for instance the following optimal and certain decision rule induced using this approach from Table 3:
4. ``if Mathematics is good and Physics is good, then the student is bad'' (student 5).

In Table 3 there is no object having a description matching exactly the condition part of this rule. In other words, there is no student characterized by a good level in Mathematics and by a good level in Physics. Therefore, rule (4) is not robust. On the contrary, each of the rules generated using our approach is supported by at least one real object matching exactly the condition part of the corresponding decision rule and, therefore, they are robust. Precisely, rule (1) is founded on students 2 and 6, rule (2) on student 1, and rule (3) on student 3.

4. Multiattribute and multicriteria classification and sorting problems

As mentioned above, a decision table contains the information relative to a set of objects, described by a certain number of attributes. The traditional rough set analysis of such a table consists in approximating the classifications induced by decision attributes by means of the classifications induced by condition attributes. The two kinds of classifications are built independently, i.e. they are not deduced one from the other. The aim of the decision analysis is to answer two basic questions. The first question is to explain decisions in terms of the circumstances in which they were made. The second is to give a recommendation on how to make a decision under specific circumstances. Recommendation is mainly based on decision rules induced from a decision table.
In this sense, the rough set approach is similar to the inductive learning approach (Michalski et al., 1998); however, the former goes far beyond the latter because, in the rough set approach, the recommendation task is preceded by the explanation, which gives pertinent information useful for decision support (reducts, core, quality of approximation, relevance of attributes).

According to Roy (1985, 1993), it is possible to distinguish the following three most frequent decision problems: classification, choice and ranking. In general, decisions are based on some characteristics of objects (actions). For example, when buying a car, the decisions can be based on such characteristics as price, maximum speed, fuel consumption, color, country of production, etc. We refer to these characteristics as attributes. Let us observe that, depending on the interpretation given to the attributes by the DM, some of them may have ordinal properties expressing preference scales, while others may not. The former attributes are called criteria, while the latter ones keep the name of attributes. In the above example, price, maximum speed and fuel consumption are criteria, because, for instance, a low price is better than a high price; most probably, color and country of production are not criteria but simple attributes because, for instance, red is not better than green. However, one can imagine that those attributes could also become criteria, because a DM could consider, for example, red better than green.

Moreover, decisions may be ordinal, because they express a preference, or may not be ordinal. For example, classification of cars for a catalogue does not impose any preference order among the classes (sport cars, family cars, utility cars, etc.); however, choice of the best car, or ranking of a set of cars from the best to the worst, surely imposes a preference order.
Let us also observe that, depending on the interpretation given to the classification by the DM, the classes may express a preference, so classification may also be ordinal. For instance, the DM could be interested in classification of cars in three categories: acceptable, hardly acceptable, non-acceptable. This type of classification is called sorting.

In the case of any multicriteria and/or multiattribute decision problem, no recommendation can be elaborated before the DM provides some preferential information suitable to the preference model assumed. There are two major models used until now in multicriteria decision analysis: functional and relational ones. The functional model has been extensively used within the framework of multiattribute utility theory (Keeney and Raiffa, 1976). The relational model has its most widely known representation in the form of an outranking relation (Roy, 1991) and a fuzzy relation (Fodor and Roubens, 1994). These models require specific preferential information more or less explicitly related to their parameters. For example, in the deterministic case, the DM is often asked for pairwise comparisons of objects, from which we can assess the substitution rates in the functional model or importance weights in the relational model (see Fishburn, 1967; Jacquet-Lagreze and Siskos, 1982; Mousseau, 1993). This kind of preferential information seems to be close to the natural reasoning of the DM. He/she is typically more confident exercising his/her comparisons than explaining them. The representation of this information by functional or relational models seems, however, less natural. According to Slovic (1975), people make decisions by searching for rules that provide good justification of their choices. So, after getting the preferential information in terms of exemplary comparisons, it would be natural to build the preference model in terms of ``if..., then...'' rules.
Then, these rules can be applied to a set of objects (potential actions) in order to obtain specific preference relations. From the exploitation of these relations, a suitable recommendation can be obtained to support the DM in the decision problem at hand.

The induction of rules from examples is a typical approach of artificial intelligence. It is concordant with the principle of posterior rationality by March (1988) and with aggregation–disaggregation logic by Jacquet-Lagreze (1981). The rules explain the preferential attitude of the DM and enable his/her understanding of the reasons for his/her preferences. The recognition of the rules by the DM (Langley and Simon, 1998) justifies their use for decision support. So, the preference model in the form of rules derived from examples fulfils both the explanation and recommendation tasks mentioned above.

In Sections 4.2–5.4, we present the main extensions of the rough set approach, resulting in a new methodology of modeling and exploitation of preferences in terms of decision rules. The rules are induced from the preferential information given by the DM in the form of examples of decisions. More precisely, for $A$ being a finite set of objects (real or fictitious actions, potential or not) considered in a multicriteria problem, the examples of decisions are confined to a subset of objects $B \subseteq A$, relatively well known to the DM, called reference objects. Depending on the type of the multicriteria problem, the examples concern either assignment of reference objects to decision classes (sorting problem) or pairwise comparisons of reference objects (choice and ranking problems).

4.1.
Problems of multiattribute classification

Up to now, the rough set approach to decision analysis has been limited to problems of multiattribute classification concerning the assignment of a set of objects described by a set of attributes (not criteria) to one of the pre-defined categories (Pawlak and Slowinski, 1994). Rough set analysis is naturally adapted to this type of problem because the set of classification examples may be represented directly in the decision table and it is possible to extract all the essential knowledge contained in the table using indiscernibility or similarity relations.

The rough sets theory has been successfully applied to a number of real classification problems in different fields, such as medicine, pharmacology, engineering, credit management, market research, financial analysis, environmental economics, linguistics, databases and other important sectors. The interesting results have encouraged experts in various disciplines to study the rough sets theory and its applications. For a collection of studies on the application of the rough set approach to real-world problems, see Slowinski (1992), Slowinski et al. (1988) and Polkowski and Skowron (1998). A brief but thorough review of the most important applications has been made by Pawlak (1997).

4.2. Problems of multicriteria sorting

As pointed out by Greco et al. (1996, 1998b,e, 1999c), the original rough set approach cannot extract all the essential knowledge contained in the decision table of multicriteria sorting problems, i.e. problems of assigning a set of objects described by a set of criteria to one of the pre-defined and preference-ordered categories. Notwithstanding, in many real problems it is important to consider the ordinal properties of the considered criteria.
For example, in bankruptcy risk evaluation, if the debt ratio (total debt/total activity) of company A has a modest value, while the same ratio of company B has a significant value, then, within the rough set approach, the two firms are just discernible, but no preference is given to one of them with respect to the attribute ``debt ratio''. In reality, from the point of view of the bankruptcy risk evaluation, it would be reasonable to consider firm A better than firm B, and not simply different (discernible).

Let us observe that the rough set approach based on the use of indiscernibility or similarity relations is not able to capture a specific kind of inconsistency which may occur when there is at least one criterion in the decision table. For instance, in the bankruptcy risk evaluation, which is a sorting problem, if firm A is better than B with respect to all the considered criteria (e.g. debt ratio, return on equity, etc.) but firm A is assigned to a class of higher risk than B, then there is an inconsistency which cannot be captured by the original rough set approach, because these firms are discernible. In order to detect this inconsistency, the rough approximation should handle the ordinal properties of criteria. This can be done by replacing the indiscernibility or similarity relation by the dominance relation, which is a very natural concept within multicriteria decision making.

On the basis of these considerations, Greco et al. (1998b,e, 1999g,h) have proposed a new rough set approach to multicriteria sorting problems, which is described in the following parts of the paper. Let us also mention that it is sometimes reasonable to consider both criteria and regular attributes (without preference-ordered domains) in a sorting problem.
In this particular case, the rough approximation is based on the use of a binary relation which aggregates dominance (with respect to considered criteria) and indiscernibility (with respect to considered attributes), as proposed by Greco et al. (1998e). A more general binary relation that aggregates dominance, indiscernibility and similarity was considered by Greco et al. (1999d). Let us also mention that a fuzzy set extension of the rough approximation by dominance has been presented by Greco et al. (1999a). Moreover, a fuzzy set extension of the rough approximation based on dominance and similarity together has been described recently by Greco et al. (1999k).

4.2.1. Rough approximation by means of dominance relations

Let $S_q$ be an outranking relation (Roy, 1985) on $U$ with reference to criterion $q \in C$, such that $x S_q y$ means ``$x$ is at least as good as $y$ with respect to criterion $q$''. Suppose that $S_q$ is a complete preorder, i.e. a strongly complete and transitive binary relation. Moreover, let $\mathbf{Cl} = \{Cl_t, t \in T\}$, $T = \{1, \ldots, n\}$, be a set of classes of $U$, such that each $x \in U$ belongs to one and only one class $Cl_t \in \mathbf{Cl}$. We assume that for all $r, s \in T$ such that $r > s$, each element of $Cl_r$ is preferred (strictly or weakly) to each element of $Cl_s$. More formally, if $S$ is a comprehensive outranking relation on $U$, i.e. $xSy$ means ``$x$ is at least as good as $y$'' for any $x, y \in U$, then it is supposed that

$$[x \in Cl_r,\ y \in Cl_s,\ r > s] \Rightarrow [xSy \text{ and not } ySx].$$

Let us also consider the following upward and downward unions of classes, respectively:

$$Cl_t^{\geq} = \bigcup_{s \geq t} Cl_s, \qquad Cl_t^{\leq} = \bigcup_{s \leq t} Cl_s.$$

Observe that $Cl_1^{\geq} = Cl_n^{\leq} = U$, $Cl_n^{\geq} = Cl_n$ and $Cl_1^{\leq} = Cl_1$.

It is said that $x$ dominates $y$ with respect to $P \subseteq C$ (denoted $x D_P y$) if $x S_q y$ for all $q \in P$. Since the intersection of complete preorders is a partial preorder, $S_q$ is a complete preorder for each $q \in P$, and $D_P = \bigcap_{q \in P} S_q$, the dominance relation $D_P$ is a partial preorder.
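Given $P \subseteq C$ and $x \in U$, let

$$D_P^+(x) = \{y \in U : y D_P x\}, \qquad D_P^-(x) = \{y \in U : x D_P y\}$$

represent the so-called P-dominating set and P-dominated set with respect to x, respectively. In the dominance-based rough set approach, the sets to be approximated are upward and downward unions of classes, and the items (granules of knowledge) used for this approximation are dominating and dominated sets.

The P-lower and the P-upper approximation of $Cl_t^{\geq}$, $t \in T$, with respect to $P \subseteq C$ (denotation $\underline{P}(Cl_t^{\geq})$ and $\overline{P}(Cl_t^{\geq})$, respectively), are defined as:

$$\underline{P}(Cl_t^{\geq}) = \{x \in U : D_P^+(x) \subseteq Cl_t^{\geq}\},$$
$$\overline{P}(Cl_t^{\geq}) = \bigcup_{x \in Cl_t^{\geq}} D_P^+(x) = \{x \in U : D_P^-(x) \cap Cl_t^{\geq} \neq \emptyset\}.$$

Analogously, the P-lower and the P-upper approximation of $Cl_t^{\leq}$, $t \in T$, with respect to $P \subseteq C$ (denotation $\underline{P}(Cl_t^{\leq})$ and $\overline{P}(Cl_t^{\leq})$, respectively), are defined as:

$$\underline{P}(Cl_t^{\leq}) = \{x \in U : D_P^-(x) \subseteq Cl_t^{\leq}\},$$
$$\overline{P}(Cl_t^{\leq}) = \bigcup_{x \in Cl_t^{\leq}} D_P^-(x) = \{x \in U : D_P^+(x) \cap Cl_t^{\leq} \neq \emptyset\}.$$

The P-lower and P-upper approximations defined as above satisfy the following properties for all $t \in T$ and for any $P \subseteq C$:

$$\underline{P}(Cl_t^{\geq}) \subseteq Cl_t^{\geq} \subseteq \overline{P}(Cl_t^{\geq}), \qquad \underline{P}(Cl_t^{\leq}) \subseteq Cl_t^{\leq} \subseteq \overline{P}(Cl_t^{\leq}).$$

Furthermore, the following specific complementarity properties hold:

$$\underline{P}(Cl_t^{\geq}) = U - \overline{P}(Cl_{t-1}^{\leq}), \quad t = 2, \ldots, n;$$
$$\underline{P}(Cl_t^{\leq}) = U - \overline{P}(Cl_{t+1}^{\geq}), \quad t = 1, \ldots, n-1;$$
$$\overline{P}(Cl_t^{\geq}) = U - \underline{P}(Cl_{t-1}^{\leq}), \quad t = 2, \ldots, n;$$
$$\overline{P}(Cl_t^{\leq}) = U - \underline{P}(Cl_{t+1}^{\geq}), \quad t = 1, \ldots, n-1.$$

The P-boundaries (P-doubtful regions) of $Cl_t^{\geq}$ and $Cl_t^{\leq}$ are defined as:

$$Bn_P(Cl_t^{\geq}) = \overline{P}(Cl_t^{\geq}) - \underline{P}(Cl_t^{\geq}), \qquad Bn_P(Cl_t^{\leq}) = \overline{P}(Cl_t^{\leq}) - \underline{P}(Cl_t^{\leq}).$$

We define the accuracy of approximation of $Cl_t^{\geq}$ and $Cl_t^{\leq}$ for all $t \in T$ and for any $P \subseteq C$, respectively, as

$$\alpha_P(Cl_t^{\geq}) = \frac{|\underline{P}(Cl_t^{\geq})|}{|\overline{P}(Cl_t^{\geq})|}, \qquad \alpha_P(Cl_t^{\leq}) = \frac{|\underline{P}(Cl_t^{\leq})|}{|\overline{P}(Cl_t^{\leq})|}.$$

The ratio

$$\gamma_P(Cl) = \frac{\left| U - \left( \left( \bigcup_{t \in T} Bn_P(Cl_t^{\geq}) \right) \cup \left( \bigcup_{t \in T} Bn_P(Cl_t^{\leq}) \right) \right) \right|}{|U|}$$

defines the quality of approximation of the partition Cl by means of the set of criteria P, or, briefly, the quality of sorting. This ratio expresses the relation between all the P-correctly classified objects and all the objects in the table.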
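The approximations, boundaries, accuracy and quality of sorting can be computed directly from their set definitions. The following Python sketch does so for a hypothetical four-object table with two gain-type criteria coded as integers and two classes; the data and all function names are illustrative assumptions, not material from the paper.

```python
# Hypothetical table: object -> (evaluations on two gain criteria, class index).
U = {
    1: ((2, 1), 2),
    2: ((1, 1), 2),
    3: ((1, 2), 1),
    4: ((0, 0), 1),
}
T = (1, 2)

def dominates(x, y):
    """x D_P y with P = all criteria of the toy table."""
    return all(a >= b for a, b in zip(U[x][0], U[y][0]))

def d_plus(x):
    """P-dominating set of x."""
    return {y for y in U if dominates(y, x)}

def d_minus(x):
    """P-dominated set of x."""
    return {y for y in U if dominates(x, y)}

def up(t):
    return {x for x in U if U[x][1] >= t}

def down(t):
    return {x for x in U if U[x][1] <= t}

def lower_up(t):
    return {x for x in U if d_plus(x) <= up(t)}

def upper_up(t):
    return {x for x in U if d_minus(x) & up(t)}

def lower_down(t):
    return {x for x in U if d_minus(x) <= down(t)}

def upper_down(t):
    return {x for x in U if d_plus(x) & down(t)}

def quality():
    """Share of objects lying outside every boundary region."""
    boundary = set()
    for t in T:
        boundary |= upper_up(t) - lower_up(t)
        boundary |= upper_down(t) - lower_down(t)
    return len(set(U) - boundary) / len(U)

accuracy_up_2 = len(lower_up(2)) / len(upper_up(2))
```

Here object 3 dominates object 2 but belongs to the worse class, so both fall into the boundary of $Cl_2^{\geq}$ and of $Cl_1^{\leq}$, and the quality of sorting of the toy table is 0.5.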
Every minimal subset $P \subseteq C$ such that $\gamma_P(Cl) = \gamma_C(Cl)$ is called a reduct of C with respect to Cl and is denoted by $RED_{Cl}(P)$. Again, a data table may have more than one reduct. The intersection of all the reducts is known as the core, denoted by $CORE_{Cl}$.

4.2.2. Decision rules

On the basis of the approximations obtained by means of the dominance relations, it is possible to induce a generalized description of the preferential information contained in the decision table, in terms of decision rules (Slowinski et al., 2000). The following three types of decision rules can be considered:

1. $D_{\geq}$-decision rules, which have the following form:
   if $f(x, q_1) \geq r_{q_1}$ and $f(x, q_2) \geq r_{q_2}$ and ... $f(x, q_p) \geq r_{q_p}$, then $x \in Cl_t^{\geq}$,
   where $P = \{q_1, q_2, \ldots, q_p\} \subseteq C$, $(r_{q_1}, r_{q_2}, \ldots, r_{q_p}) \in V_{q_1} \times V_{q_2} \times \cdots \times V_{q_p}$ and $t \in T$; these rules are supported only by objects from the P-lower approximations of the upward unions of classes $Cl_t^{\geq}$.

2. $D_{\leq}$-decision rules, which have the following form:
   if $f(x, q_1) \leq r_{q_1}$ and $f(x, q_2) \leq r_{q_2}$ and ... $f(x, q_p) \leq r_{q_p}$, then $x \in Cl_t^{\leq}$,
   where $P = \{q_1, q_2, \ldots, q_p\} \subseteq C$, $(r_{q_1}, r_{q_2}, \ldots, r_{q_p}) \in V_{q_1} \times V_{q_2} \times \cdots \times V_{q_p}$ and $t \in T$; these rules are supported only by objects from the P-lower approximations of the downward unions of classes $Cl_t^{\leq}$.

3. $D_{\geq\leq}$-decision rules, which have the following form:
   if $f(x, q_1) \geq r_{q_1}$ and $f(x, q_2) \geq r_{q_2}$ and ... $f(x, q_k) \geq r_{q_k}$ and $f(x, q_{k+1}) \leq r_{q_{k+1}}$ and ... $f(x, q_p) \leq r_{q_p}$, then $x \in Cl_t \cup Cl_{t+1} \cup \cdots \cup Cl_s$,
   where $O' = \{q_1, q_2, \ldots, q_k\} \subseteq C$, $O'' = \{q_{k+1}, q_{k+2}, \ldots, q_p\} \subseteq C$, $P = O' \cup O''$, $O'$ and $O''$ not necessarily disjoint, $(r_{q_1}, r_{q_2}, \ldots, r_{q_p}) \in V_{q_1} \times V_{q_2} \times \cdots \times V_{q_p}$, $s, t \in T$ such that $t < s$; these rules are supported only by objects from the P-boundaries of the unions of classes $Cl_t^{\leq}$ and $Cl_s^{\geq}$.

Let us observe that the set of decision rules induced from the approximations defined using dominance relations gives, in general, a more synthetic representation of the knowledge contained in the decision table than the set of rules induced from classical approximations defined using indiscernibility relations. The minimal sets of rules thus obtained have a smaller number of rules and use a smaller number of conditions. They also do not require any discretization of the numerical scales of criteria prior to rule induction. Moreover, the application of these rules to new objects gives better results, in general. This is due to the more general syntax of the rules ("$\geq$" and "$\leq$" are used instead of "$=$").

4.2.3. An example

Let us now apply the rough approximation by dominance relation to the decision table from Table 1. Within this approach we approximate the class $Cl_1^{\leq}$ of "(at most) bad" students and the class $Cl_2^{\geq}$ of "(at least) good" students. Since only two classes are considered, we have $Cl_1^{\leq} = Cl_1$ and $Cl_2^{\geq} = Cl_2$. As previously, $C = \{A_1, A_2, A_3\}$ and $D = \{A_4\}$. In this case, however, $A_1$, $A_2$ and $A_3$ are criteria and the classes are preference-ordered. This means that
· with respect to $A_1$, "good" is better than "medium" and "medium" is better than "bad",
· with respect to $A_2$, "good" is better than "bad",
· with respect to $A_3$, "good" is better than "bad",
· with respect to $A_4$, "good" is better than "bad".

The C-lower approximations, the C-upper approximations and the C-boundaries of classes $Cl_1^{\leq}$ and $Cl_2^{\geq}$ are equal, respectively, to:

$$\underline{C}(Cl_1^{\leq}) = \{4\}, \quad \overline{C}(Cl_1^{\leq}) = \{2, 3, 4, 5\}, \quad Bn_C(Cl_1^{\leq}) = \{2, 3, 5\},$$
$$\underline{C}(Cl_2^{\geq}) = \{1, 6\}, \quad \overline{C}(Cl_2^{\geq}) = \{1, 2, 3, 5, 6\}, \quad Bn_C(Cl_2^{\geq}) = \{2, 3, 5\}.$$

Therefore, the accuracy of the approximation is 0.25 for $Cl_1^{\leq}$ and 0.4 for $Cl_2^{\geq}$, while the quality of sorting is equal to 0.5. There is only one reduct, which is also the core, i.e. $RED_{Cl}(C) = CORE_{Cl}(C) = \{A_1\}$.
The following minimal set of decision rules can be obtained from the considered decision table (the objects supporting each rule are given in parentheses):

1. if $f(x, A_1) \geq$ good, then $x \in Cl_2^{\geq}$ (1, 6),
2. if $f(x, A_1) \leq$ bad, then $x \in Cl_1^{\leq}$ (4),
3. if $f(x, A_1) \geq$ medium and $f(x, A_1) \leq$ medium (i.e. $f(x, A_1)$ is medium), then $x \in Cl_1 \cup Cl_2$ (2, 3, 5).

Let us notice that student 5 dominates student 3, i.e. student 5 is at least as good as student 3 with respect to all three criteria; however, student 5 has a comprehensive evaluation worse than student 3. This can be interpreted as an inconsistency revealed by the approximation based on dominance, which cannot be captured by the approximation based on indiscernibility. Moreover, let us remark that the decision rules induced from approximations defined using dominance relations give a more synthetic representation of the knowledge contained in the decision table. The minimal set of decision rules obtained from the dominance approach has a smaller number of stronger rules (3 against 4) and uses a smaller number of conditions (3 against 6). Furthermore, let us observe that some rules obtained from the original rough set approach are difficult to interpret. For example, rule (3') obtained by the original rough set approach says that "if the level in Mathematics is medium and the level in Physics is good, then the student is bad". One would expect that a student with lower marks, e.g. a student with the same level in Mathematics but with a medium level in Physics, should still be bad. Surprisingly, student 3 has these characteristics and is nevertheless classified as good.

4.2.4. Another example: a comparison of rough sets with the Sugeno integral

Let us suppose that the director of the school was not satisfied with the obtained results.
Therefore, after some interactions with an analyst, he made a few modifications to the evaluation procedure. As a consequence, the scales of evaluation in Mathematics, Physics and Literature, as well as the global evaluation scale, are composed of the three following grades: "bad", "medium", "good". Moreover, a new set of examples, presented in Table 4, has been considered. According to the decision attribute, the students are divided into three preference-ordered classes: $Cl_1 = \{\text{bad}\}$, $Cl_2 = \{\text{medium}\}$, $Cl_3 = \{\text{good}\}$. Thus, the following unions of classes were approximated:
· $Cl_1^{\leq} = Cl_1$, i.e. the class of (at most) bad students,
· $Cl_2^{\leq} = Cl_1 \cup Cl_2$, i.e. the class of at most medium students,
· $Cl_2^{\geq} = Cl_2 \cup Cl_3$, i.e. the class of at least medium students,
· $Cl_3^{\geq} = Cl_3$, i.e. the class of (at least) good students.

When considering all the criteria, the unions of classes are perfectly approximated, i.e. for each union the lower and the upper approximation are the same and, therefore, coincide with the union:

$$\underline{C}(Cl_1^{\leq}) = \overline{C}(Cl_1^{\leq}) = Cl_1^{\leq}, \qquad \underline{C}(Cl_2^{\leq}) = \overline{C}(Cl_2^{\leq}) = Cl_2^{\leq},$$
$$\underline{C}(Cl_2^{\geq}) = \overline{C}(Cl_2^{\geq}) = Cl_2^{\geq}, \qquad \underline{C}(Cl_3^{\geq}) = \overline{C}(Cl_3^{\geq}) = Cl_3^{\geq}.$$

There is only one reduct, which is also the core: $RED_{Cl}(C) = CORE_{Cl}(C) = \{A_1, A_2, A_3\}$. Since the reduct is composed of all the criteria, each one is indispensable for a precise explanation of the sorting.

The following set of minimal $D_{\geq}$-decision rules has been induced from Table 4:
1. "if Mathematics $\geq$ medium and Physics $\geq$ medium and Literature $\geq$ medium, then student $\geq$ medium",
2. "if Mathematics $\geq$ good and Physics $\geq$ medium, then student $\geq$ medium",
3. "if Mathematics $\geq$ medium and Physics $\geq$ good, then student $\geq$ medium",
4. "if Mathematics $\geq$ good and Physics $\geq$ medium and Literature $\geq$ medium, then student $\geq$ good",
5. "if Mathematics $\geq$ medium and Physics $\geq$ good and Literature $\geq$ medium, then student $\geq$ good",
6. all uncovered students are bad.
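The six rules above can be encoded and checked against the fifteen examples of Table 4. The Python sketch below is only an illustration: the encoding of a rule as a triple of thresholds, the integer coding of the grades and the function name `classify` are assumptions, and the highest suggested class is taken when several rules match.

```python
RANK = {"bad": 0, "medium": 1, "good": 2}
GRADE = ["bad", "medium", "good"]

# Rules 1-5 as (thresholds on Mathematics, Physics, Literature; class).
# A threshold of "bad" means the criterion does not appear in the rule.
RULES = [
    (("medium", "medium", "medium"), "medium"),
    (("good", "medium", "bad"), "medium"),
    (("medium", "good", "bad"), "medium"),
    (("good", "medium", "medium"), "good"),
    (("medium", "good", "medium"), "good"),
]

def classify(profile):
    """Highest class suggested by the matching rules; rule 6 gives 'bad'."""
    best = 0
    for thresholds, cls in RULES:
        if all(RANK[v] >= RANK[t] for v, t in zip(profile, thresholds)):
            best = max(best, RANK[cls])
    return GRADE[best]

# Table 4, students 1-15: Mathematics, Physics, Literature, decision of the DM.
TABLE4 = [row.split() for row in """
medium medium bad bad
good medium bad medium
bad good bad bad
medium good bad medium
good good bad medium
bad bad medium bad
good bad medium bad
medium medium medium medium
good medium medium good
medium good medium good
good bad good bad
medium medium good medium
good medium good good
bad good good bad
medium good good good
""".strip().splitlines()]

mismatches = [row for row in TABLE4 if classify(row[:3]) != row[3]]
```

An empty `mismatches` list means that the rules reproduce every decision of the DM in Table 4, as expected when all unions of classes are perfectly approximated.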
Table 4
Decision table with examples of sorting

Student  Mathematics  Physics  Literature  Decision made by DM
1        medium       medium   bad         bad
2        good         medium   bad         medium
3        bad          good     bad         bad
4        medium       good     bad         medium
5        good         good     bad         medium
6        bad          bad      medium      bad
7        good         bad      medium      bad
8        medium       medium   medium      medium
9        good         medium   medium      good
10       medium       good     medium      good
11       good         bad      good        bad
12       medium       medium   good        medium
13       good         medium   good        good
14       bad          good     good        bad
15       medium       good     good        good

These rules make it possible to classify students with all possible evaluations on the three considered criteria, as shown in Table 5 (the rules matching each student's evaluation are indicated in parentheses). If more than one $D_{\geq}$-decision rule matches a student's evaluation, then the sorting decision corresponds to the highest suggested class. Let us observe that the students represented in Table 5 could also be assigned to the same classes by the following set of $D_{\leq}$-decision rules:
(1') "if Mathematics $\leq$ bad, then student $\leq$ bad",
(2') "if Physics $\leq$ bad, then student $\leq$ bad",
(3') "if Mathematics $\leq$ medium and Physics $\leq$ medium and Literature $\leq$ bad, then student $\leq$ bad",
(4') "if Literature $\leq$ bad, then student $\leq$ medium",
(5') "if Mathematics $\leq$ medium and Physics $\leq$ medium, then student $\leq$ medium",
(6') all uncovered students are good.

It is interesting to note that the 27 decisions made by the rules and presented in Table 5 cannot be represented by the most general max-min aggregation operator permitting ordinal aggregation, i.e. the fuzzy integral proposed by Sugeno (1974). To apply the Sugeno integral, a common ordinal scale must be assumed for the criteria and for a fuzzy measure defined on the set of criteria. Let $V = V_1 \times V_2 \times \cdots \times V_n$ denote the evaluation space of the criteria. Each $x \in V$ is called a profile. In our example, $V = \{\text{bad}, \text{medium}, \text{good}\}^3$ and each of the 27 cases of evaluation from Table 5 corresponds to a profile.
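The claim that the $D_{\geq}$-rules and the $D_{\leq}$-rules lead to the same assignment on all 27 profiles can be checked by enumeration. The following Python sketch assumes an integer coding of the grades, and it assumes that for $D_{\leq}$-rules the lowest suggested class is taken (the mirror image of the convention stated for $D_{\geq}$-rules); both function names are hypothetical.

```python
from itertools import product

BAD, MEDIUM, GOOD = 0, 1, 2

def by_upward_rules(m, p, l):
    """D>=-rules 1-6: take the highest class suggested."""
    cls = BAD  # rule 6: uncovered students are bad
    if ((m >= MEDIUM and p >= MEDIUM and l >= MEDIUM)
            or (m >= GOOD and p >= MEDIUM)
            or (m >= MEDIUM and p >= GOOD)):
        cls = MEDIUM  # rules 1-3
    if l >= MEDIUM and ((m >= GOOD and p >= MEDIUM)
                        or (m >= MEDIUM and p >= GOOD)):
        cls = GOOD  # rules 4-5
    return cls

def by_downward_rules(m, p, l):
    """D<=-rules (1')-(6'): take the lowest class suggested (assumption)."""
    cls = GOOD  # rule (6'): uncovered students are good
    if l <= BAD or (m <= MEDIUM and p <= MEDIUM):
        cls = MEDIUM  # rules (4')-(5')
    if m <= BAD or p <= BAD or (m <= MEDIUM and p <= MEDIUM and l <= BAD):
        cls = BAD  # rules (1')-(3')
    return cls

# All profiles (Mathematics, Physics, Literature) of the evaluation space V.
profiles = list(product((BAD, MEDIUM, GOOD), repeat=3))
agree = all(by_upward_rules(*x) == by_downward_rules(*x) for x in profiles)
counts = [sum(by_upward_rules(*x) == c for x in profiles)
          for c in (BAD, MEDIUM, GOOD)]
```

Under these assumptions the two rule sets agree on every profile, and `counts` gives the number of profiles assigned to "bad", "medium" and "good", respectively.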
Table 5
Expression of preferences on all 27 cases

Student  Mathematics  Physics  Literature  Decision made by $D_{\geq}$-decision rules
1        bad          bad      bad         bad (#6)
2        medium       bad      bad         bad (#6)
3        good         bad      bad         bad (#6)
4        bad          medium   bad         bad (#6)
5        medium       medium   bad         bad (#6)
6        good         medium   bad         medium (#2)
7        bad          good     bad         bad (#6)
8        medium       good     bad         medium (#3)
9        good         good     bad         medium (#2,3)
10       bad          bad      medium      bad (#6)
11       medium       bad      medium      bad (#6)
12       good         bad      medium      bad (#6)
13       bad          medium   medium      bad (#6)
14       medium       medium   medium      medium (#1)
15       good         medium   medium      good (#1,2,4)
16       bad          good     medium      bad (#6)
17       medium       good     medium      good (#1,3,5)
18       good         good     medium      good (#1,2,3,4,5)
19       bad          bad      good        bad (#6)
20       medium       bad      good        bad (#6)
21       good         bad      good        bad (#6)
22       bad          medium   good        bad (#6)
23       medium       medium   good        medium (#1)
24       good         medium   good        good (#1,2,4)
25       bad          good     good        bad (#6)
26       medium       good     good        good (#1,3,5)
27       good         good     good        good (#1,2,3,4,5)

The scale value of $x \in V$ on criterion $c_i$ is denoted by $c_i(x)$, while the scale value of a subset of criteria $J = \{c_{j_1}, c_{j_2}, \ldots, c_{j_k}\}$ is denoted by $\mu(J)$. For each $x \in V$, the criteria are ordered according to increasing values of $c_i(x)$: $c_{(1)}, c_{(2)}, \ldots, c_{(n)}$, such that $c_{(1)}(x) \leq c_{(2)}(x) \leq \cdots \leq c_{(n)}(x)$. The Sugeno integral is defined as follows:

$$f(c_1(x), c_2(x), \ldots, c_n(x)) = \max_{i = 1, \ldots, n} \{\min\{c_{(i)}(x), \mu(J_{(i)})\}\},$$

where $J_{(i)} = \{c_{(i)}, \ldots, c_{(n)}\}$. Alternatively, the Sugeno integral can be presented as follows:

$$f(c_1(x), c_2(x), \ldots, c_n(x)) = \max_{J \subseteq \{c_1, \ldots, c_n\}} \left[ \min\{c_{j_1}(x), \ldots, c_{j_k}(x), \mu(J)\} \right], \quad \text{where } J = \{c_{j_1}, \ldots, c_{j_k}\}.$$

Greco et al. (1999j) have shown that the Sugeno integral can be represented in terms of single-graded decision rules having the following syntax:

"if $c_{j_1}(x) \geq r$ and $c_{j_2}(x) \geq r$ and ... $c_{j_h}(x) \geq r$, then $x \in Cl_r^{\geq}$"

and satisfying the following conditions:
· given the rule "if $c_{j_1}(x) \geq r$ and $c_{j_2}(x) \geq r$ and ... $c_{j_h}(x) \geq r$, then $x \in Cl_r^{\geq}$", the following rules are also true for each $s < r$: "if $c_{j_1}(x) \geq s$ and $c_{j_2}(x) \geq s$ and ... $c_{j_h}(x) \geq s$, then $x \in Cl_s^{\geq}$";
· given the rule "if $c_{j_1}(x) \geq r$ and $c_{j_2}(x) \geq r$ and ... $c_{j_h}(x) \geq r$, then $x \in Cl_r^{\geq}$", the following rules are also true for each $\{c_{i_1}, c_{i_2}, \ldots, c_{i_k}\} \supseteq \{c_{j_1}, c_{j_2}, \ldots, c_{j_h}\}$: "if $c_{i_1}(x) \geq r$ and $c_{i_2}(x) \geq r$ and ... $c_{i_k}(x) \geq r$, then $x \in Cl_r^{\geq}$" (due to monotonicity of the fuzzy measure).

Why can the 27 decisions made by the rules and presented in Table 5 not be represented by the Sugeno integral? This can be understood intuitively from the previous result: many of the rules applied for the sorting of the 27 cases are not single-graded, i.e. they use more than one grade of the evaluation scale. The answer can also be more direct: consider the $D_{\geq}$-decision rule (2), "if Mathematics $\geq$ good and Physics $\geq$ medium, then student $\geq$ medium", and check all the possibilities that would allow the Sugeno integral to give the same sorting as rule (2) without misclassification:
case 1: $\mu(\{\text{Mathematics}, \text{Physics}\}) = $ medium,
case 2: $\mu(\{\text{Mathematics}\}) = $ good,
case 3: $\mu(\{\text{Physics}\}) = $ medium.

Case 1 corresponds to the rule "if Mathematics $\geq$ medium and Physics $\geq$ medium, then student $\geq$ medium", but its condition part is weaker than that of rule (2); case 2 corresponds to the rule "if Mathematics $\geq$ good, then student $\geq$ medium", but its condition part is weaker than that of rule (2); case 3 corresponds to the rule "if Physics $\geq$ medium, then student $\geq$ medium", but its condition part is weaker than that of rule (2). Therefore, it is not possible to represent the sorting made by the rules in Table 5 by the Sugeno integral.

4.3. Multi-criteria sorting problem with missing values

Like the original approach, the rough set approach based on dominance relations requires the data table to be complete. An extension of the rough set approach based on dominance to the analysis of incomplete data tables has been proposed in Greco et al. (1999e,f).
It is assumed in this extension that the dominance relation between two objects is a directional statement where a subject is compared to a referent object having no missing values. The extended approach maintains all the good characteristics of the dominance-based rough set approach and boils down to the latter when there are no missing values. The rules induced from the rough approximations defined according to the extended relation are robust, i.e. each rule is supported by at least one object with no missing value on the criteria represented in the condition part of the rule.

Keeping in mind that a comparison of subject y to referent x is directional, for any two objects $x, y \in U$ we say that subject y dominates referent x with respect to criteria from $P \subseteq C$ (denoted by $y D_P^+ x$) if for every criterion $q \in P$ the following conditions are met: (1) $f(x, q) \neq *$, (2) $f(y, q) \geq f(x, q)$ or $f(y, q) = *$. We also say that subject y is dominated by referent x with respect to criteria from $P \subseteq C$ (denoted by $x D_P^- y$) if for every criterion $q \in P$ the following conditions are met: (1) $f(x, q) \neq *$, (2) $f(y, q) \leq f(x, q)$ or $f(y, q) = *$.

The above definition means that the referent object considered for dominance $D_P^+$ and $D_P^-$ should have no missing values on criteria from set P. The binary relations $D_P^+$ and $D_P^-$ are not necessarily reflexive. However, $D_P^+$ and $D_P^-$ are transitive. For each $P \subseteq C$ we restore the definition of set $U_P$ from Section 3.2.

Given $P \subseteq C$ and $x \in U$, the "granules of knowledge" used for approximation are:
· a set of objects dominating x, called the P-dominating set, $D_P^+(x) = \{y \in U : y D_P^+ x\}$,
· a set of objects dominated by x, called the P-dominated set, $D_P^-(x) = \{y \in U : x D_P^- y\}$.

For any $P \subseteq C$ we say that $x \in U$ belongs to $Cl_t^{\geq}$ without any ambiguity if $x \in Cl_t^{\geq}$ and for all the objects $y \in U$ dominating x with respect to P, we have $y \in Cl_t^{\geq}$, i.e. $D_P^+(x) \subseteq Cl_t^{\geq}$.
Furthermore, we say that $x \in U$ could belong to $Cl_t^{\geq}$ if there existed at least one object $y \in Cl_t^{\geq}$ dominated by x with respect to P, i.e. $y \in D_P^-(x)$. Thus, with respect to $P \subseteq C$, the set of all objects belonging to $Cl_t^{\geq}$ without any ambiguity constitutes the P-lower approximation of $Cl_t^{\geq}$, denoted by $\underline{P}(Cl_t^{\geq})$, and the set of all objects that could belong to $Cl_t^{\geq}$ constitutes the P-upper approximation of $Cl_t^{\geq}$, denoted by $\overline{P}(Cl_t^{\geq})$, for $t = 1, \ldots, n$:

$$\underline{P}(Cl_t^{\geq}) = \{x \in U_P : D_P^+(x) \subseteq Cl_t^{\geq}\}, \qquad \overline{P}(Cl_t^{\geq}) = \{x \in U_P : D_P^-(x) \cap Cl_t^{\geq} \neq \emptyset\}.$$

Analogously, one can define the P-lower approximation and the P-upper approximation of $Cl_t^{\leq}$ for $t = 1, \ldots, n$:

$$\underline{P}(Cl_t^{\leq}) = \{x \in U_P : D_P^-(x) \subseteq Cl_t^{\leq}\}, \qquad \overline{P}(Cl_t^{\leq}) = \{x \in U_P : D_P^+(x) \cap Cl_t^{\leq} \neq \emptyset\}.$$

Let $(Cl_t^{\geq})^P = Cl_t^{\geq} \cap U_P$ and $(Cl_t^{\leq})^P = Cl_t^{\leq} \cap U_P$, $t = 1, \ldots, n$. The rough approximation defined as above satisfies the following properties:
· (Rough inclusion). For each $Cl_t^{\geq}$ and $Cl_t^{\leq}$, $t = 1, \ldots, n$, and for each $P \subseteq C$:
$$\underline{P}(Cl_t^{\geq}) \subseteq (Cl_t^{\geq})^P \subseteq \overline{P}(Cl_t^{\geq}), \qquad \underline{P}(Cl_t^{\leq}) \subseteq (Cl_t^{\leq})^P \subseteq \overline{P}(Cl_t^{\leq});$$
· (Complementarity). For each $Cl_t^{\geq}$, $t = 2, \ldots, n$, and $Cl_t^{\leq}$, $t = 1, \ldots, n-1$, and for each $P \subseteq C$:
$$\underline{P}(Cl_t^{\geq}) = U_P - \overline{P}(Cl_{t-1}^{\leq}), \qquad \underline{P}(Cl_t^{\leq}) = U_P - \overline{P}(Cl_{t+1}^{\geq}).$$

To preserve the monotonicity property of the lower approximation (see Section 3.2) it is necessary to use another definition of the approximation for a given $Cl_t^{\geq}$ and $Cl_t^{\leq}$, $t = 1, \ldots, n$, and for each $P \subseteq C$:

$$\underline{P}(Cl_t^{\geq})^* = \bigcup_{R \subseteq P} \underline{R}(Cl_t^{\geq}), \qquad (5)$$
$$\underline{P}(Cl_t^{\leq})^* = \bigcup_{R \subseteq P} \underline{R}(Cl_t^{\leq}). \qquad (6)$$

$\underline{P}(Cl_t^{\geq})^*$ and $\underline{P}(Cl_t^{\leq})^*$ will be called cumulative P-lower approximations of unions $Cl_t^{\geq}$ and $Cl_t^{\leq}$, respectively, $t = 1, \ldots, n$, because they include all the objects belonging to all R-lower approximations of $Cl_t^{\geq}$ and $Cl_t^{\leq}$, where $R \subseteq P$.

It can be shown that another type of dominance relation, denoted by $D_P^*$, permits a direct definition of the cumulative P-lower approximations in a classical way. For each $x, y \in U$ and for each $P \subseteq Q$, $y D_P^* x$ means that $f(y, q) \geq f(x, q)$ or $f(x, q) = *$ and/or $f(y, q) = *$, for every $q \in P$.
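The difference between the directional relation $D_P^+$ and the relation $D_P^*$ can be seen on a small example. The Python sketch below uses `None` for the missing value "*" and four hypothetical objects with integer-coded evaluations; the data and both function names are assumptions made for illustration only.

```python
MISSING = None  # stands for the missing value "*"

# Hypothetical evaluations on two gain-type criteria.
F = {
    "w": {"q1": 0, "q2": 0},
    "x": {"q1": 2, "q2": 1},
    "y": {"q1": 2, "q2": MISSING},
    "z": {"q1": MISSING, "q2": MISSING},
}

def dominates_directional(y, x, P):
    """y D+_P x: the referent x has no missing value on P and, on each q,
    the subject y is at least as good as x or has a missing value."""
    return all(
        F[x][q] is not MISSING
        and (F[y][q] is MISSING or F[y][q] >= F[x][q])
        for q in P
    )

def dominates_star(y, x, P):
    """y D*_P x: on each q, y is at least as good as x, or at least one of
    the two values is missing."""
    return all(
        F[x][q] is MISSING or F[y][q] is MISSING or F[y][q] >= F[x][q]
        for q in P
    )

P = ["q1", "q2"]
# The directional relation is not reflexive on incomplete objects.
reflexive_directional = dominates_directional("y", "y", P)
# The starred relation is reflexive but not transitive:
# w D* z and z D* x hold, yet w D* x fails.
chain = (dominates_star("w", "z", P), dominates_star("z", "x", P),
         dominates_star("w", "x", P))
```

The object "z", which has only missing values, is related to every object by $D_P^*$, and this is exactly what breaks transitivity in the sketch.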
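Now, given $P \subseteq C$ and $x \in U$, the "granules of knowledge" used for approximation are:
· a set of objects dominating x, called the P-dominating set, $D_P^{+*}(x) = \{y \in U : y D_P^* x\}$,
· a set of objects dominated by x, called the P-dominated set, $D_P^{-*}(x) = \{y \in U : x D_P^* y\}$.

$D_P^*$ is reflexive but not transitive. Greco et al. (1999e,f) proved that definitions (5) and (6) are equivalent to the following definitions (for $U_P^*$ see the definition in Section 3.2):

$$\underline{P}(Cl_t^{\geq})^* = \{x \in U_P^* : D_P^{+*}(x) \subseteq Cl_t^{\geq}\}, \qquad \underline{P}(Cl_t^{\leq})^* = \{x \in U_P^* : D_P^{-*}(x) \subseteq Cl_t^{\leq}\}.$$

Using $D_P^*$ we can give definitions of the P-upper approximations of $Cl_t^{\geq}$ and $Cl_t^{\leq}$, complementary to $\underline{P}(Cl_t^{\geq})^*$ and $\underline{P}(Cl_t^{\leq})^*$, respectively:

$$\overline{P}(Cl_t^{\geq})^* = \{x \in U_P^* : D_P^{-*}(x) \cap Cl_t^{\geq} \neq \emptyset\}, \qquad \overline{P}(Cl_t^{\leq})^* = \{x \in U_P^* : D_P^{+*}(x) \cap Cl_t^{\leq} \neq \emptyset\}.$$

For each $Cl_t^{\geq} \subseteq U$ and $Cl_t^{\leq} \subseteq U$, let $(Cl_t^{\geq})^* = Cl_t^{\geq} \cap U_P^*$ and $(Cl_t^{\leq})^* = Cl_t^{\leq} \cap U_P^*$. Let us remark that $x \in U_P^*$ if and only if there exists $R \neq \emptyset$ such that $R \subseteq P$ and $x \in U_R$.

Rough approximations $\underline{P}(Cl_t^{\geq})^*$, $\underline{P}(Cl_t^{\leq})^*$, $\overline{P}(Cl_t^{\geq})^*$ and $\overline{P}(Cl_t^{\leq})^*$ satisfy the following properties:
· (Rough inclusion). For each $Cl_t^{\geq}$ and $Cl_t^{\leq}$, $t = 1, \ldots, n$, and for each $P \subseteq C$:
$$\underline{P}(Cl_t^{\geq})^* \subseteq (Cl_t^{\geq})^* \subseteq \overline{P}(Cl_t^{\geq})^*, \qquad \underline{P}(Cl_t^{\leq})^* \subseteq (Cl_t^{\leq})^* \subseteq \overline{P}(Cl_t^{\leq})^*.$$
· (Complementarity). For each $Cl_t^{\geq}$, $t = 2, \ldots, n$, and $Cl_t^{\leq}$, $t = 1, \ldots, n-1$, and for each $P \subseteq C$:
$$\underline{P}(Cl_t^{\geq})^* = U_P^* - \overline{P}(Cl_{t-1}^{\leq})^*, \qquad \underline{P}(Cl_t^{\leq})^* = U_P^* - \overline{P}(Cl_{t+1}^{\geq})^*.$$
· (Monotonicity of the accuracy of approximation). For each $Cl_t^{\geq}$ and $Cl_t^{\leq}$, $t = 1, \ldots, n$, and for each $P, R \subseteq C$ such that $P \subseteq R$, the following inclusions hold:
$$\underline{P}(Cl_t^{\geq})^* \subseteq \underline{R}(Cl_t^{\geq})^*, \qquad \underline{P}(Cl_t^{\leq})^* \subseteq \underline{R}(Cl_t^{\leq})^*.$$
Furthermore, if $U_P^* = U_R^*$, the following inclusions are also true:
$$\overline{P}(Cl_t^{\geq})^* \supseteq \overline{R}(Cl_t^{\geq})^*, \qquad \overline{P}(Cl_t^{\leq})^* \supseteq \overline{R}(Cl_t^{\leq})^*.$$

Due to the property of monotonicity, when augmenting a set of attributes P, we get lower approximations of $Cl_t^{\geq}$ and $Cl_t^{\leq}$, $t = 1, \ldots, n$, that are at least of the same cardinality. Thus, we can restore for the case of missing values the key concepts of rough sets theory: accuracy and quality of approximation, reduct and core.

4.3.1.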
Example Let us consider the example presented in Section 3.2.1 in the context of the multi-criteria sorting, where attributes are criteria with preference-ordered scales and the decision classes are also preference-ordered. We shall apply the dominance-based rough set approach to the same Table 3. Using the extended rough set approach presented in Section 4.3, we will approximate the downward and the upward unions of classes, i.e. the class of students ``at most bad'' Cl16 and the class of students ``at least good'' Cl2P . Since only two classes are considered, these unions coincide with the class of ``bad'' students Cl1 and with the class of ``good'' students Cl2 , respectively. The C-lower approximations, the C-upper approximations and the C-boundaries of the classes of ``good'' and ``bad'' students are equal, respectively, to: C good £; C good £; C bad f1g; C bad f1g; C good f6g; C bad f1g; C good f2; 3; 4; 5; 6g; C bad f1; 2; 3; 4; 5; 6g: Let us remark that students 2 and 6 belong to C-lower approximation of the class of ``good'' students when this approximation is calculated using the indiscernibility relation (see I C good f2; 6g in Section 3.2.1), however, they belong to the boundary of ``good'' students when this approximation is calculated using the dominance relation. This is true because there is no ``bad'' student indiscernible with students 2 and 6. Observe, however, that although student 5 has a comprehensive evaluation worse than students 2 and 6 (``bad'' vs. ``good''), he/she dominates students 2 and 6 with respect to the three criteria. Precisely, student 2 is dominated by student 5 because 2 has a worse level in Physics (medium vs. good), and on the levels in Mathematics and Literature either student 2 or student 5 has a missing value. For this reason, the assignments of students 2 and 5 are inconsistent and thus they both belong to the C-boundary of the ``(at least) good'' class constructed using the dominance relation. 
The inconsistency between student 2 and student 5 cannot be detected using the classical rough set approach based on indiscernibility, because these students are discernible with respect to C. A similar explanation holds for students 5 and 6.

The quality of sorting using criteria from C is equal to 0.17. There is only one reduct, which is also the core; it is composed of one criterion: {Physics}.

The following minimal set of decision rules can be obtained from the considered data table (the objects supporting each decision rule are given in parentheses):
1. "if Physics $\leq$ bad, then student is (at most) bad" (students 1, 3),
2. "if Physics $\geq$ medium and Physics $\leq$ good, then student is bad or good" (not enough information to assign the student to one class only) (students 2, 3, 4, 5, 6).

5. Multicriteria choice and ranking problems

As pointed out above, the use of rough sets in the past has been limited to problems of multiattribute classification only. In Section 4.2, we presented an extension of the rough set approach to the multicriteria sorting problem. In the case of multicriteria choice and ranking problems we need further extensions, because the decision table in its original form does not allow the representation of preference binary relations between objects. To handle binary relations within the rough set approach, Greco et al. (1995, 1996, 1998c) proposed to operate on a so-called pairwise comparison table (PCT), i.e. with respect to a choice or ranking problem, a decision table whose rows represent pairs of objects for which multicriteria evaluations and a comprehensive preference relation are known.

The use of an indiscernibility relation on the PCT causes problems with the interpretation of the approximations of the preference relation and of the decision rules derived from these approximations. Indiscernibility permits handling the inconsistency which occurs when two pairs of objects have preferences of the same strength on the considered criteria, but the comprehensive preference relations established for these pairs are not the same. When dealing with criteria, another type of inconsistency may also occur, connected with violation of the dominance principle: on a given set of criteria, one pair of objects is characterized by some preferences and another pair has all preferences of at least the same strength; however, for the first pair we have a comprehensive preference and for the other an inverse comprehensive preference. The indiscernibility relation is not able to handle this type of inconsistency. For this reason, another way of defining the approximations and decision rules has been proposed, which is based on the use of graded dominance relations.

5.1. The pairwise comparison table

Let C be the set of criteria used for evaluation of objects from A. For any criterion $q \in C$, let $T_q$ be a finite set of binary relations defined on A on the basis of the evaluations of objects from A with respect to the considered criterion q, such that for any $(x, y) \in A \times A$ exactly one binary relation $s \in T_q$ is verified. More precisely, given the domain $V_q$ of $q \in C$, if $v'_q, v''_q \in V_q$ are the respective evaluations of $x, y \in A$ by means of q and $(x, y) \in s$, with $s \in T_q$, then for each $w, z \in A$ having the same evaluations $v'_q, v''_q$ by means of q, $(w, z) \in s$. For interesting applications it should be $card(T_q) \geq 2$, for each $q \in C$. Furthermore, let $T_d$ be a set of binary relations defined on set A (comprehensive pairwise comparisons) such that at most one binary relation $s \in T_d$ is verified for any $(x, y) \in A \times A$.

The preferential information has the form of pairwise comparisons of reference objects from $B \subseteq A$, considered as examples of decision. The PCT is defined as the data table $S_{PCT} = \langle \mathcal{B}, C \cup \{d\}, T_C \cup T_d, g \rangle$, where $\mathcal{B} \subseteq B \times B$ is a non-empty set of exemplary pairwise comparisons of reference objects, $T_C = \bigcup_{q \in C} T_q$, d is a decision corresponding to the comprehensive pairwise comparison (comprehensive preference relation), and $g : \mathcal{B} \times (C \cup \{d\}) \to T_C \cup T_d$ is a total function such that $g[(x, y), q] \in T_q$ for any $(x, y) \in A \times A$ and for each $q \in C$, and $g[(x, y), d] \in T_d$ for any $(x, y) \in \mathcal{B}$. It follows that for any pair of reference objects $(x, y) \in \mathcal{B}$ one and only one binary relation $s \in T_d$ is verified. Thus, $T_d$ induces a partition of $\mathcal{B}$. In fact, the data table $S_{PCT}$ can be seen as a decision table, since the set of considered criteria C and the decision d are distinguished.

We assume that the exemplary pairwise comparisons provided by the DM can be represented in terms of graded preference relations (for example "very weak preference", "weak preference", "strict preference", "strong preference", "very strong preference") $P_q^h$: for each $q \in C$ and for any $(x, y) \in A \times A$, $T_q = \{P_q^h, h \in H_q\}$, where $H_q$ is a particular subset of the relative integers and
· $x P_q^h y$, $h > 0$, means that object x is preferred to object y by degree h with respect to the criterion q,
· $x P_q^h y$, $h < 0$, means that object x is not preferred to object y by degree h with respect to the criterion q,
· $x P_q^0 y$ means that object x is similar (asymmetrically indifferent) to y with respect to the criterion q.

Let us remark that $P_q^0$ is the same similarity relation as presented in Section 3.1 in very general terms, i.e. without any specific reference to preference modeling. Within the preference context, the similarity relation, even if not symmetric, resembles an indifference relation. Thus, in this case, we call this similarity relation "asymmetric indifference". Notice that, for each $q \in C$ and for any $(x, y) \in A \times A$,

$$[x P_q^h y,\ h > 0] \Rightarrow [y P_q^k x,\ k \leq 0] \quad \text{and} \quad [x P_q^h y,\ h < 0] \Rightarrow [y P_q^k x,\ k \geq 0].$$
The set of binary relations $T_d$ may be defined in a similar way, but $x P_d^h y$ means that x is comprehensively preferred to y by degree h.

Technically, the modeling of the binary relation $P_q^h$, i.e. the assessment of h, can be organized as follows:
· first, it is observed that for any $q \in C$ there exists a function $c_q : A \to \mathbb{R}$ which is increasing with respect to the preferences on q,
· then, it is possible to define a function $k_q : \mathbb{R}^2 \to \mathbb{R}$ which measures the strength of the preference (positive or negative) of x over y (e.g. $k_q[c_q(x), c_q(y)] = c_q(x) - c_q(y)$); it should satisfy the following properties for any $x, y, z \in A$:
$$c_q(x) > c_q(y) \iff k_q[c_q(x), c_q(z)] > k_q[c_q(y), c_q(z)],$$
$$c_q(x) > c_q(y) \iff k_q[c_q(z), c_q(x)] < k_q[c_q(z), c_q(y)],$$
$$c_q(x) = c_q(y) \iff k_q[c_q(x), c_q(y)] = 0,$$
· next, the domain of $k_q$ is divided into intervals, using a suitable set of thresholds $\Delta_q$, for each $q \in C$; the intervals are numbered in such a way that $k_q[c_q(x), c_q(y)] = 0$ belongs to interval no. 0,
· the value of h in relation $x P_q^h y$ is then equal to the number of the interval including $k_q[c_q(x), c_q(y)]$, for any $(x, y) \in A \times A$.

We are considering a PCT where the set $T_d$ is composed of two binary relations defined on A:
1. x outranks y (denoted by $x S y$ or $(x, y) \in S$), where $(x, y) \in \mathcal{B}$,
2. x does not outrank y (denoted by $x S^c y$ or $(x, y) \in S^c$), where $(x, y) \in \mathcal{B}$,
and $S \cup S^c = \mathcal{B}$, the set of exemplary pairs of the PCT. Here "x outranks y" means "x is at least as good as y" (Roy, 1985); observe that the binary relation S is reflexive, but neither necessarily transitive nor complete (Roy, 1991; Bouyssou, 1996).

5.2. Multigraded dominance

In Greco et al. (1996), we proposed a rough set approach to the analysis of $S_{PCT}$ using single-graded dominance relations, assuming common degrees of preference for all the criteria. While this permits a simple calculation of the approximations and of the resulting decision rules, it is lacking in precision.
A dominance relation allowing a different degree of preference for each considered criterion gives a far more accurate picture of the preferential information contained in the pairwise comparison table $S_{PCT}$. More formally, given $(x, y), (w, z) \in A \times A$, $(x, y)$ is said to dominate $(w, z)$, taking into account the criteria from $\emptyset \neq P \subseteq C$ (denoted by $(x, y) D_P (w, z)$), if x is preferred to y at least as strongly as w is preferred to z with respect to each $q \in P$. Precisely, "at least as strongly" means "by at least the same degree", i.e. $h_q \geq k_q$, where $h_q, k_q \in H_q$, $x P_q^{h_q} y$ and $w P_q^{k_q} z$, for each $q \in P$.

Let $D_{\{q\}}$ be the dominance relation confined to the single criterion $q \in P$. The binary relation $D_{\{q\}}$ is reflexive ($(x, y) D_{\{q\}} (x, y)$, for any $(x, y) \in A \times A$), transitive ($(x, y) D_{\{q\}} (w, z)$ and $(w, z) D_{\{q\}} (u, v)$ imply $(x, y) D_{\{q\}} (u, v)$, for any $(x, y), (w, z), (u, v) \in A \times A$), and complete ($(x, y) D_{\{q\}} (w, z)$ and/or $(w, z) D_{\{q\}} (x, y)$, for any $(x, y), (w, z) \in A \times A$). Therefore, $D_{\{q\}}$ is a complete preorder. Since the intersection of complete preorders is a partial preorder and $D_P = \bigcap_{q \in P} D_{\{q\}}$, $P \subseteq C$, the dominance relation $D_P$ is a partial preorder.

Let $R \subseteq P \subseteq C$ and $(x, y), (u, v) \in A \times A$; then the following implication holds:

$$(x, y) D_P (u, v) \Rightarrow (x, y) D_R (u, v).$$

Given $P \subseteq C$ and $(x, y) \in A \times A$, let us introduce the positive dominance (denoted by $D_P^+(x, y)$) and the negative dominance (denoted by $D_P^-(x, y)$):

$$D_P^+(x, y) = \{(w, z) \in A \times A : (w, z) D_P (x, y)\},$$
$$D_P^-(x, y) = \{(w, z) \in A \times A : (x, y) D_P (w, z)\}.$$

Using the dominance relations $D_P$, it is possible to define P-lower and P-upper approximations of the outranking relation S with respect to $P \subseteq C$, where $\mathcal{B}$ denotes the set of exemplary pairs of the PCT, respectively, as:

$$\underline{P}(S) = \{(x, y) \in \mathcal{B} : D_P^+(x, y) \subseteq S\},$$
$$\overline{P}(S) = \bigcup_{(x, y) \in S} D_P^+(x, y) = \{(x, y) \in \mathcal{B} : D_P^-(x, y) \cap S \neq \emptyset\}.$$

Analogously, it is possible to define the approximations of $S^c$:

$$\underline{P}(S^c) = \{(x, y) \in \mathcal{B} : D_P^-(x, y) \subseteq S^c\},$$
$$\overline{P}(S^c) = \bigcup_{(x, y) \in S^c} D_P^-(x, y) = \{(x, y) \in \mathcal{B} : D_P^+(x, y) \cap S^c \neq \emptyset\}.$$

It may be proved that $\underline{P}(S) \subseteq S \subseteq \overline{P}(S)$ and $\underline{P}(S^c) \subseteq S^c \subseteq \overline{P}(S^c)$.
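The approximations of S and $S^c$ by multigraded dominance can be sketched in a few lines of Python. The pairwise comparison table below is hypothetical: each pair carries integer preference degrees $h_q$ on two criteria and a decision, and, as a simplifying assumption, the dominating and dominated sets are computed only over the pairs present in the table.

```python
# Hypothetical PCT: pair -> (preference degrees on two criteria, decision).
# Decision "S" stands for outranking, "Sc" for non-outranking.
PCT = {
    ("a", "b"): ((2, 1), "S"),
    ("b", "a"): ((-2, -1), "Sc"),
    ("a", "c"): ((1, 0), "Sc"),
    ("c", "d"): ((0, 0), "S"),
}

def dominates(p, r):
    """p D_P r: the degrees of p are at least those of r on every criterion."""
    return all(h >= k for h, k in zip(PCT[p][0], PCT[r][0]))

def d_plus(p):
    """Pairs dominating p (restricted to the table, an assumption)."""
    return {r for r in PCT if dominates(r, p)}

def d_minus(p):
    """Pairs dominated by p (restricted to the table, an assumption)."""
    return {r for r in PCT if dominates(p, r)}

S = {p for p in PCT if PCT[p][1] == "S"}
Sc = set(PCT) - S

lower_S = {p for p in PCT if d_plus(p) <= S}
upper_S = {p for p in PCT if d_minus(p) & S}
lower_Sc = {p for p in PCT if d_minus(p) <= Sc}
upper_Sc = {p for p in PCT if d_plus(p) & Sc}

boundary = upper_S - lower_S
```

In this toy table the pair (a, c) dominates the pair (c, d), yet (a, c) is an example of non-outranking and (c, d) of outranking, so both pairs end up in the boundary.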
Furthermore, the following complementarity properties hold:
$$\underline{P}(S) = B - \overline{P}(S^c), \qquad \overline{P}(S) = B - \underline{P}(S^c),$$
$$\underline{P}(S^c) = B - \overline{P}(S), \qquad \overline{P}(S^c) = B - \underline{P}(S).$$

The P-boundaries (P-doubtful regions) of $S$ and $S^c$ are defined as
$$Bn_P(S) = \overline{P}(S) - \underline{P}(S), \qquad Bn_P(S^c) = \overline{P}(S^c) - \underline{P}(S^c).$$
Of course, $Bn_P(S) = Bn_P(S^c)$.

The concepts of accuracy, quality of approximation, reducts and core can be extended to the approximation of the outranking relation by multigraded dominance relations. The accuracy of approximation of $S$ and $S^c$ by $P \subseteq C$ is defined, respectively, by the ratios:
$$\alpha_P(S) = \frac{|\underline{P}(S)|}{|\overline{P}(S)|}, \qquad \alpha_P(S^c) = \frac{|\underline{P}(S^c)|}{|\overline{P}(S^c)|}.$$
The coefficient
$$\gamma_P = \frac{|\underline{P}(S) \cup \underline{P}(S^c)|}{|B|}$$
defines the quality of approximation of $S$ and $S^c$ by $P \subseteq C$. It expresses the ratio of all pairs of objects $(x, y) \in B$ correctly assigned to $S$ and $S^c$ by the set $P$ of criteria, to the number of all the pairs of objects contained in $B$. Each minimal subset $P' \subseteq P$ such that $\gamma_{P'} = \gamma_P$ is called a reduct of $P$ (denoted by $RED_S(P)$). Let us remark that $S_{PCT}$ can have more than one reduct. The intersection of all reducts is called the core (denoted by $CORE_S(P)$).

Using the approximations defined above, it is then possible to induce a generalized description of the preferential information contained in a given $S_{PCT}$, in terms of suitable decision rules. The syntax of these rules is based on the concept of upward cumulated preferences (denoted by $P_q^{\ge h}$) and downward cumulated preferences (denoted by $P_q^{\le h}$), having the following interpretation:
· $x P_q^{\ge h} y$ means "x is preferred to y with respect to q by at least degree h",
· $x P_q^{\le h} y$ means "x is preferred to y with respect to q by at most degree h".

Exact definitions of the cumulated preferences, for each $(x, y) \in A \times A$, $q \in C$ and $h \in H_q$, are the following:
· $x P_q^{\ge h} y$ if $x P_q^k y$, where $k \in H_q$ and $k \ge h$,
· $x P_q^{\le h} y$ if $x P_q^k y$, where $k \in H_q$ and $k \le h$.

Using the above concepts, three types of decision rules can be obtained:

1.
$D_{\ge}$-decision rules, being statements of the type:
if $x P_{q_1}^{\ge h(q_1)} y$ and $x P_{q_2}^{\ge h(q_2)} y$ and . . . $x P_{q_p}^{\ge h(q_p)} y$, then $xSy$,
where $P = \{q_1, q_2, \ldots, q_p\} \subseteq C$ and $(h(q_1), h(q_2), \ldots, h(q_p)) \in H_{q_1} \times H_{q_2} \times \cdots \times H_{q_p}$; these rules are supported by pairs of objects from the P-lower approximation of $S$ only;

2. $D_{\le}$-decision rules, being statements of the type:
if $x P_{q_1}^{\le h(q_1)} y$ and $x P_{q_2}^{\le h(q_2)} y$ and . . . $x P_{q_p}^{\le h(q_p)} y$, then $xS^c y$,
where $P = \{q_1, q_2, \ldots, q_p\} \subseteq C$ and $(h(q_1), h(q_2), \ldots, h(q_p)) \in H_{q_1} \times H_{q_2} \times \cdots \times H_{q_p}$; these rules are supported by pairs of objects from the P-lower approximation of $S^c$ only;

3. $D_{\ge\le}$-decision rules, being statements of the type:
if $x P_{q_1}^{\ge h(q_1)} y$ and $x P_{q_2}^{\ge h(q_2)} y$ and . . . $x P_{q_k}^{\ge h(q_k)} y$ and $x P_{q_{k+1}}^{\le h(q_{k+1})} y$ and . . . $x P_{q_p}^{\le h(q_p)} y$, then $xSy$ or $xS^c y$,
where $O' = \{q_1, q_2, \ldots, q_k\} \subseteq C$, $O'' = \{q_{k+1}, q_{k+2}, \ldots, q_p\} \subseteq C$, $P = O' \cup O''$, $O'$ and $O''$ not necessarily disjoint, and $(h(q_1), h(q_2), \ldots, h(q_p)) \in H_{q_1} \times H_{q_2} \times \cdots \times H_{q_p}$; these rules are supported by pairs of objects from the P-boundary of $S$ and $S^c$ only.

The decision rules, inferred from the approximation of $S$ and $S^c$ on $A$, are then applied to a set $M$ of objects in order to obtain a recommendation for the decision problem at hand. After the application of the decision rules to each pair of objects $(u, v) \in M \times M$, one of the following four situations may occur:
· $uSv$ and not $uS^c v$, that is true outranking (denoted by $uS^T v$),
· $uS^c v$ and not $uSv$, that is false outranking (denoted by $uS^F v$),
· $uSv$ and $uS^c v$, that is contradictory outranking (denoted by $uS^K v$),
· not $uSv$ and not $uS^c v$, that is unknown outranking (denoted by $uS^U v$).

The four above situations, which together constitute the so-called four-valued outranking (see Tsoukias and Vincke, 1995, 1997), have been introduced to underline the presence and absence of positive and negative reasons for the outranking. Moreover, they make it possible to distinguish contradictory situations from unknown ones.
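The four situations can be written down as a small lookup. The sketch below is a minimal illustration and not the authors' implementation. It assumes that the rule-matching step has already produced two sets of pairs, one for which some rule affirms $uSv$ and one for which some rule affirms $uS^c v$. The function name and the sample data are hypothetical.

```python
from itertools import product


def four_valued_outranking(objects, affirmed_s, affirmed_sc):
    """Label every pair (u, v) in M x M as 'T', 'F', 'K' or 'U'.

    affirmed_s  -- pairs for which at least one rule concludes uSv
    affirmed_sc -- pairs for which at least one rule concludes uS^c v
    """
    labels = {}
    for pair in product(objects, repeat=2):
        positive = pair in affirmed_s
        negative = pair in affirmed_sc
        if positive and not negative:
            labels[pair] = "T"   # true outranking
        elif negative and not positive:
            labels[pair] = "F"   # false outranking
        elif positive and negative:
            labels[pair] = "K"   # contradictory outranking
        else:
            labels[pair] = "U"   # unknown outranking
    return labels


# Hypothetical rule outcomes on M = {u, v}.
LABELS = four_valued_outranking(
    ["u", "v"],
    affirmed_s={("u", "v"), ("v", "u")},
    affirmed_sc={("v", "u"), ("v", "v")},
)
```

Keeping the positive and negative evidence in separate sets is what allows the contradictory case (both present) to be told apart from the unknown case (both absent).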
A final recommendation can be obtained upon a suitable exploitation of the presence and the absence of outranking $S$ and $S^c$ on $M$. A possible exploitation procedure consists in calculating a specific score, called Net Flow Score, for each object $x \in M$:
$$S_{nf}(x) = S^{++}(x) - S^{+-}(x) + S^{-+}(x) - S^{--}(x),$$
where
$$S^{++}(x) = |\{y \in M : \text{there is at least one decision rule which affirms } xSy\}|,$$
$$S^{+-}(x) = |\{y \in M : \text{there is at least one decision rule which affirms } ySx\}|,$$
$$S^{-+}(x) = |\{y \in M : \text{there is at least one decision rule which affirms } yS^c x\}|,$$
$$S^{--}(x) = |\{y \in M : \text{there is at least one decision rule which affirms } xS^c y\}|.$$

The recommendation in ranking problems consists of the total preorder determined by $S_{nf}(x)$ on $M$; in choice problems it consists of the object(s) $x^* \in M$ such that $S_{nf}(x^*) = \max_{x \in M} S_{nf}(x)$.

The exploitation procedure described above has been recently characterized with reference to a number of desirable properties (Greco et al., 1997, 1998); however, a thorough axiomatic analysis of this and other exploitation procedures used to obtain a recommendation in choice and ranking problems has been carried out by Greco et al. (1997b).

Let us remark that a fuzzy extension of the multigraded approximation of relations $S$ and $S^c$ has been proposed by Greco et al. (1998f, 1999c).

5.3. Dominance without degrees of preference

The degree of graded preference considered in Section 5.1 is defined on a quantitative scale of the strength of preference $k_q$, $q \in C$. However, in many real world problems, the existence of such a quantitative scale is rather questionable.
Roy (1999) distinguishes the following cases:
· preferences expressed on an ordinal scale: this is the case where the difference between two evaluations has no clear meaning;
· preferences expressed on a quantitative scale: this is the case where the scale is defined with reference to a clearly identified unit, such that it is meaningful to consider an origin (zero) of the scale and ratios between evaluations (ratio scale);
· preferences expressed on a numerical non-quantitative scale: this is an intermediate case between the previous two; there are two well-known particular cases: the interval scale, where it is meaningful to compare ratios between differences of pairs of evaluations, and the scale for which a complete preorder can be defined on all possible pairs of evaluations.

The strength of preference $k_q$ and, therefore, the graded preference considered in Section 5.1, is meaningful when the scale is quantitative or numerical non-quantitative. If the information about $k_q$ is not available, then it is possible to define a rough approximation of $S$ and $S^c$ using a specific dominance between pairs of objects from $A \times A$, defined on an ordinal scale represented by evaluations $c_q(x)$ on criterion $q$, for $x \in A$ (Greco et al., 1999c). Let us explain this latter case in detail.

Let $C^O$ be the set of criteria expressing preferences on an ordinal scale, and $C^N$ the set of criteria expressing preferences on a quantitative scale or a numerical non-quantitative scale, such that $C^O \cup C^N = C$ and $C^O \cap C^N = \emptyset$. Moreover, for each $P \subseteq C$, we denote by $P^O$ the subset of $P$ composed of criteria expressing preferences on an ordinal scale, i.e. $P^O = P \cap C^O$, and by $P^N$ the subset of $P$ composed of criteria expressing preferences on a quantitative scale or a numerical non-quantitative scale, i.e. $P^N = P \cap C^N$. Of course, for each $P \subseteq C$, we have $P = P^N \cup P^O$ and $P^O \cap P^N = \emptyset$.

If $P = P^N$ and $P^O = \emptyset$, then the definition of dominance is the same as in the case of multigraded dominance (Section 5.2).
If $P = P^O$ and $P^N = \emptyset$, then, given $(x, y), (w, z) \in A \times A$, the pair $(x, y)$ is said to dominate the pair $(w, z)$, with respect to criteria from $P$, if for each $q \in P$, $c_q(x) \ge c_q(w)$ and $c_q(z) \ge c_q(y)$. Let $D_{\{q\}}$ be the dominance relation confined to the single criterion $q \in P^O$. The binary relation $D_{\{q\}}$ is reflexive ($(x, y) D_{\{q\}} (x, y)$, for any $(x, y) \in A \times A$), transitive ($(x, y) D_{\{q\}} (w, z)$ and $(w, z) D_{\{q\}} (u, v)$ imply $(x, y) D_{\{q\}} (u, v)$, for any $(x, y), (w, z), (u, v) \in A \times A$), but non-complete (it is possible that not $(x, y) D_{\{q\}} (w, z)$ and not $(w, z) D_{\{q\}} (x, y)$ for some $(x, y), (w, z) \in A \times A$). Therefore, $D_{\{q\}}$ is a partial preorder. Since the intersection of partial preorders is also a partial preorder and $D_P = \bigcap_{q \in P} D_{\{q\}}$, $P = P^O$, then the dominance relation $D_P$ is also a partial preorder.

If some criteria from $P \subseteq C$ express preferences on a quantitative or a numerical non-quantitative scale and others on an ordinal scale, i.e. if $P^N \ne \emptyset$ and $P^O \ne \emptyset$, then, given $(x, y), (w, z) \in A \times A$, the pair $(x, y)$ is said to dominate the pair $(w, z)$ with respect to criteria from $P$, if $(x, y)$ dominates $(w, z)$ with respect to both $P^N$ and $P^O$. Since the dominance relation with respect to $P^N$ is a partial preorder on $A \times A$ (because it is a multigraded dominance) and the dominance relation with respect to $P^O$ is also a partial preorder on $A \times A$ (as explained above), then also the dominance $D_P$, being the intersection of these two dominance relations, is a partial preorder. In consequence, all the concepts introduced in the previous point can be restored using this specific definition of dominance relation.

Using the approximations of $S$ and $S^c$ based on the dominance relation defined above, it is possible to induce a generalized description of the available preferential information, in terms of decision rules.
The decision rules are of the same type as the rules already introduced in the previous point; however, the conditions on criteria from $C^O$ are expressed directly in terms of evaluations belonging to domains of these criteria. Let $C_q = \{c_q(x), x \in A\}$ denote the domain of ordinal criterion $q \in C^O$. The decision rules have in this case the following syntax:

1. $D_{\ge}$-decision rule, being a statement of the type:
if $x P_{q_1}^{\ge h(q_1)} y$ and . . . $x P_{q_e}^{\ge h(q_e)} y$ and $c_{q_{e+1}}(x) \ge r_{q_{e+1}}$ and $c_{q_{e+1}}(y) \le s_{q_{e+1}}$ and . . . $c_{q_p}(x) \ge r_{q_p}$ and $c_{q_p}(y) \le s_{q_p}$, then $xSy$,
where $P = \{q_1, \ldots, q_p\} \subseteq C$, $P^N = \{q_1, \ldots, q_e\}$, $P^O = \{q_{e+1}, \ldots, q_p\}$, $(h(q_1), \ldots, h(q_e)) \in H_{q_1} \times \cdots \times H_{q_e}$ and $(r_{q_{e+1}}, \ldots, r_{q_p}), (s_{q_{e+1}}, \ldots, s_{q_p}) \in C_{q_{e+1}} \times \cdots \times C_{q_p}$; these rules are supported by pairs of objects from the P-lower approximation of $S$ only;

2. $D_{\le}$-decision rule, being a statement of the type:
if $x P_{q_1}^{\le h(q_1)} y$ and . . . $x P_{q_e}^{\le h(q_e)} y$ and $c_{q_{e+1}}(x) \le r_{q_{e+1}}$ and $c_{q_{e+1}}(y) \ge s_{q_{e+1}}$ and . . . $c_{q_p}(x) \le r_{q_p}$ and $c_{q_p}(y) \ge s_{q_p}$, then $xS^c y$,
where $P = \{q_1, \ldots, q_p\} \subseteq C$, $P^N = \{q_1, \ldots, q_e\}$, $P^O = \{q_{e+1}, \ldots, q_p\}$, $(h(q_1), \ldots, h(q_e)) \in H_{q_1} \times \cdots \times H_{q_e}$ and $(r_{q_{e+1}}, \ldots, r_{q_p}), (s_{q_{e+1}}, \ldots, s_{q_p}) \in C_{q_{e+1}} \times \cdots \times C_{q_p}$; these rules are supported by pairs of objects from the P-lower approximation of $S^c$ only;

3. $D_{\ge\le}$-decision rule, being a statement of the type:
if $x P_{q_1}^{\ge h(q_1)} y$ and . . . $x P_{q_e}^{\ge h(q_e)} y$ and $x P_{q_{e+1}}^{\le h(q_{e+1})} y$ . . . $x P_{q_f}^{\le h(q_f)} y$ and $c_{q_{f+1}}(x) \ge r_{q_{f+1}}$ and $c_{q_{f+1}}(y) \le s_{q_{f+1}}$ and . . . $c_{q_g}(x) \ge r_{q_g}$ and $c_{q_g}(y) \le s_{q_g}$ and $c_{q_{g+1}}(x) \le r_{q_{g+1}}$ and $c_{q_{g+1}}(y) \ge s_{q_{g+1}}$ and . . . $c_{q_p}(x) \le r_{q_p}$ and $c_{q_p}(y) \ge s_{q_p}$, then $xSy$ or $xS^c y$,
where $O' = \{q_1, \ldots, q_e\} \subseteq C$, $O'' = \{q_{e+1}, \ldots, q_f\} \subseteq C$, $P^N = O' \cup O''$, $O'$ and $O''$ not necessarily disjoint, $P^O = \{q_{f+1}, \ldots, q_p\}$, $(h(q_1), \ldots, h(q_f)) \in H_{q_1} \times \cdots \times H_{q_f}$ and $(r_{q_{f+1}}, \ldots, r_{q_p}), (s_{q_{f+1}}, \ldots, s_{q_p}) \in C_{q_{f+1}} \times \cdots \times C_{q_p}$; these rules are supported by pairs of objects from the P-boundary of $S$ and $S^c$ only.

5.4. An example

Let us consider the example (Evaluation in a High School) proposed by Grabisch (1994).
The students are evaluated according to their level in Mathematics, Physics and Literature. Marks are given on a scale from 0 to 20. Three students presented in Table 6 are considered.

Table 6
Students' evaluation table

Student   Mathematics   Physics   Literature   (Comprehensive) Choquet evaluation
a         18            16        10           13.9
b         10            12        18           13.6
c         14            15        15           14.9

As the high school is "scientifically" oriented, the DM (director of the school) considers Mathematics and Physics as equally important, and more important than Literature. For this reason, he comprehensively prefers student a over b. Moreover, he also comprehensively prefers student c over a, because c is as good in scientific subjects as in Literature, while a is excellent in Mathematics and Physics but too weak in Literature. To represent this comprehensive evaluation, he tries to use the weighted sum model, giving an equal weight to Mathematics and Physics, greater than the weight of Literature. He discovers, however, that this model will never represent his preference for c over a, because the simple sum of all marks of a and c gives the same score for them, making them indifferent; in order to obtain a better score for c, the director would have to give a higher weight to Literature than to Mathematics and Physics. This contradicts his original feeling that scientific subjects are more important than Literature. To resolve this impossibility, Grabisch proposes to use the model of the Choquet integral. The DM should give the following weights for the calculation of the Choquet integral:
· $\mu$(Mathematics) = $\mu$(Physics) = 0.45,
· $\mu$(Literature) = 0.3,
· $\mu$(Mathematics, Physics) = 0.5 < $\mu$(Mathematics) + $\mu$(Physics) = 0.9,
· $\mu$(Mathematics, Literature) = 0.9 > $\mu$(Mathematics) + $\mu$(Literature) = 0.75,
· $\mu$(Physics, Literature) = 0.9 > $\mu$(Physics) + $\mu$(Literature) = 0.75,
· $\mu$(Mathematics, Physics, Literature) = 1.
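The Choquet evaluations in the last column of Table 6 can be checked numerically from these weights. The sketch below uses the usual form of the discrete Choquet integral: sort the marks in ascending order and weight each increment by the measure of the criteria that reach at least that level. The function and variable names are illustrative assumptions.

```python
# Marks from Table 6.
MARKS = {
    "a": {"Mathematics": 18, "Physics": 16, "Literature": 10},
    "b": {"Mathematics": 10, "Physics": 12, "Literature": 18},
    "c": {"Mathematics": 14, "Physics": 15, "Literature": 15},
}

# Weights given by the DM for every non-empty subset of criteria.
MU = {
    frozenset({"Mathematics"}): 0.45,
    frozenset({"Physics"}): 0.45,
    frozenset({"Literature"}): 0.3,
    frozenset({"Mathematics", "Physics"}): 0.5,
    frozenset({"Mathematics", "Literature"}): 0.9,
    frozenset({"Physics", "Literature"}): 0.9,
    frozenset({"Mathematics", "Physics", "Literature"}): 1.0,
}


def choquet(evaluations, mu):
    """Discrete Choquet integral of one student's marks with respect to mu."""
    ordered = sorted(evaluations, key=evaluations.get)  # criteria, lowest mark first
    total, previous = 0.0, 0.0
    for i, criterion in enumerate(ordered):
        coalition = frozenset(ordered[i:])   # criteria with a mark at least this high
        total += (evaluations[criterion] - previous) * mu[coalition]
        previous = evaluations[criterion]
    return total


SCORES = {student: choquet(marks, MU) for student, marks in MARKS.items()}
```

Rounded to one decimal, `SCORES` gives 13.9 for a, 13.6 for b and 14.9 for c, so c is ranked above a and a above b, whereas the plain sums of marks of a and c are both 44.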
To apply the Choquet integral, for each student $x$ the criteria are to be arranged in an order $c_{(1)}, c_{(2)}, \ldots, c_{(n)}$, such that $c_{(1)}(x) \le c_{(2)}(x) \le \ldots \le c_{(n)}(x)$. The Choquet integral is then defined as follows:
$$C[c_1(x), c_2(x), \ldots, c_n(x)] = \sum_{i=1}^{n} \left[c_{(i)}(x) - c_{(i-1)}(x)\right] \mu(J_{(i)}),$$
where $J_{(i)} = \{c_{(i)}, \ldots, c_{(n)}\}$ and $c_{(0)}(x) = 0$ for each $x$.

In this case the Choquet integral assigns to each student the score presented in the last column of Table 6. Let us remark that this comprehensive evaluation satisfies the DM's preferences with respect to the three students and it respects his original feeling that scientific subjects are more important than Literature. Moreover, the set of weights $\mu$ represents positive interaction (between Mathematics and Literature, and between Physics and Literature) and negative interaction (between Mathematics and Physics) among criteria.

Now, let us use the rough set approach for the same problem. Information from Table 6 and the DM's preference-order of the students (c preferred to a preferred to b) are represented in terms of pairwise evaluations in Table 7. Note that "x preferred to y" means $xSy$ and $yS^c x$. The Hasse diagram in Fig. 1 shows the partial preorder induced by dominance relation $D_{\{Maths\}}$ on all pairs of students with respect to the level in Mathematics.

Table 7
Pairwise comparison table

Pair of students   Mathematics   Physics   Literature   (Comprehensive) Outranking relation
(a, a)             18, 18        16, 16    10, 10       S
(a, b)             18, 10        16, 12    10, 18       S
(a, c)             18, 14        16, 15    10, 15       S^c
(b, a)             10, 18        12, 16    18, 10       S^c
(b, b)             10, 10        12, 12    18, 18       S
(b, c)             10, 14        12, 15    18, 15       S^c
(c, a)             14, 18        15, 16    15, 10       S
(c, b)             14, 10        15, 12    15, 18       S
(c, c)             14, 14        15, 15    15, 15       S

Fig. 1. Hasse diagram of the dominance relation between pairs of students with respect to {Mathematics}, and approximations of $S$ and $S^c$ (for the case: "c preferred to a preferred to b").
The relation $(x, y) D_{\{Maths\}} (w, z)$ is true if and only if the level in Mathematics of $x$ is at least equal to the level in Mathematics of $w$ and the level in Mathematics of $y$ is at most equal to the level in Mathematics of $z$. Intuitively, the relation $D_{\{Maths\}}$ means that, with respect to Mathematics, the interval determined by the evaluations of $x$ and $y$ includes the interval determined by the evaluations of $w$ and $z$, i.e. $[c_{Maths}(y), c_{Maths}(x)] \supseteq [c_{Maths}(z), c_{Maths}(w)]$. Let us remark that analogous partial preorders can be induced using dominance relations on Physics, $D_{\{Physics\}}$, and on Literature, $D_{\{Lit.\}}$.

Fig. 1 shows, moreover, lower approximations and the boundary region of $S$ and $S^c$ with respect to Mathematics. A pair of students $(x, y)$ belongs to the lower approximation of $S$ if $xSy$ and there is no other pair of students $(w, z)$ such that $[c_{Maths}(y), c_{Maths}(x)] \subseteq [c_{Maths}(z), c_{Maths}(w)]$, while $wS^c z$. Otherwise, the outranking (i.e. preference order) of pairs $(x, y)$ and $(w, z)$ is inconsistent with respect to the evaluation on Mathematics, so these pairs belong to the boundary region of $S$ and $S^c$. Analogously, a pair of students $(x, y)$ belongs to the lower approximation of $S^c$ if $xS^c y$ and there is no other pair of students $(w, z)$ such that $[c_{Maths}(z), c_{Maths}(w)] \subseteq [c_{Maths}(y), c_{Maths}(x)]$, while $wSz$. Otherwise, the outranking of the pairs $(x, y)$ and $(w, z)$ is inconsistent and they belong to the boundary region of $S$ and $S^c$.

One can see in Fig. 1 that the boundary region of $S$ and $S^c$ is composed of four pairs of students: $(a, a)$, $(a, c)$, $(c, a)$, $(c, c)$. This means that it is not possible to approximate the outranking relation (i.e. preference-order) on all three students using the level in Mathematics only. In other words, the preference-order of students cannot be explained using Mathematics alone.

The Hasse diagram representing the dominance relation $D_{\{Physics\}}$ orders the pairs of students in the same way as $D_{\{Maths\}}$. Therefore, the lower approximations of $S$ and $S^c$ as well as their boundary region are the same as before.
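The regions read off Fig. 1 can be recomputed directly from Tables 6 and 7. The following sketch applies the ordinal dominance between pairs described above; the function names are illustrative, and the boundary is obtained as the set of pairs left outside both lower approximations.

```python
from itertools import product

# Marks from Table 6 and the outranking relation S from Table 7.
MARKS = {
    "a": {"Maths": 18, "Physics": 16, "Lit": 10},
    "b": {"Maths": 10, "Physics": 12, "Lit": 18},
    "c": {"Maths": 14, "Physics": 15, "Lit": 15},
}
B = set(product(MARKS, repeat=2))
S = {("a", "a"), ("a", "b"), ("b", "b"), ("c", "a"), ("c", "b"), ("c", "c")}
S_C = B - S


def dominates(p, r, criteria):
    """(x, y) D_P (w, z): x is at least as good as w and y at most as good as z."""
    (x, y), (w, z) = p, r
    return all(
        MARKS[x][q] >= MARKS[w][q] and MARKS[y][q] <= MARKS[z][q]
        for q in criteria
    )


def rough_regions(criteria):
    """Lower approximations of S and S^c and their common boundary."""
    lower_s = {p for p in B if all(r in S for r in B if dominates(r, p, criteria))}
    lower_sc = {p for p in B if all(r in S_C for r in B if dominates(p, r, criteria))}
    return lower_s, lower_sc, B - lower_s - lower_sc


MATHS = rough_regions({"Maths"})
PHYSICS = rough_regions({"Physics"})
```

For Mathematics this gives $\{(a, b), (b, b), (c, b)\}$ as the lower approximation of $S$, $\{(b, a), (b, c)\}$ as the lower approximation of $S^c$, and the four boundary pairs listed in the text; Physics yields the same three sets.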
So, the information brought by the level in Physics is of the same quality as that brought by the level in Mathematics.

As the pairs of students are ordered in the same way on the Hasse diagrams with respect to Mathematics and Physics, the joint consideration of Mathematics and Physics does not contribute to a better explanation of the preference-order of students.

Fig. 2 presents the Hasse diagram built on the basis of evaluations on Literature. Let us remark that there is one maximal element, the pair $(b, a)$. This means that for each pair $(x, y)$ of students we have $(b, a) D_{\{Lit.\}} (x, y)$. However, $bS^c a$. As each pair $(x', y')$ of students, for which $x'Sy'$, is dominated by $(b, a)$, the lower approximation of $S$ with respect to the level in Literature is empty. The Hasse diagram also has a minimal element, the pair $(a, b)$, for which $aSb$. Therefore, the lower approximation of $S^c$ with respect to the level in Literature is also empty. Consequently, all the pairs of students are in the boundary. In other words, Literature alone not only fails to explain the preference-order of students completely; it gives no explanation of this order at all.

The dominance relation $D_{\{Maths, Lit.\}}$ aggregating Mathematics and Literature puts all the pairs on the same level of the Hasse diagram, because there are no two distinct pairs of students $(x, y)$ and $(w, z)$ such that $[c_{Maths}(y), c_{Maths}(x)] \subseteq [c_{Maths}(z), c_{Maths}(w)]$ and $[c_{Lit.}(y), c_{Lit.}(x)] \subseteq [c_{Lit.}(z), c_{Lit.}(w)]$. This makes it possible to approximate perfectly the outranking relation $S$ and its complement $S^c$, i.e. each pair of objects is assigned to the lower approximation of $S$ or to the lower approximation of $S^c$, and the boundary between $S$ and $S^c$ is empty. In other words, the information brought by Mathematics and Literature explains completely the given preference-order of students. The same result can be obtained for joint consideration of Physics and Literature.
In terms of the rough sets theory, this means that there are two reducts for this problem,
$$RED_1 = \{Mathematics, Literature\}, \qquad RED_2 = \{Physics, Literature\},$$
and, consequently, the core is
$$CORE = RED_1 \cap RED_2 = \{Literature\}.$$

Fig. 2. Hasse diagram of the dominance relation between pairs of students with respect to {Literature}, and approximations of $S$ and $S^c$ (for the case: "c preferred to a preferred to b").

According to the original preferential information given by the DM, Literature is not the most important criterion and, indeed, alone it is not able to explain anything about the preference-order of students. However, being the core, it is indispensable for the explanation of preferences when considered together with other criteria.

The importance of and the interaction between the considered criteria can be calculated from the quality of P-approximation considered as a fuzzy measure (see Section 2.5). Table 8 presents the quality of P-approximation, the Möbius index and the Shapley index with respect to each considered subset $P$ of criteria.

The results in Table 8 can be interpreted as follows. From the Möbius index we can observe the negative interaction (redundancy) of Mathematics and Physics (−5/9) and the positive interaction (synergy) of Mathematics and Literature (4/9), and of Physics and Literature (4/9). The Shapley index confirms that Mathematics and Physics (19/54 each) are more important criteria than Literature (16/54). It also shows a negative interaction of Mathematics and Physics (−14/18) and a positive interaction of Mathematics and Literature (4/18), and of Physics and Literature (4/18).

It is worth noting that the quality of rough approximation and, in consequence, the importance and interaction indices, are calculated from data, while the interactive weights of the Choquet integral have been given by the DM.
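The indices quoted from Table 8 can be recomputed from the quality of approximation alone. The sketch below takes the qualities reported in Table 8 as input and applies the standard Möbius transform and Shapley interaction index of a fuzzy measure. That these standard formulas coincide with the definitions in Section 2.5 is an assumption, and all names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

CRITERIA = ("M", "Ph", "L")

# Quality of approximation for every subset of criteria (second column of Table 8).
QUALITY = {
    frozenset(): F(0),
    frozenset({"M"}): F(5, 9),
    frozenset({"Ph"}): F(5, 9),
    frozenset({"L"}): F(0),
    frozenset({"M", "Ph"}): F(5, 9),
    frozenset({"M", "L"}): F(1),
    frozenset({"Ph", "L"}): F(1),
    frozenset({"M", "Ph", "L"}): F(1),
}


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = tuple(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mobius(measure):
    """Moebius transform: m(A) = sum over B in A of (-1)^{|A - B|} * measure(B)."""
    return {
        a: sum((-1) ** len(a - b) * measure[b] for b in subsets(a))
        for a in measure
    }


def shapley(m, group):
    """Shapley interaction index: sum over K containing group of m(K) / (|K| - |group| + 1)."""
    return sum(v / (len(k) - len(group) + 1) for k, v in m.items() if group <= k)


def reducts(measure):
    """Minimal subsets whose quality equals that of the full set of criteria."""
    full = measure[frozenset(CRITERIA)]
    good = [p for p in measure if measure[p] == full]
    return {p for p in good if not any(q < p for q in good)}


M = mobius(QUALITY)
REDUCTS = reducts(QUALITY)
CORE = frozenset.intersection(*REDUCTS)
```

For a single criterion the interaction index reduces to the Shapley value, so `shapley(M, frozenset({"M"}))` gives 19/54 and `shapley(M, frozenset({"L"}))` gives 16/54. Exact fractions are used so that the results can be compared with the table without rounding.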
The following minimal set of decision rules can be induced from the considered examples (within parentheses there are pairs of students supporting the corresponding decision rules):
· "if the level of x in Mathematics ≥ 10 and the level of y in Mathematics ≤ 10, then xSy" ((a, b), (c, b), (b, b)),
· "if the level of x in Mathematics ≤ 10 and the level of y in Mathematics ≥ 14, then xS^c y" ((b, a), (b, c)),
· "if the level of x in Mathematics ≥ 14 and the level of y in Mathematics ≤ 18 and the level of x in Literature ≥ 15 and the level of y in Literature ≤ 15, then xSy" ((c, a), (c, c)),
· "if the level of x in Mathematics ≥ 14 and the level of y in Mathematics ≤ 18 and the level of x in Literature ≥ 10 and the level of y in Literature ≤ 10, then xSy" ((c, a), (a, a)),
· "if the level of x in Mathematics ≤ 18 and the level of y in Mathematics ≥ 14 and the level of x in Literature ≤ 10 and the level of y in Literature ≥ 15, then xS^c y" ((a, c)).

Table 8
Importance and interaction indices

Subset P of criteria   Quality of rough approximation   Möbius index   Shapley index
M                      5/9                              5/9            19/54
Ph                     5/9                              5/9            19/54
L                      0                                0              16/54
M+Ph                   5/9                              −5/9           −14/18
M+L                    1                                4/9            4/18
Ph+L                   1                                4/9            4/18
M+Ph+L                 1                                −4/9           −4/9

Now, let us suppose that there is a committee composed of three members (I, II, III) giving three different rankings of students, presented in Table 9.

Table 9
Rankings of students given by members of the committee

Student   Member I   Member II   Member III
a         2°         3°          1°
b         3°         1°          2°
c         1°         2°          3°

Fig. 3. Hasse diagram of the dominance relation between pairs of students with respect to {Mathematics}, and approximations of $S$ and $S^c$ (for the case: "c preferred to a preferred to b preferred to c").

Let us assume that the committee takes its final decision according to the majority rule.
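The outcome of the majority rule on the rankings of Table 9 can be tallied mechanically. The sketch below is a minimal illustration with hypothetical names: it counts, for each ordered pair of students, how many members rank the first ahead of the second, and then tests whether the resulting relation is transitive.

```python
from itertools import permutations

# Rank positions from Table 9 (1 = best) for committee members I, II and III.
RANKS = {
    "I": {"a": 2, "b": 3, "c": 1},
    "II": {"a": 3, "b": 1, "c": 2},
    "III": {"a": 1, "b": 2, "c": 3},
}
STUDENTS = ("a", "b", "c")


def majority_prefers(x, y):
    """True if more than half of the members rank x ahead of y."""
    votes = sum(1 for ranking in RANKS.values() if ranking[x] < ranking[y])
    return 2 * votes > len(RANKS)


# Strict majority preferences over all ordered pairs of distinct students.
PREFERRED = {(x, y) for x, y in permutations(STUDENTS, 2) if majority_prefers(x, y)}

# The relation is transitive only if x > y and y > z always give x > z.
IS_TRANSITIVE = all(
    (x, z) in PREFERRED
    for (x, y) in PREFERRED
    for (w, z) in PREFERRED
    if y == w and x != z
)
```

Here `PREFERRED` contains exactly the pairs (c, a), (a, b) and (b, c), and `IS_TRANSITIVE` is `False`, since c beats a and a beats b while b beats c.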
Under the majority rule, c is comprehensively preferred to a (because I and II vote in favor of c and III in favor of a), a is preferred to b (because I and III vote in favor of a and II in favor of b) and b is preferred to c (because II and III vote in favor of b and I in favor of c). There is a cycle in the preference-order of the committee: it is the very well-known Condorcet paradox.

In this case, too, it is possible to represent lower and upper approximations of $S$ and $S^c$ on the basis of the Hasse diagrams used before. Fig. 3 shows the Hasse diagram with respect to Mathematics. The lower approximations of $S$ and $S^c$ are composed of only one pair of students each, $(a, b)$ and $(b, a)$, respectively. This means that using the information brought by Mathematics it is possible to explain the preference relative to two pairs of students only: $(a, b)$ and $(b, a)$. Physics behaves similarly to Mathematics.

Fig. 4 shows the Hasse diagram with respect to Literature and the lower approximations of $S$ and $S^c$, which are empty. Therefore, Literature alone is again not able to explain anything about the preference-order of students.

In this case, too, the information brought by Mathematics and Literature together explains completely the preference-order of students. The same result can be obtained for joint consideration of Physics and Literature. In terms of the rough sets theory this means that there are two reducts for this problem,
$$RED_1 = \{Mathematics, Literature\}, \qquad RED_2 = \{Physics, Literature\},$$
and, consequently, the core is
$$CORE = RED_1 \cap RED_2 = \{Literature\}.$$

As before, the importance of and the interaction between the considered criteria can be calculated from the quality of P-approximation considered as a fuzzy measure. Table 10 presents the quality of P-approximation, the Möbius index and the Shapley index with respect to each considered subset $P$ of criteria.

Fig. 4.
Hasse diagram of the dominance relation between pairs of students with respect to {Literature}, and approximations of $S$ and $S^c$ (for the case: "c preferred to a preferred to b preferred to c").

Table 10
Importance and interaction indices

Subset P of criteria   Quality of rough approximation   Möbius index   Shapley index
M                      2/9                              2/9            13/54
Ph                     2/9                              2/9            13/54
L                      0                                0              28/54
M+Ph                   2/9                              −2/9           −11/18
M+L                    1                                7/9            7/18
Ph+L                   1                                7/9            7/18
M+Ph+L                 1                                −7/9           −7/9

Let us observe that in this case the importance of Literature measured by the Shapley index increases and Literature becomes the most important criterion. The synergy of Mathematics and Literature, and of Physics and Literature, also increases.

Finally, the following minimal set of decision rules can explain the decision policy of the committee (within parentheses there are pairs of students supporting the corresponding decision rules):
· "if the level of x in Mathematics ≥ 18 and the level of y in Mathematics ≤ 10, then xSy" ((a, b)),
· "if the level of x in Mathematics ≤ 10 and the level of y in Mathematics ≥ 18, then xS^c y" ((b, a)),
· "if the level of x in Mathematics ≥ 10 and the level of y in Mathematics ≤ 14 and the level of x in Literature ≥ 18 and the level of y in Literature ≤ 18, then xSy" ((b, c), (b, b)),
· "if the level of x in Mathematics ≥ 14 and the level of y in Mathematics ≤ 18 and the level of x in Literature ≥ 15 and the level of y in Literature ≤ 15, then xSy" ((c, a), (c, c)),
· "if the level of x in Mathematics ≥ 14 and the level of y in Mathematics ≤ 18 and the level of x in Literature ≥ 10 and the level of y in Literature ≤ 10, then xSy" ((c, a), (a, a)),
· "if the level of x in Mathematics ≤ 14 and the level of y in Mathematics ≥ 10 and the level of x in Literature ≤ 15 and the level of y in Literature ≥ 18, then xS^c y" ((c, b)),
· "if the level of x in Mathematics ≤ 18 and the level of y in Mathematics ≥ 14 and the level of x in Literature ≤ 10 and the level of y in Literature ≥ 15, then xS^c y" ((a, c)).
The example shows that the Choquet integral is not able to represent the case of cyclic preferences, while the rough set approach is.

6. Formal equivalence of decision rule preference models and conjoint measurement models

Traditionally, preferences are modeled using a value function $u(\cdot)$. In a multicriteria context, each object $a$ is generally seen as a vector $c(a) = [c_1(a), c_2(a), \ldots, c_m(a)]$ of evaluations with reference to the $m$ criteria $c_1, c_2, \ldots, c_m$.

Greco et al. (1999j) recently characterized such a function for multicriteria sorting. They proved that a simple cancellation property makes it possible to induce a preference order on the domain of each criterion from the order of the considered classes $Cl_t$, $t = 1, \ldots, n$. This is equivalent to the following model of sorting:
$$u[c_1(a), c_2(a), \ldots, c_m(a)] \ge z_t \iff a \in Cl_t^{\ge},$$
where $u$ is increasing in each argument and the $n - 1$ ordered thresholds $z_t$, $t = 2, \ldots, n$, satisfy the condition $z_2 \le z_3 \le \ldots \le z_n$. The authors proved, moreover, that such a model is equivalent to a sorting produced by a set of $D_{\ge}$-decision rules having the syntax defined in Section 4.2.2.

The above model of sorting can also be written as
$$u[c_1(a), c_2(a), \ldots, c_m(a)] \le w_t \iff a \in Cl_t^{\le},$$
where $u$ is the same as above and the $n - 1$ ordered thresholds $w_t$, $t = 1, \ldots, n - 1$, satisfy the condition $w_1 \le w_2 \le \ldots \le w_{n-1}$. This model is equivalent to a sorting produced by a set of $D_{\le}$-decision rules having the syntax defined in Section 4.2.2.

The authors proved, finally, that in the presence of some inconsistent examples of sorting, i.e. if at least one C-boundary is nonempty, this model can be generalized as follows: there are two value functions, $u^{\ge}$ and $u^{\le}$, increasing in each argument, which assign a lower and an upper value to each object $a$, respectively, i.e.
$$u^{\ge}[c_1(a), c_2(a), \ldots, c_m(a)] \le u^{\le}[c_1(a), c_2(a), \ldots, c_m(a)].$$
Moreover, there are $n - 1$ ordered thresholds $z_t$, $t = 2$, . . .
, $n$, and $n - 1$ ordered thresholds $w_t$, $t = 1, \ldots, n - 1$, satisfying
$$w_1 \le z_2 \le w_2 \le \ldots \le z_{n-1} \le w_{n-1} \le z_n$$
such that for each object $a$
$$u^{\ge}[c_1(a), c_2(a), \ldots, c_m(a)] \ge z_t \iff a \in \underline{C}(Cl_t^{\ge}),$$
$$u^{\le}[c_1(a), c_2(a), \ldots, c_m(a)] \le w_t \iff a \in \underline{C}(Cl_t^{\le}).$$
If there is no inconsistent example of sorting, the above model boils down to the previous ones.

When used for multicriteria ranking or choice problems, the value function $u$ is required to satisfy the property that object $a$ is at least as good as object $b$, i.e. $aSb$, iff $u(a) \ge u(b)$. This implies that the relation $S$ is complete (for each couple of objects $a, b$: $aSb$ and/or $bSa$) and transitive (for each triple of objects $a, b, c$: $aSb$ and $bSc$ imply $aSc$). It is often assumed that the value function is additive (see, e.g., Keeney and Raiffa, 1976, and Krantz et al., 1978; Wakker, 1989, for an axiomatic characterization), i.e.,
$$u(a) = \sum_{q=1}^{m} u_q[c_q(a)],$$
where $u_q$, $q = 1, \ldots, m$, are non-decreasing functions.

The additive and transitive model represented by the additive value function is inappropriate in many situations, because:
· the indifference (the symmetric part of $S$) may not be transitive,
· $S$ may not be complete, that is, some objects may be incomparable,
· the compensation between evaluations of conflicting criteria is far more complex than the additive value function is able to represent.

To take these limitations into account, a variety of extensions have been proposed (e.g. Tversky, 1969; Fishburn, 1991). Bouyssou and Pirlot (1996) have proposed a model generalizing the previous ones and creating an axiomatic basis for many multicriteria decision methods proposed in the literature (see, e.g., Roy, 1993; Vincke, 1992). This model drops the additivity, transitivity and completeness properties, and may be written as
$$aSb \ \text{ iff } \ F[\Psi_q(c_q(a), c_q(b)),\ q = 1, \ldots, m] \ge 0, \qquad \text{(i)}$$
where $\Psi_q : \mathbb{R}^2 \to$
$\mathbb{R}$ is a non-decreasing function in its first argument and a non-increasing function in its second argument, for $q = 1, \ldots, m$, and $F : \mathbb{R}^m \to \mathbb{R}$ is a non-decreasing function in every one of its arguments. Observe that the values assumed by the function $\Psi_q$ may be interpreted as a measure of the strength of preference of $a$ over $b$ with respect to criterion $q$, $q = 1, \ldots, m$. Thus, $\Psi_q$ plays the same role as the function $k_q$ in the definition of the PCT.

Recently, Greco et al. (1998h, 1999b) have proposed some more general models of conjoint measurement. The first model can be written as (Greco et al., 1998h):
$$aSb \ \text{ iff } \ G[\Psi_q(c_q(a), c_q(b)),\ q = 1, \ldots, k;\ c_q(a), c_q(b),\ q = k+1, \ldots, m] \ge 0, \qquad \text{(ii)}$$
where the indices of the considered criteria are reordered such that $\{1, \ldots, k\}$ is the set of criteria for which the preference is expressed on a quantitative or a numerical non-quantitative scale, and $\{k+1, \ldots, m\}$ is the set of criteria for which the preference is expressed on an ordinal scale; $\Psi_q$ is defined as above for $q = 1, \ldots, k$, and $G : \mathbb{R}^{k + 2(m-k)} \to \mathbb{R}$ is a non-decreasing function in its first $k$ arguments, non-decreasing in each ($k$+'odd') argument ('odd' $= 1, 3, \ldots, 2(m-k) - 1$) and non-increasing in each ($k$+'even') argument ('even' $= 2, 4, \ldots, 2(m-k)$).

We proved that model (ii) is based on the same axioms as model (i), with the exception of an axiom which introduces a total preorder in the set of pairs $[c_q(a), c_q(b)]$, for each $q \in \{1, \ldots, m\}$. More precisely, this axiom is accepted only for $q = 1, \ldots, k$, i.e. for the set of criteria with a quantitative or a numerical non-quantitative preference scale.

Moreover, Greco et al. (1999b) have proposed a model of conjoint measurement to represent some inconsistencies in the preferences. This model is based on the concepts of C-lower and C-upper approximation of $S$ and $S^c$, and of C-boundary of $S$ and $S^c$, where $C = \{1, \ldots, m\}$.
The model representing inconsistent preferences (Greco et al., 1999b) can be written as

$$(a, b) \in \underline{C}(S) \iff G[W_q(c_q(a), c_q(b)),\ q = 1, \ldots, k;\ c_q(a), c_q(b),\ q = k+1, \ldots, m] \geq t_2, \qquad \text{(iiia)}$$

$$(a, b) \in \underline{C}(S^c) \iff G[W_q(c_q(a), c_q(b)),\ q = 1, \ldots, k;\ c_q(a), c_q(b),\ q = k+1, \ldots, m] \leq t_1, \qquad \text{(iiib)}$$

$$(a, b) \in Bn_C(S) \text{ or, equivalently, } (a, b) \in Bn_C(S^c) \iff t_1 < G[W_q(c_q(a), c_q(b)),\ q = 1, \ldots, k;\ c_q(a), c_q(b),\ q = k+1, \ldots, m] < t_2, \qquad \text{(iiic)}$$

where $W_q$ and $G$ are defined as above and $t_1, t_2 \in \mathbb{R}$ are such that $t_1 < t_2$. With respect to the model (iiia)–(iiic), we proved that it is always possible to obtain that representation, i.e. $S$ need not satisfy any specific axiom.

It is interesting to compare the above models of conjoint measurement with the decision rule preference models resulting from the rough set approach (Sections 5.2 and 5.3). The following results have been proved:

1. the outranking relation $S$ may be represented by means of the non-additive, non-transitive and non-complete model (i) if and only if it may be represented by means of a set of $D_{\geq}$-decision rules having a syntax defined in Section 5.2 (Greco et al., 1998d),
2. the outranking relation $S$ may be represented by means of the non-additive, non-transitive and non-complete model (ii) if and only if it may be represented by means of a set of $D_{\geq}$-decision rules having a syntax defined in Section 5.3 (Greco et al., 1998h).

Greco et al. (1999c) have also pointed out the clear equivalence between the representation of $S$ and $S^c$ obtained using the rough set approach proposed in Section 5 and the model of conjoint measurement (iiia)–(iiic). Furthermore, they observed that the rough set representation presented in Section 5.2 can be viewed as a particular case of the representation proposed in Section 5.3, when the set of criteria with an ordinal preference scale is empty.

7. Conclusions

In this paper, we have synthesized the contribution of the extended rough sets theory to MCDA.
Classical use of the rough set approach, and more generally, of machine learning, data mining and knowledge discovery, is confined to problems of multiattribute classification, i.e. problems where neither the values of attributes describing the objects, nor the classes to which the objects are assigned, are preference-ordered. On the other hand, MCDA deals with problems where descriptions (evaluations) of objects (actions) by means of attributes (criteria), as well as decisions in sorting, choice and ranking problems, are preference-ordered. The extension of the rough set approach to problems in which preference-order properties are important rests on two main methodological contributions extensively discussed in this paper:

1. approximation by dominance relations, which makes it possible to deal with preference-order properties of criteria,
2. analysis of a pairwise comparison table, which makes it possible to handle preference relations for choice and ranking problems.

Let us point out the main advantages of the extended rough set approach to MCDA in comparison with classical approaches:

· the preferential information necessary to deal with a multicriteria decision problem is requested from the DM in terms of exemplary decisions,
· the rough set analysis of preferential information supplies some useful elements of knowledge about the decision situation; these are: the relevance of attributes and/or criteria, information about their interaction (from quality of approximation and its analysis using fuzzy measures theory), minimal subsets of attributes or criteria (reducts) conveying the relevant knowledge contained in the exemplary decisions, and the set of the non-reducible attributes or criteria (core),
· the preference model induced from the preferential information is expressed in a natural and comprehensible language of "if ..., then ..." decision rules,
· heterogeneous information (qualitative and quantitative, preference-ordered or not, crisp and fuzzy evaluations, ordinal and cardinal scales of preferences, missing values) can be processed within the extended rough set approach, while classical MCDA methods consider, with rare exceptions, only quantitative ordered evaluations,
· the decision rule preference model resulting from the rough set approach is more general than all existing models of conjoint measurement due to its capacity of handling inconsistent preferences (a new model of conjoint measurement is formally equivalent to the decision rule preference model handling inconsistencies),
· the proposed methodology is based on elementary concepts and mathematical tools (sets and set operations, binary relations), without recourse to any algebraic or analytical structures; the main idea, the dominance relation, is very natural and, in a certain sense, even objective.

Let us conclude with a metaphor. Rough sets theory and multicriteria decision theory were like small worlds speaking their own languages. The language of rough sets did not include words like criterion, choice, ranking, sorting. On the other hand, the language of multicriteria decision theory did not use words like approximation, reduct, core, decision rule. The results presented in this paper permitted rough sets theory and MCDA to communicate in a new language in which these words coexist and are semantically related. This communication created added value for both theories. The rough sets theory entered the world of decision problems in which preference-orders are considered. Multicriteria decision analysis was equipped with new preference models composed of decision rules. Moreover, even if the concept of inconsistency was known in MCDA, there were only few tools to deal with it.
The use of the decision rule model and the capacity of handling inconsistent preferential information opened a fascinating research field to MCDA.

Acknowledgements

The research of the first two authors has been supported by the Italian Ministry of University and Scientific Research (MURST). The third author wishes to acknowledge financial support from State Committee for Scientific Research (KBN), grant no. 8T11F00619.

References

Bazan, J., Skowron, A., Synak, P., 1994. Dynamic reducts as a tool for extracting laws from decision tables. In: Zemankowa, M., Ras, Z. (Eds.), Methodologies for Intelligent Systems. LNAI, vol. 869. Springer, Berlin, pp. 346–355.
Banzhaf, J.F., 1965. Weighted voting doesn't work: A mathematical analysis. Rutgers Law Review 19, 317–343.
Bouyssou, D., 1996. Outranking Relations: Do they have special properties? Journal of Multi-Criteria Decision Analysis 5 (2), 99–111.
Bouyssou, D., Pirlot, M., 1996. A general framework for the aggregation of semiorders. Technical Report, ESSEC, Cergy-Pontoise.
Dennemberg, D., Grabisch, M., 1996. Shapley value and interaction index. Working paper.
Dubois, D., Prade, H., 1990. Rough fuzzy sets and fuzzy rough sets. Int. J. of General Systems 17, 191–200.
Dubois, D., Prade, H., 1992. Putting rough sets and fuzzy sets together. In: Slowinski, R. (Ed.), Intelligent Decision Support, Handbook of Applications and Advances of the Rough Sets Theory. Kluwer, Dordrecht, pp. 203–233.
Fishburn, P.C., 1967. Methods for estimating additive utilities. Management Science 13, 435–453.
Fishburn, P.C., 1991. Nontransitive additive conjoint measurement. Journal of Mathematical Psychology 35, 1–40.
Fodor, J., Roubens, M., 1994. Fuzzy preference modelling and multicriteria decision support. Kluwer, Dordrecht.
Grabisch, M., 1994. Fuzzy integral in multicriteria decision making. Fuzzy Sets and Systems 89, 279–298.
Grabisch, M., 1996. The application of fuzzy integrals in multicriteria decision making. European Journal of Operational Research 89, 445–456.
Grabisch, M., 1997. k-order additive discrete fuzzy measures and their representation. Fuzzy Sets and Systems 89, 445–456.
Grabisch, M., Roubens, M., 1997. Equivalent representations of a set function with application to decision making. In: Proceedings of the FUZZ-IEEE '97 Conference. Barcelona.
Greco, S., Matarazzo, B., Slowinski, R., 1995. Rough set approach to multi-attribute choice and ranking problems. ICS Research Report 38/95, Warsaw University of Technology, Warsaw 1995. In: Fandel, G., Gal, T. (Eds.), Multiple Criteria Decision Making, Proceedings of the 12th International Conference. Hagen, Springer, Berlin, 1997, pp. 318–329.
Greco, S., Matarazzo, B., Slowinski, R., 1996. Rough approximation of a preference relation by dominance relations, ICS Research Report 16/96, Warsaw University of Technology, Warsaw, 1996, European Journal of Operational Research 117, 1999, 63–83.
Greco, S., Matarazzo, B., Slowinski, R., 1997a. Rough approximation of a preferential information. Working Paper, Poznan University of Technology.
Greco, S., Matarazzo, B., Slowinski, R., 1997b. Exploitation procedures for rough set analysis of multicriteria decision problems. Working Paper, Poznan University of Technology.
Greco, S., Matarazzo, B., Slowinski, R., 1998a. Fuzzy measures technique for rough set analysis. In: Proceedings of the Sixth European Congress on Intelligent Techniques and Soft Computing, vol. 1. Aachen, pp. 99–103.
Greco, S., Matarazzo, B., Slowinski, R., 1998b. A new rough set approach to evaluation of bankruptcy risk. In: Zopounidis, C. (Ed.), Operational Tools in the Management of Financial Risks. Kluwer, Dordrecht, pp. 121–136.
Greco, S., Matarazzo, B., Slowinski, R., 1998c. Rough approximation of a preference relation in a pairwise comparison table. In: Polkowski, L., Skowron, A. (Eds.), Rough Sets in Data Mining and Knowledge Discovery. Physica-Verlag, Heidelberg, pp. 13–36.
Greco, S., Matarazzo, B., Slowinski, R., 1998d. Modellizzazione delle preferenze per mezzo di regole di decisione. In: Atti del Ventiduesimo Convegno A.M.A.S.E.S., Bozzi Editore, Genova, pp. 233–247.
Greco, S., Matarazzo, B., Slowinski, R., 1998e. A new rough set approach to multicriteria and multiattribute classification. In: Polkowski, L., Skowron, A. (Eds.), Rough sets and Current Trends in Computing, RSTCTC'98, Springer, pp. 60–67.
Greco, S., Matarazzo, B., Slowinski, R., 1998f. Rough approximation of a fuzzy preference relation. Working Paper, Poznan University of Technology.
Greco, S., Matarazzo, B., Slowinski, R., 1998g. Fuzzy similarity relation as a basis for rough approximation. In: Polkowski, L., Skowron, A. (Eds.), Rough sets and Current Trends in Computing, RSTCTC '98, Springer, pp. 283–289.
Greco, S., Matarazzo, B., Slowinski, R., 1998h. A conjoint measurement model to represent preference on ordinal scales. Working paper, Poznan University of Technology.
Greco, S., Matarazzo, B., Slowinski, R., 1999a. Fuzzy dominance as basis for rough approximations. In: Proceedings of the Fourth Meeting of the EURO WG on Fuzzy Sets and Second International Conference on Soft and Intelligent Computing, EUROFUSE-SIC'99, Budapest, pp. 273–278.
Greco, S., Matarazzo, B., Slowinski, R., 1999b. Misurazione congiunta e incoerenze nelle preferenze. In: Atti del ventitreesimo convegno A.M.A.S.E.S., Rende-Cosenza, pp. 255–269.
Greco, S., Matarazzo, B., Slowinski, R., 1999c. The use of rough sets and fuzzy sets in MCDM. Chapter 14. In: Gal, T., Stewart, T., Hanne, T. (Eds.), Advances in Multiple Criteria Decision Making. Kluwer, Dordrecht, pp. 14.1–14.59.
Greco, S., Matarazzo, B., Slowinski, R., 1999d. On joint use of indiscernibility, similarity and dominance in rough approximation of decision classes. In: Despotis, D.K., Zopounidis, C. (Eds.), Proceedings of the Fifth International Conference of the Decision Sciences Institute. Athens, pp. 1380–1382.
Greco, S., Matarazzo, B., Slowinski, R., 1999e. Handling missing values in rough set analysis of multi-attribute and multi-criteria decision problems. In: Zhong, N., Skowron, A., Ohsuga, S. (Eds.), New Directions in Rough Sets, Data Mining and Granular-Soft Computing, RSFDGrC'99, Lecture Notes in Artificial Intelligence, vol. 1711. Springer, Berlin, pp. 146–157.
Greco, S., Matarazzo, B., Slowinski, R., 1999f. Dealing with missing data in rough set analysis of multi-attribute and multi-criteria decision problems. In: Zanakis, S.H., Doukidis, G., Zopounidis, C. (Eds.), Recent Developments and Applications in Decision Making. Kluwer Academic Publishers, Dordrecht, to appear.
Greco, S., Matarazzo, B., Slowinski, R., 1999g. Multicriteria classification by dominance-based rough set approach. Chapter C5.1.9. In: Kloesgen, W., Zytkow, J. (Eds.), Handbook of Data Mining and Knowledge Discovery, Oxford University Press, New York, to appear.
Greco, S., Matarazzo, B., Slowinski, R., 1999h. Dominance-based rough set approach to rating analysis. Fuzzy Economic Review, to appear.
Greco, S., Matarazzo, B., Slowinski, R., 1999j. Conjoint measurement and rough set approach for multicriteria sorting problems in presence of ordinal criteria. In: Colorni, A., Parruccini, M., Roy, B. (Eds.), Selected Papers from 49th and 50th Meeting of the EURO Working Group on MCDA, EUR-Report, Ispra-Paris, to appear.
Greco, S., Matarazzo, B., Slowinski, R., 1999k. A fuzzy extension of the rough set approach to multicriteria and multiattribute sorting. In: Fodor, J., De Baets, B., Perny, P. (Eds.), Preferences and Decisions under Incomplete Information. Physica, Heidelberg, to appear.
Greco, S., Matarazzo, B., Slowinski, R., 2000. Rough set processing of vague information using fuzzy similarity relations. In: Calude, C.S., Paun, G. (Eds.), Finite Versus Infinite – Contributions to an Eternal Dilemma. Springer, London, pp. 149–173.
Greco, S., Matarazzo, B., Slowinski, R., Tsoukias, A., 1997. Exploitation of a rough approximation of the outranking relation. Cahier du LAMSADE no. 152, Université de Paris Dauphine, Paris.
Greco, S., Matarazzo, B., Slowinski, R., Tsoukias, A., 1998. Exploitation of a rough approximation of the outranking relation in multicriteria choice and ranking. In: Stewart, T.J., van den Honert, R.C. (Eds.), Trends in Multicriteria Decision Making. LNEMS, vol. 465. Springer, Berlin, pp. 450–460.
Greco, S., Matarazzo, B., Slowinski, R., Zanakis, S., 1999l. Rough set analysis of information tables with missing values. In: Despotis, D.K., Zopounidis, C. (Eds.), Proceedings of the Fifth International Conference of the Decision Sciences Institute. Athens, pp. 1359–1362.
Grzymala-Busse, J.W., 1992. LERS - a system for learning from examples based on rough sets. In: Slowinski, R. (Ed.), Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory. Kluwer, Dordrecht, pp. 3–18.
Grzymala-Busse, J.W., 1997. A new version of the rule induction system LERS. Fundamenta Informaticae 31, 27–39.
Inuiguchi, M., Tanino, T., 2000. Fuzzy rough sets based on certainty qualifications. In: Proceedings of the 4th AFSS Symp., Tsukuba, pp. 433–438.
Jacquet-Lagrèze, E., 1981. Systèmes de décision et acteurs multiples - Contribution à une théorie de l'action pour les sciences des organisations, Thèse d'Etat, Université de Paris-Dauphine, Paris.
Jacquet-Lagrèze, E., Siskos, J., 1982. Assessing a set of additive utility functions for multicriteria decision-making, the UTA method. European Journal of Operational Research 10, 151–164.
Keeney, R.L., Raiffa, H., 1976. Decision with Multiple Objectives - Preferences and value Tradeoffs. Wiley, New York.
Krantz, D.M., Luce, R.D., Suppes, P., Tversky, A., 1978. Foundations of Measurements I. Academic Press, New York.
Krawiec, K., Slowinski, R., Vanderpooten, D., 1998. Learning of decision rules from similarity based rough approximations. In: Polkowski, L., Skowron, A. (Eds.), Rough Sets in Knowledge Discovery, vol. 2. Physica, Heidelberg, pp. 37–54.
Kryszkiewicz, M., 1998. Properties of incomplete information systems in the framework of rough sets. In: Polkowski, L., Skowron, A. (Eds.), Rough Sets in Knowledge Discovery, vol. 1. Physica, Heidelberg, pp. 422–450.
Krusinska, E., Slowinski, R., Stefanowski, J., 1992. Discriminant versus rough set approach to vague data analysis. Applied Stochastic Models and Data Analysis 8, 43–56.
Kryszkiewicz, M., Rybinski, H., 1996. Computation of reducts of composed information systems. Fundamenta Informaticae 27, 183–195.
Langley, P., Simon, H.A., 1998. Fielded applications of machine learning. In: Michalski, R.S., Bratko, I., Kubat, M. (Eds.), Machine Learning and Data Mining. Wiley, New York, pp. 113–129.
Lin, T., 1989. Neighborhood systems and approximation in database and knowledge base systems. In: Proceedings of the Fourth International Symposium on Methodologies for Intelligent Systems.
Luce, R.D., 1956. Semi-orders and a theory of utility discrimination. Econometrica 24, 178–191.
March, J.G., 1988. Bounded rationality, ambiguity, and the engineering of choice. In: Bell, D.E., Raiffa, H., Tversky, A. (Eds.), Decision Making, Descriptive, Normative and Prescriptive Interactions. Cambridge University Press, New York, pp. 33–58.
Marcus, S., 1994. Tolerance rough sets, Cech topologies, learning processes. Bull. of the Polish Academy of Sciences Technical Sciences 42 (3), 471–487.
Michalski, R.S., Bratko, I., Kubat, M. (Eds.), 1998. Machine Learning and Data Mining – Methods and Applications. Wiley, New York.
Mienko, R., Slowinski, R., Stefanowski, J., Susmaga, R., 1996a. Rough Family – software implementation of rough set based data analysis and rule discovery techniques. In: Tsumoto, S., et al. (Eds.), Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets and Machine Discovery. Tokyo University Press, Tokyo, pp. 437–440.
Mienko, R., Stefanowski, J., Toumi, K., Vanderpooten, D., 1996b. Discovery-oriented induction of decision rules. Cahier du LAMSADE no. 141, Université de Paris Dauphine, Paris.
Mousseau, V., 1993. Problèmes liés à l'évaluation de l'importance en aide multicritère à la décision: Réflexions théoriques et expérimentations, Thèse de doctorat, Université de Paris-Dauphine, Paris.
Murofushi, T., 1992. A technique for reading fuzzy measures (i): the Shapley value with respect to a fuzzy measure. In: Proceedings of the Second Fuzzy Workshop. Nagaoka, Japan, October, pp. 39–48, in Japanese.
Murofushi, T., Soneda, S., 1993. Techniques for reading fuzzy measures (iii): interaction index. In: Proceedings of the Ninth Fuzzy Systems Symposium. Sapporo, Japan, May 1993, pp. 693–696, in Japanese.
Nieminen, J., 1988. Rough tolerance equality. Fundamenta Informaticae 11 (3), 289–296.
Pawlak, Z., 1982. Rough sets. International Journal of Information & Computer Sciences 11, 341–356.
Pawlak, Z., 1985a. Rough probability. Bull. Polish Acad. Scis., Technical Sci. 33, 9–10.
Pawlak, Z., 1985b. Rough sets and fuzzy sets. Fuzzy Sets and Systems 17, 99–102.
Pawlak, Z., 1991. Rough Sets. Theoretical Aspects of Reasoning about Data. Kluwer, Dordrecht.
Pawlak, Z., 1997. Rough set approach to knowledge-based decision support. European Journal of Operational Research 99, 48–57.
Pawlak, Z., Slowinski, R., 1994. Rough set approach to multi-attribute decision analysis. European Journal of Operational Research 72, 443–459.
Polkowski, L., Skowron, A., 1994. Rough mereology. In: Proceedings Symposium on Methodologies for Intelligent Systems, Lecture Notes in Artificial Intelligence, vol. 869. Springer, Berlin, pp. 85–94.
Polkowski, L., Skowron, A. (Eds.), 1998. Rough Sets in Data Mining and Knowledge Discovery. Physica, Heidelberg.
Polkowski, L., Skowron, A., Zytkow, J., 1995. Rough foundations for rough sets. In: Lin, T.Y., Wildberger, A. (Eds.), Soft Computing: Rough Sets, Fuzzy Logic, Neural Networks, Uncertainty Management, Knowledge Discovery. Simulation Councils, San Diego, CA, pp. 142–149.
Roubens, M., 1996. Interaction between criteria through the use of fuzzy measures, Report 96.007, Institut de Mathématique, Université de Liège, Liège.
Roy, B., 1985. Méthodologie Multicritère d'Aide à la Décision. Economica, Paris.
Roy, B., 1989. Main sources of inaccurate determination, uncertainty and imprecision in decision models. Mathematical and Computer Modelling 12 (10/11), 1245–1254.
Roy, B., 1991. The outranking approach and the foundation of ELECTRE methods. Theory and Decision 31, 49–73.
Roy, B., 1993. Decision science or decision aid science? European Journal of Operational Research 66, 184–203.
Roy, B., 1999. Decision-aiding today: what should we expect? Chapter 1. In: Gal, T., Stewart, T., Hanne, T. (Eds.), Advances in Multiple Criteria Decision Making. Kluwer Academic Publishers, Dordrecht, pp. 1.1–1.35.
Roy, B., Bouyssou, D., 1993. Aide Multicritère à la Décision: Méthodes et Cas. Economica, Paris.
Shafer, G., 1976. A Mathematical Theory of Evidence. Princeton University Press, Princeton.
Shapley, L.S., 1953. A value for n-person games. In: Kuhn, H.W., Tucker, A.W. (Eds.), Contributions to the Theory of Games II. Princeton University Press, Princeton, pp. 307–317.
Skowron, A., 1993. Boolean reasoning for decision rules generation. In: Komorowski, J., Ras, Z.W. (Eds.), Methodologies for Intelligent Systems. Lecture Notes in Artificial Intelligence, vol. 689. Springer, Berlin, pp. 295–305.
Skowron, A., Grzymala-Busse, J.W., 1994. From the rough set theory to the evidence theory. In: Fedrizzi, M., Kacprzyk, J., Yager, R.R. (Eds.), Advances in the Dempster-Shafer Theory of Evidence. Wiley, New York, pp. 193–236.
Skowron, A., Polkowski, L., 1997. Decision algorithms: a survey of rough set-theoretic methods. Fundamenta Informaticae 27 (3/4), 345–358.
Skowron, A., Rauszer, C., 1992. The discernibility matrices and functions in information systems. In: Slowinski, R. (Ed.), Intelligent Decision Support, Handbook of Applications and Advances of the Rough Set Theory. Kluwer Academic Publishers, Dordrecht, pp. 331–362.
Skowron, A., Stepaniuk, J., 1995. Generalized approximation spaces. In: Lin, T.Y., Wildberger, A. (Eds.), Soft Computing: Rough Sets, Fuzzy Logic, Neural Networks, Uncertainty Management, Knowledge Discovery. Simulation Councils, San Diego, CA, pp. 18–21.
Slovic, P., 1975. Choice between equally-valued alternatives. Journal of Experimental Psychology: Human Perception Performance 1, 280–287.
Slowinski, R. (Ed.), 1992. Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory. Kluwer Academic Publishers, Dordrecht.
Slowinski, R., 1993a. A generalization of the indiscernibility relation for rough set analysis of quantitative information. Rivista di matematica per le scienze economiche e sociali 15, 65–78.
Slowinski, R., 1993b. Rough set learning of preferential attitude in multi-criteria decision making. In: Komorowski, J., Ras, Z.W. (Eds.), Methodologies for Intelligent Systems, Lecture Notes in Artificial Intelligence, vol. 689. Springer, Berlin, pp. 642–651.
Slowinski, R., 1995. Rough set processing of fuzzy information. In: Lin, T.Y., Wildberger, A. (Eds.), Soft Computing: Rough Sets, Fuzzy Logic, Neural Networks, Uncertainty Management, Knowledge Discovery. Simulation Councils, San Diego, CA, pp. 142–145.
Slowinski, K., Slowinski, R., Stefanowski, J., 1988. Rough set approach to analysis of data from peritoneal lavage in acute pancreatitis. Medical Informatics 13, 143–159.
Slowinski, R., Stefanowski, J., 1989. Rough classification in incomplete information systems. Mathematical Computer Modelling 12 (10/11), 1347–1357.
Slowinski, R., Stefanowski, J., 1992. RoughDAS and RoughClass software implementations of the rough sets approach. In: Slowinski, R. (Ed.), Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory. Kluwer Academic Publishers, Dordrecht, pp. 445–456.
Slowinski, R., Stefanowski, J., 1994. Handling various types of uncertainty in the rough set approach. In: Ziarko, W.P. (Ed.), Rough Sets, Fuzzy Sets and Knowledge Discovery. Springer, London, pp. 366–376.
Slowinski, R., Stefanowski, J., 1996. Rough set reasoning about uncertain data. Fundamenta Informaticae 27, 229–243.
Slowinski, R., Stefanowski, J., Greco, S., Matarazzo, B., 2000. Rough sets processing of inconsistent information in decision analysis. Control and Cybernetics 29, 379–404.
Slowinski, R., Vanderpooten, D., 1997. Similarity relation as a basis for rough approximations, ICS Research Report 53/95, Warsaw University of Technology, Warsaw, 1995. In: Wang, P.P. (Ed.), Advances in Machine Intelligence & Soft-Computing, vol. IV. Duke University Press, Durham, NC, pp. 17–33.
Slowinski, R., Vanderpooten, D., 2000. A generalized definition of rough approximations based on similarity. IEEE Transactions on Data and Knowledge Engineering 12 (2), 331–336.
Stefanowski, J., 1998. On rough set based approaches to induction of decision rules. In: Skowron, A., Polkowski, L. (Eds.), Rough Sets in Data Mining and Knowledge Discovery, vol. 1. Physica, Heidelberg, pp. 500–529.
Susmaga, R., 1998. Experiments in incremental computation of reducts. In: Skowron, A., Polkowski, L. (Eds.), Rough Sets in Data Mining and Knowledge Discovery, vol. 1. Physica, Heidelberg, pp. 530–553.
Sugeno, M., 1974. Theory of fuzzy integrals and its applications. Doctoral Thesis, Tokyo Institute of Technology.
Tsoukias, A., Vincke, Ph., 1995. A new axiomatic foundation of the partial comparability theory. Theory and Decision 39, 79–114.
Tsoukias, A., Vincke, Ph., 1997. Extended preference structures in MCDA. In: Climaco, J. (Ed.), Multicriteria Analysis. Springer, Berlin, pp. 37–50.
Tversky, A., 1969. Intransitivity of preferences. Psychological Review 76, 31–48.
Tversky, A., 1977. Features of similarity. Psychological Review 84 (4), 327–352.
Vincke, Ph., 1992. Multicriteria Decision-Aid. Wiley, New York.
Wakker, P.P., 1989. Additive representations of preferences. A new foundation of decision analysis. Kluwer Academic Publishers, Dordrecht.
Yao, Y., 1996. Combination of rough sets and fuzzy sets based on α-level sets. In: Lin, T.Y., Cercone, N. (Eds.), Rough Sets and Data Mining. Kluwer Academic Publishers, Boston, pp. 301–321.
Yao, Y., 1998. A comparative study of fuzzy sets and rough sets. Information Sciences 109, 227–242.
Yao, Y., Wong, S., 1995. Generalization of rough sets using relationships between attribute values. In: Proceedings of the Second Annual Joint Conference on Information Sciences. Wrightsville Beach, NC, pp. 30–33.
Ziarko, W., Shan, N., 1994. An incremental learning algorithm for constructing decision rules. In: Ziarko, W.P. (Ed.), Rough Sets, Fuzzy Sets and Knowledge Discovery. Springer, London, pp. 326–334.
