Omega
journal homepage: www.elsevier.com/locate/omega
Keywords: Best–Worst Method; Alternatives' priorities; Parsimonious elicitation
Despite its recent introduction in the literature, the Best–Worst Method (BWM) is among the best-known and most widely applied methods in Multicriteria Decision-Making. The method can be used to elicit the relative importance (weight) of the criteria as well as to get the priorities of the alternatives on the criteria at hand. In this paper, we present an extension of the method, namely, the parsimonious Best–Worst Method (P-BWM), which makes it possible to apply the BWM to get the priorities of the alternatives when they are numerous. First, the Decision-Maker (DM) is asked to rate the alternatives under consideration; then, the classical BWM is applied to a set of reference alternatives to get their priorities, which are in turn used to compute the priorities of all the alternatives under consideration. We also propose a procedure to select reference alternatives, possibly in cooperation with the DM, providing a well-distributed coverage of the rating range. The new proposal requires fewer pairwise comparisons from the DM than the original BWM. Another contribution of the paper is the comparison between BWM, P-BWM, the Analytic Hierarchy Process (AHP), and the parsimonious AHP in terms of the amount of preference information that the DM must provide to apply each method. In addition to the standard approach, we propose an alternative way of inferring the priority vectors in BWM and P-BWM based on the barycenter of the space of alternatives' priorities compatible with the preferences given by the DM. Finally, an experiment with university students has been conducted to test the new proposal. The results show that P-BWM performs better than BWM in terms of its capability to represent the DM's preferences, and that the difference between the results of the two methods is statistically significant. The new proposal makes it possible to exploit the potential of the BWM to get the alternatives' priorities in real-world decision-making problems where a large number of alternatives must be taken into account.
1. Introduction
We, as individuals, groups, or organizations, make many decisions in our lifetime, which means we should choose from among several available options to achieve our different goals. Among so many examples, we could think of choosing a university to study by a student, buying a house by a family, or selecting a supplier by a retailer. Although selection might be the goal of most decision problems, it is not the only one. Sometimes, the goal is to rank the options or sort them, such as ranking universities or sorting the hospitals in a country. In general, we name a decision problem a multi-criteria decision-making (MCDM) problem, where the options (alternatives) need to be evaluated with respect to a set of (conflicting) criteria (attributes). Such problems are usually challenging to handle for decision-makers as the alternatives to be considered are non-dominated (one is better than the other with respect to some criteria (at least one) and worse (or equal) with respect to the others). Such a challenge has motivated many scientists to develop methods to help the Decision-Maker (DM) to make more informed decisions. Determining the relative importance of the criteria and the (overall) value of alternatives, which are usually named weight and priority, respectively, is perhaps a part that has gained more attention in the existing literature on MCDM. Among the more popular methods, we could refer to Multi-Attribute Value Theory [1], ELECTRE [2], PROMETHEE [3], Analytic Hierarchy Process (AHP) [4], and the Best–Worst Method (BWM) [5]. The main focus of the current study is the youngest of this set, the BWM.
BWM is a recently developed multi-criteria decision-making method that uses a structured pairwise comparison system to find the relative importance (weight) of the decision criteria (and alternatives). According to BWM, the decision-maker needs to choose the Best (e.g., most important), and the Worst (e.g., least important) criteria (alternatives),
✩ Area: Data-Driven Analytics. This manuscript was processed by Associate Editor E. Triantaphyllo.
∗ Corresponding author.
E-mail addresses: [email protected] (S. Corrente), [email protected] (S. Greco), [email protected] (J. Rezaei).
https://doi.org/10.1016/j.omega.2024.103075
Received 30 August 2023; Accepted 7 March 2024
Available online 13 March 2024
0305-0483/© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
S. Corrente et al. Omega 126 (2024) 103075
and then conduct a pairwise comparison between the Best and all the other criteria (alternatives), and between the other criteria (alternatives) and the Worst. These two vectors of pairwise comparison judgments are then used to infer the weights. There exist several optimization models for this step, including the original non-linear model [5], a linear model [6], a multiplicative model [7], a Bayesian model [8], a nonadditive model [9] and a fuzzy one [10] to name a few. Due to its attractive features, including its data efficiency, simplicity in revising inconsistent comparisons, and its debiasing mechanism against some cognitive biases (see [11,12]), BWM has gained considerable attention among researchers and practitioners. Just to cite a few recent contributions about BWM, [13] studied the BWM determining the analytical form of criteria weights without the help of any optimization software; [14] ranked the risks associated with big data analytics implementation in Indian automotive manufacturing industry by BWM; a three-phase methodology for supplier selection, where the last phase is done by BWM, has been proposed by [15]; in a similar context, [16] used BWM to get the weights of criteria necessary to evaluate third-party logistics providers for sustainable last-mile delivery (for a recent survey about BWM see [17]. A full list of contributions related to BWM can be found at bestworstmethod.com).
While the BWM has been mainly used in determining the weights of the criteria, it can also be used to determine the priority of the alternatives [18]. In determining the weights, one of the core assumptions of the method (which is rooted in psychological studies related to human brain capabilities) is that the DM does not do the pairwise comparison among more than nine criteria [19,20]. While the assumption works very well for the criteria (as for most decision problems, we have a handful of relevant criteria, or in case of more than 9, we could cluster them), when it comes to the alternatives, it might work as a limitation as in many real-world decision-making problems, one might deal with a large set of alternatives (think of, for instance, ranking the countries based on their sustainability performance, or sorting the schools in a country). The main aim of the current study is to work on this limitation and empower the BWM to determine the priority of the alternatives when the set is large. We develop a parsimonious version of the method, which calls for rating all alternatives and conducting pairwise comparisons among only a few well-distributed reference alternatives (instead of all). The priority of the other alternatives is assigned by linear interpolation of the reference alternative priorities based on the DM's rating. The basic idea here is that the rating provided by the DM is corrected using the priority for reference alternatives obtained through the BWM. In fact, with this procedure, on the one hand, we correct the possible errors linked to the necessity to evaluate many alternatives together, while, on the other hand, we avoid the possible unreliability of the DM's preference information linked to the large number of pairwise comparisons required by the BWM in the case of many alternatives. The introduction of the reference alternatives has specific importance also from the point of view of the decision biases [21]. Indeed, the presence of a multiplicity of reference points makes it possible to counterbalance the anchoring biases, i.e. the tendency to base judgments on an initial piece of information [22]. As shown by [11] using BWM, the alternatives at the extreme points of the scale are over-evaluated when compared with respect to the highest (best) point and under-evaluated when compared with the lowest (worst) point. Moreover, the middle points are over-evaluated when compared with the lowest point and under-evaluated with the highest point. The compensation of the best-to-others and others-to-worst comparisons permits the BWM to mitigate the biases related to the above over-evaluations and under-evaluations. In this perspective, the direct rating conjugated with BWM further mitigates these biases because it avoids any anchor.
Observe also that the presence of intermediate reference points, in addition to the best and worst points, is beneficial from the point of view of the contextual effects of scaling and rating, according to which they depend on the whole set of stimuli. In particular, in the adaptation level theory of Helson [23,24] each judgment is defined with respect to an average of past stimuli, while in the range-frequency theory of Parducci [25,26] the judgment depends on the range of the scale and the distribution of the stimuli. In both cases, to propose a whole set of well-distributed reference points rather than simply the extreme highest and lowest points seems beneficial because it prevents anchoring the judgments on some specific points, such as one low or one high extreme point, as it is the case for SMART and SWING procedures [11]. This is also beneficial with respect to the BWM because presenting a whole range of reference points to the DM mitigates the above-mentioned over-evaluation and under-evaluation effects due to the comparisons with the best and the worst points, whose anchoring effect can be smoothed by the consideration of the other reference points. The parsimonious BWM we are proposing has a specific relevance with respect to real-world applications. Indeed, very often, the alternatives that have to be compared in a decision problem can be very numerous, in the order of tens, hundreds, or even thousands. Imagine, for example, an application in the domains of ranking of Universities [27], well-being ranking of countries [28], healthcare system assessment [29], sustainable development [30] and so on. In this case, it is not reasonable to pairwise compare all the units to be evaluated with the best and the worst units on all considered criteria. Therefore, considering the growing interest in such types of ranking, the parsimonious BWM permits extending the application of the BWM to relevant domains that, otherwise, would not be possible to consider. Another relevant type of application of the parsimonious BWM is the repetitive assessment of units that cannot be known ex-ante as it is the case for multi-criteria financial scoring or rating [31], and, more in general, multi-criteria assessment in different domains such as building performances [32], housing evaluation [33], sustainability evaluation [34], Environmental, Social and Governance (ESG), or Corporate Social Responsibility (CSR) score [35]. In all these cases, as well as in similar situations, it is not possible to apply BWM because the units to be evaluated cannot be compared with the best and the worst: beyond being numerous, they become known only over time, when the assessment is required. However, the parsimonious BWM always permits a linear interpolation with the corrected rating assigned to the reference alternatives.
To test the performance of the proposed approach, we also conducted an experiment on student subjects, based on which we found promising results. We think that this is a significant contribution to the existing literature on BWM as the new model opens up a new exciting area of applications, namely the application of the BWM for problems with a large set of alternatives.
In the next section, we introduce the background of the study describing the BWM and two ways of inferring alternatives' priorities, followed by Section 3 presenting our proposed parsimonious BWM, with an illustrative example. In Section 4, we conduct a comprehensive analysis comparing the amount of data needed for the four methods of BWM, parsimonious BWM, AHP, and parsimonious AHP. In Section 5, we perform some experimental studies, an essential part of our study to show the performance of the parsimonious BWM, which is evaluated by some statistical metrics. Finally, conclusion and future research directions are presented in Section 6.
2. An overview of BWM and proposing a new approach for getting a representative vector from the results of non-linear BWM
2.1. Multiple criteria decision-making
In Multiple Criteria Decision-Making (MCDM; [1,36,37]), a set of alternatives $A = \{a_1, a_2, \ldots, a_n\}$ has to be evaluated on a coherent family of criteria $G = \{g_1, \ldots, g_m\}$ [2] to deal with choice, ranking or sorting problems. In this paper, we are interested in ranking problems in which one has to rank all alternatives from the best to the worst. The only objective information that can be gathered from the performance matrix, where the evaluations of the alternatives on the criteria at hand
are collected, is the dominance relation for which an alternative $a_i$ dominates an alternative $a_j$ if $a_i$ is no worse than $a_j$ on all criteria and better for at least one of them. However, since this relation is quite poor (in general, there are criteria for which $a_i$ is better than $a_j$ and vice versa criteria for which $a_j$ is better than $a_i$) there is the necessity to aggregate the alternatives' evaluations. This can be done by value functions [1], outranking relations [2] or decision rules methods [38]. Value functions assign a unique numerical evaluation to each alternative being representative of its goodness with respect to the problem at hand; outranking relations compare alternatives pairwise to define if one is at least as good as another, and, finally, decision rules link the global preferences expressed by the DM on the considered alternatives to their performances on the criteria.
2.2. The best–worst method
Despite its recent introduction in literature, the Best–Worst Method (BWM [5]) is nowadays one of the most applied MCDM methods to deal with decision-making problems [17]. The method can be used to get the weights of criteria to be used as tradeoffs in value functions or, analogously, to get the alternatives' priorities on the criteria under consideration. In this paper, we are interested in its application to get alternatives' priorities. Regarding its application, at first, the DM is asked to define the Best and the Worst alternative on the criterion under consideration. Secondly, they have to pairwise compare the Best alternative with all the other alternatives and the other alternatives with the Worst one, using the traditional 1–9 scale considered in the Analytic Hierarchy Process (AHP [4]). Denoting by $a_{Bj}$ the pairwise comparison between the Best alternative and the alternative $a_j$, by $a_{jW}$ the pairwise comparison between the same alternative and the Worst one, and by $w_1, \ldots, w_n$ the alternatives' priorities, to get them one has to solve the following problem [5]

\[
\begin{array}{l}
\min \max_j \left\{ \left| \dfrac{w_B}{w_j} - a_{Bj} \right|, \left| \dfrac{w_j}{w_W} - a_{jW} \right| \right\}, \\[2mm]
\text{s.t.} \\[1mm]
\sum_{j=1}^{n} w_j = 1, \\[1mm]
w_j \geqslant 0, \text{ for all } j = 1, \ldots, n,
\end{array}
\]

that can be equivalently written in the following way

\[
\min \xi, \text{ subject to } E^{DM}_{BWM}
\]

2.3. Two ways of inferring the priorities in BWM
In this section, we present two different ways to infer a priority vector in BWM starting from the preference information provided by the DM (both methods can also be applied to the P-BWM that will be presented in the next section).
As explained in [6], solving the mathematical problem (1) and denoted by $\xi^*$ the optimal value so obtained, two different cases can occur:
• $\xi^* = 0$: there is only one priority vector $(w_1, \ldots, w_n)$ satisfying all constraints in $E^{DM}_{BWM}$ and, therefore, compatible with the information provided by the DM,
• $\xi^* > 0$: there is more than one priority vector compatible with the information given by the DM and they satisfy the constraints

\[
\left.
\begin{array}{l}
\left| \dfrac{w_B}{w_j} - a_{Bj} \right| \leqslant \xi^*, \text{ for all } j = 1, \ldots, n, \\[2mm]
\left| \dfrac{w_j}{w_W} - a_{jW} \right| \leqslant \xi^*, \text{ for all } j = 1, \ldots, n, \\[2mm]
\sum_{j=1}^{n} w_j = 1, \\[1mm]
w_j \geqslant 0, \text{ for all } j = 1, \ldots, n,
\end{array}
\right\}
\tag{3}
\]

that can also be written in a linear way as follows:

\[
\left.
\begin{array}{l}
-\xi^* \cdot w_j \leqslant w_B - a_{Bj} \cdot w_j \leqslant \xi^* \cdot w_j, \text{ for all } j = 1, \ldots, n, \\[1mm]
-\xi^* \cdot w_W \leqslant w_j - a_{jW} \cdot w_W \leqslant \xi^* \cdot w_W, \text{ for all } j = 1, \ldots, n, \\[1mm]
\sum_{j=1}^{n} w_j = 1, \\[1mm]
w_j \geqslant 0, \text{ for all } j = 1, \ldots, n.
\end{array}
\right\} E^{Opt}_{BWM}
\tag{4}
\]

We can now consider two methods, alternative to the standard approach, aiming to select one compatible priority vector among those satisfying constraints in $E^{Opt}_{BWM}$:
• Central priority vector [6]: we can find the minimum and maximum priority $w_j$ under the constraints $E^{Opt}_{BWM}$. Formally, one has to compute the following LP problems for all $j = 1, \ldots, n$,

\[
\min w_j = w_j^{min}, \text{ subject to } E^{Opt}_{BWM}, \qquad \max w_j = w_j^{max}, \text{ subject to } E^{Opt}_{BWM}
\]
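As a rough illustration of the min–max objective of Section 2.2 and of the linearized constraints (4), the following Python sketch evaluates a candidate priority vector. It only checks a given vector; it does not solve the optimization problem. The function names, the 0-based indexing and the tolerance are assumptions introduced here, and the input data are the rounded comparisons and priorities reported later in Table 2.

```python
def max_deviation(w, best, worst, a_best_to_others, a_others_to_worst):
    """Value of max_j {|w_B/w_j - a_Bj|, |w_j/w_W - a_jW|} for a vector w.

    Assumes strictly positive priorities (the ratios are otherwise undefined).
    """
    deviations = []
    for j, w_j in enumerate(w):
        deviations.append(abs(w[best] / w_j - a_best_to_others[j]))
        deviations.append(abs(w_j / w[worst] - a_others_to_worst[j]))
    return max(deviations)


def satisfies_linear_constraints(w, best, worst, a_best_to_others,
                                 a_others_to_worst, xi, tol=1e-9):
    """Check the linear constraints (4) for a given value xi (illustrative)."""
    if abs(sum(w) - 1.0) > tol or any(w_j < 0 for w_j in w):
        return False
    for j, w_j in enumerate(w):
        # -xi * w_j <= w_B - a_Bj * w_j <= xi * w_j
        if abs(w[best] - a_best_to_others[j] * w_j) > xi * w_j + tol:
            return False
        # -xi * w_W <= w_j - a_jW * w_W <= xi * w_W
        if abs(w_j - a_others_to_worst[j] * w[worst]) > xi * w[worst] + tol:
            return False
    return True


# Order: Czech Rep., Romania, Slovenia, Spain, Switzerland (as in Table 2).
a_bo = [5, 3, 9, 1, 7]                          # Best (Spain) versus the others
a_ow = [5, 7, 1, 9, 3]                          # the others versus Worst (Slovenia)
priorities = [0.144, 0.266, 0.044, 0.470, 0.076]  # rounded values from Table 2
BEST, WORST = 3, 2                              # positions of Spain and Slovenia

deviation = max_deviation(priorities, BEST, WORST, a_bo, a_ow)
print(f"largest deviation for the rounded priorities: {deviation:.4f}")
```

Because the priorities are rounded to three decimals, the printed deviation is only an approximation of the optimal value of the underlying problem.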
Fig. 1. Flowchart of the P-BWM.
• the alternatives with the minimal rating and the maximal rating are included among the reference points, that is,
Fig. 2. Ten Countries of which the DM would like to find the area.
constraint $\rho_{(j)} + \rho_{(j+1)} + \rho_{(j+2)} \leqslant 2$, $j = 1, \ldots, n-2$, prevents that three consecutive alternatives could be taken as reference alternatives.
In a more simplified version, one could also consider the possibility of choosing the $t$ reference members with the alternatives characterized by the ratings closest to the following values:

\[
r(a_{(1)}),\quad r(a_{(1)}) + \frac{1}{t-1}\left[ r(a_{(n)}) - r(a_{(1)}) \right],\quad r(a_{(1)}) + \frac{2}{t-1}\left[ r(a_{(n)}) - r(a_{(1)}) \right],\quad \ldots,\quad r(a_{(n)}).
\]

One could also consider the possibility to select the $t$ reference members in cooperation with the DM among the alternatives with the rating closest to the above-mentioned values. In this way, the DM could select the alternatives for which they will be more confident in providing the required judgments which should increase the reliability of the elicited preference information.
Observe also that, instead of the DM's rating, when present, one could consider some underlying value on which the scaling is based. For example, in a decision problem related to university students' assessment, the rating is based on the grades in the considered subjects, so that, in the above procedure to select the reference alternatives, the grades, rather than the DM's ratings, can be considered. In this case, one could also consider some transformations of the underlying value, taking into account the relationship between the stimulus magnitude $I$ and the sensation $S$, such as:
• the Weber–Fechner Law [49], for which there is a logarithmic relation, that is, $S = a \cdot \log \frac{I}{I_0}$, with $a$ and $I_0$ positive constant and $I_0$ referred as detection level, or
• the Stevens law [50], for which there is a power relation, that is, $S = b \cdot I^n$, with $b$ and $n$ positive, and, in general, $n < 1$ because basically the increase in sensation intensity decreases with increasing stimulus magnitude.
In fact, to propose a set of reference alternatives well-distributed with respect to the sensation, the Weber–Fechner law suggests to apply the above procedure on a logarithmic transformation of the underlying value, while the Stevens law suggests to consider a power transformation.
Table 1
Rating of all Countries provided by the DM.
Bosnia Estonia Ireland Iceland Lithuania Czech Rep. Romania Slovenia Spain Switzerland
𝑟(⋅) 3 2.7 3.5 5 3.7 4 6 1 7 3
𝑢(𝑟(⋅)) 0.076 0.071 0.110 0.205 0.123 0.144 0.266 0.044 0.470 0.076
Table 2
Pairwise comparisons provided by the DM between Spain (the best Country) and the reference Countries (𝑎𝐵𝑂 ) as well as between the reference
Countries and Slovenia (the Worst one) (𝑎𝑂𝑊 ). Rating given by the DM to the reference Countries and priorities obtained by the BWM application
to the set composed of the reference Countries only.
Czech Rep. (𝑎𝛾3 ) Romania (𝑎𝛾4 ) Slovenia (𝑎𝛾1 ) Spain (𝑎𝛾5 ) Switzerland (𝑎𝛾2 )
𝑎𝐵𝑂 5 3 9 1 7
𝑎𝑂𝑊 5 7 1 9 3
𝑟(⋅) 𝑟(𝑎𝛾3 ) = 4 𝑟(𝑎𝛾4 ) = 6 𝑟(𝑎𝛾1 ) = 1 𝑟(𝑎𝛾5 ) = 7 𝑟(𝑎𝛾2 ) = 3
𝑢(𝑟(⋅)) 0.144 0.266 0.044 0.470 0.076
3.1. The P-BWM application: An example
In this section, we present how to apply the P-BWM described above, introducing the problem on which the experiments in Section 5 are based.
Let us assume we would like to find the area of the ten European Countries shown in Fig. 2. To this aim, let us use the P-BWM presented above showing, in detail, the steps on which the method application is based.
Step (1) The DM has to provide a rating to the considered Countries. In order to facilitate the DM's task, let us assume that the area of Slovenia is 1 and that the rating of the other Countries should be given on the basis of this assumption. The rating given by the DM is shown in Table 1.
Step (2) Let us consider the five Countries shown in Fig. 3 that will act as reference alternatives. They have been selected using the procedure described in Section 3, solving the MILP problem (6), considering as rating the real area of the ten European Countries shown in Table 6 of Section 4. In other words, in the MILP problem (6), the constraint

\[
\rho_j \cdot r(a_j) - \rho_{j'} \cdot r(a_{j'}) \geqslant \varepsilon - (2 - \rho_j - \rho_{j'}) \cdot M \quad \text{for all } j, j' = 1, \ldots, n, \text{ such that } r(a_j) \geqslant r(a_{j'})
\]

is replaced by the constraint

\[
\rho_j \cdot Area(a_j) - \rho_{j'} \cdot Area(a_{j'}) \geqslant \varepsilon - (2 - \rho_j - \rho_{j'}) \cdot M \quad \text{for all } j, j' = 1, \ldots, n, \text{ such that } Area(a_j) \geqslant Area(a_{j'})
\]

and, with reference to constraints

\[
\rho_{(1)} = 1, \quad \rho_{(n)} = 1,
\]

and

\[
\rho_{(j)} + \rho_{(j+1)} + \rho_{(j+2)} \leqslant 2, \quad j = 1, \ldots, n-2
\]

the alternatives $a_j$ from $A$ are reordered in the sequence

\[
a_{(1)}, \ldots, a_{(n)}
\]

such that $Area(a_{(j)}) \leqslant Area(a_{(j+1)})$, $j = 1, \ldots, n-1$. Instead of utilizing the rating, we opted for the real area because our goal was to evaluate P-BWM by presenting the same set of reference alternatives to all participants. Conversely, employing the rating method would have necessitated providing each participant with a unique reference set tailored to their rating.
Step (3) In this step, the DM is asked to select the Best and the Worst Countries among the reference ones where here, Best and Worst refer to their size, that is, the biggest and the smallest ones. The DM selects Spain as the Best Country and Slovenia as the Worst one. Then, they are asked to apply the BWM to the five reference Countries comparing the Best and the Worst with the other reference Countries using the classical 1–9 Saaty scale. In comparing Countries $A$ and $B$, assuming that the DM considers $A$ not smaller than $B$, the points in the scale have the following interpretation:
1- $A$ and $B$ have the same size,
3- $A$ is moderately bigger than $B$,
5- $A$ is strongly bigger than $B$,
7- $A$ is very strongly bigger than $B$,
9- $A$ is extremely bigger than $B$.
Values 2, 4, 6 and 8 denote a hesitation between 1–3, 3–5, 5–7 and 7–9, respectively. Let us assume that the vectors $a_{BO}$ and $a_{OW}$ containing the pairwise comparisons between the Best country (i.e., the biggest) and the reference ones as well as the pairwise comparisons between the reference Countries and the Worst (i.e., the smallest) one are those shown in Table 2.
Computing $CR^I$ as shown in Eq. (2) and observing that its value (0.2222) is lower than the threshold considered in this case (0.3062), one can apply the BWM finding the priorities of the reference Countries shown in the last row of Table 2. Let us observe that [39] provide the $CR^I$ threshold for problems composed of at most 9 criteria. In our case, even if the number of alternatives (acting as criteria) is 10, we considered the thresholds defined in the paper for the case $n = 9$. This is an even more restrictive assumption observing that for a fixed value of $a_{BW}$ the $CR^I$ threshold is not decreasing with $n$. Therefore, one would expect that passing from $n = 9$ to $n = 10$ the threshold used to check if the pairwise comparisons provided by the DM are consistent enough should increase.
Step (4) The size of all the Countries under consideration is obtained by Eq. (5) using the priorities of the reference Countries found in the previous step. For example, considering the rating assigned to Iceland by the DM, that is, 5 (see Table 1) and observing that this rating belongs to the interval rating $[r(a_{\gamma_3}), r(a_{\gamma_4})] = [4, 6]$ assigned to reference Countries (see the third row of Table 2) for which the priorities $u(r(a_{\gamma_3})) = u(4) = 0.144$ and $u(r(a_{\gamma_4})) = u(6) = 0.266$ have been obtained, one gets

\[
u(5) = u(4) + \frac{u(6) - u(4)}{6 - 4}(5 - 4) = 0.144 + \frac{0.266 - 0.144}{2} = 0.205.
\]

The priorities obtained for the ten Countries under consideration are therefore shown in the last row of Table 1. In Appendix D, we provide details on the programming problem to be solved to get the priorities of the reference alternatives as well as all the computations done to obtain the priorities of all the alternatives as shown in Table 1. All the mathematical problems related to the BWM and P-BWM application have been solved using the commercial software MATLAB 2021.
Table 3
Pieces of preference information involved in the four considered methods: $n$ is the number of alternatives at hand, while $r$ is the number of reference alternatives considered in P-AHP as well as in P-BWM methods.
AHP    BWM    P-AHP    P-BWM
$\frac{n(n-1)}{2}$    $2n-3$    $n+\frac{r(r-1)}{2}$    $n+(2r-3)$
Table 4
Comparison between methods with respect to the number of pieces of preference information asked to the DM to apply them. Each value in the table represents under which condition the application of the method in the row asks for a lower number of pieces of preference information than the application of the method in the column.
         AHP    BWM    P-AHP    P-BWM
AHP      ■    ✗    $r > \frac{1+\sqrt{4n^2-12n+1}}{2}$    $r > \frac{n^2-3n+6}{4}$
BWM      $n > 3$    ■    $r > \frac{1+\sqrt{8n-23}}{2}$    $r > \frac{n}{2}$
P-AHP    $r < \frac{1+\sqrt{4n^2-12n+1}}{2}$    $r < \frac{1+\sqrt{8n-23}}{2}$    ■    ✗
P-BWM    $r < \frac{n^2-3n+6}{4}$    $r < \frac{n}{2}$    $r > 3$    ■
Table 5
Number of pieces of preference information asked to the DM to apply each method for some specific $(n, r)$ configurations.
         (5, 3)  (7, 3)  (7, 4)  (9, 4)  (9, 5)  (10, 4)  (10, 5)  (20, 5)  (30, 7)  (30, 10)
AHP      10  21  21  36  36  45  45  190  435  435
BWM      7  11  11  15  15  17  17  37  57  57
P-AHP    8  10  13  15  19  16  20  30  51  75
P-BWM    8  10  12  14  16  15  17  27  41  47
4. Comparing the amount of preference information involved in AHP, BWM, P-AHP and P-BWM methods
In this section, we perform a comparison between four MCDM methods, namely, AHP, BWM, Parsimonious AHP [51] (denoted by P-AHP) and Parsimonious BWM (denoted by P-BWM) with respect to the number of pieces of preference information asked to the DM to apply them. Let us remember that P-AHP differs from P-BWM in the way the priorities of the reference alternatives are computed. Indeed, in P-AHP, they are obtained using the AHP method, that is, computing the eigenvector of the pairwise comparison matrix filled by the DM considering the reference alternatives only.
We think the comparison is fair as the type of needed information is the same (pairwise comparisons using the same scale) for all the methods in question. Let us also assume that the DM considers equivalent the cognitive effort involved in a pairwise comparison of alternatives or in a rating of one alternative on the considered criteria. Therefore, denoting by $n$ the number of alternatives in the MCDM problem and by $r$ the number of reference alternatives in the P-AHP and P-BWM methods (there is not any particular reason for which the number of reference alternatives in P-AHP should be different from the number of reference alternatives in P-BWM), the pieces of preference information involved in each of the considered methods are shown in Table 3.
To compare these four methods with respect to the number of pieces of preference information asked to the DM to apply them, in Table 4 we report in which case the application of the method in the row involves a lower number of pieces of preference information than the application of the method in the column. To perform such a comparison, we assume that $n \geqslant 3$. For example, the ✗ in correspondence of the (AHP, BWM) pair means that there is not any value of $n$ for which AHP involves a lower number of pieces of preference information than BWM. Vice versa, $n > 3$ in correspondence of the (BWM, AHP) pair means that BWM application involves a lower number of pieces of preference information than AHP in all test problems having more than three alternatives. Let us observe that if $n = 3$, then, both AHP and BWM applications involve three pairwise comparisons only.
A detailed description of the way the values in the table are obtained is provided in Appendix A.
To better compare the considered methods, in Table 5 we show the number of pieces of preference information asked to the DM to apply them for some specific configurations $(n, r)$ of MCDM problems. Analogously, to take into account a greater number of configurations, in Fig. 4 we show the number of pieces of preference information involved in the four considered methods for different $(n, r)$ configurations, where $n = 5, \ldots, 40$, and $r = 3, \ldots, \lceil n/2 \rceil$. In particular, on the $x$-axis, we report the $(n, r)$ configuration, while, on the $y$-axis, we show the number of pieces of preference information involved in the application of the four considered methods.
As one can see, AHP is the most expensive among the four considered methods from the cognitive point of view for the DM. Moreover, the following points can be observed:
• as shown in Table 4, the P-BWM application is always preferable to the P-AHP one,
• BWM and P-BWM applications involve a similar cognitive effort for problems with a number of alternatives up to ten, while P-BWM is more parsimonious than BWM for problems with a great number of alternatives ((20,5), (30,7) and (30,10)). Of course, as shown in Table 4, BWM could involve a lower cognitive effort than P-BWM, but only if the number of reference levels were greater than $n/2$, which would be counter-intuitive,
• for small problems, BWM application involves a cognitive effort comparable to that of the P-AHP application and in a few cases lower; for big size problems (considering $n \geqslant 20$), the comparison between the two methods is strictly dependent on the number of reference evaluations used in P-AHP. For example, considering $n = 30$, if $r = 7$, then P-AHP is better than BWM, while, if $r = 10$, then BWM is much better than P-AHP.
The comparison between the considered methods is performed on the assumption that a pairwise comparison of alternatives or a rating of one alternative on a particular criterion involves the same cognitive effort on the part of the DM. For this reason, looking at Table 5 and, in particular, at the (10,5) configuration, we stated that the cognitive effort asked of the DM in the BWM and P-BWM application is the same. Indeed, in both cases, 17 pieces of preference information are asked to the DM. The main difference is, however, that the BWM application asks for 17 pairwise comparisons, while the P-BWM application asks the DM for the rating of the 10 alternatives and 7 pairwise comparisons. As we show in the next section, our claim is that providing a pairwise comparison or a rating is not the same for the DM and mixing them (as done in the P-BWM method) could be beneficial for the application of the method.
5. Experiments and detailed comparison
To check the reliability of the proposed method, we performed a comparison between the BWM and the P-BWM methods on the same problem presented in Section 3.1, that is, estimating the size of the ten European Countries shown in Fig. 2. To this aim, we submitted two questionnaires to some students of the Department of Economics and Business of the University of Catania and, in particular, students attending the Marketing course of the Business Economics bachelor degree (Group 1) and students attending the Financial Mathematics course of the Economics bachelor degree (Group 2). Students of the two groups did not have any knowledge about MCDM. We decided to submit the questionnaires to two different sets of students since they have different backgrounds and academic experiences: students of Group 1 are 3rd year students (their mean age is approximately 22 years) and, therefore, close to completing their academic studies, while students of Group 2 are 2nd year students (their mean age is approximately 21 years) and, therefore, in the middle of their academic career. Each of the two groups was split in two parts to fill out the two questionnaires included in Appendix B:
S. Corrente et al. Omega 126 (2024) 103075
Fig. 4. Comparison between methods with respect to the amount of involved preference information in their application. On the $x$-axis, $(n, r)$ configurations with $n = 5, \ldots, 40$ and $r = 3, \ldots, \lceil n/2 \rceil$. On the $y$-axis, the number of pieces of preference information involved in each of the four methods for the considered $(n, r)$ configuration is shown.
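The counts behind Fig. 4 can be sketched in a few lines. The closed forms below are assumptions chosen to agree with the figures quoted in the text (three comparisons for both AHP and BWM when n = 3; 17 pieces of information for both BWM and P-BWM when n = 10 and r = 5, the latter split into 10 ratings and 7 comparisons). The P-AHP formula and the upper bound ⌈n/2⌉ on r are likewise assumptions, and all function names are illustrative.

```python
from math import ceil


def ahp(n: int) -> int:
    """Pairwise comparisons among all n alternatives (assumed n(n-1)/2)."""
    return n * (n - 1) // 2


def bwm(n: int) -> int:
    """Best-to-Others plus Others-to-Worst comparisons (assumed 2n-3)."""
    return 2 * n - 3


def p_ahp(n: int, r: int) -> int:
    """n ratings plus an AHP on r reference alternatives (assumed form)."""
    return n + r * (r - 1) // 2


def p_bwm(n: int, r: int) -> int:
    """n ratings plus a BWM on r reference alternatives (assumed form)."""
    return n + 2 * r - 3


def configurations(n_min: int = 5, n_max: int = 40, r_min: int = 3):
    """All (n, r) pairs with r_min <= r <= ceil(n/2), as assumed for Fig. 4."""
    return [(n, r)
            for n in range(n_min, n_max + 1)
            for r in range(r_min, ceil(n / 2) + 1)]


# Print the four counts for a couple of illustrative configurations.
for n, r in [(10, 5), (20, 5)]:
    print((n, r), ahp(n), bwm(n), p_ahp(n, r), p_bwm(n, r))
```

Under these assumed formulas, the P-BWM count grows by one per additional alternative once r is fixed, whereas the BWM count grows by two.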
Table 6
Real and normalized areas of the ten considered European Countries.
                        Bosnia  Estonia  Ireland  Iceland  Lithuania  Czech Rep.  Romania  Slovenia  Spain    Switzerland
Real area (km²)         51,209  45,227   70,273   103,000  65,300     78,871      238,397  20,273    505,992  41,284
$Area^{real}_{j,norm}$  2.526   2.231    3.466    5.081    3.221      3.890       11.759   1         24.959   2.036
• 54 students of Group 1 were asked to apply the BWM to the Countries shown in Fig. 2 to get their priorities, while 40 students of Group 1 were asked to apply the P-BWM described in Section 3,
• 41 students of Group 2 were asked to apply the BWM, while 47 students belonging to Group 2 were asked to apply the P-BWM.
The real size of the ten European Countries that are the subject of the two questionnaires is shown in the first row of Table 6.
To compare the priorities of the Countries obtained by applying the BWM or the P-BWM with the real areas of the ten considered European Countries, we normalize these areas by dividing the area of each Country by the area of Slovenia, so that its normalized value becomes 1 (the normalized areas of the ten Countries are shown in the second row of Table 6). To perform such a comparison, after applying the BWM or the P-BWM, the following steps are carried out:
(1) For each questionnaire, the priorities of the ten Countries obtained by the BWM ($w_j^{BWM}$) or by the P-BWM ($w_j^{P-BWM}$) are normalized by dividing them by the Slovenia priority, that is:
\[ w_{j,norm}^{BWM} = \frac{w_j^{BWM}}{w_{Slovenia}^{BWM}}, \qquad w_{j,norm}^{P-BWM} = \frac{w_j^{P-BWM}}{w_{Slovenia}^{P-BWM}}. \]
Considering the example shown in Section 3.1, the priorities obtained by the P-BWM application and reported in the first row of Table 7 are normalized, obtaining the values in the second row of the same Table.
(2) The Mean Squared Error (MSE) between the vector of the normalized areas in the last row of Table 6 and the vector of the normalized estimated areas $w_{j,norm}$ ($w_{j,norm}^{BWM}$ or $w_{j,norm}^{P-BWM}$) obtained as described above is computed as follows:
\[ MSE = \frac{1}{10} \sum_{j=1}^{10} \left( Area_{j,norm}^{real} - w_{j,norm} \right)^2. \tag{7} \]
For example, the MSE computed between the vector of normalized areas shown in the last row of Table 6 and the vector of normalized priorities obtained by the P-BWM application and shown in the second row of Table 7 is equal to 23.78. Moreover, the Maximum Absolute Error (MAE) between the normalized areas and the same vectors is computed as shown in Eq. (8):
\[ MAE = \max_{j=1,\ldots,10} \left| Area_{j,norm}^{real} - w_{j,norm} \right|. \tag{8} \]
5.1. Results
After removing the questionnaires filled in by students presenting a 𝐶𝑅𝐼 greater than the threshold defined in [39], we applied the BWM or the P-BWM to the information included in the questionnaires for the two different groups. In Table 8 we report the average and standard deviation of the MSE and of the MAE computed between the vector of the normalized real areas and the vector of the normalized priorities obtained by the different versions of the BWM and P-BWM. Let us observe that $BWM$ differs from $BWM_{Interval}$ and $BWM_{Barycenter}$ (analogously, $P\text{-}BWM$ differs from $P\text{-}BWM_{Interval}$ and $P\text{-}BWM_{Barycenter}$) only in the way the Best-Other and Other-Worst vectors provided by the DM are used to infer the priority vector, and not in the way the method is applied to elicit the DM's preferences.
As one can see from the values reported in the Table, even if the BWM and P-BWM involve the same number of pieces of preference information (17, as shown in Table 5), the results obtained by P-BWM are clearly better than those obtained by BWM, and this does not depend on the specific group to which the two questionnaires have been submitted. Indeed, at the global level (considering Groups 1 and 2 together) the average MSE between the vectors of normalized real areas and the normalized priorities obtained by P-BWM is 25.350, while the average MSE obtained by BWM is more than double (51.809).
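The normalization step and the error measures of Eqs. (7) and (8) can be sketched on the worked example as follows. The areas are taken from Table 6 and the normalized P-BWM priorities from the second row of Table 7; the function and variable names are illustrative assumptions rather than part of the original study.

```python
# Real areas (km^2) from the first row of Table 6.
REAL_AREA = {
    "Bosnia": 51209, "Estonia": 45227, "Ireland": 70273, "Iceland": 103000,
    "Lithuania": 65300, "Czech Rep.": 78871, "Romania": 238397,
    "Slovenia": 20273, "Spain": 505992, "Switzerland": 41284,
}

# Normalized P-BWM priorities from the second row of Table 7.
P_BWM_NORM = {
    "Bosnia": 1.740, "Estonia": 1.629, "Ireland": 2.508, "Iceland": 4.665,
    "Lithuania": 2.815, "Czech Rep.": 3.275, "Romania": 6.055,
    "Slovenia": 1.000, "Spain": 10.725, "Switzerland": 1.740,
}


def normalize(values, reference="Slovenia"):
    """Divide every entry by the reference entry, so the reference becomes 1."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


def mse(target, estimate):
    """Mean squared error over the countries, as in Eq. (7)."""
    return sum((target[k] - estimate[k]) ** 2 for k in target) / len(target)


def mae(target, estimate):
    """Maximum absolute error over the countries, as in Eq. (8)."""
    return max(abs(target[k] - estimate[k]) for k in target)


area_norm = normalize(REAL_AREA)
print("MSE:", mse(area_norm, P_BWM_NORM))
print("MAE:", mae(area_norm, P_BWM_NORM))
```

With these inputs the MSE comes out close to the 23.78 quoted for the worked example, and the largest deviation is the one for Spain.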
Table 7
Priorities of the ten Countries and normalized values.
                      Bosnia  Estonia  Ireland  Iceland  Lithuania  Czech Rep.  Romania  Slovenia  Spain   Switzerland
$u(r(\cdot))$         0.076   0.071    0.110    0.205    0.123      0.144       0.266    0.044     0.470   0.076
$w^{P-BWM}_{j,norm}$  1.740   1.629    2.508    4.665    2.815      3.275       6.055    1.000     10.725  1.740
Table 8
Mean and standard deviation of the MSE and MAE computed between the vector of the normalized real areas and the vector of the normalized priorities obtained by the BWM or the P-BWM.
(a) Global (Group 1 + Group 2)
                             Average MSE  StD MSE  Average MAE  StD MAE
$BWM$                        51.809       7.621    19.835       1.344
$BWM_{Interval}$             51.022       5.442    21.112       0.946
$BWM_{Barycenter}$           47.681       5.958    19.920       1.204
$P\text{-}BWM$               25.350       3.757    14.473       0.850
$P\text{-}BWM_{Interval}$    24.906       3.937    14.473       0.850
$P\text{-}BWM_{Barycenter}$  24.956       4.122    14.473       0.850
(b) Group 1
                             Average MSE  StD MSE  Average MAE  StD MAE
$BWM$                        52.252       6.671    19.829       1.178
$BWM_{Interval}$             51.010       4.215    21.076       0.758
$BWM_{Barycenter}$           47.378       4.823    19.719       1.055
$P\text{-}BWM$               26.137       4.827    14.564       1.099
$P\text{-}BWM_{Interval}$    25.739       5.007    14.564       1.099
$P\text{-}BWM_{Barycenter}$  26.081       4.944    14.564       1.099
(c) Group 2
                             Average MSE  StD MSE  Average MAE  StD MAE
$BWM$                        51.351       8.589    19.842       1.517
$BWM_{Interval}$             51.034       6.553    21.149       1.122
$BWM_{Barycenter}$           47.996       7.017    20.128       1.328
$P\text{-}BWM$               24.594       2.166    14.385       0.520
$P\text{-}BWM_{Interval}$    24.105       2.369    14.385       0.520
$P\text{-}BWM_{Barycenter}$  23.875       2.842    14.385       0.520
on BWM and P-BWM in which the inferred priority vector is $\mathbf{w}_{Int}$; on the other hand, we denoted by $BWM_{Barycenter}$ and $P\text{-}BWM_{Barycenter}$ the methods based on BWM and P-BWM in which the inferred priority vector is $\mathbf{w}_{Bar}$.
The first thing that can be observed looking at Table 8 is that in all cases (Global, Group 1 and Group 2) the P-BWM versions ($P\text{-}BWM$, $P\text{-}BWM_{Interval}$ and $P\text{-}BWM_{Barycenter}$) perform better than the corresponding BWM versions ($BWM$, $BWM_{Interval}$ and $BWM_{Barycenter}$). Looking more closely at the three parsimonious versions of BWM, we cannot observe a very big difference. At the global level, $P\text{-}BWM_{Interval}$ performs slightly better than $P\text{-}BWM$ and $P\text{-}BWM_{Barycenter}$ in terms of average MSE, while $P\text{-}BWM$ presents the lowest standard deviation of the MSE; the average and standard deviation of the MAE coincide for the three versions. Considering the two groups separately, again, the results obtained by the three methods are quite similar.
As to the comparison between the BWM versions, instead, one can observe a clear advantage of $BWM_{Barycenter}$ with respect to both $BWM_{Interval}$ and $BWM$ at the global level as well as considering the two groups separately. Indeed, while the average MSE observed for $BWM_{Interval}$ is lower than, but very similar to, the one observed for $BWM$, the difference between the average MSE values obtained by $BWM_{Interval}$ and $BWM_{Barycenter}$ is greater than 3 in all cases. This shows that, in terms of MSE, $BWM_{Barycenter}$ performs better than the other two considered versions of BWM.
5.2. Statistical tests on the obtained results
Table 9
First version of the Kolmogorov–Smirnov Test for the MSE values: "Equal" Test. h = 0 means that the null hypothesis is not rejected (the cumulative distributions of the MSE values obtained by the two methods are equal), while h = 1 means that the null hypothesis is rejected in favor of the alternative hypothesis (the cumulative distributions of the MSE values obtained by the two methods are different).
(a) Global (Group 1 + Group 2)
h/p-value 𝐵𝑊 𝑀 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 𝑃 − 𝐵𝑊 𝑀 𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟
𝐵𝑊 𝑀 ■ 0/0.3345 1/0.0115 1/0.0000 1/0.0000 1/0.0000
𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ 1/0.0062 1/0.0000 1/0.0000 1/0.0000
𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ 1/0.0000 1/0.0000 1/0.0000
𝑃 − 𝐵𝑊 𝑀 ■ ■ ■ ■ 1/0.0001 1/0.0008
𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ ■ ■ ■ 1/0.0290
𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ ■ ■ ■
(b) Group 1
h/p-value 𝐵𝑊 𝑀 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 𝑃 − 𝐵𝑊 𝑀 𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟
𝐵𝑊 𝑀 ■ 0/0.2003 1/0.0259 1/0.0000 1/0.0000 1/0.0000
𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ 1/0.0046 1/0.0000 1/0.0000 1/0.0000
𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ 1/0.0000 1/0.0000 1/0.0000
𝑃 − 𝐵𝑊 𝑀 ■ ■ ■ ■ 0/0.2160 0/0.2160
𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ ■ ■ ■ 0/0.8608
𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ ■ ■ ■
(c) Group 2
h/p-value 𝐵𝑊 𝑀 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 𝑃 − 𝐵𝑊 𝑀 𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟
𝐵𝑊 𝑀 ■ 0/0.9270 0/0.3213 1/0.0000 1/0.0000 1/0.0000
𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ 0/0.1844 1/0.0000 1/0.0000 1/0.0000
𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ 1/0.0000 1/0.0000 1/0.0000
𝑃 − 𝐵𝑊 𝑀 ■ ■ ■ ■ 1/0.0004 1/0.0013
𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ ■ ■ ■ 1/0.0104
𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ ■ ■ ■
Table 10
Second version of the Kolmogorov–Smirnov Test for the MSE values: "Greater" Test. h = 0 means that the null hypothesis is not rejected (the cumulative distribution of the MSE values obtained by the method in the row is therefore smaller than or equal to the cumulative distribution of the MSE values obtained by the method in the column), while h = 1 means that the null hypothesis is rejected in favor of the alternative hypothesis (the cumulative distribution of the MSE values obtained by the method in the row is larger than the cumulative distribution of the MSE values obtained by the method in the column).
(a) Global (Group 1 + Group 2)
h/p-value 𝐵𝑊 𝑀 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 𝑃 − 𝐵𝑊 𝑀 𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟
𝐵𝑊 𝑀 ■ ■ 0/0.7517 0/1.0000 0/1.0000 0/1.0000
𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ 0/1.0000 0/1.0000 0/1.0000 0/1.0000
𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 1/0.0058 1/0.0031 ■ 0/1.0000 0/1.0000 0/1.0000
𝑃 − 𝐵𝑊 𝑀 1/0.0000 1/0.0000 1/0.0000 ■ 0/0.9786 0/0.3470
𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 1/0.0000 1/0.0000 1/0.0000 1/0.0001 ■ 0/0.3470
𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 1/0.0000 1/0.0000 1/0.0000 1/0.0004 1/0.0145 ■
(b) Group 1
h/p-value 𝐵𝑊 𝑀 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 𝑃 − 𝐵𝑊 𝑀 𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟
𝐵𝑊 𝑀 ■ ■ 0/0.9647 0/1.0000 0/1.0000 0/1.0000
𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ 0/0.9647 0/1.0000 0/1.0000 0/1.0000
𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 1/0.0129 1/0.0023 ■ 0/1.0000 0/1.0000 0/1.0000
𝑃 − 𝐵𝑊 𝑀 1/0.0000 1/0.0000 1/0.0000 ■ ■ ■
𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 1/0.0000 1/0.0000 1/0.0000 ■ ■ ■
𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 1/0.0000 1/0.0000 1/0.0000 ■ ■ ■
(c) Group 2
h/p-value 𝐵𝑊 𝑀 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 𝑃 − 𝐵𝑊 𝑀 𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟
𝐵𝑊 𝑀 ■ ■ ■ 0/1.0000 0/1.0000 0/1.0000
𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 ■ ■ ■ 0/1.0000 0/1.0000 0/1.0000
𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 ■ ■ ■ 0/1.0000 0/1.0000 0/1.0000
𝑃 − 𝐵𝑊 𝑀 1/0.0000 1/0.0000 1/0.0000 ■ 0/0.9574 0/0.4986
𝑃 − 𝐵𝑊 𝑀𝐼𝑛𝑡𝑒𝑟𝑣𝑎𝑙 1/0.0000 1/0.0000 1/0.0000 1/0.0002 ■ 0/0.4986
𝑃 − 𝐵𝑊 𝑀𝐵𝑎𝑟𝑦𝑐𝑒𝑛𝑡𝑒𝑟 1/0.0000 1/0.0000 1/0.0000 1/0.0006 1/0.0052 ■
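To make the reading of Tables 9 and 10 concrete, the sketch below computes the two-sample Kolmogorov–Smirnov statistics from empirical cumulative distributions. It is a minimal illustration under stated assumptions: the text does not say which software produced the h/p-values, the two samples are invented, and the one-sided p-value uses the standard asymptotic approximation rather than an exact computation. A larger cumulative distribution for the row method means that its MSE values tend to be smaller, which is why "greater" is read as "better".

```python
import math


def ecdf(sample, x):
    """Fraction of observations in `sample` that are <= x."""
    return sum(v <= x for v in sample) / len(sample)


def ks_two_sample(a, b):
    """Return (d, d_plus) for samples a and b.

    d_plus = max_x [F_a(x) - F_b(x)] is large when the empirical CDF of `a`
    lies above that of `b`, i.e. when `a` tends to take smaller values.
    d is the usual two-sided statistic max_x |F_a(x) - F_b(x)|.
    """
    grid = sorted(set(a) | set(b))
    diffs = [ecdf(a, x) - ecdf(b, x) for x in grid]
    d_plus = max(max(diffs), 0.0)
    d_minus = max(-min(diffs), 0.0)
    return max(d_plus, d_minus), d_plus


def one_sided_p_value(d_plus, n_a, n_b):
    """Asymptotic approximation exp(-2 * n_eff * d_plus^2) for the one-sided test."""
    n_eff = n_a * n_b / (n_a + n_b)
    return math.exp(-2.0 * n_eff * d_plus ** 2)


# Hypothetical MSE samples, invented for illustration only.
mse_row = [20.0, 22.0, 25.0, 27.0]      # e.g. a P-BWM version
mse_column = [45.0, 50.0, 52.0, 55.0]   # e.g. a BWM version

d, d_plus = ks_two_sample(mse_row, mse_column)
p = one_sided_p_value(d_plus, len(mse_row), len(mse_column))
print(d, d_plus, p)
```

Swapping the two samples gives a one-sided statistic of zero and a p-value of one, which mirrors the 0/1.0000 entries in the upper triangle of Table 10.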
• With respect to Group 2, the difference between the three BWM versions is not significant from the statistical point of view, while the difference between all other pairs of methods is statistically significant.
Performing the "Greater" test on the pairs of methods for which the difference between the cumulative distributions of the MSE values is significant from the statistical point of view, we can state the following (see Table 10):
• At the global level, the cumulative distribution of the MSE values obtained by $P\text{-}BWM_{Barycenter}$ is greater (therefore better) than the one obtained by all the other methods; as second best, we can consider $P\text{-}BWM_{Interval}$, followed by $P\text{-}BWM$ and, then, by $BWM_{Barycenter}$;
• Considering Group 1, the three P-BWM versions are better than all BWM versions, and $BWM_{Barycenter}$ is better than both $BWM$ and $BWM_{Interval}$;
• With respect to Group 2, we have again that $P\text{-}BWM_{Barycenter}$ is better than all other methods, followed by $P\text{-}BWM_{Interval}$ and $P\text{-}BWM$.
Considering the distributions of the MAE values, we can state the following (to save space, we included the tables with the values obtained by the two tests in Appendix C):
• The three $P\text{-}BWM$ versions are equivalent considering the two groups together as well as separately; analogously, the difference between the distributions of MAE values obtained by $BWM$ and $BWM_{Barycenter}$ is not statistically significant, either globally or for each group;
• Considering the "Greater" test, both at the global level and for each group, we can state that P-BWM versions are better than BWM versions and $BWM$ is better than $BWM_{Interval}$.
5.3. Some comments
The results obtained by applying the BWM and the P-BWM to two questionnaires submitted to two different groups of university students support the validity of our proposal with respect to the original BWM. Even if the two considered questionnaires involve the same number of pieces of preference information (17), their nature is different. In the BWM application, the students were asked to provide 17 pairwise comparisons, while, in the P-BWM application, students were asked to provide 10 ratings and only 7 pairwise comparisons. Looking at the values presented in the previous section, this difference in the type of preference information asked of the students made the application of the P-BWM simpler and more reliable than the BWM application.
As to the comparison between the different BWM and P-BWM versions obtained considering the interval and the barycenter as priority vectors, we observed that $P\text{-}BWM_{Barycenter}$ obtains the best results among the six methods. In particular, it is better than the other two P-BWM versions at the global level and considering Group 2, while the difference between the obtained MSE values is not statistically significant with respect to Group 1.
6. Conclusions
In this study, we introduced a new version of the Best–Worst Method (BWM), the parsimonious BWM, to handle decision-making problems involving a large set of alternatives. Following this approach, a manageable subset of the alternatives is chosen, and the Decision-Maker (DM) conducts the pairwise comparisons on this subset following the BWM steps. The priorities found for this subset of alternatives, along with the rating of the whole set (also determined by the DM), are used to rank the whole set of alternatives. The idea behind the procedure we propose is that the errors associated with the evaluation of a large number of alternatives can be corrected by taking into account the priorities obtained from the pairwise comparison of a limited number of well-distributed reference alternatives. We conducted an experiment to test the performance of the new approach, and as the results show, it performs very well against the original BWM. We also showed that the new approach does not require more pieces of information from the DM, which is in line with the main philosophy of the original BWM. We conducted a detailed analysis to reach this conclusion.
Summarizing, we detail the contributions of the paper as follows:
• We introduced a parsimonious version of the BWM called P-BWM, which allows for determining the priorities of alternatives in cases where their large quantity makes it impractical to use the original BWM method,
• We examined AHP, P-AHP, BWM, and P-BWM in terms of the amount of preference information required from the DM to utilize them. This makes it possible to determine the most efficient method based on the number of alternatives/criteria available,
• An experiment with students was performed to compare the performance of the P-BWM with the BWM.
We have some interesting ideas for future research directions. First, although we found outstanding performance for the parsimonious BWM in an experiment (where the subset is defined by the researchers and is fixed for all the subjects), we think the performance of the new approach might even improve if each DM is free to choose a subset herself. In real-world settings, a DM might feel more comfortable/knowledgeable about some particular alternatives. This would be a reasonable criterion to compose the subset from those alternatives. Such a choice could, in principle, lead to more reliable priorities for that subset, hence a more reliable rating of the alternatives in the whole set. This idea could be investigated in a new experimental study. Second, the composition of the subset might relate to some biases that can be investigated in future studies. For instance, having the alternatives with the best and worst performance in the subset might lead to different results than when such alternatives are not in the set. Finally, while the problem in our study is simple (for the sake of experiments), more sophisticated cases need to be studied (alternatives with different dimensions) to test the performance of the new approach. The initial experiments conducted in the paper were carried out with students, but it is recommended that future studies involve actual DMs who are experts in the relevant field of application. This will allow for feedback on any potential bias or limitations of the method when applied in a practical setting.
We plan to continue our research on parsimonious BWM in different directions:
• comparing the parsimonious BWM with other multicriteria scaling methods such as SMART [53] and SWING procedures [54],
• developing a customized version of the method for specific real-world applications in relevant domains such as multicriteria evaluation of sustainable development [55],
• improving the procedure to select the reference points, paying attention to the aspects related to the cooperation with the DM in this specific task,
• developing a procedure to apply the parsimonious BWM to the elicitation of weights in case of decision problems with many criteria, possibly hierarchically organized.
In summary, we would like to work on the idea of decision support procedures that make it possible to obtain better decisions by reducing the amount of required preference information while increasing its salience. This seems a very interesting research perspective for the whole domain of multiple criteria decision aiding.
CRediT authorship contribution statement
Salvatore Corrente: Conceptualization, Methodology, Software, Writing – original draft, Writing – review & editing. Salvatore Greco: Methodology, Writing – original draft, Writing – review & editing. Jafar Rezaei: Conceptualization, Methodology, Writing – original draft, Writing – review & editing.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability
No data was used for the research described in the article.
Acknowledgments
Salvatore Corrente and Salvatore Greco acknowledge the support of the Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR) - PRIN 2017, project "Multiple Criteria Decision Analysis and Multiple Criteria Decision Theory", grant 2017CY2NCA. The authors are grateful to students attending the Marketing and Financial Mathematics courses of the Department of Economics and Business of the University of Catania for their helpful contribution in filling out the questionnaires presented in the paper.
Appendix A. Supplementary data
Supplementary material related to this article can be found online at https://doi.org/10.1016/j.omega.2024.103075.
References
[1] Keeney RL, Raiffa H. Decisions with multiple objectives: Preferences and value tradeoffs. New York: J. Wiley; 1976.
[2] Roy B. Multicriteria methodology for decision aiding. Nonconvex optimization and its applications, Dordrecht: Kluwer Academic Publishers; 1996.
[3] Brans J-P, Vincke P, Mareschal B. How to select and how to rank projects: The PROMETHEE method. European J Oper Res 1986;24(2):228–38.
[4] Saaty T. A scaling method for priorities in hierarchical structures. J Math Psych 1977;15(3):234–81.
[5] Rezaei J. Best-worst multi-criteria decision-making method. Omega 2015;53:49–57.
[6] Rezaei J. Best-worst multi-criteria decision-making method: Some properties and a linear model. Omega 2016;64:126–30.
[7] Brunelli M, Rezaei J. A multiplicative best–worst method for multi-criteria decision making. Oper Res Lett 2019;47(1):12–5.
[8] Mohammadi M, Rezaei J. Bayesian best-worst method: A probabilistic group decision making model. Omega 2020;96:102075.
[9] Liang Y, Ju Y, Tu Y, Rezaei J. Nonadditive best-worst method: Incorporating criteria interaction using the choquet integral. J Oper Res Soc 2023;74(6):1495–506.
[10] Guo S, Zhao H. Fuzzy best-worst multi-criteria decision-making method and its applications. Knowl-Based Syst 2017;121:23–31.
[11] Rezaei J, Arab A, Mehregan M. Analyzing anchoring bias in attribute weight elicitation of SMART, Swing, and best-worst method. Int Trans Oper Res 2022.
[12] Rezaei J, Arab A, Mehregan M. Equalizing bias in eliciting attribute weights in multiattribute decision-making: experimental research. J Behav Decis Mak 2022;35(2):e2262.
[13] Wu Q, Liu X, Zhou L, Qin J, Rezaei J. An analytical framework for the best–worst method. Omega 2024;123:102974.
[14] Kusi-Sarpong S, Orji IJ, Gupta H, Kunc M. Risks associated with the implementation of big data analytics in sustainable supply chains. Omega 2021;105:102502.
[15] Rezaei J, Nispeling T, Sarkis J, Tavasszy L. A supplier selection life cycle approach integrating traditional and environmental criteria using the best worst method. J Clean Prod 2016;135:577–88.
[16] Švadlenka L, Bošković S, Jovčić S, Simic V, Kumar S, Zanne M. Third-party logistics provider selection for sustainable last-mile delivery: A case study of E-shop in Belgrade. J Urb Dev Manag 2023;2(1):1–13.
[17] Mi X, Tang M, Liao H, Shen W, Lev B. The state-of-the-art survey on integrations and applications of the best worst method in decision making: Why, what, what for and what's next? Omega 2019;87:205–25.
[18] Mohammadi M, Rezaei J. Hierarchical evaluation of criteria and alternatives within BWM: A Monte Carlo approach. In: Advances in best-worst method: proceedings of the second international workshop on best-worst method. Springer; 2022, p. 16–28.
[19] Miller GA. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 1956;63(2):81–97.
[20] Saaty TL, Ozdemir MS. Why the magic number seven plus or minus two. Math Comput Modelling 2003;38(3):233–44.
[21] Montibeller G, Von Winterfeldt D. Cognitive and motivational biases in decision and risk analysis. Risk Anal 2015;35(7):1230–51.
[22] Tversky A, Kahneman D. Judgment under Uncertainty: Heuristics and Biases: Biases in judgments reveal some heuristics of thinking under uncertainty. Science 1974;185(4157):1124–31.
[23] Helson H. Adaptation-level as frame of reference for prediction of psychophysical data. Am J Psychol 1947;60(1):1–29.
[24] Helson H. Adaptation-level theory: an experimental and systematic approach to behavior. New York: Harper and Row; 1964.
[25] Parducci A. Category judgment: a range-frequency model. Psychol Rev 1965;72(6):407.
[26] Parducci A. Happiness, pleasure, and judgment: The contextual theory and its applications. Lawrence Erlbaum Associates, Inc; 1995.
[27] Liu NC, Cheng Y. The academic ranking of world universities. Higher Educ Europe 2005;30(2):127–36.
[28] Rowan AN. World happiness report 2023. WellBeing News 2023;5(3):1.
[29] Haakenstad A, et al. Assessing performance of the healthcare access and quality index, overall and by select age groups, for 204 countries and territories, 1990–2019: a systematic analysis from the global burden of disease study 2019. Lancet Global Health 2022;10(12):e1715–43.
[30] Sachs JD, Kroll C, Lafortune G, Fuller G, Woelm F. Sustainable development report 2022, Cambridge University Press; 2022.
[31] Doumpos M, Zopounidis C. Multicriteria analysis in finance, Springer; 2014.
[32] Soebarto VI, Williamson TJ. Multi-criteria assessment of building performance: theory and implementation. Build Environ 2001;36(6):681–90.
[33] Natividade-Jesus E, Coutinho-Rodrigues J, Antunes CH. A multicriteria decision support system for housing evaluation. Decis Support Syst 2007;43(3):779–90.
[34] Cinelli M, Coles SR, Kirwan K. Analysis of the potentials of multi criteria decision analysis methods to conduct sustainability assessment. Ecol Indic 2014;46:138–48.
[35] Dorfleitner G, Halbritter G, Nguyen M. Measuring the level and risk of corporate responsibility–An empirical comparison of different ESG rating approaches. J Asset Manag 2015;16:450–66.
[36] Belton V, Stewart TJ. Multiple criteria decision analysis: an integrated approach. Springer; 2002.
[37] Greco S, Ehrgott M, Figueira JR. Multiple criteria decision analysis: State of the art surveys, Berlin: Springer; 2016.
[38] Greco S, Matarazzo B, Słowiński R. Rough sets theory for multicriteria decision analysis. European J Oper Res 2001;129(1):1–47.
[39] Liang F, Brunelli M, Rezaei J. Consistency issues in the best worst method: Measurements and thresholds. Omega 2020;96.
[40] Smith RL. Efficient Monte Carlo procedures for generating points uniformly distributed over bounded regions. Oper Res 1984;32:1296–308.
[41] Tervonen T, Figueira JR, Lahdelma R, Almeida Dias J, Salminen P. A stochastic method for robustness analysis in sorting problems. European J Oper Res 2009;192(1):236–42.
[42] Arcidiacono SG, Corrente S, Greco S. Scoring from pairwise winning indices. Comput Oper Res 2023;157:106268.
[43] Corrente S, Greco S, Rezaei J. A parsimonious extension of the best-worst method. In: 26th International Conference on Multiple Criteria Decision Making. 2022.
[44] Moslem S. A novel parsimonious best worst method for evaluating travel mode choice. IEEE Access 2023;11:16768–73.
[45] Anderson NH. On the role of context effects in psychophysical judgment. Psychol Rev 1975;82(6):462.
[46] Mellers BA, Birnbaum MH. Loci of contextual effects in judgment. J Exp Psychol [Hum Percept] 1982;8(4):582.
[47] Riskey DR, Parducci A, Beauchamp GK. Effects of context in judgments of sweetness and pleasantness. Percept Psychophys 1979;26:171–6.
[48] Anderson NH. Algebraic models in perception. In: Handbook of perception, vol. 2, 1974, p. 215–98.
[49] Fechner GT. Elemente der psychophysik, vol. 2, Breitkopf u. Härtel; 1860.
[50] Stevens SS. On the psychophysical law. Psychol Rev 1957;64(3):153.
[51] Abastante F, Corrente S, Greco S, Ishizaka A, Lami IM. A new parsimonious AHP methodology: assigning priorities to many objects by comparing pairwise few reference objects. Expert Syst Appl 2019;127:109–20.
[52] Conover WJ. Practical nonparametric statistics, vol. 350, John Wiley & Sons; 1999.
[53] Edwards W. How to use multiattribute utility measurement for social decisionmaking. IEEE Trans Syst Man Cybern 1977;7(5):326–40.
[54] von Winterfeldt D, Edwards W. Decision analysis and behavioral research, Cambridge University Press; 1986.
[55] Boggia A, Cortina C. Measuring sustainable development using a multi-criteria model: A case study. J Environ Manag 2010;91(11):2301–6.