ARMA 11-548
Development of models for geomechanical characterization using Data Mining techniques applied to a database gathered in underground structures
Miranda, T.
University of Minho, Guimarães, Portugal
Ribeiro e Sousa, L.
University of Porto, Civil Engineering Department, Porto, Portugal
Gomes Correia, A.
University of Minho, Guimarães, Portugal
Copyright 2011 ARMA, American Rock Mechanics Association
This paper was prepared for presentation at the 45th US Rock Mechanics / Geomechanics Symposium held in San Francisco, CA, June 26-29, 2011.
This paper was selected for presentation at the symposium by an ARMA Technical Program Committee based on a technical and critical review of
the paper by a minimum of two technical reviewers. The material, as presented, does not necessarily reflect any position of ARMA, its officers, or
members. Electronic reproduction, distribution, or storage of any part of this paper for commercial purposes without the written consent of ARMA is
prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract
must contain conspicuous acknowledgement of where and by whom the paper was presented.
ABSTRACT: The evaluation of geomechanical parameters for rock masses carries one of the largest degrees of uncertainty in geotechnical design. This is especially true in the preliminary stages of design and in works where geotechnical information is scarce. Data Mining techniques have been successfully used in many fields but scarcely in geotechnics. They are advanced techniques that allow the analysis of large and complex databases such as those that can be built with geotechnical information. In this work, a large database of geotechnical data produced in the scope of a large underground structure was gathered, and these innovative tools were used to analyze it and induce new and useful knowledge. The main goal was to develop new and reliable models to predict geomechanical parameters for rock masses, namely friction angle, cohesion and deformability modulus, when only limited data are available.
1. INTRODUCTION
The evaluation of geomechanical parameters in the
preliminary design stages of underground works is
normally based on scarce and uncertain data. When a small
amount of data is available, geomechanical
information concerning other works, developed in
similar rock masses, can help in defining values for
the parameters.
The number and type of tests performed in
geotechnical site investigation is related to the
importance of the work, the inherent risk and
budget issues. In geotechnical works where the
available geological-geotechnical data is limited,
the geomechanical parameters are set based on the
available data and conservative engineering
judgment. In these cases, great amounts of
geotechnical data produced in large projects could
help in reducing uncertainties related to the
definition of design values for the parameters.
Therefore, the advantages of using geotechnical
data gathered from several different projects are
indubitable. However, this is not a straightforward
process. The central question is how vast quantities
of data with different types and origins can be
managed and explored in order to develop models
that can provide a background for future projects.
A first step towards solving this problem is defining
standard ways of collecting, organising, and
representing data. To analyse these databases and
repositories, there are currently automatic tools from
fields such as artificial intelligence and pattern
recognition, which allow a deeper understanding of
large and complex databases and make it possible to
explore and discover potential embedded knowledge [1].
It is believed that automated data analysis tools like
Data Mining (DM) can help in developing complex
"data-driven" models.
The formal analysis process, normally called
Knowledge Discovery in Databases (KDD), defines
the main procedures for transforming raw data into
useful knowledge. DM is just one step in the KDD
process concerned with the application of
algorithms to the data in order to obtain models.
These tools allow a deep analysis of complex data
(i.e. data with a large number of variables,
independent determinations and complex and
unclear relations with other variables), which would
otherwise be very difficult using classical statistics
or relying on one or even a panel of human experts,
who could overlook important details. However, the
computational process cannot substitute human
experts. Computational tools are only a complement
that allows the automatic finding of patterns and
models embedded in the data. The knowledge
discovered in the process must be explainable in the
light of science and experience and must be
validated before being used in other applications.
DM aims at the extraction of useful patterns from
data. The knowledge derived through DM is often
described as models or patterns and it is important
that this knowledge is novel and understandable.
DM is a relatively new area of computer science
that is positioned at the intersection of statistics,
machine learning, data management and databases,
pattern recognition, artificial intelligence, and other
areas.
There are several DM techniques, each with their
own purposes and capabilities. Examples of these
techniques include Decision Trees and Rule
Induction, Neural and Bayesian Networks, Learning
Classifier Systems and Instance-Based algorithms
[2, 3].
In this paper, a KDD process was developed using
geotechnical data gathered in an important
underground work recently built in the North of
Portugal in a predominantly granite rock mass. New
alternative regression models were developed using
multiple regression and artificial neural networks
for the analytical calculation of strength and
deformability parameters. These models were built
considering different sets of input data, allowing
their application in different scenarios of data
availability. Most of the models use less
information than the original formulations but
maintain a high predictive accuracy, which can be
useful in the preliminary design stages in any case
where geological/geotechnical information is
limited. The present study also provided insight into
the parameters that most influence the behaviour of
the rock mass.
2. MATERIALS AND METHODS

2.1. Geotechnical data
The geotechnical data used to build the database
were related to an important underground facility
built in the North of Portugal called Venda Nova II,
which involves mostly a granite rock mass with
good overall geomechanical characteristics. The
main goal was to develop alternative ways for the
prediction of geomechanical parameters
(deformability modulus - E; friction angle - φ'; and
cohesion - c').
The available data consisted mainly of applications of the
empirical systems and results from laboratory tests
(uniaxial compressive strength and sliding of
discontinuities) and in situ tests (Small Flat Jacks,
SFJ; Large Flat Jacks, LFJ; and dilatometers) [4, 5,
6]. Only the empirical systems applications formed a
large enough set to be mined. Data were dispersed
over a large number of spreadsheet files and it was
necessary to remove duplicated records. The
small number of some tests prevented their
inclusion in the process because each input
variable needs a large amount of data. Finally, data
were organised and structured in a database composed
of 1230 examples and twenty-two attributes or variables.
Based on the original attributes the geomechanical
parameters were computed by means of analytical
solutions. Later in this work, the adopted
methodologies used to compute these parameters
will be presented. After their calculation, the
geomechanical parameters were added to the
database together with other attributes to check their
possible influence on the models. Globally, eleven
new attributes were added. Table 1 presents all the
attributes present in the database. From now on,
the geomechanical parameters obtained using this
methodology will be called "computed values"
while the ones obtained with the DM models will be
called "predicted values". It is important to mention
that the computed values for the deformability
modulus were calibrated with the results of reliable
and large scale in situ tests, as will be described
later.
In spite of the high number of records within the
database, it presented some limitations, namely high
uniaxial compressive strength (σc > 100 MPa), RQD
values over 65% and slightly wet to dry rock mass.
The models developed in this work should only be
applied to rock masses with similar characteristics.
Table 1 - Name and description of the attributes in the database
Name | Description
RQD | Rock Quality Designation
Jw, Jn, Jr, Ja, SRF | Q system factors related to: underground water, number of joint sets, joint rugosity, weathering degree of joints and stress state in the rock mass, respectively.
Q | Rock mass quality index proposed by [7]
Q' | Altered form of the Q index (Q' = RQD/Jn * Jr/Ja)
σc | Uniaxial compressive strength
P1, P2, P3, P4, P5, P6 | RMR weights related to: uniaxial compressive strength of the intact rock, RQD, joint spacing, joint conditions, underground water conditions and joint orientation, respectively.
P41, P42, P43, P44, P45 | Joint conditions: persistence, aperture, rugosity, filling and weathering, respectively.
RMR, class | Rock Mass Rating proposed by [8] and classification based on this value
RQD/Jn, Jr/Ja, Jw/SRF | Ratios of the Q system representing: compartmentalisation of the rock mass, shear strength of joints and an empirical factor named active stress, respectively.
logQ, logQ' | Base 10 logarithm of the Q and Q' values
GSI | Geological Strength Index proposed by [9]
N | Altered form of the Q index (N = RQD/Jn * Jr/Ja * Jw)
RCR | Altered form of the RMR index (RCR = P2+P3+P4+P5+P6)
φ', c', E | Friction angle, cohesion and deformability modulus
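The derived attributes of Table 1 follow directly from the basic ratings. The sketch below is only an illustration: the function name, the dictionary keys and the input values are hypothetical, Q is taken as Barton's usual product RQD/Jn * Jr/Ja * Jw/SRF, and N is assumed to be RQD/Jn * Jr/Ja * Jw.

```python
import math

def derived_attributes(rqd, jn, jr, ja, jw, srf, p):
    """Derive the index attributes of Table 1 from the basic ratings.

    `p` holds the six RMR weights P1..P6 in order (hypothetical layout).
    """
    q_prime = (rqd / jn) * (jr / ja)   # Q' = RQD/Jn * Jr/Ja
    q = q_prime * (jw / srf)           # Q  = Q' * Jw/SRF
    n = q_prime * jw                   # N  = Q' * Jw (assumed definition)
    return {
        "RQD/Jn": rqd / jn,
        "Jr/Ja": jr / ja,
        "Jw/SRF": jw / srf,
        "Q": q,
        "Q'": q_prime,
        "N": n,
        "logQ": math.log10(q),
        "logQ'": math.log10(q_prime),
        "RMR": sum(p),                 # RMR = P1 + ... + P6
        "RCR": sum(p[1:]),             # RCR = P2 + ... + P6
    }

# Hypothetical record, not taken from the database
example = derived_attributes(rqd=90, jn=9, jr=3, ja=1, jw=1, srf=2.5,
                             p=[12, 17, 15, 20, 15, -5])
```

Keeping the derived indexes in one function makes it easy to add them as extra columns to each record before the models are fitted.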
2.2. Modelling and evaluation

The algorithms used for the regression models were
multiple regression and Artificial Neural Networks
(ANN). The applied ANN was a multilayer feedforward network with one hidden layer of six
neurons, which showed a good performance in trial
calculations. Nonetheless, focus was drawn to
the multiple regression models because obtaining
the explanatory physical knowledge behind the
models, which is possible using this technique, was
considered to be very important. Moreover, these
models are easier to use and implement. The ANN
models were mainly used for comparison purposes.
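As an illustration of the network topology described above, the following minimal sketch builds a feedforward network with one hidden layer of six neurons and runs a forward pass. The logistic activation, the linear output unit, the random initial weights and all names are assumptions; the paper does not state these details, and no training is performed here.

```python
import math
import random

def make_network(n_inputs, n_hidden=6, seed=0):
    """Random initial weights for a one-hidden-layer feedforward network."""
    rng = random.Random(seed)
    # Each hidden neuron has one weight per input plus a bias (last entry)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    # The output neuron has one weight per hidden neuron plus a bias (last entry)
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def predict(network, x):
    """Forward pass: logistic hidden units, linear output for regression."""
    hidden, output = network
    activations = []
    for weights in hidden:
        z = sum(w * xi for w, xi in zip(weights[:-1], x)) + weights[-1]
        activations.append(1.0 / (1.0 + math.exp(-z)))
    return sum(w * a for w, a in zip(output[:-1], activations)) + output[-1]

net = make_network(n_inputs=6)                # e.g. six RMR weights as inputs
y = predict(net, [12, 17, 15, 20, 15, -5])    # untrained, so y is arbitrary
```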
In regression problems, the goal is to estimate the
model which minimizes an error measurement
between real and predicted values considering N
examples. The error measures used were the Mean
Absolute Deviation (MAD) and the Root Mean
Squared Error (RMSE).
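The two error measures can be written compactly as follows. This is a minimal sketch with hypothetical function names and made-up values, using the usual definitions of MAD (mean of the absolute errors) and RMSE (square root of the mean squared error) over N examples.

```python
import math

def mad(observed, predicted):
    """Mean Absolute Deviation over N examples."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root Mean Squared Error over N examples."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

# Made-up values for illustration only
computed = [40.0, 50.0, 60.0]
predicted = [41.0, 48.0, 60.0]
mad_value = mad(computed, predicted)
rmse_value = rmse(computed, predicted)
```

Because the RMSE squares the errors before averaging, it penalises large individual errors more than the MAD does.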
The holdout method was used to validate and assess
the accuracy of the models. In this method, data are
randomly partitioned into two independent sets, a
training and a test set. Two-thirds of the data were
used for training and one-third was used for testing.
The training set is used to estimate the model and its
accuracy is evaluated using the test set. For each
model, ten runs were performed, randomising the
data within the training and testing sets. The mean
and confidence intervals for the error measures
were then computed considering the results of the
ten runs and a 95% confidence interval of a Student's t distribution. These statistical measures
define the range of expected errors for future
predictions of the final model, which is estimated
using all the data for training.
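The repeated holdout procedure can be sketched as below. The example is not the authors' implementation: it uses synthetic data, a single-variable least-squares line in place of the multiple regression, and hypothetical function names. The constant 2.262 is the two-sided 95% Student's t quantile for 9 degrees of freedom, so it applies only to ten runs.

```python
import math
import random
import statistics

T_975_DF9 = 2.262  # 95% two-sided Student's t quantile, 9 degrees of freedom (ten runs)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def holdout_rmse(data, rng):
    """One holdout run: random split, 2/3 for training and 1/3 for testing."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    train, test = shuffled[:cut], shuffled[cut:]
    a, b = fit_line([x for x, _ in train], [y for _, y in train])
    return math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in test) / len(test))

def repeated_holdout(data, seed=1):
    """Mean test RMSE over ten runs and half-width of its 95% t-interval."""
    runs = 10  # fixed, because T_975_DF9 is only valid for ten runs
    rng = random.Random(seed)
    errors = [holdout_rmse(data, rng) for _ in range(runs)]
    mean = statistics.fmean(errors)
    half_width = T_975_DF9 * statistics.stdev(errors) / math.sqrt(runs)
    return mean, half_width

# Synthetic data (not from the paper): y = 30 + 0.4*x plus Gaussian noise
noise = random.Random(0)
data = [(x, 30 + 0.4 * x + noise.gauss(0, 1)) for x in range(20, 100)]
mean_rmse, ci_half_width = repeated_holdout(data)
```

The result is reported as mean ± half-width, which is the form used for the error measures in the results tables.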
In addition to the error measures, the coefficient of
determination (R²) was also used. For the ANN,
only the RMSE measure was computed due to
computational limitations.
3. RESULTS
3.1. Strength parameters
Calculation of the parameters values
Neither the strength nor the deformability parameters were
originally present in the database. They were
indirectly derived from the available information
using analytical methodologies. Concerning the
strength parameters, the main goal was to develop
models to predict the Mohr-Coulomb parameters
using different types of data. To obtain the values of
these parameters to include in the database, the
Hoek and Brown (H-B) strength parameters were
first computed. Then c' and φ' were derived by
fitting an average linear relationship to the
generated failure envelope formulated in effective
stresses [9].
The prediction models for φ' and c' were developed
considering a reference depth (H) of 350 meters (the
depth of the main caverns of the powerhouse
complex) and a disturbance factor (D) of zero. To
allow for a simple and direct transformation of the
values predicted by the models for other conditions
(different H and D), a parametric study was
performed. Based on this study, a generic
methodology to transform the geomechanical
parameters for a given H and D to a different pair of
values was developed and then particularised for the
DM models. The generic methodology is based on
the application of two correction factors, one for
each parameter and is described in [10].
Developed models for friction angle (φ')
Figure 1 shows a plot of the most important parameters in the prediction of φ'. A large number of variables are related to its prediction and several are similarly important. However, the most important ones are: (i) σc (UCS in the figure), which was expected because this value is also a strength measure, and (ii) the Q index (with logarithmic transformation) and other variables related to the Q system. This is unexpected because the Q system is normally used only for classification purposes and not for the calculation of strength parameters considering the rock mass as a continuum medium. Nevertheless, the Q index is very complete and should be used in models for the prediction of geomechanical parameters.
Fig. 1. Relative importance of the attributes for the φ' prediction. [Bar chart: x-axis, variable; y-axis, relative importance (%).]
In this context, several sets of parameters were tested to obtain the best models that could simplify the way φ' is calculated. The input variable sets (IVS) which presented the best results were:
IVS 1: all variables.
IVS 2: Q; log Q; Q'; log Q'; RMR.
IVS 3: all RMR parameters (P1, P2, ..., P6).
IVS 4: RMR parameters P1, P4 and P6.
The results for the different IVS are presented in Table 2. The expressions for the regression models of IVS 2, 3, and 4 are the following:
$\phi' = 40.566 - 0.398\,Q + 0.342\,Q' + 6.726 \log Q - 4.853 \log Q' + 0.260\,\mathrm{RMR}$ (1)
$\phi' = 27.143 + 1.867\,P_1 + 0.184\,P_2 + 0.145\,P_3 + 0.165\,P_4 + 0.246\,P_5 + 0.181\,P_6$ (2)
$\phi' = 32.146 + 2.123\,P_1 + 0.229\,P_4 + 0.211\,P_6$ (3)
Table 2 - Results for the models using the different IVS for φ' prediction
IVS | R² | Regression MAD | Regression RMSE | ANN RMSE
1 | 0.968 ± 0.004 | 0.521 ± 0.020 | 1.002 ± 0.106 | 0.672 ± 0.195
2 | 0.869 ± 0.012 | 1.162 ± 0.043 | 2.019 ± 0.154 | 1.970 ± 0.502
3 | 0.965 ± 0.001 | 0.600 ± 0.021 | 1.051 ± 0.068 | 0.807 ± 0.092
4 | 0.952 ± 0.002 | 0.776 ± 0.019 | 1.226 ± 0.071 | 2.290 ± 0.303
As expected, the models using IVS 1 were the most accurate. Nevertheless, the remaining models also presented good predictive performances. IVS 3, which uses all the RMR parameters, is only slightly outperformed by IVS 1. The error measures and R² are very close. The good behaviour of this model can also be observed in Figure 2, the plot of computed versus predicted values. For a wide range of values, approximately from 35° to 63°, the prediction capacity is very uniform and reliable because the values lie near the 45° slope line, despite a small accuracy reduction for the lower values of φ'. This range of values covers a great variety of possible weathering states of the granite rock mass from fresh rock to transition from rock to soil, i.e., it excludes only the soil state.
Fig. 2. Computed versus predicted φ' values for regression model with IVS 3. [Scatter plot: x-axis, computed φ' values; y-axis, predicted φ' values.]
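Equations (1) to (3) can be evaluated directly once the input ratings are known. The sketch below is an illustration only: the function names and input values are hypothetical, the Q and log Q' terms of equation (1) are taken as negative, the logarithms are base 10 as in Table 1, and the result is in degrees for the reference conditions H = 350 m and D = 0.

```python
import math

def phi_ivs2(q, q_prime, rmr):
    """Equation (1): friction angle from Q, Q' and RMR (degrees)."""
    return (40.566 - 0.398 * q + 0.342 * q_prime
            + 6.726 * math.log10(q) - 4.853 * math.log10(q_prime)
            + 0.260 * rmr)

def phi_ivs3(p1, p2, p3, p4, p5, p6):
    """Equation (2): friction angle from the six RMR weights (degrees)."""
    return (27.143 + 1.867 * p1 + 0.184 * p2 + 0.145 * p3
            + 0.165 * p4 + 0.246 * p5 + 0.181 * p6)

def phi_ivs4(p1, p4, p6):
    """Equation (3): friction angle from P1, P4 and P6 only (degrees)."""
    return 32.146 + 2.123 * p1 + 0.229 * p4 + 0.211 * p6

# Hypothetical RMR weights, not taken from the database
weights = dict(p1=12, p2=17, p3=15, p4=20, p5=15, p6=-5)
phi3 = phi_ivs3(**weights)
phi4 = phi_ivs4(weights["p1"], weights["p4"], weights["p6"])
```

For this hypothetical record the two RMR-based models give values within a fraction of a degree of each other, which is the kind of agreement the error measures in Table 2 suggest.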
IVS 2 presented the worst performance. In spite of
using information from the RMR and Q
coefficients, it was outperformed by the simpler
models. For the case of φ', the use of specific
information about rock mass characteristics
presented better results than using overall quality
indexes like the RMR and Q. This model had the
worst performance within the approximate range of
35° to 45°, in which absolute errors up to 10° were
found, thus estimates in this range should be used
with caution. Nevertheless, the MAD and RMSE
values point to a mean expected overall prediction
error between 1° and 2°, which is small.
The most important RMR parameter was by far the
one related to σc, as can be seen in
Figure 2, meaning that in granite
rock masses φ' is closely related to this strength
measure. The variables related to joint conditions
and orientation (P4 and P6, respectively) also appear
to be good predictors. Even though a high
importance of the joint conditions was expected in
the prediction of φ', the considerable weight of the
parameter related to the joint orientation is harder
to justify. It may be due to limitations of the
database or even of the RMR system itself, which
can overrate the importance of this parameter.
IVS 4 uses the three parameters related to the joints
for the prediction of φ' with very good results, as
shown in the plot of Figure 3. Compared with
IVS 1 and 3, error measures are higher but the
model has the advantage of being simpler because it
uses only three parameters. Considering the MAD
and RMSE values from Table 3, the mean expected
error for these models is only about 1°, which can
be considered negligible for engineering purposes.
Fig. 3. Computed versus predicted φ' values for regression model with IVS 4. [Scatter plot: x-axis, computed φ' values; y-axis, predicted φ' values.]
The ANN outperformed the regression models for IVS 1 to 3 in terms of the RMSE, especially for IVS 1, where this error measure was reduced by more than 30%. For IVS 4, the RMSE of the ANN is 87% higher than the RMSE of the regression model. The ANN performs worse when using fewer parameters. Nevertheless, the RMSE of all the trained ANN was acceptable for every model, which means that they are highly accurate in the prediction of φ'.
Developed models for cohesion (c')
The preliminary runs for this parameter highlighted the necessity of a variable transformation to enhance the prediction capacity of the models. After some preliminary tests it was concluded that the logarithmic transformation (ln(c')) was the most suited for this case.
Figure 4 shows that for the prediction of c' a great number of variables have similar importance. GSI appears to be the main parameter, which is expected because GSI is used in the original formulation for the calculation of c'. GSI was not considered for the development of the new models because the main goal was to develop alternative ones which use different parameters. The models which presented the best results used similar input variable sets to those used for φ', meaning that they are the ones most closely related to these geomechanical parameters. Thus, the most accurate IVS were considered equal to the ones used for the φ' prediction.
Fig. 4. Relative importance of the attributes for the ln(c') prediction. [Bar chart: x-axis, variable; y-axis, relative importance (%).]
Table 3 presents the main results. The expressions for the regression models of IVS 2, 3, and 4 are:
$\ln c' = 0.747 + 0.00099\,Q + 0.0394 \log Q' + 0.0298\,\mathrm{RMR}$ (4)
$\ln c' = 0.906 + 0.067\,P_1 + 0.022\,P_2 + 0.027\,P_3 + 0.033\,P_4 + 0.021\,P_5 + 0.022\,P_6$ (5)
$\ln c' = 0.191 + 0.059\,P_3 + 0.046\,P_4 + 0.021\,P_6$ (6)
Table 3 - Results for the models using the different IVS for c' prediction (in MPa)
IVS | R² | Regression MAD | Regression RMSE | ANN RMSE
1 | 0.986 ± 0.002 | 0.038 ± 0.002 | 0.058 ± 0.005 | 0.055 ± 0.006
2 | 0.963 ± 0.002 | 0.054 ± 0.002 | 0.092 ± 0.004 | 0.085 ± 0.006
3 | 0.973 ± 0.002 | 0.054 ± 0.001 | 0.078 ± 0.003 | 0.043 ± 0.006
4 | 0.913 ± 0.007 | 0.097 ± 0.003 | 0.143 ± 0.008 | 0.128 ± 0.009
The expected error for these regression models is
approximately 0.21 MPa which is acceptable
considering the range of values of c'. When using
the models, attention should be paid to the
conservative estimations for higher c' values.
Despite the logarithmic transformation, a slight
non-linear trend is still observed which is probably
the main reason for the enhanced behaviour of the
ANN, especially for IVS 3 where the RMSE value
is almost one half of the regression model RMSE.
Concerning IVS 3, the most important RMR
parameters for c' prediction were those related to
the joints (P3, P4 and P6). This is odd since the
parameter P1 presented a high influence on the
prediction of c' which is corroborated by
engineering practice. As observed for φ', the
parameter P6 appears with an overstated
importance, which can be, as stated before, due to
limitations of the database or of the RMR system
itself.
The results for IVS 2 and 3 are quite similar in
terms of the error measures and R². However,
Figure 5 shows different behaviours over the range of
c' values. For IVS 3, the predicted values show a
relatively stable trend until values of approximately
6 MPa. For higher values, a strong accuracy loss is
observed and the model tends to underestimate c'.
On the other hand, IVS 2 shows a higher dispersion
than the previous set for values below 6 MPa. For
values above this threshold, there is also a tendency,
although not so pronounced, to underestimate c'.
IVS 3 has the advantage of being a simpler model
because it requires less information.
Fig. 5. Computed versus predicted c' values for regression models with a) IVS 2 and b) IVS 3. [Scatter plots: x-axis, computed c' values (MPa); y-axis, predicted c' values (MPa).]
Nevertheless, IVS 4 was created considering only
the three abovementioned parameters. As expected,
an accuracy loss is observed. Again, a non-linear
trend is present with an overestimation tendency for
the lowest values (<1.1 MPa) and underestimation
tendency for the highest values (>6.4 MPa). Still,
the average expected error is about 0.32 MPa.
Considering the range of c' values, this regression
model provides a reasonable preliminary estimation
even though it is outperformed by the ANN,
which presents an RMSE value about 10% lower.
3.2. Deformability modulus
Calculation of the parameter value
The deformability modulus (E) is an important input parameter in any rock mass behaviour analysis. However, this parameter is not an intrinsic material characteristic since it depends on other factors, mainly the associated strain level. Several different deformability moduli can be defined. The value to use in design should be associated with the expected level of strains according to the serviceability limit state of the structure.
Most procedures found in the literature to estimate this parameter for isotropic rock masses are based on simple expressions related to empirical systems or other index values like the RQD [11] and intact rock modulus - Ei [12, 13, 14].
Most authors based their expressions on field test data reported by [15] and [8] and, in some cases, by [16]. They mostly refer to the secant modulus, typically for deformations corresponding to 50% of the peak load, which is thought to be higher than the serviceability levels of most geotechnical works built in rock masses. Thus, it is expected that these expressions typically provide conservative estimates of E.
The expressions used in this study, their limitations and authors are presented in Table 4. To obtain one final value of E from the application of these expressions, a statistical methodology was established. For each case, the results of all expressions were computed, as well as their mean and standard deviation. The values outside the range of one standard deviation from the mean were eliminated and the mean of the remaining values was computed and considered the final value of E.
The values of E obtained by the described methodology were compared with the results of LFJ tests [4, 5]. The results of these tests were compared with the computed values of the parameter near the area where the tests took place so the values could be comparable. Table 5 presents some statistical results of this evaluation. A Bayesian approach was also used for updating the deformability values [23].
There is a similarity between the mean values obtained by both methodologies (around 4% of variability). The main difference is the higher dispersion in the calculated values, reflected in a higher standard deviation, which is expected because the LFJ tests are much more accurate in measuring E than the empirically based expressions. However, despite the higher variability, the calculated values match well the ones obtained by reliable in situ tests and can be considered realistic predictions.
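The statistical screening used to obtain one final value of E from the empirical expressions can be sketched as follows. The function name and the input values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was applied.

```python
import statistics

def final_modulus(estimates):
    """Mean of the estimates lying within one standard deviation of their mean."""
    mean = statistics.fmean(estimates)
    std = statistics.stdev(estimates)  # sample standard deviation (assumption)
    kept = [e for e in estimates if abs(e - mean) <= std]
    return statistics.fmean(kept)

# Hypothetical results (GPa) of four empirical expressions for one record
estimates_gpa = [10.0, 12.0, 11.0, 30.0]
e_final = final_modulus(estimates_gpa)
```

In this made-up case the value of 30 GPa lies more than one standard deviation above the mean of 15.75 GPa, so it is discarded and the final value is the mean of the other three estimates.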
Table 4 - Expressions used for the calculation of E
Expression | Limitations | Reference
$E = 2\,\mathrm{RMR} - 100$ | RMR > 50 and σc > 100 | [8]
$E = 10^{(\mathrm{RMR} - 10)/40}$ | RMR ≤ 80 | [15]
$E = 0.1\,(\mathrm{RMR}/10)^{3}$ | Not limited | [17]
$E = (E_i/100)\,(0.0028\,\mathrm{RMR}^{2} + 0.9\,\exp(\mathrm{RMR}/22.82))$ | E < Ei | [18]
$E = 10\,Q_c^{1/3};\ Q_c = Q\,\sigma_c/100$ | Q ≤ 1 | [19]
$E = 25 \log Q$ | Q > 1 | [20]
$E = 1.5\,Q^{0.6}\,E_i^{0.14}$ | E ≤ Ei and Q ≤ 500 | [21]
$E = (1 - D/2)\,\sqrt{\sigma_c/100}\;10^{(\mathrm{RMR} - 10)/40}$ | σc ≤ 100 | [9]
$E = (1 - D/2)\,10^{(\mathrm{RMR} - 10)/40}$ | σc > 100 | [9]
$E = 100000\,\dfrac{1 - D/2}{1 + \exp((75 + 25D - \mathrm{GSI})/11)}$ | Not limited | [22]
$E = E_i\,\dfrac{1 - D/2}{1 + \exp((60 + 15D - \mathrm{GSI})/11)}$ | Not limited | [22]
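Three of the simpler expressions in Table 4 can be coded with their validity limits as a guard. This is a sketch under assumptions: the function names are hypothetical, the limits are read as RMR > 50, RMR ≤ 80 and Q > 1, the results are taken to be in GPa, and the additional σc > 100 condition of the first expression is not checked.

```python
import math

def e_linear_rmr(rmr):
    """E = 2 RMR - 100, returned only for RMR > 50 (GPa assumed)."""
    return 2 * rmr - 100 if rmr > 50 else None

def e_power_rmr(rmr):
    """E = 10^((RMR - 10) / 40), returned only for RMR <= 80 (GPa assumed)."""
    return 10 ** ((rmr - 10) / 40) if rmr <= 80 else None

def e_log_q(q):
    """E = 25 log10(Q), returned only for Q > 1 (GPa assumed)."""
    return 25 * math.log10(q) if q > 1 else None

# Collect the estimates that are valid for one hypothetical record
candidates = [e_linear_rmr(70), e_power_rmr(70), e_log_q(12)]
valid = [e for e in candidates if e is not None]
```

Returning None outside the stated limits keeps invalid estimates out of the later averaging step.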
Table 5 - Comparison between calculated and measured values of E [23].
Quantity | N | Mean | 95% confidence interval for mean | Std. deviation
E (GPa) - LFJ | 160 | 36.9 | 35.9-37.8 | 6.1
E (GPa) - calculated | 76 | 38.5 | 34.5-42.5 | 17.6
Obtained models for E

The study started with the development of the most
accurate models to estimate E. Then, the number of
input variables was reduced to obtain simpler
models using the most important variables [24].
Preliminary calculations showed that a
logarithmic transformation (ln(E)) would improve
the accuracy of the models. The main reason for this
transformation was to avoid the prediction of
negative values for E in poorer rock mass
conditions, which was observed in some cases with
the linear model.
The parameters that produced the most accurate
model were the ones directly related to
geomechanical indexes, namely the RMR and Q
because they are used in most of the analytical
expressions. Additionally, these indexes assemble
important information for the rock mass
deformability prediction. These models can be used
for the prediction of E when a thorough
characterisation of the rock mass is available. The
results are presented in Table 6. The obtained
regression model is the following:
$\ln E = 2.622 + 0.2594\,Q^{0.25} + 0.1185\,\mathrm{RMR} - 0.00058\,\mathrm{RMR}^{2}$ (7)
Table 6. Results for the models which use the RMR and Q coefficients
R² | Regression MAD | Regression RMSE | ANN RMSE
0.978 ± 0.001 | 0.088 ± 0.004 | 0.137 ± 0.009 | 0.141 ± 0.016
This regression model is highly accurate and even slightly outperforms the ANN model in terms of the RMSE. Because ln(E) ranged from approximately 1.57 to 4.22, the error can be considered negligible for engineering practice. The model is stable for all ranges of observed values as shown in Figure 6.
Fig. 6. Computed versus predicted ln(E) values for regression model with the RMR and Q parameters. [Scatter plot: x-axis, computed lnE values; y-axis, predicted lnE values.]
In terms of geomechanical coefficients, the RMR
was the most important for the prediction of E.
Indeed, several regression models were tested but
the most reliable were based on this index. A simple
correlation between E and RMR using all available
data led to very acceptable results. The expression
for this correlation is the following:
$E\ (\mathrm{GPa}) = 3 \times 10^{-5}\,\mathrm{RMR}^{3.2388}$ (8)
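Equation (8) is a single power law and is easy to evaluate. The sketch below assumes the coefficient is 3 × 10⁻⁵ with E in GPa, which gives moduli of a few tens of GPa for good quality rock masses, in line with the values in Table 5. The function name is hypothetical.

```python
def e_from_rmr(rmr):
    """Equation (8): deformability modulus (GPa) from RMR alone."""
    return 3e-5 * rmr ** 3.2388

# Tabulate the correlation for a few RMR values
for rmr in (40, 60, 80):
    print(rmr, round(e_from_rmr(rmr), 1))
```

Because the exponent is larger than 3, the predicted modulus grows quickly with RMR, so small differences in RMR matter most in the upper range.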
When only parameters related to the joints are
available (P3, P4 and P6), which also proved to be
important parameters in the prediction of E, the
procedure that leads to better results is first to
calculate the RMR with a model based on these
parameters [10] and then to calculate the final value
of E using equation 8. Table 7 presents the
results for these two methods. In these cases, there
are no confidence intervals because the results were
based on a simple correlation procedure using all
data. The error measures and the plots of computed
versus predicted values are also presented in
logarithmic form for comparison with the previous
models (Figure 7).
Table 7 - Results for the models which use the RMR and only some parameters of this index
Form | Correlation with RMR: R² | MAD | RMSE | Correlation P3, P4, P6 → RMR → E: R² | MAD | RMSE
linear | 0.962 | 2.357 | 3.156 | 0.930 | 3.120 | 4.138
logarithmic | 0.970 | 0.116 | 0.164 | 0.889 | 0.192 | 0.319
Fig. 7. Computed versus predicted values for the correlation with RMR. [Scatter plot: x-axis, computed lnE values; y-axis, predicted lnE values.]
The correlation with the RMR index presents very
good results because there is a very low accuracy
loss compared to the previous more complex model
which considers information provided by the Q
index. Figure 7 shows the plot of computed versus
predicted values and it substantiates this conclusion,
showing a good distribution of values around the
45° line. The correlation has the advantage of
avoiding the Q index evaluation. However, it does
not have the statistical validation present in the
previous, more complex model.
For the last method, the decreasing accuracy is
much more significant, especially for E values
corresponding to poorer rock masses (ln(E)<1 or
RMR<34).
4. FINAL CONSIDERATIONS
An innovative work was performed considering the
use of DM tools to uncover new knowledge in a
database of geotechnical data through a KDD
process. The goal was to develop new models for
the evaluation of geomechanical parameters that
could be used in many conditions of data
availability. The main achievements of this work
are pointed out in the next items.
• Identification of a new and important field of study that can enhance the way geotechnical data are analysed. The DM field can provide a significant contribution in increasing the knowledge about formations of interest.
• Development of new and reliable regression models based on multiple regression and ANN for the calculation of the geomechanical parameters φ', c' and E. Several models were developed using different sets of input information, which allow their use in different conditions of knowledge about the rock mass and can be helpful in the decision-making process. Most of the estimated models use less information than the original formulations while maintaining high accuracy.
• Enhancement of the understanding of the main parameters related to the behavior of the granite rock mass. The utmost importance of joint characteristics was verified. However, some conclusions, like the importance of joint orientation, were probably due to limitations of the database or even of the empirical systems themselves, in particular the RMR.
• The relevance of the Q index for determining rock mass strength parameters was already known since the relation atan(Jr/Ja) is used to approximate the inter-block shear strength. In this work it was found that the Q index can also be used to compute strength parameters of jointed rock masses assumed as a continuum medium.
DM methods that search for and infer patterns or models, such as Bayesian Networks (BN), can also be used to improve the predictions and the developed models [4, 5]. A research project for DUSEL is now under way that proposes the use of BN with DM for the development of new models. These models are expected to have higher accuracy than the existing ones.
ACKNOWLEDGMENTS
The authors wish to express their thanks to EDP
Produção EM for the authorization and for making
available the necessary data.
REFERENCES
1. Hand, D., Mannila, H. and Smyth, P. (2001). Principles
of Data Mining. MIT Press, Cambridge, MA.
2. Lee, S. and Siau, K. (2001). A review of Data Mining
techniques. Industrial Management & Data Systems,
MCB University, 101(1): 41-46.
3. Berthold, M. and Hand, D. (2003). Intelligent data
analysis: an introduction. Springer, Second Edition.
4. LNEC. (1983). LNEC collaboration in the geological
and geotechnical studies concerning the hydraulic
circuit of Venda Nova II (in Portuguese). Technical
report, LNEC Report 47/1/7084, Lisboa, Portugal.
5. LNEC. (2003). Venda Nova II scheme. Determination
of the state of stress (in Portuguese). Technical report,
LNEC Report 371/03, Lisboa, Portugal.
6. LNEC. (2005). Venda Nova II scheme. Laboratory tests
for the characterization of the powerhouse cavern rock
mass (in Portuguese). Technical report, LNEC Report
160/05, Lisboa, Portugal.
7. Barton, N., Lien, R. and Lunde, J. (1974). Engineering
classification of rock masses for the design of tunnel
support. Rock Mechanics, 6: 189-236.
8. Bieniawski, Z. (1989). Engineering rock mass
classifications. John Wiley & Sons.
9. Hoek, E., Carranza-Torres, C. and Corkum, B. (2002).
Hoek-Brown failure criterion - 2002 edition. In Proc. of
the 5th North American Rock Mechanics Symposium,
267-273. Toronto, Canada.
10. Miranda, T. (2007). Geomechanical parameters
evaluation in underground structures. Artificial
intelligence, Bayesian probabilities and inverse
methods. PhD thesis. University of Minho, Guimarães,
Portugal, 291p.
11. Zhang, L. and Einstein, H. (2004). Using RQD to
estimate the deformation modulus of rock masses.
International Journal of Rock Mechanics and Mining
Sciences, 41: 337-341.
12. Mitri, H., Edrissi, R. and Henning, J. (1994). Finite
element modelling of cable bolted stopes in hard rock
underground mines. SME Annual Meeting, 14-17.
Albuquerque.
13. Sonmez, H., Gokceoglu, C. and Ulusay, R. (2004).
Indirect determination of the modulus of deformation
of rock masses based on the GSI system. International
Journal of Rock Mechanics and Mining Sciences, 41:
849-857.
14. Carvalho, J. (2004). Estimation of rock mass modulus.
Personal communication.
15. Serafim, J. and Pereira, J. (1983). Considerations of the
geomechanics classification of Bieniawski. Proc. of the
International Symposium of Eng. Geol. Underground
Construction, II.33-II.42. Lisboa, Portugal.
16. Stephen, R. and Banks, D. (1989). Moduli for
deformation studies of the foundation and abutments of
the Portugues dam, Puerto Rico. Proc. 30th US
Symposium, Morgantown, 31-38.
17. Read, S., Richards, L. and Perrin, N. (1999).
Applicability of the Hoek-Brown failure criterion to
New Zealand greywacke rocks. Proc. 9th Int. Cong. on
Rock Mechanics, 655-660. Paris, France.
18. Nicholson, G. and Bieniawski, Z. (1997). A non-linear
deformation modulus based on rock mass classification.
Int. Journal of Mining & Geology Eng., 181-202.
19. Barton, N. and Quadros, E. (2002). Engineering and
hydraulics in jointed rock masses. EUROCK 2002 -
Course A, Funchal, Portugal.
20. Barton, N., Loset, F., Lien, R. and Lunde, J. (1980).
Application of Q-system in design decisions concerning
dimensions and appropriate support for underground
installations. Subsurface Space, 553-561.
21. Singh, B., Viladkar, M., Samadhiya, N. and Mehrotra,
V. (1997). Rock mass strength parameters mobilised in
tunnels. Tunnelling and Underground Space
Technology, 12: 47-54.
22. Hoek, E. and Diederichs, M. (2006). Empirical
estimation of rock mass modulus. International Journal
of Rock Mechanics and Mining Sciences, 43: 203-215.
23. Miranda, T., Sousa, L. and Correia, A. (2008). Bayesian
framework for the deformability modulus updating in
an underground structure. In Proc. of the 42nd US Rock
Mechanics Symposium, June 29-July 2, San Francisco,
7.
24. Miranda, T., Gomes Correia, A. and Ribeiro e Sousa, L.
(2008). Development of new models for geomechanical
characterization using Data Mining techniques.
Geomechanics and Tunnelling, n. 5, p. 328-334.
25. Sousa, R.L. (2010). Risk analysis for tunneling
projects. MIT, Cambridge, 589p.

