Revista Facultad de Ingeniería Universidad de Antioquia
Title: Explicit pipe friction factor equations: evaluation, classification, and proposal
DOI: 10.17533/udea.redin.20230928
This is the PDF version of an unedited article that has been peer-reviewed and accepted for publication.
It is an early version provided to our customers; the content is the same as the published article, but it
has not undergone the final copy-editing, formatting, typesetting, and other editing done by the publisher
before the final published version. During this editing process, some errors might be discovered that
could affect the content, and all legal disclaimers that apply to this journal pertain.
Universidad Católica Sede Sapientiae, Perú.
KEYWORDS
Colebrook equation, turbulent fluid, relative roughness, Reynolds number
ABSTRACT:
The Colebrook equation is widely used to estimate the friction factor (f) in turbulent flow. Because it is
implicit in f, many explicit equations have been proposed to eliminate its iterative solution. The goal of
this article was to evaluate and classify explicit friction factor equations, and to propose a new one, for
better development of hydraulic projects. Gene Expression Programming (GEP), the Newton-Raphson method,
and Python algorithms were applied. Accuracy assessment and model selection were performed with the
Maximum Relative Error (∆f/f), Percentage Standard Deviation (PSD), Model Selection Criterion (MSC), and
Akaike Information Criterion (AIC). Of the 30 equations evaluated, the Vatankhah equation was the most
accurate and the simplest for obtaining the friction factor, with a classification of very high, reaching
∆f/f < 0.5% and 1.5% < PSD < 1.6%. A new equation was formulated to obtain f explicitly with fast
convergence and accuracy. It was concluded that combining GEP, error theory, and selection criteria
provides a more reliable and robust model.
Revista Facultad de Ingeniería, Universidad de Antioquia, No.xx, pp. x-xx, xxx-xxx 20xx
M. López-Silva et al.; Revista Facultad de Ingeniería, No. xx, pp. x-x, 20xx
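[21].
$$f = \left[-2\log\left(\frac{\varepsilon/D}{3.7} + \frac{4.5}{\mathrm{Re}}\log\frac{\mathrm{Re}}{6.97}\right)\right]^{-2} \tag{2}$$
Limit: $\mathrm{Re} \ge 10^{4}$ and $0 < \varepsilon/D < 5\cdot 10^{-2}$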
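[24].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7D} + \frac{10.04}{R^{*}}\right)\right]^{-2} \tag{3}$$
$$R^{*} = 2\,\mathrm{Re}\left[-\log\left(\frac{\varepsilon/D}{3.7} + \frac{5.45}{\mathrm{Re}^{0.9}}\right)\right]^{-1} \tag{4}$$
where $R^{*}$ is a dimensionless number.
Limit: $\mathrm{Re} \ge 2300$ and $0 < \varepsilon/D < 0.05$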
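Equations (3) and (4) are evaluated in two steps: first the auxiliary number R*, then f. A minimal Python sketch of that sequence follows; the function name `friction_factor_eq3` and the argument names are illustrative and do not come from the article.

```python
import math


def friction_factor_eq3(re: float, rel_rough: float) -> float:
    """Explicit friction factor from Equations (3) and (4).

    re        -- Reynolds number
    rel_rough -- relative roughness, epsilon/D
    """
    # Equation (4): auxiliary dimensionless number R*
    r_star = 2.0 * re / (-math.log10(rel_rough / 3.7 + 5.45 / re**0.9))
    # Equation (3): friction factor
    return (-2.0 * math.log10(rel_rough / 3.7 + 10.04 / r_star)) ** -2


# Example point inside the stated limits (Re >= 2300, 0 < eps/D < 0.05)
f = friction_factor_eq3(1e5, 1e-4)
print(f"f = {f:.5f}")
```

No iteration is needed, which is the point of the explicit form: one logarithm for R* and one for f.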
[37].
$$f = 0.11\left(\frac{\varepsilon}{D} + \frac{68}{\mathrm{Re}}\right)^{0.25} \tag{5}$$
Limit: not specified
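[38].
$$f = \frac{6.4}{\left[\ln(\mathrm{Re}) - \ln\left(1 + 0.01\,\mathrm{Re}\,\dfrac{\varepsilon}{D}\left(1 + 10\sqrt{\dfrac{\varepsilon}{D}}\right)\right)\right]^{2.4}} \tag{6}$$
Limit: $\mathrm{Re} \le 4000$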
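[39].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7D} + \frac{4.518\log\left(\dfrac{\mathrm{Re}}{7}\right)}{\mathrm{Re}\left[1 + \dfrac{1}{29}\,\mathrm{Re}^{0.52}\left(\dfrac{\varepsilon}{D}\right)^{0.7}\right]}\right)\right]^{-2} \tag{7}$$
Limit: $5000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 10^{-2}$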
[14]. Model I
$$f = \left[-2\log\left(10^{-0.4343\beta} + \frac{\varepsilon/D}{3.71}\right)\right]^{-2} \tag{8}$$
Limit: not specified
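[14]. Model II
$$f = \left[-2\log\left(\frac{2.18\beta}{\mathrm{Re}} + \frac{\varepsilon/D}{3.71}\right)\right]^{-2} \tag{9}$$
Where β is:
$$\beta = \ln\frac{\mathrm{Re}}{1.816\ln\left(\dfrac{1.1\,\mathrm{Re}}{\ln\left(1 + 1.1\,\mathrm{Re}\right)}\right)} \tag{10}$$
Limit: not specified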
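Models I and II of [14] share the auxiliary term β from Equation (10), so it is convenient to compute β once and reuse it. The sketch below shows one way to do this in Python; the function names are illustrative.

```python
import math


def beta(re: float) -> float:
    """Auxiliary term of Equation (10)."""
    return math.log(re / (1.816 * math.log(1.1 * re / math.log(1.0 + 1.1 * re))))


def f_model_1(re: float, rel_rough: float) -> float:
    """Equation (8): Model I of [14]."""
    return (-2.0 * math.log10(10.0 ** (-0.4343 * beta(re)) + rel_rough / 3.71)) ** -2


def f_model_2(re: float, rel_rough: float) -> float:
    """Equation (9): Model II of [14]."""
    return (-2.0 * math.log10(2.18 * beta(re) / re + rel_rough / 3.71)) ** -2


re, rel_rough = 1e5, 1e-4
print(f"Model I : f = {f_model_1(re, rel_rough):.5f}")
print(f"Model II: f = {f_model_2(re, rel_rough):.5f}")
```

Since 0.4343 is approximately 1/ln(10), the term 10^(−0.4343β) in Model I is close to e^(−β), which is why the two models differ mainly in how β enters the logarithm.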
[17].
$$f = \left\{0.8686\left[B + \frac{1.038\ln(B + A)}{0.332 + B + A} - \ln(B + A)\right]\right\}^{-2} \tag{11}$$
$$A = \frac{\mathrm{Re}\,(\varepsilon/D)}{8.0878} \tag{12}$$
$$B = \ln\frac{\mathrm{Re}}{2.18} \tag{13}$$
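Equations (11) to (13) use only natural logarithms of the sum A + B, so that sum can be computed once and reused. A short Python sketch under that reading of the equations follows; the function name is illustrative.

```python
import math


def friction_factor_eq11(re: float, rel_rough: float) -> float:
    """Explicit friction factor from Equations (11) to (13)."""
    a = re * rel_rough / 8.0878          # Equation (12)
    b = math.log(re / 2.18)              # Equation (13)
    x = a + b
    c = math.log(x)                      # ln(B + A), used twice in Equation (11)
    inv_sqrt_f = 0.8686 * (b + 1.038 * c / (0.332 + x) - c)
    return inv_sqrt_f ** -2


print(f"f = {friction_factor_eq11(1e5, 1e-4):.5f}")
```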
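[40].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7065D} - \frac{5.0452}{\mathrm{Re}}\log\left(\frac{(\varepsilon/D)^{1.1098}}{2.8257} + 5.8506\,\mathrm{Re}^{-0.8981}\right)\right)\right]^{-2} \tag{14}$$
Limit: $4000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$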
[41].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.71D} + \frac{7}{\mathrm{Re}^{0.9}}\right)\right]^{-2} \tag{15}$$
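[42].
$$f = 8\left[\left(\frac{8}{\mathrm{Re}}\right)^{12} + \left(\left\{2.457\ln\left[\frac{1}{\left(\dfrac{7}{\mathrm{Re}}\right)^{0.9} + 0.27\,\dfrac{\varepsilon}{D}}\right]\right\}^{16} + \left(\frac{37530}{\mathrm{Re}}\right)^{16}\right)^{-3/2}\right]^{1/12} \tag{16}$$
Limit: $4000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$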
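Equation (16) nests several powers, and it is easy to misplace one of the exponents when coding it. The following Python sketch splits the expression into its three terms; the intermediate names `laminar_term`, `a` and `b` are illustrative labels, not notation from the article.

```python
import math


def friction_factor_eq16(re: float, rel_rough: float) -> float:
    """Explicit friction factor from Equation (16)."""
    laminar_term = (8.0 / re) ** 12
    # First bracket raised to the 16th power
    a = (2.457 * math.log(1.0 / ((7.0 / re) ** 0.9 + 0.27 * rel_rough))) ** 16
    # Second bracket raised to the 16th power
    b = (37530.0 / re) ** 16
    return 8.0 * (laminar_term + (a + b) ** -1.5) ** (1.0 / 12.0)


print(f"f = {friction_factor_eq16(1e5, 1e-4):.5f}")
```

At large Re the terms (8/Re)^12 and (37530/Re)^16 become negligible, and the expression reduces to 8 divided by the square of the first bracket.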
[43].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.715D} + \frac{15}{\mathrm{Re}}\right)\right]^{-2} \tag{17}$$
Limit: not specified
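[13].
$$f = \frac{0.2479 - 0.0000947\,(7 - \log \mathrm{Re})^{4}}{\left[\log\left(\dfrac{\varepsilon}{3.615D} + \dfrac{7.366}{\mathrm{Re}^{0.9142}}\right)\right]^{2}} \tag{18}$$
Limit: not specified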
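[23].
$$f = 1.613\left[\ln\left(0.234\left(\frac{\varepsilon}{D}\right)^{1.1007} - \frac{60.525}{\mathrm{Re}^{1.1105}} + \frac{56.291}{\mathrm{Re}^{1.0712}}\right)\right]^{-2} \tag{19}$$
Limit: $3000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$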
[44].
$$f = \left[-1.52\log\left(\left(\frac{\varepsilon}{7.21D}\right)^{1.042} + \left(\frac{2.731}{\mathrm{Re}}\right)^{0.9152}\right)\right]^{-2.169} \tag{20}$$
Limit: $2100 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$
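[45].
$$f = \left[-1.8\log\left(\left(\frac{\varepsilon}{3.7D}\right)^{1.11} + \frac{6.9}{\mathrm{Re}}\right)\right]^{-2} \tag{21}$$
Limit: $400 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$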
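[46].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7D} + \frac{95}{\mathrm{Re}^{0.983}} - \frac{96.82}{\mathrm{Re}}\right)\right]^{-2} \tag{22}$$
Limit: $5235 < \mathrm{Re} < 10^{9}$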
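[47].
$$f = \left[-1.8\log\left(\frac{7.35 - 1200\,(\varepsilon/D)^{1.25}}{\mathrm{Re}} + \left(\frac{\varepsilon}{3.15D}\right)^{1.115}\right)\right]^{-2} \tag{23}$$
Limit: $4000 \le \mathrm{Re} \le 35.5\cdot 10^{6}$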
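[12].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7065D} - \frac{5.0272}{\mathrm{Re}}\log\left(\frac{\varepsilon}{3.827D} - \frac{4.657}{\mathrm{Re}}\log\left(\left(\frac{\varepsilon}{7.7918D}\right)^{0.9924} + \left(\frac{5.3326}{208.815 + \mathrm{Re}}\right)^{0.9345}\right)\right)\right)\right]^{-2} \tag{24}$$
Limit: $\mathrm{Re} > 4000$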
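[9].
$$f = 0.0055\left[1 + \left(2\cdot 10^{4}\,\frac{\varepsilon}{D} + \frac{10^{6}}{\mathrm{Re}}\right)^{1/3}\right] \tag{25}$$
Limit: $4000 < \mathrm{Re} < 10^{8}$ and $0 < \varepsilon/D < 10^{-2}$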
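[31].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.71D} - \frac{1.975}{\mathrm{Re}}\ln\left(\left(\frac{\varepsilon}{3.93D}\right)^{1.092} + \frac{7.627}{395.9 + \mathrm{Re}}\right)\right)\right]^{-2} \tag{26}$$
Limit: not specified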
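[48].
$$f = \left[-2\log\left(\frac{\varepsilon}{7D} + \left(\frac{6.81}{\mathrm{Re}}\right)^{0.9}\right)\right]^{-2} \tag{27}$$
Limit: $4000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 10^{-2}$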
[49].
$$f = \left[-1.8\log\left(0.27\,\frac{\varepsilon}{D} + \frac{6.5}{\mathrm{Re}}\right)\right]^{-2} \tag{28}$$
Limit: $4000 < \mathrm{Re} < 10^{7}$ and $10^{-6} < \varepsilon/D < 10^{-2}$
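[50].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7D} - \frac{5.02}{\mathrm{Re}}\log\left(\frac{\varepsilon}{3.7D} + \frac{14.5}{\mathrm{Re}}\right)\right)\right]^{-2} \tag{29}$$
Limit: $4000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$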
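[51].
$$f = \left[-2\log\left(\frac{\varepsilon}{3.715D} + \left(\frac{6.943}{\mathrm{Re}}\right)^{0.9}\right)\right]^{-2} \tag{30}$$
Limit: $5000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$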
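[10].
$$f = 0.25\left[\log\left(\frac{\varepsilon}{3.7D} + \frac{5.74}{\mathrm{Re}^{0.9}}\right)\right]^{-2} \tag{31}$$
Limit: $5000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 5\cdot 10^{-2}$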
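[16].
$$f = \left[0.8686\ln\left(\frac{0.3984\,\mathrm{Re}}{(0.8686\,S)^{\frac{S - 0.645}{S + 0.39}}}\right)\right]^{-2} \tag{32}$$
Where S is:
$$S = 0.12363\,\mathrm{Re}\,(\varepsilon/D) + \ln(0.3984\,\mathrm{Re}) \tag{33}$$
Limit: not specified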
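An explicit equation such as Equation (32) can be checked against an iterative Colebrook solution to obtain its relative error ∆f/f. The Python sketch below does this with Newton-Raphson. It assumes the Colebrook equation in its usual form, 1/√f = −2 log(ε/(3.7D) + 2.51/(Re√f)), which is not reproduced in this section, and the function names, starting guess and sample points are illustrative choices, not the article's evaluation grid.

```python
import math


def colebrook_newton(re: float, rel_rough: float,
                     tol: float = 1e-12, max_iter: int = 50) -> float:
    """Solve the Colebrook equation with Newton-Raphson on x = 1/sqrt(f).

    Assumes the standard constants 3.7 and 2.51 (not taken from this section).
    """
    a = rel_rough / 3.7
    b = 2.51 / re
    x = 7.0  # starting guess, equivalent to f of about 0.02
    for _ in range(max_iter):
        g = x + 2.0 * math.log10(a + b * x)                    # residual
        dg = 1.0 + 2.0 * b / ((a + b * x) * math.log(10.0))     # derivative
        step = g / dg
        x -= step
        if abs(step) < tol:
            break
    return 1.0 / x**2


def friction_factor_eq32(re: float, rel_rough: float) -> float:
    """Explicit friction factor from Equations (32) and (33)."""
    s = 0.12363 * re * rel_rough + math.log(0.3984 * re)       # Equation (33)
    exponent = (s - 0.645) / (s + 0.39)
    return (0.8686 * math.log(0.3984 * re / (0.8686 * s) ** exponent)) ** -2


def relative_error(re: float, rel_rough: float) -> float:
    """Relative error of Equation (32) against the iterative solution."""
    f_ref = colebrook_newton(re, rel_rough)
    return abs(friction_factor_eq32(re, rel_rough) - f_ref) / f_ref


# Illustrative sample points only
for re, rr in [(1e4, 1e-2), (1e5, 1e-4)]:
    print(f"Re={re:.0e}, eps/D={rr:.0e}: rel. error = {100 * relative_error(re, rr):.4f}%")
```

Taking the maximum of `relative_error` over a grid of Re and ε/D values gives the Maximum Relative Error used to group the equations.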
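[52].
$$f = 0.094\left(\frac{\varepsilon}{D}\right)^{0.225} + 0.53\left(\frac{\varepsilon}{D}\right) + 88\left(\frac{\varepsilon}{D}\right)^{0.44}\mathrm{Re}^{-1.62\,(\varepsilon/D)^{0.134}} \tag{34}$$
Limit: $\mathrm{Re} > 4000$ and $10^{-5} < \varepsilon/D < 5\cdot 10^{-2}$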
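[11] Model I.
$$f = \left[-2\log\left(\frac{\varepsilon}{3.7D} - \frac{5.02}{\mathrm{Re}}\log\left(\frac{\varepsilon}{3.7D} - \frac{5.02}{\mathrm{Re}}\log\left(\frac{\varepsilon}{3.7D} + \frac{13}{\mathrm{Re}}\right)\right)\right)\right]^{-2} \tag{35}$$
Limit: $4000 < \mathrm{Re} < 10^{8}$ and $10^{-5} < \varepsilon/D < 5\cdot 10^{-2}$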
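Equation (35) nests three logarithms, and evaluating it from the innermost term outwards keeps the signs straight. A minimal Python sketch follows; the variable names are illustrative.

```python
import math


def friction_factor_eq35(re: float, rel_rough: float) -> float:
    """Explicit friction factor from Equation (35), Model I of [11]."""
    a = rel_rough / 3.7
    inner = math.log10(a + 13.0 / re)            # innermost logarithm
    middle = math.log10(a - 5.02 / re * inner)   # second level
    outer = math.log10(a - 5.02 / re * middle)   # outermost logarithm
    return (-2.0 * outer) ** -2


print(f"f = {friction_factor_eq35(1e5, 1e-4):.5f}")
```

Each level has the same shape as the previous one, which is why this equation carries more parameters than single-logarithm forms such as Equation (28).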
Group III was classified as having a lower approximation to the Colebrook equation, with a Maximum Relative Error of 2.587% ≤ ∆f/f ≤ 8.303%; it comprises the equations by [21], [46], [14] (model II), [44], [38], [51], [10], and [14] (model I). However, according to [20], the equation by [21] was the most accurate in their research. A possible cause is that [20] used only 2397 experimental points, with $3000 \le \mathrm{Re} \le 735\cdot 10^{3}$ and $0 < \varepsilon/D < 1.4\cdot 10^{-3}$. Group IV had to be rejected because its equations exceeded ∆f/f > 10%, namely [42], [49], [9], [48], [47], [52], and [37]. In particular, the equation proposed by [9] provided significant results for solving problems at the time, but new and more accurate formulations have since been developed.

These results agree with those obtained by [22], who evaluated 33 equations over a range of the Moody diagram with $2300 \le \mathrm{Re} \le 10^{8}$ and $0 < \varepsilon/D < 5\cdot 10^{-2}$, and for whom the error of the equation proposed by [9] was high, exceeding 10%. They also agree with the results by [29] on mathematical models analyzed using Machine Learning tools, in which [9] and [42] were the most unfavorable equations.

Regarding Figure 1 b) and the Percentage Standard Deviation (PSD), the 30 equations analyzed generally presented a deviation of 1.2% < PSD < 2%, and 81% of the equations had a stable standard deviation between 1.5% and 1.6%. Three approximation equations had the lowest standard deviation, [48], [9], and [37], but they were rejected because of their high relative error.

The 30 equations analyzed in this article show two tendencies: equations with a high number of parameters tend to be more accurate, and equations with the fewest parameters tend to be less accurate. On the other hand, the engineer needs the easiest and most accurate equation for friction factor calculation, according to [24]. In summary, as a result of the increasing digitization of work, educational, and economic environments, the equations must be formulated with the highest precision and best computational performance.

For this reason, the MSC and AIC model selection criteria were implemented through a ranking, because they treat the number of parameters, including the constants in the equations (p), as a decisive variable.

Based on the accuracies of the models, a preliminary model ranking (Rk) was proposed for each evaluation criterion (p, ∆f/f, PSD, MSC, and AIC) and, finally, a Global Ranking. Table 1 shows the results of the models. The error theory and the theoretical functions give different rank orders for each equation, with a discrepancy in optimal model selection. Equation 5, proposed by [37], is the simplest and has the fewest steps to obtain the friction factor. Nevertheless, it was rejected in the previous analysis because of its high relative error, which places it at position 30. Meanwhile, Equation 11 by [17] is classified as the most complex to solve because of the number of steps and parameters it includes. However, it was classified in group I, with a relative error of less than 0.5% and an acceptable deviation of less than 1.6%, with a ranking of 8.

In this sense, MSC and AIC contributed to the selection of the best model. However, in both cases, they present discrepancies with respect to the function of greater likelihood and entropy. By the MSC value, the equation by [49] occupies rank 1, while model I by [11] occupies rank 30. The AIC gave the inverse order: the equation by [49] reached rank 30 and model I by [11] rank 1. On the other hand, in contrast to the previous equations, the number of parameters in model I by [11] is 47% higher than in the equation by [49]. Consequently, it can be pointed out that the AIC criterion does not follow the parsimony principle, under which the AIC tends to be smaller when the number of parameters is smaller.

In summary, the AIC criterion tends to improve as the number of parameters increases, which contradicts the theory on which the criterion was defined. In finite samples, the AIC value is only approximate [33]. Therefore, difficulties could arise regarding the validity and applicability of the method for this purpose. Additionally, the MSC criterion also showed inconsistencies between the models due to the number of parameters; however, this coincides with the results of the AIC criterion. This trend corresponds with the results obtained by [36].

The global ranking obtained in Table 1 integrates the positions of the most accurate and inaccurate approximation models with their degrees of complexity. The explicit Equation 32, proposed by [16], leads the Global Ranking in first position as the most accurate, followed in second place by Equations 29, 26, and 22 by [50], [31], and [46]. The least accurate and most complex to solve are Equations 34, 23, and 28 by [52], [47], and [49], which in turn belong to the rejected group IV.
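The dependence of MSC and AIC on the number of parameters p discussed above can be illustrated with a short Python sketch. It assumes the least-squares forms commonly paired in the literature, MSC = ln(SST/SSE) − 2p/N and AIC = N ln(SSE) + 2p; these are assumptions and may differ from the definitions given in the article's methods section. The data in the example are made up for illustration.

```python
import math


def sse(observed, predicted):
    """Sum of squared errors between reference and approximated values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def msc(observed, predicted, n_params):
    """Model Selection Criterion (assumed form); larger is better."""
    n = len(observed)
    mean = sum(observed) / n
    sst = sum((o - mean) ** 2 for o in observed)
    return math.log(sst / sse(observed, predicted)) - 2.0 * n_params / n


def aic(observed, predicted, n_params):
    """Akaike Information Criterion (assumed least-squares form); smaller is better."""
    n = len(observed)
    return n * math.log(sse(observed, predicted)) + 2.0 * n_params


# Made-up values, only to show how p enters both criteria
reference = [1.0, 2.0, 3.0, 4.0]
approx = [1.1, 1.9, 3.2, 3.8]
for p in (2, 4):
    print(f"p={p}: MSC={msc(reference, approx, p):.4f}, AIC={aic(reference, approx, p):.4f}")
```

With the fit held fixed, each extra parameter lowers MSC by 2/N and raises AIC by 2, so under these forms a model with more parameters can only rank better if its SSE is sufficiently smaller.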
Table 1

Author    Eq. No.   p    Rk ∆f/f   Rk PSD   Rk MSC   Rk AIC   ∑Rk (Total)   GR (Global)
[21]      2         11   14        20       17       14       76            14
[24]      3         17   6         14       28       3        68            7
[37]      5         6    30        3        3        28       70            9
[38]      6         14   18        25       10       21       88            20
[39]      7         19   9         13       23       8        72            11
[14] I    8         18   21        18       11       20       88            20
[14] II   9         19   16        22       15       15       87            19
[17]      11        39   5         8        25       6        83            18
[40]      14        16   7         11       24       7        65            5
[41]      15        9    23        27       8        23       90            22
[42]      16        19   24        5        5        27       80            17
[43]      17        8    22        28       9        22       89            21
[13]      18        14   10        16       21       10       71            10
[23]      19        13   8         17       22       9        69            8
[44]      20        10   17        4        12       19       62            3
[45]      21        10   13        19       18       13       73            13
[46]      22        10   15        2        16       16       59            2
[47]      23        12   28        29       6        25       100           24
[12]      24        21   1         12       29       2        65            5
[9]       25        8    26        2        4        26       66            6
[31]      26        15   3         10       27       4        59            2
[48]      27        7    27        1        2        29       66            6
[49]      28        8    25        30       1        30       94            23
[50]      29        11   11        6        20       11       59            2
[51]      30        9    19        23       14       17       82            15
[10]      31        8    20        24       13       18       83            16
[16]      32        13   4         9        26       5        57            1
[52]      34        16   29        26       7        24       102           25
[11] I    35        17   2         15       30       1        65            5
[11] II   36        14   12        7        19       12       64            4
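In the rows of Table 1, the Total column appears to equal the number of parameters p plus the four criterion ranks (for [21]: 11 + 14 + 20 + 17 + 14 = 76). The Python sketch below checks this on four rows copied from the table and orders them by total; the dictionary layout is an illustrative choice.

```python
# (p, Rk for the relative error, Rk PSD, Rk MSC, Rk AIC), copied from Table 1
rows = {
    "[21]": (11, 14, 20, 17, 14),
    "[16]": (13, 4, 9, 26, 5),
    "[49]": (8, 25, 30, 1, 30),
    "[11] I": (17, 2, 15, 30, 1),
}

# Total = p plus the four ranks
totals = {ref: sum(values) for ref, values in rows.items()}

# A lower total corresponds to a better position in the Global Ranking
ordered = sorted(totals, key=totals.get)

for ref in ordered:
    print(f"{ref:7s} total = {totals[ref]}")
```

Because p enters the total directly rather than as a rank, an equation with many parameters such as [17] (p = 39) is penalized more heavily than its accuracy ranks alone would suggest.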
Consequently, an easier classification has been established, according to the level of precision and simplicity, for the first five global rankings. It ranges from a very high level, which indicates excellent precision and simplicity, to a very low level, which is interpreted as an inaccurate equation that is complex to solve due to the number of operations and parameters present.

Table 2 Model classification

GR   Authors   Precision   Simplicity
1    [16]      Very high   Very high
2    [50]      Very high   High
2    [31]      Very high   High
3    [46]      Medium      Medium
3    [44]      Medium      Medium
4    [11] II   High        Low
5    [11] I    Very high   Very low
5    [12]      Very high   Very low
5    [40]      Very high   Very low

As a new proposal for explicit friction factor approximation equations, 64 models were analyzed in Gene Expression Programming (GEP). The theoretical and experimental databases were used to train the GEP algorithm. Twenty percent of the data was reserved for validation and the rest for calibration. Only the most efficient results according to the performance criteria, GEP1, GEP2, GEP3, and GEP4, are reflected in Table 3.

Table 3 shows that the most significant models had Linking Functions + and *, a Number of Chromosomes of 30, a Head Size of 8, and a Number of Genes of 2 and 6. The best-performing model was GEP1, with the lowest number of functions (4) and 7 parameters including constants. Its Root Mean Square Error (RMSE) was 0.078%, the Mean Absolute Error (MAE) was 0.055%, the Pearson correlation coefficient (R) was 0.99873, the ∆f/f was 6.22%, and the PSD was 1.86%.

In contrast to the groups made in Figure 1 according to the maximum relative error, GEP1 was classified in group III because it was within the interval 2.5% ≤ ∆f/f ≤ 8.3%, making it an alternative for obtaining the friction factor quickly and easily.

Although GEP4 has the highest R and lower ∆f/f and PSD, it is shown to be more significant for having a greater number of functions, according to [24]. In addition, the GEP4 model has a greater number of operations for its solution, making it less simple. Regarding the increase of functions, the Number of Chromosomes, Head Size, and Number of Genes showed a partial relationship to the results obtained by [51] that the GEP models increase with increasing functions.

Equation 40 is proposed as a new nonlinear model to determine the explicit friction factor with the lowest error, without logarithmic functions, and with fast calculation and accurate approximation in the turbulent flow regime. Limit: $4000 < \mathrm{Re} < 10^{8}$ and $10^{-6} < \varepsilon/D < 10^{-2}$.