QSPR Studies of Carbonyl Hydroxyl Polyene Indices


H. MAOUZ et al.: QSPR Studies of Carbonyl, Hydroxyl, Polyene Indices, and Viscosity Average..., Kem. Ind.

69 (1-2) (2020) 1−16 1


QSPR Studies of Carbonyl, Hydroxyl, Polyene Indices, and Viscosity Average Molecular Weight of Polymers under Photostabilization Using ANN and MLR Approaches

H. Maouz,a* L. Khaouane,a S. Hanini,a Y. Ammi,a,b M. Hamadache,a and M. Laidi a

a Laboratory of Biomaterials and Transport Phenomena (LBMPT), University of Médéa, Quartier Aïn d’Heb, 26000, Algeria
b University Center, Faculty of Science and Technology, Department of Process Engineering, Relizane, Algeria

https://doi.org/10.15255/KUI.2019.022
KUI-1/2020
Original scientific paper
Received June 17, 2019
Accepted September 7, 2019

This work is licensed under a Creative Commons Attribution 4.0 International License

Abstract
One of the main disadvantages of the use of synthetic or semi-synthetic polymeric materials is their degradation and aging.
The purpose of this study was to use artificial neural networks (ANN) and multiple linear regressions (MLR) to predict the
carbonyl, hydroxyl, and polyene indices (ICO, IOH, and IPO), and viscosity average molecular weight (Mv) of poly(vinyl chloride),
polystyrene, and poly(methyl methacrylate). These physicochemical properties are considered fundamental in the study
of the photostabilization of polymers. Each polymer structure was represented by five repeating monomer units.
Quantitative structure-property relationship (QSPR) models obtained by using relevant descriptors showed good predictability.
Internal validation {R2, RMSE, and Q2LOO}, external validation {R2, RMSE, Q2pred, rm2, Δrm2, k, and k’}, and the applicability domain
were used to validate these models. The comparison of the results shows that the ANN models perform better than
the MLR models. Accordingly, the QSPR model developed in this study provides excellent predictions, and can be used to
predict ICO, IOH, IPO, and Mv of polymers, particularly for those that have not been tested.

Keywords
QSPR, photostabilization, polymers, artificial neural network, multiple linear regressions

1 Introduction

Synthetic polymers are among the most widely produced materials and are used in various fields, such as construction, electronics, chemical engineering, packaging, and transportation, due to their excellent chemical and physical properties.1–3 Polyvinyl chloride (PVC), polystyrene (PS), and polymethyl methacrylate (PMMA) are some of the most important industrial-scale polymers. The low cost of production, the wide range of use, and the good performance of PVC, PS, and PMMA products have generated interest in these polymers.4,5

However, one of the main disadvantages of the use of synthetic or semi-synthetic polymeric materials is their degradation and aging. Thus, PVC, PS, and PMMA undergo photodegradation when exposed to harsh environments, such as high temperatures, sunlight, fungi, bacteria, yeasts, algae and their enzymes.6 The consequences of this degradation depend on the nature of the polymer and can include scission of the polymer chain, rapid yellowing, loss of gloss, and crosslinking, accompanied by changes in the physical and chemical properties of the polymer.7,8 Finally, we end up with useless materials after an unpredictable duration.9

Research on the stabilization of polymers against harmful environmental effects is extremely important. Such research aims to reduce or prevent all kinds of damage to polymers. These polymers are generally protected against such deterioration by the addition of antioxidants and of stabilizers against light and heat.10 Among the stabilizing systems developed are free radical scavengers that have proven effective.11 It should be noted that photostabilization and aging of polymers are a complex problem to study in practice, as they usually take place slowly, and the service life of polymers generally reaches several decades.12–16 Several studies on photostabilization of polymers are reported in the literature.17–22 The photostabilization activities of polymer compounds were determined by monitoring the carbonyl, polyene, and hydroxyl indices, as well as the variations in the viscosity average molecular weight with the duration of exposure. Carbonyl, polyene, and hydroxyl groups are used to measure the amount of polymer degradation over time during ultraviolet irradiation in the presence of oxygen. Growth of carbonyl groups indicates the extent of polymer degradation. However, the experimental determination of these indices is costly.

The photostability properties of polymers depend on the molecular structures of the polymers during photodegradation. In addition, the use of in silico predictive methods, based on computer tools, offers a fast and cost-effective alternative to experimentation. These methods include the Quantitative Structure-Property Relationship (QSPR) models. This strategy consists of modelling the properties of the material as functions of the molecular structure using QSPR.23 The objective of QSPR is to develop mathematical equations capable of establishing the relationships between property and descriptors derived solely from the molecular structure of the polymer. Once a correlation is established and validated, it may be applicable to predicting the property of a new polymer or to discovering new materials with the desired properties.24

A few QSPR models have been successfully applied to the correlation of many physicochemical properties of various polymers. For example, Xu et al.25 developed a QSPR model to predict refractive indices of linear polymers by applying four molecular descriptors. Compared with the existing QSPR models, the proposed model requires only four descriptors, which can be obtained by simple calculation, making it easier to predict the refractive indices of polymers. Another QSPR study was carried out by Xu et al.26 The QSPR model related topological indices representing the molecular structures to the lower critical solution temperature (LCST) with a database of 169 data points. A satisfactory mean relative error (MRE) was obtained, and the authors concluded that the model would be very useful in obtaining reliable estimates of LCST in polymer solutions. Liu et al.27 developed a QSPR model for structural units of 35 polymethacrylates. The relationships between the quantum chemical descriptors and properties, such as molar volumes at room temperature, refractive indices, and glass transition temperature, were obtained by stepwise regression and the artificial neural network (ANN) method. The results calculated by the ANN method agreed with the experimental data better than those by the stepwise method. A QSPR to predict the intrinsic viscosity of polymer solutions was developed by Gharagheizi.28 With five relevant descriptors and 65 polymer solutions, a radial basis function neural network (RBFNN) with squared correlation coefficient R2 = 0.9100 was constructed. Recently, a quantitative structure-property relationship for the thermal decomposition of polymers was suggested by Toropova et al.29 The data on the architecture of monomers were used to represent polymers. The average statistical quality of the suggested QSPR for prediction of molar thermal decomposition is the following: RMSE = 4.71 ± 0.1 and R2 = 0.97 ± 0.01. More recently, Duchowicz et al.30 developed a predictive QSPR for the refractive indices of 234 structurally diverse polymers. The established equations were validated and tested through various well-known techniques, such as the use of an external test set of compounds, the cross-validation method, Y-randomization, and the applicability domain. They concluded that the developed QSPR could be useful in assisting the development of new polymeric materials.

Unfortunately, according to the literature search, no QSPR study has been devoted to the prediction of the following parameters: carbonyl index, hydroxyl index, polyene index, and viscosity average molecular weight. Therefore, given the importance of these parameters in photostabilization studies of polymers, the objective of this study was to develop two models based on the relationship between the chemical structure of polymers (PVC, PS, and PMMA) and each of the four parameters, using a linear approach (Multiple Linear Regression: MLR) and a non-linear approach (Artificial Neural Network: ANN).

* Corresponding author: Hadjira Maouz, PhD
Email: [email protected]

2 Materials and methods

The general methodology adopted for this study consisted of several basic steps to generate valid QSPR models. This methodology is indicated in the diagram of Fig. 1.

Fig. 1 – Basic steps for generating a QSPR model in this study
(flowchart; the recoverable content is summarized below)
• Database and QSPR descriptor calculation by PaDEL-Descriptor software
• Inputs (Ei): E1 = descriptors (Di), E2 = duration (Pi), E1 + E2 = Di + Pi
• Filtration and pre-treatment, giving the filtered inputs (FEi: FDi, FPi, FDi + FPi) and the relevant filtered inputs (RFEi: RFDi, RFPi, RFDi + RFPi):
  – any descriptor that had an identical value for > 75 % of the samples was removed
  – any descriptor with a relative standard deviation < 0.05 was removed
  – descriptors with intercorrelation > 0.75 were removed
• Parameters (structure) of the ANNs: number of hidden layers (1 : 3); number of neurons in the hidden layers (3 : 30); activation function (tansig, logsig, exponential, purelin); training algorithm (BFGS); number of iterations (1000); splitting of each data set into two subsets (training and test set)
• Outputs (Sj): chemico-physical properties (QSPR), ANN estimate
• Test ε1: corresponds to the test of (R → 1, E → 0, and Q2 → 1)
• Test ε2: corresponds to the test of (IR > 6 %), “weight” method
• Test ε3: corresponds to the test of the applicability domain (Williams plot)
• First pass loop with test ε1: all the descriptors tested (DiFR + PiFR, DiF + PiF) are retained
• Second pass loop with tests ε1 + ε2: only the inputs (DiFR + PiFR, DiF + PiF) > 6 % are retained, those < 6 % are removed, and the ANN parameters are modified and adjusted
• Comparison with multiple linear regression (MLR), then performance (decision)

2.1 Datasets

It is well known that high-quality experimental data are essential for the development of high-quality QSPR models.31 The polymers in the database included two families: 3 polymers (PVC, PS, and PMMA), and 20 polymer mixtures (polymers with stabilizers). The experimental values of carbonyl index (111 polymers), hydroxyl index (141 polymers), polyene index (81 polymers), and viscosity average molecular weight (118 polymers) were collected from the literature.17,32–35 To develop an ANN model, the polymer database was divided into two sets, a training set and a test set, consisting of 77 % and 23 % of the polymers for ICO, 60 % and 40 % for IOH, and 80 % and 20 % for IPO and Mv, respectively. The collected photostabilization values used in this work were measured under accelerated testing conditions: Accelerated weatherometer Q.U.V. tester (Q. panel, company, USA).

2.2 Molecular descriptors

The direct calculation of a molecular descriptor for the entire structure is not possible due to the high molecular weight of the polymeric compound. To overcome this drawback, a small number of repeating units (U) was used.36 In addition, the molecular descriptors calculated from repeating unit structures end-capped with two hydrogen atoms can be used in QSPR studies of polymer systems.25,37 In this study, we chose polymer units to properly represent the interaction between polymers and stabilizers during photodegradation. The chosen polymer units (UUUUU) are based on existing models in the literature.17,33 The list of polymer units used in the development of the QSPR model is given in Table 1.

Table 1 – List of structures of the unit polymers (UUUUU) used in the development of QSPR models
(structure drawings not recoverable; entries listed by composition and source)
PVC (100 % wt/wt)14,33,34
PVC with 0.5 % (wt/wt) additive: four structures,33 five structures,14 three structures34
PS (100 % wt/wt)32
PS with 0.5 % (wt/wt) additive: four structures32
PMMA (100 % wt/wt)35
PMMA with 0.5 % (wt/wt) additive: four structures35

One important step in obtaining a QSAR (Quantitative Structure-Activity Relationship) or QSPR model is the numerical representation of the structural features of molecules, known as molecular descriptors. Currently, there are thousands of molecular descriptors in the literature that can be used to solve different problems in different specialties.38 All descriptors were obtained through the online program PaDEL-Descriptor (http://www.scbdd.com/padel_desc/index/). PaDEL-Descriptor is one of the most widely used programs in QSPR and QSAR studies.39 In the specific case of this study, for each polymer, 1875 molecular descriptors were calculated, belonging to the following classes: Autocorrelation descriptors (346), Basak descriptors (42), BCUT descriptors (6), Burden descriptors (96), Connectivity descriptors (56), Constitutional descriptors (120), E-state descriptors (489), Kappa descriptors (3), Molecular property descriptors (15), Quantum chemical descriptors (6), Topological descriptors (265), CPSA descriptors (29), RDF descriptors (210), Geometrical descriptors (21), WHIM descriptors (91), and 3D Autocorrelation descriptors (80).

Simplified Molecular Input-Line Entry System (SMILES) notations of polymers were obtained from the ChemBio Ultra Software.

2.3 Selection of relevant descriptors

The purpose of pre-processing the database is to eliminate the irrelevant descriptors in order to avoid the phenomenon of over-fitting. Therefore, we must reduce the variables (descriptors) that do not have or have little influence on

the outputs of the network (carbonyl index, hydroxyl index, polyene index, and viscosity average molecular weight). Several methods to simplify the database are available in the literature; for example, principal component analysis (PCA), curvilinear component analysis, or the Gram–Schmidt orthogonalization method. In the present study, the method used to select the most significant descriptors was described by some authors.40–42 It takes place in several stages:
(1) The minimum and maximum are calculated for each descriptor using STATISTICA software, and the descriptors whose maximum and minimum are equal are removed.
(2) Any descriptor that has the same value for more than 75 % of the samples is eliminated.
(3) The standard deviation (SD) is calculated for each descriptor, and those with SD values less than 0.05 are eliminated.
(4) In this stage, we used “Matlab” software; a matrix is obtained which represents the correlation between the outputs and the descriptors retained. The descriptors are ranked in decreasing order of the correlation coefficient. The descriptor with the highest correlation is taken and compared to the other descriptors in the matrix. Those whose correlation coefficient with it is greater than 0.75 are eliminated in their turn. The same is repeated with the descriptor ranked just after the first, and so on. The number of descriptors obtained after the selection was 107 for ICO, 102 for IOH, 67 for IPO, and 107 for Mv.
(5) A program based on the stepwise method is used to select the most relevant descriptors from those obtained previously.
Finally, the number of descriptors (Table 2) obtained after stepwise selection was: 5 for ICO, 5 for IOH, 3 for IPO, and 4 for Mv. The relevant descriptors as well as the duration of exposure (in hours) were used to develop the QSPR prediction model.

Table 2 – List of descriptors obtained after stepwise selection and duration of exposure

Predicted parameter: Carbonyl index (ICO)
Descriptor | Description | Category | VIF | t-Test
MATS2m | Moran autocorrelation – lag 2 / weighted by mass | Autocorrelation descriptors | 1.37 | –11.96
minHBd | Minimum E-States for (strong) Hydrogen Bond donors | E-state descriptors | 1.20 | –5.62
MATS2s | Moran autocorrelation – lag 2 / weighted by I-state | Autocorrelation descriptors | 1.44 | –4.72
VE3_Dzm | Logarithmic coefficient sum of the last eigenvector from Barysz matrix / weighted by mass | Topological descriptors | 1.34 | 5.06
MATS7i | Moran autocorrelation – lag 7 / weighted by first ionization potential | Autocorrelation descriptors | 1.44 | 3.65
t(h) | duration of exposure | duration of exposure | 1.01 | 19.97

Predicted parameter: Hydroxyl index (IOH)
Descriptor | Description | Category | VIF | t-Test
TDB5i | 3D topological distance based autocorrelation – lag 5 / weighted by first ionization potential | 3D Autocorrelation descriptors | 1.38 | –12.32
geomShape | Petitjean geometric shape index | Geometrical descriptors | 1.41 | –9.69
minHBd | Minimum E-States for (strong) Hydrogen Bond donors | E-state descriptors | 1.04 | –5.82
ATSC7m | Centered Broto-Moreau autocorrelation – lag 7 / weighted by mass | Autocorrelation descriptors | 1.53 | 5.18
P2i | 2nd component shape directional WHIM index / weighted by relative first ionization potential | WHIM descriptors | 1.28 | –3.85
t(h) | duration of exposure | duration of exposure | 1.01 | 17.9

Predicted parameter: Polyene index (IPO)
Descriptor | Description | Category | VIF | t-Test
SpMin2_Bhv | Smallest absolute eigenvalue of Burden modified matrix – n 2 / weighted by relative van der Waals volumes | Burden descriptors | 2.13 | –9.33
VE3_Dzi | Logarithmic coefficient sum of the last eigenvector from Barysz matrix / weighted by first ionization potential | Topological descriptors | 1.11 | 6.70
VR3_Dzv | Logarithmic Randic-like eigenvector-based index from Barysz matrix / weighted by van der Waals volumes | Topological descriptors | 2.08 | –7.01
t(h) | duration of exposure | duration of exposure | 1.02 | 17.03

Predicted parameter: Viscosity average molecular weight (Mv)
Descriptor | Description | Category | VIF | t-Test
RDF25p | Radial distribution function – 025 / weighted by relative polarizabilities | RDF descriptors 3D | 1.12 | 21.56
TDB4i | 3D topological distance based autocorrelation – lag 4 / weighted by first ionization potential | Autocorrelation descriptors | 1.03 | –4.99
minHBint8 | Minimum E-State descriptors of strength for potential Hydrogen Bonds of path length 8 | E-state descriptors | 1.12 | 4.71
ATSC3c | Centred Broto-Moreau autocorrelation – lag 3 / weighted by charges | Autocorrelation descriptors | 1.01 | 2.67
t(h) | duration of exposure | duration of exposure | 1.00 | –16.19

2.4 Model development

Descriptors obtained after feature pre-screening were used to develop predictive models. Among the many approaches to model development in wide use, two were applied here to develop the QSPR prediction models.

2.4.1 Linear model

The linear model was developed by applying Multiple Linear Regression (MLR). MLR is one of the most widely used and best-known modelling methods, and serves as the basis for a number of multivariate methods.42 MLR is commonly used in QSPR due to its simplicity, transparency, reproducibility, and easy interpretability. MLR consists of a quantitative relationship between a group of variables Xi (descriptors) and a response Y, as shown in Eq. (1):

$$Y = a_0 + \sum_{i=1}^{p} a_i X_i \qquad (1)$$

where Y is the response or dependent variable (outputs), Xi represents the molecular descriptors (inputs), ai is the regression coefficient associated with descriptor i, p is the number of descriptors, and a0 is a constant (intercept). MLR calculations were performed using STATISTICA v. 8.0 (StatSoft, Inc.) and XLSTAT software.

2.4.2 Nonlinear model

The nonlinear model was then developed by submitting the relevant descriptors to a statistical learning method: the Artificial Neural Network (ANN). ANNs are particularly well suited for QSPR/QSAR models because of their capability to extract nonlinear information from the data set.43 The multilayer perceptron (MLP-ANN) is considered the easiest and most commonly used ANN type in the literature.42 The architecture of an MLP-ANN consists of an input layer encompassing the inputs, one or more hidden (intermediate) layers, and an output layer including the outputs. The layers are connected to each other linearly by the weights corresponding to the neurons in the neighbouring layers upstream and downstream. In
this work, the tangent sigmoid (tansig), the log sigmoid (logsig), and the exponential function were used as transfer functions of the hidden layer, while the exponential function and the linear function (purelin) were used as transfer functions of the output layer. The number of hidden neurons was optimized (from 3 to 30) by a trial and error procedure in the training process. One output neuron was used to represent the experimental values of ICO, IOH, IPO, and Mv. The network was trained using the quasi-Newton BFGS (Broyden–Fletcher–Goldfarb–Shanno) algorithm.

2.5 Model validation

The relevance of the QSPR/QSAR models is judged on the basis of the results of statistical validation. The statistical validation of models consists of internal and external validations. Recent studies44–48 have indicated that internal validation is essential for the validation of a QSPR/QSAR model. In our study, the most important traditional validation metrics were applied: root mean square error (RMSE), determination coefficient (R2), and cross-validated correlation coefficient (Q2LOO), in addition to the parameters (rm2 and Δrm2) introduced by Roy et al.49 External validation is essential to judge the predictive power of a model.50 For this study, the predictive power of the QSPR model was tested on the test set not used for model development, using the Q2pred parameter and the validation criteria by Golbraikh and Tropsha.51 The statistical parameters are collected in Eqs. (2–7), and the terms utilized in these equations are defined in the following:

(2)

(3)

(4)

(5)

(6)

(7)

where Yobs is the observed (experimental) value of Y, Ypred is the predicted (calculated) Y-value of the training set, test set, or validation set, n is the number of compounds in the data set (training, test, validation), Ȳobs is the average of Yobs, and r2 is the squared correlation coefficient between the observed and predicted values of the polymers with intercept.

To see the contribution of each parameter to the explanation of the dependent variable Y in the MLR models, we used the Student's t-test (“T-test”) of significance of each parameter. From this statistic, it is possible to test one by one whether each parameter of the multiple linear regression models is zero, and to build confidence intervals on these parameters, which is very useful during the interpretation phase of the model.52–53

(8)

where: ti is the T-test for descriptor “i”; ai is the coefficient associated with descriptor “i” in the model; s is the standard error associated with descriptor “i”; α is the confidence interval; n is the number of observations (size of database); p is the number of independent variables (descriptors).

2.6 Sensitivity analysis

To see the contribution of each input variable (descriptors and duration of exposure) to the outputs (ICO, IOH, IPO, and Mv), a sensitivity analysis was carried out using the “weight” method. This method, proposed by Garson,54 quantifies the relative importance (RI) of the different inputs (variables) on the outputs of each neural network. The relative importance is calculated by the “weight” method as follows: the product of the input–hidden layer and hidden–output layer connection weights is calculated between each input neuron and output neuron, and the products are summed across all hidden neurons.55

2.7 Applicability domain

Any prediction model must have a range within which its accuracy is satisfactory for application. This range is defined by the applicability domain of the model. Outside this domain, the application of the model can lead to erroneous predictions.50 It should be noted that there are several approaches to determining the applicability domain. In the present work, we used the leverage approach (Williams plot). The influence of a sample on the model is measured by its leverage (hi). The leverage of a compound in the original variable space is defined as follows:50

(9)

where X is the model matrix derived from the training set descriptor values, and the leverage values of the training set are the diagonal elements of the hat or influence matrix (hi = diag (H)). The leverage values are always between 0 and 1. The warning leverage h* is, generally, fixed at 3(p + 1)/n, where n is the total number of samples in the training set, and p is the number of descriptors involved in the correlation.50
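
The leverage approach of Section 2.7 can be illustrated with a short numerical sketch. It assumes the standard hat-matrix form h_i = x_i^T (X^T X)^{-1} x_i, which may differ in detail from the authors' Eq. (9); the function names and the random descriptor matrix are hypothetical and serve only as an example.

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage of each query row with respect to the training matrix.

    Assumes the usual hat-matrix definition h_i = x_i^T (X^T X)^{-1} x_i.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pseudo-inverse keeps the sketch usable if X^T X is ill-conditioned.
    gram_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T G x_i without building the full hat matrix.
    return np.einsum("ij,jk,ik->i", X_query, gram_inv, X_query)

def warning_leverage(p, n):
    """Warning leverage h* = 3(p + 1)/n for p descriptors and n training samples."""
    return 3.0 * (p + 1) / n

# Hypothetical training matrix: 20 samples, 3 descriptors (not the paper's data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 3))

h_train = leverages(X_train, X_train)
h_star = warning_leverage(p=3, n=20)
outside_domain = h_train > h_star  # samples flagged as structurally influential
```

In a Williams plot, these leverages are drawn against the standardized residuals, with a vertical line at h*. Whether the model matrix should also carry a column of ones for the intercept is left open here.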
3 Results and discussion

3.1 MLR predictive model (linear model)

The linear models obtained for the prediction of ICO, IOH, IPO, and Mv of different polymers are represented by the following Eqs. (10–13) with the reported statistical parameters:

ICO = 0.084 + 5.075E–04 t(h) – 0.473 MATS2m – 0.118 minHBd – 9.849E–02 MATS2s + 2.904E–04 VE3_Dzm + 0.181 MATS7i (10)
(N = 111; R2 = 0.862; RMSE = 0.024; F = 106.905; p < 0.0001)

IOH = 0.585 + 1.933E–04 t(h) – 5.824E–04 TDB5i – 9.632E–02 geomShape – 3.987E–02 minHBd + 1.850E–06 ATSC7m – 8.135E–02 P2i (11)
(N = 141; R2 = 0.802; RMSE = 0.011; F = 89.907; p < 0.0001)

IPO = 1.319 + 6.382E–04 t(h) – 0.546 SpMin2_Bhv + 5.742E–04 VE3_Dzi – 9.128E–04 VR3_Dzv (12)
(N = 81; R2 = 0.898; RMSE = 0.030; F = 165.866; p < 0.0001)

Mv = 547869.690 – 324.233 t(h) + 1594.286 RDF25p – 804.937 TDB4i + 5277.705 minHBint8 + 1293.069 ATSC3c (13)
(N = 118; R2 = 0.891; RMSE = 0.0650; F = 181.371; p < 0.0001)

The large F ratios (106.905; 89.907; 165.866; 181.371) indicate that Eqs. (10), (11), (12), and (13) were powerful in predicting ICO, IOH, IPO, and Mv.

For each pair of descriptors, the value of the correlation coefficient (see Table S1 in the supplementary materials) for the four parameters studied was < 0.717, which supports the independence of the selected descriptors. In addition, the multi-collinearity of the descriptors used in the MLR model can be evaluated by calculating their variance inflation factors (VIF). If the calculation of VIF gives a value between 1 and 5, the associated model is acceptable.54 As Table 2 shows, all variables have VIF values ≤ 2.13, which shows that the resulting model has statistical significance, and therefore, that the descriptors are close to orthogonal.

As shown in Table 2, the t-test value of the exposure time t(h) for the carbonyl index (ICO) is 19.97. This value is larger in magnitude than that of any other descriptor, which indicates that the duration of exposure is the most significant variable. The negative regression coefficients of the MATS2m, minHBd, and MATS2s descriptors mean that these descriptors have a negative impact on ICO. On the contrary, the descriptors VE3_Dzm and MATS7i, owing to the positive signs of their regression coefficients, increase the carbonyl index. The same reasoning applies to the other parameters (IOH, IPO, and Mv), where some descriptors have positive signs (positive impact), while others have negative signs (negative impact).

3.2 ANN predictive model (nonlinear model)

The optimization of the artificial neural network architecture is essential to obtain an optimal network. The database distribution, the activation functions (for hidden neurons and output neurons), the number of neurons in the hidden layer, and the learning algorithms were optimized after several trials. The optimal performance of the model is evaluated in terms of RMSE.38 The results of the optimization of the nonlinear model are shown in Table S2 (supplementary materials). The network architectures obtained are the following: {6-18-1} for ICO, {6-23-1} for IOH, {4-13-1} for IPO, and {5-24-1} for Mv. The predicted values of ICO, IOH, IPO, and Mv in the training and test sets have been plotted versus their observed values in Fig. 2. As may be seen, a close correlation between the predicted and the observed values was found. The main performance parameters of the ANN model are shown in Table 3. As may be seen from Table 3, all values of the statistical parameters (R2, Q2LOO, and RMSE) of the training set are acceptable. The non-linear ANN model also gives good results for the test set. Therefore, these results reveal that the ANN model not only performed well in model development, but also had excellent prediction. Moreover, all these results confirm the existence of a non-linear relationship between the relevant descriptors of the model and the predicted physicochemical properties.

3.2.1 Sensitivity analysis

The weight method was used to calculate the relative importance IR (%) of the variables (descriptors and duration of exposure) for the ANN model. Fig. 3 gives a graphical representation of the relative importance IR (%) of each variable on the properties of photostabilization (ICO, IOH, IPO, and Mv). According to this figure, all relevant input variables have a significant contribution (IR > 6 %) to the photostabilization properties. This sensitivity analysis by the weight method identified the importance of all the variables used for the modelling of physicochemical properties during the photostabilization of PVC, PS, and PMMA, and therefore supports the choice of variables that were used in this study.

3.2.2 Applicability domain

The applicability domain of the ANN model was analysed using a Williams plot (Fig. 4). The vertical line is the critical leverage value (h*). As seen in Fig. 4, none of the polymers of the training set and the validation set had a leverage higher than the warning value h* (0.2035 for ICO, 0.2059 for IOH, 0.1923 for IPO, and 0.1579 for Mv). For the polyene index, one polymer in the training set was underestimated. In the Williams plots of the carbonyl index and the viscosity average molecular weight, two polymers belonging to the test set and one polymer of the training set can be considered as response outliers, one of which is overestimated, while another is underestimated. In the domain applicability of the hydroxyl index, there polymers
Fig. 2 – Plot of observed vs. predicted: (a) ICO, (b) IOH, (c) IPO, and (d) Mv values from the ANN model
(scatter plots; each panel shows the value calculated by ANN against the experimental value for the training set and the test set, together with the perfect-fit line; the R2 values printed on the panels are 0.9917, 0.9940, 0.9939, and 0.9948)

Table 3 – Performance of the ANN model

                  Internal validation          External validation
Model             n    R2     RMSE   Q2LOO     n    R2     RMSE   Q2pred  rm2    Δrm2   k                k′
ANN, ICO          86   0.99   0.006  0.99      25   0.96   0.012  0.95    0.89   0.04   0.96             1.00
ANN, IOH          85   0.99   0.002  0.99      56   0.98   0.004  0.98    0.97   0.01   0.99             1.00
ANN, IPO          65   0.99   0.009  0.99      16   0.99   0.010  0.99    0.98   0.01   0.99             1.01
ANN, Mv           95   0.98   0.057  0.99      23   0.98   0.093  0.98    0.97   0.01   0.99             1.00
Threshold value   –    > 0.6  –      > 0.5     –    > 0.6  –      > 0.5   > 0.5  < 0.2  0.85 < k < 1.15  0.85 < k′ < 1.15
(refs. 50, 56)
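The external-validation columns of Table 3 can be illustrated with a short Python sketch. The formulas below are the commonly used definitions of these statistics (through-origin slopes k and k′, and the rm2 metrics); this section does not restate the authors' exact formulas, so they are assumptions, and the function name `external_validation` is hypothetical.

```python
import numpy as np

def external_validation(y_obs, y_pred, y_train_mean):
    """Common external-validation statistics (assumed definitions)."""
    y = np.asarray(y_obs, dtype=float)
    p = np.asarray(y_pred, dtype=float)

    rmse = float(np.sqrt(np.mean((y - p) ** 2)))
    r2 = float(np.corrcoef(y, p)[0, 1] ** 2)
    # Q2pred: squared error relative to the spread around the training-set mean
    q2_pred = float(1.0 - np.sum((y - p) ** 2) / np.sum((y - y_train_mean) ** 2))

    # Slopes of the regression lines forced through the origin
    k = float(np.sum(y * p) / np.sum(p ** 2))        # observed vs. predicted
    k_prime = float(np.sum(y * p) / np.sum(y ** 2))  # predicted vs. observed

    # Squared correlation coefficients of the through-origin fits
    r0_sq = 1.0 - np.sum((y - k * p) ** 2) / np.sum((y - y.mean()) ** 2)
    r0p_sq = 1.0 - np.sum((p - k_prime * y) ** 2) / np.sum((p - p.mean()) ** 2)

    rm2 = r2 * (1.0 - np.sqrt(abs(r2 - r0_sq)))
    rm2_rev = r2 * (1.0 - np.sqrt(abs(r2 - r0p_sq)))
    return {
        "R2": r2,
        "RMSE": rmse,
        "Q2pred": q2_pred,
        "rm2_avg": float((rm2 + rm2_rev) / 2.0),
        "delta_rm2": float(abs(rm2 - rm2_rev)),
        "k": k,
        "k_prime": k_prime,
    }

# Illustrative values only, not data from the paper
stats = external_validation([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], y_train_mean=2.5)
print(stats)
```

Each returned value can then be compared with the corresponding threshold in the last row of Table 3.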

Fig. 3 – Plot of the relative importance of the descriptors for ANN models: carbonyl index, hydroxyl index, polyene index, and viscosity average molecular weight. Each panel is a bar chart of the relative importance (RI / %) of the model inputs: t(h), MATS2m, minHBd, MATS2s, VE3_Dzm, and MATS7i for the carbonyl index; t(h), TDB5i, geomShape, minHBd, ATSC7m, and P2i for the hydroxyl index; t(h), SpMin2_Bhv, VE3_Dzi, and VR3_Dzv for the polyene index; and t(h), RDF25p, TDB4i, minHBint8, and ATSC3c for the viscosity average molecular weight.
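Relative importances such as those in Fig. 3 are often derived from the network weights with Garson's algorithm (refs. 54 and 55). The sketch below implements one common formulation for a single-hidden-layer network with one output; whether the authors used exactly this variant is an assumption, and the function name and example weights are hypothetical.

```python
import numpy as np

def garson_relative_importance(w_in, w_out):
    """Relative importance (%) of each input from absolute connection weights.

    w_in  : array of shape (n_inputs, n_hidden), input-to-hidden weights
    w_out : array of shape (n_hidden,), hidden-to-output weights
    """
    w_in = np.abs(np.asarray(w_in, dtype=float))
    w_out = np.abs(np.asarray(w_out, dtype=float))
    # Share of each input within every hidden neuron
    share = w_in / w_in.sum(axis=0, keepdims=True)
    # Weight each hidden neuron by the magnitude of its output connection
    weighted = share * w_out
    total = weighted.sum(axis=1)
    return 100.0 * total / total.sum()

# Hypothetical weights for a 2-input, 2-hidden-neuron network
ri = garson_relative_importance([[3.0, 1.0], [1.0, 1.0]], [1.0, 1.0])
print(ri)
```

Because only absolute weights enter the calculation, the result ranks inputs by influence but says nothing about the direction of their effect.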

Fig. 4 – Projection of the training set and the test set in the Williams plot for ANN: (a) ICO, (b) IOH, (c) IPO, and (d) Mv. Each panel plots the standardized residual against the leverage for the training and test sets, with horizontal limits of the normal values at +3 and −3 and a vertical line at the critical leverage: h* = 0.2035 (a), 0.2059 (b), 0.1923 (c), and 0.1579 (d).
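The quantities plotted in a Williams plot can be sketched in Python as follows. The leverage is taken from the hat matrix of the training descriptors, and the warning value uses the widely quoted convention h* = 3(p + 1)/n; both choices are assumptions, since this section does not give the authors' formula, and all function names and data are hypothetical.

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^-1 x_i of each query row."""
    Xt = np.asarray(X_train, dtype=float)
    Xq = np.asarray(X_query, dtype=float)
    xtx_inv = np.linalg.pinv(Xt.T @ Xt)
    return np.einsum("ij,jk,ik->i", Xq, xtx_inv, Xq)

def critical_leverage(n_train, n_descriptors):
    """Assumed convention for the warning leverage: h* = 3(p + 1)/n."""
    return 3.0 * (n_descriptors + 1) / n_train

def flag_outliers(h, std_resid, h_star, limit=3.0):
    """Structural outliers exceed h*; response outliers exceed +/- limit."""
    h = np.asarray(h, dtype=float)
    r = np.asarray(std_resid, dtype=float)
    return {"structural": h > h_star, "response": np.abs(r) > limit}

# Hypothetical descriptor matrix: 20 training samples, 4 descriptors
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
h_train = leverages(X, X)
h_star = critical_leverage(n_train=20, n_descriptors=4)
print(h_star, h_train.max())
```

A point is inside the applicability domain when its leverage is below h* and its standardized residual lies within the ±3 limits.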
Table 4 – External validation of the ANN and MLR models

Property  Model  n    R2    RMSE   Q2pred  rm2   Δrm2  k     k′
ICO       ANN    25   0.96  0.012  0.95    0.89  0.04  0.96  1.00
ICO       MLR    25   0.80  0.025  0.80    0.69  0.16  1.00  0.95
IOH       ANN    56   0.98  0.004  0.98    0.97  0.01  0.99  1.00
IOH       MLR    56   0.76  0.013  0.76    0.65  0.17  1.00  0.96
IPO       ANN    16   0.99  0.010  0.99    0.98  0.01  0.99  1.01
IPO       MLR    16   0.88  0.031  0.87    0.81  0.09  0.97  1.02
Mv        ANN    23   0.98  0.093  0.98    0.97  0.01  0.99  1.00
Mv        MLR    23   0.84  0.283  0.85    0.77  0.10  1.05  0.91
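As a small worked example, the external-validation RMSE values of Table 4 can be turned into a relative error reduction of the ANN over the MLR model. The numbers are copied from the table; the helper name `rmse_reduction_percent` is hypothetical.

```python
# External-validation RMSE values copied from Table 4
rmse = {
    "ICO": {"ANN": 0.012, "MLR": 0.025},
    "IOH": {"ANN": 0.004, "MLR": 0.013},
    "IPO": {"ANN": 0.010, "MLR": 0.031},
    "Mv": {"ANN": 0.093, "MLR": 0.283},
}

def rmse_reduction_percent(ann, mlr):
    """Percentage by which the ANN RMSE is lower than the MLR RMSE."""
    return 100.0 * (1.0 - ann / mlr)

reductions = {
    name: rmse_reduction_percent(values["ANN"], values["MLR"])
    for name, values in rmse.items()
}
for name, value in reductions.items():
    print(f"{name}: ANN RMSE is {value:.1f} % lower than MLR")
```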

of the test set is underestimated. These eleven response outliers (3 for ICO, 4 for IOH, 1 for IPO, and 3 for Mv) could be associated with errors in the experimental values. The results of the ANN models correspond to the third principle (a defined domain of applicability) of the Organization for Economic Cooperation and Development (OECD).

3.3 Comparison of the results of the MLP-ANN and MLR models

To compare the performance and the predictive quality of the ANN and MLR models, the statistical parameters are summarized in Table 4. According to Table 4, all the statistical parameters are satisfactory for both models, which indicates good robustness. However, the statistical parameters of the ANN model are substantially better. Thus, it can be concluded that the ANN model has better predictive power than the MLR model. This suggests the existence of a non-linear correlation between the outputs of the network (ICO, IOH, IPO, and Mv) and the selected variables (descriptors + exposure time).

4 Conclusion

The aim of the present work was to develop a QSPR study to predict the carbonyl, hydroxyl, and polyene indices (ICO, IOH, IOP), and the viscosity average molecular weight (Mv) of poly(vinyl chloride), polystyrene, and poly(methyl methacrylate). These physicochemical properties are considered fundamental in the study of the photodegradation of polymers. This QSPR study, which involved 111, 141, 81, and 118 structurally diverse polymers, respectively, and a series of descriptors calculated with the PaDEL-Descriptor software and selected by a stepwise method, was based on the artificial neural network (ANN) and multiple linear regression (MLR). The ANN and MLR models were assessed comprehensively by internal and external validation. They showed good values of R2 and Q2LOO for the training set, and high predictive R2 and Q2pred values for the validation set. All the validations indicated that the QSPR models were robust and satisfactory. However, a substantial improvement of the statistical parameters for the ANN model can be noted. In conclusion, the ANN model developed in this study meets all OECD principles for QSPR validation, and can be used to predict ICO, IOH, IOP, and Mv of polymers, particularly those that have not been tested, and thus helps reduce the costly experimental determination of these properties.

ACKNOWLEDGEMENTS

The authors acknowledge the team of the LBMPT laboratory and the University of Médéa for their help throughout this project.

List of abbreviations

ANN – artificial neural network
Mv – viscosity average molecular weight
Di – descriptors
Ei – inputs
FDi – filtrated descriptors
FEi – filtrated inputs
Pi – physical properties
Q2 – cross-validated correlation coefficient
BFGS – Broyden-Fletcher-Goldfarb-Shanno
ICO – carbonyl index
IOH – hydroxyl index
IOP – polyene index
LCST – lower critical solution temperature
MLR – multiple linear regressions
MRE – mean relative error
OECD – Organization for Economic Cooperation and Development
PMMA – polymethyl methacrylate
PS – polystyrene

PVC – polyvinyl chloride
QSAR – quantitative structure–activity relationships
QSPR – quantitative structure–property relationship
RBFNN – radial basis function neural network
RI – relative importance
RMSE – root mean squared error
SD – standard deviation
SMILES – simplified molecular input line entry system
VIF – variance inflation factor
E – error
FPi – filtrated physical properties
IR – relative importance
R – correlation coefficient
RFDi – relevant filtrated descriptors
RFEi – relevant filtrated input
RFPi – relevant filtrated physical properties

References

1. E. Yousif, N. Salih, J. Salimon, Improvement of the Photostabilization of PVC Films in the Presence of 2N-Salicylidene-5-(Substituted)-1,3,4-Thiadiazole, J. Appl. Polym. Sci. 120 (2011) 2207–2214, doi: https://doi.org/10.1002/app.33463.
2. M. Ito, K. Nagai, Degradation issues of polymer materials used in railway field, Polym. Degrad. Stabil. 93 (2008) 1723–1735, doi: https://doi.org/10.1016/j.polymdegradstab.2008.07.011.
3. C. C. Wang, G. Pilania, S. A. Boggs, S. Kumar, C. Breneman, R. Ramprasad, Computational strategies for polymer dielectrics design, Polymer 55 (2014) 979–988, doi: https://doi.org/10.1016/j.polymer.2013.12.069.
4. E. Yousif, A. A. Al-Amiery, A. Kadihum, A. A. H. Kadhum, A. B. Mohamad, Photostabilizing Efficiency of PVC in the Presence of Schiff Bases as Photostabilizers, Molecules 20 (2015) 19886–19899, doi: https://doi.org/10.3390/molecules201119665.
5. E. A. Yousif, S. A. Aliwi, A. A. Ameer, J. R. Ukal, Induced photodegradation effect on the functionalized Fe(III) complex additive-poly(vinyl chloride) thin film, Turk. J. Chem. 33 (2009) 399–410, doi: https://doi.org/10.3906/kim-0711-4.
6. G. Q. Ali, G. A. El-Hiti, I. H. R. Tomi, R. Haddad, A. J. Al-Qaisi, E. Yousif, Photostability and Performance of Polystyrene Films Containing 1,2,4-Triazole-3-thiol Ring System Schiff Bases, Molecules 21 (2016) 1699, doi: https://doi.org/10.3390/molecules21121699.
7. R. R. Mohamed, Photostabilization of Polymers, Polymers and Polymeric Composites: A Reference Series, 2014, pp. 1–10, doi: https://doi.org/10.1007/978-3-642-37179-0_74-1.
8. E. Yousif, Triorganotin (IV) complexes photo-stabilizers for rigid PVC against photodegradation, J. Taibah. Uni. Sci. 7 (2013) 79–87, doi: https://doi.org/10.1016/j.jtusci.2013.04.007.
9. E. Yousif, R. Haddad, Photodegradation and photostabilization of polymers, especially polystyrene: review, SpringerPlus 2 (2013) 398, doi: https://doi.org/10.1186/2193-1801-2-398.
10. E. Yousif, J. Salimon, N. Salih, Mechanism of photostabilization of poly (methy methacrylate) films by 2-thioacetic acid benzothiazol complexes, Arab. J. Chem. 7 (2014) 306–311, doi: https://doi.org/10.1016/j.arabjc.2010.11.003.
11. E. Yousif, J. Salimon, N. Salih, New Stabilizers for Polystyrene Based on 2-Thioacetic Acid Benzothiazol Complexes, J. Appl. Polym. Sci. 125 (2012) 1922–1927, doi: https://doi.org/10.1002/app.36307.
12. X. Colin, G. Teyssedre, M. Fois, Ageing and degradation of multiphase polymer systems, Handbook of Multiphase Polymer Systems, 2011, pp. 797–841, doi: https://doi.org/10.1002/9781119972020.ch21.
13. N. S. Allen, M. Edge, T. Corrales, F. Catalina, Stabiliser interactions in the thermal and photooxidation of titanium dioxide pigmented polypropylene films, Polym. Degrad. Stabil. 61 (1998) 139–149, doi: https://doi.org/10.1016/S0141-3910(97)00143-2.
14. J. M. Peña, N. S. Allen, M. Edge, C. M. Liauw, I. Roberts, B. Valange, Triplet quenching and antioxidant effect of several carbon black grades in the photodegradation of LDPE doped with benzophenone as a photosensitiser, Polym. Degrad. Stabil. 70 (2000) 437–454, doi: https://doi.org/10.1016/S0141-3910(00)00140-3.
15. J. M. Peña, N. S. Allen, M. Edge, C. M. Liauw, B. Valange, Studies of synergism between carbon black and stabilisers in LDPE photodegradation, Polym. Degrad. Stabil. 72 (2001) 259–270, doi: https://doi.org/10.1016/S0141-3910(01)00033-7.
16. N. S. Allen, M. Edge, A. Ortega, G. Sandoval, C. M. Liauw, J. Verran, J. Stratton, R. B. McIntyre, Degradation and stabilisation of polymers and coatings: nano versus pigmentary titania particles, Polym. Degrad. Stabil. 85 (2004) 927–946, doi: https://doi.org/10.1016/j.polymdegradstab.2003.09.024.
17. E. Yousif, A. Hameed, R. Rasheed, H. Mansoor, Y. Farina, A. Graisa, N. Salih, J. Salimon, Synthesis and Photostability Study of Some Modified Poly(vinyl chloride) Containing Pendant Benzothiazole and Benzimidozole Ring, Int. J. Chem. 2 (2010) 65–80.
18. I. H. R. Tomi, G. Q. Ali, A. H. Jawad, E. Yousif, Photostabilizing efficiency of PVC based on epoxidized oleic acid, J. Polym. Res. 24 (2017) 119, doi: https://doi.org/10.1007/s10965-017-1283-7.
19. E. Yousif, J. Salimon, N. Salih, New photostabilizers for PVC based on some diorganotin (IV) complexes, J. Saudi. Chem. Soc. 19 (2015) 133–141, doi: https://doi.org/10.1016/j.jscs.2012.01.003.
20. E. Yousif, J. Salimon, N. Salih, New stabilizers for polystyrene based on 2-N-salicylidene-5-(substituted)-1,3,4-thiadiazole compounds, J. Saudi. Chem. Soc. 16 (2012) 299–306, doi: https://doi.org/10.1016/j.jscs.2011.01.011.
21. E. Yousif, E. Bakir, J. Salimon, N. J. Salih, Evaluation of Schiff bases of 2,5-dimercapto-1,3,4-thiadiazole as photostabilizer for poly(methyl methacrylate), J. Saudi. Chem. Soc. 16 (2012) 279–285, doi: https://doi.org/10.1016/j.jscs.2011.01.009.
22. E. Yousif, J. Salimon, N. Salih, A. Ahmed, Improvement of the photostabilization of PMMA films in the presence 2N-salicylidene-5-(substituted)-1,3,4-thiadiazole, J. King. Saud. University (Science). 24 (2012) 131–137, doi: https://doi.org/10.1016/j.jksus.2010.09.001.
23. A. Afantitis, G. Melagraki, H. Sarimveis, P. A. Koutentis, J. Markopoulos, O. Igglessi-Markopoulou, Prediction of intrinsic viscosity in polymer–solvent combinations using a QSPR model, Polymer 47 (2006) 3240–3248, doi: https://doi.org/10.1016/j.polymer.2006.02.060.
24. M. Hamadache, A. Amrane, S. Hanini, O. Benkortbi, Multilayer Perceptron Model for Predicting Acute Toxicity of Fungicides on Rats, International Journal of Quantitative Structure-Property Relationships (IJQSPR) 3 (2018) 100–118, doi: https://doi.org/10.4018/IJQSPR.2018010106.
25. J. Xu, B. Chen, Q. Zhang, B. Guo, Prediction of refractive indices of linear polymers by a four-descriptor QSPR model, Polymer 45 (2004) 8651–8659, doi: https://doi.org/10.1016/j.polymer.2004.10.057.
26. J. Xu, L. Liu, W. Xu, S. Zhao, D. Zuo, A general QSPR model for the prediction of θ (lower critical solution temperature) in polymer solutions with topological indices, J. Mol. Graph. Model. 26 (2007) 352–359, doi: https://doi.org/10.1016/j.jmgm.2007.01.004.
27. W. Liu, P. Yi, Z. Tang, QSPR Models for Various Properties of Polymethacrylates Based on Quantum Chemical Descriptors, QSAR. Comb. Sci. 25 (2006) 936–943, doi: https://doi.org/10.1002/qsar.200510177.
28. F. Gharagheizi, QSPR analysis for intrinsic viscosity of polymer solutions by means of GA-MLR and RBFNN, Comput. Mater. Sci. 40 (2007) 159–167, doi: https://doi.org/10.1016/j.commatsci.2006.11.010.
29. A. P. Toropova, A. A. Toropov, K. O. Valentin, L. Leszczynska, L. Jerzy, Optimal descriptors as a tool to predict the thermal decomposition of polymer, J. Math. Chem. 52 (2014) 1171–1181, doi: https://doi.org/10.1007/s10910-014-0323-3.
30. P. R. Duchowicz, S. E. Fioressi, D. E. Bacelo, L. M. Saavedra, A. P. Toropova, A. A. Toropov, QSPR studies on refractive indices of structurally heterogeneous polymers, Chemometr. Intell. Lab. Syst. 140 (2015) 86–91, doi: https://doi.org/10.1016/j.chemolab.2014.11.008.
31. M. T. D. Cronin, T. W. Schultz, Pitfalls in QSAR, J. Mol. Struct. 622 (2003) 39–51, doi: https://doi.org/10.1016/S0166-1280(02)00616-4.
32. E. Yousif, A. Hameed, N. Salih, J. Salimon, B. M. Abdullah, New photostabilizers for polystyrene based on 2,3-dihydro-(5-mercapto-1,3,4-oxadiazol-2-yl)-phenyl-2-(substituted)-1,3,4-oxazepine-4,7-dione compounds, SpringerPlus 2 (2013) 104, doi: https://doi.org/10.1186/2193-1801-2-104.
33. E. Yousif, A. Ahmed, R. Abood, N. Jaber, R. Noaman, R. Yusop, Poly(vinyl chloride) derivatives as stabilizers against photodegradation, J. Taibah. Uni. Sci. 9 (2015) 203–212, doi: https://doi.org/10.1016/j.jtusci.2014.10.003.
34. M. M. Ali, G. A. El-Hiti, E. Yousif, Photostabilizing Efficiency of Poly(vinyl chloride) in the Presence of Organotin (IV) Complexes as Photostabilizers, Molecules 21 (2016) 1151, doi: https://doi.org/10.3390/molecules21091151.
35. A. Hameed, Microwave Synthesis of Some New 1,3-Oxazepine Compounds as Photostabilizing Additives for Pmma Films, JAUS 15 (2012) 47–59.
36. P. R. Duchowicz, S. E. Fioressi, D. E. Bacelo, L. M. Saavedra, A. P. Toropova, A. A. Toropov, QSPR studies on refractive indices of structurally heterogeneous polymers, Chemometr. Intell. Lab. Syst. 140 (2015) 86–91, doi: https://doi.org/10.1016/j.chemolab.2014.11.008.
37. A. R. Katritzky, S. Sild, V. Lobanov, M. Karelson, Quantitative Structure-Property Relationship (QSPR) Correlation of Glass Transition Temperatures of High Molecular Weight Polymers, J. Chem. Inf. Comput. Sci. 38 (1998) 300–304, doi: https://doi.org/10.1021/ci9700687.
38. M. Hamadache, O. Benkortbi, S. Hanini, A. Amrane, L. Khaouane, C. Si-Moussa, A Quantitative Structure Activity Relationship for acute oral toxicity of pesticides on rats: Validation, domain of application and prediction, J. Hazard. Mater. 303 (2016) 28–40, doi: https://doi.org/10.1016/j.jhazmat.2015.09.021.
39. J. C. Dearden, The Use of Topological Indices in QSAR and QSPR Modeling. In: Roy K. (Eds) Advances in QSAR Modeling. Challenges and Advances in Computational Chemistry and Physics, vol 24, Springer, Cham (2017) pp. 57–88, doi: https://doi.org/10.1007/978-3-319-56850-8_2.
40. D. A. Konovalov, N. Sim, E. Deconinck, Y. Vander Heyden, D. Coomans, Statistical confidence for variable selection in QSAR models, J. Chem. Inf. Model. 48 (2008) 370–383, doi: https://doi.org/10.1021/ci700283s.
41. A. S. Reddy, S. Kumar, R. Garg, Hybrid-genetic algorithm based descriptor optimization and QSAR models for predicting the biological activity of tipranavir analogs for HIV protease inhibition, J. Mol. Graph. Model. 28 (2010) 852–862, doi: https://doi.org/10.1016/j.jmgm.2010.03.005.
42. A. Habibi-Yangjeh, M. Danandeh-Jenagharad, Application of a genetic algorithm and an artificial neural network for global prediction of the toxicity of phenols to Tetrahymena pyriformis, Monatsh. Chem. 140 (2009) 1279–1288, doi: https://doi.org/10.1007/s00706-009-0185-8.
43. M. Hamadache, L. Khaouane, O. Benkortbi, C. Si Moussa, S. Hanini, A. Amrane, Prediction of Acute Herbicide Toxicity in Rats from Quantitative Structure–Activity Relationship Modeling, Environ. Eng. Sci. 31 (2014) 243–252, doi: https://doi.org/10.1089/ees.2013.0466.
44. M. Hamadache, S. Hanini, O. Benkortbi, A. Amrane, L. Khaouane, C. Si-Moussa, Artificial neural network-based equation to predict the toxicity of herbicides on rats, Chemometr. Intell. Lab. Syst. 154 (2016) 7–15, doi: https://doi.org/10.1016/j.chemolab.2016.03.007.
45. R. Wang, J. Jiang, Y. Pan, H. Cao, Y. Cui, Prediction of impact sensitivity of nitro energetic compounds by neural network based on electro topological-state indices, J. Hazard. Mater. 166 (2009) 155–186, doi: https://doi.org/10.1016/j.jhazmat.2008.11.005.
46. I. Mitra, A. Saha, K. Roy, Chemometric QSAR Modeling and In Silico Design of Antioxidant NO Donor Phenols, Sci. Pharm. 79 (2011) 31–57, doi: https://doi.org/10.3797/scipharm.1011-02.
47. V. Consonni, D. Ballabio, R. Todeschini, Evaluation of model predictive ability by external validation techniques, J. Chemometr. 24 (2010) 194–201, doi: https://doi.org/10.1002/cem.1290.
48. S. Bitam, M. Hamadache, S. Hanini, Prediction of therapeutic potency of tacrine derivatives as BuChE inhibitors from quantitative structure–activity relationship modeling, SAR. QSAR. Environ. Res. 29 (2018) 213–230, doi: https://doi.org/10.1080/1062936X.2018.1423640.
49. K. Roy, I. Mitra, S. Kar, P. K. Ojha, R. N. Das, H. Kabir, Comparative studies on some metrics for external validation of QSPR models, J. Chem. Inf. Model. 52 (2012) 396–408, doi: https://doi.org/10.1021/ci200520g.
50. M. Hamadache, O. Benkortbi, S. Hanini, A. Amrane, QSAR modeling in ecotoxicological risk assessment: application to the prediction of acute contact toxicity of pesticides on bees (Apis mellifera L.), Environ. Sci. Pollut. Res. Int. 25 (2018) 896–907, doi: https://doi.org/10.1007/s11356-017-0498-9.
51. A. Golbraikh, A. Tropsha, Beware of q2!, J. Mol. Graph. Model. 20 (2002) 269–276, doi: https://doi.org/10.1016/S1093-3263(01)00123-1.
52. K. Bellifa, Etude des relations quantitatives structure–toxicité des composés chimiques à l’aide des descripteurs moléculaires. Modélisation QSAR (Doctoral dissertation) (2015).
53. S. Chtita, Modélisation de molécules organiques hétérocycliques biologiquement actives par des méthodes QSAR/QSPR. Recherche de nouveaux médicaments (Doctoral dissertation) (2017).
54. G. D. Garson, Interpreting neural network connection weights, Artif. Intell. Expert. 6 (1991) 46–51.
55. M. Gevrey, I. Dimopoulos, S. Lek, Review and comparison of methods to study the contribution of variables in artificial neural network models, Ecol. Model. 160 (2003) 249–264, doi: https://doi.org/10.1016/S0304-3800(02)00257-0.
56. S. Bitam, M. Hamadache, S. Hanini, QSAR model for prediction of the therapeutic potency of N-benzylpiperidine derivatives as AChE inhibitors, SAR. QSAR. Environ. Res. 28 (2017) 471–489, doi: https://doi.org/10.1080/1062936X.2017.1331467.

SUMMARY
QSPR Studies of Carbonyl, Hydroxyl, Polyene Indices, and Viscosity Average Molecular Weight of Polymers under Photostabilization Using ANN and MLR Approaches
Hadjira Maouz,a* Latifa Khaouane,a Salah Hanini,a Yamina Ammi,a,b Mabrouk Hamadache,a and Maamar Laidi a

One of the main disadvantages of the use of synthetic or semi-synthetic polymeric materials is their degradation and aging. The purpose of this study is to apply artificial neural networks (ANN) and multiple linear regressions (MLR) to predict the carbonyl, hydroxyl, and polyene indices (ICO, IOH, and IOP), and the viscosity average molecular weight (Mv) of poly(vinyl chloride), polystyrene, and poly(methyl methacrylate). These physicochemical properties are considered important in the study of the photostabilization of polymers. The structure of each polymer studied is represented by five repeating monomer units. Quantitative structure-property relationship (QSPR) models obtained using relevant descriptors showed good predictability. The models were validated by internal validation {R2, RMSE, and Q2LOO}, external validation {R2, RMSE, Q2pred, rm2, Δrm2, k, and k′}, and the applicability domain. The comparison of the results shows that the ANN models are more efficient than the MLR models. Accordingly, the QSPR model developed in this study provides excellent predictions and can be used to predict ICO, IOH, IOP, and Mv of polymers, particularly those that have not been tested.

Keywords
QSPR, photostabilization, polymers, artificial neural network, multiple linear regressions

a Laboratory of Biomaterials and Transport Phenomena (LBMPT), University of Médéa, Quartier Aïn d’Heb, 26000, Algeria
b University Center, Faculty of Science and Technology, Department of Process Engineering, Relizane, Algeria

Original scientific paper
Received June 17, 2019
Accepted September 7, 2019

Supplementary material

Table S1 (a) – Correlation matrix of the descriptors used for the carbonyl index
t(h) MATS2m minHBd MATS2s VE3_Dzm MATS7i
t(h) 1 –0.056 –0.015 0.053 0.041 –0.038
MATS2m –0.056 1 0.014 –0.447 0.176 –0.200
minHBd –0.015 0.014 1 0.254 –0.029 0.297
MATS2s 0.053 –0.447 0.254 1 0.072 0.187
VE3_Dzm 0.041 0.176 –0.029 0.072 1 –0.441
MATS7i –0.038 –0.200 0.297 0.187 –0.441 1

Table S1 (b) – Correlation matrix of the descriptors used for the hydroxyl index
t(h) TDB5i geomShape minHBd ATSC7m P2i
t(h) 1 0.062 0.034 –0.016 0.080 0.015
TDB5i 0.062 1 –0.198 0.093 0.396 0.063
geomShape 0.034 –0.198 1 –0.091 0.205 –0.333
minHBd –0.016 0.093 –0.091 1 –0.108 0.039
ATSC7m 0.080 0.396 0.205 –0.108 1 0.213
P2i 0.015 0.063 –0.333 0.039 0.213 1

Table S1 (c) – Correlation matrix of the descriptors used for the polyene index
t(h) SpMin2_Bhv VE3_Dzi VR3_Dzv
t(h) 1 0.097 0.061 0.109
SpMin2_Bhv 0.097 1 0.306 0.717
VE3_Dzi 0.061 0.306 1 0.263
VR3_Dzv 0.109 0.717 0.263 1

Table S1 (d) – Correlation matrix of the descriptors used for the viscosity average molecular weight
t(h) RDF25p TDB4i minHBint8 ATSC3c
t(h) 1 –0.007 0.009 –0.005 0.000
RDF25p –0.007 1 0.069 0.313 –0.016
TDB4i 0.009 0.069 1 –0.063 –0.119
minHBint8 –0.005 0.313 –0.063 1 0.004
ATSC3c 0.000 –0.016 –0.119 0.004 1
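The correlation matrices in Table S1 can be used to check for multicollinearity. The sketch below takes the polyene-index matrix from Table S1 (c), which contains the largest pairwise correlation (0.717), and computes variance inflation factors as the diagonal of the inverse correlation matrix, an identity that holds for standardized predictors. How the authors actually computed VIF is not stated in this section, so this is an illustration rather than their procedure.

```python
import numpy as np

# Correlation matrix copied from Table S1 (c), polyene index
names = ["t(h)", "SpMin2_Bhv", "VE3_Dzi", "VR3_Dzv"]
corr = np.array([
    [1.000, 0.097, 0.061, 0.109],
    [0.097, 1.000, 0.306, 0.717],
    [0.061, 0.306, 1.000, 0.263],
    [0.109, 0.717, 0.263, 1.000],
])

# For standardized predictors, VIF_i is the i-th diagonal element of the
# inverse correlation matrix
vif = np.diag(np.linalg.inv(corr))

for name, value in zip(names, vif):
    print(f"{name}: VIF = {value:.2f}")
```

A frequently quoted rule of thumb treats VIF values above 5 or 10 as a sign of problematic collinearity.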

Table S2 – Selected parameters of the optimal ANN model

ANN models                                 ICO            IOH            IPO            Mv
Number of input layers                     1              1              1              1
Number of hidden layers                    1              1              1              1
Number of output layers                    1              1              1              1
Number of input neurons                    6              6              4              5
Number of hidden neurons                   18             23             13             24
Number of output neurons                   1              1              1              1
Transfer function of the hidden neurons    Exponential    Tanh           Tanh           Tanh
Transfer function of the output neurons    Identity       Exponential    Identity       Identity
Training algorithm                         BFGS           BFGS           BFGS           BFGS
Training set                               77 % (n = 86)  60 % (n = 85)  80 % (n = 65)  80 % (n = 95)
Test set                                   23 % (n = 25)  40 % (n = 56)  20 % (n = 16)  20 % (n = 23)
RMSE                                       0.0079         0.0027         0.0095         0.0650
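To make the architecture in Table S2 concrete, the following sketch runs a forward pass through a network shaped like the IPO model: 4 input neurons, 13 tanh hidden neurons, and 1 identity output neuron. The weights are random placeholders, not the trained values, and the function name `mlp_forward` is hypothetical.

```python
import numpy as np

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Single-hidden-layer perceptron: tanh hidden layer, identity output."""
    hidden = np.tanh(x @ w_hidden + b_hidden)
    return hidden @ w_out + b_out

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 4, 13, 1  # IPO column of Table S2

# Placeholder weights and biases (the trained values are not given here)
w_hidden = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(scale=0.1, size=(n_hidden, n_outputs))
b_out = np.zeros(n_outputs)

# Three hypothetical input rows: exposure time plus three descriptors
x = rng.random((3, n_inputs))
y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)

n_params = w_hidden.size + b_hidden.size + w_out.size + b_out.size
print(y.shape, n_params)
```

In training, a quasi-Newton optimizer such as BFGS (the algorithm listed in Table S2) would adjust these weights and biases to minimize the prediction error.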
