Application of Regression and Neural Models To Pre
All content following this page was uploaded by Zbigniew Waśkiewicz on 08 January 2018.
http://www.AmSJP.com
Perceptual and Motor Skills, 2012, 114, 2, 610-626. © Perceptual and Motor Skills 2012
two groups of 30 swimmers in May 2010 so that these real results could
be compared to the models’ predictions made using the June 2009 data.
Design
In order to test the hypothesis, multidimensional statistical analyses
were applied to measurements taken in the Model Construction group.
The values of variables measured by means of robust scales and tests were
used in multiple regression models. The research problem was addressed
using empirical and predictive investigation, based on the data obtained
in the form of a multidimensional vector of variables, including inde-
pendent Xn variables and two dependent variables Y1 and Y2. On the basis
of measurement results of 189 swimmers from 2008–2009, mathematical
models were created. Then, in 2009–2010, an additional study was conducted
with the Model Testing group in order to verify the previously created
models. This verification was based on two groups of 30 swimmers, one for
the 50 m (n = 30) and one for the 800 m (n = 30) crawl event.
Measures
Numerous characteristics of the participants were measured as inde-
pendent variables, such as body build, general and specific physical fit-
ness. The dependent variables were the results achieved by the swimmers
in competition at distances of 50 m and 800 m crawl stroke after 12 months
of training.
Kicking. 25 m LL² (sec.) and 50 m LL (sec.): flutter kick with a
kickboard, legs only.
Exercises. 25 m AA (sec.): crawl stroke, arms only, with the board held
between the legs. 800 m ALAS (sec.): crawl stroke alternating legs only,
arms only, and complete stroke over 4 × 200 m (each 200 m = 50 m LL/50 m
stroke/50 m AA/50 m stroke).
Technique. Turn (sec.): whole-body reaction time and spatial orientation
in the flip turn for the crawl stroke. The participant stood 7.5 m from
the turning wall, swam to the wall at maximum speed when signalled,
performed a flip turn, glided, and swam crawl stroke back to the starting
point. Three trials were timed (± .01 sec.) and the best time was
recorded. Starting dive (sec.) off the block followed by the crawl stroke:
the participant stood on the block in the starting position, dove off when
signalled, and swam crawl stroke for 7.5 m. Three trials were timed
(± .01 sec.) and the best time was recorded. Glide test (sec.), used to
assess balance: the participant stood with his back against the turning
wall and, when signalled, performed a 7.5-m glide to swim the crawl
stroke. Three trials were timed (± .01 sec.) and the best result was
recorded.
² Throughout the rest of the manuscript, LL = legs only with kickboard; AA = arms only with
board between legs; ALAS = alternating arms only, legs only, and complete crawl stroke.
614 A. MASZCZYK, et al.
these participants were trained and measures taken in June 2009. The re-
sults of predictions for these two groups were verified by comparing the
model-generated predictions with the actual results achieved by the same
two groups of 30 swimmers in May 2010 (n = 60), see Fig. 1 for details.
[Fig. 1. Study design flow chart (image not reproduced). Recoverable
labels: Group I, N = 189 swimmers (2008–2009), training and testing:
20 variables, 2 outcomes; correlation analysis and log-transformation;
regression model Y1 (50 m) with 4 independent variables and regression
model Y2 (800 m) with 8 independent variables. Group II, N = 60
(2009–2010), 30 sprinters and 30 distance swimmers, testing and
prediction: 4 and 8 variables measured; comparison of predictions (race
times) from (1) regression and (2) neural models for Y1 (50 m) and
Y2 (800 m).]
Analysis
In general, whenever a regression model can be transformed into a
linear model, this is the preferred analytic method for estimating the
respective model. The linear multiple regression model is very well
understood mathematically and, from a pragmatic standpoint, is much easier
to interpret. Therefore, returning to the simple exponential regression
model of Y1 as a function of Xi, one could convert this nonlinear
regression equation into a linear one by simply taking the logarithm of
both sides of the equation, so that $\ln(Y_1) = -b_1 X_i$. If one
substitutes y for ln(Y1), the standard linear regression model results,
as shown earlier without the intercept, which was ignored to simplify
matters. Thus, the Y1 rate data can be log-transformed and then multiple
regression used to estimate the relationship between Xi and Y1, that is,
to compute the regression coefficient b1.
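The log-linearization described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the synthetic data are hypothetical, a single predictor is used, and the intercept is omitted as in the text.

```python
import math

def fit_exponential_through_origin(x, y):
    """Estimate b1 in Y = exp(-b1 * X) by least squares on ln(Y) = -b1 * X."""
    log_y = [math.log(v) for v in y]            # log-transform the outcome
    sxy = sum(xi * ly for xi, ly in zip(x, log_y))
    sxx = sum(xi * xi for xi in x)
    slope = sxy / sxx                           # no-intercept least-squares slope
    return -slope                               # ln(Y) = -b1 * X, so b1 = -slope

# Synthetic, noise-free data generated with b1 = 0.3 (not data from the study).
x_demo = [0.5, 1.0, 1.5, 2.0, 2.5]
y_demo = [math.exp(-0.3 * xi) for xi in x_demo]
b1_hat = fit_exponential_through_origin(x_demo, y_demo)
```

Because the transformed model is linear in b1, the estimate has a closed form and needs no iterative optimizer.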
Means and standard deviations were calculated for all variables. The
Kolmogorov-Smirnov test was performed to verify the normality of the
distributions, and Levene's test to verify the homogeneity of variance.
The scores in trials and tests were used as the explanatory variables
(20 variables). A correlation matrix was calculated for the independent
variables (X1–X20) and the dependent variables (Y1, 50 m; Y2, 800 m).
Using Pearson's coefficients, the set of independent variables for each
model was assembled from those most strongly related to the dependent
variable and least related to each other (Kothari, 2004; Benigno &
Woodford, 2006). Non-linear variables were log-transformed.
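One way to read this screening step is as a greedy filter on Pearson coefficients. The sketch below is an assumption about how such a filter could look, not the procedure used in the study: the two thresholds, the function names, and the toy data are all hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = math.sqrt(sum((u - ma) ** 2 for u in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    return cov / (sa * sb)

def select_predictors(candidates, y, min_r_with_y=0.5, max_r_between=0.8):
    """Keep predictors strongly related to y and weakly related to each other.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    ranked = sorted(candidates,
                    key=lambda k: abs(pearson(candidates[k], y)),
                    reverse=True)
    chosen = []
    for name in ranked:
        if abs(pearson(candidates[name], y)) < min_r_with_y:
            continue                      # too weakly related to the outcome
        if all(abs(pearson(candidates[name], candidates[c])) <= max_r_between
               for c in chosen):
            chosen.append(name)           # not redundant with earlier picks
    return chosen

# Toy data (hypothetical): x_b nearly duplicates x_a, x_c is unrelated to y.
y_demo = [1, 2, 3, 4, 5, 6]
candidates_demo = {
    "x_a": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],
    "x_b": [1.0, 2.5, 2.8, 4.5, 4.6, 6.5],
    "x_c": [3, 1, 2, 3, 1, 2],
    "x_d": [2, 1, 4, 2, 5, 4],
}
selected = select_predictors(candidates_demo, y_demo)
```

In the toy data, `x_b` is rejected as redundant with `x_a` and `x_c` is rejected as unrelated to the outcome, which mirrors the "most related to the dependent variable, least related to each other" rule.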
Multiple stepwise regression was used to select the explanatory variables
offering the best prediction of athletes' results for swim distances of
50 m and 800 m in the Model Construction phase. The selected predictor
variables were log-transformed and used to form the regression models:
four predictors for Y1 (50 m: glide, foot length, body height, 25 m AA)
and eight for Y2 (800 m: 50 m LL, 800 m ALAS, 25 m AA, standing long
jump, AA cycle, hand length, vital lung capacity, Rohrer coefficient);
see Table 1 for details.
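The source does not state the entry criterion used by the stepwise procedure, so the following is a simplified forward-selection sketch. It assumes a plain R² gain threshold (`min_gain`, hypothetical) instead of an F-to-enter test, and the helper names and data are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)                       # normal equations
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(candidates, y, min_gain=0.01):
    """Add the predictor that raises R^2 most; stop when the gain < min_gain."""
    chosen, best_r2 = [], 0.0
    remaining = list(candidates)
    while remaining:
        scores = {name: r_squared([candidates[c] for c in chosen + [name]], y)
                  for name in remaining}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break
        chosen.append(name)
        remaining.remove(name)
        best_r2 = scores[name]
    return chosen, best_r2

# Toy data (hypothetical): y = 2 + 3*x1 - 1.5*x2 exactly; x3 is irrelevant.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7],
    "x3": [1, 0, 0, 1, 1, 0, 1, 0],
}
y_demo = [2 + 3 * a - 1.5 * b for a, b in zip(data["x1"], data["x2"])]
chosen, r2 = forward_stepwise(data, y_demo)
```

A full stepwise procedure would also test whether previously entered variables should be removed; this sketch only adds variables.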
Regression and Neural Network Models
Constructed graphs of the variables indicated nonlinearity. It is well
recognized that any type of statistical inquiry in which principles from
some body of knowledge enter into the analysis is likely to lead to a
nonlinear model. Such models play a very important role in understanding
the complex interrelations among variables. A nonlinear model is one in
which at least one of the parameters enters nonlinearly. More formally,
in a nonlinear model, at least one derivative with respect to a parameter
involves that parameter. In this study the nonlinear models
$Y_1(t) = \exp(a_1 t + b_1 t^2)$ and $Y_2(t) = \exp(a_2 t + b_2 t^2)$
were used, and were verified after being transformed to linear models
using the transformations $X_{n1}(t) = \ln Y_1(t)$ and
$X_{n2}(t) = \ln Y_2(t)$.
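The derivative criterion can be checked directly on the first model; the second model behaves identically with the subscript 2. This is a worked illustration of the definition, using only the symbols defined in the text.

```latex
\begin{align*}
Y_1(t) &= \exp\!\left(a_1 t + b_1 t^2\right), &
\frac{\partial Y_1}{\partial a_1} &= t\,\exp\!\left(a_1 t + b_1 t^2\right)
  \quad\text{(involves } a_1 \text{ and } b_1\text{: nonlinear)},\\
X_{n1}(t) &= \ln Y_1(t) = a_1 t + b_1 t^2, &
\frac{\partial X_{n1}}{\partial a_1} &= t, \quad
\frac{\partial X_{n1}}{\partial b_1} = t^2
  \quad\text{(parameter-free: linear)}.
\end{align*}
```

After the transformation the derivatives no longer contain any parameter, so $a_1$ and $b_1$ can be estimated by ordinary multiple regression on $t$ and $t^2$.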
Prediction of Competitive Swimming Performance 617
Table 1
Regression Statistics for Y1–50 m and Y2–800 m Regression Models
Variable                   β   SE β        B   SE B       t      p
Y1, 50 m crawl race
  Intercept                           147.64   1.52   57.05   .001
  Glide               −.48    .20   −25.54   8.72   −4.45   .002
  Foot length          .90    .20    64.41   8.85    8.26   .001
  Body height         −.74    .20   −53.24   8.26   −7.05   .003
  25 m AA             −.43    .20   −32.43   8.42   −4.08   .001
Y2, 800 m crawl race
  Intercept                            90.43   2.23   82.35   .001
  50 m LL             −.00    .09    −0.27   3.39   −0.05   .008
  Vital lung capacity  .34    .05    11.71   2.79    7.39   .001
  800 m ALAS           .35    .05    11.21   3.09    5.94   .001
  Hand length          .34    .05    10.73   2.96    6.35   .003
  25 m AA             −.53    .08   −18.61   3.98   −6.94   .001
  Rohrer coefficient   .20    .04     5.19   2.35    4.47   .001
  Standing long jump  −.26    .05    −7.23   2.47   −5.12   .002
  The AA technique    −.24    .05    −6.45   2.94   −3.95   .004
Note. Percent variance accounted for: Y1, 50 m swim, R² = .83; Y2, 800 m swim, R² = .89.
For generalization and prediction of swimming results, Multilayer
Perceptron (MLP) neural models were used to model the 50 m and 800
m performances with the following structures: 4-3-1 and 8-4-1 [four and
eight input neurons (variables), respectively, one hidden layer (with three
and four neurons, respectively) and one outcome]. In the Neural Network
Statistica Module (NNSM), the standard procedure is 100 epochs of
training followed by 20 epochs of optimization. During training, the
network improves its performance so as to give smaller errors on the
training and validation sets. When the network becomes overlearned, the
validation error begins to increase. At this point the learning process
must be stopped, and the NNSM automatically rolls the model back to the
values from the optimal epoch (Szaleniec, Witko, Tadeusiewicz, & Goclon,
2006; Szaleniec, Tadeusiewicz, & Witko, 2008; Maszczyk, et al., 2011).
The networks were trained using the Levenberg-Marquardt algorithm.
Training was iterative: in each subsequent epoch of learning, the
thresholds were modified so as to reduce the total network error. The
Levenberg-Marquardt algorithm is most often applied to small networks,
in this case an 8-4-1 network for the 800 m and a 4-3-1 network for the
50 m swim (Fig. 2). A benefit of this algorithm is that the user does not
have to determine the hidden layers: the Neural Model application (with
the Levenberg-Marquardt algorithm) automatically suggests the optimal
number of layers-to-epochs, and the researcher only has to define the
number of epochs. Adding more learning epochs can give a more accurate
model, yet care is needed, because an overlearned network loses the
ability to generalize. In our models, the initial setting was
600 epochs. The level of significance for all analyses was set at p ≤ .05.
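The stop-and-roll-back behaviour described above can be illustrated with a small 4-3-1 perceptron. This sketch makes several assumptions that are not in the source: it uses plain per-sample gradient descent rather than the Levenberg-Marquardt algorithm, tanh hidden units, a hypothetical `patience` setting, and synthetic data. Only the 4-3-1 structure and the 600-epoch ceiling come from the text.

```python
import copy
import math
import random

def init_mlp(n_in, n_hidden, rng):
    """Weights for an n_in - n_hidden - 1 perceptron (last entry of a row = bias)."""
    return {
        "hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)],
        "out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)],
    }

def forward(net, x):
    """Return (hidden activations, network output) for one input vector."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
         for row in net["hidden"]]
    return h, sum(w * v for w, v in zip(net["out"], h)) + net["out"][-1]

def mse(net, xs, ys):
    """Mean squared error of the network on a data set."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

def train_epoch(net, xs, ys, lr):
    """One pass of per-sample gradient descent on squared error (not LM)."""
    for x, y in zip(xs, ys):
        h, out = forward(net, x)
        err = out - y
        # Hidden-layer gradients use the output weights from before this update.
        deltas = [err * net["out"][j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for j, hj in enumerate(h):
            net["out"][j] -= lr * err * hj
        net["out"][-1] -= lr * err
        for j, row in enumerate(net["hidden"]):
            for i, xi in enumerate(x):
                row[i] -= lr * deltas[j] * xi
            row[-1] -= lr * deltas[j]

def fit_with_early_stopping(net, train, valid, max_epochs=600, patience=20, lr=0.05):
    """Train until validation error stops improving, then restore the best weights."""
    best_net, best_epoch = copy.deepcopy(net), 0
    best_err = mse(net, valid[0], valid[1])
    for epoch in range(1, max_epochs + 1):
        train_epoch(net, train[0], train[1], lr)
        err = mse(net, valid[0], valid[1])
        if err < best_err:
            best_net, best_err, best_epoch = copy.deepcopy(net), err, epoch
        elif epoch - best_epoch >= patience:
            break  # validation error no longer improving: stop and roll back
    return best_net, best_err, best_epoch

# Synthetic inputs and targets (hypothetical, not data from the study).
rng = random.Random(0)
xs = [[rng.random() for _ in range(4)] for _ in range(60)]
ys = [0.5 * x[0] - 0.3 * x[1] + 0.2 * x[2] * x[3] for x in xs]
train, valid = (xs[:40], ys[:40]), (xs[40:], ys[40:])
net = init_mlp(4, 3, rng)
start_err = mse(net, valid[0], valid[1])
best_net, best_err, best_epoch = fit_with_early_stopping(net, train, valid)
```

Keeping a copy of the weights from the best validation epoch is what allows the roll-back: the returned network is the one from `best_epoch`, not the one from the last epoch trained.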
Fig. 2. Artificial Neural Network Model for dependent variable Y1 = 50 m with 4-3-1
structure. G = glide, FL = foot length, BH = body height and 25 m AA = arms n cycles/25 m.
Table 2
Regression Statistics of Assessment of 4-3-1 and 8-4-1 Structure
MLP Models for Dependent Variables Y1, 50 m and Y2, 800 m
Table 3
Prediction for Dependent Variables Y1, 50 m (n = 30)
for the validation and test set values were .22 and .17, respectively. This
indicates that the network model fit the data well, because NRMSE
values were low and comparable across the learning, validation, and test
series (Table 2) (Werbos, 1994). The practical usefulness of this model
was further supported by the large correlation coefficients between
independent and dependent variables in each group, also given in Table 2.
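The source does not say how the NRMSE was normalized. As a minimal sketch, the version below divides the root-mean-square error by the range of the observed values, which is one common convention and is an assumption here; the race times are hypothetical.

```python
import math

def nrmse(observed, predicted):
    """Root-mean-square error divided by the range of the observed values."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (max(observed) - min(observed))

# Hypothetical race times in seconds (not data from the study).
actual = [26.0, 27.0, 28.0, 30.0]
model = [26.5, 26.5, 28.5, 29.5]
score = nrmse(actual, model)
```

Computing the same quantity separately on the learning, validation, and test sets and checking that the values are similar is the comparison the paragraph above refers to.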
Table 3 and Table 4 include the results of the verification procedure
Table 4
Prediction for Dependent Variables Y2, 800 m (n = 30)
of the neural networks are negative, denoting that prediction errors were
larger for athletes achieving the fastest race times. The goodness of fit be-
tween the networks and the outcomes predicted for well-performing ath-
letes was very high.
These findings confirmed the conclusions that Edelmann-Nusser,
Hohmann, and Henneberg (2001) formulated and point to the applicability
of perceptron networks in predicting swimming performance. They
demonstrated that nonlinear mathematical models based on Artificial
Neural Networks can be applied to predicting the performance of elite
swimmers at the Olympic Games in Sydney. Their data included results
from the previous nineteen years, and their neural model, with a 9-2-1
structure, was very similar to that presented in this study.
Discrepancies in input variables and hidden layers can result from
differences in the level of training and experience of the swimmers.
The research results indicate that neural models are a useful and
potentially superior tool, offering better optimisation possibilities in
predicting sports results and in athlete recruitment and selection
processes than the widely applied regression models. The analyses and
comparisons obtained through modelling allowed the creation of realistic
models of sports results based on neural network technology, which
predicted future performance reasonably accurately from prior measures
of performance and body characteristics.
The present models were based on objectively measured values from the
samples. Using the Run One-off Case module (Haykin, 1994; Lula &
Tadeusiewicz, 2001; Edelmann-Nusser, et al., 2001; Maszczyk, et al.,
2011), competitive swimmers' results showed good agreement with the
model predictions after one year of training. Multilayer perceptron
(MLP) networks allow multilayer perceptron modeling based on data input
from a keyboard. In the Run One-off Case option, one can modify input
data values and consequently predict outcomes for competitive swimmers.
Further research can refine and develop this approach with other
swimming events and with athletes of different ages and sport levels.
References
Alfredson, H., Nordstrom, P., Pietila, T., & Lorentzon, R. (1999) Bone mass in the
calcaneus after heavy loaded eccentric calf-muscle training in recreational athletes
with chronic achilles tendinosis. Calcified Tissue International, 64, 450-455.
Bartlett, R. M. (2006) Artificial intelligence in sports biomechanics: new dawn or
false hope? Journal of Sports Science and Medicine, 5, 23-42.
Barton, G., & Lees, A. (1993) Development of a connectionist expert system to identify
foot problems based on under foot pressure patterns. Clinical Biomechanics, 10,
31-39.
Benigno, P., & Woodford, M. (2006) Optimal taxation in an RBC model: a linear-quad-
ratic approach. Journal of Economic Dynamics and Control, 30(9-10), 1445-1489.
Bompa, T. O. (2000) Training guidelines for young athletes. Champaign, IL: Human Kinet-
ics. Pp. 1-20.
Counsilman, J. (1977) Swimming power. Swimming World, 10, 33-41.
Deurenberg, P., Weststrate, J. A., & Seidell, J. C. (1991) Body Mass Index as a measure
of body fatness: age- and sex-specific prediction formulas. British Journal of Nutri-
tion, 65, 105-114.
Edelmann-Nusser, J., Hohmann, A., & Henneberg, B. (2001) Prediction of the Olympic
competitive performance in swimming using neural networks. In J. Mester, G.
King, H. Strüder, E. Tsolakidis, & A. Osterburg (Eds.), Book of Abstracts of the 6th
Annual Congress of the European College of Sport Science, Cologne: Nice, France: Eu-
ropean College of Sport Science. P. 328.
Edelmann-Nusser, J., Hohmann, A., & Henneberg, B. (2002) Modeling and prediction
of competitive performance in swimming upon neural networks. European Journal
of Sport Science, 2, 1-10.
Hamilton, D. (2009) Pedagogy and the long course of learning. Pedagogy, Culture, &
Society, 17, 115-121.
Hannula, D., & Thornton, N. (2001) The swim coaching bible. Vol. 1. Champaign, IL:
Human Kinetics.
Haykin, S. (1994) Neural networks: a comprehensive foundation. New York: Macmillan.
Jones, M. K., Padilla, O. R., & Zhu, E. (2010) Survivin is a key factor in the differential
susceptibility of gastric endothelial and epithelial cells to alcohol-induced injury.
Journal of Physiology and Pharmacology, 61, 253-264.
Kothari, C. R. (2004) Research methodology: methods and techniques. New Delhi: New
Age International. Pp. 55-152.
Laughlin, T., & Delves, J. (2004) Total immersion: the revolutionary way to swim better,
faster, and easier. New York: Fireside. Pp. 12-34.
Lees, A. (2002) Technique analysis in sports: a critical review. Journal of Sports Sciences,
20, 813-828.
Lula, P., & Tadeusiewicz, R. (2001) STATISTICA Neural Networks. Krakow, Poland: StatSoft.
Maier, K., Wank, V., Bartonietz, K., & Blickhan, R. (2000) Neural network based mod-
els of javelin flight: prediction of flight distances and optimal release parameters.
Sports Engineering, 3, 57-63.
Maszczyk, A., Zając, A., & Ryguła, I. (2011) A neural network model approach to ath-
lete selection. Sports Engineering, 13(2), 83-93.
Mester, J., & Perl, J. (1999) Unconventional simulation and empirical evaluation of biological
response to complex high training loads. In: P. Parisi, F. Pigozzi, G. Prinzi (Eds.); Sport
Science ´99 in Europe. Pp. 163-175.
Roczniok, R., & Ryguła, I. (2007) The use of Kohonen’s neural networks in the recruit-
ment process for sport swimming. Journal of Human Kinetics, 17, 75-88.
Szaleniec, M., Tadeusiewicz, R., & Witko, M. (2008) The selection of optimal neural
models for forecasting of biological activity of chemical compounds. Neurocomput-
ing, 72, 241-256.
Szaleniec, M., Witko, M., Tadeusiewicz, R., & Goclon, J. (2006) Application of artificial
neural networks and DFT-based parameters for prediction of reaction kinetics of
ethylbenzene dehydrogenase. Journal of Computer-Aided Molecular Design, 20,
145-157.
Werbos, P. (1994) How we cut prediction error in half by using a different training method.
World Congress on Neural Networks, San Diego, Vol. 1, 225-236.
Woodman, T., Zourbanos, N., Hardy, L., Beattie, S., & McQuillan, A. (2010) Do per-
formance strategies moderate the relationship between personality and training
behaviors? An exploratory study. Journal of Applied Sport Psychology, 22, 183-197.
Zadeh, L. (2002) From computing with numbers to computing with words. Interna-
tional Journal of Applied Math and Computer Science, 12, 4-12.
Zehr, E. P. (2005) Neural control of rhythmic human movement: The common core
hypothesis. Exercise & Sport Sciences Reviews, 33, 54-60.