Author's personal copy
Industrial Crops and Products 47 (2013) 232–238
Industrial Crops and Products
journal homepage: www.elsevier.com/locate/indcrop
Phenotypic evaluation of flax seeds by image analysis
Smykalova Iva a,∗, Grillo Oscar b, Bjelkova Marie a, Pavelek Martin a, Venora Gianfranco b
a Agritec Plant Research, Ltd., Zemědělská 16, 787 01 Šumperk, Czech Republic
b Stazione Consorziale Sperimentale di Granicoltura per la Sicilia, Via Sirio 1, 95041 Borgo Santo Pietro – Caltagirone, Italy
Article info
Article history:
Received 3 December 2012
Received in revised form 28 February 2013
Accepted 5 March 2013
Keywords:
Computer vision
Linum spp.
Germplasm characterization
Morphometric and colorimetric measurements
Linear discriminant analysis
Abstract
Linum usitatissimum L., like other fibre plants, has high technical and nutritive value connected to the seed
fatty acid composition. The assessment of seed traits such as colour, size and shape is important in
grading systems as well as for characterizing accessions of core collections. Data on 34 morphometric and
colorimetric features of flax seeds belonging to four Czech varieties tested in five different Czech localities
were used to implement an identification and classification grading system, on the basis of two sample
datasets: a training set to teach the classifier and a separate test set to validate it. The
results suggest that it is possible to discriminate both varieties and localities, and to assess the stability
of any studied flax seed lot.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
Flax cultivation is as old as humanity itself. The oldest evidence of flax cultivation in Europe dates back to the Neolithic (Late Stone Age), about 5000 years B.C. The genus Linum, belonging to the family Linaceae, comprises more than 200 species and is divided into five subsections. The subsection Linum contains the cultivated species L. usitatissimum L. and the ornamentals L. grandiflorum Desf. and L. perenne L., but only the cultivated L. usitatissimum (2n = 30) has economic importance. Flax is an annual, mostly self-pollinating crop, characterized by a relatively short vegetation period of 90–120 days in European conditions (Muir and Wescott, 2001). Since the specific ecology of the grown cultivars (weather, soil and applied agronomic practices) significantly affects their development, growth and morphological diversity, flax cultivation is strongly dependent on, but easily adaptable to, local conditions (regionalization), taking advantage of good resistance to biotic and abiotic environmental stress (Fouilloux, 1988). Currently, the European Union Register holds 67 cultivars of fibre flax. Breeding of flax in the European Union is mainly conducted in France, The Netherlands, the Czech Republic, Poland and Romania. Outside the EU, the most important flax breeding centres are located in Russia, Ukraine, Byelorussia, Canada, China and Egypt. Flax cultivation in the Czech Republic is concentrated in suitable localities with high production potential (fibre flax should be grown in conditions of low temperature and high air humidity), using modern Czech commercial fibre flax varieties such as Bonet (1996), Jitka (1996), Jordán (1998), Tábor (2002), Marylin (2004) and Rina (2009). Flax is also cultivated for nutritional purposes, predominantly for the modified fatty acid composition of its seeds. Czech registered varieties such as Flanders (1996), Lola (1999), Amon and Jantar (2006) are classified as oilseed (linseed) varieties for seed production and nutritional purposes. Following a global trend, the flax area in the Czech Republic decreased drastically from approximately 5500 ha in 2004 to about 150 ha in 2008–2009, despite the cultural tradition and the agronomic and economic importance of this crop in agricultural systems, due to the position of flax on the worldwide market.
∗ Corresponding author. Tel.: +420 583 382 120; fax: +420 583 382 999.
E-mail address: [email protected] (S. Iva).
0926-6690/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.indcrop.2013.03.001
Therefore, preserving the biodiversity of this technical crop, supported by breeding programmes, has become the main goal. Flax international databases, one of which has been managed since 1994 by the private company Agritec Ltd. in Šumperk, Czech Republic (Pavelek, 2004), are important for research and plant breeding purposes, for which visual identification is essential but requires experienced and specialized technicians.
In the last two decades, image analysis has been increasingly applied in plant biology research to characterize, identify and grade varieties of different crops
(Shahin and Symons, 2001; Venora et al., 2007a, 2007b, 2009;
Medina et al., 2010; Grillo et al., 2011; Smýkalová et al., 2011). Only
a few papers address flax or linseed (Wiesnerová and Wiesner,
2008; Pearson, 2010), probably because of the small size of the seeds
(4–6 mm in length and 2–3 mm in width) and their low level of diversity
(Diederichsen and Richards, 2003). Generally, the smooth surface
and the gloss of the seed are related to good quality, maturity and
fitness. The common colour of seeds is brown, with variations in
intensity, but the modern low-linolenic linseed varieties are
often yellow seeded. The Red-Green-Blue (RGB) colour components
of each individual seed in 2D images are features that are easily estimated by image analysis. This sophisticated, non-destructive method
allows sorting within large seed samples.
The aims of the present study were: (1) to characterize four
commercial varieties of flax on the basis of seed shape, size and
colour measurements by computer vision; (2) to evaluate the possible effect of five different Czech localities on morpho-colorimetric
features of the seeds; (3) to develop a specific statistical classifier able to identify and classify the studied varieties and trace the
cultivation localities.
2. Material and methods
2.1. Seed material
Well ripened, cleaned and dried seed samples of four different
flax varieties (L. usitatissimum L.), without significant mechanical
or pest damage, were harvested from plants cultivated in five
different localities in the Czech Republic. Fig. 1 shows the temperature
and rainfall recorded during the crop season compared with long-term
data. Field trials were carried out at the stations Domanínek, Chrastava, Jaroměřice, Lednice and Žatec (Table 1) using two yellow
(Amon and Jantar) and two brown seeded varieties (Flanders and
Lola), kept in the Czech National Flax Collection. The total of
24,534 sampled seeds was split into two sub-samples to obtain a training and a test sample set (Smýkalová et al., 2011).
To guarantee the representativeness of the seed samples and, at
the same time, to minimize the effect of intra-varietal variability in shape,
size and colour of the seeds, due to the seed position inside the
pod and to the position of different pods on the same plant (Harper
et al., 1970), the seeds were randomly assigned to the training
and test sets. The training sample set, comprising 16,361
seeds of the four studied flax varieties, was used to develop the
classifier, while the test sample set, comprising 8173 seeds
of the studied varieties, was used to evaluate the performance of
the developed statistical classifier.
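The random assignment of seeds to the two sets can be illustrated with a minimal Python sketch. The function name `split_seed_indices` and the fixed random seed are assumptions made only for this illustration; the text does not describe the software or procedure actually used for the split. Only the set sizes (16,361 and 8173 out of 24,534 seeds) come from the text.

```python
import numpy as np

def split_seed_indices(n_seeds, n_train, seed=0):
    """Randomly partition seed indices into disjoint training and test sets.

    Hypothetical helper: the fixed `seed` is used only so that the
    illustration is reproducible.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_seeds)  # random order of 0 .. n_seeds-1
    return shuffled[:n_train], shuffled[n_train:]

# Set sizes reported in the text: 16,361 training seeds out of 24,534.
train_idx, test_idx = split_seed_indices(n_seeds=24534, n_train=16361)
print(len(train_idx), len(test_idx))
```

Because every seed index appears exactly once in the permutation, the two sets are disjoint and together cover the whole sample.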
Fig. 1. Long-term (30-year) average temperature and temperature in 2009 (°C)
recorded during the crop season (April–July) for the five localities (A). Long-term
(30-year) average rainfall and rainfall in 2009 (mm) recorded during the crop season
(April–July) for the five localities (B).
2.2. Image acquisition and analysis
Digital images of seed samples were acquired using a flatbed
scanner (Canon 4400F, Canon Inc., Japan), with a digital resolution
of 200 dpi, a colour depth of 24 bit and a scanning area not exceeding
1024 × 1024 pixels. According to Bacchetta et al. (2008), the seeds
were arranged singly on the scanner tray, so that they did not touch each
other. To achieve a perfect detection of the seed contours and avoid
interference from environmental light, two images were acquired for
each seed sample: one with a black background, covering the seed
samples with a box lined with opaque black paper, and the other
with a white background, using a box lined with opaque
white paper. Before image acquisition, the scanner was calibrated
according to Venora et al. (2009) for colour matching, using a Q60
Kodak Colour Input Target (IT8.7/2-1993, Kodak, USA), following
the Shahin and Symons (2003) protocol.
The 34 morpho-colorimetric quantitative variables describing seed size, shape and colour (Table 2) were measured automatically on the
acquired images by applying a macro,
called flaxeed.mcr, specifically developed for the characterization
of flax seeds, using the software package KS-400 V.3.0 (Carl Zeiss
Vision, Oberkochen, Germany) (Grillo et al., 2010).
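As a rough illustration of how some of the size and shape descriptors listed in Table 2 follow from basic measurements, the Python sketch below converts a pixel area to mm² at the 200 dpi scanning resolution and evaluates the shape factor, roundness factor and equivalent circular diameter. The function names are hypothetical, the formulas are those given in Table 2 on the assumption that the symbol lost in extraction is π, and this is not the KS-400 macro itself.

```python
import math

DPI = 200                    # scanner resolution reported in the text
MM_PER_PIXEL = 25.4 / DPI    # 0.127 mm per pixel side

def pixels_to_mm2(area_px):
    """Convert an area measured in pixels to square millimetres."""
    return area_px * MM_PER_PIXEL ** 2

def shape_factor(area, perimeter):
    """Sf = (4 * pi * area) / perimeter^2 (1.0 for a perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2

def roundness_factor(area, d_max):
    """Rf = (4 * area) / (pi * max diameter^2)."""
    return 4 * area / (math.pi * d_max ** 2)

def equivalent_circular_diameter(area):
    """Ecd = diameter of a circle with the same area."""
    return math.sqrt(4 * area / math.pi)

# Illustration with the mean area and maximum diameter reported for Amon
# in Table 3 (7.74 mm2 and 4.67 mm); a descriptor evaluated from means is
# only an approximation of the mean descriptor over individual seeds.
rf_amon = roundness_factor(7.74, 4.67)
ecd_amon = equivalent_circular_diameter(7.74)
print(round(rf_amon, 2), round(ecd_amon, 2))
```

With these inputs the roundness factor and equivalent circular diameter come out close to the tabulated Amon means (0.45 and 3.14 mm). The shape factor is not checked in the same way, because Table 2 does not state which of the three perimeters (P, Pconv or PCrof) enters the formula.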
2.3. Data analysis
Statistical analyses were performed with the software SPSS
release 15 (SPSS Inc. 1999), applying a stepwise Linear Discriminant Analysis (LDA) algorithm. This approach is commonly used to
Table 1
Altitude, climate, soil and geographical position of the cropping localities. Temperature and rainfall are long-term (30-year) average values.
Locality | Production region | Altitude (m a.s.l.) | Temperature (°C) | Rainfall (mm) | Soil (by FAO 1970)
Domanínek | Potato | 572 | 6.5 | 651 | Spodo-dystric Cambisol (medium)
Chrastava | Cereal | 345 | 8.0 | 738 | Orthic-Luvisol (medium)
Jaroměřice n/R | Cereal | 425 | 8.0 | 481 | Orthic-Luvisol (heavy)
Lednice n/M | Maize | 171 | 9.6 | 461 | Haplic Chernozem
Žatec | Sugar | 285 | 9.0 | 439 | Luvi-Haplic Chernozem (heavy)
FAO. 1970: Physical and chemical methods of soil and water analysis. FAO Soils Bulletin No. 10. FAO, Rome.
Table 2
List of 34 morphometric and colorimetric measured features on seeds.
Symbol | Feature | Description
A | Area | Seed area (mm²)
P | Perimeter | Seed perimeter (mm)
Pconv | Convex perimeter | Convex perimeter of the seed (mm)
PCrof | Crofton perimeter | Crofton perimeter of the seed (mm)
Pconv/PCrof | Perimeter ratio | Ratio between convex and Crofton's perimeters
Dmax | Max diameter | Maximum diameter of the seed (mm)
Dmin | Min diameter | Minimum diameter of the seed (mm)
Dmin/Dmax | Feret ratio | Ratio between minimum and maximum diameters
Sf | Shape factor | Seed shape descriptor = (4 × π × area)/perimeter² (normalized value)
Rf | Roundness factor | Seed roundness descriptor = (4 × area)/(π × max diameter²) (normalized value)
Ecd | Eq. circular diameter | Diameter of a circle with equivalent area (mm)
EAmax | Maximum ellipse axis | Maximum axis of an ellipse with equivalent area (mm)
EAmin | Minimum ellipse axis | Minimum axis of an ellipse with equivalent area (mm)
Rmean | Mean red channel | Red channel mean value of seed pixels (grey levels)
Rsd | Red std. deviation | Red channel standard deviation of seed pixels
Gmean | Mean green channel | Green channel mean value of seed pixels (grey levels)
Gsd | Green std. deviation | Green channel standard deviation of seed pixels
Bmean | Mean blue channel | Blue channel mean value of seed pixels (grey levels)
Bsd | Blue std. deviation | Blue channel standard deviation of seed pixels
Hmean | Mean hue channel | Hue channel mean value of seed pixels (grey levels)
Hsd | Hue std. deviation | Hue channel standard deviation of seed pixels
Lmean | Mean lightness channel | Lightness channel mean value of seed pixels (grey levels)
Lsd | Lightness std. deviation | Lightness channel standard deviation of seed pixels
Smean | Mean saturation channel | Saturation channel mean value of seed pixels (grey levels)
Ssd | Saturation std. deviation | Saturation channel standard deviation of seed pixels
Dmean | Mean density | Density channel mean value of seed pixels (grey levels)
Dsd | Density std. deviation | Density channel standard deviation of seed pixels
S | Skewness | Asymmetry degree of intensity values distribution (grey levels)
K | Kurtosis | Peakness degree of intensity values distribution (densitometric units)
H | Energy | Measure of the increasing intensity power (densitometric units)
E | Entropy | Dispersion power (bit)
Dsum | Density sum | Sum of density values of the seed pixels (grey levels)
SqDsum | Square density sum | Sum of the squares of density values (grey levels)
TSW | A thousand seeds mean weight | Mean value of a thousand seeds weight (g)
classify/identify unknown groups characterized by quantitative
and qualitative variables (Fisher, 1936, 1940), finding the combination of predictor variables that simultaneously minimizes the
within-class distance and maximizes the between-class distance,
thus achieving maximum class discrimination
(Hastie et al., 2001; Holden et al., 2011). The best features for
seed sample identification were detected by applying a stepwise LDA
method. The selected features were used to build the canonical
discriminant functions needed to implement statistical classifiers that discriminate and classify the seeds on the basis
of the selected features (Table 2). When several variables are available, the stepwise method is useful because it automatically selects
the best characters on the basis of three statistical variables: Tolerance, F-to-enter and F-to-remove. The Tolerance value indicates the
proportion of a variable's variance not accounted for by the other independent variables in the equation. A variable with a very low Tolerance
value provides little information to a model. F-to-enter and F-to-remove values define the power of each variable in the model and
describe what happens if a variable is inserted into
or removed from the current model, respectively. The method
starts with a model that does not include any of the variables.
At each step, the variable with the largest F-to-enter value that
exceeds the chosen entry criterion (F ≥ 3.84) is added to the model.
The variables left out of the analysis at the last step have F-to-enter values smaller than 3.84, so no more are added. The process
stopped automatically when no remaining variable increased
the discrimination ability (Grillo et al., 2012). To graphically highlight the differences among groups, multidimensional plots were
drawn using the first three of the available canonical discriminant
functions.
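The forward part of the stepwise selection described above can be sketched in Python as follows. This is a simplified illustration, not the SPSS procedure used in the study: it assumes the common Wilks' lambda form of the F-to-enter statistic, F = [(n − g − p)/(g − 1)] × (1 − Λ)/Λ, where n is the number of seeds, g the number of groups, p the number of variables already in the model and Λ the partial Wilks' lambda of the candidate variable. It also omits the Tolerance check and the F-to-remove step. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def sscp_matrices(X, y):
    """Within-group (W) and total (T) sums of squares and cross-products."""
    Xc = X - X.mean(axis=0)
    T = Xc.T @ Xc
    W = np.zeros_like(T)
    for group in np.unique(y):
        Xg = X[y == group]
        Xg = Xg - Xg.mean(axis=0)
        W += Xg.T @ Xg
    return W, T

def wilks_lambda(W, T, idx):
    """Wilks' lambda det(W)/det(T) restricted to the variables in idx."""
    if not idx:
        return 1.0
    sub = np.ix_(idx, idx)
    return np.linalg.det(W[sub]) / np.linalg.det(T[sub])

def forward_stepwise(X, y, f_enter=3.84):
    """Add, one at a time, the variable with the largest F-to-enter >= f_enter."""
    n, m = X.shape
    g = len(np.unique(y))
    W, T = sscp_matrices(X, y)
    selected = []
    while len(selected) < m:
        p = len(selected)
        lam_current = wilks_lambda(W, T, selected)
        best_j, best_f = None, f_enter
        for j in range(m):
            if j in selected:
                continue
            partial = wilks_lambda(W, T, selected + [j]) / lam_current
            f_value = (n - g - p) / (g - 1) * (1 - partial) / partial
            if f_value >= best_f:
                best_j, best_f = j, f_value
        if best_j is None:  # no candidate reaches the entry criterion
            break
        selected.append(best_j)
    return selected

# Synthetic illustration: three groups, only feature 0 separates them.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 50)
X = rng.normal(size=(150, 3))
X[:, 0] += 5.0 * y
selected = forward_stepwise(X, y)
print(selected)
```

In this synthetic case the informative feature (index 0) is the first to enter the model; whether the pure-noise features enter afterwards depends on the random draw, since a noise variable can exceed F = 3.84 by chance.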
3. Results
Morpho-colorimetric analysis allowed the accurate assessment
of seed size, shape and colour of the studied flax varieties. The measured mean and standard deviation values are reported in Table 3.
The seeds of the studied Linum samples are flat and ovoid to ellipsoid in shape, as also shown by the mean values of the shape and roundness
factors (0.82 ± 0.03 and 0.46 ± 0.03, respectively). Seeds of the four
studied varieties are similar in size and shape, as the morphometric feature values show, but a clear difference between the
two yellow (Amon and Jantar) and the two brown seeded varieties
(Flanders and Lola) is evident, as shown by the higher RGB channel values of the yellow seeded varieties compared with the brown
seeded varieties (Table 3).
The data for the 34 measured features were statistically processed using the stepwise Linear Discriminant Analysis,
yielding a statistical classifier able to distinguish the studied
varieties (Table 4). Using this model, 87.0% of the training sample
set was correctly identified, with performances ranging between
80.3% (Jantar) and 90.9% (Flanders) for single varieties. As described
above, the performance of the implemented classifier was tested
using an independent seed group, the test sample set. It showed
nearly the same overall percentage of correct classification (87.1%) and
a similar range of performance for single varieties, between 81.1% (Jantar) and
91.9% (Flanders), compared with the training sample
set classification (Table 4). Fig. 2 shows the graphical representation
based on the first three discriminant functions, highlighting
the differentiation among the four varieties and, at the same time, the
similarity within the two yellow (Amon and Jantar) and within the two
brown seeded varieties (Flanders and Lola).
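The percentages of correct classification quoted above can be recomputed from the seed counts of the test sample set reported in Table 4. The short Python sketch below does this with NumPy; the variable names are illustrative, and cells shown as "–" in the table are assumed to be zero.

```python
import numpy as np

varieties = ["Flanders", "Lola", "Jantar", "Amon"]

# Test sample set counts from Table 4: rows are the true variety,
# columns the variety assigned by the classifier ("–" taken as 0).
counts = np.array([
    [1962,  170,    1,    2],   # Flanders
    [ 266, 1844,    1,    2],   # Lola
    [   0,    0, 1548,  360],   # Jantar
    [   0,    0,  249, 1768],   # Amon
])

row_totals = counts.sum(axis=1)
per_variety = 100 * np.diag(counts) / row_totals   # % correct per variety
overall = 100 * np.trace(counts) / counts.sum()    # % correct overall

for name, pct in zip(varieties, per_variety):
    print(f"{name}: {pct:.2f}%")
print(f"Overall: {overall:.2f}%")
```

The off-diagonal counts also show the pattern described in the text: misclassifications occur almost exclusively within the brown pair (Flanders and Lola) or within the yellow pair (Jantar and Amon).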
Table 3
The measured mean and standard deviation values of features for four flax varieties: Amon, Jantar, Flanders, Lola.
Feature | Amon mean | Amon s.d. | Flanders mean | Flanders s.d. | Jantar mean | Jantar s.d. | Lola mean | Lola s.d.
A | 7.74 | 0.69 | 7.92 | 0.74 | 8.10 | 0.80 | 8.22 | 0.76
P | 11.57 | 0.50 | 11.73 | 0.53 | 11.76 | 0.58 | 11.81 | 0.53
Pconv | 11.28 | 0.47 | 11.43 | 0.50 | 11.46 | 0.54 | 11.51 | 0.50
PCrof | 10.97 | 0.47 | 11.12 | 0.50 | 11.15 | 0.55 | 11.20 | 0.50
Pconv/PCrof | 1.03 | 0.01 | 1.03 | 0.01 | 1.03 | 0.01 | 1.03 | 0.01
Dmax | 4.67 | 0.20 | 4.71 | 0.22 | 4.69 | 0.24 | 4.70 | 0.22
Dmin | 2.31 | 0.14 | 2.36 | 0.15 | 2.42 | 0.17 | 2.45 | 0.16
Dmin/Dmax | 0.50 | 0.03 | 0.50 | 0.03 | 0.52 | 0.03 | 0.52 | 0.03
Sf | 0.81 | 0.03 | 0.80 | 0.03 | 0.82 | 0.03 | 0.82 | 0.03
Rf | 0.45 | 0.03 | 0.45 | 0.03 | 0.47 | 0.03 | 0.47 | 0.03
Ecd | 3.14 | 0.14 | 3.17 | 0.15 | 3.21 | 0.16 | 3.23 | 0.15
EAmax | 2.26 | 0.10 | 2.28 | 0.11 | 2.27 | 0.12 | 2.26 | 0.11
EAmin | 1.09 | 0.07 | 1.11 | 0.07 | 1.14 | 0.08 | 1.16 | 0.07
Rmean (a) | 205.43 | 11.06 | 140.11 | 12.51 | 206.48 | 11.15 | 142.54 | 12.85
Rsd (b) | 36.66 | 4.09 | 33.85 | 6.52 | 36.31 | 3.75 | 31.58 | 6.48
Gmean (a) | 174.59 | 10.12 | 119.21 | 9.78 | 176.78 | 9.91 | 117.61 | 9.07
Gsd (b) | 31.10 | 3.80 | 26.69 | 5.38 | 30.96 | 3.65 | 24.69 | 5.18
Bmean (a) | 137.26 | 7.41 | 117.31 | 7.20 | 136.98 | 6.97 | 115.58 | 6.80
Bsd (b) | 21.33 | 3.13 | 19.28 | 3.27 | 21.03 | 3.13 | 17.95 | 2.96
Hmean (a) | 24.37 | 2.14 | 87.51 | 38.35 | 24.96 | 1.82 | 81.09 | 34.77
Hsd (b) | 17.93 | 8.72 | 16.81 | 17.61 | 15.65 | 7.88 | 98.89 | 16.62
Lmean (a) | 171.08 | 8.90 | 16.81 | 9.77 | 171.47 | 8.69 | 128.03 | 9.74
Lsd (b) | 28.15 | 3.30 | 26.59 | 4.77 | 27.79 | 3.17 | 24.66 | 4.64
Smean (a) | 128.04 | 23.65 | 35.64 | 7.29 | 130.26 | 23.65 | 38.85 | 7.19
Ssd (b) | 65.47 | 10.66 | 20.89 | 5.77 | 65.87 | 10.57 | 20.10 | 4.72
Dmean (a) | 172.43 | 9.29 | 125.54 | 9.50 | 173.41 | 9.07 | 125.25 | 9.27
Dsd (b) | 41.36 | 2.99 | 29.45 | 4.61 | 41.59 | 2.83 | 28.54 | 4.09
S | -0.85 | 0.37 | 0.11 | 0.37 | -0.85 | 0.39 | 0.09 | 0.38
K | 0.42 | 0.89 | -0.45 | 0.64 | 0.41 | 0.95 | -0.25 | 0.69
H | 0.01 | 0.00 | 0.01 | 0.01 | 0.01 | 0.00 | 0.02 | 0.01
E | 6.04 | 0.12 | 5.77 | 0.23 | 6.07 | 0.12 | 5.71 | 0.22
Dsum | 83,432.45 | 9814.08 | 62,113.23 | 7925.01 | 87,765.61 | 10,572.29 | 64,316.91 | 8190.33
SqDsum | 15,280,789.65 | 2,404,478.06 | 8,289,304.02 | 1,514,022.32 | 16,155,620.99 | 2,492,143.47 | 8,535,698.97 | 1,536,122.97
TSW | 5.833 | 0.416 | 5.803 | 0.270 | 6.309 | 0.551 | 6.065 | 0.291
(a) Grey levels (range 0–255).
(b) s.d. = grey level standard deviation (range 0–255).
Applying the same statistical model, the four varieties were compared within each of the
five Czech production localities in order to evaluate the effect of locality. The
training and the test sample sets show the same trend, highlighting
once again the clear distinction between the two yellow (Amon
and Jantar) and the two brown seeded varieties (Flanders and Lola)
in each studied production locality (Table 5). With the exception
of the variety Jantar grown in Lednice (45.1% and 44.6% for the
training and the test sample set, respectively), the four flax varieties
showed high percentages of correct identification in all the localities, between 70.5% (Amon grown in Jaroměřice) and
99.3% (Amon grown in Lednice) for the training sample set and
between 67.7% (Amon grown in Jaroměřice) and 99.3% (Amon
grown in Lednice) for the test sample set (Table 5). However, among
the four tested flax varieties, the extensive variety Flanders showed
the highest percentage of overall correct identification, both in the
Table 4
Varietal discrimination independently of the cropping localities. Percentage and amount of seeds (in parentheses) of the training and test sample sets used for the variety identification.
Training sample set
Variety | Flanders | Lola | Jantar | Amon | Total
Flanders | 90.92 (3877) | 9.03 (385) | 0.02 (1) | 0.02 (1) | 100.0 (4264)
Lola | 11.27 (477) | 88.71 (3754) | – | 0.02 (1) | 100.0 (4232)
Jantar | – | – | 80.30 (3077) | 19.68 (754) | 100.0 (3832)
Amon | 0.02 (1) | – | 12.37 (499) | 87.60 (3533) | 100.0 (4033)
Overall | | | | | 87.00 (16,361)
Test sample set
Variety | Flanders | Lola | Jantar | Amon | Total
Flanders | 91.90 (1962) | 7.96 (170) | 0.05 (1) | 0.09 (2) | 100.0 (2135)
Lola | 12.59 (266) | 87.27 (1844) | 0.05 (1) | 0.09 (2) | 100.0 (2113)
Jantar | – | – | 81.13 (1548) | 18.87 (360) | 100.0 (1908)
Amon | – | – | 12.35 (249) | 87.65 (1768) | 100.0 (2017)
Overall | | | | | 87.14 (8173)
Percentage and amount of seeds (in parentheses) of correct identification are reported in bold.
Table 5
Varietal identification distinguished for cropping localities. Percentage and amount of seeds (in parentheses) of the training and test sample sets used for variety identification.
Training sample set
Variety | Locality | Flanders | Lola | Jantar | Amon | Total
Flanders | Domanínek | 93.70 (669) | 6.30 (45) | – | – | 100.00 (714)
Flanders | Chrastava | 91.96 (732) | 7.91 (63) | 0.13 (1) | – | 100.00 (796)
Flanders | Jaroměřice | 87.94 (744) | 11.94 (101) | – | 0.12 (1) | 100.00 (846)
Flanders | Lednice | 94.43 (967) | 5.57 (57) | – | – | 100.00 (1024)
Flanders | Žatec | 86.54 (765) | 13.46 (119) | – | – | 100.00 (884)
Lola | Domanínek | 11.11 (82) | 88.89 (656) | – | – | 100.00 (738)
Lola | Chrastava | 5.97 (55) | 94.03 (867) | – | – | 100.00 (922)
Lola | Jaroměřice | 9.15 (71) | 90.72 (704) | – | 0.13 (1) | 100.00 (776)
Lola | Lednice | 23.16 (223) | 76.84 (740) | – | – | 100.00 (963)
Lola | Žatec | 5.52 (46) | 94.48 (787) | – | – | 100.00 (833)
Jantar | Domanínek | – | – | 97.26 (674) | 2.74 (19) | 100.00 (693)
Jantar | Chrastava | 0.14 (1) | – | 94.09 (684) | 5.78 (42) | 100.00 (727)
Jantar | Jaroměřice | – | – | 96.98 (738) | 3.02 (23) | 100.00 (761)
Jantar | Lednice | – | – | 45.10 (405) | 54.90 (493) | 100.00 (898)
Jantar | Žatec | – | – | 76.49 (576) | 23.51 (177) | 100.00 (753)
Amon | Domanínek | – | – | 18.51 (124) | 81.49 (546) | 100.00 (670)
Amon | Chrastava | – | – | 14.27 (124) | 85.73 (745) | 100.00 (869)
Amon | Jaroměřice | 0.13 (1) | – | 29.40 (222) | 70.46 (532) | 100.00 (755)
Amon | Lednice | – | – | 0.66 (6) | 99.34 (909) | 100.00 (915)
Amon | Žatec | – | – | 2.79 (23) | 97.21 (801) | 100.00 (824)
Test sample set
Variety | Locality | Flanders | Lola | Jantar | Amon | Total
Flanders | Domanínek | 95.80 (342) | 4.20 (15) | – | – | 100.00 (357)
Flanders | Chrastava | 89.97 (359) | 9.52 (38) | 0.25 (1) | 0.25 (1) | 100.00 (399)
Flanders | Jaroměřice | 88.71 (377) | 11.29 (48) | – | – | 100.00 (425)
Flanders | Lednice | 96.10 (493) | 3.90 (20) | – | – | 100.00 (513)
Flanders | Žatec | 88.66 (391) | 11.11 (49) | – | 0.23 (1) | 100.00 (441)
Lola | Domanínek | 9.21 (34) | 90.79 (335) | – | – | 100.00 (369)
Lola | Chrastava | 10.41 (48) | 89.37 (412) | – | 0.22 (1) | 100.00 (461)
Lola | Jaroměřice | 13.18 (51) | 86.30 (334) | 0.26 (1) | 0.26 (1) | 100.00 (387)
Lola | Lednice | 23.91 (115) | 76.09 (366) | – | – | 100.00 (481)
Lola | Žatec | 4.34 (18) | 95.66 (397) | – | – | 100.00 (415)
Jantar | Domanínek | – | – | 98.24 (335) | 1.76 (6) | 100.00 (341)
Jantar | Chrastava | – | – | 95.87 (348) | 4.13 (15) | 100.00 (363)
Jantar | Jaroměřice | – | – | 98.68 (375) | 1.32 (5) | 100.00 (380)
Jantar | Lednice | – | – | 44.64 (200) | 55.36 (248) | 100.00 (448)
Jantar | Žatec | – | – | 77.13 (290) | 22.87 (86) | 100.00 (376)
Amon | Domanínek | – | – | 20.30 (68) | 79.70 (267) | 100.00 (335)
Amon | Chrastava | – | – | 10.34 (45) | 89.66 (390) | 100.00 (435)
Amon | Jaroměřice | – | – | 32.01 (121) | 67.72 (256) | 100.00 (378)
Amon | Lednice | – | – | 0.66 (3) | 99.34 (453) | 100.00 (456)
Amon | Žatec | – | – | 2.91 (12) | 97.09 (401) | 100.00 (413)
Percentage and amount of seeds (in parentheses) of correct identification are reported in bold.
Fig. 2. Variety identification independent of cropping location. 3D graphic representation of discriminant scores: Amon, Jantar, Flanders, Lola.
training set (90.9%) and in the test set (91.9%), while Jantar reached
the lowest overall identification percentages (80.3% and 81.1% in
the training and the test set, respectively).
To assess the varietal stability, a further comparison among the
production localities was carried out separately for each variety
(Table 6). In this case too, the training and the test sample sets
showed the same trend. The two brown seeded varieties (Flanders and Lola) achieved high percentages of correct identification
in all the production localities, with test sample set performance
ranging from 64.7% to 91.7% for Flanders, and from 74.9% to 96.1%
for Lola. Although Jantar grown in Domanínek, Lednice and Žatec
reached high test sample set percentages of correct identification
(97.4%, 86.4% and 71.0%, respectively), the seeds grown in Chrastava and Jaroměřice were largely misattributed, mainly to
Domanínek, achieving test set correct discrimination
percentages of only 22.9% and 16.9%, respectively. Similarly, the seeds
of the variety Amon grown in Chrastava, Lednice and Žatec were well
identified (78.4%, 79.6% and 86.7%, respectively), while those grown
in Domanínek and Jaroměřice reached test set performances not
exceeding 47.4% (Amon in Jaroměřice) (Table 6).
Table 6
Locality identification distinguished for variety. Percentage and amount of seeds (in parentheses) of the training and test sample sets used for locality identification.
Training sample set
Variety | Locality | Domanínek | Chrastava | Jaroměřice | Lednice | Žatec | Total
Flanders | Domanínek | 63.73 (455) | 30.39 (217) | 5.46 (39) | 0.42 (3) | – | 100.00 (714)
Flanders | Chrastava | 4.02 (32) | 93.22 (742) | 1.26 (10) | 1.51 (12) | – | 100.00 (796)
Flanders | Jaroměřice | 4.14 (35) | 22.81 (193) | 67.26 (569) | 5.67 (48) | 0.12 (1) | 100.00 (846)
Flanders | Lednice | – | – | – | 89.55 (917) | 10.45 (107) | 100.00 (1024)
Flanders | Žatec | – | – | – | 25.34 (224) | 74.66 (660) | 100.00 (884)
Lola | Domanínek | 85.23 (629) | 10.70 (79) | 4.07 (30) | – | – | 100.00 (738)
Lola | Chrastava | 1.19 (11) | 89.48 (825) | 8.79 (81) | 0.54 (5) | – | 100.00 (922)
Lola | Jaroměřice | 1.68 (13) | 0.64 (5) | 97.16 (754) | 0.26 (2) | 0.26 (2) | 100.00 (776)
Lola | Lednice | – | 5.50 (53) | 1.77 (17) | 81.93 (789) | 10.80 (104) | 100.00 (963)
Lola | Žatec | – | 0.48 (4) | 0.48 (4) | 21.37 (178) | 77.67 (647) | 100.00 (833)
Jantar | Domanínek | 98.85 (685) | – | 1.15 (8) | – | – | 100.00 (693)
Jantar | Chrastava | 56.53 (411) | 20.22 (147) | 23.25 (169) | – | – | 100.00 (727)
Jantar | Jaroměřice | 81.87 (623) | 5.26 (40) | 12.88 (98) | – | – | 100.00 (761)
Jantar | Lednice | – | – | – | 84.52 (759) | 15.48 (139) | 100.00 (898)
Jantar | Žatec | – | 0.53 (4) | 0.13 (1) | 26.16 (197) | 73.17 (551) | 100.00 (753)
Amon | Domanínek | 32.09 (215) | 57.61 (388) | 10.0 (67) | – | – | 100.00 (670)
Amon | Chrastava | 0.92 (8) | 79.75 (693) | 17.15 (146) | 1.27 (11) | 0.92 (8) | 100.00 (869)
Amon | Jaroměřice | 0.79 (6) | 27.42 (207) | 47.81 (361) | 19.87 (150) | 4.11 (31) | 100.00 (755)
Amon | Lednice | – | – | – | 79.56 (728) | 20.44 (187) | 100.00 (915)
Amon | Žatec | – | – | – | 10.80 (89) | 89.20 (735) | 100.00 (824)
Test sample set
Variety | Locality | Domanínek | Chrastava | Jaroměřice | Lednice | Žatec | Total
Flanders | Domanínek | 64.71 (231) | 31.93 (114) | 3.08 (11) | 0.28 (1) | – | 100.00 (357)
Flanders | Chrastava | 5.01 (20) | 91.73 (366) | 2.01 (8) | 1.25 (5) | – | 100.00 (399)
Flanders | Jaroměřice | 3.76 (16) | 23.76 (101) | 66.82 (284) | 5.41 (23) | 0.24 (1) | 100.00 (425)
Flanders | Lednice | – | – | 0.19 (1) | 87.72 (450) | 12.09 (62) | 100.00 (513)
Flanders | Žatec | – | – | – | 24.94 (110) | 75.06 (331) | 100.00 (441)
Lola | Domanínek | 84.82 (313) | 10.30 (38) | 4.88 (18) | – | – | 100.00 (369)
Lola | Chrastava | 3.47 (16) | 82.00 (378) | 14.32 (66) | 0.22 (1) | – | 100.00 (461)
Lola | Jaroměřice | 3.36 (13) | 0.52 (2) | 96.12 (372) | – | – | 100.00 (387)
Lola | Lednice | – | 3.53 (17) | 1.87 (9) | 83.16 (400) | 11.43 (55) | 100.00 (481)
Lola | Žatec | – | – | 0.72 (3) | 24.34 (101) | 74.94 (311) | 100.00 (415)
Jantar | Domanínek | 97.36 (332) | – | 2.69 (9) | – | – | 100.00 (341)
Jantar | Chrastava | 57.02 (207) | 22.87 (83) | 20.11 (73) | – | – | 100.00 (363)
Jantar | Jaroměřice | 78.95 (300) | 4.21 (16) | 16.84 (64) | – | – | 100.00 (380)
Jantar | Lednice | – | – | – | 86.38 (387) | 13.62 (61) | 100.00 (448)
Jantar | Žatec | – | – | – | 28.99 (109) | 71.01 (267) | 100.00 (376)
Amon | Domanínek | 36.42 (122) | 55.52 (186) | 8.06 (27) | – | – | 100.00 (335)
Amon | Chrastava | 0.92 (4) | 78.39 (341) | 19.54 (85) | 0.69 (3) | 0.46 (2) | 100.00 (435)
Amon | Jaroměřice | – | 30.42 (115) | 47.35 (179) | 17.99 (68) | 4.23 (16) | 100.00 (378)
Amon | Lednice | – | – | – | 79.61 (363) | 20.39 (93) | 100.00 (456)
Amon | Žatec | – | – | – | 13.22 (55) | 86.68 (358) | 100.00 (413)
Percentage and amount of seeds (in parentheses) of correct identification are reported in bold.
4. Discussion
For many plant species, seed features play an important role,
mainly in varietal identification (Grillo et al., 2011). The low level of
phenotypic variability among the studied flax varieties has been repeatedly observed. Everaert et al. (2001), Fu et al. (2002) and Smykal
et al. (2011) illustrated the ability to distinguish two contrasting flax
varieties. Similarly, Wiesnerová and Wiesner (2008) and Pearson
(2010), applying image analysis techniques, were able to discriminate between brown and yellow flax on the basis of red, green and
blue mean values.
In this study, four Czech commercial varieties of flax were characterized on the basis of seed shape, size and colour measured by
computer vision methods. The data obtained were used to implement a specific statistical classifier able to identify and classify the
studied varieties and trace the cultivation localities.
Considering the remarkable visual resemblance among the
seeds of the studied varieties, the results obtained can be considered sufficient to support the identification of seed lots, above
all when compared with those achieved for other similar crops (Venora
et al., 2007a, 2007b; Smýkalová et al., 2011; Grillo et al., 2011). The
effect of the cultivation region, as well as of soil, climatic and geographic characteristics, is clearly evident in flax seed shape,
size and colour, and a relationship with the flax variety seems to
exist. In particular, the high performances achieved by the studied
varieties in Žatec and Lednice suggest that these localities
are particularly well suited to flax compared with the others. On the other hand,
the genotype × environment interaction in the expression of seed
shape, size and colour is well known (Nieto-Ángel et al., 2009;
Medina et al., 2010; Smýkalová et al., 2011).
5. Conclusions
This work made it possible to investigate the relationships among flax varieties and the cropping environment through seed morpho-colorimetric
characterization using an image analysis system. Once again, it
was possible to show that an objective, reliable and repeatable
computer-aided identification system can be effectively applied
to flax seeds as well.
A further step of this work will be to improve the classifier
by adding seed samples from different cropping years, in order to evaluate
the effect of climatic and thermo-pluviometric conditions on
the expression of seed traits and the consequences of seed storage conditions for seed colour changes.
Acknowledgement
This work was financially supported by the grant No.
MSM2678424601 of the Ministry of Education of CR.
References
Bacchetta, G., Grillo, O., Mattana, E., Venora, G., 2008. Morpho-colorimetric characterization by image analysis to identify diaspores of wild plant species. Flora
203 (8), 669–682.
Diederichsen, A., Richards, K., 2003. The seed. In: Muir, D.A., Westcott, N.D.
(Eds.), Flax, the Genus Linum. Agriculture and Agri-Food Canada, Saskatoon,
Saskatchewan, Canada, p. 306.
Everaert, I., De Riek, J., De Loose, M., Van Waes, J., Van Bockstaele, E., 2001. Most
similar variety grouping for distinctness evaluation of flax and linseed (Linum
usitatissimum L.) varieties by means of AFLP and morphological data. Plant Var.
Seeds 4, 69–87.
Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann.
Eugen. 7, 179–188.
Fisher, R.A., 1940. The precision of discriminant functions. Ann. Eugen. 10 (4),
422–429.
Fouilloux, G., 1988. Breeding flax methods. In: Flax: Breeding and Utilisation. Kluwer
Academic Publishers, Dordrecht/Boston/London, 14–25.
Fu, Y.-B., Diederichsen, A., Richards, K.W., Peterson, G., 2002. Genetic diversity
within a range of cultivars and landraces of flax (Linum usitatissimum L.) as
revealed by RAPDs. Genet. Resour. Crop Evol. 49, 167–174.
Grillo, O., Mattana, E., Venora, G., Bacchetta, G., 2010. Statistical seed classifiers of
10 plant families representative of the Mediterranean vascular flora. Seed Sci.
Technol. 38 (2), 455–476.
Grillo, O., Miceli, C., Venora, G., 2011. Image analysis tool for Vetch varieties identification by seeds inspection. Seed Sci. Technol. 39 (2), 490–500.
Grillo, O., Draper, D., Venora, G., Martínez-Laborde, J.B., 2012. Seed image analysis
and taxonomy of Diplotaxis DC. (Brassicaceae Brassiceae). Systemat. Biodivers.
10 (1), 57–70.
Harper, J.L., Lovell, P.H., Moore, K.G., 1970. The shapes and sizes of seeds. Annu. Rev.
Ecol. Systemat. 1, 327–356.
Hastie, T., Tibshirani, R., Friedman, J., 2001. The Elements of Statistical Learning:
Data Mining, Inference, and Prediction. Springer, New York, USA, p. 745.
Holden, J.E., Finch, W.H., Kelly, K., 2011. A comparison of two-group classification
methods. Educ. Psychol. Meas. 71 (5), 870–901.
Medina, W., Skurtys, O., Aguilera, J.M., 2010. Study on image analysis application for identification Quinoa seeds (Chenopodium quinoa Willd) geographical
provenance. LWT – Food Sci. Technol. 43 (2), 238–246.
Muir, A.D., Wescott, N.D., 2001. Flax - the genus Linum. Harwood Acad. Publ.,
Amsterdam, 22–54.
Nieto-Ángel, R., Pérez-Ortega, S.A., Núñez-Colín, C.A., Martínez-Solìs, J., González-Andrés, F., 2009. Seed and endocarp traits as markers of the biodiversity of
regional sources of germplasm of tejocote (Crataegus spp.) from Central and
Southern Mexico. Sci. Hortic. – Amsterdam 121, 166–170.
Pavelek, M., 2004. Recent development of the International flax database as the
result of an ECP/GR initiative. IPGRI Newsletter for Europe, No. 28, June 2004.
p. 9.
Pearson, T., 2010. High speed sorting of grains by color and surface texture. Appl.
Eng. Agric. 26 (3), 499–505.
Shahin, M.A., Symons, S.J., 2001. A machine vision system for grading lentils. Can.
Biosyst. Eng. 43, 7.7–7.14.
Shahin, M.A., Symons, S.J., 2003. Colour calibration of scanners for scanner-independent grain grading. Cereal Chem. 80, 285–289.
Smykal, P., Bacova-Kerteszova, N., Kalendar, R., Corander, J., Schulman, A.H.,
Pavelek, M., 2011. Genetic diversity of cultivated flax (Linum usitatissimum
L.) germplasm assessed by retrotransposon-based markers. Theor. Appl. Genet.
122, 1385–1397, DOI 10.1007/s00122-011-1539-2.
Smýkalová, I., Grillo, O., Bjelkova, M., Hybl, M., Venora, G., 2011. Morpho-colorimetric traits of Pisum seeds measured by an image analysis system. Seed
Sci. Technol. 39, 612–626.
Venora, G., Grillo, O., Shahin, M.A., Symons, S.J., 2007a. Identification of Sicilian
landraces and Canadian cultivars of lentil using an image analysis system. Food
Res. Int. 40, 161–166.
Venora, G., Grillo, O., Ravalli, C., Cremonini, R., 2007b. Tuscany beans landraces,
on-line identification from seed inspection by image analysis and linear discriminant analysis. Agrochimica 51 (4/5), 254–268.
Venora, G., Grillo, O., Saccone, R., 2009. Quality assessment of durum wheat storage
centres in Sicily: evaluation of vitreous, starchy and shrunken kernels using an
image analysis system. J. Cereal Sci. 49, 429–440.
Wiesnerová, D., Wiesner, I., 2008. Computer image analysis of seed shape
and seed color for flax cultivar description. Comput. Electron. Agric. 61,
126–135.