
Phenotypic evaluation of flax seeds by image analysis


Industrial Crops and Products 47 (2013) 232–238
Journal homepage: www.elsevier.com/locate/indcrop

Smykalova Iva a,∗, Grillo Oscar b, Bjelkova Marie a, Pavelek Martin a, Venora Gianfranco b

a Agritec Plant Research, Ltd., Zemědělská 16, 787 01 Šumperk, Czech Republic
b Stazione Consorziale Sperimentale di Granicoltura per la Sicilia, Via Sirio 1, 95041 Borgo Santo Pietro – Caltagirone, Italy

Article history: Received 3 December 2012; received in revised form 28 February 2013; accepted 5 March 2013

Keywords: Computer vision; Linum spp.; Germplasm characterization; Morphometric and colorimetric measurements; Linear discriminant analysis

Abstract

Linum usitatissimum L., as other fibre plants, has high technical and nutritive value connected to the seed fatty acid composition. The assessment of some seed aspects such as colour, size and shape is important in grading system as well as to characterize accessions of core collections.
The data of 34 morphometric and colorimetric features of flax seeds belonging to four Czech varieties, tested in five different Czech localities, were used to implement an identification and classification grading system on the basis of two sample datasets: a training set to teach the classifier and a separate test set to validate it. The results suggested that it is possible to discriminate both varieties and localities, and to assess the stability of any studied flax seed lot.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

Flax cultivation is as old as humanity itself. The oldest findings about flax cultivation in Europe date back to the Neolithic (Late Stone Age), about 5 thousand years B.C. The genus Linum, belonging to the family Linaceae, with more than 200 species, is well known and is divided into five subsections. The subsection Linum contains the cultivated species L. usitatissimum L. and the ornamentals L. grandiflorum Desf. and L. perenne L., but only cultivated L. usitatissimum (2n = 30) has economic importance. Flax is an annual, mostly self-pollinating crop, characterized by a relatively short vegetation period of 90–120 days in European conditions (Muir and Wescott, 2001). Since the specific ecology of grown cultivars (weather, soil and applied agronomic practices) significantly affects their development, growth and morphological diversity, flax cultivation is strongly dependent on, but easily adaptable to, local conditions (regionalization), taking advantage of a good resistance to environmental biotic and abiotic stress (Fouilloux, 1988). Currently, the European Union Register holds 67 cultivars of fibre flax. Breeding of flax in the European Union is mainly conducted in France, The Netherlands, the Czech Republic, Poland and Romania. Outside the EU, the most important flax breeding centres are located in Russia, Ukraine, Byelorussia, Canada, China and Egypt.
∗ Corresponding author. Tel.: +420 583 382 120; fax: +420 583 382 999. E-mail address: [email protected] (S. Iva). 0926-6690/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.indcrop.2013.03.001

However, in the Czech Republic the cultivation of modern commercial fibre flax varieties such as Bonet (1996), Jitka (1996), Jordán (1998), Tábor (2002), Marylin (2004) and Rina (2009) is concentrated in suitable localities with high production potential (fibre flax should be grown in conditions of low temperature and high air humidity). Flax is also cultivated for nutritional purposes, predominantly for the modified fatty acid composition of its seeds. The Czech registered varieties Flanders (1996), Lola (1999), Amon and Jantar (2006) are classified as oilseed (linseed) varieties for seed production and nutritional purposes. Following a global trend, the flax area in the Czech Republic decreased drastically from approximately 5500 ha in 2004 to about 150 ha in 2008–2009, despite the cultural tradition and the agronomic and economic importance of this crop in agricultural systems due to the position of flax on the worldwide market. Therefore, the biodiversity preservation of this technical crop, supported by breeding programmes, became a main goal. The international flax database, managed since 1994 by the private company Agritec Ltd. in Šumperk, Czech Republic (Pavelek, 2004), is important for research and plant breeding purposes, for which visual identification is essential but requires experienced and specialized technicians. In the last two decades, image analysis applications have increased remarkably in plant biology research, where they are used to characterize, identify and grade varieties of different crops (Shahin and Symons, 2001; Venora et al., 2007a, 2007b, 2009; Medina et al., 2010; Grillo et al., 2011; Smýkalová et al., 2011).
Only a few papers are known for flax or linseed (Wiesnerová and Wiesner, 2008; Pearson, 2010), probably due to the small size of the seeds (4–6 mm in length and 2–3 mm in width) and their low level of diversity (Diederichsen and Richards, 2003). Generally, the smooth surface and the gloss of the seed are related to good quality, maturity and fitness. The common seed colour is brown with variations in intensity, but the modern low-linolenic linseed varieties are often yellow seeded. The Red-Green-Blue (RGB) colour components of each individual seed in 2D images are features that are easily estimated by image analysis. This sophisticated non-destructive method allows sorting within large seed samples.

The aims of the present study were: (1) to characterize four commercial varieties of flax on the basis of seed shape, size and colour measurements by computer vision; (2) to evaluate the possible effect of five different Czech localities on the morpho-colorimetric features of the seeds; (3) to develop a specific statistical classifier able to identify and classify the studied varieties and trace the cultivation localities.

2. Material and methods

2.1. Seed material

Well-ripened, cleaned and dried seed samples of four different flax varieties (L. usitatissimum L.), without significant mechanical or pest damage, were harvested from plants cultivated in five different localities in the Czech Republic. Fig. 1 shows the temperature and rainfall recorded during the crop season compared with long-term data. Field trials were carried out at the stations Domanínek, Chrastava, Jaroměřice, Lednice and Žatec (Table 1) using two yellow-seeded (Amon and Jantar) and two brown-seeded varieties (Flanders and Lola), kept in the Czech National Flax Collection. The total of 24,534 sampled seeds was split into two sub-samples to obtain a training and a test sample set (Smýkalová et al., 2011).
To guarantee the representativeness of the seed samples and, at the same time, to minimize the intra-varietal variability of shape, size and colour of the seeds due to the seed position inside the pod and to the position of different pods on the same plant (Harper et al., 1970), the samples were randomly assigned to the training and test sets. The training sample set, consisting of 16,361 seeds of the four studied flax varieties, was used to develop the classifier, while the test sample set, consisting of 8173 seeds of the same varieties, was used to evaluate the performance of the developed statistical classifier.

Fig. 1. Long-term (30-year) average temperature and temperature in 2009 (°C) recorded in the crop season (April–July) for the five localities (A). Long-term (30-year) average rainfall and rainfall in 2009 (mm) recorded in the crop season (April–July) for the five localities (B).

Table 1
Altitude, climate, soil and geographical position of the cropping localities. Temperature and rainfall are long-term (30-year) average values.

| Locality | Production region | Altitude (m a.s.l.) | Temperature (°C) | Rainfall (mm) | Soil (by FAO 1970) |
|---|---|---|---|---|---|
| Domanínek | Potato | 572 | 6.5 | 651 | Spodo-dystric Cambisol (medium) |
| Chrastava | Cereal | 345 | 8.0 | 738 | Orthic-Luvisol (medium) |
| Jaroměřice n/R | Cereal | 425 | 8.0 | 481 | Orthic-Luvisol (heavy) |
| Lednice n/M | Maize | 171 | 9.6 | 461 | Haplic Chernozem |
| Žatec | Sugar | 285 | 9.0 | 439 | Luvi-Haplic Chernozem (heavy) |

FAO, 1970. Physical and chemical methods of soil and water analysis. FAO Soils Bulletin No. 10. FAO, Rome.

2.2. Image acquisition and analysis

Digital images of seed samples were acquired using a flatbed scanner (Canon 4400F, Canon Inc., Japan), with a digital resolution of 200 dpi, a colour depth of 24 bit and a scanning area not exceeding 1024 × 1024 pixels. According to Bacchetta et al. (2008), the seeds were arranged singly on the scanner tray, so that they did not touch each other. To achieve a reliable detection of the seed contour and avoid interference from environmental light, two images were acquired for each seed sample: one with a black background, covering the seed samples with a box lined with opaque black paper, and the other with a white background, using a box lined with opaque white paper. Before image acquisition, the scanner was calibrated for colour matching according to Venora et al. (2009), using a Q60 Kodak Colour Input Target (IT8.7/2-1993, Kodak, USA), following the Shahin and Symons (2003) protocol. The 34 morpho-colorimetric quantitative variables describing seed size, shape and colour (Table 2) were measured automatically on the acquired images by applying a macro, called flaxeed.mcr, specifically developed for the characterization of flax seeds, using the software package KS-400 V.3.0 (Carl Zeiss Vision, Oberkochen, Germany) (Grillo et al., 2010).

Table 2
List of the 34 morphometric and colorimetric features measured on seeds.

| Symbol | Feature | Description |
|---|---|---|
| A | Area | Seed area (mm²) |
| P | Perimeter | Seed perimeter (mm) |
| Pconv | Convex perimeter | Convex perimeter of the seed (mm) |
| PCrof | Crofton perimeter | Crofton perimeter of the seed (mm) |
| Pconv/PCrof | Perimeter ratio | Ratio between convex and Crofton's perimeters |
| Dmax | Max diameter | Maximum diameter of the seed (mm) |
| Dmin | Min diameter | Minimum diameter of the seed (mm) |
| Dmin/Dmax | Feret ratio | Ratio between minimum and maximum diameters |
| Sf | Shape factor | Seed shape descriptor = (4 × π × area)/perimeter² (normalized value) |
| Rf | Roundness factor | Seed roundness descriptor = (4 × area)/(π × max diameter²) (normalized value) |
| Ecd | Eq. circular diameter | Diameter of a circle with equivalent area (mm) |
| EAmax | Maximum ellipse axis | Maximum axis of an ellipse with equivalent area (mm) |
| EAmin | Minimum ellipse axis | Minimum axis of an ellipse with equivalent area (mm) |
| Rmean | Mean red channel | Red channel mean value of seed pixels (grey levels) |
| Rsd | Red std. deviation | Red channel standard deviation of seed pixels |
| Gmean | Mean green channel | Green channel mean value of seed pixels (grey levels) |
| Gsd | Green std. deviation | Green channel standard deviation of seed pixels |
| Bmean | Mean blue channel | Blue channel mean value of seed pixels (grey levels) |
| Bsd | Blue std. deviation | Blue channel standard deviation of seed pixels |
| Hmean | Mean hue channel | Hue channel mean value of seed pixels (grey levels) |
| Hsd | Hue std. deviation | Hue channel standard deviation of seed pixels |
| Lmean | Mean lightness channel | Lightness channel mean value of seed pixels (grey levels) |
| Lsd | Lightness std. deviation | Lightness channel standard deviation of seed pixels |
| Smean | Mean saturation channel | Saturation channel mean value of seed pixels (grey levels) |
| Ssd | Saturation std. deviation | Saturation channel standard deviation of seed pixels |
| Dmean | Mean density | Density channel mean value of seed pixels (grey levels) |
| Dsd | Density std. deviation | Density channel standard deviation of seed pixels |
| S | Skewness | Asymmetry degree of intensity values distribution (grey levels) |
| K | Kurtosis | Peakness degree of intensity values distribution (densitometric units) |
| H | Energy | Measure of the increasing intensity power (densitometric units) |
| E | Entropy | Dispersion power (bit) |
| Dsum | Density sum | Sum of density values of the seed pixels (grey levels) |
| SqDsum | Square density sum | Sum of the squares of density values (grey levels) |
| TSW | A thousand seeds mean weight | Mean value of a thousand seeds weight (g) |

2.3. Data analysis

Statistical analyses were performed with the software SPSS release 15 (SPSS Inc. 1999), applying a stepwise Linear Discriminant Analysis (LDA) algorithm. This approach is commonly used to classify or identify unknown groups characterized by quantitative and qualitative variables (Fisher, 1936, 1940). It finds the combination of predictor variables that simultaneously minimizes the within-class distance and maximizes the between-class distance, thus achieving maximum class discrimination (Hastie et al., 2001; Holden et al., 2011). The best features for seed sample identification were detected by applying a stepwise LDA method. The selected features were used to derive the canonical discriminant functions needed to implement statistical classifiers that discriminate and classify the seeds on the basis of the selected features (Table 2).

When several variables are available, the stepwise method can be useful because it automatically selects the best characters on the basis of three statistical variables: Tolerance, F-to-enter and F-to-remove. The Tolerance value indicates the proportion of the variance of a variable that is not accounted for by the other independent variables in the equation. A variable with a very low Tolerance value provides little information to a model. The F-to-enter and F-to-remove values define the power of each variable in the model and describe what happens if a variable is inserted into, or removed from, the current model, respectively. The method starts with a model that does not include any of the variables. At each step, the variable with the largest F-to-enter value that exceeds the chosen entry criterion (F ≥ 3.84) is added to the model. The variables left out of the analysis at the last step have F-to-enter values smaller than 3.84, so no more are added. The process was automatically stopped when no remaining variable increased the discrimination ability (Grillo et al., 2012). To graphically highlight the differences among groups, multidimensional plots were drawn using the first three of the available canonical discriminant functions.

3. Results

Morpho-colorimetric analysis allowed the accurate assessment of seed size, shape and colour of the studied flax varieties.
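The discriminant approach described in Section 2.3 can be illustrated with a minimal sketch. It is not the stepwise, four-class SPSS procedure used in the study: it is a two-class, two-feature Fisher discriminant written in plain Python, with equal priors assumed. The data are synthetic and hypothetical (a "mean red channel" and an "area" value drawn from normal distributions loosely resembling brown and yellow seeds), and the function names are illustrative only. Two thirds of each group are used for training and one third for testing, mirroring the proportion of the training and test sample sets.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of (x, y) feature tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def pooled_covariance(groups, means):
    """Pooled within-class 2 x 2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for rows, m in zip(groups, means):
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]

def fit_lda(class0, class1):
    """Return (weights, threshold) of Fisher's linear discriminant."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s = pooled_covariance([class0, class1], [m0, m1])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    # Direction maximizing between-class over within-class scatter.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Equal priors assumed: the threshold is the projected midpoint.
    threshold = sum(w[j] * (m0[j] + m1[j]) / 2.0 for j in range(2))
    return w, threshold

def predict(model, x):
    """Assign class 1 if the projection exceeds the threshold, else 0."""
    w, threshold = model
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0

def split(rows, rng):
    """Random split: two thirds for training, one third for testing."""
    rows = list(rows)
    rng.shuffle(rows)
    cut = (2 * len(rows)) // 3
    return rows[:cut], rows[cut:]

# Synthetic, hypothetical seeds: (mean red channel, area in mm2).
rng = random.Random(0)
brown = [(rng.gauss(140.0, 12.0), rng.gauss(8.0, 0.75)) for _ in range(600)]
yellow = [(rng.gauss(205.0, 11.0), rng.gauss(7.9, 0.75)) for _ in range(600)]

brown_train, brown_test = split(brown, rng)
yellow_train, yellow_test = split(yellow, rng)
model = fit_lda(brown_train, yellow_train)

correct = (sum(predict(model, x) == 0 for x in brown_test)
           + sum(predict(model, x) == 1 for x in yellow_test))
accuracy = correct / (len(brown_test) + len(yellow_test))
print(f"test-set accuracy: {accuracy:.3f}")
```

In this toy setting the colour feature carries almost all of the discrimination, which is in line with the clear separation between yellow and brown seeds. Separating varieties of the same colour requires many more features and the multi-class procedure described in Section 2.3.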
The measured mean and standard deviation values are reported in Table 3. The seeds of the studied Linum samples are flat and ovoid to ellipsoid in shape, as also shown by the mean values of the shape and roundness factors (0.82 ± 0.03 and 0.46 ± 0.03, respectively). Seeds of the four studied varieties are similar in size and shape, as the morphometric feature values show, but a clear difference between the two yellow-seeded (Amon and Jantar) and the two brown-seeded varieties (Flanders and Lola) is evident, as shown by the higher RGB channel values of the yellow-seeded varieties with respect to the brown-seeded ones (Table 3).

The data of the 34 measured features were statistically processed using stepwise Linear Discriminant Analysis, producing a statistical classifier able to distinguish the studied varieties (Table 4). Using this model, 87.0% of the training sample set was correctly identified, with performance ranging between 80.3% (Jantar) and 90.9% (Flanders) for single varieties. As described above, the performance of the classifier was tested using an independent seed group, the test sample set. It showed almost the same overall percentage of correct classification (87.1%) and a similar range of performance for single varieties, between 81.1% (Jantar) and 91.9% (Flanders), with respect to the training sample set classification (Table 4). Fig. 2 shows the graphical representation on the basis of the first three discriminant functions, highlighting the differentiation among the four varieties and, at the same time, the similarity within each colour group, i.e. between the two yellow-seeded varieties (Amon and Jantar) and between the two brown-seeded varieties (Flanders and Lola).

Table 3
Measured mean and standard deviation (s.d.) values of the features for the four flax varieties: Amon, Flanders, Jantar and Lola.

| Feature | Amon mean | Amon s.d. | Flanders mean | Flanders s.d. | Jantar mean | Jantar s.d. | Lola mean | Lola s.d. |
|---|---|---|---|---|---|---|---|---|
| A | 7.74 | 0.69 | 7.92 | 0.74 | 8.10 | 0.80 | 8.22 | 0.76 |
| P | 11.57 | 0.50 | 11.73 | 0.53 | 11.76 | 0.58 | 11.81 | 0.53 |
| Pconv | 11.28 | 0.47 | 11.43 | 0.50 | 11.46 | 0.54 | 11.51 | 0.50 |
| PCrof | 10.97 | 0.47 | 11.12 | 0.50 | 11.15 | 0.55 | 11.20 | 0.50 |
| Pconv/PCrof | 1.03 | 0.01 | 1.03 | 0.01 | 1.03 | 0.01 | 1.03 | 0.01 |
| Dmax | 4.67 | 0.20 | 4.71 | 0.22 | 4.69 | 0.24 | 4.70 | 0.22 |
| Dmin | 2.31 | 0.14 | 2.36 | 0.15 | 2.42 | 0.17 | 2.45 | 0.16 |
| Dmin/Dmax | 0.50 | 0.03 | 0.50 | 0.03 | 0.52 | 0.03 | 0.52 | 0.03 |
| Sf | 0.81 | 0.03 | 0.80 | 0.03 | 0.82 | 0.03 | 0.82 | 0.03 |
| Rf | 0.45 | 0.03 | 0.45 | 0.03 | 0.47 | 0.03 | 0.47 | 0.03 |
| Ecd | 3.14 | 0.14 | 3.17 | 0.15 | 3.21 | 0.16 | 3.23 | 0.15 |
| EAmax | 2.26 | 0.10 | 2.28 | 0.11 | 2.27 | 0.12 | 2.26 | 0.11 |
| EAmin | 1.09 | 0.07 | 1.11 | 0.07 | 1.14 | 0.08 | 1.16 | 0.07 |
| Rmean (a) | 205.43 | 11.06 | 140.11 | 12.51 | 206.48 | 11.15 | 142.54 | 12.85 |
| Rsd (b) | 36.66 | 4.09 | 33.85 | 6.52 | 36.31 | 3.75 | 31.58 | 6.48 |
| Gmean (a) | 174.59 | 10.12 | 119.21 | 9.78 | 176.78 | 9.91 | 117.61 | 9.07 |
| Gsd (b) | 31.10 | 3.80 | 26.69 | 5.38 | 30.96 | 3.65 | 24.69 | 5.18 |
| Bmean (a) | 137.26 | 7.41 | 117.31 | 7.20 | 136.98 | 6.97 | 115.58 | 6.80 |
| Bsd (b) | 21.33 | 3.13 | 19.28 | 3.27 | 21.03 | 3.13 | 17.95 | 2.96 |
| Hmean (a) | 24.37 | 2.14 | 87.51 | 38.35 | 24.96 | 1.82 | 81.09 | 34.77 |
| Hsd (b) | 17.93 | 8.72 | 16.81 | 17.61 | 15.65 | 7.88 | 98.89 | 16.62 |
| Lmean (a) | 171.08 | 8.90 | 16.81 | 9.77 | 171.47 | 8.69 | 128.03 | 9.74 |
| Lsd (b) | 28.15 | 3.30 | 26.59 | 4.77 | 27.79 | 3.17 | 24.66 | 4.64 |
| Smean (a) | 128.04 | 23.65 | 35.64 | 7.29 | 130.26 | 23.65 | 38.85 | 7.19 |
| Ssd (b) | 65.47 | 10.66 | 20.89 | 5.77 | 65.87 | 10.57 | 20.10 | 4.72 |
| Dmean (a) | 172.43 | 9.29 | 125.54 | 9.50 | 173.41 | 9.07 | 125.25 | 9.27 |
| Dsd (b) | 41.36 | 2.99 | 29.45 | 4.61 | 41.59 | 2.83 | 28.54 | 4.09 |
| S | -0.85 | 0.37 | 0.11 | 0.37 | -0.85 | 0.39 | 0.09 | 0.38 |
| K | 0.42 | 0.89 | -0.45 | 0.64 | 0.41 | 0.95 | -0.25 | 0.69 |
| H | 0.01 | 0.00 | 0.01 | 0.01 | 0.01 | 0.00 | 0.02 | 0.01 |
| E | 6.04 | 0.12 | 5.77 | 0.23 | 6.07 | 0.12 | 5.71 | 0.22 |
| Dsum | 83,432.45 | 9814.08 | 62,113.23 | 7925.01 | 87,765.61 | 10,572.29 | 64,316.91 | 8190.33 |
| SqDsum | 15,280,789.65 | 2,404,478.06 | 8,289,304.02 | 1,514,022.32 | 16,155,620.99 | 2,492,143.47 | 8,535,698.97 | 1,536,122.97 |
| TSW | 5.833 | 0.416 | 5.803 | 0.270 | 6.309 | 0.551 | 6.065 | 0.291 |

(a) Grey levels (range 0–255).
(b) s.d. = grey level standard deviation (range 0–255).
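The normalized shape descriptors listed in Table 2 are simple functions of area, perimeter and maximum diameter. The sketch below writes them out in Python as an illustration; the function names are hypothetical and are not part of the flaxeed.mcr macro. Which perimeter estimate the macro uses for the shape factor is not stated in the text, so the perimeter is left as a generic argument. Applying the formulas to the Amon mean values of Table 3 is only an approximate check, because the table reports means of per-seed values rather than values computed from mean dimensions.

```python
import math

def shape_factor(area, perimeter):
    """Sf = (4 * pi * area) / perimeter^2; equals 1 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2

def roundness_factor(area, max_diameter):
    """Rf = (4 * area) / (pi * max_diameter^2); equals 1 for a perfect circle."""
    return 4.0 * area / (math.pi * max_diameter ** 2)

def equivalent_circular_diameter(area):
    """Ecd: diameter of the circle that has the same area as the seed."""
    return 2.0 * math.sqrt(area / math.pi)

# Sanity check on a circle of radius 1: all descriptors take their ideal value.
circle_area, circle_perimeter, circle_diameter = math.pi, 2.0 * math.pi, 2.0
print(shape_factor(circle_area, circle_perimeter))      # 1.0 for a circle
print(roundness_factor(circle_area, circle_diameter))   # 1.0 for a circle

# Approximate check with the Amon mean values from Table 3
# (A = 7.74 mm2, Dmax = 4.67 mm).
amon_rf = roundness_factor(7.74, 4.67)
amon_ecd = equivalent_circular_diameter(7.74)
print(round(amon_rf, 2), round(amon_ecd, 2))  # close to the tabulated 0.45 and 3.14
```

A roundness factor well below 1 together with a shape factor closer to 1 is what one would expect for a smooth, elongated outline, consistent with the ovoid to ellipsoid seed shape described above.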
Table 4
Varietal discrimination independently of the cropping localities. Percentage and number of seeds (in parentheses) of the training and test sample sets used for the variety identification. Rows give the variety of the seed lot, columns the variety assigned by the classifier.

Training sample set

| Variety | Flanders | Lola | Jantar | Amon | Total |
|---|---|---|---|---|---|
| Flanders | 90.92 (3877) | 9.03 (385) | 0.02 (1) | 0.02 (1) | 100.0 (4264) |
| Lola | 11.27 (477) | 88.71 (3754) | – | 0.02 (1) | 100.0 (4232) |
| Jantar | – | – | 80.30 (3077) | 19.68 (754) | 100.0 (3832) |
| Amon | 0.02 (1) | – | 12.37 (499) | 87.60 (3533) | 100.0 (4033) |
| Overall | | | | | 87.00 (16,361) |

Test sample set

| Variety | Flanders | Lola | Jantar | Amon | Total |
|---|---|---|---|---|---|
| Flanders | 91.90 (1962) | 7.96 (170) | 0.05 (1) | 0.09 (2) | 100.0 (2135) |
| Lola | 12.59 (266) | 87.27 (1844) | 0.05 (1) | 0.09 (2) | 100.0 (2113) |
| Jantar | – | – | 81.13 (1548) | 18.87 (360) | 100.0 (1908) |
| Amon | – | – | 12.35 (249) | 87.65 (1768) | 100.0 (2017) |
| Overall | | | | | 87.14 (8173) |

Correct identifications (reported in bold in the original table) lie on the diagonal.

Fig. 2. Variety identification independent of cropping location. 3D graphic representation of discriminant scores: Amon, Jantar, Flanders, Lola.

Applying the same statistical model, a comparison of the four varieties within each of the five Czech production localities was carried out in order to evaluate the effect of locality. The training and the test sample sets show the same trend, highlighting once again the clear distinction between the two yellow-seeded (Amon and Jantar) and the two brown-seeded varieties (Flanders and Lola) in each studied production locality (Table 5). With the exception of the variety Jantar grown in Lednice (45.1% and 44.6% for the training and the test sample set, respectively), the four flax varieties showed high percentages of correct identification in all the localities, ranging between 70.5% (Amon grown in Jaroměřice) and 99.3% (Amon grown in Lednice) for the training sample set and between 67.7% (Amon grown in Jaroměřice) and 99.3% (Amon grown in Lednice) for the test sample set (Table 5). However, among the four tested flax varieties, the extensive variety Flanders showed the highest percentage of overall correct identification, both in the training set (90.9%) and in the test set (91.9%), while Jantar reached the lowest overall identification percentages (80.3% and 81.1% in the training and the test set, respectively).

Table 5
Varietal identification by cropping locality. Percentage and number of seeds (in parentheses) of the training and test sample sets used for variety identification. Columns Flanders to Amon give the variety assigned by the classifier.

Training sample set

| Variety | Locality | Flanders | Lola | Jantar | Amon | Total |
|---|---|---|---|---|---|---|
| Flanders | Domanínek | 93.70 (669) | 6.30 (45) | – | – | 100.00 (714) |
| Flanders | Chrastava | 91.96 (732) | 7.91 (63) | 0.13 (1) | – | 100.00 (796) |
| Flanders | Jaroměřice | 87.94 (744) | 11.94 (101) | – | 0.12 (1) | 100.00 (846) |
| Flanders | Lednice | 94.43 (967) | 5.57 (57) | – | – | 100.00 (1024) |
| Flanders | Žatec | 86.54 (765) | 13.46 (119) | – | – | 100.00 (884) |
| Lola | Domanínek | 11.11 (82) | 88.89 (656) | – | – | 100.00 (738) |
| Lola | Chrastava | 5.97 (55) | 94.03 (867) | – | – | 100.00 (922) |
| Lola | Jaroměřice | 9.15 (71) | 90.72 (704) | – | 0.13 (1) | 100.00 (776) |
| Lola | Lednice | 23.16 (223) | 76.84 (740) | – | – | 100.00 (963) |
| Lola | Žatec | 5.52 (46) | 94.48 (787) | – | – | 100.00 (833) |
| Jantar | Domanínek | – | – | 97.26 (674) | 2.74 (19) | 100.00 (693) |
| Jantar | Chrastava | 0.14 (1) | – | 94.09 (684) | 5.78 (42) | 100.00 (727) |
| Jantar | Jaroměřice | – | – | 96.98 (738) | 3.02 (23) | 100.00 (761) |
| Jantar | Lednice | – | – | 45.10 (405) | 54.90 (493) | 100.00 (898) |
| Jantar | Žatec | – | – | 76.49 (576) | 23.51 (177) | 100.00 (753) |
| Amon | Domanínek | – | – | 18.51 (124) | 81.49 (546) | 100.00 (670) |
| Amon | Chrastava | – | – | 14.27 (124) | 85.73 (745) | 100.00 (869) |
| Amon | Jaroměřice | 0.13 (1) | – | 29.40 (222) | 70.46 (532) | 100.00 (755) |
| Amon | Lednice | – | – | 0.66 (6) | 99.34 (909) | 100.00 (915) |
| Amon | Žatec | – | – | 2.79 (23) | 97.21 (801) | 100.00 (824) |

Test sample set

| Variety | Locality | Flanders | Lola | Jantar | Amon | Total |
|---|---|---|---|---|---|---|
| Flanders | Domanínek | 95.80 (342) | 4.20 (15) | – | – | 100.00 (357) |
| Flanders | Chrastava | 89.97 (359) | 9.52 (38) | 0.25 (1) | 0.25 (1) | 100.00 (399) |
| Flanders | Jaroměřice | 88.71 (377) | 11.29 (48) | – | – | 100.00 (425) |
| Flanders | Lednice | 96.10 (493) | 3.90 (20) | – | – | 100.00 (513) |
| Flanders | Žatec | 88.66 (391) | 11.11 (49) | – | 0.23 (1) | 100.00 (441) |
| Lola | Domanínek | 9.21 (34) | 90.79 (335) | – | – | 100.00 (369) |
| Lola | Chrastava | 10.41 (48) | 89.37 (412) | – | 0.22 (1) | 100.00 (461) |
| Lola | Jaroměřice | 13.18 (51) | 86.30 (334) | 0.26 (1) | 0.26 (1) | 100.00 (387) |
| Lola | Lednice | 23.91 (115) | 76.09 (366) | – | – | 100.00 (481) |
| Lola | Žatec | 4.34 (18) | 95.66 (397) | – | – | 100.00 (415) |
| Jantar | Domanínek | – | – | 98.24 (335) | 1.76 (6) | 100.00 (341) |
| Jantar | Chrastava | – | – | 95.87 (348) | 4.13 (15) | 100.00 (363) |
| Jantar | Jaroměřice | – | – | 98.68 (375) | 1.32 (5) | 100.00 (380) |
| Jantar | Lednice | – | – | 44.64 (200) | 55.36 (248) | 100.00 (448) |
| Jantar | Žatec | – | – | 77.13 (290) | 22.87 (86) | 100.00 (376) |
| Amon | Domanínek | – | – | 20.30 (68) | 79.70 (267) | 100.00 (335) |
| Amon | Chrastava | – | – | 10.34 (45) | 89.66 (390) | 100.00 (435) |
| Amon | Jaroměřice | – | – | 32.01 (121) | 67.72 (256) | 100.00 (378) |
| Amon | Lednice | – | – | 0.66 (3) | 99.34 (453) | 100.00 (456) |
| Amon | Žatec | – | – | 2.91 (12) | 97.09 (401) | 100.00 (413) |

Correct identifications (reported in bold in the original table) are those where the assigned variety matches the row variety.

To assess the varietal stability, a further comparison among the production localities was carried out separately for each variety (Table 6). Also in this case, the training and the test sample sets showed the same trend. The two brown-seeded varieties (Flanders and Lola) achieved high percentages of correct identification in all the production localities, with test sample set performance ranging from 64.7% to 91.7% for Flanders and from 74.9% to 96.1% for Lola. Although Jantar grown in Domanínek, Lednice and Žatec reached high test-set percentages of correct identification (97.4%, 86.4% and 71.0%, respectively), the seeds grown in Chrastava and Jaroměřice were frequently misattributed, mainly to Domanínek, achieving test-set correct identification percentages of only 22.9% and 16.8%, respectively. Similarly, the seeds of the variety Amon grown in Chrastava, Lednice and Žatec were well identified (78.4%, 79.6% and 86.7%, respectively), while those grown in Domanínek and Jaroměřice reached test-set performances not exceeding 47.4% (Amon in Jaroměřice) (Table 6).
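The percentages of correct identification quoted in the Results are row-normalized entries of a confusion matrix. As a worked illustration, the sketch below recomputes the per-variety and overall percentages from the test sample set seed counts of Table 4. It assumes, consistently with the row totals, that rows are the true variety and columns the variety assigned by the classifier; the function names are hypothetical.

```python
varieties = ["Flanders", "Lola", "Jantar", "Amon"]

# Test sample set seed counts from Table 4.
# Assumption: rows = true variety, columns = assigned variety.
counts = [
    [1962, 170, 1, 2],     # Flanders
    [266, 1844, 1, 2],     # Lola
    [0, 0, 1548, 360],     # Jantar
    [0, 0, 249, 1768],     # Amon
]

def per_class_rates(matrix):
    """Percentage of each row that falls on the diagonal."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]

def overall_rate(matrix):
    """Percentage of all seeds that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

rates = per_class_rates(counts)
for name, rate in zip(varieties, rates):
    print(f"{name}: {rate:.2f}%")
print(f"Overall: {overall_rate(counts):.2f}%")
```

Almost all misclassified seeds fall within the same colour group (Flanders with Lola, Jantar with Amon), so a two-group yellow versus brown classification would be nearly error-free on these counts.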
Table 6
Locality identification for each variety. Percentage and number of seeds (in parentheses) of the training and test sample sets used for locality identification. Columns Domanínek to Žatec give the locality assigned by the classifier.

Training sample set

| Variety | Locality | Domanínek | Chrastava | Jaroměřice | Lednice | Žatec | Total |
|---|---|---|---|---|---|---|---|
| Flanders | Domanínek | 63.73 (455) | 30.39 (217) | 5.46 (39) | 0.42 (3) | – | 100.00 (714) |
| Flanders | Chrastava | 4.02 (32) | 93.22 (742) | 1.26 (10) | 1.51 (12) | – | 100.00 (796) |
| Flanders | Jaroměřice | 4.14 (35) | 22.81 (193) | 67.26 (569) | 5.67 (48) | 0.12 (1) | 100.00 (846) |
| Flanders | Lednice | – | – | – | 89.55 (917) | 10.45 (107) | 100.00 (1024) |
| Flanders | Žatec | – | – | – | 25.34 (224) | 74.66 (660) | 100.00 (884) |
| Lola | Domanínek | 85.23 (629) | 10.70 (79) | 4.07 (30) | – | – | 100.00 (738) |
| Lola | Chrastava | 1.19 (11) | 89.48 (825) | 8.79 (81) | 0.54 (5) | – | 100.00 (922) |
| Lola | Jaroměřice | 1.68 (13) | 0.64 (5) | 97.16 (754) | 0.26 (2) | 0.26 (2) | 100.00 (776) |
| Lola | Lednice | – | 5.50 (53) | 1.77 (17) | 81.93 (789) | 10.80 (104) | 100.00 (963) |
| Lola | Žatec | – | 0.48 (4) | 0.48 (4) | 21.37 (178) | 77.67 (647) | 100.00 (833) |
| Jantar | Domanínek | 98.85 (685) | – | 1.15 (8) | – | – | 100.00 (693) |
| Jantar | Chrastava | 56.53 (411) | 20.22 (147) | 23.25 (169) | – | – | 100.00 (727) |
| Jantar | Jaroměřice | 81.87 (623) | 5.26 (40) | 12.88 (98) | – | – | 100.00 (761) |
| Jantar | Lednice | – | – | – | 84.52 (759) | 15.48 (139) | 100.00 (898) |
| Jantar | Žatec | – | 0.53 (4) | 0.13 (1) | 26.16 (197) | 73.17 (551) | 100.00 (753) |
| Amon | Domanínek | 32.09 (215) | 57.61 (388) | 10.0 (67) | – | – | 100.00 (670) |
| Amon | Chrastava | 0.92 (8) | 79.75 (693) | 17.15 (146) | 1.27 (11) | 0.92 (8) | 100.00 (869) |
| Amon | Jaroměřice | 0.79 (6) | 27.42 (207) | 47.81 (361) | 19.87 (150) | 4.11 (31) | 100.00 (755) |
| Amon | Lednice | – | – | – | 79.56 (728) | 20.44 (187) | 100.00 (915) |
| Amon | Žatec | – | – | – | 10.80 (89) | 89.20 (735) | 100.00 (824) |

Test sample set

| Variety | Locality | Domanínek | Chrastava | Jaroměřice | Lednice | Žatec | Total |
|---|---|---|---|---|---|---|---|
| Flanders | Domanínek | 64.71 (231) | 31.93 (114) | 3.08 (11) | 0.28 (1) | – | 100.00 (357) |
| Flanders | Chrastava | 5.01 (20) | 91.73 (366) | 2.01 (8) | 1.25 (5) | – | 100.00 (399) |
| Flanders | Jaroměřice | 3.76 (16) | 23.76 (101) | 66.82 (284) | 5.41 (23) | 0.24 (1) | 100.00 (425) |
| Flanders | Lednice | – | – | 0.19 (1) | 87.72 (450) | 12.09 (62) | 100.00 (513) |
| Flanders | Žatec | – | – | – | 24.94 (110) | 75.06 (331) | 100.00 (441) |
| Lola | Domanínek | 84.82 (313) | 10.30 (38) | 4.88 (18) | – | – | 100.00 (369) |
| Lola | Chrastava | 3.47 (16) | 82.00 (378) | 14.32 (66) | 0.22 (1) | – | 100.00 (461) |
| Lola | Jaroměřice | 3.36 (13) | 0.52 (2) | 96.12 (372) | – | – | 100.00 (387) |
| Lola | Lednice | – | 3.53 (17) | 1.87 (9) | 83.16 (400) | 11.43 (55) | 100.00 (481) |
| Lola | Žatec | – | – | 0.72 (3) | 24.34 (101) | 74.94 (311) | 100.00 (415) |
| Jantar | Domanínek | 97.36 (332) | – | 2.69 (9) | – | – | 100.00 (341) |
| Jantar | Chrastava | 57.02 (207) | 22.87 (83) | 20.11 (73) | – | – | 100.00 (363) |
| Jantar | Jaroměřice | 78.95 (300) | 4.21 (16) | 16.84 (64) | – | – | 100.00 (380) |
| Jantar | Lednice | – | – | – | 86.38 (387) | 13.62 (61) | 100.00 (448) |
| Jantar | Žatec | – | – | – | 28.99 (109) | 71.01 (267) | 100.00 (376) |
| Amon | Domanínek | 36.42 (122) | 55.52 (186) | 8.06 (27) | – | – | 100.00 (335) |
| Amon | Chrastava | 0.92 (4) | 78.39 (341) | 19.54 (85) | 0.69 (3) | 0.46 (2) | 100.00 (435) |
| Amon | Jaroměřice | – | 30.42 (115) | 47.35 (179) | 17.99 (68) | 4.23 (16) | 100.00 (378) |
| Amon | Lednice | – | – | – | 79.61 (363) | 20.39 (93) | 100.00 (456) |
| Amon | Žatec | – | – | – | 13.22 (55) | 86.68 (358) | 100.00 (413) |

Correct identifications (reported in bold in the original table) are those where the assigned locality matches the row locality.

4. Discussion

For many plant species, seed features play an important role, mainly for varietal identification (Grillo et al., 2011). The low level of phenotypic variability among the studied flax varieties was repeatedly observed. Everaert et al. (2001), Fu et al. (2002) and Smykal et al. (2011) illustrated the ability to detect two contrasting flax varieties. Similarly, Wiesnerová and Wiesner (2008) and Pearson (2010), applying image analysis techniques, were able to discriminate between brown and yellow flax on the basis of red, green and blue mean values. In this study, four Czech commercial varieties of flax were characterized on the basis of seed shape, size and colour measured by computer vision methods. The data obtained were used to implement a specific statistical classifier able to identify and classify the studied varieties and trace the cultivation localities.
Considering the remarkable visual resemblance among the seeds of the studied varieties, the results obtained can be considered sufficient to support the seed lot identification process, especially when compared with those achieved for other similar crops (Venora et al., 2007a, 2007b; Smýkalová et al., 2011; Grillo et al., 2011). The effect of the cultivation region, including its soil, climatic and geographic characteristics, is remarkably evident on flax seed shape, size and colour, and a relationship with the flax variety seems to exist. In particular, the high performances achieved by the studied varieties in Žatec and Lednice suggest that these localities are particularly well suited to flax compared with the others. On the other hand, the genotype × environment interaction in the expression of seed shape, size and colour is well known (Nieto-Ángel et al., 2009; Medina et al., 2010; Smýkalová et al., 2011).

5. Conclusions

This work allowed the relationships among flax varieties and the cropping environment to be investigated by seed morpho-colorimetric characterization using an image analysis system. Once again, it was possible to show that an objective, reliable and repeatable computer-aided identification system can also be effectively applied to flax seeds.

A further step of this work will be to improve the classifier by adding seed samples from different cropping years, in order to evaluate the effect of climatic and thermo-pluviometric conditions on the expression of seed traits and the consequence of seed storage conditions on seed colour changes.

Acknowledgement

This work was financially supported by the grant No. MSM2678424601 of the Ministry of Education of CR.

References

Bacchetta, G., Grillo, O., Mattana, E., Venora, G., 2008. Morpho-colorimetric characterization by image analysis to identify diaspores of wild plant species. Flora 203 (8), 669–682.
Diederichsen, A., Richards, K., 2003. The seed. In: Muir, D.A., Westcott, N.D. (Eds.), Flax, the Genus Linum. Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada, p. 306.
Everaert, I., De Riek, J., De Loose, M., Van Waes, J., Van Bockstaele, E., 2001. Most similar variety grouping for distinctness evaluation of flax and linseed (Linum usitatissimum L.) varieties by means of AFLP and morphological data. Plant Var. Seeds 4, 69–87.
Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188.
Fisher, R.A., 1940. The precision of discriminant functions. Ann. Eugen. 10 (4), 422–429.
Fouilloux, G., 1988. Breeding flax methods. In: Flax: Breeding and Utilisation. Kluwer Academic Publishers, Dordrecht/Boston/London, pp. 14–25.
Fu, Y.-B., Diederichsen, A., Richards, K.W., Peterson, G., 2002. Genetic diversity within a range of cultivars and landraces of flax (Linum usitatissimum L.) as revealed by RAPDs. Genet. Resour. Crop Evol. 49, 167–174.
Grillo, O., Mattana, E., Venora, G., Bacchetta, G., 2010. Statistical seed classifiers of 10 plant families representative of the Mediterranean vascular flora. Seed Sci. Technol. 38 (2), 455–476.
Grillo, O., Miceli, C., Venora, G., 2011. Image analysis tool for Vetch varieties identification by seeds inspection. Seed Sci. Technol. 39 (2), 490–500.
Grillo, O., Draper, D., Venora, G., Martínez-Laborde, J.B., 2012. Seed image analysis and taxonomy of Diplotaxis DC. (Brassicaceae Brassiceae). Systemat. Biodivers. 10 (1), 57–70.
Harper, J.L., Lovell, P.H., Moore, K.G., 1970. The shapes and sizes of seeds. Annu. Rev. Ecol. Systemat. 1, 327–356.
Hastie, T., Tibshirani, R., Friedman, J., 2001. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, USA, p. 745.
Holden, J.E., Finch, W.H., Kelly, K., 2011. A comparison of two-group classification methods. Educ. Psychol. Meas. 71 (5), 870–901.
Medina, W., Skurtys, O., Aguilera, J.M., 2010. Study on image analysis application for identification Quinoa seeds (Chenopodium quinoa Willd) geographical provenance. LWT – Food Sci. Technol. 43 (2), 238–246.
Muir, A.D., Wescott, N.D., 2001. Flax - the genus Linum. Harwood Acad. Publ., Amsterdam, pp. 22–54.
Nieto-Ángel, R., Pérez-Ortega, S.A., Núñez-Colín, C.A., Martínez-Solís, J., González-Andrés, F., 2009. Seed and endocarp traits as markers of the biodiversity of regional sources of germplasm of tejocote (Crataegus spp.) from Central and Southern Mexico. Sci. Hortic. – Amsterdam 121, 166–170.
Pavelek, M., 2004. Recent development of the International flax database as the result of an ECP/GR initiative. IPGRI Newsletter for Europe, No. 28, June 2004, p. 9.
Pearson, T., 2010. High speed sorting of grains by color and surface texture. Appl. Eng. Agric. 26 (3), 499–505.
Shahin, M.A., Symons, S.J., 2001. A machine vision system for grading lentils. Can. Biosyst. Eng. 43, 7.7–7.14.
Shahin, M.A., Symons, S.J., 2003. Colour calibration of scanners for scanner-independent grain grading. Cereal Chem. 80, 285–289.
Smykal, P., Bacova-Kerteszova, N., Kalendar, R., Corander, J., Schulman, A.H., Pavelek, M., 2011. Genetic diversity of cultivated flax (Linum usitatissimum L.) germplasm assessed by retrotransposon-based markers. Theor. Appl. Genet. 122, 1385–1397, DOI 10.1007/s00122-011-1539-2.
Smýkalová, I., Grillo, O., Bjelkova, M., Hybl, M., Venora, G., 2011. Morpho-colorimetric traits of Pisum seeds measured by an image analysis system. Seed Sci. Technol. 39, 612–626.
Venora, G., Grillo, O., Shahin, M.A., Symons, S.J., 2007a. Identification of Sicilian landraces and Canadian cultivars of lentil using an image analysis system. Food Res. Int. 40, 161–166.
Venora, G., Grillo, O., Ravalli, C., Cremonini, R., 2007b. Tuscany beans landraces, on-line identification from seed inspection by image analysis and linear discriminant analysis. Agrochimica 51 (4/5), 254–268.
Venora, G., Grillo, O., Saccone, R., 2009. Quality assessment of durum wheat storage centres in Sicily: evaluation of vitreous, starchy and shrunken kernels using an image analysis system. J. Cereal Sci. 49, 429–440.
Wiesnerová, D., Wiesner, I., 2008. Computer image analysis of seed shape and seed color for flax cultivar description. Comput. Electron. Agric. 61, 126–135.