Very Small Entities (VSEs), software development companies with fewer than 25 employees, are compelled to guarantee process quality in order to reduce rework and increase their profits. This implies complying with good practices recommended in various international standards, which offer guidelines for improving software processes and products. However, the requirements of these standards are difficult for VSEs to meet given their capacity and the costs of implementation. This paper describes the process of implementing the ISO 29110 standard in four companies in the specific context of the Valle del Cauca region in Colombia, and presents the good practices, tools, and techniques used, with promising results.
The Random Forest (RF) algorithm consists of an ensemble of base decision trees constructed from Bootstrap subsets of the original dataset. Each subset is a sample of instances (rows) combined with a random subset of features (variables or columns) of the original dataset to be classified. In RF, no pruning is applied when generating the base trees, and to classify a new record each tree casts a vote, the selected class being the one with the most votes. Given that the state of the art indicates that random feature selection when constructing the Bootstrap subsets decreases the quality of the results achieved with RF, this work proposes integrating covering arrays (CA) into RF to address this situation, in an algorithm called RFCA. In RFCA, the number N of rows of the CA defines the minimum number of base trees to be generated in RF, and each row of the CA defines the features that each Bootstrap subset will use in the creation of each tree. To evaluate the new proposal, 32 datasets available in the UCI repository are used and compared with the RF implementation available in Weka. The experiments show that using a CA of strength 2 to 7 obtains promising results in terms of accuracy.
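To make the idea concrete, here is a minimal sketch (not the authors' code) of how a covering array can drive the construction of the forest: each CA row fixes the feature subset of one unpruned base tree, while the rows of each Bootstrap sample are still drawn at random. The toy CA, the scikit-learn tree, and the integer class labels are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def rfca_fit(X, y, ca, seed=42):
    """ca: binary matrix (N rows x num_features); row i selects the
    features used by tree i. A real CA of strength 2-7 would be larger."""
    rng = np.random.default_rng(seed)
    forest, n = [], X.shape[0]
    for row in ca:
        mask = row.astype(bool)
        idx = rng.choice(n, size=n, replace=True)  # Bootstrap sample of rows
        tree = DecisionTreeClassifier()            # unpruned, as in RF
        tree.fit(X[idx][:, mask], y[idx])
        forest.append((tree, mask))
    return forest

def rfca_predict(forest, X):
    # Each tree votes; the majority class wins (labels assumed to be ints).
    votes = np.array([t.predict(X[:, m]) for t, m in forest])
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])
```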
Recently, metaheuristic-based algorithms have shown good results in generating automatic multi-document summaries. This paper proposes two algorithms, called LexGbhs and GbhsLex, that hybridize the Global Best Harmony Search metaheuristic and the graph-based LexRank algorithm. The objective function to be optimized combines two features: coverage and diversity. Coverage measures the similarity between each sentence of the candidate summary and the centroid of the sentences of the document collection, while diversity measures how different the sentences that make up a candidate summary are from one another. The two proposed hybrid algorithms were compared with state-of-the-art algorithms using the ROUGE-1, ROUGE-2, and ROUGE-SU4 metrics on the DUC2005 and DUC2006 datasets. In a unified ranking, the proposed LexGbhs algorithm placed third, showing that the hybridization of metaheuristics with graph-based methods for generating extractive multi-document summaries is a promising line of research.
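A hedged sketch of the coverage-plus-diversity objective described in the abstract follows. The sentence vector representation and the weighting factor alpha are assumptions, not taken from the paper.

```python
import numpy as np

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0 or nb == 0 else float(a @ b) / (na * nb)

def objective(candidate_vecs, all_sentence_vecs, alpha=0.5):
    """Coverage: mean similarity of candidate sentences to the collection
    centroid. Diversity: mean pairwise dissimilarity within the candidate."""
    centroid = np.mean(all_sentence_vecs, axis=0)
    coverage = np.mean([cosine(v, centroid) for v in candidate_vecs])
    pairs = [(i, j) for i in range(len(candidate_vecs))
             for j in range(i + 1, len(candidate_vecs))]
    diversity = np.mean([1 - cosine(candidate_vecs[i], candidate_vecs[j])
                         for i, j in pairs]) if pairs else 0.0
    return alpha * coverage + (1 - alpha) * diversity
```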
Very small software development companies have at most 25 employees and limited cash flow and time to implement process improvements that would make them more competitive. This is one of the reasons such companies turn to agile frameworks like SCRUM to manage the software development process. But when they begin adoption, they find that the documents only suggest the changes that can be made, not how to make them, turning the discovery of which techniques, events, and artifacts to implement into a costly, and in some cases unviable, trial-and-error approach. The same happens with other frameworks that can complement SCRUM, such as DevOps, which proposes bringing the development and operations areas closer together, automating as many tasks as possible and increasing quality controls to obtain better products. This article presents three good...
Coffee is one of the most traded agricultural products internationally; in Colombia, it is the leading non-mining-energy export product. In this context, predicting coffee crop yields is vital for the sector, since it allows coffee growers to establish crop management strategies, maximizing their profits or reducing possible losses. This paper addresses crucial aspects of coffee crop yield prediction through a systematic literature review of documents retrieved from Scopus, ACM, Taylor & Francis, and Nature. These documents were subjected to a filtering and evaluation process to answer five key questions: the predictor variables used, the target variable, the techniques and algorithms employed, the metrics used to evaluate prediction quality, and the coffee species reported. The results reveal several groups of predictor variables, including atmospheric, chemical, satellite-derived, fertilizer-related, soil, crop management, and shade factors. The most recurrent target variable is yield, measured as bean weight per hectare or in other units, with one case considering leaf area. Predominant techniques for yield forecasting include linear regression, random forests, principal component analysis, cluster regression, neural networks, classification and regression trees, and extreme learning machines. The most common metrics for evaluating the quality of predictive models include root mean squared error, coefficient of determination (R²), mean absolute error, error deviation, Pearson's correlation coefficient, and standard deviation. Finally, robusta, arabica, racemosa, and zanguebariae are the most studied coffee varieties.
This paper presents: a model for searching business processes based on a multimodal approach that integrates textual and structural information; a clustering mechanism that uses a fuzzy-logic-based similarity function for grouping search results; and an evaluation of the search method using internal quality assessment and external assessment based on human criteria. Nowadays, many companies standardize their operations through Business Processes (BPs), which are stored in repositories and reused when new functionalities are required. However, finding specific processes can become a cumbersome task due to the large size of these repositories. This paper presents MulTimodalGroup, a model for grouping and searching business processes. The grouping mechanism is built upon a clustering algorithm that uses a similarity function based on fuzzy logic; this grouping is performed on the results of each user request. For its part, the search is based on a multimodal representation that integrates textual and structural information of BPs. The assessment of the proposed model was carried out in two phases: 1) internal quality assessment of the groups and 2) external assessment of the created groups against an ideal set of groups. The assessment was performed using a closed BP collection designed collaboratively by 59 experts. The experimental results in each phase are promising and demonstrate the validity of the proposed model.
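As an illustration only, the sketch below combines a textual and a structural similarity score with a small fuzzy rule base. The membership functions, the min/max rules, and the defuzzification levels are assumptions; the paper's actual similarity function may differ.

```python
def tri(x, a, b, c):
    """Triangular membership function over [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_similarity(text_sim, struct_sim):
    # Degree to which each input (in [0, 1]) is "high"; c > 1 gives a shoulder.
    high_t = tri(text_sim, 0.4, 1.0, 1.6)
    high_s = tri(struct_sim, 0.4, 1.0, 1.6)
    low_t, low_s = 1 - high_t, 1 - high_s
    # Rules: both high -> similar; both low -> dissimilar; mixed -> medium.
    similar = min(high_t, high_s)
    dissimilar = min(low_t, low_s)
    medium = max(min(high_t, low_s), min(low_t, high_s))
    # Defuzzify with fixed output levels (weighted average).
    return (1.0 * similar + 0.5 * medium + 0.0 * dissimilar) / \
           max(similar + medium + dissimilar, 1e-9)
```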
International Journal on Advanced Science, Engineering and Information Technology, Aug 31, 2022
The Vehicle Routing Problem with Time Windows is an NP-complete combinatorial problem in which product deliveries to customers must be made under certain time constraints. This problem can be solved with a single-objective approach, well studied in the state of the art, in which either the total travel distance or the size of the fleet (number of vehicles) is generally minimized. However, recent studies have used a multiobjective approach (the Multiobjective Vehicle Routing Problem with Time Windows, MOVRPTW) that addresses the problem from a viewpoint closer to reality. This work presents MOMGRASP, a new multiobjective memetic algorithm based on GRASP (Greedy Randomized Adaptive Search Procedures), for minimizing three objectives in MOVRPTW: total travel time, customer waiting time, and the balance of total travel time across routes. Experiments on 56 problems proposed by Solomon and 45 problems proposed by Castro-Gutiérrez show that the proposed algorithm finds better solutions for these three objectives, and solutions competitive with those reported by Zhou (against the LSMOVRPTW algorithm, which optimizes 5 objectives: number of vehicles, total travel distance, travel time of the longest route, total waiting time due to early arrivals, and total delay time due to late arrivals) and by Baños (against the MMOEASA algorithm in two scenarios; case 1: total travel distance and distance balance, and case 2: total travel distance and workload balance).
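The following sketch shows one plausible reading of the three objectives named in the abstract, evaluated for a set of routes. The data layout, the early-arrival waiting rule, and the max-min spread used as the balance measure are assumptions for illustration, not the paper's definitions.

```python
def evaluate(routes, travel, window_open, service):
    """routes: list of customer-id lists (depot 0 implicit at both ends);
    travel[i][j]: travel time i->j; window_open[c]: earliest service start;
    service[c]: service duration. Returns the three objective values."""
    route_times, total_wait = [], 0.0
    for route in routes:
        t, prev = 0.0, 0
        for c in route:
            t += travel[prev][c]
            if t < window_open[c]:            # early arrival -> customer waitless
                total_wait += window_open[c] - t
                t = window_open[c]
            t += service[c]
            prev = c
        t += travel[prev][0]                  # return to depot
        route_times.append(t)
    total_time = sum(route_times)
    balance = max(route_times) - min(route_times) if route_times else 0.0
    return total_time, total_wait, balance
```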
Periodicals of Engineering and Natural Sciences (PEN), Nov 3, 2021
The accelerated pace of companies in Colombia and worldwide entails the need to obtain software of the highest quality, in the shortest possible time and with minimal rework after it is put into production. Therefore, the use of good software development practices and their automation through tools is no longer a luxury for development teams, but part of their way of working. Unfortunately, in Colombia many of these aids and ways of working are not widely used. This paper presents the documentation and implementation of preventive quality tools and good practices for software development that enable code versioning, continuous integration, automation of functional tests, static code analysis, and continuous deployment. Objective: Present the good practices for software development implemented in the Smart Campus Ecosystem case study. Methodology: Good practices for software development based on XP and DevOps are reviewed. A set of tools with a direct impact on the quality of software development is selected for implementation. These tools are used in the UNIAJC Smart Campus ecosystem case study, and the results of the implementation are documented in this article. Results: The preventive quality model is presented, put to the test, and the results are documented. Conclusions: The preventive quality model helps improve quality assurance results through a set of tools that provide development teams with key information for refining and refactoring source code during development and no later than this stage.
The execution of business processes generates data, which are commonly recorded in logs. Historical information from execution cases may be used to recommend future execution paths. This is useful when the control flow of the process is not known by the user. We present TrazasBP, a framework for BP indexing and searching based on execution cases. It indexes BPs based on execution cases (traces) retrieved from log files. TrazasBP takes into account not only the textual information of BP elements but also the causal dependence between these elements. Furthermore, due to its low computational cost, TrazasBP may be used as an indexing mechanism to reduce the search space. Experimental evaluation shows promising values of graded precision, recall, and F-measure when compared with results obtained from human search.
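A rough sketch of the combination the abstract describes: indexing each BP by the activity terms in its traces plus the causal (directly-follows) pairs between them. The inverted-index structure and the overlap scoring are assumptions for illustration, not TrazasBP's actual implementation.

```python
from collections import defaultdict

def build_index(bp_traces):
    """bp_traces: dict bp_id -> list of traces (lists of activity names)."""
    index = defaultdict(set)                  # feature -> set of bp_ids
    for bp_id, traces in bp_traces.items():
        for trace in traces:
            for act in trace:                 # textual information
                index[("term", act.lower())].add(bp_id)
            for a, b in zip(trace, trace[1:]):  # causal dependence
                index[("causal", a.lower(), b.lower())].add(bp_id)
    return index

def search(index, query_trace):
    """Rank BPs by overlap with the query's terms and causal pairs."""
    scores = defaultdict(int)
    feats = [("term", a.lower()) for a in query_trace] + \
            [("causal", a.lower(), b.lower())
             for a, b in zip(query_trace, query_trace[1:])]
    for f in feats:
        for bp_id in index.get(f, ()):
            scores[bp_id] += 1
    return sorted(scores.items(), key=lambda kv: -kv[1])
```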
The calibration of traffic-flow simulation models continues to be a significant problem without a generalized, comprehensive, and low-cost solution. Existing calibration approaches either have not explicitly addressed the multi-objective characteristics of the problem, or determining their hyperparameters requires significant effort. In addition, statistical evaluation of alternative solution algorithms is not performed to ensure dominance and stability. This study proposes an adaptation and advanced implementation of the Multi-Objective Global-Best Harmony Search (MOGBHS) algorithm for calibrating microscopic traffic-flow simulation models. The adapted MOGBHS provides six key capabilities for solving the proposed problem: 1) consideration of multiple objectives, 2) easy extension to memetic versions, 3) simultaneous handling of continuous and discrete variables, 4) efficient ordering of non-dominated solutions, 5) relatively easy tuning of hyperparameters, and 6) easy parallelization to maximize exploration and exploitation without increasing computing time. Three traffic-flow models of different dimensionality and complexity were used to test the performance of seventeen metaheuristics for solving the calibration problem. The efficiency and effectiveness of the algorithms were tested based on convergence, minimization of errors, a calibration criterion, and two nonparametric statistical tests. The proposed approach dominated all alternative algorithms in all cases and provided the most stable and diverse solutions.
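For readers unfamiliar with capability 4 above, the sketch below shows the standard Pareto-dominance test and a simple non-dominated sort over objective vectors (all objectives minimized). This is a straightforward O(n²)-per-front version for illustration, not the paper's efficient implementation.

```python
def dominates(a, b):
    """True if objective vector a dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Group objective vectors into Pareto fronts (front 0 = best)."""
    remaining, fronts = list(points), []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining if q is not p)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

# Example: front 0 contains (1, 4) and (2, 2); (3, 5) is dominated.
print(non_dominated_sort([(1, 4), (2, 2), (3, 5)]))
```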
Vegetation indices are algebraic combinations of spectral bands produced by satellites. These indices allow different vegetation covers to be identified through contrast evaluation. Vegetation indices are used mainly in satellite image classification tasks, as well as in chemical and physical land studies. An example is the Normalized Difference Vegetation Index (NDVI), which highlights live green vegetation. This article describes the process of creating, by genetic programming, a new vegetation index that enables bare-ground identification in the Amazon. It further shows how a threshold is automatically defined for the new index, a threshold that facilitates the task of photointerpretation and is not normally provided with other vegetation indices. The new index, called BGIGP (Bare Ground Index obtained using Genetic Programming), showed significant contrast values between the different covers analyzed and competes well with traditional vegetation indices such as SR. The performance of BGIGP was also evaluated using the characteristics of 10,448 images from the "2017 Kaggle Planet: Understanding the Amazon from Space" competition to classify bare ground against water, cloudy, primary, cultivation, road, and artisanal mine classes, obtaining 93.71% accuracy.
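For context, this is the kind of per-pixel band algebra involved, using NDVI, the classical index the abstract cites; the evolved BGIGP expression itself is the paper's contribution and is not reproduced here. The thresholding helper illustrates the step that BGIGP automates with an automatically derived threshold.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """NDVI = (NIR - Red) / (NIR + Red), in [-1, 1]; eps avoids div by zero."""
    nir, red = nir.astype(float), red.astype(float)
    return (nir - red) / (nir + red + eps)

def mask_below(index_img, threshold):
    """Boolean cover mask from thresholding an index image."""
    return index_img < threshold
```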
The task of assigning tags to the words of a sentence has many applications in natural language processing (NLP) today and therefore requires a fast and accurate algorithm. This paper presents a Part-of-Speech Tagger based on Global-Best Harmony Search (GBHS) that includes local optimization of the best harmony after each improvisation (iteration), based on a Hill Climbing algorithm that uses problem knowledge to define the neighborhood. In the proposed algorithm, a candidate solution (harmony) is represented as a vector whose size equals the number of words in the sentence, while the fitness function considers the cumulative probability of each word's tag and its relation to the tags of the predecessor and successor words. The proposed algorithm obtained 95.2% precision and improved on the results obtained by other taggers. The experimental results were analyzed with the Friedman nonparametric statistical test at a significance level of 90%. The proposed Part-of-Speech Tagger was found to perform with quality and efficiency on the tagging problem, in contrast to the comparison algorithms. The Brown corpus, divided into 5 folds, was used to conduct the experiments, thereby allowing cross-validation.
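A hedged reading of that fitness function is sketched below: for each word, the lexical probability of its candidate tag is combined with transition probabilities to the predecessor and successor tags. The log-space sum and the smoothing floor are assumptions for numerical stability; the paper's exact formulation may differ.

```python
import math

def fitness(tags, words, p_word_tag, p_trans):
    """tags: candidate tag per word (the harmony); p_word_tag[(w, t)] and
    p_trans[(t_prev, t)] are probabilities estimated from a tagged corpus."""
    score = 0.0
    for i, (w, t) in enumerate(zip(words, tags)):
        score += math.log(p_word_tag.get((w, t), 1e-12))   # lexical term
        if i > 0:                                          # predecessor link
            score += math.log(p_trans.get((tags[i - 1], t), 1e-12))
        if i < len(tags) - 1:                              # successor link
            score += math.log(p_trans.get((t, tags[i + 1]), 1e-12))
    return score
```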
An Extreme Learning Machine (ELM) trains a single-layer feedforward neural network (SLFN) in less time than the back-propagation algorithm. An ELM assigns random values to the input weights and biases of the hidden layer, and then analytically calculates the output weights. The use of random values causes SLFN performance to decrease significantly. The present work adapts three high-dimensional continuous optimization algorithms (IHDELS, DECC-G, and MOS) and compares their performance with each other and with the state-of-the-art method, a memetic algorithm based on differential evolution called M-ELM. The results show that IHDELS using holdout validation (training/testing) obtains the best results, followed by DECC-G and MOS. All three algorithms obtain better results than M-ELM. The experimentation was carried out on 38 classification problems recognized by the scientific community, and Friedman and Wilcoxon nonparametric statistical tests support the results.
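The basic ELM training step the abstract builds on is short enough to show in full: random hidden-layer weights, then output weights by least squares via the pseudo-inverse. The optimizers discussed in the paper replace the purely random draw of W and b; the tanh activation here is one common choice, not necessarily the paper's.

```python
import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    """X: inputs (n x d); T: targets, e.g. one-hot labels (n x k)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random input weights
    b = rng.standard_normal(n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                            # hidden-layer output
    beta = np.linalg.pinv(H) @ T                      # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```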
The present work describes the use of a virtual community of practice for strengthening software development capabilities; the study was carried out with students of informatics and related areas from higher education institutions in southwestern Colombia. The study was conducted in the following stages: initial approach, diagnosis, preparation, implementation, and follow-up. The results obtained show that the virtual community positively influences the acquisition of knowledge, capabilities, and attitudes by the members of the community, which in turn increases their possibilities of entering the job market.
Some Latin American countries are developing programs that give their educational communities greater access to information and communication technologies (ICT). To verify the effectiveness of these programs, it is necessary to measure the ICT competencies acquired by the members of these educational communities. This article presents a dimensional model for a data warehouse that enables the analysis of information related to the appropriation of ICT competencies by teachers and students of educational institutions. The design cases take as reference those proposed by Ralph Kimball: many-to-many relationships, heterogeneous products, subdimensions, and organizational hierarchies. In addition, the article proposes a new design case called "dimension with measures" which, together with a fact table and measure calculations based on MDX functions, enables the weighted analysis of the competencies of the actors in educational institutions. This new design case can be used in other contexts that require recursive calculations of weighted numeric attributes over hierarchical organizational dimensions within a data warehouse. Finally, the article presents the main components of a decision-support system developed for the Computadores para Educar (CPE) program in partnership with the Universidad del Cauca, which can be replicated in other programs with similar objectives. KEYWORDS: dimensional modeling; data warehouses; OLAP; information and communication technology (ICT) competencies.
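Purely as an illustration of the recursive weighted calculation the new design case supports, the sketch below rolls competency scores up an organizational hierarchy using per-node weights. The data layout is an assumption; the paper implements this with a fact table and MDX calculated measures rather than application code.

```python
def weighted_rollup(node, children, score, weight):
    """children: dict node -> list of child nodes; score: leaf scores;
    weight: per-node weights. Returns the node's weighted score."""
    kids = children.get(node, [])
    if not kids:                              # leaf: return its own score
        return score[node]
    total_w = sum(weight[k] for k in kids)
    return sum(weight[k] * weighted_rollup(k, children, score, weight)
               for k in kids) / max(total_w, 1e-9)

# Example: an institution with two departments of unequal weight.
children = {"institution": ["dept_a", "dept_b"]}
print(weighted_rollup("institution", children,
                      score={"dept_a": 4.0, "dept_b": 2.0},
                      weight={"dept_a": 3, "dept_b": 1}))  # -> 3.5
```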
This paper introduces a new description-centric algorithm for web document clustering based on memetic algorithms with niching methods, a term-document matrix, and the Bayesian Information Criterion. The algorithm defines the number of clusters automatically. The memetic algorithm provides a combined global and local search strategy over the solution space, while the niching methods (based on restricted competition replacement and restrictive mating) promote diversity in the population and prevent it from converging too quickly. The memetic algorithm uses the K-means algorithm to find the optimum in a local search space. The Bayesian Information Criterion is used as the fitness function, while FP-Growth is used to reduce the high dimensionality of the vocabulary. The resulting algorithm, called WDC-NMA, was tested with datasets based on Reuters-21578 and DMOZ, obtaining promising results (better precision than a Singular Value Decomposition algorithm). It was also initially evaluated by a group of users.
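A simplified BIC-style fitness of the kind WDC-NMA optimizes is sketched below: model fit (within-cluster dispersion) penalized by model complexity, so candidate clusterings with different numbers of clusters become comparable. This schematic form (lower is better) is an assumption; the paper's exact BIC derivation may differ.

```python
import numpy as np

def bic_fitness(X, labels):
    """X: document vectors (n x d); labels: cluster id per document.
    Returns a BIC-style score; lower values indicate better clusterings."""
    n, d = X.shape
    ks = np.unique(labels)
    # Residual sum of squares around each cluster centroid.
    rss = sum(((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum()
              for k in ks)
    n_params = len(ks) * d                    # one centroid per cluster
    return n * np.log(rss / n + 1e-12) + n_params * np.log(n)
```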