Papers by John Freddy Duitama M
Applications of Computational Intelligence, 2021
2020 IEEE Colombian Conference on Applications of Computational Intelligence (IEEE ColCACI 2020), 2020
The optimization of the resources used in clinics and hospitals is a key problem in hospital management. In particular, improving the efficiency of procedures and treatments while reducing cost, without degrading the quality of the patient's stay, is one of the greatest challenges faced by health providers. Tools that help health care providers ensure that inpatients are discharged at the times indicated by international standards for their pathological condition are therefore of great interest for resource optimization, especially in developing countries. There are different standards for grouping patients according to their diagnosis and procedure information; this work focuses on the Diagnosis-Related Groups (DRGs) patient classification system. Typically, DRGs are obtained after the patient's discharge, only for billing and payment purposes, which reduces the ability of health providers to take corrective action when care deviates from the standard for a specific patient condition. This work focuses on the use of Machine Learning (ML) techniques as an alternative to regular DRG classification methods. The main aim is to evaluate whether ML methods are able to classify patients according to the DRG standard, using the information available at the patient's discharge. These results would be the baseline for further analysis focused on predicting DRGs in the early stages of the patient's hospitalization. The results show that DRG classification using Artificial Neural Networks and Ensemble methods can achieve up to 96% accuracy on a real database of more than 82,910 health records.
Advances in Mechanics, Aug 24, 2021
The central goal of an Intrusion Detection System (IDS) is to find possible attacks or abnormal behaviors within a network or system. Industrial Control Systems, or SCADA systems, are increasingly robust and sophisticated, allowing variables in PLC controllers to be observed and manipulated remotely. Moreover, in recent years information exchange and monitoring have been integrated over the internet using IoT, creating the possibility of cyber-attacks that can put at risk the system and even a country's national
2018 IEEE 1st Colombian Conference on Applications in Computational Intelligence (ColCACI), 2018
Support Vector Machine (SVM) is a classifier widely used in machine learning because of its high generalization capacity. Sequential minimal optimization (SMO), its most popular implementation, scales somewhere between linear and quadratic in the training set size for various test problems. As a result, training an SVM on large data sets has a high computational cost. SVM implementations on distributed systems such as MapReduce and Spark have proven effective at reducing this cost; this paper analyzes how data subset size and the number of mapping tasks affect SVM performance on MapReduce and Spark. It also proposes a cost model as a tool for setting the data subset size according to the available hardware and the data to be processed.
2019 Congreso Internacional de Innovación y Tendencias en Ingenieria (CONIITI ), 2019
Detecting the root cause of a performance problem in a distributed system is a complex and costly task. Identifying the fault, which can be internal or external, requires deep knowledge of the system and tools that can process and filter large amounts of information. This paper describes a methodology to identify performance problems in distributed systems. Operation flows are inferred from log files and used to measure a performance indicator. This information is complemented by system data, such as per-node metrics (CPU, memory, disk), and analyzed with two machine learning techniques, multivariate regression and one-class support vector machine (OCSVM), in order to predict the expected performance and detect unusual events. The model is implemented in a tool called Logmapper, which is used to validate the approach in a controlled test environment where several types of failures that affect performance were applied. The validation results comprise the validation metrics of the learning methods and the response given by the tool when failures were induced in the distributed system.
Revista Facultad De Ingenieria-universidad De Antioquia, 2016
This work takes as its point of reference the methodology used to design a distributed relational database, evaluates which new problems arise in the case of an object-oriented database, and proposes a treatment for those aspects.
2017 IEEE Colombian Conference on Communications and Computing (COLCOM), 2017
SCADA (Supervisory Control And Data Acquisition) systems are control networks that allow industrial processes to be monitored and managed remotely. Initially, their top priority was the bidirectional availability of information between the control station and the remote units; however, the growth of industrial systems and of internet connectivity has led to a reconsideration of the old paradigm, giving more importance to security in order to prevent a cyber-attack from endangering the functioning of the SCADA system. These attacks can affect the industry and even put a country's security at risk. This paper proposes an adaptive intrusion detection system (IDS) for SCADA networks that uses supervised machine learning techniques oriented to the analysis of control device variables. A one-class support vector machine and a test lab allowed the validation of the proposed model.
Communications in Computer and Information Science, 2016
Software reuse in the early stages is a key issue in the rapid development of applications. Recently, several methodologies have been proposed for the reuse of components, but mainly with generated code as the artifact. These methodologies only partially consider domain analysis, business modeling, and reuse through components. This paper introduces a metaprocess-oriented methodology based on reusing metaprocesses as software assets, starting from the specification and analysis of the domain. The approach includes the definition of a conceptual level to adequately represent the domain, a reuse process to specify the metaprocess as software assets, and an implementation level that defines the rules for the conceptual level and for metaprocess reuse. The methodology has been applied successfully to the first phase, i.e., the specification of the conceptual level, in the field of e-health, in particular in a monitoring system for patients with cardiovascular risk. Our work has also advanced toward the reuse of models for implementation in other contexts, contributing to productivity in software development.
Lecture Notes in Computer Science, 2012
International Journal of Web Engineering and Technology, 2016
Although web personalisation has been studied for the last two decades, there remains a need to address current challenges: context-awareness and inclusion in a business environment. The wide variety of mobile devices and their continuous technological evolution demand the permanent development of new personalisation strategies. Additionally, two factors complicate the inclusion of personalised web applications in a business environment: the frequent change of personalisation strategies for each business, and the technical complexity of integrating these strategies in a short time. We propose a reference architecture as a tool to favour their modifiability. Moreover, our proposal makes it easier for enterprises to adopt web-personalised systems as a strategic tool. A controlled experiment validates our approach; we compare five change scenarios implemented under two architectures, an experimental one and a control one. The change scenarios were derived from a real Brazilian e-commerce enterprise.
Revista de la Facultad de Ingenieria
Personalización de contenidos en sistemas hipermedia educativos adaptativos: una revisión
Chapter 9: An Educational Component Model for Learning Objects. John Freddy Duitama, Bruno Defude, Claire Lecocq, and Amel Bouzeghoub. Preface by Keith Harman & Alex Koohang.
Revista Facultad Nacional De Salud Publica, Aug 1, 2012
Received: September 30, 2011. Accepted: May 5, 2012. Zambrano R, Duitama JF, Posada JI, Flórez JF. Percepción de la adherencia a tratamientos en pacientes con factores de riesgo cardiovascular [Perceived adherence to treatment in patients with cardiovascular risk factors]. Rev. Fac. Nac. Salud Pública 2012; 30(2): 163-174
The primary business problem that a Learning Content Management System faces is to create just enough content, just in time, meeting the needs of different types of learners. A possible answer is the concept of an educational component. This paper addresses the formal definition of an educational component model, with the Learning Management System as its framework. We define both canonical operators, to facilitate component composition and automatic metadata definition, and high-level constructors to coalesce and copy components. The goal is to provide a component-based environment for composing and delivering platform-independent educational content to support adaptive learning. Additionally, we provide strategies to classify educational chunks, constituting a common knowledge base and facilitating the sharing of learning objects.
Revista de la Facultad de Ingenieria