Editors: Francisco Ortuño, Ignacio Rojas. Chapter: Identification of Biologically Significant Elements Using Correlation Networks in High Performance Computing Environments, co-authored by UNO faculty members Kathryn Dempsey Cooper, Sachin Pawaskar, and Hesham Ali. The two-volume set LNCS 9043 and 9044 constitutes the refereed proceedings of the Third International Conference on Bioinformatics and Biomedical Engineering, IWBBIO 2015, held in Granada, Spain, in April 2015. The 134 papers presented were carefully reviewed and selected from 268 submissions. The scope of the conference spans the following areas: bioinformatics for healthcare and diseases, biomedical engineering, biomedical image analysis, biomedical signal analysis, computational genomics, computational proteomics, computational systems for modelling biological processes, eHealth, next generation sequencing and sequence analysis, quantitative and systems pharmacology, Hidden Markov Model (HMM) for biological sequence mod...
Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies, 2017
Horizontal gene transfer is a major driver of bacterial evolution and adaptation to niche environments. This holds true for the complex microbiome of the human gut. Crohn's disease is a debilitating condition characterized by inflammation and gut bacterial dysbiosis. In previous research, we analyzed transposase-associated antibiotic resistance genes in Crohn's disease and healthy gut microbiome metagenomics data sets using a graph mining approach. Results demonstrated that there were significant differences in the type and bacterial distribution of transposase-associated antibiotic resistance genes in the Crohn's and healthy data sets. In this paper, we extend the previous research by considering all gene features associated with transposase sequences in the Crohn's disease and healthy data sets. Results demonstrate that some transposase-associated features are more prevalent in Crohn's disease data sets than in healthy data sets. This study may provide insights into the adaptation of bacteria to gut conditions such as Crohn's disease.
High throughput biological experiments are critical for their role in systems biology: the ability to survey the state of cellular mechanisms on a broad scale opens possibilities for the scientific researcher to understand how multiple components come together, and what goes wrong in disease states. However, the data returned from these experiments are massive and heterogeneous, and require intuitive and clever computational algorithms for analysis. The correlation network model has been proposed as a tool for modeling and analysis of this high throughput data; structures within the model identified by graph theory have been found to represent key players in major cellular pathways. Previous work has found that network filtering using graph theoretic structural concepts can reduce noise and strengthen biological signals in these networks. However, the process of filtering biological networks using such filters is computationally intensive, and the filtered networks remain large. In this research, we develop a parallel template for these network filters to improve runtime, and use this high performance environment to show that parallelization does not affect network structure or the biological function of that structure.
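The correlation-network idea above can be sketched in a few lines: connect genes whose expression profiles correlate strongly, then apply a simple graph-theoretic filter. The correlation threshold, the degree-based filter, and the toy expression values below are illustrative assumptions, not the specific filters or data used in this work.

```python
from itertools import combinations

def pearson(x, y):
    # Pearson correlation of two equal-length numeric profiles.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def build_network(expr, threshold=0.9):
    # Edge between two genes when |correlation| meets the threshold.
    return {frozenset((g, h))
            for g, h in combinations(expr, 2)
            if abs(pearson(expr[g], expr[h])) >= threshold}

def degree_filter(edges, min_degree=2):
    # Structural filter: iteratively drop nodes whose degree is too low,
    # a simple stand-in for graph-theoretic noise filtering.
    edges = set(edges)
    while True:
        deg = {}
        for e in edges:
            for v in e:
                deg[v] = deg.get(v, 0) + 1
        weak = {v for v, d in deg.items() if d < min_degree}
        if not weak:
            return edges
        edges = {e for e in edges if not (e & weak)}

# Hypothetical expression profiles (4 samples per gene).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],   # tracks geneA closely
    "geneC": [1.1, 2.2, 2.9, 4.2],   # tracks geneA closely
    "geneD": [4.0, 1.0, 3.5, 0.5],   # uncorrelated noise
}
net = build_network(expr)        # triangle among geneA, geneB, geneC
core = degree_filter(net)        # triangle survives the degree filter
```

The parallel template in the paper distributes exactly this kind of all-pairs correlation and filtering work; the point of the sketch is only the serial logic being parallelized.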
Sequence comparison remains one of the main computational tools in bioinformatics research. It is an essential starting point for addressing many problems in bioinformatics, including problems associated with recognition and classification of organisms. Although sequence alignment provides a well-studied approach for comparing sequences, it has been well documented and reported that sequence alignment fails to solve several instances of the sequence comparison problem, particularly for sequences that contain errors or that represent incomplete genomes. In this work, we propose an approach to identify the relatedness among species based on whether their sequences contain similar short sequences or signals. We cluster species based on biological signals such as restriction enzymes or short sequences that occur in the coding regions, as well as random signals for baseline comparison. We focus on identifying k-mers (motifs) that would produce the best results using this a...
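A minimal sketch of signal-based comparison: represent each species by the set of k-mers its sequence contains, and relate species by profile similarity rather than by alignment. The Jaccard measure, k = 3, and the toy sequences are assumptions for illustration; they are not the paper's actual signals or clustering method.

```python
def kmer_set(seq, k=3):
    # All distinct length-k substrings of a sequence.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(a, b):
    # Similarity of two k-mer sets: shared k-mers over all k-mers seen.
    return len(a & b) / len(a | b)

def nearest_neighbor(seqs, k=3):
    # For each species, the other species with the most similar k-mer profile.
    profiles = {name: kmer_set(s, k) for name, s in seqs.items()}
    return {name: max((o for o in profiles if o != name),
                      key=lambda o: jaccard(profiles[name], profiles[o]))
            for name in profiles}

# Hypothetical short genome fragments.
seqs = {
    "sp1": "ACGTACGTAC",
    "sp2": "ACGTACGTTT",   # shares most 3-mers with sp1
    "sp3": "GGGCCCGGGC",   # shares none with sp1 or sp2
}
nn = nearest_neighbor(seqs)
```

Because this comparison never aligns full sequences, it tolerates the errors and incomplete genomes that defeat alignment, which is the motivation stated in the abstract.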
Loss of postnatal mammalian auditory hair cells (HCs) is irreversible. Earlier studies have highlighted the importance of the Retinoblastoma family of proteins (pRBs) (i.e., Rb1, Rbl1/p107, and Rbl2/p130) in the auditory cells' proliferation and emphasized our lack of information on their specific roles in the auditory system. We have previously demonstrated that lack of Rbl2/p130 moderately affects HCs' and supporting cells' (SCs) proliferation. Here, we present evidence supporting multiple roles for Rbl1/p107 in the developing and mature mouse organ of Corti (OC). Like other pRBs, Rbl1/p107 is expressed in the OC, particularly in the Hensen's and Deiters' cells. Moreover, Rbl1/p107 impacts maturation and postmitotic quiescence of HCs and SCs, as evidenced by enhanced numbers of these cells and the presence of dividing cells in the postnatal Rbl1/p107 −/− OC. These findings were further supported by microarray and bioinformatics analyses, suggesting downregulation of several bHLH molecules, as well as activation of the Notch/Hes/Hey signaling pathway in homozygous Rbl1/p107 mutant mice. Physiological assessments and detection of ectopic HC marker expression in postnatal spiral ganglion neurons (SGNs) provided evidence for incomplete cell maturation and differentiation in Rbl1/p107 −/− OC. Collectively, the present study highlights an important role for Rbl1/p107 in OC cell differentiation and maturation, which is distinct from other pRBs.
in Chamonix, France, covered these three main areas: bioinformatics, biomedical technologies, and biocomputing. Bioinformatics deals with the system-level study of complex interactions in biosystems providing a quantitative systemic approach to understand them and appropriate tool support and concepts to model them. Understanding and modeling biosystems requires simulation of biological behaviors and functions. Bioinformatics itself constitutes a vast area of research and specialization, as many classical domains such as databases, modeling, and regular expressions are used to represent, store, retrieve and process a huge volume of knowledge. There are challenging aspects concerning biocomputation technologies, bioinformatics mechanisms dealing with chemoinformatics, bioimaging, and neuroinformatics. Biotechnology is defined as the industrial use of living organisms or biological techniques developed through basic research. Bio-oriented technologies became very popular in various re...
Proceedings of the 52nd Hawaii International Conference on System Sciences, 2019
Parkinson's Disease is a worldwide health problem, causing movement disorder and gait deficiencies. Automatic noninvasive techniques for Parkinson's disease diagnosis are appreciated by patients, clinicians, and neuroscientists. Gait offers many advantages compared to other biometrics, specifically when data is collected using wearable devices; data collection can be performed through inexpensive technologies, remotely, and continuously. In this study, a new set of gait features associated with Parkinson's Disease are introduced and extracted from accelerometer data. Then, we used a feature selection technique called maximum information gain minimum correlation (MIGMC). Using MIGMC, features are first reduced based on the Information Gain method and then through Pearson correlation analysis and the Tukey post-hoc multiple comparison test. The ability of several machine learning methods, including Support Vector Machine, Random Forest, AdaBoost, Bagging, and Naïve Bayes, is investigated across different feature sets. Similarity Network analysis is also performed to validate our optimal feature set obtained using the MIGMC technique. The effect of feature standardization is also investigated. Results indicate that standardization could improve all classifiers' performance. In addition, the feature set obtained using MIGMC provided the highest classification performance. Our results from Similarity Network analysis are consistent with our results from the classification task, emphasizing the importance of choosing an optimal set of gait features to help objective assessment and automatic diagnosis of Parkinson's disease. Results illustrate that ensemble methods, and specifically boosting classifiers, performed better than other classifiers.
In summary, our preliminary results support the potential benefit of accelerometers as an objective tool for diagnostic purposes in PD.
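The two-stage selection described above can be sketched as follows: rank features by information gain against the class labels, then discard any feature that correlates too strongly with one already kept. This sketch omits the Tukey post-hoc step, and the gain computation, correlation cutoff, and toy binary gait features are illustrative assumptions, not the paper's implementation.

```python
from math import log2

def entropy(labels):
    # Shannon entropy of a label list.
    n = len(labels)
    counts = {l: labels.count(l) for l in set(labels)}
    return -sum((c / n) * log2(c / n) for c in counts.values())

def info_gain(feature, labels):
    # Reduction in label entropy after splitting on a discrete feature.
    n = len(labels)
    gain = entropy(labels)
    for value in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == value]
        gain -= len(subset) / n * entropy(subset)
    return gain

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def migmc_like(features, labels, max_corr=0.8):
    # Stage 1: keep informative features, ranked by information gain.
    ranked = sorted((f for f in features if info_gain(features[f], labels) > 0),
                    key=lambda f: info_gain(features[f], labels), reverse=True)
    # Stage 2: drop features too correlated with an already-selected one.
    kept = []
    for f in ranked:
        if all(abs(pearson(features[f], features[g])) < max_corr for g in kept):
            kept.append(f)
    return kept

# Hypothetical binary gait features for four subjects (label 1 = PD).
labels = [0, 0, 1, 1]
features = {
    "stride":  [0, 0, 1, 1],   # perfectly informative
    "cadence": [0, 0, 1, 1],   # duplicate of stride, should be pruned
    "noise":   [0, 1, 0, 1],   # carries no information gain
}
selected = migmc_like(features, labels)
```

The redundant duplicate is pruned and the uninformative feature never enters the ranking, which is the intuition behind maximizing gain while minimizing correlation.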
Many studies have reported inconsistent cancer biomarkers due to bioinformatics artifacts. In this paper, we use multiple data sets from microarrays, mass spectrometry, protein sequences, and other biological knowledge in order to improve the reliability of cancer biomarkers. We present a novel Bayesian network (BN) model which integrates and cross-annotates multiple data sets related to prostate cancer. The main contribution of this study is that we provide a method designed to find cancer biomarkers whose presence is supported by multiple data sources and biological knowledge. Relevant biological knowledge is explicitly encoded into the model parameters, and the biomarker finding problem is formulated as a Bayesian inference problem. Besides diagnostic accuracy, we introduce reliability as another quality measure of the biological relevance of biomarkers. Based on the proposed BN model, we develop an empirical scoring scheme and a simulation algorithm for inferring biomarkers....
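The evidence-combination idea can be illustrated with a naive odds-based Bayesian update: a candidate biomarker's posterior probability of being genuine rises with each independent supporting data source. The prior and the likelihood ratios below are invented for illustration; the paper's BN model and scoring scheme are considerably more elaborate.

```python
def posterior(prior, likelihood_ratios):
    # Posterior odds = prior odds x product of per-source likelihood ratios,
    # assuming the sources are conditionally independent (naive-Bayes style).
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Hypothetical candidate: weak prior, supported by three sources
# (microarray, mass spectrometry, sequence annotation).
p = posterior(0.1, [4.0, 3.0, 2.0])   # about 0.727
```

A biomarker backed by several moderately supportive sources thus overtakes one flagged strongly by a single source, which mirrors the paper's goal of rewarding multi-source support over single-data-set artifacts.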
Proceedings of the 37th Annual Hawaii International Conference on System Sciences, 2004
Wireless technologies such as 802.11 do not impose topology constraints on the network; however, Bluetooth imposes certain constraints for constructing valid topologies. The performance of Bluetooth depends largely on these topologies. This paper presents and evaluates the performance of a new evolutionary scatternet topology construction protocol for Bluetooth networks. A scatternet can be viewed as a Bluetooth ad hoc network that is formed by interconnecting piconets. The scatternets formed have the following properties: 1) the scatternets are connected, i.e., every Bluetooth device can be reached from every other device, 2) piconet size is limited to eight nodes to avoid "parking" of slaves and the associated overhead, 3) the number of piconets is close to the universal lower bound that defines the optimal number of piconets, resulting in low interference amongst piconets, and 4) end-user delay is minimized during scatternet formation. This paper also reviews existing approaches to constructing scatternet topologies and suggests extensions to the proposed scatternet formation protocol.
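The piconet-size constraint lends itself to a small sketch: with at most eight active members per piconet, a simple covering bound on the number of piconets is ceil(n/8). This ignores bridge nodes shared between piconets, so it is only a lower bound under that simplifying assumption, and the greedy partition below is illustrative, not the evolutionary protocol the paper proposes.

```python
from math import ceil

PICONET_MAX = 8  # one master plus up to seven active slaves

def piconet_lower_bound(n_devices):
    # Simplest covering bound: every device joins at least one piconet and
    # a piconet holds at most eight active members. Real scatternets with
    # shared bridge nodes may need more piconets than this.
    return ceil(n_devices / PICONET_MAX)

def greedy_piconets(devices):
    # Naive partition into groups of at most eight devices; the first device
    # of each group plays master. Illustrative only: it ignores radio range,
    # interference, and the bridge links needed to connect the piconets.
    return [devices[i:i + PICONET_MAX]
            for i in range(0, len(devices), PICONET_MAX)]

groups = greedy_piconets(list(range(20)))   # 20 devices -> 3 piconets
```

Property 3 in the abstract says the protocol's piconet count stays close to this kind of bound, which keeps inter-piconet interference low.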
2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
High performance computing has become essential for many biomedical applications as the production of biological data continues to increase. Next Generation Sequencing (NGS) technologies are capable of producing millions to even billions of short DNA fragments called reads. These short reads are assembled into larger sequences called contigs by graph theoretic software tools called assemblers. High performance computing has been applied to reduce the computational burden of several steps of the NGS data assembly process. Several parallel de Bruijn graph assemblers rely on a distributed assembly graph. However, the majority of assemblers that utilize distributed assembly graphs do not take the input properties of the data set into consideration to improve the graph partitioning process. Furthermore, the graph theoretic foundation for the majority of these assemblers is a distributed de Bruijn graph. In this paper, we introduce a distributed overlap-graph-based model upon which our parallel assembler Focus is built. The contribution of this paper is threefold. First, we demonstrate that the application of data-specific knowledge regarding the inherent linearity of DNA sequences can be used to improve the partitioning processes for distributing the assembly graph. Second, we implement several parallel graph algorithms for assembly with greatly improved speedup. Finally, we demonstrate that for metagenomics datasets, the graph partitioning provides insights into the structure of the microbial community.
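The linearity argument can be illustrated with a toy overlap-graph walk: reads are nodes, suffix-prefix overlaps are edges, and an unambiguous chain of overlaps spells out a contig. The overlap-length cutoff and the reads below are assumptions for illustration; Focus's distributed partitioning and parallel algorithms are far more involved, and real assemblers must handle branching, repeats, and sequencing errors.

```python
def overlap(a, b, min_len=3):
    # Length of the longest suffix of a that is also a prefix of b,
    # or 0 if no overlap of at least min_len exists.
    for l in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-l:] == b[:l]:
            return l
    return 0

def assemble(reads, min_len=3):
    # Greedy walk over the overlap graph. Only correct when the reads form
    # a single unambiguous, acyclic chain, which is the point of the toy.
    edges = {a: max(((b, overlap(a, b, min_len)) for b in reads if b != a),
                    key=lambda t: t[1])
             for a in reads}
    has_pred = {b for a, (b, l) in edges.items() if l >= min_len}
    start = next(r for r in reads if r not in has_pred)  # chain head
    contig, cur = start, start
    while True:
        nxt, l = edges[cur]
        if l < min_len:
            return contig
        contig += nxt[l:]   # append only the non-overlapping tail
        cur = nxt

# Hypothetical reads covering one linear sequence.
reads = ["ACGTAC", "TACGGA", "GGATTT"]
contig = assemble(reads)   # "ACGTACGGATTT"
```

Because the underlying DNA is linear, most nodes in such a graph lie on long chains, and that structure is exactly what the paper exploits when partitioning the graph across processors.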
Strong collaborative partnerships are critical to the ongoing success of any urban or metropolitan university in its efforts to build the science, technology, engineering, and mathematics (STEM) career pathways so critical to our nation. At the University of Nebraska at Omaha, we have established a faculty leadership structure of "community chairs" that work across colleges to support campus priorities. This paper describes UNO’s STEM community chair model, including selected initiatives, impacts, and challenges to date.
The influx of biomedical measurement technologies continues to define a rapidly changing and growing landscape, multi-modal and uncertain in nature. The focus of the biomedical research community has shifted from pure data generation to the development of methodologies for data analytics. Although many researchers continue to focus on approaches developed for analyzing single types of biological data, recent attempts have been made to utilize multiple heterogeneous data sets that contain various types of data and to establish tools for data fusion and analysis in many bioinformatics applications. At the heart of this initiative is the attempt to consolidate domain knowledge and experimental data sources in order to enhance our understanding of highly specific conditions dependent on sensory data containing inherent error. This challenge refers to granularity: the specificity or mereology of alternate information sources may impact the final data fusion. In an...
The notion of repurposing existing drugs to treat both common and rare diseases has gained traction in both academia and pharmaceutical companies. Given the high attrition rates, massive time, money, and effort of brand-new drug development, the advantages of drug repurposing in terms of lower costs and shorter development time have become more appealing. Computational drug repurposing is a promising approach and has shown great potential in tailoring genomic findings to the development of treatments for diseases. However, there are still challenges involved in building a standard computational drug repurposing solution for high-throughput analysis and its implementation in clinical practice. In this study, we applied computational drug repurposing approaches for Ulcerative Colitis (UC) patients to provide better treatment for this disabling disease. Repositioning drug candidates were identified, and these findings provide potentially effective therapeutics for the treatmen...
The last few years have witnessed significant developments in various aspects of Biomedical Informatics, including Bioinformatics, Medical Informatics, Public Health Informatics, and Biomedical Imaging. The explosion of medical and biological data requires an associated increase in the scale and sophistication of the automated systems and intelligent tools that enable researchers to take full advantage of the available databases. The availability of vast amounts of biological data continues to represent unlimited opportunities as well as great challenges in biomedical research. Developing innovative data mining techniques and clever parallel computational methods to implement them will surely play an important role in efficiently extracting useful knowledge from the raw data currently available. The proper integration of carefully selected/developed algorithms along with efficient utilization of high performance computing systems form the key ingredients in the process of reaching ...
The importance of human mobility in maintaining physical health is of emerging interest in research and practice. Technological advances in wearable technology enable us to monitor human mobility in out-of-laboratory settings. Although a large amount of human mobility data is available from wearable sensors, there is a lack of systematic methodologies for extracting useful knowledge on human mobility from the collected data. The objective description of the different statuses of mobility patterns to interpret different physical health levels remains especially challenging. In this paper, robust network modeling from our preliminary study is validated in a real-world scenario with stable and unstable mobility conditions. The models based on population analysis utilize mobility data and extract distinctive mobility characteristics. Correlation networks and population-based analysis are utilized to efficiently examine the natural variability of human movement. Results demonstrate that the...
With the continuous advancements of biomedical instruments and the associated ability to collect diverse types of valuable biological data, numerous recent research studies have focused on how best to extract useful information from the Big Biomedical Data currently available. While drug design has been one of the most essential areas of biomedical research, the drug design process for the most part has not fully benefited from the recent explosive growth of biological data and bioinformatics tools. With the significant overhead associated with the traditional drug design process in terms of time and cost, new alternative methods, possibly based on computational approaches, are very much needed to offer innovative ways to identify effective drugs and new treatment options. Employing advanced computational tools for drug design and precision treatments has been the focus of many research studies in recent years. For example, drug repurposing has gained significant attention fr...
Due to advancements in high throughput technologies and robust experimental designs, many recent studies attempt to incorporate heterogeneous data obtained from multiple technologies to improve our understanding of the molecular dynamics associated with biological processes. Currently available technologies produce large amounts of a wide variety of data, spanning genomics, transcriptomics, proteomics, and epigenetics. Because such multi-omics data are diverse and come from different biological levels, it has been a major research challenge to develop a model that properly integrates all available and relevant data to advance biomedical research. It has been argued by many researchers that the integration of multi-omics data to extract relevant biological information is currently one of the major biomedical informatics challenges. This paper proposes a new graph database model to efficiently store and mine multi-omics data. We show a working model of this graph da...
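A toy version of a labeled-property graph for multi-omics data might look like the sketch below: typed nodes for entities at different biological levels (gene, transcript, protein) connected by typed relations. The class design and relation names are illustrative assumptions, not the paper's actual schema or database engine.

```python
class OmicsGraph:
    # Minimal labeled-property graph: every node has a type, every edge a
    # relation label, so entities from different omics layers can coexist.
    def __init__(self):
        self.nodes = {}   # node id -> node type
        self.edges = {}   # node id -> list of (relation, neighbor id)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type
        self.edges.setdefault(node_id, [])

    def add_edge(self, src, relation, dst):
        # Directed, typed edge from src to dst.
        self.edges[src].append((relation, dst))

    def neighbors(self, node_id, relation=None):
        # Follow outgoing edges, optionally restricted to one relation type.
        return [d for r, d in self.edges[node_id]
                if relation is None or r == relation]

# Hypothetical multi-omics fragment: central-dogma chain for one gene.
g = OmicsGraph()
g.add_node("TP53", "gene")
g.add_node("TP53-mRNA", "transcript")
g.add_node("p53", "protein")
g.add_edge("TP53", "transcribed_to", "TP53-mRNA")
g.add_edge("TP53-mRNA", "translated_to", "p53")
```

Typed relations make cross-level queries natural, e.g. walking from a gene through its transcript to its protein; that traversal-friendly structure is what motivates a graph database over relational tables for multi-omics mining.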
Editors: Francisco Ortuño, Ignacio Rojas Chapter, Identification of Biologically Significant Elem... more Editors: Francisco Ortuño, Ignacio Rojas Chapter, Identification of Biologically Significant Elements Using Correlation Networks in High Performance Computing Environments, co-authored by Kathryn Dempsey Cooper, Sachin Pawaskar, and Hesham Ali, UNO faculty members. The two volume set LNCS 9043 and 9044 constitutes the refereed proceedings of the Third International Conference on Bioinformatics and Biomedical Engineering, IWBBIO 2015, held in Granada, Spain in April 2015. The 134 papers presented were carefully reviewed and selected from 268 submissions. The scope of the conference spans the following areas: bioinformatics for healthcare and diseases, biomedical engineering, biomedical image analysis, biomedical signal analysis, computational genomics, computational proteomics, computational systems for modelling biological processes, eHealth, next generation sequencing and sequence analysis, quantitative and systems pharmacology, Hidden Markov Model (HMM) for biological sequence mod...
Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies, 2017
Horizontal gene transfer is a major driver of bacterial evolution and adaptation to niche environ... more Horizontal gene transfer is a major driver of bacterial evolution and adaptation to niche environments. This holds true for the complex microbiome of the human gut. Crohn's disease is a debilitating condition characterized by inflammation and gut bacteria dysbiosis. In previous research, we analyzed transposase associated antibiotic resistance genes in Crohn's disease and healthy gut microbiome metagenomics data sets using a graph mining approach. Results demonstrated that there were significant differences in the type and bacterial distribution of transposase-associated antibiotic resistance genes in the Crohn's and healthy data sets. In this paper, we extend the previous research by considering all gene features associated with transposase sequences in the Crohn's disease and healthy data sets. Results demonstrate that some transposase-associated features are more prevalent in Crohn's disease data sets than healthy data sets. This study may provide insights into the adaptation of bacteria to gut conditions such as Crohn's disease.
High throughput biological experiments are critical for their role in systems biology-the ability... more High throughput biological experiments are critical for their role in systems biology-the ability to survey the state of cellular mechanisms on the broad scale opens possibilities for the scientific researcher to understand how multiple components come together, and what goes wrong in disease states. However, the data returned from these experiments is massive and heterogeneous, and requires intuitive and clever computational algorithms for analysis. The correlation network model has been proposed as a tool for modeling and analysis of this high throughput data; structures within the model identified by graph theory have been found to represent key players in major cellular pathways. Previous work has found that network filtering using graph theoretic structural concepts can reduce noise and strengthen biological signals in these networks. However, the process of filtering biological network using such filters is computationally intensive and the filtered networks remain large. In this research, we develop a parallel template for these network filters to improve runtime, and use this high performance environment to show that parallelization does not affect network structure or biological function of that structure.
— Sequence comparison remains one of the main computational tools in bioinformatics research. It ... more — Sequence comparison remains one of the main computational tools in bioinformatics research. It is an essential starting point for addressing many problems in bioinformatics; including problems associated with recognition and classification of organisms. Although sequence alignment provides a well-studied approach for comparing sequences, it has been well documented and reported that sequence alignment fails to solve several instances of the sequence comparison problem, particularly for those sequences that contains errors or those that represent incomplete genomes. In this work, we propose an approach to identify the relatedness among species based on whether their sequences contain similar short sequences or signals. We cluster species based on biological signals such as restriction enzymes or short sequences that occur in the coding regions, as well as random signals for baseline comparison. We focus on identifying k-mers (motifs) that would produce the best results using this a...
Loss of postnatal mammalian auditory hair cells (HCs) is irreversible. Earlier studies have highl... more Loss of postnatal mammalian auditory hair cells (HCs) is irreversible. Earlier studies have highlighted the importance of the Retinoblastoma family of proteins (pRBs) (i.e., Rb1, Rbl1/p107, and Rbl2/p130) in the auditory cells' proliferation and emphasized our lack of information on their specific roles in the auditory system. We have previously demonstrated that lack of Rbl2/p130 moderately affects HCs' and supporting cells' (SCs) proliferation. Here, we present evidence supporting multiple roles for Rbl1/p107 in the developing and mature mouse organ of Corti (OC). Like other pRBs, Rbl1/p107 is expressed in the OC, particularly in the Hensen's and Deiters' cells. Moreover, Rbl1/p107 impacts maturation and postmitotic quiescence of HCs and SCs, as evidenced by enhanced numbers of these cells and the presence of dividing cells in the postnatal Rbl1/p107 −/− OC. These findings were further supported by microarray and bioinformatics analyses, suggesting downregulation of several bHLH molecules, as well as activation of the Notch/Hes/Hey signaling pathway in homozygous Rbl1/p107 mutant mice. Physiological assessments and detection of ectopic HC marker expression in postnatal spiral ganglion neurons (SGNs) provided evidence for incomplete cell maturation and differentiation in Rbl1/p107 −/− OC. Collectively, the present study highlights an important role for Rbl1/p107 in OC cell differentiation and maturation, which is distinct from other pRBs.
in Chamonix, France, covered these three main areas: bioinformatics, biomedical technologies, and... more in Chamonix, France, covered these three main areas: bioinformatics, biomedical technologies, and biocomputing. Bioinformatics deals with the system-level study of complex interactions in biosystems providing a quantitative systemic approach to understand them and appropriate tool support and concepts to model them. Understanding and modeling biosystems requires simulation of biological behaviors and functions. Bioinformatics itself constitutes a vast area of research and specialization, as many classical domains such as databases, modeling, and regular expressions are used to represent, store, retrieve and process a huge volume of knowledge. There are challenging aspects concerning biocomputation technologies, bioinformatics mechanisms dealing with chemoinformatics, bioimaging, and neuroinformatics. Biotechnology is defined as the industrial use of living organisms or biological techniques developed through basic research. Bio-oriented technologies became very popular in various re...
Proceedings of the 52nd Hawaii International Conference on System Sciences, 2019
Parkinson's Disease is a worldwide health problem, causing movement disorder and gait deficiencie... more Parkinson's Disease is a worldwide health problem, causing movement disorder and gait deficiencies. Automatic noninvasive techniques for Parkinson's disease diagnosis is appreciated by patients, clinicians and neuroscientists. Gait offers many advantages compared to other biometrics specifically when data is collected using wearable devices; data collection can be performed through inexpensive technologies, remotely, and continuously. In this study, a new set of gait features associated with Parkinson's Disease are introduced and extracted from accelerometer data. Then, we used a feature selection technique called maximum information gain minimum correlation (MIGMC). Using MIGMC, features are first reduced based on Information Gain method and then through Pearson correlation analysis and Tukey post-hoc multiple comparison test. The ability of several machine learning methods, including Support Vector Machine, Random Forest, AdaBoost, Bagging, and Naïve Bayes are investigated across different feature sets. Similarity Network analysis is also performed to validate our optimal feature set obtained using MIGMC technique. The effect of feature standardization is also investigated. Results indicates that standardization could improve all classifiers' performance. In addition, the feature set obtained using MIGMC provided the highest classification performance. It is shown that our results from Similarity Network analysis are consistent with our results from the classification task, emphasizing on the importance of choosing an optimal set of gait features to help objective assessment and automatic diagnosis of Parkinson's disease. Results illustrate that ensemble methods and specifically boosting classifiers had better performances than other classifiers. 
In summary, our preliminary results support the potential benefit of accelerometers as an objective tool for diagnostic purposes in PD.
Many studies showed inconsistent cancer biomarkers due to bioinformatics artifacts. In this paper... more Many studies showed inconsistent cancer biomarkers due to bioinformatics artifacts. In this paper we use multiple data sets from microarrays, mass spectrometry, protein sequences, and other biological knowledge in order to improve the reliability of cancer biomarkers. We present a novel Bayesian network (BN) model which integrates and cross-annotates multiple data sets related to prostate cancer. The main contribution of this study is that we provide a method that is designed to find cancer biomarkers whose presence is supported by multiple data sources and biological knowledge. Relevant biological knowledge is explicitly encoded into the model parameters, and the biomarker finding problem is formulated as a Bayesian inference problem. Besides diagnostic accuracy, we introduce reliability as another quality measurement of the biological relevance of biomarkers. Based on the proposed BN model, we develop an empirical scoring scheme and a simulation algorithm for inferring biomarkers....
Proceedings of the 37th Annual Hawaii International Conference on System Sciences, 2004
Wireless technologies such as 802.11 do not impose topology constraints on the network; Bluetooth, however, imposes certain constraints for constructing valid topologies, and its performance depends largely on these topologies. This paper presents and evaluates the performance of a new evolutionary scatternet topology construction protocol for Bluetooth networks. A scatternet can be viewed as a Bluetooth ad hoc network that is formed by interconnecting piconets. The scatternets formed have the following properties: 1) the scatternets are connected, i.e., every Bluetooth device can be reached from every other device; 2) piconet size is limited to eight nodes to avoid "parking" of slaves and the associated overhead; 3) the number of piconets is close to the universal lower bound that defines the optimal number of piconets, resulting in low interference amongst piconets; and 4) end-user delay is minimized during scatternet formation. This paper also reviews existing approaches to constructing scatternet topologies and suggests extensions to the proposed scatternet formation protocol.
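The eight-node piconet limit in property 2 above (one master plus at most seven active slaves) can be illustrated with a trivial partitioning sketch. The function name and the naive sequential split are hypothetical; the paper's evolutionary protocol optimizes far more than this, but the sketch shows why the piconet count for n devices can never drop below ceil(n/8).

```python
import math

def partition_into_piconets(devices, max_size=8):
    """Naively split a device list into groups of at most max_size members,
    mirroring the Bluetooth limit of one master + seven active slaves.
    Returns a list of piconets (each a list of devices)."""
    return [devices[i:i + max_size] for i in range(0, len(devices), max_size)]

# A sequential split already attains the ceil(n / 8) piconet count,
# which is the trivial lower bound on the number of piconets.
n = 20
piconets = partition_into_piconets(list(range(n)))
assert len(piconets) == math.ceil(n / 8)
```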
2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
High performance computing has become essential for many biomedical applications as the production of biological data continues to increase. Next Generation Sequencing (NGS) technologies are capable of producing millions to even billions of short DNA fragments called reads. These short reads are assembled into larger sequences called contigs by graph theoretic software tools called assemblers. High performance computing has been applied to reduce the computational burden of several steps of the NGS data assembly process, and several parallel assemblers rely on a distributed assembly graph, most commonly a distributed de Bruijn graph. However, the majority of assemblers that utilize distributed assembly graphs do not take the input properties of the data set into consideration to improve the graph partitioning process. In this paper, we introduce a distributed overlap graph based model upon which our parallel assembler Focus is built. The contribution of this paper is threefold. First, we demonstrate that the application of data-specific knowledge regarding the inherent linearity of DNA sequences can be used to improve the partitioning process for distributing the assembly graph. Second, we implement several parallel graph algorithms for assembly with greatly improved speedup. Finally, we demonstrate that for metagenomics datasets, the graph partitioning provides insights into the structure of the microbial community.
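For readers unfamiliar with the overlap graph model the abstract builds on: reads are nodes, and a directed edge records that a suffix of one read matches a prefix of another. The sketch below is a toy illustration of that structure only; the function names are hypothetical, and real assemblers such as the one described use indexing data structures rather than this quadratic pairwise scan.

```python
def suffix_prefix_overlap(a, b, min_len=3):
    """Length of the longest suffix of read a that equals a prefix of
    read b, or 0 if no overlap of at least min_len exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0

def build_overlap_graph(reads, min_len=3):
    """Edges (i, j, overlap_length) for every ordered pair of distinct
    reads whose suffix-prefix overlap meets the minimum length."""
    edges = []
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                n = suffix_prefix_overlap(a, b, min_len)
                if n:
                    edges.append((i, j, n))
    return edges

# Three reads that chain together linearly: ATGGC -> GGCAT -> CATTA
print(build_overlap_graph(["ATGGC", "GGCAT", "CATTA"]))
```

The linear chain the example produces reflects the "inherent linearity of DNA sequences" that the paper exploits when partitioning the distributed graph.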
Strong collaborative partnerships are critical to the ongoing success of any urban or metropolitan university in its efforts to build the science, technology, engineering, and mathematics (STEM) career pathways so critical to our nation. At the University of Nebraska at Omaha, we have established a faculty leadership structure of "community chairs" that work across colleges to support campus priorities. This paper describes UNO's STEM community chair model, including selected initiatives, impacts, and challenges to date.
The influx of biomedical measurement technologies continues to define a rapidly changing and growing landscape, multi-modal and uncertain in nature. The focus of the biomedical research community has shifted from pure data generation to the development of methodologies for data analytics. Although many researchers continue to focus on approaches developed for analyzing single types of biological data, recent attempts have been made to utilize the availability of multiple heterogeneous data sets that contain various types of data and to establish tools for data fusion and analysis in many bioinformatics applications. At the heart of this initiative is the attempt to consolidate the domain knowledge and experimental data sources in order to enhance our understanding of highly specific conditions dependent on sensory data containing inherent error. This challenge refers to granularity: the specificity or mereology of alternate information sources may impact the final data fusion. In an...
The notion of repurposing existing drugs to treat both common and rare diseases has gained traction in both academia and pharmaceutical companies. Given the high attrition rates and the massive time, money, and effort of brand-new drug development, the advantages of drug repurposing in terms of lower costs and shorter development time have become more appealing. Computational drug repurposing is a promising approach and has shown great potential in tailoring genomic findings to the development of treatments for diseases. However, there are still challenges involved in building a standard computational drug repurposing solution for high-throughput analysis and in its implementation in clinical practice. In this study, we applied computational drug repurposing approaches for Ulcerative Colitis (UC) patients to provide better treatment for this disabling disease. Repositioning drug candidates were identified, and these findings provide potentially effective therapeutics for the treatmen...
The last few years have witnessed significant developments in various aspects of Biomedical Informatics, including Bioinformatics, Medical Informatics, Public Health Informatics, and Biomedical Imaging. The explosion of medical and biological data requires an associated increase in the scale and sophistication of automated systems and intelligent tools to enable researchers to take full advantage of the available databases. The availability of vast amounts of biological data continues to represent unlimited opportunities as well as great challenges in biomedical research. Developing innovative data mining techniques and clever parallel computational methods to implement them will surely play an important role in efficiently extracting useful knowledge from the raw data currently available. The proper integration of carefully selected/developed algorithms along with efficient utilization of high performance computing systems forms the key ingredient in the process of reaching ...
The importance of human mobility in maintaining physical health is of emerging interest in research and practice. Technological advances in wearable technology enable us to monitor human mobility in out-of-laboratory settings. Although a large amount of human mobility data is available from wearable sensors, there is a lack of systematic methodologies for extracting useful knowledge on human mobility from the collected data. The objective description of different mobility patterns to interpret different physical health levels remains especially challenging. In this paper, robust network modeling from our preliminary study is validated in a real-world scenario with stable and unstable mobility conditions. The models, based on population analysis, utilize mobility data and extract distinctive mobility characteristics. Correlation networks and population-based analysis are utilized to efficiently examine the natural variability of human movement. Results demonstrate that the...
With the continuous advancements of biomedical instruments and the associated ability to collect diverse types of valuable biological data, numerous recent research studies have focused on how to best extract useful information from the Big Biomedical Data currently available. While drug design has been one of the most essential areas of biomedical research, the drug design process for the most part has not fully benefited from the recent explosive growth of biological data and bioinformatics tools. With the significant overhead associated with the traditional drug design process in terms of time and cost, new alternative methods, possibly based on computational approaches, are very much needed to provide innovative ways to identify effective drugs and new treatment options. Employing advanced computational tools for drug design and precision treatments has been the focus of many research studies in recent years. For example, drug repurposing has gained significant attention fr...
Due to the advancement in high throughput technologies and robust experimental designs, many recent studies attempt to incorporate heterogeneous data obtained from multiple technologies to improve our understanding of the molecular dynamics associated with biological processes. Currently available technologies produce a wide variety of large-scale data spanning genomics, transcriptomics, proteomics, and epigenetics. Because such multi-omics data are very diverse and come from different biological levels, it has been a major research challenge to develop a model that properly integrates all available and relevant data to advance biomedical research. It has been argued by many researchers that the integration of multi-omics data to extract relevant biological information is currently one of the major biomedical informatics challenges. This paper proposes a new graph database model to efficiently store and mine multi-omics data. We show a working model of this graph da...
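The core idea of a graph database for multi-omics data is a property graph whose nodes are typed by omics layer and whose labeled edges link entities across layers. The sketch below is a minimal in-memory illustration of that structure, not the paper's actual schema; the class and method names are hypothetical, and the gene/transcript/protein identifiers in the example are purely illustrative.

```python
class OmicsGraph:
    """Minimal property-graph store: typed nodes per omics layer,
    labeled directed edges linking entities across layers."""

    def __init__(self):
        self.nodes = {}   # node_id -> {"layer": ..., plus free-form properties}
        self.edges = []   # (source_id, target_id, relation_label)

    def add_node(self, node_id, layer, **props):
        self.nodes[node_id] = {"layer": layer, **props}

    def add_edge(self, src, dst, relation):
        self.edges.append((src, dst, relation))

    def neighbors(self, node_id, relation=None):
        """Targets of outgoing edges, optionally filtered by relation label."""
        return [d for s, d, r in self.edges
                if s == node_id and (relation is None or r == relation)]

# Illustrative cross-layer chain: gene -> transcript -> protein
g = OmicsGraph()
g.add_node("TP53", "genomics", chrom="17")
g.add_node("ENST00000269305", "transcriptomics")
g.add_node("P04637", "proteomics")
g.add_edge("TP53", "ENST00000269305", "transcribed_to")
g.add_edge("ENST00000269305", "P04637", "translated_to")
```

Traversing labeled edges (e.g. all `translated_to` neighbors of a transcript) is the kind of cross-layer query that motivates a graph model over flat per-omics tables.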
Papers by Hesham Ali