Papers by Ozlem O Garibay
Briefings in Bioinformatics, Jul 12, 2022
In this study, we introduce an interpretable graph-based deep learning prediction model, AttentionSiteDTI, which utilizes protein binding sites along with a self-attention mechanism to address the problem of drug–target interaction prediction. Our proposed model is inspired by sentence classification models in the field of Natural Language Processing, where the drug–target complex is treated as a sentence with relational meaning between its biochemical entities, a.k.a. protein pockets and the drug molecule. AttentionSiteDTI enables interpretability by identifying the protein binding sites that contribute the most toward the drug–target interaction. Results on three benchmark datasets show improved performance compared with the current state-of-the-art models. More significantly, unlike previous studies, our model shows superior performance when tested on new proteins (i.e. high generalizability). Through multidisciplinary collaboration, we further experimentally evaluate the practical potential of our proposed approach. To achieve this, we first computationally predict the binding interactions between some candidate compounds and a target protein, then experimentally validate the binding interactions for these pairs in the laboratory. The high agreement between the computationally predicted and experimentally observed (measured) drug–target interactions illustrates the potential of our method as an effective pre-screening tool in drug repurposing applications.
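The abstract above describes the core architectural idea: the drug and each candidate protein pocket are treated as tokens of a "sentence", and self-attention provides both the interaction prediction and the interpretability signal. The following is a minimal, hypothetical PyTorch sketch of that idea; the dimensions, pooling, and classifier head are illustrative assumptions, not the published AttentionSiteDTI architecture.

```python
# Hypothetical sketch of the "drug-target complex as a sentence" idea: the drug
# embedding and each protein-pocket embedding are tokens, self-attention mixes them,
# and the attention weights over pocket tokens can be inspected for interpretability.
import torch
import torch.nn as nn

class SiteAttentionDTI(nn.Module):
    def __init__(self, embed_dim=128, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, drug_emb, pocket_embs):
        # drug_emb: (B, 1, D) graph-level drug embedding (e.g., from a GNN)
        # pocket_embs: (B, P, D) one embedding per candidate binding site
        tokens = torch.cat([drug_emb, pocket_embs], dim=1)   # the "sentence" of tokens
        ctx, attn_w = self.attn(tokens, tokens, tokens)      # self-attention
        logits = self.classifier(ctx.mean(dim=1))            # pooled -> interaction score
        return logits.squeeze(-1), attn_w                    # attn_w highlights pockets

# Toy usage with random embeddings.
model = SiteAttentionDTI()
drug = torch.randn(2, 1, 128)      # batch of 2 drug embeddings
pockets = torch.randn(2, 5, 128)   # 5 candidate binding sites each
score, weights = model(drug, pockets)
print(score.shape, weights.shape)  # torch.Size([2]) torch.Size([2, 6, 6])
```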
arXiv (Cornell University), Sep 1, 2021
With the increasing reliance on automated decision making, the issue of algorithmic fairness has gained increasing importance. In this paper, we propose a Generative Adversarial Network for tabular data generation. The model includes two phases of training. In the first phase, the model is trained to accurately generate synthetic data similar to the reference dataset. In the second phase, we modify the value function to add a fairness constraint and continue training the network to generate data that is both accurate and fair. We test our results in both the unconstrained and constrained fair data generation cases. In the unconstrained case, i.e. when the model is only trained in the first phase and is only meant to generate accurate data following the same joint probability distribution as the real data, the results show that the model beats state-of-the-art GANs proposed in the literature for producing synthetic tabular data. In the constrained case, in which the first phase of training is followed by the second phase, we train the network and test it on four datasets studied in the fairness literature, compare our results with another state-of-the-art pre-processing method, and present the promising results that it achieves. Compared with other studies utilizing GANs for fair data generation, our model is more stable because it uses only one critic, and it avoids major problems of the original GAN model, such as mode dropping and non-convergence, by implementing a Wasserstein GAN.
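As a rough illustration of the two-phase training described above, the sketch below shows a generator value function that is a plain WGAN objective in phase one and gains a fairness penalty (here a demographic-parity gap computed on the generated batch) in phase two. The penalty form, column conventions, and the lam weight are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch: phase 1 = standard WGAN generator term; phase 2 adds a fairness
# penalty on the synthetic batch (demographic-parity gap between protected groups).
import torch

def generator_loss(critic, fake_batch, protected_col, label_col, phase=1, lam=1.0):
    # Phase 1: maximize the critic's score on generated samples (WGAN objective).
    loss = -critic(fake_batch).mean()
    if phase == 2:
        # Phase 2: difference in positive-outcome rates between the two protected
        # groups within the generated batch (binary attribute and label assumed).
        s = fake_batch[:, protected_col]
        y = fake_batch[:, label_col]
        gap = torch.abs(y[s > 0.5].mean() - y[s <= 0.5].mean())
        loss = loss + lam * gap
    return loss

# Toy usage with a stand-in critic and a generated batch of 4 columns.
critic = torch.nn.Linear(4, 1)
fake = torch.sigmoid(torch.randn(64, 4))
print(generator_loss(critic, fake, protected_col=0, label_col=3, phase=2))
```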
Lecture Notes in Computer Science, 2023
arXiv (Cornell University), Sep 18, 2022
With the recent growth in computer vision applications, the question of how fair and unbiased these applications are has yet to be explored. There is abundant evidence that the bias present in training data is reflected in the models, or even amplified. Many previous methods for image dataset de-biasing, including models based on augmenting datasets, are computationally expensive to implement. In this study, we present a fast and effective model to de-bias an image dataset through reconstruction and minimization of the statistical dependence between the intended variables. Our architecture includes a U-net to reconstruct images, combined with a pre-trained classifier that penalizes the statistical dependence between the target attribute and the protected attribute. We evaluate our proposed model on the CelebA dataset, compare the results with a state-of-the-art de-biasing method, and show that the model achieves a promising fairness–accuracy combination.
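A hedged sketch of the training objective described above: reconstruct each image with a U-net while penalizing the statistical dependence between a frozen, pre-trained target-attribute classifier's scores on the reconstruction and the protected attribute. Squared Pearson correlation is used here as a simple dependence proxy; the actual dependence measure, networks, and weighting in the paper may differ.

```python
# Illustrative de-biasing objective: reconstruction loss + a dependence penalty
# between frozen-classifier scores and the protected attribute (assumed form).
import torch
import torch.nn.functional as F

def dependence_penalty(scores, protected):
    # Squared Pearson correlation between prediction scores and protected labels.
    s = scores - scores.mean()
    p = protected.float() - protected.float().mean()
    corr = (s * p).mean() / (s.std() * p.std() + 1e-8)
    return corr ** 2

def debias_loss(unet, target_clf, images, protected, lam=1.0):
    recon = unet(images)                       # U-net reconstruction of the input
    rec_loss = F.mse_loss(recon, images)       # keep reconstructions faithful
    scores = target_clf(recon).squeeze(-1)     # frozen target-attribute classifier
    return rec_loss + lam * dependence_penalty(scores, protected)

# Toy usage with stand-in networks (a real setup would use a U-net and a CNN).
unet = torch.nn.Identity()
clf = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 1))
imgs = torch.rand(16, 3, 8, 8)
prot = torch.randint(0, 2, (16,))
print(debias_loss(unet, clf, imgs, prot))
```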
The widespread use of artificial intelligence algorithms and their role in making consequential decisions for human subjects has resulted in a growing interest in designing AI algorithms that account for fairness considerations. There have been attempts to account for the fairness of AI algorithms without compromising their accuracy, to improve poorly designed algorithms that disregard sensitive attributes (e.g., age, race, and gender) at the peril of introducing or increasing bias against specific groups. Although many studies have examined the optimal trade-off between fairness and accuracy, it remains a challenge to understand the sources of unfairness in decision-making and mitigate them effectively. To tackle this problem, researchers have proposed fair causal learning approaches, which assist us in modeling cause-and-effect knowledge structures, discovering bias sources, and refining AI algorithms to make them more transparent and explainable. In this study, we formalize probabilistic interpretations of both contrastive and counterfactual causality as essential features in order to encourage users' trust and to expand the applicability of such automated systems. We use this formalism to define a novel fairness criterion that we call contrastive counterfactual fairness. This paper introduces, to the best of our knowledge, the first probabilistic fairness-aware data augmentation approach based on contrastive counterfactual causality. We tested our approach on two well-known fairness-related datasets, UCI Adult and German Credit, and conclude that our proposed method has a promising ability to capture and mitigate unfairness in AI deployment. This model-agnostic approach can be used with any AI model because it is applied in pre-processing.
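The augmentation step itself happens in pre-processing, before any model is trained. The sketch below is a deliberately naive, hypothetical illustration of counterfactual-style augmentation (duplicating each row with the sensitive attribute flipped) to show where such a step sits in the pipeline; the published method is probabilistic and grounded in a causal model, which this sketch does not attempt to reproduce.

```python
# Naive, hypothetical counterfactual-style augmentation for tabular data:
# add a copy of each row with the (binary) sensitive attribute flipped.
import pandas as pd

def naive_counterfactual_augment(df: pd.DataFrame, sensitive: str) -> pd.DataFrame:
    flipped = df.copy()
    flipped[sensitive] = 1 - flipped[sensitive]      # binary sensitive attribute assumed
    return pd.concat([df, flipped], ignore_index=True)

# Toy usage on a two-row frame with placeholder column names.
toy = pd.DataFrame({"age": [25, 40], "sex": [0, 1], "income_gt_50k": [0, 1]})
print(naive_counterfactual_augment(toy, sensitive="sex"))
```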
Lecture Notes in Computer Science, 2023
Lecture Notes in Computer Science, 2023
With the recent growth in computer vision applications, the question of how fair and unbiased these applications are has yet to be explored. There is abundant evidence that the bias present in training data is reflected in the models, or even amplified. Many previous methods for image dataset de-biasing, including models based on augmenting datasets, are computationally expensive to implement. In this study, we present a fast and effective model to de-bias an image dataset through reconstruction and minimization of the statistical dependence between the intended variables. Our architecture includes a U-net to reconstruct images, combined with a pre-trained classifier that penalizes the statistical dependence between the target attribute and the protected attribute. We evaluate our proposed model on the CelebA dataset, compare the results with a state-of-the-art de-biasing method, and show that the model achieves a promising fairness–accuracy combination.
bioRxiv (Cold Spring Harbor Laboratory), Dec 9, 2021
In this study, we introduce and implement an interpretable graph-based deep learning prediction model, which utilizes protein binding sites along with self-attention to learn which protein binding sites interact with a given ligand. Our proposed model enables interpretability by identifying the protein binding sites that contribute the most towards the drug–target interaction. Results on three benchmark datasets show improved performance compared to previous graph-based models. More significantly, unlike previous studies, our model's performance remains close to optimal when tested with new proteins (i.e., high generalizability). Through multidisciplinary collaboration, we further experimentally evaluate the practical potential of our proposed approach. To achieve this, we first computationally predict the binding interactions of some candidate compounds with a target protein, then experimentally validate the binding interactions for these pairs in the laboratory. The high agreement between the computationally predicted and experimentally observed (measured) DTIs illustrates the potential of our method as an effective pre-screening tool in drug repurposing applications.
Lecture Notes in Computer Science, 2021
Journal of Biomolecular Structure and Dynamics
arXiv (Cornell University), Mar 14, 2022
Bias in training datasets must be managed for various groups in classification tasks to ensure parity or equal treatment. With the recent growth in artificial intelligence models and their expanding role in automated decision-making, it is vital to ensure that these models are not biased. There is abundant evidence that these models can contain, or even amplify, the bias present in the data on which they are trained, inherent to their objective functions and learning algorithms. Existing methods for mitigating bias result in information loss, fail to provide a suitable balance between accuracy and fairness, or do not ensure that biases are limited during training. To this end, we propose a powerful strategy for training deep learning models, called the Distraction module, which is effective in controlling bias and preventing it from affecting the classification results. This method can be utilized with different data types (e.g., tabular, images, graphs). We demonstrate the potency of the proposed method by testing it on the UCI Adult and Heritage Health datasets (tabular), the POKEC-Z, POKEC-N and NBA datasets (graph), and the CelebA dataset (vision). Considering state-of-the-art methods proposed in the fairness literature for each dataset, we show that our model outperforms these methods in minimizing bias while maintaining accuracy.
While research into Drug-Target Interaction (DTI) prediction is fairly mature, generalizability and interpretability are not always addressed in the existing works in this field. In this paper, we propose a deep learning-based framework, called BindingSite-AugmentedDTA, which improves Drug-Target Affinity (DTA) predictions by reducing the search space of potential binding sites of the protein, thus making binding affinity prediction more efficient and accurate. Our BindingSite-AugmentedDTA is highly generalizable, as it can be integrated with any DL-based regression model while significantly improving its prediction performance. Also, unlike many existing models, our model is highly interpretable due to its architecture and self-attention mechanism, which can provide a deeper understanding of its underlying prediction mechanism by mapping attention weights back to protein binding sites. The computational results confirm that our framework can enhance the prediction performance...
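A minimal sketch of the "augment any DTA regressor with binding sites" idea: rather than scoring a drug against the whole protein, score it against each predicted binding-site region and aggregate the results. The site-prediction step, the string-based interface, and the max aggregation below are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: restrict the affinity search space to predicted binding sites,
# query any DTA regression model per site, and aggregate the per-site scores.
from typing import Callable, List

def binding_site_augmented_dta(
    predict_affinity: Callable[[str, str], float],  # any DTA regression model
    drug_smiles: str,
    binding_sites: List[str],                        # residues of predicted pockets
) -> float:
    scores = [predict_affinity(drug_smiles, site) for site in binding_sites]
    return max(scores)                               # assumed aggregation rule

# Toy usage with a stand-in affinity model (a real model would be a trained network).
toy_model = lambda smiles, site: float(len(set(smiles) & set(site)))
print(binding_site_augmented_dta(toy_model, "CCO", ["GLYSER", "CYSHIS"]))
```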
Machine Learning and Knowledge Extraction
With the increasing reliance on automated decision making, the issue of algorithmic fairness has gained increasing importance. In this paper, we propose a Generative Adversarial Network for tabular data generation. The model includes two phases of training. In the first phase, the model is trained to accurately generate synthetic data similar to the reference dataset. In the second phase, we modify the value function to add a fairness constraint and continue training the network to generate data that is both accurate and fair. We test our results in both the unconstrained and constrained fair data generation cases. We show that, using a fairly simple architecture and applying a quantile transformation to numerical attributes, the model achieves promising performance. In the unconstrained case, i.e., when the model is only trained in the first phase and is only meant to generate accurate data following the same joint probability distribution as the real data, the results show that the model...
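The quantile transformation mentioned above can be done, for example, with scikit-learn's QuantileTransformer, which maps a skewed numeric column onto a fixed (here normal) distribution before GAN training and inverts it afterwards. The column content and parameters below are placeholders.

```python
# Quantile-transform a skewed numeric column so it is easier for a GAN to model,
# then invert the mapping after sampling synthetic data.
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.default_rng(0)
income = rng.lognormal(mean=10, sigma=1, size=(1000, 1))   # heavily skewed feature

qt = QuantileTransformer(output_distribution="normal", n_quantiles=100)
income_t = qt.fit_transform(income)           # feed the transformed column to the GAN
income_back = qt.inverse_transform(income_t)  # invert after sampling synthetic data
print(income_t.mean().round(2), income_t.std().round(2))
```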
Molecules
Drug-target interaction (DTI) prediction through in vitro methods is expensive and time-consuming. Computational methods, on the other hand, can save time and money while enhancing drug discovery efficiency. Most computational methods frame DTI prediction as a binary classification task. One important challenge is that the number of negative interactions in all DTI-related datasets far exceeds the number of positive interactions, leading to the class imbalance problem. As a result, a classifier is trained with a bias toward the majority class (the negative class), whereas the minority class (interacting pairs) is the class of interest. This class imbalance problem is not widely taken into account in DTI prediction studies, and the few previous studies considering balancing in DTI do not focus on the imbalance issue itself. Additionally, they do not benefit from deep learning models and experimental validation. In this study, we propose a computational framework along with experimental validation...
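The abstract does not spell out the balancing strategy, so the snippet below only illustrates one standard way to counter this kind of class imbalance in a binary classifier: weighting the rare positive (interacting) class more heavily in the loss. It is a generic example, not the framework proposed in the paper.

```python
# Generic class-imbalance handling for binary DTI classification: up-weight the
# rare positive class in the binary cross-entropy loss.
import torch
import torch.nn as nn

labels = torch.tensor([0.] * 950 + [1.] * 50)   # toy DTI labels: 5% positives
logits = torch.randn(1000)                       # scores from any DTI classifier

pos_weight = (labels == 0).sum() / (labels == 1).sum()   # ~19x weight on positives
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
print(criterion(logits, labels))
```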
Journal of Applied Research in Higher Education, 2021
Purpose: Postsecondary institutions use metrics such as student retention and college completion rates to measure student success. Multiple factors affect the success of first-time-in-college (FTIC) and transfer students. Transfer student success rates are significantly low, with most transfer students nationwide failing to complete their degrees at four-year institutions. The purpose of this study is to better understand the degree progression patterns of both student types in two undergraduate science, technology, engineering and mathematics (STEM) programs: computer science (CS) and information technology (IT). Recommendations concerning academic advising are discussed to improve transfer student success. Design/methodology/approach: This study describes how transfer student success can be improved by thoroughly analyzing degree progression patterns. This study uses institutional data from a public university in the United States. Specifically, this study utilizes the data of FTIC...
Perfiles de Ingeniería, 2020
Higher education institutions traditionally measure student success and the quality of an academic program using standard metrics, such as a student's time to degree, graduation rates, and retention rates. In addition, some programs have instituted a "qualifying exam" as an alternative way to measure the quality of an academic program and assess student mastery of core concepts. The Computer Science (CS) program at the University of Central Florida implemented a qualifying exam, the "Foundation Exam," in 1998 with the purpose of assessing students' mastery of core CS concepts and monitoring the quality of the program. However, the pass rates of students who took this qualifying exam were significantly low over the years. Moreover, some students systematically delayed taking this exam...
Journal of Applied Research in Higher Education, 2020
Purpose: Some degree programs in colleges and universities utilize entrance exams to ensure that students pursuing a given degree have mastered the foundational concepts needed for that program. However, these exams often become a barrier to student success. The purpose of this study is to discuss the impact of policies governing an undergraduate Computer Science (CS) entry/qualifying exam at a large public university in the United States on overall student success in the program. This case study focuses on whether reforming program policies impacts students' time-to-degree, graduation and mastery of CS core skills. Design/methodology/approach: This case study describes how CS student success was improved by updating program policies based on institutional data and the input of course instructors. The policy changes include introducing a maximum limit on attempts at the exam and changing the exam requirements as well as the structure of the exam itself. Findings: The pass rates of students taking...
Studies in Higher Education, 2019
In an increasingly innovation-driven economic environment, universities serve as engines of economic growth by igniting innovation, fueling entrepreneurship, and inspiring the next generation of scientists and professionals. While universities are committed to enhancing their economic impact, university 'economic engagement' is in many ways an emerging field. This research investigates key strategies used by US research universities to drive economic engagement by analysing, with a grounded theory approach, 55 successful applications for the Innovation and Economic Prosperity (IEP) University designation, which consist of extensive self-study exercises. Six key strategies emerge from this corpus: forming mutually beneficial partnerships with industry, developing collaboration networks with relevant communities, building an innovation culture, supporting researchers in bringing research outcomes to market, promoting the transfer of new technologies to industry, and encouraging entrepreneurial activities. These results can serve as a guide for universities seeking best practices to advance their economic engagement.
International Journal of Human–Computer Interaction
Widespread adoption of artificial intelligence (AI) technologies is substantially affecting the human condition in ways that are not yet well understood. Negative unintended consequences abound, including the perpetuation and exacerbation of societal inequalities and divisions via algorithmic decision making. We present six grand challenges for the scientific community to create AI technologies that are human-centered, that is, technologies that are ethical and fair and that enhance the human condition. These grand challenges are the result of an international collaboration across academia, industry and government and represent the consensus views of a group of 26 experts in the field of human-centered artificial intelligence (HCAI). In essence, these challenges advocate for a human-centered approach to AI that (1) is centered in human wellbeing, (2) is designed responsibly, (3) respects privacy, (4) follows human-centered design principles, (5) is subject to appropriate governance and oversight, and (6) interacts with individuals while respecting humans' cognitive capacities. We hope that these challenges and their associated research directions serve as a call to action to conduct research and development in AI that serves as a force multiplier toward more fair, equitable and sustainable societies.