Papers by Stelios Sfakianakis
Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007), 2007
In this paper, we describe an analysis tool based on the statistical environment R, GridR, which allows using the collection of methodologies available as R packages in a grid environment. The aim of GridR, which was initiated in the context of the EU project Advancing Clinico-Genomics Trials on Cancer (ACGT), is to provide a powerful framework for the analysis of clinico-genomic trials involving large amounts of data (e.g. microarray-based clinical trials). As a proof of concept, an example of microarray-based analysis taken from the literature was reproduced using GridR. As GridR will ultimately be made available to the ACGT community as a web service, we are sketching the ACGT project and its architecture in the present article as well.
2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015
Despite the multiplicity of gene expression analysis studies for the identification of the genomics-based origins of cancerous diseases, the presented gene signatures generally have little overlap. Genes do not function in isolation, and therefore a more holistic approach that takes into account the interactions among them is needed. In this study we present a stepwise refinement methodology where, starting from some initial set of biomarkers, we expand and enrich this set taking into account existing biological information. In particular, we start with a 27-gene signature previously identified as indicative of the presence of circulating tumor cells (CTCs) in peripheral blood of breast cancer patients. We use the manually curated HINT database of protein-protein interactions as the background biological network to locate the network-based similarity of the input genes and how they connect to each other. The result is an enriched connected set of genes that is subsequently expanded to form an even bigger network based on the ability of the surrounding genes to strongly correlate with the phenotypes of a training set of breast cancer patient cases. The induced network is then used as a new gene signature for the classification of breast brain metastases in an independent dataset. The results are encouraging for the validity of this method.
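The enrichment step described in the abstract above (connecting a seed gene set through a background interaction network) can be illustrated with a minimal sketch. This is not the authors' algorithm: linking seeds by shortest paths is an assumption made here for illustration, and the toy network, the gene names, and the function names `shortest_path` and `connect_seeds` are all hypothetical.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def connect_seeds(graph, seeds):
    """Return the seeds plus every node on a shortest path between two seeds.

    Assumption: shortest-path linking stands in for the paper's
    network-based similarity, which the abstract does not specify.
    """
    seeds = [s for s in seeds if s in graph]
    enriched = set(seeds)
    for i, first in enumerate(seeds):
        for second in seeds[i + 1:]:
            path = shortest_path(graph, first, second)
            if path:
                enriched.update(path)
    return enriched


# Hypothetical undirected toy network stored as adjacency lists.
toy_network = {
    "G1": ["X"],
    "X": ["G1", "G2"],
    "G2": ["X", "Y"],
    "Y": ["G2"],
    "G3": [],
}

print(sorted(connect_seeds(toy_network, ["G1", "G2", "G3"])))
```

In this toy case the intermediate node `X` is pulled in because it links the seeds `G1` and `G2`, while the isolated seed `G3` is kept but contributes no connectors.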
Molecular Oncology, 2015
Tamoxifen is the treatment of choice in estrogen receptor alpha breast cancer patients that are eligible for adjuvant endocrine therapy. However, ∼50% of ERα-positive tumors exhibit intrinsic or rapidly acquire resistance to endocrine treatment. Unfortunately, prediction of de novo resistance to endocrine therapy and/or assessment of relapse likelihood remain difficult. While several mechanisms regulating the acquisition and the maintenance of endocrine resistance have been reported, there are several aspects of this phenomenon that need to be further elucidated. Altered metabolic fate of tamoxifen within patients and emergence of tamoxifen-resistant clones, driven by evolution of the disease phenotype during treatment, appear as the most compelling hypotheses so far. In addition, tamoxifen was reported to induce pluripotency in breast cancer cell lines, in vitro. In this context, we have performed a whole transcriptome analysis of an ERα-positive (T47D) and a triple-negative breast cancer cell line (MDA-MB-231), exposed to tamoxifen for a short time frame (hours), in order to identify how early pluripotency-related effects of tamoxifen may occur. Our ultimate goal was to identify whether the transcriptional actions of tamoxifen related to induction of pluripotency are mediated through specific ER-dependent or independent mechanisms. We report that even as early as 3 hours after the exposure of breast cancer cells to tamoxifen, a subset of ERα-dependent genes associated with developmental processes and pluripotency are induced and this is accompanied by specific phenotypic changes (expression of pluripotency-related proteins). Furthermore, we report an association between the increased expression of pluripotency-related genes in ERα-positive breast cancer tissue samples and disease relapse after tamoxifen therapy.
Finally we describe that in a small group of ERα-positive breast cancer patients, with disease relapse after surgery and tamoxifen treatment, ALDH1A1 (a marker of pluripotency in epithelial cancers which is absent in normal breast tissue) is increased in relapsing tumors, with a concurrent modification of its intra-cellular localization. Our data could be of value in the discrimination of patients susceptible to develop tamoxifen resistance and in the selection of optimized patient-tailored therapies.
2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015
The advancements in healthcare practice have brought to the fore the need for flexible access to health-related information and created an ever-growing demand for the design and the development of data management infrastructures for translational and personalized medicine. In this paper, we present the data management solution implemented for the MyHealthAvatar EU research project, a project that attempts to create a digital representation of a patient's health status. The platform is capable of aggregating several knowledge sources relevant for the provision of individualized personal services. To this end, state-of-the-art technologies are exploited, such as ontologies to model all available information, semantic integration to enable data and query translation, and a variety of linking services to allow connecting to external sources. All original information is stored in a NoSQL database for reasons of efficiency and fault tolerance. Then it is semantically uplifted through a semantic warehouse which enables efficient access to it. All different technologies are combined to create a novel web-based platform allowing seamless user interaction through APIs that support personalized, granular and secure access to the relevant information.
BMC Medical Informatics and Decision Making, 2015
A plethora of publicly available biomedical resources currently exist and are constantly increasing at a fast rate. In parallel, specialized repositories are being developed, indexing numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty in locating appropriate resources for a clinical or biomedical decision task, especially for non-Information Technology expert users. In parallel, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for biomedical resources annotation with domain specific ontologies and exploit Natural Language Processing methods in empowering the non-Information Technology expert users to efficiently search for biomedical resources using natural language. A Natural Language Processing engine which can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only terms of ontologies, has been implemented. The implementation is based on information extraction techniques for text in natural language, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions into suitable domain ontologies in order to ensure that the biomedical resources descriptions are domain oriented and enhance the accuracy of services discovery. The framework is freely available as a web application at http://calchas.ics.forth.gr/. For our experiments, a range of clinical questions were established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians.
Domain experts manually identified the available tools in a tools repository which are suitable for addressing the clinical questions at hand, either individually or as a set of tools forming a computational pipeline. The results were compared with those obtained from an automated discovery of candidate biomedical tools. For the evaluation of the results, precision and recall measurements were used. Our results indicate that the proposed framework has a high precision and low recall, implying that the system returns essentially more relevant results than irrelevant. Adequate biomedical ontologies, sufficient NLP tools, and biomedical annotation systems of sufficient quality are already available for the implementation of a biomedical resources discovery framework based on the semantic annotation of resources and the use of NLP techniques. The results of the present study demonstrate the clinical utility of the proposed framework, which aims to bridge the gap between clinical questions in natural language and efficient dynamic biomedical resources discovery.
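As a minimal sketch of the evaluation described above, precision and recall can be computed by comparing the automatically discovered tools against the expert-selected set. The tool names and the function name `precision_recall` are hypothetical and are not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: experts picked four tools, the system found two of them.
expert_tools = {"tool_a", "tool_b", "tool_c", "tool_d"}
system_tools = {"tool_a", "tool_b"}

precision, recall = precision_recall(system_tools, expert_tools)
print(precision, recall)  # prints 1.0 0.5
```

The example mirrors the high-precision, low-recall pattern the abstract reports: everything returned is relevant, but half of the expert-selected tools are missed.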
AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
Researchers working with vast quantities of information in a geographically distributed manner are often confronted with problems of finding relevant information as well as colleagues with related interests. The MEMOIR project aims at assisting this collaboration by applying agent technology to user trails and documents. MEMOIR is an open architecture based on the existing Web infrastructure; in contrast to the Web, we treat links and trails as first class objects. Agents mine users' trails and links and also perform resource discovery tasks such as searching the Web. This paper describes the design, communication mechanism and implementation of the MEMOIR agent system which is currently being trialed in three end-user organisations.
Concepts, Methodologies, Tools, and Applications, 2010
Social Semantic Web and Semantic Web Services (9781605669847): Stelios Sfakianakis: Book Chapters.
PAAM98 - The Third International Conference and Exhibition on The Practical Application of Intelligent Agents and Multi-Agents. Nwana H. S. and Ndumu D. T. (Eds.), London, UK, Mar 1, 1998
Researchers working with vast quantities of information in a geographically distributed manner are often confronted with problems of finding relevant information as well as colleagues with related interests. The MEMOIR project aims at assisting this collaboration by applying agent technology to user trails and documents. MEMOIR is an open architecture based on the existing Web infrastructure; in contrast to the Web, we treat links and trails as first class objects. Agents mine users' trails and links and also perform resource discovery ...
IEEE Transactions on Information Technology in Biomedicine, 2007
Efficient access to a citizen's Integrated Electronic Health Record (I-EHR) is considered to be the cornerstone for the support of continuity of care, the reduction of avoidable mistakes, and the provision of tools and methods to support evidence-based medicine. For the past several years, a number of applications and services (including a lifelong I-EHR) have been installed, and enterprise and regional infrastructure has been developed, in HYGEIAnet, the Regional Health Information Network (RHIN) of the island of Crete, Greece. Through this paper, the technological effort toward the delivery of a lifelong I-EHR by means of World Wide Web Consortium (W3C) technologies, on top of a service-oriented architecture that reuses already existing middleware components, is presented and critical issues are discussed. Certain design and development decisions are exposed and explained, laying this way the ground for coordinated, dynamic navigation to personalized healthcare delivery.
Recent advances in research methods and technologies have resulted in an explosion of information and knowledge about cancers and their treatment. Knowledge Discovery (KD) is a key technique for dealing with this massive amount of data and the challenges of managing the steadily growing amount of available knowledge. In this paper, we present the ACGT integrated project, which is to contribute to the resolution of these problems by developing semantic grid services in support of multi-centric, post-genomic clinical trials. In particular, we describe the challenges of KD in clinico-genomic data in a collaborative Grid framework, and present our approach to overcome these difficulties by improving workflow management, construction and managing workflow results and provenance information. Our approach combines several techniques into a framework that is suitable to address the problems of interactivity and multiple dependencies between workflows, services, and data.
Journal of Medical Internet Research, 2001
Due to the greater mobility of the population, national and international healthcare networks are increasingly used to facilitate the sharing of healthcare-related information among the various actors of the field. This sharing of information resources is generally accepted as the key to substantial improvements in productivity and better quality of care. Comprehensive medical information about a patient is difficult to obtain efficiently, unless the distributed and heterogeneous health record segments are incorporated into an Integrated Electronic Health Record (I-EHR) and viewed on-line through a unified user interface and visualization environment. Furthermore, the seamless integration of distributed EHR segments requires interoperability among heterogeneous autonomous information systems. As a result, standardization efforts for middleware that facilitate interoperability and enable the communication of information through standard messages are very active.
IEEE Transactions on Information Technology in Biomedicine, 2008
This paper reports on original results of the Advancing Clinico-Genomic Trials on Cancer integrated project focusing on the design and development of a European biomedical grid infrastructure in support of multicentric, postgenomic clinical trials (CTs) on cancer. Postgenomic CTs use multilevel clinical and genomic data and advanced computational analysis and visualization tools to test hypotheses in trying to identify the molecular reasons for a disease and the stratification of patients in terms of treatment. This paper provides a presentation of the needs of users involved in postgenomic CTs, and presents such needs in the form of scenarios, which drive the requirements engineering phase of the project. Subsequently, the initial architecture specified by the project is presented, and its services are classified and discussed. A key set of such services are those used for wrapping heterogeneous clinical trial management systems and other public biological databases. Also, the main technological challenge, i.e. the design and development of semantically rich grid services, is discussed. In achieving such an objective, extensive use of ontologies and metadata is required. The Master Ontology on Cancer, developed by the project, is presented, and our approach to develop the required metadata registries, which provide semantically rich information about available data and computational services, is provided. Finally, a short discussion of the work lying ahead is included.
This paper presents the needs and requirements that led to the formation of the ACGT (Advancing Clinico Genomic Trials on Cancer) integrated project, its vision and methodological approaches. The ultimate objective of the ACGT project is the development of a European biomedical grid for cancer research, based on the principles of open access and open source, enhanced by a set
Recent advances in research methods and technologies have resulted in an explosion of information and knowledge about cancers and their treatment. Knowledge Discovery (KD) is a key technique for dealing with this massive amount of data and the challenges of managing the steadily growing amount of available knowledge. In this paper, we present the ACGT integrated project, which is to
2008 IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008
In this paper, we describe an extension to the ACGT GridR environment which allows the parallelization of loops in R scripts in view of their distributed execution on a computational grid. The ACGT GridR service is extended by a component that uses a set of preprocessor-like directives to organize and distribute calculations. The use of parallelization directives as special R comments provides users with the potential to accelerate lengthy calculations with changes to preexisting code. The GridR service and its extension are developed as components of the ACGT platform, one aim of which is to facilitate the data mining of clinical trials involving large datasets. In ACGT, GridR scripts are executed in the framework of a specifically developed workflow environment, which is also briefly outlined in the present article.
2001 Conference Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2001
The Integrated Electronic Health Record (I-EHR) is a term used to describe the whole set of information that exists in electronic form and is related to the personal health of an individual. Any approach towards I-EHR focuses on the needs of professionals or citizens who want a uniform way of accessing parts of personal health record information that is physically located in disparate information sources. Any I-EHR end-user environment must provide fast, secure and authorized access to the distributed fragments of the electronic patient record (EPR) originating at multiple clinical information systems, and deliver them in a multitude of formats. The importance of such an environment becomes apparent when used in conjunction with a number of advanced telematic services, such as medical collaboration, home care monitoring, and/or health emergency services, to provide seamless care without visible organizational boundaries.
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2013
Social media and Web 2.0 technologies are ubiquitous and, due to the advances in mobile communication protocols, operating systems, and internet standards, they are now supported even in cell phones and tablets. We are not yet at the point where a cell phone can be used as a medical device, but such small and omnipresent instruments can be used in a way that promotes research in the clinical and biomedical domain. In this paper we describe a collaborative platform for designing composite simulations for the Virtual Physiological Human (VPH) community needs. We investigate the use of pervasive mobile technologies so that scientists and researchers can easily design, share, and execute simulations. The proposed platform supports real-time notification and lets users share the results and related artifacts with their work group and colleagues.