MedCloud Healthcare Cloud Computing System


Conference Paper · December 2012


All content following this page was uploaded by Yasser El-sonbaty on 17 August 2014.



The 7th International Conference for Internet Technology and Secured Transactions (ICITST-2012)

MedCloud : Healthcare Cloud Computing System


Dalia Sobhy, Yasser El-Sonbaty, Mohamad Abou Elnasr
College of Computer Engineering
Arab Academy of Science and Technology and Maritime Transport
Alexandria, Egypt, 1029
Email: [email protected], [email protected], [email protected]

Abstract—Existing systems for storing patients’ data are not scalable enough for the increasing number of patients and applications. Cloud computing promises low cost, high scalability, availability, and disaster recoverability, which makes it a natural solution for some of the problems faced in storing and analysing patients’ medical records. This paper examines the impact of cloud computing on improving healthcare services. More specifically, it details the architectural design of a personal health record system called “MedCloud” that integrates services from Hadoop’s [1] ecosystem in conjunction with HIPAA privacy and security rules [2]. A scalable platform is proposed for developers to use in application development, and a web portal built on Restlet [3] is presented to users for accessing the MedCloud system. The development of the MedCloud model is then illustrated through an issues analysis followed by an in-depth performance evaluation.

Index Terms—Cloud Computing, Hadoop, HBase, Restlet, HIPAA

I. INTRODUCTION

Healthcare is always a major concern for the community. “Towards smarter IT” is the main slogan for a successful healthcare institution. There is a great need for new strategies to reduce healthcare costs and improve the quality of service. Moreover, IT has positively affected the healthcare sector by providing more accurate and timely information regarding patient care [4].

Several healthcare providers and insurance companies store patients’ medical data in the form of electronic medical records (EMRs) [5] in centralized databases. EMRs have long been an essential way of storing patients’ medical records electronically [6]. The problem is that a patient typically has various healthcare providers, including physicians, specialists, therapists, and other medical practitioners, and may also have different insurance companies. Healthcare providers require a complete view of a patient’s health status for proper diagnosis, based on the aggregation of all of the patient’s medical records. Each provider commonly has its own database, so a healthcare provider may have to request a patient’s EMR from other healthcare providers. Interoperation and sharing of data among different EMR systems is extremely slow. As a result, there is a need for a common place for storing EMRs to accelerate their sharing [7]. Such a common place would remove the delay of transferring EMRs back and forth between different healthcare providers.

Cloud computing is a natural way to solve this problem. Moving data into the cloud is convenient for users because they do not need to care about the complexities of direct hardware management [8]. It also helps developers create different healthcare applications that share the same data, saving the time otherwise spent gathering patients’ data from different sources.

Currently, cloud computing and open standards can be a significant foundation for streamlining healthcare. They can be used for maintaining medical records, monitoring patients, managing care and diseases efficiently, and analysing patients’ data. It is popularly believed that using clouds to manage and administer healthcare applications will result in a revolutionary change in the way healthcare is done today. Making access to healthcare ubiquitous will not only help improve healthcare, as data will be accessible from anywhere at any time, but will also help cut costs drastically. A fundamental step for successfully moving healthcare into the cloud is an in-depth understanding of how to use cloud computing models effectively.

In 2003, the Privacy Rule of the federal Health Insurance Portability and Accountability Act (HIPAA) established a national standard for the privacy of health information. The entities covered by HIPAA are healthcare providers, health plans, and healthcare clearinghouses. According to HIPAA, the data records are maintained and transmitted in the form of electronic records. The main aim of HIPAA’s Privacy Rule is to protect individuals’ health information while still allowing the flow of health information needed to provide high-quality healthcare and to protect the public’s health [2].

In this paper, a cloud computing system, “MedCloud”, is proposed for storing EMRs. The main objective is to form a common platform for developers to use instead of individual platforms, which simplifies the development phase. The MedCloud system provides users with the fundamental services for building an efficient healthcare cloud application. The proposed system uses the Hadoop ecosystem for the server implementation. Column-oriented databases running on top of distributed file systems are suitable for data storage and analysis. Users access the system through the Restlet web framework. A detailed description of the system architecture is also provided, as the proposed model considers the privacy and security concerns based on HIPAA; where

978-1-908320-08/7/$25.00©2012 IEEE 161



some functional parts of the system were implemented. Finally, performance measurements for the system are reported. The rest of this paper is organized as follows: Section II discusses related work, Section III gives a detailed description of the proposed architecture, Section IV presents the implementation and performance evaluation of the proposed model, and Section V concludes the paper.

II. RELATED WORK

Currently, Amazon Elastic Compute Cloud [9] is one of the most popular cloud platforms. It provides a virtual computing environment in which a user can run applications, resulting in improved performance, but it is inflexible when it comes to sharing data. It also implements the pay-per-time model. Google App Engine [10] is another cloud platform, which allows a user to run web applications written in the Python and Java programming languages. However, it provides only limited function modules. It also provides a web-based administration console for managing running web applications, but it is unsuitable for high-performance distributed applications [11].

In the medical field, various medical software systems have been developed worldwide in an attempt to improve healthcare. Care2X is an open source hospital information system that implements the client-server architecture. Its benefits are flexibility, easy handling, the ability for developers to build their own tools, easy selection of the different departments and stations, and strong support from the Care2X community. Its major drawbacks are that there is no real standard between modules and that it is still under development and needs a lot of innovation. Furthermore, it has poor documentation and deficient security measures [12].

In [13], the authors proposed an open source private cloud solution for rural healthcare in India. They deployed Care2X on the private cloud and restricted their application to serving only rural areas in emergencies. Applying it to all hospitals and patients in a country, however, could help save patients’ lives at any place and time. Moreover, statistics about the most common diseases would help in finding new ways to minimize the vulnerability to these diseases.

Although these systems seem to be competent, they cannot achieve high-performance computing and complex analysis. A special-purpose medical platform is therefore recommended. Since cloud computing has the potential of becoming a revolutionary technology in the performance of domain-oriented service computing, a cloud-oriented approach would be a good choice for developing a scalable medical system.

III. MEDCLOUD: A CLOUD COMPUTING SYSTEM

This section gives a detailed explanation of the MedCloud system. The system requirements are addressed first. After that, the building blocks of the system are illustrated. Finally, a sequence diagram shows the milestones of a user’s access to the system.

A. System Requirements

In this part, we address the system requirements with respect to HIPAA privacy and security rules [2]. Although HIPAA is an American standard, it can be easily applied in various countries. The system features are customizable depending on the country, covered entities, and users.

1) Requirements:
• The users will be categorized based on their privilege level, and the access level will be defined during registration.
• The hospital or medical institution will choose the access level for its users from the already defined list of access categories that is set with respect to HIPAA’s privacy and security rules [2].
• The Notice of Privacy Practices (NPP), which contains the policies and regulations of the Privacy Rule [2], should be posted on the system’s website.
• Participants could submit a list of authorized persons to the HIO (Health Information Organization), depending on the country.
• Any documentation that requires signatures may be provided as a scanned image of the signed documentation or as an electronic document with an electronic signature.
• Based on HIPAA’s Security Rule, covered entities have to apply restrictive security measures.
  – The medical information system should apply technical policies and procedures, as well as implement the required hardware and software, that allow only authorized persons to access e-PHI (electronic protected health information).
  – Secure data transmission is required.

B. Proposed Architecture

In this section, the implementation and components of the proposed cloud computing system “MedCloud” are described. In the MedCloud model, domain-specific services were built for users; the services were selected and designed based on the requirements described before. Users are mainly developers, who use the system for building new applications that benefit from shared data. The MedCloud system is application agnostic: different applications can access the system simultaneously through the web portal. The system is divided mainly into three layers (data layer, server layer, and application layer) plus the client.

With the explosive growth of medical information, the real challenge is how to effectively manage the computation and storage requirements. The Data Layer provides efficient data storage. Distributed computing and management of the whole system is the main job of the Server Layer.

1) Data Storage Layer: As seen in Fig. 1, there is an EMR store for storing medical information. A distributed file system is necessary for storing EMRs. It is a file system designed for storing very large files with streaming data access patterns. It also runs on clusters of commodity hardware, i.e. it will continue working even if any node fails, which


TABLE I
PATIENT TABLE

Row-key         Column family info:
(patient ID)    First Name
                Middle Name
                Last Name
                Date Of Birth
                Gender
                Marital Status
                Address
                Telephone
                Blood Group

TABLE II
VISIT TABLE

Row-key                   Column family info:    Column family cardiac:
(visit ID)(patient ID)    AdmissionDate          Diagnosis ID
                          DischargeDate          LabTest ID
                          Room num               Symptom ID
                          Physician ID           Sign ID
                                                 RiskFactor ID
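
The row-key and column-family layout of Tables I and II can be illustrated with a small in-memory model. The Python sketch below is not the authors' implementation and does not use the HBase client API. The class `ColumnFamilyTable`, the helper `visit_row_key`, and the `|` separator inside the composite key are assumptions made purely for illustration. Only the column families are fixed, which reflects the idea that individual columns may be left out when a hospital or clinic does not have the information.

```python
from dataclasses import dataclass, field

# Column families and column names copied from Tables I and II.
PATIENT_FAMILIES = {
    "info": ["First Name", "Middle Name", "Last Name", "Date Of Birth",
             "Gender", "Marital Status", "Address", "Telephone", "Blood Group"],
}
VISIT_FAMILIES = {
    "info": ["AdmissionDate", "DischargeDate", "Room num", "Physician ID"],
    "cardiac": ["Diagnosis ID", "LabTest ID", "Symptom ID", "Sign ID",
                "RiskFactor ID"],
}


def visit_row_key(visit_id: str, patient_id: str) -> str:
    """Composite row key (visit ID)(patient ID); the '|' separator is assumed."""
    return f"{visit_id}|{patient_id}"


@dataclass
class ColumnFamilyTable:
    """In-memory stand-in for a column-family table (hypothetical helper)."""
    families: dict
    rows: dict = field(default_factory=dict)

    def put(self, row_key: str, family: str, column: str, value: str) -> None:
        # Column families are fixed up front; columns inside them are optional.
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[f"{family}:{column}"] = value

    def get_row(self, row_key: str):
        # Returns the stored cells, or None when the row does not exist.
        return self.rows.get(row_key)
```

Under these assumptions, a patient row is addressed by the patient ID alone, while a visit row is addressed by `visit_row_key(visit_id, patient_id)`. Adding a further diagnosis area would amount to adding one more entry next to `cardiac` in `VISIT_FAMILIES`.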

Fig. 1. The architecture of MedCloud

is important for guaranteeing availability. However, it is not a general-purpose file system and does not provide fast individual record lookups in files. Consequently, a data warehouse that supports fast record lookups and updates for large tables is required.

Applications such as statistical analysis of EMRs and medical imaging transfer require terabytes or petabytes of storage to fulfil their computational and storage requirements. SQL-like databases are restricted to certain capacities and are unable to handle massive storage needs [14]. This led to the development of horizontally scalable, distributed non-relational data stores, called NoSQL databases. Because of their quick reads and writes, mass storage support, easy expansion, and low cost, NoSQL data stores best suit the current demands [15]. They are categorized into document stores, key-value stores, and column-oriented databases [16]. NoSQL data stores are used for data-warehousing applications, high-performance data analysis, and business intelligence processing. MedCloud needed an efficient data warehousing tool, and a column-oriented data store running on top of a distributed file system has the potential of yielding the required high performance; MedCloud therefore uses a column-oriented data store.

Furthermore, column-oriented databases apply the column-family methodology; each row in a table is uniquely identified by one row key, which acts as the primary key, and there are no foreign keys [17]. A sample design for the patient table and the visit table is given in Table I and Table II, respectively. In the Patient Table, there is one column family that contains most of the necessary information for a patient. Some information, such as the patient’s ID and full name, is mandatory, and the rest is optional to provide flexibility in information gathering, because hospitals and clinics may vary in the information available. In the Visit Table, the visit could be any hospital or clinic visit. There are two column families in the table: one for the basic information about a visit and the other for information about cardiac diagnosis. If more diagnoses need to be added to the system, more column families may be added.

Table III shows some of the services provided by the system, for instance the addition, deletion, and retrieval of patients’ information. Other services can also be added to the system. The major advantage of this platform is that it is customizable: services can be added later on without affecting the system’s efficiency.

2) Server Management Layer: The proposed architecture implements a master-slave architecture. The master has two main components:
• Query Manager: is a vital element in the system. It accepts the queries from the application layer. It contains the meta-data of the file system and the data locations in the database that are required for each query. It also controls system-wide activities such as garbage collection of unused chunks and chunk migration between slaves.
• Concurrency Manager: is responsible for managing and distributing the jobs/requests on the slaves by coordinating with the Query Manager. Moreover, it is used for data replication across the slaves.

In the slave part, there are:
• Data Storage Manager: is used as a worker for handling the data storage as well as storing the data files in the


TABLE III
SERVICES TABLE

Provided Services          Requirements                                  Description
AddPatient                 All information is required                   Add new patient if not found
DeletePatient              Patient’s ID                                  Delete patient’s data from the database
UpdatePatientByID          Patient’s ID                                  Update patient’s data if ID is given
UpdatePatientByName        Patient’s name                                Update patient’s data if name is given
RetrievePatientByID        Patient’s ID                                  Show patient’s data if ID is given
RetrievePatientByName      Patient’s name                                Preview patient’s data if name is given
RetrievePatientHistory     Patient’s ID, start date, and end date        Show the patient’s medical history for a specific period of time
CountPatientsByDiag        Type of diagnosis and current year            Count the number of patients suffering from a particular diagnosis for the current year
CountPatientsByDiagYears   Type of diagnosis, start date, and end date   Count the number of patients suffering from a particular diagnosis for a specific period of time

distributed file system.
• Task Manager: is regarded as the slave of the Concurrency Manager, because it is in charge of instantiating and monitoring individual tasks within a job.

Coordination Manager: is the key component in the MedCloud system. It handles and manages the requests and responses in the case of multi master-slave communication.

3) Application Layer: The function of this layer is to provide services for users. An application is a collection of software components (services) that work together to achieve a specific goal. A user can request a service via network access. The Hypertext Transfer Protocol (HTTP) [18] is the standard application protocol of the Web. Therefore, a web framework with an HTTP server is required so that users can pass service requests over HTTP. The application server accepts a request, compares it with the available services, and then replies according to different response strategies. Note that the services are published based on the REST architectural style [19]. All the components of this layer run on top of the web framework. The function of the application layer is introduced through the following elements.
• Authenticator: is responsible for validating the client’s login details, i.e. logging in and out of the system. Healthcare providers and medical employees are the main users of a healthcare system. Naturally, employees, including administrative staff, clinical staff, management, and IT staff, should be authenticated before accessing EMRs.
• Authorizer: Since patients’ information is private and critical, HIPAA privacy and security rules are applied within our system. This is to guarantee security and gain the users’ trust. Furthermore, this is a crucial part for distinguishing the user’s permissions and showing the services permitted for this user based on HIPAA privacy rules. A healthcare provider has to assign a person for adding users to and deleting users from the system.
• Request Receiver: receives the service request and calls the service implementation; the Server layer processes the request and the response is then routed back to the user.
• Data Integrator: checks and validates the input data during any operation.
• NPP Registry: contains the HIPAA privacy and security policies and regulations.
• Authorizers Registry: holds a list of all the authorized users. Once a healthcare provider adds a user, the user is automatically added to this registry.
• Services Registry: comprises all the services provided by the MedCloud system. It corresponds to Table III. Any new services are added to this registry.
• Disclosure Tracker: records all the users or healthcare providers who accessed each patient’s medical records.

C. Issues Analysis

The MedCloud framework will provide a set of services that work across all programming models and support storage, file management, monitoring, and security. Developers and end-users are the two kinds of users in the system, and each will use the system in a different way.

1) Application Deployment: The MedCloud system provides a specially crafted software development kit (SDK) for developers to create their own applications without having to know the system’s internals. After a developer completes an application, it can be easily deployed. As stated before, a specific platform is created for developers to use in designing their own applications. The pricing and regulations for usage will be formulated in future work.

2) Service Request: Fig. 2 shows the sequence diagram for the request handling phases of the retrievePatient(ID) service.
• Steps 1, 2: The client accesses the system securely through the web server.
• Steps 3, 4: The authenticator validates the client’s login details.
• Steps 5, 6: If valid, the login details are passed to the access controller, which sends a list of the services granted to this client based on HIPAA privacy rules.
• Steps 7, 8: The client sends a retrievePatient(ID) request to the Service Request Handler, which forwards the incoming request to the Patient Server Resource.
• Step 9: The Patient Server Resource contains definitions for the services available. It creates a new HBase client instance.


Fig. 2. Sequence Diagram for a read request

Fig. 3. MedCloud Deployment Model

Fig. 4. Requests per second versus number of machines

• Steps 10, 11: The request is then handled through the communication between the HBase client and the HBase server.
• Steps 12, 13: The HBase client calls the getRow(ID) method, which searches for a particular row across the HBase region servers via the HBase master server.
• Steps 14, 15: The response, either the requested patient record or a “not found” message, is sent to the client.
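
The read path for the retrievePatient(ID) service (login check, lookup of the granted services, row lookup, and the "not found" reply) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Restlet or HBase code of the system. The class `MedCloudPortal` and its fields are hypothetical, a plain dictionary stands in for the HBase table, and plain-text passwords are used only to keep the sketch short. The service name `RetrievePatientByID` is taken from Table III.

```python
from dataclasses import dataclass, field


@dataclass
class MedCloudPortal:
    """Toy model of the read request in Fig. 2 (all names are illustrative)."""
    credentials: dict   # user name -> password, checked by the authenticator
    granted: dict       # user name -> set of service names the user may call
    patients: dict      # patient ID -> record, stand-in for the HBase table
    disclosures: list = field(default_factory=list)  # disclosure tracker log

    def login(self, user: str, password: str) -> set:
        # Steps 3 to 6: validate the login details, then return granted services.
        if self.credentials.get(user) != password:
            raise PermissionError("invalid login")
        return self.granted.get(user, set())

    def retrieve_patient(self, user: str, password: str, patient_id: str):
        services = self.login(user, password)
        if "RetrievePatientByID" not in services:
            raise PermissionError("service not permitted for this user")
        # Steps 9 to 13: the resource asks the data store for the row.
        record = self.patients.get(patient_id)
        if record is None:
            # Steps 14, 15: no such row, so a "not found" message is returned.
            return "not found"
        # Remember who accessed which patient's record.
        self.disclosures.append((user, patient_id))
        return record
```

A caller would construct the portal with its three dictionaries and then call `retrieve_patient(user, password, patient_id)`. Keeping the disclosure log next to the lookup is a design choice of this sketch, so that every successful read leaves a trace of who accessed which record.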

IV. IMPLEMENTATION AND PERFORMANCE EVALUATION

In this section, the physical components of the system are described. Fig. 3 shows the deployment model of the MedCloud system. The MedCloud architecture assigns two nodes as masters for the management of the Hadoop cluster. As for the slaves, the system is able to add as many slaves as needed, because Hadoop scales linearly. The cluster is implemented using the third release, update five, of Cloudera’s Hadoop distribution (CDH3U5). Cloudera’s package includes Hadoop, which is used to achieve scalability, and its necessary services. In the practical design, Hadoop is established on 23 cloud servers from Rackspace Open Cloud, including 3 major nodes for cluster management. The first master node carries the HMaster and ZooKeeper and runs the NameNode [1]; it performs the tasks of the query and concurrency managers. A ZooKeeper cluster is required to handle the job of the coordination manager, hence ZooKeeper runs on 3 nodes. The second master contains the Secondary NameNode, the JobTracker, and ZooKeeper. The third node runs ZooKeeper only.

As for the data nodes, i.e. slaves, 20 nodes are used, each of which contains a region server that runs on the DataNode together with the TaskTracker. The data nodes handle data storage management and job execution. After setting up the Hadoop cluster, Thrift was used as the client API for communicating with HBase because of its simplicity and noticeable benefits [20]. After successfully establishing a connection with the HBase server, the Restlet Framework (restlet 2.1rc1) is used as the web platform.

The masters are HP machines with an Intel(R) Core(TM)2 Quad 2.66 GHz processor, 320 GB of disk storage, and 8.00 GB of RAM. The slaves have the same specifications except for 160 GB of disk storage; their physical memory is also 8.00 GB of RAM.

The test measures the system’s scalability by requesting random reads from the HBase data store. ApacheBench is used as the benchmarking tool for measuring the performance. The number of requests per second is computed across an increasing number of data nodes. The experiment parameters are 1000 input requests, a concurrency level of 100, and a 100 Mbps transfer rate. Every time 5 data nodes are added to the cluster, 200,000 records are added to the database as well, for instance 5 data nodes (250,000 records), 10 data nodes (450,000 records), 15 data nodes (650,000 records), etc. Each record contains 8 rows with 3 copies. As seen in Fig. 4, the throughput is approximately 200 requests per second. Although the data is progressively growing, the requests per second stay constant as the number of data nodes increases. This means that the MedCloud system maintains its performance level when the data size grows and data nodes are added, which indicates that the system scales.

V. CONCLUSION

This paper presented a cloud-oriented approach for medical systems development. The main idea is the need for a medical system that people in different places can easily access and use, and that helps developers create their own healthcare applications sharing the EMRs. The proposed model offers a flexible and portable platform for application development. Scalability and privacy were the major concerns. The MedCloud system addresses these issues by deploying a Hadoop cluster for scalability and designing the system based upon HIPAA requirements. We provided the users with an easy way to access the system via the Restlet web server. Finally, the promising output across the conducted tests was a good indicator of the usability of the proposed system.
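
The data-growth schedule of the scalability test in Section IV, and what a throughput of about 200 requests per second means for a 1000-request run, can be written out as plain arithmetic. The Python sketch below is only an illustration of those figures; the function names `records_for` and `seconds_for` are made up here, and the value for 20 data nodes is an extrapolation of the stated "+200,000 records per 5 data nodes" rule rather than a number reported in the paper.

```python
def records_for(datanodes: int) -> int:
    """Records in the database at a given cluster size.

    The test starts with 250,000 records on 5 data nodes and adds
    200,000 records each time 5 more data nodes are added.
    """
    if datanodes < 5 or datanodes % 5 != 0:
        raise ValueError("the cluster grows in steps of 5 data nodes")
    return 250_000 + 200_000 * (datanodes // 5 - 1)


def seconds_for(requests: int, requests_per_second: float) -> float:
    """Elapsed time implied by a throughput figure (requests / rate)."""
    return requests / requests_per_second


# Schedule for the cluster sizes used in the test (20 is extrapolated).
schedule = {n: records_for(n) for n in (5, 10, 15, 20)}

# 1000 requests at roughly 200 requests per second take about 5 seconds.
run_time = seconds_for(1000, 200)
```

Under these assumptions the record count more than triples between 5 and 20 data nodes while the time for a 1000-request run stays near 5 seconds, which is the sense in which the throughput is described as constant.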
REFERENCES
[1] T. White, Hadoop: The Definitive Guide, 1st ed. Sebastopol, CA: O’Reilly Media, June 2009, ISBN 978-0-596-52197-4.
[2] U.S. Department of Health and Human Services, Summary of the HIPAA Privacy Rule, Office for Civil Rights, 200 Independence Avenue, S.W., Washington, D.C. 20201, May 2003.
[3] T. T. Jerome Lovel, Restlet in Action. Manning, 2011.
[4] J. Walther and C. de Jong, “Technology for more effective healthcare,” IEEE MultiMedia, vol. 16, no. 4, pp. 5–7, April 2009.
[5] Institute of Medicine, The computer-based electronic medical record: An
essential technology for healthcare. NAP, Washington, DC, 1997.
[6] Y. Shuli, Y. Xiaoping, and L. Huiling, “Research on the emr storage
model,” in International Forum on Computer Science Technology and
Applications, 2009. IFCSTA 2009., vol. 1, December 2009, pp. 222–226.
[7] P. Ray and J. Wimalasiri, “The need for technical solutions for main-
taining the privacy of ehr,” in Engineering in Medicine and Biology
Society, 2006. EMBS ’06. 28th Annual International Conference of the
IEEE, September 2006, pp. 4686 –4689.
[8] K. M. Yashpalsinh Jadeja, “Cloud computing - concepts, architecture
and challenges,” in International Conference on Computing, Electronics
and Electrical Technologies [ICCEET], 2012.
[9] (2012, September). [Online]. Available: http://aws.amazon.com/ec2/
[10] (2012, October). [Online]. Available: https://appengine.google.com/
[11] X. J.-B. Zeng Shu-Qing, “The improvement of paas platform,” in
2010 First International Conference on Networking and Distributed
Computing (ICNDC), October 2010, pp. 156–159.
[12] C. Corporation. (2012, August). [Online]. Available:
http://www.care2x.org/
[13] M. Lakshmi and J. Dhas, “An open source private cloud solution for
rural healthcare,” in Signal Processing, Communication, Computing and
Networking Technologies (ICSCCN), 2011 International Conference on,
July 2011, pp. 670–674.
[14] D. J. Abadi, “Data management in the cloud: Limitations and opportu-
nities,” IEEE Data Engineering Bulletin, vol. 32, no. 1, pp. 3–12, March
2009.
[15] J. Han, E. Haihong, G. Le, and J. Du, “Survey on nosql database,” in
Pervasive Computing and Applications (ICPCA), 2011 6th International
Conference on, October 2011, pp. 363 –366.
[16] S. C. S. Rabi Prasad Padhy, Manas Ranjan Patra, “Rdbms to nosql:
Reviewing some next-generation non-relational database’s,” International
Journal of Advanced Engineering Sciences and Technologies (IJAEST),
vol. 11, no. 1, pp. 15–30, 2011.
[17] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach,
M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: A
distributed storage system for structured data,” in Proceedings of the 7th
USENIX Symposium on Operating Systems Design and Implementation,
ser. OSDI ’06, vol. 7, ACM. Berkeley, CA, USA: USENIX Association,
2006, p. 15.
[18] (2012, April). [Online]. Available: http://www.w3.org/Protocols/
[19] A comparison of SOAP and REST implementations of a service based
interaction independence middleware framework, ser. WSC ’09, no. 10.
Austin, Texas: Winter Simulation Conference, 2009.
[20] L. George, HBase: The Definitive Guide. O’Reilly Media, August 2011.
