
MIPortal: a high capacity server for molecular imaging research

Molecular imaging

The introduction of novel molecular tools in research and clinical medicine has created a need for more refined information management systems. This article describes the design and implementation of such a new information platform: the Molecular Imaging Portal (MIPortal). The platform was created to organize, archive, and rapidly retrieve large datasets using Web-based browsers as access points. The system has been implemented in a heterogeneous, academic research environment serving Macintosh, Unix, and Microsoft Windows clients and has been shown to be extraordinarily robust and versatile. In addition, it has served as a useful tool for clinical trials and collaborative multi-institutional small-animal imaging research.

RESEARCH ARTICLE
Molecular Imaging, Vol. 4, No. 4, October 2005, pp. 425–431

Misha Pivovarov¹, Gokul Bhandary², Umar Mahmood¹, Gudrun Zahlmann², Mohammad Naraghi², and Ralph Weissleder¹

¹Center for Molecular Imaging Research, Massachusetts General Hospital and Harvard Medical School, and ²Siemens Medical Solutions, Germany

Mol Imaging (2005) 4, 425–431.

Keywords: Small-animal imaging, PACS, databases, DICOM, bioinformatics.

Introduction

Small-animal imaging systems can generate vast amounts of data. For example, dedicated mouse MRI and CT systems generate an average of 1.5 GB of data per day, although peak data generation can be much higher [4]. In addition, there is a growing need to store associated data with imaging studies, for example, histology, pathology, immunohistochemistry, in situ hybridization, gels, microarrays, or mass spectrometry data. Furthermore, most research projects today require database queries and searches of protein structures (e.g., Entrez Protein), chemical data banks (e.g., ChemBank, PubChem), or other archives (e.g., PubMed, GenBank).
Finally, expanding multidisciplinary and multi-institutional research environments continue to foster collaborative projects, generating the need for organized and accessible shared data repositories.

Typical information management systems, such as electronic laboratory notebooks or image archives, are too cumbersome, limited to certain computer platforms, too slow, or not fully adapted to handle the large number, types, and sizes of data files. Clinical data management systems based on Picture Archiving and Communication Systems (PACS) [5] or electronic patient records [6] can be highly efficient but are prohibitively expensive for routine deployment in academic molecular imaging centers. Such systems are optimized for handling specific data and have links to other systems without true integration of data models.

Given the above needs and the shortcomings of commercial systems, we set out to develop a Web-based information platform (MIPortal) in an academic setting, with particular emphasis on storage and retrieval of datasets. We initially defined our user criteria as follows: (a) the system had to be Web based and compatible with common browsers operating on Macintosh, Unix, or Windows clients; (b) the system had to be fast and hold large amounts of data in real time with daily backups; (c) the system had to allow storage of different file formats, including Digital Imaging and Communications in Medicine (DICOM), Tagged Image File Format (TIFF), and Microsoft Office documents (.doc, .xls, .ppt), with seamless integration, uploading, and downloading; (d) the system had to allow for data analysis and storage of the analyzed data; and (e) the system had to be user friendly, intuitive, highly stable, and follow the workflow of a typical molecular imaging research center. Here we report on the architecture, functionality, and performance of MIPortal in a multidisciplinary environment.
Materials and Methods

Architecture

The MIPortal is a four-tier application consisting of (a) a Web server, (b) the application, (c) the database, and (d) back-end services. The Web server provides access to stored data via commonly available Web browsers such as Safari, Internet Explorer, or Firefox. The application layer extends the functionality of the Siemens Syngo platform to the Web. Most of the application logic is implemented at this level. Back-end services (PACS and DICOM converter) are responsible for interfacing with different imaging modalities, for providing a secure communication infrastructure, and for validating incoming data. For example, if an incoming data package does not conform to a specific protocol, it is rejected with a request for correction. This ensures consistency and uniformity of image data in the MIPortal. A standard Microsoft IIS server was chosen for optimized integration with the Microsoft .NET Framework and the Syngo platform.

Abbreviations: DICOM, digital imaging and communications in medicine; MIPortal, molecular imaging portal; PACS, picture archiving and communication systems; VPN, virtual private network.
Corresponding author: Ralph Weissleder, Center for Molecular Imaging Research, Massachusetts General Hospital, Building 149, 13th Street, Room 5403, Charlestown, MA 02129; e-mail: [email protected].
Received 15 March 2005; Received in revised form 2 June 2005; Accepted 15 June 2005.
© 2005 Neoplasia Press, Inc.

Hardware

All hardware is enclosed in a single HP computer rack (Hewlett-Packard, Palo Alto, CA) and comprises five servers (application server, PACS server, network attached storage, back-end server, and staging server), a disk array, a tape library, a DVD jukebox, a network switch, and power distribution units (Figure 1).
A Dell computer with dual Intel Xeon 2.4 GHz processors and 2 GB of RAM provides sufficient computing power to allow the Web server, application server, and database server to run on a single hardware platform. Clean separation of tiers in software will allow us to off-load components to dedicated hardware in the future. The back-end server (HP DL320 G2, P3.06 GHz, 2 GB RAM) runs Red Hat Linux and is used for auxiliary services, such as the Web-based DICOM converter and shared bioinformatics and chemoinformatics tools. The staging and testing server (HP DL320 G2, P3.06 GHz, 2 GB RAM) was installed for rapid prototyping of new ideas.

The underlying PACS archive is a Siemens MagicView 300 Archive (MV300) optimized for DICOM protocols. The MV300 utilizes an HP DL320 server with a Windows 2000 Server operating system. An NSM 3000 DVD Jukebox extends available storage to 2.3 TB. Its 240 loaded double-sided DVDs provide near-line storage. Off-loaded DVDs are used as a long-term archive. The MV300 is configured to automatically forward all incoming images to the MIPortal, where metadata are loaded into the database and image files are placed into corresponding directories.

Figure 1. MIPortal provides a repository for data received from various DICOM and non-DICOM modalities (on the left) and distributes data via a Web-based interface (on the right).

Image file storage is provided by an HP StorageWorks NAS 2000s server. The NAS solution simplifies manageability and provides network-accessible storage to a mix of clients and servers running different operating systems. A server with an Intel Xeon 3.06 GHz processor and 1 GB of RAM is connected via a SCSI interface to a disk enclosure with twelve 250-GB SATA drives. This provides 3 TB of raw disk space. We configured RAID 5 for redundancy and obtained 2.4 TB of usable space.
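The storage figures quoted for the disk enclosure can be checked with a short calculation. The sketch below applies the standard RAID 5 rule that one drive's worth of capacity is consumed by parity; the function name is illustrative and not part of MIPortal. The textbook result (2750 GB) is somewhat above the reported 2.4 TB, and the article does not say what accounts for the difference; a hot spare, file-system overhead, or binary versus decimal units are possibilities, but those are assumptions rather than facts from the source.

```python
def raid5_usable_gb(drive_count: int, drive_size_gb: float) -> float:
    """Usable capacity of a RAID 5 array.

    RAID 5 spreads parity across all drives, and the parity
    consumes the equivalent of exactly one drive.
    """
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_size_gb


# Figures from the text: twelve 250-GB SATA drives.
raw_gb = 12 * 250
usable_gb = raid5_usable_gb(12, 250)

print(raw_gb)     # 3000 (the 3 TB of raw space quoted in the text)
print(usable_gb)  # 2750 (theoretical RAID 5 capacity before any overhead)
```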
An MSL5030 tape library with HP OpenView Data Protector software provides automated backup capability. The NAS server is capable of supporting up to four disk enclosures, bringing the total storage to 10 TB.

Software

Syngo is a universal imaging platform that offers basic functionality such as displaying and storing images as well as networking capabilities. It is a well-tested and proven clinical platform for implementing medical imaging applications. We use the following Syngo features: image processing; DICOM services, such as Query, Retrieval, and AutoStoreSCP; audit trail; image conversion functions; and the Syngo framework for individual component configuration and start-up. Syngo components are developed primarily in C++ and implemented on the MS Windows platform. Hence, a decision was made to develop the MIPortal application layer in a Windows environment. The Microsoft .NET Framework provides a fast prototyping and application development environment with many features for interoperability and interprocess communication. It delivers flexibility and interoperability with existing Syngo components. A standard Microsoft IIS server was chosen for integration with the Microsoft .NET Framework used for the browser-based ASP.NET layers and the Syngo back end.

The database server utilizes an Oracle 9.2 database engine. We do not store image files as binary objects in the database. Instead, files are placed on a file system with pointers kept in the database. Login information, user preferences, and access rights are also driven by relational database tables, which makes the system less platform-dependent.

Platform independence is achieved by using pure HTML, JavaScript, and a Java applet (for image viewing). We made sure that the client (Web browser) can run on any platform. The pages rendered by MIPortal are supported by all commonly used browsers and deliver the same user experience. The system has been tested on Internet Explorer, Safari, and Firefox on Macintosh, Windows, and Linux clients.
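The storage scheme described above, with image files on a file system and only pointers in the database, can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the Oracle engine named in the text; the table name, column names, and function names are hypothetical and are not taken from MIPortal.

```python
import sqlite3
from pathlib import Path


def open_catalog(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open the catalog database and create the (hypothetical) pointer table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS image_file (
               id INTEGER PRIMARY KEY,
               experiment TEXT NOT NULL,
               rel_path TEXT NOT NULL UNIQUE,
               size_bytes INTEGER NOT NULL
           )"""
    )
    return conn


def store_image(conn: sqlite3.Connection, storage_root: Path,
                experiment: str, name: str, data: bytes) -> int:
    """Write the file to disk and record only its relative path in the database."""
    target_dir = storage_root / experiment
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(data)
    cur = conn.execute(
        "INSERT INTO image_file (experiment, rel_path, size_bytes) VALUES (?, ?, ?)",
        (experiment, str(target.relative_to(storage_root)), len(data)),
    )
    conn.commit()
    return cur.lastrowid


def load_image(conn: sqlite3.Connection, storage_root: Path, image_id: int) -> bytes:
    """Look up the pointer in the database, then read the bytes from the file system."""
    row = conn.execute(
        "SELECT rel_path FROM image_file WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        raise KeyError(image_id)
    return (storage_root / row[0]).read_bytes()
```

Keeping relative rather than absolute paths in the table is a design choice of this sketch: it lets the storage root move (for example, to a larger volume) without rewriting database rows.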
XML is used for data transfer, with XSLT to render HTML pages. All HTML pages are rendered dynamically by ASP.NET. In addition, Web Forms are used to generate forms. Clients connect to the system via secure HTTP (SSL-encrypted) connections and use a HIPAA-compliant authentication mechanism.

Thumbnails (60 × 60 pixels) of images are first shown within the browser for preview and selection. We use lossy JPEG compression with a ratio of 60%. A user can then select images for a larger view within a page as a JPEG-converted image (up to 512 × 512 pixels). Alternatively, the image can be shown in ImageJ, an option that is configurable per user and requires a Java Virtual Machine installed on the client computer. The JPEG is given as input to ImageJ for performance reasons. We do not provide advanced imaging tools on-line. Users download individual images or complete datasets for further analysis with image processing software of their choice.

The DICOM converter is built on a collection of open-source libraries and tools. DICOM functionality is provided by "dicomlib" (by the imaging research group at Sunnybrook and Women's College Health Sciences Center, Toronto, ON). Image conversion and transformation is based on ImageMagick (by ImageMagick Studio LLC, Landenberg, PA). We also developed a simple user interface for entering DICOM-required information and for uploading image files in common formats (jpeg, gif, tiff, bmp, pict) as well as some proprietary formats (spe). The software then supplements this information with data read from image headers, generates unique IDs, and creates a valid DICOM image that is sent to the PACS server. Users can upload individual files, stacks, or even zipped directories. Because HTTP upload is not the most efficient way of transferring large files, we also developed a way to preload datasets via FTP.

Connectivity, Security

The MIPortal connectivity is based on a dual-network architecture.
The primary network allows access to the system from any computer on the internal network behind a firewall. This includes Web clients connected via Virtual Private Network (VPN), which provides secure connectivity to external collaborators. The primary network is currently limited to 100 Megabit/sec. A secondary internal Gigabit network is essential for maximizing connectivity bandwidth between MIPortal components. The DICOM archive, Web server, and network attached storage currently utilize this network via dedicated network adapters. This secondary network also provides us with a seamless path to upgrade the system once we split the Web server, database server, and application server onto their own hardware without creating network bottlenecks.

Because MIPortal is used for clinical trial applications, security and confidentiality are essential. HIPAA guidelines were followed for protecting personal medical information. The MIPortal also strictly controls access to the system based on user rights and provides audit trails of all retrieved data.

Searches

We implemented two distinctly different but complementary search functions: (a) a quick search and (b) a power search. The quick search allows a Google-style keyword search of the entire database. It returns a list (or a hierarchy) of projects, experiments, documents, and DICOM structures that produce a hit for the keyword. It is fast, convenient, and available on every page. The power search allows users to specify search criteria in a more precise and granular fashion. For instance, one can search for specific experiments, dates, investigators, medical record numbers, or other fields. In the current version of the system, the content of documents and images is not indexed. However, all metadata, including annotations for projects, experiments, and their components, are indexed and searchable.
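The difference between the two search modes can be made concrete with a small sketch over in-memory metadata records. This is an illustration of the idea only: the record fields, sample values, and function names are hypothetical, and the actual MIPortal searches run against its relational database rather than a Python list.

```python
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata row; content of files is deliberately not included."""
    kind: str          # "project", "experiment", "document", or "dicom"
    title: str
    investigator: str
    date: str          # ISO date string
    annotation: str = ""


def quick_search(records, keyword):
    """Keyword search: case-insensitive match against every metadata field."""
    needle = keyword.lower()
    return [
        r for r in records
        if any(needle in str(value).lower() for value in asdict(r).values())
    ]


def power_search(records, **criteria):
    """Granular search: every named field must equal the requested value."""
    known = {f.name for f in fields(MetadataRecord)}
    unknown = set(criteria) - known
    if unknown:
        raise ValueError(f"unknown search fields: {sorted(unknown)}")
    return [
        r for r in records
        if all(getattr(r, name) == wanted for name, wanted in criteria.items())
    ]


# Invented sample data for illustration only.
RECORDS = [
    MetadataRecord("project", "Tumor angiogenesis MRI", "Smith", "2005-01-10", "serial mouse MRI"),
    MetadataRecord("experiment", "Mouse 12 CT", "Jones", "2005-02-03", "lung metastasis follow-up"),
    MetadataRecord("document", "Histology notes", "Smith", "2005-02-03"),
]

print(len(quick_search(RECORDS, "mouse")))                               # 2
print(len(power_search(RECORDS, investigator="Smith", date="2005-02-03")))  # 1
```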
Future plans include extending searching capabilities to the content of documents.

Results

Overall Design

The MIPortal was designed around the workflow in a typical academic molecular imaging center. The top layer of infrastructure consists of user-defined projects (Figure 2). Each project can have multiple experiments, again in a user-defined manner. Each experiment can hold DICOM images, other image files, or a large number of nonimage documents. The user is able to easily toggle among experiments and projects or patient studies. Images can be downloaded as single images, series, or entire studies. Associated documents, as well as results of analysis, are uploaded and associated with experiments. To prevent accidental overwrites and deletions, users cannot modify data in MIPortal and can only add data to the repository. Only the system administrator has the capability to delete experiment components. Preferences allow the user to define the page layout, and an administrator tool allows a system administrator to add users, models, probes, or other parameters. The system is structured with hierarchical access privileges going from administrator to PI to investigator to technician to guest, each with definable privileges for a given project or experiment.

Images

The core of the system is a large DICOM file server for image distribution. DICOM files can be previewed in the browser and downloaded as complete, zipped images (16 bit), series, or entire studies. Browser-based image analysis tools include viewing and basic image manipulation (Figure 3). A built-in converter allows the user to convert tiff, raw, and jpeg images into DICOM images to be stored on the PACS system (Figure 4). Functionality exists for storing and viewing most histology and confocal microscopy outputs.

Figure 2. Information is organized in projects and experiments with secure access based on user privileges.
Principal Investigators create projects and grant access to other MIPortal users.

Figure 3. MIPortal facilitates quantitative image analysis. Investigators analyze datasets on-line or off-line with tools of their choice and upload results to the MIPortal.

Operation, Reliability

The system has been in operation for 12 months. During this time it has proven to be an indispensable tool for organizing and distributing heterogeneous data generated at our Center. The system is currently maintained by one part-time person (25% effort) and is in use by >35 investigators across six different institutions with access through VPN. Currently, there are 70 projects with 382 experiments holding over 810,000 DICOM images. The current data amount to approximately 300 GB, with a server capacity of 2.4 TB, extendable to 10 TB. Since its inception, the system has had an uptime of 99.8%.

Performance

The largest datasets on the system are generated by a CT. A typical experimental mouse study consists of 512 images of 512 × 512 pixels at 16 bits/pixel, for a total of 256 MB. We used such a dataset in our performance evaluation. It takes an average of 2 sec for the first thumbnail to appear on the page and about 1 sec for each of the others. The entire study preview is available in about 9 min. Downloading full-fidelity DICOM datasets is slower. The average download time of a single CT image (512 KB) is about 3 sec. Multiple images are zipped on the server into a single zip file before download. It takes about 7 sec to zip 10 CT image files, with subsequent download in 6 sec. If an entire study is requested, it takes about 5.5 min to create a zip file and another 5 min to download it (average file size is 160 MB).

We also measured the performance of the search functionality.
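The dataset size used in the download measurements above follows directly from the image dimensions, and the reported times imply an effective transfer rate. The sketch below reproduces that arithmetic; the function names are illustrative, and the throughput figure is derived here from the reported 160 MB and 5 min rather than stated in the article.

```python
def study_size_bytes(images: int, rows: int, cols: int, bits_per_pixel: int) -> int:
    """Uncompressed pixel data size of a stack of equally sized images."""
    return images * rows * cols * bits_per_pixel // 8


def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Average transfer rate for a payload of size_mb delivered in the given time."""
    return size_mb / seconds


# Typical mouse CT study from the text: 512 images, 512 x 512 pixels, 16 bits/pixel.
total = study_size_bytes(512, 512, 512, 16)
per_image = study_size_bytes(1, 512, 512, 16)

print(total // 2**20)      # 256 (MB, matching the total quoted in the text)
print(per_image // 2**10)  # 512 (KB per image, matching the single-image size)

# Whole-study download: a 160 MB zip file transferred in about 5 min.
print(round(throughput_mb_per_s(160, 5 * 60), 2))  # 0.53 (MB/s, derived)
```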
We evaluated typical wild-card searches (which are the worst-case scenarios from the performance standpoint). Each action was performed three times and the timings were averaged. We measured overall times that included page rendering of search results. For instance, in the Quick Search it took 166 msec to search for and display 65 hits and 1900 msec for 233 hits produced by similar search criteria. For the Power Search, the results varied from 473 msec (173 hits) to 6630 msec (337 hits), which was the maximum of all our performance tests.

Sample Projects

We provide the following four sample projects to illustrate the versatility, adaptability, and functionality of the system for different types of research projects common to molecular imaging centers.

Figure 4. The multiplatform nature of the Web-based user interface allows users to use the DICOM converter at any modality. Converted images are sent to the PACS, where they are validated and pushed to the MIPortal.

Multi-institutional project. The objective of this project was to serially follow transgenic mice for the development of orthotopic and metastatic tumors using a variety of imaging systems, including MR, CT, SPECT, and FMT imaging. The study was a collaboration between the Massachusetts Institute of Technology (MIT) and the Center for Molecular Imaging Research (CMIR). Mice were followed serially over 3–6 months, and therefore all imaging studies associated with a given animal were stored within distinct "experiments," all belonging to the same project. The project contained over 40,000 CT images, SPECT, and optical images as well as histology and immunohistochemistry for each mouse. Autopsy data were uploaded into the system from the collaborating institution, and all investigators had real-time access.

Single institution study with emphasis on image analysis. The objective of this project was to collect raw MRI images and then perform quantitative image analysis on large datasets to derive angiogenic and tumor volume parameters from each primary tumor. Over 30 mice were studied serially or in cohorts, resulting in over 35,700 DICOM images. Following transfer to MIPortal, images were analyzed, and tumor volumes and vascularity were determined in a semiautomated fashion. Analyzed MR data were then juxtaposed to immunohistochemistry.

Target identification and molecular libraries. The objective of this project was twofold: (a) to perform, analyze, and archive the results of a phage display screen to identify novel peptide ligands and (b) to acquire and store imaging studies to validate the developed agents in mouse models [3]. Phage results were archived as Excel and Treeview files. Confirmatory ELISA data were also uploaded. Fluorescence microscopy images of tissue microarrays were converted to DICOM files using the built-in image converter. Confirmatory imaging experiments included MRI, endoscopic imaging, and fluorescence imaging, all stored within the project.

Clinical trial. The objective of this project was to store, analyze, and distribute all imaging studies for a prospective clinical trial involving magnetic nanoparticles [1,2]. Imaging studies were acquired through different MR imaging systems distributed throughout the clinical department. A total of 130 patients were enrolled in the trial, with an average of 900 images per patient, resulting in over 117,000 DICOM images. Image transfer from clinical scanners to MIPortal took an average of 9 min per case. Images could then be accessed through password- and user-protected log-ins from within the VPN network. The typical time for query and transmission of 500 DICOM images (68 MB zipped file) is approximately 12 min through a DSL home network (2 min over Ethernet).
All images can be accessed through commercial DICOM viewers and used for analyses, image quantization, anonymization, and readouts.

Discussion

We have developed and implemented a new information platform (MIPortal) to enable the local storage of large datasets, enhance real-time availability of diverse data, facilitate multidisciplinary and interinstitutional research, and serve as a platform for long-term data storage and subsequent searches, databanks, and analysis. The system extends the functionality far beyond common PACS systems and is also Web-based, platform-independent, and fast. The specific objectives of MIPortal included (a) the creation of a common storage for acquired data (Electronic Lab Notebook); (b) user-defined sharing of images and experimental results; (c) combination and query of combined datasets: imaging, genomics, proteomics, histology; and (d) facilitation of the development of novel image postprocessing algorithms. The system has been implemented with a hardware cost of approximately $60,000. Future extensions of the system will include the development of chemoinformatics and bioinformatics tools for target discovery.

Acknowledgments

The authors would like to acknowledge the help of the following investigators: Drs. M. Harisinghani, K. Kelly, and J. Grimm for their invaluable input into the design of MIPortal; and L. Fexon and Haiying Liu for the development of the DICOM Converter. We thank Michael Wiekrykas (Cosmic Hat, LLP) for the Web design. We would also like to acknowledge Drs. A. Hengerer, D. Datta, and C. Schultz from Siemens Medical Systems for the overall development effort. This work was supported in part by the following grants: P50 CA86355, R24-CA92782, and a grant from Siemens Medical Solutions.

References

[1] Harisinghani MG, Barentsz J, Hahn PF, Deserno WM, Tabatabaei S, van de Kaa CH, de la Rosette J, Weissleder R (2003). Noninvasive detection of clinically occult lymph-node metastases in prostate cancer. N Engl J Med. 348:2491–2499.
[2] Harisinghani MG, Weissleder R (2004). Sensitive, noninvasive detection of lymph node metastases. PLoS Med. 1:e66.
[3] Kelly K, Alencar H, Funovics M, Mahmood U, Weissleder R (2004). Detection of invasive colon cancer using a novel, targeted, library-derived fluorescent peptide. Cancer Res. 64:6247–6251.
[4] Paulus M, Gleason S, Easterly M, Foltz C (2001). A review of high-resolution X-ray computed tomography and other imaging modalities for small animal research. Lab Anim (NY). 30:36–45.
[5] Sinha U, Bui A, Taira R, Dionisio J, Morioka C, Johnson D, Kangarloo H (2002). A review of medical imaging informatics. Ann N Y Acad Sci. 980:168–197.
[6] Sprague L (2004). Electronic health records: How close? How far to go? NHPF Issue Brief. 800:1–17.