Georgetown MRI Reading Center (GMRC)




1. Overview

The Georgetown MRI Reading Center (GMRC) brings together a group of researchers and computer specialists uniquely capable of carrying out the responsibilities of the MRI Reading Center part of the SPRINT-MIND (Systolic Blood Pressure Intervention Trial - Memory and Cognition IN Decreased Hypertension) multi-site study. Dr. VanMeter has extensive experience in three areas critical for the successful implementation of the MRI Reading Center: 1) large volume multi-site MRI data collection and QC procedures; 2) management of large-scale software and networking projects; and 3) image processing and segmentation expertise. Dr. Fricke is an MRI physicist with over 20 years of experience in the MRI field and has special expertise in MRI protocol development and optimization of QC procedures. Vlad Staroselskiy is a senior systems and network engineer with 16 years of experience in developing and implementing secure network transfer systems, including those used at the Pskov Savings Bank of the Russian Federation. These individuals have a long track record of working together on a number of different projects going back 6 years. This core group of experts will be assisted by additional personnel with the skills needed to fulfill the mission of a project of this size and scope.

The remainder of this proposal describes the GMRC team members’ expertise and their role on the project, MRI segmentation methodologies, the MRI protocol development, quality control procedures, the database design and development plan, and the infrastructure needed to implement this project.

2. Personnel

John VanMeter, Ph.D., is Assistant Professor in the Department of Neurology at Georgetown University Medical Center (GUMC) and Interim Director of the 3T MRI facility at the Center for Functional and Molecular Imaging. Dr. VanMeter received his Ph.D. in computer science from Dartmouth College. He has over 15 years of experience in the design and implementation of neuroimaging experiments as well as their data analysis. Dr. VanMeter’s experience includes two years as a staff fellow in the Laboratory of Neuroscience in the National Institute on Aging, where he co-authored the first paper to use fMRI to investigate dyslexia. Dr. VanMeter has led the development of a number of major software programs as the Director of Research and Development at Sensor Systems, Inc. These include a commercial software package that is used at over 300 research institutions worldwide in the analysis of multi-modal datasets, including structural MRI and fMRI, as well as one of the first FDA-cleared fMRI clinical analysis packages, which was developed subsequently. As PI of an NIH funded STAART (Studies to Advance Autism Research and Treatment) center grant project, Dr. VanMeter is applying a number of MRI based techniques including fMRI, Diffusion Tensor Imaging (DTI), and MR spectroscopy to investigate the neurobiological basis of autism. He also designed and developed the database and data transfer systems used for the MRI data collected in the NIH Pediatric Brain Development project, which was a longitudinal study that included MRI scanning and neuropsychological evaluation of 500 children of various ages at seven sites across the country. Dr. VanMeter developed the secure data transfer mechanisms and tracking tools, a suite of data retrieval and viewing software used in the quality assurance of the imaging data, and the database used for the imaging data collected. The volume of data collected, stored, and databased at the end of this project was over 20TB. He has also developed a database system for Georgetown’s Center for Functional and Molecular Imaging (CFMI) that integrates and manages the neuropsychological and imaging data collected in the 3.0 Tesla MRI facility at Georgetown.

Dr. VanMeter will be the Director of the GMRC and be responsible for managing the overall efforts of the GMRC throughout the SPRINT-MIND project period. In addition, he will be responsible for establishing and coordinating with the CC the submission of the GMRC’s data to the CC and incorporating the MRI data with the CC’s database systems. His responsibilities in the GMRC will include designing the database and MRI scan transfer mechanisms including automated removal of PHI (personal health information) from the MRI scan headers. He will work closely with Mr. Staroselskiy in the development efforts and oversee the project programmer. Dr. VanMeter will work with Mr. Staroselskiy on the implementation of the data flow strategies to coordinate the data collected at the MRI sites. Dr. VanMeter will work with Dr. Fricke on the development and implementation of both the MRI scanning protocol and the QC (quality control) procedures.

Stan Fricke, Ph.D., is currently actively working on five NIH funded grants of which two involve brain imaging and one involves MRI equipment development for ultra high-speed imaging. Dr. Fricke has a Nuclear Engineer's degree (Nucl. Eng.) in radiological sciences from the Massachusetts Institute of Technology, as well as a degree in Statistical Physics from the University of Torino (Turin, Italy). These academic qualifications make him well suited to field almost any question in the field of in-vivo imaging. He has extensive experience in in-vivo magnetic resonance spectroscopy and in magnetic resonance imaging.

Dr. Fricke's first experience in human imaging was in Tulsa, Oklahoma (1985-1986). As a summer research student, Dr. Fricke worked at Oral Roberts City of Faith Hospital, imaging cancer patients and assisting in planning radiation therapy based on the image data. This experience led to his first publication in 1987 (a meeting abstract that can be searched on the "Web of Science"), which examined the effects of manganese chloride and nickel chloride as contrast agents for magnetic resonance imaging at various magnetic field strengths. In just the last few years manganese has been used as a neuronal tracer/contrast agent for MRI with phase one clinical trials. During 1991-1999, Dr. Fricke worked with the University of Florence on various projects involving multiplatform imaging for the diagnosis of vision disorders due to cerebral oncological complications. Later in 1994-1999, Dr. Fricke worked at Italy's Scientific Institute for Tumor Research (IST, Genoa, Italy). Dr. Fricke was a faculty member in the Department of Psychiatry at Wayne State University for three years and an Associate Professor in the Department of Neuroscience at Georgetown University Medical Center for five years. He has lectured at Georgetown University on the topic of imaging of CNS trauma, and he is well published in the field of brain trauma and rare disease linked neurodegeneration. During his time at Georgetown University Medical Center, he helped to set up the quality control procedures that are still in use today and defined a set of protocols, most notably the MR spectroscopic protocol used to reliably and reproducibly acquire spectra from a number of different brain regions. This protocol has been used in two large-scale imaging projects: the STAART (Studies to Advance Autism Research and Treatment) and the UCRDRC (Urea Cycle Rare-Diseases Research Center) center grants. Currently Dr. Fricke is the MRI Physicist at Children's National Medical Center. Dr. Fricke has worked on General Electric, Siemens, and Philips MRI platforms and has worked with neuroradiologists, radiologists, and researchers to develop MRI protocols that are uniquely suited for the specific study.

Dr. Fricke will have primary responsibility in the development of the MRI scan protocol and the QC procedures. He will work with Dr. VanMeter to ensure the scanning protocol that is used meets the needs of the project in terms of the contrast required for the various segmentation procedures and the acquisition parameters, with an eye towards finding the best trade-off between total scan time and MRI scan quality. Dr. Fricke’s long history in the MRI field makes him ideally suited to design the QC procedures for this project. He will work with Dr. VanMeter on the development of these procedures and their implementation to ensure that the data are truly comparable across all sites. In addition, he will make the annual onsite visits to each MRI center to ensure proper staff training and conduct a full and rigorous QC protocol.

Vlad Staroselskiy, M.S., has excellent expertise in systems and network protection. His experience has included designing and implementing the networking infrastructure for several banks, which require a high level of data security. He has successfully planned, designed, installed, and configured a number of systems around the world. Between 2002 and 2004 he also took part in the development of the NetBait product, which acts as a trap for network intruders trying to obtain unauthorized access to protected systems. More recently he has worked for a number of academic institutions including Wayne State University and Georgetown University Medical Center. Through these various positions he has been thoroughly trained in the requirements for HIPAA (Health Insurance Portability and Accountability Act) compliance and the need to protect PHI (Personal Health Information). As a Senior Systems Administrator at Georgetown University Medical Center, Mr. Staroselskiy has been involved in the maintenance of multi-platform computer systems, providing day-to-day support for research faculty and staff. This has included designing the websites for both the Center for Functional and Molecular Imaging (CFMI) and the Center for the Study of Learning (CSL). The website for CFMI included developing the calendar system for scheduling MRI scanner and EEG time. In addition, this system includes a database that is used to record laboratory notes regarding the scanning sessions. Specific projects have included planning, installing, and configuring Cisco PIX firewalls for CFMI, CSL, and the Small Animals Imaging Lab, with network monitoring using MIDAS, Nagios, Cacti, and MRTG software. He also implemented secure site-to-site VPN connections between these labs with secure off-site access. Mr. Staroselskiy has also set up and administers CFMI’s 19-node Linux cluster using a diskless LM/MPI computational design with a dedicated management node. He also has extensive experience installing, configuring, and maintaining various MRI image analysis and spectroscopy applications, such as MedX, LCModel, FSL, Matlab/SPM, AFNI, Paravision, and others.

Mr. Staroselskiy will be primarily responsible for the development of the networking protocols and procedures. He will implement the necessary procedures to ensure the appropriate level of security across the network of MRI facilities and the GMRC. He will also be the main developer of the web-based database system used in this project, leveraging his experience in the development of the CFMI database system. He and Dr. VanMeter will oversee a programmer dedicated to this project who will implement the database system, the MRI data transfer system, and the various notification systems. He will also oversee a system manager for the GMRC whose responsibility it will be to troubleshoot network and computer systems problems.

3. MRI Segmentation and Volume Measurement

MRI provides a method for examining tissues in-vivo, and with the use of high-resolution MR imaging it is possible to produce very detailed images of the brain. It has long been possible to quantify the volume of various complicated tissues such as gray matter, white matter, and CSF from MR images of the brain (VanMeter and Sandon 1992). A number of morphometric techniques have relied on human raters to manually trace the boundaries of the tissue of interest; however, this method inherently introduces subjectivity. To reduce this subjectivity, the raters are trained and their ability to trace the same boundary must be compared within rater and across raters.

Image segmentation techniques are a class of computer algorithms designed to automatically extract the boundaries of a given class of tissue or an organ. By their nature these algorithms provide an objective way to identify and measure particular parts of the image separate from the rest of the image. There are several basic techniques available that can be applied to a given image segmentation problem. These include 1) thresholding, which relies strictly on the intensity (pixel brightness) to isolate a given tissue; 2) region growing, which starts from a seed point in the tissue of interest and iteratively grows outward until the boundaries of the tissue are reached, typically based on a threshold; and 3) clustering, which iteratively assigns each pixel to one of k classes based on how close its intensity is to the intensities of a given class (Duda and Hart 2000). These image segmentation techniques provide a basic toolbox and are often combined to improve the overall result.
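
The three basic techniques can be illustrated with a minimal Python sketch. It is not the GMRC's software: the tiny test image, the intensity ranges, the 4-connected neighborhood, and the function names (threshold, region_grow, kmeans_1d) are all assumptions chosen only to make the ideas concrete.

```python
from collections import deque


def threshold(image, lo, hi):
    """Thresholding: mark pixels whose intensity lies in [lo, hi]."""
    return [[1 if lo <= v <= hi else 0 for v in row] for row in image]


def region_grow(image, seed, tol):
    """Region growing: expand from `seed` over 4-connected neighbors whose
    intensity is within `tol` of the seed intensity (assumed stopping rule)."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    mask = [[0] * cols for _ in range(rows)]
    mask[seed[0]][seed[1]] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and abs(image[nr][nc] - seed_value) <= tol:
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask


def kmeans_1d(values, centers, iterations=20):
    """Clustering: assign each intensity to the nearest of k class centers,
    then move each center to the mean of its members, and repeat."""
    centers = list(centers)
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                  for v in values]
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers


# Hypothetical 3x4 image with three intensity populations.
image = [
    [10, 12, 80, 82],
    [11, 13, 81, 150],
    [12, 79, 83, 152],
]
threshold_mask = threshold(image, 70, 90)
grown_mask = region_grow(image, (0, 2), 5)
labels, centers = kmeans_1d([v for row in image for v in row], [0, 90, 200])
```

On this toy image the threshold and the region grown from the seed select the same mid-intensity pixels, while the clustering step separates all three populations at once.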

An example of the output of a computer algorithm used to identify gray matter in a T1-weighted coronal slice of a human brain (VanMeter and Sandon 1992) is shown in Figure 1. This particular method uses a mathematical model of the distribution of MR image intensities called a material mixture model. In this model the distribution of intensities for each tissue class (gray matter, white matter, and CSF) is modeled with Gaussian distributions (Figure 1b). This type of model is required because, even though a given tissue class such as gray matter might be darker than white matter, gray matter will not have the exact same range of intensities across a series of images. Thus, this model provides a mechanism for identifying the range of intensities (thresholds) that most likely correspond to a given tissue class. In the material mixture model the thresholds correspond to the points where the distributions of two neighboring tissue classes cross. The complete algorithm combines the thresholds obtained from the material mixture model with a neural network trained to recognize the overall morphology of the different tissues (VanMeter 1993).
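
As a rough illustration of how a threshold falls out of such a model, the sketch below finds the intensity at which two weighted Gaussian tissue distributions cross. It covers only the threshold step, not the neural network stage, and it is not the published algorithm: the bisection search, the function names, and the means, standard deviations, and weights are assumptions made for the example.

```python
import math


def weighted_gaussian(x, mean, sd, weight):
    """Weighted Gaussian density for one tissue class."""
    z = (x - mean) / sd
    return weight * math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def crossing_threshold(class_a, class_b, tol=1e-6):
    """Find the intensity between the two class means where the weighted
    densities are equal. Each class is (mean, sd, weight), and class_a is
    assumed to have the lower mean."""
    lo, hi = class_a[0], class_b[0]

    def diff(x):
        return weighted_gaussian(x, *class_a) - weighted_gaussian(x, *class_b)

    if diff(lo) <= 0 or diff(hi) >= 0:
        raise ValueError("distributions do not cross between the two means")
    # Bisection: diff is positive near class_a's mean, negative near class_b's.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical (mean, sd, weight) values for a T1-weighted image.
csf = (30.0, 10.0, 0.2)
gray = (80.0, 10.0, 0.4)
white = (120.0, 10.0, 0.4)

csf_gray_threshold = crossing_threshold(csf, gray)
gray_white_threshold = crossing_threshold(gray, white)
```

With equal widths and weights the gray/white threshold sits midway between the two means; the smaller CSF weight pulls the CSF/gray threshold toward the CSF mean. Because the parameters are fitted per image, the thresholds adapt to each scan's intensity range.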

Total cerebral, left- and right-hemisphere, and cerebellar volumes can be calculated from high-resolution T1-weighted images such as the Siemens MPRAGE or the GE SPGR. Pre-processing of the images included: rating of the individual scans for quality; removal of non-cortical structures, such as the scalp; intra-subject registration of the stripped MPRAGEs; calculation of the mean MPRAGE; and bias-field estimation and correction. The scalp stripping and bias field correction were performed using BET (Smith 2002) and FAST (Zhang, Brady et al. 2001), respectively, from the FSL software library. The images were registered using a rigid-body transformation with AIR (Woods, Grafton et al. 1998).

Using a program called Graph-Cuts, each subject’s mean MPRAGE was subdivided into three compartments: the first two consisted of the left- and right-cerebral hemispheres and the third included both the cerebellum and brain stem (Liang, Rehm et al. 2005). This program uses a 12 degree-of-freedom transformation derived from registering a template volume to the subject’s mean image to reslice a volume with the three compartments previously labeled. From this initial labeling, three connected graphs representing the two cerebral hemispheres and the cerebellum are formed and then expanded, providing a preliminary labeling of the subject’s brain. This labeling is refined by determining the best locations to cut the three graphs from one another using a standard graph-cuts method (Liang, Rehm et al. 2005). The volume of each of these compartments was computed from the resulting labeled image. Total cerebral volume was computed by summing the volumes of the two hemispheres with the cerebellum and brain stem. The volume of CSF was removed in all volumetric calculations. In addition, a left-right symmetry index (SI) was computed as follows:

SI = 100 x (Lv - Rv) / [½ (Lv + Rv)]

where Lv and Rv are the volumes of the left and right hemispheres, respectively (Galaburda, Rosen et al. 1987).
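
A short worked example may help here. The Python sketch below computes a compartment volume by counting labeled voxels and then evaluates the symmetry index as defined above. The label codes, the voxel size, and the example volumes are hypothetical values chosen for illustration, not study data.

```python
def compartment_volume(labels, label, voxel_volume_mm3):
    """Volume of one compartment: number of voxels carrying `label`
    multiplied by the volume of a single voxel."""
    return sum(1 for v in labels if v == label) * voxel_volume_mm3


def symmetry_index(left_volume, right_volume):
    """SI = 100 x (Lv - Rv) / [1/2 (Lv + Rv)]; positive when the left
    hemisphere is larger."""
    mean_volume = 0.5 * (left_volume + right_volume)
    if mean_volume <= 0:
        raise ValueError("hemisphere volumes must sum to a positive value")
    return 100.0 * (left_volume - right_volume) / mean_volume


# Hypothetical flattened label image: 1 = left hemisphere, 2 = right hemisphere.
labels = [1, 1, 1, 2, 2, 0, 0, 1]
voxel_volume = 1.0 * 1.0 * 1.2  # mm^3, assuming 1.0 x 1.0 x 1.2 mm voxels
left = compartment_volume(labels, 1, voxel_volume)
right = compartment_volume(labels, 2, voxel_volume)

# Worked example with made-up volumes in cm^3: 100 x 40 / 500 = 8.
example_si = symmetry_index(520.0, 480.0)
```

Because the difference is divided by the mean hemisphere volume, the index is unitless and can be compared across subjects with different overall brain sizes.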

In the STAART (Studies to Advance Autism Research and Treatment) project, corpus callosum (CC) area measurements have been calculated using the seven subdivisions defined by Witelson: rostrum, genu, rostral body, anterior midbody, posterior midbody, isthmus, and splenium (Witelson 1989). The brain is first oriented with the Talairach atlas using a 6 degree-of-freedom transformation based on manual rotation of the midline and identification of the AC and PC using tools in MEDx. The outline of the corpus callosum is manually traced in the mid-sagittal plane of the transformed brain. The exact boundaries of the CC are identified using an intensity threshold inside the manually traced region. In addition, the anterior-most point of the inner convexity is identified. Witelson’s method subdivides the length of the CC between its anterior and posterior extents into areas using specific geometric rules. The area of each subdivision is calculated from the overlap of the subdivided regions and the hand-traced outline of the CC. In addition, the cerebellar vermal areas have been computed by manual tracing of lobules I-V, lobules VI-VII, and lobules VIII-X in the mid-sagittal plane of the Talairach oriented brain. Statistical comparisons of all the subregions of the CC and the cerebellar vermis are examined with and without correcting for total brain volume. A one-way multivariate analysis of variance (MANOVA) was conducted using SPSS 14.0 (SPSS, Inc, Chicago, IL) to test the null hypothesis that the area of these regions was not different between the two populations (autistic and typically developing children).

4. MRI Protocol Development

Drs. VanMeter and Fricke will work together to develop the MRI scanning protocol. Development of this protocol will build on the experience Dr. VanMeter has gained by participating in the ADNI (Alzheimer’s Disease Neuroimaging Initiative), the Valproate Neuroprotection study, and the NIH Pediatric Brain Development project. These projects have developed standard acquisition protocols that are designed to collect MRI data with contrast characteristics and quality that are as uniform as possible across all three of the major scanner manufacturers: GE, Philips, and Siemens. In particular, the ADNI protocol is quite attractive, as each of the scanner manufacturers has worked with the ADNI participating sites to distribute the sequences appropriate for that site’s particular scanner model. In addition, this protocol has been developed to work with both 1.5T and 3.0T MRI scanners. This protocol includes the collection of two high-resolution (1.0 x 1.0 x 1.2 mm³ voxels) T1-weighted images and a high-resolution (0.9 x 0.9 x 3.0 mm³ voxels) double-echo sequence, which acquires both the T2 and Proton-Density weighted images in the same scan. Additional calibration scans that characterize the B0 field are also acquired. The entire protocol can fit into a 30-minute scan slot. The NIH Pediatric Brain Development project also has a standard protocol optimized to provide cross-site consistency in contrast and quality. The particular needs of the SPRINT-MIND study will be taken into consideration when developing the protocol to ensure the scans collected provide the optimal data for measuring the different volumes of interest. For example, if the study requirements call for the measurement of hippocampal volume, one of the scans in the protocol will be collected perpendicular to the long axis of the hippocampus, as is done in the Valproate Neuroprotection study.

5. Quality Control Procedures

The quality control procedures will be developed based on the best practices and accumulated knowledge that Drs. VanMeter and Fricke have gained through their participation in other multi-site neuroimaging studies and their own experiences in the operation of MRI systems. The procedures will include four levels of quality control: site qualification, monthly QC on a phantom, acquisition of QC scans within 24 hours of each subject’s scan, and an annual onsite QC visit. The Magphan® Quantitative Imaging Phantom will be purchased for each MRI center from the Phantom Lab, Inc. (Salem, NY). This phantom contains 165 polycarbonate spheres mounted on a series of polycarbonate plates and posts filled with copper sulfate and water solutions. Measurements of the phantom are compared with the known positions of the spheres to give an accurate measurement of the distortion of the scanner. In addition, 4 of the spheres produce unique contrast on T1, T2, and PD-weighted scans and can be used to track changes in contrast over time. This phantom has the advantage of having been tested and used in the ADNI project. All of the QC scanning data will be uploaded to the GMRC using the same methods used for the subject data. The GMRC will develop automated procedures to track the quality at each site over time, measuring both spatial fidelity and contrast consistency. The annual onsite QC visit will be used by the GMRC to perform a more rigorous set of quality control tests as well as a living phantom scan using the MRI protocol from the study. This onsite visit will also be used to train and re-test the staff at each site in the appropriate procedures for the collection of the subject and QC data.
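
The spatial fidelity measurement can be sketched as a comparison of measured sphere centers against their known positions. The Python example below is only an illustration of that comparison: the coordinates are made up, and the 1.0 mm tolerance and the function name are assumptions rather than a specification of the phantom or of the planned GMRC procedures.

```python
import math


def distortion_report(known_positions, measured_positions, tolerance_mm=1.0):
    """Compare measured sphere centers with their known positions (in mm).

    Returns the mean and maximum displacement and the indices of spheres
    displaced by more than `tolerance_mm` (an assumed threshold)."""
    if len(known_positions) != len(measured_positions):
        raise ValueError("each known sphere needs exactly one measurement")
    errors = [math.dist(k, m)
              for k, m in zip(known_positions, measured_positions)]
    flagged = [i for i, e in enumerate(errors) if e > tolerance_mm]
    return sum(errors) / len(errors), max(errors), flagged


# Hypothetical coordinates for three spheres, in millimeters.
known = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
measured = [(0.0, 0.0, 0.0), (10.3, 0.4, 0.0), (0.0, 10.0, 1.5)]

mean_error, max_error, flagged = distortion_report(known, measured)
```

Storing the mean and maximum displacement from each monthly phantom scan would give one simple way to track a site's spatial fidelity over time.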

6. Database design and development

One of the database systems Dr. VanMeter and Mr. Staroselskiy have developed is in use in Georgetown’s Center for Functional and Molecular Imaging (CFMI). This system is used to store, manage, and integrate the neuropsychological and imaging data collected in the various projects ongoing in the 3.0Tesla MRI facility. One of the advantages of this system is that it employs a web-based interface. This allows researchers to enter data, sort, and retrieve results from their own desktop computers.

This system is designed to permit the rapid addition of new forms through the use of an underlying architecture that is based on a set of core database tables and a combination of HTML, JSP, and JavaScript web pages. Once the design elements have been determined, all forms are laid out and designed by a research assistant using Microsoft FrontPage, which is available through a site license at GUMC. This eliminates the need for a senior developer to be involved in the development of the forms.

One of the goals of the CFMI database project was to attempt to reproduce in the web interface the design of the original paper version of each instrument. This greatly simplifies the training of users on data entry procedures using the web-based forms. These web-based data entry screens are structured nearly identically to the paper forms from which the data are entered. Sample input forms from this system are shown for a part of the WAIS3 instrument (Figure 3, below).

[Figure: Web-based data entry form for the WAIS3 Digit Span subtest]

The administration of the handedness instrument involves checking one or more boxes on the paper form. This is replicated using check boxes in the same locations in the web-based form shown in Figure 1. As an example of automatic scoring, the web-based data entry form for the digit span test (Figure 2) computes the item and total scores from the entered trial scores.
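
The automatic scoring step can be sketched in a few lines of Python. The sketch assumes that each item consists of trials scored 0 or 1, that the item score is the sum of its trial scores, and that the total is the sum of the item scores; these scoring rules and the function name are assumptions for illustration and are not taken from the CFMI system.

```python
def score_digit_span(trial_scores):
    """Compute item scores and the total score from entered trial scores.

    `trial_scores` holds one tuple of trial scores per item; each trial is
    assumed to be scored 0 (fail) or 1 (pass)."""
    item_scores = []
    for item_number, trials in enumerate(trial_scores, start=1):
        if any(t not in (0, 1) for t in trials):
            raise ValueError(f"item {item_number}: trial scores must be 0 or 1")
        item_scores.append(sum(trials))
    return item_scores, sum(item_scores)


# Hypothetical entries for three items with two trials each.
item_scores, total_score = score_digit_span([(1, 1), (1, 0), (0, 0)])
```

Computing the derived scores in the form removes a manual arithmetic step from data entry and rejects out-of-range trial scores at the time they are entered.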

Data Submission Tracking

Dr. VanMeter’s role in another database system was specifically with regard to MRI data submission. The NIH Pediatric Database (NIHPD) project collected both longitudinal imaging and neuropsychological data over 6 years on 500 children ranging in age from several months to 18 years old at baseline. The purpose of the project is to develop a number of age-specific imaging templates and to examine normal brain development. Data were collected at seven research centers across the country and transmitted to a central coordinating center at the Montreal Neurological Institute (MNI). All of the data collected are made available to both the center collecting the data and the central coordinating center. Dr. VanMeter developed the image transfer database and protocols for this project. This database architecture was designed to be scalable in nature.

Transmission and storage of the image data collected at each of the sites participating in the NIHPD project required the development of an imaging database, specialized data transfer software, and quality control tools. Of particular concern is the proper tracking of the data with respect to accurate subject and MRI sequence identification. Data collected in this project are transferred from the MRI scanner to an onsite workstation that has a fully functional version of the database and image retrieval server. The data are reviewed locally for image quality using a set of tools designed and developed by Dr. VanMeter. All imaging studies are then transferred to a central data integrity center managed at Georgetown University by Dr. VanMeter. The data are passed through automated integrity checks, such as identifying missing slices, violations of the established imaging protocols, and mislabeled data, before being sent on to the central coordinating center.
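
The kinds of automated integrity checks described above can be sketched as follows. This is an illustrative Python outline, not the NIHPD software: the dictionary layout, the field names (sequence_label, instance_numbers, parameters), and the example protocol values are all hypothetical.

```python
def find_missing_slices(instance_numbers, expected_count):
    """Return the slice numbers in 1..expected_count that never arrived."""
    present = set(instance_numbers)
    return [n for n in range(1, expected_count + 1) if n not in present]


def check_series(series, protocol):
    """Run integrity checks on one received series and list any problems."""
    expected = protocol.get(series["sequence_label"])
    if expected is None:
        # A label not defined in the protocol suggests mislabeled data.
        return [f"unknown sequence label: {series['sequence_label']}"]
    problems = []
    missing = find_missing_slices(series["instance_numbers"],
                                  expected["slice_count"])
    if missing:
        problems.append(f"missing slices: {missing}")
    for name, value in expected["parameters"].items():
        if series["parameters"].get(name) != value:
            problems.append(f"protocol violation: {name}")
    return problems


# Hypothetical protocol definition and received series.
protocol = {"T1_MPRAGE": {"slice_count": 5,
                          "parameters": {"slice_thickness_mm": 1.2}}}
good = {"sequence_label": "T1_MPRAGE", "instance_numbers": [1, 2, 3, 4, 5],
        "parameters": {"slice_thickness_mm": 1.2}}
bad = {"sequence_label": "T1_MPRAGE", "instance_numbers": [1, 2, 4, 5],
       "parameters": {"slice_thickness_mm": 3.0}}
mislabeled = {"sequence_label": "XYZ", "instance_numbers": [1],
              "parameters": {}}

good_problems = check_series(good, protocol)
bad_problems = check_series(bad, protocol)
mislabeled_problems = check_series(mislabeled, protocol)
```

A series with an empty problem list could be forwarded automatically, while any other series would be held for review.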

The data transfer system developed for this project uses the international image transfer and storage protocol DICOM, Digital Imaging and Communications in Medicine (ACR/NEMA 2001). The DICOM standard includes complete specification of attributes of the data including fields for subject ID, imaging parameters, and sequence labels. The DICOM standard also defines a rigorous protocol for the transfer and storage of medical image data from a variety of sources including MRI. An advantage of the DICOM image transfer protocol over simpler methods such as FTP is that the receiving application must communicate any type of transfer failure to the sender. Furthermore, when such failures are detected, the sender is configured to automatically attempt to re-send the images that failed.
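
The resend-on-failure behavior can be sketched independently of any particular DICOM toolkit. In the Python outline below, `send` is a hypothetical callable that returns True when the receiver confirms storage of an image and False when it reports a failure; the retry limit is likewise an assumption. It is not a DICOM implementation.

```python
def transfer_with_retry(image_ids, send, max_attempts=3):
    """Send every image, re-sending only those whose transfer failed.

    Returns the images still failing after `max_attempts` passes, which
    would then require manual intervention."""
    pending = list(image_ids)
    for _ in range(max_attempts):
        pending = [image_id for image_id in pending if not send(image_id)]
        if not pending:
            break
    return pending


# Hypothetical sender: "img2" fails on its first attempt, then succeeds.
attempts = {}


def flaky_send(image_id):
    attempts[image_id] = attempts.get(image_id, 0) + 1
    return image_id != "img2" or attempts[image_id] >= 2


unsent = transfer_with_retry(["img1", "img2", "img3"], flaky_send)
never_sent = transfer_with_retry(["img4"], lambda image_id: False)
```

Only the failed image is re-sent on the second pass, and anything still failing after the final pass is returned to the caller instead of being silently dropped.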

This system has been in place since 2002 and has been used to successfully transfer over 10,000 image acquisitions to date, with only 150 cases (under 1.5%) requiring manual intervention. The stability and reliability of this system have exceeded the original requirements. In addition, laborious hand identification and manual transfer of the data have been eliminated, increasing the integrity and accuracy of the data submitted through this process.

Development of new databases, systems of data entry/flow, and data submission procedures

A comprehensive data management system will be developed to support the activities of the GMRC. This development process will leverage the work put into the database system already in place in the Center for Functional and Molecular Imaging (CFMI). The data entry system will be web-based, allowing multiple users to enter and access data at the same time. The GMRC will ensure that data access is provided in a secure manner and limited to GMRC, CC, and the appropriate MRI center personnel.

1. Database Management System Development

a. Database development plan

The development of a new database system for the GMRC will utilize a spiral model of software development (Boehm 1988). This type of development process includes all the components of any development project including the gathering of requirements based upon an assessment of the needs of the GMRC, development of a design document, implementation, testing and deployment. In contrast to a waterfall development process, all of these components are performed in multiple phases of the life of the project (Hogan, Smith et al. 2002). This type of development process incorporates a mixture of top-down and bottom-up design philosophies. The iterative development process of the spiral model has also been shown to lead to a flexible design that provides early and frequent feedback to the GMRC through rapid prototyping (Thayer and Dorfman 1997).

b. Design

The design process of the new database system will follow an Object Oriented Design (OOD) approach, which encapsulates the data of the system in terms of objects, the relationships between objects, and the operations that can be performed on them (Coad and Yourdon 1991). The advantages of the OOD method include modularity, data abstraction, and encapsulation. Given the high degree of overlap between the requirements of this database system and the one developed by Dr. VanMeter for the CFMI, a significant portion of the design of that system can be reused for the GMRC’s system.

To provide access to users across multiple sites (GUMC, CC, and the appropriate CCN), a web-based system built on the Model-View-Controller (MVC) architecture will be used. An MVC architecture segregates the functionality of the system between objects used to maintain data and those used to present data (Cavaness 2002). The advantages of this architecture are a) that it can be used to minimize the degree of interaction between objects and b) its usefulness in multi-tier enterprise web-based applications. The model captures the logic of the processes the system implements and manages access to the data of the system. In addition, the functionality of the system is encapsulated in the model. The view presents data from the model to the user while sending keyboard and mouse inputs to the controller. Finally, the controller defines the application’s behavior by interpreting the user’s inputs, sending requests to the model, and selecting between views for presentation. An overview of this architecture is shown in Figure 4.
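
The division of responsibilities in an MVC design can be shown with a minimal sketch. It is written in Python for brevity and does not represent the Java-based implementation planned for the GMRC; the class names, the request format, and the stored data are assumptions made for the example.

```python
class ScoreModel:
    """Model: owns the data and the rules for reading and changing it."""

    def __init__(self):
        self._scores = {}

    def set_score(self, subject_id, value):
        if value < 0:
            raise ValueError("score must be non-negative")
        self._scores[subject_id] = value

    def get_score(self, subject_id):
        return self._scores.get(subject_id)


class ScoreView:
    """View: turns model data into something presentable to the user."""

    def render(self, subject_id, score):
        return f"Subject {subject_id}: score {score}"

    def render_error(self, message):
        return f"Error: {message}"


class ScoreController:
    """Controller: interprets a user request, calls the model, picks a view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def handle(self, request):
        subject_id = request.get("subject_id")
        if request.get("action") == "update":
            try:
                self.model.set_score(subject_id, request["value"])
            except ValueError as error:
                return self.view.render_error(str(error))
        elif request.get("action") != "show":
            return self.view.render_error("unknown action")
        return self.view.render(subject_id, self.model.get_score(subject_id))


controller = ScoreController(ScoreModel(), ScoreView())
updated = controller.handle({"action": "update", "subject_id": "S001", "value": 12})
rejected = controller.handle({"action": "update", "subject_id": "S001", "value": -1})
shown = controller.handle({"action": "show", "subject_id": "S001"})
```

Because the view never touches storage and the model never formats output, either one can be replaced without changing the other.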

The model and controller parts of the system will be implemented using Java Beans (reusable software components) running on a Tomcat web server. Tomcat is an open-source server that provides an implementation of Java Servlets and JavaServer Pages (JSP) (Harrison and McFarland 2002). JSP technology is a mechanism for creating dynamic web-based applications that are platform-independent. One of the major features of JSP technology is the ability of the programmer to encapsulate chunks of Java code in HTML-style tags that can be used by a web-page developer in the creation of the views. The JSP custom-tag libraries communicate with the model’s Java Beans, which provide encapsulated access to the database in objects that abstract the details of the organization of the database from the rest of the application. The controller’s Java Beans act much like a ‘traffic cop’, directing user requests to the appropriate handlers. The system has handlers to process requests from the user such as updating the data displayed in a form, getting data from the database, and generating reports.

The view portion of the system will be implemented using JSP-based web pages to present the various neuropsychological tests, input forms, and menus to the user. These JSP pages are initially developed using Microsoft FrontPage as a simple means of laying out the HTML portions of the web page. The JSP custom-tag libraries greatly simplify the task of implementing the interface by allowing the web-page developer to add commands in the form of HTML-style tags to validate inputs, get data from the database, and insert data into the database. This simplification in the development of input forms provides a low-cost and efficient means of adding new forms to the system, since in many cases a research assistant can handle this work.

To ensure secure access to the data over the internet, the system will use 128-bit SSL (Secure Sockets Layer) encryption. In addition, a system of access control lists will be implemented. Users of the system will be categorized with respect to the rights they will have to read data from the database, enter data into the database, and modify data in the database. These control lists will be used to limit data access based on need.
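
A minimal sketch of such an access control list is shown below in Python. The three rights (read, enter, modify) come from the description above; the role names and their assignments are hypothetical and would be replaced by the categories defined for the GMRC.

```python
# Hypothetical user categories mapped to the rights they hold.
ACCESS_CONTROL_LIST = {
    "cc_analyst": {"read"},
    "mri_site_staff": {"read", "enter"},
    "gmrc_admin": {"read", "enter", "modify"},
}


def is_allowed(role, action):
    """Return True only if the role explicitly holds the requested right.

    Unknown roles and unknown actions are denied by default."""
    return action in ACCESS_CONTROL_LIST.get(role, set())


def require(role, action):
    """Raise PermissionError when the role lacks the requested right."""
    if not is_allowed(role, action):
        raise PermissionError(f"{role} may not {action} data")
```

For example, `is_allowed("mri_site_staff", "enter")` is True while `is_allowed("cc_analyst", "modify")` is False. Denying by default means that a newly added user category has no access until rights are granted explicitly.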

c. Database Implementation

The database will be implemented using an open-source SQL database engine called MySQL. This database engine is a low cost (free) implementation of ANSI standard SQL-99 and is used by corporations such as Google and Cisco Systems as well as governmental institutions such as NASA. In addition, there are a number of connectors for MySQL including two for Java.

The database structure itself is built around two major types of data: subjects and forms (see Figure 5). The subject data is the basic demographic data related to the subjects. For security reasons the subject data is split into two parts: public and private. The “public” data, which includes the subject ID, age, gender, and ethnicity, is accessible (under password protection) by all users of the system. The “private” data, which includes sensitive data such as name, telephone number, and address, is only accessible when users access this separate database directly on the server housed at GUMC. When a subject is administered a test, the values are recorded in the TestInstance and ItemValue tables. These tables can be thought of as instantiations of the Form and Item tables. In addition, the test information is linked to the subject through the Session table, which is used to track results longitudinally.

Figure 5. GMRC database architecture.

The forms portion of the database includes the various neuropsychological instruments, neuropathology reports, and the clinical visit data. Rather than developing a unique database table for each specific form, the CFMI database uses a generic form table that includes references to any number of generic question/score item table entries and the information needed to display the form on the screen. The item table entries are designed to allow for multiple data types, including string/text inputs, integers, and double-precision real numbers. Another feature of the item table entries is the ability to include data for automatic range checking as well as enumerated input set checking. By abstracting the various forms into a handful of general tables, it is possible to add new forms to the system very quickly. In the CFMI database project, this greatly reduced the amount of time required of a senior database programmer and led to the swift development of the data entry forms.
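The range checking and enumerated input set checking attached to a generic item can be sketched as follows. The sketch is in Python for brevity (the proposed system would be Java-based). The attribute names mirror the Item table fields in the schema diagram (Type, RangeMin, RangeMax, ValidInputSet); the type codes "str", "int", and "dbl" and the error messages are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Item:
    item_id: int
    type: str                                   # "str", "int", or "dbl" (assumed codes)
    range_min: Optional[float] = None           # None means no lower bound
    range_max: Optional[float] = None           # None means no upper bound
    valid_input_set: Optional[FrozenSet[str]] = None  # None means any input


def validate(item: Item, raw: str) -> List[str]:
    """Return a list of problems with one raw input; an empty list means valid."""
    text = raw.strip()
    if text == "":
        return ["missing value"]
    if item.valid_input_set is not None and text not in item.valid_input_set:
        return [f"{text!r} is not one of the allowed inputs"]
    if item.type == "str":
        return []
    try:
        value = int(text) if item.type == "int" else float(text)
    except ValueError:
        return [f"{text!r} is not a valid {item.type}"]
    errors = []
    if item.range_min is not None and value < item.range_min:
        errors.append(f"{value} is below the minimum {item.range_min}")
    if item.range_max is not None and value > item.range_max:
        errors.append(f"{value} is above the maximum {item.range_max}")
    return errors
```

Because the checks are driven by the item's own attributes, a new form needs only new rows in the item table rather than new validation code.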

d. Development timeline

It will be possible to significantly compress the development cycle of the GMRC’s database system by leveraging the effort Dr. VanMeter has already put into the development and testing of the database system used in CFMI. While the CFMI database structure will serve as a model for the development of the GMRC databases, it will still be necessary to develop GMRC-specific requirements and design documents as well as the new data collection forms (DCFs) needed by the GMRC. In addition, it will be necessary to add a mechanism for importing the data in the existing MDP Access database into the new database system. The various phases of the development process for the new database system are shown in Figure 6.

Figure 6. GMRC database system development plan.

2. Data Flow

a. Collection of raw data

The flow of data to the GMRC is shown in Figure 7. The following standard operating procedure will be used for data collection:

1) The DCFs (data collection forms) filled out by the MRI technician at the time of each scan will be entered in a secure web data entry form hosted at the GMRC.

2) Data in the DCFs will be checked for completeness at the time of entry. The computer will generate a list of corrections identified during data entry, including missing, out-of-range, or unclear values.

3) An automated tracking system will begin a process of checking for all of the scans in the study. Incomplete or missing scans will trigger an email notification to the MRI site’s point of contact if the scan data has not been received within three business days of submission of the DCF.

4) Data still missing or incomplete after 5 business days will trigger a review by the GMRC to track down the source of the problem.

5) An automated notification system will be developed to provide email reminders to the appropriate clinical site to schedule the individual’s follow-up scan. Notification will be based on the date of the individual’s previous MRI visit.

6) MRI data will be sent to the GMRC using a Java-based program that will first remove all protected health information (PHI) from the DICOM headers of the scans. The person sending the data will be prompted to provide the appropriate subject ID. Standardized MRI protocol scan names will be used at each site to identify the different parts of each study.

7) Once the scan data has been successfully uploaded to the GMRC, a trained research assistant will check the quality of each scan. The results of the QC will be entered into the GMRC database.

8) Scans not passing QC will trigger a notification to the MRI site that the subject should be rescanned.

9) Automated segmentation analyses will be run on each complete dataset and the results will be automatically compared to expected norms.

Figure 7. Data flow diagram.
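The escalation rule in steps 3 and 4 of the procedure (site notification after three business days, GMRC review after five) can be sketched as follows, in Python for brevity (the proposed system would be Java-based). The function names and status labels are hypothetical, and the sketch assumes that business days are Monday through Friday with no holiday calendar.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date that falls `days` business days (Mon-Fri) after `start`."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def scan_status(dcf_submitted: date, today: date, scan_received: bool) -> str:
    """Classify a scan relative to the 3- and 5-business-day deadlines."""
    if scan_received:
        return "complete"
    if today > add_business_days(dcf_submitted, 5):
        return "gmrc_review"   # step 4: GMRC tracks down the problem
    if today > add_business_days(dcf_submitted, 3):
        return "notify_site"   # step 3: email the site's point of contact
    return "waiting"
```

For a DCF submitted on a Friday, the three-business-day deadline falls on the following Wednesday, so the first notification would go out on Thursday.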

b. Data entry and validation

All data collected by the GMRC will pass through a QC procedure to ensure it can be used in the volumetric analyses. Automated segmentation analyses will be run on each complete dataset and the results will be automatically compared to expected norms. Scans whose results fall more than 2 standard deviations from the norm will be flagged and checked by the GMRC.
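The 2-standard-deviation rule can be sketched as a simple comparison of each measured volume against its norm. The sketch is in Python for brevity; the structure names and the idea of storing each norm as a (mean, standard deviation) pair are assumptions, since the proposal does not specify how the norms are represented.

```python
from typing import Dict, List, Tuple


def flag_outliers(volumes: Dict[str, float],
                  norms: Dict[str, Tuple[float, float]],
                  threshold_sd: float = 2.0) -> List[str]:
    """Return the structures whose volume lies more than threshold_sd SDs from the norm."""
    flagged = []
    for structure, volume in volumes.items():
        mean, sd = norms[structure]
        if sd <= 0:
            raise ValueError(f"norm SD for {structure!r} must be positive")
        if abs(volume - mean) / sd > threshold_sd:
            flagged.append(structure)
    return flagged
```

A flagged structure is a prompt for manual review by the GMRC, not an automatic rejection of the scan.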

Data Management

The central mission of the GMRC is to ensure the integrity of the data collected and stored in the GMRC’s database system. This includes ensuring the quality of the data being entered into the system, preventing unauthorized access, tracking data modification, handling missing values, as well as providing regular data backups.

1. Data Integrity

a. Automated data validation

As data are entered into the database, range checks and missing value identification will be performed, providing the data entry clerks with immediate feedback. Missing data will be coded in the database by filling the field with all 9s as prescribed by the CC’s data submission manual. When missing, out-of-range, unreadable, or logically inconsistent values are identified, the data entry clerk will flag the items on the paper form. The co-directors of the GMRC will conduct regular data validation meetings to resolve any issues so identified.

In addition to the feedback mechanism built into the data entry system, the GMRC will develop computerized error checking routines that will routinely validate the accumulated data. These routines will be designed to detect logical errors and outliers in the data; they will run automatically and generate reports that will be emailed to the co-directors.

b. Privacy/Security

The GMRC is extremely security conscious and will limit access to the GMRC’s database system through a system of access controls and data encryption. All data submission and access to the GMRC databases will be conducted through a secure web-based system. To ensure that the data is not compromised, 128-bit SSL (Secure Sockets Layer) encryption will be used to encrypt all web sessions. In addition, each GMRC member will be provided with a login and password. Users’ access to data will be limited based on their role in the SPRINT-MIND project. In particular, members of the GMRC will have data entry and modification privileges. Data access and retrieval will be determined by the policies set up by the CC.

c. Data modification tracking

Any change to database records after the initial entry will automatically be recorded in audit trail fields. Each audit trail field will include the name of the field being changed, the date of the change, and the name of the person making the change. This system of tracking changes to data will enable us to construct a change history for any record.
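A minimal sketch of such an audit trail follows, in Python for brevity (the proposed system would be Java-based). The field name, date, and person come from the description above; storing the old and new values is an added assumption that makes the change history easier to read, and all names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, List, Optional


@dataclass(frozen=True)
class AuditEntry:
    field_name: str       # name of the field being changed
    changed_at: datetime  # date of the change
    changed_by: str       # person making the change
    old_value: Any        # assumption: not listed in the proposal
    new_value: Any        # assumption: not listed in the proposal


def update_field(record: dict, audit_log: List[AuditEntry], field_name: str,
                 new_value: Any, user: str, now: Optional[datetime] = None) -> None:
    """Apply a change to a record and append a matching audit entry."""
    old_value = record.get(field_name)
    if old_value == new_value:
        return  # nothing changed, so nothing to audit
    audit_log.append(AuditEntry(field_name, now or datetime.now(), user,
                                old_value, new_value))
    record[field_name] = new_value


def change_history(audit_log: List[AuditEntry], field_name: str) -> List[AuditEntry]:
    """Return the entries for one field, oldest first."""
    return sorted((e for e in audit_log if e.field_name == field_name),
                  key=lambda e: e.changed_at)
```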

d. Missing values

Missing data will initially be flagged by the data entry clerks on the original DCF and submitted to the co-directors of the GMRC. The flagged DCFs will be verified and stored in the database by filling the field with all 9s as prescribed by the CC’s data submission manual. In addition, one of the tables in the database will be used to track the reasons for missing data, linking each record and field that has missing data with the corresponding missing-data code. A routine will be developed to extract the information for the CC’s “Missing Data Alert for Documentation” form.
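The all-9s convention and the reason-tracking table can be sketched as follows, in Python for brevity. The sketch assumes that the number of 9s equals the width of the field; the CC’s data submission manual is not reproduced here, so that rule, the function names, and the layout of the reasons table are hypothetical.

```python
from typing import Any, Dict, List, Optional


def missing_code(field_width: int) -> str:
    """Build the all-9s code for a field of the given width (assumed rule)."""
    if field_width < 1:
        raise ValueError("field width must be at least 1")
    return "9" * field_width


def store_value(record: Dict[str, str], missing_reasons: List[dict],
                record_id: int, field_name: str, value: Any,
                field_width: int, reason: Optional[str] = None) -> None:
    """Store a value, or the missing code plus a row in the reasons table."""
    if value is None or str(value).strip() == "":
        record[field_name] = missing_code(field_width)
        missing_reasons.append({
            "record_id": record_id,
            "field": field_name,
            "reason": reason or "unknown",
        })
    else:
        record[field_name] = str(value)
```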

e. Data backups

While the use of RAID level-5 provides a high level of fault tolerance since the data is redundantly stored and striped across multiple disk drives, all of the data in the database system will be backed up on a regular basis. The current backup system will be expanded with additional tape drives.

The following backup regime will be implemented:

• Daily incremental backups will be used to back up any data changed since the last full backup.

• Weekly full backups with a set of 4 tapes used on a rotating schedule.

• Monthly full backups with a set of 3 tapes used on a rotating schedule.

• Quarterly full backups with a set of 4 tapes used on a rotating schedule. These tapes will be stored off-site to protect the data against theft or loss from events such as fire, flood, or earthquake.

• Yearly full backups will be stored offsite. The tapes used for these backups will not be rotated.

With this backup system in place, the onsite tapes (four weekly and three monthly full backups) will make it possible to recover data from up to four months back in time. In addition, the offsite tapes will provide the ability to recover data from even further back.

f. Quality assurance

Quality assurance is a major focus of the GMRC. To that end, the GMRC will conduct regular data quality reviews to determine if corrections are required. These meetings will be used to resolve issues identified from the double-entry discrepancy check procedures, DCFs flagged for missing values, and other problems identified by the automated data checking routines. Minutes of these meetings will be used to document the resolution of each of the identified problems. In addition, the automated data modification tracking system built into the database system provides an audit trail of who made the changes to the data and when.

Additionally, the GMRC plans to institute an audit procedure to identify potential inconsistencies between the hardcopy DCFs and the database records. Quarterly audits of the database will be implemented by randomly extracting 5% of the records in the database for "manual" comparisons. Detected inconsistencies will be recorded and investigated in a systematic manner.
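Drawing the quarterly 5% audit sample can be sketched as follows, in Python for brevity. The function name is hypothetical, and rounding the sample size up (with a minimum of one record) is an assumption; the optional seed allows a given draw to be reproduced for documentation.

```python
import random
from typing import Iterable, List, Optional


def draw_audit_sample(record_ids: Iterable[int], percent: int = 5,
                      seed: Optional[int] = None) -> List[int]:
    """Randomly select `percent` percent of the records, without replacement."""
    ids = list(record_ids)
    if not ids:
        return []
    # Integer ceiling of len(ids) * percent / 100, with at least one record.
    sample_size = max(1, (len(ids) * percent + 99) // 100)
    rng = random.Random(seed)
    return sorted(rng.sample(ids, sample_size))
```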

1. Procedures Manual

Because the GMRC will design and produce the DCFs, the GMRC will also be responsible for ensuring that all MRI site personnel who will be working on the SPRINT-MIND project are fully versed in the entry forms, the MRI data submission program, and the scan tracking system. We will develop procedure manuals on the use of the DCFs, scan data submission, and scan tracking procedures. Additionally, procedures for tracking MRI scanner quality over time will be set up. The manual will include detailed, step-by-step instructions for filling in all relevant DCFs, what constitutes an acceptable scan, and the scan tracking and notification system. This procedures manual will also include instructions on personnel requirements, training procedures for new personnel, and QC procedures. A virtual training program will also be developed to train new users, with a set of quizzes to test their knowledge.

2. Training and Certification of Morphometric Raters

The following procedures will be used to train new raters and certify all personnel involved in the morphometric rating process. A standard set of images will be used for this purpose. With the help of a neuroradiologist, gold-standard boundaries will be identified for all of the structures being measured in the SPRINT-MIND study. A minimum of 10 separate scans will be included in the gold-standard dataset. The GMRC director will validate each rater by examining their inter- and intra-rater reliability in comparison to the established gold standard. A minimum reliability of 0.95 (inter-rater) and 0.97 (intra-rater) will be required. All raters will be retested annually. A list of both GUMC and HUH personnel and their certification status on all registry instruments will be maintained in the GMRC regulatory binder at each clinical site.
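The certification rule can be sketched as a pair of threshold checks plus an annual retest check, in Python for brevity. The thresholds come from the text above; the reliability statistic itself is not specified in the proposal, so the sketch takes already-computed coefficients as input, and treating "annually" as 365 days is an assumption.

```python
from datetime import date

INTER_RATER_MIN = 0.95  # minimum inter-rater reliability
INTRA_RATER_MIN = 0.97  # minimum intra-rater reliability


def passes_certification(inter_rater: float, intra_rater: float) -> bool:
    """A rater is certified only if both reliability thresholds are met."""
    return inter_rater >= INTER_RATER_MIN and intra_rater >= INTRA_RATER_MIN


def retest_due(last_certified: date, today: date) -> bool:
    """Raters are retested annually (assumed here to mean every 365 days)."""
    return (today - last_certified).days >= 365
```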

7. MRI Reading Center Infrastructure

The GMRC computer and networking systems will be built upon the existing resources in CFMI’s 3.0T MRI facility. These resources include a 750-square-foot equipment room with a dedicated Liebert Challenger 3000 temperature and humidity control system; a new IBM TS3310 tape library (3576-L5B) with 2 LTO4 drives and a total backup capacity of 24 TB; Arkeia 7.0 Network Backup data backup/retrieval software, which includes a federated data protection architecture; and three 72”x29”x36” rack-mount computer cabinets fully protected by 5 APC Smart UPS 5000 uninterruptible power supplies, which connect to a master computer to initiate an orderly shutdown of all equipment in the event of a sustained power failure.

Other major equipment in CFMI includes a 40-node Linux compute cluster, which has 20 TB of attached disk storage. Every node is equipped with standard software for statistical analysis and visualization of fMRI and structural MRI data, using packages such as SPM, FSL, and MEDx. All computers are linked via a 1000BASE-T Ethernet local area network (LAN). All of the CFMI computer equipment is connected to the internet via Georgetown University’s Internet2 connection. The CFMI LAN is protected by a Cisco PIX 525E firewall, which has two 1 Gbit ports for LAN and WAN traffic and a 100 Mbit interface for the "demilitarized zone" (DMZ) that hosts the CFMI web server. The firewall provides security from outside threats and supports constant virtual private network (VPN) connectivity for remote access by CFMI users.

Expansion of the CFMI Infrastructure for SPRINT-MIND

The CFMI infrastructure provides an excellent base upon which to build the infrastructure needed for the MRI Reading Center at a relatively low cost. To meet these needs and to maintain a strict separation between the CFMI data operations and those of the GMRC, the following additional equipment will need to be purchased: 5 IBM x3350 compute node servers, 1 IBM x3350 backup server, a 3Com network switch, an IBM LTO TS3100 Tape Library, an APC Smart UPS, and software licenses for Red Hat Enterprise Linux and Arkeia network backup software. These additional hardware components will be housed in the CFMI equipment room in a separate computer rack. Access to these systems will be strictly limited to those conducting GMRC-related work.

GMRC Training Facilities

Other resources available at Georgetown University that will be useful for conducting training include a Video Teleconferencing facility. This service is made possible via the University’s phone system, a conventional TV, and a PolyCom View Station. There are two rooms on campus that have been specially wired for teleconferencing.

Georgetown University has several conference rooms available for departmental functions. The Research Auditorium located in the Research Building can be reserved to accommodate larger workshops and seminars. The Research Auditorium houses state-of-the-art equipment and has technical experts available to assist with functions. In addition to individual conference rooms located throughout the University campus, the Leavey Conference Center houses an on-campus hotel for out-of-area participants along with catering services and several interconnected conference rooms.

References:

ACR/NEMA (2001). Digital Imaging and Communications in Medicine (DICOM). Rosslyn (VA), National Electrical Manufacturers Association (NEMA).

Boehm, B. W. (1988). "A Spiral Model of Software Development and Enhancement." IEEE Computer 21(5): 61 - 72.

Cavaness, C. (2002). Programming Jakarta Struts. New York, O'Reilly & Associates.

Coad, P. and E. Yourdon (1991). Object Oriented Design. New York, Prentice Hall.

Duda, R. O. and P. E. Hart (2000). Pattern Classification, Wiley-Interscience.

Galaburda, A. M., G. D. Rosen, et al. (1987). "Individual variability in cortical organization: its relationship to brain laterality and implications for function." Neuropsychologia 28: 529–46.

Harrison, P. and I. McFarland (2002). Mastering Tomcat Development. New York, John Wiley & Sons.

Hogan, J. M., G. Smith, et al. (2002). The Real World Software Process. Ninth Asia-Pacific Software Engineering Conference, Gold Coast, Australia, IEEE Computer Society.

Liang, L.-C., K. Rehm, et al. (2005). Automatic Extraction of Left and Right Hemispheres from MRI Brain Volumes Using the Graph-cuts Algorithm. Organization for Human Brain Mapping, Toronto, Canada.

Smith, S. M. (2002). "Fast robust automated brain extraction." Human Brain Mapping 17(3): 143-55.

Thayer, R. H. and M. Dorfman (1997). Software Requirements Engineering, Wiley-IEEE Computer Society Press.

VanMeter, J. W. (1993). Segmentation and restoration of magnetic resonance images using material mixture models. Computer Science. Hanover, NH, Dartmouth College. Ph.D.: 166.

VanMeter, J. W. and P. A. Sandon (1992). Identification of cortex in magnetic resonance images. SPIE Conference on Biomedical Image Processing and 3-D Microscopy. San Jose, CA.

Witelson, S. F. (1989). "Hand and sex differences in the isthmus and genu of the human corpus callosum. A postmortem morphological study." Brain 112 ( Pt 3): 799-835.

Woods, R. P., S. T. Grafton, et al. (1998). "Automated image registration: I. General methods and intrasubject, intramodality validation." J Comput Assist Tomogr 22(1): 139-52.

Zhang, Y., M. Brady, et al. (2001). "Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm." IEEE Trans Med Imaging 20(1): 45-57.

-----------------------

Figure 5 diagram text (label: Subject ID#). Tables and their fields:

FormItem: FormID, ItemID, SequenceNum

Form: FormID, ParentFormID, Name, Class, JSPView, JSPScope, Description

Item: ItemID, DisplayText, DisplayImageURL, RangeMin, RangeMax, DefaultValue, Type, ValidInputSet, HTMLFormat

TestInstance: TestInstanceID, SessionID, FormID, TestDate, Comment, LastModified, ModifiedBy, Description

ItemValue: TestInstanceID, ItemID, ItemStrValue, ItemIntValue, ItemDblValue, LastModified, ModifiedBy

Session: SessionID, SubjectID, SessionDateTime, SessionNumber, LastModified, ModifiedBy

SubjectPublic: SubjectID, Gender, Ethnicity, AgeInYears, LastModified, ModifiedBy

SubjectPrivate: SubjectID, FirstName, MiddleName, LastName, DOB, Gender, Ethnicity, Address1, Address2, City, State, Zip, Country, LastModified, ModifiedBy

Figure 6 diagram text, grouped by phase:

PHASE 1 (1 Month): Requirements refinement; Install hardware on GMRC server; Install Linux on GMRC server; Develop Prototype1; Port CFMI database to GMRC server; Add GMRC Clinical forms to new DB; Develop SQL and Tcl scripts to convert existing MDP DB into new DB; Deliver revised requirements document.

PHASE 2 (1 Month): Review requirements document; Deploy Prototype v0.1; Beta testing of Prototype v0.1; Revise CFMI DB system design to satisfy GMRC requirements.

PHASE 3 (1 Month): Review requirements and results of beta testing; Deliver GMRC DB system design; Develop Prototype v0.2; Modify interface based on beta testing; Incorporate all of the CC’s MDS; Incorporate Neuropathology BRC report.

PHASE 4 (2 Months): Review requirements and design documents and results of beta testing; Deploy Prototype v0.2; Beta testing of Prototype v0.2; Begin development of training manuals; Develop Prototype3; Modify interface based on beta testing; Add SQL procedures extracting data for CC; Incorporate remaining testing instruments; Deliver revised requirements and design documents.

PHASE 5 (2 Months): Deploy Prototype v0.3; Beta testing of Prototype v0.3; Beta test CC extraction procedures; Develop Production version of system; Incorporate changes arising from beta testing; Update training manuals to include final set of testing instruments and changes in interface.

PHASE 6 (1 Month): Deliver final version of requirements and design documents; Finalize training manuals; Deploy Production version of system; Train RA’s on data entry procedures; Train researchers on data access methods.

Figure 7 diagram labels: MRI 1; MRI 2; MRI 3; Form archival; Data QC and validation; QC report generation; GMRC; Fax Forms; GMRC Database System; Web based DCF and MRI scan submission; Web based data entry & QC; CC; Status report.

Figure 1. Example of volumetric measurement of gray matter. In the MR image pixels classified as gray matter are shown in red. The histogram shows the distribution of intensities regardless of tissue type (blue) and the Gaussian distribution models for the expected range of CSF (green), gray matter (red), and white matter (yellow).

Figure 2. Results of automatic segmentation of the cerebellum (red) and left (yellow) and right (green) hemispheres.

Figure 4. Model-View-Controller architecture (from )
