Using SAS to Handle HL7 Messages in A Health Information Exchange

嚜燒ESUG 2009

Applications Big & Small

Using SAS? to Handle HL7 Messages in A Health Information Exchange

Enxu Zhao, Kathryn J. Schmit, Linh H. Le

New York State Department of Health, Albany, NY

ABSTRACT

This paper will discuss issues regarding the design of an automated data processing system for the New York

State Department of Health*s (NYSDOH) Health Information Exchange (HIE). Base SAS and SAS/ACCESS are

used to access HL7 data from Oracle databases, generate minimum datasets (MDS), and initiate data transmission from NYSDOH to the Centers for Disease Control and Prevention (CDC) and local health departments.

SAS/INTRNET is used for data analysis and visualization. Some of the challenges encountered during system

development, such as SQL queries, X commands, ODS and HTML Web posts, will also be discussed.

INTRODUCTION

The Health Information Exchange (HIE) is a project sponsored by the CDC to establish an operational national

health surveillance system. Its plans include intensive data gathering from medical facilities by state and local

health departments, which will then send reports to a CDC-owned data warehouse. A networked system will allow for bi-directional information flow among public health agencies and clinical healthcare providers. The New

York State Department of Health (NYSDOH) functions as one of the nodes in this network. The success of the

project will depend on the ability of participants to use standardized messaging formats in an automated data

processing system.

The purpose of this paper is to present the system design for automated HIE data processing using SAS as core

software. This paper will focus on the SAS techniques for automated data processing, analysis, and visualization.

Some obstacles encountered during the system development will also be addressed.

HL7 MESSAGES IN HIE

The HL7 protocol, developed by the Health Level 7 Organization, is the most widely implemented standard for

healthcare information in the world. It consists of a standardized grammar and vocabulary that allow clinical data

to be shared among all healthcare systems. HL7 Version 2.5 is currently used as the standard for data exchange

in HIE in New York State (NYS). The following is a sample of a single HL7 v2.5 message.

Figure 1. HL7 Message

MSH|^~\&|ADT|HOSP1|RHIO1|NYSDOH|20090619104426||ADT^A08||P|2.5|||||||||BS1.0^HITSP^^

PID|||123456789AAA^^^HOSP1||||194201|F|||^^^NY^10309|||||||123456789BBB

PV1||E|||||||||||||||||123456789CCC|||||||||||||||||||||||||200906190707000000|

PV2|||^DIFFICULTY BREATHING

OBX|1|NM|AGE||66|years

DG1|1|FF|^PNEUMONIA COPD EXACERBATION^FF||200906191044260000|A

Message files may contain from one to tens of thousands of individual messages. Each message is in text formatting, with segments containing data relating to a specific topic (e.g., PID=patient identification) carried in fields

and subfields separated by standard delimiters. Large numbers of these HL7 messages are sent from Regional

Health Information Organizations (RHIOs) to NYSDOH every day. Our task has been to set up a data processing

system that can access these messages from an Oracle database, integrate the information into SAS datasets,

and transmit data automatically for various uses.

THE NEW YORK STATE (NYS) HIE SYSTEM

Figure 2 shows the structure of the HIE system in NYS. RHIOs serve as the data integration nodes. They first

convert the data collected from medical facilities to HL7 v2.5 messages following the implementation guide and

then transfer these messages to NYSDOH via PhinMS. These incoming data will be parsed by Rhapsody and

put into an Oracle database several times a day. As one of the state-level nodes in a national HIE network,

1

NESUG 2009

Applications Big & Small

NYSDOH first integrates these data with other surveillance data, then generates standardized datasets to share

with CDC and local health departments.

Figure 2. NYS HIE System

NEW YORK STATE*S HIE DATA PROCESSING SYSTEM

The NYSDOH HIE data processing system includes the following components:

Databases 每 repositories for data collected from RHIOs and NYS surveillance systems

SAS Server/SAS INTRNET Server 每 the core part of the data system, which has the following functions:

?

Access and query the databases, extract information, generate minimum datasets (MDS)

?

Store the reporting data in reporting data repository

?

Initialize data transmission to the CDC and the New York City Department of Health and Mental Hygiene

(NYCDOHMH)

?

Initialize automated distribution of SAS performance reports via email to system administrator and project

partners

?

Support the data query Web application, run dynamic data queries upon user*s request, generate statistical reports and send them to users* Web browsers via Web application server

Reporting Data Repository 每 store the reporting datasets and share the data with NYSDOH programs

Web Application Server 每 host the dynamic data query system interface

?

Handle the requests for data reports from users

?

Invoke SAS INTRNET server to start a data query process

?

Collect the SAS output and send it back to user*s Web browser

Email Server 每 in response to email sending request from SAS Server, distribute SAS performance reports to

system administrator and project partners

2

NESUG 2009

Applications Big & Small

Figure 3. NYS HIE Data Processing System

AUTOMATING SAS PROGRAMS

In the data processing system, all the SAS programs are running at scheduled times每some once a day, others

several times a day. There are several options for scheduling SAS jobs to run at a set time or on a repetitive basis. Windows Task Scheduler is the most popular and handy way to accomplish this task. It comes with all

Windows 95 operating systems or later versions.

To schedule a SAS job, a desktop user would click path ※Start 每> All Programs 每> Accessories 每> System Tools

每> Scheduled Tasks§ and then double-click on the ※Add Scheduled Task§ icon. At this point the Task Wizard appears and you would click ※Next§ and choose ※SAS 9.1(English)§ from the scroll-down box. The user can select

the frequency at which the task is supposed to be performed; after clicking ※Next,§ a new screen will open that

allows you to provide additional parameters for date and time when the job will be performed.

3

NESUG 2009

Applications Big & Small

On the next screen, you will be asked to provide a username and password. Make sure the account you provide

has sufficient authorization to perform the scheduled task, otherwise your job will fail to execute. On the final

screen, there is a check box labeled ※Open advanced properties for this task when I click Finish.§ Check this box

and it will direct you to a screen that allows you to specify the path and name of your SAS program. In the Run

box, type in a space after all the text that is already there, followed by 每sysin, then another space, and the complete path to your SAS program inside quotes. For example:

-sysin

'D:\HIE\data_processing.sas'

Click OK to close the window and add an icon for the job in the Scheduled Tasks folder. The default name for

SAS jobs is SAS 9.1(English); you must rename it to avoid confusion, especially when you have more than one

scheduled SAS jobs.

ACCESSING A DATABASE USING DATABASE STORED PROCEDURE

Before you can connect to an Oracle/Sybase database by SAS/ACCESS, make sure that you have database client software installed on your PC and that the database information has been correctly configured. There are two

basic methods of accessing data from Oracle/Sybase database: the LIBNAME method and the SQL PassThrough method. We will take Oracle as an example, but Sybase is similar.

LIBNAME method connects to Oracle database by defining ※Oracle§ as the engine in the LIBNAME statement,

as follows:

libname myoralib oracle user='myuid' password='mypassword' path='mydbpath';

For example:

libname oraprd03 oracle user='hie_owner' password='dragon' path='ora_f68vrec';

in this case, the path points to the connection as set up in the Oracle Client software and is the system ID (SID)

used for the Oracle database.

SQL Pass-Through method connects directly to the Oracle instance. The general format is:

proc sql;

connect to oracle (user='myuid' password='mypassword' path='mydbpath')

[sql statement]

disconnect from oracle;

quit;

As an example:

proc sql;

connect to oracle (user='hie_owner' password='dragon' path='ora_f68vrec');

%put &sqlxmsg;

create table hie.hl7 as

select * from connection to oracle

(select * from hl7_message);

%put &sqlxmsg;

disconnect from oracle;

quit;

run;

In the above example, the statement ※%put &sqlxmsg;§ simply displays any DBMS error message in the SAS

log.

These two methods can handle database queries in most cases, but when detailed information about the data

table is not available for database maintenance or data security reasons, an alternative is to query the database

by using a database stored procedure provided by the database administrator. In this case, the LIBNAME

method cannot accomplish the task. The only option is to use the SQL Path-Through method to access the database.

The following is an example of how to use a database stored procedure to query a database via the SQL PathThrough method.

4

NESUG 2009

Applications Big & Small

proc sql;

connect to oracle (user='hie_owner' password='dragon' path='ora_f68vrec');

%put &sqlxmsg;

create table hie.hl7 as

select * from connection to oracle

(spvs_get_data 'hie_owner',92,'*');

%put &sqlxmsg;

disconnect from oracle;

quit;

run;

spvs_get_data is a stored procedure in the Oracle database, which is programmed by database administrator.

'hie_owner', 92 and '*' are parameters of the stored procedure, standing for username, data table ID, and

columns in the selected data table, respectively.

HL7 MESSAGE PROCESSING

After HL7 messages are received from RHIOs via PhinMS, they are parsed into a file table, message table, and

segment-field tables by Rhapsody. Then the parsed data are saved in hierarchical Oracle data tables linked with

unique file sequence ID and message sequence ID. As can be seen in Figure 1, many fields within the HL7 message are empty; by parsing the data into separate fields, we can store only those fields containing data required

for the HIE project, which saves a great deal of database space. The following figure diagrams the hierarchical

data table structure used in our database:

Figure 4. HL7 Data Hierarchy Table Structure

In order to process the data, five different SAS jobs are scheduled to run: data fetching, data combining, data

generation, data transmission, and performance report distribution.

STEP 1 每 DATA FETCHING

This step is scheduled to run every four hours. It will extract the unprocessed HL7 messages from Oracle, file by

file. Only one file is processed in each loop. The program will keep running until all files in the database have

been processed or until a maximum running time of three hours has been reached. The maximum running time is

set to avoid the data access conflict between multiple thread programs.

The SAS codes for setting the maximum running time and controlling the program flow are as follows:

%macro fetch;

%let start_time = %sysfunc(datetime());

5

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download