Java Native Interface:



TABLE OF CONTENTS

1. INTRODUCTION 3

1.1 CoreWall 3

1.2 Background & Need Identification 3

1.3 Related Work 4

2. THE COREWALL SUITE: GEOSCIENCE CORE DATA INTEGRATION AND VISUALIZATION 6

2.1 Core Workflow Database 6

3. PROJECT DESCRIPTION AND WORK BREAK DOWN STRUCTURE 8

4. DATABASE 11

4.1 Database Design [13] 11

Annotation_Table 13

Annotation_Relationship_Table 14

Database_Table 14

DataFile_Table 14

Graph_Table 15

Graph_Annotation_Mapper_table 15

Graph_Image_Mapper_Table 16

Group_Table 16

Group_Database_Mapper_Table 16

Image_Table 17

Image_Annotation_Mapper_table 17

Image_Manipulation_Non-Uniform_Scaling 18

Image_Manipulation_Uniform_Scaling 18

Manipulation_sequencer_table 19

Session_Annotation_Mapper_table 19

Session_table 19

Session_Relationship_Table 20

Track_table 20

Track_Annotation_Mapper_table 21

Track_Image_Mapper_table 21

Track_Session_Mapper_Table 21

Users_Table 22

4.2 XML DTD[8] 22

5. TECHNOLOGIES USED / CONCEPTS APPLIED 25

5.1 XML DOM Parsing [8,10] 25

5.2 XML to RDB Mapping [2] 26

Introduction to the Problem of XML to RDBMS Mapping 26

Pros And Cons Of Xml Over Relational Model 26

Need To Convert Xml Data Into Relational Model 28

Challenges And Complexities To Efficient Storing Of Xml Databases 28

Storage 29

Creating a XML structure that corresponds to a database schema 29

Referential Integrity and storing of XML documents to a RDBMS 30

5.3 Java Prepared Statements 14] 30

5.4 Java Native Interface [4] 31

6. WORKFLOW [7] 32

6.1 Save Session: 32

6.2 Retrieve_Session 39

6.3 Queries 43

UserSessions: 44

SessionTracks: 46

ImageSessions: 47

SessionTrackImages: 48

ImageAnnotations: 49

ImageGraphs: 50

6.4 User Management [12,13] 50

7. SAMPLE SCREENSHOTS 52

8. SYSTEM REQUIREMENTS 55

9. TABLE OF ACRONYMS 56

10. REFERENCES 57

INTRODUCTION

1 CoreWall

CoreWall Suite(CWS) is a real-time stratigraphic correlation, core description (CD) and data visualization system to be used by the marine, terrestrial and Antarctic science communities. It is aimed to significantly alter and enhance the current approaches used for core description and analysis of sediment and rock cores by providing an integrated environment for these activities, for both field and repository environments.

2 Background & Need Identification

A primary need for marine, lake, and polar studies of sediment and rock cores, is an integrated environment for stratigraphic correlation, visual core description, and data visualization. Within the last few years, digital images of cores, microfossils, thin sections, etc. are being routinely collected, replacing traditional photography. This dramatic change to digital imagery is creating a huge change and opportunity in the way scientific measurements are captured, collected, analyzed, and distributed by the scientific community.

These images provide the fundamental template for all sediment descriptive work, including annotations about structures, lithologic variation, macroscopic grain size variation, bioturbation intensity, chemical composition, and micropaleontology, among other features.The integration of core-section images with discrete data streams and ‘nested’ images (e.g., smear slides, photomicrographs, thin sections) provides a robust approach to the description of sediment and rock cores. Integration of value-added data, later acquired in a timely manner by individuals and teams of scientists is another major priority. The real-time and/or simultaneous display of multiple integrated databases, with all the data rectified (co-registered) to the fundamental template of the core image, is sorely needed to expedite the process that currently takes months to years.

The cost of wasted time and lost opportunity for science caused by the lack of integrated data and image management is enormous and the challenge exists for all current geoscience’s coring programs. Currently, visual core description (VCD) within IODP(Integrated Ocean Drilling Program) is primarily carried out in manual fashion with observations carried out with pencil and paper and entered into a variety of 3rd party software (e.g. AppleCore), and Excel template spreadsheets to create summary “barrel sheets” (see Figure 1 for example) of basic lithology, structures, color, and grain size, etc. This is perhaps the most labor-intensive activity with up to 8 to 10 people allotted per expedition. The VCD summaries are also now supplemented by digital line scans of the split-cores, color reflectance measurements, and other data sets.

However, the automated measurements are not captured by the “barrel sheets” and certainly not integrated into the summaries. Only the interest of individual scientific party members drives combining these data into an overall site summaries capture any of this analysis.

Essentially, there is no data visualization system within IODP or any other drilling program that allows easy, interactive data visualization and integration. The ability to easily map and analyze these diverse datasets onto multiple depth scales interchangeably has now become a requirement. The need for a data visualization system that is scaleable, integrated, linked to outside sources, and allows for collaborative research both onboard and onshore is absolutely a necessity to improve the overall science.

CWS will be an integrated interpretive environment to facilitate all scientists that use geosciences data as a component of their research.

3 Related Work

LacCore is an integrated system that provides training and service to all researchers regardless of funding source, and that aims to function as a role model for handling of and primary analysis of lacustrine sediment cores. Its facilitates interdisciplinary studies of lake sediments and environmental history by bringing together researchers in an efficient workspace that provides standard and specialized equipment, knowledgeable staff, and training. A networked computer operates as a link to the collections and acts as a server to host the various databases. However, a quantum leap in data display capability is needed to better facilitate researchers who use the LacCore. CoreWall Suite will fill this glaring need.

Figure 1: Example Barrel Sheet

THE COREWALL SUITE: GEOSCIENCE CORE DATA INTEGRATION AND VISUALIZATION

The central component of the CWS is the Core Workflow Database (CWD) whose role is to intercept queries from end-users and retrieve the relevant pieces of data and meta-data, and stage it for rapid and frequent access by the users. Figure 2 depicts how CWD associates with other CoreWall Suite components.

[pic]

Figure 2: Components of CoreWall Suite

1 Core Workflow Database

The data required for a core interpretation session can be very large. The data for an entire IODP core, including core images, multi-sensor scans, smear slides and all related data, can amount to many gigabytes. If a scientist wants to simultaneously compare multiple cores, the size of the needed data can indeed be large. To compound this problem, many users will be interpreting at locations with slow Internet connections; in the field or drawing data from databases which themselves are on slow networks. In addition, users will be interpreting data from databases that are often designed as read-only archives and not designed to hold “works in progress” of investigators.

The Core Workflow database will be developed to address these two problems. First it will have interfaces to enable the CWD to retrieve user selected data from established databases such as JANUS, LacCore, dbSEABED, and PaleoStrat. It will also be able to pull data through the emerging portals such as CHRONOS which will allow much more diverse data to be accessible to the CWS than through any single database. This gives the user instant web-services based access to multiple data sources . Second, the CWD captures the results of analyses and interpretations made through accessing and correlating data retrieved from these databases while the user is working to analyze the cores. As the workflow is captured other collaborators can access it locally or remotely. For example, the work of one scientist can be shared from one shift to the next and biologists in a separate lab can watch as other labs in the ship interpret a specific core segment. In a higher bandwidth environment, such as a core lab or a university office, a group of collaborators could track the work of one-another as they work on the same cores.

A primary feature of the CWD is the co-registration of the data across multiple coordinate systems. For instance, once several wire length control points have been assigned depths, the remainder can be extrapolated, and depth is now stored as an alternate coordinate system for the well log. Networks of Core Workflow Databases can be connected to facilitate mirroring of data, and permissions can be set at each institution’s database to control the degree of sharing. A central Core Integration Database can also be established (for example, on a ship or a core lab on shore) to collect all contributions made by the community.

PROJECT DESCRIPTION AND WORK BREAK STRUCTURE

Corewall includes a database which stores data related to the work done by geoscientist in the form of sessions for e.g. core manipulations and observation related data . In general,

Session could be thought as an aggregated study done by a geoscientist upto the point of exiting Corelyzer, which is an integrated visualization tool for the study of lake and ocean sediment .

Corelyzer allows a geoscientist to load multiple cores and associate some text with them in the form of text(called as annotations). Apart from this textual information, a user's work in Corelyzer could also contain graphs, images and there associated annotations These annotations are notes about the core study. This study could contain the hierarchical associations between the annotations in terms of   parent child relationship amongst themselves.

Currently geoscientists are not able to build on or reuse previous core studies . This restricts them in terms of retrieving their older work (old sessions), extending their work and/or integrating work with other geoscientists and analysis on the basis of different CoreWall specific data attributes e.g finding out what sessions referred a particular image, what users used a particular image in the session. All this data is beneficial to their core studies and research.

CoreWall database is the main focus of this project which allow the researchers to save, retrieve  and query the session information. This information could then be used to go back in time to see what studies were performed or to analyze the studies at different points of time. In addition to this, it also allows execution of any ad-hoc queries on this session data

With the integration of database into Corelyzer , CoreWall system was separated into two independent components:  Corelyzer rendered the core data and met other visualization needs of the geoscientist and session database acted as a repository for the core study data from which user could retrieve, save or query on specific data attributes.

Figure 3: CoreWall Demonstration

Their was a need of bidirectional data communication (see figure 4) between session database and Corelyzer. XML was chosen as the medium for data exchange due to the following reasons:

Figure 4: High Level CoreWall Design

Firstly, XML could represent the hierarchical structure of CoreWall's data better than a relational model and secondly it avoided CoreWall from getting tied up to a specific database back-end, which gave the flexibility to switch to any other proprietary databases with minimal efforts.

The work break down to build the session database included the following tasks:

• Identification of database needs for the CoreWall suite (Section 1 and 2).

• Design a database that meets those needs.(Section 4.1)

• User Management Interface for CoreWall session database(6.4).

• Design a DTD to simulate the flat structure of CoreWall session database(Section 4.2) .

• Parse the XML state files to store information into the database. This involves intricacies of XML to RDBMS mapping(Section 5.2) .

• Retrieve existing session information as small XML files (Section 6.2)

• Generate query results as XML files(Section 6.3).

• Integration of database into CoreWall as its backend. This database serves as a repository of data to meet the data integration objectives of CoreWall Suite.

DATABASE

1 Database Design [13]

The database schema for CoreWall session database is depicted below. A detailed description about each table’s columns, their data types, data constraints and the integrity constraints imposed on each table is

described in the following pages.

Database Schema

Database Schema Cont….

Database Schema Contd…..

Annotation_Table

This table is used to store annotation related information. These annotations could belong to session, track, image or graph.

|Column_Name |Null |Data Type |Description |

|Annotation_ID |No |bigint(20) |A unique ID assigned to every annotation. |

|User_ID |No |bigint(20) |ID of annotation’s owner |

|Annotation_Author |No |Varchar(25) |Name of the author |

|Annotation_Text |Yes |Text(500) |Annotation text |

|Annotation_Type |No |Varchar(25) |This tells if an annotation is related to an image, graph |

| | | |track or session. |

|Timestamp |No |Timestamp |Time when the annotation was last updated. |

Primary Key : Annotation_ID

Foreign Key : User_ID REFERENCES track_table(track_ID) on DELETE CASCADE on UPDATE RESTRICT

Annotation_Relationship_Table

This table is used to store parent –child relationship between annotations.

|Column_Name |Null |Data Type |Description |

|Annotation_Parent_ID |No |bigint(20) |A unique ID assigned to every annotation |

|Annotation_Child_ID |No |bigint(20) |A unique ID assigned to every annotation |

Primary Key : Annotation_Parent_ID, Annotation_Child_ID

Foreign Key :

Annotation_Parent_ID REFERENCES Annotation_Table(Annotation_ID) on DELETE CASCADE on UPDATE RESTRICT

Annotation_Child_ID REFERENCES Annotation_Table(Annotation_ID) on DELETE CASCADE on UPDATE RESTRICT

Database_Table

The table is used to store information about external databases.

|Column_Name |Null |Data Type |Description |

|Database_ID |No |bigint(20) |A unique ID assigned to every external database |

|Database_Name |No |Varchar(255) |Name of the database |

|IP_Address |No |Varchar(20) |IP address of the machine hosting the database |

|Admin_email |Yes |Varchar(25) |Email address of the database administrator |

Primary Key : Database_ID

DataFile_Table

The table is used to store information about datafiles that are used to generate graphs

|Column_Name |Null |Data Type |Description |

|DataFileID |No |bigint(20) |A unique ID assigned to every data file |

|File_Name_1 |No |Varchar(255) |Name of the first file |

|File_Name_2 |No |Varchar(255) |Name of the second file |

Primary Key : DataFileID

Graph_Table

This table is used to store graph information.

|Column_Name |Null |Data Type |Description |

|Graph_ID |No |bigint(20) |A unique ID assigned to every annotation |

|Data_File_ID |No |bigint(20) |Id of the file which stores data for the graph. |

|Field_Name_1 |No |Varchar(255) |Name of graph’s first field |

|Field_Name_2 |No |Varchar(255) |Name of graph’s second field |

Primary Key : Graph_ID

Foreign Key :

Data_File_ID REFERENCES datafile_table(datafile_ID) on DELETE CASCADE on UPDATE RESTRICT

Graph_Annotation_Mapper_table

This table is used to store information that relates annotations with graphs

|Column_Name |Null |Data Type |Description |

|Annotation_ID |No |bigint(20) |A unique ID assigned to every annotation |

|Graph_ID_1 |No |bigint(20) |ID of the first graph |

|Graph_ID_2 |Yes |bigint(20) |ID of the second graph |

|Session_ID |No |bigint(20) |Session with which this graph_annotation is associated with. |

|Position_X_1 |No |Decimal(10,4) |X Position where graph 1 is laid out. |

|Position_Y_1 |No |Decimal(10,4) |Y Position where graph 1 is laid out. |

|Position_X_2 |Yes |Decimal(10,4) |X Position where graph 2 is laid out. |

|Position_Y_2 |Yes |Decimal(10,4) |Y Position where graph 2 is laid out. |

|Position_X_Ann |No |Decimal(10,4) |X Position where annotation is laid out. |

|Position_Y_Ann |No |Decimal(10,4) |Y Position where annotation is laid out. |

|Timestamp |Yes |Timestamp |Time at which this graph_annotation was created. |

Primary Key : Annotation_ID, Graph_ID_1, Graph_ID_2

Foreign Key :

Annotation_ID REFERENCES annotation_table(Annotation_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_ID REFERENCES session_table(session_ID) on DELETE CASCADE on UPDATE RESTRICT

Graph_ID_1 REFERENCES Graph_table(Graph_ID) on DELETE CASCADE on UPDATE RESTRICT

Graph_ID_2 REFERENCES Graph_table(Graph_ID) on DELETE CASCADE on UPDATE RESTRICT

Graph_Image_Mapper_Table

The table is used to store information about mappings between graphs and images during a user session.

|Column_Name |Null |Data Type |Description |

|Graph_ID |No |bigint(20) |Identification for the Graph |

|Image_ID |No |bigint(20) |Identification for the Image |

|Session_ID |No |bigint(20) |ID of the session |

|Position_X |No |Decimal(10,4) |X position of the graph in the image |

|Position_Y |No |Decimal(10,4) |Y position of the graph in the image |

Primary Key : Graph_ID, Image_ID

Foreign Key :

Graph_ID references Graph_Table(Graph_ID) on DELETE CASCADE on UPDATE RESTRICT

Image_ID references Image_Table(Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_ID references Session_Table(Session_ID) on DELETE CASCADE on UPDATE RESTRICT

Group_Table

The table is used to store information about the user groups of core wall. Each user is assigned to a group.

|Column_Name |Null |Data Type |Description |

|Group_ID |No |bigint(20) |A unique ID assigned to every group |

|Group_Name |No |Varchar(255) |Name of the group |

|Description |Yes |Varchar(255) |A brief description about the group |

Primary Key : Group_ID

Group_Database_Mapper_Table

The table is used to store associate groups and the databases that group have access to.

|Column_Name |Null |Data Type |Description |

|Group_ID |No |bigint(20) |A unique ID assigned to every group |

|Database_ID |No |bigint(20) |A unique ID assigned to every database |

|User_login |No |Varchar(255) |Login for the user |

|User_password |No |Varchar(255) |Password for the user |

Primary Key : Group_ID, Database_ID

Foreign Key :

Group_ID REFERENCES Group_Table(Group_ID) on DELETE CASCADE on UPDATE RESTRICT

Database_ID REFERENCES Database_Table(Database_ID) on DELETE CASCADE on UPDATE RESTRICT

Image_Table

This table is used to store information about images that are used in a corewall session.

|Column_Name |Null |Data Type |Description |

|Image_ID |No |bigint(20) |A unique ID assigned to every image |

|Image_Name |No |Varchar(255) |Name of the image. |

|Image_URL |No |Varchar(255) |URL where the image is stored |

|Image_Description |Yes |Varchar(255) |A brief description about the image |

Primary Key : Image_ID

Image_Annotation_Mapper_table

This table is used to store annotations associated with images.

|Column_Name |Null |Data Type |Description |

|Annotation_ID |No |Bigint(20) |A unique ID assigned to annotation. |

|Session_ID |No |Bigint(20) |Session to which the image and annotation belong |

|Core_Image_ID_1 |No |Bigint(20) |ID of the first image |

|Core_Image_ID_2 |Yes |Bigint(20) |IDof the second image |

|Position_X_1 |No |Decimal(10,4) |X Position where image 1 is laid out. |

|Position_Y_1 |No |Decimal(10,4) |Y Position where image 1 is laid out. |

|Position_X_2 |Yes |Decimal(10,4) |X Position where image 2 is laid out. |

|Position_Y_2 |Yes |Decimal(10,4) |Y Position where image 2 is laid out. |

|Position_X_Ann |No |Decimal(10,4) |X Position where annotation is laid out. |

|Position_Y_Ann |No |Decimal(10,4) |Y Position where annotation is laid out. |

|Timestamp |Yes |Timestamp |The time when this annotation was last saved. |

Primary Key : Annotation_ID,Core_Image_ID_1, Session_ID

Foreign Key :

Annotation_ID REFERENCES Annotation_table(Annotation_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_ID REFERENCES session_table(session_ID) on DELETE CASCADE on UPDATE RESTRICT

Core_Image_ID_1 REFERENCES Image_table( Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Core_Image_ID_2 REFERENCES Image_table( Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Image_Manipulation_Non-Uniform_Scaling

This table is used to store non-uniform scaling related image manipulation information.

|Column_Name |Null |Data Type |Description |

|Image_Manipulation_NUS_ID |No |bigint(20) |A unique ID for every Non uniform manipulation. |

|Image_ID |No |bigint(20) |ID of the image that is manipulated. |

|X_value |No |Decimal(10,4) |X value |

|Y_value |No |Decimal(10,4) |Y value |

|Width |No |Decimal(10,4) |Width |

|Height |No |Decimal(10,4) |Height |

|Scale_x |No |Decimal(10,4) |Scaling ratio X |

|Scale_y |No |Decimal(10,4) |Scaling ratio Y |

Primary Key : Image_Manipulation_NUS_ID

Foreign Key :

Image_ID REFERENCES Image_Table(Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Image_Manipulation_Uniform_Scaling

This table is used to store uniform scaling related image manipulation information.

|Column_Name |Null |Data Type |Description |

|Image_Manipulation_US_ID |No |bigint(20) |This is a unique ID for every uniform scaling |

| | | |manipulation. |

|Image_ID |No |bigint(20) |ID of the image that is manipulated. |

|Scale_x |No |Decimal(10,4) |Scaling ratio X |

|Scale_y |No |Decimal(10,4) |Scaling ratio Y |

Primary Key : Image_Manipulation_US_ID

Foreign Key :

Image_ID REFERENCES Image_Table(Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Manipulation_sequencer_table

This table is used to store the order of manipulations on a image in a particular session.

|Column_Name |Null |Data Type |Description |

|Sequencer_ID |No |bigint(20) |A unique ID for each manipulation record. |

|Session_ID |No |bigint(20) |Session to which this specific manipulation is|

| | | |done. |

|Image_ID |No |bigint(20) |Unique ID of the image |

|Image_Manipulation_ID |No |bigint(20) |Manipulation ID from one of the manipulation |

| | | |table(Uniform Scaling or Non-Uniform Scaling) |

|Manipulation_Type |No |Varchar(255) |Indicates the manipulation type. |

|Order_of_operation |No |bigint(20) |This records the sequence in which |

| | | |modifications are made to the image. |

Primary Key : Sequencer_ID

Foreign Key :

Session_ID references Session_table(Session_ID) on DELETE CASCADE on UPDATE RESTRICT

Image_ID REFERENCES Image_Table(Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_Annotation_Mapper_table

The table is used to store information about mappings between annotations and sessions.

|Column_Name |Null |Data Type |Description |

|Annotation_ID |No |bigint(20) |A unique ID assigned to every annotation |

|Session_ID |No |bigint(20) |Session with which this track_annotation is associated with. |

|Position_X_Ann |No |Decimal(10,4) |X Position where annotation is laid out. |

|Position_Y_Ann |No |Decimal(10,4) |Y Position where annotation is laid out. |

Primary Key : Annotation_ID, Session_ID

Foreign Key :

Annotation_ID REFERENCES annotation_table(Annotation_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_ID REFERENCES session_table(session_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_table

The session table stores information about a user session.

|Column_Name |Null |Data Type |Description |

|Session_ID |No |bigint(20) |It is used to uniquely identify every user session. |

| | | |Automatically generated by mysql. |

|Session_Name |No |varchar(255) |A unique name for the session. |

|User_ID |No |bigint(20) |This is used to associate a user with each session |

|Permission |No |tinyint(4) |This determines the accessibility of the session. Value 1: |

| | | |Session is private, Value 2: Session is accessible to group |

| | | |for rw. 3 means the public can read this session. |

|Time_Created |No |Timestamp |The time when the session was started. |

|Time_Last_Modified |No |Timestamp |Time when the session was last modified |

Primary Key : Session_ID

Foreign Key : User_ID REFERENCES Users_Table (User_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_Relationship_Table

This table is used to store parent–child relationship between sessions.

|Column_Name |Null |Data Type |Description |

|Session_Parent_ID |No |bigint(20) |A unique ID assigned to every session. |

|Session_Child ID |No |bigint(20) |A unique ID assigned to every session. |

Primary Key : Session_Parent_ID, Session_Child ID

Foreign Key :

Session_Parent_ID REFERENCES Session_table (Session_ID) on DELETE CASCADE on UPDATE RESTRICT

Session _Child_ID REFERENCES Session_table(Session_ID) on DELETE CASCADE on UPDATE RESTRICT

Track_table

The table stores information about session tracks.

|Column_Name |Null |Data Type |Description |

|Track_ID |No |bigint(20) |It is used to uniquely identify every track irrespective of |

| | | |the session. |

|Track_Name |No |varchar(255) |User assigned name for the track |

|Track_Description |Yes |varchar(255) |This contains detailed description about the track. |

Primary Key : Track_ID

Track_Annotation_Mapper_table

The table is used to store information about mappings between annotations and tracks.

|Column_Name |Null |Data Type |Description |

|Annotation_ID |No |bigint(20) |A unique ID assigned to every annotation |

|Session_ID |No |bigint(20) |Session with which this track_annotation is associated with. |

|Track_ID |No |bigint(20) |ID of the track |

|Position_X_Ann |No |Decimal(10,4) |X Position where annotation is laid out. |

|Position_Y_Ann |No |Decimal(10,4) |Y Position where annotation is laid out. |

Primary Key : Annotation_ID, Track_ID

Foreign Key :

Annotation_ID REFERENCES annotation_table(Annotation_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_ID REFERENCES session_table(session_ID) on DELETE CASCADE on UPDATE RESTRICT

Track_ID REFERENCES track_table(Track_ID) on DELETE CASCADE on UPDATE RESTRICT

Track_Image_Mapper_table

The table is used to associate tracks and images.

|Column_Name |Null |Data Type |Description |

|Track_ID |No |bigint(20) |ID of the track |

|Image_ID |No |bigint(20) |ID of the image |

|Session_ID |No |bigint(20) |ID of the session |

|Position_X |No |Decimal(10,4) |X Position where the image is laid out. |

|Position_Y |No |Decimal(10,4) |Y Position where the image is laid out. |

Primary Key : Track_ID, Image_ID

Foreign Key :

Track_ID REFERENCES Track_Table(Track_ID) on DELETE CASCADE on UPDATE RESTRICT

Image_ID REFERENCES Image( Image_ID) on DELETE CASCADE on UPDATE RESTRICT

Track_Session_Mapper_Table

The table is used to associate tracks with session.

|Column_Name |Null |Data Type |Description |

|Track_ID |No |bigint(20) |ID of the track |

|Session_ID |Yes |bigint(20) |ID of the session. |

|Position_X |No |Decimal(10,4) |X Position where track is laid out. |

|Position_Y |No |Decimal(10,4) |Y Position where track is laid out. |

Primary Key : Track_ID, Session_ID

Foreign Key :

Track_ID REFERENCES Track_Table(Track_ID) on DELETE CASCADE on UPDATE RESTRICT

Session_ID REFERENCES Session_table( Session_ID) on DELETE CASCADE on UPDATE RESTRICT

Users_Table

The table is used to store information about users of core wall.

|Column_Name |Null |Data Type |Description |

|User_ID |No |bigint(20) |A unique ID assigned to every user |

|User_Name |No |Varchar(255) |User Name |

|First_Name |No |Varchar(255) |First name of the user |

|Last_Name |Yes |Varchar(255) |Last Name of the user |

|Group_ID |No |bigint(20) |Group the user belongs to |

|Password |No |Varchar(255) |User’s password |

Primary Key : User_ID

Foreign Key :

Group_ID REFERENCES Group_Table(Group_ID) on DELETE CASCADE on UPDATE RESTRICT

2 XML DTD[8]

A DTD for the state that is used to pass session information(state file) . For information on how a DTD is created that conforms to a particular relational database schema refer to section 5.2.

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

| |

Figure 5: XML DTD that corresponds to the Relational Schema of section 4.1

TECHNOLOGIES USED / CONCEPTS APPLIED

This section contains short introduction to the concepts and technologies that were employed by the project and reasons for the same. These concepts/technologies are referenced throughout this report in different sections.

1 XML DOM Parsing [8,10]

CoreWall state files are written in a well-structured XML document. XML was chosen as a technology to

store state files as they prevent CoreWall from getting tied up to a specific database from the visualization

end. This state file data needs to be stored to the database in appropriate tables. This data retrieved from

the state file needs to be parsed before it is could be used by the SQL queries.

|Create a JAXP document builder. This is a wrapper around a specific XML parser. |

|DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance(); |

|DocumentBuilder builder =builderFactory.newDocumentBuilder(); |

|Invoke the parser to create a Document representing an XML document |

|Document document = builder.parse(someInputStream); |

|Obtain the root node of the tree. This returns an Element, which is a subclass of the more general Node class that represents an XML element. |

|Element rootElement = document.getDocumentElement(); |

|Examine various properties of the node using some of the methods provided by DOM API. |

|getNodeName: Returns the Name of the Node |

|getNodeType: Returns the Type of the Node .Returned value is a predefined constants in the Node class |

|getNodeValue: Returns the Value of the Node |

|getAttributes: Returns the attributes used by the element's start tag |

|getChildNodes: Returns the elements contained between the current element's start and end tags |

| |

|Modify properties of the nodes using some of the methods provided by DOM API. |

|appendChild: Add child Node to the Current Node |

|removeChild: Remove a Child node from the current Parent Node |

|setNodeValue: Set a node’s value |

Figure 6: Steps in DOM processing

Figure 7 show an overview of how this is achieved in the CoreWall Session Database. As the order in

which XML document elements are to be accessed is determined by the order in which the primary-

foreign key relationships are designed in the database schema therefore it was imperative to use a parser

that could allow us to access any part of the XML document out of the sequence in the actual XML file. To accomplish this DOM was used as it stores the XML document in memory and allows for faster

access to specific element in the XML document. The code specific details about parsing a DOM is

shown in Figure 6

Parsing using DOM involves the following two steps:

a) Generate a DOM data structure that represents the structure of the XML document.

b) Access and modification of data of interest using the methods provided by the DOM API.

Figure 7 shows the detailed steps in DOM processing

2 XML to RDB Mapping [2]

XML was chosen as a medium for data exchange between the session database and the visualization

component. The was done to prevent it from getting tied up to a specific database. For example if

MySQL is not able to handle the data loads , the database could be changed for MS-SQL Server or Oracle

with minimal effort. Now we discuss about the XML to RDB mapping issues.

Introduction to the Problem of XML to RDBMS Mapping

XML is getting increasingly popular for data exchange and storing for the web applications because of its portability and ease of data exchange features. It is useful when applications must communicate with other applications or integrate information from several other applications.

XML documents are self-contained i.e. they contain the data as well its presentation format. However querying of XML data is difficult because of its hierarchical structure. Over the past few years a lot of

work has been done to figure out optimal ways of storing XML data in a relational model and take advantage of its query processing abilities.

Pros And Cons Of Xml Over Relational Model

XML is self-describing in nature: An XML document represents the data along with the metadata. In a relational model data and the metadata exist at different places.

XML data is portable while a data in relational model is not. For example to exchange data between MS-Access and Oracle we need to do some data transformations even both of them are based on relational model.

Figure 7: Depicts how a state file is processed before its data could be stored into the session database

Adaptable to Changes: The format of an XML document is not rigid we can easily add new tags as per our requirements. However doing the same thing in relational model is not possible because of the restrictions imposed by the relational model constraints.

Good for storing tree or graph based structure: XML’s hierarchical structure makes it possible for us to store tree or graph structured data in XML documents. Databases using relational model do not allow us to store hierarchical data.

XML is considered to be a database in a weak sense and suffers from the limitations with respect to: Efficient access to data due to parsing and conversion, Efficient Storage, Indexes, Security, Multi-user access, Triggers, Transaction Management, Data Integrity, Queries across multiple documents.

Given these limitations XML documents work good in an environment consisting of small amounts of data, few users, and modest performance requirements. However it will fail in an environment, which has many users, requires good performance and needs strict data integrity.

Need To Convert Xml Data Into Relational Model

We saw above that XML is really good way of exchanging data over the web, however looking at its limitations above. There is a need to convert XML data into a relational model. So that data can take advantage of both XML and relational model features i.e. “take advantage of portability of XML” and “be able to exercise powerful database features like querying, efficient storage and security”.

Challenges And Complexities To Efficient Storing Of Xml Databases

In order to store XML document in a relational database we need to map the XML schema to a database schema. A database is represented by a set of flat tables linked together by one-to-one, one-to-many and many to many relationships. Each table has a fixed number of columns, which represent the attributes of the data model. Relationship between different tables is represented using the Primary and Foreign Key relationship. XML data on the other hand is tree structured and relationship between data is represented by containment i.e. a sub-element or attribute stored within another element. Thus we see that XML documents and relational databases have a tree and flat structure respectively. Therefore, we need to map XML’s tree structure so that it fits into an equivalent flat relational databases structure.

Storage

Whenever an XML document is to be stored in a relational database, there are two points of consideration namely optimal generation of relations to store the XML data and preservation of order information present in an XML document. We discuss both of them below.

a) Optimal Relation Generation

When converting XML documents we need to identify what relations should be created out of elements present in an XML document. Creating a relation for each element means we are able to capture all the data and there is no loss of semantics and data when it is mapped from XML to RDBMS. However they (large number of relations) are inefficient for querying because they increase the amount of join processing to be done during querying. Since join is the most costly operation to be performed on a relational database. Therefore, for performance reasons we need to create an optimal number of relations from an XML schema/DTD.

b) Order preservation

Order is an important feature of XML documents and there is an implicit ordering of elements according to the document order. This ordering also determines the semantics/interpretation of data. The relational model on the other hand is not ordered. To ensure that the mapping of data is lossless (no loss of information) we need to make sure that order is preserved during the conversion between XML and relational form. There are two basic features desired from any ordering scheme:

a) We should be able to reconstruct an XML document using the ordering information preserved in relational schema.

b) This ordering information should facilitate translation of ordered XML queries to SQL queries.

Creating a XML structure that corresponds to a database schema

The following steps are used to convert a RDBMS structure to an equivalent XML DTD

➢ For each table in the database create an element only content because element only tags allow us to deal with data points.

➢ For each column in the table create an attribute. They constrain the number of values that can be contained unlike elements thus help in simulating data points in a XML’s tree structure.

Referential Integrity and storing of XML documents to a RDBMS

Referential Integrity means that foreign keys always point to valid primary keys . In other words, a row containing the primary key must be inserted before inserting the row containing the foreign key e.g. a session row must be created before creating the tracks, annotations that relate to it. Thus referential Integrity imposes a certain order, which is needed for transferring the data from a XML document to the database and also decide the order in which XML elements are to be accessed.

For e.g. in our saveSession example in the workflow section the session row is created first then the session ID generated is used throughout the save session process.

3 Java Prepared Statements 14]

Java prepared statements are pre-complied SQL statements .The SQL statement contained in a Java PreparedStatement could have more than one IN parameters. IN parameters are the parameters whose values are not specified when the SQL statement is created. Instead the SQL statement has a ? mark which acts as a placeholder for each IN parameter.Values for all the IN parameters must be supplied or set before executing the prepared statement. An example is shown in Figure 8.

These prepared statements provide the following three benefits

1) Increased Portability: With prepared statements the responsibility of handlijng the conversion and formatting issues of data in queries is shifted onto the implementing JDBC driver. e.g. the problems of handling date values, how to handle ‘ are all handled by the prepared statements independent of the programmer’s code. Thus, making the code more portable.

2) Security: Prepared Statements provide protection against SQL Injection attacks i.e. they prevent malicious users from running the code that was not intended.

3) Performance benefits: Prepared Statements are more efficient than the string statements. They reduce the amount of work the JDBC driver and the database has to do when the statements are reused. At the first execution these prepared statements are parsed and compiled and likely put into the statement cache. This caching allows to save the execution path which cannot be accomplished with string statements which have variable parameters .As SQL statements with different parameters appear different to the database thus caching does not happen.

|public static int userSessions(String uname) |

|{ |

|PreparedStatement prepareGetUserSessions=null; |

|String GetUserSessions=null; |

|GetUserSessions="Select session_name,session_id,ST.user_id from users_table |

|UT,session_table ST where UT.User_ID=ST.User_ID and |

|User_Name=?"; |

|prepareGetUserSessions=con.prepareStatement(GetUserSessions); |

|prepareGetUserSessions.clearParameters(); |

|prepareGetUserSessions.setString(1,uname); |

|rsGetUserSessions=prepareGetUserSessions.executeQuery(); |

|…………………………….. |

|…………………………….. |

|} |

Figure 8: Prepared Statement example that executes the userSessions query

4 Java Native Interface [4]

Java Native Interface (JNI) is a feature provided by the Java platform.. It allows us to build applications that have use both native code in programming languages like C and C++ with code written in Java. It allows programmers to combine the power of Java platform without having to abandon their investments in the legacy code.

JNI allows to handle two way situations where we need to combine Java applications with native code(refer to Figure 9). Thus, JNI provides for two types of native code.

1) Native libraries : We could write Java code that invokes functions written in other languages. These native methods are called in the same way as the Java methods. These native methods reside in native libraries.

2) Native Applications : JNI’s invocation interface allows us to embed Java Virtual Machine implementation into native applications. Native applications can link with a native library that implements the JVM and use the invocation interface to execute software components written in Java.

Figure 9: JNI acts as communicating channel between Java and legacy code

WORKFLOW [7]

In this section, detailed diagrammatic descriptions of CoreWall’s workflow are shown. They depict the high level view in terms of the user activity and the way system handles a user activity. The detailed diagrams show low level details in terms of the sequence of tables affected during a particular activity. These user activities could include saving of a session from the state file, retrieval of a specific session, executing queries to get the desired data.

1 Save Session:

This is used to store a session from a state file into the database. The name of the XML file that contains the session information is passed by the C function. The data to be inserted into the database from the XML file is parsed using the DOM parser.Figure10 below gives a high level view of the save session

Figure 10: High Level process of Save Session

process in term of system events and user activity .

Figure 11: Save Session Workflow-1

Figure 12: Save Session Workflow-2

Figure 13: Save Session Workflow-3

Figure 14 below depicts the tree structure of the state file and what tables each node affects, when the file is parsed and used to store the session information into the session database

Figure 14: Tree structure of XML file and what tables are affected by each element

Figure 15: Sample state file

Figure 16 below shows tables contained in CoreWall’s session database .

Figure 16: Database Schema Snapshot

2 Retrieve_Session

Retrieve session is used to retrieve a session belonging to an authorized user. It retrieves the session information as a number of small XML files(see Figure 20 and 21). These files are then used to recreate the session on the fly. For information on what files are generated during session retrieval and there naming conventions please refer to table in Figure 19.Figure 17 below gives a high level view of the retrieve session process in term of system events and user activity. The low-level data and logic flow during session retrieval is depicted in Figure 18.

Figure 17: High Level process of Retrieve Session

Figure 19 : Naming conventions for xml files that contain the session information

Figure 20: Retrieved XML file that shows the parent-child relationship between annotations

Figure 21: Retrieved XML file that shows graphs associated with an image in a particular session

Figure 22 below represents the snapshot of an example of retrieve session.

Figure 22 : XML files generated by retrieve session information

3 Queries

In addition to retrieve session and save session features, there is a querying facility available to the CoreWall user. This could become handy when the geoscientists are trying to analyze their previous studies or when they are trying to integrate information from different sessions. The query feature is highly extensible and more queries could be easily added following the template of the following queries. In this section, we explain the workflow of some of the queries(refer Figure 23 for High Level Query Process). The queries parameters are passed through C functions and the results are generated as XML files.

Figure 23: High Level process for any query execution

UserSessions:

This query is used to retrieve information about sessions that are associated with a given user name. The result is produced as an XML file(refer Figure 25). Figure24 below shows the workflow of the process and the query that is executed to retrieve this information.

Figure 24: UserSessions query workflow

Figure 25: Sample XML file generated by userSessions query

SessionTracks:

This query is used to retrieve information about tracks that are associated with a given session name. Result is generated as XML file(refer to Figure 27). The workflow and query executed is shown in Figure 26 below.

Figure 26: A figure depicting how to get tracks associated with a given session name

Figure 27: Sample XML file generated by sessionTracks query

ImageSessions:

This query is used to retrieve information about sessions that are associated with a given image name. Result is generated as an XML file. The workflow of process and corresponding query executed is shown in Figure 27 below.

Figure 27 : Get sessions associated with a given image name

SessionTrackImages:

This query is used to retrieve information about tracks that are associated with a given imagename. The workflow of process and corresponding query executed is show in the Figure 28 below.

Figure 28: Get tracks associated with a given image name

ImageAnnotations:

This query is used to retrieve information about annotations that are associated with a given image name across different sessions. The workflow of process and corresponding query executed is show in Figure 29 below.

Figure 29: Get annotations associated with a given image name

ImageGraphs:

This query is used to retrieve information about graphs that are associated with a given image name. The workflow of process and corresponding query executed is show in Figure 19 below.

Figure 30: Get graphs associated with a given image name

4 User Management [12,13]

There was a need to include the user and database management queries. The former was

needed to manage the users of CoreWall in terms of users, permissions and group. Latter was needed for integration of session database with other database.

This involved writing C code (example code in Figure 31) that could be used to perform standard user management functions which include

a) Creation of CoreWall user account

b) Deletion of a CoreWall user account

c) Password change feature for a CoreWall user

d) Creation of a group

e) Deletion of a group

f) Creation of a database references and their association with session database.

Figure 31 : Code that creates a new CoreWall user account

SAMPLE SCREENSHOTS

In this section a few screenshots generated by the visualization component are included. These screenshots depict how the annotations, images, tracks, graphs are presented to the user.For e.g. Figure 32 shows graphs ,tracks and images from a session. This screenshot also details the tables that are affected when the user saves this session.

Figure 32 : Track ,Image and Graph and what tables they affect when the session is saved

Figure 33: Track and its annotation and what tables they affect when the session is saved

Figure 34: Image and its annotation and what tables they affect when the session is saved

SYSTEM REQUIREMENTS

This section specifies the system requirements for both the server machine running the MySQL server and the client machine containing the Session Database along with visualization code.

|Client Side |Server Side |

|JNI version 1.4 |MySQL 4.1.12-standard |

|Java Run Time Env. jre1.5.0_05 | |

|Java Development Kit JDK 5.0 | |

|Visual 2003 | |

|mysql-connector-java-3.1.10 | |

|RS2DOM Class (included in the CD) | |

| | |

TABLE OF ACRONYMS

|CD |Core Description |

|CWD |Core Workflow Database |

|CWS |CoreWall Suite |

|DOM |Document Object Model |

|DTD |Document Type Definitions |

|IODP |Integrated Ocean Drilling Program |

|JDBC |Java Database Connectivity |

|JNI |Java Native Interface |

|JVM |Java Virtual Machine |

|MS-SQL |Microsoft SQL Server |

|RDBMS |Relational Database Management System |

|SQL |Structured Query Language |

|VCD |Visual Core Description |

|XML |Extensible Markup Language |

REFERENCES

[1] Michael, kofler. MySQL. 2001 ed. : a! Apress,

[2] Ronald, Bourret. "Data Transfer Strategies." at URL : . 2001.

[3] Liang , Sheng . Java(TM) Native Interface: Programmer's Guide and Specification . : Sun

Microsystems,1999

[4] Java Native Interface at URL :

[5] JDBC2XML/ at URL :

[6] LRC Core Facility - barrel sheets at URL :

[7] CoreWall at URL :

[8] Extensible Markup Language (XML) at URL:

[9] Java Technology at URL :

[10] org.w3c.dom (Java 2 Platform SE v1.4.2) at URL :



[11] MySQL AB :: Download Connector/J 3.1 at URL



[12] MySQL 5.0 Reference Manual :: 23.2 MySQL C API at URL



[13] MySQL AB :: The World's Most Popular Open Source Database at URL



[14] Using Prepared Statements at URL :



-----------------------

int sdb_create_user(char *uname,char *fname,char *lname,char *grp, char *pwd)

{

char query_create_user[xlong_query];

char *query1="insert into users_table(User_Name,First_Name,Last_Name,Group_ID,password) Values('";

int grp_check=-1,create_query_status=-1;

grp_check=sdb_check_grp_exist(grp);

if(grp_check ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download