
XML DATABASES

A SEMINAR REPORT

Submitted By

ATUL KUMAR

in partial fulfilment for the award of the degree of

BACHELOR OF TECHNOLOGY

in COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF ENGINEERING

COCHIN UNIVERSITY OF SCIENCE & TECHNOLOGY KOCHI-682022 SEPTEMBER 2010

Division of Computer Engineering School of Engineering

Cochin University of Science & Technology

Kochi-682022

_____________________________________________________

CERTIFICATE

Certified that this is a bonafide record of the seminar work titled

XML Databases

Done by ATUL KUMAR

of VII semester Computer Science & Engineering in the year 2010 in partial fulfillment of the requirements for the award of Degree of Bachelor of Technology in Computer Science & Engineering of Cochin University of Science & Technology

Dr. David Peter S. Head of the Division

Ms. Anu M. Seminar Guide

XML Databases

ACKNOWLEDGEMENT

I express my sincere thanks to Ms. Anu M., my seminar guide, for her valuable suggestions and sincere vigilance; to Mr. Sudheep P. Eliydom (Staff in charge) for providing the right guidance and cooperation; and to Dr. David Peter S. (Head of the Division) for allowing us to use the facilities. I would also like to extend my sincere thanks to all other members of the faculty of the Computer Science and Engineering Department. Last but not least, I want to thank my friends for their cooperation and encouragement.

ATUL KUMAR

Division of Computer Engineering


ABSTRACT

The XML database is a rapidly growing technology that is poised to replace many existing technologies used for data storage. It uses XML and many of its associated technologies, such as DTD, XSL, XSLT, XPath and XQuery, as its framework. An XML document is self-describing and, because it is plain text, is compatible with all kinds of platforms. This makes it a very powerful technology.

We can store semi-structured data in XML databases. There are also protocols such as SOAP for accessing data and web services over the internet. Because of its simplicity and its compatibility with all kinds of platforms, XML is rapidly becoming the de facto standard for passing data over the internet.


TABLE OF CONTENTS

1. INTRODUCTION
2. SEMI-STRUCTURED DATA
3. XML
4. XML FOR SEMI-STRUCTURED DATA
5. XML DTD : DOCUMENT TYPE DEFINITION
6. XML SCHEMA
7. XPATH
8. XQUERY
9. XSLT
10. XML PARSER
11. XML DATABASE
12. SOAP
13. CONCLUSION
14. REFERENCE


1. INTRODUCTION

For three decades, application developers have relied on relational databases as the bedrock of the persistent data storage layer. While the technology is mature, today's requirements are becoming more complex, and relational databases may not be the right tool for the job in hand; but what else does a designer or developer pick if they know no better? Relational databases were developed in the days of procedural programming languages (e.g. C, COBOL and RPG). Programming techniques have evolved in many ways over the past 30 years, most notably with the introduction of the object-oriented approach, but the persistent storage model has stayed the same. This report questions whether developers have, unknowingly, been dumbing down and creating more work for themselves for many years, and attempts to give an eye-opener into a new approach to storing and retrieving data.

Today, data structures are often modelled as object hierarchies. Imagine a simple invoice in terms of an object hierarchy:

Simple Invoice, Theoretical Business Object

Invoice = {
    date : "2008-05-24"
    invoiceNumber : 421
    InvoiceItems : {
        Item : {
            description : "Wool Paddock Shet Ret Double Bound Yellow 4'0"
            quantity : 1
            unitPrice : 105.00
        }
        Item : {
            description : "Wool Race Roller and Breastplate Red Double"
            quantity : 1
            unitPrice : 75.00
        }
        Item : {
            description : "Paddock Jacket Red Size Medium Inc Embroidery"
            quantity : 2
            unitPrice : 67.50
        }
    }
}
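As a rough illustration of how naturally this hierarchy maps onto XML, the sketch below builds the same invoice as an XML tree using Python's standard `xml.etree.ElementTree` module. The dictionary layout, the choice of attributes versus child elements, and the helper names `invoice_to_xml` and `invoice_total` are assumptions made for this example, not something the original invoice notation prescribes.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Invoice data taken from the example above. Prices are kept as strings so
# they can be parsed exactly with Decimal (an illustrative choice).
invoice = {
    "date": "2008-05-24",
    "invoiceNumber": 421,
    "items": [
        {"description": "Wool Paddock Shet Ret Double Bound Yellow 4'0",
         "quantity": 1, "unitPrice": "105.00"},
        {"description": "Wool Race Roller and Breastplate Red Double",
         "quantity": 1, "unitPrice": "75.00"},
        {"description": "Paddock Jacket Red Size Medium Inc Embroidery",
         "quantity": 2, "unitPrice": "67.50"},
    ],
}


def invoice_to_xml(inv):
    """Build an <Invoice> element tree that mirrors the object hierarchy."""
    root = ET.Element("Invoice", date=inv["date"],
                      invoiceNumber=str(inv["invoiceNumber"]))
    items = ET.SubElement(root, "InvoiceItems")
    for entry in inv["items"]:
        item = ET.SubElement(items, "Item")
        ET.SubElement(item, "description").text = entry["description"]
        ET.SubElement(item, "quantity").text = str(entry["quantity"])
        ET.SubElement(item, "unitPrice").text = entry["unitPrice"]
    return root


def invoice_total(root):
    """Sum quantity * unitPrice over every <Item> in the tree."""
    return sum(
        Decimal(item.findtext("unitPrice")) * int(item.findtext("quantity"))
        for item in root.iter("Item")
    )


root = invoice_to_xml(invoice)
print(ET.tostring(root, encoding="unicode"))
print(invoice_total(root))
```

The whole invoice lives in one nested document, so reading it back requires no keys or joins; the tree can be walked directly, as `invoice_total` does.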

The following is an example relational structure containing this data:

Table Invoices

date          invoiceId
2008-05-24    421

Table InvoiceItems

invoiceId    description                                 quantity    unitPrice
421          Wool Paddock Shet Ret Double Bound ...      1           105.00
421          Wool Race Roller and Breastplate Red ...    1           75.00
421          Paddock Jacket Red Size Medium Inc ...      2           67.50

Representing this simple, single Invoice object in a relational database can be done, but even for something this simple you immediately need more than one table and table joins based on keys, and the object has to be spread over multiple tables. This leaves room for human error: when inserting and updating data, it is up to the developer to ensure that the keys match correctly. When rebuilding the object from the persistent layer, you need an SQL query that selects data from multiple tables. By nature, the query returns the data as a result set of flat, one-dimensional rows, and it is then up to the developer to rebuild the hierarchical object from scratch.
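The round trip described here can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration only: the column types, the in-memory database and the helper name `load_invoice` are assumptions made for the example, while the table and column names follow the relational structure shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables are needed even for this single invoice object.
conn.executescript("""
    CREATE TABLE Invoices (
        invoiceId INTEGER PRIMARY KEY,
        date      TEXT
    );
    CREATE TABLE InvoiceItems (
        invoiceId   INTEGER REFERENCES Invoices(invoiceId),
        description TEXT,
        quantity    INTEGER,
        unitPrice   REAL
    );
""")

# The developer is responsible for repeating the key (421) correctly.
conn.execute("INSERT INTO Invoices (invoiceId, date) VALUES (?, ?)",
             (421, "2008-05-24"))
conn.executemany(
    "INSERT INTO InvoiceItems VALUES (?, ?, ?, ?)",
    [
        (421, "Wool Paddock Shet Ret Double Bound Yellow 4'0", 1, 105.00),
        (421, "Wool Race Roller and Breastplate Red Double", 1, 75.00),
        (421, "Paddock Jacket Red Size Medium Inc Embroidery", 2, 67.50),
    ],
)


def load_invoice(conn, invoice_id):
    """Rebuild the hierarchical invoice from the flat rows of a join.

    Returns None if the invoice does not exist (or, because this sketch
    uses an inner join, if it has no items).
    """
    rows = conn.execute(
        """
        SELECT i.invoiceId, i.date, it.description, it.quantity, it.unitPrice
        FROM Invoices AS i
        JOIN InvoiceItems AS it ON it.invoiceId = i.invoiceId
        WHERE i.invoiceId = ?
        ORDER BY it.rowid
        """,
        (invoice_id,),
    ).fetchall()
    if not rows:
        return None
    # Every row repeats the invoice columns; take them from the first row.
    invoice = {"invoiceId": rows[0][0], "date": rows[0][1], "items": []}
    for _, _, description, quantity, unit_price in rows:
        invoice["items"].append({
            "description": description,
            "quantity": quantity,
            "unitPrice": unit_price,
        })
    return invoice


print(load_invoice(conn, 421))
```

Note that the invoice date comes back once per item row, and the nesting of items under the invoice exists only in the application code that reassembles it.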


To a programmer who has been developing with relational databases for some time this may seem like second nature, but to a new developer who has just learned the concepts of object-oriented programming it may seem a little alien.

Leaving aside the programmer's responsibility for maintaining the mapping between object and relational structures, the data types in SQL databases are quite simplistic, so all validation must be performed within the business logic layer of an application before any data can be inserted or updated in the database.

The SQL "CREATE TABLE" statement and the SQL data types a developer can bind to each column are too simplistic to be used as a means of validating data taken directly from a user's input. The business logic layer in today's applications therefore often performs additional validation, e.g. it checks that a field is a valid phone number or a valid e-mail address, or even that inserting the field into an SQL INSERT or UPDATE statement will not break the syntax or cause a security breach.
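The kind of business-logic validation described here might look like the following sketch, again using Python's `sqlite3` module. The `Customers` table, the helper names and the deliberately simplistic e-mail pattern are all hypothetical and chosen only for illustration; real applications usually need stricter rules.

```python
import re
import sqlite3

# Deliberately simple pattern: something@something.something, no spaces.
# This is an illustrative assumption, not a complete e-mail validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_email(value):
    """Return True if the value looks like an e-mail address."""
    return bool(EMAIL_RE.match(value))


def insert_customer(conn, name, email):
    """Validate in the business logic layer, then insert safely."""
    if not validate_email(email):
        raise ValueError("invalid e-mail address: %r" % (email,))
    # Placeholders (?) keep user input out of the SQL text, so quotes or
    # SQL keywords in the input cannot break the statement's syntax.
    conn.execute("INSERT INTO Customers (name, email) VALUES (?, ?)",
                 (name, email))


conn = sqlite3.connect(":memory:")
# The TEXT column type alone cannot express "must be an e-mail address".
conn.execute("CREATE TABLE Customers (name TEXT, email TEXT)")

hostile_name = "Robert'); DROP TABLE Customers;--"
insert_customer(conn, hostile_name, "robert@example.com")
print(conn.execute("SELECT name, email FROM Customers").fetchall())
```

Both checks sit in application code: the database accepts any text for either column, so the format check and the safe handling of hostile input are entirely the developer's responsibility.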

Object Relational Mapping (O/R Mapping) has definitely eased these problems with relational databases, because it allows a relational database to become a "virtual object database", but it has brought some problems of its own. O/R Mapping techniques and frameworks can be difficult to learn, and it is by no means simple to map complex Java classes with multiple descendant classes to a relational structure. Validating user input is still cumbersome and essentially still needs to be written in full in the business logic layer. O/R Mapping also adds a performance overhead, because the mapping process essentially attempts to emulate the natural functionality of an object-oriented database.

Object-oriented databases are designed to work well with object-oriented programming languages such as Java, C# and C++. Object databases use the same model as today's programming languages, as they store and index theoretical objects. They are generally recommended when there is a business need for high-performance processing of complex data.

Two things have held object databases back over the years: (a) the industry's resistance to change, and (b) the reluctance of the majority of developers in the industry to investigate new or alternative technologies to the ones that are commonplace.

Thankfully, however, change does happen. Today we are living in the information age, and businesses are talking to each other via complex XML data structures (SOAP and RESTful


................
................
