The Florida State University



College of Arts and Sciences

A Java Implementation of the

Simple Object Access Protocol

By Darui Xu

October 21, 2002

A project submitted to the Department of Computer Science

In partial fulfillment of requirements for the

Degree of Master of Science

Major Professor: Dr. Robert van Engelen

Master Project committee

-----------------------------------

Prof. Robert van Engelen

Major Professor

-----------------------------------

Prof. Hilbert Levitz

Committee Member

----------------------------------

Prof. Daniel Schwartz

Committee Member

Acknowledgement

The author wants to thank her Major Professor, Dr. van Engelen, for providing the original idea of the project. The project could not have been finished without his constant help and guidance. The author also wants to thank Dr. Levitz and Dr. Schwartz for agreeing to serve as committee members.

The author wants to thank the entire faculty and staff of the Computer Science Department at the Florida State University for their help during her three years of study there.

Table Of Contents

Abstract

1 Introduction

1.1 Web Services and SOAP

1.2 SOAP Features

1.3 SOAP and XML

1.4 SOAP Architecture

1.4.1 The SOAP Specification

1.4.2 The Basic SOAP Payload Structure

1.5 The Use of SOAP for RPC

1.5.1 SOAP, RMI and XML-RPC

1.5.2 SOAP RPC format

2 SOAP for Java: Project Motivation and Accomplishment

2.1 SFJ Motivation

2.2 SFJ Accomplishments

3 SFJ Design and Implementation

3.1 Serialization

3.1.1 Locating Multi-references

3.1.2 Construct the message

3.1.3 Print the independent elements

3.2 Transmission

3.3 De-serialization

3.3.1 Locate the start node

3.3.2 Assign the values

3.3.3 Resolve Independent Element

3.4 Namespaces

3.5 Polymorphic Accessor

3.6 Fault Class

4 Client-Server Application Example

4.1 echoStructArray

4.2 Polymorphic accessor

5 Conclusions and Future work

Reference

Abstract

The Simple Object Access Protocol (SOAP) is a lightweight protocol for the exchange of information in a decentralized, distributed environment. The SOAP protocol is based on XML and HTTP, which makes it a programming-language- and platform-neutral vehicle for remote method invocation over the Internet and through firewalls. The SOAP for Java (SFJ) project develops a tool for marshalling native and user-defined Java data structures into SOAP messages and deserializing SOAP messages into Java objects. With SFJ, a Java program can interoperate with other SOAP applications in a distributed environment through SOAP remote method invocation.

1 Introduction

This section introduces the Simple Object Access Protocol (SOAP) as a language- and platform-neutral protocol for the exchange of information in a decentralized, distributed environment. It first illustrates how SOAP, together with other protocols, supports Web services. Next, it presents the features of SOAP and its relation to the Extensible Markup Language (XML). Finally, it describes the structure of SOAP messages and how SOAP is used to make Remote Procedure Calls (RPC).

1.1 Web Services and SOAP

A Web Service is programmable application logic accessible via standard Web protocols. Web Services open up the Internet by providing remote computing services and dynamic information exchange without the need for browsers. Dynamically generated HTML content works fine in Web browsers, but it presents a nightmare for anyone trying to use that data in other programs. For example, you can easily view an auction site in a browser, but an application would require a complex HTML parser to read your bid's status from the same site. Worse, you would need a different parser to track a different auction site, and the simplest redesign of either site could throw off your program. Web Services solve this problem by providing a consistent and easy method for accessing online information. They achieve this through standards-based protocols: SOAP, which serves as the remote method invocation protocol, together with WSDL (Web Services Description Language) and UDDI (Universal Description, Discovery, and Integration) for service description, registry, discovery, and lookup.

SOAP is the heart of Web Services. Web Services use SOAP as an XML-based remote method invocation protocol. SOAP was submitted to the W3C and published as a W3C Note in May 2000; it uses standards-based technologies (XML for data description and HTTP for transport) to encode and transmit application data. Consumers of a Web Service do not need to know anything about the platform, object model, or programming language used to implement the service; they only need to understand how to send and receive SOAP messages (HTTP and XML).

A WSDL file is an XML document that describes a set of SOAP messages and how these messages are exchanged between SOAP processors. This enables companies to describe their services and enables clients to consume those services in a standard way without knowing much about the lower-level exchange protocol (binding) such as SOAP. This high-level abstraction of the service limits human interaction and enables the automatic generation of proxies for Web services. (These proxies can be static or dynamic.) Imagine you want to start calling a SOAP method provided by one of your business partners. You could ask the partner for some sample SOAP messages and write your application to produce and consume messages that look like the samples, but this can be error-prone. For example, you might see a customer ID of 2837 and assume it's an integer when in fact it's a string. WSDL allows the description of both document-oriented and RPC-oriented messages. It specifies what a request message must contain and what the response message will look like in an unambiguous notation. The WSDL description published by a Web Service can be used to automatically generate stub routines for the development of SOAP clients within a specific programming environment.

UDDI is a specification for an Internet-wide registry of Web Services and their metadata. The UDDI project creates a platform-independent, open framework for describing services, discovering businesses, and integrating business services using the Internet, as well as an operational registry that is available today. The metadata contains information about the company offering a given service and technical details on how it can be accessed through a WSDL description of the service.

SOAP, WSDL and UDDI work together in a three-part service architecture (Fig. 1). Service providers register their services in a UDDI registry. When a client needs a Web service, it queries the UDDI registry to find the desired service. Once the client has located the right service, it looks up the service's WSDL interface to obtain the appropriate request and response formats. The client can then send the server a SOAP message to invoke the service and wait for the result, which is also a SOAP message, to be sent back by the server.

Figure 1 The web services process

1.2 SOAP Features

SOAP is a wire protocol, similar to IIOP (Internet Inter-ORB Protocol) for CORBA (Common Object Request Broker Architecture), ORPC (Object Remote Procedure Call) for DCOM (Distributed Component Object Model), or JRMP (Java Remote Method Protocol) for RMI (Java Remote Method Invocation). However, SOAP differs from these wire protocols in an important way. When a CORBA client needs to obtain the services of a DCOM object, or vice versa, the common solution is a COM/CORBA bridge. Such a bridge is complex because it must translate back and forth between CORBA's IIOP and DCOM's ORPC, so it is fraught with failure points. SOAP can help alleviate the problem. Because the protocol is XML-based, it is language and platform neutral, which means that information-sharing relationships can be initiated among disparate parties across different platforms, languages, and programming environments. SOAP does not compete with CORBA and DCOM but rather complements them. CORBA, DCOM, and Enterprise Java enable resource sharing within a single organization, while SOAP aims to bridge the sharing of resources among disparate organizations, possibly located behind firewalls. SOAP applications use a transport protocol (typically HTTP) to communicate with Web services and retrieve dynamic content.

The most compelling feature of SOAP is that it has been implemented on many different hardware and software platforms. This means that SOAP can be used to link disparate systems within and outside an organization. Many attempts have been made in the past to come up with a common communications protocol that could be used for systems integration, but none of them has been adopted as widely as SOAP.

Because SOAP can use existing XML parsers and HTTP libraries to do most of the hard work, a SOAP implementation can be completed in a matter of months, and it is smaller and easier to implement than most of the previous protocols. Because it is based on vendor-agnostic technologies, namely XML, HTTP, and the Simple Mail Transfer Protocol (SMTP), SOAP appeals to all vendors. SOAP was created by Microsoft, DevelopMentor, and UserLand Software and is backed by companies that include IBM, Lotus, and Compaq.

While IIOP, ORPC, and JRMP are binary protocols, SOAP is a text-based protocol that uses XML. Using XML for data encoding gives SOAP some unique capabilities. It is much easier to debug applications based on SOAP because it is much easier to read XML than a binary stream. And since all the information in SOAP is in text form, SOAP is much more firewall-friendly than IIOP, ORPC, or JRMP.

The HTTP transport binding for SOAP makes it attractive for some users. Since most organizations are familiar with HTTP and already have it incorporated into their network infrastructure, SOAP fits right in without the complex changes to the network or firewalls that many other protocols require. Because SOAP is layered on top of HTTP, it may utilize any standard HTTP security feature or any endpoint application-specific security feature. SOAP makes it possible for system administrators to configure firewalls to selectively block out SOAP requests using SOAP-specific HTTP headers. They can also be configured to allow only certain interfaces or methods to pass through by looking at the Interface Name and Method Name extension headers defined by SOAP. The standard authentication mechanisms that are HTTP-friendly can be used with SOAP. These protocols can authenticate the server (and optionally the client), and can provide a confidential channel over which SOAP payload can travel. However, when SOAP expanded to become a more general-purpose protocol that runs on top of a number of transports, security became a bigger issue. For example, HTTP provides several ways to authenticate which user is making a SOAP call, but it doesn't provide a way for that identity to be propagated when the message is routed from HTTP to an SMTP transport. Fortunately, the W3C is already working on security for XML documents, so it's probably safe to assume that at some point in the near future, the security issues addressed by the W3C will be used to define a security implementation for SOAP.

Most people use SOAP because it supports interoperability among many different environments and because it runs over HTTP, which has led to SOAP becoming an industry standard. The biggest advantage of SOAP can also be a disadvantage. SOAP data is sent as XML text to enable standard message formats, standard data representation, and manipulation with standard XML tools, all of which are good things. However, converting all your data into text and parsing it back into data structures at the other end can use up quite a bit of processing power. The tags that make SOAP self-describing also make SOAP messages bigger than the equivalent messages without tags. Together these costs amount to a performance penalty, and many other protocols perform better than SOAP. If you need SOAP for one of the reasons discussed earlier, this penalty is a small price to pay, but if those reasons do not apply to your situation, SOAP may not be the best choice. As SOAP implementations mature, the performance gap between SOAP and other protocols is expected to narrow.

There are many existing SOAP implementations (including Java implementations), such as Apache SOAP/Axis (Java), eSOAP, gSOAP, IONA XMLBus, kSOAP, pocketSOAP 1.1 beta, SILAB/TclSOAP, SIM SOAP4R, Spray B2001, SQLData, WASP Advanced 3.0, WASP for C++, White Mesa 2.5, xSOAP (Java), and Interoperability.

1.3 SOAP and XML

SOAP uses XML as the data-encoding format. The idea of using XML is not original to SOAP. Both XML-RPC and ebXML use XML as well. This section gives a brief introduction to XML.

XML is the Extensible Markup Language. It is a subset of the Standard Generalized Markup Language (SGML), and it is much simpler and more straightforward. Unlike HTML, XML specifies neither a tag set nor a grammar for the documents written in it. This is the power of XML: it allows you to define the content of your data in a variety of ways, as long as you conform to the general structure that XML requires. Every XML document must be well formed, which means that it is syntactically correct: every tag is closed and no tags are nested out of order. An XML document can be, but is not required to be, valid, which means that it conforms to a constraining document such as a Document Type Definition (DTD) or an XML schema. Since a user can freely define any tags, namespaces are used to let parsers handle name collisions. A namespace declaration maps a prefix to a URI, and the prefix then qualifies the names of XML elements and attributes.
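The difference between a well-formed document and a namespace-qualified one can be illustrated with the standard Java XML parsing API (JAXP). The following is a minimal sketch and is not part of SFJ; the class name NamespaceSketch and the namespace URI urn:example:shop-a are assumptions made only for illustration. It reports whether a string is well formed and returns the namespace URI to which the prefix of the root element is bound.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative helper (hypothetical name), not part of SFJ.
final class NamespaceSketch {

    /** Returns true if the text can be parsed as a well-formed XML document. */
    static boolean isWellFormed(String xml) {
        return parse(xml) != null;
    }

    /** Returns the namespace URI of the root element, or null if there is none. */
    static String rootNamespace(String xml) {
        Document doc = parse(xml);
        return doc == null ? null : doc.getDocumentElement().getNamespaceURI();
    }

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // resolve prefixes to namespace URIs
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return null; // not well formed (or the parser could not be created)
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample documents.
        System.out.println(isWellFormed("<a><b></a></b>"));
        System.out.println(rootNamespace("<a:order xmlns:a=\"urn:example:shop-a\"/>"));
    }
}
```

Two parties that both define an order element can use different namespace URIs, so a namespace-aware parser can tell the two element types apart even though their local names collide.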

Because XML is plain text, any application can understand it as long as the application understands the character encoding in use. By default, XML assumes that all characters belong to ISO/IEC 10646, known as the Universal Character Set (UCS). The XML specification mandates that all XML processors accept character data encoded using the UCS Transformation Formats UTF-8 or UTF-16. Therefore, any XML data stream encoded in UTF-8 or UTF-16 can be understood regardless of platform or programming language. This makes XML a good choice for describing method invocations in a platform- and language-neutral fashion.
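The encoding requirement can be checked with a small experiment. The sketch below is an illustration under stated assumptions, not SFJ code: the class name EncodingSketch and the city element are hypothetical. It serializes the same text once as UTF-8 and once as UTF-16, parses both byte streams with the standard JAXP parser, and returns the decoded text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Illustrative helper (hypothetical name), not part of SFJ.
final class EncodingSketch {

    /**
     * Wraps the text in a hypothetical city element, encodes the document
     * with the given charset, parses the bytes, and returns the element text.
     * The text is assumed to contain no markup characters such as '<' or '&'.
     */
    static String roundTrip(String text, Charset charset) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>"
                + "<city>" + text + "</city>";
        byte[] bytes = xml.getBytes(charset);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse " + charset + " document", e);
        }
    }

    public static void main(String[] args) {
        String city = "M\u00fcnchen"; // contains a non-ASCII character
        System.out.println(roundTrip(city, StandardCharsets.UTF_8)
                .equals(roundTrip(city, StandardCharsets.UTF_16)));
    }
}
```

The two byte streams differ, but the parser recovers the same characters from both, which is the property that lets SOAP peers on different platforms exchange text reliably.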

1.4 SOAP Architecture

1.4.1 The SOAP Specification

The World Wide Web Consortium (W3C) has published a first working draft specification for SOAP Version 1.2. The working draft has been produced by the XML Protocol Working Group (WG), part of the W3C XML Protocol Activity. The SOAP 1.1 specification along with a SOAP Envelope schema and a SOAP Encoding schema was published on May 08, 2000.

SOAP is a lightweight protocol for the exchange of information in a decentralized, distributed environment. It is an XML-based protocol that consists of four parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application-defined data types, a convention for representing remote procedure calls and responses, and a binding convention for exchanging messages using an underlying protocol. SOAP can potentially be used in combination with a variety of other protocols; however, the only bindings defined in the specification describe how to use SOAP in combination with HTTP and the experimental HTTP Extension Framework.

The SOAP envelope is analogous to the envelope of an actual letter. It supplies information about the message that is being encoded in a SOAP payload, including data relating to the recipient and sender, as well as details about the message itself. The header of the SOAP envelope can specify exactly how a message must be processed and a typical SOAP message can also include the encoding style, which assists the recipient in interpreting the message.

The second major element of the SOAP specification is a simple means of encoding user-defined data types. If your application doesn't understand XML, you will need to represent your program's data types (integers, floats, arrays, structs, and so on) as XML data in the SOAP message. Section 5 of the SOAP standard specifies an XML notation for representing programming language types; because of where it is defined, it is often called Section 5 encoding. The alternative for defining the data types of the elements in a SOAP message is to use a schema as defined in the XML Schema (XSD) standard. Section 5 encoding makes sense if you are using SOAP to support Remote Procedure Calls from a local program to a remote program, because the mapping between XML and your programming language is simpler. In XML message-based applications with complex data, it may make more sense to provide an XML schema for the SOAP message rather than use SOAP encoding. A Section 5 encoded SOAP message is called an encoded message, and a message that contains an XML document is called a literal message. While all major SOAP implementations support Section 5 encoding, it is an optional part of the specification, and a conformant SOAP implementation does not have to support it. Microsoft .NET XML Web services default to literal messages but also support encoded messages for interoperability.
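To make the idea of an encoded message concrete, the sketch below shows one way a Java value could be written as an element that carries its XML Schema type in an xsi:type attribute, which is how encoded messages commonly annotate simple values. It is a simplified illustration and not the SFJ serializer: the class name Section5TypeSketch, the mapping table, and the element name customerID are assumptions, and the xsi and xsd prefixes are assumed to be declared on an enclosing element.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative helper (hypothetical name), not the SFJ serializer.
final class Section5TypeSketch {

    // A small, deliberately incomplete mapping from Java wrapper classes
    // to XML Schema simple types.
    private static final Map<Class<?>, String> XSD_TYPES = new HashMap<>();
    static {
        XSD_TYPES.put(String.class, "xsd:string");
        XSD_TYPES.put(Integer.class, "xsd:int");
        XSD_TYPES.put(Long.class, "xsd:long");
        XSD_TYPES.put(Short.class, "xsd:short");
        XSD_TYPES.put(Byte.class, "xsd:byte");
        XSD_TYPES.put(Float.class, "xsd:float");
        XSD_TYPES.put(Double.class, "xsd:double");
        XSD_TYPES.put(Boolean.class, "xsd:boolean");
    }

    /** Writes a simple value as an element annotated with its XSD type. */
    static String encode(String name, Object value) {
        String type = XSD_TYPES.get(value.getClass());
        if (type == null) {
            throw new IllegalArgumentException("no simple XSD type for " + value.getClass());
        }
        return "<" + name + " xsi:type=\"" + type + "\">"
                + escape(String.valueOf(value)) + "</" + name + ">";
    }

    // Escapes the characters that are not allowed in element content.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(encode("customerID", 2837));   // an integer value
        System.out.println(encode("customerID", "2837")); // the same digits as a string
    }
}
```

The explicit type annotation removes the ambiguity of a bare value such as 2837, which a receiver could otherwise read either as an integer or as a string.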

SOAP can be used for Remote Procedure Calls (RPC), in which a remote procedure is invoked on a server and returns some sort of response, or for messaging, in which a client simply sends pieces of information to a server. Most programmers are comfortable with RPC programming, so it is natural for SOAP to include an RPC specification. Section 7 of the SOAP specification defines the XML element that represents a function call and the element that contains the return code in the response message. In this sense, SOAP is simply acting as a more extensible XML-RPC system, allowing better error handling and the passing of complex types across the network. SOAP messaging provides for the transfer of information and does not depend on a client knowing about a particular method on some server. It models distributed systems more closely than RPC-style calls do, but it is also more complicated.

1.4.2 The Basic SOAP Payload Structure

A SOAP message is an XML document that consists of a mandatory SOAP envelope, an optional SOAP header, and a mandatory SOAP body. Below is a simple SOAP message complete with HTTP header.

POST /EventManager HTTP/1.1
Host:
Content-Type: text/xml; charset="utf-8"
Content-Length: 60
SOAPAction=" Customer"

Dumser
SQLI
Paris

1.4.2.1 HTTP Header

The HTTP header for the request comes before the SOAP message. The first line gives the request method, the request URI, and the protocol version. The next line gives the target host, followed by the MIME content type of the message, its character encoding, and the length of the message. The HTTP header can also include the SOAPAction header field, which the server may use to route the SOAP message. The identifier following the # sign must match the name of the first tag in the SOAP message body. Below is the header part:

POST /EventManager HTTP/1.1
Host:
Content-Type: text/xml; charset="utf-8"
Content-Length: 60
SOAPAction=" Customer"
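As an illustration of how such a request head can be produced, the sketch below assembles the request line and header fields in front of a SOAP message. It is a simplified example rather than the SFJ transmission code: the class name SoapHttpRequestSketch, the host example.org, and the action URI are hypothetical. SOAPAction is written here in the usual HTTP field syntax (name, colon, quoted URI), and Content-Length is computed from the UTF-8 bytes of the message rather than from its character count.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper (hypothetical name), not the SFJ transmission code.
final class SoapHttpRequestSketch {

    /** Builds the text of an HTTP POST request that carries a SOAP message. */
    static String buildRequest(String host, String path, String soapAction, String envelopeXml) {
        // The length is counted in bytes, so non-ASCII characters count more than once.
        int length = envelopeXml.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: text/xml; charset=\"utf-8\"\r\n"
                + "Content-Length: " + length + "\r\n"
                + "SOAPAction: \"" + soapAction + "\"\r\n"
                + "\r\n" // an empty line separates the header from the message
                + envelopeXml;
    }

    public static void main(String[] args) {
        // Hypothetical host, action URI, and placeholder payload.
        System.out.println(buildRequest("example.org", "/EventManager",
                "urn:example:EventManager#Customer", "<placeholder/>"));
    }
}
```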

1.4.2.2 SOAP Envelope

Following the HTTP headers is the SOAP message itself. The SOAP envelope is the top element of the XML document representing the SOAP message and the mandatory root element of the XML document tree. It consists of the element name (Envelope), followed by namespace declarations that identify the SOAP version being used, and it may also carry additional attributes such as the encodingStyle attribute, whose URI value identifies the serialization and encoding rules in use. The envelope is presented as follows:

| |

|…… |

| |
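For illustration, a SOAP 1.1 envelope of the kind described above can be generated and checked from Java as sketched below. This is an assumption-laden example rather than the SFJ output format: the class name EnvelopeSketch and the SOAP-ENV prefix are choices made here, while the two URIs are the envelope and encoding namespaces published with SOAP 1.1. The sketch builds an envelope with an empty or caller-supplied body and parses it again to confirm that the root element is Envelope in the SOAP namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative helper (hypothetical name), not the SFJ output format.
final class EnvelopeSketch {

    static final String ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String ENC_NS = "http://schemas.xmlsoap.org/soap/encoding/";

    /** Wraps already serialized body content in an Envelope and Body element. */
    static String envelope(String bodyContent) {
        return "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + ENV_NS + "\""
                + " SOAP-ENV:encodingStyle=\"" + ENC_NS + "\">"
                + "<SOAP-ENV:Body>" + bodyContent + "</SOAP-ENV:Body>"
                + "</SOAP-ENV:Envelope>";
    }

    /** Parses a message with a namespace-aware parser and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed XML document", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(envelope(""));
        System.out.println(root.getLocalName() + " in " + root.getNamespaceURI());
    }
}
```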

1.4.2.3 SOAP Header

The SOAP header is optional. It is a generic mechanism that adds features to the SOAP message in a decentralized manner without prior agreement between the communicating parties. SOAP defines several attributes that can be used to indicate who must process the message, and whether this process is optional or mandatory.

If present, the header must be the first immediate child element of the SOAP Envelope element. It carries information to intermediaries and is made up of one or more entries. Each entry bears a local name, a fully qualified name, and a namespace, and may carry two attributes: actor, which designates the endpoint for which the entry is intended, and mustUnderstand, which indicates whether processing the entry is optional or mandatory. A SOAP application must include the correct SOAP namespace for all the elements and attributes defined in the messages it generates. The namespace is a URI that identifies the description of the message information and guarantees the uniqueness of its names.

Christmas Event
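The role of the mustUnderstand attribute can be sketched from the receiver's side. The example below is a minimal illustration and not SFJ code: the class name HeaderSketch, the entry names EventName and Comment, and the namespace urn:example:events are hypothetical, while the actor URI used in the sample is the "next" actor defined by SOAP 1.1. The method collects the header entries that the receiver is obliged to process.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative helper (hypothetical name), not part of SFJ.
final class HeaderSketch {

    static final String ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Hypothetical message with one mandatory and one optional header entry.
    static final String SAMPLE =
            "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"" + ENV_NS + "\">"
            + "<SOAP-ENV:Header>"
            + "<ev:EventName xmlns:ev=\"urn:example:events\""
            + " SOAP-ENV:mustUnderstand=\"1\""
            + " SOAP-ENV:actor=\"http://schemas.xmlsoap.org/soap/actor/next\">"
            + "Christmas Event</ev:EventName>"
            + "<ev:Comment xmlns:ev=\"urn:example:events\""
            + " SOAP-ENV:mustUnderstand=\"0\">optional</ev:Comment>"
            + "</SOAP-ENV:Header>"
            + "<SOAP-ENV:Body/>"
            + "</SOAP-ENV:Envelope>";

    /** Returns the local names of header entries marked mustUnderstand="1". */
    static List<String> mandatoryEntries(String envelopeXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(envelopeXml)));
            List<String> names = new ArrayList<>();
            NodeList headers = doc.getElementsByTagNameNS(ENV_NS, "Header");
            if (headers.getLength() == 0) {
                return names; // the header is optional
            }
            for (Node n = headers.item(0).getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    Element entry = (Element) n;
                    if ("1".equals(entry.getAttributeNS(ENV_NS, "mustUnderstand"))) {
                        names.add(entry.getLocalName());
                    }
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed SOAP message", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mandatoryEntries(SAMPLE));
    }
}
```

A receiver that finds a mandatory entry it does not recognize is expected to reject the message with a fault instead of silently ignoring the entry.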

1.4.2.4 SOAP Body

The SOAP body is the container for the mandatory information being sent to the message endpoint. It can contain a set of entries, each of which is an immediate child of the Body element.

Dumser
Johann
Cambridge
01800
MA
USA

1.5 The Use of SOAP for RPC

As mentioned earlier, SOAP can be used for RPC-style calls. This section first discusses the advantages of SOAP RPC calls over similar technologies such as Java Remote Method Invocation (RMI) and XML-RPC. Then the requirements for SOAP RPC calls are covered.

1.5.1 SOAP, RMI and XML-RPC

RMI allows a program to invoke methods on an object when the object is not located on the same machine as the program. This is at the heart of distributed computing in the Java world, and is the backbone of many enterprise application implementations. RMI uses client stubs to describe the methods a remote object has available for invocation. The client acts upon these stubs and RMI translates the requests to a stub into a network call. This call invokes the method on the machine with the actual object, and then streams the result back across the network. Finally, the stub returns the result to the client that made the original method call. There are several disadvantages of using RMI. First, using RMI is resource-intensive. As clients issue RMI calls, sockets must be opened and maintained. RMI also requires a server to bind objects using an RMI registry, LDAP or other forms of Java Naming and Directory Interface. Finally, adding an additional method to the server class results in a change to the interface and recompilation of the client stubs.

One of the most significant differences between RMI and RPC is the way methods are made available. In RMI, a remote interface holds the method signature for each remote method. If a method is implemented on the server class but no matching signature is added to the remote interface, the new method cannot be invoked by an RMI client. This process is quite different in RPC. The method requested is never explicitly defined in the XML-RPC server, but in the request from the client. When a request comes in to an RPC server, it contains a set of parameters and a textual value naming the method. The server tries to find a matching class and method whose parameter types match the types within the RPC request. Once a match is made, the method is called, and the result is encoded and sent back to the client. If a method is added to the server, its signature can be published to the client community and used immediately, with no update to client stubs, skeletons, or interfaces necessary. RPC also allows disparate systems to work together, because the request parameters and the results are generally encoded as textual data and the transport protocol can be HTTP. The greatest obstacle to using RPC has traditionally been its encoding, but XML, with its simple textual representation and its structure for data, solves the problem.
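The server-side matching step described above can be sketched with Java reflection. This is a minimal illustration under stated assumptions, not the dispatch mechanism of SFJ or of any particular RPC library: the class names RpcDispatchSketch and Calculator and the method names are hypothetical, and the parameters are assumed to have been decoded into Java objects already. The dispatcher looks for a public method whose name and parameter types match the request and invokes it.

```java
import java.lang.reflect.Method;

// Illustrative dispatcher (hypothetical name), not the SFJ mechanism.
final class RpcDispatchSketch {

    /** Hypothetical service object whose public methods can be called remotely. */
    static final class Calculator {
        public Integer add(Integer a, Integer b) { return a + b; }
        public String echoString(String s) { return s; }
    }

    /** Finds a public method matching the name and argument types, then calls it. */
    static Object dispatch(Object target, String methodName, Object... args) {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(methodName) && accepts(m.getParameterTypes(), args)) {
                try {
                    return m.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("call to " + methodName + " failed", e);
                }
            }
        }
        throw new IllegalArgumentException("no method " + methodName + " matches the request");
    }

    // Compares declared parameter types with the runtime types of the arguments.
    // Primitive parameter types are not handled in this sketch.
    private static boolean accepts(Class<?>[] types, Object[] args) {
        if (types.length != args.length) {
            return false;
        }
        for (int i = 0; i < types.length; i++) {
            if (args[i] != null && !types[i].isInstance(args[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new Calculator(), "add", 2, 3));
    }
}
```

Because the lookup happens at request time, a method added to the service class becomes callable without regenerating any client stubs, which is the contrast with RMI drawn above.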

When used as RPC-style calls, SOAP is simply acting as a more extensible XML-RPC system, with better error handling and passing of complex types across the network. In XML-RPC, encoding can only occur for a predefined set of data types. With SOAP, XML schema can be used to easily specify new data types and those new types can be easily represented in XML as part of a SOAP payload.

1.5.2 SOAP RPC format

Section 7 of the SOAP specification, titled "Using SOAP for RPC", defines the XML element that represents a function call and the element that contains the return code in the response message.
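In the Section 7 convention, a method call is represented as a struct that is named after the method and contains one accessor per input parameter, and the response struct is by convention named after the method with "Response" appended. The sketch below illustrates this convention; it is not the SFJ serializer, and the class name RpcCallSketch, the method echoString, its parameter inputString, and the namespace urn:example:echo are hypothetical. Parameter names are assumed to be valid XML names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name), not the SFJ serializer.
final class RpcCallSketch {

    /** Serializes a method call as a struct named after the method. */
    static String callElement(String serviceNs, String method, Map<String, String> params) {
        StringBuilder xml = new StringBuilder();
        xml.append("<m:").append(method)
           .append(" xmlns:m=\"").append(escape(serviceNs)).append("\">");
        // Each input parameter becomes a child element (accessor) of the struct.
        for (Map.Entry<String, String> p : params.entrySet()) {
            xml.append('<').append(p.getKey()).append('>')
               .append(escape(p.getValue()))
               .append("</").append(p.getKey()).append('>');
        }
        xml.append("</m:").append(method).append('>');
        return xml.toString();
    }

    /** Name of the response struct under the naming convention. */
    static String responseName(String method) {
        return method + "Response";
    }

    // Escapes characters that are not allowed in element content or attribute values.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("inputString", "hello");
        System.out.println(callElement("urn:example:echo", "echoString", params));
        System.out.println(responseName("echoString"));
    }
}
```

The resulting element would be placed inside the Body of a SOAP envelope before the message is sent.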

The payload is in XML, a single <methodCall> structure. The <methodCall> must contain a <methodName> sub-item, which is a string containing the name of the method to be called. If the procedure call has parameters, the <methodCall> must contain a <params> sub-item. The <params> sub-item can contain any number of <param>s, each of which has a <value>.

The following HTTP request represents how you would invoke a method using the SOAP protocol:

POST /cgi-bin/purchase-book.cg HTTP/1.1
Content-Type: text/xml
Content-Length: 555
