Tool for automatic generation of stubs for XML RPC



The Florida State University

College of Arts and Sciences

A Simple Object Access Protocol (SOAP) stub compiler for C

By

Gunjan Gupta

Dec 8th, 2000

A project submitted to the

Department of Computer Science

For the degree of Master of Science

The members of the committee approve the Master's Project of Gunjan Gupta defended on Dec 8th, 2000

________________________

Dr. Robert van Engelen

Supervising Professor

_________________________

Dr. Kyle Gallivan

Committee Member

_________________________

Dr. Gregory A. Riccardi

Committee Member

Acknowledgements

I’d like to thank Dr. Robert van Engelen for the original idea and the design of the project, and also for his constant guidance and support in the implementation.

I thank Dr. Gallivan for his valuable insight on various issues regarding the project. I’d also like to thank Dr. Greg Riccardi for agreeing to be on the committee.

Table of Contents

Abstract

1. Introduction

1.1 Background Information

1.2 SOAP

1.3 Data Representation in XML

1.4 Stubs

1.5 Serialization

2. User Guide

2.1 Input and Output Files

2.2 Mapping of C structures to XML data format

3. Examples

3.1 Query Example

3.2 System Example

4. Implementation

4.1 Functions Generated for each type

4.2 How the functions are generated

4.3 Calls to the functions in the soapClient.c and soapServer.c

5. Conclusion

6. References

Abstract

The project presents the use of SOAP to achieve data interoperability between applications in a distributed environment. Remote procedure calling with SOAP is programming language independent and operates across different platforms. We have designed and implemented a tool for the automatic generation of SOAP stub routines and serialization converters to support application data interoperability in distributed computing environments. SOAP defines a simple mechanism for expressing application semantics by providing a modular packaging model and encoding mechanisms for encoding data within modules. Thus it provides a simple and lightweight mechanism for exchanging structured and typed information between peers in a decentralized, distributed environment using XML. Combining HTTP and XML into a single solution gives us a whole new level of interoperability. Since SOAP relies on HTTP as the transport mechanism, and most firewalls allow HTTP to pass through, we do not have a problem invoking SOAP endpoints from either side of a firewall, which has been one of the problems of distributed computing solutions.

1. Introduction

1.1 Background Information

XML

XML (Extensible Markup Language) is a markup language for structured documents (documents that contain both content and some indication of what role that content plays). A markup language is a mechanism to identify structures in a document. The XML specification defines a standard way of adding markup to structured documents.

XML is a subset of the Standard Generalized Markup Language (SGML), defined in ISO standard 8879:1986, that is designed to make it easy to interchange structured documents over the Internet [1].

DTD

A Document Type Definition (DTD) is a standardization of the XML object representation. Using a DTD, applications are able to exchange data in XML without the risk of failing to recognize objects.

XML and HTML

In HTML both the tag semantics and the tag set are fixed. XML specifies neither semantics nor a tag set. XML provides a facility to define tags and the structural relationships between them. It really is a meta-language for describing markup languages. Semantics are defined by the applications that process the documents or by stylesheets.

XML and SGML

XML is a much restricted form of SGML. XML does not require a DTD. The user can assign a default definition for undeclared components of markup.

DOM

The Document Object Model (DOM) provides an API for HTML and XML documents. DOMs are hierarchical representations of XML. An instance of XML is purely textual and lends itself to being portable between applications, while the corresponding DOM of an XML instance is a tree data structure that is used internally by an application. With the DOM, a programmer can build documents, navigate their structure and add, modify or delete elements and content.

DOM interfaces are an abstraction in that they are a means of specifying a way to access and manipulate an application's internal representation of a document. The DOM is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents. A "binding" of the DOM to a particular programming language provides a concrete API. Currently several libraries that handle XML DOMs are being developed for a number of programming languages. This would enable applications to manipulate XML as data structures. However, in designing an API for XML and HTML documents using the DOM, programmers still need to implement the concrete methods to access and manipulate document content [2].

Specifications

XML is defined by four specifications.

XML: Extensible Markup Language

Defines the syntax of XML

XLL: Extensible Linking Language

Defines a standard way to represent links between multiple resources and links between read only resources.

XSL: Extensible Style Language

Will define a standard style sheet language for XML.

XUA: Extensible User Agent

Will help standardize user agents.

XML Serialization

The process to translate arbitrary data structure to XML can be loosely referred to as XML serialization. XML serialization converts graph like data structures into XML by recursively traversing the graphs converting nodes to XML and marking the nodes that are converted.

HTTP

(HyperText Transfer Protocol) The communications protocol used to connect to servers on the World Wide Web. Its primary function is to establish a connection with a Web server and transmit HTML pages to the client browser. An HTTP exchange takes place over a TCP/IP socket. The client opens a socket, connects to the HTTP server via the port the server is listening on, and issues a command. The command is routed to the server via the Internet. The server receives the command and acts on it, typically by looking up a file.

Types of requests

1. Simple Request

GET <URI> <CR><LF>

where <CR> is a carriage return character and <LF> is a line feed character. Here the object is simply sent back to the user.

2. Full Request

<Method> <URI> <ProtocolVersion> <CR><LF>

[*<header>]

[<CR><LF> <data>]

example:

GET <URI> HTTP/1.0 <CR><LF>

In the event of a full request, the object is encapsulated using the MIME protocol, and a descriptive header precedes it on its way to the client. Some examples of the METHODS are GET, PUT, POST, DELETE etc.
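The layout of a full request can be illustrated with a short C sketch. The helper name format_full_request and the choice of Content-Length as the single header are assumptions made for this illustration; they are not part of the tool described in this report.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only): writes an HTTP/1.0 full request
 * into buf. Returns the number of characters written, or -1 if the buffer
 * is too small. */
int format_full_request(char *buf, size_t size, const char *method,
                        const char *uri, const char *body)
{
    size_t body_len = body ? strlen(body) : 0;
    int n = snprintf(buf, size,
                     "%s %s HTTP/1.0\r\n"      /* Method URI ProtocolVersion CRLF */
                     "Content-Length: %lu\r\n" /* one request header */
                     "\r\n"                    /* empty line ends the headers */
                     "%s",                     /* optional data */
                     method, uri, (unsigned long)body_len, body ? body : "");
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Calling format_full_request(buf, sizeof buf, "POST", "/query", "abc") would produce the request line, one header, the empty line and the three data characters, each line terminated by a carriage return and a line feed.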

CGI

(Common Gateway Interface script) A small program written in a language such as Perl, Tcl, C or C++ that functions as the glue between HTML pages and other programs on the Web server. For example, a CGI script would allow search data entered on a Web page to be sent to the DBMS (database management system) for lookup. It would also format the results of that search as an HTML page and send it back to the user. The CGI script resides in the server and obtains the data from the user via environment variables that the Web server makes available to it.

RPC

(Remote Procedure Call) A programming interface that allows one program to use the services of another program on a remote machine. The calling program sends a message and data to the remote program, which is executed, and the results are passed back to the calling program.

XML-RPC

RPC: Remote Procedure Call

A remote procedure call is a simple extension of the procedure call idea: it connects procedures that reside on different machines. The arguments and results of a remote procedure are marshalled into a format that can be understood on the other side of the connection. An almost unlimited number of formats is possible; one of them is XML. XML-RPC uses XML as the marshalling format.

Client-Server

An architecture in which the client (personal computer or workstation) is the requesting machine and the server is the supplying machine, both of which are connected via a local area network (LAN) or wide area network (WAN). The client contains the user interface and may perform some or all of the application processing.

Servers can be high-speed microcomputers, minicomputers or even mainframes.

1.2. SOAP

Definition

“SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML based protocol that consists of three parts: an envelope that defines a framework for describing what is in a message and how to process it, a set of encoding rules for expressing instances of application-defined datatypes, and a convention for representing remote procedure calls and responses. SOAP can potentially be used in combination with a variety of other protocols; however, the only bindings defined in this document describe how to use SOAP in combination with HTTP and HTTP Extension Framework.”[W3c SOAP 1.1 specification]

Description and use

The use of SOAP is best described by considering a situation that we encounter often these days: building an Internet application. Take, for example, a simple database query, where the client puts a simple request, and the server looks up the database and returns the rows that match the request.

This application would have a simple solution as a custom client/server application (see the definition in the introduction), which would work perfectly if all the clients used the same platform. But that is not the case: with different platforms in use (Windows, Unix, Linux, etc.) this would not work. There are two possible solutions (excluding SOAP) to this problem:

1. Have a different client application written for each platform. First of all, this is a lot of work; secondly, it is not very scalable.

2. Use the web. But still you’re tied to browser implementations, and you still have to build an infrastructure to send and receive input and output and to format and package that data for transmission. For a complicated application, you may opt for Java or ActiveX code, but then you start losing users to bandwidth and security issues.

SOAP provides a solution to this. It is a simple packaging protocol that packages the data the client sends and packages the result from the server on the return trip, in a format that is commonly understood (XML). All the arguments from the client required to complete the remote procedure are converted to XML, and the result from the server is also converted to XML. SOAP acts as a “glue” between client and server. The protocol used to transfer the data is HTTP (see the definition in the introduction).

SOAP structure

“SOAP is a protocol specification for invoking methods on servers, services, components and objects. SOAP codifies the existing practice of using XML and HTTP as a method invocation mechanism. The SOAP requires a small number of HTTP headers that facilitate firewall/proxy filtering. The SOAP specification also mandates an XML vocabulary that is used for representing method parameters, return values, and exceptions." [DevelopMentor]

Following is the structure of the SOAP protocol according to the W3C SOAP1.1 Specification.

SOAP consists of three parts:

• The SOAP envelope construct defines an overall framework for expressing what is in a message; who should deal with it, and whether it is optional or mandatory.

• The SOAP encoding rules defines a serialization mechanism that can be used to exchange instances of application-defined datatypes.

• The SOAP RPC representation defines a convention that can be used to represent remote procedure calls and responses.

SOAP Message

A SOAP message is an XML document that consists of a mandatory SOAP envelope, an optional SOAP header, and a mandatory SOAP body.

A SOAP message contains the following:

• The Envelope is the top element of the XML document representing the message.

• The Header is a generic mechanism for adding features to a SOAP message in a decentralized manner without prior agreement between the communicating parties. SOAP defines a few attributes that can be used to indicate who should deal with a feature and whether it is optional or mandatory.

• The Body is a container for mandatory information intended for the ultimate recipient of the message. SOAP defines one element for the body, which is the Fault element used for reporting errors.

SOAP HEADER

SOAP provides a flexible mechanism for extending a message in a decentralized and modular way without prior knowledge between the communicating parties. Typical examples of extensions that can be implemented as header entries are authentication, transaction management, payment etc.

SOAP ENVELOPE

The Envelope is the top element of the XML document representing the message. The element MAY contain namespace declarations as well as additional attributes. If present, such additional attributes MUST be namespace-qualified. Similarly, the element MAY contain additional sub elements. If present these elements MUST be namespace-qualified and MUST follow the SOAP Body element.

SOAP BODY

The SOAP Body element provides a simple mechanism for exchanging mandatory information intended for the ultimate recipient of the message. Typical uses of the Body element include marshalling RPC calls and error reporting.

Advantages of using SOAP

1. Interoperability

Combining HTTP and XML into a single solution gives us a whole new level of interoperability. For example, lathered with SOAP, clients written in Microsoft Visual Basic can easily invoke CORBA services running on UNIX boxes, JavaScript clients can easily invoke code running on the mainframe, and Macintosh clients can start invoking Perl objects running on Linux. The list goes on. While some interoperability is achieved today through cross-platform bridges for specific technologies, once SOAP becomes standard, bridges will no longer be necessary.

2. Firewall Problems

Currently, developers struggle to make their distributed applications work across the Internet when firewalls get in the way. Since most firewalls block all but a few ports, such as the standard HTTP port 80, all of today's distributed object protocols like DCOM suffer because they rely on dynamically assigned ports for remote method invocations. If you can persuade your system administrator to open a range of ports through the firewall, you may be able to get around this problem as long as the ports used by the distributed object protocol are included.

To make matters worse, clients of your distributed application that lie behind another corporate firewall suffer the same problems. If they don't configure their firewall to open the same port, they won't be able to use your application. Making clients reconfigure their firewalls to accommodate your application is just not practical.

Since SOAP relies on HTTP as the transport mechanism, and most firewalls allow HTTP to pass through, we do not have a problem invoking SOAP endpoints from either side of a firewall. Don't forget that SOAP makes it possible for system administrators to configure firewalls to selectively block out SOAP requests using SOAP-specific HTTP headers.

Example of SOAP Request and Response

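SOAP message embedded in HTTP Request

POST /StockQuote HTTP/1.1
Host: www.stockquoteserver.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "Some-URI"

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

SOAP Message Embedded in HTTP Response

HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="Some-URI">
      <Price>34.5</Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>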
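As a rough illustration of how such a request could be assembled in C, the sketch below wraps one string parameter in a SOAP 1.1 envelope and prefixes the HTTP POST header. The function build_soap_request and its parameters are hypothetical and are not the routines generated by the tool; the sketch also assumes that the value needs no XML escaping.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only). Returns the total length of the
 * request, or -1 if one of the buffers is too small. */
int build_soap_request(char *out, size_t size, const char *host,
                       const char *path, const char *method,
                       const char *ns_uri, const char *param,
                       const char *value)
{
    char body[512];
    int n, blen;

    /* Build the XML payload first, because its length goes in the header. */
    blen = snprintf(body, sizeof body,
        "<SOAP-ENV:Envelope"
        " xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\">"
        "<SOAP-ENV:Body>"
        "<m:%s xmlns:m=\"%s\"><%s>%s</%s></m:%s>"
        "</SOAP-ENV:Body>"
        "</SOAP-ENV:Envelope>",
        method, ns_uri, param, value, param, method);
    if (blen < 0 || (size_t)blen >= sizeof body)
        return -1;

    n = snprintf(out, size,
        "POST %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: text/xml; charset=\"utf-8\"\r\n"
        "Content-Length: %d\r\n"
        "SOAPAction: \"%s\"\r\n"
        "\r\n"
        "%s",
        path, host, blen, ns_uri, body);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Note that the payload has to be complete before the header can be written, because Content-Length must be known in advance.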

1.3. Data representation in XML

XML parsers and generators are available for most programming languages. An application that adopts XML parsers and generators builds an XML Document Object Model (DOM), which is a tree-like data structure that closely resembles the structure of a hierarchical XML object. Internal data structures of the application have to be translated into the DOM. For hierarchical data structures, such as lists and trees, this translation is almost one to one. Many Web applications providing services can be characterized as operating on such hierarchical data structures. Consider for example Web pages and database records. The translation of these types of objects is simple. However, applications are not necessarily constrained to hierarchical data structures.

Tree Vs Graph Problem

Care has to be taken for representing graph like data structures in XML. More specifically, pointers can lead to co-referenced objects, and a co-referenced part of the data structure must be translated and presented in XML only once. A producer of XML is responsible for generating a data structure in XML that can be translated by a consumer into a true copy of the original data structure. A naïve use of XML and the DOM could lead to trees with replicated data, because of the hierarchical layout in the DOM. Pointers pose another problem when pointers are allowed to refer to elements within arrays and records in C and C++.

The following example illustrates the tedious task of implementing data conversions between an application’s internal data structures and XML in the presence of indirection (pointers). The problem is exacerbated by the well-known fact that inspection of data structure declarations alone cannot reveal the exact usage of the data structures in question. Consider for example the C data structure shown in Fig. 1(a). Although the declaration suggests that it is a tree (because of the use of the left and right field names), nothing prevents this data structure from being used as a graph in which a node is referred to by more than one left or right pointer from other nodes. Every pointer must be treated as potentially co-referencing.

The process of translating arbitrary data structures to XML can be loosely referred to as XML serialization. XML serialization converts graph-like data structures into XML by recursively traversing the graphs. If an object is co-referenced, the object that is referred to is assigned an id, and every node that refers to it uses an href with that id in the XML. So, effectively, pointers are visible in the XML produced only when they are co-referenced.

However, the data part of the XML-RPC protocol is restrictive, because it does not support graphs. The XML-RPC protocol defines records, arrays and some primitive data types. A severe restriction of the protocol is the lack of constructs for indirection with pointer types and of constructs for passing methods (procedures) as parameters.

Pointer analysis is important for RPC even when the data structures do not share common nodes (as in a DAG), for the following reason.

When you call a remote procedure with, for example, array arguments that are aliases, the SOAP send/receive routines will not duplicate that data. For example, when we call matrix multiply (matmul) to multiply an array A by itself, array A is sent only once. Here, array A is a pointer to some matrix.

matmul(A, A, A2)

This is an additional motivation for the pointer analysis.
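The aliasing check for call arguments can be sketched in a few lines of C. The function find_aliases is a hypothetical illustration, not one of the generated routines: for each argument it records the index of the first argument with the same address, so that aliased data needs to be sent only once.

```c
#include <stddef.h>

/* Hypothetical helper (illustration only). first[i] receives the index of
 * the first argument that has the same address as args[i]. Returns the
 * number of distinct objects among the n arguments. */
size_t find_aliases(const void *const args[], size_t n, size_t first[])
{
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        first[i] = i;
        for (size_t j = 0; j < i; j++) {
            if (args[j] == args[i]) {   /* same object passed twice */
                first[i] = j;
                break;
            }
        }
        if (first[i] == i)
            distinct++;
    }
    return distinct;
}
```

For the call matmul(A, A, A2) the three arguments contain only two distinct objects, so the second occurrence of A could be transmitted as a reference to the first.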

struct Node {
    int val;
    struct Node *left;
    struct Node *right;
};

[Fig. 1(a) and Fig. 1(b) are not reproduced here; the surviving labels are the node values 5, 3, 8, 7 and 9.]

Fig. 1(b): SOAP generated for Fig. 1(a)

/* Serialization (pointer analysis) entry point: register the object itself,
   then mark everything reachable from it. */
void soap_serialize_Struct_Node(struct Node *p)
{
    soap_reference(p, SOAP_Struct_Node);
    soap_mark_Struct_Node(p);
}

/* Register each field of *p as an embedded object and mark it in turn. */
void soap_mark_Struct_Node(struct Node *p)
{
    if (p != 0) {
        soap_embedded(&p->val, SOAP_int);
        soap_mark_int(&p->val);
        soap_embedded(&p->left, SOAP_PointerToStruct_Node);
        soap_mark_PointerToStruct_Node(&p->left);
        soap_embedded(&p->right, SOAP_PointerToStruct_Node);
        soap_mark_PointerToStruct_Node(&p->right);
    }
}

/* Give every field of *p its default value. */
void soap_default_Struct_Node(struct Node *p)
{
    soap_default_int(&p->val);
    soap_default_PointerToStruct_Node(&p->left);
    soap_default_PointerToStruct_Node(&p->right);
}

/* Output *p, using the pointer table to choose between a reference,
   a plain element, and an element that carries an id. */
void soap_put_Struct_Node(struct Node *p)
{
    int i;
    struct nlist *np;

    if ((i = soap_pointer_lookup(p, SOAP_Struct_Node, &np))) {
        if (soap_is_embedded(p, i))
            soap_element_ref("Node", i);         /* marked embedded: emit a reference */
        else if (soap_is_single(p, i))
            soap_out_Struct_Node("Node", 0, p);  /* referred to once: no id */
        else {
            soap_out_Struct_Node("Node", i, p);  /* co-referenced: output with id i */
            soap_set_embedded(p, i);
        }
    } else
        soap_out_Struct_Node("Node", 0, p);
}

/* Write the element <tag> with the fields of *p as sub-elements. */
void soap_out_Struct_Node(char *tag, int id, struct Node *p)
{
    soap_element_begin_out(tag, soap_embedded_id(id, p, SOAP_Struct_Node), NULL);
    soap_out_int("val", -1, &p->val);
    soap_out_PointerToStruct_Node("left", -1, &p->left);
    soap_out_PointerToStruct_Node("right", -1, &p->right);
    soap_element_end_out(tag);
}

struct Node *soap_get_Struct_Node(struct Node *a)
{
    struct Node *q = NULL;   /* initialized so that an early error returns NULL */

    soap_independent("Node", NULL);
    if (!soap_error && (q = soap_in_Struct_Node("Node", a)))
        soap_independent(NULL, NULL);
    return q;
}

/* Read an element <tag> from the input into *p. */
struct Node *soap_in_Struct_Node(char *tag, struct Node *p)
{
    if (soap_element_begin_in(tag))
        return NULL;
    /* strcasecmp is declared in <strings.h> */
    if (soap_null || (*soap_type != '\0' && strcasecmp(soap_type, "Node") != 0)) {
        soap_error = SOAP_TYPE_MISMATCH;
        return NULL;
    }
    if (soap_body) {
        /* the element has content: enter it under its id and read the fields */
        p = soap_id_enter(soap_id, p, sizeof(struct Node));
        if (soap_alloced)
            soap_default_Struct_Node(p);
        for (;;) {
            /* try each field in turn; ignore elements that match none of them */
            if (!soap_in_int("val", &p->val))
                if (soap_error == SOAP_TAG_MISMATCH && !soap_in_PointerToStruct_Node("left", &p->left))
                    if (soap_error == SOAP_TAG_MISMATCH && !soap_in_PointerToStruct_Node("right", &p->right))
                        if (soap_error == SOAP_TAG_MISMATCH)
                            soap_error = soap_ignore_element();
            if (soap_error == SOAP_NO_TAG)
                break;
            if (soap_error)
                return NULL;
        }
        if (soap_element_end_in(tag))
            return NULL;
    } else {
        /* no content: treat the element as a (possibly forward) reference */
        p = soap_id_forward(soap_id, p, sizeof(struct Node));
        if (soap_alloced)
            soap_default_Struct_Node(p);
    }
    return p;
}

1.4. RPC Stubs

The stub performs basic support functions for remote procedure calls. For instance, stubs prepare input and output arguments for transmission between systems with different forms of data representation. The stubs use the RPC runtime to send and receive remote procedure calls. The client stub can also use the runtime to find servers for the client.

When a client application calls a remote procedure, the client stub first prepares the input arguments for transmission. The process of preparing arguments for transmission is known as “marshalling”. Marshalling converts call arguments into a form that can be transmitted; on receipt, a stub unmarshals them. Unmarshalling is the process by which a stub disassembles incoming network data and converts it into application data using a format that the local system understands. Marshalling and unmarshalling both occur twice for each remote procedure call; that is, the client stub marshals input arguments and unmarshals output arguments, and the server stub unmarshals input arguments and marshals output arguments. Marshalling and unmarshalling permit client and server systems to use different data representations for equivalent data. The stub compiler we use generates stubs by compiling an RPC interface definition written by application developers. The compiler generates marshalling and unmarshalling routines for the C data types.
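The two halves of this process can be sketched for a hypothetical remote procedure add(a, b). The functions marshal_add and unmarshal_add and the XML layout are assumptions made for this illustration; the routines produced by the stub compiler are more general.

```c
#include <stdio.h>

/* Hypothetical client-stub half (illustration only): marshal the two
 * arguments into an XML fragment. Returns the length written, or -1 if
 * the buffer is too small. */
int marshal_add(char *buf, size_t size, int a, int b)
{
    int n = snprintf(buf, size, "<add><a>%d</a><b>%d</b></add>", a, b);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical server-stub half: unmarshal the arguments again.
 * Returns 0 on success and -1 if the fragment does not match. */
int unmarshal_add(const char *buf, int *a, int *b)
{
    return sscanf(buf, "<add><a>%d</a><b>%d</b></add>", a, b) == 2 ? 0 : -1;
}
```

Because both sides agree only on the textual XML form, the client and the server are free to use different internal representations of int.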

To build the client for an RPC application, a developer links client application code to the client stubs of all the RPC interfaces the application uses. To build the server, the developer links the server application code to the corresponding server stubs.

1.5. Serialization

OBJECT SERIALIZATION

Object Serialization is the ability of an object to write its complete state and the complete state of any objects that it references to an output stream; and then, at some later time, to recreate itself by reading its serialized state from an input stream. The stream may be a file, a byte array or a stream associated with a TCP/IP socket.

Fig. 2a    Fig. 2b

Let’s say we need to serialize the structure in Fig. 2a and deserialize it into the structure in Fig. 2b. We need to make sure that 2b is a true copy (replica) of 2a. Object a may have a different memory address in 2b than in 2a, but the structure needs to be preserved; that is, d should not be reproduced twice. For this to happen we need to make sure that every object has one and only one entry in the hash table.

Fig. 3a    Fig. 3b

Two Phase Algorithm

Now let’s consider the same case as 2a with the variation that c points back to a (Fig. 3a). If we analyze the pointers and output them simultaneously, we will not know that c has a pointer to a until we reach c, and by then we would already have output a. Only co-referenced objects can have an id in SOAP, so we first need to analyze the pointers to find out which objects are co-referenced, and then, when we output, we output ids only for the co-referenced objects. That is why two phases are required.
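The two phases can be sketched for the struct Node type used earlier. This is a simplified illustration under stated assumptions, not the generated code: a small array with linear search stands in for the hash table, ids are simply table positions, null pointers are omitted from the output, and at most MAX_OBJECTS objects are tracked. The names ptable, mark, put and serialize are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct Node { int val; struct Node *left; struct Node *right; };

#define MAX_OBJECTS 64   /* sketch only: larger graphs are not supported */

struct entry { const struct Node *ptr; int refs; int emitted; };
struct ptable { struct entry e[MAX_OBJECTS]; int n; };

static struct entry *find(struct ptable *t, const struct Node *p)
{
    for (int i = 0; i < t->n; i++)
        if (t->e[i].ptr == p)
            return &t->e[i];
    return NULL;
}

/* Phase 1: count how many pointers reach each node. */
static void mark(struct ptable *t, const struct Node *p)
{
    struct entry *e;
    if (p == NULL)
        return;
    e = find(t, p);
    if (e) { e->refs++; return; }        /* seen before: co-referenced */
    if (t->n == MAX_OBJECTS)
        return;
    t->e[t->n].ptr = p;
    t->e[t->n].refs = 1;
    t->e[t->n].emitted = 0;
    t->n++;
    mark(t, p->left);
    mark(t, p->right);
}

/* Append formatted text to the NUL-terminated string in out. */
static void append(char *out, size_t size, const char *fmt, ...)
{
    size_t len = strlen(out);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(out + len, size - len, fmt, ap);
    va_end(ap);
}

/* Phase 2: emit XML; only co-referenced nodes get an id or an href. */
static void put(struct ptable *t, const char *tag, const struct Node *p,
                char *out, size_t size)
{
    struct entry *e;
    int shared, id;
    if (p == NULL)
        return;
    e = find(t, p);
    shared = e != NULL && e->refs > 1;
    id = e ? (int)(e - t->e) + 1 : 0;
    if (shared && e->emitted) {          /* already output: refer to it */
        append(out, size, "<%s href=\"#%d\"/>", tag, id);
        return;
    }
    if (shared)
        append(out, size, "<%s id=\"%d\">", tag, id);
    else
        append(out, size, "<%s>", tag);
    if (e)
        e->emitted = 1;
    append(out, size, "<val>%d</val>", p->val);
    put(t, "left", p->left, out, size);
    put(t, "right", p->right, out, size);
    append(out, size, "</%s>", tag);
}

/* Run both phases; out must have room for at least one character. */
void serialize(const char *tag, const struct Node *root, char *out, size_t size)
{
    struct ptable t = { .n = 0 };
    out[0] = '\0';
    mark(&t, root);
    put(&t, tag, root, out, size);
}
```

In the cyclic case, where a child points back to the root, phase 1 raises the root's reference count to two before any output happens, so phase 2 can attach an id to the root element and emit an href for the back pointer.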

Embedded objects

In Fig. 3b, c points to x, which is embedded in a and is the first element of the struct. So both x and a have the same starting memory location, but c points to x and not to a. The distinction is that x and a should be recognized as two distinct objects in the hash table. For this reason we also need to include the type information in the hash table, so that two objects of different types have different entries in the hash table.
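A minimal sketch of a pointer table keyed on both address and type follows. The type codes, struct A and the function ptab_enter are assumptions made for this illustration, and linear search again stands in for the hash table.

```c
#include <stddef.h>

enum type_code { T_INT, T_STRUCT_A };   /* hypothetical type codes */

struct A { int x; int y; };             /* x is embedded at the start of A */

struct pentry { const void *ptr; enum type_code type; };
struct ptab { struct pentry e[32]; int n; };

/* Return the index of the entry for (ptr, type), adding it if it is new.
 * Returns -1 if the table is full. */
int ptab_enter(struct ptab *t, const void *ptr, enum type_code type)
{
    for (int i = 0; i < t->n; i++)
        if (t->e[i].ptr == ptr && t->e[i].type == type)
            return i;
    if (t->n == 32)
        return -1;
    t->e[t->n].ptr = ptr;
    t->e[t->n].type = type;
    return t->n++;
}
```

With struct A a, the addresses &a and &a.x are equal, but ptab_enter(&t, &a, T_STRUCT_A) and ptab_enter(&t, &a.x, T_INT) yield two different entries, which is the distinction the text asks for.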

Fig. 4

Forward Pointers

In Fig. 4, Px has a pointer to an embedded object x, which is output when the object d is output, but d is output only after c is output. In this case we use forward pointers. When an unresolved pointer (an href) is encountered, a hash table entry is made for the id, together with a linked list that contains all the unresolved references to that id. Later, when the object actually arrives on the stream, all the references in the linked list are replaced with the pointer to the object that arrived.
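The bookkeeping for forward pointers can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: ids are small integers that index an array directly (instead of a hash table), and the names id_table, id_reference and id_define are hypothetical.

```c
#include <stdlib.h>

#define MAX_IDS 32

struct pending { void **slot; struct pending *next; };   /* one unresolved href */
struct id_entry { void *target; struct pending *waiting; };
struct id_table { struct id_entry e[MAX_IDS]; };

/* An href to `id` was read; *slot is the pointer field that must end up
 * pointing at the object. Returns 0 on success, -1 on error. */
int id_reference(struct id_table *t, int id, void **slot)
{
    struct pending *p;
    if (id < 0 || id >= MAX_IDS)
        return -1;
    if (t->e[id].target) {               /* object already arrived */
        *slot = t->e[id].target;
        return 0;
    }
    p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->slot = slot;                      /* remember the slot for later */
    p->next = t->e[id].waiting;
    t->e[id].waiting = p;
    *slot = NULL;
    return 0;
}

/* The object carrying `id` has arrived: patch every waiting slot. */
int id_define(struct id_table *t, int id, void *obj)
{
    struct pending *p;
    if (id < 0 || id >= MAX_IDS)
        return -1;
    t->e[id].target = obj;
    p = t->e[id].waiting;
    while (p) {
        struct pending *next = p->next;
        *p->slot = obj;
        free(p);
        p = next;
    }
    t->e[id].waiting = NULL;
    return 0;
}
```

A table initialized to zero has no targets and no waiting lists. References that arrive before their object are left NULL until id_define is called for the same id.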

An example of forward pointers is

…..

Two Phase Approach

To generate the SOAP request for a C function, a programmer would have to construct the request, which consists of the HTTP header (with the message length and other fields), the SOAP envelope and the SOAP body, which we discussed earlier.

1. First serialize the data structure (this includes the pointer analysis). This step requires a number of routines for pointer table maintenance and lookup. It maintains the state of each field in the struct: referred to once, referred to multiple times, or embedded.

2. Output each element according to the result of step 1, in compliance with the SOAP standard, assigning an “id” and referencing it where needed. This involves generating an output routine for each type. These routines construct the actual XML elements.

3. Now that we have the SOAP routines that generate the SOAP body, we need the header, which is straightforward for the most part. But we need the length of the XML message in advance, and we do not know it until we have output the message. One solution would be to store the XML first, get its length and then output it, which would not be efficient. Instead we use a two-phase approach: we first go through the entire output process without actually writing the characters, only counting them, and in the second phase we output the characters. This involves setting a flag that indicates whether we are counting or printing, and each of the output routines has to check this flag. There is one more problem: the first pass changes the pointer table. So we make a copy of the pointer table before we start counting, and once we are done counting we restore the pointer table to its initial state.

4. At the time of deserialization we have to rebuild the exact data structure from the XML supplied, and there is a problem associated with forward referencing in SOAP. SOAP allows forward references, so an object may already have been referenced by an “href” field although it has not yet been defined by an “id” field. This has to be taken care of in the deserialization routines.
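The counting pass of step 3 can be sketched with a single flag that every output routine checks. The struct out and the functions send_str, put_int_element and send_message are hypothetical names for this illustration; a character buffer stands in for the socket, and the saving and restoring of the pointer table is left out because the sketch outputs only one integer element.

```c
#include <stdio.h>
#include <string.h>

struct out {
    int counting;   /* 1: only count characters, 0: really write them */
    size_t count;   /* number of characters counted in phase 1 */
    char *buf;      /* destination of phase 2 (stands in for the socket) */
    size_t size;    /* capacity of buf, at least 1 */
    size_t len;     /* characters written so far */
};

/* The lowest-level output routine: the only place that checks the flag. */
static void send_str(struct out *o, const char *s)
{
    size_t n = strlen(s);
    if (o->counting) {
        o->count += n;
        return;
    }
    if (o->len + n < o->size) {
        memcpy(o->buf + o->len, s, n + 1);
        o->len += n;
    }
}

/* An output routine for one element; it works in both phases unchanged. */
static void put_int_element(struct out *o, const char *tag, int v)
{
    char num[32];
    snprintf(num, sizeof num, "%d", v);
    send_str(o, "<");  send_str(o, tag); send_str(o, ">");
    send_str(o, num);
    send_str(o, "</"); send_str(o, tag); send_str(o, ">");
}

/* Phase 1 counts the body, then the header and the body are written.
 * Returns the body length that was placed in the header. */
size_t send_message(struct out *o, int value)
{
    char hdr[64];

    o->counting = 1;
    o->count = 0;
    put_int_element(o, "val", value);          /* phase 1: count only */

    o->counting = 0;
    o->len = 0;
    o->buf[0] = '\0';
    snprintf(hdr, sizeof hdr, "Content-Length: %lu\r\n\r\n",
             (unsigned long)o->count);
    send_str(o, hdr);
    put_int_element(o, "val", value);          /* phase 2: real output */
    return o->count;
}
```

The design choice here is that the same output routine runs twice, so the counted length cannot drift from the length of what is actually sent, provided the state it depends on is identical in both passes.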

The above steps involve a lot of intricate programming. Every programmer who wants to use SOAP to create and deserialize a SOAP request would have to write all these routines for their functions. Writing these wrapper/stub routines is a tedious task.

All the information required to generate these routines is already stored in the type tables and symbol tables of a C compiler. Our C compiler tool accesses the type information and generates the required routines at compile time. This automates the whole process of generating the stub routines and saves the developer a lot of work.

Tool support for XML API Synthesis

We have developed a compiler as part of a problem-solving environment for XML API synthesis. The compiler translates source C data structure declarations into serializing XML input and output routines for instances of the data structures. The output routines serialize internal data structures into XML, and the input routines read instances of the data structures back in from XML. The compiler supports all data types except unions (see note at the end). Pointers are constrained to point to single objects (except for character pointers, which are considered strings).

We are currently extending the compiler to generate SOAP-RPC routines for C functions. These routines implement RPC for arbitrary C programs by mapping XML-RPC to calls to C procedures. Fig…. depicts an example C function for drawing a tree data structure with the type struct Node defined in Fig. 1(a). Fig. 1(b) shows the generated stub routine that implements XML-RPC.

A client uses the stub for a remote call to the routine of the server. After the RPC completes on the server side, the internal copies of the data structures are removed and an XML reply integer is sent back to the caller.

2. User Guide

2.1. Input and output files

[Fig. 3: input and output files. Labels in the figure: stub.h; soapC.c, runserver.c, runclient.c, soapH.h; client.c; stdsoap.h; server.c; Client; Server.]

Input file

A C file containing the prototype of the function and the structure of the return parameter is given as input to the compiler. For example, the prototype may be

int lookup(double SSN, struct result *r);

Notice that struct result is always the last parameter passed; it is ignored when the XML output for the function call is created.

This file should also declare struct result, which may be as simple as

struct result { char *name; char *SSN; };

The struct result is what will be returned from the server.
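On the server side, the user supplies the body of the function declared in the input file. The sketch below shows what such an implementation might look like. The in-memory table, its sample record and the return convention (0 for found, -1 for not found) are assumptions made for this illustration; the report does not specify them.

```c
#include <stddef.h>

struct result { char *name; char *SSN; };

/* Hypothetical in-memory "database" with one made-up record. */
static struct { double ssn; char *name; char *ssn_text; } records[] = {
    { 123456789.0, "Bob", "123-45-6789" },
};

/* Fill *r with the record that matches SSN.
 * Assumed convention: returns 0 if found, -1 otherwise. */
int lookup(double SSN, struct result *r)
{
    for (size_t i = 0; i < sizeof records / sizeof records[0]; i++) {
        if (records[i].ssn == SSN) {
            r->name = records[i].name;
            r->SSN = records[i].ssn_text;
            return 0;
        }
    }
    r->name = NULL;
    r->SSN = NULL;
    return -1;
}
```

The server stub would unmarshal SSN from the request, call lookup with a reference to a struct result, and then serialize that struct into the reply.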

Files generated

The following files will be generated. A brief description is included here; they are explained in detail later.

1. soapC.c

File containing the serialization routines and the routines for generating XML from the given structure and vice versa.

2. soapClient.c

File that creates the XML output with the header and the XML representation of the structure holding the parameters of the function (note that the result struct is not sent), sends the XML output, waits for the result, and populates the result struct with the elements of the received XML.

3. soapServer.c

Looks in the list of functions for the function that was called and calls it. It gets the parameters for the function from the XML and calls the function with them and a reference to a struct result, so that the struct result is populated with the required values. This struct is first serialized; then an XML file is created with the XML representation of this struct and sent back to the client.

4. soapH.h

Contains the declarations/prototypes of all the functions created in the soapC.c file. It also declares the user-defined types and assigns a number to each of them.

REQUISITES FOR USING THE SYSTEM

As we saw above, the user doesn’t need any knowledge of SOAP or XML in order to use the serializer. The translation of C data structures to SOAP/XML is transparent to the user. But if the user wants to use it with some other SOAP-enabled system, he needs to know the XML data format (as defined by XML Schemas). So, for interoperability purposes, he needs to know how the C data structures are mapped to XML.

2.2 Mapping of C data structures to XML data format

The C data structures are mapped in the following manner.

1. Structures

struct Record

{

Type1 field1;

Type2 field2;

Type3 field3;

TypeN fieldN;

};

will be mapped to

….

…..

…..

…..

….
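As a rough illustration of the structure mapping, the sketch below writes a struct with two fields as one XML element whose children are named after the fields. The concrete field types of Record and the helper record_to_xml are hypothetical, and the tags and attributes that the compiler really emits may differ (for instance, it may add type attributes).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct Record {
    int field1;
    double field2;
};

/* Write the XML form of *r into buf (at most size bytes).
   Returns the number of characters the full output needs,
   as snprintf does. Element names mirror the C field names. */
int record_to_xml(char *buf, size_t size, const struct Record *r)
{
    return snprintf(buf, size,
                    "<Record><field1>%d</field1><field2>%g</field2></Record>",
                    r->field1, r->field2);
}
```

A caller would pass a buffer and a filled-in struct Record; the order of the child elements follows the order of the fields in the C declaration.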

Arrays

Type array[size];

will be mapped to

array[0]

array[1]

array[2]





array[size-1]

Basic types

All the basic types (int, short, long, char, etc.) are mapped to XML elements whose types are the built-in XML Schema types, indicated with the xsi:type attribute (this is described in the SOAP document).
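The array and basic-type mappings can be sketched together. The function below writes an int array as a sequence of child elements, one per array element and in index order, each tagged with an XML Schema type. The tag names "array" and "item", the choice of xsd:int for a C int, and the helper int_array_to_xml are assumptions made for this sketch, not the compiler's actual output.

```c
#include <stdio.h>
#include <string.h>

/* Write the XML form of an int array to buf. Each element becomes one
   <item> child carrying an xsi:type attribute. Returns 0 on success,
   -1 if buf is too small. */
int int_array_to_xml(char *buf, size_t size, const int *a, int n)
{
    size_t used = 0;
    int w = snprintf(buf, size, "<array>");
    if (w < 0 || (size_t)w >= size)
        return -1;
    used = (size_t)w;
    for (int i = 0; i < n; i++) {
        /* array[i] is written at position i, preserving the order */
        w = snprintf(buf + used, size - used,
                     "<item xsi:type=\"xsd:int\">%d</item>", a[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }
    w = snprintf(buf + used, size - used, "</array>");
    if (w < 0 || (size_t)w >= size - used)
        return -1;
    return 0;
}
```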

SOAP Request and Response

Co-referenced Objects

struct X

{

char *name1;

char *name2;

} x;

x.name1 = "Bob";

x.name2 = x.name1;

The XML output will be

Bob
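One way to picture the handling of co-referenced fields is the sketch below. When both pointers refer to the same string, the string is written once with an id attribute and the second field only refers to it through href, in the style of SOAP 1.1 multi-reference values. The element name "X", the id value "ref-1", and the helper x_to_xml are illustrative assumptions; the sketch also skips XML escaping of the string contents.

```c
#include <stdio.h>
#include <string.h>

struct X {
    char *name1;
    char *name2;
};

/* Write *x as XML. Shared strings are output once and referenced;
   distinct strings are output separately. Returns what snprintf returns. */
int x_to_xml(char *buf, size_t size, const struct X *x)
{
    if (x->name1 != NULL && x->name1 == x->name2)
        return snprintf(buf, size,
                        "<X><name1 id=\"ref-1\">%s</name1>"
                        "<name2 href=\"#ref-1\"/></X>", x->name1);
    return snprintf(buf, size,
                    "<X><name1>%s</name1><name2>%s</name2></X>",
                    x->name1 ? x->name1 : "", x->name2 ? x->name2 : "");
}
```

The comparison is on the pointers, not on the string contents: two fields that hold equal but separately allocated strings are not co-referenced.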

Namespaces

An XML namespace is defined as follows by the “Namespaces in XML” recommendation:

[Definition:] An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names. XML namespaces differ from the "namespaces" conventionally used in computing disciplines in that the XML version has internal structure and is not, mathematically speaking, a set.

We use namespaces as follows. The user defines a namespace table in the main function; each entry associates an ID with a URN that points to the definitions the user may use. A SOAP namespace stack is maintained on which all the namespaces provided by the user are stacked, so that whenever an ID is used in the XML, the corresponding URN is referred to. All the namespaces in the table are included in the SOAP envelope, and the ID is used when outputting the SOAP begin tag. At deserialization time, the server first populates the table with all the entries in the SOAP envelope. Then, whenever there is a reference to a particular namespace, the struct defined by the URI specified in the table is used.
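A namespace table of this kind can be sketched as an array of ID and URN pairs with a linear lookup. The names ns_entry and ns_lookup and the NULL-terminated layout are assumptions for illustration, not the layout used by the generated code.

```c
#include <stddef.h>
#include <string.h>

/* One entry of a namespace table: a short ID and the URN it stands for. */
struct ns_entry {
    const char *id;
    const char *urn;
};

/* Return the URN bound to id, or NULL if the table has no such entry.
   The table is terminated by an entry whose id is NULL. */
const char *ns_lookup(const struct ns_entry *table, const char *id)
{
    for (; table->id != NULL; table++)
        if (strcmp(table->id, id) == 0)
            return table->urn;
    return NULL;
}
```

For example, a table could bind the ID "SOAP" to the SOAP 1.1 envelope namespace http://schemas.xmlsoap.org/soap/envelope/, so that a tag such as SOAP:Envelope resolves to that URN.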

Note About Unions

We have found earlier that unions can be replaced by structs. However, this comes at a cost in memory. It is especially costly when the number of alternatives is large and each alternative can be a large data structure (e.g. an array).

Instead, one can use a struct with pointer fields. The SOAP XML representation will be the same, because the pointers are invisible!

For unions in C code, a user can translate

union U {

Tx x;

Ty y;

Tz z;

};

into

struct U {

Tx *x;

Ty *y;

Tz *z;

};

(assuming types Tx, Ty, and Tz are not already pointers). However, the user has to modify existing code that uses union U.

Reading a SOAP form for this struct will work fine, because the fields are initially set to NULL. So only the field that was defined in the incoming SOAP stream is not NULL.

Currently in the compiler, a struct of the form above will be output with null pointers for all the fields except the field that contains a pointer to the actual field of the union. This is not very elegant, but I don't know how to do this differently right now. I mean: the null pointers should be omitted ONLY when dealing with fields in a struct, not in general.
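The pointer-based replacement for a union can be sketched as follows. The typedefs for Tx, Ty, and Tz are stand-ins chosen for this example, and u_active_field is a hypothetical helper that shows how code can find the one alternative that is set after the struct has been read.

```c
#include <stddef.h>

/* Stand-ins for the alternatives Tx, Ty, Tz of the union. */
typedef int Tx;
typedef double Ty;
typedef char Tz;

/* Pointer-based replacement for union U: at most one field is non-NULL. */
struct U {
    Tx *x;
    Ty *y;
    Tz *z;
};

/* Return the name of the alternative that is set, or NULL if none is. */
const char *u_active_field(const struct U *u)
{
    if (u->x != NULL) return "x";
    if (u->y != NULL) return "y";
    if (u->z != NULL) return "z";
    return NULL;
}
```

Compared with a struct that holds all alternatives by value, this layout costs only one pointer per alternative plus the storage of the alternative actually in use.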

3. EXAMPLES

3.1 Query Example

This example uses a simple text file, which stores a list of SSNs and corresponding names.

This file resides on the server and the client queries the file using SOAP.

The server has a function lookup_SSN which searches the file for the required SSN and returns the name corresponding to that SSN.

The stub.h file looks like this

struct result

{

  char * name; 

  char * SSN; 

}; 

int lookup_SSN(double SSN, struct result *r);

It contains the declaration of struct result and the prototype of the function. Notice that the last parameter in the function prototype is a pointer to struct result.

The result comes back with the SSN, name pair.

Test case: we query for the SSN 59222222

The SOAP request that is sent to the server is

POST /~gupta/server.cgi HTTP/1.1

Host: cs.fsu.edu

Content-Type: text/plain

Content-Length: 212

SOAPMethodName: lookup_SSN

59222222

The SOAP response received from the server is

HTTP/1.1 200 OK

Date: Sat, 18 Nov 2000 21:07:43 GMT

Server: Apache/1.3.4 (Unix)

Transfer-Encoding: chunked

Content-Type: application/x-httpd-cgi

104

Koustubh

592222222

3.2 System Example

In this example we query who is logged on to the system the server is running on.

The client passes just an integer, which is not actually used. The server returns an array of strings (to save space, in this example the server returns only the first 10 users logged on).

The stub file looks like this

struct result

{ char * a[100];

int ain;

};

int who_here(int i, struct result *r);

The SOAP request that is generated looks as follows

POST /~gupta/server.cgi HTTP/1.1

Host: cs.fsu.edu

Content-Type: text/plain

Content-Length: 189

SOAPMethodName: who_here

3

The server runs the “who” system command and returns an array of strings (only the first 10 are returned)

HTTP/1.1 200 OK

Date: Sat, 18 Nov 2000 21:03:16 GMT

Server: Apache/1.3.4 (Unix)

Transfer-Encoding: chunked

Content-Type: application/x-httpd-cgi

4ac

curci pts/48 Nov 3 21:45&#x;(dial972.acns.fsu.edu)&#x;

whalley pts/0 Oct 12 06:43&#x;(protoss)&#x;

whalley pts/1 Oct 12 06:43&#x;(protoss)&#x;

pant pts/2 Nov 16 16:43&#x;(128.186.111.178)&#x;

whalley pts/3 Oct 12 06:44&#x;(protoss)&#x;

engelen pts/4 Nov 7 09:25&#x;(taz)&#x;

whalley pts/5 Oct 12 06:50&#x;(protoss)&#x;

whalley pts/6 Oct 12 06:50&#x;(protoss)&#x;

whalley pts/7 Oct 12 06:50&#x;(protoss)&#x;

schwartz pts/9 Oct 14 08:42&#x;(du1)&#x;

0

4. Implementation

4.1 Functions generated for each type

void soap_serialize_Struct_lookup_SSN (struct lookup_SSN *p)

void soap_mark_Struct_lookup_SSN(struct lookup_SSN *p)

void soap_default_Struct_lookup_SSN(struct lookup_SSN *p)

void soap_put_Struct_lookup_SSN(struct lookup_SSN *p)

void soap_out_Struct_lookup_SSN(char *tag,int id,struct lookup_SSN *p)

struct lookup_SSN * soap_get_Struct_lookup_SSN(struct lookup_SSN *a)

struct lookup_SSN * soap_in_Struct_lookup_SSN(char * tag,struct lookup_SSN *p)

soap_serialize: serializes the type and resolves the pointer references, both forward references and back references. It does so using the soap_mark routine, which marks the element as being referred to or referring.

soap_default?

soap_put: decides how to output the element in XML, depending on whether it is referred to or has a back reference, and calls the appropriate output method accordingly: soap_element_ref, soap_out with an id, or soap_ref without an id.

soap_out: constructs the actual XML output with the tag name, id and the value. Note that this involves calling xml_out functions for the primitive types. Whether the XML is actually output or only counted is determined by a flag COUNT: if it is 0, the counter is incremented for each character that would be printed; if not, the characters are actually written to the stream. We need this to determine the length of the XML that will be generated, because the length is required to generate the header when creating a request.
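The counting mechanism can be sketched as a single output primitive that either counts characters or writes them, so that the same output routine serves both phases. The begin_count and begin_print functions here are simplified stand-ins for the routines of the same name; the variable names, the meaning given to the flag, the fixed buffer used instead of a stream, and the <SSN> tag are assumptions made for this sketch.

```c
#include <stdio.h>
#include <string.h>

static int counting;        /* nonzero: only count, do not write      */
static size_t count;        /* characters seen during the count phase */
static char out[256];       /* stand-in for the output stream         */
static size_t out_len;

void begin_count(void) { counting = 1; count = 0; }
void begin_print(void) { counting = 0; out_len = 0; out[0] = '\0'; }

/* Either count the characters of s or append them to the output. */
void emit(const char *s)
{
    size_t n = strlen(s);
    if (counting) {
        count += n;
    } else if (out_len + n < sizeof out) {
        memcpy(out + out_len, s, n + 1);
        out_len += n;
    }
}

/* The same output routine is called in both phases. */
void put_body(void)
{
    emit("<SSN>");
    emit("59222222");
    emit("</SSN>");
}
```

Calling begin_count() and put_body() yields the length needed for the header; calling begin_print() and put_body() again produces exactly that many characters.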

4.2 How the functions are generated

The compiler loops through the type table. When it comes across a type that is not a primitive type and for which the functions have not already been generated, it generates the functions for it.

It does so recursively: if the type uses another user-defined type, e.g. a pointer to a type T where T is not primitive and its functions have not been generated, then it first generates the functions for T and then the functions for pointer to T. So first the functions for all composite types are generated.

For each function that is written to soapC.c, a prototype is written to soapH.h.

Each type is also assigned a new int, which is added to soapH.h.
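The generation order described above can be sketched with a simplified type table in which each entry depends on at most one other type. The struct type_entry, its fields, and the functions generate and generate_all are hypothetical; "generating the functions" is modelled here by recording the type name in an output list.

```c
/* Simplified type-table entry: a name, whether the type is primitive,
   the index of the type it is built from (-1 if none), and a flag that
   records whether its functions were already generated. */
struct type_entry {
    const char *name;
    int primitive;
    int base;
    int generated;
};

/* Generate functions for type i, first handling the type it depends on. */
void generate(struct type_entry *t, int i, const char **order, int *n)
{
    if (t[i].primitive || t[i].generated)
        return;
    t[i].generated = 1;   /* set before recursing so recursive types terminate */
    if (t[i].base >= 0)
        generate(t, t[i].base, order, n);
    order[(*n)++] = t[i].name;
}

/* Loop over the whole table; returns the number of types handled. */
int generate_all(struct type_entry *t, int count, const char **order)
{
    int n = 0;
    for (int i = 0; i < count; i++)
        generate(t, i, order, &n);
    return n;
}
```

With a table that lists a pointer to T before T itself, the functions for T are still recorded first, and primitive types are skipped.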

4.3 Calls to the functions in the soapClient.c and soapServer.c

soapClient.c has a soap_call_lookup_SSN function that takes the parameters of the remote function and also the URL of the server.

int soap_call_lookup_SSN(char * URL, double SSN, struct result * _R)

It creates a struct (s) to send the parameters to the server and populates its fields with the arguments passed to the function (s.SSN = SSN).

It then serializes the struct.

soap_begin();

soap_serialize_Struct_lookup_SSN(&s);

It then creates the XML in two steps. The header must be created first, and it needs the length of the body. This is handled by a two-phase algorithm in which a flag decides whether the output is only counted or actually sent to the stream; the flag is set with the functions begin_print() and begin_count(). To create the header, begin_count is called first and then the SOAP envelope is output.

Add stuff about soap_envelope

begin_count();

soap_envelope_begin_out();

soap_put_Struct_lookup_SSN(&s);

soap_envelope_end_out();

This calculates the length.

Then the header is generated

begin_print();

generate_header(URL,"lookup_SSN");

followed by the actual XML body for the struct

soap_envelope_begin_out();

soap_put_Struct_lookup_SSN(&s);

soap_envelope_end_out();
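To show why the length must be known before anything is sent, the sketch below formats a request header like the one in the Query Example, with the counted body length in the Content-Length field. The helper format_header and its parameters are hypothetical; it is not the generate_header routine called above, whose internals are not shown in this report.

```c
#include <stdio.h>
#include <string.h>

/* Format an HTTP request header with the fields used in the examples.
   body_length is the value obtained in the counting phase. Returns the
   header length, or -1 if buf is too small. */
int format_header(char *buf, size_t size, const char *host,
                  const char *path, const char *method, size_t body_length)
{
    int n = snprintf(buf, size,
                     "POST %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Content-Type: text/plain\r\n"
                     "Content-Length: %lu\r\n"
                     "SOAPMethodName: %s\r\n"
                     "\r\n",
                     path, host, (unsigned long)body_length, method);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The header ends with an empty line, after which exactly body_length characters of XML follow.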

It then waits for the reply from the server. Once it gets the reply, it parses it and populates the struct result that it took as a parameter.

soap_element_begin_in("SOAP:Envelope");

soap_element_begin_in("SOAP:Body");

soap_get_Struct_result(_R);

soap_element_end_in("SOAP:Body");

soap_element_end_in("SOAP:Envelope");

soapServer.c

soapServer.c has an int soap_serve_F() for each function F, and it also has a function soap_serve().

soap_serve()

It picks the right function, using the int returned by each soap_serve_F function in nested if statements. If the returned value is 0 for any of the functions, that function succeeded and soap_serve ends. If control drops down to the last statement, the function being looked for was not found.
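The dispatch logic can be sketched as a chain of attempts that stops at the first server function returning 0. The functions below are simplified stand-ins: they decide from a method name passed as an argument, whereas the generated soap_serve_F functions read the request themselves, and the names serve, serve_lookup_SSN, and serve_who_here are hypothetical.

```c
#include <string.h>

/* Each per-function server returns 0 if the request was for its
   function and it was served, nonzero otherwise. */
static int serve_lookup_SSN(const char *method)
{
    return strcmp(method, "lookup_SSN") == 0 ? 0 : 1;
}

static int serve_who_here(const char *method)
{
    return strcmp(method, "who_here") == 0 ? 0 : 1;
}

/* Try each server in turn; fall through to an error if none matches. */
int serve(const char *method)
{
    if (serve_lookup_SSN(method) == 0)
        return 0;
    if (serve_who_here(method) == 0)
        return 0;
    return -1;   /* the requested function was not found */
}
```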

soap_serve_F()

This function performs the same two steps as soap_call_F, but in the opposite order: it first waits for the request, parses it, and populates a local struct with the parameters from the XML.

soap_element_begin_in("SOAP:Envelope");

soap_element_begin_in("SOAP:Body");

soap_get_Struct_lookup_SSN(&s);

soap_element_end_in("SOAP:Body");

soap_element_end_in("SOAP:Envelope");

Then it calls the function using the parameters from the request and a reference to a locally created struct result.

lookup_SSN(s.SSN,&r);

After the call, the struct result holds the result that needs to be sent to the client.

The first step is to serialize the struct result, and the second is to output it.

soap_begin();

soap_serialize_Struct_result(&r);

This again involves sending the length of the message along with the envelope and the body. It is done with the same two-step method, first counting the length by calling begin_count

begin_count();

soap_envelope_begin_out();

soap_put_Struct_result(&r);

soap_envelope_end_out();

Then the output is sent

begin_print();

soap_envelope_begin_out();

soap_put_Struct_result(&r);

soap_envelope_end_out();

5. Concluding Remarks

Our XML serialization and XML-RPC stub generation work well to achieve effective data interoperability between disparate applications. Semantic data interoperability, however, may require the modification of data to meet specific constraints of an application that handles the data. The group will further investigate methods to automatically (re)map XML into modified forms given a specification of constraints.

6. References

1.

2.

3. Don Box, David Ehnebuske, Gopal Kakivaya, Andrew Layman, Noah Mendelsohn, Henrik Frystyk Nielsen, Satish Thatte, and Dave Winer, Simple Object Access Protocol (SOAP) 1.1, W3C Note, 08 May 2000.

4.

5. R. van Engelen, K. Gallivan, G. Gupta, and G. Cybenko, XML-RPC Agents for Distributed Scientific Computing, in IMACS'2000 Conference, Lausanne, Switzerland, August 2000.
