EXternal Data Representation (XDR)



CDR



Message Passing

Early and rudimentary distributed systems communicated via message passing. This form of communication is very simple. One side packages some data, known as a message and sends it to the other side where it is decoded and further action may be taken. The format of the message and the way in which it will be processed by the receiver are application dependent. In some applications the receiver may respond by sending a reply message. In other cases, this might not happen.

It is important to realize that the messages carry only data and are typically represented in a way that is known only by the sender and receiver -- there is nothing standard about it. Unless mitigating action is taken by the designer of the message format or the implementer of the application, the communication might not be interoperable across platforms, because of representation differences (e.g. big-endian vs. little-endian). This approach also makes it hard to reuse components of one distributed system in other distributed systems, because there is really no concept of a common library -- everything is "hand rolled".

Remote Procedure Calls (RPC)

Although message passing can be effective, it would be nice if there were a more uniform, reusuable, and user-friendly way of doing things. Remote Procedure Calls (RPCs) provide such an abstraction. Instead of viewing the communication between two systems in terms of independent exchanges of data, we back up a step and examine the overall behaviors of the systems.

Often times the services of the system can often be decomposed into procedures, much like those used in traditional programming. These procedures accept certain types of information, perfomr some useful operation, and then return a result. The RPC abstraction allows us to extend this paradigm to distributed systems. One system can provide remote procedure calls for use by other systems. From the applications point of view, it can use these remote procedure calls much like local procedure calls. Behind the scences, the RPC is actually connecting to a remote host, sending it the parameters, performing an operation on that remote host, and then returning the result.

This is very similar to a very specific use of message passing. In fact the function invocation is a message from the client to the server. This message names the function and also provides the parameters. After receiving this message, the server performs the operation and sends a message back to the client with a result. The client then treats this result as if it were the return value from a local procedure call.

The important charachteristics of RPCs are these:

• They provide a very familiar interface for the application developer

• They one way of implementing the commonplace request-reply primitive

• The format of the messages is standard, not application dependent

• They make it easier to reuse code, since the RPCs have a standard interface and are separate from any one application-proper.

Limits of RPCs

From the perspective of the application programmer, RPCs operate much like local procedure calls, but there are some important differences:

• A "call be address (reference)" is not possible, because the two processes have different address spaces. Parameters that otherwise might be passed by address are often passed by "in-out" instead. In-out paramters are copied by value on procedure invocation, and again upon return. The end result is that the calling procedure has the new value of the data item. The difference in semantics is that paramters passed by in-out retain their old value until the function returns. Parameters that are passed by reference can change value, even from the perspective of the calling function, throughout.

• Addresses and other large objects are typically passed by address, these need to be copied.

• Byte ordering issues (big-endian/little-endian) might need correction

• Representation issues, sucha as ACII vs. EPCIDIC might need mitigation.

• "Calls by (C++, et al) reference" require more work as does the return of an object reference. This is because both sides don't share either the same address space or the same mapping tables.

Marshalling and Stubs

The process of preparing and packaging the information for transmission is known as marshalling (think of the marshal leading people at a wedding). This often involves translating non-portable representations into a portable or canonical form. In the case of Sun's implementation of RPC, a set of conventions known as eXternal Data Representation (XDR) is used.

In order to hide this process from both the application programmer and the author of the RPC library, it is often implemented using automatically generated stubs.

The process basically works like this. The programmer develops the interface for the RPC. The stub generator takes this interface definition and creates server stubs and client stubs, as well as a common header file. The server stubs and client stubs take care of the marshalling and unmarshalling of the parameters, as well as the communication and procedure invocations. This is possibly, because these actions are well defined given the procedure's identity and parameterization. Once this is done, the programmer can build the RPCs and the application. Each is linked against the RPC library which provides the necessary code to implement the RPC machinery.

This process is shown in the figure below:

[pic]

RPC and Failure

The failure modes of RPCs are different than those experienced by local function calls -- there are common failure modes! When was the last time that you can remember a local function call failing? I'm not asking when it was that you most recently observed a bug in a function. I'm asking when it was that you most recently observed the actual transfer of control fail. My point is that in conventional systems this doesn't happen -- and if it should ever happen, it is acceptable to do nothing, but roll over.

But this isn't the case in a distributed system. The communication to the RPC server can fail. The reply from the RPC server can fail. The server can crash. The client can crash. And worst of all, even if we know that asomething bad happened, we may not know when. What to say? Bad things happen -- but good software is prepared.

Please consider the situations shown below:

[pic]

These failure modes lead to different semantics for RPCs in light of failure:

• exactly once -- the RPC will be executed exactly once -- never more, never less. Althoguh this is most like local function calls, it is very expensive to implement

• at most once -- if all goes well -- "Hurray! It worked!" Otherwise, no big deal. The important thing is that the operation is never repeated.

• at least once -- if all goes well -- "Hurray! It worked!" Otherwise, keep repeating the operation -- even if there is a risk that it might have happened already (e.g. lost ACK).

• idempotent -- The operation can be repeated without any change. Often times at least once semeantics leads to the design of idempotent operations.

Finding an RPC

RPCs live on a specific host, at a specific port. The port mapper on the host maps from the RPC name to the port number. Typically when a server is initialized, it registers its RPCs, and their version numbers with the port mapper. A client will first connect to the port mapper to get this handle to the RPC. The call to the RPC can then be made by connecting to this port.

Distributed Objects

For many of the same reasons that it was nice to extend the idea of local procedures to a distributed system, it is also convenient to extend the idea of objects to a distributed system. Distributed objects operate much as did remote procedure calls. Below we'll highlight some of the key differences.

• Objects often contain references to other objects. In these cases the marshalling process becomes more comples. The objects must be serialized. The process involves making copies of all of the referenced objects, and repeating, as necessary. After an object is serialzied, it can be sent "all as one".

• Unlike procedures whose lifetime is limited to the duration of the procedure invocation, objects can maintain state between function calls. This makes necessitates making objects persistent, so we can effectively write them out to disk, and restore them, as necessary.

• Remote Object references are typically much like local object references with the IP address and port number added.

Java and RMI

Java supports distributed objects and Remote Method Invocation (RMI). Java's native RMI has some interesting limitations and features:

• All paramters are inputs -- none can be outputs or results

• Distributed objects can return only one object.

• Java has a registry that serves much the same job as the RPC port mapper.

• Java actually has a mechanism by which a client can download the client stub via an HTTP server. This allows clients to obtain updated code and improves revision transparency.

Java RMI Overview

An overview of the organization of the Java RMI is below:

[pic]

The proxy object is a hollow container on the client system. When it recieves a message (method invocation), it passes it on to the remote reference module This module determines the provider of this object, marshalles the parameters, as necessary, and sends the information to the communications module. The communications module sends the request over the network to the other machine.

At this point a very complimentary process occurs. The communication module passes the request to the dispatcher which determines which object should receive the message. It then sends the messaeg to the skeleton for this object. The skeleton unmarshallas the paramters and forwards the request to the object, itself. THe object then does what it was told and returns the result to the skeleteon. The skeleton marshalls the result and sends it to the communications module, which sends it out over the wire. The caller's communication module recevies the marshalled response and passes it to the remote reference module for unmarshalling. The decoded response is then sent to the proxy object, whcih in turn returns the response to the invoking object -- just as if the object had been local.



Marshaling

Marshaling is process of encoding object to put them on the wire (network)

Unmarshaling is the process of decoding from the wire and placing object in the address space

RMI uses serialization (deserialization) to perform marshaling (Unmarshaling)

Making an Object Serializable

The object's class must implement the interface

java.io.Serializable

Serializable has no methods

Example

import java.io.Serializable;

class Roger implements Serializable

{

public int lowBid;

public Roger(int lowBid )

{

this.lowBid = lowBid;

}

public String toString()

{

return " " + lowBid;

}

}

Serializing and Deserializing Objects

The writeObject method of ObjectOutputStream serializes objects

The readObject method of ObjectInputStream deserializes objects

Example

import java.io.*;

class SerializeDeserialize {

public static void main( String args[] ) throws IOException {

serialize();

deserialize();

}

public static void serialize() throws Exception {

OutputStream outputFile =

new FileOutputStream( "Serialized" );

ObjectOutputStream cout =

new ObjectOutputStream( outputFile );

cout.writeObject( new Roger( 12));

cout.close();

}

public static void deserialize() throws Exception {

InputStream inputFile =

new FileInputStream( "Serialized" );

ObjectInputStream cin =

new ObjectInputStream( inputFile );

Roger test = (Roger) cin.readObject();

System.out.println( test.toString() );

}

}

Output

12

Some Rules

A Serializable class must:

Implement the java.io.Serializable interface

Identify the fields that should be serializable

This can be done in two different ways. The default method or use serialPersistentFields member. The latter will be covered later. Using the default method all fields are identified as serializable by default, unless marked transient. More about that later.

Have access to the no-argument (or default) constructor of its first nonserializable superclass (or supersuperclass, supersupersuper class)

Constructors & Deserialization

When an object is deserialized, no constructors of the object’s class are called.

All parameters passed by value between client and server are marshaled/unmarshaledSerialization

Serialization allows objects to be converted to a sequence of bytes

The sequence of bytes can be stored in a file, database, sent to a remote machine etc.

The sequence of bytes can be used to recreate the original object

Serializing

Creating the sequence of bytes from an object

Deserializing

Recreating the object from the above generated bytes

Actually creates a new object with the fields containing the same values as the original object

If the fields are also serializable objects then they are also "recreated"

Cetus Links: 18,452 Links on Objects and Components / Java RMI

CSE3420 Lecture Notes Module 10 - Serialisation, RMI and CORBA

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download