Binary Random-Access I/O in Java - Simon Fraser University

Binary Random-Access I/O in Java

Revision History:

01/09/24 - Original by Elliot Wiltshire, TA - Cmpt275-d1 (01-3).

01/10/19 - Rev. 6 by Russ Tront. Added page numbers (they were in table of contents but not on pages) and changed margins to use more of page. Also, put in some page breaks. I have not proof read it.

01/10/25 - Rev. 7 by Elliot Wiltshire. Added section 4, on reading/writing instances of simple objects. Also added revised demonstration code (CandyItem.java and CandyDemo.java).

01/10/28 - Rev. 8 by Elliot Wiltshire. Changed source listings CandyItem.java, CandyDemo.java, also accepted all changes from previous revisions. Sample output from section 4 updated.

01/10/29 - Rev. 9 by Elliot Wiltshire, suggestions by Russ Tront. Changed source listings CandyItem.java, CandyDemo.java. Sample output from section 4 updated.

01/11/12 - Rev. 10 by Elliot Wiltshire, suggestions by Russ Tront. Minor grammar fixups, documented try block in CandyDemo.printAll, removed `long' casts on calls to RandomAccessFile.length calls, added section 3.3 to discuss truncation of RandomAccessFile.

01/11/13 - Rev. 11 by Russ Tront. Updated table of contents, and indicated that file extension pads the file length with undefined values (likely zero, but may depend on the platform).

TABLE OF CONTENTS

1. Introduction
2. I/O in Java
    2.1 Stream Basics
3. Random Access Binary I/O in Java
    3.1 RandomAccessFile class
    3.2 Writing Base Types
    3.3 Reading Base Types
    3.4 Binary File Truncation and Extension
4. Reading and Writing Objects
    4.1 Reading and Writing `Simple' Objects
        4.1.1 CandyItem.java Source Code
        4.1.2 CandyDemo.java Source Code


1. Introduction

You may wonder what `binary' means in the context of input and output (I/O). When you look at your program's output (of integers, for instance) on the console, you are looking at a human-readable form: the internal number format that the CPU needs for arithmetic (e.g. 2's complement) has been converted into a string of ASCII characters for presentation on the console or printer. The format used inside the computer is more compact and easier to do arithmetic with. In binary I/O we preserve this compact form in the file, since the data will only have to be read back into the computer later anyway. Converting to ASCII when writing and back to binary when reading costs processing time, and the text form often takes more file space.
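To make the size difference concrete, here is a minimal sketch that compares the number of bytes needed for one int as ASCII text and in binary form. The class name TextVersusBinary and its helper methods are made up for this illustration, and an in-memory stream stands in for a file. It uses try-with-resources and UncheckedIOException, which need Java 8 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class TextVersusBinary {
    // Bytes needed to store the value as ASCII decimal digits.
    static int textSize(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII).length;
    }

    // Bytes produced when the value is written in binary with writeInt.
    static int binarySize(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        int value = 1234567890;
        System.out.println("text bytes:   " + textSize(value));   // 10
        System.out.println("binary bytes: " + binarySize(value)); // 4
    }
}
```

The binary form is always 4 bytes for an int, while the text form grows with the number of digits (and would also need a separator between values).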

The term random-access refers to the ability to move around and read/write in random places in a file. Traditionally, input and output were done in a sequential nature. You could quickly move to the middle of a file, but you couldn't write there without invalidating all the data in the file that followed. With modern disks, we have the ability to quickly move to any location in a file, and read/write anywhere in the file.

2. I/O in Java

Input and output in Java are handled by abstractions called streams. Streams are used because we (as programmers) shouldn't need to worry about whether we're writing to a file, network connection, tape, compressed archive, or a chunk of memory. Abstracting this away gives us the flexibility to write one piece of code that writes to or reads from any I/O device.

2.1 Stream Basics

InputStream is a base class, which can read from an input data source; this data source is viewed as a sequence of bytes (regardless of their internal interpretation as an int, float, etc.). OutputStream is another base class, which can write a sequence of bytes to an output destination. There are a few things to remember when dealing with streams:

- All streams inherit the close() method. It is important to call it once you've finished with the stream, as it returns I/O resources to the OS.
- OutputStream has the flush() method, which sends any remaining buffered data to its destination. Operating systems usually use buffers on I/O devices for improved efficiency. Note that close() also flushes any pending I/O requests.
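The effect of flush() can be observed with a small sketch. It assumes a Java-level BufferedOutputStream wrapped around an in-memory sink (rather than an operating-system buffer), because that makes the buffering visible from within the program. The class name FlushDemo is made up for this illustration, and the try-with-resources statement (Java 7 and later) calls close() automatically.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration only.
class FlushDemo {
    // Returns {bytes that reached the sink before flush(), bytes after flush()}.
    static int[] bytesDelivered() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(sink)) {
            out.write(new byte[] {1, 2, 3}); // held in the stream's buffer
            int before = sink.size();
            out.flush();                     // pushes the buffered bytes to the sink
            int after = sink.size();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = bytesDelivered();
        System.out.println("before flush: " + result[0]); // 0
        System.out.println("after flush:  " + result[1]); // 3
    }
}
```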

There are many (around 60) different Java stream classes. These allow you to do things like open and read compressed archives (JAR and ZIP), `push back' input that you've already read (if you're parsing a file, for example), or compute checksums on a sequence of bytes.

The class we'll be using for random-access, binary I/O is RandomAccessFile. This class is a hybrid in I/O terms, since it implements both the DataInput and DataOutput interfaces, allowing input and output using one stream. In contrast, most Java I/O classes are dedicated to either input only, or output only.


3. Random Access Binary I/O in Java

3.1 RandomAccessFile class

As stated before, this class allows the programmer to perform both input and output with one stream. RandomAccessFile also permits (as its name suggests) reading from and writing to the file non-sequentially.

RandomAccessFile supports non-sequential (random) access using a file pointer. This is not the same as a C/C++ pointer! You can think of it as a file marker, or placeholder, representing where the next byte will be read from, or written to, within the file. You can reposition this pointer using the seek(long) method (see below), and get its current position with the getFilePointer() method. You can also determine the total length of the file using the length() member function.

RandomAccessFile member functions for moving around in a file:

Method                  Explanation/Usage
long getFilePointer()   Returns the current offset the file pointer is at, measured from the beginning of the file (the beginning is termed offset 0).
long length()           Returns the total size of the file in bytes.
void seek(long)         Sets the file pointer to an offset within the file, with the beginning of the file at offset 0.
void close()            Flushes any pending I/O requests and releases resources to the operating system. Once called, you can no longer write to, or read from, the file.
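The following sketch exercises these methods together. The class name FilePointerDemo and the use of a temporary file are assumptions made for the illustration. It writes ten ints, then jumps directly to the sixth one.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration only.
class FilePointerDemo {
    // Returns {file length, value read at offset 20, file pointer after the read}.
    static long[] run() {
        try {
            File file = File.createTempFile("pointer-demo", ".bin");
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                for (int i = 0; i < 10; i++) {
                    raf.writeInt(i);                 // 4 bytes each
                }
                long length = raf.length();          // 10 ints * 4 bytes = 40
                raf.seek(5 * 4);                     // offset of the sixth int
                int value = raf.readInt();           // reads 5, advances 4 bytes
                long pointer = raf.getFilePointer(); // 24
                return new long[] {length, value, pointer};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println("length=" + r[0] + " value=" + r[1] + " pointer=" + r[2]);
    }
}
```

Note that every read or write advances the file pointer by the number of bytes transferred, so sequential reads need no explicit seek.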

Creating or opening a file is simply a matter of creating an instance of RandomAccessFile. Unlike some other stream classes, RandomAccessFile will not, by default, clobber any data already in the file. The RandomAccessFile constructor takes both the file's name (String) and a `mode' (String) in which to open the file. The two different modes are described below:

RandomAccessFile constructors and modes

Constructor                                        Explanation/Usage
RandomAccessFile( String filename, String mode )   filename - filename to open (OS dependent!)
                                                   mode - "r" = read-only access, "rw" = read and write access (if allowed)
RandomAccessFile( File file, String mode )         file - File object to open
                                                   mode - same as above
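As a rough illustration of how the two modes behave, the sketch below tries both constructors on a file that does not exist yet. The class name OpenModesDemo and the file name candy.bin are made up for this example. It relies on three documented behaviours: "r" never creates a file (a FileNotFoundException is thrown instead), "rw" creates the file if it is missing, and write methods throw an IOException on a file opened with "r".

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper class for illustration only.
class OpenModesDemo {
    // Returns {"r" open of a missing file failed, "rw" open created the file,
    //          write through a read-only file was rejected}.
    static boolean[] run() {
        try {
            File dir = Files.createTempDirectory("modes-demo").toFile();
            File file = new File(dir, "candy.bin"); // hypothetical name; not created yet

            boolean missingRejected = false;
            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                // Not reached: "r" never creates a file.
            } catch (FileNotFoundException e) {
                missingRejected = true;
            }

            boolean created = false;
            try (RandomAccessFile raf = new RandomAccessFile(file.getPath(), "rw")) {
                created = file.exists();            // "rw" created the file
                raf.writeInt(42);
            }

            boolean writeRejected = false;
            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                raf.writeInt(7);                    // not allowed in read-only mode
            } catch (IOException e) {
                writeRejected = true;
            }

            file.delete();
            dir.delete();
            return new boolean[] {missingRejected, created, writeRejected};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```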


3.2 Writing Base Types

Each base type in Java has a well-defined size associated with it (in bytes), unlike C/C++.

Java base types, and their sizes in bytes

Type      Size (bytes)
boolean   1
byte      1
char      2
double    8
float     4
int       4
long      8
short     2

RandomAccessFile imposes no structure on the file, other than Java's `big-endian' convention for multi-byte types. You'll sometimes hear that certain processors use `big-endian' or `little-endian' conventions for storing data types of more than one byte. These terms refer to the order in which the bytes of a multi-byte value are placed in memory. Big-endian means storing the most-significant byte first, while little-endian means storing the most-significant byte last. Java's binary read and write methods always use big-endian order, regardless of the processor, so a file written on one platform can be read on another.
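The byte order can be checked directly with a short sketch: write one int whose four bytes are all different, then read the file back one byte at a time. The class name ByteOrderDemo and the temporary file are assumptions made for the illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class for illustration only.
class ByteOrderDemo {
    // Writes one int with writeInt and returns the four bytes as stored in the file.
    static byte[] bytesOf(int value) {
        try {
            File file = File.createTempFile("byte-order", ".bin");
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.writeInt(value);
                raf.seek(0);                 // back to the start of the file
                byte[] stored = new byte[4];
                raf.readFully(stored);       // raw bytes, in file order
                return stored;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The most-significant byte (0x01) comes first in the file.
        System.out.println(Arrays.toString(bytesOf(0x01020304))); // [1, 2, 3, 4]
    }
}
```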

The DataOutput interface has a binary write function for each of Java's base types. Binary write functions output the data from RAM essentially unchanged (except for endianness). They do NOT convert 2's complement or floating point to printable or displayable character strings like "3.14159".

Java also has a DataInput interface, which has binary read functions. These read each of Java's base types in binary format WITHOUT conversion, directly from a binary source such as a disk file into a binary value in memory.

RandomAccessFile's write methods:

Method                       Explanation
writeBoolean( boolean v )    Writes the boolean `v' as a byte in binary format.
writeByte( int v )           Writes `v' as a one-byte value in binary format.
writeChar( int v )           Writes `v' as a two-byte binary Unicode character.
writeDouble( double v )      Writes `v' as an 8-byte binary floating-point number.
writeFloat( float v )        Writes `v' as a 4-byte binary floating-point number.
writeInt( int v )            Writes `v' as a 4-byte binary signed integer.
writeLong( long v )          Writes `v' as an 8-byte binary signed integer.
writeShort( int v )          Writes `v' as a 2-byte binary signed integer.
writeBytes( String s )       Writes `s' as a sequence of bytes (high 8 bits of each character in `s' are discarded) in binary format.
writeChars( String s )       Writes `s' as a sequence of Java Unicode characters in binary format.
writeUTF( String s )         Writes `s' as a string using binary UTF-8 coding.
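The three string-writing methods produce different amounts of data for the same string, which matters when you compute offsets for seek(). The sketch below measures each one by watching the file pointer. The class name StringWriteDemo and the temporary file are made up for this illustration. One detail beyond the table: writeUTF precedes the encoded characters with a two-byte length and uses Java's modified form of UTF-8.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration only.
class StringWriteDemo {
    // Returns the bytes written for s by {writeBytes, writeChars, writeUTF}.
    static long[] bytesWritten(String s) {
        try {
            File file = File.createTempFile("string-demo", ".bin");
            file.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.writeBytes(s);                      // 1 byte per character
                long afterBytes = raf.getFilePointer();
                raf.writeChars(s);                      // 2 bytes per character
                long afterChars = raf.getFilePointer();
                raf.writeUTF(s);                        // 2-byte length + encoded characters
                long afterUtf = raf.getFilePointer();
                return new long[] {
                    afterBytes, afterChars - afterBytes, afterUtf - afterChars
                };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = bytesWritten("abc");
        System.out.println("writeBytes: " + sizes[0]); // 3
        System.out.println("writeChars: " + sizes[1]); // 6
        System.out.println("writeUTF:   " + sizes[2]); // 5
    }
}
```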


Following is an example program which:

1) Opens a (possibly existing) file.
2) Writes the integers 0-9 in binary format.
3) Gets and prints the length of the file.
4) Closes the file.
5) Re-opens the file, gets and prints the length.
6) Iterates over the file's contents and prints each integer.
7) Repositions the file pointer to the middle of the file.
8) Overwrites the middle integer with a `special' integer (-1).
9) Prints a message related to the file pointer's position.
10) Iterates over the file's contents and prints each integer.
11) Moves the file pointer to the end of the file and writes the `special' integer, extending the file's length.
12) Iterates over the entire file's contents (by catching EOFException) and prints.
13) Closes the file, returning all operating system resources, and quits.
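What follows is a minimal sketch of the thirteen steps above, not the author's listing. The class name IntFileDemo, the helper methods, and the use of a temporary file are assumptions made for illustration. try/finally is used so that the file is closed even if an I/O error occurs.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not the handout's original listing.
class IntFileDemo {
    static final int INT_SIZE = 4;  // an int always occupies 4 bytes
    static final int SPECIAL = -1;  // the `special' integer

    // Steps 6 and 10: read length()/4 ints, starting at offset 0.
    static List<Integer> readByLength(RandomAccessFile raf) throws IOException {
        List<Integer> values = new ArrayList<>();
        raf.seek(0);
        long count = raf.length() / INT_SIZE;
        for (long i = 0; i < count; i++) {
            values.add(raf.readInt());
        }
        return values;
    }

    // Step 12: read ints from offset 0 until EOFException is thrown.
    static List<Integer> readUntilEof(RandomAccessFile raf) throws IOException {
        List<Integer> values = new ArrayList<>();
        raf.seek(0);
        try {
            while (true) {
                values.add(raf.readInt());
            }
        } catch (EOFException endOfFile) {
            // Expected: the file pointer has reached the end of the file.
        }
        return values;
    }

    static List<Integer> run(File file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");   // step 1
        try {
            for (int i = 0; i < 10; i++) {
                raf.writeInt(i);                                    // step 2
            }
            System.out.println("Length: " + raf.length());         // step 3
        } finally {
            raf.close();                                            // step 4
        }

        raf = new RandomAccessFile(file, "rw");                    // step 5
        try {
            System.out.println("Length after re-opening: " + raf.length());
            System.out.println(readByLength(raf));                 // step 6
            long middle = (raf.length() / INT_SIZE / 2) * INT_SIZE;
            raf.seek(middle);                                       // step 7
            raf.writeInt(SPECIAL);                                  // step 8
            System.out.println("File pointer is now at offset "
                    + raf.getFilePointer());                        // step 9
            System.out.println(readByLength(raf));                 // step 10
            raf.seek(raf.length());                                 // step 11
            raf.writeInt(SPECIAL);
            List<Integer> all = readUntilEof(raf);                 // step 12
            System.out.println(all);
            return all;
        } finally {
            raf.close();                                            // step 13
        }
    }

    // Runs the steps on a fresh temporary file.
    static List<Integer> runOnTempFile() {
        try {
            File file = File.createTempFile("int-demo", ".bin");
            file.deleteOnExit();
            return run(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        runOnTempFile();
    }
}
```

On a fresh (empty) file, the sketch ends with the eleven integers 0 1 2 3 4 -1 6 7 8 9 -1 in a 44-byte file: the sixth integer was overwritten in place, and the last one was appended.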
