


Addressing Image Compression Techniques on Current Internet Technologies

Eduardo J. Moreira

Onyeka Ezenwoye

Florida International University

School of Computer Science

CIS 6931 Advanced Topics of Information Processing

Dr. Shu-Ching Chen

Spring 2001

TABLE OF CONTENTS

Introduction

Background

History of Data Compression

Compression Types

Run Length Encoding

Huffman Coding

Arithmetic Coding

JPEG Joint Photographic Experts Group

GIF Graphics Interchange Format

PNG Portable Network Graphic Format (method 0)

MPEG Moving Picture Expert Group

Conclusion

References

INTRODUCTION

Data compression rests on the assumption that when transmitting data, whether it be images, music, video and so on, one can benefit in the size and transmission times associated with such endeavors. These days, with the ever-growing World Wide Web, it is easy to understand the growing need for transmission speed and overall network performance. People have grown accustomed to instant gratification, which makes the wait for, say, an image to appear on the screen all the more noticeable. This is one of the most important motivations for data compression, and this paper will focus on some of the most widely used formats and algorithms in this field. First, let us focus on the actual definitions.

Packing data is said to be the storing of data in a format that requires less space than usual. Compressing data is the same as packing data. Data compression is the term often used in reference to packing data, particularly when it involves data communications. Data compression is particularly useful in communications because it enables devices to transmit and store the same amount of data in fewer bits. A file stored this way is said to be in a compressed format. Many operating systems and applications contain commands that enable you to pack a file so that it takes up less space. For example, suppose you have a text file containing ten consecutive space characters. Normally, this would require ten bytes of storage. However, a program that packs files would replace the space characters with a special space-series character followed by the number of spaces being replaced. In this case, the ten spaces would require only two bytes. This is just one packing technique; there are many others.

Data compression is also widely used in backup utilities, spreadsheet applications, and database management systems. Certain types of data, such as bit-mapped graphics, can be compressed to a small fraction of their normal size. Some modems automatically pack data before transmitting it across communications lines. This can produce faster communication because fewer bytes need to be sent. However, the modem on the receiving side must be capable of unpacking the data.

In this paper, we will look at the fundamentals of data compression, while focusing mainly on its application in the compression of images.

We will discuss the relative strengths and weaknesses of image formats and the corresponding compression algorithms well suited to sharing simple graphical images across multiple hardware/operating-system platforms and distributed systems. The specific formats discussed are GIF, JPEG, and PNG, and the compression methods we will focus on are those used by JPEG, GIF, and PNG (method 0).

BACKGROUND

In the past, documents were stored on paper and kept in filing cabinets. This can be seen to be very inefficient in terms of storage space and also the time taken to locate and retrieve information when required. Storing and accessing documents electronically through computers is now replacing this traditional method of storing documents. This has enabled us to manage things more efficiently and effectively, so that items can be located and information extracted without undue expense or inconvenience.

In terms of storage, the capacity of a storage device can be effectively increased with methods that compress a body of data on its way to the storage device and decompress it when it is retrieved.

In terms of communications, compressing data at the sending end and decompressing data at the receiving end can effectively increase the bandwidth of a digital communication link.

To put it simply:

“Data compression saves time and money”

Let's consider an everyday data compression application: facsimile, better known as fax.

Exchanging faxes every day is one of those inevitable tasks of every business, but fax machines would not be practical without data compression algorithms. The image of your document is not sent over the phone line as is: the fax machine compresses the data before sending it, to reduce the time needed to transmit the document. This reduces the cost of transmission by a factor of 10 or more. Without data compression, a 50-cent fax transmission could cost as much as $5.

Data compression has many uses in modern networking environments. Compressing large files can significantly reduce file transfer times. This might prove useful for retrieving database information within a company's internal network, or for downloading images on an Internet web page.

At any given time, the ability of the Internet to transfer data is fixed. Thus, if data can be compressed wherever possible, significant improvements in data throughput can be achieved. Many files can also be combined into one compressed archive, making them easier to send.

A BRIEF HISTORY OF DATA COMPRESSION

Data compression transforms data into a new representation that typically has less redundancy, based on prediction using some deterministic (but possibly adaptive) model. If data can be predicted with high confidence, it can be compressed very well. The primary goal of any data compression algorithm is the identification and extraction of the source's redundancy.

The late 1940s were the early years of information theory, when the idea of developing efficient new coding methods was just starting to be fleshed out. Ideas of entropy, information content and redundancy were explored. One popular notion held that if the probability of the symbols in a message were known, there ought to be a way to code the symbols so that the message would take up less space.

The first well-known method for compressing digital signals is now known as Shannon-Fano coding. Shannon and Fano [1948] developed this algorithm independently; it assigns binary code words to the unique symbols that appear within a given data file. While Shannon-Fano coding was a great leap forward, it had the unfortunate luck to be quickly superseded by an even more efficient coding system: Huffman coding.

Huffman coding [1952] shares most of the characteristics of Shannon-Fano coding. Huffman coding can perform effective data compression by reducing the amount of redundancy in the coding of symbols, and it has been proven optimal among codes that assign a whole (integer) number of bits to each symbol.

In the last fifteen years, Huffman coding has been replaced by arithmetic coding. Arithmetic coding bypasses the idea of replacing an input symbol with a specific code. It replaces a stream of input symbols with a single floating-point output number. More bits are needed in the output number for longer, complex messages. 

Dictionary-based compression algorithms use a completely different method to compress data. They encode variable-length strings of symbols as single tokens, where each token forms an index into a phrase dictionary. If the tokens are smaller than the phrases, they replace the phrases and compression occurs. Two dictionary-based compression techniques called LZ77 and LZ78 have been developed. LZ77 is a "sliding window" technique in which the dictionary consists of the phrases found in a fixed-size "window" into the previously seen text. LZ78 takes a completely different approach to building a dictionary: instead of using phrases from a window into the text, LZ78 builds phrases up one symbol at a time, adding a new symbol to an existing phrase when a match occurs.
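
To make the dictionary idea concrete, here is a minimal LZ78-style encoder and decoder sketched in Python. The function names, the (index, symbol) token layout, and the sample string are illustrative assumptions, not taken from any particular standard or implementation.

```python
def lz78_encode(text):
    """Minimal LZ78-style encoder: emits (dictionary index, next symbol) tokens.

    Index 0 stands for the empty phrase; the dictionary grows one phrase at a time.
    """
    dictionary = {}                      # phrase -> index
    tokens = []
    phrase = ""
    for symbol in text:
        candidate = phrase + symbol
        if candidate in dictionary:
            phrase = candidate           # keep extending the current match
        else:
            tokens.append((dictionary.get(phrase, 0), symbol))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ""
    if phrase:                           # flush a trailing match, if any
        tokens.append((dictionary[phrase], ""))
    return tokens


def lz78_decode(tokens):
    """Rebuild the original text by replaying the dictionary construction."""
    phrases = [""]                       # index 0 is the empty phrase
    out = []
    for index, symbol in tokens:
        phrase = phrases[index] + symbol
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)


if __name__ == "__main__":
    sample = "ABABABAABABABAA"
    tokens = lz78_encode(sample)
    print(tokens)                        # phrases get longer as the dictionary grows
    assert lz78_decode(tokens) == sample
```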

These various compression methods will be covered in more detail in the following sections. Now we’ll discuss the issue of compression types.

COMPRESSION TYPES

Data compression can be divided into two main types, lossless and lossy compression.

Lossless compression recovers the exact original data after decompression. It is used mainly for compressing database records, spreadsheets, word processing files, and program files, where exact replication of the original is essential. Examples of lossless techniques include run-length and Huffman coding.

Lossy compression will result in a certain loss of accuracy in exchange for a substantial increase in compression. Lossy compression is more effective when used to compress graphic images and digitized voice where losses outside visual or aural perception can be tolerated. Most lossy compression techniques can be adjusted to different quality levels, gaining higher accuracy in exchange for less effective compression. Examples of lossy methods are JPEG and MPEG.

RUN LENGTH ENCODING

This lossless compression algorithm uses a quite simple technique that can achieve compression ratios as good as 8 to 1. The whole idea is based on replacing multiple occurrences of a symbol with only one copy and a count of how many times that symbol appears. For example if the string AAABBBCCCCCCCCCDDDDDEEEEFFFFGGGGG appears in a file then it will be encoded as 3A3B9C5D4E4F5G.

This coding mechanism can also be used to compress digital images by comparing pixels that appear adjacent to each other and storing only the changes. The run-length encoding technique is very effective when encoding images that contain large amounts of white space or large homogeneous areas. However, because it uses two bytes to represent each symbol, it does not work well when encoding files that contain even a small degree of variation among adjacent pixels; in those cases the compression algorithm can actually increase the file size. This compression technique is used frequently for fax transmissions, which usually contain large amounts of white space that can be removed.
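
As a small sketch of the idea described above (assuming the simple count-plus-symbol representation rather than any particular file format), the following Python function reproduces the example encoding and also shows how varied data can grow instead of shrink.

```python
def rle_encode(data):
    """Run-length encode a string as count+symbol pairs (an illustrative sketch)."""
    if not data:
        return ""
    out = []
    run_symbol, run_length = data[0], 1
    for symbol in data[1:]:
        if symbol == run_symbol:
            run_length += 1              # extend the current run
        else:
            out.append(f"{run_length}{run_symbol}")
            run_symbol, run_length = symbol, 1
    out.append(f"{run_length}{run_symbol}")
    return "".join(out)


if __name__ == "__main__":
    # The example from the text above: 33 characters shrink to 14.
    print(rle_encode("AAABBBCCCCCCCCCDDDDDEEEEFFFFGGGGG"))   # 3A3B9C5D4E4F5G
    # With no runs to exploit, the "compressed" output is twice as long.
    print(rle_encode("ABCD"))                                 # 1A1B1C1D
```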

HUFFMAN CODING

Huffman coding is one of the most important early developments in the field of data compression. It uses a simple strategy to achieve savings of about 25 percent on average, and in many instances savings of as much as 50 to 60 percent are possible when compressing large files. The algorithm is based on creating a variable-length character code, mapping the characters that appear most frequently in a file to the code words with the smallest lengths.

Code words are represented as variable-length binary strings, and each binary string is mapped to a different character found in the file. A frequency distribution of the characters found in the file being analyzed for compression must first be created. This distribution is then used to decide which codeword will be assigned to each of the symbols that appear in the file. Let us consider how this concept can be applied when compressing a file composed of printable ASCII characters. The ASCII character set consists of approximately 100 printable characters, so to uniquely represent each printable ASCII character we would need the ceiling of log2(100), or 7-bit strings. This type of representation does not take into consideration the frequency at which the characters appear in the file, so we would not accomplish the desired space savings. Let's look at a simple example of how taking into account the frequency at which characters appear in a file allows us to save space. Suppose that we are trying to compress a file composed of 6 distinct characters with the following distribution: character c appears 100,000 times, d appears 30,000 times, y appears 5,000 times, t appears 1,000 times, r appears 50 times, and z appears 25 times. Then we can assign the codeword with the shortest length to c, since c appears with the highest frequency. The following code words could be assigned to the file characters:

USING HUFFMAN CODES

Character   Codeword   Space required = (codeword length) * (character frequency)
----------------------------------------------------------------
c           0          1 * 100,000 = 100,000
d           101        3 * 30,000  =  90,000
y           100        3 * 5,000   =  15,000
t           111        3 * 1,000   =   3,000
r           1101       4 * 50      =     200
z           1100       4 * 25      =     100
----------------------------------------------------------------
Total bits required                   208,300
----------------------------------------------------------------

USING FIXED-SIZE STRINGS

Character   Codeword   Space required = (codeword length) * (character frequency)
----------------------------------------------------------------
c           000        3 * 100,000 = 300,000
d           001        3 * 30,000  =  90,000
y           010        3 * 5,000   =  15,000
t           011        3 * 1,000   =   3,000
r           100        3 * 50      =     150
z           101        3 * 25      =      75
----------------------------------------------------------------
Total bits required                   408,225
----------------------------------------------------------------

By using Huffman codes, we were able to accomplish a savings of 199,925 bits over fixed-size strings. Please note that if all characters occurred with the same frequency in a file, Huffman codes would provide little or no savings.

After considering the savings that can be accomplished when using Huffman coding for compression, the question arises: how do we build the required code? The classic scheme starts from the frequency distribution table of the characters appearing in the file and repeatedly merges the two least frequently occurring symbols (or subtrees) into a new subtree whose weight is the sum of their frequencies, until a single tree remains. A full binary tree is used to represent the optimal code for a file, with data only at the leaves; a zero indicates the left branch and a one the right branch. Characters with the highest frequencies end up closer to the root of the tree, and therefore receive shorter codewords, than characters with lower frequencies. Once the code words have been defined for each character, encoding the file is simply a matter of concatenating the code words representing the characters in the file. Decoding is then a matter of identifying the codeword associated with each character and repeating the process for the entire encoded file.
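
A minimal sketch of this construction is given below in Python, using a heap to repeatedly merge the two least frequent subtrees; the function name and the dictionary-of-codes representation are illustrative choices. Note that, applied to the frequencies in the table above, the greedy construction actually yields somewhat shorter codewords (and a lower bit total) than the illustrative assignment shown there.

```python
import heapq


def huffman_codes(frequencies):
    """Build a Huffman code from a {symbol: frequency} map; returns {symbol: bit string}."""
    # Each heap entry is (subtree frequency, tie-breaker, {symbol: partial code}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)                       # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}          # left branch gets a 0
        merged.update({s: "1" + c for s, c in right.items()})   # right branch gets a 1
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]


if __name__ == "__main__":
    freqs = {"c": 100_000, "d": 30_000, "y": 5_000, "t": 1_000, "r": 50, "z": 25}
    codes = huffman_codes(freqs)
    total_bits = sum(len(codes[s]) * f for s, f in freqs.items())
    print(codes)
    print("Total bits required:", total_bits)
```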

ARITHMETIC CODING

Arithmetic coding encodes symbols using non-integer numbers of bits. The coding is interval based: the entire input is represented as a single real number R; for example, one particular message might be represented by any number in the range between 0.37 and 0.46. The bigger the input data, the more precision is needed for R.

One of the main goals of lossless compression methods is the decomposition of data into a sequence of events. These events need to be encoded in as few bits as possible without losing any of the information, so we want to give shorter codes to events that are determined to be more likely; the data then compresses whenever the more probable events occur. If a source produces distinct events ev1, ev2, ..., evn, together they define a probability distribution P = (p1, ..., pn), and the smallest possible expected number of bits per event is the entropy H(P) = -sum over i of pi * log2(pi).
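
The interval-narrowing idea can be sketched as a toy encoder in Python. The probability table, message, and use of ordinary floating-point arithmetic are illustrative assumptions; production arithmetic coders use scaled integer arithmetic and an explicit termination convention to avoid running out of precision on long messages.

```python
def arithmetic_encode(message, probabilities):
    """Toy arithmetic coder: narrow [low, high) once per symbol and return one
    number inside the final interval that identifies the whole message."""
    # Build cumulative ranges, e.g. {"a": (0.0, 0.6), "b": (0.6, 0.9), "c": (0.9, 1.0)}.
    ranges, cumulative = {}, 0.0
    for symbol, p in probabilities.items():
        ranges[symbol] = (cumulative, cumulative + p)
        cumulative += p

    low, high = 0.0, 1.0
    for symbol in message:
        span = high - low
        sym_low, sym_high = ranges[symbol]
        low, high = low + span * sym_low, low + span * sym_high
    return (low + high) / 2              # any value in [low, high) would do


if __name__ == "__main__":
    probs = {"a": 0.6, "b": 0.3, "c": 0.1}
    # Likely symbols narrow the interval only a little, so fewer bits are needed.
    print(arithmetic_encode("aabac", probs))
```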

JPEG – JOINT PHOTOGRAPHIC EXPERTS GROUP

JPEG was developed by the Joint Photographic Experts Group committee to compress gray-scale or color images. The format is well suited to real-world photographs and scenic depictions of nature, but it has not been shown to work well with line drawings, cartoon animation, and other similar artwork. The type of algorithm used to compress and decompress the image makes this a "lossy" compression: the result is of somewhat lower quality than the original, although JPEG was designed to exploit the limitations of the human eye's perception of minor changes in color and brightness. It was designed to be viewed by the human eye, not analyzed by machines, which would pick up the changes in a compressed image as small errors and could cause problems if used in conjunction with image-comparison algorithms. Nevertheless, the amount of compression that can be accomplished by this lossy method is far greater than with "lossless" methods. One reason that JPEG images are well liked is the flexibility one has to create smaller, lower-quality images or larger, higher-quality ones by changing the compression parameters. This makes it extremely useful to a broad range of real-world applications. For example, if the main objective is to make the image as small as possible for transfer through a network, then you simply ask yourself "What is the lowest quality we need?" and adjust accordingly. Another important aspect of JPEG is that one can trade decoding speed against image quality, by using inexpensive approximations instead of exact calculations in the decoder.

There are a couple of good reasons to use JPEG: the first is the compression advantage, and the second is the storage of 24-bit color per pixel instead of the usual 8-bit color per pixel. Moving a 100-KByte image file across a network instead of a 2-MByte full-color file makes a big difference in disk usage and transmission time. There is a drawback in the time it takes to decode and view a JPEG compared with a different format such as GIF; however, the savings in network transmission time are greater than the extra time taken to decompress the file. Another advantage of JPEG is its ability to store 24 bits/pixel (16 million colors), making it a great choice for high-quality, high-definition photographs. We also need to understand that these image formats can never duplicate exactly what the human eye sees.

The best-known lossless compression methods accomplish about 2:1 compression on average; using JPEG we can achieve 10:1 up to 20:1 compression without visible loss, and 30:1 to 50:1 compression with small to moderate loss of quality. For very low-quality purposes such as previews, one could go as far as a 100:1 ratio. Gray-scale images do not compress by as large a factor as color images. Because the human eye is more sensitive to brightness variations than to hue variations, JPEG can compress hue (chrominance) data much more aggressively than brightness (luminance) data. Compressed gray-scale images are therefore usually only 10-20 percent smaller than comparable color images, even though the uncompressed gray-scale data is only eight bits/pixel, approximately a third the size of a full-color image of the same quality.

Let us talk about some of the different compression modes that are all known as JPEG. The first is baseline JPEG, which is stored as one top-to-bottom scan of the image. Progressive JPEG, by contrast, divides the image file into a series of scans. The first scan shows the image at a very low quality setting and takes little space; each subsequent scan returns a higher-quality rendering of the image. The storage requirements are roughly the same as baseline, but the viewer sees an approximation of the whole image fairly quickly. This format is great for transferring images across the World Wide Web at slow modem speeds.
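
As a rough illustration of these trade-offs, the sketch below assumes the Pillow imaging library is installed; the synthetic gradient image, file names, and quality values are arbitrary placeholders.

```python
from PIL import Image   # Pillow is assumed to be installed

# A stand-in "photograph": a 256x256 RGB gradient (a real photo behaves similarly).
img = Image.new("RGB", (256, 256))
img.putdata([(x, y, (x + y) // 2) for y in range(256) for x in range(256)])

# Baseline JPEG: a single top-to-bottom scan; lower quality gives a smaller file.
img.save("baseline_q75.jpg", format="JPEG", quality=75)
img.save("baseline_q30.jpg", format="JPEG", quality=30)

# Progressive JPEG: a series of scans, each refining the previous rough approximation.
img.save("progressive_q75.jpg", format="JPEG", quality=75, progressive=True)
```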

GIF – GRAPHIC INTERCHANGE FORMAT

GIF was created as CompuServe's standard for graphic image data exchange. There are two versions (GIF standards), GIF87a and GIF89a, with only minor differences between them. These days GIF is becoming less used, mainly because of the limited number of colors that can be used to represent image data. An image cannot have a bit depth of more than eight bits per pixel, meaning that a maximum of 256 separate colors can be used in a single GIF image. Sometimes more than one image is stored within a GIF file; this is called an animation, and it is accomplished by writing the images to the screen one after the other with some timing sequence between them. This animation method is still used throughout the web. Image data within a GIF is compressed using a modified LZW (Lempel-Ziv-Welch) algorithm.

GIF works with indexed images of 256 colors or fewer. It performs best when the image being compressed is composed of a large percentage of solid color across a wide portion of the image. Whereas other methods record the actual color value of every pixel, this method records one pixel's color value and then groups together the following pixels that share the same value. GIFs can be interlaced, allowing for progressively clearer images when downloading from the web in a browser; this helps users start seeing the image without waiting for it to be downloaded completely. The algorithm is a "lossless" one: the quality of the image is in no way compromised by compression, and the original image data is restored exactly when the file is opened. If you specify fewer than the full 256 colors for the image, you can reduce the file size. When indexing an image you should choose the lowest number of colors that still provides the necessary quality, so that the file size is as small as it needs to be.
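
A small sketch of color-index reduction, assuming the Pillow imaging library and a synthetic stand-in image, shows how choosing fewer palette entries typically shrinks the resulting GIF at the cost of color fidelity.

```python
from PIL import Image   # Pillow is assumed to be installed

# A stand-in image: a 128x128 RGB gradient in place of real artwork.
img = Image.new("RGB", (128, 128))
img.putdata([(x * 2, y * 2, 128) for y in range(128) for x in range(128)])

# GIF stores at most 256 indexed colors; quantizing to fewer colors usually
# produces a smaller file.
for colors in (256, 64, 16):
    img.quantize(colors=colors).save(f"indexed_{colors}.gif")
```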

If we use the enhanced GIF89a format, we can designate one color as transparent. This allows GIF images to appear with irregular shapes instead of the normal rectangle; it is achieved by making the border color transparent, giving the image a cut-out effect. Keep in mind that if you choose a color that also appears inside the image, you can distort it.

PNG – PORTABLE NETWORK GRAPHIC FORMAT (method 0)

Currently, method 0 is the only compression method defined for PNG. This method is based on the Lempel-Ziv 77 (LZ77) algorithm and is very similar to the deflate compression used by "zip" programs like WinZip and PKZIP. Because of this, we get the benefit of 15-35 percent higher compression. The algorithm is lossless and in turn works very well with PNG's true-color, palette, and grayscale color modes. If we compare JPEG and PNG on true-color and grayscale images, JPEG generally gives better compression performance, and in turn smaller file sizes and faster transmission, while PNG method 0 offers higher image quality. For all intents and purposes, both JPEG and PNG seem to have a good future on the ever-growing and changing World Wide Web.

PNG also uses various filtering techniques, which are related to compression but distinct from it. Filtering is applied to the bytes of data before any compression takes place, which in turn prepares the data for optimal compression. It has been shown that this pre-compression filtering works best when applied to true-color images, though each individual image ultimately determines the final outcome; one needs to test different filters to see which works best for a given image.
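
As an illustration, the sketch below (assuming NumPy, one byte per grayscale pixel, and illustrative names) applies PNG's "Sub" filter, one of the five filter types defined for method 0, to a smooth gradient; the filtered output is a highly repetitive sequence that the LZ77/deflate stage then compresses well.

```python
import numpy as np


def sub_filter(scanline):
    """PNG 'Sub' filter (type 1): replace each byte with its difference from the
    byte one pixel to the left, modulo 256 (assuming one byte per pixel)."""
    line = np.asarray(scanline, dtype=int)
    previous = np.concatenate(([0], line[:-1]))   # byte to the left; 0 before the first pixel
    return ((line - previous) % 256).astype(np.uint8)


if __name__ == "__main__":
    gradient = list(range(10, 90, 8))             # a smooth ramp: 10, 18, 26, ...
    print(sub_filter(gradient))                   # mostly the constant value 8
```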

In conclusion, PNG method 0 supports 24-bit color and provides a good transparency scheme, along with gamma and interlacing support. These and other reasons, I believe, will contribute to the staying power of this format.

MPEG – MOVING PICTURE EXPERT GROUP

MPEG compression has become one of the most popular video formats in use today. It employs several compression techniques, which will be discussed in more detail below.

MPEG uses three main techniques to achieve compression ratios of up to 30:1. The first is a color conversion, which transforms a 24-bit RGB picture element into a 16- or 12-bit signal. RGB is composed of three colors, red, green, and blue, each encoded using 8 bits with 256 levels. The original image resolution is kept for luminance while the color-difference (chrominance) components are reduced; this exploits the fact that the human eye can detect finer resolution in contrast (brightness) than in color.
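
The color-conversion step can be sketched with the commonly published BT.601-style RGB-to-YCbCr weights. NumPy, the function names, and the 2x2 averaging used here for 4:2:0 subsampling are illustrative assumptions, not taken from the MPEG specification itself.

```python
import numpy as np


def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB image to YCbCr using the usual BT.601 weights.

    Y carries brightness and is kept at full resolution; Cb and Cr carry the
    color differences and are the planes that get subsampled."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)


def subsample_420(plane):
    """Keep one chroma sample per 2x2 block (simple averaging), i.e. 4:2:0."""
    h, w = plane.shape[0] // 2 * 2, plane.shape[1] // 2 * 2
    p = plane[:h, :w]
    return (p[0::2, 0::2] + p[0::2, 1::2] + p[1::2, 0::2] + p[1::2, 1::2]) / 4.0


if __name__ == "__main__":
    img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
    ycc = rgb_to_ycbcr(img)
    print(ycc[..., 0].shape, subsample_420(ycc[..., 1]).shape)   # (4, 4) (2, 2)
```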

The second technique is the quantized Discrete Cosine Transform (DCT). The transform converts the block of intensity values into an equal number of cosine-frequency coefficients, which concentrates most of the perceptible variation in the block toward the upper left-hand corner. A predetermined quantization matrix, which weights different frequencies according to their impact on human perception, is then used to divide the DCT matrix. The quantized matrix usually has zeroes in all but a few elements, and the non-zero elements are concentrated at the upper left corner of the (usually 8x8) block. This block is fed into a run-length and Huffman encoding stage, which reduces the 128-byte block to the non-zero coefficients, each followed by a number representing the count of zero coefficients before the next significant coefficient, with the sequence padded to its full length with zeros. This step yields significant compression and results in variable-size intraframes.
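
A minimal sketch of the transform-and-quantize step follows, assuming NumPy; the flat quantization value of 16 stands in for a real perceptually weighted matrix, and the function names are illustrative.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, the transform used on 8x8 blocks."""
    c = np.zeros((n, n))
    for u in range(n):
        alpha = np.sqrt(1.0 / n) if u == 0 else np.sqrt(2.0 / n)
        for x in range(n):
            c[u, x] = alpha * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    return c


def quantized_dct(block, q_matrix):
    """2-D DCT of an 8x8 block followed by division by a quantization matrix.

    After quantization most high-frequency coefficients round to zero, leaving
    the significant values clustered near the top-left corner of the block."""
    c = dct_matrix(block.shape[0])
    coeffs = c @ (block - 128.0) @ c.T      # level-shift, then transform rows and columns
    return np.round(coeffs / q_matrix).astype(int)


if __name__ == "__main__":
    # A nearly flat block with a gentle gradient; a uniform quantizer of 16.
    block = np.fromfunction(lambda y, x: 120 + 2 * x + y, (8, 8))
    q = np.full((8, 8), 16.0)
    print(quantized_dct(block, q))          # only a few non-zero entries, near the top-left
```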

The third technique is motion estimation and error coding. We assume that most of the blocks in one frame of a scene will also exist in the next frame, though possibly at different locations.
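
A toy exhaustive block-matching search (assuming NumPy; the 8x8 block size, the +/-4 pixel search range, and the function names are illustrative) shows how a motion vector and a residual block are obtained.

```python
import numpy as np


def best_match(previous, current, block_y, block_x, block=8, search=4):
    """Exhaustive block matching using the sum of absolute differences (SAD).

    Finds where the block at (block_y, block_x) in the current frame best matches
    the previous frame within +/- `search` pixels, and returns the motion vector
    together with the residual block that would be error-coded."""
    target = current[block_y:block_y + block, block_x:block_x + block].astype(int)
    best_vec, best_sad, best_ref = (0, 0), None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = block_y + dy, block_x + dx
            if y < 0 or x < 0 or y + block > previous.shape[0] or x + block > previous.shape[1]:
                continue                     # candidate block falls outside the frame
            ref = previous[y:y + block, x:x + block].astype(int)
            sad = np.abs(target - ref).sum()
            if best_sad is None or sad < best_sad:
                best_vec, best_sad, best_ref = (dy, dx), sad, ref
    return best_vec, target - best_ref       # motion vector plus residual to encode


if __name__ == "__main__":
    prev = np.random.randint(0, 256, (32, 32))
    curr = np.roll(prev, shift=(1, 2), axis=(0, 1))    # the scene shifts down/right by (1, 2)
    vec, residual = best_match(prev, curr, 8, 8)
    print(vec, np.abs(residual).sum())                  # expect (-1, -2) and a zero residual
```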

Both MPEG-1 and MPEG-2 use a group of pictures of size 15, of which the first frame is called the I-frame, or intraframe. This frame is much larger than the rest and does not use motion estimation; such frames occur roughly every half second in a 30-frames-per-second video. P-frames (predictive frames) are also used: they use motion estimation from the previous I- or P-frame and contain motion vectors along with error-correction information for each macroblock. B-frames (interpolated frames) are interpolated between the non-B-frames on either side. By using two motion vectors per macroblock and interpolating between two reference frames, we decrease the size of the error correction, which results in a smaller frame.

Another format under development, called MPEG-7, will provide a standardized set of tools to facilitate the description of multimedia content. Unlike the previous MPEG standards, which focused on the coded representation of audio and visual data, the main focus here is the representation of information about the content, not the content itself. This in turn will provide a much-needed indexing standard, which will help in accessing and managing multimedia objects, whose number is growing out of control as more data streams are sent through the Internet and other networks. Volume and complexity play a major role in the shortage of indexing, searching, and retrieval algorithms for multimedia resources. Among the objectives to be addressed are Descriptors, which provide feature information about a multimedia resource; Description Schemes, which define relationships between structures of Descriptors; and a Description Definition Language, which defines both Description Schemes and Descriptors. Together these will enable us to describe audiovisual resources regardless of storage, transmission, medium, and technology.

CONCLUSION

With the ever-growing volume of multimedia data being transmitted over private and public networks, one can see that compression is inseparable from the data itself. We constantly strive to send more data in a timely manner, even as network technologies struggle to keep up. In the past, text and simple graphics were the norm for data transmitted over networks, private and public. Today this has drastically changed: streaming video, music, and other high-end media are fast becoming the norm rather than the exception, saturating the infrastructure we currently have in place.

Because of the growing need to send more data faster, data compression is at the forefront of this endeavor. As new compression techniques are developed and implemented, we will see more and more data streams being transmitted, giving the world's networks a more productive means of supporting the next generation of multimedia formats.

REFERENCES

S. Golomb, "Run-Length Encodings," IEEE Transactions on Information Theory, vol. IT-12, July 1966.

W. Pennebaker and J. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, New York.

Larry L. Peterson and Bruce S. Davie, Computer Networks: A Systems Approach, 1996. ISBN 1-55860-368-9.

Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest, Introduction to Algorithms, 1990.

Ian H. Witten, Alistair Moffat, and Timothy C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images.

J. F. K. Buford, Multimedia Systems, ACM Press/Addison-Wesley, 1994.

J. Ziv and A. Lempel, "Compression of Individual Sequences via Variable-Rate Coding," IEEE Transactions on Information Theory, 1978.

D. E. Knuth, "Dynamic Huffman Coding," Journal of Algorithms, June 1985, pp. 163-175.

P. G. Howard and J. S. Vitter, "New Methods for Lossless Image Compression Using Arithmetic Coding," Information Processing and Management, 28 (1992).

X. Wu, W. K. Choi, and N. D. Memon, "Lossless Interframe Image Compression," Data Compression Conference, March 1998.

JPEG WWW Home

MPEG WWW Home
