Compression | CodeGuru

Compression

Bruce Eckel’s Thinking in Java

Written By
CodeGuru Staff
Mar 1, 2001

Java 1.1 has also added some classes to support reading and writing streams in a compressed format. These are wrapped around existing IO classes to provide compression functionality.

One aspect of these Java 1.1 classes stands out: They are not derived from the new Reader and Writer classes, but instead are part of the InputStream and OutputStream hierarchies. So you might be forced to mix the two types of streams. (Remember that you can use InputStreamReader and OutputStreamWriter to provide easy conversion between one type and another.)

Java 1.1 compression class    Function
CheckedInputStream            getChecksum( ) produces a checksum for any InputStream (not just decompression)
CheckedOutputStream           getChecksum( ) produces a checksum for any OutputStream (not just compression)
DeflaterOutputStream          Base class for compression classes
ZipOutputStream               A DeflaterOutputStream that compresses data into the Zip file format
GZIPOutputStream              A DeflaterOutputStream that compresses data into the GZIP file format
InflaterInputStream           Base class for decompression classes
ZipInputStream                An InflaterInputStream that decompresses data that has been stored in the Zip file format
GZIPInputStream               An InflaterInputStream that decompresses data that has been stored in the GZIP file format
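
As the first two rows of the table note, the checked streams are not tied to compression. The following is a minimal sketch of that idea, written for a current JDK rather than Java 1.1. The class name StreamChecksum and the method crcOf( ) are hypothetical names chosen for illustration; only CheckedInputStream, CRC32, and getChecksum( ) come from java.util.zip.

```java
import java.io.*;
import java.util.zip.*;

// Hypothetical helper class, not part of the original chapter.
class StreamChecksum {
  // Reads the stream to its end and returns the
  // CRC32 of every byte that passed through it.
  static long crcOf(InputStream source) throws IOException {
    CheckedInputStream in =
      new CheckedInputStream(source, new CRC32());
    byte[] buf = new byte[1024];
    while(in.read(buf) != -1) {
      // Nothing to do: reading updates the checksum
    }
    in.close();
    return in.getChecksum().getValue();
  }

  public static void main(String[] args) throws IOException {
    // "123456789" is the conventional CRC test input
    byte[] data = "123456789".getBytes("US-ASCII");
    long crc = crcOf(new ByteArrayInputStream(data));
    System.out.println(Long.toHexString(crc)); // prints cbf43926
  }
}
```

Nothing here is compressed or decompressed; the checksum is simply a side effect of reading through the wrapper.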

Although there are many compression algorithms, Zip and GZIP are possibly the most commonly used. Thus you can easily manipulate your compressed data with the many tools available for reading and writing these formats.


Simple compression with GZIP

The GZIP interface is simple and thus is probably more appropriate when you have a single stream of data that you want to compress (rather than a collection of dissimilar pieces of data). Here’s an example that compresses a single file:

//: GZIPcompress.java
// Uses Java 1.1 GZIP compression to compress
// a file whose name is passed on the command
// line.
import java.io.*;
import java.util.zip.*;
 
public class GZIPcompress {
  public static void main(String[] args) {
    try {
      BufferedReader in =
        new BufferedReader(
          new FileReader(args[0]));
      BufferedOutputStream out =
        new BufferedOutputStream(
          new GZIPOutputStream(
            new FileOutputStream("test.gz")));
      System.out.println("Writing file");
      int c;
      // write(int) stores only the low 8 bits of each
      // char, so this is reliable only for ASCII text.
      while((c = in.read()) != -1)
        out.write(c);
      in.close();
      out.close();
      System.out.println("Reading file");
      BufferedReader in2 =
        new BufferedReader(
          new InputStreamReader(
            new GZIPInputStream(
              new FileInputStream("test.gz"))));
      String s;
      while((s = in2.readLine()) != null)
        System.out.println(s);
      in2.close();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~ 

The use of the compression classes is straightforward: you simply wrap your output stream in a GZIPOutputStream or ZipOutputStream and your input stream in a GZIPInputStream or ZipInputStream. All else is ordinary IO reading and writing. This is, however, a good example of when you’re forced to mix the old IO streams with the new: in uses the Reader classes, whereas GZIPOutputStream’s constructor can accept only an OutputStream object, not a Writer object.
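
One way to bridge the two hierarchies is to put an OutputStreamWriter between your Writer code and the GZIPOutputStream, and an InputStreamReader on the way back. The sketch below assumes a current JDK (it uses StandardCharsets, which did not exist in Java 1.1) and works in memory so that it needs no files. The class name GzipText and its two methods are hypothetical.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.*;

// Hypothetical helper class, not part of the original chapter.
class GzipText {
  // Character data goes in through a Writer; the
  // OutputStreamWriter encodes it to bytes for GZIP.
  static byte[] compress(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Writer out = new OutputStreamWriter(
      new GZIPOutputStream(bytes), StandardCharsets.UTF_8);
    out.write(text);
    out.close(); // Flushes the writer and finishes the GZIP data
    return bytes.toByteArray();
  }

  // The reverse path: GZIP bytes are decoded back
  // into characters by an InputStreamReader.
  static String decompress(byte[] data) throws IOException {
    Reader in = new InputStreamReader(
      new GZIPInputStream(new ByteArrayInputStream(data)),
      StandardCharsets.UTF_8);
    StringBuilder sb = new StringBuilder();
    int c;
    while((c = in.read()) != -1)
      sb.append((char)c);
    in.close();
    return sb.toString();
  }

  public static void main(String[] args) throws IOException {
    String original = "Gr\u00fc\u00dfe, compression!";
    String restored = decompress(compress(original));
    System.out.println(restored.equals(original)); // prints true
  }
}
```

Because the same named encoding is used in both directions, non-ASCII characters survive the round trip.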


Multi-file storage with Zip

The Java 1.1 library that supports the Zip format is much more extensive. With it you can easily store multiple files, and there’s even a separate class to make the process of reading a Zip file easy. The library uses the standard Zip format so that it works seamlessly with all the tools currently downloadable on the Internet. The following example has the same form as the previous example, but it handles as many command-line arguments as you want. In addition, it shows the use of the Checksum classes to calculate and verify the checksum for the file. There are two Checksum types: Adler32 (which is faster) and CRC32 (which is slower but slightly more accurate).
//: ZipCompress.java
// Uses Java 1.1 Zip compression to compress
// any number of files whose names are passed
// on the command line.
import java.io.*;
import java.util.*;
import java.util.zip.*;
 
public class ZipCompress {
  public static void main(String[] args) {
    try {
      FileOutputStream f =
        new FileOutputStream("test.zip");
      CheckedOutputStream csum =
        new CheckedOutputStream(
          f, new Adler32());
      ZipOutputStream out =
        new ZipOutputStream(
          new BufferedOutputStream(csum));
      out.setComment("A test of Java Zipping");
      // Can't read the above comment, though
      for(int i = 0; i < args.length; i++) {
        System.out.println(
          "Writing file " + args[i]);
        BufferedReader in =
          new BufferedReader(
            new FileReader(args[i]));
        out.putNextEntry(new ZipEntry(args[i]));
        int c;
        while((c = in.read()) != -1)
          out.write(c);
        in.close();
      }
      out.close();
      // Checksum valid only after the file
      // has been closed!
      System.out.println("Checksum: " +
        csum.getChecksum().getValue());
      // Now extract the files:
      System.out.println("Reading file");
      FileInputStream fi =
         new FileInputStream("test.zip");
      CheckedInputStream csumi =
        new CheckedInputStream(
          fi, new Adler32());
      ZipInputStream in2 =
        new ZipInputStream(
          new BufferedInputStream(csumi));
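      ZipEntry ze;
      while((ze = in2.getNextEntry()) != null) {
        System.out.println("Reading file " + ze);
        int x;
        while((x = in2.read()) != -1)
          System.out.write(x);
      }
      in2.close();
      // As with output, the checksum is meaningful
      // only after the data has been read; it covers
      // just the bytes actually read from the file.
      System.out.println("Checksum: " +
        csumi.getChecksum().getValue());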
      // Alternative way to open and read
      // zip files:
      ZipFile zf = new ZipFile("test.zip");
      Enumeration e = zf.entries();
      while(e.hasMoreElements()) {
        ZipEntry ze2 = (ZipEntry)e.nextElement();
        System.out.println("File: " + ze2);
        // ... and extract the data as before
      }
      zf.close();
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
} ///:~ 

For each file to add to the archive, you must call putNextEntry( ) and pass it a ZipEntry object. The ZipEntry object contains an extensive interface that allows you to get and set all the data available on that particular entry in your Zip file: name, compressed and uncompressed sizes, date, CRC checksum, extra field data, comment, compression method, and whether it’s a directory entry. However, even though the Zip format has a way to set a password, this is not supported in Java’s Zip library. And although CheckedInputStream and CheckedOutputStream support both Adler32 and CRC32 checksums, the ZipEntry class supports only an interface for CRC. This is a restriction of the underlying Zip format, but it might prevent you from using the faster Adler32.
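
To make the ZipEntry interface concrete, here is a small sketch that stores one entry in a temporary Zip file and then asks the entry for some of the data listed above. It assumes a current JDK; the class name EntryInfo, the method roundTrip( ), and the entry name digits.txt are hypothetical.

```java
import java.io.*;
import java.util.zip.*;

// Hypothetical helper class, not part of the original chapter.
class EntryInfo {
  // Stores data as a single entry in a temporary Zip
  // file, then returns that entry as read back.
  static ZipEntry roundTrip(String name, byte[] data)
      throws IOException {
    File zip = File.createTempFile("entry", ".zip");
    zip.deleteOnExit();
    ZipOutputStream out =
      new ZipOutputStream(new FileOutputStream(zip));
    out.putNextEntry(new ZipEntry(name));
    out.write(data);
    out.closeEntry();
    out.close();
    ZipFile zf = new ZipFile(zip);
    ZipEntry ze = zf.getEntry(name);
    zf.close();
    return ze;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "123456789".getBytes("US-ASCII");
    ZipEntry ze = roundTrip("digits.txt", data);
    System.out.println("name: " + ze.getName());
    System.out.println("size: " + ze.getSize());
    // The entry exposes a CRC only, never an Adler32
    System.out.println("crc: " + Long.toHexString(ze.getCrc()));
    System.out.println("directory: " + ze.isDirectory());
  }
}
```

The CRC reported by the entry is the CRC32 of the uncompressed data, so it can be checked against a CRC32 you compute yourself.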

To extract files, ZipInputStream has a getNextEntry( ) method that returns the next ZipEntry if there is one. As a more succinct alternative, you can read the file using a ZipFile object, which has a method entries( ) to return an Enumeration of the ZipEntry objects.
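
The ZipFile route can also deliver each entry’s data, through getInputStream( ). The following sketch, which assumes a current JDK with generics, writes a small temporary Zip file and reads it back into a map of entry names to text. The class name ZipFileRead, its methods, and the file names are hypothetical.

```java
import java.io.*;
import java.util.*;
import java.util.zip.*;

// Hypothetical helper class, not part of the original chapter.
class ZipFileRead {
  // Writes each name/content pair as one entry.
  static void writeZip(File zip, String[] names,
      String[] contents) throws IOException {
    ZipOutputStream out =
      new ZipOutputStream(new FileOutputStream(zip));
    for(int i = 0; i < names.length; i++) {
      out.putNextEntry(new ZipEntry(names[i]));
      out.write(contents[i].getBytes("UTF-8"));
      out.closeEntry();
    }
    out.close();
  }

  // Walks the entries with ZipFile and extracts the
  // data of each one via getInputStream( ).
  static Map<String, String> readZip(File zip)
      throws IOException {
    Map<String, String> result =
      new LinkedHashMap<String, String>();
    ZipFile zf = new ZipFile(zip);
    Enumeration<? extends ZipEntry> e = zf.entries();
    while(e.hasMoreElements()) {
      ZipEntry ze = e.nextElement();
      InputStream in = zf.getInputStream(ze);
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      int b;
      while((b = in.read()) != -1)
        buf.write(b);
      in.close();
      result.put(ze.getName(), buf.toString("UTF-8"));
    }
    zf.close();
    return result;
  }

  public static void main(String[] args) throws IOException {
    File zip = File.createTempFile("demo", ".zip");
    zip.deleteOnExit();
    writeZip(zip, new String[] { "a.txt", "b.txt" },
      new String[] { "alpha", "beta" });
    System.out.println(readZip(zip));
  }
}
```

Unlike ZipInputStream, which must be read front to back, ZipFile lets you pick out a single entry without touching the others.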

In order to read the checksum you must somehow have access to the associated Checksum object. Here, a handle to the CheckedOutputStream and CheckedInputStream objects is retained, but you could also just hold onto a handle to the Checksum object.

A baffling method in Zip streams is setComment( ). As shown above, you can set a comment when you’re writing a file, but there’s no way to recover the comment in the ZipInputStream. Comments appear to be supported fully on an entry-by-entry basis only via ZipEntry.
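
The sketch below illustrates the comment situation on a current JDK, where things are somewhat better than in Java 1.1: ZipFile gained a getComment( ) method in Java 7 that returns the archive comment, and entry comments can be read from the ZipEntry objects that ZipFile hands back. The class name ZipComments, its methods, and the comment strings are hypothetical.

```java
import java.io.*;
import java.util.zip.*;

// Hypothetical helper class, not part of the original chapter.
class ZipComments {
  // Writes a temporary Zip file carrying both an
  // archive comment and a per-entry comment.
  static File writeZip() throws IOException {
    File zip = File.createTempFile("comments", ".zip");
    zip.deleteOnExit();
    ZipOutputStream out =
      new ZipOutputStream(new FileOutputStream(zip));
    out.setComment("archive comment");
    ZipEntry entry = new ZipEntry("note.txt");
    entry.setComment("entry comment");
    out.putNextEntry(entry);
    out.write("hello".getBytes("UTF-8"));
    out.closeEntry();
    out.close();
    return zip;
  }

  // Returns { archive comment, entry comment },
  // both recovered through ZipFile.
  static String[] readComments(File zip) throws IOException {
    ZipFile zf = new ZipFile(zip);
    String archive = zf.getComment(); // Java 7 and later
    String entry = zf.getEntry("note.txt").getComment();
    zf.close();
    return new String[] { archive, entry };
  }

  public static void main(String[] args) throws IOException {
    String[] comments = readComments(writeZip());
    System.out.println(comments[0]);
    System.out.println(comments[1]);
  }
}
```

Note that this goes through ZipFile; the point made above about ZipInputStream still stands.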

Of course, you are not limited to files when using the GZIP or Zip libraries: you can compress anything, including data to be sent through a network connection.



The Java archive (jar) utility

The Zip format is also used in the Java 1.1 JAR (Java ARchive) file format, which is a way to collect a group of files into a single compressed file, just like Zip. However, like everything else in Java, JAR files are cross-platform so you don’t need to worry about platform issues. You can also include audio and image files as well as class files.

JAR files are particularly helpful when you deal with the Internet. Before JAR files, your Web browser would have to make repeated requests of a Web server in order to download all of the files that make up an applet. In addition, each of these files was uncompressed. By combining all of the files for a particular applet into a single JAR file, only one server request is necessary and the transfer is faster because of compression. And each entry in a JAR file can be digitally signed for security (refer to the Java documentation for details).

A JAR file consists of a single file containing a collection of zipped files along with a “manifest” that describes them. (You can create your own manifest file; otherwise the jar program will do it for you.) You can find out more about JAR manifests in the online documentation.
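
JAR files can also be produced and read from code. The sketch below uses the java.util.jar package, which arrived after Java 1.1 (in Java 1.2), so treat it as an illustration for later JDKs rather than something the Java 1.1 library offers. It builds a tiny JAR in memory with a minimal manifest and reads both back. The class name JarSketch, its methods, and the entry name Hello.txt are hypothetical.

```java
import java.io.*;
import java.util.*;
import java.util.jar.*;

// Hypothetical helper class, not part of the original chapter.
class JarSketch {
  // Builds an in-memory JAR holding one entry plus a
  // manifest that states only its own version.
  static byte[] makeJar(String entryName, byte[] content)
      throws IOException {
    Manifest manifest = new Manifest();
    manifest.getMainAttributes().put(
      Attributes.Name.MANIFEST_VERSION, "1.0");
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    JarOutputStream out = new JarOutputStream(bytes, manifest);
    out.putNextEntry(new JarEntry(entryName));
    out.write(content);
    out.closeEntry();
    out.close();
    return bytes.toByteArray();
  }

  // Lists the entry names that JarInputStream reports.
  static List<String> listEntries(byte[] jar)
      throws IOException {
    List<String> names = new ArrayList<String>();
    JarInputStream in =
      new JarInputStream(new ByteArrayInputStream(jar));
    JarEntry e;
    while((e = in.getNextJarEntry()) != null)
      names.add(e.getName());
    in.close();
    return names;
  }

  // Reads the Manifest-Version back from the JAR.
  static String manifestVersion(byte[] jar)
      throws IOException {
    JarInputStream in =
      new JarInputStream(new ByteArrayInputStream(jar));
    Manifest m = in.getManifest();
    in.close();
    return m.getMainAttributes().getValue(
      Attributes.Name.MANIFEST_VERSION);
  }

  public static void main(String[] args) throws IOException {
    byte[] jar = makeJar("Hello.txt", "hi".getBytes("UTF-8"));
    System.out.println(listEntries(jar));
    System.out.println(manifestVersion(jar));
  }
}
```

Since a JAR is a Zip file underneath, the same bytes could equally be read with ZipInputStream; the jar classes add the manifest handling on top.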

The jar utility that comes with Sun’s JDK automatically compresses the files of your choice. You invoke it on the command line:

jar [options] destination [manifest] inputfile(s)

The options are simply a collection of letters (no hyphen or any other indicator is necessary). These are:

c         Creates a new or empty archive.
t         Lists the table of contents.
x         Extracts all files.
x file    Extracts the named file.
f         Says: “I’m going to give you the name of the file.” If you don’t use this, jar assumes that its input will come from standard input, or, if it is creating a file, its output will go to standard output.
m         Says that a user-created manifest file is named on the command line. The manifest and archive names are given in the same order as the m and f letters.
v         Generates verbose output describing what jar is doing.
0         (The digit zero.) Only store the files; doesn’t compress the files (use to create a JAR file that you can put in your classpath).
M         Don’t automatically create a manifest file.

If a subdirectory is included in the files to be put into the JAR file, that subdirectory is automatically added, including all of its subdirectories, etc. Path information is also preserved.

Here are some typical ways to invoke jar:

jar cf myJarFile.jar *.class

This creates a JAR file called myJarFile.jar that contains all of the class files in the current directory, along with an automatically-generated manifest file.

jar cfm myJarFile.jar myManifestFile.mf *.class

Like the previous example, but adding a user-created manifest file called myManifestFile.mf.

jar tf myJarFile.jar

Produces a table of contents of the files in myJarFile.jar.

jar tvf myJarFile.jar

Adds the “verbose” flag to give more detailed information about the files in myJarFile.jar.

jar cvf myApp.jar audio classes image

Assuming audio, classes, and image are subdirectories, this combines all of the subdirectories into the file myApp.jar. The “verbose” flag is also included to give extra feedback while the jar program is working.

If you create a JAR file using the 0 (zero) option, that file can be placed in your CLASSPATH:

CLASSPATH="lib1.jar;lib2.jar;"

Then Java can search lib1.jar and lib2.jar for class files.

The jar tool isn’t as useful as a zip utility. For example, you can’t add files to or update files in an existing JAR file; you can create JAR files only from scratch. Also, you can’t move files into a JAR file, erasing them as they are moved. However, a JAR file created on one platform will be transparently readable by the jar tool on any other platform (a problem that sometimes plagues zip utilities).

As you will see in Chapter 13, JAR files are also used to package Java Beans.

