|
| - |
The
new collections
To
me, collection classes are one of the most powerful tools for raw programming.
You might have gathered that I’m somewhat disappointed in the collections
provided in Java through version 1.1. As a result, it’s a tremendous
pleasure to see that collections were given proper attention in Java
1.2, and thoroughly redesigned (by Joshua Bloch at Sun). I consider the new
collections to be one of the two major features in Java 1.2 (the other is the
Swing library, covered in Chapter 13) because they significantly increase your
programming muscle and help bring Java in line with more mature programming
systems.
Some
of the redesign makes things tighter and more sensible. For example, many names
are shorter, cleaner, and easier to understand, as well as to type. Some names
are changed to conform to accepted terminology: a particular favorite of mine
is “iterator” instead of “enumeration.”
The
redesign also fills out the functionality of the collections library. You can
now have the behavior of linked
lists, queues,
and dequeues
(double-ended queues, pronounced “decks”).
The
design of a collections library is difficult (true of most library design
problems). In C++,
the STL covered the bases with many different classes. This was better than
what was available prior to the STL (nothing), but it didn’t translate
well into Java. The result was a rather confusing morass of classes. On the
other extreme, I’ve seen a collections library that consists of a single
class, “collection,” which acts like a
Vector
and
a
Hashtable
at
the same time. The designers of the new collections library wanted to strike a
balance: the full functionality that you expect from a mature collections
library, but easier to learn and use than the STL and other similar collections
libraries. The result can seem a bit odd in places. Unlike some of the
decisions made in the early Java libraries, these oddities were not accidents,
but carefully considered decisions based on tradeoffs in complexity. It might
take you a little while to get comfortable with some aspects of the library,
but I think you’ll find yourself rapidly acquiring and using these new
tools.
The
new collections library takes the issue of “holding your objects”
and divides it into two distinct concepts:
- Collection:
a group of individual elements, often with some rule applied to them. A List
must hold the elements in a particular sequence, and a Set
cannot have any duplicate elements. (A bag,
which is not implemented in the new collections library since
Lists
provide you with that functionality, has no such rules.)
- Map:
a group of key-value object pairs (what you’ve seen up until now as a Hashtable).
At first glance, this might seem like it ought to be a
Collection
of pairs, but when you try to implement it that way the design gets awkward, so
it’s clearer to make it a separate concept. On the other hand, it’s
convenient to look at portions of a
Map
by creating a
Collection
to represent that portion. Thus, a
Map
can return a
Set
of its keys, a
List
of
its values, or a
List
of
its pairs.
Maps,
like arrays, can easily be expanded to multiple dimensions without adding new
concepts: you simply make a
Map
whose values are
Maps
(and the values of
those
Maps
can be
Maps,
etc.).
Collections
and
Maps
may be implemented in many different ways, according to your programming needs.
It’s helpful to look at a diagram of the new collections:
This
diagram can be a bit overwhelming at first, but throughout the rest of this
chapter you’ll see that there are really only three collection components:
Map,
List,
and
Set,
and only two or three implementations of each one
[37]
(with, typically, a preferred version). When you see this, the new collections
should not seem so daunting.
The
dashed boxes represent
interfaces,
the dotted boxes represent
abstract
classes, and the solid boxes are regular (concrete) classes. The dashed arrows
indicate that a particular class is implementing an
interface
(or in the case of an
abstract
class, partially implementing that
interface).
The double-line arrows show that a class can produce objects of the class the
arrow is pointing to. For example, any
Collection
can produce an
Iterator,
while a
List
can produce a
ListIterator
(as well as an ordinary
Iterator,
since
List
is inherited from
Collection). The
interfaces that are concerned with holding objects are
Collection,
List,
Set,
and
Map.
Typically,
you’ll write the bulk of your code to talk to these interfaces, and the
only place where you’ll specify the precise type you’re using is at
the point of creation. So you can create a
List
like this:
List
x = new LinkedList();
Of
course, you can also decide to make
x
a
LinkedList
(instead
of a generic
List)
and
carry the precise type information around with
x.
The beauty (and the intent) of using the
interface
is that if you decide you want to change your implementation, all you need to
do is change it at the point of creation, like this:
List
x = new ArrayList();
The
rest of your code can remain untouched.
In
the class hierarchy, you can see a number of classes whose names begin with
“
Abstract,”
and these can seem a bit confusing at first. They are simply tools that
partially implement a particular interface. If you were making your own
Set,
for example, you wouldn’t start with the
Set
interface and implement all the methods, instead you’d inherit from AbstractSet
and do the minimal necessary work to make your new class. However, the new
collections library contains enough functionality to satisfy your needs
virtually all the time. So for our purposes, you can ignore any class that
begins with “
Abstract.” Therefore,
when you look at the diagram, you’re really concerned with only those
interfaces
at the top of the diagram and the concrete classes (those with solid boxes
around them). You’ll typically make an object of a concrete class, upcast
it to the corresponding
interface,
and then use the
interface
throughout the rest of your code. Here’s a simple example, which fills a
Collection
with
String
objects and then prints each element in the
Collection:
//: SimpleCollection.java
// A simple example using the new Collections
package c08.newcollections;
import java.util.*;
public class SimpleCollection {
public static void main(String[] args) {
Collection c = new ArrayList();
for(int i = 0; i < 10; i++)
c.add(Integer.toString(i));
Iterator it = c.iterator();
while(it.hasNext())
System.out.println(it.next());
}
} ///:~ All
the code examples for the new collections libraries will be placed in the
subdirectory
newcollections,
so you’ll be reminded that these work only with Java 1.2. As a result,
you must invoke the program by saying:
java
c08.newcollections.SimpleCollection
with
a similar syntax for the rest of the programs in the package.
You
can see that the new collections are part of the
java.util
library, so you don’t need to add any extra
import
statements to use them.
The
first line in
main( )
creates an ArrayList
object and then upcasts it to a
Collection.
Since this example uses only the
Collection
methods,
any object of a class inherited from
Collection
would work, but
ArrayList
is the typical workhorse
Collection
and takes the place of
Vector. The
add( )
method, as its name suggests, puts a new element in the
Collection.
However, the documentation carefully states that
add( )
“ensures that this Collection contains the specified element.” This
is to allow for the meaning of
Set,
which adds the element only if it isn’t already there. With an
ArrayList,
or any sort of
List,
add( )
always means “put it in.”
All
Collections
can produce an Iterator
via their iterator( )
method. An
Iterator
is just like an Enumeration,
which it replaces, except:
- It
uses a name (iterator) that is historically understood and accepted in the OOP
community.
- It
uses shorter method names than
Enumeration:
hasNext( )
instead of
hasMoreElements( ),
and next( )
instead of
nextElement( ).
- It
adds a new method, remove( ),
which removes the last element produced by the
Iterator.
So you can call
remove( )
only once for every time you call
next( ).
In
SimpleCollection.java,
you can see that an
Iterator
is created and used to traverse the
Collection,
printing each element.
Using
Collections
The
following table shows everything you can do with a
Collection,
and thus, everything you can do with a
Set
or a
List.
(
List
also has additional functionality.)
Maps
are not inherited from
Collection,
and will be treated separately.
|
|
*Ensures
that the Collection contains the argument. Returns false if it doesn’t
add the argument.
|
boolean
addAll(Collection)
|
*Adds
all the elements in the argument. Returns true if any elements were added.
|
|
|
*Removes
all the elements in the Collection.
|
|
|
True
if the Collection contains the argument.
|
boolean
containsAll(Collection)
|
True
if the Collection contains all the elements in the argument.
|
|
|
True
if the Collection has no elements.
|
|
|
Returns
an Iterator that you can use to move through the elements in the Collection.
|
|
|
*If
the argument is in the Collection, one instance of that element is removed.
Returns true if a removal occurred.
|
boolean
removeAll(Collection)
|
*Removes
all the elements that are contained in the argument. Returns true if any
removals occurred.
|
boolean
retainAll(Collection)
|
*Retains
only elements that are contained in the argument (an “intersection”
from set theory). Returns true if any changes occurred.
|
|
|
Returns
the number of elements in the Collection.
|
|
|
Returns
an array containing all the elements in the Collection.
|
|
|
*This
is an “optional” method, which means it might not be implemented by
a particular Collection. If not, that method throws an
UnsupportedOperationException. Exceptions will be covered in Chapter 9.
|
The
following example demonstrates all of these methods. Again, these work with
anything that inherits from
Collection;
an
ArrayList
is used as a kind of “least-common denominator”:
//: Collection1.java
// Things you can do with all Collections
package c08.newcollections;
import java.util.*;
public class Collection1 {
// Fill with 'size' elements, start
// counting at 'start':
public static Collection
fill(Collection c, int start, int size) {
for(int i = start; i < start + size; i++)
c.add(Integer.toString(i));
return c;
}
// Default to a "start" of 0:
public static Collection
fill(Collection c, int size) {
return fill(c, 0, size);
}
// Default to 10 elements:
public static Collection fill(Collection c) {
return fill(c, 0, 10);
}
// Create & upcast to Collection:
public static Collection newCollection() {
return fill(new ArrayList());
// ArrayList is used for simplicity, but it's
// only seen as a generic Collection
// everywhere else in the program.
}
// Fill a Collection with a range of values:
public static Collection
newCollection(int start, int size) {
return fill(new ArrayList(), start, size);
}
// Moving through a List with an iterator:
public static void print(Collection c) {
for(Iterator x = c.iterator(); x.hasNext();)
System.out.print(x.next() + " ");
System.out.println();
}
public static void main(String[] args) {
Collection c = newCollection();
c.add("ten");
c.add("eleven");
print(c);
// Find max and min elements; this means
// different things depending on the way
// the Comparable interface is implemented:
System.out.println("Collections.max(c) = " +
Collections.max(c));
System.out.println("Collections.min(c) = " +
Collections.min(c));
// Add a Collection to another Collection
c.addAll(newCollection());
print(c);
c.remove("3"); // Removes the first one
print(c);
c.remove("3"); // Removes the second one
print(c);
// Remove all components that are in the
// argument collection:
c.removeAll(newCollection());
print(c);
c.addAll(newCollection());
print(c);
// Is an element in this Collection?
System.out.println(
"c.contains(\"4\") = " + c.contains("4"));
// Is a Collection in this Collection?
System.out.println(
"c.containsAll(newCollection()) = " +
c.containsAll(newCollection()));
Collection c2 = newCollection(5, 3);
// Keep all the elements that are in both
// c and c2 (an intersection of sets):
c.retainAll(c2);
print(c);
// Throw away all the elements in c that
// also appear in c2:
c.removeAll(c2);
System.out.println("c.isEmpty() = " +
c.isEmpty());
c = newCollection();
print(c);
c.clear(); // Remove all elements
System.out.println("after c.clear():");
print(c);
}
} ///:~ The
first methods provide a way to fill any
Collection
with test data, in this case just
ints
converted to
Strings.
The second method will be used frequently throughout the rest of this chapter.
The
two versions of
newCollection( )
create
ArrayLists
containing different sets of data and return them as
Collection
objects, so it’s clear that nothing other than the
Collection
interface is being used.
The
print( )
method will also be used throughout the rest of this section. Since it moves
through a
Collection
using an
Iterator,
which any
Collection
can produce, it will work with
Lists
and
Sets
and any
Collection
that a
Map
produces.
main( )
uses simple exercises to show all of the methods in
Collection. The
following sections compare the various implementations of
List,
Set,
and
Map
and indicate in each case (with an asterisk) which one should be your default
choice. You’ll notice that the legacy classes
Vector,
Stack,
and
Hashtable
are
not
included because in all cases there are preferred classes within the new
collections.
Using
Lists
|
|
Order
is the most important feature of a
List;
it promises to maintain elements in a particular sequence. List
adds a number of methods to
Collection
that allow
insertion
and removal of elements in the middle of a
List.
(This
is recommended only for a LinkedList.)
A
List
will
produce a ListIterator,
and using this you can traverse the
List
in both directions, as well as insert and remove elements in the middle of the
list (again, recommended only for a
LinkedList).
|
|
|
A
List
backed
by an array. Use instead of
Vector
as a general-purpose object holder. Allows rapid random access to elements, but
is slow when inserting and removing elements from the middle of a list.
ListIterator
should be used only for back-and-forth traversal of an
ArrayList,
but not for inserting and removing elements, which is expensive compared to
LinkedList.
|
|
|
Provides
optimal sequential access, with inexpensive insertions and deletions from the
middle of the list. Relatively slow for random access. (Use
ArrayList
instead.) Also has
addFirst( ),
addLast( ),
getFirst( ),
getLast( ),
removeFirst( ),
and
removeLast( )
(which are not defined in any interfaces or base classes) to allow it to be
used as a stack, a queue, and a dequeue.
|
The
methods in the following example each cover a different group of activities:
things that every list can do (
basicTest( )),
moving around with an
Iterator
(
iterMotion( ))
versus
changing things with an
Iterator
(
iterManipulation( )),
seeing the effects of
List
manipulation (
testVisual( )),
and operations available only to
LinkedLists.
//: List1.java
// Things you can do with Lists
package c08.newcollections;
import java.util.*;
public class List1 {
// Wrap Collection1.fill() for convenience:
public static List fill(List a) {
return (List)Collection1.fill(a);
}
// You can use an Iterator, just as with a
// Collection, but you can also use random
// access with get():
public static void print(List a) {
for(int i = 0; i < a.size(); i++)
System.out.print(a.get(i) + " ");
System.out.println();
}
static boolean b;
static Object o;
static int i;
static Iterator it;
static ListIterator lit;
public static void basicTest(List a) {
a.add(1, "x"); // Add at location 1
a.add("x"); // Add at end
// Add a collection:
a.addAll(fill(new ArrayList()));
// Add a collection starting at location 3:
a.addAll(3, fill(new ArrayList()));
b = a.contains("1"); // Is it in there?
// Is the entire collection in there?
b = a.containsAll(fill(new ArrayList()));
// Lists allow random access, which is cheap
// for ArrayList, expensive for LinkedList:
o = a.get(1); // Get object at location 1
i = a.indexOf("1"); // Tell index of object
// indexOf, starting search at location 2:
i = a.indexOf("1", 2);
b = a.isEmpty(); // Any elements inside?
it = a.iterator(); // Ordinary Iterator
lit = a.listIterator(); // ListIterator
lit = a.listIterator(3); // Start at loc 3
i = a.lastIndexOf("1"); // Last match
i = a.lastIndexOf("1", 2); // ...after loc 2
a.remove(1); // Remove location 1
a.remove("3"); // Remove this object
a.set(1, "y"); // Set location 1 to "y"
// Make an array from the List:
Object[] array = a.toArray();
// Keep everything that's in the argument
// (the intersection of the two sets):
a.retainAll(fill(new ArrayList()));
// Remove elements in this range:
a.removeRange(0, 2);
// Remove everything that's in the argument:
a.removeAll(fill(new ArrayList()));
i = a.size(); // How big is it?
a.clear(); // Remove all elements
}
public static void iterMotion(List a) {
ListIterator it = a.listIterator();
b = it.hasNext();
b = it.hasPrevious();
o = it.next();
i = it.nextIndex();
o = it.previous();
i = it.previousIndex();
}
public static void iterManipulation(List a) {
ListIterator it = a.listIterator();
it.add("47");
// Must move to an element after add():
it.next();
// Remove the element that was just produced:
it.remove();
// Must move to an element after remove():
it.next();
// Change the element that was just produced:
it.set("47");
}
public static void testVisual(List a) {
print(a);
List b = new ArrayList();
fill(b);
System.out.print("b = ");
print(b);
a.addAll(b);
a.addAll(fill(new ArrayList()));
print(a);
// Shrink the list by removing all the
// elements beyond the first 1/2 of the list
System.out.println(a.size());
System.out.println(a.size()/2);
a.removeRange(a.size()/2, a.size()/2 + 2);
print(a);
// Insert, remove, and replace elements
// using a ListIterator:
ListIterator x = a.listIterator(a.size()/2);
x.add("one");
print(a);
System.out.println(x.next());
x.remove();
System.out.println(x.next());
x.set("47");
print(a);
// Traverse the list backwards:
x = a.listIterator(a.size());
while(x.hasPrevious())
System.out.print(x.previous() + " ");
System.out.println();
System.out.println("testVisual finished");
}
// There are some things that only
// LinkedLists can do:
public static void testLinkedList() {
LinkedList ll = new LinkedList();
Collection1.fill(ll, 5);
print(ll);
// Treat it like a stack, pushing:
ll.addFirst("one");
ll.addFirst("two");
print(ll);
// Like "peeking" at the top of a stack:
System.out.println(ll.getFirst());
// Like popping a stack:
System.out.println(ll.removeFirst());
System.out.println(ll.removeFirst());
// Treat it like a queue, pulling elements
// off the tail end:
System.out.println(ll.removeLast());
// With the above operations, it's a dequeue!
print(ll);
}
public static void main(String args[]) {
// Make and fill a new list each time:
basicTest(fill(new LinkedList()));
basicTest(fill(new ArrayList()));
iterMotion(fill(new LinkedList()));
iterMotion(fill(new ArrayList()));
iterManipulation(fill(new LinkedList()));
iterManipulation(fill(new ArrayList()));
testVisual(fill(new LinkedList()));
testLinkedList();
}
} ///:~ In
basicTest( )
and
iterMotion( )
the calls are simply made to show the proper syntax, and while the return value
is captured, it is not used. In some cases, the return value isn’t
captured since it isn’t typically used. You should look up the full usage
of each of these methods in your online documentation before you use them.
Using
Sets
Set
has exactly the same interface as
Collection,
so there isn’t any extra functionality as there is with the two different
Lists.
Instead, the
Set
is exactly a
Collection,
it just has different behavior. (This is the ideal use of inheritance and
polymorphism: to express different behavior.) A
Set
allows only one instance of each object value to exist (what constitutes the
“value” of an object is more complex, as you shall see).
|
|
Each
element that you add to the
Set
must be unique; otherwise the
Set
doesn’t add the duplicate element. Objects added to a
Set
must define
equals( )
to establish object uniqueness.
Set
has exactly the same interface as
Collection.
A
Set
does not guarantee it will maintain its elements in any particular order.
|
|
|
For
all
Sets
except very small ones. Objects must also define hashCode( ).
|
|
|
A
Set
backed
by an array. Designed for very small
Sets,
especially those that are frequently created and destroyed. For small
Sets,
creation and iteration is substantially cheaper than for
HashSet.
Performance
gets quite bad when the
Set
is large.
HashCode( )
is not required.
|
|
|
An
ordered
Set
backed by a red-black tree.
[38]
This way, you can extract an ordered sequence from a
Set.
|
The
following example does
not
show everything you can do with a
Set,
since the interface is the same as
Collection
and so was exercised in the previous example. Instead, this demonstrates the
behavior that makes a
Set
unique:
//: Set1.java
// Things you can do with Sets
package c08.newcollections;
import java.util.*;
public class Set1 {
public static void testVisual(Set a) {
Collection1.fill(a);
Collection1.fill(a);
Collection1.fill(a);
Collection1.print(a); // No duplicates!
// Add another set to this one:
a.addAll(a);
a.add("one");
a.add("one");
a.add("one");
Collection1.print(a);
// Look something up:
System.out.println("a.contains(\"one\"): " +
a.contains("one"));
}
public static void main(String[] args) {
testVisual(new HashSet());
testVisual(new ArraySet());
}
} ///:~ Duplicate
values are added to the
Set,
but when it is printed you’ll see the
Set
has accepted only one instance of each value.
When
you run this program you’ll notice that the order maintained by the
HashSet
is different from
ArraySet,
since each has a different way of storing elements so they can be located
later. (
ArraySet
keeps them sorted, while
HashSet
uses a hashing function, which is designed specifically for rapid lookups.)
When creating your own types, be aware that a
Set
needs a way to maintain a storage order, just as with the
“groundhog” examples shown earlier in this chapter. Here’s an
example:
//: Set2.java
// Putting your own type in a Set
package c08.newcollections;
import java.util.*;
class MyType {
private int i;
public MyType(int n) { i = n;}
public boolean equals(Object o) {
if ((o != null) && (o instanceof MyType))
return
i == ((MyType)o).i;
else return false;
}
// Required for HashSet, not for ArraySet:
public int hashCode() { return i; }
public String toString() { return i + " "; }
}
public class Set2 {
public static Set fill(Set a, int size) {
for(int i = 0; i < size; i++)
a.add(new MyType(i));
return a;
}
public static Set fill(Set a) {
return fill(a, 10);
}
public static void test(Set a) {
fill(a);
fill(a); // Try to add duplicates
fill(a);
a.addAll(fill(new ArraySet()));
Collection1.print(a);
}
public static void main(String[] args) {
test(new HashSet());
test(new ArraySet());
}
} ///:~ The
definitions for equals( )
and
hashCode( )
follow the form given in the “groundhog” examples. You must define
an
equals( )
in both cases, but the
hashCode( )
is necessary only if the class will be placed in a
HashSet
(which is likely, since that should generally be your first choice as a
Set
implementation).
Using
Maps
|
|
Maintains
key-value associations (pairs), so you can look up a value using a key.
|
|
|
Implementation
based on a hash table. (Use this instead of
Hashtable.)
Provides constant-time performance for inserting and locating pairs.
Performance can be adjusted via constructors that allow you to set the
capacity
and
load
factor
of the hash table.
|
|
|
Map
backed by an
ArrayList.
Gives precise control over the order of iteration. Designed for very small
Maps,
especially those that are frequently created and destroyed. For very small
Maps,
creation and iteration is substantially cheaper than for
HashMap.
Performance
gets very bad when the
Map
is large.
|
|
|
Implementation
based on a red-black tree. When you view the keys or the pairs, they will be in
sorted order (determined by Comparable
or Comparator,
discussed later). The point of a
TreeMap
is that you get the results in sorted order.
TreeMap
is
the only
Map
with
the
subMap( )
method, which allows you to return a portion of the tree.
|
The
following example contains two sets of test data and a
fill( )
method that allows you to fill any map with any two-dimensional array of
Objects.
These tools will be used in other
Map
examples, as well.
//: Map1.java
// Things you can do with Maps
package c08.newcollections;
import java.util.*;
public class Map1 {
public final static String[][] testData1 = {
{ "Happy", "Cheerful disposition" },
{ "Sleepy", "Prefers dark, quiet places" },
{ "Grumpy", "Needs to work on attitude" },
{ "Doc", "Fantasizes about advanced degree"},
{ "Dopey", "'A' for effort" },
{ "Sneezy", "Struggles with allergies" },
{ "Bashful", "Needs self-esteem workshop"},
};
public final static String[][] testData2 = {
{ "Belligerent", "Disruptive influence" },
{ "Lazy", "Motivational problems" },
{ "Comatose", "Excellent behavior" }
};
public static Map fill(Map m, Object[][] o) {
for(int i = 0; i < o.length; i++)
m.put(o[i][0], o[i][1]);
return m;
}
// Producing a Set of the keys:
public static void printKeys(Map m) {
System.out.print("Size = " + m.size() +", ");
System.out.print("Keys: ");
Collection1.print(m.keySet());
}
// Producing a Collection of the values:
public static void printValues(Map m) {
System.out.print("Values: ");
Collection1.print(m.values());
}
// Iterating through Map.Entry objects (pairs):
public static void print(Map m) {
Collection entries = m.entries();
Iterator it = entries.iterator();
while(it.hasNext()) {
Map.Entry e = (Map.Entry)it.next();
System.out.println("Key = " + e.getKey() +
", Value = " + e.getValue());
}
}
public static void test(Map m) {
fill(m, testData1);
// Map has 'Set' behavior for keys:
fill(m, testData1);
printKeys(m);
printValues(m);
print(m);
String key = testData1[4][0];
String value = testData1[4][1];
System.out.println("m.containsKey(\"" + key +
"\"): " + m.containsKey(key));
System.out.println("m.get(\"" + key + "\"): "
+ m.get(key));
System.out.println("m.containsValue(\""
+ value + "\"): " +
m.containsValue(value));
Map m2 = fill(new ArrayMap(), testData2);
m.putAll(m2);
printKeys(m);
m.remove(testData2[0][0]);
printKeys(m);
m.clear();
System.out.println("m.isEmpty(): "
+ m.isEmpty());
fill(m, testData1);
// Operations on the Set change the Map:
m.keySet().removeAll(m.keySet());
System.out.println("m.isEmpty(): "
+ m.isEmpty());
}
public static void main(String args[]) {
System.out.println("Testing ArrayMap");
test(new ArrayMap());
System.out.println("Testing HashMap");
test(new HashMap());
System.out.println("Testing TreeMap");
test(new TreeMap());
}
} ///:~ The
printKeys( ),
printValues( ),
and
print( )
methods are not only useful utilities, they also demonstrate the production of
Collection
views of a
Map.
The
keySet( )
method produces a
Set
backed by the keys in the
Map;
here, it is treated as only a
Collection.
Similar treatment is given to
values( ),
which produces a
List
containing all the values in the
Map.
(Note that keys must be unique, while values can contain duplicates.) Since
these
Collections
are backed by the
Map,
any changes in a
Collection
will be reflected in the associated
Map. The
print( )
method grabs the
Iterator
produced by
entries
and uses it to print both the key and value for each pair. The rest of the
program provides simple examples of each
Map
operation, and tests each type of
Map. When
creating your own class to use as a key in a
Map,
you must deal with the same issues discussed previously for
Sets.
Choosing
an implementation
From
the diagram on page
363
you can see that there are really only three collection components:
Map,
List,
and
Set,
and only two or three implementations of each interface. If you need to use the
functionality offered by a particular
interface,
how do you decide which particular implementation to use?
To
understand the answer, you must be aware that each different implementation has
its own features, strengths, and weaknesses. For example, you can see in the
diagram that the “feature” of
Hashtable,
Vector,
and
Stack
is that they are legacy classes, so that existing code doesn’t break. On
the other hand, it’s best if you don’t use those for new (Java 1.2)
code.
The
distinction between the other collections often comes down to what they are
”backed by;” that is, the data structures that physically implement
your desired
interface.
This means that, for example,
ArrayList,
LinkedList
,
and
Vector
(which is roughly equivalent to
ArrayList)
all
implement the
List
interface so your program will produce the same results regardless of the one
you use. However,
ArrayList
(and
Vector)
is
backed by an array, while the
LinkedList
is implemented in the usual way for a doubly-linked list, as individual objects
each containing data along with handles to the previous and next elements in
the list. Because of this, if you want to do many insertions and removals in
the middle of a list a
LinkedList
is the appropriate choice. (
LinkedList
also has additional functionality that is established in AbstractSequentialList.)
If not, an
ArrayList
is probably faster.
As
another example, a
Set
can be implemented as either an
ArraySet
or a
HashSet.
An
ArraySet
is backed by an
ArrayList
and is designed to support only small numbers of elements, especially in
situations in which you’re creating and destroying a lot of
Set
objects. However, if you’re going to have larger quantities in your
Set,
the performance of
ArraySet
will get very bad, very quickly. When you’re writing a program that needs
a
Set,
you should choose
HashSet
by default, and change to
ArraySet
only in special cases where performance improvements are indicated and necessary.
Choosing
between Lists
The
most convincing way to see the differences between the implementations of
List
is with a performance test. The following code establishes an inner base class
to use as a test framework, then creates an anonymous
inner class for each different test. Each of these inner classes is called by
the
test( )
method. This approach allows you to easily add and remove new kinds of tests.
//: ListPerformance.java
// Demonstrates performance differences in Lists
package c08.newcollections;
import java.util.*;
public class ListPerformance {
private static final int REPS = 100;
private abstract static class Tester {
String name;
int size; // Test quantity
Tester(String name, int size) {
this.name = name;
this.size = size;
}
abstract void test(List a);
}
private static Tester[] tests = {
new Tester("get", 300) {
void test(List a) {
for(int i = 0; i < REPS; i++) {
for(int j = 0; j < a.size(); j++)
a.get(j);
}
}
},
new Tester("iteration", 300) {
void test(List a) {
for(int i = 0; i < REPS; i++) {
Iterator it = a.iterator();
while(it.hasNext())
it.next();
}
}
},
new Tester("insert", 1000) {
void test(List a) {
int half = a.size()/2;
String s = "test";
ListIterator it = a.listIterator(half);
for(int i = 0; i < size * 10; i++)
it.add(s);
}
},
new Tester("remove", 5000) {
void test(List a) {
ListIterator it = a.listIterator(3);
while(it.hasNext()) {
it.next();
it.remove();
}
}
},
};
public static void test(List a) {
// A trick to print out the class name:
System.out.println("Testing " +
a.getClass().getName());
for(int i = 0; i < tests.length; i++) {
Collection1.fill(a, tests[i].size);
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(a);
long t2 = System.currentTimeMillis();
System.out.println(": " + (t2 - t1));
}
}
public static void main(String[] args) {
test(new ArrayList());
test(new LinkedList());
}
} ///:~ The
inner class
Tester
is
abstract,
to provide a base class for the specific tests. It contains a
String
to be printed when the test starts, a
size
parameter
to be used by the test for quantity of elements or repetitions of tests, a
constructor to initialize the fields, and an
abstract
method
test( )
that does the work. All the different types of tests are collected in one
place, the array
tests,
which is initialized with different anonymous inner classes that inherit from
Tester.
To add or remove tests, simply add or remove an inner class definition from the
array, and everything else happens automatically.
The
List
that’s handed to
test( )
is first filled with elements, then each test in the
tests
array is timed. The results will vary from machine to machine; they are
intended to give only an order of magnitude comparison between the performance
of the different collections. Here is a summary of one run:
You
can see that random accesses (
get( ))
and iterations are cheap for
ArrayLists
and expensive for
LinkedLists.
On the other hand, insertions and removals from the middle of a list are
significantly cheaper for a
LinkedList
than for an
ArrayList.
The best approach is probably to choose an
ArrayList
as your default and to change to a
LinkedList
if you discover performance problems because of many insertions and removals
from the middle of the list.
Choosing
between Sets
You
can choose between an
ArraySet
and
a
HashSet,
depending on the size of the
Set
(if you need to produce an ordered sequence from a
Set,
use
TreeSet[39]).
The
following test program gives an indication of this tradeoff:
//: SetPerformance.java
// Demonstrates performance differences in Sets
package c08.newcollections;
import java.util.*;
public class SetPerformance {
private static final int REPS = 100;
private abstract static class Tester {
String name;
Tester(String name) { this.name = name; }
abstract void test(Set s, int size);
}
private static Tester[] tests = {
new Tester("add") {
void test(Set s, int size) {
for(int i = 0; i < REPS; i++) {
s.clear();
Collection1.fill(s, size);
}
}
},
new Tester("contains") {
void test(Set s, int size) {
for(int i = 0; i < REPS; i++)
for(int j = 0; j < size; j++)
s.contains(Integer.toString(j));
}
},
new Tester("iteration") {
void test(Set s, int size) {
for(int i = 0; i < REPS * 10; i++) {
Iterator it = s.iterator();
while(it.hasNext())
it.next();
}
}
},
};
public static void test(Set s, int size) {
// A trick to print out the class name:
System.out.println("Testing " +
s.getClass().getName() + " size " + size);
Collection1.fill(s, size);
for(int i = 0; i < tests.length; i++) {
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(s, size);
long t2 = System.currentTimeMillis();
System.out.println(": " +
((double)(t2 - t1)/(double)size));
}
}
public static void main(String[] args) {
// Small:
test(new ArraySet(), 10);
test(new HashSet(), 10);
// Medium:
test(new ArraySet(), 100);
test(new HashSet(), 100);
// Large:
test(new HashSet(), 1000);
test(new ArraySet(), 500);
}
} ///:~ The
last test of
ArraySet
is only 500 elements instead of 1000 because it is so slow.
HashSet
is clearly superior to
ArraySet
for
add( )
and
contains( ),
and the performance is effectively independent of size. You’ll virtually
never want to use an
ArraySet
for regular programming.
Choosing
between Maps
When
choosing between implementations of
Map,
the size of the
Map
is what most strongly affects performance, and the following test program gives
an indication of this tradeoff:
//: MapPerformance.java
// Demonstrates performance differences in Maps
package c08.newcollections;
import java.util.*;
public class MapPerformance {
private static final int REPS = 100;
public static Map fill(Map m, int size) {
for(int i = 0; i < size; i++) {
String x = Integer.toString(i);
m.put(x, x);
}
return m;
}
private abstract static class Tester {
String name;
Tester(String name) { this.name = name; }
abstract void test(Map m, int size);
}
private static Tester[] tests = {
new Tester("put") {
void test(Map m, int size) {
for(int i = 0; i < REPS; i++) {
m.clear();
fill(m, size);
}
}
},
new Tester("get") {
void test(Map m, int size) {
for(int i = 0; i < REPS; i++)
for(int j = 0; j < size; j++)
m.get(Integer.toString(j));
}
},
new Tester("iteration") {
void test(Map m, int size) {
for(int i = 0; i < REPS * 10; i++) {
Iterator it = m.entries().iterator();
while(it.hasNext())
it.next();
}
}
},
};
public static void test(Map m, int size) {
// A trick to print out the class name:
System.out.println("Testing " +
m.getClass().getName() + " size " + size);
fill(m, size);
for(int i = 0; i < tests.length; i++) {
System.out.print(tests[i].name);
long t1 = System.currentTimeMillis();
tests[i].test(m, size);
long t2 = System.currentTimeMillis();
System.out.println(": " +
((double)(t2 - t1)/(double)size));
}
}
public static void main(String[] args) {
// Small:
test(new ArrayMap(), 10);
test(new HashMap(), 10);
test(new TreeMap(), 10);
// Medium:
test(new ArrayMap(), 100);
test(new HashMap(), 100);
test(new TreeMap(), 100);
// Large:
test(new HashMap(), 1000);
// You might want to comment these out since
// they can take a while to run:
test(new ArrayMap(), 500);
test(new TreeMap(), 500);
}
} ///:~ Because
the size of the map is the issue, you’ll see that the timing tests divide
the time by the size to normalize each measurement. Here is one set of results.
(Yours will probably be different.)
| |