Bruce Eckel’s Thinking in Java
A Java program can send a CGI request to a server just like an HTML page can. As with HTML pages, this request can be either a GET or a POST. In addition, the Java program can intercept the output of the CGI program, so you don’t have to rely on the program to format a new page and force the user to back up from one page to another if something goes wrong. In fact, the appearance of the program can be the same as the previous version.
It also turns out that the code is simpler, and that CGI isn’t difficult to write after all. (An innocent statement that’s true of many things – after you understand them.) So in this section you’ll get a crash course in CGI programming. To solve the general problem, some CGI tools will be created in C++ that will allow you to easily write a CGI program to solve any problem. The benefit to this approach is portability – the example you are about to see will work on any system that supports CGI, and there’s no problem with firewalls.
This example also works out the basics of creating any connection with applets and CGI programs, so you can easily adapt it to your own projects.
Encoding data for CGI
In this version, the name and the email address will be collected and stored in the file in the form:
This is a convenient form for many mailers. Since two fields are being collected, there are no shortcuts because CGI has a particular format for encoding the data in fields. You can see this for yourself if you make an ordinary HTML page and add the lines:
<Form method="GET" ACTION="/cgi-bin/Listmgr2.exe">
<P>Name: <INPUT TYPE = "text" NAME = "name" VALUE = "" size = "40"></p>
<P>Email Address: <INPUT TYPE = "text" NAME = "email" VALUE = "" size = "40"></p>
<p><input type = "submit" name = "submit" > </p>
</Form>
This creates two data entry fields called name and email, along with a submit button that collects the data and sends it to a CGI program. Listmgr2.exe is the name of the executable program that resides in the directory that’s typically called “cgi-bin” on your Web server. [65] (If the named program is not in the cgi-bin directory, you won’t see any results.) If you fill out this form and press the “submit” button, you will see in the URL address window of the browser something like:
http://www.myhome.com/cgi-bin/Listmgr2.exe?
name=First+Last&[email protected]&submit=Submit
(Without the line break, of course). Here you see a little bit of the way that data is encoded to send to CGI. For one thing, spaces are not allowed (since spaces typically separate command-line arguments). Spaces are replaced by ‘+’ signs. In addition, each field contains the field name (which is determined by the HTML page) followed by an ‘=’ and the field data, and terminated by a ‘&’.
At this point, you might wonder about the ‘+’, ‘=’, and ‘&’. What if those are used in the field, as in “John & Marsha Smith”? This is encoded to:
John+%26+Marsha+Smith
That is, the special character is turned into a ‘%’ followed by its ASCII value in hex.
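To make the rule concrete, here is a minimal sketch of the same encoding done by hand. The class PercentEncodeSketch and its encode method are hypothetical names used only for illustration; they are not part of the book's code. The sketch assumes the text is first converted to UTF-8 bytes (identical to ASCII for the characters discussed here) and that letters, digits and the characters . - * _ are passed through unchanged.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class PercentEncodeSketch {
  static String encode(String text) {
    StringBuilder out = new StringBuilder();
    for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
      int c = b & 0xff;  // treat the byte as an unsigned value
      boolean safe =
          (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
          (c >= '0' && c <= '9') ||
          c == '.' || c == '-' || c == '*' || c == '_';
      if (safe) {
        out.append((char) c);          // copied as-is
      } else if (c == ' ') {
        out.append('+');               // space becomes '+'
      } else {
        out.append('%');               // '%' plus two hex digits
        out.append(Character.toUpperCase(Character.forDigit(c >> 4, 16)));
        out.append(Character.toUpperCase(Character.forDigit(c & 0x0f, 16)));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(encode("John & Marsha Smith"));
    // prints John+%26+Marsha+Smith
  }
}
```

The ‘&’ has the ASCII value 0x26, which is why it appears as %26 in the output.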
Fortunately, Java has a tool to perform this encoding for you. It’s a static method of the class URLEncoder called encode( ). You can experiment with this method using the following program:
//: EncodeDemo.java
// Demonstration of URLEncoder.encode()
import java.net.*;

public class EncodeDemo {
  public static void main(String[] args) {
    String s = "";
    for(int i = 0; i < args.length; i++)
      s += args[i] + " ";
    s = URLEncoder.encode(s.trim());
    System.out.println(s);
  }
} ///:~
This takes the command-line arguments and combines them into a string of words separated by spaces (the final space is removed using String.trim( )). These are then encoded and printed.
To invoke a CGI program, all the applet needs to do is collect the data from its fields (or wherever it needs to collect the data from), URL-encode each piece of data, and then assemble it into a single string, placing the name of each field followed by an ‘=’, followed by the data, followed by an ‘&’. To form the entire CGI command, this string is placed after the URL of the CGI program and a ‘?’. That’s all it takes to invoke any CGI program, and as you’ll see you can easily do it within an applet.
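The assembly steps just described can be sketched outside an applet as follows. QueryStringSketch, buildQuery and buildRequest are hypothetical names, and the sketch assumes UTF-8 as the encoding passed to URLEncoder.encode( ). It uses ‘&’ as a separator between fields rather than appending one after the last field, which is also what the GET string built by the applet in this section looks like.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class QueryStringSketch {
  // URL-encode one piece of data (UTF-8 is an assumption)
  static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always available
    }
  }

  // name=value pairs joined with '&', in insertion order
  static String buildQuery(Map<String, String> fields) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : fields.entrySet()) {
      if (sb.length() > 0) sb.append('&');
      sb.append(encode(e.getKey()))
        .append('=')
        .append(encode(e.getValue()));
    }
    return sb.toString();
  }

  // URL of the CGI program, then '?', then the encoded fields
  static String buildRequest(String programUrl,
                             Map<String, String> fields) {
    return programUrl + "?" + buildQuery(fields);
  }

  public static void main(String[] args) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("name", "John & Marsha Smith");
    fields.put("submit", "Submit");
    System.out.println(buildRequest(
        "http://www.myhome.com/cgi-bin/Listmgr2.exe", fields));
    // prints http://www.myhome.com/cgi-bin/Listmgr2.exe?name=John+%26+Marsha+Smith&submit=Submit
  }
}
```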
The applet
The applet is actually considerably simpler than NameSender.java, partly because it’s so easy to send a GET request and also because no thread is required to wait for the reply. There are now two fields instead of one, but you’ll notice that much of the applet looks familiar, from NameSender.java.
//: NameSender2.java
// An applet that sends an email address
// via a CGI GET, using Java 1.02.
import java.awt.*;
import java.applet.*;
import java.net.*;
import java.io.*;

public class NameSender2 extends Applet {
  final String CGIProgram = "Listmgr2.exe";
  Button send = new Button(
    "Add email address to mailing list");
  TextField name = new TextField(
    "type your name here", 40),
    email = new TextField(
    "type your email address here", 40);
  String str = new String();
  Label l = new Label(), l2 = new Label();
  int vcount = 0;
  public void init() {
    setLayout(new BorderLayout());
    Panel p = new Panel();
    p.setLayout(new GridLayout(3, 1));
    p.add(name);
    p.add(email);
    p.add(send);
    add("North", p);
    Panel labels = new Panel();
    labels.setLayout(new GridLayout(2, 1));
    labels.add(l);
    labels.add(l2);
    add("Center", labels);
    l.setText("Ready to send email address");
  }
  public boolean action (Event evt, Object arg) {
    if(evt.target.equals(send)) {
      l2.setText("");
      // Check for errors in data:
      if(name.getText().trim()
         .indexOf(' ') == -1) {
        l.setText(
          "Please give first and last name");
        l2.setText("");
        return true;
      }
      str = email.getText().trim();
      if(str.indexOf(' ') != -1) {
        l.setText(
          "Spaces not allowed in email name");
        l2.setText("");
        return true;
      }
      if(str.indexOf(',') != -1) {
        l.setText(
          "Commas not allowed in email name");
        return true;
      }
      if(str.indexOf('@') == -1) {
        l.setText("Email name must include '@'");
        l2.setText("");
        return true;
      }
      if(str.indexOf('@') == 0) {
        l.setText(
          "Name must precede '@' in email name");
        l2.setText("");
        return true;
      }
      String end =
        str.substring(str.indexOf('@'));
      if(end.indexOf('.') == -1) {
        l.setText("Portion after '@' must " +
          "have an extension, such as '.com'");
        l2.setText("");
        return true;
      }
      // Build and encode the email data:
      String emailData =
        "name=" + URLEncoder.encode(
          name.getText().trim()) +
        "&email=" + URLEncoder.encode(
          email.getText().trim().toLowerCase()) +
        "&submit=Submit";
      // Send the name using CGI's GET process:
      try {
        l.setText("Sending...");
        URL u = new URL(
          getDocumentBase(), "cgi-bin/" +
          CGIProgram + "?" + emailData);
        l.setText("Sent: " + email.getText());
        send.setLabel("Re-send");
        l2.setText(
          "Waiting for reply " + ++vcount);
        DataInputStream server =
          new DataInputStream(u.openStream());
        String line;
        while((line = server.readLine()) != null)
          l2.setText(line);
      } catch(MalformedURLException e) {
        l.setText("Bad URL");
      } catch(IOException e) {
        l.setText("IO Exception");
      }
    }
    else return super.action(evt, arg);
    return true;
  }
} ///:~
The name of the CGI program (which you’ll see later) is Listmgr2.exe. Many Web servers are Unix machines (mine runs Linux) that don’t traditionally use the .exe extension for their executable programs, but you can call the program anything you want under Unix. By using the .exe extension the program can be tested without change under both Unix and Win32.
As before, the applet sets up its user interface (with two fields this time instead of one). The only significant difference occurs inside the action( ) method, which handles the button press. After the name has been checked, you see the lines:
String emailData =
  "name=" + URLEncoder.encode(
    name.getText().trim()) +
  "&email=" + URLEncoder.encode(
    email.getText().trim().toLowerCase()) +
  "&submit=Submit";
// Send the name using CGI's GET process:
try {
  l.setText("Sending...");
  URL u = new URL(
    getDocumentBase(), "cgi-bin/" +
    CGIProgram + "?" + emailData);
  l.setText("Sent: " + email.getText());
  send.setLabel("Re-send");
  l2.setText(
    "Waiting for reply " + ++vcount);
  DataInputStream server =
    new DataInputStream(u.openStream());
  String line;
  while((line = server.readLine()) != null)
    l2.setText(line);
  // ...
The name and email data are extracted from their respective text boxes, and the spaces are trimmed off both ends using trim( ). The email name is forced to lower case so all email addresses in the list can be accurately compared (to prevent accidental duplicates based on capitalization). The data from each field is URL-encoded, and then the GET string is assembled in the same way that an HTML page would do it. (This way you can use a Java applet in concert with any existing CGI program designed to work with regular HTML GET requests.)
At this point, some Java magic happens: if you want to connect to any URL, just create a URL object and hand the address to the constructor. Creating the object only records the address; the connection with the server is made when you ask the URL for its data, here by calling openStream( ) (and, with Web servers, all the action happens in making the connection, via the string used as the URL). In this case, the URL points to the cgi-bin directory of the current Web site (the base address of the current Web site is produced with getDocumentBase( )). When the Web server sees “cgi-bin” in a URL, it expects that to be followed by the name of the program inside the cgi-bin directory that you want it to run. Following the program name is a question mark and the argument string that the CGI program will look for in the QUERY_STRING environment variable, as you’ll see.
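How the two-argument URL constructor combines a base address with the "cgi-bin/..." string can be tried without an applet or a network connection. In this sketch the class CgiUrlSketch, the method cgiUrl and the page addresses are hypothetical; a plain string stands in for the value an applet would get from getDocumentBase( ).

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, for illustration only.
class CgiUrlSketch {
  // documentBase plays the role of getDocumentBase() in an applet
  static URL cgiUrl(String documentBase, String program,
                    String query) {
    try {
      URL base = new URL(documentBase);
      // The second argument is resolved relative to the base:
      return new URL(base, "cgi-bin/" + program + "?" + query);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    // Page at the top level of the site:
    System.out.println(cgiUrl(
        "http://www.myhome.com/index.html",
        "Listmgr2.exe", "name=First+Last"));
    // prints http://www.myhome.com/cgi-bin/Listmgr2.exe?name=First+Last

    // Page inside a subdirectory:
    System.out.println(cgiUrl(
        "http://www.myhome.com/docs/index.html",
        "Listmgr2.exe", "name=First+Last"));
    // prints http://www.myhome.com/docs/cgi-bin/Listmgr2.exe?name=First+Last
  }
}
```

The second case shows that a relative string is resolved against the directory of the page, so the result lands in the site's top-level cgi-bin only when the page itself sits at the top level. No connection is opened by either call; the objects are only constructed and printed.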
Usually when you make any sort of request, you get back (you’re forced to accept in return) an HTML page. With Java URL objects, however, you can intercept anything that comes back from the CGI program by getting an InputStream from the URL object. This is performed with the URL openStream( ) method, which is in turn wrapped in a DataInputStream. Then you can read lines, and when readLine( ) returns null the CGI program has finished its output.
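The read-until-null loop can be sketched on its own. This is not the applet's code: it swaps in a BufferedReader over an InputStreamReader (with UTF-8 assumed) in place of DataInputStream, and the names ReplyReaderSketch and lastLine are hypothetical. So that it runs without a server, main( ) reads from an in-memory stream unless a URL is given on the command line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class ReplyReaderSketch {
  // Reads every line and returns the last one ("" if none),
  // much as the applet leaves the last line in its Label.
  static String lastLine(InputStream in) {
    try (BufferedReader reader = new BufferedReader(
           new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line, last = "";
      while ((line = reader.readLine()) != null)
        last = line;   // null means the output has ended
      return last;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // The connection is made here, by openStream():
  static String lastLine(URL u) throws IOException {
    return lastLine(u.openStream());
  }

  public static void main(String[] args) throws IOException {
    if (args.length > 0) {
      System.out.println(lastLine(new URL(args[0])));
    } else {
      // Stand-in for a CGI reply (made-up text):
      byte[] reply = "first line\nsecond line\n"
          .getBytes(StandardCharsets.UTF_8);
      System.out.println(
          lastLine(new ByteArrayInputStream(reply)));
      // prints second line
    }
  }
}
```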
The CGI program you’re about to see returns only one line, a string indicating success or failure (and the details of the failure). This line is captured and placed into the second Label field so the user can see the results.
Displaying a Web page from within an applet
It’s also possible for the applet to display the result of the CGI program as a Web page, just as if it were running in normal HTML mode. You can do this with the following line:
getAppletContext().showDocument(u);
in which u is the URL object. Here’s a simple example that redirects you to another Web page. The page happens to be the output of a CGI program, but you can as easily go to an ordinary HTML page, so you could build on this applet to produce a password-protected gateway to a particular portion of your Web site:
//: ShowHTML.java
import java.awt.*;
import java.applet.*;
import java.net.*;
import java.io.*;

public class ShowHTML extends Applet {
  static final String CGIProgram = "MyCGIProgram";
  Button send = new Button("Go");
  Label l = new Label();
  public void init() {
    add(send);
    add(l);
  }
  public boolean action (Event evt, Object arg) {
    if(evt.target.equals(send)) {
      try {
        // This could be an HTML page instead of
        // a CGI program. Notice that this CGI
        // program doesn't use arguments, but
        // you can add them in the usual way.
        URL u = new URL(
          getDocumentBase(),
          "cgi-bin/" + CGIProgram);
        // Display the output of the URL using
        // the Web browser, as an ordinary page:
        getAppletContext().showDocument(u);
      } catch(Exception e) {
        l.setText(e.toString());
      }
    }
    else return super.action(evt, arg);
    return true;
  }
} ///:~
The beauty of the URL class is how much it shields you from. You can connect to Web servers without knowing much at all about what’s going on under the covers.
The CGI program in C++
At this point you could follow the previous example and write the CGI program for the server using ANSI C. One argument for doing this is that ANSI C can be found virtually everywhere. However, C++ has become quite ubiquitous, especially in the form of the GNU C++ Compiler [66] (g++) that can be downloaded free from the Internet for virtually any platform (and often comes pre-installed with operating systems such as Linux). As you will see, this means that you can get the benefit of object-oriented programming in a CGI program.
To avoid throwing too many new concepts at you all at once, this program will not be a “pure” C++ program; some code will be written in plain C even though C++ alternatives exist. This isn’t a significant issue because the biggest benefit in using C++ for this program is the ability to create classes. Since what we’re concerned with when parsing the CGI information is the field name-value pairs, one class (Pair) will be used to represent a single name-value pair and a second class (CGI_vector) will automatically parse the CGI string into Pair objects that it will hold (as a vector) so you can fetch each Pair out at your leisure.
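For readers more at home in Java, the division of labor between the two classes can be previewed with a rough Java analogue. This is only a sketch of the idea, not a translation of the C++ code: QueryPairsSketch, its nested Pair class and the parse method are hypothetical, decoding is delegated to URLDecoder with UTF-8 assumed, and the query string in main( ) is made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical Java analogue, for illustration only.
class QueryPairsSketch {
  // One decoded name-value pair
  static final class Pair {
    final String name, value;
    Pair(String name, String value) {
      this.name = name;
      this.value = value;
    }
  }

  // Undo the '+' and '%xx' encoding (UTF-8 is an assumption)
  static String decode(String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }

  // Split on '&', then on the first '=', decoding each half
  static List<Pair> parse(String query) {
    List<Pair> pairs = new ArrayList<>();
    if (query == null || query.isEmpty()) return pairs;
    for (String field : query.split("&")) {
      if (field.isEmpty()) continue;
      int eq = field.indexOf('=');
      String name = eq < 0 ? field : field.substring(0, eq);
      String value = eq < 0 ? "" : field.substring(eq + 1);
      pairs.add(new Pair(decode(name), decode(value)));
    }
    return pairs;
  }

  public static void main(String[] args) {
    for (Pair p : parse(
        "name=John+%26+Marsha+Smith&submit=Submit"))
      System.out.println(p.name + " = " + p.value);
    // prints:
    // name = John & Marsha Smith
    // submit = Submit
  }
}
```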
This program is also interesting because it demonstrates some of the pluses and minuses of C++ in contrast with Java. You’ll see some similarities; for example the class keyword. Access control has identical keywords public and private, but they’re used differently: they control a block instead of a single method or field (that is, if you say private: each following definition is private until you say public:). Also, when you create a class, all the definitions automatically default to private.
One of the reasons for using C++ here is the convenience of the C++ Standard Template Library. Among other things, the STL contains a vector class. This is a C++ template, which means that it will be configured at compile time so it will hold objects of only a particular type (in this case, Pair objects). Unlike the Java Vector, which will accept anything, the C++ vector template will cause a compile-time error message if you try to put anything but a Pair object into the vector, and when you get something out of the vector it will automatically be a Pair object, without casting. Thus, the checking happens at compile time and produces a more robust program. In addition, the program can run faster since you don’t have to perform run-time casts. The vector also overloads the operator[] so you have a convenient syntax for extracting Pair objects. The vector template will be used in the creation of CGI_vector, which you’ll see is a fairly short definition considering how powerful it is.
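The Java side of this comparison can be seen in a small sketch. VectorCastSketch and describe are hypothetical names, and the Vector is declared as Vector<Object> to stand in for the accept-anything Vector described above (later Java versions added generic declarations that move this kind of check to compile time, which the sketch deliberately does not rely on).

```java
import java.util.Vector;

// Hypothetical example, for illustration only.
class VectorCastSketch {
  // Fetching an element requires a cast that is only
  // checked when the program runs.
  static String describe(Vector<Object> items, int index) {
    try {
      String s = (String) items.elementAt(index);
      return "string: " + s;
    } catch (ClassCastException e) {
      return "not a string: " + items.elementAt(index);
    }
  }

  public static void main(String[] args) {
    Vector<Object> v = new Vector<>();
    v.addElement("name=First+Last");
    v.addElement(Integer.valueOf(42)); // accepted without complaint
    System.out.println(describe(v, 0));
    // prints string: name=First+Last
    System.out.println(describe(v, 1));
    // prints not a string: 42
  }
}
```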
On the down side, look at the complexity of the definition of Pair in the following code. Pair has more method definitions than you’re used to seeing in Java code, because the C++ programmer must know how to control copying with the copy-constructor and assignment with the overloaded operator=. As described in Chapter 12, occasionally you need to concern yourself with similar things in Java, but in C++ you must be aware of them almost constantly.
The project will start with a reusable portion, which consists of Pair and CGI_vector in a C++ header file. Technically, you shouldn’t cram this much into a header file, but for these examples it doesn’t hurt anything and it will also look more Java-like, so it will be easier for you to read:
//: CGITools.h
// Automatically extracts and decodes data
// from CGI GETs and POSTs. Tested with GNU C++
// (available for most server machines).
#include <string.h>
#include <vector> // STL vector
using namespace std;

// A class to hold a single name-value pair from
// a CGI query. CGI_vector holds Pair objects and
// returns them from its operator[].
class Pair {
  char* nm;
  char* val;
public:
  Pair() { nm = val = 0; }
  Pair(char* name, char* value) {
    // Creates new memory:
    nm = decodeURLString(name);
    val = decodeURLString(value);
  }
  const char* name() const { return nm; }
  const char* value() const { return val; }
  // Test for "emptiness"
  bool empty() const {
    return (nm == 0) || (val == 0);
  }
  // Automatic type conversion for boolean test:
  operator bool() const {
    return (nm != 0) && (val != 0);
  }
  // The following constructors & destructor are
  // necessary for bookkeeping in C++.
  // Copy-constructor:
  Pair(const Pair& p) {
    if(p.nm == 0 || p.val == 0) {
      nm = val = 0;
    } else {
      // Create storage & copy rhs values:
      nm = new char[strlen(p.nm) + 1];
      strcpy(nm, p.nm);
      val = new char[strlen(p.val) + 1];
      strcpy(val, p.val);
    }
  }
  // Assignment operator:
  Pair& operator=(const Pair& p) {
    // Nothing to do for self-assignment:
    if(this == &p) return *this;
    // Clean up old lvalues (arrays, so delete []):
    delete []nm;
    delete []val;
    if(p.nm == 0 || p.val == 0) {
      nm = val = 0;
    } else {
      // Create storage & copy rhs values:
      nm = new char[strlen(p.nm) + 1];
      strcpy(nm, p.nm);
      val = new char[strlen(p.val) + 1];
      strcpy(val, p.val);
    }
    return *this;
  }
  ~Pair() { // Destructor
    delete []nm; // 0 value OK
    delete []val;
  }
  // If you use this method outside this class,
  // you're responsible for calling 'delete []' on
  // the pointer that's returned:
  static char*
  decodeURLString(const char* URLstr) {
    int len = strlen(URLstr);
    char* result = new char[len + 1];
    // Arguments are (pointer, fill value, count):
    memset(result, 0, len + 1);
    for(int i = 0, j = 0; i <= len; i++, j++) {
      if(URLstr[i] == '+')
        result[j] = ' ';
      else if(URLstr[i] == '%') {
        result[j] =
          translateHex(URLstr[i + 1]) * 16 +
          translateHex(URLstr[i + 2]);
        i += 2; // Move past hex code
      } else // An ordinary character
        result[j] = URLstr[i];
    }
    return result;
  }
  // Translate a single hex character; used by
  // decodeURLString():
  static char translateHex(char hex) {
    if(hex >= 'A')
      return (hex & 0xdf) - 'A' + 10;
    else
      return hex - '0';
  }
};

// Parses any CGI query and turns it
// into an STL vector of Pair objects:
class CGI_vector : public vector<Pair> {
  char* qry;
  const char* start; // Save starting position
  // Prevent assignment and copy-construction:
  void operator=(CGI_vector&);
  CGI_vector(CGI_vector&);
public:
  // const fields must be initialized in the C++
  // "Constructor initializer list":
  CGI_vector(char* query) :
      start(new char[strlen(query) + 1]) {
    qry = (char*)start; // Cast to non-const
    strcpy(qry, query);
    Pair p;
    while((p = nextPair()) != 0)
      push_back(p);
  }
  // Destructor:
  ~CGI_vector() { delete []start; }
private:
  // Produces name-value pairs from the query
  // string. Returns an empty Pair when there's
  // no more query string left:
  Pair nextPair() {
    char* name = qry;
    if(name == 0 || *name == '