Java and the Internet

Bruce Eckel’s Thinking in Java


If Java is, in fact, yet another computer programming language, you may question

why it is so important and why it is being promoted as a revolutionary step in

computer programming. The answer isn’t immediately obvious if

you’re coming from a traditional programming perspective. Although Java

will solve traditional stand-alone programming problems, the reason it is

important is that it will also solve programming problems on the World Wide Web.

What is the Web?

The Web can seem a bit of a mystery at first, with all this talk of

“surfing,” “presence” and “home pages.”

There has even been a growing reaction against “Internet-mania,”

questioning the economic value and outcome of such a sweeping movement.

It’s helpful to step back and see what it really is, but to do this you

must understand client/server systems, another aspect of computing that’s

full of confusing issues.

Client/Server computing

The primary idea of a client/server system is that you have a central repository of

information – some kind of data, typically in a database – that you

want to distribute on demand to some set of people or machines. A key to the

client/server concept is that the repository of information is centrally located so

that it can be changed and so that those changes will propagate out to the

information consumers. Taken together, the information repository, the software

that distributes the information and the machine(s) where the information and

software reside is called the server.

The software that resides on the remote machine, and that communicates with the

server, fetches the information, processes it, and displays it on the remote

machine is called the client.

The basic concept of client/server computing, then, is not so complicated. The

problems arise because you have a single server trying to serve many clients at

once. Generally a database management system is involved so the designer

“balances” the layout of data into tables for optimal use. In

addition, systems often allow a client to insert new information into a server.

This means you must ensure that one client’s new data doesn’t walk

over another client’s new data, or that data isn’t lost in the

process of adding it to the database. (This is called transaction processing.)

As client software changes, it must be built, debugged and installed on the

client machines, which turns out to be more complicated and expensive than you

might think. It’s especially problematic to support multiple types of

computers and operating systems. Finally, there’s the all-important

performance issue: you might have hundreds of clients making requests of your

server at any one time, and so any small delay is crucial. To minimize latency,

programmers work hard to offload processing tasks, often to the client machine

but sometimes to other machines at the server site using so-called middleware.

(Middleware is also used to improve maintainability.)
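The concern about one client's new data walking over another's can be made concrete with a small sketch. Everything here is hypothetical: OrderTable is an in-memory stand-in for a database table, and the synchronized keyword stands in for the far richer guarantees (rollback, durability and so on) that real transaction processing provides. It shows only that when inserts are forced to happen one at a time, none are lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a server-side table of orders.
class OrderTable {
    private final List<String> rows = new ArrayList<>();

    // synchronized: each insert runs to completion before the next begins,
    // so two clients cannot corrupt or overwrite each other's new row.
    public synchronized int insert(String order) {
        rows.add(order);
        return rows.size(); // row count after this insert
    }

    public synchronized int size() {
        return rows.size();
    }

    // Simulates several clients inserting orders at the same time.
    static OrderTable simulate(int clients, int ordersPerClient) {
        OrderTable table = new OrderTable();
        List<Thread> threads = new ArrayList<>();
        for (int c = 0; c < clients; c++) {
            final int id = c;
            Thread t = new Thread(() -> {
                for (int i = 0; i < ordersPerClient; i++) {
                    table.insert("client " + id + " order " + i);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for every simulated client to finish
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // 8 clients x 1000 orders each: every insert should be present.
        System.out.println(simulate(8, 1000).size());
    }
}
```

If you remove synchronized from insert( ), the unsynchronized ArrayList may drop rows or throw an exception under the same load, which is the kind of failure the text describes.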

So the simple idea of distributing information to people has so many layers of

complexity in implementing it that the whole problem can seem hopelessly

enigmatic. And yet it’s crucial: client/server computing accounts for

roughly half of all programming activities. It’s responsible for

everything from taking orders and credit-card transactions to the distribution

of any kind of data – stock market, scientific, government – you

name it. What we’ve come up with in the past is individual solutions to

individual problems, inventing a new solution each time. These were hard to

create and hard to use and the user had to learn a new interface for each one.

The entire client/server problem needs to be solved in a big way.

The Web as a giant server

The Web is actually one giant client-server system. It’s a bit worse than

that, since you have all the servers and clients coexisting on a single network

at once. You don’t need to know that, since all you care about is

connecting to and interacting with one server at a time (even though you might

be hopping around the world in your search for the correct server).

Initially it was a simple one-way process. You made a request of a server and it handed

you a file, which your machine’s browser software (i.e. the client) would

interpret by formatting onto your local machine. But in short order people

began wanting to do more than just deliver pages from a server. They wanted

full client/server capability so that the client could feed information back to

the server, for example, to do database lookups on the server, to add new

information to the server or to place an order (which required more security

than the original systems offered). These are the changes we’ve been

seeing in the development of the Web.

The Web browser was a big step forward: the concept that one piece of information

could be displayed on any type of computer without change. However, browsers

were still rather primitive and rapidly bogged down by the demands placed on

them. They weren’t particularly interactive and tended to clog up both

the server and the Internet because any time you needed to do something that

required programming you had to send information back to the server to be

processed. It could take many seconds or minutes to find out you had misspelled

something in your request. Since the browser was just a viewer it

couldn’t perform even the simplest computing tasks. (On the other hand,

it was safe, since it couldn’t execute any programs on your local machine

that contained bugs or viruses.)

To solve this problem, different approaches have been taken. To begin with,

graphics standards have been enhanced to allow better animation and video

within browsers. The remainder of the problem can be solved only by

incorporating the ability to run programs on the client end, under the browser.

This is called client-side programming.

Client-side programming [8]

The Web’s initial server-browser design provided for interactive content, but

the interactivity was completely provided by the server. The server produced

static pages for the client browser, which would simply interpret and display

them. Basic HTML contains simple mechanisms for data gathering: text-entry

boxes, check boxes, radio boxes, lists and drop-down lists, as well as a button

that can only be programmed to reset the data on the form or

“submit” the data on the form back to the server. This submission

passes through the Common Gateway Interface (CGI) provided on all Web servers.

The text within the submission tells CGI

what to do with it. The most common action is to run a program located on the

server in a directory that’s typically called “cgi-bin.” (If

you watch the address window at the top of your browser when you push a button

on a Web page, you can sometimes see “cgi-bin” within all the

gobbledygook there.) These programs can be written in most languages. Perl is a

common choice because it is designed for text manipulation and is interpreted,

so it can be installed on any server regardless of processor or operating system.
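To make the submission step less abstract, here is a minimal sketch of the first thing a CGI program typically does: pull the form data apart into name/value pairs. It is written in Java rather than Perl only to stay in one language for this book. The class name FormData and the sample field names are made up for illustration; the QUERY_STRING environment variable is where CGI conventionally places the data from a GET submission.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decodes submitted form data such as
// "name=Bruce+Eckel&lang=Java" into name/value pairs.
class FormData {
    static Map<String, String> parse(String query) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return fields; // nothing was submitted
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            // Browsers encode spaces as '+' and other characters as %XX.
            fields.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Under a Web server, the form data for a GET request arrives here.
        String query = System.getenv("QUERY_STRING");
        if (query == null) {
            query = "name=Bruce+Eckel&lang=Java"; // sample data for a local run
        }
        System.out.println(parse(query));
    }
}
```

Note that every one of these steps happens on the server, after the data has already made the trip across the network.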

Many powerful Web sites today are built strictly on CGI, and you can in fact do

nearly anything with it. The problem is response time. The response of a CGI

program depends on how much data must be sent as well as the load on both the

server and the Internet. (On top of this, starting a CGI program tends to be

slow.) The initial designers of the Web did not foresee how rapidly this

bandwidth would be exhausted for the kinds of applications people developed.

For example, any sort of dynamic graphing is nearly impossible to perform with

consistency because a GIF file must be created and moved from the server to the

client for each version of the graph. And you’ve no doubt had direct

experience with something as simple as validating the data on an input form.

You press the submit button on a page; the data is shipped back to the server;

the server starts a CGI program that discovers an error, formats an HTML page

informing you of the error and sends the page back to you; you must then back

up a page and try again. Not only is this slow, it’s not elegant.

The solution is client-side programming. Most machines that run Web browsers are

powerful engines capable of doing vast work, and with the original static HTML

approach they are sitting there, just idly waiting for the server to dish up

the next page. Client-side programming means that the Web browser is harnessed

to do whatever work it can, and the result for the user is a much speedier and

more interactive experience at your Web site.

The problem with discussions of client-side programming is that they aren’t

very different from discussions of programming in general. The parameters are

almost the same, but the platform is different: a Web browser is like a limited

operating system. In the end, it’s still programming and this accounts

for the dizzying array of problems and solutions produced by client-side

programming. The rest of this section provides an overview of the issues and

approaches in client-side programming.

Plug-ins

One of the most significant steps forward in client-side programming is the

development of the plug-in. This is a way for a programmer to add new

functionality to the browser by downloading a piece of code that plugs itself

into the appropriate spot in the browser. It tells the browser “from now

on you can perform this new activity.” (You need to download the plug-in

only once.) Some fast and powerful behavior is added to browsers via plug-ins,

but writing a plug-in is not a trivial task and isn’t something

you’d want to do as part of the process of building a particular site.

The value of the plug-in for client-side programming is that it allows an

expert programmer to develop a new language and add that language to a browser

without the permission of the browser manufacturer.

Thus, plug-ins provide the back door that allows the creation of new

client-side programming languages (although not all languages are implemented

as plug-ins).

Scripting languages

Plug-ins resulted in an explosion of scripting languages. With a scripting language you

embed the source code for your client-side program directly into the HTML page

and the plug-in that interprets that language is automatically activated while

the HTML page is being displayed. Scripting languages tend to be reasonably

simple to understand, and because they are simply text that is part of an HTML

page they load very quickly as part of the single server hit required to

procure that page. The trade-off is that your code is exposed for everyone to

see (and steal) but generally you aren’t doing amazingly sophisticated

things with scripting languages so it’s not too much of a hardship.

This points out that scripting languages are really intended to solve specific types

of problems, primarily the creation of richer and more interactive graphical

user interfaces (GUIs). However, a scripting language might solve 80 percent of

the problems encountered in client-side programming. Your problems might very

well fit completely within that 80 percent, and since scripting languages tend

to be easier and faster to develop, you should probably consider a scripting

language before looking at a more involved solution such as Java or ActiveX

programming.

The most commonly discussed scripting languages are JavaScript (which has nothing

to do with Java; it’s named that way just to grab some of Java’s

marketing momentum), VBScript (which looks like Visual Basic) and Tcl/Tk, which

comes from the popular cross-platform GUI-building language. There are others

out there and no doubt more in development.

JavaScript is probably the most commonly supported. It comes built into both Netscape

Navigator and the Microsoft Internet Explorer (IE). In addition, there are

probably more JavaScript books out than for the other languages, and some tools

automatically create pages using JavaScript. However, if you’re already

fluent in Visual Basic or Tcl/Tk, you’ll be more productive using those

scripting languages rather than learning a new one. (You’ll have your

hands full dealing with the Web issues already.)

Java

If a scripting language can solve 80 percent of the client-side programming

problems, what about the other 20 percent – the “really hard

stuff?” The most popular solution today is Java. Not only is it a

powerful programming language built to be secure, cross-platform and

international, but Java is being continuously extended to provide language

features and libraries that elegantly handle problems that are difficult in

traditional programming languages, such as multithreading, database access,

network programming and distributed computing. Java allows client-side

programming via the applet.

An applet is a mini-program that will run only under a Web browser. The applet is

downloaded automatically as part of a Web page (just as, for example, a graphic

is automatically downloaded). When the applet is activated it executes a

program. This is part of its beauty – it provides you with a way to

automatically distribute the client software from the server at the time the

user needs the client software, and no sooner. They get the latest version of

the client software without fail and without difficult re-installation. Because

of the way Java is designed, the programmer needs to create only a single

program, and that program automatically works with all computers that have

browsers with built-in Java interpreters. (This safely includes the vast

majority of machines.) Since Java is a full-fledged programming language, you

can do as much work as possible on the client before and after making requests

of the server. For example, you won’t need to send a request form across

the Internet to discover that you’ve gotten a date or some other

parameter wrong, and your client computer can quickly do the work of plotting

data instead of waiting for the server to make a plot and ship a graphic image

back to you. Not only do you get the immediate win of speed and responsiveness,

but the general network traffic and load upon servers can be reduced,

preventing the entire Internet from slowing down.
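The date example can be sketched in a few lines. This is plain Java rather than an applet, and the class name DateField and the yyyy-MM-dd input format are assumptions made for illustration, but it shows the kind of check that client-side code can run before anything is sent to the server.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical client-side check for a date typed into a form field.
class DateField {
    // Returns true only when text is a real calendar date in ISO form
    // (yyyy-MM-dd); impossible dates such as February 30 are rejected.
    static boolean isValid(String text) {
        if (text == null) {
            return false;
        }
        try {
            LocalDate.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("1998-02-28")); // true
        System.out.println(isValid("1998-02-30")); // false: no such day
        System.out.println(isValid("next week"));  // false: not a date
    }
}
```

The user finds out about the mistake immediately, and the server never sees the bad request at all.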

One advantage a Java applet has over a scripted program is that it’s in

compiled form, so the source code isn’t available to the client. On the

other hand, a Java applet can be decompiled without too much trouble, and

hiding your code is often not an important issue anyway. Two other factors can

be important. As you will see later in the book, a compiled Java applet can

comprise many modules and take multiple server “hits” (accesses) to

download. (In Java 1.1

this is minimized by Java archives, called JAR files, that allow all the
required modules to be packaged together for a single download.) A scripted
program will just be integrated into the Web page as part of its text (and will
generally be smaller and reduce server hits). This could be important to the
responsiveness of your Web site. Another factor is the all-important learning
curve. Regardless of what you’ve heard, Java is not a trivial language to
learn. If you’re a Visual Basic programmer, moving to VBScript will be
your fastest solution and since it will probably solve most typical
client/server problems you might be hard pressed to justify learning Java. If
you’re experienced with a scripting language you will certainly benefit
from looking at JavaScript or VBScript before committing to Java, since they
might fit your needs handily and you’ll be more productive sooner.

ActiveX

To some degree, the competitor to Java is Microsoft’s ActiveX, although it

takes a completely different approach. ActiveX was originally a Windows-only

solution, although it is now being developed via an independent consortium to

become cross-platform. Effectively, ActiveX says “if your program

connects to its environment just so, it can be dropped into a Web page and run

under a browser that supports ActiveX.” (IE directly supports ActiveX and

Netscape does so using a plug-in.) Thus, ActiveX does not constrain you to a

particular language. If, for example, you’re already an experienced

Windows programmer using a language such as C++, Visual Basic, or

Borland’s Delphi, you can create ActiveX components with almost no

changes to your programming knowledge. ActiveX also provides a path for the use

of legacy code in your Web pages.

Security

Automatically downloading and running programs across the Internet can sound like a

virus-builder’s dream. ActiveX especially brings up the thorny issue of

security in client-side programming. If you click on a Web site, you might

automatically download any number of things along with the HTML page: GIF

files, script code, compiled Java code, and ActiveX components. Some of these

are benign; GIF files can’t do any harm, and scripting languages are

generally limited in what they can do. Java was also designed to run its

applets within a “sandbox” of safety, which prevents it from

writing to disk or accessing memory outside the sandbox.

ActiveX is at the opposite end of the spectrum. Programming with ActiveX is like

programming Windows – you can do anything you want. So if you click on a

page that downloads an ActiveX component, that component might cause damage to

the files on your disk. Of course, programs that you load onto your computer

that are not restricted to running inside a Web browser can do the same thing.

Viruses downloaded from Bulletin-Board Systems (BBSs) have long been a problem,

but the speed of the Internet amplifies the difficulty.

The solution seems to be “digital signatures,” whereby code is verified

to show who the author is. This is based on the idea that a virus works because

its creator can be anonymous, so if you remove the anonymity individuals will

be forced to be responsible for their actions. This seems like a good plan

because it allows programs to be much more functional, and I suspect it will

eliminate malicious mischief. If, however, a program has an unintentional bug

that’s destructive it will still cause problems.

The Java approach is to prevent these problems from occurring, via the sandbox. The

Java interpreter that lives on your local Web browser examines the applet for

any untoward instructions as the applet is being loaded. In particular, the

applet cannot write files to disk or erase files (one of the mainstays of the

virus). Applets are generally considered to be safe, and since this is

essential for reliable client-server systems, any bugs that allow viruses are

rapidly repaired. (It’s worth noting that the browser software actually

enforces these security restrictions, and some browsers allow you to select

different security levels to provide varying degrees of access to your system.)

You might be skeptical of this rather draconian restriction against writing files

to your local disk. For example, you may want to build a local database or save

data for later use offline. The initial vision seemed to be that eventually

everyone would be online to do anything important, but that was soon seen to be

impractical (although low-cost “Internet appliances” might someday

satisfy the needs of a significant segment of users). The solution is the

“signed applet” that uses public-key encryption to verify that an

applet does indeed come from where it claims it does. A signed applet can then

go ahead and trash your disk, but the theory is that since you can now hold the

applet creator accountable they won’t do vicious things. Java 1.1

provides a framework for digital signatures so that you will eventually be able
to allow an applet to step outside the sandbox if necessary.
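The public-key idea behind a signed applet can be sketched with the general-purpose java.security classes found in current JDKs. This does not reproduce the applet-signing framework itself; the class name SignedCodeDemo, the sample bytes and the choice of RSA with SHA-256 are all assumptions made for illustration. The author signs the code with a private key, and anyone holding the matching public key can check that the bytes they received are the bytes that were signed.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

// Hypothetical demonstration of signing code and verifying what arrives.
class SignedCodeDemo {
    // Signs `published` with a fresh private key, then checks `received`
    // against that signature using the matching public key.
    static boolean arrivesIntact(byte[] published, byte[] received) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair author = generator.generateKeyPair();

            // The author's side: produce a signature over the code.
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(author.getPrivate());
            signer.update(published);
            byte[] signature = signer.sign();

            // The user's side: verify the downloaded bytes.
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(author.getPublic());
            verifier.update(received);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] code = "applet bytes".getBytes(StandardCharsets.UTF_8);
        byte[] tampered = "applet bytez".getBytes(StandardCharsets.UTF_8);
        System.out.println(arrivesIntact(code, code));     // true
        System.out.println(arrivesIntact(code, tampered)); // false
    }
}
```

Notice what the check does and does not tell you: it establishes who signed the code and that the code wasn't altered, but it says nothing about whether the code is well behaved.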

Digital signatures have missed an important issue, which is the speed at which people move

around on the Internet. If you download a buggy program and it does something

untoward, how long will it be before you discover the damage? It could be days

or even weeks. And by then, how will you track down the program that’s

done it (and what good will it do at that point?).

Internet vs. Intranet

The Web is the most general solution to the client/server problem, so it makes

sense that you can use the same technology to solve a subset of the problem, in

particular the classic client/server problem within a company. With traditional

client/server approaches you have the problem of multiple different types of

client computers, as well as the difficulty of installing new client software,

both of which are handily solved with Web browsers and client-side programming.

When Web technology is used for an information network that is restricted to a

particular company, it is referred to as an Intranet.

Intranets provide much greater security than the Internet, since you can

physically control access to the servers within your company. In terms of

training, it seems that once people understand the general concept of a browser

it’s much easier for them to deal with differences in the way pages and

applets look, so the learning curve for new kinds of systems seems to be reduced.

The security problem brings us to one of the divisions that seems to be

automatically forming in the world of client-side programming. If your program

is running on the Internet, you don’t know what platform it will be

working under and you want to be extra careful that you don’t disseminate

buggy code. You need something cross-platform and secure, like a scripting

language or Java.

If you’re running on an Intranet, you might have a different set of

constraints. It’s not uncommon that your machines could all be

Intel/Windows platforms. On an Intranet, you’re responsible for the

quality of your own code and can repair bugs when they’re discovered. In

addition, you might already have a body of legacy code that you’ve been

using in a more traditional client/server approach, whereby you must physically

install client programs every time you do an upgrade. The time wasted in

installing upgrades is the most compelling reason to move to browsers because

upgrades are invisible and automatic. If you are involved in such an Intranet,

the most sensible approach to take is ActiveX rather than trying to recode your

programs in a new language.

When faced with this bewildering array of solutions to the client-side programming

problem, the best plan of attack is a cost-benefit analysis. Consider the

constraints of your problem and what would be the fastest way to get to your

solution. Since client-side programming is still programming, it’s always

a good idea to take the fastest development approach for your particular

situation. This is an aggressive stance to prepare for inevitable encounters

with the problems of program development.

Server-side programming

This whole discussion has ignored the issue of server-side programming. What happens

when you make a request of a server? Most of the time the request is simply

“send me this file.” Your browser then interprets the file in some

appropriate fashion: as an HTML page, a graphic image, a Java applet, a script

program, etc. A more complicated request to a server generally involves a

database transaction. A common scenario involves a request for a complex

database search, which the server then formats into an HTML page and sends to

you as the result. (Of course, if the client has more intelligence via Java or

a scripting language, the raw data can be sent and formatted at the client end,

which will be faster and less load on the server.) Or you might want to

make a request that will involve changes to that database. These database requests must be

processed via some code on the server side, which is generally referred to as

server-side programming.
Traditionally, server-side programming has been performed using Perl and CGI
scripts, but more sophisticated systems have been appearing. These include
Java-based Web servers that allow you to perform all your server-side
programming in Java by writing what are called
servlets.
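The "format the search results into an HTML page" step described above might look something like the following sketch. It deliberately avoids the servlet API and any real database; the class name ResultPage, the page title and the sample rows are invented for illustration, and the rows are assumed to have already come back from the search.

```java
import java.util.List;

// Hypothetical server-side helper that turns search results into a page.
class ResultPage {
    // Replaces characters that would otherwise be read as HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds a complete HTML page listing one result per bullet.
    static String render(String title, List<String> rows) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>").append(escape(title))
            .append("</title></head><body><ul>");
        for (String row : rows) {
            html.append("<li>").append(escape(row)).append("</li>");
        }
        html.append("</ul></body></html>");
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Books", List.of("Thinking in Java", "C++ & Java")));
    }
}
```

Whether this work is done by a CGI program or by Java code running inside the server, the job is the same: take the raw data and hand the browser something it knows how to display.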

A separate arena: applications

Most of the brouhaha over Java has been about applets. Java is actually a

general-purpose programming language that can solve any type of problem, at

least in theory. And as pointed out previously, there might be more effective

ways to solve most client/server problems. When you move out of the applet

arena (and simultaneously release the restrictions, such as the one against

writing to disk) you enter the world of general-purpose applications that run

standalone, without a Web browser, just like any ordinary program does. Here,

Java’s strength is not only in its portability, but also its

programmability. As you’ll see throughout this book, Java has many

features that allow you to create robust programs in a shorter period than with

previous programming languages.

Be aware that this is a mixed blessing. You pay for the improvements through

slower execution speed (although there is significant work going on in this

area). Like any language, Java has built-in limitations that might make it

inappropriate to solve certain types of programming problems. Java is a

rapidly-evolving language, however, and as each new release comes out it

becomes more and more attractive for solving larger sets of problems.

[8] The material in this section is adapted from an article by the author that

originally appeared on Mainspring, at www.mainspring.com. Used with permission.
