Text processing

Bruce Eckel's Thinking in Java

Extracting code listings

You’ve no doubt noticed that each complete code listing (not code fragment) in this book begins and ends with special comment tag marks ‘//:’ and ‘///:~’. This meta-information is included so that the code can be automatically extracted from the book into compilable source-code files. In my previous book, I had a system that allowed me to automatically incorporate tested code files into the book. In this book, however, I discovered that it was often easier to paste the code into the book once it was initially tested and, since it’s hard to get right the first time, to perform edits to the code within the book. But how to extract it and test the code? This program is the answer, and it could come in handy when you set out to solve a text processing problem. It also demonstrates many of the String class features.

Why bother with the packed file? Because different computer platforms have different ways of storing text information in files. A big issue is the end-of-line character or characters, but other issues can also exist. However, Java has a special type of IO stream – the DataOutputStream – which promises that, regardless of what machine the data is coming from, the storage of that data will be in a form that can be correctly retrieved by any other machine by using a DataInputStream. That is, Java handles all of the platform-specific details, which is a large part of the promise of Java. So the -p flag stores everything into a single file in a universal format. You download this file and the Java program from the Web, and when you run CodePackager on this file without the -p flag the files will all be extracted to appropriate places on your system. (You can specify an alternate subdirectory; otherwise the subdirectories will just be created in the current directory.) To ensure that no system-specific formats remain, File objects are used everywhere a path or a file is described. In addition, there’s a sanity check: an empty file is placed in each subdirectory; the name of that file indicates how many files you should find in that subdirectory.
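The portability promise of DataOutputStream and DataInputStream can be seen in a small round trip. The following is only a sketch and is not taken from CodePackager (which writes its text with writeBytes( ) and reads it back line by line); the class name PortableRoundTrip and the sample values are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example class, not part of the book's program.
class PortableRoundTrip {
  // Writes an int and a String in DataOutputStream's fixed format,
  // then reads them back with DataInputStream.
  static String roundTrip(int count, String name) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(count); // Always four bytes, high byte first
      out.writeUTF(name);  // Length-prefixed modified UTF-8
      out.close();
      DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(bytes.toByteArray()));
      int c = in.readInt();
      String n = in.readUTF();
      in.close();
      return c + ":" + n;
    } catch(IOException e) {
      throw new UncheckedIOException(e);
    }
  }
  public static void main(String[] args) {
    System.out.println(roundTrip(3, "c17")); // prints 3:c17
  }
}
```

Because the byte layout is fixed by the stream classes rather than by the machine, the same bytes read back to the same values on any platform.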

//: CodePackager.java
// "Packs" and "unpacks" the code in "Thinking 
// in Java" for cross-platform distribution.
/* Commented so CodePackager sees it and starts
   a new chapter directory, but so you don't 
   have to worry about the directory where this
   program lives:
package c17;
*/
import java.util.*;
import java.io.*;
 
class Pr {
  static void error(String e) {
    System.err.println("ERROR: " + e);
    System.exit(1);
  }
}
 
class IO {
  static BufferedReader disOpen(File f) {
    BufferedReader in = null;
    try {
      in = new BufferedReader(
        new FileReader(f));
    } catch(IOException e) {
      Pr.error("could not open " + f);
    }
    return in;
  }
  static BufferedReader disOpen(String fname) {
    return disOpen(new File(fname));
  }
  static DataOutputStream dosOpen(File f) {
    DataOutputStream in = null;
    try {
      in = new DataOutputStream(
        new BufferedOutputStream(
          new FileOutputStream(f)));
    } catch(IOException e) {
      Pr.error("could not open " + f);
    }
    return in;
  }
  static DataOutputStream dosOpen(String fname) {
    return dosOpen(new File(fname));
  }
  static PrintWriter psOpen(File f) {
    PrintWriter in = null;
    try {
      in = new PrintWriter(
        new BufferedWriter(
          new FileWriter(f)));
    } catch(IOException e) {
      Pr.error("could not open " + f);
    }
    return in;
  }
  static PrintWriter psOpen(String fname) {
    return psOpen(new File(fname));
  }
  static void close(Writer os) {
    try {
      os.close();
    } catch(IOException e) {
      Pr.error("closing " + os);
    }
  }
  static void close(DataOutputStream os) {
    try {
      os.close();
    } catch(IOException e) {
      Pr.error("closing " + os);
    }
  }
  static void close(Reader os) {
    try {
      os.close();
    } catch(IOException e) {
      Pr.error("closing " + os);
    }
  }
}
 
class SourceCodeFile {
  public static final String 
    startMarker = "//:", // Start of source file
    endMarker = "} ///:~", // End of source
    endMarker2 = "}; ///:~", // C++ file end
    beginContinue = "} ///:Continued",
    endContinue = "///:Continuing",
    packMarker = "###", // Packed file header tag
    eol = // Line separator on current system
      System.getProperty("line.separator"),
    filesep = // System's file path separator
      System.getProperty("file.separator");
  public static String copyright = "";
  static {
    try {
      BufferedReader cr =
        new BufferedReader(
          new FileReader("Copyright.txt"));
      String crin;
      while((crin = cr.readLine()) != null)
        copyright += crin + "\n";
      cr.close();
    } catch(Exception e) {
      copyright = "";
    }
  }
  private String filename, dirname,
    contents = new String();
  private static String chapter = "c02";
  // The file name separator from the old system:
  public static String oldsep;
  public String toString() {
    return dirname + filesep + filename;
  }
  // Constructor for parsing from document file:
  public SourceCodeFile(String firstLine, 
      BufferedReader in) {
    dirname = chapter;
    // Skip past marker:
    filename = firstLine.substring(
        startMarker.length()).trim();
    // Find space that terminates file name:
    if(filename.indexOf(' ') != -1)
      filename = filename.substring(
          0, filename.indexOf(' '));
    System.out.println("found: " + filename);
    contents = firstLine + eol;
    if(copyright.length() != 0)
      contents += copyright + eol;
    String s;
    boolean foundEndMarker = false;
    try {
      while((s = in.readLine()) != null) {
        if(s.startsWith(startMarker))
          Pr.error("No end of file marker for " +
            filename);
        // For this program, no spaces before 
        // the "package" keyword are allowed
        // in the input source code:
        else if(s.startsWith("package")) {
          // Extract package name:
          String pdir = s.substring(
            s.indexOf(' ')).trim();
          pdir = pdir.substring(
            0, pdir.indexOf(';')).trim();
          // Capture the chapter from the package
          // ignoring the 'com' subdirectories:
          if(!pdir.startsWith("com")) {
            int firstDot = pdir.indexOf('.');
            if(firstDot != -1)
              chapter = 
                pdir.substring(0,firstDot);
            else
              chapter = pdir;
          }
          // Convert package name to path name:
          pdir = pdir.replace(
            '.', filesep.charAt(0));
          System.out.println("package " + pdir);
          dirname = pdir;
        }
        contents += s + eol;
        // Move past continuations:
        if(s.startsWith(beginContinue))
          while((s = in.readLine()) != null)
            if(s.startsWith(endContinue)) {
              contents += s + eol;
              break;
            }
        // Watch for end of code listing:
        if(s.startsWith(endMarker) ||
           s.startsWith(endMarker2)) {
          foundEndMarker = true;
          break;
        }
      }
      if(!foundEndMarker)
        Pr.error(
          "End marker not found before EOF");
      System.out.println("Chapter: " + chapter);
    } catch(IOException e) {
      Pr.error("Error reading line");
    }
  }
  // For recovering from a packed file:
  public SourceCodeFile(BufferedReader pFile) {
    try {
      String s = pFile.readLine();
      if(s == null) return;
      if(!s.startsWith(packMarker))
        Pr.error("Can't find " + packMarker
          + " in " + s);
      s = s.substring(
        packMarker.length()).trim();
      dirname = s.substring(0, s.indexOf('#'));
      filename = s.substring(s.indexOf('#') + 1);
      dirname = dirname.replace(
        oldsep.charAt(0), filesep.charAt(0));
      filename = filename.replace(
        oldsep.charAt(0), filesep.charAt(0));
      System.out.println("listing: " + dirname 
        + filesep + filename);
      while((s = pFile.readLine()) != null) {
        // Watch for end of code listing:
        if(s.startsWith(endMarker) ||
           s.startsWith(endMarker2)) {
          contents += s;
          break;
        }
        contents += s + eol;
      }
    } catch(IOException e) {
      System.err.println("Error reading line");
    }
  }
  public boolean hasFile() { 
    return filename != null; 
  }
  public String directory() { return dirname; }
  public String filename() { return filename; }
  public String contents() { return contents; }
  // To write to a packed file:
  public void writePacked(DataOutputStream out) {
    try {
      out.writeBytes(
        packMarker + dirname + "#" 
        + filename + eol);
      out.writeBytes(contents);
    } catch(IOException e) {
      Pr.error("writing " + dirname + 
        filesep + filename);
    }
  }
  // To generate the actual file:
  public void writeFile(String rootpath) {
    File path = new File(rootpath, dirname);
    path.mkdirs();
    PrintWriter p =
      IO.psOpen(new File(path, filename));
    p.print(contents);
    IO.close(p);
  }
}
 
class DirMap {
  private Hashtable t = new Hashtable();
  private String rootpath;
  DirMap() {
    rootpath = System.getProperty("user.dir");
  }
  DirMap(String alternateDir) {
    rootpath = alternateDir;
  }
  public void add(SourceCodeFile f){
    String path = f.directory();
    if(!t.containsKey(path))
      t.put(path, new Vector());
    ((Vector)t.get(path)).addElement(f);
  }
  public void writePackedFile(String fname) {
    DataOutputStream packed = IO.dosOpen(fname);
    try {
      packed.writeBytes("###Old Separator:" +
        SourceCodeFile.filesep + "###\n");
    } catch(IOException e) {
      Pr.error("Writing separator to " + fname);
    }
    Enumeration e = t.keys();
    while(e.hasMoreElements()) {
      String dir = (String)e.nextElement();
      System.out.println(
        "Writing directory " + dir);
      Vector v = (Vector)t.get(dir);
      for(int i = 0; i < v.size(); i++) {
        SourceCodeFile f = 
          (SourceCodeFile)v.elementAt(i);
        f.writePacked(packed);
      }
    }
    IO.close(packed);
  }
  // Write all the files in their directories:
  public void write() {
    Enumeration e = t.keys();
    while(e.hasMoreElements()) {
      String dir = (String)e.nextElement();
      Vector v = (Vector)t.get(dir);
      for(int i = 0; i < v.size(); i++) {
        SourceCodeFile f = 
          (SourceCodeFile)v.elementAt(i);
        f.writeFile(rootpath);
      }
      // Add file indicating file quantity
      // written to this directory as a check:
      IO.close(IO.dosOpen(
        new File(new File(rootpath, dir),
          Integer.toString(v.size())+".files")));
    }
  }
}
 
public class CodePackager {
  private static final String usageString =
  "usage: java CodePackager packedFileName" +
  "\nExtracts source code files from packed \n" +
  "version of Tjava.doc sources into " +
  "directories off current directory\n" +
  "java CodePackager packedFileName newDir\n" +
  "Extracts into directories off newDir\n" +
  "java CodePackager -p source.txt packedFile" +
  "\nCreates packed version of source files" +
  "\nfrom text version of Tjava.doc";
  private static void usage() {
    System.err.println(usageString);
    System.exit(1);
  }
  public static void main(String[] args) {
    if(args.length == 0) usage();
    if(args[0].equals("-p")) {
      if(args.length != 3)
        usage();
      createPackedFile(args);
    }
    else {
      if(args.length > 2)
        usage();
      extractPackedFile(args);
    }
  }
  private static String currentLine; 
  private static BufferedReader in;
  private static DirMap dm;
  private static void 
  createPackedFile(String[] args) {
    dm = new DirMap();
    in = IO.disOpen(args[1]);
    try {
      while((currentLine = in.readLine()) 
          != null) {
        if(currentLine.startsWith(
            SourceCodeFile.startMarker)) {
          dm.add(new SourceCodeFile(
                   currentLine, in));
        }
        else if(currentLine.startsWith(
            SourceCodeFile.endMarker))
          Pr.error("file has no start marker");
        // Else ignore the input line
      }
    } catch(IOException e) {
      Pr.error("Error reading " + args[1]);
    }
    IO.close(in);
    dm.writePackedFile(args[2]);
  }
  private static void 
  extractPackedFile(String[] args) {
    if(args.length == 2) // Alternate directory
      dm = new DirMap(args[1]);
    else // Current directory
      dm = new DirMap();
    in = IO.disOpen(args[0]);
    String s = null;
    try {
       s = in.readLine();
    } catch(IOException e) {
      Pr.error("Cannot read from " + in);
    }
    // Capture the separator used in the system
    // that packed the file:
    // (s is null if the packed file is empty)
    if(s != null &&
       s.indexOf("###Old Separator:") != -1) {
      String oldsep = s.substring(
        "###Old Separator:".length());
      oldsep = oldsep.substring(
        0, oldsep.indexOf('#'));
      SourceCodeFile.oldsep = oldsep;
    }
    SourceCodeFile sf = new SourceCodeFile(in);
    while(sf.hasFile()) {
      dm.add(sf);
      sf = new SourceCodeFile(in);
    }
    dm.write();
  }
} ///:~ 

You’ll first notice the package statement that is commented out. Since this is the first program in the chapter, the package statement is necessary to tell CodePackager that the chapter has changed, but putting it in a package would be a problem. When you create a package, you tie the resulting program to a particular directory structure, which is fine for most of the examples in this book. Here, however, the CodePackager program must be compiled and run from an arbitrary directory, so the package statement is commented out. It will still look like an ordinary package statement to CodePackager, though, since the program isn’t sophisticated enough to detect multi-line comments. (It has no need for such sophistication, a fact that comes in handy here.)

The first two classes are support/utility classes designed to make the rest of the program more consistent to write and easier to read. The first, Pr, is similar to the ANSI C library perror, since it prints an error message (but also exits the program). The second class encapsulates the creation of files, a process that was shown in Chapter 10 as one that rapidly becomes verbose and annoying. In Chapter 10, the proposed solution created new classes, but here static method calls are used. Within those methods the appropriate exceptions are caught and dealt with. These methods make the rest of the code much cleaner to read.

The first class that helps solve the problem is SourceCodeFile, which represents all the information (including the contents, file name, and directory) for one source code file in the book. It also contains a set of String constants representing the markers that start and end a file, a marker used inside the packed file, the current system’s end-of-line separator and file path separator (notice the use of System.getProperty( ) to get the local version), and a copyright notice, which is extracted from the following file Copyright.txt.

//////////////////////////////////////////////////
// Copyright (c) Bruce Eckel, 1998
// Source code file from the book "Thinking in Java"
// All rights reserved EXCEPT as allowed by the
// following statements: You may freely use this file
// for your own work (personal or commercial),
// including modifications and distribution in
// executable form only. Permission is granted to use
// this file in classroom situations, including its
// use in presentation materials, as long as the book
// "Thinking in Java" is cited as the source. 
// Except in classroom situations, you may not copy
// and distribute this code; instead, the sole
// distribution point is http://www.BruceEckel.com 
// (and official mirror sites) where it is
// freely available. You may not remove this
// copyright and notice. You may not distribute
// modified versions of the source code in this
// package. You may not use this file in printed
// media without the express permission of the
// author. Bruce Eckel makes no representation about
// the suitability of this software for any purpose.
// It is provided "as is" without express or implied
// warranty of any kind, including any implied
// warranty of merchantability, fitness for a
// particular purpose or non-infringement. The entire
// risk as to the quality and performance of the
// software is with you. Bruce Eckel and the
// publisher shall not be liable for any damages
// suffered by you or any third party as a result of
// using or distributing software. In no event will
// Bruce Eckel or the publisher be liable for any
// lost revenue, profit, or data, or for direct,
// indirect, special, consequential, incidental, or
// punitive damages, however caused and regardless of
// the theory of liability, arising out of the use of
// or inability to use software, even if Bruce Eckel
// and the publisher have been advised of the
// possibility of such damages. Should the software
// prove defective, you assume the cost of all
// necessary servicing, repair, or correction. If you
// think you've found an error, please email all
// modified files with clearly commented changes to:
// Bruce@EckelObjects.com. (please use the same
// address for non-code errors found in the book).
//////////////////////////////////////////////////

When extracting files from a packed file, the file separator of the system that packed the file is also noted, so it can be replaced with the correct one for the local system.
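The separator swap amounts to a single character replacement, as in the second SourceCodeFile constructor. Here is a minimal sketch of that step in isolation; the class SeparatorSwap and its method toLocal( ) are hypothetical names used only for this illustration.

```java
// Hypothetical example class, not part of the book's program.
class SeparatorSwap {
  // Replaces the separator of the system that packed the file
  // with the separator of the system doing the extraction.
  static String toLocal(String path,
      String oldSep, String localSep) {
    return path.replace(
      oldSep.charAt(0), localSep.charAt(0));
  }
  public static void main(String[] args) {
    String local = System.getProperty("file.separator");
    // A path as it might have been recorded on Windows:
    System.out.println(
      toLocal("com\\bruceeckel\\tools", "\\", local));
  }
}
```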

The subdirectory name for the current chapter is kept in the field chapter, which is initialized to c02. (You’ll notice that the listing in Chapter 2 doesn’t contain a package statement.) The only time that the chapter field changes is when a package statement is discovered in the current file.
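The way the chapter is derived from a package statement can be tried out on its own. The sketch below follows the string operations of the first SourceCodeFile constructor, but the class ChapterFromPackage and its method names are hypothetical, and the package names in main( ) are sample input.

```java
// Hypothetical example class, not part of the book's program.
class ChapterFromPackage {
  // Extracts "c05.music" from "package c05.music;"
  static String packageName(String line) {
    String p = line.substring(line.indexOf(' ')).trim();
    return p.substring(0, p.indexOf(';')).trim();
  }
  // Returns the new chapter directory. Packages that start
  // with "com" leave the current chapter unchanged.
  static String chapter(String pkg, String current) {
    if(pkg.startsWith("com")) return current;
    int firstDot = pkg.indexOf('.');
    return firstDot == -1 ?
      pkg : pkg.substring(0, firstDot);
  }
  public static void main(String[] args) {
    String pkg = packageName("package c05.music;");
    System.out.println(pkg);                 // prints c05.music
    System.out.println(chapter(pkg, "c02")); // prints c05
  }
}
```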

Building a packed file

Extracting from a packed file
The second constructor is used to recover the source code files from a packed file. Here, the calling method doesn’t have to worry about skipping over the intermediate text. The file contains all the source-code files, placed end-to-end. All you need to hand to this constructor is the BufferedReader where the information is coming from, and the constructor takes it from there. There is some meta-information, however, at the beginning of each listing, and this is denoted by the packMarker. If the packMarker isn’t there, it means the caller is mistakenly trying to use this constructor where it isn’t appropriate.
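To make the header format concrete, here is a sketch that splits one header line of the form written by writePacked( ), that is, the packMarker followed by the directory, a ‘#’, and the file name. The class PackHeader is a hypothetical name, and it throws an exception where the real program calls Pr.error( ).

```java
// Hypothetical example class, not part of the book's program.
class PackHeader {
  static final String PACK_MARKER = "###";
  // Returns { directory, file name } from a header line
  // such as "###c17#CodePackager.java".
  static String[] parse(String line) {
    if(!line.startsWith(PACK_MARKER))
      throw new IllegalArgumentException(
        "Can't find " + PACK_MARKER + " in " + line);
    String s =
      line.substring(PACK_MARKER.length()).trim();
    int hash = s.indexOf('#');
    return new String[] {
      s.substring(0, hash), s.substring(hash + 1)
    };
  }
  public static void main(String[] args) {
    String[] parts = parse("###c17#CodePackager.java");
    System.out.println(parts[0]); // prints c17
    System.out.println(parts[1]); // prints CodePackager.java
  }
}
```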

Accessing and writing the listings
The next set of methods are simple accessors: directory( ), filename( ) (notice the method can have the same spelling and capitalization as the field) and contents( ), and hasFile( ) to indicate whether this object contains a file or not. (The need for this will be seen later.)

The final two methods are concerned with writing this code listing into a file, either a packed file via writePacked( ) or a Java source file via writeFile( ). All writePacked( ) needs is the DataOutputStream, which was opened elsewhere, and represents the file that’s being written. It puts the header information on the first line and then calls writeBytes( ) to write contents in a “universal” format.

Containing the entire collection of listings
It’s convenient to organize the listings as subdirectories while the whole collection is being built in memory. One reason is another sanity check: as each subdirectory of listings is created, an additional file is added whose name contains the number of files in that directory.
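The grouping and the sanity-check file name can be sketched with the newer collection classes. This is not the book's DirMap, which uses Hashtable and Vector and stores SourceCodeFile objects; the class DirGroups below is a hypothetical stand-in that stores only file names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class, not part of the book's program.
class DirGroups {
  private final Map<String, List<String>> byDir =
    new TreeMap<String, List<String>>();
  // Files that share a directory end up in the same list:
  void add(String dir, String file) {
    List<String> files = byDir.get(dir);
    if(files == null) {
      files = new ArrayList<String>();
      byDir.put(dir, files);
    }
    files.add(file);
  }
  // Name of the empty check file for one directory:
  String checkFileName(String dir) {
    List<String> files = byDir.get(dir);
    int n = files == null ? 0 : files.size();
    return Integer.toString(n) + ".files";
  }
  public static void main(String[] args) {
    DirGroups g = new DirGroups();
    g.add("c17", "CodePackager.java");
    g.add("c17", "ClassScanner.java");
    System.out.println(g.checkFileName("c17")); // prints 2.files
  }
}
```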

The main program
The previously described classes are used within CodePackager. First you see the usage string that gets printed whenever the end user invokes the program incorrectly, along with the usage( ) method that prints it and exits the program. All main( ) does is determine whether you want to create a packed file or extract from one, then it ensures the arguments are correct and calls the appropriate method.

When a packed file is created, it’s assumed to be made in the current directory, so the DirMap is created using the default constructor. After the file is opened each line is read and examined for particular conditions:

  1. If the line starts with the starting marker for a source code listing, a new SourceCodeFile object is created. The constructor reads in the rest of the source listing. The handle that results is directly added to the DirMap.
  2. If the line starts with the end marker for a source code listing, something has gone wrong, since end markers should be found only by the SourceCodeFile constructor.
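The two conditions above can be sketched over an array of lines instead of a file. The class MarkerScan is hypothetical and only collects file names; it throws exceptions where the real program calls Pr.error( ), and an inner loop stands in for the SourceCodeFile constructor consuming the body of a listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the book's program.
class MarkerScan {
  static final String START = "//:";
  static final String END = "} ///:~";
  // Returns the file names announced by start markers.
  static List<String> listingNames(String[] lines) {
    List<String> names = new ArrayList<String>();
    int i = 0;
    while(i < lines.length) {
      String line = lines[i++];
      if(line.startsWith(START)) {
        // The file name is the first word after the marker:
        String rest =
          line.substring(START.length()).trim();
        int space = rest.indexOf(' ');
        names.add(space == -1 ?
          rest : rest.substring(0, space));
        // Consume the listing up to its end marker:
        boolean closed = false;
        while(i < lines.length)
          if(lines[i++].startsWith(END)) {
            closed = true;
            break;
          }
        if(!closed)
          throw new IllegalStateException(
            "End marker not found before EOF");
      } else if(line.startsWith(END))
        throw new IllegalStateException(
          "file has no start marker");
      // Else ignore the input line
    }
    return names;
  }
  public static void main(String[] args) {
    String[] text = {
      "Some prose from the book.",
      "//: Hello.java",
      "public class Hello {",
      "} ///:~",
      "More prose.",
      "//: Bye.java",
      "public class Bye {",
      "} ///:~"
    };
    // prints [Hello.java, Bye.java]
    System.out.println(listingNames(text));
  }
}
```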

Checking capitalization style

Although the previous example can come in handy as a guide for some project of your own that involves text processing, this project will be directly useful because it performs a style check to make sure that your capitalization conforms to the de-facto Java style. It opens each .java file in the current directory and extracts all the class names and identifiers, then shows you if any of them don’t meet the Java style.

For the program to operate correctly, you must first build a class name repository to hold all the class names in the standard Java library. You do this by moving into all the source code subdirectories for the standard Java library and running ClassScanner in each subdirectory. Provide as arguments the name of the repository file (using the same path and name each time) and the -a command-line option to indicate that the class names should be added to the repository.

To use the program to check your code, run it and hand it the path and name of the repository to use. It will check all the classes and identifiers in the current directory and tell you which ones don’t follow the typical Java capitalization style.

You should be aware that the program isn’t perfect; there are a few times when it will point out what it thinks is a problem but on looking at the code you’ll see that nothing needs to be changed. This is a little annoying, but it’s still much easier than trying to find all these cases by staring at your code.
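The capitalization rules themselves are small, which is also why false reports can happen. The following sketch isolates the two tests made by checkClassNames( ) and checkIdentNames( ) in the listing; the class CapCheck and its method names are hypothetical, and the lookup of known class names in the repository, which the real program performs before reporting an identifier, is left out.

```java
// Hypothetical example class, not part of the book's program.
class CapCheck {
  // A class name should not start with a lowercase letter:
  static boolean badClassName(String name) {
    return Character.isLowerCase(name.charAt(0));
  }
  // An identifier that is not a known class name is suspect
  // if it starts with an uppercase letter, unless it is all
  // uppercase and at least three characters long (probably
  // a static final value):
  static boolean suspectIdentifier(String id) {
    if(id.length() >= 3 &&
       id.equals(id.toUpperCase()))
      return false;
    return Character.isUpperCase(id.charAt(0));
  }
  public static void main(String[] args) {
    System.out.println(badClassName("myClass"));     // prints true
    System.out.println(suspectIdentifier("TT_EOF")); // prints false
    System.out.println(suspectIdentifier("Count"));  // prints true
  }
}
```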

The explanation immediately follows the listing:

//: ClassScanner.java
// Scans all files in directory for classes
// and identifiers, to check capitalization.
// Assumes properly compiling code listings.
// Doesn't do everything right, but is a very
// useful aid.
import java.io.*;
import java.util.*;
 
class MultiStringMap extends Hashtable {
  public void add(String key, String value) {
    if(!containsKey(key))
      put(key, new Vector());
    ((Vector)get(key)).addElement(value);
  }
  public Vector getVector(String key) {
    if(!containsKey(key)) {
      System.err.println(
        "ERROR: can't find key: " + key);
      System.exit(1);
    }
    return (Vector)get(key);
  }
  public void printValues(PrintStream p) {
    Enumeration k = keys();
    while(k.hasMoreElements()) {
      String oneKey = (String)k.nextElement();
      Vector val = getVector(oneKey);
      for(int i = 0; i < val.size(); i++)
        p.println((String)val.elementAt(i));
    }
  }
}
 
public class ClassScanner {
  private File path;
  private String[] fileList;
  private Properties classes = new Properties();
  private MultiStringMap 
    classMap = new MultiStringMap(),
    identMap = new MultiStringMap();
  private StreamTokenizer in;
  public ClassScanner() {
    path = new File(".");
    fileList = path.list(new JavaFilter());
    for(int i = 0; i < fileList.length; i++) {
      System.out.println(fileList[i]);
      scanListing(fileList[i]);
    }
  }
  void scanListing(String fname) {
    try {
      in = new StreamTokenizer(
          new BufferedReader(
            new FileReader(fname)));
      // Doesn't seem to work:
      // in.slashStarComments(true);
      // in.slashSlashComments(true);
      in.ordinaryChar('/');
      in.ordinaryChar('.');
      in.wordChars('_', '_');
      in.eolIsSignificant(true);
      while(in.nextToken() != 
            StreamTokenizer.TT_EOF) {
        if(in.ttype == '/')
          eatComments();
        else if(in.ttype == 
                StreamTokenizer.TT_WORD) {
          if(in.sval.equals("class") || 
             in.sval.equals("interface")) {
            // Get class name:
               while(in.nextToken() != 
                     StreamTokenizer.TT_EOF
                     && in.ttype != 
                     StreamTokenizer.TT_WORD)
                 ;
               classes.put(in.sval, in.sval);
               classMap.add(fname, in.sval);
          }
          if(in.sval.equals("import") ||
             in.sval.equals("package"))
            discardLine();
          else // It's an identifier or keyword
            identMap.add(fname, in.sval);
        }
      }
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
  void discardLine() {
    try {
      while(in.nextToken() != 
            StreamTokenizer.TT_EOF
            && in.ttype != 
            StreamTokenizer.TT_EOL)
        ; // Throw away tokens to end of line
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
  // StreamTokenizer's comment removal seemed
  // to be broken. This extracts them:
  void eatComments() {
    try {
      if(in.nextToken() != 
         StreamTokenizer.TT_EOF) {
        if(in.ttype == '/')
          discardLine();
        else if(in.ttype != '*')
          in.pushBack();
        else 
          while(true) {
            if(in.nextToken() == 
              StreamTokenizer.TT_EOF)
              break;
            if(in.ttype == '*')
              if(in.nextToken() != 
                StreamTokenizer.TT_EOF
                && in.ttype == '/')
                break;
          }
      }
    } catch(IOException e) {
      e.printStackTrace();
    }
  }
  public String[] classNames() {
    String[] result = new String[classes.size()];
    Enumeration e = classes.keys();
    int i = 0;
    while(e.hasMoreElements())
      result[i++] = (String)e.nextElement();
    return result;
  }
  public void checkClassNames() {
    Enumeration files = classMap.keys();
    while(files.hasMoreElements()) {
      String file = (String)files.nextElement();
      Vector cls = classMap.getVector(file);
      for(int i = 0; i < cls.size(); i++) {
        String className = 
          (String)cls.elementAt(i);
        if(Character.isLowerCase(
             className.charAt(0)))
          System.out.println(
            "class capitalization error, file: "
            + file + ", class: " 
            + className);
      }
    }
  }
  public void checkIdentNames() {
    Enumeration files = identMap.keys();
    Vector reportSet = new Vector();
    while(files.hasMoreElements()) {
      String file = (String)files.nextElement();
      Vector ids = identMap.getVector(file);
      for(int i = 0; i < ids.size(); i++) {
        String id = 
          (String)ids.elementAt(i);
        if(!classes.contains(id)) {
          // Ignore identifiers of length 3 or
          // longer that are all uppercase
          // (probably static final values):
          if(id.length() >= 3 &&
             id.equals(
               id.toUpperCase()))
            continue;
          // Check to see if first char is upper:
          if(Character.isUpperCase(id.charAt(0))){
            if(reportSet.indexOf(file + id)
                == -1){ // Not reported yet
              reportSet.addElement(file + id);
              System.out.println(
                "Ident capitalization error in:"
                + file + ", ident: " + id);
            }
          }
        }
      }
    }
  }
  static final String usage =
    "Usage: \n" + 
    "ClassScanner classnames -a\n" +
    "\tAdds all the class names in this \n" +
    "\tdirectory to the repository file \n" +
    "\tcalled 'classnames'\n" +
    "ClassScanner classnames\n" +
    "\tChecks all the java files in this \n" +
    "\tdirectory for capitalization errors, \n" +
    "\tusing the repository file 'classnames'";
  private static void usage() {
    System.err.println(usage);
    System.exit(1);
  }
  public static void main(String[] args) {
    if(args.length < 1 || args.length > 2)
      usage();
    ClassScanner c = new ClassScanner();
    File old = new File(args[0]);
    if(old.exists()) {
      try {
        // Try to open an existing 
        // properties file:
        InputStream oldlist =
          new BufferedInputStream(
            new FileInputStream(old));
        c.classes.load(oldlist);
        oldlist.close();
      } catch(IOException e) {
        System.err.println("Could not open "
          + old + " for reading");
        System.exit(1);
      }
    }
    if(args.length == 1) {
      c.checkClassNames();
      c.checkIdentNames();
    }
    // Write the class names to a repository:
    if(args.length == 2) {
      if(!args[1].equals("-a"))
        usage();
      try {
        BufferedOutputStream out =
          new BufferedOutputStream(
            new FileOutputStream(args[0]));
        c.classes.save(out,
          "Classes found by ClassScanner.java");
        out.close();
      } catch(IOException e) {
        System.err.println(
          "Could not write " + args[0]);
        System.exit(1);
      }
    }
  }
}
 
class JavaFilter implements FilenameFilter {
  public boolean accept(File dir, String name) {
    // Strip path information:
    String f = new File(name).getName();
    return f.trim().endsWith(".java");
  }
} ///:~ 

Inside scanListing( ) the source code file is opened and turned into a StreamTokenizer. In the documentation, passing true to slashStarComments( ) and slashSlashComments( ) is supposed to strip those comments out, but this seems to be a bit flawed (it doesn’t quite work in Java 1.0). Instead, those lines are commented out and the comments are extracted by another method. To do this, the ‘/’ must be captured as an ordinary character rather than letting the StreamTokenizer absorb it as part of a comment, and the ordinaryChar( ) method tells the StreamTokenizer to do this. This is also true for dots (‘.’), since we want to have the method calls pulled apart into individual identifiers. However, the underscore, which is ordinarily treated by StreamTokenizer as an individual character, should be left as part of identifiers since it appears in such static final values as TT_EOF, used in this very program. The wordChars( ) method takes a range of characters you want to add to those that are left inside a token that is being parsed as a word. Finally, when parsing for one-line comments or discarding a line we need to know when an end-of-line occurs, so by calling eolIsSignificant(true) the end-of-line will show up as a token rather than being absorbed by the StreamTokenizer.
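The effect of these settings can be observed by tokenizing a single line of source text. This is a minimal sketch that reads from a StringReader rather than a file; the class TokenizerSetup and its method words( ) are hypothetical, and only word tokens are collected.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the book's program.
class TokenizerSetup {
  // Returns the word tokens found in the source text, using
  // the same tokenizer settings as scanListing():
  static List<String> words(String source) {
    List<String> result = new ArrayList<String>();
    StreamTokenizer in =
      new StreamTokenizer(new StringReader(source));
    in.ordinaryChar('/');      // Not a comment starter
    in.ordinaryChar('.');      // Splits a.b into a and b
    in.wordChars('_', '_');    // Keeps TT_EOF in one piece
    in.eolIsSignificant(true); // Report ends of lines
    try {
      while(in.nextToken() != StreamTokenizer.TT_EOF)
        if(in.ttype == StreamTokenizer.TT_WORD)
          result.add(in.sval);
    } catch(IOException e) {
      throw new UncheckedIOException(e);
    }
    return result;
  }
  public static void main(String[] args) {
    // prints [in, ttype, StreamTokenizer, TT_EOF]
    System.out.println(
      words("in.ttype == StreamTokenizer.TT_EOF"));
  }
}
```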


