A fast lexical analyzer with IDE

Written By
CodeGuru Staff
Jan 9, 2002

Environment: VC6 SP3, Win98-WinXP, STLPort 4.0/4.5.1 recommended (free)

‘cxtPackage’ is a library for building custom lexical analyzers. It offers the following features:

  • a fast generic tokenizer
  • a fast, caching recursive-descent analyzer
  • a ruleset format that permits assigning precedence
    priorities to expressions
  • a parse tree that is built according to those precedence priorities
  • a small IDE for developing and testing rulesets
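To make the idea of a precedence-driven parse tree concrete, here is a minimal sketch of precedence climbing in C++. It is not cxtPackage code and does not use its API: `bindingPower`, `parsePrimary`, `parseExpr` and `showTree` are hypothetical names, the binding strengths are chosen for illustration, and the input is assumed to contain only digits, operators and parentheses, without whitespace. The tree is shown as a fully parenthesized string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of cxtPackage): binding strength of a
// binary operator. Higher binds tighter; 0 means "not an operator".
static int bindingPower(char op) {
    switch (op) {
        case '+': case '-': return 1;
        case '*': case '/': return 2;
        case '^':           return 3;
        default:            return 0;
    }
}

static std::string parseExpr(const std::string& s, std::size_t& pos, int minPower);

// Parses an unsigned integer or a parenthesized sub-expression.
static std::string parsePrimary(const std::string& s, std::size_t& pos) {
    if (pos < s.size() && s[pos] == '(') {
        ++pos;                                  // skip '('
        std::string inner = parseExpr(s, pos, 1);
        if (pos < s.size() && s[pos] == ')') ++pos;
        return inner;
    }
    std::string number;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        number += s[pos++];
    return number;
}

// Precedence climbing: only operators with at least 'minPower' are consumed.
static std::string parseExpr(const std::string& s, std::size_t& pos, int minPower) {
    std::string left = parsePrimary(s, pos);
    while (pos < s.size()) {
        const char op = s[pos];
        const int power = bindingPower(op);
        if (power == 0 || power < minPower) break;
        ++pos;
        // '^' is treated as right-associative, everything else as left-associative.
        const int nextMin = (op == '^') ? power : power + 1;
        const std::string right = parseExpr(s, pos, nextMin);
        left = "(" + left + " " + op + " " + right + ")";
    }
    return left;
}

// Returns the parse tree of 'input' as a fully parenthesized string.
std::string showTree(const std::string& input) {
    std::size_t pos = 0;
    return parseExpr(input, pos, 1);
}
```

With these assumptions, `showTree("1+2*3")` groups the multiplication first, and `showTree("8-3-2")` groups from the left.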

cxtPackage:
It consists of the three projects ‘cxTokenizer’, ‘cxAnalyzer’ and ‘cxtPackage’, and provides an interface to the analyzer and tokenizer.
The analyzer is set up with a plain-text initialization string. For example, a rule definition for a simple expression evaluator could look like this:

std::tstringstream init(
    "[seperators]\n"
    "200:+\n"       "201:-\n"
    "202:*\n"       "203:/\n"
    "204:^\n"       "205:;\n"
    "206:(\n"       "207:)\n"
    "' Whitespace tokens:\n"
    "0: \n"         "0:\t\n"
    "0:\\n\n"       "0:\\r\n"
    "[rules]\n"
    "300:numbers\n"
    "[grammar]\n"
    "401:{.expr}=100:{.expr}{$+}{.expr}\n"
    "402:{.expr}=100:{.expr}{$-}{.expr}\n"
    "403:{.expr}=99:{.expr}{$*}{.expr}\n"
    "404:{.expr}=99:{.expr}{$/}{.expr}\n"
    "405:{.expr}=98:{.expr}{$^}{.expr}\n"
    "406:{.expr}=0:{$(}{.expr}{$)}\n"
    "400:{.expr}=0:{!number}\n"
    "500:{.line}=0:{.expr}{$;}\n");
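
Judging from the example alone, each line in the [grammar] section appears to follow the pattern ruleID:{.target}=priority:elements, where every element is a group in braces. That reading is an assumption, not taken from library documentation. The sketch below splits one such line into its parts. `GrammarLine` and `splitGrammarLine` are hypothetical names that are not part of cxtPackage, and elements that themselves contain '}' are not handled.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record for one [grammar] line (assumed layout, see lead-in).
struct GrammarLine {
    int id = 0;                         // e.g. 401
    std::string target;                 // e.g. "{.expr}"
    int priority = 0;                   // e.g. 100
    std::vector<std::string> elements;  // e.g. "{.expr}", "{$+}", "{.expr}"
};

// Splits "id:{.target}=priority:{a}{b}..." into its parts.
// Returns false if the line does not match the assumed layout.
bool splitGrammarLine(const std::string& line, GrammarLine& out) {
    const std::size_t firstColon = line.find(':');
    if (firstColon == std::string::npos) return false;
    const std::size_t equals = line.find('=', firstColon + 1);
    if (equals == std::string::npos) return false;
    const std::size_t secondColon = line.find(':', equals + 1);
    if (secondColon == std::string::npos) return false;

    out.id = std::atoi(line.substr(0, firstColon).c_str());
    out.target = line.substr(firstColon + 1, equals - firstColon - 1);
    out.priority =
        std::atoi(line.substr(equals + 1, secondColon - equals - 1).c_str());

    // Collect the brace-delimited elements that follow the second colon.
    out.elements.clear();
    std::size_t pos = secondColon + 1;
    while (pos < line.size()) {
        if (line[pos] != '{') return false;
        const std::size_t close = line.find('}', pos + 1);
        if (close == std::string::npos) return false;
        out.elements.push_back(line.substr(pos, close - pos + 1));
        pos = close + 1;
    }
    return true;
}
```

Applied to the line with ID 401 from the ruleset above, this yields the target {.expr}, the priority 100 and the three elements {.expr}, {$+} and {.expr}.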

The grammar-IDE
The grammar IDE included in the complete download provides an environment for developing and testing analyzer rulesets. It offers some syntax highlighting, shows errors by marking the offending lines in the editor, and has an integrated test environment for checking the results of a ruleset live. There is no documentation yet, and the IDE is still an early beta with some cosmetic bugs (for example, it is possible to paste RTF-formatted text into the editor via the clipboard :-), but most of it is already usable.

How to use

I will submit a tutorial on how to use the library soon, but for now the sample projects included in the complete download must serve that purpose. Both ‘emptyTestApp’ and ‘simpleCalc’ are documented, so you should be able to see how it works. If you have questions, you can also mail me: alexander-berthold@web.de.

‘emptyTestApp’ is an almost skeletal MFC dialog app with a small piece of sample code in its ‘OnOK’ handler. ‘simpleCalc’ is a simple math expression evaluator.
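
As a rough illustration of what an expression evaluator does once a parse tree exists, the following sketch evaluates a small binary tree recursively. It does not use cxtPackage types and is not taken from ‘simpleCalc’: `ExprNode`, `leaf`, `binary` and `evaluate` are hypothetical names invented for this example.

```cpp
#include <cmath>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical tree node (not a cxtPackage class): either a number
// (op == 0) or a binary operator with two children.
struct ExprNode {
    char op;        // '+', '-', '*', '/', '^', or 0 for a number
    double value;   // only meaningful when op == 0
    std::unique_ptr<ExprNode> left;
    std::unique_ptr<ExprNode> right;
};

// Builds a number node.
std::unique_ptr<ExprNode> leaf(double v) {
    std::unique_ptr<ExprNode> n(new ExprNode());
    n->op = 0;
    n->value = v;
    return n;
}

// Builds an operator node that owns both operands.
std::unique_ptr<ExprNode> binary(char op,
                                 std::unique_ptr<ExprNode> l,
                                 std::unique_ptr<ExprNode> r) {
    std::unique_ptr<ExprNode> n(new ExprNode());
    n->op = op;
    n->value = 0.0;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Evaluates the tree bottom-up: children first, then the operator.
double evaluate(const ExprNode& n) {
    if (n.op == 0) return n.value;
    const double a = evaluate(*n.left);
    const double b = evaluate(*n.right);
    switch (n.op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/':
            if (b == 0.0) throw std::domain_error("division by zero");
            return a / b;
        case '^': return std::pow(a, b);
        default:  throw std::invalid_argument("unknown operator");
    }
}
```

For instance, `evaluate(*binary('+', leaf(1), binary('*', leaf(2), leaf(3))))` evaluates the tree for 1+2*3 with the multiplication nested below the addition.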

Small code snippet

I confess I used Hungarian notation, but some pieces of the code required it because of the pretty confusing number of classes involved in dealing with hierarchical linked lists :). Please don’t flame me 🙂

// Source file 'EmptyTestAppDlg.cpp', line 200+:

// flush the internal token stream
pkg.vFlush();
pkg.vSetStartFromBeginning();

// attach the input stream to the tokenizer/analyzer
pkg.vSetInputStream(&istream);
pkg.vSetDelimeterIDs(NULL);

// read until the next delimiter - no delimiters are
// set, i.e. read until the end of the input stream
pkg.nReadUntilDelimeter();

// check for the '.expr' rules defined above
cxaTokenStream::const_iterator endpos;
cxaParseBranch *papbResult = NULL;
cxaStatusCookie ascCondition;

// test for all rules belonging to the same
// group of rules with ID 300
papbResult = pkg.papbCheckForRule(300,
                                  &endpos,
                                  &ascCondition,
                                  false);

if(papbResult==NULL)
  AfxMessageBox("Syntax error.");
else
  ...

/*
  'papbResult' now contains the parse tree. You can, for
  example, dump its contents to the debug output using
  papbResult->vDump(), or you can load the ruleset in
  the IDE and explore it there.
*/

Downloads

Download complete project including grammarIDE – 213 Kb

Download source only – 162 Kb
