A fast lexical analyzer with IDE

Environment: VC6 SP3, Win98-WinXP, STLPort 4.0/4.5.1 recommended (free)

'cxtPackage' is a library for building custom lexical analyzers. It has the following features:

  • includes a fast generic tokenizer
  • uses a fast caching recursive-descent analyzer
  • the ruleset for the analyzer permits assigning precedence priorities to expressions
  • the parse tree is built according to the precedence priorities
  • a small IDE for developing and testing the ruleset is included
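
To illustrate what "a parse tree built according to precedence priorities" means, here is a minimal sketch in plain C++. It is not cxtPackage code and uses none of its classes: it applies precedence climbing (a common recursive-descent variant, without the caching mentioned above) to single-digit operands, and it renders the tree as a fully parenthesized string instead of building node objects. The function names 'precedenceOf', 'parsePrimary', 'parseExpr' and 'toTree' and the priority values are assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical priority table: a higher value binds tighter.
// Returns 0 for characters that are not binary operators.
int precedenceOf(char op) {
    switch (op) {
        case '+': case '-': return 1;
        case '*': case '/': return 2;
        case '^':           return 3;
        default:            return 0;
    }
}

std::string parseExpr(const std::string& src, std::size_t& pos, int minPrec);

// Parse a single-digit operand or a parenthesized sub-expression.
std::string parsePrimary(const std::string& src, std::size_t& pos) {
    assert(pos < src.size());
    if (src[pos] == '(') {
        ++pos;  // consume '('
        std::string inner = parseExpr(src, pos, 1);
        assert(pos < src.size() && src[pos] == ')');
        ++pos;  // consume ')'
        return inner;
    }
    assert(std::isdigit(static_cast<unsigned char>(src[pos])));
    return std::string(1, src[pos++]);
}

// Parse operators whose priority is at least minPrec and nest the
// operands so that tighter-binding operators end up deeper in the tree.
std::string parseExpr(const std::string& src, std::size_t& pos, int minPrec) {
    std::string left = parsePrimary(src, pos);
    while (pos < src.size()) {
        char op = src[pos];
        int prec = precedenceOf(op);
        if (prec == 0 || prec < minPrec) break;
        ++pos;  // consume the operator
        // '^' is treated as right-associative, all others as left-associative.
        int nextMin = (op == '^') ? prec : prec + 1;
        std::string right = parseExpr(src, pos, nextMin);
        left = "(" + left + " " + op + " " + right + ")";
    }
    return left;
}

// Convenience wrapper: parse a whole expression without whitespace.
std::string toTree(const std::string& src) {
    std::size_t pos = 0;
    return parseExpr(src, pos, 1);
}
```

With these assumed priorities, toTree("1+2*3") yields "(1 + (2 * 3))" and toTree("1-2-3") yields "((1 - 2) - 3)": the multiplication becomes a child of the addition because of its higher priority, and equal priorities group from the left.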

The library consists of three projects, 'cxTokenizer', 'cxAnalyzer' and 'cxtPackage', and provides an interface to both the analyzer and the tokenizer.
The analyzer is set up with a plain-text initialization string. A rule definition for a simple expression evaluator could, for example, look like this:

std::tstringstream init(
    "200:+\n"       "201:-\n"
    "202:*\n"       "203:/\n"
    "204:^\n"       "205:;\n"
    "206:(\n"       "207:)\n"
    "' Whitespace tokens:\n"
    "0: \n"         "0:\t\n"
    "0:\\n\n"       "0:\\r\n"

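The token lines in the initialization string follow an 'ID:text' pattern, one entry per line. The following sketch shows how such lines could be read into a lookup table. It is not the library's own parser: the helpers 'decodeEscapes' and 'parseTokenTable' are hypothetical, the sketch uses narrow 'std::string' rather than the TCHAR-based stream of the sample, and it assumes that lines starting with an apostrophe are comments and that the two-character sequences \n, \r and \t stand for the corresponding control characters.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of cxtPackage: turn the two-character
// escapes \n, \r and \t into the control characters they name (assumption).
std::string decodeEscapes(const std::string& raw) {
    std::string out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\\' && i + 1 < raw.size()) {
            char next = raw[i + 1];
            if (next == 'n') { out += '\n'; ++i; continue; }
            if (next == 'r') { out += '\r'; ++i; continue; }
            if (next == 't') { out += '\t'; ++i; continue; }
        }
        out += raw[i];
    }
    return out;
}

// Hypothetical helper: read "ID:text" lines into a text -> ID table.
// Empty lines, lines starting with an apostrophe (assumed to be comments)
// and lines without a leading ID are skipped.
std::map<std::string, int> parseTokenTable(std::istream& in) {
    std::map<std::string, int> table;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '\'') continue;
        std::size_t colon = line.find(':');
        if (colon == std::string::npos || colon == 0) continue;
        assert(colon < line.size());  // invariant after the check above
        int id = std::stoi(line.substr(0, colon));  // throws on a non-numeric ID
        std::string text = decodeEscapes(line.substr(colon + 1));
        if (!text.empty()) table[text] = id;
    }
    return table;
}

// Convenience overload for a ruleset held in a string.
std::map<std::string, int> parseTokenTable(const std::string& ruleset) {
    std::istringstream in(ruleset);
    return parseTokenTable(in);
}
```

Called with the first lines of the sample ruleset, the table maps "+" to 200 and "-" to 201, while the whitespace entries all map to ID 0.
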
The grammar-IDE
The grammar IDE included in the complete download provides an environment for developing and testing analyzer rulesets. It offers some syntax highlighting, marks erroneous lines in the editor, and has an integrated test environment for checking the results of a ruleset live. There is no documentation yet, and the IDE is still an early beta with some cosmetic bugs (for example, it is possible to paste RTF-formatted text into the editor via the clipboard :-), but most of it is already usable.

How to use

I will submit a tutorial on how to use the library soon, but for now the sample projects included in the complete download must serve that purpose. Both 'emptyTestApp' and 'simpleCalc' are documented, so you should be able to see how it works. If you have questions you can also mail me: alexander-berthold@web.de.

'emptyTestApp' is an almost skeletal MFC dialog app and has a small piece of sample code in its 'OnOK' handler. 'simpleCalc' is a simple math expression evaluator.

Small code snippet

I confess I used Hungarian notation, but some parts of the code required it because of the pretty confusing number of classes involved when dealing with hierarchical linked lists :). Please don't flame me :)

// Source file 'EmptyTestAppDlg.cpp', line 200+:

// flush the internal token stream

// attach the input stream to the tokenizer/analyzer

// read until the next delimiter - no delimiters are
// set, i.e. read until the end of the input stream

// check for the '.expr' rules defined above
cxaTokenStream::const_iterator endpos;
cxaParseBranch *papbResult = NULL;
cxaStatusCookie ascCondition;

// test for all rules belonging to the same 
// group of rules with ID 300
papbResult =pkg.papbCheckForRule( 300,

  AfxMessageBox("Syntax error.");

  // 'papbResult' now contains the parse tree. You can, for
  // example, dump its contents to the debug output using
  // papbResult->vDump(), or you can load the ruleset in
  // the IDE and explore it there.


Download complete project including grammarIDE - 213 Kb
Download source only - 162 Kb


  • which grammar?

    Posted by Legacy on 07/24/2003 12:00am

    Originally posted by: Max

    Could you provide any comments about which grammar types your lexer can parse? (LL, LR(1), ...) I can get the point (in fact have no time to look through the implementation to find this out).

  • Great, but how can i make my own...

    Posted by Legacy on 01/09/2002 12:00am

    Originally posted by: Roman

    great!, but can i get some information on how to say, make my own language that your package can parse and help me interpret?

    i looked at complex-sample and i was lost, most of the C++ statements said "syntax error" or such, and i dont know how to make my script language in it
