String Tokenizer


When you are writing a lexical analyzer, it helps to have a class like the StreamTokenizer class from Sun's Java. I have written something similar: the CStringTokenizer class. It is used in the same way as Java's StreamTokenizer, but it offers some additional functionality and the function names are slightly different.

The CStringTokenizer class is contained in the files StringTokenizer.h and StringTokenizer.cpp

 

The interface of the class:

class CStringTokenizer : public CObject 
{

public:
    // Constructor: pass the string to tokenize as the parameter. It initializes
    // the tokenizer with the default settings (see implementation)
    CStringTokenizer(CString& string);    
    virtual ~CStringTokenizer();        // Destructor

private:
// Private stuff  for internal use (see the sample code)

...

public:
    double GetNumValue();    // returns numeric value of the last returned token
    void PascalComments(BOOL bFlag);    // Enable / disable Pascal comments
    CString GetStrValue();    // returns the string value of the last token
                    
    void QuoteChar(int ch);    // specifies that this char is used as quote
    int LineNo();    // returns the current line number
    void PushBack();    // push back the last token (cannot be used twice in a row)
    int NextToken();    // parse the next token; returns a TT_ constant or a char value
    void LowerCaseMode(BOOL bFlag);    // Enable / Disable lower case
    void SlSlComments(BOOL bFlag);    // Enable / Disable "//" comments
    void SlStComments(BOOL bFlag);    // Enable / Disable "/*" comments
    void EolIsSignificant(BOOL bFlag);    // If set to true, EOL is returned by NextToken as a token
    void ParseNumbers();    // Enables number parsing (integer / double in normal format)
    void ResetSyntax();    // reset syntax
    void WordChars(int cLow, int cHi);    // specify that the characters in the range are word characters
    void WhiteSpaceChars(int cLow, int cHi);    // specify that the characters in the range are white space characters
    void OrdinaryChars(int cLow, int cHi);    // specify that the characters in the range are ordinary characters
    void OrdinaryChar(int ch);    // specify that the character is an ordinary character
    void CommentChar(int ch);    // specify comment char
};
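
To make the interface more concrete, here is a minimal sketch in standard C++ of how the range functions and PushBack could work. It is not the code from StringTokenizer.cpp: the MiniTokenizer class, its character-class table, and the TT_ values (borrowed from Java's StreamTokenizer) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed token codes, borrowed from Java's StreamTokenizer.
// The real values live in StringTokenizer.h and may differ.
enum { TT_EOF = -1, TT_WORD = -3 };

// Hypothetical, simplified stand-in for CStringTokenizer.
class MiniTokenizer {
    enum CharClass { ORDINARY, WORD, SPACE };

public:
    explicit MiniTokenizer(const std::string& text) : m_text(text) {
        OrdinaryChars(0, 255);      // start with every character ordinary
        WordChars('a', 'z');
        WordChars('A', 'Z');
        WhiteSpaceChars(0, ' ');
    }

    // Each range function just rewrites a slice of the class table.
    void WordChars(int lo, int hi)       { SetRange(lo, hi, WORD); }
    void WhiteSpaceChars(int lo, int hi) { SetRange(lo, hi, SPACE); }
    void OrdinaryChars(int lo, int hi)   { SetRange(lo, hi, ORDINARY); }

    int NextToken() {
        // A pushed-back token is simply returned again, once.
        if (m_pushedBack) { m_pushedBack = false; return m_type; }
        m_str.clear();
        while (m_pos < m_text.size() && ClassOf(m_text[m_pos]) == SPACE) ++m_pos;
        if (m_pos >= m_text.size()) return m_type = TT_EOF;
        if (ClassOf(m_text[m_pos]) == WORD) {
            while (m_pos < m_text.size() && ClassOf(m_text[m_pos]) == WORD)
                m_str += m_text[m_pos++];
            return m_type = TT_WORD;
        }
        // Ordinary character: the token code is the character itself.
        unsigned char ch = static_cast<unsigned char>(m_text[m_pos++]);
        m_str.assign(1, static_cast<char>(ch));
        return m_type = ch;
    }

    void PushBack() { m_pushedBack = true; }
    const std::string& GetStrValue() const { return m_str; }

private:
    void SetRange(int lo, int hi, CharClass c) {
        assert(0 <= lo && lo <= hi && hi < 256);
        for (int ch = lo; ch <= hi; ++ch) m_class[ch] = c;
    }
    CharClass ClassOf(char ch) const {
        return m_class[static_cast<unsigned char>(ch)];
    }

    std::string m_text;
    std::size_t m_pos = 0;
    CharClass   m_class[256];
    std::string m_str;
    int         m_type = TT_EOF;
    bool        m_pushedBack = false;
};
```

In this sketch a single flag implements PushBack, which is why a second call before the next NextToken has no further effect. Digits, quotes, and comments are left out to keep the example short, so digits are treated as ordinary characters here.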

How to use the CStringTokenizer class:

You must include the header in your file:

#include "StringTokenizer.h"

Sample code for using the string tokenizer class:

    CString str;

    // sample string
    str = _T("cwsddde1231+-\"asdfgasd\"-{dfsdf}iwreu/*dsfghsdgf*/fgdfg//wejfshg"); 
    str += TT_EOF;    // add EOF to the string end

    CStringTokenizer strtok(str);    // String Tokenizer class

    int val;
    while((val = strtok.NextToken())!=TT_EOF)    // parse the string
    {
        // display token code and str value
        CString msg;
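        // _T() keeps the format string Unicode-safe; the LPCTSTR cast is
        // needed because Format() takes a variable argument list
        msg.Format(_T("%d %s"), val, (LPCTSTR)strtok.GetStrValue());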
        AfxMessageBox(msg);
    }

This class is written to be usable in many types of lexical analyzers. You can derive your own lexical analyzer class from CStringTokenizer.
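
As a rough illustration of that kind of derivation, the sketch below layers keyword recognition on top of a base tokenizer. Because CStringTokenizer needs MFC, a small hypothetical BaseTokenizer (whitespace splitting only) stands in for it; the KeywordLexer class, the TT_KEYWORD code, the keyword list, and the assumption that NextToken is virtual are all invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for CStringTokenizer: it only splits on whitespace.
class BaseTokenizer {
public:
    enum { TT_EOF = -1, TT_WORD = -3 };  // assumed values, as in Java

    explicit BaseTokenizer(const std::string& text) : m_text(text) {}
    virtual ~BaseTokenizer() {}

    virtual int NextToken() {
        assert(m_pos <= m_text.size());  // invariant: never read past the end
        m_str.clear();
        while (m_pos < m_text.size() && IsSpace(m_text[m_pos])) ++m_pos;
        if (m_pos >= m_text.size()) return TT_EOF;
        while (m_pos < m_text.size() && !IsSpace(m_text[m_pos]))
            m_str += m_text[m_pos++];
        return TT_WORD;
    }

    const std::string& GetStrValue() const { return m_str; }

private:
    static bool IsSpace(char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    }
    std::string m_text;
    std::size_t m_pos = 0;
    std::string m_str;
};

// Derived lexer: reuses the base scanning and reclassifies some words.
class KeywordLexer : public BaseTokenizer {
public:
    enum { TT_KEYWORD = -100 };  // hypothetical extra token code

    explicit KeywordLexer(const std::string& text)
        : BaseTokenizer(text), m_keywords{"begin", "end", "var"} {}

    int NextToken() override {
        int type = BaseTokenizer::NextToken();
        if (type == TT_WORD && m_keywords.count(GetStrValue()) != 0)
            return TT_KEYWORD;
        return type;
    }

private:
    std::set<std::string> m_keywords;
};
```

With this arrangement the derived class never touches the character scanning itself. It only inspects the token the base class produced and changes its code, so the same base can serve several different analyzers.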

NEW!!!

Bug corrections:

1. String memory allocation error corrected

2. Pascal comments bug corrected

Sample project:

The sample project shows how you can use the String Tokenizer class and how you can adjust it to your needs. The project also performs some pseudo-Pascal syntactic and semantic analysis. The String Tokenizer should now be bug-free, but the Pascal lexical, syntactic, or semantic analyzer probably still has bugs (I know of 2 of them).

Download demo project - 46 KB



Comments

  • if the file size exceeds 6KB?

    Posted by Legacy on 05/24/2002 12:00am

    Originally posted by: novice programmer

    It takes a lot of time to load a huge file, and loading fails
    when the file size exceeds 6KB. What is the solution for that?

    --novice

  • How is the lexical analyzer's state table built?

    Posted by Legacy on 05/06/2001 12:00am

    Originally posted by: TechY

    I keep looking it over again and again.
    I still don't understand how the state table is built ;(
    and how is the state table used?

    I am totally stuck on "state table" stuff.

    can someone help me out???


    thankx

    carde
