CSyntaxColorizer: Syntax Highlighting Class
Environment: Visual C++ 6, MFC CRichEditCtrl class
Overview
The CSyntaxColorizer class described here is a fast, flexible class for the syntax highlighting of code, and it is very simple to use. The default highlighting mode is VC++, with comments in green, strings in dark blue, and keywords in light blue. The class exposes several methods for changing these defaults, and color is not the only option: highlighted words can be made bold, italic, underlined, and more. The exposed methods take a CHARFORMAT structure as a parameter, so any text formatting change that can be made with a CHARFORMAT structure can also be applied to the keyword, comment, and string formats in CSyntaxColorizer. You are also not limited to a predefined, fixed set of keyword groupings. Keywords can be grouped in any way you like and then manipulated by group. For example, you could assign all compiler directives the group ID 723 (or whatever) and then set all keywords with group ID 723 to the color red.
The default groupings for the CSyntaxColorizer class are as follows:
Group 0: VC++ keywords such as for, if, while, void, etc
Group 1: VC++ compiler directives such as #define, #include, etc
Group 2: VC++ compiler pragmas such as once, auto_inline, etc
CSyntaxColorizer maintains three member variables of type CHARFORMAT: m_cfDefault, m_cfComment and m_cfString. These defaults are used when the class initializes its internal lists, and whenever the keyword, comment or string colors are changed using the methods that take a COLORREF parameter instead of a CHARFORMAT parameter. The default structures use the Courier New font at 10pt. Naturally, the class exposes methods for changing the defaults (see below).
Here are the declarations in CSyntaxColorizer.h for the exposed methods:
    void Colorize(long StartChar,
                  long nEndChar,
                  CRichEditCtrl *pCtrl);
    void Colorize(CHARRANGE cr,
                  CRichEditCtrl *pCtrl);

    void GetCommentStyle(CHARFORMAT &cf) { cf = m_cfComment; }
    void GetStringStyle(CHARFORMAT &cf)  { cf = m_cfString; }
    void GetGroupStyle(int grp, CHARFORMAT &cf);
    void GetDefaultStyle(CHARFORMAT &cf) { cf = m_cfDefault; }

    void SetCommentStyle(CHARFORMAT cf) { m_cfComment = cf; }
    void SetCommentColor(COLORREF cr);
    void SetStringStyle(CHARFORMAT cf)  { m_cfString = cf; }
    void SetStringColor(COLORREF cr);
    void SetGroupStyle(int grp, CHARFORMAT cf);
    void SetGroupColor(int grp, COLORREF cr);
    void SetDefaultStyle(CHARFORMAT cf) { m_cfDefault = cf; }

    void AddKeyword(LPCTSTR Keyword,
                    CHARFORMAT cf,
                    int grp = 0);
    void AddKeyword(LPCTSTR Keyword,
                    COLORREF cr,
                    int grp = 0);
    void ClearKeywordList();
    CString GetKeywordList();
    CString GetKeywordList(int grp);
Using CSyntaxColorizer
The simplest and quickest way to use this class is to first declare it, then call one of the overloaded Colorize member functions. For example, this code
    CSyntaxColorizer sc;
    sc.Colorize(0, -1, &m_cRichEditCtrl);
first creates the object, then calls its Colorize method, which in this case colorizes all of the text in the specified rich edit box, using CSyntaxColorizer's default font, keyword groupings, and colors described above.
If you don't like the default colors, you can change them:
    sc.SetCommentColor(RGB(255,0,0));
    sc.SetStringColor(RGB(0,255,0));
    sc.SetGroupColor(nMyGroup,RGB(0,0,255));
The preceding methods change the color using the CHARFORMAT structures that would be returned by their respective "Get..." methods.
If it's more than just colors you don't like, then you can set the CHARFORMAT structures using
    sc.SetCommentStyle(cfMyStyle);
    sc.SetStringStyle(cfMyStyle);
    sc.SetGroupStyle(nMyGroup,cfMyStyle);
where cfMyStyle is a CHARFORMAT structure that you have created yourself from scratch, or have retrieved using one of the "Get..." methods and then modified to suit.
Adding keywords is easy too. The two AddKeyword methods each take an LPCTSTR as a parameter. The parameter is a null-terminated string containing one or more words separated by commas. For example,
sc.AddKeyword("for,if,while", RGB(255,0,0), 4);
will add the three keywords to the sc object's list, give them the color red and place them in group 4, using the CHARFORMAT structure currently in m_cfDefault. You can also send a single word as the LPCTSTR parameter. If the keyword already exists in the list, its color and group attributes are overwritten by those passed in the AddKeyword method. The AddKeyword method that takes a CHARFORMAT as a parameter instead of a COLORREF works in a similar fashion.
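The behaviour described above (split a comma-separated list, overwrite entries that already exist, then recolor by group) can be sketched in portable C++. This is only an illustration under stated assumptions, not the class's actual implementation: KeywordInfo, KeywordTable, addKeywords and setGroupColor are hypothetical names, and a plain unsigned long stands in for COLORREF.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for the per-keyword data such a class might keep.
struct KeywordInfo {
    unsigned long color; // packed 0x00BBGGRR value, laid out like a COLORREF
    int group;
};

// Hypothetical keyword table: one entry per word.
using KeywordTable = std::map<std::string, KeywordInfo>;

// Split a comma-separated list and add each word to the table.
// An existing entry for the same word is overwritten.
void addKeywords(KeywordTable& table, const std::string& list,
                 unsigned long color, int group) {
    std::istringstream in(list);
    std::string word;
    while (std::getline(in, word, ',')) {
        if (!word.empty()) {
            table[word] = KeywordInfo{color, group};
        }
    }
}

// Recolor every keyword that belongs to the given group.
void setGroupColor(KeywordTable& table, int group, unsigned long color) {
    for (auto& entry : table) {
        if (entry.second.group == group) {
            entry.second.color = color;
        }
    }
}
```

With this sketch, addKeywords(table, "for,if,while", 0x0000FF, 4) stores three entries in group 4, and a later addKeywords(table, "if", 0xFF0000, 0) replaces the color and group of "if" without adding a fourth entry.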
A word about comments...
By default, CSyntaxColorizer handles C++ and Java multiline comments starting with /* and single line comments starting with //. If you want single line comments as in VB (starting with ' or REM), simply add "REM" as one of the keywords. For example:
sc.AddKeyword("REM",cfMyStyle,nMyGroup);
CSyntaxColorizer will ignore any style or color settings you specify for the REM keyword, and instead will set them to whatever attributes you have set for the comments. CSyntaxColorizer will automatically treat the ' as the start of a single line comment once "REM" is added to the keyword list. Note: if you add only "REM", then "rem", "Rem", etc will not be recognized - you will have to add these as well.
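Because matching is case sensitive, one way to cover the common spellings is to build the comma-separated list programmatically. The helper below is a hypothetical sketch (caseVariants is not part of CSyntaxColorizer) that produces the upper-case, capitalized and lower-case forms of a word. Other mixed-case spellings such as "rEM" would still have to be added by hand.

```cpp
#include <cctype>
#include <string>

// Build an "UPPER,Capitalized,lower" list from a single word,
// e.g. for passing to a method that accepts comma-separated keywords.
std::string caseVariants(const std::string& word) {
    std::string upper, lower;
    for (unsigned char ch : word) {
        upper += static_cast<char>(std::toupper(ch));
        lower += static_cast<char>(std::tolower(ch));
    }
    std::string capitalized = lower;
    if (!capitalized.empty()) {
        capitalized[0] = upper[0];
    }
    return upper + "," + capitalized + "," + lower;
}
```

Assuming a non-Unicode build, where LPCTSTR is a const char*, the result could then be passed on as sc.AddKeyword(caseVariants("rem").c_str(), cfMyStyle, nMyGroup);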
Where a comment starts with // and ends with '\' + '\n' (the line continuation character immediately followed by a newline character) CSyntaxColorizer will recognize it, and treat the next line as a comment line.
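The continuation rule can be illustrated with a small scanner. This is a minimal sketch of the idea rather than the code CSyntaxColorizer uses: endOfLineComment is a hypothetical function, and it assumes that lines end with '\n' and that start is the index of the opening //.

```cpp
#include <cstddef>
#include <string>

// Return the index just past the last character of the // comment that
// begins at `start`. The newline that ends the comment is not included.
std::size_t endOfLineComment(const std::string& text, std::size_t start) {
    std::size_t i = start;
    while (i < text.size()) {
        if (text[i] == '\n') {
            // A backslash immediately before the newline continues the
            // comment onto the next line, so keep scanning.
            if (i > start && text[i - 1] == '\\') {
                ++i;
                continue;
            }
            break;
        }
        ++i;
    }
    return i;
}
```

Everything in the range [start, endOfLineComment(text, start)) would then receive the comment format, including the continued line.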
Speed
CSyntaxColorizer is pretty fast. Small files under 20K or so are colorized practically instantaneously on my 466MHz machine. Larger files, over 100K, take four or five seconds. CSyntaxColorizer can locate and identify words to be colorized fairly quickly, but the big time hog is not in the algorithm itself - it's in the text formatting functions of CRichEditCtrl. If you comment out the two lines (line #494 & 495)
    pCtrl->SetSel(iStart,iOffset + x);
    pCtrl->SetSelectionCharFormat(pskTemp->cf);
then a 100K file takes less than a second, but only comments and strings are colorized.
About the Demo
The demo project is a quick and dirty dialog box with a rich edit control. It has a rather crude editing capability, the minimum required to show off some of CSyntaxColorizer's abilities. In particular, if you type in /* to start off a multiline comment, only the one line with the cursor on it will be reformatted. In this case, press the "Format" button. As well, you can load files, but you can't save them.
Downloads
Download source - 5 Kb
Download demo project - 55 Kb

Comments
good job
Posted by wshcdr on 12/23/2009 10:54am
well done
Updating syntax as text is entered.
Posted by ahoodin on 06/17/2004 12:06pm
Here is a project where I added functionality to update the text as you type. It is still a work in progress. Please feel free to comment. If you have an update, you can email me at ahoodin@boardermail.com. http://www.codeguru.com/forum/showthread.php?s=f37f51f59cae3bcba7dadf91e159036b&threadid=298851&highlight=richedit
Stops working if RichEdit20 is used
Posted by Legacy on 11/03/2003 12:00am
Originally posted by: Mehak Lala
I am trying to use RichEdit20, and once I use it everything turns into the comment color, because my first line starts with a comment character. Why should RichEdit20 stop the parsing? Do I have to do some additional setting?
Any inputs greatly appreciated
Mehak
How to solve!!
Posted by MrTommek on 03/27/2004 07:48pm
How to solve!!
Posted by MrTommek on 03/27/2004 07:43pm
For all people who have/had the same problem, here is a solution: in the CSyntaxColorizer class, change line 98 from
    *(m_pTableTwo + '\'') = SQEND; *(m_pTableThree + '\r') = SLEND;
to
    *(m_pTableTwo + '\'') = SQEND; *(m_pTableThree + '\n') = SLEND;
Now the colorizer class will work for RichEdit20. Greez Tommek
Paste operation
Posted by Legacy on 10/02/2003 12:00am
Originally posted by: Tomasz Kulig
There is a problem with pasting text from an HTML page (or other RTF text) into this control.
1. Mark a colorized line on an HTML page.
2. Paste it into CSyntaxColorizer. At this moment all attributes of the text (color, fonts, etc.) are the same as in the HTML document.
3. Move the cursor to the middle of the line and press <Enter>.
Now the second part of the line is formatted, but the height of the line is wrong.
There is also a problem when you put an HTML table into this control.
And my question. Maybe one of you can help me. How do I check and translate the clipboard content when it is not simple text?
Where to obtain TOM
Posted by Legacy on 07/15/2003 12:00am
Originally posted by: Jason Shelley
Can anyone tell me where I can obtain the "tom.h" file and any other requisites for using TOM please? I have VC++ 6.0, Office 2000, but no TOM :-(
How to use it with CFontDialog class instead of CDialog
Posted by Legacy on 05/06/2003 12:00am
Originally posted by: Parinda Rivonkar
It doesn't work with CFontDialog class.
Create the above as an ActiveX control
Posted by Legacy on 04/08/2003 12:00am
Originally posted by: Pritpal Singh Mudher
Little bug with printf(" I said: "\bla bla\" \n");
Posted by Legacy on 08/27/2002 12:00am
Originally posted by: Dennis
This is colorized wrong:
    printf(" I said: "\bla bla\" \n");
The \" is going wrong.
I don't know how to fix it but I'm sure you can :)
Greetings Dennis.
How to add line numbers on the left side of the document frame?
Posted by Legacy on 07/13/2002 12:00am
Originally posted by: Ahmad Sebastian
Weird behaviors for file over 64K
Posted by Legacy on 06/12/2002 12:00am
Originally posted by: L2L
Has anyone experienced any weird behaviors for files larger than 64K? For instance, I can't insert new text from the keyboard after I have opened a large file of 180K. I can delete one char and type in another one, but not insert a new char. Strange indeed. Can someone suggest a workaround or point out anything I overlooked? Thanks.