Environment: VC++ 5.0-6.0, NT 4.0, Win2000, WinXP, Win95/98
String manipulation and regular expressions have always been a strength of Perl. The C language is powerful, but string manipulation and regular expressions are difficult to do in it. Although there are libraries such as PCRE that emulate Perl regular expressions in C, I have found them difficult to use. So I decided that the best approach was to embed Perl inside C.
With the help of Perl’s documentation, I have successfully created two classes to encapsulate some of Perl’s most useful functions. One class is for MFC users (CPerlString.h), and one class is for non-MFC users (PerlString.h). The functions encapsulated are:
- Pattern matching
- String substitution
- Joining of an array into a string
- Splitting of a string into an array
- Sort (forward or reverse)
- Chop
- Chomp
I will discuss the usage of the MFC class. The non-MFC class is similar but it uses STL string and vector<string> instead of CString and CStringArray.
I32 CPerlString::Match (CString inputString, CString pattern)
I32 CPerlString::Matches (CString inputString, CString pattern, CStringArray &matchList)
Match and Matches are two functions that make use of Perl’s pattern-matching ability. Basically, Match sends a Perl statement: inputString =~ pattern and Matches sends a Perl statement: matchList = (inputString =~ pattern). Match will return 1 if the pattern is found and 0 if the pattern is not found. Matches will return the number of matches.
CPerlString perl;
CString inputString = "Hello World!";
CString pattern1 = "/Hello/";
CString pattern2 = "/(.o)/g";
CStringArray matchList;

// Match returns 1 if the pattern is found, 0 otherwise.
if (perl.Match(inputString, pattern1))
    printf("Pattern found\n");
else
    printf("Pattern not found\n");

// Matches fills matchList and returns the number of matches.
int num_matches = perl.Matches(inputString, pattern2, matchList);
printf("%d matches\n", num_matches);
if (num_matches > 0)
{
    // Cast to LPCTSTR so that printf receives a plain character pointer.
    for (int i = 0; i < matchList.GetSize(); i++)
        printf("Match %d: %s\n", i+1, (LPCTSTR)matchList.GetAt(i));
}
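To check what the example above should report without a Perl installation, the same two operations can be approximated with std::regex. This is only a rough sketch, not part of CPerlString: the function names regexMatch and regexMatches are hypothetical, it needs a C++11 compiler rather than VC++ 6.0, it takes a bare expression such as "(.o)" instead of a Perl-style "/(.o)/g" pattern, and ECMAScript syntax differs from Perl syntax in some details.

```cpp
#include <regex>
#include <string>
#include <vector>

// Stand-in for Match: true if the expression occurs anywhere in input.
// Takes a bare expression ("Hello"), not a Perl-style "/Hello/" pattern.
bool regexMatch(const std::string& input, const std::string& expr) {
    return std::regex_search(input, std::regex(expr));
}

// Stand-in for Matches with the /g modifier: visits every non-overlapping
// match, stores its text in matchList, and returns how many were found.
int regexMatches(const std::string& input, const std::string& expr,
                 std::vector<std::string>& matchList) {
    matchList.clear();
    const std::regex re(expr);
    for (std::sregex_iterator it(input.begin(), input.end(), re), end;
         it != end; ++it) {
        // Group 1 if the expression has a capture group, else the whole match.
        matchList.push_back(it->size() > 1 ? (*it)[1].str() : it->str());
    }
    return static_cast<int>(matchList.size());
}
```

For "Hello World!" and the expression "(.o)", this sketch finds two matches, "lo" and "Wo".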
I32 Substitute(CString &inputOutputString, CString pattern)
Substitute is a function that makes use of Perl’s string substitution ability. Basically, it sends: inputOutputString =~ pattern. Substitute will return 1 if a substitution was made and 0 if it was not.
CPerlString perl;
CString inputOutputString = "Hello World!";
CString pattern1 = "s/Hello/Hello Happy/";

// The string is modified in place.
perl.Substitute(inputOutputString, pattern1);
printf("%s\n", (LPCTSTR)inputOutputString);
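The behaviour of a substitution without the /g modifier (only the first match is replaced) can be sketched in standard C++ with std::regex_replace. The function name regexSubstitute is hypothetical and this is not how CPerlString is implemented. It assumes a C++11 compiler, takes the expression and the replacement as two separate arguments instead of one "s/.../.../" pattern, and interprets the replacement using ECMAScript format rules.

```cpp
#include <regex>
#include <string>

// Stand-in for Substitute with "s/expr/replacement/" (no /g modifier):
// replaces only the first match, in place. Returns 1 if a match was
// found and replaced, 0 otherwise.
int regexSubstitute(std::string& inputOutput, const std::string& expr,
                    const std::string& replacement) {
    const std::regex re(expr);
    if (!std::regex_search(inputOutput, re))
        return 0;
    inputOutput = std::regex_replace(inputOutput, re, replacement,
                                     std::regex_constants::format_first_only);
    return 1;
}
```

Applied to "Hello World!" with the expression "Hello" and the replacement "Hello Happy", the string becomes "Hello Happy World!".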
void Join(CStringArray &inputStringList, CString pattern, CString &outputString)
Join is a function that makes use of Perl’s joining ability. It sends: outputString = join (pattern, inputStringList). Join does not return any value.
CPerlString perl;
CString outputString;
CString pattern1 = " ";
CStringArray inputStringList;

inputStringList.Add("Hello");
inputStringList.Add("Happy");
inputStringList.Add("World");

// Join the three items with a single space between them.
perl.Join(inputStringList, pattern1, outputString);
printf("%s\n", (LPCTSTR)outputString);
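For comparison, the effect of Perl's join can be written in a few lines of standard C++. The helper name joinStrings is hypothetical and is not part of either class. It is shown only to make the expected result concrete.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Concatenates the items, placing separator between consecutive items
// (but not before the first or after the last).
std::string joinStrings(const std::vector<std::string>& items,
                        const std::string& separator) {
    std::string result;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0)
            result += separator;
        result += items[i];
    }
    return result;
}
```

Joining "Hello", "Happy" and "World" with a single space gives "Hello Happy World".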
I32 Split(CString inputString, CString pattern, CStringArray &splitList)
Split performs the Perl statement: splitList = split (pattern, inputString). It returns the number of split items.
CPerlString perl;
CString inputString = "Hello Happy World!";
CString pattern1 = "/\\s/";
CStringArray splitList;

// Split on single whitespace characters.
int num_split = perl.Split(inputString, pattern1, splitList);
printf("%d split\n", num_split);
if (num_split > 0)
{
    for (int i = 0; i < splitList.GetSize(); i++)
        printf("Split %d: %s\n", i+1, (LPCTSTR)splitList.GetAt(i));
}
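A rough standard C++ counterpart of this split uses std::sregex_token_iterator with a submatch index of -1, which yields the text between matches. The name regexSplit is hypothetical, a C++11 compiler is assumed, and edge cases such as empty input or empty fields may not behave exactly like Perl's split.

```cpp
#include <regex>
#include <string>
#include <vector>

// Splits input wherever expr matches and stores the pieces in splitList.
// Returns the number of pieces.
int regexSplit(const std::string& input, const std::string& expr,
               std::vector<std::string>& splitList) {
    const std::regex re(expr);
    // Submatch index -1 selects the text between matches, not the matches.
    splitList.assign(
        std::sregex_token_iterator(input.begin(), input.end(), re, -1),
        std::sregex_token_iterator());
    return static_cast<int>(splitList.size());
}
```

Splitting "Hello Happy World!" on "\\s" produces the three items "Hello", "Happy" and "World!".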
void Sort(CStringArray &inputStringList, CStringArray &outputStringList, int Direction = 0)
Sort performs sorting on an array by sending: outputStringList = sort (inputStringList) if the direction is 0 and outputStringList = reverse sort (inputStringList) if the direction is not 0 (for example, 1). Sort does not return any value.
CPerlString perl;
CStringArray inputStringList, outputStringList;
int i; // Declared once: VC++ 6.0 rejects a second "int i" in the same scope

inputStringList.Add("Hello");
inputStringList.Add("Happy");
inputStringList.Add("World");

perl.Sort(inputStringList, outputStringList); // Forward sort
for (i = 0; i < outputStringList.GetSize(); i++)
    printf("%s\n", (LPCTSTR)outputStringList.GetAt(i));

perl.Sort(inputStringList, outputStringList, 1); // Reverse sort
for (i = 0; i < outputStringList.GetSize(); i++)
    printf("%s\n", (LPCTSTR)outputStringList.GetAt(i));
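The two directions correspond to a plain string sort, optionally followed by a reversal. A minimal standard C++ sketch, assuming the default string comparison is what is wanted (the name sortStrings is hypothetical and not part of either class):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// direction == 0: ascending string order (like Perl's "sort").
// direction != 0: the same order reversed (like "reverse sort").
void sortStrings(const std::vector<std::string>& input,
                 std::vector<std::string>& output, int direction = 0) {
    output = input;                           // leave the input untouched
    std::sort(output.begin(), output.end());  // compares with operator<
    if (direction != 0)
        std::reverse(output.begin(), output.end());
}
```

With the three strings used above, the forward order is "Happy", "Hello", "World" and the reverse order is "World", "Hello", "Happy".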
void Chomp(CString &inputOutputString)
void Chomp(CStringArray &inputOutputStringList)
void Chop(CString &inputOutputString)
void Chop(CStringArray &inputOutputStringList)
Chop and Chomp are two functions that perform Perl’s chop and chomp. They are overloaded to handle a string and a string array. The corresponding Perl statement is: chomp (inputOutputString) or chomp (inputOutputStringList) or chop (inputOutputString) or chop (inputOutputStringList).
CPerlString perl;
CString inputOutputString = "Hello World!";
CStringArray inputOutputStringList;
inputOutputStringList.Add("Hello");
inputOutputStringList.Add("Happy");
inputOutputStringList.Add("World");

// Chop a single string in place.
perl.Chop(inputOutputString);
printf("%s\n", (LPCTSTR)inputOutputString);

// Chop every element of a string array in place.
perl.Chop(inputOutputStringList);
for (int i = 0; i < inputOutputStringList.GetSize(); i++)
    printf("%s\n", (LPCTSTR)inputOutputStringList.GetAt(i));
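The difference between the two operations is easy to miss: in Perl, chop always removes the last character, whereas chomp removes only a trailing line ending. The sketch below shows that difference in standard C++. The names chopString and chompString are hypothetical, a C++11 compiler is assumed, and the line ending is assumed to be a single "\n" (Perl's default input record separator).

```cpp
#include <string>

// Like Perl's chop: removes the last character, whatever it is.
void chopString(std::string& s) {
    if (!s.empty())
        s.pop_back();
}

// Like Perl's chomp with the default separator: removes one trailing
// newline if present and otherwise leaves the string unchanged.
void chompString(std::string& s) {
    if (!s.empty() && s.back() == '\n')
        s.pop_back();
}
```

Chopping "Hello World!" therefore gives "Hello World", while chomping it leaves it unchanged because it does not end in a newline.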
Usage:
To use either class, include it in your project with the statement #include "CPerlString.h" or #include "PerlString.h". You will also need the Perl core headers and libraries, which are found in the directory where Perl is installed. For example, if Perl is installed in C:\Perl, they will be found in C:\Perl\lib\CORE. You will need to add this directory to the default include and library directories for Visual C++. This can be done by choosing Tools->Options->Directories and adding C:\Perl\lib\CORE to both the include files and the library files sections.
Demo
The demo program demonstrates the use of the above-mentioned functions except chomp and chop. It can be used to help test your regular expression patterns.