CPerlString - A Class to Utilize Perl String Functions

Environment: VC++ 5.0-6.0, NT 4.0, Win2000, WinXP, Win95/98

String manipulation and regular expressions have always been a strength of Perl. The C language is powerful, but string manipulation and regular expressions are difficult in it. Although libraries such as PCRE provide Perl-style regular expressions in C, I have found them difficult to use. So, I thought, the best way is to embed Perl inside C.

With the help of Perl's documentation, I have successfully created two classes to encapsulate some of Perl's most useful functions. One class is for MFC users (CPerlString.h), and one class is for non-MFC users (PerlString.h). The functions encapsulated are:

  1. Pattern matching
  2. String substitution
  3. Joining of an array into a string
  4. Splitting of a string into an array
  5. Sort (forward or reverse)
  6. Chop
  7. Chomp

I will discuss the usage of the MFC class. The non-MFC class is similar but it uses STL string and vector<string> instead of CString and CStringArray.

I32 CPerlString::Match (CString inputString, CString pattern)
I32 CPerlString::Matches (CString inputString, CString pattern, CStringArray &matchList)

Match and Matches are two functions that make use of Perl's pattern-matching ability. Basically, Match sends a Perl statement: inputString =~ pattern and Matches sends a Perl statement: matchList = (inputString =~ pattern). Match will return 1 if the pattern is found and 0 if the pattern is not found. Matches will return the number of matches.

CPerlString perl;
CString inputString = "Hello World!";
CString pattern1 = "/Hello/";
CString pattern2 = "/(.o)/g";
CStringArray matchList;

if (perl.Match(inputString, pattern1))
  printf("Pattern found\n");
else
  printf("Pattern not found\n");

int num_matches = perl.Matches(inputString, pattern2, matchList);

printf("%d matches\n", num_matches);
if (num_matches > 0)
  for (int i = 0; i < matchList.GetSize(); i++)
    printf("Match %d: %s\n", i+1, (LPCTSTR)matchList.GetAt(i));
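
To check what result to expect from Match and Matches without a Perl installation, the same behaviour can be approximated with the standard library. This is only a sketch and not how CPerlString works internally (the class hands the statement to an embedded Perl interpreter). It assumes a C++11 compiler, which is newer than the Visual C++ versions listed above, and the names MatchSketch and MatchesSketch are hypothetical. Patterns are written without the surrounding slashes and flags, and std::regex uses ECMAScript syntax, which differs from Perl's in some details.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of CPerlString): approximates Perl's
//   $input =~ /pattern/
// Returns true when the pattern occurs anywhere in the input.
bool MatchSketch(const std::string& input, const std::string& pattern)
{
    const std::regex re(pattern);
    return std::regex_search(input, re);
}

// Hypothetical helper: approximates Perl's list-context global match
//   @matchList = ($input =~ /pattern/g)
// With capture groups, every group of every match is collected;
// without groups, the whole text of every match is collected.
// Returns the number of items collected.
int MatchesSketch(const std::string& input, const std::string& pattern,
                  std::vector<std::string>& matchList)
{
    const std::regex re(pattern);
    matchList.clear();
    const std::sregex_iterator end;
    for (std::sregex_iterator it(input.begin(), input.end(), re); it != end; ++it) {
        if (re.mark_count() == 0) {
            matchList.push_back(it->str());            // no groups: keep the whole match
        } else {
            for (std::size_t g = 1; g <= re.mark_count(); ++g)
                matchList.push_back((*it)[g].str());   // one entry per capture group
        }
    }
    return static_cast<int>(matchList.size());
}
```

Called with "Hello World!" and the pattern (.o), MatchesSketch collects two items, lo and Wo.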

I32 Substitute(CString &inputOutputString, CString pattern)

Substitute is a function that makes use of Perl's string substitution ability. It sends: inputOutputString =~ pattern, where pattern is a complete substitution expression such as s/old/new/. Substitute will return 1 if the substitution is done and 0 if it is not done.

CPerlString perl;
CString inputOutputString = "Hello World!";
CString pattern1 = "s/Hello/Hello Happy/";

perl.Substitute(inputOutputString, pattern1);
printf("%s\n", (LPCTSTR)inputOutputString);
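
For comparison, a substitution without the /g flag replaces only the first occurrence. The standard-library sketch below imitates that and the 1/0 return value. It is an illustration under assumptions, not the class's implementation: it needs a C++11 compiler, SubstituteSketch is a hypothetical name, and the pattern and replacement are passed as two separate arguments instead of one s/.../.../ string.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of CPerlString): approximates Perl's
//   $text =~ s/pattern/replacement/
// Only the first occurrence is replaced, as in Perl without the /g flag.
// Returns 1 if a substitution was made and 0 otherwise.
int SubstituteSketch(std::string& text, const std::string& pattern,
                     const std::string& replacement)
{
    const std::regex re(pattern);
    if (!std::regex_search(text, re))
        return 0;                                   // nothing to replace
    text = std::regex_replace(text, re, replacement,
                              std::regex_constants::format_first_only);
    return 1;
}
```

Applied to "Hello World!" with the pattern Hello and the replacement Hello Happy, the string becomes "Hello Happy World!".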

void Join(CStringArray &inputStringList, CString pattern, CString &outputString)

Join is a function that makes use of Perl's joining ability. It sends: outputString = join (pattern, inputStringList). Here pattern is the separator string placed between the elements, not a regular expression. Join does not return any value.

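CPerlString perl;
CString outputString;
CString pattern1 = " ";
CStringArray inputStringList;

// Fill the list so that there is something to join
inputStringList.Add("Hello");
inputStringList.Add("Happy");
inputStringList.Add("World!");

perl.Join(inputStringList, pattern1, outputString);
printf ("%s\n", (LPCTSTR)outputString);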
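
What join produces can be written out in a few lines of plain C++. The function below is a hypothetical stand-in (JoinSketch is not part of either class) that uses std::string and std::vector<std::string>, the types the non-MFC class is described as using.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of CPerlString): approximates Perl's
//   $out = join($separator, @items)
// The separator goes between elements, never before the first or after the last.
std::string JoinSketch(const std::vector<std::string>& items,
                       const std::string& separator)
{
    std::string out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0)
            out += separator;
        out += items[i];
    }
    return out;
}
```

Joining Hello, Happy and World! with a single space gives "Hello Happy World!"; an empty list gives an empty string.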

I32 Split(CString inputString, CString pattern, CStringArray &splitList)

Split performs the Perl statement: splitList = split (pattern, inputString). It returns the number of split items.

CPerlString perl;
CString inputString = "Hello Happy World!";
CString pattern1 = "/\\s/";
CStringArray splitList;

int num_split = perl.Split(inputString, pattern1, splitList);

printf("%d split\n", num_split);
if (num_split > 0)
  for (int i = 0; i < splitList.GetSize(); i++)
    printf("Split %d: %s\n", i+1, (LPCTSTR)splitList.GetAt(i));
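
One detail of Perl's split that is easy to miss is that trailing empty fields are dropped. The sketch below reproduces that with std::regex so the expected count can be checked without Perl. It assumes a C++11 compiler, SplitSketch is a hypothetical name, the pattern is given without slashes, and the special cases of Perl's split (a limit argument, capture groups in the pattern, the ' ' pattern) are deliberately left out.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of CPerlString): approximates Perl's
//   @splitList = split(/pattern/, $input)
// Returns the number of fields.
int SplitSketch(const std::string& input, const std::string& pattern,
                std::vector<std::string>& splitList)
{
    const std::regex re(pattern);
    splitList.clear();
    // Sub-match index -1 selects the text between matches, not the matches.
    const std::sregex_token_iterator end;
    for (std::sregex_token_iterator it(input.begin(), input.end(), re, -1);
         it != end; ++it)
        splitList.push_back(it->str());
    // Like Perl's split, discard empty fields at the end of the list.
    while (!splitList.empty() && splitList.back().empty())
        splitList.pop_back();
    return static_cast<int>(splitList.size());
}
```

Splitting "Hello Happy World!" on \s gives three fields: Hello, Happy and World!.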

void Sort(CStringArray &inputStringList, CStringArray &outputStringList, int Direction = 0)

Sort performs sorting on an array by sending: outputStringList = sort (inputStringList) if the direction is 0 and outputStringList = reverse sort (inputStringList) if the direction is not 0 (for example, 1). Sort does not return any value.

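CPerlString perl;
CStringArray inputStringList, outputStringList;
int i;  // declared once: VC++ 6.0 keeps a for-loop variable in the enclosing scope

// Fill the list so that there is something to sort
inputStringList.Add("pear");
inputStringList.Add("apple");
inputStringList.Add("fig");

perl.Sort(inputStringList, outputStringList);    // Forward sort

for (i = 0; i < outputStringList.GetSize(); i++)
  printf("%s\n", (LPCTSTR)outputStringList.GetAt(i));

perl.Sort(inputStringList, outputStringList, 1); // Reverse sort

for (i = 0; i < outputStringList.GetSize(); i++)
  printf("%s\n", (LPCTSTR)outputStringList.GetAt(i));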
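
Perl's default sort compares elements as strings, so uppercase letters come before lowercase ones, and reverse sort is the same ordering turned around. A rough standard-library equivalent, with the hypothetical name SortSketch and the same Direction convention as above, could look like this:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not part of CPerlString): approximates
//   @output = sort(@input)          when direction is 0
//   @output = reverse sort(@input)  when direction is not 0
void SortSketch(const std::vector<std::string>& input,
                std::vector<std::string>& output, int direction = 0)
{
    output = input;                              // the input list is left untouched
    std::sort(output.begin(), output.end());     // plain string comparison
    if (direction != 0)
        std::reverse(output.begin(), output.end());
}
```

For the list pear, Apple, fig, the forward result is Apple, fig, pear and the reverse result is pear, fig, Apple.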

void Chomp(CString &inputOutputString)
void Chomp(CStringArray &inputOutputStringList)
void Chop(CString &inputOutputString)
void Chop(CStringArray &inputOutputStringList)

Chop and Chomp are two functions that perform Perl's chop and chomp. chop removes the last character of a string, whatever it is; chomp removes only a trailing newline. Both are overloaded to handle a string and a string array. The corresponding Perl statements are: chomp (inputOutputString), chomp (inputOutputStringList), chop (inputOutputString), and chop (inputOutputStringList).

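CPerlString perl;
CString inputOutputString = "Hello World!";
CStringArray inputOutputStringList;

inputOutputStringList.Add("Hello\n");
inputOutputStringList.Add("World\n");

perl.Chop(inputOutputString);        // removes the last character, '!'
printf("%s\n", (LPCTSTR)inputOutputString);

perl.Chomp(inputOutputStringList);   // removes the trailing newline of each element
for (int i = 0; i < inputOutputStringList.GetSize(); i++)
  printf("%s\n", (LPCTSTR)inputOutputStringList.GetAt(i));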
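
The difference between the two is easiest to see side by side. The helpers below are hypothetical (ChopSketch and ChompSketch are not part of either class) and assume Perl's default input record separator, a single newline.

```cpp
#include <string>

// Hypothetical helper (not part of CPerlString): like Perl's chop,
// removes the last character regardless of what it is.
void ChopSketch(std::string& text)
{
    if (!text.empty())
        text.erase(text.size() - 1);
}

// Hypothetical helper: like Perl's chomp with the default record separator,
// removes one trailing newline and leaves any other ending alone.
void ChompSketch(std::string& text)
{
    if (!text.empty() && text[text.size() - 1] == '\n')
        text.erase(text.size() - 1);
}
```

Starting from "Hello World!\n", ChompSketch leaves "Hello World!" and a second call changes nothing, whereas ChopSketch on that result removes the exclamation mark.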


To use the two classes, you will need to include them in your project by adding the statement: #include "CPerlString.h" or #include "PerlString.h". You will also need to have the Perl core libraries. These will be found in the directory where you install Perl. For example, if you install Perl in C:\Perl, the core libraries will be found at C:\Perl\lib\CORE. You will need to set this directory as one of the default include and library directories for Visual C++. This can be done by choosing Tools->Options->Directories and adding C:\Perl\lib\CORE in the include files and library files section.


The demo program demonstrates the use of the above-mentioned functions except chomp and chop. It can be used to help test your regular expression patterns.


Download demo project - 40 Kb
Download source - 4 Kb

This article was originally published on March 6th, 2003
