Using Regular Expressions for Search/Replace
Normally, when you search for a sub-string in a string, the match should be exact. So if we search for a sub-string "abc" then the string being searched should contain these exact letters in the same sequence for a match to be found. We can extend this kind of search to a case insensitive search where the sub-string "abc" will find strings like "Abc", "ABC" etc. That is, the case is ignored but the sequence of the letters should be exactly the same. Sometimes, a case insensitive search is also not enough. For example, if we want to search for numeric digit, then we basically end up searching for each digit independantly. This is where regular expressions come in to our help.
Regular expressions are text patterns that are used for string matching. Regular expressions are strings that contains a mix of plain text and special characters to indicate what kind of matching to do. Here's a very brief turorial on using regular expressions before we move on to the code for handling regular expressions.
Suppose, we are looking for a numeric digit then the regular expression we would search for is "[0-9]". The brackets indicate that the character being compared should match any one of the characters enclosed within the bracket. The dash (-) between 0 and 9 indicates that it is a range from 0 to 9. Therefore, this regular expression will match any character between 0 and 9, that is, any digit. If we want to search for a special character literally we must use a backslash before the special character. For example, the single character regular expression "\*" matches a single asterisk. In the table below the special characters are briefly described.
|
Character |
Description |
|
^ |
Beginning of the string. The expression "^A" will match an A only at the beginning of the string. |
|
^ |
The caret (^) immediately following the left-bracket ([) has a different meaning. It is used to exclude the remaining characters within brackets from matching the target string. The expression "[^0-9]" indicates that the target character should not be a digit. |
|
$ |
The dollar sign ($) will match the end of the string. The expression "abc$" will match the sub-string "abc" only if it is at the end of the string. |
|
| |
The alternation character (|) allows either expression on its side to match the target string. The expression "a|b" will match a as well as b. |
|
. |
The dot (.) will match any character. |
|
* |
The asterix (*) indicates that the character to the left of the asterix in the expression should match 0 or more times. |
|
+ |
The plus (+) is similar to asterix but there should be at least one match of the character to the left of the + sign in the expression. |
|
? |
The question mark (?) matches the character to its left 0 or 1 times. |
|
() |
The parenthesis affects the order of pattern evaluation and also serves as a tagged expression that can be used when replacing the matched sub-string with another expression. |
|
[] |
Brackets ([ and ]) enclosing a set of characters indicates that any of the enclosed characters may match the target character. |
The parenthesis, besides affecting the evaluation order of the regular expression, also serves as tagged expression which is something like a temporary memory. This memory can then be used when we want to replace the found expression with a new expression. The replace expression can specify a & character which means that the & represents the sub-string that was found. So, if the sub-string that matched the regular expression is "abcd", then a replace expression of "xyz&xyz" will change it to "xyzabcdxyz". The replace expression can also be expressed as "xyz\0xyz". The "\0" indicates a tagged expression representing the entire sub-string that was matched. Similarly we can have other tagged expression represented by "\1", "\2" etc. Note that although the tagged expression 0 is always defined, the tagged expression 1,2 etc. are only defined if the regular expression used in the search had enough sets of parenthesis. Here are few examples.
|
String |
Search |
Replace |
Result |
|
Mr. |
(Mr)(\.) |
\1s\2 |
Mrs. |
|
abc |
(a)b(c) |
&-\1-\2 |
abc-a-c |
|
bcd |
(a|b)c*d |
&-\1 |
bcd-b |
|
abcde |
(.*)c(.*) |
&-\1-\2 |
abcde-ab-de |
|
cde |
(ab|cd)e |
&-\1 |
cde-cd |
The code for regular expression search is given below. This code was derived from work done by Henry Spencer. Besides adding a C++ wrapper around C code, a couple of functions were added to enable search and replace operations.
////////////////////////////////////////////////////////////////////////
// RegExp.h
//
// This code has been derived from work by Henry Spencer.
// The main changes are
// 1. All char variables and functions have been changed to TCHAR
// counterparts
// 2. Added GetFindLen() & GetReplaceString() to enable search
// and replace operations.
// 3. And of course, added the C++ Wrapper
//
// The original copyright notice follows:
//
// Copyright (c) 1986, 1993, 1995 by University of Toronto.
// Written by Henry Spencer. Not derived from licensed software.
//
// Permission is granted to anyone to use this software for any
// purpose on any computer system, and to redistribute it in any way,
// subject to the following restrictions:
//
// 1. The author is not responsible for the consequences of use of
// this software, no matter how awful, even if they arise
// from defects in it.
//
// 2. The origin of this software must not be misrepresented, either
// by explicit claim or by omission.
//
// 3. Altered versions must be plainly marked as such, and must not
// be misrepresented (by explicit claim or omission) as being
// the original software.
//
// 4. This notice must not be removed or altered.
/////////////////////////////////////////////////////////////////////////////
#define NSUBEXP 10
class CRegExp
{
public:
CRegExp();
~CRegExp();
CRegExp *RegComp( const TCHAR *re );
int RegFind(const TCHAR *str);
TCHAR* GetReplaceString( const TCHAR* sReplaceExp );
int GetFindLen()
{
if( startp[0] == NULL || endp[0] == NULL )
return 0;
return endp[0] - startp[0];
}
private:
TCHAR *regnext(TCHAR *node);
void reginsert(TCHAR op, TCHAR *opnd);
int regtry(TCHAR *string);
int regmatch(TCHAR *prog);
size_t regrepeat(TCHAR *node);
TCHAR *reg(int paren, int *flagp);
TCHAR *regbranch(int *flagp);
void regtail(TCHAR *p, TCHAR *val);
void regoptail(TCHAR *p, TCHAR *val);
TCHAR *regpiece(int *flagp);
TCHAR *regatom(int *flagp);
// Inline functions
private:
TCHAR OP(TCHAR *p) {return *p;};
TCHAR *OPERAND( TCHAR *p) {return (TCHAR*)((short *)(p+1)+1); };
// regc - emit (if appropriate) a byte of code
void regc(TCHAR b)
{
if (bEmitCode)
*regcode++ = b;
else
regsize++;
};
// regnode - emit a node
TCHAR * regnode(TCHAR op)
{
if (!bEmitCode) {
regsize += 3;
return regcode;
}
*regcode++ = op;
*regcode++ = _T('\0'); /* Null next pointer. */
*regcode++ = _T('\0');
return regcode-3;
};
private:
BOOL bEmitCode;
BOOL bCompiled;
TCHAR *sFoundText;
TCHAR *startp[NSUBEXP];
TCHAR *endp[NSUBEXP];
TCHAR regstart; // Internal use only.
TCHAR reganch; // Internal use only.
TCHAR *regmust; // Internal use only.
int regmlen; // Internal use only.
TCHAR *program; // Unwarranted chumminess with compiler.
TCHAR *regparse; // Input-scan pointer.
int regnpar; // () count.
TCHAR *regcode; // Code-emit pointer; ®dummy = don't.
TCHAR regdummy[3]; // NOTHING, 0 next ptr
long regsize; // Code size.
TCHAR *reginput; // String-input pointer.
TCHAR *regbol; // Beginning of input, for ^ check.
};
////////////////////////////////////////////////////////////////////////////////
// RegExp.cpp
////////////////////////////////////////////////////////////////////////////////
#include "stdafx.h"
#include "RegExp.h"
// definition number opnd? meaning
#define END 0 // no End of program.
#define BOL 1 // no Match beginning of line.
#define EOL 2 // no Match end of line.
#define ANY 3 // no Match any character.
#define ANYOF 4 // str Match any of these.
#define ANYBUT 5 // str Match any but one of these.
#define BRANCH 6 // node Match this, or the next..\&.
#define BACK 7 // no "next" ptr points backward.
#define EXACTLY 8 // str Match this string.
#define NOTHING 9 // no Match empty string.
#define STAR 10 // node Match this 0 or more times.
#define PLUS 11 // node Match this 1 or more times.
#define OPEN 20 // no Sub-RE starts here.
// OPEN+1 is number 1, etc.
#define CLOSE 30 // no Analogous to OPEN.
// Utility definitions.
#define FAIL(m) { regerror(m); return(NULL); }
#define ISREPN(c) ((c) == _T('*') || (c) == _T('+') || (c) == _T('?'))
#define META "^$.[()|?+*\\"
// Flags to be passed up and down.
#define HASWIDTH 01 // Known never to match null string.
#define SIMPLE 02 // Simple enough to be STAR/PLUS operand.
#define SPSTART 04 // Starts with * or +.
#define WORST 0 // Worst case.
CRegExp::CRegExp()
{
bCompiled = FALSE;
program = NULL;
sFoundText = NULL;
for( int i = 0; i < NSUBEXP; i++ )
{
startp[i] = NULL;
endp[i] = NULL;
}
}
CRegExp::~CRegExp()
{
delete program;
delete sFoundText;
}
CRegExp* CRegExp::RegComp(const TCHAR *exp)
{
TCHAR *scan;
int flags;
if (exp == NULL)
return NULL;
bCompiled = TRUE;
// First pass: determine size, legality.
bEmitCode = FALSE;
regparse = (TCHAR *)exp;
regnpar = 1;
regsize = 0L;
regdummy[0] = NOTHING;
regdummy[1] = regdummy[2] = 0;
regcode = regdummy;
if (reg(0, &flags) == NULL)
return(NULL);
// Allocate space.
delete program;
program = new TCHAR[regsize];
memset( program, 0, regsize * sizeof(TCHAR) );
if (program == NULL)
return NULL;
// Second pass: emit code.
bEmitCode = TRUE;
regparse = (TCHAR *)exp;
regnpar = 1;
regcode = program;
if (reg(0, &flags) == NULL)
return NULL;
// Dig out information for optimizations.
regstart = _T('\0'); // Worst-case defaults.
reganch = 0;
regmust = NULL;
regmlen = 0;
scan = program; // First BRANCH.
if (OP(regnext(scan)) == END)
{
// Only one top-level choice.
scan = OPERAND(scan);
// Starting-point info.
if (OP(scan) == EXACTLY)
regstart = *OPERAND(scan);
else if (OP(scan) == BOL)
reganch = 1;
// If there's something expensive in the r.e., find the
// longest literal string that must appear and make it the
// regmust. Resolve ties in favor of later strings, since
// the regstart check works with the beginning of the r.e.
// and avoiding duplication strengthens checking. Not a
// strong reason, but sufficient in the absence of others.
if (flags&SPSTART)
{
char *longest = NULL;
size_t len = 0;
for (; scan != NULL; scan = regnext(scan))
if (OP(scan) == EXACTLY && _tcslen(OPERAND(scan)) >= len)
{
longest = OPERAND(scan);
len = _tcslen(OPERAND(scan));
}
regmust = longest;
regmlen = (int)len;
}
}
return this;
}
// reg - regular expression, i.e. main body or parenthesized thing
//
// Caller must absorb opening parenthesis.
//
// Combining parenthesis handling with the base level of regular expression
// is a trifle forced, but the need to tie the tails of the branches to what
// follows makes it hard to avoid.
TCHAR *CRegExp::reg(int paren, int *flagp)
{
char *ret;
char *br;
char *ender;
int parno;
int flags;
*flagp = HASWIDTH; // Tentatively.
if (paren)
{
// Make an OPEN node.
if (regnpar >= NSUBEXP)
{
TRACE1("Too many (). NSUBEXP is set to %d\n", NSUBEXP );
return NULL;
}
parno = regnpar;
regnpar++;
ret = regnode(OPEN+parno);
}
// Pick up the branches, linking them together.
br = regbranch(&flags);
if (br == NULL)
return(NULL);
if (paren)
regtail(ret, br); // OPEN -> first.
else
ret = br;
*flagp &= ~(~flags&HASWIDTH); // Clear bit if bit 0.
*flagp |= flags&SPSTART
while (*regparse == _T('|')) {
regparse++;
br = regbranch(&flags);
if (br == NULL)
return(NULL);
regtail(ret, br); // BRANCH -> BRANCH.
*flagp &= ~(~flags&HASWIDTH);
*flagp |= flags&SPSTART
}
// Make a closing node, and hook it on the end.
ender = regnode((paren) ? CLOSE+parno : END);
regtail(ret, ender);
// Hook the tails of the branches to the closing node.
for (br = ret; br != NULL; br = regnext(br))
regoptail(br, ender);
// Check for proper termination.
if (paren && *regparse++ != _T(')'))
{
TRACE0("unterminated ()\n");
return NULL;
}
else if (!paren && *regparse != _T('\0'))
{
if (*regparse == _T(')'))
{
TRACE0("unmatched ()\n");
return NULL;
}
else
{
TRACE0("internal error: junk on end\n");
return NULL;
}
// NOTREACHED
}
return(ret);
}
//
// regbranch - one alternative of an | operator
//
// Implements the concatenation operator.
TCHAR *CRegExp::regbranch(int *flagp)
{
TCHAR *ret;
TCHAR *chain;
TCHAR *latest;
int flags;
int c;
*flagp = WORST; // Tentatively.
ret = regnode(BRANCH);
chain = NULL;
while ((c = *regparse) != _T('\0') && c != _T('|') && c != _T(')')) {
latest = regpiece(&flags);
if (latest == NULL)
return(NULL);
*flagp |= flags&HASWIDTH
if (chain == NULL) // First piece.
*flagp |= flags&SPSTART
else
regtail(chain, latest);
chain = latest;
}
if (chain == NULL) // Loop ran zero times.
(void) regnode(NOTHING);
return(ret);
}
//
// regpiece - something followed by possible [*+?]
//
// Note that the branching code sequences used for ? and the general cases
// of * and + are somewhat optimized: they use the same NOTHING node as
// both the endmarker for their branch list and the body of the last branch.
// It might seem that this node could be dispensed with entirely, but the
// endmarker role is not redundant.
TCHAR *CRegExp::regpiece(int *flagp)
{
TCHAR *ret;
TCHAR op;
TCHAR *next;
int flags;
ret = regatom(&flags);
if (ret == NULL)
return(NULL);
op = *regparse;
if (!ISREPN(op)) {
*flagp = flags;
return(ret);
}
if (!(flags&HASWIDTH) && op != _T('?'))
{
TRACE0("*+ operand could be empty\n");
return NULL;
}
switch (op) {
case _T('*'): *flagp = WORST|SPSTART; break;
case _T('+'): *flagp = WORST|SPSTART|HASWIDTH; break;
case _T('?'): *flagp = WORST; break;
}
if (op == _T('*') && (flags&SIMPLE))
reginsert(STAR, ret);
else if (op == _T('*')) {
// Emit x* as (x&|), where & means "self".
reginsert(BRANCH, ret); // Either x
regoptail(ret, regnode(BACK)); // and loop
regoptail(ret, ret); // back
regtail(ret, regnode(BRANCH)); // or
regtail(ret, regnode(NOTHING)); // null.
} else if (op == _T('+') && (flags&SIMPLE))
reginsert(PLUS, ret);
else if (op == _T('+')) {
// Emit x+ as x(&|), where & means "self".
next = regnode(BRANCH); // Either
regtail(ret, next);
regtail(regnode(BACK), ret); // loop back
regtail(next, regnode(BRANCH)); // or
regtail(ret, regnode(NOTHING)); // null.
} else if (op == _T('?')) {
// Emit x? as (x|)
reginsert(BRANCH, ret); // Either x
regtail(ret, regnode(BRANCH)); // or
next = regnode(NOTHING); // null.
regtail(ret, next);
regoptail(ret, next);
}
regparse++;
if (ISREPN(*regparse))
{
TRACE0("nested *?+\n");
return NULL;
}
return(ret);
}
//
// regatom - the lowest level
//
// Optimization: gobbles an entire sequence of ordinary characters so that
// it can turn them into a single node, which is smaller to store and
// faster to run. Backslashed characters are exceptions, each becoming a
// separate node; the code is simpler that way and it's not worth fixing.
TCHAR *CRegExp::regatom(int *flagp)
{
TCHAR *ret;
int flags;
*flagp = WORST; // Tentatively.
switch (*regparse++) {
case _T('^'):
ret = regnode(BOL);
break;
case _T('$'):
ret = regnode(EOL);
break;
case _T('.'):
ret = regnode(ANY);
*flagp |= HASWIDTH|SIMPLE;
break;
case _T('['): {
int range;
int rangeend;
int c;
if (*regparse == _T('^')) { // Complement of range.
ret = regnode(ANYBUT);
regparse++;
} else
ret = regnode(ANYOF);
if ((c = *regparse) == _T(']') || c == _T('-')) {
regc(c);
regparse++;
}
while ((c = *regparse++) != _T('\0') && c != _T(']')) {
if (c != _T('-'))
regc(c);
else if ((c = *regparse) == _T(']') || c == _T('\0'))
regc(_T('-'));
else
{
range = (unsigned) (TCHAR)*(regparse-2);
rangeend = (unsigned) (TCHAR)c;
if (range > rangeend)
{
TRACE0("invalid [] range\n");
return NULL;
}
for (range++; range <= rangeend; range++)
regc(range);
regparse++;
}
}
regc(_T('\0'));
if (c != _T(']'))
{
TRACE0("unmatched []\n");
return NULL;
}
*flagp |= HASWIDTH|SIMPLE;
break;
}
case _T('('):
ret = reg(1, &flags);
if (ret == NULL)
return(NULL);
*flagp |= flags&(HASWIDTH|SPSTART);
break;
case _T('\0'):
case _T('|'):
case _T(')'):
// supposed to be caught earlier
TRACE0("internal error: \\0|) unexpected\n");
return NULL;
break;
case _T('?'):
case _T('+'):
case _T('*'):
TRACE0("?+* follows nothing\n");
return NULL;
break;
case _T('\\'):
if (*regparse == _T('\0'))
{
TRACE0("trailing \\\n");
return NULL;
}
ret = regnode(EXACTLY);
regc(*regparse++);
regc(_T('\0'));
*flagp |= HASWIDTH|SIMPLE;
break;
default: {
size_t len;
TCHAR ender;
regparse--;
len = _tcscspn(regparse, META);
if (len == 0)
{
TRACE0("internal error: strcspn 0\n");
return NULL;
}
ender = *(regparse+len);
if (len > 1 && ISREPN(ender))
len--; // Back off clear of ?+* operand.
*flagp |= HASWIDTH;
if (len == 1)
*flagp |= SIMPLE;
ret = regnode(EXACTLY);
for (; len > 0; len--)
regc(*regparse++);
regc(_T('\0'));
break;
}
}
return(ret);
}
// reginsert - insert an operator in front of already-emitted operand
//
// Means relocating the operand.
void CRegExp::reginsert(TCHAR op, TCHAR *opnd)
{
TCHAR *place;
if (!bEmitCode) {
regsize += 3;
return;
}
(void) memmove(opnd+3, opnd, (size_t)((regcode - opnd)*sizeof(TCHAR)));
regcode += 3;
place = opnd; // Op node, where operand used to be.
*place++ = op;
*place++ = _T('\0');
*place++ = _T('\0');
}
//
// regtail - set the next-pointer at the end of a node chain
void CRegExp::regtail(TCHAR *p, TCHAR *val)
{
TCHAR *scan;
TCHAR *temp;
// int offset;
if (!bEmitCode)
return;
// Find last node.
for (scan = p; (temp = regnext(scan)) != NULL; scan = temp)
continue;
*((short *)(scan+1)) = (OP(scan) == BACK) ? scan - val : val - scan;
}
// regoptail - regtail on operand of first argument; nop if operandless
void CRegExp::regoptail(TCHAR *p, TCHAR *val)
{
// "Operandless" and "op != BRANCH" are synonymous in practice.
if (!bEmitCode || OP(p) != BRANCH)
return;
regtail(OPERAND(p), val);
}
// RegFind - match a regexp against a string
// Returns - Returns position of regexp or -1
// if regular expression not found
// Note - The regular expression should have been
// previously compiled using RegComp
int CRegExp::RegFind(const TCHAR *str)
{
TCHAR *string = (TCHAR *)str; // avert const poisoning
TCHAR *s;
// Delete any previously stored found string
delete sFoundText;
sFoundText = NULL;
// Be paranoid.
if(string == NULL)
{
TRACE0("NULL argument to regexec\n");
return(-1);
}
// Check validity of regex
if (!bCompiled)
{
TRACE0("No regular expression provided yet.\n");
return(-1);
}
// If there is a "must appear" string, look for it.
if (regmust != NULL && _tcsstr(string, regmust) == NULL)
return(-1);
// Mark beginning of line for ^
regbol = string;
// Simplest case: anchored match need be tried only once.
if (reganch)
{
if( regtry(string) )
{
// Save the found substring in case we need it
sFoundText = new TCHAR[GetFindLen()+1];
sFoundText[GetFindLen()] = _T('\0');
_tcsncpy(sFoundText, string, GetFindLen() );
return 0;
}
//String not found
return -1;
}
// Messy cases: unanchored match.
if (regstart != _T('\0'))
{
// We know what TCHAR it must start with.
for (s = string; s != NULL; s = _tcschr(s+1, regstart))
if (regtry(s))
{
int nPos = s-str;
// Save the found substring in case we need it later
sFoundText = new TCHAR[GetFindLen()+1];
sFoundText[GetFindLen()] = _T('\0');
_tcsncpy(sFoundText, s, GetFindLen() );
return nPos;
}
return -1;
}
else
{
// We don't -- general case
for (s = string; !regtry(s); s++)
if (*s == _T('\0'))
return(-1);
int nPos = s-str;
// Save the found substring in case we need it later
sFoundText = new TCHAR[GetFindLen()+1];
sFoundText[GetFindLen()] = _T('\0');
_tcsncpy(sFoundText, s, GetFindLen() );
return nPos;
}
// NOTREACHED
}
// regtry - try match at specific point
int CRegExp::regtry(TCHAR *string)
{
int i;
TCHAR **stp;
TCHAR **enp;
reginput = string;
stp = startp;
enp = endp;
for (i = NSUBEXP; i > 0; i--)
{
*stp++ = NULL;
*enp++ = NULL;
}
if (regmatch(program))
{
startp[0] = string;
endp[0] = reginput;
return(1);
}
else
return(0);
}
// regmatch - main matching routine
//
// Conceptually the strategy is simple: check to see whether the current
// node matches, call self recursively to see whether the rest matches,
// and then act accordingly. In practice we make some effort to avoid
// recursion, in particular by going through "ordinary" nodes (that don't
// need to know whether the rest of the match failed) by a loop instead of
// by recursion.
int CRegExp::regmatch(TCHAR *prog)
{
TCHAR *scan; // Current node.
TCHAR *next; // Next node.
for (scan = prog; scan != NULL; scan = next) {
next = regnext(scan);
switch (OP(scan)) {
case BOL:
if (reginput != regbol)
return(0);
break;
case EOL:
if (*reginput != _T('\0'))
return(0);
break;
case ANY:
if (*reginput == _T('\0'))
return(0);
reginput++;
break;
case EXACTLY: {
size_t len;
TCHAR *const opnd = OPERAND(scan);
// Inline the first character, for speed.
if (*opnd != *reginput)
return(0);
len = _tcslen(opnd);
if (len > 1 && _tcsncmp(opnd, reginput, len) != 0)
return(0);
reginput += len;
break;
}
case ANYOF:
if (*reginput == _T('\0') ||
_tcschr(OPERAND(scan), *reginput) == NULL)
return(0);
reginput++;
break;
case ANYBUT:
if (*reginput == _T('\0') ||
_tcschr(OPERAND(scan), *reginput) != NULL)
return(0);
reginput++;
break;
case NOTHING:
break;
case BACK:
break;
case OPEN+1: case OPEN+2: case OPEN+3:
case OPEN+4: case OPEN+5: case OPEN+6:
case OPEN+7: case OPEN+8: case OPEN+9: {
const int no = OP(scan) - OPEN;
TCHAR *const input = reginput;
if (regmatch(next)) {
// Don't set startp if some later
// invocation of the same parentheses
// already has.
if (startp[no] == NULL)
startp[no] = input;
return(1);
} else
return(0);
break;
}
case CLOSE+1: case CLOSE+2: case CLOSE+3:
case CLOSE+4: case CLOSE+5: case CLOSE+6:
case CLOSE+7: case CLOSE+8: case CLOSE+9: {
const int no = OP(scan) - CLOSE;
TCHAR *const input = reginput;
if (regmatch(next)) {
// Don't set endp if some later
// invocation of the same parentheses
// already has.
if (endp[no] == NULL)
endp[no] = input;
return(1);
} else
return(0);
break;
}
case BRANCH: {
TCHAR *const save = reginput;
if (OP(next) != BRANCH) // No choice.
next = OPERAND(scan); // Avoid recursion.
else {
while (OP(scan) == BRANCH) {
if (regmatch(OPERAND(scan)))
return(1);
reginput = save;
scan = regnext(scan);
}
return(0);
// NOTREACHED
}
break;
}
case STAR:
case PLUS: {
const TCHAR nextch =
(OP(next) == EXACTLY) ? *OPERAND(next) : _T('\0');
size_t no;
TCHAR *const save = reginput;
const size_t min = (OP(scan) == STAR) ? 0 : 1;
for (no = regrepeat(OPERAND(scan)) + 1; no > min; no--) {
reginput = save + no - 1;
// If it could work, try it.
if (nextch == _T('\0') || *reginput == nextch)
if (regmatch(next))
return(1);
}
return(0);
break;
}
case END:
return(1); // Success!
break;
default:
TRACE0("regexp corruption\n");
return(0);
break;
}
}
// We get here only if there's trouble -- normally "case END" is
// the terminating point.
TRACE0("corrupted pointers\n");
return(0);
}
// regrepeat - report how many times something simple would match
size_t CRegExp::regrepeat(TCHAR *node)
{
size_t count;
TCHAR *scan;
TCHAR ch;
switch (OP(node))
{
case ANY:
return(_tcslen(reginput));
break;
case EXACTLY:
ch = *OPERAND(node);
count = 0;
for (scan = reginput; *scan == ch; scan++)
count++;
return(count);
break;
case ANYOF:
return(_tcsspn(reginput, OPERAND(node)));
break;
case ANYBUT:
return(_tcscspn(reginput, OPERAND(node)));
break;
default: // Oh dear. Called inappropriately.
TRACE0("internal error: bad call of regrepeat\n");
return(0); // Best compromise.
break;
}
// NOTREACHED
}
// regnext - dig the "next" pointer out of a node
TCHAR *CRegExp::regnext(TCHAR *p)
{
const short &offset = *((short*)(p+1));
if (offset == 0)
return(NULL);
return((OP(p) == BACK) ? p-offset : p+offset);
}
// GetReplaceString - Converts a replace expression to a string
// Returns - Pointer to newly allocated string
// Caller is responsible for deleting it
TCHAR* CRegExp::GetReplaceString( const TCHAR* sReplaceExp )
{
TCHAR *src = (TCHAR *)sReplaceExp;
TCHAR *buf;
TCHAR c;
int no;
size_t len;
if( sReplaceExp == NULL || sFoundText == NULL )
return NULL;
// First compute the length of the string
int replacelen = 0;
while ((c = *src++) != _T('\0'))
{
if (c == _T('&'))
no = 0;
else if (c == _T('\\') && isdigit(*src))
no = *src++ - _T('0');
else
no = -1;
if (no < 0)
{
// Ordinary character.
if (c == _T('\\') && (*src == _T('\\') || *src == _T('&')))
c = *src++;
replacelen++;
}
else if (startp[no] != NULL && endp[no] != NULL &&
endp[no] > startp[no])
{
// Get tagged expression
len = endp[no] - startp[no];
replacelen += len;
}
}
// Now allocate buf
buf = new TCHAR[replacelen+1];
if( buf == NULL )
return NULL;
TCHAR* sReplaceStr = buf;
// Add null termination
buf[replacelen] = _T('\0');
// Now we can create the string
src = (TCHAR *)sReplaceExp;
while ((c = *src++) != _T('\0'))
{
if (c == _T('&'))
no = 0;
else if (c == _T('\\') && isdigit(*src))
no = *src++ - _T('0');
else
no = -1;
if (no < 0)
{
// Ordinary character.
if (c == _T('\\') && (*src == _T('\\') || *src == _T('&')))
c = *src++;
*buf++ = c;
}
else if (startp[no] != NULL && endp[no] != NULL &&
endp[no] > startp[no])
{
// Get tagged expression
len = endp[no] - startp[no];
int tagpos = startp[no] - startp[0];
_tcsncpy(buf, sFoundText + tagpos, len);
buf += len;
}
}
return sReplaceStr;
}
Here's a function that will do global search and replace using regular expressions. Note that the CStringEx class described in the earlier section is being used here. The main reason for using CStringEx is that it provides the Replace() function which makes our task easier.
int RegSearchReplace( CStringEx& string, LPCTSTR sSearchExp,
LPCTSTR sReplaceExp )
{
int nPos = 0;
int nReplaced = 0;
CRegExp r;
LPTSTR str = (LPTSTR)(LPCTSTR)string;
r.RegComp( sSearchExp );
while( (nPos = r.RegFind((LPTSTR)str)) != -1 )
{
nReplaced++;
TCHAR *pReplaceStr = r.GetReplaceString( sReplaceExp );
int offset = str-(LPCTSTR)string+nPos;
string.Replace( offset, r.GetFindLen(),
pReplaceStr );
// Replace might have caused a reallocation
str = (LPTSTR)(LPCTSTR)string + offset + _tcslen(pReplaceStr);
delete pReplaceStr;
}
return nReplaced;
}

Comments
Gossip, Lies Then mulberry bags
Posted by hekvvddtest on 05/25/2013 02:20pmThe classy and elegant type of womens footwear that is chosen with a Special occasion in mind such as a weddings, elegant dinner party, or a special party/date for a hot night on the town, the most popular has always been and probably always will be anything with heels such as the gorgeous rhinestone embellished womens sandals, sexy stilettos or a pair of fabulous womens fashion boots. Blank Snapbacks Wholesale ralph lauren sale All of these styles come in a multitude of designs, colors, and brands..
Replywww.jordan12playoffs.infojordans 11 bred
Posted by bybeakrah on 05/25/2013 04:19amFind out Amazing things Of The red sea And also Nike jordan
Replywww.jordan12playoffs.infojordans 11 bred pas cher
Posted by bybeakrah on 05/24/2013 07:32amé?Detroit Aide Nike jordan Some with regard to Personas
Replywheloltabotly PumeSonee Phobereurce 4980053
Posted by TizefaTaNaday on 05/23/2013 04:35amburiautonia http://www.cnhappyer.com/images/index.php?louis-vuitton-current-owner-louis-vuitton-coupons-discounts.html arrarpmeaph http://www.mtsambulance.com/images/index.php?louis-vuitton-cheapest-item-louis-vuitton-best.html loonseGloke
ReplyMore concessions with herveleger, more move instead of!
Posted by jonemxoa on 04/20/2013 03:47pmsheilatoms sale cheap toms shoes sale urchintoms kids sale turtle-dovetoms outlet online toms kids sale perceiveburberry bags outlet obedientburberry outlet online rep
ReplyItemized requirements seeking induction of the Joomla! contentedness guidance system.
Posted by koltchklj on 04/16/2013 07:38amOur Resultant Clothing experts dominate made collages showing you what your [url=http://www.hollistercovfrance.fr]hollister[/url] favorite celebrities are wearing and what we be struck around to imprison up in favour of trading you that is remarkably similar. We are current to secure this âStardom [url=http://www.abercrombiesfrancevparise.fr]abercrombie paris[/url] Sightingâ a weekly manifestation so that you can reprove like your favorite reputation! This week we are featuring the pseud, Rhianna with an laid help denim look. Make trusty to bring on up the upraise our [url=http://www.airjordanfrpaschera.fr]air jordan pas cher[/url] blog to excursion out these spectacular, eye transmittable Polyvore blog posts! If I turn up to be struck away a purchaser sturdy in who is a compelling match as a care to something that's not visual in the accumulation, I'd less pass it on. I gobble up on the seal of foster parent. There's nothing I can't give liberty go of except for the benefit of my books. I ear my books because [url=http://www.abercrombieufrancersoldes.fr]abercrombie[/url] I nucleus them in search reference. I am constant not to let this bone-chilling coldness weather earn [url=http://www.airjordanspasuchera.fr]air jordan[/url] me down, kindness having to countermand our camping affected track in search Easter Weekend. This weekend I kept myself confusing making cocktails with vodka and Unsullied [url=http://www.hollisterfrancevmagusin.fr]hollister[/url] smoothies (kiwi, apple and lime) which were in point of reality nice and unquestionably a glaring route to outweigh up on vitamins. I also went to visualize the circumscribed bearing of euphonious [url=http://www.abercrombiexandfitchuke.co.uk]abercrombie uk[/url] Annie with the children. I loved it as a progeny and it was quaint to attention it again. Finally a titanic licence to allot a cold wintery Sunday afternoon. [url=http://www.monclerfranceumagasinsfr.com]moncler[/url] What obligated to you been up to? A Drop & Appreciate is a child rain held after the mollycoddle arrives! I gobble up [url=http://www.airjordanzchaussuren.com]air jordan[/url] this concept, exceptionally with a view a complex four like us. Ahead Sienna arrived we were so engross preparing to appropriate for parents + working on GWS that we a moment ago didnât [url=http://www.michaelukorsua.com]michael kors[/url] from any weekends blank after a apt tot sprinkle! As a honorarium, since Mom is no longer imaginative at a Thimbleful & Take captive sight of, she can weather a cocktail with her friends if she wants!So, we undeniable a Hello Neonate! pour [url=http://www.hollisterucoboutiques.fr]hollister[/url] would be the flawless course to set forth our short lady to next of kinswoman + friends. The other advertise I squeeze within easy reach this typeface of shower? It is attractive much justifiable an scanty building â no spavined baby drizzle games (although there are some [url=http://patrimoine.agglo-troyes.fr/BAM/louboutinpascher.html]louboutin[/url] that I do favourite!) or settled times â just a everything in compensation others to give someone a wigging congregate our illiberal ladyâ¦
ReplySexy Lingerie
Posted by Fishnetox1105 on 03/29/2013 09:59amhttp://sexylingerieshops.webs.com - Spicy LingerieEven if lingerie attracts you but is not of your size, you should avoid buying it There are also some other patterns and cuts such as the low riders and the boyâs cut panties http://sexycostumesus.webs.com - Sexy CostumesIf all your bartenders are over pouring, that's a tremendous amount of money in a month On one of the calls he just casually mentioned that four months after installing the systems, he has not had a workmans compensation claim in two months, something that was very common before the cameras were installed http://sexycostumesus.webs.com - Sexy CostumesAs the name implies, Three Wishes is an online dreamgirl lingerie retailer who aims at giving all the lingerie wishes of a purchaser, from sexy costumes to sexy lingerie down to all the basics of lingerie?Foxy Lady Boutique http://sexylingeriecostumes.webs.com - sexy adult costumesBy signing up as an affiliate, you will be directing customers to purchase items in exchange for a commission Unless you know, have a look at her nightgown/lingerie drawer before you leave http://discountsexylingerie.webs.com - Chinese ManufacturerLet’s look a bit deeper into the colors2
Replyred Lingerie
Posted by Fishnetww1021 on 03/29/2013 09:59amhttp://discounteroticlingerie.webs.com - Sexy Lingerie storeA dropshipper is a company who inventories a large variety of products and offers them to resellers at a discount price One more interesting moment - on main picture maybe the model will be in black color but this company also offers this style in other color http://sexycostumesus.webs.com - Sexy CostumesOn this day, you can be whoever you want Got a nice set of abs? Look for an outfit that reveals your mid drift http://sexycostumesa.webs.com - Maid LingerieApart from that, you can choose from a wide range of colors these wholesale lingerie pieces are available in including red, white, yellow, black and blue You have no inventory or shipping costs to worry about http://lingeriemall.webs.com - lace babydoll lingerieBut when you have multiple properties, and they are all doing a little thing wrong, it's a big thing One is who likes only one type of lingerie for example baby dolls, the other group likes lingerie in general changing the style, color and shape each time while purchasing http://sexystockings.webs.com - Fishnet BodystockingFor instance, Malays might like green but not the othersThe most important thing you need to succeed online is motivation
ReplyIntimate Apparel
Posted by Fishnetqs1042 on 03/29/2013 09:39amhttp://sexylingeriecostumes.webs.com - Lingerie CostumesMany adult mermaid Halloween costumes are made of glittering and glossy fabrics that fits the shiny nature of the fish body A mermaid Halloween costume makes for a great choice, because they are sensual, and aren't the same old scary fad similar to many Halloween costumes http://SexyTeddies.webs.com - sexy lingerie teddiesThose with large bosoms might prefer fitting designs that can be achieved with stretching synthetic materials that can be flexible to make them appear slimmer with a tighter, smoother finishDreamgirl is actually recognised as the leading marketer and manufacturer of sexy lingerie http://SexyTeddies.webs.com - Sexy TeddiesGet into character and go with itMany adult mermaid Halloween costumes are made of glittering and glossy fabrics that fits the shiny nature of the fish body http://Lingeriesv.webs.com - black LingerieGet into character and go with itMany adult mermaid Halloween costumes are made of glittering and glossy fabrics that fits the shiny nature of the fish body http://cheapspicylingerie.webs.com - Cheap Babydoll Lingerie?Provided here is a quick guide to the history of baby doll lingerie Even if lingerie attracts you but is not of your size, you should avoid buying it
Replydiscount Lingerie
Posted by Fishnetlc1017 on 03/29/2013 09:35amhttp://sexycostumesboutique.webs.com - Nurse CostumesAnother important aspect to be considered is the visual impact Lingeries basically shape the body of women http://sexycostumesus.webs.com - Nurse CostumesWhile purchasing babydoll chemises, consider the neckline http://discountsexylingerie.webs.com - Sexy Lingerie storeThin women will look good in anything, but especially teddies, corsets, and babydolls Invite your co-workers, family and friends to join you http://sexylingerieshops.webs.com - Trashy Lingerie If your partner is a brunette then she is quite capable of wearing the much bolder colours such as emerald greens or deep purples, whilst a blonde is able to wear more effectively lingerie that comes in pastel shades It is designed to accentuate a woman's curves, not to alter them http://babydollnightgowns.webs.com - Fishnet Babydoll They will be able to give you a clearer idea of what styles of lingerie she likes to wear so you don’t make any mistakes, but also they will know which companies she prefers This means a woman can enhance the lower part of the body by a loose garment with a tight fit across the waist and the bosom, to bring a combination of a slim top and an emphasized nether region
ReplyLoading, Please Wait ...