URL Encoding
Introduction
The purpose of this article is to design a C++ class that performs URL encoding. The motivation was a previous project in which I needed to post data from a VC++ 6.0 application, and that data had to be URL encoded. I searched MSDN for a class or API that returns the URL-encoded value of a given input string, but I did not find one. So, I had to come up with my own URLEncode C++ class.
URLEncoder.exe is an MFC dialog-based application that uses the URLEncode class.
Process
URL encoding is a special process that makes sure that all the characters are "safe" to transmit across the Internet. Some characters have special meaning to various programs involved in sending the data across the Internet.
For example, a carriage return has an ASCII value of 13. Programs involved in sending your form data may interpret it as the end of a line of data.
Web applications typically transfer data between the client and server by using the HTTP or HTTPS protocols. There are basically two ways in which a server receives input from a client:
- Data can be passed in the HTTP request itself (in headers such as cookies, or in the body of a posted form), or
- It can be included in the query portion of the requested URL.
When data is included in a URL, it must be specially encoded to conform to proper URL syntax. On the Web server side, the data is automatically decoded. Consider the following URL, where data is posted as a query string parameter.
Example: http://WebSite/ResourceName?Data=Data
Where WebSite is the host name of the Web site,
ResourceName is the name of the ASP page or servlet, and
Data is the value to be posted to the Web server. It must be encoded if the Content-Type is application/x-www-form-urlencoded.
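The decoding step that the server performs automatically can be sketched as follows. This is a minimal illustration in standard C++, not code from any particular Web server; urlDecode and hexValue are hypothetical names. It assumes form-style encoding, where "+" stands for a space, and it leaves malformed "%" sequences untouched.

```cpp
#include <string>

// Hypothetical helper: value of one hexadecimal digit, or -1 if invalid.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical sketch of server-side decoding: "%XX" becomes the byte
// with hexadecimal code XX, and '+' becomes a space (form encoding).
std::string urlDecode(const std::string& encoded)
{
    std::string decoded;
    for (std::size_t i = 0; i < encoded.size(); ++i)
    {
        char c = encoded[i];
        if (c == '%' && i + 2 < encoded.size())
        {
            int hi = hexValue(encoded[i + 1]);
            int lo = hexValue(encoded[i + 2]);
            if (hi >= 0 && lo >= 0)
            {
                decoded += static_cast<char>(hi * 16 + lo);
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        // Not a valid escape: copy the character, mapping '+' to space
        decoded += (c == '+') ? ' ' : c;
    }
    return decoded;
}
```

For example, urlDecode("a%20b") yields "a b".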
RFC 1738
The RFC 1738 specification defining Uniform Resource Locators (URLs) restricts the characters allowed in a URL to a subset of the US-ASCII character set. This poses a limitation because HTML, on the other hand, allows the entire range of the ISO-8859-1 (ISO-Latin) character set to be used in documents. As a result, when HTML data is uploaded through a form post (or as part of a query string), all of it must be encoded.
ISO-8859-1 (ISO-Latin) Character Set
The ISO-8859-1 table contains the complete ISO-8859-1 (ISO-Latin) character set, which has 256 entries. The table provides each character's ISO-8859-1 position (its decimal code), description, entity number, hexadecimal value, and HTML result. Broadly, the range can be divided into safe and unsafe characters as follows.
| Character range (decimal) | Type | Values | Safe/Unsafe |
| 0-31 | ASCII Control Characters | These characters are not printable | Unsafe |
| 32-47 | Reserved Characters | space !"#$%&'()*+,-./ | Unsafe |
| 48-57 | ASCII Digits | 0-9 | Safe |
| 58-64 | Reserved Characters | :;<=>?@ | Unsafe |
| 65-90 | ASCII Characters | A-Z | Safe |
| 91-96 | Reserved Characters | [\]^_` | Unsafe |
| 97-122 | ASCII Characters | a-z | Safe |
| 123-126 | Reserved Characters | {|}~ | Unsafe |
| 127 | Control Character | DEL (not printable) | Unsafe |
| 128-255 | Non-ASCII Characters | Characters outside the US-ASCII set | Unsafe |
All the unsafe ASCII characters must be encoded; for example, those in the ranges 32-47, 58-64, 91-96, and 123-126.
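The range split in the table can be expressed as a small predicate. This is a sketch in standard C++ that follows the table literally; isInSafeRange is a hypothetical name, and the check is stricter than the class presented later in this article, which also lets some punctuation pass through unencoded.

```cpp
// Returns true when the decimal code falls in one of the ranges the
// table marks as Safe: 48-57 (0-9), 65-90 (A-Z), or 97-122 (a-z).
// Every other code (0-47, 58-64, 91-96, 123-255) is treated as unsafe.
bool isInSafeRange(unsigned char code)
{
    return (code >= 48 && code <= 57) ||
           (code >= 65 && code <= 90) ||
           (code >= 97 && code <= 122);
}
```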
Below is the table that describes why these characters are not safe.
| Character | Unsafe Reason | Character Encode |
| < | Delimiters around URLs in free text | %3C |
| > | Delimiters around URLs in free text | %3E |
| " | Delimits URLs in some systems | %22 |
| # | It is used in the World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. | %23 |
| { | Gateways and other transport agents are known to sometimes modify such characters | %7B |
| } | Gateways and other transport agents are known to sometimes modify such characters | %7D |
| | | Gateways and other transport agents are known to sometimes modify such characters | %7C |
| \ | Gateways and other transport agents are known to sometimes modify such characters | %5C |
| ^ | Gateways and other transport agents are known to sometimes modify such characters | %5E |
| ~ | Gateways and other transport agents are known to sometimes modify such characters | %7E |
| [ | Gateways and other transport agents are known to sometimes modify such characters | %5B |
| ] | Gateways and other transport agents are known to sometimes modify such characters | %5D |
| ` | Gateways and other transport agents are known to sometimes modify such characters | %60 |
| + | Indicates a space in form-encoded data (spaces cannot be used in a URL) | %2B |
| / | Separates directories and subdirectories | %2F |
| ? | Separates the actual URL and the parameters | %3F |
| & | Separator between parameters specified in the URL | %26 |
How It Is Done
URL encoding of a character is done by taking the character's 8-bit code, writing it as two hexadecimal digits, and prefixing it with a percent sign ("%"). For example, the US-ASCII character set represents a space with decimal code 32, or hexadecimal 20. Thus, its URL-encoded representation is %20.
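The rule for a single character can be sketched in standard C++ as shown here. percentEncodeByte is a hypothetical helper name and is not part of the class described in this article; it splits the byte into its high and low four bits and looks each one up in a table of hexadecimal digits.

```cpp
#include <string>

// Percent-encode one byte as '%' followed by two uppercase hex digits.
// percentEncodeByte is a hypothetical name used only for illustration.
std::string percentEncodeByte(unsigned char byte)
{
    static const char kHexDigits[] = "0123456789ABCDEF";
    std::string out = "%";
    out += kHexDigits[byte >> 4];    // high four bits
    out += kHexDigits[byte & 0x0F];  // low four bits
    return out;
}
```

Taking an unsigned char avoids the negative values that a plain char can hold for codes 128-255 on compilers where char is signed.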
URLEncode: CURLEncode is a C++ class that performs URL encoding for a given string of data. The CURLEncode class has the following member functions.
- isUnsafe
- decToHex
- convert
- URLEncode
The URLEncode() method does the encoding process. It checks each character in the string to see whether it is safe or unsafe (isUnsafe). If the character is unsafe, it is replaced with "%" followed by its hexadecimal value (convert); the result is appended to the output string.
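The overall loop can be sketched with std::string instead of CString, which makes it usable outside MFC. This is an illustrative port that assumes a C++11 compiler, not the class's actual source; the name urlEncode and the contents of kUnsafeChars are assumptions made for the example.

```cpp
#include <string>

// Hypothetical std::string version of the encoding loop. Both the name
// urlEncode and the kUnsafeChars list are assumptions for illustration.
std::string urlEncode(const std::string& data)
{
    static const char kHexDigits[] = "0123456789ABCDEF";
    // Assumed list of printable characters that must still be escaped
    static const std::string kUnsafeChars = "\"<>%\\^[]`+$,@:;/!#?=&";

    std::string encoded;
    for (unsigned char c : data)
    {
        // Unsafe: explicitly listed, or outside the codes 33-122
        bool unsafe =
            kUnsafeChars.find(static_cast<char>(c)) != std::string::npos ||
            c <= 32 || c >= 123;
        if (unsafe)
        {
            encoded += '%';
            encoded += kHexDigits[c >> 4];
            encoded += kHexDigits[c & 0x0F];
        }
        else
        {
            encoded += static_cast<char>(c);
        }
    }
    return encoded;
}
```

With these assumptions, urlEncode("x=1&y=2") returns "x%3D1%26y%3D2", and a string made only of letters and digits is returned unchanged.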
Code Snippet
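class CURLEncode
{
private:
    // Characters that must always be encoded
    static CString csUnsafeString;
    // Converts a character code to a string of digits in the given radix
    CString decToHex(char num, int radix);
    // Returns true if the character must be encoded
    bool isUnsafe(char compareChar);
    // Returns "%" followed by the character's hexadecimal code
    CString convert(char val);
public:
    CURLEncode() { }
    virtual ~CURLEncode() { }
    // Returns the URL-encoded form of vData
    CString URLEncode(CString vData);
};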
bool CURLEncode::isUnsafe(char compareChar)
{
    // Look for the character in the list of explicitly unsafe characters
    bool bcharfound = false;
    int m_strLen = csUnsafeString.GetLength();
    for (int ichar_pos = 0; ichar_pos < m_strLen; ichar_pos++)
    {
        if (csUnsafeString.GetAt(ichar_pos) == compareChar)
        {
            bcharfound = true;
            break;
        }
    }
    // A character is safe only if it is not in the unsafe list and its
    // code lies between 33 and 122; control characters, the space,
    // codes above 122, and non-ASCII values are all treated as unsafe
    int char_ascii_value = (int) compareChar;
    if (bcharfound == false && char_ascii_value > 32 &&
        char_ascii_value < 123)
    {
        return false;
    }
    return true;
}
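CString CURLEncode::decToHex(char num, int radix)
{
    // hexVals is assumed to be a file-scope lookup table of digit
    // characters, for example "0123456789ABCDEF"
    CString csTmp;
    int num_char = (int) num;
    // char may be signed; map negative values to 128-255
    if (num_char < 0)
        num_char = 256 + num_char;
    // Collect the digits, least significant first
    while (num_char >= radix)
    {
        int temp = num_char % radix;
        num_char = num_char / radix;
        csTmp += hexVals[temp];
    }
    csTmp += hexVals[num_char];
    // Pad to two digits; the pad is appended because the string is
    // reversed below
    if (csTmp.GetLength() < 2)
    {
        csTmp += '0';
    }
    // Reverse the string so the most significant digit comes first
    csTmp.MakeReverse();
    return csTmp;
}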
CString CURLEncode::convert(char val)
{
    CString csRet;
    csRet += "%";
    csRet += decToHex(val, 16);
    return csRet;
}

References
URL Encoding: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm.
RFC 1866: The HTML 2.0 specification (plain text). The appendix contains the Character Entity table: http://www.rfc-editor.org/rfc/rfc1866.txt.
The Web version of the HTML 2.0 (RFC 1866) Character Entity table: http://www.w3.org/MarkUp/html-spec/html-spec_13.html.
The HTML 3.2 (Wilbur) recommendation [This includes all character entities listed in HTML 2.0, plus new named entities covering the ISO 8859-1 120-191 range.]: http://www.w3.org/MarkUp/Wilbur/.
The HTML 4.0 Recommendation [Includes new Unicode character entities]: http://www.w3.org/TR/REC-html40/.
The W3C HTML Internationalization area: http://www.w3.org/International/O-HTML.html.
