URI Encoding and Decoding

Introduction

This article presents two functions, UriEncode() and UriDecode(), for URI encoding and decoding. Both take a std::string as the argument and return a std::string.

A URI is represented as a sequence of characters, not as a sequence of octets. That is because URI might be "transported" by means that are not through a computer network, e.g., printed on paper, read over the radio, etc. (RFC 2396)

UriEncode() maps octets to characters, such as:

"\0\1\2" -> "%00%01%02"
"~ABCD"  -> "%7EABCD"

Every octet other than an alphanumeric character is converted to '%' followed by two hexadecimal digits ("% HEX HEX"). UriDecode() converts these sequences back to the original octets.
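
The "% HEX HEX" mapping for a single octet can be sketched as follows. This is only an illustration: EscapeOctet is a hypothetical helper name that does not appear in the article's code, and it shows nothing more than how the high and low four bits of an octet select the two hexadecimal digits.

```cpp
#include <string>

// Hypothetical helper (not part of the article's code): format one octet
// as '%' followed by two uppercase hexadecimal digits.
std::string EscapeOctet(unsigned char c)
{
    static const char DEC2HEX[] = "0123456789ABCDEF";
    std::string out("%");
    out += DEC2HEX[c >> 4];    // high four bits select the first digit
    out += DEC2HEX[c & 0x0F];  // low four bits select the second digit
    return out;
}
```

For instance, '~' has the value 0x7E, so its high bits select '7' and its low bits select 'E'.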

Code Snippets

Encode:

std::string UriEncode(const std::string & sSrc)
{
   // SAFE is a 256-entry lookup table that is not shown in this snippet;
   // SAFE[c] is nonzero for octets that are copied unescaped (alphanumerics).
   const char DEC2HEX[16 + 1] = "0123456789ABCDEF";
   const unsigned char * pSrc = (const unsigned char *)sSrc.c_str();
   const std::string::size_type SRC_LEN = sSrc.length();
   // worst case: every octet expands to three characters
   unsigned char * const pStart = new unsigned char[SRC_LEN * 3];
   unsigned char * pEnd = pStart;
   const unsigned char * const SRC_END = pSrc + SRC_LEN;

   for (; pSrc < SRC_END; ++pSrc)
   {
      if (SAFE[*pSrc])
         *pEnd++ = *pSrc;
      else
      {
         // escape this char
         *pEnd++ = '%';
         *pEnd++ = DEC2HEX[*pSrc >> 4];
         *pEnd++ = DEC2HEX[*pSrc & 0x0F];
      }
   }

   std::string sResult((char *)pStart, (char *)pEnd);
   delete [] pStart;
   return sResult;
}
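
The snippet above relies on a SAFE table that is not shown. A minimal sketch of one way to build such a table is given below, assuming C++11 and a character set in which the digits and letters are contiguous (as in ASCII). Its contents follow the rule stated in the introduction (only alphanumerics pass through unescaped); the table actually shipped with the article may be defined differently.

```cpp
#include <array>

// Stand-in for the article's SAFE table (an assumption, not the original):
// true for octets copied unchanged, false for octets that must be escaped.
static const std::array<bool, 256> SAFE = [] {
    std::array<bool, 256> t{};  // value-initialized: every entry starts false
    for (int c = '0'; c <= '9'; ++c) t[c] = true;
    for (int c = 'A'; c <= 'Z'; ++c) t[c] = true;
    for (int c = 'a'; c <= 'z'; ++c) t[c] = true;
    return t;
}();
```

Indexing with an unsigned char, as UriEncode() does with SAFE[*pSrc], keeps every lookup inside the 256 entries.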

Decode:

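std::string UriDecode(const std::string & sSrc)
{
   // Note from RFC1630: "Sequences which start with a percent
   // sign but are not followed by two hexadecimal characters
   // (0-9, A-F) are reserved for future extension"

   // HEX2DEC is a 256-entry lookup table that is not shown in this snippet;
   // it maps a hexadecimal digit to its value and any other octet to -1.
   const unsigned char * pSrc = (const unsigned char *)sSrc.c_str();
   const std::string::size_type SRC_LEN = sSrc.length();
   const unsigned char * const SRC_END = pSrc + SRC_LEN;
   // one past the last decodable '%'; guarded so that inputs shorter
   // than two characters never form a pointer before the buffer
   const unsigned char * const SRC_LAST_DEC = (SRC_LEN >= 2) ? SRC_END - 2 : pSrc;

   // decoding never lengthens the string, so SRC_LEN bytes are enough
   char * const pStart = new char[SRC_LEN];
   char * pEnd = pStart;

   while (pSrc < SRC_LAST_DEC)
   {
      if (*pSrc == '%')
      {
         // signed char so that the -1 test also works where plain char is unsigned
         signed char dec1, dec2;
         if (-1 != (dec1 = HEX2DEC[*(pSrc + 1)])
            && -1 != (dec2 = HEX2DEC[*(pSrc + 2)]))
         {
            *pEnd++ = (char)((dec1 << 4) + dec2);
            pSrc += 3;
            continue;
         }
      }

      // not a valid escape sequence: copy the character unchanged
      *pEnd++ = *pSrc++;
   }

   // the last 2- chars
   while (pSrc < SRC_END)
      *pEnd++ = *pSrc++;

   std::string sResult(pStart, pEnd);
   delete [] pStart;
   return sResult;
}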
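
UriDecode() depends on a HEX2DEC table that is not part of the snippet. The following is a sketch of what such a table could look like, assuming C++11. Accepting lowercase 'a' to 'f' is an assumption made here; the code comment above only mentions 0-9 and A-F, and the table distributed with the article may differ.

```cpp
#include <array>

// Stand-in for the article's HEX2DEC table (an assumption, not the original):
// the value of a hexadecimal digit, or -1 for any other octet.
static const std::array<signed char, 256> HEX2DEC = [] {
    std::array<signed char, 256> t;
    t.fill(-1);
    for (int i = 0; i < 10; ++i)
        t['0' + i] = static_cast<signed char>(i);
    for (int i = 0; i < 6; ++i) {
        t['A' + i] = static_cast<signed char>(10 + i);
        t['a' + i] = static_cast<signed char>(10 + i);  // lowercase: assumed
    }
    return t;
}();
```

With this table, the two digits of "%7E" look up as 7 and 14, and (7 << 4) + 14 gives 0x7E, the octet for '~'.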

Usage Example

To use the functions, copy the source code into your project, or declare them extern and link against the file that defines them.

#include <cassert>
#include <string>

int main()
{
   extern std::string UriEncode(const std::string & sSrc);
   extern std::string UriDecode(const std::string & sSrc);
   // the explicit length keeps the embedded '\0' in the string
   const std::string ORG("\0\1\2", 3);
   const std::string ENC("%00%01%02");
   assert(UriEncode(ORG) == ENC);
   assert(UriDecode(ENC) == ORG);
   return 0;
}

Differences from Other Implementations

Other implementations of URI/URL encoding and decoding exist. This implementation differs from them in that it:

  • Has a decode function.
  • Encodes an arbitrary char buffer, including any '\0' octets. Example: "ABC\0ABC" -> "ABC%00ABC"
  • Runs faster because it uses an array lookup to do the mapping.
  • Is portable. It doesn't use MFC CString.
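
The point about '\0' depends on how the input std::string is built. The sketch below uses two hypothetical helper functions (not part of the article's code) to show the difference between constructing from a C string and constructing from a buffer with an explicit length; only the second keeps the embedded '\0' so that an encoder can see it.

```cpp
#include <string>

// Built from a C string: construction stops at the first '\0'.
std::string FromCString()
{
    return std::string("ABC\0ABC");
}

// Built with an explicit length: all 7 octets, including '\0', are kept.
std::string FromBuffer()
{
    return std::string("ABC\0ABC", 7);
}
```

The first string holds only "ABC", while the second holds all seven octets and is the form to pass when the data may contain '\0'.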



Comments

  • Programmer

    Posted by Tim on 02/25/2014 09:14am

    Note to other users: The HEX2DEC array used in the decoding function is in the .cpp in the download file.

  • Or we could encode like this

    Posted by Anonymous on 12/02/2012 07:25pm

    99% of the time the hex codes will be double digit in reality std::string URLEscape(char*url) { std::ostringstream s; for (;*url;s

  • License

    Posted by Cem Kalyoncu on 05/26/2012 10:50am

    Pretty good, is it possible to specify the license? MIT/BSD/PD/Apache licenses would be most useful.
