Easy Unicode

Introduction

I was given a very simple task at work: take a dialog box that acts as a registration page with a few registration fields (edit boxes), and convert it to support Unicode. This means that the entered data arrives as wide characters. That data must then be processed and (in my case) sent to a server. Easy, isn't it?

I looked it up, studied the concept of Unicode, and... NADA! You can easily change the character set of your project to Unicode, but who is crazy enough to do that? You will waste a lot of time altering existing code to fit a Unicode character set, and you might even end up rewriting all of it. I could not find an easy way of supporting Unicode without doing that. All the solutions I could find contained pages of useless or irrelevant code.

After some research, I came up with a simple and elegant solution that I share here, and I hope it might help someone's project or save some people a very large headache. This article includes code samples and a complete downloadable project (Unicode.zip) that you can compile and play with. In addition, the compiled sample application (Unicode_app.zip) is downloadable as well.

My Solution

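Use a Rich Edit-Box Control (CRichEditCtrl) instead of an Edit-Box Control (CEdit). Please note that, to use a rich edit control, you must call AfxInitRichEdit() when your application initializes. This function loads the rich edit control library (see the AfxInitRichEdit MSDN documentation regarding rich edit box versions). The version matters here: the EM_GETTEXTEX and EM_SETTEXTEX messages used below are not part of Rich Edit 1.0, so if your MFC version provides AfxInitRichEdit2(), which loads Rich Edit 2.0 or later, check whether you need to call that instead. Once the control is in place, use SendMessageW to send your text to or from the rich edit box.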

What SendMessageW Is

#ifdef UNICODE
#define SendMessage SendMessageW
#else
#define SendMessage SendMessageA
#endif    //!UNICODE

SendMessageW is the Unicode version of SendMessage. If you compile your code with a Unicode character set, the name SendMessage resolves to SendMessageW; otherwise, it resolves to SendMessageA. Here you want the Unicode version without defining UNICODE for the whole project, so you call SendMessageW directly. How do you use SendMessageW? The same way you use SendMessage: the parameter types (HWND, UINT, WPARAM, LPARAM) do not change, but any text that a message passes through those parameters is now treated as wide-character (Unicode) text.
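The macro switch is easier to see in a toy model. The sketch below is not Windows code: CountCharsA, CountCharsW, and the CountChars macro are hypothetical stand-ins for SendMessageA, SendMessageW, and SendMessage, and it assumes the file is compiled without UNICODE defined, as in an ANSI project.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for an A/W pair such as SendMessageA/SendMessageW.
// Each returns the number of characters in its string argument.
std::size_t CountCharsA(const char* text)    { return std::strlen(text); }
std::size_t CountCharsW(const wchar_t* text) { return std::wcslen(text); }

// The same kind of switch the Windows headers use:
// the generic name is only a macro for one of the two functions.
#ifdef UNICODE
#define CountChars CountCharsW
#else
#define CountChars CountCharsA
#endif

// In an ANSI build the generic name resolves to the narrow version.
std::size_t NarrowLength() { return CountChars("abc"); }

// The W function can still be called by name, bypassing the macro.
// The literal is a four-letter Hebrew word written with escapes.
std::size_t WideLength() { return CountCharsW(L"\u05E9\u05DC\u05D5\u05DD"); }
```

Both functions compile side by side in the same ANSI build, which is the property the rest of this article relies on.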

How to Use It

The following example (taken from Unicode.exe) shows how to handle Unicode text in a non-Unicode (ANSI) application:

void CUnicodeDlg::OnCopy()
{
   GETTEXTEX wParamIN;
   SETTEXTEX wParamOUT;
   LRESULT   lResult;
   WCHAR     lParam[100]  = {0};

   // cb is the size of the receiving buffer in bytes, including the
   // terminating null. Using sizeof(lParam) keeps the control from
   // writing past the end of the buffer when the text is longer.
   wParamIN.cb            = sizeof(lParam);
   wParamIN.flags         = GT_RAWTEXT;
   wParamIN.codepage      = 1200;    // 1200 = Unicode (UTF-16)
   wParamIN.lpDefaultChar = NULL;    // not used with code page 1200
   wParamIN.lpUsedDefChar = NULL;    // not used with code page 1200
   wParamOUT.codepage     = 1200;
   wParamOUT.flags        = ST_SELECTION;    // replace the selection

   // Read the text of the source control as wide characters.
   lResult = SendMessageW(this->m_rich1.m_hWnd,    // source control
                          EM_GETTEXTEX,            // message ID
                          (WPARAM) &wParamIN,      // GETTEXTEX*
                          (LPARAM) lParam);        // receiving buffer

   MessageBoxW(this->m_hWnd, lParam, L"Here we go...",
               MB_OK|MB_ICONASTERISK);

   // Write the same text into the destination control.
   lResult = SendMessageW(this->m_rich2.m_hWnd,    // destination control
                          EM_SETTEXTEX,            // message ID
                          (WPARAM) &wParamOUT,     // SETTEXTEX*
                          (LPARAM) lParam);        // null-terminated text
}

How to Use Unicode.exe

My sample application (Unicode.exe) demonstrates entering a Unicode string into a rich edit box of a non-Unicode application and copying it to another rich edit box. You can use the Windows Character Map (Start->Programs->Accessories->System Tools) to select and copy Unicode symbols.

Paste the copied symbols into the Input rich edit box of Unicode.exe.

Click Copy. A message box with the input string is shown. Click OK.

The entered Unicode string is copied to the Output rich edit box.

Summary

In my sample application, I used lParam as a wide-character buffer to hold the Unicode string. You can keep using it to pass its content forward, but always remember that it contains UTF-16 data: each WCHAR takes two bytes instead of one, and a symbol outside the Basic Multilingual Plane takes two WCHARs (four bytes). Any buffer size or byte count you compute has to take that into account.
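If the server expects UTF-8 rather than raw wide characters (an assumption; the article does not say which encoding the server uses), the buffer has to be converted before it is sent. In a Win32 application the usual tool is WideCharToMultiByte with CP_UTF8. The portable sketch below only illustrates what such a conversion does and why the byte counts differ; Utf16ToUtf8 is a hypothetical helper, not part of the sample project.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encodes a UTF-16 string as UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }
        if (cp < 0x80) {                       // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A plain Latin letter occupies two bytes in the wide buffer but one byte after conversion, while a symbol stored as a surrogate pair occupies four bytes in both encodings. Sizing the outgoing buffer from the character count alone is therefore not reliable.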

I hope I could help. Enjoy.



About the Author

Lior Peretz

I'm a developer at Aladdin Ltd., in Software DRM R&D.
