      #1    
    September 26th, 2002, 12:03 PM
    Bob H
    Code page in WideCharToMultiByte

    I am using _getmbcp() to get the code page that I pass to WideCharToMultiByte. By the way, this is a _UNICODE build.

    I am trying to convert Kanji Unicode text to multibyte. Somewhere along the line I am not getting the correct characters, but the problem could be elsewhere.

    Is this the correct code page for this situation?

    Last edited by Bob H; September 26th, 2002 at 12:06 PM.
      #2
    September 26th, 2002, 02:21 PM
    Bob H
    _getmbcp returns the current multibyte code page.

    Another possibly related issue:
    I thought that all Unicode strings could be translated into a multibyte string of 2 bytes per character using WideCharToMultiByte. But in one of my books on Unicode I see a table showing the number of bytes needed to encode UTF-8 characters, and the number goes up to 4. Does anyone know if Kanji uses more than 2 bytes?
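On the byte-count question: the 4-byte figure in that table applies to UTF-8 (code page 65001, CP_UTF8), which is a different target encoding from the Japanese ANSI code page 932, where each character takes 1 or 2 bytes. A minimal sketch of the UTF-8 rule in portable C++ follows; `utf8_length` is a hypothetical helper name, not a Windows API.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes UTF-8 needs for one Unicode code point.
// Returns 0 for values that are not valid scalar values
// (surrogates U+D800..U+DFFF, or anything above U+10FFFF).
std::size_t utf8_length(std::uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp > 0x10FFFF) return 0;
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;     // e.g. accented Latin, Greek, Cyrillic
    if (cp < 0x10000) return 3;   // rest of the Basic Multilingual Plane
    return 4;                     // supplementary planes
}
```

Under this rule a kanji such as U+6F22 needs 3 bytes in UTF-8, and only characters above U+FFFF need 4.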
      #3
    September 27th, 2002, 05:14 AM
    Bob H
    I read your reply to another Unicode question on the forum, which was helpful.

    I have created a test program for my contact in Japan to run on Windows 2000.

    It displays the text he enters in one CEdit box, and in another CEdit box it displays the length of the CString holding that text. The simple test program is compiled for Unicode.

    By the way, the code runs great with MS Mincho (with the code page set to Japan) on my XP machine, but fails when it runs on Windows XP and Windows 2000 computers in Japan.

    So I presume that if the number of characters entered equals the length of the string, we are in UTF-16 mode. If we are not, then I am in deep trouble. My software assumes one 2-byte TCHAR per character. The code uses macros like _istlead and _tcsinc.

    Is the UTF value set by the font or the operating system and is there a way to test for it and/or set it?
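A _UNICODE build stores text as 16-bit wchar_t units on Windows, so the string length counts code units rather than characters. Characters in the Basic Multilingual Plane, which covers the commonly used kanji, take one unit each; characters above U+FFFF take a surrogate pair. The sketch below, assuming the text is well-formed UTF-16, counts code points so the two lengths can be compared. `count_code_points` is a hypothetical helper, written in portable C++ with std::u16string standing in for a CString.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-16 string.
// A high surrogate (0xD800..0xDBFF) followed by a low surrogate
// (0xDC00..0xDFFF) is one code point stored in two 16-bit units.
std::size_t count_code_points(const std::u16string& s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t unit = s[i];
        const bool isHigh = (unit >= 0xD800 && unit <= 0xDBFF);
        if (isHigh && i + 1 < s.size()) {
            const char16_t next = s[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                ++i;  // skip the low half of the pair
            }
        }
        ++count;
    }
    return count;
}
```

If `count_code_points(s)` equals `s.size()`, the string contains no surrogate pairs and the one-TCHAR-per-character assumption holds for that text.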
      #4
    September 27th, 2002, 08:15 AM
    waqarahmad
    See GetFontUnicodeRanges( ) in MSDN.
    __________________
    Waqar
      #5
    September 27th, 2002, 09:44 AM
    Yves M
    Hum...
    Quote:
    _getmbcp returns the current multibyte code page. A return value of 0 indicates that a single byte code page is in use.
    That is not exactly what you should use: it returns 0 for any single-byte code page, and that code page might not be Latin-1 as it is for English.

    Another possibility is to use the Windows API calls which work fine for me.
    Code:
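    // ANSI build: GetLocaleInfo resolves to GetLocaleInfoA and fills a char buffer.
    UINT LangIDToCodePage(long lLangID)
    {
     char codepage[7];
     int Res;
    
     // Zero the buffer so the result is always NUL-terminated.
     memset(codepage, 0, sizeof(codepage));
     // Ask for the locale's default ANSI code page, returned as a decimal string.
     Res = GetLocaleInfo(lLangID, LOCALE_IDEFAULTANSICODEPAGE, codepage, 6);
     if (Res != 0) {
      return atoi(codepage);
     } else {
      return CP_ACP; // fall back to the system default ANSI code page
     }
    }
    ...
    // On startup do (for me in OnCreate):
    m_InputCP = LangIDToCodePage(LOWORD(GetKeyboardLayout(0)));
    // In your message handler do:
     case WM_INPUTLANGCHANGE :
      // The low word of lParam is the language ID of the new input locale.
      m_InputCP = LangIDToCodePage(LOWORD(lParam));
      bHandled = TRUE;
      break;
    // When you convert from Unicode to MultiByte, use m_InputCP as the code page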

    Last edited by Yves M; September 27th, 2002 at 10:15 AM.
      #6
    September 28th, 2002, 07:41 AM
    Bob H
    I can't imagine that a single byte code page would be the situation since the problem is occurring on Win 2000 computers in Japan. But, I will create a test dialog which displays the value of your routine and _getmbcp().
      #7
    September 28th, 2002, 08:36 AM
    Yves M
    True, it would not be related to your problem with Japanese, but in Russian, Greek, Arabic, Hebrew, etc. things wouldn't work.

    Oh yes, by the way you will have to rewrite my LangIDToCodePage function for Unicode since you compile your app for Unicode.

    Last edited by Yves M; September 28th, 2002 at 08:39 AM.
      #8
    September 28th, 2002, 04:31 PM
    Bob H
    The code also serves ANSI purposes (English, German, etc.) and Win 9x computers, so I need to go the TCHAR/_MBCS route. There will be a separate _MBCS build for 9x computers, which by the way works correctly on Japanese computers. It is the Unicode version that has problems, and they are probably due to my mapping between text characters and text glyphs.

    I don't have sufficient resources to maintain a different code base for this Unicode Japanese application. Also, I don't want to rewrite all the MFC controls which, I believe, use CString.

    Evidence so far is that there is one TCHAR per text character.

    I have a bastardized GetGlyphIndex function which was inherited from the _MBCS world (and works for that world). I need to try the true unicode call for this function and I think my problem may be solved.
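For the _MBCS build, the equivalent question is how many bytes each character takes. In code page 932 a character is one byte, or two bytes when the first is a lead byte (0x81 to 0x9F or 0xE0 to 0xFC), which is the kind of test that _istlead performs. A rough sketch in portable C++ follows, with hypothetical helper names and the lead-byte ranges hard-coded for code page 932 only.

```cpp
#include <cstddef>
#include <string>

// True if b is a lead byte in Windows code page 932 (Shift-JIS).
// Assumption: ranges are hard-coded for this one code page.
bool is_cp932_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Counts characters in a code page 932 byte string.
// A lead byte and the byte after it are counted as one character.
std::size_t count_cp932_chars(const std::string& bytes)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (is_cp932_lead_byte(b) && i + 1 < bytes.size()) {
            ++i;  // skip the trail byte
        }
        ++count;
    }
    return count;
}
```

Here the byte length and the character count differ whenever a double-byte character is present, which is why the _MBCS code path has to step through text one character at a time rather than one byte at a time.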
      #9
    October 12th, 2002, 06:22 PM
    Bob H
    Since the last posting I have figured out my problems and learned some things.

    First, the LangIDToCodePage code returns the same value as _getmbcp().

    Second, in my _MBCS build I was passing what, I believe, are called character codes to GetGlyphOutline. This does not work in general for a _UNICODE build. When I used glyph indices (which for ASCII codes < 127 differ from the character codes by 29) and set the GGO_GLYPH_INDEX flag in GetGlyphOutline, my problem went away. I used GetCharacterPlacement to get the glyph indices.
      #10
    October 12th, 2002, 08:07 PM
    Yves M
    Quote:
    Originally posted by Bob H
    First, the LangIDToCodePage code returns the same value as _getmbcp().
    Does that mean that _getmbcp also works correctly when you switch code pages while the program is executing? For example, when a Japanese user needs to insert some characters in English, Russian, or whatever, and switches keyboard locales while your program is running?
      #11
    October 12th, 2002, 08:43 PM
    Bob H
    I am fairly sure that English can be entered from a Japanese keyboard. I get a lot of emails from Japan in English.
      #12
    October 13th, 2002, 10:27 AM
    Yves M
    Well, I can also enter Japanese on my Spanish or my Swiss keyboards when I switch input locales.