CodeGuru Forums -
    Managed C++ and C++/CLI Discuss Managed C++ and .NET-specific questions related to C++.

      #1    
    Old July 19th, 2005, 12:50 PM
    dy13
    Culture SENSITIVE regex split?

    Hi, I'm trying to split a database of English and Chinese sentences into arrays of individual words using Regex.Split. The problem is, English words get separated by spaces while Chinese words don't. This gets even more confusing when Chinese and English words exist in the same sentence.

    Is there a way for Regex to automatically detect the language and perform the proper splits accordingly? Thanks so much!
      #2    
    Old July 19th, 2005, 04:58 PM
    dy13
    Re: Culture SENSITIVE regex split

    Took me a while but I figured it out! For anyone else interested in multilanguage splits, here's my way:

    Use Match instead of Split
    E.g.

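    Regex* rg = new Regex(S"[A-Za-z]+|\\w");
    // Match() returns only the first token; step through the rest with NextMatch().
    for (Match* match = rg->Match(yourString); match->Success; match = match->NextMatch())
        Console::WriteLine(match->Value);

    The [A-Za-z]+ part matches a whole English word. Use [A-Za-z] rather than [A-z]: the latter also matches [ \ ] ^ _ and `, which sit between Z and a in ASCII. The \w is only tried when the first part fails, so it picks up the remaining word characters one at a time (Chinese, Japanese, etc.). That gives one match per Chinese character, not one per dictionary word. You can also use [A-Za-z0-9]+ to keep attached numbers with the word; no | is needed inside the brackets, since there it is just a literal character.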

    Working so far...
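
    For anyone who wants to try the same idea outside .NET, here is a minimal sketch in portable C++. It does not use Regex at all: it is a hand-written scan that imitates the pattern [A-Za-z0-9]+|\w on UTF-32 text. The names tokenize and isAsciiAlnum are made up for this example, and the rule "every non-ASCII code point is a one-character token" is a simplifying assumption (it would also emit full-width punctuation such as 。, which \w would not match).

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): true for the characters
// covered by the class [A-Za-z0-9].
static bool isAsciiAlnum(char32_t c) {
    return (c >= U'0' && c <= U'9') ||
           (c >= U'A' && c <= U'Z') ||
           (c >= U'a' && c <= U'z');
}

// Hypothetical function: imitates matching [A-Za-z0-9]+|\w repeatedly.
// ASCII letters and digits are grouped into runs; every non-ASCII code
// point becomes its own token; ASCII spaces and punctuation are skipped.
std::vector<std::u32string> tokenize(const std::u32string& text) {
    std::vector<std::u32string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        if (isAsciiAlnum(text[i])) {
            // First alternative: take the longest run of letters/digits.
            std::size_t start = i;
            while (i < text.size() && isAsciiAlnum(text[i])) {
                ++i;
            }
            tokens.push_back(text.substr(start, i - start));
        } else if (text[i] >= 0x80) {
            // Assumption: treat any non-ASCII code point as a word character.
            tokens.emplace_back(1, text[i]);
            ++i;
        } else {
            // ASCII separator (space, punctuation): not part of any token.
            ++i;
        }
    }
    return tokens;
}
```

    Calling tokenize(U"I love 北京 2008") yields the tokens I, love, 北, 京 and 2008. As with the regex, Chinese text comes out one character at a time; grouping characters into real words would need a dictionary-based segmenter, which is beyond this sketch.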