Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to tell if text on the windows clipboard is ISO 8859 or UTF-8 in C++?

I would like to know if there is an easy way to detect if the text on the clipboard is in ISO 8859 or UTF-8 ?

Here is my current code:

    COleDataObject  obj;

    if (obj.AttachClipboard())
    {
        if (obj.IsDataAvailable(CF_TEXT))
        {
            HGLOBAL hmem = obj.GetGlobalData(CF_TEXT);
            CMemFile sf((BYTE*) ::GlobalLock(hmem),(UINT) ::GlobalSize(hmem));
            CString buffer;

            LPSTR str = buffer.GetBufferSetLength((int)::GlobalSize(hmem));
            sf.Read(str,(UINT) ::GlobalSize(hmem));
            ::GlobalUnlock(hmem);

            //this is my string class
            s->SetEncoding(ENCODING_8BIT);
            s->SetString(buffer);
        }
    }
}
like image 479
KPexEA Avatar asked Oct 03 '08 03:10

KPexEA


People also ask

Is ISO 8859 the same as UTF-8?

UTF-8 is a multibyte encoding that can represent any Unicode character. ISO 8859-1 is a single-byte encoding that can represent the first 256 Unicode characters. Both encode ASCII exactly the same way.

Are C strings UTF-8?

Within an identifier, you would also want to allow characters >= 0x80, which is the range of UTF-8 continuation bytes. Most C string library routines still work with UTF-8, since they only scan for terminating NUL characters.

How do I know if I have UTF-8 or UTF-16?

For your specific use-case, it's very easy to tell. Just scan the file, if you find any NULL ("\0"), it must be UTF-16. JavaScript got to have ASCII chars and they are represented by a leading 0 in UTF-16.

Is UTF-8 a code page?

UTF-8 is the universal code page for internationalization and is able to encode the entire Unicode character set. It is used pervasively on the web, and is the default for *nix-based platforms.


1 Answers

Check out the definition of CF_LOCALE at this Microsoft page. It tells you the locale of the text in the clipboard. Better yet, if you use CF_UNICODETEXT instead, Windows will convert to UTF-16 for you.

like image 75
Mark Ransom Avatar answered Oct 16 '22 15:10

Mark Ransom