Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String comparisons. How can you compare string with std::wstring? WRT strcmp

Tags:

c++

string

I am trying to compare two formats that I expected would be somewhat compatible, since they are both generally strings. I have tried to perform strcmp with a string and std::wstring, and as I'm sure C++ gurus know, this will simply not compile. Is it possible to compare these two types? Is there an easy conversion here?

like image 271
Mark Avatar asked Oct 07 '09 01:10

Mark


1 Answers

You need to convert your char* string - "multibyte" in ISO C parlance - to a wchar_t* string - "wide character" in ISO C parlance. The standard function that does that is called mbstowcs ("Multi-Byte String To Wide Character String")

NOTE: as Steve pointed out in comments, this is a C99 function and thus is not ISO C++ conformant, but may be supported by C++ implementations as an extension. MSVC and g++ both support it.

It is used thus:

const char* input = ...;

std::size_t output_size = std::mbstowcs(NULL, input, 0); // get length
std::vector<wchar_t> output_buffer(output_size);

// output_size is guaranteed to be >0 because of \0 at end
std::mbstowcs(&output_buffer[0], input, output_size);

std::wstring output(&output_buffer[0]);

Once you have two wstrings, just compare as usual. Note that this will use the current system locale for conversion (i.e. on Windows this will be the current "ANSI" codepage) - normally this is just what you want, but occasionally you'll need to deal with a specific encoding, in which case the above won't do, and you'll need to use something like iconv.

EDIT

All other answers seem to go for direct codepoint translation (i.e. the equivalent of (wchar_t)c for every char c in the string). This may not work for all locales, but it will work if e.g. your char are all ASCII or Latin-1, and your wchar_t are Unicode. If you're sure that's what you really want, the fastest way is actually to avoid conversion altogether, and to use std::lexicographical_compare:

#include <algorithm>

const char* s = ...;
std::wstring ws = ...;

const char* s_end = s + strlen(s);

bool is_ws_less_than_s = std::lexicographical_compare(ws.begin, ws.end(),
                                                      s, s_end());
bool is_s_less_than_ws = std::lexicographical_compare(s, s_end(),
                                                      ws.begin(), ws.end());
bool is_s_equal_to_ws = !is_ws_less_than_s && !is_s_less_than_ws;

If you specifically need to test for equality, use std::equal with a length check:

#include <algorithm>

const char* s = ...;
std::wstring ws = ...;

std::size_t s_len = strlen(s);
bool are_equal =
    ws.length() == s_len &&
    std::equal(ws.begin(), ws.end(), s);
like image 128
Pavel Minaev Avatar answered Nov 03 '22 13:11

Pavel Minaev