How to extract a unicode string with Boost.Python

Question

It seems that the code crashes when I call extract<const char*>("a unicode string").

Does anyone know how to solve this?

André Anjos · Accepted Answer

This compiles and works for me, with your example string and using Python 2.x:

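#include <boost/python.hpp>
#include <iostream>

void process_unicode(boost::python::object u) {
  using namespace boost::python;
  // Keep the encoded object alive: the char pointer refers to its buffer.
  object encoded = str(u).encode("utf-8");
  const char* value = extract<const char*>(encoded);
  std::cout << "The string value is '" << value << "'" << std::endl;
}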

You can write a specific from-python converter if you wish to auto-convert PyUnicode (in Python 2.x) to const wchar_t* or to a type from ICU (which seems to be the common recommendation for dealing with Unicode in C++).

The snippet above only handles ASCII text, because in Python 2.x str(u) encodes a unicode object with the default codec, which is normally ASCII. If you want full support for Unicode characters outside the ASCII range (for example, accented characters such as á, ç or ï), you will need to write the from-python converter. Note that this will have to be done separately for Python 2.x and 3.x, if you wish to support both. In Python 3.x, the separate unicode type was removed and the str type now works as unicode used to in Python 2.x. ~~Nothing that a couple of #if PY_VERSION_HEX >= 0x03000000 cannot handle~~.

[edit]

The above comment was wrong. Note that, since Python 3.x treats unicode strings as normal strings, boost::python will wrap that into boost::python::str objects. I have not verified how those are handled w.r.t. unicode translation in this case.

edvaldig · Answer

Have you tried

extract<std::string>("a unicode string").c_str()

or

extract<wchar_t*>(...)

Tags:

python

unicode

boost

boost-python

yelo
