I need a library that can URL-encode a string/char array.
I can already hex-encode an ASCII array as shown here: http://www.codeguru.com/cpp/cpp/cpp_mfc/article.php/c4029
But I need something that works with Unicode. Note: it has to work on Linux AND on Windows!
CURL has a quite nice:
char *encodedURL = curl_easy_escape(handle,WEBPAGE_URL, strlen(WEBPAGE_URL));
But first, that requires cURL, and second, it does not look Unicode-capable, judging by the strlen call.
If I read the question correctly and you want to do this yourself, without using curl, I think I have a solution (assuming UTF-8). I think this is a conformant and portable way of URL-encoding query strings:
#include <boost/function_output_iterator.hpp>
#include <boost/bind.hpp>
#include <algorithm>
#include <functional>
#include <sstream>
#include <iostream>
#include <iterator>
#include <iomanip>
#include <string>
#include <cctype>
namespace {
    std::string encimpl(std::string::value_type v) {
        // Cast first: passing a negative char (any UTF-8 byte >= 0x80 where
        // char is signed) to isalnum is undefined behaviour
        const unsigned char c = static_cast<unsigned char>(v);
        if (std::isalnum(c))
            return std::string(1, v);
        std::ostringstream enc;
        enc << '%' << std::setw(2) << std::setfill('0') << std::hex << std::uppercase << int(c);
        return enc.str();
    }
}
std::string urlencode(const std::string& url) {
    // Find the start of the query string
    const std::string::const_iterator start = std::find(url.begin(), url.end(), '?');
    // If there isn't one there's nothing to do!
    if (start == url.end())
        return url;
    // Store the modified query string
    std::string qstr;
    std::transform(start+1, url.end(),
                   // Append the transform result to qstr
                   boost::make_function_output_iterator(boost::bind(static_cast<std::string& (std::string::*)(const std::string&)>(&std::string::append),&qstr,_1)),
                   encimpl);
    return std::string(url.begin(), start+1) + qstr;
}
It has no non-standard dependencies other than Boost, and if you don't like the Boost dependency it's not that hard to remove.
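To illustrate how the Boost dependency could be dropped, here is a minimal sketch that uses only the standard library. The names is_ascii_alnum, append_encoded and urlencode_noboost are made up for this example and are not part of the code above. It also assumes an explicit ASCII range check in place of isalnum, so the result does not depend on the current locale; apart from that it is meant to mirror what urlencode does: everything after the first '?' that is not an ASCII letter or digit is percent-encoded byte by byte.

```cpp
#include <string>

// Assumption of this sketch: an explicit ASCII check instead of isalnum,
// so the result does not depend on the current C locale.
static bool is_ascii_alnum(unsigned char u) {
    return (u >= '0' && u <= '9') ||
           (u >= 'A' && u <= 'Z') ||
           (u >= 'a' && u <= 'z');
}

// Append one byte to out, percent-encoded unless it is an ASCII letter or digit.
static void append_encoded(std::string& out, char c) {
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char u = static_cast<unsigned char>(c);
    if (is_ascii_alnum(u)) {
        out += c;
    } else {
        out += '%';
        out += hex[u >> 4];    // high nibble
        out += hex[u & 0x0F];  // low nibble
    }
}

// Hypothetical Boost-free counterpart of urlencode: the part up to and
// including the first '?' is copied unchanged, the rest is encoded.
std::string urlencode_noboost(const std::string& url) {
    const std::string::size_type q = url.find('?');
    if (q == std::string::npos)
        return url;  // no query string, nothing to do
    std::string out(url, 0, q + 1);
    for (std::string::size_type i = q + 1; i < url.size(); ++i)
        append_encoded(out, url[i]);
    return out;
}
```

Calling urlencode_noboost("http://google.com") should return the URL unchanged, since there is no '?' in it. Like the Boost version, this also encodes '=' and '&', so it is probably only suitable when the whole query string is a single value.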
I tested it using:
int main() {
    // The source file must be saved as UTF-8 for the last literal to hold UTF-8 bytes
    const char *testurls[] = {"http://foo.com/bar?abc<>de??90 210fg!\"$%",
                              "http://google.com",
                              "http://www.unicode.com/example?großpösna"};
    std::copy(testurls, &testurls[sizeof(testurls)/sizeof(*testurls)],
              std::ostream_iterator<std::string>(std::cout,"\n"));
    std::cout << "encode as: " << std::endl;
    // std::ptr_fun needs <functional>; it was removed in C++17
    std::transform(testurls, &testurls[sizeof(testurls)/sizeof(*testurls)],
                   std::ostream_iterator<std::string>(std::cout,"\n"),
                   std::ptr_fun(urlencode));
}
Which all seemed to work:
http://foo.com/bar?abc<>de??90 210fg!"$%
http://google.com
http://www.unicode.com/example?großpösna
Becomes:
http://foo.com/bar?abc%3C%3Ede%3F%3F90%20210fg%21%22%24%25
http://google.com
http://www.unicode.com/example?gro%C3%9Fp%C3%B6sna
Which squares with these examples
Consider converting your Unicode URL to UTF-8 first. UTF-8 represents every Unicode character as a sequence of one to four bytes, so once the URL is in UTF-8 you can percent-encode those bytes with whichever byte-oriented API you prefer.
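As a rough sketch of that two-step approach, the following converts a sequence of Unicode code points to UTF-8 and then percent-encodes the resulting bytes. It assumes C++11 (for std::u32string) and that the text is already available as code points. The names to_utf8 and percent_encode_bytes are invented for this example, and replacing invalid code points with U+FFFD is a choice of this sketch, not something the answer prescribes.

```cpp
#include <string>

// Encode Unicode code points as UTF-8 bytes. Surrogates and values above
// U+10FFFF are replaced by U+FFFD (an assumption of this sketch).
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            cp = 0xFFFD;
        if (cp < 0x80) {                 // 1 byte: plain ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {         // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {       // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                         // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Percent-encode every byte that is not an ASCII letter or digit.
std::string percent_encode_bytes(const std::string& bytes) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (char c : bytes) {
        const unsigned char u = static_cast<unsigned char>(c);
        const bool keep = (u >= '0' && u <= '9') ||
                          (u >= 'A' && u <= 'Z') ||
                          (u >= 'a' && u <= 'z');
        if (keep) {
            out += c;
        } else {
            out += '%';
            out += hex[u >> 4];
            out += hex[u & 0x0F];
        }
    }
    return out;
}
```

For example, percent_encode_bytes(to_utf8(U"gro\u00DFp\u00F6sna")) should give gro%C3%9Fp%C3%B6sna. On Windows, where wide strings are typically UTF-16, the surrogate pairs would have to be combined into code points before calling to_utf8.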