I'm currently writing a C++ program that's rather math-involved. As such, I'm trying to denote some objects as having subscript numbers in a wstring member variable of their class. However, every attempt at storing these characters converts them to their non-subscript counterparts. By contrast, characters pasted directly into the code as narrow string literals are preserved as desired. Here are several cases I experimented with:
setlocale(LC_ALL, "");
wchar_t txt = L'\u2080';
wcout << txt << endl;
myfile << txt << endl;
This outputs "0" to both the file and console.
setlocale(LC_ALL, "");
wcout << L"x₀₁" << endl;
myfile << L"x₀₁" << endl;
This outputs "x01" to both the file and console.
setlocale(LC_ALL, "");
wcout << "x₀₁" << endl;
myfile << "x₀₁" << endl;
This outputs "xâ'?â'?" to the console, which I'd like to avoid if possible, and "x₀₁" to the file, which is what I want. An ideal program state would be one that properly outputs to both the file and the console, but if that’s not possible, then printing non-subscript characters to the console is preferable.
My code intends to convert ints into their corresponding subscripts. How do I manipulate these characters as smoothly as possible without them getting converted back? I suspect that character encoding plays a part, but I do not know how to incorporate Unicode encoding into my program.
I find these things tricky and I'm never sure if it works for everyone on every Windows version and locale, but this does the trick for me:
#include <Windows.h>
#include <io.h>     // _setmode
#include <fcntl.h>  // _O_U16TEXT
#include <clocale>  // std::setlocale
#include <iostream>

// Unicode UTF-16, little endian byte order (BMP of ISO 10646)
constexpr char CP_UTF_16LE[] = ".1200";

constexpr wchar_t superscript(int v) {
    constexpr wchar_t offset = 0x2070; // superscript zero as offset
    if (v == 1) return 0x00B9; // special case: U+00B9 is superscript one
    if (v == 2 || v == 3) return 0x00B0 + v; // special case 2: U+00B2, U+00B3
    return offset + v;
}

constexpr wchar_t subscript(int v) {
    constexpr wchar_t offset = 0x2080; // subscript zero as offset
    return offset + v;
}

int main() {
    // set these before doing any other output:
    std::setlocale(LC_ALL, CP_UTF_16LE);
    _setmode(_fileno(stdout), _O_U16TEXT);

    // subscript
    for (int i = 0; i < 10; ++i)
        std::wcout << L'X' << subscript(i) << L' ';
    std::wcout << L'\n';

    // superscript
    for (int i = 0; i < 10; ++i)
        std::wcout << L'X' << superscript(i) << L' ';
    std::wcout << L'\n';
}
Output:
X₀ X₁ X₂ X₃ X₄ X₅ X₆ X₇ X₈ X₉
X⁰ X¹ X² X³ X⁴ X⁵ X⁶ X⁷ X⁸ X⁹
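The subscript(int) helper above maps a single digit in the range 0 to 9. Since the question asks about converting whole ints, here is a minimal sketch of one way to extend the idea to multi-digit and negative values. The name to_subscript is hypothetical and not part of the program above; it only builds the wstring and does no output, so the console setup shown earlier is still needed to print the result.

```cpp
#include <string>

// Hypothetical helper (an assumption, not from the program above):
// converts a whole int to a wstring of subscript characters.
std::wstring to_subscript(int value) {
    // Start from the ordinary decimal text, e.g. L"-12".
    std::wstring text = std::to_wstring(value);
    for (wchar_t& ch : text) {
        if (ch >= L'0' && ch <= L'9')
            // Subscript digits occupy U+2080..U+2089 contiguously.
            ch = static_cast<wchar_t>(0x2080 + (ch - L'0'));
        else if (ch == L'-')
            // U+208B is SUBSCRIPT MINUS.
            ch = static_cast<wchar_t>(0x208B);
    }
    return text;
}
```

With the console configured as above, something like std::wcout << L'x' << to_subscript(10) should then show x followed by a subscript 1 and 0.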
A more convenient way may be to create wstrings directly. Here wsup and wsub each take a wstring and return a converted wstring. Characters they can't handle are left unchanged.
#include <Windows.h>
#include <io.h>      // _setmode
#include <fcntl.h>   // _O_U16TEXT
#include <algorithm> // std::transform
#include <clocale>   // std::setlocale
#include <iostream>
#include <string>    // std::wstring

// Unicode UTF-16, little endian byte order (BMP of ISO 10646)
constexpr char CP_UTF_16LE[] = ".1200";

std::wstring wsup(const std::wstring& in) {
    std::wstring rv = in;
    std::transform(rv.begin(), rv.end(), rv.begin(),
        [](wchar_t ch) -> wchar_t {
            // 1, 2 and 3 can be put in any order you like
            // as long as you keep them in the top section
            if (ch == L'1') return 0x00B9;
            if (ch == L'2') return 0x00B2;
            if (ch == L'3') return 0x00B3;
            // ...but this must be here in the middle:
            if (ch >= L'0' && ch <= L'9') return 0x2070 + (ch - L'0');
            // put the below in any order you like,
            // in the bottom section
            if (ch == L'i') return 0x2071;
            if (ch == L'+') return 0x207A;
            if (ch == L'-') return 0x207B;
            if (ch == L'=') return 0x207C;
            if (ch == L'(') return 0x207D;
            if (ch == L')') return 0x207E;
            if (ch == L'n') return 0x207F;
            return ch; // no change
        });
    return rv;
}

std::wstring wsub(const std::wstring& in) {
    std::wstring rv = in;
    std::transform(rv.begin(), rv.end(), rv.begin(),
        [](wchar_t ch) -> wchar_t {
            if (ch >= L'0' && ch <= L'9') return 0x2080 + (ch - L'0');
            if (ch == L'+') return 0x208A;
            if (ch == L'-') return 0x208B;
            if (ch == L'=') return 0x208C;
            if (ch == L'(') return 0x208D;
            if (ch == L')') return 0x208E;
            if (ch == L'a') return 0x2090;
            if (ch == L'e') return 0x2091;
            if (ch == L'o') return 0x2092;
            if (ch == L'x') return 0x2093;
            if (ch == 0x0259) return 0x2094; // small letter schwa: ə
            if (ch == L'h') return 0x2095;
            // k, l, m, n map to the contiguous range U+2096..U+2099
            if (ch >= L'k' && ch <= L'n') return 0x2096 + (ch - L'k');
            if (ch == L'p') return 0x209A;
            if (ch == L's') return 0x209B;
            if (ch == L't') return 0x209C;
            return ch; // no change
        });
    return rv;
}

int main() {
    // set these before doing any other output:
    std::setlocale(LC_ALL, CP_UTF_16LE);
    if (_setmode(_fileno(stdout), _O_U16TEXT) == -1) return 1;

    auto pstr = wsup(L"0123456789 +-=() ni");
    auto bstr = wsub(L"0123456789 +-=() aeoxə hklmnpst");
    std::wcout << L"superscript: " << pstr << L'\n';
    std::wcout << L"subscript: " << bstr << L'\n';
    std::wcout << L"an expression: x" << wsup(L"(n-1)") << L'\n';
}
Output:
superscript: ⁰¹²³⁴⁵⁶⁷⁸⁹ ⁺⁻⁼⁽⁾ ⁿⁱ
subscript: ₀₁₂₃₄₅₆₇₈₉ ₊₋₌₍₎ ₐₑₒₓₔ ₕₖₗₘₙₚₛₜ
an expression: x⁽ⁿ⁻¹⁾
My console didn't manage to display the subscript versions of hklmnpst, but apparently the transformation was correct because they show up fine here after copy/pasting.
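The programs above only cover the console, while the question also asks about writing to a file. One possible approach, offered here as a sketch rather than as part of the tested answer, is to encode the wide string as UTF-8 by hand and write the bytes through an ordinary narrow std::ofstream opened in binary mode. The names to_utf8 and write_utf8_line are hypothetical, and the code assumes each wchar_t holds a complete code point: on Windows, where wchar_t is 16 bits, characters outside the BMP (surrogate pairs) would not be encoded correctly, which does not affect the subscript and superscript characters used here.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (an assumption, not from the answer above):
// encodes each wchar_t as one UTF-8 sequence. Surrogate pairs are
// NOT combined, so this is only correct when every wchar_t is a
// whole code point.
std::string to_utf8(const std::wstring& in) {
    std::string out;
    for (wchar_t wc : in) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (cp < 0x80) {                       // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes: rest of the BMP
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes (32-bit wchar_t only)
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: appends one UTF-8 encoded line to the file at
// `path`. Binary mode keeps the stream from altering the bytes.
bool write_utf8_line(const std::string& path, const std::wstring& text) {
    std::ofstream file(path, std::ios::binary | std::ios::app);
    if (!file) return false;
    file << to_utf8(text) << '\n';
    return static_cast<bool>(file);
}
```

A call such as write_utf8_line("out.txt", wsub(L"x01")) would then append the converted text as UTF-8 bytes, which most editors should display as subscripts when the file is opened as UTF-8.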