I am having trouble getting QFileInfo to work with UTF-8 paths.
I am on Ubuntu 20.04.
While std::filesystem has no problem with UTF-8 file names (German ones, in this case), QFileInfo does not seem to use UTF-8, even though the Qt documentation says the default encoding is Unicode (https://doc.qt.io/qt-5/qtextcodec.html).
EDIT:
Before the example with the file, here is a simpler example.
This example shows that QString is not the issue, but rather the settings that influence the Qt I/O:
QString temp {"Höhe.txt"};
qDebug()<<"Qt temp: "<<temp;
std::cout<<"Qt through std: "<<temp.toStdString()<<std::endl;
std::string str = temp.toStdString();
std::cout<<"std: "<<str<<std::endl;
results:
Qt temp: "Hhe.txt"
Qt through std: Höhe.txt
std: Höhe.txt
So qDebug() drops the 'ö', while QString::toStdString() delivers the full string correctly.
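One way to double-check that the string data itself is intact, independent of how a terminal or qDebug() renders it, is to look at the raw bytes. The following is only a minimal sketch in plain C++ (no Qt); `hexDump` is a hypothetical helper name, not part of any library. In UTF-8, 'ö' is encoded as the two bytes C3 B6, so those should show up in the dump of `temp.toStdString()`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Qt or the standard library):
// renders every byte of s as two uppercase hex digits, separated by spaces.
std::string hexDump(const std::string& s)
{
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        if (!out.empty()) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Calling `hexDump(temp.toStdString())` and printing the result would show whether the C3 B6 pair is present. If it is, the QString holds the right data and only the output path is dropping the character.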
Here is a distilled code example.
In all cases below, std::filesystem finds the file, but Qt does not see it.
The qDebug() output is always missing the 'ö': it is simply 'Hhe.txt'.
NOTE: my real code does not use string literals. The string literal below is only used to keep the example simple.
#include <QFileInfo>
#include <QtDebug>
#include <QTextCodec>
#include <filesystem>
#include <iostream>   // needed for std::cout

int main(int argc, char **argv)
{
    std::filesystem::path p{"Höhe.txt"};

    // Alternatives tried, all with the same result:
    //QFileInfo f(p.c_str());
    //QFileInfo f(std::filesystem::u8path(p.c_str()).c_str());
    //QFileInfo f(QString::fromUtf8(std::filesystem::u8path(p.c_str()).c_str()));
    QByteArray encodedString = "Höhe.txt";
    QTextCodec *codec = QTextCodec::codecForName("UTF-8");
    //QString file = codec->toUnicode(encodedString);
    QString file = QString::fromUtf8(encodedString);
    QFileInfo f(file);

    // std::filesystem finds the file ...
    if (!std::filesystem::exists(p)) {
        return 1;
    }
    // ... but QFileInfo does not.
    if (!f.exists()) {
        qDebug() << f.filePath(); // outputs 'Hhe.txt' for all cases
        return 1;
    }
    std::cout << "found" << std::endl;
    return 0;
}
Can anyone help me get QFileInfo to see files with Unicode characters in their names?
Many thanks in advance!
Some additional information (as per questions in the comments):
~$ locale
LANG=C
LANGUAGE=en:el
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
In general, however, my system has no issue showing German text, be it in the console or in UIs.
and
$ ls Höhe.txt | od -t c
0000000 c h i n e s e . e x t \n c z e c
0000020 h . e x t \n d u t c h . e x t \n
0000040 e n g l i s h _ u k . e x t \n f
0000060 i n n i s h . e x t \n f r e n c
0000100 h . e x t \n g e r m a n . e x t
0000120 \n g r e e k . e x t \n i t a l i
0000140 a n . e x t \n j a p a n e s e .
0000160 e x t \n p o l i s h . e x t \n p
0000200 o r t u g u e s e . e x t \n s p
0000220 a n i s h . e x t \n s w e d i s
0000240 h . e x t \n t u r k i s h . e x
0000260 t \n
0000262
And:
main.cpp: C source, UTF-8 Unicode text
The comments from n.m. were on the right track.
The issue is not in the code, as I originally thought, but in the locale configuration of my system.
The output of:
QTextCodec::codecForLocale()->name().toStdString();
is 'System'.
I am not sure what 'System' actually resolves to, though.
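To get a hint about what the environment locale resolves to, one option is to bypass Qt and ask the C library directly. This is only a sketch under assumptions: it requires a POSIX system (for `<langinfo.h>`), `environmentCodeset` and `isUtf8Codeset` are hypothetical helper names, and whether Qt's 'System' codec follows exactly this value is an assumption, not something confirmed here.

```cpp
#include <cctype>
#include <clocale>
#include <string>
#include <langinfo.h>   // POSIX: nl_langinfo, CODESET

// Hypothetical helper: adopts the locale from the environment
// (LC_ALL, LC_CTYPE, LANG) and returns the character encoding name
// reported by the C library for it.
std::string environmentCodeset()
{
    // If the environment names an unavailable locale, this call fails
    // and the default "C" locale stays active.
    std::setlocale(LC_ALL, "");
    const char* name = nl_langinfo(CODESET);
    return name ? std::string(name) : std::string();
}

// Hypothetical helper: true for spellings such as "UTF-8", "utf8", "UTF8".
bool isUtf8Codeset(const std::string& name)
{
    std::string normalized;
    for (unsigned char c : name) {
        if (std::isalnum(c)) {
            normalized += static_cast<char>(std::toupper(c));
        }
    }
    return normalized == "UTF8";
}
```

Printing `environmentCodeset()` from the same shell or IDE that launches the Qt program shows what that process environment yields. On glibc, a process that ends up in the plain "C" locale typically reports ANSI_X3.4-1968 (ASCII) rather than UTF-8, which would be consistent with non-ASCII characters being dropped.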
Setting Qt's locale codec to UTF-8 explicitly:
QTextCodec::setCodecForLocale(QTextCodec::codecForName("UTF-8"));
made the code work correctly on my system as well.
This means that the issue is in the locale configuration of my system.
So I have to figure out what is wrong with my locale, but that is a different question.
Many thanks to all of you who helped!!