I have noticed recently (possibly a change in recent Delphi versions) that if I load an ASCII text file into a TStringList, edit a line with file.lines[10]:='blah', and then save it again, the file is now encoded as UTF-8. I want it to always use ASCII.
I see you can pass a TEncoding parameter to TStringList.SaveToFile, but I have hundreds of these load/edit/saves throughout my program handling various edits.
Is there a way to set a one time global setting that then makes all TStringlist calls use ASCII only?
The main reason for this is that I am creating batch files with these edits, and when they are in UTF-8 format the Windows command line (cmd.exe) cannot execute them correctly. For example,
@echo off
comes back as
'@echo' is not recognized as an internal or external command, operable program or batch file.
Doing some more tests, if I check the "Beta: Use Unicode UTF-8 for worldwide language support" checkbox under the Windows regional advanced settings, the batch file display issue is gone (and TStringList saving as UTF-8 is not an issue). But I cannot ask all users to enable that checkbox to get correct display. If there were a simple global "always use ASCII for TStringList" setting, that would fix my issue, as I never need UTF/Unicode support for TStringLists. At least until that Beta setting becomes standard in Windows.
And to add even more confusion (at least to me), if I explicitly tell tstringlist the encoding tmp.savetofile('blah.bat', TEncoding.ASCII); it still shows as UTF-8 encoding in Notepad++ once the tstringlist is saved.
Is there a way to set a one time global setting that then makes all TStringlist calls use ASCII only?
Unfortunately, there is no such global setting. Encoding is handled on a per-TStringList, per-stream/file basis.
When you load a TStringList from a stream/file and no encoding is specified in the Encoding parameter, the encoding is auto-detected from the text data (i.e., a BOM is looked for). If no encoding is detected (i.e., no BOM is present), the TStringList.DefaultEncoding property is used. DefaultEncoding defaults to TEncoding.Default, which on Windows is ANSI (i.e., the user's locale), not UTF-8 (unless you set the locale to UTF-8).
The actual encoding used to load the data is stored in the TStringList.Encoding property.
When you save a TStringList to a stream/file, and do not specify an encoding in the Encoding parameter, then the TStringList.Encoding property is used if not nil (so the new stream/file can match the previously loaded stream/file), otherwise the TStringList.DefaultEncoding property is used.
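The load/save behaviour described above can be checked directly. The following is only a minimal sketch, assuming a recent Delphi version with the System.* unit names and a file named blah.bat (the name used in the question) in the current directory. The program name is hypothetical.

```delphi
program InspectEncoding; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  List: TStringList;
begin
  List := TStringList.Create;
  try
    // No Encoding argument: a BOM is looked for first, and if none
    // is found the DefaultEncoding property is used.
    List.LoadFromFile('blah.bat');

    // Encoding now holds whatever was actually used for loading.
    Writeln('Loaded as: ', List.Encoding.EncodingName);

    // Guarded edit so the sketch does not fail on an empty file.
    if List.Count > 0 then
      List[0] := '@echo off';

    // No Encoding argument: the file is written with List.Encoding,
    // so it matches the encoding of the file that was loaded.
    List.SaveToFile('blah.bat');
  finally
    List.Free;
  end;
end.
```

If this prints a UTF-8 encoding name for a file you believed was ASCII, the file most likely already contained a BOM, or the locale is set to UTF-8.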
So, to do what you want, set the TStringList.DefaultEncoding property to TEncoding.ASCII on each TStringList object. Loading and saving will then use ASCII unless a different encoding is specified or detected for a given file.
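Since there are hundreds of call sites, one possible way to avoid setting the property by hand each time is a small descendant class that sets it on construction. This is a sketch, not a built-in feature: the unit name AsciiStrings and the class name TAsciiStringList are hypothetical. It overrides AfterConstruction, which runs after any of the TStringList constructors.

```delphi
unit AsciiStrings; // hypothetical unit name

interface

uses
  System.SysUtils, System.Classes;

type
  // Hypothetical descendant: behaves like TStringList, but falls back
  // to ASCII instead of TEncoding.Default when no BOM is detected.
  TAsciiStringList = class(TStringList)
  public
    procedure AfterConstruction; override;
  end;

implementation

procedure TAsciiStringList.AfterConstruction;
begin
  inherited;
  // Used for loading when no BOM is found, and for saving when the
  // list has no Encoding from a previous load.
  DefaultEncoding := TEncoding.ASCII;
end;

end.
```

With this in place, each call site would create TAsciiStringList.Create instead of TStringList.Create. A file that starts with a BOM is still loaded, and later saved, with its detected encoding, as described above.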
And to add even more confusion (at least to me), if I explicitly tell tstringlist the encoding tmp.savetofile('blah.bat', TEncoding.ASCII); it still shows as UTF-8 encoding in Notepad++ once the tstringlist is saved.
ASCII is a subset of UTF-8, so a valid ASCII file will also be a valid UTF-8 file. But whether or not Notepad++ treats an ASCII file as UTF-8 depends on whether or not Notepad++ is configured to use UTF-8 as its default encoding.
That being said, if all of your source files are truly ASCII to begin with, then there's no reason for the default TEncoding.Default to not work, since ASCII is the base for almost every locale. The only way your saved files could have become UTF-8 with TEncoding.Default is if your locale is UTF-8. Even so, you can avoid the BOM appearing in your saved files by setting the TStringList.WriteBOM property to False.
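As a sketch of the WriteBOM suggestion, assuming the same blah.bat file name from the question and a hypothetical program name, the property can be cleared before saving so that no BOM is written whichever encoding ends up being used:

```delphi
program SaveWithoutBOM; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  tmp: TStringList;
begin
  tmp := TStringList.Create;
  try
    tmp.LoadFromFile('blah.bat');

    // Guarded edit so the sketch does not fail on an empty file.
    if tmp.Count > 0 then
      tmp[0] := '@echo off';

    // Set just before saving so the setting is certainly in effect
    // for this write: no BOM (preamble) bytes are emitted.
    tmp.WriteBOM := False;
    tmp.SaveToFile('blah.bat');
  finally
    tmp.Free;
  end;
end.
```

For a file that contains only ASCII characters, the output saved this way is byte-for-byte the same whether the encoding is ASCII, ANSI, or UTF-8.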