
In Pharo/Smalltalk: How to read a file with a specific encoding?

I am currently reading a file like this:

dir := FileSystem disk workingDirectory.
stream := (dir / 'test.txt' ) readStream.
line := stream nextLine.

This works when the file is UTF-8 encoded, but I could not find out how to read a file that uses a different encoding.

Michael asked Mar 26 '18 13:03


2 Answers

For Pharo 7, there is a guide to file streams that proposes:

('test.txt' asFileReference)
    readStreamEncoded: 'cp-1250' do: [ :stream |
        stream upToEnd ].
Fuhrmanator answered Nov 06 '22 10:11


The classes ZnCharacterReadStream and ZnCharacterWriteStream let you work with character streams in encodings other than UTF-8 (which is the default). First, the file stream needs to be converted into a binary stream. That binary stream can then be wrapped in a ZnCharacterReadStream or ZnCharacterWriteStream with the desired encoding. Here is a full example that writes a file and then reads it back:

dir := FileSystem disk workingDirectory.

(dir / 'test.txt') writeStreamDo: [ :out |
  encoded := ZnCharacterWriteStream on: (out binary) encoding: 'cp1252'.
  encoded nextPutAll: 'Über?'.
].

content := '?'.
(dir / 'test.txt') readStreamDo: [ :in |
  decoded := ZnCharacterReadStream on: (in binary) encoding: 'cp1252'.
  content := decoded nextLine.
].
content. " -> should evaluate to 'Über?'"
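
On Pharo 7 and later, a file reference can also hand out a binary stream directly, which avoids the separate `binary` conversion step. The following is only a sketch of the same round trip under that assumption: it presumes that `binaryWriteStreamDo:`, `binaryReadStreamDo:` and `ensureDelete` are available in your image, and the variable name `file` is just for illustration.

```smalltalk
| file content |
file := FileSystem disk workingDirectory / 'test.txt'.

"Start from a clean file so no bytes from an older, longer version remain."
file ensureDelete.

"Write: open a binary stream and wrap it in an encoding character stream."
file binaryWriteStreamDo: [ :out |
    (ZnCharacterWriteStream on: out encoding: 'cp1252')
        nextPutAll: 'Über?' ].

"Read: open a binary stream and wrap it in a decoding character stream."
file binaryReadStreamDo: [ :in |
    content := (ZnCharacterReadStream on: in encoding: 'cp1252') upToEnd ].

content
```

If these selectors are missing in your Pharo version, the example above with `binary` is the one to use.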

For more details, the book "Enterprise Pharo: A Web Perspective" has a chapter about character encoding.

Michael answered Nov 06 '22 12:11
