I've been working with the Ruby chr and ord methods recently, and there are a few things I don't understand.
My current project involves converting individual characters to and from ordinal values. As I understand it, if I have a single-character string like "A" and call ord on it, I get its position in the ASCII table, which is 65. Calling the inverse, 65.chr, gives me the character "A". This tells me that Ruby has an ordered collection of character values somewhere, and it can use this collection to give me the position of a specific character, or the character at a specific position. I may be wrong on this; please correct me if I am.
Now I also understand that Ruby's default character encoding is UTF-8, so it can work with thousands of possible characters. Thus if I ask it for something like this:
'好'.ord
I get the position of that character, which is 22909. However, if I call chr on that value:
22909.chr
I get "RangeError: 22909 out of char range." I'm only able to get chr to work on values up to 255, which is extended ASCII. So my question is:
Why does Ruby seem to take chr from the extended ASCII character set but ord from UTF-8?
According to the Integer#chr documentation, you can pass an encoding argument to force UTF-8:
22909.chr(Encoding::UTF_8)
#=> "好"
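To make the asymmetry concrete, here is a minimal sketch of what the two methods do. It assumes Encoding.default_internal is nil (the usual default); the variable names are only for illustration. ord reads the code point according to the string's own encoding, while chr without an argument only produces single-byte strings, so it needs an explicit encoding to act as the inverse of ord.

```ruby
# ord returns the code point according to the receiver's encoding.
s = "\u597D"                      # the character 好, a UTF-8 string
code_point = s.ord                # 22909

# With no argument, chr only handles single bytes
# (assumption: Encoding.default_internal is nil).
ascii  = 65.chr                   # "A", tagged US-ASCII
binary = 230.chr                  # "\xE6", tagged ASCII-8BIT

# Passing the encoding makes chr the inverse of ord.
round_trip = code_point.chr(Encoding::UTF_8)

# Array#pack with "U" is another way to build a UTF-8 character.
packed = [code_point].pack("U")

puts ascii.encoding               # US-ASCII
puts binary.encoding              # ASCII-8BIT
puts round_trip == s              # true
puts packed == s                  # true
```

So the two methods are not consulting different tables; chr simply does not know which encoding you want unless you tell it.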
To list all available encoding names:
Encoding.name_list
#=> ["ASCII-8BIT", "UTF-8", "US-ASCII", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "UTF-16", "UTF-32", ...]
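As a small sketch of how those names can be used, the snippet below checks that a name is known, looks up the matching Encoding object, and passes the name straight to chr. The variable names are hypothetical, and passing a name string to chr relies on Ruby converting it to an Encoding for you.

```ruby
name = "UTF-8"

# Check that the name appears in the list of known encodings.
known = Encoding.name_list.include?(name)

# Look up the Encoding object for that name.
enc = Encoding.find(name)

# chr accepts either the Encoding object or its name.
from_object = 22909.chr(enc)
from_name   = 22909.chr(name)

puts known                       # true
puts enc == Encoding::UTF_8      # true
puts from_object == from_name    # true
```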
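A hacky way to count how many code points are valid characters in UTF-8:
2000000.times.reduce(0) do |x, i|
  begin
    # chr raises RangeError for surrogates and for values above 0x10FFFF
    i.chr(Encoding::UTF_8)
    x += 1
  rescue RangeError
    # not a valid UTF-8 code point; leave the count unchanged
  end
  x
end
#=> 1112064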
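The figure 1112064 can also be derived without looping. The sketch below assumes the standard Unicode layout: code points run from 0 to 0x10FFFF, and the surrogate range 0xD800..0xDFFF is excluded because those values are not valid characters in UTF-8. The helper utf8_char? is a hypothetical name used only to spot-check the boundaries.

```ruby
max_code_point = 0x10FFFF
surrogates = 0xD800..0xDFFF

# Total code points minus the 2048 surrogates.
valid_count = (max_code_point + 1) - surrogates.size

# Hypothetical helper: true if i is accepted by chr as a UTF-8 code point.
def utf8_char?(i)
  i.chr(Encoding::UTF_8)
  true
rescue RangeError
  false
end

puts valid_count             # 1112064
puts utf8_char?(0xD7FF)      # true  (just below the surrogates)
puts utf8_char?(0xD800)      # false (first surrogate)
puts utf8_char?(0x10FFFF)    # true  (last code point)
puts utf8_char?(0x110000)    # false (past the end)
```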