How to extract a single character (as a string) from a larger string in Ruby?

Tags:

ruby

What is the idiomatic Ruby way to retrieve a single character from a string as a one-character string? There is the str[n] method, of course, but (as of Ruby 1.8) it returns a character code as a Fixnum, not a string. How do you get a single-character string?

Thiago Arrais asked Dec 16 '08 13:12


3 Answers

In Ruby 1.9, it's easy. Strings are encoding-aware sequences of characters, so you can just index into one and get a single-character string back:

'µsec'[0] # => 'µ'

However, in Ruby 1.8, Strings are sequences of bytes and thus completely unaware of the encoding. If you index into a string and that string uses a multibyte encoding, you risk indexing right into the middle of a multibyte character (in this example, the 'µ' is encoded in UTF-8):

'µsec'[0] # => 194
'µsec'[0].chr # => Garbage
'µsec'[0,1] # => Garbage
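
The byte-level behaviour described above can still be reproduced on Ruby 1.9.3 or later by asking for bytes explicitly. This is only a minimal sketch, assuming the source file is saved as UTF-8; the variable names are illustrative and not from the original answer:

```ruby
str = 'µsec'

# Assumption: the literal above is UTF-8, so 'µ' occupies two bytes (0xC2 0xB5).
first_byte = str.getbyte(0)       # integer value of the first byte
half_char  = str.byteslice(0, 1)  # one byte cut out of a two-byte character

puts first_byte                   # 194
puts half_char.valid_encoding?    # false: the slice is not a complete character
puts str[0]                       # µ (indexing is character-based on 1.9+)
```

The one-byte slice is what Ruby 1.8 would hand back for 'µsec'[0,1], which is why it shows up as garbage there.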

That said, Regexps and some specialized string methods support at least a small subset of popular encodings, among them some Japanese encodings (e.g. Shift-JIS) and (in this example) UTF-8:

'µsec'.split('')[0] # => 'µ'
'µsec'.split(//u)[0] # => 'µ'
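
On Ruby 1.9 or later, the same character-wise access is available without relying on regexp encoding flags. A small sketch, again assuming a UTF-8 string; it is one possible set of equivalents, not the only way to do it:

```ruby
str = 'µsec'

puts str.chars.first        # µ
puts str.each_char.first    # µ
puts str[/./]               # µ (first match of a one-character pattern)
puts str.split('').length   # 4 (characters, not the 5 underlying bytes)
```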
Jörg W Mittag answered Sep 28 '22 09:09


Before Ruby 1.9:

'Hello'[1].chr  # => "e"

Ruby 1.9+:

'Hello'[1]  # => "e"

A lot has changed in Ruby 1.9, including string semantics.

Robert Gamble answered Sep 28 '22 09:09


This should work in Ruby both before and after 1.9:

'Hello'[2,1]  # => "l"

Please see Jörg Mittag's comment: this is correct only for single-byte character sets.
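
The start/length form can be wrapped in a small helper if the same code has to run on several Ruby versions. This is a hedged sketch: char_at is a hypothetical name, not a built-in method, and on Ruby 1.8 it remains byte-based, so the single-byte caveat above still applies there:

```ruby
# Hypothetical helper (not part of Ruby's String API).
# Returns a one-character string taken from position n.
def char_at(str, n)
  str[n, 1]  # the start/length form returns a String on 1.8 and 1.9+ alike
end

puts char_at('Hello', 2)   # l
puts char_at('µsec', 0)    # µ on Ruby 1.9+, where indexing is character-based
p    char_at('Hello', 10)  # nil, because the start index is past the end
```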

Brent.Longborough answered Sep 28 '22 07:09