
Why does Ruby String#split not treat consecutive trailing delimiters as separate entities?

Tags: string, ruby

I'm reading from a government text file in which $ is used as the delimiter, but I don't think the delimiter character matters...

So this is expected:

'a$b$c$d'.split('$')
# => ["a", "b", "c", "d"]

In the data files I'm working with, the column header row (the first line) is uniformly filled in, i.e. there is never an empty header such as:

'a$b$$d'
# or: 
'a$b$c$'

However, each row may have consecutive trailing delimiters such as:

"w$x$$\r\n"

Usually, I read each line and chomp it. But this causes String#split to drop the empty columns after the final two delimiters:

"w$x$$\r\n".chomp.split('$')
# => ["w", "x"] 

Skipping the chomp keeps the empty column, though I then have to chomp the last element:

"w$x$$\r\n".split('$')
# => ["w", "x", "", "\r\n"]

So either I have to:

  • chomp the line if the final non-newline characters are NOT consecutive delimiters
  • preserve the newline, do the split, and then chomp the final element IF the final characters are consecutive delimiters

This seems really awkward...am I missing something here?

Zando asked Mar 07 '12 15:03


1 Answer

You need to pass a negative value as the second parameter (the limit) to split. This prevents it from suppressing trailing empty fields:

"w$x$$\r\n".chomp.split('$', -1)
# => ["w", "x", "", ""]

See the docs on split.
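
To make the effect of the limit argument concrete, here is a minimal sketch that compares the default call with a negative limit, then pairs a data row with its headers. The helper name parse_row and the sample header line are hypothetical, chosen only for illustration; they do not come from the question.

```ruby
# Default (limit omitted): trailing empty fields are dropped.
default_fields = "w$x$$".split('$')        # => ["w", "x"]

# Negative limit: no cap on the number of fields, trailing empties are kept.
all_fields = "w$x$$".split('$', -1)        # => ["w", "x", "", ""]

# Hypothetical helper (not from the original post): strip the line ending,
# then split while keeping trailing empty fields.
def parse_row(line, delimiter = '$')
  # chomp removes a trailing "\r\n", "\n", or "\r"
  line.chomp.split(delimiter, -1)
end

headers = parse_row("a$b$c$d\r\n")         # => ["a", "b", "c", "d"]
row     = parse_row("w$x$$\r\n")           # => ["w", "x", "", ""]

# Because both arrays now have the same length, they can be zipped together.
record = headers.zip(row).to_h
# => {"a"=>"w", "b"=>"x", "c"=>"", "d"=>""}
```

Since every row keeps its full field count this way, there is no need to special-case lines that end in consecutive delimiters.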

Brandan answered Sep 28 '22 08:09