
How do I convert a date into a time when parsing an .xls doc using Rails?

I'm using Rails 5. I want to parse an .xls document (not to be confused with .xlsx) using the code below:

  book = Roo::Spreadsheet.open(file_location)
  sheet = book.sheet(0)
  text = sheet.to_csv
  csv = CSV.parse(text)

  arr_of_arrs = csv
  text_content = ""
  arr_of_arrs.each do |arr|
    arr.map!{|v| v && v.to_f < 1 && v.to_f > 0 ? TimeFormattingHelper.time_as_str(v.to_f * 24 * 3600 * 1000) : v}
    text_content = "#{text_content}\n#{arr.join("\t")}"
  end

Here is the method I reference above:

  def time_as_str(time_in_ms)
    regex = /^(0*:?)*0*/
    Time.at(time_in_ms.to_f/1000).utc.strftime("%H:%M:%S.%1N").sub!(regex, '')
  end

One area where I'm having trouble is that a cell that appears in my .xls doc as

24:08:00

is processed as

1904-01-02T00:08:00+00:00

with the code above. How do I parse the value I see on the screen? That is, how do I convert the date value into a time value?

As an example from another Excel doc, the cell that appears as

24:02:00

is getting parsed by my code above as

1899-12-31T00:02:00+00:00

Dave asked Jul 03 '17 20:07



2 Answers

It seems your .xls uses the 1904 date system, and Roo cannot distinguish a duration from a DateTime, so you'll need to subtract the base date 1904-01-01 from the cell value. Oddly enough, for the 1900 date system you need to subtract the base date 1899-12-30 rather than 1900-01-01, because of a bug in Lotus 1-2-3 (it treated 1900 as a leap year) that Microsoft replicated in Excel for compatibility.
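
As a rough illustration of why the same kind of cell shows up as two different dates: Excel stores a duration such as 24:08:00 as a fractional number of days, and a reader that treats the cell as a date adds that number to the workbook's base date. The sketch below uses only Ruby's standard library; the serial value is computed by hand for illustration, not read from a spreadsheet.

```ruby
require 'date'

# 24:08:00 expressed as a fraction of a day (a little over 1.0).
serial = Rational(24 * 3600 + 8 * 60, 86_400)

# Base dates for the 1904 and 1900 date systems, as named above.
base_1904 = DateTime.new(1904, 1, 1)
base_1900 = DateTime.new(1899, 12, 30)

# Adding a Rational to a DateTime adds that many days.
as_1904 = base_1904 + serial
as_1900 = base_1900 + serial

puts as_1904.iso8601  # 1904-01-02T00:08:00+00:00
puts as_1900.iso8601  # 1899-12-31T00:08:00+00:00
```

Subtracting the base date again recovers the original duration.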

Here is a method that converts the DateTime read from the spreadsheet into the duration according to the base date:

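def duration_as_str(datetime, base_date)
  # DateTime#to_i comes from ActiveSupport, so this assumes a Rails environment.
  total_seconds = DateTime.parse(datetime).to_i - base_date.to_i
  hours = total_seconds / (60 * 60)
  minutes = (total_seconds / 60) % 60
  seconds = total_seconds % 60
  # Hours are not wrapped at 24, so durations longer than a day print as-is.
  "%d:%02d:%02d" % [hours, minutes, seconds]
end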

Let's test it:

irb(main):019:0> duration_as_str("1904-01-02T00:08:00+00:00", DateTime.new(1904, 1, 1))
=> "24:08:00"
irb(main):020:0> duration_as_str("1899-12-31T00:02:00+00:00", DateTime.new(1899, 12, 30))
=> "24:02:00"

You can use book.workbook.date_base.year to determine the spreadsheet's date system, and then just add another map inside your each loop:

book = Roo::Spreadsheet.open(file_location)
sheet = book.sheet(0)
text = sheet.to_csv
csv = CSV.parse(text)

base_date = book.workbook.date_base.year == 1904 ? DateTime.new(1904, 1, 1) : DateTime.new(1899, 12, 30)
arr_of_arrs = csv
text_content = ""
arr_of_arrs.each do |arr|
  arr.map!{|v| v && v.to_f < 1 && v.to_f > 0 ? TimeFormattingHelper.time_as_str(v.to_f * 24 * 3600 * 1000) : v}
  arr.map!{|v| v =~ /^(1904|1899)-/ ? duration_as_str(v, base_date) : v}
  text_content = "#{text_content}\n#{arr.join("\t")}"
end
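
If you need the same conversion outside Rails, where DateTime#to_i (an ActiveSupport extension) is not available, a variant along these lines might work. This is only a sketch: duration_from_base is a hypothetical name, and it relies on the fact that subtracting two DateTime values in plain Ruby yields a Rational number of days.

```ruby
require 'date'

# Hypothetical stdlib-only variant of duration_as_str.
def duration_from_base(datetime_str, base_date)
  # DateTime - DateTime returns the difference in days as a Rational.
  days = DateTime.parse(datetime_str) - base_date
  total_seconds = (days * 86_400).round
  hours, remainder = total_seconds.divmod(3600)
  minutes, seconds = remainder.divmod(60)
  format('%d:%02d:%02d', hours, minutes, seconds)
end

puts duration_from_base('1904-01-02T00:08:00+00:00', DateTime.new(1904, 1, 1))
puts duration_from_base('1899-12-31T00:02:00+00:00', DateTime.new(1899, 12, 30))
```

For the two values from the question this prints 24:08:00 and 24:02:00.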

Helder Pereira answered Sep 19 '22 21:09


You could write a custom parser for that string, something like the following:

duration = 0

"24:08:01".split(":").each_with_index do |value, i|
  if i == 0
    duration += value.to_i.hours
  elsif i == 1
    duration += value.to_i.minutes
  else
    duration += value.to_i.seconds
  end
end

duration.value # => 86881 (duration in seconds)

This parser will assume a format of hours:minutes:seconds and return an instance of ActiveSupport::Duration. Then, duration.value will give you the number of seconds.
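
If you only need the number of seconds and would rather not depend on ActiveSupport, the same idea can be sketched in plain Ruby. The helper name hms_to_seconds is made up for this example, and it assumes the string always has exactly three colon-separated integer parts.

```ruby
# Hypothetical helper: convert "H:MM:SS" into a total number of seconds.
def hms_to_seconds(str)
  # Integer(part, 10) forces base 10, so "08" is read as 8 rather than rejected as octal.
  hours, minutes, seconds = str.split(':').map { |part| Integer(part, 10) }
  hours * 3600 + minutes * 60 + seconds
end

puts hms_to_seconds('24:08:01') # 86881
```

Integer raises ArgumentError on non-numeric parts, which surfaces malformed cells early instead of silently treating them as zero the way to_i would.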

srt32 answered Sep 19 '22 21:09