I'm trying to parse dates from a large CSV file in Racket.
The most straightforward way to do this would be to create a new date struct, but its constructor requires the week-day and year-day fields. Of course I don't have these, and this seems like a real weakness of the date module that I don't understand.
So, as an alternative, I decided to use find-seconds to convert the raw date values into seconds and then pass that to seconds->date. This works, but is brutally slow.
(require racket/date) ; provides find-seconds

(time
 (let loop ([n 10000])
   (apply find-seconds '(0 0 12 1 1 2012)) ; this takes 3 seconds for 10000
   ;(date 0 0 12 1 1 2012 0 0 #f 0) ; this is instant
   (if (zero? n)
       'done
       (loop (sub1 n)))))
find-seconds takes 3 seconds to do 10000 values, and I have several million. Creating the date struct is of course instant, but I don't have the week-day and year-day values.
My questions are:
1.) Why are week-day and year-day required for creating date structs?
2.) Is find-seconds supposed to be this slow (i.e., is this a bug)? Or am I doing something wrong?
3.) Are there any alternatives for parsing dates quickly? I know srfi/19 has a string->date function, but I'd then have to change everything to use that module's struct instead of Racket's built-in one. And it may suffer the same performance hit as find-seconds; I'm not sure.
Although not documented as such, it appears that week-day and year-day are "no-ops" when using the date struct with date->seconds. If I set them both to 0, date->seconds doesn't complain. I suspect it ignores them:
#lang racket
(require racket/date)
(define d (date 1    ; second
                2    ; minute
                3    ; hour
                20   ; day
                8    ; month
                2012 ; year
                0    ; week-day <<<
                0    ; year-day <<<
                #f   ; dst?
                0    ; time-zone-offset
                ))
(displayln (seconds->date (date->seconds d)))
;; =>
;; #(struct:date* 1 2 3 20 8 2012 1 232 #t -14400 0 EDT)
;;                                ^ ^^^
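One way to check that suspicion, rather than just noticing the absence of a complaint, is to build the same instant twice with different placeholder values and compare the results. This is only a sketch: sample-date is a helper name made up for the illustration, and passing #f as the optional second argument of date->seconds (interpret the fields as UTC) is there so the comparison doesn't depend on the machine's time zone.

```racket
#lang racket
(require racket/date)

;; sample-date is a hypothetical helper: the same instant every time,
;; differing only in the week-day and year-day placeholders.
(define (sample-date week-day year-day)
  (date 1 2 3 20 8 2012 week-day year-day #f 0))

;; #f = treat the fields as UTC instead of local time.
(define with-zeros  (date->seconds (sample-date 0 0) #f))
(define with-others (date->seconds (sample-date 6 365) #f))

;; If week-day and year-day are ignored, the two results are equal.
(displayln (= with-zeros with-others))
```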
My guess is that the date struct was defined for use with seconds->date, where week-day and year-day would be interesting information to provide. Then, rather than define another struct with those fields missing for use with date->seconds (they're "redundant" for determining the date, which is why you're understandably annoyed :)), the same struct was reused.
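To make "redundant" concrete: both fields can be computed from year, month, and day with plain integer arithmetic. The sketch below uses the well-known days-from-civil calculation for the proleptic Gregorian calendar; it is not how racket/date does it, and every function name here is made up for the illustration. The same day count also gives seconds since the epoch directly, assuming the fields are UTC and ignoring leap seconds, which may be worth trying if the search-based conversion is too slow for millions of rows.

```racket
#lang racket

;; Days since 1970-01-01 for a proleptic Gregorian date.
(define (days-from-civil year month day)
  (let* ([y   (if (<= month 2) (sub1 year) year)] ; treat the year as starting in March
         [yoe (modulo y 400)]                      ; year within its 400-year era
         [era (quotient (- y yoe) 400)]
         [mp  (modulo (+ month 9) 12)]             ; March = 0 ... February = 11
         [doy (+ (quotient (+ (* 153 mp) 2) 5) (sub1 day))]
         [doe (+ (* yoe 365) (quotient yoe 4) (- (quotient yoe 100)) doy)])
    (- (+ (* era 146097) doe) 719468)))

;; 0 = Sunday; 1970-01-01 was a Thursday (4).
(define (week-day year month day)
  (modulo (+ 4 (days-from-civil year month day)) 7))

;; 0 = January 1.
(define (year-day year month day)
  (- (days-from-civil year month day) (days-from-civil year 1 1)))

;; Same argument order as find-seconds; fields are assumed to be UTC.
(define (utc-seconds second minute hour day month year)
  (+ (* 86400 (days-from-civil year month day))
     (* 3600 hour) (* 60 minute) second))

(displayln (list (week-day 2012 8 20) (year-day 2012 8 20))) ; prints (1 232)
```

The printed values match the week-day and year-day that seconds->date filled in for August 20, 2012 in the output above.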
Does that help? It's not clear to me from your question what you're trying to do with the date information from the CSV. If you want to convert it to an integer seconds value, I think the above should work for you. If you have something else in mind, perhaps you could explain.
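For the CSV case, that could look roughly like the following. I don't know what your date column looks like, so the "YYYY-MM-DD HH:MM:SS" layout and the field->date name are pure assumptions; adjust the regexp to whatever the file actually contains. The #f passed to date->seconds means "treat the fields as UTC"; drop it if the values are local times.

```racket
#lang racket
(require racket/date)

;; Hypothetical field layout: "YYYY-MM-DD HH:MM:SS".
(define (field->date str)
  (match (regexp-match
          #px"^(\\d{4})-(\\d{2})-(\\d{2}) (\\d{2}):(\\d{2}):(\\d{2})$"
          str)
    [(list _ year month day hour minute second)
     (date (string->number second) (string->number minute)
           (string->number hour) (string->number day)
           (string->number month) (string->number year)
           0 0 #f 0)] ; placeholders: week-day, year-day, dst?, time-zone-offset
    [_ (error 'field->date "unrecognized date field: ~e" str)]))

(define d (field->date "2012-08-20 03:02:01"))
(define secs (date->seconds d #f)) ; #f = fields are UTC
(displayln secs)
```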