I'm trying to parse dates from a large CSV file in Racket.
The most straightforward way to do this would be to create a new date struct.  But it requires the week-day and year-day parameters.  Of course I don't have these, and this seems like a real weakness of the date module that I don't understand.
So, as an alternative, I decided to use find-seconds to convert the raw date values into seconds and then pass the result to seconds->date.  This works, but is brutally slow.
(require racket/date) ; provides find-seconds
(time
 (let loop ([n 10000])
   (apply find-seconds '(0 0 12 1 1 2012)) ; this takes 3 seconds for 10000
   ;(date 0 0 12 1 1 2012 0 0 #f 0) ; this is instant
   (if (zero? n)
       'done
       (loop (sub1 n)))))
find-seconds takes 3 seconds to process 10000 values, and I have several million.  Creating the date struct is of course instant, but I don't have the week-day and year-day values.
My questions are:
1.) Why are week-day and year-day required for creating date structs?
2.) Is find-seconds supposed to be this slow (i.e., is this a bug)?  Or am I doing something wrong?
3.) Are there any faster alternatives for parsing dates?  I know srfi/19 has a string->date function, but I'd then have to change everything to use that module's struct instead of Racket's built-in one.  And it may suffer the same performance hit as find-seconds; I'm not sure.
Although not documented as such, it appears that week-day and year-day are "no-ops" when using the date struct with date->seconds. If I set them both to 0, date->seconds doesn't complain. I suspect it ignores them:
#lang racket
(require racket/date)
(define d (date 1    ;second
                2    ;minute
                3    ;hour
                20   ;day
                8    ;month
                2012 ;year
                0    ;week-day <<<
                0    ;year-day <<<
                #f   ;dst?
                0    ;time-zone-offset
                ))
(displayln (seconds->date (date->seconds d)))
;; =>
#(struct:date* 1 2 3 20 8 2012 1 232 #t -14400 0 EDT)
                               ^ ^^^
My guess is that the date struct was defined for use with seconds->date, where week-day and year-day are useful information to return. Then, rather than define a second struct without those fields for date->seconds (they're "redundant" for determining the date, which is why you're understandably annoyed :)), the same struct was reused.
Does that help? It's not clear to me from your question what you're trying to do with the date information from the CSV. If you want to convert it to an integer seconds value, I think the above should work for you. If you have something else in mind, perhaps you could explain.
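If what you need is a fully populated date struct rather than a seconds value, note that week-day and year-day are determined entirely by the day, month, and year, so they can be computed arithmetically instead of going through find-seconds. The following is only a minimal sketch, not something provided by racket/date: the names leap-year?, year-day, week-day, and ymd->date are hypothetical helpers, the week-day uses Sakamoto's day-of-week formula (0 is Sunday), and the dst? and time-zone-offset fields are simply set to #f and 0, which is an assumption that may not suit your data.

```racket
#lang racket/base
;; Hypothetical helpers, not part of racket/date.

;; Gregorian leap-year rule.
(define (leap-year? y)
  (and (zero? (remainder y 4))
       (or (not (zero? (remainder y 100)))
           (zero? (remainder y 400)))))

;; Number of days before the first of each month in a non-leap year.
(define days-before-month
  #(0 31 59 90 120 151 181 212 243 273 304 334))

;; Zero-based day of the year (January 1 is 0).
(define (year-day y m d)
  (+ (vector-ref days-before-month (sub1 m))
     (sub1 d)
     (if (and (> m 2) (leap-year? y)) 1 0)))

;; Day of the week via Sakamoto's formula (0 is Sunday).
(define month-offsets #(0 3 2 5 0 3 5 1 4 6 2 4))

(define (week-day y m d)
  (let ([y (if (< m 3) (sub1 y) y)])
    (modulo (+ y
               (quotient y 4)
               (- (quotient y 100))
               (quotient y 400)
               (vector-ref month-offsets (sub1 m))
               d)
            7)))

;; Build a date struct without calling find-seconds.
;; Assumption: no DST and a zero time-zone offset.
(define (ymd->date y m d [hh 0] [mm 0] [ss 0])
  (date ss mm hh d m y
        (week-day y m d)
        (year-day y m d)
        #f
        0))
```

For the date used above, (ymd->date 2012 8 20 3 2 1) fills in week-day 1 and year-day 232, the same two values that seconds->date reported. I have not timed this against find-seconds, so treat any speedup as something to measure on your own data.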