Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I unpack a SQL Server DATETIME in a pyodbc Output Converter function?

I am adding output converters to pyodbc connection objects to handle date types returned from SQL Server. I was able to unpack a datetime.time struct with:

tuple   = struct.unpack("HHHI", dateObj)

which works well. I cannot figure out the secret sauce for the datetime.datetime object, though, which according to the pyodbc docs is a TIMESTAMP_STRUCT, defined here:

typedef struct tagTIMESTAMP_STRUCT
{
        SQLSMALLINT    year;
        SQLUSMALLINT   month;
        SQLUSMALLINT   day;
        SQLUSMALLINT   hour;
        SQLUSMALLINT   minute;
        SQLUSMALLINT   second;
        SQLUINTEGER    fraction;
} TIMESTAMP_STRUCT;

A datum from that column in the db is 2018-01-11 11:50:16.000, and without the add_output_convert trap pyodbc returns:

TypeError: datetime.datetime(2018, 1, 11, 11, 50, 16) is not JSON serializable

Looks like pyodbc silently drops the fraction, which is fine. Shouldn't the unpack() format be one of:

tuple = struct.unpack("hHHHHHI", dateObj)  # with missing fraction
tuple = struct.unpack("hHHHHH", dateObj)

? The latter simply returns:

error: unpack requires a string argument of length 12

For the record, according to sys.getsizeof dateObj is 41 bytes. Any suggestions for the format? This is Windows 10 64-bit, as well as Linux 64-bit.

like image 298
Omortis Avatar asked Jan 19 '18 17:01

Omortis


1 Answers

You seem to have been chasing down some false leads. The SQL Server ODBC driver does not return 41 bytes of data for a DATETIME value, it only returns 8 bytes. (sys.getsizeof returns a value of 41 because it includes "overhead" related to garbage collection.) And there's no way that those 8 bytes represent a TIMESTAMP_STRUCT, so it must be something else.

Starting with a basic test ...

import pyodbc


def datetime_as_string(raw_bytes):
    return raw_bytes


cnxn = pyodbc.connect('DSN=SQLmyDb;', autocommit=True)
cnxn.add_output_converter(pyodbc.SQL_TYPE_TIMESTAMP, datetime_as_string)
crsr = cnxn.cursor()

test_value = '2018-01-11 11:50:16'
rtn = crsr.execute("SELECT CAST(? AS DATETIME)", test_value).fetchval()
print(repr(rtn))

crsr.close()
cnxn.close()

... I saw that your test value is represented by 'e\xa8\x00\x00\xa0\x14\xc3\x00'. A bit of time with the hex converter in Windows Calculator showed me that the contents were not immediately obvious. Playing a hunch, I tried test_value = '1900-01-01 00:00:00' and that returned '\x00\x00\x00\x00\x00\x00\x00\x00' so at least I had a place to start (the "epoch" for SQL Server DATETIME values).

test_value = '1901-01-01 00:00:00' (1 year after the epoch) returned 'm\x01\x00\x00\x00\x00\x00\x00', 'm' is 0x6d, and 0x016d is 365, so that was encouraging.

test_value = '1900-01-01 00:00:01' (1 second after the epoch) returned '\x00\x00\x00\x00,\x01\x00\x00', ',' is 0x2c, and 0x012c is 300.

test_value = '1900-01-01 00:00:02' (2 seconds after the epoch) returned '\x00\x00\x00\x00X\x02\x00\x00', 'X' is 0x58, and 0x0258 is 600.

So the SQL Server ODBC driver is returning two 4-byte signed integers, the first being the offset from the epoch in whole days, and the second being the partial day in increments of 1/300 seconds.

Therefore I changed my Output Converter function to

def datetime_as_string(raw_bytes):
    tup = struct.unpack("<2l", raw_bytes)
    days_since_1900 = tup[0]
    partial_day = round(tup[1] / 300.0, 3)
    date_time = datetime(1900, 1, 1) + timedelta(days=days_since_1900) + timedelta(seconds=partial_day)
    return date_time.strftime('%Y-%m-%d %H:%M:%S.%f')[:23]

and that seemed to do the trick.

like image 136
Gord Thompson Avatar answered Oct 04 '22 22:10

Gord Thompson