Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

using pyodbc on ubuntu to insert a image field on SQL Server

I am using Ubuntu 9.04

I have installed the following package versions:

unixodbc and unixodbc-dev: 2.2.11-16build3
tdsodbc: 0.82-4
libsybdb5: 0.82-4
freetds-common and freetds-dev: 0.82-4
python2.6-dev

I have configured /etc/unixodbc.ini like this:

[FreeTDS]
Description             = TDS driver (Sybase/MS SQL)
Driver          = /usr/lib/odbc/libtdsodbc.so
Setup           = /usr/lib/odbc/libtdsS.so
CPTimeout               = 
CPReuse         = 
UsageCount              = 2

I have configured /etc/freetds/freetds.conf like this:

[global]
    tds version = 8.0
    client charset = UTF-8
    text size = 4294967295

I have grabbed pyodbc revision 31e2fae4adbf1b2af1726e5668a3414cf46b454f from http://github.com/mkleehammer/pyodbc and installed it using "python setup.py install"

I have a windows machine with Microsoft SQL Server 2000 installed on my local network, up and listening on the local ip address 10.32.42.69. I have an empty database created with name "Common". I have the user "sa" with password "secret" with full privileges.

I am using the following python code to setup the connection:

import pyodbc
odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()

cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
      WHERE TABLE_NAME = 'testing')
   DROP TABLE testing
""")
cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1), 
    myimage IMAGE NULL, 
    PRIMARY KEY (id)
)
    ''')
con.commit()

Everything WORKS up to this point. I have used SQLServer's Enterprise Manager on the server and the new table is there. Now I want to insert some data on the table.

cur = con.cursor()
# using web data for exact reproduction of the error by all.
# I'm actually reading a local file in my real code.
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()

sql = "INSERT INTO testing (myimage) VALUES (?)"

Now here on my original question, I was having trouble using cur.execute(sql, (data,)) but now I've edited the question, because following Vinay Sajip's answer below (THANKS), I have changed it to:

cur.execute(sql, (pyodbc.Binary(data),)) 
con.commit()

And insertion is working perfectly. I can confirm the size of the inserted data using the following test code:

cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)

Which passes perfectly!!!

Now the problem is on retrieval of the data back.

I am trying the common approach:

cur.execute('SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0]) # transforming buffer object
print 'Original: %d; Returned: %d' % (len(data), len(returned_data))
assert data == returned_data

However that fails!!

Original: 4744611; Returned: 4096
Traceback (most recent call last):
  File "/home/nosklo/devel/teste_mssql_pyodbc_unicode.py", line 53, in <module>
    assert data == returned_data
AssertionError

I've put all the code above in a single file here, for easy testing of anyone that wants to help.

Now for the question:

I want python code to insert an image file into mssql. I want to query the image back and show it to the user.

I don't care about the column type in mssql. I am using the "IMAGE" column type on the example, but any binary/blob type would do, as long as I get the binary data for the file I inserted back unspoiled. Vinay Sajip said below that this is the preferred data type for this in SQL SERVER 2000.

The data is now being inserted without errors, however when I retrieve the data, only 4k are returned. (Data is truncated on 4096).

How can I make that work?


EDITS: Vinay Sajip's answer below gave me a hint to use pyodbc.Binary on the field. I have updated the question accordingly. Thanks Vinay Sajip!

Alex Martelli's comment gave me the idea of using the DATALENGTH MS SQL function to test if the data is fully loaded on the column. Thanks Alex Martelli !

like image 271
nosklo Avatar asked Jun 29 '09 19:06

nosklo


3 Answers

Huh, just after offering the bounty, I've found out the solution.

You have to use SET TEXTSIZE 2147483647 on the query, in addition of text size configuration option in /etc/freetds/freetds.conf.

I have used

cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')

And everything worked fine.

Strange is what FreeTDS documentation says about the text size configuration option:

default value of TEXTSIZE, in bytes. For text and image datatypes, sets the maximum width of any returned column. Cf. set TEXTSIZE in the T-SQL documentation for your server.

The configuration also says that the maximum value (and the default) is 4,294,967,295. However when trying to use that value in the query I get an error, the max number I could use in the query is 2,147,483,647 (half).

From that explanation I thought that only setting this configuration option would be enough. It turns out that I was wrong, setting TEXTSIZE in the query fixed the issue.

Below is the complete working code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pyodbc
import urllib2

odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()

cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
      WHERE TABLE_NAME = 'testing')
   DROP TABLE testing
""")

cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1), 
    myimage IMAGE NULL,
    PRIMARY KEY (id)
)
    ''')

con.commit()
cur = con.cursor()
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()

sql = "INSERT INTO testing (myimage) VALUES (?)"
cur.execute(sql, (pyodbc.Binary(data),))
con.commit()

cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)

cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0])
print 'Original: %d; Returned; %d' % (len(data), len(returned_data))
assert data == returned_data
like image 163
nosklo Avatar answered Sep 23 '22 04:09

nosklo


I think you should be using a pyodbc.Binary instance to wrap the data:

cur.execute('INSERT INTO testing (myimage) VALUES (?)', (pyodbc.Binary(data),))

Retrieving should be

cur.execute('SELECT myimage FROM testing')
print "image bytes: %r" % str(cur.fetchall()[0][0])

UPDATE: The problem is in insertion. Change your insertion SQL to the following:

"""DECLARE @txtptr varbinary(16)

INSERT INTO testing (myimage) VALUES ('')
SELECT @txtptr = TEXTPTR(myimage) FROM testing 
WRITETEXT testing.myimage @txtptr ?
"""

I've also updated the mistake I made in using the value attribute in the retrieval code.

With this change, I'm able to insert and retrieve a 320K JPEG image into the database (retrieved data is identical to inserted data).

N.B. The image data type is deprecated, and is replaced by varbinary(max) in later versions of SQL Server. The same logic for insertion/retrieval should apply, however, for the newer column type.

like image 39
Vinay Sajip Avatar answered Sep 24 '22 04:09

Vinay Sajip


I had a similar 4096 truncation issue on TEXT fields, which SET TEXTSIZE 2147483647 fixed for me, but this also fixed it for me:

import os
os.environ['TDSVER'] = '8.0'
like image 43
Collin Anderson Avatar answered Sep 21 '22 04:09

Collin Anderson