Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Database fetchrow_array failed long truncated DBI attribute

Tags:

perl

dbi

i am pulling urls from my database with a perl script where i employ fetchrow_array to pull URL from the database which worked fine until i encountered a very long URL georgelog24.blog.iskreni.net/?bid=6744d9dcf85991ed2e4b8a258153a1ab&lid=ff9963b9a798ea335b75b5f7c0c295d1
then it started to give me this error.

DBD::ODBC::st fetchrow_array failed: st_fetch/SQLFetch (long truncated DBI attribute LongTruncOk not set and/or LongReadLen too small) (SQL-HY000) [state was HY000 now 01004]
[Microsoft][ODBC SQL Server Driver]String data, right truncation (SQL-01004) at C:\test\multihashtest2.pl line 44.

I believe this is on the database side as the code i have been using to pull URL has worked before. The database that i am using is MSSQL server 2005.

the URL column in the database uses text type currently, but i have tried changing it to varchar(max) and nvarchar(max) but the error still stands.

After a bit of trial and error i found that the maximum length of the url then i could query successfully with fetchrow_array was 81 characters. And since URLs can span ridiculous lengths sometimes, i cannot put a restriction on URL length.

Can anybody help me understand and suggest a fix for this?

FYI: line 44 is the first line in my code below

while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do 
    my $thread = threads->create(\&webcrawl); #initiate thread
    my $tid = $thread->tid;
    print "  - Thread $tid started\n";   #obtain thread no. and print
    push (@Threads, $thread);   #push thread into array for "housekeeping" later on
}
like image 686
Marcus Lim Avatar asked Sep 07 '12 09:09

Marcus Lim


3 Answers

Try with:

#not anymore errors if content is truncated - you don't necessarily want this
$statement_handle->{'LongTruncOk'} = 1;

#nice, hard coded constant for the length of data to be read from Longs
$statement_handle->{'LongReadLen'} = 20000;
while (($myid,$url) = $statement_handle->fetchrow_array()) { # executes as many threads as there are jobs to do 
    my $thread = threads->create(\&webcrawl); #initiate thread
    my $tid = $thread->tid;
    print "  - Thread $tid started\n";   #obtain thread no. and print
    push (@Threads, $thread);   #push thread into array for "housekeeping" later on
}

Also, I'd recommend you to try Parallel::ForkManager for parallelizing jobs - I find it much more intuitive and easy to use than threads

like image 59
Tudor Constantin Avatar answered Nov 18 '22 10:11

Tudor Constantin


Please look at the DBI attributes LongTruncOk and LongReadlen

You will NEED to either accept truncation or set a max size as text and varchar(max) columns can be massive so if it was left to the DBD it would have no choice but to allocate massive amounts of memory in case the column is the max size of that column.

like image 21
bohica Avatar answered Nov 18 '22 10:11

bohica


Important point: you need to set the LongReadLen and/or LongTruncOk attributes on the database handle prior to preparing the statement, as noted here.

Attempting to set it on the prepared statement handle prior to fetching data will have no effect on truncation of the returned data.

like image 4
Trutane Avatar answered Nov 18 '22 09:11

Trutane