I'm learning to use Theano. I want to populate a term-document matrix (a numpy sparse matrix) by calculating a binary TF-IDF value for each of its elements:
import theano
import theano.tensor as T
import numpy as np
from time import perf_counter

def tfidf_gpu(appearance_in_documents,num_documents,document_words):
    start = perf_counter()
    APP = T.scalar('APP',dtype='int32')
    N = T.scalar('N',dtype='int32')
    SF = T.scalar('S',dtype='int32')
    F = (T.log(N)-T.log(APP)) / SF
    TFIDF = theano.function([N,APP,SF],F)
    ret = TFIDF(num_documents,appearance_in_documents,document_words)
    end = perf_counter()
    print("\nTFIDF_GPU ",end-start," secs.")
    return ret

def tfidf_cpu(appearance_in_documents,num_documents,document_words):
    start = perf_counter()
    tfidf = (np.log(num_documents)-np.log(appearance_in_documents))/document_words
    end = perf_counter()
    print("TFIDF_CPU ",end-start," secs.\n")
    return tfidf
But the numpy version is much faster than the theano implementation:
Progress 1/43
TFIDF_GPU 0.05702276699594222 secs.
TFIDF_CPU 1.454801531508565e-05 secs.
Progress 2/43
TFIDF_GPU 0.023830442980397493 secs.
TFIDF_CPU 1.1073017958551645e-05 secs.
Progress 3/43
TFIDF_GPU 0.021920352999586612 secs.
TFIDF_CPU 1.0738993296399713e-05 secs.
Progress 4/43
TFIDF_GPU 0.02303648801171221 secs.
TFIDF_CPU 1.1675001587718725e-05 secs.
Progress 5/43
TFIDF_GPU 0.02359767400776036 secs.
TFIDF_CPU 1.4385004760697484e-05 secs.
....
I've read that this can be due to overhead, which can kill performance for small operations.
Is my code bad, or should I avoid using the GPU because of the overhead?
The problem is that you are compiling your Theano function on every call, and the compilation is included in the time you measure. Compilation takes time. Compile the function once and pass the compiled function in, like this:
def tfidf_gpu(appearance_in_documents,num_documents,document_words,TFIDF):
    start = perf_counter()
    ret = TFIDF(num_documents,appearance_in_documents,document_words)
    end = perf_counter()
    print("\nTFIDF_GPU ",end-start," secs.")
    return ret

APP = T.scalar('APP',dtype='int32')
N = T.scalar('N',dtype='int32')
SF = T.scalar('S',dtype='int32')
F = (T.log(N)-T.log(APP)) / SF
TFIDF = theano.function([N,APP,SF],F)

tfidf_gpu(appearance_in_documents,num_documents,document_words,TFIDF)
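To check what the calls cost once compilation is out of the timed region, it can help to average over many calls rather than timing a single one. The following is a minimal sketch, not part of the original code: `time_calls` and `tfidf_reference` are hypothetical helper names, and `tfidf_reference` is a pure-Python stand-in for the compiled `TFIDF` function so that the snippet runs without Theano. The argument values are made up for illustration.

```python
import math
from time import perf_counter


def time_calls(fn, args, repeats=1000):
    """Call fn(*args) `repeats` times; return (last result, mean seconds per call)."""
    start = perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    elapsed = perf_counter() - start
    return result, elapsed / repeats


def tfidf_reference(num_documents, appearance_in_documents, document_words):
    # Same formula as F in the Theano graph, evaluated with the standard library.
    return (math.log(num_documents) - math.log(appearance_in_documents)) / document_words


# Stand-in for the compiled function; the argument order matches
# theano.function([N, APP, SF], F). The values here are illustrative only.
value, per_call = time_calls(tfidf_reference, (1000, 10, 50))
print("value:", value, "mean secs per call:", per_call)
```

Passing the already compiled `TFIDF` function instead of `tfidf_reference` should give the per-call cost without the compilation step, and `tfidf_reference` can double as a sanity check on the value it returns.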
Also, your TF-IDF task is bandwidth-intensive. Theano, and GPUs in general, are best suited to computation-intensive tasks.
The current task will incur considerable overhead moving the data to the GPU and back, because in the end you only need to read each element O(1) times. If you want to do more computation per element, it makes sense to use the GPU.
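A rough way to see why this is bandwidth-bound is to count operations per byte moved. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement: it assumes each element reads three int32 inputs and writes one float64 result, and it counts the two logs, the subtraction and the division as one operation each. `arithmetic_intensity` is a hypothetical helper name.

```python
def arithmetic_intensity(ops_per_element, bytes_per_element):
    """Operations performed per byte moved to or from memory."""
    return ops_per_element / bytes_per_element


# Assumptions (not from the original post):
#   inputs : three int32 values per element (N, APP, SF) -> 3 * 4 bytes
#   output : one float64 value per element               -> 8 bytes
#   work   : log, log, subtract, divide                  -> 4 operations
BYTES_IN = 3 * 4
BYTES_OUT = 8
OPS = 4

intensity = arithmetic_intensity(OPS, BYTES_IN + BYTES_OUT)
print(intensity)  # 0.2 operations per byte
```

With so little work per byte, the time spent transferring data is likely to dominate whatever the GPU saves on the arithmetic itself.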