How should I save BM25Okapi object value to file?

Question

We are working on an information retrieval task in which we need to rank research papers against a query.

After cleaning the data and creating a dataframe, we tokenized the paper texts and need to save the result to a file.

import csv

from rank_bm25 import BM25Okapi  # assumption: BM25Okapi comes from the rank_bm25 package

# df is the cleaned dataframe; body_text holds the paper texts
corpus = list(df.body_text)

# Tokenize in two batches of 20,000 documents; the remainder is left out for now
tokenized_corpus1 = [doc.split(" ") for doc in corpus[:20000]]
tokenized_corpus2 = [doc.split(" ") for doc in corpus[20000:40000]]
# tokenized_corpus3 = [doc.split(" ") for doc in corpus[40000:]]

tokenized_corpus = tokenized_corpus1 + tokenized_corpus2  # + tokenized_corpus3

The cell above creates the tokenized corpus.

with open('file.csv', 'w', newline='', encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(tokenized_corpus)

Then we save the data to a .csv file.

After that, we call the BM25Okapi constructor:
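To reuse the saved tokens later, the file can be read back with csv.reader, which yields one list of tokens per row. A minimal sketch, assuming the file was written as above; the helper names and the sample corpus are made up for illustration:

```python
import csv
import os
import tempfile


def write_tokenized_corpus(path, tokenized_corpus):
    """Write one document per CSV row, one token per field."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(tokenized_corpus)


def read_tokenized_corpus(path):
    """Read the rows back as a list of token lists."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row for row in csv.reader(f)]


# Small made-up corpus, used only to show the round trip
sample_corpus = [["coronavirus", "origin", "study"], ["bat", "virus, novel"]]
sample_path = os.path.join(tempfile.mkdtemp(), "file.csv")
write_tokenized_corpus(sample_path, sample_corpus)
restored_corpus = read_tokenized_corpus(sample_path)
```

Tokens that contain commas or quotes are escaped by the csv module, so they survive the round trip unchanged.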

bm25 = BM25Okapi(tokenized_corpus)

Because this step takes a long time and consumes gigabytes of memory (causing frequent errors), we want to save the result so that we do not need to call the function again every time.

To retrieve results for a query, we used the following steps.

query = "coronavirus origin"
tokenized_query = query.split(" ")

doc_scores = bm25.get_scores(tokenized_query)
doc_scores
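get_scores returns one score per document, in corpus order, so ranking the papers means sorting document indices by score. A minimal sketch in plain Python; the top_k helper and the sample scores are illustrative and not part of any library:

```python
import heapq


def top_k(doc_scores, k=5):
    """Return (index, score) pairs for the k highest-scoring documents."""
    return heapq.nlargest(k, enumerate(doc_scores), key=lambda pair: pair[1])


# Made-up scores standing in for bm25.get_scores(tokenized_query)
sample_scores = [0.0, 2.31, 0.57, 4.02, 1.18]
best = top_k(sample_scores, k=3)
```

Each returned index refers to a position in the tokenized corpus, so it can be used to look up the matching paper, for example with df.iloc[index] if the dataframe rows are in the same order.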

I was not able to save the BM25 object's value to a file, and I did not see any method for this in the source code. How should I do it?

Ulvi Shukurzade · Accepted Answer

The question is framed the wrong way. What we need is to save a Python object in general, not the BM25Okapi results specifically.

So, here is the solution:

import pickle

# Save the bm25 object
with open('bm25result', 'wb') as bm25result_file:
    pickle.dump(bm25, bm25result_file)

Then, to read the object back:

# Load the bm25 object
with open('bm25result', 'rb') as bm25result_file:
    bm25result = pickle.load(bm25result_file)
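The two snippets above can be combined so that the index is built only when no saved copy exists yet. A minimal sketch under that assumption; load_or_build, the cache path, and the stand-in build function are hypothetical names, and in the real workflow the build step would be something like lambda: BM25Okapi(tokenized_corpus):

```python
import os
import pickle
import tempfile


def load_or_build(path, build):
    """Load a pickled object from path, or build it once and save it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    obj = build()
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
    return obj


# Stand-in for the expensive step; records how often it actually runs
build_calls = []


def expensive_build():
    build_calls.append(1)
    return {"doc_count": 2, "avgdl": 3.5}


cache_path = os.path.join(tempfile.mkdtemp(), "bm25result")
first = load_or_build(cache_path, expensive_build)   # builds and saves
second = load_or_build(cache_path, expensive_build)  # loads from disk
```

Two points are worth keeping in mind: the module that defines the pickled class must be importable when the file is loaded, and a pickle file should only be loaded if you created it yourself, since unpickling can execute arbitrary code.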

A detailed description can be found in this article.

Tags:

python

artificial-intelligence

ranking

information-retrieval
