
TensorFlow cannot restore the vocabulary during evaluation

I am new to TensorFlow and neural networks. I started a project about detecting errors in Persian texts. I used the code at this address and developed my own code here. Please check the code there, because I cannot put all of it here.

What I want to do is give the model several Persian sentences for training and then see whether it can detect wrong sentences. The model works fine with English data, but when I use it with Persian data I encounter this issue.

The code is too long to include here, so I will point to the part I think might be causing the issue. I used these lines in train.py, which work fine and store the vocabulary:

x_text, y = data_helpers.load_data_labels(datasets)
# Build vocabulary
max_document_length = max([len(x.split(" ")) for x in x_text])
vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
x = np.array(list(vocab_processor.fit_transform(x_text)))

However, after training, when I run this code in eval.py:

vocab_path = os.path.join(FLAGS.checkpoint_dir, "..", "vocab")
vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
x_test = np.array(list(vocab_processor.transform(x_raw)))

this error happens:

    vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\contrib\learn\python\learn\preprocessing\text.py", line 226, in restore
    return pickle.loads(f.read())
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 118, in read
    self._preread_check()
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 78, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: ..\vocab : The system cannot find the file specified.

I think the problem is that it cannot read the vocabulary stored after training, because the data is Unicode rather than English. Can anyone help me, please?

Masoud Masoumi Moghadam asked Dec 03 '17 10:12


2 Answers

This problem happens because the vocab path is not correct. In train.py, after line 144 where out_dir is set, I added this:

with open('model_dir.txt', 'w') as f:
    f.write(out_dir)

After training, the path of the model directory is saved in the current working directory, in a file named model_dir.txt.

Then in eval.py I added this:

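# Read the directory saved by train.py and build the vocab path from it
with open('model_dir.txt') as f:
    model_dir = f.readline().strip()
vocab_path = os.path.join(model_dir, "vocab")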

Now the path is set correctly and the code works without problems.
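
For anyone hitting the same error, here is a slightly more defensive sketch of the same idea. The helper names save_model_dir and load_vocab_path are hypothetical (they are not part of the original train.py or eval.py), and I am assuming the vocabulary file is named "vocab" inside the training output directory, as in the question. Note that the path in the error message, ..\vocab, is what os.path.join produces when its first argument is an empty string, so it is likely that FLAGS.checkpoint_dir was empty when eval.py ran.

```python
import os


def save_model_dir(out_dir, pointer_file="model_dir.txt"):
    """Record the absolute training output directory so eval can find it later.

    Hypothetical helper: call it in train.py once out_dir is known.
    """
    with open(pointer_file, "w", encoding="utf-8") as f:
        f.write(os.path.abspath(out_dir))


def load_vocab_path(pointer_file="model_dir.txt"):
    """Read the saved directory and return the path of its vocab file.

    Hypothetical helper: fails early with a readable message instead of
    letting the restore call raise a NotFoundError deep inside TensorFlow.
    """
    with open(pointer_file, encoding="utf-8") as f:
        # strip() drops a trailing newline or stray spaces in the saved path
        model_dir = f.readline().strip()
    vocab_path = os.path.join(model_dir, "vocab")
    if not os.path.isfile(vocab_path):
        raise FileNotFoundError("vocab file not found at: " + vocab_path)
    return vocab_path
```

In eval.py the result of load_vocab_path() can then be passed to learn.preprocessing.VocabularyProcessor.restore. Storing an absolute path means eval.py no longer depends on the directory it is started from, although model_dir.txt itself is still looked up relative to the current working directory.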

Masoud Masoumi Moghadam answered Oct 16 '22 16:10


Have you tried adding this at the top of your file?

# -*- coding: utf-8 -*-

Myles Hollowed answered Oct 16 '22 17:10