I tried running the code from a Keras blog post.
The code writes to a .npy file as follows:
bottleneck_features_train = model.predict_generator(generator, nb_train_samples // batch_size)
np.save(open('bottleneck_features_train.npy', 'w'),bottleneck_features_train)
It then reads from this file:
def train_top_model():
    train_data = np.load(open('bottleneck_features_train.npy'))
Now I get an error saying:
Found 2000 images belonging to 2 classes.
Traceback (most recent call last):
File "kerasbottleneck.py", line 103, in <module>
save_bottlebeck_features()
File "kerasbottleneck.py", line 69, in save_bottlebeck_features
np.save(open('bottleneck_features_train.npy', 'w'),bottleneck_features_train)
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 511, in save
pickle_kwargs=pickle_kwargs)
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 565, in write_array
version)
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 335, in _write_array_header
fp.write(header_prefix)
TypeError: write() argument must be str, not bytes
After this, I tried changing the file mode from 'w' to 'wb'. This resulted in an error while reading the file:
Found 2000 images belonging to 2 classes.
Found 800 images belonging to 2 classes.
Traceback (most recent call last):
File "kerasbottleneck.py", line 104, in <module>
train_top_model()
File "kerasbottleneck.py", line 82, in train_top_model
train_data = np.load(open('bottleneck_features_train.npy'))
File "/opt/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 404, in load
magic = fid.read(N)
File "/opt/anaconda3/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte
How can I fix this error?
The code in the blog post is aimed at Python 2, where writing to and reading from a file works with bytestrings. In Python 3, you need to open the file in binary mode, both for writing and then reading again:
np.save(
open('bottleneck_features_train.npy', 'wb'),
bottleneck_features_train)
And when reading:
train_data = np.load(open('bottleneck_features_train.npy', 'rb'))
Note the b character in the mode arguments there.
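Both errors from the question can be reproduced without NumPy or Keras. The following is a minimal sketch using only the standard library; the file name demo.bin is made up for illustration, and NPY_MAGIC is just the byte sequence that opens an .npy file (0x93 followed by NUMPY), which is the byte the UnicodeDecodeError complains about.

```python
import os
import tempfile

# The first bytes of an .npy file: 0x93 followed by b"NUMPY".
NPY_MAGIC = b"\x93NUMPY"

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

# Text mode ('w') only accepts str, so writing bytes raises TypeError.
write_error = None
try:
    with open(path, "w") as f:
        f.write(NPY_MAGIC)
except TypeError as exc:
    write_error = exc

# Binary mode ('wb') writes the bytes unchanged.
with open(path, "wb") as f:
    f.write(NPY_MAGIC)

# Text mode ('r') tries to decode the bytes; 0x93 is not a valid
# UTF-8 start byte, so reading raises UnicodeDecodeError.
read_error = None
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
except UnicodeDecodeError as exc:
    read_error = exc

# Binary mode ('rb') returns the original bytes.
with open(path, "rb") as f:
    data = f.read()

print(type(write_error).__name__, type(read_error).__name__, data == NPY_MAGIC)
```

The write fails in text mode and the read fails in text mode, which matches the two tracebacks above; only the 'wb' and 'rb' pair round-trips the data.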
I'd use the file as a context manager to ensure it is cleanly closed:
with open('bottleneck_features_train.npy', 'wb') as features_train_file:
    np.save(features_train_file, bottleneck_features_train)
and
with open('bottleneck_features_train.npy', 'rb') as features_train_file:
    train_data = np.load(features_train_file)
The code in the blog post should use both of these changes anyway, because in Python 2, without the b flag in the mode, text files have platform-specific newline conventions translated, and on Windows certain characters in the stream have special meaning (including causing the file to appear shorter than it really is if an EOF character appears). With binary data that could be a real problem.
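To see how newline translation can damage binary data, here is a rough sketch in Python 3. It does not run Python 2 or require Windows; instead it assumes that passing newline="\r\n" to open() is a fair stand-in for Windows text mode, and the payload and file name are invented for the example.

```python
import os
import tempfile

# Three bytes of made-up "binary" data; the middle byte happens to be 0x0A ('\n').
payload = b"\x01\n\x02"

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "newline_demo.bin")

# Assumption: newline="\r\n" emulates Windows text-mode output on any platform.
# latin-1 maps each byte to one character, so only the newline gets altered.
with open(path, "w", newline="\r\n", encoding="latin-1") as f:
    f.write(payload.decode("latin-1"))

# Read the raw bytes back to see what actually landed on disk.
with open(path, "rb") as f:
    on_disk = f.read()

print(len(payload), len(on_disk), on_disk)
```

The file ends up one byte longer than the payload because the 0x0A byte was expanded to 0x0D 0x0A, which is enough to corrupt an array header or its data.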