'b' character added when using numpy loadtxt [duplicate]

Tags: python, numpy, python-3.4

I am trying to create an array from a text file. I saw that numpy has a loadtxt function, so I tried it, but it adds a junk character before each row...

# my txt file

    .--``--.
.--`        `--.
|              |
|              |
`--.        .--`
    `--..--`

# my python v3.4 program

import numpy as np
f = open('tile', 'r')
a = np.loadtxt(f, dtype=str, delimiter='\n')
print(a)

# my print output

["b'    .--``--.    '"
 "b'.--`        `--.'"
 "b'|              |'"
 "b'|              |'"
 "b'`--.        .--`'"
 "b'    `--..--`    '"]

What are these 'b' characters and double quotes, and where do they come from? I tried some solutions found on the internet, like opening the file with codecs or changing the dtype to 'S20' or 'S11', and a lot of other things, none of which work. What I expect is an array of unicode strings that looks like this:

[['    .--``--.    ']
 ['.--`        `--.']
 ['|              |']
 ['|              |']
 ['`--.        .--`']
 ['    `--..--`    ']]

Info: I'm using Python 3.4 and numpy from the Debian stable repository.

asked Nov 11 '15 16:11

krshk

1 Answer

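np.loadtxt and np.genfromtxt operate in byte mode: they read the file as bytestrings, which were the default string type in Python 2. Python 3 strings are unicode, and it marks bytestrings with this b prefix. With dtype=str, each bytestring is then converted with str(), which on Python 3 yields its printed form, b and quotes included; that is where the "b'...'" values come from.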
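
The doubled quoting can be reproduced without numpy. The following is a minimal sketch, assuming ASCII content, that compares calling str() on a bytestring with decoding it; the variable names are illustrative and not taken from the question.

```python
# A bytestring like the ones the loaders work with internally (sample row).
raw = b'    .--``--.    '

# str() on bytes returns the printed form, with the b prefix and quotes.
as_text_wrong = str(raw)

# Decoding returns the characters themselves.
as_text_right = raw.decode('ascii')

print(repr(as_text_wrong))
print(repr(as_text_right))
```

The first value matches the junk seen in the question's output; the second is the string the question was hoping for.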

I tried some variations in a Python 3 IPython session:

In [508]: np.loadtxt('stack33655641.txt',dtype=bytes,delimiter='\n')[0]
Out[508]: b'    .--``--.'
In [509]: np.loadtxt('stack33655641.txt',dtype=str,delimiter='\n')[0]
Out[509]: "b'    .--``--.'"
...
In [511]: np.genfromtxt('stack33655641.txt',dtype=str,delimiter='\n')[0]
Out[511]: '.--``--.'
In [512]: np.genfromtxt('stack33655641.txt',dtype=None,delimiter='\n')[0]
Out[512]: b'.--``--.'
In [513]: np.genfromtxt('stack33655641.txt',dtype=bytes,delimiter='\n')[0]
Out[513]: b'.--``--.'

genfromtxt with dtype=str gives the cleanest display, except that it strips blanks. I may have to use a converter to turn that off. These functions are meant to read CSV data where (white)spaces are separators, not part of the data.

loadtxt and genfromtxt are overkill for simple text like this. A plain file read does nicely:

In [527]: with open('stack33655641.txt') as f:a=f.read()
In [528]: print(a)
    .--``--.
.--`        `--.
|              |
|              |
`--.        .--`
    `--..--`

In [530]: a=a.splitlines()
In [531]: a
Out[531]: 
['    .--``--.',
 '.--`        `--.',
 '|              |',
 '|              |',
 '`--.        .--`',
 '    `--..--`']

(My text editor is set to strip trailing blanks, hence the ragged lines.)
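
If a numpy array is still wanted, the list from the plain read can be wrapped afterwards. This is a sketch under assumptions: the sample text is written to a temporary file standing in for the question's 'tile' file, with trailing blanks kept, and the (6, 1) shape is chosen to mimic the output the question asked for.

```python
import os
import tempfile

import numpy as np

# Sample content standing in for the question's file (trailing blanks kept).
art = (
    "    .--``--.    \n"
    ".--`        `--.\n"
    "|              |\n"
    "|              |\n"
    "`--.        .--`\n"
    "    `--..--`    \n"
)

# Write it to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(art)
    path = tmp.name

# Read as text; splitlines() drops the newlines but keeps every blank.
with open(path, encoding='ascii') as f:
    lines = f.read().splitlines()
os.remove(path)

# One unicode string per row, reshaped into a single column.
tile = np.array(lines, dtype=str).reshape(-1, 1)
print(tile)
```

Because the file is opened in text mode, no bytestrings are involved and nothing needs decoding afterwards.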

@DSM's suggestion, loading as bytes and then converting with astype(str):

In [556]: a=np.loadtxt('stack33655641.txt',dtype=bytes,delimiter='\n').astype(str)
In [557]: a
Out[557]: 
array(['    .--``--.', '.--`        `--.', '|              |',
       '|              |', '`--.        .--`', '    `--..--`'], 
      dtype='<U16')
In [558]: a.tolist()
Out[558]: 
['    .--``--.',
 '.--`        `--.',
 '|              |',
 '|              |',
 '`--.        .--`',
 '    `--..--`']
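
The conversion step can be tried on its own, without a file. In this sketch a hand-built bytes array stands in for what loadtxt returned with dtype=bytes in the session above; the data is assumed to be ASCII, since astype(str) fails on non-ASCII bytes, and np.char.decode is shown as an alternative that names the codec explicitly.

```python
import numpy as np

# Stand-in for the bytes array returned by the loader (sample rows only).
raw = np.array([b'    .--``--.', b'.--`        `--.', b'|              |'])

# astype(str) decodes each element instead of taking its printed form.
text = raw.astype(str)

# np.char.decode does the same conversion with an explicit codec.
decoded = np.char.decode(raw, 'ascii')

print(text.dtype.kind, text.tolist())
```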

answered Oct 11 '22 11:10

hpaulj

