
Fastest way to generate delimited string from 1d numpy array

Tags:

python

numpy

I have a program which needs to turn many large one-dimensional numpy arrays of floats into delimited strings. I am finding this operation quite slow relative to the mathematical operations in my program and am wondering if there is a way to speed it up. For example, consider the following loop, which takes 100,000 random numbers in a numpy array and joins each array into a comma-delimited string.

import numpy as np

x = np.random.randn(100000)
for i in range(100):
    ",".join(map(str, x))

This loop takes about 20 seconds to complete (total, not per cycle). In contrast, 100 cycles of something like elementwise multiplication (x*x) would take less than 1/10 of a second to complete. Clearly the string join operation creates a large performance bottleneck; in my actual application it will dominate total runtime. This makes me wonder, is there a faster way than ",".join(map(str, x))? Since map() is where almost all the processing time occurs, this comes down to the question of whether there is a faster way to convert a very large number of numbers to strings.
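For reference, the comparison described above can be reproduced with a minimal timing harness like the following (a sketch using the standard-library timeit module; exact numbers will vary by machine and Python/NumPy version):

```python
import timeit
import numpy as np

x = np.random.randn(100000)

# Time one pass of the string join (the question runs 100 passes)
t_join = timeit.timeit(lambda: ",".join(map(str, x)), number=1)

# Time one pass of elementwise multiplication for comparison
t_mul = timeit.timeit(lambda: x * x, number=1)

print(f"join: {t_join:.4f}s  multiply: {t_mul:.6f}s")
```

On a typical machine the join is orders of magnitude slower than the multiplication, which is the bottleneck the question describes.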

Abiel asked Apr 27 '10 13:04



2 Answers

A little late, but this is faster for me:

# generate an array of strings
x_arrstr = np.char.mod('%f', x)
# combine into a single string
x_str = ",".join(x_arrstr)

The speed-up on my machine is about 1.5x.
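Put together as a runnable sketch (x is assumed to be the array from the question; note that '%f' formats with 6 decimal places, so the output differs from what str() produces):

```python
import numpy as np

x = np.random.randn(100000)

# np.char.mod applies the C-style format string elementwise,
# producing an array of strings in one vectorized call
x_arrstr = np.char.mod('%f', x)

# join the resulting array of strings as before
x_str = ",".join(x_arrstr)

print(x_str[:60])
```

If full precision matters, a different format string such as '%.17g' could be used instead of '%f'.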

Markus R answered Oct 16 '22 17:10


Very good writeup on the performance of various string concatenation techniques in Python: http://www.skymind.com/~ocrow/python_string/

I'm a little surprised that some of the latter approaches perform as well as they do, but it looks like you can certainly find something there that will work better than what you're currently doing.

The fastest method mentioned on the site:

Method 6: List comprehensions

def method6():
    return ''.join([`num` for num in xrange(loop_count)])

This method is the shortest. I'll spoil the surprise and tell you it's also the fastest. It's extremely compact, and also pretty understandable. Create a list of numbers using a list comprehension and then join them all together. Couldn't be simpler than that. This is really just an abbreviated version of Method 4, and it consumes pretty much the same amount of memory. It's faster though because we don't have to call the list.append() function each time round the loop.
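The quoted snippet is Python 2 code (backtick repr syntax and xrange). A Python 3 equivalent, as a sketch with an assumed loop_count, would be:

```python
loop_count = 100000

def method6():
    # repr(num) replaces the Python 2 backtick syntax; range replaces xrange
    return ''.join([repr(num) for num in range(loop_count)])

s = method6()
```

The same idea applies to the original question: build the list of strings in a comprehension and hand it to join in one call, rather than appending in a loop.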

sblom answered Oct 16 '22 15:10