I have multiple files which I process using Numpy and SciPy, but I am required to deliver an Excel file. How can I efficiently copy/paste a huge numpy array to Excel?
I have tried converting it to a pandas DataFrame, which has the very useful method to_clipboard(excel=True), but I spend most of my time converting the array into a DataFrame.
I cannot simply write the array to a CSV file and then open it in Excel, because I have to add the array to an existing file, which is very hard to achieve with xlrd/xlwt and other Excel tools.
Use NumPy's savetxt() function to save a NumPy array to a CSV file. CSV files are easy to share and view, so converting an array to CSV is often useful. CSV stands for comma-separated values; such files can be viewed in Excel or any text editor, whereas viewing a NumPy array object requires Python.
savetxt() saves an array to a text file. Create an array, then save it as a CSV file.
savetxt() takes a file name and an array as arguments; pass delimiter="," so that the values are written comma-separated.
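As a minimal sketch of the savetxt() route described above (the file name array.csv, the temporary directory, and the "%g" format are assumptions chosen for illustration, not part of the original question):

```python
import os
import tempfile

import numpy as np

arr = np.arange(6, dtype=float).reshape(2, 3)

# Hypothetical output location; any writable path works.
csv_path = os.path.join(tempfile.mkdtemp(), "array.csv")

# delimiter="," makes the output comma-separated (the default is a space);
# fmt="%g" is an assumed format that drops trailing zeros.
np.savetxt(csv_path, arr, delimiter=",", fmt="%g")

with open(csv_path) as handle:
    csv_text = handle.read()
```

This produces a standalone CSV file, so it does not by itself solve the problem of adding the array to an existing workbook.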
Export data to Excel with the DataFrame.to_excel() function in Python. If we want to write tabular data to an Excel sheet in Python, we can use the to_excel() method of a pandas DataFrame. A pandas DataFrame is a data structure that stores tabular data.
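Since the question is about adding the array to an existing file, here is a rough sketch of how to_excel() might be combined with pandas.ExcelWriter in append mode. It assumes an .xlsx workbook and that the openpyxl package is installed; the function name append_array_as_sheet and its arguments are made up for illustration.

```python
import numpy as np
import pandas as pd


def append_array_as_sheet(arr, workbook_path, sheet_name):
    """Add `arr` as a new sheet to an existing .xlsx workbook (hypothetical helper)."""
    frame = pd.DataFrame(arr)
    # mode="a" opens the existing workbook instead of overwriting it;
    # append mode requires the openpyxl engine.
    with pd.ExcelWriter(workbook_path, mode="a", engine="openpyxl") as writer:
        # Skip the index column and header row so only the raw values are written.
        frame.to_excel(writer, sheet_name=sheet_name, index=False, header=False)
```

The conversion to a DataFrame is still there, so this does not address the speed complaint in the question; it only shows the "existing file" part.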
My best solution here would be to turn the array into a string, then use win32clipboard to send it to the clipboard. This is not a cross-platform solution, but then again, Excel is not available on every platform anyway.
Excel uses tabs (\t) to mark a column change, and \r\n to indicate a line change.
The relevant code would be:
import win32clipboard as clipboard  # part of the pywin32 package

def toClipboardForExcel(array):
    r"""
    Copies an array into a string format acceptable by Excel.
    Columns separated by \t, rows separated by \r\n
    """
    # Create string from array
    line_strings = []
    for line in array:
        line_strings.append("\t".join(line.astype(str)).replace("\n", ""))
    array_string = "\r\n".join(line_strings)

    # Put string into clipboard (open, clear, set, close)
    clipboard.OpenClipboard()
    clipboard.EmptyClipboard()
    clipboard.SetClipboardText(array_string)
    clipboard.CloseClipboard()
I have tested this code with random arrays of shape (1000, 10000), and the biggest bottleneck seems to be passing the data to the function. (When I add a print statement at the beginning of the function, I still have to wait a bit before it prints anything.)
EDIT: The previous paragraph described my experience in Python Tools for Visual Studio. In that environment, it seems that the print statement is delayed. In a direct command-line interface, the bottleneck is in the loop, as expected.
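Since the loop is the bottleneck, one variation worth timing is to let numpy.savetxt build the tab-separated string in memory instead of joining the rows by hand. This is only a sketch: the name array_to_excel_text and the default "%.10g" format are my assumptions, and savetxt still formats row by row internally, so I cannot promise it is faster on your data.

```python
import io

import numpy as np


def array_to_excel_text(array, fmt="%.10g"):
    """Return a 2-D array as text with \\t between columns and \\r\\n between rows."""
    buffer = io.StringIO()
    # savetxt accepts any object with a write() method, not only file names.
    np.savetxt(buffer, array, fmt=fmt, delimiter="\t", newline="\r\n")
    # savetxt ends every row with the newline string; drop the final one.
    return buffer.getvalue().rstrip("\r\n")


excel_text = array_to_excel_text(np.arange(6).reshape(2, 3))
```

The returned string can be passed to clipboard.SetClipboardText in place of array_string in the function above.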
import pandas as pd
pd.DataFrame(arr).to_clipboard()
I think this is one of the easiest ways with the pandas package.