I want to copy some tabular data from Excel into a python array. That is, user willselect a range in an Excel table, press "Copy" (CTRL+C) so that the range will be copied to clipboard. Then I will get this clipboard data into a python array (list). I use win32clipboard
from pywin32
to get clipboard data into an array:
import win32clipboard
def getClip():
win32clipboard.OpenClipboard()
data = win32clipboard.GetClipboardData()
win32clipboard.CloseClipboard()
return data
I copy the following range A1:B5
from Excel:
When I use the function above, I get a string like:
'365\t179\r\n96\t-90\r\n48\t-138\r\n12\t-174\r\n30\t-156\r\n'
How to split this string into a list, so that the list will look like:
[(365,179), (96, -90), (48, -138), (12, -174), (30, -156)]
I use split
method, but it doesn't give me what I want.
data.split("\n")
['365\t179\r', '96\t-90\r', '48\t-138\r', '12\t-174\r', '30\t-156\r', '']
There’s actually a str.splitlines
method which will split the string by line breaks, regardless of which line breaks are used. So this works on Unix systems with just an \n
, on Windows with \r\n
and even on old Mac systems where the line break was just an \r
.
>>> s = '365\t179\r\n96\t-90\r\n48\t-138\r\n12\t-174\r\n30\t-156\r\n'
>>> s.splitlines()
['365\t179', '96\t-90', '48\t-138', '12\t-174', '30\t-156']
Once you have this result, you can split by tabs to get the individual cells. So you essentially have to call cell.split('\t')
on each cell. This is best done with a list comprehension:
>>> [row.split('\t') for row in s.splitlines()]
[['365', '179'], ['96', '-90'], ['48', '-138'], ['12', '-174'], ['30', '-156']]
As an alternative, you could also use map
to apply the splitting operation on each cell:
>>> list(map(lambda cell: cell.split('\t'), s.splitlines()))
[['365', '179'], ['96', '-90'], ['48', '-138'], ['12', '-174'], ['30', '-156']]
As the copied data in the clipboard will always have the rows separated by newlines, and the columns separated by tabs, this solution is also safe to use for any range of cells you copied.
If you further want to convert integers or float to its correct datatypes in Python, I guess you could add some more conversion logic by calling int()
on all cells that only have digits in them, float()
on all cells that have digits and the dot in them .
, leaving the rest as strings:
>>> def convert (cell):
try:
return int(cell)
except ValueError:
try:
return float(cell)
except ValueError:
return cell
>>> [tuple(map(convert, row.split('\t'))) for row in s.splitlines()]
[(365, 179), (96, -90), (48, -138), (12, -174), (30, -156)]
For a different string:
>>> s = 'Foo\tbar\r\n123.45\t42\r\n-85\t3.14'
>>> [tuple(map(convert, row.split('\t'))) for row in s.splitlines()]
[('Foo', 'bar'), (123.45, 42), (-85, 3.14)]
>>> s = '365\t179\r\n96\t-90\r\n48\t-138\r\n12\t-174\r\n30\t-156\r\n'
>>> [map(int, x.split('\t')) for x in s.rstrip().split('\r\n')]
[[365, 179], [96, -90], [48, -138], [12, -174], [30, -156]]
Using the code from my other answer, you can also handle other types as well:
from ast import literal_eval
def solve(x):
try:
return literal_eval(x)
except (ValueError, SyntaxError):
return x
s = '365\tFoo\r\nBar\t-90.01\r\n48\tspam\r\n12e10\t-174\r\n30\t-156\r\n'
print [map(solve, x.split('\t')) for x in s.rstrip().split('\r\n')]
#[[365, 'Foo'], ['Bar', -90.01], [48, 'spam'], [120000000000.0, -174], [30, -156]]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With