I see a lot of DataFrames posted to StackOverflow that look like:
a dt b
0 -0.713356 2015-10-01 00:00:00 -0.159170
1 -1.636397 2015-10-01 00:30:00 -1.038110
2 -1.390117 2015-10-01 01:00:00 -1.124016
I still haven't figured out a good way to copy these into my interpreter using .read_clipboard
(argument list in .read_table
docs).
I thought the key was the parse_dates
parameter:
parse_dates : boolean or list of ints or names or list of lists or dict, default False
* boolean. If True -> try parsing the index.
* list of ints or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
* list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
* dict, e.g. {‘foo’ : [1, 3]} -> parse columns 1, 3 as date and call result ‘foo’
pd.read_clipboard(parse_dates={'dt': [1, 2]})
raises the exception NotImplementedError: file structure not yet supported
.
When I try skipping the first row pd.read_clipboard(parse_dates=[[1, 2]], names=['a', 'dt1', 'dt2', 'b'], skiprows=1, header=None)
I get the same exception.
How do others do this?
This is what I do. First, ensure that your columns have two spaces in between them:
a dt b
0 -0.713356 2015-10-01 00:00:00 -0.159170
1 -1.636397 2015-10-01 00:30:00 -1.038110
2 -1.390117 2015-10-01 01:00:00 -1.124016
Notice, the datetime column has one space between the date and time. This is important. Next, I use something like this to load it in:
df = pd.read_clipboard(sep='\s{2,}', parse_dates=[1], engine='python')
df
a dt b
0 0 -0.713356 2015-10-01 00:00:00 -0.159170
1 1 -1.636397 2015-10-01 00:30:00 -1.038110
2 2 -1.390117 2015-10-01 01:00:00 -1.124016
df.dtypes
a object
dt datetime64[ns]
b float64
dtype: object
Yes, this isn't a fully automated process, but as long as you're dealing with small dataframes to copy in, it isn't that bad. Although I'm open to seeing better alternatives.
In case it helps someone, here is what I do now:
df = pd.read_clipboard(skiprows=1, names=['a', 'dt1', 'dt2', 'b'])
df['dt'] = pd.to_datetime(df['dt1'] + ' ' + df['dt2'])
df = df[['a', 'dt', 'b']]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With