I have a dictionary that looks like this:
defaultdict(list, {'Open': ['47.47', '47.46', '47.38', ...], 'Close': ['47.48', '47.45', '47.40', ...], 'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00','2016/11/22 06:58:00', ...]})
My goal is to convert this dictionary to a DataFrame and to use the values under the 'Date' key as the DataFrame's index.
I can do this with the commands below:
    df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])

    0  Date                 Open   Close
    1  2016/11/22 07:00:00  47.47  47.48
    2  2016/11/22 06:59:00  47.46  47.45
    3  2016/11/22 06:58:00  47.38  47.38

    df.index = df.Date

    Date                 Date                 Open   Close
    2016/11/22 07:00:00  2016/11/22 07:00:00  47.47  47.48
    2016/11/22 06:59:00  2016/11/22 06:59:00  47.46  47.45
    2016/11/22 06:58:00  2016/11/22 06:58:00  47.38  47.38
But then I have two 'Date' columns: one is the index and the other is the original column.
Is there any way to set the index while converting the dictionary to a DataFrame, so that the result has no duplicated column and looks like this?
    Date                 Close  Open
    2016/11/22 07:00:00  47.48  47.47
    2016/11/22 06:59:00  47.45  47.46
    2016/11/22 06:58:00  47.38  47.38
Thank you for reading this! :)
Set index by keeping old index: set_index() sets a new index on the DataFrame. It can also extend the existing index instead of replacing it. To do that, pass append=True to DataFrame.set_index(), which adds the new index as an extra level alongside the existing one.
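As a minimal sketch of the difference between replacing and appending, assuming a small two-row frame made up for illustration (the values reuse the question's column names but are not the asker's full data):

```python
import pandas as pd

# Hypothetical two-row sample, reusing the column names from the question.
df = pd.DataFrame({
    'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00'],
    'Open': ['47.47', '47.46'],
    'Close': ['47.48', '47.45'],
})

# Default behaviour: 'Date' replaces the existing integer index
# and is removed from the columns.
replaced = df.set_index('Date')

# append=True keeps the old integer index and adds 'Date' as a second level.
appended = df.set_index('Date', append=True)

print(list(replaced.index.names))
print(list(appended.index.names))
```

For the question above, the default (replacing) behaviour is the one that avoids the duplicated 'Date' column.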
You can convert a dictionary to a pandas DataFrame with the pd.DataFrame.from_dict() class method, for example df = pd.DataFrame.from_dict(my_dict).
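A short sketch of from_dict() combined with set_index(), assuming a defaultdict shaped like the one in the question. The name my_dict and the two sample rows are placeholders chosen for illustration:

```python
from collections import defaultdict

import pandas as pd

# Hypothetical defaultdict(list) filled with two sample rows.
my_dict = defaultdict(list)
my_dict['Date'] += ['2016/11/22 07:00:00', '2016/11/22 06:59:00']
my_dict['Open'] += ['47.47', '47.46']
my_dict['Close'] += ['47.48', '47.45']

# from_dict() treats each key as a column by default (orient='columns');
# chaining set_index() moves 'Date' out of the columns and into the index.
df = pd.DataFrame.from_dict(my_dict).set_index('Date')

print(df)
```

Because set_index() drops the column it promotes, the resulting frame has only the Open and Close columns.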
Dictionaries are sometimes found in other languages as “associative memories” or “associative arrays”. Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys.
Use set_index:

    df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])
    df = df.set_index('Date')
    print(df)
                          Open  Close
    Date
    2016/11/22 07:00:00  47.47  47.48
    2016/11/22 06:59:00  47.46  47.45
    2016/11/22 06:58:00  47.38  47.40
Or use inplace:

    df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])
    df.set_index('Date', inplace=True)
    print(df)
                          Open  Close
    Date
    2016/11/22 07:00:00  47.47  47.48
    2016/11/22 06:59:00  47.46  47.45
    2016/11/22 06:58:00  47.38  47.40
Another possible solution is to filter the Date key out of the dict and then set the index from dictionary['Date']:

    df = pd.DataFrame({k: v for k, v in dictionary.items() if k != 'Date'},
                      index=dictionary['Date'],
                      columns=['Open', 'Close'])
    df.index.name = 'Date'
    print(df)
                          Open  Close
    Date
    2016/11/22 07:00:00  47.47  47.48
    2016/11/22 06:59:00  47.46  47.45
    2016/11/22 06:58:00  47.38  47.40
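To check that the three approaches agree, here is a self-contained sketch. It assumes a plain dict holding only the three rows visible in the question (the elided "..." values are not filled in), and the variable names a, b and c are illustrative only:

```python
import pandas as pd

# Only the three rows shown in the question; the elided values are left out.
dictionary = {
    'Open': ['47.47', '47.46', '47.38'],
    'Close': ['47.48', '47.45', '47.40'],
    'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00', '2016/11/22 06:58:00'],
}

# Approach 1: set_index returning a new frame.
a = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close']).set_index('Date')

# Approach 2: set_index modifying the frame in place.
b = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])
b.set_index('Date', inplace=True)

# Approach 3: pass the dates as the index at construction time.
c = pd.DataFrame({k: v for k, v in dictionary.items() if k != 'Date'},
                 index=dictionary['Date'],
                 columns=['Open', 'Close'])
c.index.name = 'Date'

# Compare the three results (values and axis labels).
same = a.equals(b) and a.equals(c)
print(same)
```

Only the third approach sets the index during construction, which is what the question literally asks for; the first two build the frame and then move the column into the index.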