pandas.DataFrame.from_dict not preserving order using OrderedDict

I want to import OData XML data feeds from the Dutch Bureau of Statistics (CBS) into our database. Using lxml and pandas, I thought this would be straightforward. I use OrderedDict to preserve the order of the columns for readability, but somehow I can't get it right.

    from collections import OrderedDict
    from lxml import etree
    import requests
    import pandas as pd

    # CBS URLs
    base_url = 'http://opendata.cbs.nl/ODataFeed/odata'
    datasets = ['/37296ned', '/82245NED']

    feed = requests.get(base_url + datasets[1] + '/TypedDataSet')
    root = etree.fromstring(feed.content)

    # all record entries start at tag m:properties, parse into data dict
    data = []
    for record in root.iter('{{{}}}properties'.format(root.nsmap['m'])):
        row = OrderedDict()
        for element in record:
            row[element.tag.split('}')[1]] = element.text
        data.append(row)

    df = pd.DataFrame.from_dict(data)
    df.columns

Inspecting data, each OrderedDict is in the right order. But in df.head() the columns have been sorted alphabetically, with uppercase names first.

Help, anyone?

dkapitan asked Nov 17 '15 09:11



1 Answer

Something in your example seems inconsistent: data is a list, not a dict. But assuming you really have an OrderedDict:

Try to explicitly specify your column order when you create your DataFrame:

    # ... all your data collection
    df = pd.DataFrame(data, columns=data.keys())

This should give you a DataFrame whose columns are ordered exactly as they are in the OrderedDict (via the list generated by data.keys()).
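If data is in fact a list of OrderedDict rows, as in the question's code, the same idea can be sketched by taking the column order from the first row. This is a minimal sketch, not a tested fix for the CBS feed: the sample records and their keys below are hypothetical stand-ins for the parsed m:properties entries, and it assumes every row has the same keys in the same order.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical rows standing in for the parsed feed entries.
# The key order is deliberately not alphabetical.
data = [
    OrderedDict([("aantal", "10"), ("Perioden", "2014JJ00"), ("ID", "0")]),
    OrderedDict([("aantal", "12"), ("Perioden", "2015JJ00"), ("ID", "1")]),
]

# Take the column order from the first row and pass it explicitly,
# so the result does not depend on how pandas orders inferred columns.
columns = list(data[0].keys())
df = pd.DataFrame(data, columns=columns)

print(list(df.columns))
```

Passing columns explicitly also restricts the DataFrame to those names, so keys that appear only in later rows would be dropped. If rows can differ, the full ordered set of keys would have to be collected first.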

chris-sc answered Sep 28 '22 15:09
