pandas.DataFrame.from_dict not preserving order using OrderedDict

I want to import OData XML datafeeds from the Dutch Bureau of Statistics (CBS) into our database. Using lxml and pandas I thought this should be straightforward. By using OrderedDict I want to preserve the order of the columns for readability, but somehow I can't get it right.

from collections import OrderedDict
from lxml import etree
import requests
import pandas as pd

# CBS URLs
base_url = 'http://opendata.cbs.nl/ODataFeed/odata'
datasets = ['/37296ned', '/82245NED']

feed = requests.get(base_url + datasets[1] + '/TypedDataSet')
root = etree.fromstring(feed.content)

# all record entries start at tag m:properties, parse into data dict
data = []
for record in root.iter('{{{}}}properties'.format(root.nsmap['m'])):
    row = OrderedDict()
    for element in record:
        row[element.tag.split('}')[1]] = element.text
    data.append(row)

df = pd.DataFrame.from_dict(data)
df.columns

Inspecting data, each OrderedDict is in the right order. But looking at df.head(), the columns have been sorted alphabetically, with names starting with a capital letter first.

Help, anyone?

asked Nov 17 '15 09:11

dkapitan

1 Answer

Something in your example seems to be inconsistent, as data is a list and not a dict, but assuming you really have an OrderedDict:

Try to explicitly specify your column order when you create your DataFrame:

# ... all your data collection
df = pd.DataFrame(data, columns=data.keys())

This should give you a DataFrame whose columns are in exactly the same order as the keys of the OrderedDict (via the list generated by data.keys()).
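Since data in the question is a list of OrderedDict rows rather than a single dict, data.keys() would not work on it directly. A minimal sketch of the same idea for that case follows. The sample rows and the helper name ordered_columns are hypothetical, made up for illustration and not taken from the CBS feed; the helper collects keys in first-seen order so that rows with missing keys are handled too.

```python
from collections import OrderedDict

import pandas as pd


def ordered_columns(rows):
    """Return all keys found in rows, in the order they are first seen.

    Hypothetical helper for illustration; not part of pandas.
    """
    seen = OrderedDict()
    for row in rows:
        for key in row:
            seen.setdefault(key, None)
    return list(seen)


# Made-up sample rows standing in for the parsed feed records
data = [
    OrderedDict([("Perioden", "2014JJ00"), ("aantal", "10"), ("ID", "0")]),
    OrderedDict([("Perioden", "2015JJ00"), ("aantal", "12"), ("ID", "1")]),
]

# Pass the column order explicitly instead of relying on the default
columns = ordered_columns(data)
df = pd.DataFrame(data, columns=columns)
print(list(df.columns))
```

If every row is known to have the same keys, list(data[0].keys()) would be a shorter way to get the same column list.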

answered Sep 28 '22 15:09

chris-sc
