Given a Django queryset like the following:
qs = A.objects.all().values_list('A', 'B', 'C', 'D', 'E', 'F')
I can convert my qs to a pandas dataframe easily:
df = pd.DataFrame.from_records(qs.values('A', 'B', 'C', 'D', 'E', 'F'))
However, the column order is not maintained. Immediately after conversion I need to specify the new order of columns and I'm not clear why:
df = df.columns['B', 'F', 'C', 'E', 'D', 'A']
Why is this happening and what can I do differently to avoid having to set the dataframe columns explicitly?
Pandas. DataFrame doesn't preserve the column order when converting from a DataFrames.
You need to create a new list of your columns in the desired order, then use df = df[cols] to rearrange the columns in this new order.
This is because a Django QuerySet is a lazy object. It contains all of the information it needs to populate itself from the database, but will not actually do so until the information is needed.
qs.values()
converts the QuerySet into a dictionary, which is unordered. You are OK with qs.values_list()
, which returns a list of tuples.
Try:
df = pd.DataFrame.from_records(
A.objects.all().values_list('A', 'B', 'C', 'D', 'E', 'F')
)
check the docs about Django's QuerySets
try:
df = pd.DataFrame.from_records("DATA_GOES_HERE", columns=['A','B','C'.. etc.)
I'm using the columns=
parameter found here.
I believe you could also construct the DataFrame by just using pd.DataFrame and put your lists in there with the corresponding column names. This may be more manual work up-front, but if this is for an automated job it could work as well. (may have the ordering issue here again, but can easily be solved by rearranging the columns.. Again, may be more work upfront)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With