Get a decision tree in a dictionary

Tags: python, dictionary, structure

I am looking for a way in Python to dynamically build a dictionary of dictionaries based on a desired structure.

I have the data below:

category_cases = {'weather': ['windy', 'calm'], 'season': ['summer', 'winter', 'spring', 'autumn'], 'lateness': ['ontime', 'delayed']}

I specify the order in which I want the keys nested:

['weather', 'season', 'lateness']

and finally get the data in this format:

{'calm': {'autumn': {'delayed': 0, 'ontime': 0},
          'spring': {'delayed': 0, 'ontime': 0},
          'summer': {'delayed': 0, 'ontime': 0},
          'winter': {'delayed': 0, 'ontime': 0}},
 'windy': {'autumn': {'delayed': 0, 'ontime': 0},
           'spring': {'delayed': 0, 'ontime': 0},
           'summer': {'delayed': 0, 'ontime': 0},
           'winter': {'delayed': 0, 'ontime': 0}}}

This is the manual way I came up with to achieve this:

dtree = {}
for cat1 in category_cases['weather']:
    dtree.setdefault(cat1, {})
    for cat2 in category_cases['season']:
        dtree[cat1].setdefault(cat2, {})
        for cat3 in category_cases['lateness']:
            dtree[cat1][cat2].setdefault(cat3, 0)

Can you think of a way that lets me change only the structure list and still get the desired result? Keep in mind that the structure might not be the same size every time.

If you can think of a structure other than dictionaries that lets me access the result, that would also work for me.

asked May 16 '20 04:05

bkbilly

2 Answers

If you're not averse to using external packages, pandas.DataFrame might be a viable candidate, since it looks like you'll be working with a table (d below is the dictionary from your question):

import pandas as pd
df = pd.DataFrame(
       index=pd.MultiIndex.from_product([d['weather'], d['season']]),
       columns=d['lateness'], data=0
     )

Result:

              ontime  delayed
windy summer       0        0
      winter       0        0
      spring       0        0
      autumn       0        0
calm  summer       0        0
      winter       0        0
      spring       0        0
      autumn       0        0

And you can easily make changes with indexing:

df.loc[('windy', 'summer'), 'ontime'] = 1
df.loc[('calm', 'autumn'), 'delayed'] = 2

# Result:
              ontime  delayed
windy summer       1        0
      winter       0        0
      spring       0        0
      autumn       0        0
calm  summer       0        0
      winter       0        0
      spring       0        0
      autumn       0        2

The table can be constructed dynamically if you always use the last key for the columns, assuming your keys are in the desired insertion order:

df = pd.DataFrame(
       index=pd.MultiIndex.from_product(list(d.values())[:-1]), 
       columns=list(d.values())[-1], data=0
     )
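If the nesting order should come from the separate structure list in the question rather than from the dictionary's insertion order, something along these lines might work. This is only a sketch: make_table is a hypothetical helper name, and it assumes the list names at least two keys, with the last one supplying the columns.

```python
import pandas as pd

# Data and ordering copied from the question.
category_cases = {
    'weather': ['windy', 'calm'],
    'season': ['summer', 'winter', 'spring', 'autumn'],
    'lateness': ['ontime', 'delayed'],
}
structure = ['weather', 'season', 'lateness']


def make_table(cases, order, fill=0):
    """Hypothetical helper: every key in `order` except the last becomes a
    row index level; the last key supplies the columns.
    Assumes `order` names at least two keys."""
    *row_keys, col_key = order
    index = pd.MultiIndex.from_product(
        [cases[key] for key in row_keys], names=row_keys
    )
    return pd.DataFrame(fill, index=index, columns=cases[col_key])


df = make_table(category_cases, structure)
# Set a single cell with one .loc call: (row tuple, column).
df.loc[('windy', 'summer'), 'ontime'] = 1
```

Reordering the structure list (for example putting 'season' first) changes the index levels without touching the data dictionary.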

Since you're interested in pandas, and given your structure, I would also recommend reading up on MultiIndex and Advanced Indexing to get an idea of how to work with your data. Here are some examples:

# Filters all the 'delayed' data in 'calm'
# (the outputs below assume ('calm', 'summer') 'delayed' was also set to 5)
df.loc['calm', 'delayed']

# summer    5
# winter    0
# spring    0
# autumn    2
# Name: delayed, dtype: int64

# Apply a sum:
df.loc['calm', 'delayed'].sum()

# 7

# Gets the mean of all 'summer' (notice the `slice(None)` is required to return all of the 'calm' and 'windy' group)
df.loc[(slice(None), 'summer'), :].mean()

# ontime     0.5
# delayed    2.5
# dtype: float64

It is definitely handy and versatile, but you will want to read up on it before you get too deep, as the framework can take some getting used to.

Otherwise, if you still prefer dict, there's nothing wrong with that. Here's a recursive function that generates the nested dict from the given keys (assuming your keys are in the desired insertion order):

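def gen_dict(d, level=0):
    # Past the last key: return the leaf value.
    if level >= len(d):
        return 0
    # Keys are taken in insertion order; `level` selects the current one.
    key = tuple(d.keys())[level]
    return {val: gen_dict(d, level + 1) for val in d[key]}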

gen_dict(d)

Result:

{'calm': {'autumn': {'delayed': 0, 'ontime': 0},
          'spring': {'delayed': 0, 'ontime': 0},
          'summer': {'delayed': 0, 'ontime': 0},
          'winter': {'delayed': 0, 'ontime': 0}},
 'windy': {'autumn': {'delayed': 0, 'ontime': 0},
           'spring': {'delayed': 0, 'ontime': 0},
           'summer': {'delayed': 0, 'ontime': 0},
           'winter': {'delayed': 0, 'ontime': 0}}}

answered Oct 13 '22 00:10

r.ook

I think this might work for you.

def get_output(category, order, i=0):
    output = {}
    for key in order[i:i+1]:
        for value in category[key]:
            output[value] = get_output(category, order, i+1)
    if output == {}:
        return 0
    return output
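The question also mentions being open to something other than nested dictionaries. One possibility, sketched here under the assumption that a flat lookup is acceptable, is a single dictionary keyed by tuples built with itertools.product; flat_counts is a hypothetical name used only for this example.

```python
from itertools import product

# Data and ordering copied from the question.
category_cases = {
    'weather': ['windy', 'calm'],
    'season': ['summer', 'winter', 'spring', 'autumn'],
    'lateness': ['ontime', 'delayed'],
}
structure = ['weather', 'season', 'lateness']


def flat_counts(cases, order, fill=0):
    """One entry per combination of values, keyed by a tuple whose
    positions follow `order`."""
    value_lists = [cases[key] for key in order]
    return {combo: fill for combo in product(*value_lists)}


counts = flat_counts(category_cases, structure)
counts[('calm', 'autumn', 'delayed')] += 2

# Sum every 'delayed' cell for 'calm' by filtering on tuple positions.
calm_delayed = sum(
    count for (weather, _season, lateness), count in counts.items()
    if weather == 'calm' and lateness == 'delayed'
)
```

The trade-off is that a flat dictionary is easy to build for any number of levels, but selecting a whole branch means filtering the keys instead of indexing one level at a time.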

answered Oct 12 '22 23:10

Faris
