Get a decision tree in a dictionary

I am looking for a way in Python to dynamically build a dictionary of dictionaries from a desired structure.

I have the data below:

{'weather': ['windy', 'calm'], 'season': ['summer', 'winter', 'spring', 'autumn'],  'lateness': ['ontime', 'delayed']} 

I specify the structure, i.e. the order in which the keys should be nested:

['weather', 'season', 'lateness']

and finally get the data in this format:

{'calm': {'autumn': {'delayed': 0, 'ontime': 0},
          'spring': {'delayed': 0, 'ontime': 0},
          'summer': {'delayed': 0, 'ontime': 0},
          'winter': {'delayed': 0, 'ontime': 0}},
 'windy': {'autumn': {'delayed': 0, 'ontime': 0},
           'spring': {'delayed': 0, 'ontime': 0},
           'summer': {'delayed': 0, 'ontime': 0},
           'winter': {'delayed': 0, 'ontime': 0}}}

This is the manual approach I came up with:

dtree = {}
for cat1 in category_cases['weather']:
    dtree.setdefault(cat1, {})
    for cat2 in category_cases['season']:
        dtree[cat1].setdefault(cat2, {})
        for cat3 in category_cases['lateness']:
            dtree[cat1][cat2].setdefault(cat3, 0)

Is there a way to get the desired result just by changing the structure list I wrote? Keep in mind that the structure might not be the same size every time.

If there is another way besides dictionaries to store and access the result, that would also work for me.

bkbilly asked May 16 '20 04:05


2 Answers

If you're not averse to using external packages, a pandas.DataFrame might be a viable candidate, since it looks like you'll be working with a table:

import pandas as pd
df = pd.DataFrame(
       index=pd.MultiIndex.from_product([d['weather'], d['season']]),
       columns=d['lateness'], data=0
     )

Result:

              ontime  delayed
windy summer       0        0
      winter       0        0
      spring       0        0
      autumn       0        0
calm  summer       0        0
      winter       0        0
      spring       0        0
      autumn       0        0

And you can easily make changes with indexing:

df.loc[('windy', 'summer'), 'ontime'] = 1
# Assign through a single .loc call; chained indexing may write to a copy
df.loc[('calm', 'autumn'), 'delayed'] = 2

# Result:
              ontime  delayed
windy summer       1        0
      winter       0        0
      spring       0        0
      autumn       0        0
calm  summer       0        0
      winter       0        0
      spring       0        0
      autumn       0        2

The table can be constructed dynamically if you always use the last key for the columns, assuming your keys are in the desired insertion order:

df = pd.DataFrame(
       index=pd.MultiIndex.from_product(list(d.values())[:-1]), 
       columns=list(d.values())[-1], data=0
     )

Since you're interested in pandas, given your structure, I would also recommend reading up on MultiIndex and advanced indexing, just to get an idea of how to work with your data. Here are some examples:

# Selects the 'delayed' column for all of the 'calm' rows
df.loc['calm', 'delayed']

# summer    5
# winter    0
# spring    0
# autumn    2
# Name: delayed, dtype: int64

# Apply a sum:
df.loc['calm', 'delayed'].sum()

# 7

# Gets the mean of all 'summer' (notice the `slice(None)` is required to return all of the 'calm' and 'windy' group)
df.loc[(slice(None), 'summer'), :].mean()

# ontime     0.5
# delayed    2.5
# dtype: float64

It is very handy and versatile, but the framework takes some getting used to, so you will want to read up on it before you get too deep into it.
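
If the pandas learning curve is a concern, the tuple-style addressing of a MultiIndex can be roughly approximated with the standard library. The following is only a sketch, assuming the data and structure list from the question; the names `order`, `counts` and `calm_delayed` are made up for illustration. It keeps one flat dictionary keyed by tuples instead of nesting:

```python
from itertools import product

category_cases = {
    'weather': ['windy', 'calm'],
    'season': ['summer', 'winter', 'spring', 'autumn'],
    'lateness': ['ontime', 'delayed'],
}
order = ['weather', 'season', 'lateness']  # hypothetical name for the structure list

# One flat dict keyed by tuples such as ('windy', 'summer', 'ontime').
counts = {combo: 0 for combo in product(*(category_cases[key] for key in order))}

counts['windy', 'summer', 'ontime'] += 1
counts['calm', 'autumn', 'delayed'] += 2

# Total of every 'calm' entry whose lateness is 'delayed', whatever the season.
# The three-way unpacking assumes the three-level structure above.
calm_delayed = sum(
    value for (weather, season, lateness), value in counts.items()
    if weather == 'calm' and lateness == 'delayed'
)
print(len(counts))   # 16
print(calm_delayed)  # 2
```

Changing `order` changes the shape of the keys without touching the construction line, at the cost of having to filter keys by hand where pandas would slice.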


Otherwise, if you still prefer a dict, there's nothing wrong with that. Here's a recursive function that generates it from the given keys (assuming your keys are in the desired insertion order):

def gen_dict(d, level=0):
    if level >= len(d):
        return 0
    key = tuple(d.keys())[level]
    return {val: gen_dict(d, level+1) for val in d.get(key)}

gen_dict(d)

Result:

{'calm': {'autumn': {'delayed': 0, 'ontime': 0},
          'spring': {'delayed': 0, 'ontime': 0},
          'summer': {'delayed': 0, 'ontime': 0},
          'winter': {'delayed': 0, 'ontime': 0}},
 'windy': {'autumn': {'delayed': 0, 'ontime': 0},
           'spring': {'delayed': 0, 'ontime': 0},
           'summer': {'delayed': 0, 'ontime': 0},
           'winter': {'delayed': 0, 'ontime': 0}}}
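
The function above takes the nesting order from the dictionary's own key order. Since the question asks for the order to come from a separate structure list, here is a minimal sketch of a variant that accepts that list explicitly. The name `build_tree` and the parameters `order` and `leaf` are hypothetical, not part of the original answer:

```python
def build_tree(cases, order, leaf=0):
    """Nest cases[order[0]] -> cases[order[1]] -> ... with `leaf` at the bottom."""
    if not order:
        return leaf
    first, rest = order[0], order[1:]
    # Each recursive call builds a fresh dict, so branches never share state.
    return {value: build_tree(cases, rest, leaf) for value in cases[first]}

category_cases = {
    'weather': ['windy', 'calm'],
    'season': ['summer', 'winter', 'spring', 'autumn'],
    'lateness': ['ontime', 'delayed'],
}

dtree = build_tree(category_cases, ['weather', 'season', 'lateness'])
dtree['windy']['summer']['ontime'] += 1
print(dtree['windy']['summer'])  # {'ontime': 1, 'delayed': 0}
print(dtree['calm']['summer'])   # {'ontime': 0, 'delayed': 0}

# A shorter or reordered structure list works the same way.
print(build_tree(category_cases, ['lateness', 'weather']))
# {'ontime': {'windy': 0, 'calm': 0}, 'delayed': {'windy': 0, 'calm': 0}}
```

Because the order is passed in, the input dictionary can hold extra keys that a given structure list simply does not mention.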

r.ook answered Oct 13 '22 00:10


I think this might work for you.

def get_output(category, order, i=0):
    output = {}
    for key in order[i:i+1]:
        for value in category[key]:
            output[value] = get_output(category, order, i+1)
    if output == {}:
        return 0
    return output
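
Since the depth of the tree depends on the structure list, hard-coding `tree[a][b][c]` at the call site stops working as soon as the list changes length. One possible way around that, sketched here with hypothetical helper names `get_leaf` and `add_to_leaf` and a small hand-written tree standing in for the generated one, is to address a node by a list of keys:

```python
from functools import reduce

def get_leaf(tree, path):
    """Follow the keys in `path` down the nested dicts and return what is there."""
    return reduce(lambda node, key: node[key], path, tree)

def add_to_leaf(tree, path, amount=1):
    """Add `amount` to the counter at the end of `path`."""
    *parents, last = path
    parent = get_leaf(tree, parents)
    parent[last] += amount

# Small stand-in for a generated tree.
tree = {
    'windy': {'summer': {'ontime': 0, 'delayed': 0}},
    'calm': {'summer': {'ontime': 0, 'delayed': 0}},
}

add_to_leaf(tree, ['windy', 'summer', 'delayed'])
add_to_leaf(tree, ['windy', 'summer', 'delayed'], amount=2)
print(get_leaf(tree, ['windy', 'summer', 'delayed']))  # 3
print(get_leaf(tree, ['calm', 'summer']))  # {'ontime': 0, 'delayed': 0}
```

The same two helpers work for a tree of any depth, as long as the path has one key per level.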

Faris answered Oct 12 '22 23:10