If I have a dictionary containing lists in one or more of its values:
data = {
    'a': 0,
    'b': 1,
    'c': [0, 1, 2],
    'pair': ['one', 'two']
}
How can I get a list of tuples of dicts, grouped by pair and iterating over c, with everything else remaining constant? E.g.
output = [
    ({
        'a': 0,
        'b': 1,
        'c': 0,
        'pair': 'one'
    },
    {
        'a': 0,
        'b': 1,
        'c': 0,
        'pair': 'two'
    }),
    ({
        'a': 0,
        'b': 1,
        'c': 1,
        'pair': 'one'
    },
    ...
]
Well, this doesn't feel especially elegant, but you might use a nested for loop or list comprehension:
output = []
for i in data['c']:
    # Iterate over data['pair'], not over the keys of data
    output.append(tuple({'a': 0, 'b': 1, 'c': i, 'pair': p} for p in data['pair']))
or
output = [tuple({'a': 0, 'b': 1, 'c': i, 'pair': p} for p in data['pair']) for i in data['c']]
A cleaner solution might separate out the generation of the component dict into a function, like this:
def gen_output_dict(c, pair):
    return {'a': 0, 'b': 1, 'c': c, 'pair': pair}

output = []
for i in data['c']:
    output.append(tuple(gen_output_dict(i, p) for p in data['pair']))
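The versions above hard-code 'a' and 'b'. As a minimal sketch, assuming 'c' and 'pair' are the only keys that need expanding, the other entries could be copied from data instead. The helper name expand_pairs is made up for illustration, and the {**d, ...} syntax needs Python 3.5 or later.

```python
data = {'a': 0, 'b': 1, 'c': [0, 1, 2], 'pair': ['one', 'two']}

def expand_pairs(d):
    # Hypothetical helper: one tuple per value of 'c',
    # one dict per value of 'pair' inside each tuple.
    # {**d, ...} copies every key of d, then overrides 'c' and 'pair'.
    return [
        tuple({**d, 'c': c, 'pair': p} for p in d['pair'])
        for c in d['c']
    ]

output = expand_pairs(data)
print(output[0])
# ({'a': 0, 'b': 1, 'c': 0, 'pair': 'one'}, {'a': 0, 'b': 1, 'c': 0, 'pair': 'two'})
```

The original data is left untouched, since each output dict is built fresh.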
You can use itertools.product on the list values and keep track of the key from which each element originated. Since the key 'pair' has a special meaning, you should treat it separately.
from itertools import product

def unzip_dict(d):
    # Keys whose values are lists, except the special key 'pair'
    keys = [k for k, v in d.items() if isinstance(v, list) and k != 'pair']
    value_lists = [d[k] for k in keys]
    # One tuple per combination of list values, one dict per 'pair' value
    for combo in product(*value_lists):
        yield tuple({**d, **dict(zip(keys, combo)), 'pair': pair} for pair in d['pair'])

data = {
    'a': 0,
    'c': [1, 2],
    'pair': ['one', 'two']
}

print(*unzip_dict(data))
({'a': 0, 'c': 1, 'pair': 'one'}, {'a': 0, 'c': 1, 'pair': 'two'})
({'a': 0, 'c': 2, 'pair': 'one'}, {'a': 0, 'c': 2, 'pair': 'two'})
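To see how this generalises, here is a sketch with a second list-valued key. The list under 'b' is an assumption added for illustration and is not part of the original question; the function is repeated so the snippet runs on its own.

```python
from itertools import product

def unzip_dict(d):
    # Same logic as above, repeated so this snippet is self-contained.
    keys = [k for k, v in d.items() if isinstance(v, list) and k != 'pair']
    value_lists = [d[k] for k in keys]
    for combo in product(*value_lists):
        yield tuple({**d, **dict(zip(keys, combo)), 'pair': pair} for pair in d['pair'])

# 'b' holding a list is a made-up input to show the Cartesian product.
data = {'a': 0, 'b': [10, 20], 'c': [1, 2], 'pair': ['one', 'two']}

result = list(unzip_dict(data))
print(len(result))  # 4: two values of 'b' times two values of 'c'
```

Each of the four tuples still holds one dict per entry in 'pair', so the number of tuples grows with the product of the list lengths.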
The following is quite an extended solution:
data = {
    'a': 0,
    'b': 1,
    'c': [0, 1, 2],
    'pair': ['one', 'two']
}

# Get the length of the longest sequence
length = max(map(lambda x: len(x) if isinstance(x, list) else 1, data.values()))

# Loop through the data and change scalars to sequences
# while also making sure that smaller sequences are stretched to match
# or exceed the length of the longest sequence
for k, v in data.items():
    if isinstance(v, list):
        data[k] = v * int(round(length/len(v), 0))
    else:
        data[k] = [v] * length

# Create a dictionary to keep track of which outputs
# need to end up in which tuple
seen = dict.fromkeys(data.get('pair'), 0)
output = [tuple()] * len(seen)

# Loop through the data and place dictionaries in their
# corresponding tuples.
for v in zip(*data.values()):
    d = dict(zip(data, v))
    output[seen[d.get('pair')]] += (d,)
    seen[d.get('pair')] += 1

print(output)
The idea is to convert the scalars in your data to sequences whose lengths match that of the longest sequence in the original data. So the first step is to store the size of the longest sequence in the variable length. With that known, we loop through the original data, stretching the existing sequences to match the longest one and converting scalars to sequences.
Once that's done, we move on to generating the output variable. First, we create a dictionary called seen, which helps us both build the list of tuples and keep track of which group of dictionaries ends up in which tuple.
This then allows us to run one final loop that places each dictionary in its corresponding tuple.
The current output looks like the following (note that the grouping differs from the one requested in the question):
[({'a': 0, 'b': 1, 'c': 0, 'pair': 'one'},
{'a': 0, 'b': 1, 'c': 1, 'pair': 'two'}),
({'a': 0, 'b': 1, 'c': 2, 'pair': 'one'},)]
Please let me know if you need any more clarifying details. Otherwise, I do hope this serves some purpose.
@r3robertson, you can also try the code below. It is based on a list comprehension and the deepcopy() operation in Python.
See also: shallow copy vs. deepcopy in Python.
import pprint
import copy

data = {
    'a': 0,
    'b': 1,
    'c': [0, 1, 2],
    'pair': ['one', 'two'],
}

def get_updated_dict(data, index, pair_name):
    d = copy.deepcopy(data)
    d.update({'c': index, 'pair': pair_name})
    return d

output = [tuple(get_updated_dict(data, index, pair_name) for pair_name in data['pair']) for index in data['c']]

# Pretty printing the output list.
pprint.pprint(output, indent=4)
Output »
[ ( { 'a': 0, 'b': 1, 'c': 0, 'pair': 'one'},
{ 'a': 0, 'b': 1, 'c': 0, 'pair': 'two'}),
( { 'a': 0, 'b': 1, 'c': 1, 'pair': 'one'},
{ 'a': 0, 'b': 1, 'c': 1, 'pair': 'two'}),
( { 'a': 0, 'b': 1, 'c': 2, 'pair': 'one'},
{ 'a': 0, 'b': 1, 'c': 2, 'pair': 'two'})]
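On the shallow versus deep copy point, here is a minimal sketch of the difference, using a cut-down dict that is an assumption for illustration. In the code above, update() replaces the 'c' and 'pair' entries rather than mutating the lists, so a shallow copy would probably be enough there; deepcopy is the safer choice if nested values might later be modified in place.

```python
import copy

data = {'a': 0, 'c': [0, 1, 2]}

shallow = copy.copy(data)      # new dict, but 'c' is the same list object
deep = copy.deepcopy(data)     # new dict and a new list for 'c'

shallow['c'].append(99)        # mutates the list shared with data
deep['c'].append(-1)           # only touches the deep copy's own list

print(data['c'])               # [0, 1, 2, 99]
print(deep['c'])               # [0, 1, 2, -1]
print(shallow['c'] is data['c'], deep['c'] is data['c'])  # True False
```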
Pretty printing using json module »
Note: the tuples are written out as JSON arrays here, because JSON has no tuple type.
import json
print(json.dumps(output, indent=4))
Output »
[
    [
        {
            "a": 0,
            "c": 0,
            "b": 1,
            "pair": "one"
        },
        {
            "a": 0,
            "c": 0,
            "b": 1,
            "pair": "two"
        }
    ],
    [
        {
            "a": 0,
            "c": 1,
            "b": 1,
            "pair": "one"
        },
        {
            "a": 0,
            "c": 1,
            "b": 1,
            "pair": "two"
        }
    ],
    [
        {
            "a": 0,
            "c": 2,
            "b": 1,
            "pair": "one"
        },
        {
            "a": 0,
            "c": 2,
            "b": 1,
            "pair": "two"
        }
    ]
]
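The tuple-to-list conversion noted above can be checked with a round trip. A minimal sketch using only the standard json module, with a shortened pair of dicts made up for illustration:

```python
import json

pair = ({'c': 0, 'pair': 'one'}, {'c': 0, 'pair': 'two'})

encoded = json.dumps(pair)       # the tuple is written as a JSON array
decoded = json.loads(encoded)    # JSON arrays come back as Python lists

print(type(decoded).__name__)    # list
print(decoded == list(pair))     # True
```

If the tuples matter downstream, they would need to be rebuilt after loading, for example with tuple(decoded).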