I have a list of data that looks like the following:
    // timestep,x_position,y_position
    0,4,7
    0,2,7
    0,9,5
    0,6,7
    1,2,5
    1,4,7
    1,9,0
    1,6,8
... and I want to make this look like:
    0, (4,7), (2,7), (9,5), (6,7)
    1, (2,5), (4,7), (9,0), (6,8)
My plan was to use a dictionary, where the value of t is the key for the dictionary, and the value against the key would be a list. I could then append each (x,y) to the list. Something like:
    # where t = 0, c = (4,7), d = {}
    # code 1
    d[t].append(c)
Now this causes IDLE to fail. However, if I do:
    # code 2
    d[t] = []
    d[t].append(c)
... this works.
So the question is: why does code 2 work, but code 1 doesn't?
PS Any improvement on what I'm planning on doing would be of great interest!! I think I will have to check the dictionary on each loop through the input to see if the dictionary key already exists, I guess by using something like max(d.keys()): if it is there, append data, if not create the empty list as the dictionary value, and then append data on the next loop through.
Let's look at d[t].append(c). What is the value of d[t]? Try it.
    d = {}
    t = 0
    d[t]
What do you get? Oh. There's nothing in d that has a key of t.
Now try this.
    d[t] = []
    d[t]
Ahh. Now there's something in d with a key of t.
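To make the difference concrete, here is a minimal sketch that runs both versions from the question in one script. The try/except wrapper and the missing_key name are additions for illustration only; they are not part of the original code.

```python
d = {}
t = 0
c = (4, 7)

# Code 1: the key 0 has never been assigned, so the lookup d[t] fails
# before .append() is ever called.
try:
    d[t].append(c)
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError for key:", missing_key)  # KeyError for key: 0

# Code 2: assign an empty list first, then the lookup succeeds.
d[t] = []
d[t].append(c)
print(d)  # {0: [(4, 7)]}
```

The failure is in the lookup, not in append: a plain dict raises KeyError for any key that has not been assigned yet.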
There are several things you can do.
Use code 2. It does work.
Use setdefault: d.setdefault(t, []).append(c)
Use collections.defaultdict: a defaultdict(list) instead of a simple dictionary, {}.
Edit 1. Optimization
Given input lines from a file in the above form: ts, x, y, the grouping process is needless. There's no reason to go from a simple list of ( ts, x, y ) to a more complex list of ( ts, (x,y), (x,y), (x,y), ... ). The original list can be processed exactly as it arrived.
    import collections
    d = collections.defaultdict(list)
    for ts, x, y in someFileOrListOrQueryOrWhatever:
        d[ts].append((x, y))
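As a rough end-to-end sketch of that loop, the following reads the sample rows from the question and prints them in the layout the question asks for. It assumes the input is comma-separated text; io.StringIO stands in for a real file, and the names raw and lines are illustrative only.

```python
import collections
import csv
import io

# Sample data copied from the question; io.StringIO stands in for a real file.
raw = io.StringIO("0,4,7\n0,2,7\n0,9,5\n0,6,7\n1,2,5\n1,4,7\n1,9,0\n1,6,8\n")

d = collections.defaultdict(list)
for ts, x, y in csv.reader(raw):
    # csv.reader yields strings, so convert each field to int.
    d[int(ts)].append((int(x), int(y)))

# Build one output line per timestep, in ascending timestep order.
lines = []
for ts in sorted(d):
    points = ", ".join(f"({x},{y})" for x, y in d[ts])
    lines.append(f"{ts}, {points}")

print("\n".join(lines))
# 0, (4,7), (2,7), (9,5), (6,7)
# 1, (2,5), (4,7), (9,0), (6,8)
```

No check for an existing key is needed anywhere in the loop; the defaultdict creates the empty list the first time each timestep is seen.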
Edit 2. Answer Question
"when initialising a dictionary, you need to tell the dictionary what the key-value data structure will look like?"
I'm not sure what the question means. Since all dictionaries are key-value structures, the question isn't very clear. So I'll review the three alternatives, which may answer it.
Example 2.
Initialization
d= {}
Use
    if t not in d:
        d[t] = list()
    d[t].append(c)
Each dictionary value must be initialized to some useful structure. In this case, we check to see if the key is present; when the key is missing, we create the key and assign an empty list.
Setdefault
Initialization
d= {}
Use
d.setdefault(t,list()).append( c )
In this case, we exploit the setdefault method to either fetch a value associated with a key or create a new value associated with a missing key.
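A small sketch of both behaviours of setdefault, using the sample points from the question. The variable names first and second are illustrative only.

```python
d = {}

# Missing key: setdefault stores the default under the key and returns it.
first = d.setdefault(0, [])
first.append((4, 7))

# Existing key: the stored list is returned and the new default is discarded.
second = d.setdefault(0, [])
second.append((2, 7))

print(first is second)  # True
print(d)                # {0: [(4, 7), (2, 7)]}
```

One thing to keep in mind with this approach: the default argument is evaluated on every call, so a fresh empty list is built each time even when the key already exists.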
default dict
Initialization
    import collections
    d = collections.defaultdict(list)
Use
d[t].append( c )
The defaultdict uses an initializer function for missing keys. In this case, we provide the list function so that a new, empty list is created for a missing key.
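One consequence worth knowing, sketched below: the initializer only runs on a plain d[key] lookup, and that lookup also stores the new empty list under the key. Membership tests and get() leave the dictionary unchanged. The keys 0 and 1 are just the timesteps from the question.

```python
import collections

d = collections.defaultdict(list)
d[0].append((4, 7))

# Membership tests and .get() do not call the factory, so no key is created.
print(1 in d)     # False
print(d.get(1))   # None

# A plain lookup does call the factory and stores the new empty list.
_ = d[1]
print(sorted(d))  # [0, 1]
```

So if the code later needs to tell "timestep never seen" apart from "timestep seen with no points", test with `in` rather than indexing.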