I have a huge file (with around 200k inputs). The inputs are in the form:
A B C D
B E F
C A B D
D
I am reading this file and storing it in a list as follows:
text = f.read().split('\n')
This splits the file at every newline, so text looks like this:
['A B C D', 'B E F', 'C A B D', 'D']
I now have to store these values in a dictionary whose keys are the first element of each line, i.e. the keys will be A, B, C, D. I am finding it difficult to store the remaining elements of each line as the values, i.e. the dictionary should look like:
{A: [B, C, D], B: [E, F], C: [A, B, D], D: []}
I have done the following:
inlinkDict = {}
for doc in text:
    adoc = doc.split(' ')
    docid = adoc[0]
    inlinkDict[docid] = inlinkDict.get(docid,0) + {I do not understand what to put in here}
Please help me work out how to add the values to my dictionary. The value should be 0 if the line has no elements other than the one used as the key, as with D in the example.
Lists and dictionaries can both be nested. A list can contain another list. A dictionary can contain another dictionary. A dictionary can also contain a list, and vice versa.
We can have a list of many types in Python, like strings, numbers, and more. Python also allows us to have a list within a list called a nested list or a two-dimensional list.
Method 1: Using the += operator on a key with an empty value. In this method, we use the += operator to append a list to a dictionary value: start with a dictionary whose key holds an empty list, then add the elements to it as a list.
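A minimal sketch of the += approach described above. The dictionary name `inlinks` and its contents are hypothetical, chosen to match the example data in the question; `setdefault` is used here as one way to make sure the key holds a list before += is applied.

```python
# Hypothetical data for illustration only.
inlinks = {"A": []}            # the key starts with an empty list as its value

inlinks["A"] += ["B", "C"]     # += extends the existing list in place
inlinks["A"] += ["D"]

# For a key that may not exist yet, setdefault stores an empty list first.
inlinks.setdefault("B", [])
inlinks["B"] += ["E", "F"]

print(inlinks)
```

Note that += on a key that is missing from the dictionary raises a KeyError, which is why the second key is created with `setdefault` before it is extended.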
To convert a list to a dictionary whose keys all share the same value, you can use the dict.fromkeys() method. To convert two lists into one dictionary, you can use the Python zip() function. A dictionary comprehension lets you create a new dictionary based on the values of a list.
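The three conversions mentioned above can be sketched as follows. The lists `keys` and `values` are hypothetical sample data based on the question, not something from the original thread.

```python
# Hypothetical sample data for illustration only.
keys = ["A", "B", "C"]
values = [["B", "C", "D"], ["E", "F"], ["A", "B", "D"]]

# dict.fromkeys: every key gets the same value (here 0).
same_value = dict.fromkeys(keys, 0)

# zip: pair each key with the value at the same position.
paired = dict(zip(keys, values))

# Dictionary comprehension: compute each value from the list contents.
counts = {k: len(v) for k, v in zip(keys, values)}

print(same_value)
print(paired)
print(counts)
```

One thing to keep in mind: `dict.fromkeys(keys, [])` would make every key share the same list object, so it is usually a poor fit when each key needs its own list.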
A dictionary comprehension makes short work of this task:
>>> s = [['A','B','C','D'], ['B','E','F'], ['C','A','B','D'], ['D']]
>>> {t[0]:t[1:] for t in s}
{'A': ['B', 'C', 'D'], 'C': ['A', 'B', 'D'], 'B': ['E', 'F'], 'D': []}
Try using a slice:
inlinkDict[docid] = adoc[1:]
This will give you an empty list instead of a 0 for the case where only the key value is on the line. To get a 0 instead, use an or (which always returns one of the operands):
inlinkDict[docid] = adoc[1:] or 0
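Putting the slice into the loop from the question might look like the sketch below. The function name `build_inlink_dict` and the `sample` string are hypothetical. It also assumes two small changes to the question's code: `split()` with no argument instead of `split(' ')`, so repeated spaces are tolerated, and a check that skips blank lines, since splitting a file on '\n' typically leaves an empty string at the end.

```python
def build_inlink_dict(text):
    """Map the first token of each line to the list of remaining tokens."""
    inlink_dict = {}
    for line in text.split("\n"):
        parts = line.split()      # split on any run of whitespace
        if not parts:             # skip blank lines (e.g. after a trailing newline)
            continue
        # parts[1:] is an empty list when only the key is on the line;
        # use `parts[1:] or 0` here instead to store 0 in that case.
        inlink_dict[parts[0]] = parts[1:]
    return inlink_dict


# Hypothetical sample mirroring the file contents in the question.
sample = "A B C D\nB E F\nC A B D\nD\n"
result = build_inlink_dict(sample)
print(result)
```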
Easier way with a dict comprehension:
>>> with open('/tmp/spam.txt') as f:
...     data = [line.split() for line in f]
...
>>> {d[0]: d[1:] for d in data}
{'A': ['B', 'C', 'D'], 'C': ['A', 'B', 'D'], 'B': ['E', 'F'], 'D': []}
>>> {d[0]: ' '.join(d[1:]) if d[1:] else 0 for d in data}
{'A': 'B C D', 'C': 'A B D', 'B': 'E F', 'D': 0}
Note: dict keys must be unique, so if you have, say, two lines beginning with 'C' the first one will be over-written.
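If repeated keys should be merged rather than overwritten, one option is `collections.defaultdict`. This is a sketch under that assumption, not something the question asked for; the function name `build_merged` and the sample lines are hypothetical.

```python
from collections import defaultdict


def build_merged(lines):
    """Collect the values of all lines that share the same first token."""
    merged = defaultdict(list)
    for line in lines:
        parts = line.split()
        if parts:
            # Accessing merged[key] creates an empty list on first use,
            # so a key-only line still ends up with [] as its value.
            merged[parts[0]].extend(parts[1:])
    return dict(merged)


# Hypothetical input with two lines beginning with 'C'.
merged = build_merged(["C A B", "D", "C D"])
print(merged)
```

With this approach the second 'C' line adds to the first instead of replacing it.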