I'm a Python newbie trying to parse a file to make a table of memory allocations. My input file is in the following format:
48 bytes allocated at 0x8bb970a0
24 bytes allocated at 0x8bb950c0
48 bytes allocated at 0x958bd0e0
48 bytes allocated at 0x8bb9b060
96 bytes allocated at 0x8bb9afe0
24 bytes allocated at 0x8bb9af60
My first objective is to make a table that counts the instances of a particular number of byte allocations. In other words, my desired output for the above input would be something like:
48 bytes -> 3 times
96 bytes -> 1 times
24 bytes -> 2 times
(for now, I'm not concerned about the memory addresses)
Since I'm using Python, I thought doing this using a dictionary would be the right way to go (based on about 3 hours' worth of reading Python tutorials). Is that a good idea?
In trying to do this using a dictionary, I decided to make the number of bytes the 'key', and a counter as the 'value'. My plan was to increment the counter on every occurrence of the key. As of now, my code snippet is as follows:
    # Create an empty dictionary
    allocationList = {}
    # Open file for reading
    with open("allocFile.txt") as fp:
        for line in fp:
            # Split the line into a list (using space as delimiter)
            lineList = line.split(" ")
            # Extract the number of bytes
            numBytes = lineList[0];
            # Store in a dictionary
            if allocationList.has_key('numBytes')
                currentCount = allocationList['numBytes']
                currentCount += 1
                allocationList['numBytes'] = currentCount
            else
                allocationList['numBytes'] = 1
    for bytes, count in allocationList.iteritems()
        print bytes, "bytes -> ", count, " times"
With this, I get a syntax error in the 'has_key' call, which leads me to question whether it is even possible to use variables as dictionary keys. All examples I have seen so far assume that keys are available upfront. In my case, I can get my keys only when I'm parsing the input file.
(Note that my input file can run into thousands of lines, with hundreds of different keys)
Thank you for any help you can provide.
Learning a language is as much about the syntax and basic types as it is about the standard library. Python already has a class that makes your task very easy: collections.Counter.
    from collections import Counter
    with open("allocFile.txt") as fp:
        counter = Counter(line.split()[0] for line in fp)
    for bytes, count in counter.most_common():
        print bytes, "bytes -> ", count, " times"
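For readers on Python 3, here is a minimal sketch of the same Counter idea. It assumes Python 3 (so print is a function) and uses the sample lines from the question held in a list named sample_lines, which is a stand-in for the open file and not part of the original code.

```python
from collections import Counter

# Sample lines copied from the question; a real run would iterate
# over the open file object instead of this list.
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
    "48 bytes allocated at 0x8bb9b060",
    "96 bytes allocated at 0x8bb9afe0",
    "24 bytes allocated at 0x8bb9af60",
]

# The first whitespace-separated field is the byte count (kept as a string).
size_counts = Counter(line.split()[0] for line in sample_lines)

# most_common() yields (key, count) pairs, highest count first.
for size, count in size_counts.most_common():
    print(f"{size} bytes -> {count} times")
```

Because Counter is a dict subclass, size_counts["48"] works like a normal dictionary lookup, and a missing key returns 0 instead of raising KeyError.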
You get a syntax error because you are missing the colon at the end of this line:
    if allocationList.has_key('numBytes')
                                         ^
The else line and the final for line are missing their colons as well.
Your approach is fine, but it might be easier to use dict.get() with a default value:
    allocationList[numBytes] = allocationList.get(numBytes, 0) + 1
Since your allocationList is a dictionary and not a list, you might want to choose a different name for the variable.
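As a rough sketch of how the dict.get() approach might look as a whole, assuming Python 3: the function name count_allocations and the variable name allocation_counts are illustrative choices, not names from the question.

```python
def count_allocations(lines):
    """Return a dict mapping each byte-count string to its number of occurrences."""
    allocation_counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        num_bytes = fields[0]
        # get() returns 0 when the key is not present yet, so no if/else is needed.
        allocation_counts[num_bytes] = allocation_counts.get(num_bytes, 0) + 1
    return allocation_counts


counts = count_allocations([
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "",
    "48 bytes allocated at 0x958bd0e0",
])
print(counts)
```

Since a file object is iterable line by line, the same function could be called as count_allocations(fp) inside a with open(...) block.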
The dict.has_key() method was removed in Python 3. To replace it, use the in operator:
    if numBytes in allocationList:  # use the variable numBytes directly, not the string 'numBytes'
        # do the stuff
But in your case, you can also replace all of this:
    if allocationList.has_key('numBytes')
        currentCount = allocationList['numBytes']
        currentCount += 1
        allocationList['numBytes'] = currentCount
    else
        allocationList['numBytes'] = 1
with a single line using get:
    allocationList[numBytes] = allocationList.get(numBytes, 0) + 1
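If you prefer to keep the explicit if/else, the loop from the question could be written roughly as follows. This is a sketch assuming Python 3. The int() conversion is an addition not present in the question (it makes the keys sort numerically), and sample_lines stands in for the file.

```python
sample_lines = [
    "48 bytes allocated at 0x8bb970a0",
    "24 bytes allocated at 0x8bb950c0",
    "48 bytes allocated at 0x958bd0e0",
    "96 bytes allocated at 0x8bb9afe0",
]

allocation_counts = {}
for line in sample_lines:
    # Assumption: every line starts with an integer byte count.
    num_bytes = int(line.split(" ")[0])
    if num_bytes in allocation_counts:   # replaces has_key(); note the colon
        allocation_counts[num_bytes] += 1
    else:                                # the else also needs a colon
        allocation_counts[num_bytes] = 1

# items() replaces the Python 2 iteritems(); sorted() orders by key.
for num_bytes, count in sorted(allocation_counts.items()):
    print(num_bytes, "bytes ->", count, "times")
```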
You most definitely can use variables as dict keys. However, you have a variable called numBytes, but you are using a string constant containing the text "numBytes" instead of the variable. That is not what causes the syntax error, but it is a bug: every line would update the same "numBytes" key. Instead, try:
    if numBytes in allocationList:
        # do stuff
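To make the difference between the quoted literal and the variable concrete, here is a small illustrative sketch, assuming Python 3. The names sizes, wrong, and right are made up for this example.

```python
sizes = ["48", "24", "48"]

wrong = {}
for num_bytes in sizes:
    # Bug: the quoted literal 'num_bytes' is the same key on every iteration.
    wrong['num_bytes'] = wrong.get('num_bytes', 0) + 1

right = {}
for num_bytes in sizes:
    # Correct: the variable's current value is used as the key.
    right[num_bytes] = right.get(num_bytes, 0) + 1

print(wrong)
print(right)
```

The first dictionary ends up with a single entry that counts every line, while the second has one entry per distinct size.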
Additionally, consider a Counter. This is a convenient class for handling the case you're looking at.