I'm stuck on a rather hard problem that I just can't get fixed. The idea is to loop through a block of data and detect the indentation of each line (always spaces). Every time a line is indented deeper than the previous one, for example by 4 more spaces, the previous line should become the key of a dictionary and the following lines should be appended to it as values.
If there is another indent, a new dictionary with its own key and values should be created. This should happen recursively until all the data has been processed. To make things easier to understand, I made an example:
Chassis 1:
    Servers:
        Server 1/1:
            Equipped Product Name: EEE UCS B200 M3
            Equiped PID: e63-samp-33
            Equipped VID: V01
            Acknowledged Cores: 16
            Acknowledged Adapters: 1
    PSU 1:
        Presence: Equipped
        VID: V00
        HW Revision: 0
The idea is to be able to get any part of the data returned in dictionary form. dictionary.get("Chassis 1:") should return ALL data, dictionary.get("Servers:") should return everything that is indented deeper than the line "Servers:", dictionary.get("PSU 1:") should give {"PSU 1:": ["Presence: Equipped", "VID: V00", "HW Revision: 0"]}, and so on. I've drawn a little scheme to demonstrate this; every colour is another dictionary.
When the indentation gets shallower again, for example from 8 to 4 spaces, the data should be appended to the dictionary of the less-indented section.
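To make the depth rule concrete, here is a minimal sketch of how the indentation of each line could be measured and compared with the previous line. The names `indent_of` and `sample` are made up for illustration, and it assumes the indentation consists of spaces only.

```python
def indent_of(line):
    """Return the number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


# Shortened version of the example data above (illustrative only).
sample = [
    "Chassis 1:",
    "    Servers:",
    "        Server 1/1:",
    "            Equipped VID: V01",
    "    PSU 1:",
    "        VID: V00",
]

previous = 0
for line in sample:
    depth = indent_of(line)
    if depth > previous:
        change = "deeper"      # a new nested section starts
    elif depth < previous:
        change = "shallower"   # back to a less-indented section
    else:
        change = "same"
    print(depth, change, line.strip())
    previous = depth
```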
I've made an attempt in code, but it is not coming anywhere near what I want:
for item in Array:
    regexpatt = re.search(":$", item)
    if regexpatt:
        keyFound = True
        break
if not keyFound:
    return Array
#Verify if we still have lines with spaces
spaceFound = False
for item in Array:
    if item != item.lstrip():
        spaceFound = True
        break
if not spaceFound:
    return Array
keyFound = False
key=""
counter = -1
for item in Array:
    counter += 1
    valueTrim = item.lstrip()
    valueL = len(item)
    valueTrimL = len(valueTrim)
    diff = (valueL - valueTrimL)
    nextSame = False
    if item in Array:
        nextValue = Array[counter]
        nextDiff = (len(nextValue) - len(nextValue.lstrip()))
        if diff == nextDiff:
            nextSame = True
    if diff == 0 and valueTrim != "" and nextSame is True:
        match = re.search(":$", item)
        if match:
            key = item
            newArray[key] = []
            deptDetermine = True
            keyFound = True
    elif diff == 0 and valueTrim != "" and keyFound is False:
        newArray["0"].append(item)
    elif valueTrim != "":
        if depthDetermine:
            depth = diff
            deptDetermine = False
        #newValue = item[-valueL +depth]
        item = item.lstrip().rstrip()
        newArray[key].append(item)
for item in newArray:
    if item != "0":
        newArray[key] = newArray[key]
return newArray
The result should be like this for example:
{
    "Chassis 1": {
        "PSU 1": {
            "HW Revision: 0",
            "Presence: Equipped",
            "VID: V00"
        },
        "Servers": {
            "Server 1/1": {
                "Acknowledged Adapters: 1",
                "Acknowledged Cores: 16",
                "Equiped PID: e63-samp-33",
                "Equipped Product Name: EEE UCS B200 M3",
                "Equipped VID: V01"
            }
        }
    }
}
I hope this explains the concept well enough.
This should give you the nested structure you want.
If you also want every nested dictionary to be available directly from the root, uncomment the "if ... is not root" parts.
def parse(data):
    root = {}
    currentDict = root
    prevLevel = -1
    parents = []  # stack of (dict, indentation level) pairs
    for line in data:
        if line.strip() == '': continue
        # indentation level = number of leading spaces
        level = len(line) - len(line.lstrip(" "))
        key, value = [val.strip() for val in line.split(':', 1)]
        if level > prevLevel and not len(value):
            # deeper header line: open a new nested dict
            currentDict[key] = {}
            # if currentDict is not root:
            #     root[key] = currentDict[key]
            parents.append((currentDict, level))
            currentDict = currentDict[key]
            prevLevel = level
        elif level < prevLevel and not len(value):
            # shallower header line: walk back up to the matching parent
            parentDict, parentLevel = parents.pop()
            while parentLevel != level:
                if not parents: return root
                parentDict, parentLevel = parents.pop()
            parentDict[key] = {}
            parents.append((parentDict, level))
            # if parentDict is not root:
            #     root[key] = parentDict[key]
            currentDict = parentDict[key]
            prevLevel = level
        else:
            # ordinary "key: value" line
            currentDict[key] = value
    return root

with open('data.txt', 'r') as f:
    data = parse(f)
#for pretty print of nested dict
import json
print(json.dumps(data, sort_keys=True, indent=4))
output:
{
    "Chassis 1": {
        "PSU 1": {
            "HW Revision": "0",
            "Presence": "Equipped",
            "VID": "V00"
        },
        "Servers": {
            "Server 1/1": {
                "Acknowledged Adapters": "1",
                "Acknowledged Cores": "16",
                "Equiped PID": "e63-samp-33",
                "Equipped Product Name": "EEE UCS B200 M3",
                "Equipped VID": "V01"
            }
        }
    }
}
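One case the sample data does not exercise is two header lines at the same depth, such as a "Server 1/2:" directly after the values of "Server 1/1:". As a hedged alternative, here is a minimal sketch that keeps an explicit stack of open sections and closes every section that is at least as deep as the current line. The name `parse_indented` and the second server in the sample are made up for illustration; it assumes space-only indentation and treats any line without a value after the first colon as a section header.

```python
def parse_indented(lines):
    """Build a nested dict from 'key: value' lines indented with spaces."""
    root = {}
    # Each stack entry is (indent of the header line, dict for that section).
    stack = [(-1, root)]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        level = len(line) - len(line.lstrip(" "))
        key, _, value = line.strip().partition(":")
        key, value = key.strip(), value.strip()
        # Close every open section indented at least as deep as this line.
        while stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value      # plain "key: value" line
        else:
            parent[key] = {}         # header line opens a new section
            stack.append((level, parent[key]))
    return root


# Hypothetical sample with two sibling servers (not from the original data).
sample = [
    "Chassis 1:",
    "    Servers:",
    "        Server 1/1:",
    "            Equipped VID: V01",
    "        Server 1/2:",
    "            Equipped VID: V02",
    "    PSU 1:",
    "        VID: V00",
]
tree = parse_indented(sample)
print(sorted(tree["Chassis 1"]["Servers"]))
```

The root entry uses an indent of -1 so it is never popped, which means a dedent to any level always finds a parent.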
That data format really does look like YAML. Just in case someone stumbles onto this and is fine with a library solution:
import yaml
import pprint

s = """
Chassis 1:
    Servers:
        Server 1/1:
            Equipped Product Name: EEE UCS B200 M3
            Equiped PID: e63-samp-33
            Equipped VID: V01
            Acknowledged Cores: 16
            Acknowledged Adapters: 1
    PSU 1:
        Presence: Equipped
        VID: V00
        HW Revision: 0
"""
# safe_load only builds plain Python objects and needs no Loader argument
d = yaml.safe_load(s)
pprint.pprint(d)
The output is:
{'Chassis 1': {'PSU 1': {'HW Revision': 0,
                         'Presence': 'Equipped',
                         'VID': 'V00'},
               'Servers': {'Server 1/1': {'Acknowledged Adapters': 1,
                                          'Acknowledged Cores': 16,
                                          'Equiped PID': 'e63-samp-33',
                                          'Equipped Product Name': 'EEE UCS B200 M3',
                                          'Equipped VID': 'V01'}}}}
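However the nested dict is built, the question also asks for fetching any section by name regardless of depth. A minimal sketch of such a lookup follows, assuming section names are unique in the tree; the helper name `find_section` is hypothetical, and the dict `d` is a shortened copy of the output above.

```python
def find_section(tree, name):
    """Return the first value stored under `name`, searching depth-first.

    Returns None when no key with that name exists anywhere in the tree.
    """
    for key, value in tree.items():
        if key == name:
            return value
        if isinstance(value, dict):
            found = find_section(value, name)
            if found is not None:
                return found
    return None


# Shortened copy of the parsed structure, for illustration only.
d = {
    "Chassis 1": {
        "Servers": {"Server 1/1": {"Equipped VID": "V01"}},
        "PSU 1": {"Presence": "Equipped", "VID": "V00", "HW Revision": 0},
    }
}
print(find_section(d, "PSU 1"))
```

If the same name can appear in several places (for example a "PSU 1" under more than one chassis), this only returns the first match, so a path-based lookup would be the safer choice there.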