If I were to have a list, say:
lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
with a marker character of '!', how would I return a list like this:
lst = ['foo', 'bar', ['test', 'hello', 'world'], 'word']
I'm having some difficulty finding a solution for this. Here's one approach I've tried:
def define(lst):
    for index, item in enumerate(lst):
        if item[0] == '!' and lst[index+2][-1] == '!':
            temp = lst[index:index+3]
            del lst[index+1:index+2]
            lst[index] = temp
    return lst
Any help would be greatly appreciated.
This assumes that no element both starts and ends with !, like '!foo!'.
First of all we can write helper predicates like
def is_starting_element(element):
    return element.startswith('!')

def is_ending_element(element):
    return element.endswith('!')
Then we can write a generator function (because they are awesome)
def walk(elements):
    # make an iterator so that recursive calls share the same position
    elements = iter(elements)
    for element in elements:
        if is_starting_element(element):
            # the recursive call consumes elements up to the matching closing one
            yield [element[1:], *walk(elements)]
        elif is_ending_element(element):
            yield element[:-1]
            return
        else:
            yield element
Tests:
>>> lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
>>> list(walk(lst))
['foo', 'bar', ['test', 'hello', 'world'], 'word']
>>> lst = ['foo', 'bar', '!test', '!hello', 'world!', 'word!']
>>> list(walk(lst))
['foo', 'bar', ['test', ['hello', 'world'], 'word']]
>>> lst = ['hello!', 'world!']
>>> list(walk(lst))
['hello']
As we can see from the last example, if there are more closing elements than opening ones, the elements after the first unmatched closing one are ignored (this is because we return from the generator at that point). So if lst has an invalid signature (the difference between the number of opening and closing elements is not zero), the result can be surprising. As a way out of this situation, we can validate the given data before processing and raise an error if the data is invalid.
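To make both kinds of imbalance concrete, here is a small self-contained sketch that repeats the helpers and walk from above and runs them on unbalanced input. Nothing here is new apart from the two sample lists, which are only illustrations:

```python
def is_starting_element(element):
    return element.startswith('!')


def is_ending_element(element):
    return element.endswith('!')


def walk(elements):
    elements = iter(elements)  # shared by all recursive calls
    for element in elements:
        if is_starting_element(element):
            yield [element[1:], *walk(elements)]
        elif is_ending_element(element):
            yield element[:-1]
            return
        else:
            yield element


# extra closing element: everything after the first one is dropped
print(list(walk(['hello!', 'world!'])))  # ['hello']

# extra opening element: the sublist is silently closed at the end of input
print(list(walk(['!hello', 'world'])))   # [['hello', 'world']]
```

Neither case raises an error, which is why a separate validation step is useful.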
We can write a validator like
def validate_elements(elements):
    def get_sign(element):
        # +1 for an opening element, -1 for a closing one, 0 otherwise
        if is_starting_element(element):
            return 1
        elif is_ending_element(element):
            return -1
        else:
            return 0

    signature = sum(map(get_sign, elements))
    are_elements_valid = signature == 0
    if not are_elements_valid:
        error_message = 'Data is invalid: '
        if signature > 0:
            error_message += ('there are more opening elements '
                              'than closing ones.')
        else:
            error_message += ('there are more closing elements '
                              'than opening ones.')
        raise ValueError(error_message)
Tests
>>> lst = ['!hello', 'world!']
>>> validate_elements(lst) # no exception raised, data is valid
>>> lst = ['!hello', '!world']
>>> validate_elements(lst)
...
ValueError: Data is invalid: there are more opening elements than closing ones.
>>> lst = ['hello!', 'world!']
>>> validate_elements(lst)
...
ValueError: Data is invalid: there are more closing elements than opening ones.
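One case a count-based check does not cover is a closing element that comes before its opening one, for example ['hello!', '!world'], where the counts still cancel out to zero. Below is a minimal sketch of a stricter check that tracks the running nesting depth instead. The name check_nesting and its sep parameter are hypothetical, not part of the answer above:

```python
def check_nesting(elements, sep='!'):
    """Raise ValueError if opening and closing elements are not properly nested.

    Hypothetical helper: elements that both start and end with sep
    are treated as neutral.
    """
    depth = 0
    for position, element in enumerate(elements):
        opens = element.startswith(sep)
        closes = element.endswith(sep)
        if opens and not closes:
            depth += 1
        elif closes and not opens:
            depth -= 1
        if depth < 0:
            # a closing element appeared with nothing left to close
            raise ValueError(
                f'closing element {element!r} at position {position} '
                'has no matching opening element')
    if depth > 0:
        raise ValueError(f'{depth} opening element(s) were never closed')


check_nesting(['!hello', 'world!'])      # passes silently
try:
    check_nesting(['hello!', '!world'])  # the counts match, the order does not
except ValueError as error:
    print(error)
```

Because the depth is checked after every element, the error is raised at the first offending position rather than only after the whole list has been summed.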
Finally, we can write a function with validation like
def to_sublists(elements):
    validate_elements(elements)
    return list(walk(elements))
Tests
>>> lst = ['foo', 'bar', '!test', 'hello', 'world!', 'word']
>>> to_sublists(lst)
['foo', 'bar', ['test', 'hello', 'world'], 'word']
>>> lst = ['foo', 'bar', '!test', '!hello', 'world!', 'word!']
>>> to_sublists(lst)
['foo', 'bar', ['test', ['hello', 'world'], 'word']]
>>> lst = ['hello!', 'world!']
>>> to_sublists(lst)
...
ValueError: Data is invalid: there are more closing elements than opening ones.
If we want to handle elements which start and end with !, like '!bar!', we can modify the walk function using itertools.chain like
from itertools import chain

def walk(elements):
    elements = iter(elements)
    for element in elements:
        if is_starting_element(element):
            # put the stripped element back in front of the iterator so that
            # the recursive call can see whether it is also a closing element
            yield list(walk(chain([element[1:]], elements)))
        elif is_ending_element(element):
            yield element[:-1]
            return
        else:
            yield element
We also need to complete the validation by modifying the get_sign function
def get_sign(element):
    if is_starting_element(element):
        if is_ending_element(element):
            # opens and closes at once, so the balance does not change
            return 0
        return 1
    if is_ending_element(element):
        return -1
    return 0
Tests
>>> lst = ['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word']
>>> to_sublists(lst)
['foo', 'bar', ['test', ['baz'], 'hello', 'world'], 'word']
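The role of itertools.chain here is to put an item back in front of an iterator that has already been advanced. A tiny sketch of just that trick, with push_back being a hypothetical helper name used only for illustration:

```python
from itertools import chain


def push_back(item, iterator):
    """Return an iterator that yields item first, then the rest of iterator."""
    return chain([item], iterator)


elements = iter(['!baz!', 'hello'])
first = next(elements)                     # consumes '!baz!'
restored = push_back(first[1:], elements)  # put 'baz!' back in front
print(list(restored))                      # ['baz!', 'hello']
```

In walk, the recursive call then sees 'baz!' as an ordinary closing element, which is how '!baz!' ends up as the one-element sublist ['baz'].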
Here's an iterative solution that can handle arbitrarily nested lists:
def nest(lst, sep):
    current_list = []
    nested_lists = [current_list]  # stack of nested lists
    for item in lst:
        if item.startswith(sep):
            if item.endswith(sep):
                item = item[len(sep):-len(sep)]  # strip both separators
                current_list.append([item])
            else:
                # start a new nested list and push it onto the stack
                new_list = []
                current_list.append(new_list)
                current_list = new_list
                nested_lists.append(current_list)
                current_list.append(item[len(sep):])  # strip the separator
        elif item.endswith(sep):
            # finalize the deepest list and go up by one level
            current_list.append(item[:-len(sep)])  # strip the separator
            nested_lists.pop()
            current_list = nested_lists[-1]
        else:
            current_list.append(item)
    return current_list
Test run:
>>> nest(['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word'], '!')
['foo', 'bar', ['test', ['baz'], 'hello', 'world'], 'word']
The way it works is to maintain a stack of nested lists. Every time a new nested list is created, it gets pushed onto the stack. Elements are always appended to the last list in the stack. When an element that ends with "!" is found, the topmost list is removed from the stack.
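Note that nest as written assumes balanced input: an unmatched closing element empties the stack and fails with an IndexError, and an unclosed opening element makes it return only the innermost list. If that matters, a guarded variant of the same stack idea might look like the sketch below. The name nest_checked and its error messages are hypothetical:

```python
def nest_checked(items, sep='!'):
    """Hypothetical variant of nest() that rejects unbalanced input."""
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    for item in items:
        opens = item.startswith(sep)
        closes = item.endswith(sep)
        if opens and closes:
            # '!baz!' becomes a one-element sublist
            stack[-1].append([item[len(sep):-len(sep)]])
        elif opens:
            # start a new sublist and push it onto the stack
            new_list = [item[len(sep):]]
            stack[-1].append(new_list)
            stack.append(new_list)
        elif closes:
            if len(stack) == 1:
                raise ValueError(f'{item!r} closes a list that was never opened')
            # finish the deepest list and go up one level
            stack.pop().append(item[:-len(sep)])
        else:
            stack[-1].append(item)
    if len(stack) > 1:
        raise ValueError(f'{len(stack) - 1} nested list(s) were never closed')
    return root


print(nest_checked(['foo', 'bar', '!test', '!baz!', 'hello', 'world!', 'word']))
# ['foo', 'bar', ['test', ['baz'], 'hello', 'world'], 'word']
```

Keeping a reference to root and returning it at the end means the outermost list is always what comes back, whatever the nesting depth was during the loop.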