
Concatenating selected strings in list of strings

The problem is as follows. I have a list of strings

lst1=['puffing','his','first','cigarette','in', 'weeks', 'in', 'weeks']

and I would like to obtain the list

lst2=['puffing','his','first','cigarette','in weeks', 'in weeks']

that is, to concatenate every occurrence of the sublist ['in', 'weeks'] (for reasons that are irrelevant here). My attempt is below, where find_sub_list1 is taken from here (and included in the code):

npis = [['in', 'weeks'], ['in', 'ages']]

# Given a candidate sublist sl and a list l, return the (first, last) index
# pair of every occurrence of sl within l
def find_sub_list1(sl,l):
    results=[]
    sll=len(sl)
    for ind in (i for i,e in enumerate(l) if e==sl[0]):
        if l[ind:ind+sll]==sl:
            results.append((ind,ind+sll-1))

    return results

def concatenator(sent, npis):
    indices = []
    for npi in npis:
        indices_temp = find_sub_list1(npi, sent)
        if indices_temp != []:
            indices.extend(indices_temp)
    sorted(indices, key=lambda x: x[0])

    for (a,b) in indices:
        diff = b - a
        sent[a:b+1] = [" ".join(sent[a:b+1])]
        del indices[0]
        indices = [(a - diff, b - diff) for (a,b) in indices]

    return sent 

Instead of the desired lst2, this code returns:

concatenator(lst1, npis)
>>['puffing','his','first','cigarette','in weeks', 'in', 'weeks']

so it only concatenates the first occurrence. Any ideas about where the code is failing?

Orest Xherija asked May 02 '17 03:05




1 Answer

Since the desired sub-sequences are 'in' 'weeks' and possibly 'in' 'ages', one possible solution (the looping is not very elegant, though) is:

  1. First, find all positions where 'in' occurs.

  2. Then iterate through the source list, appending elements to the target list and treating the positions of 'in' specially: if the following word is in a special set, join the two words, append the result to the target, and advance the index one extra time.

  3. Once the source list is exhausted, an IndexError is raised, indicating that we should break out of the loop.

code:

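# positions of every 'in' in the source list
index_in = [i for i, word in enumerate(lst1) if word == 'in']

lst2 = []
n = 0

while True:
    try:
        # join 'in' with the next word when that word is in the special set
        if n in index_in and n + 1 < len(lst1) and lst1[n+1] in ['weeks', 'ages']:
            lst2.append(lst1[n] + ' ' + lst1[n+1])
            n += 1  # skip the word that was just joined
        else:
            lst2.append(lst1[n])  # raises IndexError once n runs past the end
        n += 1
    except IndexError:
        break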
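
As for why the concatenator in the question stops after the first match: sorted(indices, ...) returns a new list that is thrown away, and inside the loop del indices[0] shrinks the very list the for statement is iterating over, while the next line rebinds indices to a new list that the loop never sees. A minimal sketch of one way around this, assuming matches do not overlap and that returning a new list is acceptable, is to merge the matches from right to left so that earlier indices never shift. The name concatenate_phrases is made up for this example.

```python
def find_sub_list1(sl, l):
    """Return (first, last) index pairs for each occurrence of sl in l."""
    results = []
    sll = len(sl)
    for ind in (i for i, e in enumerate(l) if e == sl[0]):
        if l[ind:ind + sll] == sl:
            results.append((ind, ind + sll - 1))
    return results


def concatenate_phrases(sent, npis):
    # Hypothetical helper: works on a copy so the caller's list is untouched.
    out = list(sent)
    indices = []
    for npi in npis:
        indices.extend(find_sub_list1(npi, out))
    # Replace from the right so the positions of earlier matches stay valid.
    for a, b in sorted(indices, reverse=True):
        out[a:b + 1] = [" ".join(out[a:b + 1])]
    return out


lst1 = ['puffing', 'his', 'first', 'cigarette', 'in', 'weeks', 'in', 'weeks']
npis = [['in', 'weeks'], ['in', 'ages']]
print(concatenate_phrases(lst1, npis))
# ['puffing', 'his', 'first', 'cigarette', 'in weeks', 'in weeks']
```

Working right to left avoids the index bookkeeping (diff, shifting later tuples) that the original loop attempted.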

A more compact way to do this is with a regular expression:

  1. Join the list into a single string, with a space as the separator.

  2. Split the string on spaces, except where the space is preceded by in or followed by weeks. Negative lookbehind and negative lookahead express this.

code:

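import re

# match a space that is neither preceded by 'in' nor followed by 'weeks'
c = re.compile(r'(?<!in) (?!weeks)')

# join the words into one string, then split on the matching spaces
lst2 = c.split(' '.join(lst1))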
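
Note that the two lookarounds in the pattern above act independently, so a space after any text ending in "in", or before any text starting with "weeks", is also left unsplit (for example, "in the" would stay joined). A hedged variant, assuming the phrases to keep are exactly "in weeks" and "in ages", splits on a space when it is either not preceded by the whole word "in" or not followed by one of the target words. The sample word list and the name keep_phrases are only for illustration.

```python
import re

# Split on a space unless it sits between the word "in" and "weeks"/"ages".
# First alternative: a space not preceded by the whole word "in".
# Second alternative: a space not followed by the whole word "weeks" or "ages".
keep_phrases = re.compile(r'(?<!\bin) | (?!(?:weeks|ages)\b)')

words = ['begin', 'in', 'the', 'two', 'weeks', 'in', 'weeks', 'in', 'ages']
result = keep_phrases.split(' '.join(words))
print(result)
# ['begin', 'in', 'the', 'two', 'weeks', 'in weeks', 'in ages']
```

Like the original pattern, this assumes that no individual word contains a space.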
Haleemur Ali answered Sep 30 '22 09:09
