Let's say I have a list of strings:
a = ['a', 'a', 'b', 'c', 'c', 'c', 'd']
I want to make a list of items that appear at least twice in a row:
result = ['a', 'c']
I know I have to use a for loop, but I can't figure out how to target the items repeated in a row. How can I do so?
EDIT: What if the same item appears in more than one run in a? Then the set function would be ineffective, because each run should be listed separately:
a = ['a', 'b', 'a', 'a', 'c', 'a', 'a', 'a', 'd', 'd']
result = ['a', 'a', 'd']
Try itertools.groupby():
>>> from itertools import groupby,islice
>>> a = ['a', 'a', 'b', 'c', 'c', 'c', 'b']
>>> [list(g) for k,g in groupby(a)]
[['a', 'a'], ['b'], ['c', 'c', 'c'], ['b']]
>>> [k for k,g in groupby(a) if len(list(g))>=2]
['a', 'c']
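For the case raised in the question's EDIT, here is a minimal sketch (Python 3) of the same groupby() idea wrapped in a function. The name repeated_runs and the min_length parameter are made up for illustration. Because groupby() starts a new group whenever the value changes, each run yields its own entry, so a value that repeats in several separate runs is listed once per run.

```python
from itertools import groupby

def repeated_runs(items, min_length=2):
    """Return the value of each consecutive run holding at least min_length items."""
    # groupby() yields (key, group) once per run of equal adjacent values;
    # counting with sum() avoids building a throwaway list for each group.
    return [key for key, group in groupby(items)
            if sum(1 for _ in group) >= min_length]

print(repeated_runs(['a', 'a', 'b', 'c', 'c', 'c', 'd']))
print(repeated_runs(['a', 'b', 'a', 'a', 'c', 'a', 'a', 'a', 'd', 'd']))
```

The two calls print ['a', 'c'] and ['a', 'a', 'd'], matching the results asked for in the question.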
Using islice():
>>> [k for k,g in groupby(a) if len(list(islice(g,0,2)))==2]
['a', 'c']
Using zip() and izip() (izip requires from itertools import izip on Python 2):
In [198]: set(x[0] for x in izip(a,a[1:]) if x[0]==x[1])
Out[198]: set(['a', 'c'])
In [199]: set(x[0] for x in zip(a,a[1:]) if x[0]==x[1])
Out[199]: set(['a', 'c'])
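On Python 3 there is no izip, and the built-in zip() is already lazy. A rough equivalent of the pairwise comparison might look like the sketch below; the function name adjacent_duplicates is hypothetical. Since the result is a set, it keeps neither the order nor separate runs of the same value, so it does not cover the EDIT case in the question.

```python
def adjacent_duplicates(items):
    """Return the set of values that are immediately followed by an equal value."""
    # zip(items, items[1:]) pairs every element with its right-hand neighbour.
    return {x for x, y in zip(items, items[1:]) if x == y}

print(adjacent_duplicates(['a', 'a', 'b', 'c', 'c', 'c', 'b']) == {'a', 'c'})
# The two separate runs of 'a' collapse into a single set member:
print(adjacent_duplicates(['a', 'b', 'a', 'a', 'c', 'a', 'a', 'a', 'd', 'd']) == {'a', 'd'})
```

Both lines print True.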
timeit results:
# Python 2 code: izip, xrange and the print statement do not exist in Python 3.
from itertools import groupby, islice, chain, izip

a='aaaabbbccccddddefgggghhhhhiiiiiijjjkkklllmnooooooppppppppqqqqqqsssstuuvv'

def grp_isl():
    [k for k,g in groupby(a) if len(list(islice(g,0,2)))==2]

def grpby():
    [k for k,g in groupby(a) if len(list(g))>=2]

def chn():
    set(x[1] for x in chain(izip(*([iter(a)] * 2)), izip(*([iter(a[1:])] * 2))) if x[0] == x[1])

def dread():
    set(a[i] for i in range(1, len(a)) if a[i] == a[i-1])

def xdread():
    set(a[i] for i in xrange(1, len(a)) if a[i] == a[i-1])

def inrow():
    inRow = []
    last = None
    for x in a:
        if last == x and (len(inRow) == 0 or inRow[-1] != x):
            inRow.append(last)
        last = x

def zipp():
    set(x[0] for x in zip(a,a[1:]) if x[0]==x[1])

def izipp():
    set(x[0] for x in izip(a,a[1:]) if x[0]==x[1])

if __name__=="__main__":
    import timeit
    print "islice",timeit.timeit("grp_isl()", setup="from __main__ import grp_isl")
    print "grpby",timeit.timeit("grpby()", setup="from __main__ import grpby")
    print "dread",timeit.timeit("dread()", setup="from __main__ import dread")
    print "xdread",timeit.timeit("xdread()", setup="from __main__ import xdread")
    print "chain",timeit.timeit("chn()", setup="from __main__ import chn")
    print "inrow",timeit.timeit("inrow()", setup="from __main__ import inrow")
    print "zip",timeit.timeit("zipp()", setup="from __main__ import zipp")
    print "izip",timeit.timeit("izipp()", setup="from __main__ import izipp")
Output (seconds for timeit's default of 1,000,000 calls per function):
islice 39.9123107277
grpby 30.1204478987
dread 17.8041124706
xdread 15.3691785568
chain 17.4777339702
inrow 11.8577565327
zip 16.6348844045
izip 15.1468557105
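The figures above come from Python 2. To repeat the comparison on Python 3, something like the following sketch could be used. It covers only the candidates that need no Python 2 names, the helper time_all and its number argument are made up for illustration, and no timings are claimed for it.

```python
import timeit
from itertools import groupby, islice

a = 'aaaabbbccccddddefgggghhhhhiiiiiijjjkkklllmnooooooppppppppqqqqqqsssstuuvv'

def grp_isl():
    return [k for k, g in groupby(a) if len(list(islice(g, 0, 2))) == 2]

def grpby():
    return [k for k, g in groupby(a) if len(list(g)) >= 2]

def dread():
    return set(a[i] for i in range(1, len(a)) if a[i] == a[i - 1])

def zipp():
    return set(x for x, y in zip(a, a[1:]) if x == y)

def time_all(number=10000):
    """Return {function name: seconds taken for `number` calls}."""
    candidates = [grp_isl, grpby, dread, zipp]
    # timeit.timeit() accepts a zero-argument callable as the statement.
    return {f.__name__: timeit.timeit(f, number=number) for f in candidates}

if __name__ == "__main__":
    for name, seconds in time_all().items():
        print(name, seconds)
```

Passing the functions directly avoids the "from __main__ import ..." setup strings, at the cost of a small call overhead that is the same for every candidate.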
Conclusion:
Poke's solution (timed as inrow, the lowest figure above) is the fastest of the alternatives tested.
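One caveat about the inrow loop as written in the benchmark: it skips a value when the last entry already in inRow equals it, so for the list in the question's EDIT it would record the second run of 'a' only once overall. The sketch below (Python 3) is a variant of that single-pass idea, not Poke's original code. The names runs_in_a_row and run_reported are hypothetical; it tracks whether the current run has already been recorded instead of looking at the last result.

```python
def runs_in_a_row(items):
    """Single pass: record a value once for each run of two or more equal items."""
    result = []
    previous = object()   # sentinel that compares unequal to every normal item
    run_reported = False  # True once the current run has been added to result
    for item in items:
        if item == previous:
            if not run_reported:
                result.append(item)
                run_reported = True
        else:
            run_reported = False  # a new run starts here
        previous = item
    return result

print(runs_in_a_row(['a', 'a', 'b', 'c', 'c', 'c', 'd']))
print(runs_in_a_row(['a', 'b', 'a', 'a', 'c', 'a', 'a', 'a', 'd', 'd']))
```

These print ['a', 'c'] and ['a', 'a', 'd'].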
This sounds like homework, so I'll just outline what I would do:
1. Iterate over a, but keep the index of each element in a variable. enumerate() will be useful.
2. Inside the for loop, start a while loop from the current item's index.
3. Keep looping while the following items equal the current one, counting them as you go. break will be useful here.
4. Append the item to result if your counter variable is >= 2.
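The outline above can be sketched roughly as follows (Python 3). The function name repeated_in_a_row and the variables counter and position are hypothetical, and the check that skips items already counted as part of an earlier run is an addition the outline does not mention; without it, a run of three would be reported twice.

```python
def repeated_in_a_row(a):
    """Follow the outline: for loop with enumerate(), inner while loop, counter."""
    result = []
    for index, item in enumerate(a):
        # Assumption beyond the outline: skip items that continue a run
        # whose first element was already handled.
        if index > 0 and a[index - 1] == item:
            continue
        counter = 1
        position = index + 1
        while position < len(a):
            if a[position] != item:
                break  # the run ends at the first different item
            counter += 1
            position += 1
        if counter >= 2:
            result.append(item)
    return result

print(repeated_in_a_row(['a', 'a', 'b', 'c', 'c', 'c', 'd']))
print(repeated_in_a_row(['a', 'b', 'a', 'a', 'c', 'a', 'a', 'a', 'd', 'd']))
```

These print ['a', 'c'] and ['a', 'a', 'd'].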