I am currently using Excel for this task, but I was wondering if any of you know a way to find and insert missing sequence numbers in Python.
Say I have a dataframe:
import pandas as pd
data = {'Sequence': [1, 2, 4, 6, 7, 9, 10],
        'Value': ["x", "x", "x", "x", "x", "x", "x"]}
df = pd.DataFrame(data, columns=['Sequence', 'Value'])
Now I want to find the missing sequence numbers in the column 'Sequence' and leave the column 'Value' blank for the rows of the missing sequence numbers, to get the following output:
print(df)
   Sequence Value
0         1     x
1         2     x
2         3
3         4     x
4         5
5         6     x
6         7     x
7         8
8         9     x
9        10     x
Even better would be a solution in which you can also define the start and end of the sequence, for example when the sequence starts with 3 but you want it to start from 1 and end at 12. But a solution for only the first part would already help a lot. Thanks in advance!!
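You can use set_index and reindex with a range running from the first to the last value of Sequence (this assumes the column is sorted; use df.Sequence.min() and df.Sequence.max() otherwise):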
(df.set_index('Sequence')
   .reindex(range(df.Sequence.iat[0], df.Sequence.iat[-1] + 1), fill_value='')
   .reset_index())
   Sequence Value
0         1     x
1         2     x
2         3
3         4     x
4         5
5         6     x
6         7     x
7         8
8         9     x
9        10     x
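For the second part of the question (choosing the start and end yourself), the same reindex idea should carry over. Below is a minimal sketch, not a definitive implementation: the helper name fill_sequence and its parameters (start, end, col, fill_value) are made up for illustration, and it assumes the sequence column holds unique integers.

```python
import pandas as pd


def fill_sequence(df, start=None, end=None, col="Sequence", fill_value=""):
    """Return a new frame with one row per integer from start to end (inclusive).

    Hypothetical helper for illustration; assumes df[col] holds unique integers.
    """
    # Fall back to the smallest/largest value present when no bound is given.
    start = int(df[col].min()) if start is None else int(start)
    end = int(df[col].max()) if end is None else int(end)

    # Reindexing on the full range adds the missing numbers; the other
    # columns of those new rows are filled with fill_value.
    # Note: rows whose sequence number falls outside [start, end] are dropped.
    return (df.set_index(col)
              .reindex(range(start, end + 1), fill_value=fill_value)
              .reset_index())


# Example: the data starts at 3, but we want rows for 1 through 12.
df = pd.DataFrame({"Sequence": [3, 4, 6, 9, 10], "Value": ["x"] * 5})
result = fill_sequence(df, start=1, end=12)
print(result)
```

Leaving start and end as None falls back to the minimum and maximum of the column, so the same helper also covers the first part of the question, even if the column is not sorted.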