How do I cut a string into its first N digits and the rest?
Here's my data
Id  actual_pattern
1   100101
2   10101
3   1010101
4   101
Here's the expected output:
cut_pattern1 is the first 4 digits of actual_pattern.
cut_pattern2 is whatever remains of actual_pattern after cut_pattern1; if nothing remains, set cut_pattern2 = 0.
If cut_pattern2 contains any 1, set binary_cut2 = 1; otherwise set binary_cut2 = 0.
Id  actual_pattern  cut_pattern1  cut_pattern2  binary_cut2
1   100101          1001          01            1
2   10101           1010          1             1
3   1010101         1010          101           1
4   101             101           0             0
Create the new columns by indexing with str, use replace to turn empty strings into '0', and build the last column with Series.str.contains followed by a cast to integers:
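# Make sure the column is text so that .str indexing works
df['actual_pattern'] = df['actual_pattern'].astype(str)
# First 4 characters
df['cut_pattern1'] = df['actual_pattern'].str[:4]
# Remaining characters; an empty remainder becomes '0'
df['cut_pattern2'] = df['actual_pattern'].str[4:].replace('', '0')
# 1 if the remainder contains a '1', otherwise 0
df['binary_cut2'] = df['cut_pattern2'].str.contains('1').astype(int)
print(df)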
   Id actual_pattern cut_pattern1 cut_pattern2  binary_cut2
0   1         100101         1001           01            1
1   2          10101         1010            1            1
2   3        1010101         1010          101            1
3   4            101          101            0            0
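For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question is stored as strings. The helper name split_pattern and its parameter n are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Sample data from the question; actual_pattern is assumed to be text
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "actual_pattern": ["100101", "10101", "1010101", "101"],
})


def split_pattern(frame, n=4):
    """Hypothetical helper: split actual_pattern after the first n characters."""
    out = frame.copy()
    s = out["actual_pattern"].astype(str)
    # First n characters (shorter strings are kept whole)
    out["cut_pattern1"] = s.str[:n]
    # Remaining characters; an empty remainder becomes '0'
    out["cut_pattern2"] = s.str[n:].replace("", "0")
    # 1 if the remainder contains a '1', otherwise 0
    out["binary_cut2"] = out["cut_pattern2"].str.contains("1").astype(int)
    return out


result = split_pattern(df)
print(result)
```

Wrapping the steps in a function only makes the cut position easy to change; the column logic is the same as in the answer above.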
EDIT:
Solution for @Rick Hitchcock from comments:
df['actual_pattern'] = df['actual_pattern'].astype(str)
df['cut_pattern1'] = df['actual_pattern'].str[:4]
df['cut_pattern2'] = df['actual_pattern'].str[4:].replace('','0')
df['binary_cut2'] = df['cut_pattern2'].str.contains('1').astype(int)
print(df)
   Id actual_pattern cut_pattern1 cut_pattern2  binary_cut2
0   1         100101         1001           01            1
1   2          10101         1010            1            1
2   3        1010101         1010          101            1
3   4       00001111         0000         1111            1
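Note that a value such as 00001111 keeps its leading zeros only if the column is already text before astype(str) is applied. The post does not say where the data comes from; assuming it is read from a CSV file, the sketch below (with made-up inline CSV text) shows the difference between the default parsing and passing dtype to read_csv.

```python
import io

import pandas as pd

# Made-up CSV content for illustration
csv_text = "Id,actual_pattern\n1,100101\n4,00001111\n"

# Default parsing infers integers, so 00001111 is read as 1111
as_int = pd.read_csv(io.StringIO(csv_text))

# dtype=str keeps the column as text and preserves leading zeros
as_str = pd.read_csv(io.StringIO(csv_text), dtype={"actual_pattern": str})

first4_int = as_int["actual_pattern"].astype(str).str[:4].tolist()
first4_str = as_str["actual_pattern"].str[:4].tolist()

print(first4_int)
print(first4_str)
```

If the zeros have already been lost, astype(str) cannot bring them back, so the fix has to happen at load time.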