
How can I find the index of a non-ASCII character in a Python string?

Python has string.find() and string.rfind() to get the index of a substring in a string.

There is also re.search(regex, string), which finds the first match of a pattern in a string, but it returns a match object rather than an index :(

So I would like to combine the two: use a regex to search the string and get back the first index as a plain integer, not a match object :b

Example:

string = "abcdeÿÿaaaabbbÿÿcccdddÿÿeeeÿÿ"
print(custom(string))

Result:

>>> 5

I treat anything matching [^\x20-\x7E] as non-ASCII. How can I implement this function?

holy-penguin asked Mar 16 '23 15:03



2 Answers

If you want to combine these two functions, pass the text matched by re.search (group(0)) to find:

>>> g = "abcdeÿÿaaaabbbÿÿcccdddÿÿeeeÿÿ"
>>> import re
>>> g.find(re.search(r'[^\x20-\x7E]',g).group(0))
5

But if you only want the index, the match object returned by re.search has a start() method that returns the index where the match begins:

>>> re.search(r'[^\x20-\x7E]',g).start()
5 

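You can also do it without a regex by comparing each character against the same \x20-\x7E range (testing against string.ascii_letters would wrongly flag digits, spaces, and punctuation):

>>> next(i for i, c in enumerate(g) if not '\x20' <= c <= '\x7E')
5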
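
The non-regex approach above calls next() without a default, so it raises StopIteration when no character qualifies. Here is a minimal sketch that returns -1 instead, assuming you want the same convention as str.find(). The helper name first_outside_range is hypothetical, and the character test assumes the \x20-\x7E range from the question:

```python
def first_outside_range(s):
    # Hypothetical helper: index of the first character outside \x20-\x7E.
    # The second argument to next() is the default returned when the
    # generator is exhausted, so no StopIteration is raised.
    return next(
        (i for i, c in enumerate(s) if not '\x20' <= c <= '\x7E'),
        -1,
    )

print(first_outside_range("abcdeÿÿaaaabbb"))  # 5
print(first_outside_range("plain ascii"))     # -1
```

Returning -1 keeps the result an integer in every case, which seems to be what the question is after.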
Mazdak answered Mar 19 '23 05:03



"MatchObjects" have a start method you can use:

import re

def custom(s):
    # Search for the first character outside the \x20-\x7E range.
    mat = re.search(r'[^\x20-\x7E]', s)
    if mat:
        return mat.start()
    return -1  # no match; same convention as str.find()

string = "abcdeÿÿaaaabbbÿÿcccdddÿÿeeeÿÿ"
print(custom(string))  # 5
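
One caveat worth illustrating: the pattern [^\x20-\x7E] matches anything outside printable ASCII, so control characters such as tab or newline also count as a hit. The sketch below compares it with a stricter pattern, under the assumption that "non-ASCII" means a code point above 127. The names QUESTION_PATTERN, STRICT_NON_ASCII, and first_index are hypothetical:

```python
import re

# Pattern from the question: anything outside printable ASCII.
QUESTION_PATTERN = re.compile(r'[^\x20-\x7E]')
# Assumption: "non-ASCII" strictly means code points above 127.
STRICT_NON_ASCII = re.compile(r'[^\x00-\x7F]')

def first_index(pattern, s):
    # Return the start of the first match, or -1 if there is none.
    mat = pattern.search(s)
    return mat.start() if mat else -1

text = "ab\tcdÿ"
print(first_index(QUESTION_PATTERN, text))  # 2, the tab character
print(first_index(STRICT_NON_ASCII, text))  # 5, the ÿ character
```

Which pattern is right depends on whether control characters should be reported. For the example string in the question, both give 5.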
jedwards answered Mar 19 '23 05:03
