Tell whether a character is a combining diacritic mark

If you're looping through the chars of a unicode string in Python (2.x), say:

ak.sɛp.tɑ̃

How can you tell whether the current char is a combining diacritic mark?

For instance, the last char in the above string is actually a combining mark:

ak.sɛp.tɑ̃ --> ̃

ʞɔıu asked Jan 24 '23 18:01


1 Answer

Use the unicodedata module:

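import unicodedata

# combining() returns the character's canonical combining class as an integer,
# or 0 if no combining class is defined for it.
if unicodedata.combining(u'a'):
    print("is combining character")
else:
    print("is not combining")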
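Applied to the string from the question, the same check might look like the sketch below. The helper name `combining_marks` is made up for illustration, and the string is written with escapes on the assumption that the final tilde is the separate code point U+0303, as the question describes.

```python
from __future__ import print_function  # keeps print() consistent on Python 2 and 3
import unicodedata

def combining_marks(text):
    """Return (index, char, combining_class) for each char with a nonzero combining class."""
    return [(i, ch, unicodedata.combining(ch))
            for i, ch in enumerate(text)
            if unicodedata.combining(ch) != 0]

# The question's string, with escapes so the trailing U+0303 is explicit.
word = u"ak.s\u025bp.t\u0251\u0303"
marks = combining_marks(word)
for index, ch, ccc in marks:
    print(index, unicodedata.name(ch), ccc)
```

Note that a few characters in the Unicode "Mark" categories have a combining class of 0, so if you need to catch those as well, checking whether `unicodedata.category(ch)` starts with `'M'` is a broader test.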

These posts are also relevant:

How do I reverse Unicode decomposition using Python?

What is the best way to remove accents in a Python unicode string?

Joe Koberg answered Feb 02 '23 09:02
