How to search for a Unicode character by its code point in Sublime Text

Tags: regex, unicode, sublimetext

From what I understand, a Unicode character can be represented in several ways, e.g., by its code point or by its encoded bytes written in hex. Under UTF-8, these two are the same only for ASCII characters.
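To make the difference between a code point and its UTF-8 bytes concrete, here is a small Python sketch. The helper name `describe` is made up for illustration, and the space-separated hex output assumes Python 3.8 or later.

```python
def describe(ch: str) -> str:
    """Show a character's code point next to its UTF-8 bytes in hex."""
    code_point = f"U+{ord(ch):04X}"
    # bytes.hex(" ") separates the bytes with spaces (Python 3.8+)
    utf8_hex = ch.encode("utf-8").hex(" ").upper()
    return f"{code_point} -> UTF-8 bytes {utf8_hex}"


print(describe("A"))       # U+0041 -> UTF-8 bytes 41
print(describe("汉"))      # U+6C49 -> UTF-8 bytes E6 B1 89
print(describe("\u200b"))  # U+200B -> UTF-8 bytes E2 80 8B
```

For A, the code point and the single byte agree. For 汉 and the zero-width space, the code point and the byte sequence look nothing alike.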

If I want to search for a visible Unicode character (e.g., 汉), I can just copy it and search. This works even if I do not know its underlying representation. But for characters that are not easily visible, such as the zero-width space, copying does not work well. For these characters, we may want to search by code point instead.

My question

If I know a character's code point, how do I search for it in Sublime Text using a regular expression? I specify Sublime Text because different editors may use different syntax.


asked Dec 13 '17 02:12

jdhao

2 Answers

Zero-width space characters (U+200B) can be found with:

\x{200b}


Non-breaking space characters (U+00A0) can be found with:

\xa0

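As a cross-check outside the editor, the same two characters can be located with a short Python sketch. Python's `re` module does not accept the `\x{200b}` form used in Sublime Text, so the pattern below uses `\u200b` instead. The name `find_invisible` and the sample string are made up for illustration.

```python
import re

# Python's re spells the zero-width space as \u200b (not \x{200b});
# \xa0 works in both Python and Sublime Text.
INVISIBLE = re.compile(r"[\u200b\xa0]")


def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for each zero-width or non-breaking space."""
    return [(m.start(), f"U+{ord(m.group()):04X}") for m in INVISIBLE.finditer(text)]


sample = "foo\u200bbar\xa0baz"
print(find_invisible(sample))  # [(3, 'U+200B'), (7, 'U+00A0')]
```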

answered Nov 09 '22 05:11

CinCout

For a Unicode character whose code point is CODE_POINT (written in hexadecimal), the regular expression \x{CODE_POINT} reliably finds it.

General rules

If the code point fits in two hex digits, \x can be used without curly braces. If the code point needs more than two hex digits, you must use \x followed by the code point in curly braces.

Some examples

For example, to find the character A (U+0041), you can search for either \x{41} or \x41.

As another example, to find 我 (code point U+6211), you have to search for \x{6211} rather than \x6211. The pattern \x6211 does not find 我, because without braces only the first two hex digits are read: it is parsed as \x62 (the letter b) followed by the literal text 11.
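The rule above can be sketched as a small Python helper that builds the braced pattern to paste into Sublime Text's Find panel (with regular expression mode enabled). The function name `sublime_pattern` is hypothetical. The helper only produces the search string and does not run the search, since Python's own `re` module does not understand `\x{...}`.

```python
def sublime_pattern(code_point: int) -> str:
    """Build a \\x{...} search pattern for a code point.

    The braced form is always used, since it is valid for any number of
    hex digits.
    """
    return f"\\x{{{code_point:X}}}"


for ch in ["A", "我", "\u200b", "\u00a0"]:
    print(f"U+{ord(ch):04X}", sublime_pattern(ord(ch)))
# U+0041 \x{41}
# U+6211 \x{6211}
# U+200B \x{200B}
# U+00A0 \x{A0}
```

Always emitting the braces avoids having to decide whether a code point fits in two hex digits.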

answered Nov 09 '22 04:11

jdhao

