Same string but different byte codes

Tags:

ruby

I have two strings:

a = 'hà nội'
b = 'hà nội'

When I compare them with a == b, it returns false.

I checked the byte codes:

a.bytes # => [104, 97, 204, 128, 32, 110, 195, 180, 204, 163, 105]
b.bytes # => [104, 195, 160, 32, 110, 225, 187, 153, 105]

What is the cause? How can I fix it so that a == b returns true?

427

asked Jan 27 '18 03:01

Toàn

1 Answer

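This is an issue with Unicode equivalence. The bytes show that a stores its accented letters in decomposed form (a base letter followed by combining marks, e.g. "a" plus U+0300), while b stores them as single precomposed code points (e.g. U+00E0). The two strings render identically but are different code point sequences, so == returns false.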

In order to compare these strings you need to normalize them, so that they both use the same byte sequences for these types of characters.

a.unicode_normalize == b.unicode_normalize
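As a minimal sketch of the comparison above, the two strings can be rebuilt from explicit \u escapes so that the difference is visible in the source. The escapes are derived from the byte lists in the question; the variable names follow the question, and nothing else is assumed.

```ruby
# Rebuild both strings from explicit code points (derived from the
# byte lists in the question) so the source shows the difference.
a = "ha\u0300 n\u00f4\u0323i" # decomposed: "a" + combining grave, "ô" + combining dot below
b = "h\u00e0 n\u1ed9i"        # precomposed: "à" and "ộ" as single code points

a.bytes # => [104, 97, 204, 128, 32, 110, 195, 180, 204, 163, 105]
b.bytes # => [104, 195, 160, 32, 110, 225, 187, 153, 105]

a == b # => false

# Normalizing both sides to the same form (NFC by default) makes them equal.
a.unicode_normalize == b.unicode_normalize # => true

# Any form works, as long as both sides use the same one.
a.unicode_normalize(:nfd) == b.unicode_normalize(:nfd) # => true
```

If the strings are compared often, it may be simpler to normalize them once when they enter the system (for example before saving them) rather than at every comparison.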

unicode_normalize(form=:nfc)

Returns a normalized form of str, using Unicode normalizations NFC, NFD, NFKC, or NFKD. The normalization form used is determined by form, which is any of the four values :nfc, :nfd, :nfkc, or :nfkd. The default is :nfc.
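To illustrate how the four forms differ, here is a small sketch. The sample characters ("à" and the "ﬁ" ligature, U+FB01) are chosen for illustration only and are not part of the question.

```ruby
composed = "\u00e0" # "à" as one precomposed code point

# NFC composes, NFD decomposes into base letter + combining mark.
composed.unicode_normalize(:nfc).codepoints # => [224]
composed.unicode_normalize(:nfd).codepoints # => [97, 768]

# The "K" (compatibility) forms additionally replace compatibility
# characters, such as the "ﬁ" ligature, with their plain equivalents.
ligature = "\ufb01"
ligature.unicode_normalize(:nfc)  # => "\ufb01" (unchanged)
ligature.unicode_normalize(:nfkc) # => "fi"
```

For an equality check like the one in the question, the default :nfc is usually enough; the compatibility forms change more of the text and are better suited to cases such as search or matching.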

If the string is not in a Unicode Encoding, then an Exception is raised. In this context, 'Unicode Encoding' means any of UTF-8, UTF-16BE/LE, and UTF-32BE/LE, as well as GB18030, UCS_2BE, and UCS_4BE. Anything other than UTF-8 is implemented by converting to UTF-8, which makes it slower than UTF-8.
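The encoding caveat can be sketched as follows. This assumes the exception raised is Encoding::CompatibilityError, which is what current Ruby versions raise here; normalize_for_compare is a hypothetical helper name used only for this example, not part of Ruby.

```ruby
latin = "h\u00e0".encode("ISO-8859-1") # not a Unicode encoding

error = nil
begin
  latin.unicode_normalize
rescue Encoding::CompatibilityError => e
  error = e # normalization is refused for non-Unicode encodings
end

# Hypothetical helper: transcode to UTF-8 first, then normalize.
def normalize_for_compare(str)
  str.encode("UTF-8").unicode_normalize
end

normalize_for_compare(latin) == "h\u00e0" # => true
```

Transcoding to UTF-8 before normalizing also avoids the extra internal conversion mentioned in the documentation for the other Unicode encodings.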

196

answered Oct 19 '22 11:10

fongfan999
