
How to do an ORDER BY ignoring the diacritical marks (use a collation) in Cypher?

I have a list of names, some of which contain characters with diacritics, such as Á and Ê.

For example:

Átila
André
Êlisa
Mercês
Sá

But when I run a simple query like this:

MATCH (p:Person)
RETURN p.Name
ORDER BY p.Name

It returns the names out of alphabetical order, because of the diacritics:

André
Mercês
Sá
Átila
Êlisa

I would like the names returned in alphabetical order regardless of diacritics (pt-BR, Brazilian Portuguese).

I can do this in Microsoft SQL Server:

SELECT Name
FROM Person
ORDER BY Name COLLATE SQL_Latin1_General_CP1_CS_AS

How can I do that in Cypher?

asked Oct 16 '25 13:10 by Tony


1 Answer

DISCLAIMER: I'm the co-founder and CTO of Memgraph.

I would say this falls under proper Unicode support. The openCypher grammar can parse such characters, but how they are stored and later interpreted is an implementation detail. I'm not aware of a clause like COLLATE in openCypher.

Memgraph, for now, just stores and interprets raw bytes, which produces the wrong sort order for accented characters. Your options when using Memgraph are:

  • implement a user-defined function in C/C++/Python/Rust that sorts the items correctly
  • as a quick fix, sort on the client/application side. I'm aware of the problems this can cause (large volumes of transferred data and slow queries), but it may be a quick solution for you.
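To illustrate the client-side option, here is a minimal Python sketch. It assumes the names have already been fetched into a list, and it approximates an accent-insensitive collation by stripping combining marks with the standard unicodedata module. The helper names strip_diacritics and collation_key are hypothetical, and this is not a full pt-BR collation (a locale-aware library would be needed for that). It also reproduces the raw-byte order from the question for comparison.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks removed, e.g. 'Átila' -> 'Atila'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collation_key(name):
    """Hypothetical sort key: accent- and case-folded text first,
    then the original string as a deterministic tie-breaker."""
    return (strip_diacritics(name).casefold(), name)


# Sample data taken from the question.
names = ["Átila", "André", "Êlisa", "Mercês", "Sá"]

# Sorting by raw UTF-8 bytes puts the accented initials last.
raw_order = sorted(names, key=lambda n: n.encode("utf-8"))

# Sorting by the folded key ignores the diacritics.
folded_order = sorted(names, key=collation_key)

print(raw_order)
print(folded_order)
```

The same key function could serve as the starting point for a user-defined function, but how to register it depends on the database's extension API, which is not shown here.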

There is a related GitHub issue. Please add more details there or follow the discussion. At some point, we'll add native capability :) Also, we are looking for contributors!

answered Oct 18 '25 12:10 by buda


