
For the same string, will SQLite's length() ever return a different value than Java's String.length()?

Given the same string data:

  1. SQLite performs the length calculation on its TEXT column.
  2. The TEXT column is read into a Java String (using the Android Room database), then Java calls String.length().

Is there any chance that these two yield different values?

I have done a rough test using English and non-English characters. Both yield the same value.

But I am not sure whether there are any edge cases I have missed.

Cheok Yan Cheng asked Oct 27 '20 16:10



2 Answers

Since you are looking for edge cases...

From SQLite's Built-In Scalar SQL Functions:

length(X)
For a string value X,
the length(X) function returns the number of characters (not bytes) in X
prior to the first NUL character. (emphasis mine)
Since SQLite strings do not normally contain NUL characters,
the length(X) function will usually return the total number of characters in the string X....

So, SQLite, for:

SELECT LENGTH('a' || CHAR(0) || 'b')

will return 1,

but Java, for:

String s = "a" + Character.toString('\0') + "b";
System.out.println("" + s.length());

will return 3.
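If the Java side needs to agree with SQLite on this edge case, one option is to stop counting at the first NUL. The following is only a minimal sketch: the class name `NulLengthSketch` and the helper `lengthBeforeNul` are hypothetical names chosen for illustration, and the helper deals with the NUL cutoff only, nothing else.

```java
public class NulLengthSketch {

    // Hypothetical helper: returns the number of chars before the first
    // NUL character, or the full length when the string contains no NUL.
    static int lengthBeforeNul(String s) {
        int nul = s.indexOf('\0');
        return nul >= 0 ? nul : s.length();
    }

    public static void main(String[] args) {
        String s = "a" + '\0' + "b";
        System.out.println(s.length());          // counts every char, including the NUL
        System.out.println(lengthBeforeNul(s));  // stops at the NUL
    }
}
```

For the string above, `s.length()` is 3 while `lengthBeforeNul(s)` is 1, which matches the SQLite result shown for the same input.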

forpas answered Oct 01 '22 02:10


There are cases where the lengths differ. Java uses UTF-16 for its internal string representation, so characters outside the Basic Multilingual Plane are stored as a surrogate pair. Java's String.length() returns the number of UTF-16 code units, so each such character counts as 2.

A simple example using the 💩 emoji character:

    class HelloWorld {
        public static void main(String[] args) {
            System.out.println("💩".length());
        }
    }

This will print 2.

On the other hand, the SQLite documentation states:

For a string value X, the length(X) function returns the number of characters (not bytes) in X prior to the first NUL character.

It specifies that it counts characters, so the emoji counts once:

sqlite> select length('💩');

This will return 1.

This is not exclusive to emojis. The same applies to any character with a "high" code point (above U+FFFF), which includes some Asian characters.

Tested with SQLite 3.28.0 and OpenJDK version "1.8.0_252". I think it should hold true for your stack.
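On the Java side, `String.codePointCount` counts a surrogate pair as a single character. Below is a minimal sketch; the class name `CodePointLengthSketch` and the helper `codePointLength` are hypothetical, and it assumes well-formed strings (no unpaired surrogates). Whether the result matches SQLite's length() for every input is an assumption, not something verified here.

```java
public class CodePointLengthSketch {

    // Hypothetical helper: counts Unicode code points rather than
    // UTF-16 code units, so a surrogate pair counts once.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F4A9 written as an escaped surrogate pair, to avoid
        // depending on the source file encoding.
        String emoji = "\uD83D\uDCA9";
        System.out.println(emoji.length());          // UTF-16 code units
        System.out.println(codePointLength(emoji));  // code points
    }
}
```

For the emoji, `length()` is 2 and `codePointLength` is 1. For plain ASCII text the two values are equal.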

david-ao answered Oct 01 '22 04:10