Given the same String data, compare two length calculations:
SQLite's length calculation on its TEXT column.
The TEXT column is read into a Java String (using the Android Room database), then Java performs String.length().
Is there any chance that these yield 2 different values?
I have done a rough test using English and non-English characters. Both yield the same value.
But I am not sure whether there are any edge cases I have missed.
Java's String.length() method returns the length of the string as an int. Strictly, this is the number of UTF-16 code units (char values), which equals the number of characters only when every character fits in a single char.
The key difference between Java's length field and Java's length() method is that the length field describes the size of an array, while length() reports the length of a String. length() is a public method of the String class.
Since you are looking for edge cases...
From SQLite's Built-In Scalar SQL Functions:
length(X)
For a string value X,
the length(X) function returns the number of characters (not bytes) in X
prior to the first NUL character. (emphasis mine)
Since SQLite strings do not normally contain NUL characters,
the length(X) function will usually return the total number of characters in the string X....
So SQLite, for:
SELECT LENGTH('a' || CHAR(0) || 'b')
will return 1, but Java, for:
String s = "a" + Character.toString('\0') + "b";
System.out.println("" + s.length());
will return 3.
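If the Java side needs to agree with SQLite on strings that contain a NUL, one option is to stop counting at the first NUL. The following is only a minimal sketch: the class name NulLength and the helper lengthBeforeNul are hypothetical names introduced here, and the helper counts char values, so it mirrors only the NUL rule quoted above.

```java
class NulLength {
    // Hypothetical helper: number of char values before the first NUL (U+0000),
    // or the full length when the string contains no NUL.
    static int lengthBeforeNul(String s) {
        int nul = s.indexOf('\0');
        return nul < 0 ? s.length() : nul;
    }

    public static void main(String[] args) {
        String s = "a" + Character.toString('\0') + "b";
        System.out.println(s.length());         // 3: Java counts the NUL and everything after it
        System.out.println(lengthBeforeNul(s)); // 1: only "a" precedes the NUL
    }
}
```

Whether a NUL can actually reach the TEXT column depends on how the data is written, so this matters mainly for untrusted or binary-like input.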
There are cases where the lengths differ. Java uses UTF-16 for its internal string representation, so characters outside the Basic Multilingual Plane need a surrogate pair (two char values). Java's String.length() counts char values, so each such character counts as 2.
A simple example using the 💩 emoji character:
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("💩".length());
    }
}
This will print 2.
On the other hand, the SQLite documentation states:
For a string value X, the length(X) function returns the number of characters (not bytes) in X prior to the first NUL character.
That is, it counts characters rather than bytes or UTF-16 code units, so:
sqlite> select length('💩');
returns 1.
This is not exclusive to emoji. The same holds for any character with a code point above U+FFFF, which includes some less common CJK ideographs.
Tested with SQLite 3.28.0 and OpenJDK version "1.8.0_252". I think it should hold true for your stack.
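To get a count on the Java side that treats a surrogate pair as one character, String.codePointCount can be used instead of length(). This is a sketch under the assumption that SQLite counts one character per Unicode code point, as the test above suggests. The class name CodePointLength and the helper codePoints are hypothetical, and the emoji is written with \u escapes so the example does not depend on source file encoding.

```java
class CodePointLength {
    // Hypothetical helper: number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String poo = "\uD83D\uDCA9"; // U+1F4A9, stored as a surrogate pair
        System.out.println(poo.length());    // 2: UTF-16 code units
        System.out.println(codePoints(poo)); // 1: code points
    }
}
```

For strings made only of Basic Multilingual Plane characters, both numbers are the same, which is why a rough test with common English and non-English text shows no difference.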