
Why does manually implementing a hash tag give a performance boost to my queries?

In my model I set up an entity (let's say Person) with a string attribute called "name", and I put an index on it. When I run a lot of queries against the model, they turn out to be a performance drain. My query is a simple

  [ NSPredicate predicateWithFormat: @"%K == %@", @"name", lPersonName ];

so I would assume that the index would do its work.

Then, if I calculate a simple hash of the name, store it along with my entity in an indexed integer attribute (called "hash"), and run a narrower query, the performance drain is gone. Like this:

[ NSPredicate predicateWithFormat: @"%K == %d AND (%K == %@)",
                           @"hash", [ self calculateHashForName: lPersonName ],
                           @"name", lPersonName ];
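The question does not show `calculateHashForName:`. As a minimal sketch of what such a helper might compute, assuming a 32-bit FNV-1a hash over the UTF-8 bytes of the name (the function name `hash_for_name` and the choice of FNV-1a are assumptions, not the asker's code), here is a plain C version that could be called from Objective-C:

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit FNV-1a hash of a NUL-terminated byte string.
 * The question never says which hash it uses; FNV-1a is just one simple,
 * well-known choice. */
uint32_t hash_for_name(const char *name)
{
    uint32_t hash = 2166136261u;          /* FNV-1a 32-bit offset basis */
    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++) {
        hash ^= (uint32_t)*p;             /* mix in the next byte */
        hash *= 16777619u;                /* multiply by the FNV prime (wraps mod 2^32) */
    }
    return hash;
}
```

Two different names can produce the same hash, which is why the predicate above still checks `name` after matching on `hash`. Since the predicate formats the value with `%d`, the result would need to be converted to a signed `int` before being passed in.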

Why is the index on the integer so much faster than an index on a string? Am I overlooking something? Is this a Core Data issue?

I can keep the solution with the hash tag, of course, but if I am overlooking something I would love to know about it sooner rather than later.

Kristof Van Landschoot asked Mar 21 '12 15:03



1 Answer

At a low level, computers process integers natively: processors have a built-in data type for integers but none for strings (in ARM and x86 land, anyway).

4000000000 == -123456789 

Can be processed by a computer in 1 instruction, while...

"Abcdefg" == "Abcdefzzzz"

Has to loop through the characters, taking several instructions.

This is fairly generalized, but it gets to the crux of the issue. In short, computers process integers more quickly, and even though strings can be expressed as integers (binary bytes), they are of variable length, which makes them more complex to process.
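To make the difference concrete, here is a small C sketch of the two kinds of comparison. The function names `ints_equal` and `strings_equal` are made up for illustration, and the string version is a simplified byte-wise comparison, not what Core Data or its underlying store actually runs:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width integers: the comparison itself is a single machine-level
 * compare on typical 64-bit hardware. */
bool ints_equal(int64_t a, int64_t b)
{
    return a == b;
}

/* Variable-length strings: the code has to walk both strings byte by byte
 * until it finds a difference or reaches the end of one of them. */
bool strings_equal(const char *a, const char *b)
{
    size_t i = 0;
    while (a[i] != '\0' && b[i] != '\0') {
        if (a[i] != b[i]) {
            return false;                 /* first mismatching byte */
        }
        i++;
    }
    return a[i] == b[i];                  /* equal only if both ended together */
}
```

With the values from the examples above, `strings_equal("Abcdefg", "Abcdefzzzz")` has to inspect seven bytes before it finds the mismatch, while `ints_equal(4000000000, -123456789)` is decided in one comparison.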

Louis Ricci answered Nov 12 '22 01:11
