Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Couchbase internals

One question regarding data fetching approach,

First Approach:

Let say I have two document

userdoc1
{
“status”:“pending”
“usertype”:“VIP”
“userid”:“123”
}

for above document let say my documentid is status::usertype [just to clarify,this document id will be unique in our case ]

userdoc2
{
“userid”:“123”,
“fname”:“abc”,
“lname”:“xyz”,
“age”:20;
“address”:“asdf”
}

for userdoc2, let say userid is my documentid

If i do a get operation i would proceed like this (here idea is to fetch data based on document id)

select userid from userdoc1 with key “pending::VIP”;

and then

select * from userdoc2 with key “123”;

Second Approach:

I have only one document

userdoc
{
“status”:“pending”
“usertype”:“VIP”
“userid”:“123”
“fname”:“abc”,
“lname”:“xyz”,
“age”:20;
“address”:“asdf”
}

Here, documentid is “status::usertype” and we have secondary index on userid

Here if get the data like this(here idea is to fetch data based on secondary index):

select * from userdoc where userd=“123”;

Could you please explain which approach will give high read performance assuming high data load with 100 of nodes in a cluster and XDCR and other factors ?

like image 863
Bifrost Avatar asked Dec 29 '25 04:12

Bifrost


1 Answers

Option 1 is going to have two roundtrips from the client to the server to run two cheap queries. Option 2 is going to have one roundtrip from the client to the server to run one slightly more expensive query.

I can't be completely confident without measuring, but I would bet my money on option 2. Roundtrip costs can be a bitch.

Be sure to use a proper index on userid for option 2 and use a prepared query with the userid as a parameter. That should be the fastest option.

like image 65
Johan Larson Avatar answered Jan 02 '26 18:01

Johan Larson



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!