I'm testing RavenDB for my future projects. Database performance is a hard requirement for me, so I want to be able to tune RavenDB to be at least in SQL Server's performance range. However, my tests show that RavenDB is approximately 10x-20x slower than SQL Server on select queries, even when RavenDB is indexed and SQL Server has no indexes.
I populated the database with 150k documents. Each document has a collection of child elements. The database size is approximately 1 GB, and so is the index size. Raven/Esent/CacheSizeMax is set to 2048 and Raven/Esent/MaxVerPages is set to 128. Here's what the documents look like:
{
  "Date": "2028-09-29T01:27:13.7981628",
  "Items": [
    {
      "ProductId": "products/673",
      "Quantity": 26,
      "Price": {
        "Amount": 2443.0,
        "Currency": "USD"
      }
    },
    {
      "ProductId": "products/649",
      "Quantity": 10,
      "Price": {
        "Amount": 1642.0,
        "Currency": "USD"
      }
    }
  ],
  "CustomerId": "customers/10"
}
using System;
using System.Collections.Generic;

public class Order
{
    public DateTime Date { get; set; }
    public IList<OrderItem> Items { get; set; }
    public string CustomerId { get; set; }
}

public class OrderItem
{
    public string ProductId { get; set; }
    public int Quantity { get; set; }
    public Price Price { get; set; }
}

public class Price
{
    public decimal Amount { get; set; }
    public string Currency { get; set; }
}
Here's the defined index:
from doc in docs.Orders
from docItemsItem in ((IEnumerable<dynamic>)doc.Items).DefaultIfEmpty()
select new { Items_Price_Amount = docItemsItem.Price.Amount, Items_Quantity = docItemsItem.Quantity, Date = doc.Date }
I defined the index using the Management Studio, not from code, by the way (I don't know whether that has any negative or positive effect on performance).
The query below takes 500ms to 1500ms to complete. Note that this is the query execution time shown directly in RavenDB's console, so it does not include HTTP request time or deserialization overhead:
session.Query<Order>("OrdersIndex").Where(o =>
o.Items.Any(oi => oi.Price.Amount > 0 && oi.Quantity < 100)).Take(128).ToList();
I'm running the query on a quad-core i5 CPU running at 4.2 GHz, and the database is located on an SSD.
When I populated the same amount of data in SQL Server Express, with the same schema and the same number of associated objects, SQL Server executed the same query, which includes joins, in 35ms without an index. With an index it takes 0ms :|.
All tests were performed after the database servers were warmed up.
Although I'm still very satisfied with RavenDB's performance, I'm curious whether I am missing something or whether RavenDB is simply slower than a relational database. Sorry for my poor English.
Thanks
UPDATE
Ayende, I tried what you suggested, but when I try to define the index you sent me, I get the following error:
    public Index_OrdersIndex()
    {
        this.ViewText = @"from doc in docs.Orders
select new { Items_Price_Amount = doc.Items(s=>s.Price.Amount), Items_Quantity = doc.Items(s=>s.Quantity), Date = doc.Date }
";
        this.ForEntityNames.Add("Orders");
        this.AddMapDefinition(docs => from doc in docs
            where doc["@metadata"]["Raven-Entity-Name"] == "Orders"
            select new { Items_Price_Amount = doc.Items(s => s.Price.Amount), Items_Quantity = doc.Items(s => s.Quantity), Date = doc.Date, __document_id = doc.__document_id });
        this.AddField("Items_Price_Amount");
        this.AddField("Items_Quantity");
        this.AddField("Date");
        this.AddField("__document_id");
        this.AddQueryParameterForMap("Date");
        this.AddQueryParameterForMap("__document_id");
        this.AddQueryParameterForReduce("Date");
        this.AddQueryParameterForReduce("__document_id");
    }
}
error CS1977: Cannot use a lambda expression as an argument to a dynamically dispatched operation without first casting it to a delegate or expression tree type
Davita, the following index generates ~8 million index entries:
from doc in docs.Orders
from docItemsItem in ((IEnumerable<dynamic>)doc.Items).DefaultIfEmpty()
select new { Items_Price_Amount = docItemsItem.Price.Amount, Items_Quantity = docItemsItem.Quantity, Date = doc.Date }
This one generates far fewer:
from doc in docs.Orders
select new { Items_Price_Amount = doc.Items.Select(s => s.Price.Amount), Items_Quantity = doc.Items.Select(s => s.Quantity), Date = doc.Date }
It can be queried with the same results, but in our tests it turned out to be about twice as fast.
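To make the difference in entry counts concrete, here is a minimal sketch that runs both map shapes over in-memory objects with plain LINQ and counts what each one emits. It does not use RavenDB at all; `SketchOrder`, `SketchItem`, `SampleOrders`, `FanOutEntries` and `CollapsedEntries` are hypothetical names made up for illustration. The fan-out shape emits one entry per item (which, for 150k documents and ~8 million entries, would mean roughly 50 items per order), while the collapsed shape emits one entry per order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Order / OrderItem documents in the question.
public class SketchItem
{
    public decimal Amount { get; set; }
    public int Quantity { get; set; }
}

public class SketchOrder
{
    public DateTime Date { get; set; }
    public List<SketchItem> Items { get; set; } = new List<SketchItem>();
}

public static class IndexEntryCount
{
    // Fan-out shape: one entry per (order, item) pair.
    // DefaultIfEmpty() still yields one entry for an order with no items.
    public static int FanOutEntries(IEnumerable<SketchOrder> orders) =>
        (from o in orders
         from i in o.Items.DefaultIfEmpty()
         select new { Amount = i?.Amount, Quantity = i?.Quantity, o.Date }).Count();

    // Collapsed shape: one entry per order, carrying all item values together.
    public static int CollapsedEntries(IEnumerable<SketchOrder> orders) =>
        (from o in orders
         select new
         {
             Amounts = o.Items.Select(i => i.Amount).ToList(),
             Quantities = o.Items.Select(i => i.Quantity).ToList(),
             o.Date
         }).Count();

    // Three sample orders with 2, 3 and 0 items respectively.
    public static List<SketchOrder> SampleOrders()
    {
        var itemCounts = new[] { 2, 3, 0 };
        return itemCounts.Select(n => new SketchOrder
        {
            Date = new DateTime(2028, 9, 29),
            Items = Enumerable.Range(1, n)
                .Select(k => new SketchItem { Amount = 100m * k, Quantity = k })
                .ToList()
        }).ToList();
    }

    public static void Main()
    {
        var orders = SampleOrders();
        Console.WriteLine(FanOutEntries(orders));    // 6 (2 + 3 + 1 for the empty order)
        Console.WriteLine(CollapsedEntries(orders)); // 3 (one per order)
    }
}
```

The gap between the two counts grows with the average number of items per order, which is why the fan-out index ends up so much larger than the document count.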
The major problem is that you are making several range queries, which are expensive when there is a large number of potential values, and on top of that the query has a large number of actual matches.
Doing an exact match is significantly faster, by the way.
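As a rough illustration of why that is, here is a toy model of a term-based index: a sorted map from each distinct value to the documents that contain it. This is a deliberately simplified sketch and an assumption on my part, not how RavenDB or Lucene is actually implemented; `TermIndexSketch`, `Build`, `Exact` and `LessThan` are hypothetical names. In this model an exact match touches a single term, while a range has to visit every distinct value inside the range and merge all the matching documents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermIndexSketch
{
    // Toy index: distinct value -> ids of the documents containing that value.
    public static SortedDictionary<int, List<int>> Build(IEnumerable<(int DocId, int Quantity)> rows)
    {
        var index = new SortedDictionary<int, List<int>>();
        foreach (var (docId, quantity) in rows)
        {
            if (!index.TryGetValue(quantity, out var postings))
                index[quantity] = postings = new List<int>();
            postings.Add(docId);
        }
        return index;
    }

    // Exact match: a single term lookup.
    public static (int TermsVisited, int Matches) Exact(SortedDictionary<int, List<int>> index, int value) =>
        index.TryGetValue(value, out var postings) ? (1, postings.Count) : (1, 0);

    // Range (value < bound): every distinct term below the bound is visited
    // and its documents are merged into the result set.
    public static (int TermsVisited, int Matches) LessThan(SortedDictionary<int, List<int>> index, int bound)
    {
        int terms = 0;
        var docs = new HashSet<int>();
        foreach (var entry in index.TakeWhile(e => e.Key < bound))
        {
            terms++;
            docs.UnionWith(entry.Value);
        }
        return (terms, docs.Count);
    }

    public static void Main()
    {
        // 1000 documents with quantities 1..100, ten documents per quantity.
        var rows = Enumerable.Range(0, 1000).Select(i => (DocId: i, Quantity: i % 100 + 1));
        var index = Build(rows);

        Console.WriteLine(Exact(index, 26));     // (1, 10)
        Console.WriteLine(LessThan(index, 100)); // (99, 990)
    }
}
```

In the sample data, `Quantity < 100` visits 99 terms and matches 990 of the 1000 documents, which mirrors the situation in the question: a wide range over many distinct values that almost every document satisfies.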
We are still working on ways to try to speed things up.