Stuck on designing the schema for my firebase database

I come from a SQL background, so I've been having trouble designing my NoSQL Firebase schema. I'm used to being able to query for anything with a "WHERE" clause, and that seems harder to do in Firebase (although the performance easily makes up for it!).

I'm storing "track" objects for songs. Each object has key/value pairs such as artist, title, genre, rating, and created (a timestamp), as below:

tracks
|_____ -JPl1zwOzjqoM8xDTFll
          |____ artist: "Bob"
          |____ title: "so long"
          |____ genre: "pop"
          |____ rating: 52
          |____ created: 1403129692781
|
|_____ -JPv7KnVi8ASQJjRDpvh
          |____ artist: "Mary"
          |____ title: "im alright now"
          |____ genre: "rock"
          |____ rating: 70
          |____ created: 1403129692787

The default behaviour on my site will be to list all these tracks, with the most recently added track at the top of the list. I believe I can achieve this by setting each track's $priority to its created value made negative (created * -1).
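The negative-priority idea can be checked offline. This is only a minimal sketch: plain Python dicts stand in for the /tracks node, priorities are assumed to sort in ascending numeric order, and `priority_for` and `newest_first` are hypothetical helpers rather than Firebase calls.

```python
# Plain dicts stand in for the /tracks node; no Firebase client is involved.
tracks = {
    "-JPl1zwOzjqoM8xDTFll": {"artist": "Bob", "title": "so long", "genre": "pop",
                             "rating": 52, "created": 1403129692781},
    "-JPv7KnVi8ASQJjRDpvh": {"artist": "Mary", "title": "im alright now", "genre": "rock",
                             "rating": 70, "created": 1403129692787},
}


def priority_for(track):
    """Negated creation time, so an ascending sort yields newest first."""
    return -track["created"]


def newest_first(tracks):
    """Return track ids ordered by ascending priority (hypothetical helper)."""
    return sorted(tracks, key=lambda track_id: priority_for(tracks[track_id]))


ordered_ids = newest_first(tracks)
```

Here Mary's track has the later created value, so it gets the smaller (more negative) priority and is listed first.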

But in the future, I'd like to be able to filter/query the list by other means, for example:

  • Retrieve all tracks that have a genre of rock, pop, or hip-hop.

  • Retrieve all tracks that have a rating of 80 or higher, and have been added in the last 7 days.
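To make the two filters concrete, here is a rough sketch of what they mean when applied on the client to tracks that have already been loaded. It is not a server-side Firebase query. The ids `t1` to `t4`, the sample values, and the helper names are all made up for illustration.

```python
DAY_MS = 24 * 60 * 60 * 1000
WEEK_MS = 7 * DAY_MS
NOW_MS = 1403129692787  # assumed "current time" for the example

# Hypothetical sample data, shaped like the track objects above.
sample = {
    "t1": {"genre": "rock", "rating": 85, "created": NOW_MS - 1 * DAY_MS},
    "t2": {"genre": "pop", "rating": 90, "created": NOW_MS - 10 * DAY_MS},
    "t3": {"genre": "jazz", "rating": 95, "created": NOW_MS - 2 * DAY_MS},
    "t4": {"genre": "hip-hop", "rating": 70, "created": NOW_MS},
}


def by_genres(tracks, genres):
    """Keep tracks whose genre is any of the requested genres."""
    wanted = set(genres)
    return {tid: t for tid, t in tracks.items() if t["genre"] in wanted}


def top_rated_recent(tracks, now_ms, min_rating=80, window_ms=WEEK_MS):
    """Keep tracks rated at least min_rating and created within the window."""
    cutoff = now_ms - window_ms
    return {tid: t for tid, t in tracks.items()
            if t["rating"] >= min_rating and t["created"] >= cutoff}


genre_matches = by_genres(sample, ["rock", "pop", "hip-hop"])
recent_hits = top_rated_recent(sample, NOW_MS)
```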

How is it possible to achieve this in firebase? My understanding is that there are really only 2 ways to order data:

  1. Through the "ID" value, i.e. the key at "firebaseURL.firebaseio.com/tracks/id", which in my case was generated automatically when I added the track. This is okay (I think), as I have individual track pages that list details, and the URL on my site is something like "www.mysite.com/tracks/-JPl1zwOzjqoM8xDTFll".

  2. By using a $priority, which in my case I've set from the "created" value so that my list is in proper date order.

Given the way I have things set up (and please do let me know if there's a better way), is there a way I can easily query for specific genres, or specific ratings?

I read the blog post "Denormalizing your Data is Normal" (https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html), and I think I understand it. From what Anant describes, one way to achieve what I want might be to create a new object in Firebase for each genre and list that genre's tracks there, like so:

tracks
|______ All
        |_____ -JPlB34tJfAJT0rFT0qI
        |_____ -JPlB32222222222T0qI
        |_____ -JPlB34wefwefFT0qI

|______ Rock
        |_____ -JPlB32222222222T0qI
        |_____ -JPlB34tJfAJT0rFT0qI

|______ Pop
        |_____ -JPlB34wefwefFT0qI

The premise of the blog post was that hard drive space is cheap, but a user's time is not. Thus, it's okay to have duplicate data, as it allows for faster reads.
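The per-genre layout above could be derived from the track objects roughly as follows. This is a sketch over plain Python dicts, assuming each index entry is just the track id mapped to `True`. The helper name `build_genre_index` and the lowercase keys `all`, `rock`, and `pop` are assumptions for the example.

```python
from collections import defaultdict

# Track ids taken from the tree above; the genre values follow that tree.
tracks = {
    "-JPlB34tJfAJT0rFT0qI": {"genre": "rock"},
    "-JPlB32222222222T0qI": {"genre": "rock"},
    "-JPlB34wefwefFT0qI": {"genre": "pop"},
}


def build_genre_index(tracks):
    """Map 'all' and each genre to a dict of {track_id: True} entries."""
    index = defaultdict(dict)
    for track_id, track in tracks.items():
        index["all"][track_id] = True
        index[track["genre"]][track_id] = True
    return dict(index)


genre_index = build_genre_index(tracks)
```

Every new track is written once under `all` and once under its own genre, which is the duplication the blog post argues is acceptable.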

That makes sense, and I wouldn't mind this method. But this would work only if a user wanted to select all tracks from only ONE genre. What if they wanted to get all the tracks from BOTH rock AND pop? Would I have to store another object called Rock&Pop and store a track in there each time someone submits a song of either genre?

genre
|_______pop-rock
         |_________ -JPlB34tJfAJT0rFT0qI (a rock song)
         |_________ -JPlB34wefwefFT0qI (a pop song)
         |_________ -JPlB32222222222T0qI (a rock song)
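For comparison, one possible alternative to a stored pop-rock node is to read each single-genre list and merge the ids on the client. This is offered as an assumption to weigh, not something the blog post prescribes. The sketch uses plain dicts, and `tracks_in_any_genre` is a hypothetical helper.

```python
# Single-genre index, using the ids from the trees above.
genre_index = {
    "rock": {"-JPlB32222222222T0qI": True, "-JPlB34tJfAJT0rFT0qI": True},
    "pop": {"-JPlB34wefwefFT0qI": True},
}


def tracks_in_any_genre(genre_index, genres):
    """Union of the track ids listed under each requested genre."""
    merged = set()
    for genre in genres:
        # A genre with no index node simply contributes nothing.
        merged.update(genre_index.get(genre, {}))
    return merged


pop_or_rock = tracks_in_any_genre(genre_index, ["pop", "rock"])
```

This costs one read per selected genre, but avoids maintaining a separate node for every combination of genres.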

Also, would it make more sense to store the ENTIRE track object or just a reference using the trackid? So for example, under /genre/pop:

Should I store just the reference?
genre
|______ pop
        |______ -JPlB34wefwefFT0qI

Or should I store the entire track?
genre
|______ pop
        |______ -JPlB34wefwefFT0qI
                    |___ artist: "bob"
                    |___ title: "hello"
                    |___ genre: "pop"
                    |___ etc..

Is there a performance difference between the two methods? I'm thinking that maybe the latter one would be faster, as I wouldn't need to query for each individual track for the other details but I just want to be sure.
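The difference between the two layouts can be illustrated by counting lookups in a toy in-memory store. This is a sketch only: `CountingStore` is a stand-in rather than a Firebase client, the paths `genre_refs` and `genre_full` are made-up names, and the counts show round trips, not measured latency.

```python
class CountingStore:
    """Toy in-memory tree that counts reads; a stand-in, not a Firebase client."""

    def __init__(self, data):
        self._data = data
        self.reads = 0

    def get(self, path):
        """Return the node at a slash-separated path and count one read."""
        self.reads += 1
        node = self._data
        for key in path.strip("/").split("/"):
            node = node[key]
        return node


pop_track = {"artist": "bob", "title": "hello", "genre": "pop"}
data = {
    "tracks": {"-JPlB34wefwefFT0qI": pop_track},
    "genre_refs": {"pop": {"-JPlB34wefwefFT0qI": True}},             # references only
    "genre_full": {"pop": {"-JPlB34wefwefFT0qI": dict(pop_track)}},  # full copies
}


def list_by_reference(store, genre):
    """One read for the id list, then one read per referenced track."""
    ids = store.get(f"genre_refs/{genre}")
    return [store.get(f"tracks/{track_id}") for track_id in ids]


def list_embedded(store, genre):
    """A single read returns every track stored under the genre."""
    return list(store.get(f"genre_full/{genre}").values())


ref_store = CountingStore(data)
full_store = CountingStore(data)
by_reference = list_by_reference(ref_store, "pop")
embedded = list_embedded(full_store, "pop")
```

With N tracks in a genre, the reference layout needs 1 + N lookups and the embedded layout needs 1. The embedded layout, however, means every copy must be updated whenever a track changes.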

I've redone my Firebase schema several times already. I've made some improvements, but as my application gets bigger, changing the schema becomes more costly and time-consuming. It'd be nice to get these questions cleared up once and for all before I spend a lot of time reworking the rest of my code to match it again.

Thanks for any help with this, it's very much appreciated. And please let me know if you need additional information.

Isaiah Lee asked Jun 24 '14 16:06



1 Answer

Firebase is rolling out a lot of additions to the query API over the next year. Contextual searching (where foo like bar) is probably never going to be a big hit in real-time data: it's slow and cumbersome.

There is a two-part blog article on SQL queries and their equivalent patterns in Firebase. I'd recommend giving it a read-through. Part 2, in particular, talks about Flashlight.

Why ElasticSearch and a service? Like real-time data storage and synchronization, search is a complex topic with a lot of boilerplate and discoverable complexity. It's easy to write a where clause in SQL and that will get you a ways, but it quickly falls short of user expectations.

ES can be integrated with Firebase in a snap (the Flashlight service took less than 5 minutes to integrate with an app, last time I attempted it), and provides robust and thorough search capabilities.
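As a rough idea of what the question's filters might look like on the ElasticSearch side, the sketch below only builds a query body as a Python dict. It assumes the tracks are indexed with `genre` as an exact-match field and `rating` and `created` as numeric fields, and it uses the `bool`/`filter` form of the query DSL found in recent ElasticSearch versions. How Flashlight expects the request to be submitted is not shown, and `build_track_query` is a hypothetical helper.

```python
WEEK_MS = 7 * 24 * 60 * 60 * 1000


def build_track_query(genres=None, min_rating=None, now_ms=None, window_ms=WEEK_MS):
    """Build an ElasticSearch query body; every filter is optional."""
    filters = []
    if genres:
        # Match tracks whose genre is any of the given values.
        filters.append({"terms": {"genre": list(genres)}})
    if min_rating is not None:
        filters.append({"range": {"rating": {"gte": min_rating}}})
    if now_ms is not None:
        # created is an epoch-millisecond number, so the cutoff is computed here.
        filters.append({"range": {"created": {"gte": now_ms - window_ms}}})
    return {"query": {"bool": {"filter": filters}}}


# "Rating of 80 or higher, added in the last 7 days" from the question.
recent_top_rated = build_track_query(min_rating=80, now_ms=1403129692787)
```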

So until Firebase rolls out some game-changing features around querying, I'd suggest checking out this approach at the start, rather than trying to bolt on search capabilities by another means.

Kato answered Jan 02 '23 06:01
