 

How do I create a lot of sample data for Firestore?

Let's say I need to create a lot of different documents/collections in Firestore, and I need to add them quickly, like copying and pasting JSON. I can't do that with the standard Firebase console, because adding 100 documents would take me forever. Are there any solutions to bulk-create mock data with a given structure in a Firestore DB?

Eriendel asked Apr 29 '18


1 Answer

If you switch to the Cloud Console (rather than Firebase Console) for your project, you can use Cloud Shell as a starting point.

In the Cloud Shell environment you'll find tools like node and python already installed and available. Using whichever you prefer, you can write a script against the server client libraries.

For example in Python:

from google.cloud import firestore
import random

MAX_DOCUMENTS = 100
SAMPLE_COLLECTION_ID = u'users'
SAMPLE_COLORS = [u'Blue', u'Red', u'Green', u'Yellow', u'White', u'Black']

# Project ID is determined by the GCLOUD_PROJECT environment variable
db = firestore.Client()

collection_ref = db.collection(SAMPLE_COLLECTION_ID)

for _ in range(MAX_DOCUMENTS):
    # add() creates a document with an auto-generated ID
    collection_ref.add({
        u'primary': random.choice(SAMPLE_COLORS),
        u'secondary': random.choice(SAMPLE_COLORS),
        u'trim': random.choice(SAMPLE_COLORS),
        u'accent': random.choice(SAMPLE_COLORS)
    })
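If the goal, as in the question, is to paste in an existing JSON structure rather than generate random values, batched writes cut down the round trips considerably. A minimal sketch, assuming a hypothetical users.json file containing a JSON array of document objects:

import json
from google.cloud import firestore

db = firestore.Client()
collection_ref = db.collection(u'users')

# Hypothetical input: a JSON array of objects, one per document
with open('users.json') as f:
    documents = json.load(f)

# A single Firestore batch is limited to 500 operations
BATCH_SIZE = 500
for i in range(0, len(documents), BATCH_SIZE):
    batch = db.batch()
    for doc in documents[i:i + BATCH_SIZE]:
        # document() with no argument creates a reference with an
        # auto-generated ID, the same effect as add() but inside the batch
        batch.set(collection_ref.document(), doc)
    batch.commit()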

While a script like this is the easiest way to get up and running with a static dataset, it leaves a little to be desired. Namely, with Firestore, live dynamic data is needed to exercise its functionality, such as real-time queries. For this task, Cloud Scheduler and Cloud Functions are a relatively easy way to regularly update sample data.

In addition to the sample generation code, you'll specify the update frequency in Cloud Scheduler. For instance, in the image below, */10 * * * * defines a frequency of every 10 minutes using the standard unix-cron format:

Image of frequency settings in Cloud Scheduler
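The function body can reuse the generation code almost verbatim. A minimal sketch, assuming a 1st-gen Python Cloud Function triggered by a Pub/Sub topic that Cloud Scheduler publishes to (the function name and trigger wiring are placeholders):

import random
from google.cloud import firestore

SAMPLE_COLORS = [u'Blue', u'Red', u'Green', u'Yellow', u'White', u'Black']
db = firestore.Client()

def refresh_sample_data(event, context):
    # Invoked once per Pub/Sub message published by Cloud Scheduler,
    # i.e. every 10 minutes with the */10 * * * * schedule above
    db.collection(u'users').add({
        u'primary': random.choice(SAMPLE_COLORS),
        u'secondary': random.choice(SAMPLE_COLORS),
        u'trim': random.choice(SAMPLE_COLORS),
        u'accent': random.choice(SAMPLE_COLORS)
    })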

For non-static data, a timestamp is often useful. Firestore provides a way to have a timestamp from the database server added at write time as one of the fields:

u'timestamp': firestore.SERVER_TIMESTAMP
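For instance, dropped into the add() call from the generation loop above (a sketch reusing the earlier fields):

collection_ref.add({
    u'primary': random.choice(SAMPLE_COLORS),
    u'secondary': random.choice(SAMPLE_COLORS),
    u'timestamp': firestore.SERVER_TIMESTAMP
})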

It is worth noting that timestamps like this will hotspot in production systems if not sharded correctly. Typically, 500 writes/second to the same collection is the maximum you will want, so that the index doesn't hotspot. Sharding can be as simple as each user having their own collection (500 writes/second per user). However, for this example, writing 100 documents every minute via a scheduled Cloud Function is definitely not an issue.
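A sketch of that style of sharding, assuming a hypothetical per-user events subcollection (user_id and the subcollection name are placeholders):

from google.cloud import firestore

db = firestore.Client()

def log_event(user_id, payload):
    # Each user writes into their own subcollection, so the
    # ~500 writes/second guideline applies per user rather than
    # to one shared collection's timestamp index
    db.collection(u'users').document(user_id).collection(u'events').add(payload)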

Dan McGrath answered Nov 09 '22