
Best practices for exporting mongo collections to SQL Server

We are using MongoDB (on Linux) as our main database. However, we need to periodically (e.g. nightly) export some of the collections from Mongo to an MS SQL Server instance to run analytics.

I am thinking about the following approach:

  1. Back up the Mongo database (probably from a replica) using mongodump.
  2. Restore the database onto a Windows machine where Mongo is installed.
  3. Write a custom app to import the collections from Mongo into SQL (possibly handling any required normalization).
  4. Run analytics on the Windows SQL Server installation.

Are there any other "tried and true" alternatives?

Thanks, Stefano

EDIT: for point 4, the analytics is to be run on SQL Server, not Mongo.

Stefano Ricciardi asked Feb 22 '12 09:02





1 Answer

Overall this looks fine, but I can suggest two things:

  1. Skip the backup/restore steps and read the data directly from the Linux MongoDB instance, because backing up and restoring the database will get harder and harder as it grows.
  2. Instead of a custom-made app, use Quartz.NET for the nightly export. It is easy to use and can handle any other scheduled tasks as well.
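As a rough illustration of what a direct export could look like, here is a minimal Python sketch. It is not a definitive implementation: `flatten` and `export_rows` are hypothetical helper names, the table and column names are made up, and `sqlite3` is only a stand-in for SQL Server so the example runs anywhere. In a real job the documents would come from a MongoDB cursor and the connection from a SQL Server DB-API driver.

```python
import sqlite3


def flatten(doc, prefix=""):
    """Flatten nested dicts into one level: {"a": {"b": 1}} -> {"a_b": 1}."""
    row = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into sub-documents and join the key path with "_".
            row.update(flatten(value, prefix=f"{name}_"))
        else:
            row[name] = value
    return row


def export_rows(docs, conn, table, columns):
    """Insert flattened documents into `table` and return the row count.

    `docs` is any iterable of dicts (assumption: in production this would
    be a cursor over the Mongo collection). `conn` is a DB-API connection
    whose driver uses "?" placeholders, as sqlite3 does.
    `table` and `columns` must come from trusted configuration, since they
    are interpolated into the SQL text.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    # Missing fields become NULL because dict.get returns None.
    rows = [tuple(flatten(doc).get(col) for col in columns) for doc in docs]
    cursor = conn.cursor()
    cursor.executemany(sql, rows)
    conn.commit()
    return len(rows)
```

Calling `export_rows(docs, conn, "orders", ["_id", "customer_name", "total"])` on a document like `{"_id": "a1", "customer": {"name": "Ann"}, "total": 9.5}` would write one row with the nested customer name pulled up into its own column. Arrays inside documents are left as-is here; they would normally need child tables.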

I would also suggest looking into newer approaches such as CQRS and event sourcing, which basically let you avoid export tasks altogether. You can just handle messages and store the data in two data sources (Linux MongoDB and Windows SQL Server) in near real time, with a small delay, or even analyze the data from the messages and store the results in MongoDB.
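To make the message-handling idea a bit more concrete, here is a small Python sketch under heavy assumptions: `OrderPlaced`, `DocumentProjection`, `SqlProjection` and `publish` are all hypothetical names, a plain dict stands in for MongoDB, and `sqlite3` stands in for SQL Server. A real system would use a message bus and would have to deal with retries and ordering, which this leaves out.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event carrying everything both stores need."""
    order_id: str
    customer: str
    total: float


class DocumentProjection:
    """Stand-in for the MongoDB side: one nested document per order."""

    def __init__(self):
        self.orders = {}

    def handle(self, event):
        self.orders[event.order_id] = {
            "customer": {"name": event.customer},
            "total": event.total,
        }


class SqlProjection:
    """Stand-in for the SQL Server side: one flat row per order."""

    def __init__(self, conn):
        self.conn = conn
        cursor = conn.cursor()
        cursor.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(order_id TEXT PRIMARY KEY, customer TEXT, total REAL)"
        )
        conn.commit()

    def handle(self, event):
        cursor = self.conn.cursor()
        cursor.execute(
            "INSERT INTO orders (order_id, customer, total) VALUES (?, ?, ?)",
            (event.order_id, event.customer, event.total),
        )
        self.conn.commit()


def publish(event, handlers):
    """Deliver one event to every handler, in order."""
    for handler in handlers:
        handler.handle(event)
```

Each store is updated by its own handler as events arrive, so the analytics tables stay close to current without a nightly export job.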

Andrew Orsich answered Nov 10 '22 11:11
