Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is a cluster in MongoDB?

Tags:

mongodb

I am someone new to mongoDB and has absolutely no knowledge regarding databases so would like to know what is a cluster in MongoDB and what is the point of connecting to one in MongoDB? Is it a must to connect to one or can we just connect to the localhost?

like image 348
Ong Kong Tat Avatar asked Apr 17 '17 06:04

Ong Kong Tat


People also ask

Is cluster and database same?

"Cluster" is a set of hosts. In a sharded cluster the hosts have different roles. "Database" is a group of collections, you could compare a "database" to "schema" in Oracle.

What is MongoDB Sharded cluster?

A MongoDB sharded cluster consists of the following components: shard: Each shard contains a subset of the sharded data. As of MongoDB 3.6, shards must be deployed as a replica set. mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.

How MongoDB tracks the clusters data?

The mongos tracks what data is on which shard by caching the metadata from the config servers. The mongos uses the metadata to route operations from applications and clients to the mongod instances. A mongos has no persistent state and consumes minimal system resources.

What are the different types of MongoDB clusters?

There are two different distributed configurations of MongoDB. The first is a “replica set”, where several servers carry the exact same data, to protect against failure. The second is a “sharded cluster”, where several servers each carry only a fragment of the whole data, to achieve higher performance and carry larger data sets.

What is a MongoDB sharded cluster?

Each node is a member of a shard (which is a replicaset, see below for the explanation) and the data are separated on all shards. This is the representation of a mongodb sharded cluster from the official doc. If you are starting with mongodb, I do not recommend you to shard your data.

How do I create a cluster in MongoDB Atlas?

A cluster is a container for your databases, and you can create several databases inside it. Head over to the MongoDB Atlas platform. Tap Sign In at the top-right. Then follow the onscreen instructions to create an account. You might want to use the Google authentication option to make this easier. Accept the privacy policy when asked.

What is Mongos in MongoDB?

Mongos is a query router; it acts as an intermediary between the client and the server, or the shard that the data resides in. Apart from routing, it can also handle load balancing for all query and write operations to the shards. Finally, a metadata service called config servers (configsvr) is also deployed.


2 Answers

A mongodb cluster is the word usually used for sharded cluster in mongodb. The main purposes of a sharded mongodb are:

  • Scale reads and writes along several nodes
  • Each node does not handle the whole data so you can separate data along all the nodes of the shard. Each node is a member of a shard (which is a replicaset, see below for the explanation) and the data are separated on all shards.

This is the representation of a mongodb sharded cluster from the official doc.

mongodb sharded cluster

If you are starting with mongodb, I do not recommend you to shard your data. Shards are way more complicated to maintain and handle than replicasets are. You should have a look at a basic replicaset. It is fault tolerant and sufficient for simple needs.

The ideas of a replicaset are :

  • Every data are repartited on each node
  • Only one node accept writes

A replicaset representation from the official doc

enter image description here

For simple apps there is no problem to have your mongodb cluster on the same host than your application. You can even have them on a single member replicaset but you won't be fault tolerant anymore.

like image 119
Julien TASSIN Avatar answered Oct 14 '22 00:10

Julien TASSIN


Atlas-managed MongoDB deployments, or “clusters,” can be either a replica set or a sharded cluster.

replica set: a cluster of MongoDB servers that implements replication and automated failover. (Failover is a method of protecting computer systems from failure, in which standby equipment automatically takes over when the main system fails.)

replication: a feature allowing multiple database servers to share the same data, thereby ensuring redundancy and facilitating load balancing. (Redundancy is a system design in which a component is duplicated so if it fails there will be a backup.)

sharded cluster: the set of nodes comprising a sharded MongoDB deployment. A sharded cluster consists of config servers, shards, and one or more mongos routing processes.

sharding: a method for distributing data across multiple machines.

A sharded cluster consists of ten servers: one server for the mongos [interface between the client applications and the sharded cluster] and three servers each for the first replica set, the second replica set, and the config server replica set

shard: a single mongod instance or replica set that stores some portion of a sharded cluster’s total data set. In production, all shards should be replica sets.

Sources: MongoDB docs: Glossary, Create a Cluster, Convert a Replica Set to a Sharded Cluster

like image 21
nCardot Avatar answered Oct 13 '22 23:10

nCardot