MongoDB is a popular NoSQL database for “cloud native” applications that don’t require the strict consistency guarantees of ACID-compliant DBMS such as MySQL or Postgres.
An ACID-compliant DBMS uses a transactional model: each CRUD operation (Create, Read, Update, Delete) must be acknowledged as committed before the next transaction proceeds. While a transactional database is necessary for some applications, such as those handling financial transactions, many web applications do not require a perfectly consistent view of their state at all times.
It is sufficient for everyday applications such as social networks or chat apps to be “eventually consistent,” so long as the network latency between the database nodes is kept to a minimum. MongoDB is a database with these characteristics, and some of the largest web-scale applications use it to scale almost limitlessly, serving millions or even billions of users a day.
MongoDB scales more readily than a traditional RDBMS because its data is stored in documents, as opposed to a schema of tables, rows, and relations (primary and foreign keys). A Mongo database can more easily be “sharded” between nodes when it grows very large, meaning that no single node needs to store the complete database.
A production MongoDB should be deployed as a replica set of no fewer than three (3) replicas. One (1) Mongo node is the primary, accepting writes from the application, and the writes are replicated to the other two (2) Mongo nodes, which serve as secondaries. If the primary fails, one of the secondaries is elected to take its place as primary, and it continues to replicate data to the remaining secondary for redundancy.
As long as no more than one (1) out of three (3) nodes fails at any given time, consensus is maintained. When the failed primary is ready to rejoin the replica set, the data is copied to it from the other two (2) nodes.
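The one-out-of-three tolerance follows from MongoDB's majority-based elections: a replica set keeps a primary only while a strict majority of its voting members are reachable. A quick sketch of the arithmetic (the function name here is just illustrative):

```shell
#!/usr/bin/env bash
# A replica set of N voting members tolerates floor((N - 1) / 2) failures
# while still leaving a majority able to elect a primary.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

tolerated_failures 3   # 1 failure tolerated (2 of 3 remain, a majority)
tolerated_failures 5   # 2 failures tolerated (3 of 5 remain)
tolerated_failures 4   # still only 1: an even member count buys nothing
```

This is why replica sets are deployed with an odd number of members: going from three (3) nodes to four (4) adds cost without increasing fault tolerance.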
Many Docker Compose files for various applications rely on one Mongo container, which may be adequate for testing, but does not provide any redundancy or fault tolerance. Fortunately, it is easy to establish a Mongo replica set by spinning up a temporary Docker container that runs a shell script to join the other Mongo containers into a cluster.
This is the setup we use to deploy RocketChat using Docker, one of the applications that we support for our customers. Here is an excerpt from the docker-compose.yml file. It creates three (3) Mongo containers, named mongo1, mongo2, and mongo3 respectively, and adds them to a replica set, rs0.
version: "3"
services:
  mongo1:
    hostname: mongo1
    container_name: mongo1
    image: mongo:3.6-jessie
    expose:
      - 27017
    restart: always
    volumes:
      - mongo1:/data/db
    entrypoint: [ "/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs0" ]
  mongo2:
    hostname: mongo2
    container_name: mongo2
    image: mongo:3.6-jessie
    expose:
      - 27017
    restart: always
    volumes:
      - mongo2:/data/db
    entrypoint: [ "/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs0" ]
  mongo3:
    hostname: mongo3
    container_name: mongo3
    image: mongo:3.6-jessie
    expose:
      - 27017
    restart: always
    volumes:
      - mongo3:/data/db
    entrypoint: [ "/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs0" ]
  mongosetup:
    image: mongo:3.6-jessie
    links:
      - mongo1:mongo1
      - mongo2:mongo2
      - mongo3:mongo3
    depends_on:
      - mongo1
      - mongo2
      - mongo3
    volumes:
      - .:/scripts
    restart: "no"
    entrypoint: [ "bash", "/scripts/mongo_setup.sh" ]
volumes:
  mongo1:
  mongo2:
  mongo3:
This script, mongo_setup.sh, should be located in the same directory as the docker-compose.yml file so that it can be bind mounted into the /scripts/ directory of the mongosetup container.
The mongosetup container starts after the other Mongo containers because depends_on: is specified for the service, and the script keeps retrying until all of the other containers have been added to the replica set. Because mongo1 is given "priority": 2 while mongo2 and mongo3 have "priority": 0, mongo1 becomes the primary of the replica set.
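The contents of mongo_setup.sh are not reproduced above, but a minimal sketch of what such a script might look like, assuming the container hostnames and priorities described here, is:

```shell
#!/bin/bash
# Hypothetical sketch of mongo_setup.sh, run inside the mongosetup container.
# It waits for mongo1 to accept commands, then initiates replica set rs0.

# Retry until the mongo1 daemon responds to a ping.
until mongo --host mongo1:27017 --eval 'db.adminCommand("ping")' >/dev/null 2>&1; do
  echo "Waiting for mongo1 to come up..."
  sleep 2
done

# Initiate the replica set: mongo1 gets priority 2 so that it is elected
# primary; mongo2 and mongo3 get priority 0.
mongo --host mongo1:27017 <<'EOF'
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1:27017", priority: 2 },
    { _id: 1, host: "mongo2:27017", priority: 0 },
    { _id: 2, host: "mongo3:27017", priority: 0 }
  ]
});
EOF
```

This is a sketch of the technique rather than the exact script shipped with the stack; it only runs meaningfully inside the mongosetup container, where the mongo shell and the mongo1 hostname are available.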
Adapting the Compose file for Multi-Node Deployment with Docker Swarm
Thanks to Docker’s virtual networking, the Mongo containers can automatically reach each other by hostname – without any further configuration. This example assumes that all three of the Mongo nodes are running on a single host, but the Compose file above can be easily adapted using deployment constraints and labels to deploy them to separate hosts in Docker Swarm, where they communicate over an overlay network.
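An application such as RocketChat would then reach the replica set through a standard MongoDB connection string that lists all three hostnames; for example (the rocketchat database name here is illustrative):

```
mongodb://mongo1:27017,mongo2:27017,mongo3:27017/rocketchat?replicaSet=rs0
```

Listing every member lets the driver discover the current primary and fail over automatically when a new primary is elected.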
The depends_on: keys would also need to be removed, as those keys are ignored in a Docker Swarm stack deployment. If all of the containers are attached to the same custom Docker network, they should be able to communicate with each other using their service names as hostnames. The mongosetup container would also need to be constrained to the host where the mongo_setup.sh script resides, probably one of your manager nodes.
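A minimal sketch of such a Swarm adaptation might look like the following. The node labels (mongo == 1, and so on), the network name, and the /opt/scripts path are all illustrative assumptions, not values from the stack above:

```yaml
services:
  mongo1:
    image: mongo:3.6-jessie
    networks:
      - mongo-net
    volumes:
      - mongo1:/data/db
    entrypoint: [ "/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs0" ]
    deploy:
      placement:
        constraints:
          - node.labels.mongo == 1
  # mongo2 and mongo3 follow the same pattern, constrained to
  # node.labels.mongo == 2 and node.labels.mongo == 3 respectively.
  mongosetup:
    image: mongo:3.6-jessie
    networks:
      - mongo-net
    volumes:
      - /opt/scripts:/scripts   # host path where mongo_setup.sh lives
    entrypoint: [ "bash", "/scripts/mongo_setup.sh" ]
    deploy:
      restart_policy:
        condition: none
      placement:
        constraints:
          - node.role == manager

networks:
  mongo-net:
    driver: overlay
```

The labels themselves would be applied beforehand with docker node update --label-add mongo=1 &lt;node&gt;, pinning each replica (and its data volume) to a specific host.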
In the example above, the data is persisted on the host using the Docker named volumes mongo1, mongo2, and mongo3, mounted into the /data/db directory of each Mongo container – the default data directory for Mongo. The data is actually stored under /var/lib/docker/volumes/ on the host, so it survives the deletion (and re-creation) of the container.
If you want expert assistance with deploying an application on Docker or Docker Swarm that uses MongoDB as a backend, our team is ready to help. We take the complexity out of using Docker to deploy applications by setting up any desired stack on the bare-metal or cloud environment of your choice.
The beauty of Docker is that it makes applications portable between environments, making it easy to move from development through QA to production. MongoDB takes to being containerized particularly well compared to traditional databases. While we still usually run databases such as MySQL or Postgres outside of Docker, we would not hesitate at all to containerize MongoDB and manage it using Docker’s tools.