I'm running MongoDB on an AWS EC2 instance. The data, log, and journal files are stored on a separate volume, formatted as XFS. Currently we stop the MongoDB instance to take a snapshot, but according to https://docs.mongodb.com/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2/ there is apparently no need to stop the service during the snapshot since journaling is enabled. Am I correct? Can I create a consistent snapshot even while the service is running?

  • You may have missed an important note on that documentation page, after the end of the "Backup with --journal" description: "Snapshotting with the journal is only possible if the journal resides on the same volume as the data files, so that one snapshot operation captures the journal state and data file state atomically." If your data and journal are stored on separate volumes, you cannot take a consistent snapshot of an active deployment and would have to use fsyncLock or stop the mongod process.
    – Stennie
    Commented Sep 5, 2018 at 2:48
  • @Stennie Actually, the question states that data, log, and journal are all stored on one separate volume.
    Commented Sep 5, 2018 at 3:29

1 Answer

In general, do not trust any backup procedure until you have confirmed the integrity of a restore from long-term media.


You already have the capability to take an online backup at the storage layer, in this case with EBS snapshots or Linux LVM. The problem is getting the database into a consistent state.

An online backup is possible with or without a journal. In either case, MongoDB's way to suspend database writes is fsync and lock (db.fsyncLock()), as described in that tutorial.

Without a journal, it is difficult to tell which data is durable on disk and which is buffered and not yet committed. fsync and lock establishes a point in time and blocks any further writes until the backup is done.

The lock is also needed with multiple disks, where (on this storage system) the snapshots are not consistent with each other. Suspending writes for the duration of the backup means that disk /dev/sdf will not be captured at a slightly different point in time than /dev/sdg.
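As a rough illustration of the lock, snapshot, unlock sequence described above, here is a minimal Python sketch. It assumes `mongo_client` behaves like a `pymongo.MongoClient` and `ec2_client` like a boto3 EC2 client; the function names `writes_locked` and `snapshot_volumes` and the default description are made up for this example and are not part of the tutorial. It has not been tested against a production deployment, so treat it as a starting point only.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(mongo_client):
    """Flush pending writes and block new ones until the block exits."""
    # Assumption: mongo_client is a pymongo.MongoClient-like object.
    # The "fsync" command with lock=True is what db.fsyncLock() issues.
    mongo_client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if a snapshot call fails.
        mongo_client.admin.command("fsyncUnlock")


def snapshot_volumes(mongo_client, ec2_client, volume_ids,
                     description="mongodb backup"):
    """Start one EBS snapshot per volume while writes are suspended."""
    snapshot_ids = []
    with writes_locked(mongo_client):
        for volume_id in volume_ids:
            # Assumption: ec2_client is a boto3 EC2 client. The call
            # returns once the snapshot has been started, not completed.
            response = ec2_client.create_snapshot(
                VolumeId=volume_id, Description=description
            )
            snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

The `try`/`finally` is the important design choice here: if a snapshot request fails, the database should not be left locked against writes.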

The MongoDB documentation says that if you have only a single disk and journaling is enabled, you don't need to fsync and lock. Presumably, the EBS snapshot is a good enough crash-consistent point in time, and journal recovery at startup can fix up any incomplete writes.
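Since the single-disk case depends on the journal and data files living on the same volume, it may be worth checking that before relying on it. The sketch below compares the device ID that `os.stat` reports for each path; the function name `on_same_filesystem` and the example paths are hypothetical. Note the limitation: it only shows that the paths share one mounted filesystem, which is not the same as one EBS volume if LVM or RAID spans several volumes underneath.

```python
import os


def on_same_filesystem(*paths):
    """Return True if every path lives on the same mounted filesystem."""
    # st_dev identifies the device that contains each path.
    devices = {os.stat(path).st_dev for path in paths}
    return len(devices) == 1
```

For example, `on_same_filesystem("/data/db", "/data/db/journal")` would be the check for a data directory at `/data/db`; substitute whatever dbPath your mongod actually uses.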

  • Agree completely with this, particularly that you need to do a restore test regularly. I personally would do (and actually do) a little more: 1) I would have a monthly process to stop, snapshot, and start the database. 2) I would have a process that periodically (nightly for me) exports the data, compresses it, then stores the backup somewhere outside of AWS. I trust AWS plenty, but human error is infinite. I also store other assets using an incremental backup system.
    – Tim
    Commented Sep 4, 2018 at 9:10
  • You don't need to take offline backups if you have sufficiently verified online backups. But offline backups are simpler. Also: "restore test from long-term media" is open to interpretation. It could be restored to another storage account/region/cloud provider to demonstrate you can export to a different failure domain.
    Commented Sep 4, 2018 at 12:42
