Mongo on AWS

Here are some best practices when running Mongo on AWS:

  • Always use XFS or Ext4 filesystems when running Mongo.  Both support consistent snapshots for backups.
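    • Example of formatting a data volume as XFS (a sketch – /dev/xvdf is a placeholder device name, confirm yours with lsblk): sudo mkfs.xfs /dev/xvdf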
  • Use SSD volumes – General Purpose SSD (GP2) or Provisioned IOPS SSD (IO1).  Magnetic volumes provide far fewer IOPS than SSDs.
  • For best performance, use High I/O (I2) or Dense-storage (D2) instances.  The more RAM the merrier – a larger working set stays in memory instead of hitting disk.
  • Use EBS-Optimized instances.  These instances use an “optimized configuration stack and provides additional, dedicated capacity for Amazon EBS I/O” (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html).
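    • Example launch with the AWS CLI (a sketch – the AMI ID and instance type are placeholders): aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type m4.xlarge --ebs-optimized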
  • Mount drives used by Mongo with atime and diratime updates disabled (the noatime option covers both).  You can mount the disk manually or add an entry to /etc/fstab to keep the mount options persistent across restarts:
    • sudo mount -o remount,noatime,noexec /var/lib/mongo
    • /etc/fstab: /dev/xvde1 /var/lib/mongo/ ext4 noatime,noexec 0 2
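    • Verify the active mount options with: mount | grep /var/lib/mongo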
  • Raise the ulimits for Mongo. The defaults are usually too low for a production Mongo server. You can view current limits with ulimit -a. Recommended settings are (https://docs.mongodb.org/v2.4/reference/ulimit/):
    • -f (file size): unlimited
    • -t (cpu time): unlimited
    • -v (virtual memory): unlimited
    • -n (open files): 64000
    • -m (memory size): unlimited
    • -u (processes/threads): 64000
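    • To keep these settings across restarts, you can add entries to /etc/security/limits.conf (a sketch, assuming mongod runs as the mongod user):
    • mongod soft nofile 64000
    • mongod hard nofile 64000
    • mongod soft nproc 64000
    • mongod hard nproc 64000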
  • Set the block device read-ahead to 32 (measured in 512-byte sectors, i.e. 16 KB).  Example (assuming /dev/xvdfe is your block device for the mount – run lsblk to confirm):
    • sudo blockdev --setra 32 /dev/xvdfe
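    • Verify the new value with: sudo blockdev --getra /dev/xvdfe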
  • Pre-warm snapshots.  This only applies if your volume was created from a snapshot.  When you create an AWS EBS volume restored from a snapshot, it is not initialized (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html).  This causes “an increase in the latency of an I/O operation the first time each block is accessed”.  You can touch all blocks on a freshly restored EBS volume by running:
    • sudo fio --filename=/dev/xvde --rw=randread --bs=128k --iodepth=32 --ioengine=libaio --direct=1 --name=volume-initialize
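    • Alternatively, a plain dd read over the whole device also works (slower than fio, but needs no extra packages): sudo dd if=/dev/xvde of=/dev/null bs=1M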
  • Pre-warm documents into memory.  If you are promoting a SECONDARY to PRIMARY, the working set should be pre-warmed first.  If not, you will see I/O degradation, with faults climbing in mongostat until all the data has been read from disk.  Ideally you want all documents in memory so queries never touch disk (disk I/O is much slower than RAM access).  You can pre-warm Mongo documents by replaying live production queries to load them into memory, or by leaving a fresh instance as a SECONDARY for a day or two before promoting it to PRIMARY.
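    • Example using the touch command (a sketch – it loads a collection’s data and indexes into memory on the MMAPv1 engine of this era; “mycollection” is a placeholder name): db.runCommand({ touch: "mycollection", data: true, index: true })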
  • Run each replica-set node in a different Availability Zone (AZ).

Above: image from the “Mongo on AWS” whitepaper.

Set Up a Simple Test Mongo Replica Set

Use the hostname Linux command to determine your hostname.

Example: hostname = test.hostname.com

mkdir /tmp/mongo1
mkdir /tmp/mongo2
mkdir /tmp/mongo3

mongod --port 27021 --dbpath /tmp/mongo1 --fork --logpath /tmp/mongod1.log --replSet rs0
mongod --port 27022 --dbpath /tmp/mongo2 --fork --logpath /tmp/mongod2.log --replSet rs0
mongod --port 27023 --dbpath /tmp/mongo3 --fork --logpath /tmp/mongod3.log --replSet rs0
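
Connect to the first node with the mongo shell, then enter the replica set configuration shown below:

mongo --port 27021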

rsconf = {
  _id: "rs0",
  members: [
    {
      _id: 0,
      host: "test.hostname.com:27021"
    }
  ]
}

rs.initiate( rsconf );
rs.conf();
rs.add("test.hostname.com:27022");
rs.add("test.hostname.com:27023");
rs.status();

Other mongo commands:

db.printSlaveReplicationInfo();

In order to read from the secondaries, you need to run rs.slaveOk() on the connection.  Otherwise you will see:

Thu Feb 26 16:05:19 uncaught exception: count failed: { "errmsg" : "not master", "ok" : 0 }
SECONDARY> rs.slaveOk()
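
Once rs.slaveOk() has been run, reads on that connection (for example, db.mycollection.count() – the collection name here is a placeholder) will succeed on the SECONDARY.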