
At Spotinst, we drink our own champagne before releasing anything to customers. As a cost-centric company, we try hard to deploy every possible service on Spot Instances using Elastigroup.
This week I had the opportunity to chat with my colleague Alex Friedman, who leads our DevOps team, about how we run our massive Elasticsearch clusters on Spot Instances. We rely on Elasticsearch's built-in high-availability features, store the data and cluster state on EBS volumes, and use Elastigroup for smooth EC2 Spot migration and management. This combination dramatically cuts our costs, lets us scale faster, and even makes us more highly available. (I don't know how many of you have lost Elasticsearch nodes and recovered within a few minutes with no performance or data loss. Alex did!)
I asked Alex to put together a guide on how we use it internally, and I'm happy to share it with you.
Let's jump into the details. Here are the high-level steps we'll discuss:
- Deploy an Elastigroup for the Elasticsearch master nodes. (Start cerebro for monitoring on one of the master servers.)
- Deploy an Elastigroup for the Elasticsearch data nodes. (Use the IPs of the master nodes in the config.)
Check out our guide for running Elasticsearch on Kubernetes
Elastigroup for Elasticsearch Master Nodes
Create a new Stateful Elastigroup for the Master Nodes
Important Configurations
AZs/Subnets: Be aware of cross-AZ data transfer costs when choosing multiple AZs, especially if ES replication is going to be enabled.
Instance type: Memory-optimized types such as R3 and R4 are preferable due to Elasticsearch's high memory consumption.
AMI: In this example I use Amazon Linux 2 LTS Candidate 2
Elastigroup Stateful Settings: Find available private IPs in your VPC/subnet(s) and use them in the Assign Specific Private IPs option, since we want the master nodes to keep static addresses. (A sketch of the equivalent API configuration follows below.)
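For reference, here is roughly what the stateful section of the group configuration looks like in the Elastigroup API. Treat this as a sketch; the exact field names (shouldPersistBlockDevices, blockDevicesMode, shouldPersistPrivateIp) should be confirmed against the current Elastigroup API documentation:

"strategy": {
  "persistence": {
    "shouldPersistBlockDevices": true,
    "blockDevicesMode": "reattach",
    "shouldPersistPrivateIp": true
  }
}

Persisting the block devices in "reattach" mode is what lets the same EBS data volume follow the node across Spot replacements, while persisting the private IP keeps the discovery addresses stable.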
Master user data:
- Installs Elasticsearch and Java from internet repositories.
- The variable device="/dev/xvdc" should match the EBS deviceName configuration, as explained further below.
#!/usr/bin/env bash

device="/dev/xvdc"
data_path="/var/lib/elasticsearch"
sleep_sec=10
elastic_pckg="elasticsearch-6.2.4-1.noarch"
logical_volume="lv_elastic01"
volume_group="vg_elastic01"
device_lv="/dev/${volume_group}/${logical_volume}"

function install_java {
  echo "Install jdk"
  rpm_file_src="http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm"
  rpm_file_dest="/tmp/jdk-8u131-linux-x64.rpm"
  wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" -O "$rpm_file_dest" "$rpm_file_src"
  rpm -Uvh "$rpm_file_dest"
}

function install_es {
  echo "Install elasticsearch"
  cat > /etc/yum.repos.d/elasticsearch.repo << 'EOF'
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
  yum install $elastic_pckg -y

  echo "Update elasticsearch config"
  cat > /etc/elasticsearch/elasticsearch.yml << 'EOF'
# ======================== Elasticsearch Configuration =========================
# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
cluster.name: es-test
# ------------------------------------ Node ------------------------------------
node.name: ${HOSTNAME}
node.master: true
node.data: false
node.ingest: false
search.remote.connect: false
# ----------------------------------- Paths ------------------------------------
# Path to directory where to store the data (separate multiple locations by comma):
path.data: /var/lib/elasticsearch
# Path to log files:
path.logs: /var/log/elasticsearch
# ---------------------------------- Network -----------------------------------
# Set the bind address to a specific IP (IPv4 or IPv6):
network.host: 0.0.0.0
# --------------------------------- Discovery ----------------------------------
# Pass an initial list of hosts to perform discovery when a new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
discovery.zen.ping.unicast.hosts: ["10.14.11.70", "10.14.11.71", "10.14.11.72"]
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
discovery.zen.minimum_master_nodes: 2
EOF
  ## Update jvm Xmx/Xms according to inst. type
  #grep Xm /etc/elasticsearch/jvm.options
}

function install_cerebro {
  echo "Install Docker"
  kernel_installed=$(uname -r)
  if [[ "$kernel_installed" =~ .*amzn.* ]]; then
    yum install docker-17.06.2ce-1.102.amzn2.x86_64 -y
  else
    echo "Amazon Linux is required, not installing docker/cerebro"
  fi
}

function handle_mounts {
  echo "Wait until we have the EBS attached (new or reattached)"
  ls -l "$device" > /dev/null
  while [ $? -ne 0 ]; do
    echo "Device $device is still NOT available, sleeping..."
    sleep $sleep_sec
    ls -l "$device" > /dev/null
  done
  echo "Device $device is available"

  echo "Check if the instance is new or recycled"
  lsblk "$device" --output FSTYPE | grep LVM > /dev/null
  if [ $? -ne 0 ]; then
    echo "Device $device is new, creating LVM & formatting"
    pvcreate "$device"
    pvdisplay
    vgcreate "$volume_group" "$device"
    vgdisplay
    lvcreate -l 100%FREE -n "$logical_volume" "$volume_group"
    lvdisplay
    mkfs -t ext4 "$device_lv"
  else
    echo "Device $device was reattached"
  fi

  echo "Add entry to fstab"
  UUID=$(blkid "$device_lv" -o value | head -1)
  echo "UUID=$UUID $data_path ext4 _netdev 0 0" >> /etc/fstab

  echo "Make sure mount is available"
  mount -a > /dev/null
  while [ $? -ne 0 ]; do
    echo "Error mounting all filesystems from /etc/fstab, sleeping..."
    sleep 2
    mount -a > /dev/null
  done
  chown -R elasticsearch:elasticsearch "$data_path"
  echo "Mounted all filesystems from /etc/fstab, proceeding"
}

function start_apps {
  echo "Start elasticsearch"
  systemctl start elasticsearch.service
}

function main {
  ## Installations can be offloaded to AMI
  install_java
  install_es
  install_cerebro
  handle_mounts
  start_apps
}

main
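Once the masters are up, it's worth verifying that they discovered each other before launching the data nodes. From any host with access to the cluster (the IP below is one of the example discovery addresses used in the config above):

$ curl -s "10.14.11.70:9200/_cluster/health?pretty"
$ curl -s "10.14.11.70:9200/_cat/nodes?v"

With three master-eligible nodes and minimum_master_nodes set to 2, the health call should respond and _cat/nodes should list all three masters.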
Add an EBS volume with the desired size in the final stage of the Elastigroup creation:
"blockDeviceMappings": [ { "deviceName": "/dev/xvdc", "ebs": { "deleteOnTermination": false, "volumeSize": 50, "volumeType": "gp2" } } ],
Hit “Create” and wait for the masters to launch.
Elastigroup for Elasticsearch Data Nodes
Use the same steps as above, except for a slightly different user-data script due to the differences between master and data nodes. There is no need to maintain specific private IPs for the data nodes.
Data node user-data:
#!/usr/bin/env bash

device="/dev/xvdc"
data_path="/var/lib/elasticsearch"
sleep_sec=10
elastic_pckg="elasticsearch-6.2.4-1.noarch"
logical_volume="lv_elastic01"
volume_group="vg_elastic01"
device_lv="/dev/${volume_group}/${logical_volume}"

function install_java {
  echo "Install jdk"
  rpm_file_src="http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.rpm"
  rpm_file_dest="/tmp/jdk-8u131-linux-x64.rpm"
  wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" -O "$rpm_file_dest" "$rpm_file_src"
  rpm -Uvh "$rpm_file_dest"
}

function install_es {
  echo "Install elasticsearch"
  cat > /etc/yum.repos.d/elasticsearch.repo << 'EOF'
[elasticsearch-6.x]
name=Elasticsearch repository for 6.x packages
baseurl=https://artifacts.elastic.co/packages/6.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
  yum install $elastic_pckg -y

  echo "Update elasticsearch config"
  cat > /etc/elasticsearch/elasticsearch.yml << 'EOF'
# ======================== Elasticsearch Configuration =========================
# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
cluster.name: es-test
# ------------------------------------ Node ------------------------------------
node.name: ${HOSTNAME}
node.master: false
# ----------------------------------- Paths ------------------------------------
# Path to directory where to store the data (separate multiple locations by comma):
path.data: /var/lib/elasticsearch
# Path to log files:
path.logs: /var/log/elasticsearch
# ---------------------------------- Network -----------------------------------
# Set the bind address to a specific IP (IPv4 or IPv6):
network.host: 0.0.0.0
# --------------------------------- Discovery ----------------------------------
# Pass an initial list of hosts to perform discovery when a new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
discovery.zen.ping.unicast.hosts: ["10.14.11.70", "10.14.11.71", "10.14.11.72"]
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
discovery.zen.minimum_master_nodes: 2
EOF
  ## Update jvm Xmx/Xms according to inst. type
  #grep Xm /etc/elasticsearch/jvm.options
}

function handle_mounts {
  echo "Wait until we have the EBS attached (new or reattached)"
  ls -l "$device" > /dev/null
  while [ $? -ne 0 ]; do
    echo "Device $device is still NOT available, sleeping..."
    sleep $sleep_sec
    ls -l "$device" > /dev/null
  done
  echo "Device $device is available"

  echo "Check if the instance is new or recycled"
  lsblk "$device" --output FSTYPE | grep LVM > /dev/null
  if [ $? -ne 0 ]; then
    echo "Device $device is new, creating LVM & formatting"
    pvcreate "$device"
    pvdisplay
    vgcreate "$volume_group" "$device"
    vgdisplay
    lvcreate -l 100%FREE -n "$logical_volume" "$volume_group"
    lvdisplay
    mkfs -t ext4 "$device_lv"
  else
    echo "Device $device was reattached"
  fi

  echo "Add entry to fstab"
  UUID=$(blkid "$device_lv" -o value | head -1)
  echo "UUID=$UUID $data_path ext4 _netdev 0 0" >> /etc/fstab

  echo "Make sure mount is available"
  mount -a > /dev/null
  while [ $? -ne 0 ]; do
    echo "Error mounting all filesystems from /etc/fstab, sleeping..."
    sleep 2
    mount -a > /dev/null
  done
  chown -R elasticsearch:elasticsearch "$data_path"
  echo "Mounted all filesystems from /etc/fstab, proceeding"
}

function start_apps {
  echo "Start elasticsearch"
  systemctl start elasticsearch.service
}

function main {
  ## Installations can be offloaded to AMI
  install_java
  install_es
  handle_mounts
  start_apps
}

main
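After the data-node group launches, the same check from before should now show the data nodes alongside the masters; in the 6.x _cat output the node.role column distinguishes master-eligible (m) from data (d) nodes:

$ curl -s "10.14.11.70:9200/_cat/nodes?v&h=ip,node.role,master,name"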
Post Installation
Cerebro UI
Run Cerebro on one of the master nodes:
$ systemctl start docker.service
$ docker run -d -p 9000:9000 --name cerebro yannart/cerebro:latest
Grant access to Cerebro on port 9000 (for example, by allowing it in the instance's security group).
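Then browse to http://<master-ip>:9000 and, on Cerebro's connect screen, point it at the cluster, e.g. http://10.14.11.70:9200 (any of the master IPs from the discovery list will do).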
Enable Delayed Shard Allocation
An important piece of the story here is enabling delayed shard allocation, for better handling of shard replacement when Spot Instances are replaced: instead of immediately re-replicating the shards of a departed node, Elasticsearch waits for the configured timeout, since the node (and its EBS-backed data) is expected to come back.
curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "7m"
  }
}
'
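The 7m window is presumably chosen to cover the two-minute Spot interruption notice plus the time it takes Elastigroup to launch a replacement, re-attach the EBS volume, and rejoin the node, so the cluster avoids needlessly re-replicating shards that are about to come back. You can confirm the setting and watch it in action (delayed_unassigned_shards appears in the health output while the timer is running):

$ curl -s "localhost:9200/_all/_settings?pretty" | grep -A2 node_left
$ curl -s "localhost:9200/_cluster/health?pretty"

Keep in mind the PUT above only affects indices that exist at the time of the call; indices created later will use the default one-minute timeout unless the setting is applied again or baked into an index template.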