Clustering Redis

Here at adjust we use Redis as a fast in-memory key-value data store. Today I want to show you how we cluster Redis.

Currently Redis runs as a single-threaded process, which means a single Redis instance cannot utilize modern multicore CPUs. This is not a problem if you use Redis as a cache for your little cat blog, but it becomes interesting once you put gigabytes of data into Redis and query it concurrently with dozens of clients. A built-in solution called “Redis Cluster” is planned, but it is still in alpha stage. That is why we use a combination of Redis and nutcracker/twemproxy to shard our dataset over 80 Redis instances.

Initial Redis Setup

As the base for our Redis cluster we use Dell servers from Hetzner, as these machines offer up to 384GB of RAM. We use Gentoo on our servers, which uses OpenRC as its init system, so some of the following instructions are specific to Gentoo and OpenRC. Nevertheless, it should be easy to adapt the instructions to other init systems and package managers and to run the setup on virtual machines for testing purposes. For this article we will assume that you have two servers with 20 instances on each machine, and that the servers have the IP addresses 192.168.0.10 and 192.168.0.11.

Installing Redis

I prefer to run the latest stable version of Redis. As the 2.8 branch of Redis is still masked in Portage, we need to unmask it first.

~ $ echo ">=dev-db/redis-2.8.8 ~amd64" >> /etc/portage/package.accept_keywords/dev-db-redis
~ $ emerge -av dev-db/redis

If for some reason you want to use an older version of Redis, make sure that you use Sentinel from at least the 2.8 branch. Redis Sentinel prior to version 2.8 is known not to work correctly.

Configuring Redis

All of the following steps should be automated with the configuration management software of your choice to avoid subtle errors. But for this article I will stick to little bash loops.

Note that we make the Redis user the owner of the newly created directories and files. This is important, as Redis will rewrite its configs if Sentinel tells it to.

First we will set up the data directories for Redis.

for i in {1..20}
do
    # -p: do not fail if the directory already exists
    mkdir -p /var/lib/redis$i/
    chown redis:redis /var/lib/redis$i
    chmod 750 /var/lib/redis$i
done

Also we want to set up the log directories in a similar way.

for i in {1..20}
do
    # -p: do not fail if the directory already exists
    mkdir -p /var/log/redis$i/
    chown redis:redis /var/log/redis$i
    chmod 750 /var/log/redis$i
done

Now we need to set up the OpenRC conf.d files.

for i in {1..20}
do
    echo 'REDIS_USER="redis"' > /etc/conf.d/redis$i
    echo 'REDIS_GROUP="redis"' >> /etc/conf.d/redis$i
    # double quotes, so that the shell expands $i while writing the file
    echo "REDIS_CONF=\"/etc/redis$i.conf\"" >> /etc/conf.d/redis$i
    echo "REDIS_DIR=\"/var/lib/redis$i\"" >> /etc/conf.d/redis$i
    echo "REDIS_PID=\"/var/run/redis$i/redis.pid\"" >> /etc/conf.d/redis$i
    # single quotes: ${REDIS_CONF} stays literal and is expanded when the init script sources the file
    echo 'REDIS_OPTS="${REDIS_CONF}"' >> /etc/conf.d/redis$i
done

Now we will set up the Redis config. For this example we will only change the necessary parts.

for i in {1..20}
do
    # give each instance its own port (6380 to 6399)
    PORT=$((6379 + i))
    # create a new config file for each instance
    cp /etc/redis.conf /etc/redis$i.conf
    # this is just a search and replace instruction;
    # double quotes let the shell expand $i and $PORT before perl runs
    perl -p -i -e "s|pidfile \"/var/run/redis/redis.pid\"|pidfile \"/var/run/redis$i/redis.pid\"|g" /etc/redis$i.conf
    perl -p -i -e "s|port 6379|port $PORT|g" /etc/redis$i.conf
    perl -p -i -e "s|logfile \"/var/log/redis/redis.log\"|logfile \"/var/log/redis$i/redis.log\"|g" /etc/redis$i.conf
    perl -p -i -e "s|dir \"/var/lib/redis\"|dir \"/var/lib/redis$i\"|g" /etc/redis$i.conf
    chown redis:redis /etc/redis$i.conf
    chmod 750 /etc/redis$i.conf
    # second server only: replicate from the matching master on the first server
    #echo "slaveof 192.168.0.10 $PORT" >> /etc/redis$i.conf
done
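
A search and replace like this fails silently: if a pattern does not match the default config exactly, the file is simply left unchanged. Here is a minimal sketch of a check you could run afterwards. The function name check_instance_conf is made up for this example, and it assumes the quoted paths used in the loop above.

```shell
# check_instance_conf FILE INDEX
# Succeeds only if FILE contains the port, pidfile, logfile and dir lines
# expected for instance INDEX. (Hypothetical helper, not part of Redis.)
check_instance_conf() (
    file=$1
    i=$2
    port=$((6379 + i))
    status=0
    for expected in \
        "port $port" \
        "pidfile \"/var/run/redis$i/redis.pid\"" \
        "logfile \"/var/log/redis$i/redis.log\"" \
        "dir \"/var/lib/redis$i\""
    do
        # -x: match the whole line, -F: treat the pattern as a fixed string
        if ! grep -qxF -- "$expected" "$file"; then
            echo "$file: missing line: $expected" >&2
            status=1
        fi
    done
    exit "$status"
)

# Example:
# for i in $(seq 1 20); do check_instance_conf "/etc/redis$i.conf" "$i"; done
```

Any instance that prints a "missing line" message still carries a default value and would collide with its neighbours.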

Even though we are using a fancy NoSQL in-memory database, we want to be safe in case of a server outage. That is why we want to set up a master/slave configuration. Luckily this is super easy with Redis. Just uncomment the last line in the example above when running it on the second server.
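
Once the instances on both servers are running, you may want to confirm that each slave really follows its master. The sketch below parses the output of INFO replication; the helper name replication_ok is an assumption of this example, while role and master_link_status are the fields Redis reports.

```shell
# replication_ok: reads `INFO replication` output on stdin and succeeds only
# if it describes a slave whose link to its master is up.
# (Hypothetical helper name.)
replication_ok() {
    # INFO terminates its lines with CRLF, so strip the carriage returns first
    tr -d '\r' | awk -F: '
        $1 == "role"               { role = $2 }
        $1 == "master_link_status" { link = $2 }
        END { exit !(role == "slave" && link == "up") }
    '
}

# Example, run on the second server for the first instance:
# redis-cli -p 6380 info replication | replication_ok && echo "redis1 is replicating"
```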

Finally we create an OpenRC init script for each instance, add the instances to the default runlevel and start them.

for i in {1..20}
do
    ln -s /etc/init.d/redis /etc/init.d/redis$i
    rc-update add redis$i default
done
rc -n

Setting up nutcracker/twemproxy

Of course we need to install nutcracker first.

~ $ echo "=net-proxy/nutcracker-0.2.4 ~amd64" >> /etc/portage/package.accept_keywords/nutcracker
~ $ emerge -av nutcracker

Now we set up our initial nutcracker config.

redis_cluster:
  auto_eject_hosts: false
  distribution: ketama
  hash: fnv1a_64
  listen: 0.0.0.0:6379
  preconnect: true
  redis: true
  servers:
    - 192.168.0.10:6380:1 redis_1
    - 192.168.0.10:6381:1 redis_2
    - 192.168.0.10:6382:1 redis_3
    - 192.168.0.10:6383:1 redis_4
    - 192.168.0.10:6384:1 redis_5
    - 192.168.0.10:6385:1 redis_6
    - 192.168.0.10:6386:1 redis_7
    - 192.168.0.10:6387:1 redis_8
    - 192.168.0.10:6388:1 redis_9
    - 192.168.0.10:6389:1 redis_10
    - 192.168.0.10:6390:1 redis_11
    - 192.168.0.10:6391:1 redis_12
    - 192.168.0.10:6392:1 redis_13
    - 192.168.0.10:6393:1 redis_14
    - 192.168.0.10:6394:1 redis_15
    - 192.168.0.10:6395:1 redis_16
    - 192.168.0.10:6396:1 redis_17
    - 192.168.0.10:6397:1 redis_18
    - 192.168.0.10:6398:1 redis_19
    - 192.168.0.10:6399:1 redis_20

Notice the explicit name each node gets (e.g. redis_2). nutcracker uses this name for the key distribution, so it is important. If you omit it, nutcracker hashes on the host:port:weight triplet instead, which means the same keys end up on different nodes as soon as you replace a node with one that has a different address.
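
Typing 20 nearly identical server lines by hand invites typos, so you may prefer to generate them. This is a minimal sketch: the function name nutcracker_servers and the config file name in the usage comment are assumptions of this example, while the ports, weights and node names follow the config above.

```shell
# nutcracker_servers HOST COUNT
# Prints one server entry per Redis instance, starting at port 6380 and
# node name redis_1. (Hypothetical helper, not part of nutcracker.)
nutcracker_servers() (
    host=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf '    - %s:%d:1 redis_%d\n' "$host" "$((6379 + i))" "$i"
        i=$((i + 1))
    done
)

# Example: append the entries to a config that already ends with "servers:"
# (the file name is an assumption):
# nutcracker_servers 192.168.0.10 20 >> /etc/nutcracker/nutcracker.yml
```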

Place this config file in the nutcracker config directory /etc/nutcracker and make sure that this directory is writable by the redis user. This is needed for the automatic failover we will set up in the next step. Afterwards you can add nutcracker to the default runlevel and start it.

~ $ chown -R redis:redis /etc/nutcracker
~ $ rc-update add nutcracker default
~ $ rc -n

If you did everything right, you should be able to use your brand-new Redis cluster.

~ $ redis-cli set 1 1
OK
~ $ redis-cli get 1
"1"

You can now add a bunch of keys and see that they get distributed over all your nodes.
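
One way to see the distribution is to ask every instance directly, bypassing nutcracker, how many keys it holds. The following is a sketch under a few assumptions: keys_per_instance is a made-up name, and the REDIS_CLI variable only exists so that another client binary can be swapped in.

```shell
# keys_per_instance HOST COUNT
# Prints "redis_N <DBSIZE>" for each instance on HOST by connecting to the
# instance's own port instead of going through nutcracker.
# (Hypothetical helper; REDIS_CLI defaults to redis-cli.)
keys_per_instance() (
    host=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        size=$(${REDIS_CLI:-redis-cli} -h "$host" -p "$((6379 + i))" dbsize)
        printf 'redis_%d %s\n' "$i" "$size"
        i=$((i + 1))
    done
)

# Example: write 1000 keys through nutcracker, then count them per instance.
# for n in $(seq 1 1000); do redis-cli set "key:$n" "$n" > /dev/null; done
# keys_per_instance 192.168.0.10 20
```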

Setting up Sentinel

Right now you have several Redis instances working nicely together. But what happens if one node or your whole server crashes? You would have to go through the following list to get your Redis cluster working properly again:

  • notice that something went wrong
  • call SLAVEOF NO ONE on the corresponding slave to your failed master
  • rewrite the nutcracker config to point to your new master
  • restart nutcracker
  • rewrite the config of your new master (slaveof changed)
  • rewrite the config of the failed master
  • resync the failed master as a slave of the new master
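
To make the list more concrete, the sketch below prints the commands behind the main steps for one node without running anything. The function name manual_failover_plan is made up for this example, and the nutcracker init script path is an assumption based on the setup above.

```shell
# manual_failover_plan NAME PORT FAILED_HOST SLAVE_HOST
# Prints (does not execute) the commands for a manual failover of one node.
# (Hypothetical helper for illustration only.)
manual_failover_plan() (
    name=$1 port=$2 failed=$3 slave=$4
    # promote the slave to master
    echo "redis-cli -h $slave -p $port slaveof no one"
    # point nutcracker at the new master and restart it
    echo "# edit the nutcracker config: $name -> $slave:$port:1"
    echo "/etc/init.d/nutcracker restart"
    # once the failed machine is back, attach it as a slave so it resyncs
    echo "redis-cli -h $failed -p $port slaveof $slave $port"
)

# Example:
# manual_failover_plan redis_1 6380 192.168.0.10 192.168.0.11
```

The config files of both instances still have to be edited by hand so that the new roles survive a restart.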

Whew, that's a lot of work. Fortunately, Redis ships with Sentinel, a tool that does all this work for you. Sentinel is a special Redis server that talks to your Redis instances via pub/sub. You only have to tell Sentinel where your masters are; it autodiscovers the other Sentinels and the slaves of your masters through these pub/sub mechanics. That is why you do not have to update your Sentinel config each time you add new slaves or Sentinels. As said above, make sure you use at least the 2.8 branch of Redis for Sentinel.

~ $ echo ">=dev-db/redis-2.8.8 ~amd64" >> /etc/portage/package.accept_keywords/dev-db-redis
~ $ emerge -av dev-db/redis

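Now we can set up the initial config for Sentinel in /etc/sentinel.conf, the path our init script passes to Sentinel. Like the Redis configs, this file must be owned by the redis user because Sentinel rewrites it, and the log directory /var/log/sentinel has to exist and be writable by redis as well.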

port 26379
logfile "/var/log/sentinel/sentinel.log"
pidfile "/var/run/sentinel.pid"
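# make an entry for all nodes
# configs for redis{2..20} are missing
# the last argument is the quorum: the number of Sentinels that must agree
# that the master is down, so you need at least that many Sentinels running
sentinel monitor redis_1 192.168.0.10 6380 6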
sentinel down-after-milliseconds redis_1 10000
sentinel parallel-syncs redis_1 1
sentinel failover-timeout redis_1 180000
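
The entries for the remaining nodes follow the same pattern, so they can be generated. This is a minimal sketch that repeats the timing values from the redis_1 entry above; the function name sentinel_entries is an assumption of this example.

```shell
# sentinel_entries HOST COUNT QUORUM
# Prints the four Sentinel config lines for each master redis_1..redis_COUNT,
# with ports starting at 6380. (Hypothetical helper, not part of Redis.)
sentinel_entries() (
    host=$1
    count=$2
    quorum=$3
    i=1
    while [ "$i" -le "$count" ]; do
        name="redis_$i"
        printf 'sentinel monitor %s %s %d %d\n' "$name" "$host" "$((6379 + i))" "$quorum"
        printf 'sentinel down-after-milliseconds %s 10000\n' "$name"
        printf 'sentinel parallel-syncs %s 1\n' "$name"
        printf 'sentinel failover-timeout %s 180000\n' "$name"
        i=$((i + 1))
    done
)

# Example: append the entries for all 20 masters to the Sentinel config.
# sentinel_entries 192.168.0.10 20 6 >> /etc/sentinel.conf
```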

By default, Sentinel only reconfigures Redis instances, so we need something that enables it to reconfigure and restart nutcracker as well. That is what client-reconfig-script is for. With this option we can add a custom failover script that rewrites nutcracker's config and restarts it. A simple and lightweight solution can be found here. After installing it, we only need to add this line for every node to our Sentinel config file.

sentinel client-reconfig-script redis_1 /usr/bin/failover.pl
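
To give an idea of what such a script has to do, here is a rough shell sketch. It is not the failover.pl mentioned above: the function name repoint_server and the config path in the usage comment are assumptions of this example. Sentinel calls the script with the arguments master-name, role, state, from-ip, from-port, to-ip and to-port.

```shell
# repoint_server CONFIG NAME TO_IP TO_PORT
# Rewrites the server line of node NAME in a nutcracker config so that it
# points at TO_IP:TO_PORT. Weight and node name stay untouched, so the key
# distribution does not change. (Hypothetical helper.)
repoint_server() (
    config=$1 name=$2 to_ip=$3 to_port=$4
    tmp=$(mktemp) || exit 1
    # match "- host:port:weight name" and replace only host and port
    sed -E "s|^([[:space:]]*-[[:space:]]+)[^:[:space:]]+:[0-9]+(:[0-9]+[[:space:]]+$name)[[:space:]]*\$|\\1$to_ip:$to_port\\2|" "$config" > "$tmp" &&
        cat "$tmp" > "$config"   # cat keeps owner and mode of the config
    status=$?
    rm -f "$tmp"
    exit "$status"
)

# In a reconfig script, $1 is the master name, $6 the new IP, $7 the new port:
# repoint_server /etc/nutcracker/nutcracker.yml "$1" "$6" "$7" &&
#     /etc/init.d/nutcracker restart
```

Because the node name is anchored at the end of the line, a failover of redis_1 does not touch redis_10 to redis_19.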

As Gentoo does not ship an init script for Sentinel, we need to place our own at /etc/init.d/sentinel.

#!/sbin/runscript
# Copyright 1999-2012 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

depend() {
    need localmount
}

start() {
    ebegin "Starting Sentinel"
    start-stop-daemon -2 /var/log/sentinel/crash.log --wait 2000 --background --start --exec \
    /usr/sbin/redis-sentinel \
    --make-pidfile --pidfile /var/run/sentinel.pid \
    -u redis -- /etc/sentinel.conf
    eend $?
}

stop() {
    ebegin "Stopping Sentinel"
    start-stop-daemon --stop --exec \
    /usr/sbin/redis-sentinel \
    --pidfile /var/run/sentinel.pid
    # report the result, as start() does
    eend $?
}

If you are interested in what all that means, you can find a short explanation here.

Finally we can add Sentinel to the default runlevel and start it.

~ $ rc-update add sentinel default
~ $ rc -n

Now you should have your own Redis cluster with 20 master nodes and automatic failover up and running.
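
To check that the failover works, you can stop a master and ask Sentinel which address it now considers the master for that node. The sketch below wraps the SENTINEL get-master-addr-by-name command; the function name master_addr and the REDIS_CLI override are assumptions of this example.

```shell
# master_addr NAME
# Asks the local Sentinel (port 26379) for the current master of NAME and
# prints it as host:port. (Hypothetical helper; REDIS_CLI defaults to redis-cli.)
master_addr() {
    # the reply consists of two lines, IP and port; join them with a colon
    ${REDIS_CLI:-redis-cli} -p 26379 sentinel get-master-addr-by-name "$1" |
        tr -d '\r' | paste -sd: -
}

# Example:
# master_addr redis_1
# /etc/init.d/redis1 stop    # on the first server, then wait for the failover
# master_addr redis_1
```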
