Keepalived

g7s · 2 years ago

kimli · 2 years ago

It’s been a few years since I used keepalived so my knowledge might be outdated.

You are correct that the VMs should be on different physical servers. For testing you can set them up on the same host, but don’t do this in production: if you lose the host, you lose the service.

Keepalived makes sure your service is reachable at a fixed IP. Say you have two servers (it can be configured for more than two), (A) 192.168.0.2 and (B) 192.168.0.3, which both provide the service you want to offer. With keepalived you configure a shared IP for both of them, let’s say 192.168.0.4.

While everything is working, server A will be available at 192.168.0.2 and 192.168.0.4, while server B will be available at 192.168.0.3. If server A fails, keepalived will “move” 192.168.0.4 to server B, so 192.168.0.2 will not be reachable and server B will be available at 192.168.0.3 and 192.168.0.4.

As long as one of the servers is up, your service stays available at 192.168.0.4, no matter which one is currently primary.
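
The failover behaviour described above can be sketched in a few lines of Python. This is only a toy model, not keepalived's implementation or configuration: the `Server` class, the `priority` field and the `holder_of_vip` function are hypothetical names, and it simply assumes that the healthy server with the highest priority takes the shared IP.

```python
from dataclasses import dataclass

VIRTUAL_IP = "192.168.0.4"  # shared address from the example above


@dataclass
class Server:
    name: str
    own_ip: str
    priority: int         # hypothetical: higher value wins the shared IP
    healthy: bool = True


def holder_of_vip(servers):
    """Return the healthy server with the highest priority, or None."""
    alive = [s for s in servers if s.healthy]
    return max(alive, key=lambda s: s.priority) if alive else None


def addresses(server, servers):
    """List the IPs a server answers on in this toy model."""
    if not server.healthy:
        return []
    ips = [server.own_ip]
    if holder_of_vip(servers) is server:
        ips.append(VIRTUAL_IP)
    return ips


a = Server("A", "192.168.0.2", priority=150)
b = Server("B", "192.168.0.3", priority=100)
cluster = [a, b]

print(addresses(a, cluster), addresses(b, cluster))  # normal operation
a.healthy = False                                    # simulate server A failing
print(addresses(a, cluster), addresses(b, cluster))  # after failover
```

In the second line of output, server B answers on both its own address and the shared one, which is the “move” described above.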

The mirroring part has to be solved separately, outside of keepalived. For example, MariaDB provides multi-master replication “out of the box” with Galera (at least 3 nodes are recommended).

For files, depending on your filesystem, you will have to rsync them, use shared storage, use a distributed filesystem (Ceph), …

g7s · 2 years ago

Thank you for the explanation. I might look into heartbeat, as suggested by @arbiter. I understand now that keepalived only works at the IP layer and does not help me with mirroring my actual VMs. For that I will look into other technologies.

arbiter · 2 years ago

If you plan on using this in a production environment, I’d bring in a consultant.

However, I’ve heard of people in the home-lab sphere using things like Heartbeat and DRBD. The more nodes the merrier: with only two, you’ll have a bad time if they lose the connection to each other (split brain).
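
A rough way to see why more nodes help is the usual majority rule: a group of nodes may only keep running the service if it can see more than half of the cluster. The sketch below is a generic illustration of that idea, not how Heartbeat or DRBD actually decide; `has_quorum` is a hypothetical helper.

```python
def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """True if this side of a network split sees a strict majority of the cluster."""
    return visible_nodes > cluster_size // 2


# Two nodes lose contact: each side sees only itself, so neither has a majority.
print(has_quorum(1, 2))

# Three nodes split 2 / 1: the pair keeps quorum, the isolated node does not.
print(has_quorum(2, 3), has_quorum(1, 3))
```

With two nodes there is no way to tell a dead peer from a broken link, which is why a third node (or some other tie-breaker) is usually recommended.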

g7s · 2 years ago

I work as the only IT admin for our department. Everything runs fine, and nightly downtimes for upgrades etc. are acceptable. However, I want to make it more available. Thanks for the suggestions, I will look into them :)

arbiter · edited 2 years ago

Other data replication technologies worth looking into: GlusterFS, Ceph.

Depending on your databases, they should offer replication out of the box.

You can also put a load balancer, such as HAProxy or Nginx, in front to distribute incoming network traffic across multiple VMs.
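
As a minimal sketch of what “distribute” means here, the snippet below hands out backends in round-robin order. It only illustrates the idea: HAProxy and Nginx offer more strategies plus health checks, and the `round_robin` helper and the backend addresses are assumptions made for the example.

```python
from itertools import cycle


def round_robin(backends):
    """Return a function that yields the next backend on every call."""
    pool = cycle(backends)

    def pick():
        return next(pool)

    return pick


# Hypothetical backend VMs; replace with your own addresses.
pick_backend = round_robin(["192.168.0.2", "192.168.0.3"])

for request_id in range(4):
    print(request_id, "->", pick_backend())
```

Requests alternate between the two VMs. A real load balancer would also skip backends that fail their health check.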

taladar@sh.itjust.works · 2 years ago

Whatever technology you end up using, you should be aware that running things in an HA way increases complexity by an order of magnitude or two. For a while (and possibly even in the long term) that is very likely to cause additional downtime instead of reducing it.

Network block devices on clusters like Ceph, or distributed filesystems, have many more failure modes in addition to those of the underlying storage hardware, due to their distributed nature. Clustered services are similar. You might also see new performance bottlenecks emerge (e.g. your network might be significantly slower in both latency and throughput than modern local SSD or NVMe storage) and services becoming temporarily unavailable when failover happens too often.

My advice would be to start by running something like that only on a dev/test system that sees some real use, for a few months at least, so you learn what to do when things go wrong before you even consider using it in production.

g7s · 2 years ago

Thank you for the insight. I will think about it more and set up a test lab. We have 2.5Gbit switches, so I hope the network won’t be a bottleneck.
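
For a rough feel of what a 2.5Gbit link allows, the arithmetic can be sketched as below. It uses decimal units and ignores protocol overhead and latency, so real numbers will be lower; the 100 GB image size is only an assumed example, and both function names are hypothetical.

```python
def link_throughput_mb_per_s(gbit_per_s: float) -> float:
    """Convert a link speed in Gbit/s to MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


def transfer_seconds(size_gb: float, gbit_per_s: float) -> float:
    """Best-case time in seconds to move size_gb gigabytes over the link."""
    return size_gb * 1000 / link_throughput_mb_per_s(gbit_per_s)


print(link_throughput_mb_per_s(2.5))  # upper bound in MB/s
print(transfer_seconds(100, 2.5))     # seconds for an assumed 100 GB VM image
```

That upper bound is what replication or a distributed filesystem has to share with normal traffic, so it is worth measuring in the test lab.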