Airflow High Available cluster

Part 3

Implementing the Airflow HA solution

Well, let’s continue building the HA cluster.

This part of the tutorial covers installing and configuring the application used for DAG synchronization, and then Airflow itself.

1.8 csync2

Install and configure csync2:

Pre-shared keys can be generated using the following:
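The original command is not shown here, so this is only a minimal sketch. It assumes the key lives in /etc/csync2.key and that the other nodes are called node2 and node3 (both the path and the hostnames are placeholders, not taken from this article). csync2 -k writes a new pre-shared key to the given file, and the same file then has to be present on every node. The commands are wrapped in functions so that nothing runs until you call them.

```shell
# Assumed key location; adjust to your layout.
KEY_FILE="/etc/csync2.key"

# Run on ONE node only: csync2 -k generates a new pre-shared key file.
generate_key() {
    csync2 -k "$KEY_FILE"
    chmod 600 "$KEY_FILE"
}

# Copy the same key to the remaining nodes (hypothetical hostnames).
distribute_key() {
    for node in node2 node3; do
        scp -p "$KEY_FILE" "root@${node}:${KEY_FILE}"
    done
}
```

Call generate_key on the first node, then distribute_key from the same node.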

Then specify the file containing your key in the key directive of the csync2 configuration file.

The key is used to authenticate the nodes to each other.

The directory to synchronize is specified in the include directive of the same file.
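To make the two directives concrete, here is a sketch of what a csync2 configuration could look like. The file path, group name, hostnames, key path, and DAGs directory are all assumptions for illustration; use the values from your own cluster. The same file should be identical on all nodes.

```
# /etc/csync2.cfg (assumed path)
group airflow
{
    # Names must match the hostname of each node.
    host node1 node2 node3;

    # Pre-shared key used to authenticate the nodes.
    key /etc/csync2.key;

    # Directory to keep in sync (hypothetical DAGs folder).
    include /opt/airflow/dags;

    # On conflict, let the most recently modified file win.
    auto younger;
}
```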

So, add a job to crontab:
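The original crontab entry is not shown. A plausible sketch, assuming the csync2 binary is at /usr/sbin/csync2 and that a one-minute interval is acceptable, looks like this. The -x option checks the configured directories and pushes any changes to the other nodes.

```text
# m h dom mon dow  command
* * * * * /usr/sbin/csync2 -x
```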

After that, any file or directory you create will be synchronized and appear on all nodes. Note that a single key is shared across all nodes here.

To start the csync2 daemon automatically with the required parameters, modify its systemd unit:
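The unit from the original post is not available, so the following is only a sketch under assumptions: the unit file name (for example /etc/systemd/system/csync2.service) and the binary path are hypothetical. The -ii option runs csync2 as a stand-alone server, and -v makes it more verbose.

```ini
[Unit]
Description=csync2 file synchronization daemon (sketch)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
# -ii: stand-alone server mode, -v: verbose output
ExecStart=/usr/sbin/csync2 -ii -v
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After editing a unit file, systemd needs a systemctl daemon-reload before the change takes effect.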

Start csync2:

1.9 Airflow

Finally, all the preparations are done and we can install Airflow itself:
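The exact install command from the original post is not shown. The sketch below follows the constraint-file approach from the Airflow installation documentation. The Airflow version, the Python version, and the choice of extras (celery and postgres) are assumptions, so substitute the ones you actually deploy. The install is wrapped in a function so that nothing runs until you call it.

```shell
# Hypothetical versions; pick the ones you actually deploy.
AIRFLOW_VERSION="2.2.3"
PYTHON_VERSION="3.8"

# Constraint files pin dependency versions that are known to work together.
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Run on every node. The extras are an assumption based on the
# Celery and PostgreSQL setup described in this series.
install_airflow() {
    pip install "apache-airflow[celery,postgres]==${AIRFLOW_VERSION}" \
        --constraint "${CONSTRAINT_URL}"
}
```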

Initialize the database:
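A minimal sketch, assuming the Airflow 2.x command line (newer releases use airflow db migrate instead). Since all nodes share one PostgreSQL database, running this on a single node should be enough. Note that Airflow initializes whatever database its connection setting points to at that moment, so it is worth checking that the setting already refers to PostgreSQL.

```shell
# Assumption: Airflow 2.x CLI. Run on ONE node only; the database is shared.
init_airflow_db() {
    airflow db init
}
```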

Create systemd units for the Airflow components:
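The original unit files are not shown. As an illustration, a unit for the webserver might look like the sketch below. The unit name (airflow-webserver.service), the airflow user, the AIRFLOW_HOME path, and the binary path are all assumptions.

```ini
[Unit]
Description=Airflow webserver (sketch)
After=network-online.target
Wants=network-online.target

[Service]
User=airflow
Group=airflow
# Hypothetical home directory holding airflow.cfg
Environment=AIRFLOW_HOME=/home/airflow/airflow
ExecStart=/usr/local/bin/airflow webserver
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target
```

Units for the other components can follow the same pattern, with ExecStart set to airflow scheduler for the scheduler and airflow celery worker for the Celery worker (Airflow 2.x command names).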

Don’t forget to enable them.

Create the config file in the default directory with the following command:

Find and modify the following settings:

in the [core] section:
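The original values are not shown, so the fragment below is only a guess at the kind of settings involved: the executor and the database connection. The user name, password, host, port, and database name are placeholders. In Airflow 2.3 and later, sql_alchemy_conn lives in the [database] section instead.

```ini
[core]
# Use Celery so that tasks can run on workers on any node.
executor = CeleryExecutor

# Placeholder credentials and endpoint for the PostgreSQL cluster.
sql_alchemy_conn = postgresql+psycopg2://airflow:PASSWORD@localhost:5432/airflow
```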

in the [celery] section:
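Again, the original values are not available. The sketch below assumes RabbitMQ as the message broker, which may not match the setup from the earlier parts, and uses placeholder credentials and hostnames. Celery accepts several broker URLs of the same transport separated by semicolons and fails over between them.

```ini
[celery]
# Assumption: RabbitMQ on each of the three nodes (hypothetical hostnames).
broker_url = amqp://airflow:PASSWORD@node1:5672//;amqp://airflow:PASSWORD@node2:5672//;amqp://airflow:PASSWORD@node3:5672//

# Store task results in the same PostgreSQL database (placeholder endpoint).
result_backend = db+postgresql://airflow:PASSWORD@localhost:5432/airflow
```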

As for broker_url, I specified all three nodes. Strictly speaking, this is redundant: Celery is expected to connect to localhost, and if the whole node is down, there is nothing left on it that would need to reconnect to another node. I added all three nodes for reliability anyway. You may prefer to specify only localhost.

Create a user:
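A minimal sketch of creating an administrator account, assuming the Airflow 2.x command line. The user name, names, and email address are placeholders. When no password option is given, the command prompts for one. The call is wrapped in a function so that nothing runs until you invoke it.

```shell
# Placeholder account details; replace them with your own.
ADMIN_USER="admin"
ADMIN_EMAIL="admin@example.com"

# Assumption: Airflow 2.x CLI. Prompts for the password interactively.
create_admin_user() {
    airflow users create \
        --username "$ADMIN_USER" \
        --firstname Admin \
        --lastname User \
        --role Admin \
        --email "$ADMIN_EMAIL"
}
```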

Restart Airflow:

Check whether any errors occur.

To get access to the web interface, I recommend using an SSH tunnel.
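As an illustration of such a tunnel, the sketch below forwards a local port to the webserver port on one of the nodes. The node name and SSH user are placeholders, and port 8080 is assumed because it is the Airflow webserver default. The command is only assembled and printed here, so you can review it before running it.

```shell
# Hypothetical values: replace with your own user, node, and ports.
AIRFLOW_NODE="node1"
SSH_USER="airflow"
LOCAL_PORT=8080
REMOTE_PORT=8080

# -N: do not run a remote command, -L: forward a local port to the remote side.
TUNNEL_CMD="ssh -N -L ${LOCAL_PORT}:localhost:${REMOTE_PORT} ${SSH_USER}@${AIRFLOW_NODE}"
echo "$TUNNEL_CMD"
```

Run the printed command in a separate terminal, then open http://localhost:8080 in your browser.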

The Airflow cluster is now up and running. Double-check all the settings to make sure you didn’t make a mistake.

Troubleshooting

If you have issues with patroni, etcd, or PostgreSQL

Try the following:

Sometimes it helps to remove all PostgreSQL data:

But be careful! Do this only in a test environment. Production servers are not for experiments, and you should understand exactly what you are doing.

And restart the services:

Issues with csync2 can be resolved by removing the directories that contain its database:

Then check the status of the processes with the command:
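The original command is not shown. One way to get a quick overview, sketched here under the assumption that every component runs as a systemd service, is to ask systemd for the state of each unit. The unit names below are hypothetical and need to match the ones on your nodes.

```shell
# Hypothetical unit names; use the ones you actually created.
SERVICES="etcd patroni csync2 airflow-webserver airflow-scheduler airflow-worker"

# Print one line per service with its current systemd state.
check_services() {
    for svc in $SERVICES; do
        printf '%-20s %s\n' "$svc" "$(systemctl is-active "$svc" 2>/dev/null)"
    done
}
```

Calling check_services on each node gives a quick view of which services are active, inactive, or failed.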

Last but not least, I strongly recommend setting up monitoring for the system. Which monitoring system to use is up to you. I use Zabbix with dedicated templates for PostgreSQL, templates that check whether processes are running, and so on. Any comments are welcome!

FAUN.dev is where engineers from GitHub, Netflix, and Shopify go to stay ahead — fast.


Denis Matveev
