Airflow Dynamic DAGs — Python Globals

In this post, I introduce the concept of dynamic DAG creation and explain the significance of Python global variables for Airflow.

What do I mean by “dynamic DAG”?

Dynamic DAG creation is important for scalable data pipeline applications.

  1. Pass that object back to the global namespace of the DAG file.

Static DAG Example

Let’s imagine I have a pipeline that gets the current price of Bitcoin (BTC) and emails it to me.

Dynamic DAG example

Now let’s imagine we wanted to get the price of some other cryptocurrencies as well; say, Ethereum (ETH), Litecoin (LTC) and Stellar (XLM).
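The listing for this step is not shown here, so what follows is only a minimal sketch of one naive attempt, assuming a loop over the ticker symbols that builds one DAG per coin. The `DAG` class is a hypothetical stand-in for Airflow's own (in a real DAG file it would come from `from airflow import DAG`), and the `get_price_<symbol>` IDs are names I made up for illustration.

```python
class DAG:
    """Hypothetical stand-in for airflow.DAG so the sketch runs without Airflow."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


SYMBOLS = ["BTC", "ETH", "LTC", "XLM"]

for symbol in SYMBOLS:
    # Each iteration rebinds the same name, `dag`, to a brand-new object.
    dag = DAG(dag_id=f"get_price_{symbol}")

# After the loop, `dag` refers only to the DAG built in the final iteration.
print(dag.dag_id)  # get_price_XLM
```

Four DAG objects are created, but only one name ever points at any of them.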

Why it doesn’t work

To understand why the above code does not behave the way we need it to, we have to consider Airflow’s core concept of DAG scope.
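To make the idea of scope concrete, here is a rough simulation of how a loader might pick up DAGs from a file: it runs the file and keeps whatever DAG objects are bound in the file's global namespace. This is a simplified sketch, not Airflow's actual discovery code; `collect_dags`, the stand-in `DAG` class, and the `get_price_<symbol>` IDs are all hypothetical names for illustration.

```python
class DAG:
    """Hypothetical stand-in for airflow.DAG; keeps the sketch dependency-free."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


def collect_dags(dag_file_source):
    """Run a DAG file's source and return the dag_ids found in its globals."""
    namespace = {"DAG": DAG}
    exec(dag_file_source, namespace)  # stands in for importing the DAG file
    # Only objects still bound to a global name are visible to the scan.
    return sorted(v.dag_id for v in namespace.values() if isinstance(v, DAG))


NAIVE_DAG_FILE = """
for symbol in ["BTC", "ETH", "LTC", "XLM"]:
    dag = DAG(dag_id=f"get_price_{symbol}")
"""

found = collect_dags(NAIVE_DAG_FILE)
print(found)  # ['get_price_XLM']
```

The first three DAG objects lose their only reference as soon as `dag` is rebound, so a scan of the global namespace can only ever find the last one.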

>>> globals()["my_name"] = "Alex"
>>> print(my_name)
Alex

How to make it work

Once we know about this core concept of Airflow, the solution is simple: all we need to do is maintain a separate reference to each DAG created in the loop.
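One way to keep those references, sketched here under the same assumptions as before, is to bind each DAG to its own module-level name through `globals()`. The `create_dag` factory, the stand-in `DAG` class, and the `get_price_<symbol>` IDs are hypothetical; in a real DAG file the class would be Airflow's own and the factory would also attach the tasks.

```python
class DAG:
    """Hypothetical stand-in for airflow.DAG so the sketch runs without Airflow."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


SYMBOLS = ["BTC", "ETH", "LTC", "XLM"]


def create_dag(symbol):
    """Hypothetical factory: build and return one DAG object for a symbol."""
    return DAG(dag_id=f"get_price_{symbol}")


for symbol in SYMBOLS:
    dag = create_dag(symbol)
    # Give every DAG its own global name so later iterations cannot overwrite it.
    globals()[dag.dag_id] = dag

registered = sorted(name for name in globals() if name.startswith("get_price_"))
print(registered)
```

Using the `dag_id` as the global name is a convenient choice because DAG IDs already have to be unique, so the names cannot collide with each other.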

Conclusion

We’ve seen how Python’s built-in globals() function can be useful when dynamically creating Airflow DAGs.

Python Data Engineer, MSc. Physics
