Prerequisites
The host needs at least 4 GB of memory.
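The memory requirement can be checked before installing anything. The sketch below assumes a Linux host that exposes `/proc/meminfo`; the function names `mem_total_kb` and `meets_min_memory` are illustrative and not part of Airflow or Docker. A nominal 4 GB instance usually reports slightly less than 4096 MiB, so the threshold passed in is a judgment call.

```shell
# Print total physical memory in kB, read from a meminfo-style file.
# The optional argument allows pointing the function at another file.
mem_total_kb() {
  awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Succeed if the given kB value is at least the required number of MiB.
# $1: total memory in kB, $2: required memory in MiB
meets_min_memory() {
  [ "$1" -ge $(( $2 * 1024 )) ]
}
```

For example, `meets_min_memory "$(mem_total_kb)" 3500 || echo "not enough memory"` prints a warning on hosts that report less than roughly 3.5 GiB.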
Environment preparation
Install Docker
sudo dnf install -y docker
sudo systemctl enable docker --now
sudo usermod -aG docker $USER
newgrp docker
exec bash
Install docker-compose
# docker-compose (latest version)
sudo curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
# Fix permissions after download
sudo chmod +x /usr/local/bin/docker-compose
# Verify success
docker-compose version
Deployment
- Download the deployment file
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.9.3/docker-compose.yaml'
- Initialize the environment
mkdir -p ./dags ./logs ./plugins ./config
echo -e "AIRFLOW_UID=$(id -u)" > .env
- Initialize the database
docker-compose up airflow-init
- Start the services
docker-compose up -d
- Check that the containers are running normally
docker ps
- View the container logs
docker logs ec2-user-airflow-webserver-1
docker logs ec2-user-airflow-scheduler-1
docker logs ec2-user-airflow-worker-1
docker logs ec2-user-airflow-triggerer-1
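The checks above can be scripted. The sketch below is a set of helper functions, not part of the Airflow distribution: `wait_until`, `container_name` and `show_airflow_logs` are hypothetical names. It assumes the Compose v2 naming scheme `<project>-<service>-<index>` seen in the commands above, where the project name defaults to the working directory name (`ec2-user` here), and the service names defined in the downloaded docker-compose.yaml.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Only pause when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$(( i + 1 ))
  done
  return 1
}

# Build the default Compose v2 container name: <project>-<service>-<index>.
# $1: project name, $2: service name, $3: index (defaults to 1)
container_name() {
  printf '%s-%s-%s\n' "$1" "$2" "${3:-1}"
}

# Print the last lines of each Airflow container's log.
# $1: Compose project name (assumed to be the directory name, e.g. ec2-user)
show_airflow_logs() {
  for service in airflow-webserver airflow-scheduler airflow-worker airflow-triggerer; do
    echo "== $service =="
    docker logs --tail 20 "$(container_name "$1" "$service")"
  done
}
```

A possible use is `wait_until 30 5 curl -fsS http://localhost:8080/health && show_airflow_logs ec2-user`, which polls the webserver health endpoint for up to about 2.5 minutes and then prints recent log lines. Running `docker-compose logs airflow-webserver` from the deployment directory avoids depending on the container name prefix at all.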
Log in to the Airflow web console
The webserver is available at: http://localhost:8080. The default account has the login airflow and the password airflow.
After logging in, the Airflow web interface is displayed. The REST API can be queried with the same account:
ENDPOINT_URL="http://localhost:8080"
curl -X GET \
    --user "airflow:airflow" \
    "${ENDPOINT_URL}/api/v1/pools"
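The request above can be wrapped in a small helper so that a base URL with or without a trailing slash produces the same endpoint. This is a minimal sketch: `api_url` and `list_pools` are hypothetical names, and it assumes basic authentication is enabled for the API, as it is in the Airflow docker-compose.yaml used here.

```shell
# Join a base URL and a path with exactly one slash between them.
# $1: base URL, $2: path
api_url() {
  printf '%s/%s\n' "${1%/}" "${2#/}"
}

# List Airflow pools through the stable REST API using basic auth.
# $1: base URL, $2: credentials in the form "user:password"
list_pools() {
  curl -fsS -X GET --user "$2" "$(api_url "$1" /api/v1/pools)"
}
```

For example, `list_pools http://localhost:8080 airflow:airflow` returns the pool list as JSON when the webserver is up; the `-f` flag makes curl exit with a non-zero status on HTTP errors such as a failed login.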
Scheduling EMR Serverless application jobs with Airflow
Airflow connection configuration
See https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/aws.html
- In the Airflow console, select Connections under Admin
- Configure the connection details, then click Save
Upload the Airflow DAG to the dags directory
$ ls
__pycache__ airflow_invoke_emrserverlessapp.py
References
https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html
https://github.com/apache/airflow/discussions/16801