Commit 53d3f17c authored by Cristian Aguirre

Update 20-06-23. Update template files to deploy in Kubernetes.

parent b9c35909
@@ -46,6 +46,7 @@ AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG="<tag>"
NOTE: *Details of each environment variable used by the pods and by Airflow:*
```text
AIRFLOW__KUBERNETES_EXECUTOR__NAMESPACE: bcom-airflow # The k8s namespace where the worker pods will be created
AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: '30' # How often (in seconds) the dags folder is scanned for new DAGs
AIRFLOW__LOGGING__LOGGING_LEVEL: INFO # Airflow log level (webserver, scheduler and workers)
AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: America/Lima # Timezone of the Airflow web UI
@@ -80,29 +81,20 @@ MINIO_USER: bWluaW9hZG1pbg== # Minio user for connecting to the bucket on the Minio Server
MINIO_PASSWORD: bWluaW9hZG1pbg== # Minio password for connecting to the bucket on the Minio Server
```
4.- Place the **pod_template.yaml** file on the host at /opt/airflow/templates/.
*Note:* If you are using Minikube, the file must be inside the container at the same path;
to do that, run the following command:
```shell
docker cp pod_template.yaml <ID_CONTAINER>:/opt/airflow/templates/
```
And, while we are in that directory, create the folder at the following path: */opt/airflow/dags/dags/*
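A minimal sketch of that Minikube copy step, assuming the Minikube node runs as a Docker container whose name contains `minikube` (the filter and commands are illustrative; adjust them to your environment):
```shell
# Hypothetical lookup of the Minikube node container; adjust the filter if needed.
CONTAINER_ID=$(docker ps --filter "name=minikube" --format "{{.ID}}")
# Create the target directories and copy the template into the container.
docker exec "$CONTAINER_ID" mkdir -p /opt/airflow/templates /opt/airflow/dags/dags
docker cp pod_template.yaml "$CONTAINER_ID":/opt/airflow/templates/
```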
4.- Run the **script_apply.sh** script to create all the configMaps, volumes, secrets, deployments
and services needed to run Airflow.
⚠️ _Note_: Check whether you will synchronize the **dags** folder with a bucket in Minio or in S3.
Depending on that, you will have to apply either _sync-dags-deployment.yaml_ or _sync-dags-deployment-s3.yaml_ and
reference it in **script-apply.sh** and **script-delete.sh**.
```shell
sh script_apply.sh
```
With this command, you have to wait a few minutes for all the resources to be up and running
normally.
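To follow the rollout while waiting, something like this should work (resource names taken from the manifests in this commit):
```shell
# Watch the pods come up in the Airflow namespace.
kubectl get pods -n bcom-airflow -w
# Or wait for a specific deployment to finish rolling out.
kubectl rollout status deployment/airflow-webserver -n bcom-airflow
```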
5.- (OPTIONAL) If you deployed using Minikube, you will need to expose the Airflow Webserver port
so that you can reach the web UI from your local machine. Since the exposed port lives inside the
Minikube container, you now need to forward it to your local machine with the following command:
@@ -112,7 +104,7 @@ kubectl port-forward <ID-POD-AIRFLOW-WEBSERVER> 8081:8080
Now, from your browser, you can open http://localhost:8081 to see the Airflow web UI.
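As an alternative to forwarding an individual pod, the Service defined in this commit can be used directly; a sketch:
```shell
# Forward the webserver Service instead of a single pod.
kubectl port-forward svc/airflow-webserver 8081:8080 -n bcom-airflow
# On Minikube, the NodePort URL can also be printed directly.
minikube service airflow-webserver -n bcom-airflow --url
```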
6.- Check that our pods are in the "Running" state with the following command:
```shell
kubectl get pods
@@ -141,7 +133,7 @@ MINIO_DAGS_DIR: '/prueba-ca/dags'
```
3.- All that remains is to wait n seconds, according to the variables _SYNCHRONYZE_DAG_DIR_ (the sync pod
will synchronize the bucket with the dags folder every n seconds) and _AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL_
(the Airflow scheduler will look for updates in its dags folder every n seconds). After that time,
the changes will be visible in the Airflow web UI.
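To confirm the synchronization is actually running, the sync deployment's logs can be tailed, for example:
```shell
# Each sync cycle should show up in the logs of the sync pod.
kubectl logs deployment/airflow-sync-dags -n bcom-airflow --tail=20 -f
```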
@@ -152,7 +144,7 @@ _Requirements to run the DAG:_
- The environment is deployed correctly and the pods are in the "Running" state
- Create the connection in the Airflow web UI with the name configured in app_conf.yml under the
**bcom_tp_connection** parameter (see the sketch after this list).
- Validate all the paths (bucket and prefixes) and the configuration of the 7 inputs in app_conf.yml.
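As an alternative to the web UI, the connection could be created from the CLI inside the webserver pod. This is only a sketch: the connection ID, type, credentials and endpoint below are assumptions (taken from the Minio defaults in this commit); replace them with the values actually configured in app_conf.yml:
```shell
# Hypothetical connection ID and endpoint; adjust to app_conf.yml.
kubectl exec -n bcom-airflow deploy/airflow-webserver -- \
  airflow connections add bcom_tp_connection \
  --conn-type aws \
  --conn-login minioadmin \
  --conn-password minioadmin \
  --conn-extra '{"endpoint_url": "http://192.168.49.2:9000"}'
```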
1.- In the Airflow web UI, locate our DAG named: **"BCOM_DAG_TRANSFORMACION_TACOMVENTAS_PROMOCIONESRESIDENCIAL"**
......
apiVersion: v1
kind: Namespace
metadata:
name: bcom-airflow
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: bcom-airflow
namespace: bcom-airflow
---
apiVersion: v1
kind: ConfigMap
metadata:
name: airflow-envvars-configmap
namespace: bcom-airflow
data:
# The conf below is necessary because of a typo in the config on docker-airflow image:
# https://github.com/puckel/docker-airflow/blob/bed777970caa3e555ef618d84be07404438c27e3/config/airflow.cfg#L934
AIRFLOW__KUBERNETES_EXECUTOR__NAMESPACE: bcom-airflow
AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: '30'
AIRFLOW__LOGGING__LOGGING_LEVEL: INFO
AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: America/Lima
@@ -28,4 +44,49 @@ data:
MINIO_SERVER: 'http://192.168.49.2:9000'
MINIO_DAGS_DIR: '/prueba-ca/dags'
---
apiVersion: v1
kind: ConfigMap
metadata:
name: pod-template-config
namespace: bcom-airflow
data:
pod_template.yaml: |
apiVersion: v1
kind: Pod
metadata:
name: dummy-name
namespace: bcom-airflow
spec:
containers:
- args: [ ]
command: [ ]
envFrom:
- configMapRef:
name: airflow-envvars-configmap
env:
- name: AIRFLOW__CORE__EXECUTOR
value: LocalExecutor
image: dumy-image
imagePullPolicy: IfNotPresent
name: base
volumeMounts:
- name: dags-host-volume
mountPath: /opt/airflow/dags
- name: logs-persistent-storage
mountPath: /opt/airflow/logs
hostNetwork: false
restartPolicy: OnFailure
securityContext:
runAsUser: 50000
nodeSelector: { }
affinity: { }
tolerations: [ ]
volumes:
- name: dags-host-volume
persistentVolumeClaim:
claimName: airflow-dags-pvc
- name: logs-persistent-storage
persistentVolumeClaim:
claimName: airflow-logs-pvc
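With the template shipped as a ConfigMap and mounted into the scheduler at /opt/airflow/templates, a quick check that the executor can see it (assuming AIRFLOW__KUBERNETES__POD_TEMPLATE_FILE in the env ConfigMap points at that path, which is not shown in this excerpt):
```shell
# Print the pod template exactly as the scheduler sees it.
kubectl exec -n bcom-airflow deploy/airflow-scheduler -- cat /opt/airflow/templates/pod_template.yaml
```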
@@ -2,6 +2,7 @@ kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: pods-permissions
namespace: bcom-airflow
rules:
- apiGroups: [""]
resources: ["pods"]
@@ -13,10 +14,11 @@ kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: pods-permissions
namespace: bcom-airflow
subjects:
- kind: ServiceAccount
name: default
namespace: default
namespace: bcom-airflow
roleRef:
kind: ClusterRole
name: pods-permissions
......
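The ClusterRoleBinding above grants the `default` ServiceAccount in the bcom-airflow namespace the pod permissions the KubernetesExecutor needs; a quick way to verify it:
```shell
# Should print "yes" once the binding is applied.
kubectl auth can-i create pods \
  --as=system:serviceaccount:bcom-airflow:default -n bcom-airflow
```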
@@ -2,6 +2,7 @@ apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-scheduler
namespace: bcom-airflow
labels:
app: airflow-k8s
@@ -34,13 +35,11 @@ spec:
mountPath: /opt/airflow/templates
volumes:
- name: dags-host-volume
hostPath:
path: /opt/airflow/dags/dags/
type: Directory
persistentVolumeClaim:
claimName: airflow-dags-pvc
- name: pods-templates
hostPath:
path: /opt/airflow/templates/
type: Directory
configMap:
name: pod-template-config
- name: logs-persistent-storage
persistentVolumeClaim:
claimName: airflow-logs-pvc
@@ -2,13 +2,13 @@ apiVersion: v1
kind: Secret
metadata:
name: credentials
namespace: bcom-airflow
type:
Opaque
data:
AWS_ACCESS_KEY: bWluaW9hZG1pbg==
AWS_SECRET_KEY: bWluaW9hZG1pbg==
AWS_ACCESS_KEY: QUtJQVFBQU1YTzNaNEJITktFSUU=
AWS_SECRET_KEY: K01VbW4zRW9pZ1k5M3c1UnhOdG1DY3hWK0Vya1pnRVhxeFVralhVMw==
MINIO_USER: bWluaW9hZG1pbg==
MINIO_PASSWORD: bWluaW9hZG1pbg==
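The Secret values are plain base64, so new credentials can be encoded and checked the same way, for example:
```shell
# Encode a credential for the Secret (prints bWluaW9hZG1pbg==).
echo -n 'minioadmin' | base64
# Decode an existing value to double-check it.
echo 'bWluaW9hZG1pbg==' | base64 -d
```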
apiVersion: v1
kind: PersistentVolume
metadata:
name: airflow-dags-pv
namespace: bcom-airflow
spec:
capacity:
storage: 300Mi
accessModes:
- ReadWriteMany
storageClassName: airflow-dags
nfs:
server: 192.168.1.11
path: "/mnt/nfs_share"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: airflow-postgres-pv
namespace: bcom-airflow
spec:
capacity:
storage: 8000Mi
accessModes:
- ReadWriteMany
storageClassName: airflow-postgres
nfs:
server: 192.168.1.11
path: "/mnt/nfs_postgres"
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: airflow-logs-pv
namespace: bcom-airflow
spec:
capacity:
storage: 4000Mi
accessModes:
- ReadWriteMany
storageClassName: airflow-logs
nfs:
server: 192.168.1.11
path: "/mnt/nfs_logs"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: airflow-dags-pvc
namespace: bcom-airflow
spec:
accessModes:
- ReadWriteMany
storageClassName: airflow-dags
resources:
requests:
storage: 200Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: airflow-postgres-pvc
namespace: bcom-airflow
spec:
accessModes:
- ReadWriteMany
storageClassName: airflow-postgres
resources:
requests:
storage: 7500Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: airflow-logs-pvc
namespace: bcom-airflow
labels:
app: airflow-k8s
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 3500Mi
storageClassName: airflow-logs
\ No newline at end of file
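The three PersistentVolumes above point at NFS exports on 192.168.1.11, so that server must export /mnt/nfs_share, /mnt/nfs_postgres and /mnt/nfs_logs before the claims can bind. Binding can be checked with:
```shell
# PersistentVolumes are cluster-scoped; the claims live in the bcom-airflow namespace.
kubectl get pv
kubectl get pvc -n bcom-airflow
```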
@@ -2,6 +2,7 @@ apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-webserver
namespace: bcom-airflow
labels:
app: airflow-k8s
@@ -25,8 +26,6 @@ spec:
envFrom:
- configMapRef:
name: airflow-envvars-configmap
ports:
- containerPort: 8080
volumeMounts:
- name: dags-host-volume
mountPath: /opt/airflow/dags/
@@ -34,9 +33,8 @@ spec:
mountPath: /opt/airflow/logs
volumes:
- name: dags-host-volume
hostPath:
path: /opt/airflow/dags/dags/
type: Directory
persistentVolumeClaim:
claimName: airflow-dags-pvc
- name: logs-persistent-storage
persistentVolumeClaim:
claimName: airflow-logs-pvc
@@ -2,17 +2,18 @@ apiVersion: v1
kind: Service
metadata:
name: airflow-webserver
namespace: bcom-airflow
labels:
app: airflow-k8s
spec:
type: NodePort
selector:
app: airflow-webserver
ports:
- name: web
- appProtocol: http
name: airflow-webserver
port: 8080
protocol: TCP
port: 8081
targetPort: 8080
nodePort: 30080
\ No newline at end of file
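With the NodePort fixed at 30080, the web UI can also be reached without port-forwarding; a sketch for a Minikube cluster (the /health endpoint is assumed to be the standard Airflow 2 health check):
```shell
# Resolve the node IP and hit the webserver's health endpoint on the NodePort.
curl "http://$(minikube ip):30080/health"
```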
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: airflow-logs-pvc
labels:
app: airflow-k8s
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 2Gi
storageClassName: standard
apiVersion: v1
kind: Pod
metadata:
name: dummy-name
spec:
containers:
- args: [ ]
command: [ ]
envFrom:
- configMapRef:
name: airflow-envvars-configmap
env:
- name: AIRFLOW__CORE__EXECUTOR
value: LocalExecutor
image: dumy-image
imagePullPolicy: IfNotPresent
name: base
volumeMounts:
- name: dags-host-volume
mountPath: /opt/airflow/dags
- name: logs-persistent-storage
mountPath: /opt/airflow/logs
hostNetwork: false
restartPolicy: IfNotPresent
securityContext:
runAsUser: 50000
nodeSelector: { }
affinity: { }
tolerations: [ ]
volumes:
- name: dags-host-volume
hostPath:
path: /opt/airflow/dags/dags/
type: Directory
- name: logs-persistent-storage
persistentVolumeClaim:
claimName: airflow-logs-pvc
@@ -2,6 +2,7 @@ apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres
namespace: bcom-airflow
spec:
selector:
matchLabels:
@@ -31,3 +32,10 @@ spec:
value: airflow
- name: POSTGRES_DB
value: airflow
volumeMounts:
- name: pgdatavol
mountPath: /var/lib/postgresql/data
volumes:
- name: pgdatavol
persistentVolumeClaim:
claimName: airflow-postgres-pvc
\ No newline at end of file
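The Postgres data directory is now backed by the airflow-postgres-pvc claim; a quick check that data actually lands on the mounted volume:
```shell
# List the mounted data directory inside the Postgres pod.
kubectl exec -n bcom-airflow deploy/postgres -- ls /var/lib/postgresql/data
```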
@@ -2,10 +2,13 @@ apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: bcom-airflow
spec:
type: NodePort
selector:
app: postgres
ports:
- port: 5432
targetPort: 5432
nodePort: 30084
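Because the Service is a NodePort on 30084, the database can also be reached from outside the cluster. A sketch for Minikube, assuming the POSTGRES_USER matches the `airflow` values set in the deployment:
```shell
# Connect to Postgres through the NodePort exposed by the Service.
psql -h "$(minikube ip)" -p 30084 -U airflow -d airflow
```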
kubectl apply -f logs-persistenvolumeclaim.yaml
kubectl apply -f airflow-envvars-configmap.yaml
kubectl apply -f airflow-volumes.yaml
kubectl apply -f airflow-rbac.yaml
kubectl apply -f postgres-deployment.yaml
kubectl apply -f postgres-service.yaml
kubectl apply -f airflow-envvars-configmap.yaml
kubectl apply -f airflow-secrets.yaml
kubectl apply -f airflow-webserver-deployment.yaml
kubectl apply -f airflow-webserver-service.yaml
kubectl apply -f airflow-scheduler-deployment.yaml
kubectl apply -f sync-dags-deployment.yaml
kubectl apply -f sync-dags-deployment-s3.yaml
@@ -2,9 +2,9 @@ kubectl delete -f airflow-rbac.yaml
kubectl delete -f postgres-service.yaml
kubectl delete -f postgres-deployment.yaml
kubectl delete -f airflow-secrets.yaml
kubectl delete -f airflow-envvars-configmap.yaml
kubectl delete -f airflow-webserver-service.yaml
kubectl delete -f airflow-webserver-deployment.yaml
kubectl delete -f airflow-scheduler-deployment.yaml
kubectl delete -f logs-persistenvolumeclaim.yaml
kubectl delete -f sync-dags-deployment.yaml
\ No newline at end of file
kubectl delete -f sync-dags-deployment-s3.yaml
kubectl delete -f airflow-volumes.yaml
kubectl delete -f airflow-envvars-configmap.yaml
\ No newline at end of file
apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-sync-dags
namespace: bcom-airflow
spec:
selector:
matchLabels:
app: airflow-sync-dags
template:
metadata:
labels:
app: airflow-sync-dags
spec:
containers:
- args:
- while true; aws s3 sync --exact-timestamps --delete ${S3_DAGS_DIR:-s3://prueba1234568/dags} /dags;
do sleep ${SYNCHRONYZE_DAG_DIR:-30}; done;
command:
- /bin/bash
- -c
- --
name: sync-dags-s3
image: amazon/aws-cli:2.1.34
envFrom:
- configMapRef:
name: airflow-envvars-configmap
env:
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
key: AWS_ACCESS_KEY
name: credentials
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
key: AWS_SECRET_KEY
name: credentials
volumeMounts:
- name: dags-host-volume
mountPath: /dags
volumes:
- name: dags-host-volume
persistentVolumeClaim:
claimName: airflow-dags-pvc
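The container above runs `aws s3 sync` against the PVC-backed /dags directory roughly every SYNCHRONYZE_DAG_DIR seconds. To confirm DAG files are landing on the shared volume:
```shell
# List the synced DAG files from inside the sync pod.
kubectl exec -n bcom-airflow deploy/airflow-sync-dags -- ls -l /dags
```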
@@ -2,7 +2,7 @@ apiVersion: apps/v1
kind: Deployment
metadata:
name: airflow-sync-dags
namespace: bcom-airflow
spec:
selector:
matchLabels:
@@ -44,6 +44,5 @@ spec:
mountPath: /dags
volumes:
- name: dags-host-volume
hostPath:
path: /opt/airflow/dags/dags/
type: Directory
persistentVolumeClaim:
claimName: airflow-dags-pvc