bcom-tp-etl-transformation-pipelines
Commit 53d3f17c, authored Jun 21, 2023 by Cristian Aguirre
Update 20-06-23. Update template files to deploy in Kubernetes.
Parent: b9c35909
Showing 16 changed files with 254 additions and 99 deletions (+254 -99)
README.md  +11 -19
deploy-k8/airflow-envvars-configmap.yaml  +61 -0
deploy-k8/airflow-rbac.yaml  +3 -1
deploy-k8/airflow-scheduler-deployment.yaml  +5 -6
deploy-k8/airflow-secrets.yaml  +3 -3
deploy-k8/airflow-volumes.yaml  +95 -0
deploy-k8/airflow-webserver-deployment.yaml  +3 -5
deploy-k8/airflow-webserver-service.yaml  +5 -4
deploy-k8/logs-persistenvolumeclaim.yaml  +0 -13
deploy-k8/pod_template.yaml  +0 -37
deploy-k8/postgres-deployment.yaml  +9 -1
deploy-k8/postgres-service.yaml  +3 -0
deploy-k8/script-apply.sh  +3 -3
deploy-k8/script-delete.sh  +3 -3
deploy-k8/sync-dags-deployment-s3.yaml  +47 -0
deploy-k8/sync-dags-deployment.yaml  +3 -4
README.md
...
@@ -46,6 +46,7 @@ AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG="<tag>"
 NOTE:
 *Details of each environment variable used by the PODs and Airflow:*
 ```text
+AIRFLOW__KUBERNETES_EXECUTOR__NAMESPACE: bcom-airflow # The k8s namespace where the workers will be created
 AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: '30' # Interval, in seconds, at which the dags folder is scanned for new DAGs
 AIRFLOW__LOGGING__LOGGING_LEVEL: INFO # Airflow log level (webserver, scheduler and workers)
 AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: America/Lima # Timezone of the Airflow web UI
...
...
@@ -80,29 +81,20 @@ MINIO_USER: bWluaW9hZG1pbg== # Usuario de Minio para conectarse al bucket del Mi
 MINIO_PASSWORD: bWluaW9hZG1pbg== # Minio password for connecting to the bucket on the Minio Server
 ```
-4.- Place the file **pod_template.yaml** in the host path /opt/airflow/templates/.
-*Note:* If you are using Minikube, the file must be inside the container
-at the same path. To copy it there, run the following command:
-```shell
-docker cp pod_template.yaml <ID_CONTAINER>:/opt/airflow/templates/
-```
-While in that directory, also create the folder at the following path: */opt/airflow/dags/dags/*
-5.- Run the script **script_apply.sh** to create all the ConfigMaps, volumes, secrets, deployments
+4.- Run the script **script_apply.sh** to create all the ConfigMaps, volumes, secrets, deployments
 and services needed to run Airflow.
+⚠️ _Note_: Check whether the **dags** folder will be synchronized with a bucket in Minio or in S3.
+Depending on this, apply either _sync-dags-deployment.yaml_ or _sync-dags-deployment-s3.yaml_ and
+reference that file in **script-apply.sh** and **script-delete.sh**.
 ```shell
 sh script_apply.sh
 ```
-After running this command, wait a few minutes for all the PODs to be up and running normally
+After running this command, wait a few minutes for all the resources to be up and running normally.
-6.- (OPTIONAL) If you deployed using Minikube, you need to expose the Airflow Webserver port
+5.- (OPTIONAL) If you deployed using Minikube, you need to expose the Airflow Webserver port
 so that you can reach the web UI from your local machine. Because the port is exposed inside the minikube
 container, forward it to your local machine with the following command:
...
...
@@ -112,7 +104,7 @@ kubectl port-forward <ID-POD-AIRFLOW-WEBSERVER> 8081:8080
 You can now open http://localhost:8081 in your browser to see the Airflow web UI
-7.- Check that the PODs are in the "Running" state with the following command:
+6.- Check that the PODs are in the "Running" state with the following command:
 ```shell
 kubectl get pods
...
...
@@ -141,7 +133,7 @@ MINIO_DAGS_DIR: '/prueba-ca/dags'
```
 3.- All that remains is to wait n seconds, according to the variables _SYNCHRONYZE_DAG_DIR_ (the sync POD
-synchronizes the bucket with the local folder every n seconds) and _AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL_
+synchronizes the bucket with the dags folder every n seconds) and _AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL_
 (the Airflow scheduler checks its dags folder for updates every n seconds). Once that time has
 elapsed, the changes will be visible in the Airflow web UI.
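As a rough illustration of the timing described in this step, the delay before a change in the bucket shows up in the Airflow web UI is bounded by approximately the sum of the two intervals. The sketch below is only an estimate: the helper name `max_wait_seconds` is hypothetical, the 30-second defaults are the values that appear in the manifests of this commit, and the time taken by the sync itself and by DAG parsing is ignored.

```shell
#!/bin/sh
# Rough upper bound, in seconds, for a DAG change to become visible:
# one full sleep of the sync loop plus one full scheduler scan interval.
# Ignores how long the sync and the DAG parsing themselves take.
max_wait_seconds() {
  echo $(( $1 + $2 ))
}

# Fall back to 30 seconds when the variables are not set in the environment.
max_wait_seconds "${SYNCHRONYZE_DAG_DIR:-30}" "${AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL:-30}"
```

Lowering either interval shortens the wait, at the cost of more frequent bucket listings or directory scans.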
...
...
@@ -152,7 +144,7 @@ _Requirements to run the DAG:_
 - The environment is deployed correctly and the PODs are in the "Running" state
 - Create the connection in the Airflow web UI using the name configured in app_conf.yml under the
-  parameter **s3_conn_id**.
+  parameter **bcom_tp_connection**.
 - Validate all paths (bucket and prefixes) and settings for the 7 inputs in app_conf.yml.
 1.- In the Airflow web UI, locate the DAG named: **"BCOM_DAG_TRANSFORMACION_TACOMVENTAS_PROMOCIONESRESIDENCIAL"**
...
...
deploy-k8/airflow-envvars-configmap.yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: bcom-airflow
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: bcom-airflow
+  namespace: bcom-airflow
+---
 apiVersion: v1
 kind: ConfigMap
 metadata:
   name: airflow-envvars-configmap
+  namespace: bcom-airflow
 data:
   # The conf below is necessary because of a typo in the config on docker-airflow image:
   # https://github.com/puckel/docker-airflow/blob/bed777970caa3e555ef618d84be07404438c27e3/config/airflow.cfg#L934
+  AIRFLOW__KUBERNETES_EXECUTOR__NAMESPACE: bcom-airflow
   AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: '30'
   AIRFLOW__LOGGING__LOGGING_LEVEL: INFO
   AIRFLOW__WEBSERVER__DEFAULT_UI_TIMEZONE: America/Lima
...
@@ -28,4 +44,49 @@ data:
   MINIO_SERVER: 'http://192.168.49.2:9000'
   MINIO_DAGS_DIR: '/prueba-ca/dags'
+---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: pod-template-config
+  namespace: bcom-airflow
+data:
+  pod_template.yaml: |
+    apiVersion: v1
+    kind: Pod
+    metadata:
+      name: dummy-name
+      namespace: bcom-airflow
+    spec:
+      containers:
+        - args: [ ]
+          command: [ ]
+          envFrom:
+            - configMapRef:
+                name: airflow-envvars-configmap
+          env:
+            - name: AIRFLOW__CORE__EXECUTOR
+              value: LocalExecutor
+          image: dumy-image
+          imagePullPolicy: IfNotPresent
+          name: base
+          volumeMounts:
+            - name: dags-host-volume
+              mountPath: /opt/airflow/dags
+            - name: logs-persistent-storage
+              mountPath: /opt/airflow/logs
+      hostNetwork: false
+      restartPolicy: OnFailure
+      securityContext:
+        runAsUser: 50000
+      nodeSelector: { }
+      affinity: { }
+      tolerations: [ ]
+      volumes:
+        - name: dags-host-volume
+          persistentVolumeClaim:
+            claimName: airflow-dags-pvc
+        - name: logs-persistent-storage
+          persistentVolumeClaim:
+            claimName: airflow-logs-pvc
deploy-k8/airflow-rbac.yaml
...
@@ -2,6 +2,7 @@ kind: ClusterRole
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   name: pods-permissions
+  namespace: bcom-airflow
 rules:
   - apiGroups: [""]
     resources: ["pods"]
...
@@ -13,10 +14,11 @@ kind: ClusterRoleBinding
 apiVersion: rbac.authorization.k8s.io/v1
 metadata:
   name: pods-permissions
+  namespace: bcom-airflow
 subjects:
   - kind: ServiceAccount
     name: default
-    namespace: default
+    namespace: bcom-airflow
 roleRef:
   kind: ClusterRole
   name: pods-permissions
...
...
deploy-k8/airflow-scheduler-deployment.yaml
...
@@ -2,6 +2,7 @@ apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: airflow-scheduler
+  namespace: bcom-airflow
   labels:
     app: airflow-k8s
...
@@ -34,13 +35,11 @@ spec:
               mountPath: /opt/airflow/templates
       volumes:
         - name: dags-host-volume
-          hostPath:
-            path: /opt/airflow/dags/dags/
-            type: Directory
+          persistentVolumeClaim:
+            claimName: airflow-dags-pvc
         - name: pods-templates
-          hostPath:
-            path: /opt/airflow/templates/
-            type: Directory
+          configMap:
+            name: pod-template-config
         - name: logs-persistent-storage
           persistentVolumeClaim:
             claimName: airflow-logs-pvc
deploy-k8/airflow-secrets.yaml
...
@@ -2,13 +2,13 @@ apiVersion: v1
 kind: Secret
 metadata:
   name: credentials
+  namespace: bcom-airflow
 type: Opaque
 data:
-  AWS_ACCESS_KEY: bWluaW9hZG1pbg==
-  AWS_SECRET_KEY: bWluaW9hZG1pbg==
+  AWS_ACCESS_KEY: QUtJQVFBQU1YTzNaNEJITktFSUU=
+  AWS_SECRET_KEY: K01VbW4zRW9pZ1k5M3c1UnhOdG1DY3hWK0Vya1pnRVhxeFVralhVMw==
   MINIO_USER: bWluaW9hZG1pbg==
   MINIO_PASSWORD: bWluaW9hZG1pbg==
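The values under `data:` in a Secret are base64-encoded strings. The following is a minimal sketch of how such a value could be produced and checked; the function names `encode_secret` and `decode_secret` are hypothetical helpers, and `base64 -d` assumes an implementation that accepts that flag (for example GNU coreutils). Note that base64 is an encoding, not encryption, so anyone who can read the manifest can recover the original value.

```shell
#!/bin/sh
# Encode a plain-text value for the `data:` section of a Secret.
# printf is used instead of echo so that no trailing newline is encoded.
encode_secret() {
  printf '%s' "$1" | base64
}

# Decode a value copied from the manifest.
# Assumes a base64 implementation that accepts -d (e.g. GNU coreutils).
decode_secret() {
  printf '%s' "$1" | base64 -d
}

encode_secret 'minioadmin'          # prints bWluaW9hZG1pbg==
decode_secret 'bWluaW9hZG1pbg=='    # prints minioadmin (without a trailing newline)
echo
```

Encoding with `echo` instead of `printf` would include a newline in the stored value, which is a common cause of authentication failures that are hard to spot.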
deploy-k8/airflow-volumes.yaml
0 → 100644
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: airflow-dags-pv
+  namespace: bcom-airflow
+spec:
+  capacity:
+    storage: 300Mi
+  accessModes:
+    - ReadWriteMany
+  storageClassName: airflow-dags
+  nfs:
+    server: 192.168.1.11
+    path: "/mnt/nfs_share"
+---
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: airflow-postgres-pv
+  namespace: bcom-airflow
+spec:
+  capacity:
+    storage: 8000Mi
+  accessModes:
+    - ReadWriteMany
+  storageClassName: airflow-postgres
+  nfs:
+    server: 192.168.1.11
+    path: "/mnt/nfs_postgres"
+---
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: airflow-logs-pv
+  namespace: bcom-airflow
+spec:
+  capacity:
+    storage: 4000Mi
+  accessModes:
+    - ReadWriteMany
+  storageClassName: airflow-logs
+  nfs:
+    server: 192.168.1.11
+    path: "/mnt/nfs_logs"
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: airflow-dags-pvc
+  namespace: bcom-airflow
+spec:
+  accessModes:
+    - ReadWriteMany
+  storageClassName: airflow-dags
+  resources:
+    requests:
+      storage: 200Mi
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: airflow-postgres-pvc
+  namespace: bcom-airflow
+spec:
+  accessModes:
+    - ReadWriteMany
+  storageClassName: airflow-postgres
+  resources:
+    requests:
+      storage: 7500Mi
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: airflow-logs-pvc
+  namespace: bcom-airflow
+  labels:
+    app: airflow-k8s
+spec:
+  accessModes:
+    - ReadWriteMany
+  resources:
+    requests:
+      storage: 3500Mi
+  storageClassName: airflow-logs
\ No newline at end of file
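Each claim in this file is paired with a volume through a shared `storageClassName` and access mode, and a claim can only bind to a volume whose capacity is at least the requested size. The sketch below checks only the size condition for the three pairs, using the values from the manifest above (in Mi); the helper name `fits` is hypothetical and the other binding conditions are not checked.

```shell
#!/bin/sh
# Return success when a claim's request (first argument, in Mi)
# does not exceed the volume's capacity (second argument, in Mi).
fits() {
  [ "$1" -le "$2" ]
}

# Columns: storage class, PVC request (Mi), PV capacity (Mi).
while read -r class request capacity; do
  if fits "$request" "$capacity"; then
    echo "$class: request ${request}Mi fits in ${capacity}Mi"
  else
    echo "$class: request ${request}Mi exceeds ${capacity}Mi"
  fi
done <<EOF
airflow-dags 200 300
airflow-postgres 7500 8000
airflow-logs 3500 4000
EOF
```

With the values in this commit all three requests fit, so the size condition should not prevent any of the claims from binding.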
deploy-k8/airflow-webserver-deployment.yaml
...
@@ -2,6 +2,7 @@ apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: airflow-webserver
+  namespace: bcom-airflow
   labels:
     app: airflow-k8s
...
@@ -25,8 +26,6 @@ spec:
           envFrom:
             - configMapRef:
                 name: airflow-envvars-configmap
-          ports:
-            - containerPort: 8080
           volumeMounts:
             - name: dags-host-volume
               mountPath: /opt/airflow/dags/
...
@@ -34,9 +33,8 @@ spec:
               mountPath: /opt/airflow/logs
       volumes:
         - name: dags-host-volume
-          hostPath:
-            path: /opt/airflow/dags/dags/
-            type: Directory
+          persistentVolumeClaim:
+            claimName: airflow-dags-pvc
         - name: logs-persistent-storage
           persistentVolumeClaim:
             claimName: airflow-logs-pvc
deploy-k8/airflow-webserver-service.yaml
...
@@ -2,17 +2,18 @@ apiVersion: v1
 kind: Service
 metadata:
   name: airflow-webserver
+  namespace: bcom-airflow
   labels:
     app: airflow-k8s
 spec:
   type: NodePort
   selector:
     app: airflow-webserver
   ports:
-    - name: web
+    - appProtocol: http
+      name: airflow-webserver
+      port: 8080
       protocol: TCP
-      port: 8081
       targetPort: 8080
-      nodePort: 30080
\ No newline at end of file
deploy-k8/logs-persistenvolumeclaim.yaml
deleted 100644 → 0
-apiVersion: v1
-kind: PersistentVolumeClaim
-metadata:
-  name: airflow-logs-pvc
-  labels:
-    app: airflow-k8s
-spec:
-  accessModes:
-    - ReadWriteMany
-  resources:
-    requests:
-      storage: 2Gi
-  storageClassName: standard
deploy-k8/pod_template.yaml
deleted 100644 → 0
-apiVersion: v1
-kind: Pod
-metadata:
-  name: dummy-name
-spec:
-  containers:
-    - args: [ ]
-      command: [ ]
-      envFrom:
-        - configMapRef:
-            name: airflow-envvars-configmap
-      env:
-        - name: AIRFLOW__CORE__EXECUTOR
-          value: LocalExecutor
-      image: dumy-image
-      imagePullPolicy: IfNotPresent
-      name: base
-      volumeMounts:
-        - name: dags-host-volume
-          mountPath: /opt/airflow/dags
-        - name: logs-persistent-storage
-          mountPath: /opt/airflow/logs
-  hostNetwork: false
-  restartPolicy: IfNotPresent
-  securityContext:
-    runAsUser: 50000
-  nodeSelector: { }
-  affinity: { }
-  tolerations: [ ]
-  volumes:
-    - name: dags-host-volume
-      hostPath:
-        path: /opt/airflow/dags/dags/
-        type: Directory
-    - name: logs-persistent-storage
-      persistentVolumeClaim:
-        claimName: airflow-logs-pvc
deploy-k8/postgres-deployment.yaml
...
@@ -2,6 +2,7 @@ apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: postgres
+  namespace: bcom-airflow
 spec:
   selector:
     matchLabels:
...
@@ -31,3 +32,10 @@ spec:
               value: airflow
             - name: POSTGRES_DB
               value: airflow
+          volumeMounts:
+            - name: pgdatavol
+              mountPath: /var/lib/postgresql/data
+      volumes:
+        - name: pgdatavol
+          persistentVolumeClaim:
+            claimName: airflow-postgres-pvc
\ No newline at end of file
deploy-k8/postgres-service.yaml
...
@@ -2,10 +2,13 @@ apiVersion: v1
 kind: Service
 metadata:
   name: postgres
+  namespace: bcom-airflow
 spec:
+  type: NodePort
   selector:
     app: postgres
   ports:
     - port: 5432
       targetPort: 5432
+      nodePort: 30084
deploy-k8/script-apply.sh
-kubectl apply -f logs-persistenvolumeclaim.yaml
+kubectl apply -f airflow-envvars-configmap.yaml
+kubectl apply -f airflow-volumes.yaml
 kubectl apply -f airflow-rbac.yaml
 kubectl apply -f postgres-deployment.yaml
 kubectl apply -f postgres-service.yaml
-kubectl apply -f airflow-envvars-configmap.yaml
 kubectl apply -f airflow-secrets.yaml
 kubectl apply -f airflow-webserver-deployment.yaml
 kubectl apply -f airflow-webserver-service.yaml
 kubectl apply -f airflow-scheduler-deployment.yaml
-kubectl apply -f sync-dags-deployment.yaml
+kubectl apply -f sync-dags-deployment-s3.yaml
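The README in this commit asks the operator to edit this script by hand depending on whether the dags folder is synchronized from Minio or from S3. One possible alternative, shown here only as a sketch, is to pick the manifest from a variable. `SYNC_BACKEND` and `sync_manifest` are hypothetical names that do not exist in the repository, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Map a backend name to the sync manifest shipped in deploy-k8/.
# SYNC_BACKEND is a hypothetical variable, not part of the commit.
sync_manifest() {
  case "$1" in
    s3)    echo "sync-dags-deployment-s3.yaml" ;;
    minio) echo "sync-dags-deployment.yaml" ;;
    *)     echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Default to S3, which is what script-apply.sh applies after this commit.
manifest=$(sync_manifest "${SYNC_BACKEND:-s3}") || exit 1
echo "kubectl apply -f $manifest"
```

The same function could be reused in script-delete.sh so that both scripts always refer to the same manifest.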
deploy-k8/script-delete.sh
...
@@ -2,9 +2,9 @@ kubectl delete -f airflow-rbac.yaml
 kubectl delete -f postgres-service.yaml
 kubectl delete -f postgres-deployment.yaml
 kubectl delete -f airflow-secrets.yaml
-kubectl delete -f airflow-envvars-configmap.yaml
 kubectl delete -f airflow-webserver-service.yaml
 kubectl delete -f airflow-webserver-deployment.yaml
 kubectl delete -f airflow-scheduler-deployment.yaml
-kubectl delete -f logs-persistenvolumeclaim.yaml
-kubectl delete -f sync-dags-deployment.yaml
\ No newline at end of file
+kubectl delete -f sync-dags-deployment-s3.yaml
+kubectl delete -f airflow-volumes.yaml
+kubectl delete -f airflow-envvars-configmap.yaml
\ No newline at end of file
deploy-k8/sync-dags-deployment-s3.yaml
0 → 100644
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: airflow-sync-dags
+  namespace: bcom-airflow
+spec:
+  selector:
+    matchLabels:
+      app: airflow-sync-dags
+  template:
+    metadata:
+      labels:
+        app: airflow-sync-dags
+    spec:
+      containers:
+        - args:
+            - while true; aws s3 sync --exact-timestamps --delete ${S3_DAGS_DIR:-s3://prueba1234568/dags} /dags;
+              do sleep ${SYNCHRONYZE_DAG_DIR:-30}; done;
+          command:
+            - /bin/bash
+            - -c
+            - --
+          name: sync-dags-s3
+          image: amazon/aws-cli:2.1.34
+          envFrom:
+            - configMapRef:
+                name: airflow-envvars-configmap
+          env:
+            - name: AWS_ACCESS_KEY_ID
+              valueFrom:
+                secretKeyRef:
+                  key: AWS_ACCESS_KEY
+                  name: credentials
+            - name: AWS_SECRET_ACCESS_KEY
+              valueFrom:
+                secretKeyRef:
+                  key: AWS_SECRET_KEY
+                  name: credentials
+          volumeMounts:
+            - name: dags-host-volume
+              mountPath: /dags
+      volumes:
+        - name: dags-host-volume
+          persistentVolumeClaim:
+            claimName: airflow-dags-pvc
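The container arguments above put the sync command inside the loop condition (`while true; aws s3 sync ...; do sleep ...; done`), so the loop keeps running only while the sync succeeds. The sketch below reproduces that shape with a hypothetical stand-in, `sync_once`, that succeeds twice and then fails; it does not call the AWS CLI and does not sleep.

```shell
#!/bin/sh
# Hypothetical stand-in for `aws s3 sync`: succeeds on the first two calls, then fails.
calls=0
sync_once() {
  calls=$((calls + 1))
  [ "$calls" -le 2 ]
}

iterations=0
# Same shape as the container args: the condition is the command list
# `true; sync_once`, so the loop continues only while sync_once succeeds.
while true; sync_once; do
  iterations=$((iterations + 1))   # the real loop sleeps here instead
done

echo "sync attempts: $calls, completed iterations: $iterations"
# prints: sync attempts: 3, completed iterations: 2
```

In the Deployment, a failed sync therefore ends the shell and the container exits; since Pods managed by a Deployment use the `Always` restart policy, the container would then be restarted rather than the loop retrying on its own.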
deploy-k8/sync-dags-deployment.yaml
...
@@ -2,7 +2,7 @@ apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: airflow-sync-dags
+  namespace: bcom-airflow
 spec:
   selector:
     matchLabels:
...
@@ -44,6 +44,5 @@ spec:
               mountPath: /dags
       volumes:
         - name: dags-host-volume
-          hostPath:
-            path: /opt/airflow/dags/dags/
-            type: Directory
+          persistentVolumeClaim:
+            claimName: airflow-dags-pvc