# general / bcom-tp-etl-transformation-pipelines

Commit `0e639513`, authored Aug 02, 2023 by Cristian Aguirre. Parent: `03b3285a`.

> Update 01-08-23. Fix some bugs (377, 378)

Showing 15 changed files with 113 additions and 71 deletions (+113, -71):
| File | Changes |
| --- | --- |
| `README.md` | +29, -1 |
| `dags/components/Extractor.py` | +1, -0 |
| `dags/components/Generation.py` | +1, -0 |
| `dags/components/S3Route.py` | +3, -1 |
| `dags/components/Utils.py` | +1, -1 |
| `dags/dag_conf.yml` | +7, -7 |
| `dags/dag_inform_process.py` | +18, -17 |
| `deploy-k8/airflow-envvars-configmap.yaml` | +21, -16 |
| `deploy-k8/airflow-scheduler-deployment.yaml` | +4, -0 |
| `deploy-k8/airflow-volumes.yaml` | +12, -12 |
| `deploy-k8/airflow-webserver-deployment.yaml` | +4, -0 |
| `deploy-k8/postgres-deployment.yaml` | +2, -2 |
| `deploy-k8/script-apply.sh` | +1, -1 |
| `deploy-k8/script-delete.sh` | +1, -1 |
| `deploy-k8/sync-dags-deployment-gcs.yaml` | +8, -12 |
**`README.md`** (+29, -1)

`@@ -456,8 +456,36 @@ Listo. Con esto se ha configurado todos los servicios requeridos para levantar n` expands the GKE deployment section. The new content (Spanish in the repository, translated here):

### Deploying Airflow with GKE

⚠️ Note: before deploying our components, we must make sure they fit within our cluster's resources (in terms of vCPU and RAM). To do this, check the deployment-type templates, i.e. those whose names end with the word _"deployment"_: "airflow-scheduler-deployment.yaml", "airflow-webserver-deployment.yaml", "postgres-deployment.yaml" and "sync-dags-deployment-gcs.yaml" (in this case, since we are on GCP), plus the file "airflow-envvars-configmap.yaml", which holds the resources for the workers. All of the templates mentioned carry "containers.resources.limits" and/or "containers.resources.requests" parameters, which set the maximum and the minimum, respectively (a scripted version of this check is sketched right after this section).

1. Once our Bucket and NFS (Filestore) are configured, we can deploy our Airflow system on Kubernetes. To do so, move into the **"deploy-k8"** folder and run the following command:

   ```shell
   sh script-apply.sh
   ```

   This will start every component; within a few seconds or minutes everything will be up and ready to use.

2. To confirm that all components are up, run the following command:

   ```shell
   kubectl get pods -n bcom-airflow
   ```

   The output should look similar to this: *(screenshot of the pod list in the original README)*
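The resource check described in the warning above can be scripted. A minimal sketch, assuming PyYAML is available and that it runs from inside `deploy-k8/`; the traversal below is an illustration, not a script that ships with the repository:

```python
import yaml  # PyYAML; assumed available, not part of the repo

# Deployment-type templates named in the README warning.
TEMPLATES = [
    "airflow-scheduler-deployment.yaml",
    "airflow-webserver-deployment.yaml",
    "postgres-deployment.yaml",
    "sync-dags-deployment-gcs.yaml",
]

for name in TEMPLATES:
    with open(name) as fh:
        # Each file may hold several YAML documents separated by '---'.
        for doc in yaml.safe_load_all(fh):
            if not doc or doc.get("kind") != "Deployment":
                continue
            for c in doc["spec"]["template"]["spec"].get("containers", []):
                res = c.get("resources", {})
                print(f"{name} / {c['name']}: "
                      f"requests={res.get('requests')} limits={res.get('limits')}")
```

Summing the printed requests against the cluster's allocatable vCPU and memory tells you whether the deployments will schedule.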
**`dags/components/Extractor.py`** (+1, -0). Imports the new `DatabaseTypeEnum`:

```diff
@@ -4,6 +4,7 @@ import json
 import numpy as np
 import pandas as pd
 from enums.ProcessStatusEnum import ProcessStatusEnum
+from enums.DatabaseTypeEnum import DatabaseTypeEnum
 from components.Utils import select_multiple, generateModel
 from components.DatabaseOperation.DatabaseExtraction import get_iterator, get_steps
 from components.DatabaseOperation.DatabaseLoad import save_from_dataframe
```
**`dags/components/Generation.py`** (+1, -0). Same new import:

```diff
@@ -8,6 +8,7 @@ from airflow.decorators import task
 from airflow.exceptions import AirflowSkipException
 from enums.ProcessStatusEnum import ProcessStatusEnum
+from enums.DatabaseTypeEnum import DatabaseTypeEnum
 from components.S3Route import save_df_to_s3, load_control_to_s3
 from components.Utils import select_multiple, create_temp_file, delete_temp_dir
 from components.Control import get_tasks_from_control, update_new_process
```
**`dags/components/S3Route.py`** (+3, -1). Adds a period-format regex and uses it to filter the listed keys:

```diff
@@ -4,6 +4,7 @@ from typing import Any, Dict, List, Tuple
 import pytz
 from io import BytesIO, StringIO
 import pandas as pd
+import re
 from components.Utils import get_type_file
 from enums.FileTypeEnum import FileTypeEnum
@@ -181,6 +182,7 @@ def get_file_from_prefix(conn: str, bucket: str, key: str, provider: str, timezo
                          frequency: str = "montly") -> Any:
     result, key_result = BytesIO(), ''
     try:
+        format_re_pattern = '[0-9]{4}-[0-9]{2}\.'
         format_date = "%Y-%m" if frequency == "montly" else "%Y-%W"
         period = str(datetime_by_tzone(timezone, format_date))[:7]
         logger.info(f"Periodo actual: {period}.")
@@ -197,7 +199,7 @@ def get_file_from_prefix(conn: str, bucket: str, key: str, provider: str, timezo
         files = gcp_hook.list(bucket, prefix=key)
         files_with_period = []
         for file in files:
-            if file.endswith("/"):
+            if file.endswith("/") or not re.search(format_re_pattern, file):
                 continue
             file_period = file[file.rfind("_") + 1:file.rfind(".")]
             files_with_period.append((file, file_period))
```
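The added `format_re_pattern` guard means the listing loop now skips any key that does not embed a `YYYY-MM.` period right before its extension. A standalone sketch of that filter with made-up key names (the bucket contents are hypothetical):

```python
import re

format_re_pattern = '[0-9]{4}-[0-9]{2}\\.'

# Hypothetical bucket listing; only keys carrying a YYYY-MM period survive.
files = [
    "bcom_results/",                        # folder marker: skipped
    "bcom_results/tabla_2023-07.csv",       # matches: period "2023-07"
    "bcom_results/tabla_sin_periodo.csv",   # no period: now skipped too
]

files_with_period = []
for file in files:
    if file.endswith("/") or not re.search(format_re_pattern, file):
        continue
    file_period = file[file.rfind("_") + 1:file.rfind(".")]
    files_with_period.append((file, file_period))

print(files_with_period)  # [('bcom_results/tabla_2023-07.csv', '2023-07')]
```

Before this change, a key like `tabla_sin_periodo.csv` would have reached the `rfind` slicing and produced a garbage period.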
**`dags/components/Utils.py`** (+1, -1). Computes `init_index` on the same space-stripped string that is later sliced:

```diff
@@ -107,7 +107,7 @@ def update_sql_commands(dataset: List[Tuple[str, str]], label_tablename: str) ->
                 final_data[-1] = final_data[-1] + "; end;"
             final_item = item
             if item.lower().strip().find(label_tablename.lower().strip() + ":") != -1:
-                init_index = item.lower().strip().index(label_tablename.lower().strip() + ":")
+                init_index = item.replace(" ", "").lower().strip().index(label_tablename.lower().strip() + ":")
                 table_name = item.replace(" ", "").strip()[init_index + len(label_tablename + ":"):].strip()
                 add_next = True
             elif item != "":
```
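Why this one-line fix matters: `init_index` is used to slice `item.replace(" ", "")`, so it has to be measured on that same space-stripped string; measuring it on the spaced string shifts the slice right by one for every space before the label. A minimal reproduction with a hypothetical label and comment line:

```python
label_tablename = "TABLE"
item = "-- comment   TABLE: my_table"   # hypothetical line with extra spaces

stripped = item.replace(" ", "")        # "--commentTABLE:my_table"

# Old behaviour: index measured on the spaced string, slice taken from the stripped one.
old_index = item.lower().strip().index(label_tablename.lower() + ":")      # 13
print(stripped.strip()[old_index + len(label_tablename + ":"):])           # "able" (wrong)

# Fixed behaviour (this commit): both measured on the same space-stripped string.
new_index = stripped.lower().strip().index(label_tablename.lower() + ":")  # 9
print(stripped.strip()[new_index + len(label_tablename + ":"):])           # "my_table"
```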
**`dags/dag_conf.yml`** (+7, -7). Renames the target databases and buckets and switches the cloud provider:

```diff
@@ -10,7 +10,7 @@ app:
     port: 3306
     username: admin
     password: adminadmin
-    database: prueba_bcom
+    database: prueba_ca_1
     service: ORCLPDB1
     schema: sources
   transformation:
@@ -19,7 +19,7 @@ app:
     port: 3306
     username: admin
     password: adminadmin
-    database: prueba_bcom2
+    database: prueba_ca_2
     service:
     schema: intern_db
   chunksize: 8000
@@ -28,16 +28,16 @@ app:
   procedure_mask: procedure  # S
   transformation_mask: transform  # S
   prefix_order_delimiter: .
-  cloud_provider: google
+  cloud_provider: aws
   scripts:
     s3_params:
-      bucket: prueba-airflow3
+      bucket: prueba-airflow13
       prefix: bcom_scripts
       connection_id: conn_script
   control:
     s3_params:
       connection_id: conn_script
-      bucket: prueba-airflow3
+      bucket: prueba1234568
       prefix: bcom_control
       filename: control_<period>.json
     timezone: 'GMT-5'
@@ -48,12 +48,12 @@ app:
     delimiter: '|'
     tmp_path: /tmp
     s3_params:
-      bucket: prueba-airflow3
+      bucket: prueba-airflow13
       prefix: bcom_results
       connection_id: conn_script
   report:
     s3_params:
-      bucket: prueba-airflow3
+      bucket: prueba1234568
       prefix: bcom_report
       connection_id: conn_script
       filename: report_<datetime>.xlsx
```
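For orientation, every key above hangs off the `app:` root of `dags/dag_conf.yml`. A hedged sketch of reading the renamed values, assuming PyYAML and the nesting implied by the hunks; the loading code is illustrative, not the pipeline's actual mechanism:

```python
import yaml  # PyYAML; assumed available

with open("dags/dag_conf.yml") as fh:
    app = yaml.safe_load(fh)["app"]

# Values this commit renames:
print(app["cloud_provider"])                  # "aws" (was "google")
print(app["scripts"]["s3_params"]["bucket"])  # "prueba-airflow13" (was "prueba-airflow3")
print(app["control"]["s3_params"]["bucket"])  # "prueba1234568" (was "prueba-airflow3")
```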
**`dags/dag_inform_process.py`** (+18, -17). Rewords the report header, guards the per-task rows against reset processes, and fixes the reset branch of `get_data_report`:

```diff
@@ -76,7 +76,7 @@ def create_report(tmp_path: str, **kwargs) -> None:
         title_format.set_font_size(20)
         title_format.set_font_color("#333333")
-        header = f"Reporte ejecutado el día {execution_date}"
+        header = f"Proceso ejecutado el día {execution_date}"
         if status == ProcessStatusEnum.SUCCESS.value:
             status = "EXITOSO"
         elif status == ProcessStatusEnum.FAIL.value:
@@ -95,20 +95,20 @@ def create_report(tmp_path: str, **kwargs) -> None:
         row_format = workbook.add_format()
         row_format.set_font_size(8)
         row_format.set_font_color("#000000")
+        if status != ProcessStatusEnum.RESET.value:
             base_index = 5
             for index, key in enumerate(data.keys()):
                 index = base_index + index
                 worksheet.merge_range('A' + str(index) + ':B' + str(index), key, row_format)
                 if data[key]["TYPE"] == "EXTRACTION":
                     worksheet.merge_range('C' + str(index) + ':G' + str(index),
                                           f"TABLA DE EXTRACCIÓN: {data[key]['DESCRIPTION']}", row_format)
                 elif data[key]["TYPE"] == "TRANSFORMATION":
                     script = data[key]["DESCRIPTION"].split("|")[1]
                     worksheet.merge_range('C' + str(index) + ':G' + str(index),
                                           f"SCRIPT DE TRANSFORMACIÓN: {script}", row_format)
                 elif data[key]["TYPE"] == "GENERATION":
                     worksheet.merge_range('C' + str(index) + ':G' + str(index),
                                           f"ARCHIVO GENERADO DESDE LA TABLA: {data[key]['DESCRIPTION']}", row_format)
                 worksheet.merge_range('H' + str(index) + ':I' + str(index),
                                       f"ESTADO: {data[key]['STATUS']}", row_format)
                 worksheet.merge_range('J' + str(index) + ':N' + str(index),
                                       data[key]['MESSAGE'], row_format)
         task.xcom_push(key="REPORT_PATH", value=excel_tmp_path)
     except Exception as e:
         logger.error(f"Error creando reporte. {e}")
```

*(The body under the new `if` was re-indented one level; those lines are otherwise unchanged, which accounts for most of this file's paired additions and deletions.)*

```diff
@@ -125,7 +125,8 @@ def get_data_report(**kwargs) -> None:
         else:
             last_process = control[-1]
             if "reset_by_user" in last_process.keys():
-                report_data["PROCESS_EXECUTION"] = ProcessStatusEnum.RESET.value
+                report_data["PROCESS_STATUS"] = ProcessStatusEnum.RESET.value
+                report_data["PROCESS_EXECUTION"] = last_process["date"]
             else:
                 total_tasks = [last_process["tasks"]]
                 current_status = last_process["status"]
@@ -161,7 +162,7 @@ def get_data_report(**kwargs) -> None:
             report_data.update({item: {"STATUS": final_key_tasks[item], "TYPE": type_task,
                                        "DESCRIPTION": final_key_desc[item], 'MESSAGE': final_key_message[item]}})
         report_data.update({"PROCESS_STATUS": current_status, "PROCESS_EXECUTION": last_process["date"]})
         task.xcom_push(key="REPORT-DATA", value=report_data)
         logger.info(f"Diccionario de datos para el reporte: {report_data}")
     except Exception as e:
         logger.error(f"Error general creando reporte. {e}")
```

*(The single modified line in this last hunk is indistinguishable in the page capture, most likely a whitespace-only change, so the hunk is shown as context.)*
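Both hunks above branch on `ProcessStatusEnum`, whose definition (`enums/ProcessStatusEnum.py`) is not part of this diff. A hypothetical sketch of its shape, inferred only from the members referenced here; the member values are assumptions:

```python
from enum import Enum

class ProcessStatusEnum(Enum):
    # Only the member names appear in the diff; these values are guesses.
    SUCCESS = "success"
    FAIL = "fail"
    RESET = "reset"

# The new guard in create_report skips the per-task rows for a reset process:
status = ProcessStatusEnum.RESET.value
if status != ProcessStatusEnum.RESET.value:
    print("write task rows")  # not reached when the last process was reset
```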
**`deploy-k8/airflow-envvars-configmap.yaml`** (+21, -16). Comments out the Namespace/ServiceAccount manifests, adds worker resource limits to the pod template, bumps the worker image tag, switches to `LocalExecutor`, disables the example DAGs, and adds `GCS_DAGS_DIR`:

```diff
@@ -1,13 +1,13 @@
-apiVersion: v1
-kind: Namespace
-metadata:
-  name: bcom-airflow
----
-apiVersion: v1
-kind: ServiceAccount
-metadata:
-  name: bcom-airflow
-  namespace: bcom-airflow
----
+# apiVersion: v1
+# kind: Namespace
+# metadata:
+#   name: bcom-airflow
+# ---
+# apiVersion: v1
+# kind: ServiceAccount
+# metadata:
+#   name: bcom-airflow
+#   namespace: bcom-airflow
+# ---
 apiVersion: v1
 kind: ConfigMap
@@ -36,6 +36,10 @@ data:
       value: LocalExecutor
     image: dumy-image
     imagePullPolicy: IfNotPresent
+    resources:
+      limits:
+        cpu: "1000m"
+        memory: "2Gi"
     name: base
     volumeMounts:
     - name: dags-host-volume
@@ -77,19 +81,20 @@ data:
   AIRFLOW__CORE__DEFAULT_TIMEZONE: America/Lima
   AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS: '{"_request_timeout": [60,60]}'
   AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY: cristianfernando/airflow_custom
-  AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: "0.0.5"
+  AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG: "0.0.6"
   AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM: airflow-logs-pvc
   AIRFLOW__KUBERNETES__ENV_FROM_CONFIGMAP_REF: airflow-envvars-configmap
   AIRFLOW__KUBERNETES_EXECUTOR__POD_TEMPLATE_FILE: /opt/airflow/templates/pod_template.yaml
-  AIRFLOW__CORE__EXECUTOR: KubernetesExecutor
+  AIRFLOW__CORE__EXECUTOR: LocalExecutor
   AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
   AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
-  AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
+  AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
   _AIRFLOW_DB_UPGRADE: 'true'
   _AIRFLOW_WWW_USER_CREATE: 'true'
   _AIRFLOW_WWW_USER_USERNAME: admin
   _AIRFLOW_WWW_USER_PASSWORD: admin
   S3_DAGS_DIR: 's3://prueba1234568/dags'
+  GCS_DAGS_DIR: 'gs://prueba-rsync2/carpeta'
   SYNCHRONYZE_DAG_DIR: '30'
   MINIO_SERVER: 'http://192.168.49.2:9000'
   MINIO_DAGS_DIR: '/prueba-ca/dags'
\ No newline at end of file
```
**`deploy-k8/airflow-scheduler-deployment.yaml`** (+4, -0). Adds resource requests to the scheduler container:

```diff
@@ -22,6 +22,10 @@ spec:
       containers:
         - name: airflow-scheduler
           image: cristianfernando/airflow_custom:0.0.4
+          resources:
+            requests:
+              cpu: "1000m"
+              memory: "4Gi"
           args: ["scheduler"]
           envFrom:
             - configMapRef:
```
**`deploy-k8/airflow-volumes.yaml`** (+12, -12). Points the NFS volumes at a new server and path and raises every capacity and request:

```diff
@@ -5,13 +5,13 @@ metadata:
   namespace: bcom-airflow
 spec:
   capacity:
-    storage: 300Mi
+    storage: 5Gi
   accessModes:
     - ReadWriteMany
   storageClassName: airflow-dags
   nfs:
-    server: 192.168.1.9
-    path: "/mnt/nfs_share"
+    server: 10.216.137.186
+    path: "/volume1/nfs_share"
 ---
@@ -22,13 +22,13 @@ metadata:
   namespace: bcom-airflow
 spec:
   capacity:
-    storage: 8000Mi
+    storage: 16Gi
   accessModes:
     - ReadWriteMany
   storageClassName: airflow-postgres
   nfs:
-    server: 192.168.1.9
-    path: "/mnt/nfs_postgres"
+    server: 10.216.137.186
+    path: "/volume1/nfs_postgres"
 ---
@@ -39,13 +39,13 @@ metadata:
   namespace: bcom-airflow
 spec:
   capacity:
-    storage: 4000Mi
+    storage: 10Gi
   accessModes:
     - ReadWriteMany
   storageClassName: airflow-logs
   nfs:
-    server: 192.168.1.9
-    path: "/mnt/nfs_logs"
+    server: 10.216.137.186
+    path: "/volume1/nfs_logs"
 ---
@@ -60,7 +60,7 @@ spec:
   storageClassName: airflow-dags
   resources:
     requests:
-      storage: 200Mi
+      storage: 5Gi
 ---
@@ -75,7 +75,7 @@ spec:
   storageClassName: airflow-postgres
   resources:
     requests:
-      storage: 7500Mi
+      storage: 16Gi
 ---
@@ -91,5 +91,5 @@ spec:
     - ReadWriteMany
   resources:
     requests:
-      storage: 3500Mi
+      storage: 10Gi
   storageClassName: airflow-logs
\ No newline at end of file
```
**`deploy-k8/airflow-webserver-deployment.yaml`** (+4, -0). Adds resource requests to the webserver container:

```diff
@@ -22,6 +22,10 @@ spec:
       containers:
         - name: airflow-webserver
           image: apache/airflow:2.5.3
+          resources:
+            requests:
+              cpu: "500m"
+              memory: "500Mi"
           args: ["webserver"]
           envFrom:
             - configMapRef:
```
**`deploy-k8/postgres-deployment.yaml`** (+2, -2). Raises the postgres memory limit:

```diff
@@ -21,8 +21,8 @@ spec:
           image: postgres:12
           resources:
             limits:
-              memory: 128Mi
-              cpu: 500m
+              memory: "2Gi"
+              cpu: "500m"
           ports:
             - containerPort: 5432
           env:
```
**`deploy-k8/script-apply.sh`** (+1, -1). Applies the GCS variant of the sync-dags deployment:

```diff
@@ -7,4 +7,4 @@ kubectl apply -f airflow-secrets.yaml
 kubectl apply -f airflow-webserver-deployment.yaml
 kubectl apply -f airflow-webserver-service.yaml
 kubectl apply -f airflow-scheduler-deployment.yaml
-kubectl apply -f sync-dags-deployment.yaml
+kubectl apply -f sync-dags-deployment-gcs.yaml
```
**`deploy-k8/script-delete.sh`** (+1, -1). Same rename on the delete side:

```diff
@@ -5,6 +5,6 @@ kubectl delete -f airflow-secrets.yaml
 kubectl delete -f airflow-webserver-service.yaml
 kubectl delete -f airflow-webserver-deployment.yaml
 kubectl delete -f airflow-scheduler-deployment.yaml
-kubectl delete -f sync-dags-deployment.yaml
+kubectl delete -f sync-dags-deployment-gcs.yaml
 kubectl delete -f airflow-volumes.yaml
 kubectl delete -f airflow-envvars-configmap.yaml
\ No newline at end of file
```
**`deploy-k8/sync-dags-deployment-gcs.yaml`** (+8, -12). Switches the sync loop from `gcloud` to `gsutil`, attaches the pod to the `bcom-airflow` service account with a GKE metadata-server node selector, adds resource limits, and drops the AWS credential env vars:

```diff
@@ -14,9 +14,12 @@ spec:
       app: airflow-sync-dags
     spec:
+      serviceAccountName: bcom-airflow
+      nodeSelector:
+        iam.gke.io/gke-metadata-server-enabled: "true"
       containers:
         - args:
-            - while true; gcloud rsync -d -r ${GCS_DAGS_DIR:-gs://prueba-rsync/carpeta} /dags;
+            - while true; gsutil rsync -d -r ${GCS_DAGS_DIR:-gs://prueba-rsync2/carpeta} /dags;
               do sleep ${SYNCHRONYZE_DAG_DIR:-30}; done;
           command:
             - /bin/bash
@@ -24,20 +27,13 @@ spec:
            - --
           name: sync-dags-gcloud
           image: gcr.io/google.com/cloudsdktool/google-cloud-cli:alpine
+          resources:
+            limits:
+              cpu: "250m"
+              memory: "1Gi"
           envFrom:
             - configMapRef:
                 name: airflow-envvars-configmap
-          env:
-            - name: AWS_ACCESS_KEY_ID
-              valueFrom:
-                secretKeyRef:
-                  key: AWS_ACCESS_KEY
-                  name: credentials
-            - name: AWS_SECRET_ACCESS_KEY
-              valueFrom:
-                secretKeyRef:
-                  key: AWS_SECRET_KEY
-                  name: credentials
           volumeMounts:
             - name: dags-host-volume
               mountPath: /dags
```