Tags: python sql etl visualisation
Mobile payment company, pipeline refinement and optimisation for cost reduction and improved efficiency
In the product data science team, repeated restructuring and staff movement had left the data pipelines with little clear ownership. The result was resource-heavy, brittle pipelines that failed regularly, making analysis difficult.
Pipelines were optimised and refined, and dependencies on upstream tables were managed:
One was migrated to Prefect, implementing best practices such as incremental loading.
A new pipeline was created to serve a refined dashboard.
Others remained in Airflow, but their queries were optimised.
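As an illustration of the incremental loading mentioned above, here is a minimal sketch in plain Python with SQLite. It is not the Prefect flow that was actually built: the table names (source_payments, target_payments), the updated_at column and the helper make_demo_db are all hypothetical, and timestamps are assumed to be ISO 8601 strings so that they sort correctly as text.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with hypothetical source and target tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_payments (id INTEGER, amount REAL, updated_at TEXT)")
    conn.execute("CREATE TABLE target_payments (id INTEGER, amount REAL, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO source_payments VALUES (?, ?, ?)",
        [(1, 10.0, "2024-01-01T00:00:00"), (2, 25.0, "2024-01-02T00:00:00")],
    )
    conn.commit()
    return conn


def incremental_load(conn):
    """Copy only the source rows newer than the target's high-water mark."""
    # High-water mark: the latest timestamp already loaded ('' if the target is empty).
    (watermark,) = conn.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM target_payments"
    ).fetchone()
    before = conn.total_changes
    conn.execute(
        "INSERT INTO target_payments (id, amount, updated_at) "
        "SELECT id, amount, updated_at FROM source_payments WHERE updated_at > ?",
        (watermark,),
    )
    conn.commit()
    # Number of rows inserted by this run.
    return conn.total_changes - before
```

Each run reads only the rows past the watermark instead of rebuilding the whole table, which is where the cost saving comes from. This simple version assumes rows are appended rather than updated in place; handling updates would need an upsert keyed on id.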
Some upstream dependencies arrived late but were outside the team's control, so decisions had to be made about which inputs were required and which fields could be allowed to be out of date.
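One way to encode such a decision is a per-table staleness tolerance that is checked before a run. The sketch below is only an assumed shape for that check, not the rule set that was actually agreed: the function name blocking_tables, the table names and the tolerances in the usage example are all hypothetical.

```python
from datetime import datetime, timedelta


def blocking_tables(last_updated, max_age, now):
    """Return the upstream tables that are too stale for the run to proceed.

    last_updated: table name -> datetime of its latest successful refresh
    max_age:      table name -> timedelta of staleness that can be tolerated
    """
    return sorted(
        table
        for table, limit in max_age.items()
        if now - last_updated[table] > limit
    )


# Hypothetical policy: transactions must be fresh, merchant profiles may lag by days.
policy = {
    "transactions": timedelta(hours=12),
    "merchant_profile": timedelta(days=3),
}
```

A run would proceed when the returned list is empty and would otherwise wait or alert on the named tables, so a late table with a generous tolerance no longer fails the whole pipeline.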