Job Overview
- Salary: Market related
For our prestigious client, we are looking for a senior Data Engineer with strong Python programming experience.
Responsibilities:
– Develop and operate data pipelines processing large, complex datasets as input for analytics and machine learning
– Help to define the analytical scope and data for projects, including investigating data sources, designing new features and data integration flows.
– Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
– Create data tools for analytics and data science team members that help them build and optimize their results.
– Use a diverse array of technologies and data science toolsets as needed, primarily Python, Spark and Pandas, but also Jupyter, Denodo, Azure ML, Azure DevOps, Docker, Databricks, Git, SQL, …
– Communicate ideas, approaches and results with peers and stakeholders
Requirements:
– Mastery of Python, Spark and Pandas to create ETL pipelines for data scientists to use; knowledge of one or more data pipeline frameworks is a plus
– At least 3 years of intensive hands-on experience as a full-stack Python data engineer: Python, Spark, Pandas, NumPy, SciPy, visualization (matplotlib), machine learning (scikit-learn), data pipeline orchestration (e.g. kedro)
– Good knowledge of and experience with version control systems (Git)
– Good knowledge of and experience with databases
– Advanced degree in a relevant discipline such as: Statistics, Applied Mathematics, Operations Research/Optimization, Computer Science, Computational/Theoretical Physics, Data Science/Visualization, Machine Learning, Electrical/Computer Engineering or Health Sciences (e.g. Bioengineering/Bioinformatics)
– Experience in extracting, cleaning, preparing and modeling data. Experience with command-line scripting, data structures, and algorithms.
– Ability to work across structured, semi-structured, and unstructured data
– Strong presentation and communication skills with both peer data scientists and non-technical stakeholders.
– Ability to work individually and in teams (agile).
– Experience with sales & marketing analytics is a plus.
Hours per week: Full time
Duration of the assignment: 6 months+
Start date: January 2023 (or earlier)
100% work from home
If you are interested, please send your CV now for immediate consideration.