Qualifications & experience
- At least 2 years of working experience with exposure to Big Data, real-time data pipelines, ETL processes, and supporting tools/frameworks such as Airflow, Hadoop, Superset, Spark, and the wider Apache ecosystem
- Experienced with cloud service environments, especially Docker
- Experienced in working with both relational (RDBMS) and non-relational datastores
- In conjunction with the requirement above, excellent skills in writing efficient queries, especially in SQL
- Skilled in Python programming
- Good teamwork & communication skills
Tasks & responsibilities
- Construct and maintain efficient data pipelines and warehouses for data collection
- Develop the necessary scripts and mini-apps to run data analyses
- Conduct tests on large-scale data platforms
- Provide recommendations to improve data quality
- Maintain and support the data architecture used by data scientists and analysts
- Develop data processes for data modeling, mining, and production