Sr. Data Engineer
- Viewed as a data expert; drives innovation and plays a key role in the department. Participates in highly visible initiatives that have broad impact.
- Identify, design, and implement internal process improvements: automate manual processes, optimize data delivery, re-design infrastructure for greater scalability.
- Design, develop, code, test, and document architectures and applications.
- Work closely with team members and cross-functional teams to ensure designs, architecture, and deliverables support business requirements and align with best practices.
- Troubleshoot and resolve a wide range of data issues.
- Make innovative recommendations to improve data reliability, efficiency, and quality.
- Required to perform duties outside of normal work hours based on business needs.
Role
As a senior data engineer, develop ETL pipelines that transform nested data stored in JSON and Parquet files, using AWS Glue and PySpark.
Develop and maintain scalable data pipelines and build out new API integrations for data transfer.
Develop Terraform scripts to deploy the infrastructure required for ETL pipelines on AWS.
Perform the data analysis needed to troubleshoot and help resolve data-related issues.
Please complete the attached DataEngSkillSurvey for each worker and submit it with the resume.
Position is 100% remote within Central Time Zone.
Must-have
BS or MS degree in Computer Science or a related technical field
5+ years of extensive ETL development experience using PySpark/Glue on AWS
5+ years of experience with CSV, JSON, and Parquet file formats, especially with nested data types
5+ years of experience with S3, Athena, RDS, Glue Data Catalog, and CloudFormation
Strong understanding of ETL, data pipeline, and big data architecture
Strong Database/SQL experience in any RDBMS
Nice-to-have
Experience in schema design and data ingestion on Snowflake (or an equivalent MPP platform)
Experience in orchestrating data processing jobs using Step Functions, Glue workflows, or Apache Airflow (MWAA)
Experience in data analysis using Excel formulas, VLOOKUP, pivot tables, and slicers