Data Engineer with Spark experience (please share profiles with 10 years of experience)
Client location: Livonia, MI
Duration: 6+ months
Responsibilities:
Work with the data engineering team to define and develop data ingestion, validation, transformation and data engineering code.
Develop open source platform components using Spark, Scala, Java, Oozie, Hive and other components
Document code artifacts and participate in developing user documentation and run books
Troubleshoot deployment to various environments and provide test support.
Participate in design sessions, demos and prototype sessions, testing and training workshops with business users and other IT associates
Qualifications:
Data Engineers are required to design and build data products and data pipelines. They will ensure the robust flow of data from acquisition through curation and governance. Data Engineers will enable data as a service and drive all critical data-driven initiatives.
Is an expert in the data and its application by users
Understands data landscape and environments: sources, elements, update frequency, completeness, stewards/contacts, platforms
Manages ETL: uses programming and tools for data ingestion, configures pipelines, applies transformations and decoding, integrates and fuses data, moves and securely delivers data
Translates business requirements to build repeatable, sustainable, efficient, coded processes that can be productionized by Software Engineers and readily modified by other Data Engineers
Creates POC processes in Dev/QA and works with Software Engineers to productionize and align best coding and process practices that maximize efficiency, speed, stability, system resources and capabilities
Leverages frameworks in place with big data tools: Hadoop, Spark, Python, Kafka, etc.
Experience with relational SQL and NoSQL databases
Is aware of and complies with data privacy, security, legal and contractual guidelines
Incorporates data quality and privacy checks/alerts to minimize bad data being consumed by end users, models and dashboards, and to protect customer data; revises checks for new data issues
Maintains feedback loop with Data Stewards on data issues, standards and fitness for use (Data Stewardship is a subset of data engineering that includes responsibilities such as data curation)
Validates that data products and pipelines are functioning as expected following system or application upgrades, source changes, etc.
Responsible for data architecture including sources, table structures, physical models
Works closely with Architects to align systems, tools and applications being utilized with business use case and performance requirements
Communicates with end users to set expectations and ensure alignment around data accuracy, completeness, timeliness and consistency
Provides data product support and maintenance
Establishes, tracks and monitors KPIs related to specific data products and deliverables
Preferred Skills and Education:
Bachelor's degree in Computer Science or a related field
Certification in Spark, Azure or another cloud platform