Lead Data Engineer - Tarrytown, NY 10591 (Hybrid role: 3 days onsite, 2 days WFH)
Role: Lead Data Engineer
Location: Tarrytown, NY 10591 (Hybrid role: 3 days onsite, 2 days WFH)
Duration: 6+ months
Must have: AWS, Apache Airflow, PySpark, Redshift
Job Description
Candidate should have 12+ years of experience in data engineering.
Designing, creating, testing and maintaining complete data management & processing systems.
Working closely with the stakeholders & solution architect.
Ensuring architecture meets the business requirements.
Building highly scalable, robust & fault-tolerant systems.
Owning the complete ETL process.
Knowledge of the Hadoop ecosystem and the frameworks within it: HDFS, YARN, MapReduce, Apache Pig, Hive, Flume, Sqoop, ZooKeeper, Oozie, Impala and Kafka.
Must have knowledge of and working experience with a real-time processing framework (Apache Spark), PySpark and AWS Redshift.
Must have experience with SQL-based technologies (e.g. MySQL, Oracle DB) and NoSQL technologies (e.g. Cassandra, MongoDB).
Should have Python/Scala/Java programming skills.
Discovering data acquisition opportunities.
Finding ways to extract value from existing data.
Improving data quality, reliability & efficiency of the individual components & the complete system.
Creating a complete solution by integrating a variety of programming languages & tools together.
Creating data models that reduce system complexity, thereby increasing efficiency & reducing cost.
Introducing new data management tools & technologies into the existing system to make it more efficient.
Setting & achieving individual as well as team goals.
Problem-solving mindset for working in an agile environment.
Greetings, Please review this requirement and send an updated resume to my attention at your earliest convenience so that we may support your candidacy for the role. If you are unavailable, I would sincerely appreciate you referring this requirement to anyone you know who might be interested.