Duration: 9-12 months, likely to extend (conversion opportunity)
Top Skills List:
Hadoop
Spark/Scala (Scala is a must; 2-4 years of experience and a coding assessment)
Java
REST APIs
AWS (preferred)
Overview:
This role is part of a data engineering team that consumes data from multiple sources, certifies it, and makes it available to business stakeholders. The team is moving from Hadoop to AWS (everything is deployed on AWS). We are looking for a candidate who can work independently on the data team, with extensive experience in Hadoop, Spark, and Java. API experience would be handy as well, along with exposure to the cloud, AWS more specifically.
Job Description:
Designs, modifies, develops, writes, and implements software applications.
Will be responsible for writing high-performance, reliable and maintainable code.
Will write MapReduce or Spark/Scala jobs (a minimal example appears after this list)
Will write scalable and maintainable Python for serverless deployments on AWS services
Hadoop development, implementation and support
Loading from disparate data sets
Pre-processing using Hive
Translate complex functional and technical requirements into detailed design
Perform analysis of vast data stores and uncover insights
Maintain security and data privacy
Well versed in HDP (Hortonworks Data Platform) or AWS architecture.
High-speed querying
Propose best practices or standards
Assists the architecture team with solution design and implementation.
Provides assistance when technical problems arise.
Monitors systems to make sure they meet business requirements.
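As a point of reference for the Spark/Scala responsibilities and coding assessment mentioned above, below is a minimal sketch of such a batch job. The bucket paths and column names (eventType, userId) are illustrative assumptions, not details from this posting.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    // Minimal sketch of a Spark batch job in Scala: load raw events,
    // drop invalid rows, aggregate, and write curated Parquet output.
    // Paths and column names are hypothetical placeholders.
    object EventCountJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("EventCountJob")
          .getOrCreate()

        val events = spark.read.json("s3://example-bucket/raw/events/")

        val counts = events
          .filter(col("eventType").isNotNull)
          .groupBy(col("eventType"))
          .agg(countDistinct(col("userId")).as("uniqueUsers"))

        counts.write
          .mode("overwrite")
          .parquet("s3://example-bucket/curated/event_counts/")

        spark.stop()
      }
    }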
Work Experience Requirements:
5 to 7 years of experience in Java or Scala
3+ years of hands-on experience in Hadoop programming
Hands-on experience with Java, Scala, and Spark
Hands-on experience with Python
Hands-on experience with Kafka, NiFi, AWS, Maven, Stash, and Bamboo
Hands-on experience writing MapReduce jobs
Hands-on experience with REST APIs
Strong exposure to object-oriented concepts and implementation
Good knowledge of Spark architecture
Good understanding of Hadoop, YARN, and AWS EMR
Familiarity with cloud databases such as AWS Redshift and Aurora MySQL
Knowledge of workflow schedulers such as Oozie or Apache Airflow
Analytical and problem-solving skills, applied to the big data domain
Proven understanding of Hadoop, HBase, and Hive
Good aptitude in multi-threading and concurrency concepts
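To give a concrete sense of the multi-threading and concurrency concepts listed above, here is a minimal Scala sketch using Futures. The fetchOrders/fetchUsers helpers and their return values are hypothetical stand-ins for real I/O calls.

    import scala.concurrent.{Await, Future}
    import scala.concurrent.duration._
    import scala.concurrent.ExecutionContext.Implicits.global

    // Minimal concurrency sketch: run two independent fetches in
    // parallel and combine the results. The helpers are hypothetical.
    object ConcurrencySketch {
      def fetchOrders(): Future[Int] = Future { /* e.g. call a REST API */ 42 }
      def fetchUsers(): Future[Int]  = Future { /* e.g. query a database */ 7 }

      def main(args: Array[String]): Unit = {
        // Start both Futures before combining so they run concurrently,
        // rather than sequencing them inside the for-comprehension.
        val ordersF = fetchOrders()
        val usersF  = fetchUsers()

        val combined = for {
          orders <- ordersF
          users  <- usersF
        } yield orders + users

        println(Await.result(combined, 10.seconds))
      }
    }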