4+ years of strong experience with the Hadoop framework and its ecosystem:
HDFS architecture, MapReduce programming, Hive, Pig, Sqoop, HBase,
ZooKeeper, Oozie, Spark, Scala, Flume, etc.
Good knowledge of Hadoop architecture and ecosystem components such as HDFS,
JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce
programming paradigm.
Hands-on experience installing and configuring Cloudera's Apache Hadoop
ecosystem components such as Flume NG, HBase, ZooKeeper, Oozie, Hive,
Spark, Sqoop, Hue, and Pig.
Hands-on experience analyzing data using HiveQL, Pig Latin, HBase, and custom
MapReduce programs.
4+ years of experience developing data ingestion from multiple data
platforms in multiple modes: batch and real-time (Spark SQL/Streaming). A
good working knowledge of data processing frameworks such as Apache Spark is required.
Any experience with distributed event processing systems such as Kafka is a plus.
4+ years of experience in data modeling and data transformation, with detailed
designs and integration processes that move data from the raw to the
curated/published zones of a data warehouse.
The following cloud/admin skills would be a plus:
Experience with cloud technologies such as Google Cloud Platform (preferred),
AWS, or Azure, with a focus on big data management stacks: Google Cloud
Platform (BigQuery, Pub/Sub, Dataflow); AWS (Redshift, Kinesis, Glue); or
Azure (Synapse, Event Hubs, Stream Analytics).
Provide Hadoop administration support, including infrastructure
setup, software installation (MapR, Hortonworks, Cloudera, etc.),
configuration, upgrading/patching, monitoring, troubleshooting, and
maintenance, and work with the development team to
install components (Hive, Pig, etc.) and manage MapReduce jobs.
Work closely with the vendor, Cloudera, to ensure the environment
runs properly.
This support includes filesystem management and monitoring, cluster
monitoring and management, and automating/scripting backups and
restores.
Perform security and compliance assessments.