Lead Big Data
Full Time/W2 on Client Roles with Benefits
End Client: Citi Bank
Initially remote, then Irving, Texas
Education: Minimum Bachelor's degree in Computer Science, Engineering, Business Information
Systems, or a related field. A Master's degree in computing related to PySpark and distributed computing is a
major plus.
Key Responsibilities:
Develop Big Data applications using PySpark on Hadoop, Hive and/or Kafka, HBase, MongoDB
Build Machine Learning models
Deployment on Cloud platforms
Experience & Skillset
MUST-HAVE
Minimum 4 years of total IT / development experience in Big Data
Experience developing Big Data applications in PySpark on Hadoop, Hive and/or Kafka, HBase,
MongoDB
Deep knowledge of PySpark and its libraries to solve and debug complex data engineering
challenges
Experience in developing sustainable, data-driven solutions with current-generation data
technologies to drive our business and technology strategies
Exposure to deploying on Cloud platforms
Experience designing and developing data pipelines for data ingestion or
transformation using PySpark
Development experience in the following Big Data areas: file formats (Parquet, Avro, ORC),
resource management, distributed processing and RDBMS
Experience developing applications in an Agile environment with monitoring, build tools, version control,
unit testing, TDD, CI/CD and change management to support DevOps
Development experience with SQL and shell scripting
GOOD-TO-HAVE
Banking domain knowledge
Hands-on experience migrating SAS toolset / statistical modelling to Machine Learning models
Digital Marketing Machine Learning models and use cases
ETL / Data Warehousing and Data Modelling experience prior to Big Data experience
Deep knowledge of the AWS stack for big data and machine learning