Job Role: Big Data Architect
Work Location: Bay Area, CA
Contract Duration: 6+ months
Minimum Years of Experience: 10+ years

Detailed Job Description:
Build big data platforms that can ingest hundreds of terabytes of data for business analytics, operational analytics, text analytics, and data services. Build back-end analytical applications. Work on performance optimizations. Debug complex production scenarios.

Required Skills and Experience:
At least 8 years of experience building and managing complex products/solutions.
Good problem-solving/analytical skills and an absolute team player.
5+ years of experience developing RESTful web services in any Java framework.
Minimum 5 years of experience with the Hadoop ecosystem (Spark/Scala/Python preferred) and building back-end software modules using Scala, Spark, and Java.
Minimum 8 years of experience working in Linux/Unix environments.
Expert-level programming in Java, Scala, and Python.
Experience developing ETL modules for AI/ML use cases, developing algorithms, and testing.
Minimum 5 years of experience with performance optimization on Spark, Hadoop, and any NoSQL database.
Minimum 5 years of experience testing and debugging data pipelines based on Hadoop and Spark.
Experience debugging production issues and performance scenarios.
Roles & Responsibilities:
Design and develop Java-, Scala-, and Spark-based back-end software modules; improve performance and test these modules.
Script ETL workflows using Python and shell scripts.
Design and develop back-end big data frameworks built on top of Spark, with features such as Spark as a service, workflow and pipeline management, and handling of batch and streaming jobs.
Build a comprehensive big data platform for data science and engineering that can run batch processes and machine learning algorithms reliably.
Design and develop data ingestion services that can ingest tens of terabytes of data.
Code big data applications on clickstream, location, and demographic data for behavior analysis using Spark, Scala, and Java.
Optimize resource requirements, including the number of executors, cores per executor, and memory for Spark streaming and batch jobs.
Expert-level knowledge and experience in Scala, Java, distributed computing, Apache Spark, PySpark, Python, HBase, Kafka, REST-based APIs, and machine learning.
Develop AI/ML modules and algorithms for Verizon ML use cases.
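For context, the Spark resource-tuning responsibility above typically means setting executor count, cores per executor, and executor memory at submission time. The sketch below shows a standard spark-submit invocation; the numeric values and the jar name are illustrative assumptions, not prescribed settings, and would be tuned per workload:

```shell
# Illustrative spark-submit resource settings for a batch job.
# All values below are hypothetical examples, not recommendations:
#   --num-executors   total executor processes requested
#   --executor-cores  CPU cores per executor
#   --executor-memory heap memory per executor
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 20 \
  --executor-cores 4 \
  --executor-memory 8g \
  --conf spark.sql.shuffle.partitions=200 \
  analytics-etl.jar
```

For streaming jobs, the same flags apply, but memory and core counts are usually sized against micro-batch throughput rather than total input volume.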