USPTO PROGRAM
Design and implement Big Data analytic solutions on a Hadoop-based platform.
Design/develop/implement Big Data/Hadoop platforms related to data ingestion, storage, transformation, and analytics.
Refine data processing pipelines focused on unstructured and semi-structured data.
Load data from disparate data sets into Hive/HBase and RDBMS tables.
Import and export data using Sqoop from HDFS to RDBMS.
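The Sqoop import/export duty above can be sketched as follows; this toy Python helper assembles the argv list for a `sqoop import` invocation (the JDBC URL, table name, and HDFS path shown are hypothetical placeholders, not part of the posting):

```python
# Sketch: build a `sqoop import` command that copies an RDBMS table into HDFS.
# All connection details and paths are hypothetical placeholders.
def build_sqoop_import(jdbc_url, table, target_dir, num_mappers=4):
    """Return the argv list for a `sqoop import` invocation."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source RDBMS
        "--table", table,                   # source table to import
        "--target-dir", target_dir,         # HDFS directory for the output
        "--num-mappers", str(num_mappers),  # parallel map tasks
    ]

cmd = build_sqoop_import("jdbc:mysql://db.example.com/patents",
                         "filings", "/data/raw/filings")
print(" ".join(cmd))
```

Exporting back to the RDBMS is symmetric (`sqoop export` with `--export-dir`); the same list-building pattern applies.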
Preprocess data using Hive and Pig.
Develop shell/Scala/Python scripts to transform the data in HDFS.
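The transformation duty above can be sketched in Python: a toy function that flattens semi-structured JSON-lines records (as read from HDFS) into delimited rows of the kind typically loaded into Hive tables. The field names (`id`, `tags`) are invented for illustration:

```python
import json

# Sketch: flatten semi-structured JSON-lines records into pipe-delimited
# rows suitable for a Hive external table. Field names are hypothetical.
def flatten_records(lines):
    rows = []
    for line in lines:
        rec = json.loads(line)
        # Collapse the nested list field into a single column value.
        tags = ",".join(rec.get("tags", []))
        rows.append(f'{rec["id"]}|{tags}')
    return rows

print(flatten_records(['{"id": 1, "tags": ["a", "b"]}']))  # → ['1|a,b']
```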
Perform analysis of vast data stores and uncover insights.
Create scalable and high-performance web services for data tracking.
Create custom analytic and data mining algorithms for data extraction.
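A custom extraction algorithm like those described above might, at its simplest, rank terms by frequency; a minimal sketch, with deliberately naive tokenization rules that are assumptions, not anything specified by the program:

```python
import re
from collections import Counter

# Sketch: a minimal term-frequency extractor, a toy stand-in for the
# custom data mining algorithms mentioned above.
def top_terms(text, k=3):
    # Naive tokenization: lowercase runs of letters only.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [term for term, _ in Counter(tokens).most_common(k)]

print(top_terms("Patent data, patent pipelines, and data ingestion", k=2))
# → ['patent', 'data']
```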
Assist in the resolution of infrastructure issues.
Execute and troubleshoot Spark and Hive jobs including performance tuning.
Skills:
4+ years’ experience with Hive, HBase, and MapReduce (MRv1/MRv2)
Experience with Hadoop, HDFS, Hive, Apache Spark, Storm, and Kafka
Experience with RDBMS, SQL, MongoDB, and hierarchical data management
Education Requirement: A minimum of a Bachelor’s degree in computer science, computer information systems, information technology, or a closely related field, or a combination of education and experience equating to the U.S. equivalent of a Bachelor’s degree in one of the aforementioned subjects.