Big Data Engineer

13 Apr 2024
Atlanta, Georgia 30301, USA

Vacancy expired!

Our client is looking for a Big Data Engineer to join their team supporting the Centers for Disease Control and Prevention (CDC).
Job Requirements:
• Bachelor's degree in Engineering, Information Systems, Computer Science, or Information Technology, or equivalent experience
• Design and implement scalable data pipelines and data storage on Azure using Data Factory, Service Bus, ADLS, Synapse, and a Spark-based architecture
• Automate the deployment and operation of data pipelines using Azure Data Factory and Databricks Spark
• Determine data storage structural requirements by analyzing and reviewing objectives, key business metrics, and reporting requirements with the customer
• Demonstrated expertise working with ETL on SQL, JSON, CSV/TSV, and Parquet data sources in cloud object storage such as S3 or Azure ADLSv2, using Big Data cloud technologies
• Provide high-level expertise in applicable public health disciplines to collect, abstract, query/code, analyze, and interpret scientific data contained within information systems, SQL Server databases, S3, Azure Blob (ADLS), and other data structures related to public health
• Help build data pipelines leveraging Data Factory and Databricks notebooks, with experience working with Delta Lake (Databricks) and demonstrated knowledge of data flows (see the sketch after this list)
• Knowledge of the Hive Metastore and the Hadoop ecosystem
• Working knowledge of relational databases: MS SQL Server, Azure Synapse SQL, PostgreSQL
• Knowledge of scalable, low-latency implementations for data products using Spark, Kafka, Elasticsearch, or similar
• Knowledge of orchestration and monitoring of pipelines, including failure, success, and recovery handling
• Implement comprehensive testing and continuous integration frameworks for schema, data, and functional processes/pipelines
• Provide recommendations on opportunities for leveraging new data sources, data reporting capabilities, and integration of systems
• Provide meaningful knowledge transfer of design decisions, component composition, and technical solutions to the program staff
• Consult with CDC scientists and epidemiologists on the algorithms needed to support research
• Advanced SQL knowledge and experience working with relational and non-relational databases, as well as designing Tables, Schemas, or Collections
• Manage data in cloud storage and data technologies such as Spark, Databricks, and Snowflake environments, using scripts and automation
• Strong dedication/commitment to automation, simplicity, and smooth-running systems
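
As a rough illustration of the pipeline work described above, here is a minimal PySpark sketch of a batch step on Databricks: read Parquet from ADLS Gen2, apply a simple transform, and write a partitioned Delta table. The storage account, containers, and column names are hypothetical placeholders, not details of the actual client systems, and the snippet assumes a Databricks runtime with Delta Lake available.

    # Minimal sketch: Parquet in ADLS Gen2 -> transform -> Delta table.
    # All paths and column names below are hypothetical examples.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("adls-to-delta").getOrCreate()

    # ADLS Gen2 uses the abfss:// scheme; account and containers are made up.
    source = "abfss://raw@exampleaccount.dfs.core.windows.net/events/"
    target = "abfss://curated@exampleaccount.dfs.core.windows.net/events_delta/"

    df = spark.read.parquet(source)

    # Example transform: normalize a timestamp column and drop exact duplicates.
    cleaned = (
        df.withColumn("event_ts", F.to_timestamp("event_ts"))
          .dropDuplicates()
    )

    # Write as Delta, partitioned by date, so downstream Synapse/Power BI
    # queries can prune partitions.
    (cleaned.withColumn("event_date", F.to_date("event_ts"))
            .write.format("delta")
            .mode("overwrite")
            .partitionBy("event_date")
            .save(target))

In practice a step like this would be parameterized and triggered by Azure Data Factory rather than run standalone.
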
Responsibilities for the Role:
• Collaborate with CDC and other public health entities to translate workflows into business requirements in support of data and system integration projects
• Work with data scientists and data visualization developers to productionize their analytical workloads, models, visualizations, and dashboards
• Directly responsible for production pipelines, their orchestration and change management
• Evaluate current data and help define development/ETL strategies for moving data from heterogeneous source systems to a data warehouse/data lake, surfacing reporting and analytics using MS Power BI, QuickSight, and other cloud visualization and analytical tools
• Communicate and/or address build, deployment, and operational issues as they arise
• Work on workflow optimization and execution improvements
• Automate the monitoring of data processes and outputs (a minimal example follows this list)
• Solve day-to-day customer and production challenges
• Interact with, and support, a variety of different teams (engineering, quality, management, etc.)
• Collaborate with Engineering and Platform teams to improve automation of workflows, code testing and deployment
• Collaborate with Engineering and Platform teams on the latest technologies for data management
• Monitor all data update processes and outputs to ensure predictable quality
• Communicate with customers to discuss any issues with received data and help them identify and fix data issues
• Iterate on best practices to increase the quality and velocity of deployments
• Design and implement secure automation solutions for production environments
• Strong experience working with Python
• Strong communication and organizational skills
• Experience working in a cross-functional team in a dynamic environment
• Ability to work independently and deliver to deadlines
• Ability to solve problems with minimal direction
• Strong deductive reasoning ability
• Great attention to detail and accuracy
• Ability to work in a dynamic team environment using Agile methodology
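
As a minimal sketch of the monitoring and automation duties above, the following data-quality gate could run as a scheduled Databricks task or a notebook invoked by Azure Data Factory. The table path, column name, and checks are assumptions for illustration; failing loudly lets the orchestrator's retry and alerting policy take over.

    # Minimal data-quality gate sketch for a scheduled pipeline run.
    # Path and column name are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dq-gate").getOrCreate()

    df = spark.read.format("delta").load(
        "abfss://curated@exampleaccount.dfs.core.windows.net/events_delta/"
    )

    row_count = df.count()
    null_ids = df.filter(F.col("event_id").isNull()).count()

    # Raise so the orchestrator marks the run as failed and alerts.
    if row_count == 0:
        raise ValueError("DQ gate failed: table is empty")
    if null_ids > 0:
        raise ValueError(f"DQ gate failed: {null_ids} rows missing event_id")

    print(f"DQ gate passed: {row_count} rows, no null event_id values")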

Top Skills Details:

1. 5+ years of experience working within Big Data, specifically Hadoop and Azure
2. Experience with automation, specifically the deployment and operation of data pipelines using Azure Data Factory and Databricks Spark
3. Demonstrated expertise working with ETL on SQL, JSON, CSV/TSV, and Parquet data sources in cloud object storage such as S3 or Azure ADLSv2, using Big Data cloud technologies (a minimal multi-format sketch follows)
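
A minimal sketch of skill 3, assuming Spark 3.1+ and hypothetical paths: read TSV and JSON-lines sources from object storage, align their columns, and write the unified result back as Parquet.

    # Sketch of ETL over mixed-format sources (TSV and JSON lines) in
    # cloud object storage. All paths are hypothetical examples, and the
    # two sources are assumed to share compatible column names and types.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("mixed-format-etl").getOrCreate()

    base = "abfss://raw@exampleaccount.dfs.core.windows.net"

    # TSV uses the CSV reader with a tab separator; a header row is assumed.
    tsv_df = (spark.read
              .option("sep", "\t")
              .option("header", "true")
              .option("inferSchema", "true")
              .csv(f"{base}/labs_tsv/"))

    # JSON lines: one record per line.
    json_df = spark.read.json(f"{base}/labs_json/")

    # Align columns by name before unioning (Spark 3.1+).
    combined = tsv_df.unionByName(json_df, allowMissingColumns=True)

    combined.write.mode("append").parquet(f"{base}/labs_parquet/")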

Additional Skills & Qualifications:

This person must have experience with Agile and working within a matrixed environment.

Experience Level:

Intermediate Level

About TEKsystems:

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information, or any characteristic protected by law.
