PySpark Developer

21 Jun 2024
Plano, TX, USA

Vacancy expired!

Position: PySpark Developer
Location: Plano, TX
Please submit the candidate's LinkedIn or GitHub profile for this job. (Must have)
Job Description: The Data Engineer will be responsible for building big data pipelines using open-source tools and enterprise frameworks in response to new business requests. This individual will work closely with data scientists and SMEs to define and map data requirements that can be translated into executable data processing pipelines.
The role includes designing and implementing specific data models that ultimately help drive better business decisions through insights from a combination of external and internal data assets. This role is also accountable for developing the necessary enablers and data platform in the Big Data Computing Environment and maintaining its integrity across the data life cycle phases.
Daily Responsibilities:

  • Gather requirements for data integration and business intelligence applications.
  • Determine and document data mapping rules for movement of medium to high complexity data between applications.
  • Analyze existing PySpark/Scala/Snow SQL code, or build new code where necessary, to evolve existing prototypes into modern, scalable data processing pipelines using Snowflake and Databricks.
  • Work directly with the client user community as a data analyst to define and document data.
  • Create reusable software components (e.g. specialized Spark UDFs) and analytics applications.
  • Support architecture evaluation of the enterprise data platform through implementation and launch of data preparation and data science capabilities.
  • Perform data quality validation. Employ data mining techniques to achieve data synchronization, redundancy elimination, source identification, data reconciliation, and problem root cause analysis.
  • Build high-performance algorithms, prototypes, predictive models, and proofs of concept.
  • Support data selection, extraction, and cleansing for enterprise applications, including data warehouse and data marts.
  • Investigate and resolve data issues across platforms and applications, including discrepancies of definition, format, and function.
Required Qualifications and Skills: (highlighted skills are must-haves)
  • 8+ years of data warehousing and big data technology experience.
  • 5+ years of strong PySpark scripting experience.
  • 3+ years of experience with Databricks, preferably Azure Databricks.
  • Strong knowledge of Data Quality Management
  • Strong understanding and use of databases: relational (especially SQL) as well as NoSQL datastores
  • Intermediate knowledge of Snowflake required
  • Prior experience with data exploration, prototyping, and visualization tools, e.g., Zeppelin, Jupyter, Power BI, Tableau
  • Prior experience with deploying complex data science solutions is a strong plus
Desired Qualifications:
  • Experience working in the telecommunications industry
Education: Bachelor's or Master's degree in computer science or equivalent

Job Details

  • ID
    JC15692699
  • State
  • City
  • Job type
    Contract
  • Salary
    $DOE
  • Hiring Company
    Intuites
  • Date
    2021-06-21
  • Deadline
    2021-08-20
  • Category
