Design, develop, and execute test cases to validate data ingestion from source systems
(SQL, APIs) into the Databricks UAP platform.
Perform schema validation and data completeness, transformation, and row-level
checks between source and target.
Utilize SQL extensively for data profiling and validation.
Leverage PySpark and Pandas for large-scale data comparison and automation.
Maintain and enhance the Data Test Automation Framework using Python/PySpark for
efficient and scalable data testing.
Participate in daily stand-ups, sprint planning, and retrospectives following Agile
practices.
Manage and track test progress, issues, and risks in JIRA.
Ensure adherence to QA documentation practices: Test Plan, Test Scenarios & Test Cases,
Test Summary Reports, and Defect Reports.
Own and drive the Defect Life Cycle, working closely with developers and product teams
to ensure timely resolution.
Collaborate with business analysts and developers to understand data requirements and
ensure high test coverage.
Required Skills:
4+ years of experience in QA/testing with a strong focus on data validation.
Strong proficiency in SQL (joins, subqueries, aggregations, data profiling).
Experience testing data pipelines that ingest data from SQL sources and APIs into Databricks.
Hands-on experience with Python and Pandas for data manipulation.
Working knowledge of PySpark and familiarity with Spark DataFrames, transformations,
and data handling.
Knowledge of Agile testing processes and QA best practices.
Familiarity with QA documentation and reporting standards.
Experience with JIRA for test and defect management.
Good understanding of the Defect Life Cycle and its role in maintaining software quality.