
Data Pipeline QA Factory

Data pipeline validation with schema checks, quality testing, and anomaly detection

Data · 7 stages · 3 specialists · v1.0.0

About

A seven-stage quality-assurance workflow for validating data pipelines. Schema validation and sample profiling feed into parallel anomaly detection and business-rule checks. Findings are consolidated into a QA report, reviewed by a reflection gate, and finally submitted for human approval before the pipeline is promoted.
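The stage graph described above can be sketched as a small dependency map. This is illustrative only: the snake_case stage identifiers and the wave-computation helper are assumptions, not the factory's real configuration format.

```python
# Illustrative only: the recipe's stage DAG as an adjacency map
# (stage -> list of upstream dependencies). Identifiers are assumed
# snake_case forms of the stage names shown on this page.
STAGES = {
    "schema_validation": [],
    "sample_profiling": ["schema_validation"],
    "anomaly_detection": ["sample_profiling"],
    "quality_rules": ["schema_validation"],
    "qa_report": ["anomaly_detection", "quality_rules"],
    "quality_gate": ["qa_report"],
    "approval": ["quality_gate"],
}

def waves(dag):
    """Group stages into execution waves: every stage in a wave can run in parallel."""
    remaining = dict(dag)
    order = []
    while remaining:
        ready = sorted(n for n, deps in remaining.items()
                       if not any(d in remaining for d in deps))
        order.append(ready)
        for n in ready:
            del remaining[n]
    return order

print(waves(STAGES))
```

The second wave contains sample profiling's downstream sibling, quality rules, alongside sample profiling itself, which matches the "runs in parallel" markers in the stage list below.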

Input / Output

Input

Data pipeline configuration or dataset to validate

data_pipeline

Output

QA report with validation results, anomaly findings, and recommendations

min quality: 0.85

Pipeline Stages

schema validation

Execute

Validate data schemas, types, and structural integrity

👤 analyst 🔧 file_read, shell
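The recipe doesn't show the analyst's actual checks; a minimal pandas sketch of this kind of validation, with a hypothetical expected-schema mapping, might look like:

```python
# Hedged sketch of the schema-validation stage: verify a DataFrame
# against an expected column -> dtype mapping. The mapping below is a
# hypothetical example, not part of the recipe.
import pandas as pd

def validate_schema(df: pd.DataFrame, expected: dict) -> list:
    """Return human-readable schema violations; an empty list means pass."""
    errors = []
    for col, dtype in expected.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
        elif str(df[col].dtype) != dtype:
            errors.append(f"{col}: expected {dtype}, got {df[col].dtype}")
    for col in df.columns:
        if col not in expected:
            errors.append(f"unexpected column: {col}")
    return errors

df = pd.DataFrame({"id": [1, 2], "amount": [9.5, 3.2]})
print(validate_schema(df, {"id": "int64", "amount": "float64"}))
```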

sample profiling

Execute

Profile data samples for distributions, nulls, and cardinality

👤 analyst 🔧 file_read, shell ← schema validation
⇅ runs in parallel
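The profiling stage's outputs (distributions, nulls, cardinality) could be summarized per column along these lines; the sample data is hypothetical:

```python
# Hedged sketch of the sample-profiling stage: per-column null fraction,
# cardinality, and dtype for a pandas DataFrame sample.
import pandas as pd

def profile(df: pd.DataFrame) -> dict:
    return {
        col: {
            "dtype": str(df[col].dtype),
            "null_frac": float(df[col].isna().mean()),
            "cardinality": int(df[col].nunique()),
        }
        for col in df.columns
    }

sample = pd.DataFrame({"country": ["US", "US", None, "DE"]})
print(profile(sample)["country"])
```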

anomaly detection

Execute

Detect statistical anomalies, outliers, and unexpected patterns

👤 analyst 🔧 file_read, shell ← sample profiling
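The recipe doesn't fix a detection method; one common statistical technique is to flag values whose z-score exceeds a threshold, sketched here with the standard library only:

```python
# Hedged sketch: z-score outlier detection using the population standard
# deviation. Threshold 3.0 is a conventional default, not a recipe value.
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return indices of values more than `threshold` std devs from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

print(zscore_outliers([10] * 20 + [1000]))
```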

quality rules

Execute

Apply business-specific quality rules and constraints

👤 engineer 🔧 file_read, shell ← schema validation
⇅ runs in parallel
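Business rules could be expressed as named predicates over the frame; the rule names and the `id`/`amount` columns below are hypothetical examples, since the recipe leaves rule content to the engineer specialist:

```python
# Hedged sketch: business rules as named predicates over a pandas frame.
import pandas as pd

RULES = [
    ("amount_non_negative", lambda df: (df["amount"] >= 0).all()),
    ("id_unique", lambda df: df["id"].is_unique),
]

def apply_rules(df: pd.DataFrame, rules=RULES) -> dict:
    """Evaluate each rule; True means the constraint holds."""
    return {name: bool(check(df)) for name, check in rules}

print(apply_rules(pd.DataFrame({"id": [1, 2], "amount": [5.0, 3.0]})))
```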

qa report

Execute

Generate comprehensive QA report with findings and severity ratings

👤 writer 🔧 file_write ← anomaly detection, quality rules
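Consolidating findings with severity ratings, as this stage describes, could be as simple as sorting tagged findings by severity before rendering; the severity levels and finding tuples here are assumed, not prescribed by the recipe:

```python
# Hedged sketch of the qa-report stage: merge (stage, severity, message)
# findings into one report, most severe first.
def build_report(findings):
    """findings: iterable of (stage, severity, message) tuples."""
    order = {"critical": 0, "warning": 1, "info": 2}
    lines = ["# QA Report"]
    for stage, sev, msg in sorted(findings, key=lambda f: order[f[1]]):
        lines.append(f"- [{sev.upper()}] {stage}: {msg}")
    return "\n".join(lines)

findings = [
    ("sample_profiling", "info", "4 columns profiled"),
    ("anomaly_detection", "critical", "outlier in amount"),
]
print(build_report(findings))
```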

quality gate

Reflect

Review QA report completeness and accuracy

← qa report quality ≥ 0.9 max depth: 2
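The gate's parameters (quality ≥ 0.9, max depth 2) suggest a bounded reflection loop. A sketch under the assumption that scoring and revision are pluggable callables (the recipe doesn't specify their implementation):

```python
# Hedged sketch of the quality-gate stage: score the report, revise it
# if below threshold, and give up after `max_depth` reflection passes.
def quality_gate(report, score, revise, threshold=0.9, max_depth=2):
    """Return (final_report, passed)."""
    for _ in range(max_depth):
        if score(report) >= threshold:
            return report, True
        report = revise(report)
    return report, score(report) >= threshold
```

For example, a report that clears the bar after one revision passes on the second check, while a report that never improves is returned with `passed=False`.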

approval

Approval

Human approval before pipeline promotion

← quality gate timeout: 120m
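A blocking human-approval step with the recipe's 120-minute timeout could be polled like this; `check_decision` is a hypothetical callable standing in for however the factory surfaces the human's answer:

```python
# Hedged sketch of the approval stage: poll for a human decision,
# failing closed after the configured timeout (120 minutes here).
import time

def await_approval(check_decision, timeout_s=120 * 60, poll_s=30):
    """check_decision() returns "approved", "rejected", or None (no answer yet)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        decision = check_decision()
        if decision is not None:
            return decision
        time.sleep(poll_s)
    return "timed_out"
```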

Tags

data · qa · pipeline · testing