Feathr – A scalable, unified data and AI engineering platform for enterprise
Updated Apr 4, 2024 (Scala)
Automated data quality suggestions and analysis with Deequ on AWS Glue
Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool.
The Lightning Catalog is an open-source data catalog designed for preparing data at any scale in ad-hoc analytics, data virtualization, data warehousing, lake houses, and ML projects.
Data quality control tool built on Spark and Deequ
Example API implementation for Data Caterer
A library for Spark that helps to standardize any input data (DataFrame) to adhere to the provided schema.
Data generation and validation tool for any data source
A Quality Spark DQ and transformation Library
PoC Spark wrapper for validating data
An extensible and configurable ETL tool built on top of Apache Spark