Hey everyone! I'm trying to get a sense of the current frameworks for data quality checks. I'd prefer open-source, but I'm open to all suggestions. I used Great Expectations a while back, and I'm curious whether it's still a top choice or whether better alternatives exist now. I'm particularly interested in any frameworks that use LLMs for these quality checks. Any recommendations?
5 Answers
For real projects, I've had good experiences with dbt and SQLMesh. Great Expectations is decent, but it can get pretty messy as projects scale. Just a heads up: relying on LLMs for quality checks might not be practical in high-stakes scenarios, since their output isn't deterministic and the same data can pass on one run and fail on the next.
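To make the determinism point concrete, here is a minimal sketch of rule-based checks in plain Python, loosely modeled on the kind of not-null, uniqueness, and accepted-values tests that dbt offers. The function names and the sample rows are made up for illustration and are not any tool's API; the point is only that the same input always yields the same failures.

```python
from collections import Counter

# Hypothetical helpers for illustration; each returns the indexes of failing rows.

def not_null(rows, column):
    """Rows where the column is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]

def unique(rows, column):
    """Rows whose value in the column appears more than once."""
    counts = Counter(row.get(column) for row in rows)
    return [i for i, row in enumerate(rows) if counts[row.get(column)] > 1]

def accepted_values(rows, column, allowed):
    """Rows whose value in the column is outside the allowed set."""
    return [i for i, row in enumerate(rows) if row.get(column) not in allowed]

# Made-up sample data.
orders = [
    {"order_id": 1, "status": "paid"},
    {"order_id": 2, "status": None},
    {"order_id": 2, "status": "shipped"},
]

failures = {
    "status_not_null": not_null(orders, "status"),
    "order_id_unique": unique(orders, "order_id"),
    "status_accepted": accepted_values(orders, "status", {"paid", "refunded"}),
}
print(failures)
```

Running this twice on the same rows gives identical results, which is the property an LLM-based check can't guarantee on its own.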
Just to throw it out there, I don’t think relying solely on LLMs is the way to go.
You might want to try Pandera for dataframe validation. It's worked well for the schema and value checks I need.
Have you considered Frictionless? It's worth checking out for managing data quality.
You mentioned you're looking for data quality tools. What specific types of data are you working with? If it's tabular, there are some other options worth exploring!
