4 Comments

I was so ready to dunk on yet another data quality definition, but I think this is the first one I have seen that talks about fitness for purpose — and doesn’t drone on about the number of nulls.

Fun pet peeves with data quality:

- masked missing values, where instead of a null you have some plausible default value, are so much worse than nulls

- imputation of missing values aimed at one use case can make the data unusable for other purposes
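To make the first peeve concrete, here is a minimal Python sketch. The readings and the choice of 0.0 as the masking default are made up for illustration; the point is only that a real null can be skipped, while a plausible default silently ends up in the statistic.

```python
from statistics import mean

# Hypothetical sensor readings; None marks a genuinely missing value.
readings_with_nulls = [21.5, None, 22.0, None, 21.0]

# The same data after an upstream step replaced missing values with a
# plausible-looking default (0.0 is an assumption for this example).
readings_masked = [21.5, 0.0, 22.0, 0.0, 21.0]


def mean_ignoring_nulls(values):
    """Average only the values that are actually present (not None)."""
    present = [v for v in values if v is not None]
    return mean(present)


# With real nulls, the missing readings are visible and can be skipped.
honest_mean = mean_ignoring_nulls(readings_with_nulls)

# With masked values, nothing looks missing, so the defaults get averaged in.
masked_mean = mean_ignoring_nulls(readings_masked)

print(honest_mean, masked_mean)
```

The masked average comes out far lower than the honest one, and nothing in the masked data tells you why.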

I think it is weird that discussions of data quality rarely mention the original purpose of the data or the process that generates it.


Well I'm not selling a data quality tool so it's hopefully easier to be less dunk-able :D

My data quality horror is nulls that are literally the string "null" but the db query tool I'm using doesn't make that obvious.
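A rough sketch of how one might catch these once the rows are out of the query tool. The function name, the list of null-like spellings, and the sample rows are all assumptions for illustration, not anything a particular database or tool provides.

```python
# Spellings treated as "probably meant to be null" (an assumed list;
# adjust it for whatever your upstream systems actually emit).
NULL_LIKE_STRINGS = {"null", "none", "nan", "n/a"}


def find_disguised_nulls(rows):
    """Return (row_index, column) pairs where a string merely looks like a null.

    Real nulls (None) are not flagged; only string values such as "null".
    """
    flagged = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            if isinstance(value, str) and value.strip().lower() in NULL_LIKE_STRINGS:
                flagged.append((index, column))
    return flagged


# Hypothetical query result, one dict per row.
rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "null"},    # the string "null", not a real null
    {"id": 3, "city": None},      # a real null
    {"id": 4, "city": " NULL "},  # same problem with padding and caps
]

flagged = find_disguised_nulls(rows)

for index, column in flagged:
    # repr() keeps the quotes, so 'null' is visibly a string and not None.
    print(index, column, repr(rows[index][column]))
```

Printing with `repr()` is the cheap part of the fix: it shows the quotes that many query tools hide.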


My horror is that empty strings in Oracle are the same as NULL...


One word horror story: Oracle.
