Validating Data Consistency
Swipe to show menu
After cleaning duplicates, fixing formats, and highlighting errors, the final step is to make sure that your data is logically consistent.
Data consistency means that values across different columns make sense when compared to each other. Even if each column looks correct on its own, the relationship between columns can still be wrong.
Let's look at a simple example:
Product | Quantity | Price | Total |
|---|---|---|---|
Phone | 2 | 400 | 800 |
Phone | 1 | 800 | 800 |
Laptop | 1 | 1000 | 1000 |
At first glance, everything looks fine. But if you think about it, there is a problem.
If 2 phones cost 800, then 1 phone should not also cost 800. This indicates that one of the values is incorrect.
How to Validate Consistency
The idea is to compare related columns and check if they follow expected rules.
For example, you might expect:
Total = Quantity × Price;- Values to stay within a realistic range;
- Consistent relationships across rows.
You can use simple formulas to verify this logic.
How It Works
You create a formula that checks whether the relationship between columns is correct.
For example:
=IF(B2*C2=D2, "OK", "Error")
This formula checks whether the total matches the calculation. If it does, it returns OK; if not, it flags the row as an Error.
Create a new column called Validation.
Write a formula that calculates the price per item by dividing Total by Quantity. If the price of one item exceeds 2000, mark it as "Error", otherwise mark it as "OK".
Apply the formula to all rows.
Use a formula that divides Total by Quantity and checks if the result is greater than 2000, for example: =IF(E2/D2>2000;"Error";"OK").
Thanks for your feedback!
Ask AI
Ask AI
Ask anything or try one of the suggested questions to begin our chat