Data Management Policies are one way to limit crap data and its consequences. This is the side of Data Management that sits outside the hands-on work in cells, formulae, queries, PHP code and data fields. Why is this important?

Why do businesses need data scrubbed? They know the data can’t be trusted, so they can’t make good decisions. They regularly opt to make no decision at all because the data is corrupt.

How does data get so messy that it needs to be scrubbed? One way: no policies or rules around data collection and maintenance. (Even Barney Fife had 2 basic rules for prisoners in Mayberry.)

To keep data clean, you need Data Management policies, rules and processes. Not the grand-scale, regulatory, Sarbanes-Oxley kind of policies but something much more microscopic: at the desktop and tablet level, the 1-user spreadsheet, and the 3-person department.

What are some things we can do to stay out of trouble and keep our data clean?

Just a Few Rules & Policies That We Live By
Our Data Should Also Be Governed By Rules & Policies

Keep easy access to the original, raw data

You never know when you’ll need to refer to it. There is the obvious: the data you work with has been lost or ruined. The not-so-obvious:

  • New information looks odd and you want to compare it against the history to see whether something is wrong and needs investigating
  • You didn’t need a piece of information for ordinary use (e.g., shipping method, times that orders were placed) but now you’d like to go back and get it
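One way to live by this rule is to stash an untouched copy of every raw file before anyone starts working on it. Here is a minimal sketch, assuming your raw data arrives as files on disk. The function name `archive_raw`, the `raw_archive` folder and the example file name are all made up for illustration.

```python
import shutil
import stat
from pathlib import Path


def archive_raw(source, archive_dir="raw_archive"):
    """Copy a raw file into an archive folder and mark the copy read-only."""
    source = Path(source)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    target = archive / source.name
    if target.exists():
        # Refuse to overwrite: the first archived copy is the one to keep.
        raise FileExistsError(f"{target} is already archived; not overwriting")

    shutil.copy2(source, target)  # copy2 also keeps the file's timestamps
    # Read-only for everyone, so the copy is hard to edit by accident.
    target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target


# Example call (hypothetical file name):
# archive_raw("orders_export.csv")
```

Do your daily work on the original or on another copy. The archived file is only there for the day you need to look back.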

Don’t let data entry back up

How many business cards have piled up, waiting for the One Day when you’ll get them entered or scanned? What about the phone numbers, service requests, and other details you’ve scribbled on random pieces of paper and intend to load into your system one day?

Establish a policy for getting this data entered within 24 hours or within 5 business days. Any policy is better than waiting for “one day.”
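If you log the date each item lands on your desk, a deadline like this is easy to check. Here is a rough sketch, assuming the 5-business-day option and a backlog kept as (item, date received) pairs. The names are hypothetical, and holidays are ignored to keep it short.

```python
from datetime import date, timedelta

MAX_BUSINESS_DAYS = 5  # the policy: enter data within 5 business days


def business_days_between(start, end):
    """Count weekdays after `start`, up to and including `end` (no holidays)."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def overdue(backlog, today):
    """Return the items that have waited longer than the policy allows."""
    return [
        item
        for item, received in backlog
        if business_days_between(received, today) > MAX_BUSINESS_DAYS
    ]
```

Calling `overdue(backlog, date.today())` once a day gives you the list of items that have broken the policy.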

Maintain only that data which you need

Keeping fields for Pager Numbers and other extraneous data is a potential source of corruption. Every time you touch data there’s risk of accidental deletion or replication. Copy-Pastes get pasted into the wrong areas. Calculations are accidentally altered.

The more you keep and try to maintain, the greater the opportunity for error. So, get very clear:

  • Do you only need to know whether a survey respondent is married?
  • Or do you need to know whether they’re
    • married
    • widowed
    • over 35 and never married
    • divorced
    • separated
    • have more than 2 baby’s mommas or
    • in a domestic relationship?

Marketers really need to pay attention to this because there’s a tendency to demand more and more. Just be wary of the potential for crap data. Set up policies that state what you need, and distinguish it from the data that would be nice to have. Be sure you know why you want each piece of data and whether it’s worth the maintenance and the risk of corruption.

Determine tolerances for numbers that don’t match

A policy will determine what is considered “close enough” and how long you have to chase the cause of the mismatch.

A policy will determine which of these to reconcile:
Source A         Source B         Difference   % Diff    Acceptable
$7.21            $19.42                12.21   169.3%    ?
$900.12          $946.08               45.96   5.11%     ?
$1,093,381.39    $1,093,362.99        -18.40   0.002%    ?
$4,298.00        $4,293.61             -4.39   0.102%    ?
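
Once the policy is written down, the “Acceptable” column can be filled in the same way every time. Here is a minimal sketch using the four pairs above. The two thresholds are made-up examples, not recommendations. Treating a pair as close enough when it passes either test is also just one possible policy. Source A is assumed to be non-zero.

```python
from decimal import Decimal

# Hypothetical policy values: pick your own thresholds.
ABS_TOLERANCE = Decimal("5.00")  # gaps of $5.00 or less are close enough
PCT_TOLERANCE = Decimal("0.5")   # so are gaps of 0.5% or less of Source A


def compare(source_a, source_b):
    """Return (difference, percent difference, acceptable?) for one pair."""
    a, b = Decimal(source_a), Decimal(source_b)
    diff = b - a                      # same sign convention as the table
    pct = abs(diff) / abs(a) * 100    # percent of Source A, always positive
    acceptable = abs(diff) <= ABS_TOLERANCE or pct <= PCT_TOLERANCE
    return diff, pct, acceptable


pairs = [
    ("7.21", "19.42"),
    ("900.12", "946.08"),
    ("1093381.39", "1093362.99"),
    ("4298.00", "4293.61"),
]

for a, b in pairs:
    diff, pct, ok = compare(a, b)
    verdict = "close enough" if ok else "reconcile"
    print(f"{a:>12} {b:>12} {diff:>9} {pct:>8.3f}% {verdict}")
```

With these example thresholds, the first two pairs get flagged for reconciling and the last two pass. Change the thresholds and the answers change, which is exactly why the policy needs to be explicit.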

Consider these a starting point for your Data Management Policies. Explicit rules around data collection and data maintenance let you do what you need to do, effectively.

Now, keep that data clean!