Today, 25 September, is my birthday and this reminds me of something:

Data Collection Efforts That Invite Crap Data

Crap data leads to questionable analysis, bad business decisions, and even contempt for your own customers because they’re the ones who fed you this hogwash.

It’s often said that our lists are the most valuable element in our businesses. Contact lists, client lists, prospect lists, webinar participant lists, etc. These lists are often the only thing that’s exchanged when someone sells a business. How about the data quality, though?

One data collection effort that is a magnet for poor data quality is online forms with Required Fields, specifically when the Required Field is personal and is not germane to current purposes. Requesting and requiring information in this context is technically called OFDLIZ:

Invitation for Crap Data

open front door,
let in zombies

Sell a business and exchange crap data for money? :facepalm:

BIRTHDAY as a required field is something marketers need to think seriously about. Birthdays are undoubtedly valuable information TO MARKETERS. But marketers need to give us a damned good reason to provide the truth, and they have to gain the trust of a public that’s wary of identity theft.

I am regularly guilty of inputting an incorrect birthdate when the marketer

  1. doesn’t need my birthday
    • I just want access to read an article about Chicago’s plan to bring Wi-Fi to city parks.
  2. only needs to know that I’m over a certain age
    • To use social media sites
    • To access a site and read reviews on bourbons and ryes

Crap Data - thinking

Here’s a question someone may ask: Well, Mr. DataMan, how do you know that the birthdays are crap?


I was helping someone scrub 1000+ rows of data submitted by people who’d registered themselves online. A surprising number of people were born on 1/1/11, 6/6/66 and 1/2/34. Interesting. And plenty of “555” phone numbers.
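For anyone doing a similar scrub, here is a minimal Python sketch of how those patterns could be flagged. It is not the process I actually used. The list of suspect dates comes straight from the examples above, and the function names and the `birthdate` row key are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Placeholder-looking dates taken from the examples in this post
# (an assumed, incomplete list).
SUSPECT_DATES = {"1/1/11", "6/6/66", "1/2/34", "0/0/00"}


def is_suspect_birthdate(raw):
    """True if the raw text matches one of the known placeholder dates."""
    return raw.strip() in SUSPECT_DATES


def is_suspect_phone(raw):
    """True if the exchange (the three digits before the last four) is 555."""
    digits = re.sub(r"\D", "", raw)
    return len(digits) >= 7 and digits[-7:-4] == "555"


def most_common_birthdates(rows, top=3):
    """Count identical birthdate strings; big clusters deserve a second look."""
    counts = Counter(row["birthdate"].strip() for row in rows)
    return counts.most_common(top)
```

A flag here means "look closer," not "delete." Somebody really was born on 6/6/66.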

This leads to other questions:

  • How many people abandoned the form/registration because it became a hassle?
  • Were ANY of the 6/6/66 birth dates real? That looks more plausible than 1/1/11.
  • How many of the normal-looking dates are crap data?
  • Why wasn’t there sufficient validation to prevent 0/0/00 as an acceptable date?
  • Is the birthdate easy to revise if a person wants to provide the real date later?
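On the validation question, a rough Python sketch of the kind of check that would have stopped 0/0/00 at the door. It assumes the form collects dates as M/D/YYYY text, and the 120-year window is my own arbitrary cutoff. `parse_birthdate` is a hypothetical helper, not part of any form library.

```python
from datetime import date


def parse_birthdate(raw, today=None):
    """Return a date if raw is a real calendar date in M/D/YYYY form, else None."""
    today = today or date.today()
    parts = raw.strip().split("/")
    if len(parts) != 3:
        return None
    try:
        month, day, year = (int(p) for p in parts)
        # date() raises ValueError for impossible dates such as 0/0/00 or 2/30/1990.
        parsed = date(year, month, day)
    except ValueError:
        return None
    # Assumed sanity window: not in the future, not more than 120 years back.
    if parsed > today or today.year - parsed.year > 120:
        return None
    return parsed
```

Note what this does not do: it happily accepts a made-up but plausible date. Validation keeps out the impossible, not the untrue.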


  • Marketers should gain our trust. State how this information will and will not be used.
  • Only require fields that are necessary.
    • Required fields can make a form a hassle to fill out. Pressing ENTER and having the form come back with red boxes is un-fun.
    • If you want to send customers a birthday email, then just ask for the day and month. Leave the year optional.
    • You can ask for anything under the sun (eye color, medical history, dietary restrictions, political beliefs), but make it optional if it’s only for marketing purposes and is tangential to the purpose of the interaction.
  • Explicitly invite users to refrain from loading crap data. Invite their empty fields. Love their empty fields.
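To make "love their empty fields" concrete, here is one way the suggestions above might look behind a registration form, sketched in Python. The `Registration` record, the `from_form` helper, and the field names are all hypothetical. The only assumption is that email is the one field the interaction truly needs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Registration:
    email: str                         # the only required field in this sketch
    birth_month: Optional[int] = None  # enough for a birthday email
    birth_day: Optional[int] = None
    birth_year: Optional[int] = None   # optional, never required


def from_form(form):
    """Build a Registration, storing blank optional fields as None."""

    def optional_int(key):
        value = (form.get(key) or "").strip()
        return int(value) if value.isdigit() else None

    email = (form.get("email") or "").strip()
    if not email:
        raise ValueError("email is required")
    return Registration(
        email=email,
        birth_month=optional_int("birth_month"),
        birth_day=optional_int("birth_day"),
        birth_year=optional_int("birth_year"),
    )
```

An honest blank stored as None is easy to spot and easy to fill in later. A fake 1/1/11 is neither.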

Of course, all of this means you’ll get less data from your audience, but you’ll be proactive in increasing the data quality. Otherwise, sit down and wait for the zombies.

Crap Data - Zombies

Additional Reading: Using New Data for Marketing That Is “JUST RIGHT.” Eric Wittlake raises moral and ethical questions about advancements in technology and how data can be used in marketing to us. This triggered my concern about data quality, something that Eric is also concerned about.

doorway photo credit: maistora via photopin cc
lightbulb photo credit: Raymond Larose via photopin cc
zombie silhouettes photo credit: Ben Blogged via FreeVector