So, a client delivers a lot of data to me. They need it scrubbed, or reorganized, or an application built around their data to automate tasks … or some combination of the three.

Guess what? Most projects involve something that I don’t know how to do (yet).

  1. And that’s how it goes in these technical fields.
  2. It’s also part of the thrill.

The uncertainty feels like being a Marshall in the Old West, showing up in a place reminiscent of Cormac McCarthy’s Blood Meridian.

We (Data Marshalls) ride confidently into the scene without knowing what we’ll find. What our clients pay us for, though, is our knowledge of the tools, our experience, and our inventiveness in creating solutions. So, let’s discuss one of the tools for taking on this mission.

THE MISSION
Stop the wanton pillagin’, maraudin’, gamblin’ and general mayhem in Dataville

NAME or DATA GENERATORS have nothing to do with Excel or databases. These are time savers. Sure, you or I could make up 50 names and 50 cities and states by hand. No! Go use some technology to generate 50 bogus names for you instantly. Two are listed below.
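If you'd rather roll your own than use a website, here is a minimal sketch of the idea in Python. The name and place pools and the function name `bogus_people` are made up for illustration; they are not taken from either of the generators listed below.

```python
import random

# Made-up sample pools; swap in whatever names and places you like.
FIRST_NAMES = ["Sue", "Phil", "Maria", "Dev", "Aiko", "Tom", "Lena", "Omar"]
LAST_NAMES = ["Carter", "Nguyen", "Lopez", "Baker", "Stone", "Patel", "Reed"]
PLACES = [("Columbus", "Ohio"), ("Austin", "Texas"), ("Boise", "Idaho"),
          ("Salem", "Oregon"), ("Macon", "Georgia")]


def bogus_people(count, seed=None):
    """Return `count` (name, city, state) tuples of made-up people."""
    rng = random.Random(seed)  # passing a seed gives the same list every run
    people = []
    for _ in range(count):
        name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        city, state = rng.choice(PLACES)
        people.append((name, city, state))
    return people


# Print 50 bogus people, one per line, ready to paste somewhere.
for name, city, state in bogus_people(50, seed=1):
    print(f"{name}, {city}, {state}")
```

With pools this small you will see repeats, which is usually fine for test data. Bigger pools give more variety.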

Why would you need 50 (or even thousands) of bogus names, addresses, etc.? For several reasons, and not only for Data Management professionals but also for clients who want to hire one. Here are 4 uses of this tool:

  • Build a small version of the real project. I recently had a spreadsheet that was 11,000 rows and 800 columns, and the client needed the data organized in a specific way. Ok. Developing a strategy to squash the mayhem didn’t require taking on the whole town. Using a Data Generator to fill a 50×30 spreadsheet with bogus data provided enough access to test solutions.
  • Get the project to look familiar. Data can often look like incomprehensible strings of text and digits:
    • Product numbers
    • Tracking codes
    • Chemical compounds

Let’s say a data-set is full of things like C1 J8320000002-R08. It’s early in the project, and you’re just trying to get familiar with what’s what and determine the effectiveness of your strategies.

Use a Data Generator to change the data to common things like Sue, Phil, Ohio, etc. Maybe you’ll see that the strategy that produced C1 J8320000002-R08 was really merging Ohio and Sue: OhiSue-o

That may or may not be what you wanted, but it’s at least easier to comprehend.
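One way to do that swap yourself is to give every distinct code its own familiar word and reuse that word each time the code shows up again. The sketch below assumes a short made-up word list, and the helper name `make_relabeler` is hypothetical. The second code in the example is invented for illustration.

```python
FRIENDLY = ["Sue", "Phil", "Ohio", "Texas", "Maria", "Idaho"]  # made-up word list


def make_relabeler(friendly_words):
    """Return a function that swaps each distinct code for a friendly word."""
    seen = {}  # remembers which word each code was given

    def relabel(code):
        if code not in seen:
            n = len(seen)
            word = friendly_words[n % len(friendly_words)]
            round_no = n // len(friendly_words)
            # Once the word list runs out, reuse words with a number on the end.
            seen[code] = word if round_no == 0 else f"{word}{round_no + 1}"
        return seen[code]

    return relabel


relabel = make_relabeler(FRIENDLY)
codes = ["C1 J8320000002-R08", "K7 A1100000415-R02", "C1 J8320000002-R08"]
print([relabel(code) for code in codes])  # the repeated code gets the same word
```

Because the mapping is consistent, anything your strategy merges, splits, or duplicates shows up in words you can read at a glance.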

  • Protection of sensitive data. Often we have to seek help for some piece of the task. Sometimes there’s no choice but to show the document to someone for them to get every nuance of the maraudin’ we’re dealing with. Of course, we aren’t going to hand over the straight project. Use a Data Generator to load the document with bogus data. The guru’s guru just needs to know how the data-set hangs together.
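As a rough sketch of how that loading could be done in code, the snippet below scrambles the letters and digits in chosen columns while leaving the layout (length, spaces, dashes) alone. The function names, the column names, and the sample row are all made up for illustration, and it assumes every cell is text. Treat it as a way to share the shape of a data-set, not as a security guarantee.

```python
import random
import string


def scramble(value, rng):
    """Swap every letter and digit for a random one, keeping the layout."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # spaces, dashes, and punctuation stay put
    return "".join(out)


def scrub_rows(rows, sensitive_columns, seed=None):
    """Return a copy of rows (a list of dicts) with the named columns scrambled."""
    rng = random.Random(seed)
    return [
        {col: scramble(val, rng) if col in sensitive_columns else val
         for col, val in row.items()}
        for row in rows
    ]


rows = [{"name": "Sue Smith", "code": "C1 J8320000002-R08", "state": "Ohio"}]
print(scrub_rows(rows, {"name", "code"}))
```

Note that the same real value comes out different every time it is scrambled, so links between rows are lost. If those links matter to the person helping you, a consistent replacement is the better choice.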

For Clients, a Name/Data Generator can be useful in helping us work together. Often, a person or business is in need of help but they don’t want to expose their data to just anyone or risk it being lost somewhere. Completely understandable. Two solutions:

  • If they need a template or application developed, they can load the bogus data in the format that they want it in and I can install the guts in order for everything to function.
  • If a clean-up or reorganization strategy is needed, again, a client can provide bogus data for me to generate a solution for them to carry out on the real data.
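For anyone who wants to build that kind of stand-in file without a website, here is a minimal sketch that fills a small sheet (50×30, like the test spreadsheet mentioned earlier) with bogus values and turns it into CSV text that Excel can open. The value pool, the "Column N" headers, the function names, and the file name `mock_sheet.csv` are all assumptions for illustration; a real stand-in would use the client's own column headings.

```python
import csv
import io
import random

SAMPLE_VALUES = ["Sue", "Phil", "Ohio", "Texas", "Maria", "Idaho"]  # made-up pool


def mock_sheet(rows=50, cols=30, seed=None):
    """Return a header row plus `rows` rows of bogus values, `cols` wide."""
    rng = random.Random(seed)
    header = [f"Column {i + 1}" for i in range(cols)]
    body = [[rng.choice(SAMPLE_VALUES) for _ in range(cols)] for _ in range(rows)]
    return [header] + body


def to_csv_text(sheet):
    """Turn the sheet into CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(sheet)
    return buffer.getvalue()


# Save the bogus sheet so it can be opened in a spreadsheet program.
with open("mock_sheet.csv", "w", newline="") as handle:
    handle.write(to_csv_text(mock_sheet(seed=7)))
```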

IN CLOSING There’s a lot that goes into Data Management, a lot of tools to make it happen, and every day there’s one less Dataville mired in moral turpitude. Name/Data Generators are as helpful as a strong healthy horse and a comfortable saddle.


I tip my hat to ya’ and request, “Whatever you do, keep your data clean.”

A couple of free Data/Name Generators

  • GENERATE DATA Generates addresses and other data that are downloaded
  • RANDOM NAME GENERATOR Generates names in your browser. You would then copy/paste if you need it in Excel, Word, etc.

See a screencast on how I use data from a Data Generator
