Tonight, I’m preparing to teach a workshop and need examples to show the students. Here’s your behind-the-scenes peek into using Excel to generate data to build useful examples.
WHY GENERATE BOGUS DATA (AKA DUMMY DATA) SETS?
Knowing how to generate random data is useful for many reasons.
- You’re learning something new and want to make examples that are relevant to what you need.
- You’re building a solution for data that are confusing. Solution: generate a simple dataset, get that working and then use the real data at the end.
- Like me, you’re preparing to teach and need examples that are relevant to your students.
Examples
#1. I recently had a project that had 600 wide columns of course codes, book titles, the same course titles but on different dates. All of the scrolling, and course titles were getting in the way of my ability to think through a solution for the client.
I created simple, nonsensical data using first names, pastry types, quantities and weights. DONE!
Those 15 columns and 20 rows of familiar data acted as placeholders for the actual data; it faded into the background and allowed the solution-development to stay front and center in my mind.
#2. Over this past year I’ve been blogging and am constantly making up random data sets. So, come have a peek and see what’s what.
LET’S GENERATE SOME DUMMY DATA
In previous posts, I’ve covered the use of bogus data, and sources for acquiring data sets. They’re fantastic for first & last names, fake mailing addresses, etc. Sometimes, we don’t need to be that fancy. We can generate our own dummy data in Excel.
In the screencast below, I show how to generate data using
RANDBETWEEN
Find-Replace
The result: 102 records of bogus call center data with Dates, Call Center Reps, Call Types, Call Lengths, and whether the call was recorded or not.
And we close the video with a quick Pivot Table example!
Hopefully this blogpost gives you a new tool to help keep your data clean and help you work efficiently. That’s the mission in 2013. Let’s make it happen!
Please leave any comments or questions below. It’d also be interesting to know who else works with dummy data, why, and how you access it.
UPDATE: 29OCT14
I wrote this blogpost back in January 2013 and want to direct you to a video by Mike “ExcelIsFun” Girvin. A few days ago he published a fantastic video that shows a more efficient way to create a random dataset. Check it out: Excel Magic Trick 1145: Create Random Data Set with Formulas: 4 Unusual Formulas
Peeking photo credit: ashley rose, via photopin cc
I use often http://www.yandataellan.com/ tool to generate my test databases; it is simple, rich of functionalities, and user-friendly. You could generate up to 10.000 rows of mock data in several file formats (CSV, Excel, SQL, JSON, HTML, and XML). Take a look!
I use often http://www.yandataellan.com/ tool to generate my test databases; it is simple, rich of functionalities, and user-friendly. You could generate up to 10.000 rows of mock data in several file formats (CSV, Excel, SQL, JSON, HTML, and XML). Take a look!