Declutter Your Data: get ready to tidy (2/3)
In my first post, How to Tidy Your Data Environment, I established a framework for decluttering your data. Let’s get ready to tidy with the first two steps.
Imagine your Ideal Data
The key to the KonMari Method’s success is the first step. Kondo’s phrasing is deliberate. She doesn’t ask you to imagine your dream home. Or even a clean home. Or less stuff.
She asks you to imagine your ideal lifestyle. Do you want to cook more? Do you want to spend more time with family? Do you want to play an escape room board game with a glass of bourbon in front of a fire after a long day of work? If so, we’d make good friends :)
Once you imagine how you want to live, you can imagine the home you need to support that lifestyle.
What comes to mind when I ask, “What do you want your data to look like?” You might think:
Validated
Organized
Centralized
Accessible
Curated
Let’s take a Kondo approach, though. Ask yourself instead:
How do you want to use your data to create value?
Start your answers with, “I want to…” You might say things like:
I want to...
find the right data quickly.
trust the data is accurate.
understand what the data represents.
know where the data comes from.
know where the data is used.
Once you know how you want to work with your data to create value, you can proceed to tidying your data.
Evaluate by Asset Type
Here’s Kondo’s advice to start the tidying process:
Tidy by category – not by location. Going from room to room makes it hard to truly comprehend what you own – and will inevitably lead to a rebound.
What are the rooms in your data home? Put another way, how is your data currently siloed, organized, or classified? By team? By type? By source?
I’d like to broaden the definition of data here as well. Let’s consider data assets, including:
Data Pipeline | Produces data sets, whether raw, transformed, or from a machine learning model.
Data Set | A structured or unstructured collection of data stored and used for analysis or reporting.
Data Tool | Provides access to data sets for users to analyze, monitor, or report on data. Most commonly represented as reports and dashboards.
I use the word asset deliberately to imbue a sense of value. Data should create value for your company, not just sit dusty on a shelf in your data home. Like Kondo, I recommend you declutter data assets in a specific order:
Data Tools
Data Sets
Data Pipelines
Data Tools like reports and dashboards represent the furthest downstream dependency for data. If you no longer need the tools using a particular data set, you likely no longer need that data set. If you no longer need that data set, you likely no longer need the data pipeline or data model producing that data set.
Your decisions on which data assets to keep will be much easier if you start at the end and work your way backwards.
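If you can export your lineage as a simple mapping, the backwards walk might look something like the sketch below. It is only an illustration: the asset names, the `UPSTREAM` mapping, and both function names are made up, and a real lineage export will likely look different. The idea is to start from the Data Tools you decided to keep, follow their dependencies upstream, and treat everything left over as a candidate for review.

```python
# Hypothetical lineage: each asset maps to the upstream assets it reads from.
# Tools read from data sets; data sets are produced by pipelines.
UPSTREAM = {
    "sales_dashboard": ["sales_mart"],
    "churn_report": ["customer_features"],
    "sales_mart": ["sales_pipeline"],
    "customer_features": ["feature_pipeline"],
    "legacy_extract": ["legacy_pipeline"],
}


def assets_to_keep(upstream, tools_to_keep):
    """Return every asset reachable upstream from the tools you keep."""
    keep = set()
    stack = list(tools_to_keep)
    while stack:
        asset = stack.pop()
        if asset in keep:
            continue  # already visited
        keep.add(asset)
        stack.extend(upstream.get(asset, []))
    return keep


def removal_candidates(upstream, tools_to_keep):
    """Return assets that no kept tool depends on, sorted by name."""
    all_assets = set(upstream)
    for sources in upstream.values():
        all_assets.update(sources)
    return sorted(all_assets - assets_to_keep(upstream, tools_to_keep))


if __name__ == "__main__":
    # Suppose only the sales dashboard still creates value.
    for asset in removal_candidates(UPSTREAM, ["sales_dashboard"]):
        print("review for removal:", asset)
```

Anything this flags is a candidate, not a verdict. A data set with no tool attached might still feed something your lineage doesn’t capture, so check before you discard it.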
Focus on What to Keep
Part of Kondo’s claim to fame is her deceptively simple criterion for deciding what to keep: “Does this item spark joy?” More visceral than logical. More emotional than practical. It runs against the natural inclination to evaluate physical items by their monetary or utilitarian value.
I’ve used the KonMari process, however, and found it deceptively powerful. When you stop keeping items just because you spent money on them, because they might be useful in the future, or because you don’t have an alternative, you keep only the items that add real value to your life and help you achieve your ideal lifestyle.
I propose the equivalent question for data is:
Does this data create value?
As mentioned in Data that Sparks Joy, I can’t define what value means to your business. But once you figure that out, you can easily decide whether you should collect, process, analyze, or report on a certain set of data. You’ll then focus your time, energy, and effort on data that creates true value for your organization.
Now that you’re mentally prepared to tidy your data, you can start making decisions and taking action on data assets in my next post: Tidy Up to Maximize Value.