It does not matter whether you are actually using it or not.
However, this leads to problems.
As Adam Wray, CEO of Basho, told Forbes, “everything is vacuumed up. […] [Data lakes] are evil because they’re unruly, they’re incredibly costly and the extraction of value is infinitesimal compared to the value promised.”
The central question now is how to generate value and make that data usable.
In other words, what is essential for the strategy?

What is Essentialism?
Recently, I came across Greg McKeown’s concept of essentialism.
In a Tim Ferriss podcast episode, McKeown talks about his course at the d.school of Stanford, which combines this concept with Design Thinking.
According to McKeown, the best-fitting definition of essentialism, which he describes in his book, is “less but better”.
Hence, it starts with the question: what is really essential to what I want to achieve?
In general, it is a disciplined, systematic approach for determining where your highest point of contribution lies.
Afterwards, you execute on the things that matter.
McKeown’s design course on essentialism at Stanford includes a lot of techniques from Design Thinking, such as defining the problem space, developing the solutions that seem essential, and then testing them.

Starting with use-cases
When it comes to the data strategy of a firm, it makes sense to start by identifying reasonable use cases for your overall corporate strategy.
Often it is far leaner to develop good use cases first and then decide what the architecture should look like.
This doesn’t mean you cannot develop a data lake strategy, but your most valuable resource is time, and it is advisable to start doing early and to generate meaningful learnings.
A valuable mindset helping to do this is e.
Hence, some principles of essentialism can also be applied to Data Thinking and to your data strategy at large.
Through a method like Data Thinking workshops, the valuable use cases are generated, prioritised (deciding which are essential) and afterwards developed further towards prototypes.
A few of the principles of Essentialism: The Disciplined Pursuit of Less can be directly applied to this kind of work.

Spend time exploring
There is a common misconception that AI and ML are just plug-and-play solutions.
As a foundation for employing such technologies, a fairly large amount of data is often necessary.
However, most companies that want to start using ML and AI don’t have the data they need.
Hence, a strategy for how to collect and store it is needed first.
During Data Sprints, we often spend a significant amount of time exploring use cases that make sense for the company and deciding how we actually get the data we need if it is not already there.

Realise you have a choice
As with everything in life, it is important to realise that you have a choice here too.
Just because big players decide to centralise all their databases into a data lake and buy the respective tech architecture doesn’t mean that this is the right setup for you.
In the end, you always have a choice.

Focus on the vital few
During Data Thinking Workshops we collect a lot of different use cases.
Often, clients already have a list of use cases.
How do you determine those that are vital?
An efficient way to identify the ones you should double down on is to evaluate them with the team, based on impact and effort.
You can start by placing all the use cases on an impact effort matrix, like the one below.
In the beginning, you would want to start with the use cases in the top right corner, as they bring high value and cost less effort.
They are often referred to as “low-hanging fruit”.
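
The impact and effort evaluation described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the use case names, the 1 to 5 scoring scale, the threshold of 3 and the quadrant labels are all hypothetical and not taken from any workshop material.

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    name: str
    impact: int  # team estimate, 1 (low) to 5 (high); the scale is an assumption
    effort: int  # team estimate, 1 (low) to 5 (high); the scale is an assumption

# Hypothetical quadrant labels, ordered from most to least attractive.
QUADRANT_ORDER = ["quick win", "major project", "fill-in", "deprioritise"]

def quadrant(uc, threshold=3):
    """Place a use case in one of four impact/effort quadrants."""
    high_impact = uc.impact >= threshold
    low_effort = uc.effort < threshold
    if high_impact and low_effort:
        return "quick win"      # high value, little effort: start here
    if high_impact:
        return "major project"  # high value, but costly
    if low_effort:
        return "fill-in"        # cheap, but limited value
    return "deprioritise"       # costly and limited value

def prioritise(use_cases):
    """Sort by quadrant first, then by higher impact, then by lower effort."""
    return sorted(
        use_cases,
        key=lambda uc: (QUADRANT_ORDER.index(quadrant(uc)), -uc.impact, uc.effort),
    )

# Made-up example scores, as a team might assign them in a workshop.
use_cases = [
    UseCase("churn prediction", impact=5, effort=4),
    UseCase("sales dashboard", impact=4, effort=2),
    UseCase("chatbot", impact=2, effort=5),
    UseCase("data quality report", impact=2, effort=1),
]

for uc in prioritise(use_cases):
    print(f"{uc.name}: {quadrant(uc)}")
```

In practice the scores come out of the team discussion; the code only makes the resulting order explicit and repeatable.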

Repeat the process
You can only become good at something if you repeat it over and over again, and Design Thinking and Data Science are both iterative disciplines.
Constant learning and improvement are in the nature of these two disciplines.
Dat Tran from Idealo wrote a fantastic article about what a Minimum Viable Data Product is, in which he touches on this point.
He writes “A possible approach to solve this classification problem would be to take a neural network with one hidden layer.
We would next train and evaluate the model.
Then depending on the results, we might want to keep improving our model.
We then would add another hidden layer and then do the same modelling exercise again.
Then depending on the results again we might add more and more hidden layers.”
This shows the iterative approach.
Start lean and add complexity later, once you have learnt enough to see what would really give you the necessary depth of information.
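
The loop described in the quote can be sketched as follows. This is a hedged illustration, not Dat Tran’s implementation: `iterate_model`, the `min_gain` stopping rule and the stand-in `fake_train_and_evaluate` function with its made-up scores are all assumptions introduced here. In a real project, the stand-in would be replaced by actual training and validation.

```python
def iterate_model(train_and_evaluate, max_hidden_layers=5, min_gain=0.01):
    """Start with one hidden layer and add more only while the score keeps improving.

    train_and_evaluate is any callable that takes a number of hidden layers
    and returns a validation score (higher is better).
    """
    best_layers = 1
    best_score = train_and_evaluate(best_layers)
    for layers in range(2, max_hidden_layers + 1):
        score = train_and_evaluate(layers)
        if score - best_score < min_gain:
            break  # the extra complexity no longer pays off
        best_layers, best_score = layers, score
    return best_layers, best_score

# Stand-in for real training: made-up validation scores per number of hidden layers.
MADE_UP_SCORES = {1: 0.80, 2: 0.86, 3: 0.88, 4: 0.881, 5: 0.879}

def fake_train_and_evaluate(hidden_layers):
    return MADE_UP_SCORES[hidden_layers]

layers, score = iterate_model(fake_train_and_evaluate)
print(f"stopped at {layers} hidden layers with score {score}")
```

The point of the sketch is the stopping rule: complexity is only added while it demonstrably brings more information.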
In the end, whatever you do, your data strategy is an inseparable part of your products and services and, like everything else, needs constant learning, testing, and improvement.
To conclude, I have adapted some of the personal questions you can ask yourself when applying essentialism and combined them with the questions we ask during Data Thinking Workshops and Data Sprints.
Ask yourself the three questions:
What is it that our brand/product/company stands for?
What are we particularly good at?
What meets a significant need for our customers?

If you could only do one thing with the data you have at hand, what would you do?
Regarding my data strategy, is this the very most important thing I should be doing with my time and resources right now?
What is important right now?
Can we solve it with the data we have?

Have some thoughts? Please share in the comments.

References:
Kowalski, Kyle (2017). 10 Life hacks from ‘Essentialism’ (Book Summary) URL: https://www.
How to design a successful data lake URL: https://knowledgent.com/whitepaper/design-successful-data-lake/
Tran, Dat (2018). What is Minimum Viable Data Product? On Medium URL: https://medium.com/idealo-tech-blog/what-is-minimum-viable-data-product-49269e338d85
Wood, Dan (2016). Why Data Lakes are evil. Forbes URL: https://www.