So you have the data lake, its messy version, the data swamp, and then the pristine, well-managed version called the data reservoir.
Imagine how a reservoir of fresh water is used for multiple purposes… fishing, drinking, watering crops, providing electricity. That’s how your data should be structured. Even if you are working with multiple data sources made up of a lot of unstructured data from social media, you need to be organized with your data.
I’m willing to bet that if you are reading this then you are by nature pretty organized. Analysts tend to be. If you are working in a data swamp and the company culture is not data-driven, the best advice I can give you, no joke, is to find another job.
What should you look for in a data-driven company? Are the data warehouses easy to use? Is there documentation on the data architecture? Is there a knowledge base? Are there experts and are they open to helping you?
If you say yes to questions like that, then your data management tasks are generally about optimization, data blending, adding new sources and being a kick ass analyst.
If you say no to questions like that, then your data management tasks are generally about cleaning data, lots of data validation and having your analysis be filled with caveats that you might be missing something.
So a few tips for those in good data companies: keep your documentation fresh, drop plenty of breadcrumbs, and save your queries and models.
Keep the data architects, database admins and/or IT staff in your circle. Share with them how powerful your analysis is because of their help. And most importantly, show your mastery of the data lake. Tell your story. And teach others how to fish in it.
For those of you not so blessed with good data cultures, you have to start on both ends. Map out the data flow. Try to assess where the data goes bad. Is it the input or capture of the data, is it a loading process, is it filters? Once you get a start on the front end, then go to the back end.
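One way to make "where does the data go bad" concrete is to take a small snapshot of the same records at each step of the flow and compare them. The sketch below is only an illustration: the stage names (captured, loaded, filtered), the required fields, the sample rows and the thresholds are all made up for the example, and your own flow will have different steps.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def stage_report(stages: Dict[str, List[Record]], required: List[str]) -> List[dict]:
    """Summarize row count and share of incomplete rows for each stage, in order."""
    report = []
    for name, rows in stages.items():
        # A row is incomplete if any required field is missing or empty.
        incomplete = sum(
            1 for row in rows
            if any(row.get(field) in (None, "") for field in required)
        )
        report.append({
            "stage": name,
            "rows": len(rows),
            "incomplete": incomplete,
            "incomplete_share": incomplete / len(rows) if rows else 0.0,
        })
    return report


def first_bad_stage(report: List[dict],
                    max_incomplete_share: float = 0.1,
                    max_row_loss: float = 0.2) -> Optional[str]:
    """Return the first stage that is too incomplete or drops too many rows."""
    prev_rows = None
    for entry in report:
        if entry["incomplete_share"] > max_incomplete_share:
            return entry["stage"]
        if prev_rows and (prev_rows - entry["rows"]) / prev_rows > max_row_loss:
            return entry["stage"]
        prev_rows = entry["rows"]
    return None


# Hypothetical snapshots of the same four records at three points in the flow.
stages = {
    "captured": [
        {"id": "1", "email": "a@example.com"},
        {"id": "2", "email": "b@example.com"},
        {"id": "3", "email": "c@example.com"},
        {"id": "4", "email": "d@example.com"},
    ],
    "loaded": [
        {"id": "1", "email": "a@example.com"},
        {"id": "2", "email": None},  # value lost during the load
        {"id": "3", "email": ""},    # value lost during the load
        {"id": "4", "email": "d@example.com"},
    ],
    "filtered": [
        {"id": "1", "email": "a@example.com"},
        {"id": "4", "email": "d@example.com"},
    ],
}

report = stage_report(stages, required=["id", "email"])
culprit = first_bad_stage(report)
print(culprit)
```

In this made-up case the capture looks clean and the trouble starts at the load, so that is where you would dig first. The thresholds are arbitrary starting points; set them to whatever your team considers acceptable.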
Who needs the data? How much of the data being provided now is actually usable? Eliminate any unnecessary data. Basically start cleaning up the swamp at the same time you map it. And again tell this story. Don’t make excuses, but you do need to educate. Let people know there is a problem with the data and outline what you will do to correct for it.
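The back-end questions can be roughed out the same way: list what is being provided, list what each group of users says they need, and compare. This is a minimal sketch with invented field names and teams, not a real inventory; the hard part in practice is getting honest answers about who uses what.

```python
from typing import Dict, Set


def audit_fields(provided: Set[str], consumers: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Compare the fields being provided with the fields consumers actually need."""
    # Everything that at least one consumer needs.
    needed = set().union(*consumers.values())
    return {
        "used": provided & needed,     # provided and needed by someone
        "unused": provided - needed,   # candidates to eliminate
        "missing": needed - provided,  # needed but not provided today
    }


# Hypothetical inventory: what the swamp delivers versus what each team needs.
provided = {"order_id", "amount", "email", "fax_number", "legacy_code"}
consumers = {
    "finance": {"order_id", "amount", "currency"},
    "marketing": {"order_id", "email"},
}

result = audit_fields(provided, consumers)
print(sorted(result["unused"]))   # fields nobody asked for
print(sorted(result["missing"]))  # fields someone needs but does not get
```

The "unused" list is where the cleanup starts, and the "missing" list is a ready-made talking point when you explain the problem to the people who depend on the data.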
In either case, before you go out and request or purchase new tools or start adding new data… make sure you have the architecture figured out. That’s the best tip I can give you about managing data.