Turning Data Owners Into Data Resources

One of the biggest challenges I hear about when I do public trainings is how to get people who are stingy with their data to share it.

My answer is always the same… buy them a doughnut.

Seriously, when I reflect on what made me a great analyst when I was with Wells Fargo, one of the biggest reasons was that I made sure all the data guys liked me.

Just about every company has someone who likes to keep their data close. Sometimes it is a result of security risks. But most of the time it is because they just don’t like to share. It is also possible they just don’t like someone on your team. Whatever the reason, you have to get them to lower the gate and let you in to play with their data.

From my perspective, I generally see a few types of data gatekeepers, each with a very different reason for keeping you out of their data playground.

  1. They are afraid to share the data, because they know the data is not 100% trustworthy.
  2. They are afraid to share the data, because they worry you will use the data to do things they can’t.
  3. They are afraid to share the data, because they had a bad experience with you or someone like you.
  4. They are afraid to share the data, because you play for a different team.
  5. They are afraid to share the data, because you won’t need them anymore.
  6. They can’t share the data because it’s a security risk.


In every case, even the last, engagement is the key. Share with them why you need the data, and demonstrate how much more awesome your analysis and reporting will be if you can include their data.

One of the advantages I have enjoyed in my career is that I really get along with people. I make an effort to be likeable and trustworthy. To be a great analyst, you will need to be likeable and trustworthy too.

And I kid you not, buying them a doughnut and dropping it off at their cube works more often than you might imagine.

The key to using analytics in a business is like a secret sauce. It is a unique combination of analytics talent, technology and technique that are brought together to enrich and empower an organization. A successful analytics culture is not easy to create, but DMAIPH can show you how. Contact DMAIPH now at analytics@dmaiph.com or connect with me directly so we can build a strategic plan to turn your company into an analytics-driven success story.


Q9: Can you please describe the concepts of storing data in a data warehouse?

Twenty years ago, data was mostly stored in databases. These databases housed all the data a business would need to do analytics. Transaction data, sales data, customer data and demographic data were all neatly collected, stored and analyzed in databases.

A surprising number of companies still store most of their data in databases. It works well for businesses that just need to look at historical data to conduct basic descriptive analytics.

About ten years ago, the growing amount of data captured in a business and the growing diversity in data sources and data storage brought about the mainstream use of data warehouses in the business world.

A data warehouse is often a collection of interconnected databases, so that data can be brought together into one place for reporting and analysis.

Whether you are working with a database or a data warehouse, you should have a basic understanding of how data is stored. It should be in table format, with column headers and data rows.
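To make the table idea concrete, here is a minimal sketch in Python. The column names and values are made up for illustration. The point is simply that once data has one header row and one record per row, sorting it takes a single line.

```python
import csv
import io

# Hypothetical sample data: one header row, then one record per row.
raw = """customer_id,region,amount
1,North,120.0
2,South,80.0
3,North,200.0
"""

# Each row becomes a dict keyed by the column headers.
rows = list(csv.DictReader(io.StringIO(raw)))

# Table-shaped data is easy to sort by any column.
by_amount = sorted(rows, key=lambda row: float(row["amount"]), reverse=True)

for row in by_amount:
    print(row["customer_id"], row["region"], row["amount"])
```

If a report cannot be loaded this way without hand editing, that is usually a sign it was built for decoration rather than analysis.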

A good way to quickly assess the analytics culture of a business is to look at how data is shared among management. Does it look table-like? Or is it obvious that most of the time spent by the author was put into decorating? If you can’t easily sort something, then you are not dealing with a good data culture.

The best way to have a good data culture is to have well-documented data structures. Any database admin worth their salt has the data hierarchy mapped out and has a knowledge base to help users know what data is in each field.

Like with finding data, being good at storing data starts with knowing the environment. Any good analyst should have a basic understanding of how to use SQL to query a data table. Even if you can’t do hard-core coding, knowing how data is generally stored in a structure is key.
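For those who have never pulled a query, here is a small sketch of what that looks like, using Python’s built-in sqlite3 module. The sales table, its columns and its values are hypothetical stand-ins for whatever your own database holds.

```python
import sqlite3

# Hypothetical in-memory database standing in for a real company database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "North", 120.0), (2, "South", 80.0), (3, "North", 200.0)],
)


def total_by_region(connection):
    """Return (region, total amount) rows, sorted by region."""
    query = """
        SELECT region, SUM(amount) AS total_amount
        FROM sales
        GROUP BY region
        ORDER BY region
    """
    return connection.execute(query).fetchall()


for region, total in total_by_region(conn):
    print(region, total)
```

The SELECT, FROM, GROUP BY and ORDER BY pattern covers a large share of everyday analyst queries, whatever database sits underneath.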


Another important concept about data warehouses is that you have to know how to join or blend data from different sources. When you have multiple data tables in a warehouse, you often need to join the data on a common field. Data blending goes one step further, as you are often trying to combine data that doesn’t have a natural point in common that is easy to join on. Advanced data warehouses and data management tools can blend things easily, but it’s still important to understand the core concepts of how to join and blend data.

As I mentioned in earlier posts, there is now a new concept taking root that one-ups data warehouses. Data lakes are being used to address the fact that we have more unstructured data than we have structured data. Databases and data warehouses were designed only to handle structured data that easily fits into a data table.

Now we have to collect data from images, videos, blogs, comments and other places that are not easily converted to a value. Blending traditional structured warehouse data with these new types of data is not easily done in most data warehouses, so tools are being developed to bridge this gap.
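Here is a rough sketch of the difference between a join and a blend, written in plain Python. All of the tables, field names and values are invented for illustration. The blending step shown here (cleaning up an email address so it can act as a key) is just one simple way to create a common field, not how any particular tool does it.

```python
# Hypothetical tables; in a real warehouse these would come from queries.
customers = [
    {"customer_id": 1, "name": "Ana", "email": "ana@example.com"},
    {"customer_id": 2, "name": "Ben", "email": "ben@example.com"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 50.0},
    {"order_id": 11, "customer_id": 1, "amount": 25.0},
    {"order_id": 12, "customer_id": 3, "amount": 40.0},
]
# A source with no customer_id, and a messy email field.
survey = [{"email": " ANA@Example.com ", "score": 9}]


def inner_join(left, right, key):
    """Merge every left/right pair of rows that share the same key value."""
    index = {}
    for left_row in left:
        index.setdefault(left_row[key], []).append(left_row)
    joined = []
    for right_row in right:
        for left_row in index.get(right_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


def normalize_email(value):
    """Trim whitespace and lowercase so equivalent emails compare equal."""
    return value.strip().lower()


def blend_on_email(left, right):
    """Join two sources on email after cleaning the field on both sides."""
    left_clean = [{**row, "email": normalize_email(row["email"])} for row in left]
    right_clean = [{**row, "email": normalize_email(row["email"])} for row in right]
    return inner_join(left_clean, right_clean, "email")


# Join: both tables already share customer_id.
joined = inner_join(customers, orders, "customer_id")

# Blend: no shared ID, so a cleaned-up email acts as the common field.
blended = blend_on_email(customers, survey)

print(len(joined), "joined rows;", len(blended), "blended row")
```

Notice that order 12 drops out of the join because no customer has ID 3. Rows that quietly disappear like this are exactly why you need to understand what a join is doing.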

The lake is no longer a place just to fish, but also to do all the other things a lake can be used for.

So, when it comes to understanding data warehouses, learn who built and/or maintains yours and buy them a cup of coffee. Get your hands on the data dictionary, knowledge base, FAQ, metadata… whatever you can to map out the data environment. If you do that, then you can use the big data stored in a data warehouse to find the right data at the right time.

Q6: Can you provide some tips on how to manage data?

So you have the data lake; the data swamp, which is the messy version of the lake; and the data reservoir, which is the pristine, well-managed version of the lake.


Imagine how a reservoir of fresh water is used for multiple purposes… fishing, drinking, watering crops, providing electricity. That’s how your data should be structured. Even if you are working with multiple data sources made up of a lot of unstructured data from social media, you need to be organized with your data.

I’m willing to bet that if you are reading this then you are by nature pretty organized. Analysts tend to be. If you are working in a data swamp and the company culture is not data-driven, the best advice I can give you, no joke, is to find another job.

What to look for in a data-driven company? Are the data warehouses easy to use? Is there documentation on the data architecture? Is there a knowledge base? Are there experts and are they open to helping you?

If you say yes to questions like that, then your data management tasks are generally about optimization, data blending, adding new sources and being a kick ass analyst.

If you say no to questions like that, then your data management tasks are generally about cleaning data, lots of data validation and having your analysis be filled with caveats that you might be missing something.

So, a few tips I have for those in good data companies: keep your documentation fresh, drop a lot of breadcrumbs, and save your queries and models.

Keep the data architects, database admins and/or IT staff in your circle. Share with them how powerful your analysis is because of their help. And most importantly, show your mastery of the data lake. Tell your story. And teach others how to fish in it.

For those of you not so blessed with good data cultures, you have to start on both ends. Map out the data flow. Try to assess where the data goes bad. Is it the input or capture of the data, is it a loading process, is it filters? Once you get a start on the front end, then go to the back end.

Who needs the data? How much of the data being provided now is actually usable? Eliminate any unnecessary data. Basically, start cleaning up the swamp at the same time you map it. And again, tell this story. Don’t make excuses, but you do need to educate. Let people know there is a problem with the data and outline what you will do to correct for it.

In either case, before you go out and request or purchase new tools or start adding new data… make sure you have the architecture figured out. That’s the best tip I can give you about managing data.


The Fundamental of Business Analytics – Business Analytics is the application of talent, technology and technique on business data for the purpose of extracting insights and discovering opportunities. DMAIPH specializes in empowering organizations, schools, and businesses with a mastery of the fundamentals of business analytics. Contact DMAIPH now at analytics@dmaiph.com or connect with me directly to find out how you can strengthen your business analytics fundamentals.

Q5: What are some basic strategies an analyst can use to find the right data at the right time? – Part 2

How do you know if the data you are using is the right data to be using?

I can’t count the number of times I asked myself that question. In general, for just about every new analysis, project, research effort or whatever it is you are using data for, you have to ask that question at some point.

Even data that you have used a hundred times and that comes from a highly trusted source needs to be scrutinized.

Now, if you work with data every day in a familiar format, from the same source and with no changes to the data gathering and storage process, you don’t have to spend much time validating it. Usually you will spot problems when something just doesn’t look right while you are doing the analysis.

On the other hand, things get a whole lot trickier when you are using data from a source you don’t use often, when something has changed in the way the data is populated, or when it’s the first time you are using the data.

When this happens, I have a few suggestions on how to validate the data.

First off, pull the data, do your analysis and draw some conclusions. If it passes the eye test and it feels ok to you, then your job is just to validate it.

One simple way to do this is to pull the data again the exact same way to make sure you get the exact same data. Or change one parameter, like the dates used in the query, and see if that significantly alters the way the data looks and feels.
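As a rough sketch of those two checks, here is one way they could look in Python with the built-in sqlite3 module. The orders table, the dates and the amounts are all hypothetical, and hashing the result is just one convenient way to confirm two pulls are identical.

```python
import hashlib
import sqlite3

# Hypothetical orders table standing in for your real source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2015-01-05", 100.0),
        ("2015-01-20", 150.0),
        ("2015-02-03", 90.0),
        ("2015-02-17", 110.0),
    ],
)


def pull(connection, start, end):
    """Pull orders with start <= order_date < end, in a stable order."""
    query = (
        "SELECT order_date, amount FROM orders "
        "WHERE order_date >= ? AND order_date < ? "
        "ORDER BY order_date"
    )
    return connection.execute(query, (start, end)).fetchall()


def fingerprint(rows):
    """Hash the pulled rows so two pulls can be compared at a glance."""
    return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()


def summarize(rows):
    """Row count and total amount: a quick look and feel check."""
    amounts = [amount for _, amount in rows]
    return {"rows": len(amounts), "total": sum(amounts)}


# Check 1: the exact same pull should give the exact same data.
first = pull(conn, "2015-01-01", "2015-02-01")
second = pull(conn, "2015-01-01", "2015-02-01")
same_pull_matches = fingerprint(first) == fingerprint(second)

# Check 2: change one parameter (the dates) and compare the summaries.
shifted = pull(conn, "2015-02-01", "2015-03-01")

print(same_pull_matches, summarize(first), summarize(shifted))
```

If the repeat pull does not match, or if shifting the dates by a month swings the totals far more than the business would explain, that is your cue to dig into how the data is populated before publishing anything.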

Another option is to have someone else do the same thing independently. See if they get the same results you do. You can also find someone who knows the data to look over your work to see if it makes sense to them.

Whatever you do, the best way to prevent publishing or using bad data is to involve someone else. Not always possible, I know, but it’s the best way to go.

Another suggestion is to get the data, do some analysis and then step away for a while. Come back to it with fresh eyes. Don’t let your mind play tricks on you by making you see what you want to see and not what is really there.


I have seen several articles citing research that most of the time spent on data analysis actually goes into cleaning data. In a lot of businesses the data lake has become a data swamp, clogged with bad or unusable data. As the percentage of unstructured data increases daily, it’s easy to see how data swamps have become the norm. Even the most robust data collection and mining can run afoul if the data is not trustworthy.

So getting back to the last post… know how the data is populated. Who, when, why, how, how often, with what filters… things like that. I can’t stress this enough. No matter how good you are at analysis, or what tool you are using to do the analysis, if you don’t have an understanding of what happens to the data before it gets to you, then you are probably not drinking from a clean lake.

Q5: What are some basic strategies an analyst can use to find the right data at the right time? – Part 1

Several years ago I came across a book called The Accidental Analyst. After reading the book I was inspired to come up with a way to teach analytics to college students and fresh graduates.

The core of both the book and my program hinges on the ability of an analyst to find the right data at the right time. The authors suggested that identifying your data is where it all starts: identifying exactly what you need to address whatever it is that you need to report.

When I am training newbies, I generally break finding data into two parts… the process of getting the data and the process of making sure the data is valid.

Back at Wells Fargo, the single greatest attribute that I had that made me successful was my ability to size up how long it would take me to deliver something. Knowing what data I would need, where I would find it and how long it would take me to analyze it to come up with something useful made me somewhat of a wizard in the minds of the team.

Finding the right data at the right time requires you, first off, to know your data. You have to know how the data is captured, where it is stored and how it makes its way to you. Knowing the data architecture in your business is the key.

So you have to get to know the people who know where your data comes from and how it gets there. Learn from them. Partner with them. Buy them doughnuts.

A few months ago I came across an analogy being used to describe data in a business: that of a data lake. A data lake is the living, breathing, evolving pool of all the data in a business. If you have a good data architecture, and you can navigate it fairly easily, then you have a data lake. Ideally, your business has data structured in such a way you can live off it. Data to a business is like water to living things… it sustains life.


So once you have the lake mapped out, then you have to learn how to fish it. Knowing where the fish are biting is another key. Once you know what data you need, you have to know how to get to it quickly.

Business intelligence tools help us here, as do the coding languages used to extract data from a database. These are your fishing tools. You have to practice using them to be good at getting the right data at the right time.

Another way to optimize your data search is to save your work. Or, as I call it, leave yourself breadcrumbs. Save the query. Cut and paste the code into a document and save it. Write down the steps. Whatever you need to do to replicate what you just did so you can do it again in the future without starting over from scratch.

So to recap, how do you find the right data at the right time? You know its structure, you understand how it’s stored and you leave yourself clues to do things faster next time.
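A breadcrumb can be as simple as a text file, but here is a minimal sketch of a slightly more organized version in Python: a small log that stores each query with a name, a timestamp and a note. The file name, field names and sample query are my own assumptions, not a standard.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def leave_breadcrumb(log_path, name, query, notes):
    """Append one saved query, with notes, to a JSON-lines log file."""
    entry = {
        "name": name,
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "notes": notes,
    }
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def find_breadcrumbs(log_path, name):
    """Return every saved entry with the given name, oldest first."""
    with open(log_path, encoding="utf-8") as handle:
        entries = [json.loads(line) for line in handle if line.strip()]
    return [entry for entry in entries if entry["name"] == name]


# Hypothetical usage: a temp folder keeps the example self-contained.
log_path = Path(tempfile.mkdtemp()) / "query_log.jsonl"
leave_breadcrumb(
    log_path,
    "monthly_sales",
    "SELECT region, SUM(amount) FROM sales GROUP BY region",
    "Pulled for the March dashboard refresh.",
)

found = find_breadcrumbs(log_path, "monthly_sales")
print(found[0]["query"])
```

In practice you would point the log at a shared folder or a notebook you already use. The habit matters far more than the format.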

Now the other part of the equation is knowing if the data you are using is the right data. Finding data quickly doesn’t do you any good if you bring back the wrong data. We’ll talk about data validation and data quality in a future post.


Q3: What are some of the current trends in analytics?

Every few months I devote a day to discovering the current trends in analytics. I do this both to refresh the slides in my presentation and to refresh my mind to see what I may have missed.

The amount of literature out there on analytics continues to blossom at an amazing rate, making it a true challenge to stay well versed on what’s hot and what’s not. I read a new analytics-themed book about once a month and I have well over 200 blogs, web sites and social media groups cataloged. So I like to think I’m pretty up to date on what is current.

Every time I go to list the top 5 analytics trends, I find that some things change and some stay the same. Ever since I have been doing this, data visualization has been near the top. Business dashboards continue to be a big need. Business intelligence tools evolve and new ones pop up, but Tableau continues to be a market leader. 90% of us still use Excel for 90% of our analytics work.


Still, a lot has changed. When I started this just 5 years ago, no one was really talking about Big Data or Data Science. People had just started discussing predictive analytics, and now it’s all about prescriptive, even though most of us are still just doing descriptive analytics. For the newbie, descriptive = historical, predictive = forecast models, and prescriptive = really complicated models with a lot of variables to not just predict the future but to show a lot of alternatives as well.

Now if you talk to experts, they may think nothing I have mentioned so far is new. But to the novice analyst, or to the manager who really doesn’t care what it’s called and just wants results… it’s all new to them.

So each time I try to find something that is not just new to me but truly new to analytics. Six months ago that was the idea of using a data lake instead of a data warehouse. For those still unsure what a data warehouse is, it’s a collection of databases stored and/or connected centrally. The term data lake is used to describe the reality that more and more data is now unstructured data.

The discussion on what unstructured data is, and how best to mine it and integrate it with structured data, has really been at the forefront for a while now. We have gone from 80% structured to 90% unstructured in just a few short years, as mankind generates unprecedented amounts of data every day that are not easily captured in a database.

As of today, if I had to pick 5 topics to talk about, they would be (1) Hiring Data Science and Analytics Talent, (2) Big Data Analytics, (3) Data Warehousing and Data Lakes, (4) Data Blending and (5) Mining Public Unstructured Data.

Check back with me in a few weeks and this list will change.


Analytics Tip > Keep Your Data Clean

http://bicorner.com/2015/03/22/5-nuggets-from-the-big-data-driven-business/

Came across this interesting post on LinkedIn…

Database quality now has an unprecedented impact on the success of Big Data initiatives. To ensure that these databases are as productive as possible, Marketers must maintain good data hygiene.

Five steps for cleaner data:

1) Make sure your data entry team is keying in data accurately in the first place.  Make the data entry team a priority.

2) Incentivize your sales team, call-center squad and other customer facing employees to regularly request updated contact information and other data from the customers they encounter.

3) Use available software, such as Trillium, to streamline the process of cleansing, correcting and updating email and postal addresses.

4) Allow customers access to their records so they can help keep them accurate.  Consider offering discounts as an incentive for customers to participate.

5) Regularly contact customers, either via phone or email, to update records.  This approach is critical with the most important accounts.

Having clean data is very, very important.


I have my admin team refresh my connection data on LinkedIn on a regular basis so our mailing lists stay up to date.

We also have audits of our client pipeline to make sure all relevant applicant data is captured for analysis.

Make sure you put some thought into how to keep your data clean!

Analytics is the application of using data and analysis to discover patterns in data. DMAIPH specializes in empowering and enabling leaders, managers, professionals and students with a mastery of analytics fundamentals.

DMAIPH is also a founding member of the Analytics Council of the Philippines and specializes in arming the Data-Driven Leader with the tools and techniques they need to build and empower an analytics-centric organization. Analytics leadership requires a mastery of not just analytics skill, but also of nurturing an analytics culture. We have guided thousands of Filipino professionals to become better analytics leaders. Contact DMAIPH now at analytics@dmaiph.com or connect with me directly to discuss a uniquely tailored strategy to ensure you are at the top of your game when it comes to Analytics Leadership.