Q11: Can you next describe how to best use predictive analytics?

A look at how predictive analytics is used to help drive decision-making starts with a basic need to improve things. Someone once told me that despite all the advanced technology in our phones, cars, homes, workplaces… the world is a remarkably inefficient, wasteful place.

Two blogs ago, I defined predictive analytics as a process that takes data and extrapolates patterns to predict likely outcomes. Past, present, future… the goal being to provide educated guesses on what is most likely to happen next. The primary use of predictive analytics is to predict outcomes using models that will mitigate risk and eliminate choices based on unlikely outcomes.

For anyone who is familiar with Lean or Six Sigma, predictive analytics has a lot in common with process improvement methodologies. We take historical performance data and combine it with rules, algorithms, and occasionally external data to determine the probable future outcome of an event or the likelihood of a situation occurring. Once we see where we think things might go wrong, we make changes to prevent or at least mitigate that outcome.

Predictive analytics is used most extensively in places where you want to know the future like sales, marketing, and finance. To do this you need to build models. Models are not always simple and often take someone with both business experience and professional training in certain coding or programming languages.
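To give a feel for what the simplest possible model looks like, here is a minimal Python sketch that fits a straight-line trend to past monthly sales and extrapolates one month ahead. The sales figures and the function names (`fit_trend`, `forecast_next`) are made up for illustration. A real model would usually also deal with seasonality, external data and uncertainty, so treat this as a starting point rather than a recommended method.

```python
def fit_trend(values):
    """Fit a least-squares straight line through (0, v0), (1, v1), ...

    Returns (slope, intercept).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_next(values, steps_ahead=1):
    """Extend the fitted trend line `steps_ahead` periods past the last value."""
    slope, intercept = fit_trend(values)
    next_x = len(values) - 1 + steps_ahead
    return intercept + slope * next_x


# Hypothetical monthly sales figures (illustrative only, not real data).
monthly_sales = [100, 110, 120, 130, 140]
print(forecast_next(monthly_sales))
```

With perfectly steady growth like this, the forecast simply continues the line. The value of a real model comes from handling data that is not this tidy.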

In the hands of a good analyst, predictive analytics helps a business continually reinvent itself based not just on what happened, but on what is likely to happen.

This allows a wide range of organizational activities to be improved by predicting the behaviors and outcomes of people, the futures of individual customers, debtors, patients, criminal suspects, employees, and voters. It’s that generality that makes this technology so awesome.


Businesses that have good predictive analytics are much more likely to be successful over the long term. When you look at businesses that fail, it’s generally because they didn’t have an eye on the future.

If you are wondering how to take your descriptive analytics to the next level and start getting more into predictive analytics, let me know. I can help you figure out how to start using something besides the Magic 8 Ball to predict what lies ahead.

Q10: Please talk about how, when and why we should use descriptive analytics?

Going back to our previous definition of descriptive analytics, it is used to answer questions about what has happened in a business. Its primary use is to look at the current business situation with an eye towards looking for cause and effect. It helps one to understand how to manage in the present based on what happened in the past.

The vast majority of people who have attended my trainings on analytics are looking for help with descriptive analytics challenges. Using unstructured big data for predictive analytics modeling is not really something they are concerned with.

I have found that people who are really engaged with analytics are very driven to self-educate. They are driven by curiosity to make use of cutting edge stuff to tackle bigger and bigger challenges. For data scientists and really good analysts, descriptive analytics is easy and kinda boring.

But that is a small percentage of the people who use analytics every day. To most of my attendees, it’s more about how to cut down on the time it takes for them to prepare the reports they have to make and how to make them more useful to their bosses. That’s where most of my descriptive analytics training has an impact.

How to make a better report? How to build and maintain a simple business dashboard? How to have more impactful PowerPoint slides? How to streamline the reporting process? This is one way to look at descriptive analytics… it’s not just taking historical data and using it for reports, but also how to make the reports better.


So how can we use descriptive analytics? Well, we probably already are. Inventory control, payroll, performance management, quality assurance, sales reports, marketing results… all use forms of descriptive analytics. They take what happened, they look at it and then they make decisions.
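As a rough illustration of "take what happened and look at it", the Python sketch below summarizes a handful of sales records by region, the kind of thing a pivot table does in Excel. The records and the `summarize_by` helper are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical sales records (illustrative only, not real data).
sales = [
    {"region": "North", "amount": 1200.0},
    {"region": "South", "amount": 800.0},
    {"region": "North", "amount": 300.0},
    {"region": "South", "amount": 700.0},
    {"region": "East", "amount": 500.0},
]


def summarize_by(records, key, value):
    """Group records by `key` and return count, total and average of `value`."""
    groups = defaultdict(list)
    for row in records:
        groups[row[key]].append(row[value])
    return {
        name: {
            "count": len(vals),
            "total": sum(vals),
            "average": sum(vals) / len(vals),
        }
        for name, vals in groups.items()
    }


report = summarize_by(sales, "region", "amount")

# Print the biggest regions first, like a sorted sales report.
for region, stats in sorted(
    report.items(), key=lambda item: item[1]["total"], reverse=True
):
    print(
        f"{region:<6} count={stats['count']} "
        f"total={stats['total']:.2f} avg={stats['average']:.2f}"
    )
```

Nothing here predicts anything. It only describes what already happened, which is exactly the point of descriptive analytics.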

For the most part this can be, and is, done in Excel. If you want to supercharge what you do in Excel, then you can use a business intelligence tool to build dashboards and publish dynamic reports. This is where most people doing reports need help: how to better visualize the data so it has more power, and how to use BI tools to do things faster than can be done in Excel.

In many, many companies a lot of time and energy has been devoted to building reporting tools in house. And this is generally the problem. The reports are static and hard to change. If you are in a company like this, then descriptive analytics can be a bear.

To make the most of it, I suggest using free tools like Tableau Public to demonstrate new ways to analyze and report data, to get the boss interested in updating the way your company reports.

Another big challenge facing analysts doing mostly descriptive analytics in the form of reporting is blending data: taking data from different data sources and combining them. This can often be very manual and is generally done in Excel if your company hasn’t invested in a way to centrally store enterprise-wide data and make it easily accessible. There are some applications out there that can help you with this, Alteryx and QlikView being ones I have used, and they both have a free demo.

If you are already doing predictive analytics, then you probably have your descriptive analytics figured out.

So, if you need help supercharging your reporting, are looking to get started using business intelligence and data blending tools, and/or need to build a business case to invest more into analytics, let me know. I’m happy to help you come up with a much better way to build reports that have real impact and don’t take up all your time.

 

Prelude to Q10: Understanding the 3 different types of analytics.

The analytics efforts in a business are generally divided into three types: descriptive, predictive and prescriptive analytics.

A simple definition of descriptive analytics is that it is used to answer questions about what has happened in a business. Its primary use is to look at the current business situation with an eye towards looking for cause and effect. It helps one to understand how to manage in the present based on what happened in the past.

Per the Commission on Higher Education (CHED), descriptive analytics makes use of current transactions to enable managers to visualize how the company is performing. When the concept is taught, the focus is generally on analysis and reporting to guide decision-making.

Most businesses rely mainly on descriptive analytics in their analysis, reporting and decision-making.

[Image: Three Phases of Analytics]

I have to apologize to whoever made this image. I don’t know the source, but you have my thanks for making it.

As you can see in the image, predictive analytics takes data and extrapolates patterns to predict likely outcomes. Past, present, future… the goal being to provide educated guesses on what is most likely to happen next. The primary use of predictive analytics is to predict outcomes using models that will mitigate risk and eliminate choices based on unlikely outcomes.

Per CHED, predictive analytics allows voluminous data to be used for prediction, classification and association, making it a very useful tool for projections, forecasts, and correlations. Most lessons around predictive analytics involve data modeling and require a much higher degree of skill than descriptive analytics.

In general, predictive analytics is used by large companies in data-rich industries. Up until recently there were very few tools available to smaller businesses to add this type of analytics to their decision-making.

Prescriptive analytics goes one step further and finds the best course of action for a given situation. Its primary goal is to enhance decision-making by giving multiple outcomes based on multiple variables. A good analogy is how doctors prescribe medicine to patients based on a wide range of variables in a patient’s health, using an equally wide range of treatment options.

Per CHED, prescriptive analytics helps organizations develop insights to make decisions from the current data that maximize the organization’s goals. Prescriptive analytics not only anticipates what will happen and when it will happen, but also why it will happen. Largely, instruction takes the model building found in predictive analytics and supercharges it with more data, more choices and more outcomes.

Prescriptive analytics is fairly new and just now gaining widespread use in the corporate world. There are not many tools available that are cheap or easy to use. Generally, you find data scientists assigned to prescriptive analytics projects. It also takes us closer to some decision-making in a business being completely automated. With enough data on hand, using machine learning to analyze the data, we are starting to see artificial intelligence at play with prescriptive analytics. It is a pretty exciting time.

It’s important to keep in mind that to really be good at predictive and prescriptive analytics you need both the high tech tools and the training/experience to use them effectively.

 

Q9: Can you please describe the concepts of storing data in a data warehouse?

Twenty years ago data was mostly stored in databases. These databases housed all the data a business would need to do analytics. Transaction data, sales data, customer data, demographic data was all neatly collected, stored and analyzed in databases.

A surprising number of companies still store most of their data in databases. It works well for businesses that just need to look at historical data to conduct basic descriptive analytics.

About ten years ago the amount of data captured in a business and the growing diversity in data sources and data storage brought about the mainstream use of data warehouses in the business world.

A data warehouse is often a collection of databases interconnected so that data can be brought together into one place for reporting and analysis.

Whether you are working with a database or a data warehouse, you should have a basic understanding of how data is stored. It should be in table format, with column headers and data rows.

A good way to quickly assess the analytics culture of a business is to look at how data is shared among management. Does it look table-like? Or is it obvious that most of the time spent by the author was put into decorating? If you can’t easily sort something, then you are not dealing with a good data culture.

The best way to have a good data culture is to have well documented data structures. Any database admin worth their salt has the data hierarchy mapped out and has a knowledge base to help users know what data is in each field.

Like with finding data, being good at storing data starts with knowing the environment. Any good analyst should have a basic understanding of how to use SQL to query a data table. Even if you can’t do hard core coding, knowing how data is generally stored in a structure is key.
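To make the idea of a table with column headers and data rows concrete, here is a small Python sketch that uses the built-in `sqlite3` module to create a tiny table in memory and pull a query from it. The `sales` table, its columns and the sample rows are all invented for this example, and the data warehouse in your own business will almost certainly use a different database and different names.

```python
import sqlite3

# An in-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Column headers are defined once; every data row follows the same layout.
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (order_id, customer, region, amount) VALUES (?, ?, ?, ?)",
    [
        (1, "Acme", "North", 250.0),
        (2, "Bolt", "South", 90.0),
        (3, "Core", "North", 400.0),
    ],
)


def orders_in_region(connection, region):
    """Return (order_id, customer, amount) rows for one region, largest first."""
    cursor = connection.execute(
        "SELECT order_id, customer, amount "
        "FROM sales WHERE region = ? ORDER BY amount DESC",
        (region,),
    )
    return cursor.fetchall()


north_orders = orders_in_region(conn, "North")
print(north_orders)
```

The `?` placeholder keeps the region value separate from the SQL text, which is a good habit even in throwaway queries.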


Another important concept about data warehouses is knowing how to join or blend data from different sources. When you have multiple data tables in a warehouse you often need to join the data on a common field. Data blending goes one step further, as you are often trying to take data that doesn’t have a natural point in common that is easy to join on. Advanced data warehouses and data management tools can blend things easily, but it’s still important to understand the core concepts of how to join and blend data.
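Here is a minimal sketch of joining two tables on a common field, again using Python’s built-in `sqlite3` module. The `customers` and `orders` tables and the `customer_id` field are hypothetical. The example uses a LEFT JOIN so that a customer with no orders still shows up in the result instead of silently disappearing, which is one of the most common surprises when joining data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables that share the customer_id field.
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 90.0);
    """
)

# LEFT JOIN keeps every customer; COALESCE turns "no orders" into a total of 0.
join_sql = """
SELECT c.name,
       COUNT(o.order_id) AS order_count,
       COALESCE(SUM(o.amount), 0) AS total
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY c.name
"""

joined = conn.execute(join_sql).fetchall()
for name, order_count, total in joined:
    print(name, order_count, total)
```

Swapping LEFT JOIN for a plain JOIN would drop the customer with no orders, so it is worth checking row counts before and after any join.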

As I mentioned in earlier posts, there is now a new concept taking root that one-ups data warehouses. Data lakes are being used to address the fact that we have more unstructured data than we have structured data. Databases and data warehouses were designed only to handle structured data that easily fits into a data table.

Now we have to collect data from images, videos, blogs, comments and other places that are not easily converted to a value. Data blending across both traditional structured data warehouses and new types of data is not easily done in most data warehouses so tools are being developed to bridge this gap.

The lake is no longer a place just to fish, but also to do all the other things a lake can be used for.

So, when it comes to understanding data warehouses, learn who built and/or maintains it and buy them a cup of coffee. Get your hands on the data dictionary, knowledge base, FAQ, metadata… whatever you can to map out the data environment. If you do that, then you can use the big data stored in a data warehouse to find the right data at the right time.

Q8: Here’s something a lot of us are wondering, what exactly is big data?

Think about some of the things you do in your daily life. You get up, you eat, go to work/school, shop, do something for entertainment, bank, go online and do things on social media. Everything you do generates data. That data is captured in countless ways. And then it’s stored in countless places. And analyzed by countless numbers of people. And then used in countless ways by businesses to market, design, advertise, build, sell, and so on.

Every time you check your phone to see if there are any updates on Facebook you generate a lot of data for your phone manufacturer, your service provider and Facebook itself. Everything you like or comment on can be turned into a data point. The time, place and length of your connection all provide useful data. Get the point? It’s endless.

That’s big data.

In general, big data is thought of as all the data businesses capture and store in a database that they can use for business decision-making.

Think of data collections that have millions and millions of rows of data, like big bank transaction data, or traffic data for major cities, or all the statistics captured every day across professional sports. It is way too much for a person to analyze without help from technology. That’s all big data.

Every business defines its big data a little differently. There is no one way to look at how best to manage big data because big data is such a living, evolving, never ending flow of information. It’s like lakes of water that are too big to swim across and too deep to dive to the bottom of without help. And no two lakes are alike.

Data analysts and data scientists are the ones who know the lake and guide you across or build you a submarine to explore the bottom.

As I have mentioned in previous posts, knowing the data environment is key to your success. And big data just adds weight to that statement. If you don’t know where all the data is coming from, or can’t be sure if it’s clean, then you will get lost in the deluge of big data.

The Fundamental of Business Analytics – Business Analytics is the application of talent, technology and technique on business data for the purpose of extracting insights and discovering opportunities.

DMAIPH specializes in empowering organizations, schools, and businesses with a mastery of the fundamentals of business analytics. Contact DMAIPH now at analytics@dmaiph.com or connect with me directly to find out how you can strengthen your business analytics fundamentals.

 

 

Q7: What exactly is data science and why the rapid rise of data scientists?

A year ago I might have found it challenging to really answer this question. The first time I had heard of the term data science and a data scientist wasn’t that long ago. And I have been doing some pretty advanced analytics for close to 20 years now.  I know the term has been around in academic and research circles awhile longer, but 2014 is the first time I ever saw a job posting for data scientist in big business.

So what is data science? Besides simply being the study of data, it generally refers to using complex models, machine learning, predictive and prescriptive analytics and powerful technology to analyze business data in much greater volume, velocity and variety than was possible a few years ago.

And of course the ones charged with doing the data science are data scientists. They understand math, statistics, and theories that can be applied to business data using new technologies and methodologies.

The biggest challenge to being a true data scientist is that you have to be adept at both technology and working with people. Being a business data expert, knowing how to code and doing higher math are only half the job. You also have to share your data, communicate it in ways that drive action, and engage with non-data-centric people. It’s hard to find people who are good at both.


Image from Forbes Magazine. 

In addition, while some data scientists are educated to be data scientists, very, very few actually have any kind of degree in data science. That kind of degree really didn’t exist until very recently. Instead, most data scientists have advanced degrees in related subjects and have migrated into the business world due to market demand.

That demand has been growing at a staggering rate the past few years as every day we generate more and more data across the planet. President Obama first employed a data scientist for his campaign in 2012. The White House now has a chief data scientist position.

If you were to compare results from job board searches from 2012, you’d see maybe 100 data scientist job postings. Now it’s easily in the thousands. So that’s why the job market for data scientists is one of the hottest around: a lack of training programs, the need for both tech and people skills, and booming demand due to unending new data to be analyzed.

When people ask me if I’m a data scientist, I am careful with my answer. True data science is not something I am academically prepared for, nor have I ever published anything in a scholarly journal. But my real world experience working with data has made me an expert on many aspects of data science.

I guess I feel more like an analyst, but a freakin awesome analyst who can do a lot of things using data that are super important to a business.


Analytics Education – Facilitating a mastery of the fundamentals of analytics is what DMAIPH does best. As a key partner of the Data Science Philippines Meetup Group, DMAIPH champions the use of data. All across the world, companies are scrambling to hire analytics talent to optimize the big data they have in their businesses.

We can empower students and their instructors with the knowledge they need to prepare for careers in analytics. Contact DMAIPH now at analytics@dmaiph.com or connect with me directly so we can set a guest lecturer date, On-the-Job Training experience or other analytics education solution specifically tailored to your needs.

Q6: Can you provide some tips on how to manage data?

So you have the data lake; the messy version of the lake, or data swamp; and then the pristine, well-managed version of the data lake, called the data reservoir.


Imagine how a reservoir of fresh water is used for multiple purposes… fishing, drinking, watering crops, providing electricity. That’s how your data should be structured. Even if you are working with multiple data sources made up of a lot of unstructured data from social media, you need to be organized with your data.

I’m willing to bet that if you are reading this then you are by nature pretty organized. Analysts tend to be. If you are working in a data swamp and the company culture is not data-driven, the best advice I can give you, no joke, is to find another job.

What to look for in a data-driven company? Are the data warehouses easy to use? Is there documentation on the data architecture? Is there a knowledge base? Are there experts and are they open to helping you?

If you say yes to questions like that, then your data management tasks are generally about optimization, data blending, adding new sources and being a kick ass analyst.

If you say no to questions like that, then your data management tasks are generally about cleaning data, lots of data validation and having your analysis be filled with caveats that you might be missing something.

So a few tips I have for those in good data companies: keep your documentation fresh, do a lot of bread crumb dropping, save your queries and models.

Keep the data architects, database admins and/or IT staff in your circle. Share with them how powerful your analysis is because of their help. And most importantly, show your mastery of the data lake. Tell your story. And teach others how to fish in it.

For those of you not so blessed with good data cultures, you have to start on both ends. Map out the data flow. Try and assess where the data goes bad. Is it the input or capture of the data, is it a loading process, is it filters? Once you get a start on the front end, then go to the back end.

Who needs the data? How much of what data is being provided now is actually usable? Eliminate any unnecessary data. Basically start cleaning up the swamp at the same time you map it. And again tell this story. Don’t make excuses, but you do need to educate. Let people know there is a problem with the data and outline what you will do to correct for it.

In either case, before you go out and request or purchase new tools or start adding new data… make sure you have the architecture figured out. That’s the best tip I can give you about managing data.


Q5: What are some basic strategies an analyst can use to find the right data at the right time? – Part 2

How do you know if the data you are using is the right data to be using?

I can’t count the number of times I asked myself that question. In general, for just about every new analysis or project or research or whatever it is you are using data for, you have to ask that question at some point.

Even data you have used a hundred times and comes from a highly trusted source needs to be scrutinized.

Now if you work with data every day in a familiar format, from the same source and with no changes to the data gathering and storage process, you don’t have to spend much time validating it. Usually you will see problems when something just doesn’t look right when you are doing the analysis.

On the other hand, things get a whole lot trickier when you are using data from a source you don’t use often, or something has changed in the way the data is populated or if it’s the first time you are using the data.

When this happens, I have a few suggestions on how to validate the data.

First off, pull the data, do your analysis and draw some conclusions. If it passes the eye test and it feels ok to you, then your job is just to validate it.

One simple way to do this is pull the data again the exact same way to make sure you get the exact same data. Or change one parameter like the dates used in the query. See if that significantly alters the way the data looks and feels.
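The re-pull check can be sketched in a few lines of Python. This is only an illustration under some assumptions: each pull is a list of row tuples, the `fingerprint`, `same_pull` and `row_count_change` helpers are names I made up, and the sample rows are invented. The idea is to compare two pulls without eyeballing every row, and to measure how much the data moves when you change one parameter such as the date range.

```python
import hashlib


def fingerprint(rows):
    """Summarize a pull as (row_count, digest). Row order is ignored."""
    digest = hashlib.sha256()
    for line in sorted(repr(tuple(row)) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return len(rows), digest.hexdigest()


def same_pull(first, second):
    """True when two pulls contain exactly the same rows."""
    return fingerprint(first) == fingerprint(second)


def row_count_change(baseline, variant):
    """Relative change in row count between a baseline pull and a variant."""
    if not baseline:
        raise ValueError("baseline pull is empty")
    return (len(variant) - len(baseline)) / len(baseline)


# Hypothetical pulls (illustrative rows only, not real data).
pull_a = [(1, "2016-01-05", 250.0), (2, "2016-01-09", 90.0)]
pull_b = [(2, "2016-01-09", 90.0), (1, "2016-01-05", 250.0)]  # same rows, new order
pull_wider_dates = pull_a + [(3, "2016-02-02", 400.0)]

print(same_pull(pull_a, pull_b))
print(row_count_change(pull_a, pull_wider_dates))
```

If the same query returns a different fingerprint five minutes later, or a small date change swings the row count far more than you expected, that is your cue to dig into how the data is populated before you publish anything.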

Another option is to have someone else do the same thing independently. See if they get the same results you do. You can also find someone who knows the data to look over your work to see if it makes sense to them.

Whatever you do, the best way to prevent publishing or using bad data is to involve someone else. Not always possible, I know, but it’s the best way to go.

Another suggestion is to get the data, do some analysis and then step away for a while. Come back to it with fresh eyes. Don’t let your mind play tricks on you by making you see what you want to see and not what is really there.


I have seen several articles showing research that most of the time spent doing data analysis is actually spent cleaning data. In a lot of businesses the data lake has become a data swamp, clogged with bad or unusable data. As the % of unstructured data increases daily, it’s easy to see how data swamps have become the norm. Even the most robust data collection and mining can run afoul if the data is not trustworthy.

So getting back to the last post… know how the data is populated. Who, when, why, how, how often, with what filters… things like that. I can’t stress this enough. No matter how good you are at analysis, or what tool you are using to do the analysis, if you don’t have an understanding of what happens to the data before it gets to you then you are probably not drinking from a clean lake.

Q5: What are some basic strategies an analyst can use to find the right data at the right time? – Part 1

Several years ago I came across a book called The Accidental Analyst. After reading the book I was inspired to come up with a way to teach analytics to college students and fresh graduates.

The core of both the book and my program hinges on the ability of an analyst to find the right data at the right time. The authors suggested that identifying your data is where it all starts: identifying exactly what you need to address whatever it is that you need to report.

When I am training newbies, I generally break finding data into two parts… the process of getting the data and the process of making sure the data is valid.

Back at Wells Fargo, the single greatest attribute that I had that made me successful was my ability to size up how long it would take me to deliver something. Knowing what data I would need, where I would find it and how long it would take me to analyze it to come up with something useful made me somewhat of a wizard in the minds of the team.

Finding the right data at the right time requires one to first off know their data. You have to know how the data is captured, where it is stored and how it makes its way to you. Knowing the data architecture in your business is the key.

So you have to get to know the people who know where your data comes from and how it gets there. Learn from them. Partner with them. Buy them doughnuts.

A few months ago I came across an analogy being used to describe data in a business: that of a data lake. A data lake is the living, breathing, evolving pool of all the data in a business. If you have a good data architecture, and you can navigate it fairly easily, then you have a data lake. Ideally, your business has data structured in such a way you can live off it. Data to a business is like water to living things… it sustains life.


So once you have the lake mapped out, then you have to learn how to fish it. Knowing where the fish are biting is another key. Once you know what data you need, you have to know how to get to it quickly.

Business Intelligence tools help us here. As do coding languages used to extract data from a database. These are your fishing tools. You have to practice using them to be good at getting the right data at the right time.

Another way to optimize your data search is to save your work. Or, as I call it, leave yourself breadcrumbs. Save the query. Cut and paste the code into a document and save it. Write down the steps. Whatever you need to do to replicate what you just did so you can do it again in the future without starting over from scratch.
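One lightweight way to leave breadcrumbs is to append each query, with a name and a note, to a plain log file you can search later. The Python sketch below is just one possible approach. The `leave_breadcrumb` and `find_breadcrumbs` helpers, the log file name and the sample query are all hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def leave_breadcrumb(log_path, name, query, notes=""):
    """Append one saved query to a JSON-lines log and return the record."""
    record = {
        "name": name,
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "notes": notes,
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def find_breadcrumbs(log_path, name):
    """Return every saved record with the given name, oldest first."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    return [record for record in records if record["name"] == name]


if __name__ == "__main__":
    import tempfile

    # A temporary folder keeps the demo from leaving files behind.
    with tempfile.TemporaryDirectory() as folder:
        log = Path(folder) / "query_log.jsonl"
        leave_breadcrumb(
            log,
            "monthly_sales",
            "SELECT region, SUM(amount) FROM sales GROUP BY region",
            notes="Hypothetical query; source table refreshed nightly.",
        )
        print(find_breadcrumbs(log, "monthly_sales"))
```

A shared document or a folder of saved .sql files does the same job. What matters is that the query, the date and the context are written down somewhere you will actually look.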

So to recap, how do you find the right data at the right time? You know its structure, you understand how it’s stored and you leave yourself clues to do things faster next time.

Now the other part of the equation is knowing if the data you are using is the right data. Finding data quickly doesn’t do you any good if you bring back the wrong data. We’ll talk about data validation and data quality in a future post.


Q4: Can you please describe the current state of analytics in the Philippines? – Part 1

Let me tackle this question in two parts. The history major in me demands we look at how we got to where we are now before we talk too much about where we are going.

To start, both the appreciation for and the use of analytics have grown tremendously over the past few years. When I first started thinking about setting up a business in the Philippines back in 2011, hardly anyone knew much about analytics. Only big banks, large call centers, multinational corporations and the top schools were even talking about the concept.

It was a challenge to fill my initial training classes due to lack of general awareness. Even at industry events and conferences it was rare to hear much about the idea of using data to drive business decisions.

Doing a search on the top job board in the Philippines back in 2012 for jobs with analyst in the title netted about 1,000 job postings on any given day. The average salary was somewhere around 30,000 PHP a month. It was a challenge to find good talent and those who could do analytics were all gainfully employed.

It wasn’t until 2013 that I started seeing other analytics training options, and those were just ones being done by IBM to meet the CHED (Commission on Higher Education) memo requiring the implementation of a six-class elective track in business analytics. This was accompanied by the launch of Analytica, an IBM-backed effort to push the Philippines towards being a more viable option for analytics outsourcing.

At this time a job search for analyst would bring back about 1,500 jobs. Salaries were starting to rise for analysts as well with the market average getting closer to 50,000 PHP.  Still not a lot of public training or analytics centric organizations around then.

About the same time I started getting invited to schools on a regular basis to lecture about analytics to IT, CompSci and Management students. For the most part they had no idea of the career opportunities out there for those with analytics talent. I consulted with several schools on how to implement the CHED memo and how to prepare their students for analytics careers.

In 2014, an analyst job search was yielding closer to 2,000 open jobs. The average salary climbed north of 50,000 pesos for an experienced analyst. I did a lot more trainings, being able to routinely fill a class of people hungry to learn more about analytics and how it could help them in their jobs.

The most in-demand analytics skills up to this point were mainly centered on management reporting, production analysis and workforce management. Most analysts used some kind of proprietary database to store data and did just about all their analysis in Excel.

By 2015, analytics was finally in the mainstream. Job postings now routinely called for specific skill sets in programming languages and business intelligence tools. Multiple organizations made up of analytics professionals started coming together. The number of jobs open hit 2,500 on any given day and salaries for really good analysts hit 70,000 PHP a month. By this time, many outsourcing companies focused on setting up teams of analysts to offer analytics as an outsourcing option. Big data jobs and even data scientist positions started showing up in large numbers.

 

So here we are now in early 2016. The sky is the limit when it comes to Filipinos with analytics talent being able to enjoy good career growth and make substantial salaries. The schools are now starting to churn out talent with analytics careers in mind. Things look great on the supply side of analytics talent and the market growth opportunity for businesses offering analytics is huge.

An additional complexity in the analytics world is the vast number of tools out there to gather, store, analyze and present data. Although IBM is by far the biggest player in training people, they are not the universal solution when it comes to the methodologies and technologies people use every day.

The biggest challenge today is that the demand for analytics talent dwarfs the actual current and near-term talent supply. The global need for not just analysts, but also data scientists, has quickened to a point where catching up for the Philippines seems almost impossible.


HR & Recruitment Analytics is the application of talent, technology and technique on business data for the purpose of extracting insights and discovering opportunities. DMAIPH specializes in empowering organizations, schools, and businesses with a mastery of the fundamentals of business analytics.

The recruitment and retention of top talent is the biggest challenge facing just about every organization. You really have to Think Through The Box to come up with winning solutions to effectively attract, retain and manage talent in the Philippines today. DMAIPH is a leading expert in empowering HR & Recruitment teams with analytics techniques to optimize their talent acquisition and management processes.

Contact DMAIPH now at analytics@dmaiph.com or connect with me directly to learn how to get more analytics in your HR & Recruitment process so you can rise to the top in the ever quickening demand for top talent.