Editor’s Note: Predictive analytics is a rapidly and continuously evolving field. Feel free to check out an updated version of this article with all new data.
This one is for you, Bill Kutik.
At Visier, the security and privacy of our customer data is our highest priority. All customer data analyzed to validate our predictive analytics capabilities was anonymized.
Many HR software vendors talk the talk of predicting “at risk employees,” but how many can prove they walk the walk… and that their predictions actually work?
Recently, HR industry expert and father of the HR Tech conference Bill Kutik wrote a column for HR Executive Online about employee flight risk and the hype around predictive workforce analytics. He quoted Constellation Research analyst Holger Mueller as saying, “it all comes down to whether the models ‘really work’ when applied across a variety of customers with very different data landscapes.”
As usual, he and Holger are right.
So, how can you ensure a vendor’s claim to predict employee retention risks is valid? What should you look for?
First, why does predicting “risk of exit” even matter?
In PwC’s 17th Annual Global CEO Survey, CEOs identified “talent strategies” as the number one area they are focused on changing to capitalize on global trends. For many HR professionals, this doesn’t come as a surprise. Since the peak of the recession in 2009, the number of unemployed persons per job opening in the US has steadily declined, and is now back at pre-recession levels.
As a result, retention is a key objective for most HR organizations — understandably. In an attempt to quantify the impact of attrition, many have tried to connect turnover to business impact. In one of the most comprehensive studies that analyzed data from 48 separate studies, turnover was shown to have real impact on financial results, customer service, labor productivity, and safety outcomes.
Many more have tried to quantify the impact of turnover by estimating the direct and indirect costs. While many opinions have been shared, the research results on the costs associated with attrition remain varied, largely because the roles and the factors considered also vary. A full accounting needs to extend beyond hiring and training to include separation, productivity, and lost knowledge. Our own scan of the available literature led us to the following result:
At a company with 5000 exempt employees (such as administrative, executive, and professional employees, computer professionals, and sales employees) with a voluntary turnover rate of 10% (over 1 percent lower than the average rate across industries in 2014), even conservative estimates can convert unwanted turnover into more than $30 million in replacement costs in a single year.
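To make that arithmetic explicit, here is a minimal sketch using only the figures quoted above. The function name is our own, and the per-departure figure is simply what the $30 million total implies, not a number taken from any of the studies.

```python
def implied_cost_per_departure(headcount, turnover_rate, total_replacement_cost):
    """Return (expected departures, replacement cost implied per departure)."""
    # round() guards against floating-point noise in headcount * rate
    departures = round(headcount * turnover_rate)
    return departures, total_replacement_cost / departures


# Figures from the paragraph above: 5000 employees, 10% voluntary turnover,
# and (at least) $30 million in replacement costs per year.
departures, per_departure = implied_cost_per_departure(5000, 0.10, 30_000_000)
print(f"{departures} departures, ${per_departure:,.0f} implied per departure")
```

In other words, "more than $30 million" corresponds to 500 departures a year at more than $60,000 each.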
The bottom line can be hit further by spending on well intended, but misapplied, retention tactics, such as raises, bonuses, and/or promotions — put in place by HR and/or managers in an attempt to prevent resignations. When these tactics are applied without hard data to back them up, their results can be limited. Worse, money can be spent needlessly to retain people who are not actually at risk of leaving.
Rather than applying retention programs across the organization, HR can sharpen its focus and apply dollars where they will have the most impact.
If you can leverage predictive analytics to correctly identify employees who are at risk of leaving — in particular, top performers and people in key roles — you can avoid these costs, while also enabling productivity and performance gains.
The key word is correctly.
Why is proving that predictive analytics work so hard?
First, with any predictive model, you need a means of checking your predictions against real outcomes. At Visier, when we put our own data scientists to work on validating our “at risk employee” predictive capabilities, they immediately identified that a minimum of 2-3 years of data is required for the analysis to be valid (the more the better). It’s like the statement most parents have made to their kids at some point, “How do you know you don’t like it, if you haven’t tried it?” Or, in our case, how do you know the predictions are working, if you haven’t made a prediction that can be validated against real outcomes?
Secondly, the patterns behind why people make decisions cannot be boiled down to simple factors – marketers have been trying to figure that out for years. It is “data with feelings”, and to find the patterns inherent in such data requires looking across as many varied sources of information as possible. Like mining for gold, the wider your search the more likely you are going to find the hidden nugget – of insight in the case of predictive analytics.
Thirdly, the accuracy of the predictions depends on the data used to create the model. For instance, if a model is created based on the factors inherent at one company, it doesn’t necessarily apply at a second company. Compounding this challenge, the same may be true about a model for one year compared to the next year within the same company. Approaches need to take this dynamic nature into account.
The problem is that most “at risk” predictive analytics capabilities available today are in their infancy — they have simply not been used for long enough by enough companies for enough employees on enough sources of data.
How did we validate Visier’s “at risk” predictive workforce analytics technology?
We recently validated that Visier’s “at risk” predictive analytics technology is up to 8x more accurate at predicting who will resign over the next 3 months than guesswork or intuition — and up to 10x more accurate if you focus on the top 100 “at risk” employees.
To understand how we did this, you first need to understand how Visier works.
Visier applies what could be referred to as Continuous Machine Learning. For each customer, Visier’s patented in-memory multidimensional analytics technology looks back over 18 months at all employees and potentially hundreds of employee variables or attributes, as well as the groupings within each attribute, and determines how strongly each correlates with the employee resignations that occurred during that 18-month period. Visier then automatically assigns each employee an “at risk” score, essentially ranking each employee from highest to lowest risk.
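The general idea of scoring employees by how their attribute groupings relate to past resignations can be illustrated with a toy sketch. This is not Visier’s algorithm, which is not described here in detail. It assumes a deliberately simple rule (average the historical resignation rates of the groups an employee belongs to), and the function names, attribute names, and data are all hypothetical.

```python
from collections import defaultdict


def resignation_rates(history, attributes):
    """Resignation rate of each (attribute, value) group in the look-back window."""
    counts = defaultdict(lambda: [0, 0])  # (attribute, value) -> [resigned, total]
    for record in history:
        for attr in attributes:
            group = counts[(attr, record[attr])]
            group[0] += record["resigned"]
            group[1] += 1
    return {key: resigned / total for key, (resigned, total) in counts.items()}


def rank_by_risk(employees, history, attributes):
    """Return employee ids ordered from highest to lowest illustrative risk score."""
    rates = resignation_rates(history, attributes)
    overall = sum(r["resigned"] for r in history) / len(history)
    scored = []
    for emp in employees:
        # Groups never seen in the history fall back to the overall rate.
        values = [rates.get((a, emp[a]), overall) for a in attributes]
        scored.append((sum(values) / len(values), emp["id"]))
    return [emp_id for _, emp_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical look-back data: 1 = resigned during the window, 0 = stayed.
history = [
    {"role": "sales", "tenure": "<2y", "resigned": 1},
    {"role": "sales", "tenure": "<2y", "resigned": 1},
    {"role": "sales", "tenure": "5y+", "resigned": 0},
    {"role": "engineer", "tenure": "<2y", "resigned": 1},
    {"role": "engineer", "tenure": "5y+", "resigned": 0},
    {"role": "engineer", "tenure": "5y+", "resigned": 0},
]
employees = [
    {"id": "A", "role": "engineer", "tenure": "5y+"},
    {"id": "B", "role": "sales", "tenure": "<2y"},
    {"id": "C", "role": "engineer", "tenure": "<2y"},
]
ranking = rank_by_risk(employees, history, ["role", "tenure"])
print(ranking)
```

Re-running the same calculation whenever the look-back window moves forward is what makes such a score "continuous": the group rates, and therefore the ranking, shift as new resignations enter the window and old ones drop out.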
All of this is calculated dynamically and instantly, so when an HR analyst, business partner, or leader goes into Visier and asks which employees are “at risk” in a specific employee sub-group (for instance, specifying a role, location, tenure, and performance level), Visier automatically provides the relevant results, based on the latest, most current data applicable to the user.
Visier does not artificially limit or restrict what data is analyzed. Rather, Visier always considers all the attributes for the group of employees being looked at, and because Visier was built to analyze all employee data, we are not limited like applications that manage only a portion of the employee lifecycle. Visier looks at all the employee attributes collected in all HR transactional systems, from payroll to HR management to talent acquisition to recognition and so on.
A typical approach by many “at risk” predictors is to perform what is called a regression analysis on a select group of attributes. For instance, a manager thinks salary and tenure have something to do with resignations, so the analyst looks for a correlation between those attributes and resignations to see if the manager is right. They may indeed prove to be right, but in artificially limiting the analysis, several other attributes that relate to resignations may be missed. More sophisticated methods may train algorithms with various machine learning techniques; however, people leave companies for a variety of often surprising reasons and those reasons change over time. For the best results, all potential reasons (aka employee attributes) should be considered, and the results should be calculated continuously to take into account changing conditions.
In validating Visier’s approach, we went back as many years as possible for each customer (a minimum of 3 years of history to a maximum of 5) for over 140,000 employees across several companies, and asked, during the time period:
- What was the company’s actual resignation rate?
- What was the company’s actual resignation rate for the employees Visier identified as most “at risk”?
- How did the two differ?
To be more specific, we actually looked (anonymously!) at the exact employees who Visier predicted would resign, and determined how many of them actually had resigned. To determine how the two differ, and how well we predicted outcomes, we compared the “at risk” groups we identified to actuals. However, because Visier gives a risk score to every employee, this leaves open the question of where to draw the line: who counts as “at risk”?
To define an “at risk” population, we looked at multiple techniques to validate our approach, and varied how aggressive or conservative we were in defining who was at risk. To illustrate this, one simple approach we took was to look at the top of the list, which we defined as being the predicted 100 highest risk employees (“top 100”), and ask how right we were in identifying that group as likely to leave, compared to everyone else. This means that if you had a turnover rate of 10% (so you would expect 10 out of any sample of 100 employees to leave), yet 50 of the “top 100” left, then Visier would be 5x better. Here our predictions were up to 10x better; at a 10% turnover rate, that means all of the “top 100” Visier identified did indeed resign.
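The “top 100” comparison described above is a lift calculation: the resignation rate among the highest-scored employees divided by the overall resignation rate. Here is a minimal sketch of it. The function name and the synthetic data are our own, chosen to reproduce the 10% turnover, 50-of-100 example from the text.

```python
def top_n_lift(scores, resigned, n=100):
    """Resignation rate among the n highest-scored employees / overall rate.

    scores:   dict mapping employee id -> predicted risk score
    resigned: set of employee ids who actually resigned in the period
    """
    top = sorted(scores, key=scores.get, reverse=True)[:n]
    hits = sum(1 for emp in top if emp in resigned)
    # (hits / len(top)) / (len(resigned) / len(scores)), written as one division
    return hits * len(scores) / (len(top) * len(resigned))


# Synthetic population: 1000 employees, lower ids scored as riskier.
scores = {emp: 1000 - emp for emp in range(1000)}
# 100 actual resignations (10% turnover), 50 of them inside the top 100.
resigned = set(range(50)) | set(range(500, 550))

lift = top_n_lift(scores, resigned, n=100)
print(f"top-100 lift: {lift}x")
```

With the same 10% turnover, a lift of 10x is the ceiling for this measure, since it is reached only when every one of the top 100 resigns.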
As a tougher test of our approach, we also compared everyone who actually left the organization directly against those who we had predicted would likely leave. So, if 500 people actually resigned, how did that 500 compare to who we predicted to leave? Did we identify the same 500 people? Predicting at 100% is impossible as the reasons that people leave an organization may include personal reasons not captured in any source of data for analysis, but on average Visier performed 3.5x better, and up to 8x better.
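One way to read this tougher test is sketched below. If k people actually resigned, take the k highest-scored employees as the predicted leavers, count the overlap, and compare the hit rate with the base resignation rate. The exact metric used in the validation is not spelled out above, so treat this as an assumption. The function name and the synthetic numbers are illustrative only.

```python
def same_size_lift(scores, resigned):
    """Compare the k highest-scored employees with the k who actually resigned.

    Returns (hits, lift), where lift is the hit rate among the predicted
    leavers divided by the overall resignation rate.
    """
    k = len(resigned)
    predicted = set(sorted(scores, key=scores.get, reverse=True)[:k])
    hits = len(predicted & resigned)
    # (hits / k) / (k / N), written as a single division
    return hits, hits * len(scores) / (k * k)


# Synthetic population: 5000 employees, lower ids scored as riskier.
scores = {emp: 5000 - emp for emp in range(5000)}
# 500 actual resignations, 175 of which fall inside the predicted top 500.
resigned = set(range(175)) | set(range(1000, 1325))

hits, lift = same_size_lift(scores, resigned)
print(f"{hits} of {len(resigned)} identified, {lift}x better than the base rate")
```

Under this reading, picking 500 names at random from 5000 would be expected to catch about 50 of the actual leavers, so catching 175 is 3.5x better.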
Note, the size of the company’s annual resignation rate did not make a difference to our results — customers with high resignation rates had the same results as customers with low resignation rates. And, the longer we looked forward in time from the prediction, the better Visier’s predictive analytics performed.
Hold on! Shouldn’t predicting “at risk” employees actually make fewer of them leave?
Yes! That’s another tough part of measuring the effectiveness of predictive analytics: when you predict correctly and people act on the findings, your results actually appear to be worse. Such is life!
HR’s critical role in predictive analytics
As I explained in my blog post about the Myths of Predictive Analytics, despite the hype, predictive analytics will not replace human intervention: they won’t tell you the one clear course of action to take, particularly when dealing with data that has feelings.
Predictive analytics is about more than who will leave; it is about why they are leaving. Consider the Key Drivers chart below that shows the employee attributes (for the group of employees being considered) that most relate to resignations (on the right) and retention (on the left). In many ways, this sort of prediction is more valuable than naming individuals, as it allows HR to develop thoughtful, refined, long-term programs to reduce resignation rates by targeting root causes. One of our customers recently told us that, because this chart provided them with such clear insight into the factors driving resignation rates, they were able to identify the at-risk employees themselves.
That said, keep in mind that your own company’s journey to data-driven HR doesn’t need to (and probably shouldn’t) start with predictive analytics. Consider a “crawl-walk-run” approach in your graduation from metrics to true Workforce Intelligence. When you are ready to implement predictive analytics, consider these five ways to implement them effectively, provided in Enhancing HR’s Strategic Capabilities through Predictive Analytics, Bersin by Deloitte, Deloitte Consulting LLP, June 2015.
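A driver view of this kind can be approximated by comparing each attribute group’s resignation rate with the overall rate. Groups above the overall rate sit on the resignation side, and groups below it sit on the retention side. The sketch below is a simplified illustration, not the calculation behind the Key Drivers chart. The attribute names, the `min_group_size` cutoff, and the data are hypothetical.

```python
from collections import defaultdict


def key_drivers(records, attributes, min_group_size=2):
    """Return (rate difference vs. overall, (attribute, value)), largest first."""
    overall = sum(r["resigned"] for r in records) / len(records)
    groups = defaultdict(list)
    for r in records:
        for attr in attributes:
            groups[(attr, r[attr])].append(r["resigned"])
    drivers = [
        (sum(outcomes) / len(outcomes) - overall, key)
        for key, outcomes in groups.items()
        if len(outcomes) >= min_group_size  # skip groups too small to trust
    ]
    return sorted(drivers, reverse=True)


# Hypothetical records: 1 = resigned, 0 = stayed.
records = [
    {"manager_changed": "yes", "commute": "long", "resigned": 1},
    {"manager_changed": "yes", "commute": "short", "resigned": 1},
    {"manager_changed": "yes", "commute": "long", "resigned": 1},
    {"manager_changed": "no", "commute": "long", "resigned": 0},
    {"manager_changed": "no", "commute": "short", "resigned": 0},
    {"manager_changed": "no", "commute": "short", "resigned": 0},
    {"manager_changed": "no", "commute": "short", "resigned": 0},
    {"manager_changed": "no", "commute": "long", "resigned": 1},
]
drivers = key_drivers(records, ["manager_changed", "commute"])
for diff, (attr, value) in drivers:
    side = "resignation" if diff > 0 else "retention"
    print(f"{attr}={value}: {diff:+.2f} vs overall ({side} side)")
```

A difference in rates shows association only. Whether a group’s attribute actually causes people to leave is still a question for HR to investigate.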