Staff turnover is expensive. The time and resources spent on recruitment, onboarding, and training are wasted when an employee leaves, and the departure adds loss of institutional knowledge to the company's bill. Most frustrating of all, the constant risk of employees leaving keeps the organisation from working proactively on the underlying causes of a high attrition rate.
Even today, many organisations rely on gut feeling or outdated data when designing preventive measures against employee turnover. Not surprisingly, the management team is caught off guard every time one of their top talents leaves the organisation.
Since the inception of Winningtemp, our product has helped around 1,200 companies visualise the state of employee well-being in real time. It enables managers and HR leaders to act on day-to-day data and quickly see the impact of various initiatives on overall results. This has been an immense step forward in defining the future of work.
However, it still did not give users churn indicators or the ability to identify the issues behind staff turnover.
Introducing Winningtemp Smart Prediction. Our data scientists have applied artificial intelligence and deep learning to make Winningtemp more intuitive and robust. The feature works with millions of data points to find patterns in real time and sends warning signals that notify managers of risks and opportunities.
The turnkey function adapts itself to your organisation's ecosystem, analyses the results, and transforms the time-series data into digestible information.
To extract insights about employee turnover from the data, we need to transform a stream of answers into an estimate of when each user will quit. We also need to model the uncertainty in these estimates so that we can accurately analyse risk over different time periods. This requires a model that can find and represent the intricate patterns inherent in users' answers.
The setup is illustrated in the diagram below: on the left, historical data with answers to different questions; on the right, a probability distribution over the time until the user quits.
This is a supervised learning problem where the data consists of variable-length sequences of events, and it is not immediately obvious how to represent the explanatory variables. A simple approach would be to calculate various aggregates over rolling time windows. We decided instead to feed the raw event stream directly into a Recurrent Neural Network (RNN), which can learn relevant features on its own.
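To make this concrete, here is a minimal sketch of how variable-length answer streams can be batched for an RNN in PyTorch. The feature encoding (two numbers per answer) and the user data are invented for illustration; padding plus packing is the standard PyTorch idiom for sequences of unequal length.

```python
import torch
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Hypothetical raw event streams: each user has a variable-length
# sequence of survey answers, encoded here as 2-dimensional vectors.
user_a = torch.tensor([[4.0, 1.0], [3.0, 0.0], [5.0, 1.0]])  # 3 answers
user_b = torch.tensor([[2.0, 0.0]])                          # 1 answer

lengths = torch.tensor([len(user_a), len(user_b)])

# Pad to a common length so the users fit in one batch tensor.
batch = pad_sequence([user_a, user_b], batch_first=True)  # (2, 3, 2)

# Packing tells the RNN to skip the padded positions entirely.
packed = pack_padded_sequence(batch, lengths, batch_first=True,
                              enforce_sorted=False)
```

The packed batch can be passed straight to an `nn.LSTM`, so no hand-crafted rolling-window aggregates are needed.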
The target variables, i.e. the network outputs, are the parameters of the probability distribution that describes when, and with what certainty, the model expects the user to quit. This differs from regular binary churn classification in one important way: we need not specify a fixed churn definition before training the model. The end result is an interpretable and flexible model that can be used to predict employee turnover over any time period.
When evaluating the accuracy of a model on historical data, we can see how well the probability distributions match the actual outcomes of users who have quit, primarily by evaluating how likely the model is to generate the same data. For currently employed users, however, all we know is that they have not quit before today's date. In Survival Analysis, this is called the censoring point, and the target for active users is to push the probability distribution beyond the point of censoring. By utilising all available data, every user, including the currently active ones, contributes to the model training process.
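The censored likelihood described above can be written down in a few lines. This is a generic Weibull survival log-likelihood, not Winningtemp's production loss: for a user who quit at time t we score the density f(t), and for a still-active user we score the survival probability S(t), i.e. the mass beyond the censoring point.

```python
import math

def weibull_log_likelihood(t: float, alpha: float, beta: float,
                           observed: bool) -> float:
    """Log-likelihood of one user under a Weibull(alpha, beta) model.

    t        -- time to quitting (observed=True) or to censoring (False)
    alpha    -- scale parameter, beta -- shape parameter
    observed -- True if the user actually quit, False if still employed
    """
    # log S(t) = -(t/alpha)^beta: mass pushed beyond time t
    log_survival = -((t / alpha) ** beta)
    if observed:
        # log f(t) = log hazard + log survival
        log_hazard = math.log(beta / alpha) + (beta - 1.0) * math.log(t / alpha)
        return log_hazard + log_survival
    # Censored: we only know the user survived past t
    return log_survival
```

Maximising the sum of this quantity over all users is what lets both leavers and currently active employees shape the model.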
Our approach is based on deep learning using recurrent neural networks (RNNs) with a long short-term memory (LSTM) architecture. The network's feedback connections allow the model to identify and retain patterns in sequences of answers. It is implemented in PyTorch, a Python framework for differentiable programming.
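A stripped-down sketch of such a network in PyTorch might look as follows. The class name, layer sizes, and feature count are illustrative assumptions, not Winningtemp's actual architecture; the essential idea is an LSTM whose final hidden state is mapped to two positive Weibull parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeibullRNN(nn.Module):
    """Maps a sequence of answer vectors to Weibull (alpha, beta).

    Illustrative sketch only: sizes and names are assumptions.
    """

    def __init__(self, n_features: int, hidden_size: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 2)

    def forward(self, x: torch.Tensor):
        # x: (batch, time, n_features), one answer vector per time step
        out, _ = self.lstm(x)
        params = self.head(out[:, -1])  # last hidden state per user
        # softplus keeps both Weibull parameters strictly positive
        alpha = F.softplus(params[:, 0]) + 1e-6  # scale
        beta = F.softplus(params[:, 1]) + 1e-6   # shape
        return alpha, beta

model = WeibullRNN(n_features=4)
alpha, beta = model(torch.randn(8, 10, 4))  # 8 users, 10 answers each
```

Training would minimise the negative censored Weibull log-likelihood of these outputs over all users.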
At the lowest level, the model output consists of the two parameters of a Weibull probability distribution, controlling its scale and shape. This approach is largely inspired by the thesis and accompanying blog post by Egil Martinsson. It allows us to further calculate:
The last item is derived from the ability to track the model's prediction over time, effectively allowing us to attribute employee turnover to each specific answer. This lets us construct per-group recommendations on which question categories should be prioritised to reduce employee turnover.
For an upcoming release, we are working on predicting the answers to single questions, generating a predictive index for each question category. This will help new customers focus on areas where their time is well spent and significantly reduce the time to their first insight.
We are also working on Natural Language Processing (NLP) models that will structure, and help navigate, the large amount of textual feedback given in the system. By modelling natural language, we can extract the essence of a text and connect it to other essential data.
Get nerdy with us! We are the dynamic team of data scientists and developers who spend their downtime innovating (while playing FIFA). Keep an eye on 'Science Behind Winningtemp' for a closer look at how the tool works.
If you are interested in finding out more about what Winningtemp can offer your organisation, get in contact with our sales team.