The cost of bad data: Hidden costs
What is bad supplier data costing you?
This is the second in our series of articles on ‘What is bad supplier data costing you?’ In our previous article, we looked at some real-world examples of how poor supplier data can lead to bad or inconsistent decision-making. In this article, we focus on the hidden costs of poor supplier data: where the main costs come from, why the problem is particularly significant in the area of supplier data, and the extent to which the cost to the business can be estimated.
In the world of supplier data, bad data adds cost in two ways: directly and indirectly. Indirect costs are potential costs – ones that may not yet have been incurred, or may never be incurred at all – such as the cost of unidentified risk or of supplier compliance issues. Direct costs, on the other hand, are already being incurred; they simply don’t appear as a line item labelled ‘bad data’ – hence ‘hidden,’ but certainly present. In this article, we turn our attention to these direct hidden costs being racked up by enterprises on an hourly basis.
The cost of bad supplier data is far higher than you would imagine
First, to put the cost of bad data into perspective, an article in the Harvard Business Review revealed that the yearly cost of poor quality data in the US alone was $3.1 trillion in 2016, according to an IBM estimate. It adds, “While most people who deal in data every day know that bad data is costly, this figure stuns.”
Other studies have shown that knowledge workers can spend up to 50% of their time locating data, identifying errors in it, or seeking confirmatory sources to support information they do not trust. Meanwhile, in a survey of data scientists, “60% said they spent most of their time cleaning and labeling data,” as opposed to putting it to use. More recently, an HICX study revealed that the situation remains challenging as far as supplier data is concerned: 89% of respondents said, for example, that they do not have total oversight of their suppliers.
Supplier data is especially prone to creating hidden cost
There are a number of reasons why supplier data is a particularly notorious culprit for creating hidden costs:
- Suppliers represent the highest spend for most enterprises. The high spend generates a huge number of transactions, each of which both creates data and requires ongoing data maintenance
- There is a corresponding need for accurate supplier data for BI/Analytics
- Relationships with suppliers are complex and require significant amounts of additional data input if relationships are to be managed successfully
- Suppliers are used across the entire enterprise and therefore data comes from a wide variety of areas, is used in many different contexts, and can potentially be altered by many different people
- Supplier data is frequently subservient to the ERP or driven by P2P or S2P systems, which can introduce duplications or inaccuracies
Taken together, it becomes obvious that supplier data, or rather poor supplier data, is going to drive a significant proportion, if not the highest share, of hidden data cost. In spite of this, it is frequently not prioritized as highly as other data in the enterprise. In our survey, 66% of procurement leaders agreed that data projects have stalled due to their relatively low priority when compared to other business initiatives. A key reason for the low level of urgency: 61% pointed to a perceived lack of return on investment – and part of that problem is, of course, the ‘hidden’ aspect of these costs.
Where do supplier data hidden costs come from?
The hidden direct costs to the enterprise fall into two categories:
- Internal hidden costs: time or resources spent building, maintaining or using bespoke stand-alone databases; using standalone Excel sheets; supporting manual work processes and interventions to manage bad data; and/or employing data scientists within the business, as examples
- External hidden costs: amounts spent on outsourced data cleansing exercises, for example; or the amount that suppliers pass on to customers to cover the higher administration costs of doing business with them
In terms of internal hidden costs, Excel is one environment in which hidden factories can quickly become established. While ‘free’ in the sense that the software is bundled onto most people’s desktops, using Excel amounts to ‘death by a thousand cuts.’ In some circumstances it serves its purpose, allowing rapid and fairly complex calculations, but this frequently escalates into spreadsheets becoming an embedded part of a business process – one which lacks documentation, data verification, version control and governance. This is where costs mount for every supplier data spreadsheet that exists. For a simple, quick, one-off spreadsheet that takes a few hours to complete, the benefits can outweigh the cost. However, as data sets grow in size and complexity, and the number of people involved rises, the costs rapidly begin to outweigh the benefits. This wastage is not measured, and its disproportionate inefficiencies are tolerated.
As Professor Richard Wilding, OBE, Professor and Chair in Supply Chain Strategy at Cranfield University explains: “You have got to have an appropriate supplier data management system, which enables you to capture all the various data that you need and also communicate that effectively across the organization – so that you’ve got that one version of the truth, which everybody can use. So you haven’t got people in warehousing having their own special spreadsheet of the ‘correct’ data which they need, and somebody else having their special spreadsheet of what they think is the correct data that they need. You just need one version of the truth across the whole business, and that’s a really important aspect when we’re thinking about the software that we’re using.”
The choice of software is therefore important, both in terms of its suitability for collaborative use and with regard to the hidden costs associated with it. These hidden costs (or the benefits of removing them) should be taken into account as part of the business case. One method is to audit the extent of the problem via a survey, determining how much time data users spend in Excel on spreadsheets related to supplier data, with a business objective of reducing this time. For instance, a survey that Excel with Business recently undertook indicates that knowledge workers use Excel for more than an hour every day on average, with finance and R&D professionals spending 2.5 and 3 hours per day respectively. Clearly it is neither realistic nor desirable to eradicate Excel from all activity related to supplier data, but if such an audit asked how many spreadsheets were regularly used by, say, three or more members of staff, one would get a good idea of the level of inefficiency, given the inherent challenges with Excel when it is used as part of a wider process.
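An audit along these lines can be summarized with a few lines of code. The sketch below is purely illustrative – the survey fields, hours and user counts are assumptions, not real audit data – but it shows how flagging spreadsheets shared by three or more people surfaces the likely hidden factories:

```python
# Sketch of summarizing a hypothetical audit of supplier-data spreadsheets.
# All figures and field names below are illustrative assumptions.
spreadsheets = [
    # hours per week spent in the sheet, and number of regular users
    {"hours_per_week": 3.0, "users": 1},
    {"hours_per_week": 5.0, "users": 4},
    {"hours_per_week": 2.5, "users": 6},
    {"hours_per_week": 1.0, "users": 2},
]

# Sheets used by three or more people are the ones most likely to be
# acting as an undocumented, ungoverned part of a business process.
shared = [s for s in spreadsheets if s["users"] >= 3]
shared_hours = sum(s["hours_per_week"] * s["users"] for s in shared)

print(f"Shared supplier-data spreadsheets: {len(shared)}")
print(f"Person-hours per week tied up in them: {shared_hours:.1f}")
```

Even this crude tally gives a defensible starting figure for the business case: hours per week that could be redirected if shared spreadsheets were replaced by a governed system.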
Can hidden costs be quantified?
As with anything ‘hidden’ – and as the example above shows – it is challenging, if not impossible, to measure with 100% accuracy. There are many inter-related variables, some of which may not be recorded: the extent of hidden factories, the amount of time knowledge workers spend on manual fixes to data, or the time employees spend dealing with the consequences of bad data.
However, that should not be a barrier to trying. Quantification is the key to addressing the perceived lack of ROI cited by 61% of procurement leaders, and the exercise can inform at least a baseline for a business case.
Assumptions can be made and hypotheses tested that at a minimum provide an initial benchmark of the extent of a hidden cost, which helps to add perspective to the impact within a given scenario or use case. One methodology for this is the 1-10-100 quality principle, developed by G. Labovitz and Y. Chang, which states that the relative cost of a data problem rises by an order of magnitude at each stage: roughly one unit to verify a record at the point of entry, ten to correct it later, and a hundred once the bad data has caused a failure.
It is important to bear in mind that this is a principle to provide an illustration of escalating costs, as opposed to calculating a precise number.
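The escalation the principle describes can be sketched in a few lines. The stage names and unit costs below follow the 1-10-100 ratio; applying them to a concrete record count is our own illustrative assumption:

```python
# Illustrative sketch of the 1-10-100 quality principle: the later a bad
# record is dealt with, the higher the relative cost per record.
COST_PER_RECORD = {
    "prevention": 1,    # verify the record at the point of entry
    "correction": 10,   # fix the record later, e.g. during reconciliation
    "failure": 100,     # cost once the bad record causes a failed outcome
}

def relative_cost(bad_records: int, stage: str) -> int:
    """Relative cost of handling `bad_records` at the given stage."""
    return bad_records * COST_PER_RECORD[stage]

# 100 bad supplier records: preventing them costs 100 units,
# while letting them cause failures costs 10,000 units.
print(relative_cost(100, "prevention"))  # 100
print(relative_cost(100, "failure"))     # 10000
```

The absolute numbers are not the point; the hundredfold gap between prevention and failure is what makes the escalation vivid in a business case.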
Example: Bank data
For example, we ran a use case against bank data, checking an aggregated sample to establish the proportion of bank account details that would result in a failed transaction. Incorrect details amounted to 0.95% of the total volume of data in this field. Not bad, you may think. However, in a live environment this could mean that up to 1% of invoices fail and therefore require a manual intervention. If each of those interventions takes an individual, as a conservative estimate, half an hour to reconcile (find the error, investigate, communicate with stakeholders, fix the error), then thirty minutes accrue for every failed invoice – roughly 1% of all invoices processed by an enterprise – which quickly amounts to a huge number of wasted hours and unnecessary ‘hidden’ cost from this one use case alone. That is, of course, excluding the administration costs incurred by the supplier on their side to follow up on non-payment. It is easy to see the 1-10-100 quality model coming into play here, even before considering any indirect costs of risk or fraud, should they occur.
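The arithmetic behind this example is worth making explicit. The error rate and half-hour fix time come from the use case above; the invoice volume and hourly labour rate are hypothetical assumptions chosen only to show the mechanics:

```python
# Back-of-envelope estimate of the hidden cost of failed invoices,
# using the error rate and fix time from the bank-data example.
ERROR_RATE = 0.0095          # 0.95% of bank details incorrect (from the sample)
FIX_TIME_HOURS = 0.5         # conservative half hour per manual intervention
INVOICES_PER_YEAR = 500_000  # assumed enterprise invoice volume
HOURLY_RATE = 40.0           # assumed fully loaded hourly cost of an AP employee

failed_invoices = INVOICES_PER_YEAR * ERROR_RATE
wasted_hours = failed_invoices * FIX_TIME_HOURS
hidden_cost = wasted_hours * HOURLY_RATE

print(f"Failed invoices per year: {failed_invoices:,.0f}")
print(f"Hours lost to manual fixes: {wasted_hours:,.0f}")
print(f"Estimated hidden cost: ${hidden_cost:,.0f}")
```

With these assumed inputs, a sub-1% error rate still translates into thousands of lost hours a year, before any supplier-side administration or indirect risk costs are counted.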
While these intangible costs cannot be calculated precisely, this approach gives an enterprise a mechanism to estimate, in numerical terms, how much bad data is likely costing within specific scenarios – or, on the flip side, the scope for significant savings – and it can therefore provide the basis of a business case to pitch for priority.
As Peter Smith, Past President of the Chartered Institute of Purchasing and Supply, explains, the emphasis must be on articulating the issue in a way the business understands: “I’m a great believer in telling stories and having good data. If you can tell stories but also relate it to data, you could say, ‘Do you realize we handle x million invoices a year and we pay out this much money to suppliers… If only 1% of that was going to the wrong places, that equates to x million.’ I think it’s a combination of data, to describe stories that resonate and that are remembered, linked to the organizational goals and priorities.”
If you found this blog useful, then you may want to check out our other detailed resources as well, covering different aspects of master data management, data cleansing, data governance and more.