Fixing your supply chain with Big Data

ISE Magazine Volume: 49, Number: 04
By Alexander Klein

Distilling millions of granules of information into operational decisions will separate tomorrow’s winners and losers

What is big data and why is it so important in supply chain management today? How do we get past the mere buzzword appeal of capturing an audience’s attention with trendy aphorisms? The term “big data” is being applied to situations where analysts, managers and continuous improvement teams need to weed through massive amounts of data to support decision-making. On the quality side of the fence, it is just as important to make good data available to decision-makers in a format that is ready for them to use. Wikipedia defines big data as “… a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them.”

Data quality/quantity

The quality end of the spectrum offers a completely different view. Trust and diligence must be placed both in the originator of the data and the processing and storage of the information. Even in this advanced age of data manipulation, a lot of work must happen to ensure stable and accurate data. It’s not uncommon to experience situations where you need to remove more than 70 percent of a massive data set due to missing or corrupt data elements within the population. Is that data really useable? In some situations, you might have no other choice and will forge ahead with heavy disclaimers. In other cases, you would merely have to wait out another cycle of activity using the data set analysis to direct improvements in the collection, storage and manipulation process.
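
As a rough illustration of that kind of screening, the Python sketch below counts how much of a data set would have to be set aside because required fields are missing or empty. The field names (shipment_id, pickup_ts, delivery_ts) and the sample records are invented for illustration and are not drawn from any real system.

```python
def completeness_report(records, required_fields):
    """Split out usable records and report the fraction that was removed.

    A record counts as usable only if every required field is present
    and is neither None nor an empty string.
    """
    usable = [
        record for record in records
        if all(record.get(field) not in (None, "") for field in required_fields)
    ]
    total = len(records)
    fraction_removed = (total - len(usable)) / total if total else 0.0
    return usable, fraction_removed


# Hypothetical sample data: three of the four records are incomplete.
shipments = [
    {"shipment_id": "S1", "pickup_ts": "2017-01-03", "delivery_ts": "2017-01-09"},
    {"shipment_id": "S2", "pickup_ts": "", "delivery_ts": "2017-01-10"},
    {"shipment_id": "S3", "pickup_ts": "2017-01-04", "delivery_ts": None},
    {"shipment_id": "S4", "pickup_ts": "2017-01-05"},
]

usable, fraction_removed = completeness_report(
    shipments, ["shipment_id", "pickup_ts", "delivery_ts"]
)
print(f"{len(usable)} usable records, {fraction_removed:.0%} removed")
```

Reporting the removed fraction alongside any analysis is one simple way to attach the "heavy disclaimer" when the team has no choice but to forge ahead.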


Once the quality and quantity aspects of the raw data have been addressed, the next challenge is presenting the information so that it supports quick decisions with high levels of confidence.

Enter the operations research field, which appears to be laser-focused on querying and using data. Advanced tools and knowledge of database management can harness the power of analytics at the appropriate level. At the same time, enter the newly marketed presentation methods like dashboarding and live views of activities taking place in the supply chain, and we find ourselves meeting the definition of big data head on. Over the past several years, reporting capabilities in Excel have been pushed almost to their limit, and users were screaming for more. Along came dashboard tools like Xcelsius and Tableau that understood the shortcomings of traditional workbook data reporting and offered not only an alternative but an unbeatable capability. No more static reports that need to be generated daily, weekly, monthly, quarterly or annually. Instead, we now have live views of the actual data in any format the user or decision-maker wants.

The most common use of big data in third-party logistics freight management is found in generating the data and reporting performance.

Typically, the two entities agree upon performance measurements and levels at the outset of their relationship. You could say that measurement and performance levels feed on each other, and, as the old saying goes, you cannot fix what you cannot measure. So the idea is to collect all of the data regarding everything: every movement, every time stamp, every dimension, every reason code, etc. The data needs to be collected in a systemic and systematic way to permit easy retrieval and manipulation. In short, a lot of thought needs to go into the upfront work.

Once that is completed, then you turn your focus to how you present the output. You need to understand how the customer wants to view the data to make decisions or improvements. Lessons learned from the old A3 format tell us to put as much information as possible on one page. This permits decision-makers to view trends that normally wouldn’t be seen and pick out how certain performance indicators correlate to others.

More specifically, decision-makers, supported by keen data managers and capable vendors, are gaining access to cycle time information, facility utilization, conveyance performance and almost every type of documentation effort that allows them to manage the bottom line. Any manager who can do this expertly and consistently will lead the way.

Operationalize that data

A test of the designed data extraction process was conducted using a 90-day shipment cycle population. This showed that all the data in two of the inherent systems could be retrieved automatically. In addition, data that was required to complete the report but was not available in those systems could be collected ad hoc and connected through existing key fields.
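
Connecting ad hoc data to system data through a key field can be sketched as follows. This is a minimal illustration only: the key field (po_number), the other field names and the sample rows are assumptions and do not describe the systems used in the test.

```python
def attach_ad_hoc(system_rows, ad_hoc_rows, key):
    """Attach manually collected fields to system rows that share a key.

    Returns the merged rows plus the keys that had no ad hoc match,
    so gaps in the manual collection are visible rather than silent.
    """
    lookup = {row[key]: row for row in ad_hoc_rows}
    merged, unmatched = [], []
    for row in system_rows:
        extra = lookup.get(row[key])
        if extra is None:
            unmatched.append(row[key])
            merged.append(dict(row))
        else:
            # System values win if both sources carry the same field.
            merged.append({**extra, **row})
    return merged, unmatched


# Hypothetical sample data.
system_rows = [
    {"po_number": "PO-1", "ship_date": "2017-02-01"},
    {"po_number": "PO-2", "ship_date": "2017-02-03"},
]
ad_hoc_rows = [
    {"po_number": "PO-1", "load_plan_note": "held for consolidation"},
]

merged, unmatched = attach_ad_hoc(system_rows, ad_hoc_rows, key="po_number")
print(merged)
print("No ad hoc data for:", unmatched)
```

Listing the unmatched keys is a deliberate choice here, since it is one way to surface the gaps between connected data and reporting data.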

Next, a template for presenting the KPIs was created and gaps between the connected data and the reporting data were identified. The final step was creating a standard operating procedure document to complete the report generation work on an ongoing basis in as automated a fashion as possible. Here are some of the issues that the team encountered during this initial process:

  • Freeform comment fields containing pertinent information on container load planning decisions required extensive data cleansing work. This information was critical in reporting the cycle time performance.
  • Specific routings created double counting and missed events based on how the shipments were entered into the system manually and systemically.
  • Some of the time stamps were available for certain routings, while others were not – just by the nature of the processes involved.
  • Several key time stamps were being overwritten by purchase order changes, and the automated data extraction was only picking up the latest change, losing sight of the original data, which was critical to capturing true time events.
  • Some data required manual collection. This created a burden on resources and had the potential to make the data collection process lack standardization from month to month.
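
The overwritten time stamp issue lends itself to a short illustration. The Python sketch below assumes that a change history is available for each purchase order field, which the original extraction may not have had. It keeps both the original and the latest value instead of only the latest. The field names and sample entries are hypothetical.

```python
from datetime import datetime


def first_and_last_values(change_log):
    """For each (PO, field) pair, keep the original and the latest value.

    Entries are processed in order of their change time, so the first
    value seen is the original and the last one seen is the latest.
    """
    result = {}
    for entry in sorted(change_log, key=lambda e: e["changed_at"]):
        key = (entry["po_number"], entry["field"])
        if key not in result:
            result[key] = {"original": entry["value"], "latest": entry["value"]}
        else:
            result[key]["latest"] = entry["value"]
    return result


# Hypothetical change history, deliberately out of order.
change_log = [
    {"po_number": "PO-1", "field": "ready_date", "value": "2017-03-17",
     "changed_at": datetime(2017, 3, 8, 14, 30)},
    {"po_number": "PO-2", "field": "ready_date", "value": "2017-03-12",
     "changed_at": datetime(2017, 3, 2, 8, 0)},
    {"po_number": "PO-1", "field": "ready_date", "value": "2017-03-10",
     "changed_at": datetime(2017, 3, 1, 9, 0)},
]

values = first_and_last_values(change_log)
print(values[("PO-1", "ready_date")])
```

Cycle times measured from the original value would then reflect the true time events even after a purchase order has been revised.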

For the first five years, this effort was actively managed. Then the process improvement team was able to transition the process to a dashboard, which permitted a more automated data extraction and cleansing process and nearly on-demand performance measurement availability, all based entirely on the original process.

Reading the granules for better decisions

Consider the hundreds of thousands and perhaps even millions of lines of related data that are generated. Some of this information is generated weekly, some monthly and some annually. Being able to turn the most granular data into meaningful and useful reports is powerful. Third-party logistics providers – and other enterprises – must focus on supporting their customers’ needs for big data analysis with dashboards that feature any required or requested performance indicator, landed cost calculations, scorecards, inventory management, vendor performance, carbon dioxide emissions and even peer analysis, to name a few measurable activities.

All this information exists, and we should understand how to use it. Doing so makes it invaluable not only to us in shaping our future, but also to shippers, who can continue to focus on shaping their own future with high-quality big data management.