
SCDigest Expert Insight: Supply Chain by Design

About the Author

Dr. Michael Watson, one of the industry’s foremost experts on supply chain network design and advanced analytics, is a columnist and subject matter expert (SME) for Supply Chain Digest.

Dr. Watson, of Northwestern University, was the lead author of the just released book Supply Chain Network Design, co-authored with Sara Lewis, Peter Cacioppi, and Jay Jayaraman, all of IBM. (See Supply Chain Network Design – the Book.)

Prior to his current role at Northwestern, Watson was a key manager in IBM's network optimization group. In addition to his roles at IBM and now at Northwestern, Watson is director of The Optimization and Analytics Group.

By Dr. Michael Watson

April 15, 2014

More on Big Data in the Supply Chain

Instead of Getting Hung Up on the Definition of 'Big Data,' Learn to Embrace It


Dan Gilmore wrote a great First Thoughts column this week on Big Data in the Supply Chain. He painted a fair picture of the state of “Big Data” in the supply chain and some potential applications. I think he nailed the heart of the matter with this paragraph:

“In my opinion, there is a lot of humbug and hype out there right now relative to Big Data, and a lack of clarity about where the applications/opportunities are in the supply chain. That said, I think there really is something here, but largely still emerging.”

Part of the hype and confusion around Big Data comes from the fact that people are using the term in many different ways. 

If you talk to someone in IT or computer science, they usually mean a definition of Big Data that stresses the technology aspect. That is, the data set is too big for existing or standard technologies. Folks at places like Gartner typically refer to this as the 3 V’s: Volume (you have more data than can be stored on your existing servers), Velocity (you have data coming in from sensors so fast you can’t process it), and Variety (you are collecting data from text, video, or social media and the data doesn’t fit in typical relational databases).

It is this definition that will quickly get people talking about Hadoop and other interesting new technology. 

However, I find that most people in supply chain do not need to, and shouldn’t, worry about this definition. For the most part, the data we deal with in the supply chain does not fit it.


Instead, there is another, more appropriate way the term Big Data is being used.  In the press and in discussions in the hall, Big Data is being used to describe an interesting problem being solved or a creative way to use data. 

This definition leads to a better way to think about Big Data.  Instead of thinking about the technical side of data, think about how you can better use the data you have to solve new problems, think about how you can combine data sets to answer tough questions, and think about an old problem you can now solve by collecting some new data. 

This is where I think folks in the supply chain will see the most value. For example, people in the supply chain are collecting and using data to answer new questions:

Using detailed ship-to data to better understand customer segments (to gain insight into different supply chain processes needed) and to understand what products ship together (to decide what to store where and where to put items in the warehouse).


Using detailed truck sensor data to understand fuel efficiency and to match engine types to requirements.


Using historical demand data along with outside data (like weather, demographics, competition, or the vehicle registration database if you are in the auto aftermarket) to create better forecasts.


Using data on pricing and promotions to forecast changes in demand, both for the promoted items and for other items whose demand may rise or fall with the promotion.


Pulling supply chain data from multiple sources to create better cost-to-serve models.

Final Thoughts:

And this list could go on and on. Note that probably none of the above items fit the 3 V’s of Big Data, and that each item on the list requires techniques to analyze the data (more on this in future columns).
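
As one small illustration of such a technique, here is a minimal Python sketch for the first item on the list: counting which products ship together. It assumes the ship-to data has already been reduced to one list of product names per shipment. The function name `co_shipment_counts` and the sample shipments are hypothetical, made up for illustration rather than taken from any real data set.

```python
from collections import Counter
from itertools import combinations


def co_shipment_counts(shipments):
    """Count how often each pair of products appears on the same shipment."""
    pair_counts = Counter()
    for products in shipments:
        # Deduplicate and sort so each pair has a single canonical ordering.
        for pair in combinations(sorted(set(products)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical ship-to data: each inner list is the products on one shipment.
sample_shipments = [
    ["oil filter", "motor oil", "wiper blade"],
    ["oil filter", "motor oil"],
    ["wiper blade", "washer fluid"],
    ["oil filter", "motor oil", "washer fluid"],
]

counts = co_shipment_counts(sample_shipments)
top_pairs = counts.most_common(3)
```

Pairs that ship together often are candidates for being stored in the same facility or slotted near each other in the warehouse. A real analysis would likely also need to weigh volume, seasonality, and order size.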

In the end, as Dan mentioned in his article, there is something here, no matter what we call it.

Recent Feedback

Great post - I completely agree!  

While most of the data useful in supply chain does not fit the "Big Data" definition, we do have plenty of data available to us (e.g. Point of Sale data, weather data, promotional activity) that is largely unused.  Yes, some data sources are big enough to be challenging to handle, other sources you can get into a spreadsheet, but size is not the important factor here - the big question is whether you can drive value from it.  

Don't get caught up on the definition, or let the data volume deter you, just keep driving for value.

Andrew Gibson
Crabtree Analytics
Apr 14, 2014