Walmart Is Making Big Data Part Of Its DNA


Walmart started using big data before the term became widely known in the industry. In 2012 they moved from an experimental 10-node Hadoop cluster to a 250-node Hadoop cluster. At the same time they developed new tools to migrate their existing data from Oracle, Netezza and Greenplum hardware to their own systems. The objective was to consolidate 10 different websites into one and store all incoming data in the new Hadoop cluster. Since then they have made big steps in integrating big data into the DNA of Walmart.

Social Big Data Solutions

Many of the big data tools were developed at Walmart Labs, which was created after Walmart took over Kosmix in 2011. Products developed at Walmart Labs include ‘Social Genome’, ‘ShoppyCat’ and ‘Get on the Shelf’.

The Social Genome product allows Walmart to reach customers, or friends of customers, who have mentioned a product online, to inform them about that exact product and include a discount. To do this they combine public data from the web, social data and proprietary data such as customer purchasing data and contact information. This has resulted in a vast, constantly changing, up-to-date knowledge base with hundreds of millions of entities and relationships. It helps Walmart to better understand the context of what their customers are saying online. An example mentioned by Walmart Labs shows a woman tweeting regularly about movies. When she tweets “I love Salt”, Walmart is able to understand that she is talking about the movie Salt and not the condiment.
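The “Salt” example can be illustrated with a toy sketch. This is not Walmart's method, which the source does not describe; it only shows the general idea of using a user's earlier posts as context. The keyword sets and the function names `tokenize` and `disambiguate` are assumptions made up for this illustration.

```python
from collections import Counter

# Hypothetical context vocabularies for two senses of the word "salt".
SENSE_KEYWORDS = {
    "movie": {"movie", "film", "watch", "cinema", "trailer", "actor"},
    "condiment": {"cook", "recipe", "pepper", "food", "taste", "dinner"},
}

def tokenize(text):
    """Lowercase a post and strip simple punctuation from each word."""
    return [w.strip(".,!?\"'").lower() for w in text.split()]

def disambiguate(history, senses=SENSE_KEYWORDS):
    """Return the sense whose keywords occur most often in earlier posts."""
    counts = Counter()
    for post in history:
        for word in tokenize(post):
            for sense, keywords in senses.items():
                if word in keywords:
                    counts[sense] += 1
    if not counts:
        return None  # no context available, so no decision is made
    return counts.most_common(1)[0][0]

# A user who regularly posts about movies, as in the example above.
history = ["Great movie last night", "Can't wait to watch the new trailer"]
print(disambiguate(history))
```

A production system would draw on a knowledge base of entities and relationships rather than hand-written keyword lists, but the principle is the same: the surrounding context decides which “Salt” is meant.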

Walmart came across several technical difficulties when developing the Social Genome, among them the volume and velocity of the data pouring into their Hadoop clusters. As the regular MapReduce/Hadoop framework was not able to cope with the amount of data and the speed at which it was coming in, they developed their own tool called Muppet. This tool, now open source, processes the data in real time across all clusters and can perform several analyses at the same time.
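The difference between batch processing and processing data as it arrives can be sketched in a few lines. The code below is a generic illustration of per-key incremental updates on a stream; it is not Muppet's actual API, and the names `map_event` and `StreamCounter` are hypothetical.

```python
from collections import defaultdict

def map_event(event):
    """Emit one (key, value) pair per word in an incoming event."""
    for word in event["text"].lower().split():
        yield word, 1

class StreamCounter:
    """Keeps a running count per key, updated as each event arrives."""

    def __init__(self):
        self.state = defaultdict(int)  # per-key state, kept in memory

    def update(self, key, value):
        self.state[key] += value

    def process(self, event):
        # Results are current after every event; no batch job is rerun.
        for key, value in map_event(event):
            self.update(key, value)

counter = StreamCounter()
for event in [{"text": "salt movie"}, {"text": "salt trailer"}]:
    counter.process(event)
print(counter.state["salt"])  # prints 2
```

Because each event only touches the state of its own keys, several such counters or other analyses can consume the same stream side by side.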

The ShoppyCat product recommends suitable products to Facebook users based on the hobbies and interests of their friends. It uses the Social Genome technology, among others, to help customers find presents for their friends. An interesting aspect of this Facebook app is that Walmart will direct users to a different store if the product is sold out at a nearby Walmart store.

Get on the Shelf was a crowd-sourcing solution that gave anyone the chance to pitch his or her product in front of a large online audience. The best products would be sold at Walmart, with the potential to suddenly reach millions of customers. Over a million votes were cast, and in the end three products made it into Walmart's assortment.

Mobile Big Data Solutions

With over 200 million customers visiting a Walmart store every week, it is no surprise that Walmart is investing heavily in mobile development. They have developed several iOS and Android apps that use the latest technology to give their customers the best experience. They even created a few of their own tools and subsequently open-sourced them. These are Thorax and Lumbar. Thorax is a framework for building large-scale web applications, and Lumbar is a JavaScript build tool that can generate modular, platform-specific applications.

But there is more, of course. As can be expected from the largest retailer in the world, Walmart has an extensive big data ecosystem. This system processes multiple terabytes of new data and petabytes of historical data on a daily basis, covering millions of products and hundreds of millions of users from internal and external sources. They analyse over 100 million keywords every day to optimize the bidding on each keyword. The ecosystem is shown schematically in the visual below:

All the big data efforts of Walmart are a good example of what can be done when big data is truly incorporated in the DNA of a company. Already, Walmart is able to optimize the local assortment of Walmart stores based on what the customers in the neighbourhood are saying on social media. When Walmart combines all big data efforts with their mobile efforts, truly exciting solutions can be created.

They are also developing in-store mobile navigation that uses customers' own smartphones to steer them through aisles with products they have been talking about on social media and are therefore more willing to buy. This, of course, results in more revenue for what is already the largest retailer in the world.

Image courtesy of Walmart Labs.
Image Credit: goir/Shutterstock
Dr Mark van Rijmenam


Dr. Mark van Rijmenam is a strategic futurist known as The Digital Speaker. He stands at the forefront of the digital age and lives and breathes cutting-edge technologies to inspire Fortune 500 companies and governments worldwide. As an optimistic dystopian, he has a deep understanding of AI, blockchain, the metaverse, and other emerging technologies, and he blends academic rigour with technological innovation.

His pioneering efforts include the world’s first TEDx Talk in VR in 2020. In 2023, he further pushed boundaries when he delivered a TEDx talk in Athens with his digital twin, delving into the complex interplay of AI and our perception of reality. In 2024, he launched a digital twin of himself offering interactive, on-demand conversations via text, audio or video in 29 languages, thereby bridging the gap between the digital and physical worlds – another world’s first.

As a distinguished 5-time author and corporate educator, Dr Van Rijmenam is celebrated for his candid, independent, and balanced insights. He is also the founder of Futurwise, which focuses on elevating global digital awareness for a responsible and thriving digital future.

