Friday, January 11, 2013

The @WalmartLabs Social Media Analytics project

Firstly, let me wish you all a happy new year. 2012 was definitely very exciting for us all. While scientists at CERN were sifting through a whopping 200 petabytes of data analyzing 800 trillion collision events to detect the Higgs boson, we at @WalmartLabs started out on a little hunt of our own. I’m talking about an insights-mining project we started working on last year.

Social Media Analytics is all about mining retail-related insights from social channels, a perilous and personally exciting task for us. When our team spent the 22nd of November feverishly following the social retail pulse on Black Friday, we knew the world wasn’t preparing for an apocalypse.


As we watched the incredible surge in Walmart-related social buzz on that day, we were gently reminded once again of the promise that lies hidden deep within the social data goldmine: the promise of social media analytics, which is emphasized so often today that it is almost a cliché. The potential itself is nevertheless still largely untapped. We are only beginning to scratch the surface of all the great tales that the data has to tell us.

Social buzz typically precedes retail buzz. People are constantly expressing opinions about upcoming products on social apps - the hot new video game whose trailer was just released, the cool gadget about to be launched, or a new upgrade to that toy that your child always loved. There are good things said, and bad things too. Social media is thus a direct, real-time feedback channel to us from our many customers. I am still only stating the obvious.


Our goal is to tap into this social buzz and help Walmart with decision making on aspects like inventory and assortment. As an example, the figure below shows a noticeable spike in social activity about Sony's new Android phone, the Xperia Z, a few days ahead of its actual launch. Such insights can help our merchants make smarter decisions ahead of time.




Social data mining comes with incredible challenges, which only makes it all the more exciting for our super smart engineers to come to work every day. The data volume is formidable. We are talking about petabytes here. Real-time social data processing requires sophisticated data stores and blazingly fast algorithms. The noise levels are exorbitant, the language used in social forums is heavily informal, unstructured and often ungrammatical, and extracting a helpful insight from the huge amount of noise is super hard. Just consider algorithmically parsing “OMG!!! dis is sooo coool! i luv ma new fone. i cant believ ma luck 4 chosin this! #wellwhatdoyathink”. Popular text analytics and natural language processing techniques based on standard language models simply fail. We need altogether different techniques to filter out and focus on the social data that is relevant to us, which in itself is a daunting task. The next step is to map this data to meaningful retail products. All of these are difficult tasks. As a quick sneak peek, a new technique we are trying out today is to look for any of several hand-verified n-grams around brands in a large time window. Several more schemes are to follow. It is only after conquering all of these challenges that meaningful recommendations can be made.
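The n-gram idea can be illustrated with a small sketch. This is not our production code: the n-gram list, the brand vocabulary, the sample posts and the `context` parameter are all hypothetical, and the tokenizer is deliberately crude. It counts, per brand, the posts inside a time window in which a hand-verified n-gram appears within a few tokens of the brand mention.

```python
from datetime import datetime, timedelta

# Hypothetical hand-verified n-grams that signal interest in a product.
VERIFIED_NGRAMS = {("cant", "wait", "for"), ("just", "bought"), ("want", "the")}
# Hypothetical brand vocabulary.
BRANDS = {"xperia", "sony"}


def tokenize(text):
    """Lowercase and replace punctuation with spaces; real social text needs far more care."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return cleaned.split()


def ngrams(tokens, n):
    """Return all contiguous n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def brand_hits(posts, start, window, context=4):
    """Count posts per brand in [start, start + window) where a verified
    n-gram occurs within `context` tokens of a brand mention."""
    counts = {}
    end = start + window
    for timestamp, text in posts:
        if not (start <= timestamp < end):
            continue
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok not in BRANDS:
                continue
            nearby = tokens[max(0, i - context): i + context + 1]
            if any(g in VERIFIED_NGRAMS for n in (2, 3) for g in ngrams(nearby, n)):
                counts[tok] = counts.get(tok, 0) + 1
                break  # count each post once, for the first matching brand mention
    return counts


# Made-up posts for illustration only.
posts = [
    (datetime(2013, 1, 5, 10), "OMG cant wait for the Xperia Z!!!"),
    (datetime(2013, 1, 6, 9), "just bought a sony tv, luv it"),
    (datetime(2013, 1, 6, 12), "xperia is a phone"),
    (datetime(2012, 12, 1), "cant wait for xperia"),  # outside the window
]
hits = brand_hits(posts, datetime(2013, 1, 1), timedelta(days=14))
```

Counting per window rather than per post is what would let a spike ahead of a launch stand out; mapping the brand hits to actual catalog products is a separate step not shown here.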

Today, our social media analytics project operates on top of a searchable index of 60 billion social documents and helps merchants at Walmart monitor sentiments and popular interests in real time, or inquire into trends in the past. One can also see geographical variations of social sentiments and buzz levels. There are also tools that marry search trends on walmart.com, sales trends in our brick-and-mortar stores and social buzz all in one place, to help find correlations. Together, these tools provide powerful social insights today.

As we step into another fantastic year, we are excited to be taking up more audacious goals. On one hand, we aspire to improve the quality of our insights and work with our merchants to expedite them effectively. On the other, we aim to map our interest trends to demand levels for actual products and come up with insights for assortment and inventory management. And all of this, well ahead of time, while we can make a difference. 

It is going to be an exciting year indeed.


Monday, November 26, 2012

Targeting @WalmartLabs

Humans are profoundly efficient at ingesting a wide array of disparate signals and internally aggregating these patterns together to form “profiles” of objects, events and actions that can be recalled when confronted with analogous situations in the future. 

Consider the following:  

Most of us from a young age are quite comfortable playing catch with a baseball.  The brain has been exposed to catching baseballs under different lighting conditions and at the various velocities and spins with which a partner might toss the ball to you, and consequently has established mental profiles of varying resolutions of what is needed, mechanically, to successfully catch this baseball.  Now, if someone were to toss you a dog bone in lieu of the traditional baseball, you’d likely be able to catch this aspherical object with ease (assuming it wasn’t the femur of an extinct mastodon!).

Fast forward to the modern world, in which the amount of data generated in a day or two is equivalent to all information generated up until the beginning of the 21st century. Humans are often overwhelmed when reasoning under this bombardment of signals and frequently rely on machines to comb through the vast amounts of data to form aggregates, which humans can leverage as useful information.

This explosion in data is particularly present in modern eCommerce, for example on Walmart.com.  Customers are quite good at recognizing products and brands for which they have a previous affinity.  However, as the number of items in retailers’ catalogs seems to be increasing by an order of magnitude every few years, it becomes a challenging problem, especially in the digital world, to help customers sift through the vast space of items they and their families might enjoy.

Fortunately, big data tools, largely developed to handle the immense amounts of data being generated by the consumer web, allow us within @WalmartLabs to improve the online customer shopping experience.  These big data tools, in conjunction with the appropriate machine learning and information retrieval methodologies, can profoundly improve the eCommerce shopping experience by helping customers navigate through the noise of millions, if not tens of millions, of items and by presenting a set of relevant and delightfully discovered products that an individual might enjoy.

The targeting team within @WalmartLabs ingests just about every clickable action on Walmart.com: what individuals buy online and in stores, trends on Twitter, local weather deviations, and other local external events such as the San Francisco Giants winning the World Series.  We capture these events and intelligently tease out meaningful patterns so our millions of Walmart.com customers have a shopping experience that is individually personalized.

The enormous amount of data collected raises the question of how one goes about understanding all of it.  Our big data tools help us personalize the shopping experience, and our psychological analysis helps us dissect the deeper meaning behind patterns in the data.  We apply behavioral economics to find clarity behind both the rational and irrational behavior shoppers exhibit. By integrating a psychological approach, we provide a holistic experience that takes in data and reflects an accurate picture of customer activity and what consumers might be seeking.

Our targeting team, comprising PhDs in computer science, statistics, and signal processing as well as behavioral psychologists, has developed methodologies that aggregate all this disparate data together (in the “right” way) to formalize “good” recommendations or implicit “neighbors”.  Historical buying patterns of our customers, analyzed with collaborative filtering methodologies (e.g., correlation structure), provide wonderful insights into items these customers are also likely to enjoy. However, transaction patterns alone can be enriched by also incorporating items that were browsed prior to any purchase, as well as the areas of the United States in which shoppers are located.  Due to block structure in these large covariance matrices (and auto-correlation), you end up with a scenario in which the “relevance” of the recommended items to users ends up being greater than the sum of the parts.  In other words, the model produces more relevant recommendations when trained on all these data sources (when appropriately aggregated) than on any one in isolation.
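To make the collaborative filtering idea concrete, here is a minimal sketch that scores items by their correlation with what a user has already bought. It is an illustration under stated assumptions, not the team's actual model: the purchase histories are made up, the function names are hypothetical, and plain Pearson correlation on binary purchase vectors stands in for the richer, multi-source covariance structure described above.

```python
from math import sqrt

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "u1": {"headphones", "mp3_player", "phone_case"},
    "u2": {"headphones", "mp3_player"},
    "u3": {"mp3_player", "phone_case"},
    "u4": {"garden_hose", "gloves"},
    "u5": {"headphones", "mp3_player", "gloves"},
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def item_correlations(purchases):
    """Correlate every pair of items across users, using 0/1 purchase indicators."""
    users = sorted(purchases)
    items = sorted(set().union(*purchases.values()))
    vectors = {it: [1.0 if it in purchases[u] else 0.0 for u in users] for it in items}
    return {(a, b): pearson(vectors[a], vectors[b]) for a in items for b in items if a != b}


def recommend(user, purchases, corr, top_n=2):
    """Rank items the user does not own by summed correlation with owned items."""
    owned = purchases[user]
    candidates = set().union(*purchases.values()) - owned
    scores = {c: sum(corr[(c, o)] for o in owned) for c in candidates}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


corr = item_correlations(purchases)
suggestions = recommend("u2", purchases, corr)
```

Browse events or location could be folded in as extra indicator columns per user; how to weight them against purchases is exactly the "appropriate aggregation" question and is not addressed by this sketch.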

In the same spirit in which a customer may seek out the counsel of her closest (and likely most similar) friends when choosing the right pair of headphones, our high-dimensional statistical models can implicitly replicate this process of inheriting items from “close” individuals.  The targeting team will often take each user, represented by every item they’ve ever interacted with on the Walmart.com site, as well as demographic and geo data, and project users and items to the “appropriate” low-dimensional subspace.

After this projection, users who behave similarly across all their patterns are likely to end up near each other.  Likewise, items that are similar end up near each other in this lower-dimensional space.   We are then able to establish who a given user’s “closest” friends are in this subspace (geometrically speaking), and then allow this particular individual to “inherit” the items these very similar neighbors interacted with, often resulting in spot-on recommendations.
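The neighbor step can be sketched as follows. The projection itself is not shown; the sketch simply assumes each user already has coordinates in some low-dimensional space. The coordinates, item sets, function names and the choice of Euclidean distance with k = 2 are all illustrative assumptions rather than a description of the real system.

```python
from math import sqrt

# Hypothetical 2-D coordinates for users after some projection step (not shown).
user_coords = {
    "alice": (0.90, 0.10),
    "bob": (0.80, 0.20),
    "carol": (0.85, 0.05),
    "dave": (0.10, 0.90),
}
# Hypothetical items each user has interacted with.
user_items = {
    "alice": {"budget_earbuds"},
    "bob": {"studio_headphones", "budget_earbuds"},
    "carol": {"studio_headphones", "headphone_amp"},
    "dave": {"garden_hose"},
}


def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(user, coords, k):
    """Return the k users closest to `user` in the projected space."""
    others = [(distance(coords[user], c), name) for name, c in coords.items() if name != user]
    return [name for _, name in sorted(others)[:k]]


def inherit_items(user, coords, items, k=2):
    """Let `user` inherit unseen items from the k nearest neighbors,
    ranked by how many neighbors interacted with each item."""
    votes = {}
    for neighbor in nearest_neighbors(user, coords, k):
        for item in items[neighbor] - items[user]:
            votes[item] = votes.get(item, 0) + 1
    return sorted(votes, key=lambda item: (-votes[item], item))


neighbors = nearest_neighbors("alice", user_coords, 2)
inherited = inherit_items("alice", user_coords, user_items)
```

In this toy data the user who only bought budget earbuds sits close to two users who interacted with studio headphones, so that item ranks first among the inherited recommendations.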

In the headphones example, such a procedure often results in recommending the niche higher-end headphones rather than lower-quality ones, based upon this projection and neighborhood derivation.

The @WalmartLabs targeting team is able to replicate the age-old process of seeking out item recommendations from one’s trusted peers and friends and extend it to the digital age, in which a user receives a highly relevant and delightful headphone recommendation via email while being completely unaware of the implicit, geometrically similar “friends” who helped make this recommendation a reality.

Wednesday, November 14, 2012

Food is personal, food is social


Goodies Co. is here to help you discover new foods you’ll love.

If you’re anything like us, you often find yourself thinking, “I’m so tired of eating the same old thing.” But branching out can be difficult. Why? Because trying new foods is expensive, time-consuming and risky. What if it doesn’t taste good? It’s money down the drain, not to mention a dull and very disappointing meal.

That’s no way to live.

And that’s where Goodies Co. comes in. 

For just a $7 per month subscription including taxes and shipping, the Goodies Co. box features 5-8 delicious new foods hand-picked and delivered directly to your doorstep. We always try to get you as much value as possible for your $7, and are proud to say that our box costs less than half of the sum of its parts if sold individually.

You get to receive a surprise every month, sample new foods, and expand your palate without breaking the bank.  We always try to include a wide variety of foods, from healthy and organic to artisan and international, and definitely steer clear of being too niche (all organic, all gluten-free, etc.). We want everyone to be able to enjoy Goodies Co.! And whatever you love most from a Goodies Co. box is always available for purchase in full size from our online storefront.

We also believe in rewarding our loyal customers, and offer points for reviews our customers leave about products on Goodies Co. Points can soon be redeemed for discounts on future boxes and items in the store, ensuring that Goodies Co. subscribers keep getting new products they’ll love.

Products that are considered for inclusion in a Goodies Co. Taster’s Box are always tasted by the Goodies Co. team, and tasted by volunteers from our Tasting Lab, where each potential product is graded for ultimate deliciousness.

Goodies Co. is a new experiential ecommerce site developed by @WalmartLabs in Silicon Valley.

Given Walmart’s tremendous reach in the grocery sector, we saw an opportunity to build upon the market in new ways. Goodies Co. was born out of Walmart’s firm ties to food, while also innovating on and refining the core pillars of discovery commerce. Goodies Co. builds a bridge between the physical fulfillment of a subscription box and a rich associated experience on the Goodies Co. site and social channels. The opportunity ahead of Goodies Co. is tremendous: reaching those who love the emerging discovery commerce sector and subscription boxes, while also getting interesting foods to customers in new ways.

Try us out, and let Goodies Co. show you just how good life can be. Sign up with your email address now at Goodies Co, and feel free to follow us on Twitter and Facebook for more delicious surprises. 

Wednesday, September 12, 2012

Mupd8 – The @WalmartLabs Real-time Platform



In recent years, the world has seen an explosive growth in the volume of real-time data streams. Once the preserve of stock markets and day traders, real-time data is now ubiquitously available to consumers through popular services such as Twitter and Facebook. With the availability and growth of real-time data comes the inevitable problem of real-time data overload, and the need for systems that can separate the signal from the noise.

As we began working with firehoses from various social media sites, we recognized the need for a general-purpose real-time stream processing platform that could address the issues of scale and performance -- and enable our stream processing applications to focus on the quality of their generated content. “Mupd8” came into existence to fulfill that need. At the highest level, Mupd8 does for Fast Data what Hadoop and the MapReduce computation model do for Big Data. Mupd8 was formerly known as Muppet within @WalmartLabs and in our peer-reviewed publications.


A MapUpdate application is a workflow of map and update operators.

With Mupd8, you can easily write applications to process these (or your own) firehoses using the MapUpdate framework, a simple way to express streaming computation. By writing your application as a collection of customized map and update operators, you can focus your programming on your application logic and let Mupd8 handle the load and data distribution across multiple CPU cores and machines for you. For example, an application can be written to subscribe to the Twitter firehose of every tweet written; such an application can analyze the tweets to determine Twitter's most influential users, or identify suddenly prominent events as they occur. Alternatively, an application can be written to subscribe to a log of all user activity on a Web site; such an application can detect service problems users face as they occur, or compute suggestions for users' next steps based on up-to-the-moment activity.
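To give a feel for the MapUpdate style, here is a toy, single-process sketch in Python. It is not the Mupd8 API: the function names, the event shape and the `run` loop are hypothetical stand-ins. The map operator turns each incoming event into keyed events, and the update operator folds each keyed event into state kept per key.

```python
from collections import defaultdict


def map_hashtags(event):
    """Map operator: turn one tweet into zero or more (key, value) events."""
    for word in event["text"].split():
        if word.startswith("#") and len(word) > 1:
            yield word.lower(), 1


def update_count(key, value, state):
    """Update operator: fold one event into the state kept for its key."""
    return (state or 0) + value


def run(stream, map_op, update_op):
    """Toy single-process runner. A real platform would route each key to
    a worker and spread the keys across CPU cores and machines."""
    state = defaultdict(lambda: None)
    for event in stream:
        for key, value in map_op(event):
            state[key] = update_op(key, value, state[key])
    return dict(state)


# Made-up events for illustration only.
tweets = [
    {"user": "a", "text": "Loving my new phone #XperiaZ"},
    {"user": "b", "text": "#xperiaz launch soon #android"},
    {"user": "c", "text": "no tags here"},
]
counts = run(tweets, map_hashtags, update_count)
```

Because the update operator only ever sees one key's state at a time, the work can be partitioned by key, which is what lets the platform rather than the application take care of distribution.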

Mupd8 enables @WalmartLabs to leverage Walmart’s business data and product taxonomy to extract valuable content in real-time from the largest streams of social media content. As an example, we extract and collect videos, images, location information and status updates from these massive flowing streams. We then categorize them and track such information as key influencers, relevant videos, images and web pages for each product.

To be effective at separating the signal from the noise in real-time, Mupd8 not only has to provide a simple computation model for stream processing applications, but also has to be highly available since any downtime causes loss of real-time information. This, in turn, requires persistence of the rapidly changing content generated by the applications. Low latency -- the time between receiving an update and updating the affected nodes -- is a key requirement, since the half-life of social media content is short. Lastly, Mupd8 has to be scalable across three dimensions -- increasing stream rates, the size of the taxonomy and the complexity of supported applications.

To address these requirements for high availability, low latency and scalability, the current Mupd8 architecture leverages open-source solutions, including Cassandra, which has proven to be a high-performance and scalable key-value store (though Mupd8 adds a caching layer to minimize the impact of its higher network read latency).
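The idea of putting a cache in front of a slower key-value store can be sketched in a few lines. This is an illustrative assumption about how such a layer might look, not Mupd8's implementation: `SlowStore` is a hypothetical in-memory stand-in for a remote store, and the write-through, least-recently-used policy is one common choice among several.

```python
from collections import OrderedDict


class SlowStore:
    """Stand-in for a remote key-value store; counts reads to show the cache's effect."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value


class CachedStore:
    """Write-through, read-through LRU cache in front of a store."""

    def __init__(self, store, capacity=1000):
        self.store = store
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        value = self.store.get(key)  # cache miss: pay the remote read once
        self._remember(key, value)
        return value

    def put(self, key, value):
        self.store.put(key, value)  # persist first, then cache
        self._remember(key, value)

    def _remember(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry


backend = SlowStore()
backend.put("tv", 3)
cached = CachedStore(backend, capacity=2)
first = cached.get("tv")
second = cached.get("tv")
```

Only the first `get` reaches the backing store; repeated reads of hot keys are served from memory, which is the effect a caching layer is meant to have on read latency.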

The @WalmartLabs Mupd8 platform has already supported more than a dozen sophisticated stream applications processing over 300 million status updates per day, gathering real-time information on our product taxonomy. Learn more about our experience (see our VLDB 2012 paper) and check out the newest version, available under the Apache License 2.0, for yourself at http://github.com/walmartlabs/mupd8. Get started on your application with "Starting a New Application in MUPD8," available in the Mupd8 source tree and http://walmartlabs.github.com/mupd8/quickstart.html.

Wednesday, May 30, 2012

Mobilizing our store customers

As we continue to improve the mobile commerce experience, we’re very excited to roll out major updates to our iPhone and iPad apps for Walmart in the United States.  For the iPhone, it’s all about in-store aisle location and our new In-Store Mode.  For the iPad, we’ve introduced a new interactive Local Ad feature.  These new mobile enhancements put more power at the fingertips of our customers, allowing them to shop our brand anytime, anywhere.

iPhone: In-store Aisle Location
We’ve expanded the in-store aisle location feature to help customers easily find products on their shopping lists in our stores.  As customers add items to their mobile shopping list, they can also see the aisle where they’ll find each product.  Previously available as a limited beta, the in-store aisle location feature is now offered for all U.S. stores and thousands of products across consumables, grocery, electronics and toys.

In addition, customers can set their local store within the app, plan their visit and create a shopping list at home by scanning, typing, or speaking items to buy.  One of the unique features of our app is that we pull item prices from your local Walmart store and calculate total price in real-time as items are added to shopping lists.  We’re able to do that for any of our 3,800 Walmart stores across the U.S. so no matter where you happen to be, only our app helps you see the estimated total of your Walmart shopping trip before ever leaving home. 

iPhone: In-Store Mode
This tool was specifically created to help customers while they’re shopping in our retail stores. Now, when you launch the Walmart for iPhone app from one of our stores in the U.S., you will be prompted to enter In-Store Mode.  Features include scanning – price checking from product barcodes or learning more about products and special offers displayed in-store as QR Codes – and easy access to the Local Ad product list and general store information.  

iPad: Local Ad
We’ve introduced an interactive Local Ad experience that presents our customers with a preview of store specials and featured products. The format of our iPad Local Ad will be familiar to customers who are used to flipping through our printed Local Ads. They will be able to browse our Local Ads from their iPad page-by-page, with the ability to tap on a product to view more information, and purchase the item online or by visiting the store.

You can download our iPhone and iPad apps from the App Store.

Walmart for iPhone:

Walmart for iPad:

Happy shopping!

Thursday, May 3, 2012

Get on the shelf Contest Winners Announced!




Today we announced the final winners of our popular Get on the shelf contest: HumanKind Water, PlateTopper and SnapIt Eyeglass Repair Kit, which will be sold at http://walmart.com/getontheshelf.

More than 4,000 inventors, entrepreneurs and small businesses from across the country entered the contest for the opportunity to be carried on Walmart.com and in Walmart U.S. stores.  The most popular categories entered ranged from home improvement to personalized products and health and wellness.  The public cast over one million votes through Facebook or text.

Throughout the contest, the winning inventors went the distance to market their participation through word of mouth. HumanKind Water transformed its homepage into a “war room” completely dedicated to getting votes. PlateTopper deployed humorous videos and social marketing tactics to raise visibility, including a YouTube video that has been viewed more than two million times. Nancy of SnapIt even went to the NBC Today Show’s plaza in New York, where her assistant dressed as a giant screw and was seen on national television with a sign asking for votes.

We are now at the end of an amazing adventure.  Get on the Shelf surpassed our expectations and surprised us all here.  We congratulate the winners for their creativity and passion and thank every single one of the 4,000 contestants for taking a leap to join the first contest of its kind.

Get on the shelf launched in January of this year, with contestants sending in videos of their latest inventions to be voted on by the public.  During contest voting, nearly 95 percent of the participants received a vote via Facebook or text.

We thank all contestants for their participation and enthusiasm for the "Get on the shelf" contest. We congratulate them for their creativity and passion. We are amazed by the visibility so many have received for their products, whether online or through traditional news media, and we hope the boost in awareness and the GOTS experience accelerate their businesses going forward.

The Walmart Get on the shelf team

For more information, visit http://getontheshelf.com.