Monday, November 26, 2012

Targeting @WalmartLabs

Humans are profoundly efficient at ingesting a wide array of disparate signals and internally aggregating these patterns together to form “profiles” of objects, events and actions that can be recalled when confronted with analogous situations in the future. 

Consider the following:  

Most of us are, from a young age, quite comfortable playing catch with a baseball.  The brain has been exposed to catching baseballs under different lighting conditions and at the various velocities and spins with which a partner might toss the ball, and has consequently established mental profiles, of varying resolution, of what is needed mechanically to catch the ball.  Now, if someone were to toss you a dog bone in lieu of the traditional baseball, you’d likely be able to catch this aspherical object with ease (assuming it wasn’t the femur of an extinct mastodon!). 

Fast forward to the modern world, in which the amount of data generated in a day or two is equivalent to all the information generated up until the beginning of the 21st century. Humans are often overwhelmed when reasoning under this bombardment of signals and often rely on machines to comb through the vast amounts of data and form aggregates that humans can leverage as useful information. 

This explosion in data is particularly evident in modern eCommerce, for example at Walmart.com.  Customers are quite good at recognizing products and brands for which they already have an affinity.  However, as the number of items in retailers’ catalogs seems to increase by an order of magnitude every few years, it becomes challenging, especially in the digital world, to help customers sift through the vast space of items they and their families might enjoy.

Fortunately, big data tools, largely developed to process the immense amounts of data generated by the consumer web, allow us within @WalmartLabs to improve the online customer shopping experience.  These tools, in conjunction with the appropriate machine learning and information retrieval methodologies, can profoundly improve the eCommerce shopping experience by helping customers navigate through the noise of millions, if not tens of millions, of items and by presenting a set of relevant, delightfully discovered products that an individual might enjoy.

The targeting team within @WalmartLabs ingests just about every clickable action on Walmart.com, along with what individuals buy online and in stores, trends on Twitter, local weather deviations, and other local external events such as the San Francisco Giants winning the World Series.  We capture these events and intelligently tease out meaningful patterns so that our millions of Walmart.com customers have an individually personalized shopping experience.

Given the enormous amount of data collected, the question naturally arises of what to do to understand all of it.  Our big data tools help us personalize the shopping experience, and our psychological analysis helps us dissect the deeper meaning behind patterns in the data.  We apply behavioral economics to find clarity behind both the rational and irrational behavior shoppers exhibit. By integrating a psychological approach, we provide a holistic experience that takes in data and reflects an accurate picture of customer activity and of what consumers might be seeking.

Our targeting team, made up of PhDs in computer science, statistics and signal processing, along with behavioral psychologists, has developed methodologies that aggregate all this disparate data (in the “right” way) to formalize “good” recommendations or implicit “neighbors”.  Through collaborative filtering methodologies, e.g., the correlation structure among items, the historical buying patterns of our customers provide wonderful insights into other items these customers are likely to enjoy. However, transaction patterns alone can be enriched by also incorporating the items browsed prior to any purchase, as well as the areas of the United States in which shoppers are located.  Due to block structure (and auto-correlation) in these large covariance matrices, the “relevance” of the recommended items to users ends up being greater than the sum of the parts.  In other words, the model produces more relevant recommendations when trained on all these data sources (appropriately aggregated) than when trained on any one in isolation. 
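
To make the idea of correlation-based collaborative filtering over several signals concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not the team's production method: the toy purchase and browse matrices, the BROWSE_WEIGHT value, and the helper names item_correlation and recommend are all assumptions made for this example. Location data and the covariance block structure mentioned above are left out.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are items.
purchases = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)
browses = np.array([
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
], dtype=float)

# Assumed weighting: a browse counts for half a purchase.
BROWSE_WEIGHT = 0.5


def item_correlation(interactions):
    """Pearson correlation between item columns (items x items)."""
    return np.corrcoef(interactions, rowvar=False)


def recommend(user_purchases, item_sim, k=1):
    """Score items by their correlation with what the user bought,
    skip items already bought, and return the top-k item indices."""
    scores = item_sim @ user_purchases
    scores[user_purchases > 0] = -np.inf
    return [int(i) for i in np.argsort(-scores)[:k]]


# Aggregate the two signals, then derive item-to-item similarity.
combined = purchases + BROWSE_WEIGHT * browses
item_sim = item_correlation(combined)

for user in range(purchases.shape[0]):
    print(user, recommend(purchases[user], item_sim))
```

In this sketch the signals are combined by a simple weighted sum before the correlation is computed; how the sources are really aggregated "in the right way" is not specified here, so treat the weighting as a placeholder.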

In the same spirit in which a customer may seek out the counsel of her closest (and likely most similar) friends when choosing the right pair of headphones, our high-dimensional statistical models can implicitly replicate this process by letting a user inherit items from “close” individuals.  The targeting team will often take each user, represented by every item they’ve ever interacted with on the Walmart.com site as well as by demographic and geo data, and project users and items onto the “appropriate” low-dimensional subspace. 

This projection is built on the premise that users who behave similarly across all their patterns end up near one another in the lower-dimensional space.  Likewise, items which are similar end up near each other.   We are then able to establish who a given user’s “closest” friends are in this subspace (geometrically speaking), and then allow this particular individual to “inherit” the items these very similar neighbors have interacted with, often resulting in spot-on recommendations.
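
The projection and neighbor steps can be sketched as follows. This is a hedged illustration, not the actual model: the choice of a truncated SVD for the projection and cosine similarity for "closeness" are assumptions (the text does not say which techniques are used), the interaction matrix is toy data, and the function names project_users, nearest_neighbors and inherit_items are hypothetical. Demographic and geo features are omitted.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are items.
# Users 0-2 interact with items 0-2; users 3-4 with items 3-4.
interactions = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)


def project_users(matrix, rank):
    """Project users onto a rank-dimensional subspace
    using a truncated SVD (an assumed choice of projection)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def nearest_neighbors(embeddings, user, n_neighbors):
    """Indices of the users closest to `user` by cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)
    sims = unit @ unit[user]
    sims[user] = -np.inf  # a user is not their own neighbor
    return np.argsort(-sims)[:n_neighbors]


def inherit_items(matrix, user, neighbors):
    """Items the neighbors interacted with that `user` has not,
    ranked by how many neighbors interacted with each."""
    counts = matrix[neighbors].sum(axis=0)
    counts[matrix[user] > 0] = 0
    ranked = np.argsort(-counts, kind="stable")
    return [int(i) for i in ranked if counts[i] > 0]


embeddings = project_users(interactions, rank=2)
friends = nearest_neighbors(embeddings, user=1, n_neighbors=2)
print(sorted(friends.tolist()), inherit_items(interactions, 1, friends))
```

Here user 1 never touched item 2, but both of that user's geometric "friends" did, so item 2 is what gets inherited. With real data the rank, the similarity measure and the number of neighbors would all need tuning.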

Regarding the headphones example, such a procedure often results in recommending the niche higher-end headphones rather than lower-quality ones, based upon this projection and neighborhood derivation.

The @WalmartLabs targeting team is able to replicate the age-old process in which people seek out item recommendations from trusted peers and friends, and extend it to the digital age, in which a user receives a highly relevant and delightful headphone recommendation via email while being completely unaware of the implicit, geometrically similar “friends” who helped make this recommendation a reality.