
AI for Retail: A Practical Guide to Modernize Your Retail Business with AI and Automation
Ebook · 521 pages · 5 hours

About this ebook

The coming of the AI revolution in brick-and-mortar retail

In AI for Retail: A Practical Guide to Modernize Your Retail Business with AI and Automation, Francois Chaubard, AI researcher and retail technology CEO, delivers a practical guide for integrating AI into your brick-and-mortar retail business. In the book, you’ll learn how to make your business more efficient by automating inventory management, supply chain, front-end, merchandising, pricing, loss prevention, e-commerce processes, and more.

The author takes you step by step from no AI strategy at all to implementing a robust AI playbook that will permeate your entire organization. In this book, you will learn:

  • How AI works, including key terminology and fundamental AI applications in retail
  • How AI can be applied to the major functions of retail with detailed P&L analysis of each application
  • How to implement an AI strategy across your entire business to double or even triple Free Cash Flow

AI for Retail is the comprehensive, hands-on blueprint for AI adoption that retail managers, executives, founders, entrepreneurs, board members, and other business leaders have been waiting for.

Language: English
Publisher: Wiley
Release date: Apr 6, 2023
ISBN: 9781394184705


    Book preview

    AI for Retail - Francois Chaubard

    SECTION 1

    Introduction to the AI Revolution

    CHAPTER 1

    How AI Has Revolutionized Many Other Industries over the Last 20 Years

    History never repeats itself, but it does often rhyme.

    —Mark Twain

    To understand how AI will transform retail, it's important to understand how AI has already transformed, or is transforming now, countless other industries. I've included some of my favorite examples to give us a clearer picture of what is about to happen to retail.

    Just like in finance, the three prerequisites to look out for in all of these examples are:

    There needed to be a step function in the availability of accurate, real‐time data;

    There needed to be a step function in task‐specific algorithms;

    Someone had to have the courage to prove that the new model would work.

    Once all three of these have occurred, the floodgates open, and AI infiltrates the entire industry at a ferocious pace. Let's dig into some examples.

    Advertising Revolutionized by AI in 2000

    Marketers have always been trying to get us to buy their products, go to their stores, or change our behavior in some way. Classic marketing strategy goes like this: first, identify your target markets/audiences/personas, and then create campaigns per segment, with ad placement targeted at those personas, to maximize conversion rates per dollar of ad spend. For example, if you were selling luxury handbags, you would likely target a core demographic of, say, women between 30 and 70 of higher economic status. You would then try to reach that group as much as possible with every dollar spent on advertising.

    Before 2000, ad placement worked like a sawed‐off shotgun, broadly spraying the same ad at the masses: advertising pantyhose to men, Big Macs to vegans, and brake pads to people who don't own cars. This was true in print, TV, and billboards, and even online, where marketers knew 90% of their ad spend would be wasted on the wrong people.

    Then came the concept of personalization. Before AI, personalization was a marketer's pipedream. For personalization to work, advertisers needed to know many things about you. But before 2000, there was no data set in the world capturing what you, the advertisee, liked and didn't like, where you lived, what car you drove, what your political beliefs were, or your age, sex, religion, etc.

    Until Google. In 1998, Google launched the best search engine of all time, built on an algorithmic advancement called PageRank. Because of this improvement in user experience and AI prowess, customers all around the world freely handed Google their personal data, data that Yahoo, MSN, AskJeeves, or any of the other search engine competitors could not gather, infer, or interpret to a level accurate enough to be able to target. Two years later, Google had access to accurate, real‐time information on hundreds of millions of people. With Google Maps, Chrome, and the DoubleClick acquisition, they had more data on you than the US government. They knew where you were right now, where you wanted to go, and what you wanted to buy. Sometimes they knew it even sooner than you did! With AdWords/AdSense, they mined that data and gave marketers the ability to turn their sawed‐off shotgun into a sniper rifle, which revolutionized advertising forever. For the first time in history, marketers could target Vietnamese American males between 20 and 22 in Des Moines searching for ice cream shops right now and hit them with an ad in a millisecond, which dropped the cost‐per‐click (CPC), or the cost to get someone to click on the ice cream ad, by 100 to 1,000 times.

    Each time you search on Google or load a page powered by AdSense, Google is figuring out which ad to show using AI models that try to maximize the probability that you will click on that ad and convert. The formulation of this problem is known as the multi‐armed bandit problem, which goes like this: Imagine you are in Vegas with access to millions of slot machines; each one pays out some reward (R_i) with some probability (P_i). But you do not know these numbers for any slot machine before you start pulling, so you have to explore and then exploit. You start with one slot machine and see how often it pays out, explore another sometime later, pull that one a few times, and see how often it pays out. This is repeated until you have explored enough slot machines and maybe found a few that pay out really well; then, instead of exploring more and more, you slow down and start exploiting the slot machines you know pay out well.
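
    To make the explore‐then‐exploit idea concrete, here is a minimal sketch of one classic strategy for the multi‐armed bandit problem, epsilon‐greedy: explore a random machine a small fraction of the time, and otherwise exploit the best machine seen so far. The payout probabilities and the epsilon value below are invented for illustration, not taken from any real ad system:

        import random

        # Illustrative payout probabilities for three slot machines (unknown to the player).
        true_payout_probs = [0.02, 0.05, 0.11]

        counts = [0, 0, 0]          # times each machine was pulled
        rewards = [0.0, 0.0, 0.0]   # total reward observed per machine
        epsilon = 0.1               # fraction of pulls spent exploring

        def estimated_value(i):
            return rewards[i] / counts[i] if counts[i] else 0.0

        for pull in range(10_000):
            if random.random() < epsilon:
                arm = random.randrange(3)                  # explore: try a random machine
            else:
                arm = max(range(3), key=estimated_value)   # exploit: best machine so far
            reward = 1.0 if random.random() < true_payout_probs[arm] else 0.0
            counts[arm] += 1
            rewards[arm] += reward

        # Estimates approach the true payout rates for well-explored machines.
        print([round(estimated_value(i), 3) for i in range(3)])

    Ad platforms do essentially the same thing, with ads in place of slot machines and clicks in place of coins.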

    Similarly, with advertising, the platform starts with a new ad, sees how often people similar to you click on it, and then sees how well other ads work on you. This is repeated until it figures out what you are likely to click on and what you are not, customizing every single page you see on the web to maximize the click‐through rate (CTR), or the percentage of the time a user clicks on an ad. One of the most popular techniques to solve this is a class of AI algorithms called collaborative filtering.
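
    As a toy illustration of the collaborative filtering idea (the click matrix below is invented; real systems work with billions of rows), you can score an unseen ad for a user by how similar it is, across everyone's click history, to ads that user already clicked:

        import numpy as np

        # Rows = users, columns = ads; 1 means the user clicked that ad. Invented toy data.
        clicks = np.array([
            [1, 1, 0, 0],
            [1, 1, 1, 0],
            [0, 0, 1, 1],
        ], dtype=float)

        # Cosine similarity between ads (columns of the click matrix).
        norms = np.linalg.norm(clicks, axis=0)
        sim = (clicks.T @ clicks) / np.outer(norms, norms)

        # Score unseen ads for user 0 by similarity to ads they already clicked.
        user = clicks[0]
        scores = sim @ user
        scores[user > 0] = -np.inf    # don't re-recommend ads already clicked
        print(int(np.argmax(scores))) # ad 2: clicked by a user with similar taste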

    Since then, Facebook, Instagram, Snapchat, Yelp, Amazon, TikTok, YouTube, and many other tech platforms have employed similar recommendation systems to grow their advertising revenue.

    This innovation dropped the cost of advertising so low that only tech companies can really play in the advertising business anymore, forcing many classic ad‐driven industries, such as print newspapers, billboards, radio, and television, into turmoil and some into bankruptcy, and giving rise to the modern duopoly that is Facebook and Google.

    For more information on this, watch the documentary The Social Dilemma.

    Baseball Revolutionized by AI in 2002

    Moneyball is one of my favorite movies. It's the true story of how AI irrevocably transformed Major League Baseball (MLB) in 2002. It centers on Oakland A's General Manager Billy Beane (see Figure 1.1), who lost all of his best players in 2001, leaving the team likely to finish last in the league.


    FIGURE 1.1 Billy Beane; Oakland Coliseum, 1989.

    Source: Silent Sensei / Wikimedia Commons / CC BY 2.0.

    After losing a trade negotiation with another team, he noticed the opposing GM continuously consulting a young economics major from Yale. Beane poached him to aggregate player statistics from the major and minor leagues and deploy AI algorithms on this data to select and manage a winning team on a shoestring budget. This was the first time AI was used to scout players and manage a baseball team. Beane's AI‐scouted team went on to break a number of baseball records, best summarized by the Boston Red Sox owner in the last scene of the movie:

    For $41m you built a playoff team. You lost Damon, Giambi, Isringhausen, Pena, and you won more games without them than you did with them. You won the exact same number of games as the Yankees, but the Yankees spent $1.4 million per win and you paid $260,000. I know you're taking it in the teeth out there but the first guy through the wall, he always gets bloody…always. This is threatening not just their way of doing business, buddy, but really it's threatening their livelihood, it's threatening their jobs, and every time that happens whether it's a government or way of doing business or whatever it is, the people who are holding the reins, who have their hands on the switch, they go batshit crazy. I mean anybody who's not tearing their team down right now, and rebuilding it using your model, they're dinosaurs. They will be sitting on their ass on the sofa in October watching the Boston Red Sox win the World Series.

    The Red Sox owner was right. With AI, the Boston Red Sox won the World Series in 2004, for the first time since 1918. Over the next four years, every single MLB team hired swarms of statisticians and data scientists to mimic what Billy Beane and the Red Sox did.

    The three step functions that enabled this AI revolution were as follows. First, the Society for American Baseball Research (SABR) was established in 1971 and began recording and publishing all player statistics and metrics (accurate, real‐time data). Second, in 1978, a guy named George William James (Bill James), an American baseball writer, historian, and statistician, started putting out an annual booklet called The Bill James Baseball Abstract, which described a new approach to running a baseball team, called sabermetrics in reference to SABR (task‐specific algorithms). And third, in 2002, a guy named Billy Beane was crazy and desperate enough to actually try it, risking his entire career on the idea (the courage to prove the model). Because of these three events, baseball will never be the same.

    Computer Vision Revolutionized by AI in 2012

    This is the most important advancement in AI that I will cover in this book. In 2006, my Computer Vision Lab at Stanford began pulling together a huge data set of images downloaded from the Internet and labeling them. This data set is now famously called ImageNet: roughly 14 million images across some 22,000 categories of cars, dogs, cats, stop signs, etc. Since 2010, Stanford has used this data set for a global computer vision competition (called ILSVRC, the ImageNet Large Scale Visual Recognition Challenge) to provide the AI community a clear measure of how strong our AI models are relative to one another. Almost every computer vision lab in the world submits its best AI model each year to try to win first prize. Winning this competition is a huge deal for computer vision researchers.

    The grading of the competition goes like this. There are 1,000 possible classes, or types of objects, you want your AI to accurately identify, such as cats, dogs, stop signs, etc. Your submitted AI model is given a few thousand images for which the correct answer is hidden, and the AI has to predict which of the 1,000 classes is present in each image. Stanford knows the true answers, computes each submission's score against them, and reports the results.

    In 2010, the @1 Accuracy of the winning solution was around 40%. This means that if the AI gets to make only one guess out of 1,000 possible classes, it picks the right class only 40% of the time and guesses wrong the other 60%. This is really poor performance and would not be usable by most applications. Take self‐driving cars, for example. I certainly would not get in a self‐driving car that would miss 60% of stop signs, would you?
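
    In code, @1 Accuracy is simply the fraction of images whose single highest‐scoring guess matches the hidden label. A minimal sketch, with made‐up scores over 5 classes instead of the competition's 1,000:

        import numpy as np

        # Invented example: model scores for 4 images over 5 classes.
        scores = np.array([
            [0.10, 0.70, 0.10, 0.05, 0.05],
            [0.50, 0.20, 0.10, 0.10, 0.10],
            [0.20, 0.20, 0.40, 0.10, 0.10],
            [0.25, 0.25, 0.20, 0.20, 0.10],
        ])
        true_labels = np.array([1, 0, 3, 0])      # hidden from competitors

        top1 = scores.argmax(axis=1)              # the model's single best guess per image
        accuracy = (top1 == true_labels).mean()   # fraction of images guessed correctly
        print(f"@1 accuracy: {accuracy:.0%}")     # 75% on this toy data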

    In 2012, however, three authors who are now famous in the AI community (Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton of the University of Toronto) published one of the most important papers in the field, describing the model now known as AlexNet. They won the ILSVRC 2012 competition by a huge margin, bringing the state‐of‐the‐art @1 Accuracy up to 63.3%, roughly 10 points ahead of the next best entry! This was almost too good to be true. Most labs thought they cheated. They did not. Their AI was just that superior to what every other lab was doing. With this tremendous success, these three gave birth to the AI renaissance we are experiencing today. Using their technique, inventors have been able to produce products such as Alexa, Siri, self‐driving cars, and more. I will explain this model in great detail in the theory section of this book, but to provide some context now, they were the first team to (1) ingest huge amounts of data into their AI models, leveraging GPUs to make the computation very fast, (2) feed it into a very deep model (puny by today's standards, but very deep for its time), and (3) use backpropagation, the technique from the 1986 paper Hinton co‐authored, to optimize this very deep model with stochastic gradient descent (SGD). This is now the framework for all modern AI. By doing so, the AI model was able to learn filters on images that mimic what neuroscientists have discovered in the occipital lobe of our brains, where each box in Figure 1.2 represents a detector the AI is looking for (as shown by neuroscientists David Hubel and Torsten Wiesel in the 1960s).
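
    To make steps (1) through (3) concrete, here is a minimal PyTorch sketch of that same recipe: a small convolutional network optimized by backpropagation and SGD, on a GPU when one is available. The tiny architecture and the random images are stand‐ins for illustration, not AlexNet or ImageNet themselves:

        import torch
        import torch.nn as nn

        device = "cuda" if torch.cuda.is_available() else "cpu"

        # A tiny stand-in for a deep convolutional network.
        model = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 10),
        ).to(device)

        optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
        loss_fn = nn.CrossEntropyLoss()

        # Random images/labels as a stand-in for a huge labeled data set.
        images = torch.randn(64, 3, 32, 32, device=device)
        labels = torch.randint(0, 10, (64,), device=device)

        for step in range(100):
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()    # backpropagation computes the gradients
            optimizer.step()   # SGD nudges every parameter downhill

    From AlexNet to today's largest models, training is essentially this loop, just with vastly more data, parameters, and GPUs.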

    While this was a huge step forward, to level‐set, 63% @1 Accuracy is still not that good: 37% of the time, the AI gets it wrong. If this were a self‐driving car looking for stop signs, it would blow right through 37% of them. So, not usable. But with this technique, they showed the AI community how to get there. Since then, we have continued to increase the size and sophistication of the models and have pushed the art of the possible on performance (see Figure 1.3).

    Today, the state‐of‐the‐art accuracy on ILSVRC2012 is 91% @1 Accuracy, almost another 30 points of improvement. But still, 9% of the time the AI makes an error. To be fair, human‐level performance on this task is about 85%, so the AI is now well beyond humans, but still not perfect. To understand how hard the last 9% is, consider that the AI needs to tell the difference between a Siberian Husky and an Eskimo Sheep Dog. Can you tell which is which (see Figure 1.4)?


    FIGURE 1.2 Image filter for AI viewing images.

    Source: Isha et al., 2019 / IEEE, CC BY 4.0.

    While the AlexNet paper made many novel contributions, as described earlier, the biggest one was leapfrogging from CPUs to GPUs. This was very non‐trivial and difficult for the team to do back then. To program these models on GPUs, they had to learn a new language called CUDA (Compute Unified Device Architecture), which had only been available since 2007, and even then was not fully featured. Because Alex Krizhevsky knew how to program GPUs, he was able to make this jump, expand the compute power available to his AI models, and train his network faster than any method possible before. Note that Alex's model was so big (at the time) that he needed two GPUs to fit all of its learnable parameters. It took five to six days to train on two GTX 580 3GB GPUs. Today, we would consider this a very small model compared to the thousands of GPUs used to train modern AI models. We will discuss this more in the theory section of this book.
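
    For a sense of how much easier that GPU jump has become since 2012, here is an illustrative PyTorch sketch. One caveat: AlexNet split the model's parameters themselves across two GPUs (model parallelism), whereas the one‐line convenience shown here, nn.DataParallel, replicates the model and splits each batch of data instead:

        import torch
        import torch.nn as nn

        model = nn.Linear(4096, 4096)  # stand-in for a large layer

        device = "cuda" if torch.cuda.is_available() else "cpu"
        model = model.to(device)              # one line to move onto a GPU
        if torch.cuda.device_count() >= 2:
            model = nn.DataParallel(model)    # split each batch across the GPUs

        x = torch.randn(8, 4096, device=device)
        y = model(x)                          # same call on CPU, one GPU, or several
        print(y.shape)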


    FIGURE 1.3 AI image classification accuracy, 2013–2022.

    Source: https://blog.roboflow.com/introduction-to-imagenet/ last accessed 20 Dec 2022 / Jacob Solawetz.


    FIGURE 1.4 (Left) Siberian husky, (Right) Eskimo sheep dog.

    Source: Randi Hausken / Wikimedia Commons / CC BY-SA 2.0; sir_j / Adobe Stock.

    Speech‐to‐Text Services Revolutionized by AI in 2013

    Every time you see captions on TV or enter a courtroom with a person typing ferociously, someone is listening to speech and transcribing what is being said while trying not to make any mistakes. This industry generates $9.4B in global annual revenue and is growing 23% a year.¹ Numerous companies, such as Otter.ai, Datasaur, Deepgram, Uniphore, and Verbit, are adopting AI to do this faster and more accurately than humans today.

    So let's look for the three prerequisites again here. First, in the late 2000s, news agencies started posting all their content online: video, audio, and subtitles. AI companies took troves of this information (called parallel audio/text data) from CNN, Fox News, and many others, and used it to train very sophisticated ASR (automatic speech recognition) pipelines. In 2014, Amazon launched Alexa, one of the first accurate speech recognition devices in the world, powered by modern AI. Since then, AI has swept the field, powering Siri, voice‐to‐note‐taking apps, robotic call center chatbots, and more.

    While we could discuss many papers that enabled this transformation, I would like to highlight the 2013 paper Speech Recognition with Deep Recurrent Neural Networks, the first successful application of this new class of AI algorithms to speech. In this paper, the same lab and advisor (Geoffrey Hinton) that published AlexNet used the same general technique, but instead of a convolutional neural network over an image, they used a bidirectional recurrent neural network (RNN) over a spectrogram. A spectrogram is a frequency‐versus‐time representation of a sound file. For the technical readers, this is the short‐time Fourier transform applied to the sound file, basically turning a sound file into an image.

    By converting the sound file into this representation, the AI can map from input to output much more easily and therefore learns much more quickly.
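
    A minimal sketch of that conversion, assuming a mono waveform sampled at 16 kHz (a pure 440 Hz tone stands in for recorded speech here):

        import numpy as np
        from scipy.signal import spectrogram

        sample_rate = 16_000                    # assumed; common for speech audio
        t = np.linspace(0, 1.0, sample_rate, endpoint=False)
        waveform = np.sin(2 * np.pi * 440 * t)  # stand-in for recorded speech

        # freqs: frequency axis, times: time axis, power: the "image" the RNN reads.
        freqs, times, power = spectrogram(waveform, fs=sample_rate,
                                          nperseg=400, noverlap=240)
        print(power.shape)  # (frequency bins, time frames): a 2-D array, like an image

    The resulting 2‐D grid of energy per frequency per time step is exactly the image‐like input the recurrent network reads.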

    The next step function in ASR was DeepSpeech 2 from Andrew Ng's team at Baidu SVAIL. This paper was one of the first to use deep learning approaches and huge amounts of data to recognize speech in a completely end‐to‐end manner, doing away with many of the hand‐engineered components used previously. It achieved a 13.25% WER (word error rate). By adding more and more data and increasing the model size, today we are closer to 2.5% WER, which is far beyond the ability of humans on this task, roughly 5.8% WER (see Figure
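
    For reference, WER is the word‐level edit distance between the system's transcript and a human reference, divided by the number of reference words: WER = (substitutions + deletions + insertions) / reference length. A minimal sketch:

        def word_error_rate(reference: str, hypothesis: str) -> float:
            ref, hyp = reference.split(), hypothesis.split()
            # Levenshtein distance over words via dynamic programming.
            d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
            for i in range(len(ref) + 1):
                d[i][0] = i
            for j in range(len(hyp) + 1):
                d[0][j] = j
            for i in range(1, len(ref) + 1):
                for j in range(1, len(hyp) + 1):
                    cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                    d[i][j] = min(d[i - 1][j] + 1,         # deletion
                                  d[i][j - 1] + 1,         # insertion
                                  d[i - 1][j - 1] + cost)  # substitution
            return d[len(ref)][len(hyp)] / len(ref)

        # One dropped word out of six reference words: WER of about 16.7%.
        print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))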
