Summary
In this episode of the AI Engineering Podcast, machine learning engineer Shashank Kapadia explores the transformative role of generative AI in retail. Shashank shares his journey from an engineering background to becoming a key player in ML, highlighting the excitement of understanding human behavior at scale through AI. He discusses the challenges and opportunities presented by generative AI in retail, where it complements traditional ML by enhancing explainability and personalization, predicting consumer needs, and driving autonomous shopping agents and emotional commerce. Shashank elaborates on the architectural and operational shifts required to integrate generative AI into existing systems, emphasizing orchestration, safety nets, and continuous learning loops, while also addressing the balance between building and buying AI solutions, considering factors like data privacy and customization.
Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
- Your host is Tobias Macey and today I'm interviewing Shashank Kapadia about applications of generative AI in retail
- Introduction
- How did you get involved in machine learning?
- Can you summarize the main applications of generative AI that you are seeing the most benefit from in retail/ecommerce?
- What are the major architectural patterns that you are deploying for generative AI workloads?
- Working at an organization like Walmart, you already had a substantial investment in ML/MLOps. What are the elements of that organizational capability that remain the same, and what are the catalyzed changes as a result of generative models?
- When working at the scale of Walmart, what are the different types of bottlenecks that you encounter which can be ignored at smaller orders of magnitude?
- Generative AI introduces new risks around brand reputation, accuracy, trustworthiness, etc. What are the architectural components that you find most effective in managing and monitoring the interactions that you provide to your customers?
- Can you describe the architecture of the technical systems that you have built to enable the organization to take advantage of generative models?
- What are the human elements that you rely on to ensure the safety of your AI products?
- What are the most interesting, innovative, or unexpected ways that you have seen generative AI break at scale?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI?
- When is generative AI the wrong choice?
- What are you paying special attention to over the next 6-36 months in AI?
Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
[00:00:05]
Tobias Macey:
Hello, and welcome to the AI Engineering podcast, your guide to the fast-moving world of building scalable and maintainable AI systems. Your host is Tobias Macey, and today I'm interviewing Shashank Kapadia about applications of generative AI in retail. So, Shashank, can you start by introducing yourself?
[00:00:29] Shashank Kapadia:
Yeah. Hey. Hi, Tobias. I'm Shashank. I'm currently a machine learning engineer. I've been in the domain for close to ten years now. Throughout my journey, I've had hands-on experience dealing with these problems, right from a small-sized company to Fortune 500 companies to, more recently, a Fortune One company. I really enjoy what I do, and I'm excited to be here and share my experience with the audience.
[00:00:53] Tobias Macey:
And do you remember how you first got started working in ML?
[00:00:56] Shashank Kapadia:
Yeah. That's funny you asked that. So I essentially come from an engineering background and later did my graduate studies in operations research. So, from an academic standpoint, I was very much math-heavy, but dealing very much within the deterministic and structured way of solving problems. I think what really caught me was when I started to understand how recommendation systems work, right, moving from "people who bought x are also buying y," which is to an extent basic statistics. But then the system starts to understand the intent behind it and what customers are not telling you, and you still are able to identify those needs and understand the subtle patterns. It really just got me excited. And I think what sealed the deal for me to be in ML is that it's not just about the technology, but understanding how human behavior can be understood at scale and then be used to help millions of people find what they need faster, save them time, reduce friction, or in any capacity make their lives better. So to me, it feels like we are kind of in this Internet era of 1995.
We know a lot of things are going to change. Perhaps at this point, we don't know just how, or maybe are in the position to steer it in the right direction.
[00:02:18] Tobias Macey:
Yeah. Absolutely. It's definitely an interesting time in the world, and we're absolutely in the very earliest stages of whatever change is going to happen. And it's hard to know where we're going because even just in the past six months, the pace of change continues to accelerate and build up. And there's a lot of hype being thrown around, and it's hard to differentiate between what is marketing, what is fear and uncertainty, and what are actual legitimate claims.
[00:02:48] Shashank Kapadia:
Yeah. No. Absolutely. It's totally a situation where it feels like either we are building a plane while in flight or building the ship while floating. So there's certainly noise around, and I think there are also some good signals coming out of it. So it's just, I think, a job of separating the signal from the noise.
[00:03:06] Tobias Macey:
And so to that point, given the particular area that you're focused on as far as retail and ecommerce, and given that you're working for Walmart, I'm wondering if you can just talk to some of the ways that you're seeing the conversation evolve around when and how to apply these new and upcoming generative AI capabilities. What are the situations where the risk, uncertainty, or cost are still too high given the volume and the level of visibility, and you still just rely on what I'm going to term traditional ML use cases, whether that's linear regressions or deep learning or what have you?
[00:03:50] Shashank Kapadia:
Yeah. No. I think that's an excellent broad question. And before I jump into it, just to slide in a disclaimer here: the comments and opinions that I'm going to share throughout this conversation are coming from my own personal capacity and are not representative of Walmart or my previous employers; they don't reflect the views of them and aren't endorsed by Walmart in any capacity. So, yeah, I wanted to have that out there. But when you think about applications of generative AI within retail and ecom, or just generally speaking, when do we know which type of solution or algorithm to use? It's certainly a space where a lot of experimentation is happening, because, again, we aren't fully aware of the different capabilities that generative AI can support us with. So there's certainly room for us to explore and understand what different layers we can extract out of it. But that being said, there are certainly problems that aren't quite suitable for generative AI, at least in its capacity as of today. A few of those come to mind. Within retail and ecom, there are optimizations that you are running. Right? So, hey, if you're running a store chain and you want to understand how my assortment needs to be optimized and how my shelf space needs to be utilized, that's very much an optimization problem that is still solved not with generative AI but with mathematical modeling. But that being said, I think even in those types of applications where we know the problem itself requires a different approach, generative AI is helping by complementing the outputs coming out of the solution.
Just to give you an example, suppose you are running a store chain and have an optimization problem that determines how my shelf space needs to be allocated, for how long, and for what type of products. Now once you have run that optimization and have got the results from it, oftentimes those results are consumed by store managers or stakeholders who aren't necessarily coming from a math or technical background. And what Gen AI has helped with is layering explainability on top of that output. Right? So even though Gen AI is not what's optimizing the solution, it certainly can consume the outputs from the solution and explain to the end users why certain decisions make sense. So it's, again, just one example in which we can't really directly plug in Gen AI as of now, but it can certainly serve as a complement. But there are also areas where integrating Gen AI right at the foundational layer has been super helpful. One of those scenarios that comes to mind is intent understanding.
For the longest time when we think about ecom, users would go and search for the kind of products they're looking to buy, like milk or ice cream or anything else. But the shift that we are observing with Gen AI is that customers are not searching for products anymore; they essentially describe a problem or a situation. Think about it as, like, hey, I need something for my kids' science project, which is due tomorrow. Now once this situation comes in from the customer, how AI can help is really figuring out, like, hey, the customer would need a poster board, markers, maybe a display stand. So it's essentially moving from keyword-based matching to actual comprehension.
That's one area where I can see a lot of potential going forward. The second is around dynamic personalization. So, hey, not just "recommended for you," but truly dynamic experiences, right from the product description to what image the customer sees, and every piece of copy that adapts to who's reading it. So if a professional chef or a college student is in the market to purchase a knife, they may have a completely different experience of the same product when they are browsing for it. And I suppose the third area that seems to have a lot of potential is predictive commerce. Again, this is more of a sleeper hit nobody talks about, but AI that knows what you need before you do. So not from a creepy tracking perspective, but understanding your patterns. Think about it like, hey, it's February. You bought Valentine's decorations last year. Here is a subtle reminder that you may be in the market for them. Your printer ink lasts about three months, and you purchased a printer around that period, so you may be in the market to buy that. So how can we have those predictive commerce experiences for the customer?
That's another layer where I think Gen AI can help in pushing the envelope further down the road.
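[Editor's note: the "science project" example of intent understanding can be sketched in code. This is a hypothetical illustration only — the function names, prompt wording, and stubbed model response are assumptions, not Walmart's actual system; in practice the stub would be a real LLM call.]

```python
# Hypothetical sketch of "situation -> shopping list" intent expansion.
# The model call is stubbed out; the prompt and parsing show the shape
# of moving from keyword matching to comprehension.

def build_intent_prompt(situation: str) -> str:
    # Ask the model for a structured, line-oriented list rather than free
    # text, so the output can be parsed and matched against a catalog.
    return (
        "A shopper describes a situation, not a product. "
        "List the concrete products they likely need, one per line.\n"
        f"Situation: {situation}\nProducts:"
    )

def parse_product_lines(model_output: str) -> list[str]:
    # Normalize the model's line-oriented answer into catalog search terms.
    return [line.strip("- ").strip().lower()
            for line in model_output.splitlines() if line.strip()]

# Stand-in for a real LLM response to the prompt above.
fake_response = "- Poster board\n- Markers\n- Display stand"
needs = parse_product_lines(fake_response)
print(needs)  # ['poster board', 'markers', 'display stand']
```

The parsed terms would then feed the existing retrieval stack, which is what keeps the generative layer complementary to, rather than a replacement for, the catalog search.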
[00:09:10] Tobias Macey:
Given the fact that you do have already a substantial investment in various types of machine learning systems, whether that be recommender systems or predictive analytics for things like supply chain issues or, to your point, some of the proactive notifications of, okay, you bought this printer three months ago. Let me go ahead and see if you want to purchase some replacement ink for it. How has the injection of generative AI workloads changed the architectural requirements, and what are some of the ways that you've been able to build on top of existing systems that you've already invested in?
[00:09:53] Shashank Kapadia:
Yeah. No. I think that's a really good point that you bring up, and I'd break it down into two parts, I suppose. One is the architectural patterns that we would now need to start adopting as we integrate generative AI into our solutions. And the second part is more from a machine learning operations standpoint. But even before that, when we think about fusing Gen AI into any of our existing solutions, I think the key question here is the value. And more often than not, it's hard to quantify that. But generally speaking, what allows any solution to go from its POC stage into production is having a good understanding of the value it's driving. Right? So, yes, there's a cost part to it, but then there's also an additional revenue part, which, known beforehand or at least estimated, can help you understand the value proposition of the effort. Now, again, from a cost perspective, it's not just the infrastructural cost of hosting and running the solutions in production, but also personnel cost, resourcing cost, opportunity cost. So there are quite a few different types of costs that you may want to take into account when it comes to fusing Gen AI. And that, again, is not just the case for Gen AI; for any improvement or enhancement that we are planning on doing, whether with a conventional software piece or a conventional machine learning piece, the value proposition needs to make sense before anything goes into production. Now that being said, from an architectural pattern standpoint, especially when we think about generative AI, the fundamental is that it's probabilistic in nature.
So unlike the patterns that we would follow for conventional software, where things are mostly deterministic, there are a few things that we would want to have in place from an architectural pattern standpoint that suit Gen AI well, but also perhaps put the companies starting to use them in a comfortable position. One of them is the orchestra of models, or perhaps we can think of it as agents. So instead of having one giant AI solution trying to do everything, you have specialized models that are working with each other. So let's say you have one model that handles the understanding behind what the users are asking for, another that goes and finds the products from your inventory, and another that is layering personalization onto the overall experience.
And they all pass information between each other, like, again, musicians in an orchestra. Right? So one thing from a pattern or architectural standpoint is that as we are designing these systems and experiences, we want to start thinking about them more as an orchestra of models rather than one model doing it all. The second is the safety net, which comes from the solutions being probabilistic. Every AI output that goes out passes through multiple checkpoints. Think of it as someone passing through airport security: there are multiple layers, and each is tasked to catch different things.
Hey, the information that's going out, is it factually correct? Is it appropriate to the user? And that appropriateness can come in a variety of forms, being age appropriate or content appropriate. Will it resonate with the customer? The point being, you want to have that safety net, especially with probabilistic models, to make sure that the information going out is in line with what you expect it to be. So alignment is the key here. And the third piece is the learning loop, and this is really the secret sauce. Again, it has been the secret sauce for conventional machine learning models as well: the more data and feedback you collect, the better your models can be. And it's very important for generative AI solutions as well, if not more, because every interaction teaches the system something. Customers bought the recommended product? That's a win; reinforce that pattern. They immediately returned it? Learn from that too, because here's the key: the learning happens continuously and not so much in a big-batch fashion. So the system is literally getting more intelligent, for lack of a better word, with every hour or at whatever frequency we ingest feedback into it. So, again, these are not net new patterns as such. These are universal patterns that we would want to have anyway, but they play a very key role, especially with Gen AI being probabilistic in nature. Now the second piece, which is around machine learning operations, can also be divided into two parts, which is, hey, there are things that remain the same no matter what type of solution we are deploying. And then there are things that do change a bit with generative AI coming into play. So what stays the same? Again, all the boring stuff that actually matters.
You have data pipelines, model monitoring, A/B testing, the infrastructure. All of that, which is super critical, still remains the case when we are dealing with a generative AI solution as well. If anything, it becomes more important because of the probabilistic nature. We can't just deploy and forget. Also part of it is, again, garbage in, garbage out, which isn't changing when it comes to Gen AI either. It doesn't automatically fix bad data. Now, what has changed? In my view, the speed of everything. I remember a few years back, we used to spend months doing a POC on a new recommendation model or any classification model.
And now someone can prototype a new experience in days, if not in an afternoon. And so the barrier to experimentation has collapsed. Being able to have an infrastructure that supports that is the key here, because one can run a quick POC to understand if there is value in it before spending resources to put something into production. The other big shift is democratization. Earlier, POCs and experimentation were limited to people with a certain technical background. With generative AI, and especially with the chat-based APIs, people not coming from a conventional math or data science or machine learning background can also POC things and see if there is value, and then bring it as an item for the technical teams to explore. Right? So a marketing person can write a prompt and create something that would have required teams of engineers to do. So there is chaos for sure, and that, again, requires a new type of governance, which is another topic, and quite an extensive one, when we think of Gen AI.
But the skill sets are evolving too. Right? I mean, you have prompt engineering, which is becoming as important a part of the craft as traditional ML engineering used to be. So, to sum it up from an MLOps standpoint: yes, there are certainly things which are pretty much unchanged, if not more critical, and then there are a few things that are changing. Speed would be one of them.
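[Editor's note: the three patterns described above — an orchestra of specialized models, safety-net checkpoints, and a safe fallback — can be sketched as a toy pipeline. Every stage here is a hypothetical stand-in; real systems would call separate models and far richer checks.]

```python
# Minimal sketch of the "orchestra of models" pattern with safety checkpoints.
# Each stage is a stub standing in for a specialized model.

def understand(query: str) -> dict:
    # Stand-in for an intent-understanding model.
    return {"intent": query.lower()}

def retrieve(intent: dict) -> list[str]:
    # Stand-in for a retrieval model over a (tiny, fake) catalog.
    catalog = {"knife": ["chef knife", "paring knife"]}
    return catalog.get(intent["intent"], [])

def personalize(items: list[str], profile: str) -> list[str]:
    # Stand-in for a personalization model: e.g. reorder for a pro chef.
    return sorted(items, reverse=(profile == "chef"))

SAFETY_CHECKS = [
    lambda out: len(out) > 0,                          # non-empty answer
    lambda out: all(isinstance(i, str) for i in out),  # well-formed items
]

def pipeline(query: str, profile: str) -> list[str]:
    out = personalize(retrieve(understand(query)), profile)
    # Every output passes every checkpoint before reaching the customer;
    # a probabilistic system needs a deterministic safe default.
    if not all(check(out) for check in SAFETY_CHECKS):
        return ["fallback: show popular items"]
    return out

print(pipeline("knife", "chef"))       # ['paring knife', 'chef knife']
print(pipeline("unknown", "student"))  # ['fallback: show popular items']
```

The learning loop would close this sketch by feeding purchase/return signals back into the stages continuously rather than in big batches.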
[00:17:26] Tobias Macey:
To your point of some of these generative capabilities being incorporated into processes like experimentation, prioritization, or some of the implementation work that feeds into these AI systems: one of the ways that I've been thinking about the bifurcation of AI in a broader system is whether you're building for AI, where you're actually constructing all of the data pipelines, building all of the inference systems, the context management, or building with AI, where you're actually using a generative model in the process of writing code, generating diagrams, or, to your point, some of the marketing team being able to generate copy and experiment with that. And I'm wondering how you're seeing those different roles of AI in the broader system manifest in terms of how you're thinking about the work to be done as somebody who is very closely linked into the actual machine learning cycles?
[00:18:25] Shashank Kapadia:
Yeah. I think you've basically put it very rightly here. A few years back, when we used to think about AI or machine learning, we used to think of it as a vertical in line with the other verticals that a business operates with. So you have a vertical of AI or machine learning, a vertical of sales, a vertical of marketing, engineering, and so forth. And I think what has shifted is this paradigm where AI is no longer a vertical but essentially a horizontal layer that touches upon every aspect of a business. I think Jeff Bezos, in one of his interviews some time back, put it very rightly: how each of the verticals in the business utilizes AI is different. For example, from a marketing standpoint, you could be using AI as a query interface to give you an understanding of how your marketing ads are doing. So you could think about having an AI solution that is plugged into the data systems in the back end, with an interface where you could ask, like, hey, how did my particular ad do over the last seven days or a month? As opposed to, from an engineering standpoint, where we are using AI when it comes to writing a piece of code: we think about something that we would want to have and use the right coding assistant to help us write code that is more or less close to how we would want to have it in production.
Another vertical may have a different use case. So when we think about AI and all these different use cases within our organization, they're all coming from a perspective of helping teams improve their workflow, and different teams have different workflows that they're trying to optimize for. When it comes to being able to support that, that's again a whole different challenge at scale, not just from a cost perspective but also from an infrastructural requirements perspective. I'm happy to go into either of those if that's of interest.
[00:20:41] Tobias Macey:
I think in particular, one of the ways that or one of the areas that's interesting to explore is some of the methods that you found helpful as far as bringing some of these generative models into the inner loop of ideating about the different types of experimentation that you might want to do on some of your existing model environments where maybe you have a recommender system and you want to do some hyperparameter tuning or you want to use the model to help you determine what are some interesting features that you want to do some engineering of and extracting to see how that impacts some of the existing models that you have and managing the experiment design and just some of the ways that the ML has influenced the ways that you think about that work?
[00:21:31] Shashank Kapadia:
Yeah. Absolutely. And I think that can be thought of in two ways as well. Right? So when we think about, let's say, recommendation systems right now, one way to think about having AI help us is: how can we integrate AI within the solution itself? And the second piece is: how can I use AI to help me experiment and run POCs for my solutions? And they both have a certain level of value. Now, if I were to take the case where I'm using AI to help me speed up my overall experimentation, meaning, hey, I have this model, and there has been an influx of new data sources, and I want to understand if I use certain new features in my model, would my overall accuracy or relevancy improve on the metrics that I'm tracking? That piece is where AI is more or less an assistant to you, where, with natural language, given the code base that you already have and the data that you are already using, you can quickly experiment and run a POC by saying, hey, I'm going to pull this model and add additional features to it. Can a rough script be generated? And then you can add in your own pieces of code to run that experiment. So what it has helped with, essentially, is shortening that experimentation window when it comes to running quick POCs to understand if there is potential to go further. On the other side of things, when you have recommendation systems in production, you are also collecting feedback constantly.
And some of that feedback, and some of the challenges that come with it, can be better solved by having the right AI solution integrated into the system itself. Explainability is a really good use case, and a good starting point for how one can integrate Gen AI into recommendation systems without having to change the underlying models or experiment with changing the algorithm itself. It's more about building trust and having a good level of transparency. So the other side of it is integrating AI directly into the system itself, and that usually comes from what problem or challenge we are trying to solve and whether Gen AI is the right solution for that problem.
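[Editor's note: the idea of layering explainability on top of an untouched recommender can be sketched like this. The metadata fields and template wording are hypothetical; in practice the template step is where a generative model would turn the recommender's output signals into a customer-facing rationale.]

```python
# Sketch of adding explainability on top of an existing recommender:
# the underlying model is untouched; a post-hoc layer turns its output
# metadata into a human-readable rationale.

def explain(recommendation: dict) -> str:
    # Collect whichever (hypothetical) signals the recommender exposed.
    reasons = []
    if recommendation.get("bought_together"):
        reasons.append("it is often bought with items in your cart")
    if recommendation.get("past_purchase"):
        reasons.append("you purchased a related item before")
    # Fall back to a generic rationale when no signal is available.
    reason_text = " and ".join(reasons) or "it is popular with similar shoppers"
    return f"Recommended '{recommendation['item']}' because {reason_text}."

rec = {"item": "printer ink", "past_purchase": True}
print(explain(rec))
# Recommended 'printer ink' because you purchased a related item before.
```

Because this layer only consumes the recommender's outputs, it can be A/B tested and iterated on without retraining or re-validating the underlying model.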
[00:24:04] Tobias Macey:
Another interesting angle of your experience is that given the scale of an organization like Walmart and the number of end users that you need to be able to facilitate interactions with, you're going to hit issues of scale much sooner than a start up or a small organization or somebody who's just doing a one off experiment with some of these generative AI models. And so that brings in a lot more due diligence. I'm sure a lot more of, a lot more upfront analysis before you actually even embark on any experimentation. And I'm curious what you have seen as you have started to implement some of these generative AI use cases, if there are any particular bottlenecks that you have run into whether in terms of infrastructure capacity or, some of the challenges around data scale or structuring of data that have led to issues and just some of the problems that you have maybe run into sooner than other people who are going along the same journey?
[00:25:13] Shashank Kapadia:
Yeah. I think scale is certainly something that applies for companies like Walmart, Google, Meta. The scale here essentially comes from two sides. One is how many users are engaging with the services and products that you have to offer, and as a result, what sort of volume you are driving when it comes to overall usage. And when we think about the challenges that come from scale, I like to break things down into four different types of problems. Of course, you have much more technical challenges when it comes to the data and the inferencing, but if you really sum them up, it essentially boils down to: one, you have a penny problem. When you're serving millions, if not billions, of requests, even tiny inefficiencies can become massive costs.
Think about it: if your model is taking ten milliseconds longer to compute a request, and you have millions of requests hitting you in a day or a month or even a year, you have that many extra millions of dollars going into compute. Or, for that matter, if you have generative AI integrated into the solution, the real currency here is essentially tokens. And so if my requests are getting processed with, let's say, 60 tokens versus 50 tokens, that savings of 10 tokens over a large volume can result in significant cost savings, or a costly scenario. So it becomes a penny problem that you're trying to solve. The second area is edge case multiplication. At a small scale, your edge cases are rare. At a large scale, we get to see everything every day. Someone asking the AI to write a haiku about expired yogurt, or people trying to jailbreak the system. So you have to build for a world where you imagine every possible weird thing that could happen, because at scale it will happen. The third area is around feedback, especially when we think about machine learning solutions, which, again, thrive on feedback. But what surprised me, actually, having to deal with such a large scale, is that when millions of people are using your solution, the volume of feedback is also massive. And sometimes too much feedback is almost as bad as too little, because the system can start to chase its tail or react to temporary trends.
So we have to start thinking about how we can build dampeners, like shock absorbers for learning systems, so that we can extract the signal out of the noise. And fourth, when we think about scale, we also think about it from a geographic standpoint. When you have solutions that are deployed globally, you run into a speed-of-light problem. We can't beat physics. So when you are global, there is a certain latency associated with distance: if your solution is deployed in North America and a user is trying to make a request from, let's say, Asia Pacific, there is latency because of the distance between the two geographic locations. And so you then have to start thinking about how to further optimize. Do we need region-based inference? Do we think about edge computing?
So all of these complexities may not be as much of a challenge at a small scale, but when you start to operate at a global scale, many of them can become a big challenge, and sometimes a costly one. What I've learned from my experience is that at scale, sometimes you are not optimizing to make things better; it's more about preventing a collapse. You're constantly fighting entropy.
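[Editor's note: the 60-versus-50-token "penny problem" above is easy to make concrete with back-of-the-envelope arithmetic. The request volume and per-token price below are made-up illustrative numbers, not Walmart's figures.]

```python
# Back-of-the-envelope "penny problem": a small per-request saving
# multiplied by large volume. All numbers are hypothetical.

requests_per_day = 50_000_000        # assumed daily request volume
price_per_1k_tokens = 0.002          # assumed dollars per 1,000 tokens

def annual_cost(tokens_per_request: int) -> float:
    # Daily token spend, projected over a year.
    daily = requests_per_day * tokens_per_request / 1000 * price_per_1k_tokens
    return daily * 365

# Trimming a prompt from 60 to 50 tokens per request:
savings = annual_cost(60) - annual_cost(50)
print(f"${savings:,.0f} saved per year")  # $365,000 saved per year
```

A 10-token trim that would be invisible in a prototype turns into six figures a year at this assumed volume, which is why prompt length and latency get budgeted like any other unit cost at scale.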
[00:29:26] Tobias Macey:
To the point of entropy, there's also a measure of platform risk that is involved when adopting some of the APIs from the various foundation model providers, which can be mitigated if you host your own models because you can freeze the model. But there are any number of other variations in the operating environment that can cause perturbations to the input, to the context that's provided. And, also, there is a huge surface area in terms of security around these generative AI systems, whether that's prompt injection, jailbreaking, or mitigating potential vulnerabilities if you're enabling any sort of tool use. And then, also, if you are using a managed model provider, or if you're using different versions of models, there are unpredictable changes in the output that they will give for the same prompts. You need to have prompt versioning and ways to manage that experimentation.
And I'm curious how you're thinking about the build versus buy problem along some of the different major architectural elements that support just the raw inference capability.
[00:30:34] Shashank Kapadia:
Yeah. I think when it comes to the build versus buy conundrum, all the challenges as you described still remain. An additional layer to that is the data privacy aspect of things. The third aspect is around customization. And I think most important is what use case we are trying to use generative AI for. Right? There are certainly use cases where it's relatively low risk for a business, where the output that comes out of an open source solution can very well meet the requirements we are trying to solve, and the data privacy concerns are not too high in those scenarios.
And the scale may not be an issue. In those scenarios, it's very much okay to go and use what's available out there. It saves a lot of the resources that go into building or maintaining something custom. But there are also use cases which are core capabilities for a business and have to deal with quite a bit of sensitive data. That's where building something from scratch, or having a custom solution, makes more sense. That being said, there are very few instances where we are truly building anything from scratch. When we talk about customization, what we really are talking about here is two things: one, fine tuning an already existing LLM or any pretrained large language model, and two, hosting it so that the data coming in and out is fully controlled.
So when we think about customization, in more than 90% of scenarios we're talking about hosting it internally as well as fine tuning it. And that fine tuning has multiple objectives, mainly making sure that the model is aligned with the objectives we are trying to achieve; the custom solution and the data privacy are part of it. The other part is having the right guardrails, which the company can control, to make sure that the right safety checks are in place before a response or an output is served back to the customer. So taking a step back, when we think about build versus buy, there are a lot of factors beyond infrastructure and maintenance that tend to play a role. Usually, when it comes to low risk, low volume usage, you may end up using an API that eliminates the need to maintain the underlying infrastructure and the cost associated with it. However, when it's something that's core to the business, it often makes more sense, from a data privacy and customization standpoint, to fine tune something in house and also maintain it in house.
[00:33:29] Tobias Macey:
As far as the evolution of your own work, how have these generative capabilities changed the various priorities or the time to delivery for the types of projects that you were tasked with prior to what I'll term the Gen AI epoch, when ChatGPT launched?
[00:33:51] Shashank Kapadia:
Yeah. I think it certainly has had an impact in a variety of ways. In terms of overall productivity, speaking for myself, I have found I can be a lot more productive, and I'm sure many others in my surroundings and in similar roles would find it to be a similar situation, partly because a lot of things that we used to do from scratch can now be done with an assistant. I think the key here is, well, it's sort of counterintuitive: you want to know what you are trying to do. If you know what you're trying to do, having the right assistant in place can really help you boost productivity.
If you don't know what you're trying to do and are trying to do it through an assistant, sometimes doing it from scratch can be a bit faster than a constant back-and-forth loop with an assistant or any Gen AI solution. So generally speaking, if you know what you're doing and have the right assistant by your side, it certainly can help increase your overall productivity. Now, when it comes to the other part of it, with increased productivity, are there more deliverables that you are working on? I think the short answer to that is yes.
At the end of the day, you have certain hours that you're putting toward your professional career, and if your productivity is going up, you are able to deliver more things. Now that being said, it's truly not to be thought of as net new deliverables that you're able to push out, because we have to keep in mind that with every new deliverable also comes the work of maintaining it and monitoring how it performs in production. So that also brings an increased amount of work. So it's not just, hey, we have more productivity, or more things are getting delivered.
Because more things are getting delivered, there's also more maintenance being added. And so sometimes it comes out to be net neutral, if I were to put it that way.
[00:36:07] Tobias Macey:
And then another interesting angle, given your perspective and the scale that you're operating at, going back to our earlier conversation about bottlenecks: as you have been bringing some of these generative AI use cases into end user facing environments, what are some of the most interesting ways that you have seen these systems break?
[00:36:35] Shashank Kapadia:
Yeah. I think in a variety of ways. We talked about prompt injection being one of them. The evolution of generative AI solutions itself has come a long way; if you look back a year or two to where it is now, part of the problem essentially comes from the lack of something around the generative AI solution rather than from the solution itself. What I mean by that is, if you have generative AI as a solution and it's deployed in production without the right surroundings, it can come with a lot of different issues that customers will face while using it.
So the very first thing you want to think about is, for any solution that is being deployed, is there a constitutional foundation associated with that model? By constitutional foundations I mean: what are the core principles that are embedded during its training? A lack of them can lead to all sorts of unexpected output. The other part is real time guardrails. A lack of guardrails can sometimes lead to funny, or to some extent reputation-harming, responses from the model itself. The third aspect, where I think Gen AI has evolved quite a bit, is confidence systems. AI now knows when it doesn't know, instead of confidently hallucinating, which used to be the case a few years back. If you have the right confidence systems embedded into your model, where it can say, "I'm not sure about this, let me connect you with someone who can answer that," and that someone can be a human, that pretty much alleviates many of those scenarios and instances where we say, quote unquote, Gen AI failed. So when you think about failures, there's certainly a long way ahead of us in terms of all the different innovative ways to break these systems. But where it comes from, I feel, is the lack of certain of these components being in place to make sure those types of errors are mitigated as much as possible.
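The confidence-system idea can be sketched as a simple gate in front of a generative answer: if confidence falls below a threshold, route to a human instead of answering. The `model_answer` stub, its canned answers, and the threshold are all hypothetical stand-ins; a real system might derive confidence from token log-probabilities or a separate verifier model.

```python
# Sketch of a confidence gate in front of a generative answer.
CONFIDENCE_THRESHOLD = 0.75  # illustrative cutoff, would be tuned in practice

def model_answer(question):
    # Hypothetical stand-in for a real LLM call returning (text, confidence).
    known = {"store hours": ("We're open 9am-9pm.", 0.95)}
    return known.get(question, ("", 0.10))

def answer_with_fallback(question):
    text, confidence = model_answer(question)
    if confidence < CONFIDENCE_THRESHOLD:
        # Low confidence: hand off rather than confidently hallucinate.
        return "I'm not sure about this - let me connect you with someone who can help."
    return text

print(answer_with_fallback("store hours"))
print(answer_with_fallback("quantum plumbing"))
```

The key property is that the failure mode changes from a wrong answer to an explicit handoff, which is the shift from "confidently hallucinating" to "knowing when it doesn't know."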
And
[00:38:48] Tobias Macey:
in your own work of being involved in the design and Yeah.
[00:39:02] Shashank Kapadia:
Yeah. Let me think about that, actually. That's a really good question. I've seen AI fail at scale in some weird ways. One instance is something that I call recommendation recursion. Having dealt with recommendation systems for some time now, what we have seen AI do, especially when we infuse it with Gen AI, is this: if a particular customer looked at shoes, you want to recommend them socks. If you're recommending socks and shoes, you want to recommend shoe polish, then polish applicators, and then microfiber cloths to clean the shoes. There is no end to what you can recommend, and by the end of it, it just starts to suggest products that barely even exist. So recommendation recursion is one that comes to mind that was sort of unexpected in terms of experiences I've seen Gen AI produce in the personalization space. The other is something we call the accent mirror. Say you're using a voice shopping AI assistant, and as you speak to it, the assistant starts to mimic the customer's accent. It's not programmed to do it; it's just learning from the patterns. So if you're a certain customer with a certain accent, the AI starts to replicate it. If you have a British or any other accent, it tries to mimic that. It sounds cool, in that it's learning from the patterns, until you realize it comes across somewhat like mockery. Then I've also seen holiday hallucinations.
So during seasons like Christmas, you're asking the AI, like, how do I fix my sink? And the response starts to come back in a festive way: hey, jingle your way to plumbing success, or making your pipes go merry and bright. All sorts of layering happens because of hallucination and the time of year at which you are interacting with the chatbot. So I have seen my fair share of those experiences with different AI solutions, and the lesson for me coming out of that is that AI does not fail like traditional software, where you have an error and some logical piece of code broke somewhere. It fails creatively, in an almost artistic way, so it's hard to understand which part of the AI would have caused what. But every failure helps us understand some of the challenges, or the tail events, that could happen, and how the systems really are thinking, for us to then go and debug.
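One mechanical guard against the recommendation-recursion failure is to cap how many hops the system may walk through a complements graph, so shoes to socks to polish never spirals into ever more obscure accessories. The graph and depth limit below are a toy illustration, not a real catalog.

```python
# Guarding against "recommendation recursion": walk the complements graph,
# but cap the chain depth. Toy data, not a real product catalog.
COMPLEMENTS = {
    "shoes": ["socks", "shoe polish"],
    "shoe polish": ["polish applicator"],
    "polish applicator": ["microfiber cloth"],
    "microfiber cloth": ["cloth detergent"],
}

def recommend_chain(product, max_depth=2):
    """Collect complementary products at most `max_depth` hops away."""
    seen, frontier = set(), [product]
    for _ in range(max_depth):
        # expand one hop, skipping anything already recommended
        frontier = [c for p in frontier for c in COMPLEMENTS.get(p, [])
                    if c not in seen]
        seen.update(frontier)
    return sorted(seen)

print(recommend_chain("shoes", max_depth=2))
```

With `max_depth=2`, the chain stops at the polish applicator; the microfiber cloth and everything beyond it never surface, which is the point of the cap.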
[00:41:58] Tobias Macey:
And as you are building new capabilities or working on improving existing systems, what are some of the heuristics that you use for determining when generative AI is just absolutely the wrong choice?
[00:42:12] Shashank Kapadia:
Yeah. I think two things come to mind. One is when the problem you're trying to solve is deterministic, and there is no reason for it to be layered with a probabilistic model. If you are trying to run an optimization problem, it's a very deterministic problem, and there is no need for it to be probabilistic in nature. The other is that, yes, you could have a problem that is probabilistic in nature, but it could very well be solved by a good classification model or a regression model rather than a full blown generative AI solution. So I call this bucket the match between the problem and the solution: you want to use the solution that fits the problem you're trying to solve. Now the other part is the value side. Say, for example, your problem is explainability for your recommendation system's output; it does fit the checklist of where generative AI can be very helpful. But where value comes into play is: while it's good to have and nice to have, is this something our customers want? Is this something the stakeholders are asking for? What are the costs associated with building it, and also maintaining it? And what sort of revenue are we expecting to come out of it? So part of it is to understand it from a value perspective as well. Is it really something that's going to drive value for the business?
The third thing I'd like to put out there is not so much about determining whether to use Gen AI or not; there is also a notion of innovation and pushing boundaries. Oftentimes, as a company, if it is something that you can afford or have the flexibility to support, Gen AI can always be a small percentage of your time where you think, hey, can I run certain kinds of POCs to see if there is value in it? So sometimes it may not always be driven from a problem, as in, here is a problem and we are trying to solve it. Sometimes it also comes from just experimenting with different tools and technologies and essentially coming up with ways that could then actually drive value even without having a problem in the first place. So there is that third kind of bucket that I like to think about as well.
[00:44:40] Tobias Macey:
And so as you continue to build some of these machine learning systems and stay apprised of the evolution in the space, particularly for generative applications and the supporting tooling, what are some of the areas that you're paying particular attention to in the horizon of six months to two or three years?
[00:45:02] Shashank Kapadia:
Yeah. A few things come to mind. From six to twelve months out, I'm thinking about it from a convergence standpoint. Search and shopping are merging into something entirely new. Instead of typing "blue shoes size ten," you're probably going to have conversations: "I need something for my brother's beach wedding that won't make me look like I'm trying too hard." So, more of a fusion between a conversational piece and your shopping experience. That convergence is one I'm really looking forward to. Further out, I would imagine something like shopping agents for you: an AI assistant who understands your shopping patterns, who understands who you are, and will shop for you while you sleep. It knows everything from your preferences to budgets to your values. It can, in some cases, if allowed, negotiate prices, wait for sales events to happen, and even circle back with the seller if something is off or wrong. So I'm really looking forward, in a year or two, to having the right kind of autonomous shopping agents. And way further beyond, I'm also looking forward to perhaps bringing back the metaverse for the overall retail experience. I'm talking about how AI can show you exactly how your furniture, or a group of furniture, would look in your actual rooms. We have a bunch of flavors of that already in place today, but not to the point where you can walk into, let's say, a one-to-one replica of your apartment and play around with it while you are doing furniture shopping, to see exactly how it's going to come out. So that, again, is a more futuristic view of how I see things two or three years from now.
As for game changers, not so much putting timelines on them, that very few people are talking about: one is emotional commerce, where the system understands not just what you want, but why you want it. Having a layer of emotion involved with commerce is also something we can achieve with the right Gen AI solution in place. The second piece is the negotiation economy. Every price can become negotiable, and rather than the distinction of whether you, as a human, are a good negotiator or not, having the right AI assistant by your side can pretty much haggle with the retailers over what the negotiated price could be. Again, with that, there are also things that will probably come along.
I think a regulation explosion is one that I'm anticipating in the near future: it takes one AI scandal and worldwide restrictions could come into play, or the right regulations and compliance, for all the good reasons, would have to be in place as well. There's also going to be an open source disruption, with a lot of these tools out there in the market. It doesn't really require you to be in a particular ecosystem or group of tools to achieve certain things, so there's perhaps going to be a plethora of open source tools coming out. Quantum computing is another; there was a really good presentation from Google a few months back that, again, could make the overall AI computation as we do it on GPUs today look like a pocket calculator. So there are quite a few different areas I'm really looking forward to as we shape this industry and its use cases and applications, not just directly connected with AI, but also the industries that support AI, or the manifestation of AI, to begin with.
[00:48:55] Tobias Macey:
Are there any other aspects of your day to day work or the ways that you're engaging with this new and evolving ecosystem of AI capabilities that we didn't discuss yet that you'd like to cover before we close out the show?
[00:49:10] Shashank Kapadia:
I think we covered quite a bit here. Personally, I'm trying to find the right ways we can do this. I'm already using the right coding assistants to help me get my work done, not just professionally, but for projects on my personal end. The other area that I've been toying with lately is agents who can help organize my own calendar. There's also what I like to call multifaceted experiences: we generally speak about Gen AI from a text perspective, but how do you fuse in your videos with the photos, with the text?
So some of those experimentations I like to do, and see what potential comes out of them.
[00:49:56] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gaps in the tooling technology or human training that's available for AI systems today.
[00:50:13] Shashank Kapadia:
That's a really good question. Let me think about it. From a tooling perspective, it's not so much a gap as it is that there are so many tools out there today. You have different frameworks when you're trying to build or deal with agents. You have different frameworks when you're trying to integrate coding assistants with the IDEs that you're using. You have different assistants or different interfaces when you're trying to interact with a chatbot. So it's the plethora of tools that's the problem rather than a gap. Perhaps the gap, in my view, would be a unification of the experience that can allow us to do multiple things without having to jump from one kind of tool to another.
[00:50:55] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and share the work that you're doing and your perspectives on the applications of generative AI and the trade-offs between that and some of the more understandable and predictable, and I use predictable in air quotes, machine learning systems that we've been building up until now. I appreciate you sharing those perspectives, and I hope you enjoy the rest of your day.
[00:51:22] Shashank Kapadia:
Yeah, likewise. Thank you so much for having me. I'm glad we had a lot of good discussions today.
[00:51:33] Tobias Macey:
Thank you for listening. And don't forget to check out our other shows: the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems. Your host is Tobias Macey, and today I'm interviewing Shashank Kapadia about applications of generative AI in retail. So, Shashank, can you start by introducing yourself?
[00:00:29] Shashank Kapadia:
Yeah. Hi, Tobias. I'm Shashank. I'm currently a machine learning engineer, and I've been in the domain for close to ten years now. Throughout my journey, I've had hands-on experience dealing with problems right from small companies to the Fortune 500 to, more recently, a Fortune One company. I really enjoy what I do, and I'm excited to be here and share my experience with the audience.
[00:00:53] Tobias Macey:
And do you remember how you first got started working in ML?
[00:00:56] Shashank Kapadia:
Yeah. It's funny you asked that. I essentially come from an engineering background, and later did my graduate studies in operations research. So, from an academic standpoint, I had more or less a math background, but was dealing very much within a deterministic and structured way of solving problems. What really caught me was when I started to understand how recommendation systems work, moving from "people who bought x are also buying y," which is to an extent basic statistics, to when the system starts to understand the intent behind it, and what customers are not telling you, but you are still able to identify those needs and understand the subtle patterns. It really just got me excited. And what sealed the deal for me to be in ML is not just the technology, but understanding how human behavior can be understood at scale and then used to help millions of people find what they need faster, save them time, reduce friction, or in any capacity make their lives better. So to me, it feels like we are kind of in this Internet era of 1995.
We know a lot of things are going to change. Perhaps at this point we don't know just how, but maybe we are in the position to steer it in the right direction.
[00:02:18] Tobias Macey:
Yeah. Absolutely. It's definitely an interesting time in the world, and we're absolutely in the very earliest stages of whatever change is going to happen. And it's hard to know where we're going, because even just in the past six months the pace of change continues to accelerate and build up. There's a lot of hype being thrown around, and it's hard to differentiate between what is marketing, what is fear and uncertainty, and what are actual legitimate claims.
[00:02:48] Shashank Kapadia:
Yeah, absolutely. It's totally a situation where it feels like we are either building a plane while in flight or building a ship while floating. There's certainly noise around, and I think there are also some good signals coming out of it. So it's just a job of separating the signal from the noise.
[00:03:06] Tobias Macey:
And so to that point, given the particular area that you're focused on as far as retail and ecommerce, given that you're working for Walmart, I'm wondering if you can just talk to some of the ways that you're seeing the conversation evolve around when and how to apply these new and upcoming generative AI capabilities, and what are the situations where the risk and uncertainty or cost are still too high given the volume and the level of visibility, and you still just rely on what I'm going to term traditional ML use cases, whether that's linear regressions or deep learning or what have you?
[00:03:50] Shashank Kapadia:
Yeah. I think that's an excellent broad question. And before I jump into it, just to slide a disclaimer in here: the comments and opinions that I'm going to share throughout this conversation come from my own personal capacity and are not a representation of Walmart or my previous employers; they don't reflect their views and are not endorsed by Walmart in any capacity. So I just wanted to have that out there. But when you think about applications of generative AI within retail and ecom, or just generally speaking, when do we know which type of solution or algorithm to use? It's certainly a space where there is a lot of experimentation happening, because we aren't fully aware of the different capabilities that generative AI can support us with. So there's certainly room to explore and understand the different layers we can extract out of it. That being said, there are certainly problems that aren't quite suitable for generative AI, at least in its capacity as of today. A few of those come to mind. Within retail and ecom, there are optimizations that you are running. If you're running a store chain and you want to understand how your assortment needs to be optimized and how your shelf space needs to be utilized, that's very much an optimization problem that is still solved not with generative AI but with mathematical modeling. But that being said, even in those types of applications, where we know that the problem itself requires a different approach, generative AI is helping by complementing the outputs coming out of the solution.
Just to give you an example, suppose you are running a store chain and have an optimization problem that determines how your shelf space needs to be allocated, for how long, and for what types of products. Once you have run that optimization and have the results, oftentimes those results are consumed by store managers or stakeholders who aren't necessarily coming from a math or technical background. What Gen AI has helped with is adding a layer of explainability to that output. So even though Gen AI is not what's optimizing the solution, it certainly can consume the outputs from the solution and explain to the end users why certain decisions make sense. It's just one example where we can't really plug in Gen AI directly as of now, but it can certainly serve as a complement. But there are also areas where integrating Gen AI right at the foundational layer has been super helpful. One of those scenarios that comes to mind is intent understanding.
For the longest time, when we look at or think about ecom, users would go and search for the kinds of products they are looking to buy, like milk or ice cream or anything else. But the shift that we are observing with Gen AI is that now the customers are not searching for products anymore; they essentially describe a problem or a situation. Think about it as, "hey, I need something for my kid's science project, which is due tomorrow." Once this situation comes in from the customer, AI can help by really figuring out: the customer would need a poster board, markers, maybe a display stand. So it's essentially moving from keyword-based matching to actual comprehension.
That's one area where I can see a lot of potential going forward. The second is around dynamic personalization. So, not just "recommended for you," I mean truly dynamic experiences, right from the product description to what image the customer sees, with every piece of copy adapting to who's reading it. So if a professional chef or a college student is in the market to purchase a knife, they may have a completely different experience of the same product shown to them while they are browsing for it. And I suppose the third area that seems to have a lot of potential is predictive commerce. This is more of a sleeper hit nobody talks about: AI that knows what you need before you do. Not from a creepy tracking perspective, but from understanding your patterns. Think about it like: hey, it's February, and you bought Valentine's decorations last year, so here is a subtle reminder that you may be in the market for them. Your printer ink lasts about three months, and you purchased a printer around that period, so you may be in the market to buy ink. So how can we create those predictive commerce experiences for customers?
That's another layer where, I think, Gen AI can help in pushing the envelope, or pushing the boundaries, further down the road.
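The printer-ink example of predictive commerce can be sketched as a small replenishment heuristic: estimate the typical gap between past purchases of a consumable and flag it when that gap has nearly elapsed. The order dates and the slack window are invented for illustration; a real system would use far richer signals.

```python
from datetime import date
from statistics import median

def due_for_replenishment(purchase_dates, today, slack_days=7):
    """Flag a consumable as due when the typical gap between past
    purchases has (almost) elapsed since the last one. Toy heuristic."""
    if len(purchase_dates) < 2:
        return False  # not enough history to estimate a cadence
    gaps = [(b - a).days for a, b in zip(purchase_dates, purchase_dates[1:])]
    typical_gap = median(gaps)
    return (today - purchase_dates[-1]).days >= typical_gap - slack_days

# hypothetical ink orders roughly every three months
ink_orders = [date(2024, 1, 5), date(2024, 4, 2), date(2024, 7, 1)]
print(due_for_replenishment(ink_orders, today=date(2024, 9, 25)))  # True
```

Using the median gap rather than the mean keeps one unusually early or late reorder from skewing the reminder, which matters when the purchase history is short.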
[00:09:10] Tobias Macey:
Given the fact that you do have already a substantial investment in various types of machine learning systems, whether that be recommender systems or predictive analytics for things like supply chain issues or, to your point, some of the proactive notifications of, okay, you bought this printer three months ago. Let me go ahead and see if you want to purchase some replacement ink for it. How has the injection of generative AI workloads changed the architectural requirements, and what are some of the ways that you've been able to build on top of existing systems that you've already invested in?
[00:09:53] Shashank Kapadia:
Yeah. I think that's a really good point that you bring up, and I'd break it down into two parts, I suppose. One is the architectural patterns that we would now need to start adopting as we integrate generative AI into our solutions. And the second part is more from a machine learning operations standpoint. But even before that, when we think about fusing Gen AI into any of our existing solutions, the key question here is the value. More often than not, it's hard to quantify. But generally speaking, what allows any solution to go from its POC stage into production is a good understanding of the value it's driving. So, yes, there's a cost part to it, but then there's also an additional revenue part, which, when known beforehand, or at least estimated, can help in understanding the value proposition of the effort. Now, from a cost perspective, it's not just the infrastructure cost of hosting and running the solution in production, but also personnel cost, resourcing cost, opportunity cost. So there are quite a few different types of cost you may want to take into account when it comes to fusing Gen AI. And that, again, is not just in the case of Gen AI: for any improvement or enhancement that we are planning with either a conventional software piece or a conventional machine learning piece, the value proposition needs to make sense before anything goes into production. Now that being said, from an architectural pattern standpoint, especially when we think about generative AI, the fundamental is that it's probabilistic in nature.
So unlike the patterns we would follow in conventional software, where things are mostly deterministic, there are a few things we would want to have in place from an architectural pattern standpoint that suit Gen AI well, and that also put the companies starting to use them in a comfortable position. One of them is the orchestra of models, or perhaps we can think of it as agents. Instead of having one giant AI solution trying to do everything, you have specialized models working with each other. So let's say you have one model that handles the understanding behind what the users are asking for, another one that goes and finds the products from your inventory, and another one that layers personalization onto the overall experience.
And they all pass information between each other, like the musicians in an orchestra. Right? So one thing from an architectural pattern standpoint, as we design these systems or experiences, is that we want to start thinking about them as an orchestra of models rather than one model doing it all. The second is the safety net, which matters because the solutions are probabilistic. Every AI output that goes out passes through multiple checkpoints. Think of it like someone passing through airport security: there are multiple layers, and each is tasked to catch different things.
Hey, the information that's going out, is it factually correct? Is it appropriate to the user? And that appropriateness can come in a variety of forms, whether it's age-appropriate or the content is appropriate. Will it resonate with the customer? The point being, you want to have that safety net, especially with probabilistic models, to make sure that the information going out is in line with what you expect it to be. So alignment is the key here. And the third piece is the learning loop, and this is really the secret sauce. It has been the secret sauce for conventional machine learning models as well: the more data and feedback you collect, the better your models become. And it's very important for generative AI solutions too, if not more so, because every interaction teaches the system something. Customers bought the recommended product? That's a win, reinforce that pattern. They immediately returned it? Learn from that too. Because here's the key: the learning happens continuously, not in a big-batch fashion. So the system is literally getting more intelligent, for lack of a better word, with every hour, or at whatever frequency we ingest feedback into it. So, again, these are not net new patterns as such. These are universal patterns that we would want to have anyway, but they play a very key role with Gen AI being probabilistic in nature. Now, the second piece, around machine learning operations, can also be divided into two parts: there are things that remain the same no matter what type of solution we are deploying, and then there are things that do change a bit with generative AI in play. So what stays the same? All the boring stuff that actually matters.
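The orchestra-of-models and safety-net patterns Shashank describes might be sketched, in a deliberately simplified form, like this. Every model, function, and check here is hypothetical, a stand-in for whatever specialized models and policy checks a real system would use:

```python
# Hypothetical sketch: specialized stages pass a context dict along
# (the "orchestra"), and every outgoing response must clear a series
# of independent checkpoints (the "safety net").

def understand(query):
    """Stage 1: a stand-in for an intent/understanding model."""
    return {"query": query, "intent": "replacement_supplies"}

def retrieve(ctx):
    """Stage 2: a stand-in for a retrieval model over the inventory."""
    ctx["candidates"] = ["ink cartridge", "photo paper"]
    return ctx

def personalize(ctx):
    """Stage 3: a stand-in for a personalization layer."""
    ctx["response"] = f"You might be running low on {ctx['candidates'][0]}."
    return ctx

# Safety net: independent checkpoints, each catching different things,
# like the layers of airport security.
def is_appropriate(ctx):
    banned = ("medical advice", "profanity")
    return not any(b in ctx["response"] for b in banned)

def is_grounded(ctx):
    # The response should only mention products that were actually retrieved.
    return any(c in ctx["response"] for c in ctx["candidates"])

CHECKPOINTS = [is_appropriate, is_grounded]

def serve(query):
    ctx = personalize(retrieve(understand(query)))
    if all(check(ctx) for check in CHECKPOINTS):
        return ctx["response"]
    return "FALLBACK"  # route to a safe default or a human instead
```

The point of the sketch is the shape, not the content: each stage is replaceable, and adding a new safety concern means appending one more checkpoint rather than touching the models.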
So you have data pipelines, model monitoring, A/B testing infrastructure. All of that, which is super critical, still applies when we are dealing with a generative AI solution as well. If anything, it becomes more important because of the probabilistic nature. We can't just deploy and forget. Part of it is also, again, garbage in, garbage out, which isn't changing with Gen AI either. It doesn't automatically fix bad data. Now, what has changed? In my view, the speed of everything. I remember a few years back, we used to spend months doing a POC on a new recommendation model or any classification model.
And now someone can prototype a new experience in days, if not in an afternoon. So the barrier to experimentation has collapsed. And being able to have an infrastructure that supports that, I think, is the key here, because one can run a quick POC to understand if there is value in it and then later spend resources to put something into production. The other big shift is democratization. Earlier, POCs and experimentation were limited to people with a certain technical background. With generative AI, and especially with the chat-based APIs, people not coming from a conventional math or data science or machine learning background can also POC things, see if there is value, and then bring it as an item for the technical teams to explore. Right? So a marketing person can write a prompt and create something that would have required teams of engineers to do. So there is chaos, for sure, and that requires a new type of governance, which is another topic, and quite an extensive one, when we think of Gen AI.
But the skill sets are evolving too. Right? I mean, you have prompt engineering, which is becoming a very important part of what traditional ML engineering used to be. So, to sum it up from an MLOps standpoint, yes, there are certainly things which remain pretty much unchanged, if not more critical. And then there are a few things that are changing. Speed would be one of them.
[00:17:26] Tobias Macey:
To your point of some of these generative capabilities being incorporated into processes like experimentation, prioritization, or some of the implementation work that feeds into these AI systems: one of the ways that I've been thinking about the bifurcation, as far as the application of AI in a broader system, is whether you're building for AI, where you're actually constructing all of the data pipelines, building all of the inference systems, the context management, or building with AI, where you're actually using a generative model in the process, whether it's writing code, generating diagrams, or, to your point, some of the marketing team being able to generate copy and experiment with that. And I'm wondering how you're seeing those different roles of AI in the broader system manifest in terms of how you're thinking about the work to be done as somebody who is very closely linked into the actual machine learning cycles?
[00:18:25] Shashank Kapadia:
Yeah. I think you've basically put it very rightly here. A few years back, when we used to think about AI or machine learning, we used to think of it as a vertical in line with the other verticals that a business operates with. So you have a vertical of AI or machine learning, a vertical of sales, a vertical of marketing, engineering, and so forth. And what has shifted is this paradigm where AI is no longer a vertical but essentially a horizontal layer that touches upon every aspect of a business. I think Jeff Bezos, in one of his interviews some time back, put it very rightly: how each of the verticals in the business utilizes AI is different. For example, from a marketing standpoint, you could be using AI as a query interface to give you an understanding of how your marketing ads are doing. So you could think about having an AI solution that is plugged into the data systems in the back end, with an interface where you could ask, hey, how did my particular ad do over the last seven days or a month? As opposed to an engineering standpoint, where, when we use AI for writing a piece of code, we think about something that we would want to have and use the right coding assistant to help us write code that is more or less close to how we would want to have it in production.
Another vertical may have a different use case. So when we think about AI and all these different use cases within our organization, they're all coming from a perspective of helping teams improve their workflow, and different teams have different workflows that they're trying to optimize for. Being able to support that is a whole different challenge at scale, not just from a cost perspective but also from an infrastructure requirements perspective. I'm happy to go into either of those if that's of interest.
[00:20:41] Tobias Macey:
I think in particular, one of the areas that's interesting to explore is some of the methods that you've found helpful as far as bringing some of these generative models into the inner loop of ideating about the different types of experimentation that you might want to do on some of your existing model environments. Maybe you have a recommender system and you want to do some hyperparameter tuning, or you want to use the model to help you determine what are some interesting features that you want to engineer and extract to see how that impacts some of the existing models that you have, and managing the experiment design. Just some of the ways that these generative models have influenced the way that you think about that work?
[00:21:31] Shashank Kapadia:
Yeah. Absolutely. And I think that can be thought of in two ways as well. Right? So when we think about, let's say, recommendation systems: one way to think about having AI help us is how we can integrate AI within the solution itself. And the second piece is how I can use AI to help me experiment and run POCs for my solutions. They both have a certain level of value. Now, if I were to take the case where I'm using AI to help speed up my overall experimentation, meaning, hey, I have this model, and there has been an influx of new data sources, and I want to understand whether, if I use certain new features in my model, my overall accuracy or relevancy would improve on the metrics that I'm tracking. That piece is where AI is more or less an assistant to you, where, with natural language, given the code base you already have and the data you are already using, you can quickly experiment and run a POC by saying, hey, I'm going to pull this model and add additional features to it; can a rough script be generated? And then you can add in your own pieces of code to run the experiment. So what it has helped with, essentially, is shortening that experimentation window when it comes to running quick POCs to understand if there is potential to go further. The other side of things is that when you have recommendation systems in production, you are also collecting feedback constantly.
And some of that feedback you collect, and some of the challenges that come up, can better be solved by having the right AI solution integrated into the system itself. I think explainability is a really good use case, and a good starting point for how one can integrate Gen AI into recommendation systems without having to change the underlying models or experiment with changing the algorithm itself. It's more about building trust and having a good level of transparency. So the other side is integrating AI directly into the system itself, and that usually comes from what problem or challenge we are trying to solve and whether Gen AI is the right solution for that problem.
[00:24:04] Tobias Macey:
Another interesting angle of your experience is that, given the scale of an organization like Walmart and the number of end users that you need to be able to facilitate interactions with, you're going to hit issues of scale much sooner than a startup or a small organization or somebody who's just doing a one-off experiment with some of these generative AI models. And so that brings in a lot more due diligence and, I'm sure, a lot more upfront analysis before you actually even embark on any experimentation. And I'm curious, as you have started to implement some of these generative AI use cases, whether there are any particular bottlenecks that you have run into, whether in terms of infrastructure capacity or some of the challenges around data scale or structuring of data, that have led to issues, and just some of the problems that you have maybe run into sooner than other people who are going along the same journey?
[00:25:13] Shashank Kapadia:
Yeah. I think scale is certainly something that, for companies like Walmart, Google, Meta, you can think about from two sides. One is how many users are engaging with the services and products that you have to offer, and as a result, what sort of volume you are driving in terms of overall usage. And when we think about the challenges that come from scale, I like to break things down into four different types of problems. Of course, there are much more technical challenges when it comes to the data and the inferencing, but if you really sum them up, it boils down to this. One, you have a penny problem. When you're serving millions, if not billions, of requests, even tiny inefficiencies can become massive costs.
Think about it: if your model takes ten milliseconds longer to compute a request, and you have millions of requests hitting you in a day or a month or a year, that's that many extra millions of dollars going into compute. Or, if you have generative AI integrated into the solution, the real currency here is tokens. So if my requests are getting processed with, let's say, 60 tokens versus 50 tokens, those 10 additional tokens, over a large volume, can result in significant cost savings, or a costly scenario. So it becomes like a penny problem that you're trying to solve. The second area is edge case multiplication. At a small scale, your edge cases are rare. At a large scale, you get to see everything, every day: someone asking the AI to write a haiku about expired yogurt, or people trying to jailbreak the system. So you have to build for a world where every possible weird thing will happen, because it will. The third area is around feedback, especially for machine learning solutions, which, again, thrive on feedback. But what surprised me, having dealt with such a large scale, is that when millions of people are using your solution, the volume of feedback is also massive. And sometimes too much feedback is almost as bad as too little, because the system can start to chase its tail or react to temporary trends.
So we have to start thinking about how we can build dampeners, like shock absorbers for learning systems, so that we can extract the signal out of the noise. And the fourth: when we think about scale, we also think about scale from a geographic standpoint. When you have solutions that are deployed globally, you run into a speed-of-light problem. We can't beat physics. When you are global, there is a certain latency associated with distance: if your solution is deployed in North America and a user is making the request from, let's say, Asia Pacific, there's a latency because of the distance between the two geographic locations. And so you then have to start thinking about how you can further optimize. Do we need region-based inference? Do we think about edge computing?
So all of these complexities may not be as much of a challenge at a small scale, but when you start to operate at a global scale, many of them can become a big challenge, and sometimes a costly one. What I've learned from my experience is that at scale, you sometimes are not optimizing to make things better. It's more about preventing a collapse. You're constantly fighting entropy.
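The "shock absorber" idea for high-volume feedback could be as simple as an exponentially weighted moving average, which damps one-off spikes so the learning loop reacts to sustained signal rather than temporary trends. This is only an illustrative sketch, not how any particular production system implements it, and the class name and numbers are invented:

```python
# Hypothetical feedback dampener: an exponentially weighted moving
# average (EWMA). A small alpha means heavy damping, so a one-off
# viral spike barely moves the smoothed signal.

class FeedbackDampener:
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # weight given to each new observation
        self.value = None    # current smoothed signal

    def update(self, observation):
        if self.value is None:
            self.value = observation
        else:
            self.value = self.alpha * observation + (1 - self.alpha) * self.value
        return self.value

dampener = FeedbackDampener(alpha=0.1)
baseline = dampener.update(0.50)   # steady 50% positive-feedback rate
spiked = dampener.update(1.00)     # a sudden 100% spike in one window
# smoothed result: 0.1 * 1.0 + 0.9 * 0.5 = 0.55, not 1.0
```

A real system would layer more on top (windowing, seasonality handling, outlier rejection), but the core idea, never letting one batch of feedback dominate, is captured by the single `alpha` knob.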
[00:29:26] Tobias Macey:
To the point of entropy, there's also a measure of platform risk involved when adopting some of the APIs from the various foundation model providers, which can be mitigated if you host your own models, because you can freeze the model. But there are any number of other variations in the operating environment that can cause perturbations to the input, to the context that's provided. And, also, there is a huge surface area in terms of security around these generative AI systems, whether that's prompt injection, jailbreaking, or mitigating potential vulnerabilities if you're enabling any sort of tool use. And then, if you are using a managed model provider or different versions of models, there are unpredictable changes in terms of the output that they will give for the same prompts. You need to have prompt versioning and ways to manage that experimentation.
And I'm curious how you're thinking about the build versus buy problem along some of the different major architectural elements that support just the raw inference capability.
[00:30:34] Shashank Kapadia:
Yeah. I think when it comes to the build versus buy conundrum, all the challenges you described remain challenges. An additional layer to that is the data privacy aspect. A third aspect is customization. And I think more important is what use case we are trying to use generative AI to solve. Right? There are certainly use cases where it's relatively low risk for a business, the output that comes out of an open source solution can very well meet the requirements we are trying to solve, and the data privacy concerns are not too high in those scenarios.
And scale may not be an issue. In those scenarios, it's very much okay to go and use what's available out there. It saves a lot of the resources that go into building or maintaining something custom. But there are also use cases which are core capabilities for a business and have to deal with quite a bit of sensitive data. That's where building something from scratch, or having a custom solution, would make more sense. That being said, there are very few instances where we are truly building anything from scratch. When we talk about customization, what we really are talking about is two things: one, fine-tuning an already existing LLM or any pretrained large language model, and two, hosting it so that the data coming in and out is fully controlled.
So when we think about customization, in more than 90% of scenarios we're talking about hosting the model internally as well as fine-tuning it. And that fine-tuning has multiple objectives, mainly making sure the model is aligned with the objectives we are trying to achieve. The custom solution and the data privacy are part of it. The other part is having the right guardrails, which the company can control, to make sure the right safety checks are in place before a response or output is served back to the customer. So, taking a step back, when we think about build versus buy, there are a lot of factors beyond infrastructure and maintenance that tend to play a role. Usually, for low-risk, low-volume usage, you may end up using an API that eliminates the need to maintain the underlying infrastructure and the cost associated with it. However, when it's something that's core to the business, it often makes sense, from a data privacy and customization standpoint, to fine-tune something in house and also maintain it in house.
[00:33:29] Tobias Macey:
As far as the evolution of your own work, how have these generative capabilities changed the various priorities or the time to delivery for the types of projects that you were tasked with prior to what I'll term the Gen AI epoch, when ChatGPT launched?
[00:33:51] Shashank Kapadia:
Yeah. I think it certainly has had an impact in a variety of ways. In terms of overall productivity, speaking for myself, I've found I'm able to be a lot more productive. I'm sure many others in my surroundings and in similar roles would find it to be a similar situation, partly because a lot of the things that we used to do from scratch can now be done with an assistant. I think the key here, and it's somewhat counterintuitive, is that you want to know what you are trying to do. If you know what you're trying to do, having the right assistant in place can really help you boost productivity.
If you don't know what you're trying to do and are trying to figure it out through an assistant, sometimes doing it from scratch can be a bit faster than being in a constant back-and-forth loop with an assistant or any Gen AI solution. So, generally speaking, if you know what you're doing and have the right assistant by your side, it certainly can help increase your overall productivity. Now, the other part of it: with increased productivity, are there more deliverables that you are working on? I think the short answer to that is yes.
Again, at the end of the day, you have a certain number of hours that you work in your professional career, and if your productivity is going up, naturally you are able to deliver more things. Now, that being said, it's truly not to be thought of as purely net new deliverables that you're able to push out, because we have to keep in mind that with every new deliverable also comes the work of maintaining it and ensuring how it performs in production. So that comes with an increased amount of work. It's not just, hey, we have more productivity, so more things are getting delivered.
Because more things have been getting delivered, there's also more maintenance being added. And so sometimes it comes out to be like a net neutral, if I were to put it that way.
[00:36:07] Tobias Macey:
And then another interesting angle given your perspective and the scale that you're operating is to our earlier conversation of bottlenecks that you run into is I'm wondering as you have been bringing some of these generative AI use cases into end user facing environments, what are some of the most interesting ways that you have seen these systems break?
[00:36:35] Shashank Kapadia:
Yeah, in a variety of ways. We talked about prompt injection being one of them. I think generative AI solutions themselves have evolved: if you look back a year or two from where we are now, part of the problem essentially came from a lack of something around the generative AI solution rather than from the solution itself. What I mean by that is, if you have generative AI as a solution and it's deployed in production without the right surroundings, it can come with a lot of different issues that customers will face while using it.
So the very first thing you want to think about for any solution being deployed is whether there is a constitutional foundation associated with that model. What I mean by constitutional foundations is: what are the core principles embedded during its training? A lack of them can lead to all sorts of output. The other part is real-time guardrails. A lack of guardrails can sometimes lead to funny, or sometimes potentially reputation-harming, responses from the model itself. The third aspect, where I think Gen AI has evolved quite a bit, is confidence systems. AI now knows when it doesn't know, instead of confidently hallucinating, which used to be the case a few years back. So if you have the right confidence systems embedded in your model, where it can say, I'm not sure about this, let me connect you with someone who can answer that, and that can be a human, that pretty much alleviates many of those scenarios and instances where we say, quote unquote, Gen AI failed. So, yeah, when you think about failures, there's certainly a long way ahead of us in terms of all the different innovative ways to break these systems. But where it comes from, I feel, is a lack of certain of these components; having them in place makes sure those types of errors are alleviated as much as possible.
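The confidence-system idea, knowing when not to answer, reduces to a routing decision. A toy sketch might look like the following, where the threshold, function names, and confidence scores are all invented for illustration; real systems derive confidence from calibrated model signals, not a hand-passed number:

```python
# Hypothetical sketch of a confidence system: below a threshold,
# the system escalates to a human rather than answering anyway.

CONFIDENCE_THRESHOLD = 0.75  # illustrative cutoff, would be tuned

def answer_or_escalate(model_answer, confidence):
    """Route a model answer based on how confident the model is."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return {"route": "model", "text": model_answer}
    # Low confidence: admit uncertainty and hand off.
    return {
        "route": "human",
        "text": "I'm not sure about this. Let me connect you with "
                "someone who can help.",
    }

confident = answer_or_escalate("Yes, that cartridge fits your printer.", 0.92)
unsure = answer_or_escalate("Probably it won't void the warranty.", 0.40)
```

The interesting design work hides inside the confidence estimate itself; the routing, as shown here, is the easy part.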
[00:38:48] Tobias Macey:
And in your own work of being involved in the design and
[00:39:02] Shashank Kapadia:
Yeah, let me think about that, actually. That's a really good question. I've seen AI fail at scale in some weird ways. One instance is something I call recommendation recursion. Having dealt with recommendation systems for some time now, what we have seen AI do, especially when we infuse it with Gen AI, is something we call recommendation recursion: if a customer looked at shoes, you want to recommend socks. If you're recommending socks with shoes, you want to recommend shoe polish, then polish applicators, and then microfiber cloths to clean the shoes. There is no end to the things you can recommend, and by the end of it, it just starts to suggest products that barely even exist. So recommendation recursion is one that comes to mind that was unexpected in terms of experiences I've seen Gen AI produce in the personalization space. The other is something I've seen and used, what we call the accent mirror. Say you're using a voice shopping AI assistant, and as you speak to it, the assistant starts to mimic the customer's accent. It's not programmed to do that; it's just learning from the patterns. So if you are a customer with a certain accent, the AI starts to replicate it. If you have a British or any other accent, it tries to mimic that. It sounds cool, in the sense that it's learning from the patterns, until you realize it comes across somewhat like mockery. Then I've also seen holiday hallucinations.
During seasons like Christmas, you're asking the AI, like, how do I fix my sink? And the response starts to come back in a festive way: hey, jingle your way to plumbing success, or, making your pipes merry and bright. So all sorts of layering happens because of hallucination and the time of year at which you are interacting with the chatbot. I've seen my fair share of those experiences with different AI solutions, and the lesson for me coming out of that is that AI does not fail like traditional software, where you have an error and some logical piece of code broke somewhere. It fails creatively, in an almost artistic way, such that it's hard to understand which part of the AI caused which thing. But every failure helps us understand what the challenges or the tail events could be, and how the systems really are "thinking," for us to then go and debug.
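The recommendation-recursion failure has a simple structural fix: treat complementary-product links as a chain and cap how far you follow it, since each hop drifts further from the original intent. Here is a minimal sketch under that assumption; the product graph and function name are made up for illustration:

```python
# Hypothetical guard against "recommendation recursion": follow
# complementary-product links (shoes -> socks -> polish -> ...),
# but stop after a fixed number of hops from the anchor product.

COMPLEMENTS = {
    "shoes": "socks",
    "socks": "shoe polish",
    "shoe polish": "polish applicator",
    "polish applicator": "microfiber cloth",
}

def recommend_chain(product, max_depth=2):
    """Return up to max_depth complementary recommendations."""
    chain = []
    current = product
    while len(chain) < max_depth and current in COMPLEMENTS:
        current = COMPLEMENTS[current]
        chain.append(current)
    return chain
```

With `max_depth=2`, shoes yield socks and shoe polish and the chain stops there, rather than wandering off to microfiber cloths and beyond.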
[00:41:58] Tobias Macey:
And as you are building new capabilities or working on improving existing systems, what are some of the heuristics that you use for determining when generative AI is just absolutely the wrong choice?
[00:42:12] Shashank Kapadia:
Yeah. Two things come to mind. One is when the problem you're trying to solve is a deterministic problem, and there is no reason for it to be layered with a probabilistic model. Right? If you are running an optimization problem, it's a very deterministic problem, and there is no need for it to be probabilistic in nature. The other side is the value side. Yes, you could have a problem that is probabilistic in nature, but it could very well be solved by a good classification model or regression model rather than a full-blown generative AI solution. I call this bucket the match between the problem and the solution: you want to use the solution that fits the problem you're trying to solve. Now, the other part is the value side, where your problem, let's say, for example, explainability for your recommendation system's output, does fit the checklist of where generative AI can be very helpful. But where value comes into play is: while it's good to have and nice to have, is this something that our customers want? Is this something that the stakeholders are asking for? What are the costs associated with building it, and also maintaining it? And what sort of revenue are we expecting to come out of it? So part of it is to understand it from a value perspective as well: is it really something that's going to drive value for the business?
The third thing I'd put not so much as determining whether to use Gen AI or not. There is also a notion of innovation and pushing boundaries. Oftentimes, as a company, if it's something you can afford or have the flexibility to support, Gen AI can always be a small percentage of your time, where you can think, hey, can I run a certain kind of POC to see if there is value in it? So it may not always be driven by a problem, like, here is a problem and we are trying to solve it. Sometimes it also comes from just experimenting with different tools and technologies and coming up with ways that could then actually drive value even without having a problem in the first place. So there is that third kind of bucket that I like to think about as well.
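Shashank's first two heuristics, match the solution to the problem and weigh value against cost, can be encoded as a small decision function. This is purely an illustrative distillation of what he says, not a real triage tool, and every name and number below is invented:

```python
# Hypothetical triage for "is Gen AI the wrong choice here?":
# deterministic problems get conventional software, problems a
# simpler model can solve get that model, and even a good Gen AI
# fit is skipped when expected value doesn't justify the cost.

def choose_approach(deterministic, simpler_model_suffices,
                    expected_value, estimated_cost):
    if deterministic:
        return "conventional software / optimization"
    if simpler_model_suffices:
        return "classification or regression model"
    if expected_value <= estimated_cost:
        return "skip for now (value does not justify cost)"
    return "generative AI"

# Illustrative calls with made-up value/cost figures:
route_pricing = choose_approach(True, False, 0, 0)
route_churn = choose_approach(False, True, 100, 10)
route_explain = choose_approach(False, False, 500, 100)
route_nicety = choose_approach(False, False, 50, 100)
```

The third bucket he mentions, pure exploration, deliberately sits outside a function like this: it's the budgeted experimentation you run even when no problem statement exists yet.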
[00:44:40] Tobias Macey:
And so as you continue to build some of these machine learning systems, and as you continue to stay apprised of the evolution in the space, particularly for generative applications and the supporting tooling, what are some of the areas that you're paying particular attention to on the horizon of six months to two or three years?
[00:45:02] Shashank Kapadia:
Yeah. A few things come to mind. From six to twelve months out, I'm thinking about it from a convergence standpoint. Search and shopping are merging into something entirely new. Instead of typing "blue shoes size ten," you're probably going to have conversations: "I need something for my brother's beach wedding that won't make me look like I'm trying too hard." So, more of a fusion between a conversational piece and your shopping experience. Convergence would be one I'm really looking forward to. Further out, I would imagine something like shopping agents for you. You have an AI assistant that understands your shopping patterns, understands who you are, and will shop for you while you sleep. Right? It knows everything from your preferences to your budget to your values. It can, in some cases, if allowed, negotiate prices, wait for sales events to happen, and even circle back with the seller if something is off or wrong. So I'm really looking forward, in a year or two, to having the right kind of autonomous shopping agents. And way further beyond, I'm also looking forward to perhaps bringing back the metaverse for the overall retail experience itself. I'm talking about how AI can show you exactly how your furniture, or a group of furniture, would look in your actual rooms. We have a bunch of flavors of that already in place today, but not quite to the point where you can walk into, let's say, your apartment, a one-to-one replica of how it is, and play around with it while you are doing furniture shopping to see exactly how it's going to come out to be. So that, again, is a more futuristic view of how I see things two, three years from now.
As for game changers that very few are talking about, without putting timelines on them: one is emotional commerce, where I understand not just what you want, but why you want it. Having a layer of emotion involved with commerce is also something we can achieve with the right gen AI solution in place. The second piece is the negotiation economy. Every price can become negotiable, and rather than it coming down to whether you as a human are a good negotiator or not, having the right AI assistant by your side can pretty much haggle with the retailers over what the negotiated price could be. Again, with that, there are also things that will probably come along.
I think a regulation explosion is one that I'm anticipating in the near future. One AI scandal and worldwide restrictions could come into play, or the right regulations and compliance, for all the good reasons, would have to be put in place as well. There's also going to be an open source disruption, with a lot of these tools out there in the market. It doesn't really require you to be in a particular ecosystem or group of tools to achieve certain things, and so a plethora of open source tools will perhaps come out. I think quantum computing is another, and there was a really good presentation from Google a few months back that, again, could make AI computation as we do it on GPUs today look like a pocket calculator. So quite a few different areas I'm really looking forward to as we shape this industry and its use cases and applications, not just those directly connected with AI, but also the industries that support AI, or the manifestation of AI to begin with.
[00:48:55] Tobias Macey:
Are there any other aspects of your day to day work or the ways that you're engaging with this new and evolving ecosystem of AI capabilities that we didn't discuss yet that you'd like to cover before we close out the show?
[00:49:10] Shashank Kapadia:
I think we covered quite a bit here. Personally, I'm trying to find the right ways we can do it. I'm already using coding assistants to help me get my work done, not just professionally, but on projects I'm doing on my personal end. The other area I've been toying around with lately is agents that can help organize my own calendar. There's also what I like to call multifaceted experiences. We generally speak about gen AI from a text perspective, but how do you fuse in your videos with the photos, with the text?
So those are some of the experiments I like to do, to see what potential comes out of them.
[00:49:56] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gaps in the tooling technology or human training that's available for AI systems today.
[00:50:13] Shashank Kapadia:
That's a really good question. Let me think about it. When you look at it from a tooling perspective, it's not so much a gap as it is that there are so many tools out there today. You have different frameworks when you're trying to build or deal with agents, different frameworks when you're trying to integrate coding assistants with the tools you're using, and different interfaces when you're trying to interact with a chatbot. So it's the plethora of tools that's the problem rather than a gap. A gap, in my view, would be a unification of the experience, something that allows us to do multiple things without having to jump from one kind of tool to another.
[00:50:55] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and share the work that you're doing and your perspectives on the applications of generative AI, and the trade-offs between that and some of the more understandable and predictable, and I use "predictable" in air quotes, machine learning systems that we've been building up until now. I appreciate you sharing those perspectives, and I hope you enjoy the rest of your day.
[00:51:22] Shashank Kapadia:
Yeah, likewise. Thank you so much for having me. I'm glad we had a lot of good discussions today.
[00:51:33] Tobias Macey:
Thank you for listening. And don't forget to check out our other shows: the Data Engineering Podcast, which covers the latest in modern data management, and Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used. You can visit the site at themachinelearningpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@themachinelearningpodcast.com with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction to AI Engineering Podcast
Meet Shashank Kapadia: Journey in Machine Learning
The Evolution of AI in Retail and E-commerce
Architectural Shifts with Generative AI
AI as a Horizontal Layer in Business
Challenges of Scaling AI at Walmart
Build vs. Buy: Generative AI Solutions
When Generative AI is Not the Right Choice
Future Trends in AI and Retail