Episode 029
Released
Duration 1 hr 25 min

Demystifying Artificial Intelligence - AI 101

Harsh Joshi, founder of DAO Studio, explains AI for a finance audience — what AI is, what LLMs and RAG do, and why deploying AI in production is the hard part.

Harsh Joshi

Founder and CEO, DAO Studio

Creative. Goofy. Consistent.

Chapters
  1. 00:00 Cold open
  2. 00:32 From electronics engineer to AI builder
  3. 05:28 AI 101 — what artificial intelligence actually is
  4. 09:31 The AI taxonomy: ML, neural networks, GenAI
  5. 16:29 LLMs, parameters, and diffusion models
  6. 23:30 Pre-training, fine-tuning, and RAG
  7. 32:19 Prompt engineering, vector databases, and MLOps
  8. 41:10 Why business AI is hard — control, explainability, privacy
  9. 53:02 Open source, big tech, and the AI economics
  10. 01:05:15 Deploying AI at scale — and what comes next

Show Notes

This episode is a deliberate departure from the SoF mainline. Recorded in June 2024 — before “GenAI” had fully crossed from technology beat into boardroom agenda, and before Anthropic’s Claude had reached the mainstream of business tooling — it’s an AI 101 primer pitched at a finance audience. The guest is Harsh Joshi, an electronics engineer turned AI builder who has spent a decade in the space (drones for agricultural research, IoT energy savings at SmartJoules, a failed crime-response startup, and now DAO Studio, where he’s building YOJN — an AI-deployment platform that aims to take models from demo to production).

The conversation works as a glossary you can listen to. Harsh walks through, in order: what AI actually is (function approximators that learn decision boundaries from data, not rules-based expert systems); the taxonomy (machine learning ⊃ neural networks ⊃ GenAI; LLMs vs diffusion models; what a “parameter” means); the architectures (pre-training vs fine-tuning vs RAG, when each is the right tool, and why most enterprise use cases don’t need fine-tuning at all); the tooling stack (prompt engineering — which Harsh argues is a field that will disappear — vector databases, embeddings, MLOps and AIOps); and why business AI is genuinely hard (the three fundamental problems: controllability, explainability, and decomposability, plus privacy as a deployment blocker).

The back half is the part that ages best. Harsh’s framing of AI as “a problem of data, capital, talent, and distribution, a monopoly of big tech” is exactly the lens that has played out in the 18 months since. His reads on why Meta open-sourced Llama (economic strategy, not altruism), why prompt engineering is a flash-in-the-pan job category, and why most enterprise AI projects stall at the demo-to-production gap are all calls that have aged well.

The closing pitch on YOJN is the practical answer Harsh’s company is building: a deployment layer that lets finance teams treat AI like any other budgeted, governed, observable function of the business — visibility into cost, return, and model behavior, rather than the black-box experiments most enterprises run today. For a CFO trying to figure out how to fund AI without writing a blank check, that framing matters.

This is a time-capsule episode: a snapshot of how a serious practitioner explained the field at the moment finance leaders started asking real questions about it. Some of the specifics (RAG architectures, prompt engineering as a discipline) are already evolving fast. The framework for thinking about AI as a business problem — not just a research one — is what holds up.

Takeaways

  • AI is function approximation, not rules. A neural network learns the decision boundary inside a dataset; it doesn’t encode rules a human writes. That single shift — from if-then-else to learned approximation — is what made the last decade of progress possible.
  • The taxonomy is a tree, not a list. Machine learning ⊃ neural networks ⊃ LLMs and diffusion models. GenAI is a marketing term for the latest, most general-purpose neural networks; underneath it, the math is the same family that’s been around since the 1980s.
  • Parameters are the dial. A 1B-parameter model and a 100B-parameter model are doing the same thing — the larger model can capture more nuance because it has more knobs to turn. More parameters = more capability, more compute, more cost. There’s no magic at scale, only more expensive math.
  • Pre-training vs fine-tuning vs RAG. Pre-training is the foundation model. Fine-tuning is teaching it your domain. RAG (Retrieval-Augmented Generation) is letting it look things up at runtime instead. Most enterprises don’t need fine-tuning — they need RAG and a well-built context layer. Harsh’s view: 80% of use cases are solved by RAG done right.
  • Prompt engineering is a field that disappears. Harsh’s call: as models get better at understanding intent, the value of clever prompting goes to zero. Don’t build a career around it; build a career around AI systems engineering.
  • The three hard problems are controllability, explainability, and decomposability. AI systems can’t be debugged like microservices. You can’t break them down into components, you can’t prove why they produced a given output, and you can’t guarantee they won’t produce a different one tomorrow. Until those problems are partly solved, governance is the bottleneck.
  • MLOps is the production gap. “The difference between you calling your engineer to show a demo to your board members and your AI system being consumed daily by your customers is MLOps.” Most enterprise AI projects stall at exactly this gap.
  • AI is a problem of data, capital, talent, and distribution. All four are concentrated in big tech. Open source is the lever that breaks the monopoly — but big-tech open source (Meta’s Llama, etc.) is economic strategy, not altruism. Watch what’s released and what’s withheld.
  • Privacy is a deployment blocker, and it should be. Enterprises legitimately can’t send sensitive data to third-party model APIs. This is why on-prem and private-cloud model deployment is the real growth area — not consumer chatbots.
  • For finance teams, AI needs to look like any other budget line. Cost transparency, ROI measurability, governance, and the ability to sanction an experiment without writing a blank check. That’s the YOJN pitch, and it’s also the framework CFOs should demand from any AI vendor.
  • The future is “democratizing intelligence.” Harsh’s closing prediction: as the underlying intelligence in foundation models gets commoditized, the value migrates up the stack to AI-native applications — products designed from the ground up around AI capabilities, not products with AI bolted on.
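
A minimal sketch of the RAG takeaway above ("letting it look things up at runtime"), under loud assumptions: the documents are made up, retrieval is plain word overlap rather than the embeddings and vector database a real system would use, and no model is actually called. It only shows the shape of the idea: find the relevant text first, then hand it to the model as context.

```python
# Hypothetical internal documents (made up for illustration).
documents = [
    "Travel expenses above 500 dollars need CFO approval.",
    "Invoices are paid within 30 days of receipt.",
    "Quarterly forecasts are due on the fifth working day.",
]


def words(text):
    """Lower-case the text, drop basic punctuation, return the set of words."""
    return set(text.lower().replace(".", "").replace("?", "").split())


def retrieve(question, docs, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = words(question)
    ranked = sorted(docs, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, docs):
    """Put the retrieved text in front of the question (no model call here)."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("When are invoices paid?", documents))
```

The design point is that the model's weights never change: only the context it is given at question time does, which is why this route is cheaper than fine-tuning.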

Notable Quotes

The AI systems that are all the rage today, which scale well and generalize well, are neural networks, which are essentially function approximators.

Prompt engineering is a field that's gonna go away as fast as it has come.

The difference between you calling your engineer to show a demo to your board members and your AI system being consumed daily by your customers is MLOps.

The first step of governance is being able to explain something; without explainability, you cannot govern it.

AI is a problem of data, capital and talent. If you have three, then all you need is distribution to iterate it faster. Now, these four items are a monopoly of big tech.

AI applications are all about the context; if the context is gone, it's just garbage in, garbage out.

Lightning Round

Sweet or Savory
Sweet
Books or Podcasts
Books
Thinker or Doer
Doer
Introvert or Extrovert
Introvert
How can someone impress you?
Staying consistent
If not an AI builder, what would you be?
Hermit
If you could teleport yourself right now, where would you go and why?
I'd probably go to Mars and see life in a very different way.
#1 item on your bucket list
One-year-long sabbatical
If you could un-invent something, what would it be?
Maxwell's equations
Who is your role model?
Swami Vivekananda
What can make you 10x more productive?
An AGI that is my personal butler

Transcript

Cold open

Rohit Agarwal: Hello, hello. Welcome to the Strategy of Finance podcast. In a special episode today, we aren’t necessarily celebrating a finance professional. Rather, we are going to unpack a topic for the community that’s on everyone’s mind. Yes, I’m talking about artificial intelligence or AI. Joining me today is a builder in this space, Harsh Joshi, founder and CEO of DAO Studio. Let’s dive in. Harsh, welcome to the show.

Harsh Joshi: Thanks for having me.

From electronics engineer to AI builder

Rohit Agarwal: Why don’t we kick it off with understanding who is Harsh Joshi?

Harsh Joshi: So, Harsh Joshi has been in the AI space for over a decade now. I did my bachelor’s in electronics and communication engineering, did a small diploma on remote sensing and satellite navigation. I spent about two years working with the Indian Council of Agricultural Research and a bunch of NGOs building drone systems to solve saline line problem. And then I kind of started building this technology at a startup called SmartJoules with Arjun and built a team where we were primarily using AI, IoT, Cloud, all these technologies to save energy in HVAC systems primarily. After that, I spent a couple of years doing consulting, did product management for a startup called Zenoti. It’s a SaaS startup in the wellness space. After that, I took the plunge into entrepreneurship and built my first startup called Sachait. It tanked, it tanked really hard. But the idea over there was also pretty much around AI. We were trying to build a model, actually built a 7,500-parameter model to gather all the crime-related data, to analyze sensor data, and… in a nutshell to predict the false positive rate on these 112 call systems, right? Because in India, the first-responder-to-civilian ratio is 1:150,000, right? So you really don’t have a lot of first responders for every civilian. And if you have all of these false positives coming up on these emergency lines, that’s a problem. And that’s what we tried to tackle. Couldn’t really commercialize the technology. So we tried something else. That’s how we came about building DAO Studio. And right now we are building this product called YOJN at DAO Studio, which tackles all the key challenges in the AI space and helps startups get from demo to production.

Rohit Agarwal: All right, quite interesting. When did you first catch the bug? Well, I guess I’m talking to an engineer, so can’t really talk about bugs. When did you first get afflicted by the whole enigma around artificial intelligence? What was the first introduction?

Harsh Joshi: I think I was back in class 8th. So my cousin brother, he was studying at this college in India, BITS Pilani. He was an electronics major and he used to come back during summer vacations. He used to bring all of these books and all of these video lectures on this LAN system they used to have. I used to read them. I was building websites and all those things, and the idea of making everything into some if-then-else routine, writing programs, wasn’t really a fun thing. I was like, can’t these systems learn for themselves? And that was the first, I think around class eight, class nine, I got into this whole idea of let’s build something in AI. And since then, I’ve been just carrying that thing forward.

Rohit Agarwal: You mentioned in your introduction where you have worked at SmartJoules and other places, there seemed to be a good combination of hardware, software, you know, sensors are in place and then they are learning and doing something, software is in place and you know, that’s learning and doing something. Is it a field which actually brings back software and hardware in some ways in reality and like unlocks IoT in the truest sense, or do you think it was just serendipitous, kind of how you got exposed to that hardware plus software kind of field and just kind of made it work?

Harsh Joshi: So I mean the whole idea of thinking hardware separate, software separate, it doesn’t really make sense when you’re talking about artificial intelligence. Because in the end, what is AI? You’re trying to make a system learn certain behaviors, and you want it to behave in a certain way, and then give an output in that way. So it’s always response, and then there’s a certain stimulus, and then you respond to that stimulus. So that stimulus can come in from anywhere. It could be a digital stimulus. It could be a sensor data. It could be a motion actuator, like your pressure sensor sends something. It could also be just your click-through rate or something like that. It could be anything. It’s a stimulus in the end. And then you respond, again, depending on what problem you’re trying to solve. Either you respond by moving a motor, or you respond by showing certain pixels in certain ways, or you respond by showing a graph, or whatever. So the idea of hardware being very different from software doesn’t really work, because even your software works on a hardware. It’s just not a mobile hardware. In the end that’s what your software also is. There’s nothing magical when you talk about cloud. It’s someone else’s computer. Huge data centers, racks, it’s still a physical hardware in the end.

AI 101 — what artificial intelligence actually is

Rohit Agarwal: Makes a ton of sense. Yeah, it’s an input output device at the end of the day. Okay. Well, why don’t we dive into kind of some AI 101 stuff, right? So let’s start from the highest level question. What is artificial intelligence? What is AI?

Harsh Joshi: I mean, there’s no one right answer for AI because at this point, AI is more of a marketing term, right? But if you really have to think about it, let’s say you have a process, right? And you have a hundred rules around it. If this is happening, then do this. If this is happening, do that. Arguably, if you can write those if-then-else routines, that’s also an AI, right? The challenge with that is it doesn’t really scale well. And prior to the 1990s, that’s how most AI systems actually worked. There are different fancy ways to do that. But in the end, these are if-then-else systems. And then a lot of smart people started thinking, can these systems self-learn? And that’s when you saw this emergence of a complete class of algorithms, which people started calling machine learning algorithms, when you had these models come up like decision trees. Now the thing with decision trees was these systems are highly interpretable, right? But again, they don’t scale well. They don’t generalize well. So the AI systems that are all the rage today, that do really scale well and generalize to a problem really well, these are neural networks, per se. And at its root, what these systems are doing, they are just function approximators. So think of it this way. Like in finance, you’re used to seeing a lot of data in Excel and you’re seeing revenue growth, this growth, that growth, and you’re trying to plot a trend line through that. And based on that trend, you’re trying to make a prediction, okay, X years down the line, how will this company or asset behave? That’s what these machine learning systems are doing. Like a neural network is a function approximator. You throw arbitrary data to this model. It tries to form a decision boundary around it. And then based on this boundary, it is trying to make a decision. So when they found this boundary, that is the process of learning. They have learned the spread of this data.
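
The trend-line analogy above can be sketched in a few lines. This is a minimal illustration, not how a neural network is trained: it fits a straight line by ordinary least squares to made-up revenue figures, and the names `fit_line`, `years`, and `revenue` are assumptions for the example. The point is only that "learning" means finding the parameters (here a slope and an intercept) that best describe the data, and then reusing them to predict.

```python
# Hypothetical yearly revenue figures (made up for illustration).
years = [1, 2, 3, 4, 5]
revenue = [10.0, 12.0, 14.0, 16.0, 18.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Learning": find the two parameters that describe the data.
slope, intercept = fit_line(years, revenue)

# "Prediction": reuse the learned parameters for a year not in the data.
forecast_year_8 = slope * 8 + intercept
print(slope, intercept, forecast_year_8)
```

A neural network does the same kind of thing with far more parameters and a curve that is not restricted to a straight line.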

Rohit Agarwal: But it’s not that, let’s say I have, let’s pick finance, right? I have some accounting rules. It’s very defined, right? There are probably a hundred different textbooks that can be best of breed in accounting. And they’ll say more or less all the same thing, right? If we fed that into a system, it’s not gonna auto learn, right? It’s not gonna understand the boundaries. It’s not gonna understand how to behave, what to behave, what is it? Is it just basically providing it with the rules to work with? Is it that simple?

Harsh Joshi: So, like I said, there are two approaches to it, right? The older ways are way more interpretable, right? But that actually requires a lot of effort, and hence it does not scale well and generalize well. Where you read this book and say these are my rules, now I will encode these rules in a certain way, right? You are manually doing that. That is a more interpretable technique, but it does not scale really well because there are infinite rules; now how many rules are you going to encode, right? And then there is this new class of systems, which is not really new, like they existed for a while, but only now have they started becoming mainstream, front and center everywhere. These neural networks, where essentially you don’t encode each and every rule, you give all of this data and then you give certain instructions, or I would say examples: if I ask this question, you should answer it like that. And then the system tries to understand what is this general rule going on inside of these books that I can probably learn. It is kind of like how you build an intuition around a subject when you have studied really well, practiced it multiple times, and you kind of get the intuition: I do not need to summarize all these jargons and learn it. Essentially, this is the trend that I want to analyze. So that is what these systems are trying to do.

The AI taxonomy: ML, neural networks, GenAI

Rohit Agarwal: That sounds quite magical, that all the learnings that a person may have by, let’s say, reading a certain amount of text, and then being able to work on it for a number of years, a system can just learn it and is able to answer those questions in, you know, not that long a timeframe. You mentioned a few words there: machine learning. If this, then that, maybe you can call it more like rule-based systems or AI. Neural network. There’s a term of course like GenAI. Are they all the same? How are they different? Can you make some alignment around these maybe four or other similar words that are around this ecosystem?

Harsh Joshi: Sure. So I think it’s better understood through certain examples, right? So machine learning, neural network, these are just like different forms of the same thing. We’re talking about making systems intelligent, right? And over the last two decades, you really saw a commercialization to the point of near perfection of these technologies, right? So for example, when a piece of code in some software is trying to say that, hey, this person has marked these kinds of mails as spam, I’ll mark this mail also as spam until he tells me otherwise. That’s also machine learning happening, right? And that happens in your email systems every day. Or let’s say, hey, this person is buying these things, I should show him this product with this much discount, there’s a probability he’ll buy that also, right? Amazon. That’s also machine learning happening. So machine learning is just a class of algorithms, right? And there are different kinds of problems you solve through it. Like there’s a classification problem, there’s a regression problem. The latest and the most crazed about thing is generation problems, where these systems can also generate text, not just classify or label them, right? So AI is like a broad field and there are different ways to do that, but the scalable systems are not rule-based expert systems. These are more machine learning algorithms. Inside of these machine learning algorithms, there are different kinds of algorithms. One such category of algorithms is called neural networks. It’s essentially a piece of code that mimics how your brain works. And it is one of the most general function approximators on this planet, right? The problem is it is really opaque. You do not understand why it has formed those rules. You only understand how you taught it to get to that point. And these are two very different things.
So, all the craze that you see today is because of what really happened over the past five, six years: there was a race to, okay, let us start making the system as big as possible. If you are making a neural network, can I make it this big? Can I make it like 100 parameters or maybe 100 billion parameter size? So you’re constantly increasing the dimensionality of this data it can comprehend and the parameters it has to analyze. You’re making it bigger and bigger. And that’s what gave birth to this idea of the large language model. So it’s essentially a language model, but it’s like a really large system. Prior to that, for imaging systems, you had deep convolutional neural networks. So these are different classes of algorithms. And this was one of the systems that are all the craze these days, right? GPTs and LLMs. These were primarily, I would say, pioneered by Google, right? Around 2016, 2017, based on these BERT kind of architectures. But what really happened was OpenAI actually drove it home, right? They said, okay, I’m going to pre-train it with lots of data. I’m going to do lots of QA around it, and I am going to just optimize each and everything that I can around it. And that is what gave this one very special model that just exploded after 2022, which was GPT, a generative model which is pre-trained and built on a transformer architecture. So, it is just the same thing but in a different form, but everything is a bit more nuanced if you go into the technicality.
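
The spam example above contrasts a hand-written rule with a system that learns from what the user has already marked. The sketch below is a deliberately crude illustration of that contrast, with made-up mails and hypothetical function names; the "learning" is just counting words per label, which is far simpler than what production spam filters or neural networks do.

```python
from collections import Counter

# Hypothetical mails the user has already labelled (made up for illustration).
labelled = [
    ("win a free prize now", True),
    ("free discount claim now", True),
    ("board meeting moved to monday", False),
    ("quarterly revenue report attached", False),
]


def rule_based_is_spam(text):
    """Hand-written rule: someone has to think of every trigger word."""
    return "lottery" in text.split()


def learn_word_counts(examples):
    """'Learning' here is just counting how often each word appears per label."""
    spam, ham = Counter(), Counter()
    for text, is_spam in examples:
        (spam if is_spam else ham).update(text.split())
    return spam, ham


def learned_is_spam(text, spam, ham):
    """Flag the mail if its words were seen more often in spam than in non-spam."""
    mail_words = text.split()
    spam_score = sum(spam[w] for w in mail_words)
    ham_score = sum(ham[w] for w in mail_words)
    return spam_score > ham_score


spam_counts, ham_counts = learn_word_counts(labelled)
new_mail = "claim your free prize"
print(rule_based_is_spam(new_mail))
print(learned_is_spam(new_mail, spam_counts, ham_counts))
```

The rule misses the new mail because nobody wrote a rule for it; the learned version catches it because its words resemble mails already marked as spam.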

Rohit Agarwal: Got it. So we’ll unpack GPT a little more, but before that, some people are calling this whole sort of GenAI revolution as a 70 years overnight success. Is it because the first sort of neural network paper came out that long back and it just took 70 years for it to get to a place where it can be commercialized and people can start using it.

Harsh Joshi: So yeah, I mean, if you look at the very oldest systems, most of these papers were published back in even the 1950s. So it’s definitely a 70 year long overnight success. But if you really look at the history of adoption, around the 1990s, you had the first convolutional neural networks that were running. So there is this one OG of this field, Professor Yann LeCun, who heads Meta’s AI lab. He, at Bell Laboratories, actually demonstrated digit and character classification using CNNs back in the 1990s. The whole idea was these machines require a lot of compute and a lot of data. So you could argue most of these technologies were way ahead of their time because the infrastructure to power these technologies was not there. Now, the past two decades, you saw the emergence of this whole cloud phenomenon. Anyone and everyone was building anything using cloud only. So you had these things like big data and all of these fancy terms came about. What they’re essentially talking about is you’re taking data, you’re storing it in different places, you’re trying to distribute because all of this data cannot be stored in one place. So there are different techniques around that. Now, how fast can you process it and how efficiently can you process it and how systematically can you manage this data kind of gives a direct throughput to the raw firepower you can put inside of these models. And that’s what really started happening. And over the last five, six years, what we have seen is, on one side, the technology has evolved, on the other side, the research has also evolved to a point where, yes, maybe we can start putting them in any and every place we can imagine. But I wouldn’t say it’s a success yet because there’s a long road ahead. There are a lot of challenges. Cloud and software engineering primarily had multiple decades to evolve. AI has not yet had that decade.
It just went from research to production overnight, particularly when you talk about generative AI.

LLMs, parameters, and diffusion models

Rohit Agarwal: Got it, makes sense. Let’s talk about LLMs a little bit. As you said, the large language models. What does that really mean? Can you give us some flavor of what defines large, what defines language? And what does this kind of model really mean and is able to do? Because you told me before the show that in some ways it’s just two files. But it’s crazy to think about a model as just two files and it does so many things.

Harsh Joshi: Yeah. So see, there are going to be two things, right? One thing is you’re providing raw energy. That is raw data that you want to process. And the second thing is you are trying to approximate what this data is trying to tell you, right? Forming decision boundaries inside of this data, forming a curve through your Excel data sheet, right? These are two things. There is an algorithm and then there is this data, and once that decision boundary is formed, it is stored in this format which you call the weights of this neural network. So, in the end it is the weights and your algorithm that can generate these weights or interpret these weights, that is it. And see, the large is again a marketing term more than a scientific term, because there is really no definition: is 1 billion parameters large, or is 100 billion parameters large, or is it only at a trillion parameters that you really say it is large? So again, it is a very marketing and hazy boundary around large. But the idea is yes, we are talking in the order of magnitude of billions of parameters. That is the general concern.
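
The "two files" idea above (weights plus the algorithm that interprets them) can be shown at toy scale. In this sketch the "weights file" is a JSON string with four made-up numbers and the "algorithm" is a single weighted sum; both are assumptions for illustration, since a real language model stores billions of weights and runs a far more elaborate computation over them.

```python
import json

# "File 1": the learned weights, stored as plain data.
# The numbers are made up; a real model stores billions of them.
weights_file = json.dumps({"w": [0.5, -1.0, 2.0], "b": 0.25})


# "File 2": the code that knows how to interpret those weights.
def run_model(weights_json, inputs):
    """Weighted sum of the inputs plus a bias term."""
    params = json.loads(weights_json)
    total = sum(w * x for w, x in zip(params["w"], inputs))
    return total + params["b"]


def count_parameters(weights_json):
    """Parameter count = every stored weight plus the single bias."""
    params = json.loads(weights_json)
    return len(params["w"]) + 1


print(run_model(weights_file, [1.0, 2.0, 3.0]))
print(count_parameters(weights_file))
```

Here the model has 4 parameters; "1 billion" or "100 billion" refers to the same kind of count, just much larger.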

Rohit Agarwal: What does that mean? One billion parameter versus 100 billion parameter, what’s the difference?

Harsh Joshi: So depending on the technique on how you’ve trained it, the definition may vary. But if you simply look at it, you want to analyze a revenue versus sales kind of graph. So you’re essentially dealing with only these two parameters around which you’re trying to form a trend line. And then you say, no, I also want to study based on the segment of the company. I want to study SaaS companies’ revenue versus years. And I also want to study how that behaves if it’s not a SaaS company. So essentially what you’re trying to do is you’re trying to add more and more dimensions to the data that you’re trying to interpret. But depending on the technique, most of the time these parameters are not directly your headers. Sometimes these are principal components and so on. There’s a lot of science to it. But on an outline level, yes, you’re trying to study how many combinations and permutations can actually represent this data so that I can get a better, bigger picture of this data. That’s in a nutshell what’s happening. So when we talk about language models, GPT is the most famous, like everyone knows GPT, GPT, GPT, GPT. What powers it inside is like this Da Vinci model that used to be there. So what these models are actually doing: these are sequence-to-sequence encoders. So for example, you gave a prompt and ChatGPT gave you like a really fancy essay on that prompt. That is a sequence of tokens that got translated to another sequence of tokens. Tokens are just like words. Only thing is, in literature, when you say “it is a boy”, these are different words. But there are certain permutations that can be arranged around it. And again, that’s a whole field of how do you tokenize things better and what efficiency can you drive with those things. But the closest counterpart you would say is a word. So these systems, they are really large in size. And why are they large in size? Because they’re trained on really large amounts of data.
And these are sequence-to-sequence encoders. So they have a really good understanding of how these words come after each other, the sequence of words. So that is a ballpark of what these large language models are doing. But again, this is all very high level; there are many subtleties and details to each and every one of those terms.
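
The idea of tokens and of "how these words come after each other" can be illustrated with a toy next-token predictor. This is not how a large language model works internally: it assumes one token per whitespace-separated word (real tokenizers split text into sub-word pieces) and it only counts which token most often follows another, using a made-up corpus and hypothetical function names.

```python
from collections import Counter, defaultdict

# Made-up training text for illustration.
corpus = "revenue grew last year . revenue grew again this year . costs fell last year ."


def tokenize(text):
    """Simplification: one token per whitespace-separated word."""
    return text.split()


def train_bigrams(tokens):
    """Count, for every token, which tokens follow it and how often."""
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, token):
    """Return the most frequent follower, or None if the token was never seen."""
    counts = follows.get(token)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


tokens = tokenize(corpus)
model = train_bigrams(tokens)
print(predict_next(model, "revenue"))
print(predict_next(model, "last"))
```

A large language model replaces the counting table with billions of learned weights and looks at the whole preceding sequence rather than one token, but the task (predict what comes next) is the same.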

Rohit Agarwal: Makes sense. So there are these GPTs, or more language-oriented models, which one could say are made more famous by OpenAI’s ChatGPT product. And then there are other products in the market which are doing image generation or video generation by just taking a text prompt from the user. And the models behind them are called diffusion models, is that right?

Harsh Joshi: Yeah.

Rohit Agarwal: So what’s a diffusion model? How is that different than a GPT or an LLM?

Harsh Joshi: So I mean, think of it this way, right? Let’s talk about finance for analogy, right? So you want to get to the valuation of companies. So there are different ways to do it. There’s the discounted cash flow method. There’s net asset value. So to do one thing, there are multiple ways, right? And each way actually is not just, OK, I have different ways to do one thing. But each way gives you insight into a certain dimension of that problem. You understand it better from that aspect. So similarly, these are different approaches to building these neural networks and solving these problems. Some of them work really well for image-based problems. So, diffusion models is one of those approaches where essentially there are two processes. One process essentially just focuses on generating an image and trying to identify if it’s the right image or not. The second process is trying to add noise and trying to figure out what is the distinction between these two things. So one is trying to confuse the other, the other is trying to beat it, and it is an eternal loop, and you want to run this loop to the point where a human understanding of this image is good enough. So for example, about 5, 6 years back we had these tools Google released called Deep Dream models. They were also doing the same thing that these diffusion models are doing today. Only thing was they were not that good, right, because the definition of quality is really subjective, right, for an image. Now, these systems have been perfected and fine-tuned to the extent that what they are generating is just blowing people away, right. So, it is a different approach. It comes from this primarily one class of architecture called Generative Adversarial Networks or GANs, which again date way back, to 2016 and before.
The only idea is that now, with the amount of investment, capital, fine-tuning, advancement, resources and technology, they have gotten to a point where the generation output quality is acceptable; it is better.
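
As a rough illustration of the noise-adding process mentioned above only (the learned part that turns noise back into an image is omitted), the sketch below blends a clean signal with Gaussian noise. The blending formula is the one commonly used in denoising diffusion models; the `keep_fraction` name, the four-value stand-in for an image, and the chosen noise levels are assumptions for the example.

```python
import math
import random


def add_noise(signal, keep_fraction, rng):
    """Blend a clean signal with Gaussian noise.

    keep_fraction is the share of the original signal's variance retained:
    1.0 leaves the signal untouched, 0.0 replaces it with pure noise.
    """
    signal_scale = math.sqrt(keep_fraction)
    noise_scale = math.sqrt(1.0 - keep_fraction)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in signal]


rng = random.Random(0)          # fixed seed so runs are repeatable
clean = [1.0, -1.0, 1.0, -1.0]  # stand-in for pixel values

# Progressively noisier versions of the same "image".
for keep in (1.0, 0.5, 0.0):
    print(keep, add_noise(clean, keep, rng))
```

Generation runs this in reverse: a trained network starts from pure noise and removes a little of it at each step until an image emerges.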

Pre-training, fine-tuning, and RAG

Rohit Agarwal: Got it. Interesting. You mentioned the pre-training with the GPT models, right? That’s what the P stands for. Then you mentioned something around fine-tuning, right? And then there is another word which is somewhat related, which is RAG, right? Can you unpack these three, how they’re similar, different, interrelated?

Harsh Joshi: So, RAG is a very different kind of a thing. Let us first talk about pre-training and fine-tuning. So, again, the way to understand this is when you start from scratch, you are kind of training a model, pre-training. But once you have trained it enough, now you just want to give it a special spin. Think of it this way, you have done your bachelor’s, now you know engineering. Now you just want to get really good at software engineering. So you do a master’s in software engineering. That kind of thing. So when you want to give a special flair or kind of train it for one specific purpose without discarding all that previous mass of knowledge that you have built. So that’s the whole idea of fine-tuning. So it’s that 80-20 rule in play. Arguably based on the technique, sometimes in fine-tuning also you go really deep. Sometimes you really don’t go beyond one layer. These are more marketing concepts, but technically it depends on to what depth of this model you are going to train it. And arguably in pre-training, you’re starting from scratch, so you’re training everything. So that’s the difference. And for fine-tuning and training, there are a lot of algorithms, there are a lot of techniques, well-defined techniques to solve it. But to a certain extent, it’s an art. Because you throw a certain kind of data at the problem, you twist certain parameters, you instruct it to behave in a certain way, you use different kinds of algorithms, you observe: this happened this way, it should have gone this way. Then you go back, change those parameters. So it’s more of an art, you’re sculpting the model to behave in this particular way only. While there are techniques, it’s more of an art. When you talk about RAGs, that’s a completely different area, but… So think of RAG.
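
The distinction above (pre-training updates everything, fine-tuning may touch only the last layer) can be sketched with a two-weight toy model. Everything here is an assumption for illustration: the model is `y = w2 * (w1 * x)`, the data is made up, and plain gradient descent stands in for real training. Pre-training updates both weights; fine-tuning freezes the first "layer" and adjusts only the second.

```python
def train(w1, w2, data, steps, lr, freeze_first_layer):
    """Gradient descent on squared error for the toy model y = w2 * (w1 * x)."""
    for _ in range(steps):
        for x, target in data:
            hidden = w1 * x
            error = w2 * hidden - target
            # Both gradients are computed before either weight is updated.
            grad_w2 = 2 * error * hidden
            grad_w1 = 2 * error * w2 * x
            w2 -= lr * grad_w2
            if not freeze_first_layer:
                w1 -= lr * grad_w1
    return w1, w2


general_data = [(1.0, 2.0), (2.0, 4.0)]  # broad task: y = 2x
domain_data = [(1.0, 3.0), (2.0, 6.0)]   # specialised task: y = 3x

# Pre-training: start from arbitrary weights and update every layer.
w1, w2 = train(0.5, 0.5, general_data, steps=200, lr=0.02, freeze_first_layer=False)
pretrained_w1, pretrained_w2 = w1, w2

# Fine-tuning: keep the first layer as it is, adjust only the last one.
w1, w2 = train(w1, w2, domain_data, steps=200, lr=0.02, freeze_first_layer=True)
print(pretrained_w1, pretrained_w2, w1, w2)
```

Fine-tuning here reuses what the first layer already learned and changes only one number, which is why it generally needs less data and compute than training from scratch.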

Rohit Agarwal: We’ll come to RAG. So, on pre-training versus fine-tuning: you talked about it being an art and it being an 80-20 thing. So arguably what you’re saying is that LLMs are generic in nature. They are built by these few companies to be used at as large a scale as possible. And then whoever is ultimately using them, I would imagine for business purposes, can use fine-tuning to make that LLM a lot more customized for their own purposes. Is that the right way to think about it? Okay. Well, I guess then the question again comes around to the large language models. Where I was going was, if I have enough data, can’t I just have my own LLM? Or I guess there is a term, SLM, small language models, that is also being thrown around. Are those…

Harsh Joshi: Yeah, that’s true. That’s right.

Rohit Agarwal: …kind of basically that? That I have enough data and I could just train my own model from scratch?

Harsh Joshi: So when you talk about SLMs, LLMs, like I said, these are mostly marketing words, right? They don’t have a lot of technical significance. It’s a scale, right? How many parameters do you want the system to train on? How many layers do you want this model to have? How dense should this network be, right? So it’s about those things. So the point here is, if you have your own data, can you create your own model? The answer is yes, you can create it. The question is, should you create it? That depends on a lot of factors. The point is there are a lot of subtle connections these networks form that today we don’t really understand. The community is split and debating; there’s futurism, there’s doomerism, there are different schools of thought and all that. But if you cut through the noise, essentially the entire community has consensus that we don’t really understand how these connections are being formed, in a deterministic way that could be predicted before the connections have happened. So because of that, there are a lot of new interesting properties that arise. So for example, if you have a model that’s trained on a corpus of a lot of language data, arguably it has gotten really good at grammar. Now let’s say you have your own customer support data and you want to fine-tune it on that; you don’t need to teach the model grammar. So whether you should pre-train or fine-tune really depends on the use case. Do you want to do all of that work is the question. Can you do it? Yes. The answer is always yes to these things. If you have enough money and if you have enough compute and talent, you can do anything you want. There’s always the question, should you do it?

Rohit Agarwal: Got it. Makes sense. Why don’t we now move to RAG, where I interrupted you while you were starting to explain.

Harsh Joshi: So, RAG is a very interesting area, right? And the way to understand it is through an analogy with the software development world. So, prior to this whole AI being front and center in all of these spaces, software engineering was the way to go about solving any problem you wanted to solve, right? And in software engineering, there’s this fundamental concept of operating systems, right? So these operating systems have two or three fundamental aspects. One is, OK, they’ll control your hardware. They’ll know when to call in the processor to process something. In RAG or agentic systems, your LLMs or these models are the processor unit. That’s it. Now, all of these operating systems have threads or processes; they are managing multiple processes at a given time, or maybe they are doing them one by one. But these are different processes happening inside of a system, independent items that can talk to each other. In these new AI architectures, that is what your agentic systems enable. That is what these agents do. There are lots of agentic frameworks. That is essentially what these are doing. Now, inside of these traditional systems you had memory, where all this computation was happening and the data was being cached. In an AI system, that is the context. This is the context window: how many tokens can I process at one given time, which means not everything can fit into the context. So you have to bring certain things into the context from time to time. That used to happen in software engineering with this concept of IO. That’s what RAG enables in these modern AI architectures. So arguably, if you have a huge customer data set, and then you have a really good AI system deployed, and your customer information is, let’s say, terabytes, let’s say, gigabytes: no model today will give you that much context. And you can’t constantly keep on training your model on all of that data. 
So what I say is, okay, I’m going to make sure my AI behaves in a certain way, and whenever it needs something, it can go look it up from that table and get it for me. That’s the whole idea of RAG, retrieval augmented generation. I’m going to retrieve some information from some places, and then I’m going to augment the generational capacity of this AI with what I have retrieved.
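
The retrieve-then-augment loop described here can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular product works: the documents are invented, word overlap stands in for the embedding-based search a real system would likely use, and the final call to a model is left out entirely.

```python
import re

# Hypothetical records standing in for a large, frequently refreshed data
# store that is far too big to fit in a model's context window.
DOCUMENTS = [
    "Invoice 1001: customer Asha, amount 250, status paid",
    "Invoice 1002: customer Ravi, amount 90, status refunded",
    "Refund policy: refunds are allowed within 30 days of payment",
]


def tokenize(text):
    """Lower-case the text and split it into a set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Retrieval step: rank documents by word overlap with the question
    (a simple stand-in for vector similarity) and keep the best `top_k`."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Augmentation step: place only the retrieved text into the context."""
    context = "\n".join(retrieve(question, documents))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("What is the status of invoice 1002?", DOCUMENTS)
print(prompt)  # this string is what would be sent to the model
```

The model itself never sees the whole data store; it only sees the small slice that retrieval pulled into the prompt, which is the IO analogy from the conversation.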

Rohit Agarwal: Got it. Can you share an example or two of where this RAG technique is being used today quite prominently?

Harsh Joshi: So when you use ChatGPT and you ask it certain questions, you see it gives you sources, right? “Hey, I found this article over here.” So that’s RAG happening in real time, right? Bing uses it, and so do copilots. And even in your customer support systems; the first adoption we’re seeing of generative AI is in customer support, right? In most of these areas, what’s actually happening is it’s going and looking up your past invoices so that it can tell you, okay, what kind of refund should I enable for you or not, right? So you have to look something up, because this data is constantly being refreshed. You can’t keep training your system on all that data at the rate at which it is being refreshed. And neither can all of this data fit in the context window of your model, right? That means you need a separate pool of data. In traditional software architecture, that was the job of databases and IO, the input/output against that data. That essential piece is being replaced by RAG in these AI architectures.

Prompt engineering, vector databases, and MLOps

Rohit Agarwal: Super. Quite interesting. Let’s talk about prompt engineering now. I mean, there are even courses on prompt engineering. From a user’s perspective, it seems like, hey, you’re asking a question. The system should be intelligent enough to give you the answer that you intended to ask for, even if you didn’t ask it the right way. But in AI, the way you ask the question seems to make a hell of a lot of difference. You may get completely surprised by the quality of the answer, either way. So help us understand what exactly prompt engineering is. Is it the way the user is putting the request into the application, or is it something that is happening behind the scenes?

Harsh Joshi: The short answer is I feel prompt engineering is a field that’s going to go away as fast as it has come. That’s the short answer for it, right? And I’ll tell you why. There are different schools of thought on it, but for that we need to first understand how AI systems actually operate, right? When you see an image, you see, is it my nose or is it your nose? Is it my eye or your eye? That’s not how an AI system sees the world. For it, an image is a matrix that describes the intensity at a particular point in the red, green and blue channels. So there are three matrices. So the difference between your nose, my nose, and horse poop for a model is 0.01835231. That’s how a model sees it. A model does not differentiate between my nose, your nose, or horse poop. It just differentiates between these boundaries of data points that are scattered across these spaces. And if the difference is good enough, it will just give you that output. That is the whole idea of vectorization, of breaking things into vectors. That is how a model internally represents data. It is stored in the form of these vectors. Now, let us say that somewhere in this latent space, where the model is roaming about internally after all the training you have done, there are certain vectors that are clumped together, certain areas that get activated on this word “markdown”, because apparently in that corpus the words “markdown table” were used a lot. Now, when you are using this model and you’re saying, hey, ChatGPT, give me a table of these three numbers, you might not get the answer that you’re expecting. But when you say, hey, ChatGPT, I want you to generate a markdown table for this particular data set, 
something in that region inside the model is going to activate, and you’re going to get that desired output. Right? So what’s essentially happening is you’re just trying to guess which word is going to activate which kind of behavior, without actually knowing what kind of corpus that model was trained on. So I mean, it’s one of those beginner’s luck kind of things, right? Your first guess will be good and you might be in that gambler’s fallacy kind of state: yes, prompt engineering is doing some real damage. But in the end, till the time we really understand how these models behave and operate, and can control them, prompt engineering is not a sustainable way to solve these problems. In most of these cases, prompt engineering is hit or miss. So it’s good to get started. It’s good to play around with. It’s good to generate the buzz and get people involved in contributing to AI. But personally, I don’t think prompt engineering actually moves any significant needle beyond being a hit-or-miss kind of thing. It’s not a scientific way to solve a problem; you really don’t know what’s happening inside the model, what’s being activated.
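
The "three matrices" picture can be made concrete with a tiny example. The pixel values below are invented, the images are only 2x2, and the mean absolute difference is just one simple way to reduce two images to a single number; it is an assumption for illustration, not the measure any specific model uses.

```python
# Two made-up 2x2 "images". Each pixel is (red, green, blue) intensity in [0, 1].
image_a = [
    [(0.9, 0.7, 0.6), (0.8, 0.6, 0.5)],
    [(0.9, 0.7, 0.6), (0.7, 0.5, 0.4)],
]
image_b = [
    [(0.9, 0.7, 0.6), (0.8, 0.6, 0.5)],
    [(0.9, 0.7, 0.6), (0.7, 0.5, 0.1)],
]


def channel(image, index):
    """Pull out one colour channel as its own matrix (0=red, 1=green, 2=blue)."""
    return [[pixel[index] for pixel in row] for row in image]


def mean_abs_difference(a, b):
    """Average absolute difference over every channel of every pixel."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            for x, y in zip(pixel_a, pixel_b):
                total += abs(x - y)
                count += 1
    return total / count


red_a = channel(image_a, 0)  # one of the three matrices for image_a
difference = mean_abs_difference(image_a, image_b)
print(red_a)
print(round(difference, 6))
```

Nothing in these numbers says "nose" or "eye"; the two images are simply a small distance apart, which is the point being made about how a model sees data.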

Rohit Agarwal: Got it. You mentioned something around vector, vector database, vector embeddings. What are those? What role do they play in this whole matrix?

Harsh Joshi: Yeah, so like I said, in the end these systems are number processors, as much as there is the illusion of them being sentient or intelligent or smart or whatever the cool kids are calling it these days. In the end, they are just processing numbers and how these numbers are grouped in this weird space, right? So in your Excel and in all of this finance data, you compare two numbers. There’s x versus y, and you are comparing one piece of data with another. With these models, whatever data you give, whether it is a photo or an image or a document or a number or a sheet or an essay, everything essentially gets converted into a sequence of numbers that gets plotted somewhere in this high-dimensional space. So this art of translating any data into a sequence of numbers is the whole job of embeddings. It is this vector space in which they get represented. So in your legacy databases, you have this idea of records, right? You’re storing things in records, and there are columns. So that’s just a representation of things. For abstract data, like an image, how would you go about storing it? Where would you place my nose and your ear inside of the same image in a database, right? So really smart people came up with the idea of creating databases which vectorize this information and store it in a vector format that is easy for a model to understand. You might not be able to make sense of it, but then you really do not have to; your model needs to.
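
What a vector database does at lookup time can be sketched with a handful of hand-made vectors. Everything here is an assumption for illustration: real embeddings come from a trained model and have hundreds or thousands of dimensions, and real vector databases use indexes rather than scanning every entry. Cosine similarity is one common way to compare vectors, not the only one.

```python
import math

# Hand-made 3-dimensional "embeddings", invented for this example.
VECTOR_STORE = {
    "refund request": [0.9, 0.1, 0.0],
    "invoice copy": [0.7, 0.3, 0.1],
    "holiday photo": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, store):
    """Return the stored item whose vector is most similar to the query."""
    return max(
        store,
        key=lambda name: cosine_similarity(query_vector, store[name]),
    )


# A query that has already been turned into a vector by some embedding step.
query = [0.85, 0.15, 0.05]
best = nearest(query, VECTOR_STORE)
print(best)
```

The store never compares words or pixels; it only compares positions in the space, which is why the same mechanism works for text, images, or any other data that can be embedded.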

Rohit Agarwal: Interesting. Is it safe to say that what SQL was for structured data, these vector databases and these vector embeddings are for unstructured or abstract data?

Harsh Joshi: Yeah, arguably that’s a pretty sound analogy.

Rohit Agarwal: Okay.

Harsh Joshi: I would say that all these analogies depend on the audience. So if you’re talking to a business person or a product person, these analogies make sense. But if you make that direct comparison in front of a scientific community, just be ready for torches to blow up.

Rohit Agarwal: All right, makes sense. Let’s talk about another term, AIOps or MLOps, right? At times it seems to be about the composition of teams. At times it represents a certain set of tasks that people are doing that fall under it. But of course, given this is such an emergent discipline in itself, there is no sacrosanct way of saying what MLOps is and what AIOps is. Are they two different things? Are they completely separate? So help us understand what, in your view, is MLOps or AIOps.

Harsh Joshi: So I think the short answer is MLOps is the difference between you calling your engineer to show a demo to your board members and your AI system being consumed daily by your customers, and them actually really liking it. That’s the short answer. Because what’s really happening with these systems is, one part is the research problem. Okay, you solved it really well. And the other part is making software out of it. Okay, you did that as well. But then comes the bigger problem, right? How do you distribute it? How do you scale it? There are a million people using it simultaneously, right? Then there come security issues, challenges around complexity, and the cost blows up. A lot of things happen, right? It’s not just magic, right? It’s not like your DevOps team just goes and drinks Red Bull and, bippity boppity boom, now the system is live. They’re actually doing some really interesting engineering to solve these challenges. So that entire class of problems that you solve to take these systems to production comes under this category of DevOps. Now, DevOps was, and still is today, one of the key pillars of software engineering, particularly in the SaaS domain. Now with AI, the same problems need to be solved, but in very different ways, and a lot of these problems are unsolved. So for example, you don’t have this notion of decomposability, controllability, explainability; there are a lot of segments that are still unsolved. So the whole area of AIOps is an evolving space, because these things were taken for granted in the software engineering world, and then over two decades these technologies matured on the distribution side, on top of problems that were already solved. Now in AI, you have to first solve those problems, and then you have to again do the same work that DevOps did for the past two decades to get it to maturity. So this whole evolving space is AIOps or MLOps or LLMOps; again, jargon, depending on who you are talking to.

Why business AI is hard — control, explainability, privacy

Rohit Agarwal: So let’s understand why an AI application is so different than a non-AI SaaS application. What is it that makes it uniquely challenging?

Harsh Joshi: So there are lots of challenges, right? There is not a single challenge that you can point to. And all of these challenges come together to create a big chaos, right? So let us tackle them one by one. So in your legacy systems, you had this idea of decomposability. What it simply meant was, hey, I want to solve a problem, let us create software out of it. How will I create software out of it? I will create certain modules of this software, and then I will go and create certain classes, certain functions inside of these modules. These are fundamental building blocks. Each block can be customized, programmed, and made to interact with other elements. You have this idea of decomposing a big problem into small, small, small problems.

Rohit Agarwal: So microservices, is that what one would call decomposability in the SaaS world?

Harsh Joshi: No, see, microservices is a very generic buzzword around a particular way of architecting SaaS systems on the cloud. But the idea of decomposability runs really deep, down to the principles of compiler design and programming languages. It is a fundamental idea. Can you break things down into simple atomic blocks? That is not how AI works. Particularly these models that generalize really well but are not interpretable, I am talking about neural networks; they are not decomposable. You cannot go to a model and say, I will just talk to these particular neurons inside of this model. It doesn’t happen. You either talk to a model or you don’t talk to a model. That’s it. It’s zero or one. So that decomposability is not there. Now another thing is, determinism is not there. So for example, in the software engineering world, you had this idea: I am writing this function, it is an addition function, it is supposed to return 2 plus 2 is equal to 4. So I can test it; is it giving 4 or not? There is a certain amount of determinism, because you know each line will behave in a certain way. Again, in AI, it is just trying to generate a sequence of tokens, at least the way you have AI today, which means you really do not know what is going to come out; there is only a certain probability of what might come out. So again, there is no determinism. How are you going to test the output? Because the output itself is not deterministic. So it’s a chicken and egg problem. It’s a loop over there. Now there’s another challenge. Because these things were decomposable and deterministic, two really awesome things came out of it. They were controllable and explainable. You knew what your software was doing, and you were controlling it to do a very particular thing. Now, inside of these systems, you really do not know what is happening. So, you know how this architecture is designed. 
The community understands that really well. You know how you have trained it and what sort of data you have trained it on. You just do not know deterministically what part of this model will get activated when what kind of question is asked, and what kind of output it may spit out. So that level of explainability and controllability is lacking. This is a problem because we just got used to taking these ideas for granted, and then we built some really amazing technologies on top of them. A lot of smart people built a lot of cool technologies over the past couple of decades. But those principles don’t directly apply to AI systems. So you have to kind of rethink a lot of problems from first principles. Hey, how do I go about solving this? Do I want my customer support agent to go tell my customer, you go f$$k yourself, I’m not going to give you this receipt, right? It’s not going to happen. How do I do QA on this problem? So a lot of these things are still unsolved. And the entire community is working on them. Each of these problems is an area of its own to solve for, and a long road.
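
The testing problem can be shown side by side. The addition function can be checked against one exact answer; the token sampler can only be checked against a distribution. The token probabilities below are invented, and sampling from a fixed table is a deliberately simplified stand-in for how a language model picks its next token.

```python
import random


def add(a, b):
    """Classic deterministic code: same inputs, same output, every time."""
    return a + b


# Made-up next-token probabilities for some customer-support prompt.
NEXT_TOKEN_PROBS = {"refund": 0.6, "receipt": 0.3, "sorry": 0.1}


def sample_token(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# The deterministic function has exactly one correct answer to test for.
assert add(2, 2) == 4

# The sampler does not: two runs can legitimately disagree, so all we can
# check is that outputs are valid and roughly follow the expected shares.
rng = random.Random()
samples = [sample_token(NEXT_TOKEN_PROBS, rng) for _ in range(1000)]
share = {token: samples.count(token) / len(samples) for token in NEXT_TOKEN_PROBS}
print(share)
```

This is why quality assurance for generative systems tends to look like statistics over many runs rather than a single pass/fail assertion.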

Rohit Agarwal: Got it. Are these also the primary reasons why businesses are still just scratching the surface with the adoption of AI? There’s a lot of buzz. Of course, OpenAI has been the fastest to cross a billion in ARR and so on and so forth, right? For the right reasons, but again, that seems to me to be more focused on personal usage, or people trying it as a new magical tool that is in the market, not businesses that are adopting at the largest of scales, where I could argue every single business could have some use for it, whether from a productivity side, a cost reduction side, or a revenue enhancement side. And so, beyond controllability, explainability and decomposability, are there other things that are stopping enterprises from adopting this new technology?

Harsh Joshi: So, the short answer is, yeah, there are many, but if you really go deep into it, these are the fundamental problems. Everything else is a compound combination of them. So for example, if you cannot control and if you cannot explain, that means you cannot do good quality assurance. That means you don’t know what you’re throwing out in front of your customer, right? Or let’s say if you cannot decompose and if you cannot control, that means you cannot build those rich journeys around it. So there are many problems. And if you cannot explain something, that means you cannot govern it. You don’t know what’s happening. The first step of governance is you had better be able to explain something. So what will your compliance teams do? Let’s say you build something, and the model behaved in a particular way. Who’s liable for it? The company is liable for it. The company is liable for an adverse effect. What does the company do to fix it? So these are just permutations, combinations and different manifestations of the fundamental problems that we described. As long as these are not solved, or at least solved significantly enough to be able to roll things out first. Because again, this will always be a chicken and egg. Without rolling things out, you will not know what to do better. But to roll things out, you need a certain level of assurance. And it’s that loop where the whole industry is stuck, which is why you saw adoption in these consumer segments where this was not a mission-critical application. This could not lead to any kind of financial loss. For example, hey, I’m just chatting with the chat assistant. What’s the worst that can happen? There you saw adoption of GenAI coming in. Right? Now, let’s see whether the insurance segment really drives it home at that level. And the answer is, it’s not happening. At the root of it is this problem. And it’s not like these segments don’t use AI. 
For example, most of these hedge funds and all of these systems, they do use AI. And if you really go investigate the kind of AI these systems are using, they are using really old-school systems, decision trees. Why? Because they are interpretable. So your problem might have millions of dimensions, but the decision tree that gets deployed in production will at max have 70 dimensions to capture, or 700, right? Why? Because it gets the job done and you know exactly how it will behave in a certain situation. So it is going to be a long road; it is not going to happen overnight.
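
Why a decision tree counts as interpretable is easiest to see when the tree is written out by hand. The rules and thresholds below are hypothetical, chosen only to echo the refund example from earlier in the conversation; a production tree would be learned from data, but it could be read the same way.

```python
def approve_refund(amount, days_since_payment):
    """A hand-written two-level decision tree with made-up thresholds.

    Returns (decision, reasons), where `reasons` records every rule that
    was evaluated on the way to the decision.
    """
    reasons = []

    if days_since_payment > 30:
        reasons.append("more than 30 days since payment")
        return False, reasons
    reasons.append("within 30 days of payment")

    if amount > 500:
        reasons.append("amount above 500 needs manual review")
        return False, reasons
    reasons.append("amount at or below 500")

    return True, reasons


decision, reasons = approve_refund(amount=120, days_since_payment=12)
print(decision, reasons)
```

Every outcome comes with the exact path that produced it, and the same inputs always follow the same path. That is the property a compliance team can sign off on, and the one a large neural network does not offer today.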

Rohit Agarwal: Got it. Makes sense. Privacy is another thing that I think many businesses are struggling with to really put their arms around and say, okay, we can deploy XYZ AI solution. And amongst everything else, privacy is taken care of, our data is secure. What do you think about that particular evolution of that side of AI systems?

Harsh Joshi: Yeah, so arguably privacy was always a problem and will always be a problem, right? And for all the right reasons. I mean, you don’t want your insurance company to find out, let’s say, that you ate two more samosas than usual yesterday, because if they have that data, they can arguably make a case for jacking up your prices. Right? So it is all for good reasons; you need to protect a certain kind of privacy around you. And as data becomes more commonplace in society, and the technology becomes more commonplace in society, your need to actually rely on the data is increasing, and simultaneously the awareness of the consumer around what that data really represents is also increasing. So these are two opposing forces that push against each other, like how mountains are formed, right? One landmass hits another and mountains are formed. So that’s how privacy is becoming one of the biggest problems to solve for. And it should be. And again, you don’t have determinism, you don’t have explainability, all of those things. So once the data goes inside the model, you really don’t know how and where it’s going to come out, and in which form, right? Which means, if you do not know that, be careful what you are putting out in production.

Rohit Agarwal: Makes sense. Sometimes I wonder, with so many chat or text generation applications out there, people are generating content left, right, and center. There is perhaps going to be a time where the content that is getting generated is coming out of AI, and that content is being used to train the AI systems of the future. Is that not going to create a garbage in, garbage out kind of loop at the end of the day?

Harsh Joshi: So the interesting thing is we should be scared of it, right? Because at one point, it kind of becomes like a self-fulfilling prophecy kind of thing. You really don’t want that kind of thing to happen because… So there’s this really interesting book called Moral Tribes. Okay. It’s written by a Harvard professor. At its core it argues this one fundamental principle, that humanity does not have an objective definition of morality today. It is a subjective thing. How do you expect these smart systems to have an objective definition of something? You yourself do not have an objective definition of it; four people have different definitions. How would this model have one? Which means the more data it gets trained on, the more it is just amplifying what is inside of it. So if there is a bias, it is going to get amplified. If there is particular garbage, it is going to get amplified, which means with these systems generating output, it is just amplifying that thing. And now again, you put that back into the training process. So you are creating an outcome that you have no control over. And there are a lot of techniques coming out. So, see, arguably this seems like a chaotic thing. But then there are a lot of smart people around the planet, and there are a lot of smart mathematical techniques to filter through it. But then comes the question, to what extent can you solve it? And what better guardrails, policies and practices can be put in place? And some of these things will not be done by one government or one company. For example, Bluetooth is a standard format. If every company had a different definition of how to transmit that signal, no two devices would be able to communicate over Bluetooth. Some things will have to be standardized in the end. And it’s all for the betterment of the whole ecosystem.

Open source, big tech, and the AI economics

Rohit Agarwal: This is starting to make some sense now. So now there is OpenAI, right? There is Anthropic’s Claude, Google has some models, and these are more closed source models, right? And then on the other end, you have Mistral, Llama, and a bunch of others that are more open source models. Why is open source exploding on this whole AI thing? I mean, it was prevalent in the SaaS world. I would say a bunch of interesting projects, really seminal companies, came out of that whole movement, but it was never so front and center in the conversation, right? I still remember when IBM bought Red Hat, it was a big deal, like a huge deal. Like, why is a company like IBM buying an open source company? Same thing when Microsoft looked at GitHub. Similar kind of logic. What’s so special this time, where it seems that, maybe not from a commercial perspective yet, but at least from a pace of innovation perspective, open source is really keeping pace with the closed source community or models or whatever you want to call it on this whole AI trend?

Harsh Joshi: So part of this answer will be speculation, of course. But I don’t think Llama or Mistral or any of these open source systems being run by big corporates is being done as a social benefit scheme. There’s always an economic aspect to it. So first, let’s understand what’s closed source and what’s open source. Open source does not necessarily mean it is free. Not everything that is open source is free. So there are different kinds of licenses. Hey, you can use this but not beyond this limit. You can use this, but if you make this change you have to show it as well. There are different classes of licenses. One of the biggest confusions is that open source is always free. No, open source means your source is open; you can go inspect it. And then there are different levels of how open it is. Now, the thing with OpenAI and all this is that it is completely closed source. Like, you call my API, you use it, you pay me, that’s all you know about it and that’s all you can know about it. If I feel like it, I’ll publish a paper; that’s the whole idea over there. Now, the argument that one could make is everything is an economic opportunity. So there is no arguing the fact that Google owned search for the past couple of decades. They literally owned it. Which means you cannot compete with Google if you are building a business around search. It is just not going to happen. And with more capital and more talent and more distribution, they kept on making their search business so good that it was just a monopoly. Same with social media and Meta. So each of these big tech companies had certain areas. And today, make no mistake, big tech owns AI at this point. It is just plain, front and center. But there is still not one big king of the whole space. And the idea is, if you can own AI, you would have a monopoly unlike any other. 
It’s an asymmetrical force, which means you’re going to do anything and everything to stop others from having that monopoly too, right? Not just build your own. So what can stop a monopoly in this space? Because this field is so raw, a lot of concepts need to be translated and new concepts need to be created. So a lot of experimentation will be required. It’s not going to happen in one day. And no one company can do it. So open source has really tried to get the community together and share the benefits with the community. That has always happened. So Hugging Face kind of created this GitHub for AI, and they’re like, you know what, you guys start sharing your data, start sharing your models. I’m going to create a centralized leaderboard. Everyone can participate. Everyone can fork each other’s models. That actually brought down the barrier to entry in this whole AI space for new entrants. AI is a problem of data, capital and talent. If you have those three, then all you need is distribution to iterate faster. Now, these four items are a monopoly of big tech. So it was a matter of bringing that barrier down. This is where companies like… again, this part is pure speculation and everyone can have their own opinion. But I do feel like Meta has not rolled out Llama out of the goodness of their heart. Right now, open source is just good business strategy in AI. If you do things in open source, the community is going to test it for you. The odds of you creating something that beats something else are going to increase, and it is going to give you more distribution, reach and adoption. So I am a bit skeptical of this whole idea that when a corporate does something open source, they’re doing it out of the goodness of their heart. If they are, good for them, good karma. But otherwise, open source is the new favorite word of the day for every company. And everyone has their selfish motives around it. But that’s not the soul of open source. 
So the entire internet is built on that foundation. The entire spirit of the internet is open source. HTTP, WWW, and then you have this idea of Git. All of these things are open source. Linux is open source, right? GNU, Unix, all of those things. All of these data centers and servers around the world use Linux to run. So with all those things, are you rooting for the underdog or are you trying to take advantage of the underdog? Is there a selfish motive to it? So this answer is purely speculative, and it opens up a lot of conspiracy theories and all those things.

Rohit Agarwal: Got it. I’m sure we will see a lot of evolution on the open source side. We’ve already seen a bunch of different technologies that were previously open source going closed source. So I’m sure some of that phenomenon is also going to happen in this ecosystem. All right. So let’s talk about copilots a little bit. Companies are generating text, images, videos, and you could say every SaaS solution really has a copilot. Whether it’s good or bad, I think that’s questionable. But they are trying to make sure that they don’t fall behind in this whole AI race and have at least one AI companion that they can put out to their customers. I’m pretty sure there are a ton of challenges to really operationalize these copilots, and even more so to improve them to a level where you can really start charging something meaningful for them. Can you talk about what you are seeing in terms of the issues that companies building AI solutions to put in front of their users are facing?

Harsh Joshi: So I think building a copilot is a good way, right? The whole idea is you want to make things simple. You want to cut through all the steps and AI helps you do that. And you will create rich journeys. Good. The challenge comes when you’re building any sort of copilot system. You pretty much have only two ways to go about it, right? Either you invoke an LLM directly, make an API call to that LLM, or not just an LLM, any kind of model that you want to use, and just make that model really strong. You go deep into the AI, fine-tune it, optimize it, work on it, put it to your use case, and make an API call to that. Or you create an agentic OS kind of structure: agents, RAG, the LLM being invoked at different steps, looking up certain databases so that you can get some semblance of control. And then your customer essentially talks to this entire system that you have created instead of them talking directly to you. These are the only two ways. Now, in either of these two ways, you can either use a closed source system, like OpenAI, or you can create your own model or, let us say, take any open source model and build on top of it. So, two approaches to how the model is called, and two approaches to where it comes from: closed source or open source. If you are going with the closed source approach, that is the best way to start, because there is not a lot of complexity in simply making an API call and getting the output to get started. But once you really want to make it a production-grade system, get some quality output from it, and make it behave in a certain way, the only option you have is to create some sort of agentic framework, some RAG mechanisms around it, and then probably host your own model somewhere. Because you need to have control of your weights so that you can understand, optimize and improve those systems.
This is where all of those challenges that I described come in, right? The challenge with controllability, the challenge with interpretability, the challenge with determinism, the challenge with decomposability. After a certain point, you will see them in your face. You can get from 0 to 80 like this. This is the whole idea with these generative AI systems, right? Get a demo of something out very fast. Now, once you want to go from that 80 to 100, that long-tail problem space, your model needs to be really very... hey, in this scenario, behave this way; in this scenario, don’t behave this way. The challenge is you don’t have tools to solve that problem. Even if you have tools to solve that problem, they’re still research problems. And if they are solving it, they don’t solve it completely, right? And that is where most of these companies struggle. So we have been onboarding a few pilots at DAO Studio to use our product YOJN. And we have been talking to a lot of customers, and almost every conversation ends with these points only. Hey, how do I make a call to this area? And the answer is you can’t. OK, how do I go about doing this better? How do I make sure my model never gives these things? And then the idea is there are 100 tools. Now you need to take this one, take this one, stitch them together. But each tool is also evolving. So this is work. So it’s a circus at that point. Because now you are pushing the boundaries of engineering, right? Because you have taken something that was a research item for the better part of a century and only became something like production grade in the past one or two years, and all of a sudden put it into production because everyone is doing it. But now you do not have that kind of system that controls this engineering output. So now you have to push the boundaries of engineering to actually create those systems and make them work in a similar way. Or what would you do?
You would say, OK, I’ll have all the best AI talent on the planet. That doesn’t really work well. It’s an HR problem after that. So how do you solve all these problems? You need to take the existing workforce with the existing software analogies. You need to solve these problems one by one. That’s where the tools and technologies and systems are really lacking. You’re pushing the boundaries of what is known. And that’s where these companies get stuck. You can get to that demo really nicely, and your board is happy. Your investors say, wow, very nice. But when it comes to taking it to customers, it has been almost a quarter away for the past two years.
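The two build paths Harsh describes, a direct model call versus an agentic pipeline with a retrieval step, can be sketched roughly as follows. This is a minimal illustration and not DAO Studio’s or YOJN’s code. All names (`stub_llm`, `direct_copilot`, `retrieve`, `agentic_copilot`) are hypothetical, the model is a local stub rather than a real API, and the retrieval is a toy word-overlap ranking rather than a vector search.

```python
from typing import Callable, List, Tuple

def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a hosted or self-hosted model endpoint."""
    return f"ANSWER based on: {prompt[:60]}"

def direct_copilot(question: str, llm: Callable[[str], str]) -> str:
    """Path 1: send the user's question straight to the model."""
    return llm(question)

def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Toy retrieval: rank documents by the number of words shared with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def agentic_copilot(
    question: str, documents: List[str], llm: Callable[[str], str]
) -> Tuple[str, List[Tuple[str, object]]]:
    """Path 2: retrieval, then a model call, recording a trace of every step."""
    trace: List[Tuple[str, object]] = []
    context = retrieve(question, documents)
    trace.append(("retrieve", context))
    prompt = "Context:\n" + "\n".join(context) + "\nQuestion: " + question
    trace.append(("prompt", prompt))
    answer = llm(prompt)
    trace.append(("llm", answer))
    return answer, trace
```

The second path costs more engineering, but the trace is what gives a team something to inspect when the output is wrong, which is the control Harsh is pointing at.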

Deploying AI at scale — and what comes next

Rohit Agarwal: So you have a product, YOJN.ai, which enables enterprises to deploy, improve, and manage their AI at scale. Tell us more about it and if that is really tackling any of the problems with these companies who are building their own AI solutions.

Harsh Joshi: So at DAO Studio, when we created YOJN, the thought was very straightforward. No company can possibly have all the AI talent on this planet. You’re going to need your business team. You’re going to need your subject matter experts. You’re going to need your product teams to solve that problem. There’s just no way around it. Of course, your software teams would help. Of course, your AI team would help. But there’s not enough AI talent on this planet to work just for one company. So the first thing was we need to make things simple, simple enough for subject matter experts to understand what’s happening inside the model, not just for AI experts. The second thing was to give a shared vocabulary to AI experts and subject matter experts to improve this system, because it’s going to be a collaborative effort. So along those lines, we created a system that works really well with these agentic systems, where you can create these agents, you can do RAG and you can deploy a model. But where we actually differed was we said, even after all of these things, these agents are really dumb, because they just come alive at one particular moment and then they’re calling an LLM and then the LLM is doing all the job, and there’s no idea of re-drive functionality. You don’t know where the model has taken a step, and it cannot backtrack its steps. So our core work was just making that one functionality really better. Can my agents become self-learning? Can my agents become as smart as possible? Can they become more state aware? All of this fancy work is being done to solve only one problem. Can I get surety of my output? So there are a lot of things we do around it. We also work on the fine-tuning part of it. Once we can understand how the model has behaved, can we show it in a very simple way to a user? And a subject matter expert can give feedback on only that one particular step where you thought the model went wrong.
So we can increase the context going in and reduce the garbage going inside the model. The first principle, that’s where our focus is. Just to do that, we have to do a ton of things. But that’s all it does. And then came the whole fine-tuning part: OK, if it can do this, can I create different snapshots of it? Can I create different variations of it? And can I compare them in one or two simple clicks? I don’t need to understand all these fancy, complicated maths. Just teach it to me like a five-year-old. So that was the focus with which we created that whole fine-tuning part of the system. And then came the whole idea that if you want to increase the adoption of these systems for these companies to make AI better, then we need to talk in the same language the software teams talk. So we started working around the deployment piece. OK, we’ll do Kubernetes. We’ll do this AWS deployment. We’ll give you self-hosted. All of those things. So essentially what we’re trying to do is solve these three fundamental problems: controllability, explainability, and decomposability. And in doing so, we’re trying to take the fundamental principles of software engineering and apply them to AI so that teams can work together the way they used to build SaaS. Kind of like how in Mixpanel, your product team can simply see what are my DAU, MAU, and all those metrics, and then you can take better business decisions; or how, through Segment, you don’t constantly need to engage a software engineering team, and you can get your data represented one way. So your operations and growth experts can analyze that data. And then your engineering team, with two or three clicks, can work on integrating these systems and then focus really on the reliability part, the software engineering part and the scalability part of these things. And that’s what we wanted these AI systems to do.
Because once we can do that, then we can do some sort of determinism, some sort of controllability, some sort of explainability, and some sort of decomposability. And that kind of moves the needle beyond that 80%. So a lot of things have to come together, and that’s what we focused on building in YOJN.
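The idea of knowing which step a system took, and re-driving from one step after an expert corrects it, can be sketched like this. It is an illustrative toy under stated assumptions, not how YOJN is implemented: `TracedRun`, `run` and `redrive` are hypothetical names, and the three steps are ordinary functions standing in for agent or model calls.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]

@dataclass
class TracedRun:
    steps: List[Step]
    # snapshots[i] is the input to step i; the last entry is the final output.
    snapshots: List[Any] = field(default_factory=list)

    def run(self, initial: Any, start: int = 0) -> Any:
        """Run steps from `start` onward, recording the state after each one."""
        self.snapshots = self.snapshots[:start] + [initial]
        state = initial
        for _name, fn in self.steps[start:]:
            state = fn(state)
            self.snapshots.append(state)
        return state

    def redrive(self, step_index: int, corrected_input: Any) -> Any:
        """Re-run from one step with a corrected input, keeping earlier steps as they were."""
        return self.run(corrected_input, start=step_index)

# Hypothetical three-step pipeline: extract an amount, add 18% tax, format the result.
pipeline = TracedRun(steps=[
    ("extract_amount", lambda text: int(text.split()[-1])),
    ("add_tax", lambda amount: amount + amount * 18 // 100),
    ("format", lambda total: f"Total due: {total}"),
])
```

A reviewer who sees that `extract_amount` produced the wrong number can call `pipeline.redrive(1, 90)` to correct only that step’s output and replay the rest, instead of rerunning the whole chain and hoping for a different result.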

Rohit Agarwal: Got it. So it’s essentially an AI engineer that you can deploy at scale.

Harsh Joshi: From a marketing point of view, we can say that. From an engineering point of view, I don’t want to tick off the engineering community, so I’m not going to say that.

Rohit Agarwal: Fair. So look, the audience is, again, mostly finance folks here. What you said all makes sense, but is there any direct benefit of YOJN to the CFOs or finance folks here? Is there some cost element? Like, if I’m a CFO for a company where, let’s say, you are coming into my engineering team, my AI team, to deploy YOJN, why should I be interested in deploying YOJN or having my teams work with YOJN rather than all the other crazy things that they may be doing?

Harsh Joshi: So, see, there are two ways you could use YOJN. You could either use YOJN to make your work better, or you could use YOJN to make sure that, whatever your team is doing, you get transparency in terms of the pricing and the growth of that system. So what I mean by that is, as a CFO, it is really important for you to be able to model the cost and the return from that cost. The ROI really matters, right? You can’t be burning billions of dollars and not generating revenue after a certain point. So you really need to know: hey, these are the steps that were taken for my model, and it gave me growth in this one particular area with this amount of compute, right? So you need that visibility. Second, you want a certain kind of controllability on that visibility as well, right? So for example, with most of these tools, there is a lot of voodoo token economics going on. After these many tokens, you get charged this much. On these many API calls, you get charged this much. And, see, that is all an attempt to make the pricing less transparent, more opaque. And we do not believe in that. Our idea is very straightforward. The product should justify its cost in itself. So once we partner with you, we simply tell you, hey, this is the fixed cost. That’s all you need to pay. Now whether you make a million calls or a trillion calls, that’s up to you. We’re not going to keep charging you for each and every nonsense thing. It’s your job to iteratively grow. So you’re going to do multiple experiments. You’re going to scale it to multiple people. I’m not going to keep on charging you for all of those unnecessary things. I’ve built my business in such a way that I can still survive without that. Now the second aspect to it is we offer this thing called self-hosting, which means you do not have to worry about building all of those engineering processes and cost heads around hosting your own model.
Which means once you do host it in your infrastructure, what you are essentially paying for is the cost of that EC2 machine. Let us say you hosted it on a GPU instance of EC2 on AWS. So now you know it is going to cost you 1.75 dollars per hour. Now it is up to my engineering team how they want to scale it up. Say they scale it to five systems. So you know, okay, five times, it will cost me that much, all of those things, but it is still... So as a CFO, it would help you get a better modeling of the cost, and then you can decide for yourself: hey, with this cost, this is how much my model has grown in this area. We do offer direct A/B testing, so it makes it easy for you to make that build vs. buy vs. drop vs. push kind of call.
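The self-hosting arithmetic here is simple enough to write down. The sketch below uses the 1.75 dollars per hour and the five instances from the conversation; the 730 hours per month (roughly 24 × 365 / 12, assuming always-on instances) and the quality scores are illustrative assumptions, and the function names are hypothetical.

```python
def monthly_hosting_cost(hourly_rate: float, instances: int, hours_per_month: float = 730.0) -> float:
    """Fixed infrastructure cost: rate per instance-hour x instances x hours."""
    return hourly_rate * instances * hours_per_month

def cost_per_quality_point(monthly_cost: float, baseline_score: float, new_score: float) -> float:
    """Dollars spent per point of improvement on whatever quality metric the team tracks."""
    gain = new_score - baseline_score
    if gain <= 0:
        raise ValueError("no improvement to attribute the cost to")
    return monthly_cost / gain

# Figures from the conversation: 1.75 dollars/hour, scaled to five systems.
one_instance = monthly_hosting_cost(1.75, instances=1)    # 1277.5 dollars/month
five_instances = monthly_hosting_cost(1.75, instances=5)  # 6387.5 dollars/month

# Hypothetical quality scores (out of 100) before and after a month of iteration.
dollars_per_point = cost_per_quality_point(five_instances, baseline_score=74, new_score=82)
```

Because the cost scales linearly with instances and hours rather than with tokens or API calls, a finance team can budget it the same way it budgets any other fixed infrastructure line.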

Rohit Agarwal: This is super interesting. So you can actually have a budget around your AI application development and AI experimentation, rather than sanctioning a lump sum amount of X million dollars for which you may have no idea at all what kind of returns could be expected. It might give amazing returns, or it might suck all of that up and still be pretty crappy in terms of the responses that your AI application is actually giving, or any other usage that your AI application is used for.

Harsh Joshi: Yeah, you see, the whole idea is you’re not going to create a good AI system in one day. Let’s first accept that fact. It’s going to be an iterative approach. Your entire business would need to be involved, because it’s not just a research problem. It’s a business problem in the end. That means you’re going to iterate on it. So you have to solve for how you survive and how you grow responsibly.

Rohit Agarwal: Super interesting.

Harsh Joshi: Budgeting will be a core part of your AI strategy.

Rohit Agarwal: So this is a new challenge in itself. I’m not just talking about cost. I’m talking about creating AI applications in the first place. Do you see organizational changes happening to really make this whole thing efficient? Because previously, there was a product management team involved with a development team, with some inputs coming from the front line, people who would touch customers on a regular basis. And that would more or less be enough to create a software product, iterate on it, so on and so forth. And there was always a lack, right? Or there was always a complaint that, hey, there’s a lack of customer interaction for these PMs and for these engineers to really understand what’s going on, right? And there is some context lost in communication. In some ways, with AI applications, it is much more the case. AI applications are all about the context. If the context is gone, again, it’s just complete garbage in, garbage out. And so, as you said, the subject matter experts or the business folks need to be much more integrally involved. Plus, you have this new AI engineer, maybe AI product manager, as two other constituents, right? And plus maybe the finance team, who really care about how much money you might end up blowing on this experimental product that you are creating. Do you foresee an organizational change? I’m sure that it’s an emerging factor as well. Not many people would have figured out what’s the best way to make it work. But is there an organizational change in the works, where there is a better way to manage this craziness across the multiple disciplines that it needs to touch to make it happen?

Harsh Joshi: Yeah, so I think we need to demystify the whole idea of AI. Right now, everyone’s talking about AI as if it’s this one magical beast from a fantasy world. But you have to realize it’s a branch of mathematics and software engineering in the end. So your organization need not become a completely different organization. You just really need to be very clear on: What do you want to achieve from this AI? What are you willing to put in to achieve it? And what resources do you have available to get to that point? As long as you have that clarity, your software engineers and your product managers and your business folks and your finance folks can work together in a very similar way. I’ll give you a simple example. So in a typical SaaS world, what do you do? You A/B test your UI. Why? Because you want to improve your customer experience and push the adoption in a particular direction that favors the business. That’s what you do. Okay. What is user experience in an AI product? It’s the perception of intelligence for the domain the customer is asking intelligence for, right? From that point. So essentially here also your product managers will do A/B testing. The difference is they will A/B test the quality of the responses coming out of it. Does that mean you need an AI product manager? I would argue no, let’s not blow things out of proportion. You still have to do A/B testing in the end. You had a DevOps team that used to scale your systems to millions of users. Here also you are going to scale it to millions of users. The endpoint will still be scaled. So do you need an AI DevOps team? Arguably no, you need a DevOps team, but with a bit more clarity on what you want this particular piece of software to scale to. Because AI arguably, in the end, is still a piece of software. Completely new kinds of architecture, amazing work, awesome work by the community, but in terms of the whole business aspect it is the same thing.
Which essentially brings you back to the business team. In a SaaS world, if your business team constantly says, hey, today I want to go this direction, tomorrow I want to go this direction, what will your product teams be able to do if they don’t have that level of autonomy? It doesn’t work that way. Similarly, your business team still needs to be involved in terms of what they want from their customers and their businesses so that their product teams can build that kind of AI. So I would say the alignment is still there. What’s happening right now, the biggest challenge, is that everyone sees AI as this weird fantasy beast. So, okay, get some AI product managers, AI researchers, AI scientists in my company and give them this problem, they’ll solve this. How do you want to do this? I probably cannot tell you. Okay, then how do you expect me to do this? That’s the barrier that needs to be broken down. And we need to humanize it back again to the standard software engineering practices. So, over there also you had finance budgets. You’ll have finance budgets in AI. You had deployment targets; you have them here. Today with AI, most people don’t use OKRs the way they should be using them. In SaaS, you have OKRs. Hey, I want to grow my adoption to this much for this segment. What’s your OKR for your AI team? No one has a clue. These are the fundamental problems we need to be focusing on instead of saying, OK, get me more AI scientists in my team and create this new model. It doesn’t work.
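A/B testing response quality can use the same statistics a product team already applies to UI experiments. The sketch below is one common approach, a two-proportion z-test on "helpful" ratings for two model variants; it is not something described in the episode, the function name is hypothetical, and the counts in the usage note are made up for illustration.

```python
import math
from typing import Tuple

def two_proportion_z_test(
    helpful_a: int, total_a: int, helpful_b: int, total_b: int
) -> Tuple[float, float]:
    """Compare the share of responses rated helpful for variant A vs variant B.

    Returns (z, two_sided_p_value) under the usual normal approximation.
    """
    rate_a = helpful_a / total_a
    rate_b = helpful_b / total_b
    pooled = (helpful_a + helpful_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("ratings are all identical; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, if 400 of 1,000 responses from variant A and 460 of 1,000 from variant B were rated helpful, `two_proportion_z_test(400, 1000, 460, 1000)` gives a z-score a little above 2.7, which most teams would treat as a real difference rather than noise. The OKR Harsh asks about could then be phrased as a target helpful-rate for a given customer segment.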

Rohit Agarwal: Awesome. Any predictions about AI? Half of 2024 is gone, so maybe for the rest of 2024 or for 2025?

Harsh Joshi: The AI prediction space is so weird because anything you say will be held against you in a court of law. It’s kind of the Miranda rights thing happening with it. And it’s very nonlinear. You cannot predict anything. But on a meta level, I see one trend. Today, we are in the 8086 era of AI, no doubt about it. So if you look at how the entire computer architecture and the software world evolved: back in the day, Intel came up with this 8086 series of microprocessors, right? Very small controllers, KBs of RAM, and that’s the dimension we were talking about. So at that time, there were different kinds of software challenges. Even with games like Mario that you used to see, and all these console games, right? At that time, the programmers wrote some really memory-efficient code, to even hide some pieces of information and game levels inside of those pixels that you see over there. It was really efficient design. Now, as these systems became bigger and more efficient over time, today, I would say the Apple logo takes more memory than an entire OS used to take back then. So we have to be cognizant of the way systems evolve. And a lot of those problems that we were solving at that time don’t even matter today. Similarly, a lot of the problems that we as a community are focused on solving, we need to solve this, this, this, this, probably will not matter when you look back five, 10 years down the line, because the capacity of these systems will be so huge. I mean, think about a 1,000x bigger system with, let’s say, 10,000 times bigger context windows. Right? Because see, if you have petabytes of data, then probably RAG would still matter for you. But if your entire data could fit into that one context window, and you have a model which is 100 times bigger in size and needs 100 times less compute, then probably most of the problems that you’re trying to solve with this architecture are already solved by that system.
So it will evolve in very weird ways. And energy and compute footprints are going to be key aspects of it, because we are in the early stages. And as these systems evolve, there will be a continuous focus on reliability because, you know, any technology has that hype curve and then the valley of realization. So I see AI will start entering that space where people will realize, okay, we need to build better QA systems around these things so we can actually really take advantage of it. We need to build better business processes that can take advantage of it. And we want AI to observe more of our business so that AI can learn from it, but we want more explainability from this AI so that we can performance review what it has learned and then it can be put into production. I see that evolving, and somewhere down the line, I think we will come to a point where the fundamental intelligence inside of these models is being localized and commoditized. That is when we will see real AI-native architectures being built.
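The trade-off between RAG and a larger context window comes down to a size check. The sketch below is a rough illustration under loud assumptions: the four-characters-per-token ratio is only a common rule of thumb for English text, the window sizes are hypothetical, and `needs_retrieval` is an invented name.

```python
def estimate_tokens(total_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4 chars/token ratio is an assumption, not a constant."""
    return int(total_chars / chars_per_token)

def needs_retrieval(total_chars: int, context_window_tokens: int, reserved_for_answer: int = 4000) -> bool:
    """True if the corpus cannot fit in the window alongside room for the answer."""
    usable = context_window_tokens - reserved_for_answer
    return estimate_tokens(total_chars) > usable

corpus_chars = 1_000_000  # hypothetical company knowledge base, about 250,000 tokens

today = needs_retrieval(corpus_chars, context_window_tokens=128_000)
ten_x_window = needs_retrieval(corpus_chars, context_window_tokens=1_280_000)
```

With the smaller window the corpus does not fit and some retrieval step is needed; with a window ten times larger it fits whole, which is the sense in which a problem teams solve today may simply stop mattering. At petabyte scale the check still fails, as Harsh notes.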