Demo to production — Harsh Joshi on the chasm enterprise AI keeps falling into
What AI actually is, why controllability/explainability/decomposability are the real deployment gates, and how a CFO budgets a black box.
Harsh Joshi built a 7,500-parameter model in 2018 to triage false-positive emergency calls in India, where the first-responder-to-civilian ratio is one to 150,000. The startup tanked. The next one — DAO Studio, where he's building YOJN, a deployment layer for enterprise AI — is the answer to the question he's spent a decade asking: why do good AI demos almost never become production systems? This is an AI primer recorded in June 2024, pitched at a finance audience, by a builder who treats AI as a branch of mathematics and software engineering rather than a fantasy beast.
The spine of the conversation is the gap between a board-pleasing demo and an AI system the business can actually run on. Harsh’s frame is structural: AI systems lack three properties that traditional software took for granted (controllability, explainability, decomposability), and until those gaps are at least partially closed, deployment is the bottleneck rather than capability. MLOps is the engineering response to that gap. The economics layered on top, with data, capital, talent and distribution as the four monopolised inputs, explain why the means to solve the deployment problem sit mostly with big tech.
AI is function approximation, not rules
The cleanest reframe of the conversation comes early.
The AI systems that are all the rage today, which scale well and generalize well, are neural networks, which are essentially function approximators.
Pre-1990s AI was rules-based — interpretable but unscalable, because there are infinite rules and a human has to encode them all. Modern AI inverts that: you give the model arbitrary data, it forms a decision boundary inside that data, and the act of finding the boundary is what we call learning. Parameters are the dials — a 1B-parameter model and a 100B-parameter model are doing the same thing, the larger one with more knobs to capture more nuance. Large language models are the same family of math that ran on Yann LeCun’s 1990 CNNs at Bell Labs; what changed is compute, data scale, and the willingness to pre-train on the entire reachable internet. The 70-year overnight success label is literal — the foundation papers are from the 1950s; the infrastructure caught up four decades late.
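To make "finding the boundary" concrete, here is a minimal sketch in Python of the smallest possible learner: a three-parameter perceptron. It is an illustration only, not anything Harsh built or described. The toy data, the hidden rule (x + y > 1) and every function name are assumptions made for this example. The rule is never given to the model; the model only sees labelled points and nudges its three dials until it separates them.

```python
import random


def make_data(n, seed=0):
    """Toy dataset (an assumption for illustration): points in the unit square,
    labelled 1 when x + y > 1, else 0. Points too close to the line are skipped
    so the two classes are cleanly separable."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        x, y = rng.random(), rng.random()
        s = x + y - 1.0
        if abs(s) < 0.1:
            continue
        points.append(((x, y), 1 if s > 0 else 0))
    return points


def predict(params, point):
    """The whole 'model': three parameters defining a straight-line boundary."""
    w1, w2, b = params
    return 1 if w1 * point[0] + w2 * point[1] + b > 0 else 0


def train(data, max_epochs=1000):
    """Perceptron learning: adjust the dials only when a prediction is wrong."""
    params = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), label in data:
            error = label - predict(params, (x, y))
            if error != 0:
                params[0] += error * x
                params[1] += error * y
                params[2] += error
                mistakes += 1
        if mistakes == 0:  # every training point is on the right side
            break
    return params


data = make_data(200)
params = train(data)
accuracy = sum(predict(params, p) == label for p, label in data) / len(data)
print("learned parameters:", params)
print("training accuracy:", accuracy)
```

The same loop with billions of dials instead of three, and a curved boundary instead of a straight one, is the sense in which a larger model has "more knobs to capture more nuance".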
The three hard problems
The diagnostic that explains why enterprise AI keeps stalling is mechanical.
The first step of governance is being able to explain something; without explainability, you cannot govern it.
Traditional software gives you decomposability (microservices, functions, atomic blocks), determinism (2+2=4, every time), and the controllability and explainability that fall out of those two. Neural networks give you none of these. You can’t talk to one neuron and you can’t guarantee the output for a given input, so you can’t explain or govern the system. The QA loop becomes chicken-and-egg: you need stable output to test against, but the output is non-deterministic by construction. This is why hedge funds run decision trees in production despite access to far more capable systems: a tree over 70 dimensions is fully interpretable and predictable. The mission-critical end of the enterprise has not adopted neural-network AI, because the governance machinery doesn’t yet exist. Until the three problems are at least partially solved, the long tail of enterprise adoption is structurally blocked.
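The QA problem can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of any real model: the three candidate outputs and their probabilities are invented for illustration, and sampling from a fixed distribution stands in for a generative model's output step. The deterministic function can be tested with an exact assertion; the sampled one can only be checked against a range.

```python
import random


def add(a, b):
    """Traditional software: the same input always gives the same output."""
    return a + b


# Hypothetical output distribution for one fixed input (made-up numbers).
NEXT_OUTPUT_PROBS = {"approve": 0.6, "escalate": 0.3, "reject": 0.1}


def sample_output(rng):
    """Stand-in for a generative model: draws one output at random,
    weighted by the probabilities above."""
    outputs = list(NEXT_OUTPUT_PROBS)
    weights = list(NEXT_OUTPUT_PROBS.values())
    return rng.choices(outputs, weights=weights, k=1)[0]


# Exact-match QA works for deterministic code.
assert add(2, 2) == 4

# The same "input" run 1,000 times yields different answers, so an
# exact-match test has nothing stable to compare against.
rng = random.Random()  # deliberately unseeded
runs = [sample_output(rng) for _ in range(1000)]
distinct = set(runs)
approve_share = runs.count("approve") / len(runs)

print("distinct outputs seen:", sorted(distinct))
print("share of 'approve':", approve_share)
```

The only test that holds here is a statistical one (for example, that the share of "approve" stays within a tolerance band), which is a weaker guarantee than the exact equality governance processes are built around.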
Demo to production — the MLOps chasm
The most quotable line in the episode is also the most operationally useful one.
The difference between you calling your engineer to show a demo to your board members and your AI system being consumed daily by your customers is MLOps.
Two decades of DevOps maturity sit underneath every modern SaaS company; AI deployment has had a few years. Distribution, scale, security, observability, cost control, and behavioural guardrails all have to be re-solved for systems with no determinism and no decomposability — and the tooling is mid-build while teams are trying to use it. The 0-to-80 part is fast (the demo); the 80-to-100 part is where the chasm sits. YOJN, Harsh’s product, is the bet that the demo-to-production gap is the real market — giving subject-matter experts and engineers a shared vocabulary to inspect agent behaviour, fine-tune in-product, and deploy on infrastructure they already trust (Kubernetes, AWS). The framing is the right one for CFOs: AI as a deployment problem, not a research one.
What this looks like as a budget line
The CFO-facing translation of the conversation is its quiet centrepiece.
AI is a problem of data, capital and talent. If you have three, then all you need is distribution to iterate it faster. Now, these four items are a monopoly of big tech.
For a finance leader, AI cannot live as a blank-cheque experiment line. Instrument it the same way you instrument any other software function — visibility into cost per step, ROI per use case, controlled experimentation budget rather than a sanctioned lump sum. The token-economics opacity most vendors lean on (per-token, per-API-call, surprise charges at quarter-end) is the failure mode; the response is fixed-cost contracts on self-hosted infrastructure where the CFO knows the EC2 GPU instance price ($1.75/hour) and the engineering team owns the scaling decisions. The org-design corollary: you don’t need an AI product manager and an AI DevOps team — you need the existing teams to know what they’re A/B-testing and what their OKR is. The biggest enterprise AI failure mode Harsh names isn’t technical; it’s leadership treating AI as a magical beast a hired-in AI scientist will solve, instead of as a software problem with a clear owner and budget envelope.
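The fixed-cost arithmetic is simple enough to sketch in Python. Only the $1.75/hour instance price comes from the episode; the instance count, monthly step volume and value figure below are hypothetical placeholders, and the function names are made up for this example.

```python
GPU_HOURLY_USD = 1.75        # instance price quoted in the episode
HOURS_PER_MONTH = 24 * 30    # assumption: always-on instance, 30-day month


def monthly_infra_cost(instances, hourly=GPU_HOURLY_USD, hours=HOURS_PER_MONTH):
    """Fixed monthly cost of self-hosted GPU capacity."""
    return instances * hourly * hours


def cost_per_step(monthly_cost, steps_per_month):
    """Unit cost: what one model call (one 'step') costs at a given volume."""
    return monthly_cost / steps_per_month


def roi(monthly_value, monthly_cost):
    """Return on the use case: net value divided by cost."""
    return (monthly_value - monthly_cost) / monthly_cost


# Hypothetical use case: 2 instances, 500,000 steps and $10,000 of value a month.
infra = monthly_infra_cost(instances=2)
unit = cost_per_step(infra, steps_per_month=500_000)
use_case_roi = roi(monthly_value=10_000, monthly_cost=infra)

print(f"monthly infrastructure cost: ${infra:,.2f}")
print(f"cost per step: ${unit:.5f}")
print(f"ROI for the use case: {use_case_roi:.2f}x")
```

Because the infrastructure cost is fixed, the cost per step falls as volume rises, and the number the CFO has to approve is the instance count rather than an open-ended token bill.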
What to listen for
The full episode is the AI 101 most finance teams will benefit from listening to once: the ML/neural-net/LLM/GenAI taxonomy as a tree not a list, pre-training vs fine-tuning vs RAG (Harsh’s 80% rule — most enterprises need RAG, not fine-tuning), vector databases as the SQL of unstructured data, and Meta’s Llama open-sourcing read as economic strategy not altruism. His three-word descriptor is Creative. Goofy. Consistent. Listen at /podcast/ep-029-harsh-joshi; for the other AI-in-finance essays in the catalogue, see Joy Mbanugo and David Junius, or /topics/ai-in-finance.
Related questions
- What does Harsh mean when he says AI is 'function approximation, not rules'?
- Pre-1990s AI was rules-based — if-then-else systems written by humans. They were interpretable but didn't scale: there are infinite rules, and you can't encode them all. Modern AI is the opposite. A neural network is a function approximator — you throw arbitrary data at it, and it forms a decision boundary inside that data. The act of finding the boundary is what we call learning. The system doesn't know your rules; it has approximated the shape of your data well enough to make predictions on the next data point. That single shift — from encoded rules to learned approximation — is what made the last decade of AI progress possible. It's also what makes the systems opaque: you can describe how you trained them, but you can't deterministically describe what they will do.
- Why does Harsh call controllability, explainability, and decomposability the three hard problems of enterprise AI?
- Traditional software has all three by default. You can decompose a problem into modules and functions; you can test that each function returns a deterministic output; you can therefore explain and control its behaviour. Neural networks have none of the three. You can't talk to one neuron in the model; you can't guarantee what output a given input will produce; and you therefore can't explain why a particular output appeared or stop a similar one from appearing tomorrow. Compliance, QA, governance, and customer-facing safety all rest on those three properties. Until the AI field solves them — even partially — most enterprise use cases stall at the gate, which is why hedge funds and insurers still deploy old decision trees in production: smaller surface area, fully interpretable, predictable behaviour. The community will get there, but the road is long.
- What is MLOps, and why does Harsh say it's the difference between a demo and a deployment?
- MLOps is the engineering discipline of taking an AI system from a working demo to a system being consumed daily by customers — security, scale, observability, cost control, and behavioural guardrails. Traditional DevOps had two decades to mature on top of decomposable, deterministic software; MLOps is doing the same work for systems that have neither property, and is itself only a few years old. Harsh's line is that the difference between calling your engineer to demo to the board and your AI being used in production is MLOps. Most enterprise AI projects stall at exactly this gap. The board is impressed, the customer never sees it, and the project quietly ages out of the budget cycle. The fix is treating AI deployment as an engineering discipline, not a research output.
- What is the AI economics framing — 'data, capital, talent, distribution'?
- Harsh's compressed lens on the AI industry's structure. Building competitive AI requires four things at scale: training data, capital for compute, top-tier ML talent, and the distribution to iterate the product against real users fast. All four are concentrated in big tech, which is why the AI race has played out as a big-tech race despite the open-source noise. Open source is the lever that breaks the monopoly — but Harsh is sceptical that big-tech open-source releases (Meta's Llama, etc.) are charity. They're economic strategy: open-sourcing kills competitors' moats while preserving your own data, capital, and distribution advantages. The framing is the cleanest way to read big-tech AI moves in the 18 months since this conversation.
Updates
- Editorial pass under the v2 podcast-summary guideline.