May 30, 2026

How to Build a SalesGPT (And Why You Probably Shouldn't)

Every cycle has a new version of the same sentence.

Last decade it was: "Why do we need Snowflake? We can build a data warehouse."

This decade it’s: "Why do we need Von? We can vibe code a SalesGPT in a weekend."

And I get where the ask is coming from. You’re seeing demos online where someone wires up MCP connectors to Claude or ChatGPT, points it at Salesforce and Gong, and — boom — "AI CRM." Then you see a budget line item for Von, and you think:

"Isn’t this just an MCP that connects to systems? Why are we paying for this?"

So I’m going to do something most vendors won’t. I’m not going to tell you you can’t build it in-house. I’m going to give you the blueprint for how to build it — step by step — so you can understand what you’re actually buying when you buy Von. Because the real difference between “vibe coding a demo” and “shipping a SalesGPT that runs your revenue org” is not one clever prompt. It’s an entire stack.

We spent $3.5M on R&D in 2025 and will spend $8M+ in 2026 building that stack. It took 12 months and a team of San Francisco-based top 1% AI/ML engineers who live and breathe the revenue data problem. This is what it takes to answer the questions your team will actually ask.

The 7 Categories of Questions Your SalesGPT Must Answer

Before you write a single line of code for your in-house SalesGPT, you need to understand the landscape of questions a SalesGPT must answer. There are seven types, and they get progressively harder:

Category 1 — Analytics. Questions over structured data (CRM, data warehouse). “What’s my win rate last quarter?”

Category 2 — Enablement. Questions over unstructured data (call recordings, internal docs). “How do we handle data privacy objections?”

Category 3 — Individual Deals. Combining structured and unstructured data for a single deal. “What’s happening with the Acme opportunity?”

Category 4 — Multiple Deals. The same structured + unstructured combination, but across hundreds or thousands of deals at once. “Which deals closing this quarter don’t have a next step?”

Category 5 — Analysis. Pattern recognition and trend analysis across your entire book of business. “Why are we losing deals for Product X in EMEA?”

Category 6 — Action. Taking real action across systems based on AI-generated insights. “Update these 30 records in Salesforce and add them to Outreach sequence Y.”

Category 7 — Data Scientist. Building predictive models on your own data. “Help me build a churn prediction model and apply it to my current book.”

A real SalesGPT answers across this entire spectrum. If your system can only do Categories 1 and 2, you haven’t built a SalesGPT; you’ve built a helpful chatbot. You’ll get a few impressive demos but zero meaningful operational leverage.

Now let’s build it.

Category 1: Analytics over structured data (What Twitter Calls "The Weekend Build")

"What’s my win rate last quarter?"

This is where everyone starts, and it feels deceptively easy. You build a direct integration with your CRM or data warehouse, or you use MCP to connect your structured data to a powerful AI model like Claude or GPT. Right out of the gate, you’ll get decent results. The demo will feel like magic. You’ll think you’re 80% of the way there.

But the minute you hand this to 5 people, it falls apart. It doesn’t know that “pipeline” in your org means something different than the default Salesforce definition. It doesn’t know that your team renamed “Stage 4” to “Negotiation” last quarter. It doesn’t understand that your fiscal year starts in February.

So now you need to build a semantics layer — a structured repository of your business definitions, field mappings, and organizational context that the AI can reference before it writes a query. This is not optional. Without it, your AI confidently gives wrong answers and executives lose trust permanently. OpenAI recently published a blog post about building their own internal data agent — a system designed to answer exactly this category of question. According to the blog, it required six layers of context to get reliable answers. Six layers — just to answer data questions over their own tables.
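As a rough illustration, one small slice of a semantics layer might look like the sketch below. It assumes the examples above (a fiscal year that starts in February, "Stage 4" renamed to "Negotiation"). The names `SemanticLayer`, `fiscal_quarter`, `canonical_stage`, and `render_context`, and the sample definition of "pipeline", are hypothetical and not taken from any product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SemanticLayer:
    """Hypothetical store of business definitions the model reads before querying."""
    fiscal_year_start_month: int = 2  # assumption: fiscal year starts in February
    stage_aliases: dict = field(default_factory=lambda: {"Stage 4": "Negotiation"})
    metric_definitions: dict = field(default_factory=lambda: {
        # Made-up definition, for illustration only.
        "pipeline": "Open opportunities past qualification, excluding renewals",
    })

    def fiscal_quarter(self, d: date) -> tuple:
        """Return (fiscal_year, fiscal_quarter) for a calendar date."""
        months_into_fy = (d.month - self.fiscal_year_start_month) % 12
        quarter = months_into_fy // 3 + 1
        # Assumption: the fiscal year is labeled by the calendar year it starts in.
        fy = d.year if d.month >= self.fiscal_year_start_month else d.year - 1
        return fy, quarter

    def canonical_stage(self, name: str) -> str:
        """Map an old stage name to its current name, if it was renamed."""
        return self.stage_aliases.get(name, name)

    def render_context(self) -> str:
        """Build the text block to place in front of the model's prompt."""
        lines = [f"Fiscal year starts in month {self.fiscal_year_start_month}."]
        lines += [f'Stage "{old}" is now called "{new}".'
                  for old, new in self.stage_aliases.items()]
        lines += [f'"{name}" means: {meaning}.'
                  for name, meaning in self.metric_definitions.items()]
        return "\n".join(lines)


layer = SemanticLayer()
print(layer.fiscal_quarter(date(2026, 1, 15)))  # (2025, 4)
print(layer.render_context())
```

The point of the sketch is that "last quarter" resolves differently once the fiscal calendar is known: mid-January falls in Q4 of the prior fiscal year, not Q1.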

There’s another problem most people don’t anticipate: the query engine itself. If you’re pointing your AI directly at Salesforce, you’re writing SOQL, and SOQL has no general-purpose JOINs (only traversal of predefined object relationships), no window functions, no CTEs, and no CASE expressions, which also rules out conditional aggregates. The moment someone asks a question that requires cross-object analytics beyond those predefined relationships, SOQL simply cannot express the query. You hit a language-level ceiling, not a model-level one.

But we have six more categories to go.

Category 2: Enablement over unstructured data

"How do we handle objections related to data privacy?"

Next, you connect your unstructured data — call recordings, sales enablement content, engagement tools, internal docs — to the AI via MCP. And almost immediately, you hit a wall.

MCP connectors over unstructured data typically expose whatever search the underlying tool offers, and that is usually keyword search. But what your users need is semantic search. When someone asks about “data privacy objections,” they also want results where the prospect said “we’re concerned about GDPR” or “our security team won’t approve this.” Those are the same concept but entirely different words.

So you need to rip out the MCP approach and build something else: a pipeline that pulls data from all your systems, embeds it, stores the vectors in a vector database (something like TurboPuffer), and wraps the whole thing in a RAG (Retrieval-Augmented Generation) architecture. A competent engineer can build the first version in a few weeks.

But the first version won’t be good enough. Pure vector search sounds elegant, but it misses exact terms: product names, competitor names, acronyms.
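One common way to cover both exact terms and paraphrases is hybrid retrieval: run a keyword ranking and a vector ranking separately, then merge them. The sketch below merges two rankings with reciprocal rank fusion. The document IDs and both input rankings are made up, and this post does not say which fusion method Von uses, so treat it as one possible approach.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so a document ranked well by either retriever surfaces near the top.
    k=60 is the constant commonly used with this method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for the query "data privacy objections".
keyword_hits = ["call_17", "doc_gdpr_faq", "call_03"]  # exact-term matches
vector_hits = ["call_42", "call_17", "call_08"]        # semantic matches

merged = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(merged)
```

Here `call_17` appears in both lists, so it ranks first, while documents found by only one retriever still make the merged list.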

Still, at least you can answer Category 2 questions. Congratulations! At this point, CEOs see the demo and say: “This is good enough.” It isn’t, because your org doesn’t actually live in Categories 1 and 2. The real questions start when structured and unstructured data collide.

I should note: Von doesn’t use MCPs. There are well-documented limitations with building production-grade applications on MCP, and we hit those walls early enough that we built our own integration and retrieval layer from scratch.

Category 3: Individual deals (structured + unstructured)

"What’s happening with the Acme opportunity?"

This is where things get interesting, because you’re now asking the AI to bounce between structured and unstructured data within a single question.

Sometimes the query is: “Tell me every time Competitor X came up, and what happened in those deals.” The system needs to search your unstructured data (Category 2 retrieval) to find every mention of the competitor, then pivot to your CRM (Category 1 retrieval) to pull the deal outcomes.

Other times it’s simpler on the surface: “What’s going on with the XYZ deal?” But answering that well means pulling the CRM record, the last three call transcripts, the email thread with the champion, and the Slack conversation where your AE asked their manager for help with pricing.

Two problems emerge here.

First, getting the agent to do this consistently is hard. It needs to decide on every query: do I start with structured data and enrich with unstructured? Or the reverse? The routing logic sounds simple in theory. In practice, it breaks constantly.

Second — and this is more insidious — the bindings often don’t exist. There’s no clean link between the opportunity in Salesforce and the call transcript in Gong, or between the account and the Slack thread. The data lives in silos, and nobody’s ever connected them at the record level. So the agent simply can’t find what it needs.
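To make the binding problem concrete, here is a rough sketch of one heuristic: link a call to an opportunity when an external attendee's email domain matches the account's domain, and refuse to guess when the match is ambiguous. The record shapes and field names (`attendees`, `account_domain`) are assumptions for illustration, not the Salesforce or Gong schemas, and real bindings need more signals than this.

```python
def email_domain(address):
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def bind_calls_to_opportunities(calls, opportunities, internal_domain):
    """Return {call_id: opportunity_id or None} using attendee email domains."""
    by_domain = {}
    for opp in opportunities:
        by_domain.setdefault(opp["account_domain"].lower(), []).append(opp["id"])

    bindings = {}
    for call in calls:
        # Ignore our own reps; only external attendees identify the account.
        external = {email_domain(a) for a in call["attendees"]} - {internal_domain}
        candidates = sorted({oid for d in external for oid in by_domain.get(d, [])})
        # Bind only when the match is unambiguous; otherwise leave it for review.
        bindings[call["id"]] = candidates[0] if len(candidates) == 1 else None
    return bindings


# Hypothetical sample records.
calls = [
    {"id": "c1", "attendees": ["rep@ourco.com", "jane@acme.com"]},
    {"id": "c2", "attendees": ["rep@ourco.com", "bob@gmail.com"]},
]
opportunities = [{"id": "opp_acme", "account_domain": "acme.com"}]

bindings = bind_calls_to_opportunities(calls, opportunities, "ourco.com")
print(bindings)
```

The second call stays unbound because a personal email address says nothing about the account, which is exactly the kind of gap that leaves an agent unable to find the transcript.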

We solved both problems — it’s core to what makes Von work. A basic in-house build will get these questions wrong roughly half of the time. Let’s keep going.

Category 4: Cross-deal questions (the context wall)

"Give me all deals closing this quarter that don’t have a next step or are past Stage 3 without a decision maker identified."

Read that question carefully. To answer it, the system needs to:

First, query your CRM to find all deals closing this quarter, noting which are past Stage 3 (a Category 1 search).

Then, for each of those deals (potentially hundreds), do a Category 3 search to determine whether a next step exists and whether a decision maker has been identified, drawing from call transcripts, emails, and CRM fields.

This is where most homegrown systems die.

The reason is the context window. Even the largest AI models today have a context window of about one million tokens per query, and in practice, you don’t want to load more than 250K because you need room for the conversation history. A single call transcript runs 9,000+ tokens. Do the math: if you have 200 deals and each has three call transcripts, you’re looking at 5.4 million tokens just for the raw data. That is more than five times the model’s limit and more than 20 times the practical 250K budget.
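The arithmetic, written out. The figures are the ones from the paragraph above; the variable names are only for illustration.

```python
deals = 200
transcripts_per_deal = 3
tokens_per_transcript = 9_000      # lower bound; transcripts run 9,000+ tokens
context_window = 1_000_000
practical_budget = 250_000         # leaves room for conversation history

raw_tokens = deals * transcripts_per_deal * tokens_per_transcript
print(raw_tokens)                      # 5400000
print(raw_tokens / context_window)     # 5.4
print(raw_tokens / practical_budget)   # 21.6

# Whole deals whose transcripts fit inside the practical budget.
deals_that_fit = practical_budget // (transcripts_per_deal * tokens_per_transcript)
print(deals_that_fit)                  # 9
```

Under these numbers only about nine deals' worth of transcripts fit in a single query, out of 200.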

So what happens? The agent looks at the first 10 deals, gives you an answer on those, and quietly ignores the other 190. It doesn’t tell you it only checked 5% of what you asked about. It just gives you a confident, incomplete answer.

This is the context engineering problem that the AI community has been talking about. And it was one of the hardest things we had to solve.

Our approach: we pre-process aggressively. When you first connect your systems to Von, we deploy hundreds of thousands of specialized agents that build a comprehensive view of every deal, every account, every relationship. Then we re-run those agents every night to keep everything current. This means when you ask a Category 4 question, we’re not trying to read thousands of raw transcripts in real time — we already have the distilled intelligence ready to go.
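A simplified sketch of the general precompute pattern: distill each deal into a small structured summary ahead of time, answer the cross-deal question over the summaries, and report how many deals were actually checked. `summarize_deal` stands in for whatever model call would produce the summary; here it is a hypothetical keyword stub so the example runs. Stage is assumed to be a number. This illustrates the pattern only and is not a description of Von's agents.

```python
def summarize_deal(deal):
    """Stub for an offline per-deal agent; returns a small structured summary."""
    text = " ".join(deal["transcripts"]).lower()
    return {
        "id": deal["id"],
        "stage": deal["stage"],
        "has_next_step": "next step" in text,
        "decision_maker_identified": "decision maker" in text,
    }


def nightly_precompute(deals):
    """Run ahead of time so queries never read raw transcripts."""
    return {d["id"]: summarize_deal(d) for d in deals}


def at_risk(summaries, deal_ids):
    """Deals with no next step, or past Stage 3 with no decision maker."""
    checked = [summaries[i] for i in deal_ids if i in summaries]
    flagged = [
        s["id"] for s in checked
        if not s["has_next_step"]
        or (s["stage"] > 3 and not s["decision_maker_identified"])
    ]
    # Report coverage so an incomplete answer is never silent.
    return {"flagged": flagged, "checked": len(checked), "requested": len(deal_ids)}


# Hypothetical deals closing this quarter.
deals = [
    {"id": "d1", "stage": 4,
     "transcripts": ["Next step is a security review with the decision maker."]},
    {"id": "d2", "stage": 4, "transcripts": ["Next step: send pricing."]},
    {"id": "d3", "stage": 2, "transcripts": ["Intro call, nothing scheduled."]},
]

summaries = nightly_precompute(deals)
result = at_risk(summaries, ["d1", "d2", "d3"])
print(result)
```

Returning `checked` alongside `requested` is a deliberate choice: if the two numbers differ, the caller knows the answer covers only part of the book.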

Category 5: Analysis (the question everyone actually asks first)

"Why are we losing deals for Product X in EMEA?"

Here’s the most surprising thing we’ve learned: when people get access to a SalesGPT, they don’t start with simple questions. They don’t ask about win rates or individual deals. They jump straight to Category 5 — the hardest analytical question on their mind.

“Why are we winning? Why are we losing?”

To answer this, the system needs to do a Category 4 search across all your won and lost deals over the past year, then analyze that data to find patterns and trends. It’s not retrieval anymore — it’s genuine analysis.

What most AI systems do here is sample. They’ll pull 10 won deals and 10 lost deals, load everything into a single context window, and generate an answer. The first time you see it, it looks impressive. The second time, you realize it’s shallow — because it looked at less than 1% of your deals and treated that as the whole picture.

We solved this by giving the agent something most AI systems don’t have: its own compute environment. When Von tackles a Category 5 question, it spins up an isolated cloud sandbox — an ephemeral virtual machine with its own file system, a stateful code execution kernel, and the ability to write and run its own Python scripts.

That’s the key thing to understand about Category 5: these queries are genuinely compute-intensive. Von can run for hours on a single analysis, iterating across your entire book of business, writing and executing its own code, accumulating evidence in files — and it never loses progress. There is no other system in the world that does this today.

Category 6: Action

"Find every customer I lost to pricing last year who was deep in the conversation. Write each of them a personalized win-back email and add them to an Outreach sequence."

Every sales leader will tell you the same thing: insights are great, but I need to do something with them.

Category 6 means the AI doesn’t just find and analyze — it acts. And the critical nuance here is what it acts on. For a question like the one above, you don’t want the system looking at the “Closed Lost Reason” field in Salesforce — because half your reps pick from a dropdown without thinking. You want it identifying pricing as the real blocker based on what actually happened in calls and emails. That’s a Category 5 analysis that feeds directly into action.

Then it needs to write a personalized win-back email for each of those accounts — not a mail merge template, but outreach that references the specific conversation, the specific objection, and why now is different. Then it needs to push those contacts into the right Outreach sequence, update the CRM records, and log what it did.

There’s no secret to this — it’s straightforward engineering. But “straightforward” doesn’t mean “easy” or “fast.” It means building reliable, tested integrations with every system in your stack, with proper error handling, rollback capabilities, permissions checks, and audit trails. It means human-in-the-loop approvals where the agent pauses execution, shows the user exactly what it plans to change — field by field, with before and after values — and waits for sign-off before touching a single record. It means role-based access controls, multi-tenant data isolation enforced at the database level, and structured logging of every action the agent takes. It’s the kind of work that takes months of careful engineering, not a weekend of vibe coding.
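The approval step described above can be sketched as follows: compute a field-by-field diff with before and after values, hand it to an approver, and write nothing unless the approver signs off. Everything here is an illustrative assumption, including the names `FieldChange`, `plan_changes`, and `apply_with_approval`, the in-memory `crm` dictionary, and the record ID. A real integration would call the CRM's API and handle errors and rollback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldChange:
    record_id: str
    field: str
    before: object
    after: object


def plan_changes(current, proposed):
    """Diff proposed record values against current ones, field by field."""
    changes = []
    for record_id, new_fields in proposed.items():
        old_fields = current.get(record_id, {})
        for name, after in new_fields.items():
            before = old_fields.get(name)
            if before != after:
                changes.append(FieldChange(record_id, name, before, after))
    return changes


def apply_with_approval(current, proposed, approve, audit_log):
    """Apply changes only if approve(changes) returns True; log either way."""
    changes = plan_changes(current, proposed)
    approved = bool(changes) and bool(approve(changes))
    audit_log.append({"changes": changes, "approved": approved})
    if approved:
        for c in changes:
            current.setdefault(c.record_id, {})[c.field] = c.after
    return approved


# Hypothetical record store and proposed update.
crm = {"006A": {"StageName": "Closed Lost", "NextStep": None}}
proposed = {"006A": {"StageName": "Closed Lost", "NextStep": "Send win-back email"}}
log = []

# The approver sees the diff and declines, so the record is left untouched.
rejected = apply_with_approval(crm, proposed, approve=lambda changes: False, audit_log=log)
print(rejected, crm["006A"]["NextStep"])  # False None
```

Only the changed field appears in the plan, and the audit log records the rejected attempt as well as any approved one.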

Category 7: Data Scientist

"Help me build a churn prediction model based on my historical data and apply it to my current book."

This is the last frontier — the question most revenue leaders wish they could ask but can’t, because it currently requires hiring a data scientist, waiting three months, and hoping the model is still relevant by the time it’s done.

Category 7 means the SalesGPT writes its own code at runtime to build machine learning models on your data. Churn prediction. Opportunity risk scoring. Account health scoring. Propensity models for upsell. The kinds of analyses that today live in a Jupyter notebook on one data scientist’s laptop, if they exist at all.

Think about what this requires: the system needs to understand your data schema, pull the right historical features, clean and normalize the data, select an appropriate model, train it, validate it, and then apply it to your live book — all within a single conversation. And it needs to explain what it built and why, so you actually trust the output.
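To show the shape of those steps (features, normalization, training, validation, applying to the live book), here is a deliberately tiny sketch. It uses a plain logistic regression written in standard-library Python on synthetic data, where churn is defined by a made-up rule on logins. It is not the AutoML approach mentioned in this post, and the features, account names, and hyperparameters are all assumptions.

```python
import math
import random


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation on training data."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def scale(rows, means, stds):
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted churn probability for one scaled feature row."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on log loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic history (an assumption): (logins_last_30d, open_tickets) per account,
# where accounts with fewer than 10 logins are labeled as churned.
rng = random.Random(0)
history = [(rng.randint(0, 30), rng.randint(0, 5)) for _ in range(200)]
labels = [1 if logins < 10 else 0 for logins, _ in history]

# Hold out the last 50 rows for validation.
train_X, valid_X = history[:150], history[150:]
train_y, valid_y = labels[:150], labels[150:]

means, stds = fit_scaler(train_X)
w, b = train_logistic(scale(train_X, means, stds), train_y)

valid_probs = [predict(w, b, x) for x in scale(valid_X, means, stds)]
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(valid_probs, valid_y)) / len(valid_y)

# Apply the model to a hypothetical current book.
current_book = {"acme": (2, 4), "globex": (27, 0)}
risk = {name: predict(w, b, scale([feats], means, stds)[0])
        for name, feats in current_book.items()}
print(round(accuracy, 2), {k: round(v, 2) for k, v in risk.items()})
```

Note that the scaler is fit on the training rows only and then reused for validation and for the current book, which keeps the held-out accuracy honest.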

Von can do this because of the same sandboxed compute environment from Category 5 — but now the agent is doing something far more sophisticated with it. It writes and executes code that leverages an AutoML engine under the hood. I won’t walk through how we built this the way I did for the other categories — we’re likely going to patent it. But this is the capability that makes a SalesGPT feel like it has a data science team built in, available to every person in your revenue org, not just the one team that managed to get on the analytics backlog.

You’ve built the GPT. Now build everything on top of it.

If you’ve made it this far — through all seven categories — congratulations. You now have a SalesGPT. But you don’t have a product. Here’s what’s still missing:

A BI layer. Your users shouldn’t have to ask the same questions every week. They need dashboards — pipeline reviews, forecast snapshots, and health-of-business metrics. The problem? Your entire architecture runs on structured and unstructured data, and there isn’t a BI tool in the world that does reporting on unstructured data. So you’ll build that too.

An agentic layer. Simple things like sending an alert when a deal goes dark. Complex things like: if a customer gives a positive NPS score, email them asking for a testimonial; when they agree, automatically add it to your website’s wall of love. Multi-step, conditional workflows that run autonomously.

More integrations. Every new integration gives your AI more tools and more context — and dramatically increases complexity. The agent that worked beautifully with five tools starts hallucinating with fifteen.

Enterprise reality. User provisioning. Role-based permissions. Security controls. SOC 2 compliance. SSO. Audit logs. The unsexy infrastructure that nobody talks about but every enterprise buyer requires.

"But Models Are Getting Better. Won’t This All Go Away?"

The models are getting more intelligent. But all the plumbing we’ve built — the semantics layer, the vector pipeline, the record bindings, the Core, the pre-processing agents, the file system, the runtime code execution — still provides essential context and capability that foundation models can’t replicate on their own.

Our scaffolding doesn’t become obsolete when models improve. It becomes more powerful.

So Why Pay for Von?

If you’re still evaluating building a SalesGPT in-house, you basically have three options:

Option A — Build the Demo

You can absolutely connect MCP to Claude and impress your team. You’ll get Category 1. Maybe even Category 2 if you start putting data in vector databases. That’s a great internal experiment. It will not replace a production SalesGPT.

Option B — Build the System

You invest 6–12 months of engineering time. You’ll rewrite the architecture at least once, probably twice. You’ll discover the hard problems (bindings, context, precompute, governance) the slow way—by getting burned in production. And when you finish? You now own a core platform you must maintain forever.

Option C — Buy Von

You get Categories 1–7 now. Your competitors don’t get a 12-month head start while you build. And your team stays focused on what differentiates your business—product, pricing, positioning, deals—not infrastructure.

I follow a simple rule:

Build what creates differentiation. Buy what creates speed.

Von is speed. And leverage. And competitive advantage.

Because there’s real revenue on the line, and “almost accurate” is the same as “wrong” when the CRO is making decisions.

Meet the author

Sahil Aggarwal

CEO & Co-founder