What Matters
- Ask for production deployments, not demos. Any vendor can build an impressive demo. The question is how many AI systems they've shipped to real users who depend on them daily.
- Pricing structure tells you everything. Fixed-scope, fixed-price vendors have skin in the game. Time-and-materials vendors can blame scope for overruns indefinitely.
- Data handling is non-negotiable. Understand exactly where your data goes, who can access it, and what happens to it after the engagement ends.
- The team question is critical. Who specifically builds your project? Junior developers working under a senior who reviewed the proposal once is not the same team that gave you the demo.
- Post-launch support is where the relationship either holds or breaks. Get support terms in writing before you sign.
Three months into an AI project with the wrong vendor is an expensive education. The vendor who gave the best demo isn't always the vendor who ships working software. The proposal that looked most thorough isn't always written by the team who builds your product.
Before you sign with an AI development vendor -- any vendor, including us -- ask these 20 questions. The answers reveal more than any sales presentation.
Category 1: Production Track Record
1. How many AI systems have you shipped to production?
Not demos. Not proof-of-concepts. Not internal tools. Systems that real users depend on daily, where a failure has business consequences.
Any agency can build an impressive demo in 48 hours. The hard part is getting from demo to production -- handling edge cases, building the monitoring infrastructure, surviving the first month of real usage, and maintaining the system when the underlying models update. Gartner predicted in 2024 that at least 30% of generative AI projects would be abandoned after proof of concept by the end of 2025 -- often because the vendor couldn't close that gap.
Ask for a number, then ask to see references.
2. Can you share three client references with similar AI projects in production for 6+ months?
References matter most when the project type matches yours. Ask specifically for:
- Similar industry or use case
- Projects that have been running for at least 6 months (enough time for early problems to surface)
- Contacts you can call, not just written testimonials
When you speak to references, ask: "What went wrong and how did they handle it?" The vendor's response to problems tells you more than their response to success.
3. What AI projects have you built in my industry?
Domain expertise matters in AI development. Medical records processing has different compliance requirements than e-commerce recommendation engines. Financial fraud detection has different model requirements than customer service chatbots.
A vendor who has built AI in your industry has already solved the domain-specific problems you'll hit. A vendor building in your industry for the first time will solve those problems on your dime.
4. What is the most complex AI failure you've had in production and how did you fix it?
This question reveals two things: their honesty (every real AI project has failures) and their operational depth (how they monitor, diagnose, and fix production problems).
A vendor who says "we've never had a failure" is either inexperienced or not being straight with you. AI systems fail in interesting and non-obvious ways -- model drift, edge-case inputs, upstream data changes, API changes from model providers. Good vendors have war stories.
Category 2: Team and Process
5. Who specifically will work on my project?
There's a common pattern in agencies: senior people give the sales presentation, junior people do the work. Ask for the actual team: names, roles, and AI-specific experience. Then verify them.
Check LinkedIn profiles for the engineers listed. How long have they been doing AI/ML work? What do their public projects and writing look like? Have they built the type of system you need?
6. Will the people who gave this presentation be working on my project?
Follow-up to the above. If the founders or senior engineers gave the demo but will "oversee" the project while others execute, understand what that means in practice. Weekly reviews? Full code review? Or quarterly check-ins?
There's nothing wrong with senior oversight plus junior execution -- it's how most agencies work. But you need to know which model you're buying.
7. How do you scope AI projects and what happens when scope changes?
Scope management is where AI projects go wrong more than anywhere else. AI development has genuine uncertainty -- model performance might require architectural changes, data quality issues might require additional processing steps, integration complexity might be higher than estimated.
How does the vendor handle this?
- Fixed-scope, fixed-price: They commit to a defined set of deliverables at a defined cost. If scope genuinely expands, it's a conversation with a new estimate. They have skin in the game to scope accurately upfront.
- Time-and-materials: You pay for hours. Scope can expand indefinitely. Every estimation miss is your problem.
Fixed-scope isn't always possible for research-heavy AI work. But for known deliverables (build a chatbot with these capabilities, integrate these data sources, deploy to this environment), fixed-scope is better for you.
8. How do you test AI systems before delivery?
Testing AI is fundamentally different from testing traditional software. There's no "expected output" you can hard-code for an LLM response. Model behavior is probabilistic.
Ask what their AI testing process looks like:
- How do they build and maintain evaluation datasets?
- How do they measure model performance against the metrics that matter for your use case?
- How do they test edge cases and adversarial inputs?
- What does their staging environment look like?
A vendor without a clear answer to these questions has not thought seriously about quality.
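To make the eval-dataset question concrete, here is a minimal sketch of what an evaluation harness can look like. Everything in it is an assumption for illustration: `run_model` stands in for whatever system is under test, and the dataset, matching rule, and pass criterion are placeholders you would replace with metrics that matter for your use case.

```python
# Minimal sketch of an LLM evaluation harness (illustrative only).
# `run_model`, the dataset, and the matching rule are all assumptions.

def run_model(question: str) -> str:
    # Placeholder for the system under test (e.g. a RAG pipeline or chatbot).
    return "Paris" if "France" in question else "unknown"

# A tiny evaluation dataset: (input, set of acceptable answers).
EVAL_SET = [
    ("What is the capital of France?", {"paris"}),
    ("What is the capital of Atlantis?", {"unknown", "i don't know"}),
]

def evaluate(dataset) -> float:
    """Return the fraction of cases whose output matches an accepted answer."""
    passed = 0
    for question, accepted in dataset:
        output = run_model(question).strip().lower()
        if output in accepted:
            passed += 1
    return passed / len(dataset)

score = evaluate(EVAL_SET)
print(f"accuracy: {score:.0%}")
```

A vendor with a real testing process will have something like this, but versioned, far larger, and wired into CI so that a model or prompt change cannot ship if the score drops.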
9. What does the handoff process look like?
At the end of the engagement, what do you receive? Source code (obviously), but also:
- Infrastructure-as-code (so you can rebuild the deployment)
- Documentation that your team can actually use
- Runbooks for common operational tasks (restarting services, responding to model drift)
- A knowledge transfer session with your internal team
The difference between "we handed over the code" and "your team can independently operate this system" is significant. Ask which one they deliver.
Category 3: Data and Security
10. Where does my data go during the project?
In AI development, your data is sent to model providers for training, fine-tuning, or inference. Each hop is a security boundary.
Get specific answers to:
- Which model providers will be used? (OpenAI, Anthropic, Google, Meta open-source, other?)
- Does your data go through the vendor's own infrastructure, or directly to model providers?
- What are the data retention policies at each model provider?
- Is there a data processing agreement?
For regulated industries (healthcare, finance, legal), this isn't a nice-to-have -- it's a compliance requirement. If the vendor can't answer clearly, they haven't done regulated industry work.
11. Who can access my data at the vendor side?
Which of their employees or contractors can see your data? Under what circumstances? What access controls are in place?
This matters especially for sensitive business data (customer records, financial data, proprietary formulas, competitive strategy). You're entitled to know exactly who has access and why.
12. Who owns the code and models after delivery?
This sounds obvious but often isn't. Get clarity on:
- Source code: You own it. Full stop. No licensing back to the vendor.
- Fine-tuned models: If the vendor fine-tunes a base model on your data, who owns that model? If they walk away, can you keep running the model?
- Proprietary components: Some vendors include proprietary frameworks or libraries. Understand if you're licensing those ongoing or receiving a full source code transfer.
- Training data: If your data was used to fine-tune a model, can the vendor use that model or those learnings on other client projects?
Get the IP ownership terms in writing, not just verbal assurance.
Category 4: Pricing and Commercial Terms
13. What is the full pricing model and what is and isn't included?
The proposal price is rarely the final cost. Ask specifically what's excluded:
- Infrastructure costs during development (who pays for GPU compute, model API costs, staging environments?)
- Third-party tool and API costs
- Post-launch support and maintenance
- Model API costs when the system goes to production
- Future model updates when the underlying LLM versions change
"$80K for the build" sounds different when you add $15K in model API costs during development and $3K/month in ongoing inference costs.
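Using the illustrative figures above (assumptions, not real quotes), the first-year arithmetic looks like this:

```python
# Illustrative first-year cost using the example figures above.
build_cost = 80_000          # fixed-scope build price
dev_api_cost = 15_000        # model API usage during development
monthly_inference = 3_000    # ongoing inference cost after launch

first_year_total = build_cost + dev_api_cost + 12 * monthly_inference
print(first_year_total)  # 131000
```

An $80K proposal is really a $131K first-year commitment once the excluded line items are counted -- which is exactly why you ask what's excluded.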
14. What triggers a change order?
Every project has changes. How the vendor handles them defines your commercial relationship.
Good answer: "We scope tightly upfront to minimize changes. Genuine scope expansion (features you ask for that weren't in the original spec) gets a new estimate and your approval before any work starts."
Warning sign: Vague answer, or "we'll handle it as they come up." This usually means informal scope creep billed at high hourly rates.
15. What does post-launch support cost?
AI systems require ongoing maintenance: monitoring for model drift, updating integrations when APIs change, handling edge cases that surface in production, and performance optimization.
Get specific numbers:
- What's included in the project price?
- What does ongoing support cost per month?
- What's the SLA for production issues?
- Is there a retainer model or break-fix pricing?
The vendor who ghosts you three weeks after launch is more common than you'd hope. Support terms in writing, before you sign.
Category 5: Red Flags
"The failure question is the one that separates vendors who've shipped real systems from those who've only shipped demos. Every production AI system breaks eventually - what matters is whether the team has a plan for when it does." - Ashit Vora, Captain at 1Raft
16. Can you describe how your AI handles failure cases?
Every AI system fails sometimes. LLMs hallucinate. Classifiers make wrong predictions. Models encounter input types they haven't seen.
Ask: "When your AI is wrong, what does the user experience? How does the system degrade gracefully?"
Good AI systems fail safely: falling back to a human workflow, showing confidence scores so users know when to double-check, or presenting an "I don't know" rather than a confidently wrong answer.
Vendors who haven't thought about this haven't shipped AI to production users.
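One common graceful-degradation pattern is a confidence threshold with a human fallback. The sketch below is illustrative only: `classify` stands in for a real model call, and the threshold value is an assumption you would tune against your own evaluation data.

```python
# Illustrative confidence-threshold fallback (names and values are assumptions).

CONFIDENCE_THRESHOLD = 0.75  # tune this against your evaluation data

def classify(text: str) -> tuple[str, float]:
    # Placeholder for a real model call returning (label, confidence).
    return ("refund_request", 0.62)

def handle(text: str) -> str:
    label, confidence = classify(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return f"auto:{label}"
    # Fail safely: below the threshold, route to a human instead of guessing.
    return "escalate:human_review"

print(handle("I want my money back"))
```

The design choice worth probing in the vendor's answer is not the threshold number but whether a fallback path exists at all, and what the user sees when it triggers.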
17. Why do you recommend [specific model/approach] for my use case?
Ask them to justify their technical recommendation. A vendor suggesting a custom fine-tuned model when a well-prompted base model would perform comparably is either over-engineering your project or running up the bill.
If they can't explain the trade-offs between their recommended approach and simpler alternatives, they may not understand the trade-offs themselves -- or they're optimizing for scope, not your outcome.
18. What happens if the AI doesn't meet the performance targets?
Before starting, you should agree on measurable success criteria (accuracy rate, task completion rate, response time, cost per query). Then ask: what happens if the system doesn't hit those targets by launch?
Does the vendor commit to iterating until it does? Refund a portion? Define "good enough" as whatever ships?
The answer to this question reveals the vendor's confidence in their approach and their commitment to your outcome.
19. Have you built this exact type of system before?
If your project is a RAG-based document search system, ask if they've built RAG systems before. If it's a real-time fraud detection model, ask how many real-time ML pipelines they've deployed.
Vendors who pivot from their actual experience ("we've built similar things") or talk around the question ("our team is very adaptable") are telling you something.
Adjacent experience is fine. No experience is a problem. Know which one you're getting.
20. What questions should I be asking that I'm not?
This one is the most revealing of all.
The best vendors know things about your problem that you don't. A vendor who deeply understands your use case should be able to identify risks or questions you haven't thought to ask.
If they answer with "no, you've covered everything" or a deflection, you're talking to someone in sales mode, not problem-solving mode. The vendors who push back with "actually, the bigger question is..." are the ones who've been in the trenches with this work.
What to Do With the Answers
Score each vendor on a simple 1-3 scale across the five categories:
- Production track record
- Team and process
- Data and security
- Commercial terms
- Red flags
Vendors who score poorly in any category don't just have a weakness in that area -- they're showing you something about how they operate. A vendor who can't answer data security questions hasn't built for regulated industries. A vendor with a vague scope change process has had ugly client disputes and learned nothing from them.
The best AI development partner is honest about what they don't know, specific about what they do know, and more focused on your outcome than closing the deal.
Related reading: How to Choose an AI Development Partner -- the full framework for evaluating AI vendors. AI Development Company vs. Freelancer -- when to use which.
Frequently asked questions
What are the most important questions to ask an AI development vendor?
The five most important: (1) How many AI systems have you shipped to production (not just demos)? (2) Who specifically will work on my project and what is their AI experience? (3) What is your pricing model and what happens if scope changes? (4) How do you handle my data and who can access it? (5) What does post-launch support look like and what is included? These questions reveal whether you're working with a proven vendor or an agency riding the AI hype wave.