From Prototype to Production: How Perk Built a Voice AI Agent That Makes 10,000 Calls a Week

What happens when you combine a real customer problem, a no-code prototype, and a team willing to listen to every single call?

In this episode of Just Now Possible, Teresa Torres talks with Steven Payne (Product Manager), Gabriel Stock (Senior Engineering Manager), and Philipe Steiff (Senior Software Engineer) from Perk—a company that helps businesses eliminate "shadow work" like travel booking and expense management. They share how they built a voice AI agent that calls hotels to verify virtual credit card payments, preventing travelers from arriving to find their rooms unpaid.

What started as a hackathon experiment in Make.com became a production system handling over 10,000 calls per week across multiple languages. Along the way, the team learned hard lessons about prompt engineering for voice (numbers, pronunciation, and a very "Karen-like" first version), how to break a single monolithic prompt into structured conversation stages, and why listening to actual calls beats any amount of theorizing.

You'll hear how they:

Built a working prototype without writing a single line of backend code
Structured the call into discrete stages (IVR, booking confirmation, payment) to improve reliability
Created two eval systems: one for call success classification, another for conversational behavior
Scaled from five calls a day to more than 10,000 per week while maintaining quality

This is a detailed look at building AI for real-time human interaction—where the stakes are high and the feedback is immediate.

Show Notes

Guests

Steven Payne, Product Manager, Perk
Gabriel Stock, Senior Engineering Manager, Perk
Philipe Steiff, Senior Software Engineer, Perk

What we cover in this episode

How Perk's team identified an AI use case by connecting prior experimentation with a real operational problem
Why they chose Make.com for prototyping—and shipped to production without touching backend code
The evolution from a single prompt to structured conversation stages (IVR handling, booking confirmation, payment request)
How breaking up the agent's task dramatically improved reliability
Building two eval systems: classification for success rates and LLM-as-judge for conversational behavior
Why the team still listens to calls manually even with automated metrics
The challenge of prompt engineering for voice: numbers, booking references, and text-to-speech markup
Lessons learned from expanding to German (prompts in native language improve results)
How this project uncovered other operational problems they didn't know existed
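
The move from a single prompt to discrete conversation stages can be pictured as a small state machine. The sketch below is illustrative only and is not Perk's implementation: the stage names follow the three stages listed above, while `CallState`, `STAGE_PROMPTS`, `NEXT_STAGE`, and the prompt wording are hypothetical names and text invented for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    IVR = "ivr"
    BOOKING_CONFIRMATION = "booking_confirmation"
    PAYMENT_REQUEST = "payment_request"
    DONE = "done"


# Hypothetical per-stage instructions; the real prompts are not in the show notes.
STAGE_PROMPTS = {
    Stage.IVR: "Navigate the phone menu until you reach a person.",
    Stage.BOOKING_CONFIRMATION: "Confirm the hotel can find the booking.",
    Stage.PAYMENT_REQUEST: "Ask the hotel to charge the virtual card.",
}

# Each stage hands off to exactly one successor in this simplified flow.
NEXT_STAGE = {
    Stage.IVR: Stage.BOOKING_CONFIRMATION,
    Stage.BOOKING_CONFIRMATION: Stage.PAYMENT_REQUEST,
    Stage.PAYMENT_REQUEST: Stage.DONE,
}


@dataclass
class CallState:
    stage: Stage = Stage.IVR
    history: list = field(default_factory=list)

    def current_prompt(self):
        # Only the active stage's instructions are given to the model.
        return STAGE_PROMPTS.get(self.stage)

    def advance(self, stage_goal_met: bool) -> Stage:
        # Move on only when the current stage's goal has been met.
        if stage_goal_met and self.stage is not Stage.DONE:
            self.history.append(self.stage)
            self.stage = NEXT_STAGE[self.stage]
        return self.stage
```

One reason a design like this may help reliability is that each stage carries a short, focused instruction instead of one long prompt covering the whole call. How the real system decides that a stage's goal has been met is not described here, so `stage_goal_met` is left as a plain boolean.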

Resources & Links

Perk
Make.com – No-code automation platform used for the prototype
Twilio – Voice/telephony provider
ElevenLabs – Text-to-speech provider (used in early experiments)

Chapters

00:00 Introduction to the Team
01:54 Understanding Perk's Mission
02:59 Challenges in Travel Booking
07:27 AI Solutions for Customer Care
09:52 Prototyping with AI and Voice
17:00 Implementing AI in Production
25:51 Learning Through Trial and Error
26:40 Prompting Challenges and Solutions
27:58 Iterating on Prompts and Evaluations
30:08 Scaling and Production Challenges
32:43 Advanced Evaluation Techniques
35:32 Real-World Applications and Success
49:07 Future Directions and Expansion
53:53 Conclusion and Team Reflections

Full Transcript

Podcast transcripts are only available to paid subscribers.
