Job Description: Voice AI Engineer
Company: VoxyHealth (VoxyAI)
Location: Chennai, India (On-site)
Level: Mid-to-Senior Individual Contributor
About VoxyHealth
VoxyHealth builds AI-powered voice and workflow automation for healthcare. Our platform connects EHR systems, automates front-desk workflows, and enables ambient clinical intelligence — helping healthcare providers spend less time on paperwork and more time on patients. We work with clients ranging from specialty clinics to national health systems.
The Role
We’re hiring a voice AI engineer to build and scale the voice intelligence stack at the heart of our product—the real-time pipelines that listen, understand, and respond on behalf of healthcare providers. You’ll work on the production voice agent: STT, LLM orchestration, TTS, telephony integration, latency tuning, and the hard edge cases that come with handling clinical conversations.
This is a hands-on engineering role for someone who’s already shipped voice AI in production and wants to go deeper—not learn it for the first time.
What You’ll Own
– Voice agent pipeline — end-to-end design and tuning of STT → LLM → TTS flows, including barge-in, turn-taking, and latency budgets
– Telephony integration — SIP/WebRTC, call routing, audio streaming, jitter/packet-loss handling
– Model integration — work with hosted and self-hosted models (Whisper, Deepgram, ElevenLabs, OpenAI Realtime, Claude, etc.); evaluate, swap, and tune
– Real-time performance — drive down end-to-end latency; profile and fix the slow path
– Conversation quality — prompt engineering, function/tool calling, dialog state, fallback handling
– Evaluation and observability — build the test harness and metrics that tell us when the agent regresses
– Healthcare-grade reliability — handle edge cases (silence, noise, accents, code-switching, interruptions) without the agent falling apart
What We’re Looking For
Must-Have
– 5–7 years of software engineering experience overall, with 1–2 years specifically in voice AI (production, not side projects)
– Hands-on with at least one full voice pipeline: STT, TTS, and an LLM in the loop — you’ve debugged latency, barge-in, and turn-taking issues in production
– Telephony experience — SIP, WebRTC, or comparable real-time audio stacks; understands codecs, jitter buffers, and streaming audio
– Python backend proficiency; comfortable with Node.js as well
– Solid grasp of LLM orchestration — function/tool calling, streaming, prompt design, context management
– Comfortable in production: Docker, cloud (AWS or Azure), logs, traces, and metrics
– Understands the difference between a demo that works and a system that holds up under real call volume
Strong Plus
– Experience with Whisper, Deepgram, AssemblyAI, ElevenLabs, Cartesia, OpenAI Realtime API, LiveKit, Pipecat, or similar voice frameworks
– Healthcare domain knowledge — HIPAA, clinical workflows, EHR integrations
– Experience with multilingual or accent-robust voice systems
– ML/DL background — fine-tuning, evaluation harnesses, or custom model deployment
– Has shipped a voice agent that handled more than 1,000 concurrent calls or comparable production traffic
– Familiarity with Claude Code or similar AI-assisted development tools—we mandate their use across the engineering team
Mindset
– Obsessed with latency and conversation quality: “It works on my laptop” is not a finish line
– Ships and measures—every change is backed by an evaluation, not vibes
– Comfortable with messy real-world audio and the long tail of edge cases that come with it
– Bias toward production—you’d rather ship something behind a flag than perfect it offline
– Direct communicator; surfaces tradeoffs early
What You’ll Work On (Examples)
– Cutting end-to-end agent response latency from 1.8s to under 800ms
– Building a barge-in handler that doesn’t cut the user off mid-sentence
– Designing the eval harness that catches regressions before they hit a clinic
– Integrating a new TTS provider behind a feature flag and A/B testing voice quality
– Debugging why the agent loses context on long calls with accented speakers
Compensation
– Competitive base salary (commensurate with experience)
– Equity
– Health benefits
– On-site, Chennai
How We Hire
1. Intro call with Founder
2. Technical conversation — past voice AI work + system design (60 min)
3. Hands-on round — voice pipeline debugging or design exercise
4. Team / culture interview
5. Offer