We’re building agentic AI systems that autonomously reason, plan, and act across complex workflows. You’ll be a core contributor designing and shipping production-grade AI agents, integrating with LLM providers, orchestrating multi-step workflows, and building the infrastructure that makes agents reliable, observable, and scalable. This is a deeply technical role — you’ll work at the intersection of LLM engineering, backend systems, and agent architecture.
Job Title – Senior AI & Agentic System Engineer
Location – Gurgaon, India (fully remote)
Level – Senior (5+ years of experience)
Team – AI Platform / Engineering
WHAT YOU’LL DO
• Design and build AI agents — autonomous systems that use tools, browse the web, call APIs, manage memory, and complete multi-step tasks with minimal human intervention
• Architect agentic workflows — orchestrate multi-agent pipelines (planner → executor → verifier patterns), handle tool-use loops, HITL checkpoints, retries, and failure recovery
• Integrate LLM providers — work with Anthropic Claude, OpenAI, Gemini; implement prompt caching, structured outputs, and context management at scale; select the right model for the right task
• Build knowledge pipelines — design RAG systems, hybrid retrieval, vector stores, embedding pipelines, and memory layers that give agents contextual awareness
• Own backend reliability — REST/GraphQL APIs, async job queues, observability, latency optimization, cost tracking for token usage
• Drive system design decisions — lead architecture discussions, define component boundaries, evaluate build vs. buy tradeoffs, and document decisions (ADRs, diagrams)
• Maintain CI/CD pipelines — keep deployments fast, safe, and automated across dev/staging/prod environments
• Evaluate agent behavior — build LLM tracing, regression frameworks, and evals pipelines to detect behavioral drift, intent failures, and quality degradation across model updates
• Stay ahead of the curve — evaluate emerging standards (MCP, A2A, reasoning models, new frameworks) and make adoption recommendations
WHAT WE’RE LOOKING FOR
• Deep experience with at least one major LLM provider SDK (Anthropic, OpenAI, Gemini)
• Prompt engineering depth: few-shot design, chain-of-thought, structured outputs, tool/function calling, multi-turn context management, prompt caching, token budgeting
• Built real production agents — not just chatbots; systems that take actions, use tools, and operate autonomously
• Understands agent patterns: ReAct, plan-and-execute, multi-agent orchestration, reflection loops, supervisor/worker hierarchies
• Familiar with MCP (Model Context Protocol), A2A, and agent frameworks (LangGraph, CrewAI, AutoGen) — and when NOT to use them
• Knows how to handle non-determinism: retry strategies, output validation, graceful degradation
Emerging AI Landscape
• Reasoning models — knows when to use o1/o3, Claude Extended Thinking vs. standard models; treats model selection as an architectural decision, not an afterthought
• Model routing & cascading — routes queries to right-sized models (Haiku/Flash for simple, Opus/GPT-4o for complex) based on cost/capability tradeoffs; doesn’t default to the largest model for everything
• Compound AI systems — combine multiple specialized models and tools in pipelines rather than relying on a single general-purpose model
• Human-in-the-loop (HITL) — designs checkpointing for long-running agents, approval workflows, interrupt/resume patterns; knows when and where agents need human gates
• Persistent agent state — durable execution patterns (LangGraph checkpointers, Temporal-style workflows); agents that survive restarts and partial failures
• Prompt injection & AI security — understands attack surfaces unique to tool-using agents; builds with guardrails in mind (NeMo Guardrails, Llama Guard, output schemas)
• Semantic caching & batch inference — reduces LLM costs through response-level semantic caching and async batch APIs; thinks about cost at 10x before it becomes a problem
LLMOps & Evaluation
• LLM-specific observability: Langfuse, LangSmith, Helicone, or Arize Phoenix — prompt/response tracing, token cost per flow, per-step latency, session replay
• Evals methodology: RAGAS, DeepEval, Promptfoo, or Braintrust — systematic evaluation of LLM outputs, not vibes-based testing
• Treats prompts as code: versioning, A/B testing, rollback; understands how prompt changes ripple into downstream agent behavior
• Knows how to run AI/non-deterministic systems in CI: mock vs. live modes, cost controls, behavioral regression (not just output format matching)
• Cost tracking and optimization: token usage per workflow, model spend dashboards, alerting on cost regressions
Knowledge & Retrieval
• RAG pipeline depth: document ingestion, chunking strategies (fixed, semantic, agentic), embedding, retrieval, reranking
• Hybrid search — dense (vector) + sparse (BM25/keyword) retrieval; understands where pure vector search underperforms and how to combine approaches
• Awareness of GraphRAG and contextual retrieval (prepending context summaries to chunks) for structured knowledge domains
• Worked with vector databases: Pinecone, Weaviate, pgvector, or similar
• Understands knowledge graph concepts and when structured retrieval outperforms embedding-based approaches
System Design & Architecture
• Can design distributed systems end-to-end: from data flow diagrams to API contracts to deployment topology
• Experience designing for agentic-specific challenges: long-running tasks, partial failures, agent state persistence, tool result caching, idempotent retries
• Knows when to use async vs. sync, event-driven vs. request-response, monolith vs. microservices — and can justify the choice
• Comfortable with architecture patterns: event sourcing, CQRS, saga pattern for multi-step workflows, circuit breakers
• Documents decisions clearly (ADRs, architecture diagrams, onboarding docs) — not just builds, but communicates design
• Has opinions on API versioning, backward compatibility, and service contracts
• Thinks about scalability and cost together — not just ‘will it scale’ but ‘what does it cost at 10x’
Backend Engineering
• Strong in Python and/or Node.js/TypeScript for backend services
• API design (REST, GraphQL, webhooks), async patterns, background job processing
• Database proficiency: PostgreSQL, DynamoDB, or equivalent
• Strong on observability: structured logging, distributed tracing, metrics
CI/CD & Developer Tooling
• Proficient with GitHub Actions — writing workflows from scratch, not just copy-pasting templates
• Understands CI/CD pipeline stages: lint → test → build → deploy with proper gate logic
• Can set up environment-specific deployments (dev/staging/prod) with appropriate safeguards
• Familiar with Docker and container-based build pipelines
• Knows how to handle secrets management in pipelines (AWS Secrets Manager, GitHub Secrets)
• Experience with deployment strategies: blue/green, canary, rolling updates
• Understands how to test AI/non-deterministic systems in CI — mock vs. live modes, cost controls, timeout handling
AWS (Core Services)
• Compute: Lambda (serverless functions, cold start optimization, layer management), ECS/Fargate (containerized long-running agents and services)
• API & Routing: API Gateway for REST endpoints, ALB for container-based routing
• Storage: S3 for artifacts/reports/model outputs, DynamoDB or RDS/PostgreSQL for structured data
• Async & Queuing: SQS for decoupled task queues, SNS for fan-out, EventBridge for scheduled or event-driven triggers
• IAM: Writing least-privilege policies, understanding role assumption, cross-service permissions — not just using admin credentials
• Observability: CloudWatch Logs and Metrics for baseline monitoring; experience connecting to Datadog or similar APM preferred
• Infrastructure as Code: Proficient with Terraform — writing modules, managing state (remote backends, workspaces), handling drift, and deploying infra changes safely without manual console work
Mindset
• Comfortable with high ambiguity: AI system behavior isn’t always predictable
• Ships iteratively; knows when a prototype is good enough vs. when to harden
• Opinionated about quality but pragmatic about tradeoffs
NICE TO HAVE
• Experience with browser automation in agentic contexts (Playwright, Puppeteer, Anthropic computer use API)
• Multi-modal agent experience — vision inputs, document parsing, image understanding in agentic loops
• Familiarity with agent evaluation frameworks and behavioral regression testing
• Contributions to open-source AI tooling
• Background in ML/data science alongside software engineering
TECH STACK WE USE
• LLMs: Anthropic Claude, OpenAI
• Agent infra: Custom orchestration, MCP servers
• Backend: Node.js / Python, AWS Lambda + ECS/Fargate
• Browser automation: Playwright
• CI/CD: GitHub Actions
• Observability: Datadog, CloudWatch, Langfuse
• Data: PostgreSQL, S3, SQS, DynamoDB
• IaC: Terraform
WHY THIS ROLE
• You’ll work on AI problems that are genuinely unsolved — not CRUD apps with an LLM wrapper
• Greenfield architecture decisions — your opinions shape the system
• Direct impact on product; small team, high ownership
• Exposure to the full agentic stack from infra to eval to UX
WHAT WE DON’T EXPECT
• A research background or ML theory depth — this is an engineering role
• Experience with every framework listed — curiosity and speed of learning matter more than coverage
• Complete knowledge of the AI market, though you should have strong opinions about where it’s heading
