Infrastructure & AI Systems

SAHIL KHUNDIYA

Engineering distributed systems, intelligent retrieval pipelines, and production-grade AI architectures. Optimizing high-throughput backends and scalable AI infrastructure.

Observability

System Impact Metrics

Live Monitoring Active
Region: Asia-South1
Throughput Optimization
0%
P99 Latency: Validated
Production Mode
Query Latency Reduction
0%
Algorithmic Mastery
0+
Production Systems
0+
Kafka Event Velocity
0k/s
RAG Retrieval Accuracy
0%

Trajectory

Engineering Evolution

The progression from building interfaces to architecting high-scale intelligent infrastructure.

Monolithic Origins

Foundational Backend Engineering & REST APIs.

Java, Spring, PostgreSQL
Architecture Complexity
Validated Phase 01

Distributed Transition

Scalable Microservices & System Integration.

Microservices, Docker, Hibernate
Architecture Complexity
Validated Phase 02

Event-Driven Velocity

High-Throughput Asynchronous Pipelines.

Kafka, JMS, Event Streams
Architecture Complexity
Validated Phase 03

Intelligent Retrieval

Production RAG & Vector Infrastructure.

Vector DBs, Semantic Search
Architecture Complexity
Validated Phase 04

Knowledge Orchestration

Graph-Augmented Intelligent Systems.

Knowledge Graphs, Neo4j, LLM Routing
Architecture Complexity
Validated Phase 05

Elite AI Infrastructure

Architecting Scalable Hybrid AI Platforms.

Distributed AI, System Reliability
Architecture Complexity
Validated Phase 06

Deep Dive

Hybrid AI Infrastructure

Engineering a high-fidelity retrieval system for production AI agents.

Case Study: Retrieval 2.0
System: Distributed RAG

Optimizing Context Fusion

To move beyond simple RAG, we built a **Tri-Search Engine** that executes parallel queries across a **Knowledge Graph**, a **Vector Index**, and **Full Text Search**. The key engineering challenge was "Context Fusion": merging these disparate data sources into a single ranked set of documents that minimizes LLM hallucination while maximizing recall.
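
The parallel fan-out described above could look roughly like the following. This is a minimal sketch, not the production engine: the three retriever functions, their return values, and the `timeout_s` parameter are hypothetical stand-ins, and a source that fails or times out is assumed to contribute an empty list rather than fail the whole query.

```python
import asyncio

# Hypothetical stand-ins for the three retrievers. Real ones would call a
# graph database, a vector index, and a full-text search engine.
async def search_graph(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # simulate I/O latency
    return ["doc-graph-1", "doc-shared"]

async def search_vector(query: str) -> list[str]:
    await asyncio.sleep(0.01)
    return ["doc-shared", "doc-vec-1"]

async def search_fulltext(query: str) -> list[str]:
    await asyncio.sleep(0.01)
    return ["doc-fts-1", "doc-shared"]

async def tri_search(query: str, timeout_s: float = 0.5) -> dict[str, list[str]]:
    """Run all three retrievers concurrently; a failed or slow source yields []."""
    sources = {
        "graph": search_graph,
        "vector": search_vector,
        "fulltext": search_fulltext,
    }
    tasks = [asyncio.wait_for(fn(query), timeout=timeout_s) for fn in sources.values()]
    # return_exceptions=True keeps one bad source from cancelling the others.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return {
        name: ([] if isinstance(res, BaseException) else res)
        for name, res in zip(sources, results)
    }

ranked_lists = asyncio.run(tri_search("kafka consumer lag"))
```

Because the three calls overlap, total latency tracks the slowest source instead of the sum of all three, which is what makes a per-source timeout worth having.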

Routing Engine

Custom intent classification layer using lightweight BERT models to route queries between exact search and semantic retrieval.
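
To illustrate the routing step only, here is a sketch with a pluggable classifier. The classifier shown is a rule-based stand-in, not a BERT model: the intent labels, the regex, and the path names `full_text_search` and `vector_search` are all assumptions made for the example. A trained model would be passed in through the same `classify` parameter.

```python
import re
from typing import Callable

# Hypothetical intent labels; the routing targets are exact search
# and semantic retrieval.
EXACT, SEMANTIC = "exact", "semantic"

# Rule-based stand-in for a learned classifier: quoted phrases and
# ID-like tokens (e.g. "ORD-10423", "ERR_42") signal exact-match intent.
_ID_LIKE = re.compile(r'"[^"]+"|\b[A-Z]{2,}[-_]\d+\b')

def heuristic_intent(query: str) -> str:
    return EXACT if _ID_LIKE.search(query) else SEMANTIC

def route(query: str, classify: Callable[[str], str] = heuristic_intent) -> str:
    """Return the name of the retrieval path chosen for this query."""
    intent = classify(query)
    return "full_text_search" if intent == EXACT else "vector_search"
```

Keeping the classifier behind a plain callable means the routing logic can be tested without loading a model.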

Context Fusion

RRF (Reciprocal Rank Fusion) implementation that merges results from the vector and keyword search pipelines by rank position, so their incompatible score scales never need to be compared directly.
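
For reference, a minimal sketch of Reciprocal Rank Fusion in its standard form: each document scores the sum of 1 / (k + rank) over every list it appears in. The constant `k = 60` is the commonly used default, not a value taken from this system, and the document IDs are made up for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(
    ranked_lists: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several best-first rankings into one list of (doc_id, score)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by doc_id so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Illustrative inputs only.
vector_hits = ["doc-a", "doc-b", "doc-c"]
keyword_hits = ["doc-b", "doc-d", "doc-a"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that appear in both lists ("doc-a", "doc-b") end up ahead of documents that appear in only one, which is the behavior that lets exact-match and semantic hits reinforce each other.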

Status: Deployed to Production
Engineering Logic
Why PostgreSQL FTS?

Lower infrastructure overhead. By leveraging PostgreSQL's native inverted indexes alongside our vector store, we cut network-hop latency by 15 ms.

Dual Retrieval Pipeline

Vector search captures "concepts", while FTS captures "specific identifiers". Combining them closed the 20% recall gap we faced with pure vector search.

Outcomes
Precision Increase: 40%
Hallucination Rate: -20%
P95 Retrieval Latency: ~85ms
Stack Overview
DB: PostgreSQL
Vector: ChromaDB
Stream: Kafka
Lang: Python/Java

Evidence

Production Artifacts

Real-world system designs and performance optimizations.

Distributed Event Topology

Production-grade Kafka cluster architecture designed for 100k+ events/sec.

Query Plan Optimization

PostgreSQL P99 latency reduction through advanced indexing and query plan analysis.

Infrastructure

Live System Simulation

The lifecycle of an intelligent query through a distributed retrieval architecture.

User Query
LLM Router
Vector Index
Knowledge Graph
Context Fusion
Production LLM

Philosophy

Beyond the CRUD API.

My approach to engineering is centered on **reliability at scale**. I believe that any system that doesn't account for network failures, database lock contention, and traffic spikes isn't production-ready.

I prioritize **Clean Architecture** and **SOLID principles** not as dogmas, but as practical tools to reduce the cost of change in complex distributed systems.

"Simple is hard, but scalable is impossible without simplicity."

Fault Isolation

Implementing bulkhead patterns and circuit breakers to ensure service failures stay contained.
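
As one illustration of the circuit breaker half of this, here is a small single-threaded sketch. The class name, thresholds, and the injectable `clock` are assumptions for the example; production code would typically use an established resilience library and add locking.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    # Not thread-safe; a sketch of the state machine only.
    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset timeout elapsed: half-open, so let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers get an immediate `CircuitOpenError` instead of waiting on a dependency that is already known to be failing, which is what keeps the failure contained.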

Event-Driven Systems

Leveraging asynchronous message brokers for decoupled architectures and reliable state propagation.

Observability-First

Designing systems with distributed tracing and structured logging for P99 latency monitoring.

Distributed State

Managing data consistency across microservices using Sagas and Outbox patterns.
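
The Outbox half of that can be sketched with nothing but SQLite. The table names, the `order.created` topic, and the `publish` callback are hypothetical; in a real deployment the relay would publish to a broker such as Kafka, and consumers would need to be idempotent because delivery here is at-least-once.

```python
import json
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)

def place_order(conn: sqlite3.Connection, order_id: int, item: str) -> None:
    # Both inserts share one transaction: the business row and its event
    # are committed together or rolled back together.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"order_id": order_id, "item": item})),
        )

def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish pending outbox rows in order; return how many were pending."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        # If publish raises, the row stays unpublished and is retried later.
        publish(topic, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

The point of the pattern is that the service never writes to the database and the broker in two separate steps that can disagree; the broker write is derived from a committed row.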

System Reliability Whiteboard

Architecting for Resilience

1. Backoff: Retry Strategy
2. Timeout: Fault Isolation
3. Bulkhead: Resource Segregation
4. Circuit: Stateful Resilience
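
The first item on the whiteboard, retry with backoff, might be sketched as below. This uses exponential backoff with full jitter; the function name, defaults, and the injectable `sleep` and `rng` parameters are assumptions chosen to keep the example testable.

```python
import random
import time

def retry_with_backoff(fn, max_attempts=4, base_delay_s=0.1, max_delay_s=2.0,
                       retry_on=(Exception,), sleep=time.sleep, rng=random.random):
    """Call fn until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay cap doubles each attempt, bounded by max_delay_s.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            # Full jitter: sleep a random fraction of the cap so that many
            # clients retrying at once do not synchronize.
            sleep(rng() * cap)
```

Passing a narrow `retry_on` tuple (for example only connection errors) avoids retrying failures that will never succeed, such as validation errors.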

Arsenal

Technology Stack

Java (Language)
Spring Boot (Framework)
Kafka (Streaming)
PostgreSQL (Database)
Microservices (Architecture)
RAG Systems (AI)
Knowledge Graphs (Data Structure)
Vector DB (AI Infrastructure)
Docker (DevOps)
Hibernate (ORM)

Trajectory

Engineering Experience

Amantya Technologies

Trainee Engineer – Software Development

2023 - Present

Developing scalable backend microservices and intelligent systems.

  • Engineered scalable backend microservices using Spring Boot and Hibernate/JPA.
  • Implemented high-throughput async processing with Kafka and JMS.
  • Optimized PostgreSQL queries for production-grade performance.
  • Contributed to HLD/LLD for distributed system modules.

Samsung SDS

Developer Intern

2022

Contributed to enterprise backend modules and system integration.

  • Developed core backend modules using Java/Spring framework.
  • Optimized complex SQL queries for data-intensive operations.
  • Assisted in integrating various system modules into the main pipeline.

Coding Ninjas

Teaching Assistant

2021

Mentored students in Data Structures and Algorithms.

  • Solved 1,100+ DSA problems across various platforms.
  • Assisted students in debugging complex algorithmic problems.
  • Conducted doubt-clearing sessions for 500+ students.

Roadmap

Currently Exploring

The future of my engineering focus, moving towards deep infrastructure and autonomous agentic systems.

In Progress

Distributed AI Systems

Scaling LLM inference and training across distributed clusters.

Researching

Graph Retrieval (GRAG)

Combining graph traversal with vector search for deep semantic context.

Core Focus

Reliability Engineering

Advanced observability and chaos engineering in backend systems.

Planning

Event-Driven AI

Real-time agentic workflows triggered by streaming data events.

Consistency

Engineering Pulse

2000+ Commits in 2023
Backend Engineering Focus
