Microservices & Distributed Systems — Real-World Interview Guide

1. Why Microservices (and when NOT to)

Microservices are an organizational architecture. If they don’t match team structure and delivery needs, they backfire.

✅ When it makes sense

Multiple teams + independent deployments.
Different scaling needs per domain.
Clear bounded contexts (DDD).
High change frequency + autonomy required.

🚨 When it’s a mistake

Small team / early product / unstable requirements.
Strong cross-service transactions required.
No observability / DevOps maturity.
Chatty synchronous calls everywhere.

💡 Senior interview sentence

“Microservices optimize for independent change and deployment, not for simplicity. You pay in distributed-systems complexity.”

2. Service Boundaries & Data Ownership

        // Rule of thumb
        // One service = one bounded context = one database
        // Never share tables across services

⚠️ Trap: “Shared DB is faster”

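It destroys autonomy and blocks independent deployment, because any schema change can break every service that reads those tables. Services coupled through a shared database are effectively a distributed monolith.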

💡 Boundary test

If two areas can evolve independently and have different language/rules, they should be separate bounded contexts.

3. APIs & Contracts (Versioning, Compatibility)

✅ Good contract habits

Additive changes (new fields) over breaking changes.
Backwards compatible event schemas.
Explicit error model + stable status codes.
Consumer-driven contract tests (when critical).
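
As a rough illustration of additive, backward-compatible changes, the sketch below deserializes two payloads into the same consumer-side record with System.Text.Json. The OrderPaidV2 record, its optional Currency field, and the extra Channel field are hypothetical names invented for this example. An older payload without Currency still binds, and a newer payload with an unknown field is read without error, since System.Text.Json ignores unmapped JSON properties by default.

```csharp
using System;
using System.Text.Json;

// Payload from an older producer that predates the Currency field.
string oldPayload =
    "{\"OrderId\":\"3f2504e0-4f89-11d3-9a0c-0305e82c3301\",\"Amount\":42.50}";

// Payload from a newer producer that added a field this consumer does not know yet.
string newerPayload =
    "{\"OrderId\":\"3f2504e0-4f89-11d3-9a0c-0305e82c3301\",\"Amount\":42.50," +
    "\"Currency\":\"EUR\",\"Channel\":\"web\"}";

var fromOld = JsonSerializer.Deserialize<OrderPaidV2>(oldPayload)!;
var fromNewer = JsonSerializer.Deserialize<OrderPaidV2>(newerPayload)!;

Console.WriteLine(fromOld.Currency ?? "(no currency sent)");
Console.WriteLine(fromNewer.Currency);

// Hypothetical v2 contract: Currency was added as an optional field,
// so payloads written against v1 still deserialize.
public record OrderPaidV2(Guid OrderId, decimal Amount, string? Currency = null);
```

Renaming or removing OrderId or Amount would be the breaking kind of change: old consumers would silently read default values instead of failing loudly.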

🚨 Traps

“Just deploy both sides together” (not microservices anymore).
Breaking event schemas (silent production failures).
Leaking internal IDs (such as database keys) that mean nothing to consumers.

4. Sync vs Async Communication

Sync (HTTP / gRPC)

Simple, but the caller depends on the callee being up right now, which creates coupling and can cause cascading failures.

Async (Events)

Loose coupling, resilient, scalable. Preferred for integration.

Mixed

Common in practice. Must be designed carefully.

🚨 Trap: Chatty synchronous microservices

Many sync hops per request = high latency + fragile chain. Prefer async events or BFF/API aggregation.
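
One way to picture the aggregation option is a BFF endpoint that fans out to its dependencies concurrently instead of hopping through them one after another. The sketch below is only an illustration: OrderPageAggregator and the three downstream calls are hypothetical, and Task.Delay stands in for a network round trip.

```csharp
using System;
using System.Threading.Tasks;

var page = await OrderPageAggregator.LoadAsync(Guid.NewGuid());
Console.WriteLine($"{page.Customer} | {page.Status} | {page.Shipping}");

public record OrderPage(string Customer, string Status, string Shipping);

public static class OrderPageAggregator
{
    // Hypothetical downstream calls; Task.Delay simulates network latency.
    static async Task<string> GetCustomerAsync(Guid orderId) { await Task.Delay(50); return "Ada"; }
    static async Task<string> GetStatusAsync(Guid orderId) { await Task.Delay(50); return "Paid"; }
    static async Task<string> GetShippingAsync(Guid orderId) { await Task.Delay(50); return "In transit"; }

    public static async Task<OrderPage> LoadAsync(Guid orderId)
    {
        // Start every call before awaiting any of them, so the waits overlap.
        Task<string> customer = GetCustomerAsync(orderId);
        Task<string> status = GetStatusAsync(orderId);
        Task<string> shipping = GetShippingAsync(orderId);

        // Results come back in the same order as the tasks were passed in.
        string[] results = await Task.WhenAll(customer, status, shipping);
        return new OrderPage(results[0], results[1], results[2]);
    }
}
```

With sequential awaits the latencies add up. With the fan-out, the total is roughly that of the slowest call, though the endpoint still fails if any single dependency fails.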

5. Events, Commands, and Messaging

Command

“Do X” to a specific service. Needs a handler. Often requires a response (sync or async acknowledgment).

Event

“X happened”. Multiple consumers can react. No direct dependency on who listens.

        // Integration Event (contract between services)
        public record OrderPaidIntegrationEvent(
          Guid OrderId,
          DateTime PaidAtUtc,
          decimal Amount
        );

🚨 Trap: Publishing “Domain Entities” as events

Events are contracts. Publish minimal, stable facts — not your internal entity shape.
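
A small sketch of what that separation might look like: the internal entity keeps whatever fields the service needs, and an explicit mapping step decides which facts leave the service. The Order class, its InternalRiskScore field, and the OrderEvents helper are hypothetical and exist only for this example.

```csharp
using System;

var order = new Order
{
    Id = Guid.NewGuid(),
    Total = 99.90m,
    PaidAtUtc = DateTime.UtcNow,
    InternalRiskScore = 7
};
OrderPaidIntegrationEvent evt = OrderEvents.ToIntegrationEvent(order);
Console.WriteLine(evt);

// Internal entity: free to change shape without notifying other services.
public class Order
{
    public Guid Id { get; set; }
    public decimal Total { get; set; }
    public DateTime? PaidAtUtc { get; set; }
    public int InternalRiskScore { get; set; }   // hypothetical internal-only field
}

// Public contract: only the stable facts consumers need.
public record OrderPaidIntegrationEvent(Guid OrderId, DateTime PaidAtUtc, decimal Amount);

public static class OrderEvents
{
    public static OrderPaidIntegrationEvent ToIntegrationEvent(Order order)
    {
        if (order.PaidAtUtc is null)
            throw new InvalidOperationException("Order has not been paid.");

        // Explicit mapping: nothing leaks unless it is listed here.
        return new OrderPaidIntegrationEvent(order.Id, order.PaidAtUtc.Value, order.Total);
    }
}
```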

6. Reliability: Retries, Timeouts, Circuit Breakers

✅ Core rules

Always set timeouts (otherwise hung calls pile up and retries amplify the outage).
Use retry with backoff + jitter.
Add circuit breaker for failing dependencies.
Use bulkheads to isolate resources.

🚨 Trap: “Retry everything”

Retrying non-idempotent operations creates duplicates. Also: retries without timeouts cause cascading failures.

        // Pseudo: resilience stack order (outermost → innermost)
        // Overall Timeout → Retry (backoff + jitter) → Circuit Breaker → Bulkhead → Per-attempt Timeout
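
To make the retry rules concrete, here is a minimal hand-rolled sketch of retry with exponential backoff, full jitter, and a per-attempt timeout. It is not a replacement for a resilience library such as Polly: the Resilience.RetryAsync helper and its parameters are invented for this example, it retries on every exception (a real policy would retry only transient failures), and Random.Shared assumes .NET 6 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int calls = 0;
string result = await Resilience.RetryAsync(async ct =>
{
    calls++;
    await Task.Delay(10, ct);                           // simulated remote call
    if (calls < 3) throw new TimeoutException("transient failure");
    return "ok";
}, maxAttempts: 4, perAttemptTimeout: TimeSpan.FromSeconds(1), baseDelay: TimeSpan.FromMilliseconds(20));

Console.WriteLine($"{result} after {calls} calls");     // ok after 3 calls

public static class Resilience
{
    public static async Task<T> RetryAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        int maxAttempts, TimeSpan perAttemptTimeout, TimeSpan baseDelay)
    {
        for (int attempt = 1; ; attempt++)
        {
            // Each attempt gets its own deadline; the token is cancelled when it expires.
            using var cts = new CancellationTokenSource(perAttemptTimeout);
            try
            {
                return await operation(cts.Token);
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Exponential backoff with full jitter: wait a random time in [0, base * 2^(attempt-1)).
                double capMs = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
                await Task.Delay(TimeSpan.FromMilliseconds(Random.Shared.NextDouble() * capMs));
            }
            // On the last attempt the filter is false, so the exception reaches the caller.
        }
    }
}
```

The jitter matters when many clients fail at the same moment: without it they all retry in lockstep and hit the recovering dependency together.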

7. Idempotency (Safe Retries)

In distributed systems, duplicates happen. The system must tolerate them.

💡 Practical patterns

Use an Idempotency-Key generated by the client or the gateway, and reuse the same key on every retry of the same operation.
Store processed message IDs (Inbox pattern) or dedupe by key.
Design handlers as “apply if not already applied”.

🚨 Trap: Exactly-once delivery

Most message brokers deliver at-least-once. Aim for effectively-once behavior via idempotency + deduplication.
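
The Inbox idea can be sketched in a few lines. In this simplified, in-memory version a HashSet stands in for the inbox table, and PaymentHandler is a hypothetical consumer. In a real service the processed-ID record and the state change would be written in the same database transaction.

```csharp
using System;
using System.Collections.Generic;

var handler = new PaymentHandler();
var messageId = Guid.NewGuid();

bool first = handler.Handle(messageId, 25m);
bool second = handler.Handle(messageId, 25m);   // broker redelivers the same message

Console.WriteLine($"applied: {first}, redelivery applied: {second}, balance: {handler.Balance}");

public class PaymentHandler
{
    // Stand-in for an inbox table keyed by message ID.
    private readonly HashSet<Guid> _processed = new();

    public decimal Balance { get; private set; }

    // Returns true if the message was applied, false if it was a duplicate.
    public bool Handle(Guid messageId, decimal amount)
    {
        // HashSet.Add returns false when the ID is already present.
        if (!_processed.Add(messageId))
            return false;

        Balance += amount;
        return true;
    }
}
```

After both calls the balance is 25, not 50: the second delivery is acknowledged but changes nothing.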

8. Outbox Pattern (Reliable Event Publishing)

        // ✅ Outbox flow
        // 1) Begin DB transaction
        // 2) Save aggregate changes
        // 3) Save IntegrationEvent in Outbox table (same transaction)
        // 4) Commit
        // 5) Background worker publishes and marks as sent

🚨 Trap: “Publish inside the DB transaction”

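Publishing before commit risks phantom events if the transaction then rolls back. Publishing after commit risks lost events if the process crashes in between. The outbox closes both gaps: the event is stored atomically with the state change and published later, at-least-once, so consumers must still deduplicate.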
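
The outbox flow can be sketched without a real database or broker. Everything here is a stand-in invented for illustration: FakeDb.Commit plays the role of one database transaction, OutboxRow is the outbox table row, and the publish delegate represents the message broker.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var db = new FakeDb();
var published = new List<string>();

// Steps 1-4: the state change and the outbox row are committed together.
db.Commit(orders => orders.Add("order-1: Paid"),
          new OutboxRow(Guid.NewGuid(), "OrderPaid", "{\"orderId\":\"order-1\"}"));

// Step 5: the worker publishes pending rows, then marks them as sent.
OutboxWorker.PublishPending(db, published.Add);
OutboxWorker.PublishPending(db, published.Add);   // second pass finds nothing pending

Console.WriteLine($"published {published.Count} message(s)");

public record OutboxRow(Guid Id, string Type, string Payload)
{
    public bool Sent { get; set; }
}

public class FakeDb
{
    public List<string> Orders { get; } = new();
    public List<OutboxRow> Outbox { get; } = new();

    // Stand-in for one transaction: both writes are applied, or neither is.
    public void Commit(Action<List<string>> change, OutboxRow row)
    {
        var staged = new List<string>(Orders);
        change(staged);                 // if this throws, nothing below runs
        Orders.Clear();
        Orders.AddRange(staged);
        Outbox.Add(row);
    }
}

public static class OutboxWorker
{
    public static void PublishPending(FakeDb db, Action<string> publish)
    {
        foreach (var row in db.Outbox.Where(r => !r.Sent).ToList())
        {
            publish(row.Payload);       // a crash after this line means a re-publish next pass
            row.Sent = true;
        }
    }
}
```

The comment in PublishPending marks where duplicates come from: publishing and marking as sent are two separate steps, which is why the pattern gives at-least-once rather than exactly-once delivery.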

9. Sagas (Distributed Workflows)

Orchestration

A coordinator service drives the workflow step-by-step. Easier to reason about; central point of control.

Choreography

Services react to events. More decentralized, but harder to debug at scale.

🚨 Trap: “Sagas guarantee consistency”

Sagas manage eventual consistency via compensations. You must design compensating actions and handle partial failures.
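
A minimal orchestration sketch, assuming each step is a local action paired with a compensating action. SagaStep, SagaOrchestrator, and the three step names are hypothetical. Real steps would be remote calls or messages, and compensations can themselves fail and need retries, which this example does not handle.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();
var steps = new List<SagaStep>
{
    new SagaStep("ReserveStock",  () => log.Add("stock reserved"),  () => log.Add("stock released")),
    new SagaStep("ChargePayment", () => log.Add("payment charged"), () => log.Add("payment refunded")),
    new SagaStep("CreateShipment",
        () => throw new InvalidOperationException("carrier unavailable"),
        () => { }),
};

bool completed = SagaOrchestrator.Run(steps);
Console.WriteLine($"completed: {completed}");
Console.WriteLine(string.Join(" -> ", log));
// stock reserved -> payment charged -> payment refunded -> stock released

public record SagaStep(string Name, Action Execute, Action Compensate);

public static class SagaOrchestrator
{
    // Runs steps in order; on failure, compensates the completed steps in reverse order.
    public static bool Run(IReadOnlyList<SagaStep> steps)
    {
        var done = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                step.Execute();
                done.Push(step);
            }
            catch (Exception)
            {
                while (done.Count > 0)
                    done.Pop().Compensate();
                return false;
            }
        }
        return true;
    }
}
```

Note that the refund and the stock release are new business actions, not a rollback: between the charge and the refund, other services could observe the intermediate state.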

10. Eventual Consistency (Business-friendly explanation)

How to explain in interviews

“Instead of a single global transaction, each service commits locally and publishes facts. Other services converge to the new state.”

11. Observability (Logs, Metrics, Traces)

Logs

Structured logs + correlation IDs.

Metrics

Latency, error rate, saturation, queue depth.

Traces

End-to-end request view across services.

        // Must-have in production
        // CorrelationId / TraceId propagated via HTTP headers + message metadata
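
For outgoing HTTP calls, propagation could be done with a DelegatingHandler, sketched below. The header name X-Correlation-Id is a common convention rather than a standard, and passing the ID through the constructor is a simplification: a real service would usually take it from the incoming request context. StubHandler replaces the network so the example runs offline, and the URL is a made-up internal address.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

var stub = new StubHandler();
var handler = new CorrelationIdHandler("req-123") { InnerHandler = stub };
using var client = new HttpClient(handler);

await client.GetAsync("http://inventory.internal/stock/42");
Console.WriteLine(stub.SeenCorrelationId);   // req-123

public class CorrelationIdHandler : DelegatingHandler
{
    public const string HeaderName = "X-Correlation-Id";   // assumed header name
    private readonly string _correlationId;

    public CorrelationIdHandler(string correlationId) => _correlationId = correlationId;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Keep an ID that an upstream caller already set; otherwise attach ours.
        if (!request.Headers.Contains(HeaderName))
            request.Headers.Add(HeaderName, _correlationId);

        return base.SendAsync(request, cancellationToken);
    }
}

// Test double that records the header instead of calling the network.
public class StubHandler : HttpMessageHandler
{
    public string? SeenCorrelationId { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        SeenCorrelationId = request.Headers.TryGetValues(CorrelationIdHandler.HeaderName, out var values)
            ? string.Join(",", values)
            : null;
        return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK));
    }
}
```

The same ID would also go into message metadata when publishing events, so that logs from HTTP and messaging paths can be joined.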

🚨 Trap: “We can debug later”

Without tracing + structured logs, microservices become impossible to operate reliably.

12. Security Between Services (Zero Trust)

Principles

Services authenticate each other (mTLS / tokens).
Authorize based on validated claims/policies, not on anything the client merely asserts (headers, query parameters, body fields).
Least privilege per service identity.

🚨 Trap

Trusting headers/roles from the client directly. A gateway can enforce, but services must still validate.
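
As a sketch of claims-based checks inside a service, the example below builds a ClaimsPrincipal by hand and evaluates two policies against it. In a real service the principal would come from token or certificate validation, not from code. OrderPolicy, the scope names, and the assumption that each scope arrives as its own "scope" claim are all illustrative, since some token issuers send one space-delimited string instead.

```csharp
using System;
using System.Security.Claims;

// Stand-in for the identity produced by validating the caller's token.
var identity = new ClaimsIdentity(new[]
{
    new Claim("sub", "billing-service"),
    new Claim("scope", "orders.read"),
}, authenticationType: "Bearer");
var caller = new ClaimsPrincipal(identity);

Console.WriteLine(OrderPolicy.CanReadOrders(caller));     // True
Console.WriteLine(OrderPolicy.CanRefundOrders(caller));   // False

public static class OrderPolicy
{
    public static bool CanReadOrders(ClaimsPrincipal caller) => HasScope(caller, "orders.read");
    public static bool CanRefundOrders(ClaimsPrincipal caller) => HasScope(caller, "orders.refund");

    // Least privilege: the caller must be authenticated and hold the exact scope.
    private static bool HasScope(ClaimsPrincipal caller, string scope) =>
        caller.Identity?.IsAuthenticated == true && caller.HasClaim("scope", scope);
}
```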

13. Deployments & Versioning (The real difficulty)

✅ Practices that save you

Blue/Green or canary releases.
Backward compatible contracts.
Feature flags for risky behavior changes.
Database migrations per service with safe rollout.

🚨 Trap: Coordinated deploys

If every change requires deploying 5 services together, you’ve created a distributed monolith.

14. Most Common Interview Traps

“We use microservices to be scalable”

Scalability is one reason, but autonomy and independent change are usually the real drivers.

“Exactly once delivery”

Assume at-least-once, build idempotency and deduplication.

“Synchronous chains are fine”

Each hop adds latency and a failure point, so one slow dependency can cascade upstream. Bound the chain with timeouts, retries, and circuit breakers, or replace hops with async events.

🎯 Final Interview Advice

If you mention “bounded contexts + outbox + idempotency + tracing + resilience policies + contract compatibility” — you’ll sound like someone who has shipped microservices in production.