
Dependency Mapping: Finding Hidden Connections in Your Codebase

Vaibhav Verma
Tags: dependencies, architecture, code analysis, codebase understanding, refactoring

Last year I was asked to split a monolith into microservices. "It'll take a quarter," the VP of Engineering said. "The boundaries are pretty clear." They weren't. What looked like four independent modules turned out to be a web of 847 cross-module dependencies. The "simple split" took fourteen months.

The problem wasn't that the codebase was badly written. The problem was that nobody had mapped the actual dependencies. Everyone had a mental model of how things connected, and every mental model was wrong in different ways.

Dependency mapping is the practice of discovering, documenting, and analyzing every connection between components in your codebase. Not the dependencies you think exist. The dependencies that actually exist.

The Five Types of Dependencies Most Teams Miss

When developers think "dependencies," they think imports. But imports are just one of at least five types of coupling between components.

Type 1: Static Dependencies (Import/Require)

The obvious ones. Module A imports Module B. These are visible in the code and analyzable with tools.

typescript
import { UserService } from "../users/UserService"; // Explicit dependency

Type 2: Runtime Dependencies

Dependencies that only manifest at runtime. Dependency injection, dynamic imports, reflection, and string-based lookups.

typescript
// This dependency is invisible to static analysis
const service = container.resolve(config.serviceName);

// So is this one
const handler = await import(`./handlers/${eventType}`);
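
One way to pull a string-based lookup back into view of static analysis is an explicit registry, where every handler is referenced by name in one place. The sketch below is an illustration under assumptions, not code from the system described here: the event names and the handlers `handleOrderCreated` and `handleOrderCancelled` are hypothetical.

```typescript
// Hypothetical handlers. In a real codebase these would be static imports,
// e.g. import { handleOrderCreated } from "./handlers/orderCreated";
type Handler = (payload: unknown) => string;

const handleOrderCreated: Handler = () => "order created handled";
const handleOrderCancelled: Handler = () => "order cancelled handled";

// Every reachable handler is named here, so import-graph tools can see the edge.
const handlerRegistry = {
  "order.created": handleOrderCreated,
  "order.cancelled": handleOrderCancelled,
} as const;

type EventType = keyof typeof handlerRegistry;

// Narrow an arbitrary string to a known event type.
function isEventType(value: string): value is EventType {
  return Object.prototype.hasOwnProperty.call(handlerRegistry, value);
}

function dispatch(eventType: string, payload: unknown): string {
  if (!isEventType(eventType)) {
    throw new Error(`No handler registered for event type: ${eventType}`);
  }
  return handlerRegistry[eventType](payload);
}
```

The trade-off is one extra line per handler in exchange for a dependency that shows up in the static graph and fails loudly on unknown event types.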

Type 3: Data Dependencies

Two modules that share no code but depend on the same database table, the same cache key, or the same message queue topic. Change the schema, and both modules break.

typescript
// Module A writes:
await prisma.user.update({ where: { id }, data: { status: "active" } });

// Module B reads (in a completely separate file, no shared imports):
const activeUsers = await prisma.user.findMany({ where: { status: "active" } });

// If Module A starts writing "enabled" instead of "active", Module B's query
// silently returns no users. Nothing fails at compile time.

Type 4: Temporal Dependencies

Components that must execute in a specific order. The nightly batch job must complete before the reporting job runs. The cache must be warmed before the API starts serving traffic. These dependencies live in deployment scripts, cron schedules, and tribal knowledge.

Type 5: Semantic Dependencies

The most insidious kind. Two modules that share an implicit understanding about data format, business rules, or behavior. No code connection. No shared schema. Just an agreement that "status code 7 means the user has been soft-deleted" that exists only in two developers' heads.
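
A semantic dependency like the soft-delete code can be made explicit by giving the magic number a name in one shared place. This is a minimal sketch, assuming a hypothetical shared contract that both modules import. Only code 7 comes from the example above; the `ACTIVE` value and all identifiers are invented for illustration.

```typescript
// Hypothetical shared contract. Only 7 (soft-deleted) comes from the example
// in the text; ACTIVE = 1 is an invented placeholder.
const UserStatusCode = {
  ACTIVE: 1,
  SOFT_DELETED: 7,
} as const;

interface UserRecord {
  id: string;
  statusCode: number;
}

// Both modules call this instead of comparing against a bare 7.
function isSoftDeleted(user: UserRecord): boolean {
  return user.statusCode === UserStatusCode.SOFT_DELETED;
}
```

Once the agreement lives in code, it becomes a static dependency that tools can find, and changing the meaning of 7 means changing one file that both modules visibly import.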

Building Your Dependency Map

Step 1: Static Analysis

Start with what tools can find automatically.

For JavaScript/TypeScript:

bash
# Install dependency-cruiser (more configurable than madge)
npm install -g dependency-cruiser

# Generate a dependency report
depcruise --output-type dot src/ | dot -T svg > deps.svg

# Or get a JSON report for further analysis
depcruise --output-type json src/ > deps.json

# Check for violations against your architecture rules
depcruise --validate .dependency-cruiser.js src/

Dependency-cruiser is my preferred tool because it lets you define rules about which dependencies are allowed:

javascript
// .dependency-cruiser.js
module.exports = {
  forbidden: [
    {
      name: "no-circular",
      severity: "error",
      from: {},
      to: { circular: true },
    },
    {
      name: "no-ui-to-data",
      comment: "UI layer should not import from data layer directly",
      severity: "error",
      from: { path: "^src/ui/" },
      to: { path: "^src/data/" },
    },
    {
      name: "no-cross-module-internals",
      comment: "Modules should only import from each other's index",
      severity: "warn",
      from: { path: "^src/orders/" },
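      to: {
        // Anything under payments except its public entry point.
        // (A "(?!index)" lookahead would also let through files such as
        // indexer.ts, so the exclusion lives in pathNot alone.)
        path: "^src/payments/",
        pathNot: "^src/payments/index\\.ts$",
      },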
    },
  ],
};

For Go:

bash
# Built-in tooling: module-level dependencies from go.mod
go mod graph
# Or for internal package dependencies
go list -json ./... | jq '.ImportPath, .Imports'

For Python:

bash
pip install pydeps
pydeps mypackage --no-show --max-bacon=3

Step 2: Runtime Dependency Discovery

Static analysis misses runtime dependencies. To find those, you need to observe the system running.

Approach A: Distributed tracing. If you have Jaeger, Zipkin, or Datadog APM, your traces already show runtime dependencies. Export a service map from your tracing tool. Compare it to your static analysis. The differences are your hidden runtime dependencies.

Approach B: Log analysis. Add structured logging to module boundaries. Log every outbound call with the source and destination module. Run the system for a week. Aggregate the logs.

typescript
// Add boundary logging
logger.info("module_call", {
  from: "orders",
  to: "payments",
  method: "processPayment",
  correlationId: ctx.correlationId,
});
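
Aggregating a week of those log lines into a list of observed edges might look like the sketch below. It assumes the logs have already been parsed into objects with the `from`, `to`, and `method` fields shown above. `ModuleCallLog`, `ObservedEdge`, and `aggregateEdges` are illustrative names, not part of any logging library.

```typescript
interface ModuleCallLog {
  from: string;
  to: string;
  method: string;
}

interface ObservedEdge {
  from: string;
  to: string;
  calls: number;
  methods: string[];
}

// Collapse individual call logs into one record per (from, to) pair.
function aggregateEdges(logs: ModuleCallLog[]): ObservedEdge[] {
  const edges = new Map<string, ObservedEdge>();
  for (const log of logs) {
    const key = `${log.from}->${log.to}`;
    let edge = edges.get(key);
    if (!edge) {
      edge = { from: log.from, to: log.to, calls: 0, methods: [] };
      edges.set(key, edge);
    }
    edge.calls += 1;
    if (edge.methods.indexOf(log.method) === -1) {
      edge.methods.push(log.method);
    }
  }
  // Busiest edges first.
  return Array.from(edges.values()).sort((a, b) => b.calls - a.calls);
}
```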

Approach C: Network analysis. For microservices, capture network traffic between services using a service mesh (Istio, Linkerd) or a network analysis tool. The network doesn't lie. If service A makes HTTP calls to service B, that's a dependency, regardless of what the documentation says.
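
Whichever approach produces the runtime edges, the comparison against the static graph can be sketched as a set difference. This assumes both graphs have already been normalized to the same module names, which in practice is usually the harder part. The `Edge` type and `findHiddenEdges` are hypothetical names.

```typescript
interface Edge {
  from: string;
  to: string;
}

// Edges observed at runtime that the static import graph does not contain.
function findHiddenEdges(staticEdges: Edge[], runtimeEdges: Edge[]): Edge[] {
  const key = (e: Edge): string => `${e.from}->${e.to}`;
  const known = new Set<string>(staticEdges.map(key));
  const reported = new Set<string>();
  const hidden: Edge[] = [];
  for (const edge of runtimeEdges) {
    const k = key(edge);
    if (!known.has(k) && !reported.has(k)) {
      reported.add(k); // report each hidden edge once
      hidden.push(edge);
    }
  }
  return hidden;
}
```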

Step 3: Data Dependency Discovery

This is where most teams have blind spots.

bash
# Find all modules that reference a specific database table
rg "prisma\.user\." --type ts -l

# Find all modules that read from a specific cache key
rg "cache\.get\(\"user:" --type ts -l

# Find all modules that publish or subscribe to specific events
rg "order\.created" --type ts -l

Build a matrix:

| Data Resource | Writers | Readers |
| --- | --- | --- |
| users table | auth, admin | orders, billing, reports, notifications |
| orders table | orders | billing, reports, fulfillment |
| user:session cache | auth | api-gateway, orders |
| order.created event | orders | payments, notifications, analytics, fulfillment |

Every row with multiple writers is a coordination problem. Every row with many readers is a blast radius problem (changing the schema affects many modules).
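
If the search results are recorded as (resource, module, read or write) entries, the matrix and both warning signs can be derived mechanically. A rough sketch follows. The `DataAccess` shape is an assumption about how you might record the findings, and the default reader threshold of 4 is an arbitrary choice, not a rule from this article.

```typescript
type AccessMode = "read" | "write";

interface DataAccess {
  resource: string; // e.g. "users table"
  module: string;   // e.g. "auth"
  mode: AccessMode;
}

interface MatrixRow {
  resource: string;
  writers: string[];
  readers: string[];
  multipleWriters: boolean; // coordination problem
  wideBlastRadius: boolean; // many readers affected by a schema change
}

function buildDataMatrix(accesses: DataAccess[], readerThreshold = 4): MatrixRow[] {
  const rows = new Map<string, { writers: Set<string>; readers: Set<string> }>();
  for (const access of accesses) {
    let row = rows.get(access.resource);
    if (!row) {
      row = { writers: new Set<string>(), readers: new Set<string>() };
      rows.set(access.resource, row);
    }
    (access.mode === "write" ? row.writers : row.readers).add(access.module);
  }
  return Array.from(rows.entries()).map(([resource, row]) => {
    const writers = Array.from(row.writers).sort();
    const readers = Array.from(row.readers).sort();
    return {
      resource,
      writers,
      readers,
      multipleWriters: writers.length > 1,
      wideBlastRadius: readers.length >= readerThreshold,
    };
  });
}
```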

Step 4: Temporal Dependency Discovery

bash
# Check cron schedules
crontab -l
# Or in Kubernetes
kubectl get cronjobs -o wide

# Check startup dependencies in docker-compose
grep "depends_on" docker-compose.yml -A 5

Build a timeline:

00:00 - Cache warmup job starts
00:15 - Cache warmup must complete before API health check passes
03:00 - Nightly data sync from external provider
04:00 - Report generation (depends on sync being complete)
06:00 - Email digest (depends on reports being generated)
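
A timeline like this can also be written down as data and checked, so an edit to a cron schedule does not silently break the ordering. The sketch below uses the start times from the timeline above. The durations are invented placeholders, and it assumes every job runs within a single day, with no wrap past midnight. `ScheduledJob` and `findOrderingViolations` are hypothetical names.

```typescript
interface ScheduledJob {
  name: string;
  startMinute: number;     // minutes after midnight
  expectedMinutes: number; // assumed typical duration (placeholder values below)
  dependsOn: string[];
}

// Report every job scheduled to start before a dependency is expected to finish.
function findOrderingViolations(jobs: ScheduledJob[]): string[] {
  const byName = new Map<string, ScheduledJob>();
  for (const job of jobs) {
    byName.set(job.name, job);
  }

  const violations: string[] = [];
  for (const job of jobs) {
    for (const depName of job.dependsOn) {
      const dep = byName.get(depName);
      if (!dep) {
        violations.push(`${job.name} depends on unknown job ${depName}`);
        continue;
      }
      if (job.startMinute < dep.startMinute + dep.expectedMinutes) {
        violations.push(`${job.name} starts before ${depName} is expected to finish`);
      }
    }
  }
  return violations;
}

const nightlyJobs: ScheduledJob[] = [
  { name: "data-sync", startMinute: 180, expectedMinutes: 45, dependsOn: [] },
  { name: "report-generation", startMinute: 240, expectedMinutes: 90, dependsOn: ["data-sync"] },
  { name: "email-digest", startMinute: 360, expectedMinutes: 10, dependsOn: ["report-generation"] },
];
```

A check like this only validates the plan. Monitoring for jobs that overrun their expected duration still has to happen at runtime.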

Analyzing Your Dependency Map

Once you have the full picture, analyze it for problems.

Problem 1: Circular Dependencies

Two modules that depend on each other. This makes them impossible to deploy, test, or reason about independently.

bash
# Find circular dependencies
depcruise --output-type err-long --validate .dependency-cruiser.js src/
# or, with madge (it scans .js files by default, so name the TypeScript extension)
madge --circular --extensions ts src/

Fix: Extract the shared concept into a third module. If orders depends on payments and payments depends on orders, there's probably a shared concept (like MoneyCalculation or PricingRules) that both should depend on.
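
To see why the extraction works, it helps to look at cycle detection itself. The sketch below is a generic depth-first search over a module graph, not the actual implementation used by dependency-cruiser or madge. The graph shape and the `pricing` module name are assumptions for illustration.

```typescript
// module name -> modules it imports
type DependencyGraph = Record<string, string[]>;

// Returns the first cycle found as a path such as ["orders", "payments", "orders"],
// or null when the graph is acyclic.
function findCycle(graph: DependencyGraph): string[] | null {
  const state: Record<string, "visiting" | "done"> = {};
  const stack: string[] = [];

  const visit = (node: string): string[] | null => {
    if (state[node] === "done") return null;
    if (state[node] === "visiting") {
      // Reached a node that is still on the current path: that path is a cycle.
      return stack.slice(stack.indexOf(node)).concat(node);
    }
    state[node] = "visiting";
    stack.push(node);
    for (const next of graph[node] || []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    stack.pop();
    state[node] = "done";
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// Before: orders and payments import each other.
const before: DependencyGraph = { orders: ["payments"], payments: ["orders"] };

// After: the shared concept lives in a third module that both depend on.
const after: DependencyGraph = {
  orders: ["payments", "pricing"],
  payments: ["pricing"],
  pricing: [],
};
```

Here `findCycle(before)` reports the orders/payments loop, while `findCycle(after)` returns null because every edge now points toward `pricing` and none point back.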

Problem 2: Hub Modules

One module that everything depends on. If it breaks, everything breaks.

bash
# Find the most depended-on modules (highest afferent coupling)
depcruise --output-type json src/ | \
  jq '[.modules[] | {source: .source, dependents: (.dependents | length)}] |
  sort_by(-.dependents) | .[0:10]'

Fix: If the hub module is a utilities grab-bag, split it into focused modules. If it's a core domain module, ensure it has excellent test coverage and stability guarantees.
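
The same count can be computed from any edge list, which is handy if your graph comes from a tool other than dependency-cruiser. A small sketch, assuming the graph is available as a map from each module to the modules it imports. `findHubs` is a hypothetical helper, and its default threshold of 10 mirrors the hub check in the checklist later in this post.

```typescript
// module name -> modules it imports
type DependencyGraph = Record<string, string[]>;

interface HubReport {
  moduleName: string;
  dependents: number;
}

// List modules with more direct dependents than the threshold, largest first.
function findHubs(graph: DependencyGraph, maxDependents = 10): HubReport[] {
  const dependentsOf = new Map<string, Set<string>>();
  for (const source of Object.keys(graph)) {
    for (const target of graph[source]) {
      if (target === source) continue; // ignore self-references
      let dependents = dependentsOf.get(target);
      if (!dependents) {
        dependents = new Set<string>();
        dependentsOf.set(target, dependents);
      }
      dependents.add(source);
    }
  }
  return Array.from(dependentsOf.entries())
    .map(([moduleName, dependents]) => ({ moduleName, dependents: dependents.size }))
    .filter((report) => report.dependents > maxDependents)
    .sort((a, b) => b.dependents - a.dependents);
}
```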

Problem 3: Hidden Coupling Through Data

Two modules that look independent but share a database table. Changing one breaks the other without any compiler warning.

Fix: Define explicit data contracts. If the orders module needs user data, it should depend on a UserRepository interface, not directly on the users database table. This makes the dependency visible and testable.
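
A data contract of this kind might look like the sketch below. Only the `UserRepository` name comes from the text above. The `UserSummary` fields, `OrderService`, and the in-memory implementation are assumptions made for illustration.

```typescript
// Hypothetical contract owned by the users module. The fields are illustrative.
interface UserSummary {
  id: string;
  email: string;
  status: "active" | "inactive";
}

interface UserRepository {
  findById(id: string): Promise<UserSummary | null>;
}

// The orders module depends on the interface, never on the users table.
class OrderService {
  private readonly users: UserRepository;

  constructor(users: UserRepository) {
    this.users = users;
  }

  async canPlaceOrder(userId: string): Promise<boolean> {
    const user = await this.users.findById(userId);
    return user !== null && user.status === "active";
  }
}

// In-memory implementation, useful in tests. A production version would wrap
// the real database client behind the same interface.
class InMemoryUserRepository implements UserRepository {
  private readonly rows: UserSummary[];

  constructor(rows: UserSummary[]) {
    this.rows = rows;
  }

  async findById(id: string): Promise<UserSummary | null> {
    const match = this.rows.filter((user) => user.id === id)[0];
    return match === undefined ? null : match;
  }
}
```

With this in place, a schema change in the users table forces a change to the repository implementation, and any mismatch with `UserSummary` surfaces as a type error in the orders module rather than as a silent runtime failure.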

I Was Wrong About Microservices Solving This

For years I believed that splitting a monolith into microservices would eliminate dependency problems. It doesn't. It transforms compile-time dependencies into runtime dependencies, which are harder to detect, harder to debug, and harder to fix.

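In a monolith, a circular dependency between orders and payments is something your toolchain can catch before deploy: Go refuses to compile an import cycle, and in TypeScript a no-circular rule in CI fails the build. In microservices, a circular dependency between the orders service and the payments service causes a cascading failure at 3 AM on a Saturday.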

The discipline of mapping and managing dependencies applies regardless of your architecture. Microservices don't remove the need for it. They increase it.

The Dependency Health Checklist

Run this assessment monthly:

  • Static dependency graph is generated and reviewed
  • No circular dependencies exist (or they're documented and scheduled for fix)
  • No module has more than 10 direct dependents (hub check)
  • Data dependencies (shared tables, caches, events) are documented
  • Temporal dependencies (job ordering, startup sequencing) are documented
  • Dependency-cruiser rules (or equivalent) run in CI
  • Architectural boundary violations are caught before merge
  • New dependencies require explicit justification in the PR description

The Dependency Decision Matrix

When you find a dependency problem, use this matrix to decide what to do:

| Dependency Type | Severity | Action |
| --- | --- | --- |
| Circular (static) | High | Break the cycle by extracting shared concept |
| Circular (runtime) | Critical | Break immediately, this causes cascading failures |
| Hub module (>10 dependents) | Medium | Split into focused modules over next 2 quarters |
| Data coupling (shared table, many readers) | Medium | Add explicit data contracts / repository interfaces |
| Data coupling (shared table, multiple writers) | High | Consolidate writes to one owner module |
| Temporal (undocumented) | Medium | Document in runbook, add monitoring for ordering violations |
| Semantic (implicit agreement) | High | Make explicit with shared types, schemas, or contracts |

Getting Started

If you've never mapped your dependencies, start small. Pick one module. Run madge or dependency-cruiser on it. Look at the output. Are there dependencies you didn't expect? Circular dependencies you didn't know about? Connections to modules that seem unrelated?

Then pick the most surprising dependency and investigate it. Why does it exist? Is it necessary? Can it be removed or inverted?

That single investigation will teach you more about your codebase than a week of reading source code. Dependencies are the skeleton of your architecture. Once you see them clearly, everything else makes more sense.
