Scaling and Clustering Principles

MonkDB supports flexible clustering patterns to match different workload shapes, operational preferences, and resiliency requirements.

Clusters can scale horizontally for distributed workloads or use primary–replica patterns for simpler read-scaling deployments.

Supported Clustering Patterns

MonkDB supports two primary clustering models.

1. Distributed Shard-Based Cluster

Tables are partitioned into shards distributed across nodes.
Each shard has a primary and optional replica copies.
Query execution fans out across shard holders.
Reads and writes are distributed across the cluster.

Best suited for:

Large datasets
Mixed operational + analytical workloads
High ingest throughput
Distributed compute scaling

Shard distribution and replica placement

2. Primary with Read Replica Topology

A primary node handles write operations while replicas serve read traffic.

Primary node processes writes
Replicas maintain synchronized copies
Replicas can serve read-heavy workloads
Provides simple scaling for read-intensive applications

Best suited for:

Read-heavy applications
Simpler operational topology
Controlled write workloads

Primary with read replica topology

Cluster Sizing Baseline

For production deployments:

Minimum 3 nodes for quorum-safe distributed clusters
Replication aligned with failure domains (node, rack, zone)
Storage sized for data growth + replicas
Memory sized for query peaks and intermediate results

For primary–replica deployments:

Size primary node for peak write load
Size replicas for read concurrency

Capacity Planning Dimensions

Plan capacity across four dimensions.

CPU

Handles:

SQL parsing
Query planning
Execution
Shard-level processing
Distributed merge operations

Memory

Memory usage includes:

JVM heap
Query intermediate results
Circuit breakers
Caches
Distributed merge buffers

Storage

Storage footprint includes:

Shard data
Replica copies
Translog files
Snapshot repositories

Network

Network traffic includes:

Shard replication
Distributed query fan-out
Shard relocation
Recovery operations

Baseline load testing should measure:

Steady-state performance
Peak ingest behavior
Worst-case query patterns

Horizontal Scaling Model

Scaling behavior depends on deployment topology.

Distributed Cluster Scaling

Adding nodes increases:

Total CPU
Total memory
Storage capacity
Distributed query workers

Shard allocation automatically redistributes data across nodes.

Distributed queries utilize additional shard workers as the cluster grows.

Primary–Replica Scaling

Adding replicas increases:

Read throughput
Read concurrency
Failover capacity

Write throughput remains bounded by the primary node.

Read/write separation reduces contention on the write path.

Practical Scale-Out Flow

Recommended scaling procedure:

Add one node or failure domain at a time
Monitor shard relocation
Observe cluster health
Validate query latency and throughput
Expand incrementally

Monitor using:

sys.nodes
sys.shards
sys.allocations

Scale Trigger Decision Table

Signal	Typical Pattern	Primary Action
Sustained high CPU	p95 node CPU near saturation	Add nodes and rebalance shards
Primary node saturation	write-heavy workload	Scale primary node or distribute writes
Breaker exceptions	memory pressure during queries	Increase memory capacity or tune queries
Long shard recovery times	recovery exceeds SLO	Reduce shard size or improve storage
Read latency spikes	p95/p99 read latency increases	Add replicas or additional nodes
Write latency increases	replication or I/O pressure	Review replication policy and storage

Shard Strategy Guidance

Shard count affects both performance and operational complexity.

Too Few Shards

Limited parallelism
Underutilized cluster resources

Too Many Shards

Scheduling overhead
Cluster metadata pressure
Slower recovery operations

Choose shard counts based on:

Expected dataset growth
Node count trajectory
Concurrent workload patterns

Shard Size Heuristics

Operational recommendations:

Keep shard sizes manageable for relocation
Prefer predictable partitioning for time-series workloads
Periodically review shard strategy as clusters grow

Replica Strategy Guidance

Replicas improve:

Read throughput
Query parallelism
Fault tolerance

Tradeoffs include:

Increased storage usage
Additional write amplification
More recovery traffic

Replica count should reflect:

SLA requirements
Read concurrency
Failure tolerance goals

Zone-Aware Resiliency

Deploy clusters across independent failure domains when possible.

Examples:

Availability zones
Racks
Physical hosts

Zone aware resiliency

Primaries and replicas should be distributed across zones to reduce correlated failures.

Kubernetes Scaling Principles

Recommended patterns:

Deploy data nodes using StatefulSets
Assign persistent volumes per pod
Configure pod anti-affinity
Avoid placing shard replicas on the same node

Scale gradually and monitor relocation and query latency during scaling operations.

Docker and VM Deployment Principles

Operational guidance:

Maintain stable node.name
Preserve persistent storage across restarts
Avoid frequent recycling of data nodes
Use explicit cluster discovery configuration

Replica nodes must resynchronize cleanly after restarts.

Node-Level vs Cluster-Level Settings

Node-Level Settings

Examples:

network.*
path.*
node.name
transport.*
http.*
JVM heap configuration

These settings apply only to the local node.

Cluster-Level Settings

Examples:

indices.breaker.*
governance.*
audit.*
lineage.*
fdw.allow_local

These settings affect the entire cluster.

Rolling Operations

Recommended approach:

Perform node updates one at a time
Allow shards to stabilize before continuing
Monitor cluster health throughout operations

Key monitoring tables:

sys.nodes
sys.shards
sys.allocations

Scale Troubleshooting Checklist

SELECT name, load['1'], mem['used_percent'], heap['used']
FROM sys.nodes
ORDER BY name;

SELECT table_name, id, routing_state, state
FROM sys.shards
ORDER BY table_name, id;

SELECT table_name, shard_id, node_id, explanation
FROM sys.allocations
WHERE explanation IS NOT NULL
ORDER BY table_name, shard_id;

If latency increases during scale events:

Pause further node operations
Wait for shard relocation to complete
Verify breaker pressure has stabilized

Scaling and Clustering Principles

Supported Clustering Patterns

1. Distributed Shard-Based Cluster

2. Primary with Read Replica Topology

Cluster Sizing Baseline

Capacity Planning Dimensions

CPU

Memory

Storage

Network

Horizontal Scaling Model

Distributed Cluster Scaling

Primary–Replica Scaling

Practical Scale-Out Flow

Scale Trigger Decision Table

Shard Strategy Guidance

Too Few Shards

Too Many Shards

Shard Size Heuristics

Replica Strategy Guidance

Zone-Aware Resiliency

Kubernetes Scaling Principles

Docker and VM Deployment Principles

Node-Level vs Cluster-Level Settings

Node-Level Settings

Cluster-Level Settings

Rolling Operations

Scale Troubleshooting Checklist

Related Documentation