Skip to content
Monghoul
Free reference · MongoDB aggregation

MongoDB aggregation cheat sheet

Every pipeline stage, expression operator, and copy-paste pattern in one page. Covers MongoDB 4.4 through 8.0.

Updated April 2026

Basic pipeline

db.orders.aggregate([
  { $match: { status: "shipped", total: { $gte: 100 } } },
  { $group: {
    _id: "$region",
    revenue: { $sum: "$total" },
    count: { $sum: 1 }
  } },
  { $sort: { revenue: -1 } }
])

Filter with $match, aggregate with $group, order with $sort. Most pipelines are variations of this pattern.

SQL to aggregation

If you know SQL, you already know the concepts. Each SQL clause maps to an aggregation stage.

SQL clause to MongoDB aggregation stage mapping
SQL Aggregation stage
SELECT a, b $project
SELECT *, a+b AS c $addFields / $set
WHERE $match
GROUP BY $group
HAVING $match (after $group)
ORDER BY $sort
LIMIT / OFFSET $limit / $skip
JOIN $lookup
UNION ALL $unionWith
COUNT(*) $count
DISTINCT $group
WINDOW / OVER $setWindowFields
INSERT INTO ... SELECT $merge / $out
CASE WHEN $cond / $switch
COALESCE $ifNull
CAST $convert / $toInt / ...
EXISTS (subquery) $lookup + $match
LATERAL JOIN $lookup (pipeline)
CREATE MATERIALIZED VIEW $merge / $out

How the pipeline works

Documents flow left to right through stages. Each stage transforms the stream and passes results to the next.

The document count can shrink ($match, $group), stay the same ($addFields, $sort, $project), grow ($unwind, $unionWith), or branch ($facet). Place stages that reduce document count early to keep later stages fast.

Monghoul's aggregation builder shows this flow visually with per-stage output previews and "Run to here" controls. See how it works →

Pipeline stages

Filtering

$match docs ↓

Filters documents. Same query syntax as find().

{ $match: { status: "active", age: { $gte: 18 } } }
Example data
Input
[
  { name: "Alice", status: "active", age: 25 },
  { name: "Bob", status: "inactive", age: 30 },
  { name: "Carol", status: "active", age: 15 }
]
Output
[
  { name: "Alice", status: "active", age: 25 }
]
$limit docs ↓

Passes only the first N documents.

{ $limit: 10 }
$sample docs ↓

Randomly selects N documents from the input.

{ $sample: { size: 5 } }
$geoNear docs ↓

Returns documents sorted by proximity to a GeoJSON point. Must be the first stage.

{ $geoNear: {
  near: { type: "Point", coordinates: [-73.99, 40.73] },
  distanceField: "dist",
  maxDistance: 5000,
  spherical: true
} }

Projection

$project docs =

Includes, excludes, or computes fields. 1 to include, 0 to exclude.

{ $project: { name: 1, total: { $multiply: ["$price", "$qty"] } } }
Example data
Input
[
  { _id: 1, name: "Widget", price: 25, qty: 4, sku: "W1" }
]
Output
[
  { _id: 1, name: "Widget", total: 100 }
]
$addFields / $set 4.2+ docs =

Adds new fields or overwrites existing ones. $set is an alias for $addFields.

{ $addFields: { totalPrice: { $multiply: ["$price", "$qty"] } } }
Example data
Input
[
  { _id: 1, price: 25, qty: 4, sku: "W1" }
]
Output
[
  { _id: 1, price: 25, qty: 4, sku: "W1", totalPrice: 100 }
]
$unset 4.2+ docs =

Removes fields. Shorthand for $project with exclusions.

{ $unset: ["tempField", "internal.debug"] }
$replaceRoot / $replaceWith docs =

Replaces the entire document with a specified sub-document or expression.

{ $replaceRoot: { newRoot: "$address" } }
// or equivalently:
{ $replaceWith: "$address" }
Example data
Input
[
  { _id: 1, name: "Alice", address: { city: "NYC", zip: "10001" } }
]
Output
[
  { city: "NYC", zip: "10001" }
]

Array handling

$unwind docs ↑

Deconstructs an array field into one document per element.

{ $unwind: "$tags" }
// with options:
{ $unwind: { path: "$tags", preserveNullAndEmptyArrays: true } }
Example data
Input
[
  { _id: 1, item: "Shirt", tags: ["cotton", "sale"] }
]
Output
[
  { _id: 1, item: "Shirt", tags: "cotton" },
  { _id: 1, item: "Shirt", tags: "sale" }
]

Grouping

$group docs ↓

Groups documents by a key and applies accumulators.

{ $group: {
  _id: "$department",
  total: { $sum: "$salary" },
  count: { $sum: 1 },
  avgSalary: { $avg: "$salary" }
} }
Example data
Input
[
  { name: "Alice", department: "eng", salary: 120 },
  { name: "Bob", department: "eng", salary: 100 },
  { name: "Carol", department: "sales", salary: 90 }
]
Output
[
  { _id: "eng", total: 220, count: 2, avgSalary: 110 },
  { _id: "sales", total: 90, count: 1, avgSalary: 90 }
]
$count docs → 1

Returns a document with the count of input documents. As a stage, not the accumulator of the same name.

{ $count: "totalDocuments" }
Example data
Input
// 142 input documents
Output
[ { totalDocuments: 142 } ]
$bucket docs ↓

Groups documents into buckets by value ranges.

{ $bucket: {
  groupBy: "$price",
  boundaries: [0, 50, 100, 200, Infinity],
  default: "other",
  output: { count: { $sum: 1 } }
} }
Example data
Input
[
  { item: "A", price: 10 },
  { item: "B", price: 75 },
  { item: "C", price: 150 }
]
Output
[
  { _id: 0, count: 1 },
  { _id: 50, count: 1 },
  { _id: 100, count: 1 }
]
$bucketAuto docs ↓

Automatically divides documents into a specified number of evenly distributed buckets.

{ $bucketAuto: { groupBy: "$score", buckets: 5 } }
Example data
Input
// 10 documents with scores 10-99
Output
[
  { _id: { min: 10, max: 38 }, count: 2 },
  { _id: { min: 38, max: 64 }, count: 2 },
  { _id: { min: 64, max: 85 }, count: 2 },
  { _id: { min: 85, max: 95 }, count: 2 },
  { _id: { min: 95, max: 99 }, count: 2 }
]
$sortByCount docs ↓

Groups by a field and sorts by count descending. Shorthand for $group + $sort.

{ $sortByCount: "$category" }
Example data
Input
[
  { item: "A", category: "food" },
  { item: "B", category: "food" },
  { item: "C", category: "tech" }
]
Output
[
  { _id: "food", count: 2 },
  { _id: "tech", count: 1 }
]

Sorting

$sort docs =

Orders documents. 1 for ascending, -1 for descending.

{ $sort: { createdAt: -1, name: 1 } }
Example data
Input
[
  { name: "Bob", createdAt: "2024-01-01" },
  { name: "Alice", createdAt: "2024-03-15" },
  { name: "Carol", createdAt: "2024-02-10" }
]
Output
[
  { name: "Alice", createdAt: "2024-03-15" },
  { name: "Carol", createdAt: "2024-02-10" },
  { name: "Bob", createdAt: "2024-01-01" }
]

Joining

$lookup docs =

Left outer join to another collection in the same database.

// basic form:
{ $lookup: {
  from: "orders",
  localField: "_id",
  foreignField: "userId",
  as: "userOrders"
} }

// pipeline form (correlated sub-query):
{ $lookup: {
  from: "orders",
  let: { uid: "$_id" },
  pipeline: [
    { $match: {
      $expr: { $eq: ["$userId", "$$uid"] }
    } },
    { $sort: { date: -1 } },
    { $limit: 5 }
  ],
  as: "recentOrders"
} }
Example data
Input
// users collection:
[{ _id: 1, name: "Alice" }]
// orders collection:
[{ userId: 1, total: 50 }, { userId: 1, total: 80 }]
Output
[
  { _id: 1, name: "Alice",
    userOrders: [
      { userId: 1, total: 50 },
      { userId: 1, total: 80 }
    ] }
]
$graphLookup docs =

Recursive lookup for tree and graph structures (org charts, categories, social graphs).

{ $graphLookup: {
  from: "employees",
  startWith: "$managerId",
  connectFromField: "managerId",
  connectToField: "_id",
  as: "reportingChain",
  maxDepth: 5,
  depthField: "level"
} }
Example data
Input
// employees collection (pipeline starts with { $match: { name: "Alice" } }):
[
  { _id: 1, name: "Alice", managerId: 2 },
  { _id: 2, name: "Bob", managerId: 3 },
  { _id: 3, name: "Carol", managerId: null }
]
Output
[
  { _id: 1, name: "Alice", managerId: 2,
    reportingChain: [
      { _id: 2, name: "Bob", managerId: 3, level: 0 },
      { _id: 3, name: "Carol", managerId: null, level: 1 }
    ] }
]
$unionWith 4.4+ docs ↑

Appends documents from another collection to the pipeline. Like SQL UNION ALL.

{ $unionWith: { coll: "archivedOrders" } }
// with a pipeline:
{ $unionWith: {
  coll: "archivedOrders",
  pipeline: [{ $match: { status: "shipped" } }]
} }
Example data
Input
// currentOrders (pipeline source):
[{ _id: 1, item: "Laptop", status: "active" }]
// archivedOrders (joined via $unionWith):
[{ _id: 99, item: "Mouse", status: "shipped" }]
Output
[
  { _id: 1, item: "Laptop", status: "active" },
  { _id: 99, item: "Mouse", status: "shipped" }
]

Window functions

$setWindowFields 5.0+ docs =

Adds fields computed over a window of documents (running totals, ranks, moving averages).

{ $setWindowFields: {
  partitionBy: "$department",
  sortBy: { salary: -1 },
  output: {
    rank: { $rank: {} },
    runningTotal: {
      $sum: "$salary",
      window: { documents: ["unbounded", "current"] }
    }
  }
} }
Example data
Input
[
  { department: "eng", name: "Alice", salary: 120 },
  { department: "eng", name: "Bob", salary: 100 },
  { department: "sales", name: "Carol", salary: 90 }
]
Output
[
  { department: "eng", name: "Alice", salary: 120,
    rank: 1, runningTotal: 120 },
  { department: "eng", name: "Bob", salary: 100,
    rank: 2, runningTotal: 220 },
  { department: "sales", name: "Carol", salary: 90,
    rank: 1, runningTotal: 90 }
]

Multi-pipeline

$facet docs → 1

Runs multiple sub-pipelines in parallel on the same input. Returns one document with each pipeline result as a field.

{ $facet: {
  priceRanges: [
    { $bucket: { groupBy: "$price", boundaries: [0, 50, 100, 500] } }
  ],
  topRated: [
    { $sort: { rating: -1 } },
    { $limit: 5 }
  ],
  totalCount: [
    { $count: "count" }
  ]
} }
Example data
Input
// all input documents are sent
// to each sub-pipeline
Output
[
  {
    priceRanges: [{ _id: 0, count: 3 }, ...],
    topRated: [{ name: "A", rating: 5 }, ...],
    totalCount: [{ count: 42 }]
  }
]

Output

$out docs =

Writes pipeline results to a collection, replacing its contents. Must be the last stage.

{ $out: "monthlySummary" }
// cross-database (4.4+):
{ $out: { db: "reporting", coll: "monthlySummary" } }
Example data
Input
// pipeline result:
[
  { _id: "US-East", revenue: 18400 },
  { _id: "EU-West", revenue: 9300 }
]
// monthlySummary before (any prior contents):
[ { _id: "OLD", revenue: 999 } ]
Output
// monthlySummary after $out (fully replaced):
[
  { _id: "US-East", revenue: 18400 },
  { _id: "EU-West", revenue: 9300 }
]
$merge 4.2+ docs =

Merges pipeline results into a collection. More flexible than $out: can insert, merge, replace, or fail on match.

{ $merge: {
  into: "monthlySummary",
  on: "_id",
  whenMatched: "merge",
  whenNotMatched: "insert"
} }
Example data
Input
// pipeline result:
[
  { _id: "US-East", revenue: 18400 },
  { _id: "EU-West", revenue: 9300 }
]
// monthlySummary before:
[
  { _id: "US-East", revenue: 15000 },
  { _id: "APAC", revenue: 5000 }
]
Output
// monthlySummary after $merge:
[
  { _id: "US-East", revenue: 18400 },  // matched, merged
  { _id: "APAC", revenue: 5000 },      // untouched
  { _id: "EU-West", revenue: 9300 }    // inserted
]

Other stages

$redact docs ↓

Restricts document content based on field-level conditions. Returns "$$DESCEND", "$$PRUNE", or "$$KEEP".

{ $redact: {
  $cond: {
    if: { $eq: ["$level", "public"] },
    then: "$$DESCEND",
    else: "$$PRUNE"
  }
} }
Example data
Input
[
  { _id: 1, level: "public", title: "News",
    section: { level: "secret", body: "hidden" } },
  { _id: 2, level: "secret", title: "Internal" }
]
Output
[
  { _id: 1, level: "public", title: "News" }
  // nested "secret" section pruned
  // doc 2 pruned entirely
]
$densify 5.1+ docs ↑

Fills gaps in a sequence (dates, numbers) by inserting missing documents.

{ $densify: {
  field: "date",
  range: { step: 1, unit: "day", bounds: "full" }
} }
Example data
Input
[
  { date: ISODate("2024-03-01"), visits: 120 },
  { date: ISODate("2024-03-03"), visits: 95 },
  { date: ISODate("2024-03-05"), visits: 140 }
]
Output
[
  { date: ISODate("2024-03-01"), visits: 120 },
  { date: ISODate("2024-03-02") },                // filled
  { date: ISODate("2024-03-03"), visits: 95 },
  { date: ISODate("2024-03-04") },                // filled
  { date: ISODate("2024-03-05"), visits: 140 }
]
$fill 5.3+ docs =

Fills null or missing field values using a specified method.

{ $fill: {
  sortBy: { date: 1 },
  output: {
    temperature: { method: "linear" },
    status: { value: "unknown" }
  }
} }
Example data
Input
[
  { date: "2024-03-01", temperature: 10, status: "ok" },
  { date: "2024-03-02", temperature: null },
  { date: "2024-03-03", temperature: 16 }
]
Output
[
  { date: "2024-03-01", temperature: 10, status: "ok" },
  { date: "2024-03-02", temperature: 13, status: "unknown" },
  { date: "2024-03-03", temperature: 16, status: "unknown" }
]
$documents 6.0+ docs =

Creates documents from expressions. Useful for generating test data or constants in a pipeline.

{ $documents: [
  { name: "Alice", role: "admin" },
  { name: "Bob", role: "user" }
] }
$changeStream 4.0+ docs =

Returns a cursor over real-time change events on a collection. Must be the first stage.

{ $changeStream: {
  fullDocument: "updateLookup",
  startAtOperationTime: Timestamp(1690000000, 1)
} }

Admin and diagnostics

$collStats 3.4+ docs → 1

Returns storage and query-execution statistics for the collection. Must be the first stage.

{ $collStats: {
  storageStats: {},
  count: {},
  queryExecStats: {}
} }
$indexStats 3.2+ docs → N

Returns usage statistics for each index on the collection. Must be the first stage.

{ $indexStats: {} }
Example data
Input
// no input documents; reads index metadata
Output
[
  { name: "status_1", accesses: { ops: 142, since: ISODate("2024-01-01") } },
  { name: "_id_", accesses: { ops: 9800, since: ISODate("2024-01-01") } }
]
$planCacheStats 4.2+ docs → N

Returns plan cache entries for the collection. Useful for diagnosing slow queries.

{ $planCacheStats: {} }
$currentOp 3.6+ docs → N

Returns documents describing active and/or dormant operations on the server. Run on the admin database.

{ $currentOp: { allUsers: true, idleSessions: false } }
$listSessions 3.6+ docs → N

Lists active sessions on the server. Run on the config database.

{ $listSessions: { allUsers: true } }
$shardedDataDistribution 7.0+ docs → N

Returns data distribution metrics across shards for each sharded collection.

{ $shardedDataDistribution: {} }

Expressions and operators

Expression operators transform values inside stages like $project and $addFields. Query operators (used inside $match) are included at the end for quick reference.

Accumulators (used in $group, $setWindowFields)

Items marked window only work inside $setWindowFields.

$sum

Sum of numeric values.

{ $sum: "$qty" }
$avg

Average of numeric values.

{ $avg: "$price" }
$min

Lowest value.

{ $min: "$score" }
$max

Highest value.

{ $max: "$score" }
$first

First value in the group.

{ $first: "$name" }
$last

Last value in the group.

{ $last: "$name" }
$push

Array of all values (including duplicates).

{ $push: "$tag" }
$addToSet

Array of unique values.

{ $addToSet: "$tag" }
$count 5.0+

Count of documents in the group. Accumulator form; for a top-level total, use the $count stage instead.

{ $count: {} }
$top 5.2+

Top document by sort order.

{ $top: { sortBy: { score: -1 }, output: "$name" } }
$topN 5.2+

Top N documents by sort order.

{ $topN: { n: 3, sortBy: { score: -1 }, output: "$name" } }
$bottom 5.2+

Bottom document by sort order.

{ $bottom: { sortBy: { score: -1 }, output: "$name" } }
$bottomN 5.2+

Bottom N documents by sort order.

{ $bottomN: { n: 3, sortBy: { score: -1 }, output: "$name" } }
$firstN 5.2+

Array of first N values.

{ $firstN: { n: 3, input: "$item" } }
$lastN 5.2+

Array of last N values.

{ $lastN: { n: 3, input: "$item" } }
$median 7.0+

Median value (50th percentile).

{ $median: { input: "$score", method: "approximate" } }
$percentile 7.0+

Value at a given percentile.

{ $percentile: { input: "$score", p: [0.9], method: "approximate" } }
$stdDevSamp

Sample standard deviation.

{ $stdDevSamp: "$score" }
$stdDevPop

Population standard deviation.

{ $stdDevPop: "$score" }
$expMovingAvg 5.0+ window

Exponential moving average. Smooths a time series by weighting recent points more heavily.

{ $expMovingAvg: { input: "$price", N: 3 } }
Example data
Input
[
  { day: 1, price: 100 },
  { day: 2, price: 110 },
  { day: 3, price: 105 }
]
Output
[
  { day: 1, price: 100, ema: 100 },
  { day: 2, price: 110, ema: 105 },
  { day: 3, price: 105, ema: 105 }
]
$minN 5.2+

Array of N smallest values.

{ $minN: { n: 3, input: "$score" } }
$maxN 5.2+

Array of N largest values.

{ $maxN: { n: 3, input: "$score" } }
$rank 5.0+ window

Rank of the document within the partition. Ties get the same rank and the next rank skips.

{ $rank: {} }
Example data
Input
[
  { name: "Alice", score: 95 },
  { name: "Bob", score: 95 },
  { name: "Carol", score: 80 }
]
Output
[
  { name: "Alice", score: 95, rank: 1 },
  { name: "Bob", score: 95, rank: 1 },
  { name: "Carol", score: 80, rank: 3 }   // gap after tie
]
$denseRank 5.0+ window

Rank within the partition without gaps after ties.

{ $denseRank: {} }
Example data
Input
[
  { name: "Alice", score: 95 },
  { name: "Bob", score: 95 },
  { name: "Carol", score: 80 }
]
Output
[
  { name: "Alice", score: 95, rank: 1 },
  { name: "Bob", score: 95, rank: 1 },
  { name: "Carol", score: 80, rank: 2 }   // no gap
]
$documentNumber 5.0+ window

Position of the document in the partition (1-based, no ties).

{ $documentNumber: {} }
$shift 5.0+ window

Value from a document at a relative offset. Use negative for previous, positive for next.

{ $shift: { output: "$price", by: -1, default: 0 } }
Example data
Input
[
  { date: "2024-03-01", price: 100 },
  { date: "2024-03-02", price: 110 },
  { date: "2024-03-03", price: 108 }
]
Output
[
  { date: "2024-03-01", price: 100, prev: 0 },
  { date: "2024-03-02", price: 110, prev: 100 },
  { date: "2024-03-03", price: 108, prev: 110 }
]
$derivative 5.0+ window

Rate of change between adjacent documents. Pair with a range window for time-based rates.

{ $derivative: { input: "$distance", unit: "hour" } }
Example data
Input
[
  { t: ISODate("2024-03-01T00:00Z"), distance: 0 },
  { t: ISODate("2024-03-01T01:00Z"), distance: 60 },
  { t: ISODate("2024-03-01T02:00Z"), distance: 130 }
]
Output
[
  { ..., speed: null },   // first doc has no prior
  { ..., speed: 60 },     // 60 miles / hour
  { ..., speed: 70 }
]
$linearFill 5.3+ window

Fill nulls with linear interpolation between known points.

{ $linearFill: "$temperature" }
Example data
Input
[
  { x: 1, y: 10 },
  { x: 2, y: null },
  { x: 3, y: 30 }
]
Output
[
  { x: 1, y: 10 },
  { x: 2, y: 20 },   // interpolated
  { x: 3, y: 30 }
]
$accumulator

Custom accumulator with init, accumulate, merge JS functions.

{ $accumulator: { init: "function() { ... }", ... } }
$function

Run a custom JavaScript function as an expression.

{ $function: { body: "function(name) { ... }", args: ["$name"], lang: "js" } }
$integral 5.0+ window

Numeric integral (area under the curve) over a window.

{ $integral: { input: "$speed", unit: "hour" } }
Example data
Input
[
  { t: 0, speed: 10 },
  { t: 1, speed: 20 },
  { t: 2, speed: 30 }
]
Output
[
  { t: 0, ..., total: 0 },
  { t: 1, ..., total: 15 },   // trapezoid area
  { t: 2, ..., total: 40 }
]
$covarianceSamp 5.0+ window

Sample covariance of two expressions.

{ $covarianceSamp: ["$x", "$y"] }
$covariancePop 5.0+ window

Population covariance of two expressions.

{ $covariancePop: ["$x", "$y"] }
$locf 5.2+ window

Last observation carried forward. Fills nulls with the previous non-null value.

{ $locf: "$temperature" }
Example data
Input
[
  { date: "2024-03-01", temperature: 12 },
  { date: "2024-03-02", temperature: null },
  { date: "2024-03-03", temperature: 18 }
]
Output
[
  { date: "2024-03-01", temperature: 12 },
  { date: "2024-03-02", temperature: 12 },   // carried forward
  { date: "2024-03-03", temperature: 18 }
]

Query operators (in $match)

$in (query)

Matches any value in a list. Not the same as the $in expression operator.

{ field: { $in: ["a", "b", "c"] } }
$nin

Matches none of the values in a list.

{ status: { $nin: ["deleted", "archived"] } }
$exists

Matches documents where field exists (or not).

{ email: { $exists: true } }
$regex (query)

Matches strings by regular expression.

{ name: { $regex: /^test/i } }
$elemMatch (query)

Matches array element satisfying all conditions.

{ scores: { $elemMatch: { $gte: 80, $lt: 90 } } }
$all

Matches arrays containing all specified elements.

{ tags: { $all: ["mongodb", "aggregation"] } }
$not (query)

Inverts the effect of a query operator.

{ price: { $not: { $gt: 100 } } }
$nor

Joins query clauses with logical NOR.

{ $nor: [{ price: 0 }, { deleted: true }] }
$text

Full-text search on text-indexed fields.

{ $text: { $search: "coffee shop" } }
$expr (query)

Use aggregation expressions in $match. Note: usually prevents index use.

{ $expr: { $gt: ["$qty", "$minQty"] } }

Conditional

$cond

If-then-else expression.

{ $cond: { if: { $gte: ["$score", 90] }, then: "A", else: "B" } }
Example data
Input
{ score: 95 }
{ score: 72 }
Output
{ grade: "A" }
{ grade: "B" }
$switch

Multi-branch switch expression.

{ $switch: { branches: [ { case: { $eq: ["$status", 1] }, then: "active" } ], default: "unknown" } }
Example data
Input
{ status: 1 }
{ status: 5 }
Output
{ label: "active" }
{ label: "unknown" }
$ifNull

Returns the first non-null value. Like SQL COALESCE.

{ $ifNull: ["$nickname", "$name", "anonymous"] }
Example data
Input
[
  { nickname: "Al", name: "Alice" },
  { name: "Bob" },
  {}
]
Output
[
  { shown: "Al" },
  { shown: "Bob" },
  { shown: "anonymous" }
]

Comparison

$eq

Equal.

{ $eq: ["$status", "active"] }
$ne

Not equal.

{ $ne: ["$status", "deleted"] }
$gt

Greater than.

{ $gt: ["$age", 18] }
$gte

Greater than or equal.

{ $gte: ["$score", 90] }
$lt

Less than.

{ $lt: ["$qty", 10] }
$lte

Less than or equal.

{ $lte: ["$price", 100] }
$cmp

Three-way comparison (-1, 0, 1).

{ $cmp: ["$a", "$b"] }

Logical

$and

True if all expressions evaluate to true.

{ $and: [{ $gte: ["$age", 18] }, { $eq: ["$active", true] }] }
$or

True if any expression evaluates to true.

{ $or: [{ $eq: ["$role", "admin"] }, { $eq: ["$role", "owner"] }] }
$not

Inverts a boolean expression.

{ $not: [{ $gt: ["$score", 50] }] }

Array

$filter

Select matching array elements.

{ $filter: { input: "$items", as: "i", cond: { $gte: ["$$i.price", 50] } } }
Example data
Input
{ items: [{ name: "A", price: 30 }, { name: "B", price: 80 }, { name: "C", price: 55 }] }
Output
{ result: [{ name: "B", price: 80 }, { name: "C", price: 55 }] }
$map

Transform each element.

{ $map: { input: "$items", as: "i", in: "$$i.name" } }
Example data
Input
{ items: [{ name: "A", price: 30 }, { name: "B", price: 80 }] }
Output
{ names: ["A", "B"] }
$reduce

Fold array to a single value.

{ $reduce: { input: "$items", initialValue: 0, in: { $add: ["$$value", "$$this.qty"] } } }
Example data
Input
{ items: [{ qty: 5 }, { qty: 10 }, { qty: 3 }] }
Output
{ totalQty: 18 }
$size

Length of an array.

{ $size: "$items" }
$in

True if value is in array.

{ $in: ["admin", "$roles"] }
$arrayElemAt

Element at index. Negative indices count from the end.

{ $arrayElemAt: ["$arr", 0] }
Example data
Input
{ arr: ["a", "b", "c", "d"] }
Output
{ first: "a", last: "d" }   // index 0 vs -1
$first (array) 4.4+

First element of an array (expression, not accumulator).

{ $first: "$items" }
$last (array) 4.4+

Last element of an array (expression, not accumulator).

{ $last: "$items" }
$slice

Subarray by position and count.

{ $slice: ["$arr", 0, 3] }
Example data
Input
{ arr: [10, 20, 30, 40, 50] }
Output
{ result: [10, 20, 30] }   // $slice: ["$arr", 0, 3]
$concatArrays

Concatenate arrays.

{ $concatArrays: ["$a", "$b"] }
Example data
Input
{ a: [1, 2], b: [3, 4] }
Output
{ result: [1, 2, 3, 4] }
$isArray

True if value is an array.

{ $isArray: "$field" }
$sortArray 5.2+

Sort an array by a field or expression.

{ $sortArray: { input: "$items", sortBy: { price: -1 } } }
Example data
Input
{ items: [
  { name: "C", price: 30 },
  { name: "A", price: 80 },
  { name: "B", price: 50 }
] }
Output
{ sorted: [
  { name: "A", price: 80 },
  { name: "B", price: 50 },
  { name: "C", price: 30 }
] }
$indexOfArray

Index of first match.

{ $indexOfArray: ["$tags", "urgent"] }
$reverseArray

Reverse an array.

{ $reverseArray: "$arr" }
Example data
Input
{ arr: ["a", "b", "c"] }
Output
{ result: ["c", "b", "a"] }
$range

Generate an array of integers [start, end) with optional step.

{ $range: [0, 10, 2] }
Example data
Input
// no input fields needed
Output
{ result: [0, 2, 4, 6, 8] }   // $range: [0, 10, 2]
$zip 3.4+

Merge arrays element-wise into arrays of pairs.

{ $zip: { inputs: ["$keys", "$values"] } }
Example data
Input
{ keys: ["a", "b", "c"], values: [1, 2, 3] }
Output
{ result: [["a", 1], ["b", 2], ["c", 3]] }
$setUnion

Union of arrays (unique).

{ $setUnion: ["$a", "$b"] }
Example data
Input
{ a: [1, 2, 3], b: [2, 3, 4] }
Output
{ result: [1, 2, 3, 4] }
$setIntersection

Intersection of arrays.

{ $setIntersection: ["$a", "$b"] }
Example data
Input
{ a: [1, 2, 3], b: [2, 3, 4] }
Output
{ result: [2, 3] }
$setDifference

Elements in first but not second.

{ $setDifference: ["$a", "$b"] }
Example data
Input
{ a: [1, 2, 3], b: [2, 3, 4] }
Output
{ result: [1] }
$setEquals

True if two arrays have the same distinct elements.

{ $setEquals: ["$a", "$b"] }
$setIsSubset

True if all elements of the first array appear in the second.

{ $setIsSubset: ["$a", "$b"] }
$allElementsTrue

True if no element is false, 0, null, or undefined.

{ $allElementsTrue: ["$flags"] }
$anyElementTrue

True if any element is not false, 0, null, or undefined.

{ $anyElementTrue: ["$flags"] }

String

$concat

Concatenate strings.

{ $concat: ["$first", " ", "$last"] }
$substr

Substring by byte index and length. Use $substrCP for unicode-safe operations.

{ $substr: ["$name", 0, 3] }
Example data
Input
{ name: "Alice Smith" }
Output
{ result: "Ali" }   // start 0, length 3
$toUpper

Uppercase.

{ $toUpper: "$name" }
$toLower

Lowercase.

{ $toLower: "$email" }
$split

Split string into array.

{ $split: ["$fullName", " "] }
Example data
Input
{ fullName: "Alice Marie Johnson" }
Output
{ parts: ["Alice", "Marie", "Johnson"] }
$trim

Remove leading/trailing whitespace, or specified characters.

{ $trim: { input: "$name" } }
Example data
Input
{ name: "  Alice  " }
Output
{ result: "Alice" }
$regexMatch

True if string matches regex. Add options: "i" for case-insensitive.

{ $regexMatch: { input: "$email", regex: /@example\.com$/ } }
$regexFind

First regex match with captures. Add options: "i" for case-insensitive.

{ $regexFind: { input: "$text", regex: /\d+/ } }
Example data
Input
{ text: "Order #1234 from 2024" }
Output
{ match: { match: "1234", idx: 7, captures: [] } }
$regexFindAll

All regex matches with captures.

{ $regexFindAll: { input: "$text", regex: /\d+/ } }
Example data
Input
{ text: "Order #1234 and #5678" }
Output
{ matches: [
  { match: "1234", idx: 7, captures: [] },
  { match: "5678", idx: 17, captures: [] }
] }
$replaceOne 4.4+

Replace first occurrence of a search string.

{ $replaceOne: { input: "$str", find: "foo", replacement: "bar" } }
Example data
Input
{ str: "foo bar foo baz" }
Output
{ result: "bar bar foo baz" }
$replaceAll 4.4+

Replace all occurrences of a search string.

{ $replaceAll: { input: "$str", find: "foo", replacement: "bar" } }
Example data
Input
{ str: "foo bar foo baz" }
Output
{ result: "bar bar bar baz" }
$strLenCP

String length in code points (characters).

{ $strLenCP: "$name" }
$indexOfCP

Index of substring in code points. Returns -1 if not found.

{ $indexOfCP: ["$name", "world"] }
$substrCP

Substring by code-point index and length. Unicode-safe alternative to $substr.

{ $substrCP: ["$name", 0, 5] }
$ltrim

Strip leading whitespace or specified characters.

{ $ltrim: { input: "$str", chars: " \t" } }
$rtrim

Strip trailing whitespace or specified characters.

{ $rtrim: { input: "$str", chars: " \t" } }
$strcasecmp

Case-insensitive comparison. Returns -1, 0, or 1.

{ $strcasecmp: ["$a", "$b"] }
$substrBytes

Substring by byte index and length.

{ $substrBytes: ["$name", 0, 5] }
$strLenBytes

String length in bytes.

{ $strLenBytes: "$name" }
$indexOfBytes

Byte index of substring. Returns -1 if not found.

{ $indexOfBytes: ["$name", "world"] }

Arithmetic

$add

Add numbers or add milliseconds to a date.

{ $add: ["$price", "$tax"] }
$subtract

Subtract numbers or dates.

{ $subtract: ["$total", "$discount"] }
$multiply

Multiply numbers.

{ $multiply: ["$price", "$qty"] }
$divide

Divide numbers.

{ $divide: ["$total", "$count"] }
$mod

Remainder of division.

{ $mod: ["$num", 2] }
$round

Round to decimal place.

{ $round: ["$avg", 2] }
$ceil

Round up to nearest integer.

{ $ceil: "$price" }
$floor

Round down to nearest integer.

{ $floor: "$price" }
$abs

Absolute value.

{ $abs: "$change" }
$rand

Random float between 0 and 1.

{ $rand: {} }
$pow

Raise to a power.

{ $pow: ["$base", 2] }
$sqrt

Square root.

{ $sqrt: "$variance" }
$log

Log in a specified base.

{ $log: ["$val", 10] }
$log10

Log base 10.

{ $log10: "$val" }
$ln

Natural logarithm.

{ $ln: "$val" }
$exp

Euler's number (e) raised to a power.

{ $exp: "$val" }
$trunc

Truncate to integer or specified decimal place.

{ $trunc: ["$price", 2] }

Date

$dateToString

Format date as string.

{ $dateToString: { format: "%Y-%m-%d", date: "$created" } }
Example data
Input
{ created: ISODate("2024-03-15T14:30:00Z") }
Output
{ dateStr: "2024-03-15" }
$dateFromString

Parse string to date.

{ $dateFromString: { dateString: "2024-01-15" } }
$dateAdd 5.0+

Add interval to date.

{ $dateAdd: { startDate: "$created", unit: "day", amount: 30 } }
Example data
Input
{ created: ISODate("2024-03-15") }
Output
{ later: ISODate("2024-04-14") }   // +30 days
$dateSubtract 5.0+

Subtract interval from date.

{ $dateSubtract: { startDate: "$created", unit: "month", amount: 3 } }
$dateDiff 5.0+

Difference between two dates.

{ $dateDiff: { startDate: "$start", endDate: "$end", unit: "day" } }
Example data
Input
{ start: ISODate("2024-03-01"), end: ISODate("2024-03-15") }
Output
{ days: 14 }
$year

Extract year.

{ $year: "$created" }
$month

Extract month (1-12).

{ $month: "$created" }
$dayOfMonth

Extract day of month (1-31).

{ $dayOfMonth: "$created" }
$hour

Extract hour (0-23).

{ $hour: "$created" }
$minute

Extract minute (0-59).

{ $minute: "$created" }
$second

Extract second (0-59).

{ $second: "$created" }
$week

Week of year (0-53).

{ $week: "$created" }
$dayOfWeek

Day of week (1=Sun, 7=Sat).

{ $dayOfWeek: "$created" }
$dateTrunc 5.0+

Truncate date to a unit (day, week, month, etc.).

{ $dateTrunc: { date: "$created", unit: "month" } }
Example data
Input
{ created: ISODate("2024-03-15T14:30:00Z") }
Output
{ start: ISODate("2024-03-01T00:00:00Z") }   // unit: "month"
$dateFromParts

Build a date from individual components.

{ $dateFromParts: { year: 2024, month: 6, day: 15 } }
Example data
Input
// no input fields needed
Output
{ date: ISODate("2024-06-15T00:00:00Z") }
$dateToParts

Break a date into year, month, day, hour, minute, second, millisecond.

{ $dateToParts: { date: "$created" } }
Example data
Input
{ created: ISODate("2024-03-15T14:30:00Z") }
Output
{ parts: {
  year: 2024, month: 3, day: 15,
  hour: 14, minute: 30, second: 0, millisecond: 0
} }
$millisecond

Extract millisecond (0-999).

{ $millisecond: "$created" }
$dayOfYear

Day of year (1-366).

{ $dayOfYear: "$created" }
$isoWeek

ISO week number (1-53).

{ $isoWeek: "$created" }
$isoWeekYear

ISO week-numbering year.

{ $isoWeekYear: "$created" }
$isoDayOfWeek

ISO day of week (1=Mon, 7=Sun).

{ $isoDayOfWeek: "$created" }
$tsSecond 5.1+

Seconds from a BSON Timestamp value.

{ $tsSecond: "$ts" }
$tsIncrement 5.1+

Ordinal increment from a BSON Timestamp value.

{ $tsIncrement: "$ts" }

Type and object

$type

BSON type of a field.

{ $type: "$field" }
$convert

Convert value to a specified type. onError lets the pipeline survive bad data.

{ $convert: { input: "$val", to: "int", onError: 0 } }
Example data
Input
[
  { val: "42" },
  { val: "abc" },
  { val: null }
]
Output
[
  { result: 42 },
  { result: 0 },    // onError fired
  { result: 0 }     // onNull would fire if set
]
$toInt

Convert to integer.

{ $toInt: "$str" }
$toDouble

Convert to double.

{ $toDouble: "$str" }
$toString

Convert to string.

{ $toString: "$num" }
$toBool

Convert to boolean.

{ $toBool: "$val" }
$toDate

Convert to date.

{ $toDate: "$timestamp" }
$toObjectId

Convert to ObjectId.

{ $toObjectId: "$idString" }
$toLong

Convert to 64-bit integer.

{ $toLong: "$str" }
$toDecimal

Convert to 128-bit decimal.

{ $toDecimal: "$str" }
$mergeObjects

Merge multiple objects into one.

{ $mergeObjects: ["$defaults", "$overrides"] }
Example data
Input
{ defaults: { color: "blue", size: "M" }, overrides: { size: "L", qty: 2 } }
Output
{ result: { color: "blue", size: "L", qty: 2 } }
$objectToArray

Convert object to array of {k, v} pairs. Pair with $arrayToObject to transform unknown keys.

{ $objectToArray: "$metadata" }
Example data
Input
{ metadata: { name: "Alice", role: "admin" } }
Output
{ pairs: [
  { k: "name", v: "Alice" },
  { k: "role", v: "admin" }
] }
$arrayToObject

Convert array of {k, v} pairs to object. Useful for building dynamic field names.

{ $arrayToObject: "$pairs" }
Example data
Input
{ pairs: [
  { k: "name", v: "Alice" },
  { k: "role", v: "admin" }
] }
Output
{ result: { name: "Alice", role: "admin" } }
$getField 5.0+

Get field value by name. Supports keys containing dots or starting with $.

{ $getField: { field: "price.usd", input: "$data" } }
Example data
Input
{ data: { "price.usd": 9.99, "price.eur": 8.50 } }
Output
{ price: 9.99 }   // dotted key, no nested lookup
$setField 5.0+

Set field value by name. Lets you write keys containing dots or starting with $.

{ $setField: { field: "price.usd", input: "$data", value: 9.99 } }
Example data
Input
{ data: { name: "A" } }
Output
{ data: { name: "A", "price.usd": 9.99 } }
$let

Define variables for use in a sub-expression. Keeps complex projections readable.

{ $let: { vars: { total: { $multiply: ["$price", "$qty"] } }, in: { $gt: ["$$total", 100] } } }
Example data
Input
{ price: 25, qty: 6 }
Output
{ isExpensive: true }   // 25 * 6 = 150 > 100
$literal

Return a value without parsing. Use when a value looks like an expression (starts with $).

{ $literal: "$notAFieldRef" }
$isNumber 4.4+

True if value is an integer, double, long, or decimal.

{ $isNumber: "$age" }
$unsetField 5.0+

Remove a field by name (supports dots and $ prefixes).

{ $unsetField: { field: "temp", input: "$$ROOT" } }
$meta

Access text search score or index key.

{ $meta: "textScore" }

Trigonometry

$sin 4.2+

Sine (input in radians).

{ $sin: "$angle" }
$cos 4.2+

Cosine (input in radians).

{ $cos: "$angle" }
$tan 4.2+

Tangent (input in radians).

{ $tan: "$angle" }
$asin 4.2+

Arcsine. Returns radians.

{ $asin: "$val" }
$acos 4.2+

Arccosine. Returns radians.

{ $acos: "$val" }
$atan 4.2+

Arctangent. Returns radians.

{ $atan: "$val" }
$atan2 4.2+

Arctangent of y/x. Returns radians.

{ $atan2: ["$y", "$x"] }
$sinh 4.2+

Hyperbolic sine.

{ $sinh: "$val" }
$cosh 4.2+

Hyperbolic cosine.

{ $cosh: "$val" }
$tanh 4.2+

Hyperbolic tangent.

{ $tanh: "$val" }
$degreesToRadians 4.2+

Convert degrees to radians.

{ $degreesToRadians: "$angleDeg" }
$radiansToDegrees 4.2+

Convert radians to degrees.

{ $radiansToDegrees: "$angleRad" }

Bitwise

$bitAnd 6.3+

Bitwise AND of long integers.

{ $bitAnd: [7, 3] }
$bitOr 6.3+

Bitwise OR.

{ $bitOr: [4, 2] }
$bitXor 6.3+

Bitwise XOR.

{ $bitXor: [5, 3] }
$bitNot 6.3+

Bitwise NOT.

{ $bitNot: 5 }

Data size

$bsonSize 4.4+

BSON byte size of a document or field.

{ $bsonSize: "$$ROOT" }
$binarySize 4.4+

Byte size of a string or binary value.

{ $binarySize: "$data" }

Common patterns

01 Pagination with total count

Get one page of results and the total count in a single query using $facet.

db.products.aggregate([
  { $match: { category: "electronics" } },
  { $facet: {
    data: [
      { $sort: { createdAt: -1 } },
      { $skip: 20 },
      { $limit: 10 }
    ],
    total: [
      { $count: "count" }
    ]
  } }
])
Output
[
  {
    data: [
      { _id: 21, name: "USB-C Hub",
        price: 39, createdAt: "2024-03-14" },
      { _id: 22, name: "Webcam",
        price: 79, createdAt: "2024-03-13" }
      // ... up to 10 documents
    ],
    total: [{ count: 184 }]
  }
]
02 Left join with $lookup

Join two collections. The pipeline form lets you filter and shape the joined documents before they land in the array.

db.users.aggregate([
  { $lookup: {
    from: "orders",
    let: { userId: "$_id" },
    pipeline: [
      { $match: {
        $expr: { $eq: ["$userId", "$$userId"] }
      } },
      { $sort: { date: -1 } },
      { $limit: 5 },
      { $project: { _id: 0, date: 1, total: 1 } }
    ],
    as: "recentOrders"
  } }
])
Output
[
  {
    _id: 1, name: "Alice",
    recentOrders: [
      { date: "2024-03-15", total: 120 },
      { date: "2024-03-01", total: 85 }
    ]
  },
  {
    _id: 2, name: "Bob",
    recentOrders: []
  }
]
03 Running total with $setWindowFields

Calculate a cumulative sum partitioned by a group field. Requires MongoDB 5.0+.

db.transactions.aggregate([
  { $setWindowFields: {
    partitionBy: "$accountId",
    sortBy: { date: 1 },
    output: {
      runningBalance: {
        $sum: "$amount",
        window: { documents: ["unbounded", "current"] }
      }
    }
  } }
])
Output
[
  { accountId: "A1", date: "2024-03-01",
    amount: 500, runningBalance: 500 },
  { accountId: "A1", date: "2024-03-05",
    amount: -120, runningBalance: 380 },
  { accountId: "A1", date: "2024-03-12",
    amount: 200, runningBalance: 580 }
]
04 Top N per group

Get the top 3 products by revenue in each category.

db.sales.aggregate([
  { $group: {
    _id: "$category",
    topProducts: {
      $topN: {
        n: 3,
        sortBy: { revenue: -1 },
        output: { name: "$product", revenue: "$revenue" }
      }
    }
  } }
])
Output
[
  {
    _id: "electronics",
    topProducts: [
      { name: "Laptop", revenue: 45000 },
      { name: "Phone", revenue: 38000 },
      { name: "Tablet", revenue: 22000 }
    ]
  },
  {
    _id: "clothing",
    topProducts: [
      { name: "Jacket", revenue: 12000 },
      { name: "Shoes", revenue: 9500 }
    ]
  }
]
05 Pivot data (rows to columns)

Turn distinct field values into object keys. Useful for report-style output.

db.survey.aggregate([
  { $group: {
    _id: "$question",
    answers: { $push: { k: "$respondent", v: "$answer" } }
  } },
  { $addFields: { answers: { $arrayToObject: "$answers" } } },
  { $replaceRoot: {
    newRoot: { $mergeObjects: [{ question: "$_id" }, "$answers"] }
  } }
])
Output
// input: 6 rows, 2 questions × 3 respondents
// [
//   { question: "color", respondent: "alice", answer: "blue" },
//   { question: "color", respondent: "bob",   answer: "red"  },
//   ...
// ]
[
  { question: "color",
    alice: "blue", bob: "red", carol: "green" },
  { question: "size",
    alice: "M", bob: "L", carol: "S" }
]
06 De-duplication

Keep only the first document for each unique key. Combine $sort before $group to control which document wins.

db.events.aggregate([
  { $sort: { timestamp: -1 } },
  { $group: {
    _id: "$userId",
    latestEvent: { $first: "$$ROOT" }
  } },
  { $replaceRoot: { newRoot: "$latestEvent" } }
])
Output
// given 5 events from 2 users:
[
  { _id: "evt_3", userId: "u1",
    type: "click",
    timestamp: "2024-03-15T10:00:00Z" },
  { _id: "evt_5", userId: "u2",
    type: "purchase",
    timestamp: "2024-03-14T18:30:00Z" }
]
07 Date-based grouping (by day, week, month)

Group documents into time buckets using $dateToString or $dateTrunc.

// group by day:
db.orders.aggregate([
  { $group: {
    _id: {
      $dateToString: {
        format: "%Y-%m-%d", date: "$createdAt"
      }
    },
    total: { $sum: "$amount" },
    count: { $sum: 1 }
  } },
  { $sort: { _id: 1 } }
])

// group by month (5.0+):
db.orders.aggregate([
  { $group: {
    _id: {
      $dateTrunc: { date: "$createdAt", unit: "month" }
    },
    total: { $sum: "$amount" }
  } }
])
Output
// by-day result:
[
  { _id: "2024-03-14", total: 3800, count: 15 },
  { _id: "2024-03-15", total: 4200, count: 18 },
  { _id: "2024-03-16", total: 2900, count: 11 }
]
08 Time-zone-aware date grouping

Group by day or month in a specific time zone. Without the timezone option, $dateTrunc and $dateToString use UTC, which shifts results for users in other zones.

db.orders.aggregate([
  { $group: {
    _id: {
      $dateTrunc: {
        date: "$createdAt",
        unit: "day",
        timezone: "America/New_York"
      }
    },
    revenue: { $sum: "$amount" },
    orders: { $sum: 1 }
  } },
  { $sort: { _id: 1 } }
])
Output
[
  { _id: ISODate("2024-03-14T05:00:00Z"),
    revenue: 4200, orders: 18 },
  { _id: ISODate("2024-03-15T04:00:00Z"),
    revenue: 3800, orders: 15 }
]
// Note: _id is still UTC, but boundaries
// align to midnight in America/New_York
09 Gap-free time series with $densify

Fill missing dates so charts have no gaps. $densify inserts empty documents for missing intervals, then $fill sets default values. Requires MongoDB 5.1+.

db.dailyStats.aggregate([
  { $densify: {
    field: "date",
    range: {
      step: 1,
      unit: "day",
      bounds: [
        ISODate("2024-03-01"),
        ISODate("2024-03-08")
      ]
    }
  } },
  { $fill: {
    output: {
      visits: { value: 0 },
      revenue: { value: 0 }
    }
  } },
  { $sort: { date: 1 } }
])
Output
// before: 5 documents (March 2, 4, 5 missing)
// after: 7 documents, one per day
[
  { date: "2024-03-01", visits: 120, revenue: 800 },
  { date: "2024-03-02", visits: 0, revenue: 0 },
  { date: "2024-03-03", visits: 95, revenue: 650 },
  // ...every day present
]
10 Rolling window over dates

Compute a 7-day moving average using $setWindowFields with a date-based range window. Requires MongoDB 5.0+.

db.dailyMetrics.aggregate([
  { $setWindowFields: {
    sortBy: { date: 1 },
    output: {
      avg7d: {
        $avg: "$revenue",
        window: {
          range: [-6, "current"],
          unit: "day"
        }
      }
    }
  } }
])
Output
[
  { date: "2024-03-01", revenue: 800,
    avg7d: 800 },
  { date: "2024-03-02", revenue: 650,
    avg7d: 725 },
  { date: "2024-03-03", revenue: 920,
    avg7d: 790 },
  // avg7d smooths daily fluctuations
]
11 Multi-collection union

Combine documents from multiple collections into one result set, applying per-collection filters.

db.currentOrders.aggregate([
  { $match: { status: "active" } },
  { $unionWith: {
    coll: "archivedOrders",
    pipeline: [{ $match: { status: "active" } }]
  } },
  { $sort: { createdAt: -1 } }
])
Output
// documents from both collections, merged:
[
  { _id: 5, item: "Laptop", status: "active",
    createdAt: "2024-03-15" },
  { _id: 88, item: "Monitor", status: "active",
    createdAt: "2024-02-20" },
  { _id: 2, item: "Keyboard", status: "active",
    createdAt: "2024-01-10" }
]
12 Conditional aggregation

Compute different values based on a condition within a group.

db.orders.aggregate([
  { $group: {
    _id: "$region",
    totalRevenue: { $sum: "$amount" },
    onlineRevenue: {
      $sum: {
        $cond: [
          { $eq: ["$channel", "online"] },
          "$amount", 0
        ]
      }
    },
    storeRevenue: {
      $sum: {
        $cond: [
          { $eq: ["$channel", "store"] },
          "$amount", 0
        ]
      }
    }
  } }
])
Output
[
  { _id: "US-East",
    totalRevenue: 18400,
    onlineRevenue: 12200,
    storeRevenue: 6200 },
  { _id: "EU-West",
    totalRevenue: 9300,
    onlineRevenue: 7100,
    storeRevenue: 2200 }
]
13 Recursive graph traversal

Walk a parent-child hierarchy (org chart, category tree, social graph) up to a specified depth.

db.employees.aggregate([
  { $match: { name: "Alice" } },
  { $graphLookup: {
    from: "employees",
    startWith: "$managerId",
    connectFromField: "managerId",
    connectToField: "_id",
    as: "chain",
    maxDepth: 10,
    depthField: "level"
  } }
])
Output
[
  { _id: 4, name: "Alice", managerId: 2,
    chain: [
      { _id: 2, name: "Bob",
        managerId: 1, level: 0 },
      { _id: 1, name: "Carol",
        managerId: null, level: 1 }
    ] }
]
14 Materialized view with $merge

Incrementally refresh a summary collection. $merge inserts new documents and updates existing ones, so you only reprocess the delta.

db.orders.aggregate([
  { $match: {
    updatedAt: { $gte: ISODate("2024-03-01") }
  } },
  { $group: {
    _id: {
      region: "$region",
      month: {
        $dateTrunc: { date: "$createdAt", unit: "month" }
      }
    },
    revenue: { $sum: "$amount" },
    orders: { $sum: 1 }
  } },
  { $merge: {
    into: "monthly_summary",
    on: "_id",
    whenMatched: "replace",
    whenNotMatched: "insert"
  } }
])
Output
// monthly_summary collection after merge:
[
  { _id: { region: "US-East",
      month: ISODate("2024-03-01") },
    revenue: 18400, orders: 142 },
  { _id: { region: "EU-West",
      month: ISODate("2024-03-01") },
    revenue: 9300, orders: 87 }
]
15 Flatten nested arrays

Merge arrays of arrays into a single flat array using $reduce with $concatArrays.

db.departments.aggregate([
  { $project: {
    name: 1,
    allSkills: {
      $reduce: {
        input: "$teams.skills",
        initialValue: [],
        in: { $concatArrays: ["$$value", "$$this"] }
      }
    }
  } }
])
Output
[
  {
    _id: 1,
    name: "Engineering",
    allSkills: ["go", "rust", "react", "typescript", "python", "ml"]
  }
]
16 Dynamic field names

Process documents with unknown keys by converting objects to arrays, transforming, and converting back.

db.configs.aggregate([
  { $project: { entries: { $objectToArray: "$settings" } } },
  { $unwind: "$entries" },
  { $match: { "entries.v": { $ne: null } } },
  { $group: {
    _id: "$_id",
    settings: { $push: { k: "$entries.k", v: "$entries.v" } }
  } },
  { $addFields: { settings: { $arrayToObject: "$settings" } } }
])
Output
// input: { settings: { theme: "dark", lang: "en", beta: null } }
[
  { _id: 1,
    settings: { theme: "dark", lang: "en" } }
]

Common gotchas

Subtle pitfalls that cost real engineering time. Worth a skim before you ship a long pipeline.

BSON document size limit (16 MB)

A single output document cannot exceed 16 MB. $group with $push, $facet that returns a large result, and $out / $merge target documents all hit this limit. If $push might grow unboundedly, use $topN or chunk by a partition key.

// risky: tags array can exceed 16 MB across the dataset
{ $group: { _id: null, allTags: { $push: "$tag" } } }

// safer: cap with $topN
{ $group: {
  _id: null,
  topTags: { $topN: { n: 1000, sortBy: { count: -1 }, output: "$tag" } }
} }

$lookup foreign field must be indexed

Without an index on the foreign field, every input document triggers a full collection scan of the target. On 100k × 100k joins this is millions of scans. Always confirm with .explain() that $lookup uses an index.

$expr defeats indexes

When comparing against a constant, use standard query syntax in $match instead of $expr. $expr forces a collection scan even if the field is indexed. The exception is inside $lookup sub-pipelines, where $expr is required to reference outer $$variables.

$unwind multiplies document count

An array of N elements becomes N documents. Pipelines that $unwind then $group can briefly hold 10× the working set. When you only need to transform an array, use $filter, $map, or $reduce in place.

UTC by default for date operators

$dateTrunc, $dateToString, and $dateAdd use UTC unless you pass timezone. "Group by day" without timezone shifts boundaries for users in other zones. Either pass timezone explicitly or store dates already normalized to the relevant zone.

$out replaces, $merge upserts

$out destroys the target collection on every run, including its indexes. $merge is incremental and preserves the target. Use $out only when you genuinely want a fresh snapshot.

Pipeline stage 100 MB RAM limit

Each stage holds up to 100 MB of in-memory data. A $sort on a large unindexed field or a $group with many unique keys can blow this limit. Set allowDiskUse: true on the aggregate call, but treat the spill as a sign to filter or project earlier.

$count appears in two places with different shapes

Stage form: { $count: "fieldName" } returns one document with that field. Accumulator form: { $count: {} } inside $group. They are not interchangeable. The stage version is faster for total counts because MongoDB can optimize it internally.

Performance tips

  • Place $match as early as possible. It can use indexes and reduces the document count for every stage that follows.
  • $sort immediately after $match on the same fields can piggyback on a compound index, avoiding an in-memory sort.
    // index: { status: 1, createdAt: -1 }
    db.orders.aggregate([
      { $match: { status: "shipped" } },
      // uses the compound index, no in-memory sort
      { $sort: { createdAt: -1 } }
    ])
  • Use $project or $unset early to drop fields you do not need. Smaller documents mean less memory per stage.
  • Each pipeline stage has a 100 MB RAM limit by default. Pass { allowDiskUse: true } as an option to spill to disk for large datasets.
    db.orders.aggregate([
      { $group: {
        _id: "$customerId",
        total: { $sum: "$amount" }
      } },
      { $sort: { total: -1 } }
    ], { allowDiskUse: true })
  • In $lookup, make sure the foreign field (foreignField or the field matched inside the sub-pipeline) is indexed. An unindexed $lookup is effectively a nested collection scan.
  • Prefer $filter or $reduce over $unwind + $group when you only need to transform an array in place. $unwind creates N documents per array element.
  • Use $merge instead of $out for incremental materialized views. $merge can insert, update, or replace on match, while $out replaces the entire target collection.
  • Run .explain("executionStats") on your pipeline. Check whether the first $match uses an index (IXSCAN, not COLLSCAN) and how many documents each stage examines.
    db.orders
      .explain("executionStats")
      .aggregate([
        { $match: {
          status: "shipped",
          createdAt: { $gte: ISODate("2024-01-01") }
        } },
        { $group: {
          _id: "$region",
          total: { $sum: "$amount" }
        } }
      ])
  • Avoid $sort on unindexed fields with large result sets. If the sort exceeds the 100 MB limit and allowDiskUse is off, the query fails.
  • $count is faster than $group + $sum when you only need a total, because MongoDB can optimize it internally.
  • Avoid deep pagination with $skip. Skip is O(n): page 1000 of 50/page makes the server walk 50,000 documents before discarding them. For deep pages, use range-based pagination on an indexed field.
    // slow at high offsets:
    db.orders.aggregate([
      { $sort: { _id: 1 } },
      { $skip: 50000 },
      { $limit: 50 }
    ])
    
    // fast: use the last _id from the previous page
    db.orders.aggregate([
      { $match: { _id: { $gt: lastSeenId } } },
      { $sort: { _id: 1 } },
      { $limit: 50 }
    ])
  • $expr inside $match prevents index usage in most cases. For field-to-field comparisons, this means a collection scan. Inside $lookup sub-pipelines this is unavoidable, but for top-level $match, use standard query operators when comparing against constants.
    // slow: $expr prevents index use
    { $match: {
      $expr: { $gt: ["$qty", "$threshold"] }
    } }
    
    // comparing against a constant? use standard syntax:
    // uses { qty: 1 } index
    { $match: { qty: { $gt: 100 } } }

Frequently asked questions

What MongoDB version introduced the aggregation pipeline?

MongoDB 2.2 (2012). Each major release since then has added stages and operators. $setWindowFields arrived in 5.0, $densify and $fill in 5.1/5.3, $documents in 6.0, and $vectorSearch in Atlas 7.0+.

What is the default memory limit per stage?

Each stage can use up to 100 MB of RAM. If a stage exceeds that limit, the query fails unless you pass { allowDiskUse: true } as an aggregation option. Disk spills are slower, so try to reduce document size with $project or filter early with $match.

Can I use aggregation inside a transaction?

Yes, starting in MongoDB 4.4. The pipeline must target a single collection and cannot use $out or $merge. Transactions have their own size and time limits, so keep pipelines short.

How do I debug a long pipeline?

Build incrementally: start with the first stage, check its output, add the next stage, and repeat. In the mongo shell, comment out later stages. In Monghoul, use the aggregation builder's per-stage preview and "Run to here" to see intermediate results without touching later stages.

What is the difference between $project and $addFields?

$project controls exactly which fields appear in the output. If you include field A, all other fields are excluded (except _id). $addFields keeps every existing field and adds or overwrites the ones you specify. Use $project when you want a strict shape, $addFields when you want to enrich documents.

When should I use $merge instead of $out?

$out replaces the entire target collection on every run, which makes it all-or-nothing. $merge can insert new documents and update existing ones based on a match key, which is better for incremental updates (e.g., daily roll-ups that add to a running summary).

$lookup is slow. What can I do?

First, index the foreign field. An unindexed foreign field turns every lookup into a collection scan. Second, add a $match inside the lookup pipeline to reduce the joined set. Third, if you only need a count or a single field, project inside the sub-pipeline to avoid pulling full documents.

Does stage order matter for performance?

Absolutely. MongoDB does some automatic optimization (e.g., moving $match before $project when possible), but it cannot reorder everything. As a rule, filter first ($match), project next ($project/$unset), then sort and group. Fewer documents in, less work for expensive stages.

When should I NOT use aggregation?

For simple lookups, use find(). The aggregation pipeline has higher per-query overhead than find() and is harder to reason about. Reach for it when you need joins, grouping, window functions, computed fields, or multi-stage transformations. A query like "give me one user by id" should always be find(), not a pipeline.

Build pipelines visually in Monghoul

Drag-and-drop stages, per-stage preview, and autocomplete for every operator on this page. 14-day Pro trial, no card required.

Monghoul aggregation builder with visual pipeline stages and per-stage output preview
Download Monghoul