MongoDB aggregation cheat sheet
Every pipeline stage, expression operator, and copy-paste pattern in one page. Covers MongoDB 4.4 through 8.0.
Updated April 2026
Basic pipeline
db.orders.aggregate([
{ $match: { status: "shipped", total: { $gte: 100 } } },
{ $group: {
_id: "$region",
revenue: { $sum: "$total" },
count: { $sum: 1 }
} },
{ $sort: { revenue: -1 } }
])
Filter with $match, aggregate with $group, order with $sort.
Most pipelines are variations of this pattern.
SQL to aggregation
If you know SQL, you already know the concepts. Each SQL clause maps to an aggregation stage.
| SQL | Aggregation stage |
|---|---|
| SELECT a, b | $project |
| SELECT *, a+b AS c | $addFields / $set |
| WHERE | $match |
| GROUP BY | $group |
| HAVING | $match (after $group) |
| ORDER BY | $sort |
| LIMIT / OFFSET | $limit / $skip |
| JOIN | $lookup |
| UNION ALL | $unionWith |
| COUNT(*) | $count |
| DISTINCT | $group |
| WINDOW / OVER | $setWindowFields |
| INSERT INTO ... SELECT | $merge / $out |
| CASE WHEN | $cond / $switch |
| COALESCE | $ifNull |
| CAST | $convert / $toInt / ... |
| EXISTS (subquery) | $lookup + $match |
| LATERAL JOIN | $lookup (pipeline) |
| CREATE MATERIALIZED VIEW | $merge / $out |
How the pipeline works
Documents flow left to right through stages. Each stage transforms the stream and passes results to the next.
The document count can shrink ($match, $group), stay the same ($addFields, $sort, $project), grow ($unwind, $unionWith), or branch ($facet). Place stages that reduce document count early to keep later stages fast.
Monghoul's aggregation builder shows this flow visually with per-stage output previews and "Run to here" controls. See how it works →
Pipeline stages
Filtering
Filters documents. Same query syntax as find().
{ $match: { status: "active", age: { $gte: 18 } } } Example data
[
{ name: "Alice", status: "active", age: 25 },
{ name: "Bob", status: "inactive", age: 30 },
{ name: "Carol", status: "active", age: 15 }
] [
{ name: "Alice", status: "active", age: 25 }
] Projection
Includes, excludes, or computes fields. 1 to include, 0 to exclude.
{ $project: { name: 1, total: { $multiply: ["$price", "$qty"] } } } Example data
[
{ _id: 1, name: "Widget", price: 25, qty: 4, sku: "W1" }
] [
{ _id: 1, name: "Widget", total: 100 }
] Adds new fields or overwrites existing ones. $set is an alias for $addFields.
{ $addFields: { totalPrice: { $multiply: ["$price", "$qty"] } } } Example data
[
{ _id: 1, price: 25, qty: 4, sku: "W1" }
] [
{ _id: 1, price: 25, qty: 4, sku: "W1", totalPrice: 100 }
] Removes fields. Shorthand for $project with exclusions.
{ $unset: ["tempField", "internal.debug"] } Replaces the entire document with a specified sub-document or expression.
{ $replaceRoot: { newRoot: "$address" } }
// or equivalently:
{ $replaceWith: "$address" } Example data
[
{ _id: 1, name: "Alice", address: { city: "NYC", zip: "10001" } }
] [
{ city: "NYC", zip: "10001" }
] Array handling
Deconstructs an array field into one document per element.
{ $unwind: "$tags" }
// with options:
{ $unwind: { path: "$tags", preserveNullAndEmptyArrays: true } } Example data
[
{ _id: 1, item: "Shirt", tags: ["cotton", "sale"] }
] [
{ _id: 1, item: "Shirt", tags: "cotton" },
{ _id: 1, item: "Shirt", tags: "sale" }
] Grouping
Groups documents by a key and applies accumulators.
{ $group: {
_id: "$department",
total: { $sum: "$salary" },
count: { $sum: 1 },
avgSalary: { $avg: "$salary" }
} } Example data
[
{ name: "Alice", department: "eng", salary: 120 },
{ name: "Bob", department: "eng", salary: 100 },
{ name: "Carol", department: "sales", salary: 90 }
] [
{ _id: "eng", total: 220, count: 2, avgSalary: 110 },
{ _id: "sales", total: 90, count: 1, avgSalary: 90 }
] Groups documents into buckets by value ranges.
{ $bucket: {
groupBy: "$price",
boundaries: [0, 50, 100, 200, Infinity],
default: "other",
output: { count: { $sum: 1 } }
} } Example data
[
{ item: "A", price: 10 },
{ item: "B", price: 75 },
{ item: "C", price: 150 }
] [
{ _id: 0, count: 1 },
{ _id: 50, count: 1 },
{ _id: 100, count: 1 }
] Automatically divides documents into a specified number of evenly distributed buckets.
{ $bucketAuto: { groupBy: "$score", buckets: 5 } } Groups by a field and sorts by count descending. Shorthand for $group + $sort.
{ $sortByCount: "$category" } Example data
[
{ item: "A", category: "food" },
{ item: "B", category: "food" },
{ item: "C", category: "tech" }
] [
{ _id: "food", count: 2 },
{ _id: "tech", count: 1 }
] Sorting
Orders documents. 1 for ascending, -1 for descending.
{ $sort: { createdAt: -1, name: 1 } } Example data
[
{ name: "Bob", createdAt: "2024-01-01" },
{ name: "Alice", createdAt: "2024-03-15" },
{ name: "Carol", createdAt: "2024-02-10" }
] [
{ name: "Alice", createdAt: "2024-03-15" },
{ name: "Carol", createdAt: "2024-02-10" },
{ name: "Bob", createdAt: "2024-01-01" }
] Joining
Left outer join to another collection in the same database.
// basic form:
{ $lookup: {
from: "orders",
localField: "_id",
foreignField: "userId",
as: "userOrders"
} }
// pipeline form (correlated sub-query):
{ $lookup: {
from: "orders",
let: { uid: "$_id" },
pipeline: [
{ $match: {
$expr: { $eq: ["$userId", "$$uid"] }
} },
{ $sort: { date: -1 } },
{ $limit: 5 }
],
as: "recentOrders"
} } Example data
// users collection:
[{ _id: 1, name: "Alice" }]
// orders collection:
[{ userId: 1, total: 50 }, { userId: 1, total: 80 }] [
{ _id: 1, name: "Alice",
userOrders: [
{ userId: 1, total: 50 },
{ userId: 1, total: 80 }
] }
] Recursive lookup for tree and graph structures (org charts, categories, social graphs).
{ $graphLookup: {
from: "employees",
startWith: "$managerId",
connectFromField: "managerId",
connectToField: "_id",
as: "reportingChain",
maxDepth: 5,
depthField: "level"
} } Example data
// employees collection (pipeline starts with { $match: { name: "Alice" } }):
[
{ _id: 1, name: "Alice", managerId: 2 },
{ _id: 2, name: "Bob", managerId: 3 },
{ _id: 3, name: "Carol", managerId: null }
] [
{ _id: 1, name: "Alice", managerId: 2,
reportingChain: [
{ _id: 2, name: "Bob", managerId: 3, level: 0 },
{ _id: 3, name: "Carol", managerId: null, level: 1 }
] }
] Appends documents from another collection to the pipeline. Like SQL UNION ALL.
{ $unionWith: { coll: "archivedOrders" } }
// with a pipeline:
{ $unionWith: {
coll: "archivedOrders",
pipeline: [{ $match: { status: "shipped" } }]
} } Example data
// currentOrders (pipeline source):
[{ _id: 1, item: "Laptop", status: "active" }]
// archivedOrders (joined via $unionWith):
[{ _id: 99, item: "Mouse", status: "shipped" }] [
{ _id: 1, item: "Laptop", status: "active" },
{ _id: 99, item: "Mouse", status: "shipped" }
] Window functions
Adds fields computed over a window of documents (running totals, ranks, moving averages).
{ $setWindowFields: {
partitionBy: "$department",
sortBy: { salary: -1 },
output: {
rank: { $rank: {} },
runningTotal: {
$sum: "$salary",
window: { documents: ["unbounded", "current"] }
}
}
} } Example data
[
{ department: "eng", name: "Alice", salary: 120 },
{ department: "eng", name: "Bob", salary: 100 },
{ department: "sales", name: "Carol", salary: 90 }
] [
{ department: "eng", name: "Alice", salary: 120,
rank: 1, runningTotal: 120 },
{ department: "eng", name: "Bob", salary: 100,
rank: 2, runningTotal: 220 },
{ department: "sales", name: "Carol", salary: 90,
rank: 1, runningTotal: 90 }
] Multi-pipeline
Runs multiple sub-pipelines in parallel on the same input. Returns one document with each pipeline result as a field.
{ $facet: {
priceRanges: [
{ $bucket: { groupBy: "$price", boundaries: [0, 50, 100, 500] } }
],
topRated: [
{ $sort: { rating: -1 } },
{ $limit: 5 }
],
totalCount: [
{ $count: "count" }
]
} } Example data
// all input documents are sent
// to each sub-pipeline [
{
priceRanges: [{ _id: 0, count: 3 }, ...],
topRated: [{ name: "A", rating: 5 }, ...],
totalCount: [{ count: 42 }]
}
] Output
Writes pipeline results to a collection, replacing its contents. Must be the last stage.
{ $out: "monthlySummary" }
// cross-database (4.4+):
{ $out: { db: "reporting", coll: "monthlySummary" } } Search (Atlas)
Full-text search using Atlas Search indexes. Must be the first stage.
{ $search: {
index: "default",
text: { query: "coffee", path: "description" }
} } Returns metadata about Atlas Search results (facet counts, total hits).
{ $searchMeta: {
index: "default",
facet: {
operator: { text: { query: "coffee", path: "description" } },
facets: { categories: { type: "string", path: "category" } }
}
} } Other stages
Restricts document content based on field-level conditions. Returns "$$DESCEND", "$$PRUNE", or "$$KEEP".
{ $redact: {
$cond: {
if: { $eq: ["$level", "public"] },
then: "$$DESCEND",
else: "$$PRUNE"
}
} } Fills gaps in a sequence (dates, numbers) by inserting missing documents.
{ $densify: {
field: "date",
range: { step: 1, unit: "day", bounds: "full" }
} } Fills null or missing field values using a specified method.
{ $fill: {
sortBy: { date: 1 },
output: {
temperature: { method: "linear" },
status: { value: "unknown" }
}
} } Creates documents from expressions. Useful for generating test data or constants in a pipeline.
{ $documents: [
{ name: "Alice", role: "admin" },
{ name: "Bob", role: "user" }
] } Admin and diagnostics
Returns storage and query-execution statistics for the collection. Must be the first stage.
{ $collStats: {
storageStats: {},
count: {},
queryExecStats: {}
} } Returns usage statistics for each index on the collection. Must be the first stage.
{ $indexStats: {} } Example data
// no input documents; reads index metadata [
{ name: "status_1", accesses: { ops: 142, since: ISODate("2024-01-01") } },
{ name: "_id_", accesses: { ops: 9800, since: ISODate("2024-01-01") } }
] Returns plan cache entries for the collection. Useful for diagnosing slow queries.
{ $planCacheStats: {} } Returns documents describing active and/or dormant operations on the server. Run on the admin database.
{ $currentOp: { allUsers: true, idleSessions: false } } Expressions and operators
Expression operators transform values inside stages like $project and $addFields. Query operators (used inside $match) are included at the end for quick reference.
Accumulators (used in $group, $setWindowFields)
{ $sum: "$qty" } { $avg: "$price" } { $min: "$score" } { $max: "$score" } { $first: "$name" } { $last: "$name" } { $push: "$tag" } { $addToSet: "$tag" } { $count: {} } { $top: { sortBy: { score: -1 }, output: "$name" } } { $topN: { n: 3, sortBy: { score: -1 }, output: "$name" } } { $bottom: { sortBy: { score: -1 }, output: "$name" } } { $bottomN: { n: 3, sortBy: { score: -1 }, output: "$name" } } { $firstN: { n: 3, input: "$item" } } { $lastN: { n: 3, input: "$item" } } { $median: { input: "$score", method: "approximate" } } { $percentile: { input: "$score", p: [0.9], method: "approximate" } } { $stdDevSamp: "$score" } { $stdDevPop: "$score" } { $expMovingAvg: { input: "$price", N: 5 } } { $minN: { n: 3, input: "$score" } } { $maxN: { n: 3, input: "$score" } } { $rank: {} } { $denseRank: {} } { $documentNumber: {} } { $shift: { output: "$price", by: -1, default: 0 } } { $derivative: { input: "$distance", unit: "hour" } } { $linearFill: "$temperature" } { $accumulator: { init: "function() { ... }", ... } } { $function: { body: "function(name) { ... }", args: ["$name"], lang: "js" } } { $integral: { input: "$speed", unit: "hour" } } { $covarianceSamp: ["$x", "$y"] } { $covariancePop: ["$x", "$y"] } { $locf: "$temperature" } Query operators (in $match)
{ field: { $in: ["a", "b", "c"] } } { status: { $nin: ["deleted", "archived"] } } { email: { $exists: true } } { name: { $regex: /^test/i } } { scores: { $elemMatch: { $gte: 80, $lt: 90 } } } { tags: { $all: ["mongodb", "aggregation"] } } { price: { $not: { $gt: 100 } } } { $nor: [{ price: 0 }, { deleted: true }] } { $text: { $search: "coffee shop" } } { $expr: { $gt: ["$qty", "$minQty"] } } Conditional
{ $cond: { if: { $gte: ["$score", 90] }, then: "A", else: "B" } } Example data
{ score: 95 }
{ score: 72 } { grade: "A" }
{ grade: "B" } { $switch: {
branches: [
{ case: { $eq: ["$status", 1] },
then: "active" }
],
default: "unknown"
} } Example data
{ status: 1 }
{ status: 5 } { label: "active" }
{ label: "unknown" } { $ifNull: ["$nickname", "$name", "anonymous"] } Comparison
{ $eq: ["$status", "active"] } { $ne: ["$status", "deleted"] } { $gt: ["$age", 18] } { $gte: ["$score", 90] } { $lt: ["$qty", 10] } { $lte: ["$price", 100] } { $cmp: ["$a", "$b"] } Array
{ $filter: { input: "$items", as: "i", cond: { $gte: ["$$i.price", 50] } } } Example data
{ items: [{ name: "A", price: 30 }, { name: "B", price: 80 }, { name: "C", price: 55 }] } { result: [{ name: "B", price: 80 }, { name: "C", price: 55 }] } { $map: { input: "$items", as: "i", in: "$$i.name" } } Example data
{ items: [{ name: "A", price: 30 }, { name: "B", price: 80 }] } { names: ["A", "B"] } { $reduce: { input: "$items", initialValue: 0, in: { $add: ["$$value", "$$this.qty"] } } } Example data
{ items: [{ qty: 5 }, { qty: 10 }, { qty: 3 }] } { totalQty: 18 } { $size: "$items" } { $in: ["admin", "$roles"] } { $arrayElemAt: ["$arr", 0] } { $first: "$items" } { $last: "$items" } { $slice: ["$arr", 0, 3] } { $concatArrays: ["$a", "$b"] } { $isArray: "$field" } { $sortArray: { input: "$items", sortBy: { price: -1 } } } { $indexOfArray: ["$tags", "urgent"] } { $reverseArray: "$arr" } { $range: [0, 10, 2] } { $zip: { inputs: ["$keys", "$values"] } } { $setUnion: ["$a", "$b"] } { $setIntersection: ["$a", "$b"] } { $setDifference: ["$a", "$b"] } { $setEquals: ["$a", "$b"] } { $setIsSubset: ["$a", "$b"] } { $allElementsTrue: ["$flags"] } { $anyElementTrue: ["$flags"] } String
{ $concat: ["$first", " ", "$last"] } { $substr: ["$name", 0, 3] } { $toUpper: "$name" } { $toLower: "$email" } { $split: ["$fullName", " "] } { $trim: { input: "$name" } } { $regexMatch: { input: "$email", regex: /@example\.com$/ } } { $regexFind: { input: "$text", regex: /\d+/ } } { $regexFindAll: { input: "$text", regex: /\d+/ } } { $replaceOne: { input: "$str", find: "foo", replacement: "bar" } } { $replaceAll: { input: "$str", find: "foo", replacement: "bar" } } { $strLenCP: "$name" } { $indexOfCP: ["$name", "world"] } { $substrCP: ["$name", 0, 5] } { $ltrim: { input: "$str", chars: " \t" } } { $rtrim: { input: "$str", chars: " \t" } } { $strcasecmp: ["$a", "$b"] } { $substrBytes: ["$name", 0, 5] } { $strLenBytes: "$name" } { $indexOfBytes: ["$name", "world"] } Arithmetic
{ $add: ["$price", "$tax"] } { $subtract: ["$total", "$discount"] } { $multiply: ["$price", "$qty"] } { $divide: ["$total", "$count"] } { $mod: ["$num", 2] } { $round: ["$avg", 2] } { $ceil: "$price" } { $floor: "$price" } { $abs: "$change" } { $rand: {} } { $pow: ["$base", 2] } { $sqrt: "$variance" } { $log: ["$val", 10] } { $log10: "$val" } { $ln: "$val" } { $exp: "$val" } { $trunc: ["$price", 2] } Date
{ $dateToString: { format: "%Y-%m-%d", date: "$created" } } Example data
{ created: ISODate("2024-03-15T14:30:00Z") } { dateStr: "2024-03-15" } { $dateFromString: { dateString: "2024-01-15" } } { $dateAdd: { startDate: "$created", unit: "day", amount: 30 } } { $dateSubtract: { startDate: "$created", unit: "month", amount: 3 } } { $dateDiff: { startDate: "$start", endDate: "$end", unit: "day" } } { $year: "$created" } { $month: "$created" } { $dayOfMonth: "$created" } { $hour: "$created" } { $minute: "$created" } { $second: "$created" } { $week: "$created" } { $dayOfWeek: "$created" } { $dateTrunc: { date: "$created", unit: "month" } } { $dateFromParts: { year: 2024, month: 6, day: 15 } } { $dateToParts: { date: "$created" } } { $millisecond: "$created" } { $dayOfYear: "$created" } { $isoWeek: "$created" } { $isoWeekYear: "$created" } { $isoDayOfWeek: "$created" } { $tsSecond: "$ts" } { $tsIncrement: "$ts" } Type and object
{ $type: "$field" } { $convert: { input: "$val", to: "int", onError: 0 } } { $toInt: "$str" } { $toDouble: "$str" } { $toString: "$num" } { $toBool: "$val" } { $toDate: "$timestamp" } { $toObjectId: "$idString" } { $toLong: "$str" } { $toDecimal: "$str" } { $mergeObjects: ["$defaults", "$overrides"] } Example data
{ defaults: { color: "blue", size: "M" }, overrides: { size: "L", qty: 2 } } { result: { color: "blue", size: "L", qty: 2 } } { $objectToArray: "$metadata" } { $arrayToObject: "$pairs" } { $getField: { field: "price.usd", input: "$data" } } { $setField: { field: "price.usd", input: "$data", value: 9.99 } } { $let: { vars: { total: { $multiply: ["$price", "$qty"] } }, in: { $gt: ["$$total", 100] } } } Example data
{ price: 25, qty: 6 } { isExpensive: true } // 25 * 6 = 150 > 100 { $literal: "$notAFieldRef" } { $isNumber: "$age" } { $unsetField: { field: "temp", input: "$$ROOT" } } { $meta: "textScore" } Trigonometry
{ $sin: "$angle" } { $cos: "$angle" } { $tan: "$angle" } { $asin: "$val" } { $acos: "$val" } { $atan: "$val" } { $atan2: ["$y", "$x"] } { $sinh: "$val" } { $cosh: "$val" } { $tanh: "$val" } { $degreesToRadians: "$angleDeg" } { $radiansToDegrees: "$angleRad" } Data size
{ $bsonSize: "$$ROOT" } { $binarySize: "$data" } 01 Pagination with total count
Get one page of results and the total count in a single query using $facet.
db.products.aggregate([
{ $match: { category: "electronics" } },
{ $facet: {
data: [
{ $sort: { createdAt: -1 } },
{ $skip: 20 },
{ $limit: 10 }
],
total: [
{ $count: "count" }
]
} }
]) [
{
data: [
{ _id: 21, name: "USB-C Hub",
price: 39, createdAt: "2024-03-14" },
{ _id: 22, name: "Webcam",
price: 79, createdAt: "2024-03-13" }
// ... up to 10 documents
],
total: [{ count: 184 }]
}
] 02 Left join with $lookup
Join two collections. The pipeline form lets you filter and shape the joined documents before they land in the array.
db.users.aggregate([
{ $lookup: {
from: "orders",
let: { userId: "$_id" },
pipeline: [
{ $match: {
$expr: { $eq: ["$userId", "$$userId"] }
} },
{ $sort: { date: -1 } },
{ $limit: 5 },
{ $project: { _id: 0, date: 1, total: 1 } }
],
as: "recentOrders"
} }
]) [
{
_id: 1, name: "Alice",
recentOrders: [
{ date: "2024-03-15", total: 120 },
{ date: "2024-03-01", total: 85 }
]
},
{
_id: 2, name: "Bob",
recentOrders: []
}
] 03 Running total with $setWindowFields
Calculate a cumulative sum partitioned by a group field. Requires MongoDB 5.0+.
db.transactions.aggregate([
{ $setWindowFields: {
partitionBy: "$accountId",
sortBy: { date: 1 },
output: {
runningBalance: {
$sum: "$amount",
window: { documents: ["unbounded", "current"] }
}
}
} }
]) [
{ accountId: "A1", date: "2024-03-01",
amount: 500, runningBalance: 500 },
{ accountId: "A1", date: "2024-03-05",
amount: -120, runningBalance: 380 },
{ accountId: "A1", date: "2024-03-12",
amount: 200, runningBalance: 580 }
] 04 Top N per group
Get the top 3 products by revenue in each category.
db.sales.aggregate([
{ $group: {
_id: "$category",
topProducts: {
$topN: {
n: 3,
sortBy: { revenue: -1 },
output: { name: "$product", revenue: "$revenue" }
}
}
} }
]) [
{
_id: "electronics",
topProducts: [
{ name: "Laptop", revenue: 45000 },
{ name: "Phone", revenue: 38000 },
{ name: "Tablet", revenue: 22000 }
]
},
{
_id: "clothing",
topProducts: [
{ name: "Jacket", revenue: 12000 },
{ name: "Shoes", revenue: 9500 }
]
}
] 05 Pivot data (rows to columns)
Turn distinct field values into object keys. Useful for report-style output.
db.survey.aggregate([
{ $group: {
_id: "$question",
answers: { $push: { k: "$respondent", v: "$answer" } }
} },
{ $addFields: { answers: { $arrayToObject: "$answers" } } },
{ $replaceRoot: {
newRoot: { $mergeObjects: [{ question: "$_id" }, "$answers"] }
} }
]) 06 De-duplication
Keep only the first document for each unique key. Combine $sort before $group to control which document wins.
db.events.aggregate([
{ $sort: { timestamp: -1 } },
{ $group: {
_id: "$userId",
latestEvent: { $first: "$$ROOT" }
} },
{ $replaceRoot: { newRoot: "$latestEvent" } }
]) // given 5 events from 2 users:
[
{ _id: "evt_3", userId: "u1",
type: "click",
timestamp: "2024-03-15T10:00:00Z" },
{ _id: "evt_5", userId: "u2",
type: "purchase",
timestamp: "2024-03-14T18:30:00Z" }
] 07 Date-based grouping (by day, week, month)
Group documents into time buckets using $dateToString or $dateTrunc.
// group by day:
db.orders.aggregate([
{ $group: {
_id: {
$dateToString: {
format: "%Y-%m-%d", date: "$createdAt"
}
},
total: { $sum: "$amount" },
count: { $sum: 1 }
} },
{ $sort: { _id: 1 } }
])
// group by month (5.0+):
db.orders.aggregate([
{ $group: {
_id: {
$dateTrunc: { date: "$createdAt", unit: "month" }
},
total: { $sum: "$amount" }
} }
]) // by-day result:
[
{ _id: "2024-03-14", total: 3800, count: 15 },
{ _id: "2024-03-15", total: 4200, count: 18 },
{ _id: "2024-03-16", total: 2900, count: 11 }
] 08 Time-zone-aware date grouping
Group by day or month in a specific time zone. Without the timezone option, $dateTrunc and $dateToString use UTC, which shifts results for users in other zones.
db.orders.aggregate([
{ $group: {
_id: {
$dateTrunc: {
date: "$createdAt",
unit: "day",
timezone: "America/New_York"
}
},
revenue: { $sum: "$amount" },
orders: { $sum: 1 }
} },
{ $sort: { _id: 1 } }
]) [
{ _id: ISODate("2024-03-14T05:00:00Z"),
revenue: 4200, orders: 18 },
{ _id: ISODate("2024-03-15T04:00:00Z"),
revenue: 3800, orders: 15 }
]
// Note: _id is still UTC, but boundaries
// align to midnight in America/New_York 09 Gap-free time series with $densify
Fill missing dates so charts have no gaps. $densify inserts empty documents for missing intervals, then $fill sets default values. Requires MongoDB 5.1+.
db.dailyStats.aggregate([
{ $densify: {
field: "date",
range: {
step: 1,
unit: "day",
bounds: [
ISODate("2024-03-01"),
ISODate("2024-03-08")
]
}
} },
{ $fill: {
output: {
visits: { value: 0 },
revenue: { value: 0 }
}
} },
{ $sort: { date: 1 } }
]) // before: 5 documents (March 2, 4, 5 missing)
// after: 7 documents, one per day
[
{ date: "2024-03-01", visits: 120, revenue: 800 },
{ date: "2024-03-02", visits: 0, revenue: 0 },
{ date: "2024-03-03", visits: 95, revenue: 650 },
// ...every day present
] 10 Rolling window over dates
Compute a 7-day moving average using $setWindowFields with a date-based range window. Requires MongoDB 5.0+.
db.dailyMetrics.aggregate([
{ $setWindowFields: {
sortBy: { date: 1 },
output: {
avg7d: {
$avg: "$revenue",
window: {
range: [-6, "current"],
unit: "day"
}
}
}
} }
]) [
{ date: "2024-03-01", revenue: 800,
avg7d: 800 },
{ date: "2024-03-02", revenue: 650,
avg7d: 725 },
{ date: "2024-03-03", revenue: 920,
avg7d: 790 },
// avg7d smooths daily fluctuations
] 11 Multi-collection union
Combine documents from multiple collections into one result set, applying per-collection filters.
db.currentOrders.aggregate([
{ $match: { status: "active" } },
{ $unionWith: {
coll: "archivedOrders",
pipeline: [{ $match: { status: "active" } }]
} },
{ $sort: { createdAt: -1 } }
]) // documents from both collections, merged:
[
{ _id: 5, item: "Laptop", status: "active",
createdAt: "2024-03-15" },
{ _id: 88, item: "Monitor", status: "active",
createdAt: "2024-02-20" },
{ _id: 2, item: "Keyboard", status: "active",
createdAt: "2024-01-10" }
] 12 Conditional aggregation
Compute different values based on a condition within a group.
db.orders.aggregate([
{ $group: {
_id: "$region",
totalRevenue: { $sum: "$amount" },
onlineRevenue: {
$sum: {
$cond: [
{ $eq: ["$channel", "online"] },
"$amount", 0
]
}
},
storeRevenue: {
$sum: {
$cond: [
{ $eq: ["$channel", "store"] },
"$amount", 0
]
}
}
} }
]) [
{ _id: "US-East",
totalRevenue: 18400,
onlineRevenue: 12200,
storeRevenue: 6200 },
{ _id: "EU-West",
totalRevenue: 9300,
onlineRevenue: 7100,
storeRevenue: 2200 }
] 13 Recursive graph traversal
Walk a parent-child hierarchy (org chart, category tree, social graph) up to a specified depth.
db.employees.aggregate([
{ $match: { name: "Alice" } },
{ $graphLookup: {
from: "employees",
startWith: "$managerId",
connectFromField: "managerId",
connectToField: "_id",
as: "chain",
maxDepth: 10,
depthField: "level"
} }
]) [
{ _id: 4, name: "Alice", managerId: 2,
chain: [
{ _id: 2, name: "Bob",
managerId: 1, level: 0 },
{ _id: 1, name: "Carol",
managerId: null, level: 1 }
] }
] 14 Materialized view with $merge
Incrementally refresh a summary collection. $merge inserts new documents and updates existing ones, so you only reprocess the delta.
db.orders.aggregate([
{ $match: {
updatedAt: { $gte: ISODate("2024-03-01") }
} },
{ $group: {
_id: {
region: "$region",
month: {
$dateTrunc: { date: "$createdAt", unit: "month" }
}
},
revenue: { $sum: "$amount" },
orders: { $sum: 1 }
} },
{ $merge: {
into: "monthly_summary",
on: "_id",
whenMatched: "replace",
whenNotMatched: "insert"
} }
]) // monthly_summary collection after merge:
[
{ _id: { region: "US-East",
month: ISODate("2024-03-01") },
revenue: 18400, orders: 142 },
{ _id: { region: "EU-West",
month: ISODate("2024-03-01") },
revenue: 9300, orders: 87 }
] 15 Flatten nested arrays
Merge arrays of arrays into a single flat array using $reduce with $concatArrays.
db.departments.aggregate([
{ $project: {
name: 1,
allSkills: {
$reduce: {
input: "$teams.skills",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
}
} }
]) [
{
_id: 1,
name: "Engineering",
allSkills: ["go", "rust", "react", "typescript", "python", "ml"]
}
] 16 Dynamic field names
Process documents with unknown keys by converting objects to arrays, transforming, and converting back.
db.configs.aggregate([
{ $project: { entries: { $objectToArray: "$settings" } } },
{ $unwind: "$entries" },
{ $match: { "entries.v": { $ne: null } } },
{ $group: {
_id: "$_id",
settings: { $push: { k: "$entries.k", v: "$entries.v" } }
} },
{ $addFields: { settings: { $arrayToObject: "$settings" } } }
]) // input: { settings: { theme: "dark", lang: "en", beta: null } }
[
{ _id: 1,
settings: { theme: "dark", lang: "en" } }
] Performance tips
- Place $match as early as possible. It can use indexes and reduces the document count for every stage that follows.
- $sort immediately after $match on the same fields can piggyback on a compound index, avoiding an in-memory sort.
// index: { status: 1, createdAt: -1 } db.orders.aggregate([ { $match: { status: "shipped" } }, // uses the compound index, no in-memory sort { $sort: { createdAt: -1 } } ]) - Use $project or $unset early to drop fields you do not need. Smaller documents mean less memory per stage.
- Each pipeline stage has a 100 MB RAM limit by default. Pass { allowDiskUse: true } as an option to spill to disk for large datasets.
db.orders.aggregate([ { $group: { _id: "$customerId", total: { $sum: "$amount" } } }, { $sort: { total: -1 } } ], { allowDiskUse: true }) - In $lookup, make sure the foreign field (foreignField or the field matched inside the sub-pipeline) is indexed. An unindexed $lookup is effectively a nested collection scan.
- Prefer $filter or $reduce over $unwind + $group when you only need to transform an array in place. $unwind creates N documents per array element.
- Use $merge instead of $out for incremental materialized views. $merge can insert, update, or replace on match, while $out replaces the entire target collection.
- Run .explain("executionStats") on your pipeline. Check whether the first $match uses an index (IXSCAN, not COLLSCAN) and how many documents each stage examines.
db.orders .explain("executionStats") .aggregate([ { $match: { status: "shipped", createdAt: { $gte: ISODate("2024-01-01") } } }, { $group: { _id: "$region", total: { $sum: "$amount" } } } ]) - Avoid $sort on unindexed fields with large result sets. If the sort exceeds the 100 MB limit and allowDiskUse is off, the query fails.
- $count is faster than $group + $sum when you only need a total, because MongoDB can optimize it internally.
- $expr inside $match prevents index usage in most cases. For field-to-field comparisons, this means a collection scan. Inside $lookup sub-pipelines this is unavoidable, but for top-level $match, use standard query operators when comparing against constants.
// slow: $expr prevents index use { $match: { $expr: { $gt: ["$qty", "$threshold"] } } } // comparing against a constant? use standard syntax: // uses { qty: 1 } index { $match: { qty: { $gt: 100 } } }
Frequently asked questions
What MongoDB version introduced the aggregation pipeline?
MongoDB 2.2 (2012). Each major release since then has added stages and operators. $setWindowFields arrived in 5.0, $densify and $fill in 5.1/5.3, $documents in 6.0, and $vectorSearch in Atlas 7.0+.
What is the default memory limit per stage?
Each stage can use up to 100 MB of RAM. If a stage exceeds that limit, the query fails unless you pass { allowDiskUse: true } as an aggregation option. Disk spills are slower, so try to reduce document size with $project or filter early with $match.
Can I use aggregation inside a transaction?
Yes, starting in MongoDB 4.4. The pipeline must target a single collection and cannot use $out or $merge. Transactions have their own size and time limits, so keep pipelines short.
How do I debug a long pipeline?
Build incrementally: start with the first stage, check its output, add the next stage, and repeat. In the mongo shell, comment out later stages. In Monghoul, use the aggregation builder's per-stage preview and "Run to here" to see intermediate results without touching later stages.
What is the difference between $project and $addFields?
$project controls exactly which fields appear in the output. If you include field A, all other fields are excluded (except _id). $addFields keeps every existing field and adds or overwrites the ones you specify. Use $project when you want a strict shape, $addFields when you want to enrich documents.
When should I use $merge instead of $out?
$out replaces the entire target collection on every run, which makes it all-or-nothing. $merge can insert new documents and update existing ones based on a match key, which is better for incremental updates (e.g., daily roll-ups that add to a running summary).
$lookup is slow. What can I do?
First, index the foreign field. An unindexed foreign field turns every lookup into a collection scan. Second, add a $match inside the lookup pipeline to reduce the joined set. Third, if you only need a count or a single field, project inside the sub-pipeline to avoid pulling full documents.
Does stage order matter for performance?
Absolutely. MongoDB does some automatic optimization (e.g., moving $match before $project when possible), but it cannot reorder everything. As a rule, filter first ($match), project next ($project/$unset), then sort and group. Fewer documents in, less work for expensive stages.
Build pipelines visually in Monghoul
Drag-and-drop stages, per-stage preview, and autocomplete for every operator on this page. 14-day Pro trial, no card required.