Prefix Sum and Difference Thinking
prefix sum, range queries, subarray sums, difference arrays
Prefix sums teach you how to pay an upfront preprocessing cost so later range questions become cheap. Difference arrays teach the dual idea: record the impact of range updates cheaply now, and materialize the final values later.
1. Definition
What is this topic about?
This topic is about turning repeated range work into incremental state.
Instead of recomputing the sum of a subarray every time, you precompute enough structure so a range result can be derived quickly.
The same mindset extends to updates.
Instead of modifying every element inside every range update, you record only where an update starts and where it stops affecting the array.
The two main ideas
The chapter combines two linked patterns:
- prefix sums for fast range queries,
- difference arrays for fast range updates.
Why this matters in interviews
Many range problems are written in a way that invites a brute-force loop inside another loop.
That often works for tiny examples but collapses for large constraints.
Prefix thinking changes the model from âcompute each range directlyâ to âreuse earlier accumulated work.â
What it is NOT
This topic is not yet segment trees, Fenwick trees, or lazy propagation.
It is the earlier and more interview-common layer where you use:
- running sums,
- prefix tables,
- hash maps on prefix values,
- and difference markers.
Recognition signals
This topic should trigger when you see prompts about:
- range sum queries,
- many repeated subarray sums,
- counting subarrays with a property,
- or applying many range increments.
2. Core Concepts
Running sum
A running sum is the total you get by accumulating values from left to right.
That idea is the foundation of prefix sums.
Prefix sum
A prefix sum at index i stores the total from the start up to some point.
A common convention is:
prefix[0] = 0,prefix[i + 1] = prefix[i] + nums[i].
This convention makes range formulas cleaner.
Range sum from prefix sums
Once prefix sums exist, the sum of a range [left, right] becomes:
$$ \text{rangeSum}(left, right) = prefix[right + 1] - prefix[left] $$
That removes the inner loop.
Preprocessing versus query cost
Prefix sums trade a one-time preprocessing pass for much cheaper repeated queries.
That trade-off is often exactly what the problem wants.
Prefix state as a model, not only as an array
Sometimes you do not store the prefix values just to answer direct range sums.
Sometimes you track how often a prefix value has appeared.
That is the key idea behind problems like:
- subarray sum equals
k, - count of zero-sum subarrays,
- balanced prefix patterns.
Difference array
A difference array stores changes between adjacent positions.
Instead of applying a range increment to every element in [left, right], you can:
- add
valueatleft, - subtract
valueatright + 1if it exists, - then reconstruct the final array with a running sum.
Why difference arrays work
The increment starts affecting the array at the left boundary and stops affecting it right after the right boundary.
Recording those boundaries is enough.
Inclusive boundaries matter
Range formulas often fail because of off-by-one mistakes.
You must be explicit about whether a range is:
- inclusive,
- exclusive,
- or half-open.
Negative numbers matter
Sliding window techniques often rely on non-negative values.
Prefix-sum-with-map techniques do not require that assumption in the same way.
That distinction is very important.
Recognition sentence
A strong instinct is:
- âIf I keep re-summing overlapping ranges, I probably need prefix sums.â
- âIf I keep re-applying overlapping updates, I probably need a difference array.â
3. Practical Usage
Fast range sum queries
If you must answer many sum queries over an array that does not change, prefix sums are usually the first tool to try.
Counting subarrays with a target sum
If the prompt asks for how many subarrays sum to k, think about prefix sums plus a map of previously seen prefix values.
The key identity is:
$$ currentPrefix - earlierPrefix = k $$
So if the current prefix is s, then you want to know how many times s - k has appeared before.
Repeated range increments
If the prompt applies many operations like âadd value to every index from left to right,â a direct per-range loop may be too slow.
That is where difference thinking becomes valuable.
Building intuition
Prefix sums are about asking:
- what total has been accumulated so far,
- and how can two totals cancel into the range I want?
Difference arrays are about asking:
- where does an effect begin,
- where does it end,
- and can I materialize the full result later?
Interview explanation pattern
A clean explanation often sounds like this:
- âThe naive solution recomputes overlapping work.â
- âThese queries or updates share structure.â
- âI can move that repeated work into a preprocessing pass.â
- âThen each query or each update becomes much cheaper.â
When not to use these patterns blindly
Do not force prefix sums when:
- the operation is not additive,
- updates and queries are both dynamic in harder ways,
- or the problem asks for a structure better handled by another technique.
But for interview-range basics, they are incredibly common.
Related LeetCode problems
- LeetCode #303 â Range Sum Query â Immutable â direct prefix sum application.
- LeetCode #560 â Subarray Sum Equals K â prefix sum +
HashMapfor count. - LeetCode #1094 â Car Pooling â difference array for interval capacity.
- LeetCode #1109 â Corporate Flight Bookings â difference array for range increment.
- LeetCode #304 â Range Sum Query 2D â Immutable â 2D prefix sum extension.
4. Code Examples
Baseline â brute-force range sum
The naĂŻve approach recalculates the sum for every query from scratch.
public class BruteForceRangeSum {
private final int[] data;
public BruteForceRangeSum(int[] nums) {
this.data = nums.clone();
}
public int sumRange(int left, int right) {
int total = 0;
for (int index = left; index <= right; index++) {
total += data[index];
}
return total;
}
}
Complexity: no preprocessing; each query O(n); extra space O(1).
This is fine for a handful of queries, but with q queries over an array of size n, the total cost is O(n·q). Prefix sums eliminate the repeated work.
Optimized â build prefix sums for range queries
public class RangeSumQuery {
private final int[] prefix;
public RangeSumQuery(int[] nums) {
this.prefix = new int[nums.length + 1];
for (int index = 0; index < nums.length; index++) {
prefix[index + 1] = prefix[index] + nums[index];
}
}
public int sumRange(int left, int right) {
return prefix[right + 1] - prefix[left];
}
}
Why it works:
- preprocessing is linear,
- each query is constant time,
- and the formula is simple when
prefix[0] = 0.
Complexity: preprocessing O(n); each query O(1); extra space O(n).
Intermediate Example â subarray sum equals `k`
import java.util.HashMap;
import java.util.Map;
public class SubarraySumEqualsK {
public static int countSubarrays(int[] nums, int target) {
Map<Integer, Integer> seen = new HashMap<>();
seen.put(0, 1);
int prefix = 0;
int result = 0;
for (int value : nums) {
prefix += value;
result += seen.getOrDefault(prefix - target, 0);
seen.put(prefix, seen.getOrDefault(prefix, 0) + 1);
}
return result;
}
}
This is one of the most important prefix-sum interview patterns.
You are not only storing prefix values.
You are storing how often each prefix value has appeared.
Complexity: time O(n); extra space O(n) (for the map).
Intermediate Example â difference array for range increment
public class RangeIncrement {
public static int[] applyUpdates(int length, int[][] updates) {
int[] diff = new int[length];
for (int[] update : updates) {
int left = update[0];
int right = update[1];
int value = update[2];
diff[left] += value;
if (right + 1 < length) {
diff[right + 1] -= value;
}
}
int[] result = new int[length];
int running = 0;
for (int index = 0; index < length; index++) {
running += diff[index];
result[index] = running;
}
return result;
}
}
The update cost becomes constant per operation, and the final materialization is one pass.
Complexity: O(k) for k updates + O(n) for materialization; extra space O(n).
Advanced Example â check capacity after many trips
public class CarPoolingCheck {
public static boolean canCarryAllTrips(int[][] trips, int capacity) {
int[] diff = new int[1001];
for (int[] trip : trips) {
int passengers = trip[0];
int from = trip[1];
int to = trip[2];
diff[from] += passengers;
diff[to] -= passengers;
}
int currentLoad = 0;
for (int change : diff) {
currentLoad += change;
if (currentLoad > capacity) {
return false;
}
}
return true;
}
}
This is a great example of difference thinking where you care more about change points than about every index inside a range.
Complexity: O(trips) for recording + O(maxStop) for checking; extra space O(maxStop).
Common Pitfall â off-by-one prefix formula
public class BrokenRangeSum {
public static int sumRange(int[] prefix, int left, int right) {
return prefix[right] - prefix[left];
}
}
If prefix[i] means the sum of elements before index i, then this formula is wrong for inclusive [left, right] ranges.
The correct formula under that convention is:
return prefix[right + 1] - prefix[left];
5. Trade-offs
| Aspect | Details |
|---|---|
| ⥠Performance | Prefix preprocessing is linear and turns repeated range queries into constant-time operations. Difference arrays make each range update constant-time before a final materialization pass. |
| đŸ Memory | You spend extra memory for the prefix or difference structure to reduce repeated work. |
| đ§ Maintainability | The formulas are elegant once the indexing convention is clear, but off-by-one mistakes can hurt readability if the convention is vague. |
| đ Flexibility | Great for additive range logic, but not a universal answer for every dynamic query/update problem. |
Performance trade-off: You pay once to save work many times.
Memory trade-off: Extra arrays are usually worth it when repeated range work is large.
Maintainability trade-off: Clear conventions like prefix[0] = 0 make the code much safer.
Flexibility trade-off: Prefix and difference patterns are powerful, but they do not replace all advanced range data structures.
6. Common Mistakes
Using brute-force range loops repeatedly
This usually signals missed shared work.Getting the prefix indexing convention wrong
Many bugs come from inconsistent meaning ofprefix[i].Forgetting the initial
0 -> 1count in prefix-map problems
That base case is essential for subarrays starting at index0.Applying sliding window when negative numbers exist
Prefix-sum map solutions often handle those cases better.Forgetting to stop a difference update at
right + 1
Without that, the effect continues too far.Writing beyond the array boundary
right + 1must be checked before use.Mixing query logic and update logic conceptually
Prefix sums answer ranges fast; difference arrays apply ranges fast.
7. Deep Dive
The deeper mental shift
The real power here is not the formula itself.
It is the shift from direct work to reusable accumulated state.
Once you notice overlapping subarrays or overlapping updates, you should start asking whether previous work can be reused rather than repeated.
Prefix sums are about subtraction, not only addition
Beginners often think prefix sums are âjust cumulative addition.â
But the real trick is subtraction of two accumulated states to isolate a middle range.
That is why the method is so reusable.
Prefix maps extend the idea further
When you pair prefix sums with a map, you stop thinking only about explicit ranges and start thinking about relationships between earlier and current accumulated state.
That is why problems like subarray sum equals k become much easier.
Difference arrays are the mirror image
Prefix sums answer many queries after one accumulation pass.
Difference arrays absorb many updates first, then recover final values with one accumulation pass.
That symmetry is worth remembering.
Why these problems feel âsmartâ in interviews
They feel smart because they replace repeated local work with a compact global model.
Interviewers like that because it shows pattern recognition, not just implementation ability.
Java-specific notes
In Java, arrays make these patterns especially clean.
The main risk is not syntax but indexing discipline.
If you clearly define what prefix[i] means and what your range boundaries mean, the implementation becomes straightforward.
8. Interview Questions
Q: Why use prefix[0] = 0?
A: It makes range formulas cleaner and naturally supports ranges starting at index 0.
Q: What is the main idea behind subarray sum equals k?
A: Track the current prefix sum and count how many earlier prefix sums equal currentPrefix - k.
Q: Why is the base case seen.put(0, 1) important?
A: It accounts for subarrays that begin at index 0 and already sum to the target.
Q: What is the core idea of a difference array?
A: Record where a range update starts and where it stops, then reconstruct final values with a running sum.
Q: What is a frequent source of bugs here?
A: Off-by-one indexing mistakes and unclear boundary conventions.
9. Glossary
| Term | Definition |
|---|---|
| prefix sum | Accumulated total from the start up to a point. |
| running sum | Total built incrementally while scanning left to right. |
| range query | A question about a contiguous interval, such as sum from left to right. |
| difference array | A structure that records boundary changes for range updates. |
| preprocessing | Upfront work that makes later operations cheaper. |
| materialization | Reconstructing final values from compressed state. |
| off-by-one error | A bug caused by shifting a boundary by one index. |
10. Cheatsheet
- Repeated range sums often suggest prefix sums.
- Repeated range increments often suggest difference arrays.
- A clean prefix convention is
prefix[0] = 0. - Inclusive range sum is often
prefix[right + 1] - prefix[left]. - For subarray sum equals
k, look forcurrentPrefix - kin a map. - Seed prefix maps with
0 -> 1when needed. - Difference updates start at
leftand stop atright + 1. - Always guard the
right + 1boundary. - Negative numbers often break simple sliding-window reasoning.
- Shared work is the signal that preprocessing may help.
đź Games
10 questions