Intermediate Reading time: ~13 min

Prefix Sum and Difference Thinking

prefix sum, range queries, subarray sums, difference arrays

Prefix sums teach you how to pay an upfront preprocessing cost so later range questions become cheap. Difference arrays teach the dual idea: record the impact of range updates cheaply now, and materialize the final values later.

1. Definition

What is this topic about?

This topic is about turning repeated range work into incremental state.

Instead of recomputing the sum of a subarray every time, you precompute enough structure so a range result can be derived quickly.

The same mindset extends to updates.

Instead of modifying every element inside every range update, you record only where an update starts and where it stops affecting the array.

The two main ideas

The chapter combines two linked patterns:

  • prefix sums for fast range queries,
  • difference arrays for fast range updates.

Why this matters in interviews

Many range problems are written in a way that invites a brute-force loop inside another loop.

That often works for tiny examples but collapses for large constraints.

Prefix thinking changes the model from “compute each range directly” to “reuse earlier accumulated work.”

What it is NOT

This topic is not yet segment trees, Fenwick trees, or lazy propagation.

It is the earlier and more interview-common layer where you use:

  • running sums,
  • prefix tables,
  • hash maps on prefix values,
  • and difference markers.

Recognition signals

This topic should trigger when you see prompts about:

  • range sum queries,
  • many repeated subarray sums,
  • counting subarrays with a property,
  • or applying many range increments.

2. Core Concepts

Running sum

A running sum is the total you get by accumulating values from left to right.

That idea is the foundation of prefix sums.

Prefix sum

A prefix sum at index i stores the total from the start up to some point.

A common convention is:

  • prefix[0] = 0,
  • prefix[i + 1] = prefix[i] + nums[i].

This convention makes range formulas cleaner.

Range sum from prefix sums

Once prefix sums exist, the sum of a range [left, right] becomes:

$$ \text{rangeSum}(left, right) = prefix[right + 1] - prefix[left] $$

That removes the inner loop.

Preprocessing versus query cost

Prefix sums trade a one-time preprocessing pass for much cheaper repeated queries.

That trade-off is often exactly what the problem wants.

Prefix state as a model, not only as an array

Sometimes you do not store the prefix values just to answer direct range sums.

Sometimes you track how often a prefix value has appeared.

That is the key idea behind problems like:

  • subarray sum equals k,
  • count of zero-sum subarrays,
  • balanced prefix patterns.

Difference array

A difference array stores changes between adjacent positions.

Instead of applying a range increment to every element in [left, right], you can:

  • add value at left,
  • subtract value at right + 1 if it exists,
  • then reconstruct the final array with a running sum.

Why difference arrays work

The increment starts affecting the array at the left boundary and stops affecting it right after the right boundary.

Recording those boundaries is enough.

Inclusive boundaries matter

Range formulas often fail because of off-by-one mistakes.

You must be explicit about whether a range is:

  • inclusive,
  • exclusive,
  • or half-open.

Negative numbers matter

Sliding window techniques often rely on non-negative values.

Prefix-sum-with-map techniques do not require that assumption in the same way.

That distinction is very important.

Recognition sentence

A strong instinct is:

  • “If I keep re-summing overlapping ranges, I probably need prefix sums.”
  • “If I keep re-applying overlapping updates, I probably need a difference array.”

3. Practical Usage

Fast range sum queries

If you must answer many sum queries over an array that does not change, prefix sums are usually the first tool to try.

Counting subarrays with a target sum

If the prompt asks for how many subarrays sum to k, think about prefix sums plus a map of previously seen prefix values.

The key identity is:

$$ currentPrefix - earlierPrefix = k $$

So if the current prefix is s, then you want to know how many times s - k has appeared before.

Repeated range increments

If the prompt applies many operations like “add value to every index from left to right,” a direct per-range loop may be too slow.

That is where difference thinking becomes valuable.

Building intuition

Prefix sums are about asking:

  • what total has been accumulated so far,
  • and how can two totals cancel into the range I want?

Difference arrays are about asking:

  • where does an effect begin,
  • where does it end,
  • and can I materialize the full result later?

Interview explanation pattern

A clean explanation often sounds like this:

  1. “The naive solution recomputes overlapping work.”
  2. “These queries or updates share structure.”
  3. “I can move that repeated work into a preprocessing pass.”
  4. “Then each query or each update becomes much cheaper.”

When not to use these patterns blindly

Do not force prefix sums when:

  • the operation is not additive,
  • updates and queries are both dynamic in harder ways,
  • or the problem asks for a structure better handled by another technique.

But for interview-range basics, they are incredibly common.

  • LeetCode #303 — Range Sum Query – Immutable — direct prefix sum application.
  • LeetCode #560 — Subarray Sum Equals K — prefix sum + HashMap for count.
  • LeetCode #1094 — Car Pooling — difference array for interval capacity.
  • LeetCode #1109 — Corporate Flight Bookings — difference array for range increment.
  • LeetCode #304 — Range Sum Query 2D – Immutable — 2D prefix sum extension.

4. Code Examples

Baseline — brute-force range sum

The naĂŻve approach recalculates the sum for every query from scratch.

public class BruteForceRangeSum {
    private final int[] data;

    public BruteForceRangeSum(int[] nums) {
        this.data = nums.clone();
    }

    public int sumRange(int left, int right) {
        int total = 0;
        for (int index = left; index <= right; index++) {
            total += data[index];
        }
        return total;
    }
}

Complexity: no preprocessing; each query O(n); extra space O(1).

This is fine for a handful of queries, but with q queries over an array of size n, the total cost is O(n·q). Prefix sums eliminate the repeated work.

Optimized — build prefix sums for range queries

public class RangeSumQuery {
    private final int[] prefix;

    public RangeSumQuery(int[] nums) {
        this.prefix = new int[nums.length + 1];
        for (int index = 0; index < nums.length; index++) {
            prefix[index + 1] = prefix[index] + nums[index];
        }
    }

    public int sumRange(int left, int right) {
        return prefix[right + 1] - prefix[left];
    }
}

Why it works:

  • preprocessing is linear,
  • each query is constant time,
  • and the formula is simple when prefix[0] = 0.

Complexity: preprocessing O(n); each query O(1); extra space O(n).

Intermediate Example — subarray sum equals `k`

import java.util.HashMap;
import java.util.Map;

public class SubarraySumEqualsK {
    public static int countSubarrays(int[] nums, int target) {
        Map<Integer, Integer> seen = new HashMap<>();
        seen.put(0, 1);

        int prefix = 0;
        int result = 0;

        for (int value : nums) {
            prefix += value;
            result += seen.getOrDefault(prefix - target, 0);
            seen.put(prefix, seen.getOrDefault(prefix, 0) + 1);
        }

        return result;
    }
}

This is one of the most important prefix-sum interview patterns.

You are not only storing prefix values.

You are storing how often each prefix value has appeared.

Complexity: time O(n); extra space O(n) (for the map).

Intermediate Example — difference array for range increment

public class RangeIncrement {
    public static int[] applyUpdates(int length, int[][] updates) {
        int[] diff = new int[length];

        for (int[] update : updates) {
            int left = update[0];
            int right = update[1];
            int value = update[2];

            diff[left] += value;
            if (right + 1 < length) {
                diff[right + 1] -= value;
            }
        }

        int[] result = new int[length];
        int running = 0;

        for (int index = 0; index < length; index++) {
            running += diff[index];
            result[index] = running;
        }

        return result;
    }
}

The update cost becomes constant per operation, and the final materialization is one pass.

Complexity: O(k) for k updates + O(n) for materialization; extra space O(n).

Advanced Example — check capacity after many trips

public class CarPoolingCheck {
    public static boolean canCarryAllTrips(int[][] trips, int capacity) {
        int[] diff = new int[1001];

        for (int[] trip : trips) {
            int passengers = trip[0];
            int from = trip[1];
            int to = trip[2];

            diff[from] += passengers;
            diff[to] -= passengers;
        }

        int currentLoad = 0;
        for (int change : diff) {
            currentLoad += change;
            if (currentLoad > capacity) {
                return false;
            }
        }

        return true;
    }
}

This is a great example of difference thinking where you care more about change points than about every index inside a range.

Complexity: O(trips) for recording + O(maxStop) for checking; extra space O(maxStop).

Common Pitfall — off-by-one prefix formula

public class BrokenRangeSum {
    public static int sumRange(int[] prefix, int left, int right) {
        return prefix[right] - prefix[left];
    }
}

If prefix[i] means the sum of elements before index i, then this formula is wrong for inclusive [left, right] ranges.

The correct formula under that convention is:

return prefix[right + 1] - prefix[left];

5. Trade-offs

Aspect Details
⚡ Performance Prefix preprocessing is linear and turns repeated range queries into constant-time operations. Difference arrays make each range update constant-time before a final materialization pass.
đŸ’Ÿ Memory You spend extra memory for the prefix or difference structure to reduce repeated work.
🔧 Maintainability The formulas are elegant once the indexing convention is clear, but off-by-one mistakes can hurt readability if the convention is vague.
🔄 Flexibility Great for additive range logic, but not a universal answer for every dynamic query/update problem.

Performance trade-off: You pay once to save work many times.

Memory trade-off: Extra arrays are usually worth it when repeated range work is large.

Maintainability trade-off: Clear conventions like prefix[0] = 0 make the code much safer.

Flexibility trade-off: Prefix and difference patterns are powerful, but they do not replace all advanced range data structures.

6. Common Mistakes

  1. Using brute-force range loops repeatedly
    This usually signals missed shared work.

  2. Getting the prefix indexing convention wrong
    Many bugs come from inconsistent meaning of prefix[i].

  3. Forgetting the initial 0 -> 1 count in prefix-map problems
    That base case is essential for subarrays starting at index 0.

  4. Applying sliding window when negative numbers exist
    Prefix-sum map solutions often handle those cases better.

  5. Forgetting to stop a difference update at right + 1
    Without that, the effect continues too far.

  6. Writing beyond the array boundary
    right + 1 must be checked before use.

  7. Mixing query logic and update logic conceptually
    Prefix sums answer ranges fast; difference arrays apply ranges fast.

7. Deep Dive

The deeper mental shift

The real power here is not the formula itself.

It is the shift from direct work to reusable accumulated state.

Once you notice overlapping subarrays or overlapping updates, you should start asking whether previous work can be reused rather than repeated.

Prefix sums are about subtraction, not only addition

Beginners often think prefix sums are “just cumulative addition.”

But the real trick is subtraction of two accumulated states to isolate a middle range.

That is why the method is so reusable.

Prefix maps extend the idea further

When you pair prefix sums with a map, you stop thinking only about explicit ranges and start thinking about relationships between earlier and current accumulated state.

That is why problems like subarray sum equals k become much easier.

Difference arrays are the mirror image

Prefix sums answer many queries after one accumulation pass.

Difference arrays absorb many updates first, then recover final values with one accumulation pass.

That symmetry is worth remembering.

Why these problems feel “smart” in interviews

They feel smart because they replace repeated local work with a compact global model.

Interviewers like that because it shows pattern recognition, not just implementation ability.

Java-specific notes

In Java, arrays make these patterns especially clean.

The main risk is not syntax but indexing discipline.

If you clearly define what prefix[i] means and what your range boundaries mean, the implementation becomes straightforward.

8. Interview Questions

Q: Why use prefix[0] = 0?
A: It makes range formulas cleaner and naturally supports ranges starting at index 0.

Q: What is the main idea behind subarray sum equals k?
A: Track the current prefix sum and count how many earlier prefix sums equal currentPrefix - k.

Q: Why is the base case seen.put(0, 1) important?
A: It accounts for subarrays that begin at index 0 and already sum to the target.

Q: What is the core idea of a difference array?
A: Record where a range update starts and where it stops, then reconstruct final values with a running sum.

Q: What is a frequent source of bugs here?
A: Off-by-one indexing mistakes and unclear boundary conventions.

9. Glossary

Term Definition
prefix sum Accumulated total from the start up to a point.
running sum Total built incrementally while scanning left to right.
range query A question about a contiguous interval, such as sum from left to right.
difference array A structure that records boundary changes for range updates.
preprocessing Upfront work that makes later operations cheaper.
materialization Reconstructing final values from compressed state.
off-by-one error A bug caused by shifting a boundary by one index.

10. Cheatsheet

  • Repeated range sums often suggest prefix sums.
  • Repeated range increments often suggest difference arrays.
  • A clean prefix convention is prefix[0] = 0.
  • Inclusive range sum is often prefix[right + 1] - prefix[left].
  • For subarray sum equals k, look for currentPrefix - k in a map.
  • Seed prefix maps with 0 -> 1 when needed.
  • Difference updates start at left and stop at right + 1.
  • Always guard the right + 1 boundary.
  • Negative numbers often break simple sliding-window reasoning.
  • Shared work is the signal that preprocessing may help.

🎼 Games

10 questions