Prefix Sum Pattern

The Prefix Sum pattern involves preprocessing an array to create a new array where each element at index i represents the sum of all elements from the start of the array up to index i. This technique allows for efficient sum queries on subarrays in constant time.

When to Use This Pattern

When you need to perform multiple sum queries on subarrays
When you need to calculate cumulative sums efficiently
When dealing with range sum queries
When optimizing problems that require repeated subarray sum calculations

Key Concepts

The fundamental idea is to precompute a prefix sum array P where:

\[ P[i] = \sum_{j=0}^{i} A[j] = A[0] + A[1] + \ldots + A[i] \]

Once we have this prefix array, we can calculate the sum of any subarray from index i to j in constant time using:

\[ \text{Sum}(i, j) = P[j] - P[i-1] \]

For the special case when i = 0, the sum is simply P[j].

Visual Representation

Prefix Sum Array Visualization

Query Example: Sum from index 1 to 3 = P[3] - P[0] = 10 - 1 = 9

This represents: A[1] + A[2] + A[3] = 2 + 3 + 4 = 9

Step-by-Step Simulation

Let's trace through building a prefix sum array with A = [1, 2, 3, 4, 5]:

Initial Array A: [1, 2, 3, 4, 5]
Step 0: P[0] = A[0] = 1
P = [1, _, _, _, _]
Step 1: P[1] = P[0] + A[1] = 1 + 2 = 3
P = [1, 3, _, _, _]
Step 2: P[2] = P[1] + A[2] = 3 + 3 = 6
P = [1, 3, 6, _, _]
Step 3: P[3] = P[2] + A[3] = 6 + 4 = 10
P = [1, 3, 6, 10, _]
Step 4: P[4] = P[3] + A[4] = 10 + 5 = 15
P = [1, 3, 6, 10, 15]
Final Prefix Array: [1, 3, 6, 10, 15]

Query Simulation

Now let's query the sum from index 1 to 3:

Query: Sum from index 1 to 3 (inclusive)
Direct calculation: A[1] + A[2] + A[3] = 2 + 3 + 4 = 9
Using prefix sum: P[3] - P[0] = 10 - 1 = 9
Why this works:
P[3] = A[0] + A[1] + A[2] + A[3] = 1 + 2 + 3 + 4 = 10
P[0] = A[0] = 1
P[3] - P[0] = (A[0] + A[1] + A[2] + A[3]) - A[0] = A[1] + A[2] + A[3] = 9

Algorithm Section

Building the Prefix Sum Array

The algorithm to build a prefix sum array is straightforward:

function buildPrefixSum(arr):
    n = length(arr)
    prefix = new array of size n
    prefix[0] = arr[0]
    
    for i from 1 to n-1:
        prefix[i] = prefix[i-1] + arr[i]
    
    return prefix

Querying Subarray Sum

To get the sum of elements from index i to j (inclusive):

function rangeSum(prefix, i, j):
    if i == 0:
        return prefix[j]
    else:
        return prefix[j] - prefix[i-1]

Time and Space Complexity

Time Complexity: \(O(n)\) to build the prefix array, \(O(1)\) per query
Space Complexity: \(O(n)\) for storing the prefix array

Implementation in C++

#include <iostream>
#include <vector>
using namespace std;

// Build prefix sum array
vector<int> buildPrefixSum(const vector<int>& arr) {
    int n = arr.size();
    vector<int> prefix(n);
    prefix[0] = arr[0];
    
    // Build prefix sum: each element is sum of all previous + current
    for (int i = 1; i < n; i++) {
        prefix[i] = prefix[i-1] + arr[i];
    }
    
    return prefix;
}

// Query range sum from index i to j (inclusive)
int rangeSum(const vector<int>& prefix, int i, int j) {
    // If starting from index 0, prefix[j] is the answer
    if (i == 0) {
        return prefix[j];
    }
    // Otherwise, subtract prefix[i-1] to exclude elements before i
    return prefix[j] - prefix[i-1];
}

// Example usage
int main() {
    vector<int> arr = {1, 2, 3, 4, 5};
    vector<int> prefix = buildPrefixSum(arr);
    
    cout << "Prefix array: ";
    for (int val : prefix) {
        cout << val << " ";
    }
    cout << endl;
    
    cout << "Sum from index 1 to 3: " << rangeSum(prefix, 1, 3) << endl;
    // Output: 9
    
    return 0;
}

Detailed Explanation of C++ Implementation

buildPrefixSum: Creates a prefix array where each element stores the cumulative sum up to that index. We start with prefix[0] = arr[0], then for each subsequent index i, we add arr[i] to the previous prefix sum.
rangeSum: To get the sum from index i to j, we use prefix[j] - prefix[i-1]. This works because prefix[j] contains the sum of all elements from 0 to j, and prefix[i-1] contains the sum from 0 to i-1. Subtracting them gives us the sum from i to j.
Edge Case: When i = 0, we simply return prefix[j] since there are no elements before index 0 to subtract.

Practice Problems

Problem 1: Range Sum Query - Immutable (LeetCode #303)

Design a data structure that supports range sum queries on an immutable array.

Example:

Input: nums = [-2, 0, 3, -5, 2, -1]
sumRange(0, 2) -> 1
sumRange(2, 5) -> -1
sumRange(0, 5) -> -3

Solution in C++:

#include <vector>
using namespace std;

class NumArray {
private:
    vector<int> prefix;
    
public:
    NumArray(vector<int>& nums) {
        prefix.push_back(0);  // prefix[0] = 0 for easier calculation
        for (int num : nums) {
            // Each prefix[i+1] = sum of elements from 0 to i
            prefix.push_back(prefix.back() + num);
        }
    }
    
    int sumRange(int left, int right) {
        // prefix[right+1] contains sum from 0 to right
        // prefix[left] contains sum from 0 to left-1
        // Subtracting gives sum from left to right
        return prefix[right + 1] - prefix[left];
    }
};

Step-by-Step Simulation:

Input: nums = [-2, 0, 3, -5, 2, -1]
Building prefix array:
prefix[0] = 0 (initial)
prefix[1] = 0 + (-2) = -2
prefix[2] = -2 + 0 = -2
prefix[3] = -2 + 3 = 1
prefix[4] = 1 + (-5) = -4
prefix[5] = -4 + 2 = -2
prefix[6] = -2 + (-1) = -3
Query sumRange(0, 2):
prefix[3] - prefix[0] = 1 - 0 = 1 ✓
Query sumRange(2, 5):
prefix[6] - prefix[2] = -3 - (-2) = -1 ✓

Explanation: We build a prefix sum array where prefix[i] represents the sum of elements from index 0 to i-1. This allows us to calculate any range sum in O(1) time. The key insight is that sumRange(left, right) = prefix[right+1] - prefix[left], where prefix[0] = 0 to handle the edge case. By storing prefix[0] = 0, we can use the same formula for all queries without special cases.

Time Complexity: O(n) for initialization, O(1) per query

Space Complexity: O(n)

Problem 2: Contiguous Array (LeetCode #525)

Given a binary array, find the maximum length of a contiguous subarray with equal number of 0 and 1.

Example:

Input: [0,1,0]
Output: 2
Explanation: [0, 1] or [1, 0] is the longest contiguous subarray with equal number of 0 and 1.

Solution in C++:

#include <vector>
#include <unordered_map>
#include <algorithm>
using namespace std;

int findMaxLength(vector<int>& nums) {
    int count = 0;
    int max_length = 0;
    unordered_map<int, int> count_map;
    count_map[0] = -1;  // Initialize: count 0 at index -1
    
    for (int i = 0; i < nums.size(); i++) {
        // Treat 1 as +1, 0 as -1
        count += (nums[i] == 1) ? 1 : -1;
        
        // If we've seen this count before, we found a balanced subarray
        if (count_map.find(count) != count_map.end()) {
            // Length = current index - first occurrence index
            max_length = max(max_length, i - count_map[count]);
        } else {
            // Store first occurrence of this count
            count_map[count] = i;
        }
    }
    
    return max_length;
}

Step-by-Step Simulation:

Input: nums = [0, 1, 0]
count_map initialized: {0: -1}
i=0, nums[0]=0: count = -1
count_map[-1] not found, store: {0: -1, -1: 0}
i=1, nums[1]=1: count = -1 + 1 = 0
count_map[0] = -1 found! length = 1 - (-1) = 2
max_length = max(0, 2) = 2
i=2, nums[2]=0: count = 0 - 1 = -1
count_map[-1] = 0 found! length = 2 - 0 = 2
max_length = max(2, 2) = 2
Result: 2

Explanation: We use a prefix sum approach where we treat 1 as +1 and 0 as -1. When the count (prefix sum) repeats, it means we've found a subarray with equal 0s and 1s. We track the first occurrence of each count in a hash map. The length of the subarray is the difference between the current index and the first occurrence index. The key insight is that if count at index i equals count at index j, then the subarray from j+1 to i has equal 0s and 1s (net count = 0).

Time Complexity: O(n)

Space Complexity: O(n)

Problem 3: Subarray Sum Equals K (LeetCode #560)

Given an array of integers and an integer k, find the total number of continuous subarrays whose sum equals k.

Example:

Input: nums = [1,1,1], k = 2
Output: 2

Solution in C++:

#include <vector>
#include <unordered_map>
using namespace std;

int subarraySum(vector<int>& nums, int k) {
    int count = 0;
    int prefix_sum = 0;
    unordered_map<int, int> sum_map;
    sum_map[0] = 1;  // prefix sum 0 appears once initially
    
    for (int num : nums) {
        prefix_sum += num;
        
        // If prefix_sum - k exists, we found subarray(s) with sum k
        // prefix_sum - (prefix_sum - k) = k
        if (sum_map.find(prefix_sum - k) != sum_map.end()) {
            count += sum_map[prefix_sum - k];
        }
        
        // Update frequency of current prefix sum
        sum_map[prefix_sum]++;
    }
    
    return count;
}

Step-by-Step Simulation:

Input: nums = [1, 1, 1], k = 2
sum_map initialized: {0: 1}
i=0, num=1: prefix_sum = 1
Check: prefix_sum - k = 1 - 2 = -1 (not in map)
sum_map = {0: 1, 1: 1}
i=1, num=1: prefix_sum = 2
Check: prefix_sum - k = 2 - 2 = 0 (found! count += 1)
count = 1, sum_map = {0: 1, 1: 1, 2: 1}
i=2, num=1: prefix_sum = 3
Check: prefix_sum - k = 3 - 2 = 1 (found! count += 1)
count = 2, sum_map = {0: 1, 1: 1, 2: 1, 3: 1}
Result: 2
Subarrays found: [1,1] (indices 0-1) and [1,1] (indices 1-2)

Explanation: We use prefix sums and a hash map. For each prefix sum, we check if prefix_sum - k exists in our map. If it does, it means there's a subarray ending at the current position with sum k. The key insight is: if prefix_sum[j] - prefix_sum[i] = k, then the subarray from i+1 to j has sum k. We count frequencies because multiple subarrays can have the same prefix sum difference.

Time Complexity: O(n)

Space Complexity: O(n)

Problem 4: Maximum Size Subarray Sum Equals k

Given an array nums and a target value k, find the maximum length of a subarray that sums to k. If there isn't one, return 0 instead.

Example:

Input: nums = [1, -1, 5, -2, 3], k = 3
Output: 4
Explanation: The subarray [1, -1, 5, -2] sums to 3 and is the longest.

Solution in C++:

#include <vector>
#include <unordered_map>
#include <algorithm>
using namespace std;

int maxSubArrayLen(vector<int>& nums, int k) {
    int prefix_sum = 0;
    int max_length = 0;
    unordered_map<int, int> sum_map;
    sum_map[0] = -1;  // prefix sum 0 at index -1
    
    for (int i = 0; i < nums.size(); i++) {
        prefix_sum += nums[i];
        
        // If prefix_sum - k exists, calculate length
        if (sum_map.find(prefix_sum - k) != sum_map.end()) {
            // Length = current index - earliest index with prefix_sum - k
            max_length = max(max_length, i - sum_map[prefix_sum - k]);
        }
        
        // Store the earliest index for each prefix sum
        if (sum_map.find(prefix_sum) == sum_map.end()) {
            sum_map[prefix_sum] = i;
        }
    }
    
    return max_length;
}

Step-by-Step Simulation:

Input: nums = [1, -1, 5, -2, 3], k = 3
sum_map initialized: {0: -1}
i=0, num=1: prefix_sum = 1
Check: prefix_sum - k = 1 - 3 = -2 (not found)
Store: sum_map = {0: -1, 1: 0}
i=1, num=-1: prefix_sum = 0
Check: prefix_sum - k = 0 - 3 = -3 (not found)
0 already in map, don't update
i=2, num=5: prefix_sum = 5
Check: prefix_sum - k = 5 - 3 = 2 (not found)
Store: sum_map = {0: -1, 1: 0, 5: 2}
i=3, num=-2: prefix_sum = 3
Check: prefix_sum - k = 3 - 3 = 0 (found! length = 3 - (-1) = 4)
max_length = 4
i=4, num=3: prefix_sum = 6
Check: prefix_sum - k = 6 - 3 = 3 (not found)
Store: sum_map = {0: -1, 1: 0, 5: 2, 3: 3, 6: 4}
Result: 4 (subarray [1, -1, 5, -2])

Explanation: Similar to the previous problem, but we track the earliest index where each prefix sum occurs. This ensures we get the maximum length subarray. When we find prefix_sum - k in the map, the subarray from map[prefix_sum - k] + 1 to current index has sum k. We only store the first occurrence of each prefix sum to maximize the subarray length.

Time Complexity: O(n)

Space Complexity: O(n)

Problem 5: Product of Array Except Self (LeetCode #238)

Given an integer array nums, return an array answer such that answer[i] is equal to the product of all the elements of nums except nums[i]. You must write an algorithm that runs in O(n) time and without using the division operator.

Example:

Input: nums = [1,2,3,4]
Output: [24,12,8,6]

Solution in C++:

#include <vector>
using namespace std;

vector<int> productExceptSelf(vector<int>& nums) {
    int n = nums.size();
    vector<int> result(n, 1);
    
    // First pass: Calculate left prefix products
    // result[i] = product of all elements to the left of i
    for (int i = 1; i < n; i++) {
        result[i] = result[i-1] * nums[i-1];
    }
    
    // Second pass: Multiply by right prefix products
    // Accumulate right product on the fly to save space
    int right_product = 1;
    for (int i = n - 1; i >= 0; i--) {
        result[i] *= right_product;
        right_product *= nums[i];
    }
    
    return result;
}

Step-by-Step Simulation:

Input: nums = [1, 2, 3, 4]
First Pass (Left Products):
result[0] = 1 (no elements to the left)
result[1] = result[0] × nums[0] = 1 × 1 = 1
result[2] = result[1] × nums[1] = 1 × 2 = 2
result[3] = result[2] × nums[2] = 2 × 3 = 6
After first pass: result = [1, 1, 2, 6]
Second Pass (Right Products):
right_product = 1
i=3: result[3] = 6 × 1 = 6, right_product = 1 × 4 = 4
i=2: result[2] = 2 × 4 = 8, right_product = 4 × 3 = 12
i=1: result[1] = 1 × 12 = 12, right_product = 12 × 2 = 24
i=0: result[0] = 1 × 24 = 24, right_product = 24 × 1 = 24
Result: [24, 12, 8, 6]
Verification:
result[0] = 2×3×4 = 24 ✓
result[1] = 1×3×4 = 12 ✓
result[2] = 1×2×4 = 8 ✓
result[3] = 1×2×3 = 6 ✓

Explanation: We use a variation of prefix products. First pass: result[i] contains the product of all elements to the left. Second pass: we multiply by the product of all elements to the right. This avoids division and handles zeros correctly. The right product is accumulated on the fly to save space. This approach works because: product_except_self[i] = (product of left elements) × (product of right elements).

Time Complexity: O(n)

Space Complexity: O(1) excluding output array