Top K Elements Pattern

The Top K Elements pattern finds the top k largest or smallest elements in an array or stream of data. This pattern typically uses heaps (priority queues) or sorting to efficiently solve these problems.

When to Use This Pattern

  • Finding k largest/smallest elements
  • Finding k most frequent elements
  • Finding k closest points
  • Problems involving "top k" or "k largest/smallest"
  • Streaming data where you need to maintain top k elements

Key Concepts

Heap (Priority Queue)

  • Min Heap: Root is minimum. Use for finding k largest (keep size k).
  • Max Heap: Root is maximum. Use for finding k smallest (keep size k).

For k largest: maintain a min heap of size k. For k smallest: maintain a max heap of size k.

Algorithm Section

Find K Largest Elements

function findKLargest(nums, k):
    min_heap = new MinHeap()
    
    for num in nums:
        if min_heap.size() < k:
            min_heap.push(num)
        else if num > min_heap.peek():
            min_heap.pop()
            min_heap.push(num)
    
    return min_heap.toArray()

Implementation in C++

#include <vector>
#include <queue>
#include <algorithm>
using namespace std;

vector<int> findKLargest(vector<int>& nums, int k) {
    // Min heap: smallest element at top
    priority_queue<int, vector<int>, greater<int>> min_heap;
    
    for (int num : nums) {
        if (min_heap.size() < k) {
            min_heap.push(num);
        } else if (num > min_heap.top()) {
            // Replace smallest with current number
            min_heap.pop();
            min_heap.push(num);
        }
    }
    
    // Convert heap to vector
    vector<int> result;
    while (!min_heap.empty()) {
        result.push_back(min_heap.top());
        min_heap.pop();
    }
    
    return result;
}

Practice Problems

Problem 1: Kth Largest Element in an Array (LeetCode #215)

Given an integer array nums and an integer k, return the kth largest element in the array.

Example:

Input: nums = [3,2,1,5,6,4], k = 2
Output: 5

Solution in C++:

#include <vector>
#include <queue>
using namespace std;

int findKthLargest(vector<int>& nums, int k) {
    // Min heap: keeps k largest elements, smallest at top
    priority_queue<int, vector<int>, greater<int>> min_heap;
    
    for (int num : nums) {
        if (min_heap.size() < k) {
            min_heap.push(num);
        } else if (num > min_heap.top()) {
            // Current number is larger than smallest in heap
            min_heap.pop();
            min_heap.push(num);
        }
    }
    
    // Root of min heap is kth largest
    return min_heap.top();
}

Step-by-Step Simulation:

Input: nums = [3, 2, 1, 5, 6, 4], k = 2

Processing:

num=3: heap=[3] (size < 2)

num=2: heap=[2,3] (size = 2, root=2)

num=1: heap=[2,3] (1 < 2, skip)

num=5: heap=[3,5] (5 > 2, replace 2 with 5)

num=6: heap=[5,6] (6 > 3, replace 3 with 6)

num=4: heap=[4,6] (4 > 5? No, but 4 > 5? Wait...)

Actually: 4 < 5, so skip

Result: 5 (2nd largest)

Explanation: We maintain a min heap of size k. The root is the kth largest element. When we see a number larger than the root, we replace it. After processing all numbers, the root is our answer. The min heap ensures we always have the k largest elements, with the smallest of those k at the root.

Time Complexity: O(n log k)

Space Complexity: O(k)

Problem 2: Top K Frequent Elements (LeetCode #347)

Given an integer array nums and an integer k, return the k most frequent elements.

Example:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Solution in C++:

#include <vector>
#include <unordered_map>
#include <queue>
#include <algorithm>
using namespace std;

vector<int> topKFrequent(vector<int>& nums, int k) {
    // Count frequencies
    unordered_map<int, int> count;
    for (int num : nums) {
        count[num]++;
    }
    
    // Min heap: pair of (frequency, number)
    // Compare by frequency (first element)
    auto compare = [](pair<int, int>& a, pair<int, int>& b) {
        return a.first > b.first;  // Min heap: smaller frequency at top
    };
    priority_queue<pair<int, int>, vector<pair<int, int>>, 
                   decltype(compare)> min_heap(compare);
    
    // Keep k most frequent elements
    for (auto& [num, freq] : count) {
        if (min_heap.size() < k) {
            min_heap.push({freq, num});
        } else if (freq > min_heap.top().first) {
            min_heap.pop();
            min_heap.push({freq, num});
        }
    }
    
    // Extract results
    vector<int> result;
    while (!min_heap.empty()) {
        result.push_back(min_heap.top().second);
        min_heap.pop();
    }
    
    return result;
}

Step-by-Step Simulation:

Input: nums = [1,1,1,2,2,3], k = 2

Frequencies: {1:3, 2:2, 3:1}

Processing:

num=1, freq=3: heap=[(3,1)]

num=2, freq=2: heap=[(2,2), (3,1)]

num=3, freq=1: heap=[(2,2), (3,1)] (1 < 2, skip)

Result: [2, 1] (most frequent)

Explanation: First, count frequencies using a hash map. Then use a min heap of size k to keep track of k most frequent elements. We compare by frequency. When we find an element with higher frequency than the smallest in the heap, we replace it. This ensures we always have the k most frequent elements. The min heap keeps the least frequent of the top k at the root.

Time Complexity: O(n + k log k)

Space Complexity: O(n)

Problem 3: Find K Pairs with Smallest Sums (LeetCode #373)

Given two sorted arrays nums1 and nums2, and an integer k, return the k pairs (u, v) with the smallest sums.

Example:

Input: nums1 = [1,7,11], nums2 = [2,4,6], k = 3
Output: [[1,2],[1,4],[1,6]]

Solution in C++:

#include <vector>
#include <queue>
#include <algorithm>
using namespace std;

vector<vector<int>> kSmallestPairs(vector<int>& nums1, 
                                        vector<int>& nums2, int k) {
    if (nums1.empty() || nums2.empty()) {
        return {};
    }
    
    // Min heap: (sum, index1, index2)
    priority_queue<vector<int>, vector<vector<int>>, 
                   greater<vector<int>>> heap;
    
    // Initialize with first element of nums1 paired with all nums2
    for (int j = 0; j < min((int)nums2.size(), k); j++) {
        heap.push({nums1[0] + nums2[j], 0, j});
    }
    
    vector<vector<int>> result;
    while (!heap.empty() && result.size() < k) {
        vector<int> top = heap.top();
        heap.pop();
        int i = top[1], j = top[2];
        result.push_back({nums1[i], nums2[j]});
        
        // Push next pair from nums1 with same nums2 element
        if (i + 1 < nums1.size()) {
            heap.push({nums1[i+1] + nums2[j], i+1, j});
        }
    }
    
    return result;
}

Explanation: We use a min heap to track pairs by their sum. We start with the first element of nums1 paired with all elements of nums2. Then we pop the smallest, add it to results, and push the next pair from nums1 with the same nums2 element. This ensures we process pairs in order of increasing sum. The heap maintains the smallest sums at the top, allowing us to efficiently find k smallest pairs.

Time Complexity: O(k log k)

Space Complexity: O(k)

Problem 4: K Closest Points to Origin (LeetCode #973)

Given an array of points, return the k closest points to the origin (0, 0).

Example:

Input: points = [[1,3],[-2,2]], k = 1
Output: [[-2,2]]

Solution in C++:

#include <vector>
#include <queue>
#include <cmath>
using namespace std;

vector<vector<int>> kClosest(vector<vector<int>>& points, int k) {
    // Max heap: keep k closest points
    // Use negative distance since priority_queue is max heap by default
    priority_queue<pair<int, vector<int>>> max_heap;
    
    for (auto& point : points) {
        // Calculate squared distance (avoid sqrt for efficiency)
        int dist = point[0] * point[0] + point[1] * point[1];
        
        if (max_heap.size() < k) {
            max_heap.push({dist, point});
        } else if (dist < max_heap.top().first) {
            // Current point is closer than farthest in heap
            max_heap.pop();
            max_heap.push({dist, point});
        }
    }
    
    // Extract results
    vector<vector<int>> result;
    while (!max_heap.empty()) {
        result.push_back(max_heap.top().second);
        max_heap.pop();
    }
    
    return result;
}

Explanation: We maintain a max heap of size k. The root has the largest distance among k closest points. When we find a point closer than the root, we replace it. We use squared distance to avoid square root calculation (since we only compare distances, squared distance works the same). The max heap ensures we always have the k closest points by keeping the farthest of those k at the root.

Time Complexity: O(n log k)

Space Complexity: O(k)

Problem 5: Merge K Sorted Lists (LeetCode #23)

You are given an array of k linked-lists lists, each linked-list is sorted in ascending order. Merge all the linked-lists into one sorted linked-list.

Example:

Input: lists = [[1,4,5],[1,3,4],[2,6]]
Output: [1,1,2,3,4,4,5,6]

Solution in C++:

#include <vector>
#include <queue>
using namespace std;

struct ListNode {
    int val;
    ListNode *next;
    ListNode() : val(0), next(nullptr) {}
    ListNode(int x) : val(x), next(nullptr) {}
    ListNode(int x, ListNode *next) : val(x), next(next) {}
};

ListNode* mergeKLists(vector<ListNode*>& lists) {
    // Min heap: (value, list_index, node)
    auto compare = [](pair<int, ListNode*>& a, pair<int, ListNode*>& b) {
        return a.first > b.first;  // Min heap
    };
    priority_queue<pair<int, ListNode*>, 
                   vector<pair<int, ListNode*>>, 
                   decltype(compare)> min_heap(compare);
    
    // Add first node of each list
    for (ListNode* node : lists) {
        if (node) {
            min_heap.push({node->val, node});
        }
    }
    
    ListNode* dummy = new ListNode(0);
    ListNode* current = dummy;
    
    while (!min_heap.empty()) {
        auto [val, node] = min_heap.top();
        min_heap.pop();
        
        current->next = node;
        current = current->next;
        
        // Push next node from same list
        if (node->next) {
            min_heap.push({node->next->val, node->next});
        }
    }
    
    return dummy->next;
}

Explanation: We use a min heap to always get the smallest element. We start by adding the first node of each list. Then we repeatedly pop the smallest, add it to result, and push the next node from that list. This efficiently merges k sorted lists by always processing the smallest available element. The heap ensures we process elements in sorted order.

Time Complexity: O(n log k) where n is total nodes

Space Complexity: O(k)