Quicksort and Quickhull Algorithms

This lecture section combines algorithmic theory and practice: Linux runtime measurement, time and memory complexity, recurrence solving with formal proofs, Quicksort partitioning variants, Haskell vs C/C++ implementation style, and Quickhull for the convex hull problem.

Topics of Discussion

Linux: measuring running time
Time and memory computational complexity
Bounding complexity with recurrences
Partition procedure and Quicksort
Partition variants: Hoare, 3-way, Median-of-3
Mathematical proofs: correctness and average case
Quicksort in Haskell vs C/C++
Quickhull for convex hull
Generics in C and C++
Problems by level (hidden solutions)

1) Linux Stuff: Measuring the Running Time

Use Linux tools to measure real execution behavior:

time — wall/user/system time
/usr/bin/time -v — memory footprint, page faults
perf stat — CPU-level counters (cycles, instructions, cache misses)

# Compile (C++)
g++ -O2 -std=c++17 quicksort.cpp -o quicksort

# Basic timing
time ./quicksort 1000000

# Detailed timing + memory report
/usr/bin/time -v ./quicksort 1000000

# Hardware counters (requires perf tools)
perf stat ./quicksort 1000000
                

Practical note: Run each test multiple times and report average ± standard deviation. Isolate input generation from algorithm timing using clock() or std::chrono inside the program.

#include <chrono>
#include <iostream>
using namespace std;
using namespace chrono;

int main() {
    vector<int> a = generateRandom(1000000);

    auto start = high_resolution_clock::now();
    quicksort(a, 0, (int)a.size() - 1);
    auto end   = high_resolution_clock::now();

    double ms = duration<double, milli>(end - start).count();
    cout << "Elapsed: " << ms << " ms\n";
}
                

2) Time and Memory Computational Complexity

Complexity describes the growth of resource usage as input size n increases, independently of hardware.

Algorithm	Best	Average	Worst	Aux Memory
Quicksort	O(n log n)	Θ(n log n)	Θ(n²)	O(log n) avg stack, O(n) worst
Quickhull	O(n log n)	~O(n log n)	O(n²)	O(n) recursion + point sets
Merge Sort	Θ(n log n)	Θ(n log n)	Θ(n log n)	O(n) auxiliary array
Insertion Sort	O(n)	Θ(n²)	Θ(n²)	O(1)

Important: Big-O hides constants. For real systems, cache locality, branch prediction, and memory allocation overhead can make an O(n log n) algorithm faster than another O(n log n) algorithm by a factor of 2–5×. Quicksort's in-place nature gives it excellent cache behavior.

3) How to Bound Computational Complexity (Recurrences)

Divide-and-conquer algorithms are naturally described by recurrence equations. The three main tools are: substitution, recursion tree, and the Master Theorem.

Master Theorem

For recurrences of the form \(T(n) = aT\!\left(\tfrac{n}{b}\right) + f(n)\), with \(a \ge 1\), \(b > 1\):

Let \(d = \log_b a\). Then:

Case 1: If \(f(n) = O(n^{d-\varepsilon})\) for some \(\varepsilon > 0\), then \(T(n) = \Theta(n^d)\)
Case 2: If \(f(n) = \Theta(n^d)\), then \(T(n) = \Theta(n^d \log n)\)
Case 3: If \(f(n) = \Omega(n^{d+\varepsilon})\) and regularity holds, then \(T(n) = \Theta(f(n))\)

Quicksort Recurrences

General recurrence (pivot splits into parts of size \(k\) and \(n-k-1\)):

\[T(n) = T(k) + T(n-k-1) + \Theta(n)\]

Balanced case \((k \approx n/2)\) — ideal pivot:

\[T(n) = 2T\!\left(\tfrac{n}{2}\right) + \Theta(n)\]

Master theorem: \(a=2,\, b=2,\, f(n)=\Theta(n)\), so \(d=\log_2 2 = 1\). Case 2 applies: \(T(n) = \Theta(n \log n)\).

Worst case \((k=0)\) — always smallest or largest element as pivot:

\[T(n) = T(n-1) + \Theta(n)\]

Unrolling: \(T(n) = \Theta(n) + \Theta(n-1) + \cdots + \Theta(1) = \Theta(n^2)\).

Unrolling the Worst-Case Recurrence (Step by Step)

Start with \(T(n) = T(n-1) + cn\) for some constant \(c > 0\):

\[T(n) = T(n-1) + cn\] \[= T(n-2) + c(n-1) + cn\] \[= T(n-3) + c(n-2) + c(n-1) + cn\] \[\vdots\] \[= T(1) + c \cdot 2 + c \cdot 3 + \cdots + cn\] \[= T(1) + c\sum_{k=2}^{n} k = T(1) + c\left(\frac{n(n+1)}{2} - 1\right) = \Theta(n^2)\]

4) Partition Procedure and Quicksort Algorithm

4a) Lomuto Partition (simple, educational)

Uses the last element as pivot. Maintains a boundary i such that a[lo..i] ≤ pivot.

#include <bits/stdc++.h>
using namespace std;

// Lomuto partition: pivot = a[hi]
// After: a[lo..p-1] <= a[p] <= a[p+1..hi]
// Returns: final pivot index p
int partitionLomuto(vector<int>& a, int lo, int hi) {
    int pivot = a[hi];
    int i = lo - 1;           // i: last index of "small" region
    for (int j = lo; j < hi; ++j) {
        if (a[j] <= pivot) {
            ++i;
            swap(a[i], a[j]); // grow "small" region
        }
    }
    swap(a[i + 1], a[hi]);    // place pivot in correct position
    return i + 1;
}

void quicksort(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int p = partitionLomuto(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}

int main() {
    vector<int> a = {9, 4, 7, 3, 1, 8, 2, 5, 6};
    quicksort(a, 0, (int)a.size() - 1);
    for (int x : a) cout << x << " ";  // 1 2 3 4 5 6 7 8 9
}
                

Step-by-Step Trace of Lomuto Partition

Array: [9, 4, 7, 3, 1, 8, 2, 5, 6] lo=0, hi=8, pivot=6 0 1 2 3 4 5 6 7 8 i = -1 j=0: a[0]=9 > 6 → no swap [9, 4, 7, 3, 1, 8, 2, 5, 6] i=-1 j=1: a[1]=4 ≤ 6 → i=0, swap(0,1) [4, 9, 7, 3, 1, 8, 2, 5, 6] i=0 j=2: a[2]=7 > 6 → no swap [4, 9, 7, 3, 1, 8, 2, 5, 6] i=0 j=3: a[3]=3 ≤ 6 → i=1, swap(1,3) [4, 3, 7, 9, 1, 8, 2, 5, 6] i=1 j=4: a[4]=1 ≤ 6 → i=2, swap(2,4) [4, 3, 1, 9, 7, 8, 2, 5, 6] i=2 j=5: a[5]=8 > 6 → no swap [4, 3, 1, 9, 7, 8, 2, 5, 6] i=2 j=6: a[6]=2 ≤ 6 → i=3, swap(3,6) [4, 3, 1, 2, 7, 8, 9, 5, 6] i=3 j=7: a[7]=5 ≤ 6 → i=4, swap(4,7) [4, 3, 1, 2, 5, 8, 9, 7, 6] i=4 Final: swap(i+1=5, hi=8): [4, 3, 1, 2, 5, 6, 9, 7, 8] Pivot 6 is now at index 5. ✓ All a[0..4] ≤ 6, all a[6..8] > 6.

5) Partition Variants

5a) Hoare Partition (faster in practice)

Hoare's original partition uses two pointers moving toward each other. It does roughly 3× fewer swaps than Lomuto on average. The pivot is placed somewhere in [lo, hi] — not necessarily at the returned index.

// Hoare partition: pivot = a[lo]
// Returns index j such that a[lo..j] <= pivot and a[j+1..hi] >= pivot
// NOTE: a[j] is NOT necessarily equal to pivot after return.
int partitionHoare(vector<int>& a, int lo, int hi) {
    int pivot = a[lo + (hi - lo) / 2]; // mid element as pivot
    int i = lo - 1;
    int j = hi + 1;
    while (true) {
        do { ++i; } while (a[i] < pivot);
        do { --j; } while (a[j] > pivot);
        if (i >= j) return j;
        swap(a[i], a[j]);
    }
}

void quicksortHoare(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int j = partitionHoare(a, lo, hi);
    quicksortHoare(a, lo, j);      // note: includes j (not j-1)
    quicksortHoare(a, j + 1, hi);
}

int main() {
    vector<int> a = {5, 3, 8, 1, 9, 2, 7, 4, 6};
    quicksortHoare(a, 0, (int)a.size() - 1);
    for (int x : a) cout << x << " ";  // 1 2 3 4 5 6 7 8 9
}
                

Hoare correctness subtlety: The returned index j does NOT split the array at the pivot itself — it splits into [lo..j] and [j+1..hi]. You must recurse on [lo, j] (not [lo, j-1]), otherwise elements may be skipped.

5b) 3-Way Partition (Dutch National Flag) — handles duplicates efficiently

When many duplicate keys exist, standard Quicksort degrades because equal elements are not collapsed. 3-way partition splits into three regions: a[lo..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..hi] > pivot. Equal elements are never recursed on.

// 3-way partition (Dijkstra's Dutch National Flag)
// After: a[lo..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..hi] > pivot
void partition3way(vector<int>& a, int lo, int hi, int& lt, int& gt) {
    int pivot = a[lo];
    lt = lo;
    gt = hi;
    int i = lo;
    while (i <= gt) {
        if      (a[i] < pivot) swap(a[lt++], a[i++]);
        else if (a[i] > pivot) swap(a[i],    a[gt--]);
        else                   ++i;
    }
}

void quicksort3way(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int lt, gt;
    partition3way(a, lo, hi, lt, gt);
    quicksort3way(a, lo, lt - 1);  // recurse on < region
    quicksort3way(a, gt + 1, hi);  // recurse on > region
    // a[lt..gt] all equal pivot — no recursion needed!
}

int main() {
    // Array with many duplicates — 3-way is optimal here
    vector<int> a = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 3, 5};
    quicksort3way(a, 0, (int)a.size() - 1);
    for (int x : a) cout << x << " ";
    // 1 1 2 3 3 3 4 5 5 5 5 6 9
}
                

Step-by-Step Trace of 3-Way Partition

Array: [3, 1, 4, 1, 3, 3, 2] lo=0, hi=6, pivot=3 0 1 2 3 4 5 6 lt=0, gt=6, i=0 i=0: a[0]=3 == pivot → i=1 [3, 1, 4, 1, 3, 3, 2] lt=0,gt=6 i=1: a[1]=1 < pivot → swap(lt=0,i=1), lt=1, i=2 [1, 3, 4, 1, 3, 3, 2] lt=1,gt=6 i=2: a[2]=4 > pivot → swap(i=2,gt=6), gt=5 [1, 3, 2, 1, 3, 3, 4] lt=1,gt=5 i=2: a[2]=2 < pivot → swap(lt=1,i=2), lt=2, i=3 [1, 2, 3, 1, 3, 3, 4] lt=2,gt=5 i=3: a[3]=1 < pivot → swap(lt=2,i=3), lt=3, i=4 [1, 2, 1, 3, 3, 3, 4] lt=3,gt=5 i=4: a[4]=3 == pivot → i=5 [1, 2, 1, 3, 3, 3, 4] lt=3,gt=5 i=5: a[5]=3 == pivot → i=6 [1, 2, 1, 3, 3, 3, 4] lt=3,gt=5 i=6 > gt=5 → stop Result: a[0..2]={1,2,1} < 3, a[3..5]={3,3,3} == 3, a[6]={4} > 3 ✓

5c) Median-of-Three Pivot Selection

Choose the pivot as the median of a[lo], a[mid], a[hi]. This eliminates worst-case behavior on already-sorted arrays and gives better splits on average.

// Returns index of median of a[lo], a[mid], a[hi]
int medianOfThree(vector<int>& a, int lo, int hi) {
    int mid = lo + (hi - lo) / 2;
    // Sort lo, mid, hi in place (network of 3 comparisons)
    if (a[lo] > a[mid]) swap(a[lo], a[mid]);
    if (a[lo] > a[hi])  swap(a[lo], a[hi]);
    if (a[mid] > a[hi]) swap(a[mid], a[hi]);
    // Now a[lo] <= a[mid] <= a[hi]
    // Place median at hi-1 to use as pivot (Lomuto-style)
    swap(a[mid], a[hi]);
    return hi;
}

// Optimized Quicksort: median-of-3 + insertion sort for small n
void insertionSort(vector<int>& a, int lo, int hi) {
    for (int i = lo + 1; i <= hi; ++i) {
        int key = a[i];
        int j = i - 1;
        while (j >= lo && a[j] > key) { a[j + 1] = a[j]; --j; }
        a[j + 1] = key;
    }
}

void quicksortOpt(vector<int>& a, int lo, int hi) {
    while (lo < hi) {
        if (hi - lo < 16) {        // small partitions: insertion sort
            insertionSort(a, lo, hi);
            return;
        }
        medianOfThree(a, lo, hi);   // pivot now at a[hi]
        int p = partitionLomuto(a, lo, hi);
        // Tail-call optimization: recurse on smaller side first
        if (p - lo < hi - p) {
            quicksortOpt(a, lo, p - 1);
            lo = p + 1;             // iterate on larger half
        } else {
            quicksortOpt(a, p + 1, hi);
            hi = p - 1;
        }
    }
}

int main() {
    vector<int> a = {5, 2, 8, 1, 9, 3, 7, 4, 6};
    quicksortOpt(a, 0, (int)a.size() - 1);
    for (int x : a) cout << x << " ";  // 1 2 3 4 5 6 7 8 9
}
                

5d) Randomized Quicksort

#include <random>
mt19937 rng(42);

int partitionRandom(vector<int>& a, int lo, int hi) {
    int pivotIdx = uniform_int_distribution<int>(lo, hi)(rng);
    swap(a[pivotIdx], a[hi]);
    return partitionLomuto(a, lo, hi);
}

void quicksortRandom(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int p = partitionRandom(a, lo, hi);
    quicksortRandom(a, lo, p - 1);
    quicksortRandom(a, p + 1, hi);
}
                

5e) Iterative Quicksort (no call-stack overflow)

void quicksortIterative(vector<int>& a) {
    if (a.empty()) return;
    stack<pair<int,int>> st;
    st.push({0, (int)a.size() - 1});
    while (!st.empty()) {
        auto [lo, hi] = st.top(); st.pop();
        if (lo >= hi) continue;
        int p = partitionLomuto(a, lo, hi);
        // Push larger partition last (so smaller is processed first)
        // This keeps stack depth O(log n) worst-case
        if (p - lo < hi - p) {
            st.push({p + 1, hi});
            st.push({lo, p - 1});
        } else {
            st.push({lo, p - 1});
            st.push({p + 1, hi});
        }
    }
}
                

6) Mathematical Proofs

6a) Proof of Lomuto Partition Correctness (Loop Invariant)

Loop invariant: At the start of each iteration with index j:

\(\forall k \in [\text{lo},\, i]:\; a[k] \le \text{pivot}\)
\(\forall k \in [i+1,\, j-1]:\; a[k] > \text{pivot}\)
\(a[\text{hi}] = \text{pivot}\)

Initialization: Before the loop, \(i = \text{lo} - 1\) and \(j = \text{lo}\). Both regions \([\text{lo}, i]\) and \([i+1, j-1]\) are empty — trivially satisfied.

Maintenance: At iteration \(j\):

If \(a[j] > \text{pivot}\): do nothing. Region \([i+1, j]\) is now all \(> \text{pivot}\). ✓
If \(a[j] \le \text{pivot}\): increment \(i\), swap \(a[i] \leftrightarrow a[j]\). Now \(a[i] \le \text{pivot}\), and \(a[j]\) (old \(a[i]\)) is moved to the \(>\) region. ✓

Termination: At \(j = \text{hi}\), the invariant gives \(a[\text{lo}..i] \le \text{pivot}\) and \(a[i+1..\text{hi}-1] > \text{pivot}\). After the final swap(a[i+1], a[hi]), the pivot lands at \(i+1\) with everything left \(\le\) it and everything right \(>\) it. \(\blacksquare\)

6b) Average-Case Analysis of Quicksort: \(\Theta(n \log n)\)

We analyze randomized Quicksort where the pivot is chosen uniformly at random. Let the elements be \(z_1 < z_2 < \cdots < z_n\) (their sorted order).

Claim: \(E[\text{# comparisons}] = \Theta(n \log n)\).

Key observation: Elements \(z_i\) and \(z_j\) (with \(i < j\) in sorted order) are compared at most once — only when one of them is chosen as pivot while both are still in the same subarray.

Define indicator random variable:

\[X_{ij} = \begin{cases}1 & z_i \text{ and } z_j \text{ are compared}\\ 0 & \text{otherwise}\end{cases}\]

Probability that \(z_i\) and \(z_j\) are compared:
Among the elements \(\{z_i, z_{i+1}, \ldots, z_j\}\), comparison happens iff \(z_i\) or \(z_j\) is the first of these elements chosen as pivot. Since each of the \(j - i + 1\) elements is equally likely to be chosen first:

\[\Pr[X_{ij} = 1] = \frac{2}{j - i + 1}\]

Expected total comparisons: By linearity of expectation:

\[E[C] = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \frac{2}{j-i+1} = \sum_{i=1}^{n-1}\sum_{k=2}^{n-i+1} \frac{2}{k}\]

Let \(d = j - i\). For each gap \(d\) from 1 to \(n-1\), there are \(n - d\) pairs with that gap:

\[E[C] = \sum_{d=1}^{n-1}(n-d)\cdot\frac{2}{d+1} \le 2n\sum_{d=1}^{n-1}\frac{1}{d+1} \le 2n\sum_{k=2}^{n}\frac{1}{k} \le 2n \ln n\]

For the lower bound: \(E[C] \ge 2\sum_{d=1}^{n/2}\frac{n-d}{d+1} \ge \frac{n}{2}\sum_{k=2}^{n/2}\frac{2}{k} \ge n\ln\frac{n}{2} = n(\ln n - \ln 2)\).

Therefore: \(E[C] = \Theta(n \log n)\). \(\blacksquare\)

6c) Worst-Case Stack Depth and Tail-Call Optimization

Problem: On a sorted array with a naive "always pick last element" strategy, every partition call peels off one element. Stack depth = \(O(n)\), causing stack overflow for \(n = 10^5\).

Fix — always recurse on the smaller half first:

After partitioning at index \(p\), compare sizes of \([lo, p-1]\) and \([p+1, hi]\).
Recurse on the smaller subarray; handle the larger with a loop (tail-call elimination).
This guarantees stack depth \(\le \log_2 n\) because each recursive call is on a subarray at most half the size of the current one.

Proof of \(O(\log n)\) stack bound: Let \(S(n)\) = max stack depth on size-\(n\) input. The recursive call is on at most \(\lfloor n/2 \rfloor\) elements, so \(S(n) \le S(n/2) + 1\), giving \(S(n) \le \log_2 n\). \(\blacksquare\)

7) Implementations: Haskell vs C/C++

Haskell (declarative, immutable lists)

quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
    quicksort [y | y <- xs, y <= x]
    ++ [x] ++
    quicksort [y | y <- xs, y > x]
                

Aspect	Haskell	C/C++
Style	Functional, concise	Imperative, explicit control
Memory behavior	Creates new lists — O(n) per level	In-place — O(1) extra per swap
Performance tuning	Relies on compiler + GC	Fine-grained cache/allocation control
Worst-case avoidance	Harder without mutation	Easy: randomize pivot, shuffle input
Readability	Very expressive	Verbose but transparent

8) Quickhull Algorithm for Convex Hull

The convex hull of a point set \(P\) is the smallest convex polygon containing all points. Quickhull finds it recursively — analogous to Quicksort — by always eliminating points that cannot be on the hull.

8a) Geometric Foundations

Cross Product and Signed Area

Given three points \(A\), \(B\), \(C\) in 2D, the cross product of vectors \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\) is:

\[\text{cross}(A, B, C) = (B_x - A_x)(C_y - A_y) - (B_y - A_y)(C_x - A_x)\]

This equals the signed area of the parallelogram spanned by \(\overrightarrow{AB}\) and \(\overrightarrow{AC}\):

\(\text{cross} > 0\): \(C\) is to the left of directed line \(A \to B\) (counterclockwise turn)
\(\text{cross} < 0\): \(C\) is to the right (clockwise turn)
\(\text{cross} = 0\): \(C\) is collinear with \(A\) and \(B\)

Distance from Point to Line

The perpendicular distance from point \(C\) to line \(AB\) is:

\[d(C, AB) = \frac{|\text{cross}(A, B, C)|}{|AB|} = \frac{|(B_x-A_x)(C_y-A_y) - (B_y-A_y)(C_x-A_x)|}{\sqrt{(B_x-A_x)^2+(B_y-A_y)^2}}\]

Since we only need the farthest point and \(|AB|\) is constant within one recursive call, we can compare \(|\text{cross}(A,B,C)|\) directly — no square root needed!

struct Point { double x, y; };

// Signed area of parallelogram formed by AB and AC
// > 0: C is left of A->B,  < 0: C is right,  = 0: collinear
double cross(const Point& a, const Point& b, const Point& c) {
    return (b.x - a.x) * (c.y - a.y)
         - (b.y - a.y) * (c.x - a.x);
}

// Squared distance between two points
double dist2(const Point& a, const Point& b) {
    double dx = b.x - a.x, dy = b.y - a.y;
    return dx*dx + dy*dy;
}
                

8b) Quickhull Algorithm — Full C++ Implementation

The algorithm works in two symmetric halves: above and below the baseline \(AB\).

#include <bits/stdc++.h>
using namespace std;

struct Point { double x, y; };

double cross(const Point& a, const Point& b, const Point& c) {
    return (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x);
}

// Recursively find hull points on the left side of directed line A->B.
// `pts` contains only points with cross(A,B,P) > 0 (left of AB).
void hullRec(vector<Point>& pts, Point a, Point b, vector<Point>& hull) {
    if (pts.empty()) return;

    // Step 1: Find point P farthest from line AB
    // (maximize |cross(A,B,P)| = proportional to distance since |AB| is constant)
    int farthestIdx = 0;
    double maxCross = 0.0;
    for (int i = 0; i < (int)pts.size(); ++i) {
        double c = cross(a, b, pts[i]);
        if (c > maxCross) {
            maxCross = c;
            farthestIdx = i;
        }
    }
    Point p = pts[farthestIdx];

    // Step 2: All points INSIDE triangle A-P-B cannot be on hull — discard them.
    // Collect points to the left of A->P and points to the left of P->B.
    vector<Point> leftAP, leftPB;
    for (int i = 0; i < (int)pts.size(); ++i) {
        if (i == farthestIdx) continue;
        if (cross(a, p, pts[i]) > 0) leftAP.push_back(pts[i]);
        if (cross(p, b, pts[i]) > 0) leftPB.push_back(pts[i]);
        // If on neither side, the point is inside triangle A-P-B: discard
    }

    // Step 3: Recurse on each subproblem, then add P to hull in order
    hullRec(leftAP, a, p, hull);
    hull.push_back(p);
    hullRec(leftPB, p, b, hull);
}

// Main Quickhull: returns convex hull in counterclockwise order
vector<Point> quickhull(vector<Point>& pts) {
    int n = (int)pts.size();
    if (n < 3) return pts;

    // Step 1: Find leftmost (A) and rightmost (B) points
    Point a = *min_element(pts.begin(), pts.end(),
                           [](const Point& p, const Point& q){ return p.x < q.x; });
    Point b = *max_element(pts.begin(), pts.end(),
                           [](const Point& p, const Point& q){ return p.x < q.x; });

    // Step 2: Partition into above-AB and below-AB
    vector<Point> upper, lower;
    for (auto& p : pts) {
        double c = cross(a, b, p);
        if (c > 0) upper.push_back(p);  // left of A->B = above
        if (c < 0) lower.push_back(p);  // right of A->B = below
        // collinear points on AB are interior — skip
    }

    // Step 3: Build hull counterclockwise: A, upper hull, B, lower hull
    vector<Point> hull;
    hull.push_back(a);
    hullRec(upper, a, b, hull);   // A -> ... -> B along upper
    hull.push_back(b);
    hullRec(lower, b, a, hull);   // B -> ... -> A along lower
    return hull;
}

int main() {
    vector<Point> pts = {
        {0,0},{1,3},{2,2},{4,4},{0,2},{3,1},{3,3},{1,1},{4,0},{2,4}
    };
    vector<Point> hull = quickhull(pts);
    cout << "Convex hull (" << hull.size() << " points):\n";
    for (auto& p : hull)
        cout << "  (" << p.x << ", " << p.y << ")\n";
    // Expected hull vertices (CCW): (0,0),(4,0),(4,4),(2,4),(0,2)
}
                

8c) Step-by-Step Trace

Points: {(0,0),(1,3),(2,2),(4,4),(0,2),(3,1),(3,3),(1,1),(4,0),(2,4)} Step 1: Leftmost A=(0,0), Rightmost B=(4,0) Step 2: Partition by line A->B (the x-axis here): Upper (left of A->B, y > 0): (1,3),(2,2),(4,4),(0,2),(3,1),(3,3),(1,1),(2,4) Lower (right of A->B): none (all points have y >= 0 in this example) Step 3: hullRec(upper, A=(0,0), B=(4,0)): cross(A,B,P) = (4)(P.y) - (0)(P.x-0) = 4*P.y Farthest from line AB: maximize P.y → (2,4) and (4,4) tie → pick first with max Say P = (4,4) → cross = 4*4 = 16 Triangle A=(0,0), P=(4,4), B=(4,0): Points left of A->P: cross((0,0),(4,4),Q) = 4*Qy - 4*Qx = 4(Qy-Qx) > 0 → Qy > Qx (1,3)✓ (0,2)✓ (2,4)✓ Points left of P->B: cross((4,4),(4,0),Q) = 0*(Qy-4) - (-4)*(Qx-4) = -4(Qx-4) > 0 → Qx < 4 (3,1)✓ (3,3)✓ Points inside triangle (discard): (2,2),(1,1) Recurse: hullRec([{1,3},{0,2},{2,4}], A, P=(4,4)): Farthest from line (0,0)->(4,4): cross = 4*Qy-4*Qx = 4(Qy-Qx) (0,2): 4(2-0)=8, (1,3): 4(3-1)=8, (2,4): 4(4-2)=8 (tie → (0,2)) ... continues recursively ... Hull (CCW): (0,0), ... upper hull points ..., (4,0)

8d) Complexity Analysis of Quickhull

Average case: \(O(n \log n)\)

At each level of recursion, every point is examined once per level. The expected number of levels is \(O(\log n)\) when the farthest point splits the problem "evenly" — analogous to a balanced Quicksort split. Formally, for uniformly random points inside a convex region, the expected number of recursive calls is \(O(n \log n)\) total.

Worst case: \(O(n^2)\)

Consider \(n\) points on a semicircle. The farthest point from chord \(AB\) splits the arc into two arcs of sizes \(1\) and \(n-2\) — exactly like Quicksort's worst case with a sorted array. The recurrence is:

\[T(n) = T(1) + T(n-2) + O(n) = O(n^2)\]

In practice, Quickhull avoids this with random point distributions, but adversarial input (circular arrangements) triggers \(O(n^2)\).

Key insight — triangle elimination: Any point strictly inside triangle \(A\text{-}P\text{-}B\) cannot be on the convex hull (it is dominated by the three vertices). Quickhull discards these in \(O(n)\) per level, which is why it achieves sub-quadratic expected performance.

Why is Quickhull fast in practice? For typical inputs (random, clustered), the farthest-point triangles eliminate a constant fraction of points at each level, giving expected \(O(n \log n)\). The algorithm also avoids geometric predicates beyond the cross product, making it cache-friendly and fast in absolute terms.

8e) Correctness Argument

Lemma: If point \(Q\) lies strictly inside triangle \(A\text{-}P\text{-}B\) (where \(P\) is the farthest point from \(AB\)), then \(Q\) is not a vertex of the convex hull.

Proof: Since \(Q\) is strictly inside the triangle, it is a convex combination of \(A\), \(P\), \(B\): \(Q = \alpha A + \beta P + \gamma B\) with \(\alpha,\beta,\gamma > 0\) and \(\alpha+\beta+\gamma=1\). Therefore \(Q\) is in the interior of the convex hull (which contains \(A\), \(P\), \(B\)), so it cannot be an extreme point. \(\blacksquare\)

Corollary: Quickhull never discards a hull point. Combined with the fact that it adds boundary candidates at each level, the algorithm is correct.

8f) Comparison with Other Convex Hull Algorithms

Algorithm	Time (avg)	Time (worst)	Notes
Quickhull	O(n log n)	O(n²)	Fast in practice, simple to implement
Graham Scan	O(n log n)	O(n log n)	Sort + angular sweep, always optimal
Jarvis March	O(nh)	O(n²)	h = hull size; good when h is small
Chan's Algorithm	O(n log h)	O(n log h)	Optimal output-sensitive algorithm

9) Generics in C and C++

Generics let you write reusable algorithms and data structures that work with many types while keeping type safety and performance.

9a) What "generics" means in C vs C++

Language	Main Generic Mechanism	Type Checking	Runtime Cost
C	Macros, `void*`, `_Generic` (C11)	Mostly weak/manual	Usually zero for macros, potential casts
C++	Templates (functions/classes), concepts	Strong compile-time checking	Zero-cost abstractions in optimized builds

9b) Generics in C

C has no templates, so generic style is built from three common techniques:

Preprocessor macros (text substitution)
void* + element size + callbacks (e.g., qsort)
_Generic for compile-time type selection (C11)

Example: macro-based generic max

#define MAX(a, b) ((a) > (b) ? (a) : (b))

int main() {
    int x = MAX(3, 7);        // int
    double y = MAX(2.5, 1.9); // double
}
                

Macro caveat: Macros are not type-safe and may evaluate arguments multiple times. For example, MAX(i++, j++) can cause subtle bugs.

Example: `qsort` with comparator

#include <stdlib.h>

int cmpInt(const void* a, const void* b) {
    int x = *(const int*)a, y = *(const int*)b;
    return (x > y) - (x < y);
}

int main() {
    int arr[] = {5, 1, 4, 2, 3};
    qsort(arr, 5, sizeof(int), cmpInt);
}
                

9c) Generics in C++

C++ templates generate type-specific code at compile time and preserve static typing.

Example: generic function template

template <typename T>
T myMax(const T& a, const T& b) {
    return (a < b) ? b : a;
}

int main() {
    int a = myMax(2, 9);
    double b = myMax(3.14, 2.71);
}
                

Example: generic class template

template <typename T>
class Stack {
    vector<T> data;
public:
    void push(const T& x) { data.push_back(x); }
    void pop() { if (!data.empty()) data.pop_back(); }
    const T& top() const { return data.back(); }
    bool empty() const { return data.empty(); }
};
                

9d) Core principles

Abstraction: Write logic once, reuse for many types.
Type constraints: In C++ templates/concepts enforce required operations; in C this is manual.
Separation of algorithm and data: Same algorithm can run on different element types.
Performance: C++ templates are often inlined and optimized as if handwritten.

9e) Similarities and differences

Aspect	Similarity	Difference
Goal	Both aim at reusable generic code	C does it via patterns; C++ has native language support
Safety	Both can be used safely with discipline	C++ checks types at compile time; C often relies on casts
Error discovery	Both may produce hard-to-read errors	C errors often runtime bugs; C++ template errors are compile-time
Performance	Both can be fast	C++ templates usually give safer zero-cost abstractions
Ecosystem	Both support generic libraries	C++ STL is deeply template-driven; C relies on library conventions

Rule of thumb: In C, prefer small, predictable generic patterns (qsort-style interfaces, careful macros). In C++, prefer templates and concepts for safer and clearer generic programming.

10) Problems by Level

Easy

Problem E1: Partition Index Easy

Task: Given array [9, 3, 7, 1, 8] with pivot at last index (8), perform Lomuto partition and return the final pivot index. Trace each swap.

Array: [9, 3, 7, 1, 8] pivot=8, i=-1 j=0: a[0]=9 > 8 → no swap [9, 3, 7, 1, 8] i=-1 j=1: a[1]=3 ≤ 8 → i=0, swap(0,1) [3, 9, 7, 1, 8] i=0 j=2: a[2]=7 ≤ 8 → i=1, swap(1,2) [3, 7, 9, 1, 8] i=1 j=3: a[3]=1 ≤ 8 → i=2, swap(2,3) [3, 7, 1, 9, 8] i=2 Final swap(i+1=3, hi=4): [3, 7, 1, 8, 9] Pivot 8 at index 3. ✓

int partitionLomuto(vector<int>& a, int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; ++j)
        if (a[j] <= pivot) swap(a[++i], a[j]);
    swap(a[i + 1], a[hi]);
    return i + 1;
}

int main() {
    vector<int> a = {9, 3, 7, 1, 8};
    int idx = partitionLomuto(a, 0, 4);
    cout << "Pivot index: " << idx << "\n"; // 3
    for (int x : a) cout << x << " "; // 3 7 1 8 9
}
                        

Problem E2: Count Quicksort Comparisons Easy

Task: Instrument Quicksort to count comparisons. Compare: (a) random array, (b) sorted array, (c) reverse-sorted array of size 1000. Observe \(O(n \log n)\) vs \(O(n^2)\) behavior.

long long cmpCount = 0;

int partitionCount(vector<int>& a, int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; ++j) {
        ++cmpCount;
        if (a[j] <= pivot) swap(a[++i], a[j]);
    }
    swap(a[i + 1], a[hi]);
    return i + 1;
}

void quicksortCount(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int p = partitionCount(a, lo, hi);
    quicksortCount(a, lo, p - 1);
    quicksortCount(a, p + 1, hi);
}

int main() {
    int n = 1000;
    // Random
    vector<int> a(n); iota(a.begin(), a.end(), 1);
    shuffle(a.begin(), a.end(), mt19937(42));
    cmpCount = 0;
    quicksortCount(a, 0, n-1);
    cout << "Random: " << cmpCount << " comparisons\n";
    // Expected: ~n*log2(n) ≈ 10000

    // Sorted (worst case for last-element pivot)
    iota(a.begin(), a.end(), 1);
    cmpCount = 0;
    quicksortCount(a, 0, n-1);
    cout << "Sorted: " << cmpCount << " comparisons\n";
    // Expected: ~n*(n-1)/2 ≈ 499500

    // Reverse sorted
    iota(a.rbegin(), a.rend(), 1);
    cmpCount = 0;
    quicksortCount(a, 0, n-1);
    cout << "Reverse: " << cmpCount << " comparisons\n";
    // Expected: ~n*(n-1)/2 ≈ 499500
}
                        

Problem E3: Cross Product Sign Easy

Task: Given three points, determine if they make a left turn, right turn, or are collinear using the cross product.

Input: A=(0,0), B=(4,0), C=(2,3)
Output: Left turn (C is above line AB)

struct Point { double x, y; };

double cross(const Point& a, const Point& b, const Point& c) {
    return (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x);
}

string turn(const Point& a, const Point& b, const Point& c) {
    double cr = cross(a, b, c);
    if (cr > 0) return "Left turn (CCW)";
    if (cr < 0) return "Right turn (CW)";
    return "Collinear";
}

int main() {
    // Manual check: cross((0,0),(4,0),(2,3))
    // = (4-0)*(3-0) - (0-0)*(2-0) = 12 - 0 = 12 > 0 → Left turn
    cout << turn({0,0},{4,0},{2,3}) << "\n"; // Left turn (CCW)
    cout << turn({0,0},{4,0},{2,-1}) << "\n"; // Right turn (CW)
    cout << turn({0,0},{4,0},{2,0}) << "\n";  // Collinear
}
                        

Medium

Problem M1: Compare Partition Strategies Medium

Task: Implement both Lomuto and Hoare partitions. Count swaps for each on the same 10,000-element random array. Verify that Hoare does approximately 3× fewer swaps.

long long swapCount = 0;

int partitionLomutoC(vector<int>& a, int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; ++j)
        if (a[j] <= pivot) { ++swapCount; swap(a[++i], a[j]); }
    ++swapCount;
    swap(a[i+1], a[hi]);
    return i + 1;
}

int partitionHoareC(vector<int>& a, int lo, int hi) {
    int pivot = a[lo + (hi - lo) / 2];
    int i = lo - 1, j = hi + 1;
    while (true) {
        do ++i; while (a[i] < pivot);
        do --j; while (a[j] > pivot);
        if (i >= j) return j;
        ++swapCount;
        swap(a[i], a[j]);
    }
}

int main() {
    int n = 10000;
    vector<int> a(n), b(n);
    iota(a.begin(), a.end(), 1);
    shuffle(a.begin(), a.end(), mt19937(42));
    b = a;

    // Lomuto
    swapCount = 0;
    function<void(int,int)> qs1 = [&](int lo, int hi) {
        if (lo >= hi) return;
        int p = partitionLomutoC(a, lo, hi);
        qs1(lo, p-1); qs1(p+1, hi);
    };
    qs1(0, n-1);
    long long lomutoSwaps = swapCount;

    // Hoare
    swapCount = 0;
    function<void(int,int)> qs2 = [&](int lo, int hi) {
        if (lo >= hi) return;
        int j = partitionHoareC(b, lo, hi);
        qs2(lo, j); qs2(j+1, hi);
    };
    qs2(0, n-1);
    long long hoareSwaps = swapCount;

    cout << "Lomuto swaps: " << lomutoSwaps << "\n";
    cout << "Hoare swaps:  " << hoareSwaps << "\n";
    cout << "Ratio: " << (double)lomutoSwaps / hoareSwaps << "x\n";
    // Typical output: Lomuto ~1.5x to 3x more swaps than Hoare
}
                        

Problem M2: Solve the Recurrences Medium

Task: Give tight bounds for: (a) \(T(n)=2T(n/2)+n\), (b) \(T(n)=T(n-1)+n\), (c) \(T(n)=T(n/2)+1\), (d) \(T(n)=4T(n/2)+n^2\). Show all work.

(a) \(T(n) = 2T(n/2) + n\) — Master Theorem, \(a=2,b=2,d=\log_2 2 = 1\), \(f(n)=n=\Theta(n^1)\) → Case 2: \(T(n) = \Theta(n \log n)\)

(b) \(T(n) = T(n-1) + n\) — Unroll: \(T(n) = n + (n-1) + \cdots + 1 = \frac{n(n+1)}{2} = \Theta(n^2)\)

(c) \(T(n) = T(n/2) + 1\) — Master Theorem, \(a=1,b=2,d=\log_2 1=0\), \(f(n)=1=\Theta(n^0)\) → Case 2: \(T(n) = \Theta(\log n)\)

(d) \(T(n) = 4T(n/2) + n^2\) — Master Theorem, \(a=4,b=2,d=\log_2 4=2\), \(f(n)=n^2=\Theta(n^2)\) → Case 2: \(T(n) = \Theta(n^2 \log n)\)

Problem M3: 3-Way Quicksort on Duplicates Medium

Task: Sort [5,3,5,1,5,2,5,4,5] using 3-way partition. Count how many comparisons are saved vs standard Quicksort when many elements equal the pivot.

Array: [5,3,5,1,5,2,5,4,5] pivot=5 (first element) 3-way partition result: [1,2,3,4] | [5,5,5,5,5] | [] lt=0 gt=4 (no "greater" elements) Standard Quicksort would recurse on [1,2,3,4,5,5,5,5,5], comparing 5 against itself 4 more times before placing each. 3-way: only recurse on [1,2,3,4] (size 4) — 5×5=25 comparisons saved.

void partition3way(vector<int>& a, int lo, int hi, int& lt, int& gt) {
    int pivot = a[lo];
    lt = lo; gt = hi; int i = lo;
    while (i <= gt) {
        if      (a[i] < pivot) swap(a[lt++], a[i++]);
        else if (a[i] > pivot) swap(a[i],    a[gt--]);
        else                   ++i;
    }
}

void qs3(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int lt, gt;
    partition3way(a, lo, hi, lt, gt);
    qs3(a, lo, lt - 1);
    qs3(a, gt + 1, hi);
}

int main() {
    vector<int> a = {5,3,5,1,5,2,5,4,5};
    qs3(a, 0, (int)a.size()-1);
    for (int x : a) cout << x << " "; // 1 2 3 4 5 5 5 5 5
}
                        

Hard

Problem H1: Iterative Quicksort with O(log n) Stack Hard

Task: Implement iterative Quicksort that guarantees \(O(\log n)\) stack space by always pushing the larger partition last.

// Iterative Quicksort with guaranteed O(log n) stack depth.
// Key: always push the LARGER subarray first, process SMALLER subarray next.
// This ensures the stack never holds more than log2(n) entries at once.
void quicksortIterative(vector<int>& a) {
    if (a.size() < 2) return;
    vector<pair<int,int>> st;
    st.push_back({0, (int)a.size() - 1});

    while (!st.empty()) {
        auto [lo, hi] = st.back(); st.pop_back();
        if (lo >= hi) continue;

        int p = partitionLomuto(a, lo, hi);
        int leftSize  = p - 1 - lo;
        int rightSize = hi - (p + 1);

        // Push larger half first (processed later / deeper in stack)
        // Push smaller half second (processed next / stack top)
        if (leftSize > rightSize) {
            if (lo < p - 1)  st.push_back({lo, p - 1});
            if (p + 1 < hi)  st.push_back({p + 1, hi});
        } else {
            if (p + 1 < hi)  st.push_back({p + 1, hi});
            if (lo < p - 1)  st.push_back({lo, p - 1});
        }
    }
}

int main() {
    vector<int> a(10000);
    iota(a.begin(), a.end(), 1);
    shuffle(a.begin(), a.end(), mt19937(42));
    quicksortIterative(a);
    cout << (is_sorted(a.begin(), a.end()) ? "Sorted!" : "Error") << "\n";
}
                        

Why O(log n) stack: Each recursive call that goes onto the stack is on a subarray at most half the size of the current subarray (since we push larger last and process smaller next). So stack depth ≤ log₂(n).

Problem H2: Full Quickhull with Deduplication Hard

Task: Implement complete Quickhull returning hull vertices in counterclockwise order, handling collinear points and duplicate input points correctly.

#include <bits/stdc++.h>
using namespace std;

struct Point {
    double x, y;
    bool operator<(const Point& o) const {
        return x < o.x || (x == o.x && y < o.y);
    }
    bool operator==(const Point& o) const {
        return x == o.x && y == o.y;
    }
};

double cross(const Point& a, const Point& b, const Point& c) {
    return (b.x-a.x)*(c.y-a.y) - (b.y-a.y)*(c.x-a.x);
}

void hullRec(vector<Point>& pts, Point a, Point b, vector<Point>& hull) {
    if (pts.empty()) return;
    int fi = 0;
    double maxVal = 0;
    for (int i = 0; i < (int)pts.size(); ++i) {
        double c = cross(a, b, pts[i]);
        if (c > maxVal) { maxVal = c; fi = i; }
    }
    if (maxVal <= 0) return; // no point to the left
    Point p = pts[fi];
    vector<Point> leftAP, leftPB;
    for (int i = 0; i < (int)pts.size(); ++i) {
        if (i == fi) continue;
        if (cross(a, p, pts[i]) > 0) leftAP.push_back(pts[i]);
        if (cross(p, b, pts[i]) > 0) leftPB.push_back(pts[i]);
    }
    hullRec(leftAP, a, p, hull);
    hull.push_back(p);
    hullRec(leftPB, p, b, hull);
}

vector<Point> quickhull(vector<Point> pts) {
    // Deduplicate
    sort(pts.begin(), pts.end());
    pts.erase(unique(pts.begin(), pts.end()), pts.end());
    int n = (int)pts.size();
    if (n < 2) return pts;
    if (n == 2) return pts;

    Point a = pts.front(); // min x (sorted)
    Point b = pts.back();  // max x (sorted)

    vector<Point> upper, lower;
    for (auto& p : pts) {
        double c = cross(a, b, p);
        if (c > 0) upper.push_back(p);
        if (c < 0) lower.push_back(p);
    }

    vector<Point> hull;
    hull.push_back(a);
    hullRec(upper, a, b, hull); // CCW upper
    hull.push_back(b);
    hullRec(lower, b, a, hull); // CCW lower (B->A on the bottom)
    return hull;
}

int main() {
    vector<Point> pts = {
        {0,0},{2,0},{4,0},{4,2},{4,4},{2,4},{0,4},{0,2},{2,2},
        {1,1},{3,3},{1,3},{3,1},{0,0},{4,4} // duplicates
    };
    auto hull = quickhull(pts);
    cout << "Hull vertices (" << hull.size() << "):\n";
    for (auto& p : hull)
        cout << "  (" << p.x << "," << p.y << ")\n";
    // Expected: (0,0),(4,0),(4,4),(0,4) — corners of the square
}
                        

Problem H3: Prove Expected O(n log n) for Quicksort Hard

Task: Complete the indicator random variable proof from Section 6b. Compute the exact expected comparisons for \(n=5\) by hand and verify with code.

For \(n=5\), elements \(z_1 < z_2 < z_3 < z_4 < z_5\):

\[E[C] = \sum_{1 \le i < j \le 5} \frac{2}{j - i + 1}\]

All pairs \((i,j)\) with their contributions:

\((i,j)\)	\(j-i+1\)	\(2/(j-i+1)\)
(1,2),(2,3),(3,4),(4,5)	2	4 × 1 = 4
(1,3),(2,4),(3,5)	3	3 × 2/3 = 2
(1,4),(2,5)	4	2 × 1/2 = 1
(1,5)	5	1 × 2/5 = 2/5

\(E[C] = 4 + 2 + 1 + 0.4 = 7.4\)

Compare with \(n \ln n - n \approx 5 \ln 5 - 5 \approx 3.05\) (lower bound) and \(2n \ln n \approx 16.1\) (upper bound). \(7.4\) is in range. ✓

// Verify by simulation: average comparisons over many random permutations
#include <bits/stdc++.h>
using namespace std;
long long cmp;

int partition(vector<int>& a, int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; ++j) { ++cmp; if (a[j] <= pivot) swap(a[++i], a[j]); }
    swap(a[i+1], a[hi]); return i + 1;
}
void qs(vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    qs(a, lo, p-1); qs(a, p+1, hi);
}

int main() {
    int n = 5;
    mt19937 rng(0);
    long long totalCmp = 0;
    int trials = 100000;
    for (int t = 0; t < trials; ++t) {
        vector<int> a(n); iota(a.begin(), a.end(), 1);
        shuffle(a.begin(), a.end(), rng);
        // Use random pivot to match theory
        cmp = 0; qs(a, 0, n-1);
        totalCmp += cmp;
    }
    cout << fixed << setprecision(2);
    cout << "Simulated E[C] for n=5: " << (double)totalCmp/trials << "\n";
    cout << "Theoretical E[C]:        7.40\n";
    // Output: approximately 7.40 (matches theory)
}