Amortized-Analysis Arrays 02. Algorithm Analysis
1. Overview
- Dynamic arrays (like
std::vector) automatically resize when they run out of space - When a
vectorresizes, it typically doubles it capacity to accommodate more elements - Amortized analysis helps us understand the cost of such resizing operations spread over multiple insertions
- While some individual operations are expensive, the average insertion cost is low
What is amortized time?: the average time per operation over a sequence of operations, taking into account occasional expensive operations that are offset by many cheap ones.
2. How Resizing Works in std::vector
- When a
vectorreaches its capacity limit:- A new, larger array (usually double the previous capacity) is allocated
- The existing elements are copied to the new array
- The old array is deallocated, and new elements can be added without resizing until the capacity is reached again
Example
If a vector has a capacity of 4 and is full, inserting one more element will:
- Allocate a new array with a capacity of
8 - Copy the 4 existing elements to the new array
- Insert the new element
3. Amortized Analysis of Insertion
When analyzing the cost of repeatedly inserting elements into a vector, letβs assume we start with an empty vector and add n elements
- Most Insertion (O(1) Cost) : *For each element that fits within the current capacity, insertion is
O(1) - Costly Insertion (O(n) Cost) :
- when resizing occurs, the insertion can become
O(n)because all existing elements are copied to new array - however, the number of resizing operations grows logarithmically relative to the number of elements
- when resizing occurs, the insertion can become
- Amortized Cost Calculation:
- over
ninsertions, each element only requiresO(1)cost on average due to the doubling strategy - this is because each element is only copied a limited number of time (once for each capacity doubling)
- over
Thus, the amortized time complexity per insertion is O(1)
Example: Logarithmic resizing
-
Starting Conditions:
- Assume we start with an empty
vectorwith an initial capacity of1. - Each time the vector reaches its capacity, it doubles.
- Assume we start with an empty
-
Insertions and Resizing:
- Insert 1st element: Capacity is
1, no resize needed. - Insert 2nd element: Capacity reached. Resize to
2. - Insert 3rd element: Capacity reached. Resize to
4. - Insert 5th element: Capacity reached. Resize to
8. - Insert 9th element: Capacity reached. Resize to
16.
Notice the pattern: resizing happens at
1, 2, 4, 8, 16..., doubling each time. - Insert 1st element: Capacity is
-
Logarithmic Growth:
- If we continue this pattern up to
nelements, the number of times we double the capacity grows aslog(n). - For example, to reach
16elements, we resized4times (logarithmic in base2), and to reach32elements, we resized5times.
- If we continue this pattern up to
Key Takeaway
- Even though we perform many insertions, we only resize about
log(n)times fornelements. - Each resize operation affects all current elements but becomes less frequent as the capacity grows, averaging out the cost of resizing over all insertions.