Commit e7ca433

perf(py): optimize remove_duplicates (O(n²) → O(n))
1 parent 6d6b1f6 commit e7ca433

1 file changed, +41 -7 lines changed


Sprint-1/Python/remove_duplicates/remove_duplicates.py

Lines changed: 41 additions & 7 deletions
@@ -7,19 +7,53 @@ def remove_duplicates(values: Sequence[ItemType]) -> List[ItemType]:
     """
     Remove duplicate values from a sequence, preserving the order of the first occurrence of each value.
 
-    Time complexity:
-    Space complexity:
-    Optimal time complexity:
+    Time Complexity: O(n) - Single pass through the sequence
+    Space Complexity: O(n) - Set to track seen elements
+    Optimal Time Complexity: O(n) - Cannot do better than linear time
     """
+    # OPTIMIZED IMPLEMENTATION: O(n) time complexity
+    # Previous implementation: O(n²) due to nested loops checking each element
+
+    seen = set()  # O(n)
+    unique_items = []  # O(n)
+
+    # O(n) time complexity
+    for value in values:
+        # O(1) lookup
+        if value not in seen:
+            seen.add(value)  # O(1)
+            unique_items.append(value)  # O(1)
+
+    return unique_items
+
+
+# ORIGINAL IMPLEMENTATION (for comparison):
+"""
+def remove_duplicates(values: Sequence[ItemType]) -> List[ItemType]:
     unique_items = []
 
-    for value in values:
+    for value in values:  # O(n) iterations
         is_duplicate = False
-        for existing in unique_items:
-            if value == existing:
+        for existing in unique_items:  # O(k) iterations (k grows with unique elements)
+            if value == existing:  # O(1) comparison
                 is_duplicate = True
                 break
         if not is_duplicate:
-            unique_items.append(value)
+            unique_items.append(value)  # O(1) operation
 
     return unique_items
+
+COMPLEXITY ANALYSIS OF ORIGINAL:
+- Outer loop: O(n) iterations through values
+- Inner loop: O(k) iterations through unique_items (k grows with each unique element)
+- Worst case: O(n²) when all elements are unique
+- Space: O(n) for unique_items list
+
+PERFORMANCE ISSUES:
+- Quadratic time complexity O(n²) in worst case
+
+IMPROVEMENTS MADE:
+1. Reduced from O(n²) to O(n) time complexity
+2. Set lookup is O(1) vs linear search O(k)
+3. Single pass through input sequence
+"""

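For reference, the two approaches in this diff can be sketched as standalone functions. This is a minimal illustration, not the file's exact contents: the names `remove_duplicates_linear`, `remove_duplicates_quadratic`, and `HashableItem` are hypothetical, and the commit's own `ItemType` definition is outside the hunk. One assumption worth stating: the set-based version only accepts hashable values, whereas the original equality scan also handled unhashable ones such as lists.

```python
from typing import Hashable, Iterable, List, TypeVar

# Hypothetical type variable for this sketch; the commit's own ItemType
# is defined outside the diff and may not be bound to Hashable.
HashableItem = TypeVar("HashableItem", bound=Hashable)


def remove_duplicates_linear(values: Iterable[HashableItem]) -> List[HashableItem]:
    """Return values in first-occurrence order, dropping later repeats."""
    seen = set()        # values already emitted; up to O(n) extra space
    unique_items = []   # result list; also O(n) space in the worst case
    for value in values:
        if value not in seen:   # average-case O(1) hash lookup
            seen.add(value)
            unique_items.append(value)
    return unique_items


def remove_duplicates_quadratic(values):
    """Equality-based scan in the style of the original implementation."""
    unique_items = []
    for value in values:
        # Linear scan with ==, so unhashable values (e.g. lists) are accepted.
        if not any(value == existing for existing in unique_items):
            unique_items.append(value)
    return unique_items


print(remove_duplicates_linear([3, 1, 3, 2, 1]))       # [3, 1, 2]
print(remove_duplicates_quadratic([[1], [1], [2]]))    # [[1], [2]]
```

Calling `remove_duplicates_linear([[1], [1]])` raises `TypeError` because lists cannot be added to a set. If callers may pass unhashable items, keeping the quadratic scan as a fallback is one option. For hashable input on Python 3.7 or later, `list(dict.fromkeys(values))` gives the same first-occurrence ordering.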