Commit c27f946

chore(repo): update pre-commit workflow and sync 1.4.4 docs/changelog
1 parent 62227d3 commit c27f946

14 files changed

Lines changed: 87 additions & 38 deletions

.github/ISSUE_TEMPLATE/bug_report.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -62,4 +62,4 @@ body:
6262
id: notes
6363
attributes:
6464
label: Additional context
65-
description: CFG structure, HTML screenshots, logs, etc.
65+
description: CFG structure, HTML screenshots, logs, etc.

.github/ISSUE_TEMPLATE/cfg_semantics.yml

Lines changed: 1 addition & 1 deletion
@@ -43,4 +43,4 @@ body:
4343
attributes:
4444
label: Desired CFG behavior
4545
validations:
46-
required: true
46+
required: true

.github/ISSUE_TEMPLATE/false_positive.yml

Lines changed: 1 addition & 1 deletion
@@ -43,4 +43,4 @@ body:
4343
attributes:
4444
label: CFG-related?
4545
options:
46-
- label: Control flow structure differs meaningfully
46+
- label: Control flow structure differs meaningfully

.github/ISSUE_TEMPLATE/feature_request.yml

Lines changed: 1 addition & 1 deletion
@@ -43,4 +43,4 @@ body:
4343
- type: textarea
4444
id: alternatives
4545
attributes:
46-
label: Alternatives considered
46+
label: Alternatives considered

.github/actions/codeclone/README.md

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@ Runs CodeClone to detect architectural code duplication in Python projects.
88
- uses: orenlab/codeclone/.github/actions/codeclone@v1
99
with:
1010
path: .
11-
fail-on-new: true
11+
fail-on-new: true

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -32,4 +32,6 @@ htmlcov/
3232
.DS_Store
3333

3434
# Logs
35-
*.log
35+
*.log
36+
37+
.claude

.pre-commit-config.yaml

Lines changed: 31 additions & 7 deletions
@@ -1,31 +1,55 @@
1+
default_install_hook_types: [ pre-commit, pre-push ]
2+
13
repos:
4+
- repo: https://github.com/pre-commit/pre-commit-hooks
5+
rev: v6.0.0
6+
hooks:
7+
- id: check-merge-conflict
8+
- id: end-of-file-fixer
9+
- id: trailing-whitespace
10+
- id: check-added-large-files
11+
- id: check-toml
12+
- id: check-yaml
13+
214
- repo: local
315
hooks:
4-
- id: ruff-check
5-
name: Ruff (lint)
6-
entry: ruff check .
16+
- id: ruff-format
17+
name: Ruff (format)
18+
entry: ruff format .
719
language: system
820
pass_filenames: false
921
types: [ python ]
22+
stages: [ pre-commit ]
1023

11-
- id: ruff-format
12-
name: Ruff (format)
13-
entry: ruff format .
24+
- id: ruff-check
25+
name: Ruff (lint)
26+
entry: ruff check .
1427
language: system
1528
pass_filenames: false
1629
types: [ python ]
30+
stages: [ pre-commit ]
1731

1832
- id: mypy
1933
name: Mypy
2034
entry: mypy .
2135
language: system
2236
pass_filenames: false
2337
types: [ python ]
38+
stages: [ pre-commit ]
2439

2540
- id: codeclone
2641
name: CodeClone
2742
entry: codeclone
2843
language: system
2944
pass_filenames: false
3045
args: [ ".", "--ci" ]
31-
types: [ python ]
46+
types: [ python ]
47+
stages: [ pre-commit ]
48+
49+
- id: pytest
50+
name: Pytest
51+
entry: pytest -q
52+
language: system
53+
pass_filenames: false
54+
types: [ python ]
55+
stages: [ pre-push ]

CHANGELOG.md

Lines changed: 39 additions & 16 deletions
@@ -1,5 +1,28 @@
11
# Changelog
22

3+
## [1.4.4] - 2026-03-14
4+
5+
### Performance
6+
7+
- Optimized HTML snippet rendering hot path:
8+
- file snippets now reuse cached full-file lines and slice ranges without
9+
repeated full-file scans
10+
- Pygments modules are loaded once per importer identity instead of
11+
re-importing for each snippet
12+
- Optimized block explainability range stats:
13+
- replaced repeated full `ast.walk()` scans per range with a per-file
14+
statement index + `bisect` window lookup
15+
16+
### Tests
17+
18+
- Preserved existing golden/contract behavior for `1.4.x` and kept report output
19+
semantics unchanged while improving runtime overhead.
20+
21+
### Contract Notes
22+
23+
- No baseline/cache/report schema changes.
24+
- No clone detection or fingerprint semantic changes.
25+
326
## [1.4.3] - 2026-03-03
427

528
### Cache Contract
@@ -328,57 +351,57 @@ codeclone . --update-baseline
328351

329352
### Overview
330353

331-
This release focuses on security hardening, robustness, and long-term maintainability.
354+
This release focuses on security hardening, robustness, and long-term maintainability.
332355
No breaking API changes were introduced.
333356

334357
The goal of this release is to provide users with a safe, deterministic, and CI-friendly
335358
tool suitable for security-sensitive and large-scale environments.
336359

337360
### Security & Robustness
338361

339-
- **Path Traversal Protection**
362+
- **Path Traversal Protection**
340363
Implemented strict path validation to prevent scanning outside the project root or
341364
accessing sensitive system directories, including macOS `/private` paths.
342365

343-
- **Cache Integrity Protection**
366+
- **Cache Integrity Protection**
344367
Added HMAC-SHA256 signing for cache files to prevent cache poisoning and detect tampering.
345368

346-
- **Parser Safety Limits**
369+
- **Parser Safety Limits**
347370
Introduced AST parsing time limits to mitigate risks from pathological or adversarial inputs.
348371

349-
- **Resource Exhaustion Protection**
372+
- **Resource Exhaustion Protection**
350373
Enforced a maximum file size limit (10MB) and a maximum file count per scan to prevent
351374
excessive memory or CPU usage.
352375

353-
- **Structured Error Handling**
376+
- **Structured Error Handling**
354377
Introduced a dedicated exception hierarchy (`ParseError`, `CacheError`, etc.) and replaced
355378
broad exception handling with graceful, user-friendly failure reporting.
356379

357380
### Performance Improvements
358381

359-
- **Optimized AST Normalization**
382+
- **Optimized AST Normalization**
360383
Replaced expensive `deepcopy` operations with in-place AST normalization, significantly
361384
reducing CPU and memory overhead.
362385

363-
- **Improved Memory Efficiency**
386+
- **Improved Memory Efficiency**
364387
Added an LRU cache for file reading and optimized string concatenation during fingerprint
365388
generation.
366389

367-
- **HTML Report Memory Bounds**
390+
- **HTML Report Memory Bounds**
368391
HTML reports now read only the required line ranges instead of entire files, reducing peak
369392
memory usage on large codebases.
370393

371394
### Architecture & Maintainability
372395

373-
- **Strict Type Safety**
396+
- **Strict Type Safety**
374397
Migrated all optional typing to Python 3.10+ `| None` syntax and achieved 100% `mypy` strict
375398
compliance.
376399

377-
- **Modular CFG Design**
400+
- **Modular CFG Design**
378401
Split CFG data structures and builder logic into separate modules (`cfg_model.py` and
379402
`cfg.py`) for improved clarity and extensibility.
380403

381-
- **Template Extraction**
404+
- **Template Extraction**
382405
Extracted HTML templates into a dedicated `templates.py` module.
383406

384407
- Added a `py.typed` marker for downstream type checkers.
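
Note on the **Cache Integrity Protection** entry in the hunk above: a minimal sketch of HMAC-SHA256 signing and verification using only the Python standard library. The function names, the key, and the payload layout are hypothetical illustrations and are not taken from CodeClone's source.

```python
import hashlib
import hmac


def sign_payload(payload: bytes, key: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of a serialized cache payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_payload(payload: bytes, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(payload, key)
    return hmac.compare_digest(expected, signature)


# Hypothetical key and payload, for illustration only.
key = b"example-secret-key"
payload = b'{"files": {}}'
signature = sign_payload(payload, key)
```

A tampered payload (any changed byte) yields a different digest, so `verify_payload` returns `False` and the cache can be discarded instead of trusted.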
@@ -420,13 +443,13 @@ support for Python 3.10–3.14 across the test matrix.
420443

421444
### Fixed
422445

423-
- **CFG Exception Handling**
446+
- **CFG Exception Handling**
424447
Fixed incorrect control-flow linking for `try`/`except` blocks.
425448

426-
- **Pattern Matching Support**
449+
- **Pattern Matching Support**
427450
Added missing structural handling for `match`/`case` statements in the CFG.
428451

429-
- **Block Detection Scaling**
452+
- **Block Detection Scaling**
430453
Made `MIN_LINE_DISTANCE` dynamic based on block size to improve clone detection accuracy
431454
across differently sized functions.
432455

@@ -436,7 +459,7 @@ support for Python 3.10–3.14 across the test matrix.
436459

437460
### BREAKING CHANGES
438461

439-
- **CLI Arguments**
462+
- **CLI Arguments**
440463
Renamed output flags for brevity and consistency:
441464
- `--json-out` → `--json`
442465
- `--text-out` → `--text`
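
Note on the 1.4.4 **Performance** entry in this file: the changelog describes replacing repeated `ast.walk()` scans per range with a per-file statement index and a `bisect` window lookup. A minimal sketch of that idea follows, assuming the index stores one entry per statement keyed by its starting line. The `StatementIndex` class and its method names are hypothetical and do not reflect CodeClone's actual implementation.

```python
import ast
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class StatementIndex:
    """Statement start lines (sorted) with the matching AST node type names."""

    linenos: list[int]
    kinds: list[str]

    @classmethod
    def from_source(cls, source: str) -> "StatementIndex":
        # Walk the tree once per file and keep only statement nodes.
        entries = sorted(
            (node.lineno, type(node).__name__)
            for node in ast.walk(ast.parse(source))
            if isinstance(node, ast.stmt)
        )
        return cls([line for line, _ in entries], [kind for _, kind in entries])

    def window(self, start: int, end: int) -> list[str]:
        """Return statement kinds whose start line lies in [start, end]."""
        lo = bisect_left(self.linenos, start)
        hi = bisect_right(self.linenos, end)
        return self.kinds[lo:hi]


source = "x = 1\nif x:\n    y = 2\nz = 3\n"
index = StatementIndex.from_source(source)
```

Each range query then costs two binary searches plus a slice, rather than a full traversal of the file's AST per range.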

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
1818
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
1919
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
2020
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21-
SOFTWARE.
21+
SOFTWARE.

README.md

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@
88
![Baseline](https://img.shields.io/badge/baseline-versioned-green?style=flat-square)
99
[![License](https://img.shields.io/pypi/l/codeclone.svg?style=flat-square)](LICENSE)
1010

11-
**CodeClone** is a Python code clone detector based on **normalized AST and Control Flow Graphs (CFG)**.
11+
**CodeClone** is a Python code clone detector based on **normalized AST and Control Flow Graphs (CFG)**.
1212
It discovers architectural duplication and prevents new copy-paste from entering your codebase via CI.
1313

1414
---
@@ -34,13 +34,13 @@ Unlike token-based tools, CodeClone compares **structure and control flow**, mak
3434

3535
**Three Detection Levels:**
3636

37-
1. **Function clones (CFG fingerprint)**
37+
1. **Function clones (CFG fingerprint)**
3838
Strong structural signal for cross-layer duplication
3939

40-
2. **Block clones (statement windows)**
40+
2. **Block clones (statement windows)**
4141
Detects repeated local logic patterns
4242

43-
3. **Segment clones (report-only)**
43+
3. **Segment clones (report-only)**
4444
Internal function repetition for explainability; not used for baseline gating
4545

4646
**CI-Ready Features:**
