Use hash func to boost file creation and lookup #79
RoyWFHuang wants to merge 3 commits into sysprog21:master
Conversation
How can you determine which hash function is the most suitable?
visitorckw left a comment
I saw that your PR description includes some performance benchmarks, but the commit message lacks any performance numbers to support your improvements. Please improve the commit message.
Force-pushed from b928f62 to 308bb4c
Force-pushed from 308bb4c to 1645d00
I'm not sure if "fnv" is the most suitable, but since the index space in SimpleFS is relatively small, using a more complex algorithm might not provide significant benefits. I think FNV is a reasonable balance between simplicity and performance.
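For reference, here is a userspace sketch of the 64-bit FNV-1a variant being discussed. The constants are the standard published FNV-1a parameters; the function name `fnv1a_64` is illustrative and not necessarily the PR's actual `simplefs_hash`:

```c
#include <stdint.h>

/* Standard 64-bit FNV-1a parameters (offset basis and prime) */
#define FNV1A_64_OFFSET 14695981039346656037ULL
#define FNV1A_64_PRIME  1099511628211ULL

/* Hash a NUL-terminated filename with FNV-1a: XOR each byte in, then
 * multiply by the prime. Cheap, deterministic, and reasonably well
 * dispersed for short strings such as directory entry names. */
static uint64_t fnv1a_64(const char *s)
{
    uint64_t h = FNV1A_64_OFFSET;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= FNV1A_64_PRIME;
    }
    return h;
}
```

The multiply-after-XOR ordering is what distinguishes FNV-1a from plain FNV-1 and gives it better avalanche behavior on the low bits, which matters when the result is reduced by a small modulus.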
Force-pushed from ca74c03 to 51e0478
visitorckw left a comment
You ignored many of my comments without making any changes or providing any replies. You still retained many irrelevant changes, making the review difficult. Additionally, a single-line commit message saying only "optimize the file search process" is way too vague. Please improve the git commit message.
Force-pushed from 0519d61 to 864a9a1
Added all hash results into the commit.
Force-pushed from 56c0522 to 0176a4b
Force-pushed from 0176a4b to c2316df
visitorckw left a comment
Quoted from patch 2:
Align the print function with the Simplefs print format for consistency.
Also adjust variable declarations to fix compiler warnings when building
under the C90 standard.
I'm unsure which Linux kernel versions simplefs currently intends to support, but AFAIK the Linux kernel currently uses GNU C11 as its standard.
Furthermore, the word "Also" is often a sign that the change should be in a separate patch. In my view, you are performing two distinct actions here:
a) Changing printk -> pr_err.
b) Fixing a compiler warning.
I also remain confused as to whether the printk to pr_err change is truly warranted, and what relevance it has to the PR's title, which is "Use hash func to boost file creation and lookup".
Force-pushed from c2316df to c51cbb1
This comment was marked as resolved.
Force-pushed from acfa00f to 19c1b8e
Worst case is still O(n) linear search when hash collisions cluster. No evidence hash quality was tested. What's the distribution?

```c
hash_code = simplefs_hash(dentry) %
            (SIMPLEFS_MAX_EXTENTS * SIMPLEFS_MAX_BLOCKS_PER_EXTENT);
ei = hash_code / SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
bi = hash_code % SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
```

The above is good: a deterministic FNV-1a hash with proper modulo distribution. However, it computes a 64-bit hash but returns a 32-bit value (truncation), and there is no analysis of collision rates.
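One way to answer the distribution question is to hash a batch of representative filenames in userspace and tabulate bucket occupancy. A sketch under stated assumptions: the bucket count, the `file_%d` naming pattern, and `fnv1a_64` (standing in for `simplefs_hash`) are all illustrative, not taken from the PR:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NUM_BUCKETS 4096  /* stand-in for MAX_EXTENTS * MAX_BLOCKS_PER_EXTENT */
#define NUM_FILES   30000 /* roughly the PR benchmark's directory size */

/* Standard 64-bit FNV-1a, used here as a stand-in for simplefs_hash */
static uint64_t fnv1a_64(const char *s)
{
    uint64_t h = 14695981039346656037ULL;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Fill counts[] with bucket occupancy and return the worst-case load,
 * i.e. the longest linear probe a lookup could have to make. */
static int max_bucket_load(int counts[NUM_BUCKETS])
{
    char name[32];
    int max = 0;

    memset(counts, 0, NUM_BUCKETS * sizeof(int));
    for (int i = 0; i < NUM_FILES; i++) {
        snprintf(name, sizeof(name), "file_%d", i);
        int b = (int)(fnv1a_64(name) % NUM_BUCKETS);
        if (++counts[b] > max)
            max = counts[b];
    }
    return max;
}
```

Comparing the returned maximum against the ideal average load (`NUM_FILES / NUM_BUCKETS`) gives a quick, if crude, measure of how badly collisions cluster for this workload.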
inode.c (outdated)

```c
strncpy(dblock->files[fi].filename, dest_dentry->d_name.name,
        SIMPLEFS_FILENAME_LEN);
```
File becomes unfindable after rename because:
- Old entry stays at hash(old_name) location
- Future lookups calculate hash(new_name) and search wrong bucket
- File effectively vanishes from namespace
Force-pushed from 066a066 to 3d8fd72
Force-pushed from f3a1971 to 7323d62
Force-pushed from 7323d62 to d0090a0
```c
_ei = hash_code / SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
_bi = hash_code % SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
int nr_ei_files = eblock->nr_files;
for (idx_ei = 0; nr_ei_files; _ei++, idx_ei++) {
```
CRITICAL — unbounded ring loop on corrupt metadata

```c
for (idx_ei = 0; nr_ei_files; _ei++, idx_ei++) {
    CHECK_AND_SET_RING_INDEX(_ei, SIMPLEFS_MAX_EXTENTS);
    if (!eblock->extents[_ei].ee_start)
        continue; /* nr_ei_files unchanged: spin forever */
    nr_ei_files -= eblock->extents[_ei].nr_files;
    ...
}
```

On a directory whose eblock->nr_files != 0 but all extents[].ee_start == 0 (corruption), this never terminates — a kernel hang in process context. Never trust on-disk metadata as the sole loop bound.
Fix: add idx_ei < SIMPLEFS_MAX_EXTENTS to the loop condition. Validate extents[_ei].nr_files <= nr_ei_files before subtracting, and return -EFSCORRUPTED on inconsistency.
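The bounded version of that scan could look like the following userspace sketch. The struct layout, the constant values, and the numeric stand-in for the kernel's EFSCORRUPTED errno are simplifying assumptions, not the PR's actual code:

```c
#include <stdint.h>

#define SIMPLEFS_MAX_EXTENTS 8  /* illustrative value, not the real constant */
#define EFSCORRUPTED 117        /* stand-in for the kernel errno */

struct extent {
    uint32_t ee_start;
    uint32_t nr_files;
};
struct ei_block {
    uint32_t nr_files;
    struct extent extents[SIMPLEFS_MAX_EXTENTS];
};

/* Scan extents starting at hint 'ei', wrapping modulo the table size.
 * idx bounds the walk so corrupt on-disk counters can never hang the
 * loop, and inconsistent per-extent counts are rejected, not trusted. */
static int scan_extents(const struct ei_block *eblock, uint32_t ei)
{
    uint32_t remaining = eblock->nr_files;

    for (uint32_t idx = 0; remaining && idx < SIMPLEFS_MAX_EXTENTS;
         idx++, ei++) {
        ei %= SIMPLEFS_MAX_EXTENTS; /* ring wrap, as CHECK_AND_SET_RING_INDEX */
        if (!eblock->extents[ei].ee_start)
            continue;
        if (eblock->extents[ei].nr_files > remaining)
            return -EFSCORRUPTED; /* on-disk counters disagree */
        remaining -= eblock->extents[ei].nr_files;
        /* ... visit blocks in this extent ... */
    }
    return remaining ? -EFSCORRUPTED : 0;
}
```

The same two-part discipline — a trusted loop bound plus validation before every subtraction — applies equally to the inner block loop and to simplefs_remove_from_dir below.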
In our design, eblock->nr_files records the total number of files within a simplefs_file_ei_block. This value is expected to be identical to the sum of nr_files across all extent blocks (eblock->extents[_ei].nr_files). If these values do not match, it would indicate a bug in our file creation or management logic.
However, I acknowledge the importance of defensive programming when dealing with on-disk metadata to prevent potential kernel hangs. I will create a dedicated issue to improve these validation checks and ensure the file system's robustness against metadata corruption.
```c
nr_ei_files -= eblock->extents[_ei].nr_files;
/* Iterate blocks in extent */
int nr_bi_files = eblock->extents[_ei].nr_files;
for (idx_bi = 0; nr_bi_files; _bi++, idx_bi++) {
```
CRITICAL — same unbounded-loop pattern at the inner block level

```c
int nr_bi_files = eblock->extents[_ei].nr_files;
for (idx_bi = 0; nr_bi_files; _bi++, idx_bi++) {
    ...
    nr_bi_files -= dblock->nr_files; /* signed int, can go negative */
}
```

A corrupt dblock->nr_files == 0 keeps nr_bi_files constant; an oversized value drives it negative, which is still truthy. Kernel hang.
Fix: bound the loop with idx_bi < eblock->extents[_ei].ee_len; reject dblock->nr_files > SIMPLEFS_FILES_PER_BLOCK and return -EFSCORRUPTED.
```c
ei = hash_code / SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
bi = hash_code % SIMPLEFS_MAX_BLOCKS_PER_EXTENT;

for (; dir_nr_files; ei++) {
```
CRITICAL — simplefs_remove_from_dir repeats the unbounded-ring pattern

```c
for (; dir_nr_files; ei++) {
    CHECK_AND_SET_RING_INDEX(ei, SIMPLEFS_MAX_EXTENTS);
    if (eblock->extents[ei].ee_start) { ... }
    bi = 0;
}
```

When extents[ei].ee_start == 0, the body is skipped and dir_nr_files is never decremented. Same hang risk as __file_lookup.
Fix: bound the outer scan with a separate counter < SIMPLEFS_MAX_EXTENTS and validate counters before subtracting.
```c
        brelse(bh2);
        return 0;
    }
    _fi += dblock->files[_fi].nr_blk;
```
HIGH — unchecked nr_blk in run-length walk

```c
_fi += dblock->files[_fi].nr_blk;
```

An on-disk nr_blk == 0 causes an infinite loop; an oversized nr_blk makes the next iteration index past the block. The same code pattern exists in simplefs_try_remove_entry line 493.
Fix: validate nr_blk >= 1 && _fi + nr_blk <= SIMPLEFS_FILES_PER_BLOCK before stepping; return -EFSCORRUPTED on violation.
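A validated version of the run-length walk might look like this userspace sketch. The struct layout and the SIMPLEFS_FILES_PER_BLOCK value are simplified assumptions for illustration, as is the numeric stand-in for -EFSCORRUPTED:

```c
#include <stdint.h>

#define SIMPLEFS_FILES_PER_BLOCK 16 /* illustrative value */
#define EFSCORRUPTED 117            /* stand-in for the kernel errno */

struct file_entry {
    uint32_t inode;  /* 0 marks an unused slot in this sketch */
    uint32_t nr_blk; /* run length: slots consumed by this entry */
};
struct dir_block {
    struct file_entry files[SIMPLEFS_FILES_PER_BLOCK];
};

/* Walk the run-length encoded entries of one directory block, counting
 * runs. Each step is validated before it is taken: nr_blk == 0 would
 * otherwise loop forever, and an oversized nr_blk would index past the
 * end of the block. */
static int count_runs(const struct dir_block *dblock)
{
    int runs = 0;
    uint32_t fi = 0;

    while (fi < SIMPLEFS_FILES_PER_BLOCK && dblock->files[fi].inode) {
        uint32_t nr_blk = dblock->files[fi].nr_blk;
        if (nr_blk < 1 || fi + nr_blk > SIMPLEFS_FILES_PER_BLOCK)
            return -EFSCORRUPTED; /* corrupt run length on disk */
        fi += nr_blk;
        runs++;
    }
    return runs;
}
```

Because the bound check happens before the step, a corrupt block costs one comparison rather than a hang, and the caller gets a distinguishable error instead of garbage indexing.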
Regarding the concerns about nr_blk == 0, potential infinite loops, and metadata validation (including line 493 in simplefs_try_remove_entry):
As I mentioned previously:
"I believe this situation should not occur under normal conditions, as nr_blk (or nr_files) should always be at least 1. If we encounter a 0, it likely indicates a bug in the block merging or splitting logic."
While I understand the concern regarding out-of-bounds indexing or infinite loops in the case of disk corruption, I plan to implement these defensive programming measures and error handling as part of a separate, dedicated metadata optimization task.
This current PR is strictly focused on the core hash-based indexing improvements and the resulting performance gains. To maintain a clean and focused commit history, I believe it is better to handle these robustness enhancements in a subsequent PR, ensuring the code review process remains efficient for both distinct functional changes.
```c
        }
        blk_nr_files--;
    }
    fi += dblock->files[fi].nr_blk;
```
HIGH — same unchecked nr_blk walk as __file_lookup

```c
fi += dblock->files[fi].nr_blk;
```

Mirror of the bug at line 178. Validate nr_blk >= 1 && fi + nr_blk <= SIMPLEFS_FILES_PER_BLOCK.
```c
if (chk < 0)
    return ERR_PTR(chk); /* I/O error */

bh = sb_bread(sb, ci_dir->ei_block);
```
HIGH — redundant double-read defeats the PR's perf goal

```c
chk = __file_lookup(dir, dentry, &ei, &bi, &fi);
...
bh = sb_bread(sb, ci_dir->ei_block);                   /* re-read ei_block */
bh2 = sb_bread(sb, eblock->extents[ei].ee_start + bi); /* re-read dir block */
inode = simplefs_iget(sb, dblock->files[fi].inode);    /* needed only the ino_t */
```

__file_lookup already had the inode number. This adds two extra sb_getblk round-trips per successful lookup, on the exact path the PR cover letter is selling as faster. The buffer cache makes the I/O cheap but not free — and this is the hottest path in the filesystem.
Fix: have __file_lookup return the inode number via an out-pointer; drop the second sb_bread pair entirely. simplefs_lookup then collapses to roughly:

```c
err = __file_lookup(dir, dentry, &ino);
switch (err) {
case 0:
    inode = simplefs_iget(sb, ino);
    break;
case -ENOENT:
    inode = NULL;
    break;
default:
    return ERR_PTR(err);
}
return d_splice_alias(inode, dentry);
```
The same as #79 (comment): refactoring __file_lookup involves a larger scope and will be handled in a dedicated PR.
Force-pushed from d0090a0 to d6b910d
Introduce a hash-based mechanism to speed up file creation and lookup operations. The hash function enables faster access to the extent and logical block indices, improving overall filesystem performance.
hash_code = file_hash(file_name);
extent index = hash_code / SIMPLEFS_MAX_BLOCKS_PER_EXTENT
block index = hash_code % SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
Use perf to measure:
1. File creation (random)
Legacy: 259.842753513 seconds time elapsed, 23.000247000 seconds user, 150.380145000 seconds sys
full_name_hash: 222.274028604 seconds time elapsed, 20.794966000 seconds user, 151.941876000 seconds sys
2. File listing (random)
Legacy: min time 0.00171 s, max time 0.03799 s, avg time 0.00423332 s, tot time 129.539510 s
full_name_hash: min time 0.00171 s, max time 0.03588 s, avg time 0.00305601 s, tot time 93.514040 s
3. File removal (random)
Legacy: 106.921706288 seconds time elapsed, 16.987883000 seconds user, 91.268661000 seconds sys
full_name_hash: 86.132655220 seconds time elapsed, 19.180209000 seconds user, 68.476075000 seconds sys
Force-pushed from d6b910d to 54807e2
Previously, SimpleFS used a sequential insertion method to create files, which worked efficiently when the filesystem contained only a small number of files.
However, in real-world use cases, filesystems often manage a large number of files, making sequential search and insertion inefficient.
Inspired by Ext4’s hash-based directory indexing, this change adopts a hash function to accelerate file indexing and improve scalability.
Changes:
- Implemented hash-based file index lookup
- Improved scalability for large directory structures

hash_code = file_hash(file_name);
extent index = hash_code / SIMPLEFS_MAX_BLOCKS_PER_EXTENT
block index = hash_code % SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
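Concretely, the hash-to-slot mapping works as in the following sketch. The constant values are illustrative assumptions (the real SIMPLEFS_MAX_BLOCKS_PER_EXTENT and SIMPLEFS_MAX_EXTENTS may differ), and `hash_to_slot` is a name invented for this example:

```c
#include <stdint.h>

#define SIMPLEFS_MAX_EXTENTS           8 /* assumed values for illustration */
#define SIMPLEFS_MAX_BLOCKS_PER_EXTENT 8

/* Reduce a 64-bit hash into the slot space, then split it into an
 * extent index (quotient) and a block-within-extent index (remainder). */
static void hash_to_slot(uint64_t hash, uint32_t *ei, uint32_t *bi)
{
    uint32_t code = (uint32_t)(hash %
        (SIMPLEFS_MAX_EXTENTS * SIMPLEFS_MAX_BLOCKS_PER_EXTENT));
    *ei = code / SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
    *bi = code % SIMPLEFS_MAX_BLOCKS_PER_EXTENT;
}
```

The quotient/remainder split means consecutive hash codes land in neighboring blocks of the same extent before spilling into the next extent, which keeps the mapping trivially invertible for debugging.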
Performance test (benchmark charts comparing legacy vs full_name_hash omitted)
Use perf stat ls -la to measure the query time for each file and sum up all elapsed times to calculate the total lookup cost.
Legacy:
min time: 0.00171 s
max time: 0.03799 s
avg time: 0.00423332 s
tot time: 129.539510 s
full_name_hash:
min time: 0.00171 s
max time: 0.03588 s
avg time: 0.00305601 s
tot time: 93.514040 s
Summary by cubic
Switched SimpleFS to hash-based directory indexing, using a deterministic FNV-1a simplefs_hash to map filenames to extent/block slots for faster create, lookup, and delete. On 30.6k files: create ~15% faster, delete ~19% faster, lookup ~28% faster.
New Features
- __file_lookup used by lookup/rename.
- iterate skips fully-scanned extents/blocks using nr_files; unlink/rename free empty directory extents; same-dir rename updates in place, cross-dir rename inserts-then-removes with rollback safety.
Refactors
- Added hash.c (FNV-1a) and CHECK_AND_SET_RING_INDEX; renamed helpers/variables (e.g., simplefs_get_new_ext, src/dest in rename); enforced a filename max of 254 chars (reserving the NUL) and updated the README; Makefile includes hash.o.
Written for commit 54807e2. Summary will update on new commits.