gh-144319: `madvise(MADV_HUGEPAGE)` #144353

maurycy · 2026-01-31T02:09:28Z

The hint enables Transparent Huge Pages (THP) on systems where THP is set to `madvise` mode, which seems to be the default on Ubuntu and Fedora, at least according to this article.

More on THP:

https://docs.kernel.org/admin-guide/mm/transhuge.html

Importantly, it seems to carry no SIGBUS risk. mimalloc seems to already do this with MIMALLOC_LARGE_OS_PAGES=1.
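For readers unfamiliar with the hint, here is a minimal Python sketch of the same idea using the standard `mmap` module. It is only an illustration, not the C change in this PR: the helper name `advise_hugepage`, the 2 MiB huge page size, and the 8 MiB mapping length are assumptions made for the example.

```python
import mmap
import sys

# Assumption: 2 MiB huge pages, which is typical on x86-64 Linux.
HUGE_PAGE_SIZE = 2 * 1024 * 1024


def advise_hugepage(length=HUGE_PAGE_SIZE * 4):
    """Map anonymous memory and hint that the kernel may back it with THP.

    Returns True if the hint was accepted, False if it is unsupported on
    this platform or rejected by the kernel. The mapping is closed again
    right away; this only demonstrates the call sequence.
    """
    if not sys.platform.startswith("linux") or not hasattr(mmap, "MADV_HUGEPAGE"):
        return False
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Only a hint: the kernel is free to ignore it.
        region.madvise(mmap.MADV_HUGEPAGE)
    except OSError:
        # For example, a kernel built without THP support.
        return False
    finally:
        region.close()
    return True


if __name__ == "__main__":
    print("MADV_HUGEPAGE accepted:", advise_hugepage())
```

Because the call is advisory, a True result only means the kernel accepted the hint, not that huge pages are actually in use.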

Reusing the benchmark from #144319:

bench_obmalloc.py

import sys, gc

def bench_small_object_churn():
    objs = []
    for _ in range(200_000): objs.append(bytearray(64))
    for _ in range(200_000): objs.append(bytearray(64)); objs.pop(0)

def bench_bulk_small_alloc():
    objs = [bytearray(48) for _ in range(1_000_000)]
    for o in objs: o[0] = 1

def bench_dict_churn():
    for _ in range(500_000): d = {"a": 1, "b": 2, "c": 3, "d": 4}; del d

def bench_mixed_sizes():
    sizes = [8, 16, 24, 32, 48, 64, 96, 128, 192, 256, 384, 512]
    objs = [bytearray(sizes[i % 12]) for i in range(500_000)]

def bench_fragmentation():
    objs = [bytearray(128) for _ in range(500_000)]
    for i in range(0, len(objs), 2): objs[i] = None
    for i in range(0, len(objs), 2): objs[i] = bytearray(128)

def bench_list_of_tuples():
    objs = [(i, i+1, i+2) for i in range(1_000_000)]

def bench_class_instances():
    class Pt:
        __slots__ = ('x', 'y', 'z')
        def __init__(s, x, y, z): s.x = x; s.y = y; s.z = z
    objs = [Pt(i, i+1, i+2) for i in range(500_000)]

def bench_arena_pressure():
    layers = [[bytearray(256) for _ in range(200_000)] for _ in range(10)]

def bench_random_walk():
    import random; random.seed(42)
    objs = [bytearray(64) for _ in range(1_000_000)]
    idx = list(range(len(objs))); random.shuffle(idx)
    for i in idx: objs[i][0] = i & 0xff

BENCHMARKS = dict(small_object_churn=bench_small_object_churn,
    bulk_small_alloc=bench_bulk_small_alloc, dict_churn=bench_dict_churn,
    mixed_sizes=bench_mixed_sizes, fragmentation=bench_fragmentation,
    list_of_tuples=bench_list_of_tuples, class_instances=bench_class_instances,
    arena_pressure=bench_arena_pressure, random_walk=bench_random_walk)

if __name__ == "__main__":
    gc.collect(); gc.disable(); BENCHMARKS[sys.argv[1]](); gc.enable()

Run on a machine where THP is in `madvise` mode:

maurycy@eiger /home/maurycy % sudo cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
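The active mode is the bracketed word in that output. A small sketch of reading it programmatically, assuming the standard sysfs path shown above (the helper names `parse_thp_mode` and `current_thp_mode` are made up for this example):

```python
import re

THP_ENABLED_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def parse_thp_mode(text):
    """Return the bracketed (active) mode, e.g. 'madvise', or None."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_thp_mode(path=THP_ENABLED_PATH):
    """Read the active THP mode, or None if the file is unavailable."""
    try:
        with open(path) as f:
            return parse_thp_mode(f.read())
    except OSError:
        # Not Linux, or a kernel built without THP.
        return None


if __name__ == "__main__":
    print(parse_thp_mode("always [madvise] never"))
    print(current_thp_mode())
```

In `madvise` mode only regions that explicitly ask for huge pages get them, which is why the hint matters on this configuration.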

The baseline is the main branch.

Wall-clock time

Benchmark	Baseline	With MADV_HUGEPAGE	Change
fragmentation	0.107s	0.101s	-5.4%
bulk_small_alloc	0.126s	0.121s	-4.1%
class_instances	0.078s	0.076s	-2.9%
list_of_tuples	0.102s	0.101s	-1.2%
mixed_sizes	0.085s	0.084s	-1.1%
random_walk	0.517s	0.515s	-0.4%
arena_pressure	0.325s	0.326s	+0.3%

dTLB load misses

Benchmark	Baseline	With MADV_HUGEPAGE	Change
fragmentation	123,390	99,413	-19.4%
arena_pressure	280,228	237,222	-15.3%
bulk_small_alloc	93,894	85,661	-8.8%
list_of_tuples	88,019	81,778	-7.1%
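The Change column is presumably the relative difference against the baseline. A short sketch that recomputes it from the dTLB counts in the table above (the formula is an assumption about how the column was derived):

```python
def percent_change(baseline, modified):
    """Relative change of `modified` versus `baseline`, in percent."""
    return (modified - baseline) / baseline * 100.0


# dTLB load misses copied from the table: (baseline, with MADV_HUGEPAGE).
dtlb_misses = {
    "fragmentation": (123_390, 99_413),
    "arena_pressure": (280_228, 237_222),
    "bulk_small_alloc": (93_894, 85_661),
    "list_of_tuples": (88_019, 81_778),
}

for name, (base, mod) in dtlb_misses.items():
    print(f"{name}: {percent_change(base, mod):+.1f}%")
```

Negative values mean fewer misses with the hint applied.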

The improvement is smaller than with MAP_HUGETLB, probably because MADV_HUGEPAGE is only a hint, so khugepaged may not have kicked in yet.
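One way to check whether huge pages actually back the process is to sum the `AnonHugePages` fields in `/proc/self/smaps`. The sketch below is not part of this PR and the helper names are hypothetical:

```python
def anon_huge_kb(smaps_text):
    """Sum all 'AnonHugePages: N kB' entries in smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("AnonHugePages:"):
            total += int(line.split()[1])
    return total


def current_anon_huge_kb(path="/proc/self/smaps"):
    """Return kB of THP-backed anonymous memory, or None if unreadable."""
    try:
        with open(path) as f:
            return anon_huge_kb(f.read())
    except OSError:
        # Not Linux, or /proc is not mounted.
        return None


if __name__ == "__main__":
    print("AnonHugePages (kB):", current_anon_huge_kb())
```

Calling `current_anon_huge_kb()` at the end of a benchmark function would show how much memory was promoted by the time it finished. A value of 0 would be consistent with the kernel not having acted on the hint yet.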

I noted no regression with THP=always.

The only thing I'm wondering about is whether and how it should be guarded. Enabling it by default seems risky, but it's not exactly --with-pymalloc-hugepages either. That's why I'm opening this as a draft.

Results from the pyperformance suite with --rigorous (I'd say the differences are jitter: asyncio_tcp is I/O bound, scimark is numpy, the benchmarks are short-lived, etc.):

uv run --with pyperf python -m pyperf compare_to /tmp/baseline_affinity.json /tmp/modified_affinity.json --table --table-format md

Benchmark	baseline_affinity	modified_affinity
many_optionals	693 us	688 us: 1.01x faster
subparsers	7.71 ms	7.65 ms: 1.01x faster
async_generators	290 ms	288 ms: 1.00x faster
async_tree_cpu_io_mixed_tg	411 ms	414 ms: 1.01x slower
async_tree_eager_cpu_io_mixed	344 ms	344 ms: 1.00x faster
async_tree_eager_cpu_io_mixed_tg	380 ms	385 ms: 1.01x slower
async_tree_eager_memoization	172 ms	170 ms: 1.01x faster
async_tree_eager_tg	162 ms	165 ms: 1.02x slower
async_tree_io	447 ms	454 ms: 1.02x slower
async_tree_memoization	243 ms	234 ms: 1.04x faster
async_tree_memoization_tg	252 ms	239 ms: 1.06x faster
async_tree_none_tg	200 ms	195 ms: 1.03x faster
asyncio_tcp	301 ms	269 ms: 1.12x faster
asyncio_tcp_ssl	1.28 sec	1.27 sec: 1.00x faster
asyncio_websockets	359 ms	357 ms: 1.01x faster
chameleon	12.1 ms	12.1 ms: 1.01x faster
chaos	44.4 ms	44.1 ms: 1.01x faster
comprehensions	12.6 us	12.7 us: 1.01x slower
bench_thread_pool	795 us	800 us: 1.01x slower
crypto_pyaes	56.8 ms	57.3 ms: 1.01x slower
dask	700 ms	698 ms: 1.00x faster
deepcopy	186 us	186 us: 1.00x slower
deepcopy_reduce	2.06 us	2.08 us: 1.01x slower
deepcopy_memo	18.6 us	19.1 us: 1.03x slower
deltablue	2.50 ms	2.47 ms: 1.01x faster
django_template	29.7 ms	29.5 ms: 1.01x faster
docutils	2.21 sec	2.19 sec: 1.01x faster
dulwich_log	44.2 ms	45.0 ms: 1.02x slower
fannkuch	285 ms	280 ms: 1.02x faster
gc_traversal	4.08 ms	4.25 ms: 1.04x slower
generators	22.9 ms	22.6 ms: 1.01x faster
genshi_text	17.1 ms	17.3 ms: 1.01x slower
genshi_xml	39.5 ms	39.2 ms: 1.01x faster
go	90.0 ms	89.8 ms: 1.00x faster
hexiom	4.39 ms	4.47 ms: 1.02x slower
html5lib	48.9 ms	48.3 ms: 1.01x faster
json_dumps	7.57 ms	7.50 ms: 1.01x faster
json_loads	18.4 us	18.5 us: 1.01x slower
logging_simple	4.54 us	4.43 us: 1.02x faster
mako	8.47 ms	8.49 ms: 1.00x slower
mdp	941 ms	965 ms: 1.03x slower
meteor_contest	95.9 ms	94.8 ms: 1.01x faster
nbody	67.5 ms	67.9 ms: 1.01x slower
nqueens	73.6 ms	72.4 ms: 1.02x faster
pathlib	10.0 ms	10.1 ms: 1.01x slower
pickle	13.8 us	13.8 us: 1.01x faster
pickle_dict	24.9 us	24.6 us: 1.01x faster
pickle_list	4.08 us	4.11 us: 1.01x slower
pickle_pure_python	250 us	247 us: 1.01x faster
pidigits	185 ms	184 ms: 1.00x faster
pprint_safe_repr	573 ms	568 ms: 1.01x faster
pprint_pformat	1.18 sec	1.16 sec: 1.02x faster
pyflate	327 ms	324 ms: 1.01x faster
python_startup	11.0 ms	11.0 ms: 1.00x faster
python_startup_no_site	6.48 ms	6.48 ms: 1.00x faster
raytrace	211 ms	214 ms: 1.02x slower
regex_compile	98.2 ms	98.3 ms: 1.00x slower
regex_dna	164 ms	156 ms: 1.05x faster
regex_effbot	2.32 ms	2.18 ms: 1.06x faster
regex_v8	18.2 ms	17.5 ms: 1.04x faster
richards	33.6 ms	34.3 ms: 1.02x slower
richards_super	38.5 ms	38.3 ms: 1.00x faster
scimark_fft	204 ms	203 ms: 1.01x faster
scimark_lu	68.9 ms	66.6 ms: 1.04x faster
scimark_monte_carlo	43.2 ms	44.0 ms: 1.02x slower
scimark_sor	75.9 ms	74.7 ms: 1.02x faster
scimark_sparse_mat_mult	3.24 ms	3.05 ms: 1.06x faster
spectral_norm	64.5 ms	64.7 ms: 1.00x slower
sphinx	808 ms	798 ms: 1.01x faster
sqlglot_v2_normalize	82.4 ms	83.6 ms: 1.01x slower
sqlglot_v2_optimize	41.7 ms	41.8 ms: 1.00x slower
sqlglot_v2_parse	973 us	990 us: 1.02x slower
sqlglot_v2_transpile	1.26 ms	1.25 ms: 1.00x faster
sympy_integrate	16.4 ms	16.4 ms: 1.00x slower
sympy_sum	112 ms	112 ms: 1.00x faster
sympy_str	214 ms	215 ms: 1.01x slower
telco	118 ms	120 ms: 1.02x slower
tomli_loads	1.49 sec	1.50 sec: 1.01x slower
tornado_http	79.7 ms	79.4 ms: 1.00x faster
typing_runtime_protocols	124 us	121 us: 1.02x faster
unpack_sequence	32.7 ns	31.9 ns: 1.03x faster
unpickle	11.0 us	10.7 us: 1.02x faster
unpickle_list	3.95 us	3.99 us: 1.01x slower
unpickle_pure_python	163 us	163 us: 1.00x slower
xdsl_constant_fold	36.0 ms	36.2 ms: 1.01x slower
xml_etree_parse	109 ms	108 ms: 1.01x faster
xml_etree_iterparse	68.3 ms	67.3 ms: 1.01x faster
xml_etree_generate	68.3 ms	67.3 ms: 1.01x faster
xml_etree_process	47.4 ms	47.7 ms: 1.01x slower
Geometric mean	(ref)	1.00x faster

Issue: Add huge pages support for pymalloc #144319

poof

de7eac6

bedevere-app bot mentioned this pull request Jan 31, 2026

Add huge pages support for pymalloc #144319

Open
