
Commit 6dbddac
Author: Ralf Waldukat
CRITICAL FIX: Add missing parameters to llama_params_fit

Found by Gemini-3-Flash deep review. The llama_params_fit binding declared only 6 parameters, missing margin, n_ctx_min, and log_level (and keeping a stale n_buft_overrides), so any actual call through it would have corrupted the stack.

Before (WRONG - 6 parameters):
- path_model, mparams, cparams, tensor_split, tensor_buft_overrides, n_buft_overrides

After (CORRECT - 8 parameters, per llama.h:480-488):
- path_model, mparams, cparams, tensor_split, tensor_buft_overrides, margin, n_ctx_min, log_level

Impact: prevents a stack corruption/segfault. llama_params_fit (memory fitting) is rarely used, so the bug was latent but critical.
1 parent: d53fc2e
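For orientation, here is the corrected binding assembled into one piece from the three hunks below. This is a sketch, not verbatim file contents: the first two argtypes and wrapper parameters (path_model and mparams) are not visible in the diff and are inferred from the C prototype comments, so treat those lines as assumptions.

```python
# Sketch of the corrected 8-parameter binding after this commit.
# The first two entries are inferred; the rest appear in the diff hunks below.
@ctypes_function(
    "llama_params_fit",
    [
        ctypes.c_char_p,                       # const char * path_model (inferred)
        ctypes.POINTER(llama_model_params),    # struct llama_model_params * mparams (inferred)
        ctypes.POINTER(llama_context_params),  # struct llama_context_params * cparams
        ctypes.POINTER(ctypes.c_float),        # float * tensor_split
        ctypes.c_void_p,                       # tensor_buft_overrides - not fully bound
        ctypes.c_size_t,                       # size_t margin
        ctypes.c_uint32,                       # uint32_t n_ctx_min
        ctypes.c_int,                          # enum ggml_log_level log_level
    ],
    ctypes.c_int,
)
def llama_params_fit(
    path_model: bytes,                                  # inferred
    mparams: CtypesPointerOrRef[llama_model_params],    # inferred
    cparams: CtypesPointerOrRef[llama_context_params],
    tensor_split: CtypesArray[ctypes.c_float],
    tensor_buft_overrides: Optional[ctypes.c_void_p],
    margin: Union[ctypes.c_size_t, int],
    n_ctx_min: Union[ctypes.c_uint32, int],
    log_level: int,
    /,
) -> int: ...
```

The danger fixed here is that ctypes trusts the declared argtypes completely: with only 6 argtypes declared, the C function would read margin, n_ctx_min, and log_level from registers or stack slots the caller never populated.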

File tree

1 file changed: +14 −3 lines


llama_cpp/llama_cpp.py

Lines changed: 14 additions & 3 deletions
```diff
@@ -1409,7 +1409,9 @@ def llama_max_tensor_buft_overrides() -> int:
 # struct llama_context_params * cparams,
 # float * tensor_split,
 # struct llama_model_tensor_buft_override * tensor_buft_overrides,
-# size_t n_buft_overrides);
+# size_t margin,
+# uint32_t n_ctx_min,
+# enum ggml_log_level log_level);
 @ctypes_function(
     "llama_params_fit",
     [
@@ -1418,7 +1420,9 @@ def llama_max_tensor_buft_overrides() -> int:
         ctypes.POINTER(llama_context_params),
         ctypes.POINTER(ctypes.c_float),
         ctypes.c_void_p,  # tensor_buft_overrides - not fully bound
-        ctypes.c_size_t,
+        ctypes.c_size_t,  # margin
+        ctypes.c_uint32,  # n_ctx_min
+        ctypes.c_int,  # ggml_log_level (enum)
     ],
     ctypes.c_int,
 )
@@ -1428,11 +1432,18 @@ def llama_params_fit(
     cparams: CtypesPointerOrRef[llama_context_params],
     tensor_split: CtypesArray[ctypes.c_float],
     tensor_buft_overrides: Optional[ctypes.c_void_p],
-    n_buft_overrides: Union[ctypes.c_size_t, int],
+    margin: Union[ctypes.c_size_t, int],
+    n_ctx_min: Union[ctypes.c_uint32, int],
+    log_level: int,
     /,
 ) -> int:
     """Check if model parameters will fit in memory
 
+    Args:
+        margin: Memory margin to leave per device in bytes
+        n_ctx_min: Minimum context size when trying to reduce memory
+        log_level: Minimum log level (ggml_log_level enum)
+
     Returns:
         LLAMA_PARAMS_FIT_STATUS_SUCCESS (0) - found allocations that are projected to fit
         LLAMA_PARAMS_FIT_STATUS_FAILURE (1) - could not find allocations that are projected to fit
```
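A minimal usage sketch of the corrected call, assuming the underlying libllama build actually exports llama_params_fit. The model path, margin, n_ctx_min, and log_level values are illustrative assumptions; llama_model_default_params, llama_context_default_params, and llama_max_devices are existing helpers in these bindings.

```python
import ctypes
import llama_cpp

# Default parameter structs provided by the bindings.
mparams = llama_cpp.llama_model_default_params()
cparams = llama_cpp.llama_context_default_params()

# One float per device; all zeros lets the library choose the split.
tensor_split = (ctypes.c_float * llama_cpp.llama_max_devices())()

status = llama_cpp.llama_params_fit(
    b"/path/to/model.gguf",  # hypothetical model path
    ctypes.byref(mparams),
    ctypes.byref(cparams),
    tensor_split,
    None,                    # tensor_buft_overrides - not fully bound
    512 * 1024 * 1024,       # margin: leave ~512 MiB free per device (assumed value)
    512,                     # n_ctx_min: context floor when reducing memory (assumed value)
    2,                       # log_level: ggml_log_level value (2 is INFO in current ggml)
)

if status == 0:   # LLAMA_PARAMS_FIT_STATUS_SUCCESS
    print("allocations are projected to fit")
else:             # LLAMA_PARAMS_FIT_STATUS_FAILURE (1)
    print("could not find allocations that are projected to fit")
```

Note that with the pre-fix 6-parameter binding, exactly this call would have raised a TypeError or, worse, passed garbage for margin, n_ctx_min, and log_level, which is the latent bug this commit closes.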
