Does dump-on-demand really work? #140

@squadgazzz

I adopted tikv-jemallocator in our services to investigate memory leaks, set up as follows:

  • Compile everything with debug symbols enabled: CARGO_PROFILE_RELEASE_DEBUG=1 cargo build --release
  • Start the application with profiling enabled but inactive, using _RJEM_MALLOC_CONF=prof:true,prof_active:false,lg_prof_sample:19,prof_prefix:/tmp/jeprof. This avoids recording allocations during warm-up, so the profile covers only the memory leak.
  • Then activate profiling with:
    async fn set_enabled(&self, enabled: bool, socket: &mut UnixStream) -> bool {
        let mut state = self.active.lock().await;
        match unsafe { tikv_jemalloc_ctl::raw::update(b"prof.active\0", enabled) } {
            Ok(was_enabled) => {
                *state = enabled;
                was_enabled != enabled
            }
            Err(err) => {
                log(
                    socket,
                    format!("failed to set memory profiler state to {enabled}: {err:?}"),
                )
                .await;
                false
            }
        }
    }
  • Wait a few hours, then request a dump with:
    async fn dump_prof(&self, socket: &mut UnixStream) {
        let state = self.active.lock().await;
        // Hold the lock, so no other thread can disable profiling
        // while the dump is being created.
        if !*state {
            log(
                socket,
                "memory profiler is not active, cannot dump".to_string(),
            )
            .await;
            return;
        }

        let timestamp = chrono::Utc::now().format("%Y%m%d_%H%M%S").to_string();
        let process_name = self.process_name.as_str();
        let filename = format!("jemalloc_dump_{process_name}_{timestamp}.heap");
        let full_path = match Self::get_dump_dir_path(socket).await {
            Ok(path) => path.join(filename),
            Err(err) => {
                log(
                    socket,
                    format!("failed to get dump dir path, dump was not saved: {err:?}"),
                )
                .await;
                return;
            }
        };

        {
            let Some(bytes) = CString::new(full_path.as_os_str().as_bytes()).ok() else {
                log(
                    socket,
                    format!("failed to create CString from path {full_path:?}"),
                )
                .await;
                return;
            };

            log(
                socket,
                format!("saving jemalloc profiling dump to {full_path:?}"),
            )
            .await;
            let mut bytes = bytes.into_bytes_with_nul();
            let ptr = bytes.as_mut_ptr().cast::<c_char>();
            if let Err(err) = unsafe { tikv_jemalloc_ctl::raw::write(b"prof.dump\0", ptr) } {
                log(
                    socket,
                    format!("failed to dump jemalloc profiling data: {err:?}"),
                )
                .await;
                return;
            }
        }

        log(
            socket,
            format!("saved jemalloc profiling dump to {full_path:?}"),
        )
        .await;
    }
  • No matter how long profiling stays active or how much memory grows during that time, the dump stays very short and contains no allocation records:
heap dump file
heap_v2/2097152
  t*: 0: 0 [0: 0]
  t0: 0: 0 [0: 0]

MAPPED_LIBRARIES:
55d863cc3000-55d863df9000 r--p 00000000 103:01 136352606                 /usr/local/bin/solvers
55d863df9000-55d86484a000 r-xp 00136000 103:01 136352606                 /usr/local/bin/solvers
55d86484a000-55d864c09000 r--p 00b87000 103:01 136352606                 /usr/local/bin/solvers
55d864c09000-55d864cf1000 r--p 00f45000 103:01 136352606                 /usr/local/bin/solvers
55d864cf1000-55d864cf9000 rw-p 0102d000 103:01 136352606                 /usr/local/bin/solvers
55d864cf9000-55d864f1a000 rw-p 00000000 00:00 0 
55d86a66a000-55d86a8cb000 rw-p 00000000 00:00 0                          [heap]
7fe870000000-7fe870021000 rw-p 00000000 00:00 0 
7fe870021000-7fe874000000 ---p 00000000 00:00 0 
7fe874000000-7fe874021000 rw-p 00000000 00:00 0 
7fe874021000-7fe878000000 ---p 00000000 00:00 0 
7fe878000000-7fe878021000 rw-p 00000000 00:00 0 
7fe878021000-7fe87c000000 ---p 00000000 00:00 0 
7fe87c000000-7fe87c021000 rw-p 00000000 00:00 0 
7fe87c021000-7fe880000000 ---p 00000000 00:00 0 
7fe880000000-7fe8814fe000 rw-p 00000000 00:00 0 
7fe8814fe000-7fe884000000 ---p 00000000 00:00 0 
7fe884000000-7fe884021000 rw-p 00000000 00:00 0 
7fe884021000-7fe888000000 ---p 00000000 00:00 0 
7fe888000000-7fe889470000 rw-p 00000000 00:00 0 
7fe889470000-7fe88c000000 ---p 00000000 00:00 0 
7fe88c000000-7fe88d497000 rw-p 00000000 00:00 0 
7fe88d497000-7fe890000000 ---p 00000000 00:00 0 
7fe890000000-7fe891229000 rw-p 00000000 00:00 0 
7fe891229000-7fe894000000 ---p 00000000 00:00 0 
7fe894000000-7fe89551c000 rw-p 00000000 00:00 0 
7fe89551c000-7fe898000000 ---p 00000000 00:00 0 
7fe898000000-7fe899349000 rw-p 00000000 00:00 0 
7fe899349000-7fe89c000000 ---p 00000000 00:00 0 
7fe89c000000-7fe89d616000 rw-p 00000000 00:00 0 
7fe89d616000-7fe8a0000000 ---p 00000000 00:00 0 
7fe8a0000000-7fe8a13e6000 rw-p 00000000 00:00 0 
7fe8a13e6000-7fe8a4000000 ---p 00000000 00:00 0 
7fe8a5c00000-7fe8a6000000 rw-p 00000000 00:00 0 
7fe8a61f2000-7fe8a61f3000 ---p 00000000 00:00 0 
7fe8a61f3000-7fe8a63f3000 rw-p 00000000 00:00 0 
7fe8a63f3000-7fe8a63f4000 ---p 00000000 00:00 0 
7fe8a63f4000-7fe8a65f4000 rw-p 00000000 00:00 0 
7fe8a65f4000-7fe8a65f5000 ---p 00000000 00:00 0 
7fe8a65f5000-7fe8a67f5000 rw-p 00000000 00:00 0 
7fe8a67f5000-7fe8a67f6000 ---p 00000000 00:00 0 
7fe8a67f6000-7fe8a69f6000 rw-p 00000000 00:00 0 
7fe8a69f6000-7fe8a69f7000 ---p 00000000 00:00 0 
7fe8a69f7000-7fe8a6bf7000 rw-p 00000000 00:00 0 
7fe8a6bf7000-7fe8a6bf8000 ---p 00000000 00:00 0 
7fe8a6bf8000-7fe8a6df8000 rw-p 00000000 00:00 0 
7fe8a6df8000-7fe8a6df9000 ---p 00000000 00:00 0 
7fe8a6df9000-7fe8a6ff9000 rw-p 00000000 00:00 0 
7fe8a6ff9000-7fe8a6ffa000 ---p 00000000 00:00 0 
7fe8a6ffa000-7fe8a71fa000 rw-p 00000000 00:00 0 
7fe8a71fa000-7fe8a71fb000 ---p 00000000 00:00 0 
7fe8a71fb000-7fe8a73fb000 rw-p 00000000 00:00 0 
7fe8a73fb000-7fe8a73fc000 ---p 00000000 00:00 0 
7fe8a73fc000-7fe8a75fc000 rw-p 00000000 00:00 0 
7fe8a75fc000-7fe8a75fd000 ---p 00000000 00:00 0 
7fe8a75fd000-7fe8a77fd000 rw-p 00000000 00:00 0 
7fe8a77fd000-7fe8a77fe000 ---p 00000000 00:00 0 
7fe8a77fe000-7fe8a79fe000 rw-p 00000000 00:00 0 
7fe8a79fe000-7fe8a79ff000 ---p 00000000 00:00 0 
7fe8a79ff000-7fe8a7bff000 rw-p 00000000 00:00 0 
7fe8a7bff000-7fe8a7c00000 ---p 00000000 00:00 0 
7fe8a7c00000-7fe8a7e00000 rw-p 00000000 00:00 0 
7fe8a7e00000-7fe8a8600000 rw-p 00000000 00:00 0 
7fe8a8784000-7fe8a8785000 ---p 00000000 00:00 0 
7fe8a8785000-7fe8a8787000 rw-p 00000000 00:00 0 
7fe8a8787000-7fe8a8788000 ---p 00000000 00:00 0 
7fe8a8788000-7fe8a878a000 rw-p 00000000 00:00 0 
7fe8a878a000-7fe8a878b000 ---p 00000000 00:00 0 
7fe8a878b000-7fe8a878d000 rw-p 00000000 00:00 0 
7fe8a878d000-7fe8a878e000 ---p 00000000 00:00 0 
7fe8a878e000-7fe8a8790000 rw-p 00000000 00:00 0 
7fe8a8790000-7fe8a8791000 ---p 00000000 00:00 0 
7fe8a8791000-7fe8a8793000 rw-p 00000000 00:00 0 
7fe8a8793000-7fe8a8794000 ---p 00000000 00:00 0 
7fe8a8794000-7fe8a8796000 rw-p 00000000 00:00 0 
7fe8a8796000-7fe8a8797000 ---p 00000000 00:00 0 
7fe8a8797000-7fe8a8799000 rw-p 00000000 00:00 0 
7fe8a8799000-7fe8a879a000 ---p 00000000 00:00 0 
7fe8a879a000-7fe8a879c000 rw-p 00000000 00:00 0 
7fe8a879c000-7fe8a879d000 ---p 00000000 00:00 0 
7fe8a879d000-7fe8a87a4000 rw-p 00000000 00:00 0 
7fe8a87a4000-7fe8a87ca000 r--p 00000000 103:01 177454126                 /usr/lib/x86_64-linux-gnu/libc.so.6
7fe8a87ca000-7fe8a891f000 r-xp 00026000 103:01 177454126                 /usr/lib/x86_64-linux-gnu/libc.so.6
7fe8a891f000-7fe8a8972000 r--p 0017b000 103:01 177454126                 /usr/lib/x86_64-linux-gnu/libc.so.6
7fe8a8972000-7fe8a8976000 r--p 001ce000 103:01 177454126                 /usr/lib/x86_64-linux-gnu/libc.so.6
7fe8a8976000-7fe8a8978000 rw-p 001d2000 103:01 177454126                 /usr/lib/x86_64-linux-gnu/libc.so.6
7fe8a8978000-7fe8a8985000 rw-p 00000000 00:00 0 
7fe8a8985000-7fe8a8995000 r--p 00000000 103:01 177541866                 /usr/lib/x86_64-linux-gnu/libm.so.6
7fe8a8995000-7fe8a8a09000 r-xp 00010000 103:01 177541866                 /usr/lib/x86_64-linux-gnu/libm.so.6
7fe8a8a09000-7fe8a8a63000 r--p 00084000 103:01 177541866                 /usr/lib/x86_64-linux-gnu/libm.so.6
7fe8a8a63000-7fe8a8a64000 r--p 000dd000 103:01 177541866                 /usr/lib/x86_64-linux-gnu/libm.so.6
7fe8a8a64000-7fe8a8a65000 rw-p 000de000 103:01 177541866                 /usr/lib/x86_64-linux-gnu/libm.so.6
7fe8a8a65000-7fe8a8a68000 r--p 00000000 103:01 177540588                 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
7fe8a8a68000-7fe8a8a7f000 r-xp 00003000 103:01 177540588                 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
7fe8a8a7f000-7fe8a8a83000 r--p 0001a000 103:01 177540588                 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
7fe8a8a83000-7fe8a8a84000 r--p 0001d000 103:01 177540588                 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
7fe8a8a84000-7fe8a8a85000 rw-p 0001e000 103:01 177540588                 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
7fe8a8a85000-7fe8a8b4a000 r--p 00000000 103:01 113269826                 /usr/lib/x86_64-linux-gnu/libcrypto.so.3
7fe8a8b4a000-7fe8a8dc6000 r-xp 000c5000 103:01 113269826                 /usr/lib/x86_64-linux-gnu/libcrypto.so.3
7fe8a8dc6000-7fe8a8ea4000 r--p 00341000 103:01 113269826                 /usr/lib/x86_64-linux-gnu/libcrypto.so.3
7fe8a8ea4000-7fe8a8f06000 r--p 0041e000 103:01 113269826                 /usr/lib/x86_64-linux-gnu/libcrypto.so.3
7fe8a8f06000-7fe8a8f09000 rw-p 00480000 103:01 113269826                 /usr/lib/x86_64-linux-gnu/libcrypto.so.3
7fe8a8f09000-7fe8a8f0c000 rw-p 00000000 00:00 0 
7fe8a8f0c000-7fe8a8f2b000 r--p 00000000 103:01 113269827                 /usr/lib/x86_64-linux-gnu/libssl.so.3
7fe8a8f2b000-7fe8a8f88000 r-xp 0001f000 103:01 113269827                 /usr/lib/x86_64-linux-gnu/libssl.so.3
7fe8a8f88000-7fe8a8fa7000 r--p 0007c000 103:01 113269827                 /usr/lib/x86_64-linux-gnu/libssl.so.3
7fe8a8fa7000-7fe8a8fb1000 r--p 0009a000 103:01 113269827                 /usr/lib/x86_64-linux-gnu/libssl.so.3
7fe8a8fb1000-7fe8a8fb5000 rw-p 000a4000 103:01 113269827                 /usr/lib/x86_64-linux-gnu/libssl.so.3
7fe8a8fb7000-7fe8a8fb8000 ---p 00000000 00:00 0 
7fe8a8fb8000-7fe8a8fbc000 rw-p 00000000 00:00 0 
7fe8a8fbc000-7fe8a8fbd000 r--p 00000000 103:01 177233239                 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7fe8a8fbd000-7fe8a8fe3000 r-xp 00001000 103:01 177233239                 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7fe8a8fe3000-7fe8a8fed000 r--p 00027000 103:01 177233239                 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7fe8a8fed000-7fe8a8fef000 r--p 00031000 103:01 177233239                 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7fe8a8fef000-7fe8a8ff1000 rw-p 00033000 103:01 177233239                 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7fff3e4f0000-7fff3e518000 rw-p 00000000 00:00 0                          [stack]
7fff3e5f1000-7fff3e5f5000 r--p 00000000 00:00 0                          [vvar]
7fff3e5f5000-7fff3e5f7000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
  • Then, when I try to create a report, jeprof finds no allocations in the file:
root@d70392f1dd37:/jm-dump# jeprof --show_bytes --pdf ./solvers ./jemalloc_dump_baseline_20250828_064852.heap > report.pdf
Using local file ./solvers.
Argument "MSWin32" isn't numeric in numeric eq (==) at /usr/local/bin/jeprof line 5314.
Argument "linux" isn't numeric in numeric eq (==) at /usr/local/bin/jeprof line 5314.
Using local file ./jemalloc_dump_baseline_20250828_064852.heap.
No nodes to print
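
The configuration string and the heap dump header quoted above can be cross-checked mechanically. The following is a minimal sketch using only the standard library, not part of the service code: `ProfConf`, `HeapHeader`, `parse_conf`, and `parse_header` are hypothetical helper names. It assumes that the number after `heap_v2/` is the sample period in bytes and that the `t*:` line holds the sampled totals as `curobjs: curbytes [accumobjs: accumbytes]`.

```rust
/// Profiling-related settings extracted from a jemalloc-style option string.
#[derive(Debug, Default, PartialEq)]
struct ProfConf {
    prof: Option<bool>,
    prof_active: Option<bool>,
    lg_prof_sample: Option<u32>,
}

/// Parse a comma-separated `key:value` option string such as the one passed
/// in `_RJEM_MALLOC_CONF`. Keys this sketch does not know are ignored.
fn parse_conf(conf: &str) -> ProfConf {
    let mut out = ProfConf::default();
    for item in conf.split(',') {
        let Some((key, value)) = item.split_once(':') else { continue };
        match key.trim() {
            "prof" => out.prof = value.parse().ok(),
            "prof_active" => out.prof_active = value.parse().ok(),
            "lg_prof_sample" => out.lg_prof_sample = value.parse().ok(),
            _ => {}
        }
    }
    out
}

/// Sample period and live totals read from the top of a heap_v2 dump.
#[derive(Debug, PartialEq)]
struct HeapHeader {
    sample_period: u64,
    cur_objs: u64,
    cur_bytes: u64,
}

/// Read the `heap_v2/<period>` line and the following `t*:` totals line.
/// Returns `None` if either line is missing or malformed.
fn parse_header(dump: &str) -> Option<HeapHeader> {
    let mut lines = dump.lines().map(str::trim);
    let sample_period: u64 = lines
        .find_map(|l| l.strip_prefix("heap_v2/"))?
        .parse()
        .ok()?;
    let totals = lines.find_map(|l| l.strip_prefix("t*:"))?;
    // Assumed layout: "<curobjs>: <curbytes> [<accumobjs>: <accumbytes>]"
    let (cur, _accum) = totals.split_once('[')?;
    let (objs, bytes) = cur.split_once(':')?;
    Some(HeapHeader {
        sample_period,
        cur_objs: objs.trim().parse().ok()?,
        cur_bytes: bytes.trim().parse().ok()?,
    })
}

fn main() {
    let conf = parse_conf(
        "prof:true,prof_active:false,lg_prof_sample:19,prof_prefix:/tmp/jeprof",
    );
    let header = parse_header("heap_v2/2097152\n  t*: 0: 0 [0: 0]\n  t0: 0: 0 [0: 0]\n")
        .expect("heap_v2 header present");

    let configured = 1u64 << conf.lg_prof_sample.expect("lg_prof_sample set");
    println!("configured sample period: {configured} bytes");
    println!(
        "dump sample period: {} bytes (lg = {})",
        header.sample_period,
        header.sample_period.trailing_zeros()
    );
    println!(
        "sampled live objects: {}, bytes: {}",
        header.cur_objs, header.cur_bytes
    );
}
```

Under those assumptions, lg_prof_sample:19 corresponds to 524288 bytes, while the header above reports 2097152 bytes (2^21), so it may be worth checking whether the option string is actually reaching the allocator.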

I am running this on Debian (docker.io/debian:bookworm-slim) with the following packages installed:

ca-certificates tini gettext-base build-essential cmake git zlib1g-dev libelf-dev libdw-dev libboost-dev libboost-iostreams-dev libboost-program-options-dev libboost-system-dev libboost-filesystem-dev libunwind-dev libzstd-dev

Full code: https://github.com/cowprotocol/services/blob/c8bdc0e99f7e719b6b8199c85f387225c79b8af2/crates/shared/src/alloc.rs#L85-L145
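
The path-to-C-string step in dump_prof can be exercised in isolation, without jemalloc. This is a minimal sketch using only the standard library (Unix-only because of `OsStrExt`); the helper name `path_to_cstring` and the example path are hypothetical and not taken from the linked code.

```rust
use std::ffi::CString;
use std::os::raw::c_char;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Convert a filesystem path into an owned, NUL-terminated C string.
/// Returns `None` if the path contains an interior NUL byte.
fn path_to_cstring(path: &Path) -> Option<CString> {
    CString::new(path.as_os_str().as_bytes()).ok()
}

fn main() {
    // Hypothetical dump location, for illustration only.
    let path = Path::new("/tmp/jeprof/example.heap");
    let c_path = path_to_cstring(path).expect("path has no interior NUL");

    // The pointer is valid only while `c_path` is alive, so the owning
    // CString must outlive any FFI call that receives the pointer.
    let ptr: *const c_char = c_path.as_ptr();
    assert!(!ptr.is_null());

    println!("{} bytes including NUL", c_path.as_bytes_with_nul().len());
}
```

Keeping the owned `CString` in scope until the FFI call returns is what keeps the pointer valid. The original code gets the same guarantee from holding its `bytes` vector for the duration of the call.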

I've already read many examples. Is anything missing from my setup or code that prevents a proper memory dump? Do I need jemalloc installed on the runtime machine?
