
Excessive Memory Usage and Continuous Growth When Processing hog_big in FastOMA with 2570 Species #82

@alvinliu89757

Description


Hi

I am using FastOMA to infer orthologous groups from the protein sequences of 2570 species. The overall process initially seemed to be progressing smoothly; however, the computation has been stuck on the hog_big processing step for over a week. During this time, I have observed a severe and continuous increase in memory (RAM) usage.

Initially, I allocated 4 TB of memory, which was completely filled. I then increased the available memory to 12 TB to accommodate the growth, but the memory footprint keeps increasing and is consuming this additional capacity as well. Current memory status: size 12T, used 4.8T, avail 7.3T, use% 40%.

Current Nextflow status:
```
executor >  local (178)
[f4/408a97] check_input (1)                      | 1 of 1, cached: 1 ✔
[8a/4d0f3d] oma…peruviana_NRRL66754.fasta)       | 2570 of 2570, cached: 2570 ✔
[f7/5dac82] infer_roothogs (1)                   | 1 of 1, cached: 1 ✔
[25/b25ad3] batch_roothogs (1)                   | 1 of 1, cached: 1 ✔
[77/1e0f3d] hog_big (1054)                       | 2497 of 7378, cached: 2445, retries: 117
[a7/42e896] hog_rest (1231)                      | 2106 of 2106, cached: 2106 ✔
[-        ] collect_subhogs                      -
[-        ] ext…airwise_ortholog_relations       -
[-        ] fastoma_report                       -
[2c/6fe723] NOTE: Process hog_big (1004) terminated with an error exit status (137) -- Execution is retried (1)
```

My run script:

```bash
#!/bin/bash
#SBATCH --job-name=NP_batchScan_P1
#SBATCH -N 1
#SBATCH --partition=smp
#SBATCH --cpus-per-task=36
#SBATCH --output=%x_%j.out
#SBATCH --error=%x_%j.err

source /public/home/user/miniconda3/bin/activate
conda activate FastOMA-v04

CONFIG_PATH="/public/home/user/FOMA_ALF/custom_memory.config"

cd /public/home/user/FastOMA_0.4.0/FastOMA

nextflow run FastOMA.nf \
    -resume \
    -c $CONFIG_PATH \
    --input_folder /public/home/user/FOMA_ALF/allSpe \
    --output_folder /public/home/user/FOMA_ALF/results \
    --omamer_db /public/home/user/FastOMA/database/LUCA.h5
```
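Would enabling Nextflow's per-task trace help pinpoint which hog_big tasks blow up? A minimal sketch of what I could append to the custom config, assuming the standard trace scope and trace fields (the file name is just an example):

```groovy
// Record per-task resources so the peak RSS of every hog_big task is logged.
// All fields below are standard Nextflow trace fields.
trace {
    enabled = true
    file    = 'trace.txt'
    fields  = 'task_id,name,status,exit,attempt,peak_rss,peak_vmem'
}
```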

My config settings:

```groovy
process {
    // Force the specific process to run (if it fits)
    withName: hog_big {
        memory = 128.GB
    }
}

// Tell the executor (local machine) it is allowed to use up to 480 GB total
executor {
    name   = 'local'
    memory = 480.GB
}
```
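Is the right way to handle these exit status 137 kills something along the following lines, i.e. retrying hog_big with escalating memory and capping how many of these tasks run at once? A rough sketch of how I imagine extending the config; the numbers are placeholders I have not tested:

```groovy
process {
    withName: hog_big {
        // Start at 128 GB and grow on each retry attempt (128, 256, 384 GB).
        memory        = { 128.GB * task.attempt }
        // Retry only when the task was killed for exceeding memory (exit 137).
        errorStrategy = { task.exitStatus == 137 ? 'retry' : 'finish' }
        maxRetries    = 2
        // Limit how many hog_big tasks run concurrently on the single node.
        maxForks      = 8
    }
}
```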
