Chr issue in BAM #15

@sybrohee

Hi all,

When running vargrouper on an hg38 VCF file with the corresponding BAM and FASTA, I ran into the following issue.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/celltics-_version_-py3.8.egg/celltics/tools/vargroup.py", line 667, in <module>
    cli()
  File "/usr/local/lib/python3.8/dist-packages/click-8.1.3-py3.8.egg/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click-8.1.3-py3.8.egg/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click-8.1.3-py3.8.egg/click/core.py", line 1635, in invoke
    rv = super().invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click-8.1.3-py3.8.egg/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click-8.1.3-py3.8.egg/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/celltics-_version_-py3.8.egg/celltics/tools/vargroup.py", line 661, in cli
    main(input_file=input_file, output_file=output_file, bam_file=bam_file, merge_distance=merge_distance,
  File "/usr/local/lib/python3.8/dist-packages/celltics-_version_-py3.8.egg/celltics/tools/vargroup.py", line 628, in main
    records, var_dict = bam_and_merge_multiprocess(bam_file, vars_to_group, fq_threshold, min_reads,
  File "/usr/local/lib/python3.8/dist-packages/celltics-_version_-py3.8.egg/celltics/tools/vargroup.py", line 608, in bam_and_merge_multiprocess
    recs, var_dict_part = r.get().get_fat() if not debug and nthreads > 1 else r.get_fat()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
ValueError: invalid contig `chrchr11`

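This seems to happen because vargrouper prepends "chr" to the chromosome name when querying the BAM. It appears to assume that the VCF file uses chromosome names without the "chr" prefix, while the BAM may have it (see the function check_for_chr). In my hg38 files the VCF chromosome names already start with "chr", so chr11 becomes chrchr11.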

def check_for_chr(sam):
    """ Check sam file to see if 'chr' needs to be prepended to chromosome """
    if 'chr' in sam.references[0]:
        return True
    return False

It might be worth modifying check_for_chr so that it also takes the VCF file as an argument and checks whether the chromosome names there already use the "chr" prefix. The prefix would then only be added when the BAM has it and the VCF does not.
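To illustrate the idea, here is a minimal sketch of what such a check could look like. The function names `needs_chr_prefix` and `bam_contig_name` are hypothetical and not part of vargrouper. It assumes the BAM reference names are available as a list of strings (as in `sam.references`) and that the VCF chromosome name is a plain string. I have not tested it against the vargrouper code base.

```python
def needs_chr_prefix(bam_references, vcf_contig):
    """Return True only if the BAM uses 'chr' names and the VCF contig does not.

    bam_references: sequence of reference names from the BAM header.
    vcf_contig: chromosome name as written in the VCF record.
    Both names are hypothetical helpers, not vargrouper API.
    """
    # startswith() avoids false positives on names that merely contain 'chr'
    bam_has_chr = bool(bam_references) and bam_references[0].startswith('chr')
    vcf_has_chr = vcf_contig.startswith('chr')
    return bam_has_chr and not vcf_has_chr


def bam_contig_name(bam_references, vcf_contig):
    """Map a VCF chromosome name to the name used when querying the BAM."""
    if needs_chr_prefix(bam_references, vcf_contig):
        return 'chr' + vcf_contig
    return vcf_contig
```

With this, a VCF chromosome `chr11` would be left unchanged for a BAM that uses `chr` names, and `11` would still be turned into `chr11`. The opposite case (VCF with the prefix, BAM without it) is not handled here and would need separate treatment.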
