Python Bindings for lib_tsalign
Python bindings for the template switch aligner. It aligns two DNA sequences while detecting template switches: short-range translocations in which a query region is copied from (or aligns to) a different location, possibly on the reverse-complement strand.
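To make the notion of a template switch concrete, here is a small sketch that builds such a query by hand. It does not call tsalign, and the helper names (`reverse_complement`, `with_template_switch`) are hypothetical, chosen only for illustration. It shows the kind of sequence pattern the aligner is meant to detect, under the assumption of a same-length replacement taken from the reverse-complement strand.

```python
# Illustrative sketch only: it does not call tsalign.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over ACGTN."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_template_switch(
    reference: str, position: int, source_start: int, source_end: int
) -> str:
    """Overwrite the reference at `position` with the reverse complement
    of reference[source_start:source_end], keeping the length unchanged."""
    copied = reverse_complement(reference[source_start:source_end])
    return reference[:position] + copied + reference[position + len(copied):]


reference = "ACGTTGCAAGGCTTAACCGG"
# Six bases starting at index 10 are replaced by the reverse complement
# of the nearby region reference[2:8].
query = with_template_switch(reference, position=10, source_start=2, source_end=8)
print(reference)
print(query)
```

A plain gap-affine aligner would have to explain the replaced stretch with mismatches or gaps, whereas a template switch describes it as a single copy event from a nearby location.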
pip install tsalign

import tsalign
result = tsalign.align("ACGTACGT", "ACGACGT")
print(result.cigar()) # compact alignment string
print(result.stats()) # cost, duration, node counts, …

Create an Aligner once and reuse it for many sequences:

aligner = tsalign.Aligner(
    min_length_strategy="preprocess_lookahead",  # default: "lookahead"
    chaining_strategy="lower_bound",             # default: "none"
    total_length_strategy="maximise",            # default: "maximise"
    no_ts=False,                                 # set True for plain gap-affine
)
result = aligner.align("ACGTACGT", "ACGACGT")

Costs are specified in .tsa format. Use sample_tsa_config/config.tsa
as a starting point and consult the main repository README for a description
of each parameter.

aligner = tsalign.Aligner(costs_file="sample_tsa_config/config.tsa")
result = aligner.align("ACGTACGT", "ACGACGT")

You can also pass the cost string directly:

with open("my_costs.tsa") as f:
    cost_str = f.read()
aligner = tsalign.Aligner(costs=cost_str)

Use AlignmentRange to align only a window of the input sequences:

from tsalign import Aligner, AlignmentRange
aligner = Aligner()
result = aligner.align(
    "NNNACGTACGTNNN",
    "ACGACGT",
    range=AlignmentRange(reference_start=3, reference_end=11),
)
print(result.cigar())

Individual start/limit keyword arguments are also accepted when range is
not provided:

result = aligner.align(
    "NNNACGTACGTNNN",
    "ACGACGT",
    reference_start=3,
    reference_limit=11,
)

result.alignments() returns a typed list of (count, op) pairs:

from tsalign import align, TemplateSwitchEntranceOp, TemplateSwitchExitOp

# reference and query are DNA strings (or objects whose str() is one)
result = align(reference, query)
for count, op in result.alignments():
    if isinstance(op, TemplateSwitchEntranceOp):
        print(f"Template switch: {op.direction}, descendant={op.descendant}, offset={op.first_offset}")
    elif isinstance(op, TemplateSwitchExitOp):
        print(f"Exit, anti-descendant gap: {op.anti_descendant_gap}")
    else:
        # SimpleAlignmentOp: a basic edit in the primary or secondary track
        print(f"{count}x {op.kind}")

result = tsalign.align(reference, query)
result.viz_template_switches() # prints ASCII art to stdout

result = aligner.align(
    reference,
    query,
    cost_limit=100,        # return None if cost would exceed this
    memory_limit=500_000,  # return None if memory exceeds this number of bytes
)
if result is None:
    print("No alignment found within limits")

Any object whose str() representation is a valid DNA string (ACGTN) is
accepted, including Bio.Seq:

from Bio.Seq import Seq
result = tsalign.align(Seq("ACGTACGT"), Seq("ACGACGT"))
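
Because inputs are identified by their str() representation, it can be handy to check custom sequence objects before aligning them. The sketch below is independent of tsalign: `as_dna_string` and `Read` are hypothetical names used for illustration. It also assumes that only uppercase ACGTN counts as valid, which the text above suggests but does not spell out.

```python
import re

# Assumption: only uppercase A, C, G, T and N are valid characters.
_DNA_PATTERN = re.compile(r"[ACGTN]*")


def as_dna_string(obj) -> str:
    """Convert obj with str() and verify that the result is a DNA string."""
    text = str(obj)
    if _DNA_PATTERN.fullmatch(text) is None:
        unexpected = sorted(set(text) - set("ACGTN"))
        raise ValueError(f"not a DNA string, unexpected characters: {unexpected}")
    return text


class Read:
    """Hypothetical wrapper type whose str() is its sequence."""

    def __init__(self, name: str, sequence: str) -> None:
        self.name = name
        self.sequence = sequence

    def __str__(self) -> str:
        return self.sequence


read = Read("read_1", "ACGTNACGT")
checked = as_dna_string(read)  # passes: str(read) contains only ACGTN

try:
    as_dna_string("ACGU")  # 'U' is not in the assumed alphabet
except ValueError as error:
    print(error)
```

An object that passes this check has a str() representation of the form described above, so it could be handed to the aligner without first converting it to a plain string.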