
Can be hard to constrain root node to a fixed time on a large single tree #466

@hyanwong

When dating a large single tree whose root node should stay at a fixed time, I'm finding that tsdate changes the root time. The effect is relatively small in the example below (root dates before & after: 1658.0 vs. 1679.47), but it gets larger with bigger sample sizes. For example, in the larger ts of 4M samples from which this has been simplified down, the fixed root node is assigned twice the expected age.

import tsdate
import tszip

sts = tszip.load("simpified50000.tsz")
print(f"{sts.num_trees} trees, root is sample @ time {sts.nodes_time[sts.first().root]}: ", sts.first().root)
edge_times = sts.nodes_time[sts.edges_parent] - sts.nodes_time[sts.edges_child]
av_mu = sts.num_mutations / ((sts.edges_right - sts.edges_left) * edge_times).sum()

dts = tsdate.date(
    sts,
    mutation_rate=av_mu,
    rescaling_intervals=0,
    max_iterations=1,  # single tree, so only one round needed
    time_units=sts.time_units,
    allow_unary=True,
    progress=True,
    set_metadata=False,
)
print("Root dates before & after:", sts.nodes_time[sts.first().root], dts.nodes_time[dts.first().root])
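
For reference, the `av_mu` line above is just the total number of mutations divided by the total edge "area" (genomic span times branch length, summed over edges). Here is a minimal pure-Python sketch of that arithmetic on made-up toy edges; the helper name `average_mutation_rate` and the toy values are hypothetical and are not part of tsdate or tskit:

```python
def average_mutation_rate(num_mutations, edges):
    """Mutations per unit span per unit time, averaged over all edges.

    `edges` is an iterable of (left, right, parent_time, child_time) tuples,
    mirroring edges_left, edges_right and the parent/child node times.
    """
    total_area = sum(
        (right - left) * (parent_time - child_time)
        for left, right, parent_time, child_time in edges
    )
    if total_area <= 0:
        raise ValueError("total edge area must be positive")
    return num_mutations / total_area


# Toy values, made up purely for illustration.
toy_edges = [
    (0.0, 10.0, 5.0, 0.0),  # span 10, branch length 5 -> area 50
    (0.0, 10.0, 5.0, 0.0),  # span 10, branch length 5 -> area 50
    (0.0, 10.0, 8.0, 5.0),  # span 10, branch length 3 -> area 30
]
rate = average_mutation_rate(13, toy_edges)
print(rate)  # 13 mutations / total area 130
```

This rate is derived from the existing node times, so under that assumption the input tree's branch lengths already match the mutation count on average.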

simpified50000.tsz.zip
