Choose this option to calculate branch lengths between user-defined nodes. This analysis is equivalent to our former BranchLength tool.
Choose this option to find the least squares optimized midpoint of the tree when all samples are taken at a single time point.
Choose this option to find the "traditional" midpoint of the tree when all samples are taken at a single time point.
Choose this option to find the root that minimizes the sum of variances. This optimization finds the root that gives the most homogeneous (“clocklike”) rate in a tree with samples from two or more time points. Note that the rate is allowed to change between time points, so dynamics in the evolutionary rate can be investigated. Input of time point groupings is required for this method.
Below is an example of grouped taxon names. Groups are specified as lists of taxon names. Each taxon must be on a separate line, and groups are separated by an empty line. The first item in a group is taken as the time point of the group; it should be numeric and end in ':'. If the first item of a group is 'Discard:', the taxa belonging to that group will not be considered in the calculation. Any taxa that are not present in any group are treated as members of the 'Discard:' group.
The following can be pasted in with the Sample Input for testing the Grouped Taxon Names option:
1990:
B.US.90.5_
B.US.90.2_
B.US.90.3_

1981:
B.US.81.7_
B.US.81.5_
B.US.81.2_
B.US.81.6_

Discard:
B.US.81.1_
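The grouping format described above (one taxon per line, groups separated by an empty line, a header ending in ':') can be read with a few lines of Python. This is only an illustrative sketch, not the tool's own parser; the function name `parse_groups` and the choice to return a dictionary are assumptions made for the example.

```python
import re

def parse_groups(text):
    """Parse grouped taxon names into {group label: [taxon names]}.

    Hypothetical helper for illustration; not part of the tool.
    """
    groups = {}
    # Groups are separated by one or more empty lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        header = lines[0]
        if not header.endswith(":"):
            raise ValueError(f"group header must end in ':' (got {header!r})")
        # Drop the trailing ':' and collect the taxa listed under the header.
        groups.setdefault(header[:-1], []).extend(lines[1:])
    return groups

sample = """1990:
B.US.90.5_
B.US.90.2_
B.US.90.3_

1981:
B.US.81.7_
B.US.81.5_
B.US.81.2_
B.US.81.6_

Discard:
B.US.81.1_
"""

groups = parse_groups(sample)
# Taxa under 'Discard:' are set aside and take no part in the calculation.
discarded = groups.pop("Discard", [])
```

After this runs, `groups` holds the two time point groups keyed by "1990" and "1981", and `discarded` holds the taxa to be left out.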
You can remove sequences, or groups of sequences, from the analysis without removing them from the Newick tree. Taxa to be discarded can be included in the file of grouped taxon names (as shown above), or submitted as a separate file in the "Discard taxa" input box:
Discard:
B.US.90.1_
B.US.81.7_

or just taxon names:
If your sequences are named with 2-digit years (for example, B.US.08.sequence_name), select this option to specify that this year value is the actual time distance between samples.
For example, if left unchecked, the timepoint values 99, 00, 01 would be placed in numerical order as 00, 01, 99.
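The ordering problem can be reproduced in a short Python sketch. The pivot rule used here (two-digit values of 50 or more fall in the 1900s, the rest in the 2000s) is an assumption chosen for illustration; this page does not state how the tool itself expands two-digit years.

```python
def expand_two_digit_year(yy, pivot=50):
    """Map a 2-digit year to 4 digits.

    Assumption for illustration: values >= pivot belong to the 1900s,
    smaller values to the 2000s.
    """
    return 1900 + yy if yy >= pivot else 2000 + yy

labels = ["99", "00", "01"]

# Plain numerical ordering, as when the option is left unchecked.
numeric_order = sorted(labels, key=int)

# Ordering by the year each label stands for.
year_order = sorted(labels, key=lambda s: expand_two_digit_year(int(s)))
```

Here `numeric_order` places 99 last, while `year_order` keeps the chronological sequence 99, 00, 01.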
Select this option to calculate an evolutionary rate. You must define the numerical time distance between your grouped timepoints, unless this information is encoded in your timepoint names.
For example, if your sampling timepoints were named A, B, C, you need to provide a data file that defines the actual time distance numerically. If your timepoints were named 5.0, 6.6, 8.2, and if these numbers correspond to the actual number of years (or months or days), then you do NOT need to provide any additional input here. See sample file.
Enter the length of the alignment that was used to generate the tree. This number is required to generate the approximate confidence interval of Δd.
If checked, this option will remove discarded taxa from your tree (and from the resulting treefile that you can pass to other tools). If unchecked, discarded taxa will not be used in the analysis, but will still appear in your tree.
Based on the user input, the tool roots the input tree in all possible ways. For each rooting point, the tool estimates the average distance from the root to the Timepoint 1 taxa (x1) and the average distance from the root to the Timepoint 2 taxa (x2). The difference between these two averages (x2 - x1) gives a Δd value for each rooting point. There can be two or more timepoints (defined by the user's groupings). The tool then calculates, for each rooting point, the sum of the variances of the taxa in the different timepoints. The Δd from the rooting point with the lowest sum of variances gives the best estimate of the evolutionary rate for the chosen time points in the tree.
Figure. Illustration of average distance (x) and difference (Δd) values for 2 time points.
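The scoring of a single rooting point can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the root-to-tip distances are made-up numbers, the names `score_rooting` and `candidate_roots` are hypothetical, Δd is taken between consecutive time points, and the population variance is used because the page does not say which variance estimator the tool applies.

```python
from statistics import mean, pvariance

def score_rooting(distances):
    """Score one rooting point.

    distances: {time point: [root-to-tip distances of its taxa]}
    Returns (list of delta-d values, sum of variances).
    """
    timepoints = sorted(distances)
    # Average root-to-tip distance per time point (x1, x2, ...).
    means = [mean(distances[t]) for t in timepoints]
    # Delta-d between consecutive time points (x2 - x1, ...).
    delta_d = [later - earlier for earlier, later in zip(means, means[1:])]
    # Assumption: population variance within each time point, summed.
    sum_var = sum(pvariance(distances[t]) for t in timepoints)
    return delta_d, sum_var

# Made-up distances for two candidate rooting points.
candidate_roots = {
    "root_A": {1981: [0.02, 0.04], 1990: [0.07, 0.09]},
    "root_B": {1981: [0.01, 0.07], 1990: [0.05, 0.11]},
}

scores = {name: score_rooting(d) for name, d in candidate_roots.items()}
# The preferred rooting point has the lowest sum of variances.
best = min(scores, key=lambda name: scores[name][1])
```

In this toy input the distances under `root_A` are more homogeneous within each time point, so it is the rooting point selected.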
To calculate the evolutionary rate, the tool computes an average time for each group. The differences between the average times give Δt values. The evolutionary rate for each Δt is calculated by dividing Δd by Δt, and is presented as substitutions per site per unit time (in whatever units were used in the dates input file). The evolutionary rate is calculated for every rooting point of the input tree; the best estimate of the evolutionary rate is the one from the rooting point with the lowest sum of variances.
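The rate calculation for one rooting point can be illustrated with a short Python sketch. The function name `evolutionary_rates`, the input layout, and the numbers are assumptions made for the example; Δt is taken between consecutive groups ordered by their average time.

```python
from statistics import mean

def evolutionary_rates(mean_distance, sample_times):
    """Rate = delta-d / delta-t between consecutive groups.

    mean_distance: {group: average root-to-tip distance}
    sample_times:  {group: [sampling time of each taxon]}
    """
    # Average sampling time per group.
    avg_time = {g: mean(ts) for g, ts in sample_times.items()}
    order = sorted(avg_time, key=avg_time.get)
    rates = []
    for earlier, later in zip(order, order[1:]):
        delta_d = mean_distance[later] - mean_distance[earlier]
        delta_t = avg_time[later] - avg_time[earlier]
        rates.append(delta_d / delta_t)
    return rates

# Made-up values: a delta-d of 0.05 substitutions per site over 9 years.
rates = evolutionary_rates(
    {"1981": 0.03, "1990": 0.08},
    {"1981": [1981, 1981, 1981], "1990": [1990, 1990]},
)
```

With times given in years, `rates[0]` is in substitutions per site per year; the same code applies unchanged to months or days.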