HIV sequence database



ElimDupes

Duplicate Sequence Removal

Purpose: Given an alignment or set of unaligned nucleotide or protein sequences, this tool compares the sequences and eliminates any duplicates or very similar sequences, thus producing a set of unique sequences.
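As a rough illustration of the purpose described above, the following Python sketch collapses exact duplicates in FASTA-formatted input and records how many times each unique sequence was seen. It is not the ElimDupes implementation; the function names `parse_fasta` and `unique_sequences` are hypothetical, and FASTA input is assumed.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def unique_sequences(records):
    """Keep the first occurrence of each sequence.

    Returns a list of (name, sequence, count) tuples in input order,
    where count is the number of identical sequences that were merged.
    """
    groups = {}  # sequence -> [first name seen, count]
    for name, seq in records:
        if seq in groups:
            groups[seq][1] += 1
        else:
            groups[seq] = [name, 1]
    return [(name, seq, count) for seq, (name, count) in groups.items()]
```

Only exact string matches are merged here; near-duplicates and subsequences need the extra comparisons controlled by the options on the form.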

Note: This program works on an alignment. If your sequences are NOT aligned, please uncheck the box at the bottom of the Input block. The added alignment step slows the program down dramatically.

For more details, see ElimDupes Explanation.

Input
Paste your sequences here
or upload your file
Uncheck if your sequences are not aligned (this will take much longer)

Options
Remove extraneous characters from sequences: Yes / No
Make all letters uppercase: Yes / No
Consider subsequences as duplicates: Yes / No
Restore original sequences in output: Yes / No
Eliminate sequences more similar than %
To analyze input by groups enter number of leading digits
Create a file of unique sequences with _count added to the sequence name: Yes / No
Use .rank_count format: Yes / No
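To make the options above more concrete, here is a minimal Python sketch of how several of them could interact: cleaning and uppercasing, treating subsequences as duplicates, a percent-similarity cutoff, and appending `_count` to names. This is an assumption-laden illustration, not the tool's actual algorithm. All function and parameter names are hypothetical, the definition of "extraneous characters" (anything other than letters, `-` and `*`) is a guess, similarity is taken to be simple column identity over aligned sequences of equal length, and the grouping and `.rank_count` options are not covered.

```python
import re


def clean(seq, uppercase=True, strip_extraneous=True):
    """Normalize a sequence string.

    Assumption: anything other than letters, '-' and '*' is extraneous.
    """
    if strip_extraneous:
        seq = re.sub(r"[^A-Za-z\-*]", "", seq)
    if uppercase:
        seq = seq.upper()
    return seq


def percent_identity(a, b):
    """Percent of identical columns between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 100.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def is_duplicate(seq, kept, subsequences=False, threshold=None):
    """Decide whether seq duplicates an already kept sequence."""
    if seq == kept:
        return True
    # Subsequence check ignores gap characters.
    if subsequences and seq.replace("-", "") in kept.replace("-", ""):
        return True
    # "More similar than N%" is read as strictly greater than N.
    if threshold is not None and len(seq) == len(kept):
        return percent_identity(seq, kept) > threshold
    return False


def elim_dupes(records, subsequences=False, threshold=None, add_count=True):
    """Return unique (name, sequence) pairs from (name, sequence) records.

    The first sequence seen represents its group, so the result
    depends on input order.
    """
    kept = []  # each entry is [name, cleaned sequence, count]
    for name, seq in records:
        seq = clean(seq)
        for entry in kept:
            if is_duplicate(seq, entry[1], subsequences, threshold):
                entry[2] += 1
                break
        else:
            kept.append([name, seq, 1])
    return [(f"{n}_{c}" if add_count else n, s) for n, s, c in kept]
```

Because the first sequence seen represents its group, a short sequence that appears before a longer one containing it is not merged in this sketch; sorting the input by length first would be one way to avoid that.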
last modified: Tue Sep 30 10:07 2014


Questions or comments? Contact us at seq-info@lanl.gov.

 
Operated by Los Alamos National Security, LLC, for the U.S. Department of Energy's National Nuclear Security Administration
Copyright © 2005-2012 LANS LLC All rights reserved | Disclaimer/Privacy
