HIV molecular immunology database
The HLA binding motif scanner finds HLA anchor residue motifs within protein sequences for specified HLA serotypes, genotypes or supertypes. The potential epitopes are included in the output. Two major motif libraries were used:
The motifs presented are linked to their sources, and you can choose which one to use for scanning the sequences. We also regularly search the literature for new motifs that are not listed in these two major sources. What we find is presented as an additional source. You can also use your own custom motif, composed from the information we present and from your own data.
The supermotifs and supertypes classification is taken from
Supermotifs indicate the residues defining supertype specificities. The supermotifs incorporate residues that are recognized by multiple alleles within the supertype.
This tool searches for anchor motifs only. If you want additional information on auxiliary amino acids, please look at the original motif libraries. However, you can still use our tool with the auxiliary amino acids if you compose your own custom motif using the information from these sources.
x-[VTILF]-x-x-x-x-x-x-[YF(ML)]. This means that the second and C-terminal positions are anchor positions. The dominant amino acids at the second position are V, T, I, L and F, and at the C-terminal anchor position the dominant amino acids are Y and F, while M and L are preferred but not dominant. Note that by default, unless you specify your own motif, we search on all anchor position amino acids, both dominant and preferred but not dominant, so the information on which amino acids are less dominant is presented for your information only. If you want to search on the dominant amino acids only, you can compose your own motif using the information we present. If you have any questions about how it was decided which amino acids are dominant and which are not, please address them to the authors who published these motifs.
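The notation above can be illustrated with a minimal Python sketch. This is not the scanner's own code: the function names `motif_to_regex` and `scan`, the `dominant_only` flag, and the choice to treat both `x` and `X` as "any residue" are assumptions made for illustration only.

```python
import re

def motif_to_regex(motif, dominant_only=False):
    """Translate anchor-motif notation into a regular expression (illustrative).

    x         -> any residue (assumption: upper- and lowercase x both mean "any")
    [ABC]     -> any one of A, B, C
    [AB(CD)]  -> A, B dominant; C, D preferred but not dominant
    Dashes between positions are treated as optional separators.
    """
    parts = []
    # Each position is either a bracketed set or a single letter.
    for token in re.findall(r"\[[^\]]*\]|[A-Za-z]", motif):
        if token in ("x", "X"):
            parts.append("[A-Z]")
        elif token.startswith("["):
            body = token[1:-1]
            if dominant_only:
                # Drop the parenthesised (preferred but not dominant) residues.
                body = re.sub(r"\([^)]*\)", "", body)
            residues = re.sub(r"[()]", "", body).upper()
            parts.append("[" + residues + "]")
        else:
            parts.append(token.upper())
    return "".join(parts)

def scan(sequence, motif, dominant_only=False):
    """Return (1-based start, matched peptide) for every match, overlaps included."""
    # A lookahead lets the regex engine report overlapping matches.
    pattern = re.compile("(?=(" + motif_to_regex(motif, dominant_only) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]

motif = "x-[VTILF]-x-x-x-x-x-x-[YF(ML)]"
print(scan("AVAAAAAAL", motif))                      # L is accepted by default
print(scan("AVAAAAAAL", motif, dominant_only=True))  # L is rejected here
```

With the default settings the C-terminal L matches, because all anchor residues are searched. Passing `dominant_only=True` mimics composing a custom motif from the dominant residues only.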
In a custom motif, enter arbitrary residues with an x. You may optionally use a dash (-) to separate the residues.
The sequences consist of the amino acid codes ACDEFGHIKLMNPQRSTVWYBZX and the gap code -. All other characters are removed and ignored.
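The filtering rule can be sketched in a few lines of Python. This is a hypothetical illustration, not the tool's implementation: the name `clean_sequence`, the `keep_gaps` switch, and the upper-casing of input before filtering are all assumptions.

```python
# Accepted amino acid codes plus the gap code, as listed above.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBZX-")

def clean_sequence(raw, keep_gaps=True):
    """Drop every character that is not an accepted code (illustrative only)."""
    # Assumption: lowercase input is upper-cased rather than discarded.
    kept = [c for c in raw.upper() if c in ALLOWED]
    if not keep_gaps:
        # For unaligned input, gaps carry no meaning and can be dropped.
        kept = [c for c in kept if c != "-"]
    return "".join(kept)

print(clean_sequence("me-n r1w*"))
print(clean_sequence("me-n r1w*", keep_gaps=False))
```

Keeping gaps is only meaningful when the input is an alignment, so the sketch makes that an explicit option.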
Gaps are ignored unless the input sequences form an alignment. Two sequence formats are accepted, FASTA and Table, and examples of these formats are shown below. For more information about sequence formats, see Common Sequence Formats.
FASTA format:

>sequence_a
MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPR
ISSEVHIPLGDARLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDP
ELADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLAL
AALITPKKIKPPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH
>sequence_b
MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQHIYETY
GDTWAGVEAIIRILQQLLFIHFRIGCRHSRIGVTRQRRARNGASRS
>sequence_c
MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRK
KRRQRRRAHQNSQTHQASLSKQPTSQPRGDPTGPKExKKKVERETETDPF
D

Table format:

sequence_a MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHYESPHPR
sequence_b MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQHTY
sequence_c MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKD

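For readers who want to prepare input programmatically, here is a rough Python sketch of reading both formats. It assumes that FASTA input begins with a ">" header line and that Table input has one "name sequence" pair per line. The function name `parse_sequences` is hypothetical, and the tool's own parser may behave differently.

```python
def parse_sequences(text):
    """Parse FASTA or Table input into a {name: sequence} dict (illustrative)."""
    records = {}
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if lines and lines[0].startswith(">"):
        # FASTA: a ">" header line, followed by one or more sequence lines.
        name = None
        for ln in lines:
            if ln.startswith(">"):
                name = ln[1:].strip()
                records[name] = ""
            else:
                records[name] += "".join(ln.split())
    else:
        # Table: the name comes first, then whitespace, then the sequence.
        for ln in lines:
            parts = ln.split(None, 1)
            records[parts[0]] = "".join(parts[1].split()) if len(parts) > 1 else ""
    return records

fasta = ">sequence_a\nMENRWQ\nVMIVWQ\n>sequence_b\nMEQAPE\n"
table = "sequence_a MENRWQVMIVWQ\nsequence_b MEQAPE\n"
print(parse_sequences(fasta) == parse_sequences(table))
```

Both inputs describe the same two sequences, so the two parsed dictionaries compare equal.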
The result of the program is presented in several ways. First, the motifs corresponding to the input HLA type are presented. Then, you choose which motifs to scan against, choose motif length, load your sequences or choose predefined sequences, and scan these sequences for the respective motifs.
The final output is organized by search pattern: all motifs with identical search patterns are grouped together. The matching binding motifs are presented on the input sequences in two colors: C-terminal anchor amino acids are shown in magenta and anchor amino acids in the other positions are shown in cyan. If a given amino acid is matched by more than one motif, it is highlighted as a C-terminal anchor amino acid if any of the motifs matches it at the C-terminal anchor. All anchor amino acids are shown in uppercase and non-anchors in lowercase. Following the sequences is a list of potential epitopes showing their positions in the input sequences.
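The display rules can be sketched as follows, assuming the match positions are already known. The function `mark_anchors` and its arguments (0-based match starts and anchor offsets within the motif) are hypothetical names for illustration, not part of the tool.

```python
def mark_anchors(sequence, hit_starts, anchor_offsets):
    """Uppercase anchor residues and assign a highlight color to each.

    hit_starts      0-based start of every motif match in the sequence
    anchor_offsets  0-based anchor offsets within the motif; the largest
                    offset is taken to be the C-terminal anchor
    """
    c_offset = max(anchor_offsets)
    chars = list(sequence.lower())   # non-anchors stay lowercase
    colors = {}
    for start in hit_starts:
        for off in anchor_offsets:
            pos = start + off
            chars[pos] = chars[pos].upper()
            if off == c_offset:
                colors[pos] = "magenta"         # C-terminal anchor always wins
            else:
                colors.setdefault(pos, "cyan")  # never overrides magenta
    return "".join(chars), colors

# Two overlapping 9-mer matches: residue 8 is the C-terminal anchor of the
# first match and the second-position anchor of the other.
display, colors = mark_anchors("A" * 16, hit_starts=[0, 7], anchor_offsets=[1, 8])
print(display)
print(colors)
```

The shared residue ends up magenta regardless of the order in which the matches are processed, which mirrors the precedence rule described above.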
You can also view and download the resulting sequences in FASTA format, where the anchor amino acids are presented in uppercase and all the remaining ones in lowercase. The potential epitopes can also be downloaded in CSV (comma-separated value) format, which can be read into a spreadsheet. This output is convenient for further analysis.

Last modified: Thu Jun 9 09:04:55 MDT 2005