Recent Developments

Recoding of IPD Ethnicities

The IPD-IMGT/HLA and IPD-KIR Sequence Databases have long maintained information on the source material each allele is derived from. This metadata includes a description of the ethnicity, race or ancestry of the individual. The descriptive terms used were originally developed over 20 years ago, and whilst every effort was made to ensure the terms listed were acceptable, we realise that society has changed and the descriptions used have become outdated. For this reason, all terminology used in the IPD project has been reviewed and updated to use the Human Ancestry Ontology published in Genome Biology (https://rdcu.be/c6ZVu) and available at https://www.ebi.ac.uk/ols/ontologies/hancestro. From Release 3.52.0 of the IPD-IMGT/HLA Database onwards, all descriptions will align with the Hancestro ontology. Previous designations will only be available from the release archives held within our GitHub repositories. The IPD-KIR Database will be updated shortly after IPD-IMGT/HLA, prior to Release 2.13 of that database.

CE Marking, MHRA and MDR compliance

The matching algorithms available on the IPD websites are classified as decision support software. They provide reference information to enable a healthcare professional (HCP) to make a clinical decision, where the HCP ultimately relies on their own knowledge and expertise to decide on the treatment option for their patient. The algorithms filter the data for the HCP to review; the raw data will always need to be reviewed, and any reported results are calculated using algorithms implemented from peer-reviewed published papers. The results could be reproduced manually, but the tools allow for a faster analysis and for the analysis of multiple patient and donor combinations at the same time. For this reason, this software is not considered a medical device under the United Kingdom Medical Devices Regulations 2002 (UK MDR 2002).

Other FAQs

What is an HLA Allele?

Within the HLA field we refer to sequences as alleles. Each allele is defined by the combination of SNPs seen in its sequence. Each allele is a unique sequence that differs by at least one change from every other HLA allele. This change could be one or more substitutions, insertions or deletions. Each sequence is given a unique allele ID, e.g. HLA00001, and a unique HLA name, e.g. HLA-A*01:01:01:01.
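As an illustration of how such a name is put together, the following minimal Python sketch splits an allele name into its gene, its colon-separated fields and an optional suffix letter. The helper name parse_allele_name and the regular expression are assumptions made for this example and are not part of any IPD tool. The pattern assumes an optional "HLA-" prefix, between one and four numeric fields and at most one trailing capital letter.

```python
import re

# Hypothetical helper for illustration only; not an official IPD parser.
# Assumes: optional "HLA-" prefix, gene name, "*", one to four numeric
# fields separated by ":", and an optional single-letter suffix.
ALLELE_NAME = re.compile(
    r"^(?:HLA-)?(?P<gene>[A-Z0-9]+)\*"
    r"(?P<fields>\d+(?::\d+){0,3})"
    r"(?P<suffix>[A-Z]?)$"
)


def parse_allele_name(name):
    """Split an allele name into gene, list of fields and optional suffix."""
    match = ALLELE_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognisable allele name: {name!r}")
    return {
        "gene": match.group("gene"),
        "fields": match.group("fields").split(":"),
        "suffix": match.group("suffix") or None,
    }


parsed = parse_allele_name("HLA-A*01:01:01:01")
print(parsed["gene"], parsed["fields"], parsed["suffix"])
```

The sketch only checks the shape of the name. It does not check whether the name has actually been assigned in the database.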

Why are there more HLA Alleles in the hla_nuc.fasta than the hla_gen.fasta?

The hla_nuc.fasta file contains more alleles than the hla_gen.fasta file because we do not have complete sequence for all alleles. For an allele to be included in the hla_gen.fasta file we require a single contiguous sequence from the start codon to the stop codon, including all intervening exons and introns.
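To see which alleles are affected, the two files can be compared by identifier. The sketch below is a minimal example in Python and rests on one assumption: that the first whitespace-delimited token after ">" on each header line identifies the allele in both files. The function names and the sample records are invented for illustration and are not real database entries.

```python
def fasta_ids(lines):
    """Collect the first token of every '>' header line."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def missing_from_gen(nuc_lines, gen_lines):
    """Identifiers present in the nuc records but absent from the gen records."""
    return sorted(fasta_ids(nuc_lines) - fasta_ids(gen_lines))


# Invented placeholder records; real headers may carry more information.
nuc_sample = [">ID1 alleleX", "ATGGCC", ">ID2 alleleY", "ATGGCA"]
gen_sample = [">ID1 alleleX", "ATGGCCGTAAGT"]
print(missing_from_gen(nuc_sample, gen_sample))
```

With real data the same functions accept open file handles, for example open("hla_nuc.fasta") and open("hla_gen.fasta"), since both are iterables of lines.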

What is the difference between the nuc, cds and gen files?

Within the database the terms 'cDNA', 'CDS' and 'nuc' refer to the same sequences. These are exon only and do not contain any intronic sequence, unless a splice site mutation means that some of the intronic sequence is now considered part of the exon. The 'gen' files contain the exon, intron and, where available, UTR sequence of an allele.

Where is the DQB1 exon 5 sequence?

Exon 5 of HLA-DQB1 is variably expressed. Usually it is spliced out and therefore not present in the mRNA or CDS due to polymorphisms in the acceptor splice site at the end of intron 4. With the exception of HLA-DQB1*05:03:01 and HLA-DQB1*06:01:01 (and a few others), which have a mutation in this splice site, exon 5 is not expressed in HLA-DQB1. This means it will appear in the genomic files for all alleles, but in the protein and CDS files we use the intron 4 data to determine whether or not a given allele will have exon 5 sequence.

I just submitted a sequence, when will I get an official name?

You can expect to receive an official allele name within 4-6 weeks of submission, provided there have been no issues or missing data when processing the submission. If the submission requires further information or other issues are encountered during processing, we will be in touch by email to resolve the situation. We welcome queries regarding submissions; however, please be aware that we cannot speed up the assignment of an official allele name, as these are worked on during specific time frames every month.

Why is there incomplete sequence for so many alleles in the database?

The IPD-IMGT/HLA Database has a minimum requirement of exons 2+3 for class I or exon 2 for class II to have been sequenced and submitted in order to receive an official allele name. This is because these exons encode the extremely polymorphic antigen recognition site (ARS). When the database was first established in 1998, molecular methods for whole gene sequencing were difficult and expensive, making alleles with complete sequences rare. Although this has changed greatly in recent years, there are still many challenges in sequencing HLA genes, particularly full-length class II. If you have data for an incomplete allele sequence, we encourage you to submit this data to us to improve the overall coverage of the database.
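The minimum requirement described above can be written down as a simple check. This is a sketch only: the names REQUIRED_EXONS and meets_minimum are hypothetical, and the function looks at exon coverage alone, not at any other condition a submission may need to satisfy.

```python
# Minimum exon coverage as stated above: exons 2 and 3 for class I,
# exon 2 for class II. Names are hypothetical, for illustration only.
REQUIRED_EXONS = {"I": {2, 3}, "II": {2}}


def meets_minimum(hla_class, sequenced_exons):
    """Return True if the sequenced exons cover the minimum for the class."""
    if hla_class not in REQUIRED_EXONS:
        raise ValueError(f"unknown HLA class: {hla_class!r}")
    return REQUIRED_EXONS[hla_class].issubset(sequenced_exons)


print(meets_minimum("I", {2, 3, 4}))   # class I with exons 2, 3 and 4
print(meets_minimum("I", {2}))         # class I with exon 2 only
print(meets_minimum("II", {2}))        # class II with exon 2 only
```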

I need a donor, can you help?

The IPD-IMGT/HLA Database often receives emails from patients or members of their family, asking if we can provide or search for a donor. Unfortunately, we are not a register of unrelated bone marrow (stem cell) donors, and so cannot provide a suitably matched donor for transplants. Details of how a search for an unrelated stem cell donor is carried out are given on the website of Bone Marrow Donors Worldwide (BMDW). See http://www.bmdw.org/index.php?id=for_patients

What does this typing mean?

The IPD-IMGT/HLA Database often receives emails from patients, members of their family and prospective donors, asking if we can explain typing results or an HLA profile they have received. The format of the results will vary depending on the techniques and the centre used. In general, the results will list the genes tested and then, for each gene, the alleles (the different variants) at that gene, either as a numerical string or in a coded format. Please be aware that we cannot give medical advice on any typings sent to us. Our reply, where appropriate, will be restricted to explaining which part of the code represents the gene and which part represents the allele variants.

What does the version number mean?

The database version number, for example IPD-IMGT/HLA 3.56.0 2024-04 followed by a commit reference in parentheses, can be interpreted as:

  • Database Name
  • Major release number (nomenclature version, quarterly release, sequence version)
  • Date
  • Latest commit for ANHIG/IMGTHLA/Latest branch

The major release number contains three key fields. The first is the nomenclature version, which is currently 3. The second is the quarterly release number, which is incremented by 1 with each release every January, April, July and October. The third represents the sequence version. A '0' is used for the primary quarterly release, and the number is only incremented if a subsequent interim patch or update contains a change to a valid base (not a * or a .) in either the nucleotide (both cDNA and gDNA) or protein sequence. Changes to the positioning of indels or unsequenced bases are not included if the raw sequence remains unchanged.
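The three fields can be separated programmatically. The following minimal Python sketch shows one way to do this; the class name ReleaseVersion, the function names and the field names are assumptions chosen for this example and do not correspond to an official IPD API.

```python
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    """The three fields of a release number such as 3.56.0 (names assumed)."""
    nomenclature: int
    quarterly_release: int
    sequence_version: int


def parse_release(version):
    """Split a dotted release number into its three integer fields."""
    parts = version.strip().split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected three numeric fields, got {version!r}")
    return ReleaseVersion(*(int(part) for part in parts))


def is_primary_quarterly_release(release):
    """A sequence version of 0 marks the primary quarterly release."""
    return release.sequence_version == 0


release = parse_release("3.56.0")
print(release)
print(is_primary_quarterly_release(release))
```

Because ReleaseVersion is a tuple, two releases can be compared directly, so parse_release("3.56.1") sorts after parse_release("3.56.0").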

Is there an HGVS style description for an HLA allele?

Yes, the allele report tool can be used to access an HGVS style description for each allele. This can be found by viewing the allele report and then following the 'View HGVS Report' link. Please see our page about the use of IPD-IMGT/HLA in Genomics Analysis for further information.

Do you have the dbSNP or RS ID for a particular SNP or combination for a particular allele?

In short, no. The IPD-IMGT/HLA Database acts as a repository for the allele sequences, and not the SNPs. Within the HLA field, each variant is considered together with all other SNPs in a sequence to form the allele. For this reason, less emphasis is placed upon the individual SNPs and the need to catalogue these. It is the combination of SNPs that we focus on and not the individual entries. Please see our page about the use of IPD-IMGT/HLA in Genomics Analysis for further information.

What does the 'Q' mean?

A list of all suffixes used in allele names can be found here.

What is the frequency of a certain allele in the following populations?

The IPD-IMGT/HLA Database does not store allele frequency data. We recommend the following sources of data:

  • The HLA FactsBook - Steven Marsh, Peter Parham, Linda Barber available from Academic Press
  • www.allelefrequencies.net - describes the frequencies of alleles in different populations, at various polymorphic regions within Histocompatibility and Immunogenetics. Data may be searched by country, region or ethnic origin.

How should I cite the database?

For citations please use:

Reporting problems with a particular tool

If you are having trouble with a specific tool, the alignment tool for instance, please first see if the solution is listed in the help pages provided. If this does not solve the problem, contact IPD-IMGT/HLA Support. The e-mail should list the name of the tool and all the parameters used. For instance, if an alignment failed to work then you should include the values used for the Locus, Specific Sequences, Type, Reference, Display, Formatting and "Cut & Paste". This better enables us to replicate and diagnose the problem.

I can't find the sequence of HLA-A?

We often get enquiries asking for a single sequence for a locus, for example the sequence of HLA-A. There is no single sequence that represents HLA-A; this locus currently has over three thousand alleles.

I want the sequences in this particular format?

Please check the FTP directory before asking for sequences in a particular format. The FTP directory contains copies of the database as FASTA, PIR and MSF formatted files. It also contains EMBL-like flat files and copies of the text versions of the alignments.

If you are unable to access our FTP directory, we also make this data available on our GitHub account here: https://github.com/ANHIG/IMGTHLA.

I am having trouble with the sequencing software from a particular company?

We cannot advise or support external software. If you are experiencing software trouble when sequencing HLA alleles please contact the support line for that particular make of software.