Advances in high-throughput next-generation sequencing (NGS) have made sequencing faster and more affordable. Because NGS can be applied at the gene panel, exome and genome levels, it is a versatile and robust option for clinical testing. However, the method comes with challenges. The large amount of data produced by NGS is not always straightforward to analyze, and proper analysis and interpretation of clinically significant variants is key to an accurate clinical diagnosis. In this blog letter, we explain the standards and guidelines adopted by genomize-Seq for sequence variant interpretation.

genomize-Seq employs a six-tier variant classification system using the terms ‘pathogenic’, ‘likely pathogenic’, ‘uncertain significance’, ‘likely benign’, ‘benign’ and ‘pathogenic with family segregation data’. While the first five terms are already used by a majority of clinics and databases, the last term was coined by genomize and is explained further in this letter. A variant should be classified based on how it affects the normal function of the gene or the protein. The explanations for the variant classifications are listed in Table 1.

 

 

Table 1 – The explanations for variant classification

When assigning a pathogenicity category to a sequence variant, genomize-Seq uses the standards recommended by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (1). These standards make it possible to integrate and weigh all available information on a variant and to produce a consensus pathogenicity value. The evidence types used for variant classification include the frequency of the variant in population databases such as 1000Genomes, ESP6500 and ExAC; functional and experimental data on the variant from the literature; computational predictions from tools such as SIFT, PolyPhen and MutationTaster; the presence of the variant in clinically relevant databases such as ClinVar; and segregation data. However, these evidence types do not all carry the same weight. For example, functional data is more informative about the pathogenicity of a variant than computational data. Also, while a variant observed at a high frequency in healthy populations can be directly classified as benign, a low frequency does not by itself indicate a pathogenic variant. The classification therefore has to weigh the evidence types. We assign an evidence code to each piece of evidence and determine the final pathogenicity by combining all of these evidence codes. The evidence codes are listed in Table 2.
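The asymmetry between high and low population frequency described above can be illustrated with a minimal Python sketch. The 5% cutoff for BA1 follows the ACMG guideline; the `PM2_THRESHOLD` value, the function name `frequency_evidence` and the example frequencies are assumptions made for illustration and are not taken from genomize-Seq.

```python
from typing import Mapping, Optional

# BA1 cutoff as stated in the ACMG guideline (allele frequency above 5%).
BA1_THRESHOLD = 0.05
# Hypothetical cutoff for "very low frequency"; an assumption for this sketch.
PM2_THRESHOLD = 0.001


def frequency_evidence(freqs: Mapping[str, Optional[float]]) -> Optional[str]:
    """Return 'BA1', 'PM2' or None from per-database allele frequencies.

    A value of None means the variant is absent from that database.
    """
    observed = [f if f is not None else 0.0 for f in freqs.values()]
    if not observed:
        return None
    if max(observed) > BA1_THRESHOLD:
        return "BA1"  # high frequency: very strong benign evidence
    if max(observed) < PM2_THRESHOLD:
        return "PM2"  # low frequency: only moderate pathogenic evidence
    return None       # intermediate frequency: no frequency-based code


# Illustrative frequencies only, not real database values.
print(frequency_evidence({"1000Genomes": 0.21, "ESP6500": 0.18, "ExAC": 0.25}))
print(frequency_evidence({"1000Genomes": None, "ESP6500": 0.0002, "ExAC": 0.0001}))
```

Note that a high frequency yields the strongest benign code, while a low frequency yields only a moderate pathogenic code that has to be combined with other evidence.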

Table 2 – Criteria and evidence codes for classifying variants

Evidence codes starting with PVS, PS, PM and PP indicate a pathogenic variant, while codes starting with BA, BS and BP indicate a benign one. The weighing was already done by ACMG when it determined which evidence code applies to which kind of data: PVS and BA represent very strong evidence, PS and BS mean strong evidence, PM is moderate evidence, and PP and BP signify supporting evidence. The weight of the codes descends in the order very strong, strong, moderate and supporting. The rules for combining evidence codes are also taken from the ACMG paper (1) and are shown in Table 3.

As genomize, we added an extra pathogenicity category, PF (the last category in Table 3). It applies when a missense variant has a low frequency and is predicted to be pathogenic by computational evidence (SIFT, PolyPhen, MutationTaster), or when the variant is an inframe insertion/deletion with a low frequency, and no other kind of evidence (functional data, literature support) is available. Such cases support pathogenicity but are not enough to classify the variant as pathogenic. We then categorize the variant as PF (Pathogenic with Family Segregation Data) and recommend that the user check family segregation data. This category prevents missense and inframe variants without any prior functional or clinical data from being directly categorized as variants of unknown significance (VUS), and marks them as variants in need of deeper analysis.

PF variants can be considered pathogenic if family segregation analysis shows that the variant is present in affected family members and absent in healthy ones (segregation: ACMG evidence code PP1), or that the variant is de novo in the affected individual, that is, present in the affected child but not in the parents (ACMG evidence codes PS2 or PM6). While ACMG does not suggest a concrete metric for segregation, it encourages users to question the pathogenicity of the variant further depending on the extent of segregation observed in the family.
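To make the combination step concrete, here is a minimal Python sketch of the combining rules as published in the ACMG paper (1). It is not the genomize-Seq implementation: the function names `count_by_strength` and `classify` are hypothetical, and the genomize-specific PF category is left out because its exact encoding in evidence codes is not specified here.

```python
import re
from collections import Counter


def count_by_strength(codes):
    """Count evidence codes by prefix, e.g. 'PS3' -> 'PS', 'BA1' -> 'BA'."""
    return Counter(re.match(r"[A-Z]+", code).group() for code in codes)


def classify(codes):
    """Combine evidence codes following the ACMG combining rules."""
    n = count_by_strength(codes)
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory evidence (both sides meet their criteria) stays uncertain.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "uncertain significance"
    if pathogenic:
        return "pathogenic"
    if likely_pathogenic:
        return "likely pathogenic"
    if benign:
        return "benign"
    if likely_benign:
        return "likely benign"
    return "uncertain significance"


# The evidence code sets of the two example cases discussed in this letter.
print(classify(["BA1", "BP4", "BP6"]))                # benign
print(classify(["PM2", "BP4", "PP5", "PS3", "PS4"]))  # pathogenic
```

In the second call, the single supporting benign code BP4 does not meet any benign criterion on its own, so it does not conflict with the two strong pathogenic codes.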

Two Example Cases
Let’s determine the pathogenicity class of the MEFV variant R202Q (rsid rs224222) shown in Figure 1. The variant is observed at very high allele frequencies in almost all populations (1000Genomes, ESP6500 and ExAC); this indicates a benign variant and corresponds to the evidence code BA1. All computational methods (SIFT, PolyPhen and MutationTaster) predict the variant to be harmless, which adds the evidence code BP4. The ClinVar entries report benign and likely benign, which corresponds to the evidence code BP6. At this point, the variant rs224222 has three evidence codes: BA1, BP4 and BP6. Applying the rules defined in Table 3, we conclude that this variant is benign.

Figure 1 – An example case for pathogenicity classification

Figure 2 – Another example case for pathogenicity classification

Let’s look at another MEFV variant, V726A (rsid rs28940579), shown in Figure 2. This variant has very low allele frequencies in all populations and therefore gets a PM2. The computational methods give contradicting predictions: two benign scores from SIFT and PolyPhen, and one deleterious score from MutationTaster. In such cases, the result supported by the majority of the methods is assigned, so the variant gets a BP4 from computational data. All clinical data from ClinVar report the variant as pathogenic, which brings the evidence code PP5. At this point there is not enough evidence to classify the variant, so we search the literature for functional or clinical information. The papers by the French FMF Consortium (2) and the International FMF Consortium (3) report that the prevalence of the variant in affected individuals is significantly increased compared with its prevalence in healthy controls. This indicates a pathogenic variant and corresponds to the evidence code PS4. Another functional study (4) supports a damaging effect, which corresponds to the evidence code PS3. We now have PM2, BP4, PP5, PS3 and PS4 as evidence codes, and genomize-Seq classifies this variant as pathogenic.
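The majority rule applied above to the conflicting computational predictions can be sketched as follows. The function name `computational_evidence`, the 'benign'/'deleterious' labels and the handling of ties are assumptions for illustration; PP3 is the ACMG code for computational evidence supporting a deleterious effect, the counterpart of BP4.

```python
from collections import Counter


def computational_evidence(predictions):
    """Map per-tool predictions ('benign' or 'deleterious') to an evidence code.

    The call supported by the majority of tools wins. A tie yields no code,
    which is an assumption of this sketch.
    """
    votes = Counter(predictions.values())
    if votes["benign"] > votes["deleterious"]:
        return "BP4"  # majority predicts no damaging effect
    if votes["deleterious"] > votes["benign"]:
        return "PP3"  # majority predicts a damaging effect
    return None


# Predictions for V726A as described in the example above.
print(computational_evidence(
    {"SIFT": "benign", "PolyPhen": "benign", "MutationTaster": "deleterious"}
))  # BP4
```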
References
• (1) Sue Richards, Nazneen Aziz, Sherri Bale, David Bick, Soma Das, Julie Gastier-Foster, Wayne W. Grody, Madhuri Hegde, Elaine Lyon, Elaine Spector, Karl Voelkerding, and Heidi L. Rehm. “Standards and Guidelines for the Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.” Genetics in Medicine (2015): 405-23.
• (2) The French FMF Consortium. “A Candidate Gene for Familial Mediterranean Fever.” Nature Genetics: 25-31.
• (3) The International FMF Consortium. “Ancient Missense Mutations in a New Member of the RoRet Gene Family Are Likely to Cause Familial Mediterranean Fever.” Cell: 797-807.
• (4) Yalçinkaya F, Akar N, Misirlioglu M. “Familial Mediterranean fever–amyloidosis and the Val726Ala mutation.” N Engl J Med. 1998 Apr 2;338(14):993-4.