Sorting Out Non-Synonymous Single Nucleotide Polymorphism Leads to Novel Biomarker Discovery for Disease Prognosis


Rationale
Extensive effort over the past years has gone into revealing how genetic changes give rise to the molecular effects that cause diseases and phenotypes [1,2]. These efforts have driven the growth of databases, web resources, and tools for prioritizing candidate single nucleotide polymorphisms (SNPs). These resources and online tools are organized according to genomic context and annotations. Until now, most of the focus has been on human genome annotations, although some resources provide insight into SNP data from model organisms such as mouse, fruit fly, or chimpanzee [3]. Typically, SNP data are used as markers in linkage or population-based association studies. However, identifying the so-called functional variants poses a number of challenges, such as the position of the target SNP, the expression of functional products (RNA, protein), and experimental validation. Therefore, a systematic in silico approach is needed to reduce the burden of scrutinizing the large number of SNPs available for a target disease. The approach might concentrate on three areas: identification of candidate genes that may harbor causal variants, selection of candidate causal SNPs, and a focus on non-synonymous SNPs (nsSNPs) (Figure 1). We suggest targeting nsSNPs, as these polymorphisms are the most likely to affect the function of the target gene product. In this regard, bioinformatics tools can play a pivotal role in identifying specific disease-related nsSNPs through structural and functional assessment.
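The three-step narrowing described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the SNP records, the gene names, and the reading of the second step as "keep coding SNPs only" are assumptions made for illustration. The only fixed ingredient is the standard genetic code, which is used to decide whether a codon change alters the encoded amino acid.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(ref_codon, alt_codon):
    """Return 'synonymous' if both codons encode the same residue."""
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "non-synonymous"


def prioritize(snps, candidate_genes):
    """Apply the three filters in order and return the surviving SNP ids."""
    kept = []
    for snp in snps:
        # Step 1: keep SNPs that fall in a candidate gene.
        if snp["gene"] not in candidate_genes:
            continue
        # Step 2 (assumed interpretation): keep SNPs with codon information.
        if snp["ref_codon"] is None:
            continue
        # Step 3: keep only non-synonymous changes.
        if classify(snp["ref_codon"], snp["alt_codon"]) == "non-synonymous":
            kept.append(snp["id"])
    return kept


# Hypothetical records for illustration only.
SNPS = [
    {"id": "snp_a", "gene": "GENE1", "ref_codon": "GAG", "alt_codon": "GTG"},
    {"id": "snp_b", "gene": "GENE1", "ref_codon": "CTT", "alt_codon": "CTC"},
    {"id": "snp_c", "gene": "GENE2", "ref_codon": "CGA", "alt_codon": "TGA"},
    {"id": "snp_d", "gene": "GENE1", "ref_codon": None, "alt_codon": None},
]

print(prioritize(SNPS, {"GENE1"}))  # ['snp_a']
```

Only snp_a survives: snp_b is a Leu-to-Leu change, snp_c lies outside the candidate gene set, and snp_d carries no codon information. In practice each step would draw on curated annotations rather than hand-written records.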

Bioinformatics Tools and Resources for nsSNP Discovery and Analysis
Generally, SNPs are discovered and prioritized by sequencing. Sequence-based SNP discovery identifies variable sites among the sequences, assesses the error frequency across the selected sequences, separates out paralogous sequences, and then determines genotypes. In silico approaches also play an important role in SNP discovery and scrutiny. These methods flag genes that contain SNPs and let researchers retrieve SNP data by gene of interest, genetic or physical map location, or expression pattern.
Polymorphism data are available from several databases, such as the NCBI dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/), the Ensembl genome browser (www.ensembl.org/), and the UniProt database (www.uniprot.org). NCBI dbSNP is the most extensive of these SNP databases, but it contains both validated and non-validated polymorphisms.
There are now many databases that provide access to SNP or disease mutation data. Many genotype-phenotype databases are available as well including the Human Gene Mutation Database

Abstract
Hereditary genetic variation, which is considered to be caused primarily by single nucleotide polymorphisms (SNPs), is a significant obstacle to developing universal therapies against diseases. Among these, non-synonymous SNPs (nsSNPs) can be especially harmful because of their effect on the structure and function of the final gene product. Therefore, the study of functional nsSNPs would provide insight into the exact cause underlying the onset of genetic variation and into possible approaches for the cure or early management of disease. Various in silico tools could be employed to screen deleterious nsSNPs and map them onto the protein structure to predict structure-function effects. Further, upon experimental verification, these nsSNPs would be ideal candidates for disease risk assessment. Positive linkage studies would support novel biomarker discovery for the prognosis of specific diseases.
In addition, some tools and software packages can provide meaningful insight into target polymorphisms (Table 1). These tools can be used to prioritize SNPs based on their predicted effects on function and stability. They can also predict SNPs with the potential to cause structural changes through amino acid substitution, as well as the resulting functional abnormalities (Figure 1, Table 1). In this scenario, nsSNPs would be the best choice for study, as they could show the most deleterious effects of polymorphism [5].
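One simple way to combine the output of several prediction tools is a consensus vote, sketched below in Python. This is an illustrative assumption, not a method taken from the tools in Table 1: the tool names, the SNP ids, and the boolean "deleterious" calls are hypothetical, and real tools report scores on different scales that would first need to be thresholded.

```python
def consensus_rank(predictions, min_votes=2):
    """Rank SNPs by how many tools call them deleterious.

    predictions maps snp_id -> {tool_name: bool}, where True means the
    (hypothetical) tool predicts a deleterious effect.
    Returns (snp_id, votes, tools_consulted) tuples, most votes first.
    """
    ranked = []
    for snp_id, calls in predictions.items():
        votes = sum(calls.values())
        if votes >= min_votes:
            ranked.append((snp_id, votes, len(calls)))
    # Sort by descending vote count, then by id for a stable order.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


# Hypothetical calls from three unnamed tools.
PREDICTIONS = {
    "snp_a": {"tool_A": True, "tool_B": True, "tool_C": True},
    "snp_b": {"tool_A": True, "tool_B": False, "tool_C": False},
    "snp_c": {"tool_A": True, "tool_B": True, "tool_C": False},
}

for snp_id, votes, total in consensus_rank(PREDICTIONS):
    print(f"{snp_id}: {votes}/{total} tools predict a deleterious effect")
```

With min_votes=2, snp_a and snp_c are retained and snp_b is dropped. A majority vote is only one possible design; weighting tools by their reported accuracy would be another.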

Table 1 (surviving entry). SNP3D: annotations of structure, systems biology, evolution and alternative splicing [15].

After identifying candidate non-synonymous polymorphisms, researchers can take these nsSNPs forward to experimental validation for novel biomarker discovery. Various methods may be considered for the experimental validation of an nsSNP in a specific disease (Figure 2) [4]. If verification and statistical analysis show that the selected nsSNP occurs frequently in a specific region of the target gene, it could be evaluated as a biomarker for disease risk assessment. Furthermore, a linkage or population-based association study should also be performed before declaring an nsSNP a biomarker.
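As an illustration of the population-based association step, the following Python sketch computes an allelic odds ratio and a Pearson chi-square test (1 degree of freedom, no continuity correction) from a 2x2 table of allele counts in cases and controls. The counts are invented for the example, and the function name is hypothetical. A real study would also need to address genotype-level models, multiple testing, and population stratification.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 allele-count table."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0 or b * c == 0:
        raise ValueError("table has an empty margin or a zero cell")
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi_square = n * (a * d - b * c) ** 2 / margins
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, chi_square, p_value


# Invented allele counts: variant allele seen 60/200 times in cases
# and 30/200 times in controls.
odds_ratio, chi_square, p_value = allelic_association(60, 140, 30, 170)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi_square:.2f}, p = {p_value:.2g}")
```

For these invented counts the odds ratio is about 2.43 and the chi-square statistic about 12.90, so the variant allele is enriched in cases. Whether that qualifies an nsSNP as a biomarker still depends on replication and experimental validation.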

nsSNPs: Future Biomarkers?
The relationship between SNPs and disease has long been established, with a wide range of human diseases resulting from different nsSNPs. The disease susceptibility and severity of illness of a particular population or individual depend on these nsSNPs. nsSNPs also contribute to an individual's drug response and drug resistance. However, establishing the association of nsSNPs with diseases is proceeding slowly because they have not been properly identified. Therefore, given the overwhelming task of characterizing the 8.2 million SNPs in the human genome [28] for disease association, bioinformatics tools combined with wet-laboratory research could be the best route to developing nsSNPs as future biomarkers.