Analysing the Functional and Structural Impact of Single Nucleotide Polymorphisms in Matrix Metalloproteinase 3 Gene: An In-silico Approach
Correspondence Address :
Dr. Shajee S Nair,
Government Medical College Manjeri Malappuram, Manjeri-676121, Kerala, India.
E-mail: drno2007@gmail.com
Introduction: Matrix Metalloproteinase 3 (MMP3) is a vital member of the MMP family, known for its wide range of substrate specificity and proteolytic activity against Extracellular Matrix (ECM) components. The role of functional polymorphisms in the MMP genes has been previously investigated in relation to cancer susceptibility, particularly breast cancer. Several Single-Nucleotide Polymorphisms (SNPs) in the MMP3 gene have been linked to a number of clinical illnesses, such as Coronary Artery Disease (CAD); however, the results were not entirely conclusive.
Aim: To identify pathogenic missense SNPs in the human MMP3 gene and analyse their effects on structure and function.
Materials and Methods: This was a record-based cross-sectional study performed using data retrieved from online resources. The analysis was conducted using a series of bioinformatics tools, and ethical clearance for the study was obtained from the institution. The online tools used to predict harmful non synonymous SNPs (nsSNPs) were Sorting Intolerant from Tolerant (SIFT), PolyPhen-2, PhD-SNP, PANTHER, PROVEAN, and SNPs and GO. Further analysis was performed using I-Mutant 2.0, MutPred2, ConSurf, and HOPE software. Together, these tools were used to filter out damaging SNPs and to predict the impact of deleterious SNPs on the structure and function of the MMP3 protein.
Results: This study predicted two potentially pathogenic SNPs (D175Y and Y116C) out of 443 missense SNPs retrieved from dbSNP, the SNP database hosted on the National Center for Biotechnology Information (NCBI) website. Further analysis revealed that these SNPs were located in highly conserved regions and were predicted to decrease protein stability.
Conclusion: In this study, two potentially pathogenic SNPs (D175Y and Y116C) were identified. Characterisation of these SNPs can help us gain a better understanding of the molecular basis of clinical conditions. The results of this study can be further validated by designing population-based studies and wet lab experiments. This will help in augmenting research and personalised medicine.
Keywords: Computational molecular biology, Gene polymorphism, Human genome, Precision medicine
The human genome is relatively large, comprising approximately six billion nucleotides. About 99.9% of the DNA sequences in the human genome are the same across individuals, with individual variations accounting for the remaining 0.1% (1). Single Nucleotide Polymorphisms (SNPs), which refer to the substitution of a single nucleotide, are the most prevalent type of genetic variation. SNPs occur throughout the genome at a rate of one in every 1,000 base pairs. SNPs are broadly categorised into two types: synonymous and non synonymous (nsSNP). Non synonymous SNPs are responsible for protein variations and amino acid substitutions in humans, while synonymous SNPs do not change the amino acid sequences (2). With the development of practical and readily accessible DNA sequencing tools, SNPs have gained substantial clinical significance over the past two decades. Regardless of their effects on the biological function of gene products, SNPs are increasingly recognised as valuable markers in mapping complex human diseases, population genetics, and evolutionary studies (3). Efficient identification of important SNPs and understanding their impact on protein function will be beneficial for genetic studies aimed at elucidating the molecular basis of various diseases. Furthermore, targeted in-vitro and in-vivo experiments can be designed to predict the effects of specific SNPs (4). In light of this, multiple studies are being conducted to predict the functional consequences of nsSNPs, particularly based on sequence information and structural attributes (5),(6),(7),(8),(9),(10).
Matrix Metalloproteinases (MMPs) are a family of more than 25 zinc-containing enzymes that break down various proteoglycans, fibronectin, interstitial collagen, and basement membrane collagen during normal physiological remodelling and repair processes in development and inflammation (11). MMP-3, also known as Stromelysin-1, is a crucial member of the MMP family that exhibits a wide range of substrate specificity. Additionally, MMP-3 plays a critical role in the remodelling of connective tissue by activating other MMPs, such as collagenase, matrilysin, and gelatinase B (12),(13). The role of MMP gene functional polymorphisms in cancer susceptibility, especially breast cancer, has been previously investigated. The 5A/6A polymorphism has been linked to patient survival, oestrogen receptor status, breast cancer risk, and tumour size based on multiple genome-wide association analyses (14). However, the association between SNPs and breast cancer risk remains inconclusive and controversial, as it is largely influenced by the ethnicity of the population and sample size (14),(15),(16). Apart from breast cancer, a meta-analysis indicates a significant correlation between gastrointestinal, colorectal, and oesophageal malignancies and the MMP-3 -1171 5A/6A variation (17).
In addition to malignancies, MMP3 gene polymorphisms have been found to be associated with atherosclerosis, owing to the critical role of MMP3 in the degradation of the Extracellular Matrix (ECM) of blood vessels. Several studies evaluating the influence of the MMP-3 promoter polymorphism (-1612 5A/6A) on CAD risk have produced contradictory results, including differences between genders (18),(19),(20). Two MMP3 polymorphisms, rs520540 A/G and rs679620 C/T, were associated with cerebral stroke risk; however, contrary results were obtained in a different population, probably due to racial differences (21).
Among these reported SNPs, distinguishing functional SNPs from neutral SNPs is a major challenge. Experimental wet lab studies aimed at identifying the effects of multiple SNPs are time-consuming, challenging, and costly. Applying computational approaches to filter out possibly detrimental SNPs is emerging as a novel strategy for understanding the molecular mechanisms of diseases (22).
This was an in-silico study utilising bioinformatics computational tools with good predictive capabilities to identify the possible deleterious SNPs in the MMP-3 gene. Several studies have been conducted to find the association between specific SNPs in the MMP3 gene and diseases (14),(15),(16),(17),(18),(19),(20),(21). However, according to the literature search, there have been no comparable in-silico studies that examined every SNP in MMP-3 from the NCBI database. This study is expected to unravel the potential impact of polymorphisms on MMP3 structure and function. In addition, it can help in designing further specific experiments to confirm the importance of these SNPs in various clinical conditions. Combined with experimental data, it can aid in the development of personalised medicine.
This online records-based cross-sectional study was conducted at Government Medical College, Manjeri, Kerala, India over a period from May to July 2024 (three months), which included data collection as well as analysis using online computational tools. IEC approval for this study was obtained from the institution (Ref. No: IEC/GMCM/133, date: 18/05/2024).
Methodology
The operational flow chart of this study is presented in (Table/Fig 1).
The detailed methodology is as follows:
1. Retrieval of SNPs in the MMP3 gene: The NCBI SNP database (dbSNP) (https://www.ncbi.nlm.nih.gov/snp/) was used to retrieve all known SNPs for MMP-3 in humans (23). Non synonymous (missense) mutations were selected from the dataset for further analysis. The UniProt database was utilised to obtain the protein’s sequence and other pertinent information.
2. Prediction of the most deleterious missense SNPs: The retrieved missense SNPs were analysed using bioinformatics-based web tools for identification of high-risk nsSNPs. Details of the tools used are provided in (Table/Fig 2). Polymorphism phenotyping v2 (PolyPhen-2) and SIFT were the initial web-based tools employed. SIFT scores range from 0 to 1; substitutions scoring below 0.05 are predicted to be deleterious, while higher scores are predicted to be tolerated (24). A PolyPhen-2 score ranging from 0.85 to 1 is considered probably damaging (25),(26). SNPs identified as potentially harmful by both of these tools were then analysed with additional bioinformatics tools for confirmation and further refinement (27). The tools used were Predictor of human Deleterious Single Nucleotide Polymorphisms (PhD-SNP) (28), PANTHER-position-specific evolutionary preservation (29), PROVEAN (30), and SNPs and GO (31). SNPs predicted to be harmful by every tool were selected for further examination.
3. Structural and functional analysis of SNPs with predicted damaging consequences:
a. Identification of impact on the structural and functional properties of MMP3: MutPred2 is a web application that integrates genetic and molecular data. It provides a ranked list of specific molecular changes with potential impacts on phenotype, along with a general pathogenicity prediction (32).
b. Prediction of change in stability of the MMP3 protein using I-Mutant 2.0 Suite: The influence of nsSNPs on protein stability was assessed using I-Mutant 2.0, which predicts the free energy change value (Delta Delta G: DDG) caused by a substitution. The DDG represents the difference between the folding free energies of the mutant and wild-type structures, so a negative value indicates reduced stability. DDG values fall into three categories: neutral (-0.5 ≤ DDG ≤ 0.5 kcal/mol), largely stabilising (DDG > 0.5 kcal/mol), or largely destabilising (DDG < -0.5 kcal/mol) (33).
c. Prediction of evolutionary conservation of MMP3 using the ConSurf server: The ConSurf web server was used to examine the evolutionary conservation of amino acids in MMP3. ConSurf determines the conservation scores of the protein’s amino acid positions by analysing the evolutionary relationships among similar sequences. Conservation grades range from 1 to 9, with 1 assigned to the most variable positions and 9 to the most conserved (34).
d. Protein structure prediction and mutant analysis using HOPE software: The web-based application Have (y)Our Protein Explained (HOPE) is used to analyse how SNPs affect the structural conformation and functionality of proteins. HOPE uses data from multiple web services and databases to generate reports on the impact of SNPs, which include text, graphs, and animations (35).
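The score cut-offs described in steps 2 and 3b can be sketched in a few lines of Python. This is only an illustration of the decision rules as they are worded in this Methodology, not the tools' own implementation: the function and constant names are hypothetical, the handling of values exactly on a boundary is an assumption, and the scores in the usage lines are made up.

```python
# Illustrative cut-offs taken from the Methodology text; the constant and
# function names are hypothetical and not part of any tool's API.
SIFT_CUTOFF = 0.05        # SIFT scores below this are treated as deleterious
POLYPHEN_CUTOFF = 0.85    # PolyPhen-2 scores from this value up to 1
DDG_BAND = 0.5            # kcal/mol; DDG within +/- this band is neutral


def _check_unit_interval(name: str, score: float) -> None:
    """Reject scores outside the 0 to 1 range both tools report."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"{name} score must lie between 0 and 1, got {score}")


def sift_deleterious(score: float) -> bool:
    """True when a SIFT score falls below the 0.05 cut-off."""
    _check_unit_interval("SIFT", score)
    return score < SIFT_CUTOFF


def polyphen_probably_damaging(score: float) -> bool:
    """True when a PolyPhen-2 score lies in the 0.85 to 1 range."""
    _check_unit_interval("PolyPhen-2", score)
    return score >= POLYPHEN_CUTOFF


def passes_first_screen(sift: float, polyphen: float) -> bool:
    """A variant is shortlisted only if both tools flag it."""
    return sift_deleterious(sift) and polyphen_probably_damaging(polyphen)


def classify_ddg(ddg: float) -> str:
    """Map an I-Mutant DDG value (kcal/mol) onto the three categories."""
    if ddg < -DDG_BAND:
        return "destabilising"
    if ddg > DDG_BAND:
        return "stabilising"
    return "neutral"


if __name__ == "__main__":
    # Made-up scores, used only to exercise the functions.
    print(passes_first_screen(sift=0.01, polyphen=0.97))
    print(classify_ddg(-1.3))
```

Keeping each rule in its own small function makes it easy to change a single cut-off if a tool's documentation recommends a different value.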
1. Data on SNPs retrieved: For the MMP3 gene, dbSNP yielded a total of 4,120 SNPs. Among these, 443 were missense, 196 were synonymous, and 2,720 were located in the intronic region. The remaining 761 SNPs belonged to other functional classes, for which the authors were unable to retrieve information from dbSNP. The functional classes in dbSNP describe the position of a polymorphism in relation to identifiable features of a specific gene transcript, and most are determined by the location of the variation relative to the exon boundaries of the transcript (23). Since the present study required only missense SNPs (nsSNPs), these 443 were selected for further in-silico analysis, and the remaining 3,677 SNPs (synonymous, intronic, and those of other or undetermined functional class) were excluded.
2. Identification of predicted deleterious SNPs: The selected missense SNPs were subjected to SIFT and PolyPhen-2 to identify possible deleterious SNPs. Out of the 443 SNPs, 14 were predicted to be damaging by both SIFT and PolyPhen-2 (Table/Fig 3) (24),(25),(26). All 14 SNPs shortlisted by these two web-based tools were further analysed using other bioinformatics tools: PhD-SNP, PANTHER, PROVEAN, and SNPs and GO (Table/Fig 4) (28),(29),(30),(31). Among the analysed SNPs, two nsSNPs, rs201427128 (D175Y) and rs373783506 (Y116C), were predicted to be damaging by all six bioinformatics tools, including SIFT and PolyPhen-2.
3. Prediction by MutPred2 on structural and functional effects (32): The selected two detrimental SNPs were analysed using MutPred2, which suggested a significant impact on the 3D structure of the MMP3 protein (Table/Fig 5).
4. Prediction of changes in protein stability by I-Mutant 2.0 (33): I-Mutant predicted the extent to which the damaging SNPs alter the stability of the MMP3 protein. Both SNPs were predicted to decrease the stability of the protein (Table/Fig 6).
5. ConSurf server results (34): Conservation analysis using the ConSurf server showed that both SNPs, D175Y and Y116C, were located in highly conserved regions and involved exposed residues (Table/Fig 7).
6. Predicted changes in the 3D structure of the MMP3 protein due to nsSNPs (35): HOPE software was used to generate results for both nsSNPs, with structural information obtained from the Protein Data Bank (PDB) database. The PDB ID used was 1SLM. In the case of D175Y, the mutant residue is larger than the wild-type residue, which is predicted to affect the hydrogen bonds and the metal-ion interaction made by the wild-type residue. The ionic interaction made by the wild-type residue is also predicted to be disturbed. For Y116C, the mutant residue is smaller, leaving an empty space in the protein core and causing a loss of hydrogen bonds in the core, which is predicted to disrupt proper folding (Table/Fig 8).
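The bookkeeping in items 1 and 2 and the size argument in item 6 can be checked with a short sketch. The counts are those reported above. The residue comparison uses free amino-acid molar mass as a crude stand-in for side-chain size, which is an assumption made here for illustration and not necessarily the measure HOPE applies; the helper names are hypothetical.

```python
import re

# Counts reported from dbSNP for MMP3 in item 1.
TOTAL_SNPS = 4120
BY_CLASS = {"missense": 443, "synonymous": 196, "intronic": 2720}

# SNPs in other functional classes, and SNPs excluded as non-missense.
other_classes = TOTAL_SNPS - sum(BY_CLASS.values())
excluded = TOTAL_SNPS - BY_CLASS["missense"]

# Successive filtering steps reported in items 1 and 2.
FUNNEL = [
    ("all dbSNP entries", TOTAL_SNPS),
    ("missense", BY_CLASS["missense"]),
    ("damaging by SIFT and PolyPhen-2", 14),
    ("damaging by all six tools", 2),
]

# Approximate molar masses (g/mol) of the free amino acids involved.
# Mass is used only as a rough proxy for residue size (assumption).
RESIDUE_MASS = {"D": 133.10, "Y": 181.19, "C": 121.16}


def size_change(variant: str) -> str:
    """Say whether the mutant residue is larger or smaller than wild type.

    `variant` uses the one-letter notation of the text, e.g. "D175Y".
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"unrecognised variant label: {variant!r}")
    wild_type, _position, mutant = match.groups()
    delta = RESIDUE_MASS[mutant] - RESIDUE_MASS[wild_type]
    if delta > 0:
        return "larger"
    if delta < 0:
        return "smaller"
    return "same size"


if __name__ == "__main__":
    print("other functional classes:", other_classes)
    print("excluded (non-missense):", excluded)
    for label, count in FUNNEL:
        print(f"{label:<34}{count:>6}")
    for variant in ("D175Y", "Y116C"):
        print(variant, "mutant residue is", size_change(variant))
```

The two derived counts reproduce the 761 and 3,677 quoted in item 1, and the mass comparison agrees in direction with the HOPE description of a larger residue for D175Y and a smaller one for Y116C.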
The MMP-3 gene, which codes for the MMP-3 protein, is a component of the MMP gene cluster located on chromosome 11q22.3. The NCBI gene ID for MMP3 is 4314, and its UniProt ID is P08254 (36),(37). The expression of the MMP3 gene may be regulated by DNA polymorphisms, which can exert allele-specific effects, resulting in fundamental variations in each person’s susceptibility to diseases and other traits (11).
The goal of the current study was to use several computational techniques to discover SNPs in the MMP3 gene that could be harmful and alter the structure or function of the MMP3 protein. A significant number of researchers have utilised in-silico tools to predict the structural and functional impact of SNPs on proteins encoded by genes such as human CTLA4 (38), IL-33 (39), and the androgen receptor (27). This work represents the first comprehensive and systematic in-silico study of functional SNPs in the MMP3 gene.
Combining multiple computational methods can enhance the validity of predictions regarding damaging SNPs and improve efficiency. By employing six prediction methods (SIFT, PolyPhen-2, PROVEAN, PANTHER, PhD-SNP, and SNPs and GO), the authors were able to obtain an integrated view of harmful SNPs in the MMP3 gene. Since each method relies on a different set of parameters, only SNPs predicted to be harmful by every algorithm were regarded as high-risk; two nsSNPs (D175Y and Y116C) met this criterion in the present study. Additional analysis revealed that these two nsSNPs may be deleterious because they are located in highly conserved regions and have the potential to affect protein stability (6),(7),(8),(9),(22),(27),(38),(39).
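The consensus rule described in this paragraph amounts to a set intersection across the six tools. The following is a minimal sketch of that idea; the function name is hypothetical and the variant labels in the toy input (varA to varD) are invented placeholders, not the study's data.

```python
from typing import Dict, Set

# The six predictors named in the text.
TOOLS = ("SIFT", "PolyPhen-2", "PROVEAN", "PANTHER", "PhD-SNP", "SNPs and GO")


def consensus_deleterious(calls: Dict[str, Set[str]]) -> Set[str]:
    """Return the variants that every tool flags as deleterious.

    `calls` maps each tool name to the set of variant labels that the
    tool predicted to be damaging.
    """
    missing = [tool for tool in TOOLS if tool not in calls]
    if missing:
        raise ValueError(f"no predictions supplied for: {', '.join(missing)}")
    return set.intersection(*(calls[tool] for tool in TOOLS))


# Invented toy input, for illustration only.
toy_calls = {
    "SIFT": {"varA", "varB", "varC"},
    "PolyPhen-2": {"varA", "varB", "varC"},
    "PROVEAN": {"varA", "varB"},
    "PANTHER": {"varA", "varB", "varC"},
    "PhD-SNP": {"varA", "varB"},
    "SNPs and GO": {"varA", "varB", "varD"},
}

high_risk = sorted(consensus_deleterious(toy_calls))
print(high_risk)  # ['varA', 'varB']
```

Requiring agreement from every tool is deliberately strict: it lowers the chance of false positives at the cost of possibly discarding variants that only some predictors flag.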
As MMP3 is an important member of the MMP family, several experimental studies have examined genetic variation in MMP3, especially in malignancies (11),(13),(14),(15),(16),(17),(18),(21). Wang L et al., conducted an observational study and meta-analysis of MMP3 gene polymorphism in the Chinese population. The researchers concluded that while MMP-3 gene polymorphism raises the risk of ovarian cancer in the southern Chinese population, their meta-analysis showed that it had no effect on risk in other populations (40). Durmanova V et al., demonstrated that the MMP3 SNP rs3025058 influences the age of onset of Alzheimer’s disease (41).
The nsSNPs predicted in the present study have not been linked to any clinical disease in published studies. Therefore, these nsSNPs need to be validated through larger population-based studies and specifically designed wet lab experiments.
Limitation(s)
The present study utilised only the dataset available from the NCBI dbSNP database. More nsSNPs could have been analysed if other databases had been included.
Using a variety of bioinformatics techniques, this study has identified two significant nsSNPs, D175Y and Y116C, from the extensive SNP dataset of MMP3. Based on these findings, it can be concluded that these SNPs should be regarded as significant candidates for planning functional experiments involving the MMP3 gene. This presents an opportunity to investigate the molecular basis of any associated clinical problems and may aid in the discovery of potential drugs or pharmacological targets. Although computational biology is powerful, it has its limitations; therefore, population-based research is necessary to provide definitive validation.
DOI: 10.7860/JCDR/2024/73934.20249
Date of Submission: Jul 01, 2024
Date of Peer Review: Aug 01, 2024
Date of Acceptance: Sep 25, 2024
Date of Publishing: Nov 01, 2024
AUTHOR DECLARATION:
• Financial or Other Competing Interests: None
• Was Ethics Committee Approval obtained for this study? Yes
• Was informed consent obtained from the subjects involved in the study? No
• For any images presented appropriate consent has been obtained from the subjects. No
PLAGIARISM CHECKING METHODS:
• Plagiarism X-checker: Jul 02, 2024
• Manual Googling: Aug 06, 2024
• iThenticate Software: Sep 24, 2024 (14%)
ETYMOLOGY: Author Origin
EMENDATIONS: 8