Metagenomics Comparison of Buruli and Non Buruli Ulcer Skin Wound
Correspondence Address :
Malhaur, Near Railway Station, Gomti Nagar Extension, Lucknow-226028, Uttar Pradesh, India.
Introduction: Buruli Ulcer (BU) is caused by the mycobacterium Mycobacterium ulcerans. The mycolactones secreted by Mycobacterium ulcerans result in tissue necrosis. Metagenomics is a branch of genomics that studies uncultured microbial genomes present in natural samples such as human body parts, environmental samples, food and dairy products, and disease conditions. Metagenomics has made it possible to explore and elucidate the importance of microbial genomes in healthy and infected samples.
Aim: To perform metagenomic and microbial analysis of Buruli and non Buruli ulcer skin wound samples, along with structural and functional analysis of the MUL_3720 protein.
Materials and Methods: The present analytical study was conducted from May 2021 to January 2022 at Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow, India. The European Nucleotide Archive (ENA) database was used to retrieve metagenome data of BU and non BU skin lesions under the project id PRJEB14948. The Galaxy server was used for the metagenomic analysis, from quality control to identification and classification of the microbial community in the samples. Galaxy tools including FASTQC, Trim Galore, KRAKEN2, Convert Kraken and Krona pie chart were used for metagenomic analysis and taxonomic classification of microbes. Finally, Krona pie charts were generated to give a detailed view of the different microbes and their percentages in BU. Complete annotation of the MUL_3720 protein of Mycobacterium ulcerans, including protein structure prediction and domain analysis, was also done to assess it as a potential drug target against BU. Statistical analysis was done using Krona pie chart generation and prediction.
Results: Metagenomic analysis shows that the microbiomes of BU and non BU samples differ. The differential microbes identified were Mycobacteriaceae (1-2%), Sporomusa species (18-22%) and Desulfovibrio halophilus (24-25%). Bacteria present in both sample types were Actiniidae, Desulfovibrio halophilus, Sphingomonas and Mycobacteriaceae. Structural and functional annotation of the MUL_3720 protein of Mycobacterium ulcerans shows that it can be a potential drug target for drug discovery.
Conclusion: This study highlights the metagenome of the Buruli Ulcer skin wound and can be used to identify potential drug targets for Buruli Ulcer. Metagenomic analysis of BU and non BU skin wounds shows a difference in microbial community; this information can therefore be used for proper diagnosis and medication to combat Buruli Ulcer disease.
Keywords: European nucleotide archive database, Galaxy server, Krona pie chart, Microbiome
Buruli Ulcer (BU) is a deadly skin disease that can even lead to distortion of the limbs (1). It is caused by the mycobacterium Mycobacterium ulcerans, which secretes mycolactones, toxins that are also produced by other pathogenic bacteria and that result in necrosis of the surrounding tissues and skin lesions (2). Buruli ulcer is one of the most common mycobacterial skin infections, and it occurs most frequently in African countries (3). The main parts of the body affected by this disease are the skin and bones (4). If treated improperly, the disease can have serious consequences, including permanent or long-term deformity and disability.
The exact mode of transmission of BU in the human population is still not known and is of considerable research interest (5). Some hypotheses state that the causative bacterium, Mycobacterium ulcerans, is transmitted to humans through aquatic bodies (6). This is essentially based on the spread of the disease in regions that have many stagnant water bodies. Mycobacterium ulcerans generally lives independently or in association with other organisms of the aquatic environment, such as mosses, animal faeces, insects and crayfish (7).
One of the earliest symptoms of the disease is the appearance of a swollen, painful bump (8). Another form is a hard, elevated area of skin known as a plaque, which spreads widely beneath the skin (9). After weeks of infection, the skin is raised abruptly and subsequently sloughs off, leaving a painful ulcer. In some cases the ulcer heals on its own, but sometimes it decreases in size and remains as an open wound for years (10). In other cases, with larger ulcers, the infection may penetrate deep into the underlying tissue, causing serious complications such as infection of the bone and involvement of the muscles (11).
Medical treatment, as directed by the World Health Organisation (WHO), consists of oral rifampin with intramuscular streptomycin, or oral rifampin with clarithromycin, for about two months (12). The WHO has classified BU into the following three categories:
• Category I- A single ulcer that spreads across an area of less than 5 cm.
• Category II- Ulcers with broader swollen areas and a larger size of up to 15 cm (13).
• Category III- Ulcers larger than 15 cm, or ulcers that have spread to highly sensitive areas such as the eyes, genitals or joints (14).
The drug combination advised by the WHO is rifampicin and clarithromycin, administered once and twice daily, respectively (15). Other drugs that can also be effective include levofloxacin, moxifloxacin and amikacin (16). Surgery is also a major part of the treatment process, as it can hasten recovery by removing the necrotic tissue and thereby stop the infection from spreading to the surrounding areas (17).
Prevention is better than cure, and this deadly disease can be prevented by avoiding close contact with water bodies (18). However, such contact is unavoidable for people living in these areas (19). Wearing full-length or long-sleeved clothes and using mosquito repellents can be effective preventive measures. To date, no vaccine has been designated, but the Bacille Calmette-Guérin (BCG) vaccine has been found to be partly effective against this disease (20). The specific mode of transmission of BU is still under research, but transmission mostly takes place in regions around stagnant or slow-moving waterways. Cases have been seen to rise in the rainy season (21). Mosquito bites have been proposed as a mode of transmission on the basis of epidemiological investigations conducted during an Australian outbreak (22).
Metagenomics is compiled from two words, “Meta” and “Genomics”. It refers to the study and examination of the genetic material of samples obtained from the environment. Metagenomics has tremendous capability to reveal hidden or undiscovered microorganisms (23). It is defined as the study of the genetic material of natural samples collected from environments such as soil, water, plants, food and organs (24). Metagenomics can reveal the variety and hidden potential of microscopic organisms, and so revolutionise the understanding of the living world (25). It has enabled researchers to study and identify novel microbes that are present in different conditions. Metagenomics holds great promise in revealing microbes, their genes, and the pathways that function in different environmental, biological or other conditions of interest (26). For decades, bioinformatics has been used to reveal the hidden functions of genes and proteins, up to the genome and proteome level (27). MUL_3720, an M. ulcerans protein, is a promising target for antigen capture-based detection methods. The power of computational biology and sophisticated software now enables researchers to analyse high-throughput data generated by Next Generation Sequencing (NGS) (28). This capability of bioinformatics has also been applied in metagenomics to identify the microbes present in metagenome sequence data. With the availability of bioinformatics tools and software, metagenomic studies hold great potential for identifying the microbial community in specific microbiomes (29). Hence, the present study aimed to perform metagenomic analysis of BU samples and identify the microbiome in BU. Structural and functional analysis of MUL_3720, which can be used as a drug target, was also done.
The present analytical study was conducted from May 2021 to January 2022 at Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow, India.
Retrieval of the metagenomic data: The European Nucleotide Archive (ENA) database (https://www.ebi.ac.uk/ena/browser/) was used to download the metagenome data of BU. The raw data were retrieved from the ENA database under the project id PRJEB14948 (https://www.ebi.ac.uk/ena/browser/view/PRJEB14948?show=reads). Complete experimental information and the data generation protocol can be accessed through the project id PRJEB14948 and the link provided above. The data contain six samples from BU skin wound and seven samples from non BU skin wound. In this project, Illumina MiSeq paired-end sequencing was performed to obtain the metagenome of both types of samples. A total of four files were selected for the current research, and two conditions were defined, namely:
1: Non Buruli Ulcer skin wound
2: Buruli Ulcer skin wound
Details of the sample files used are shown in (Table/Fig 1), along with their accession ID, sample type and condition.
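The run files of a project can also be listed programmatically rather than through the browser. The following is a minimal Python sketch, assuming the ENA Portal API file-report endpoint with the `read_run` result type; the helper names and the placeholder file names in the sample text are hypothetical and were not part of the workflow described here.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(project_id, fields=("run_accession", "fastq_ftp")):
    """Build a file-report URL for one project accession (hypothetical helper)."""
    query = urlencode({
        "accession": project_id,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Map each run accession to its list of FASTQ locations.

    Paired-end runs are assumed to list both files in one
    semicolon-separated field.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["run_accession"]: [p for p in row["fastq_ftp"].split(";") if p]
        for row in reader
    }


# Offline illustration: the file names are placeholders, not real ENA paths.
SAMPLE_TSV = (
    "run_accession\tfastq_ftp\n"
    "ERR1551321\tPLACEHOLDER_1.fastq.gz;PLACEHOLDER_2.fastq.gz\n"
)

if __name__ == "__main__":
    print(filereport_url("PRJEB14948"))
    print(parse_filereport(SAMPLE_TSV))
```

The text returned by the URL can be passed to `parse_filereport` to obtain the files for each run before they are uploaded to Galaxy.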
Metagenomics tools: The metagenomics tools used for this study were accessed from the Galaxy Server (https://metagenomics.usegalaxy.eu/). Different tools were used for data preprocessing, quality control analysis, taxonomic classification, and visualisation of the metagenome (30). (Table/Fig 2) lists all the tools used for metagenomic analysis of the data sets, along with their functions (31),(32),(33),(34),(35).
Metagenomics Analysis Protocol
i) Quality control: This is the procedure used to check and manage data quality. It is an important step for obtaining high-quality output data in the results. It is carried out to ensure the precision, fulfilment, relevance, legitimacy, practicality and consistency of the information (36). To perform quality control analysis of the uploaded FASTQ files, the authors first searched for the FASTQC tool in the Galaxy tool panel. FASTQC was run on all sample files. (Table/Fig 3) shows the steps and tools used for this study.
ii) Data preprocessing: This is an important step that filters adapter sequences and contaminants from the FASTQ files. It was performed using the Trim Galore tool, which was searched for in the Galaxy tool panel. The FASTQC output file was used as the input file for Trim Galore. The result of Trim Galore appeared in the history column of Galaxy (37).
iii) Taxonomic classification: Kraken 2 is the latest version of the Kraken tool. It is a tool for taxonomic classification that uses exact k-mer matches against a microbial database to obtain accurate and fast classification. The KRAKEN 2 tool was searched for in the Galaxy tool panel. The Trim Galore output files were used as input for Kraken 2. The result of KRAKEN 2 appeared in the history column of Galaxy (38).
iv) Taxonomic representation: The KRAKEN 2 result file was converted into a Galaxy-readable representation format so that the data could be visualised. This step was performed using the Convert Kraken tool, which took the output of Kraken 2 as input and produced a formatted file (39).
v) Metagenomic visualisation: To obtain the combined metagenomic results in the form of a pie chart, the Krona pie chart tool was used. The results of Convert Kraken were used as input, and a pie chart was obtained for each sample (40).
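The ideas behind steps i) to iii) can be illustrated with a small Python sketch. It is a deliberately simplified toy, not the implementation used by FASTQC, Trim Galore or Kraken 2: it assumes Phred+33 quality encoding, uses an assumed illustrative cutoff of 20, trims only the low-quality 3' tail (no adapter removal), and classifies a read by a simple vote over k-mers that occur in exactly one reference sequence. All function names and the toy reference sequences are hypothetical.

```python
from collections import Counter


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line):
    """Mean Phred score of one read (0.0 for an empty read)."""
    scores = phred_scores(quality_line)
    return sum(scores) / len(scores) if scores else 0.0


def trim_low_quality_tail(seq, qual, cutoff=20):
    """Drop trailing bases whose score is below the cutoff (simplified)."""
    scores = phred_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < cutoff:
        end -= 1
    return seq[:end], qual[:end]


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, ref in references.items():
        for km in kmers(ref, k):
            index.setdefault(km, set()).add(taxon)
    return index


def classify(read, index, k):
    """Return the taxon with the most taxon-specific k-mer hits, or None."""
    votes = Counter()
    for km in kmers(read, k):
        taxa = index.get(km, ())
        if len(taxa) == 1:  # count only k-mers unique to one taxon
            votes[next(iter(taxa))] += 1
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    # Toy references with made-up sequences, for illustration only.
    refs = {"taxon_A": "ACGTACGTGG", "taxon_B": "TTTTCCCCAA"}
    index = build_index(refs, k=4)
    seq, qual = trim_low_quality_tail("ACGTACGG", "IIIIII##")
    print(mean_quality(qual), seq, classify(seq, index, k=4))
```

Reads that share no k-mer with any reference are returned as `None`, which corresponds to the unclassified fraction in a real classification report.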
Functional annotation of MUL_3720 protein of Mycobacterium ulcerans as drug target: To investigate a target protein of Mycobacterium ulcerans further, the MUL_3720 protein was selected. The Swiss model server was used to predict the structure of the target protein. Structure prediction is important for understanding the function of a protein and for predicting its active sites and binding regions. To study the function of MUL_3720, domain prediction was done. Domain prediction is a crucial part of protein function identification, as a domain is a part of a protein that is functionally and structurally conserved (41).
To highlight the importance of the MUL_3720 protein further, docking was performed to study how efficiently the target protein interacts with a drug. Docking is a computational method used to predict the interaction of a protein with a ligand, drug or other chemical compound of interest (42). In the current study, MUL_3720 was docked with the drug rifampicin to assess its suitability as a target protein. The MUL_3720 protein sequence was retrieved from the UniProtKB database (https://www.uniprot.org/uniprot/A0PU23#sequences), entry UniProtKB-A0PU23 (A0PU23_MYCUA). All steps from functional annotation to docking analysis are described in the results section.
Statistical analysis was done using Krona pie chart generation and prediction. The data were tabulated in the form of percentages.
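The percentages reported by a pie chart of this kind are relative abundances: the number of reads assigned to a taxon divided by the total number of reads counted. A minimal Python sketch of that tabulation is given below; the function name and the placeholder labels are hypothetical, and whether unclassified reads are included in the total is a choice left to the analyst.

```python
from collections import Counter


def percent_composition(assignments):
    """Return {taxon: percentage of reads} from per-read taxon labels."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


if __name__ == "__main__":
    # Placeholder labels, not results from this study.
    reads = ["taxon_A", "taxon_A", "taxon_B", "taxon_C"]
    for taxon, pct in sorted(percent_composition(reads).items()):
        print(f"{taxon}\t{pct:.1f}%")
```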
Metagenomics analysis of BU and Non BU: Metagenomic analysis of all the sample files was done using the methodology and tools mentioned in (Table/Fig 3). All files passed Quality Control (QC), and trimming was done on all four files. After QC and data preprocessing, the analysis was carried out. The results were compiled in the form of Krona pie charts, which show the microorganisms present in all the samples. The Krona pie chart for set 1, i.e., the sample from non BU, is depicted in (Table/Fig 4), which shows the bacterial diversity and classification along with the percentage composition.
Similarly, the final results for set 2, i.e., the sample from Buruli Ulcer, are shown in (Table/Fig 5), where the Krona pie chart shows a relatively higher proportion of Sphingomonas species, about 42%.
Bacterial diversity in both conditions, non BU skin wound (ERR1551328 and ERR1551330) and BU skin wound (ERR1551321 and ERR1551322), was studied and the results were compared. (Table/Fig 6) shows the comparison between the bacteria in the two conditions. The bacteria present in both non BU and BU skin wounds are Actiniidae, Desulfovibrio halophilus, Sphingomonas and Mycobacteriaceae. The bacteria present only in the Buruli ulcer skin wound are Cyanothece species, Thermosipho melanesiensis, Streptosporangium longisporum and Massilia timonae, as seen in (Table/Fig 4), (Table/Fig 5).
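A comparison of this kind amounts to set operations on the taxa detected in each condition. The short Python sketch below illustrates this using only the taxa named in the text; the non BU set is therefore an assumption and almost certainly incomplete, since any taxa found only in non BU samples are not listed here.

```python
# Taxa named in the text as shared by both conditions.
SHARED_IN_TEXT = {
    "Actiniidae",
    "Desulfovibrio halophilus",
    "Sphingomonas",
    "Mycobacteriaceae",
}

# Taxa named in the text as present only in BU skin wound.
BU_ONLY_IN_TEXT = {
    "Cyanothece",
    "Thermosipho melanesiensis",
    "Streptosporangium longisporum",
    "Massilia timonae",
}

bu_taxa = SHARED_IN_TEXT | BU_ONLY_IN_TEXT
non_bu_taxa = set(SHARED_IN_TEXT)  # assumption: only the shared taxa are known

shared = bu_taxa & non_bu_taxa     # present in both conditions
bu_only = bu_taxa - non_bu_taxa    # differential taxa for BU

if __name__ == "__main__":
    print("Shared:", sorted(shared))
    print("BU only:", sorted(bu_only))
```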
Metagenomic analysis shows that the microbial community differs between BU and non BU skin disease. The percentages of microbes in BU were Mycobacteriaceae about 1-2%, Sporomusa species about 18-22% and Desulfovibrio halophilus about 24-25%, among others. Bacteria present in both sample types were Actiniidae, Desulfovibrio halophilus, Sphingomonas and Mycobacteriaceae. This study can be used to identify potential drug targets for BU.
Annotation of MUL_3720 protein of Mycobacterium ulcerans as drug target: Protein annotation is an important approach for studying the function of a protein. With the help of bioinformatics tools and software, protein structure and function can be predicted. The MUL_3720 protein of Mycobacterium ulcerans was annotated and its function was studied using computational methods. The MUL_3720 protein sequence was retrieved from the UniProtKB database (https://www.uniprot.org/uniprot/A0PU23#sequences). A domain is a functional part of a protein that is conserved in structure and function across a protein family and superfamily. Protein functional sites, called motifs, are also important for predicting protein function. Multiple motifs make up a domain; hence domain identification is an important parameter for identifying protein function. The function of any novel protein can be predicted with the help of domain identification (43). The UniProtKB-A0PU23 (A0PU23_MYCUA) protein sequence was used for domain identification, structure prediction and structural analysis. The MUL_3720 protein was studied extensively as a potential drug target against Mycobacterium ulcerans, the cause of BU.
Domain identification of the MUL_3720 protein was done using ScanProsite, an online tool available at (https://prosite.expasy.org/scanprosite/), and the functional regions of the protein, i.e., its domains, were identified. (Table/Fig 7) shows the domain information of the MUL_3720 protein (44),(45). Two domains, BULB_LECTIN (Bulb-type lectin domain profile) and LYSM, with profile ids PS50927 and PS51782 respectively, were predicted and their function was studied. The BULB_LECTIN domain was identified at protein sequence positions 2-109 and the LYSM domain at positions 159-206.
Protein structure prediction: The structure of the MUL_3720 protein was predicted by the homology modelling method (46). Homology modelling is a computational technique that can be used to predict the structure of a protein whose structure is unknown. Since not all proteins can be crystallised, structures are not available for all proteins. Homology modelling predicts the structure of a protein on the basis of sequence homology. If the sequence identity is more than 30%, this method can be used to predict the protein structure. Homology modelling works on the assumption that homologous sequences have similar structures (47). Homology modelling of MUL_3720 was done using the Swiss model server (https://swissmodel.expasy.org/).
The first step in homology modelling is to identify a homologous sequence, called the template. A template has sequence similarity with the MUL_3720 protein and a known structure. The template with PDB id 4oit.2.A was identified for homology modelling of the MUL_3720 protein, with a sequence identity of 77.14%. The template used for structure prediction is the structure of a LysM domain protein of Mycobacterium smegmatis. (Table/Fig 8) shows the three-dimensional structure of the MUL_3720 protein of Mycobacterium ulcerans.
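The percentage reported for a template is typically a percent identity computed over the aligned positions of the target and template sequences. The Python sketch below shows one common way of computing it from a pairwise alignment. The function name and the toy aligned sequences are hypothetical, and the denominator used here (aligned columns without gaps) is only one of several conventions, so the value will not necessarily match the figure reported by a modelling server.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy alignment with made-up residues, not the MUL_3720 alignment.
    target = "ACDE-FG"
    template = "ACDEHFA"
    identity = percent_identity(target, template)
    # 30% is the rule-of-thumb threshold quoted in the text.
    print(f"{identity:.2f}% identity; above 30%: {identity > 30.0}")
```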
Structural verification and validation are important aspects of protein structure prediction. Different methods are available for verifying a computationally predicted protein structure. Among these, Ramachandran plot analysis is well suited to verifying a protein structure (48). It plots the amino acids of the protein in four regions, as shown in (Table/Fig 9), and locates them according to their phi and psi angles. The Ramachandran plot classifies amino acids into favourable regions (dark green), left-handed alpha helix regions (light green) and disallowed regions (white) (49). Ramachandran plot analysis of the predicted structure of MUL_3720 was done for structural verification and quality assessment, as shown in (Table/Fig 9). According to the Ramachandran plot, 96.08% of residues lie in favoured regions and 0.0% in outlier regions. Structural analysis shows that the predicted structure is stable and can be used further for binding site prediction and analysis.
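The phi and psi values placed on a Ramachandran plot are backbone dihedral angles: phi is defined by the atoms C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). A minimal pure-Python sketch of the dihedral calculation for four points is given below. The helper names and test coordinates are hypothetical, and validation servers apply the same geometry to every residue before assigning regions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates: the outer points are a quarter turn apart.
    print(dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                           (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```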
Docking of MUL_3720 with rifampicin: Docking is a widely used computational method for predicting the interaction between proteins and ligands or drugs. It can be used to study binding efficiency, whether an interaction is stable or unstable, and binding energy, and hence the stability of the interaction between any two molecules of interest. Docking can be used to predict protein-drug, protein-protein and DNA-protein interactions, among others (50). In the current research, docking was used to study the suitability of the MUL_3720 protein as a potential drug target. Hence, the MUL_3720 protein was docked with a known drug, rifampicin, to study the binding efficiency of the target protein. Binding site prediction and analysis of the interaction of the predicted protein structure with the drug were done using docking (51). Rifampicin, a well-known and established drug for Buruli Ulcer, was used for docking. The rifampicin drug file and structural information were retrieved from the PubChem database (CID 135398735, chemical formula C43H58N4O12), as shown in (Table/Fig 10).
The interaction of the MUL_3720 protein with rifampicin was studied by docking using the CB-Dock server, a Cavity-detection guided Blind Docking server (52). (Table/Fig 11) shows the interaction of rifampicin with the target protein. The docking result shows that the MUL_3720 protein binds rifampicin with a Vina score of -7.1 and a cavity size of 215.
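A Vina score is an estimate of binding free energy, and a more negative value indicates a more favourable predicted interaction. Under the assumptions that the score is expressed in kcal/mol and can be treated as a standard binding free energy, it can be converted into a rough dissociation constant using ΔG = RT ln(Kd). The Python sketch below illustrates this; the function name and the temperature of 298.15 K are assumptions, and the result is an order-of-magnitude guide only, not a measured affinity.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)


def estimated_kd_molar(delta_g_kcal_per_mol, temperature_k=298.15):
    """Rough dissociation constant (M) from a binding free energy estimate."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


if __name__ == "__main__":
    vina_score = -7.1  # value reported in the text; units assumed to be kcal/mol
    kd = estimated_kd_molar(vina_score)
    print(f"Estimated Kd: {kd:.2e} M")
```

For the score reported here, the estimate falls in the low micromolar range.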
Structural and functional annotation of the MUL_3720 protein of Mycobacterium ulcerans shows that it can be a potential drug target for drug discovery. Domain analysis shows that the MUL_3720 protein has a function in BU, and structural and docking analysis with rifampicin shows that this protein has binding efficiency towards the drug.
Metagenomic analysis has become an important tool for understanding the microbial community in different biomes such as soil, water, plants, animals, and organs or organ systems. The metagenomic approach has been used to study microbial diversity and the percentages of microbes in different disease conditions such as gut cancer, skin cancer, liver infection, and mouth or oral cancer. Metagenomics tools have been used to identify infections in cardiovascular disease (53), oral health conditions, infectious diseases (54), gut viromes, colorectal cancer (55) and autoimmune diseases (56). In the current research, metagenome analysis was done to predict the microbial community in BU and non BU. Research has been done on BU skin wounds to study the proteins or genes involved in BU skin (57). With the advent of metagenomics, it is now possible to explore the presence of microbes in such conditions, which can be used further for medicinal or clinical purposes. In the current research, metagenomic analysis of non BU and BU skin wounds was done using metagenome data from metagenome databases. Complete and exhaustive analysis of the BU metagenome shows that the microbiomes of the two skin wound conditions differ. This study highlights the important microorganisms that are present only in the Buruli ulcer skin wound. The microbiome of the BU skin wound showed the presence of Cyanothece, Thermosipho melanesiensis, Streptosporangium longisporum and Massilia timonae. The percentages of microbes in BU were Mycobacteriaceae about 1-2%, Sporomusa species about 18-22%, and Desulfovibrio halophilus about 24-25%. Bacteria present in both sample types were Actiniidae, Desulfovibrio halophilus, Sphingomonas and Mycobacteriaceae. This study can be used to identify potential drug targets for BU. Further, complete annotation of the MUL_3720 protein of Mycobacterium ulcerans was done to study its suitability as a potential drug target. Structural and functional analysis shows that the MUL_3720 protein can be a potential drug target and can be used for drug discovery.
The present study was based entirely on a computational approach, using well-established and verified protocols. This research gives insights into the metagenome of BU and non BU skin wounds and highlights the bacteria present in both conditions. However, clinical and microbiological investigation is required to investigate the microbial community in BU further.
Metagenomic analysis of BU and non BU skin wounds shows that the microbial community differs between the two conditions. More information is still required to combat this disease, as deeper knowledge about deadly diseases like BU is lacking. Proper medication is the need of the hour. Hence, more research and investigation are required into the metagenome and protein function of BU. The development of control and prevention strategies also requires equal attention, and provides great opportunities to leverage new diagnostic methods. Mycolactone, the toxin secreted by Mycobacterium ulcerans, could be used in various research fields if investigated thoroughly. Hence, BU holds considerable scope for research, not only in the field of medication but also in understanding the cause and mechanism of action of these bacteria and their possible use in various sectors of an ever-changing world.
The authors would like to acknowledge Amity Institute of Biotechnology, Amity University, Uttar Pradesh, Lucknow campus for providing the facilities for conducting the study.
Date of Submission: Apr 26, 2022
Date of Peer Review: Jul 01, 2022
Date of Acceptance: Aug 29, 2022
Date of Publishing: Nov 01, 2022
• Financial or Other Competing Interests: None
• Was Ethics Committee Approval obtained for this study? No
• Was informed consent obtained from the subjects involved in the study? No
• For any images presented appropriate consent has been obtained from the subjects. No
PLAGIARISM CHECKING METHODS:
• Plagiarism X-checker: May 02, 2022
• Manual Googling: Aug 22, 2022
• iThenticate Software: Aug 27, 2022 (5%)
ETYMOLOGY: Author Origin