Bioinformatics Tools in Clinical Microbiology and Infectious Disease Prevention Algorithms

Bioinformatics resource is the exploitation of genome sequence data for diagnostic, therapeutic


Introduction
Bioinformatics tools and techniques for analyzing Next-Generation Sequencing (NGS) data are increasingly used for the diagnosis and monitoring of infectious diseases [1][2][3]. This review covers the application of bioinformatics tools, commonly used databases and NGS data in clinical microbiology, focusing on molecular identification, genotyping, microbiome research, antimicrobial resistance analysis and detection of unknown disease-associated pathogens in clinical specimens. Furthermore, bioinformatics tools are extensively used in the identification, characterization, and typing of all kinds of pathogens [4,5].
This follows the widespread use of genomic approaches in the diagnosis and management of viral, bacterial, and fungal infections [6,7]. Bioinformatics has been applied to pathogen identification, detection of virulence factors, resistance analysis, and strain typing. NGS can determine the DNA sequence of a complete bacterial genome in a single sequencing run; from these data, information on resistance and virulence, as well as typing information useful for outbreak investigation, is obtained. The genome data can further be used to develop an outbreak-specific screening test. In this review, a general introduction to NGS is presented, including library preparation and the major characteristics of the most common NGS platforms. Although NGS holds enormous promise for clinical infectious disease testing, many challenges remain, including automation, standardizing technical protocols and bioinformatics pipelines, improving reference databases, establishing proficiency testing and quality control measures, and reducing cost and turnaround time, all of which would be necessary for widespread adoption of NGS in clinical microbiology laboratories [8][9][10].
Furthermore, applications of NGS in the clinical setting are described, such as outbreak management, molecular case finding, characterization and surveillance of pathogens, rapid identification of bacteria using taxonomy, metagenomics approaches on clinical samples, and the determination of the transmission of zoonotic micro-organisms from animals to humans [11][12][13]. Finally, we share our vision on the use of NGS in personalized microbiology in the near future, pointing out specific requirements. Recently, repeated transmission of animal viruses to humans has prompted investigation of the viral, host, and environmental factors responsible for transmission via aerosols or respiratory droplets [14]. This raises a question: out of the thousands of virus isolates collected in animal surveillance studies each year, how do we determine which viruses have the potential to become airborne and hence pose a pandemic threat [15]?
In this study, using knowledge from pandemic, zoonotic and epidemic viruses, we postulate that the minimal requirements for efficient transmission of an animal virus between humans are efficient virus attachment to upper respiratory tissues, replication to high titers in these tissues, and release and aerosolization of single virus particles. Investigating airborne transmission of viruses is key to understanding and predicting virus pandemics [16,17]. In addition, antibiotics are essential for the treatment of bacterial infections and are among our most important drugs. Resistance has emerged to all classes of antibiotics in clinical use. Antibiotic resistance has proven inevitable, and very often it emerges rapidly after the introduction of a drug into the clinic. There is, therefore, great interest in understanding the origins, scope and evolution of antibiotic resistance [18][19][20].
The review discusses the concept of the antibiotic resistome, which is the collection of all genes that directly or indirectly contribute to antibiotic resistance. The concept of the antibiotic resistome provides a framework for the study and understanding of how resistance emerges and evolves [21][22][23][24]. Thus, we seek to assemble current knowledge of the resistome concept as a means of understanding the totality of resistance and not just resistance in pathogenic bacteria. Furthermore, the study of the resistome reveals strategies that can be applied in new antibiotic discoveries.

Virus Design: Route to airborne mechanism
A major challenge for virus transmission research is to elucidate the mechanisms of transmission. The focus should be on both gain-of-function and loss-of-function approaches, although the two are not equally informative. A loss-of-function experiment is like removing a crucial part from a machine: remove any one of them and it will stop running. By analogy, mutating a transmissible virus so that it no longer transmits is a largely pointless exercise that gives little to no mechanistic information; there are a thousand ways to accomplish that. Gain-of-function experiments start from the premise that only one or a few parts need adjusting, but the key is determining which part(s) out of the possible thousand they are. To investigate which viral parts need to change before a virus becomes transmissible, at least two options are available, and both should be followed. We believe that the evolutionary events leading to the pandemic viruses, together with genetic changes in the avian-origin viral genes, made these reassortant viruses airborne. Thus, based on accumulated data, we can hypothesize which viral characteristics are important for a virus to become airborne. We therefore postulate that the minimal requirements for airborne viruses are attachment to and replication in appropriate cells of the upper respiratory tract (URT), high virus yields in the URT, and virus shedding as single particles.

Transmission: Viral determinant characteristics
Human-to-human transmission of viruses can occur through direct or indirect contact and/or via aerosols and respiratory droplets. Opinions differ on the importance of each route, as data have been published in support of various routes. However, efficient aerosolization of viral particles is crucial for the transmission efficiency and pandemic potential of viruses. There is no exact particle cut-off size at which transmission changes from exclusively large respiratory droplets to aerosols, but it is generally accepted that transmission occurs through aerosols for infectious particles with a diameter smaller than 5 µm. Larger droplets do not appear to remain suspended in air and typically travel less than 1 m before settling on environmental surfaces or on the mucosa of close contacts. Smaller particles (below 5 µm), in contrast, settle less rapidly and can therefore travel much further.
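The size dependence of settling described above can be illustrated with Stokes' law, which gives the terminal velocity of a small sphere falling through still air. The following is a minimal sketch and not part of the reviewed studies: the droplet density, air density, air viscosity and the 1.5 m release height are assumed values, and the calculation ignores evaporation, air currents and the slip correction that matters for sub-micrometre particles. Stokes' law also becomes only approximate for droplets around 100 µm.

```python
# Stokes-law settling of a spherical droplet in still air.
# All physical constants below are assumed, illustrative values.
G = 9.81              # gravitational acceleration, m/s^2
RHO_DROPLET = 1000.0  # assumed droplet density (water), kg/m^3
RHO_AIR = 1.2         # assumed air density, kg/m^3
MU_AIR = 1.81e-5      # assumed dynamic viscosity of air near 20 C, Pa*s


def settling_velocity(diameter_um):
    """Terminal settling velocity in m/s for a droplet of the given diameter (µm)."""
    d = diameter_um * 1e-6  # convert micrometres to metres
    return (RHO_DROPLET - RHO_AIR) * G * d ** 2 / (18.0 * MU_AIR)


def time_to_fall(diameter_um, height_m=1.5):
    """Time in seconds to settle from an assumed release height in still air."""
    return height_m / settling_velocity(diameter_um)


for diameter in (1, 5, 100):
    print(f"{diameter:>3} µm: v = {settling_velocity(diameter):.2e} m/s, "
          f"fall time = {time_to_fall(diameter):.0f} s")
```

Because the velocity scales with the square of the diameter, a 5 µm particle settles 400 times more slowly than a 100 µm droplet under these assumptions, which is consistent with small particles staying airborne far longer.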
Humans exhale respiratory droplets of widely varying quantity and size. Coughing and sneezing have been documented to generate mostly aerosol particles with sizes under 1 µm. Several pieces of evidence support the role of aerosols in virus transmission. First, a classical study has shown that much lower doses of virus are required to cause disease in human volunteers when infected via aerosols than when infected by intranasal inoculation. Second, aerosolized influenza virus can remain infectious for prolonged periods of time, in particular at low humidity. Third, when virus aerosolization is blocked with UV treatment of upper room air, virus transmission can be abolished.

Consolidation algorithm for antibiotic resistome model
The antibiotic resistome is dynamic and ever expanding, yet its foundations were laid long before the introduction of antibiotics into clinical practice. Here, we revisit our theoretical framework for the resistome concept and consider the many factors that influence the evolution of novel resistance genes, the spread of mobile resistance elements, and the ramifications of these processes for clinical practice. Observing the trends and prevalence of genes within the antibiotic resistome is key to maintaining the efficacy of antibiotics in the clinic. Antimicrobial resistance is a major global health challenge. Metagenomics allows analyzing the presence and dynamics of "resistomes" (the ensemble of genes encoding antimicrobial resistance in a given microbiome) in disparate microbial ecosystems.
However, the low sensitivity and specificity of available metagenomic methods preclude the detection of minority populations (often present below the detection threshold) and/or the identification of allelic variants that differ in the resulting phenotype. Here, we describe a novel strategy that combines targeted metagenomics, using latest-generation in-solution capture platforms, with novel bioinformatics tools to establish a standardized framework that allows both quantitative and qualitative analyses of resistomes.
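As a rough illustration of what a quantitative resistome readout can look like, the sketch below normalizes per-gene read counts by gene length and sequencing depth (reads per kilobase per million mapped reads, RPKM) and flags genes that fall below a minimum read count. This is a generic normalization, not the capture-based framework described above. The gene labels, counts, lengths and the `min_reads` threshold are all hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def summarize_resistome(counts, lengths, total_mapped_reads, min_reads=10):
    """Return per-gene abundance and a simple detected / not-detected call.

    min_reads is an assumed detection threshold chosen for illustration only.
    """
    report = {}
    for gene, n_reads in counts.items():
        report[gene] = {
            "rpkm": rpkm(n_reads, lengths[gene], total_mapped_reads),
            "detected": n_reads >= min_reads,
        }
    return report


# Hypothetical toy data: read counts and gene lengths for three resistance genes.
counts = {"gene_A": 250, "gene_B": 4, "gene_C": 60}
lengths = {"gene_A": 900, "gene_B": 1200, "gene_C": 650}

for gene, row in summarize_resistome(counts, lengths, total_mapped_reads=2_000_000).items():
    print(f"{gene}: RPKM = {row['rpkm']:.1f}, detected = {row['detected']}")
```

In this toy data, gene_B stands in for a minority population: its read count falls below the assumed threshold, which is the situation that motivates enrichment by targeted capture.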

Bioinformatics tools: NGS and Software
Identification and characterization of micro-organisms that cause infections are crucial for successful treatment [25], recovery and safety of patients. Sequence analyses can be used to answer different diagnostic questions, such as the genetic relationship of either bacteria or viruses, the detection of mutations in viral or bacterial genomes leading to resistance against antivirals or antibiotics [26], the identification of fungi through sequence analysis of the 18S ribosomal deoxyribonucleic acid (rDNA) or the Internal Transcribed Spacer (ITS) region, and the identification of bacteria through sequence analysis of the 16S rDNA, using benchtop sequencers such as the MiSeq (Illumina) and the Ion PGM™ (ThermoFisher) [27]. This was also confirmed by the clear divergence between P. mirabilis SCDR1 and other Proteus mirabilis species on the proteomic level.

Advances in Biotechnology & Microbiology
Genome sequencing and analysis of the first spontaneous nanosilver-resistant bacterium, Proteus mirabilis strain SCDR1, using bioinformatics tools for identifying and combating antimicrobial resistance is illustrated in Figure 1. The biggest challenge concerning the introduction of NGS in the clinical microbiology laboratory is the data analysis. Nonetheless, even with little knowledge of bioinformatics, it is possible to perform NGS data analysis for diagnostic purposes using the numerous user-friendly software packages available. However, more in-depth analysis requires scientific knowledge of the genomic features and the biological background of the micro-organism under investigation (Table 1).
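The detection of resistance-associated mutations by sequence comparison, mentioned in this section, can be illustrated with a small sketch. It assumes the sample sequence has already been aligned to the reference without insertions or deletions, which real pipelines handle with dedicated aligners and variant callers. The sequences and the catalogue of resistance-associated substitutions are hypothetical toy data.

```python
def find_substitutions(reference, sample):
    """List (position, reference_base, sample_base) for every mismatch.

    Positions are 1-based. Both sequences must be pre-aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(pos, ref, alt) for pos, (ref, alt) in enumerate(pairs, start=1) if ref != alt]


def flag_known_mutations(substitutions, catalogue):
    """Keep only substitutions listed in a catalogue of known resistance mutations."""
    return [sub for sub in substitutions if sub in catalogue]


# Hypothetical toy data, not taken from any real gene.
reference = "ATGGCTAGCTAA"
sample = "ATGGTTAGCTGA"
resistance_catalogue = {(5, "C", "T")}  # assumed resistance-associated change

substitutions = find_substitutions(reference, sample)
print("all substitutions:", substitutions)
print("resistance-associated:", flag_known_mutations(substitutions, resistance_catalogue))
```

Separating the generic comparison from the catalogue lookup mirrors the point made above: listing differences is straightforward, whereas deciding which ones matter requires biological knowledge of the organism.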

Virus transmission
One virus-spread mechanism is related to inter-flat or inter-zonal airflow through open windows caused by buoyancy effects. Both on-site measurements and numerical simulations quantify the amount of exhaust air that exits the upper part of the window of one floor and re-enters the lower part of the open window of the floor immediately above. Ventilation air could contain up to 7% (in terms of mass fraction) of the exhaust air from the lower floor. The transmission route was reconstructed from epidemiological and genomic data (Figures 2-4). Each node represents a patient, and an arrow indicates a possible transmission event from one patient to another. A blue arrow with a solid line represents a direct transmission event supported by both epidemiological and genetic data, a blue arrow with a dashed line represents an indirect transmission (e.g. via the environment) supported by epidemiological data, and a red arrow indicates an equally parsimonious transmission link that cannot be resolved by either epidemiological or genetic data.

The inter-institutional transfer of the patient is shown by dashed lines, on which the distance between institutions is indicated. The red star represents an outbreak at a secondary hospital for which the isolates were unavailable for further research. Figures 3 and 4 present the mutants tested in competitive transmission studies. In addition, gene annotation analysis with the bacterial bioinformatics database and analysis resource (PATRIC) showed 4423 coding sequences (CDS), 10 rRNAs and 71 tRNAs. The unique gene count for the different observed metabolic pathways is 2585 (Figure 5).
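The reconstruction of transmission links from epidemiological and genomic data described in this section can be sketched in a simplified form. The example below labels each patient pair by the kind of evidence that supports a link: a recorded epidemiological contact, a small pairwise SNP distance between isolates, or both. It does not infer the direction of transmission or resolve equally parsimonious links. The patient identifiers, sequences, contact list and the `max_snps` threshold are hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Number of differing positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def classify_links(isolates, epi_links, max_snps=2):
    """Label patient pairs by the evidence supporting a possible transmission link.

    isolates:  dict mapping patient id to an aligned sequence (toy data)
    epi_links: set of frozensets, each a patient pair with a recorded contact
    max_snps:  assumed SNP threshold for calling two isolates genetically linked
    """
    links = {}
    for a, b in combinations(sorted(isolates), 2):
        genetic = snp_distance(isolates[a], isolates[b]) <= max_snps
        epidemiological = frozenset((a, b)) in epi_links
        if genetic and epidemiological:
            links[(a, b)] = "epidemiological + genetic"
        elif epidemiological:
            links[(a, b)] = "epidemiological only"
        elif genetic:
            links[(a, b)] = "genetic only"
    return links


# Hypothetical toy data.
isolates = {
    "P1": "ACGTACGTAC",
    "P2": "ACGTACGTAT",
    "P3": "ACGAACGTAT",
    "P4": "TCGAACCTGT",
}
epi_links = {frozenset(("P1", "P2")), frozenset(("P3", "P4"))}

for pair, evidence in classify_links(isolates, epi_links).items():
    print(pair, "->", evidence)
```

Pairs supported by both kinds of evidence correspond to the solid-arrow links in the figure description, while pairs with epidemiological support alone correspond to the dashed-arrow links.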

A resistome algorithm model
The small cross-hatched boxes represent the antibiotics and resistance genes of relevance to clinical practice. These are, respectively, a small subset of the world of small bioactive molecules (the parvome) and of the world of potential resistance determinants (the resistome). The resistome comprises the genes that potentially encode resistance to antibiotics. The mobilome comprises the mobile portion of bacterial genomes. The mobilome and resistome overlap, since many resistance genes are located on mobile elements. Both the resistome and the mobilome are subsets of the total coding capacity of prokaryotic cells, the pangenome, which is expressed as the panproteome. Note that only a small proportion of the parvome is utilized by humans for antibiotic purposes, and that the scale of commercial antibiotic production probably overwhelms the natural production of these molecules by the entire global microbiota (Figure 6). A recent study (Figure 7) compared the detection of viruses in known respiratory virus-positive samples and in previously unanalysed nasopharyngeal swabs by an RNA sequencing-based metagenomics approach with a more conventional molecular method. The data analysis was performed using Taxonomer, a rapid, interactive, web-based metagenomics data analysis tool. Overall, the metagenomics approach showed high agreement with the molecular method, detected viruses not targeted by the molecular method, and yielded epidemiologically and clinically relevant sequence information (Figure 7).
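The nesting and overlap relationships in this model can be expressed with plain sets. The sketch below uses hypothetical gene labels purely to show the relationships stated above: the resistome and mobilome are subsets of the pangenome, they overlap, and the clinically relevant resistance genes are a subset of the resistome.

```python
# Hypothetical gene labels; a real analysis would use annotated gene identifiers.
pangenome = {"g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"}
resistome = {"g2", "g3", "g4"}         # potential resistance determinants
mobilome = {"g4", "g5", "g6"}          # genes carried on mobile elements
clinical_resistance = {"g3", "g4"}     # resistance genes of clinical relevance

# Subset relationships described in the model.
assert resistome <= pangenome and mobilome <= pangenome
assert clinical_resistance <= resistome

# Resistance genes located on mobile elements (the resistome/mobilome overlap).
mobile_resistance = resistome & mobilome
# Clinically relevant resistance genes that are also mobile.
clinical_and_mobile = clinical_resistance & mobilome

print("mobile resistance genes:", sorted(mobile_resistance))
print("clinically relevant and mobile:", sorted(clinical_and_mobile))
```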

Mathematical model for predicting infection outcome
In this study, we propose that patient-to-patient variability sets a fundamental limit on outcome prediction accuracy for a general class of mathematical models for the immune response to infection. We investigate several systems of Ordinary Differential Equations (ODEs) that model the host immune response to a pathogen load. Advantages of systems of ODEs for investigating the immune response to infection include the ability to collect data on large numbers of 'virtual patients', each with a given set of model parameters, and obtain many time points during the course of the infection. We implement patient-to-patient variability v in the ODE models by randomly selecting the model parameters from distributions with coefficients of variation v that are centered on physiological values.
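One way to realize the parameter sampling described above is to draw each parameter from a log-normal distribution whose mean is the nominal value and whose coefficient of variation is v. This is a sketch under assumptions: the text does not state which distribution family was used, and the parameter names and nominal values below are hypothetical. The log-normal is chosen here only because it keeps rate constants positive.

```python
import math
import random
import statistics


def sample_parameter(mean, v, rng):
    """Draw a positive value with the given mean and coefficient of variation v."""
    if v == 0:
        return mean  # no patient-to-patient variability
    sigma2 = math.log(1.0 + v * v)          # log-space variance that yields CV = v
    mu = math.log(mean) - 0.5 * sigma2      # log-space mean that yields the requested mean
    return rng.lognormvariate(mu, math.sqrt(sigma2))


def virtual_patient(nominal, v, rng):
    """One virtual patient: every model parameter sampled around its nominal value."""
    return {name: sample_parameter(value, v, rng) for name, value in nominal.items()}


# Hypothetical nominal ("physiological") values for a two-parameter model.
NOMINAL = {"growth_rate": 1.0, "clearance": 0.2}

rng = random.Random(1)
draws = [virtual_patient(NOMINAL, 0.3, rng)["clearance"] for _ in range(20000)]
empirical_mean = statistics.fmean(draws)
empirical_cv = statistics.pstdev(draws) / empirical_mean
print(f"mean = {empirical_mean:.3f}, coefficient of variation = {empirical_cv:.3f}")
```

Checking the empirical mean and coefficient of variation against the requested values is a quick sanity test before feeding the sampled parameters into an ODE solver.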
We use logistic regression with one-versus-all classification to predict the discrete steady-state outcomes of the system. We find that the prediction algorithm achieves near 100% accuracy for v = 0, and that the accuracy decreases with increasing v for all ODE models studied. The fact that multiple steady-state outcomes can be obtained for a given initial condition, i.e. the basins of attraction overlap in the space of initial conditions, limits the prediction accuracy for v > 0. Increasing the elapsed time of the variables used to train and test the classifier increases the prediction accuracy, while adding explicit external noise to the ODE models decreases it. Our results quantify the competition between early prognosis and high prediction accuracy that is frequently encountered by clinicians, who need to predict patient outcomes with high accuracy as early as possible after disease inception (Figure 8).
The early elucidation of resistance mechanisms using whole-genome sequencing (WGS) also has implications for the design of clinical trials. If resistance mechanisms are discovered that result in only marginally increased Minimal Inhibitory Concentrations (MICs) compared with the wild-type MIC distribution, more frequent dosing or higher doses could be employed in clinical trials to overcome this level of resistance. Moreover, the discovery of cross-resistance between agents using WGS can influence the choice of antibiotics that are included in novel regimens. WGS has become an essential tool for drug development by enabling the rapid identification of resistance mechanisms.
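The classification experiment described in this section (virtual patients with variability v, and a logistic-regression classifier trained on an early measurement) can be sketched as follows. Everything specific here is an assumption made for illustration: the one-variable bistable ODE, its nominal parameters, the log-normal parameter sampling, the observation time `t_obs` and the outcome cutoff are not taken from the models in the study. With only two outcomes, one-versus-all classification reduces to a single binary logistic regression, which is fitted below by plain gradient descent.

```python
import math
import random

# Hypothetical toy model: pathogen load p with logistic growth (rate r) and
# saturating immune clearance (strength c, half-saturation h):
#     dp/dt = r*p*(1 - p) - c*p/(h + p)
NOMINAL = {"r": 1.0, "c": 0.2, "h": 0.1}  # assumed centre values


def sample_parameter(mean, v, rng):
    """Positive draw with the given mean and coefficient of variation v (log-normal)."""
    if v == 0:
        return mean
    sigma2 = math.log(1.0 + v * v)
    return rng.lognormvariate(math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2))


def simulate(p0, params, t_end, dt=0.05):
    """Forward-Euler integration of the toy ODE; returns the load at t_end."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        dp = params["r"] * p * (1.0 - p) - params["c"] * p / (params["h"] + p)
        p = max(p + dt * dp, 0.0)  # a load cannot become negative
    return p


def virtual_patients(n, v, rng, t_obs=2.0, t_end=40.0):
    """Return early loads (features) and end-of-window outcomes (labels)."""
    xs, ys = [], []
    for _ in range(n):
        params = {k: sample_parameter(m, v, rng) for k, m in NOMINAL.items()}
        p0 = rng.uniform(0.0, 0.4)                      # random initial load
        xs.append(simulate(p0, params, t_obs))          # early measurement
        ys.append(1 if simulate(p0, params, t_end) > 0.05 else 0)  # 1 = persists
    return xs, ys


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=1.0, iters=500):
    """Binary logistic regression on one standardized feature, by gradient descent."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs)) or 1.0
    zs = [(x - mean) / sd for x in xs]
    w = b = 0.0
    for _ in range(iters):
        gw = gb = 0.0
        for z, y in zip(zs, ys):
            err = sigmoid(w * z + b) - y
            gw += err * z
            gb += err
        w -= lr * gw / len(zs)
        b -= lr * gb / len(zs)
    return lambda x: 1 if w * (x - mean) / sd + b >= 0 else 0


def accuracy(v, n=300, seed=0):
    """Train on half of the virtual patients and report accuracy on the other half."""
    xs, ys = virtual_patients(n, v, random.Random(seed))
    half = n // 2
    predict = fit_logistic(xs[:half], ys[:half])
    hits = sum(predict(x) == y for x, y in zip(xs[half:], ys[half:]))
    return hits / (n - half)


results = {v: accuracy(v) for v in (0.0, 0.3, 0.6)}
for v, acc in results.items():
    print(f"v = {v:.1f}: test accuracy = {acc:.2f}")
```

In this toy setting the outcome at v = 0 is a simple threshold function of the early load, so the classifier can separate the two outcomes almost perfectly. Once the parameters vary between patients, the same early load can lead to either outcome, which is the overlap of basins of attraction that limits accuracy. Raising `t_obs` is the natural way to explore the trade-off between early prognosis and prediction accuracy.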

Conclusion
Bioinformatics tools such as Next-generation sequencing (NGS) technologies are increasingly being used for diagnosis and monitoring of infectious diseases. Herein, we reviewed the application of NGS in clinical microbiology, focusing on genotypic resistance testing, direct detection of unknown disease-associated pathogens in clinical specimens, investigation of microbial population diversity in the human host, and strain typing applications in clinical virology,