Introduction
Complexity is one of the first words that come to mind when trying to decipher human health: the mechanisms behind the operation of cells, tissues and biological systems in general are vast and intertwined. Chronic and recurrent inflammatory disorders, such as inflammatory bowel diseases (IBD) and their 2 main representatives, Crohn’s disease (CD) and ulcerative colitis (UC), require complex and multiple approaches so that one can study every aspect of their etiopathology. Fig. 1 showcases the complex disease background of IBD, which in turn requires sophisticated approaches for its elucidation.
Figure 1 Graphical representation of -omes studied by inflammatory bowel disease (IBD)-omics: The genome (red) remains unchanged regardless of disease phase, whereas the epigenome (yellow) could be enriched as the disease progresses. During the disease the transcriptome (green) and proteome (blue) vary, usually increasing, while the gut microflora (dark-yellow) loses biodiversity. Finally, the metabolome (magenta) presents high variability, affected by both the host and the microflora
Systems biology [1], systems bioinformatics [2], computational biology [3], network medicine [4], and several other terms have been coined in recent years to denote the interdisciplinary field whose goal is to approach health as the integration of individual biological data and functions, using principles from computer science, mathematics, physics, chemistry, and computational tools. Omics data (genomics, transcriptomics, metabolomics, metagenomics, etc.; Table 1), along with clinical data and observations, are analyzed and associated with specific phenotypes forming networks of interacting entities. The suffix “-omics” in the name of a scientific field (e.g., genomics) refers to the comprehensive or global study of factors that may be implicated in disease pathogenesis [5].
Table 1 Definitions of omics
In recent years, the principles and practices of systems biology have been applied to a multitude of human health aspects, allowing scientists to gain better insights and help drive therapeutic approaches. In unifactorial and multifactorial conditions alike, computational tools and algorithmic implementations have produced many critical observations in the study of obesity [6,7], diabetes [8,9], cardiovascular disease [10,11], autoimmune- [12,13] and inflammation-related conditions [14,15], neurodevelopment and neurodegeneration [16,17], lung disorders [18,19], cancer [20,21], and a variety of other diseases. Computational drug discovery [22,23] and repositioning [24,25] have also benefited greatly from systemic approaches, allowing chemical compounds and molecules to be studied for targets in the mechanisms behind specific diseases.
In this review, we cover an innovative and exciting field: IBD-omics.
IBD-omics
IBD etiopathology is not completely known, yet various factors from different origins and systems are implicated in it [26]. Among these, it is known that the genetic background and the epigenetic modifications play an important role in disease predisposition, prognosis, and response to treatment [27]. It is also known that environmental factors may exacerbate or ameliorate patients’ clinical symptoms and that the microbial composition and the metabolic profile of IBD patients differ from healthy individuals and may be the result or the cause of the disease’s manifestation [28,29]. Additionally, there is accumulated knowledge on how the immune system and several expressed proinflammatory and profibrotic molecules are involved in disease pathogenesis [30]. However, these factors do not act alone: on the contrary, they synergize and shape the complex pathophysiology of IBD.
IBD-omics examine specific features involved in IBD, highlighting the imperative to employ methodologies that study and model all the existing interactions leading to knowledge acquisition (Fig. 2). In the subsequent sections, we identify and discuss the various molecular systems that shape the IBD field.
Figure 2 The plethora of methodologies and their experimental results, which can be combined to elucidate inflammatory bowel disease pathophysiology and molecular background. IBD-omics, inflammatory bowel disease-omics; NMR, nuclear magnetic resonance; HPLC, high-performance liquid chromatography; SNP, single nucleotide polymorphism; CLIP-seq, cross-linking immunoprecipitation-sequencing; RNAseq, RNA sequencing; qPCR, quantitative polymerase chain reaction; ChIP-seq, chromatin immunoprecipitation sequencing
Genomics in IBD
DNA, the fundamental basis of life, was first discovered in 1869 by a Swiss physician and biologist, Johannes Friedrich Miescher [31]. Since then, huge progress has been made, including the discovery of the DNA composition by Albrecht Kossel and Phoebus Levene and its structure by Watson and Crick; however, most important was the initiation and completion of the Human Genome Project (HGP), which succeeded in sequencing the whole human genome [32]. The term “genome” refers to the complete genetic material of an organism; in humans, only about 2% of the genome (~20,000 genes) codes for proteins. Nonetheless, the rest of the genome does not consist of “junk” sequences, as first believed, but of ones important for cell survival, functionality and evolution, such as non-coding RNAs, regulatory DNA regions, LINEs, SINEs, and introns [33].
Another goal of the HGP was to identify single nucleotide polymorphisms (SNPs), i.e., single nucleotide substitutions in DNA, which have been found to be the most common genetic variants and were later used in genome-wide association studies (GWAS) [34]. During the early 2000s, the first GWAS started to emerge; Klein et al published the first GWAS on age-related macular degeneration, highlighting several gene mutations associated with this disease [35]. Since then, the number of articles reporting GWAS on various diseases has grown exponentially [36].
The basic principle of GWAS is that complex genetic diseases are associated with multiple and often common genetic polymorphisms; they are thus referred to as polygenic. The first part of GWAS methodology includes whole DNA extraction from patient and control samples, such as blood, followed by genetic sequencing for SNPs of specific genes, as described in the Methodology section of Guo et al [37]. For this purpose, chips containing genome-wide SNPs have been developed to test for common genetic variations among the population. Next, results are analyzed using various approaches. If an SNP appears more frequently in the patient group than in controls, with very high statistical significance, such as P<5×10⁻⁸, then the genetic region that includes this SNP is considered to be a risk locus and therefore associated with the disease [38]. In other words, GWAS may accurately associate any specific SNP with a disease trait, outcome, or even response to treatment, excluding at the same time any insignificant difference between patients and controls [39,40].
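The single-SNP association step described above can be illustrated with a minimal sketch. It assumes a plain allelic chi-square test on a 2×2 table of allele counts, which is only one of the "various approaches" used in practice; the function names and the allele counts are hypothetical and chosen purely for illustration, and real analyses typically add covariates, quality control and dedicated software.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the text


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the alternative allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


# Hypothetical allele counts (2 alleles per individual, 2000 cases, 2000 controls).
chi2, p = allelic_chi2(1400, 2600, 1000, 3000)
print(f"chi2 = {chi2:.2f}, OR = {odds_ratio(1400, 2600, 1000, 3000):.2f}")
print("risk locus candidate" if p < GENOME_WIDE_ALPHA else "not significant")
```

The stringent threshold reflects the very large number of SNPs tested in parallel; a nominal cut-off such as 0.05 would produce many false positives at genome-wide scale.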
Even before the GWAS era, hypothesis-driven studies reported the first gene to be associated with susceptibility to CD, the nucleotide-binding oligomerization domain-containing protein 2 (NOD2) [41,42]. NOD2 plays a significant role in inflammation as it is implicated in the activation of nuclear factor (NF)-κB, responsible for the activation of numerous genes of the innate and adaptive immunity [43]. In 2006, Duerr et al published the first GWAS on IBD and reported a strong association between CD patients and an SNP located in the IL23R gene [44]. Since then, many more GWA studies took place, highlighting susceptibility loci in IBD patients, such as ATG16L1, IRGM, TNFSF15, PTPN2, IL12B, JAK2, STAT3 and many more that showed strong association with CD patients [38].
Another technological milestone in genomics was the development of the Immunochip. Based on the observation that many chronic and autoimmune diseases share common genetic traits, researchers developed a chip, the Immunochip, which contained about 200,000 SNPs and 800 small insertion–deletions found to be associated with different autoimmune disorders [39,45]. Using this technology, Jostins et al carried out a large scale meta-analysis and identified 163 genetic loci associated with CD, UC, or both [46]. Since then, the use of Immunochip has highlighted new genetic variants that are associated with IBD, the clinical course of the disease and even adverse events following treatment with anti-tumor necrosis factor (TNF) agents [47-49].
As mentioned above, GWA studies are limited to associating only common genetic variants with a disease trait, outcome or response to a treatment. This limitation was overcome by next-generation sequencing (NGS) technologies, which make it possible to study rare genetic variants [40]. NGS not only offered the opportunity to study rare genetic variants, but also sped up the sequencing process and significantly lowered its cost. One of the technologies that emerged during the NGS era was targeted genome sequencing, where only parts of the genome are selectively sequenced and studied; this is accomplished by the use of DNA or RNA probes that specifically target and bind to the regions of interest in the DNA sequence. Whole exome sequencing (WES) is one example that arose from the use of these technologies [50].
WES aims at sequencing only the part of the genome that is translated into proteins [40]. Previous WES studies have identified new genetic variants associated with susceptibility to IBD, such as missense mutations in the genes PRDM1, NDP52, IL17REL, and CSF2RB [51-53]. A recent WES study on an Ashkenazi Jewish IBD population highlighted genetic variants in the genes NOD2, ZNF366 and MDGA1 associated with IBD, but failed to confirm previous associations with genes such as THEMIS, MCOLN2 and NLRP2 [54]. Similarly, Onoufriadis et al found a strong association between a rare variant in the NLRP7 gene and UC patients who originated from the UK, but not with CD patients, probably due to the small size of the CD patient group [55]. Very early onset IBD (VEOIBD), a severe variety of IBD that may manifest in children, has also been the focus of WES studies and researchers have identified rare genetic variants in several genes associated with VEOIBD, such as NOX1, NOD2, IL10RA and ADAM17 [56-60].
Genomic studies have shown that CD and UC may share the same genetic risk factors for susceptibility, but have also reported different genetic loci specifically associated with either disease or either disease’s subphenotype. This discovery may offer clinicians a useful tool for distinguishing CD’s subphenotypes and CD from UC, especially in cases where clinical and endoscopic criteria fail to do so [61]. Nonetheless, the genetic background of the population plays a significant role, as some genetic risk factors seem to be ethnic-specific, with different frequencies being reported in studies that include patients of different ethnicities [62-65].
Epigenomics in IBD
Beyond genetic factors implicated in CD and UC, which account for 13.6% and 7.5% of disease diversity, respectively, environmental factors also contribute to IBD pathogenesis and may affect and alter the genetic background. Environmental elements that interact with DNA components and regulate its function are called epigenetic factors. Epigenetics is the field of science that studies the interactions between environment and genome and the manner in which these interactions may regulate gene expression. DNA methylation and histone modifications are the main epigenetic modifications. Although they were first believed to be non-inheritable, recent studies suggest the opposite, and these modifications may thus play a significant role in IBD pathogenesis [66].
DNA methylation occurs when DNA methyltransferases (DNMTs) transfer a methyl group to cytosines, resulting in the formation of 5-methylcytosine. If this reaction takes place in regions of gene promoters, then it leads to obstruction of transcription factor binding and, ultimately, to suppression of gene expression. DNA methylation is also implicated in histone modification, and together they regulate the expression patterns of cells [67,68]. The most common method for studying DNA methylation was first described by Frommer et al. Briefly, addition of sodium bisulfite to genomic DNA results in the conversion of non-methylated cytosines to uracils, while leaving methylated cytosines intact. Next, the methylation status of the gene of interest is calculated by performing methylation-specific polymerase chain reaction (PCR) [69].
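The logic of bisulfite conversion can be sketched in a few lines. This is a toy model under stated assumptions: it handles a single DNA strand, treats the conversion as 100% efficient, and reads methylation by direct sequence comparison rather than by the methylation-specific PCR described above. The sequence, the methylated positions and the function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment followed by amplification.

    Unmethylated cytosines become uracil, which is read as thymine (T)
    after PCR; methylated cytosines are protected and remain C.
    """
    converted = []
    for position, base in enumerate(sequence):
        if base == "C" and position not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted):
    """Return {position: is_methylated} for every cytosine in the reference."""
    calls = {}
    for position, (ref_base, obs_base) in enumerate(zip(reference, converted)):
        if ref_base == "C":
            calls[position] = obs_base == "C"
    return calls


reference = "ACGTCCGA"          # hypothetical fragment; cytosines at 1, 4 and 5
treated = bisulfite_convert(reference, methylated_positions={1, 5})
print(treated)                               # ACGTTCGA
print(call_methylation(reference, treated))  # {1: True, 4: False, 5: True}
```

A cytosine that survives treatment is inferred to have been methylated, which is why incomplete conversion in a real experiment leads to overestimated methylation.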
In IBD, 3 types of DNMTs have been implicated in pathogenesis: DNMT1, DNMT3a and DNMT3b. DNMT1 and DNMT3b have been found to be elevated in inflamed mucosa sites of UC patients, while genetic variants of the DNMT3a gene have been associated with CD [70]. In a methylation-profiling study that included only female CD patients, Li Yim et al found more than 4000 positions to be differentially methylated and to be associated with over 2700 genes in CD patients, of which the 2 most significant were located in the PTPRN2 and BCL11A genes. The same investigators also identified 8 differentially methylated regions (DMRs) close to 8 different genes, and some of these genes were associated with immune-related pathways [71]. Regarding UC, Kang et al also found differential methylation patterns in patients, with 3 genes, FAM217B, KIAA1614 and RIBC2, being hypermethylated; this pattern might thus be a tool for distinguishing UC patients from healthy individuals [72]. In a recent study, mucosal biopsies taken from UC patients showed different methylation patterns from those of healthy individuals; UC patients showed hyper-methylation or hypo-methylation in genes involved in homeostasis and defense, or in immune response pathways, respectively [73]. An epigenome-wide association study revealed that IBD patients bear differentially methylated positions (DMPs) compared with healthy individuals and that most of the DMPs found in IBD cases were shared between CD and UC patients. Among the top IBD-associated DMPs were positions located in the RPS6KA2, IL23A and TNFSF10 genes and among the IBD-related DMRs were regions near the genes VMP1, ITGB2, WDR8 and TXK [74]. In a similar study that included pediatric patients, Howell et al found that intestinal epithelial cells from IBD patients had different methylation patterns from controls and that the methylation pattern was not only disease-specific but also gut-segment specific [75].
Histone modifications are another epigenetic mechanism that regulates gene expression; among these, histone acetylation and methylation are the best studied [76]. Histone acetylation takes place when a histone acetyltransferase adds an acetyl group to a lysine residue of the histones, resulting in transcriptional enabling, while its removal by histone deacetylases (HDACs) leads to transcription blocking. On the other hand, histone methylation can either enable or block transcription, depending on the region where the methyl group is attached [27,76]. A well-established method for studying histone modifications is the chromatin immunoprecipitation (ChIP) protocol, where the chromatin is extracted, fragmented and immunoprecipitated and, finally, the DNA is studied using various protocols, such as microarrays (ChIP-on-chip) or next-generation sequencing (ChIP-seq), that enable the detection and quantification of modifications that occurred at the point of interest [77-79].
Only limited data are available regarding histone modifications in IBD. Bai et al found that lysine acetyltransferase 2B expression was significantly reduced in inflamed tissues of IBD patients, which resulted in low levels of histone H4 lysine 5 acetylation and, subsequently, in downregulation of interleukin (IL)-10 expression [80]. Previous in vitro studies have shown that Th17 immunological responses may be subject to histone modifications. Primary Th17 cells isolated from healthy individuals expressed high levels of IL-17A, and not IL-17F, upon stimulation with prostaglandin E2 and/or IL-23 plus IL-1β. Further investigation revealed that the expression of the 2 cytokines, IL-17A and IL-17F, was regulated by histone modifications; IL-17F expression was silenced due to histone methylation of H3, whereas IL-17A was overexpressed as a result of different patterns of methylation and acetylation of H3 [81]. Along similar lines, Ghadimi et al showed that certain commensal probiotics may inhibit the NF-κB transcription factor by reducing histone acetylation levels [82]. Finally, data from murine models of colitis suggest that the inhibition of HDACs may lead to the induction of apoptosis and Foxp3 expression and to the suppression of proinflammatory cytokine expression [76].
Transcriptomics in IBD
If we consider the genome to be the answer to “who is responsible for our genetic and health background?”, the transcriptome answers the question “what does the genome do at any given time/cell?”. As DNA is transcribed to RNA by RNA polymerase in the nucleus [83], measuring the resulting transcripts makes it possible to identify how gene interactions transpire (i.e., to infer function via gene expression [84]) and what genetic information gets passed along in the “genotype to phenotype” pipeline. We now know that a considerable number of genes are transcribed to RNA and translated into proteins, while others regulate and assist as a variety of RNAs. Messenger RNA (mRNA) is our main source for characterizing coding genes, while ribosomal (rRNA), micro (miRNA), long non-coding (lncRNA), transfer (tRNA), small nuclear (snRNA) and other RNAs perform functions that make protein synthesis possible [85-87].
One of the advantages of modern technological advances has been the ability to measure gene expression in samples of various tissues after RNA isolation [88,89]. Quantitative real-time PCR (qPCR) [90], cap analysis gene expression (CAGE) [91], serial analysis of gene expression (SAGE) [92], microarrays [93], and total RNA sequencing (RNA-seq) [94] are just a few of the methodologies allowing us to quantify RNA transcripts. Differential gene expression analysis (DGEA) [95-97] enables studying the expression of specific genes under various conditions, tissues and timepoints and the juxtaposition between them to detect statistically significant differences that may signify association. For example, if a gene is found to be over- or under-expressed in several disease-associated samples versus controls, it allows for the assumption that this gene is implicated in the pathophysiology of the disease and the signaling pathways it is involved in are affected. Databases like KEGG [98] and REACTOME [99], supported by bioinformatics platforms like Enrichr [100] and Ingenuity Pathway Analysis (IPA) [101], enable researchers to identify these pathways and perform further analyses based on them.
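The comparison at the heart of differential gene expression analysis can be sketched for a single gene as follows. This is an illustrative simplification, not the method of any cited tool: it assumes already-normalized expression values, reports a log2 fold change, and uses an exact permutation test on the difference of group means in place of the model-based statistics that dedicated packages apply. The expression values, the pseudocount and the function names are hypothetical, and no multiple-testing correction is shown.

```python
import math
from itertools import combinations
from statistics import mean


def log2_fold_change(disease, control, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean(disease) + pseudocount) / (mean(control) + pseudocount))


def permutation_p_value(disease, control):
    """Exact two-sided permutation test on the difference of group means.

    Enumerates every way of relabelling the pooled samples, so it is only
    practical for small groups.
    """
    pooled = list(disease) + list(control)
    group_size = len(disease)
    observed = abs(mean(disease) - mean(control))
    extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), group_size):
        chosen = set(indices)
        group_a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical normalized expression of one gene in 5 patients and 5 controls.
disease = [120, 135, 128, 140, 131]
control = [60, 72, 65, 70, 68]
print(f"log2FC = {log2_fold_change(disease, control):.2f}")
print(f"p = {permutation_p_value(disease, control):.4f}")
```

In a real experiment this comparison is repeated for thousands of genes, so the resulting p-values are adjusted for multiple testing before the list of differentially expressed genes is passed to pathway analysis.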
In the past decades these techniques, along with various other omics, have assisted in identifying the pathophysiological mechanisms of IBD. In 2005 Costello et al [102], using cDNA arrays of colonic mucosa samples, identified several genes associated with UC and CD, highlighting the complex nature of IBD but also distinguishing the phenotypes. Along similar lines, Schmidt et al [103] observed that IL-23p19 and IL-27p28 are elevated in CD but not in UC. Carey et al [104] highlighted IL-6:STAT3-dependent biological networks upregulated in IBD patients, regulating leukocyte recruitment, HLA expression, angiogenesis and tissue remodeling. Bamias et al [105] studied the mucosal expression of housekeeping genes during IBD and concluded that it is altered, proposing other genes (namely RPLP0 and RPS9) as references and recommending extended validation. Sugihara et al [106] described elevated C3 and IL-17 mRNA expression in the inflamed mucosa of IBD patients. Fransen et al [107] identified a correlation between IBD and IL6, IL23A and RORC. Chiriac et al [108], using RNA-seq, found that colon tissues from IBD patients and mice with DSS colitis exhibited increased expression of IL28 versus controls, leading them to test and validate that IL28 administration in the animal model promotes mucosal healing. Hong et al [109], via RNA-seq, pinpointed differences between the inflamed and non-inflamed intestinal mucosa of CD patients and healthy controls. Of high interest in recent years is the study of non-coding RNAs and their role, as depicted in reviews and original articles [110-116].
Finally, Telesco et al [117], Arijs et al [118], Nunes et al [119], Lucafò et al [120], and Váradi et al [121], among others, have studied the response to specific therapeutic interventions to identify molecular targets and separate responders from non-responders.
Proteomics in IBD
The term “proteome” refers to the total proteins, including all isoforms or post-translational modifications that can be expressed by the genome. Thus, proteomic analysis provides an opportunity for the large scale detection, identification and characterization of the whole protein expression of a given cell or tissue, making it the ideal tool for biomarker discovery [122]. Currently, there are several different approaches to proteomic analysis; in this review we try to briefly cover most of them.
Liquid chromatography and mass spectrometry (MS) are 2 widely used techniques for protein separation and identification. Liquid chromatography aims at separating the components found in a mixture, depending on their size, shape, charge or affinity for a certain ligand; there are thus many different types of chromatography [123]. MS is used for the identification of proteins, peptides or their post-translational modifications, and over the years several different techniques have been developed, from electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) to the new generation of mass analyzers and complex multistage instruments, such as hybrid quadrupole time-of-flight (Q-Q-TOF) [124]. Nonetheless, the most common MS techniques include surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) and MALDI time-of-flight (MALDI-TOF). Identification of proteins based on these techniques requires, first, protein digestion into peptides, then measurement of the mass-to-charge ratio in an electric field and, finally, investigation of the peptide mass signatures in relation to a database of known proteins [122]. Beyond liquid chromatography and MS, more complex and quantitative techniques have been developed, such as 2D-PAGE, 2D-DIGE, ICAT, SILAC, proteolytic 18O labeling, iTRAQ and protein/antibody arrays (chips); a technical description of these is beyond the scope of this review, but they were reviewed in detail by Alex et al [122].
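The three identification steps named above (digestion, mass-to-charge measurement, database comparison) can be mimicked in silico with a minimal sketch. It assumes trypsin as the protease (cleavage after K or R, but not before P), monoisotopic residue masses, unmodified peptides with no missed cleavages, and a naive count of matching masses as the score. The two protein sequences, the tolerance and all function names are hypothetical and serve only as an illustration.

```python
# Monoisotopic amino acid residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of the charge carrier


def tryptic_digest(protein):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def mass_to_charge(mass, charge=1):
    """m/z of a peptide carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


def best_match(observed_masses, database, tolerance=0.01):
    """Score each protein by how many observed masses match its peptides."""
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        scores[name] = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed_masses
        )
    return max(scores, key=scores.get), scores


# Hypothetical two-protein database and a "measured" spectrum from PROT_A.
database = {"PROT_A": "MKWVTFISLLRPGAAKDE", "PROT_B": "GASPVKTTCR"}
observed = [peptide_mass(p) for p in tryptic_digest(database["PROT_A"])]
print(tryptic_digest(database["PROT_A"]))  # ['MK', 'WVTFISLLRPGAAK', 'DE']
print(best_match(observed, database))
```

Real search engines additionally account for missed cleavages, post-translational modifications and instrument error, and score matches statistically rather than by a simple count.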
By using any of the aforementioned techniques, proteomics can compare expression profiles between IBD patients and controls, which may ultimately lead to biomarker discovery. Over 10 years ago, Meuwis et al performed the first proteomic analysis in serum samples of IBD patients and controls and found 4 possible biomarkers (PF4, MRP8, FIBA and Hpalpha2) that could discriminate IBD patients from healthy individuals, thus providing a possible diagnostic tool [125]. During the same year, a similar proteomic study on intestinal epithelial cells from IBD patients revealed expression differences between inflamed and non-inflamed colonic regions in molecular pathways that regulate signal transduction, stress response and energy metabolism; among these, the most outstanding differences were in the expression of programmed cell death protein 8 and annexin 2A [126]. Along similar lines, Nanni et al observed statistically significant expression changes between the intestinal epithelial cells of CD patients and those of healthy individuals; some of these proteins, such as heat shock protein 70 and tryptase alpha-1 precursor, were upregulated in CD patients, while others, such as the nuclear protein Annexin A1, were downregulated [127]. Another interesting study on mononuclear cells from IBD patients and healthy donors was able not only to distinguish the expression profiles of patients and controls, but also to discriminate between CD and UC patients [128]. In a recent proteomic analysis, Manfredi et al showed that proteins related to the acute phase response and the complement activity were upregulated in serum samples from IBD patients, whereas proteins implicated in protease function, blood coagulation, oxygen transport and lipoprotein metabolism were downregulated [129].
Expression differences in genes related to immune responses between IBD patients and controls have also been identified, and according to the disease location or behavior, these expression patterns may vary among CD subtypes [130]. Proteomic analysis of colonic mucosal-luminal interface aspirates from pediatric patients with IBD-associated colitis led to the identification of possible markers for the diagnosis of UC [131]. Apart from diagnostic biomarkers that can enable easier discrimination between IBD adult or pediatric patients and healthy individuals or CD and UC patients, proteomic studies have also revealed possible biomarkers associated with response to treatment. Specifically, Meuwis et al conducted a small pilot study in CD patients treated with infliximab and found that PF4 was associated with response to treatment [132].
Metabolomics in IBD
Metabolomics is the scientific field that studies the metabolic processes and changes in metabolite production that occur in an organism. Metabolomic studies are usually performed on samples collected noninvasively or with minimal invasiveness, such as blood, urine, stool and swabs, and the resulting metabolic profile is the outcome of at least 3 different sources: dietary compounds, xenobiotics from the environment, and metabolic products of the microflora. Therefore, metabolomic studies in IBD usually examine the metabolic relationship between the host and the gut microflora and the factors that may influence this relationship [133].
The 2 most common technologies used in the field of metabolomics are MS, analyzed in the previous section (Proteomics), and nuclear magnetic resonance (NMR)-spectroscopy [133]. NMR is a widely used technique in chemistry that provides information about the molecular structure of the examined compounds and their absolute or relative concentration in the sample. The main differences between MS and NMR-spectroscopy concern the sensitivity, and the sample size and preparation required. MS is the more sensitive method and is able to detect and identify metabolites with a thousand times higher sensitivity than NMR-spectroscopy. Regarding the sample size, MS methods usually require volumes of 10-30 μL, whereas NMR samples need to be about 300 μL. Lastly, NMR methods do not require any specific sample preparation other than sample dilutions, whereas, as described above, MS requires a series of steps before sample analysis [134].
The first metabolomics study was performed in fecal samples from twin CD patients and twin healthy individuals and one of the findings was that the metabolic profile differed among patients with ileal disease, patients with colonic disease and healthy individuals. The differences in metabolite content were identified to be on pathways concerning the metabolism or synthesis of amino acids, fatty acids, bile acids and arachidonic acid; thus, this study suggested possible metabolic biomarkers for disease diagnosis and phenotype characterization [135]. Along similar lines, Bjerrum et al performed a metabolomic study in patients with active or inactive UC and healthy individuals and found that the metabolomic profile of colonic biopsies and colonic epithelial cells differed between active UC and controls. The most interesting result, however, was the fact that, although inactive UC patients were free of clinical and histological findings, the metabolic profile of 20% of them matched with that of active UC patients, suggesting again that metabolomic analysis might be a useful tool for disease prognosis [136]. In a recent study, Diab et al investigated the metabolite levels of omega-3 and omega-6 polyunsaturated fatty acids in colonic biopsies taken from healthy controls and treatment-naïve or deep remission UC patients. They found that levels of omega-6 metabolites such as prostaglandin E2, thromboxane, trans-leukotriene, and 12-hydroxy-eicosatetraenoic acid, known to be actively involved in inflammation, were significantly higher in treatment-naïve UC patients compared with the other 2 groups, while levels of omega-3 metabolites were lower. Furthermore, the elevated levels of omega-6 metabolites correlated with increased proinflammatory cytokine expression, suggesting that polyunsaturated fatty acid metabolism plays a significant role during the onset of the disease [137]. 
In another study, Keshteli et al highlighted metabolic differences in urine and serum samples between UC patients in remission or relapse. They showed that significantly higher levels of trans-aconitate, 3-hydroxybutyrate, acetoacetate and acetone, and lower levels of acetamide and cystine were found in UC patients experiencing a clinical relapse, suggesting possible biomarkers for disease prognosis [138]. Metabolomic signatures could also contribute to disease categorization, as in a recent study serum metabolic profiles differed among healthy individuals and UC or CD patients [139]. Apart from the host, changes in the metabolic profile of gut microflora have also been found between healthy individuals and IBD patients. One of the first studies that compared the metabolomic signatures in fecal samples from healthy individuals and patients with UC or irritable bowel syndrome showed that taurine and cadaverine levels were higher in UC patients [140]. Furthermore, our research group, using in silico approaches, has recently reported that the microbial metabolites of CD B2 or B3 behavioral sub-phenotypes differ from the B1 sub-phenotype, suggesting that metabolomic analysis might contribute to disease sub-phenotype classification [141].
Overall, metabolomics is a promising tool for disease diagnosis, prognosis and classification and its now low-cost and easy-to-perform evaluations could add significant information in everyday clinical practice.
Microbiome in IBD: the meta-paradigm
So far, this review has presented various -omics approaches that help characterize and analyze the background of human health. But is this the full picture or is it all a matter of application? The truth is that the targets of these techniques are often estimated to represent only one tenth of the cells in the human body. What has happened in recent years is a paradigm shift towards exploring the rest: the identity and function of our symbionts. The microflora or microbiota, as they are referred to in the literature, comprise viruses, fungi, archaea, and, most importantly, bacteria that live primarily in the gastrointestinal tract but also on the skin, in the mouth, nose and lungs. Their total genetic composition is called the microbiome and has become the new focus for genomics, transcriptomics, metabolomics and various other targeted and blanket approaches using state-of-the-art and established technologies [142-144].
Microbiota are usually satisfied staying in an ever-evolving cycle of mutualistic bliss (albeit a fragile one): homeostasis. During homeostasis the microbiome’s function [145] is associated with human health by providing (in an interaction with the gut mucosa and the immune system [146-148]) defense against pathogens, modulation of inflammation, production of energy and vitamins, and assistance with the host’s metabolism and nutrient intake. When homeostasis becomes unbalanced and microbial populations and functions are altered, the phenomenon is called dysbiosis. Dysbiosis has been directly associated [149-151] with the onset, progression and therapeutic response of multiple health conditions, including lung-associated disorders [152,153], obesity [154,155], diabetes [156,157], cardiovascular disease and atherosclerosis [158-161], chronic kidney disease [162-164], cancer [165-167], neurological and neuropsychiatric disorders [168-171], and, in the spirit of this review, IBD [172-174].
As the microbiome comes under extensive scrutiny, new insights into its association with IBD have come to light, revealing parallel alterations in behavior and phenotype [175]. In 2008 Huttenhower et al [176] identified Faecalibacterium prausnitzii’s anti-inflammatory role in CD, and later studies [177-179] confirmed its reduction during dysbiosis. Robinson et al [180], using an animal model of colitis, identified several phylogenetic and metabolic changes in the microbiome similar to those seen in human IBD. Chu et al [181] described how outer membrane vesicles secreted by Bacteroides fragilis play a role in the immunomodulation of IBD in partnership with the NOD2 and ATG16L1 genes. Halfvarson et al [182], in a longitudinal study of the microbiome, proposed and demonstrated deviation from a bacterial “healthy plane” during IBD; they also studied the role of fecal calprotectin, without finding any statistically significant association. In a recent work, our group also identified dysbiosis associated with the complex behavioral sub-phenotypes of CD (stricturing and penetrating), as well as differences in the diversity and function of the microbiome compared with the inflammatory sub-phenotype [183]. Ananthakrishnan et al [184] studied the response to anti-integrin biologic therapy in association with the microbiome and concluded that microbial function and diversity in the early stages of therapy might be able to predict its efficacy. Finally, many studies have focused on fecal microbiota transplantation (FMT) [185-189] as a potential therapeutic intervention against IBD, but also as a possible irritant [190,191].
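The "diversity" referred to in several of these studies is commonly summarized with an index such as Shannon's, H = -Σ p_i ln p_i, where p_i is the relative abundance of taxon i. The following is a minimal sketch of that calculation, not the method of any study cited above; the taxon names and read counts are invented purely for illustration.

```python
import math


def shannon_diversity(counts):
    """Return the Shannon index H = -sum(p_i * ln p_i) for a dict of taxon counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    h = 0.0
    for n in counts.values():
        if n > 0:  # taxa with zero reads contribute nothing to the sum
            p = n / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per taxon (made-up numbers, not real data).
balanced_sample = {"taxon_A": 250, "taxon_B": 250, "taxon_C": 250, "taxon_D": 250}
skewed_sample = {"taxon_A": 970, "taxon_B": 10, "taxon_C": 10, "taxon_D": 10}

h_balanced = shannon_diversity(balanced_sample)
h_skewed = shannon_diversity(skewed_sample)
print(f"balanced: {h_balanced:.3f}, skewed: {h_skewed:.3f}")
```

A community dominated by a single taxon yields a lower index than an evenly distributed one, which is the sense in which dysbiosis is often described as a loss of diversity.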
From systems biology to treating IBD
As it stands, we now possess an abundance of tools and technical methodologies to analyze and try to comprehend the genetic and molecular background of IBD. This is, though, just the first step in the war waged against it. We are facing a group of diseases that differ greatly in their symptomatology and progression, but also in the approaches that should be taken to treat them. Up to now, both CD and UC have been considered manageable but incurable, with a heavy burden on the quality of life. This, combined with the long and tedious process of finding a potential therapeutic target, identifying and validating a chemical compound or molecule, and ultimately providing it to the general public, means that changing IBD’s status to curable may still be far away.
Systems biology has provided the tools to ameliorate the situation in all steps along the way. We have already discussed how -omics approaches highlight potential pharmacological targets, but what about drug identification and validation? Or their deployment and efficacy in the general population? Drug repositioning [192-194] has been introduced to cost-effectively, accurately and efficiently identify drugs that can help regulate the pathology of a disease. It allows for repurposing pharmaceuticals already on the market to be used on new targets. These drugs have already undergone extensive preclinical and clinical trials, have known side-effects and are generally considered safe for the patient under other conditions. Computational drug repositioning, because of its nature in utilizing -omics data, is a step towards precision medicine. By identifying drugs for specific patients/patient groups, there is no need to classify and account for responders and non-responders to treatment, since the drugs discovered are specifically identified using the patient’s genetic and molecular background [195-199].
Over the years, the -omics approaches described above have identified possible therapeutic targets, and this has led to the development of biological therapies and, ultimately, to personalized medicine. The milestone during all those years of research was the use of anti-TNF agents in treating IBD patients. Infliximab was the first chimeric monoclonal antibody against TNF-α, a proinflammatory cytokine with a central role in systemic inflammation. Since then, 3 other anti-TNF agents have been approved for IBD: adalimumab, certolizumab pegol, and golimumab [200]. Beyond anti-TNF agents, many other therapies either have been approved for IBD treatment or are currently under investigation in clinical trials [200-204]. These are directed against proinflammatory interleukins (e.g., ustekinumab), intracellular signaling targets (e.g., the Janus kinase inhibitors tofacitinib, filgotinib, and upadacitinib), and cell adhesion molecules (e.g., natalizumab, vedolizumab, etrolizumab, AJM300 and PF-00547659).
Biological data integration and networks: the way forward
As with all multidisciplinary approaches, medical systems biology must advance on multiple fronts to be effective. Its goals and hypotheses must be reassessed and its methodologies must be updated constantly. For example, multidisciplinary approaches such as the recently emerged “proteogenomics” (a combination of proteomics, genomics, and transcriptomics) can allow for a holistic point of view regarding disease pathophysiology [205]. Medicine has moved away from a generalized treatment-oriented goal to preventive and personalized approaches. The target is no longer just to treat a condition, but to do so effectively and efficiently while gaining a better understanding of the mechanisms involved. Meanwhile, biological data acquisition and digitization techniques (omics) advance every day, producing higher volumes of data with higher specificity and precision. Computational power and toolkits must keep up with these new challenges. New databases, algorithms and user-friendly applications [206-211] are being developed to support the analysis of these data by experts in bioinformatics and physicians alike.
The application of graph and network theory principles in the construction and analysis of biological networks is arguably the strongest weapon in our arsenal [212]. It provides the means to study multiple biological interactions in a mathematical way and extract information that might not be easily comprehensible otherwise [213]. Genes, proteins, microbiota, diseases, drugs and a variety of other entities can be linked together and analyzed in a meaningful manner to extract significance and association. Perhaps the easiest way to understand biological networks is to think of a molecular signaling pathway and how genes act in synergy to perform a function as a whole. But that is only the tip of the iceberg in our case: with the accumulated knowledge of past years, we know which genes are co-expressed in various experiments, into which proteins they are translated, how those proteins interact amongst themselves, how they can provide variability in the course of a disease or its treatment, and so forth.
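To make the idea of linking heterogeneous entities concrete, the sketch below builds a small undirected network and computes two basic graph measures: degree centrality and connected components. It uses only the Python standard library. The edges are toy associations loosely based on entities named in this review, plus a placeholder pair ("geneX", "geneY"); they are assumptions for illustration and not curated interaction data.

```python
from collections import defaultdict, deque

# Illustrative edges only: a toy network, not a curated interaction set.
edges = [
    ("Bacteroides fragilis", "NOD2"),
    ("Bacteroides fragilis", "ATG16L1"),
    ("NOD2", "ATG16L1"),
    ("NOD2", "IBD"),
    ("ATG16L1", "IBD"),
    ("TNF", "IBD"),
    ("TNF", "infliximab"),
    ("TNF", "adalimumab"),
    ("geneX", "geneY"),  # hypothetical placeholder pair, disconnected from the rest
]


def build_graph(edge_list):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    graph = defaultdict(set)
    for a, b in edge_list:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def connected_components(graph):
    """Return a list of node sets, one per connected component (breadth-first search)."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


graph = build_graph(edges)
centrality = degree_centrality(graph)
components = connected_components(graph)

# List nodes from most to least connected.
for node, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
print(f"{len(components)} connected components")
```

In practice, dedicated packages offer these measures and many more; the point here is only that once entities are expressed as nodes and edges, questions such as "which node is most connected?" or "which entities form an isolated module?" become simple computations.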
We can approach these networks via different methodologies and tools, depending on our target. The most commonly used probably include GeneMANIA [214] for gene interactions, STRING [215] for protein-protein associations, and KEGG [216] or Reactome [99] for signaling pathways. More powerful solutions, such as Cytoscape [217] or the igraph [218] package for R, are oriented towards more tech-savvy users. The full list, however, is very extensive.
In IBD, network approaches are not very popular: fewer than 400 papers utilizing some kind of network implementation have been indexed in PubMed over the last 3 years. Some recent examples of different kinds of networks include the work of Peters et al [219], who created, modeled and analyzed a complex genomic network drawn from a variety of resources; the work of Benchimol et al [220], who employed distributed network analysis on phenotypic and locational data to describe the epidemiology of IBD in Canada; and the work of Coward et al [221], who used a network-based meta-analysis to compare the effectiveness of commonly used IBD treatments.
Concluding remarks
The sections of this review have covered the most popular -omics, how they are applied and their implementation in IBD studies. Table 2 showcases the results we can expect from -omics analyses, along with how these can be interpreted and utilized in everyday clinical practice. This is important because, regardless of the methodology, the ultimate goal will always be the usability and effectiveness of the results. We neither want nor need more meaningless data, but rather clear information about how to approach, evaluate and treat a disorder. Systems biology is clearly years away from being the only solution we will ever need, but it is nevertheless a very efficient way to make sense of the knowledge we currently have and continue to amass.
Table 2 Inflammatory bowel disease (IBD)-omics, the expected outcome, and its possible applications in clinical practice