Fig. 2

Different characteristics of TCR (beta chain) repertoire analysis showed specific within-sample profiles independent of immune status. A) Overlap analysis of CDR3β sequences to uncover clonal convergence or shared clonotypes between individuals. The Jaccard similarity coefficient matrix indicates a high (red) or low (blue) match between two sample sets. B) Diversity profile analysis to measure clonal expansion. The heatmap contains high (light yellow) or low (dark purple) pairwise Pearson correlation coefficients of the Evenness profiles calculated for control (blue), cirrhotic without MHE (red) or cirrhotic with MHE (black) patients. C) K-mer analysis to examine repertoire similarity. The heatmap represents the highest (light yellow) to the lowest (dark purple) pairwise Pearson correlation coefficients of the 3-mer (amino acid k-mers of length 3) frequency distribution matrix. D) Network analysis to study clonal architecture. Colors from light yellow to dark purple represent the number of nodes with degree 0–5 in each repertoire. Most of the nodes are single (i.e., degree = 0). E) Distribution of single nodes (degree = 0) between patient groups, used to test whether the number of degree = 0 nodes in cirrhotic patients without MHE (red) was significantly higher than in control (blue) or cirrhotic with MHE (black) patients. The test was performed after removing PC134 (an outlier in the withMHE group).

Abstract

T-cell receptor (TCR) analysis is relevant for the study of immune system diseases. The expression of TCRs is usually measured with targeted sequencing approaches where TCR genes are selectively amplified. However, many non-targeted RNA-seq experiments also contain reads of TCR genes, which could be leveraged for TCR expression analysis while reducing sample requirements and costs. Moreover, a step-by-step pipeline for the processing of transcriptome RNA-seq reads to deliver immune repertoire data is missing, and these types of analyses are usually not included in RNA-seq studies of immunological conditions. This represents a missed opportunity for complementing them with the analysis of the immune repertoire.

We present a Nextflow pipeline for T-cell receptor repertoire reconstruction and analysis from RNA sequencing data. We used a case study where TCR repertoire profiles were recovered from bulk RNA-seq of isolated CD4 T cells from control patients, cirrhotic patients without and with Minimal Hepatic Encephalopathy (MHE). MHE is a neuropsychiatric syndrome, mediated by peripheral inflammation, that may affect cirrhotic patients. After the recovery of 498-1,114 distinct TCR beta chains per patient, repertoire analysis of patients resulted in few public clones, high diversity and elevated within-repertoire sequence similarity, independently of immune status. Additionally, TCRs associated with celiac disease and inflammatory bowel disease were significantly overrepresented in MHE patient repertoires. The provided computational pipeline functions as a resource to facilitate TCR profiling from RNA-seq data boosting immunophenotype analyses of immunological diseases.

1. Introduction

T-cell receptors (TCRs) are able to recognize an immense variety of processed antigens. The approximate diversity of unique TCRs in a human individual is ∼10⁸–10¹⁰ [1–3]. A T-cell clonotype is a set of cells that share the same TCR, and the set of unique T-cell clonotypes in an individual is called a TCR repertoire.

The TCR is a two-chain protein, and most human T cells carry α/β chains (TRA and TRB), with a small proportion carrying γ/δ chains (TRG and TRD). TCR genes are formed by a process called V(D)J recombination, which consists of the rearrangement of the variable (V), diversity (D), joining (J), and constant (C) gene segments. Three complementarity-determining regions (CDRs) are important for recognizing antigens, with CDR3β (the CDR3 region of the TRB chain) being the preferred target of many TCR repertoire studies due to its high diversity and primary importance for antigen binding [4].

High-throughput sequencing (HTS) is a powerful tool for the analysis of these highly diverse immune repertoires, contributing to lymphocyte biology research, antibody engineering, and vaccination [5,6]. For capturing TCRs of α/β T cells, most HTS immune repertoire studies (also called Adaptive Immune Receptor Repertoire sequencing, AIRR-seq [7]) apply specific library preparation methods targeting receptor transcripts [8,9]. Available HTS protocols for immune repertoire profiling include multiplex PCR, with guided primers for the amplification of TCR transcripts at the CDR3 region; target enrichment, which captures the sequences of interest using complementary RNA baits; and the more standard 5′ RACE (rapid amplification of 5′ complementary DNA ends), which is able to retrieve the complete 5′ end of the mRNA [6]. However, immune repertoires can also be efficiently extracted from transcriptome RNA-seq data [10–12], as TCR transcripts are part of the bulk-sequenced transcriptome. Using RNA-seq for immune receptor analysis reduces costs and sample amount, since both gene expression and immune receptor transcripts are measured in the same experiment, at the expense of lower sensitivity, as only the most abundant and expanded clones can be recovered. Different computational methods are able to reconstruct TCRs from RNA-seq data, such as MiXCR [10], CATT [13], or TRUST4 [12]. To a certain extent, these repertoire reconstruction algorithms have been evaluated and compared against each other in terms of the sensitivity and specificity of TCR extraction. For instance, MiXCR was reported to extract high-frequency clonotypes better than the first version of TRUST at every tested read length, and most of the MiXCR-reported clonotypes were confirmed by control TCR-seq data [10]. Additional methods for immune receptor reconstruction have been emerging, such as BASIC [14], BraCeR [15], and BALDR [16], but they reconstruct only B-cell receptors (BCRs) from single-cell RNA-seq data and were therefore not applicable to our study. Despite the available collection of TCR repertoire reconstruction tools, a detailed step-by-step pipeline for the processing of immune repertoire data from standard bulk RNA-seq data is not readily available, representing a missed opportunity for complementing gene expression studies of immunological conditions with the analysis of the immune repertoire.

Here, we present an end-to-end pipeline for the analysis of TCR repertoire profiles from bulk RNA-seq, implemented in Nextflow [17] for easy distribution and robust utilization. Nextflow is a workflow management system that provides native support for running pipelines in multiple compute environments and with containerization systems. AIRR-seq and immune-related Nextflow pipelines have become very popular [18–20], mainly because analyses are simple to run while common issues of shell scripting (e.g., required dependencies, computational resources, code failure tracking, cumbersome transfer between collaborators) are managed transparently. We apply our Nextflow pipeline to a dataset of CD4 T cells isolated from control patients and from cirrhotic patients without and with Minimal Hepatic Encephalopathy (MHE). MHE is a neuropsychiatric syndrome affecting about 40% of cirrhotic patients, who show attention deficits, mild cognitive impairment, psychomotor slowing, and impaired visuomotor coordination [21]. The main hypothesis of MHE etiology is that peripheral inflammation together with hyperammonemia leads to neuroinflammation, which alters neurotransmission and produces cognitive/motor impairment [22]. Specific alterations in the immunophenotype of cirrhotic patients with MHE pointed to CD4 T cells as key factors in the immune shift that triggers the appearance of MHE [23]. The study of CD4 T-cell repertoires might therefore help understand the immune status of MHE patients.

2. Material and Methods

2.1. Patient recruitment

Three groups of patients (healthy controls, cirrhotic patients without MHE, and cirrhotic patients with MHE) were recruited from the outpatient clinics of Hospital Clinico and Hospital Arnau de Vilanova (Valencia, Spain). The diagnosis of cirrhosis was based on clinical, biochemical, and ultrasonographic data. Cognitive function was evaluated by the Psychometric Hepatic Encephalopathy Score (PHES), a set of five psychometric tests used as the gold standard for MHE diagnosis. Patients were classified as having MHE when the score was ≤ −4 points [24]. All participants were enrolled after signing a written informed consent form. Study protocols were approved by the Scientific and Research Ethics Committees of Hospital Clinico Universitario and Arnau de Vilanova Hospital of Valencia, Spain, and were in accordance with the ethical guidelines of the Helsinki Declaration.

Blood samples were collected in BD Vacutainer® (Becton, Dickinson and Company, Franklin Lakes, NJ, USA) tubes with EDTA. Peripheral blood mononuclear cells (PBMCs) were isolated by centrifugation over a density gradient medium (Lymphoprep™, Palex Medical, SA) according to the manufacturer's instructions, and CD4 T cells were purified from 5 × 10⁶ PBMCs by immunomagnetic negative selection using the EasySep™ Human CD4 T Cell Isolation Kit (STEMCELL Technologies Inc.).

2.2. RNA sequencing experimental design

Whole RNA from CD4 T cells was isolated using the miRNeasy Micro Kit (QIAGEN) following the manufacturer's instructions and sequenced on an Illumina HiSeq2500 machine using HiSeq Sequencing v4 Chemistry. Ultra-low input RNA library preparation with polyA selection and strand specificity was used for RNA-seq. Paired-end sequencing with 125 bp reads and a sequencing depth of 50 million reads was selected.

2.3. Read trimming and filtering

Reads were trimmed using Trimmomatic v0.38 [25] when the average Phred quality score was below 20 in a sliding window of 20 bp and removed if the resulting read length was less than 80 bp. These parameters were selected as optimal after a comparative analysis of different sliding window values (from 4 to 20 bp).
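The trimming rule described above can be illustrated with a minimal Python sketch. It is a simplified stand-in for the sliding-window idea, not Trimmomatic's implementation (whose handling of the exact cut position differs in detail); the function names and the treatment of reads shorter than one window are assumptions made for illustration.

```python
def sliding_window_trim(quals, window=20, min_mean_q=20):
    """Return how many leading bases to keep for one read.

    quals: list of per-base Phred scores (5' to 3').
    The read is cut at the start of the first window whose mean
    quality falls below min_mean_q (simplified rule, assumption).
    """
    if len(quals) < window:
        # Assumption: a read shorter than one window is judged as a whole.
        if quals and sum(quals) / len(quals) >= min_mean_q:
            return len(quals)
        return 0
    for start in range(len(quals) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < min_mean_q:
            return start
    return len(quals)


def trimmed_length_or_none(quals, window=20, min_mean_q=20, min_len=80):
    """Apply the trim, then drop the read (None) if it is shorter than min_len."""
    kept = sliding_window_trim(quals, window, min_mean_q)
    return kept if kept >= min_len else None


# A 125 bp read whose quality collapses after base 100 is trimmed but kept;
# one that collapses after base 60 ends up shorter than 80 bp and is removed.
good_then_bad = [35] * 100 + [5] * 25
short_good = [35] * 60 + [5] * 65
print(trimmed_length_or_none(good_then_bad))
print(trimmed_length_or_none(short_good))
```

With a 20 bp window, a few low-quality bases at the window's edge do not trigger a cut on their own; the cut happens only once the window mean drops below 20.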

2.4. Repertoire reconstruction

MiXCR v3.0.13 [26] was used to align and assemble the TCR repertoire from RNA-seq data using the “analyze shotgun” command with default parameters. Clones (i.e., CDR3 amino acid sequences of the TRB chain) were included in the analysis if they had a minimum abundance of 2 reads and their CDR3β was at least 4 amino acids long, as described previously [27].
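As an illustration of the inclusion criteria, the following Python sketch filters a clonotype table by read count and CDR3β length. The record layout (keys `cdr3_aa` and `reads`) is hypothetical and does not reproduce the column names of the MiXCR output.

```python
def filter_clones(clones, min_reads=2, min_cdr3_len=4):
    """Keep clones with at least min_reads reads and a CDR3 of at least
    min_cdr3_len amino acids.

    clones: list of dicts with hypothetical keys 'cdr3_aa' and 'reads'.
    """
    return [
        c for c in clones
        if c["reads"] >= min_reads and len(c["cdr3_aa"]) >= min_cdr3_len
    ]


example = [
    {"cdr3_aa": "CASSLGQGYEQYF", "reads": 12},  # kept
    {"cdr3_aa": "CASSPDRGNTEAFF", "reads": 1},  # dropped: single read
    {"cdr3_aa": "CAF", "reads": 5},             # dropped: CDR3 too short
]
print([c["cdr3_aa"] for c in filter_clones(example)])
```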

2.5. Hill-based evenness profiles

Common diversity indices are Hill numbers, defined as:

$${}^{\alpha}D(f) = \left( \sum_{i=1}^{n} f_i^{\alpha} \right)^{\frac{1}{1-\alpha}} \tag{1}$$

where $n$ is the number of unique clones in a repertoire, $f$ is the frequency distribution (proportional abundance of clones, with $f_i$ the frequency of clone $i$) and $\alpha$ is a scale parameter in (0,1) and (1, +∞).

An α-Diversity profile (${}^{\alpha}D$) was previously defined [28] as:

$${}^{\alpha}D = {}^{\alpha}E \times SR \tag{2}$$

where ${}^{\alpha}E$ is the evenness and $SR$ is the species richness or the number of unique clones in a repertoire dataset.

Here, we calculated Evenness profiles (${}^{\alpha}E$), defined as ${}^{\alpha}D / SR$ according to Equation 2. We used different values of α, ranging from 0 to 10 with a step size of 0.2, to obtain the Diversity profiles (${}^{\alpha}D$) [28]. Diversity is not defined for the case α = 1, but by L'Hôpital's rule, as α tends to 1, diversity tends to the exponential of the Shannon entropy:

$$\lim_{\alpha \to 1} {}^{\alpha}D(f) = \exp\left( -\sum_{i=1}^{n} f_i \ln f_i \right) \tag{3}$$

All pairwise Pearson correlation coefficients of the Evenness profiles were calculated between samples. Hierarchical clustering was performed based on Euclidean distance for correlations and heatmaps were generated for visualization.
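The evenness profile calculation can be sketched in Python as follows, assuming the standard Hill number form for Equation 1 and the Shannon limit of Equation 3 at α = 1. This is an illustrative re-implementation, not the code of the pipeline; function names are hypothetical.

```python
import math


def hill_diversity(freqs, alpha):
    """Hill number of order alpha for clone frequencies that sum to 1."""
    if abs(alpha - 1.0) < 1e-9:
        # Limit alpha -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(f * math.log(f) for f in freqs))
    return sum(f ** alpha for f in freqs) ** (1.0 / (1.0 - alpha))


def evenness_profile(counts, alphas=None):
    """Evenness (diversity / species richness) for each alpha."""
    if alphas is None:
        # alpha = 0.0, 0.2, ..., 10.0 (51 values)
        alphas = [round(0.2 * k, 1) for k in range(51)]
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    richness = len(freqs)
    return [hill_diversity(freqs, a) / richness for a in alphas]


def pearson(x, y):
    """Pearson correlation of two equally long profiles (NaN if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


even = evenness_profile([10, 10, 10, 10])   # stays near 1 for every alpha
skewed = evenness_profile([97, 1, 1, 1])    # drops as alpha grows
print(round(even[-1], 3), round(skewed[-1], 3))
```

A perfectly even repertoire keeps an evenness close to 1 across the whole profile, whereas a repertoire dominated by one clone starts at 1 for α = 0 and decreases as α gives more weight to abundant clones.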

2.6. Shannon Evenness

Shannon Evenness (S-E) is defined as the exponential of the Shannon entropy divided by the Species Richness (SR), i.e., the evenness at α = 1. S-E is 1 if all clones in a repertoire have the same frequency (an “even” repertoire), and it approaches 0 if very few clones dominate the repertoire (a “polarized” repertoire).

2.7. Jaccard similarity

Pairwise clonal convergence between two repertoires A and B was quantified using the Jaccard similarity coefficient, defined as the size of the intersection of A and B divided by the size of the union of A and B:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{4}$$

Values range between 0 and 1, where 1 means complete overlap of repertoire A and B, and 0 indicates no overlapping receptor sequences between repertoires A and B.
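A minimal Python sketch of the Jaccard coefficient between two repertoires, each treated as a set of CDR3β amino acid sequences, is shown below. The function name and the convention of returning 0 for two empty repertoires are assumptions made for illustration.

```python
def jaccard(rep_a, rep_b):
    """Jaccard similarity between two collections of CDR3 sequences."""
    a, b = set(rep_a), set(rep_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty repertoires share nothing
    return len(a & b) / len(union)


rep_a = ["CASSL", "CASSP", "CASSQ"]
rep_b = ["CASSL", "CASSR"]
print(jaccard(rep_a, rep_b))  # one shared clone out of four distinct clones
```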

2.8. K-mer-based TCR analysis

Overlapping k-mers of length 3 (k = 3) were extracted from the amino acid CDR3β sequences [27] in each TCR repertoire and condensed into a k-mer frequency distribution matrix using the immunarch R package [29]. Hierarchical clustering and heatmap visualization were performed as described above in Section 2.5.
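The k-mer decomposition can be sketched in plain Python as follows. This is an illustrative re-implementation and does not reproduce the immunarch code; the function names and the choice to weight every CDR3β sequence equally (rather than by clone abundance) are assumptions.

```python
from collections import Counter


def kmer_counts(cdr3_seqs, k=3):
    """Count overlapping amino acid k-mers across a list of CDR3 sequences."""
    counts = Counter()
    for seq in cdr3_seqs:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_frequency_matrix(repertoires, k=3):
    """Build a k-mer by sample frequency matrix.

    repertoires: dict mapping sample name -> list of CDR3 sequences.
    Returns (sorted k-mer list, dict sample -> list of frequencies).
    """
    per_sample = {s: kmer_counts(seqs, k) for s, seqs in repertoires.items()}
    kmers = sorted(set().union(*per_sample.values()))
    matrix = {}
    for sample, counts in per_sample.items():
        total = sum(counts.values())
        matrix[sample] = [counts[km] / total if total else 0.0 for km in kmers]
    return kmers, matrix


kmers, matrix = kmer_frequency_matrix({
    "sample1": ["CASSL", "CASSP"],
    "sample2": ["CASSL"],
})
print(kmers)
print(matrix["sample1"])
```

Each column of the resulting matrix is one sample's k-mer frequency distribution, which can then be compared between samples, for example by pairwise Pearson correlation.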

2.9. TCR sequence similarity architecture

TRB repertoire similarity networks were generated as previously described [27,30,31], where nodes represent amino acid CDR3β sequences and edges were drawn between sequences differing by 1 amino acid (Levenshtein distance = 1). The degree (number of links per node) distribution of each repertoire was calculated using the degree function from the R package igraph [32].
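The network construction can be illustrated with a small Python sketch that links sequences at Levenshtein distance 1 (one substitution, insertion, or deletion) and reports node degrees. It uses a simple all-pairs comparison, which is adequate for repertoires of around a thousand clones; it is not the igraph-based implementation, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def is_one_edit_apart(a, b):
    """True if a and b differ by exactly one substitution, insertion or deletion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if len(a) > len(b):
        a, b = b, a  # make a the shorter sequence
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    # Skip the extra residue in b; the remainders must match exactly.
    return a[i:] == b[i + 1:]


def node_degrees(cdr3_seqs):
    """Degree of every unique CDR3 sequence in the similarity network."""
    nodes = sorted(set(cdr3_seqs))
    degree = dict.fromkeys(nodes, 0)
    for a, b in combinations(nodes, 2):
        if is_one_edit_apart(a, b):
            degree[a] += 1
            degree[b] += 1
    return degree


degrees = node_degrees(["CASSL", "CASSP", "CASSLG", "CAWQQ"])
print(degrees)
print(Counter(degrees.values()))  # degree distribution: degree -> number of nodes
```

Nodes with degree 0 correspond to clones that have no neighbor within one amino acid change in the same repertoire.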

2.10. Graphics generation

Statistical analysis and graphics were performed using the programming environment R v4.0.5 [33]. The matrix of public clones for the repertoire overlap analysis was generated using the immunarch R package [29] with the repOverlap() and vis() functions. All heatmaps were created using the aheatmap() function of the NMF R package [34]. Mean sequencing quality plots for paired-end reads were obtained from the FastQC [35] report, and barplots summarizing MiXCR output were drawn using the ggplot2 R package [36]. The ggpubr R package [37] functions ggboxplot() and ggscatter() were used for clone statistics and correlations, respectively.

2.11. Antigen/Disease-specific TCR databases

McPAS-TCR is a curated database of TCR sequences linked to the associated antigen target or pathology based on published literature [38]. The database was downloaded and filtered by pathological category, retaining human TRB sequences associated with autoimmunity and pathogens. VDJdb is a curated repository of antigen-specific TCR sequences utilizing experimental information from recently published TCR specificity assays [39]. At the time of download, the McPAS-TCR and VDJdb databases had last been updated on 6 March 2021 and 2 February 2021, respectively.

2.12. Fisher's exact test analysis

The overrepresentation of clones associated with diseases or antigens (McPAS-TCR and VDJdb) in our CDR3β sequences was evaluated using a one-tailed Fisher's exact test applied to each group of patients (control, with MHE, without MHE), using the disease categories included in the McPAS-TCR and VDJdb databases. TCRs present in the samples but absent from the McPAS-TCR and VDJdb databases were excluded from the analysis, as described previously [30]. The test assessed whether the receptors associated with a specific disease in the database were overrepresented among the measured receptors of the sample. The obtained p-values were adjusted for multiple testing using Benjamini-Hochberg FDR correction, considering both the number of diseases and the number of sample groups tested.
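The statistical procedure can be sketched with the Python standard library as a one-tailed Fisher's exact test on a 2×2 table followed by Benjamini-Hochberg adjustment. The layout of the contingency table used here (rows: in sample / not in sample; columns: associated with the disease / associated with other categories) is an assumption made for illustration and is not taken from the pipeline code.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-tailed (overrepresentation) Fisher's exact test for the table
    [[a, b], [c, d]]: probability of observing a or more in the top-left cell
    under the hypergeometric null with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x)
               for x in range(a, upper + 1))
    return tail / comb(n, col1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: 3 disease-associated clones among 4 sample clones,
# 1 disease-associated clone among the 4 remaining database clones.
p = fisher_exact_greater(3, 1, 1, 3)
print(round(p, 4))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In the setting described above, one p-value would be computed per disease category and patient group, and all of them would be passed together to the adjustment step.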

2.13. Nextflow pipeline

Nextflow v21.10.6 was used to implement the pipeline. In addition, the DSL2 syntax extension was enabled at the beginning of the workflow script to allow the definition of module libraries and simplify the writing of the data analysis pipeline.

2.14. Data and code availability

The transcriptomic dataset used in this study is available in the GEO database repository under accession GSE184200, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE184200. The code and complete documentation of the Nextflow pipeline are publicly available from the GitHub repository: https://github.com/ConesaLab/TCR_nextflow.

3. Results and Discussion

3.1. Step-by-step analysis overview

The step-by-step pipeline for the processing of immune repertoire data from whole transcriptome RNA-seq reads is summarized in Fig. 1. The pipeline consists of four main steps representing both the experimental procedure and the computational analysis steps: (1) Experimental design, (2) Transcriptome profiling, (3) AIRR (Adaptive Immune Receptor Repertoire) pre-processing, and (4) AIRR analysis. The experimental procedure includes sample collection, immune cell isolation, RNA extraction, sequencing design (e.g., library strand specificity, paired- or single-end reads), and RNA sequencing (RNA-seq). The computational analysis starts with ‘fastq’ files in which the sequencing quality (step 1) needs to be verified. The MiXCR software [26] was used to assemble TCR repertoires from sequencing reads after quality control. We chose this well-established repertoire reconstruction tool because, given sufficient sequencing read length, it extracts high-frequency clonotypes from RNA-seq data with a yield comparable to that of similar tools (TRUST3, TRUST4) [12]. The MiXCR assembly algorithm avoids the introduction of false-positive clones, which might appear either through alignment of reads to non-target molecules or through overlap between two sequences from different clones during the reconstruction of partially covered CDR3 regions. The Nextflow pipeline starts with the MiXCR repertoire extraction step (Fig. 1), which was performed using the MiXCR “analyze shotgun” command. This command runs the following workflow: sequence alignment against reference V, D, J, and C genes (step 2), followed by clustering of identical sequences into clonotypes (by default, clustering by CDR3β) (step 3) and correction of PCR/sequencing errors (step 4), to output a tab-delimited file containing the quantification as a clonotype matrix (step 5).
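As an illustration of how the tab-delimited clonotype file from step 5 could be turned into a per-sample CDR3β count table, the following Python sketch parses a small in-memory example. The column names (`cloneCount`, `aaSeqCDR3`, `allVHitsWithScore`) are assumed to match the MiXCR export being used and should be checked against the actual file header; the filtering of CDR3s containing stop codons or frameshift markers is likewise an assumption of this sketch, not a documented step of the pipeline.

```python
import csv
import io
from collections import Counter


def load_trb_clonotypes(handle):
    """Collapse a MiXCR-style tab-delimited clonotype table into CDR3-beta counts."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        # Keep clones whose best V hit is a TRB gene, e.g. "TRBV7-9*00(302.1)".
        if not row["allVHitsWithScore"].startswith("TRBV"):
            continue
        cdr3 = row["aaSeqCDR3"]
        # Assumption: drop CDR3s marked with a stop codon (*) or frameshift (_).
        if "*" in cdr3 or "_" in cdr3:
            continue
        # Counts may be written as "12" or "12.0" depending on the export.
        counts[cdr3] += int(float(row["cloneCount"]))
    return counts


# Hypothetical four-clone table standing in for a real export file.
demo = (
    "cloneCount\taaSeqCDR3\tallVHitsWithScore\n"
    "12\tCASSLGQGYEQYF\tTRBV7-9*00(302.1)\n"
    "5\tCAVRDSNYQLIW\tTRAV1-2*00(280.0)\n"
    "3\tCASSLGQGYEQYF\tTRBV7-8*00(295.4)\n"
    "2\tCASS*GTEAFF\tTRBV5-1*00(270.3)\n"
)
repertoire = load_trb_clonotypes(io.StringIO(demo))
```

In this sketch, clones sharing the same CDR3β amino acid sequence are merged regardless of V gene, which mirrors the definition of a clone as a CDR3β amino acid sequence used in the analyses below.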

Fig. 1

Pipeline overview. Experimental steps comprise sample collection, cell isolation, RNA extraction, and sequencing. The computational analysis implemented in Nextflow comprises repertoire quantification and screening of repertoire properties. AIRR: Adaptive Immune Receptor Repertoire.

Additional AIRR analysis steps are needed to study different immune receptor features: overlap for clonal convergence (step 6a), diversity indexes for clonal expansion (step 6b), network analysis for clonal sequence architecture (step 6c), k-mer distribution for repertoire sequence similarity (step 6d) and public databases screening for antigen specificity (step 6e). Steps 1–6 were implemented in the Nextflow pipeline as parallel processes that receive MiXCR files as input and provide ready-to-publish plots and tables as well as a final report summarising all results for better user interpretability.

3.2. T-cell receptor sequences can be recovered from RNA-seq data (steps 1–5)

CD4 T cells were isolated and sequenced by bulk paired-end RNA-seq from a total of 20 patients (8 control, 6 cirrhotic without MHE, and 6 cirrhotic with MHE). Sequencing read pre-processing included trimming and filtering (see Methods), which resulted in good quality scores (mean q > 30) for all samples (Supplementary Fig. 1A).

Read alignment against the reference VDJ genes (IMGT database [40]) showed that 0.05–0.1% of reads were successfully aligned. A large share of the successfully aligned reads matched TCR regions (24.1–67.6%), although reads also aligned with immunoglobulin (IG) chains (27.1–75.0% of successfully aligned reads), indicating contamination during T-cell isolation (Supplementary Fig. 1B). A high proportion (52.5–87.7%) of the recovered clones, i.e., CDR3 amino acid sequences, matched TRA and TRB chains. Bulk RNA-seq data cannot determine the pairing of specific α/β receptor chains within the population of T cells, which can only be achieved by sequencing single T cells. Therefore, we decided to focus on the TRB chain for all subsequent analyses. TRB is more appropriate than TRA for identifying T-cell clones because around 7–30% of T cells may express two different alpha chains in the same clone [41], while only 1% of T cells may express two different beta chains in the same clone [42].

We compared the three groups of patients using the Kruskal-Wallis test for various repertoire statistics: number of isolated cells, RNA quantity, number of reads obtained, number of recovered clones (i.e., CDR3β amino acid sequences), and their Shannon evenness (Supplementary Fig. 2A). The clone recovery yield ranged from 498 to 1,114 distinct TRB CDR3 amino acid sequences per individual. None of these measurements differed significantly between groups of patients (Kruskal-Wallis test, p-value > 0.05) except for the number of clones, which was significantly higher in cirrhotic patients without MHE than in controls (post hoc Wilcoxon test, p-value = 0.024).

To determine whether sequencing depth sufficiently covered the clonal repertoire of the samples, we calculated pairwise correlations as previously described [27] between the cell number, the number of clones, and the Shannon evenness across all samples (Supplementary Fig. 2B). When sequencing depth is saturating with respect to clone detection, the number of clones depends solely on the sample type and not on the number of reads. We found a positive correlation (Pearson coefficient = 0.77, p-value = 6.5 × 10⁻⁵), which may indicate insufficient sequencing depth. Nevertheless, the number of distinct CDR3 sequences assembled was of similar magnitude to that reported in other studies of TCR reconstruction from bulk RNA-seq data: 367–936 TRB clonotypes extracted from the central nervous system and 1,684–2,977 TRB clonotypes extracted from the spleen using paired-end data from isolated CD4 T cells [10].

3.3. TCR sequences profiling in MHE

3.3.1. Low clonal convergence among patient samples (step 6a)

Repertoire overlap analysis is the most common approach to uncover clonotypes shared between individuals, also referred to as “public” clones [43, 44, 45, 46]. Using the Jaccard similarity (see Methods), we found low clonal convergence (average Jaccard measure of 0.00027 ± 0.00041) between healthy and cirrhotic patients with or without MHE (Fig. 2A).
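The Jaccard similarity between two repertoires is the number of shared CDR3β sequences divided by the number of distinct sequences in their union. The sketch below shows one way to compute the pairwise matrix; the sample identifiers and sequences are hypothetical, and treating each repertoire as a set of amino acid sequences (ignoring clone abundance) is an assumption of this example.

```python
from itertools import combinations


def jaccard(rep_a, rep_b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of CDR3 sequences."""
    a, b = set(rep_a), set(rep_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(repertoires):
    """Return {(sample_1, sample_2): jaccard} for every unordered pair of samples."""
    return {
        (s1, s2): jaccard(repertoires[s1], repertoires[s2])
        for s1, s2 in combinations(sorted(repertoires), 2)
    }


# Hypothetical repertoires keyed by sample identifier.
repertoires = {
    "sample_1": {"CASSLGQGYEQYF", "CASSPDRGNTEAFF", "CASSQETQYF"},
    "sample_2": {"CASSLGQGYEQYF", "CASSLAGGTDTQYF"},
    "sample_3": {"CASSYSTDTQYF"},
}
overlap = pairwise_jaccard(repertoires)
```

Values close to zero, as reported above, mean that almost no clonotypes are shared between any two individuals.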

Fig. 2

Different characteristics of TCR (beta chain) repertoire analysis showed specific within-sample profiles independent of immune status. A) Overlap analysis of CDR3β sequences to uncover clonal convergence or shared clonotypes between individuals. Jaccard similarity coefficient matrix indicates a high (red) or low (blue) match between two sample sets. B) Diversity profile analysis to measure clonal expansion. The heatmap contains high (light yellow) or low (dark purple) pairwise Pearson correlation coefficients of the Evenness profiles calculated for control (blue), cirrhotic without MHE (red) or cirrhotic with MHE (black) patients. C) K-mer analysis to examine repertoire similarity. The heatmap represents the highest (light yellow) to the lowest (dark purple) pairwise Pearson correlation coefficients of the 3-mers (amino acid k-mers of length 3) frequency distribution matrix. D) Network analysis to study clonal architecture. Colors from light yellow to dark purple represent the number of nodes with degree 0–5 of each repertoire. Most of the nodes are single (i.e. degree=0). E) Network single node (degree=0) distribution between patient groups to test if the number of degree=0 nodes in cirrhotic patients without MHE (red) were significantly higher than control (blue) or cirrhotic with MHE (black) patients. The test was performed after removing PC134 (outlier in the withMHE group).

3.3.2. High clonal expansion in all samples independently of immune status (step 6b)

The expansion of individual T-cell clones that bind their matching antigen can be analyzed using Hill-based evenness profiles, a diversity measurement derived from ecology (see Methods). Unlike single diversity indices, which can produce different clonal expansion results, diversity profiles capture the entire immune repertoire and reflect immunological statuses more sensitively [28]. In this work, the diversity profiles of CD4 T-cell clones were positively correlated between samples (Pearson coefficient from 0.59 to 1), regardless of the cognitive impairment or cirrhosis condition of the studied patients (Fig. 2B).
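A Hill-based evenness profile can be sketched as follows. The Hill diversity of order α is computed from the clonal frequencies and divided by the number of distinct clones, and the resulting profiles are compared between samples with the Pearson correlation. The range of α used here (0 to 10 in steps of 0.2) and the toy clone counts are assumptions for illustration; the exact settings of the pipeline are described in the Methods.

```python
from math import exp, log, sqrt


def hill_diversity(freqs, alpha):
    """Hill diversity of order alpha for a list of clonal frequencies summing to 1."""
    if abs(alpha - 1.0) < 1e-9:
        # Limit alpha -> 1: exponential of the Shannon entropy.
        return exp(-sum(p * log(p) for p in freqs))
    return sum(p ** alpha for p in freqs) ** (1.0 / (1.0 - alpha))


def evenness_profile(counts, alphas):
    """Evenness = Hill diversity / richness, evaluated for each alpha."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    richness = len(freqs)
    return [hill_diversity(freqs, a) / richness for a in alphas]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


alphas = [i * 0.2 for i in range(51)]  # assumed range: 0, 0.2, ..., 10
profile_a = evenness_profile([50, 30, 10, 5, 5], alphas)  # hypothetical clone counts
profile_b = evenness_profile([40, 40, 10, 5, 5], alphas)
similarity = pearson(profile_a, profile_b)
```

A perfectly even repertoire has evenness 1 at every α, while expanded clones pull the profile down as α increases, which is why the whole curve is more informative than any single index.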

3.3.3. Increased within-repertoire similarity based on repertoire architecture is unconstrained by immune status (steps 6c-d)

The adaptive immune response is determined by immune receptor sequences: the higher their dissimilarity, the wider the range of antigens they are able to recognize. The all-to-all sequence similarity within a repertoire represents the repertoire architecture, which was measured in our patients using both k-mers and network analysis.

First, the number of overlapping 3-mers (k-mers of length 3 amino acids) was calculated per patient, as previously described [27]. A large positive Pearson correlation, ranging from 0.96 to 1, was obtained between k-mer vectors (Fig. 2C). This result might indicate that patients share similar sequence architecture patterns independently of their immune status.
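The k-mer comparison can be sketched as follows: each repertoire is decomposed into overlapping amino acid 3-mers, the 3-mer frequencies are arranged in a common order, and the resulting vectors are compared with the Pearson correlation. Counting every distinct CDR3β once (rather than weighting by clone abundance) and the example sequences are assumptions of this sketch.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(cdr3s, k=3):
    """Relative frequency of each overlapping k-mer across a list of CDR3 sequences."""
    counts = Counter()
    for seq in cdr3s:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def aligned_vectors(freq_a, freq_b):
    """Put two k-mer frequency dictionaries on the same k-mer axis (missing = 0)."""
    kmers = sorted(set(freq_a) | set(freq_b))
    return [freq_a.get(k, 0.0) for k in kmers], [freq_b.get(k, 0.0) for k in kmers]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical repertoires of two patients.
rep_a = ["CASSLGQGYEQYF", "CASSPDRGNTEAFF"]
rep_b = ["CASSLGQGNEQYF", "CASSQDRGYTEAFF"]
vec_a, vec_b = aligned_vectors(kmer_frequencies(rep_a), kmer_frequencies(rep_b))
kmer_similarity = pearson(vec_a, vec_b)
```

Because CDR3β sequences share conserved start and end motifs (such as CAS and QYF in these examples), 3-mer vectors tend to correlate strongly between repertoires, which should be kept in mind when interpreting coefficients close to 1.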

To complete the repertoire architecture analysis, a sequence similarity network was constructed using CDR3β amino acid sequences as nodes and adding an edge between sequences that differed by one amino acid (Levenshtein distance = 1). Then, the number of similar clones per node (network degree) was calculated and represented as a heatmap (Fig. 2D). Overall, 96.8% of the clones had degree = 0 (single nodes) in all the samples, and the maximum degree obtained was 5 in one cirrhotic patient without MHE (PC149). This substantial proportion of clones with degree zero was also found in CD4 cells from patients with Multiple Sclerosis, while CD8 cells presented a more homogeneous degree distribution [27]. Moreover, the majority of cirrhotic patients without MHE showed a significantly higher (Wilcoxon test, p-value = 0.013) number of single nodes (Fig. 2E), which may be related to underlying differences in the number of TRB clones between the three groups (Supplementary Fig. 2A). Notably, sample PC134 (cirrhotic patient without MHE) presented 1,064 single nodes, the highest number among all patients (472–935 single nodes in the rest). This could be explained by this sample also having the highest number of recovered clones (1,114 clones), and we considered it an outlier in this analysis (Fig. 2E).
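The network construction can be sketched as follows: two CDR3β sequences are connected when they are exactly one substitution, insertion or deletion apart, and the degree of a node is the number of its neighbours. The all-against-all comparison used here is a simple illustrative choice that is adequate for repertoires of about a thousand clones; the example sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def is_one_edit_apart(a, b):
    """True if the Levenshtein distance between a and b is exactly 1."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make `a` the shorter (or equal-length) sequence
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # single substitution at position i
    return a[i:] == b[i + 1:]  # single extra residue in b at position i


def node_degrees(cdr3s):
    """Degree of every distinct CDR3 in the distance-1 similarity network."""
    nodes = sorted(set(cdr3s))
    degree = {seq: 0 for seq in nodes}
    for a, b in combinations(nodes, 2):
        if is_one_edit_apart(a, b):
            degree[a] += 1
            degree[b] += 1
    return degree


# Hypothetical repertoire: three related clones and one unrelated clone.
clones = ["CASSLGQGYEQYF", "CASSLGQGYEQFF", "CASSLGQGYEQY", "CASSPDRGNTEAFF"]
degrees = node_degrees(clones)
degree_distribution = Counter(degrees.values())  # number of nodes per degree
single_nodes = degree_distribution[0]
```

The degree distribution per sample corresponds to the values summarized in the heatmap, and the count of degree-0 nodes is the quantity compared between patient groups.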

3.4. TCRs with known disease association are overrepresented in MHE CDR3 repertoires (step 6e)

To evaluate if the clones in our patients were significantly associated with previously described diseases or antigens, we assessed the overlap (see Methods) between our CDR3β sequences and two different TCR databases: VDJdb and McPAS-TCR.

VDJdb contained a total of 41,169 human TRB sequences, with Cytomegalovirus being the epitope species with the highest number of sequences, comprising nearly half of them. McPAS-TCR contained a total of 30,052 TRB sequences within the autoimmune and pathology categories, with over half of them belonging to Mycobacterium tuberculosis. The sequence intersection between the two databases was low, ranging from 2 to 1,032 shared sequences for the common pathologies/pathogens: Influenza A, Cytomegalovirus, Epstein-Barr Virus, Human Immunodeficiency Virus, Yellow fever virus, Human T cell Lymphotropic Virus, Hepatitis C virus, Mycobacterium tuberculosis, Herpes Simplex Virus 2 and Covid-19, sorted in decreasing order of shared sequences (Supplementary Table 1).

CDR3β sequences were grouped by type of patient (control, cirrhotic without MHE and cirrhotic with MHE) to test overlap with the McPAS-TCR database (Table 1 and Supplementary Table 2). We found that inflammatory bowel disease (IBD) was significantly overrepresented in cirrhotic patients without MHE (p-value = 3.082 × 10⁻⁵) and with MHE (p-value = 5.470 × 10⁻⁴). Traditionally, IBD has been associated with Th1-mediated inflammation [47, 48], but more recent discoveries have shown the involvement of Th17 cells, which contribute to inflammation through the secretion of proinflammatory cytokines such as IL-17 and IL-21 [49, 50, 51]. Previous studies have also shown alterations in Th1, Th17, IL-17 and IL-21 in patients with MHE [23, 52], which may constitute a plausible link between these two disorders. Celiac disease was also significant in the control group (p-value = 1.127 × 10⁻²) and the cirrhotic patients without MHE group (p-value = 4.836 × 10⁻⁴). There was no significant overlap (p-value < 0.05) between our dataset and VDJdb-associated CDR3βs (Supplementary Table 3). While overlap studies of bulk sequencing data with antigen-specific data are interesting, the results remain challenging to interpret, as it is unclear why specific antigens are enriched. The polyreactivity of the TCR repertoire might be a reason. Specifically, each TCR maps to several antigen specificities; in other words, there is no one-to-one TCR-antigen map, only a many-to-many one. We have observed a similar association of bulk sequencing data with seemingly unrelated antigens in two recent publications of ours [27, 30].

Table 1. Top p-values of the overrepresentation CDR3 sequence analysis for the diseases collected in the McPAS-TCR database [38].

McPAS-TCR
Disease association                 control    withoutMHE   withMHE
Celiac disease                      0.011*     0.214        <0.001**
Cytomegalovirus (CMV)               1          1            0.723
Diabetes Type 1                     1          1            0.510
Epstein Barr virus (EBV)            1          0.510        1
HTLV-1                              1          0.877        0.823
Inflammatory bowel disease (IBD)    1          <0.001**     0.001**
Influenza                           0.510      0.510        0.066
Narcolepsy                          0.723      1            1
Psoriatic Arthritis                 0.064      1            1
Ulcerative Colitis                  1          0.066        1
Yellow fever virus                  1          0.723        1

P-values were adjusted for multiple testing using Benjamini-Hochberg FDR correction considering both the number of diseases and the number of sample groups. HTLV-1: Human T cell Leukaemia virus 1; *p-value < 0.05; **p-value < 0.01.

4. Conclusion

We have presented a step-by-step computational pipeline for the processing of immune repertoire data that leverages the immunological receptor (TCR) sequences present in whole transcriptome RNA-seq datasets. As far as we know, this is the first pipeline that includes both TCR extraction from RNA-seq data and a complete immune repertoire data analysis. Different repertoire features can be calculated to interpret the immune repertoire variation. Repertoire overlap analysis is the most common approach to uncover clonotypes shared between individuals, also known as “public” clones [45]. Diversity measurement helps understand the expansion of individual T-cell clones [28]. Repertoire architecture is represented by receptor sequence similarity, which determines the adaptive immune response and can be quantified by both k-mer and network analysis [27]. Finally, evaluating the overrepresentation of immune receptors with a known pathological association in patient immune repertoires guides the assessment of cross-reactivity with other immunological conditions [30].

We present a case study where TCR repertoire profiles were recovered from bulk RNA-seq of isolated CD4 T cells from control patients and cirrhotic patients without and with MHE. A total of 498–1,114 distinct clones (i.e., CDR3β amino acid sequences) per individual were reconstructed using MiXCR. The authors of this tool have shown similar yields on 100 bp paired-end RNA-seq data using ileocecal lymph node metastasis samples (around 3,000 recovered TRB), small intestine resection samples (around 150 TRB), or isolated CD4 T cells from spleen (1,684–2,977 TRB) and central nervous system (367–936 TRB) [10]. Our resulting range of clones across patients, obtained from CD4 T cells isolated from human blood samples, lies between those from spleen and central nervous system, showing a good extraction of clonotypes from the RNA-seq dataset. From clonotype data analysis, we found that the immune repertoires of our three groups of patients are highly similar. The high similarity could have either a biological or a technological origin, and three main reasons can be suggested. (1) Repertoire changes in response to immune perturbations may be more subtle than previously thought. This is in line with recent results on larger cohorts. Specifically, our findings suggest that for autoimmune diseases, the immune signal is very weak if not isolated, for example, by cell type [53]. (2) Another reason could be the low sample number. However, we have previously shown that even large sample numbers may not lead to separation between patient classes unless specific machine learning algorithms and feature encoding are used [54, 55]. (3) It may also be that the proposed features (repertoire overlap, repertoire diversity, network analysis, k-mer, and comparison with existing databases) do not capture the full biological heterogeneity of TCR repertoires. However, we have recently shown that these features cover a large part of immune repertoire diversity [53]. That said, crucial features that have been taken into account are HLA-associated or antigen-specific sequences [54, 56, 57]. Our pipeline can be applied to any bulk RNA-seq dataset obtained from a sample containing T cells, thanks to the Nextflow implementation. We believe that it is a useful resource to study the immune repertoire similarity landscape across different biological scenarios (e.g., health, disease, autoimmunity, infection, vaccination).

Taking advantage of the vast number of RNA-seq datasets generated for different immune cell populations in the last few decades, our Nextflow pipeline can be applied to study TCR repertoires and to understand patient immune status in multiple diseases. The current version of the pipeline is suited to the study of T-cell subtypes (CD8 and CD4 subpopulations), but it can be easily adapted to the study of B cells. It currently supports only two input species (Homo sapiens and Mus musculus), but further species can be added as the information in antigen- and disease-specific TCR databases increases. However, only the most abundant and expanded clones can be recovered when bulk RNA-seq data are used for immune repertoire quantification, and the pairing of specific α/β receptor chains cannot be determined. Simulation [58,59] and benchmarking [8,60] studies are therefore necessary to determine to what extent the present workflow can identify immune-state-related signals. Because bulk RNA-seq datasets capture both kinds of biological information in a single experiment, the Nextflow pipeline will allow parallel analysis of immune repertoires and gene expression. In previous work, we analyzed gene expression in the RNA-seq dataset used here, integrating RNA-seq and miRNA-seq data from CD4 T cells of our MHE patient cohort (data not shown). Formal integration of TCR and gene expression results will require the development of adequate mathematical methods able to deal with the different structures of the two datasets [53,61].

Declaration of Competing Interest

VG declares advisory board positions in aiNET GmbH, Enpicom B.V, Specifica Inc, Adaptyv Biosystems, and EVQLV. VG is a consultant for Roche/Genentech.

Author contributions

AC, VF, ST, and CMontoliu designed the study. AU and PI collected samples and performed T-cell isolation. CMarti performed RNA extraction. VG designed and supervised data analysis. MC designed the Fisher's exact test analysis. VF, AC, and CMontoliu obtained funding. TR analyzed the data and wrote the manuscript. SM and TR assembled the Nextflow workflow. AC, VG, and ST supervised manuscript write-up. All authors reviewed, revised, and approved the final manuscript.

Funding

We acknowledge generous support from The Leona M. and Harry B. Helmsley Charitable Trust (#2019PG-T1D011, to VG), UiO World-Leading Research Community (to VG), UiO:LifeScience Convergence Environment Immunolingo (to VG), EU Horizon 2020 iReceptorplus (#825821, to VG), a Research Council of Norway FRIPRO project (#300740, to VG), a Research Council of Norway IKTPLUSS project (#311341, to VG), and a Norwegian Cancer Society Grant (#215817, to VG). This work was also supported in part by Fundación Ramón Areces (to CM), the Ministerio de Ciencia e Innovación Spain (SAF2017-82917-R and PID2020-113388RB-I00 to VF; FIS PI18/00150 to CM), Consellería Educación Generalitat Valenciana (PROMETEOII/2018/051 to VF), Ministerio de Economía y Competitividad (BIO2015-71658-R to AC), and Centro de Investigación Príncipe Felipe (Ayudas para proyectos de investigación intergrupos to TR), and was co-funded with European Regional Development Funds (ERDF to VF, CM, AC).

Appendix. Supplementary materials

References

  1. Elhanati, Y., Murugan, A., Callan, C.G., Mora, T., and Walczak, A.M. Quantifying selection in immune receptor repertoires. Proc. Natl. Acad. Sci. 2014; 111: 9875–9880. https://doi.org/10.1073/pnas.1409572111
  2. Murugan, A., Mora, T., Walczak, A.M., and Callan, C.G. Statistical inference of the generation probability of T-cell receptors from sequence repertoires. Proc. Natl. Acad. Sci. 2012; 109: 16161–16166. https://doi.org/10.1073/pnas.1212755109
  3. Robins, H.S., Campregher, P.V., Srivastava, S.K., Wacher, A., Turtle, C.J., Kahsai, O., Riddell, S.R., Warren, E.H., and Carlson, C.S. Comprehensive assessment of T-cell receptor β-chain diversity in αβ T cells. Blood. 2009; 114: 4099–4107. https://doi.org/10.1182/blood-2009-04-217604
  4. Davis, M.M. and Bjorkman, P.J. T-cell antigen receptor genes and T-cell recognition. Nature. 1988; 334: 395–402. https://doi.org/10.1038/334395a0
  5. Brown, A.J., Snapkov, I., Akbar, R., Pavlović, M., Miho, E., Sandve, G.K., and Greiff, V. Augmenting adaptive immunity: progress and challenges in the quantitative engineering and analysis of adaptive immune receptor repertoires. Mol. Syst. Des. Eng. 2019; 4: 701–736. https://doi.org/10.1039/C9ME00071B
  6. Rosati, E., Dowds, C.M., Liaskou, E., Henriksen, E.K.K., Karlsen, T.H., and Franke, A. Overview of methodologies for T-cell receptor repertoire analysis. BMC Biotechnol. 2017; 17: 61. https://doi.org/10.1186/s12896-017-0379-9
  7. Rubelt, F., Busse, C.E., Bukhari, S.A.C., Bürckert, J.-P., Mariotti-Ferrandiz, E., Cowell, L.G., Watson, C.T., Marthandan, N., Faison, W.J., Hershberg, U., Laserson, U., Corrie, B.D., Davis, M.M., Peters, B., Lefranc, M.-P., Scott, J.K., Breden, F., Luning Prak, E.T., and Kleinstein, S.H. Adaptive Immune Receptor Repertoire Community recommendations for sharing immune-repertoire sequencing data. Nat. Immunol. 2017; 18: 1274–1278. https://doi.org/10.1038/ni.3873
  8. Barennes, P., Quiniou, V., Shugay, M., Egorov, E.S., Davydov, A.N., Chudakov, D.M., Uddin, I., Ismail, M., Oakes, T., Chain, B., Eugster, A., Kashofer, K., Rainer, P.P., Darko, S., Ransier, A., Douek, D.C., Klatzmann, D., and Mariotti-Ferrandiz, E. Benchmarking of T cell receptor repertoire profiling methods reveals large systematic biases. Nat. Biotechnol. 2021; 39: 236–245. https://doi.org/10.1038/s41587-020-0656-3
  9. Greiff, V., Miho, E., Menzel, U., and Reddy, S.T. Bioinformatic and Statistical Analysis of Adaptive Immune Repertoires. Trends Immunol. 2015; 36: 738–749. https://doi.org/10.1016/j.it.2015.09.006
  10. Bolotin, D.A., Poslavsky, S., Davydov, A.N., Frenkel, F.E., Fanchi, L., Zolotareva, O.I., Hemmers, S., Putintseva, E.V., Obraztsova, A.S., Shugay, M., Ataullakhanov, R.I., Rudensky, A.Y., Schumacher, T.N., and Chudakov, D.M. Antigen receptor repertoire profiling from RNA-seq data. Nat. Biotechnol. 2017; 35: 908–911. https://doi.org/10.1038/nbt.3979
  11. Farmanbar, A., Kneller, R., and Firouzi, S. RNA sequencing identifies clonal structure of T-cell repertoires in patients with adult T-cell leukemia/lymphoma. Npj Genomic Med. 2019; 4: 1–9. https://doi.org/10.1038/s41525-019-0084-9
  12. Song, L., Cohen, D., Ouyang, Z., Cao, Y., Hu, X., and Liu, X.S. TRUST4: immune repertoire reconstruction from bulk and single-cell RNA-seq data. Nat. Methods. 2021; https://doi.org/10.1038/s41592-021-01142-2
  13. Chen, S.Y., Liu, C.J., Zhang, Q., and Guo, A.Y. An ultra-sensitive T-cell receptor detection method for TCR-Seq and RNA-Seq data. Bioinforma. Oxf. Engl. 2020; 36: 4255–4262. https://doi.org/10.1093/bioinformatics/btaa432
  14. Canzar, S., Neu, K.E., Tang, Q., Wilson, P.C., and Khan, A.A. BASIC: BCR assembly from single cells. Bioinforma. Oxf. Engl. 2017; 33: 425–427. https://doi.org/10.1093/bioinformatics/btw631
  15. Lindeman, I., Emerton, G., Mamanova, L., Snir, O., Polanski, K., Qiao, S.-W., Sollid, L.M., Teichmann, S.A., and Stubbington, M.J.T. BraCeR: B-cell-receptor reconstruction and clonality inference from single-cell RNA-seq. Nat. Methods. 2018; 15: 563–565. https://doi.org/10.1038/s41592-018-0082-3
  16. Upadhyay, A.A., Kauffman, R.C., Wolabaugh, A.N., Cho, A., Patel, N.B., Reiss, S.M., Havenar-Daughton, C., Dawoud, R.A., Tharp, G.K., Sanz, I., Pulendran, B., Crotty, S., Lee, F.E.-H., Wrammert, J., and Bosinger, S.E. BALDR: a computational pipeline for paired heavy and light chain immunoglobulin reconstruction in single-cell RNA-seq data. Genome Med. 2018; 10: 20. https://doi.org/10.1186/s13073-018-0528-3
  17. Di Tommaso, P., Chatzou, M., Floden, E.W., Barja, P.P., Palumbo, E., and Notredame, C. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 2017; 35: 316–319. https://doi.org/10.1038/nbt.3820
  18. Ewels, P.A., Peltzer, A., Fillinger, S., Patel, H., Alneberg, J., Wilm, A., Garcia, M.U., Di Tommaso, P., and Nahnsen, S. The nf-core framework for community-curated bioinformatics pipelines. Nat. Biotechnol. 2020; 38: 276–278. https://doi.org/10.1038/s41587-020-0439-x
  19. Garrett, M.E., Galloway, J.G., Wolf, C., Logue, J.K., Franko, N., Chu, H.Y., Matsen, F.A., and Overbaugh, J. Comprehensive characterization of the antibody responses to SARS-CoV-2 Spike protein after infection and/or vaccination. BioRxiv. 2021; https://doi.org/10.1101/2021.10.05.463210
  20. Galloway, J.G. and Matsen, E. phip-flow. Matsen Group; 2022. https://github.com/matsengrp/phip-flow (accessed 9 February 2022)
  21. Weissenborn, K., Giewekemeyer, K., Heidenreich, S., Bokemeyer, M., Berding, G., and Ahl, B. Attention, Memory, and Cognitive Function in Hepatic Encephalopathy. Metab. Brain Dis. 2005; 20: 359–367. https://doi.org/10.1007/s11011-005-7919-z
  22. Cabrera-Pastor, A., Llansola, M., Montoliu, C., Malaguarnera, M., Balzano, T., Taoro-Gonzalez, L., García-García, R., Mangas-Losada, A., Izquierdo-Altarejos, P., Arenas, Y.M., Leone, P., and Felipo, V. Peripheral inflammation induces neuroinflammation that alters neurotransmission and cognitive and motor function in hepatic encephalopathy: Underlying mechanisms and therapeutic implications. Acta Physiol. 2019; 226: e13270. https://doi.org/10.1111/apha.13270
  23. Mangas-Losada, A., García-García, R., Urios, A., Escudero-García, D., Tosca, J., Giner-Durán, R., Serra, M.A., Montoliu, C., and Felipo, V. Minimal hepatic encephalopathy is associated with expansion and activation of CD4+CD28-, Th22 and Tfh and B lymphocytes. Sci. Rep. 2017; 7: 6683. https://doi.org/10.1038/s41598-017-05938-1
  24. Weissenborn, K., Ennen, J.C., Schomerus, H., Rückert, N., and Hecker, H. Neuropsychological characterization of hepatic encephalopathy. J. Hepatol. 2001; 34: 768–773. https://doi.org/10.1016/S0168-8278(01)00026-5
  25. Bolger, A.M., Lohse, M., and Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinforma. Oxf. Engl. 2014; 30: 2114–2120. https://doi.org/10.1093/bioinformatics/btu170
  26. Bolotin, D.A., Poslavsky, S., Mitrophanov, I., Shugay, M., Mamedov, I.Z., Putintseva, E.V., and Chudakov, D.M. MiXCR: software for comprehensive adaptive immunity profiling. Nat. Methods. 2015; 12: 380–381. https://doi.org/10.1038/nmeth.3364
  27. Amoriello, R., Greiff, V., Aldinucci, A., Bonechi, E., Carnasciali, A., Peruzzi, B., Repice, A.M., Mariottini, A., Saccardi, R., Mazzanti, B., Massacesi, L., and Ballerini, C. The TCR Repertoire Reconstitution in Multiple Sclerosis: Comparing One-Shot and Continuous Immunosuppressive Therapies. Front. Immunol. 2020; 11: 559. https://doi.org/10.3389/fimmu.2020.00559
  28. Greiff, V., Bhat, P., Cook, S.C., Menzel, U., Kang, W., and Reddy, S.T. A bioinformatic framework for immune repertoire diversity profiling enables detection of immunological status. Genome Med. 2015; 7: 49. https://doi.org/10.1186/s13073-015-0169-8
  29. ImmunoMind Team. immunarch: An R Package for Painless Analysis of Large-Scale Immune Repertoire Data. 2019
  30. Amoriello, R., Chernigovskaya, M., Greiff, V., Carnasciali, A., Massacesi, L., Barilaro, A., Repice, A.M., Biagioli, T., Aldinucci, A., Muraro, P.A., Laplaud, D.A., Lossius, A., and Ballerini, C. TCR repertoire diversity in Multiple Sclerosis: High-dimensional bioinformatics analysis of sequences from brain, cerebrospinal fluid and peripheral blood. EBioMedicine. 2021; 68: 103429. https://doi.org/10.1016/j.ebiom.2021.103429
  31. Miho, E., Roškar, R., Greiff, V., and Reddy, S.T. Large-scale network analysis reveals the sequence space architecture of antibody repertoires. Nat. Commun. 2019; 10: 1321. https://doi.org/10.1038/s41467-019-09278-8
  32. Csardi, G. and Nepusz, T. The igraph software package for complex network research. InterJ Complex Syst. 2006; 1695: 9
  33. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria; 2020. https://www.R-project.org/
  34. Gaujoux, R. and Seoighe, C. A flexible R package for nonnegative matrix factorization. BMC Bioinformatics. 2010; 11: 367. https://doi.org/10.1186/1471-2105-11-367
  35. Andrews, S. FASTQC. A quality control tool for high throughput sequence data. 2010
  36. Wickham, H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York; 2016. https://ggplot2.tidyverse.org
  37. Kassambara, A. ggpubr: "ggplot2" Based Publication Ready Plots. R package version 0.4.0. 2020. https://CRAN.R-project.org/package=ggpubr
  38. Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E., and Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinforma. Oxf. Engl. 2017; 33: 2924–2929. https://doi.org/10.1093/bioinformatics/btx286
  39. Bagaev, D.V., Vroomans, R.M.A., Samir, J., Stervbo, U., Rius, C., Dolton, G., Greenshields-Watson, A., Attaf, M., Egorov, E.S., Zvyagin, I.V., Babel, N., Cole, D.K., Godkin, A.J., Sewell, A.K., Kesmir, C., Chudakov, D.M., Luciani, F., and Shugay, M. VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Res. 2020; 48: D1057–D1062. https://doi.org/10.1093/nar/gkz874
  40. Giudicelli, V., Chaume, D., and Lefranc, M.P. IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes. Nucleic Acids Res. 2005; 33: D256–D261. https://doi.org/10.1093/nar/gki010
  41. Rybakin, V., Westernberg, L., Fu, G., Kim, H.O., Ampudia, J., Sauer, K., and Gascoigne, N.R.J. Allelic exclusion of TCR α-chains upon severe restriction of Vα repertoire. PloS One. 2014; 9: e114320. https://doi.org/10.1371/journal.pone.0114320
  42. Steinel, N.C., Brady, B.L., Carpenter, A.C., Yang-Iott, K.S., and Bassing, C.H. Posttranscriptional silencing of VbetaDJbetaCbeta genes contributes to TCRbeta allelic exclusion in mammalian lymphocytes. J. Immunol. Baltim. Md. 2010; 185: 1055–1062. https://doi.org/10.4049/jimmunol.0903099
  43. Elhanati, Y., Sethna, Z., Callan, C.G., Mora, T., and Walczak, A.M. Predicting the spectrum of TCR repertoire sharing with a data-driven model of recombination. Immunol. Rev. 2018; 284: 167–179. https://doi.org/10.1111/imr.12665
  44. Greiff, V., Menzel, U., Miho, E., Weber, C., Riedel, R., Cook, S., Valai, A., Lopes, T., Radbruch, A., Winkler, T.H., and Reddy, S.T. Systems Analysis Reveals High Genetic and Antigen-Driven Predetermination of Antibody Repertoires throughout B Cell Development. Cell Rep. 2017; 19: 1467–1478. https://doi.org/10.1016/j.celrep.2017.04.054
  45. Greiff, V., Weber, C.R., Palme, J., Bodenhofer, U., Miho, E., Menzel, U., and Reddy, S.T. Learning the High-Dimensional Immunogenomic Features That Predict Public and Private Antibody Repertoires. J. Immunol. 2017; 199: 2985–2997. https://doi.org/10.4049/jimmunol.1700594
  46. Putintseva, E., Britanova, O., Staroverov, D., Merzlyak, E., Turchaninova, M., Shugay, M., Bolotin, D., Pogorelyy, M., Mamedov, I., Bobrynina, V., Maschan, M., Lebedev, Y., and Chudakov, D. Mother and Child T Cell Receptor Repertoires: Deep Profiling Study. Front. Immunol. 2013; 4: 463. https://doi.org/10.3389/fimmu.2013.00463
  47. Brand, S. Crohn's disease: Th1, Th17 or both? The change of a paradigm: new immunological and genetic insights implicate Th17 cells in the pathogenesis of Crohn's disease. Gut. 2009; 58: 1152–1167. https://doi.org/10.1136/gut.2008.163667
  48. Molnár, T., Tiszlavicz, L., Gyulai, C., Nagy, F., and Lonovics, J. Clinical significance of granuloma in Crohn's disease. World J. Gastroenterol. WJG. 2005; 11: 3118–3121. https://doi.org/10.3748/wjg.v11.i20.3118
  49. Nemeth, Z.H., Bogdanovski, D.A., Barratt-Stopper, P., Paglinco, S.R., Antonioli, L., and Rolandelli, R.H. Crohn's Disease and Ulcerative Colitis Show Unique Cytokine Profiles. Cureus. 2017; 9: e1177. https://doi.org/10.7759/cureus.1177
  50. Imam, T., Park, S., Kaplan, M.H., and Olson, M.R. Effector T Helper Cell Subsets in Inflammatory Bowel Diseases. Front. Immunol. 2018; 9: 1212. https://doi.org/10.3389/fimmu.2018.01212
  51. Bushara, O., Escobar, D.J., Weinberg, S.E., Sun, L., Liao, J., and Yang, G.-Y. The Possible Pathogenic Role of IgG4-Producing Plasmablasts in Stricturing Crohn's Disease. Pathobiol. J. Immunopathol. Mol. Cell. Biol. 2022: 1–11. https://doi.org/10.1159/000521259
  52. Rubio, T., Felipo, V., Tarazona, S., Pastorelli, R., Escudero-García, D., Tosca, J., Urios, A., Conesa, A., and Montoliu, C. Multi-omic analysis unveils biological pathways in peripheral immune system associated to minimal hepatic encephalopathy appearance in cirrhotic patients. Sci. Rep. 2021; 11: 1907. https://doi.org/10.1038/s41598-020-80941-7
  53. Weber, C.R., Rubio, T., Wang, L., Zhang, W., Robert, P.A., Akbar, R., Snapkov, I., W