Gore, Trupti Amol (2025) Machine learning approaches to TCR binding prediction: benchmarking existing methods and developing a novel model for predicting TCR-MHC specificity. PhD thesis, Birkbeck, University of London.
Abstract
CD8+ T cells are key components of the adaptive immune system, playing crucial roles in targeting intracellular pathogens and in tumour surveillance. T cell immune function is mediated by surface T cell receptors (TCRs), which bind to complexes formed between antigenic peptides and MHC molecules. It is estimated that a typical individual has around 10^8 unique CD8+ T cells. The sequences of many millions of unique TCRs are known (e.g. from single-cell sequencing experiments), but in all but a tiny fraction of cases their antigenic targets are unknown. This gap between TCR sequence and TCR function provides a key motivation for the research presented in this thesis, which concerns two complementary computational prediction tasks. The first task involved the evaluation of state-of-the-art deep learning tools that predict TCR binding to peptide-MHC complexes. This is known to be a challenging task, notably because TCR binding is modulated by six flexible loops. Tools were retrained in order to address the generalised TCR binding prediction task: can a tool predict whether a given TCR will bind to an antigenic peptide not present in the dataset used to train the tool? The results showed that only two tools were moderately successful at generalised prediction. A subsequent correlation analysis provides useful insights into the factors associated with prediction success or failure. MHC molecules, encoded by HLA alleles, are highly polymorphic, and the HLA types of the individuals from whom TCR data is acquired are rarely known. A TCR is said to be HLA-restricted, i.e. it will commonly bind to complexes involving a single type of MHC. The second task addressed in this thesis involved designing a transformer-based deep learning tool for predicting TCR-HLA associations. The tool achieved AUCs of 0.68 and 0.76 for HLA alleles present in both training and test sets.
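The generalised prediction task described in the abstract depends on a test set whose peptides never appear in training, and performance is reported as AUC. The following is a minimal sketch of both ideas, not the thesis's actual pipeline: the function names `split_by_peptide` and `roc_auc`, the record layout, and the toy data are all assumptions made for illustration.

```python
import random


def split_by_peptide(records, test_fraction=0.2, seed=0):
    """Hold out whole peptides so that no test peptide is seen in training.

    `records` is a list of dicts with at least a "peptide" key
    (a hypothetical layout chosen for this sketch).
    """
    peptides = sorted({r["peptide"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(peptides)
    n_test = max(1, round(len(peptides) * test_fraction))
    held_out = set(peptides[:n_test])
    train = [r for r in records if r["peptide"] not in held_out]
    test = [r for r in records if r["peptide"] in held_out]
    return train, test


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy records with placeholder identifiers (not real TCR or peptide sequences).
records = [
    {"tcr": f"TCR_{i}", "peptide": f"PEPTIDE_{i % 5}", "binds": i % 2}
    for i in range(20)
]
train, test = split_by_peptide(records)
train_peptides = {r["peptide"] for r in train}
test_peptides = {r["peptide"] for r in test}

# Toy scores: three of the four positive/negative pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Splitting on peptides rather than on individual TCR-peptide pairs is what separates generalised evaluation from a random split, in which the same peptide can appear on both sides and inflate apparent performance.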
Metadata
Item Type: | Thesis |
---|---|
Copyright Holders: | The copyright of this thesis rests with the author, who asserts his/her right to be known as such according to the Copyright Designs and Patents Act 1988. No dealing with the thesis contrary to the copyright or moral rights of the author is permitted. |
Depositing User: | Acquisitions And Metadata |
Date Deposited: | 13 Mar 2025 10:44 |
Last Modified: | 18 Sep 2025 03:38 |
URI: | https://eprints.bbk.ac.uk/id/eprint/55155 |
DOI: | https://doi.org/10.18743/PUB.00055155 |