Hasan, Abul Kalam Md. Rajib (2022) Extracting health information from social media. PhD thesis, Birkbeck, University of London.
Abstract
Social media platforms with large user bases, such as Twitter, Reddit, and online health forums, contain a wealth of health-related information. Despite advances in natural language processing (NLP), extracting actionable health information from social media remains challenging. This thesis proposes a set of methodologies for extracting medical concepts and health information related to drugs, symptoms, and side-effects from social media. We first develop a rule-based relationship extraction system that utilises a set of dictionaries and linguistic rules to extract structured information from patients’ posts on online health forums. We then automate the concept extraction process via (i) a supervised algorithm trained on a small labelled dataset, and (ii) an iterative semi-supervised algorithm capable of learning new sentences and concepts. We test our machine-learning pipeline on a COVID-19 case study involving patient-authored social media posts. We develop a novel triage and diagnostic approach that extracts the symptoms, severity, and prevalence of the disease rather than providing actionable decisions at the individual level. Finally, we extend our approach by investigating the potential benefit of incorporating dictionary information into a neural network architecture for natural language processing.
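As a rough illustration of the dictionary-plus-rules idea mentioned in the abstract, the following Python sketch tags dictionary terms in a post and applies one toy linguistic rule to link a drug to a side-effect. It is not the thesis's system: the dictionaries, the cue words, and the function names `extract_concepts` and `extract_relations` are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical toy dictionaries; the dictionaries used in the thesis are not listed here.
DICTIONARIES = {
    "drug": {"ibuprofen", "paracetamol"},
    "symptom": {"headache", "fever", "cough"},
    "side_effect": {"nausea", "dizziness"},
}

# Hypothetical causal cue words standing in for a linguistic rule.
CAUSAL_CUES = {"caused", "causes", "gave"}


def extract_concepts(text):
    """Return (term, category) pairs for dictionary terms, in order of appearance."""
    tokens = re.findall(r"[a-z]+", text.lower())
    found = []
    for token in tokens:
        for category, terms in DICTIONARIES.items():
            if token in terms:
                found.append((token, category))
    return found


def extract_relations(post):
    """Toy rule: link a drug to a side-effect when both appear in a
    sentence that also contains a causal cue word."""
    relations = []
    for sentence in re.split(r"[.!?]+", post):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if not (words & CAUSAL_CUES):
            continue  # no causal cue, so the rule does not fire
        concepts = extract_concepts(sentence)
        drugs = [term for term, cat in concepts if cat == "drug"]
        effects = [term for term, cat in concepts if cat == "side_effect"]
        relations.extend((d, "causes", e) for d in drugs for e in effects)
    return relations


if __name__ == "__main__":
    post = "Ibuprofen gave me nausea. I still have a fever."
    print(extract_concepts(post))
    print(extract_relations(post))
```

Exact token matching of this kind misses misspellings, multi-word terms, and lay vocabulary, which are common in social media posts; this is one reason a learned extraction approach, as the abstract describes, can be preferable to dictionaries alone.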
Metadata
Item Type: | Thesis |
---|---|
Copyright Holders: | The copyright of this thesis rests with the author, who asserts his/her right to be known as such according to the Copyright Designs and Patents Act 1988. No dealing with the thesis contrary to the copyright or moral rights of the author is permitted. |
Depositing User: | Acquisitions And Metadata |
Date Deposited: | 22 Nov 2022 13:45 |
Last Modified: | 01 Nov 2023 15:54 |
URI: | https://eprints.bbk.ac.uk/id/eprint/49947 |
DOI: | https://doi.org/10.18743/PUB.00049947 |