Text Mining

Clegg, A.B. and Shepherd, Adrian J. (2008) Text Mining. Methods in Molecular Biology 453 (4), pp. 471-491. ISSN 1064-3745.

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1007/978-1-60327-429-6_25

Abstract

One of the fastest-growing fields in bioinformatics is text mining: the application of natural language processing techniques to problems of knowledge management and discovery, using large collections of biological or biomedical text such as MEDLINE. The techniques used in text mining range from the very simple (e.g., the inference of relationships between genes from frequent proximity in documents) to the complex and computationally intensive (e.g., the analysis of sentence structures with parsers in order to extract facts about protein-protein interactions from statements in the text). This chapter presents a general introduction to some of the key principles and challenges of natural language processing, and introduces some of the tools available to end-users and developers. A case study describes the construction and testing of a simple tool designed to tackle a task that is crucial to almost any application of text mining in bioinformatics: identifying gene/protein names in text and mapping them onto records in an external database.
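
As an illustration of the "very simple" end of that range, the following is a minimal sketch of inferring candidate gene relationships from co-occurrence in documents. It is not taken from the chapter: the sample sentences, the three-name lexicon, and the function names genes_in and cooccurrence_counts are all hypothetical, and matching names as whole tokens is a simplifying assumption that a real system would replace with proper gene/protein name recognition.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each string stands in for one abstract.
DOCUMENTS = [
    "BRCA1 interacts with BARD1 in the nucleus.",
    "Loss of BRCA1 or BARD1 impairs DNA repair.",
    "TP53 is mutated in many tumours; BRCA1 status was also recorded.",
]

# Hypothetical gene lexicon; a real tool would draw names from a database.
GENE_NAMES = {"BRCA1", "BARD1", "TP53"}


def genes_in(text):
    """Return the set of lexicon names that appear as whole tokens in text."""
    tokens = {tok.strip(".,;:()") for tok in text.split()}
    return tokens & GENE_NAMES


def cooccurrence_counts(documents):
    """Count, for each unordered gene pair, the documents mentioning both."""
    counts = Counter()
    for doc in documents:
        # Sorting makes each pair a canonical (alphabetical) tuple.
        for pair in combinations(sorted(genes_in(doc)), 2):
            counts[pair] += 1
    return counts


counts = cooccurrence_counts(DOCUMENTS)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with higher counts are only candidates for a relationship. Co-occurrence says nothing about the kind of relationship, which is one reason the abstract contrasts it with parser-based extraction.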

Metadata

Item Type:	Article
School:	Birkbeck Faculties and Schools > Faculty of Science > School of Natural Sciences
Research Centres and Institutes:	Bioinformatics, Bloomsbury Centre for (Closed), Structural Molecular Biology, Institute of (ISMB)
Depositing User:	Administrator
Date Deposited:	04 Aug 2010 14:09
Last Modified:	02 Aug 2023 16:49
URI:	https://eprints.bbk.ac.uk/id/eprint/1088

