BIROn - Birkbeck Institutional Research Online

    Probabilistic verb selection for data-to-text generation

    Zhang, Dell and Yuan, J. and Wang, X. and Foster, Adam (2018) Probabilistic verb selection for data-to-text generation. Transactions of the Association for Computational Linguistics 6, pp. 511-527. ISSN 2307-387X.

    Text: verb_sel_paper.pdf - Author's Accepted Manuscript (393kB). Restricted to Repository staff only.
    Text: 23242.pdf - Published Version of Record (398kB). Available under License Creative Commons Attribution.

    Abstract

    In data-to-text Natural Language Generation (NLG) systems, computers need to find the right words to describe phenomena seen in the data. This paper focuses on the problem of choosing appropriate verbs to express the direction and magnitude of a percentage change (e.g., in stock prices). Rather than simply using the same verbs again and again, we present a principled data-driven approach to this problem based on Shannon's noisy-channel model so as to bring variation and naturalness into the generated text. Our experiments on three large-scale real-world news corpora demonstrate that the proposed probabilistic model can be learned to accurately imitate human authors' pattern of usage around verbs, outperforming the state-of-the-art method significantly.
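The noisy-channel idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the toy observations, the Gaussian form of the likelihood, and the function names `fit`, `posterior` and `choose_verb` are all assumptions made here for illustration. The sketch scores each verb by Bayes' rule, P(verb | change) ∝ P(change | verb) · P(verb), and then samples from that distribution instead of always returning the top verb, which is one simple way to get variation in the output.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy data: (verb, percentage change it was used to describe).
OBSERVATIONS = [
    ("edge up", 0.4), ("edge up", 0.7), ("edge up", 1.1),
    ("rise", 1.5), ("rise", 2.8), ("rise", 4.0), ("rise", 3.1),
    ("soar", 9.5), ("soar", 14.0), ("soar", 22.0),
]

def fit(observations):
    """Estimate a prior P(verb) and a Gaussian likelihood P(change | verb).

    The Gaussian is an assumption of this sketch, not taken from the paper.
    """
    by_verb = defaultdict(list)
    for verb, change in observations:
        by_verb[verb].append(change)
    total = len(observations)
    model = {}
    for verb, changes in by_verb.items():
        mean = sum(changes) / len(changes)
        var = sum((c - mean) ** 2 for c in changes) / len(changes)
        std = max(math.sqrt(var), 1e-3)  # guard against zero variance
        model[verb] = (len(changes) / total, mean, std)
    return model

def posterior(model, change):
    """Return P(verb | change), computed in log space to avoid underflow."""
    log_scores = {}
    for verb, (prior, mean, std) in model.items():
        z = (change - mean) / std
        log_likelihood = -0.5 * z * z - math.log(std * math.sqrt(2 * math.pi))
        log_scores[verb] = math.log(prior) + log_likelihood
    top = max(log_scores.values())
    weights = {v: math.exp(s - top) for v, s in log_scores.items()}
    norm = sum(weights.values())
    return {v: w / norm for v, w in weights.items()}

def choose_verb(model, change, rng):
    """Sample a verb from the posterior so repeated calls can vary."""
    post = posterior(model, change)
    verbs = list(post)
    return rng.choices(verbs, weights=[post[v] for v in verbs], k=1)[0]

model = fit(OBSERVATIONS)
post_small = posterior(model, 0.8)   # a small rise
post_large = posterior(model, 15.0)  # a large rise
rng = random.Random(0)               # seeded only to make runs repeatable
sampled = choose_verb(model, 3.0, rng)
```

With this toy data, a small rise puts most of the probability mass on "edge up" and a large rise on "soar". The toy data covers rises only; handling falls would need observations with negative changes.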

    Metadata

    Item Type: Article
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
    Research Centres and Institutes: Birkbeck Knowledge Lab, Birkbeck Institute for Data Analytics
    Depositing User: Dell Zhang
    Date Deposited: 23 Jul 2018 09:45
    Last Modified: 09 Aug 2023 12:44
    URI: https://eprints.bbk.ac.uk/id/eprint/23242
