BIROn - Birkbeck Institutional Research Online

    Gaussian process audio segmentation

    Marshall, Benjamin Charles (2023) Gaussian process audio segmentation. PhD thesis, Birkbeck, University of London.

    Text: Marshall B thesis final.pdf - Full Version (1MB)
    Audio: bangbang.wav (640kB)
    Audio: indian.wav (208kB)
    Audio: segovia.wav (185kB)

    Abstract

    This thesis presents a probability model for generating polyphonic audio and derives an algorithm for estimating the most probable interpretation of new audio under this model. The algorithm works through a combination of dynamic programming, recursive grouping rules and Bayesian inference. The probability model combines older ideas from signal processing (Fourier transforms, stationary processes) with newer ideas from Bayesian inference, in the form of hyperpriors over the parameters governing the audio-generating process. These techniques allow the model to adapt to novel audio sources (vocals, musical instruments, background noises, special effects), so it is not easily fooled by 'out-of-sample' data. Furthermore, the generative nature of the model allows new audio to be re-sampled after an interpretation has been inferred; this highlights which aspects of the data have been modelled and which have been missed. The thesis describes the development of this model and algorithm through a series of increasingly complex audio examples. This reflects the way in which the author has approached the problem of audio modelling: additional model complexity has been brought in only when some quite general feature of audio has been missed by simpler methods. The focus of the thesis is on the ideas the author has found to work well. The many other models and algorithmic ideas that accompanied these few successful ones are not described in much detail, if at all.

    Metadata

    Item Type: Thesis
    Copyright Holders: The copyright of this thesis rests with the author, who asserts his/her right to be known as such according to the Copyright Designs and Patents Act 1988. No dealing with the thesis contrary to the copyright or moral rights of the author is permitted.
    Depositing User: Acquisitions And Metadata
    Date Deposited: 26 Jan 2024 17:37
    Last Modified: 27 Jan 2024 08:44
    URI: https://eprints.bbk.ac.uk/id/eprint/52916
    DOI: https://doi.org/10.18743/PUB.00052916
