Marshall, Benjamin Charles (2023) Gaussian process audio segmentation. PhD thesis, Birkbeck, University of London.
Text: Marshall B thesis final.pdf - Full Version (1MB)
Audio: bangbang.wav (640kB)
Audio: indian.wav (208kB)
Audio: segovia.wav (185kB)
Abstract
This thesis presents a probability model for generating polyphonic audio and derives an algorithm for estimating the most probable interpretation of new audio under this model. The algorithm works through a combination of dynamic programming, recursive grouping rules and Bayesian inference. The probability model combines older ideas from signal processing (Fourier transforms, stationary processes) with newer ideas from Bayesian inference, in the form of hyperpriors over the parameters governing the audio-generating process. These techniques allow the model to adapt to novel audio sources (vocals, musical instruments, background noises, special effects), so that it is not easily fooled by 'out-of-sample' data. Furthermore, the generative nature of the model allows new audio to be re-sampled after an interpretation has been inferred; this highlights which aspects of the data have been modelled and which have been missed. The report describes the development of this model and algorithm through a series of increasingly complex audio examples. This reflects the way in which the author has approached the problem of audio modelling: additional model complexity has been brought in only when some quite general feature of audio has been missed by simpler methods. The focus of the thesis is on the ideas the author has found to work well. The many other models and algorithmic ideas that accompanied these few successful ones are described in little detail, if at all.
Metadata
Item Type: | Thesis |
---|---|
Copyright Holders: | The copyright of this thesis rests with the author, who asserts his/her right to be known as such according to the Copyright, Designs and Patents Act 1988. No dealing with the thesis contrary to the copyright or moral rights of the author is permitted. |
Depositing User: | Acquisitions And Metadata |
Date Deposited: | 26 Jan 2024 17:37 |
Last Modified: | 27 Jan 2024 08:44 |
URI: | https://eprints.bbk.ac.uk/id/eprint/52916 |
DOI: | https://doi.org/10.18743/PUB.00052916 |