indii.org
http://www.indii.org/
Lawrence Murray: software, research, photography.
Fri, 12 Feb 2021 19:53:51 -0800

Probabilistic programming: a powerful new approach to statistical phylogenetics
http://www.indii.org/research/probabilistic-programming-statistical-phylogenetics/index.html
Tue, 23 Jun 2020 00:00:00 -0700 · Lawrence Murray

Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here we show that universal probabilistic programming languages (PPLs) solve the model expression problem, while still supporting automated generation of efficient inference algorithms. To illustrate the power of the approach, we use it to generate sequential Monte Carlo (SMC) algorithms for recent biological diversification models that have been difficult to tackle using traditional approaches. This is the first time that SMC algorithms have been available for these models, and the first time it has been possible to compare them using model testing. Leveraging these advances, we re-examine previous claims about the performance of the models. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before PPLs can be effectively applied to the full range of phylogenetic models.
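To make the kind of algorithm being generated concrete, here is a minimal bootstrap particle filter (the simplest SMC method) in Python. The toy Gaussian random-walk model and all names are illustrative, not from the paper, which targets diversification models.

```python
import math
import random

def bootstrap_pf(ys, n_particles=1000, seed=1):
    """Bootstrap particle filter for a toy model:
    x_1 ~ Normal(0, 1), x_t ~ Normal(x_{t-1}, 1), y_t ~ Normal(x_t, 1).
    Returns an estimate of the log marginal likelihood log p(y_1:T)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_z = 0.0
    for t, y in enumerate(ys):
        if t > 0:
            # Propagate each particle through the transition kernel.
            xs = [rng.gauss(x, 1.0) for x in xs]
        # Weight each particle by the observation likelihood N(y; x, 1).
        ws = [math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2 * math.pi) for x in xs]
        mean_w = sum(ws) / n_particles
        log_z += math.log(mean_w)  # running estimate of log p(y_1:t)
        # Multinomial resampling: draw new particles proportional to weight.
        xs = rng.choices(xs, weights=ws, k=n_particles)
    return log_z
```

Running `bootstrap_pf([0.1, -0.2, 0.3])` returns a (negative) log marginal likelihood estimate; increasing `n_particles` tightens it. PPLs automate exactly this pairing of a model with such a filter.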
Canadian Rockies
http://www.indii.org/archives/canadian-rockies/index.html
Sat, 15 Feb 2020 00:00:00 -0800 · Lawrence Murray

Mountain vistas and the private life of snow.

Lazy object copy as a platform for population-based probabilistic programming
http://www.indii.org/research/lazy-object-copy/index.html
Sat, 01 Feb 2020 00:00:00 -0800 · Lawrence Murray

This work considers dynamic memory management for population-based probabilistic programs, such as those using particle methods for inference. Such programs exhibit a pattern of allocating, copying, potentially mutating, and deallocating collections of similar objects through successive generations. These objects may assemble data structures such as stacks, queues, lists, ragged arrays, and trees, which may be of random, and possibly unbounded, size. For the simple case of N particles, T generations, D objects, and resampling at each generation, dense representation requires $O(DNT)$ memory, while sparse representation requires only $O(DT+DN \log DN)$ memory, based on existing theoretical results. This work describes an object copy-on-write platform to automate this saving for the programmer. The core idea is formalized using labeled directed multigraphs, where vertices represent objects, edges the pointers between them, and labels the necessary bookkeeping. A specific labeling scheme is proposed for high performance under the motivating pattern. The platform is implemented for the Birch probabilistic programming language, using smart pointers, hash tables, and reference-counting garbage collection. It is tested empirically on a number of realistic probabilistic programs, and shown to significantly reduce memory use and execution time in a manner consistent with theoretical expectations. This enables copy-on-write for the imperative programmer, lazy deep copies for the object-oriented programmer, and in-place write optimizations for the functional programmer.
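The core copy-on-write idea can be sketched in a few lines of Python. The `CowPointer` class below is hypothetical and much simpler than the Birch platform described above; it only illustrates the pattern of O(1) lazy copies with a deep copy deferred until a shared object is written.

```python
import copy

class CowPointer:
    """Hypothetical copy-on-write pointer: copies share the underlying
    object, which is deep-copied only when written to while shared."""

    def __init__(self, obj):
        self._cell = [obj, 1]  # [object, share count]

    def copy(self):
        # Lazy copy: O(1), just bump the shared count.
        other = object.__new__(CowPointer)
        other._cell = self._cell
        self._cell[1] += 1
        return other

    def read(self):
        return self._cell[0]

    def write(self):
        # Copy-on-write: deep-copy only if the object is shared.
        if self._cell[1] > 1:
            self._cell[1] -= 1
            self._cell = [copy.deepcopy(self._cell[0]), 1]
        return self._cell[0]
```

For example, after `a = CowPointer({'x': 0}); b = a.copy()`, the dict is shared; `b.write()['x'] = 1` triggers a deep copy, so `a.read()['x']` is still `0`. Under the particle-filter pattern, most particles are never mutated after resampling, so most deep copies never happen.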
Parameter elimination in particle Gibbs sampling
http://www.indii.org/research/parameter-elimination-in-particle-gibbs-sampling/index.html
Sat, 09 Nov 2019 00:00:00 -0800 · Lawrence Murray

Bayesian inference in state-space models is challenging due to high-dimensional state trajectories. A viable approach is particle Markov chain Monte Carlo, which combines MCMC and sequential Monte Carlo to form “exact approximations” to otherwise intractable MCMC methods. The performance of the approximation is limited by that of the exact method it approximates. We focus on particle Gibbs and particle Gibbs with ancestor sampling, improving their performance beyond that of the underlying Gibbs sampler by marginalizing out one or more parameters. This is possible when the parameter prior is conjugate to the complete data likelihood. Marginalization yields a non-Markovian model for inference, but we show that, in contrast to the general case, the method still scales linearly in time. While marginalization can be cumbersome to implement, recent advances in probabilistic programming have enabled its automation. We show that the marginalized methods are viable as efficient inference backends in probabilistic programming, and demonstrate them with examples in ecology and epidemiology.
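The marginalization hinges on conjugacy between the parameter prior and the complete data likelihood. A minimal Python sketch of the idea, using a Beta-Bernoulli pair rather than a state-space model: the parameter is integrated out analytically, leaving a sequence of posterior predictives that depend on sufficient statistics of all earlier observations (hence the non-Markovian model mentioned above).

```python
import math

def log_marginal_chain(xs, alpha=1.0, beta=1.0):
    """Marginal likelihood of 0/1 data with the Beta(alpha, beta)
    parameter integrated out, computed sequentially via conjugate
    posterior predictives (the kind of update that probabilistic
    programming can automate)."""
    heads, n, log_p = 0, 0, 0.0
    for x in xs:
        p1 = (alpha + heads) / (alpha + beta + n)  # P(x = 1 | past)
        log_p += math.log(p1 if x == 1 else 1.0 - p1)
        heads += x
        n += 1
    return log_p

def log_marginal_closed(xs, alpha=1.0, beta=1.0):
    """Same quantity in closed form: B(alpha+h, beta+n-h) / B(alpha, beta)."""
    h, n = sum(xs), len(xs)
    return (math.lgamma(alpha + h) + math.lgamma(beta + n - h)
            - math.lgamma(alpha + beta + n)
            - (math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)))
```

Both functions agree, e.g. `log_marginal_chain([1, 0, 1, 1])` equals `log(1/20)`. Note that each predictive needs only the running sufficient statistics `(heads, n)`, which is why the marginalized method can still scale linearly in time.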
Probabilistic programming for birth-death models of evolution using an alive particle filter with delayed sampling
http://www.indii.org/research/probabilistic-programming-for-birth-death-models-of-evolution/index.html
Sun, 14 Jul 2019 00:00:00 -0700 · Lawrence Murray

We consider probabilistic programming for birth-death models of evolution and introduce a new, widely applicable inference method that combines an extension of the alive particle filter (APF) with automatic Rao-Blackwellization via delayed sampling. Birth-death models of evolution are an important family of phylogenetic models of the diversification processes that lead to evolutionary trees. Probabilistic programming languages (PPLs) give phylogeneticists a new and exciting tool: their models can be implemented as probabilistic programs with just a basic knowledge of programming. The general inference methods in PPLs reduce the need for external experts, allow quick prototyping and testing, and accelerate the development and deployment of new models. We show how these birth-death models can be implemented as simple programs in existing PPLs, and demonstrate the usefulness of the proposed inference method for such models. For the popular BiSSE model, the method increases the effective sample size and the conditional acceptance rate by a factor of 30 compared with a standard bootstrap particle filter. Although we concentrate on phylogenetics, the extended APF is a general inference method that shows its strength in situations where particles are often assigned zero weight. When the weights are always positive, the extra cost of using the APF rather than the bootstrap particle filter is negligible, making our method a suitable drop-in replacement for the bootstrap particle filter in probabilistic programming inference.
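The alive particle filter's defining move is to keep proposing until enough particles with positive weight survive. A simplified Python sketch of one generation follows; it is illustrative only, and the paper's extended APF differs in detail. It assumes the incoming particles are equally weighted (as after resampling with 0/1 weights).

```python
import random

def alive_pf_step(particles, propagate, weight, n, rng):
    """One generation of an alive particle filter (sketch): resample and
    propagate repeatedly, keeping only children with positive weight,
    until n + 1 have been accepted; the (n+1)-th is discarded, which is
    what keeps the associated likelihood estimator unbiased.
    Returns the n accepted particles and the number of proposals made."""
    accepted, proposals = [], 0
    while len(accepted) < n + 1:
        parent = rng.choice(particles)       # resample a parent uniformly
        child = propagate(parent, rng)       # simulate the transition
        proposals += 1
        if weight(child) > 0.0:              # keep only "alive" particles
            accepted.append(child)
    return accepted[:n], proposals
```

For example, with `propagate = lambda x, r: r.gauss(x, 1.0)` and the hard constraint `weight = lambda x: 1.0 if abs(x) < 2.0 else 0.0`, every returned particle satisfies the constraint, whereas a bootstrap filter could be left with no surviving particles at all. When weights are never zero, the loop runs exactly `n + 1` times, which is why the overhead is negligible in that case.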
Yosemite Falls
http://www.indii.org/archives/yosemite-falls/index.html
Sat, 15 Jun 2019 00:00:00 -0700 · Lawrence Murray

Gecko
http://www.indii.org/archives/gecko/index.html
Sun, 27 Jan 2019 00:00:00 -0800 · Lawrence Murray

The Wet Comes To Bali
http://www.indii.org/archives/wet-in-bali/index.html
Mon, 26 Nov 2018 00:00:00 -0800 · Lawrence Murray

Light and dark.

The Kimberley
http://www.indii.org/archives/the-kimberley/index.html
Tue, 20 Nov 2018 00:00:00 -0800 · Lawrence Murray

Gorges and waterholes in the remote north-west of Australia.

Automated learning with a probabilistic programming language: Birch
http://www.indii.org/research/automated-learning-with-a-probabilistic-programming-language-birch/index.html
Sat, 13 Oct 2018 00:00:00 -0700 · Lawrence Murray

This work offers a broad perspective on probabilistic modeling and inference in light of recent advances in probabilistic programming, in which models are formally expressed in Turing-complete programming languages. We consider a typical workflow and how probabilistic programming languages can help to automate it, especially in the matching of models with inference methods. We focus on two properties of a model that are critical in this matching: its structure (the conditional dependencies between random variables) and its form (the precise mathematical definition of those dependencies). While the structure and form of a probabilistic model are often fixed a priori, it is a curiosity of probabilistic programming that they need not be, and may instead vary according to random choices made during program execution. We introduce a formal description of models expressed as programs, and discuss some of the ways in which probabilistic programming languages can reveal the structure and form of these models in order to tailor inference methods. We demonstrate the ideas with a new probabilistic programming language called Birch, using a multiple object tracking example.
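The point that structure may vary with random choices made during execution can be shown with a toy probabilistic program in Python (illustrative only, not Birch syntax): the number of latent variables is itself sampled, so the dependency graph differs between runs.

```python
import random

def random_structure_model(rng):
    """A toy probabilistic program whose structure is not fixed a priori:
    the number of latent variables is random, so the conditional
    dependency structure varies between executions."""
    n = 1
    while rng.random() < 0.5:  # n ~ Geometric(0.5), so E[n] = 2
        n += 1
    latents = [rng.gauss(0.0, 1.0) for _ in range(n)]  # n latent variables
    return n, sum(latents)
```

No fixed graphical model can describe this program, since the set of random variables is not known in advance; a Turing-complete PPL expresses it directly, and the runtime can still discover the structure of each execution to tailor inference.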