PyAutoFit: A Classy Probabilistic Programming Language For Data Science

Classy Probabilistic Programming For Big Data

James Nightingale

Big Data, Data Science, Open-Source, Science

Schedule: Fri, Jul 30, 13:45-14:15 CEST (30 min)

A major trend in data science is the rapid adoption of Bayesian statistics for data analysis and modeling. With modern datasets growing by orders of magnitude in size, the focus is now on developing methods that can apply contemporary inference techniques to extremely large datasets. To this end, I present PyAutoFit (https://github.com/rhayes777/PyAutoFit), an open-source probabilistic programming language for automated Bayesian inference that was recently published in the Journal of Open Source Software (https://joss.theoj.org/papers/10.21105/joss.02550).

I will begin by giving an overview of PyAutoFit’s core features, in particular how it:

- Makes it simple to compose and fit probabilistic models using a range of Bayesian inference libraries, such as emcee (https://github.com/dfm/emcee) and dynesty (https://github.com/joshspeagle/dynesty).

- Handles the 'heavy lifting' that comes with model-fitting, including model composition and customization, outputting results, model-specific visualization and posterior analysis.

- Is built for big-data analysis: results are written to an SQLite database that can be queried after model-fitting is complete.
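To make the workflow in the list above concrete, here is a minimal sketch using only the Python standard library. It is not PyAutoFit's API: the `Gaussian` class, the `PRIORS` dictionary, the `fit` helper and the `fits` table schema are all hypothetical names chosen for illustration, and a plain random search over the priors stands in for a real sampler such as emcee or dynesty. It shows the general idea of writing a model component as a Python class, fitting it to several datasets, and storing the results in an SQLite database that can be queried afterwards.

```python
import math
import random
import sqlite3


class Gaussian:
    """A model component: a 1D Gaussian whose parameters are its constructor arguments."""

    def __init__(self, centre, normalization, sigma):
        self.centre = centre
        self.normalization = normalization
        self.sigma = sigma

    def profile(self, xs):
        """Evaluate the Gaussian at each x coordinate."""
        return [
            self.normalization * math.exp(-0.5 * ((x - self.centre) / self.sigma) ** 2)
            for x in xs
        ]


# Hypothetical uniform priors (lower, upper), one per constructor argument.
PRIORS = {"centre": (0.0, 100.0), "normalization": (0.1, 50.0), "sigma": (1.0, 30.0)}


def log_likelihood(instance, xs, data, noise):
    """Gaussian log likelihood (up to a constant) of the data given a model instance."""
    model = instance.profile(xs)
    return -0.5 * sum(((d - m) / noise) ** 2 for d, m in zip(data, model))


def fit(name, xs, data, noise, n_samples, rng):
    """Random search: draw parameters from the priors, keep the highest-likelihood draw."""
    best = None
    for _ in range(n_samples):
        params = {key: rng.uniform(lo, hi) for key, (lo, hi) in PRIORS.items()}
        ll = log_likelihood(Gaussian(**params), xs, data, noise)
        if best is None or ll > best["log_likelihood"]:
            best = {"name": name, "log_likelihood": ll, **params}
    return best


rng = random.Random(1)
xs = list(range(100))

# An in-memory database keeps the sketch self-contained; a file path works the same way.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fits "
    "(name TEXT, centre REAL, normalization REAL, sigma REAL, log_likelihood REAL)"
)

# Simulate three noisy datasets and store one best-fit row per dataset.
for i, true_centre in enumerate([30.0, 50.0, 70.0]):
    truth = Gaussian(centre=true_centre, normalization=25.0, sigma=10.0)
    data = [value + rng.gauss(0.0, 1.0) for value in truth.profile(xs)]
    result = fit(f"dataset_{i}", xs, data, 1.0, 2000, rng)
    conn.execute(
        "INSERT INTO fits VALUES "
        "(:name, :centre, :normalization, :sigma, :log_likelihood)",
        result,
    )

# Query the stored results after fitting is complete.
rows = conn.execute(
    "SELECT name, centre FROM fits WHERE centre > 40 ORDER BY name"
).fetchall()
print(rows)
```

Keeping one row per fit means the selection step is an ordinary SQL query, so results from many fits can be filtered without reloading each one individually.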

PyAutoFit was developed by astronomers seeking to fit large libraries of galaxy images to better understand the nature of dark matter. Using this science case, I will describe PyAutoFit’s advanced features, such as multi-level models, automated model-fitting pipelines and support for massively parallel computing infrastructures.

The goal of this talk is to introduce the audience to PyAutoFit so they can adopt it for their own use case. The only prerequisite is a basic understanding of object-oriented programming in Python.

Type: Talk (30 mins); Python level: Beginner; Domain level: Beginner


James Nightingale

Durham University

I’m James Nightingale, an observational cosmologist and postdoctoral researcher at Durham University.

My cosmology research focuses on strong gravitational lensing, in particular devising new ways to use this phenomenon to study dark matter and the distant Universe. I’m the developer of PyAutoLens (https://github.com/Jammy2211/PyAutoLens), open-source software for analysing strong lenses.

I am an advocate for cross-disciplinary research and am collaborating with healthcare researchers to improve cancer treatments by applying the statistical techniques we use to study cosmology. This work uses the statistics software PyAutoFit (https://github.com/rhayes777/PyAutoFit), of which I am the lead developer.