Track: All | |
---|---|
Keynote |
PyMC: Past, Present, and Future To kick off PyMCon 2020, I will provide some of the backstory of the PyMC project: where we’ve been, where we are, and where we might go. |
|
Chris is a Senior Quantitative Analyst in Baseball Operations for the New York Yankees. He is interested in computational statistics, machine learning, Bayesian methods, and applied decision analysis. He hails from Vancouver, Canada and received his Ph.D. from the University of Georgia. |
Keynote |
Inferring the spread of SARS-CoV-2 --- and measures to mitigate it With the second wave of COVID-19 unfolding in most European countries, it is worth looking back at the first wave and especially the past summer, when case numbers stayed low. I will present approaches to infer the effectiveness of interventions, and models to explore their potential. This talk will be interesting for anyone wanting to understand the current and potential future dynamics of COVID-19. |
|
Viola heads a research team at the Max Planck Institute for Dynamics and Self-Organization. She investigates the self-organization of spreading dynamics in the brain to understand the emergence of living computation. With the outbreak of COVID-19, she adapted these mathematical approaches to infer and predict the spread of SARS-CoV-2, and to investigate mitigation strategies. Viola is a board member of the Campus Institute for Data Science and a Fellow of the Schiemann Kolleg. |
Keynote |
These are a few of my favorite inference diagnostics I discuss some old and some more recent inference diagnostics methods for Markov chain Monte Carlo, importance sampling, and variational inference. When the convergence fails, I simply remember my favorite inference diagnostics, and then I don’t feel so bad. |
|
Aki is an Associate Professor in computational probabilistic modeling at Aalto University, Finland. His research interests include Bayesian probability theory and methodology, especially probabilistic programming, inference methods, model assessment and selection, and non-parametric models such as Gaussian processes, dynamic models, and hierarchical models. Aki is also a co-author of the popular, award-winning book "Bayesian Data Analysis" (Third Edition) and the brand new "Regression and Other Stories". He is also a core developer of the seminal probabilistic programming framework Stan. An enthusiast of open-source software, Aki has been involved in many free software projects such as GPstuff for Gaussian processes and ELFI for likelihood-free inference. |
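As a taste of the kinds of checks the talk covers, here is a minimal, purely illustrative sketch (toy model and invented data, not from the talk) of running standard MCMC diagnostics with PyMC3 and ArviZ:

```python
import numpy as np
import pymc3 as pm
import arviz as az

# Toy model and fake data, for illustration only.
data = np.random.normal(1.0, 2.0, size=200)

with pm.Model():
    mu = pm.Normal("mu", 0.0, 10.0)
    sigma = pm.HalfNormal("sigma", 5.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)
    idata = pm.sample(1000, tune=1000, return_inferencedata=True)

# A few common diagnostics: R-hat, effective sample size, divergence counts.
print(az.summary(idata, var_names=["mu", "sigma"]))
print("divergences:", int(idata.sample_stats["diverging"].sum()))
```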
Time zone: Americas | Track: Beginner |
---|---|
Talk |
Studying glycan 3D structures with PyMC3 and ArviZ Interest in circular variables spans a diverse array of applied fields, from social and political sciences to geology and biology. They are very useful for the statistical modelling of time, wind directions, bond angles between atoms, or even swimming patterns in fish. In this talk, I will present an introduction to circular variables, mostly related to my work on ArviZ during the last GSoC, and I will offer a glimpse of how to use PyMC3 and ArviZ to explore the 3D shapes of biomolecules. |
|
I am a PhD candidate in Biology. In my research, I apply Bayesian statistics to biomolecular structure determination and validation, i.e. finding the 3-dimensional shape of biomolecules and evaluating whether that shape is a good model. I enjoy contributing to open source software. I have participated in the Google Summer of Code program with PyMC3 and ArviZ and have recently joined the ArviZ core developer team. |
Let's Build a Model |
Learning Bayesian Statistics with Pokemon GO In the mobile game Pokemon GO, players occasionally encounter rare "shiny" Pokemon. The exact appearance rates are unknown, but by using Bayesian inference and PyMC3, we can model different species’ shiny rates. In this beginner-level tutorial, we will introduce fundamental principles at the heart of Bayesian modeling; then we will apply them to develop PyMC3 models that can answer questions about Pokemon GO. |
|
Tushar is a senior data scientist at Nielsen Global Media in Chicago. At Nielsen, he works on developing Bayesian models for next-generation audience measurement. He loves cats (living with two, Luna and Ruby), chai, and college football. This is his first conference talk! |
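A minimal sketch of the kind of model the tutorial builds, with made-up encounter counts (the real data and priors are part of the talk):

```python
import pymc3 as pm

# Hypothetical field data: total encounters and shiny encounters per species.
encounters = [1200, 850, 400]
shinies = [3, 2, 1]

with pm.Model() as shiny_model:
    # Shiny rates are known to be small, so use a prior concentrated near zero.
    rate = pm.Beta("rate", alpha=1.0, beta=200.0, shape=3)
    pm.Binomial("obs", n=encounters, p=rate, observed=shinies)
    trace = pm.sample(2000, tune=1000)
```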
Let's Build a Model |
Microbial cell counting in a noisy environment In this LBAM, we’ll introduce the microbiological task of cell counting and understand all the potential sources of error involved. We’ll model each source of error probabilistically, introduce priors, and then discuss inference on the posterior. Finally, we’ll explore how we can extend our model to use in a calibration curve for other instruments. Only basic probability theory is required for this LBAM. |
|
Cameron Davidson-Pilon has worked in many areas of applied statistics, from the evolutionary dynamics of genes to modeling of financial prices. His contributions to the community include lifelines, an implementation of survival analysis in Python, lifetimes, and Bayesian Methods for Hackers, an open source and printed book on Bayesian analysis. Formerly Director of Data Science at Shopify, Cameron is now applying data science to food microbiology. |
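A rough sketch of the idea with invented numbers (the session develops the error model much further): treat the unknown cell density as latent, let the dilution step carry its own error, and model plate counts as Poisson.

```python
import numpy as np
import pymc3 as pm

# Hypothetical colony counts from replicate plates of a diluted sample.
counts = [28, 33, 25, 30]
nominal_dilution = 1e-6  # intended dilution factor

with pm.Model() as counting_model:
    # True cell density (cells per mL) of the undiluted sample.
    density = pm.Gamma("density", alpha=2.0, beta=1e-7)
    # Pipetting error: the realized dilution wanders around the nominal value.
    dilution = pm.Lognormal("dilution", mu=np.log(nominal_dilution), sigma=0.1)
    # Each plate receives a Poisson number of countable cells.
    pm.Poisson("colonies", mu=density * dilution, observed=counts)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```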
Let's Build a Model |
The Bayesian Zig Zag: Developing and Testing PyMC Models Tools like PyMC make it easy to implement probabilistic models, but it is still challenging to develop and validate those models. In this talk, I present an incremental strategy for developing and testing models by alternating between forward and inverse probabilities and between grid algorithms and MCMC. I’ll use Poisson processes as an example, but this strategy applies to other probabilistic models. |
|
Allen Downey is a professor of Computer Science at Olin College and Visiting Lecturer at Ashesi University in Ghana. He is the author of a series of open-source textbooks related to software and data science, including Think Python, Think Bayes, and Think Complexity, which are also published by O’Reilly Media. His blog, Probably Overthinking It, features articles on Bayesian probability and statistics. He holds a Ph.D. in computer science from U.C. Berkeley, and M.S. and B.S. degrees from MIT. |
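To make the zig-zag concrete, here is a small sketch (toy goal counts invented for this program, not from the talk) of alternating between forward simulation and inference on a simple Poisson model:

```python
import pymc3 as pm

with pm.Model() as scoring_model:
    lam = pm.Gamma("lam", alpha=2.0, beta=1.0)              # goals per game
    goals = pm.Poisson("goals", mu=lam, observed=[2, 3, 1, 4, 0, 2])

    # Forward direction: what do the priors imply about observable data?
    prior_pred = pm.sample_prior_predictive(500)

    # Inverse direction: condition on the observed goals and sample the posterior.
    trace = pm.sample(2000, tune=1000)
    post_pred = pm.sample_posterior_predictive(trace)
```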
Talk |
The why and how of one domain-specific PyMC3 extension In this talk I will describe some of the unique challenges encountered in probabilistic modeling for astrophysics and some approaches taken to overcome these obstacles. In particular, I will discuss the motivation for and development of the domain-specific |
|
Dan is an Associate Research Scientist at the Flatiron Institute’s Center for Computational Astrophysics studying the application of probabilistic data analysis techniques to solve fundamental problems in astrophysics. |
Talk |
A Bayesian Approach to Media Mix Modeling This talk describes how we built a Bayesian Media Mix Model of new customer acquisition using PyMC3. We will explain the statistical structure of the model in detail, with special attention to nonlinear functional transformations, discuss some of the technical challenges we tackled when building it in a Bayesian framework, and touch on how we use it in production to guide our marketing strategy. |
|
Michael Johns is a data scientist at HelloFresh US. His work focuses on building statistical models for business applications, such as optimizing marketing strategy, customer acquisition forecasting and customer retention. |
|
Zhenyu Wang is a Senior Business Intelligence Analyst at HelloFresh International. He works on developing and implementing methods to measure the effectiveness of advertising campaigns using analytic and statistical methods. |
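As a flavour of the structure described above (not HelloFresh's actual model), a sketch with synthetic data and one simple choice of saturating transform:

```python
import numpy as np
import pymc3 as pm

# Synthetic weekly data: spend in two channels and new-customer counts.
spend = np.random.rand(104, 2) * 1000.0
customers = np.random.poisson(200, size=104)

def saturate(x, alpha):
    # Diminishing returns: one of many possible saturating transforms.
    return 1.0 - pm.math.exp(-alpha * x)

with pm.Model() as media_mix:
    base = pm.HalfNormal("base", sigma=100.0)              # baseline acquisition
    beta = pm.HalfNormal("beta", sigma=100.0, shape=2)     # channel effectiveness
    alpha = pm.HalfNormal("alpha", sigma=0.01, shape=2)    # saturation speed
    mu = base + (beta * saturate(spend, alpha)).sum(axis=-1)
    pm.Poisson("customers", mu=mu, observed=customers)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```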
Talk |
What is probability? A philosophical question with practical implications for Bayesians This talk will familiarize you with the philosophical questions of probability and the implications when it comes to justifying and explaining Bayesian models. |
|
Max Sklar is a machine learning engineer and a member of the innovation labs team at Foursquare. He hosts a weekly podcast called The Local Maximum which covers a broad range of current issues, including a focus on Bayesian Inference. |
Time zone: Americas | Track: Advanced |
---|---|
Talk |
A Novel Bayesian Model to Fit Spectrophotometric Data of Hubble and Spitzer Space Telescopes Understanding how the most massive galaxies rapidly formed and quenched when the Universe was only ~3 billion years old is one of the major challenges of extragalactic astronomy. In this talk, I will discuss how to improve our understanding of massive galaxy formation by combining the spectro-photometric observations of the Hubble and Spitzer Space Telescopes for strongly gravitationally lensed galaxies. In particular, a multi-level regression model is built that can fit all multi-wavelength data for a range of instruments within a hierarchical Bayesian framework to constrain the properties of the stellar populations. The details of how this model is implemented using PyMC3, as well as the estimates of the posteriors of all parameters of interest and nuisance parameters, will be highlighted. |
|
Mo is a grad student in (astro)physics by day, and a math geek and Bayesian enthusiast all along. Broadly interested in cosmology and probability too. |
Talk |
Sequential Monte Carlo: Introduction and diagnostics In this talk we will provide a brief introduction to Sequential Monte Carlo (SMC) methods and provide a guide to diagnose posterior samples computed using SMC. |
|
Osvaldo is a researcher at the National Scientific and Technical Research Council in Argentina and is notably the author of the book Bayesian Analysis with Python, whose second edition was published in December 2018. He also teaches bioinformatics, data science and Bayesian data analysis, is a core developer of PyMC3 and ArviZ, and recently started contributing to Bambi. Originally a biologist and physicist, Osvaldo taught himself Python and Bayesian methods, and what he’s doing with them is pretty amazing! |
|
In 2014 I completed my B.S. in Molecular Biology at the National University of San Luis, Argentina, and in 2020 I finished my PhD at the Institute of Applied Mathematics (IMASL) while working within the Structural Bioinformatics Group (BIOS). My PhD thesis centered on the use of a statistical mechanics model to simulate biologically relevant systems of peptide-lipid interactions. Currently I’m doing my postdoc alongside Dr. Osvaldo Martin on probabilistic modeling and Sequential Monte Carlo. |
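A minimal illustration (toy regression, invented here) of drawing posterior samples with PyMC3's SMC sampler and then running the usual ArviZ summaries on them:

```python
import numpy as np
import pymc3 as pm
import arviz as az

x = np.linspace(0, 1, 50)
y = 1.0 + 2.0 * x + np.random.normal(0, 0.2, size=50)

with pm.Model():
    a = pm.Normal("a", 0, 5)
    b = pm.Normal("b", 0, 5)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu=a + b * x, sigma=sigma, observed=y)

    # SMC: a tempered sequence of distributions from the prior to the posterior.
    trace = pm.sample_smc(2000)
    print(az.summary(trace))
```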
Talk |
Bayesian Machine Learning: A PyMC-Centric Introduction At the heart of any machine learning (ML) problem is the identification of models that explain the data well, where learning about the model parameters, treated as random variables, is integral. Bayes’ theorem, and in general Bayesian learning, offers a principled framework to update one’s beliefs about an unknown quantity; Bayesian methods therefore play an important role in many aspects of ML. This introductory talk aims to highlight some of the most prominent areas in Bayesian ML from the perspective of statisticians and analysts, drawing parallels between these areas and common problems that Bayesian statisticians work on. |
|
Quan is a Bayesian statistics enthusiast (and a programmer at heart). He is the author of several programming books on Python and scientific programming. Quan is currently pursuing a Ph.D. in computer science at Washington University in St. Louis, researching Bayesian methods in machine learning. |
Time zone: Africa/Asia/Europe | Track: Beginner |
---|---|
Talk |
calibr8: Going beyond linear ranges with non-linear calibration curves and multilevel modeling You just coded up a beautiful model and the dummy predictions look great. Now comes the data, but wait: the units don’t match! And to make matters worse, the correlation between model variable and measurement readout is non-linear and heteroscedastic! Sound familiar? Non-linear calibration to the rescue! With |
|
A biotechnologist by training, Laura transitioned to Data Science in recent years and is now a Bayesian enthusiast. In her Master’s thesis, she actually collected the data Michael was using for his fancy Bayesian models. During her wet lab experience, Laura gained valuable knowledge on microorganisms and biological processes that she is now applying to implement mechanistic process models. Her experimental work also gave her the motivation to focus on lab automation for bioprocess development in her PhD at Forschungszentrum Jülich. |
|
Michael Osthege is a biotech Bayesian by choice. He likes to work with robots, bacteria and models as much as he loves to work in enthusiastic teams. As a PhD student in laboratory automation for bioprocess development at Forschungszentrum Jülich, he writes software to make robots generate his data. Since he unit-tests his code, he always blames the robots if the data doesn’t agree with his Bayesian models. |
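The general idea, sketched with invented readings and a generic saturating curve (this is plain PyMC3, not the calibr8 API):

```python
import numpy as np
import pymc3 as pm

# Hypothetical calibration data: known concentrations vs. instrument readouts.
concentration = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
readout = np.array([0.02, 0.10, 0.19, 0.35, 0.58, 0.80, 0.90])

with pm.Model() as calibration:
    # A saturating (non-linear) calibration curve instead of a straight line.
    top = pm.HalfNormal("top", 2.0)
    k = pm.HalfNormal("k", 1.0)
    mu = top * (1.0 - pm.math.exp(-k * concentration))
    # Heteroscedastic noise: measurement error grows with the signal.
    sigma = pm.HalfNormal("sigma_base", 0.05) + pm.HalfNormal("sigma_rel", 0.1) * mu
    pm.Normal("obs", mu=mu, sigma=sigma, observed=readout)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```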
Talk |
Demystifying Variational Inference What will you do if MCMC is taking too long to sample? Also what if the dataset is huge? Is there any other cost-effective method for finding the posterior that can save us and potentially produce similar results? Well, you have come to the right place. In this talk, I will explain the intuition and maths behind Variational Inference, the algorithms capturing the amount of correlation, out of the box implementations that we can use, and ultimately diagnosing the model to fit our use case. |
|
Sayam Kumar is a Computer Science undergraduate student at IIIT Sri City, India. He loves to travel and study maths in his free time. He also finds Bayesian statistics super awesome. He was a Google Summer of Code student with the NumFOCUS community and contributed towards adding Variational Inference methods to PyMC4. |
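For orientation, a minimal sketch (toy data) of fitting a PyMC3 model with ADVI instead of MCMC; "fullrank_advi" is the variant that also captures posterior correlations:

```python
import numpy as np
import pymc3 as pm

# A large toy dataset where full MCMC would be slow.
data = np.random.normal(2.0, 1.5, size=100000)

with pm.Model():
    mu = pm.Normal("mu", 0.0, 10.0)
    sigma = pm.HalfNormal("sigma", 5.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)

    # Mean-field ADVI; swap in method="fullrank_advi" to model correlations.
    approx = pm.fit(n=20000, method="advi")
    trace = approx.sample(2000)   # draws from the fitted approximation
```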
Tutorial |
Partially missing multivariate observations and what to do with them Missing values are pretty common in any real-world data set. While PyMC3 provides convenient automatic imputation, how do we verify that it works, especially when dealing with multivariate observations with partially missing values? Come to this tutorial to find out! |
|
Junpeng Lao is a PyMC developer and currently a data scientist at Google. He also contributes to TensorFlow Probability and various other open source libraries. |
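The starting point for the tutorial is PyMC3's automatic imputation via masked arrays, sketched below on toy univariate data (the partially missing multivariate case is where the tutorial goes further):

```python
import numpy as np
import pymc3 as pm

# Toy data with missing entries, marked via a masked array.
raw = np.array([4.2, np.nan, 5.1, 4.8, np.nan, 5.5])
data = np.ma.masked_invalid(raw)

with pm.Model():
    mu = pm.Normal("mu", 0.0, 10.0)
    sigma = pm.HalfNormal("sigma", 5.0)
    # Masked entries become latent variables, sampled alongside mu and sigma.
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)
    trace = pm.sample(2000, tune=1000)

print(trace["obs_missing"].mean(axis=0))   # posterior means of the imputed values
```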
Talk |
Posterior Predictive Sampling in PyMC3 PyMC3 is great for inferring parameter values in a model given some observations, but sometimes we also want to generate random samples from the model as predictions given what we already inferred from the observed data. This kind of sampling is called posterior predictive sampling, and it can be very hard. The typical problems that show up are related to shape mismatches in hierarchical models, latent categorical values that aren’t correctly re-sampled or changing the shape of the data between the training and test phases. In this presentation I’ll talk about how posterior predictive sampling is implemented in PyMC3, show some typical situations where it fails, and how to make it work. |
|
I got into Bayesian stats during my PhD in cognitive neuroscience. During my postdoc I got more involved with machine learning, and discovered PyMC3. I became a core contributor of PyMC, learnt a lot in the process and made up my mind to pursue a career outside of academia. I am now a machine learning engineer at Innova SpA in Italy. |
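A common pattern the talk revisits, sketched here with toy data: register the inputs with pm.Data, fit the model, then swap in test inputs before calling pm.sample_posterior_predictive.

```python
import numpy as np
import pymc3 as pm

x_train = np.linspace(0, 1, 50)
y_train = 1.0 + 2.0 * x_train + np.random.normal(0, 0.3, size=50)
x_test = np.linspace(1, 2, 20)

with pm.Model() as model:
    x = pm.Data("x", x_train)          # mutable containers so the test inputs
    y = pm.Data("y", y_train)          # can be swapped in after fitting
    a = pm.Normal("a", 0, 5)
    b = pm.Normal("b", 0, 5)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", mu=a + b * x, sigma=sigma, observed=y)
    trace = pm.sample(1000, tune=1000)

# Swap in the test inputs (plus a dummy target of matching shape) and predict;
# shape mismatches at this step are one of the failure modes the talk covers.
with model:
    pm.set_data({"x": x_test, "y": np.zeros_like(x_test)})
    preds = pm.sample_posterior_predictive(trace)
```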
Let's Build a Model |
Building an ordered logistic regression model for toxicity prediction We will build a simple but useful ordered logistic regression model to predict severity of drug-induced liver injury (DILI) from in vitro data and physicochemical properties of compounds. |
|
Elizaveta is currently a postdoc in Bayesian Machine Learning at a pharmaceutical company. Her interests span Gaussian Processes, Bayesian Neural Networks, compartmental models and differential equations with applications in epidemiology and toxicology. She is tool agnostic and builds probabilistic models in either Stan, PyMC3 or Turing. |
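A bare-bones version of such a model, with invented features and severity labels (the session uses real DILI data and a richer predictor set):

```python
import numpy as np
import pymc3 as pm

# Hypothetical data: two in-vitro features and a three-level severity label (0, 1, 2).
X = np.random.randn(200, 2)
severity = np.random.randint(0, 3, size=200)

with pm.Model() as dili_model:
    beta = pm.Normal("beta", 0.0, 1.0, shape=2)
    # Ordered cutpoints separating the severity classes.
    cutpoints = pm.Normal(
        "cutpoints", mu=[-1.0, 1.0], sigma=2.0, shape=2,
        transform=pm.distributions.transforms.ordered,
    )
    eta = pm.math.dot(X, beta)
    pm.OrderedLogistic("obs", eta=eta, cutpoints=cutpoints, observed=severity)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```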
Talk |
My Journey in Learning and Relearning Bayesian Statistics My journey in learning (and relearning) Bayesian methods as a computer scientist |
|
A data scientist and a lecturer. Learning/teaching data science, machine learning, and artificial intelligence. |
Talk |
Estimating the Causal Network of Developmental Neurotoxicants using PyMC3 There is a vital need for alternative methods to animal testing to assess compounds for their potency of inducing developmental neurotoxicity, such as learning disabilities in children. However, data are often limited and complex in structure. Therefore, Bayesian approaches are well suited to unravelling their meaning and creating predictive models. In this talk, I will showcase a multilevel probabilistic model and outline how to deal with unbalanced, correlated and missing values. This presentation will be of interest to those wanting to learn multilevel modelling in PyMC3, how to deal with missing values for both predictors and outcomes of data matrices, and their application to a real problem in toxicology. |
|
Nicoleta Spînu is a PhD candidate in Computational Toxicology with a background in pharmaceutical sciences and regulatory affairs looking to have her own impact on the protection of human health while promoting animal welfare (Replacement, Reduction and Refinement of animal testing; “the 3Rs”). Research interests include the science of network and causal inference, computational modelling of chemical toxicity, and regulatory toxicology and policy making. |
Talk |
Automatic transformation of Bayesian probabilistic models into interactive visualizations Models expressed in a probabilistic programming language are translated automatically into interactive multiverse diagrams, a graphical representation of the model’s structure at varying levels of granularity, with seamless integration of uncertainty visualisation. A concrete implementation in Python that translates probabilistic programs to interactive multiverse diagrams will be presented and illustrated by examples for a variety of Bayesian probabilistic models. |
|
Evdoxia has been a PhD student at the School of Computing Science of the University of Glasgow since 2019. Her research focuses on the creation of novel representations of probabilistic models that incorporate animation and interaction for a more intuitive communication of the uncertainty in the variables of probabilistic models. She has been a Python and Bayesian enthusiast ever since she started her PhD and got a foot in the door of a whole new (to her) but very charming world. Evdoxia completed her undergraduate and master’s studies at the Aristotle University of Thessaloniki, Greece, as an Electrical and Computer Engineer. She worked as a Research Assistant at the Centre for Research & Technology Hellas in Thessaloniki, contributing to various national- and EU-funded research projects in areas such as computer vision, 3D reconstruction and simulation, and machine learning. She has also worked as a Research Database Engineer for the HCV Research UK project at the Centre for Virus Research of the University of Glasgow. |
Let's Build a Model |
The Bayesian Workflow: Building a COVID-19 model In this tutorial we will build a COVID-19 model from scratch. |
|
Thomas is the founder of PyMC Labs, a Bayesian consulting firm. |
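As a hint of the starting point (the tutorial builds something considerably more realistic, e.g. with time-varying spread), a toy exponential-growth model fit to invented daily case counts:

```python
import numpy as np
import pymc3 as pm

# Invented daily case counts for the first days of an outbreak.
cases = np.array([12, 15, 20, 28, 34, 50, 62, 88, 110, 160])
t = np.arange(len(cases))

with pm.Model() as covid_model:
    # Early-phase exponential growth: expected cases ~ I0 * exp(r * t).
    I0 = pm.HalfNormal("I0", 20.0)
    r = pm.Normal("growth_rate", 0.2, 0.1)
    mu = I0 * pm.math.exp(r * t)
    # Overdispersed count likelihood to absorb reporting noise.
    alpha = pm.HalfNormal("alpha", 10.0)
    pm.NegativeBinomial("obs", mu=mu, alpha=alpha, observed=cases)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```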
Tutorial |
A Tour of Model Checking techniques Have you ever written a model in PyMC3 and aren’t sure if it’s any good? In this talk I will show you the many ways you can evaluate how well your model fits your data using PyMC3. Not all of these techniques may be applicable to your particular problem, but you will definitely walk away with a few new tricks for being confident in the models you fit. |
|
Rob Zinkov is a PhD student at the University of Oxford. His research covers how to more efficiently specify and train deep generative models, as well as how to more effectively discover a good statistical model for your data. Previously he was a research scientist at Indiana University, where he was the lead developer of the Hakaru probabilistic programming language. |
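A few checks in this spirit, sketched on a toy model (posterior predictive checks, LOO cross-validation, and sampler diagnostics via ArviZ):

```python
import numpy as np
import pymc3 as pm
import arviz as az

y = np.random.normal(1.0, 2.0, size=100)

with pm.Model() as model:
    mu = pm.Normal("mu", 0, 10)
    sigma = pm.HalfNormal("sigma", 5)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=y)
    trace = pm.sample(1000, tune=1000)
    ppc = pm.sample_posterior_predictive(trace)
    idata = az.from_pymc3(trace, posterior_predictive=ppc)

az.plot_ppc(idata)        # posterior predictive check
print(az.loo(idata))      # approximate leave-one-out cross-validation
print(az.summary(idata))  # R-hat, effective sample size, etc.
```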
Time zone: Africa/Asia/Europe | Track: Advanced |
---|---|
Let's Build a Model |
Using Hierarchical Multinomial regression to predict elections in Paris at the district-level Predicting elections in Paris with hierarchical multinomial regression |
|
|
Let's Build a Model |
Hierarchical time series with Prophet and PyMC3 When doing time-series modelling, you often end up in a situation where you want to make long-term predictions for multiple related time series. In this talk, we’ll build a hierarchical version of Facebook’s Prophet package to do exactly that. |
|
I’m a data scientist based in Amsterdam, The Netherlands. My current work involves training junior data scientists at Xccelerated.io. This means I divide my time between building new training materials and exercises, giving live trainings and acting as a sparring partner for the Xccelerators at our partner firms, as well as doing some consulting work on the side. I spend a fair amount of time contributing to the open scientific computing ecosystem through various means. I maintain open source packages (scikit-lego, seers) as well as co-chair the PyData Amsterdam conference and meetup and vice-chair the PyData Global conference. In my spare time I like to go mountain biking, bouldering, do some woodworking or go scuba diving. |
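The core trick, sketched with synthetic data (Prophet-style changepoints and seasonality are what the talk layers on top): partially pool the trend parameters across related series.

```python
import numpy as np
import pymc3 as pm

# Synthetic example: 5 related series, 52 weekly observations each.
n_series, n_weeks = 5, 52
t = np.tile(np.arange(n_weeks), n_series)
series = np.repeat(np.arange(n_series), n_weeks)
y = np.random.normal(10 + 0.1 * t, 1.0)

with pm.Model() as hierarchical_trend:
    # Population-level growth rate with per-series offsets (partial pooling).
    k_mu = pm.Normal("k_mu", 0.0, 1.0)
    k_sd = pm.HalfNormal("k_sd", 1.0)
    k = pm.Normal("k", mu=k_mu, sigma=k_sd, shape=n_series)
    m = pm.Normal("m", 10.0, 5.0, shape=n_series)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("obs", mu=m[series] + k[series] * t, sigma=sigma, observed=y)
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```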
Talk |
An alcohol? What are the chances! Knowledge-based and probabilistic models in chemistry using PyMC3 We have used PyMC3 to formulate an explainable probabilistic model of chemical reactivity. This probabilistic model combines the intuitive concepts of high school chemistry with the computer’s ability to store and reason about large datasets. We use our model in the lab, where it guides a robot chemist towards "interesting" experiments that might lead to the discovery of new reactions. |
|
Dario Caramelli is a research associate in the Cronin group at the University of Glasgow. His research involves building and programming autonomous robots for reaction discovery, as well as developing algorithms for chemical space modelling and data processing. Dario obtained a Master’s degree in Organic Chemistry in Rome (2015) and a PhD in the Cronin group (2019). |
|
Hessam Mehr is a research associate in the Cronin group at the University of Glasgow’s School of Chemistry. He works with an interdisciplinary group of scientists and engineers to build robots and teach them how to do chemistry. Since he joined the group in 2018, Hessam’s main focus has been the integration of probabilistic reasoning with chemical robotics and discovery. |
Talk |
The MLDA multilevel sampler in PyMC3 This presentation will give you the chance to know more about PyMC3’s new multilevel MCMC sampler, MLDA, and help you use it in practice. MLDA exploits multilevel model hierarchies to improve sampling efficiency compared to standard methods, especially when working with high-dimensional problems where gradients are not available. We will present a step-by-step guide on how to use MLDA within PyMC3, go through its various features and also present some advanced use cases, e.g. employing multilevel PDE-based models written in FEniCS and using adaptive error correction to correct model bias between different levels. |
|
Prof. Tim Dodwell holds a personal chair in Computational Mechanics at the University of Exeter, is a Romberg Visiting Scholar in Scientific Computing at Heidelberg, and holds a 5-year Turing AI Fellowship at the Alan Turing Institute, where he is also an academic lead. |
|
Mikkel Lykkegaard is a PhD student with the Data Centric Engineering Group and Centre for Water Systems (CWS) at University of Exeter. His research is mainly concerned with Uncertainty Quantification (UQ) for computationally intensive forward models. |
|
Dr. Grigorios Mingas is a Senior Research Data Scientist at The Alan Turing Institute. He received his PhD from Imperial College London, where he co-designed MCMC algorithms and hardware to accelerate Bayesian inference. He has experience in a wide range of projects as a data scientist. |
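A toy sketch of the calling pattern, assuming a PyMC3 version that ships MLDA (roughly 3.10+); the "fine" and "coarse" models here are just cheap linear stand-ins for real multilevel forward models, and the exact arguments may differ from the talk's examples:

```python
import numpy as np
import pymc3 as pm

# Fine data and a cheap "coarse" version of it (every 10th point).
x = np.linspace(0, 1, 200)
y = 2.0 * x + np.random.normal(0, 0.1, size=x.size)

with pm.Model() as coarse_model:
    theta = pm.Normal("theta", 0.0, 10.0)
    pm.Normal("y", mu=theta * x[::10], sigma=0.1, observed=y[::10])

with pm.Model() as fine_model:
    theta = pm.Normal("theta", 0.0, 10.0)   # same free variable names on all levels
    pm.Normal("y", mu=theta * x, sigma=0.1, observed=y)
    # Coarse chains generate proposals; the fine level accepts or rejects them.
    step = pm.MLDA(coarse_models=[coarse_model], subsampling_rates=[5])
    trace = pm.sample(2000, tune=1000, step=step, cores=1)
```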
Talk |
Using hierarchical models in instrumental variable analysis for advertising effectiveness Due to unobserved confounders, users are often exposed to too many repetitive ads. We will show how we use instrumental variable analysis to prove this is ineffective for advertisers. The focus of the talk will be choosing the model assumptions and how to implement them in PyMC3. Finally, we show how hierarchical modelling can be used to combine these models. |
|
Back in 2012, Ruben introduced data science at Greenhouse, a digital advertising agency in the Netherlands. He is currently principal data scientist and cluster lead. He’s given several talks at PyData conferences and is one of the founders of PyData Eindhoven. |
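One generic way to write such a model in PyMC3 (a sketch with simulated data, not the production model): model exposure and outcome jointly, letting correlated errors absorb the unobserved confounding while an instrument z identifies the causal effect.

```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# Simulated data: instrument z, ad exposure x, outcome y, with confounding.
n = 1000
confounder = np.random.normal(size=n)
z = np.random.binomial(1, 0.5, size=n).astype(float)
x = 0.8 * z + confounder + np.random.normal(0, 0.5, size=n)
y = 0.3 * x - confounder + np.random.normal(0, 0.5, size=n)

with pm.Model() as iv_model:
    gamma = pm.Normal("gamma", 0.0, 1.0)      # instrument -> exposure (first stage)
    beta = pm.Normal("beta", 0.0, 1.0)        # causal effect of exposure on outcome
    intercept = pm.Normal("intercept", 0.0, 1.0, shape=2)
    # Correlated residuals capture the unobserved confounder.
    chol, _, _ = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0, sd_dist=pm.HalfNormal.dist(1.0), compute_corr=True
    )
    mu = tt.stack([intercept[0] + gamma * z, intercept[1] + beta * x], axis=1)
    pm.MvNormal("obs", mu=mu, chol=chol, observed=np.stack([x, y], axis=1))
    trace = pm.sample(2000, tune=2000, target_accept=0.9)
```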
Let's Build a Model |
Priors of Great Potential - How you can add Fairness Constraints to Models using Priors. |
|
Vincent likes to spend his days debunking hype in ML. He started a few open source packages (whatlies, scikit-lego, clumper and evol) and is also known as co-founding chair of PyData Amsterdam. He currently works at Rasa as a Research Advocate where he tries to make NLP algorithms more accessible. |
Tutorial |
Including partial differential equations in your PyMC3 model This tutorial will demonstrate use of PyMC3 for PDE-based inverse problems. We will infer parameters of a simple continuum mechanics model but the demonstrated tools can be readily applied to other complex PDE-based models. |
|
Ivan Yashchuk has 3 years’ experience in computational mechanics and scientific computing with occasional contributions to OSS projects. He received his M.Sc. in Computational Mechanics from Aalto University, Finland and is currently doing PhD research in Probabilistic Machine Learning group at Aalto. |