Astrostatistics sessions at JSM 2020
INDEX  Aug 3: AIG meeting  Aug 3:11 Solar & Geo  Aug 4:253 Large Public Data  Aug 4:265 Astro & Space Physics  Aug 5:401 Student Paper Award  Aug 6:431 Astronomical(ly) Big Data  Aug 6:539 Signal detection  Other events 
We will use an Astrostatistics Interest Group Slack channel to ease communication between the audience and the speakers at the various sessions. In order to be added to this channel, please contact one of the office bearers or write to aigamstat @ gmail.
AIG Annual Meeting
Mon Aug 3, 2020, 1:00pm - 2:30pm EDT
The business meeting of the AIG will be held on the afternoon of Aug 3 via Zoom.
Agenda .pdf
 Overview and Census
 Charter
 Amendments on language, Program Chair term, and Webmaster
 Elections for next year
 Section vs Interest Group
 Activities
 Sessions during this JSM
 Student Paper Award
 AIG Virtual Table
 AIG Mixer on Wed Aug 5, 5-6pm via Zoom; see Slack Channel for connection information
 Planning for next year
 Other events
 External Coordination
 Web presence
 Website, contact email, and email exploder
 Social media: Slack, Twitter, etc.
 Logo
Session 11
Mon Aug 3, 2020, 10:00am - 11:50am EDT
219357 Statistical Inference for Solar and Geophysical Data – Invited Papers
Section on Physical and Engineering Sciences, Section on Nonparametric Statistics, Astrostatistics Special Interest Group
Organizer(s): Gwendolyn M Eadie, University of Toronto
Chair(s): David van Dyk, Imperial College London

10:05 AM
Multitaper Analysis of High-Q Spectral Peaks and Nonstationarity in the Geomagnetic Field over the 400-4000 microHz Band
Alan Chave, Woods Hole Oceanographic Institution
Three 60 d sections of geomagnetic data from Honolulu Observatory during 2001-2 were analyzed using multitaper spectral analysis, showing the ubiquitous presence of narrowband, very statistically significant, high-Q features in multitaper power spectra and pervasive nonstationarity as measured by the frequency offset coherence over 400-4000 microHz. The peak frequencies correlate well with the optically-measured frequencies of solar p-modes, and the raw Qs are defined by the resolution bandwidths of the estimates, with values ranging from 100s to 1000s. Further, spectral peaks are consistently coherent across frequency due to nonstationarity, and frequently exhibit cyclostationarity at offset frequencies of +0.5 and +1 cpd.
10:30 AM
Hitting a Moving Target: Modelling Non-Stationary Relationships in Geomagnetism
David Riegert, Queen’s University; David J Thomson, Queen’s University
This talk focuses on modelling the relationship between Earth’s magnetic field as a predictor and induced currents in the ground as a response, a field of study known as magnetotellurics. Current modelling approaches assume a stationary relationship; however, energy transfer between frequencies, indicating nonstationarity in the process, has been investigated in the univariate setting using seismic, ocean pressure (Chave et al., 2019), and geomagnetic measurements (Chave et al., 2018; Riegert & Thomson, 2018). Nonstationarity in a time-series process provides strong evidence for a nonstationary relationship between that series and any other. Current models are discussed and an extension is introduced which aims to account for violations in the assumption of stationarity.
10:55 AM
Solar flare prediction with machine learning
Yang Chen, University of Michigan
We present our machine learning efforts, which show great promise towards early predictions of solar flare events. First, we present a data preprocessing pipeline that is built to extract useful data from multiple sources – Geostationary Operational Environmental Satellites (GOES) and Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI) and SDO/Atmospheric Imaging Assembly (AIA) – to prepare inputs for machine learning algorithms. Second, we adopt deep learning algorithms to extract/select features from raw HMI and AIA data. Third, we train machine learning models that capture both the spatial and temporal information from HMI magnetogram data for strong/weak flare classification and for predictions of flare intensities. Fourth, we show that using the ML-derived features gives almost as good performance as using active region parameters provided in HMI data files, i.e. features manually constructed based on physical principles. Last, case studies show a significant increase in the prediction score around 20 hours before strong solar flare events, which implies that early precursors appear at least 20 hours prior to the peak of a flare event.
11:20 AM
Effect of Systematic Uncertainties on Density and Temperature Estimates in Coronae of Capella
Xixi Yu, Imperial College London; David van Dyk, Imperial College London; David Stenning, Imperial College London; Vinay Kashyap, Center for Astrophysics | Harvard & Smithsonian; Giulio Del Zanna, Centre for Mathematical Sciences, University of Cambridge
Information about the physical properties of astrophysical objects cannot be measured directly but is inferred by interpreting spectroscopic observations in the context of atomic physics calculations. A critical component of this analysis is understanding how uncertainties in the underlying atomic physics propagate to the uncertainties in the inferred plasma parameters. Instead of using the standard approach, a common strategy deployed by astrophysicists that treats the uncertainty as fixed and known and obtains the best-fit values of the parameters, we propose a multistage analysis to prevent underestimation of the error bars on the model parameters and increase the accuracy of the analysis results. A case study on Fe XVII and O VII/VIII is discussed where we implement both a pragmatic Bayesian method where atomic physics information is unaffected by observed data, and a fully Bayesian method where the data can be used to probe physics, and in particular detail a method of summarizing atomic uncertainties using principal components analysis.
11:45 AM
Floor Discussion
Session 253
Tue Aug 4, 2020, 1:00pm - 2:50pm EDT
219418 Innovations in AstroStatistics on Exploring Large Public Data – Invited Papers
Astrostatistics Special Interest Group, Section on Physical and Engineering Sciences, Section on Statistical Learning and Data Science
Organizer(s): Hyungsuk Tak, Pennsylvania State University
Chair(s): Hyungsuk Tak, Pennsylvania State University

1:05 PM
Handling Model Uncertainty via Smoothed Inference
Sara Algeri, University of Minnesota
Classical inferential methods often rely on the assumption that one among the models specified under the null or alternative hypothesis provides a suitable representation of the data under study. Unfortunately, when conducting searches for new physics, the specification of a correct model for the data is not always an easy task. Consequently, the validity and the sensitivity of the experiment under study may be substantially compromised. Algeri (2020) introduced a novel statistical approach to perform modeling, estimation, and inference under background mismodeling for large samples in the continuous setting. This work aims to extend the framework proposed in Algeri (2020) to arbitrary large samples from continuous or discrete distributions. 
1:30 PM
Improving Exoplanet Detection Power: Multivariate Gaussian Process Models for Stellar Activity
David Edward Jones, Texas A&M University; David Stenning, Imperial College London; Eric B Ford, Penn State University; Robert L Wolpert, Duke University; Thomas J Loredo, Cornell University; Xavier Dumusque, Observatoire Astronomique de l’Universite de Geneve
The radial velocity technique is one of the two main approaches for detecting planets outside our solar system, often referred to as exoplanets. When a planet orbits a star its gravitational force causes the star to move and this induces a Doppler shift (i.e. the star light appears redder or bluer than expected), and it is this effect that the radial velocity method attempts to detect. Unfortunately, these Doppler signals are typically contaminated by various stellar activity phenomena, such as dark spots on the star surface. We propose a Gaussian process modeling framework to capture this stellar activity and thereby improve detection power for low-mass planets (e.g., Earth-like planets). Our approach builds on previous work in two ways: (i) we use dimension reduction techniques to construct data-driven stellar activity proxies, as opposed to using traditional activity proxies; (ii) we extend the multivariate Gaussian process model of Rajpaul et al. (2015) to a class of models and use a large-scale model selection procedure to find the best model for the particular proxies at hand. Our method results in substantially improved power for planet detection.
1:55 PM
Disentangling Stellar Activity and Planetary Signals using Bayesian High-dimensional Analysis
Bo Ning, Yale University; Jessi Cisewski-Kehe, Yale University; Allen Davis, Yale University; Sarah Dodson-Robinson, University of Delaware; Debra Fischer, Yale University; Parker Holzer, Yale University; Alexander Wise, Penn State University
With the development of third-generation high precision spectrometers (e.g., the EXtreme PREcision Spectrometer, EXPRES), stellar activity has become the dominant background noise that can lead to false discoveries or poor mass estimates of small planets. Recent efforts have focused on finding stellar activity-sensitive lines in a given set of spectra. Since there are ~10^5 features in a typical spectrum, finding those lines can be challenging and time-consuming with the proposed line-by-line search approaches. In this talk, a Bayesian variable selection method is introduced to automatically search for activity-sensitive lines through pixels from a set of spectra. We applied this method to study the spectra of alpha Centauri B from HARPS. The results are promising. We identified not only many well-known lines that are sensitive to activity, but also several new lines. With stellar activity being the largest source of variability for next-generation RV spectrographs, this work is a step toward accessing the myriad information available in high-precision spectra.
2:20 PM
Floor Discussion
Session 265
Tue Aug 4, 2020, 1:00pm - 2:50pm EDT
219617 Innovations in Statistics for Astronomy & Space Physics – Topic Contributed Papers
SSC (Statistical Society of Canada), Section on Physical and Engineering Sciences, Astrostatistics Special Interest Group
Organizer(s): Gwendolyn M Eadie, University of Toronto
Chair(s): David J Thomson, Queen’s University

1:05 PM
The Photometric LSST Astronomical Time Series Classification Challenge (PLAsTiCC)
Renee Hlozek, University of Toronto
The Legacy Survey of Space and Time (LSST) on the Rubin Observatory will generate a data deluge: millions of astrophysical transients and variable sources will need to be classified from their time series light curves alone. Photometric classification has long been a problem of interest in the astronomical community, but the Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC) brings a wide range of models together, simulated under LSST-like observing conditions for the first time. PLAsTiCC was delivered to the community through a Kaggle challenge, designed to stimulate interest in time-series photometric classification and deliver methodologies that will advance the LSST science case. We will give an overview of the road to PLAsTiCC, present and analyze the results of the PLAsTiCC challenge and metrics used to evaluate the challenge, and discuss lessons learned in presenting science-specific challenges to the broader computational and statistical communities.
1:25 PM
Gibbs Point Process Model for Objects in the Star Formation Complexes of M33
Dayi Li, Western University; Pauline Barmby, Western University; Ian McLeod, Western University
We demonstrate the power of Gibbs point process models in the spatial statistics literature when applied to stellar population studies. We conduct a rigorous analysis of the empirical spatial distributions of objects in the star formation complexes of M33, including giant molecular clouds (GMCs), and young stellar cluster candidates (YSCCs). We choose a hierarchical model structure from GMCs to YSCCs based on the natural formation hierarchy between them. This approach circumvents the limitations of the empirical two-point correlation function analysis by naturally accounting for the inhomogeneity present in the distribution of YSCCs. We also investigate the effects of GMCs’ properties on their spatial distributions. We confirm that the distributions of GMCs and YSCCs are highly correlated. We find that the spatial distribution of YSCCs reaches a peak of clustering pattern at ~250 pc scale compared to a Poisson process and this clustering mainly occurs at regions where the galactocentric distance >~4.5 kpc. Furthermore, the galactocentric distance of GMCs and the mass of GMCs have a strong positive effect on the correlation strength between GMCs and YSCCs.
1:45 PM
Likelihood-free Inference of Chemical Homogeneity in Open Clusters
Aarya Patil; Jo Bovy, University of Toronto
Stellar clusters are excellent astrophysical laboratories to study because we expect stars in a cluster to have similar chemistry due to the standard assumption that they are born out of the same molecular cloud at the same time. Recent efforts to constrain initial abundance spread of different elements in open clusters of stars have employed Approximate Bayesian Computation to approximate the posterior probability distribution of the scatter in each element using stellar spectral data. Density-estimation likelihood-free inference methods turn inference into a density estimation task, and give orders-of-magnitude improvements over traditional ABC approaches. We illustrate accurate and efficient inference of elemental abundances on a set of synthetic spectra using an ensemble of Neural Density Estimators. We use compression to tackle the curse of dimensionality and to remove instrumental noise in the APOGEE spectral data. We believe that fast high-fidelity posterior inference will bring the power of differential abundances to stars in large spectroscopic surveys and help unravel the history of star formation and chemical enrichment in the Milky Way through chemical tagging.
2:05 PM
Bayesian Inference and Computation for Old Star Clusters
Gwendolyn M Eadie, University of Toronto; Jeremy Webb, University of Toronto; Jeffrey Rosenthal, University of Toronto
Globular Clusters (GCs) are astronomical objects made up of tens of thousands to hundreds of thousands of stars. GCs are some of the oldest objects in the universe and are incredibly spatially dense, making them interesting laboratories for studying stellar populations. In particular, estimates of a GC’s mass as a function of radius can be used to test theories about GC evolution. However, the high spatial density of GCs is both a blessing and a curse: there is a large population of stars to observe in the outer regions of a GC, but it is impossible to discern individual stars in the inner regions because of extreme crowding. Thus, astronomers usually estimate a GC’s mass as a function of radius by first estimating the total light in radial bins, and then assuming a mass-to-light ratio. I will present a Bayesian approach that negates both the need for binning data and the assumption about the mass-to-light ratio, and that instead takes advantage of position and velocity information from a sample of individual stars. I will also discuss the statistical and computational challenges we face while including measurement uncertainties, projection effects, and incomplete data.
2:25 PM
Statistical Characterization of Matrix Effects in Laser-Induced Breakdown Spectroscopy
David Stenning, Simon Fraser University
Scientists often model complex physics using computer simulations. Such simulations complicate statistical inference because the resulting likelihood function cannot be directly evaluated and a single simulation run may take minutes to hours on supercomputers. One example from astrophysics is in the area of stellar evolution, whereby computer simulators are used to predict the brightness of a star in several wide wavelength bands given a set of parameters that describe physical properties of the star (e.g., age, chemical composition, distance from Earth, etc.). Another example comes from simulating plasmas generated by Laser-Induced Breakdown Spectroscopy, a technique used by the ChemCam instrument on the Mars Science Laboratory rover Curiosity, to aid in determining the composition of rocks and soils on Mars. This talk will address the novel statistical challenges that arise when combining such simulations with observational or experimental data for inference, using examples from recent astrostatistical analyses.
2:45 PM
Floor Discussion
Session 401
Wed Aug 5, 2020, 1:00pm - 2:50pm EDT
219559 Astrostatistics Interest Group: Student Paper Award – Topic Contributed Papers
Information about Student Paper competition
Astrostatistics Special Interest Group
Organizer(s): Gwendolyn M Eadie, University of Toronto
Chair(s): Chad Schafer, Carnegie Mellon University

1:05 PM
Photometric Biases in Modern Astronomical Surveys
Joshua Speagle, Harvard University
Many modern astronomical surveys use maximum-likelihood (ML) methods to fit models when extracting photometry from images. We show these ML estimators systematically overestimate the flux as a function of the signal-to-noise ratio and the number of model parameters involved in the fit. This bias is substantially worse for resolved sources: while a 1% bias is expected for a 10-sigma point source, a 10-sigma resolved galaxy with a simplified Gaussian profile suffers a 2.5% bias. This bias also behaves differently depending on how multiple bands are used in the fit: simultaneously fitting all bands leads the flux bias to become roughly evenly distributed between them, while fixing the position in “non-detection” bands (i.e. forced photometry) gives flux estimates in those bands that are biased low, compounding a bias in derived colors. We show that these effects are present in idealized simulations, outputs from the Hyper Suprime-Cam fake object pipeline (SynPipe), and observations from Sloan Digital Sky Survey Stripe 82, implying they are present in numerous astronomical datasets widely used today.
1:25 PM
Trend Filtering: A Modern Statistical Tool for Time-Domain Astronomy and Astronomical Spectroscopy
Collin Politsch, Carnegie Mellon University; Jessi Cisewski-Kehe, Yale University; Larry Wasserman, Carnegie Mellon University; Rupert Croft, Carnegie Mellon University
The problem of denoising a one-dimensional signal possessing varying degrees of smoothness is ubiquitous in time-domain astronomy and astronomical spectroscopy. In this work, we introduce trend filtering into the astronomical literature. Trend filtering is a modern nonparametric statistical tool that yields significant improvements in the broad problem space of denoising spatially heterogeneous signals. When the underlying signal is spatially heterogeneous, trend filtering is superior to any statistical estimator that is a linear combination of the observed data, including kernels, LOESS, smoothing splines, and Gaussian process regression. Furthermore, the trend filtering estimate can be computed with practical and scalable efficiency via a specialized convex optimization algorithm. In order to illustrate the broad utility of trend filtering, we discuss its relevance to a diverse set of spectroscopic and time-domain studies. The observations we discuss are (1) the Lyman-alpha forest of quasar spectra; (2) more general spectroscopy of quasars, galaxies, and stars; (3) stellar light curves with transiting exoplanet(s); (4) eclipsing binary light curves; and (5) supernova light curves.
1:45 PM
Galaxy Cluster Mass Estimation Using Deep Learning
Matthew Ho, McWilliams Center for Cosmology; Arya Farahi, Michigan Institute for Data Science; Michelle Ntampaka, Harvard-Smithsonian Center for Astrophysics; Markus Michael Rau, McWilliams Center for Cosmology; Hy Trac, McWilliams Center for Cosmology; Barnabas Poczos, School of Computer Science
Utilizing galaxy cluster abundance in precision cosmology requires large, well-defined cluster samples and robust mass measurement methods. In addition, modern cluster measurement techniques are expected to place a strong emphasis on efficiency and automation, as the wealth of detailed cluster data is expected to greatly increase with current and upcoming surveys such as DES, LSST, WFIRST, Euclid, and eROSITA. In this talk, I will discuss how we can leverage the use of deep learning models to infer dynamical cluster masses from spectroscopic samples with high precision and computational efficiency. I will demonstrate the ability of Convolutional Neural Networks (CNNs) to mitigate systematics in the virial scaling relation and produce dynamical mass estimates of galaxy clusters, using projected galaxies, with remarkably low bias and scatter. I will then discuss the performance of these methods relative to other leading analytic and machine learning dynamical mass estimators. Lastly, I will discuss our ongoing work in quantifying uncertainties in CNN mass predictions and our applications on spectroscopic datasets from the SDSS and GAMA surveys.
2:05 PM
Inferring Galactic Parameters from Chemical Abundances: A Multi-Star Approach
Oliver Philcox, Princeton University; Jan Rybizki, Max-Planck Institute for Astronomy
To understand galactic physics and create realistic simulations of the Milky Way, we require strong constraints on galactic evolution parameters, constraining effects such as the birthrate of massive stars and the frequency of supernovae. In this talk, I will outline a method to precisely determine these using the chemical element abundances and ages from a large set of stars. Inference is performed via a simple chemical evolution model in a hierarchical Bayesian framework, marginalizing over a large number of parameters describing the stars’ individual environments and model errors to account for inaccuracies in our model. Hamiltonian Monte Carlo methods are used to sample the posterior function, which is sped up by use of Neural Networks. I will show the parameter constraints obtained from simulations (which are competitive with those from other methods), and discuss future applications of the method. 
2:25 PM
Multiband Probabilistic Cataloging: A Joint Fitting Approach to Point Source Detection and Deblending
Richard Feder, California Institute of Technology
Probabilistic cataloging (PCAT) outperforms traditional cataloging methods on single-band optical data in crowded fields (Portillo et al. 2017). We extend our work to multiple bands, achieving greater sensitivity (~0.4 mag) and greater speed (500x) compared to previous single-band results. We demonstrate the effectiveness of multiband PCAT on mock data, both in terms of recovering accurate posteriors in the catalog space, and in directly deblending sources. When applied to Sloan Digital Sky Survey (SDSS) observations of M2, taking Hubble Space Telescope data as truth, our joint fit on r and i band data goes ~0.4 mag deeper than single-band probabilistic cataloging and has a false discovery rate less than 20% for F606W <= 20. Compared to DAOPHOT, the two-band SDSS catalog fit goes nearly 1.5 magnitudes deeper using the same data, and maintains a lower false discovery rate down to F606W ~ 20.5. Given recent improvements in computational speed, multiband PCAT shows promise in application to large-scale surveys and is a plausible framework for joint analysis of multi-instrument observational data.
2:45 PM
Floor Discussion
Session 431
Thu Aug 6, 2020, 10:00am - 11:50am EDT
219396 Astronomical(ly) Big Data for Statisticians – Invited Papers
Section on Physical and Engineering Sciences, Astrostatistics Special Interest Group, Section on Statistical Consulting
Organizer(s): Vinay Kashyap, Center for Astrophysics | Harvard & Smithsonian
Chair(s): Gwendolyn M Eadie, University of Toronto

10:05 AM
The Astrophysics Data Access Infrastructure
Peter Kelsey George Williams, Center for Astrophysics | Harvard & Smithsonian
The field of astronomy has traditionally had a very open and robust infrastructure for data access, perhaps due to the fact that astronomical data generally have no economic value. From 20th-century collections of photographic plates to modern databases synthesizing the measurements reported in thousands of journal articles, astronomers have long recognized that sharing and standardization of data enable new science not anticipated by the original investigators. The value of this tradition is becoming even more pronounced as industry-driven data science methods spread and Web technologies enable ever more powerful forms of remote data access and exploration. However, the data rates of modern astronomical instruments – terabytes per night – are pushing the existing infrastructure and astronomers’ technical skills to the limit. I will provide an overview of the astronomical data access landscape and offer some predictions of how it may evolve in the future.
10:25 AM
X-ray data and its many challenges
Kristin Madsen, Caltech
Astrophysical data taken in X-rays and gamma-rays are rich in content, and the analysis challenges are correspondingly varied. The observatory fleet that obtains the data consists of several different instruments, each of which focuses on different aspects, such as timing accuracy, high resolution spectroscopy, high spatial resolution, or high/low energy coverage, and it is not uncommon to combine data sets from several instruments. As such, instrument calibration becomes crucial for data analysis, and this component often constitutes the largest source of uncertainties. Naturally, it has become a topic of lively discussion precisely how to correctly include these errors into complex data fitting routines, and in this talk I will review the challenges and discuss the implications of getting it wrong.
10:45 AM
Gaia data: challenges for the exploitation of a large and complex dataset
Xavier Luri, Universitat de Barcelona; Frederic Arenou, GEPI, Observatoire de Paris, Universite PSL, CNRS
In recent years it has become very common to hear statements on how Big Data, the availability of very large datasets, is revolutionising science. It is applicable to a wide variety of areas, but it is often forgotten that the breakthroughs achieved with these data come not only from their volume, but especially from the capability to do a meaningful data analysis with them. This capability requires the large processing capability of computers but also, and more critically, a proper understanding of the statistical properties of these samples and the ability to design statistical analysis tools to extract knowledge from the data. A clear example of this is the datasets produced by the Gaia mission of the European Space Agency. It is generating very large astrometric catalogues (two billion objects) with unprecedented accuracy, and in this talk I will discuss the challenges faced by the astronomical community to fully exploit its scientific potential. These challenges range from the basic need to understand the properties of the data (data censorships, variable transformation, random errors, systematics) to the design and implementation of analysis tools appropriate to handle them.
11:05 AM
Solar (Data) Explosion: Challenges in Using Large Astrophysical Imaging Data Sets
Katharine Reeves, Harvard-Smithsonian Center for Astrophysics
The launch of the Solar Dynamics Observatory in 2010 pushed the field of Solar Physics solidly into the big data era by gathering several terabytes of imaging and magnetic field data of the Sun every day. The size of the data archive means that online visualization tools, metadata catalogs, and event databases have become increasingly important. In this talk, I will review some of these tools, as well as their challenges and limitations. Some of these challenges include: cleaning databases of false positives, calibration issues, and verifying completeness.
11:25 AM
Discussant: Xiao-Li Meng, Harvard University
11:45 AM
Floor Discussion
Session 539
Thu Aug 6, 2020, 1:00pm - 2:50pm EDT
219552 Challenging signal detection problems in astronomy – Topic Contributed Papers
Section on Physical and Engineering Sciences, Astrostatistics Special Interest Group
Organizer(s): Eric Feigelson, Pennsylvania State University
Chair(s): Vinay Kashyap, Center for Astrophysics | Harvard & Smithsonian

1:05 PM
Challenges for detecting gravitational wave signals
Jess McIver, Univ of British Columbia
Ground-based gravitational-wave detector data is nonstationary and contains a high rate of transient noise artifacts. This transient noise can mimic or obscure true astrophysical gravitational-wave events, reducing the effective reach of searches for these signals. This talk will summarize the methods employed by the LIGO, Virgo, and KAGRA collaborations to characterize and mitigate the impact of transient noise, including regression, statistical correlation, and machine learning.
1:25 PM
Experimental design and discovery of unknown unknowns with the Rubin Observatory Legacy Survey of Space and Time
Federica Bianco, University of Delaware
Astrophysics has been at the forefront of data science and statistics for decades, yet the Rubin Observatory Legacy Survey of Space and Time (LSST) will usher in a new era in data-intensive astrophysics. The next-generation ground-based astronomical survey, LSST will generate 20TB of information-rich imaging data every night for 10 years. A core deliverable of the survey is the exploration of the transient sky: astronomical sources that change brightness, color, and position, enabling a deep understanding of stellar physics and cosmology. Cutting edge methodologies that scale with the data volume to address event detection in stochastic time series, outliers and anomaly detection, and light-curve characterization, typically in irregularly time spaced time series at the limit of the signal-to-noise are under development. However, to maximize the scientific throughput of the survey, statistics and data science methodologies have to enter the picture at the experimental design level. I will review applications of machine learning in experimental design to assure the Rubin LSST enables real-time detection of rare and rapidly evolving transients and the discovery of unknown unknowns.
1:45 PM
Statistical Opportunities and Challenges of Multiepoch Photometric Surveys
Tamas Budavari, The Johns Hopkins University
Refraction by the atmosphere causes measured source directions to change depending on the airmass through which the observations are taken. This wavelength-dependent subtle shift called differential chromatic refraction provides new opportunities for modern ground-based astronomy surveys to obtain additional spectral information. Based on simulations of Large Synoptic Survey Telescope exposures, we expect this prism effect to be measurable from repeated observations of the same part of the sky over a range of different airmasses and parallactic angles. We will discuss initial successes and the challenges to infer high-resolution spectral and spatial information from the new-generation time-domain experiments.
2:05 PM
A Multivariate Damped Random Walk Process for Irregularly-Spaced Multi-Filter Light Curves with Heteroscedastic Measurement Errors
Hyungsuk Tak, Pennsylvania State University; Zhirui Hu, Harvard University
In preparation for the era of the LSST-driven time-domain astronomy, we propose a state-space representation of a multivariate damped random walk process as a tool to analyze irregularly-spaced multi-filter light curves of an astronomical object with heteroscedastic measurement errors. It is not necessary that the multiband observations be measured at the same time and multiple light curves be of the same length. Thus, the proposed process is suitable for the multiband light curves of the LSST in particular. We adopt a computationally efficient Kalman-filtering approach to evaluate the likelihood function of the proposed model, leading to O(k^3n) complexity, where k is the number of bands and n is the total number of observations across the bands. This is a significant computational advantage over a commonly used O(k^3n^3) approach based on a univariate Gaussian process that stacks up all multi-filter light curves in one vector. Using this efficient likelihood evaluation, we provide both maximum likelihood estimates and Bayesian posterior samples of the model parameters. We apply the proposed process to several astronomical data sets for numerical illustrations.
2:25 PM
Discussant: Eric Feigelson, Pennsylvania State University 
2:45 PM
Floor Discussion
Other events of interest
Throughout JSM
AIG Virtual Community Table
Wed Aug 5, 5-6pm
Astrostatistics Virtual Mixer (See Slack channel #jsm2020 for Zoom connection information)
Thu Aug 6, 2:20pm
Larry Wasserman on Statistical Methods for some problems in Physics, in session Emerging Statistical Learning Methods in Modern Data Science