Unresolved binaries and where they appear in the color-magnitude diagram


David Hogg and I have been working on “open fiber” proposals for the SDSS-V surveys, which will decide targets for fibers (fiber optic cables that connect from the focal plane to the SDSS spectrographs) that are not allocated by the main survey. As a part of that, I have been making many, many color-magnitude diagrams with data from the Gaia Mission (Data Release 2; DR2). For example, look at any one of the panels in the figures above: The gray background is the same in all panels and shows the density of stars selected from Gaia DR2 in the Gaia BP-RP color and absolute G-band magnitude (these have been cleaned to remove high-RUWE sources, but if that doesn’t mean anything to you…ignore this comment!).

Two features that often come up in conversations between us and others are: (1) the very apparent “binary sequence” above (brighter than) the main sequence, and (2) the spray of stars that are below (fainter) and to the left of (bluer than) the main sequence. From talking to many other people, I have heard a number of different explanations for these stars: white dwarf-main sequence binaries, cataclysmic variables, low-metallicity main sequence stars, or stripped-envelope stars.

Thinking about these classes of stars sent me off on a tangent thinking about unresolved binary stars, and wondering about all of the places that they can appear in the color-magnitude diagram (CMD). So, I downloaded some isochrones from MIST for two different age stellar populations, and made some visualizations to get an intuition for this.

In each panel, I pair up the “star” indicated by the red marker with all other possible stellar types along the isochrone (blue lines), assuming that the two “stars” are blended so that their fluxes add. The two sets of panels are for two different age stellar populations (indicated on the top of the figures). There are many interesting features and things to note in these figures, but of relevance to our original question, it does indeed look like pairing a white dwarf with a lower main sequence star ends up with an unresolved source that would sit below (and bluer than) the main sequence!
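The flux-addition rule behind these figures is a one-liner. Here is a minimal sketch (not the MIST-based code used for the figures above) showing why the binary sequence sits at most ~0.75 mag above the main sequence:

```python
import numpy as np

def combine_mags(m1, m2):
    """Magnitude of an unresolved pair whose fluxes simply add."""
    f1 = 10 ** (-0.4 * np.asarray(m1))
    f2 = 10 ** (-0.4 * np.asarray(m2))
    return -2.5 * np.log10(f1 + f2)

# An equal-brightness pair is 2.5*log10(2) ~ 0.75 mag brighter than either star,
# which sets the maximum offset of the binary sequence above the main sequence.
m_pair = combine_mags(5.0, 5.0)

# A much fainter companion barely moves the combined photometry,
# which is why lopsided pairs hug the main sequence.
m_lopsided = combine_mags(5.0, 12.0)
```

Applying this in each band separately (e.g., BP and RP) and then differencing gives the combined color, which is how a hot white dwarf can drag a red dwarf blueward and below the main sequence.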

One interesting question to think about: Are there places in the CMD that should be empty? Do we find stars there?

Estimating Covariance of a Cross Correlation Function


Eric Ford summarized some of his recent exploration into various approximations to the covariance of the CCF. The CCF of Echelle spectra of stars (typically with a mask that is non-zero near the location of stellar spectral lines, but sometimes a template spectrum) is frequently used as the basis for measuring radial velocities in the context of characterizing exoplanets and binary stars. The methodology goes back to the days of measuring radial velocities with photographic plates, metal plates and photocells. Of course, the advent of modern computers provides much more flexibility in what astronomers can calculate (e.g., assigning lines weights other than simply 0 or 1, as they would be when the mask is a metal plate). Nevertheless, it appears that most exoplanet-hunting teams still compute something amazingly similar to the CCFs of the 20th century (e.g., 1, 2, 3; but see 4). While these sometimes mention computing the variance of the CCF, they rarely describe computing the covariance of the CCF. Also concerning, most papers describe computing only a subset of the terms that contribute to the variance of the CCF, those that are based on the variance of the flux, but omit terms that are due to the covariance of the flux at different wavelengths. If one were to use the full covariance matrix, then one could map the CCF to a likelihood function and get rigorous uncertainty estimates. But if one leaves out some terms, then the resulting velocity uncertainties will probably be overly optimistic.
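To make the omission concrete, here is a toy weighted-mask CCF that propagates only the per-pixel flux variance, i.e., exactly the diagonal-terms-only approximation being questioned. This is a sketch, not any team's pipeline; the mask, weights, and spectrum below are made up:

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def ccf_and_variance(wave, flux, ivar, line_centers, line_weights, velocities):
    """Weighted-mask CCF and its variance-only error budget.

    Only the diagonal (per-pixel flux variance) terms are kept; the
    flux-covariance terms discussed in the post are deliberately omitted.
    """
    line_centers = np.asarray(line_centers, dtype=float)
    line_weights = np.asarray(line_weights, dtype=float)
    ccf = np.zeros_like(velocities)
    var = np.zeros_like(velocities)
    sigma2 = 1.0 / ivar
    for j, v in enumerate(velocities):
        # Doppler-shift the mask lines and interpolate the spectrum onto them
        shifted = line_centers * (1.0 + v / C_KMS)
        ccf[j] = np.sum(line_weights * np.interp(shifted, wave, flux))
        var[j] = np.sum(line_weights ** 2 * np.interp(shifted, wave, sigma2))
    return ccf, var

# Toy spectrum: one Gaussian absorption line, Doppler-shifted by 5 km/s
wave = np.linspace(4998.0, 5002.0, 4001)
v_true = 5.0
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 5000.0 * (1 + v_true / C_KMS)) / 0.05) ** 2)
ivar = np.full_like(wave, 1.0e4)
velocities = np.linspace(-20.0, 20.0, 401)
ccf, ccf_var = ccf_and_variance(wave, flux, ivar, [5000.0], [1.0], velocities)
v_best = velocities[np.argmin(ccf)]
```

The CCF minimum recovers the input shift, but the `ccf_var` returned here ignores the fact that neighboring CCF velocities reuse the same pixels, which is precisely why the full covariance matters for a rigorous likelihood.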

Eric began thinking about this last summer, when working with Andrew Collier-Cameron and Sahar Shahaf on developing and evaluating the Self-Correlation Analysis of Line Profiles for Extracting Low-amplitude Shifts (SCALPELS) algorithm for removing apparent RV shifts due to changes in the shape of the CCF (5, Appendix B). For that paper, they had a large sample of spectra, so they chose to make use of the sample covariance of the CCFs. Actually, they replaced the sample covariance of the CCF with the sum of two terms: (1) a rank-reduced version of the sample covariance of the CCF that captures the large-scale structure of the CCF covariance, and (2) an analytical model for the near-diagonal terms that can be scaled to match the signal-to-noise of each spectrum. That seemed to work well for the HARPS-N solar dataset (6), but might not work as well for datasets with significantly fewer and/or lower signal-to-noise spectra, such as the ones several groups are currently working with as part of the EXPRES Stellar Signals project.

Eric described a few possible approaches. The most direct solution is to simply compute the full covariance matrix via brute force, performing the four dimensional integrals necessary to compute each term of the covariance (7). Eric tried this, but wasn’t happy with the computational expense, and didn’t observe an improvement compared to some other approximations when measuring velocities (but hasn’t evaluated the relative performance for characterizing line shape variations).

Perhaps the simplest approximation would be to only compute the contribution of terms that depend on the variance of the flux, but then scale up the variance by some factor. This leaves the question of what factor should be used. One can compute a minimum value based on the theoretical velocity precision, which can be computed in flux space (i.e., before taking the CCF), and scale up the resulting uncertainties by the ratio of the theoretical velocity precision to the velocity precision derived with the approximation of the CCF.

Another possibility would be to simply use the sample covariance matrix, but there’s the concern that it won’t perform well on small datasets. A third possibility would be to use a hierarchical model for the covariance conditioned on the data. Mathematically, there’s a lot of appeal, but it’s unclear how computationally intensive it would be, and it would take a fair bit of work to set up. So this would probably be the last approach to try implementing.

A fourth possibility was suggested by Jinglin Zhao: computing the sample covariance of a large number of CCFs generated in a bootstrap-like fashion from the actual spectra. This would still be computationally expensive, but it might be faster than the direct computation approach. Additionally, it has the advantage that it would be relatively straightforward to account for a complicated line spread function (LSF), including details like how it varies across the detector.
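A sketch of that bootstrap-like idea is below. The `ccf_func` argument is a placeholder for whatever CCF implementation is in use; the demo uses a trivial two-element "CCF" so the expected covariance is known analytically:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ccf_covariance(wave, flux, sigma, ccf_func, n_boot=1000):
    """Sample covariance of CCFs computed from noise realizations of one
    spectrum: perturb the observed fluxes with draws consistent with the
    per-pixel uncertainties, recompute the CCF each time, and take the
    sample covariance of the resulting CCFs.
    """
    ccfs = np.array(
        [ccf_func(wave, flux + rng.normal(0.0, sigma, size=flux.shape))
         for _ in range(n_boot)]
    )
    return np.cov(ccfs, rowvar=False)  # (n_velocities, n_velocities)

# Demo: a fake "CCF" that just sums two halves of the spectrum, so each
# entry sums 50 independent pixels with sigma = 0.1 -> variance = 0.5,
# and the two entries share no pixels -> zero covariance.
wave = np.arange(100.0)
flux = np.ones(100)
toy_ccf = lambda w, f: np.array([f[:50].sum(), f[50:].sum()])
cov = bootstrap_ccf_covariance(wave, flux, 0.1, toy_ccf, n_boot=4000)
```

For a real CCF the off-diagonal terms would not vanish, because neighboring velocity bins reuse the same pixels; that is exactly the structure this approach is meant to capture.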

To give people a sense of what the data look like, Eric showed some CCFs, residuals of the CCFs about the mean CCF, the sample covariance of the CCFs, residuals to the rank-reduced CCF covariance model, and the mean along offset diagonals of those residuals.

Someone (Hogg?) asked why Eric needed to compute the covariance of the CCF in the first place. Eric replied that he was trying to use the CCF for a log likelihood that was sensitive to the line-shape changes due to stellar variability.

Eric presented the above options, hoping that someone would point out some papers where scientists had thought about this more deeply and validated one or more approximations. Unfortunately, we didn’t identify any such paper. It’s unclear what subbranch of astronomy would care about computing the covariance of the CCF accurately. Maybe there’s some other field using CCFs that would care (e.g., MRIs?).
For now, it seems that the CCF covariance approximation of 5 is state-of-the-art, at least for large datasets. If Eric manages to try the bootstrap approach, he’ll report back.

References & Footnotes

  1. Bouchy et al. (2001)
  2. Boisse et al. (2010)
  3. Lafarga et al. (2020)
  4. For instruments using a gas absorption cell as a wavelength calibration, the velocity measurements are typically done using something closer to a forward modeling approach; Butler et al. (1996). The advent of optical laser frequency combs (LFCs) and etalons has enabled even more precise wavelength calibration that can leave the stellar spectrum unperturbed.
  5. Collier-Cameron et al. (2020)
  6. Dumusque et al. 2020; actually we used 910 daily average spectra from this data set, rather than all 34,550 solar spectra. The daily averages have very high SNR, ~6,000-10,000 per CCF “pixel”.
  7. Technically, one would need to do 6 dimensional integrals if one were to consider the finite extent of each detector pixel.

About the author:

Eric Ford has been joining in on the CCA’s Astro Data group meetings while on sabbatical, and is eager to learn from others at CCA applying advanced statistical or data science methodologies that may be relevant to detecting and characterizing exoplanets and exoplanet populations.

Coding in Julia vs. Python (when using existing Python Packages)


At this week’s meeting, I asked the group for some advice about coding in Julia vs. in Python. I am starting work on a project with David Spergel and Ruth Angus, simulating the exoplanet astrometry yield from the Nancy Grace Roman Space Telescope. We are going to look at synergies between Roman exoplanet astrometry and radial velocity (RV) observations from the Terra Hunting Experiment. In order to simulate the joint RV and astrometry signals, we are planning on using exoplanet or The Joker as well as some other Python-only packages. So, I asked for advice on whether it would still be “worth it” in terms of computational speed to write the code in Julia if the code I am writing will be primarily a wrapper for other Python packages. In addition, I was curious whether calling Python packages in Julia (using PyCall) would introduce a slowdown in the code.

Eric Ford said that there is no speed penalty in using PyCall in order to run Python packages in Julia. He also mentioned that Python actually works quite well as a scripting language, so if I will primarily be calling Python packages for the meat of the code, then it would work well to do so in Python. The conversation then turned into a deeper discussion about the possible benefits/downsides of re-implementing the existing Python packages in Julia. Dan Foreman-Mackey suggested that the codes themselves are quite replicable and it wouldn’t be too difficult to re-implement them in Julia. David Hogg brought up the benefit of writing my own code from scratch in terms of developing a deeper understanding of how the code works. He also suggested that I “think about your goals and do what makes sense within those goals.” Megan Bedell and Adrian Price-Whelan then suggested that an alternative approach could be to implement a few toy or simpler cases to get a feel for what’s going on, but ultimately to use existing packages in production work. Dan Foreman-Mackey also mentioned that the Julia code wouldn’t be quicker than the existing Python packages, as these packages use C or C++ for the computationally expensive portions of the code.

The main takeaway I got from the conversation is that it would be possible to do the project in Julia, by re-writing the Python packages that I would need, but that this is only really the right approach if it is something that I am specifically interested in doing as a project in itself. Otherwise, if the primary goal is to quickly produce a functioning simulation of Roman astrometry + Terra Hunting RVs, then it is likely better to use the existing Python code bases and to write the wrapping/scripting code in Python as well.

Figure credit

Evolution of the exoplanet radius gap


At this week’s meeting I updated the group on an ongoing project examining the influence of age on the exoplanet radius gap, a relative scarcity of close-in exoplanets with sizes between 1.5 and 2 Earth radii. The feature is widely believed to be a signature of atmospheric loss, with planets below the gap having rocky compositions and planets above the gap requiring a few percent-by-mass atmospheres to explain their low bulk densities. In the atmospheric loss paradigm, some fraction of the rocky planets are the remnant cores of planets that lost the entirety of their atmospheres.

At the core of this project is the observation of a more sparsely populated radius gap at younger system ages. My question to the group was about estimating the statistical significance of the feature in 1D (i.e. just the radius distribution) and 2D (i.e. the orbital period-radius plane). While the gap appears to be remarkably empty in 2D, the feature is somewhat washed out (or, less statistically significant) in 1D, partly because the radius at which the gap sits decreases with increasing orbital period. In the 1D case, David W. Hogg proposed that the data could be projected orthogonally to an assumed functional form of the gap. A grid search of the gap function parameters could be performed to test different projections and find the “best” or emptiest gap, which would also allow us to estimate the feature’s significance through simulations.

Christina Hedges brought up Hough transforms for the 2D case. Since there also appears to be a stellar mass dependence to the location of the radius gap (and conceivably a metallicity dependence), we would be interested in exploring the location of the gap in higher dimensions. Generalizing the Hough transform to higher dimensions was discussed, and Christina threw together a lovely notebook for the 2D case.

Finding Unresolved Stellar Companions in Kepler Planet Hosts with Gaia


At this week’s meeting, I updated the group about my project with D F-M, Andy Casey (Monash University), and Tim Morton (USC, the west coast one). We are looking into how to determine how the binarity of Kepler planet hosts influences planet formation. That sounds really exciting but “determine” is doing a lot of work in that sentence. First, we have to figure out which stars are (unresolved) binaries. There are a couple of different ways in which we’ll go about doing this, but the one highlighted this week is the fruit of Andy’s work.

Alongside the astrometric and RV measurements, Gaia also lists the errors associated with its fits. For unresolved binaries, trying to fit the astrometric or RV curve of a single star will yield uncharacteristically large errors. Through Andy’s work (which I can’t claim to understand beyond the surface level), we can turn this error into a likelihood that the star is not a single star.
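To make the logic concrete, here is a cartoon of the "excess fit error implies not single" idea in terms of a chi-square p-value. This is emphatically not Andy's actual method, just the general shape of the argument:

```python
import numpy as np
from scipy import stats

def prob_not_single(chi2_obs, dof):
    """Cartoon version: one minus the probability that a single-star fit
    would produce a chi-square this large by chance. A tiny survival
    probability means the single-star model is a bad fit, i.e., the
    source is probably not a single star.
    """
    return 1.0 - stats.chi2.sf(chi2_obs, dof)

# A well-behaved single star: chi-square comparable to the number of dof
p_ok = prob_not_single(50.0, 50)

# An unresolved binary: the single-star fit leaves huge residuals
p_binary = prob_not_single(500.0, 50)
```

The real calculation has to account for how Gaia's reported errors are calibrated (and mis-calibrated) as a function of magnitude and color, which is where the hard work lives.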

Incredibly, this seems to work really well! In the color-magnitude diagram above of the KIC stars with astrometric measurements, we can pick out the stars that have a low probability of being single above the main sequence, right where we would expect binaries to be.

Interestingly, the same plot for KIC stars with RV measurements looks very different. There are a few selection effects here we’ll have to figure out: not all stars have both astrometric and RV measurements and (as pointed out by Trevor) the population of binaries detected by astrometry is different from that of RV binaries. Known spectroscopic and eclipsing binaries are picked up through RV jitter but not astrometric jitter!

Figuring out exactly how accurate this is will be key. Because neither method is batting 1.000, exactly how to quantify “I’m pretty sure this star is not single” represents an interesting puzzle for our near-future selves.

GPU-accelerated difference imaging


At this week’s meeting I shared an update about a project called PyTorchDIA led by James Hitchcock (St. Andrews) where we are looking into how numerical tools developed for machine learning can be applied to astronomy. Unlike many applications of PyTorch, TensorFlow, etc. in astronomy, I’m not talking about using neural networks. Instead, I’m talking about using PyTorch to compute fast convolutions and matrix-vector products to accelerate more “traditional” astronomy problems. This is somewhat similar in spirit to the wobble project led by Megan Bedell (that strangely has never appeared on this blog, Megan!).

For the PyTorchDIA project, James and I realized that we might be able to easily port existing difference imaging methods to run on the GPU using PyTorch since the bottleneck operation is solving a large linear problem where the design matrix is a convolution. This is exactly the type of operation encountered in deep learning applications and exactly what frameworks like PyTorch are designed for. James took this one step further, and instead of directly solving this linear system, he uses an iterative solver that only requires matrix-vector products (rather than a full Cholesky factorization). This allows us to relax some of the assumptions (like Gaussianity of the observation model) of the standard methods so that we can use a robust loss function instead of sigma clipping.
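The matrix-free idea can be sketched in 1D on a CPU. This is not the PyTorchDIA code: it uses SciPy's conjugate gradient in place of PyTorch's GPU machinery, a Gaussian toy kernel, and circular convolution (whose adjoint is simply correlation) so the operators stay clean:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(42)
n, m = 256, 11                       # 1D "image" length and kernel length

ref = rng.normal(size=n)             # stand-in for the reference image
true_kernel = np.exp(-0.5 * ((np.arange(m) - 5.0) / 1.5) ** 2)
true_kernel /= true_kernel.sum()

R = np.fft.rfft(ref)

def convolve(kernel):
    """A @ k: circular convolution of the reference with the kernel."""
    kpad = np.zeros(n)
    kpad[:m] = kernel
    return np.fft.irfft(R * np.fft.rfft(kpad), n)

def convolve_T(image):
    """A.T @ y: circular correlation, cropped to the kernel support."""
    return np.fft.irfft(np.conj(R) * np.fft.rfft(image), n)[:m]

data = convolve(true_kernel) + 1e-6 * rng.normal(size=n)

# Normal equations (A.T A) k = A.T d, solved with conjugate gradients:
# only matrix-vector products are ever formed, never the full matrix
# (and certainly no Cholesky factorization).
normal_op = LinearOperator((m, m), matvec=lambda k: convolve_T(convolve(k)))
kernel_fit, info = cg(normal_op, convolve_T(data))
```

Swapping the quadratic loss for a robust one then amounts to reweighting inside the matvec rather than sigma-clipping outside it, which is the relaxation described above.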

Using the GPUs available on Google Colab, James finds that his method is at least an order of magnitude faster than optimized implementations of other commonly used methods for difference imaging, across a wide range of realistic test cases. We will hopefully be submitting this paper in the next week or so and we are already planning on incorporating this method into the pipelines for several existing and upcoming difference imaging surveys. But, the real reason why I shared this with the group was that I believe that there are some (potentially) useful lessons learned:

  1. Astronomical imaging analysis shares a lot of structure with machine learning problems; let’s take advantage of that!
  2. Some of the linear problems that we work on in the group might be easier and faster to solve if we use iterative solvers rather than direct solvers. The scaling is different, the iterative implementations are optimized for the forward calculation, and these methods open up the possibility of generalizing the loss function.

Aside: I also want to mention the awesome website weirdgalaxi.es (yes that is the domain name, but don’t click on that if you’re using Safari) that Kate Storey-Fisher made (and shared at this week’s meeting) as a continuation of the project that she previously blogged about.



After arriving at CCA a few weeks ago, I’m starting on a project with Adrian Price-Whelan and David W. Hogg. We are trying to figure out how to answer the question: What does it mean that APOGEE doesn’t have a bunch of binary systems containing black holes? Barring the obvious reason that APOGEE’s targets are low-ish mass stars and thus unlikely to be companions to BHs and their high-mass progenitors, we think it will be interesting to compare a bunch of synthetic binary populations generated with COSMIC to Adrian’s close binary catalog.

Step 0 of this process is to work out how to generate mock APOGEE catalogs with COSMIC data. This means I need to do real astronomy to convert the effective temperatures and bolometric luminosities in my simulations to (gasp) magnitudes of both the absolute and apparent kind. Luckily for me we found Tim Morton’s isochrones package that provides interpolated bolometric corrections from the MIST bolometric correction tables. The figure above shows a CMD that I made from a little test sample of a few synthetic binaries that follow the age and metallicity distribution of the m12i galaxy in the Latte Suite of the FIRE cosmological simulations. I was pretty excited to see the suuper obvious double main sequence and that COSMIC produced something that looks somewhat similar to what you’d expect (see e.g. Fig 5 here).
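Step 0 boils down to two standard conversions. Here is a sketch with the usual sign convention BC = M_bol - M_band (in practice the bolometric corrections come from the MIST tables via the isochrones package, not a constant handed in by hand):

```python
import numpy as np

MBOL_SUN = 4.74  # IAU 2015 nominal solar bolometric magnitude

def absolute_magnitude(lum_lsun, bc=0.0):
    """Bolometric luminosity (in L_sun) -> absolute magnitude in a band,
    using BC = M_bol - M_band, so M_band = M_bol - BC."""
    m_bol = MBOL_SUN - 2.5 * np.log10(lum_lsun)
    return m_bol - bc

def apparent_magnitude(abs_mag, distance_pc, extinction=0.0):
    """Add the distance modulus (and, eventually, dust extinction)."""
    return abs_mag + 5.0 * np.log10(distance_pc / 10.0) + extinction

# Sanity checks: the Sun, and a 100 L_sun star at 1 kpc
m_sun = absolute_magnitude(1.0)                            # 4.74 by construction
m_app = apparent_magnitude(absolute_magnitude(100.0), 1000.0)
```

For an unresolved binary, the same conversion is applied to each component and the fluxes are added before going back to magnitudes, which is what produces the double main sequence in the figure.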

Steps 1 and onward will include picking the kinds of binary interactions that will most severely impact a BH + star population that APOGEE can see (hello BH natal kicks!) to set up our binary evolution model set, as well as including dust extinction. I’m sure eventually I will also open real astro data files that at some point in their distant post-processing past came from telescopes!

Estimating the Bolometric Light Curves of Supernovae


I’ve been working (slowly but surely) on an open source code, extrabol, which estimates the bolometric light curves of supernovae from an arbitrary set of filtered photometric observations. This code is based on superbol, a package by Matt Nicholl.

Estimating the bolometric luminosity and black body temperature of a supernova is an extremely common analysis problem in time-domain astronomy as it allows astronomers to easily probe supernova explosion dynamics. Unfortunately, supernova light curves are often poorly sampled across multiple photometric filters, making it challenging to build a pseudo-bolometric light curve directly from the data.

In order to solve this problem, extrabol uses a 2D Gaussian process to extrapolate the supernova spectral energy distribution (SED) as a function of both wavelength and time. Extrabol is linked to the Spanish Virtual Observatory, a database of nearly every commonly used astronomical filter. Given the name of a telescope/filter, extrabol will link the observation to an effective wavelength. Using an arbitrary set of filters from the user, extrabol will estimate the underlying SED and then use this SED to estimate a black body temperature and radius. Eventually, the goal is to have extrabol linked with the Open Supernova Catalog, so that a user can type in the name of a supernova and quickly build a bolometric light curve using public data.
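The last step, turning the interpolated SED into a temperature (and, via the fitted scale, a radius), amounts to fitting a Planck function to fluxes at the filters' effective wavelengths. A toy version of that fit (not extrabol itself; the wavelengths, scale, and noiseless "photometry" below are made up) might look like:

```python
import numpy as np
from scipy.optimize import curve_fit

H, C, KB = 6.62607015e-34, 2.99792458e8, 1.380649e-23  # SI constants

def planck_lambda(wav_m, temp):
    """Planck spectral radiance B_lambda(T) in SI units."""
    return (2.0 * H * C ** 2 / wav_m ** 5) / np.expm1(H * C / (wav_m * KB * temp))

def bb_flux(wav_m, temp, log_scale):
    """Blackbody flux model; 10**log_scale absorbs pi * R^2 / d^2, so the
    radius follows from the fitted scale once the distance is known."""
    return 10.0 ** log_scale * planck_lambda(wav_m, temp)

# Toy "photometry": noiseless fluxes at a few filter effective wavelengths
wavs = np.array([3.6e-7, 4.4e-7, 5.5e-7, 6.7e-7, 8.0e-7])  # meters
true_temp, true_log_scale = 8000.0, -20.0
fluxes = bb_flux(wavs, true_temp, true_log_scale)

popt, _ = curve_fit(bb_flux, wavs, fluxes, p0=[6000.0, -19.0])
temp_fit, log_scale_fit = popt
```

Fitting the scale in log space keeps the two parameters on comparable footing, which helps the optimizer when the true scale spans many orders of magnitude.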

For now, I am still working out some bugs! The Figure above shows an example light curve for a core-collapse supernova. Each observation is shown as a point (with error bars being smaller than the points). The GP interpolation is shown as lines, with uncertainty shown as shaded regions. Currently, I am assuming a mean function of a very dim magnitude, which works well before and after the supernova itself; however, if there is a large gap of data, the GP prefers artificially low luminosities. I got some very useful feedback from the Astro Data Group this week to investigate a more physically meaningful mean function, with residuals being modeled by the GP. My next step is figuring out if there is such a function which can be used for all flavors of supernovae!

The Great Emu Fight


At AstroHackWeek 2020 last week, a fight broke out - not between the supportive and friendly participants, but between emulators. I pitched a hack project on comparing emulation methods, mostly so I could show this emu vs. kangaroo fight video during my pitch, and a team of fighters assembled. All credit for this project shared with Catarina Alves, Johannes Heyl, Yssa Camacho-Neves, and Johnny Esteves!

One of my research projects is on emulating galaxy clustering statistics for speedier, more accurate inference of cosmological and assembly bias parameters. I had been wanting to try out different emulation methods on this data, as well as write up a pedagogical tutorial on emulation. In brief, emulators are regressors that imitate more complex models or simulations. They are trained on a limited set of simulation inputs and output and learn the layout of this (often high-dimensional) parameter space; then for any new set of inputs, they can quickly spit out output values, without performing a physical calculation or a full simulation.

Our hack team first wrote up a detailed explanation of emulators, complete with an example about a computational emu biologist; check it out in our complete tutorial here. Then we wanted to actually build some emulators and test them on astronomical data. My full-scale research dataset proved unwieldy for a week-long project, so I generated a set of two-point correlation functions using nbodykit. The winning emulator should be able to take the input values of (Omega_m, sigma_8, Omega_b) and predict the correlation function in each of 10 separation bins.
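As a toy illustration of what an emulator is (not one of the fighters; here a SciPy RBF interpolator stands in for the regressor, and the "simulation" is a made-up smooth function rather than an nbodykit run):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(1)

def toy_simulation(params):
    """Stand-in for the expensive simulation: maps (Omega_m, sigma_8,
    Omega_b) to a smooth, made-up 10-bin 'correlation function'."""
    om, s8, ob = params[:, 0:1], params[:, 1:2], params[:, 2:3]
    r = np.linspace(1.0, 10.0, 10)  # separation bins
    return s8 ** 2 * (r / (10.0 * om)) ** -1.8 + 0.1 * ob

# Train on a limited set of "simulations"...
lo, hi = [0.2, 0.6, 0.02], [0.4, 1.0, 0.06]
train_x = rng.uniform(lo, hi, size=(200, 3))
emulator = RBFInterpolator(train_x, toy_simulation(train_x))

# ...then predict held-out parameter sets without re-running anything
test_x = rng.uniform([0.22, 0.62, 0.022], [0.38, 0.98, 0.058], size=(20, 3))
rel_err = np.abs(emulator(test_x) - toy_simulation(test_x)) / toy_simulation(test_x)
```

The point of the fight is that the choice of regressor (SVM, neural net, tree, forest, GP, or the RBF above) trades off training time, prediction time, and accuracy, and those trade-offs are exactly what the notebooks measure.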

Each team member took on a different emulation framework. So far, our lineup includes an Artificial Neural Network, a Random Forest, a Decision Tree, and a Support Vector Machine (with Gaussian Processes on the way). The fighters tuned their implementations and hyperparameters to the best of their abilities. The fight was heated, with the emulators racing to train and test out their clustering predictions. You can see the predictions of the SVM compared to the true correlation function in the figure above.

To see the outcome, you’ll have to follow the full battle at The Great Emu Fight notebook. But the takeaways were:

  • The Support Vector Machine emerged as the clear winner in accuracy for every metric we tested, emulating the correlation function to within a tenth of a percent for most r-bins. The Artificial Neural Network came in a close second. (The second figure above shows the R^2 error; you can see the large difference in accuracy between the SVM and ANN compared to the DTree and RF.)
  • However, the SVM was by far the slowest to train. The Decision Tree handily won the training round.
  • The Decision Tree and the ANN were the winners when it came to prediction time (this is super important for inference!).
  • The Random Forest was an all-around loser, being very slow at prediction and sadly inaccurate; we love and appreciate it anyway.
  • Most importantly, all of these results are super dependent on emulator implementation, hyperparameter tuning, data normalization, and the dataset itself! So the reigning champion likely won’t reign for long.
  • We built this project to be incredibly modular, so you can throw your own emulator or dataset into the ring. Follow one of our standalone notebook tutorials and tune the hyperparameters to improve on our results, or add in your own complete implementation, and join in the fight.

Check out the project in this repo, and follow the complete fight at The Great Emu Fight notebook. Huge thank you to the emu-fight team and the AstroHackWeek organizers!

How do people run big group meetings?

Last week we had our first meeting of the new semester and we decided to try something “new”. In fact we went back to how we ran things back in the olden days: we take the total meeting time and divide it equally between people who are online, and everyone gives an update. In many ways, this is really how we want group meeting to go. The whole point is that, since group members work on such a broad range of topics, we want to have an opportunity to hear what everyone else has been up to. The goal is to identify shared interests and encourage collaborations across research areas.

This time (unlike the olden days), we had 18 group members (or collaborators working actively on a project with a group member) online (and that isn’t even the whole group these days - I’m not complaining!). This meant that even with 90 minutes, the updates had to be very short and the whole thing felt like quite a whirlwind (and I needed to take a nap after logging off). As a result, I’m not going to write up a summary of all the awesome work that folks are doing. Instead, I want to ask if anyone has advice or thoughts about running big group meetings like this.

Overall, it seems like the group appreciated this format, but agrees that it might be a bit overwhelming to do it every week. So the current plan is to proceed with the format from the summer (a shared slide deck where group members can choose to add a slide and then share a 10 minute update with some discussion) for most meetings, and then about once a month we would have a whirlwind update from everyone. I don’t love this option, because this meeting is often the only time that I get updates from many group members. But, we haven’t been able to identify a format that helps to make sure that everyone gets an opportunity to share. We’d love to hear if there are other groups out there doing interesting things with their meetings. We’re always keen to try experiments, especially since we’ll be staying fully-remote for the foreseeable future!

If you have thoughts, please Tweet at me or send me an email (should be easy to find!)