Summary, May 22, 2020


This week we had a smaller group meeting than usual (long weekend, summer weather, apocalypse, etc.), but we still had some interesting conversations!

Rodrigo Luger discussed some work he’s been doing to efficiently fit the light curves of differentially rotating stars (see the header figure). This was inspired by his collaboration with Fran Bartolić to map the time variable surface of Io using occultation light curves as discussed last week. In the limit of long lived spots, Rodrigo has demonstrated that many of the degeneracies commonly found when mapping stellar surfaces using phase curves can be broken. But, he’s now looking for good test cases (including this sample) and he will look at the issues introduced by variable spot lifetimes.

Next, John Forbes updated the group on a project that he previously blogged about where he (in collaboration with Nia Imara, Harvard) is quantifying the density structure in turbulent box simulations. The key ingredient of their method is that the eigenvalues (or ratios thereof) of the local Hessian matrix of the density (or log density) seem to separate the structures in these simulations along axes corresponding to the qualitative descriptors that are generally used. Now they’re working to get a better understanding of how to interpret these quantities by visualizing the Hessian matrix at various points in this parameter space. These are hard to visualize since the data live in three dimensions so they have been making 3D matplotlib animations of the stencil used to estimate the Hessian, but they remain hard to interpret. There was some discussion about alternative visualization techniques and whether or not it even makes sense to try to qualitatively interpret these features.

David W. Hogg shared some work he has been doing with Soledad Villar (NYU) to understand the relative performance of discriminative and generative models for prediction tasks in several toy scenarios. They find that generative models marginally outperform discriminative models in many of the test cases they try. In particular, they find—in agreement of previous results in the literature—that discriminative models have catastrophically poor performance when the number of training samples is approximately equal to the dimension of the data, even when the model performs well for both fewer and more training points.

Finally, Sam Grunblatt led a discussion about the definition of the Galactic stellar halo. He asked about the relative importance of kinematics, chemistry, and current coordinates. There was broad agreement that chemical and dynamical arguments are both important, but the discussion predictably devolved into the assertion that the Milky Way is not multi-modal in any dimensions and that classifications should be continuous rather than discrete.

Summary, May 15, 2020


Given the current state of the world, we have been holding the Astro Data Group meeting virtually for the past couple of months and this blog has taken a hit over that time. This week, I (Dan) have decided to try to start posting a short summary of what we talk about each week so we’ll see how long I keep it up :)!

Fran Bartolić started us off with a discussion of how he is using non-negative matrix factorization to map the time variable surface of Io using occultation light curves. In particular, he’s modeling the time variable surface as a linear combination of starry maps where the coefficients vary as a function of time. The results of this inference are prior dependent so he presented some of the prior information that he is considering: non-negativity constraints, sparsity priors, and Gaussian processes to capture smooth time variability and volcano duty cycles. There was some discussion about including spatial priors that would favor contiguous features in each map (for example, individual volcanos).

Trevor David has been investigating long run times and divergences in his PyMC3 inferences. To handle difficult posterior inferences, he has been increasing the target acceptance statistic in the tuning phase of his PyMC3 runs, but still ends up with some small percentage of diverging samples and long run times. Some of these issues can be handled by re-parameterization, but Trevor has been looking at various diagnostics to try and understand the behavior of his sampler. One such diagnostic is the comparison of the empirical marginal energy distribution and the conditional energy distribution (see figure 23 of Betancourt 2017 for an example), and Trevor finds a small skew in these distributions (with heavier tails to larger energies).

On a related topic, I presented some work I have been doing to improve the tuning performance for PyMC3 when there is a large dynamic range in posterior precision across dimensions. The idea is fairly simple—set the target acceptance statistic to something low (like 50%) at the beginning of the tuning phase and increase this target to the final value throughout tuning—but it seems to work well for at least some test cases. This is currently implemented in exoplanet, but more testing will be needed before this is ready for prime time.

Dreia Carrillo shared the figure at the top of this post and asked for feedback about its clarity and, in particular, the binning choices made in the bottom histogram. This figure shows stellar metallicity gradients as a function of birth radius in a Milky Way-like galaxy simulation. The main panel shows how these gradients change as a function of age, but doesn’t capture underlying density of stars so she included the histograms to show this distribution, and specifically to highlight the signatures of inside-out disk formation. The group discussed options for making the birth radius histogram plot clearer. These suggestions included using a kernel density estimate instead of a histogram and using fewer, wider age bins.

To finish, Emily Cunningham led a discussion of how members of the group manage their time when working on multiple parallel projects and how to identify when we are over-committed to too many projects. The group agreed that this is something we all struggle with, especially since we are currently all working in isolation. Group members have devised some tactics for dealing with this, including setting goals on week-long time scales instead of daily or by running work by collaborators more frequently. We also discussed the fact that side projects and context switching can be useful (especially these days) to help getting ourselves out of a research rut or to identify our next projects, even if they don’t always lead to visible results.

Ambiguous periods


A common problem with algorithmic determination of rotation periods from light curves is that the “best” period may actually be the undertone/subharmonic or harmonic of the true period (in other words, twice or half the true period). Two real examples from the Kepler mission are shown here.

In the first row of each figure, the Kepler light curve of a star is phase-folded on a period either from the literature or determined using a Lomb-Scargle (L-S) periodogram. In the second and third rows, the light curve has been phase-folded on half and twice the period in the first row, respectively. The bottom two panels show the light curve from a single Kepler quarter and the periodogram itself.

With KOI 49, the various estimates for the rotation period are in agreement, and it seems clear that those estimates are correct. Why? We see two peaks (and troughs) in the phased light curve of uneven height. The simplest explanation for this behavior is a fairly stable arrangement of two dominant spot groupings of different size and/or contrast on the surface. Reality is definitely more complicated, but the light curves of many stars can be approximated with a two-spot model.

For KOI 377, the literature period estimates (first four columns) are in agreement, while the L-S method (last column) prefers approximately half of that period. It’s worth noting here that the L-S method assumes a sinusoidal model for the data and has indeed selected the period which best approximates the data as a sinusoid. What’s going on?

Imagine this pathological case: a star with two spots equal in size, contrast, and latitude but separated by 180 degrees in longitude, spinning on an axis perpendicular to our line-of-sight. As this star rotates, an observer would record a light curve showing two peaks and two troughs of exactly equal height/duration, and would not be blamed for reporting the rotation period as half the true period.

What is the correct period for KOI 377? From the plot it’s not immediately clear. However, other clues exist in the light curve. Compared with KOI 49, the light curve is less “stable” and shows variations about twice as small in amplitude. From ensemble studies of Kepler light curves it’s known that rotation period is anti-correlated with variability amplitude, and differential rotation (which traces “stability”) correlates with stellar mass, which itself correlates with rotation period. So, one path towards determining more reliable rotation periods may rely on all available information, such as the variability amplitude and stellar parameters. In fact, group members Lucy Lu and Ruth Angus are working on just this problem with Astraea.

Eclipsing binaries with archival radial velocities


At the TESS Ninja 3 workshop, Jessica Birky (UW) cross matched Adrian’s catalog of binary star systems found using APOGEE with the TESS 2-minute cadence light curves and found a handful of light curves with eclipses. With APOGEE cadence, it can be hard to tie down the period of the orbit, but a TESS light curve can approximately perfectly measure the period, epoch, and inclination of the orbit. This breaks degeneracies in the orbital solution and can lead to precise contraints on the stellar parameters. At the workshop, I worked with Birky and others to start fitting these orbits and since then I added support for eclipsing binary light curves directly into exoplanet (the code). The figure this week shows an initial fit for one of these targets. This one is interesting because the deeper eclipse seems to actually be the secondary; the star being eclipsed during the shallower event is actually brighter, but the orbital configuration (specifically the inclination and eccentricity) is such that this produces a shallower signal. This falls out of light curve only fit (the flux ratio is required to be >1) and is consistent with the radial velocity measurements (APOGEE only measures one set of velocities—presumably for the brighter star). The top panels show the TESS light curve and APOGEE radial velocities and the bottom panels show the best fit orbital configuration (the observer is at z = infinity) with the configuration during the deepest event shown with the radii drawn to scale. I haven’t been able to get the MCMC to run efficiently for this yet (probably because of parameterization issues), but hopefully we’ll have a fully working system soon!

Writing a collaboration policy

Inspired by a conversation with Andy Casey who we visited at Monash University in Melbourne last week, I decided to write a Collaboration Policy, and share my feelings on the topic at our weekly group meeting.

At Monash, Andy showed me the Research Expectations page of his website, which is a document outlining the expectations he has for his students, and what their expectations might be for him. I love this idea and would like to create my own research expectations guidelines for any students working with me, current or future. As I started to write it however, I realized I wanted to write a zeroth-order expectations document first. Something a little more general, targeted at everyone I work with: a collaboration policy.

My collaboration policy document states what expectations I have for co-authors on my papers, and what they should expect from me as a co-author on their papers. It describes how I select and order co-authors, how I like the commenting and feedback procedure to go, etc.

The document also outlines an explicit policy for proposal, as well as paper, collaboration, including my definition of the responsibilities of funded and unfunded collaborators. This is something that I have found particularly difficult to navigate in the past, because proposal authorship and responsibility distribution can be delicate and political. I found it really helpful to make my expectations explicit and I’m sure it will help avoid a few awkward conversations in future.

Finally I outline my philosophy on authorship. For example, I feel strongly that “whomever does the work leads the paper”, a policy that may seem obvious but still, sadly, has to be made explicit on occasion.

This is just a first version, and there’s a lot more I’d like to add. The Astro Data group were enthusiastic about working on a collaboration policy together, and I think we’ll converge on an official version soon.

I really enjoyed writing this document. I’ll find it useful to send to my collaborators, but I also enjoyed solidifying some of my feelings and opinions on this topic. I recommend that everyone goes through this exercise!

STELLA - Detecting TESS Flares with CNNs


This will be both my first and last post to this blog after a short (but wonderful!) 6 months at the Flatiron Institute. So I wanted to share a machine learning project that I’ve been working over the past few months with Adina Feinstein, a PhD student at University of Chicago who many of you TESS users may already know from her Eleanor package. We’ve designed a Convolutional Neural Network (CNN) named Stella that automatically and rapidly detects flares in TESS light curves at high (99%!) precision and with minimal data preprocessing (no detrending of stellar activity signals needed!).

Neural networks are able to learn features directly from the data rather than having them hand-engineered by mere humans. CNNs are a specific type of neural network that are especially good at identifying spatial features in the data – a good thing if you’re looking for flares, which have a characteristic sharp rise and slow decline in light curves. We trained Stella on the flares identified in Gunther et al. 2020, and because the dataset was relatively small, this could be done on a laptop in about 30 minutes (no GPUs needed!). After training, Stella is able to obtain high precision (99%) and accuracy (98%) when identifying whether part of a TESS light curve contains a flare.

As shown in the animation above, we can then feed TESS light curves into this CNN “classifier” to transform it into a “detector” that outputs a sort of “flare probability time series” (where we use the term “probability” very, very loosely). Stella is also fast – it takes roughly 30 minutes to search 3500 TESS light curves for flares.

Keep an eye out for the paper presenting Stella and its application to some cool science cases in the coming months. As for now, feel free to check out the code on Github and send us any feedback (if you’re lucky you may get one of the sweet stickers we made with the Stella logo designed by Adina).

Occultations in reflected light


I’ve been working on the problem of computing occultation light curves of planets and moons in reflected light. This is an extension of my starry code to the case where the specific intensity on the surface of the body is zero beyond the day/night terminator. To compute the light curve of an occultation event like that shown in the figure, we need to integrate the specific intensity of the body over the portion of the dayside hemisphere that is facing the observer and not obscured by the occultor. It’s often easier to compute the occulted flux and subtract that from the total flux; the former is given by the integral over the region that is shaded in blue. In the original starry paper, I showed how this integral can be computed analytically in the case of thermal (emitted) light if we use Green’s theorem to transform it into a line integral along the boundary of the region. In the case of reflected light, the integral is still analytic, but harder because of the presence of the terminator. The tricky part is determining all of the integration paths (P1, Q, P2, and T) and endpoints (alpha, beta, gamma, delta). The terminator is always a segment of an ellipse, so we need to figure out how to programmatically identify all the points of intersection and traverse all of the segments that make up the boundary (of which there can be anywhere from two to five). This is proving to be tricky, since there are lots of different ways that two circles and an ellipse can intersect. But the end is in sight, and I hope to make this functionality available in starry in the next month or so.

This will hopefully have applications to modeling secondary eclipse (planet) light curves in the optical, occultations of Solar System bodies (such as mutual occultations of the Galilean moons), planet-planet occultations with JWST, and some day perhaps even exoplanet-exomoon occultations.

Turbulence Statistics


The sites of star formation are self-gravitating, supersonic, magnetized turbulent clouds. These clouds are riddled with small-scale structure - clumps, filaments, and shocks - which undoubtedly play a role in how stars form. To understand these structures a bit better, my collaborator Nia Imara (Harvard) and I have been thinking about statistics we might use to characterize them!

Inspired by the Zel’dovich approximation, used to describe structure formation in cosmology as a succession of gravitational collapse into sheets, filaments, and halos or clusters, we’ve been exploring the following space. At each point in a turbulent box simulation, we compute the Hessian matrix, composed of all the second spatial derivatives of the density. This matrix describes the curvature of the density. We then compute the eigenvalues of this matrix, which each correspond to the size of the curvature in one of the three spatial dimensions, though in coordinates that are not necessarily aligned with x, y, and z.

In this space of eigenvalues, “clumps” should be found where all three eigenvalues are large and negative, while “filaments” only need two eigenvalues to be large and negative (the third should be small), and “sheets” should have one large, and two small eigenvalues. This space of eigenvalues is shown in the animation, where each frame corresponds to a different smoothing scale for the density. In this particular simulation, there is remarkably little mass that clearly resides in any of these categories. Instead, it occupies a range of values right in the middle! There is also a good deal of mass where the second eigenvalue has a sizable positive value, corresponding to thin structures where the density oscillates up and down along the structure. It remains to be seen how this distribution will be affected by changing the inputs of the simulation, but it could be that the nomenclature and interpretation around these small-scale structures could use some revision.

Data viz, meeting design, and other science


Hello again, world! We (the data group) have been slightly deliquent with this blog lately, but our collective New Year’s Resolution is to post more. (Not really, I just made that up. But here’s hoping!) This particular post is going to be fairly meta, because coming back to the blog after a break seems like a natural opportunity to reflect a bit.

In the last couple of months we made some changes to the format of our weekly group meetings. Figuring out how to run a good meeting is a hard job, especially when the goal of your meeting is measured less in productivity and more in culture. We want the group meeting to feel like a community event where everyone has an equal opportunity to get the feedback and support they want, and there’s no atmosphere of judgement or power hierarchy. In my experience, all the other good stuff (lively science discussions, new ideas, increased understanding of other group members’ research) arises naturally under those conditions. This is a hard thing to control, and it’s gotten trickier as our group has gotten bigger! Recently we had to move to a bigger room, and the usual “60 or 90 minutes divided by number of people in the room” time allotment was leaving many people without enough time to get the feedback they wanted from the group.

Our new group meeting format is similar to the old one in that everyone gets a chance to share something (a figure, an idea, a piece of text, etc), but we made a few tweaks. We are now using Google Slides, which is an improvement because (a.) it cuts down dramatically on the time spent fumbling with HDMI and AirPlay when everything can be shown from one laptop, and (b.) each meeting leaves a record, which is nice to reflect on sometimes and allows group members who aren’t at the meeting to see what happened. We are also shifting away from the requirement that every group member should attend and present at every meeting; this hasn’t really changed attendence in practice, but I think it helps the general tone when we all know that everyone in the room is there because they want to be. Finally, regardless of how many people attend, each person gets 10 minutes to speak. If we run out of time, those who didn’t get a chance to show their slide get bumped to the beginning of next meeting’s slide deck. Overall this format seems to be working well, but I’m sure we’ll continue to experiment.

All that being said, here’s my slide from last week’s meeting! I wanted to spark a discussion on data visualization and particularly the representation of uncertainties, with the example of an exoplanet mass-radius diagram that I drafted up for a paper. Mass-radius diagrams are tricky because the objects that naturally take up the most ink and catch the viewer’s eye are the most uncertain measurements, which actually contain the least information. So how to make sure that the viewer gets a quick, intuitive sense of what we really know about exoplanet properties? My favorite example of this in the literature from Zach Berta-Thompson’s 2015 paper is shown on the left. He scales the opacity of the points with their relative uncertainties, which is a clever way around this problem. I actually liked this plot so much that I made an interactive version of it in Bokeh for fun a while back! On the right is my mock-up of more of a “contour plot” approach, which solves my pet peeve of having disembodied error bars creeping in at the edges of the plot, and I think conveys the uncertainties better. I like that the overlapping transparent shapes give more weighting to the regions of parameter space that are covered by more planets, and I think this is a good way to go when what I’m trying to convey is more about the population than about individual planets. Next step is to scale ellipse transparencies with their areas!

Stellar streams around the Milky Way in the Legacy Surveys imaging data


The stellar halo of the Milky Way is full of debris streams — “stellar streams” — left over from dwarf galaxies and globular clusters that have been destroyed by our Galaxy’s gravitational field. These structures are diffuse, distant, and much lower density than foreground stars in, e.g., the Milky Way disk, which is why they don’t appear if you just plot the sky positions of all stars from a photometric survey like the SDSS or Pan-STARRS. One way to search for and enhance the significance of stellar streams is to filter photometric data using templates constructed from isochrones of stellar populations that have characteristics similar to the typical objects that we think disrupt to form the streams (typically low metallicity and old). By moving these color-magnitude filters in distance or distance modulus, we can then construct density maps of the sky positions of these filtered stars to look for linear features that persist or appear consistent between different distance slices. This has been done in the past using, e.g., the northern SDSS data, the full SDSS footprint, and Pan-STARRS.

This week, Nora Shipp (Chicago; who found most of the known streams in the southern sky) has been visiting the CCA to search for new streams, and confirm the existence of stream candidates, using data from the DESI Imaging Legacy Surveys. This survey is deeper and higher resolution than the SDSS, so we are hoping we will find some interesting new things. So far, we have produced filtered sky-maps in distance slices, and are beginning to correlate structures in our new maps with past surveys and discoveries.

One way of visualizing the data is to produce a “field of streams” map, by assigning density in different distance slices to different color channels. In the sky map figure shown above (in equatorial coordinates), the intensity of each pixel is related to the number of stars that pass our color-magnitude filter, and the color of each pixel is an indication of the mean distance of stars in that pixel. In this case, red pixels are ~20–60 kpc from the sun, green is ~10–20 kpc, and blue is ~5–10 kpc. Already with this map, we can see many of the known prominent streams in the northern and southern hemispheres (the widest and most prominent being the Sagittarius stream). Now that we have pretty images, we have to buckle down and try to characterize individual features!