Friday, March 31, 2017

Brute Force and The AAVSO Data on Boyajian's Star

We now have a span of more than 500 days of AAVSO data on Boyajian's star. I thought it might be worth a closer look to see if any of the secular dimming seen by either Schaefer in the archival plates or Montet and Simon in the Kepler full frame images might still be going on.

I am not a world-class statistician, but sometimes a naive approach is interesting if we employ standard methods, knowing that the systematics in the data are not well characterized yet.

A little background information

So, a brief explanation of what the AAVSO does. Many of their members have equipped their telescopes with special electrically cooled digital cameras and optical filters that together can measure the brightness of a star in a particular color, or band, of light with respect to standard comparison stars. The colors we concern ourselves with right now are known as Blue, Visual, Red, and Infrared, or B, V, R, and I for short.

Over many decades, the AAVSO has done a great deal of careful work finding and observing comparison stars, which are in turn compared to each other. Each observer procures his or her own equipment, pays for access to training materials, and is responsible for making sure their gear is in good working order. They are supplied with AAVSO software that turns the digital counts on the cameras into a brightness, or as it is known, a magnitude.

There are really only two things you need to know about magnitude to avoid being confused with what is to come. Some of it is historical accident, but it still makes sense in a way - unlike English spelling, which is all historical accident and little of it makes sense anymore:

  1. A higher magnitude means the source is dimmer. The brightest things in the sky have a negative magnitude, and the dimmest thing you can see with your naked eye on a dark, moonless night is around magnitude 6. This is why the Y axis of the plots you will see seems to be upside down, with the higher numbers lower on the Y axis.
  2. A small difference in magnitude is a big difference in brightness, because the scale is logarithmic. This actually makes sense, since the brightness of astronomical objects varies over a huge range. A decrease in brightness by a factor of 100 is 5 magnitudes.
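To make the logarithmic scale concrete, here is a minimal Python sketch of the standard relation between a magnitude difference and a brightness ratio, ratio = 10^(-0.4 × Δm). The function name flux_ratio is just an illustrative choice, not something from the AAVSO software.

```python
def flux_ratio(delta_mag):
    """Brightness ratio for a magnitude difference.

    delta_mag > 0 means the source got dimmer (higher magnitude),
    so the returned ratio is below 1.
    """
    return 10 ** (-0.4 * delta_mag)


# 5 magnitudes dimmer corresponds to 1/100 of the brightness.
five_mag = flux_ratio(5.0)

# A change of only 0.01 magnitude is a brightness change of a bit under 1%.
hundredth_mag = flux_ratio(0.01)

print(five_mag, hundredth_mag)
```

This is why the slopes quoted later, in magnitudes per century, look like small numbers even though they would be large changes in brightness for an ordinary star.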

The AAVSO Data so Far

I want to start with spoilers. No one should get too excited about this yet. We need more data taken over a longer time span to confirm that the Schaefer dimming is still going on. There are several possibilities left standing, including that there is no dimming going on, although my unconfirmed hunch is that there is some dimming taking place. Permit me to explain.

The AAVSO observers all deserve kudos for staying up late to snag a few measurements of Boyajian's star when they could. I've little doubt that each one did the best possible job with the equipment and observing conditions that they had.

The AAVSO started observing Boyajian's Star around October 2015, and have observed it almost every night they could except when the sun was too close in the sky to get good data  - mostly in January and February.  February 2016 was a washout, but a better effort was made in February 2017.  None of the big dips observed by the Kepler space telescope have yet been seen in the AAVSO data, but that's not the only interesting thing, as we have documented.

I thought I would try using some fairly powerful tools in a naive way. I took all the AAVSO data up until 28 March of 2017. That's roughly 500 days, and a considerable number of observers, mostly in North America and Europe (the first "A" in AAVSO stands for American, but it's totally international now).  It turns out that there are more than 30,000 valid observations of just this one star to date, and almost every day more are added. Most of them are I, R, B, and V bands as discussed above.

Let's focus on the "B" or blue color band for now, since that is what Schaefer was looking at. Astronomers actually call this Johnson Blue, but let's not get lost in those nuances. It represents shorter wavelengths of visible light than V, R or I.  
The passbands for the various standard filters
OK, so let's start with only those "B" observations (2279 in all) that were taken at a reasonable angle above the horizon. In the plot that follows, the blue points are the observations we are using, and the black ones are the ones that were taken out, either because the "Airmass" was too high or because it was not reported:
The best fit to the selected B band data

You will see a line drawn through the data that is definitely trending down. In fact it is trending down at 1.75 magnitudes per century, which is much higher than the Montet/Simon dimming (about 0.6 magnitudes/century) or the Schaefer dimming (0.165 magnitudes per century).  Should we believe this?
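For readers who want to see the mechanics, the selection and fit described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual .R script: the field names (jd, mag, airmass), the airmass cutoff of 2.0, and the demo rows are all made up for the example. The fit is a plain unweighted least-squares line, which is what R's lm() does by default for a simple formula.

```python
DAYS_PER_CENTURY = 36525.0  # Julian century, to convert mag/day to mag/century


def select_by_airmass(observations, max_airmass=2.0):
    """Keep observations with a reported airmass at or below the cutoff.

    The cutoff of 2.0 is an assumed value for illustration only.
    """
    return [
        obs for obs in observations
        if obs.get("airmass") is not None and obs["airmass"] <= max_airmass
    ]


def slope_mag_per_century(observations):
    """Unweighted least-squares slope of magnitude against Julian Date.

    Positive means the magnitude is rising, i.e. the star is dimming.
    """
    n = len(observations)
    mean_t = sum(obs["jd"] for obs in observations) / n
    mean_m = sum(obs["mag"] for obs in observations) / n
    sxy = sum((obs["jd"] - mean_t) * (obs["mag"] - mean_m) for obs in observations)
    sxx = sum((obs["jd"] - mean_t) ** 2 for obs in observations)
    return sxy / sxx * DAYS_PER_CENTURY


# Made-up rows, NOT real AAVSO measurements.
demo = [
    {"jd": 2457300.0, "mag": 12.300, "airmass": 1.1},
    {"jd": 2457400.0, "mag": 12.302, "airmass": 1.4},
    {"jd": 2457500.0, "mag": 12.304, "airmass": None},  # airmass not reported
    {"jd": 2457600.0, "mag": 12.306, "airmass": 1.2},
    {"jd": 2457700.0, "mag": 12.250, "airmass": 3.5},   # too close to the horizon
]

selected = select_by_airmass(demo)
slope = slope_mag_per_century(selected)
print(len(selected), slope)
```

Note that every point counts equally here, regardless of its reported uncertainty or who took it, which is part of what makes this approach naive.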

Maybe not. One way to test the result is to fool around with the data. What if we didn't filter out so many points, and allowed any air mass, or even observations with no reported airmass? Easily done, and now there are more than twice as many "B" observations, 4648 in all:

The AAVSO B data with any air mass allowed
You can easily see from this that the slope reverses, and it's now getting brighter at almost 0.9 magnitudes per century. Both of these things can't be happening, so the result is too sensitive to the data allowed. There is more than one way to interpret this result, but to me the most likely is that we just haven't been observing long enough. Let's try one more experiment, and eliminate the observations of one observer. Not to single anyone out, but I think the result is interesting:

Any air mass allowed, but one observer excluded.

What happens here is that the fit flips again from brightening to dimming at 1 magnitude per century! This is just one observer of many, and one who did all their observing over a span of 40 days, with what appears to be unusually large scatter over each session. However, it's a lot of points, and the linear regression algorithm in question thinks the more the merrier.

One could easily read the above as saying that the star is dimming anomalously fast in B, if we just ignore a small subset of observations. Not so fast. I want to know why the solution, using the lm() function in the R statistical package, is so sensitive to the inclusion or exclusion of a single observer. It tells me that the problem is something bigger than one observer's telescope. Maybe, given the uncertainty inherent in the AAVSO data, the time span just isn't long enough. My hunch is that the star is dimming in B, but not as much as I'm seeing right now. I don't think a definitive conclusion is justified yet.
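One way to probe this kind of sensitivity more systematically is to refit the line with each observer left out in turn. The Python sketch below is a hypothetical illustration of that idea, not what my script does: the field names (day, mag, observer) are assumptions, and the data are synthetic, built so that a short, late run from one observer with a small brightness offset is enough to flip the sign of the slope.

```python
DAYS_PER_CENTURY = 36525.0


def slope_mag_per_century(observations):
    """Unweighted least-squares slope of magnitude against time in days."""
    n = len(observations)
    mean_t = sum(obs["day"] for obs in observations) / n
    mean_m = sum(obs["mag"] for obs in observations) / n
    sxy = sum((obs["day"] - mean_t) * (obs["mag"] - mean_m) for obs in observations)
    sxx = sum((obs["day"] - mean_t) ** 2 for obs in observations)
    return sxy / sxx * DAYS_PER_CENTURY


def leave_one_observer_out(observations):
    """Return {observer: slope fitted with that observer's points removed}."""
    observers = sorted({obs["observer"] for obs in observations})
    return {
        name: slope_mag_per_century(
            [obs for obs in observations if obs["observer"] != name]
        )
        for name in observers
    }


# Synthetic data, NOT real AAVSO measurements. Observers A and B follow a
# slow dimming trend; observer C contributes a short late run that sits
# 0.05 to 0.06 magnitudes brighter than the others.
synthetic = (
    [{"day": d, "mag": 12.300 + 2e-5 * d, "observer": "A"} for d in (0, 250, 500)]
    + [{"day": d, "mag": 12.300 + 2e-5 * d, "observer": "B"} for d in (100, 300, 400)]
    + [{"day": d, "mag": 12.250, "observer": "C"} for d in (470, 471, 472, 473)]
)

slope_all = slope_mag_per_century(synthetic)
slopes_without = leave_one_observer_out(synthetic)

print("all observers:", slope_all)
for name, value in slopes_without.items():
    print("without", name, ":", value)
```

In this toy case the full fit comes out as brightening, and dropping observer C alone turns it back into dimming. If real data behave the same way for more than one observer, that points to offsets between observers rather than to anything the star is doing.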

Stay tuned. We're going to put this all to the test as the time span lengthens. Give it a few hundred more days.

You can get my .R script here, the essential subroutines here, and the data through 28 March here. Try it yourself.