Exoplanet transit light curve#

In this notebook we’ll assemble the light curve of a star, calibrated against other stars of the same field, with the purpose of uncovering an exoplanet transit.

As usual, you can download this page as a Jupyter notebook file.

Note

In case you have been working on an AIfA lab room computer until this point, you might be interested in moving to a personal laptop (for example) for these last steps. All that we’ll need from here on is the content of the PHOTOMETRY directory (including the reference catalog), as well as one of the pre-reduced images that has a WCS transformation in its header (from ASTROMETRY or LIGHT_PRERED, depending on how the processing was run). This is a relatively small amount of data (compared to the raw or pre-reduced images), so you could consider transferring just these files to your own computer. The CPU requirements for these last steps will also be negligible. Make sure to install the right Python environment, following Requirements and installation. There is no need for astrometry.net at this stage.

import dataredconfig
import pathlib

import numpy as np
import pandas as pd
import astropy
import astropy.table
import astropy.visualization
import astropy.coordinates
import astropy.wcs
from astropy import units as u

import datetime

%matplotlib widget
import matplotlib
from matplotlib import pyplot as plt

import ccdproc
# We'll ignore some astropy warnings that get raised as our FITS headers (from NINA) are not 100% standards compliant.
import warnings
warnings.simplefilter('ignore', category=astropy.wcs.FITSFixedWarning)

Preparation of the data#

photometry_dir = dataredconfig.work_dir / "PHOTOMETRY"

object_to_process = "HD92670"

catalog_filepaths = sorted(list(photometry_dir.glob('*.fits')))
catalogs = []

# We'll read in all catalogs, and keep only those matching the desired object specified above.
for catalog_filepath in catalog_filepaths:
    
    catalog = astropy.table.Table.read(catalog_filepath)

    # We select the photometric catalogs of our object:  
    if "OBJECT" in catalog.meta:
        if catalog.meta["OBJECT"] == object_to_process:
            print(f"{catalog_filepath} : {catalog.meta}")
            catalogs.append(catalog)
# We combine these catalogs into a single table, in "depth": columns will be 2D, where the second dimension is time.
catalog = astropy.table.dstack(catalogs, join_type="exact", metadata_conflicts="silent")

# We also produce a list of datetime objects, from the FITS headers:
date_strings = [c.meta["DATE-OBS"] for c in catalogs]
dates = [datetime.datetime.fromisoformat(date) for date in date_strings]

# And while we are at it, same for the airmass:
airmasses = [c.meta["AIRMASS"] for c in catalogs]

# We read the reference catalog, as this one contains the position of each star
ref_catalog = astropy.table.Table.read(photometry_dir / f"ref_catalog_{object_to_process}.fits")
assert len(ref_catalog) == len(catalog) # Just a check that these have the same number of sources

# We copy the positions from the reference catalog over to our combined catalog:
catalog["sky_centroid_win"] = ref_catalog["sky_centroid_win"]

n_epochs = len(dates)
print(f"Number of epochs: {n_epochs}")

# And just for information, we print the column names and column shapes of our "combined" catalogue.
# Indeed, most of the "columns" are in fact 2D arrays:
print("Column names and shapes:")  
for colname in catalog.colnames:
    print(f"{colname}: {catalog[colname].shape}")

At this stage, the data is in a structure that is convenient for plotting and calibration: the epochs are in a list of datetime objects (dates), and all the corresponding flux measurements are in the catalog.

For noisy observations, it might be interesting to “bin” the light curves, that is to sum the flux of each source in groups of n exposures. This increases the signal-to-noise ratio of the data points, at the cost of decreasing the sampling along the time axis. Before starting with the further analysis and calibration, we provide below a piece of code that creates a catalog_binned and a corresponding dates_binned, which you might want to use instead of catalog and dates.

# Optional cell to create a binned version of the data catalog

binsize = 10 # How many measurements to group in each bin

n_epochs_binnable = n_epochs - (n_epochs % binsize) # Number of epochs we can use (largest possible integer multiple of binsize)
print(f"With bin size {binsize} we can use {n_epochs_binnable} of the {n_epochs} epochs.")

# First, we create a list of binned dates: we take the "mean" of the epochs in each bin.
dates_reshape = np.reshape(dates[:n_epochs_binnable], (-1, binsize))
dates_binned = [pd.to_datetime(pd.Series(date_bin)).mean() for date_bin in dates_reshape]

# Then we create a "binned" catalog, starting from a full copy.
catalog_binned = catalog.copy()

# Define how to process columns:
colnames_to_sum = ["sum_4", "sum_6", "sum_8", "sum_10", "back_sum_4", "back_sum_6", "back_sum_8", "back_sum_10", "flux_fit"]
colnames_to_mean = ["fwhm_fit", "q_fit"]
colnames_to_median = []

for colname in colnames_to_sum + colnames_to_mean + colnames_to_median:

    original_column_shape = catalog[colname].shape
    if len(original_column_shape) != 2:
        raise RuntimeError(f"Column {colname} has shape {original_column_shape} and can't be binned.")
    
    nb_sources = original_column_shape[0]
    column_reshape = np.reshape(catalog[colname][:,:n_epochs_binnable], (nb_sources, -1, binsize)) # This is now 3D : (source index, nb of binned epochs, binsize)

    if colname in colnames_to_sum:
        column_binned = np.sum(column_reshape, axis=2) # sum within bins
    elif colname in colnames_to_mean:
        column_binned = np.mean(column_reshape, axis=2) # mean within bins
    elif colname in colnames_to_median:
        column_binned = np.median(column_reshape, axis=2) # median within bins

    catalog_binned[colname] = column_binned

# For information, the column names and column shapes of the "binned" catalogue are:
for colname in catalog_binned.colnames:
    print(f"{colname}: {catalog_binned[colname].shape}")
print(f"Number of epochs in binned catalog: {len(dates_binned)}")

Visualizing the field of view#

We now display an image of the field, overplotting the “source indices” corresponding to rows of our catalog.

This will allow us to identify the target and reference stars.

# We load one of the images; it does not have to be a specific one.
light_prered_dir = dataredconfig.work_dir / "LIGHT_PRERED"
science_files = ccdproc.ImageFileCollection(light_prered_dir, keywords=dataredconfig.ifc_header_keywords)
science_files = science_files.filter(object=object_to_process)
image_path = science_files.files[0]
image = ccdproc.CCDData.read(image_path, unit="adu")
image.data -= np.median(image.data) # Quick sky subtraction

# We test that the selected image does have WCS information, to prevent unexpected errors being raised further below.
if image.wcs is None:
    raise RuntimeError("This image has no WCS, make sure you specify the correct directory (i.e., with WCS) above!")
# And now create the figure
plt.figure(figsize=(10, 6))
ax = plt.subplot(projection=image.wcs)
ax.imshow(image.data, origin='lower', cmap='Greys_r', interpolation='nearest',
    norm=astropy.visualization.simple_norm(image.data, stretch="sqrt", vmin=-20, vmax=500))
ax.scatter(
    catalog["sky_centroid_win"].ra.degree,
    catalog["sky_centroid_win"].dec.degree,
    transform=ax.get_transform('world'),
    s=50, # The size of these markers is not related to any measurement apertures!
    edgecolor='red', facecolor='none'
    )
for line in catalog:
    ax.text(
        x=line["sky_centroid_win"].ra.degree,
        y=line["sky_centroid_win"].dec.degree,
        s=str(line.index),
        transform=ax.get_transform('world'),
        color="cyan"
        )
ax.grid(color='white', ls='solid')
ax.coords[0].set_axislabel('RA')
ax.coords[1].set_axislabel('Dec')
#ax.coords[0].set_ticks(spacing=5.*u.arcmin)
#ax.coords[1].set_ticks(spacing=5.*u.arcmin)
plt.tight_layout()
plt.show()

The raw light curves#

Let’s now visualize the raw light curves, to get a first impression. The following code is meant as an example of how to use and manipulate the catalog and how to plot basic light curves, to get you started.

# We compute an "instrumental magnitude" from all our flux measurements.
# "Instrumental" means that this is not yet calibrated to correspond to an apparent magnitude.
# Note that this is the line where we select which aperture to consider.
# Of course, you can try to figure out which aperture works best for your data.
catalog["instr_mag"] = -2.5 * np.log10(catalog["sum_8"].value) # this is a "2D" column: (source index, date)

# We can also compute summary statistics for each source. Useful ones could be for example the median instrumental magnitude,
# obtained by taking the median along the "date"-axis of the instrumental magnitudes:
catalog["median_instr_mag"] = np.nanmedian(catalog["instr_mag"].value, axis=1) # this is just a 1D column: (source index)
# Using "nanmedian" instead of "median" has the advantage that NaNs get ignored.
# And similarly the standard deviation of each light curve:
catalog["std_instr_mag"] = np.nanstd(catalog["instr_mag"].value, axis=1) # this is just a 1D column: (source index)
# Note that these are 1D columns: they just have one index, namely the source index.

# Another interesting quantity might be the angular separation (in degrees) between each star and your target:
target_index = 34 # Adapt this!
target_center_pos = catalog["sky_centroid_win"][target_index]
catalog["separation"] = astropy.coordinates.SkyCoord.separation(catalog["sky_centroid_win"], target_center_pos)
# Depending on your data, it could be important to avoid stars with large angular separation to calibrate your target.

# When making a plot (and also to select reference stars), you could hand-pick indices, or build such a list algorithmically:
indices_show = [0, 5, 8, 17, 38]

plt.figure()
ax = plt.subplot()

# We simply loop over the indices to show:
for index in indices_show:
    ax.plot(dates, catalog["instr_mag"][index], lw=1, label=index)

ax.invert_yaxis() # Needed, as we show a magnitude on y.

# Some advanced settings to help getting a nice format of the date axis labels:
ax.xaxis.set_major_formatter(matplotlib.dates.ConciseDateFormatter(ax.xaxis.get_major_locator()))
ax.set_xlabel("UTC")
ax.set_ylabel("Instrumental magnitude")
plt.gcf().autofmt_xdate()
plt.legend()
plt.tight_layout()
plt.show()

It’s good to play around with these raw light curves a bit.

Question

Do you observe any trends or features in these light curves that are common to all (or at least several) stars? Comment on what could cause these flux variations.
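As a hint, one effect worth checking is atmospheric extinction, which grows with airmass. Below is a minimal sketch (using the dates and airmasses lists assembled during the data preparation) that plots the airmass against time, so that you can compare its evolution with common features in the raw light curves:

# Plot the airmass against time:
plt.figure()
ax = plt.subplot()
ax.plot(dates, airmasses, lw=1, color="black")
ax.xaxis.set_major_formatter(matplotlib.dates.ConciseDateFormatter(ax.xaxis.get_major_locator()))
ax.set_xlabel("UTC")
ax.set_ylabel("Airmass")
plt.gcf().autofmt_xdate()
plt.tight_layout()
plt.show()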

Calibration#

To reveal the transit itself, some empirical calibration of the flux (or magnitude) measured in each exposure is needed. Assuming you work in magnitudes, the basic idea is to subtract some reference light curve from the target light curve. Recall that this corresponds to a division when working with fluxes.

To get you started, a somewhat minimal strategy for the calibration could be the following:

  1. Select a reference star.

  2. Compute the difference between its light curve (in instrumental magnitudes) and the median of that light curve. This isolates the observed “variability” of the reference star.

  3. Subtract the above difference from the target light curve (in instrumental magnitudes). The resulting light curve is still in instrumental magnitudes, but is now empirically corrected for the variability of the reference star.

  4. Optionally, apply a zero point to transform the instrumental magnitudes into (approximate) apparent magnitudes, using for example a literature value for the apparent magnitude of the host star and the out-of-transit baseline of the light curve.

This simple approach is probably sufficient to see the transit (depending on its depth and the reference star you selected), but it leaves plenty of room for improvement, especially given our large field of view. A minimal code sketch of these four steps follows below.
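To make this concrete, here is a minimal sketch of the four steps, reusing instr_mag and target_index from above. The reference star index ref_index and the literature magnitude are placeholders that you will need to adapt:

# Minimal differential calibration, following the steps above:
ref_index = 17 # Step 1: index of a hand-picked reference star, to be adapted!

# Step 2: variability of the reference star around its own median magnitude:
ref_variability = catalog["instr_mag"][ref_index] - np.nanmedian(catalog["instr_mag"][ref_index])

# Step 3: subtract this variability from the target light curve:
calib_mag = catalog["instr_mag"][target_index] - ref_variability

# Step 4 (optional): apply a zero point so that the out-of-transit baseline matches a
# literature apparent magnitude of the host star (9.0 is a placeholder value):
#calib_mag += 9.0 - np.nanmedian(calib_mag)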

Question

Try to further develop the above strategy, in particular towards better robustness and precision. Write it down in a clear formalism (i.e., equations) with well-chosen symbols, and of course apply it to the data and discuss any improvements.

Question

In theory, what would be the properties of good reference stars to use for the calibration, and why?

It’s good to keep in mind that technical or instrumental effects can have a strong impact on the calibration, and influence the optimal selection of reference stars. So in practice, it is essential to actually test the calibration, instead of relying on theoretical considerations about what “should work best”.

Discussion of the light curves#

With a good calibration strategy at hand, you might now be able to clearly reveal the transit.

Question

In addition to the light curve of the target star, also plot the light curves of some reference stars (applying the same calibration). Use this to verify that your corrections are successful, and that you did not pick up variable stars or stars that have observational issues as reference sources. Do you see residual correlations between some stars?
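As a starting point, a sketch building on the calibration cell above (the indices in check_indices are placeholders to adapt) could apply the same correction to a few other stars and plot them alongside the target:

# Apply the same correction to a few other stars, and plot everything together:
check_indices = [0, 5, 8] # Placeholder indices, adapt these!

plt.figure()
ax = plt.subplot()
ax.plot(dates, calib_mag, lw=1, color="black", label=f"Target ({target_index})")
for index in check_indices:
    ax.plot(dates, catalog["instr_mag"][index] - ref_variability, lw=1, label=index)

ax.invert_yaxis()
ax.xaxis.set_major_formatter(matplotlib.dates.ConciseDateFormatter(ax.xaxis.get_major_locator()))
ax.set_xlabel("UTC")
ax.set_ylabel("Calibrated instrumental magnitude")
plt.gcf().autofmt_xdate()
plt.legend()
plt.tight_layout()
plt.show()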

Note

It could be interesting to make a scatter plot of the median apparent magnitude versus the standard deviation of each calibrated light curve (i.e., one point per star). This could help you identify stars that might not be that useful for the calibration (e.g., pay attention to potential saturation issues for bright stars), and it also nicely summarizes the achieved photometric precision as a function of magnitude.
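For example, using the summary statistics computed earlier (shown here with the instrumental values as a sketch; replace them by the statistics of your calibrated light curves once available):

# Scatter plot of the per-star summary statistics:
plt.figure()
ax = plt.subplot()
ax.scatter(catalog["median_instr_mag"], catalog["std_instr_mag"], s=10)
ax.set_yscale("log") # The scatter differs by orders of magnitude between bright and faint stars
ax.set_xlabel("Median instrumental magnitude")
ax.set_ylabel("Standard deviation of light curve [mag]")
plt.tight_layout()
plt.show()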

Question

If you see stars with strong residuals after the calibration, what could be the reasons? You can try to optimize the calibration by adapting the code to ignore particular stars.

Discussion of the exoplanet transit#

Question

Can you identify the start and/or end of the transit? Give an estimate of the uncertainty of these times.

Question

Can you measure the depth (in magnitudes) of the transit? If yes, convert your value to the commonly used “parts per thousand” (ppt).
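As a reminder of the conversion: a transit depth of Δm magnitudes corresponds to a fractional flux drop of 1 − 10^(−Δm/2.5). A minimal sketch, where the value of delta_mag is a placeholder to replace by your own measurement:

delta_mag = 0.02 # Placeholder transit depth in magnitudes, replace by your measurement!
depth = 1.0 - 10**(-delta_mag / 2.5) # Fractional flux drop during the transit
print(f"A depth of {delta_mag} mag corresponds to {1000.0 * depth:.1f} ppt.")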

Question

Are your findings regarding the transit consistent with measurements from the literature (depth, duration, start/end time of the transit)?

Question

Calculate the radius of the exoplanet relative to its host star from your measurement of the light curve. Think as well about how you could estimate the uncertainties.
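As a hint, assuming a uniform stellar disk (i.e., ignoring limb darkening), the fractional flux drop equals the fraction of the stellar disk covered by the planet, so depth ≈ (R_planet / R_star)². Building on the depth computed in the previous sketch:

# Radius ratio from the fractional transit depth, in the uniform-disk approximation:
radius_ratio = np.sqrt(depth) # R_planet / R_star
print(f"R_planet / R_star = {radius_ratio:.3f}")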