Colour-magnitude diagram#
In this notebook, we will:
read the photometric catalogs of the science frames and group them in a convenient structure,
calibrate the photometry by comparing our instrumental magnitudes to apparent magnitudes from a reference catalogue,
draw a colour-magnitude diagram (CMD),
and compare this CMD to isochrones from a stellar evolution code.
You can download this page as a Jupyter notebook file.
Note
In case you have been working on an AIfA lab room computer until this point, you might be interested in moving to a personal laptop (for example) for these last steps.
All that we'll need from here on is the content of the PHOTOMETRY directory (including the reference catalog). This is a relatively small amount of data (compared to the raw or pre-reduced images), so you could consider transferring just these files to your own computer. The CPU requirements for these last steps will also be negligible. Make sure to install the right Python environment, following Requirements and installation. There is no need for astrometry.net at this stage.
import dataredconfig
import numpy as np
import astropy
import astropy.table
import astropy.visualization
import astropy.coordinates
import astroquery.vizier
import urllib.request
import pathlib
import pickle
from astropy import units as u
%matplotlib widget
import matplotlib
import matplotlib.widgets
from matplotlib import pyplot as plt
Gathering and structuring the data#
We start by reading the relevant catalogs as a list of tables.
# Where the photometry catalogs are:
photometry_dir = dataredconfig.work_dir / "PHOTOMETRY"
object_to_process = "M 37"
catalog_filepaths = sorted(list(photometry_dir.glob('*.fits')))
catalogs = []
for catalog_filepath in catalog_filepaths:
    catalog = astropy.table.Table.read(catalog_filepath)
    # We select the photometric catalogs of our object:
    if "OBJECT" in catalog.meta:
        if catalog.meta["OBJECT"] == object_to_process:
            print(f"{catalog_filepath} : {catalog.meta}")
            catalogs.append(catalog)

print(f"Collected {len(catalogs)} catalogs.")
Now we “merge” all these separate tables into a particular structure that will make the analysis easier:
Measurements done with the same filter are stacked in “depth”, i.e., grouped in 2D-columns.
Finally, all filters are stacked “horizontally”, i.e., as separate columns in one single table.
filter_names = ("g", "r", "i")
filter_catalogs = []
for filter_name in filter_names:
    #print(f"Starting filter {filter_name}")
    this_filter_catalogs = [catalog for catalog in catalogs if catalog.meta["FILTER"] == filter_name]
    this_filter_catalog = astropy.table.dstack(this_filter_catalogs, join_type="exact", metadata_conflicts="silent")
    this_filter_catalog.name = filter_name
    filter_catalogs.append(this_filter_catalog)

# And we stack these catalogs horizontally into one single table:
cat = astropy.table.hstack(filter_catalogs,
    join_type="exact", metadata_conflicts="silent",
    table_names=filter_names, uniq_col_name='{col_name}_{table_name}'
)
# The columns are now the following:
print("Columns of cat:")
for colname in cat.colnames:
    print(f"{colname}: shape {cat[colname].shape}")
Note how the filter name is now appended to the column names. These columns have two dimensions: the first index identifies the source (i.e., the star), and the second index runs over the different exposures.
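For instance, this is how such a 2D column can be indexed (a small illustration, using the aperture-photometry column sum_4_g that we will also use below):

# Flux of the first star in the first g-band exposure:
print(cat["sum_4_g"][0, 0])
# All g-band exposures of the first star:
print(cat["sum_4_g"][0, :])
# The first g-band exposure of all stars:
print(cat["sum_4_g"][:, 0])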
We now add two more columns: the sky position of each star and, while we are at it, the separation between each star and the cluster center.
# To get the position, we need to read the reference catalog:
ref_catalog = astropy.table.Table.read(photometry_dir / f"ref_catalog_{object_to_process}.fits")
assert len(ref_catalog) == len(cat) # Just a check that these are indeed of same length
cat["sky_pos"] = ref_catalog["sky_centroid_win"]
target_center_pos = astropy.coordinates.SkyCoord(cat.meta["RA"]*u.deg, cat.meta["DEC"]*u.deg)
cat["separation"] = target_center_pos.separation(cat["sky_pos"])
A quick CMD in instrumental magnitudes#
As an example of how the catalog can now be used, we compute a median instrumental magnitude in each filter (where the median is taken over the different exposures) and plot a simple uncalibrated CMD. This quick magnitude computation is clearly not optimal, in particular if the atmospheric conditions changed during the observation! But it should be sufficient to get a first idea of the CMD. We'll come back to magnitude calibration later.
# We take the median across "axis=1", that is the second index
# (the first index (axis=0) identifies the different stars).
# Note that the aperture radius of 4 arcsec is just used here as an example, other radii might perform better!
cat["quick_instr_mag_g"] = np.median(-2.5 * np.log10(cat["sum_4_g"].value), axis=1)
cat["quick_instr_mag_r"] = np.median(-2.5 * np.log10(cat["sum_4_r"].value), axis=1)
cat["quick_instr_mag_i"] = np.median(-2.5 * np.log10(cat["sum_4_i"].value), axis=1)
# Plotting a first CMD
plt.figure()
plt.scatter(
    cat["quick_instr_mag_g"] - cat["quick_instr_mag_i"],
    cat["quick_instr_mag_r"],
    s=5,
    c=cat["separation"].value  # color represents separation between star and cluster center
)
plt.gca().invert_yaxis()
plt.colorbar(label=f"Separation to target center in {cat['separation'].unit}")
plt.xlabel("Instrumental mag g - i")
plt.ylabel("Instrumental mag r")
plt.title(object_to_process)
plt.show()
Magnitude calibration#
To compare our data with stellar models, we need to:
convert our instrumental magnitudes into apparent magnitudes (that is, “standardise” our observations)
correct the absolute magnitudes of the stellar models by some appropriate distance and extinction, so that they can be compared to apparent magnitudes.
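The second of these steps uses the standard relation between absolute and apparent magnitude, which the helper function mag_absolute_to_apparent defined further below implements:
\[ m = M + 5\log_{10}(d/\mathrm{pc}) - 5 + A, \]
where \(m\) is the apparent magnitude, \(M\) the absolute magnitude, \(d\) the distance, and \(A\) the extinction in the considered band.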
In this section we will start by calibrating our magnitude measurements. We do this by matching some of the measured stars with a reference catalogue.
As a preparatory task, we compute instrumental magnitudes for each flux measurement, so that we can process each exposure individually.
# Computing instrumental magnitudes for all apertures and exposures:
filter_names = ["g", "r", "i"]
photometry_prefixes = ["sum_4", "sum_6", "sum_8", "sum_10", "flux_fit"]
for filter_name in filter_names:
    for photometry_prefix in photometry_prefixes:
        cat[f"instr_mag_{photometry_prefix}_{filter_name}"] = -2.5 * np.log10(cat[f"{photometry_prefix}_{filter_name}"].value)
# Recall that these are 2D-columns with indices (source, exposure)
Matching to a reference catalogue#
We will make use of the catalogue ATLAS-REFCAT2. This is an all-sky reference catalog that compiles several large surveys. In particular, among many other data, it contains real photometry in the SDSS filters g, r, and i for stars brighter than 14th magnitude. Details about the catalogue can be found at https://archive.stsci.edu/hlsp/atlas-refcat2 .
We access this catalogue via a service called VizieR, a library of published astronomical catalogues.
vizier = astroquery.vizier.Vizier()
vizier.ROW_LIMIT = 500 # just as a safety measure, 500 is more than enough
catalogue_list = vizier.find_catalogs('ATLAS-REFCAT2')
print("Identified catalogues:")
for k, v in catalogue_list.items():
    print(k, ":", v.description)
ref_cat_name = list(catalogue_list.keys())[0]
# Loading the reference catalogue within 20 arcmin around the target, and down to mag 13:
ref_cat = vizier.query_region(target_center_pos, radius=20*u.arcmin, catalog=ref_cat_name, column_filters={'rmag': '<13'})[0]
print(ref_cat.colnames)
print(ref_cat)
# If you obtain 500 stars (i.e., the ROW_LIMIT), decrease the depth of the request (see column_filters above).
# We really don't need more than a few stars, in fact!
# Matching our own catalogue with the reference catalogue
# The coordinates to match:
ref_cat_coords = astropy.coordinates.SkyCoord(ra=ref_cat["RA_ICRS"], dec=ref_cat["DE_ICRS"])
cat_coords = cat["sky_pos"]
# The matching itself is just one call:
idx, seps, _ = astropy.coordinates.match_coordinates_sky(cat_coords, ref_cat_coords)
# This returns:
# idx: indices into ref_cat_coords to get the matched points for each cat_coords. Shape matches cat_coords.
# seps: on-sky separation between the closest match for each cat_coords and the cat_coords. Shape matches cat_coords.
# And assembling a combined matched catalogue:
sep_constraint = seps < 2.0 * u.arcsec # criterion for a successful identification
cat_matches = cat[sep_constraint]
ref_cat_matches = ref_cat[idx[sep_constraint]]
# joining these two into one single matched catalogue:
matched_cat = astropy.table.hstack([cat_matches, ref_cat_matches], join_type="exact", metadata_conflicts="silent")
print("Column names of matched catalogue:")
print(matched_cat.colnames)
print(f"Length of matched catalogue: {len(matched_cat)}")
We can have a look at the histogram of separations between the matches, i.e., between each source from our own catalogue and the star from the reference catalogue that is nearest to it. If our astrometric calibration is good, the typical astrometric error should be much smaller than a pixel.
# Histogram of separations:
plt.figure()
plt.hist(seps.to_value(u.arcsec), bins=50, range=(0, 2))
plt.xlabel("Separation in arcseconds")
plt.title("Histogram of on-sky separation between matches")
plt.show()
Computing the magnitude zero points#
Using the matched catalogue, we can visualize the difference between our own instrumental magnitudes and the apparent magnitudes from the reference catalogue.
exp_to_show = 0 # Showing data from the first exposure in each filter
photom_prefix_to_show = "sum_4" # Showing data for the 4-arcsec aperture
plt.figure()
plt.plot(matched_cat["rmag"], matched_cat[f"instr_mag_{photom_prefix_to_show}_g"][:,exp_to_show] - matched_cat["gmag"], ls="None", marker=".", label="g", color="green")
plt.plot(matched_cat["rmag"], matched_cat[f"instr_mag_{photom_prefix_to_show}_r"][:,exp_to_show] - matched_cat["rmag"], ls="None", marker=".", label="r", color="orange")
plt.plot(matched_cat["rmag"], matched_cat[f"instr_mag_{photom_prefix_to_show}_i"][:,exp_to_show] - matched_cat["imag"], ls="None", marker=".", label="i", color="red")
plt.xlabel("Apparent magnitude from reference catalogue")
plt.ylabel(f"instr_mag_{photom_prefix_to_show} - apparent mag")
plt.legend()
plt.grid()
plt.show()
And now we compute and apply a zero point correction for each filter, exposure, and aperture size. The zero points are computed using the matched catalogue, but then we apply them to our full catalogue, by adding new columns with calibrated apparent magnitudes.
for filter_name in filter_names:
    for photometry_prefix in photometry_prefixes:
        # We compute the zero point using the catalogue matched_cat.
        # The zero point is computed as the median (across all sources) of the mag differences plotted above.
        # For this we'll need to manually "broadcast" the reference magnitude columns, as they are 1D (i.e., they don't contain several "exposures").
        instr_mags = matched_cat[f"instr_mag_{photometry_prefix}_{filter_name}"]  # This is 2D
        ref_mags = np.tile(matched_cat[f"{filter_name}mag"], (instr_mags.shape[1], 1)).transpose()  # This is now also 2D, same shape as instr_mags
        #ref_mags = np.repeat(np.atleast_2d(matched_cat[f"{filter_name}mag"]), instr_mags.shape[1], axis=0).transpose()  # Equivalent code, not much nicer
        zeropoints = np.ma.median(instr_mags - ref_mags, axis=0)  # Median taken across the first index, i.e., the sources
        # This contains one zero point per exposure.
        # And now we directly apply those zero points in the catalogue "cat", creating a new column:
        cat[f"mag_{photometry_prefix}_{filter_name}"] = cat[f"instr_mag_{photometry_prefix}_{filter_name}"] - zeropoints
        # To get one single magnitude per star, we compute the median (across exposures) of the calibrated magnitudes:
        cat[f"median_mag_{photometry_prefix}_{filter_name}"] = np.ma.median(cat[f"mag_{photometry_prefix}_{filter_name}"], axis=1)

# Note that this last step can certainly be improved and extended; one could compute several statistics at this stage.
# The above is just meant to provide a simple and robust first solution to plot a CMD.
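As a quick diagnostic (a minimal sketch, not part of the calibration itself), you could inspect how the zero points of one configuration vary from exposure to exposure; a large scatter would hint at variable observing conditions:

# Inspecting the per-exposure zero points for one example configuration (4-arcsec aperture, r band):
instr_mags_r = matched_cat["instr_mag_sum_4_r"]  # 2D: (source, exposure)
ref_mags_r = np.tile(matched_cat["rmag"], (instr_mags_r.shape[1], 1)).transpose()
zeropoints_r = np.ma.median(instr_mags_r - ref_mags_r, axis=0)
print("Zero point of each r-band exposure:", zeropoints_r)
print("Scatter between exposures:", np.ma.std(zeropoints_r))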
Let’s repeat the CMD plot with these new calibrated apparent magnitudes:
# The calibrated observational CMD
plt.figure()
plt.scatter(
    cat["median_mag_sum_4_g"] - cat["median_mag_sum_4_i"],
    cat["median_mag_sum_4_r"],
    s=5,
    c=cat["separation"].value  # color represents separation between star and cluster center
)
plt.gca().invert_yaxis()
plt.colorbar(label=f"Separation to target center in {cat['separation'].unit}")
plt.xlabel("g - i")
plt.ylabel("r")
plt.title(object_to_process)
plt.show()
Isochrones#
We will use isochrones (i.e., “synthetic” cluster CMDs) based on the stellar evolution code MESA, as provided by the MIST project (Dotter 2016 and Choi et al. 2016). These isochrones have been generated with synthetic photometry computed in the same Sloan/SDSS photometric system that we use for our observations. To make the use of these isochrones as easy as possible, we have repackaged them into a single custom file, which we will download and use below.
In case you want to know more about the origin of these isochrones, see the website of MIST (https://waps.cfa.harvard.edu/MIST/), the website of MESA (https://mesastar.org), and the related publications.
The details about our repackaging into a single file are given in this notebook.
Loading the isochrones#
# Download the file containing all the isochrones (about 48 MB):
isochrones_url = "https://uni-bonn.sciebo.de/s/czdMPjrLyi571uu/download"
local_isochrones_filepath = pathlib.Path("isochrones_MIST.pickle")
if not local_isochrones_filepath.exists():
    urllib.request.urlretrieve(isochrones_url, local_isochrones_filepath)

# Read this file with the following code (see below for explanations):
with open(local_isochrones_filepath, 'rb') as f:
    (isochrones, vrots, metallicities, logages) = pickle.load(f)
The structure of this data needs a few explanations.
In practice, a single isochrone is stored as a 2D table: columns are absolute magnitudes in g, r, i, and rows correspond to different initial stellar masses. One can compute differences between these columns to obtain colors.
Obviously we need several isochrones. So we have such a table for several different ages, for several different metallicities, and for two stellar rotation velocities (more precisely: with or without rotation). Note that these isochrone tables don’t necessarily have the same size (i.e., the same resolution of initial masses).
The contents read by the code above are the following:
isochrones is a nested list (i.e., a list of lists of lists) of astropy Tables containing the isochrones. The first index of isochrones corresponds to the stellar rotation velocity, the second index corresponds to the metallicity, and the third index corresponds to the age. The contained astropy tables have 4 columns, named g, r, i, and initial_mass, where the photometry is given in absolute magnitudes, and the initial mass is given in solar masses.
vrots is a 1D numpy array containing the stellar rotation velocities (first index of isochrones). In fact, there are just two values: 0.0 means no rotation, and 0.4 means that the initial rotation velocity of higher-mass stars is set to 0.4 times a critical surface velocity.
metallicities is a 1D numpy array with the different available metallicities ([Fe/H]) of the isochrones, corresponding to the second index of isochrones. Recall that the metallicity of stars, i.e., the abundance of elements other than hydrogen and helium, has a significant influence on stellar evolution, and that we can assume that the stars of a cluster share the same metallicity. Metallicity is often described by the logarithm of the iron-to-hydrogen ratio relative to that of the Sun, denoted [Fe/H]. For example, [Fe/H] = 0 means that the stars have the same metallicity as the Sun, while [Fe/H] = -1.0 means that their metallicity is one tenth that of the Sun.
logages is a 1D numpy array containing the log10 of the available ages (in years) of the isochrones, corresponding to the third index of isochrones.
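Before plotting, it can help to get an overview of the extent of this grid (a small sketch, using only the arrays described above):

# Overview of the isochrone grid:
print(f"Rotation velocities: {vrots}")
print(f"{len(metallicities)} metallicities, from [Fe/H] = {metallicities.min()} to {metallicities.max()}")
print(f"{len(logages)} ages, from log10(age/yr) = {logages.min()} to {logages.max()}")
# One example isochrone table:
print(isochrones[0][0][0])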
The following plot demonstrates how to use the objects read from the isochrone file.
# Demonstrating the use of the isochrones with a simple non-interactive plot
vrot_index = 0
metallicity_index = 12
age_indices_to_plot = [60, 70, 80]

plt.figure()
for age_index in age_indices_to_plot:
    age = 10.0**logages[age_index]
    metallicity = metallicities[metallicity_index]
    vrot = vrots[vrot_index]
    label_string = f"{age/1e9:.2f} Gyr"
    # And this is how to get one of the isochrones:
    iso = isochrones[vrot_index][metallicity_index][age_index]
    plt.plot(iso["r"] - iso["i"], iso["r"], label=label_string)
plt.gca().invert_yaxis()
plt.xlabel("Absolute mag r - absolute mag i")
plt.ylabel("Absolute magnitude r")
plt.legend()
plt.show()
Extinction and distance#
To compare these isochrones to our calibrated data, we need two more ingredients: extinction, and distance.
Extinction is the radiation loss due to scattering and absorption by the interstellar medium (gas and dust). For a given line of sight, the amount of extinction depends on wavelength: blue light is scattered and absorbed more than red light, causing reddening of the light of stars. This dependency on wavelength is called a reddening law. Its shape depends on the properties of the interstellar medium, and is not the same for every cluster or part of the sky. The relation that we’ll use below is an empirical “average” solution, certainly only approximate, but it illustrates the idea.
Note that extinction is multiplicative in flux, and in practice it can therefore be expressed as a magnitude offset \(A\) for each band (assuming some spectrum for the starlight). These magnitude offsets in the different bands are proportional to each other. For a given reddening law, they can be expressed as multiples of the extinction in some reference band. In the example below, this reference is the visible band V, and the reference extinction is denoted \(A_{\mathrm{V}}\). We will use it as a free parameter.
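To make this explicit (a one-line derivation, using the standard magnitude definition): if extinction reduces the flux by a factor \(10^{-0.4 A}\), then the magnitude \(m = -2.5\log_{10}(F) + \mathrm{const}\) changes as
\[ m_{\mathrm{observed}} = m_{\mathrm{unextincted}} + A, \]
which is why extinction enters as an additive offset per band.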
def extinction_law(A_V):
    """
    Returns extinction values for the different SDSS/Sloan filters, based on some reference total extinction A_V.
    The particular values we use here are from Wang and Chen (2019) ApJ 877 116:
    https://iopscience.iop.org/article/10.3847/1538-4357/ab1c61
    """
    #return {"g": ref_ext*3.30, "r": ref_ext*2.31, "i": ref_ext*1.71}  # Table 2 of Yuan et al. (2013) https://doi.org/10.1093/mnras/stt039
    return {"g": A_V*1.205, "r": A_V*0.848, "i": A_V*0.630}  # Table 3 of Wang and Chen (2019)
def mag_absolute_to_apparent(m, distance, extinction):
    """
    Function that corrects absolute magnitudes by distance modulus and extinction, to get apparent magnitudes.
    m: absolute magnitude(s)
    distance: in pc
    extinction: in mag
    """
    return m + 5.0*np.log10(distance) - 5 + extinction
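A quick sanity check of this function: a star of absolute magnitude 5.0 placed at 1000 pc with 0.3 mag of extinction should appear at 5.0 + 5 log10(1000) - 5 + 0.3 = 15.3 mag.

# Expected: 5.0 + 15.0 - 5.0 + 0.3 = 15.3
print(mag_absolute_to_apparent(5.0, 1000.0, 0.3))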
Comparing the observed CMD with isochrones#
With all this in hand, we will now build an interactive matplotlib plot, displaying the observed CMD, and overplotting isochrones of selected age and metallicity. For this we define one more helper function:
def get_apparent_isochrone(vrot_index, metallicity_index, age_index, distance=10, A_V=0):
    """
    Helper function to get a specific isochrone at a specific distance and total extinction
    """
    # Computing the extinctions:
    extinctions = extinction_law(A_V)
    # Obtaining the corrected magnitudes:
    iso_app_g = mag_absolute_to_apparent(isochrones[vrot_index][metallicity_index][age_index]["g"], distance, extinctions["g"])
    iso_app_r = mag_absolute_to_apparent(isochrones[vrot_index][metallicity_index][age_index]["r"], distance, extinctions["r"])
    iso_app_i = mag_absolute_to_apparent(isochrones[vrot_index][metallicity_index][age_index]["i"], distance, extinctions["i"])
    # A fancy string describing the isochrone, with some LaTeX formatting:
    label_string = f"Age {(10.0**logages[age_index])/(1e9):.2f} Gyr, " \
        + f"[Fe/H] = {metallicities[metallicity_index]:.2f}, " \
        + r"$v/v_{\mathrm{crit}}$ = " + f"{vrots[vrot_index]}, " \
        + f"$d$ = {distance:.0f} pc, " \
        + r"$A_{\mathrm{V}}$ = " + f"{A_V:.2f}"
    return (iso_app_g, iso_app_r, iso_app_i, label_string)
And now the interactive plot, using matplotlib sliders:
# Initial position of sliders:
init_distance = 10.0
init_age_index = 50
init_metallicity_index = 12
init_A_V = 0.0
init_vrot_index = 0
fig, ax = plt.subplots(figsize=(10, 6))
fig.subplots_adjust(left=0.30, bottom=0.1) # making room for the sliders
# Plotting the observed data
cbar = ax.scatter(
    cat["median_mag_sum_4_r"] - cat["median_mag_sum_4_i"],
    cat["median_mag_sum_4_r"],
    s=5,
    c=cat["separation"].value  # color represents separation between star and cluster center
)
# Plotting the isochrone
(g, r, i, label_string) = get_apparent_isochrone(init_vrot_index, init_metallicity_index, init_age_index, init_distance, init_A_V)
line, = ax.plot(r-i, r, color="red", linewidth=1.0)
text = ax.text(0.5, 0.95, label_string, color="red", horizontalalignment='center', verticalalignment='center', transform=ax.transAxes)
# The function to be called anytime a slider or button is changed
def update(val):
    vrot_index = vrot_button.index_selected
    metallicity_index = metallicity_slider.val
    age_index = age_slider.val
    distance = dist_slider.val
    A_V = ext_slider.val
    (g, r, i, label_string) = get_apparent_isochrone(vrot_index, metallicity_index, age_index, distance, A_V)
    line.set_xdata(r - i)
    line.set_ydata(r)
    text.set_text(label_string)
    fig.canvas.draw_idle()
# Slider to control the distance
axdist = fig.add_axes([0.05, 0.25, 0.0225, 0.6])
dist_slider = matplotlib.widgets.Slider(
    ax=axdist,
    label=r"$d$ [pc]",
    valmin=10,
    valmax=10000,
    valinit=init_distance,
    orientation="vertical"
)
# Slider to control the age
axage = fig.add_axes([0.09, 0.25, 0.0225, 0.6])
age_slider = matplotlib.widgets.Slider(
    ax=axage,
    label=r"$i_{\mathrm{Age}}$",
    valmin=0,
    valmax=len(logages)-1,
    valstep=1,
    valinit=init_age_index,
    orientation="vertical"
)
# Slider to control the metallicity
axmetal = fig.add_axes([0.13, 0.25, 0.0225, 0.6])
metallicity_slider = matplotlib.widgets.Slider(
    ax=axmetal,
    label=r"$i_{\mathrm{[Fe/H]}}$",
    valmin=0,
    valmax=len(metallicities)-1,
    valstep=1,
    valinit=init_metallicity_index,
    orientation="vertical"
)
# Slider to control the extinction
axext = fig.add_axes([0.17, 0.25, 0.0225, 0.6])
ext_slider = matplotlib.widgets.Slider(
    ax=axext,
    label=r"$A_{\mathrm{V}}$",
    valmin=0.0,
    valmax=3.0,
    valinit=init_A_V,
    orientation="vertical"
)
# Radio buttons to control the rotation
axvrot = fig.add_axes([0.05, 0.1, 0.14, 0.08])
vrot_button = matplotlib.widgets.RadioButtons(
    ax=axvrot,
    labels=("No rotation", r"$v/v_{\mathrm{crit}} = 0.4$"),
    active=init_vrot_index
)
# Register the update function with each slider and button:
age_slider.on_changed(update)
dist_slider.on_changed(update)
metallicity_slider.on_changed(update)
ext_slider.on_changed(update)
vrot_button.on_clicked(update)
# Finalize plot
ax.invert_yaxis()
plt.colorbar(cbar, label=f"Separation to target center in {cat['separation'].unit}")
ax.set_xlabel("$r$ - $i$")
ax.set_ylabel("$r$")
ax.set_title(object_to_process)
plt.show()
Discussion of the CMD#
Question
What photometric aperture radius gives the most accurate CMD? Apart from judging the appearance of the CMD itself, feel invited to make and discuss dedicated plots (e.g., similar to the one made just before the zero-point computation above). Especially if the observing conditions were variable, it might be interesting to analyse if and how the individual exposures differ, instead of just considering their median.
Question
Optionally, and somewhat related to the previous question, you could compute uncertainty estimates for your magnitude measurements and visualize them on the CMD. There are several different approaches to computing these. If you choose to do so, make sure to describe precisely how you computed these error bars, and discuss what sources of uncertainty they represent. One possible starting point is sketched below.
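One simple possibility, as a minimal sketch (it only captures the exposure-to-exposure repeatability, not systematic calibration errors; the column names mag_err_sum_4_* are just examples): use the scatter of the calibrated magnitudes across the exposures as a per-star uncertainty, and attach it to the CMD as error bars.

# Scatter across exposures as a simple per-star uncertainty estimate:
for filter_name in filter_names:
    cat[f"mag_err_sum_4_{filter_name}"] = np.ma.std(cat[f"mag_sum_4_{filter_name}"], axis=1)

plt.figure()
plt.errorbar(
    cat["median_mag_sum_4_g"] - cat["median_mag_sum_4_i"],
    cat["median_mag_sum_4_r"],
    yerr=cat["mag_err_sum_4_r"],
    fmt=".", markersize=3, elinewidth=0.5
)
plt.gca().invert_yaxis()
plt.xlabel("g - i")
plt.ylabel("r")
plt.title(object_to_process)
plt.show()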
Question
What features do you see in the observed CMD? Describe your data.
Question
By comparing the calibrated CMD to the synthetic isochrones, give an estimate of the age and distance of your cluster, and possibly also of the metallicity and extinction. You can discuss how specific features of your CMD help in constraining these parameters.
Question
Discuss the difficulty of obtaining estimates (and uncertainty estimates) for these parameters. You could create CMDs with several isochrones to illustrate your points. If you had to quantify the uncertainties, how could you approach this?
Question
Provide a comparison with values from the literature (at least for distance and age).
Question
Try to list the ingredients of this whole analysis that could possibly have issues. In other words, what do you think are the key assumptions of your analysis?