IOAA 2024 Data Analysis Solutions
Table 1
ID1  RA (deg)  Dec (deg)  m_g (mag)  err m_g (mag)
1    0.047255  0.000406   21.7649    0.0120
2    0.064741  0.021568   21.1111    0.0067
3    0.064911  0.026395   21.3931    0.0084
4    0.098343  0.054871   21.3934    0.0088
5    0.022256  0.039129   21.9933    0.0157
6    0.006188  0.066928   21.5490    0.0088
7    0.083945  0.074259   21.9395    0.0126
8    0.076715  0.079496   21.4808    0.0089
9    0.057422  0.076006   21.8897    0.0127
10   0.024412  0.087688   21.8341    0.0126
11   0.044723  0.091782   21.8868    0.0172
12   0.071089  0.091053   21.4390    0.0098

Table 2
ID2  RA (deg)  Dec (deg)  m_g (mag)  err m_g (mag)
1    0.006167  0.066874   21.9020    0.0576
2    0.018660  0.007450   21.8039    0.0529
3    0.047853  0.061487   21.3007    0.0418
4    0.050870  0.015659   21.1678    0.0388
5    0.051270  0.020812   21.2524    0.0401
6    0.057414  0.075999   21.8884    0.0578
7    0.064745  0.021583   21.3634    0.0422
8    0.064910  0.026419   21.6428    0.0488
9    0.071102  0.091058   21.9259    0.0751
10   0.074946  0.002792   21.3258    0.0410
11   0.076709  0.079474   21.5303    0.0476
12   0.092635  0.077395   21.6995    0.0513
13   0.098343  0.054854   21.6542    0.0499
14   0.099332  0.093711   21.8802    0.0577
(a) (5 points) From these tables, which survey (SDSS or DES) is Table 1 and which is Table
2? Assume that both surveys are equivalent regarding detector response, exposure times,
and site characteristics.
Solution: Table 1 is DES and Table 2 is SDSS. Faint stars have higher uncertainties,
and at similar magnitudes the errors in Table 2 are several times larger than those in
Table 1. SDSS uses a smaller telescope (2.5 m) than DES (4 m); hence, it is typically
shallower and has larger errors. Comparing the error distributions of the two tables is
enough to tell which survey has the larger errors.
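As a quick numerical cross-check of this argument, the sketch below compares the median of the err m_g column in each table. The use of the median as the summary statistic is an assumption made here for illustration; the expected answer only requires inspecting the error distributions.

```python
from statistics import median

# err m_g columns copied from Table 1 and Table 2 above
err_table1 = [0.0120, 0.0067, 0.0084, 0.0088, 0.0157, 0.0088,
              0.0126, 0.0089, 0.0127, 0.0126, 0.0172, 0.0098]
err_table2 = [0.0576, 0.0529, 0.0418, 0.0388, 0.0401, 0.0578, 0.0422,
              0.0488, 0.0751, 0.0410, 0.0476, 0.0513, 0.0499, 0.0577]

median1 = median(err_table1)
median2 = median(err_table2)

# The table with the smaller typical error is the deeper survey.
deeper = "Table 1" if median1 < median2 else "Table 2"
print(f"median err: Table 1 = {median1:.4f} mag, Table 2 = {median2:.4f} mag")
print(f"{deeper} has the smaller errors, so it is the deeper survey")
```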
(b) (35 points) Using the data in the table, plot the magnitude ($m_g$) on the x-axis (linear scale)
and the error in magnitude (err $m_g$) on the y-axis (logarithmic scale) using the semi-log
paper marked as Graph 1. Estimate the angular coefficient A (slope) and linear coefficient
B (y-axis intercept) for each dataset. There is no need to calculate the associated errors.
Page 2 of 9 Data Analysis Round
SDSS:
$\log_{10}(\mathrm{err}\,m_g) = A \cdot m_g + B$
$A = 0.2780$ (acceptable from 0.2502 to 0.3058, which is within 10% range of the fit)
$B = -7.3108$ (acceptable from -8.0419 to -6.5797, which is within 10% range of the fit)
DES:
$\log_{10}(\mathrm{err}\,m_g) = A \cdot m_g + B$
$A = 0.4063$ (acceptable from 0.3657 to 0.4470, which is within 10% range of the fit)
$B = -10.7597$ (acceptable from -10.2217 to -11.2977, which is within 10% range of the fit)
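The graphical estimate can be cross-checked numerically. The sketch below is an assumption about method, not the marking scheme: it runs an unweighted least-squares fit of log10(err m_g) against m_g with `numpy.polyfit`, using the (m_g, err m_g) columns of Table 1 and Table 2. The helper name `fit_log_error` is hypothetical.

```python
import numpy as np

# (m_g, err m_g) pairs copied from Table 1 and Table 2
table1 = np.array([
    (21.7649, 0.0120), (21.1111, 0.0067), (21.3931, 0.0084), (21.3934, 0.0088),
    (21.9933, 0.0157), (21.5490, 0.0088), (21.9395, 0.0126), (21.4808, 0.0089),
    (21.8897, 0.0127), (21.8341, 0.0126), (21.8868, 0.0172), (21.4390, 0.0098),
])
table2 = np.array([
    (21.9020, 0.0576), (21.8039, 0.0529), (21.3007, 0.0418), (21.1678, 0.0388),
    (21.2524, 0.0401), (21.8884, 0.0578), (21.3634, 0.0422), (21.6428, 0.0488),
    (21.9259, 0.0751), (21.3258, 0.0410), (21.5303, 0.0476), (21.6995, 0.0513),
    (21.6542, 0.0499), (21.8802, 0.0577),
])

def fit_log_error(table):
    """Unweighted least-squares fit of log10(err m_g) = A * m_g + B."""
    mag, err = table[:, 0], table[:, 1]
    A, B = np.polyfit(mag, np.log10(err), 1)  # degree 1: slope, then intercept
    return A, B

A1, B1 = fit_log_error(table1)
A2, B2 = fit_log_error(table2)
print(f"Table 1: A = {A1:.4f}, B = {B1:.4f}")
print(f"Table 2: A = {A2:.4f}, B = {B2:.4f}")
```

A fit of this kind should land close to the coefficients quoted above, with Table 1 giving the steeper slope. A hand-drawn line on semi-log paper will naturally scatter around these values, which is why a tolerance range is accepted.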
(c) (5 points) The signal-to-noise ratio ($S/N$) is approximately the inverse of the error in the
magnitude, $S/N \approx 1/(\mathrm{err}\,m_g)$. Using the linear fit calculated in the previous part, what is
the $S/N$ reached for each survey at a magnitude of $m_g = 21.5$ mag?
Solution: Using $S/N \sim 1/(\mathrm{err}\,m_g)$ and the fits above, we arrive at the following:
SDSS: $\log_{10}(\mathrm{err}\,m_g) = 0.2780 \cdot 21.5 - 7.3108 \Rightarrow \mathrm{err}\,m_g = 0.0464$
DES: $\log_{10}(\mathrm{err}\,m_g) = 0.4063 \cdot 21.5 - 10.7597 \Rightarrow \mathrm{err}\,m_g = 0.0095$
Accepted answers follow below.
Survey  err at 21.5 mag  S/N  Acceptable S/N (within 10% range of the result)
SDSS    0.05             22   19.8 to 24.2
DES     0.01             106  95.4 to 116.6
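The arithmetic of this part can be reproduced with a few lines of code. The sketch below evaluates the fitted relation at m_g = 21.5 mag using the coefficients quoted in part (b) and inverts the error to get S/N. The function name `snr_at` is hypothetical.

```python
def snr_at(mag, A, B):
    """Return (err m_g, S/N) from log10(err m_g) = A * mag + B and S/N ~ 1 / err m_g."""
    err = 10 ** (A * mag + B)
    return err, 1.0 / err

# Fit coefficients (A, B) quoted in part (b)
fits = {"SDSS": (0.2780, -7.3108), "DES": (0.4063, -10.7597)}

for survey, (A, B) in fits.items():
    err, snr = snr_at(21.5, A, B)
    print(f"{survey}: err m_g = {err:.4f} mag, S/N = {snr:.1f}")
```

Both results fall inside the accepted S/N ranges listed in the table above.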
(d) (15 points) An object in Table 1 that is within 1 arcsecond of an object in Table 2 can be
considered to be the same object. By looking at the RA and Dec of the objects in both
tables, identify the objects in common and write down a new table with the matching IDs,
ID1 and ID2.
Solution: A useful tip for this question is that the SDSS table (Table 2) is sorted by
RA, so it can be used as the reference when scanning the DES coordinates (Table 1)
for matches. These are the stars that can be matched between the catalogs:
ID1 RA (deg) Dec (deg) m_g (mag) err m_g (mag) ID2 RA (deg) Dec (deg) m_g (mag) err m_g (mag) Sep (arcsec)
3 0.064911 0.026395 21.3931 0.0084 8 0.064910 0.026419 21.6428 0.0488 0.08475
9 0.057422 0.076006 21.8897 0.0127 6 0.057414 0.075999 21.8884 0.0578 0.03724
4 0.098343 0.054871 21.3934 0.0088 13 0.098343 0.054854 21.6542 0.0499 0.06186
6 0.006188 0.066928 21.5490 0.0088 1 0.006167 0.066874 21.9020 0.0576 0.2076
12 0.071089 0.091053 21.4390 0.0098 9 0.071102 0.091058 21.9259 0.0751 0.05009
2 0.064741 0.021568 21.1111 0.0067 7 0.064745 0.021583 21.3634 0.0422 0.05655
8 0.076715 0.079496 21.4808 0.0089 11 0.076709 0.079474 21.5303 0.0476 0.08276
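The matching can also be done by brute force. The sketch below is one possible approach and not necessarily how the table above was produced: it computes a small-angle separation for every pair of objects and keeps those within 1 arcsecond. The function names are hypothetical, and because the coordinates are taken as printed in the tables, the computed separations need not reproduce the Sep column exactly.

```python
import math

# (ID, RA in deg, Dec in deg) copied from Table 1 and Table 2
table1 = [
    (1, 0.047255, 0.000406), (2, 0.064741, 0.021568), (3, 0.064911, 0.026395),
    (4, 0.098343, 0.054871), (5, 0.022256, 0.039129), (6, 0.006188, 0.066928),
    (7, 0.083945, 0.074259), (8, 0.076715, 0.079496), (9, 0.057422, 0.076006),
    (10, 0.024412, 0.087688), (11, 0.044723, 0.091782), (12, 0.071089, 0.091053),
]
table2 = [
    (1, 0.006167, 0.066874), (2, 0.018660, 0.007450), (3, 0.047853, 0.061487),
    (4, 0.050870, 0.015659), (5, 0.051270, 0.020812), (6, 0.057414, 0.075999),
    (7, 0.064745, 0.021583), (8, 0.064910, 0.026419), (9, 0.071102, 0.091058),
    (10, 0.074946, 0.002792), (11, 0.076709, 0.079474), (12, 0.092635, 0.077395),
    (13, 0.098343, 0.054854), (14, 0.099332, 0.093711),
]

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcseconds; all inputs in degrees."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra1 - ra2) * math.cos(mean_dec)  # RA difference projected on the sky
    d_dec = dec1 - dec2
    return 3600.0 * math.hypot(d_ra, d_dec)

def cross_match(cat1, cat2, radius_arcsec=1.0):
    """Return (ID1, ID2, separation) for every pair within the matching radius."""
    matches = []
    for id1, ra1, dec1 in cat1:
        for id2, ra2, dec2 in cat2:
            sep = separation_arcsec(ra1, dec1, ra2, dec2)
            if sep <= radius_arcsec:
                matches.append((id1, id2, sep))
    return matches

matches = cross_match(table1, table2)
for id1, id2, sep in matches:
    print(f"ID1 = {id1:2d}  ID2 = {id2:2d}  sep = {sep:.3f} arcsec")
```

With only 12 and 14 objects, the nested loop is adequate; for large catalogs one would sort by RA first, as the tip in the solution suggests.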
(e) (15 points) Using the matched table from part (d), plot the g-band magnitude of each sur-
vey against the other, Table 1 on the x-axis, and Table 2 on the y-axis using the millimetre
(linear) paper marked as Graph 2. Draw on error bars for each point in both horizontal
and vertical directions, using values double err mg (known as a 2σ uncertainty). From
your graph, identify the stars that would be suitable for photometric calibration between
the two surveys and write down their corresponding IDs from Table 1.
Solution: The figure of g magnitude of DES vs SDSS should look like this with the
best linear fit:
The student should identify the stars that lie closest to the one-to-one line ($x = y$)
between the surveys. This could be with a zero offset or with a non-zero offset. The two
stars that would be suitable for photometric calibration with zero offset are the ones with
IDs 8 and 9 of Table 1 (DES). The four stars that would be suitable for photometric
calibration with non-zero offset are the ones with IDs 2, 3, 4 and 6 of Table 1 (DES).
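The zero-offset selection can be illustrated numerically. The sketch below computes the magnitude offset (Table 2 minus Table 1) for each matched pair and flags a star as consistent with the one-to-one line when the offset is smaller than the combined 2σ uncertainty. This quadrature criterion is an assumption made for illustration; the expected answer relies on reading the graph.

```python
import math

# Matched pairs from part (d): ID1, m_g and err m_g (Table 1), m_g and err m_g (Table 2)
matched = [
    (3, 21.3931, 0.0084, 21.6428, 0.0488),
    (9, 21.8897, 0.0127, 21.8884, 0.0578),
    (4, 21.3934, 0.0088, 21.6542, 0.0499),
    (6, 21.5490, 0.0088, 21.9020, 0.0576),
    (12, 21.4390, 0.0098, 21.9259, 0.0751),
    (2, 21.1111, 0.0067, 21.3634, 0.0422),
    (8, 21.4808, 0.0089, 21.5303, 0.0476),
]

zero_offset_ids = []
for id1, m1, e1, m2, e2 in matched:
    offset = m2 - m1
    # Assumed criterion: combine both 2-sigma error bars in quadrature
    two_sigma = 2.0 * math.hypot(e1, e2)
    if abs(offset) <= two_sigma:
        zero_offset_ids.append(id1)
    print(f"ID1 = {id1:2d}  offset = {offset:+.4f} mag  2-sigma = {two_sigma:.4f} mag")

zero_offset_ids.sort()
print("consistent with x = y:", zero_offset_ids)
```

The remaining stars all show a positive offset of a few tenths of a magnitude, which is the non-zero-offset group discussed above.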
Name l (degrees) b (degrees) Distance modulus (mag)
NGC 6522 1.025 -3.926 14.3
NGC 6401 3.450 3.980 14.4
NGC 6342 4.898 9.725 14.5
NGC 6553 5.253 -3.029 13.6
NGC 6440 7.729 3.801 14.6
Ter 12 8.358 -2.101 13.6
VW-CL160 10.151 0.302 14.2
2MASS-GC01 10.471 0.100 12.6
NGC 6517 19.225 6.762 14.8
NGC 6402 21.324 14.804 14.8
NGC 6712 25.354 -4.318 14.3
NGC 6426 28.087 16.234 16.6
NGC 5466 42.150 73.592 16.0
NGC 7089 53.371 -35.770 15.3
NGC 288 151.285 -89.380 14.8
NGC 2298 245.629 -16.006 15.0
NGC 4590 299.626 36.051 15.1
NGC 4372 300.993 -9.884 13.8
NGC 362 301.533 -46.247 14.7
BH 140 303.171 -4.307 13.4
NGC 5927 326.604 4.860 14.6
Patchick 126 340.381 -3.826 14.5
NGC 5897 342.946 30.294 15.5
NGC 6380 350.182 -3.422 14.9
Djor 1 356.675 -2.484 15.0
(a) (20 points) Calculate the distance (in parsecs) of each globular cluster from the Sun as well
as their cartesian coordinates (x,y,z). The x-axis points to the Galactic Centre and the
y-axis points in the direction of galactic rotation. The system is right-handed.
Solution: The first step is to convert the extinction-corrected distance moduli ($DM$)
of all globular clusters to distance ($d$), in parsecs:
$d = 10^{(DM + 5)/5}$ pc
Next, we calculate the cartesian coordinates ($x$, $y$, $z$) of the globular clusters with
respect to the Sun, using the galactic coordinates (longitude $l$ and latitude $b$). The
x-axis points to the Galactic Centre, the y-axis points in the direction of galactic rotation,
and the z-axis is perpendicular to the galactic disk and points in the direction antiparallel
to the angular momentum. The conversion of coordinates should be done as follows:
$x = d \cos b \cos l$, $y = d \cos b \sin l$, $z = d \sin b$
The table below shows the calculated values of $d$, $x$, $y$, and $z$ for all globular clusters.
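The two conversion steps can be sketched in code. The example below assumes the standard distance-modulus relation and the usual spherical-to-cartesian conversion for a right-handed system with x towards the Galactic Centre and y along the direction of rotation. The function names are hypothetical, and only three clusters from the input table are shown.

```python
import math

def distance_pc(distance_modulus):
    """Distance in parsecs from an extinction-corrected distance modulus."""
    return 10 ** ((distance_modulus + 5.0) / 5.0)

def heliocentric_xyz(l_deg, b_deg, d):
    """Cartesian coordinates (same unit as d) from galactic longitude and latitude in degrees."""
    l, b = math.radians(l_deg), math.radians(b_deg)
    x = d * math.cos(b) * math.cos(l)  # towards the Galactic Centre
    y = d * math.cos(b) * math.sin(l)  # direction of galactic rotation
    z = d * math.sin(b)                # towards the North Galactic Pole
    return x, y, z

# (name, l, b, distance modulus) for a few rows of the table above
clusters = [
    ("NGC 6522", 1.025, -3.926, 14.3),
    ("NGC 288", 151.285, -89.380, 14.8),
    ("NGC 4590", 299.626, 36.051, 15.1),
]

for name, l_deg, b_deg, dm in clusters:
    d = distance_pc(dm)
    x, y, z = heliocentric_xyz(l_deg, b_deg, d)
    print(f"{name}: d = {d:.0f} pc, x = {x:.0f}, y = {y:.0f}, z = {z:.0f}")
```

Applying the same two functions to every row of the table gives the full set of d, x, y and z values needed in the following parts.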
(b) (20 points) From the given data, estimate the distance from the Sun to the centre of the
distribution of globular clusters and its uncertainty.
Solution: In order to estimate the distance from the Sun to the centre of the distribution
and its uncertainty, we first need the mean values of each coordinate, $\bar{x}$, $\bar{y}$ and
$\bar{z}$, as well as the standard deviations in each axis, $\sigma_x$, $\sigma_y$ and $\sigma_z$. The calculations of
the mean and standard deviation for the x, y, and z coordinates are as follows:
$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \quad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i, \quad \bar{z} = \frac{1}{N}\sum_{i=1}^{N} z_i$
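The uncertainties of the means are:
$\delta\bar{x} = \frac{\sigma_x}{\sqrt{N}}, \quad \delta\bar{y} = \frac{\sigma_y}{\sqrt{N}}, \quad \delta\bar{z} = \frac{\sigma_z}{\sqrt{N}}$
To estimate the uncertainty $\delta D$ of the distance $D = \sqrt{\bar{x}^2 + \bar{y}^2 + \bar{z}^2}$, we should perform an error
propagation calculation. Since $\partial D/\partial\bar{x} = \bar{x}/D$, and similarly for $\bar{y}$ and $\bar{z}$:
$(\delta D)^2 = \left(\frac{\partial D}{\partial \bar{x}}\right)^2 (\delta\bar{x})^2 + \left(\frac{\partial D}{\partial \bar{y}}\right)^2 (\delta\bar{y})^2 + \left(\frac{\partial D}{\partial \bar{z}}\right)^2 (\delta\bar{z})^2 \Rightarrow$
$(\delta D)^2 = \left(\frac{\bar{x}}{D} \cdot \delta\bar{x}\right)^2 + \left(\frac{\bar{y}}{D} \cdot \delta\bar{y}\right)^2 + \left(\frac{\bar{z}}{D} \cdot \delta\bar{z}\right)^2 \Rightarrow$
$\delta D = \frac{1}{6187.73} \cdot \left[(6164.1 \cdot 811.54)^2 + (-321.3 \cdot 839.28)^2 + (434.3 \cdot 936.52)^2\right]^{1/2} \Rightarrow$
$\delta D = 812.28$ pc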
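The final numbers can be checked directly. The sketch below takes the mean coordinates and their uncertainties as they appear in the expression above (all in parsecs), computes the distance to the centre of the distribution as the norm of the mean position, and propagates the errors in the same way.

```python
import math

# Mean coordinates and uncertainties of the means, taken from the expression above (pc)
mean = {"x": 6164.1, "y": -321.3, "z": 434.3}
err_mean = {"x": 811.54, "y": 839.28, "z": 936.52}

# Distance from the Sun to the centre of the distribution
D = math.sqrt(sum(value ** 2 for value in mean.values()))

# Error propagation: (delta D)^2 = sum over axes of (mean / D * err_mean)^2
delta_D = math.sqrt(sum((mean[axis] * err_mean[axis]) ** 2 for axis in mean)) / D

print(f"D = {D:.2f} pc, delta_D = {delta_D:.2f} pc")
```

This reproduces D = 6187.73 pc and δD = 812.28 pc. The x term dominates the uncertainty because the mean position lies almost entirely along the x-axis.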
(c) (30 points) To test the validity of Shapley’s hypothesis that globular clusters are symmet-
rically distributed around the Galactic Centre, make histograms with five bins (i.e. sort the
data and divide them into five equally-sized intervals) for each of the distributions in the x,
y, and z directions. Mark the values of the quartiles ($Q_1$, $Q_2$, $Q_3$) of the three distributions
with solid lines on the histograms.
Hint: The three quartiles divide the sorted sample into four sections, each containing 25%
of the data, with the second and third sections representing the interquartile range.
Solution:
The table below shows the calculated values of the quartiles for each distribution.
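A minimal sketch of the quartile and histogram computation follows. It assumes `numpy.quantile` with its default linear interpolation, which may differ slightly from a hand calculation on a small sample, and `numpy.histogram` with five equal-width bins. The helper name and the sample values are illustrative only and are not the cluster coordinates.

```python
import numpy as np

def quartiles_and_histogram(values, n_bins=5):
    """Return (Q1, Q2, Q3), bin counts and bin edges for a 1-D sample."""
    values = np.sort(np.asarray(values, dtype=float))
    q1, q2, q3 = np.quantile(values, [0.25, 0.50, 0.75])
    counts, edges = np.histogram(values, bins=n_bins)  # equal-width bins from min to max
    return (q1, q2, q3), counts, edges

# Illustrative sample only (hypothetical values, not data from this problem)
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
(q1, q2, q3), counts, edges = quartiles_and_histogram(sample)
print("quartiles:", q1, q2, q3)
print("counts per bin:", counts.tolist())
print("bin edges:", edges.tolist())
```

Running the same function on the x, y and z columns gives the bin counts for the three histograms and the positions of the quartile lines.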
(d) (5 points) Using the quartiles, calculate the symmetry factor value for the three distribu-
tions as given by:
Classify the three distributions in the x, y, and z directions based on their calculated
symmetry factor values, according to the table shown below. Hence, on the answer sheet,
write True (T) if the analysed sample follows Shapley’s hypothesis or False (F) otherwise.
Solution:
F (False). The analysed sample does not follow Shapley’s hypothesis, because all
distributions are asymmetrical. See table below for the values of calculated symmetry
factors for each distribution.