VII/293 Classifier catalogue of class II YSO posteriors (Wilson+, 2023)
A naive Bayes classifier for identifying Class II YSOs.
Wilson A.J., Lakeland B.S., Wilson T.J., Naylor T.
<Mon. Not. R. Astron. Soc., 521, 354-388>
=2023MNRAS.521..354W
=2023yCat.7293....0W
ADC_Keywords: YSOs ; Galactic plane ; Positional data ;
Parallaxes, trigonometric; Photometry ; Optical ; Stars, variable
Keywords: methods: statistical - catalogues - stars: formation -
stars: pre-main-sequence - stars: variables: T Tauri, Herbig Ae/Be
Abstract:
A naive Bayes classifier for identifying Class II YSOs has been
constructed and applied to a region of the Northern Galactic Plane
containing 8 million sources with good quality Gaia EDR3 parallaxes.
The classifier uses five features: Gaia G-band variability, WISE
mid-infrared excess, UKIDSS and 2MASS near-infrared excess, IGAPS
Hα excess, and overluminosity with respect to the main sequence.
A list of candidate Class II YSOs is obtained by choosing a posterior
threshold appropriate to the task at hand, balancing the competing
demands of completeness and purity. At a threshold posterior greater
than 0.5 our classifier identifies 6504 candidate Class II YSOs. At
this threshold we find a false positive rate around 0.02 per cent and
a true positive rate of approximately 87 per cent for identifying
Class II YSOs. The ROC curve rises rapidly to almost one with an area
under the curve around 0.998 or better, indicating the classifier is
efficient at identifying candidate Class II YSOs. Our map of these
candidates shows what are potentially three previously undiscovered
clusters or associations. When comparing our results to published
catalogues from other young star classifiers, we find between one
quarter and three quarters of high probability candidates are unique
to each classifier, telling us no single classifier is finding all
young stars.
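
The trade-off between completeness and purity at a chosen posterior
threshold can be illustrated with a short calculation. This is only a
minimal sketch, not the authors' code: the function name
rates_at_threshold and the toy labelled sample are assumptions made for
illustration, and the values are not taken from the catalogue.

```python
def rates_at_threshold(posteriors, is_class_ii, threshold=0.5):
    """Return (true positive rate, false positive rate) at a posterior cut.

    posteriors  : iterable of Class II posteriors in [0, 1]
    is_class_ii : iterable of booleans, the assumed true labels
    """
    tp = fp = fn = tn = 0
    for p, truth in zip(posteriors, is_class_ii):
        selected = p > threshold  # candidate if posterior exceeds the cut
        if selected and truth:
            tp += 1
        elif selected:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Toy labelled sample (hypothetical values, not from the catalogue).
toy_posteriors = [0.99, 0.80, 0.40, 0.60, 0.01, 0.002]
toy_labels = [True, True, True, False, False, False]
tpr, fpr = rates_at_threshold(toy_posteriors, toy_labels, threshold=0.5)
```

Raising the threshold generally lowers both rates, trading completeness
for purity; lowering it does the opposite.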
Description:
We created a naive Bayes classifier for identifying candidate Class II
YSOs. We derived five features from known observational properties of
Class II YSOs: Gaia EDR3 G-band variability; WISE mid-infrared excess
from (W1-W2); near-infrared excess from (H-K) using UKIDSS and 2MASS
photometry; and Hα excess and isochronal age, both derived from IGAPS
photometry.
The full classifier catalogue contains 8080045 sources. This includes
29166 sources with no valid feature likelihoods, for which the
posterior simply equals the prior.
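
How a naive Bayes posterior follows from the priors and the per-feature
likelihoods can be sketched as below. This is a hedged illustration of
the standard naive Bayes formula, not the authors' implementation: the
function name naive_bayes_posterior and the example likelihood values
are hypothetical, and any capping or rounding applied to the catalogue
likelihoods is not reproduced here.

```python
from math import prod

PRIOR_CII = 0.001    # value listed for the PriorCII column
PRIOR_OTHER = 0.999  # value listed for the PriorO column


def naive_bayes_posterior(likelihood_pairs,
                          prior_cii=PRIOR_CII, prior_other=PRIOR_OTHER):
    """Class II posterior from per-feature (CII, Other) likelihood pairs.

    A feature with no valid likelihood is passed as None and skipped,
    so a source with no valid features returns the prior.
    """
    valid = [pair for pair in likelihood_pairs if pair is not None]
    joint_cii = prior_cii * prod(l_cii for l_cii, _ in valid)
    joint_other = prior_other * prod(l_other for _, l_other in valid)
    return joint_cii / (joint_cii + joint_other)


# Hypothetical source with two valid features and three missing ones.
posterior = naive_bayes_posterior([(0.9, 0.01), (0.8, 0.02), None, None, None])

# A source with no valid features falls back to the prior.
prior_only = naive_bayes_posterior([None] * 5)
```

With a prior of 0.001, several features must each favour Class II
strongly before the posterior exceeds 0.5.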
File Summary:
--------------------------------------------------------------------------------
 FileName      Lrecl  Records   Explanations
--------------------------------------------------------------------------------
ReadMe            80        .   This file
catalog.dat      419  8080045   Results from the naive Bayes classifier
catalog.fit     2880   729458   FITS version of the results
--------------------------------------------------------------------------------
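
The record lengths and counts in the file summary give a quick way to
check that a download is complete. The sketch below is only an
illustration: it assumes each catalog.dat record ends with a single
newline byte, which the summary does not state, and the function names
are hypothetical.

```python
DAT_LRECL = 419         # Lrecl of catalog.dat, in bytes
DAT_RECORDS = 8080045   # Records in catalog.dat
FITS_LRECL = 2880       # Lrecl of catalog.fit (one FITS block)
FITS_RECORDS = 729458   # Records in catalog.fit


def expected_dat_size(newline_bytes=1):
    """Expected size of catalog.dat in bytes.

    Assumption: every record is followed by `newline_bytes` line-ending
    bytes (1 for LF, 2 for CRLF).
    """
    return DAT_RECORDS * (DAT_LRECL + newline_bytes)


def expected_fits_size():
    """Expected size of catalog.fit in bytes (blocks times block length)."""
    return FITS_RECORDS * FITS_LRECL
```

These values can be compared with os.path.getsize() on the downloaded
files; a mismatch for catalog.dat may only mean that the line-ending
assumption does not hold.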
See also:
I/350 : Gaia EDR3 (Gaia Collaboration, 2020)
II/328 : AllWISE Data Release (Cutri+ 2013)
II/246 : 2MASS All-Sky Catalog of Point Sources (Cutri+ 2003)
V/165 : IGAPS merged IPHAS and UVEX northern Galactic plane (Monguio+, 2020)
http://wsa.roe.ac.uk : UKIDSSDR11plus reliableGpsPointSource (Lucas+ 2008)
https://stilism.obspm.fr : Stilism homepage (Lallement+ 2014, Capitanio+ 2017)
Byte-by-byte Description of file: catalog.dat
--------------------------------------------------------------------------------
   Bytes  Format Units Label          Explanations
--------------------------------------------------------------------------------
   1- 19  I19    ---   GaiaEDR3       Gaia EDR3 unique source designation
                                      (GaiaEDR3SourceId)
  21- 35  F15.11 deg   RAdeg          [0/360] Right ascension (ICRS) at
                                      Ep=2016.0 from Gaia EDR3 (RAdeg)
  37- 51  F15.11 deg   DEdeg          [-13.32/66.88] Declination (ICRS) at
                                      Ep=2016.0 from Gaia EDR3 (DEdeg)
  53- 66  F14.10 deg   GLON           [20/220] Galactic longitude from
                                      Gaia EDR3 (GLON)
  68- 81  F14.10 deg   GLAT           [-4/4] Galactic latitude from Gaia EDR3
                                      (GLAT)
  83- 90  F8.4   mas   plx            [0.5/169.03] Gaia EDR3 parallax
                                      zero-point adjusted (plx)
  92- 96  F5.3   mag   E(B-V)         [0/4.73] E(B-V) from STILISM (EBmV)
      98  I1     ---   SuspRedd       [0/1] Suspect reddening flag
                                      (SuspectReddening)
 100-105  F6.3   mag   GmagCorr       [6.81/20.45] Gaia EDR3 G magnitude with
                                      official correction (GmagCorr)
 107-112  F6.3   mag   rmagCorr       [-6.94/15.68]? Parallax corrected
                                      dereddened IGAPS r magnitude using
                                      Ar=2.38E(B-V) (rmagParCorrDered)
 114-119  F6.3   mag   (r-i)0         [-2.53/4.81]? IGAPS (r-i) dereddened
                                      (rmiDered)
     121  I1     ---   NFeat          [0/5] Number of features in the Bayes
                                      calculation (FeaturesCalc)
 123-127  F5.3   ---   PriorCII       [0.001] CII prior (PriorCII)
 129-133  F5.3   ---   PriorO         [0.999] Other prior (PriorOther)
 135-151  F17.15 ---   PostrCII       [0/1] Naive Bayes CII posterior
                                      (PosteriorCII)
 153-169  F17.15 ---   PostO          [0/1] Naive Bayes Other posterior
                                      (PosteriorOther)
 171-180  F10.8  ---   GFSDFlux       [0/1.7] Gaia EDR3 G observed fractional
                                      standard deviation of flux
                                      (GFracStdDevFlux)
 182-192  F11.8  ---   GvarCIIL       [0/37.41] Variability CII likelihood
                                      (GvarCIILike)
 194-206  F13.8  ---   GvarOL         [0/1749.37] Variability Other likelihood
                                      (GvarOtherLike)
 208-217  F10.6  ---   GvarLR         [0.01/999] Variability CII/Other
                                      likelihood ratio (GvarLikeRatio)
     219  I1     ---   GvarCalc       [0/1] Variability likelihood calculated
                                      (GvarCalc)
     221  I1     ---   GvarCIICap     [0/1] Variability CII likelihood capped
                                      (GvarCIICapped)
     223  I1     ---   GvarOCap       [0/1] Variability Other likelihood capped
                                      (GvarOtherCapped)
 225-230  F6.3   mag   W1-W2          [-1.93/1.95]? AllWISE (W1-W2) (W1mW2)
 232-241  F10.8  ---   W1-W2CIIL      [0.09/1] (W1-W2) CII likelihood
                                      (W1mW2CIILike)
 243-252  F10.8  ---   W1-W2OL        [0/1] (W1-W2) Other likelihood
                                      (W1mW2OtherLike)
 254-263  F10.6  ---   W1-W2LR        [0.09/999] (W1-W2) CII/Other likelihood
                                      ratio (W1mW2LikeRatio)
     265  I1     ---   W1-W2Calc      [0/1] (W1-W2) likelihood calculated
                                      (W1mW2Calc)
     267  I1     ---   W1-W2CIICap    [0/1] (W1-W2) CII likelihood capped
                                      (W1mW2CIICapped)
     269  I1     ---   W1-W2OCap      [0/1] (W1-W2) Other likelihood capped
                                      (W1mW2OtherCapped)
 271-277  A7     ---   H-KExSource    (H-K) feature photometry source,
                                      2MASS, UKIDSS or "No data" (HmKExSource)
 279-284  F6.3   mag   H-KEx          [-1.94/2.61]? (H-K) offset (HmKEx)
 286-295  F10.8  ---   H-KExCIIL      [0.04/2.25] (H-K) CII likelihood
                                      (HmKExCIILike)
 297-306  F10.8  ---   H-KExOL        [0/1] (H-K) Other likelihood
                                      (HmKExOtherLike)
 308-317  F10.6  ---   H-KExLR        [0.04/999.0] (H-K) CII/Other likelihood
                                      ratio (HmKExLikeRatio)
     319  I1     ---   H-KExCalc      [0/1] (H-K) likelihood calculated
                                      (HmKExCalc)
     321  I1     ---   H-KExCIICap    [0/1] (H-K) CII likelihood capped
                                      (HmKExCIICapped)
     323  I1     ---   H-KExOCap      [0/1] (H-K) Other likelihood capped
                                      (HmKExOtherCapped)
 325-330  F6.3   mag   r-HaEx         [-1.47/1.7]? IGAPS (r-Ha) offset (rmHaEx)
 332-341  F10.8  ---   r-HaExCIIL     [0.07/2.01] H-alpha CII likelihood
                                      (rmHaExCIILike)
 343-352  F10.8  ---   r-HaExOL       [0/5.23] H-alpha Other likelihood
                                      (rmHaExOtherLike)
 354-363  F10.6  ---   r-HaExLR       [0.33/999.0] H-alpha CII/Other
                                      likelihood ratio (rmHaExLikeRatio)
     365  I1     ---   r-HaExCalc     [0/1] H-alpha likelihood calculated
                                      (rmHaExCalc)
     367  I1     ---   r-HaExCIICap   [0/1] H-alpha CII likelihood capped
                                      (rmHaExCIICapped)
     369  I1     ---   r-HaExOCap     [0/1] H-alpha Other likelihood capped
                                      (rmHaExOtherCapped)
 371-374  F4.1   [yr]  log10Age       [-6.9/15.0]? Isochronal age log10(age)
                                      (log10Age)
     376  I1     ---   OldMaxIsoAge   [0/1] Iso age older than isochrones flag
                                      (OlderMaxIsoAge)
     378  I1     ---   YoungMinIsoAge [0/1] Iso age younger than isochrones
                                      flag (YoungerMinIsoAge)
     380  I1     ---   GoodClIIYSOFit [0/1] Iso age good fit to YSO isochrones
                                      flag (GoodClassIIYSOFit)
 382-391  F10.8  ---   IsoAgeCIIL     [0.04/1] Iso age CII likelihood
                                      (IsoAgeCIILike)
 393-402  F10.8  ---   IsoAgeOL       [0/1] Iso age Other likelihood
                                      (IsoAgeOtherLike)
 404-413  F10.6  ---   IsoAgeLR       [0.18/31.84] Iso age CII/Other
                                      likelihood ratio (IsoAgeLikeRatio)
     415  I1     ---   IsoAgeCalc     [0/1] Iso age calculated (IsoAgeCalc)
     417  I1     ---   IsoAgeCIICap   [0/1] Iso age CII likelihood capped
                                      (IsoAgeCIICapped)
     419  I1     ---   IsoAgeOCap     [0/1] Iso age Other likelihood capped
                                      (IsoAgeOtherCapped)
--------------------------------------------------------------------------------
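
The byte ranges above are enough to read catalog.dat with plain string
slicing. The sketch below covers only a handful of columns and is not an
official reader: the names COLUMNS, parse_record and make_line are
hypothetical, the example record is synthetic, and treating a blank
field as a missing value is an assumption about how the nullable columns
(marked "?") are written.

```python
# (label, first byte, last byte, converter); byte positions are 1-indexed
# and inclusive, copied from the byte-by-byte description above.
COLUMNS = [
    ("GaiaEDR3", 1, 19, int),
    ("RAdeg", 21, 35, float),
    ("DEdeg", 37, 51, float),
    ("plx", 83, 90, float),
    ("NFeat", 121, 121, int),
    ("PostrCII", 135, 151, float),
    ("W1-W2", 225, 230, float),  # nullable column
]


def parse_record(line):
    """Parse one fixed-width record of catalog.dat into a dict."""
    record = {}
    for label, first, last, convert in COLUMNS:
        field = line[first - 1:last].strip()
        # Assumption: a missing value in a nullable column is left blank.
        record[label] = convert(field) if field else None
    return record


def make_line(values, width=419):
    """Build a synthetic record (illustration only) from label -> text."""
    chars = [" "] * width
    for label, first, last, _ in COLUMNS:
        text = values.get(label, "")
        chars[first - 1:last] = text.rjust(last - first + 1)
    return "".join(chars)


# Synthetic record with made-up values; W1-W2 is deliberately left blank.
example_line = make_line({
    "GaiaEDR3": "1234567890123456789",
    "RAdeg": "123.45678901234",
    "DEdeg": "45.12345678901",
    "plx": "2.5000",
    "NFeat": "4",
    "PostrCII": "0.987654321000000",
})
example = parse_record(example_line)
```

For the real file, parse_record can be applied to each line read from
catalog.dat; candidates above a chosen threshold are then the records
whose PostrCII exceeds that value.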
Acknowledgements:
Andrew J. Wilson, aw648(at)exeter.ac.uk, andyjwilson_uk(at)hotmail.com
(End) Patricia Vannier [CDS] 20-Jan-2023