The hydrostatic-to-lensing mass bias from resolved X-ray and optical-IR data

An accurate reconstruction of galaxy cluster masses is key to using this population of objects as a cosmological probe. In this work we present a study of the hydrostatic-to-lensing mass scaling relation for a sample of 53 clusters whose masses were reconstructed homogeneously, in a redshift range between z = 0.05 and 1.07. The $M_{500}$ mass of each cluster was inferred from the mass profiles extracted from the X-ray and lensing data, without using a priori observable-mass scaling relations. We assessed the systematic dispersion of the masses estimated with our reference analyses with respect to other published mass estimates. Accounting for this systematic scatter does not change our main results, but enables the propagation of the uncertainties related to the mass reconstruction method or the dataset used. Our analysis gives a hydrostatic-to-lensing mass bias of $(1 - b) = 0.739^{+0.075}_{-0.070}$ and no evidence of evolution with redshift. These results are robust against possible subsample differences.


Introduction
The distribution of galaxy clusters in mass and redshift is sensitive to the expansion history and matter content of the Universe, as well as to the initial conditions in the primordial Universe (Huterer et al. 2015). Thus, cluster masses are key ingredients for the use of these objects in cosmology (Vikhlinin et al. 2009; Allen et al. 2011; Costanzi et al. 2019). Recent results have shown that the cosmological analyses based on cluster number counts seem to favour lower values of the matter density ($\Omega_{\rm m}$) and of the matter power spectrum normalisation ($\sigma_8$) than the studies based on the cosmic microwave background (CMB) power spectrum (Planck Collaboration XX. 2014; Salvati et al. 2018; Costanzi et al. 2019). Given that cluster masses are not directly observable quantities and have to be estimated under several hypotheses from observations, the uncertainties and systematic errors of those estimates could be the source of the tension with the CMB (Pratt et al. 2019; Salvati et al. 2020).
Some cluster number count studies (Garrel et al. 2022; Planck Collaboration XXIV. 2015) have relied on cluster masses obtained from scaling relations (SRs) between the Sunyaev-Zel'dovich (SZ) effect (Sunyaev & Zeldovich 1972) or the X-ray emission of the cluster and the hydrostatic mass reconstructed from X-ray data (Arnaud et al. 2010). It has been widely investigated and shown that masses reconstructed under the hydrostatic equilibrium (HSE) hypothesis are biased low (Lau et al. 2013; Planck Collaboration XX. 2014; Biffi et al. 2016; Gianfagna et al. 2021). For the cluster number count analyses based on HSE masses, the so-called HSE mass bias could be one of the possibilities to solve the mentioned $\Omega_{\rm m} - \sigma_8$ cosmological tension (Planck Collaboration XX. 2014; Salvati et al. 2018). We define the HSE mass bias, b, through the ratio of the HSE mass estimates to the true cluster masses, $(1 - b) = M_{\rm HSE}/M_{\rm True}$. A large value of the bias, that is, a smaller (1 − b), implies larger values for $\Omega_{\rm m}$ and $\sigma_8$ in cluster number count analyses (Planck Collaboration XXIV. 2015).
In the literature, different approaches have been developed to estimate this bias. On the one hand, combined CMB power spectrum and cluster number count analyses fit the bias value that is required to get consistent results between the two probes. According to Planck Collaboration XX. (2014), (1 − b) = 0.59 ± 0.05 would be needed to reconcile the results from the Planck CMB analysis in Planck Collaboration XV. (2014) with the cluster count cosmology from Planck Collaboration XX. (2014). The posterior analysis of Planck data in Planck Collaboration XXIV. (2015) obtained (1 − b) values in the range between 0.54 and 0.705 considering different priors for the bias (based both on X-ray and lensing data). The updated analysis in Planck Collaboration VI. (2020) provides (1 − b) = 0.62 ± 0.03, compatible with the (1 − b) = 0.62 ± 0.07 from Salvati et al. (2018). Accounting for the power spectrum of the thermal SZ (tSZ) effect together with the cluster number counts, Salvati et al. (2018) conclude that the bias needed to reconcile the results with the CMB should be (1 − b) = 0.63 ± 0.04. Considering also the trispectrum in the covariance matrix of the tSZ power spectrum analysis, Bolliet et al. (2018) estimate (1 − b) = 0.58 ± 0.06 (68% C.L.) to be compatible with CMB data.
On the other hand, studies based on simulations have compared the HSE masses of clusters to their true masses. A large variety of simulations have been used in different works (Planck Collaboration XX. 2014). Some of them computed the HSE masses by combining, under the HSE hypothesis, the true thermodynamical quantities (density, temperature, and/or pressure) from the intracluster medium in the simulations (Lau et al. 2013; Biffi et al. 2016; Gianfagna et al. 2021, 2023). Others used simulations to mimic mock observations and reconstruct the HSE masses (Rasia et al. 2012). However, they all tend to obtain a bias value of (1 − b) > 0.7 (see Figure 1 in Gianfagna et al. 2021, for a summary).
In an attempt to have a reliable estimate of the bias of observational HSE masses, several works have compared the HSE masses to lensing mass estimates, that is, to masses reconstructed from the lensing effect of the cluster on background sources. Under the assumption that lensing masses are unbiased estimates of the true mass of clusters, such HSE-to-lensing mass biases are good estimators of the HSE bias. Most of the studies in the literature are based on lensing masses obtained from the weak lensing signal on background galaxies. Figure 10 in Salvati et al. (2018) shows a compilation of HSE-to-lensing mass biases from different works. Despite the heterogeneity of the data and methods used in the various studies, the presented results prefer values of $M_{\rm HSE}/M_{\rm lens}$ above 0.7. Lensing mass reconstructions from a combination of weak and strong lensing data have also been used to measure the HSE-to-lensing mass bias on small samples (Penna-Lima et al. 2017; Ferragamo et al. 2022; Muñoz-Echeverría et al. 2023), obtaining $M_{\rm HSE}/M_{\rm lens}$ values that span from ∼ 0.6 to ∼ 1. The lensing of the CMB anisotropies due to the presence of clusters can also be used to estimate their mass (Melin & Bartlett 2015). A comparison of HSE and CMB lensing masses based on Planck data gave 1/(1 − b) = 0.99 ± 0.19, approximately $(1 - b) = 1.01^{+0.24}_{-0.16}$ (Planck Collaboration XXIV. 2015). The posterior analysis in Zubeldia & Challinor (2019) jointly fitted the cosmological parameters and the HSE mass bias in the scaling relation between the SZ signal from Planck and cluster masses, using CMB lensing. They determined that the bias is (1 − b) = 0.71 ± 0.10. According to the South Pole Telescope (SPT) data analysis in Baxter et al. (2015), the masses inferred from CMB lensing are consistent with those estimated from the SZ.
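The asymmetric interval quoted above for the Planck CMB lensing comparison is consistent with simply inverting the central value of 1/(1 − b) and its 1σ bounds. The short sketch below reproduces that arithmetic; whether the original analysis derived the interval exactly in this way is an assumption, and the function name is purely illustrative.

```python
def invert_with_errors(value, sigma):
    """Invert `value ± sigma` by inverting the central value and its 1-sigma bounds.

    Returns (central, upper_error, lower_error) of the inverted quantity.
    Assumes value - sigma > 0 so that the bounds keep their sign.
    """
    central = 1.0 / value
    upper = 1.0 / (value - sigma)  # smaller denominator gives the upper bound
    lower = 1.0 / (value + sigma)  # larger denominator gives the lower bound
    return central, upper - central, central - lower


# 1/(1 - b) = 0.99 ± 0.19, as quoted in the text
central, err_up, err_down = invert_with_errors(0.99, 0.19)
print(f"(1 - b) = {central:.2f} +{err_up:.2f} -{err_down:.2f}")
# prints "(1 - b) = 1.01 +0.24 -0.16"
```

The asymmetry arises because the inverse is a non-linear function: a symmetric interval in 1/(1 − b) maps onto a longer upper tail in (1 − b).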
Other than lensing, some works in the literature have used the dynamical mass estimate of clusters, based on the velocity dispersion of member galaxies, to compute the bias corresponding to HSE masses (see Ferragamo et al. 2021, and references therein). According to the analysis with Sloan Digital Sky Survey (SDSS) archival data in Ferragamo et al. (2021), for the 207 galaxy clusters studied, the HSE-to-dynamical bias of Planck masses is (1 − b) = 0.83 ± 0.07 (stat.) ± 0.02 (sys.). Also from optical observations, Aguado-Barahona et al. (2022) measured the HSE-to-dynamical mass bias for a different sample of 297 Planck clusters and obtained (1 − b) = 0.80 ± 0.04 (stat.) ± 0.05 (sys.).
Wicker et al. (2023) investigated the evolution of the (total) HSE bias with mass and redshift by studying the gas mass fraction in galaxy clusters with XMM-Newton mass reconstructions from Lovisari et al. (2020). The main result in Wicker et al. (2023) is that the value of the HSE bias and its dependence on mass and redshift vary significantly with the analysed cluster sample, in agreement with the conclusions in Salvati et al. (2019). However, according to Wicker et al. (2023), a value of (1 − b) ∼ 0.8 is preferred. Similarly, from the comparison of the gas fraction measured on 12 nearby clusters to the universal gas fraction value, Eckert et al. (2019) concluded that the mass bias for SZ-derived estimates is (1 − b) = 0.85 ± 0.05 and is therefore inconsistent with the bias needed for reconciliation with the CMB power spectrum. A different approach was taken in Hurier & Lacasa (2017), where the authors used the Planck galaxy cluster number counts, tSZ power spectrum, and bispectrum to constrain (1 − b) = 0.71 ± 0.07. This was obtained by fitting the normalisation of the SZ-mass SR, interpreting that the bias must appear in the calibration of the scaling relation. They assumed a generalised Navarro-Frenk-White (gNFW, Nagai et al. 2007; Zhao 1996) pressure profile for the gas in clusters, using the best-fitting parameter values from Arnaud et al. (2010), with the normalisation parameter computed to agree with the scaling relation in Planck Collaboration XX. (2014). The choice of this particular pressure profile could affect the resulting bias value.
There are, therefore, different issues to be considered. Firstly, as stated in Planck Collaboration XXIV. (2015), the main limitation of cosmological analyses with cluster number counts from SZ data is the large uncertainty on the HSE mass bias. But in spite of this large uncertainty, the compilation of many studies shows that the bias values estimated with and without considering the need to reconcile CMB results have different tendencies. Such inconsistency is in line with a more general tension between results from early- and late-Universe probes (see Abdalla et al. 2022, for a review). Hence it is essential to have a deeper understanding of the HSE mass bias and its possible evolution with mass and/or redshift.
In this work, we aim to estimate the HSE-to-lensing mass bias combining single-cluster HSE and lensing $M_{500}$ mass estimates that have been obtained by evaluating mass profiles at their corresponding radius ($R_{500}$). Given the large number of methods and models that can be employed to reconstruct HSE and lensing masses and the potentially different biases that they could be subjected to, we focus on a sample of clusters for which X-ray based HSE and lensing masses have been homogeneously reconstructed. The HSE masses of the homogeneous sample have been reconstructed mainly with XMM-Newton data and following the method described in Sect. 2.1.2. The lensing masses belong to the CoMaLit compilation of masses from the literature (Sect. 2.1.1, Sereno 2015).
As indicated by Sereno & Ettori (2015), and shown also for the well-observed CL J1226.9+3332 galaxy cluster in Figure 1 in Muñoz-Echeverría et al. (2023), cluster mass estimates can vary up to ∼ 40% from one work to another. Being aware of these important differences that exist between the masses reconstructed by different studies, we gather together results from several works that have also produced estimates based on mass profiles. We use those estimates to measure the systematic dispersion with respect to our XMM-Newton and CoMaLit masses. In this work, we analyse a sample of clusters that spans a large redshift range, select homogeneous mass estimates, and propagate the systematic dispersion, which is one step beyond previous studies (Lovisari et al. 2020; Sereno & Ettori 2015; Sereno et al. 2019).
This paper is structured as follows. In Sect. 2 we present the data, describing the homogeneous and comparison cluster samples. The method used to match clusters from different catalogues and the measurement of the systematic dispersion of the reference masses with respect to other estimates are described in Sect. 3. The reference sample is built in the same section. In Sect. 4 we present the HSE-to-lensing mass bias and its evolution with redshift. The scaling relation between HSE and lensing masses is obtained in Sect. 5, with all the related systematic tests in the same section. Finally, results are compared to similar works in the literature in Sect. 6 and conclusions are presented in Sect. 7. Throughout the paper 'log' corresponds to the logarithm to base 10 and 'ln' is the natural logarithm. When needed, we assume a flat ΛCDM cosmology with $H_0$ = 70 km/s/Mpc and $\Omega_{\rm m}$ = 0.3.

Homogeneous sample
This study is built aiming for a cluster sample with resolved HSE and lensing masses that are comparable amongst all the objects (homogeneous reconstruction procedure) and that covers the largest possible redshift range. We present in this section the mass reconstruction and regularisation procedures of the XMM-Newton and CoMaLit clusters, which constitute our homogeneous sample.

CoMaLit sample
The CoMaLit sample contains the clusters with lensing masses that we used to build the homogeneous sample. They correspond to the clusters from the Literature Catalogs of weak Lensing Clusters (LC$^2$) compilation presented in Sereno (2015). The LC$^2$ contains 806 clusters (in version 3.9 of the LC$^2$-single catalogue¹) with weak lensing masses obtained from different works in the literature, including the widely used Canadian Cluster Comparison Project (CCCP, Hoekstra et al. 2012; Hoekstra et al. 2015) and Weighing the Giants (WtG, Applegate et al. 2014) clusters.
Although the masses were not derived homogeneously amongst the original works, an effort was made in Sereno (2015) to select the most comparable mass estimates. Only masses reconstructed assuming spherical symmetry were considered, clusters without optical, X-ray or SZ counterpart were excluded and, when the same authors or collaborations had published several estimates for the same cluster along a refinement process, only the latest result was considered. In addition, all the masses were standardised to the same cosmology (a flat ΛCDM cosmology with $\Omega_{\rm m}$ = 0.3 and $H_0$ = 70 km/s/Mpc) and were given at the overdensities of 2500, 500, 200 as well as at the virial radius. We define $R_\Delta$ as the radius at which the mean mass density of the cluster is $\Delta$ times the critical density of the Universe at its redshift, $\rho_{\rm crit} = 3H(z)^2/(8\pi G)$, with $H(z)$ the Hubble function. We consider only the masses at an overdensity of $\Delta = 500$. For some cases, the masses given in the original papers had to be extrapolated following the density profile adopted in the original paper or with a Navarro-Frenk-White (NFW, Navarro et al. 1996) model.
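The definition of $R_\Delta$ above implies $M_\Delta = (4\pi/3)\,\Delta\,\rho_{\rm crit}(z)\,R_\Delta^3$. A minimal sketch of this relation is given below, assuming the flat ΛCDM cosmology quoted in the text and neglecting radiation in $H(z)$. The function names and the rounded physical constants are illustrative choices, not taken from any of the cited pipelines.

```python
import math

G_SI = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2 (rounded)
MPC_M = 3.0857e22   # one megaparsec in metres (rounded)
MSUN_KG = 1.989e30  # one solar mass in kilograms (rounded)


def rho_crit(z, h0=70.0, omega_m=0.3):
    """Critical density at redshift z in Msun / Mpc^3.

    Flat LCDM with matter and a cosmological constant only
    (radiation is neglected, which is an assumption of this sketch).
    """
    hz_km_s_mpc = h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))
    hz_si = hz_km_s_mpc * 1.0e3 / MPC_M                 # H(z) in 1/s
    rho_si = 3.0 * hz_si ** 2 / (8.0 * math.pi * G_SI)  # kg / m^3
    return rho_si * MPC_M ** 3 / MSUN_KG


def mass_from_radius(r_delta_mpc, z, delta=500.0):
    """M_Delta in Msun enclosed within R_Delta (in Mpc) at redshift z."""
    return 4.0 / 3.0 * math.pi * delta * rho_crit(z) * r_delta_mpc ** 3


# Example: a cluster with R_500 = 1 Mpc at z = 0.2
print(f"M_500 = {mass_from_radius(1.0, 0.2):.3e} Msun")
```

The same relation, inverted, gives $R_{500}$ from a quoted $M_{500}$, which is the kind of conversion needed when a work reports only one of the two quantities.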
¹ http://pico.oabo.inaf.it/~sereno/CoMaLit/

XMM-Newton sample with the reference X-ray pipeline
Regarding the HSE masses, we built a sample of clusters with masses reconstructed from XMM-Newton data and following the same procedure (hereafter XMM-Newton or reference X-ray pipeline). Thus, a homogeneous method was applied consistently to the full sample. This pipeline has already been used in previous works (Pratt et al. 2010; Bartalucci et al. 2017; Ruppin et al. 2018; Kéruzoré et al. 2020; Muñoz-Echeverría et al. 2023), proving the reliability of the method.
As described in Bartalucci et al. (2017), the X-ray raw data were processed using the standard procedures with the XMM-Newton Science Analysis System (SAS) pipeline. The electron density and temperature profiles were reconstructed following the correction and deprojection methods detailed in Pratt et al. (2010) and Bartalucci et al. (2018). To obtain the HSE mass profiles, the electron density and temperature were combined in the Monte Carlo procedure described in Démoclès et al. (2010). The binned HSE mass profiles were interpolated to define the $M_{500}$ masses used in this work. Based on the same XMM-Newton data, two differently estimated $M_{500}$ are available per cluster: masses derived from an X-ray calibrated scaling relation (Arnaud et al. 2010) and masses estimated from a forward NFW profile fit to the density and temperature profiles. We do not use these two masses in our main analysis, but they are employed to investigate the consistency of all three estimates in Appendix D. Amongst the clusters with XMM-Newton data, we distinguish three different subsamples according to redshift.
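As an illustration of how an $M_{500}$ value can be read off a binned cumulative mass profile, the sketch below locates the radius at which the mean enclosed density equals 500 times the critical density. The interpolation scheme (linear in log-log space between the two bracketing bins) and the function name are assumptions of this example; the text does not specify the scheme used in the reference pipeline.

```python
import math


def m_delta_from_profile(radii, masses, rho_c, delta=500.0):
    """Return (R_delta, M_delta) from a binned cumulative mass profile.

    radii  : increasing radii in Mpc
    masses : enclosed masses M(<r) in Msun at those radii
    rho_c  : critical density at the cluster redshift in Msun / Mpc^3
    Assumes the mean enclosed overdensity decreases with radius.
    """
    sphere = 4.0 / 3.0 * math.pi
    # Mean enclosed density in units of the critical density at each bin.
    over = [m / (sphere * r ** 3 * rho_c) for r, m in zip(radii, masses)]
    for k in range(len(radii) - 1):
        if over[k] >= delta >= over[k + 1]:
            span = math.log(over[k]) - math.log(over[k + 1])
            # Fractional position of delta between the two bins (log-log).
            t = 0.0 if span == 0.0 else (math.log(over[k]) - math.log(delta)) / span
            log_r = math.log(radii[k]) + t * (math.log(radii[k + 1]) - math.log(radii[k]))
            r_delta = math.exp(log_r)
            return r_delta, sphere * delta * rho_c * r_delta ** 3
    raise ValueError("profile does not bracket the requested overdensity")


# Toy profile with M(<r) proportional to r, built so that R_500 = 1 Mpc.
rho_c = 1.36e11
m500_true = 4.0 / 3.0 * math.pi * 500.0 * rho_c
radii = [0.2 * 1.4 ** k for k in range(8)]
masses = [m500_true * r for r in radii]
r500, m500 = m_delta_from_profile(radii, masses, rho_c)
print(f"R_500 = {r500:.3f} Mpc, M_500 = {m500:.3e} Msun")
```

If the profile does not reach the requested overdensity, the function raises an error instead of extrapolating, which mirrors the distinction drawn in the text between interpolated and extrapolated masses.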

Intermediate-z clusters: LPSZ
The LPSZ stands for the NIKA2 SZ Large Programme (Mayet et al. 2020; Perotto et al. 2022). It is a high angular resolution follow-up of 45 clusters of galaxies detected with the Atacama Cosmology Telescope (ACT, Hilton et al. 2018; Hilton et al. 2021) or the Planck satellite (Planck Collaboration XVIII. 2015). The LPSZ follow-up combines high-resolution SZ data from the NIKA2 instrument (Adam et al. 2018; Perotto et al. 2020) with X-ray XMM-Newton observations and covers a redshift range between 0.5 and 0.9. Studies on individual clusters from the LPSZ sample have already been published (Ruppin et al. 2018; Kéruzoré et al. 2020; Muñoz-Echeverría et al. 2023), illustrating the joint SZ and X-ray analysis. Even though this sample was designed to be followed up in SZ using NIKA2, we emphasise that in this work we do not make use of any SZ data for the mass estimation procedure. Instead, we consider the HSE masses obtained from XMM-Newton data only.
A&A proofs: manuscript no. aanda

High-z clusters: Bartalucci+2018
Bartalucci et al. (2017) and Bartalucci et al. (2018) were able to go beyond z = 0.9 and measure the HSE mass of five individual clusters from resolved mass profiles. Given the difficulties related to the high redshift of the clusters, XMM-Newton data were combined with Chandra observations. Although supplementary Chandra data were added, we consider these masses as homogeneous with respect to the ESZ+LoCuSS and LPSZ samples since the same reconstruction pipeline was employed. However, special care is taken in our analyses when studying the impact of these clusters. Bartalucci et al. (2018) also indicate that the mass estimate for the SPT-CLJ2106-5844 cluster is not reliable; therefore, we exclude it from our analyses.

Comparison sample
The mass estimate of a cluster often varies from analysis to analysis, because of differences related to raw data or to the mass reconstruction method. In order to try to account for possible systematic biases in the CoMaLit and the reference X-ray pipeline masses, we gathered as many HSE and lensing mass estimates as possible from the literature for the clusters in our homogeneous sample. Again, we made sure that the masses in the chosen studies were measured on resolved profiles, excluding masses derived from scaling relations. We only considered HSE masses obtained from X-ray data. Comparing to HSE masses that use SZ data or scaling relations is also of great interest, but it would be an independent analysis in itself and beyond the scope of this paper (see, for example, Hoekstra et al. 2015; Sereno et al. 2017; Sereno & Ettori 2017; Schellenberger & Reiprich 2017). For lensing, in addition to the weak lensing masses, we also compared to masses reconstructed from the combination of strong and weak lensing signal.
We present in the following a brief description of this comparison sample, highlighting the distinctive characteristics of each analysis.We refer the reader to the original works for more details.

Ettori+2010
In Ettori et al. (2010) (and the Corrigendum, Ettori et al. 2011), the authors reconstructed the HSE mass of 44 clusters with redshifts 0.092 < z < 0.307 using XMM-Newton observations. They employed two different methods (M1 and M2) and gave the results in terms of $R_{500}$. We converted the $R_{500}$ values into $M_{500}$ masses. The main caveat of these results is that profiles were extrapolated to reach $R_{500}$ assuming an NFW profile. As coordinates of the assumed centres of the clusters are not given in Ettori et al. (2010), we took them from Yuan et al. (2022)³ and, when missing, from the 4XMM-DR9 source list⁴.

Landry+2013
In Landry et al. (2013) the HSE masses of 35 clusters with redshifts in the range 0.152 < z < 0.3017 were obtained using Chandra data. Two different mass estimates are given in the paper: either using the Vikhlinin model or the polytropic equation of state. According to the authors, the profiles of seven clusters required 'slight' extrapolation to reach $R_{500}$. Again, the coordinates of the assumed centres of the clusters are not given in Landry et al. (2013), so most of the coordinates were taken from Ebeling et al. (1998). When missing, position coordinates of clusters were found by querying the Simbad-CDS portal⁵ with the cluster name given in Table 1 in Landry et al. (2013).

LoCuSS
The aforementioned LoCuSS sample contains 50 clusters in total, with 0.152 < z < 0.3 (Smith et al. 2015). For our mass comparisons, we used the LoCuSS HSE masses published in Martino et al. (2014) and the lensing masses from Okabe & Smith (2016). The HSE masses were reconstructed with Chandra data for 43 clusters and with XMM-Newton observations for 39. For some clusters both estimates are available. Central coordinates of clusters were also taken from Martino et al. (2014). The analysis in Zhang et al. (2010) studied 12 out of the 50 clusters with XMM-Newton and Subaru data. The lensing masses published in Zhang et al. (2010) are equivalent to those in Okabe & Smith (2016), but the HSE mass profiles were evaluated at the $R_{500}$ corresponding to the lensing analyses. We therefore gave preference to the results in Okabe & Smith (2016) and Martino et al. (2014) and restricted the LoCuSS masses to the estimates in the latter two studies.

Mahdavi+2008
Uniformly estimated masses of 18 clusters were published in Mahdavi et al. (2008). Lensing masses were obtained as in Hoekstra (2007), but with the photometric redshift distributions from Ilbert et al. (2006). The lensing mass reconstruction was done with a method based on aperture mass estimation, that is, obtaining first projected masses, and subsequently deprojecting by assuming an NFW density model and the concentration-mass scaling relation from Bullock et al. (2001). For the HSE masses, Chandra observations were used. As indicated in Table 2 in Mahdavi et al. (2008), for 14 out of the 18 clusters the HSE masses at $R_{500}$ were obtained from extrapolation and all of them were measured at the lensing $R_{500}$.

Mahdavi+2013
Mahdavi et al. (2013) studied a sample of 50 clusters with redshift 0.152 < z < 0.55. The clusters correspond to the CCCP sample. The HSE masses were reconstructed from a combined analysis of XMM-Newton and Chandra data. For the same sample, lensing estimates were obtained in Hoekstra et al. (2012), using CFH12k and Megacam data from the Canada-France-Hawaii Telescope. HSE masses were measured at the $R_{500}$ obtained from lensing masses.

Israel+2014
The analysis in Israel et al. (2014) contains eight clusters with redshift 0.35 < z < 0.80. The lensing masses were obtained from an NFW fit to the tangential shear profiles of clusters, assuming a mass-concentration relation. To reconstruct the HSE mass, the authors used the electron density profiles of individual clusters, which were estimated from Chandra surface brightness maps. The temperature profile of individual clusters being more challenging to obtain, the authors combined the Chandra data of all clusters in the sample to reconstruct a single global temperature profile for the whole sample. The HSE masses in Israel et al. (2014) were also evaluated at the $R_{500}$ measured from lensing mass profiles.

LPSZ+CLASH
Within the LPSZ programme, Muñoz-Echeverría et al. (2022), Ferragamo et al. (2022), and Muñoz-Echeverría et al. (2023) estimated the lensing mass for three clusters in the sample in common with the Cluster Lensing And Supernova survey with Hubble (CLASH, Postman et al. 2012; Zitrin et al. 2009, 2013a,b, 2015). Masses were reconstructed by fitting a projected NFW mass density profile to the publicly available CLASH convergence maps (Zitrin et al. 2015). Given that two differently modelled convergence maps were provided, for some clusters two lensing mass estimates are available, named LTM and PIEMD+eNFW following the name of the method used to reconstruct the convergence maps. We also considered the lensing masses published in Umetsu et al. (2014) and Merten et al. (2015) for the same clusters.

Bartalucci+2018
Bartalucci et al. (2018) studied the HSE-to-lensing mass bias of five SPT clusters. The weak lensing masses were obtained in Schrabback et al. (2017) using Hubble Space Telescope (HST) observations. The profiles were centred on the X-ray peak or the SZ peak (indicated in Table 1 in Schrabback et al. 2017), giving two different lensing mass estimates per cluster.

Selection and characterisation of the sample
In this section, we present the comparison of the XMM-Newton and CoMaLit mass estimates to the results from other works presented in Sect. 2.2. We briefly describe the procedure used to match and select clusters from different catalogues, and then quantify the scatter based on the comparisons of several mass measurements for each cluster across our sample. Finally, we build the reference sample with the XMM-Newton and CoMaLit masses that we use for the rest of the analysis.

Matching clusters
We matched clusters from different catalogues on the basis of their coordinates. We considered that two entries in two distinct catalogues correspond to the same cluster for angular separations smaller than 400″. We further verified every match by checking the redshifts given in the different catalogues. We identified a suspicious match between A1606 (z = 0.0963) and A2029 (z = 0.078) and excluded it.
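The matching criterion can be sketched as follows, using the haversine formula for the angular separation and the 400″ threshold from the text. The catalogue layout (lists of dictionaries), the function names, and the redshift tolerance `max_dz` are hypothetical: the text states that redshifts were checked but gives no numerical tolerance. The coordinates in the example are invented toy values.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees, output in arcsec."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def match_catalogues(cat_a, cat_b, max_sep_arcsec=400.0, max_dz=0.01):
    """Pair entries closer than max_sep_arcsec and flag redshift consistency.

    max_dz is a hypothetical tolerance chosen for this sketch only.
    Returns tuples (name_a, name_b, separation_arcsec, redshifts_consistent).
    """
    matches = []
    for a in cat_a:
        for b in cat_b:
            sep = angular_separation_arcsec(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep < max_sep_arcsec:
                consistent = abs(a["z"] - b["z"]) <= max_dz
                matches.append((a["name"], b["name"], sep, consistent))
    return matches


# Toy catalogues with invented coordinates and redshifts.
cat_a = [{"name": "toy-A", "ra": 150.00, "dec": 2.00, "z": 0.200}]
cat_b = [
    {"name": "toy-B1", "ra": 150.05, "dec": 2.00, "z": 0.201},  # close, same z
    {"name": "toy-B2", "ra": 150.00, "dec": 2.05, "z": 0.350},  # close, different z
    {"name": "toy-B3", "ra": 151.00, "dec": 2.00, "z": 0.200},  # too far away
]
for name_a, name_b, sep, ok in match_catalogues(cat_a, cat_b):
    print(f"{name_a} - {name_b}: {sep:.0f} arcsec, redshifts consistent: {ok}")
```

Pairs that pass the positional cut but fail the redshift check are kept and flagged instead of being dropped, so that they can be inspected individually, as was done for the A1606 and A2029 pair.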
At the same time, we discarded clusters that appear as one object in some catalogue and as a combination of multiple substructures in another. For example, the cluster A1758 in Landry et al. (2013) has four entries in the LC$^2$-single catalogue: A1758S, A1758NW, A1758N, A1758NE. Similarly, we excluded A222, A223N, and A223S. In addition, we identified and discarded A750 (present in CoMaLit, LoCuSS, Mahdavi+2013, and Mahdavi+2008 catalogues), whose mass estimate cannot be reliable since it is superimposed along the line of sight with MS0906+11 (Geller et al. 2014).
We summarise in Table 1 the overlap between the homogeneous clusters in XMM-Newton and CoMaLit samples and those from other works presented in Sect. 2.2. For 36 of the XMM-Newton and 82 of the CoMaLit clusters we identified other HSE and lensing mass estimates⁶.

Estimation of systematic dispersion
We present in the left panel in Fig. 1 the relation between X-ray HSE masses obtained with the reference X-ray pipeline (homogeneous masses) with respect to other X-ray HSE masses from the literature (comparison sample). In the right panel, we show the relation between lensing masses from different works with respect to the estimates summarised in CoMaLit. Each colour represents one of the samples described in Sect. 2.2 and different estimates of the same work are differentiated with markers. The black dashed line shows the one-to-one relation.
Overall, the agreement between the samples is reasonable, with a significant dispersion around the 1:1 relation. We identify some clusters for which the mass estimates differ significantly. These are Abell521, Abell2390, Abell2163 in X-rays and RXJ1347.5-1145, CL1641, CL1701 in the lensing masses comparison. The cluster shown with the green marker on top of the left panel in Fig. 1 is Abell2390 and, despite its departure from the 1:1 relation, we do not have strong arguments for excluding it. For lensing masses (in the right panel in Fig. 1) there also seems to be a hint of some bias that we do not propagate hereafter. We verified that the bias does not correlate with a comparison sample in particular, but rather with high mass clusters. Further investigation would be needed to understand this trend.
To quantify the systematic dispersion with respect to the 1:1 relation, we followed Eq. 3 and 4 in Pratt et al. (2009). For the $N_{\rm lens} = 120$ matched entries between the CoMaLit catalogue and the other lensing samples (Table 1), we defined the raw variance as

$$\sigma^2_{\rm raw\,lens} = \frac{1}{N_{\rm lens} - 1} \sum_{i=1}^{N_{\rm lens}} w_i \left(M^{\rm CoMaLit}_{{\rm lens},\,i} - M^{\rm other}_{{\rm lens},\,i}\right)^2, \tag{1}$$

where $w_i$ is the weight of each cluster and $M^{\rm CoMaLit}_{\rm lens}$ and $M^{\rm other}_{\rm lens}$ are the lensing mass in the CoMaLit catalogue and in a different analysis, respectively. The weight given to each cluster is

$$w_i = \frac{1/\sigma_i^2}{(1/N_{\rm lens}) \sum_{j=1}^{N_{\rm lens}} 1/\sigma_j^2}, \tag{2}$$

with $\sigma_i^2 = \delta^2_{M^{\rm CoMaLit}_{{\rm lens},\,i}} + \delta^2_{M^{\rm other}_{{\rm lens},\,i}}$, the sum of the uncertainties related to each cluster. The $\sigma^2_{\rm raw\,HSE}$ was measured in an equivalent way using the HSE masses and uncertainties of each cluster, $M^{\rm XMM}_{\rm HSE}$ and $M^{\rm other}_{\rm HSE}$ and $\delta^2_{M^{\rm XMM}_{\rm HSE}}$ and $\delta^2_{M^{\rm other}_{\rm HSE}}$. The statistical error associated to the masses was obtained for both lensing and X-ray masses with the above-mentioned weight, $w_i$, and $\sigma_i^2$:

$$\sigma^2_{\rm stat} = \frac{1}{N} \sum_{i=1}^{N} w_i\, \sigma_i^2. \tag{3}$$

This allowed us to define the systematic scatter, that is, the excess of scatter in the raw variance not explained by the statistical uncertainties, as

$$\sigma^2_{\rm sys} = \sigma^2_{\rm raw} - \sigma^2_{\rm stat}. \tag{4}$$

We report in Fig. 1 (and in Table A.1) the statistical, systematic, and raw scatter for the HSE and lensing masses. The raw dispersion of lensing masses ($\sigma^2_{\rm raw\,lens} = 5.280 \times (10^{14}\,M_\odot)^2$) is larger than that of the HSE ones ($\sigma^2_{\rm raw\,HSE} = 3.231 \times (10^{14}\,M_\odot)^2$) and, the uncertainties of individual lensing masses being larger, the statistical dispersion is also larger ($\sigma^2_{\rm stat\,lens} = 4.340 \times (10^{14}\,M_\odot)^2$ and $\sigma^2_{\rm stat\,HSE} = 1.507 \times (10^{14}\,M_\odot)^2$). Nevertheless, the error bars of HSE masses are not large enough to cover the excess of scatter around the 1:1 relation, making the systematic scatter for HSE masses ($\sigma^2_{\rm sys\,HSE} = 1.724 \times (10^{14}\,M_\odot)^2$) larger than for lensing ($\sigma^2_{\rm sys\,lens} = 0.940 \times (10^{14}\,M_\odot)^2$).

Table 1 notes. We also report the total amount of matches, that is, the data points in Fig. 1, and the number of different objects.
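The decomposition of the scatter into raw, statistical, and systematic terms can be sketched as below. The form of the weights follows Pratt et al. (2009); the normalisation of the raw variance by N − 1 (exposed as the hypothetical `dof_offset` parameter) and the expression used for the statistical term, (1/N) Σ w_i σ_i², are assumptions of this sketch. Masses and uncertainties must share the same units, for instance 10¹⁴ M⊙.

```python
def scatter_decomposition(m_ref, err_ref, m_other, err_other, dof_offset=1):
    """Return (raw, statistical, systematic) variances about the 1:1 relation.

    m_ref, err_ref     : reference masses and their 1-sigma uncertainties
    m_other, err_other : comparison masses and their 1-sigma uncertainties
    dof_offset         : value subtracted from N in the raw-variance
                         normalisation (assumed to be 1 in this sketch)
    """
    n = len(m_ref)
    # Per-pair variance: sum of the squared uncertainties of both estimates.
    var_i = [e1 ** 2 + e2 ** 2 for e1, e2 in zip(err_ref, err_other)]
    mean_inv_var = sum(1.0 / v for v in var_i) / n
    # Inverse-variance weights normalised so that their mean is one.
    weights = [(1.0 / v) / mean_inv_var for v in var_i]
    raw = sum(w * (a - b) ** 2
              for w, a, b in zip(weights, m_ref, m_other)) / (n - dof_offset)
    stat = sum(w * v for w, v in zip(weights, var_i)) / n
    return raw, stat, raw - stat


# Toy example with invented masses (units of 1e14 Msun).
raw, stat, sys_var = scatter_decomposition(
    m_ref=[5.0, 8.0, 3.0], err_ref=[1.0, 1.0, 1.0],
    m_other=[6.0, 6.0, 4.0], err_other=[1.0, 1.0, 1.0],
)
print(raw, stat, sys_var)
```

With these weights the statistical term reduces to the inverse of the mean inverse variance. The systematic term can come out negative when the quoted uncertainties already exceed the observed scatter, in which case no excess scatter is detected.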
As mentioned in the description of each sample in Sect. 2.2, HSE masses in some works were evaluated at the $R_{500}$ obtained from lensing. We checked the impact of excluding such estimates from the analysis. In the left panel in Fig. A.1, we present the relation between the XMM-Newton reference pipeline masses and X-ray masses from the comparison sample without accounting for $M_{\rm HSE}(<R^{\rm lens}_{500})$ estimates (that is, without Mahdavi et al. 2008, 2013). The statistical, raw, and systematic variances change by 0.4, 5, and 10%, respectively. Hence, taking a conservative approach, in the following sections we consider the largest systematic scatter values obtained.

Reference sample
Following the procedure described in Sect. 3.1, we matched the clusters in the CoMaLit catalogue (Sect. 2.1.1) with the clusters with HSE masses from the XMM-Newton reference pipeline (Sect. 2.1.2) and obtained a homogeneous sample composed of 65 clusters. Amongst the 65 clusters, 54 correspond to the ESZ+LoCuSS samples, eight clusters are from the LPSZ, and three from Bartalucci+2018. For these clusters, we performed additional checks and discarded three clusters with unreasonable error bars (see Appendix B.1 for details) and ten clusters (one of them already excluded) for which X-ray and lensing mass reconstruction analyses had assumed very distant centres (see Appendix B.2).
As a result, our reference sample contains 53 clusters with homogeneous HSE and lensing masses that can be used for comparisons (see Table B.2). We present in Fig. 2 a summary of the characteristics of the sample. The histograms on the left show the number of clusters with respect to redshift, HSE mass, and lensing mass. The right panel in Fig. 2 presents the clusters in the mass-redshift plane. While very few works in the literature go above z = 0.5, 20% of the clusters in our sample have redshifts higher than 0.5. However, the distribution in redshift of the sample is dominated by low-z clusters.
After excluding the last 12 clusters (in Appendix B.1 and B.2) from the XMM-Newton and CoMaLit samples, we recalculated the scatter with respect to other HSE and lensing masses. Compared to Fig. 1, the raw, statistical, and systematic dispersions remain of the same order, but the impact of individual clusters is again noticeable in the resulting values (less than 10% of change, see Fig. A.1 and Table A.1). Therefore, we took the most conservative approach and considered that the systematic scatters to be accounted for in the XMM-Newton and CoMaLit masses are the largest values we have found: $\sigma^2_{\rm sys\,lens} = 1.202 \times (10^{14}\,M_\odot)^2$ and $\sigma^2_{\rm sys\,HSE} = 2.017 \times (10^{14}\,M_\odot)^2$. We note that the clusters used for these calculations are not necessarily the 53 in our reference sample, but the ones in common between XMM-Newton and other X-ray samples and between CoMaLit and other lensing works (summarised in Table 1). We compare in Fig. B.3 the systematic standard deviation values to the individual statistical uncertainties of the masses from the XMM-Newton reference pipeline and the CoMaLit catalogue.
In the following sections, we investigate how the HSE-to-lensing mass bias and scaling relation change when accounting for these systematic scatters. In order to propagate the scatters to the final results, we consider that the uncertainties in the mass of each cluster are the quadratic sum of the measurement statistical uncertainties and the systematic scatters derived in this section. Thus, we have

$$\delta^2_{M^{\mathrm{lens}}} = \delta^2_{\mathrm{stat}\,M^{\mathrm{lens}}} + \sigma^2_{\mathrm{sys\,lens}} \qquad (5)$$

for the lensing masses, and

$$\delta^2_{M^{\mathrm{HSE}}} = \delta^2_{\mathrm{stat}\,M^{\mathrm{HSE}}} + \sigma^2_{\mathrm{sys\,HSE}} \qquad (6)$$

for the hydrostatic ones. This is a very conservative approach that assumes that the mass estimates from the X-ray reference pipeline and the CoMaLit catalogue may have an additional error (due to, for example, the dataset used or the mass reconstruction method) that can be quantified from the distance to other estimates. Such a supplementary error is usually not considered in the literature. For this reason, we also perform the study without accounting for the systematic uncertainties. An alternative approach was considered in Sereno & Ettori (2015) by separating the analysis into subsamples. A cross-validation of our results by subsamples is presented in Sect. 5.4.
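The quadratic sum described above can be sketched in a few lines. This is only an illustration: the function name and the example statistical error of $1 \times 10^{14}\,M_\odot$ are hypothetical, while the two systematic variances are the values quoted in this section.

```python
import math

# Systematic variances quoted in the text, in units of (1e14 Msun)^2.
SIGMA2_SYS_LENS = 1.202
SIGMA2_SYS_HSE = 2.017


def total_uncertainty(stat_err, sys_variance):
    """Quadratic sum of a statistical error and a systematic scatter.

    stat_err is a standard deviation, sys_variance is a variance,
    both expressed in (powers of) 1e14 Msun.
    """
    return math.sqrt(stat_err**2 + sys_variance)


# Hypothetical cluster with a statistical error of 1.0e14 Msun on both masses.
err_lens = total_uncertainty(1.0, SIGMA2_SYS_LENS)
err_hse = total_uncertainty(1.0, SIGMA2_SYS_HSE)
```

For clusters whose statistical error is well below about $10^{14}\,M_\odot$, the total uncertainty is then dominated by the systematic term.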

Direct HSE-to-lensing mass bias measurement
The bias of HSE masses with respect to lensing estimates is defined from the ratio of the masses,

$$(1 - b_{\mathrm{HSE/lens}}) = \frac{M^{\mathrm{HSE}}_{500}}{M^{\mathrm{lens}}_{500}}. \qquad (7)$$

For simplicity, in the rest of this paper we name the HSE-to-lensing mass bias without subscripts, $b = b_{\mathrm{HSE/lens}}$. As a first approach, and for comparison with other works in the literature, we directly compare the HSE-to-lensing mass ratio among the clusters of the reference sample. Following the parametrisation in Salvati et al. (2019) and Wicker et al. (2023), we describe the redshift evolution of the HSE-to-lensing mass bias as

$$(1 - b)(z) = (1 - B) \left( \frac{1 + z}{1 + z_*} \right)^{\beta_z}, \qquad (8)$$

where $z_*$ is a pivot redshift.

We show in Fig. 3 the HSE-to-lensing mass ratio as a function of redshift for the 53 clusters in the reference sample. Here the error bars include the systematic scatter following Eqs. 5 and 6. The grey shaded area in the top panel indicates the 16th to 84th percentile region of the bias evolution model obtained from the posterior distributions of the fitted parameters. For comparison, the horizontal lines show the mean (dash-dotted line), median (dotted line), and error-weighted mean (solid line) HSE-to-lensing mass ratio obtained with the 53 cluster masses. Posterior distributions of the fitted parameters are shown with grey contours in Fig. 4. The best-fit values and uncertainties are given in the first row of Table 2. We give $(1 - B)/(1 + z_*)^{\beta_z}$, which is the value of the bias at z = 0. We also report the results without accounting for the systematic scatter of the lensing and HSE masses. As expected, when accounting for $\sigma^2_{\mathrm{sys}}$ the uncertainties of the posterior distributions are enlarged.
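As a rough illustration of the quantities used in this section, the sketch below evaluates a power-law bias evolution of the form $(1 - b)(z) = (1 - B)\,[(1 + z)/(1 + z_*)]^{\beta_z}$, which is the form implied by the value $(1 - B)/(1 + z_*)^{\beta_z}$ quoted at z = 0, together with a per-cluster mass ratio. The first-order, uncorrelated error propagation and all numerical values are assumptions made for the example, not the procedure or results of this work.

```python
import numpy as np


def bias_evolution(z, one_minus_B, beta_z, z_star):
    """(1 - b)(z) = (1 - B) * ((1 + z) / (1 + z_star)) ** beta_z."""
    z = np.asarray(z, dtype=float)
    return one_minus_B * ((1.0 + z) / (1.0 + z_star)) ** beta_z


def mass_ratio(m_hse, err_hse, m_lens, err_lens):
    """M_HSE / M_lens with first-order error propagation (no correlation)."""
    ratio = m_hse / m_lens
    err = ratio * np.hypot(err_hse / m_hse, err_lens / m_lens)
    return ratio, err


# Hypothetical parameter values, for illustration only.
ONE_MINUS_B, BETA_Z, Z_STAR = 0.8, -0.5, 0.2
bias_at_z0 = bias_evolution(0.0, ONE_MINUS_B, BETA_Z, Z_STAR)
bias_at_pivot = bias_evolution(Z_STAR, ONE_MINUS_B, BETA_Z, Z_STAR)

# Hypothetical cluster masses and errors in units of 1e14 Msun.
ratio, ratio_err = mass_ratio(6.0, 1.0, 8.0, 2.0)
```

By construction the model returns $(1 - B)$ at the pivot redshift and $(1 - B)/(1 + z_*)^{\beta_z}$ at z = 0.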
Due to the significant differences in the mass uncertainties and the non-uniform distribution of the clusters in redshift, certain subsamples might be driving the fit of the model. To check for these effects and investigate any dependence on redshift, we repeat the fit by considering clusters in different redshift ranges.
Considering only the clusters with z < 0.9 (that is, those in the ESZ+LoCuSS and LPSZ samples) and only those with z < 0.5 (only ESZ+LoCuSS), the results are very close to the ones obtained with the reference sample. This means that the grey result in Figs. 3 and 4 is most probably dominated by ESZ+LoCuSS clusters. Best-fit values and uncertainties for these two cases are given in Table 2. The corresponding bias evolution models are shown in blue (z < 0.9) and cyan (z < 0.5) in the central panel in Fig. 3. We find more significant differences when considering only low-redshift clusters (z < 0.2, in green) or when discarding them (z > 0.2, in red). For low-redshift clusters, the HSE masses at z = 0 are less biased with respect to lensing masses ($(1 - B)/(1 + z_*)^{\beta_z}$ closer to 1) than for the reference sample, but the dependence on redshift is stronger. Exactly the opposite happens when fitting only z > 0.2 masses: the HSE-to-lensing mass bias is larger at z = 0 (smaller $(1 - B)/(1 + z_*)^{\beta_z}$), but the redshift evolution is weaker (smaller absolute value of $\beta_z$). These conclusions agree with the results in Wicker et al. (2023), where the same cut in redshift is adopted. In Smith et al. (2015), the authors also reported a different tendency for Planck cluster masses depending on the redshift, with a larger HSE-to-lensing bias value (smaller 1 − b) for Planck masses at z > 0.3 than at z < 0.3. However, these masses were inferred from the SZ-mass scaling relation and not measured from profiles. Nonetheless, in our analysis $\beta_z$ is compatible with no redshift evolution for both the z < 0.2 and z > 0.2 subsamples (see posterior probability density contours in Fig. 4).

Notes to Table 2. Columns 1 to 3 present the considered sample, the number of clusters, and the median redshift. Columns 4 to 7 give the best-fit values with 16th and 84th percentiles of the posterior distributions for the parameters describing the bias evolution, without (columns 4 and 5) and with (columns 6 and 7) the systematic scatters. The values corresponding to the reference sample accounting for the systematic scatters are shown in bold.
As shown in Fig. 2, clusters at high redshift are rare in our sample, with a large gap between z = 0.62 and z = 0.89. Only CL J1226.9+3332, SPT-CLJ0615-5746, SPT-CLJ0546-5345, and SPT-CLJ2341-5119 are above z = 0.62. For CL J1226.9+3332 the uncertainties on the bias are more than one order of magnitude smaller than the uncertainties of the three SPT clusters. We suspect that this single cluster may be forcing the bias towards lower values at high redshift. To test the impact that CL J1226.9+3332 has on the fits, we repeat the analyses excluding it. The results without CL J1226.9+3332 are shown, following the same colour scheme as before, in the bottom panel in Fig. 3 and with dashed lines in Fig. 4. We observe that $\beta_z$ varies significantly when excluding CL J1226.9+3332 and tends to be more compatible with no redshift evolution. At the same time, the bias at z = 0 is slightly shifted towards lower values. All the results are summarised in Table 2.
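The influence of a single, very precisely measured cluster can be illustrated with a leave-one-out test on an inverse-variance weighted mean. This is a simplified stand-in for the full refit described above; the ratios and errors below are hypothetical, with the last point playing the role of a cluster whose error bar is ten times smaller than the others.

```python
import numpy as np


def weighted_mean(values, errors):
    """Inverse-variance weighted mean."""
    weights = 1.0 / np.asarray(errors, dtype=float) ** 2
    return np.sum(weights * np.asarray(values, dtype=float)) / np.sum(weights)


def leave_one_out_shifts(values, errors):
    """Shift of the weighted mean when each point is removed in turn."""
    values = np.asarray(values, dtype=float)
    errors = np.asarray(errors, dtype=float)
    full = weighted_mean(values, errors)
    shifts = np.array([
        weighted_mean(np.delete(values, i), np.delete(errors, i)) - full
        for i in range(values.size)
    ])
    return full, shifts


# Hypothetical mass ratios; the last one has a much smaller error bar.
ratios = [0.9, 0.8, 1.1, 0.5]
ratio_errors = [0.3, 0.3, 0.3, 0.03]
full_mean, shifts = leave_one_out_shifts(ratios, ratio_errors)
most_influential = int(np.argmax(np.abs(shifts)))
```

In this toy case the weighted mean sits close to the precise point, and removing that point moves the mean upwards by far more than removing any other.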
The described direct HSE-to-lensing mass bias estimation method neglects the intrinsic scatters of the HSE and lensing mass estimates.As explained in Sereno & Ettori (2015), this could influence the resulting bias that relates HSE and lensing masses.For this reason, in the next section we take a different approach to estimate the HSE-to-lensing mass bias.

HSE-to-lensing mass scaling relation
Estimating the scaling relation between HSE and lensing masses is an alternative way of measuring the HSE-to-lensing mass bias (Eq. 7), together with the intrinsic scatter associated with the HSE and lensing masses. We follow the methodology presented in Sereno & Ettori (2015) and consider that both the HSE and the lensing masses are scattered and biased estimates of the true mass of clusters, such that

$$\ln M^{\mathrm{lens}} \pm \delta_{\mathrm{lens}} = \alpha_{\mathrm{lens}} + \beta_{\mathrm{lens}} \ln M^{\mathrm{True}} \pm \sigma_{\mathrm{lens}}, \qquad (9)$$

$$\ln M^{\mathrm{HSE}} \pm \delta_{\mathrm{HSE}} = \alpha_{\mathrm{HSE}} + \beta_{\mathrm{HSE}} \ln M^{\mathrm{True}} \pm \sigma_{\mathrm{HSE}}. \qquad (10)$$

Here $\delta_{\mathrm{lens}}$ and $\delta_{\mathrm{HSE}}$ are the measurement uncertainties associated with the logarithm of the lensing and HSE mass estimates for each cluster. The natural logarithm of the bias and the deviation from linearity are $\alpha$ and $\beta$, respectively. The intrinsic scatters of the lensing and HSE masses with respect to the true mass are given by $\sigma_{\mathrm{lens}}$ and $\sigma_{\mathrm{HSE}}$. All the masses in the arguments of logarithms are in units of $10^{14}\,M_\odot$. Sereno & Ettori (2015) verified that the scatter and bias results do not vary if $\alpha_{\mathrm{lens}} = 0$ or $\alpha_{\mathrm{HSE}} = 0$ is considered, so following their work, we fix $\alpha_{\mathrm{lens}} = 0$. We use the LInear Regression in Astronomy (LIRA, Sereno 2016) R package and the pylira Python wrapper to perform the fit of the SR. LIRA is based on Bayesian hierarchical modelling and samples the posterior distribution with a Gibbs-sampling MCMC. It can account for heteroscedastic measurement errors, intrinsic scatter, and time evolution of the SR.
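To make the structure of this model concrete, the following sketch draws mock HSE and lensing masses from it: each log-mass is the true log-mass shifted by the bias term, then perturbed by a Gaussian intrinsic scatter and a Gaussian measurement error. It is a forward simulation only and does not use LIRA or pylira; the function name, the uniform distribution of true masses, and all numerical values are hypothetical.

```python
import numpy as np


def simulate_masses(ln_m_true, alpha_hse, sigma_hse, sigma_lens,
                    delta_hse, delta_lens, rng, beta_hse=1.0):
    """Draw observed ln(M_HSE) and ln(M_lens) for given true log-masses.

    alpha_lens = 0 and beta_lens = 1 are fixed, as in the text.
    """
    ln_m_true = np.asarray(ln_m_true, dtype=float)
    n = ln_m_true.size
    # Intrinsically scattered masses around the true mass.
    ln_lens = ln_m_true + rng.normal(0.0, sigma_lens, n)
    ln_hse = alpha_hse + beta_hse * ln_m_true + rng.normal(0.0, sigma_hse, n)
    # Measurement noise on top of the intrinsic scatter.
    ln_lens_obs = ln_lens + rng.normal(0.0, delta_lens, n)
    ln_hse_obs = ln_hse + rng.normal(0.0, delta_hse, n)
    return ln_hse_obs, ln_lens_obs


rng = np.random.default_rng(0)
# Hypothetical true masses between 2 and 15 (in units of 1e14 Msun).
ln_m_true = rng.uniform(np.log(2.0), np.log(15.0), 20000)
ln_hse, ln_lens = simulate_masses(
    ln_m_true, alpha_hse=-0.3, sigma_hse=0.2, sigma_lens=0.25,
    delta_hse=0.1, delta_lens=0.15, rng=rng,
)
mean_log_offset = np.mean(ln_hse - ln_lens)
```

With $\beta_{\mathrm{HSE}} = 1$, the mean of $\ln M^{\mathrm{HSE}} - \ln M^{\mathrm{lens}}$ in the mock sample recovers the input $\alpha_{\mathrm{HSE}}$ to within the sampling noise.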

Reference scaling relation
The SR of reference in this paper is built using the aforementioned 53 clusters in the reference sample, assuming that both the lensing and the HSE masses scale linearly with the true mass, $\beta_{\mathrm{lens}} = 1$ and $\beta_{\mathrm{HSE}} = 1$, and that there is no evolution of the SR with redshift. The MCMC sampling is performed using 200 chains and $6 \times 10^6$ steps, with a burn-in of the first half of the steps. Convergence is checked following the $\hat{R}$ test of Gelman & Rubin (1992). We take uniform priors for the free parameters: $\alpha_{\mathrm{HSE}} \sim U(-4, 4)$, $\sigma_{\mathrm{HSE}} \sim U(0, 10)$, $\sigma_{\mathrm{lens}} \sim U(0, 10)$.
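For readers unfamiliar with the convergence test mentioned above, a minimal sketch of the classic Gelman & Rubin statistic for one parameter is given below. It is a textbook version written for illustration and is not the implementation used by LIRA; the mock chains are hypothetical, and the first half of each chain is discarded as burn-in as in the text.

```python
import numpy as np


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains: array of shape (n_chains, n_steps) of post burn-in samples.
    """
    chains = np.asarray(chains, dtype=float)
    n_steps = chains.shape[1]
    chain_means = chains.mean(axis=1)
    within = chains.var(axis=1, ddof=1).mean()      # W
    between = n_steps * chain_means.var(ddof=1)     # B
    var_hat = (n_steps - 1) / n_steps * within + between / n_steps
    return np.sqrt(var_hat / within)


rng = np.random.default_rng(1)
raw = rng.normal(0.0, 1.0, size=(4, 4000))
kept = raw[:, raw.shape[1] // 2:]  # drop the first half as burn-in

r_converged = gelman_rubin(kept)
# Chains offset from each other mimic a sampler that has not converged.
r_not_converged = gelman_rubin(kept + np.arange(4)[:, None])
```

Values close to 1 indicate that the chains sample the same distribution, whereas values well above 1 signal a lack of convergence.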
We present in the left panel in Fig. 5 the HSE-to-lensing mass scaling relation obtained with the 53 clusters of the reference sample. Data points correspond to each one of the clusters in the sample, with the ellipses in the figure indicating the error bars in both axes when considering the systematic scatter (see Eqs. 5 and 6). We assume no correlation between both mass estimates. The grey and pink lines show, respectively, the scaling relation accounting and not accounting for the systematic scatter in the error bars of each cluster (Eqs. 5 and 6). Shaded areas indicate the 1σ region. The black dashed line shows the one-to-one relation between HSE and lensing masses. In the right panel in Fig. 5, we show the posterior distributions of the fitted scaling relation parameters. The intrinsic scatter related to HSE masses is remarkably shifted towards zero when accounting for the systematic scatter in the error bars of cluster masses. This is expected, since increasing the error bars of clusters reduces the need to have a dispersion around the SR. The median values with the 16th and 84th percentiles of the posterior distributions of $\alpha_{\mathrm{HSE}}$, $\sigma_{\mathrm{HSE}}$, and $\sigma_{\mathrm{lens}}$ are given in the first row of Table 3. From $\alpha_{\mathrm{HSE}}$ we compute the HSE-to-lensing mass bias at $R_{500}$ (Eq. 7), which gives $(1 - b) = 0.739^{+0.075}_{-0.070}$ considering the systematic scatters.
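With $\alpha_{\mathrm{lens}} = 0$ and $\beta_{\mathrm{lens}} = \beta_{\mathrm{HSE}} = 1$, the mean offset between $\ln M^{\mathrm{HSE}}$ and $\ln M^{\mathrm{lens}}$ is $\alpha_{\mathrm{HSE}}$, so one natural conversion is $(1 - b) = \exp(\alpha_{\mathrm{HSE}})$. The sketch below applies it to the median $\alpha_{\mathrm{HSE}} = -0.303$ reported in the conclusions of this work. The Gaussian posterior samples are mock values generated only to show how percentiles propagate; they are not the actual chains.

```python
import numpy as np


def one_minus_b_from_alpha(alpha_hse):
    """Convert the log-bias alpha_HSE into the mass ratio (1 - b)."""
    return np.exp(alpha_hse)


def summarise(samples):
    """Median and upper/lower errors from the 16th and 84th percentiles."""
    p16, p50, p84 = np.percentile(samples, [16, 50, 84])
    return p50, p84 - p50, p50 - p16


central = one_minus_b_from_alpha(-0.303)

# Mock Gaussian posterior for alpha_HSE (hypothetical width), illustration only.
rng = np.random.default_rng(2)
alpha_samples = rng.normal(-0.303, 0.098, 100000)
median, err_up, err_down = summarise(one_minus_b_from_alpha(alpha_samples))
```

Because the exponential is convex, a symmetric posterior on $\alpha_{\mathrm{HSE}}$ maps into a slightly asymmetric one on $(1 - b)$, with the upper error larger than the lower one.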

Impact of particular subsamples in redshift
As for the bias model in Sect. 4, we also want to check how the SR parameters may vary depending on the chosen redshift range. Therefore, we repeat the analysis for the different redshift subsamples considered in Sect. 4. We present in Figs. 6 and C.1 and in Table 3 the different results, with and without $\sigma^2_{\mathrm{sys}}$. Again, we observe that the bias changes for z < 0.2 and z > 0.2 clusters, in line with a (1 − b) value that decreases with redshift. The scaling relations with the z < 0.9 and z < 0.5 samples remain almost unchanged with respect to the SR of reference. Not accounting for CL J1226.9+3332 reduces the lensing scatter for the z > 0.5 subsample. Overall, we find that the SRs are compatible for the different subsamples.
The posterior distribution of the SR parameters obtained for the z < 0.5 clusters without $\sigma^2_{\mathrm{sys}}$ (see Fig. C.2) can be directly compared to Figure 5 in Sereno & Ettori (2015). In that work, the 50 CCCP clusters from Mahdavi et al. (2013) were used to measure the HSE-to-lensing mass scaling relation (even though the HSE masses were evaluated at the $R_{500}$ obtained from lensing). The intrinsic scatters seem to be differently correlated in Sereno & Ettori (2015) and in this paper. However, in both cases we observe no strong correlation between $\alpha_{\mathrm{HSE}}$ and the intrinsic HSE or lensing scatters. In our case, for the z < 0.5 clusters without (with) $\sigma^2_{\mathrm{sys}}$ we measure $(1 - b) = $ … $(0.720^{+0.080}_{-0.073})$. These results (Table 3) are in line with the values reported in Table 6 in Sereno & Ettori (2015) and Table 2 in Lovisari et al. (2020).

Investigations of possible model extensions
Beyond the reference scaling relation, for which we have assumed no redshift evolution and a linear scaling between the masses, in this section we test if relaxing some of these assumptions improves the description of the data by the scaling relation model.

Deviation from linearity
The HSE and/or lensing masses could also scale non-linearly with the true mass, meaning that the HSE-to-lensing bias would depend on the mass of the clusters. Hoekstra et al. (2015) and von der Linden et al. (2014) investigated such a dependence on the mass by comparing Planck results to CCCP and WtG lensing masses, respectively. Both works found modest evidence for a mass dependence: $M^{\mathrm{Planck}} \propto M^{0.64 \pm 0.17}$ … 2014) for different cluster samples. Physically, this mass dependence could correspond, for example, to an impact of the baryonic physics that would depend on the depth of the clusters' potential wells. In this case, since low-mass clusters have shallower potential wells, we can imagine that baryonic effects are stronger in them (McCarthy et al. 2011). On the contrary, simulations in Rasia et al. (2012) also indicate that massive objects are the most disturbed ones and probably have more complex temperature structures.

Notes to Table 3. We present the results for different data subsamples, with and without accounting for the systematic uncertainties in the error bars of the masses. We show in bold the parameters for the scaling relation of reference presented in Sect. 5.

Table 4. Summary of the median values and uncertainties at the 16th and 84th percentiles of the parameters in the HSE-to-lensing SR when considering a deviation from linearity, an offset between HSE and lensing masses, or an evolution with redshift. Notes. We present the results for the reference sample, accounting or not for the systematic scatter in the error bars of the masses. For the BCES fit we report the best-fit values and 1σ uncertainties. (*) We also calculate the scatter with respect to the best BCES scaling relations following Eq. 4.
We also test this hypothesis by fitting the SR in Eqs. 9 and 10 leaving $\beta_{\mathrm{HSE}}$ as a free parameter. We take a uniform prior $\beta_{\mathrm{HSE}} \sim U(0, 2)$ and consider the same priors as before for $\alpha_{\mathrm{HSE}}$, $\sigma_{\mathrm{HSE}}$, and $\sigma_{\mathrm{lens}}$. The resulting scaling relations are presented in Fig. 7 and the median values are given in Table 4. As shown in the corner plot in Fig. 7, $\alpha_{\mathrm{HSE}}$ and $\beta_{\mathrm{HSE}}$ are completely degenerate. Nevertheless, our results are in agreement with Hoekstra et al. (2015) and von der Linden et al. (2014), although the HSE masses in those works were Planck masses from the SZ-mass scaling relation.
For comparison with the results obtained with LIRA, we also perform the fit of the SR using the orthogonal Bivariate Correlated Errors and intrinsic Scatter method (BCES, Akritas & Bershady 1996). BCES favours a larger deviation from linearity, that is, a smaller $\beta_{\mathrm{HSE}}$. We also report the results in Table 4. Given the large uncertainties on $\alpha_{\mathrm{HSE}}$ and $\beta_{\mathrm{HSE}}$, the scaling relations obtained with LIRA and BCES are compatible.
In Fig. 8 we present the HSE-to-lensing mass ratio as a function of the lensing mass for the fitted $\alpha_{\mathrm{HSE}}$ and $\beta_{\mathrm{HSE}}$, with the green shaded area showing the 16th to 84th percentiles. The horizontal grey hatched area represents the HSE-to-lensing mass ratio measured in the previous section assuming that HSE and lensing masses scale linearly with the true mass. Given that we obtain $\beta_{\mathrm{HSE}} < 1$, on average the difference between HSE and lensing masses is larger for more massive objects. This is in agreement with the mild decreasing tendency for the HSE-to-lensing mass ratio obtained in Hoekstra et al. (2015), von der Linden et al. (2014), and Eckert et al. (2019), but different from the trend observed in Salvati et al. (2019). Nevertheless, our results are consistent with no mass dependence of the ratio. The difficulty of disentangling $\alpha_{\mathrm{HSE}}$ and $\beta_{\mathrm{HSE}}$ does not motivate further investigations of the SR model with additional free parameters. Leaving $\alpha_{\mathrm{lens}}$ free would add a parameter to the model that is strongly correlated with $\alpha_{\mathrm{HSE}}$ and $\beta_{\mathrm{HSE}}$.
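The link between $\beta_{\mathrm{HSE}} < 1$ and a ratio that decreases with mass can be made explicit with a short sketch. It assumes $\alpha_{\mathrm{lens}} = 0$ and $\beta_{\mathrm{lens}} = 1$, and it neglects the intrinsic scatters so that the lensing mass stands in for the true mass, which gives $M^{\mathrm{HSE}}/M^{\mathrm{lens}} = \exp(\alpha_{\mathrm{HSE}})\,(M^{\mathrm{lens}})^{\beta_{\mathrm{HSE}} - 1}$ with masses in units of $10^{14}\,M_\odot$. The parameter values are hypothetical and are not the fitted ones.

```python
import numpy as np


def mass_ratio_vs_mass(m_lens, alpha_hse, beta_hse):
    """M_HSE / M_lens as a function of M_lens (in units of 1e14 Msun).

    Scatter-free reading of ln M_HSE = alpha_HSE + beta_HSE * ln M_True
    with ln M_lens = ln M_True.
    """
    m_lens = np.asarray(m_lens, dtype=float)
    return np.exp(alpha_hse) * m_lens ** (beta_hse - 1.0)


masses = np.array([2.0, 5.0, 10.0, 15.0])  # hypothetical lensing masses

# Hypothetical non-linear case (beta_HSE < 1): ratio decreases with mass.
ratio_nonlinear = mass_ratio_vs_mass(masses, alpha_hse=-0.1, beta_hse=0.8)
# Linear case (beta_HSE = 1): constant ratio exp(alpha_HSE).
ratio_linear = mass_ratio_vs_mass(masses, alpha_hse=-0.3, beta_hse=1.0)
```

The same expression also shows why the two parameters are hard to separate: over a limited mass range, a change in $\beta_{\mathrm{HSE}}$ can be largely compensated by a change in $\alpha_{\mathrm{HSE}}$.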

Considering an offset
In addition to the HSE-to-lensing mass bias defined in Eq. 7, there could also be an offset between the HSE and lensing mass estimates. Thus, the scaling relation could be defined as

$$M^{\mathrm{lens}} \pm \delta_{\mathrm{lens}} = M^{\mathrm{True}} \pm \sigma_{\mathrm{lens}}, \qquad (11)$$

$$M^{\mathrm{HSE}} \pm \delta_{\mathrm{HSE}} = A_{\mathrm{HSE}} + B_{\mathrm{HSE}}\, M^{\mathrm{True}} \pm \sigma_{\mathrm{HSE}}, \qquad (12)$$

where $A_{\mathrm{HSE}}$ and $B_{\mathrm{HSE}}$ are the offset and the multiplicative factor, respectively. Here $\sigma_{\mathrm{HSE}}$ and $\sigma_{\mathrm{lens}}$ are again the scatters of the HSE and lensing masses with respect to the SR, but in this case in units of $10^{14}\,M_\odot$. We perform again the fit of the SR using both the LIRA and BCES methods and present the results in Fig. 9 and Table 4. As for the non-linear SR fit, $A_{\mathrm{HSE}}$ and $B_{\mathrm{HSE}}$ are completely degenerate. The results obtained with LIRA indicate an offset in mass completely compatible with zero. It is reassuring to verify that the data motivate a scaling relation model for which the HSE mass goes to zero in the limit $M^{\mathrm{True}} \to 0$. We show in Fig. 8 the bias evolution in blue, indicating again that there is no significant trend of the HSE-to-lensing mass ratio with cluster mass.
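Reading the offset model as $M^{\mathrm{HSE}} = A_{\mathrm{HSE}} + B_{\mathrm{HSE}}\,M^{\mathrm{True}}$ with $M^{\mathrm{lens}} = M^{\mathrm{True}}$ and neglecting the scatters, the mass ratio becomes $B_{\mathrm{HSE}} + A_{\mathrm{HSE}}/M^{\mathrm{lens}}$. The sketch below, with hypothetical parameter values, shows that a zero offset yields a mass-independent ratio, which is the situation favoured by the LIRA fit described above.

```python
import numpy as np


def ratio_with_offset(m_lens, a_hse, b_hse):
    """M_HSE / M_lens for M_HSE = A_HSE + B_HSE * M_lens (scatter-free).

    Masses and the offset A_HSE are in units of 1e14 Msun.
    """
    m_lens = np.asarray(m_lens, dtype=float)
    return b_hse + a_hse / m_lens


masses = np.array([2.0, 5.0, 10.0, 15.0])  # hypothetical lensing masses

# Zero offset: the ratio equals B_HSE at all masses.
ratio_no_offset = ratio_with_offset(masses, a_hse=0.0, b_hse=0.75)
# Hypothetical positive offset: the ratio decreases towards B_HSE at high mass.
ratio_offset = ratio_with_offset(masses, a_hse=1.0, b_hse=0.6)
```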

Evolution with redshift
LIRA enables fitting a scaling relation that evolves with redshift. Looking for such an evolution can be particularly interesting with our reference sample, given the large redshift range that it covers (0.05 < z < 1.07). Assuming again that HSE and lensing masses scale linearly with the true mass ($\beta_{\mathrm{lens}} = \beta_{\mathrm{HSE}} = 1$), we write

$$\ln M^{\mathrm{lens}} \pm \delta_{\mathrm{lens}} = \ln M^{\mathrm{True}} \pm \sigma_{\mathrm{lens}}, \qquad (13)$$

and

$$\ln M^{\mathrm{HSE}} \pm \delta_{\mathrm{HSE}} = \alpha_{\mathrm{HSE}} + \ln M^{\mathrm{True}} + \gamma_{\mathrm{HSE}}\, T \pm \sigma_{\mathrm{HSE}}. \qquad (14)$$

We note that $T$ is the time evolution factor, $T = \log\left[(1 + z)/(1 + z_{\mathrm{ref}})\right]$, with $z_{\mathrm{ref}} = 0.01$ the normalisation redshift set by default in LIRA. We take a flat prior for the parameter describing the evolution with redshift: $\gamma_{\mathrm{HSE}} \sim U(-10, 10)$. Similarly, we consider the evolution with redshift for the SR defined in Eqs. 11 and 12. Given the strong impact of the CL J1226.9+3332 galaxy cluster on the fits at high redshift (see Sect. 4), we repeat the analysis excluding it. All the results are summarised in Table 4.
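The effect of the evolution term on the mass ratio can be sketched as follows, reading the evolving relation as $\ln M^{\mathrm{HSE}} = \alpha_{\mathrm{HSE}} + \ln M^{\mathrm{True}} + \gamma_{\mathrm{HSE}} T$ and neglecting the scatters. The logarithm in $T$ is taken here to be base 10, which is an assumption of this example, and the value of $\alpha_{\mathrm{HSE}}$ is hypothetical; $\gamma_{\mathrm{HSE}} = -1.530$ is the median value quoted in this section.

```python
import numpy as np

Z_REF = 0.01  # normalisation redshift quoted in the text


def time_factor(z, z_ref=Z_REF):
    """T = log10((1 + z) / (1 + z_ref)); the base-10 log is an assumption."""
    z = np.asarray(z, dtype=float)
    return np.log10((1.0 + z) / (1.0 + z_ref))


def ratio_vs_redshift(z, alpha_hse, gamma_hse, z_ref=Z_REF):
    """Scatter-free M_HSE / M_lens = exp(alpha_HSE + gamma_HSE * T)."""
    return np.exp(alpha_hse + gamma_hse * time_factor(z, z_ref))


redshifts = np.array([0.05, 0.3, 0.6, 1.07])
# alpha_hse is hypothetical; gamma_hse is the median quoted in the text.
ratios = ratio_vs_redshift(redshifts, alpha_hse=-0.2, gamma_hse=-1.530)
ratio_at_reference = ratio_vs_redshift(Z_REF, alpha_hse=-0.2, gamma_hse=-1.530)
```

With this parametrisation $\alpha_{\mathrm{HSE}}$ is the log-ratio at $z_{\mathrm{ref}}$ rather than a sample-averaged value, which is one reason why $\alpha_{\mathrm{HSE}}$ and $\gamma_{\mathrm{HSE}}$ are correlated in the fit.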
In Fig. 10 we present the redshift evolution of the HSE-to-lensing mass ratio for the analyses performed with the reference sample and accounting for systematic uncertainties in the HSE and lensing masses. We show in grey the results obtained in Sect. 4, neglecting the intrinsic scatter of HSE and lensing masses with respect to the true masses. In blue we present the bias evolution model resulting from the scaling relation fit in this section. Darker regions show the evolution with redshift obtained when excluding CL J1226.9+3332 from the analyses.
There seems to be a tendency for a decreasing HSE-to-lensing mass ratio with redshift ($\gamma_{\mathrm{HSE}} = -1.530^{+1.071}_{-1.085}$), but it is not statistically significant when removing CL J1226.9+3332 from the sample ($\gamma_{\mathrm{HSE}} = -0.896^{+1.154}_{-1.155}$). From the comparison of the grey and blue results we observe directly the impact that accounting for the intrinsic scatters of the SRs has on the bias. Considering the intrinsic scatter reduces the difference between HSE and lensing masses and, therefore, the bias.
We present in Figs. C.3 and C.4 a comparison of the scaling relations and posterior distributions of parameters when accounting for redshift evolution (dashed lines) and not accounting for it (solid lines). The contribution of the redshift evolution factor introduces a change of the order of a few percent (or less) in the intrinsic scatters. Given the correlation of the other parameters with $\gamma_{\mathrm{HSE}}$, the change is ∼30% for $\alpha_{\mathrm{HSE}}$ and of the order of 10% for $A_{\mathrm{HSE}}$ and $B_{\mathrm{HSE}}$. However, the results are compatible with the ones obtained without considering redshift evolution, so there is no strong evidence of redshift evolution in the data.

Comparison of SR models
In this section, we compare the tested SR models to assess which one is preferred by the data. We define the goodness of fit of the scaling relations, $\chi^2$, following Eq. 3 in Lovisari et al. (2020) (Eq. 15), where the sum is done over the $N_{\mathrm{clusters}} = 53$ clusters in the reference sample. In Eq. 15, $\ln M^{\mathrm{HSE}}(\ln M^{\mathrm{lens}}_i, z_i, \vartheta)$ is the function described by Eq. 10 or 14 depending on the SR model, with the parameters $\vartheta = [\alpha_{\mathrm{HSE}}, \beta_{\mathrm{HSE}}, \gamma_{\mathrm{HSE}}]$ defined accordingly. The quantities $\ln M^{\mathrm{HSE}}_i$, $\ln M^{\mathrm{lens}}_i$, $\delta_{\mathrm{HSE},i}$, and $\delta_{\mathrm{lens},i}$ are the logarithms of the HSE and lensing masses of each cluster $i$ and their associated uncertainties, and $z_i$ is the redshift of each cluster. We compare the results obtained always including the systematic scatters in the HSE and lensing mass uncertainties. We take the posterior distributions of the parameters $\alpha_{\mathrm{HSE}}$, $\beta_{\mathrm{HSE}}$, $\gamma_{\mathrm{HSE}}$, $\sigma_{\mathrm{HSE}}$, and $\sigma_{\mathrm{lens}}$. For the scaling relations considering an offset in mass (Eqs. 11 and 12), we replace the logarithmic masses and uncertainties by the linear values in the $\chi^2$ definition in Eq. 15. Similarly, we take $A_{\mathrm{HSE}}$ and $B_{\mathrm{HSE}}$ instead of $\alpha_{\mathrm{HSE}}$ and $\beta_{\mathrm{HSE}}$. The $\chi^2$ distribution for each SR model fit is shown in Fig. C.5.
According to the Akaike information criterion (AIC, Akaike 1974) and the Bayesian information criterion (BIC, Schwarz 1978), the scaling relation of reference and the one considering a deviation from linearity are almost equally probable (see Appendix C.2 for more details).Furthermore, there is statistically no gain in adding a parameter that describes an evolution with redshift.In other words, redshift evolution does not seem to be favoured by the data.
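For reference, the two criteria can be written in their standard forms, AIC = $2k - 2\ln L$ and BIC = $k \ln N - 2\ln L$, with $k$ the number of free parameters and $N$ the number of data points. The sketch below assumes that $\chi^2$ stands in for $-2\ln L$ up to a constant, which may differ from the exact definitions adopted in Appendix C.2, and uses hypothetical $\chi^2$ values for a three-parameter and a four-parameter model fitted to 53 clusters.

```python
import math

N_CLUSTERS = 53  # size of the reference sample


def aic(chi2, n_params):
    """Akaike information criterion, taking chi2 as -2 ln(L)."""
    return chi2 + 2.0 * n_params


def bic(chi2, n_params, n_data):
    """Bayesian information criterion, taking chi2 as -2 ln(L)."""
    return chi2 + n_params * math.log(n_data)


# Hypothetical chi2 values: adding one parameter lowers chi2 only slightly.
chi2_reference, k_reference = 50.0, 3
chi2_extended, k_extended = 49.5, 4

delta_aic = aic(chi2_extended, k_extended) - aic(chi2_reference, k_reference)
delta_bic = (bic(chi2_extended, k_extended, N_CLUSTERS)
             - bic(chi2_reference, k_reference, N_CLUSTERS))
```

In this toy case both differences are positive, meaning that the small improvement in $\chi^2$ does not compensate for the penalty of the extra parameter; the BIC penalty, $\ln 53 \approx 3.97$ per parameter, is stronger than the AIC one.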
In any case, since the intrinsic scatters are free parameters in our LIRA fits, we expect all the models to fit the data points at the expense of increasing the scatters. From the comparison of all the $\sigma_{\mathrm{HSE}}$ and $\sigma_{\mathrm{lens}}$ values (see Tables 3 and 4), there is no statistically significant increase or decrease in the intrinsic scatters when changing the number of free parameters in the SR model.
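In conclusion, our best scaling law between X-ray HSE and lensing masses is given by the scaling relation of reference:

$$\ln M^{\mathrm{HSE}} = \alpha_{\mathrm{HSE}} + \ln M^{\mathrm{True}}, \quad \alpha_{\mathrm{HSE}} = -0.303^{+0.101}_{-0.095}, \qquad (16)$$

which corresponds to a HSE-to-lensing mass bias of

$$(1 - b) = 0.739^{+0.075}_{-0.070}\ (\mathrm{stat.}) \pm 0.226\ (\mathrm{intrinsic\ scatter}), \qquad (17)$$

assuming Gaussian intrinsic scatters for lensing and HSE masses.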
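The numbers reported in the conclusions of this work for the reference relation ($\alpha_{\mathrm{HSE}} = -0.303$, $\sigma_{\mathrm{HSE}} = 0.166$, $\sigma_{\mathrm{lens}} = 0.257$, and an intrinsic-scatter term of ±0.226 on the ratio) can be related with a short calculation. The combination used below, adding the two Gaussian log-scatters in quadrature and scaling by the ratio, is one reading that happens to reproduce the quoted value; it is an assumption of this sketch and not a derivation stated in the text.

```python
import math

# Median values quoted in the conclusions of this work.
ALPHA_HSE = -0.303
SIGMA_HSE = 0.166
SIGMA_LENS = 0.257

# Mass ratio implied by the log-bias.
one_minus_b = math.exp(ALPHA_HSE)

# Assumed combination: independent Gaussian scatters in ln(M) add in
# quadrature for ln(M_HSE / M_lens), then propagate to first order.
sigma_ln_ratio = math.hypot(SIGMA_HSE, SIGMA_LENS)
scatter_on_ratio = one_minus_b * sigma_ln_ratio
```

Under these assumptions the ratio is about 0.739 and the scatter on it about 0.226, matching the quoted figures to three decimal places.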

Caveats
The two main caveats of the analysis presented in this work are the representativeness of the sample used and the inhomogeneity in the estimates of the lensing masses. The former is hardly quantifiable, given that the selection criteria of the reference sample (Sect. 2.1) are mainly a combination of the selection criteria used for the ESZ, LoCuSS, LPSZ, and Bartalucci et al. (2018) clusters. An equivalent study using a clearly defined selection criterion, as for the LPSZ (Mayet et al. 2020), would be of great interest.
Regarding the inhomogeneity of the lensing masses, we have exploited the compilation of mass estimates from different works standardised in the CoMaLit catalogue. We have treated all the CoMaLit masses equally, no matter the work from which the lensing mass has been extracted. But the different quality of the data and/or the methods used in each of the original works make the uncertainties of lensing masses not homogeneous within the CoMaLit sample. By propagating $\sigma_{\mathrm{sys\,lens}}$ we account, to first order, for the overall error of CoMaLit masses with respect to other estimates. A possible improvement would be to measure an independent systematic scatter $\sigma_{\mathrm{sys\,lens}}$ for each of the works used in the CoMaLit sample, but at the expense of much poorer statistics. Instead, we quantify a posteriori the goodness of our best scaling relation (estimated with all the 53 clusters in the reference sample, Eqs. 16 and 17) for the cluster masses extracted from each of the different works within the CoMaLit catalogue. In Fig. 11 we show, for the clusters obtained from each of the lensing works, the corresponding $\chi$ defined from Eq. 15. For those works with several clusters in our reference sample, we give the mean value and the 16th to 84th percentiles over all the clusters used. We observe that only the 'merten+15' (Merten et al. 2015), 'monteiro-oliveira+20' (Monteiro-Oliveira et al. 2020), and 'pedersen&07' (Pedersen & Dahle 2007) cluster masses are at more than 1σ. The cluster from 'monteiro-oliveira+20' at more than 2σ from the scaling relation is Abell1644 (on the top left of all our SRs), which is known for being a cluster in a merger scenario. Thus, we conclude that the scaling relation of reference fits well the large majority of the clusters in the reference sample. There is no hint of an anomalously bad or good fit to any of the subsamples in the CoMaLit catalogue.

Comparison to previous results
Studies similar to the one presented in this paper have previously been carried out in the literature. However, the methods used to estimate the masses and to compute the HSE-to-lensing bias differ significantly from work to work. Thus, comparisons are delicate. In Fig. 12 we present our best bias estimate together with the HSE-to-lensing $M_{500}$ ratios obtained in the works detailed below. We use Roman numerals to refer to each result from the literature.
The different results are also summarised in Table 5.
The HSE-to-lensing mass bias was measured in Smith et al. (2015) with the 50 clusters from the LoCuSS sample (0.15 < z < 0.3). By using resolved HSE mass estimates, they computed the weighted mean HSE-to-lensing bias: 1 − b = 0.95 ± 0.05 (I in Fig. 12). Uncertainties were calculated from the standard deviation of the geometric means of 1000 bootstrap samples. Following Eqs. 1 and 2 in Smith et al. (2015) to calculate the weighted mean, we obtain for our reference sample a mean bias of 1 − b = 0.763 and 0.818 when not including and including, respectively, the systematic error in the uncertainty of each mass estimate. Considering, as in Smith et al. (2015), only the clusters in the redshift range 0.15 < z < 0.3, we obtain 1 − b = 0.769 and 0.720 with and without the systematic scatter. The difference between the bias estimated in Smith et al. (2015) and the results obtained in our work could originate from the larger HSE mass estimates in Smith et al. (2015). Bright blue markers in the left panel in Fig. 1 show that the HSE masses used in Smith et al. (2015) tend to be larger than the homogeneous ones.
Mahdavi et al. (2008) compared the HSE and lensing masses evaluated at the same radius, in particular at the $R_{500}$ measured from the lensing mass profile of each cluster. With a sample of 18 clusters, Mahdavi et al. (2008) concluded that at $R^{\mathrm{lens}}_{500}$ the ratio of masses is $M^{\mathrm{HSE}}/M^{\mathrm{lens}} = 0.78 \pm 0.09$ (II). Extending the analysis, the HSE-to-lensing mass bias obtained in Mahdavi et al. (2013) is consistent with no bias for cool-core clusters, while (1 − b) ∼ 0.8 for non-cool-core clusters (III). Along the same lines, Israel et al. (2014) concluded, from the study of 8 clusters with redshifts 0.35 < z < 0.80, that HSE and lensing masses differ by 0 to 20% (IV).
By using very high redshift clusters (0.933 < z < 1.066), Bartalucci et al. (2018) obtained that HSE masses from X-rays are a factor of $1.39^{+0.51}_{-0.37}$ (V) larger than weak lensing estimates, in contradiction with the rest of the results. The clusters in Bartalucci et al. (2018) are the highest redshift clusters in our reference sample (Sect. 2.1). Using the same HSE masses as in Bartalucci et al. (2018), but with the CoMaLit lensing estimates, we obtain an error-weighted mean ratio of $M^{\mathrm{HSE}}_{500}/M^{\mathrm{lens}}_{500} = 1.56$ (1.58) not including (including) the systematic error in the uncertainty of each mass estimate. Instead, the error-weighted mean ratio for our full reference sample is $M^{\mathrm{HSE}}_{500}/M^{\mathrm{lens}}_{500} = 0.47$ (0.51). For the clusters in the X-COP sample, Eckert et al. (2022) found that HSE masses estimated using XMM-Newton data are 10 to 15% lower than the lensing estimates in Herbonnet et al. (2020) (VI). With a different approach and assuming that the gas fraction in clusters is constant, Eckert et al. (2019) obtained that HSE masses are biased (with respect to the true total mass) by 7% at $R_{500}$. These results differ significantly from the bias values obtained in this paper. HSE masses in Eckert et al. (2019) were reconstructed making use of very well resolved mass profiles; therefore, unless there are unidentified systematic effects, HSE masses in Eckert et al. (2019) should be very reliable. The small bias values obtained for the low-redshift (0.047 < z < 0.09) clusters in Eckert et al. (2019) could then indicate that there is indeed a redshift dependence in the HSE bias and that low-redshift cluster HSE masses are less biased.

Notes to Table 5. We report our reference result and different values from the literature. The last column indicates the distinctive feature of each analysis.
Regarding also the evolution of the bias with redshift, which we have largely discussed in Sect. 4 and 5, the weak tendency for a larger bias at higher redshift seems to be in line with the results from Wicker et al. (2023) and Smith et al. (2015).
Particularly interesting are the comparisons to the works of Sereno et al. (2019), Lovisari et al. (2020), and Sereno & Ettori (2015), where the methods are equivalent to the ones employed in this paper, making use of the LIRA code and accounting for the intrinsic dispersion of HSE and lensing masses around the SR. The analysis in Lovisari et al. (2020) compares the HSE masses obtained with XMM-Newton data (from Lovisari et al. 2020) to lensing estimates in the CoMaLit LC$^2$ catalogue, for 62 clusters from the Planck-ESZ sample with z < 0.5. With this sample, the authors obtain 1 − b = 0.74 ± 0.06 (VII) and no redshift evolution. This is in excellent agreement with our result. In Lovisari et al. (2020) the results found with CoMaLit lensing masses are also compared to those obtained with lensing masses from other works in the literature: the HSE-to-lensing mass ratio spans from ∼0.6 to ∼1 depending on the dataset used.
Conclusions are along the same lines in Sereno & Ettori (2015), where different samples with HSE and lensing mass estimates are used to measure the scaling relation and, consequently, the HSE-to-lensing bias. The effect that intrinsic scatters have on the determination of scaling relations is also studied in Sereno & Ettori (2015). They conclude that not taking into account explicitly the scatter of masses makes scaling relations flatter, as we see when using BCES instead of LIRA (also in agreement with Lovisari et al. 2020). While the intrinsic scatter for lensing masses obtained in Sereno & Ettori (2015) is of the order of the expected values from simulations (∼10−15%), the intrinsic dispersion for HSE masses is larger than expected (∼20−30%). An underestimation of the statistical uncertainties in HSE masses could be the reason, according to Sereno & Ettori (2015), for this large scatter. Accounting for the systematic scatter in the uncertainty of each cluster mass, as described in this paper, could help to have more realistic uncertainties for the HSE mass estimates. The HSE-to-lensing mass ratio in Sereno & Ettori (2015) depends again on the sample and data used and spans from ∼0.5 to ∼1. We have reproduced the same result by separating the sample into redshift ranges.
Sereno et al. (2019) also used the Bayesian hierarchical modelling from Sereno (2016) to fit a scaling relation between HSE masses from XMM-Newton data and weak lensing masses of clusters in the Hyper Suprime-Cam Survey (Pacaud et al. 2016). The median redshift of the 100 clusters in the sample is z = 0.30, spanning from z = 0.054 to z = 1.050. Thus, the analysis in Sereno et al. (2019) is probably the closest study to our work. Nevertheless, to get temperature profiles that reach $R_{500}$ with X-ray data (to then compute the HSE mass), in Sereno et al. (2019) a model was iteratively fitted to the integrated temperature measured per cluster within 300 kpc, well below $R_{500}$. Assuming $\beta_{\mathrm{HSE}} = 1$, $\beta_{\mathrm{lens}} = 1$, and $\alpha_{\mathrm{lens}} = 0$ they obtained $\alpha_{\mathrm{HSE}} = -0.04 \pm 0.08$, $\sigma_{\mathrm{HSE}} = 0.31 \pm 0.05$, and $\sigma_{\mathrm{lens}} = 0.37 \pm 0.06$. According to Sereno et al. (2019), the difference between HSE and lensing masses is b = 0.09 ± 0.17 (VIII). The $\alpha_{\mathrm{HSE}}$ from Sereno et al. (2019) is at 3σ from our result with the full reference sample ($\alpha_{\mathrm{HSE}} = -0.338^{+0.105}_{-0.097}$ without accounting for the systematic uncertainties). Their values for $\sigma_{\mathrm{HSE}}$ and $\sigma_{\mathrm{lens}}$ agree with the intrinsic scatter values that we obtain when we do not account for the systematic uncertainties ($\sigma_{\mathrm{HSE}} = 0.304^{+0.069}_{-0.072}$ and $\sigma_{\mathrm{lens}} = 0.305^{+0.080}_{-0.083}$).

In addition, the behaviour of the HSE-to-lensing mass bias could vary with the overdensity at which masses are measured. By estimating weak lensing masses and HSE masses from X-rays at $R_{200}$, Jee et al. (2011) concluded that for a sample of 14 very massive and distant clusters (0.83 < z < 1.46), the HSE and lensing masses are compatible. However, the HSE masses were obtained from the extrapolation of a singular isothermal sphere profile to reach $R_{200}$, which likely limits the validity of their HSE mass estimates. Similarly, Amodeo et al. (2016) compared $M_{200}$ masses reconstructed from Chandra data (although the radial reach of Chandra is far below $R_{200}$) to their lensing estimates, and concluded that both mass estimates are in agreement. No evolution with redshift was detected in Amodeo et al. (2016). We prefer to avoid extrapolating the mass profiles to reach $R_{200}$.

Summary and conclusions
In this work we have investigated the HSE-to-lensing mass bias with masses inferred at $R_{500}$ from resolved profiles. We carefully selected the clusters and obtained a reference sample of 53 clusters with redshifts spanning from z = 0.05 to 1.07. This is the largest redshift range analysed homogeneously with this type of data, having access to X-ray HSE masses obtained from resolved profiles. HSE masses were estimated with the XMM-Newton mass reconstruction reference pipeline and lensing masses were extracted from the LC$^2$ CoMaLit catalogue.
In order to account for possible systematic effects in the reference analysis, we compared the XMM-Newton and CoMaLit masses to other estimates from the literature. The obtained systematic scatters were propagated to our analyses, but all the main conclusions remain unchanged whether or not these additional systematic dispersions are included in the HSE and lensing mass uncertainties.
We performed different tests in the measurement of the HSE-to-lensing mass scaling relation and bias, varying the redshift range and the scaling relation model. Our main conclusions are the following:
1. Assuming that HSE and lensing masses scale linearly with the true mass and considering $\sigma^{2}_{\rm sys\,HSE}$ and $\sigma^{2}_{\rm sys\,lens}$, we measure for the 53 clusters in the reference sample an HSE-to-lensing mass ratio of $M^{\rm HSE}_{500}/M^{\rm lens}_{500} = (1 - b) = 0.739^{+0.075}_{-0.070}$ (stat.) ± 0.226 (intrin. scatter).
2. We find that the best scaling relation between HSE and lensing masses is our scaling relation of reference, where we assume that there is no evolution with mass and redshift and that HSE and lensing masses scale linearly. We obtain: $\alpha_{\rm HSE} = -0.303^{+0.101}_{-0.095}$, $\sigma_{\rm HSE} = 0.166^{+0.086}_{-0.101}$, and $\sigma_{\rm lens} = 0.257^{+0.080}_{-0.092}$.
3. When we let the SR evolve with redshift, we observe a trend towards a larger discrepancy between HSE and lensing masses at high redshift, but it is not statistically significant. In conclusion, there is no evidence of evolution with redshift. The dependence of the HSE-to-lensing mass bias on the mass of the clusters is not confirmed either.
4. Given the size of the sample, single clusters can drive the fits, and special care needs to be taken for clusters with very small uncertainties. We have investigated the case of the CL J1226.9+3332 galaxy cluster, whose impact is crucial when determining the bias at high redshift.
5. Ignoring the intrinsic scatter of HSE and lensing masses with respect to the true mass of clusters introduces a bias in the measurement of the HSE-to-lensing mass bias.
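The numbers quoted in conclusions 1 and 2 can be cross-checked against each other. The sketch below assumes, for illustration only, that the mass ratio follows from the normalisation as (1 − b) = exp(α_HSE) when α_lens = 0, and that the intrinsic scatter on the ratio is (1 − b) times the quadratic sum of σ_HSE and σ_lens. This mapping is not stated explicitly in this section, but it reproduces the quoted values.

```python
import math

# Median parameters of the reference scaling relation (conclusion 2).
alpha_hse = -0.303
sigma_hse = 0.166
sigma_lens = 0.257

# Assumed mapping: with alpha_lens = 0 and linear scaling, the HSE-to-lensing
# mass ratio is the exponential of the HSE normalisation.
one_minus_b = math.exp(alpha_hse)

# Assumed propagation: log-normal scatters added in quadrature and converted
# to an absolute scatter on the ratio.
intrinsic_scatter = one_minus_b * math.hypot(sigma_hse, sigma_lens)

print(f"(1 - b) = {one_minus_b:.3f}")
print(f"intrinsic scatter = {intrinsic_scatter:.3f}")
```

Under these assumptions the two values round to 0.739 and 0.226, matching conclusion 1.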
Additional considerations are needed to compare the HSE-to-lensing mass bias obtained in this work to the bias needed to reconcile cluster number counts and CMB power spectrum results, for several reasons: 1) the HSE masses used in cluster number count analyses are not direct HSE mass measurements, but masses obtained from an SZ (or X-ray) measurement through an SZ-mass (or X-ray-mass) scaling relation, 2) lensing masses can also be biased with respect to the true mass of clusters (Becker & Kravtsov 2011), and 3) this sample is not representative of the cluster population in any given survey. Instead, this study provides a step forward in our understanding of the deviation from hydrostatic equilibrium of galaxy clusters and of the impact of systematic and intrinsic errors.

Notes. We report the different values depending on the sample selection criteria, showing in bold the systematic scatters considered for the rest of the analysis.

Notes. Column 1: cluster names from the CoMaLit catalogue (entries Comalit_Name and Comalit_Num). Column 2: redshift. Columns 3 to 6: right ascension α and declination δ of the cluster centres according to CoMaLit or X-rays. Columns 7 to 10: cluster masses and uncertainties from the CoMaLit catalogue and from the XMM-Newton analysis.

We report in Table C.1 the results for the different scaling relation models and the ∆AIC and ∆BIC differences with respect to the simplest scaling law amongst the nested models.

Many of the low redshift (z < 0.5) clusters detected by Planck were also observed by XMM-Newton. This is the case for the 62 Planck Early Sunyaev-Zel'dovich (ESZ) clusters (Planck Collaboration VIII. 2011), whose HSE masses were reconstructed with X-ray data in Planck Collaboration XI. (2011). Similarly, based on the Local Cluster Substructure Survey (LoCuSS) sample, Planck Collaboration III. (2013) reconstructed the HSE mass of 19 clusters.

Fig. 1. Relation between HSE (left) and lensing (right) masses from the homogeneous samples in this work (XMM-Newton and CoMaLit) with respect to other estimates from the literature (comparison sample). Each colour indicates a different analysis and several results from the same work are differentiated by using different markers. The black dashed lines show the one-to-one relation. We give the statistical, systematic, and raw variances as defined in the text. All the variances are in units of $(10^{14}\,M_\odot)^2$.
(1 − B) is the bias normalised at the pivot redshift, z*, and β_z describes the evolution with redshift. As in Salvati et al. (2019), we take z* to be the median redshift of the clusters in the analysed sample. In Wicker et al. (2023) the pivot redshift is the mean of the sample. With the homogeneous HSE and lensing masses of the 53 clusters in the reference sample, we perform a Markov chain Monte Carlo (MCMC) analysis to fit the model (Eq. 8) to the data, using the emcee Python package (Foreman-Mackey et al. 2019; Goodman & Weare 2010). We consider uniform priors for the parameters, (1 − B) ∼ U(0, 2) and β_z ∼ U(−8, 8), and assume a Gaussian likelihood, uncorrelated between points.
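The ingredients of this fit (uniform priors and an uncorrelated Gaussian likelihood) can be sketched as a log-posterior function. The block below is a minimal illustration, not the pipeline used in this work: the functional form of `bias_model` is an assumed power law in (1 + z)/(1 + z*) standing in for Eq. 8, the choice of fitting the per-cluster mass ratio is also an assumption, and the data arrays are hypothetical toy values.

```python
import math
import statistics

def bias_model(z, one_minus_B, beta_z, z_pivot):
    """Assumed redshift-dependent bias, a stand-in for Eq. 8."""
    return one_minus_B * ((1.0 + z) / (1.0 + z_pivot)) ** beta_z

def log_prior(theta):
    """Uniform priors: (1 - B) ~ U(0, 2) and beta_z ~ U(-8, 8)."""
    one_minus_B, beta_z = theta
    if 0.0 < one_minus_B < 2.0 and -8.0 < beta_z < 8.0:
        return 0.0
    return -math.inf

def log_likelihood(theta, z, ratio, ratio_err, z_pivot):
    """Gaussian likelihood, uncorrelated between data points."""
    one_minus_B, beta_z = theta
    total = 0.0
    for zi, ri, ei in zip(z, ratio, ratio_err):
        model = bias_model(zi, one_minus_B, beta_z, z_pivot)
        total += -0.5 * ((ri - model) / ei) ** 2
        total += -0.5 * math.log(2.0 * math.pi * ei ** 2)
    return total

def log_posterior(theta, z, ratio, ratio_err, z_pivot):
    """Log-prior plus log-likelihood; -inf outside the prior bounds."""
    lp = log_prior(theta)
    if not math.isfinite(lp):
        return -math.inf
    return lp + log_likelihood(theta, z, ratio, ratio_err, z_pivot)

# Hypothetical toy data (not the clusters of the reference sample).
z = [0.10, 0.25, 0.45, 0.80, 1.00]
ratio = [0.80, 0.76, 0.74, 0.70, 0.66]       # M_HSE / M_lens per cluster
ratio_err = [0.10, 0.10, 0.12, 0.15, 0.15]
z_pivot = statistics.median(z)               # pivot at the median redshift

lp_inside = log_posterior((0.74, 0.0), z, ratio, ratio_err, z_pivot)
lp_outside = log_posterior((2.5, 0.0), z, ratio, ratio_err, z_pivot)
print(lp_inside, lp_outside)
```

A function of this shape is what an ensemble sampler such as `emcee.EnsembleSampler` expects as its log-probability callable, with the data passed through its `args` argument.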

Fig. 2. Main characteristics of the 53 clusters in the reference sample. Histograms in the left panels show the redshift, HSE mass, and lensing mass distributions. We show in purple, magenta, and grey the distributions for ESZ+LoCuSS, LPSZ, and Bartalucci+2018 clusters, respectively. The black dashed lines represent the distributions of the whole sample. In the right panel we show the HSE and lensing masses as a function of redshift for all the clusters.

Fig. 3. HSE-to-lensing mass ratio with respect to the redshift. Markers with error bars show the ratio of each cluster in the reference sample with error bars accounting for the systematic uncertainty. Horizontal solid, dotted, and dash-dotted black lines give respectively the error weighted mean, median, and mean mass ratio for the data points. Shaded areas represent the 16th to 84th percentiles of the bias evolution model obtained by fitting different redshift ranges. Top: the bias evolution model obtained with the 53 clusters in the reference sample. Centre: different colours indicate the models fitted to clusters in different redshift ranges. Bottom: grey, blue, red, and orange shaded areas show respectively the bias evolution model fitted to clusters along all the redshift range, at z < 0.9, at z > 0.2, and at z > 0.5, excluding in all the cases the CL J1226.9+3332 galaxy cluster.

Fig. 4. One-dimensional and two-dimensional posterior distributions of the parameters in the redshift dependent mass bias model, accounting for the $\sigma^{2}_{\rm sys}$ in the error bars. Different colours describe the results for the various samples presented in Table 2. For a good visualisation, we only show in grey, green, and red the results for the whole sample, the z < 0.2, and the z > 0.2 ranges, respectively. Dashed distributions have been obtained excluding the CL J1226.9+3332 galaxy cluster.

Fig. 5. Reference scaling relation ($\beta_{\rm HSE} = 1$) between HSE and lensing masses in the reference sample. Data points with ellipses represent the masses of each cluster and the uncertainties in both axes accounting for the systematic scatter. The pink line corresponds to the SR for the median value of parameters obtained without $\sigma_{\rm sys}$ and the solid grey line with $\sigma_{\rm sys}$. The shaded regions show the 16th and 84th percentiles and the black dashed line gives the one-to-one relation. The corner plots in the right panel are the posterior 1D and 2D distributions of the parameters in the SR, including (grey) or not (pink) systematic scatters.

Fig. 6. Scaling relation between HSE and lensing masses for the reference sample in grey and for different subsamples in colours, all accounting for $\sigma_{\rm sys}$. Here $\beta_{\rm HSE}$ is fixed to 1. As in Fig. 4, we only show the cases for z > 0.2 and z < 0.2. Data points with ellipses represent the masses of each cluster and the uncertainties in both axes accounting for the systematic scatters. The black dashed line shows the equality. The corner plots in the right panel are the posterior 1D and 2D distributions of the parameters in the SR.

Fig. 7. Scaling relation between HSE and lensing masses in the reference sample considering a deviation from linearity. Data points with ellipses represent the masses of each cluster and the uncertainties in both axes accounting for the systematic scatters. The pink line corresponds to the SR for the median value of parameters obtained without $\sigma_{\rm sys}$ and the solid grey line with $\sigma_{\rm sys}$. The black dashed line shows the equality and shaded regions the 16th and 84th percentiles. The corner plots in the right panel are the posterior 1D and 2D distributions of the parameters in the SR, including (grey) or not (pink) the systematic scatters.

Fig. 8. HSE-to-lensing mass ratio with respect to lensing mass. The grey hatched area indicates the 16th to 84th percentiles of the bias without mass dependence, accounting for systematic scatters in the uncertainties of HSE and lensing masses. The green area shows the bias evolution when assuming a deviation from linearity of the HSE and lensing masses. The blue area indicates the bias evolution when considering an offset between HSE and lensing masses. Horizontal solid, dotted, and dash-dotted black lines give respectively the weighted mean, median, and mean mass ratio for the 53 clusters, same as in Fig. 3.

Fig. 9. Scaling relation between HSE and lensing masses in the reference sample considering an offset between both mass estimates. Data points with ellipses represent the masses of each cluster and the uncertainties in both axes accounting for the systematic scatter. The pink line corresponds to the SR obtained without $\sigma_{\rm sys}$ and the solid grey line with $\sigma_{\rm sys}$. The black dashed line shows the equality. The corner plots in the right panel are the posterior 1D and 2D distributions of the parameters in the SR, including (grey) or not (pink) the systematic scatters.

Fig. 10. HSE-to-lensing mass ratio with respect to redshift. The grey shaded area shows the evolution from Fig. 3 for all the clusters in the sample and in darker excluding CL J1226.9+3332. The blue area gives the evolution with redshift obtained from the fit of the scaling relation with the reference sample and the grey hatched area without considering the redshift evolution. The dark blue area is the evolution obtained for the reference sample excluding CL J1226.9+3332. As in Fig. 3, markers with error bars show the ratio per cluster in the reference sample with error bars accounting for the systematic uncertainty. Horizontal solid, dotted, and dash-dotted black lines give respectively the weighted mean, median, and mean mass ratio for the data points.

Fig. 12. HSE-to-lensing mass ratio with respect to redshift. Shaded areas indicate different results from different works in the literature. See text and Table 5 to identify Roman numerals with the works. The horizontal grey hatched area represents the HSE-to-lensing mass ratio measured in this work assuming that HSE and lensing masses scale linearly with the true mass, accounting for the systematic scatter, and considering no evolution with redshift.

Fig. A.1. Comparison of HSE and lensing mass estimates from the homogeneous samples in this work (XMM-Newton and CoMaLit) with respect to other estimates from the literature (comparison sample). Top: Relation between X-ray masses from literature and from the XMM-Newton reference pipeline without accounting for $M^{\rm HSE}(< R^{\rm lens}_{500})$. In the left (right) the clusters with very large uncertainties and with very different XMM-Newton and CoMaLit centres are considered (not considered). Bottom: Same figure as Fig. 1, but not accounting for clusters with very large uncertainties and with very different XMM-Newton and CoMaLit centres. The dashed lines show the 1:1 relation. We give the statistical, systematic, and raw variances in units of $10^{28}\,M_\odot^{2}$ corresponding to the data points in each figure.

Fig. B.1.Correlation between the HSE-to-lensing mass ratio (left) and redshift (right) of the 65 clusters in the XMM-Newton-CoMaLit homogeneous sample with respect to the separation between the centres assumed in the X-ray and lensing analyses.Error bars in the left panel do not account for the systematic scatters.

Fig. C.1. Same as Fig. 6, but without considering the systematic scatter in the fit.

Fig. C.2. Scaling relation between HSE and lensing masses for the reference sample in grey and a subsample containing only z < 0.5 clusters in cyan, without considering the systematic scatter in the fit. The black dashed line shows the equality. The corner plots in the right panel are the posterior 1D and 2D distributions of the parameters in the SR. Here $\beta_{\rm HSE}$ is fixed to 1.

Fig. C.3. Same as Fig. 5 but with dashed lines showing the results if an evolution with redshift is considered in the scaling relation and solid lines without evolution.

Fig. C.4. Same as Fig. 9 but with dashed lines showing the results if an evolution with redshift is considered in the scaling relation and solid lines without evolution.

Table 1. Summary of the number of clusters in each of the comparison samples and their overlap with the homogeneous XMM-Newton and CoMaLit clusters.

Table 2. Best-fit values and uncertainties for the normalisation and redshift evolution parameters of the mass bias model in Eq. 8 obtained for different subsamples of the reference sample.

Table 3. Summary of the median values and uncertainties at 16th and 84th percentiles of the parameters for the HSE-to-lensing SR assuming linearity ($\beta_{\rm HSE} = 1$).

Table 5. HSE-to-lensing mass bias values from resolved mass profiles.

Table A.1. Raw, statistical, and systematic variances of the HSE and lensing mass estimates from the homogeneous samples in this work (XMM-Newton and CoMaLit) with respect to other estimates from the literature (comparison samples).
Table B.2. Characteristics of the reference sample.