Swarming in stellar streams: unveiling the structure of the Jhelum stream with ant colony-inspired computation

The Jhelum stream, a remnant of past tidal interactions with the Milky Way, sets itself apart from other halo streams with its unusual and complex morphology. Although the substructure inherent within Jhelum’s morphology and its proper motion space have been previously indicated, these findings remained disjoint. Using the novel LAAT algorithm, a machine learning methodology reliant on the idea of food retrieval performed by ant colonies, we identify Gaia DR3 members of Jhelum, confirming two overdensities in the stream’s proper motion space. We establish the connection between these overdensities and Jhelum’s narrow and broad spatial components, and demonstrate for the first time their separation in radial velocity. Armed with this new information, we estimate dynamical and


Introduction
The formation history of the Milky Way's stellar halo is, in large part, driven by mergers with globular clusters (GCs) and nearby dwarf galaxies (DGs). The tidal disruption induced by the host galaxy on the orbiting systems causes a loss of mass in the form of elongated distributions of stars, also known as stellar streams (Helmi et al. 1999; Combes et al. 1999; Eyre & Binney 2009). These stellar remnants of past mergers are excellent probes of the acceleration field of the Galaxy over the spatial range of their structure (Ibata et al. 2002; Johnston et al. 2002; Carlberg 2012). The measured acceleration field can then provide constraints on the gravitational force and the distribution of dark matter within the halo. This in turn allows for the assessment of the standard Lambda cold dark matter (ΛCDM) cosmology by comparing the observationally informed constraints with predictions made by the assumed cosmology. Additionally, the elongated thin nature of these streams makes them sensitive to perturbations induced by small-scale gravitational encounters. Stellar streams are therefore also useful tools for studying dark matter subhalos, since gaps in a stream could be the result of an impact with such a subhalo (Bovy 2016; Banik et al. 2018; Bonaca et al. 2019; Montanari & García-Bellido 2022). Consequently, the search for stellar streams has grown over the years and has been successful in finding several such streams in orbit around the Galaxy (Odenkirchen et al. 2001; Newberg et al. 2002; Grillmair & Dionatos 2006; Belokurov et al. 2007; Ibata et al. 2018, 2021). The proliferation of these studies has also been greatly aided by the availability of large photometric surveys such as the Sloan Digital Sky Survey (York et al. 2000, SDSS), the Dark Energy Survey (Abbott et al. 2018, DES), and most notably the data releases from the Gaia mission (Gaia Collaboration et al. 2016, 2018, 2023).
Of particular interest when studying stellar streams is attempting to determine what the properties of the progenitor were. Such information provides great insight into the masses, chemical compositions, and stellar populations of these progenitors which had been embedded in the Galaxy and played a role in its formation. Works such as Bonaca et al. (2021) and Malhan et al. (2022) have attempted to group streams, GCs, and satellite galaxies in action space and trace them back to past mergers with the Milky Way. In addition to these studies based on large populations, every stream has its own past with regard to sub-halo interactions and thus merger histories. In this work, we focus on the stellar stream Jhelum, which was first discovered by Shipp et al. (2018) using data from DES and subsequently confirmed by Malhan et al. (2018) using Gaia's second data release (DR2). The proper motion of Jhelum was further analyzed in Shipp et al. (2019). The most recent studies of Jhelum have classified it to be more likely the remnant of a DG (Ji et al. 2020; Bonaca et al. 2021; Li et al. 2022). Such a classification is not surprising given the wide morphology for which Jhelum is known. In Bonaca et al. (2019), evidence for two spatial components of Jhelum was found. Aided by photometric measurements from DES and proper motion measurements from Gaia DR2, Bonaca et al. (2019) showed that Jhelum is composed of two parallel components: a narrow dense component and a broad diffuse component beneath it. Dynamic environments induced by interactions with sub-halos, the Large Magellanic Cloud (LMC), or with an asymmetric potential of the Milky Way, for example, can also disperse the stars that were originally part of a thin stream and form wider structures as a result (Bonaca et al. 2014; Ngan et al. 2016; Pearson et al. 2017). Woudenberg et al. (2023) have therefore suggested that Jhelum could be a stream perturbed by the orbit of the Sagittarius DG. On the other hand, other explanations are also popular. For example, a GC within a satellite galaxy that has fallen into the Galaxy's potential can create a dynamically cold stream accompanied by a wider component with a lower surface brightness (Carlberg 2020; Qian et al. 2022).
In addition to recognizing them spatially, Bonaca et al. (2019) found no distinguishable difference between the two components in proper motion, while Shipp et al. (2019) independently measured two distinct proper motion components with consistent spatial distributions. In this work, we build on the latter two studies and confirm the substructure within the proper motion space of Jhelum with measurements from Gaia DR3. Specifically, using a novel machine learning tool, the Locally Aligned Ant Technique (Taghribi et al. 2023, LAAT), we distinguished two components in proper motion space that we attribute to the narrow and broad spatial components of the stream. We also found, for the first time, a separation between the two components in the third velocity component, namely using radial velocity measurements from the Southern Stellar Stream Spectroscopic Survey (Li et al. 2019; Li et al. 2022, S^5). With this new knowledge, we provide our estimates of several properties of Jhelum such as the best-fit orbit, the velocity and metallicity dispersions, as well as the width of either component. Additionally, we attempted to constrain the more probable progenitors of the narrow and broad components of Jhelum.
This paper is organized as follows. Section 2 presents our preliminary data selection to locate the Jhelum stream. Section 3 describes the procedure followed using our novel methodology to isolate the stars belonging to the stream from the surrounding field stars. In Section 4, we explain how we fine-tuned our selection of stars and related the two overdensities in the proper motion space that were first noticed in Shipp et al. (2019) to the spatial narrow and broad components of Jhelum. In Section 5 we provide our estimates of the properties of the two components. Section 6 contains our discussion of the star selection criteria and the most likely merger scenario. In Section 7, we summarize our findings and suggest possible future developments.
Figure 1 (caption, partially recovered): Using a PARSEC isochrone at 13 kpc (dark blue), we define the primary CMD selection as all stars lying in between the orange lines. The selection limit line on the right is positioned there to capture as many main sequence stars as possible while containing most stars in common with the S^5 survey.

Data
For the selection of the area on the sky containing Jhelum, we rely on the coordinate system defined in Bonaca et al. (2019) and implemented in gala (Price-Whelan et al. 2017) to transform any set of coordinates into the Jhelum frame (ϕ1, ϕ2), where ϕ1 is the coordinate aligned with the stream track and ϕ2 is the coordinate perpendicular to it. We query the Gaia DR3 catalog (Gaia Collaboration et al. 2023) within the rectangular region defined by −5° < ϕ1 < 30° and −5° < ϕ2 < 5°, and select all stars with parallaxes smaller than 1 mas, which removes nearby foreground stars. Using gala, we then correct the proper motions of the selected stars for the solar reflex motion relying on the procedure of Price-Whelan & Bonaca (2018) assuming a constant distance to the stream of 13 kpc. To narrow down our selection of stars belonging to the Jhelum stream, we follow Bonaca et al. (2019) and retain the stars belonging to the region in proper motion space defined by −8 < µ_ϕ1 / mas yr^−1 < −4 and −2 < µ_ϕ2 / mas yr^−1 < 2, where µ_ϕ1 and µ_ϕ2 are the proper motions projected along ϕ1 and ϕ2 respectively. Finally, we apply a selection in color-magnitude space to further constrain the stars more likely to belong to the stream. First, all magnitudes of the stars are corrected for extinction using the Schlegel et al. (1998) dust maps and assuming a Cardelli et al. (1989) extinction law with R_V = 3.1. The color-magnitude diagram (CMD) of the stars selected so far and corrected for extinction and reddening is shown in gray in Figure 1. To define a selection of stars within the CMD, we first cross-match all remaining stars with the entire Southern Stellar Stream Spectroscopic Survey (S^5) catalog (Li et al. 2019; Li et al. 2022) to locate the stars which have radial velocities measured by the survey. The latter is a spectroscopic survey that makes use of the 3.9 m Anglo-Australian Telescope's positioner and AAOmega spectrograph combined with the photometry of the Dark Energy Survey DR1 (Abbott et al. 2018, DES) and precise proper motions from Gaia DR2 (Gaia Collaboration et al. 2018) to map, in detail, the properties of the stellar streams in the southern hemisphere of the Galaxy. The survey specifically targets stars in the Galactic halo with a focus on stellar streams including Jhelum. Thus, mapping the position of these stars in the CMD allows us to better constrain our color and magnitude selection. The stars in common between our selection and the S^5 catalog are shown in red in Figure 1. We also define a 12 Gyr, [Fe/H] = −1.7 PARSEC isochrone at a distance of 13 kpc which fits the CMD reasonably well following Woudenberg et al. (2023). This isochrone is shown in Figure 1 in dark blue as a primary reference for the location of the red giant branch (RGB), sub-giant, turn-off and main sequence (MS) stars belonging to the stream. Using the CMD of the stars cross-matched with the S^5 catalog and the location of this isochrone, we define a region in the CMD bordered by the orange lines, which contains all stars matching with the S^5 survey and following the trend of the mentioned isochrone. We keep all stars within the defined region and use them as the starting selection for our subsequent analysis. The purpose of choosing such a wide region, as opposed to works such as Bonaca et al. (2019), Sheffield et al. (2021), Woudenberg et al. (2023), or Viswanathan et al. (2023), is to be as inclusive as possible of the stars belonging to Jhelum, and from there begin fine-tuning this selection to detect stars that belong to the stream with high probability. The narrowing-down procedure is explained in Section 3 and further discussed in Section 6.
Table 1 (notes): From left to right, we give: the parameters used for running the Locally Aligned Ant Technique (LAAT), the values used for each parameter, and its definition. Note that we give the advisable range for the neighborhood radius parameter r for this specific work. In the second part of the table we give: the index of the three runs, the value of r used for each run, the number of stars in the input dataset, N_i, the number of stars remaining after the filtration based on the pheromone, N_f, and the time needed for each LAAT run using the unparallelized MATLAB implementation of LAAT on a machine with a 1.8 GHz × 8 processor and 15.3 GiB of RAM.
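The positional and proper-motion cuts of this section amount to a simple box selection. The following is a minimal sketch rather than the pipeline used in this work: the `Star` container and the function names are hypothetical, the inputs are assumed to be already transformed to the Jhelum frame and reflex-corrected, and the upper bound of the µ_ϕ2 window is assumed to be +2 mas yr^−1 and is therefore exposed as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Star:
    phi1: float     # deg, along the stream track
    phi2: float     # deg, perpendicular to the stream track
    pm_phi1: float  # mas/yr, reflex-corrected proper motion along phi1
    pm_phi2: float  # mas/yr, reflex-corrected proper motion along phi2


def in_box(value, low, high):
    """True if value lies strictly inside the open interval (low, high)."""
    return low < value < high


def preselect(stars, pm_phi2_max=2.0):
    """Keep stars inside the sky box and the proper-motion box.

    pm_phi2_max is an assumption of this sketch (symmetric window around 0).
    """
    return [
        s for s in stars
        if in_box(s.phi1, -5.0, 30.0)
        and in_box(s.phi2, -5.0, 5.0)
        and in_box(s.pm_phi1, -8.0, -4.0)
        and in_box(s.pm_phi2, -2.0, pm_phi2_max)
    ]
```

The color-magnitude polygon is applied afterwards and is not part of this sketch.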

Method for stream extraction
We now begin the procedure of fine-tuning the color-magnitude selection to create a high-purity sample of stars belonging to the Jhelum stream. With that purpose, we employ the Locally Aligned Ant Technique (LAAT) first introduced in Taghribi et al. (2023), and then grouped into a toolbox of manifold (structure) extraction and modeling algorithms in Canducci et al. (2022) and Awad et al. (2023). The main purpose of LAAT is to highlight the contrast between low and high-density regions in a given point cloud, as well as the detection of regions that are closely aligned with a defined structure (low-dimensional smooth manifolds) within the spatial distribution of data points (such as stars, gas particles, and simulated dark matter particles).
The algorithm operates based on the idea of Ant Colony Optimization (Dorigo & Stützle 2004, ACO) whereby a number of agents or "ants" are distributed in the data (e.g., in the position space of a given point cloud dataset) and a random walk is initiated in that space. A defined quantity termed the "pheromone" is artificially deposited on the data points visited by the agents and is associated with an "evaporation rate." The latter reduces the quantity of pheromone on the data points over time. Ants, in their random walk, prefer paths with more accumulated pheromone, thus implementing a form of a "positive feedback loop." Points visited more frequently will accumulate more pheromone that would take longer to evaporate, thus attracting even more ants. Such a positive reinforcement mechanism distinguishes the ant colony from a random walk defined by a Markov chain without the pheromone mechanism. There, the visitation frequency of the points would be given by the stationary distribution of the Markov chain. For a more formal treatment see Mohammadi et al. (2022).
One can shape the random walk to concentrate on different forms of spatial structures. In our case, we would like to emphasize the points aligned with low-dimensional manifolds in the data cloud. Hence, during the walk, and given a data point on which an agent is currently located, principal component analysis (PCA) is performed within the neighborhood with a chosen radius r and centered at the location of the point, to distinguish the main directions along which the data points are distributed in that neighborhood. The agent is then incentivized to jump toward points within the neighborhood that 1) align with the dominant direction of distribution of the data points, and 2) have accumulated larger amounts of pheromone as the walk is allowed to continue. Since data points belonging to a structure in the data show more directional alignment and have a larger local density than a random distribution of points, the agents have a higher probability of visiting and depositing pheromone on structures embedded within the data. As the walk is allowed to progress over multiple iterations, the pheromone will accumulate on data points belonging to the embedded structures, and evaporate in sparser, less directionally aligned regions. Finally, a threshold can be enforced on the pheromone quantity to extract the detected structures from the data. For a detailed mathematical description of the algorithm, see Taghribi et al. (2023).
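To make the two incentives concrete (alignment with the locally dominant direction and accumulated pheromone), the following toy sketch computes jump probabilities for a single agent in two dimensions. It is not the LAAT implementation of Taghribi et al. (2023): the linear mixing weight `kappa`, the normalization, the evaporation rule, and all function names are assumptions chosen for illustration only.

```python
import math


def principal_direction(points):
    """Unit vector along the major axis of the 2D covariance of `points`."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form orientation of the leading eigenvector of a 2x2 covariance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def alignment(current, neighbor, direction):
    """|cos| of the angle between the jump vector and the local direction."""
    dx, dy = neighbor[0] - current[0], neighbor[1] - current[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return 0.0
    return abs(dx * direction[0] + dy * direction[1]) / norm


def jump_probabilities(current, neighbors, pheromone, kappa=0.5):
    """Mix normalized alignment and normalized pheromone (kappa is hypothetical)."""
    direction = principal_direction(neighbors + [current])
    align = [alignment(current, q, direction) for q in neighbors]
    a_sum = sum(align) or 1.0
    f_sum = sum(pheromone) or 1.0
    scores = [(1.0 - kappa) * a / a_sum + kappa * f / f_sum
              for a, f in zip(align, pheromone)]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(neighbors)] * len(neighbors)
    return [s / total for s in scores]


def evaporate(pheromone, rate=0.05):
    """Reduce every pheromone value by a fixed fraction per iteration."""
    return [(1.0 - rate) * f for f in pheromone]
```

In this toy setting a neighbor lying along the locally dominant direction receives a higher jump probability than one lying across it, even when both carry the same pheromone.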
In the setting we have so far, the structure we would like to extract from the data is the Jhelum stream and the stars most likely to belong to it, using information of the stars' positions, proper motions and photometry. Since dynamically cold streams tend to occupy their respective proper motion spaces as local over-densities, the proper motion information of our selection of stars can thus amplify the contrast between the stars belonging and not belonging to the stream. We therefore run LAAT on the four-dimensional space (ϕ1, ϕ2, µ_ϕ1, µ_ϕ2) composed of the spatial and proper motion components of our selection of stars. We list our parameter choices used as input for LAAT and their respective definition in Table 1, and discuss them in Section 6.
Figure 2 (caption, partially recovered): The upper row of the panels (a-c) shows the distribution of pheromone at the end of the run, while the lower row shows the remaining stars after enforcing a threshold on the pheromone quantity. Darker colors mark stars that accumulated higher quantities of pheromone, indicating regions of local alignment and density. In this way, a CMD cut is first performed, followed by finding a distribution of pheromone using LAAT and retaining stars that have accumulated a pheromone quantity that exceeds the given threshold. The CMD of the remaining stars is replotted and refined and the procedure is repeated until the stream is isolated from the majority of field stars.
The procedure followed to separate the stream members from contaminating nonmember stars is the following: after defining the CMD selection in Figure 1, we apply LAAT to highlight the stars of interest using the pheromone deposited, and apply a cut-off threshold on the pheromone to remove as many field stars as possible.The remaining stars after applying this procedure are then replotted on a CMD and the polygon selection is refined so that it follows the distribution of stars more closely.The procedure is then applied iteratively in this manner, until any further selection based on the pheromone quantity would have a high chance of removing stars belonging to the stream.By getting as many reliable member stars as possible, we are able to constrain the radial velocities of each component as done in the following sections.In this way, we pinpoint the location of stream member stars in the CMD, and extract the stream from within the data.
In total, we applied this procedure three times to extract Jhelum from its background. The CMD refinement is shown in the first column of Figure 2, where panels (a) to (c) correspond to the three LAAT runs respectively. The result of applying LAAT on the selection of stars described in Section 2 is displayed in the upper row of Figure 2a where regions accumulating a larger amount of pheromone are shown in darker colors. We observe that the track of Jhelum has been prominently highlighted by the pheromone, unlike regions farther away from the stream. We can also see that toward increasing ϕ1, the pheromone quantity seems to increase on the stars of the field. This can be explained by the fact that the region on the right is closer to the Galaxy's disk and is therefore more densely populated with stars; thus, it will accumulate more pheromone. This is also seen on the right edge of the proper motion space (top-right plot of Figure 2a). One can see from this result that applying LAAT once and enforcing a high pheromone cut-off, instead of following an iterative procedure, will keep many contaminant stars that reside in regions of high density. When inspecting the pheromone distribution of star members in the proper motion space, we also see the emergence of two overdensities as first noted by Shipp et al. (2019). Our study is focused on the new information about Jhelum derived from these two overdensities to understand their physical meaning, and will be presented in detail in the following sections. For now, however, we continue with refining our selection of stars.
Once we have our pheromone distribution, we apply a threshold whereby we retain the stars that have accumulated at least 30% of the maximum amount of pheromone deposited in the run. This value is chosen conservatively so as not to remove any stars that could be part of Jhelum while still removing as many contaminants as possible. The result of this step is shown in the bottom row of Figure 2a where the remaining stars are plotted in both position and proper motion space. The CMD of the remaining stars is then replotted (left panel of Figure 2b), and a finer selection in that space is applied such that the area within the orange outlines of Figure 1 is shrunk. The selection is chosen so as to follow the distribution of the remaining stars in color and magnitude more closely. We then apply LAAT a second time on the refined selection, and show the results in the upper row of Figure 2b. The application of the threshold where we keep the stars that accumulated at least 40% of the maximum amount of pheromone in the run is shown in the bottom row of the same figure. We can see that the contamination from the region closest to the Galaxy's disk (right sides of the position and proper motion spaces) has been reduced while safeguarding the main structure of the stream. We also note the persistence of the two overdensities in proper motion space. We repeat this procedure a last time, where we replot the CMD of the stars remaining after the second threshold cut, fine-tune the selection region by following the distribution of the remaining stars in color and magnitude, and apply LAAT on the refined selection. The result of the third LAAT run is shown in Figure 2c in a similar manner as for the other runs. The bottom row of the figure shows the stars that have survived a 50% cut on the pheromone quantity. We observe that the contamination from field stars not likely to belong to Jhelum has been greatly reduced in this iterative application, and the track that the stream follows is more defined. From the distribution of stars in position space, we see that the pheromone accumulates more on the narrow component of Jhelum than on the broad component that extends approximately between 0 … (see Bonaca et al. (2019) for a detailed explanation of these two components). This is expected as the narrow component is more linear and has greater directional alignment than the broad component which appears more diffuse (Bonaca et al. 2019). In Table 1, we provide the properties of the three runs including the neighborhood radius r that produced the clearest density contrast, the input number of stars N_i, the number of stars N_f remaining after applying a threshold cut, as well as the time needed to perform each run.
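The iterative procedure of this section (refine the CMD selection, run LAAT, cut on a fraction of the maximum pheromone) can be outlined as below. This is only a schematic sketch: `run_laat` and `refine_cmd` are hypothetical callables standing in for the LAAT run and the manual CMD polygon refinement, neither of which is reproduced here. The default fractions are the 30%, 40% and 50% thresholds quoted in the text.

```python
def pheromone_cut(stars, pheromone, fraction):
    """Keep stars whose pheromone is at least `fraction` of the run maximum."""
    cutoff = fraction * max(pheromone)
    return [s for s, f in zip(stars, pheromone) if f >= cutoff]


def iterative_extraction(stars, run_laat, refine_cmd, fractions=(0.3, 0.4, 0.5)):
    """Alternate CMD refinement, a LAAT run, and a pheromone threshold cut.

    run_laat(stars)   -> list of pheromone values, one per star (placeholder)
    refine_cmd(stars) -> stars inside the refined CMD polygon (placeholder)
    """
    for fraction in fractions:
        stars = refine_cmd(stars)
        pheromone = run_laat(stars)
        stars = pheromone_cut(stars, pheromone, fraction)
    return stars
```

Because each cut is relative to the maximum of its own run, the same fraction removes a different absolute amount of pheromone in each iteration, which is why the thresholds can be raised gradually.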

Fine-tuning and results
Some contamination from field stars could still be present, consisting of a few remaining halo stars and stars residing closer to the Galactic disk. The latter persist in the sample due to their high local density, which leads to them accumulating a large amount of pheromone. A source of contamination could be the fact that our selection is purely based on photometry, position and proper motion, as we do not know the distance or the line-of-sight velocity of the majority of these stars. One can eliminate the contaminant stars by increasing the threshold on the pheromone quantity. At this stage, however, using a large threshold introduces a high risk of eliminating stars that are part of the stream.
To avoid this risk, and to make the sample as pure as possible, we use a Gaussian mixture model (GMM) to model the two overdensities in proper motion space as two Gaussian distributions. We use the proper motions of the stars in the bottom right panel of Figure 2c to fit the two Gaussians. We enforce a lower bound of log-likelihood > −1.75 and keep all the stars that survive this final selection criterion, shown here as the stars within the red contour in Figure 3. This lower bound is chosen so as to outline the stars closer to the two overdensities in proper motion space. The proposed modeling methodology is aimed at finding the relation between the found proper motion overdensities and their corresponding positional information. If a model of the positional information is also required, one could use the existing methodologies of 1-DREAM, presented in Canducci et al. (2022). In the joint proper motion and positional space, however, the two overdensities may form low-dimensional manifolds of dimension larger than one. In this case, a full model for each manifold is achievable via the methodology proposed in Canducci et al. (2022).
Using the constructed GMM, we can calculate the posterior distributions over the two Gaussian components for each selected star. We can therefore separate the stars belonging to either overdensity accordingly. In the right panel of Figure 4, we show the proper motion space of our selection of stars colored by the probability density to belong to either overdensity, in this case, to the one on the bottom right. The left panel of the same figure shows where the clustered stars lie in position space. From Figure 4, we can see that the overdensities correspond to the narrow and broad components respectively.
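For a two-component mixture in proper motion space, the posterior membership of a star follows from Bayes' rule applied to the weighted Gaussian densities. The sketch below assumes the mixture weights, means and covariances have already been fitted; the function names and any parameter values passed to them are illustrative and are not the fitted values of this work.

```python
import math


def gaussian_pdf_2d(x, mean, cov):
    """Density of a 2D Gaussian with full covariance [[a, b], [c, d]]."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def mixture_density(x, weights, means, covs):
    """Weighted sum of the component densities at x."""
    return sum(w * gaussian_pdf_2d(x, m, c)
               for w, m, c in zip(weights, means, covs))


def posteriors(x, weights, means, covs):
    """Posterior probability of each component given the proper motion x."""
    joint = [w * gaussian_pdf_2d(x, m, c)
             for w, m, c in zip(weights, means, covs)]
    total = sum(joint)
    if total == 0.0:  # far from both components: no information
        return [1.0 / len(joint)] * len(joint)
    return [j / total for j in joint]
```

A star would then be kept if `math.log(mixture_density(...))` exceeds the chosen lower bound, and assigned to the component whose posterior is largest.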
We also comment on the density variations seen in our selection for the Jhelum stream components. Woudenberg et al. (2023) and Viswanathan et al. (2023) have reported the presence of a tertiary component of Jhelum located on the top left side of the narrow component and parallel to it. This component is also seen in our Figure 2, especially in the middle plots of panels (a) and (b). As this component is much fainter than the narrow and broad components, the pheromone quantity it accumulates is smaller than the threshold cuts we applied, and it is gradually filtered out with the consecutive runs of LAAT. We also recover the kink at ϕ1 = 15°, similar to what is seen in Viswanathan et al. (2023). Finally, we see that the narrow and broad components are not separate parallel structures, as suggested in Bonaca et al. (2019), but are in fact overlapping in their spatial distributions.
For the subsequent analysis of either component, we separate the stars within each overdensity to be able to clearly define the properties of each stream component separately. We therefore keep the stars that have at least a 50% posterior probability to belong to either component. In total, we thus have 167 stars that we classify as belonging to the narrow component, and 279 stars that belong to the broad component of Jhelum. For the remainder of this work, we represent the stars corresponding to the narrow component in red and those corresponding to the broad component in blue. We also note that the narrow component seems unevenly sampled, or seems much more sparse in the regions that overlap with the broad component. This, however, is an artifact produced by LAAT and not an intrinsic property of the narrow component. We discuss this result more thoroughly in Section 6.
We perform a final check to see if the separation between the two components also exists in the third velocity component, namely the radial velocity. For that, we use the high-precision radial velocity measurements from the S^5 survey. We thus plot the radial velocity as a function of ϕ1 for the stars for which we find radial velocity measurements (30 and 65 stars for the narrow and broad components respectively). In particular, we focus on the stars which show a trend in radial velocity, outlined by the dashed gray lines in Figure 5 and reproduced in the zoom-in plot within the figure. We observe, for the first time, an offset between the narrow and broad components in radial velocity. The trend of the narrow component is composed of 22 stars while that of the broad component is composed of 54 stars. We provide the list of this star selection and its properties in Appendix C. We also see a wide radial velocity dispersion in the broad component as opposed to a narrower one for the narrow component. This has not been observed before and confirms the separation of the narrow and broad components in all three velocity components. The rest of this work focuses on extracting information from this finding and discussing the possible progenitors and formation scenarios that could have formed the Jhelum stream.

Narrow and broad component properties
In this section, we explore the properties of the narrow and broad components of Jhelum, separated using the procedure explained in Sections 3 and 4. In particular, we attempt to fit an orbit that models the dynamics of the components in Section 5.1, and examine the velocity dispersion, the width, and metallicity dispersions of both components in Sections 5.2 to 5.4. The size of the dispersions provides pieces of evidence toward the type of progenitor that has formed the Jhelum stream and/or its subsequent evolution.

Best-fit orbit
We now determine the orbits which follow the track of both the narrow and broad components of Jhelum when integrated in an axi-symmetric Milky Way potential.Through fitting the orbits we also estimate the dynamical properties of the components including their widths and velocity dispersions.We follow the procedure thoroughly detailed in Woudenberg et al. (2023) which is recounted here.To set up the Milky Way gravitational  2020) and create a composite model consisting of a bulge, disk, and dark matter halo.Similar to Woudenberg et al. (2023), the bulge potential is modeled as a Hernquist sphere (Hernquist 1990) with a mass of 4 × 10 9 M ⊙ and scale length of c b = 1 kpc.We model the disk as a Miyamoto-Nagai potential (Miyamoto & Nagai 1975) with a mass of 5.5 × 10 10 M ⊙ , scale length a d = 3 kpc, and scale height b d = 0.28 kpc.Finally, the dark matter halo is modeled as a generalized Navarro-Frenk-White (NFW) potential (Navarro et al. 1996) with a mass of 0.7 × 10 12 M ⊙ , scale radius r s = 15.62 kpc, and a minor-to-major axis ratio q z = 0.95.
Any orbit integration is performed using the package AGAMA (Vasiliev 2019) with the above defined potential. The integration of the stars' orbits is performed in Galactocentric coordinates. We are fitting a single orbit to each component although the components have a given width or dispersion, as the stars belonging to the components do not all follow the exact same orbit. The single orbit fit, however, is a good approximation of the best fit (see Appendix A in Woudenberg et al. 2023) and will therefore be used here. With this information, we use the Markov chain Monte Carlo (MCMC) method to find the model parameters that best fit the data. The orbit model parameters consist of the declination δ, distance to the stream D, the proper motion components µ_α and µ_δ, as well as the radial velocity v_rad. The right ascension α on the other hand is kept fixed throughout the run to avoid degenerate solutions to the best-fit orbit. We measure the fitness of an orbit in following the track of the set of stars by defining the log-likelihood function ln(L):

$\ln L = -\frac{1}{2} \sum_{i} \sum_{j \in \chi} \left[ \frac{\left(x_{ij}^{d} - x_{ij}^{m}\right)^{2}}{\sigma_{ij}^{2}} + \ln\left(2\pi\sigma_{ij}^{2}\right) \right]$    (1)

The index j ∈ χ where χ = {δ, µ_α, µ_δ, v_rad}, and so x_ij refers to the j-th quantity of the i-th star in the data. The superscript m refers to the modeled orbit evaluated on the location of the data points while the superscript d refers to the data measurements. Moreover, σ denotes the errors on the four quantities we attempt to fit. The errors consist of both the measurement errors σ_meas and the intrinsic dispersion of these quantities along the stream σ_int ∈ {σ_w, σ_µα, σ_µδ, σ_v}, where σ_w is the component width, σ_µα and σ_µδ are the transverse velocity dispersions and σ_v is the radial velocity dispersion. The error in Eq. (1) is therefore calculated as the sum in quadrature of these two uncertainties: $\sigma_{ij} = \sqrt{\sigma_{\rm meas}^{2} + \sigma_{\rm int}^{2}}$. σ_ij therefore refers to the error on the j-th quantity for the i-th data point. The intrinsic dispersions are kept as free parameters that we fit along with the orbit modeling parameters.
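A Gaussian log-likelihood of the kind described here, with measurement errors and intrinsic dispersions added in quadrature, can be sketched as follows. The normalization term ln(2πσ²) is an assumption of this sketch (it is what allows the intrinsic dispersions to be fitted as free parameters), and the function name and argument layout are hypothetical. The model values are assumed to have been evaluated from the integrated orbit at the location of each star.

```python
import math


def log_likelihood(data, model, sigma_meas, sigma_int):
    """Gaussian log-likelihood summed over stars i and fitted quantities j.

    data, model, sigma_meas : lists with one entry per star; each entry is a
        list with one value per fitted quantity (e.g. dec, pm_ra, pm_dec, v_rad)
    sigma_int : one intrinsic dispersion per fitted quantity
    """
    total = 0.0
    for x_d, x_m, s_meas in zip(data, model, sigma_meas):
        for j, (d, m, s) in enumerate(zip(x_d, x_m, s_meas)):
            # Measurement error and intrinsic dispersion add in quadrature.
            var = s ** 2 + sigma_int[j] ** 2
            total += (d - m) ** 2 / var + math.log(2.0 * math.pi * var)
    return -0.5 * total
```

In an MCMC run, a function of this form would be wrapped in a posterior that first integrates the orbit for the proposed parameters and adds the log-prior.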
An initial guess for the model parameters is taken as the measured values of a randomly chosen star belonging to the corresponding component, and the right ascension α is fixed to the value corresponding to the chosen star.For the parameters we are fitting, we set a flat prior defined by the following: In this way, the MCMC algorithm is run using the emcee package with 80 walkers and 1000 steps to ensure convergence.
The corner plot of the posterior distributions of all modeled parameters is shown in Figure 6 (red and blue corresponding to the narrow and broad components respectively). The 50th percentile (median) values for each fitted quantity are also shown in the figure, with the errors given by the 16th and 84th percentiles (values indicated with a tilde correspond to the narrow component). The median values are taken to be the parameters that produce the best-fitting orbits for the stream's components. Regarding the narrow component, we see a strong degeneracy between the distance parameter D and µ_δ and weaker degeneracies between combinations of the other quantities. This degeneracy is weaker for the broad component, which also shows wider posterior distributions. One can explain this given the fact that the broad component is shorter and more diffuse than the narrow component. Therefore, a wider range of parameters produces orbits that fit the distribution of stars well, that is, the broad component data is less constraining. In terms of the distance to the stream, we retrieve a median value of $12.40^{+0.10}_{-0.10}$ kpc for the narrow component and $10.95^{+0.21}_{-0.20}$ kpc for the broad component. These distance measurements agree with recent estimates from works such as Li et al. (2022), Woudenberg et al. (2023) and Viswanathan et al. (2023), but are non-overlapping given their respective error margins. The difference between the distances to the two components is ≈ 1.4 kpc. We discuss this result further in Section 6. In Figure 7, we provide the best-fit orbit of the narrow and broad components plotted using solid and dashed black lines respectively. The stars belonging to either component are shown in their respective colors. We see that the orbits fit the distribution of most of the stars of either component well.

Velocity dispersion
Through this orbit fitting, we also obtain estimates of the velocity dispersions and widths of Jhelum's narrow and broad components. Given $v_{\rm rad}$ as the radial velocity measurements from $S^5$ and $v^{m}_{\rm rad}$ as the radial velocity obtained from the best-fit orbit, the radial velocity dispersion can then be visualized as the width of the distribution of $v_{\rm rad} - v^{m}_{\rm rad}$, as shown in the top panel of Figure 8. From the MCMC fit, the radial velocity dispersions are found to be $4.84^{+1.23}_{-0.79}$ km/s and $19.49^{+2.19}_{-1.84}$ km/s for the narrow and broad components, respectively. As for the dispersions in $\mu_\alpha$ and $\mu_\delta$, we obtain $0.03^{+0.03}_{-0.02}$ mas/yr ($1.75^{+1.78}_{-1.17}$ km/s) and $0.04^{+0.04}_{-0.03}$ mas/yr ($2.33^{+2.37}_{-1.75}$ km/s), respectively, for the narrow component, and $0.13^{+0.02}_{-0.02}$ mas/yr ($6.69^{+1.18}_{-1.13}$ km/s) and $0.21^{+0.03}_{-0.03}$ mas/yr ($10.81^{+1.78}_{-1.71}$ km/s), respectively, for the broad component. The conversion of units has been performed assuming the best-fit distance obtained for each component. The radial velocity dispersion will be referred to as the velocity dispersion hereafter, and the implications of these measurements are discussed in Section 6.
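The unit conversion from mas/yr to km/s can be sketched with the standard relation $v = 4.74047\,\mu\,d$, with $\mu$ in mas/yr and $d$ in kpc. The function name `pm_to_kms` is hypothetical, and the inputs are the rounded medians quoted above, so small differences from the quoted km/s values are to be expected.

```python
# 1 mas/yr at a distance of 1 kpc corresponds to about 4.74047 km/s.
K_PM = 4.74047


def pm_to_kms(mu_mas_yr, distance_kpc):
    """Convert a proper motion (or its dispersion) in mas/yr to km/s."""
    return K_PM * mu_mas_yr * distance_kpc


# Rounded medians from the text: dispersion in mu_alpha and best-fit distance.
narrow_kms = pm_to_kms(0.03, 12.40)  # narrow component
broad_kms = pm_to_kms(0.13, 10.95)   # broad component
```

With these rounded inputs the sketch gives roughly 1.76 km/s and 6.75 km/s, close to the quoted 1.75 km/s and 6.69 km/s.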

Stream width
The width of a stream can also provide information on its progenitor. Wider streams tend to be a result of a DG falling into the Milky Way potential, while GC accretion tends to produce narrower streams. Since we have a distinct sample of high-confidence members for the narrow and broad components of Jhelum, we estimate the width of each component separately rather than calculating one width for the entire stream. The component width $\sigma_w$ is calculated as described in the previous subsection when fitting for the orbit of either component of the stream. With this procedure, we obtain $\sigma_w \sim 0.13^\circ$ for the narrow component of Jhelum and $\sigma_w \sim 0.44^\circ$ for the broad component. Assuming the posterior best-fit distances shown in Figure 6 ($12.40^{+0.10}_{-0.10}$ kpc and $10.95^{+0.21}_{-0.20}$ kpc), we find the linear widths of the components. The evaluated widths are then $\sigma_w = 28.13^{+8.90}_{-6.64}$ pc for the narrow component and $\sigma_w = 84.09^{+11.26}_{-7.17}$ pc for the broad component. We provide a comparison between these estimates and those calculated by other works such as Bonaca et al. (2019) and Shipp et al. (2019) in Section 6.
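The conversion from angular to linear width follows from the small-angle relation (width in pc equals the angle in radians times the distance in pc). The sketch below uses the rounded values quoted above; the function name `angular_to_linear_pc` is hypothetical.

```python
import math


def angular_to_linear_pc(width_deg, distance_kpc):
    """Small-angle conversion: width [pc] = angle [rad] * distance [pc]."""
    return math.radians(width_deg) * distance_kpc * 1000.0


narrow_pc = angular_to_linear_pc(0.13, 12.40)  # about 28.1 pc
broad_pc = angular_to_linear_pc(0.44, 10.95)   # about 84.1 pc
```

These values reproduce the quoted central widths of 28.13 pc and 84.09 pc to within rounding.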

Metallicity dispersion
To evaluate the mean metallicity, [Fe/H], and the metallicity dispersion, $\sigma_{\rm [Fe/H]}$, of either component of the stream, we utilize the metallicities provided by the $S^5$ survey, calculated using the Calcium Triplet (CaT) regions. The CaT metallicities have been derived for red giant branch (RGB) stars of the stream using the equivalent widths (EW) of the CaT lines and the EW-to-metallicity calibration from Carrera et al. (2013). From these measurements, the $S^5$ Collaboration has provided us with the high-quality member stars found in Li et al. (2022), which we then use. These measurements were reported as the more trusted estimates in Li et al. (2022) and have been used as the basis of the discussion around the chemical properties of the dozen streams studied within the same work. Within our selection of stars that belong to Jhelum, 15 stars have their metallicities measured by the survey. This sample is much smaller than the radial velocity sample, and the measurements are mostly for stars of the broad component. Of these measurements, ten are for stars belonging to the broad component and five are for stars in the narrow one. We display the distribution of metallicities for each component in the lower panel of Figure 8.
We then run an MCMC algorithm to model [Fe/H] and $\sigma_{\rm [Fe/H]}$ of either component by fitting a Gaussian function to the distribution of metallicities of each component. Similar to Section 5.1, the total width $\sigma$ of the metallicity distribution of a given component is given by the sum in quadrature of the measurement errors and the distribution's intrinsic width, which we are attempting to fit. Therefore, we have $\sigma = \sqrt{\sigma_{\rm meas}^2 + \sigma_{\rm int}^2}$. The initial guess for the MCMC algorithm is taken as the mean and standard deviation of the selection of metallicity measurements we have for either component. The fit is then performed by optimizing the following log-likelihood function:

We run the MCMC algorithm with 80 walkers and 1000 steps and then extract the best-fit parameters as the 50th percentile values of the resulting distribution. The posterior distributions for the Gaussian fits are shown in Figure 9, where we also display the best-fit modeled mean metallicities and metallicity dispersions. We denote $\sigma_{\rm int}$ for the broad component as $\sigma_{\rm [Fe/H]}$ and that for the narrow component as $\tilde{\sigma}_{\rm [Fe/H]}$. For the narrow component, we obtain a mean [Fe/H] $= -1.87^{+0.12}_{-0.11}$ and a metallicity dispersion of $\tilde{\sigma}_{\rm [Fe/H]} = 0.15^{+0.18}_{-0.10}$, while for the broad component we obtain [Fe/H] $= -1.77^{+0.13}_{-0.13}$ and $\sigma_{\rm [Fe/H]} = 0.34^{+0.13}_{-0.09}$. Note that for the narrow component, the posterior distribution intersects with $\sigma_{\rm int} = 0$, and given the small number of stars with metallicity measurements for this component, the metallicity dispersion represents an upper bound on its actual dispersion. The comparison of the calculated values with other estimates in the literature is performed in Section 6.
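A Gaussian log-likelihood of the kind described above, with a per-star width given by the sum in quadrature of the measurement error and the intrinsic dispersion, can be sketched as follows. This is an illustrative form under stated assumptions, not necessarily the exact expression used in this work: the function name `log_likelihood`, the non-negativity condition on the intrinsic dispersion, and the sample values are all hypothetical.

```python
import math


def log_likelihood(mean, sigma_int, feh, feh_err):
    """Gaussian log-likelihood with per-star variance err^2 + sigma_int^2."""
    if sigma_int < 0:
        return -math.inf  # assumed condition: intrinsic dispersion is non-negative
    total = 0.0
    for x, err in zip(feh, feh_err):
        var = err ** 2 + sigma_int ** 2
        if var <= 0:
            return -math.inf  # degenerate case: zero total width
        total += -0.5 * ((x - mean) ** 2 / var + math.log(2.0 * math.pi * var))
    return total


# Hypothetical metallicities and errors (placeholder values, not S5 data).
feh = [-1.9, -1.7, -2.0, -1.8, -1.6]
feh_err = [0.2] * len(feh)
near = log_likelihood(-1.8, 0.1, feh, feh_err)
far = log_likelihood(-1.0, 0.1, feh, feh_err)
```

A sampler such as emcee would evaluate a function of this form (plus a log-prior) for each proposed pair of mean and intrinsic dispersion; here `near` exceeds `far` because the first trial mean matches the sample mean.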

Discussion
In this section, we discuss the procedure and results explained throughout this work. We review the robustness of the selection criteria of the stars belonging to Jhelum and the dependence of our results on the methodology used. We also discuss the different formation scenarios of the Jhelum stream based on the results obtained in Section 5.

Evaluation of selection procedure
The selection of stars depending on their position in the CMD is a standard step followed to isolate those that are members of a given stellar population. In Section 2 we defined the initial polygon selection to be as wide as possible to ensure that all stars belonging to Jhelum fall within this region, even though this also includes many of the surrounding field stars. Given that LAAT, in its current version, uses one value of the pheromone threshold to filter out stars within an entire run (as opposed to using a threshold that depends on the location within the distribution of stars), it is possible to miss some stars that are true members of the stream. Therefore, since several filtering procedures follow this initial selection (see Figure 2), it becomes necessary to be as inclusive as possible in the first step of the CMD selection. This becomes important when considering that far fewer stars occupy the red giant branch (RGB) than the main sequence part of the CMD. Thus, missing some stars that belong to this branch limits the subsequent metallicity dispersion analysis, which depends on the high-quality measurements of these stars. This risk is also mitigated by the iterative approach we follow, whereby the CMD selection is repeatedly narrowed down, guided by the distribution of the remaining stars after each run of LAAT.

The parameters chosen for running LAAT are listed in Table 1, and here we explain the intuition behind choosing the values for these parameters. The large number of agents and number of steps have been chosen to make sure that each star has been visited multiple times during the run. This allows for the convergence of the algorithm toward a result that does not change between different initializations of the random walk. The neighborhood radius parameter r defines the region in which PCA is performed to determine the main orthogonal directions along which the stars are distributed. If r is smaller than necessary, then the stars within the neighborhoods may be insufficient to infer any alignment information. This parameter is thus chosen such that it is large enough to create a region where the linearity of Jhelum is detected. Similarly, if r is larger than necessary, many field stars would be included in the neighborhood, which could drown out the alignment signal. A large neighborhood radius also acts as a zoomed-out perspective of the region it encompasses, and so leads to missing structures in the data that are distributed on a smaller scale. For example, a large radius will lead to highlighting the two clusters in proper motion space as one large overdensity. Note however that LAAT does not create false-positive detections of structure, but rather highlights local density contrasts (especially if aligned along a preferred subspace) that would not have been seen on larger scales (for a detailed proof, see Taghribi et al. (2023); Mohammadi et al. (2022)). In other words, the separation of the clusters in proper motion space is not an artificial creation of the algorithm. This claim is substantiated by the fact that the two overdensities show a correspondence to the narrow and broad components (see Figure 4) and by the separation present in radial velocity as well (see Figure 5). As for the remaining parameters, they have not been altered from the default settings of the algorithm.

In this work, r is chosen to be on the order of the width of the stream so as to capture the most directional information, and the specific values are chosen so as to produce the highest visible density contrast. This has been achieved by experimenting with different values for the radius and observing that r = 0.5 gives the best results. After performing the pheromone cut on the first run, some sparse neighborhoods will be formed in places where the density is low. To infer the main directions along which the stars are distributed within a neighborhood of size r, LAAT needs a minimum of four stars within that neighborhood if applied to four-dimensional data. For the second and third runs, since some sparse distributions form due to the applied pheromone cut, we increase the value of the radius to ensure that this condition is met. The specific values are again chosen through manual experimentation and by visually checking what produces the largest density contrast between the stream and field stars.
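The role of the neighborhood radius and of the minimum number of stars can be illustrated with a small local PCA sketch. This is not LAAT's implementation (see Taghribi et al. (2023) for the algorithm itself): all function names are hypothetical, power iteration is swapped in here as a simple way to obtain the leading eigenvalue without external libraries, and the sample points are placeholders.

```python
import math


def covariance(points):
    """Sample covariance matrix of a list of equal-length tuples."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for p in points:
        d = [p[k] - mean[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def leading_variance_fraction(cov, iterations=200):
    """Fraction of the total variance along the first principal direction."""
    dim = len(cov)
    trace = sum(cov[i][i] for i in range(dim))
    if trace <= 0:
        return None
    # Power iteration; assumes the start vector is not orthogonal
    # to the leading eigenvector.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim))
    return lam / trace


def local_alignment(points, center, r, min_neighbors=4):
    """Alignment score of the neighborhood of `center`, or None if too sparse."""
    nbrs = [p for p in points if math.dist(p, center) <= r]
    if len(nbrs) < min_neighbors:
        return None  # too few stars to infer directions in 4D
    return leading_variance_fraction(covariance(nbrs))


# Hypothetical 4D points (two positions, two proper motions) lying on a line.
stars = [(t, 0.5 * t, 0.0, 0.0) for t in (-0.4, -0.2, 0.0, 0.2, 0.4)]
score = local_alignment(stars, (0.0, 0.0, 0.0, 0.0), r=0.5)
```

A score close to 1 indicates that the stars in the neighborhood are aligned along one direction. Returning `None` for neighborhoods with fewer than four stars mirrors the condition that motivates the larger radius in the second and third runs.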
As for the choice of the pheromone threshold for each run, the value is first chosen conservatively to avoid eliminating member stars by mistake. Any value smaller than 30% would unnecessarily keep some of the field stars that have accumulated a very small amount of pheromone. If we used this same threshold value for the rest of the runs, we would retain more contaminant stars with each run, which would necessitate performing more iterations of the CMD fine-tuning and of LAAT. Therefore, to keep the number of iterations to three runs and to remove as many nonmember stars as possible, we increase the pheromone threshold gradually from 30% to 50%. Through experimentation, we find that smaller thresholds keep more field stars, while larger ones remove parts of the two components of the stream. That is why, after the third run, we rely on the GMM log-likelihood cut in Section 4 to fine-tune this selection instead of using harsher pheromone thresholds. We provide further discussion of the contamination and completeness of our sample in Appendix B.
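A global pheromone cut of this kind can be sketched as below. The sketch assumes that the threshold is expressed as a fraction of the maximum pheromone value in a run, as stated for the mock experiment in Appendix B; the function name `pheromone_cut`, the use of `>=` rather than `>`, and the pheromone values are hypothetical.

```python
def pheromone_cut(pheromone, fraction):
    """Return indices of stars with pheromone >= fraction * (maximum in the run)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if not pheromone:
        return []
    limit = fraction * max(pheromone)
    return [i for i, value in enumerate(pheromone) if value >= limit]


# Hypothetical pheromone values for six stars at the end of one run.
pheromone = [0.05, 0.20, 0.45, 0.60, 0.90, 1.00]
kept_30 = pheromone_cut(pheromone, 0.30)  # conservative first-run cut
kept_50 = pheromone_cut(pheromone, 0.50)  # harsher third-run cut
```

Because a single value of `limit` is applied to every star, a diffuse structure that accumulates less pheromone than a dense one is removed first as the fraction increases.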
We also discuss the sparsity of the narrow component in the regions overlapping with the broad component of the stream. The broad component of Jhelum is sparser than its counterpart, and the overlapping region between the two components has a relatively smaller alignment between its member stars than regions belonging solely to the narrow component. These two reasons lead LAAT to see a smaller contrast between the overlapping region of the two components and the field stars. Therefore, the broad component, along with the overlapping region, will acquire a lower pheromone concentration than locations occupied by the narrow component alone. When applying the threshold on the pheromone quantity, some of the member stars that did not receive a large enough pheromone concentration will be filtered out. The part of the narrow component in that region will then be less populated as a result of the enforced cuts and will appear disconnected in some places, as seen in Figure 4. Note that the thresholding criteria of LAAT are being updated in future versions of the algorithm so that overlapping streams can be extracted more efficiently. The idea is to implement local thresholds that depend on the pheromone quantity of local neighborhoods in the data rather than using one global threshold on the pheromone for the entire input dataset. In this way, regions that show smaller alignment or smaller density but are still equally interesting will not be filtered out as harshly. Given that this is a future implementation, with current means we prefer to create a high-purity sample at the cost of missing some member stars over creating a sample that could contain some contamination from stars in the field.

Likely merger scenarios
The properties estimated in Section 5, namely the component widths and the velocity and metallicity dispersions, point to the nature of the progenitor of Jhelum. For the narrow component of the stream, we measure a velocity dispersion of 4.84 km/s and an upper limit to the metallicity dispersion of 0.15 dex. We also obtain a component width of ≈ 28 pc for the narrow component. The velocity dispersion of this component is comparable to that of other streams studied in the literature that are classified as having a GC origin. Some of these streams are 300S (Fu et al. 2018), Willka Yaku, Jet, and Phoenix (Shipp et al. 2018), as well as GD-1 (Gialluca et al. 2021). GCs are also characterized by a negligible metallicity dispersion, not greater than 0.05 dex, especially compared to larger systems such as DGs, which usually have metallicity dispersions an order of magnitude higher. The calculated metallicity dispersion of the narrow component is an upper limit to the true dispersion of this component's progenitor. This upper limit is again comparable to the dispersions of the streams in Li et al. (2022) that are more likely the result of GC accretion. Therefore, the thin and dynamically cold nature of this component suggests that a GC might have been its progenitor. The properties of the narrow component also disfavor some other origin scenarios that were hypothesized in Bonaca et al. (2019), particularly Jhelum being the result of multiple orbital wraps. This scenario is now questionable since it is unlikely that the narrow component would retain such a small width after long periods of orbiting in the Galaxy's potential.
As for the broad component, we measure a velocity dispersion of 19.49 km/s as well as a metallicity dispersion of $0.34^{+0.13}_{-0.09}$ dex and a component width of ≈ 84 pc, located 1.45 kpc closer than the narrow component. Such large dispersions can be explained by dynamical perturbations (Woudenberg et al. 2023), or could be indicative of a DG origin. The latter complements the fact that, when considering the whole stream, Jhelum has so far been classified as more likely a remnant of a DG (Ji et al. 2020; Bonaca et al. 2021; Li et al. 2022). This shows that when studying the Jhelum stream, it is important to treat the two components distinctly, as highlighted by this work; otherwise one risks artificially inflating the dispersions for the whole object. Using the updated mass-metallicity relation from Romero-Gómez et al. (2023), the mean metallicity of the broad component allows us to infer the stellar mass of the progenitor to be between $10^6$ and $10^7\,M_\odot$.

Our estimate of the stream width using the procedure defined in Section 5.3 is smaller than the value reported in works such as Shipp et al. (2018) and cited in Li et al. (2022). One possible cause of the discrepancy is the difference between the procedures used to measure the stream width. Shipp et al. (2018) fit the transverse stream profile with a Gaussian stream model and a linear foreground component. The separation between member and contaminating stars is performed by iteratively narrowing down their star selection around the best-fit isochrone of the CMD. On the other hand, we fit each component width as a free parameter in our MCMC scheme. We also consider the proper motion of the stars and their color and magnitude information to determine their membership to the stream, whereas Shipp et al. (2018) use photometry only. Furthermore, using LAAT, membership to the stream is determined by assigning a global threshold to all stars in a run. Since central parts of streams tend to be more populated than their outer parts, LAAT will concentrate more pheromone on those inner regions, and so when applying a cut on the pheromone value, it is possible that some stars on the edges of the streams are filtered out while the denser inner parts are preserved. This creates the possibility of missing some member stars, especially in less directionally aligned regions or in the diffuse outer regions of streams. As a result, our calculations could underestimate the stream width. Furthermore, Bonaca et al. (2019) estimate the widths of the components to be $91^{+4}_{-13}$ pc and $213^{+8}_{-23}$ pc for the narrow and broad components, respectively, at an assumed distance of 13 kpc. These estimates are again larger than our measurements of the component widths ($28.13^{+8.90}_{-6.64}$ pc and $84.09^{+11.26}_{-7.17}$ pc, respectively). Similar to the explanation above, we attribute this difference to the fact that LAAT uses a global threshold on the pheromone quantity. We prefer to use the threshold values mentioned in this work to limit contamination as much as possible and allow better constraints on the dynamical and metallicity properties of the components, keeping in mind that this comes at the expense of underestimating the widths of the stream components.
As a counterargument to the possible GC progenitor of the narrow component, works such as Walker et al. (2007) and Minor et al. (2010) quote a range of 4−10 km/s for the velocity dispersion of several dwarf spheroidal galaxies around the Milky Way. Moreover, the errors on our metallicity dispersion measurements are large, especially given the relatively small number of stars that were available to perform the measurement. Therefore, even though our measurements favor a GC accretion scenario, it is difficult to completely rule out a DG origin. On the other hand, a DG with a velocity dispersion of ∼ 5 km/s should have a stellar mass of about $10^{4-5}\,M_\odot$ when using the relation between velocity dispersion and stellar mass from Eftekhari et al. (2022). This means that the number of stars in this component should be at least a factor 100 smaller than the number of stars in the broad component. This shows that although a DG progenitor for the narrow component cannot be ruled out completely, it remains unlikely. The thin nature of the narrow component could also point toward a scenario where a nuclear star cluster (NSC) at the center of a DG has been accreted onto the Milky Way (see Neumayer et al. (2020) for a review on NSCs). However, given that the narrow component is located at the edge of the broad component and not at its center, it would be difficult to argue for this scenario without more data and/or modeling.
The work of Woudenberg et al. (2023) has also investigated the effects induced by encounters between Jhelum and the Sagittarius DG. Given that Sagittarius and Jhelum share the same orbital plane, it is natural to assume that Sagittarius has induced dynamical and structural perturbations on the smaller stream. Through N-body simulations of a loose GC set on Jhelum's orbit, and by integrating the orbits of the large perturber and of the GC in Milky Way-like potentials, Woudenberg et al. (2023) have shown that encounters between the two systems produce multiple components in Jhelum's stream. Their simulations also show that the interactions with Sagittarius inflate the measured velocity dispersion of Jhelum by a factor of 4 at most, compared to the unperturbed stream. These interactions also help explain Jhelum's complex morphology. Woudenberg et al. (2023) also point out a tertiary component of Jhelum that is located to the top left of the narrow component and runs parallel to it. The component can be seen in Figure 2 but is gradually filtered out over the multiple runs of LAAT, and so we leave its exploration for future work.
Given this information, our measurements point to a likely multiple-progenitor scenario for Jhelum in which a GC belonging to a DG was accreted onto the Milky Way during the DG's infall and produced the remnants that form the Jhelum stream. Arguments of this kind are also present in Errani et al. (2022) when discussing the possible progenitor of the C-19 stream. The difference in the distance to each component would correspond to an estimate of the projected distance between the accreted GC and its host DG. Such a system would fit within the sample of van den Bergh (2006), who provides a list of 101 GCs with a mean projected distance of 1.62 kpc to their host DGs.
We calculate the positions of the two components in integral of motion space, since a separation between the two in that space would indicate a different infall time between possible progenitors. The procedure and results are displayed in Appendix A, where we observe no separation of this kind with the data we have so far. Although a GC scenario is likely, it is still difficult to rule out other potential origins of this stream, such as a DG or an NSC accretion scenario. The dynamical perturbation from Sagittarius is also important to include in its formation history. To have better certainty about the origin of the Jhelum stream, we would need more member stars with available metallicity and radial velocity measurements.

Conclusions
In this work, we study the properties of the stellar stream Jhelum, a stream of the Milky Way galaxy that is known for its complex morphology. Our work is based on the findings of Bonaca et al. (2019), in which a narrow and a broad component are distinguished as substructures of the stream, and the work of Shipp et al. (2019), in which two signals were found in its proper motion space. For this work, we used a recently introduced machine learning methodology, LAAT, to mine these two components and attempt to link their newly estimated properties to the possible merger scenarios that formed the stream. The analysis and results reached in this work can be summarized as follows:

- We used LAAT to highlight the density contrast between the stars more likely to belong to the stream and stars that are part of the surrounding field. LAAT was applied on four dimensions consisting of two spatial and two proper motion dimensions, and the results were then used to refine the selection of stars in the CMD.
- The produced density contrast enhancement revealed two distinct overdensities in proper motion space that we link to the two spatial components of Jhelum: the narrow and broad components.
- The separation between these two components was also confirmed a posteriori in radial velocity using measurements from the Southern Stellar Stream Spectroscopic Survey (Li et al. 2019, $S^5$). We also found that the narrow component has a narrow trend in radial velocity, while the trend for the broad component is more diffuse.
- With this new information, we calculated properties of the two separated components. Specifically, we used an MCMC procedure similar to the one used in Woudenberg et al. (2023) to sample from posteriors over the orbits of the two components. This was done while fitting for the width and velocity dispersion, and was followed by the calculation of the metallicity dispersion of either component. For the narrow component, we obtained a velocity dispersion of $4.84^{+1.23}_{-0.79}$ km/s, a metallicity dispersion of $0.15^{+0.18}_{-0.10}$, and a width of $28.13^{+8.90}_{-6.64}$ pc. For the broad component, we obtained a velocity dispersion of $19.49^{+2.19}_{-1.84}$ km/s, a metallicity dispersion of $0.34^{+0.13}_{-0.09}$, and a width of $84.09^{+11.26}_{-7.17}$ pc.
- The small velocity and metallicity dispersions as well as the small width indicate that the narrow component is more likely the result of GC accretion, though it is difficult to fully rule out a DG or a nuclear star cluster progenitor. On the other hand, the comparatively larger dispersions and width of the broad component suggest a DG progenitor for this part of the stream. We therefore argue for a likely scenario where Jhelum is the result of the accretion of a GC that belonged to a DG which merged with the Milky Way, although more data are needed to substantiate this claim.
It is possible to extend this study by performing deeper medium-resolution and/or high-resolution follow-up observations of the selection of stars identified in our work. It would be very helpful if these observations provided high-quality metallicity and radial velocity measurements for a larger number of target member stars, as well as light-element abundances (e.g., Na, Mg, and Al), ubiquitous to GCs. We therefore leave this for future work.

contaminant stars remain around the stream, which correspond to the stars on the far right side of the proper motion space. These stars would have been further filtered out by the procedure followed in the paper after clustering the two overdensities in proper motion space and applying a cut on the log likelihood to belong to the mixture model (see Section 4). Since we know the label of each star in the mock dataset, we can see how many stream stars have been detected and how many contaminating stars remain. Given our results in Figure B.2, we find that 0.5% of the field stars remain, 80% of the narrow component has been detected, and 34% of the broad component is retrieved.
Though this is a simplified dataset representing the Jhelum stream and its surrounding stars, this experiment gives us an idea of the level of completeness and contamination in our sample. The following claims can be made: the level of contamination in our sample is kept minimal, with a contamination percentage of less than 1% of the total stars in the sample. In order to achieve such a level of purity in the selection, some member stars of Jhelum are filtered out in the process. This is primarily due to the global nature of the pheromone threshold, where one value is chosen and applied to all stars of the dataset. As a consequence, diffuse or faint structures run the risk of being filtered out if more aligned or denser structures are present in the dataset. This effect is clearly seen in this experiment, where we detect 80% of the narrow component but only 34% of the broad component, even though the latter is composed of more stars by construction. Since the broad component is more diffuse, our sample is likely missing more than 50% of the stars belonging to this component. This is also evident in the left panel of Figure 4 in the paper, where the region below the narrow component with $\phi_1 > 15^\circ$ appears to be missing many of the stars belonging to the broad component. As for the narrow counterpart, its thin and dense nature makes it easier to detect with the LAAT algorithm, and so we expect that the sample of stars composing this component is more complete, to a level comparable with 80%.
The fact that we might be missing many of the members of the broad component could mean that we are underestimating some of its properties, for example its width. However, this does not change the conclusions of this paper: the velocity and metallicity dispersions for the broad component clearly point toward a DG origin for this component of the stream. This argument is not applicable to the narrow component, however, since we rely on the small dispersions in velocity and metallicity to argue for a GC accretion scenario for this component. That is why it is reassuring that we detect a large portion of the narrow component and that the contamination in our sample is minimal.
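The completeness and contamination figures quoted for the mock experiment can be computed from known labels as in the sketch below. The function name `selection_metrics` and the tiny label set are hypothetical and do not reproduce the mock dataset of this appendix. The sketch separates the fraction of each population that is retained from the share of field stars within the final selection.

```python
def selection_metrics(labels, selected):
    """Return (fraction retained per label, field-star share of the selection)."""
    totals, kept = {}, {}
    for label, keep in zip(labels, selected):
        totals[label] = totals.get(label, 0) + 1
        if keep:
            kept[label] = kept.get(label, 0) + 1
    retained = {label: kept.get(label, 0) / n for label, n in totals.items()}
    n_selected = sum(kept.values())
    contamination = kept.get("field", 0) / n_selected if n_selected else 0.0
    return retained, contamination


# Tiny hypothetical mock: 5 narrow, 5 broad, and 10 field stars.
labels = ["narrow"] * 5 + ["broad"] * 5 + ["field"] * 10
selected = ([True] * 4 + [False]) + ([True] * 2 + [False] * 3) + ([True] + [False] * 9)
retained, contamination = selection_metrics(labels, selected)
```

In this toy case 80% of the narrow stars and 40% of the broad stars are retained, 10% of the field stars survive, and field stars make up one seventh of the final selection.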

Fig. 1: The initial color and magnitude selection. In gray we plotted the color-magnitude diagram (CMD) of the stars that follow proper motion selection and extinction correction. Red points denote common stars between the selection in gray and the stars observed by the $S^5$ survey. Using these stars and a 12 Gyr, [Fe/H] = −1.7 PARSEC isochrone at 13 kpc (dark blue), we define the primary CMD selection as all stars lying in between the orange lines. The selection limit line on the right is positioned there to capture as many main sequence stars as possible while containing most stars in common with the $S^5$ survey.
Third run of LAAT applied with a threshold of 50%.

Fig. 2: Three runs of LAAT (a-c) after fine-tuning the CMD selection between each run. The CMD refinement is shown in the first column and is applied before each LAAT iteration. Middle panels show the spatial distribution of the stars while left panels show their position in proper motion space. The upper row of the panels (a-c) shows the distribution of pheromone at the end of the run, while the lower row shows the remaining stars after enforcing a threshold on the pheromone quantity. Darker colors mark stars that accumulated higher quantities of pheromone, indicating regions of local alignment and density. In this way, a CMD cut is first performed, followed by finding a distribution of pheromone using LAAT and retaining stars that have accumulated a pheromone quantity that exceeds the given threshold. The CMD of the remaining stars is replotted and refined, and the procedure is repeated until the stream is isolated from the majority of field stars.

Fig. 3: Proper motion space of the stars remaining after the selection procedure of Section 3. The data points within this space are assigned a score for belonging to a two-component Gaussian mixture trained on the distribution of stars in proper motion space shown in the lower left corner of Figure 2c. The data points are colored by the score (log likelihood) at the star's position in proper motion space, and contours of the log likelihood are also visualized. The contours pinpoint the location and direction of the trained Gaussian distributions. We also show the centers of the two Gaussian distributions with plus signs. The red contour corresponds to log likelihood > −1.75. All data points that fall within this contour are chosen and kept for any subsequent analysis.

Fig. 4: Posterior probabilities after clustering the two overdensities in proper motion space using a two-component Gaussian Mixture Model (GMM). Right panel: All stars in blue have a high probability of belonging to the top left overdensity in proper motion space, while all stars in red have a high probability of belonging to the lower right overdensity. Left panel: When visualized in position space, the two proper motion overdensities correspond to the narrow and broad components.
P. Awad et al.: Sub-structure within the Jhelum Stream

Fig. 6: Posterior distributions for all parameters modeled using a Markov chain Monte Carlo (MCMC) algorithm to obtain a best-fit orbit for the narrow (red) and broad (blue) components of Jhelum as well as the component widths and velocity dispersions. The median of each modeled parameter is indicated on top of each column along with the 16th and 84th percentile variations. Values indicated with or without a tilde on top of each column refer to the narrow or broad component, respectively.

Fig. 7: Best-fit orbits for the narrow and broad components of Jhelum in a standard Milky Way potential. Stars belonging to the narrow and broad components that were observed by the S⁵ survey are shown in red and blue, respectively. The black solid and dashed lines in both panels indicate the best-fit orbit of each component, respectively.

Fig. 8: Radial velocity and metallicity distributions for both components of Jhelum. Top panel: Distribution of the radial velocities around the best-fit orbit of each component of the stream. The width of these distributions helps us visualize the velocity dispersion, σ_v for the broad component and σ̃_v for the narrow component. Lower panel: Distribution of metallicities, [Fe/H], for both components of the stream. Metallicity measurements are obtained from the S⁵ survey. Each distribution is fitted with a Gaussian to find the intrinsic metallicity dispersion of the corresponding component. This calculation is explained in Section 5.4. We also indicate the mean error on the metallicity measurements, σ_meas = 0.2.

Fig. 9: Posterior distributions of the mean metallicity [Fe/H] and intrinsic metallicity dispersion σ_int modeled using an MCMC algorithm to obtain a best-fit Gaussian for the distribution in the lower panel of Figure 8. Values indicated by a tilde refer to the narrow component.
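One common way to separate an intrinsic dispersion from measurement noise is a Gaussian likelihood whose variance is the sum in quadrature of σ_int and σ_meas. The sketch below assumes that form; the exact likelihood used in the paper is given in its Section 5.4 and may differ. It uses synthetic metallicities with arbitrary true values and replaces the MCMC sampler with a brute-force grid search to stay short.

```python
import numpy as np

def log_likelihood(mean, sigma_int, feh, sigma_meas):
    """Gaussian log likelihood with intrinsic and measurement scatter added in quadrature."""
    var = sigma_int**2 + sigma_meas**2
    return np.sum(-0.5 * np.log(2.0 * np.pi * var) - 0.5 * (feh - mean) ** 2 / var)

# Synthetic metallicities with arbitrary true values; NOT the S5 measurements.
rng = np.random.default_rng(1)
true_mean, true_sigma_int, sigma_meas = -2.0, 0.3, 0.2
feh = rng.normal(true_mean, np.hypot(true_sigma_int, sigma_meas), size=500)

# Grid search over (mean, sigma_int) in place of an MCMC sampler.
means = np.linspace(-2.5, -1.5, 101)
sigmas = np.linspace(0.01, 0.8, 80)
grid = np.array(
    [[log_likelihood(m, s, feh, sigma_meas) for s in sigmas] for m in means]
)
i, j = np.unravel_index(np.argmax(grid), grid.shape)
best_mean, best_sigma_int = means[i], sigmas[j]
```

The same `log_likelihood` function could be passed to a sampler together with priors to obtain posterior distributions like those shown in the figure; `sigma_meas` may also be an array of per-star uncertainties.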

Fig. B.1: Resulting mock dataset of the Jhelum stream along with a distribution of surrounding field stars. The left panels show the distribution of the sampled stars in position space and the right panels show the location of the same sample in proper motion space. Upper panels: The colors correspond to local density, with lighter colors marking denser regions. Lower panels: The generated data points are colored according to their known label, with red for the narrow component, blue for the broad component, and gray for generated nonmembers.

Fig. B.2: Similar to Figure 2 in the paper, we plot the result of LAAT on the four-dimensional space (upper panels) and the result after applying a threshold of 45% of the maximum pheromone value during that run (lower panels). Since we know the label of each star, evaluating this result gives us an idea of the level of completeness and contamination in our sample.
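The thresholding and evaluation described here can be sketched as below. The 45% fraction comes from the caption; the function names, the toy pheromone values, and the definitions of completeness (fraction of true members recovered) and contamination (fraction of selected stars that are nonmembers) are the usual conventions, assumed here rather than taken from the paper.

```python
import numpy as np

def threshold_selection(pheromone, fraction=0.45):
    """Select stars whose pheromone exceeds a fraction of the run's maximum."""
    return pheromone > fraction * pheromone.max()

def completeness_contamination(selected, is_member):
    """Completeness and contamination of a selection given known labels."""
    true_pos = np.sum(selected & is_member)
    completeness = true_pos / np.sum(is_member)
    contamination = np.sum(selected & ~is_member) / np.sum(selected)
    return completeness, contamination

# Toy pheromone values and known mock labels (illustrative only).
pheromone = np.array([10.0, 9.0, 5.0, 4.0, 1.0, 8.0])
is_member = np.array([True, True, True, False, False, False])

selected = threshold_selection(pheromone)
completeness, contamination = completeness_contamination(selected, is_member)
```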

Table 1: Input parameters for LAAT and run information.