TY - JOUR AU1 - Hall,, Alex AB - Abstract We conduct a thorough study into the feasibility of measuring large-scale correlated proper motions of galaxies with astrometric surveys. We introduce a harmonic formalism for analysing proper motions and their correlation functions on the sphere based on spin-weighted spherical harmonics, and study the statistics of the transverse velocity field induced by large-scale structure. We use a likelihood formalism to derive optimal estimators for the secular parallax due to the Solar System’s motion relative to distant objects, and compute the variance and bias due to peculiar velocities and relativistic aberration. We use a simulated catalogue of galaxy proper motions with radial distributions and noise properties similar to those expected from Gaia to forecast the detectability of the proper motion dipole, whose amplitude may be considered a proxy for the Hubble constant. We find cosmic variance to be the limiting source of noise for this measurement, forecasting a detectability of 1–2σ on a single component of the local velocity, increasing to 2–4σ (equivalent to a 25–50 per cent measurement of the Hubble constant) if the CMB dipole is included as prior information. We conduct a thorough study into the radial dependence of the signal-to-noise finding that most of the information comes from galaxies closer than a few hundred Mpc. We forecast that the amplitude of peculiar transverse velocities can potentially be measured with 10σ significance; such a measurement would offer a unique probe of cosmic flows and a valuable test of the cosmological model. methods: statistical, proper motions, large-scale structure of Universe, cosmology: observations 1 INTRODUCTION The launch of the ESA Gaia1 satellite and its subsequent data releases are expected to revolutionize the field of astrometry, producing the largest catalogue of precise positions and proper motions to date. Upcoming very long baseline interferometry measurements with the ngVLA2 will also have an unprecedented astrometric precision at radio frequencies. Both of these experiments should have an end-of-mission proper motion accuracy of the order of |$10 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$| for the brightest objects. With the dramatic increase in data volume and quality, it is timely to ask what precision astrometry can say about cosmology. Although not designed for this purpose, Gaia will measure the proper motions of some 106 galaxies (de Souza et al. 2014; de Bruijne et al. 2015), preferentially selecting objects that look most ‘point-source-like’, i.e. ellipticals with large bulge-to-disc components. Additionally, a large population of quasars will be observed to pin down the celestial reference frame (see e.g. Gaia Collaboration 2018). There are several potential uses of such a data set for cosmology (see Darling, Truebenbach & Paine 2018 for a review). First, since the Solar System moves relative to distant objects, there is a ‘secular parallax’ (SP) proper motion in the opposite direction to our local velocity. This proper motion has an amplitude of roughly |$\frac{80}{r / 1 \, \mathrm{Mpc}} \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠, where r is the comoving distance to the object and has a dipolar dependence on angle, anti-aligned with the velocity vector of the Solar System with respect to the cosmic microwave background (CMB) rest frame.3 With spectroscopic redshifts as a proxy for distance, the SP proper motion may be used to infer the Hubble constant, H0, if the SP velocity is fixed by the CMB dipole. Bachchan, Hobbs & Lindegren (2016) forecast that Gaia can potentially make a |${\sim }30{{\ \rm per\ cent}}$| measurement of H0 with this method, with further improvements possible if more galaxies can be detected. A competitive measurement of H0 could shed light on recent tensions between CMB and classical distance–ladder measurements (see Freedman 2017 for a review). At greater distances, quasars may be used to probe the distance–redshift relation to constrain dark energy (Ding & Croft 2009). Complicating the measurement of SP is the typically larger signal of the ‘secular aberration drift’ (SAD) proper motion due to the time-varying relativistic aberration from the acceleration of the Solar System towards the galactic centre. This has a magnitude of roughly |$4 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠, with a dipolar angular dependence directed towards the galactic centre, and has been measured in quasars by Titov, Lambert & Gontier (2011). This signal is independent of distance, which in principle allows it to be distinguished from the SP proper motion; alternatively it can be measured from high-redshift quasars where SP is negligible and then subtracted. Going beyond local effects, galaxies and quasars have intrinsic peculiar velocities caused by large-scale structure (LSS). In linear perturbation theory this gives rise to an r.m.s. proper motion of roughly |$\frac{90}{r / 1 \, \mathrm{Mpc}} \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠. This is roughly the same size as the SP proper motion and has the same distance dependence, and represents an important source of bias and variance in attempts to measure SP (unsurprisingly, as we are only sensitive to the relative motion between the Solar System and extragalactic objects). However, this signal has a distinct correlation structure, with quadrupolar, octupolar, and higher order angular structure due to correlations in the peculiar velocity field. This angular dependence in principle allows LSS transverse velocities to be partly separated from the SP effect, which would provide a valuable probe of large-scale motions, free from the Malmquist biases that affect radial velocity surveys (Nusser, Branchini & Davis 2012). Cosmic velocity fields are sensitive to large-scale inhomogeneities in the dark matter density field which can be modelled accurately in linear perturbation theory; such a measurement could provide a valuable probe of late-time physics such as dark energy. Predictions for the sensitivity of Gaia to these motions range from 1–2σ (Nusser et al. 2012) to 10σ (Darling et al. 2018), depending on survey assumptions and measurement techniques. In this work, we extend previous studies into the feasibility of detecting large-scale correlated proper motions, paying particular attention to the ‘cosmic variance’ imparted by LSS. Proper motions from SP and LSS are strongest in nearby objects, which is also the regime where correlations between the relevant velocities are expected to be strongest. Despite this, previous studies have adopted only a simplistic approach to including cosmic variance in their forecasts and have typically neglected correlations between transverse velocities. We consistently account for these correlations, and identify a previously neglected source of bias in the measurement of the proper motion dipole coming from correlations between the Solar System’s motion and those of nearby galaxies, closely related to the ‘bulk flow’ phenomenon observed in radial velocity surveys. We adopt a statistical approach to measuring large-scale correlated proper motions based on a likelihood function and use it to derive optimal ‘stacking’ estimators for SP proper motion and its LSS counterpart. We present several mitigation strategies to reduce the bias and variance of these estimators. To do this, we introduce a CMB-style formalism for measuring correlated proper motions across the full sky, which serves as an alternative to the widely used vector spherical harmonics (VSH; Mignard & Klioner 2012). This formalism uses the spin-weighted spherical harmonic decomposition of the proper motion field, which allows correlation functions and power spectra to be easily constructed and analysed (extending recent work by Darling & Truebenbach 2018) and ensures that all the information in the vector field is used. This formalism is easier to use than VSH when correlating proper motions, and we expect it to be useful beyond the applications presented here. We also conduct a detailed study into the statistics of transverse velocities, focusing on their redshift-dependence, angular structure, and sensitivity to non-linearity in dark matter inhomogeneities. Our likelihood formalism allows us to consistently propagate the variance from LSS velocities through to estimators of H0 and the amplitude of the peculiar velocity field. We present forecasts for these quantities with a Gaia-like astrometric survey, and thoroughly investigate the radial dependence of the signal-to-noise. This paper is structured as follows. In Section 2 we introduce the spin-weighted spherical harmonics as a means of analysing transverse vectors on the full sky and describe how correlation functions and power spectra may be constructed and compared with theoretical models such as linear perturbation theory. In Section 3 we construct a likelihood function for proper motions, and use it to derive optimal estimators for large-scale correlated signals. In Section 4 we describe our Gaia-like simulated galaxy catalogue, and in Section 5 we present forecasts for large-scale proper motion dipoles and H0. In Section 6 we extend our formalism to the measurement of transverse velocities induced by LSS, and we conclude in Section 7. In a series of appendices we present further technical details of our harmonic formalism. We set the speed of light c = 1 throughout this work. All results were computed with the best-fitting flat ΛCDM cosmological parameters (TT, EE, TE + lowP + lensing + ext) quoted in Planck Collaboration XIII (2016), namely (Ωbh2, Ωch2, h, As, ns) = (0.0223, 0.1188, 0.6774, 2.142 × 10−9, 0.9667). 2 POWER SPECTRA AND CORRELATION FUNCTIONS OF PROPER MOTION In this section we present the harmonic decomposition of a transverse velocity or proper motion field on the sphere, and construct the correlation functions and power spectra which form the basis of our likelihood formalism. Much of the material here will be familiar from CMB polarization studies, and further technical details are provided in Appendix A. Some of the formulae here are also presented in appendix B of Hall & Challinor (2014), albeit in a different context. 2.1 Rotational properties of the transverse velocity When dealing with vector or tensor quantities on the sphere, it is usually advantageous to work with objects that do not depend on a coordinate system, since any physically meaningful quantity cannot depend on the adopted coordinate system. Independence from the coordinate system is most easily achieved by working with objects that transform straightforwardly under a change of basis. In the case of transverse velocity (or proper motion, related to transverse velocity by a factor of 1/r), we commonly work with a locally orthonormal pair of basis vectors |$(\hat{\boldsymbol {x}},\hat{\boldsymbol {y}})$|⁠, orthogonal to the line of sight. Denoting by |$\hat{\boldsymbol {n}}$| the outward radial unit vector corresponding to the line of sight, the vectors |$(\hat{\boldsymbol {x}},\hat{\boldsymbol {y}}, -\hat{\boldsymbol {n}})$| form a right-handed set and the components of the transverse velocity in this system are (Vx, Vy). By considering the standard 2D rotation matrix, one sees that under a right-handed rotation of the coordinate basis about |$-\hat{\boldsymbol {n}}$| (or left-handed about |$\hat{\boldsymbol {n}}$|⁠) by an angle γ, the complex transverse velocity Vx + iVy transforms as \begin{eqnarray*} V_x + iV_y \rightarrow (V_x + iV_y)e^{i\gamma }. \end{eqnarray*} (1) Now, we say a quantity sη has spin s if, under the transformation |$\hat{\boldsymbol {x}} + i\hat{\boldsymbol {y}} \rightarrow (\hat{\boldsymbol {x}} + i\hat{\boldsymbol {y}})e^{i\gamma }$|⁠, we have sη → eisγsη. The complex transverse velocity Vx ± iVy thus has spin ±1. In this work we use the standard orthonormal spherical polar basis vectors |$(\hat{\boldsymbol{\theta }},\hat{\boldsymbol{\phi }}, \hat{\boldsymbol {n}})$|⁠, which form a right-handed set, and define the complex components of the transverse velocity as V± ≡ Vθ ± iVϕ. The components in this basis are related to those in the celestial reference frame (described by declination δ and cos δ-corrected right-ascension α*) by |$(V_{\theta }, V_{\phi }) = (-V_{\delta }, V_{\alpha ^*})$|⁠. We prefer to work with the quantities V± over (Vθ, Vϕ) because their simple transformation behaviour under rotations makes constructing correlation functions considerably easier, as we shall see. Being a spin ±1 field on the sphere, V± may naturally be expanded in spin ±1 spherical harmonics (a set of orthonormal basis functions on the sphere possessing the correct rotational properties, see Appendix A) as \begin{eqnarray*} V_{\pm }(\hat{\boldsymbol {n}}) = \sum _{lm} (\mp \epsilon _{lm} + i \beta _{lm}) {}_{\pm 1}Y_{lm}(\hat{\boldsymbol {n}}), \end{eqnarray*} (2) where the sum over m ranges between −l and l and we have l ≥ 1. As discussed in Appendix A, under a parity transformation the quantities ϵlm transform as ϵlm → (−1)lϵlm and are hence said to have electric parity, whereas the βlm transform as βlm → (−1)l + 1βlm and hence have magnetic parity. It is straightforward to show that these objects are related to the solenoidal (slm) and toroidal (tlm) coefficients of a VSH expansion (Mignard & Klioner 2012) by (slm, tlm) = (ϵlm, βlm). 2.2 Proper motion dipoles In this work we are primarily concerned with measuring proper motions which are correlated over large angular scales, such as the dipolar SP. In this section we show how to map a proper motion or transverse velocity dipole to the multipole coefficients ϵlm and βlm. In the SP scenario, the Solar System moves with some velocity |$\boldsymbol {V}$| relative to distant objects, which are assumed fixed. Taking the z-axis of a spherical coordinate system along |$\boldsymbol {V}$|⁠, the transverse velocity of distant galaxies or quasars relative to the Solar System barycentre is |$\boldsymbol{V}_\perp = \vert \boldsymbol {V} \vert \sin \theta ^{\prime } \hat{\boldsymbol{\theta }}^{\prime }$|⁠, where a prime denotes quantities in the frame with |$\hat{\boldsymbol {z}} = \hat{\boldsymbol {V}}$|⁠. In this coordinate system then we have |$(V_\theta ^{\prime },V_\phi ^{\prime }) = (\vert \boldsymbol{V} \vert \sin \theta ^{\prime }, 0)$|⁠. Using the explicit expressions for the spin-weighted spherical harmonics in equation (A16) and the definition of the complex transverse velocity we thus have \begin{eqnarray*} V_{\pm }^{\prime }(\hat{\boldsymbol {n}}) = \pm \vert \boldsymbol {V} \vert \sqrt{\frac{8\pi }{3}}{}_{\pm 1}Y_{10}(\hat{\boldsymbol{n}}). \end{eqnarray*} (3) We can now use equation (2) to read off the multipoles in this frame, finding |$\epsilon _{lm}^{\prime } = -\vert \boldsymbol {V} \vert \sqrt{8\pi /3}\delta _{l1}\delta _{m0}$| and |$\beta _{lm}^{\prime } = 0$|⁠. Thus, in this frame, the SP is purely E-mode and dipolar. This example illustrates a generic feature of the E-mode; the direction in which the amplitude of the transverse velocity is maximally changing is parallel or antiparallel to the local direction of the transverse velocity. If instead we rotated the transverse velocity vector at each point by 90°, we would have a pure B-mode pattern characteristic of global rotation of distant objects about the z-axis. In this case the amplitude is maximally changing 90° or 270° to the local direction, characteristic of a pure B-mode. To find the multipole coefficients in a general frame, we use the rotation law in equation (A14), |$\epsilon _{lm} = \sum _{m^{\prime }}D^l_{mm^{\prime }}(\alpha ,\beta ,\gamma) \epsilon _{lm^{\prime }}^{\prime }$| where (α, β, γ) are the Euler angles which rotate from the primed frame to the unprimed (general) frame, and the |$D^l_{mm^{\prime }}$| are the Wigner D matrix elements (see Appendix A). The velocity vector in a general frame has direction |$\hat{\boldsymbol {V}} = (\sin \beta \cos \alpha , \sin \beta \sin \alpha , \cos \beta)$|⁠, so using the relation of the Wigner D matrix elements to the spin-weighted spherical harmonics in equation (A15) we have, in a general frame \begin{eqnarray*} \epsilon _{lm} &=& -\frac{8\pi }{3}\frac{\vert \boldsymbol {V} \vert }{\sqrt{2}} Y_{1m}^*(\hat{\boldsymbol {V}})\delta _{l1}, \nonumber \\ \beta _{lm} &=& 0. \end{eqnarray*} (4) Note that the rotation has preserved the pure E-mode nature of the signal, a generic feature of the E/B decomposition. To gain further intuition into the multipole coefficients consider inserting the explicit forms of the spherical harmonics into equation (4). This yields \begin{eqnarray*} \epsilon _{1-1} &=& -\sqrt{\frac{8\pi }{3}}\frac{V_x + iV_y}{\sqrt{2}}, \nonumber \\ \epsilon _{10} &=& -\sqrt{\frac{8\pi }{3}} V_z \nonumber , \\ \epsilon _{11} &=& -\sqrt{\frac{8\pi }{3}}\frac{-V_x + iV_y}{\sqrt{2}}. \end{eqnarray*} (5) Thus, the multipole coefficients are just the components of the spatially fixed 3D velocity field in a helicity basis. Writing |$\boldsymbol{\epsilon } = (\epsilon _{1-1},\epsilon _{10},\epsilon _{11})^\intercal$|⁠, this may be compactly written as |$\boldsymbol{\epsilon } = -\sqrt{8\pi /3}\mathsf{B}{\boldsymbol {V}}$|⁠, with |$\mathsf{B}$| a unitary matrix. 2.3 Correlation functions Correlating vectors on the sphere is slightly more subtle than correlating scalars, since one must ensure that the final result is independent of the orientation of the |$(\hat{\boldsymbol {x}},\hat{\boldsymbol {y}})$| basis, which itself varies over the sky. Examples of basis-independent transverse-velocity correlation functions were recently presented by Darling & Truebenbach (2018), where it was pointed out that several choices of correlation function may be made but any single choice is not sufficient to capture all the information in the vector field, which has two degrees of freedom at each point. When correlating a vector field at two points on the sky, the natural basis in which to express the components is provided by the geodesic connecting the two points. Let γ1 be the angle required to rotate the local |$(\hat{\boldsymbol{\theta }},\hat{\boldsymbol{\phi }})$| basis at |$\hat{\boldsymbol {n}}_1$| in a right-handed sense about |$\hat{\boldsymbol {n}}_1$| such that the |$\hat{\boldsymbol{\theta }}$| vector is aligned with the geodesic connecting |$\hat{\boldsymbol {n}}_1$| and |$\hat{\boldsymbol {n}}_2$|⁠. Let γ2 be the corresponding angle at |$\hat{\boldsymbol {n}}_2$|⁠. On the flat sky we have γ1 = γ2. Let |$\cos \beta _{12} = \hat{\boldsymbol {n}}_1 \cdot \hat{\boldsymbol {n}}_2$|⁠. The complex transverse velocity in this rotated basis, |$\bar{V}_{\pm }$| may be found using equation (1). It is related to the complex transverse velocity in the global basis by \begin{eqnarray*} \bar{V}_{\pm }(\hat{\boldsymbol {n}}_1) = V_{\pm }(\hat{\boldsymbol {n}}_1)e^{\mp i \gamma _1}, \end{eqnarray*} (6) and likewise at |$\hat{\boldsymbol {n}}_2$|⁠. The angles {γ1, β12, −γ2} form a set of Euler angles which rotate the basis at |$\hat{\boldsymbol {n}}_1$| into that at |$\hat{\boldsymbol {n}}_2$|⁠. Given two points |$(r_1, \hat{\boldsymbol {n}}_1)$| and |$(r_2, \hat{\boldsymbol {n}}_2)$| in a 3D space,4 there are two correlation functions we can form with |$\bar{V}_{\pm }(\hat{\boldsymbol {n}}_1, r_1)$| and |$\bar{V}_{\pm }(\hat{\boldsymbol {n}}_2, r_2)$|⁠. Assuming that the transverse velocity field has vanishing mean (which is true for cosmic velocities sourced by LSS), these are given by \begin{eqnarray*} \xi _\pm (\beta _{12}, r_1, r_2) &\equiv& \langle \bar{V}_+(\hat{\boldsymbol {n}}_1, r_1) \bar{V}_{\pm }^*(\hat{\boldsymbol {n}}_2, r_2) \rangle \nonumber \\ &=& e^{-i\gamma _1}e^{\pm i\gamma _2} \langle V_+(\hat{\boldsymbol {n}}_1, r_1) V_{\pm }^*(\hat{\boldsymbol {n}}_2, r_2) \rangle , \end{eqnarray*} (7) where the angle brackets denote an expectation value over realizations of the transverse velocity field. As we shall see, these correlation functions are real valued. We can relate the correlation functions ξ± to the power spectra of the multipole coefficients by plugging equation (2) into equation (7). First we define the power spectra \begin{eqnarray*} \langle \epsilon _{lm}(r_1) \epsilon ^*_{l^{\prime }m^{\prime }}(r_2) \rangle &=& \zeta ^E_l(r_1,r_2) \delta _{ll^{\prime }}\delta _{mm^{\prime }}, \nonumber \\ \langle \beta _{lm}(r_1) \beta ^*_{l^{\prime }m^{\prime }}(r_2) \rangle &=& \zeta ^B_l(r_1,r_2) \delta _{ll^{\prime }}\delta _{mm^{\prime }}, \nonumber \\ \langle \epsilon _{lm}(r_1) \beta ^*_{l^{\prime }m^{\prime }}(r_2) \rangle &=& 0, \end{eqnarray*} (8) where we have assumed statistical isotropy (which imposes the Kronecker deltas and the m-independence of the power spectra) and parity non-violation (which ensures that the EB correlation vanishes). Note that these power spectra are real valued. Inserting these definitions into equation (7) gives \begin{eqnarray*} \xi _\pm (\beta _{12}, r_1, r_2) &=& e^{-i\gamma _1}e^{\pm i\gamma _2} \sum _l \left(\pm \zeta ^E_l(r_1,r_2) + \zeta ^B_l(r_1,r_2) \right) \nonumber \\ &&\times \, \sum _m {}_1 Y_{lm}(\hat{\boldsymbol {n}}_1) {}_{\pm 1}Y_{lm}^*(\hat{\boldsymbol {n}}_2). \end{eqnarray*} (9) Using equation (A13) to relate the spin-weighted spherical harmonics to the Wigner D matrix elements (Varshalovich, Moskalev & Khersonskii 1988) we find \begin{eqnarray*} \xi _\pm (\beta _{12}, r_1, r_2) = \!\sum _l \frac{2l+1}{4\pi } \left[ \pm \zeta ^E_l(r_1,r_2) \!+\! \zeta _l^B(r_1,r_2) \right] d^l_{1\pm 1}(\beta _{12}), \nonumber \\ \end{eqnarray*} (10) where |$d^l_{1\pm 1}(\beta _{12})$| are elements of the reduced Wigner matrices and are the generalization of the Legendre polynomials to fields with non-zero spin. These functions are real valued and obey the relations |$d^l_{1-1} = d^l_{-1 1}$| and |$d^l_{11} = d^l_{-1-1}$| and may be computed using the recursion relation (Varshalovich et al. 1988) \begin{eqnarray*} d^l_{1 \pm 1} &=& \frac{l(2l-1)}{l^2-1}\left[\left(\cos \beta \mp \frac{1}{l(l-1)}\right)d^{l-1}_{1 \pm 1}\right. \nonumber \\ &&\left.- \frac{l(l-2)}{(l-1)(2l-1)}d^{l-2}_{1 \pm 1}\right], \end{eqnarray*} (11) for l ≥ 3, with the boundary conditions \begin{eqnarray*} d^1_{1 \pm 1} &=& \frac{1}{2}(1 \pm \cos \beta), \nonumber \\ d^2_{1 \pm 1} &=& \frac{1}{2}(\pm 2\cos ^2 \beta + \cos \beta \mp 1). \end{eqnarray*} (12) The Wigner d elements obey the relations |$d^l_{11}(0) = 1$| and |$d^l_{1-1}(0) = 0$|⁠, which means that the quantity |$(2l+1)(\zeta ^E_l + \zeta ^B_l)/4\pi$| is the contribution per l to the variance at a point, and the quantity |$l(l+1)(\zeta ^E_l + \zeta ^B_l)/2\pi$| is roughly the contribution per log l to the variance at a point. Since the d functions and the power spectra are real, the ξ± are real and hence completely describe the correlation structure of the transverse velocity field. If in addition the transverse velocity is Gaussian distributed with zero mean, these correlation functions completely describe the statistics of the vector field. In Appendix B we compare the ξ± correlation functions with those introduced by Darling & Truebenbach (2018), finding that the two sets are linear combinations of each other. 2.4 The transverse velocity of large-scale structure Large-scale gravitational potentials induce departures from the Hubble flow in the motion of galaxies and dark matter haloes. Since the initial conditions for structure formation appear Gaussian to a close approximation, on large scales the velocity field of LSS also appears Gaussian due to the evolution being close to linear. On small scales the effects of non-linear evolution impart non-Gaussianity into the peculiar velocity field |$\boldsymbol {v}$|⁠, but the effects are quite weak for wavenumbers |$k \lesssim 0.5 \, h \, \mathrm{Mpc}^{-1}$| at z = 0 (Zheng et al. 2013). Unlike the matter overdensity δ, the peculiar velocity is not required to be greater than −1, and so non-Gaussianity is not a fundamental requirement when fluctuations in |$\boldsymbol {v}$| become large (Coles & Jones 1991). Gaussian fields are completely described by their mean and covariance, and since the former is zero by definition for peculiar velocity, the power spectrum or correlation function of |$\boldsymbol {v}$| is of central interest in studying cosmic flows. In particular, the proper motion of LSS can act as a contaminant for measurements of the Hubble constant from SP, and so quantifying its statistics is a key goal of this paper. The LSS proper motion is also potentially measurable in astrometric surveys (Nusser et al. 2012), offering the possibility to constrain cosmological parameters such as the dark energy equation of state from real-time cosmology. In this section, we compute the power spectrum and correlation functions of peculiar velocities, focusing on the low redshifts relevant for proper motion surveys. Some of this material has been presented in Hall & Challinor (2014) in the context of polarized emission from the kinetic Sunyaev–Zeldovich effect, but we repeat it here for completeness. On the large scales relevant for this work, the peculiar velocity field is, to a good approximation, curl-free (Percival & White 2009). As described in Appendix A, this implies that the transverse velocity is a pure E-mode, described by a single power spectrum |$\zeta ^E_l(r_1,r_2)$| for galaxies located at radii r1 and r2. In Fourier space we have |$\boldsymbol {v}(\boldsymbol {k}) = -i\hat{\boldsymbol {k}}v(\boldsymbol {k})$| where |$v(\boldsymbol {k})$| is the velocity potential. The real-space peculiar velocity is then \begin{eqnarray*} \boldsymbol {v}(\boldsymbol {r}) = \nabla _{\boldsymbol {r}} \int \frac{\mathrm{d}^3 \boldsymbol {k}}{(2\pi)^3} \, \frac{-v(\boldsymbol {k})}{k} e^{i\boldsymbol {k} \cdot \boldsymbol {r}}, \end{eqnarray*} (13) where ∇r is the 3D spatial gradient. Projecting on to the sphere with |$\boldsymbol {r} = r\hat{\boldsymbol {n}}$| and using that |$\nabla ^\perp _{\mathbf {r}} = (1/r)\nabla$| where ∇ is the angular covariant derivative, we find va = ∇aΩ with \begin{eqnarray*} \Omega (\hat{\boldsymbol {n}},r) = -\int \frac{\mathrm{d}^3 \boldsymbol {k}}{(2\pi)^3} \, \frac{v(\boldsymbol {k},r)}{kr} e^{ikr\hat{\boldsymbol {k}} \cdot \hat{\boldsymbol {n}}}. \end{eqnarray*} (14) Expanding the exponential in spherical harmonics and using the multipole expansion in equation (A8) allows us to read off the E-mode coefficients for the transverse velocity as \begin{eqnarray*} \epsilon _{lm}(r) = -4\pi i^l\sqrt{l(l+1)} \int \frac{\mathrm{d}^3 \boldsymbol {k}}{(2\pi)^3} \, v(\boldsymbol {k},r) \frac{j_l(kr)}{kr} Y^*_{lm}(\hat{\boldsymbol {k}}), \end{eqnarray*} (15) where jl is a spherical Bessel function. We now use the Newtonian linear continuity equation to write |$v(\boldsymbol {k},r) = -[\mathcal {H(\eta)}/k]f(\eta)\delta (\boldsymbol {k},r)$| where |$\mathcal {H}(\eta)$| is the comoving Hubble parameter at conformal time η = η0 − r, with η0 the current conformal time, and f(η) the growth rate of linear density fluctuations. We neglect any velocity bias, which has been shown to be very small on the relevant spatial scales here (Chen et al. 2018). Using the statistical isotropy and homogeneity of the density field allows us to compute the E-mode angular power spectrum as \begin{eqnarray*} \zeta ^E_l(r_1,r_2) &=& 4\pi l(l+1)\mathcal {H}(\eta _1)\mathcal {H}(\eta _2)f(\eta _1)f(\eta _2) \nonumber \\ &&\times \, \int \frac{k^2\mathrm{d}k}{2\pi ^2} \, \frac{P(k;\eta _1,\eta _2)}{k^2} \frac{j_l(kr_1)}{kr_1}\frac{j_l(kr_2)}{kr_2}, \end{eqnarray*} (16) where P(k; r1, r2) is the unequal time–matter power spectrum. The correlation functions ξ± may now be found using equation (10) with |$\zeta ^B_l = 0$|⁠. In Fig. 1 we plot the quantity |$l(l+1)\zeta ^E_l(r,r)/2\pi$|⁠, which is approximately the contribution to the LSS transverse velocity variance per log l. We use equation (16) with the linear theory matter power spectrum computed with camb (Lewis, Challinor & Lasenby 2000) and focus on |$r \lesssim 300 \, \mathrm{Mpc}/h$| in anticipation of the relevant distances for measuring SP. Fig. 1 also shows the noise power for our Gaia-like reference survey, which we introduce in Section 4. Figure 1. View largeDownload slide The angular power spectrum of LSS transverse velocities in linear theory, at (highest to lowest at l = 1) |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (blue solid), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (orange solid), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green solid), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red solid). We also plot the instrumental noise power for |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (lowest, blue dashed), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (middle, orange dashed), and |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (highest, green dashed). The correlation function at a given angular separation is found by summing over multipoles of these power spectra weighted by the Wigner d functions, as in equation (10). Note that BAOs become increasingly visible at greater distances. Structure at greater distance has power on smaller angular scales due to the roughly constant coherence length of the 3D velocity field. Figure 1. View largeDownload slide The angular power spectrum of LSS transverse velocities in linear theory, at (highest to lowest at l = 1) |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (blue solid), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (orange solid), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green solid), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red solid). We also plot the instrumental noise power for |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (lowest, blue dashed), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (middle, orange dashed), and |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (highest, green dashed). The correlation function at a given angular separation is found by summing over multipoles of these power spectra weighted by the Wigner d functions, as in equation (10). Note that BAOs become increasingly visible at greater distances. Structure at greater distance has power on smaller angular scales due to the roughly constant coherence length of the 3D velocity field. The linear variance of the transverse velocity field varies little over the distance range |$0 \lesssim r \lesssim 300 \, \mathrm{Mpc}\,h^{ -1}$|⁠, maintaining a roughly constant r.m.s. value of |$\sigma _{v_\perp } \approx 440 \, \mathrm{km} \, \mathrm{s}^{-1}$|⁠. The primary effect of increasing the distance in Fig. 1 is then to push the angular structure to smaller scales, keeping the area under each curve constant. When the radial distance is much less than the coherence scale of the velocity field, most of the power is at l = 1, which follows from projecting a constant velocity on to the sphere (see also Section 2.2). As we increase the radial distance, uncorrelated patches of the velocity field come into view which project to give enhanced angular structure, and the peak of the power spectrum shifts to smaller angular scales. We also see that baryon acoustic oscillations (BAOs) in the power spectrum become visible in projection for |$r \gtrsim 100 \, \mathrm{Mpc}\,h^{ -1}$|⁠. To study the radial dependence of the power spectrum more closely, in Fig. 2 we plot the power of the first few multipoles as a function of distance. The solid curves in this figure are the linear theory predictions, and demonstrate that the dipole power decreases strongly with distance, due to the transfer of power to smaller scales at roughly fixed total variance. Similarly, power in the higher multipoles initially increases with distance, but then falls as incoherent patches of transverse velocity become visible on the sky. Note that the proper motion due to LSS carries an additional factor of 1/r. Figure 2. View largeDownload slide The first few multipoles of the LSS transverse velocity power spectrum as a function of distance. Curves ordered bottom to top at |$r=300 \, \mathrm{Mpc}\,h^{ -1}$| are l = 1 (blue), l = 2 (orange) and l = 3 (green). We plot curves for both linear theory (solid) and a simplistic non-linear model which replaces the linear matter power spectrum with the halofit model of Takahashi et al. (2012) (dashed). Figure 2. View largeDownload slide The first few multipoles of the LSS transverse velocity power spectrum as a function of distance. Curves ordered bottom to top at |$r=300 \, \mathrm{Mpc}\,h^{ -1}$| are l = 1 (blue), l = 2 (orange) and l = 3 (green). We plot curves for both linear theory (solid) and a simplistic non-linear model which replaces the linear matter power spectrum with the halofit model of Takahashi et al. (2012) (dashed). The dashed curves in Fig. 2 are the power spectrum computed with the non-linear correction to the matter power spectrum from halofit (Takahashi et al. 2012). While this is not an accurate model for the non-linear velocity power spectrum as it does not account for non-linearity in the continuity equation, it gives us a rough idea of the sensitivity to non-linear scales. We model the non-linear unequal-time power spectrum as |$P(k; r_1, r_2) = \sqrt{P(k;r_1,r_1)P(k;r_2,r_2)}$|⁠, which holds in the linear regime, and apply the halofit correction to the equal-time power spectra. From the figure we see that the effects of non-linear evolution are only important for nearby objects and multipoles l ≳ 2, since a small spatial scale contributes to these large angular scales only when nearby. In particular, the dipole of LSS transverse velocity is very insensitive to changing the matter power spectrum on non-linear scales.5 This behaviour can be understood more readily in Fig. 3, where we plot the cumulative contribution to the l = 1 power spectrum as a function of the maximum wavenumber included in the integration in equation (16). At z ≈ 0, modifications to the matter power spectrum start to reach the per cent level around |$k \gtrsim 0.1 \, h\, \mathrm{Mpc}^{-1}$|⁠, but only for very nearby galaxies do these scales contribute to the dipole power, as evidenced by the convergence of the curves as kmax is increased. The forecasts in this paper include only galaxies at r ≳ 20 Mpc h−1, for which the contribution from non-linear scales to the dipole is negligible; the transverse velocity dipole from objects at r ≳ 20 Mpc h−1 is primarily sensitive to large, linear scales. This is increasingly true at greater distances due to angular projection. Figure 3. View largeDownload slide The cumulative contribution to the l = 1 multipole of the LSS transverse velocity power spectrum for increasing maximum wavenumbers kmax. We show curves for (top to bottom) |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (blue), |$r = 71 \, \mathrm{Mpc}\, h^{ -1}$| (orange), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red). The dipole receives more contribution from small spatial scales when the LSS tracers are closer, a consequence of angular projection. Figure 3. View largeDownload slide The cumulative contribution to the l = 1 multipole of the LSS transverse velocity power spectrum for increasing maximum wavenumbers kmax. We show curves for (top to bottom) |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (blue), |$r = 71 \, \mathrm{Mpc}\, h^{ -1}$| (orange), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red). The dipole receives more contribution from small spatial scales when the LSS tracers are closer, a consequence of angular projection. By studying the off-diagonal structure of |$\zeta ^E_l(r_1,r_2)$| we can study the coherence of the transverse velocity field. In Fig. 4 we plot the dimensionless correlation coefficient for the first few multipoles normalized at |$10 \, \mathrm{Mpc}\,h^{ -1}$|⁠, defined as |$\rho _{10}(r) = \zeta ^E_l(r_{10},r)/\sqrt{\zeta ^E_l(r_{10},r_{10})\zeta ^E_l(r,r)}$| with |$r_{10} = 10 \, \mathrm{Mpc}\,h^{ -1}$|⁠. The radial correlation length rco can then be defined for each l as the distance where ρ10 drops below a specified value, for example ρ10(rco) = 0.1. This figure shows that the correlation length is larger on larger angular scales, which makes intuitive sense since the angular power at lower l is primarily sensitive to larger spatial scales, as evidenced by the difference between the solid and dashed curves in Fig. 2. Summing over all multipoles we find the coherence length of the linear transverse velocity field at z ≈ 0 is roughly |$200 \, \mathrm{Mpc}\,h^{ -1}$|⁠, consistent with that of the 3D linear velocity field (Zheng et al. 2013). Figure 4. View largeDownload slide Dimensionless correlation coefficient of the LSS transverse velocity power spectrum between different radii, normalized at |$r = 10 \, \mathrm{Mpc}\,h^{ -1}$| (indicated by the dashed vertical line). The first few multipoles are shown as the solid curves at (top to bottom) l = 1 (blue), l = 2 (orange), and l = 3 (green). Higher multipoles decorrelate with distance faster than lower multipoles due to the smaller-scale structures typically probed by higher multipoles. Figure 4. View largeDownload slide Dimensionless correlation coefficient of the LSS transverse velocity power spectrum between different radii, normalized at |$r = 10 \, \mathrm{Mpc}\,h^{ -1}$| (indicated by the dashed vertical line). The first few multipoles are shown as the solid curves at (top to bottom) l = 1 (blue), l = 2 (orange), and l = 3 (green). Higher multipoles decorrelate with distance faster than lower multipoles due to the smaller-scale structures typically probed by higher multipoles. In Figs 5 and 6 we plot the ξ+ and ξ− correlation functions, respectively, as a function of angular separation, for different radial distances. The dependence on angular separation in ξ+ reflects the l-dependence of the power spectrum in Fig. 1; at low radii most of the signal is dipolar and so the transverse velocity is correlated over large angles. As we go to higher radii the power spectrum develops structure on smaller angular scales and most of the support of the correlation functions is for small β. The shape of the ξ− correlation functions can be explained by similar reasoning, albeit with the added complication that all curves must go to zero at vanishing separation due to the two components of transverse velocity being uncorrelated at a point. Figure 5. View largeDownload slide The ξ+ correlation function of LSS transverse velocities as a function of angular separation in radians. We show curves for (top to bottom) |$r = 25 \, \mathrm{Mpc}\, h^{ -1}$| (blue), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (orange), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red). The transverse velocity field develops greater angular structure as the distance is increased, consistent with Fig. 1. Figure 5. View largeDownload slide The ξ+ correlation function of LSS transverse velocities as a function of angular separation in radians. We show curves for (top to bottom) |$r = 25 \, \mathrm{Mpc}\, h^{ -1}$| (blue), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (orange), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red). The transverse velocity field develops greater angular structure as the distance is increased, consistent with Fig. 1. Figure 6. View largeDownload slide The ξ− correlation function of LSS transverse velocities as a function of angular separation in radians. We show curves for (bottom to top at β = π) |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (blue), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (orange), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red). The transverse velocity field develops greater angular structure as the distance is increased, consistent with Fig. 1. Figure 6. View largeDownload slide The ξ− correlation function of LSS transverse velocities as a function of angular separation in radians. We show curves for (bottom to top at β = π) |$r = 25 \, \mathrm{Mpc}\,h^{ -1}$| (blue), |$r = 71 \, \mathrm{Mpc}\,h^{ -1}$| (orange), |$r = 147 \, \mathrm{Mpc}\,h^{ -1}$| (green), and |$r = 300 \, \mathrm{Mpc}\,h^{ -1}$| (red). The transverse velocity field develops greater angular structure as the distance is increased, consistent with Fig. 1. Similar to the radial coherence distance, we can define a coherence angle as the angular separation at which the correlation function ξ+ falls below 10 per cent of its zero-lag value. We find that even at the largest distance we include in our forecasts (⁠|$r \sim 300 \, \mathrm{Mpc}\,h^{ -1}$|⁠) the angular coherence scale is large at roughly 20°. We close this section by noting that the LSS correlation functions may be computed more directly by using the explicit forms of the Wigner d functions and the identity between spherical Bessel functions given in equation (B7) to perform the sum over multipoles. We find that \begin{eqnarray*} &&{\xi _\pm (\beta _{12},r_1,r_2) = \int \frac{k^2\mathrm{d}k}{2\pi ^2} \, P_v(k;r_1,r_2)} \nonumber \\ &&{\quad \times \, \left[ (\pm 1 + \cos \beta _{12})\frac{j_1(kx)}{kx} - \sin ^2 \beta _{12} \frac{(kr_1)(kr_2)}{(kx)^2}j_2(kx) \right], } \end{eqnarray*} (17) where Pv(k; r1, r2) is the velocity potential power spectrum and |$x = (r_1^2 + r_2^2 - 2r_1r_2\cos \beta _{12})^{1/2}$| is the 3D distance between the two points. Taking the limit β12 → 0 with r1 = r2 gives |$\xi _+ \rightarrow 2\sigma _v^2/3$|⁠, i.e. the transverse velocity carries two thirds of the total velocity variance at a point, with the radial velocity carrying the remaining third, as required by isotropy. 3 LIKELIHOOD FORMALISM In this section we develop the likelihood formalism which will allow us to derive optimal estimators for various sources of extragalactic proper motion. We focus on proper motions from the SP due to the relative motion of the Solar System barycentre with respect to galaxies and quasars, SAD due to the time-varying aberration caused by the acceleration of the Solar System barycentre towards the galactic centre, and the intrinsic peculiar motion of galaxies and quasars due to LSS. 3.1 Uncertainty from noise and peculiar velocities Suppose we observe a set of N objects and measure the complex proper motion +d and −d (cf. equation 1) for each object. Define the data vector |$\boldsymbol {d} = ({}_+\boldsymbol {d}^\intercal , {}_-\boldsymbol {d}^\intercal)^\intercal$|⁠, where |${}_\pm \boldsymbol {d}$| are vectors containing the measurements from each object. Assuming the measurement noise is uncorrelated between objects, the noise covariance matrix is given by \begin{eqnarray*} \mathsf{N} &=& {\begin{pmatrix}\mathsf{N}_+ \quad \mathsf{N}_- \\ \mathsf{N}_-^\dagger \quad \mathsf{N}_+^* \end{pmatrix}}, \nonumber \\ &=& {\begin{pmatrix}\mathrm{diag}[\sigma _{\mu \delta }^{(i)2} + \sigma _{\mu \alpha ^*}^{(i)2}] \quad \mathrm{diag}[\sigma _{\mu \delta }^{(i)2} - \sigma _{\mu \alpha ^*}^{(i)2}] \\ \mathrm{diag}[\sigma _{\mu \delta }^{(i)2} - \sigma _{\mu \alpha ^*}^{(i)2}] \quad \mathrm{diag}[\sigma _{\mu \delta }^{(i)2} + \sigma _{\mu \alpha ^*}^{(i)2}] \end{pmatrix}}, \end{eqnarray*} (18) where |$\sigma _{\mu \delta }^{(i)2}$| and |$\sigma _{\mu \alpha ^*}^{(i)2}$| are the proper motion noise variances in the declination and corrected-right-ascension (i.e. true arc) directions, respectively, for the ith object (throughout this section, μ labels proper motions). The diagonal matrices in each block are N × N, with the full noise covariance matrix a 2N × 2N dimensional matrix. For isotropic noise we would have |$\sigma _{\mu \delta }^{(i)2} = \sigma _{\mu \alpha ^*}^{(i)2} \equiv \sigma _\mu ^{(i)2}$|⁠, meaning the covariance matrix is diagonal with |$\mathsf{N} = 2(\sigma _\mu ^{2(i)} \mathsf{I}_N) \oplus (\sigma _\mu ^{2(i)} \mathsf{I}_N)$|⁠, but in general these noise variances will be different due to the scanning strategy. The noise variance will also differ for each object due to differences in magnitude, colour, and other intrinsic galaxy or quasar properties. The mean of the data vector over the noise has contributions from the SP velocity, the SAD proper motion, and the peculiar velocity due to LSS. We further split the LSS peculiar velocity into a part correlated with the Solar System’s motion relative to the CMB (i.e. the SP velocity), and an uncorrelated part; a correlated part is expected since part of the SP velocity is due to the Milky Way’s velocity relative to the CMB, |$\boldsymbol {V}_g$|⁠, which is correlated with that of nearby galaxies due to the gravitational effects of LSS. We write |$\boldsymbol {v}^{\mathrm{LSS}} = \boldsymbol {v}_c^{\mathrm{LSS}} + \boldsymbol {v}_u^{\mathrm{LSS}}$|⁠, where the correlated part is given by \begin{eqnarray*} \boldsymbol {v}_c^{\mathrm{LSS}} = \left\langle \boldsymbol {v}_c^{\mathrm{LSS}} \boldsymbol {v}_c^{\intercal \mathrm{SP}} \right\rangle \left\langle \boldsymbol {v}_c^{\mathrm{SP}} \boldsymbol {v}_c^{\intercal \mathrm{SP}} \right\rangle ^{-1} \boldsymbol {v}_c^{\mathrm{SP}}, \end{eqnarray*} (19) where |$\boldsymbol {v}_c^{\mathrm{SP}}$| is the part of the SP velocity correlated with LSS. We take this to be (minus) the Milky Way–CMB relative velocity vector, whose magnitude is roughly |$550 \, \mathrm{km\,s^{ -1}}$| towards galactic coordinates (l, b) = (266.5°, 29.1°) (Kogut et al. 1993). By isotropy the quantity |$\langle \boldsymbol {v}_c^{\mathrm{SP}} \boldsymbol {v}_c^{\intercal \mathrm{SP}} \rangle$| must be proportional to δab which implies that |$\langle \boldsymbol {v}_c^{\mathrm{SP}} \boldsymbol {v}_c^{\intercal \mathrm{SP}} \rangle = (\sigma _{\mathrm{SP},c}^2/3) \delta _{ab}$| where |$\sigma _{\mathrm{SP},c}^2$| is the variance of the correlated part of the Solar System’s velocity, equal to the variance of the LSS velocity field |$\sigma _v^2$| at our location (and by statistical homogeneity, any location on the z = 0 time slice). The covariance of the correlated LSS velocity at spatial location |$(\hat{\boldsymbol {n}}, r)$| with the correlated SP velocity is given by |$\langle \boldsymbol {v}_c^{\mathrm{LSS}} \boldsymbol {v}_c^{\intercal \mathrm{SP}} \rangle = \langle \boldsymbol {v}(\hat{\boldsymbol {n}}, r) \boldsymbol {v}^\intercal (\boldsymbol {r} = 0) \rangle$|⁠. As shown in Gorski (1988), this may be written as |$\langle v_a(\hat{\boldsymbol {n}}, r) v_b(\boldsymbol {r} = 0) \rangle = \Psi _\perp (r) \delta _{ab} + \left[\Psi _\parallel (r) - \Psi _\perp (r) \right]\hat{n}_a\hat{n}_b$|⁠. After projecting transverse to the line-of-sight, we find the part of the complex proper motion due to LSS correlated with the local SP velocity is \begin{eqnarray*} {}_\pm \boldsymbol{\mu }_c^{\mathrm{LSS}} = -\frac{3}{\sigma _v^2}\Psi _\perp (r){}_\pm \boldsymbol{\mu }_c^{\mathrm{SP}}, \end{eqnarray*} (20) where (Gorski 1988) \begin{eqnarray*} \Psi _\perp (r) = \int \frac{k^2 \mathrm{d}k}{2\pi ^2} P_v(k;r) \frac{j_1(kr)}{kr}, \end{eqnarray*} (21) with Pv(k; r) the velocity power spectrum. Note that as r → 0 we have |$\Psi _\perp (r) \rightarrow \sigma ^2_v/3$|⁠, which implies that |${}_\pm \boldsymbol{\mu }_c^{\mathrm{LSS}} \rightarrow -{}_\pm \boldsymbol{\mu }_c^{\mathrm{SP}}$|⁠, i.e. the total observed proper motion averaged over galaxies tends to the part of the SP velocity uncorrelated with LSS, which we take to be the velocity of the Solar System relative to that of the Milky Way. In Fig. 7 we plot the quantity |$3\Psi _\perp (r)/\sigma ^2_v$| with both the linear velocity power spectrum (upper curve) and the velocity power spectrum using the halofit correction to the matter power spectrum but assuming the linear continuity equation (lower curve) to give a rough idea of the contribution from non-linear scales. From this figure we see that the perpendicular correlation function falls to 10 per cent of its zero-lag value by roughly |$200 \, \mathrm{Mpc}\,h^{ -1}$|⁠, and that the contribution from non-linear scales can be quite large for distances less than this. This quantity is also sensitive to the chosen cosmological parameters, which impact the velocity power spectrum through the growth factor, the Hubble parameter, and the shape of the matter spectrum. Figure 7. View largeDownload slide The perpendicular velocity correlation function Ψ⊥(r) of Gorski (1988) normalized by the 1D variance |$\sigma _v^2/3$|⁠, in linear theory (blue, upper curve), and with the halofit correction to the matter power spectrum (orange, lower curve). Figure 7. View largeDownload slide The perpendicular velocity correlation function Ψ⊥(r) of Gorski (1988) normalized by the 1D variance |$\sigma _v^2/3$|⁠, in linear theory (blue, upper curve), and with the halofit correction to the matter power spectrum (orange, lower curve). The remaining part of the peculiar velocity due to LSS is uncorrelated with our local velocity. This part is zero on average, with a covariance matrix |$\mathsf{C}_{\pm }^u \equiv \langle {}_+\boldsymbol{\mu }^{\mathrm{LSS}}_u {}_\mp \boldsymbol{\mu }^{\intercal \mathrm{LSS}}_u \rangle$| given by \begin{eqnarray*} C_{\pm ,ij}^u = C_{\pm ,ij}^{\mathrm{LSS}} - \frac{3\Psi _\perp (r_i)}{\sigma _v^2}\frac{3\Psi _\perp (r_j)}{\sigma _v^2} \left\langle {}_+\mu ^{\mathrm{SP}}_{c,i} {}_\mp \mu ^{\mathrm{SP}}_{c,j} \right\rangle , \end{eqnarray*} (22) where |$C_{\pm ,ij}^{\mathrm{LSS}}$| are elements of the full LSS complex proper motion covariance matrix. Since the local SP velocity is purely dipolar, the E-mode power spectrum of the uncorrelated LSS transverse velocity is given by \begin{eqnarray*} \zeta _l^{E,u}(r_1, r_2) = \zeta _l^E(r_1, r_2) - \frac{8\pi }{3}\frac{3\Psi _\perp (r_i)}{\sigma _v^2}\frac{3\Psi _\perp (r_j)}{\sigma _v^2} \frac{\sigma _v^2}{3}\delta _{l1}. \end{eqnarray*} (23) In the limit that r1 and r2 both tend to zero we have |$\zeta _l^{E,u} \rightarrow 0$|⁠, since in this limit the proper motion due to LSS is completely correlated with the velocity of the Milky Way. We now marginalize over the uncorrelated proper motion due to LSS, assuming that it obeys Gaussian statistics6 with covariance matrix |$\mathsf{C}_{\pm }^u$|⁠.7 Assuming Gaussian noise, the likelihood of the data vector becomes, up to an arbitrary constant, \begin{eqnarray*} &&{-2\ln p(\boldsymbol {d} | {}_\pm \boldsymbol{\mu }^{\mathrm{SP}} , {}_\pm \boldsymbol{\mu }^{\mathrm{SAD}}) = {\begin{pmatrix}_+\boldsymbol {d} - {}_{+}\boldsymbol{\mu }\\ {}_-\boldsymbol{d} - {}_{-}\boldsymbol{\mu } \end{pmatrix}}^\dagger}\nonumber \\ &&{\quad \times \,{\begin{pmatrix}\mathsf{N}_+ + \mathsf{C}_+^u& \mathsf{N}_- + \mathsf{C}_-^u\\ (\mathsf{N}_- + \mathsf{C}_-^u)^\dagger & (\mathsf{N}_+ + \mathsf{C}_+^u)^* \end{pmatrix}}^{-1} {\begin{pmatrix}_+\boldsymbol {d} - {}_{+}\boldsymbol{\mu }\\ \ {}_-\boldsymbol {d} - {}_{-}\boldsymbol{\mu } \end{pmatrix}}, } \end{eqnarray*} (24) where \begin{eqnarray*} {}_{\pm }\boldsymbol{\mu } \equiv {}_\pm \boldsymbol{\mu }^{\mathrm{SP}} + {}_\pm \boldsymbol{\mu }^{\mathrm{SAD}} + {}_\pm \boldsymbol{\mu }^{\mathrm{LSS}}_c, \end{eqnarray*} (25) with |${}_\pm \boldsymbol{\mu }^{\mathrm{LSS}}_c$| given by equation (20). Note that |$(\mathsf{N}_+ + \mathsf{C}_+^u) = (\mathsf{N}_+ + \mathsf{C}_+^u)^\dagger$| and |$(\mathsf{N}_- + \mathsf{C}_-^u) = (\mathsf{N}_- + \mathsf{C}_-^u)^\intercal$|⁠. Equation (24) is a Gaussian likelihood function for the observed proper motion, conditioned on the Solar System’s velocity relative to the CMB, its acceleration towards the galactic centre, and the Milky Way’s velocity relative to the CMB. This function contains all the information available under the assumption that both sources of noise (instrumental and cosmic variance) obey Gaussian statistics. In Fig. 8 we give a schematic diagram that illustrates the various contributions to the proper motion of each galaxy. Figure 8. View largeDownload slide Schematic diagram of the various sources of proper motion considered in this work in the CMB rest frame. The Sun is at the centre and has a velocity |$\boldsymbol {V}_{\odot }$| indicated by the dashed blue arrow. This induces the SP velocity vector |$\boldsymbol {V}_{\mathrm{SP}}$|⁠, indicated by the solid blue arrow, which gives rise to a dipolar proper motion in each surrounding galaxy; the solid blue vector should be added to each galaxy and projected perpendicular to |$\hat{\boldsymbol {n}}$| to find the SP proper motion. The SAD proper motion vector is indicated by the solid green arrow, which is directed towards the Milky Way centre (the large central galaxy); again, this solid vector should be added to each surrounding galaxy to find its SAD proper motion. In addition, each galaxy has a peculiar velocity, which consists of a random component |$\boldsymbol {v}^{\mathrm{LSS}}_u$| (black solid arrows) and a part |$\boldsymbol {v}^{\mathrm{LSS}}_c$| correlated with the motion of the Milky Way (red solid arrows). The Milky Way velocity |$\boldsymbol {V}_g$| is indicated by the red dashed arrow. All galaxies are here assumed to be at the same radial distance, indicated by the dashed circle, so the lengths of the red solid vectors are equal; in general they are given by equation (20). The total proper motion of each object is found by summing up the solid arrows centred on that object, as well as the two solid arrows centred on the observer. The average of these proper motions has a contribution proportional to |$\boldsymbol {V}_g$|⁠. Figure 8. View largeDownload slide Schematic diagram of the various sources of proper motion considered in this work in the CMB rest frame. The Sun is at the centre and has a velocity |$\boldsymbol {V}_{\odot }$| indicated by the dashed blue arrow. This induces the SP velocity vector |$\boldsymbol {V}_{\mathrm{SP}}$|⁠, indicated by the solid blue arrow, which gives rise to a dipolar proper motion in each surrounding galaxy; the solid blue vector should be added to each galaxy and projected perpendicular to |$\hat{\boldsymbol {n}}$| to find the SP proper motion. The SAD proper motion vector is indicated by the solid green arrow, which is directed towards the Milky Way centre (the large central galaxy); again, this solid vector should be added to each surrounding galaxy to find its SAD proper motion. In addition, each galaxy has a peculiar velocity, which consists of a random component |$\boldsymbol {v}^{\mathrm{LSS}}_u$| (black solid arrows) and a part |$\boldsymbol {v}^{\mathrm{LSS}}_c$| correlated with the motion of the Milky Way (red solid arrows). The Milky Way velocity |$\boldsymbol {V}_g$| is indicated by the red dashed arrow. All galaxies are here assumed to be at the same radial distance, indicated by the dashed circle, so the lengths of the red solid vectors are equal; in general they are given by equation (20). The total proper motion of each object is found by summing up the solid arrows centred on that object, as well as the two solid arrows centred on the observer. The average of these proper motions has a contribution proportional to |$\boldsymbol {V}_g$|⁠. 3.2 Optimal estimators for large-scale proper motion The SP and SAD complex proper motions of object i at radial distance ri are given by \begin{eqnarray*} {}_\pm \mu ^{\mathrm{SP}}_i = \boldsymbol {V}^{\mathrm{SP}} \cdot [\hat{\boldsymbol{\theta }}^{(i)} \pm i \hat{\boldsymbol{\phi }}^{(i)}]/r_i \equiv \left[\mathsf{S}^{(1)}_\pm \boldsymbol {V}^{\mathrm{SP}} \right]_i \end{eqnarray*} (26) \begin{eqnarray*} {}_\pm \mu ^{\mathrm{SAD}}_i = \boldsymbol{\mu }^{\mathrm{SAD}} \cdot [\hat{\boldsymbol{\theta }}^{(i)} \pm i \hat{\boldsymbol{\phi }}^{(i)}] \equiv \left[\boldsymbol{\sf S}^{(0)}_\pm \boldsymbol{\mu }^{\mathrm{SAD}} \right]_i, \end{eqnarray*} (27) where |$\boldsymbol {V}^{\mathrm{SP}}$| is the fixed 3D SP velocity vector, i.e. minus the Solar System barycentre’s velocity with respect to the CMB rest frame, |$\boldsymbol{\mu }^{\mathrm{SAD}}$| is a 3D ‘acceleration’ vector having units of proper motion and parallel to the vector connecting the observer to the galactic centre, and we have introduced the N × 3 matrix |$\mathsf{S}^{(n)}_\pm$| whose elements are given by |$S^{(n)}_{\pm ,ia} = (\hat{\theta }^{(i)}_a \pm i\hat{\phi }^{(i)}_a)/r_i^n$|⁠. Note that the SAD proper motion is independent of distance. Inserting these expressions into equation (24) we find, after some manipulation \begin{eqnarray*} -2\ln p(\boldsymbol {d} |\boldsymbol{\mu }^{\mathrm{SAD}}, \boldsymbol {V}^{\mathrm{SP}}) &=& {\begin{pmatrix}\Delta \boldsymbol{\mu }^{\mathrm{SAD}} \\ \Delta \boldsymbol {V}^{\mathrm{SP}} \end{pmatrix}}^\intercal {\begin{pmatrix}\mathsf {M}^{(0)} & \mathsf {M}^{(1)} \\ \mathsf{M}^{(1)\intercal } & \mathsf{M}^{(2)} \end{pmatrix}} \nonumber \\ &&\times \,{\begin{pmatrix}\Delta \boldsymbol{\mu }^{\mathrm{SAD}} \\ \Delta \boldsymbol {V}^{\mathrm{SP}} \end{pmatrix}}, \end{eqnarray*} (28) up to an arbitrary constant. Denoting the total covariance matrix |$\mathsf{C}_T \equiv \mathsf{N} + \mathsf{C}^u$|⁠, the 3 × 3 matrices |$\mathsf{M}^{(n)}$| are given by \begin{eqnarray*} \mathsf{M}^{(0)} &\equiv& 2\mathsf{S}_+^{(0)\dagger }(\mathsf{C}_T^{-1})_{(1,1)} \mathsf{S}_+^{(0)} + 2\mathrm{Re}[ \mathsf{S}_+^{(0)\dagger }(\mathsf{C}_T^{-1})_{(1,2)} \mathsf{S}_-^{(0)}]\nonumber ,\\ \mathsf{M}^{(1)} &\equiv& 2\mathsf{S}_+^{(0)\dagger }(\mathsf{C}_T^{-1})_{(1,1)} \mathsf{S}_+^{(1)} + 2\mathrm{Re}[ \mathsf{S}_+^{(0)\dagger }(\mathsf{C}_T^{-1})_{(1,2)} \mathsf{S}_-^{(1)}]\nonumber ,\\ \mathsf{M}^{(2)} &\equiv& 2\mathsf{S}_+^{(1)\dagger }(\mathsf{C}_T^{-1})_{(1,1)} \mathsf{S}_+^{(1)} + 2\mathrm{Re}[ \mathsf{S}_+^{(1)\dagger }(\mathsf{C}_T^{-1})_{(1,2)} \mathsf{S}_-^{(1)}], \end{eqnarray*} (29) where |$(\mathsf{C}_T^{-1})_{(1,1)}$| denotes the (+, +) block of the inverse covariance matrix, and similarly for the other blocks. In the limit of no cosmic variance from LSS, these matrices reduce to \begin{eqnarray*} \mathsf {M}^{(n)}_{ab}\bigr \vert _{\mathrm{\mathrm{LSS}} = 0} = \sum _{i=1}^N \left(\frac{\hat{\theta }^{(i)}_a\hat{\theta }^{(i)}_b}{\sigma _{\mu \delta }^{(i)2}} + \frac{\hat{\phi }^{(i)}_a\hat{\phi }^{(i)}_b}{\sigma _{\mu \alpha ^*}^{(i)2}} \right)r_i^{-n}. \end{eqnarray*} (30) The means are given by |$\Delta \boldsymbol{\mu }^{\mathrm{SAD}} = \boldsymbol{\mu }^{\mathrm{SAD}} - \hat{\boldsymbol{\mu }}^{\mathrm{SAD}}$| and |$\Delta \boldsymbol {V}^{\mathrm{SP}} = \boldsymbol {V}^{\mathrm{SP}} - \hat{\boldsymbol {V}}^{\mathrm{SP}}$|8 with \begin{eqnarray*} \hat{\boldsymbol {V}}^{\mathrm{SP}} {=} \left[\mathsf{M}^{(2)} - \mathsf{M}_S^{(1)}\mathsf{M}^{(0)-1}\mathsf{M}^{(1)}_S\right]^{-1}\left[\boldsymbol {L}^{(1)} {-} \mathsf{M}_S^{(1)}\mathsf{M}^{(0)-1}\boldsymbol {L}^{(0)}\right], \end{eqnarray*} (31) \begin{eqnarray*} \hat{\boldsymbol{\mu }}^{\mathrm{SAD}} = \left[\mathsf{M}^{(0)} - \mathsf{M}_S^{(1)}\mathsf{M}^{(2)-1}\mathsf{M}_S^{(1)}\right]^{-1}\left[\boldsymbol {L}^{(0)} - \mathsf{M}_S^{(1)}\mathsf{M}^{(2)-1}\boldsymbol {L}^{(1)}\right], \nonumber \\ \end{eqnarray*} (32) where |$\mathsf{M}^{(1)}_S$| is the symmetric part of |$\mathsf{M}^{(1)}$| and the three-vectors |$\boldsymbol {L}^{(n)}$| are given by \begin{eqnarray*} \boldsymbol {L}^{(n)} \!=\! 2\mathrm{Re}\left\lbrace\! \left[ \mathsf{S}_+^{(n)\dagger }\left(\mathsf{C}_T^{-1}\right)_{(1,1)} + \mathsf{S}_-^{(n)\dagger }\left(\mathsf{C}_T^{-1}\right)^*_{(1,2)} \right] \!\left({}_+\boldsymbol {d} - {}_+\boldsymbol{\mu }^{\mathrm{LSS}}_c\right)\!\right\rbrace . \nonumber \\ \end{eqnarray*} (33) In the limit of no cosmic variance, these vectors reduce to \begin{eqnarray*} \mathsf {L}^{(n)}_a\bigr \vert _{\mathrm{\mathrm{LSS}} = 0} = \sum _{i=1}^N \left(\frac{d_{\hat{\theta }}^{(i)}\hat{\theta }^{(i)}_a}{\sigma _{\mu \delta }^{(i)2}} + \frac{d_{\hat{\phi }}^{(i)}\hat{\phi }^{(i)}_a}{\sigma _{\mu \alpha ^*}^{(i)2}} \right)r_i^{-n}, \end{eqnarray*} (34) where |$d_{\hat{\theta }}$| and |$d_{\hat{\phi }}$| are the declination and right-ascension components of the measured proper motion respectively, with the correlated LSS part subtracted off. 3.3 Bias and variance of the proper motion estimators The estimators for the three vectors given in equations (31) and (32) are optimal estimators for SP velocity and SAD acceleration vectors. If the correlated parts of the LSS velocity were known perfectly, these estimators would be unbiased, with |$\langle \hat{\boldsymbol {V}}^{\mathrm{SP}} \rangle = \boldsymbol {V}^{\mathrm{SP}}$| and similarly for |$\boldsymbol{\mu }^{\mathrm{SAD}}$|⁠. After subtraction of the correlated LSS term, the vectors |$\boldsymbol {L}^{(n)}$| have the form of correlations of the inverse-variance weighted data vector with the spherical polar basis vectors, weighted with the appropriate factor of r, then summed over the N objects in the proper motion catalogue. In general, the correlated term |$\boldsymbol{\mu }^{\mathrm{LSS}}_c$| will not be known perfectly, due to the uncertain form of |$3\Psi _\perp (r)/\sigma _v^2$| from non-linearity (see Fig. 7) and uncertainty in the cosmological parameters. An alternative way of treating this term is to ignore it in the estimator and study the size of the resulting bias in the mean of |$\hat{\boldsymbol {V}}^{\mathrm{SP}}$| and |$\hat{\boldsymbol{\mu }}^{\mathrm{SAD}}$|⁠. In a similar vein, we could consider neglecting the cosmic variance due to uncorrelated LSS proper motion altogether, and just adopt the instrument-noise-only expressions of equations (30) and (34). This is necessary when the number of objects in the proper motion catalogue is large enough that the matrix inversion of |$\boldsymbol{C}_T$| (a non-diagonal matrix) is slow, but small enough that the large-N approximations made below are inaccurate. In this case, the uncorrelated LSS proper motion simply adds noise to the estimator. The covariance matrix between the SAP and SP estimators can be read off from equation (28), since both estimators are Gaussian distributed. Note that in the limit that |$\boldsymbol {V}^{\mathrm{SP}}$| is perfectly known, the optimal estimator and variance of |$\boldsymbol{\mu }^{\mathrm{SAD}}$| may be found by taking |$\mathsf{M}^{(2)} \rightarrow \infty$| at fixed |$\mathsf{M}^{(0)}$|⁠, and conversely if |$\boldsymbol{\mu }^{\mathrm{SAD}}$| is perfectly known we can take |$\mathsf{M}^{(0)} \rightarrow \infty$| at fixed |$\mathsf{M}^{(2)}$|⁠. Finally, it is straightforward to show that the covariance matrix of the |$\boldsymbol {L}$| vectors is given by \begin{eqnarray*} \langle \Delta \boldsymbol {L}^{(m)} \Delta \boldsymbol {L}^{(n)\intercal } \rangle = \mathsf{M}^{(m+n)}. \end{eqnarray*} (35) In the scenario where we treat cosmic variance as ‘extra noise’ and use the instrument-noise-only estimators, equation (35) receives additional contributions from uncorrelated LSS which are straightforward to compute in our formalism. The |$\mathsf{M}$| matrices and |$\boldsymbol {L}$| vectors are the fundamental building blocks which we must compute before we form the optimal estimators for the SP and SAD three vectors in equations (31) and (32). 3.4 Amplitude estimators and the Hubble constant The SP velocity is known a priori from the CMB dipole having a magnitude of |$(369 \pm 0.9) \, \mathrm{km\,s^{ -1}}$| towards galactic coordinates (l, b) = (263.99° ± 0.14°, 48.26° ± 0.03°) (Planck Collaboration XVI 2014). Likewise, the magnitude of the SAD proper motion is roughly |$4.3 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$| towards the galactic centre (Kopeikin & Makarov 2006; Titov et al. 2011). Fixing the SP velocity vector to that implied by the CMB dipole,9 we can attempt to constrain only the amplitude of the SP by parametrizing |$\boldsymbol{\mu }^{\mathrm{SP}} = -A_{\mathrm{SP}}\boldsymbol {V_0}/r$|⁠, where |$\boldsymbol {V_0}$| is the Solar System’s motion relative to the CMB and ASP has a fiducial value of unity. Similarly, we can attempt to constrain only the amplitude of the SAD proper motion by writing |$\boldsymbol{\mu }^{\mathrm{SAD}} = A_{\mathrm{SAD}}\boldsymbol{\mu _0}$|⁠, where ASAD has unit fiducial value. Alternatively, since the observable quantity is actually |$\boldsymbol{\mu }^{\mathrm{SP}} + \boldsymbol{\mu }_c^{\mathrm{LSS}}$| we could attempt to project against the direction of this quantity; we do not take this approach since this would require knowledge of |$3\Psi _\perp /\sigma _v^2$|⁠, which is contaminated by non-linearities and imperfect knowledge of the cosmological parameters. Instead we treat this extra term as a bias in ASP. Spectroscopic redshift information provides the quantity H0r,10 so the amplitude of the SP velocity is completely degenerate with the Hubble constant H0. Thus, forecast constraints on the amplitude ASP may be interpreted as fractional constraints on the Hubble constant (cf. Bachchan et al. 2016). Alternatively we may consider the Hubble constant as fixed, in which case the amplitude of the SP effect can be constrained directly. Optimal estimators for ASP and ASAD may be straightforwardly derived from equation (28). For example, fixing the SAD proper motion, the optimal estimator for ASP is \begin{eqnarray*} \hat{A}_{\mathrm{SP}} = \left(\boldsymbol {V}_0^\intercal \mathsf{M}^{(2)}\boldsymbol {V}_0\right)^{-1} \boldsymbol {V}_0^\intercal \boldsymbol {L}^{(1)}, \end{eqnarray*} (36) which has variance |$\sigma _{A_{\mathrm{SP}}}^2 = \left(\boldsymbol {V}_0^\intercal \mathsf{M}^{(2)}\boldsymbol {V}_0\right)^{-1}$|⁠. 3.5 Large N, isotropic noise limit In the limit where the cosmic variance from uncorrelated LSS proper motion is non-negligible, we must invert the N × N matrices |$(\mathsf{N}_\pm + \mathsf{C}_\pm ^u)$| to form the optimal estimators in equations (31) and (32). This is clearly impractical if N is large, as it will be for current astrometric surveys. Fortunately, an accurate approximation to this term exists when the instrumental noise is isotropic (which is approximately true for current astrometric surveys) and N is larger than a few hundred. Consider first the cosmic-variance-free form of the |$\mathsf{M}$| matrices given in equation (30). Setting |$\sigma _{\mu \delta }^{(i)2} = \sigma _{\mu \alpha ^*}^{(i)2} \equiv \sigma _\mu ^{(i)2}$| and approximating the sum over objects with an integral over angular and radial position, we find that \begin{eqnarray*} \mathsf {M}^{(n)}_{ab}\bigr \vert _{\mathrm{\mathrm{LSS}} = 0} \approx \frac{2N}{3} \left\langle \frac{1}{\sigma _\mu ^2 r^n} \right\rangle \delta _{ab}, \end{eqnarray*} (37) where the angle brackets denote an average over the catalogue, accounting for the potential correlations between noise variance and distance that one might expect in a flux-limited survey. Note that we have implicitly neglected any correlations between the distance and the angular positions, i.e. we assume a uniform depth across the survey.11 The Kronecker delta is clearly demanded by isotropy in this scenario. In this large N, isotropic noise (LNIN) limit, the variance of one component of the SP velocity after marginalizing over the SAD proper motion is, in the absence of cosmic variance, \begin{eqnarray*} \sigma _{v_x}^2\bigr \vert _{\mathrm{\mathrm{LSS}} = 0} = \frac{3}{2N} \left(\left\langle \sigma _\mu ^{-2} r^{-2} \right\rangle - \frac{\left\langle \sigma _\mu ^{-2} r^{-1} \right\rangle ^2}{\left\langle \sigma _\mu ^{-2} \right\rangle }\right)^{-1}. \end{eqnarray*} (38) Equation (38) clearly demonstrates the importance of having radial information to disentangle the effects of SP and SAD proper motion. These have the same angular structure (E-mode dipoles) and differ only in their radial dependence. In the scenario where all the objects in the catalogue are at the same radial distance, the variance on each component of the SP velocity tends to infinity. Similarly, the variance of the SP amplitude ASP in this scenario is simply given by |$\sigma _{A_{\mathrm{SP}}}^2 = \sigma _{v_x}^2/\vert \boldsymbol {V}_0 \vert ^2$|⁠. We can use the LNIN approximation to compute the inverse of the matrices |$(\mathsf{N}_\pm + \mathsf{C}_\pm ^u)$|⁠. Firstly, we will bin all the galaxies into Nr distance bins, and assume that the noise variance is constant within each bin. We then use equation (9) to write (taking care to correct for the phase factors) \begin{eqnarray*} (\mathsf{N}_\pm + \mathsf{C}_\pm ^u)_{ij} = \sum _{lm} \left[\boldsymbol{\Omega }_l^{(\pm)}\right]_{r_i,r_j} {}_1Y_{lm}(\hat{\boldsymbol {n}}_i){}_{\pm 1}Y_{lm}^*(\hat{\boldsymbol {n}}_j), \end{eqnarray*} (39) where we have defined the quantities \begin{eqnarray*} \left[\boldsymbol{\Omega }_l^{(+)}\right]_{r_i, r_j} &=& N(r_i)\delta _{r_i,r_j} + \zeta ^{E,u}_l(r_i, r_j), \nonumber \\ \left[\boldsymbol{\Omega }_l^{(-)}\right]_{r_i, r_j} &=& -\zeta ^{E,u}_l(r_i, r_j), \end{eqnarray*} (40) where N(ri) is the total noise power spectrum in the bin at distance ri (containing |$N_{r_i}$| objects) given by |$N(r_i) = 2\sigma _\mu ^2(r_i)\times 4\pi /N_{r_i}$|⁠. Note that the noise power consists of E and B modes in equal ratio. The minimum bin width should be the typical distance uncertainty caused by radial peculiar velocities (for spectroscopic redshifts), which is roughly |$\Delta r \approx 3 \, \mathrm{Mpc}\,h^{ -1}$| at z = 0. We use a maximum of Nr = 20 redshift bins with |$\Delta r \approx 15 \, \mathrm{Mpc}\,h^{ -1}$|⁠. Having separated the radial and angular dependence of the quantities in equation (39), we can use the completeness and orthogonality relations of the spin-weighted spherical harmonics (equations A11 and A12) to write \begin{eqnarray*} (\mathsf{N}_\pm + \mathsf{C}_\pm ^u)^{-1}_{ij} \approx \frac{4\pi }{N_{r_i}}\frac{4\pi }{N_{r_j}}\sum _{lm} \left[\boldsymbol{\Omega }_l^{(\pm)-1}\right]_{r_i,r_j}{}_{\pm 1}Y_{lm}(\hat{\boldsymbol {n}}_i){}_{1}Y_{lm}^*(\hat{\boldsymbol {n}}_j), \nonumber \\ \end{eqnarray*} (41) where we have assumed that the sum over angular indices can be converted to an integral, which allows the use of the orthogonality relation of the spin-weighted spherical harmonics. Using the result of equation (41) we can form the blocks of the full inverse covariance matrix and hence derive the |$\mathsf{M}$| matrices needed for inferring the SP and SAD proper motion. We find (upon use of the Woodbury matrix formula) \begin{eqnarray*} \mathsf {M}^{(0)}_{ab} &=& \delta _{ab} \frac{8\pi }{3} \sum _{r_i,r_j} \left(\mathsf{S}^{E,u}_1 + \mathsf{N}^E\right)^{-1}_{r_i,r_j}, \nonumber \\ \mathsf {M}^{(1)}_{ab} &=& \delta _{ab} \frac{8\pi }{3} \sum _{r_i,r_j} \frac{\left(\mathsf{S}^{E,u}_1 + \mathsf{N}^E\right)^{-1}_{r_i,r_j}}{r_j}, \nonumber \\ \mathsf {M}^{(2)}_{ab} &=& \delta _{ab} \frac{8\pi }{3} \sum _{r_i,r_j} \frac{\left(\mathsf{S}^{E,u}_1 + \mathsf{N}^E\right)^{-1}_{r_i,r_j}}{r_ir_j}, \end{eqnarray*} (42) where |$\mathsf{N}^E$| is the E-mode noise power, whose ij component is equal to |$\delta _{r_i r_j} 4\pi \sigma _\mu ^2(r_i)/N_{r_i}$|⁠, and we have packaged up the E-mode LSS power spectrum into a Nr × Nr matrix |$\boldsymbol{\sf S}_l^E$|⁠. Importantly, only the dipole of LSS transverse velocities contributes to the variance of the SP and SAD proper motions. This result makes intuitive sense – higher multipoles of the LSS transverse velocity field cannot be confused for the dipolar SP and SAD proper motions, and are orthogonal in the limit of large N due to isotropy, which is precisely the LNIN limit. The above construction is formally only valid in the situation where the noise variance is constant in each radial bin. In reality this is not the case, since there is a distribution of magnitudes and colours across the sky within each bin, which gives |$\sigma _v^2$| angular dependence. Given this situation, what is the correct ‘averaged’ noise variance which should be used for |$\mathsf{N}^E$|? Conditioning on the noise variances and galaxy positions was the approach adopted in the previous sections, but to simplify the inverse covariance matrix we had to artificially fix the variances such that the spin-weighted spherical harmonics carried all the angular dependence. Formally the correct thing to do would be to marginalize over the magnitudes and colours in each radial bin, but this would give a non-Gaussian likelihood due to the non-linear dependence of noise variance, magnitude, and colour, as well as the intrinsically non-Gaussian underlying distributions of these quantities. To simplify the situation we simply take the limit of equation (42) in which the cosmic variance is zero. In this limit, equation (42) reduces to equation (37) only if we use the average of the inverse noise variance over the radial bin. We henceforth adopt this choice, but investigate the consequences of making a different choice for the average noise variance in Appendix C. We can also find the LNIN form of the |$\boldsymbol {L}$| vectors. Expanding the data vector in spin-weighted spherical harmonics, we find that \begin{eqnarray*} \boldsymbol {L}^{(n)} = \sqrt{\frac{8\pi }{3}} \sum _{r_i,r_j} \frac{\left(\mathsf{S}^{E,u}_1 + \mathsf{N}^E\right)^{-1}_{r_i,r_j}}{r_i^n} \mathsf{B}^{-1}\boldsymbol {d}^E_m(r_j), \end{eqnarray*} (43) where |$\boldsymbol {d}^E_m = (d^E_{1-1}, d^E_{10}, d^E_{11})^\intercal$| is a vector of the dipolar E-mode multipole coefficients of the data, and |$\mathsf{B}$| is the unitary matrix which converts from dipolar multipole space (m = −1, 0, 1) to 3D Cartesian space (see the discussion in Section 2.2). Equation (43) demonstrates that the |$\boldsymbol {L}$| vectors are simply the E-mode dipolar coefficients of the data ‘rotated’ to real Cartesian space, inverse-variance weighted, distance weighted (depending on whether SAD or SP is being measured), and then averaged over all radial bins, which makes intuitive sense for Gaussian data. These vectors may then be further normalized and projected against known velocity or proper motion vectors to estimate the SP or SAD amplitudes. We note finally that the bias in the estimators due to correlated LSS noise may also be derived in the LNIN approximation, which offers some insights. For example, we find that the bias in ASP is proportional to cos ψ, where ψ is the angle between the Solar System–CMB relative velocity and the Milky-Way–CMB relative velocity. We find that ψ ≈ 20°. In the case of ASAD we find ψ ≈ 87°, which implies that the bias in the SAD proper motion from correlated LSS proper motions is strongly suppressed by the fortuitous perpendicular alignment of the acceleration vector and the Milky Way’s velocity vector relative to the CMB. Explicitly, the bias to the |$\boldsymbol {L}$| vectors is \begin{eqnarray*} \Delta \langle \boldsymbol {L}^{(n)} \rangle = \frac{8\pi }{3} \boldsymbol {v}^{\mathrm{SP}}_c \sum _{r_i,r_j} \frac{\left(\mathsf{S}^{E,u}_1 + \mathsf{N}^E\right)^{-1}_{r_i,r_j}}{r_i^nr_j} \frac{-3\Psi _\perp (r_j)}{\sigma _v^2}. \end{eqnarray*} (44) The LNIN approximations to the estimators can be computed rapidly, since they only involve averages over the radial distribution of galaxies and the inversion of low-dimensional matrices. The price paid for this is a restrictive assumption about the form of the noise variances. Where possible, these assumptions can be tested against the exact results presented in Section (3.2). We find that for the survey assumptions made in Section (4), the LNIN approximations predict variances, biases, and correlations to better than 10 per cent for the cosmic-variance-free estimator of equation (34) (for which the covariance matrix inversions can be done analytically). This is due to the large number of galaxies typically included in each radial bin, the smoothness of the signal with distance making the radial binning close to lossless, the instrument noise being close to isotropic, and the sky coverage being almost complete. 4 SURVEY ASSUMPTIONS In order to forecast the ability of astrometric surveys to detect large-scale extragalactic proper motions, we must make assumptions for the radial distributions of galaxies and quasars, as well as their proper motion noise variances. In order to test the LNIN approximations of Section 3.5, we also need a realistic angular distribution. Following Bachchan et al. (2016), as our reference astrometric survey we use the Gaia Universe Model Snapshot (GUMS; Robin et al. 2012), a simulated catalogue that includes galaxies (unresolved in their stars) and quasars with realistic spatial, magnitude, and colour distributions similar to those observable with Gaia. The catalogue contains roughly 1 million quasars and about 38 million galaxies, although most of these are unobservable due to their extended surface brightness profiles. We select galaxies and quasars from the catalogue with 0 ≤ z ≤ 0.1 but vary the distance ranges used in the analysis in order to gain insights into the signal-to-noise forecasts.12 The SP and LSS proper motions both fall as 1/r, but more distant objects can help break degeneracies with the SAD proper motion and suffer less from bias due to correlations of LSS peculiar velocities with our local motion. Gaia is only expected to observe galaxies with centrally concentrated brightness profiles and is unlikely to be sensitive to galaxies with a large disc component (de Souza et al. 2014). In particular, late-type spiral galaxies are unlikely to be present in Gaia data (de Bruijne et al. 2015). As in Bachchan et al. (2016), we select galaxies with Hubble type E2, E-S0, Sa and Sb. We add to these galaxies the small number of quasars that reside at z ≤ 0.1 in the GUMS catalogue, which results in a final catalogue of roughly 2 × 106 objects, roughly half of which are Sa or Sb spiral galaxies. The inclusion of Sa and Sb spirals in the catalogue is optimistic despite their significant bulge components (bulge-to-total flux ratios of roughly 0.5 and 0.3, respectively, de Bruijne et al. 2015), but since our aim in this work is primarily to test our correlation and likelihood formalism and study the optimal redshift ranges for measuring H0, this assumption is not critical and allows for comparison with Bachchan et al. (2016). Furthermore, cosmic variance from LSS transverse velocities is the dominant source of variance in our estimators, so we expect our final results to be fairly insensitive to the total number of galaxies in a fixed radial distribution. In Figs 9 and 10 we plot the redshift distribution and Gaia G-band magnitude distribution of the galaxies used in our forecasts. Most of the galaxies in our sample reside close to the redshift limit of the survey, but since the proper motion signals we are measuring fall as 1/r this upper limit is sufficient, as we shall see. Likewise, most of the galaxies have magnitudes close to the Gaia magnitude limit G = 20. Figure 9. View largeDownload slide Redshift distribution of the galaxies used in our forecasts taken from GUMS (see the text for details). We cut out galaxies having z > 0.1 as there is little SP signal at these distances. Figure 9. View largeDownload slide Redshift distribution of the galaxies used in our forecasts taken from GUMS (see the text for details). We cut out galaxies having z > 0.1 as there is little SP signal at these distances. Figure 10. View largeDownload slide Gaia G-band magnitude distribution of the galaxies used in our forecasts taken from GUMS (see the text for details). The magnitude limit for Gaia is roughly G = 20. Figure 10. View largeDownload slide Gaia G-band magnitude distribution of the galaxies used in our forecasts taken from GUMS (see the text for details). The magnitude limit for Gaia is roughly G = 20. In order to assign proper motion variances to each object in the catalogue, we assume that the Gaia end-of-mission 5-yr point-source specifications may be applied to galaxies (as in Nusser et al. 2012 and Bachchan et al. 2016). The accuracy of the proper motion measurements of extended objects made with Gaia is uncertain, and the use of point-source specifications is likely overoptimistic. For a circular Gaussian image with noise limited by photon counts (rather than the background), the error of the centroid is proportional to the angular diameter of the image (i.e. the only length scale in the problem, see e.g. Irwin 1985). For rectilinear proper motion, it is straightforward to show that the proper motion in a particular direction is also proportional to the size, with an extra factor t−1 where t is the total observation time (cf. the fact that proper motion errors scale as t−3/2 whereas centroid errors scale as t−1/2). For galaxies with a fixed physical diameter of 1 kpc, the smallest angular diameter in our sample is roughly 0.5 arcsecs, which occurs at the upper redshift limit of z = 0.1 (the minimum value at any redshift for our cosmology is 0.11 arcsecs, which occurs at z ≈ 1.6). The Gaia resolution is roughly 0.1 arcsecs (FWHM), so all the galaxies in our sample would appear as resolved objects if they had a fixed physical FWHM of 1 kpc. Assuming a circular PSF, the image size of the smallest galaxy in our sample is five times larger than that of a point-source, which for fixed apparent magnitude and integration time suggests the proper motion error could be at least five times larger for most of the galaxies in our sample. Of course, galaxies with sub-kpc bulges could well appear as point-sources with Gaia; Nusser et al. (2012) argue that the vast majority of early-type galaxies and most late-type galaxies will be observed as point-sources with Gaia.13 Additionally, galaxies with sharp features in their surface brightness profile will have inherently more precise astrometry; many of the low-redshift galaxies observed with Gaia will be well-resolved, and resolved structure in the image will improve the centroid determination. Since our primary aim here is to study the broad behaviour of proper motion inference in the presence of LSS and distance cuts rather than sensitivity to the precise details of the measurement pipeline, the assumption of point-source noise is justifiable. However, we caution the reader against the overinterpretation of our final forecasts; a more complete treatment would involve running the GUMS galaxies through a realistic Gaia detection pipeline, but this is beyond the scope of this work. Furthermore, for many of the applications of our catalogue the instrument noise is subdominant to the cosmic variance, and its underestimation will have only a small impact on results. We also neglect the small spatial correlations expected between proper motion errors in the final end-of-mission Gaia data (Holl, Hobbs & Lindegren 2010), although these could easily be included in our likelihood formalism (spatial correlations in proper motion errors are present in Gaia DR2 Lindegren et al. 2018). We use the formulae from the Gaia science performance page14 to convert each galaxy’s G-band magnitude and colour to a proper motion uncertainty. We consistently distinguish between uncertainty in the declination and right-ascension proper motion, except when adopting the LNIN approximations (the component uncertainties differ by roughly 10 per cent). In Fig. 11 we plot the distribution of mean proper motion errors from the GUMS galaxies. The distribution is truncated at roughly |$330 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$| corresponding to the magnitude limit of the survey and has a broad peak at roughly |$50 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠. Barely visible in Fig. 11 is a small spike at |$\sigma _{\mu } \approx 4 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠, the minimum proper motion uncertainty, which is due to the bright-star parallax noise floor. We remind the reader that the amplitude of the SAD proper motion is roughly |$4.3 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$| and correlated over large-scales, while the SP proper motion is roughly |$\frac{78}{r / 1 \, \mathrm{Mpc}} \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠. These estimates suggest that both SP and SAD proper motions could be measurable in our galaxy sample and assumed proper motion variances. Figure 11. View largeDownload slide Distribution of the mean proper motion error for galaxies used in our forecasts (see the text for details). These numbers were computed using the Gaia end-of-mission 5-yr point-source specifications and the magnitudes and colours produced by the GUMS galaxy catalogue. The sharp cutoff at large error values is due to the magnitude limit of the survey, while the small spike at |$\sigma _{\mu } \approx 4 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$| is due to the bright-star parallax noise floor. The remaining spikes are due to the choice of binning. Figure 11. View largeDownload slide Distribution of the mean proper motion error for galaxies used in our forecasts (see the text for details). These numbers were computed using the Gaia end-of-mission 5-yr point-source specifications and the magnitudes and colours produced by the GUMS galaxy catalogue. The sharp cutoff at large error values is due to the magnitude limit of the survey, while the small spike at |$\sigma _{\mu } \approx 4 \, \mu \mathrm{as} \, \mathrm{yr}^{-1}$| is due to the bright-star parallax noise floor. The remaining spikes are due to the choice of binning. In order to convert proper motions into transverse velocities, we assume that spectroscopic redshifts are available for every galaxy in our catalogue. This is a somewhat optimistic assumption given that we include all galaxies down to G = 20. Gaia will provide medium-resolution spectroscopy for objects down to G = 17 (de Souza et al. 2014), which constitutes roughly 15 per cent of our total sample. We will therefore have to rely on existing spectroscopic samples for the remaining objects – for example SDSS should provide substantial overlap in the north due its greater depth, and the upcoming DESI survey will improve on this further. We have not quantified the likely overlap with existing and planned surveys, but argue that this work provides further motivation for obtaining accurate redshifts with wide spectroscopic galaxy surveys. It should also be noted that since cosmic variance rather than instrumental noise limits the Hubble constant measurement, our forecasts for H0 should be fairly robust to a modest reduction in the galaxy sample size. Measurement of the transverse velocity of LSS will be more affected by incomplete redshift coverage, so we again caution the reader against overinterpretation of these forecasts. Having established our galaxy sample, we now use the expressions derived in 3 to forecast the detection significance of the SP and SAD proper motions (the former proportional to the H0), properly accounting for bias and variance from LSS transverse velocities. While the reference survey is a Gaia-like astrometric survey, the formalism we have presented can be applied to any proper motion measurements. For example, more precise astrometry is expected from the ngVLA (Darling et al. 2018) and post-Gaia optical and near-infrared missions (Hobbs et al. 2016). 5 SIGNAL-TO-NOISE FORECASTS Using the catalogue of galaxies and quasars presented in Section 4, we compute the variances and biases of the SP proper motion under a range of varying assumptions. We consider the sensitivity of the forecasts to varying the maximum redshift in the catalogue zmax, the minimum distance rmin, and to the marginalization or fixing of the SAD proper motion. The SAD signal is independent of distance and so could be well-constrained by a sample of high-redshift quasars, so fixing it is a reasonable approach for this analysis. We also compute the extra variance from LSS proper motions when using the cosmic-variance-free proper motion estimator of equation (34) (labelled as ‘Including LSS’ in the figures), and the variance when marginalizing over these proper motions (labelled as ‘Marginalized LSS’ in the figures), and study the impact of fixing the direction of the SP proper motion to that inferred from the CMB dipole (Planck Collaboration XVI 2014). We also compute the bias to the SP proper motion from LSS transverse velocities correlated with our local motion, as described in Section 3.3. As mentioned in Section 3.5, the LNIN approximations were found to be good to better than 10 per cent for all forecasts made with the cosmic-variance-free estimator, so we adopt the LNIN approximation throughout this section. This allows rapid calculation of the inverse covariance matrices needed to form of the optimal estimators presented in Section 3. 5.1 Varying zmax In this section we study the impact on our forecasts of varying the maximum redshift of the catalogue, zmax, fixing the minimum distance to be |$r_{\mathrm{min}} = 20 \, \mathrm{Mpc}\,h^{ -1}$|⁠. In Fig. 12 we plot the forecast error (square-root of the estimator variance) on one component of the SP velocity (equal to the errors in the other components in the LNIN approximation). We plot the error assuming no cosmic variance (blue curves), the error including cosmic variance from LSS proper motions uncorrelated with our local velocity (orange curves), and the error after marginalizing over these LSS proper motions (green curves). The solid and dashed curves are with and without marginalization over the SAD proper motion, and we have assumed H0 is fixed such that spectroscopic redshifts directly give the distance (accounting for the radial velocity uncertainty in the broad binning, see the footnote in Section 3.5). We also plot the r.m.s. value of the SP velocity as inferred from the CMB dipole (black dashed line). Figure 12. View largeDownload slide Forecast error on one component of the SP velocity as a function of the highest redshift included in the survey, for |$r_{\mathrm{min}} = 20 \, \mathrm{Mpc}\,h^{ -1}$|⁠. The curves are forecasts setting LSS to zero (blue, lowest curve), including LSS without marginalization (orange, highest curve), and with marginalization over LSS (green, middle curve). The solid and dashed curves are with and without marginalization over the SAD proper motion, respectively. The black dashed horizontal line shows the r.m.s. value from the CMB dipole. Figure 12. View largeDownload slide Forecast error on one component of the SP velocity as a function of the highest redshift included in the survey, for |$r_{\mathrm{min}} = 20 \, \mathrm{Mpc}\,h^{ -1}$|⁠. The curves are forecasts setting LSS to zero (blue, lowest curve), including LSS without marginalization (orange, highest curve), and with marginalization over LSS (green, middle curve). The solid and dashed curves are with and without marginalization over the SAD proper motion, respectively. The black dashed horizontal line shows the r.m.s. value from the CMB dipole. As expected, all the errors decrease as the maximum redshift is raised since more galaxies are included in the estimators, but the improvement is only marginal since the signal falls as 1/r. The variance from LSS dominates over that from instrumental noise and can only be mildly reduced by marginalization. With LSS marginalization and zmax = 0.1 we find |$\sigma _{v_x} \approx 130 \, \mathrm{km}\,\mathrm{ s}^{ -1}$| for marginalized SAD and |$\sigma _{v_x} \approx 85 \, \mathrm{km}\,\mathrm{ s^{ -1}}$| for fixed SAD, compared to the typical signal amplitude of |$213 \, \mathrm{km}\,\mathrm{ s^{ -1}}$|⁠, which represent 1.6σ (marginalized SAD) and 2.5σ (fixed SAD) measurements. With only variance from instrumental noise this would be boosted to roughly 8σ, which shows the importance of including peculiar velocities from LSS in these forecasts. Our forecast errors are similar to (albeit larger than) those of Bachchan et al. (2016), who found |$\sigma _{v_x} \approx 100 \, \mathrm{km}\,\mathrm{ s}^{ -1}$| with zmax ≈ 0.03 and simultaneous fitting of SAD (equivalent to our solid curves). This difference could be explained by the neglect of correlations between LSS proper motions in Bachchan et al. (2016). Most of the signal-to-noise has converged by zmax ≈ 0.04, with mild dependence on whether or not SAD is marginalized over. Marginalizing over SAD increases the error in all cases (as expected), with constraints degrading by a factor of roughly 50 per cent for zmax = 0.1. When zmax is small this degradation is significantly larger for the cosmic-variance-free estimator, as a smaller range of distances is available to break degeneracies with SP. When including LSS variance the degradation has a much more mild dependence on zmax. The optimal estimator is that with uncorrelated LSS transverse velocities marginalized out. In Fig. 13 we plot the bias from correlated LSS velocities (fixing or marginalizing over SAD) as a function of zmax, for a single component of the SP velocity estimator.15 For all values of zmax considered, the bias in the estimator exceeds its forecast error, and in some cases the bias can be more than twice the 1σ error. This bias originates from the inclusion of galaxies at low distances whose LSS proper motions are strongly correlated with that of the Milky Way (cf. Fig. 7), but tends to decrease as more high-redshift objects (which individually have low bias) are included in the catalogue due to the increasing |$\mathsf{M}$| normalization of the |$\boldsymbol {L}$| vector. This also explains why the bias is greater for the marginalized-SAD estimator, as this has a greater weighting due to the smaller effective |$\mathsf{M}$| which results from projecting out the SAD proper motion. Fig. 13 suggests that for this choice of rmin, the bias from correlated LSS will have to be mitigated if the potential SP detection suggested in Fig. 12 can be realized. Figure 13. View largeDownload slide The bias of a single component of the SP velocity from LSS proper motions correlated with the Solar System’s local motion (see Section 3.3 for a discussion). We plot biases for the estimator with uncorrelated LSS marginalized out, and for both fixed SAD (orange, upper solid) and marginalized SAD (blue, lower solid). We also plot the (negative) error of this estimator (dashed curves). For all values of zmax considered, the bias in the estimator exceeds its forecast error. The horizontal dashed line shows the r.m.s. SP velocity from the CMB dipole. Figure 13. View largeDownload slide The bias of a single component of the SP velocity from LSS proper motions correlated with the Solar System’s local motion (see Section 3.3 for a discussion). We plot biases for the estimator with uncorrelated LSS marginalized out, and for both fixed SAD (orange, upper solid) and marginalized SAD (blue, lower solid). We also plot the (negative) error of this estimator (dashed curves). For all values of zmax considered, the bias in the estimator exceeds its forecast error. The horizontal dashed line shows the r.m.s. SP velocity from the CMB dipole. In Fig. 14 we plot the forecast error on the amplitude of the SP velocity, with the direction fixed to that of the CMB dipole. Fixing the amplitude this can be reinterpreted as the forecast fractional constraint on H0, as described in Section 3.4. The behaviour in this plot is similar to that in Fig. 12 with a greater detection significance upon fixing the SP direction. Fixing the SAD proper motion (dot-dashed curves) yields a detection significance of roughly 4σ (i.e. a 25 per cent measurement of H0), whereas marginalizing over it (solid curves) gives roughly 2.7σ (a 40 per cent measurement of H0). Fixing the SAD direction to the direction of the galactic centre gives almost identical results to fixing both amplitude and direction, suggesting most of the uncertainty from SAD comes from constraining its direction rather than its amplitude. This is due to the fact that the Solar System’s acceleration vector and the CMB dipole are roughly orthogonal, subtending an angle of about 94°, which results in no confusion between the SP and SAD signals when their directions are fixed. Once again our forecast error on H0 is larger than that of Bachchan et al. (2016), who forecast an error of roughly 30 per cent on H0 for zmax = 0.03, compared to our roughly 40 per cent prediction. This discrepancy is likely a result of our inclusion of correlated velocities due to LSS. Figure 14. View largeDownload slide Forecast error on the amplitude of the SP velocity assuming the direction is fixed to that measured by the CMB dipole. The curves are forecasts setting LSS to zero (blue, lowest curves), including LSS without marginalization (orange, highest curves), and with marginalization over LSS (green, middle curves). The solid curves have had both the amplitude and direction of the SAD proper motion marginalized over, dashed curves have had only the amplitude of the SAD proper motion marginalized over with the direction kept fixed, and dot-dashed curves (indistinguishable from the dashed curves) have had both the amplitude and direction of the SAD proper motion kept fixed. The error on ASP can be interpreted as the forecast fractional error on H0. Figure 14. View largeDownload slide Forecast error on the amplitude of the SP velocity assuming the direction is fixed to that measured by the CMB dipole. The curves are forecasts setting LSS to zero (blue, lowest curves), including LSS without marginalization (orange, highest curves), and with marginalization over LSS (green, middle curves). The solid curves have had both the amplitude and direction of the SAD proper motion marginalized over, dashed curves have had only the amplitude of the SAD proper motion marginalized over with the direction kept fixed, and dot-dashed curves (indistinguishable from the dashed curves) have had both the amplitude and direction of the SAD proper motion kept fixed. The error on ASP can be interpreted as the forecast fractional error on H0. In Fig. 15 we plot the bias to ASP from correlated LSS transverse velocities. The shape of these curves is similar to those of the SP velocity components in Fig. 13, with the sign of the bias negative due to the angle between the total SP velocity and the part correlated between LSS being less than 90° and the negative sign in equation (20). Once again we see that if unaccounted for, the bias from LSS transverse velocities correlated to our local motion would overwhelm the forecast error, by up to a factor of two, when |$r_{\mathrm{min}} = 20 \, \mathrm{Mpc}\,h^{ -1}$|⁠. Figure 15. View largeDownload slide The bias of the amplitude of the SP velocity assuming the direction is fixed to that measured by the CMB dipole from LSS proper motions correlated with the Solar System’s local motion (see Section 3.3 for a discussion). We plot biases for the estimator with uncorrelated LSS marginalized out, and for both fixed SAD (orange, upper solid) and marginalized SAD (blue, lower solid). We also plot the (negative) error of this estimator (dashed curves). For all values of zmax considered, the bias in the estimator exceeds its forecast error. Figure 15. View largeDownload slide The bias of the amplitude of the SP velocity assuming the direction is fixed to that measured by the CMB dipole from LSS proper motions correlated with the Solar System’s local motion (see Section 3.3 for a discussion). We plot biases for the estimator with uncorrelated LSS marginalized out, and for both fixed SAD (orange, upper solid) and marginalized SAD (blue, lower solid). We also plot the (negative) error of this estimator (dashed curves). For all values of zmax considered, the bias in the estimator exceeds its forecast error. 5.2 Varying rmin In this section we study the impact on our forecasts of varying the minimum comoving distance of the catalogue, rmin, fixing the maximum redshift to be zmax = 0.1. We focus on only ASP; the qualitative conclusions are very similar for the individual components of the SP velocity vector. In Fig. 16 we plot the forecast errors assuming the SP direction is fixed by the CMB dipole, similar to Fig. 14. As expected, as we throw away galaxies for which the signal (proportional to 1/r) is high, the variance increases. Steep increases are seen when the SAD proper motion is projected out (solid curves), since in this scenario distance information is essential to break degeneracies. Significant detections of ASP, or H0 if the amplitude is fixed from the CMB dipole, are possible, although we require |$r_{\mathrm{min}} \lesssim 200 \, \mathrm{Mpc}$| if marginalizing over SAD to get a 1σ measurement. Figure 16. View largeDownload slide Same as Fig. 14 but varying rmin. The curves are forecasts setting LSS to zero (blue, lowest curve), including LSS without marginalization (orange, highest curve), and with marginalization over LSS (green, middle curve). The solid curves have had both the amplitude and direction of the SAD proper motion marginalized over, dashed curves have had only the amplitude of the SAD proper motion marginalized over with the direction kept fixed, and dot-dashed curves (indistinguishable from the dashed curves) have had both the amplitude and direction of the SAD proper motion kept fixed. The dashed horizontal line denotes the fiducial value of ASP = 1. The error on ASP can be interpreted as the forecast fractional error on H0. Figure 16. View largeDownload slide Same as Fig. 14 but varying rmin. The curves are forecasts setting LSS to zero (blue, lowest curve), including LSS without marginalization (orange, highest curve), and with marginalization over LSS (green, middle curve). The solid curves have had both the amplitude and direction of the SAD proper motion marginalized over, dashed curves have had only the amplitude of the SAD proper motion marginalized over with the direction kept fixed, and dot-dashed curves (indistinguishable from the dashed curves) have had both the amplitude and direction of the SAD proper motion kept fixed. The dashed horizontal line denotes the fiducial value of ASP = 1. The error on ASP can be interpreted as the forecast fractional error on H0. When fixing SAD and increasing rmin from its lowest value, we find that the total variance in the cosmic-variance-free estimator (orange dashed curve) decreases as the dipole of the uncorrelated LSS transverse velocity decreases due to projection (see Section 2.4) before eventually increasing due to the loss of high signal-to-noise galaxies. This behaviour is not present in the cosmic-variance-marginalized estimator (green dashed curve), which correctly accounts for this extra variance by including it in the inverse covariance matrix. In Fig. 17 we plot the bias in ASP from correlated LSS transverse velocities. The lowest redshift galaxies contribute most to this bias, as these are most correlated with the Solar System’s velocity (Fig. 7). Omitting these galaxies from the analysis reduces the bias, bringing it below the 1σ error for |$r_{\mathrm{min}} \gtrsim 100 \, \mathrm{Mpc}$| for marginalized SAD and |$r_{\mathrm{min}} \gtrsim 50 \, \mathrm{Mpc}$| for fixed SAD. Restricting to values of rmin for which this bias is less than the error, we forecast that ASP can be measured at roughly 2σ for marginalized SAD and 4σ for fixed SAD, corresponding to 50 and 25 per cent measurements of the Hubble constant, respectively. This bias could be reduced by modelling it (see Section 3.3) or by attempting to constrain the amplitude of the SP component orthogonal to the part correlated with LSS. However we find that this latter approach entails the loss of roughly 50 per cent of the signal-to-noise (which follows from the angles subtended by the relevant velocity vectors). Figure 17. View largeDownload slide The bias of the amplitude of the SP velocity assuming the direction is fixed to that measured by the CMB dipole from LSS proper motions correlated with the Solar System’s local motion (see Section 3.3 for a discussion). We plot biases for the estimator with uncorrelated LSS marginalized out, and for both fixed SAD (orange, upper solid) and marginalized SAD (blue, lower solid). We also plot the (negative) error of this estimator (dashed curves). Figure 17. View largeDownload slide The bias of the amplitude of the SP velocity assuming the direction is fixed to that measured by the CMB dipole from LSS proper motions correlated with the Solar System’s local motion (see Section 3.3 for a discussion). We plot biases for the estimator with uncorrelated LSS marginalized out, and for both fixed SAD (orange, upper solid) and marginalized SAD (blue, lower solid). We also plot the (negative) error of this estimator (dashed curves). In the case of the individual SP velocity components we reach similar conclusions, with galaxies below |$100$| and |$50 \, \mathrm{Mpc}$| requiring omission if SAD is marginalized or fixed, respectively. This limits us to a 1–2σ measurement of the SP velocity in a single component. Thus, cutting out low-redshift galaxies from the catalogue can bring the bias down to acceptable levels, but a significant detection of the SP velocity requires prior knowledge of its direction from the CMB dipole. Thus, it appears that cutting out galaxies to mitigate the bias due to correlated LSS transverse velocities limits us to at best a 25 per cent measurement of H0 with our survey specifications. For fixed SAD, knowing the bias helps little, since the saturation of the error at low values of rmin limits the gains of reducing the minimum distance (as does the uncertainty in distances due to radial velocities in the redshift). It thus seems that competitive measurements of H0 will require going to redshifts much higher than z = 0.1 in order to reduce cosmic variance (see Fig. 12). Alternatively, combining with radial velocity measurements (with redshift-independent distance measures) could potentially reduce the cosmic variance, since both components trace the same underlying dark matter. We caution against overinterpretation of these results however since several simplifying assumptions have been made, in particular the assumptions that go into the galaxy distributions and noise variances (see Section 4). Nevertheless, our analysis has highlighted the importance of cosmic variance in measurements of H0 with the proper motions of nearby galaxies. 6 MEASURING THE PROPER MOTION OF LARGE-SCALE STRUCTURE So far we have focused on the possibility of measuring dipolar patterns in the proper motion of extragalactic objects, identifying sources of bias and variance which make this measurement difficult. Alternatively, as in Nusser et al. (2012), Darling & Truebenbach (2018), and Truebenbach & Darling (2018), we could try to measure anisotropies in the proper motion on smaller angular scales. The l ≥ 2 modes of the proper motion field are sourced entirely by the transverse velocities of galaxies and quasars induced by LSS, and do not suffer from contamination from local effects such as SP or SAD. Fig. 1 suggests all modes with l ≲ 6 will be signal-dominated for our lowest redshift bin, which raises hopes that a measurement of LSS transverse velocities might be possible with our catalogue. With spectroscopic redshift information, such a measurement would be sensitive to the amplitude of the proper motion correlation function, specifically the parameter combination H0fσ8. In this section we forecast the ability of our galaxy catalogue to measure the higher-order l-modes of the proper motion. As a first step, we will simply attempt to constrain the amplitude of the part of the LSS transverse velocity uncorrelated with our local motion. Recall that this correlation only affects the l = 1 mode, so this procedure retains the amplitude information in the higher-order modes. We parametrize the uncorrelated LSS covariance matrix as \begin{eqnarray*} \mathsf{C}^u_\pm = A_{\mathrm{LSS}}^2 \mathsf{C}^u_{\pm ,0}, \end{eqnarray*} (45) where |$\boldsymbol{C}^u_{\pm ,0}$| is a fixed fiducial model, and the fiducial value of ALSS is unity. Constraints on ALSS are equivalent to fractional constraints on H0fσ8. Note that the transverse velocity from LSS carries the same radial dependence as the SP velocity (1/r). Following Bond, Jaffe & Knox (1998) and Hall & Challinor (2014), we can derive the first iteration of the Newton–Raphson solution of the maximum-likelihood estimator for ALSS assuming the initial value of ALSS = 1 and replacing the second-derivative of the log-likelihood with its ensemble average, the Fisher information. This gives \begin{eqnarray*} \hat{A}_{\mathrm{LSS}} = 1 + \frac{1}{F_{AA}} \left\lbrace \Delta \boldsymbol {d}^\dagger \mathsf{C}_T^{-1} \mathsf{C}^u \mathsf{C}_T^{-1} \Delta \boldsymbol {d} - \mathrm{Tr}\left[\mathsf{C}_T^{-1} \mathsf{C}^u\right] \right\rbrace , \end{eqnarray*} (46) where the data vector |$\Delta \boldsymbol {d}$| has potentially had the SAD, SP, and correlated LSS proper motions subtracted if we include the l = 1 mode. This estimator is unbiased and has variance given by |$\sigma _{A_{\mathrm{LSS}}}^2 = F_{AA}^{-1}$|⁠, where the Fisher information is \begin{eqnarray*} F_{AA} = 2 \mathrm{Tr} \left[\mathsf{C}_T^{-1} \mathsf{C}^u \mathsf{C}_T^{-1} \mathsf{C}^u \right]. \end{eqnarray*} (47) The estimator in equation (46) is not the maximum-likelihood estimator, but is optimal in the sense that it has minimum variance, saturating the Cramér–Rao bound. In the cosmic-variance limit we have simply |$\sigma _{A_{\mathrm{LSS}}} = (4N)^{-1/2}$|⁠, which is tiny for our catalogue. In the LNIN approximation, the Fisher information becomes \begin{eqnarray*} F_{AA} &=& 4 \sum _l \frac{(2l+1)}{2}\mathrm{Tr} \nonumber \\ &&\times \,\left[(\mathsf{S}^{E,u}_l + \mathsf{N}^E)^{-1}\mathsf{S}^{E,u}_l(\mathsf{S}^{E,u}_l + \mathsf{N}^E)^{-1}\mathsf{S}^{E,u}_l \right], \end{eqnarray*} (48) which is recognizable from the standard form of the Fisher matrix presented in, e.g. Tegmark, Taylor & Heavens (1997). In the LNIN approximation the estimator in equation (46) becomes \begin{eqnarray*} \hat{A}_{\mathrm{LSS}}\! &=& \!1 + \frac{1}{F_{AA}}\left[ \sum _l (2l+1) \times \, \mathrm{Tr}\right. \nonumber \\ &&\left. \times \left\lbrace\! \left[(\mathsf{S}^{E,u}_l \!+\! \mathsf{N}^E)^{-1}\hat{\mathsf{S}}^{E,u}_l - \mathsf{I} \right](\mathsf{S}^{E,u}_l + \mathsf{N}^E)^{-1}\mathsf{S}^{E,u}_l \!\right\rbrace \vphantom{\frac{1}{2}}\!\right], \end{eqnarray*} (49) where |$\mathsf{I}$| is the identity matrix and |$\hat{\mathsf{S}}^{E,u}_l$| is the quadratic estimator for the power spectrum between bin ri and rj given by \begin{eqnarray*} \left[\hat{\mathsf{S}}^{E,u}_l\right]_{r_i,r_j} = \frac{1}{2l+1}\sum _m \Delta d^E_{lm}(r_i) \Delta d_{lm}^{E*}(r_j), \end{eqnarray*} (50) where |$d^E_{lm}$| are the E-mode spin-weighted spherical harmonic coefficients of the data. Packaging up the independent elements of the symmetric Nr × Nr matrix |$\boldsymbol{\sf S}^{E,u}_l$| into a vector |$\mathrm{vecp}(\boldsymbol{\sf S}^{E,u}_l)$| of length Nr(Nr + 1)/2, we can rewrite equation (49) as \begin{eqnarray*} \hat{A}_{\mathrm{LSS}}-1 = \frac{1}{2}\frac{\sum _l \mathrm{vecp}\left(\mathsf{S}^{E,u}_l\right)^\intercal \mathsf{M}_l^{-1} \mathrm{vecp}\left(\Delta \mathsf{S}^{E,u}_l\right)}{ \sum _l \mathrm{vecp}\left(\mathsf{S}^{E,u}_l\right)^\intercal \mathsf{M}_l^{-1} \mathrm{vecp}\left(\mathsf{S}^{E,u}_l\right)}, \end{eqnarray*} (51) which follows from equation (10) of Hamimeche & Lewis (2008). |$\mathrm{vecp}(\Delta \mathsf{S}^{E,u}_l)$| in equation (51) is the vectorized debiased quadratic estimator and |$\mathsf{M}_l$| is its covariance matrix (note that for l ≥ 2 we can drop the ‘u’ label). The optimal estimator is thus the inverse-variance weighted quadratic estimator for the proper motion power spectrum, cross-correlated with the fixed template spectrum, summed over multipoles, and then normalized appropriately, with the factor of 1/2 present because |$A_{\mathrm{LSS}}^2$| is the amplitude of the power spectrum and not ALSS. Note that the l = 1 moment is contaminated by local terms like SP and SAD, but the estimator is still unbiased when these modes are excluded from both the numerator and denominator of equation (51). Note also that this expression assumes the shape of the template uncorrelated transverse velocity power spectrum is known perfectly, whereas in reality it will depend on the cosmological parameters. We assume that the uncertainty in the cosmology is subdominant to other sources of bias and variance, which is likely to be a very good approximation on linear scales. A more complete treatment would involve simultaneously marginalizing over these in an MCMC chain. In Fig. 18 we plot the forecast error on ALSS for our assumed survey parameters as a function of zmax. The blue (lowest) curve on this plot labelled ‘Fixed SP’ uses all the modes l ≥ 1 in equation (48), while the orange (highest) curve discards the contaminated l = 1 mode, which is equivalent to marginalizing over the SP proper motion. The green (middle) curve uses only two thirds of the information at l = 1, which is equivalent to fixing the SP direction and marginalizing over its amplitude. In all cases, the amplitude of LSS transverse velocities should be detectable with |$\sigma _{A_{\mathrm{LSS}}} \approx 0.1$|⁠, i.e. a significant 10σ measurement (similar to that claimed in Darling et al. 2018) or a 10 per cent measurement of the parameter combination H0fσ8. That this measurement is more significant than the SP measurement forecast is a testament to the significant amplitude of LSS transverse velocities over this redshift range, and the large amount of information available in the uncontaminated l ≥ 2 modes; only a small (⁠|${\sim } 5{{\ \rm per\ cent}}$|⁠) increase in error is incurred if the l = 1 mode is discarded. By zmax = 0.1 the error has saturated due to the amplitude of the signal falling away as 1/r. Figure 18. View largeDownload slide Forecast error on the amplitude of the LSS velocity, using all multipoles (i.e. fixing the SP direction and amplitude; blue, lowest curve), discarding the information from the l = 1 modes (i.e. marginalizing over the SP direction and amplitude; orange, highest curve), and discarding one third of the information in the l = 1 modes (i.e. marginalizing over the SP amplitude but keeping its direction fixed; green, middle curve). The error on ALSS can be interpreted as the forecast fractional error on the parameter combination H0fσ8. Figure 18. View largeDownload slide Forecast error on the amplitude of the LSS velocity, using all multipoles (i.e. fixing the SP direction and amplitude; blue, lowest curve), discarding the information from the l = 1 modes (i.e. marginalizing over the SP direction and amplitude; orange, highest curve), and discarding one third of the information in the l = 1 modes (i.e. marginalizing over the SP amplitude but keeping its direction fixed; green, middle curve). The error on ALSS can be interpreted as the forecast fractional error on the parameter combination H0fσ8. Unlike the case of the SP constraints in Section 5, the constraint on ALSS is dominated by the instrumental noise (compare Fig. 18 with the cosmic-variance limit of |$\sigma _{A_{\mathrm{LSS}}} = (4N)^{-1/2} \approx 10^{-4}$|⁠), and thus sensitive to the assumption of point-source proper motion errors for our resolved galaxies. It will therefore be important to thoroughly quantify the impact of proper motion error on these constraints, and we caution the reader against their overinterpretation; our results here show what may be achieved with ‘perfect’ proper motion measurements of galaxies, and thus provide motivation for improving extragalactic astrometry. In Fig. 19 we plot the cumulative contribution per l to the Fisher information on ALSS. Most of the contribution comes from scales 1 ≲ l ≲ 10 (consistent with Fig. 1), with the error using only the l = 1 mode equal to roughly |$\sigma _{A_{\mathrm{LSS}}} \approx 0.3$|⁠, comparable to the significance forecast for ASP. As found in Section (2.4), these higher l-modes can be affected by non-linear corrections to the velocity power spectrum for nearby objects. We defer further investigation into the impact of non-linearity on these forecasts to a future work. Figure 19. View largeDownload slide Cumulative Fisher information on the amplitude of the LSS velocity using only multipoles l less than lmax. The error on ALSS can be interpreted as the forecast fractional error on the parameter combination H0fσ8. Figure 19. View largeDownload slide Cumulative Fisher information on the amplitude of the LSS velocity using only multipoles l less than lmax. The error on ALSS can be interpreted as the forecast fractional error on the parameter combination H0fσ8. Finally, in Appendix C we investigate the impact on these forecasts of making different assumptions for the average noise variance; although our choice of using the average inverse variance reproduces the correct answer in the cosmic-variance-free limit for the ASP estimator, it is unclear whether this choice is correct for ALSS. Using the inverse of the average variance instead of the average inverse variance can increase the forecast uncertainty on ALSS by a factor of three, which still represents a significant detection. Further investigation using the exact result in equation (47) is required, which we again defer to a future work. 7 CONCLUSIONS We have conducted a thorough study into the potential of astrometric surveys to measure the large-scale correlated proper motions of galaxies, focusing on Gaia as a reference survey. Although Gaia is not optimized to measure the positions of extragalactic objects, it will none the less produce astrometry and proper motions of roughly 106 galaxies and 105–6 quasars (Robin et al. 2012). With these large catalogues and proper motion errors of |${\sim } 10^{1-2} \mu \mathrm{as} \, \mathrm{yr}^{-1}$|⁠, it is timely to investigate whether this data set can be used to constrain cosmological parameters. We have focused on the large-scale proper motion induced by the Solar System’s motion relative to distant galaxies (the SP), the large-scale proper motion caused by the time-varying relativistic aberration from the Solar System’s acceleration towards the galactic centre (the SAD), and the peculiar transverse velocities of the galaxies induced by LSS. Extending the previous works of Ding & Croft (2009), Nusser et al. 2012, Bachchan et al. 2016, and Darling & Truebenbach 2018 we have studied the potential to combine extragalactic proper motions with spectroscopic redshifts to measure the Hubble constant through the SP. We have identified a previously neglected bias from dipolar transverse velocities correlated with our local motion (similar to the Virgocentric infall and bulk flow phenomena familiar from radial velocity surveys). This bias is really a consequence of attempting to measure only |$\boldsymbol{\mu }^{\mathrm{SP}}$|⁠, the SP proper motion, rather than |$\boldsymbol{\mu }^{\mathrm{SP}} + \boldsymbol{\mu }^{\mathrm{LSS}} = (-\mathbf {v}_{\odot } + \mathbf {v})_{\perp }/r$|⁠, which is the relative proper motion between the Solar System and the galaxies, and the quantity actually observed. This bias is proportional to the transverse correlation function of Gorski (1988) and is especially problematic for nearby objects. As proper motions due to SP and LSS fall away with distance, the cosmic variance and bias from peculiar velocities is potentially very important for this measurement. We found that without proper modelling, the bias is important whenever galaxies closer than 50–100 Mpc are used in the analysis, which limits the significance of the SP velocity to 1–2σ depending on whether the contaminating SAD proper motion is simultaneously estimated or fixed from a sample of high-redshift quasars. If we add information from the CMB dipole, this significance increases to 2–4σ, corresponding to a 25–50 per cent measurement of H0, with the error dominated by cosmic variance from LSS transverse velocities. The significance of the H0 measurement could potentially be increased by going to higher redshifts and modelling the bias with perturbation theory or halo models, although the gains appear fairly small. Alternatively, combining with measurements of radial peculiar velocities could help to reduce the cosmic variance. For Gaussian fields at distance r, conditioning on the radial velocities of each galaxy would reduce the cosmic variance in H0 by roughly a factor |$\rho _1^2(r)$|⁠, where ρ1(r) is the dimensionless correlation coefficient between the part of the radial velocity dipole uncorrelated with our local motion and the part of the transverse velocity dipole uncorrelated with our local motion. Neglecting correlations with our local velocity altogether, we find that |$\rho _1^2(r)$| has a broad peak of roughly 0.74 around |$r = 150 \, \mathrm{Mpc}\, h^{ -1}$| and is never less than 0.54 for our galaxies, suggesting that the cosmic variance could be significantly reduced. This simple picture is complicated by the fact that radial velocity surveys typically measure fluctuations in galaxy size or apparent magnitude rather than the radial velocity directly, and the fact that the velocity measurements are noisy, which will degrade the improvements. Nevertheless, the potential gains are large if radial velocities are available. An accurate and precise measurement of the Hubble constant is vital to break degeneracies with other cosmological parameters such as neutrino mass (Allison et al. 2015) and could shed light on tensions between classical distance–ladder and CMB measurements (Freedman 2017). Since proper motion-based measurements of H0 are robust to the Malmquist and selection biases that complicate radial velocity surveys, further investigation into optimizing the measurement is required. In contrast we find that a significant measurement of the transverse velocities from LSS is possible in our reference Gaia-like survey, forecasting a measurement significance on the amplitude of roughly 10σ, corresponding to a 10 per cent measurement of the parameter combination H0fσ8, limited by the instrumental noise. This high significance is due to the large amount of information present in the angular structure of the correlation functions of transverse velocities, and the large number of galaxy pairs available to beat down the noise. A measurement of this parameter combination would be extremely useful to cosmology as the growth rate is sensitive to late-time physics such as dark energy. While the constraints from proper motions are not expected to be competitive with redshift-space distortions in galaxy clustering for some time, an independent measurement of the growth rate serves as a valuable cross-check on the ΛCDM model and could help pin down some of the systematics in galaxy clustering measurements. We have also introduced a CMB-style formalism for analysing proper motions on the full sky, which serves as an alternative to the widely used VSH (Mignard & Klioner 2012; Gaia Collaboration 2018). This formalism involves expanding the vector-valued proper motion in a basis of orthonormal functions on the sphere which possess the correct rotational properties, the spin-weighted spherical harmonics. Expanding in these functions makes correlating vectors on the sphere straightforward, and gives rise to simple analytic formulae which connect correlation functions to angular power spectra. This solves some of the issues raised in Darling & Truebenbach (2018), who noted that a single correlation function does not contain all the information available in the vector field. We expect this formalism to be of use in any application of proper motion measurements. Potential improvements to this work include a more realistic noise specification for measuring the proper motion of extended objects such as galaxies, a more realistic treatment of incompleteness and source clustering, and a more thorough treatment of radial peculiar velocities in the distance estimates. In making the large-N, isotropic noise (LNIN) approximation we had to assume homogeneous noise variance in each radial bin; the accuracy of this approximation should be tested more rigorously, although we do not expect it to impact the SP forecasts significantly. First these forecasts are limited by cosmic variance and secondly an estimator with inverse-variance weight containing only instrumental noise is actually fairly close to optimal (compare the green and orange curves in Figs 12 and 14). The detectability of B-mode signatures from gravitational waves or global rotation would also be of interest to study. This paper provides a necessary step in the development of proper motions from an attractive concept to a feasible and competitive probe of velocities. The potential to test the robustness of the ΛCDM model with real-time cosmology provides strong motivation for further investigation, which this work anticipates. ACKNOWLEDGEMENTS It is my pleasure to thank Anna Lisa Varri for providing useful references, Fergus Cullen and Nigel Hambly for helpful discussions, and Andy Taylor for comments on the manuscript. AH would also like to thank the anonymous referee for useful suggestions which improved this paper. AH was supported by an Science and Technology Facilities Council Consolidated Grant. Footnotes 1 http://sci.esa.int/gaia/ 2 http://ngvla.nrao.edu/ 3 We will often express quantities in the CMB rest frame, the frame in which the CMB dipole vanishes (Planck Collaboration XVI 2014). We assume that the matter rest frame (the frame in which the dipole anisotropy of peculiar velocities vanishes) and the CMB rest frame coincide, i.e. we do not consider the ‘tilted universe’ scenario of Turner (1991). We also neglect any intrinsic dipole anisotropy in the CMB. 4 Throughout this work, r refers to comoving distance, which is equal to the ‘proper motion distance’ which links transverse velocity to proper motion in a spatially flat universe (Hogg 1999). 5 Very non-linear structures such as Virgo could still have an impact on the low-l moments, since our quasi-linear approach is likely inaccurate in this case. Constrained N-body simulations may be required to accurately assess the impact of these, which we defer to a future work. 6 Non-Gaussianity could be incorporated with constrained realizations of a Milky Way-like local environment. 7 Any quasar intrinsic motion due to radio jet variability, assuming it obeys Gaussian statistics, could be marginalized over and included in the noise variance since it is uncorrelated between different objects. We focus on galaxies in this work and do not include this additional noise in the small number of quasars included in our forecasts. 8 Note that hats on quantities now refer to estimators rather than directions, as should be clear from the context. 9 The small uncertainty in the amplitude and direction of the SP velocity vector could be incorporated with a prior added to the log-likelihood, but the errors are small enough that fixing it is sufficiently accurate for our purposes. 10 Peculiar radial velocities add noise to the distance estimate causing fluctuations in the inferred transverse velocity of an object of |${\sim } \frac{15{{\ \rm per\ cent}}}{(r/20 \, h^{-1}\mathrm{Mpc})}$|⁠. This noise is subdominant to other sources of variance at low distances, and is further reduced by averaging over uncorrelated patches of the peculiar radial velocity. There will also be some bias from the part of the radial velocity correlated with our local motion, but this is no more than 15 per cent for the objects we consider and subdominant to the bias in the transverse velocity itself. Note that averaging over both parts of the LSS transverse velocity gives no additional bias, since radial and transverse velocities are uncorrelated at a point (Nusser et al. 2012). Similarly, sensitivity of the FRW distance–redshift relation to the cosmological parameters is negligible at the low redshifts we consider, given the expected sensitivity of our measurements. 11 In reality there will be preferential selection of nearby objects in areas of sky where the noise variance is higher, inducing correlations between sky position and distance and mainly affecting the faintest sources just above the detection threshold. We note that in this situation the LNIN approximations made in this section are likely to be poor, but the full likelihood formalism of the previous section can still be used. 12 Note that we do not account for survey incompleteness in our forecasts. The catalogue will likely be incomplete for the faint objects around G = 20. This is not as serious an issue for proper motions as it is for radial velocities since we do not rely on distance proxies that correlate with flux, but incompleteness which varies over the sky will impact the optimality of our estimators and should be accounted for in a more thorough treatment. Isotropic incompleteness will affect the clustering properties of measured sources, but this should provide a only small correction when correlating proper motions over large scales. We defer further investigation into these issues to a future work, and instead assume full completeness down to G = 20, as in Nusser et al. (2012). 13 In principle this information is already present in the Gaia DR2 catalogue. 14 https://www.cosmos.esa.int/web/gaia/science-performance 15 The bias is proportional to the corresponding component of the Milky Way-CMB relative velocity; for our Cartesian coordinate system the x-component carries roughly 75 per cent of the squared-length of this velocity vector. The bias in the y and z directions are roughly a factor of −0.4 and 0.5 smaller, respectively. Note that the sign of the bias depends on the chosen direction. REFERENCES Allison R. , Caucal P. , Calabrese E. , Dunkley J. , Louis T. , 2015 , Phys. Rev. D , 92 , 123535 https://doi.org/10.1103/PhysRevD.92.123535 Crossref Search ADS Bachchan R. K. , Hobbs D. , Lindegren L. , 2016 , A&A , 589 , A71 https://doi.org/10.1051/0004-6361/201527935 Crossref Search ADS Bond J. R. , Jaffe A. H. , Knox L. , 1998 , Phys. Rev. D , 57 , 2117 https://doi.org/10.1103/PhysRevD.57.2117 Crossref Search ADS Chen J. , Zhang P. , Zheng Y. , Yu Y. , Jing Y. , 2018 , ApJ , 861 , 58 https://doi.org/10.3847/1538-4357/aaca2f Crossref Search ADS Coles P. , Jones B. , 1991 , MNRAS , 248 , 1 https://doi.org/10.1093/mnras/248.1.1 Crossref Search ADS Darling J. , Truebenbach A. E. , 2018 , ApJ , 864 , 37 (DT18) https://doi.org/10.3847/1538-4357/aad3d0 Crossref Search ADS Darling J. , Truebenbach A. , Paine J. , 2018 , in Murphy E. , ed., ASP Conf. Ser. Vol. 517, Science with a Next Generation Very Large Array , Astron. Soc. Pac , San Francisco , p. 813 . de Bruijne J. H. J. , Allen M. , Azaz S. , Krone-Martins A. , Prod'homme T. , Hestroffer D. , 2015 , A&A , 576 , A74 Crossref Search ADS de Souza R. E. , Krone-Martins A. , dos Anjos S. , Ducourant C. , Teixeira R. , 2014 , A&A , 568 , A124 https://doi.org/10.1051/0004-6361/201423514 Crossref Search ADS Ding F. , Croft R. A. C. , 2009 , MNRAS , 397 , 1739 https://doi.org/10.1111/j.1365-2966.2009.15111.x Crossref Search ADS Freedman W. L. , 2017 , Nature Astron. , 1 , 0121 https://doi.org/10.1038/s41550-017-0121 Crossref Search ADS Gaia Collaboration , 2018 , A&A , 616 , A14 https://doi.org/10.1051/0004-6361/201832916 Crossref Search ADS Gelfand I. M. , Minlos R. A. , Shapiro Z. Y. , 1963 , Representations of the Rotation and Lorentz Groups and Their Applications . Pergamon Press , Oxford, UK , p. 48. Goldberg J. N. , Macfarlane A. J. , Newman E. T. , Rohrlich F. , Sudarshan E. C. G. , 1967 , J. Math. Phys. , 8 , 2155 https://doi.org/10.1063/1.1705135 Crossref Search ADS Gorski K. , 1988 , ApJ , 332 , L7 https://doi.org/10.1086/185255 Crossref Search ADS Hall A. , Challinor A. , 2014 , Phys. Rev. D , 90 , 063518 https://doi.org/10.1103/PhysRevD.90.063518 Crossref Search ADS Hamimeche S. , Lewis A. , 2008 , Phys. Rev. D , 77 , 103013 https://doi.org/10.1103/PhysRevD.77.103013 Crossref Search ADS Hobbs D. et al. . , 2016 , preprint (arXiv:1609.07325) Hogg D. W. , 1999 , preprint (arXiv:astr-ph/9905116 ) Holl B. , Hobbs D. , Lindegren L. , 2010 , in Klioner S. A. , Seidelmann P. K. , Soffel M. H. , eds, IAU Symp. Vol. 261, Relativity in Fundamental Astronomy: Dynamics, Reference Frames, and Data Analysis . Cambridge University Press , Cambridge , p. 320 https://doi.org/10.1017/S1743921309990573 Irwin M. J. , 1985 , MNRAS , 214 , 575 https://doi.org/10.1093/mnras/214.4.575 Crossref Search ADS Kogut A. et al. . , 1993 , ApJ , 419 , 1 https://doi.org/10.1086/173453 Crossref Search ADS Kopeikin S. M. , Makarov V. V. , 2006 , AJ , 131 , 1471 https://doi.org/10.1086/500170 Crossref Search ADS Lewis A. , Challinor A. , Lasenby A. , 2000 , ApJ , 538 , 473 https://doi.org/10.1086/309179 Crossref Search ADS Lindegren L. et al. . , 2018 , A&A , 616 , A2 https://doi.org/10.1051/0004-6361/201832727 Crossref Search ADS Mignard F. , Klioner S. , 2012 , A&A , 547 , A59 https://doi.org/10.1051/0004-6361/201219927 Crossref Search ADS Newman E. T. , Penrose R. , 1966 , J. Math. Phys. , 7 , 863 https://doi.org/10.1063/1.1931221 Crossref Search ADS Nusser A. , Branchini E. , Davis M. , 2012 , ApJ , 755 , 58 https://doi.org/10.1088/0004-637X/755/1/58 Crossref Search ADS Percival W. J. , White M. , 2009 , MNRAS , 393 , 297 https://doi.org/10.1111/j.1365-2966.2008.14211.x Crossref Search ADS Planck Collaboration et al. . , 2014 , A&A , 571 , A27 https://doi.org/10.1051/0004-6361/201321556 Crossref Search ADS Planck Collaboration et al. . , 2016 , A&A , 594 , A13 https://doi.org/10.1051/0004-6361/201525830 Crossref Search ADS Robin A. C. et al. . , 2012 , A&A , 543 , A100 https://doi.org/10.1051/0004-6361/201118646 Crossref Search ADS Seljak U. , Zaldarriaga M. , 1997 , Phys. Rev. Lett. , 78 , 2054 https://doi.org/10.1103/PhysRevLett.78.2054 Crossref Search ADS Takahashi R. , Sato M. , Nishimichi T. , Taruya A. , Oguri M. , 2012 , ApJ , 761 , 152 https://doi.org/10.1088/0004-637X/761/2/152 Crossref Search ADS Tegmark M. , Taylor A. N. , Heavens A. F. , 1997 , ApJ , 480 , 22 https://doi.org/10.1086/303939 Crossref Search ADS Titov O. , Lambert S. B. , Gontier A.-M. , 2011 , A&A , 529 , A91 https://doi.org/10.1051/0004-6361/201015718 Crossref Search ADS Truebenbach A. E. , Darling J. , 2018 , ApJ , 868 , 69 Crossref Search ADS Turner M. S. , 1991 , Phys. Rev. D , 44 , 3737 https://doi.org/10.1103/PhysRevD.44.3737 Crossref Search ADS Varshalovich D. A. , Moskalev A. N. , Khersonskii V. K. , 1988 , Quantum Theory of Angular Momentum . World Scientific , Singapore https://doi.org/10.1142/0270 Zheng Y. , Zhang P. , Jing Y. , Lin W. , Pan J. , 2013 , Phys. Rev. D , 88 , 103510 https://doi.org/10.1103/PhysRevD.88.103510 Crossref Search ADS APPENDIX A: SPIN-WEIGHTED SPHERICAL HARMONICS In this section we detail some formalism surrounding the use of spin-weighted spherical harmonics. These functions were introduced by Gelfand, Minlos & Shapiro (1963), Newman & Penrose (1966), and Goldberg et al. (1967), and applied to CMB analysis by Seljak & Zaldarriaga (1997). The material here is similar to that presented in Hall & Challinor (2014), but we repeat it for completeness. The transverse velocity field |$\boldsymbol{V}_{\perp }$| is a vector on the sphere, and hence can be decomposed into gradient and curl parts whose components in some coordinate system are \begin{eqnarray*} V_{\perp }^a = \nabla ^a \Omega + \epsilon ^a_{\, \, b} \nabla ^b \Psi , \end{eqnarray*} (A1) where |$\epsilon ^a_{\, \, b}$| is the alternating tensor, ∇ is the covariant derivative on the sphere, and Ω and Ψ are scalar functions on the sphere. We now introduce the spin raising and lowering operators ð and |$\bar{\eth }$|⁠, which act on a spin s quantity sη as \begin{eqnarray*} \eth {}_s \eta = -(\sin \theta)^s (\partial _{\theta } + i \mathrm{cosec} \, \theta \partial _{\phi }) [(\sin \theta)^{-s} {}_s \eta ], \end{eqnarray*} (A2) \begin{eqnarray*} \bar{\eth } {}_s \eta = -(\sin \theta)^{-s} (\partial _{\theta } - i \mathrm{cosec} \, \theta \partial _{\phi }) [(\sin \theta)^{s} {}_s \eta ]. \end{eqnarray*} (A3) The quantity ðsη has spin s + 1 while the quantity |$\bar{\eth } {}_s \eta$| has spin s − 1. The spin ±1 spherical harmonics are defined by operating on the standard spherical harmonics as \begin{eqnarray*} {}_1 Y_{lm} = \eth \left(\frac{\sqrt{(l-1)!}}{\sqrt{(l+1)!}} Y_{lm}\right), \end{eqnarray*} (A4) \begin{eqnarray*} {}_{-1}Y_{lm} = -\bar{\eth } \left(\frac{\sqrt{(l-1)!}}{\sqrt{(l+1)!}} Y_{lm}\right). \end{eqnarray*} (A5) Note that Ylm = 0Ylm. As discussed in Section 2, it is desirable to work with quantities with simple transformation properties under a rotation of the coordinate basis. For this reason we work with the complex transverse velocity, found by projection on to the null basis |$\hat{\boldsymbol{\theta }} \pm i \hat{\boldsymbol{\phi }}$|⁠, given by \begin{eqnarray*} V_{\theta } + iV_{\phi } = -\eth (\Omega - i\Psi), \end{eqnarray*} (A6) \begin{eqnarray*} V_\theta -i V_\phi = -\bar{\eth }(\Omega + i \Psi), \end{eqnarray*} (A7) where we used equations (A1), (A2), and (A3). We now expand the spin 0 potentials Ω and Ψ in spherical harmonics as \begin{eqnarray*} \Omega (\hat{\boldsymbol {n}}) = \sum _{lm} \sqrt{\frac{(l-1)!}{(l+1)!}} \epsilon _{lm} Y_{lm}(\hat{\boldsymbol {n}}), \end{eqnarray*} (A8) \begin{eqnarray*} \Psi (\hat{\boldsymbol {n}}) = \sum _{lm} \sqrt{\frac{(l-1)!}{(l+1)!}} \beta _{lm} Y_{lm}(\hat{\boldsymbol {n}}), \end{eqnarray*} (A9) where l ≥ 1 (the l = 0 mode corresponds to a constant that gives no observable contribution to the transverse velocity). Plugging this into equations (A6) and (A7) and using the definition of the spin-weighted spherical harmonics gives \begin{eqnarray*} (V_\theta \pm iV_\phi)(\hat{\boldsymbol {n}}) = \sum _{lm} (\mp \epsilon _{lm} + i\beta _{lm}) {}_{\pm 1} Y_{lm}(\hat{\boldsymbol {n}}). \end{eqnarray*} (A10) Under a parity transformation, the 3D velocity field transforms as |$\boldsymbol {V}(\hat{\boldsymbol {n}}) \rightarrow -\boldsymbol {V}(-\hat{\boldsymbol {n}})$|⁠, such that the transverse components transform as |$(V_\theta \pm i V_\phi)(\hat{\boldsymbol {n}}) \rightarrow -(V_\theta \mp i V_\phi)(-\hat{\boldsymbol {n}})$|⁠. The multipole coefficients thus transform as ϵlm → (−1)lϵlm and βlm → (−1)l + 1βlm, and therefore have electric and magnetic parity, respectively. The electric (E-mode) and magnetic (B-mode) coefficients describe gradient and curl contributions to the transverse velocity, respectively. Reality of the transverse velocity implies that |$\epsilon _{lm}^* = (-1)^m \epsilon _{l-m}$| and |$\beta _{lm}^* = (-1)^m \beta _{l-m}$|⁠. The spin-weighted spherical harmonics obey orthogonality and completeness relations over the sphere given by \begin{eqnarray*} \int \mathrm{d}\hat{\boldsymbol {n}} \, {}_s Y_{lm}(\hat{\boldsymbol {n}}) {}_s Y_{l^{\prime }m^{\prime }}^*(\hat{\boldsymbol {n}}) = \delta _{ll^{\prime }}\delta _{mm^{\prime }}, \end{eqnarray*} (A11) \begin{eqnarray*} \sum _{lm} {}_s Y_{lm}^*(\hat{\boldsymbol {n}}_1) {}_sY_{lm}(\hat{\boldsymbol {n}}_2) = \delta (\hat{\boldsymbol {n}}_1 - \hat{\boldsymbol {n}}_2), \end{eqnarray*} (A12) where the sums are over l ≥ s and −l ≤ m ≤ l. The spin-weighted spherical harmonics also obey the relation |${}_s Y^*_{lm} = (-1)^{s+m} {}_{-s}Y_{l-m}$| and satisfy the addition theorem \begin{eqnarray*} \sum _m {}_s Y_{lm}^*(\hat{\boldsymbol {n}}_1) {}_{s^{\prime }} Y_{lm}(\hat{\boldsymbol {n}}_2) = \frac{2l+1}{4\pi }D^l_{ss^{\prime }}(\gamma _1, \beta _{12}, -\gamma _2), \end{eqnarray*} (A13) where |$D^l_{ss^{\prime }}(\gamma _1, \beta _{12}, -\gamma _2)$| are the Wigner D matrix elements representing a rotation by the Euler angles {γ1, β12, −γ2} from the |$(\hat{\boldsymbol{\theta }},\hat{\boldsymbol{\phi }})$| basis at |$\hat{\boldsymbol {n}}_1$| to that at |$\hat{\boldsymbol {n}}_2$|⁠, see for example Varshalovich et al. (1988). Under an active rotation of a transverse velocity field by a rotation operator R, the scalar potentials Ω and Ψ must satisfy |${}[R\Omega ](\hat{\boldsymbol {n}}) = \Omega (R^{-1}\hat{\boldsymbol {n}})$| and |${}[R\Psi ](\hat{\boldsymbol {n}}) = \Psi (R^{-1}\hat{\boldsymbol {n}})$|⁠. Use of the matrix elements of the rotation operator in the (l, m)-representation (the Wigner D functions) on the spherical harmonics allows us to derive the transformation law for the multipole coefficients \begin{eqnarray*} \epsilon _{lm} \rightarrow \epsilon _{lm}^{\prime } = \sum _{m^{\prime }} D^l_{mm^{\prime }}(\alpha , \beta , \gamma)\epsilon _{lm^{\prime }}, \end{eqnarray*} (A14) and similarly for βlm, where (α, β, γ) are the Euler angles corresponding to R. The Wigner D matrix elements are related to the spin-weighted spherical harmonics by \begin{eqnarray*} D^l_{-m s}(\phi ,\theta ,0) = (-1)^m \sqrt{\frac{4\pi }{2l+1}}{}_s Y_{lm}(\theta ,\phi), \end{eqnarray*} (A15) which can be taken as the definition of the spin-weighted spherical harmonics. In this paper we are often concerned with dipolar transverse velocities, which correspond to the l = 1 mode. The explicit expressions for the spin ±1 spherical harmonics in this case are \begin{eqnarray*} {}_1 Y_{10}(\theta , \phi) = \sqrt{\frac{3}{8\pi }} \sin \theta , \end{eqnarray*} (A16) \begin{eqnarray*} {}_1 Y_{1\pm 1}(\theta , \phi) = -\sqrt{\frac{3}{16\pi }} (1 \mp \cos \theta) e^{\pm i\phi }. \end{eqnarray*} (A17) We close this section by connecting the spin-weighted spherical harmonics to the widely used VSHs. These are defined from the standard spherical harmonics as (Mignard & Klioner 2012) \begin{eqnarray*} \boldsymbol {S}_{lm} = \frac{1}{\sqrt{l(l+1)}} \nabla Y_{lm}, \end{eqnarray*} (A18) \begin{eqnarray*} \boldsymbol {T}_{lm} = -\hat{\boldsymbol {n}} \times \boldsymbol {S}_{lm}, \end{eqnarray*} (A19) where formally ∇ is now the 3D gradient operator projected orthogonal to the unit sphere, and |$\boldsymbol {S}_{lm}$| and |$\boldsymbol {T}_{lm}$| are 3D vectors orthogonal to |$\hat{\boldsymbol {n}}$|⁠. Using the definition of the spin ±1 spherical harmonics, it is straightforward to show that \begin{eqnarray*} (\hat{\boldsymbol{\theta }} \pm i \hat{\boldsymbol{\phi }})\cdot \boldsymbol {S}_{lm} = \mp {}_{\pm 1} Y_{lm}, \end{eqnarray*} (A20) \begin{eqnarray*} (\hat{\boldsymbol{\theta }} \pm i \hat{\boldsymbol{\phi }})\cdot \boldsymbol {T}_{lm} = i {}_{\pm 1} Y_{lm}. \end{eqnarray*} (A21) The spin ±1 spherical harmonics are thus projections of the VSHs on to a null basis. The expansion of a vector field on the sphere in VSHs is \begin{eqnarray*} \boldsymbol {V} = \sum _{lm} (s_{lm} \boldsymbol {S}_{lm} + t_{lm} \boldsymbol {T}_{lm}). \end{eqnarray*} (A22) Projecting on to the null basis and using equations (A20) and (A21) allows us to identify the coefficients of the VSH expansion with those of the spin-weighted spherical harmonic expansion as (slm, tlm) = (ϵlm, βlm). We advocate the use of spin-weighted spherical harmonics over VSHs, as they are much easier to work with when correlating transverse vector fields over the full sky. APPENDIX B: RELATIONSHIP OF ξ± TO THE CORRELATION FUNCTIONS OF DARLING & TRUEBENBACH (2018) In this section we compare our ξ± correlation functions to those recently introduced in Darling & Truebenbach (2018) (hereafter DT18). Their equation (1) defines a correlation function \begin{eqnarray*} \xi _{v_\perp }(\boldsymbol {x}_1, \boldsymbol {x}_2) = \langle [\boldsymbol {V}_\perp (\boldsymbol {x}_1) \cdot \hat{\boldsymbol {x}}] [\boldsymbol {V}_\perp (\boldsymbol {x}_2) \cdot \hat{\boldsymbol {x}}] \rangle , \end{eqnarray*} (B1) where |$\boldsymbol {x} = \boldsymbol {x}_1 - \boldsymbol {x}_2$|⁠. To relate this correlation function to ξ±, we first resolve |$\hat{\boldsymbol {x}}$| at the point |$\boldsymbol {x}_1$| into a part parallel to |$\boldsymbol {x}_1$| and a part orthogonal to |$\boldsymbol {x}_1$|⁠; only the orthogonal part survives contraction with |$\boldsymbol {V}_\perp (\boldsymbol {x}_1)$|⁠. We then rotate to the plane containing the origin and the vectors |$\boldsymbol {x}_1$| and |$\boldsymbol {x}_2$|⁠. |$\boldsymbol {V}_\perp (\boldsymbol {x}_1) \cdot \hat{\boldsymbol {x}}$| is then the component of |$\boldsymbol {V}_\perp (\boldsymbol {x}_1)$| in this plane, and may be related to the θ-component in the spherical coordinate basis aligned with the plane by \begin{eqnarray*} \boldsymbol {V}_\perp (\boldsymbol {x}_1) \cdot \hat{\boldsymbol {x}} = \frac{\vert \boldsymbol {x}_2 - (\boldsymbol {x}_2 \cdot \hat{\boldsymbol{x}}_1)\hat{\boldsymbol {x}}_1 \vert }{\vert \boldsymbol {x}_2 - \boldsymbol {x}_1\vert } \bar{V}_\theta (\boldsymbol {x}_1), \end{eqnarray*} (B2) where an overbar indicates |$\boldsymbol {V}_\perp$| in the coordinate basis rotated into the plane as per the discussion in Section 2. The prefactor in equation (B2) may be written in terms of the radial distances of the points r1 and r2 and the angle between them β12 as \begin{eqnarray*} \frac{\vert \boldsymbol {x}_2 - (\boldsymbol {x}_2 \cdot \hat{\boldsymbol {x}}_1)\hat{\boldsymbol {x}}_1 \vert }{\vert \boldsymbol {x}_2 - \boldsymbol {x}_1\vert } = \frac{r_1 r_2 \sin ^2 \beta _{12}}{r_1^2 + r_2^2 - 2r_1r_2\cos \beta _{12}} \equiv f(\beta _{12},r_1,r_2). \nonumber \\ \end{eqnarray*} (B3) Similar relations apply at |$\boldsymbol {x}_2$|⁠. Using V± = Vθ ± iVϕ to relate |$\bar{V}_\theta$| to V±, and the definition of ξ±, we find that \begin{eqnarray*} \xi _{v_\perp }(\boldsymbol {x}_1, \boldsymbol {x}_2) = f(\beta _{12},r_1,r_2)\frac{1}{2} \left[\xi _+(\beta _{12},r_1,r_2) + \xi _-(\beta _{12},r_1,r_2)\right]. \nonumber \\ \end{eqnarray*} (B4) In other words, the correlation function |$\xi _{v_\perp }$| defined in DT18 is proportional to a linear combination of our ξ± correlation functions. Note that equation (B4) is independent of the orientation of the coordinate basis (as required) but only contains half of the total information available. DT18 also define a correlation function \begin{eqnarray*} \xi ^{\prime }_{v_\perp }(\boldsymbol {x}_1,\boldsymbol{x}_2) = \langle \boldsymbol {V}_\perp (\boldsymbol {x}_1) \cdot \boldsymbol {V}_\perp (\boldsymbol {x}_2) \rangle , \end{eqnarray*} (B5) where the vectors here are assumed to be embedded in 3D space. Using similar geometric arguments to above, we find that \begin{eqnarray*} \xi ^{\prime }_{v_\perp }(\boldsymbol {x}_1,\boldsymbol {x}_2) &=& \left(\frac{1 + \cos \beta _{12}}{2}\right)\xi _+(\beta _{12},r_1,r_2) \nonumber \\ &&+\, \left(\frac{-1 + \cos \beta _{12}}{2}\right)\xi _-(\beta _{12},r_1,r_2), \end{eqnarray*} (B6) i.e. again a linear combination of the ξ± correlation functions. Once more, individually this correlation function only contains half the information available in the vector field; use of the ξ± correlation functions ensures no information is lost. In addition the ξ± are related straightforwardly to the power spectra of the E and B modes as shown in Section 2, and hence can be easily related to the VSH description. Finally, it may be shown that equations (B4) and (B6) when applied to the linear velocity field of LSS agree with the expressions in DT18 upon use of our equation (16), the explicit forms of the Wigner d elements given in Varshalovich et al. (1988), and the useful expression \begin{eqnarray*} \sum _{l=0}^\infty (2l+1)j_l(kr_1) j_l(k r_2) P_l^{(n)}(\cos \beta) = \frac{(k r_1)^n (k r_2)^n}{(kx)^n} j_n(kx), \end{eqnarray*} (B7) where |$P_l^{(n)}$| is the n-th derivative of the Legendre polynomial with n ≥ 0, jl is a spherical Bessel function, and |$x \equiv (r_1^2 + r_2^2 - 2r_1r_2\cos \beta)^{1/2}$|⁠. The proof of equation (B7) follows by induction. APPENDIX C: ALTERNATIVE CHOICES FOR THE AVERAGE NOISE VARIANCE When inverting the covariance matrix |$(\boldsymbol{N} + \boldsymbol{C})$|⁠, we had to assume that the noise variance was constant in each radial bin. As discussed in Section 3.5 this is not correct, but using the average value of the inverse noise variance |$\langle \sigma _\mu ^{-2} \rangle$| reproduces the exact result in the limit of zero cosmic variance. In reality the noise variance varies due to the distribution of magnitude and colour in each bin (cf. Fig. 11), and a proper treatment would involve marginalizing over the distributions of these latent variables. This must be done numerically since the distributions are all non-Gaussian and the dependencies are non-linear. In this section we take a simpler approach and just study the impact of different choices for the average noise variance in the noise power spectrum on the forecasts. Specifically, we investigate the impact of using the inverse of the average noise variance, |$\langle \sigma _\mu ^2 \rangle ^{-1}$| instead of |$\langle \sigma _\mu ^{-2} \rangle$|⁠. When using |$\langle \sigma _\mu ^2 \rangle ^{-1}$| and all the galaxies with |$r \ge 20 \, \mathrm{Mpc}\,h^{ -1}$| and z ≤ 0.1, we find that the forecast error on a single component of the SP velocity increases by roughly 20 per cent (marginalized SAD) and 50 per cent (fixed SAD), with similar changes to the amplitude parameter ASP. These mild changes are due to the variance in these estimators being dominated by cosmic variance from LSSt transverse velocities rather than the instrumental noise (cf. Fig. 12), so the results of Section 5 should be robust to assumptions about the average noise variance. In contrast, the forecasts on ALSS are dominated by instrumental noise and are quite sensitive to assumptions about the average noise variance. Using the inverse-average instead of the average-inverse, we find the size of the forecast error of ALSS is larger by roughly a factor of three, suggesting that |$\sigma _{A_{\mathrm{LSS}}} \approx 0.3$| for this choice, still a significant measurement. This suggests that more work will be required to investigate the impact of inhomogeneous noise on the ALSS estimator variance, which we defer to a future work. © 2019 The Author(s) Published by Oxford University Press on behalf of the Royal Astronomical Society This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Cosmology with extragalactic proper motions: harmonic formalism, estimators, and forecasts JF - Monthly Notices of the Royal Astronomical Society DO - 10.1093/mnras/stz648 DA - 2019-06-11 UR - https://www.deepdyve.com/lp/oxford-university-press/cosmology-with-extragalactic-proper-motions-harmonic-formalism-VwQuJcfjVz SP - 145 VL - 486 IS - 1 DP - DeepDyve ER -