Skip to main content

Ambient noise multimode surface wave tomography


Seismic techniques using earthquakes are powerful tools for exploring the Earth’s internal structure. However, the earthquake distribution limits the spatial resolution. In recent years, ambient noise surface wave tomography using ambient seismic wave field has resolved these limitations. A typical ambient seismic wave field is microseisms excited by ocean swell activities. Ambient noise surface wave tomography is a technique in seismic interferometry that extracts seismic wave propagation between pairs of stations by cross-correlating the seismic records. The cross-correlation function can be interpreted as an impulsive response at a station with a virtual source at the other station. This technique became standard with the development of modern dense seismic networks. This paper reviews a theory of seismic interferometry for ambient noise surface wave tomography and procedures for practical data processing to calculate cross-correlation functions. The tomographic method typically consists of four steps: (1) the construction of reference 1-D models, (2) phase velocity measurements for each path, (3) 2-D phase velocity inversions, and (4) the construction of a 3-D S-wave tomographic model obtained from series of local 1-D inversions at all the grids. This paper presents the feasibility of multimode surface wave dispersion measurements for improving depth resolution.

1 Introduction

Seismic techniques are powerful tools for exploring the Earth’s internal structure from local to global scales. Earthquakes are a primary illumination source of the Earth’s interior. Seismic tomography using earthquakes revealed the lateral heterogeneities of the Earth on various scales (e.g., Romanowicz 2003; Thurber and Ritsema 2015). Although the method has produced many significant results, it has inherent limitations attributed to the earthquake distribution. Earthquakes occur only in tectonically active areas, and large earthquakes do not occur frequently. The source and station distributions limit the spatial resolution of the tomographic images.

Even on seismically quiet days, the Earth persistently oscillates because ocean swells excite seismic waves. They are known as microseisms (e.g., Nishida 2017), which is a kind of ambient seismic wave field (also known as ambient seismic noise). Seismic interferometry (SI) provided a clue to overcome the limitations due to the earthquake distribution because SI produces a virtual earthquake record from the ambient seismic wave field. An application of SI for imaging Earth’s internal structure is known as ambient noise surface wave tomography (ANT). This section briefly introduces SI, ambient seismic wave field, and ANT followed by the scope of this paper.

1.1 Brief history of seismic interferometry

In seismology, an origin of SI is Aki’s Ph.D. thesis (Aki 1957), inspired by the seminal book Cybernetics (Wiener 1947). He proposed the SPatial AutoCorrelation (SPAC) method; later studies show that the method is mathematically equivalent to SI (e.g., Chávez-Garía and Luzón 2005; Harmon et al. 2010; Tsai and Moschetti 2010) under certain conditions. Aki’s idea had not been paid attention until the 1980s. After 26 years, a group at Hokkaido University (Okada and Sakajiri 1983) developed a survey method for shallow structure using microtremor, now known as the microtremor survey. After this work, this method became popular for surveying subsurface structures in seismic engineering (e.g., Cho et al. 2006), which was also extended to multimode inversion (Tokimatsu et al. 1992).

A key for SI is a cross-correlation function (CCF) between a pair of stations. The CCF exhibits an impulsive response at a station by a virtual source at the other station. The idea of SI dates back to the 1950s in different research fields: acoustic (Eckart 1953), ocean acoustic (Cox 1973), seismic exploration (Claerbout 1968), and seismology (Aki 1957). Although the ideas were proposed independently, they are mathematically identical under certain conditions (e.g., Chávez-Garía and Luzón 2005; Harmon et al. 2010; Tsai and Moschetti 2010). SI is applied not only to the Earth but also to experimental studies, such as ultrasonic waves (Lobkis and Weaver 2001), building responses (Snieder and Wapenaar 2010; Nakata et al. 2013), helioseismology (Gizon and Birch 2002; Duvall et al. 1993; Hanasoge et al. 2016) and ocean acoustic (Cox 1973). Later studies (e.g., Godin 2007; Snieder and Larose 2013) show that the principle of SI is akin to the fluctuation–dissipation theorem in physics (e.g., Callen and Welton 1951), which was generalized as linear response theory by Kubo (1957). The theorem establishes a connection between the response to an external force and the CCF of specific fluctuating properties when the system reaches thermal equilibrium.

1.2 Ambient seismic wave field

Ocean wave activities excite ambient seismic wave fields, even on seismically quiet days. Based on the types of ocean surface gravity waves and excitation mechanisms, we classified ambient seismic wave fields into (1) seismic hum from \(1\times 10^{-3}\) to 0.02 Hz, (2) primary microseisms from 0.02 to 0.1 Hz, and (3) secondary microseisms between 0.1 and 0.5 Hz (Nishida 2017).

The ocean surface gravity waves can be classified as ocean infragravity (IG) waves below 0.02 Hz and ocean swell above the frequency. IG wave is a shallow-water wave, whereas the ocean swell is physically a deep-water wave. The pressure fluctuations of the IG waves reach the seafloor in the pelagic and coastal regions, whereas those of the ocean swells cannot reach the seafloor in the pelagic regions.

IG waves excite background Love waves and Rayleigh waves predominantly from 1 to 20 mHz, known as seismic hum, through the topographic coupling on the seafloor (Nishida et al. 2008b; Fukao et al. 2010; Nishida 2013). Ocean swell activities excite microseisms from 0.05 to 0.5 Hz (Nishida 2017). Primary microseisms are excited by shoaling waves through the topographic coupling on the sea floor (e.g., Hasselmann 1963; Ardhuin et al. 2015). Secondary microseisms are excited by ocean swell activities in both the pelagic and coastal regions through the nonlinear interaction (Longuet-Higgins 1950; Hasselmann 1963). In the history of earthquake seismology, secondary microseisms have been a prominent noise source for earthquake signals. Historical seismological studies took different approaches depending on two frequency bands below and above the typical frequency of secondary microseisms. This frequency is also important in characterizing the physical nature of seismic waves (Aki 2003). Above the frequency, strong lateral heterogeneities in the crust and sediment make the seismic wave field complex on a regional or global scale (see Sect. 6.2 for details), stimulating the development of a stochastic approach. Below the frequency, the waveform can be reproduced in a deterministic manner. SI turns stochastic wave fields into a deterministic signal as a virtual seismic record, which can be applied to seismic tomography. Thus, the operation of cross-correlating seismic wave fields, or SI, bridges the stochastic and deterministic approaches in seismology, which had been incompatible, as noted by Aki (2003).

1.3 ANT

Campillo and Paul (2003) demonstrated that SI can extract virtual seismic records from the ambient seismic wave field. This paper describes the analysis of the coda wave, a diffusive wave field, in Mexico. Cross-correlating the seismic records of coda waves extracted clear Love and Rayleigh wave propagations between every pair of stations. Although this paper analyzed earthquake coda, the method can be applied to ambient seismic wave fields. Shapiro and Campillo (2004) subsequently showed that the CCFs of the microseisms also exhibited clear surface propagations between every pair of stations. This result suggested the possibility of seismic imaging without earthquakes, now known as ANT.

Shapiro et al. (2005) achieved the milestone of the ANT studies. They demonstrated the feasibility of ANT using a modern dense seismic network. After this study, ANT became a standard technique for dense broad-band observations because ANT does not require waiting for earthquakes, i.e., a long observation period. After the paper, ANT was applied to many regions: the USA (e.g., Bensen et al. 2007; Moschetti et al. 2007; Liang and Langston 2008; Lin et al. 2008; Yang et al. 2008), Australia (e.g., Saygin and Kennett 2010), Europe (e.g., Yang et al. 2007), China (e.g., Zheng et al. 2008), Japan (e.g., Nimiya et al. 2020; Nishida et al. 2009), and the global scale (Nishida et al. 2009; Haned et al. 2016). With the increase in seismic stations with continuous observations at many seismic stations in the 2000s, ANT has become a standard analysis method. Today, ANT is first applied after deploying a dense seismic array.

Fig. 1
figure 1

a Construction of reference 1-D models. b Phase/group velocity measurements for each path. c 2-D phase/group velocity inversions for multimodes. d A 3-D S-wave tomographic model is obtained from a collection of local 1-D inversions at all the grids. Modified from Takagi and Nishida (2022)

1.4 What is covered/not covered in this review

There are already many review papers on SI and ANT (e.g., Snieder and Larose 2013; Wapenaar et al. 2010a2010b) and textbooks (e.g., Schuster 2009, Ritzwoller and Feng 2019, Nakata et al. 2019, and Chapter 10 in Sato et al. 2012). This review focuses on a consistent theoretical treatment and systematic comparison among different phase velocity measurement methods.

This review also focuses on multimode measurements. Recent developments in dense arrays enable us to extract multimode dispersion (e.g., Spica et al. 2018; Chmiel et al. 2019; Savage et al. 2013; Jiang and Denolle 2022; Socco et al. 2010). Multimode inversions have significantly improved depth resolution, although the dispersion of fundamental mode branches alone has poor vertical resolution. As multimode dispersion measurement has inherent difficulties, this review aims to provide clues for practical applications.

Because SI does not require earthquakes, it accelerates the progress in monitoring the temporal change in seismic velocities. Monitoring the temporal change with events requires repeating earthquakes or repeating active sources. In most cases, they are not realistic. In the last ten years, SI has become a standard technique for monitoring the temporal change in seismic velocities associated with environmental origins (e.g., precipitation, groundwater, and temperature), volcanic eruptions, and earthquakes (e.g., Sens-Schönfelder and Wegler 2006a; Wegler et al. 2007; Brenguier et al. 2008a, b, Wang et al. 2017). Because this review does not cover such topics, see review articles (e.g., Sens-Schönfelder and Wegler 2011; Obermann and Hillers 2019) for further information.

This paper does not cover conventional surface wave tomography using earthquake data. Because many excellent reviews are already available (e.g., Romanowicz 200320202021; Laske and Widmer-Schnidrig 2015; Levshin et al. 2018; Barmin et al. 2001), please refer to those references.

This paper also does not cover attenuation tomography using CCFs of ambient seismic noise. Some studies inferred the attenuation structure from the amplitude information of CCFs (Prieto et al. 2009, 2011; Lin et al. 2011). Although the effects of source heterogeneities (e.g., Tsai 2011) can bias the estimation, recent developments to extract amplitude information from ambient noise cross-correlations (e.g., Lin et al. 2012b; Zhou et al. 2020; Liu et al. 2021) have made it reliable. Because this topic is beyond the scope of this paper, we refer only to the aforementioned papers.

Section 2 summarizes a theory of SI for ANT. Section 3 describes the procedures for practical data processing to calculate CCFs. Ambient noise multimode surface wave tomography (multimode ANT) consists of four steps (Fig. 1). The first step (Sect. 4) is the measurement of multimode surface wave dispersion for a local seismic array. The dispersion curves are also crucial for constructing local 1-D structures, which can be an initial model of ANT. The second step (Sect. 5) is the dispersion measurements for each path. The third and fourth steps (Sects. 6 and 7) are how to infer 3-D seismic velocity models from the dispersion measurements. The procedures consist of two steps: (1) the 2-D inversion of phase/group velocities in Sect. 6 and (2) a local 1-D inversion at each grid in Sect. 7. Figure 1 shows such procedures based on our previous studies (Nishida et al. 2008a; Nagaoka et al. 2012; Takagi and Nishida 2022; Takeo et al. 2022; Yamaya et al. 2021). This review will focus on and compare our previous studies with other studies.

2 A brief review of SI

Theories of SI originate from various backgrounds and have developed independently. Although they require different assumptions in different settings (e.g., an open or closed system), they are closely related. Theoretically, it is natural to consider a closed system for the global scale, whereas it is natural to consider an open system for the regional or local scale. This section provides an overview of the theories of SI in a closed system and an open system to understand the physical pictures. To model CCFs for ANT, we formulate synthetic CCFs for the homogeneous source distribution in a 2-D homogeneous medium. However, the heterogeneous source distribution realistically causes an apparent travel-time anomaly from the synthetic CCF for the homogeneous source distribution. We evaluate such biases based on the analytic formulation.

2.1 SI in a closed system

Here, we consider SI in a closed system. In the case of a finite body, we evaluate CCFs based on a normal mode approach (Lobkis and Weaver 2001). Because the Earth is a finite-size sphere, this approach is also feasible for multi-orbit propagations on a global scale (Nishida et al. 2002, 2009). For simplicity, this subsection describes a scalar 1-D case, but it can be easily extended to elastic 2-D and 3-D cases.

2.1.1 Repeating seismic experiments

Virtually, we consider repeating seismic experiments in a closed system of a perfect elastic body. Before the initial time \(t=0\), the body did not deform, and a random force \(f^k(x)\) was applied to the body at \(t=0\) in the kth experiment, where x is the spatial location. We consider a finite body from \(x=0\) to \(x=L\) and impose rigid or free boundary conditions at both ends. The equation of motions is given by

$$\begin{aligned} \rho (x)\frac{\partial ^{2}{u^{k}(x,t)}}{\partial {t^{2}}} = \kappa (x) \frac{\partial ^{2}{u^{k}(x,t)}}{\partial {x^{2}}} + f^{k}(x)\delta (t), \end{aligned}$$

where \(\kappa\) is the elastic constant, \(\rho\) is the density and \(u^k\) is the displacement at the kth experiment. After the force is applied, the displacement is measured at the stations. These experiments are repeated K times. Note that we cannot consider a persistent force in this system because no attenuation leads to an infinite increase in the amplitude over time.

We evaluated the displacement \(u^k(x,t)\) by a convolution between the force \(f^k(x)\) and the Green’s function. The Green’s function can be written in terms of normal mode theory (Dahlen and Tromp 1998) as

$$\begin{aligned} G(x,x';t) = \sum _n \frac{\mathcal{U}_n(x)\mathcal{U}_n(x')}{\omega _n} \sin (\omega _n t), t\ge 0, \end{aligned}$$

where \(\omega _n\) is nth eigenfrequency and \(\mathcal{U}_n\) is nth eigenfunction, which satisfies orthonormality:

$$\begin{aligned} \int _0^L \rho (x) \mathcal{U}_n(x)\mathcal{U}_{n'}(x) \textrm{d}x = \delta _{nn'}. \end{aligned}$$

Then, the displacement can be represented by the convolution

$$\begin{aligned} u^k(x,t)=\sum _n A^k_n \mathcal{U}_n(x)\frac{\sin (\omega _n t)}{\omega _n}, \end{aligned}$$

where \(A_n^k = \int _0^L \mathcal{U}_n(x) f^k(x)\textrm{d}x\). We consider the energy partition of the modes for a random external force in the next section.

2.1.2 Energy partition of modal energy

This subsection discusses the energy balance of each mode: how the work done by external forces is distributed to the kinematic and elastic energy. First, we evaluate the work done by the external force can be given by

$$\begin{aligned} \int _0^{\infty } \int _0^L v^k_n(x,t) f^k(x)\delta (t)\textrm{d}x\textrm{d}t = (A^k_n)^2, \end{aligned}$$

where the particle velocity \(v^k_n\) of nth mode in the kth experiment is written by \(v^k_n = A^k_n \mathcal{U}_n(x)\cos (\omega _n t)\).

The kinetic energy of nth mode \(T_n\) can be evaluated by integrating the kinetic energy density in space as

$$\begin{aligned} T_n = \int _0^L \frac{\rho }{2} (v^k_n)^2 \textrm{d}x = \frac{1}{2}(A^k_n)^2 \cos ^2(\omega _n t). \end{aligned}$$

The elastic energy \(V_n\), on the other hand, can be evaluated by integrating the strain energy density in space. The partial integral with the boundary condition leads to

$$\begin{aligned} V_n = \int _0^L \frac{\kappa }{2} \left( A^k_n\frac{\partial \mathcal{U}_n}{\partial x} \frac{\sin (\omega _n t)}{\omega _n}\right) ^2 \textrm{d}x = \frac{1}{2}(A^k_n)^2 \sin ^2(\omega _n t). \end{aligned}$$

The total energy \(T_n+V_n\) is

$$\begin{aligned} T_n+V_n = (A^k_n)^2, \end{aligned}$$

which balances the work done by external forces.

Here, we consider an excitation by random force \(F^k_i\) at location \(x_i\) as

$$\begin{aligned} f^k(x) = \sum _{i=0}^{I-1} F^k_i\delta (x_i), \end{aligned}$$

where I is the number of the force. We also assume that \(F^k_i(x)\) is white noise as

$$\begin{aligned} \langle F^k_i F^k_{i'}\rangle _k = {\bar{F}}^2 \delta _{ii'}, \end{aligned}$$

where \(\langle \rangle _k\) is ensemble average with respect to k. To simplify the problem, we consider the constant density \(\rho _0\). The expected value of the cross-correlation of \(A_n\) can be evaluated as follows.

$$\begin{aligned} \left\langle {A_{n}^{k} A_{{n^{\prime } }}^{k} } \right\rangle _{k} \sim \frac{{I{\bar{F}}^{2} }}{{L\rho _{0} }}\int_{0}^{L} {{\mathcal{U}}_{n} } (x){\mathcal{U}}_{{n^{\prime } }} (x){\text{d}}x = \frac{{I{\bar{F}}^{2} }}{{L\rho _{0} }}\delta _{{nn^{\prime } }} \equiv {\mathcal{E}}\delta _{{nn^{\prime } }} ,\end{aligned}$$

where \(\mathcal{E}\) is the modal energy. The amplitudes of the different modes \(A_n\) do not correlate with each other, and the total energy of each mode is distributed equally. The expected value of the modal energy is constant for each mode when the external force f is white noise. When a system meets this condition, we call the state the equipartition of energy. We note that the fluctuation–dissipation theorem in physics (Callen and Welton 1951) requires a thermal equilibrium that satisfies the equipartition of energy. Equation 10 can be derived from the principle of equal a priori probabilities in the case.

2.1.3 CCFs under the equipartition of energy

Here we define a CCF \(\phi ^k(x_1,x_2; \tau )\) between \(u^k(x_1)\) and \(u^k(x_2)\) under the equipartition of energy as,

$$\begin{aligned} \phi ^k(x_1, x_2; \tau ) \equiv \lim _{T\rightarrow \infty }\frac{1}{T} \int _0^T u^k(x_1,t)u^k(x_2,t+\tau )\textrm{d}t, \end{aligned}$$

where \(x_1\) shows the location of station 1, and \(x_2\) shows that of station 2. We note that there are two types of CCF definitions. The sign of the second term of the other type is flipped.

The ensemble average of \(\phi (x_1,x_2;t)\) over K time experiments is defined by the ensemble average \(\left\langle \phi ^k(x_1,x_2;\tau )\right\rangle _k\). Then, the CCF can be represented by

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}\tau } \phi (x_1,x_2; \tau ) = -\frac{\mathcal{E}}{2} (G(x_1,x_2;\tau )-G(x_2,x_1; -\tau )), \end{aligned}$$

which relates the CCF to Green’s function (Snieder 2004). The frequency dependence of this equation differs slightly from Aki’s formulation (e.g., Haney et al. 2012). Although the formulation showed that the Hilbert transform of the CCF can be related to Green’s function, the difference can be attributed to spectral normalization.

Compared to CCFs for real data, the biggest problem is the assumption of the equipartition of energy. Because the excitation sources are distributed near the surface, fundamental modes dominate the observed wave field. Energy is not partitioned equally among different mode branches. Indeed, the observed dominance of fundamental modes (e.g., Nishida 2017) shows that the energy is not equally distributed in the radial direction. When considering a single-mode branch, the energy is distributed equally in the horizontal direction if the external force is white noise distributed on the entire surface.

Although the theory assumed K experiments, practically, only one observation is possible. Therefore, the ensemble averages need to be replaced by time averages to calculate the CCFs of observed data for persistent external forces (e.g., ocean waves). However, no attenuation in a closed system causes the problem of diverging amplitudes without attenuation; that is, the seismic wave field in a closed system never meets the equilibrium for persistent sources. Physically, it is natural to consider the equilibrium between the energy dissipation due to attenuation and the work performed by external forces (Kobayashi and Nishida 1998; Fukao et al. 2002).

We note that the term Green’s function is often used in a less mathematically rigorous sense in studies on SI. Mathematically, a CCF converges to the Green’s function only in limited cases. A similar situation occurs in quantum field theory. The Green’s function refers to CCFs, although it does not satisfy the mathematical definition (Zagoskin 2014).

2.2 SI in an open system

Next, we consider SI in an open system. The formulations in an open system depend on the source-receiver configuration: (1) random point sources distributed over the whole space, (2) random sources distributed on a closed curve, and (3) uncorrelated plane wave incidents from various directions. This subsection describes the relationship between the different configurations of an open system. Mathematically, the theory of an open system differs from that of a closed system. We cannot use normal mode theory for an open system because the system loses energy from the radiation boundary.

For ANT, CCFs are usually formulated in an open system, which better approximates the source-receiver configuration. Because the surface wave can be formulated as a 2-D problem with a membrane approximation (e.g., Tanimoto 1990; Tromp and Dahlen 1992b1992a1993), we can consider them as a 2-D potential problem in an open system. For ANT, we explicitly express the mixed-component CCFs of surface waves in a simplified case at the end of the subsection.

2.2.1 Seismic excitation by an infinite number of sources

We consider one realization of the background seismic wave field excited by an infinite number of sources in an open system. When external force acts on the body at a location \({{\textbf {r}}}\) and time \(t = 0\), the 2-D wave equation is given by

$$\begin{aligned} \frac{\partial ^{2}{\psi ({{\textbf {r}}},t)}}{\partial {t^{2}}} = C({{\textbf {r}}}) \nabla ^{2} \psi ({{\textbf {r}}},t) + \sum _{i=0}^{\infty } \delta ({{\textbf {r}}}_i- {{\textbf {r}}})\delta (t)f_i \end{aligned}$$

where \(C({{\textbf {r}}})\) is phase velocity, and \(f_i\) is ith external force at the location \({{\textbf {r}}}_i\) (\(i=0,\cdots ,\infty\)). The potential \(\psi (x,t)\) is given by

$$\begin{aligned} \psi ({{\textbf {r}}},t) = \sum _{i=0}^{\infty } g^{2D}({{\textbf {r}}}-{{\textbf {r}}}_i;t) f_i, \end{aligned}$$

where \(g^{2D}\) is the Green’s function in time domain.

The Fourier component of the potential \(\psi ({{\textbf {r}}},t)\) is written by

$$\begin{aligned} \Psi ({{\textbf {r}}},\omega ) = \sum _{i=0}^{\infty } G^{2D}({{\textbf {r}}}-{{\textbf {r}}}_i;\omega ) f_i, \end{aligned}$$

where \(G^{2D}\) is the Green’s function in frequency domain. In this paper, we define the Fourier transform as follows (Dahlen and Tromp 1998),

$$\begin{aligned} X(\omega ) = \int _{\infty }^{\infty } x(t)e^{-i\omega t} \textrm{d}t. \end{aligned}$$

We note that the Fourier convention depends on the literature. For example, Aki and Richards ’s definition has the opposite sign in the exponential term.

Fig. 2
figure 2

Schematic figure for random sources in 2-dimension. a Source distribution in space. The red triangle shows the observed station. b Potential \(\psi ({{\textbf {r}}}_o,t)\) observed at the origin \({{\textbf {r}}}\) at time t. The wave packet at time \(t_0\) is traveled from the source within the concentric circle with radius r shown in (a)

Here, we consider a potential \(\psi ({{\textbf {r}}}_o,t_0)\) in a simplified case with constant phase velocity as \(C({{\textbf {r}}})=C_0\), where \({{\textbf {r}}}_o\) is the location of the origin and time denotes an arbitrary positive time. The \(\psi ({{\textbf {r}}}_o,t_0)\) is represented by the sum of the arrivals excited by sources along the concentric circle with radius \(r=C_0 t_0\). The typical separation of the sources is assumed to be \(\Delta x\). Within the circle with the band \(\Delta x\), about \(2\pi r/\Delta x\) sources are distributed (Fig. 2 left). Because the amplitude decay is proportional to \(r^{1/2}\), the mean square amplitude \(\psi ({{\textbf {r}}}_o,t_0)\) is estimated to be about \(2\pi r/(\Delta x (r^{1/2})^2)=2\pi /\Delta x\), which does not depend on the distance r. Therefore, after \(t>0\), the fluctuations of \(\psi\) last for a semi-infinite time with the same mean squared amplitudes (Fig. 2 right).

2.2.2 Random sources distributed on a closed curve

The second source configuration is random sources on an arbitrary curve enclosing stations. Now, we observe the potential \(\psi\) at \({{\textbf {r}}}_o\) within the circle with radius r (Fig. 2). Based on the representation theorem, the wave excited by distributed sources outside the circle can be completely reproduced from the stresses and displacements on the circle (e.g., Aki and Richards 1980). The contribution of the sources within the circle for the CCF can be neglected because the contribution from the sources outside the circle becomes infinitely large when considering infinitely long times. These features mean that it is equivalent to persistent sources distributed only on the circle. Suppose that uncorrelated random sources with white spectra on an arbitrary surface enclose stations in a heterogeneous medium. In this case, the time derivative of the corresponding CCF represents the exact Green’s function between a pair of stations (Wapenaar et al. 2010a).

2.2.3 Uncorrelated plane wave incidents

The third source configuration is uncorrelated plane wave incidents from various directions. Assuming that the source-station distances are sufficiently longer than the aperture of the seismic array, the assumption leads to a plane-wave approximation (Nakahara 2006; Haney et al. 2012). The above discussions for a homogeneous source distribution lead to the identical formulation of CCFs regardless of the source configurations, as shown in the following subsection. When we consider a heterogeneous source distribution, the first source configuration is the most flexible to represent the source, and the second and third are gradually less flexible. The flexibility of the first two source configurations causes a complex dependence of a CCF on the locations of the two stations. In contrast, the inflexibility of the third source configuration causes a simple dependence of the CCF only on the relative location between the stations (only the distance and the azimuth). In many cases, the ocean swell activities are far enough away from the stations to approximate the phenomenon well. For simplicity, we will use the third source configuration to represent CCFs based on plane wave incidents in the following subsections.

2.2.4 CCFs for homogeneous source distribution in a 2-D homogeneous medium

For ANT, we explicitly express the CCFs between all pairs of three-component seismometers in an open system when multimode Love and Rayleigh waves dominate the ambient seismic wave field. We evaluated the CCFs for the surface waves in a 2-D problem with a membrane approximation (e.g., Tanimoto 1990; Tromp and Dahlen 1992a, b, 1993). We show an expression of the mixed-component CCFs of surface waves.

Fig. 3
figure 3

Schematic of the geometry of a pair of stations with separation distance r and the coordinate. The figure shows the radial and transverse directions for the station pair. To simplify the equation, the signs of the directions are different from those of other studies. The difference changes the signs of CCF in the components of RR, TT, RZ, and ZR

An arbitrary seismic wave field \({{\textbf {u}}}(r,\theta ; \omega )\) in 2-D can be represented by a superposition of multimode Love and Rayleigh waves as

$$\begin{aligned} {{\textbf {u}}}(r,\theta ;\omega )&= \sum _{n=0}^{\infty } \sum _{m=-\infty }^{\infty } f^{\textrm{Ray}}_{n,m}(\omega )\left[ U_n(\omega ) {{\textbf {P}}}_m(r,\theta ;\omega ) +V_n(\omega ) {{\textbf {B}}}_m(r,\theta ;\omega ) \right] \nonumber \\&\quad +f^{Love}_{n,m}(\omega )W_n(\omega ) {{\textbf {C}}}_m(r,\theta ;\omega ), \end{aligned}$$

where \(f^{\textrm{Ray}}_{n,m}(\omega )\) is forcing for Rayleigh waves, and \(f^{Love}_{n,m}\) is forcing of Love waves. \(U_n\) and \(V_n\) are eigenfunctions of the nth overtone of the Rayleigh wave, and \(W_n\) is the eigenfunction of the nth overtone of the Love wave, which has a real value. The basis functions \({{\textbf {P}}}_{m}\), \({{\textbf {B}}}_{m}\) and \({{\textbf {W}}}_{m}\) are given by

$$\begin{aligned} {{\textbf {P}}}_{m}(r,\theta ;\omega )&= \hat{{{\textbf {z}}}} J_m\left( \frac{\omega r}{c^{\textrm{Ray}}_n(\omega )}\right) e^{im\theta }, \end{aligned}$$
$$\begin{aligned} {{\textbf {B}}}_{m}(r,\theta ;\omega )&= \frac{c^{\textrm{Ray}}_n(\omega )}{\omega r} \left[ \hat{{{\textbf {r}}}} \frac{\partial }{\partial r} +\frac{\hat{\varvec{\theta }}}{r}\frac{\partial }{\partial \theta }\right] J_m\left( \frac{\omega r}{c^{\textrm{Ray}}_n(\omega )}\right) e^{im\theta }, \end{aligned}$$
$$\begin{aligned} {{\textbf {C}}}_{m}(r,\theta ;\omega )&= \frac{c^{Love}_n(\omega )}{\omega r} \left[ \frac{\hat{{{\textbf {r}}}}}{r} \frac{\partial }{\partial \theta } -\hat{\varvec{\theta }}\frac{\partial }{\partial r}\right] J_m\left( \frac{\omega r}{c^{Love}_n(\omega )}\right) e^{im\theta }, \end{aligned}$$

where \(c^{\textrm{Ray}}_n(\omega )\) is the phase velocity of nth overtone of Rayleigh wave, and \(c^{Love}_n(\omega )\) is nth overtone of Love wave.

An assumption of equipartition of energy leads to

$$\begin{aligned} \langle f^{Ray*}_{n,m} f^{\textrm{Ray}}_{n',m'}\rangle U_n(\omega )U_{n'}(\omega )&\equiv P^{\textrm{Ray}}_n(\omega ) \delta _{nn'} \delta _{mm'}, \end{aligned}$$
$$\begin{aligned} \langle f^{Love*}_{n,m} f^{Love}_{n',m'}\rangle W_{n'}(\omega )W_n(\omega )&\equiv P^{Love}_{n}(\omega ) \delta _{nn'}\delta _{mm'},\end{aligned}$$
$$\begin{aligned} \langle f^{Ray*}_{n,m} f^{Love}_{n',m'}\rangle = \langle f^{Love*}_{n,m} f^{\textrm{Ray}}_{n',m'}\rangle&= 0, \end{aligned}$$
$$\begin{aligned} R_n(\omega )&\equiv V_n(\omega )/U_n(\omega ), \end{aligned}$$

where \(P^{\textrm{Ray}}_n(\omega )\) is the power spectrum of nth overtone of Rayleigh wave, and \(P^{Love}_n\) is the power spectrum of nth overtone of Love wave.

The mixed-component cross-spectra for multimode Rayleigh waves are written as,

$$\left[ {\begin{array}{*{20}c} {\Phi _{{ZZ}}^{{{\text{Ray}}}} } & {\Phi _{{ZR}}^{{{\text{Ray}}}} } & {\Phi _{{ZT}}^{{{\text{Ray}}}} } \\ {\Phi _{{RZ}}^{{{\text{Ray}}}} } & {\Phi _{{RR}}^{{{\text{Ray}}}} } & {\Phi _{{RT}}^{{{\text{Ray}}}} } \\ {\Phi _{{TZ}}^{{{\text{Ray}}}} } & {\Phi _{{TR}}^{{{\text{Ray}}}} } & {\Phi _{{TT}}^{{{\text{Ray}}}} } \\ \end{array} } \right] = \sum\limits_{{n = 0}}^{\infty } {\left\{ {P_{n}^{{{\text{Ray}}}} (\omega )\left[ {\begin{array}{*{20}c} {J_{0} \left( {\frac{{\omega r}}{{c_{n}^{{{\text{Ray}}}} }}} \right)} & { - R_{n} J_{1} \left( {\frac{{|\omega |r}}{{c_{n}^{{{\text{Ray}}}} }}} \right)} & 0 \\ { - R_{n} J_{1} \left( {\frac{{|\omega |r}}{{c_{n}^{{{\text{Ray}}}} }}} \right)} & {\frac{{R_{n}^{2} }}{2}J_{{0 - 2}} \left( {\frac{{\omega r}}{{c_{n}^{{{\text{Ray}}}} }}} \right)} & 0 \\ 0 & 0 & {\frac{{R_{n}^{2} }}{2}J_{{0 + 2}} \left( {\frac{{\omega r}}{{c_{n}^{{{\text{Ray}}}} }}} \right)} \\ \end{array} } \right]} \right\}} ,$$

where a cross-spectrum represents the CCF in frequency domain. The mixed-component cross-spectra for multimode Love waves are written as,

$$\begin{aligned} \begin{bmatrix} \Phi _{ZZ}^{Love} &{} \Phi _{ZR}^{Love} &{} \Phi _{ZT}^{Love} \\ \Phi _{RZ}^{Love} &{} \Phi _{RR}^{Love} &{} \Phi _{RT}^{Love} \\ \Phi _{TZ}^{Love} &{} \Phi _{TR}^{Love} &{} \Phi _{TT}^{Love} \end{bmatrix}=\sum _{n=0}^{\infty } \left\{ P^{Love}_n(\omega )\begin{bmatrix} 0 &{} 0 &{} 0 \\ 0 &{} \frac{1}{2}J_{0+2}\left( \frac{\omega r}{c^{Love}_n}\right) &{} 0 \\ 0 &{} 0 &{} \frac{1}{2}J_{0-2}\left( \frac{\omega r}{c^{Love}_n}\right) \end{bmatrix}\right\} , \end{aligned}$$

where \(J_{0-2}(z)\equiv J_0(z)-J_2(z)\) and \(J_{0+2}(z)\equiv J_0(z)+J_2(z)\). Z represents the vertical component, and R and T represent the horizontal components according to the polarization direction (Fig. 3).

These equations are identical to the result of Haney et al. (2012). A similar formulation for DAS observation is given by Nakahara et al. (2021). We note that the difference of sign \(\Phi ^{\textrm{Ray}}_{RZ}\) originates from the definition of the radial direction, which differs from this study. Using this formulation, cross-spectra for Rayleigh waves lead to a symmetric relation between the RZ and ZR components as \(\Phi _{RZ}^{\textrm{Ray}} = \Phi _{ZR}^{\textrm{Ray}}\), whereas cross-spectra for body waves lead to an antisymmetric relation as \(\Phi _{RZ}^{\textrm{body}} = -\Phi _{ZR}^{\textrm{body}}\) (Takagi et al. 2014).

Attenuation becomes significant when considering a seismic wave field in sediment above 0.1 Hz (Nishida et al. 2008a; Prieto et al. 2009, 2011). However, physically plausible attenuation measurements are practically difficult (e.g., Liu and Ben-Zion 2013; Tsai 2011) because there is ambiguity between the source heterogeneities and the attenuation. Although new techniques have been developed to overcome this problem (e.g., Liu et al. 2021; Magrini and Boschi 2021; Bowden et al. 2015), they are beyond the scope of this study. Even if the physically plausible attenuation estimation is difficult, apparent Q measurements are feasible for better fitting a synthetic cross-spectrum to an observation (e.g., Nishida et al. 2008a).

2.3 Travel-time anomalies due to the source heterogeneities

In this subsection, we evaluate travel-time anomalies due to source heterogeneities. Although the mode decomposition in the previous subsection can be applied to more general source distributions, the resultant CCF for heterogeneous source distribution becomes complex. With a stronger assumption of the 2-D ambient seismic wave field described by a superposition of uncorrelated plane waves from all azimuths (the third source-receiver configuration), the resultant CCF depends only on the relative location between the station pair. The condition that the excitation sources exist at infinite distances simplifies the problem (see Sect. 2.2.3 for details). Reducing variables simplifies the effects on the source heterogeneities. This assumption is sufficiently realistic, because we consider the on-land observation of the ambient seismic wave field excited by distant ocean swells. This estimate is consistent with Weaver et al. (2009), but with a different derivation.

Fig. 4
figure 4

Schematic figure of the geometry of stations and the stationary phase regions shown in blue. The sources are located sufficiently distant from the stations, and the distribution is given by \(B(\theta )\) as a function of the incident azimuth \(\theta\)

We assumed that the sources were located along a closed curve surrounding the stations. If the source is distant enough compared with the wavelength, the cross-spectrum \(\Phi\) between a pair of stations is given by

$$\begin{aligned} \Phi (r,\theta ;\omega ) = \sum _{m=0}^{\infty } i^m J_m(kr)\beta _{m}(\theta ), \end{aligned}$$

where wavenumber \(k(\omega )\) is \(\omega /c(\omega )\), m is the azimuthal order, \(\theta\) is azimuth (Fig. 4), and the Fourier coefficients of the source intensities \(\beta _{m}(\theta )\) are real value as

$$\begin{aligned} \beta _m(\theta ) = {\left\{ \begin{array}{ll} \frac{a_0}{2} &{} m=0,\\ a_m \cos (m\theta ) + b_m\sin (m\theta ) &{} m\ne 0, \end{array}\right. } \end{aligned}$$

where \(a_{m}\) and \(b_m\) are Fourier coefficients of the source distribution (Harmon et al. 2010; Cox 1973). The spatially symmetric part of the cross-spectrum with even m is a real function, whereas the antisymmetric part with odd m is an imaginary function. The symmetric feature originated from the plane wave decomposition by the Bessel function as

$$\begin{aligned} e^{ikr\cos \theta } = \sum _{m=-\infty }^{\infty } i^m J_m(kr)e^{-im\theta }. \end{aligned}$$

Here, we evaluate a travel-time anomaly due to source heterogeneities of CCFs by the asymptotic expansion of the Bessel function for a large argument given by Eqs. 9.2.5 and 9.2.6 of Abramowitz et al. (1988),

$$\begin{aligned} J_{m}(kr)&\approx \sqrt{\frac{2}{\pi kr}}\left[ \cos \left( \chi -\frac{m\pi }{2}\right) -\frac{4m^2-1}{8kr}\sin \left( \chi -\frac{m\pi }{2}\right) \right] ,\end{aligned}$$
$$\begin{aligned} N_{m}(kr)&\approx \sqrt{\frac{2}{\pi kr}}\left[ \sin \left( \chi -\frac{m\pi }{2}\right) +\frac{4m^2-1}{8kr}\cos \left( \chi -\frac{m\pi }{2}\right) \right] , \end{aligned}$$

where \(N_m\) is the Bessel function of the second kind (also known as the Neumann function) and \(\chi \equiv kr-\pi /4\). The real part of the cross-spectrum can be evaluated by only even orders as,

$$\begin{aligned} {\text{Re}} [\Phi (r,\theta ;\omega )] = \sum _{m=0}^{\infty } (-1)^{m} J_{2m}(kr)\beta _{2m}(\theta ). \end{aligned}$$

Insertion of the asymptotic Bessel function into the above equation leads to the following equations. The real part of the cross-spectrum is given by

$$\begin{aligned} {\text{Re}} [\Phi (r,\theta ;\omega )]&= \sum _{m=0}^{\infty } (-1)^{m} J_{2m}(kr) \beta _{2m}(\theta ) \nonumber \\&\approx J_0(kr)B_{ev}(\theta ) + \sqrt{\frac{2}{\pi kr}}\frac{\sin \chi }{2kr} B_{ev}'', \end{aligned}$$

and the imaginary part of the cross-spectrum is given by

$$\begin{aligned} {\text{Im}} [\Phi (r,\theta ;\omega )]&= i\sum _{m=1}^{\infty } (-1)^{m}i J_{2m+1}(kr)\beta _{2m+1}(\theta ) \nonumber \\&\approx N_0(kr)iB_{od}(\theta ) - i\sqrt{\frac{2}{\pi kr}}\frac{\cos (\chi )}{2kr} B_{od}'', \end{aligned}$$

where \(B_{od}\) and \(B_{ev}\) are defined by

$$\begin{aligned} B_{od}&\equiv \sum _{m=0}^{\infty } \beta _{2m+1}(\theta ), \end{aligned}$$
$$\begin{aligned} B_{ev}&\equiv \sum _{m=0}^{\infty } \beta _{2m}(\theta ). \end{aligned}$$

The real and imaginary parts lead to the following cross-spectrum \(\Phi\) as,

$$\begin{aligned} \Phi (r,\theta ;\omega ) \approx H_0^{(1)}(kr)B(\theta )&+ H_0^{(2)}(kr)B(\theta +\pi )\nonumber \\&-\frac{i}{\sqrt{2\pi (kr)^3}}\left[ e^{i\chi }B''(\theta )-e^{-i\chi } B''(\theta +\pi ) \right] , \end{aligned}$$

where \(B\equiv B_{ev}(\theta )+B_{od}(\theta )\). \(B(\theta )\) represents the intensity of incident plane waves as a function of azimuth \(\theta\). Here, we use the following relations.

$$\begin{aligned} B(\theta ) + B(\theta +\pi )&= 2B_{ev}(\theta ) ,\\ B(\theta ) - B(\theta +\pi )&= 2B_{od}(\theta ) . \end{aligned}$$

To evaluate the travel-time anomalies due to source heterogeneity, we consider the far-field approximation (\(kr\gg 1\)) of the Hankel function of the first kind:

$$\begin{aligned} H^{(1)}_0(kr) \approx \sqrt{\frac{2}{\pi kr}}e^{i \left( \chi -\frac{1}{8kr}\right) }. \end{aligned}$$

We also assume that the source heterogeneity is weak as

$$\begin{aligned} \frac{1}{krB}\frac{d^2B}{\textrm{d}\theta ^2} \ll 1. \end{aligned}$$

The causal part of the cross-spectra (Eq. 38) can be written by

$$\begin{aligned} H^{(1)}_0(kr)B(\theta ) -\frac{ie^{i\chi }B''(\theta )}{\sqrt{2\pi (kr)^3}}&\approx B(\theta ) \sqrt{\frac{2}{\pi kr}} \exp \left[ i \left( \chi -\frac{1}{8kr}- \frac{B''(\theta )}{2krB(\theta )}\right) \right] . \end{aligned}$$

Therefore, the travel-time anomaly \(\delta t\) of the causal part can be estimated by

$$\begin{aligned} \delta t(r, \theta ) \approx \frac{B''(\theta )}{2\omega krB(\theta )}, \end{aligned}$$

where t is the travel time and \(\omega\) is the nominal angular frequency. Figure 4 shows the stationary phase regions, which dominate the contribution of the CCF (Snieder 2006). The aperture \(\delta \theta\) of the stationary phase regions is proportional to the square root of the ratio between the wavelength and the station separation distance r (Froment et al. 2010). Because the narrower the aperture at the higher frequency, the travel-time anomaly decreases for a longer distance and a higher frequency. The equation does not include the first derivative of the intensity B because the symmetry cancels the contribution (Weaver et al. 2009). Equation 42 gives an error proportional to the second derivative of the intensity B to the azimuth. This equation can estimate the phase velocity bias due to source heterogeneities and correction (Weaver et al. 2009; Froment et al. 2010).

Finally, the analytic representation of a cross-spectrum for heterogeneous sources is given by

$$\begin{aligned} \Phi (r,\theta ;\omega ) \approx 4\left( e^{-\frac{\pi }{2} i}G^{*}(r;\omega )e^{-i\omega \delta t(r,\theta )}B(\theta )+ e^{\frac{\pi }{2} i}G(r;\omega )e^{i\omega \delta t(r,\theta +\pi )}B(\theta +\pi )\right) , \end{aligned}$$

where Green’s function in 2-D is given by

$$\begin{aligned} G(r;\omega ) = -i\frac{H^{(2)}_0(kr)}{4}. \end{aligned}$$

This subsection shows how to estimate travel-time anomalies due to source heterogeneities. We note that the CCFs still satisfy the original wave equation even if the source distribution is heterogeneous. This property ensures that phase velocity measurements can be possible even for a heterogeneous source distribution if the station density is sufficiently high and the excitation sources are located outside the station array. Following Lin et al. (2013), we briefly show the relation in the following.

Here we consider potential \(\psi\), which satisfies a wave equation \(\mathcal{L}\) as

$$\begin{aligned} \mathcal{L}[\psi ({{\textbf {r}}},t)] \equiv \left[ \frac{\partial ^{2}}{\partial {t^{2}}}- C({{\textbf {r}}}) \nabla ^{2} \right] \psi ({{\textbf {r}}},t) = 0. \end{aligned}$$

We also assume that no sources are distributed within the station array. The CCF \(\phi\) is defined by

$$\begin{aligned} \phi ({{\textbf {r}}}_1,{{\textbf {r}}}_2,t) = \lim _{T\rightarrow \infty } \frac{1}{T}\int _0^{T} \psi ({{\textbf {r}}}_1,\tau ) \psi ({{\textbf {r}}}_2,t+\tau )\textrm{d}\tau \end{aligned}$$

The CCF satisfies the wave equation as

$$\begin{aligned} \mathcal{L}_1[\phi ({{\textbf {r}}}_1,{{\textbf {r}}}_2,t)] = \lim _{T\rightarrow \infty } \frac{1}{T}\int _0^{T} \mathcal{L}_1 [\psi ({{\textbf {r}}}_1,\tau )] \psi ({{\textbf {r}}}_2,t+\tau )\textrm{d}t = 0, \end{aligned}$$

where \(\mathcal{L}_1\) is the wave equation concerning \({{\textbf {r}}}_1\). The CCF also satisfies the wave equation for \({{\textbf {r}}}_2\). Because the CCF satisfies the wave equation only without spectral whitening and one-bit normalization, these procedures often create unphysical phases (Nakata et al. 2013).

3 Data processing

If the seismic wave field is stochastic stationary, data processing for the CCF calculation is straightforward. In a realistic situation, the spectral structure of seismic records changes with time. Moreover, seismometers record transient phenomena such as earthquakes, which are noise for the CCF calculations. Instrumental noise is also problematic. To improve the quality of CCF, we need to select and regularize seismic records (Bensen et al. 2007; Ritzwoller and Feng 2019). This section explains the practical data processing.

3.1 Instrument

Before processing data, we must consider the influence of instrumental characteristics. This review considers the geophysical observable of ground motions and pressure in the ocean. The seismometer records the ground motions, and they can be categorized into strong motion sensors, short-period sensors, and broad-band sensors according to the frequency and amplitude ranges.

3.1.1 Sensors

Today, on a global scale, broad-band seismometers are commonly used. Because instrumental responses depend on sensor types, the responses should be removed prior to data analysis. Typically, broad-band sensors have a flat response in particle velocity above the corner frequency of about 10 to 100 s, and the response decreases with decreasing frequency below the corner frequency.

Short-period sensors (typically velocity seismometers with a corner frequency of 1 Hz) are also commonly used on a local scale. In the recent development of dense observation with more than 1000 stations (also known as a large-N array, e.g., Lin 2013), geophones with a natural frequency of 10 Hz are also often used. Although only nominal responses are available in many cases, the responses are slightly different from one to another (Ueno et al. 2015; Takeo et al. 2022). When we use the data with a frequency close to the natural frequency, the difference might cause an apparent phase shift, which could be problematic for a short separation distance.

When a sensor is deployed on the seafloor, pressure gauges are also commonly used (Webb 1998). Cross-correlation analysis of ocean bottom pressure gauges exhibits Rayleigh wave propagations by cross-correlation analysis (e.g., Takeo et al. 2014). When a sensor is deployed on the seafloor or bottom of a borehole, the orientation of the sensor is unknown (e.g., Takagi et al. 2019). In such cases, the orientations should be determined by known events, such as an active shot or teleseismic events.

3.1.2 Instrumental corrections

In ambient noise cross-correlation analysis, small-amplitude seismic wave signals are used compared to regular earthquake signals. Moreover, the signals may fall within the lower frequency band than the natural frequency of the seismometers, potentially residing below the instrument noise level. In such cases, instrument noise can significantly impact the quality of CCFs. Instrument noise can be categorized into incoherent and coherent noises across observation stations. Incoherent noise between stations decreases with the square root of the number of stacks of CCFs. However, coherent noise does not diminish and remains in the CCFs even with many stacks.

Takagi et al. (2015) conducted an ambient noise cross-correlation analysis using Hi-net short-period seismometers in Japan and found periodic pulses at 60-second and 1-second intervals within the CCFs. The stacking of raw records revealed that these periodic pulses were not a result of data analysis but were inherent in the raw records shared across all stations. Although the amplitudes were extremely small, less than 1 bit, they appeared in the CCFs at frequencies lower than the natural frequency band because of their coherence across stations. Similar instrument noise and its impact on CCFs have also been reported in other observation systems (e.g., Wang et al. 2017). This coherent periodic instrument noise is attributed to load fluctuations in data loggers, such as time calibration and data recording. There is also temporally random but coherent instrument noise across stations. The coherent random instrument noise appears as pulses near the zero lag in the CCFs. Such coherent random noise was found in DAS observations (e.g., Tribaldos and Ajo-Franklin 2021) and cabled ocean-bottom seafloor seismometers (Takagi et al. 2021).

It is crucial to remove coherent instrument noise to utilize CCFs to image subsurface structure. Several methods have been proposed to deal with periodic coherent instrument noise. One method involves creating waveforms of periodic coherent instrument noise by stacking many raw records and subtracting them from the raw observation records (Takagi et al. 2015). Alternatively, instead of computing instrument noise waveforms, one can calculate the difference of raw observation records (e.g., the difference between two adjacent days) and then compute the CCFs of these differential waveforms (Takagi et al. 2021). To reduce randomly coherent instrument noise, practical methods such as modeling and estimating the spectral shape of instrument noise (Takagi et al. 2021) or subtracting the median of CCFs from each trace (Tribaldos and Ajo-Franklin 2021) have been proposed. By taking advantage of dense DAS observations, the F-K filtering of fast-propagating phases before computing the CCFs is effective (Fukushima et al. 2022). In summary, while coherent periodic or coherent random instrument noise across observation stations can contaminate CCFs, understanding the characteristics of the noise and applying appropriate removal methods enable us to expand the analyzable frequency range and ultimately improve the resolution of structural imaging.

The time reliability is crucial for the cross-correlation analysis between stations. For onshore observations, satellite systems, such as the Global Navigation Satellite System (GNSS), usually give a precise time stamp. When GNSS reception fails, the temporal change in the CCFs has been used to estimate the error in the time stamps (Sens-Schönfelder 2008; Hirose and Ueda 2023). For offshore observations, precise time stamps are mostly given at deployment and recovery, and the drift of the internal clock can be linearly corrected. When the time stamp cannot be obtained either at the beginning or at the end of the observation, the temporal change in the CCFs can be used again to estimate the drift (Hannemann et al. 2014; Gouédard et al. 2014; Takeo et al. 2014). The time asymmetric shape of the CCFs can also be used to estimate unknown instrumental responses such as the constant time shift of the logger or the frequency-dependent phase response of the differential pressure gauges (Takeo et al. 2014).

3.2 Data selection

We must choose seismic data that satisfy a local stationary state to apply the SI. Outliers such as earthquake records and instrumental glitches decrease the SNR. This subsection introduces (1) one-bit normalization, (2) RMS-based data selection, and (3) polarization-based data selection.

3.2.1 One-bit normalization

To suppress the effects of transient phenomena such as earthquakes, one-bit normalization (e.g., Aki 1955; Bensen et al. 2007; Cupillard et al. 2011) keeps only the sign of the original information, changing all positive values to 1 and all negative values to -1. Because this method is simple, it has been widely used. When transient phenomena (e.g., many aftershocks and packet loss during data logging) occur frequently, their contributions decrease the quality of the CCFs. In such a case, careful reduction of transient signals allows us to improve the CCF’s quality. A disadvantage of this method is the loss of amplitude information. Different amplitude normalizations among components cause problems in the cross-correlation analysis of multicomponent data.

3.2.2 RMS-based data selection

If the signal obeys a stationary process, we can reject outliers using an RMS threshold. In a strict sense, secondary microseisms do not follow a stationary process but are in a local stationary state: RMS changes significantly by several orders of magnitude on a timescale of several days (Fig. 5). We must define the local background levels for the rejection, which change slowly over several days.

Here, we introduce an example for estimating the local background level by Nishida and Takagi (2022). To define the typical background level of the whole network at the ith time step, we calculate the median of MS amplitudes for all stations \(P_i\). Here, we consider a situation: ith time step is the latest accepted time step, and we reject n successive time steps. If \(P_{i+n}\) changes suddenly, we reject the (\(i+n+1)\)th time step with the threshold \(\epsilon\):

$$\begin{aligned} |\ln (P_{i+1})-\ln (P_i)|> n\epsilon , \end{aligned}$$

we reject all data at time step \(i+n+1\) with the criterion proportional to the rejection duration (orange points in Fig. 5).

3.2.3 Data rejection associated with large earthquake using a catalog

Seismometers record global propagations of many seismic phases excited by large earthquakes, typically with moment magnitudes greater than 6 (Ekström et al. 2012). In most frequency ranges, the RMS criterion can reject the corresponding data. However, the secondary microseisms are still large enough to hide some earthquakes. Even smaller amplitudes can bias the CCFs because the earthquake signals are coherent among the stations. Careful rejections of large earthquakes improve the quality. Using a global earthquake catalog (e.g., the global Centroid Moment Tensor (CMT) catalog (Ekström et al. 2012)), data rejection based on the magnitude in the catalog is feasible to exclude hidden seismic phases.

Approximately the amplitude of earthquake data decays exponentially with time. The typical duration \(D_e\) is:

$$\begin{aligned} D_e = 2Q\frac{\ln (J M/10^{23})}{(2\pi f)}, \end{aligned}$$

where f is the dominant frequency, Q is the typical quality factor, M is the moment, and J is a typical geometrical spreading (Nishida and Kobayashi 1999; Tanimoto and Um 1999). Data were rejected for \(D_e\) seconds after the arrival. The orange dots in Fig. 5 show a typical example of rejected data (Nishida and Takagi 2022). Data can be rejected until the earthquake signal is much smaller than the background noise levels.

Fig. 5
figure 5

Example of the data selection process. The orange dots show data excluded using the global CMT catalog (Ekström et al. 2012), and the blue dots show data transients when the amplitude changes suddenly. The black dots represent selected segments. The vertical axis shows the relative power normalized by the Peterson NLNM (Peterson 1993)), which represents the lowest ground noise on Earth. Taken from Nishida and Takagi (2022)

3.2.4 Polarization-based data selection

If the excitation processes are stochastic stationary in time and space, the energy partition among the mode branches should be constant. The energy partition changes over time because they are not stationary in a realistic situation. For example, Takagi et al. (2018) pointed out that the energy partition of P-wave microseisms at periods of 4–8 s becomes more significant on seismically quiet days based on the polarization analysis of Hi-net data in Japan. When we want to emphasize overtones of seismic surface waves, data selection based on polarization information can potentially improve the detection of overtone branches. Pedersen et al. (2023) proposed the data selection based on the H/V spectral ratio to extract teleseismic body wave microseisms, and this strategy may be applicable for detecting overtone branches.

3.3 Regularization: weight and normalization of CCFs

If both the signal and the noise are subjected to the Gaussian distribution, calculating the CCFs from selected data is straightforward. However, neither noise nor signal behaves well in a realistic situation. Regularization of data is required as part of the preprocessing (Bensen et al. 2007).

3.3.1 Weighting of cross-spectra

The weighting on the data is important when we calculate a cross-spectrum between a station pair (Nishida 2014). The amplitudes of microseisms at frequencies around 0.1 Hz change with time, which reaches more than one order of magnitude on a timescale of one day. Spectral whitening efficiently reduces the non-stationarity (e.g., Bensen et al. 2007). The amplitudes of the seismic hum in the mHz band are stationary, although the local noise level is higher than the signal levels. In this case, the weighting of the data depending on the local noise level can be effective (Nishida 2014; Takeo et al. 2013). In the mHz band, the noise levels of the horizontal components are orders of magnitude higher than those of the vertical components. For the calculation of the cross-spectrum, we suppressed noisy Fourier components using the data weighting as follows.

We calculated a weighted cross-spectrum \(\Phi _{ij}(f)\) between the ith and jth stations at a frequency f as

$$\begin{aligned} \Phi _{ij}(f) = \frac{1}{\displaystyle \sum _k w^k_{ij}(f)}\sum _k w^k_{ij}(f) U^{k*}_i (f) U^{k}_j (f), \end{aligned}$$

where \(U^{k}_i (f)\) is a Fourier spectrum of ground acceleration of the kth segment at the ith station. The weighting factor \(w^k_{ij}(f)\) depends on the situation. If the noise is much larger than the signal, the weight can be estimated by

$$\begin{aligned} w^k_{ij}(f) = \frac{1}{|U^{k}_i (f) |} \frac{1}{|U^{k}_j (f) |}. \end{aligned}$$

We can evaluate the uncertainty \(\sigma _{ij}(f)\) of the resultant \({\Phi }_{ij}(f)\) as

$$\begin{aligned} \sigma _{ij}(f) = \frac{1}{\sqrt{N_{ij}(f)}} \left( \frac{1}{N_{ij}(f)} \displaystyle \sum _{k=1}^{N_{ij}} w^k_{ij}(f) \right) ^{-1}, \end{aligned}$$

where \(N_{ij}(f)\) is number of stacked traces (Takeo et al. 2013). The corresponding CCF \(\phi _{ij}(t)\) is calculated by the inverse Fourier transform of \(\Phi _{ij}(f)\) (e.g., Fig. 6).

Fig. 6
figure 6

Cross-correlation functions as a function of interstation distances filtered at 20–30 s, 10–20 s, and 5–10 s. The amplitudes are multiplied by \(\displaystyle \sum _k w^k_{ij}(f)\) in the frequency domain to produce similar noise levels for different pairs of stations. The dashed pink line shows the group velocity of each specified mode at a typical period for the 1-D isotropic model. The modes include the fundamental mode and first-higher-mode Rayleigh waves, 0 S, and 1 S. The TT components show wave packets of Love waves xT, which could not be identified as a single mode due to possible interference between the fundamental mode and the higher mode with very similar group and phase velocities. Taken from Takeo et al. (2016)

3.3.2 Spectral whitening

If the signal is larger than the noise, the assumption of the previous Sect. 3.3.1 is broken down. Although simple stacking works for signals with Gaussian distribution, the signal level changes significantly with time. In particular, the amplitudes of secondary microseisms can change by several orders of magnitude over a timescale of a few days. Without regularizing the amplitude, the resultant CCFs emphasize days of high-ocean-swell activities too much. For the amplitude regularization, the cross-spectrum \(\Phi _{ij}\) is normalized by the amplitude (e.g., Aki 1957; Nishida et al. 2008a; Prieto et al. 2009) as,

$$\begin{aligned} \Phi _{ij}(f) = \sum _k w^k_{ij}(f) U^{k*}_i (f) U^{k}_j (f) = \sum _k \frac{U^{k*}_i (f) U^{k}_j (f)}{|U^{k}_i (f)||U^{k}_j (f)|}. \end{aligned}$$

The \(\Phi _{ij}\) with spectral whitening is also known as cross-coherency. The equation is similar to the weighting, but the factor \(\sum _k w_{ij}^k\) is not required.

Although direct waves are accurately reconstructed with spectral whitening, they cause many pseudo-arrivals. Therefore, if we ignore the pseudo-arrivals, all normalizations work practically (Nakata 2020). To mitigate the pseudo-arrivals, smoothing of \(w_{ij}^k\) in the frequency domain was proposed (Tauzin et al. 2019).

Another implementation is performed by spectral whitening of the individual seismic trace before calculating CCFs (Bensen et al. 2007). Specifically, the inverse Fourier transform of \(U^{k}_i (f)/|U^{k*}_i (f)|\) corresponds to the whitened seismic trace.

3.3.3 Temporal flattening

Different amplitude normalizations at different stations become problematic if we utilize amplitude information (e.g., attenuation). Although this topic is beyond the scope. Temporal flattening (Zhou et al. 2020; Weaver 2011) was proposed to preserve the amplitude information.

4 Dispersion measurement of surface waves for constructing local 1-D structures

In the first step of ANT (Fig. 1a), a reference dispersion curve of phase or group velocities is measured using the CCFs of an array. This section briefly summarizes the measurement methods: (1) frequency–time analysis, (2) slant stack technique, (3) SPAC method, (4) FJ method, and (5) waveform fitting. The dispersion curves are also important for the construction of local 1-D structures, which can be an initial model of ANT (see Sect. 6.3.1).

4.1 Frequency–time analysis (FTAN)

Frequency–time analysis is a common method for measuring the dispersion curves of earthquake data (Levshin et al. 1992; Cotte and Laske 2002; Romanowicz 2020). This method is also feasible for ambient noise CCFs (Bensen et al. 2007; Harmon et al. 2007; Yao et al. 2011; Spica et al. 2018). For the measurements, we usually use the time-symmetric part of CCF to reduce the effects of source heterogeneities (see equations 34 and 35 in Sect. 2.3). In this subsection, we use the folded CCF to lag time 0 \(\phi ^+_{ij}\) as,

$$\begin{aligned} \phi ^+_{ij}(t) = {\left\{ \begin{array}{ll} \frac{1}{2}\left[ \phi _{ij}(t)+\phi _{ij}(-t) \right] &{} t \ge 0\\ 0 &{} t<0 \end{array}\right. }, \end{aligned}$$

and the corresponding Fourier component is defined by \(\Phi ^+_{ij}(f)\). The folding procedure distorts the waveform when the separation distance is shorter than about one wavelength. The tail of the acausal wave packet contaminates the causal peak because the CCF does not satisfy causality, as in the case of Green’s function.

A group velocity-period diagram can be calculated for each CCF. The general outline of the procedures is as follows, although there are some other implementations. Figure 7 shows an example of a synthetic test.

At a given center frequency f, a narrow-band filter with the center frequency \(f_0\) is applied to the CCF. For example FTAN (Levshin et al. 2018; Ritzwoller and Feng 2019) use a Gaussian filter

$$\begin{aligned} F_b(f,f_0) = e^{-\alpha \left( \frac{f-f_0}{f_0}\right) ^2}, \end{aligned}$$

where \(\alpha\) is the coefficient, determining the bandwidth (Fig. 7a, b). Because a smaller \(\alpha\) broadens the bandwidth, a smaller \(\alpha\) provides a stable estimate for noisy data. At the same time, smaller \(\alpha\) can bias the estimation in cases of strong dispersion. \(\alpha\) often changes with the separation distance (Ritzwoller et al. 2011) as

$$\begin{aligned} \alpha = \alpha _0\sqrt{r/r_0}, \end{aligned}$$

where \(\alpha _0\) is the reference value at the separation distance \(r_0\). In a synthetic test shown in Fig. 7, \(\alpha _0\) is 40, and \(r_0\) is 200 km.

The envelope functions of the filtered CCFs are plotted against the periods and corresponding group velocities calculated from the separation distance and the lag time (Fig. 7c). Correction for \(f_0\) using the instantaneous frequency based on the analytical signal is feasible for a better estimate (Shapiro and Singh 1999; Levshin et al. 2018). In this case, a synthetic CCF is given by

$$\begin{aligned} \Phi ^{\textrm{syn}}(\omega ) = J_0(k(\omega )r) + n(\omega ), \end{aligned}$$

where n is Gaussian noise with a standard deviation of 0.03. We can measure group velocities by tracing the loci of a local maximum as a function of the period. Because a strong frequency dependence on the spectral content can bias the group velocity measurements, the correction is necessary for accurate measurements. AFTAN (Automatic Frequency–Time Analysis) package (Barmine 2018) is available for this type of measurement. We show an example of the estimation errors at 0.1 Hz as a function of the separation distance (Fig. 7d).

We can also measure the phase velocity from the phase of \(\Phi ^+_{ij}(f)\) (e.g., Levshin 2018; Ritzwoller and Feng 2019) if a single mode is dominant. Multimode interference is problematic for such measurements. To isolate a single mode, we must apply a proper filter. Here, we introduce two typical filters: (1) time variable filter (Landisman et al. 1969) and (2) floating filter (Levshin et al. 1992).

Fig. 7
figure 7

a Gaussian filters for various separation distances at 0.1 Hz. b Gaussian filter for \(r = 200\) km. c FTAN diagram for a synthetic CCF with the observation error is 1%. The separation distance is \(r = 200\) km. The black line shows the theoretical group velocity and the red dots show measured values using synthetic data. d Group and phase velocity anomalies for the synthetic CCFs with 1% measurement errors as a function of separation distance at 0.1 Hz. e Measured phases (blue dots) and the theoretical values (dashed lines) at 0.1 Hz. f Histogram of group and phase velocities estimation errors at 0.1 Hz

The time variable filter (Harmon et al. 2007; Landisman et al. 1969) isolates a single-mode wave packet by windowing the CCF around the arrival time predicted by the group velocity measurements. Window length is set according to the dispersion because the stronger dispersion makes the duration of the wave packet longer. A typical window length (Harmon et al. 2007) \(T_w\) is given by

$$\begin{aligned} T_w = 160 \times \frac{\textrm{d} U}{\textrm{d}\tau }, \end{aligned}$$

where U is group velocity, and \(\tau\) is period.

Floating filtering is also a common procedure for extracting a single mode (Levshin et al. 1992). Based on the group velocity measurement, phase correction \(\psi (\omega )\) is defined by

$$\begin{aligned} \psi (\omega ) = r\int _{\omega _0}^{\omega _1} U^{-1} \textrm{d}\omega + t_1 \omega , \end{aligned}$$

where r is the separation distance, and \(t_1\) is a constant. \(\Phi ^+_{ij}(\omega )\) is multiplied by \(e^{i\psi }\) to compress the signal around the average group time. \(t_1\) is chosen to focus on the average group arrival. After cleaning up the signal, \(\phi ^+\) is recovered by the inverse Fourier transform with \(e^{-i\psi }\).

The phase velocity C can be measured from the phase of \(\Phi ^+(\omega )\) (Lin et al. 2008; Harmon et al. 2007) as,

$$\begin{aligned} C(\omega ) = \frac{\omega r}{ -\arg (\Phi ^+(\omega )) + 2\pi N +\frac{\pi }{4}}, \end{aligned}$$

where N is an integer number. The \(\pi /4\) originates from the Bessel function, and the \(2\pi\) ambiguity can be solved if a good starting model is available. Note that \(\Phi ^+(\omega )\) is out of phase by \(\pi /2\) with the corresponding empirical Green’s function (see Sect. 2 of Lin et al. (2008)).

Figure 7e shows the phase of the synthetic test and the theoretical values. Although they agree, the finite bandwidth of the Gaussian filter biases the measured phases. At higher frequencies, estimating the phase ambiguity N is difficult due to the ambiguity of an initial model. If a dense array is available, the array methods, as discussed in the following subsections, are feasible for measuring the phase velocity at higher frequencies. The measurement using FTAN has a limitation for the separation distance of CCFs. The group velocity measurements are accurate when the separation distance exceeds three wavelengths. For this reason, (Bensen et al. 2007) recommended that the longest period should be r/12.

Histograms in Fig. 7f show the estimation errors of the group and phase velocities. Although the measured phase velocities were relatively accurate, the dispersion caused bias. When applying this method to strongly dispersive wave fields, we must carefully consider bias, particularly for low-SNR data.

An advantage of the FTAN method is the ability to measure the dispersion of a single station pair. As already pointed out, the FTAN method also has the disadvantage that the dispersion measurements are difficult for separation distances shorter than about three wavelengths. FTAN method is still feasible for measuring multimode dispersion if a wave packet of each mode branch is isolated in the time domain. If multiple phases overlap in time domain, an array-based method (slant stack or SPAC) is appropriate.

Figure 8 shows an example of a multimode Rayleigh wave dispersion diagram measured using ocean-bottom seismometers. The figure shows two clear branches. The fundamental mode with a period 8 s has energy in the ocean, while it has energy in the crust and the uppermost mantle with the longer period. The overtone also has energy in the crust and the uppermost mantle. This behavior of a multimode Rayleigh wave in an oceanic region is common, as shown in Fig. 6. A seismic array on a soft sediment layer records multiple mode branches above about 0.2 Hz (e.g., Spica et al. 2018).

Fig. 8
figure 8

Group and phase velocities and the average spectrogram of raw data. Gray-scale filled contours represent the averaged spectrograms of all the CCF that met the SNR requirements, where the maximum value for each period in the individual spectrograms was normalized to one. The fundamental and second mode Rayleigh wave is visible in these data as the broad regions. Filled black circles with error bars indicate the average group velocities for the fundamental mode, while open circles with error bars indicate the average group velocities for the higher-mode Rayleigh wave. Black triangles with error bars indicate fundamental mode uncorrected phase velocities; gray triangles with error bars indicate \(-\pi /4\) corrected phase velocities for higher and fundamental mode phase velocities, and open triangles with error bars indicate higher mode phase velocities. Taken from Harmon et al. (2007)

4.2 Slant stack technique

Suppose that a dense array with station spacing comparable to the wavelength is available. In this case, the slant stack technique, also known as the frequency–wavenumber (F-K) method, is a common array processing (Rost and Thomas 2002; Gouédard et al. 2008). This method also assumes that the superposition of plane waves can represent the wave field. The beam power \(B_p\) is defined by a sum of time-shifted waveforms \(\Phi ^+(\omega )\) with expected travel-time anomaly \(\omega {{\textbf {p}}}\cdot {{\textbf {x}}}_i\)

$$\begin{aligned} B_p({{\textbf {p}}},\omega ) = \left| \sum _{kl}\sqrt{r_{kl}}\Phi ^+_{kl}(\omega )e^{i \omega {{\textbf {p}}}\cdot ({{\textbf {x}}}_k-{{\textbf {x}}}_l)}\right| ^2, \end{aligned}$$

where k and l represent the station number, \(r_{kl}\) is the separation distance between kth station and lth station, and \({{\textbf {x}}}_k\) is the location vector of kth station represented by a Cartesian coordinate with the origin at the array centroid. The term \(\sqrt{r_{kl}}\) corrects the amplitude decay due to geometrical spreading for a surface wave.

The disadvantage of this method is a potential bias due to the plane wave assumption. Equation 31 shows that the \(J_0\) (vertical component) the \(J_{0-2}/2\) (radial and transverse component) can be approximated as

$$\begin{aligned} \sqrt{r}J_0(kr)&\approx \cos (kr-\frac{\pi }{4}) +\frac{1}{8kr}\sin (kr-\frac{\pi }{4}), \end{aligned}$$
$$\begin{aligned} \frac{\sqrt{r}}{2}J_{0-2}(kr)&\approx \cos (kr-\frac{\pi }{4}) -\frac{7}{8kr}\sin (kr-\frac{\pi }{4}). \end{aligned}$$

The contribution of the terms \((8kr)^{-1}\) and \(7(8kr)^{-1}\) is significant for a shorter station pair. These equations suggest that the bias of the horizontal components could be larger. We tested the bias of the vertical component using a synthetic CCF at 0.15 Hz given by Eq. 57 and the slowness of 0.3 s/km. Figure 9 shows the station distribution of the array (Hi-net stations with a radius of about 100 km) and the result. The beam peaks at 0.3 s/km, consistent with the theoretical value. To verify the bias, we measured the peak slowness for 10,000 experiments. Figure 10 shows the histogram. Although the measurements fluctuate, the central value is 0.05% lower than the theoretical value. The bias of about 0.05% is consistent with \(-(8k^2r^2)^{-1}\) estimated from equation 62 with a typical separation distance of 60 km.

This method can reveal multimode dispersion on a scale of 100 m to 100 km: DAS observation on a 100 m scale (Dou et al. 2017), a 10 km scale ocean bottom DAS (Viens et al. 2022; Williams et al. 2021), and a 100 km scale basin structure (e.g., Boué et al. 2016; Jiang and Denolle 2022). Li et al. (2020) demonstrated that supervised machine learning methods are feasible for separating multimode information.

There is another implementation to calculate the F-K spectrum. When calculating the slant stack, the separation distances between stations are not regular. We can re-sample the CCFs on regular grids if all the station separation distances are shorter than half the wavelength. CCFs can be mapped from the spatial domain to the F-K domain using the 2-D fast Fourier transform (FFT) after a correction \(r^{1/2}\) for geometrical spreading (Gabriels et al. 1987). This method is applied to joint inversion of the first overtone and fundamental mode for deep imaging in the Valhall oil field using ambient seismic noise (Tomar et al. 20172018).

Fig. 9
figure 9

Synthetic test of slant stack, SPAC, and FJ methods. We assumed an ambient seismic wave field with a slowness of 0.3 s/km. a Station distribution. b Histogram of separation distance between station pairs. c Plot in slowness domain: slant stack, variance reduction of the SPAC method, and the FJ spectrum as a function of slowness. The noise is assumed to be 3% of the signal

Fig. 10
figure 10

Bootstrap errors of the estimated slowness by SPAC, slant stacking, and FJ method. The noise is 3% of the signal, and the sampling number for the bootstrap method is 10,000

4.3 SPAC method

This subsection explains the implementation of the SPAC method. In the case of homogeneous source distribution in a stratified medium, the synthetic cross-spectrum \(\rho ^{\textrm{syn}}_{ZZ}\) between vertical components can be represented by a Bessel function (see Sect. 2.2.4) as

$$\begin{aligned} \rho ^{\textrm{syn}}_{ZZ}(r, a, C;\omega ) = a(\omega ) J_0\left( \frac{\omega r}{C} \right) , \end{aligned}$$

where r is the separation distance of the station pair. Assuming an arbitrary wavenumber \(k(\omega )\) at a given frequency, we determine the optimum amplitude a, by minimizing the squared difference S between the observed cross-spectrum \(\Phi _i\) of ith pair and \(\rho ^{syn}(r_i,a,C;\omega )\) (e.g., Nishida et al. 2008b) with weight \(w_i\). S is given by

$$\begin{aligned} S(a,C; \omega ) = \sum _{i=1}^{N} w_i \left[ \rho ^{syn}_{ZZ}(r_i, a, C;\omega )-\Phi _i(\omega )\right] ^2, \end{aligned}$$

where N is number of the pairs. By minimizing S at a given frequency \(\omega\), we infer the phase velocity \(C(\omega )\) and the amplitude \(a(\omega )\). Regarding a, S can be minimized analytically as

$$\begin{aligned} \frac{\partial S}{\partial a} = 0. \end{aligned}$$

We obtain the optimum a for given C and \(\omega\) as

$$\begin{aligned} a^{\textrm{opt}}(C, \omega ) = \frac{\displaystyle \sum _{i=1}^{N} w_i \Phi _i(\omega ) J_0\left( \frac{\omega r_i}{C}\right) }{\displaystyle \sum _{i=1}^{N} w_i J_0\left( \frac{\omega r_i}{C}\right) ^2 }. \end{aligned}$$

To minimize S with respect to C in \(\omega\), we maximize the variance reduction \(V_R\) given by

$$\begin{aligned} V_R(C,\omega ) = 1-\frac{\displaystyle \sum _{i=1}^{N} w_i \left[ a^{\textrm{opt}}(\omega ) J_0\left( \frac{\omega r_i}{C}\right) -\Phi _i(\omega )\right] ^2}{\displaystyle \sum _{i=1}^{N} w_i \Phi _i(\omega )^2}. \end{aligned}$$

The maximum is inferred using the grid search method. The estimate can be refined using a generalized least-squares method (Menke and Jin 2015).

Figure 9 shows a test of the SPAC method using a synthetic cross-spectrum at 0.15 Hz given by Eq. 57 and the slowness of 0.3 s/km. We used an array with a radius of about 100 km shown in Fig. 9a. Most of the separation distance ranges from 40 to 130 km (Fig. 9b). Figure 9c shows \(V_R\) as a function of slowness (1/C). Because the amplitude of the surface wave decreases with \(1/\sqrt{r}\), the appropriate modeling by SPAC sharpens the peak compared to the beam power.

We measured the maximum of \(V_R\) for 10,000 experiments to evaluate the estimation errors. Figure 10 shows the histogram. The measurements fluctuate within 0.05%, and the median value is around the theoretical value.

Modeling \(\Phi _{RR}\) and \(\Phi _{TT}\) in horizontal components is more complicated because both Love and Rayleigh waves contribute to these components, as shown by Eqs. 26 and 27. The corresponding synthetic cross-spectra \(\rho ^{syn}_{RR}\) and \(\rho ^{syn}_{TT}\) can be represented by

$$\begin{aligned} \begin{bmatrix} \rho ^{syn}_{RR} \\ \rho ^{syn}_{TT} \end{bmatrix} = \begin{bmatrix}\displaystyle J_{0-2}\left( \frac{\omega r}{c^R}\right) &{}J_{0+2}\left( \frac{\omega r}{c^L}\right) \\ \displaystyle J_{0+2}\left( \frac{\omega r}{c^R}\right) &{} J_{0-2}\left( \frac{\omega r}{c^L}\right) \end{bmatrix} \begin{bmatrix} a^R(\omega ) \\ a^L(\omega ) \end{bmatrix}. \end{aligned}$$

When we analyze the station with a separation distance shorter than the wavelength, we must consider the Love/Rayleigh wave for the RR/TT component, respectively. Although the Love and Rayleigh waves are coupled in this system, we can solve this equation directly. The maximum search becomes complex because the variance reduction at a given frequency is a function of \(C^R\) and \(C^L\).

Here, we consider a simplified problem with a far-field approximation. Because \(J_{0+2}\) is proportional to \(r^{-3/2}\), the term becomes negligible for r longer than the wavelength (e.g., Takeo et al. 2013). In this case, we can simply apply the SPAC method to these components separately by replacing \(J_0\) of the vertical component with \(J_{0-2}\). Figure 11 shows \(V_R\) using the SPAC method in Japan (Nishida et al. 2008a). The diagram shows clear Rayleigh and Love branches. Although this method assumed a single-mode branch, the figure exhibits multimode branches, which enable us to increase depth resolution (e.g., Ikeda et al. 2012). The SPAC method is also applied to seismic data from the ocean floor (Takeo et al. 2013, 2014; Lin et al. 2016; Takeo et al. 2016, 2018; Kawano et al. 2023). To separate multimode Rayleigh waves, polarization information could also be useful (Nayak and Thurber 2020).

On a global scale, cross-spectra can be modeled using the Legendre function instead of the Bessel function (Nishida et al. 2002; Nishida 2014).

Fig. 11
figure 11

Frequency–phase velocity spectra using all pairs of stations. a A spectrum of radial components shows a clear Rayleigh wave branch, and b that of transverse components, which shows a fundamental Love wave branch, first overtone branch, and second overtone branch. Taken from Nishida et al. (2008a)

4.4 FJ method

The frequency–Bessel (FJ) method (Wang et al. 2019a; Hu et al. 2020; Nimiya et al. 2023) is closely related to Aki’s SPAC method. Here, we reinterpret the FJ method with the SPAC method. For simplicity, we consider the ZZ component in this subsection. Refer to Hu et al. (2020) for other components.

If we observed \(\phi (r,\omega )\) as a function of the separation distance r, the squared difference with weight r can be evaluated by the following integration,

$$\begin{aligned} S = \int _0^{r_N}\left[ a(\omega ) J_0\left( \frac{\omega r}{C}\right) -\Phi (r,\omega )\right] ^2 r \textrm{d}r. \end{aligned}$$

By minimizing S for given C and \(\omega\), we obtain

$$\begin{aligned} a^{\textrm{opt}}(C, \omega ) = \frac{\displaystyle \int _0^{r_N} J_0\left( \frac{\omega r}{C}\right) \Phi (r,\omega )r\textrm{d}r }{\displaystyle \int _0^{r_N} \left[ J_0\left( \frac{\omega r}{C}\right) \right] ^2r\textrm{d}r } \end{aligned}$$

If \(r_N\) is much longer than the wavelength, the denominator can be approximated by

$$\begin{aligned} \int _0^{r_N} \left[ J_0\left( \frac{\omega r}{C}\right) \right] ^2r\textrm{d}r \approx \frac{r_N C}{\pi \omega }. \end{aligned}$$

\(a^{\textrm{opt}}(C,\omega )\) can be approximated by the integral (the numerator) evaluated at discrete points (Hu et al. 2020) as

$$\begin{aligned} a^{\textrm{opt}}(C, \omega )&\approx \frac{\pi \omega }{r_N C} \sum _i \Phi _i(\omega ) J_0 \left( \frac{\omega r_i}{C}\right) {\bar{r}}_i \Delta r_i \nonumber \\&= \frac{\pi \omega }{r_NC}\sum _i \Phi (r_i) J_0\left( \frac{\omega r_i}{C}\right) \frac{r_{i+1}^2+2r_i(r_{i+1}-r_{i-1})-r_{i-1}^2}{8}, \end{aligned}$$


$$\begin{aligned} {\bar{r}}_i&\equiv \frac{1}{2}\left( \frac{r_{j-1}+r_j}{2}+ \frac{r_j+r_{j+1}}{2}\right) ,\end{aligned}$$
$$\begin{aligned} \Delta r&\equiv \frac{1}{2}(r_{j+1}-r_{j-1}), \end{aligned}$$

and \(\Phi _i\) is the cross-spectrum of the ith station pair. FJ spectrum \(I(C,\omega )\) defined by Nimiya et al. (2023) can be approximated by

$$\begin{aligned} I(\omega ,C) \equiv \frac{\omega ^2}{C} \int _0^{r_N} \Phi (r,\omega ) J_0\left( \frac{\omega r}{C}\right) r\textrm{d}r \approx \frac{r_N\omega }{\pi } a^{\textrm{opt}}(C, \omega ). \end{aligned}$$

Although Hu et al. (2020)’s definition of the coefficient \(\omega ^2/C^3\) instead of \(\omega ^2/C\) has a physical meaning, the emphasis on small phase velocity C may bias the estimation, as shown below. We can interpret the FJ spectrum as a variant of SPAC method with weight \(w_i\) defined by

$$\begin{aligned} w_i = \frac{\pi \omega }{r_NC}\frac{r_{i+1}^2+2r_i(r_{i+1}-r_{i-1})-r_{i-1}^2}{8}. \end{aligned}$$

We again tested the FJ method using a synthetic cross-spectrum given by Eq. 57 and the slowness of 0.3 s/km. Figure 9c shows the FJ spectrum. The peak width is narrower, but the side lobes are more significant than the others. Let us evaluate the side lobes using Eq. 71. If \(r_N\) is sufficiently longer than the wavelength, the FJ spectrum \(\omega\) for synthetic data \(J(\omega r_i/C_0)\) can be approximated (see Eq. 31) by

$$\begin{aligned} I(C, \omega ) \approx \frac{2\omega }{\pi } \displaystyle \sum _ i \cos \left( \frac{\omega r_i}{C} -\frac{\pi }{4}\right) \cos \left( \frac{\omega r_i}{C_0} -\frac{\pi }{4}\right) \Delta r_i. \end{aligned}$$

This integral can be interpreted as the Fourier series. Fourier series without a taper function cause significant spectral leakage (Oppenheim and Schafer 2014). If \(r_N\) is sufficiently large, \(I(C,\omega )\) has value only at \(C=C_0\) like the delta function, due to the orthogonality of the cosine function (more precisely, the orthogonality of the Bessel function). This means that the FJ method has, in principle, the highest resolution in phase velocity.

To evaluate the accuracy, we measured the peak slowness for 10,000 experiments using the FJ method. Figure 10 shows the histogram. The estimated value of the FJ spectrum of Hu et al. (2020) is about 0.2% larger than the theoretical value. The bias is caused by the fall-off of the spectra at low phase velocity of \(C^{-2}\) (Shapiro and Singh 1999). The drop in the \(C^{-2}\) term (Nimiya et al. 2023) of equation 78 significantly reduces bias, as shown in Fig. 10. The greater variation in the estimates is due to the greater weight of the smaller amplitudes at longer separation distances (Eq. 77). When the observation error is sufficiently small, the high resolution of the FJ method in the slowness domain will be beneficial for separating multimode surface waves. When the observation error is significant, the SPAC method can consider the noise using appropriate weights.

Figure 12 shows an example of field data (Hu et al. 2020). Although this subsection mentioned only the ZZ components, it can be applied to other components. Because the RR and TT components record both Love and Rayleigh waves, as shown by Eq. 69, their coupling complicates the estimation of the FJ spectra in the RR and TT components. Hu et al. (2020) proposed procedures to separate the Love and Rayleigh wave using orthogonal relations of the Bessel functions (Fig. 12).

Fig. 12
figure 12

Multicomponent frequency–Bessel spectrograms of field data in North America. ad are the FJ spectrograms \(I_{R0}\), \(I_{R1}\), \(I_{R2}\), and \(I_{L0}\), respectively. Here, \(I_{R0}\) is FJ spectrum of Rayleigh wave using ZZ components, \(I_{R1}\) is FJ spectrum of Rayleigh wave using RZ and ZR components, \(I_{R2}\) is FJ spectrum of Rayleigh wave using RR and TT components, and \(I_{L0}\) is FJ spectrum of Love wave using RR and TT components. The black arrow in (a) indicates the location where the fundamental mode Rayleigh wave is bifurcated into two branches. The ith overtone is labeled next to each dispersion curve, and 0 indicates the fundamental mode. Taken from Hu et al. (2020)

4.5 Multimode dispersion measurements by waveform fitting in a model space

Fig. 13
figure 13

Array-based dispersion measurement of the real data. a 1-D shear velocity structure modeled with six layers of variable thickness (parameterization A). The thick and thin black lines represent the mean and standard deviation of the estimated models from the 100 bootstrap samples. b Rayleigh wave dispersion curves estimated with model A. The thick and thin white curves represent the mean and three times the standard deviation of the 100 bootstrap results. The background image is the average of the variance reductions of the conventional cross-spectral fitting for the ZZ, ZR, and RR components. The frequency range without the white shadow is used in the dispersion measurement. c Love wave dispersion curves estimated with model A. Taken from Takagi and Nishida (2022)

The frequency–phase velocity diagram does not show clear mode branches in regions with strong lateral heterogeneities, such as volcanoes and ocean sediments (Takeo et al. 2022; Takagi and Nishida 2022). For example, Fig. 13 shows an example of SPAC diagrams in an oceanic area offshore the central part of northern Japan near the Japan Trench. They show fundamental Rayleigh and Love waves below 0.08 Hz, but other mode branches are intermittent and ambiguous. Lateral variations of the shallow soft sediment distort the dispersion curves of multiple modes. Moreover, multimode interference also complexes the diagrams. In this case, the physically plausible phase velocity measurements are difficult. A physically plausible constraint on the dispersion is crucial for better measurements. One strategy is to skip phase velocity measurements using a SPAC method. For a given S-wave velocity structure, the corresponding phase velocity of jth overtone \(C^j({\beta };\omega )\) can be evaluated, where \({\beta }\) is a vector with components of S-wave velocity at each depth. The variance reduction given by Eq. 68 is summed over in a frequency range from \(\omega _0\) to \(\omega _1\) and the N mode branches as

$$\begin{aligned} V_R^{all}({\beta }) = \sum _{j=0}^{N-1} \int _{\omega _0}^{\omega _1} V_R(C^j({\beta };\omega );\omega )\textrm{d}\omega , \end{aligned}$$

where \(V_R^{all}\) is the summed variance reduction. By maximizing \(V_R^{all}\) for \({\beta }\), we can infer the local 1-D structure. This method can be applied to the RR and TT components even if Love and Rayleigh waves appear in both the RR and TT components with shorter separation distances. Figure 13 shows an example. Figure 13a shows the S-wave velocity models, and Fig. 13b, c shows the corresponding mode branches. Along the calculated mode branches, \(V_R\) is summed over the frequency and all wave types. By maximizing \(V_R^{all}\), we can infer the local 1-D model. This method searches for the global maximum of \(V_R^{all}\) in the model space. Physically implausible models can be rejected automatically. The plot of \(V_R\) against frequencies and phase velocities, as shown in the figures, is feasible to check the validity of the estimated model. Liu et al. (2023) proposed an inversion method, which directly compared multimode dispersion diagrams with the synthetic kernel of the Green’s function, which shares a similar idea as the method here.

5 Dispersion measurements for each path

In the second step of ANT (Fig. 1b), phase or group velocity is measured for each path. This section introduces four methods to measure dispersion for each CCF: (1) FTAN, (2) the zero-crossing method, (3) the SPAC-based waveform fitting in the phase velocity domain, and (4) the waveform fitting in a model space. FTAN and the zero-crossing method (Ekström et al. 2009) are feasible to measure the phase velocity of a single mode. The latter two methods measure phase velocities by SPAC-based waveform fitting. The first measures the phase velocity directory under an assumption of single-mode dominance, whereas the second measures by fitting synthetic CCF to the observed one with the help of reference CCF constructed by the local 1-D structural inversion.

5.1 FTAN

FTAN explained in Sect. 4.1 can also be used for phase and group velocity measurements of a single trace (e.g., Lin et al. 2008). We can measure the group velocities of multimode surface waves by picking up local maxima from a stacked FTAN diagram over an area (Spica et al. 2018). However, multimode interference makes phase velocity measurements using FTAN difficult. Even in the case of multimodes, if we can isolate a single mode by a time variable filter (Landisman et al. 1969) or (2) a floating filter (Levshin et al. 1992), we can measure the phase velocities.

5.2 Zero-crossing method

The zero-crossing method (Ekström et al. 2009) is briefly summarized here. We will consider the ZZ component for simplicity, but it can be easily extended to other components. For the frequency \(\omega _n\) of the nth zero-crossing observed, the phase velocity can be evaluated by

$$\begin{aligned} C_m(\omega _n) = \frac{\omega _n r}{z_{n+2m}}, \end{aligned}$$

where \(z_n\) of is the nth zero of the Bessel function \(J_0\), \(m=0, \pm 1, \pm 2\), which represents the ambiguity of n. This formulation allows missed or extra zero crossings because observed noise makes measurement of \(z_n\) difficult. The messed or extra zero crossings 2m arise from the distinction between positive-to-negative zero and negative-to-positive zero. The advantages of this method over the FTAN method are (i) applicability even for a shorter separation distance (typically shorter than three wavelengths) and (ii) smaller bias for shorter distances owing to no far-field approximation. Although zero-crossing measurements are practically robust (Ekström et al. 2009), the low-SNR CCF behaviors become unstable. In particular, because strong lateral heterogeneities distort the phase information, appropriate filtering prior to application is crucial for stable measurements.

5.3 SPAC-based waveform fitting in phase velocity domain

This method measures the phase velocity anomaly in the ith frequency band from \(\omega _i\) to \(\omega _{i+1}\) under the assumption of single-mode dominance. If we already inferred local 1-D models based on the SPAC method, the synthetic cross-spectrum can be represented by

$$\begin{aligned} \Phi ^{\textrm{synth}}(\epsilon ; \omega ) = a(\omega ) J_0\left( \frac{\omega r}{C(1+\epsilon )}\right) , \end{aligned}$$

where \(\epsilon\) is phase velocity anomaly. We infer \(\delta a\) and \(\epsilon\) by minimizing the squared difference defined by

$$\begin{aligned} S(\alpha _i,\epsilon _i) \equiv \int _{\omega _i}^{\omega _{i+1}} \left[ (1+\alpha _i)\Phi ^{\textrm{synth}}(\epsilon _i; \omega )-\Phi ^{obs}(\omega )\right] ^2\textrm{d}\omega , \end{aligned}$$

where \(\epsilon _i\) is the phase velocity anomaly at \(\omega _i\), and \(\alpha _i\) is an amplitude correction term at \(\omega _i\). \(\alpha _i\) is estimated first by the least-squares method for \(\epsilon _i\). With the estimated \(\alpha _i\), \(\epsilon _i\) is measured using the grid search method (Nagaoka et al. 2012; Yamaya et al. 2021).

5.4 Multimode dispersion measurements by waveform fitting in a model space

In the case of the dense spacing of the mode branches in the phase velocity domain, there is a method for estimating the multimode dispersion curve by waveform fitting using a 1-D velocity structure as model parameters. Yoshizawa and Kennett (2002) developed the original idea of measuring the phase velocity dispersion curve for each path of teleseismic surface waves. We applied this method to the cross-spectra of ambient noise continuously (Takagi and Nishida 2022; Takeo et al. 2022). The advantages of this method are that it provides physically achievable dispersion curves and can simultaneously measure dispersion curves for multimode surface waves. Given the 1-D velocity structure, it is possible to calculate dispersion curves for all corresponding higher-order modes, explaining observed waveforms through the superposition of multimode surface waves. For the superposition of multiple modes, it is necessary to know the excitation amplitudes for each mode. In the case of teleseismic surface waves, the excitation amplitudes for each surface wave mode can be calculated from the seismic source mechanism. We can use the amplitudes \(a^j\) calculated from the array-based SPAC method for the cross-spectrum of ambient noise (Eq. 67). Given a reference 1-D structure of the first step (Sect. 4), the synthetic cross-spectrum can be represented by

$$\begin{aligned} \Phi ^{{\textrm{synth}}}(\delta \beta ; \omega ) = \sum _{j=0}^{N-1} a^j(\omega ) J_0\left( \frac{\omega r}{C^j(\beta +\delta \beta ;\omega )}\right) , \end{aligned}$$

where \(\delta \beta\) is a perturbation of 1-D structure, N is the number of the mode branches, and j represents the order of the mode branch. We infer \(\delta \beta\) by minimizing the residual sum of squares given by

$$\begin{aligned} S(\delta \beta ) = \int _{\omega _0}^{\omega _1} \frac{w(\omega )^2 \left[ \Phi ^{\text {synth}}(\delta \beta ;\omega )-\Phi ^{\text {obs}}(\omega )\right] ^2}{w(\omega )^2} \textrm{d}\omega . \end{aligned}$$

It is worth noting that the 1-D velocity structure, which serves as a model parameter, does not necessarily represent the true Earth structure (Yoshizawa and Kennett 2002). The 1-D structure model is not unique with respect to band-limited dispersion curves; different 1-D models can reproduce similar dispersion curves. The main products here are the multimode dispersion curves calculated by the 1-D structure. Thus, the search range of the model space should be wide enough to reproduce a dispersion curve that nicely fits the waveform.

These equations correspond to the vertical component but can be readily extended to other components. In waveform fitting, a relatively wide frequency range is often handled, and multicomponent data can be simultaneously processed. Therefore, it is important to consider the weighting of the data in practical applications. The error estimate of the SPAC method based on the array can be utilized as the weights (Takagi and Nishida 2022).

Figure 14 exemplifies its application to S-net data (Takagi and Nishida 2022). Even in this case, where the signal-to-noise ratio of the CCFs is low, it is possible to robustly estimate multimode dispersion curves by utilizing the entire waveform of multicomponent CCFs and applying physical constraints based on 1-D structure.

Fig. 14
figure 14

Dispersion measurement for a single path by waveform fitting in a model space. a Modeled pair-specific 1-D shear velocity structure for the S-net station pair N.S2N08-N.S3N20. The thick and thin black lines represent the mean and standard deviation of the 1-D structures estimated by the bootstrap method. The gray line results from the reference model obtained by the array-based SPAC method in Fig. 13a. b Phase velocity dispersion curves of the fundamental-mode Rayleigh wave (0R), the first-overtone Rayleigh wave (1R), and the fundamental-mode Love wave (0 L). The thin blue and red lines are the means and three times standard deviations of the estimated pair-specific dispersion curves. The thick blue and red curves represent the frequency ranges that meet the quality control criteria based on variance reductions within narrow frequency bands. The gray ones are the results of the array-based measurement in Fig. 13b, c. c Fitting of the cross-spectra for this pair. The black and red are the data and the modeled cross-spectra, respectively. d Same as in (c), but both cross-spectra are normalized by the standard deviation of the cross-spectral data estimated from the array-based SPAC method. Taken from Takagi and Nishida (2022)

6 2-D multimode ANT

In the third step of ANT (Fig. 1c), we conduct a 2-D inversion of the phase/group velocity of multimode surface waves from the phase/group velocity measurements for all pairs of stations. The method has been developed to explore the mantle structure using earthquake data (e.g., Woodhouse and Dziewonski 1984; Nakanishi and Anderson 1982; Tanimoto and Anderson 1985; Ekström et al. 1997; Nolet 2008). Because review papers covering ANT are available (e.g., Barmin et al. 2001; Montagner, 2015 Rawlinson et al. 2010; Thurber and Ritsema 2015; ), see them for details.

For inversion, we should consider the sensitivity kernels of the phase/group velocity. Although in an idealized situation, the kernel is identical to a finite-frequency kernel for an earthquake, a heterogeneous distribution of the noise source distorts the kernels (e.g., Nishida 2011; Tromp et al. 2010; Hanasoge 2013; Fichtner 2014; 2015; Fichtner et al. 2016; de Vos et al. 2013).

The first subsection describes how source heterogeneities affect the kernel using an analytical method for a simplified case. The next section explains the conditions for the application of surface wave tomography. The last subsection briefly summarizes the 2-D inversion methods: local 1-D inversion, ray-theoretical inversion, and finite-frequency inversion.

6.1 2-D phase velocity sensitivity kernel for ANT

The sensitivity kernel for 2-D phase velocity is evaluated based on the Born and Rytov approximations (Born et al. 1999; Ishimaru 1997; Nolet 2008). The Born approximation relates the scattered wave to the phase velocity perturbations with the first-order approximation, but it does not separate the effects on the amplitude and the phase. On the other hand, the Rytov approximation by taking the logarithm of the wave (e.g., Yoshizawa and Kennett 2005) enables us to separate the logarithmic amplitude term and the phase term.

Here, we consider the Born sensitivity kernel K, which relates the phase velocity perturbation and the cross-spectrum perturbation \(\delta \Phi\) as

$$\begin{aligned} \delta \Phi ({{\textbf {r}}}_1,{{\textbf {r}}}_2;\omega ) = \int \frac{\delta C}{C} K({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )\textrm{d}\Sigma _3, \end{aligned}$$

where C is phase velocity and \(\delta C\) is the perturbation. The locations \({{\textbf {r}}}_{1}\), \({{\textbf {r}}}_2,\) and \({{\textbf {r}}}_3\) are shown in Fig. 15a. The kernel K is given by

$$\begin{aligned} K({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega ) = -2k^2\left\{ \Phi ({{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )G^{*}({{\textbf {r}}}_1,{{\textbf {r}}}_3;\omega ) +\Phi ^{*}({{\textbf {r}}}_1,{{\textbf {r}}}_3;\omega )G({{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega ) \right\} , \end{aligned}$$

where \(\Phi ({{\textbf {r}}}_1,{{\textbf {r}}}_2;\omega )\) is the cross-spectrum between \({{\textbf {r}}}_1\) and \({{\textbf {r}}}_2\) at frequency \(\omega\) (e.g., Nishida 2011), and G is the Green’s function in frequency domain. Although the kernel was generally calculated numerically, it can be estimated by observed data in principle if dense data were available (Chmiel et al. 2018).

Fig. 15
figure 15

a Schematic map of the geometry of stations. The star symbols show the station locations at \({{\textbf {r}}}_1\) and \({{\textbf {r}}}_2\). The open circle shows the location of a phase velocity anomaly at \({{\textbf {r}}}_3\). b Elliptic curves with constant \(|r_{13}-r_{23}|\), and hyperbolic curves with constant \(|r_{13}-r_{23}|\)

With an assumption of the 2-D ambient seismic wave field described by a superposition of uncorrelated plane waves as in Sect. 2.3, Eq. 43 gives the analytic representation of \(\Phi\). Assuming that \(\omega \delta t\) and kr are small enough, the kernel can be rewritten by

$$\begin{aligned} K({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )&\approx \frac{-ik}{\pi \sqrt{r_{13}r_{23}}} \left\{ \left( e^{+i(\chi _{23}+\chi _{13})}B(\theta _2) -e^{-i(\chi _{13}+\chi _{23})}B(\theta _1) \right) \right. \nonumber \\&\hspace{1.8cm} \left. +e^{i(\chi _{13}-\chi _{23})}\left( B(\theta _2+\pi )-B(\theta _1+\pi )\right) \right\} , \end{aligned}$$

where \(B(\theta )\) represents the intensity of incident plane waves as a function of azimuth \(\theta\), and \(r_{ij}\) is the distance between \({{\textbf {r}}}_i\) and \({{\textbf {r}}}_j\), \(\chi _{ij} \equiv kr_{ij}-\pi /4\) (see Fig. 15a). The form of a phase sensitivity kernel for a homogeneous source distribution is the same as that of the earthquake data (Nishida 2011).

The Rytov approximation is feasible to evaluate the phase sensitivity kernel (e.g., Yoshizawa and Kennett 2005; Nishida 2011). Although the approximation requires a single-wave packet, the cross-spectrum \(\Phi\) given by Eq. 43 (\(\theta =0\), and \(r=r_{12}\)) includes both the causal and acausal parts. We divide the cross-spectrum \(\Phi\) and the kernel K into causal and acausal parts according to the sign of the exponent, approximately. The causal part can be represented by \(e^{-kr+\omega t}\), while the acausal part can be represented by \(e^{kr+\omega t}\), where the wavenumber k and the angular frequency \(\omega\) are positive. The cross-spectrum \(\Phi\) is divided into the causal part \(\Phi ^{c}\) and the acausal part \(\Phi ^{ac}\) as,

$$\begin{aligned} \Phi ^c(\omega )&= H^{(2)}_0(kr_{12}) e^{i\omega \delta t(r_{12},\pi )}B(\pi ) \nonumber \\ \Phi ^{ac}(\omega )&= H^{(1)}_0(kr_{12}) e^{-i\omega \delta t(r_{12},\pi )}B(0). \end{aligned}$$

The Born sensitivity kernel K is also divided into the causal part \(K^c\) and the acausal part \(K^{ac}\) according to the sign of the exponent as,

$$\begin{aligned} K^c&\equiv {\left\{ \begin{array}{ll} \frac{-ik}{\pi \sqrt{r_{13}r_{23}}} \left[ -e^{-ik(\chi _{13}+\chi _{23})}B(\theta _1) +e^{ik(\chi _{13}-\chi _{23})} \left( B(\theta _2+\pi )-B(\theta _1+\pi )\right) \right] &{} r_{13} < r_{23}\\ \frac{-k}{\pi \sqrt{r_{13}r_{23}}} \left[ -e^{-ik(\chi _{13}+\chi _{23})}B(\theta _1)\right] &{} r_{13} \ge r_{23} \end{array}\right. }, \end{aligned}$$


$$\begin{aligned} K^{ac}&\equiv {\left\{ \begin{array}{ll} \frac{-ik}{\pi \sqrt{r_{13}r_{23}}} \left[ e^{ik(\chi _{13}+\chi _{23})}B(\theta _2) \right] &{} r_{13} < r_{23}\\ \frac{-k}{\pi \sqrt{r_{13}r_{23}}} \left[ e^{ik(\chi _{13}+\chi _{23})}B(\theta _2) +e^{ik(\chi _{13}-\chi _{23})} \left( B(\theta _2+\pi )-B(\theta _1+\pi )\right) \right] &{} r_{13} \ge r_{23} \end{array}\right. }, \end{aligned}$$

where the region \(r_{13} > r_{23}\) corresponds to the first and fourth quadrants, whereas the region \(r_{13}<r_{23}\) corresponds to the second and third quadrants (Fig. 15b). The Born kernels also satisfy \(K=K^{a}+K^{ac}\) from the definition.

The causal perturbation \(\delta \Phi ^a\) and acausal perturbation \(\delta \Phi ^{ac}\) are related to phase velocity anomalies as

$$\begin{aligned} \delta \Phi ^c&= \int \frac{\delta c}{c} K^c({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )\textrm{d}\Sigma _3,\end{aligned}$$
$$\begin{aligned} \delta \Phi ^{ac}&= \int \frac{\delta c}{c} K^{ac}({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )\textrm{d}\Sigma _3. \end{aligned}$$

Most studies on ANT used the time-symmetric part of the cross-correlation function to measure phase velocity anomalies. The time-symmetric part of cross-spectrum \({\bar{\Phi }}\) is defined by

$$\begin{aligned} {\bar{\Phi }}(\omega ) \equiv \frac{\Phi ^{c}+\Phi ^{ac*}}{2}, \end{aligned}$$

We consider the relation between perturbation of \({\bar{\Phi }}\) and the corresponding Born kernel \({\bar{K}}\) as

$$\begin{aligned} \delta {\bar{\Phi }} = \int \frac{\delta c}{c} {\bar{K}}({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )\textrm{d}\Sigma _3, \end{aligned}$$


$$\begin{aligned} \delta {\bar{\Phi }}(\omega )&\equiv \frac{\delta \Phi ^{c}+\delta \Phi ^{ac*}}{2}\end{aligned}$$
$$\begin{aligned} {\bar{K}}({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )&\equiv \frac{K^{a}({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )+ K^{ac*}({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )}{2}. \end{aligned}$$

The Rytov approximation is employed to obtain a phase sensitivity kernel for phase velocity perturbations (Nishida 2011). In the Rytov method, the logarithm of the cross-spectrum \({\bar{\Phi }}\) is considered instead of the cross-spectrum itself. By taking the logarithm, \({\bar{\Phi }}\) can be divided into real (amplitude) and imaginary (phase) parts, as follows:

$$\begin{aligned} \ln {\bar{\Phi }} = \ln (Ae^{-i\psi }) = \ln A -i\psi , \end{aligned}$$

where A is the amplitude of the causal part, and \(\psi\) is its phase. The phase perturbation \(\delta \psi\) can be written by

$$\begin{aligned} \delta \psi = \int \frac{\delta c}{c} K_p({{\textbf {r}}}_1,{{\textbf {r}}}_2,{{\textbf {r}}}_3;\omega )\textrm{d}\Sigma _3. \end{aligned}$$

where the phase sensitivity kernel \(K_p\) is the imaginary part of \(-{\bar{K}}/{\bar{\Phi }}\).

Then, the phase kernel \(K_p\) can be written by

$$\begin{aligned} K_p&= \frac{k^{3/2}}{\sqrt{2\pi }} \sqrt{\frac{r_{1}}{r_{13}r_{23}}} \left[ -\cos \left( k(r_{13}+r_{23}-r_{12})-\frac{\pi }{4}\right) \frac{B(\theta _2)+B(\theta _1)}{B(0)+B(\pi )}\right. \nonumber \\&\hspace{2.7cm}\left. +\sin \left( k(|r_{13}-r_{23}|+r_{12})+\frac{\pi }{4}\right) \frac{B(\theta _2+\pi )-B(\theta _1+\pi )}{B(0)+B(\pi )}\right] , \end{aligned}$$

where we neglect the first-order term.

Figure 16 shows an example of sensitivity kernels. The separation distance of the station pair is 150 km, and the phase velocity is 3 km/s. The source \(B(\theta )\) is assumed to be \(\cos \theta +1\).

The first term has an elliptical shape (Fig. 15b), as shown in Fig. 16a, corresponding to the finite-frequency kernel of an earthquake. The kernel along the line between the station pair (\(\theta _1=0\) and \(\theta _2=\pi\)) is identical to the finite-frequency kernel of an earthquake. The width of the kernel at the midpoint is proportional to \(\sqrt{\lambda r_{12}}\), where \(\lambda\) is the wavelength \(2\pi /k\). As a location moves away from the line between, the source heterogeneity increases the sensitivity on the \(r_2\) side.

The second term with the dependency of \(|r_{13}-r_{23}|\) exhibits a hyperbolic shape (Fig. 15b) caused by source heterogeneity (Fig. 16b), and the term oscillates rapidly outside the stationary zones behind it. The effect tends to be smoothed out when the typical spatial scale of the lateral heterogeneity is larger than the wavelength.

When the spatial scale of the source distribution is large (i.e., the azimuthal dependence of \(B(\theta )\) is smooth), the shape of the kernel is similar to the finite-frequency kernel of an earthquake (e.g., Spetzler et al. 2002; Yoshizawa and Kennett 2005). Because an approximation of the finite-frequency kernel of an earthquake works for most cases practically, most studies did not consider the source heterogeneity effect of the kernel. To check the effects, travel-time anomalies due to the source heterogeneities given by Eq. 42 can be a common criterion. In particular, because source heterogeneities cause apparent azimuthal anisotropy, the travel-time anomalies given by Eq. 42 are useful to evaluate accuracy. Although attenuation measurements are beyond the scope, the corresponding amplitude kernel is more sensitive to the source distribution, so that the source heterogeneities should be considered.

Fig. 16
figure 16

a The first term of the phase kernel \(K_p\) for \(B=\cos \theta +1\). The first term has an elliptical shape. The yellow triangles show the station location. b The second term of the phase kernel \(K_p\). The second term has a hyperbolic shape. c The whole kernel of the phase kernel \(K_p\). The background phase velocity is 3 km/s. The frequency range is from 0.1 to 0.2 Hz. We also apply a cosine taper for averaging in the frequency range

6.2 Conditions for application of surface wave tomography

Fig. 17
figure 17

Log–log plot of reported isotropic scattering coefficients against frequency. The left vertical axis shows the isotropic scattering coefficient \(g_{iso}\), and the right vertical axis shows the corresponding mean free path (\(g_{iso}^{-1}\)). Grey lines are for the lithosphere. Taken from Fig. 1 of Sato (2019)

This subsection briefly summarizes the conditions for applying the 2-D inversion method, known as phase/group velocity tomography. Surface wave tomography requires weak scattering of the surface waves during propagation because we measure phase or group velocity anomalies of the direct waves. Because the propagation distance ranges from hundreds to several thousand km, the distance should be shorter than the mean free path. Otherwise, strong scattering makes phase or group velocity measurements difficult. Figure 17 shows the mean free path as a function of the frequency compiled by Sato (2019). When we explore the mantle structure using earthquake data, we use surface waves below 0.05 Hz. The mean free path longer than 1000 km at the frequency validates the application of the method of surface wave tomography.

Fig. 18
figure 18

\(ka-kL\) diagram (Aki and Richards 1980) for a classification of wave propagation, where k is the wavenumber, a is the scale of heterogeneities, and L is the propagation distance. Red lines show propagation distances from 10 to 150 km, and blue lines show those from 150 to 2000 km. The mean free path at 0.03 Hz is 5000 km, that at 0.1 is 500 km, and that at 0.2 Hz is 30 km

To understand applicability, Fig. 18 shows a classification of wave propagation and the applicable method in \(ka-kL\) diagram (Aki and Richards 1980), where k is the wavenumber, a is the scale of heterogeneities, and L is the propagation distance. Here, we assume that the mean free path is comparable to a, although this estimate is very crude. The mean free path of northern Japan from 0.5–1Hz is estimated to be about 30 km (Hirose et al. 2020), and that of Germany is estimated at 0.2 Hz to be about 500 km (Sens-Schönfelder and Wegler 2006b). The red lines show propagation distances from 10 to 150 km, and the blue broken lines show those from 150 to 2000 km. At 0.03 Hz, the surface wave propagation at teleseismic distance beyond 2000 km is within the regime of ray theory. At 0.2 Hz, the strong scattering makes measurements difficult for propagation distances longer than 200 km. This is the reason why earthquake surface wave tomography is practically difficult above 0.1 Hz. Before applying ANT, setting up an appropriate problem based on the observed situation is important.

Let us consider the difference in ray geometry between ANT and earthquake surface wave tomography. In the case of regional-scale earthquake surface wave tomography with source-receiver paths, the target area is surrounded by earthquakes, which are limited along active tectonic areas (Fig. 19a). Because most ray paths travel across the area, the travel-time anomalies reflect the seismic structure along lines over the scale. Because the perturbations of the inferred phase/group velocity map strongly depend on the damping of the tomographic inversion, the absolute velocities tend to be ambiguous. Although the damping problem is common in ray-theoretical inversion, including ANT, an initial local 1-D model for ANT constructed from multimode dispersion measurements using subarrays (see Sect. 6.3.1) can mitigate the ambiguity.

In contrast, in the case of ANT with a dense array, the distribution of the ray path is uniform (Fig. 19b). Above 0.05 Hz, CCFs with longer separation distances (typically longer than 1000 km) become complex. Ray paths with separation distances shorter than 1000 km enable us to infer tomographic maps even in regions with strong lateral heterogeneities above 0.1 Hz. This situation is similar to earthquake tomography using the two-station method (e.g., Dziewonski and Hales 1972; Hamada and Yoshizawa 2015), which measures the phase differences between seismograms for two stations on a common great-circle path. The drawbacks of the two-station method are (1) that it requires both stations to be located close to the common great-circle path, which reduces the available ray paths, and (2) that the longer earthquake-receiver distance tends to cause the wave propagation to enter the regime of multiple scattering at a higher frequency (Fig. 18).

The shape of the sensitivity kernel is elliptical along the ray path, and the width at the midpoint is proportional to \(\sqrt{\lambda r_{12}}\), where \(\lambda\) is the wavelength. Because the lengths of ray paths of earthquake data become longer in general, the widths become wider than those of ANT. The wider width tends to average the anomalies within the fat ray, and the averaging over a long distance makes restoring the small-scale image difficult.

Fig. 19
figure 19

a A typical example of earthquake-receiver geometry for earthquakes. The gray lines show ray paths between the earthquakes and the stations. Red stars show the earthquakes, and black dots show stations. b A typical example of station geometry for ANT. The lines show ray paths between the station pairs

The advantages of ANT are the homogeneous path distribution and the shorter ray paths. These features enable us to estimate phase/group velocity maps above 0.1 Hz.

6.3 Implementations of the 2-D phase/group velocity inversions

The subsection briefly summarizes implementations of 2-D phase/group velocity inversions using phase/group velocity anomalies measured from ambient noise CCFs: (1) regionalization using subarrays, (2) the ray-theoretical inversion of ANT, and (3) finite-frequency inversion of ANT.

6.3.1 Regionalization using subarrays

If dense stations are distributed over a large area, the stations can be divided into local subarrays. Phase/group velocity maps can be inferred from phase/group velocity measurements with the subarrays, such as F-K analysis (Jiang and Denolle 2022), FJ method (Fu et al. 2022; Nimiya et al. 2023), and slant stack (Qin et al. 2022). These methods can easily incorporate information on multimodes with smaller amplitudes because array analysis enhances the SN ratios.

The inferred map is also helpful as an initial model for further inversions. Most ANT studies employ linearized inversion to obtain 2-D phase/group velocity maps (e.g., Rawlinson and Sambridge 2004) because the computational cost of fully nonlinear inversion (e.g., Markov Chain Monte Carlo (MCMC); Bodin and Sambridge 2009; Bodin et al. 2012; Galetti et al. 2016) is still high. A good initial model is crucial for better modeling for linearized inversion, because an inappropriate initial model may lead to a solution with a local minimum. An inappropriate initial model sometimes requires strong damping to stabilize the inversion. The appropriate initial model could mitigate the degradation of a tomographic image.

6.3.2 Ray-theoretical inversion of ANT

The most common method of tomographic inversion of phase/group velocity anomalies is ray-theoretical inversion with the assumption of a high-frequency limit. For example, tomo_sp_cu_s is a program (Barmin 2018) designed for estimating 2-D models from regional- or global-scale surface wave phase/group velocity measurements (Barmin et al. 2001; Ritzwoller 2002). The program also assumed a straight ray for a weakly heterogeneous medium. Another implementation to solve the eikonal equation is the fast marching method (Sethian 1996; Rawlinson and Sambridge 2004; White et al. 2020). SURFTOMO (Rawlinson 2008) is an iterative nonlinear travel-time tomography code in 2-D spherical shell coordinates. Figure 20 shows an example of wavefront tracking using the final tomographic results. Although the figure shows the concentric wavefront within the arc distance of 10\(^{\circ }\), it shows kinks in Eastern Australia, which can be explained by triplication due to the low-velocity zones. This figure also suggests that CCFs with a separation distance longer than \(10^{\circ }\) are difficult to model above 0.1 Hz without the help of a good initial model. An advantage of ANT is that we can choose ray paths that meet the conditions according to the problem setting owing to the homogeneous distribution of the ray paths.

The eikonal tomographic method (for example, Lin et al. 2009) also assumes a high-frequency limit with a single mode and solves the eikonal equation directly. The eikonal tomographic method tracks the evolution of the wavefront using only local information with dense array data. The phase velocity can be estimated by the in situ spatial gradient of the phase travel time without an inversion. The method can be applied to a more general condition of the source distribution than equal energy distribution because the method does not require CCFs to be related to the corresponding Green’s function. This method works as long as the phase travel time satisfies the eikonal equation (see the end of Sect. 2.3; Snieder et al. 2006; Lin et al. 2013; Lehujeur and Chevrot 2020). If a dense array satisfies the station spacing shorter than the wavelength, this method is feasible to infer the phase velocity map (e.g., de Ridder and Dellinger 2011; Lin et al. 2013; Ritzwoller and Feng 2019; Chen et al. 2022).

Fig. 20
figure 20

a Evolution of the wavefront in the case of strongly heterogeneous media. The station with a green star acts as the source, and the red triangles show the stations concerning the advancing wavefronts. Note the triplication effects around the low-velocity zones. b Histograms of passage time residuals for measurements at 10 s for Rayleigh waves, initially compared to a uniform reference model with a velocity of 2.8 km/s with great-circle paths, and after the inversion with 6 iterations of the subspace method with deviated paths. Taken from Fig. 4 of Saygin and Kennett (2012)

6.3.3 Finite-frequency inversion of ANT

Although the patterns of the ray-theoretical inversions are robust in different studies, the retrieved strength of phase/group velocity perturbations differs (e.g., Yoshizawa and Kennett 2004). Finite-frequency inversion is crucial to improve the amplitude recovery of velocity perturbations. As shown in Sect. 6.1, the sensitivity kernel has an elliptical shape, and the width at the midpoint is proportional to \(\sqrt{\lambda r_{12}}\). For a heterogeneous data set with multiple scales, the difference in the width of the sensitivity kernels becomes problematic, because it becomes difficult to recover the amplitude of velocity perturbations by a ray-theoretical inversion. This subsection summarizes recent developments on finite-frequency inversions.

As explained in Sect. 6.1, a finite-frequency sensitivity kernel of a CCF also depends on the source distribution. Consequently, the phase/group velocity structure has a trade-off with the source distribution (Fichtner 2015). To improve the structural image, we must infer the source distribution simultaneously (Fichtner et al. 2016; Sager et al. 2020, 2017). To mitigate the effects of source heterogeneities, measurements of differential travel times between two pairs of three stations were proposed (Liu 2020). The sensitivity kernels of the three-station method significantly reduce the effect of the heterogeneous source distribution. Because CCF amplitudes are more sensitive to source heterogeneities (Tsai 2011), the considerations of the source information will become more crucial in inferring the attenuation structure.

Because the computational cost of the sensitivity kernels for both the structure and the source distribution is still high, the earthquake sensitivity kernel for phase/group velocities is instead used practically (Gao and Shen 2015a; Yang and Gao 2018; Covellone et al. 2015; Gao and Shen 2015b; Emry et al. 2019; Yang et al. 2022; Gao and Shen 2014; Wang et al. 2019b2020; Lu et al. 2020; Crosbie et al. 2019; Janiszewski et al. 2019; Russell and Gaherty 2021; Zhao et al. 2020). The approximation becomes exact under the condition of equipartition of energy (Nishida 2011). Although the kernel does not consider the source heterogeneities, it improves the image from ray-theoretical inversion.

Another implementation of the finite-frequency effect of ANT is Helmholtz tomography (Lin and Ritzwoller 2011b; Lin et al. 2012b; Mordret et al. 2013). This method is a natural extension of eikonal tomography to consider finite-frequency effects by introducing the amplitude correction term. Although this method requires dense data, the phase velocity can be estimated without the sensitivity kernel and the inversion, as in the case of eikonal tomography. This method can be applied to a more general condition of the source distribution (see the end of Sect. 2.3).

7 3-D seismic velocity inversion using the multimode dispersion

In the fourth step of ANT (Fig. 1d), a 3-D seismic velocity structure is constructed from the inferred phase/group velocity maps in the previous step, which exhibit the dispersion curves at each grid point. The local 1-D structure at each grid point can be inverted from the dispersion curves. 3-D seismic velocity structures can then be constructed by collecting local 1-D structures. In most cases, only the dispersion of fundamental modes (Love and Rayleigh waves) is utilized for the inversion because the excitation sources are distributed on the surface. Recent studies reveal that a dense array enables us to utilize multimode dispersion. Joint inversion with the H/V or receiver function is feasible to improve the depth resolution.

7.1 1-D seismic velocity inversion at each grid

Fig. 21
figure 21

a Phase velocities of the fundamental (0th) and first-higher modes of surface waves for the modified PREM (Dziewonski and Anderson 1981). The sudden phase velocity decrease marked by the ellipse corresponds to the transition of modal energy from solid to ocean. b, c Sensitivities of phase velocities to P-wave and S-wave velocities (\(K_{\alpha }\) and \(K_{\beta }\), respectively). The two lines in each panel show the depths of the seafloor and Moho. Taken from Takeo et al. (2013)

The measured dispersion curves are inverted to obtain local 1-D structures. Inversion methods can be categorized into a Monte Carlo method (Shapiro and Ritzwoller 2002; Maraschini and Foti 2010) and a linearized method (Tarantola and Valette 1982; Tarantola 2005; Montagner 2015). Because the local 1-D inversion is identical to the earthquake data, this section briefly summarizes the related studies mentioned above.

All 1-D inversions require a code to calculate the surface wave dispersion (e.g., DISPER80, Saito 1988; senskernel-1.0, Levshin 2018; Computer Programs in Seismology, Herrmann 2013). These codes also calculate the depth sensitivity kernel, which is also useful for a Monte Carlo inversion. Figure 21 shows a typical example of depth sensitivity kernels for a model based on a reference Earth model PREM (Dziewonski and Anderson 1981), where the thickness of the ocean is changed from 3.0 to 4.6 km, and the thickness of the crust is reduced from 22 to 6 km (Takeo et al. 2013). Fundamental Love and Rayleigh waves have sensitivity at shallower depths. The higher modes are not sensitive to the S-wave structure at shallower depths, and they are more sensitive to the S-wave velocity structure at greater depths than the corresponding fundamental models. With increasing periods, the kernels have higher sensitivity at greater depths. A multimode inversion can improve depth resolution because overtones have different sensitivities from fundamental modes.

Fig. 22
figure 22

a Stations of the ocean bottom seismometers array in the region off Ibaraki. The station intervals are about 6 km. The yellow triangles show the stations deployed on October 17, 2010, and the red triangles show those deployed on February 15, 2011. b, c S-wave velocities along the A-A’ and B-B’ lines. The red, orange, and green regions reflect layer 1, layer 2, and layer 3, respectively. We interpret the red and orange regions to be sedimentary layers, the green region to be Cretaceous sediment, and the blue region to be the upper crust, respectively. The black lines show the depth of the acoustic basement (between the orange and green regions) and the top of the upper crust (between the green and blue regions). Modified from Figs. 1 and 8 of Yamaya et al. (2021)

Figure 22 shows an example of the 1-D inversion (Yamaya et al. 2021) using multimode dispersion under the ocean. The study used a computer program package SEIS_FILO (Akuhara 2022) for the inversion, which implements a transdimensional Bayesian interface and reversible-jump MCMC (rj-MCMC) (Green 1995). They assumed a uniform distribution for the prior probability distribution (1–10 for the number of layers; 2.3–10 km for depths of the discontinuities; and 2.3 km for seafloor depth and 0.1–5 km/s for S-wave velocities). P-wave velocities and densities were scaled by the empirical relation (Brocher 2005). The multimode dispersion constraints enable us to infer both sediment thickness and S-wave velocity. To demonstrate the feasibility of multimode measurements to improve depth resolution, let us compare the inversions of Fig. 22 with and without higher-mode surface wave dispersion. Figure 23d shows the inversion result using multimode dispersion, whereas Fig. 23e shows the result using only fundamental modes. These figures show that the inversion using multimode dispersion is feasible to constrain the shallow sediment structure, improving the resolution of the crustal structure. Although Rayleigh waves also have sensitivity to the P-wave velocity, the limited sensitivity only at shallow depths (Fig. 21b) makes it difficult to constrain the P-wave velocity structure practically. With an empirical relation (e.g., Brocher 2005), the P-wave velocity and the density are scaled to the S-wave velocity to reduce the number of parameters.

Many groups have already published crust and uppermost mantle models using fundamental modes, although this paper focuses on multimode ANT. Since there is no space to discuss individual models, we introduce Data Services Products: EMC, A repository of Earth models (Trabant et al. 2012; IRIS DMC 2011), which is a resource that offers access to different Earth models, along with visualization tools to preview models, facilities to extract data and metadata from the models, and access to software and scripts that others in the research community have contributed.

Fig. 23
figure 23

Comparison of 1-D S-wave velocity structure inversions using both the fundamental and the first-higher modes of Rayleigh wave (a, b, d) with that using only the fundamental modes (c, e). a Phase velocity of the fundamental mode of the Rayleigh wave. The blue points show the average phase velocities. The Markov chain Monte Carlo method calculated the posterior probabilities for 1-D average S-wave velocity structures. b Phase velocity of the first-higher mode of the Rayleigh wave. Blue points show the 1-D average phase velocities. c The phase velocity of the fundamental mode of Rayleigh waves with posterior probabilities using only fundamental modes. d S-wave velocity structure inferred by the MCMC method using both the fundamental and the first-higher modes of Rayleigh wave. The blue line shows the median velocity at each 0.1 km depth grid point. Layers 1 and 2 are well constrained despite the given loosely bounded uniform priors. e S-wave velocity structure inferred by the MCMC method using only the fundamental modes. Modified from Figs. 5 and 10 of Yamaya et al. (2021)

Seismic anisotropy gives us a clue to understanding the rheology and deformation of the Earth. In some cases, an isotropic S-wave velocity model cannot explain the observed discrepancy between the Love wave dispersion and the Rayleigh wave dispersion, which can be explained by radial anisotropy (e.g., Aki and Kaminuma 1963; Anderson 1962). Radial anisotropy in the upper mantle can be explained by lattice-preferred orientation in the low-velocity zone or partial melt under shear stresses (e.g., Montagner 2015). Radial anisotropy also exists in the lower crust, which can be explained by the lattice-preferred orientation (LPO) of anisotropic crustal minerals under extension (Moschetti et al. 2010a, b; Huang et al. 2010). ANT also revealed radial anisotropy in volcano areas (Jaxybulatov et al. 2014; Harmon and Rychert 2015; Spica et al. 2017; Lynner et al. 2018; Jiang and Denolle 2022; Jiang et al. 2023, 2018; Miller et al. 2020). The layering of a partial melt layer in the magma reservoir can explain them. Thus, radial anisotropy is feasible for discussing the stress state and the texture. Azimuthal anisotropy can also provide information on rheology and deformation (Lin et al. 2009; Yao et al. 2010; Ritzwoller and Feng 2019; Russell and Gaherty 2021): for example, the flow in the uppermost mantle related to plate motion (Takeo et al. 2016). We note that apparent anisotropy due to the lateral heterogeneities of the isotropic S-wave velocity structure is problematic (Lin and Ritzwoller 2011a; Fichtner et al. 2013). When estimating anisotropy by the tomographic method, the anisotropy has trade-offs with the lateral heterogeneities. Moreover, ANT has another difficulty that source heterogeneity can yield up to about 1% apparent azimuthal anisotropy (Harmon et al. 2010; Takeo et al. 2014, 2016), which can be estimated by Eq. 42 (equivalent to Weaver et al. (2009)).

7.2 Joint inversion with other observations

Although multimode ANT improves depth resolution, surface wave tomography has an inherent low depth resolution. Joint inversions with other geophysical observables are feasible for a better depth resolution at the 4th step.

A thin low-velocity layer near the surface is problematic. Rayleigh waves are essentially less sensitive to the S-wave velocity structure near the surface, as shown in Fig. 21. The poor sensitivity of Rayleigh waves at shallow depths makes resolving the shallow structure difficult, although the layer can still change the dispersion curves of surface waves. The polarization of the Rayleigh wave provides information on the shallow S-wave velocity structure beneath a seismic station. If the amplitude ratios between the horizontal and vertical components (ellipticity) for individual mode branches are available, the joint inversion can constrain shallower depths (Lin et al. 2012a). Even if the ratios of individual modes are difficult to measure, the spectral ratio of the horizontal to vertical (H/V) components at a station (Nakamura 1989; Sánchez-Sesma et al. 2011; Bahavar et al. 2020) can be robustly measurable. The H/V ratio allows us to constrain the shallow structure by joint inversion with an assumption of equipartition of modes (Spica et al. 2018). We note that the ellipticity also has information on the azimuthal anisotropy (Lin and Schmandt 2014) because the ellipticity also depends on the azimuth from one station to the other station.

In particular, single-mode surface wave inversion is difficult to distinguish between the smooth Vs model and the layered model because the dispersion curves of surface waves are insensitive to velocity jumps. A joint inversion with the receiver function allows us to improve the resolution of the seismic discontinuities (e.g., Bodin et al. 2012; Gao and Shen 2015b; Zhou et al. 2020; Akuhara et al. 2023; Ai et al. 2023). A joint inversion of multimode dispersion with the receiver function for earthquake data (e.g., Taira and Yoshizawa 2020) could be feasible to improve depth resolution, although not yet applied to ANT.

8 Conclusions

The first section introduces theories of SI for ANT. Although they are based on different assumptions, the resultant CCFs can be related to the Green’s function under the equipartition of energy. In a realistic situation, although the modal energy is not equally distributed among different mode branches, they work for the individual mode branches. Heterogeneities of the noise sources break the equipartition of energy even for a single-mode branch. We evaluated the apparent anomaly caused by the source heterogeneities. The important point is that the CCF still satisfies the wave equation even for a heterogeneous source distribution, leading to the robustness of the ANT results.

We summarize practical data processing on the calculation of CCFs. For calculating a good quality CCF (satisfying the theoretical assumptions), it is important to choose appropriate data processing depending on the behavior of the data and the noise. Because recent phase velocity measurement methods used waveform information for precise measurements, amplitude information should be considered by careful data selection and normalization of CCFs.

Since most conventional ANT studies utilized only the fundamental modes, it has inherited a great uncertainty regarding the depth structure. Recent developments in measurement techniques and dense observations enable us to utilize information on multimode dispersion. We show a typical inversion of multimode ANT, which consists of the following four steps. The first step is the multimode dispersion measurement of surface waves to construct local 1-D structures. The second step is to measure the phase/group velocity for each path. Multimode interference was problematic in measuring multimode dispersion because most studies implicitly assumed the dominance of the fundamental mode. The dispersion measurements by waveform fitting in a model space are feasible for the multimode case. The third step is the 2-D phase/group velocity inversions. Ray-theoretical inversion was commonly used in most cases, but the finite-frequency effect has been considered recently. To evaluate the effects of source heterogeneities on the sensitivity kernel, we show an analytic kernel with source heterogeneities in a simplified case. Helmholtz tomography is also feasible for considering finite-frequency effects and the source heterogeneities for sufficiently dense stations compared to the wavelength because CCFs for a heterogeneous source distribution still satisfy the wave equation. The fourth step is a local 1-D inversion on each grid to construct a 3-D S-wave tomographic model. The multimode inversion improves the depth resolution of the S-wave velocity structure. Joint inversions with other geophysical observations, such as the H/V ratio and receiver functions, could be feasible to improve depth resolution further.

Availability of data and materials

The Python code of the synthetic tests (Figs. 7, 9, 10 and 16) will be available on the Zenodo page


1-D, 2-D, 3-D:

One-dimensional, two-dimensional, and therein-dimensional


Ambient noise surface wave tomography


Cross-Correlation Function, which is the time-domain representation of a cross-spectrum


Centroid Moment Tensor


Fast Fourier transform

FJ method:

The frequency–Bessel method


Frequency–time analysis


A high-sensitivity seismograph network of nearly 800 stations with an average spacing of 20 km. Seismometers are installed at the bottom of boreholes at a depth of 100–3500 m to reduce noise caused by winds, ocean waves, and human activity. Observed data is digitized at each station and continuously transmitted to NIED.

H/V spectral ratio:

The ratio between the Fourier amplitude spectra of the horizontal and vertical components of microtremors.


IRIS (Incorporated Research Institutions for Seismology[1]) was a university research consortium dedicated to exploring the interior of the Earth by collecting and distributing seismographic data. On January 1, 2023, IRIS merged with UNAVCO to form the EarthScope Consortium.


Lattice-preferred orientation


Milli-Hertz (10−3 Hz)


Markov Chain Monte Carlo


Mean Squared amplitude

multimode ANT:

Ambient noise multimode surface wave tomography


New Low Noise Model (Peterson 1993)


Preliminary Reference Earth Model (Dziewonski and Anderson 1981)


Reversible jump Markov Chain Monte Carlo


Root Mean Squared amplitude


Seismic Interferometry


SPatial AutoCorrelation method


A 15-year program to place a dense network of permanent and portable seismographs across the continental USA. Seismographs record local, regional, and distant (teleseismic) earthquakes.