2. Background Theory

This section aims to provide the user with a basic review of the physics, discretization, and optimization techniques used to solve the time domain electromagnetics problem. It is assumed that the user has some background in these areas. For further reading see [Nab91].

2.1. Fundamental Physics

Maxwell’s equations provide the starting point from which an understanding of how electromagnetic fields can be used to uncover the substructure of the Earth. The time domain Maxwell’s equations are:

(2.1)\[\begin{split}\nabla \times \vec{e} &= - \partial_t \vec{b} \\ \nabla \times \vec{h} - \vec{j} &= \vec{s} \, f(t) \\ \rho \vec{j} &= \vec{e} \\ \vec{b} &= \mu \vec{h}\end{split}\]

where \(\vec{e}\), \(\vec{h}\), \(\vec{j}\) and \(\vec{b}\) are the electric field, magnetic field, current density and magnetic flux density, respectively. \(\vec{s}\) contains the geometry of the source term and \(f(t)\) is a time-dependent waveform. Symbols \(\mu\) and \(\rho\) represent the magnetic permeability and electrical resistivity, respectively. This formulation assumes the quasi-static regime, so that the system can be treated as a diffusion equation (Weaver, 1994; Ward and Hohmann, 1988 in Nabighian, 1991). Two difficulties arise when solving this system:

  • the resistivity \(\rho\) varies over several orders of magnitude

  • the fields can vary significantly near the sources but smooth out with distance; thus, high resolution is required near the sources

2.2. Discretization of Operators

The operators div, grad, and curl are discretized using a finite volume formulation. Although div and grad do not appear in (2.1), they are required for the solution of the system. The divergence operator is discretized in the usual flux-balance approach, which by Gauss’ theorem considers the current flux through each face of a cell. The nodal gradient (which operates on a function with values on the nodes) is obtained by differencing adjacent nodes and dividing by the edge length. The curl operator is discretized similarly to the divergence operator, using Stokes’ theorem: the magnetic field components are summed around the edges of each face. Please see Haber (2012) for a detailed description of the discretization process.
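To make the construction concrete, the sketch below builds the nodal gradient on a small 1D tensor mesh by differencing adjacent nodal values and dividing by the edge lengths. The node locations are hypothetical and the code is illustrative only; it is not taken from the H3DTD source.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical 1D tensor mesh: nodes need not be uniformly spaced.
nodes = np.array([0.0, 1.0, 2.5, 4.5])
h = np.diff(nodes)                               # edge lengths
n = len(nodes)

# Difference adjacent nodes, then divide by edge length: G maps nodes -> edges.
D = sp.diags([-1, 1], [0, 1], shape=(n - 1, n))  # adjacent-node differencing
G = sp.diags(1.0 / h) @ D                        # nodal gradient operator

phi = nodes**2                                   # sample nodal function phi(x) = x^2
grad = G @ phi                                   # finite-volume gradient on edges
```

The curl and divergence operators are assembled in the same spirit on a 3D mesh, with face areas and cell volumes replacing the edge lengths.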

2.3. Forward Problem

To solve the forward problem (2.1), we must first discretize in space and then in time. Using the finite volume approach, the magnetic fields on cell edges (\(\mathbf{u}\)), discretized in space, satisfy:

(2.2)\[\mathbf{C^T \, M_\rho \, C \, u} + \mathbf{M_\mu} \, \partial_t \mathbf{u} = f(t) \, \mathbf{q}\]

where \(\mathbf{C}\) is the curl operator, \(f(t)\) is a time-dependent waveform, \(\mathbf{q}\) defines the time-independent portion of the right-hand side (defined in Defining the Vector q) and

(2.3)\[\begin{split}\mathbf{M_\rho} &= \textrm{diag} \big ( \mathbf{A^T_{f2c} V} \, \boldsymbol{\rho} \big ) \\ \mathbf{M_\mu} &= \textrm{diag} \big ( \mathbf{A^T_{e2c} V} \, \boldsymbol{\mu} \big )\end{split}\]

\(\mathbf{V}\) is a diagonal matrix that contains the volume for each cell. Vectors \(\boldsymbol{\rho}\) and \(\boldsymbol{\mu}\) are vectors containing the electrical resistivity and magnetic permeability of each cell, respectively. \(\mathbf{A_{f2c}}\) averages from faces to cell centres and \(\mathbf{A_{e2c}}\) averages from edges to cell centres.

Discretization in time is performed using backward Euler. Thus for a single transmitter, we must solve the following for every time step \(\Delta t_i\):

(2.4)\[\mathbf{A_i \, u_{i}} = \mathbf{-B_i \, u_{i-1}} + \mathbf{q_i}\]

where

(2.5)\[\begin{split}\mathbf{A_i} &= \mathbf{C^T \, M_\rho \, C } + \Delta t_i^{-1} \mathbf{M_\mu} \\ \mathbf{B_i} &= - \Delta t_i^{-1} \mathbf{M_\mu} \\ \mathbf{q_i} &= f_i \, \mathbf{q}\end{split}\]

If we organize all time steps into a single system and let \(\mathbf{K} = \mathbf{C^T \, M_\rho \, C}\), the forward problem can be expressed as \(\mathbf{A \, u} = \mathbf{\tilde q}\):

(2.6)\[\begin{split}\begin{bmatrix} \mathbf{K} & & & & & \\ \mathbf{B_1} & \mathbf{A_1} & & & & \\ & \mathbf{B_2} & \mathbf{A_2} & & & \\ & & & \ddots & & \\ & & & & \mathbf{B_n} & \mathbf{A_n} \end{bmatrix} \begin{bmatrix} \mathbf{u_0} \\ \mathbf{u_1} \\ \mathbf{u_2} \\ \vdots \\ \mathbf{u_n} \end{bmatrix} = \begin{bmatrix} \mathbf{q_0} \\ \mathbf{q_1} \\ \mathbf{q_2} \\ \vdots \\ \mathbf{q_n} \end{bmatrix}\end{split}\]
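In practice the block system is never formed explicitly; the time steps are solved sequentially as in (2.4). The sketch below steps a small stand-in system through backward Euler. All matrices, the waveform samples and the step lengths are hypothetical placeholders for the operators defined above (a small identity shift is added so the stand-in for \(\mathbf{K}\) is invertible, which the true curl-curl operator is not on its own).

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical stand-ins for the discrete operators (not the H3DTD matrices).
rng = np.random.default_rng(0)
n = 30
R = sp.random(n, n, density=0.2, random_state=rng)
K = (R.T @ R + sp.eye(n)).tocsc()              # plays the role of C^T M_rho C
M_mu = sp.diags(rng.uniform(1.0, 2.0, n)).tocsc()
q = rng.standard_normal(n)                     # time-independent source vector

f = [1.0, 0.8, 0.4, 0.0]                       # waveform samples f_i (assumed)
dts = [1e-4, 1e-4, 2e-4]                       # time-step lengths Delta t_i

u = spla.spsolve(K, f[0] * q)                  # initial field: K u_0 = f_0 q
for i, dt in enumerate(dts, start=1):
    A_i = (K + (1.0 / dt) * M_mu).tocsc()      # A_i = K + dt^-1 M_mu
    rhs = (1.0 / dt) * (M_mu @ u) + f[i] * q   # -B_i u_{i-1} + f_i q
    u = spla.spsolve(A_i, rhs)                 # backward Euler step
```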

Note

This problem must be solved for each source. However, an LU factorization is computed for each unique time-step length, making the solution for many right-hand sides more efficient.
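The factorization-reuse idea in the note above can be sketched as follows: since \(\mathbf{A_i}\) depends only on \(\Delta t_i\), steps that share a step length (and every additional source) reuse the same factorization, so each extra right-hand side costs only triangular solves. The matrices below are hypothetical stand-ins.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical stand-in SPD system matrices, for illustration only.
rng = np.random.default_rng(1)
n = 40
R = sp.random(n, n, density=0.1, random_state=rng)
K = (R.T @ R + sp.eye(n)).tocsc()
M_mu = sp.diags(rng.uniform(1.0, 2.0, n)).tocsc()

dts = [1e-4, 1e-4, 2e-4, 1e-4]        # repeated step lengths share one factor
factors = {}                           # dt -> LU factorization of A_i
for dt in dts:
    if dt not in factors:
        factors[dt] = spla.splu((K + (1.0 / dt) * M_mu).tocsc())

# Each additional source (right-hand side) now costs only triangular solves:
rhs = rng.standard_normal(n)
u = factors[1e-4].solve(rhs)
```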

2.3.1. Defining the Vector q

The H3DTD package models EM responses with inductive sources (e.g. a closed loop). For these types of sources, an analytic solution exists for the magnetostatic field. We assume this is the case for \(t \leq t_0\). Let \(\mathbf{u_0}\) define the static magnetic field within the domain, discretized to cell edges. From (2.2), the time-derivative at \(t \leq t_0\) is zero, thus:

\[\mathbf{C^T \, M_\rho \, C \, u_0} = f_0 \, \mathbf{q}\]

The vector \(\mathbf{q}\) obtained from this relation is then used to construct each \(\mathbf{q_i}\) defined in (2.5).

2.3.2. Modeling with Non-Zero Susceptibility

Modeling non-zero susceptibility is possible using the H3DTD package. However, with our choice of discretization, the initial magnetostatic problem is very difficult to solve for a non-zero transmitter current. To model with non-zero susceptibility, you must discretize the entire transmitter waveform, and the waveform must start with a current amplitude of 0 at \(t=t_0\).

2.4. Data

We have the magnetic field on cell edges at all time steps. Let \(Q\) be a linear operator that averages the magnetic fields from cell edges to cell centers and then interpolates each Cartesian component to the locations of the receivers. Where

\[t_i = t_0 + \sum_{k=0}^i \Delta t_k\]

the Cartesian components of the magnetic field at the receivers at all time steps are:

(2.7)\[\mathbf{h_i} = Q \, \mathbf{ u_i}\]

and the time-derivative of the magnetic flux is:

(2.8)\[\frac{\partial \mathbf{b_i}}{\partial t} = - \mu_0 \Bigg [ \Bigg ( \frac{t_{i+1}-t_i}{t_{i+1} - t_{i-1}} \Bigg ) \Bigg ( \frac{\mathbf{h_i} - \mathbf{h_{i-1}}}{t_i - t_{i-1}} \Bigg ) + \Bigg ( \frac{t_i - t_{i-1}}{t_{i+1} - t_{i-1}} \Bigg ) \Bigg ( \frac{\mathbf{h_{i+1}} - \mathbf{h_{i}}}{t_{i+1} - t_i} \Bigg ) \Bigg ]\]

Once the field and its time-derivative have been computed at the receivers for every time step, linear interpolation is carried out to compute the fields at the correct time channels. Let \(\mathbf{Q}\) be a block-diagonal matrix of \(Q\) that takes the magnetic fields from edges to the receivers at all times, let \(\mathbf{P}\) perform the operation in (2.8), let \(\mathbf{I}\) be an identity matrix, and let \(\mathbf{L_1}\) and \(\mathbf{L_2}\) perform the linear interpolation to the correct time channels for the magnetic field and its time-derivative, respectively. The predicted data are then given by:

(2.9)\[\begin{split}\mathbf{d} = \begin{bmatrix} \mathbf{L_1} & \\ & \mathbf{L_2} \end{bmatrix} \begin{bmatrix} \mathbf{I} \\ \mathbf{P} \end{bmatrix} \mathbf{Q \, u} = \mathbf{Q_t \, u}\end{split}\]

Here, \(\mathbf{Q_t}\) represents the operator that projects the magnetic fields on cell edges to the data, and \(\mathbf{u}\) is a vector containing the magnetic fields on cell edges at all time steps (see (2.6)).
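The time-weighted difference in (2.8) is simply a convex combination of the backward and forward difference quotients; for non-uniform steps it is exact for any quadratic in time. A minimal sketch (hypothetical times and a scalar field component):

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space

def dbdt(h, t, i):
    """Time-weighted average of backward and forward differences (Eq. 2.8)."""
    w_back = (t[i + 1] - t[i]) / (t[i + 1] - t[i - 1])
    w_fwd = (t[i] - t[i - 1]) / (t[i + 1] - t[i - 1])
    back = (h[i] - h[i - 1]) / (t[i] - t[i - 1])
    fwd = (h[i + 1] - h[i]) / (t[i + 1] - t[i])
    return -MU0 * (w_back * back + w_fwd * fwd)

t = np.array([1e-4, 3e-4, 7e-4])   # non-uniform time steps (hypothetical)
h = t**2                           # for h = t^2 the scheme is exact: dh/dt = 2t
val = dbdt(h, t, 1)
```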

2.5. Sensitivities

To solve the inverse problem, we will need to compute the sensitivity matrix. Differentiating the data with respect to the resistivities gives:

\[\mathbf{J} = \frac{\partial \mathbf{d}}{\partial \boldsymbol{\rho}} = - \mathbf{Q_t \, A^{-1} \, G}\]

where \(\mathbf{A}\) is the full system defined in (2.6), \(\mathbf{Q_t}\) is defined in (2.9) and

\[\mathbf{G} = \mathbf{C^T} \, \textrm{diag} \big ( \mathbf{C} \, (\mathbf{u - \tilde u_0} ) \big ) \, \mathbf{A_{f2c}^T} \, \textrm{diag} \big ( \mathbf{V} \,\boldsymbol{\rho} \big )\]

where

\[\begin{split}\mathbf{\tilde u_0} = f_0^{-1} \begin{bmatrix} f_0 \mathbf{u_0} \\ f_1 \mathbf{u_0} \\ \vdots \\ f_n \mathbf{u_0} \end{bmatrix}\end{split}\]

2.6. Inverse Problem

We are interested in recovering the conductivity distribution of the Earth. However, the numerical stability of the inverse problem is made more challenging by the fact that rock conductivities/resistivities can span many orders of magnitude. To deal with this, we define the model as the log-resistivity for each cell, e.g.:

\[\mathbf{m} = \log (\boldsymbol{\rho}) = \log (\boldsymbol{\sigma}^{-1})\]
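As a minimal sketch of this change of variables (hypothetical resistivity values), note that the mapping is trivially invertible and, by the chain rule, \(\partial \boldsymbol{\rho} / \partial \mathbf{m} = \boldsymbol{\rho}\), which is convenient when converting sensitivities with respect to \(\boldsymbol{\rho}\) into sensitivities with respect to \(\mathbf{m}\):

```python
import numpy as np

rho = np.array([1e-1, 1e2, 1e4])   # resistivities spanning orders of magnitude
m = np.log(rho)                    # model: log-resistivity
sigma = np.exp(-m)                 # conductivity sigma = 1 / rho
rho_back = np.exp(m)               # inverse mapping recovers rho
drho_dm = np.exp(m)                # chain rule: d(rho)/d(m) = rho
```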

The inverse problem is solved by minimizing the following global objective function with respect to the model:

(2.10)\[\phi (\mathbf{m}) = \phi_d (\mathbf{m}) + \beta \phi_m (\mathbf{m})\]

where \(\phi_d\) is the data misfit, \(\phi_m\) is the model objective function and \(\beta\) is the trade-off parameter. The data misfit ensures the recovered model adequately explains the set of field observations. The model objective function adds geological constraints to the recovered model. The trade-off parameter weights the relative emphasis between fitting the data and imposing geological structures.

Note

Although the code defines the electrical properties of the Earth internally in terms of the electrical resistivity, the models imported and exported by the code are electrical conductivity models.

2.6.1. Data Misfit

Here, the data misfit is represented as the L2-norm of a weighted residual between the observed data (\(\mathbf{d_{obs}}\)) and the predicted data for a given resistivity model \(\boldsymbol{\rho}\), i.e.:

(2.11)\[\phi_d = \frac{1}{2} \big \| \mathbf{W_d} \big ( \mathbf{d_{obs}} - \mathbb{F}[\boldsymbol{\rho}] \big ) \big \|^2\]

where \(\mathbf{W_d}\) is a diagonal matrix containing the reciprocals of the uncertainties \(\boldsymbol{\varepsilon}\) for each measured data point, i.e.:

\[\mathbf{W_d} = \textrm{diag} \big [ \boldsymbol{\varepsilon}^{-1} \big ]\]
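Evaluating (2.11) is a one-liner once \(\mathbf{W_d}\) is formed. The data and uncertainties below are hypothetical numbers chosen only to illustrate the computation:

```python
import numpy as np

d_obs = np.array([1.2, -0.4, 0.9])    # observed data (hypothetical)
d_pred = np.array([1.0, -0.5, 1.1])   # predicted data F[rho] (hypothetical)
eps = np.array([0.1, 0.1, 0.2])       # uncertainty on each datum

Wd = np.diag(1.0 / eps)               # W_d = diag(1 / eps)
phi_d = 0.5 * np.linalg.norm(Wd @ (d_obs - d_pred)) ** 2
```

With these numbers the weighted residuals are (2, 1, -1), so \(\phi_d = 3\); a datum fit to within its uncertainty contributes roughly 1/2 to the misfit.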

Important

For a better understanding of the data misfit, see the GIFtools cookbook.

2.6.2. Model Objective Function

Due to the ill-posedness of the problem, no stable solution is obtained by freely minimizing the data misfit, and thus regularization is needed. The regularization penalizes both roughness and deviation from a reference model \(m_{ref}\) supplied by the user. The model objective function is given by:

(2.12)\[\begin{split}\phi_m = \frac{\alpha_s}{2} \!\int_\Omega w_s | m - & m_{ref} |^2 dV + \frac{\alpha_x}{2} \!\int_\Omega w_x \Bigg | \frac{\partial}{\partial x} \big (m - m_{ref} \big ) \Bigg |^2 dV \\ &+ \frac{\alpha_y}{2} \!\int_\Omega w_y \Bigg | \frac{\partial}{\partial y} \big (m - m_{ref} \big ) \Bigg |^2 dV + \frac{\alpha_z}{2} \!\int_\Omega w_z \Bigg | \frac{\partial}{\partial z} \big (m - m_{ref} \big ) \Bigg |^2 dV\end{split}\]

where \(\alpha_s, \alpha_x, \alpha_y\) and \(\alpha_z\) weight the relative emphasis on minimizing differences from the reference model and on the smoothness along each gradient direction, and \(w_s, w_x, w_y\) and \(w_z\) are additional user-defined weighting functions.

An important consideration arises when discretizing the regularization onto the mesh. The gradient operates on cell-centered variables in this instance. Applying a short-distance approximation is second-order accurate on a domain with uniform cells, but only \(\mathcal{O}(1)\) accurate where cells are non-uniform. To rectify this, a higher-order approximation is used (Haber, 2012). The second-order approximation of the model objective function can be expressed as:

\[\phi_m (\mathbf{m}) = \mathbf{\big (m-m_{ref} \big )^T W^T W \big (m-m_{ref} \big )}\]

where the regularizer is given by:

(2.13)\[\begin{split}\mathbf{W^T W} =& \;\;\;\;\alpha_s \textrm{diag} (\mathbf{w_s \odot v}) \\ & + \alpha_x \mathbf{G_x^T} \textrm{diag} (\mathbf{w_x \odot v_x}) \mathbf{G_x} \\ & + \alpha_y \mathbf{G_y^T} \textrm{diag} (\mathbf{w_y \odot v_y}) \mathbf{G_y} \\ & + \alpha_z \mathbf{G_z^T} \textrm{diag} (\mathbf{w_z \odot v_z}) \mathbf{G_z}\end{split}\]

The Hadamard product is given by \(\odot\), \(\mathbf{v_x}\) is the volume of each cell averaged to x-faces, \(\mathbf{w_x}\) is the weighting function \(w_x\) evaluated on x-faces and \(\mathbf{G_x}\) computes the x-component of the gradient from cell centers to cell faces. Similarly for y and z.
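A 1D analogue of (2.13) makes the assembly pattern clear: a smallness term weighted by cell volumes plus one smoothness term built from a center-to-face gradient. The mesh, weights and model values below are hypothetical, and only the x-term is shown; the y and z terms follow identically.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical 1D tensor mesh: 4 cells of widths h.
h = np.array([1.0, 1.0, 2.0, 2.0])
n = len(h)
v = h.copy()                                  # cell "volumes" in 1D
vx = 0.5 * (h[:-1] + h[1:])                   # volumes averaged to interior faces
d = 0.5 * (h[:-1] + h[1:])                    # center-to-center distances
Gx = sp.diags(1.0 / d) @ sp.diags([-1, 1], [0, 1], shape=(n - 1, n))

ws = np.ones(n)                               # user weights (all ones here)
wx = np.ones(n - 1)
alpha_s, alpha_x = 1e-4, 1.0                  # assumed alpha values

# W^T W = alpha_s diag(ws . v) + alpha_x Gx^T diag(wx . vx) Gx
WtW = alpha_s * sp.diags(ws * v) + alpha_x * Gx.T @ sp.diags(wx * vx) @ Gx

m = np.log(np.array([100.0, 100.0, 10.0, 10.0]))   # model with one jump
m_ref = np.full(n, np.log(100.0))                  # reference model
phi_m = (m - m_ref) @ (WtW @ (m - m_ref))
```

Note that \(\mathbf{W^T W}\) is symmetric positive semi-definite by construction, and the smoothness term penalizes only the single jump in this model.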

If we require that the recovered model values lie within the bounds \(\mathbf{m_L \preceq m \preceq m_H}\), the resulting bounded optimization problem we must solve is:

(2.14)\[\begin{split}&\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \preceq m \preceq m_H}\end{split}\]

A simple Gauss-Newton optimization method is used, in which the system of equations for each Gauss-Newton step is solved using incomplete-preconditioned conjugate gradients (IPCG). For more information, refer to (Haber, 2012) and references therein.

2.6.3. Inversion Parameters and Tolerances

2.6.3.1. Cooling Schedule

Our goal is to solve Eq. (2.14), i.e.:

\[\begin{split}&\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m - m_{ref}}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \leq m \leq m_H}\end{split}\]

but how do we choose an acceptable trade-off parameter \(\beta\)? For this, we use a cooling schedule, as described in the GIFtools cookbook. The cooling schedule is defined using the following parameters:

  • beta_max: The initial value for \(\beta\)

  • beta_factor: The factor by which \(\beta\) is decreased between successive solutions of Eq. (2.14)

  • beta_min: The minimum \(\beta\) for which Eq. (2.14) is solved before the inversion quits

  • Chi Factor: The inversion program stops when the data misfit \(\phi_d \leq N \times Chi \; Factor\), where \(N\) is the number of data observations
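The interaction of these four parameters can be sketched as a simple loop. The function below is hypothetical (it is not the program's implementation), and `misfit_at` is a stub standing in for solving Eq. (2.14) at a fixed \(\beta\):

```python
def cooling_schedule(misfit_at, N, beta_max=1e3, beta_factor=5.0,
                     beta_min=1e-3, chi_factor=1.0):
    """Decrease beta until the target misfit N * chi_factor is reached."""
    beta = beta_max
    while beta >= beta_min:
        phi_d = misfit_at(beta)          # stands in for solving Eq. (2.14)
        if phi_d <= N * chi_factor:      # Chi Factor stopping criterion
            return beta, phi_d
        beta /= beta_factor              # cool beta by beta_factor
    return beta * beta_factor, phi_d     # quit: beta fell below beta_min

# Toy misfit model in which phi_d grows with beta (hypothetical):
beta, phi_d = cooling_schedule(lambda b: 10 * (0.5 + b), N=10)
```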

2.6.3.2. Gauss-Newton Update

For a given trade-off parameter (\(\beta\)), the model \(\mathbf{m}\) is updated using the Gauss-Newton approach. Because the problem is non-linear, several model updates may need to be completed for each \(\beta\). Where \(k\) denotes the Gauss-Newton iteration, we solve:

(2.15)\[\mathbf{H}_k \, \mathbf{\delta m}_k = - \nabla \phi_k\]

using the current model \(\mathbf{m}_k\) and update the model according to:

(2.16)\[\mathbf{m}_{k+1} = \mathbf{m}_{k} + \alpha \mathbf{\delta m}_k\]

where \(\mathbf{\delta m}_k\) is the step direction, \(\nabla \phi_k\) is the gradient of the global objective function, \(\mathbf{H}_k\) is an approximation of the Hessian and \(\alpha\) is a scaling constant. This process is repeated until the maximum number of Gauss-Newton iterations has been performed, i.e.

\[k = iter \_ per \_ beta\]
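The update cycle (2.15)-(2.16) can be sketched on a toy scalar problem. The forward model \(F[m] = m^3\) and all numbers below are hypothetical; the point is only the structure of the loop (residual, sensitivity, Hessian approximation, step, update):

```python
# Toy scalar problem: fit d_obs = m**3 with Gauss-Newton (illustrative only).
def gauss_newton(m0, d_obs, iter_per_beta=5, alpha=1.0):
    m = m0
    for k in range(iter_per_beta):   # fixed number of GN iterations per beta
        r = m**3 - d_obs             # residual F[m] - d_obs
        J = 3.0 * m**2               # sensitivity dF/dm
        H = J * J                    # Gauss-Newton Hessian approximation J^T J
        dm = -(J * r) / H            # solve H dm = -gradient (Eq. 2.15)
        m = m + alpha * dm           # model update (Eq. 2.16)
    return m

m = gauss_newton(1.0, 8.0)           # converges toward the true model m = 2
```

In the real code the scalar division is replaced by the IPCG solve of the next subsection, and \(\alpha\) comes from a line search rather than being fixed.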

2.6.3.3. Gauss-Newton Solve

Here we discuss the details of solving Eq. (2.15) for a particular Gauss-Newton iteration \(k\). Using the data misfit from Eq. (2.11) and the model objective function from Eq. (2.13) , we must solve:

(2.17)\[\Big [ \mathbf{J^T W_d^T W_d J} + \beta \mathbf{W^T W} \Big ] \mathbf{\delta m}_k = - \Big [ \mathbf{J^T W_d^T W_d } \big ( \mathbb{F}[\mathbf{m}_k] - \mathbf{d_{obs}} \big ) + \beta \mathbf{W^T W} \big ( \mathbf{m}_k - \mathbf{m_{ref}} \big ) \Big ]\]

where \(\mathbf{J}\) is the sensitivity of the data to the current model \(\mathbf{m}_k\). The system is solved for \(\mathbf{\delta m}_k\) using the incomplete-preconditioned-conjugate gradient (IPCG) method. This method is iterative and exits with an approximation for \(\mathbf{\delta m}_k\). Let \(i\) denote an IPCG iteration and let \(\mathbf{\delta m}_k^{(i)}\) be the solution to (2.17) at the \(i^{th}\) IPCG iteration, then the algorithm quits when:

  1. the system is solved to within some tolerance and additional iterations do not result in significant increases in solution accuracy, i.e.:

\[\| \mathbf{\delta m}_k^{(i-1)} - \mathbf{\delta m}_k^{(i)} \|^2 / \| \mathbf{\delta m}_k^{(i-1)} \|^2 < tol \_ ipcg\]

  2. a maximum allowable number of IPCG iterations has been completed, i.e.:

\[i = max \_ iter \_ ipcg\]
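The two exit rules above can be sketched in a minimal preconditioned CG loop. This is an illustrative implementation, not the program's IPCG solver, and the small SPD test system and Jacobi preconditioner are assumptions:

```python
import numpy as np

def ipcg(A, b, Minv, tol_ipcg=1e-10, max_iter_ipcg=100):
    """Preconditioned CG with the two exit rules described above (a sketch)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = Minv @ r
    p = z.copy()
    for i in range(max_iter_ipcg):            # rule 2: iteration cap
        x_prev = x.copy()
        Ap = A @ p
        alpha = (r @ z) / (p @ Ap)
        x = x + alpha * p
        r, rz_old = r - alpha * Ap, r @ z     # update residual, keep old r.z
        z = Minv @ r
        p = z + ((r @ z) / rz_old) * p
        # rule 1: squared relative change in the solution below tolerance
        nrm = np.linalg.norm(x_prev)
        if nrm > 0 and np.linalg.norm(x - x_prev) ** 2 / nrm ** 2 < tol_ipcg:
            break
    return x, i + 1

# Small SPD test system with a Jacobi (diagonal) preconditioner:
rng = np.random.default_rng(2)
M = rng.standard_normal((10, 10))
A = M @ M.T + 10 * np.eye(10)                 # well-conditioned SPD matrix
b = rng.standard_normal(10)
x, iters = ipcg(A, b, np.diag(1.0 / np.diag(A)))
```

In the actual inversion the preconditioner is an incomplete factorization rather than the diagonal used here.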