ARKode solves ODE initial value problems (IVPs) in \(\mathbb{R}^N\). These problems should be posed in explicit form, as

(1)\[M\dot{y} = f_E(t,y) + f_I(t,y), \qquad y(t_0) = y_0.\]

Here, \(t\) is the independent variable (e.g. time), and the dependent variables are given by \(y \in \mathbb{R}^N\), where we use the notation \(\dot{y}\) to denote \(\frac{dy}{dt}\).

\(M\) is a user-specified nonsingular operator from \(\mathbb{R}^N \to \mathbb{R}^N\). This operator may depend on \(t\) but is currently assumed to be independent of \(y\). For standard systems of ordinary differential equations and for problems arising from the spatial semi-discretization of partial differential equations using finite difference or finite volume methods, \(M\) is typically the identity matrix, \(I\). For PDEs using a finite-element spatial semi-discretization \(M\) is typically a well-conditioned mass matrix.

The two right-hand side functions may be described as:

- \(f_E(t,y)\) contains the “slow” time scale components of the system. This will be integrated using explicit methods.
- \(f_I(t,y)\) contains the “fast” time scale components of the system. This will be integrated using implicit methods.

ARKode may be used to solve stiff, nonstiff and mixed stiff/nonstiff problems. Roughly speaking, stiffness is characterized by the presence of at least one rapidly damped mode, whose time constant is small compared to the time scale of the solution itself. In the implicit/explicit (ImEx) splitting above, these stiff components should be included in the right-hand side function \(f_I(t,y)\).

In the sub-sections that follow, we elaborate on the numerical methods that comprise the ARKode solvers. We first discuss the general formulation of additive Runge-Kutta methods, including the resulting implicit systems that must be solved at each stage. We then discuss the solver strategies that ARKode uses in solving these systems: nonlinear solvers, linear solvers and preconditioners. We then describe our approaches for error control within the iterative nonlinear and linear solvers, including discussion on our choice of norms used within ARKode for measuring errors within various components of the solver. We then discuss specific enhancements available in ARKode, including an array of prediction algorithms for the solution at each stage, adaptive error controllers, mass-matrix handling, and rootfinding capabilities.

The methods used in ARKode are variable-step, embedded, additive Runge-Kutta methods (ARK), based on formulas of the form

(2)\[\begin{split}M z_i &= M y_{n-1} + h_n \sum_{j=1}^{i-1} A^E_{i,j} f_E(t^E_{n,j}, z_j)
+ h_n \sum_{j=1}^{i} A^I_{i,j} f_I(t^I_{n,j}, z_j),
\quad i=1,\ldots,s, \\
M y_n &= M y_{n-1} + h_n \sum_{i=1}^{s} \left(b^E_i f_E(t^E_{n,i}, z_i)
+ b^I_i f_I(t^I_{n,i}, z_i)\right), \\
M \tilde{y}_n &= M y_{n-1} + h_n \sum_{i=1}^{s} \left(
\tilde{b}^E_i f_E(t^E_{n,i}, z_i) +
\tilde{b}^I_i f_I(t^I_{n,i}, z_i)\right).\end{split}\]

Here the \(y_n\) are computed approximations to \(y(t_n)\),
\(\tilde{y}_n\) are [typically] lower-order embedded solutions
(used in error estimation), and \(h_n \equiv t_n - t_{n-1}\) is
the step size. The internal stage times are abbreviated using the
notation \(t^E_{n,j} = t_{n-1} + c^E_j h_n\) and \(t^I_{n,j}
= t_{n-1} + c^I_j h_n\). The ARK method is primarily defined through
the coefficients \(A^E \in \mathbb{R}^{s\times s}\),
\(A^I \in \mathbb{R}^{s\times s}\), \(b^E \in \mathbb{R}^{s}\),
\(b^I \in \mathbb{R}^{s}\), \(c^E \in \mathbb{R}^{s}\) and
\(c^I \in \mathbb{R}^{s}\), that correspond with the explicit and
implicit Butcher tables. Additional coefficients
\(\tilde{b}^E \in \mathbb{R}^{s}\) and
\(\tilde{b}^I \in \mathbb{R}^{s}\) may be used to enable an
*embedded solution* that is used to estimate error for adaptive
time-stepping. We note that ARKode currently enforces the constraint
that these tables must share the same number of stages \(s\)
between the explicit and implicit methods in an ARK pair.

The user of ARKode must choose appropriately between one of three
classes of methods: *ImEx*, *explicit* and *implicit*. All of
ARKode’s available Butcher tables encoding the coefficients
\(c^E\), \(c^I\), \(A^E\), \(A^I\), \(b^E\),
\(b^I\), \(\tilde{b}^E\) and \(\tilde{b}^I\) are further
described in the Appendix: Butcher tables.

For mixed stiff/nonstiff problems, a user should provide both of the functions \(f_E\) and \(f_I\) that define the IVP system. For such problems, ARKode currently implements the ARK methods proposed in [KC2003], allowing for methods having order \(q = \{3,4,5\}\). The tables for these methods are given in the section Additive Butcher tables.

For nonstiff problems, a user may specify that \(f_I = 0\), i.e. the equation (1) reduces to the non-split IVP

(3)\[M\dot{y} = f_E(t,y), \qquad y(t_0) = y_0.\]

In this scenario, the coefficients \(A^I=0\), \(c^I=0\), \(b^I=0\) and \(\tilde{b}^I=0\) in (2), and the ARK methods reduce to classical explicit Runge-Kutta methods (ERK). For these classes of methods, ARKode allows orders of accuracy \(q = \{2,3,4,5,6,8\}\), with embeddings of orders \(p = \{1,2,3,4,5,7\}\). These default to the Heun-Euler-2-1-2, Bogacki-Shampine-4-2-3, Zonneveld-5-3-4, Cash-Karp-6-4-5, Verner-8-5-6 and Fehlberg-13-7-8 methods, respectively.

Finally, for stiff problems the user may specify that \(f_E = 0\), so the equation (1) reduces to the non-split IVP

(4)\[M\dot{y} = f_I(t,y), \qquad y(t_0) = y_0.\]

Similarly to ERK methods, in this scenario the coefficients \(A^E=0\), \(c^E=0\), \(b^E=0\) and \(\tilde{b}^E=0\) in (2), and the ARK methods reduce to classical diagonally-implicit Runge-Kutta methods (DIRK). For these classes of methods, ARKode allows orders of accuracy \(q = \{2,3,4,5\}\), with embeddings of orders \(p = \{1,2,3,4\}\). These default to the SDIRK-2-1-2, ARK-4-2-3 (implicit), SDIRK-5-3-4 and ARK-8-4-5 (implicit) methods, respectively.

For both the DIRK and ARK methods corresponding to (1) and (4), an implicit system

(5)\[G(z_i) \equiv M z_i - h_n A^I_{i,i} f_I(t^I_{n,i}, z_i) - a_i = 0\]

must be solved for each stage \(z_i, i=1,\ldots,s\), where we have the data

\[a_i \equiv M y_{n-1} + h_n \sum_{j=1}^{i-1} \left[
A^E_{i,j} f_E(t^E_{n,j}, z_j) +
A^I_{i,j} f_I(t^I_{n,j}, z_j) \right]\]

for the ARK methods, or

\[a_i \equiv M y_{n-1} + h_n \sum_{j=1}^{i-1}
A^I_{i,j} f_I(t^I_{n,j}, z_j)\]

for the DIRK methods. Here, if \(f_I(t,y)\) depends nonlinearly on \(y\) then (5) corresponds to a nonlinear system of equations; if \(f_I(t,y)\) depends linearly on \(y\) then this is a linear system of equations.

For systems of either type, ARKode allows a choice of solution strategy. The default solver choice is a variant of Newton’s method,

(6)\[z_i^{(m+1)} = z_i^{(m)} + \delta^{(m+1)},\]

where \(m\) is the Newton iteration index, and the Newton update \(\delta^{(m+1)}\) in turn requires the solution of the linear Newton system

(7)\[{\mathcal A}\left(z_i^{(m)}\right) \delta^{(m+1)} = -G\left(z_i^{(m)}\right),\]

in which

(8)\[{\mathcal A} \approx M - \gamma J, \quad J = \frac{\partial f_I}{\partial y},
\quad\text{and}\quad \gamma = h_n A^I_{i,i}.\]

As an alternate to Newton’s method, ARKode may solve for each stage \(z_i, i=1,\ldots,s\) using an Anderson-accelerated fixed point iteration

(9)\[z_i^{(m+1)} = g(z_i^{(m)}), \quad m=0,1,\ldots\]

Unlike with Newton’s method, this method *does not* require the
solution of a linear system at each iteration, instead opting for
solution of a low-dimensional least-squares solution to construct the
nonlinear update. For details on how this iteration is performed, we
refer the reader to the reference [WN2011].

Finally, if the user specifies that \(f_I(t,y)\) depends linearly
on \(y\) (via a call to `ARKodeSetLinear()`

in C/C++, or
the *LINEAR* argument to `FARKSETIIN()`

in Fortran), and if
the Newton-based nonlinear solver is chosen, then the problem
(5) will be solved using only a single Newton iteration.
In this case, an additional argument to the respective function
denotes whether this Jacobian is time-dependent or not, indicating
whether the Jacobian or preconditioner needs to be recomputed at each
step.

The optimal solver (Newton vs fixed-point) is highly problem-dependent. Since fixed-point solvers do not require the solution of any linear systems, each iteration may be significantly less costly than their Newton counterparts. However, this can come at the cost of slower convergence (or even divergence) in comparison with Newton-like methods. However, these fixed-point solvers do allow for user specification of the Anderson-accelerated subspace size, \(m_k\). While the required amount of solver memory grows proportionately to \(m_k N\), larger values of \(m_k\) may result in faster convergence. In our experience, this improvement may be significant even for “small” values, e.g. \(1\le m_k\le 5\), and that convergence may not improve (or even deteriorate) for larger values of \(m_k\).

While ARKode uses a Newton-based iteration as its default solver due to its increased robustness on very stiff problems, it is highly recommended that users also consider the fixed-point solver for their when attempting a new problem.

For either the Newton or fixed-point solvers, it is well-known that both the efficiency and robustness of the algorithm intimately depends on the choice of a good initial guess. In ARKode, the initial guess for either nonlinear solution method is a predicted value \(z_i^{(0)}\) that is computed explicitly from the previously-computed data (e.g. \(y_{n-2}\), \(y_{n-1}\), and \(z_j\) where \(j<i\)). Additional information on the specific predictor algorithms implemented in ARKode is provided in the following section, Implicit predictors.

When a Newton-based method is chosen for solving each nonlinear
system, a linear system of equations must be solved at each nonlinear
iteration. For this solve ARKode provides several choices, including
the option of a user-supplied linear solver module. The linear solver
modules distributed with SUNDIALS are organized into two families: a
*direct* family comprising direct linear solvers for dense, banded or
sparse matrices, and a *spils* family comprising scaled, preconditioned,
iterative (Krylov) linear solvers. The methods offered through these
modules are as follows:

- dense direct solvers, using either an internal SUNDIALS implementation or a BLAS/LAPACK implementation (serial version only),
- band direct solvers, using either an internal SUNDIALS implementation or a BLAS/LAPACK implementation (serial version only),
- sparse direct solvers, using either the KLU sparse matrix library [KLU], or the OpenMP or PThreads-enabled SuperLU_MT sparse matrix library [SuperLUMT] (serial or threaded vector modules only) [Note that users will need to download and install the KLU or SuperLU_MT packages independent of ARKode],
- SPGMR, a scaled, preconditioned GMRES (Generalized Minimal Residual) solver,
- SPFGMR, a scaled, preconditioned Flexible GMRES (Generalized Minimal Residual) solver,
- SPBCGS, a scaled, preconditioned Bi-CGStab (Bi-Conjugate Gradient Stable) solver,
- SPTFQMR, a scaled, preconditioned TFQMR (Transpose-free Quasi-Minimal Residual) solver, or
- PCG, a preconditioned CG (Conjugate Gradient method) solver for symmetric linear systems.

For large stiff systems where direct methods are infeasible, the combination of an implicit integrator and a preconditioned Krylov method can yield a powerful tool because it combines established methods for stiff integration, nonlinear solver iteration, and Krylov (linear) iteration with a problem-specific treatment of the dominant sources of stiffness, in the form of a user-supplied preconditioner matrix [BH1989]. We note that the direct linear solvers provided by SUNDIALS (dense, band and sparse), as well as the direct linear solvers accessible through LAPACK, can only be used with the serial and threaded vector representations.

In the case that a direct linear solver is used, ARKode utilizes
either a Newton or a *modified Newton iteration*. The difference
between these is that in a modified Newton method, the matrix
\({\mathcal A}\) is held fixed for multiple Newton iterations.
More precisely, each Newton iteration is computed from the modified
equation

(10)\[\tilde{\mathcal A}\left(z_i^{(m)}\right) \delta^{(m+1)} = -G\left(z_i^{(m)}\right),\]

in which

(11)\[\tilde{\mathcal A} \approx M - \tilde{\gamma} \tilde{J}, \quad \tilde{J} =
\frac{\partial f_I}{\partial y}(\tilde y), \quad\text{and}\quad
\tilde{\gamma} = \tilde{h} A^I_{i,i}.\]

Here, the solution \(\tilde{y}\) and step size \(\tilde{h}\)
upon which the modified Jacobian rely, are merely values of the
solution and step size from a previous iteration. In other words, the
matrix \(\tilde{\mathcal A}\) is only computed rarely, and reused for
repeated stage solves. The frequency at which \(\tilde{\mathcal
A}\) is recomputed, and hence the choice between normal and modified
Newton iterations, is determined by the input parameter *msbp* to the
input function `ARKodeSetMaxStepsBetweenLSet()`

in C/C++, or
with the *LSETUP_MSBP* argument to `FARKSETIIN()`

in Fortran.

When using the direct and band solvers for the linear systems (10), the Jacobian may be supplied by a user routine or approximated by finite-differences. In the case of differencing, we use the standard approximation

\[J_{i,j}(t,y) = \frac{f_{I,i}(t,y+\sigma_j e_j) - f_{I,i}(t,y)}{\sigma_j},\]

where \(e_j\) is the jth unit vector, and the increments \(\sigma_j\) are given by

\[\sigma_j = \max\left\{ \sqrt{U}\, |y_j|, \frac{\sigma_0}{w_j} \right\}.\]

Here \(U\) is the unit roundoff, \(\sigma_0\) is a dimensionless value, and \(w_j\) is the error weight defined in (13). In the dense case, this approach requires \(N\) evaluations of \(f_I\), one for each column of \(J\). In the band case, the columns of \(J\) are computed in groups, using the Curtis-Powell-Reid algorithm, with the number of \(f_I\) evaluations equal to the bandwidth.

We note that with the sparse direct solvers, the Jacobian *must*
be supplied by a user routine.

In the case that an iterative linear solver is chosen, ARKode utilizes a
Newton method variant called an *Inexact Newton iteration*. Here, the
matrix \({\mathcal A}\) is not itself constructed since the
algorithms only require the product of this matrix with a given
vector. Additionally, each Newton system (7) is not
solved completely, since these linear solvers are iterative (hence the
“inexact” in the name). Resultingly. for these linear solvers
\({\mathcal A}\) is applied in a matrix-free manner,

\[{\mathcal A}v = Mv - \gamma Jv.\]

The matrix-vector products \(Jv\) are obtained by either calling an optional user-supplied routine, or through directional differencing using the formula

\[Jv = \frac{f_I(t,y+\sigma v) - f_I(t,y)}{\sigma},\]

where the increment \(\sigma = 1/\|v\|\) to ensure that \(\|\sigma v\| = 1\).

As with the modified Newton method that reused \({\mathcal A}\) between solves, ARKode’s inexact Newton iteration may also recompute the preconditioner matrix \(P\) infrequently to balance the high costs of matrix construction and factorization against the reduced convergence rate that may result from a stale preconditioner.

Alternately, for some preconditioning algorithms that do not rely on
costly matrix construction and factorization operations (e.g. when
using an iterative multigrid method as preconditioner), a user may
specify that \({\mathcal A}\) and/or \(P\) should be
recomputed at every Newton iteration, since the increased rate of
convergence may more than account for the additional cost of
Jacobian/preconditioner construction. To indicate this, a user need
only supply a negative value for the *msbp* argument to
`ARKodeSetMaxStepsBetweenLSet()`

in C/C++, or the
*LSETUP_MSBP* argument to `FARKSETIIN()`

in Fortran.

However, in cases where recomputation of the Newton matrix \(\tilde{\mathcal A}\) or preconditioner matrix \(P\) is lagged, ARKode will force recomputation of these structures only in the following circumstances:

- when starting the problem,
- when more than 20 steps have been taken since the last update (this
value may be changed via the
*msbp*argument to`ARKodeSetMaxStepsBetweenLSet()`

in C/C++, or the*LSETUP_MSBP*argument to`FARKSETIIN()`

in Fortran), - when the value \(\bar{\gamma}\) of \(\gamma\) at the last
update satisfies \(\left|\gamma/\bar{\gamma} - 1\right| > 0.2\)
(this tolerance may be changed via the
*dgmax*argument to`ARKodeSetDeltaGammaMax()`

in C/C++, or the*LSETUP_DGMAX*argument to`FARKSETRIN()`

in Fortran), - when a non-fatal convergence failure just occurred,
- when an error test failure just occurred, or
- if the problem is linearly implicit and \(\gamma\) has changed by a factor larger than 100 times machine epsilon.

When an update is forced due to a convergence failure, an update of \(\tilde{\mathcal A}\) or \(P\) may or may not involve a reevaluation of \(J\) (in \(\tilde{\mathcal A}\)) or of Jacobian data (in \(P\)), depending on whether errors in the Jacobian were the likely cause of the failure. More generally, the decision is made to reevaluate \(J\) (or instruct the user to reevaluate Jacobian data in \(P\)) when:

- starting the problem,
- more than 50 steps have been taken since the last evaluation,
- a convergence failure occurred with an outdated matrix, and the value \(\bar{\gamma}\) of \(\gamma\) at the last update satisfies \(\left|\gamma/\bar{\gamma} - 1\right| > 0.2\),
- a convergence failure occurred that forced a step size reduction, or
- if the problem is linearly implicit and \(\gamma\) has changed by a factor larger than 100 times machine epsilon.

As will be further discussed in the section Preconditioning, in the case of a Krylov method, preconditioning may be applied on the left, right, or on both sides of \({\mathcal A}\), with user-supplied routines for the preconditioner setup and solve operations.

In the process of controlling errors at various levels (time integration, nonlinear solution, linear solution), ARKode uses a weighted root-mean-square norm, denoted \(\|\cdot\|_\text{WRMS}\), for all error-like quantities,

(12)\[\|v\|_\text{WRMS} = \left( \frac{1}{N} \sum_{i=1}^N \left(v_i\,
w_i\right)^2\right)^{1/2}.\]

The power of this choice of norm arises in the specification of the weighting vector \(w\), that combines the units of the problem with the user-supplied measure of “acceptable” error. To this end, ARKode constructs and error weight vector using the most-recent step solution and the relative and absolute tolerances input by the user, namely

(13)\[w_i = \frac{1}{RTOL\cdot |y_i| + ATOL_i}.\]

Since \(1/w_i\) represents a tolerance in the component \(y_i\), a vector whose WRMS norm is 1 is regarded as “small.” For brevity, we will typically drop the subscript WRMS on norms in the remainder of this section.

Additionally, for problems involving a non-identity mass matrix, \(M\ne I\), the units of equation (1) may differ from the units of the solution \(y\). In this case, ARKode may also construct a residual weight vector,

(14)\[w_i = \frac{1}{RTOL\cdot |My_i| + ATOL'_i},\]

where the user may specify a separate absolute residual tolerance value or array, \(ATOL'_i\). The choice of weighting vector used in any given norm is determined by the quantity being measured: values having solution units use (13), whereas values having equation units use (14). Obviously, for problems with \(M=I\), the weighting vectors are identical.

The stopping test for all of ARKode’s nonlinear solvers is related to the subsequent local error test, with the goal of keeping the nonlinear iteration errors from interfering with local error control. Denoting the final computed value of each stage solution as \(z_i^{(m)}\), and the true stage solution solving (5) as \(z_i\), we want to ensure that the iteration error \(z_i - z_i^{(m)}\) is “small” (recall that a norm less than 1 is already considered within an acceptable tolerance).

To this end, we first estimate the linear convergence rate \(R_i\) of the nonlinear iteration. We initialize \(R_i=1\), and reset it to this value whenever \(\tilde{\mathcal A}\) or \(P\) are updated. After computing a nonlinear correction \(\delta^{(m)} = z_i^{(m)} - z_i^{(m-1)}\), if \(m>0\) we update \(R_i\) as

\[R_i \leftarrow \max\{ 0.3 R_i, \left\|\delta^{(m)}\right\| / \left\|\delta^{(m-1)}\right\| \}.\]

where the factor 0.3 is user-modifiable as the *crdown* input to the
the function `ARKodeSetNonlinCRDown()`

in C/C++, or the
*NONLIN_CRDOWN* argument to `FARKSETRIN()`

in Fortran.

Denoting the true time step solution as \(y(t_n)\), and the computed time step solution (computed using the stage solutions \(z_i^{(m)}\)) as \(y_n\), we use the estimate

\[\left\| y(t_n) - y_n \right\| \approx
\max_i \left\| z_i^{(m+1)} - z_i^{(m)} \right\| \approx
\max_i R_i \left\| z_i^{(m)} - z_i^{(m-1)} \right\| =
\max_i R_i \left\| \delta^{(m)} \right\|.\]

Therefore our convergence (stopping) test for the nonlinear iteration for each stage is

(15)\[\begin{split}R_i \left\|\delta^{(m)} \right\| < \epsilon,\end{split}\]

where the factor \(\epsilon\) has default value 0.1, and is
user-modifiable as the *nlscoef* input to the the function
`ARKodeSetNonlinConvCoef()`

in C/C++, or the *NLCONV_COEF*
input to the function `FARKSETRIN()`

in Fortran. We allow at
most 3 nonlinear iterations (modifiable through
`ARKodeSetMaxNonlinIters()`

in C/C++, or as the *MAX_NSTEPS*
argument to `FARKSETIIN()`

in Fortran). We also declare the
nonlinear iteration to be divergent if any of the ratios
\(\|\delta^{(m)}\| / \|\delta^{(m-1)}\| > 2.3\) with \(m>0\)
(the value 2.3 may be modified as the *rdiv* input to
`ARKodeSetNonlinRDiv()`

in C/C++, or the *NONLIN_RDIV* input
to `FARKSETRIN()`

in Fortran). If convergence fails in the
fixed point iteration, or in the Newton iteration with \(J\) or
\({\mathcal A}\) current, we must then reduce the step size by a
factor of 0.25 (modifiable via the *etacf* input to the
`ARKodeSetMaxCFailGrowth()`

function in C/C++, or the
*ADAPT_ETACF* input to `FARKSETRIN()`

in Fortran). The
integration is halted after 10 convergence failures (modifiable via the
`ARKodeSetMaxConvFails()`

function in C/C++, or the
*MAX_CONVFAIL* argument to `FARKSETIIN()`

in Fortran).

When a Krylov method is used to solve the linear systems (7), its errors must also be controlled. To this end, we approximate the linear iteration error in the solution vector \(\delta^{(m)}\) using the preconditioned residual vector, e.g. \(r = P{\mathcal A}\delta^{(m)} + PG\) for the case of left preconditioning (the role of the preconditioner is further elaborated on in the next section). In an attempt to ensure that the linear iteration errors do not interfere with the nonlinear solution error and local time integration error controls, we require that the norm of the preconditioned linear residual satisfies

(16)\[\|r\| \le \frac{\epsilon_L \epsilon}{10}.\]

Here \(\epsilon\) is the same value as that used above for the
nonlinear error control. The factor of 10 is used to ensure that the
linear solver error does not adversely affect the nonlinear solver
convergence. Smaller values for the parameter \(\epsilon_L\) are
typically useful for strongly nonlinear and stiff ODE systems, while
easier ODE systems may benefit from a value closer to 1; by default
\(\epsilon_L = 0.05\), which may be modified by the user through
the `ARKSpilsSetEpsLin()`

function in C/C++, or through the
`FARKSPILSSETEPSLIN()`

in Fortran. We note that for linearly
implicit problems the same tolerance (16) is used for
the single Newton iteration.

When using an inexact Newton method to solve the nonlinear system (5), ARKode makes repeated use of a linear solver to solve linear systems of the form \({\mathcal A}x = b\), where \(x\) is a correction vector and \(b\) is a residual vector. If this linear system solve is done with one of the scaled preconditioned iterative linear solvers, the efficiency of such solvers may benefit tremendously from preconditioning. A system \({\mathcal A}x=b\) can be preconditioned as one of:

\[\begin{split}(P^{-1}{\mathcal A})x = P^{-1}b & \qquad\text{[left preconditioning]}, \\
({\mathcal A}P^{-1})Px = b & \qquad\text{[right preconditioning]}, \\
(P_L^{-1} {\mathcal A} P_R^{-1}) P_R x = P_L^{-1}b & \qquad\text{[left and right
preconditioning]}.\end{split}\]

The Krylov method is then applied to a system with the matrix \(P^{-1}{\mathcal A}\), \({\mathcal A}P^{-1}\), or \(P_L^{-1} {\mathcal A} P_R^{-1}\), instead of \({\mathcal A}\). In order to improve the convergence of the Krylov iteration, the preconditioner matrix \(P\), or the product \(P_L P_R\) in the third case, should in some sense approximate the system matrix \({\mathcal A}\). Yet at the same time, in order to be cost-effective the matrix \(P\) (or matrices \(P_L\) and \(P_R\)) should be reasonably efficient to evaluate and solve. Finding an optimal point in this tradeoff between rapid convergence and low cost can be quite challenging. Good choices are often problem-dependent (for example, see [BH1989] for an extensive study of preconditioners for reaction-transport systems).

Most of the iterative linear solvers supplied with SUNDIALS allow for preconditioning either side, or on both sides, although for non-symmetric matrices \({\mathcal A}\) we know of few situations where preconditioning on both sides is superior to preconditioning on one side only (with the product \(P = P_L P_R\)). Moreover, for a given preconditioner matrix, the merits of left vs. right preconditioning are unclear in general, and the user should experiment with both choices. Performance will differ between these choices because the inverse of the left preconditioner is included in the linear system residual whose norm is being tested in the Krylov algorithm. As a rule, however, if the preconditioner is the product of two matrices, we recommend that preconditioning be done either on the left only or the right only, rather than using one factor on each side. An exception to this rule is the PCG solver, that itself assumes a symmetric matrix \({\mathcal A}\), since the PCG algorithm in fact applies the single preconditioner matrix \(P\) in both left/right fashion as \(P^{-1/2} {\mathcal A} P^{-1/2}\).

Typical preconditioners used with ARKode are based on approximations to the system Jacobian, \(J = \partial f_I / \partial y\). Since the Newton iteration matrix involved is \({\mathcal A} = M - \gamma J\), any approximation \(\bar{J}\) to \(J\) yields a matrix that is of potential use as a preconditioner, namely \(P = M - \gamma \bar{J}\). Because the Krylov iteration occurs within a Newton iteration and further also within a time integration, and since each of these iterations has its own test for convergence, the preconditioner may use a very crude approximation, as long as it captures the dominant numerical feature(s) of the system. We have found that the combination of a preconditioner with the Newton-Krylov iteration, using even a relatively poor approximation to the Jacobian, can be surprisingly superior to using the same matrix without Krylov acceleration (i.e., a modified Newton iteration), as well as to using the Newton-Krylov method with no preconditioning.

For problems with implicit components, ARKode will employ a prediction algorithm for constructing the initial guesses for each Runge-Kutta stage, \(z_i^{(0)}\). As is well-known with nonlinear solvers, the selection of a good initial guess can have dramatic effects on both the speed and robustness of the nonlinear solve, enabling the difference between rapid quadratic convergence versus divergence of the iteration. To this end, ARKode implements a variety of prediction algorithms that may be selected by the user. In each case, the stage guesses \(z_i^{(0)}\) are constructed explicitly using readily-available information, including the previous step solutions \(y_{n-1}\) and \(y_{n-2}\), as well as any previous stage solutions \(z_j, \quad j<i\). In most cases, prediction is performed by constructing an interpolating polynomial through existing data, which is then evaluated at the subsequent stage times to provide an inexpensive but (hopefully) reasonable prediction of the subsequent solution value. Specifically, for most Runge-Kutta methods each stage solution satisfies

\[z_i \approx y(t^I_{n,i}),\]

so by constructing an interpolating polynomial \(p_q(t)\) through a set of existing data, the initial guess at stage solutions may be approximated as

\[z_i^{(0)} = p_q(t^I_{n,i}).\]

Denoting \([a,b]\) as the interval containing the data used to construct \(p_q(t)\), and assuming forward integration from \(a\to b\), it is typically the case that \(t^I_{n,j} > b\). The dangers of using a polynomial interpolant to extrapolate values outside the interpolation interval are well-known, with higher-order polynomials and predictions further outside the interval resulting in the greatest potential inaccuracies.

The various prediction algorithms therefore construct different types of interpolant \(p_q(t)\), as described below.

The so-called “trivial predictor” is given by the formula

\[p_0(\tau) = y_{n-1}.\]

While this piecewise-constant interpolant is clearly not a highly accurate candidate for problems with time-varying solutions, it is often the most robust approach for either highly stiff problems, or problems with implicit constraints whose violation may cause illegal solution values (e.g. a negative density or temperature).

At the opposite end of the spectrum, ARKode can construct an
interpolant \(p_q(t)\) of polynomial order up to \(q=3\).
Here, the function \(p_q(t)\) is identical to the one used for
interpolation of output solution values between time steps, i.e. for
“dense output” of \(y(t)\) for \(t_{n-1} < t < t_n\).
The order of this polynomial, \(q\), may be specified by the user
with the function `ARKodeSetDenseOrder()`

in C/C++, or with the
*DENSE_ORDER* argument to `FARKSETIIN()`

in Fortran.

The interpolants generated are either of Lagrange or Hermite form, and use the data \(\left\{ y_{n-2}, f_{n-2}, y_{n-1}, f_{n-1} \right\}\), where we use \(f_{k}\) to denote \(M^{-1} \left(f_E(t_k,y_k) + f_I(t_k,y_k)\right)\). Defining a scaled and shifted “time” variable \(\tau\) for the interval \([t_{n-2}, t_{n-1}]\) as

\[\tau(t) = (t-t_{n-1})/h_{n-1},\]

we may denote the predicted stage times in the subsequent time interval \([t_{n-1}, t_{n}]\) as

\[\tau_i = c^I_i \frac{h_n}{h_{n-1}}.\]

We then construct the interpolants \(p(t)\) as follows:

\(q=0\): constant interpolant

\[p_0(\tau) = \frac{y_{n-2} + y_{n-1}}{2}.\]\(q=1\): linear Lagrange interpolant

\[p_1(\tau) = -\tau\, y_{n-2} + (1+\tau)\, y_{n-1}.\]\(q=2\): quadratic Hermite interpolant

\[p_2(\tau) = \tau^2\,y_{n-2} + (1-\tau^2)\,y_{n-1} + h(\tau+\tau^2)\,f_{n-1}.\]\(q=3\): cubic Hermite interpolant

\[p_3(\tau) = (3\tau^2 + 2\tau^3)\,y_{n-2} + (1-3\tau^2-2\tau^3)\,y_{n-1} + h(\tau^2+\tau^3)\,f_{n-2} + h(\tau+2\tau^2+\tau^3)\,f_{n-1}.\]

These higher-order predictors may be useful when using lower-order methods in which \(h_n\) is not too large. We further note that although interpolants of order \(> 3\) are possible, these are not implemented due to their increased computing and storage costs, along with their diminishing returns due to increased extrapolation error.

This predictor attempts to use higher-order interpolations \(p_q(t)\) for predicting earlier stages in the subsequent time interval, and lower-order interpolants for later stages. It uses the same formulas as described above, but chooses \(q\) adaptively based on the stage index \(i\), under the (rather tenuous) assumption that the stage times are increasing, i.e. \(c^I_j < c^I_k\) for \(j<k\):

\[q = \max\{ q_\text{max} - i,\; 1 \}.\]

This predictor follows a similar idea as the previous algorithm, but monitors the actual stage times to determine the polynomial interpolant to use for prediction:

\[\begin{split}q = \begin{cases}
q_\text{max}, & \text{if}\quad \tau < \tfrac12,\\
1, & \text{otherwise}.
\end{cases}\end{split}\]

This predictor does not use any information from within the preceding step, instead using information only within the current step \([t_{n-1},t_n]\) (including \(y_{n-1}\) and \(f_{n-1}\)). Instead, this approach uses the right-hand side from a previously computed stage solution in the same step, \(f(t_{n-1}+c^I_j h,z_j)\) to construct a quadratic Hermite interpolant for the prediction. If we define the constants \(\tilde{h} = c^I_j h\) and \(\tau = c^I_i h\), the predictor is given by

\[z_i^{(0)} = y_{n-1} + \left(\tau - \frac{\tau^2}{2\tilde{h}}\right)
f(t_{n-1},y_{n-1}) + \frac{\tau^2}{2\tilde{h}} f(t_{n-1}+\tilde{h},z_j).\]

For stages in which \(c^I_j=0\) for all previous stages \(j = 0,\ldots,i-1\), and for the first stage of any time step \((i=0)\), this method reduces to using the trivial predictor \(z_i^{(0)} = y_{n-1}\). For stages having multiple precdding nonzero \(c^I_j\), we choose the stage having largest \(c^I_j\) value, to minimize the amount of extrapolation induced through the prediction.

We note that in general, each stage solution \(z_j\) has
signicantly worse accuracy than the time step solutions
\(y_{n-1}\), due to the difference between the *stage order* and
the *method order* in typical Runge-Kutta methods. As a result, the
accuracy of this predictor will generally be rather limited, but it is
provided for problems in which this increased stage error is better
than the effects of extrapolation far outside of the previous time
step interval \([t_{n-2},t_{n-1}]\).

We further note that although this method could be used with non-identity mass matrix \(M\ne I\), support for that mode is not currently implemented, so selection of this predictor in the case that \(M\ne I\) will result in use of the Trivial predictor.

This predictor is not interpolation based; instead it utilizes all existing stage information from the current step to create a predictor containing all but the current stage solution. Specifically, as discussed in equations (2) and (5), each stage solves a nonlinear equation

\[\begin{split}z_i &= y_{n-1} + h_n \sum_{j=1}^{i-1} A^E_{i,j} f_E(t^E_{n,j}, z_j)
+ h_n \sum_{j=1}^{i} A^I_{i,j} f_I(t^I_{n,j}, z_j), \\
\Leftrightarrow \quad & \\
G(z_i) &\equiv z_i - h_n A^I_{i,i} f_I(t^I_{n,i}, z_i) - a_i = 0.\end{split}\]

This prediction method merely computes the predictor \(z_i\) as

\[\begin{split}z_i &= y_{n-1} + h_n \sum_{j=1}^{i-1} A^E_{i,j} f_E(t^E_{n,j}, z_j)
+ h_n \sum_{j=1}^{i-1} A^I_{i,j} f_I(t^I_{n,j}, z_j), \\
\Leftrightarrow \quad & \\
z_i &= a_i.\end{split}\]

We note that although this method could be also used with non-identity mass matrix \(M\ne I\), support for that mode is not currently implemented, so selection of this predictor in the case that \(M\ne I\) will result in use of the Trivial predictor.

A critical component of ARKode, making it an IVP “solver” rather than just an integrator, is its adaptive control of local truncation error. At every step, we estimate the local error, and ensure that it satisfies tolerance conditions. If this local error test fails, then the step is recomputed with a reduced step size. To this end, every Runge-Kutta method packaged within ARKode admit an embedded solution \(\tilde{y}_n\), as shown in equation (2). Generally, these embedded solutions attain a lower order of accuracy than the computed solution \(y_n\). Denoting these orders of accuracy as \(p\) and \(q\), where \(p\) corresponds to the embedding and \(q\) corresponds to the method, for the majority of embedded methods \(p = q-1\). These values of \(p\) and \(q\) correspond to the global order of accuracy for the method and embedding, hence each admit local errors satisfying [HW1993]

(17)\[\begin{split}\| y_n - y(t_n) \| = C h_n^{q+1} + \mathcal O(h_n^{q+2}), \\
\| \tilde{y}_n - y(t_n) \| = D h_n^{p+1} + \mathcal O(h_n^{p+2}),\end{split}\]

where \(C\) and \(D\) are constants independent of \(h\), and where we have assumed exact initial conditions for the step, \(y_{n-1} = y(t_{n-1})\). Combining these estimates, we have

\[\| y_n - \tilde{y}_n \| = \| y_n - y(t_n) - \tilde{y}_n + y(t_n) \|
\le \| y_n - y(t_n) \| + \| \tilde{y}_n - y(t_n) \|
\le D h_n^{p+1} + \mathcal O(h_n^{p+2}).\]

We therefore use this difference norm as an estimate for the local truncation error at the step \(n\),

(18)\[T_n = \beta \left(y_n - \tilde{y}_n\right) =
\beta h_n M^{-1} \sum_{i=1}^{s} \left[
\left(b^E_i - \tilde{b}^E_i\right) f_E(t^E_{n,i}, z_i) +
\left(b^I_i - \tilde{b}^I_i\right) f_I(t^I_{n,i}, z_i) \right].\]

Here, \(\beta>0\) is an error *bias* to help account for the error
constant \(D\); the default value of this is \(\beta = 1.5\),
and may be modified by the user through the function
`ARKodeSetErrorBias()`

in C/C++, or through the input
*ADAPT_BIAS* to `FARKSETRIN()`

in Fortran.

With this LTE estimate, the local error test is simply \(\|T_n\|
< 1\), where we remind that this norm includes the user-specified
relative and absolute tolerances. If this error test passes, the step
is considered successful, and the estimate is subsequently used to
estimate the next step size, as will be described below in the section
Asymptotic error control. If the error test fails,
the step is rejected and a new step size \(h'\) is then computed
using the error control algorithms described in
Asymptotic error control. A new attempt at the step
is made, and the error test is repeated. If it fails multiple times
(as specified through the *small_nef* input to
`ARKodeSetSmallNumEFails()`

in C/C++, or the *ADAPT_SMALL_NEF*
argument to `FARKSETIIN()`

in Fortran, which defaults to 2),
then \(h'/h\) is limited above to 0.3 (this is modifiable via the
*etamxf* argument to `ARKodeSetMaxEFailGrowth()`

in C/C++, or
the *ADAPT_ETAMXF* argument to `FARKSETRIN()`

in Fortran), and
limited below to 0.1 after an additional step failure. After
seven error test failures (modifiable via the function
`ARKodeSetMaxErrTestFails()`

in C/C++, or the *MAX_ERRFAIL*
argument to `FARKSETIIN()`

in Fortran), ARKode returns to the
user with a give-up message.

We define the step size ratio between a prospective step \(h'\) and a completed step \(h\) as \(\eta\), i.e.

\[\eta = h' / h.\]

This is bounded above by \(\eta_\text{max}\) to ensure that step size adjustments are not overly aggressive. This value is modified according to the step and history,

\[\begin{split}\eta_\text{max} = \begin{cases}
\text{etamx1}, & \quad\text{on the first step (default is 10000)}, \\
\text{growth}, & \quad\text{on general steps (default is 20)}, \\
1, & \quad\text{if the previous step had an error test failure}.
\end{cases}\end{split}\]

Here, the values of *etamx1* and *growth* may be modified by the user
in the functions `ARKodeSetMaxFirstGrowth()`

and
`ARKodeSetMaxGrowth()`

in C/C++, respectively, or through the
inputs *ADAPT_ETAMX1* and *ADAPT_GROWTH* to the function
`FARKSETRIN()`

in Fortran.

A flowchart detailing how the time steps are modified at each
iteration to ensure solver convergence and successful steps is given
in the figure below. Here, all norms correspond to the WRMS norm, and
the error adaptivity function **arkAdapt** is supplied by one of the
error control algorithms discussed in the subsections below.

For some problems it may be preferrable to avoid small step size
adjustments. This can be especially true for problems that construct
and factor the Newton Jacobian matrix \({\mathcal A}\) from
equation (8) for either a direct solve, or as a
preconditioner for an iterative solve, where this construction is
computationally expensive, and where Newton convergence can be
seriously hindered through use of a somewhat incorrect
\({\mathcal A}\). In these scenarios, the step is not changed
when \(\eta \in [\eta_L, \eta_U]\). The default values for these
parameters are \(\eta_L = 1\) and \(\eta_U = 1.5\), though
these are modifiable through the function
`ARKodeSetFixedStepBounds()`

in C/C++, or through the input
*ADAPT_BOUNDS* to the function `FARKSETRIN()`

in Fortran.

The user may supply external bounds on the step sizes within ARKode,
through defining the values \(h_\text{min}\) and \(h_\text{max}\) with
the functions `ARKodeSetMinStep()`

and
`ARKodeSetMaxStep()`

in C/C++, or through the inputs
*MIN_STEP* and *MAX_STEP* to the function `FARKSETRIN()`

in
Fortran, respectively. These default to \(h_\text{min}=0\) and
\(h_\text{max}=\infty\).

Normally, ARKode takes steps until a user-defined output value
\(t = t_\text{out}\) is overtaken, and then it computes
\(y(t_\text{out})\) by interpolation (using the same dense output
routines described in the section
Maximum order predictor). However, a “one step” mode option
is available, where control returns to the calling program after each
step. There are also options to force ARKode not to integrate past a
given stopping point \(t = t_\text{stop}\), through the function
`ARKodeSetStopTime()`

in C/C++, or through the input
*STOP_TIME* to `FARKSETRIN()`

in Fortran.

As mentioned above, ARKode adapts the step size in order to attain local errors within desired tolerances of the true solution. These adaptivity algorithms estimate the prospective step size \(h'\) based on the asymptotic local error estimates (17). We define the values \(\varepsilon_n\), \(\varepsilon_{n-1}\) and \(\varepsilon_{n-2}\) as

\[\begin{split}\varepsilon_k &\ \equiv \ \|T_k\|
\ = \ \beta \|y_k - \tilde{y}_k\|,\end{split}\]

corresponding to the local error estimates for three consecutive steps, \(t_{n-3} \to t_{n-2} \to t_{n-1} \to t_n\). These local error history values are all initialized to 1.0 upon program initialization, to accomodate the few initial time steps of a calculation where some of these error estimates are undefined. With these estimates, ARKode implements a variety of error control algorithms, as specified in the subsections below.

This is the default time adaptivity controller used by ARKode. It derives from those found in [KC2003], [S1998], [S2003] and [S2006]. It uses all three of the local error estimates \(\varepsilon_n\), \(\varepsilon_{n-1}\) and \(\varepsilon_{n-2}\) in determination of a prospective step size,

\[h' \;=\; h_n\; \varepsilon_n^{-k_1/p}\; \varepsilon_{n-1}^{k_2/p}\;
\varepsilon_{n-2}^{-k_3/p},\]

where the constants \(k_1\), \(k_2\) and \(k_3\) default
to 0.58, 0.21 and 0.1, respectively, though each may be changed via a
call to the C/C++ function `ARKodeSetAdaptivityMethod()`

in
C/C++, or to the Fortran function `FARKSETADAPTIVITYMETHOD()`

in Fortran. In this estimate, a floor of \(\varepsilon >
10^{-10}\) is enforced to avoid division-by-zero errors.

Like with the previous method, the PI controller derives from those found in [KC2003], [S1998], [S2003] and [S2006], but it differs in that it only uses the two most recent step sizes in its adaptivity algorithm,

\[h' \;=\; h_n\; \varepsilon_n^{-k_1/p}\; \varepsilon_{n-1}^{k_2/p}.\]

Here, the default values of \(k_1\) and \(k_2\) default
to 0.8 and 0.31, respectively, though they may be changed via a
call to `ARKodeSetAdaptivityMethod()`

in C/C++, or
`FARKSETADAPTIVITYMETHOD()`

in Fortran. As with the previous
controller, at initialization \(k_1 = k_2 = 1.0\) and the floor of
\(10^{-10}\) is enforced on the local error estimates.

The so-called I controller is the standard time adaptivity control algorithm in use by most available ODE solvers. It bases the prospective time step estimate entirely off of the current local error estimate,

\[h' \;=\; h_n\; \varepsilon_n^{-k_1/p}.\]

By default, \(k_1=1\), but that may be overridden by the user with
the function `ARKodeSetAdaptivityMethod()`

in C/C++, or the
function `FARKSETADAPTIVITYMETHOD()`

in Fortran.

This step adaptivity algorithm was proposed in [G1991], and is primarily useful in combination with explicit Runge-Kutta methods. Using the notation of our earlier controllers, it has the form

(19)\[\begin{split}h' \;=\; \begin{cases}
h_1\; \varepsilon_1^{-1/p}, &\quad\text{on the first step}, \\
h_n\; \varepsilon_n^{-k_1/p}\;
\left(\varepsilon_n/\varepsilon_{n-1}\right)^{k_2/p}, &
\quad\text{on subsequent steps}.
\end{cases}\end{split}\]

The default values of \(k_1\) and \(k_2\) are 0.367 and 0.268,
respectively, which may be changed by calling either
`ARKodeSetAdaptivityMethod()`

in C/C++, or
`FARKSETADAPTIVITYMETHOD()`

in Fortran.

A version of the above controller suitable for implicit Runge-Kutta methods was introduced in [G1994], and has the form

(20)\[\begin{split}h' = \begin{cases}
h_1 \varepsilon_1^{-1/p}, &\quad\text{on the first step}, \\
h_n \left(h_n / h_{n-1}\right) \varepsilon_n^{-k_1/p}
\left(\varepsilon_n/\varepsilon_{n-1}\right)^{-k_2/p}, &
\quad\text{on subsequent steps}.
\end{cases}\end{split}\]

The algorithm parameters default to \(k_1 = 0.98\) and
\(k_2 = 0.95\), but may be modified by the user with
`ARKodeSetAdaptivityMethod()`

in C/C++, or
`FARKSETADAPTIVITYMETHOD()`

in Fortran.

An ImEx version of these two preceding controllers is available in ARKode. This approach computes the estimates \(h'_1\) arising from equation (19) and the estimate \(h'_2\) arising from equation (20), and selects

\[h' = \frac{h}{|h|}\min\left\{|h'_1|, |h'_2|\right\}.\]

Here, equation (19) uses \(k_1\) and
\(k_2\) with default values of 0.367 and 0.268, while equation
(20) sets both parameters to the input \(k_3\) that
defaults to 0.95. All three of these parameters may be modified with
the C/C++ function `ARKodeSetAdaptivityMethod()`

in C/C++, or
the Fortran function `FARKSETADAPTIVITYMETHOD()`

in Fortran.

Finally, ARKode allows the user to define their own time step adaptivity function,

\[h' = H(y, t, h_n, h_{n-1}, h_{n-2}, \varepsilon_n, \varepsilon_{n-1}, \varepsilon_{n-2}, q, p),\]

via a call to the C/C++ routine `ARKodeSetAdaptivityFn()`

or
the Fortran routine `FARKADAPTSET()`

.

For problems that involve a nonzero explicit component, \(f_E(t,y) \ne 0\), explicit and ImEx Runge-Kutta methods may benefit from addition user-supplied information regarding the explicit stability region. All ARKode adaptivity methods utilize estimates of the local error. It is often the case that such local error control will be sufficient for method stability, since unstable steps will typically exceed the error control tolerances. However, for problems in which \(f_E(t,y)\) includes even moderately stiff components, and especially for higher-order integration methods, it may occur that a significant number of attempted steps will exceed the error tolerances. While these steps will automatically be recomputed, such trial-and-error may be costlier than desired. In these scenarios, a stability-based time step controller may also be useful.

Since the explicit stability region for any method depends on the problem under consideration, as it results from the eigenvalues of the linearized operator \(\frac{\partial f_E}{\partial y}\), information on the maximum stable step size is not computed internally within ARKode. However, for many problems such information is readily available. For example, in an advection-diffusion calculation, \(f_I\) may contain the stiff diffusive components and \(f_E\) may contain the comparably nonstiff advection terms. In this scenario, an explicitly stable step \(h_\text{exp}\) would be predicted as one satisfying the Courant-Friedrichs-Lewy (CFL) stability condition,

\[\begin{split}|h_\text{exp}| < \frac{\Delta x}{|\lambda|}\end{split}\]

where \(\Delta x\) is the spatial mesh size and \(\lambda\) is the fastest advective wave speed.

In these scenarios, a user may supply a routine to predict this
maximum explicitly stable step size, \(|h_\text{exp}|\), by calling the
C/C++ function `ARKodeSetStabilityFn()`

or the Fortran
function `FARKEXPSTABSET()`

. If a value for
\(|h_\text{exp}|\) is supplied, it is compared against the value
resulting from the local error controller, \(|h_\text{acc}|\), and
the step used by ARKode will satisfy

\[h' = \frac{h}{|h|}\min\{c\, |h_\text{exp}|,\, |h_\text{acc}|\}.\]

Here the explicit stability step factor (often called the “CFL
factor”) \(c>0\) may be modified through the function
`ARKodeSetCFLFraction()`

in C/C++, or through the input
*ADAPT_CFL* to the function `FARKSETRIN()`

in Fortran, and has
a default value of \(1/2\).

While ARKode is designed for time step adaptivity, it may additionally be called in “fixed-step” mode, typically used for debugging purposes or for verification against hand-coded Runge-Kutta methods. In this mode, all time step adaptivity is disabled:

- temporal error control is disabled,
- nonlinear or linear solver non-convergence results in an error (instead of a step size adjustment),
- no check against an explicit stability condition is performed.

Additional information on this mode is provided in the section Optional input functions.

Within the algorithms described above, there are three locations where a linear solve of the form

\[M x = b\]

is required: (a) in constructing the time-evolved solution \(y_n\), (b) in estimating the local temporal truncation error, and (c) in constructing predictors for the implicit solver iteration (see section Maximum order predictor). Specifically, to construct the time-evolved solution \(y_n\) from equation (2) we must solve

\[\begin{split}&M y_n \ = \ M y_{n-1} + h_n \sum_{i=1}^{s} \left( b^E_i f_E(t^E_{n,i}, z_i)
+ b^I_i f_I(t^I_{n,i}, z_i)\right), \\
\Leftrightarrow \qquad & \\
&M (y_n -y_{n-1}) \ = \ h_n \sum_{i=1}^{s} \left(b^E_i f_E(t^E_{n,i}, z_i)
+ b^I_i f_I(t^I_{n,i}, z_i)\right), \\
\Leftrightarrow \qquad & \\
&M \nu \ = \ h_n \sum_{i=1}^{s} \left(b^E_i f_E(t^E_{n,i}, z_i)
+ b^I_i f_I(t^I_{n,i}, z_i)\right),\end{split}\]

for the update \(\nu = y_n - y_{n-1}\). Similarly, in computing the local temporal error estimate \(T_n\) from equation (18) we must solve systems of the form

\[M\, T_n = h \sum_{i=1}^{s} \left[
\left(b^E_i - \tilde{b}^E_i\right) f_E(t^E_{n,i}, z_i) +
\left(b^I_i - \tilde{b}^I_i\right) f_I(t^I_{n,i}, z_i) \right].\]

Lastly, in constructing dense output and implicit predictors of order 2 or higher (as in the section Maximum order predictor above), we must compute the derivative information \(f_k\) from the equation

\[M f_k = f_E(t_k, y_k) + f_I(t_k, y_k).\]

Of course, for problems in which \(M=I\) these solves are not required; however for problems with non-identity \(M\), ARKode may use either an iterative linear solver or a direct linear solver, in the same manner as described in the section Linear solver methods for solving the linear Newton systems. We note that at present, the matrix \(M\) may depend on time \(t\) but must be independent of the solution \(y\), since we assume that each of the above systems are linear.

At present, for DIRK and ARK problems using a direct solver for
the Newton nonlinear iterations, the type of matrix (dense, band or
sparse) for the Newton systems \({\mathcal A}\delta = -G\) must
match the type of linear solver used for these mass-matrix systems,
since \(M\) is included inside \({\mathcal A}\). When direct
methods are employed, the user must supply a routine to compute
\(M\) in either dense, band or sparse form to match the
structure of \({\mathcal A}\), with a user-supplied routine of type
`ARKDlsMassFn()`

.

When iterative methods are used, a routine must be supplied to perform
the mass-matrix-vector product, \(Mv\), through a call to the
routine `ARKSpilsMassTimesVecFn()`

. As with iterative solvers
for the Newton systems, preconditioning may be applied to aid in
solution of the mass matrix systems \(Mx=b\). When using an
iterative mass matrix linear solver, we require that the norm of the
preconditioned linear residual satisfies

(21)\[\|r\| \le \epsilon_L \epsilon,\]

where again, \(\epsilon\) is the nonlinear solver tolerance
parameter from (15). When using iterative system
and mass matrix linear solvers, \(\epsilon_L\) may be specified
separately for both tolerances (16) and
(21); the mass matrix linear solver value of
\(\epsilon_L\) may be modified using
`ARKSpilsSetMassEpsLin()`

in C/C++, or
`FARKSPILSSETMASSEPSLIN()`

in Fortran.

The ARKode solver has been augmented to include a rootfinding feature. This means that, while integrating the IVP (1), ARKode can also find the roots of a set of user-defined functions \(g_i(t,y)\) that depend on \(t\) and the solution vector \(y = y(t)\). The number of these root functions is arbitrary, and if more than one \(g_i\) is found to have a root in any given interval, the various root locations are found and reported in the order that they occur on the \(t\) axis, in the direction of integration.

Generally, this rootfinding feature finds only roots of odd multiplicity, corresponding to changes in sign of \(g_i(t, y(t))\), denoted \(g_i(t)\) for short. If a user root function has a root of even multiplicity (no sign change), it will probably be missed by ARKode. If such a root is desired, the user should reformulate the root function so that it changes sign at the desired root.

The basic scheme used is to check for sign changes of any \(g_i(t)\) over each time step taken, and then (when a sign change is found) to home in on the root (or roots) with a modified secant method [HS1980]. In addition, each time \(g\) is computed, ARKode checks to see if \(g_i(t) = 0\) exactly, and if so it reports this as a root. However, if an exact zero of any \(g_i\) is found at a point \(t\), ARKode computes \(g(t+\delta)\) for a small increment \(\delta\), slightly further in the direction of integration, and if any \(g_i(t+\delta) = 0\) also, ARKode stops and reports an error. This way, each time ARKode takes a time step, it is guaranteed that the values of all \(g_i\) are nonzero at some past value of \(t\), beyond which a search for roots is to be done.

At any given time in the course of the time-stepping, after suitable checking and adjusting has been done, ARKode has an interval \((t_\text{lo}, t_\text{hi}]\) in which roots of the \(g_i(t)\) are to be sought, such that \(t_\text{hi}\) is further ahead in the direction of integration, and all \(g_i(t_\text{lo}) \ne 0\). The endpoint \(t_\text{hi}\) is either \(t_n\), the end of the time step last taken, or the next requested output time \(t_\text{out}\) if this comes sooner. The endpoint \(t_\text{lo}\) is either \(t_{n-1}\), or the last output time \(t_\text{out}\) (if this occurred within the last step), or the last root location (if a root was just located within this step), possibly adjusted slightly toward \(t_n\) if an exact zero was found. The algorithm checks \(g(t_\text{hi})\) for zeros, and it checks for sign changes in \((t_\text{lo}, t_\text{hi})\). If no sign changes are found, then either a root is reported (if some \(g_i(t_\text{hi}) = 0\)) or we proceed to the next time interval (starting at \(t_\text{hi}\)). If one or more sign changes were found, then a loop is entered to locate the root to within a rather tight tolerance, given by

\[\tau = 100\, U\, (|t_n| + |h|)\qquad (\text{where}\; U = \text{unit roundoff}).\]

Whenever sign changes are seen in two or more root functions, the one deemed most likely to have its root occur first is the one with the largest value of \(\left|g_i(t_\text{hi})\right| / \left| g_i(t_\text{hi}) - g_i(t_\text{lo})\right|\), corresponding to the closest to \(t_\text{lo}\) of the secant method values. At each pass through the loop, a new value \(t_\text{mid}\) is set, strictly within the search interval, and the values of \(g_i(t_\text{mid})\) are checked. Then either \(t_\text{lo}\) or \(t_\text{hi}\) is reset to \(t_\text{mid}\) according to which subinterval is found to have the sign change. If there is none in \((t_\text{lo}, t_\text{mid})\) but some \(g_i(t_\text{mid}) = 0\), then that root is reported. The loop continues until \(\left|t_\text{hi} - t_\text{lo} \right| < \tau\), and then the reported root location is \(t_\text{hi}\). In the loop to locate the root of \(g_i(t)\), the formula for \(t_\text{mid}\) is

\[t_\text{mid} = t_\text{hi} -
\frac{g_i(t_\text{hi}) (t_\text{hi} - t_\text{lo})}{g_i(t_\text{hi}) - \alpha g_i(t_\text{lo})} ,\]

where \(\alpha\) is a weight parameter. On the first two passes through the loop, \(\alpha\) is set to 1, making \(t_\text{mid}\) the secant method value. Thereafter, \(\alpha\) is reset according to the side of the subinterval (low vs high, i.e. toward \(t_\text{lo}\) vs toward \(t_\text{hi}\)) in which the sign change was found in the previous two passes. If the two sides were opposite, \(\alpha\) is set to 1. If the two sides were the same, \(\alpha\) is halved (if on the low side) or doubled (if on the high side). The value of \(t_\text{mid}\) is closer to \(t_\text{lo}\) when \(\alpha < 1\) and closer to \(t_\text{hi}\) when \(\alpha > 1\). If the above value of \(t_\text{mid}\) is within \(\tau /2\) of \(t_\text{lo}\) or \(t_\text{hi}\), it is adjusted inward, such that its fractional distance from the endpoint (relative to the interval size) is between 0.1 and 0.5 (with 0.5 being the midpoint), and the actual distance from the endpoint is at least \(\tau/2\).

Finally, we note that when running in parallel, the ARKode rootfinding module assumes that the entire set of root defining functions \(g_i(t,y)\) is replicated on every MPI task. Since in these cases the vector \(y\) is distributed across tasks, it is the user’s responsibility to perform any necessary inter-task communication to ensure that \(g_i(t,y)\) is identical on each task.