Nonlinear Equations#

Introduction#

Nonlinear equations are ubiquitous in scientific computing. In power systems, for example, the AC power flow equations form a set of nonlinear equations.

The standard form of nonlinear equations is:

\[ f(x) = 0 \]

where f is a nonlinear function and x is the vector of unknowns. In other words, solving nonlinear equations means finding the roots of a function. This function is called the residual function; its value becomes small enough (zero within the solver tolerance) when the solution is found.

Residual Function#

The way to represent a nonlinear equation is to define the residual function. This is straightforward for simple problems in Python.

Consider this problem from the scipy.optimize.fsolve documentation:

\[ x_0 \cos(x_1) = 4 \]
\[ x_1 x_0 - x_1 = 5 \]

We can define the residual function as:

import numpy as np


def residual(x):
    return [x[0]*np.cos(x[1]) - 4,
            x[0]*x[1] - x[1] - 5]

Note that the function argument x is passed in as a 1D NumPy array, so indexing and NumPy functions work on it directly.
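As a quick sanity check (the test point [1.0, 1.0] below is arbitrary and only for illustration), we can evaluate the residual directly:

x_test = np.array([1.0, 1.0])  # an arbitrary test point
print(residual(x_test))        # both components are far from zero, so [1, 1] is not a root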

Root-Finding with fsolve#

One of the most common methods to solve nonlinear equations is the Newton-Raphson method. In practice, for simple problems, we can use off-the-shelf solvers in Python.

fsolve is a function in scipy.optimize for root-finding. We can check the docstring by typing ?fsolve in a Jupyter cell or visiting the documentation.

from scipy.optimize import fsolve

x0 = [1, 1]
sol = fsolve(residual, x0)
print(sol)
[6.50409711 0.90841421]
np.isclose(residual(sol), [0.0, 0.0])  # the residual at the solution should be almost 0.0.
array([ True,  True])

Initial Guess#

The initial guess plays an important role in solver convergence. An improper initial guess may result in failure to converge. Such guesses are said to be outside the region of attraction of the solution.

For example, if we use the initial guess [0, 0], the solver will fail to converge.

sol = fsolve(residual, [0, 0])
/tmp/ipykernel_260870/2019346469.py:1: RuntimeWarning: The iteration is not making good progress, as measured by the 
  improvement from the last ten iterations.
  sol = fsolve(residual, [0, 0])

The warning indicates that the iterations stopped making progress toward a root: the residual showed little improvement over the last ten iterations.
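We can confirm that the returned point is not a root by evaluating the residual there:

print(residual(sol))  # far from [0, 0], confirming the solver did not find a root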

In other cases, a different initial guess may result in a different solution. Consider the simple equation

\[x^2 - 1 = 0\]

Starting from 1.2 and -1.2, the solver converges to 1 and -1, respectively.

def simple_residual(x):
    return [x[0]**2 - 1]

sol = fsolve(simple_residual, [1.2])
print("Starting from 1.2, the solution is:", sol)

sol = fsolve(simple_residual, [-1.2])
print("Starting from -1.2, the solution is:", sol)
Starting from 1.2, the solution is: [1.]
Starting from -1.2, the solution is: [-1.]

Storing History with a Decorator#

Let’s look more closely at what fsolve is doing. First, let’s enable full_output=True to get some extra information, such as the number of residual evaluations.

sol, infodict, ier, mesg = fsolve(residual, [0, 0], full_output=True)

print(f"Number of iterations: {infodict['nfev']}")
Number of iterations: 33

Now let’s store x and the residual at each step. We will implement this with a technique called a decorator.

Decorators are a way to modify the behavior of a function. They are often used to add extra functionality to a function.

A decorator is a function which takes a function as input and returns a modified function.
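As a minimal, generic illustration (the names shout and greet below are made up for this example), a decorator wraps one function inside another:

def shout(func):                    # takes a function as input ...
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()  # ... and returns a modified version
    return wrapper


greet = shout(lambda name: f"hello, {name}")
print(greet("world"))               # prints: HELLO, WORLD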

residual_history = []
x_history = []


# `with_history` is a decorator to wrap the callback function
def with_history(func):  # `func` is the function to modify (wrap)
    def wrapper(*args, **kwargs):  # `wrapper` is the modified function to be returned
        result = func(*args, **kwargs)      # call the wrapped function
        residual_history.append(result)     # record the residual at this step
        x_history.append(np.copy(args[0]))  # copy x so later updates don't overwrite history
        return result
    return wrapper

Below is the basic way to use the decorator – just call the decorator with the function to be wrapped. In this way, we did not modify the residual code at all but still added extra functionality. Decorators are most powerful when you have a long function that needs new functionality without any changes to its body.

residual_with_history = with_history(residual)

sol = fsolve(residual_with_history, [0, 0])
/tmp/ipykernel_260870/865456783.py:3: RuntimeWarning: The iteration is not making good progress, as measured by the 
  improvement from the last ten iterations.
  sol = fsolve(residual_with_history, [0, 0])
# plot the residual history and `x` history in two figures, side by side

import matplotlib.pyplot as plt


def plot_history(residual_history, x_history):
    plt.figure(figsize=(8, 3))
    plt.subplot(1, 2, 1)
    plt.plot(residual_history)
    plt.xlabel('Iteration')
    plt.ylabel('Residual')
    plt.title('Residual History')
    plt.subplot(1, 2, 2)
    plt.plot(x_history)
    plt.xlabel('Iteration')
    plt.ylabel('x')
    plt.title('x History')


plot_history(residual_history, x_history)
[Figure: residual history (left) and x history (right) for the initial guess [0, 0].]

You can tell that the residual does not converge to zero, and the values of x jump back and forth.

Now let’s check the residual and x using initial guess [1, 1].

# don't forget to reset the history
residual_history = []
x_history = []

sol = fsolve(residual_with_history, [1, 1])
plot_history(residual_history, x_history)
[Figure: residual history (left) and x history (right) for the initial guess [1, 1]; the residual approaches zero as the solver converges.]

More on Decorators#

Decorators can be reused to wrap different functions. Suppose we have a different residual function for which the history is also needed. Instead of calling with_history explicitly, we can place @with_history above the function definition. This is syntactic sugar provided by Python.

@with_history
def residual2(x):
    return [x[0]**2 + x[1]**2 - 1,
            x[0] + x[1] - 1]
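As a usage sketch (the initial guess [2.0, 0.5] and the name sol2 are chosen only for illustration; this system has solutions at [1, 0] and [0, 1]), we can reset the history lists and solve residual2 in the same way:

# reset the shared history lists before the new run
residual_history = []
x_history = []

sol2 = fsolve(residual2, [2.0, 0.5])  # residual2 is already wrapped by @with_history
print(sol2)
plot_history(residual_history, x_history)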

Convergence#

We talked about “convergence”, but what does it mean? Intuitively, it means that the solution does not change much between iterations. That can be seen from the previous figure.

Typically, convergence is checked on the change in the variable between consecutive iterations. For a specified xtol, if the iterates satisfy \(|x_{k+1} - x_k| < \text{xtol}\), the solution is deemed converged. This is what fsolve checks (its xtol is, more precisely, a relative tolerance on the change between iterates).

In addition, the residual should be small enough when converged. The tolerance is often named ftol. If the residual satisfies \(|f(x_{k+1})| < \text{ftol}\), the solution is deemed converged.

It is beneficial to check both of them if you are writing a custom solver.
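As an illustration, below is a minimal Newton-Raphson sketch for the scalar equation \(x^2 - 1 = 0\) that checks both criteria. The function name newton_scalar, the tolerances, and the iteration cap are chosen for this example only; this is not how fsolve is implemented internally.

def newton_scalar(f, fprime, x0, x_tol=1e-8, f_tol=1e-8, max_iter=50):
    # minimal Newton-Raphson loop checking both the step size and the residual
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)  # Newton step for a scalar equation
        x_new = x - step
        # declare convergence only when both the step and the residual are small
        if abs(x_new - x) < x_tol and abs(f(x_new)) < f_tol:
            return x_new
        x = x_new
    raise RuntimeError("Newton iteration did not converge")


print(newton_scalar(lambda x: x**2 - 1, lambda x: 2*x, 1.2))  # converges to 1.0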

Jacobian Matrix#

SciPy’s fsolve uses Powell’s hybrid method, which combines the Newton-Raphson step with a gradient (steepest-descent) step. Both require the Jacobian matrix, which contains the first-order derivatives of the residual function w.r.t. the unknown variables.

Calculating the Jacobian matrix is computationally expensive. By default, fsolve uses the hybrd routine in MINPACK, which approximates the Jacobian matrix using forward finite differences. Each column of the Jacobian requires one extra evaluation of the residual function, so approximating the full Jacobian costs on the order of N residual evaluations for an N-dimensional problem (central differences would double this to 2N).
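To make this concrete, here is a sketch of a forward-difference Jacobian approximation. This is not fsolve’s internal routine; the function name fd_jacobian, the step size eps, and the evaluation point are chosen for illustration.

def fd_jacobian(func, x, eps=1e-6):
    # approximate the Jacobian of `func` at `x` by forward differences
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(func(x))            # baseline residual: one evaluation
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):             # one extra evaluation per column
        x_pert = x.copy()
        x_pert[j] += eps
        J[:, j] = (np.asarray(func(x_pert)) - f0) / eps
    return J


print(fd_jacobian(residual, [6.5, 0.9]))  # compare with the analytical Jacobian derived below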

Providing the Jacobian matrix can speed up the calculation. This is one of the reasons why texts on power system analysis always derive the Jacobian matrix of the power flow equations.

fsolve allows specifying the function that provides the Jacobian matrix through the fprime argument.

To clarify, the Jacobian function returns the first-order derivatives of the residual function w.r.t. the unknown variables. It is not necessarily the same as the derivatives of the original function w.r.t. the unknown variables.

In this case, the residual function is:

\[\begin{split} r(x) = \begin{bmatrix} r_0 \\ r_1 \end{bmatrix} = \begin{bmatrix} x_0 \cos(x_1) - 4 \\ x_0 x_1 - x_1 - 5 \end{bmatrix} \end{split}\]

And the unknowns are

\[\begin{split} x = \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} \end{split}\]

Therefore, the analytical Jacobian matrix is:

\[\begin{split} J = \begin{bmatrix} \frac{\partial r_0}{\partial x_0} & \frac{\partial r_0}{\partial x_1} \\ \frac{\partial r_1}{\partial x_0} & \frac{\partial r_1}{\partial x_1} \end{bmatrix} = \begin{bmatrix} \cos(x_1) & -x_0 \sin(x_1) \\ x_1 & x_0 - 1 \end{bmatrix} \end{split}\]

It is called the analytical Jacobian matrix because it is derived using calculus, rather than approximated numerically.

def jacobian(x):
    return [[np.cos(x[1]), -x[0]*np.sin(x[1])],
            [x[1], x[0] - 1]]


sol = fsolve(residual, x0, fprime=jacobian)
print(sol)

np.isclose(residual(sol), [0.0, 0.0])  # the residual at the solution should be almost 0.0.
[6.50409711 0.90841421]
array([ True,  True])

In this small example, we cannot see the performance gain from providing the Jacobian matrix. However, the gain will be illustrated later with a power flow problem.
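Even here, we can get a rough sense of the difference by comparing the number of residual evaluations reported by full_output with and without the analytical Jacobian (the variable names info_fd and info_jac are arbitrary, and the exact counts depend on the initial guess and the SciPy version):

_, info_fd, _, _ = fsolve(residual, x0, full_output=True)
_, info_jac, _, _ = fsolve(residual, x0, fprime=jacobian, full_output=True)

print("Residual evaluations without Jacobian:", info_fd['nfev'])
print("Residual evaluations with Jacobian:", info_jac['nfev'],
      "plus", info_jac['njev'], "Jacobian evaluations")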