Nonlinear Equations#
Introduction#
Nonlinear equations are ubiquitous in scientific computing. In power systems, for example, the AC power flow equations are nonlinear.
The standard form of nonlinear equations is:

\[
f(x) = 0
\]

where \(f\) is a nonlinear function and \(x\) is the variable. In other words, solving nonlinear equations means finding the roots of a function. This function is called the residual function, and it becomes small enough (zero at the tolerance level) when a solution is found.
Residual Function#
The way to represent a nonlinear equation is to define the residual function. This is straightforward for simple problems in Python.
Consider this problem from the scipy.optimize.fsolve documentation:

\[
\begin{aligned}
x_0 \cos(x_1) &= 4 \\
x_0 x_1 - x_1 &= 5
\end{aligned}
\]

We can define the residual function as:
import numpy as np

def residual(x):
    return [x[0]*np.cos(x[1]) - 4,
            x[0]*x[1] - x[1] - 5]
Note that the function argument x is implicitly a 1D NumPy array.
Root-Finding with fsolve#
One of the most common methods to solve nonlinear equations is the Newton-Raphson method. In practice, for simple problems, we can use off-the-shelf solvers in Python.
fsolve is a function in scipy.optimize for root-finding. We can check its docstring by typing ?fsolve in a Jupyter cell or by visiting the documentation.
from scipy.optimize import fsolve
x0 = [1, 1]
sol = fsolve(residual, x0)
print(sol)
[6.50409711 0.90841421]
np.isclose(residual(sol), [0.0, 0.0]) # the residual at the solution should be almost 0.0.
array([ True, True])
Initial Guess#
The initial guess plays an important role in solver convergence. An improper initial guess may result in failure to converge. Such guesses are said to lie outside the region of attraction of the solution.
For example, if we use the initial guess [0, 0], the solver will fail to converge.
sol = fsolve(residual, [0, 0])
/tmp/ipykernel_260870/2019346469.py:1: RuntimeWarning: The iteration is not making good progress, as measured by the
improvement from the last ten iterations.
sol = fsolve(residual, [0, 0])
The warning indicates that the iteration stalled: the residual did not improve over the last ten iterations, so the solver gave up before converging to zero.
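If you prefer to detect such failures programmatically rather than relying on the printed warning, fsolve can report a status flag. Below is a minimal sketch: ier equals 1 only when a solution was found, and mesg describes the failure otherwise.

# A sketch: request full output and check the status flag `ier`
sol, infodict, ier, mesg = fsolve(residual, [0, 0], full_output=True)
if ier != 1:
    print("fsolve did not converge:", mesg)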
In other cases, a different initial guess may result in a different solution. Consider the simple equation

\[
x^2 - 1 = 0
\]

Starting from 1.2 and -1.2, the solver will converge to 1 and -1, respectively.
def simple_residual(x):
    return [x[0]**2 - 1]
sol = fsolve(simple_residual, [1.2])
print("Starting from 1.2, the solution is:", sol)
sol = fsolve(simple_residual, [-1.2])
print("Starting from -1.2, the solution is:", sol)
Starting from 1.2, the solution is: [1.]
Starting from -1.2, the solution is: [-1.]
Storing History with Decorator#
Let’s hack into fsolve to see how the iterations progress. First, let’s enable full_output=True to get some extra information, such as the number of function evaluations.
sol, infodict, ier, mesg = fsolve(residual, [0, 0], full_output=True)
print(f"Number of function evaluations: {infodict['nfev']}")
Number of function evaluations: 33
Now let’s store x and the residual at each step. We will implement this with a technique called a decorator.
A decorator is a function that takes a function as input and returns a modified function. Decorators are often used to add extra functionality to a function without changing its code.
residual_history = []
x_history = []

# `with_history` is a decorator to wrap the callback function
def with_history(func):  # `func` is the function to modify (wrap)
    def wrapper(*args, **kwargs):  # `wrapper` is the modified function to be returned
        residual_history.append(func(args[0]))
        x_history.append(args[0])
        return func(*args, **kwargs)
    return wrapper
Below is the basic way to use the decorator: call it with the function to be modified. In this way, we did not modify any of the residual code but still added the extra functionality. Decorators are most powerful when you have a long function that needs new functionality added purely from the outside.
residual_with_history = with_history(residual)
sol = fsolve(residual_with_history, [0, 0])
/tmp/ipykernel_260870/865456783.py:3: RuntimeWarning: The iteration is not making good progress, as measured by the
improvement from the last ten iterations.
sol = fsolve(residual_with_history, [0, 0])
# plot the residual history and `x` history in two figures, side by side
import matplotlib.pyplot as plt
import seaborn as sns

def plot_history(residual_history, x_history):
    plt.figure(figsize=(8, 3))
    plt.subplot(1, 2, 1)
    plt.plot(residual_history)
    plt.xlabel('Iteration')
    plt.ylabel('Residual')
    plt.title('Residual History')
    plt.subplot(1, 2, 2)
    plt.plot(x_history)
    plt.xlabel('Iteration')
    plt.ylabel('x')
    plt.title('x History')
plot_history(residual_history, x_history)

You can tell that the residual does not converge to zero, and the values of x jump back and forth.
Now let’s check the residual and x using the initial guess [1, 1].
# don't forget to reset the history
residual_history = []
x_history = []
sol = fsolve(residual_with_history, [1, 1])
plot_history(residual_history, x_history)

More on Decorators#
Decorators can be reused to wrap different functions. Suppose we have a
different residual function for which the history is needed. In addition to
writing out the residual function, we can use @with_history
to wrap it. This
is a syntax sugar provided by Python.
@with_history
def residual2(x):
    return [x[0]**2 + x[1]**2 - 1,
            x[0] + x[1] - 1]
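As a quick check, the decorated residual2 can be passed to fsolve directly, and the shared history lists will record its iterates. This is a minimal sketch; the initial guess [0.8, 0.2] is an arbitrary choice, and the lists are reset first so the histories do not mix.

# reset the shared history lists before reusing them
residual_history = []
x_history = []

sol2 = fsolve(residual2, [0.8, 0.2])
print(sol2)
print(f"{len(x_history)} points were recorded")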
Convergence#
We talked about “convergence”, but what does it mean? Intuitively, it means that the solution does not change much between iterations. That can be seen from the previous figure.
Typically, convergence is checked on the variable between consecutive iterations. For a specified xtol, if the iterates satisfy \(|x_{k+1} - x_k| < \text{xtol}\), the solution is deemed converged. This is what fsolve checks.
In addition, the residual should be small enough at convergence. The corresponding tolerance is often named ftol. If the residual satisfies \(|f(x_{k+1})| < \text{ftol}\), the solution is deemed converged.
It is beneficial to check both of them if you are writing a custom solver.
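For illustration, here is a minimal sketch of a hand-written Newton-Raphson iteration for the scalar equation \(x^2 - 1 = 0\) used earlier, checking both criteria. The tolerances and iteration limit are arbitrary choices, and this is not how fsolve is implemented internally.

def newton_1d(x0, xtol=1e-8, ftol=1e-8, max_iter=50):
    x = float(x0)
    for k in range(max_iter):
        f = x**2 - 1              # residual of x^2 - 1 = 0
        dx = -f / (2 * x)         # Newton step using the analytical derivative 2x
        x += dx
        # declare convergence only when both the step and the residual are small
        if abs(dx) < xtol and abs(x**2 - 1) < ftol:
            return x, k + 1
    raise RuntimeError("Newton iteration did not converge")

print(newton_1d(1.2))  # converges to 1.0 in a few iterations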
Jacobian Matrix#
SciPy’s fsolve uses Powell’s hybrid method, which combines the Newton-Raphson step with a gradient-descent (steepest-descent) step. Both require the Jacobian matrix, which contains the first-order derivatives of the residual function w.r.t. the unknown variables.
Calculating the Jacobian matrix is computationally expensive. By default, fsolve uses the hybrd routine in MINPACK, which approximates the Jacobian matrix with forward finite differences. Each column of the Jacobian requires one extra evaluation of the residual function, so approximating the full Jacobian costs on the order of N additional residual evaluations for an N-dimensional problem.
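To make this concrete, below is a minimal sketch of a forward-difference Jacobian approximation for the residual function above. The step size eps is an arbitrary choice, and this is not MINPACK's exact implementation.

# A sketch of a forward-difference Jacobian: one extra residual evaluation
# per column, in addition to the residual at the base point.
def fd_jacobian(func, x, eps=1e-6):
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(func(x))
    jac = np.zeros((f0.size, x.size))
    for j in range(x.size):
        x_step = x.copy()
        x_step[j] += eps                    # perturb one variable at a time
        jac[:, j] = (np.asarray(func(x_step)) - f0) / eps
    return jac

print(fd_jacobian(residual, [6.5, 0.9]))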
Providing the Jacobian matrix can speed up the calculation. This is one of the reasons why texts on power system analysis always show the Jacobian matrix.
fsolve allows specifying the function that provides the Jacobian matrix through the fprime argument.
To clarify, the Jacobian function returns the first-order derivatives of the residual function w.r.t. the unknown variables. It is not necessarily the derivative of the original function w.r.t. the unknown variables.
In this case, the residual function is:

\[
f(x) = \begin{bmatrix} x_0 \cos(x_1) - 4 \\ x_0 x_1 - x_1 - 5 \end{bmatrix}
\]

And the unknowns are

\[
x = \begin{bmatrix} x_0 \\ x_1 \end{bmatrix}
\]

Therefore, the analytical Jacobian matrix is:

\[
J(x) = \begin{bmatrix}
\cos(x_1) & -x_0 \sin(x_1) \\
x_1 & x_0 - 1
\end{bmatrix}
\]

It is called the analytical Jacobian matrix because it is derived using calculus.
def jacobian(x):
    return [[np.cos(x[1]), -x[0]*np.sin(x[1])],
            [x[1], x[0] - 1]]
sol = fsolve(residual, x0, fprime=jacobian)
print(sol)
np.isclose(residual(sol), [0.0, 0.0]) # the residual at the solution should be almost 0.0.
[6.50409711 0.90841421]
array([ True, True])
In this small example, we cannot see the performance gain from providing the Jacobian matrix. However, the gain will be illustrated later with a power flow problem.
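If you want to measure the difference on this example anyway, a rough timing comparison can be sketched as below. The repetition count is an arbitrary choice, and on such a tiny system the timings are dominated by Python overhead, so the numbers should not be over-interpreted.

import time

def time_solve(**kwargs):
    # repeat the solve many times to get a measurable duration
    start = time.perf_counter()
    for _ in range(1000):
        fsolve(residual, x0, **kwargs)
    return time.perf_counter() - start

print("finite-difference Jacobian:", time_solve())
print("analytical Jacobian:       ", time_solve(fprime=jacobian))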