10.1: Solving Nonlinear Equations with fzero()

What does it mean to “solve” an equation? That may seem like an obvious question, but let’s take a minute to think about it, starting with a simple example.

Suppose we want to know the value of a variable, \(x\), but all we know about it is the relationship \(x^2 = a\). If you’ve taken algebra, you probably know how to solve this equation: you take the square root of both sides and get \(x = \pm \sqrt{a}\). Then, with the satisfaction of a job well done, you move on to the next problem.

But what have you really done? The relationship you derived is equivalent to the relationship you started with—they contain the same information about \(x\)—so why is the second one preferable to the first?

There are two reasons. One is that the relationship is now explicit in \(x\): because \(x\) is all alone on the left side, we can treat the right side as a recipe for computing \(x\), assuming that we know the value of \(a\).

The other reason is that the recipe is written in terms of operations we know how to perform. Assuming that we know how to compute square roots, we can compute the value of \(x\) for any value of \(a\).

When people talk about solving an equation, what they usually mean is something like “finding an equivalent relationship that is explicit in one of the variables.” In the context of this book, that’s what we’ll call an analytic solution, to distinguish it from a numerical solution, which is what we are going to do next.

To demonstrate a numerical solution, consider the equation \(x^2 - 2x = 3\). You could solve this analytically, either by factoring it or by using the quadratic formula, and you would discover that there are two solutions, \(x=3\) and \(x=-1\). Alternatively, you could solve it numerically by moving \(2x\) to the right side, \(x^2 = 2x + 3\), and taking the square root of both sides, which gives \(x = \pm \sqrt{2x+3}\).

This equation is not explicit, since \(x\) appears on both sides, so it’s not clear that this move did any good at all. But suppose we had reason to expect a solution near 4. We could start with \(x=4\) as an initial value and then use the equation \(x = \sqrt{2x+3}\) to compute successive approximations of the solution. (To understand why this works, see https://greenteapress.com/matlab/fixed.)

Here’s what happens:

>> x = 4;
>> x = sqrt(2*x+3)
x = 3.3166

>> x = sqrt(2*x+3)
x = 3.1037

>> x = sqrt(2*x+3)
x = 3.0344

>> x = sqrt(2*x+3)
x = 3.0114

>> x = sqrt(2*x+3)
x = 3.0038

After each iteration, x is closer to the correct answer, and after five iterations the relative error is about 0.1 percent, which is good enough for most purposes.
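The repeated commands above can also be written as a loop. The following is a minimal sketch, not part of the original session; the variable names n_iter and rel_err are illustrative, and the relative error is computed against the known root \(x = 3\).

```matlab
% Fixed-point iteration for x = sqrt(2*x + 3), starting from x = 4.
x = 4;          % initial value
n_iter = 5;     % number of iterations (illustrative name)

for i = 1:n_iter
    x = sqrt(2*x + 3);                          % apply the update once
    fprintf('iteration %d: x = %.4f\n', i, x);
end

% Relative error with respect to the known root x = 3
rel_err = abs(x - 3) / 3;
fprintf('relative error = %.2g\n', rel_err);
```

Increasing n_iter brings x closer to 3, but only because this particular rearrangement happens to converge near that root.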

Techniques that generate numerical solutions are called numerical methods. The nice thing about the method we just used is that it’s simple. But it doesn’t always work, and it’s not often used in practice. We’ll see a better alternative in the next section.

Zero-Finding

MATLAB provides a function, fzero, that uses numerical methods to search for solutions to nonlinear equations. In order to use it, we have to rewrite the equation as an error function, like this:

\[f(x) = x^2 - 2x -3\notag\]

The value of the error function is 0 if \(x\) is a solution and nonzero if it is not. This function is useful because we can use values of \(f(x)\), evaluated at various values of \(x\), to infer the location of the solutions. And that’s what fzero does. Values of \(x\) where \(f(x) = 0\) are called zeros of the function or roots.

To use fzero you have to define a MATLAB function that computes the error function, like this:

function res = error_func(x)
    res = x^2 - 2*x -3;
end

You can call error_func from the Command Window and confirm that \(3\) and \(-1\) are zeros:

>> error_func(3)
ans = 0

>> error_func(-1)
ans = 0

But let’s pretend that we don’t know where the roots are; we only know that one of them is near 4. Then we could call fzero like this:

>> fzero(@error_func, 4)
ans = 3.0000

Success! We found one of the zeros.

The first argument is a function handle that specifies the error function. The @ symbol allows us to name the function without calling it. The interesting thing here is that you’re not actually calling error_func directly; you’re just telling fzero where it is. In turn, fzero calls your error function—more than once, in fact.

The second argument is the initial value. If we provide a different value, we get a different root (at least sometimes).

>> fzero(@error_func, -2)
ans = -1

Alternatively, if you know two values that bracket the root, you can provide both.

>> fzero(@error_func, [2,4])
ans = 3

The second argument is a vector that contains two elements.
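The calls above pass a handle to a function stored in a file. As a sketch of an alternative that is not used in the text, a short error function can also be written as an anonymous function, which is itself a function handle. The names f, root_near_4, root_near_minus_2, and root_bracketed are illustrative.

```matlab
% Anonymous function: f is already a function handle, so no @ is
% needed when passing it to fzero.
f = @(x) x^2 - 2*x - 3;

root_near_4       = fzero(f, 4);        % single initial value
root_near_minus_2 = fzero(f, -2);       % a different starting point
root_bracketed    = fzero(f, [2, 4]);   % interval that brackets a root

fprintf('%g  %g  %g\n', root_near_4, root_near_minus_2, root_bracketed);
```

This form is convenient for one-line functions; a function file is still the better choice when the error function needs several statements.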

You might be curious to know how many times fzero calls your function, and where. If you modify error_func so that it displays the value of x when it is called and then run fzero again, you get

>> fzero(@error_func, [2,4])
x = 2
x = 4
x = 2.75000000000000
x = 3.03708133971292
x = 2.99755211623500
x = 2.99997750209270
x = 3.00000000025200
x = 3.00000000000000
x = 3
x = 3
ans = 3

Not surprisingly, it starts by computing \(f(2)\) and \(f(4)\). Then it computes a point in the interval, \(2.75\), and evaluates \(f\) there. After each iteration, the interval gets smaller and the guess gets closer to the true root. The fzero function stops when the interval is so small that the estimated zero is correct to about 15 digits.
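Instead of modifying the error function to display x, you can ask fzero for additional output arguments. The sketch below assumes an anonymous version of the error function (named f here for illustration) and reads the funcCount and iterations fields of the fourth output; the exact counts may differ between MATLAB versions and Octave, so none are shown.

```matlab
% Ask fzero to report how much work it did.
f = @(x) x^2 - 2*x - 3;

[x, fval, exitflag, output] = fzero(f, [2, 4]);

fprintf('root                 = %.15g\n', x);
fprintf('f(root)              = %g\n', fval);
fprintf('exit flag            = %d\n', exitflag);   % 1 indicates success
fprintf('function evaluations = %d\n', output.funcCount);
fprintf('iterations           = %d\n', output.iterations);
```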

If you’d like to know more about how fzero works, see Chapter 15.2.

What Could Go Wrong?

The most common problem people have with fzero is leaving out the @. In that case, you get something like this:

>> fzero(error_func, [2,4])
Not enough input arguments.

Error in error_func (line 2)
    res = x^2 - 2*x -3;

The error occurs because MATLAB treats the first argument as a function call, so it calls error_func with no arguments.

Another common problem is writing an error function that never assigns a value to the output variable. In general, functions should always assign a value to the output variable, but MATLAB doesn’t enforce this rule, so it’s easy to forget.

For example, if you write

function res = error_func(x)
    y = x^2 - 2*x -3
end

and then call it from the Command Window,

>> error_func(4)
y = 5

it looks like it worked, but don’t be fooled. This function assigns a value to y, and it displays the result, but when the function ends, y disappears along with the function’s workspace. If you try to use it with fzero, you get

>> fzero(@error_func, [2,4])
y = -3

Error using fzero (line 231)
FZERO cannot continue because user-supplied function_handle ==>
error_func failed with the error below.

Output argument "res" (and maybe others) not assigned during call
to "error_func".

If you read it carefully, this is a pretty good error message, provided you understand that “output argument” and “output variable” are the same thing.

You would have seen the same error message when calling error_func from the interpreter, if you had assigned the result to a variable:

>> x = error_func(4)
y = 5

Output argument "res" (and maybe others) not assigned during
call to "error_func".

Another thing that can go wrong: if you provide an interval for the initial guess and it doesn’t actually contain a root, you get

>> fzero(@error_func, [0,1])
Error using fzero (line 272)
The function values at the interval endpoints must differ in sign.
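One way to avoid this error is to check the signs at the endpoints before calling fzero. The following is a minimal sketch; the anonymous function f and the variables interval, fa, fb, and root are illustrative names, and returning NaN when there is no sign change is just one possible convention.

```matlab
% Check that the interval brackets a sign change before calling fzero.
f = @(x) x^2 - 2*x - 3;
interval = [0, 1];

fa = f(interval(1));   % f(0) = -3
fb = f(interval(2));   % f(1) = -4

if sign(fa) ~= sign(fb)
    root = fzero(f, interval);
    fprintf('root = %g\n', root);
else
    root = NaN;        % no sign change, so fzero would raise an error
    fprintf('No sign change on [%g, %g]\n', interval(1), interval(2));
end
```

Note that a missing sign change does not prove there is no root in the interval; an interval that contains both roots, such as [-2, 4], also has endpoint values with the same sign.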

There is one other thing that can go wrong when you use fzero, but this one is less likely to be your fault. It’s possible that fzero won’t be able to find a root.

Generally, fzero is robust, so you may never have a problem, but you should remember that there is no guarantee that fzero will work, especially if you provide a single value as an initial guess. Even if you provide an interval that brackets a root, things can still go wrong if the error function is discontinuous.
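To illustrate the discontinuous case, here is a sketch using tan, which changes sign at \(\pi/2\) without passing through zero. This example is not from the text, and the names g and tol are illustrative. Since fzero only looks for a sign change, it can converge to the discontinuity, so it is worth checking the function value at the result.

```matlab
% tan(x) jumps from large positive to large negative values at pi/2,
% so [1, 2] brackets a sign change but not a root.
g = @(x) tan(x);

[x, fval, exitflag] = fzero(g, [1, 2]);
fprintf('x = %.6f, g(x) = %g, exitflag = %d\n', x, fval, exitflag);

% Sanity check: at a genuine root, the function value should be near 0.
tol = 1e-6;            % illustrative tolerance
if abs(fval) > tol
    disp('fzero stopped at a point where the function is not close to 0.');
end
```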

Choosing an Initial Value

The better your initial value is, the more likely it is that fzero will work, and the fewer iterations it will need.

When you’re solving problems in the real world, you’ll usually have some intuition about the answer. This intuition is often enough to provide a good initial guess.

If not, another way to choose an initial guess is to plot the error function and approximate the zeros visually. If you have a function like error_func that takes a scalar input variable and returns a scalar output variable, you can plot it with ezplot:

>> ezplot(@error_func, [-2,5])

The first argument is a function handle; the second is the interval you want to plot the function in. By examining the plot, you can estimate the locations of the two roots.

Vectorizing Functions

When you call ezplot, you might get the following warning (or error, if you’re using Octave):

Warning: Function failed to evaluate on array inputs;
vectorizing the function may speed up its evaluation and
avoid the need to loop over array elements.

This means that MATLAB tried to call error_func with a vector, and it failed. The problem is that error_func uses the * and ^ operators. With vectors, those operators perform matrix multiplication and matrix exponentiation, not the element-wise multiplication and exponentiation we want (see “Vector Arithmetic” in Section 4.2).
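The difference is easy to see with a small vector. In this sketch (the variable names are illustrative), the element-wise operators work as expected, while x^2 raises an error because matrix exponentiation requires a square matrix. The expression 2*x would also work here, because multiplying by a scalar is always element-wise, but the dotted form makes the intent explicit.

```matlab
x = [1 2 3];

squares = x.^2;    % element-wise power: [1 4 9]
doubled = 2.*x;    % element-wise product: [2 4 6]

% The matrix power x^2 is not defined for a 1-by-3 vector.
try
    bad = x^2;
    failed = false;
catch err
    failed = true;
    disp(err.message);
end
```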

If you rewrite error_func like this:

function res = error_func(x)
    res = x.^2 - 2.*x -3;
end

the warning message goes away, and ezplot runs faster.
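A vectorized error function also makes it possible to search for brackets numerically rather than by eye. The sketch below is one possible approach, not something the text prescribes: it evaluates the function on a grid, finds adjacent grid points where the sign changes, and passes each pair to fzero. The grid limits and the names xs, ys, idx, and roots_found are illustrative, and the grid is chosen so that no grid point lands exactly on a root.

```matlab
% Vectorized error function, written as an anonymous function
f = @(x) x.^2 - 2.*x - 3;

xs = linspace(-2.5, 5.5, 41);   % grid with spacing 0.2
ys = f(xs);                     % one call evaluates the whole grid

% Indices k where f changes sign between xs(k) and xs(k+1)
idx = find(ys(1:end-1) .* ys(2:end) < 0);

roots_found = zeros(size(idx));
for k = 1:numel(idx)
    bracket = [xs(idx(k)), xs(idx(k) + 1)];
    roots_found(k) = fzero(f, bracket);
end

disp(roots_found);
```

A coarse grid can miss roots that are close together, so the grid spacing should be small compared to the expected distance between roots.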