Calculus: Linear Approximations, I

Last week’s post on the Geometry of Polynomials generated a lot of interest from folks who teach calculus or are otherwise interested in it.  So I thought I’d start a thread about other ideas related to teaching calculus.

This idea is certainly not new.  But I think it is sorely underexploited in the calculus classroom.  I like it because it reinforces the idea of derivative as linear approximation.

The main idea is to rewrite

\displaystyle\lim_{h\to 0}\dfrac{f(x+h)-f(x)}h=f'(x)

as

f(x+h)\approx f(x)+hf'(x),

with the note that this approximation is valid when h\approx0.  Written this way, the limit says that f(x+h), as a function of h, is approximately linear in h near h=0: the error in the approximation goes to 0 faster than h does.  The limit in the definition exists exactly when there is such a good linear approximation to f at x.

Moreover, in this sense, if

f(x+h)\approx f(x)+hg(x),

then it must be the case that f'(x)=g(x).  This is not difficult to prove.
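One way to write out that short proof, as a sketch: r(h) is a name introduced only for this note (it does not appear above) for the error in the approximation, and “\approx” is read as “r(h)/h\to0 as h\to0.”

```latex
f(x+h)=f(x)+h\,g(x)+r(h),\qquad \lim_{h\to0}\frac{r(h)}{h}=0
\quad\Longrightarrow\quad
\frac{f(x+h)-f(x)}{h}=g(x)+\frac{r(h)}{h}\;\longrightarrow\;g(x)\quad\text{as } h\to0 .
```

So the limit in the definition exists and equals g(x), which is to say f'(x)=g(x).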

Let’s look at a simple example, like finding the derivative of f(x)=x^2.  It’s easy to see that

f(x+h)=(x+h)^2=x^2+h(2x)+h^2.

So it’s easy to read off the derivative: ignore higher-order terms in h, and then look at the coefficient of h as a function of x.

Note that this is perfectly rigorous.  It should be clear that ignoring higher-order terms in h is fine since when taking the limit as in the definition, only one h divides out, meaning those terms contribute 0 to the limit.  So the coefficient of h will be the only term to survive the limit process.

Also note that this is nothing more than a rearrangement of the algebra necessary to compute the derivative using the usual definition.  I just find it is more intuitive, and less cumbersome notationally.  But every step taken can be justified rigorously.
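If you’d like to see this numerically, here is a minimal Python sketch.  The helper names (linear_remainder, read_off, wrong_guess) are made up for this illustration.  It computes the leftover error divided by h, which should shrink to 0 only when the coefficient of h really is the derivative.

```python
def linear_remainder(f, g, x, h):
    """(f(x+h) - f(x) - h*g(x)) / h; tends to 0 as h -> 0 when g(x) is f'(x)."""
    return (f(x + h) - f(x) - h * g(x)) / h


def square(x):
    return x ** 2


def read_off(x):
    # Coefficient of h in (x+h)^2 = x^2 + h(2x) + h^2.
    return 2 * x


def wrong_guess(x):
    # A deliberately wrong candidate, for contrast.
    return 3 * x


for h in (1e-1, 1e-2, 1e-3):
    print(h,
          linear_remainder(square, read_off, 3.0, h),
          linear_remainder(square, wrong_guess, 3.0, h))
```

For the correct coefficient the remainder is just the leftover h^2 term divided by h, so it shrinks along with h; for the wrong guess it settles near a nonzero number instead.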

Moreover, this method is the one commonly used in more advanced mathematics, where  functions take vectors as input.  So if

f({\bf v})={\bf v}\cdot{\bf v},

we compute

f({\bf u}+h{\bf v})={\bf u}\cdot{\bf u}+2h{\bf u}\cdot{\bf v}+h^2{\bf v}\cdot{\bf v},

and read off

\nabla_{\bf v}f({\bf u})=2{\bf u}\cdot{\bf v}.

I don’t want to go into more details here, since such calculations don’t occur in beginning calculus courses.  I just want to point out that this way of computing derivatives is in fact a natural one, but one which you don’t usually encounter until graduate-level courses.
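For the curious, the vector calculation above can be checked with a few lines of plain Python.  This is only a sketch: the particular vectors u and v are arbitrary choices for illustration, and the helper names are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def f(v):
    return dot(v, v)


def difference_quotient(u, v, h):
    """(f(u + h v) - f(u)) / h for the function f(v) = v . v."""
    shifted = [ui + h * vi for ui, vi in zip(u, v)]
    return (f(shifted) - f(u)) / h


u = [1.0, 2.0, -1.0]   # arbitrary base point
v = [0.5, -1.0, 2.0]   # arbitrary direction
predicted = 2 * dot(u, v)  # coefficient of h read off from the expansion

for h in (1e-2, 1e-4, 1e-6):
    print(h, difference_quotient(u, v, h), predicted)
```

The difference quotient differs from 2u·v only by the h(v·v) term, so the two printed columns agree more and more closely as h shrinks.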

Let’s take a look at another example:  the derivative of f(x)=\sin(x), and see how it looks using this rewrite.  We first write

\sin(x+h)=\sin(x)\cos(h)+\cos(x)\sin(h).

Now replace all functions of h with their linear approximations.  Since \cos(h)\approx1 and \sin(h)\approx h near h=0, we have

\sin(x+h)\approx\sin(x)+h\cos(x).

This immediately gives that \cos(x) is the derivative of \sin(x).

Now the approximation \cos(h)\approx1 is easy to justify geometrically by looking at the graph of \cos(x), which has a horizontal tangent line at x=0.  But how do we justify the approximation \sin(h)\approx h?

Of course there is no getting around this.  The limit

\displaystyle\lim_{h\to0}\dfrac{\sin(h)}h

is the one difficult calculation in computing the derivative of \sin(x).  So then you’ve got to provide your favorite proof of this limit, and then move on.  But this approximation helps to illustrate the essential point:  the differentiability of \sin(x) at x=0 does, in a real sense, imply the differentiability of \sin(x) everywhere else.

So computing derivatives in this way doesn’t save any of the hard work, but I think it makes the work a bit more transparent.  And as we continually replace functions of h with their linear approximations, this aspect of the derivative is regularly being emphasized.
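A quick numerical look at the sine approximation, as a sketch only (the point x=1 and the helper name sin_linear are arbitrary choices for this illustration):

```python
import math


def sin_linear(x, h):
    """Linear approximation to sin(x + h) obtained above."""
    return math.sin(x) + h * math.cos(x)


x = 1.0
for h in (1e-1, 1e-2, 1e-3):
    error = math.sin(x + h) - sin_linear(x, h)
    print(f"h={h:g}  error={error:.3e}  error/h={error / h:.3e}")
```

The point to watch is the last column: the error divided by h itself goes to 0, which is exactly what “good linear approximation” means here.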

How would we use this technique to differentiate f(x)=\sqrt x?  We need

\sqrt{x+h}\approx\sqrt x+hf'(x),

and so

x+h\approx \left(\sqrt x+hf'(x)\right)^2\approx x+2h\sqrt xf'(x).

Since the coefficient of h on the left is 1, so must be the coefficient on the right, so that

2\sqrt xf'(x)=1,

and hence

f'(x)=\dfrac1{2\sqrt x}.

As a last example for this week, consider taking the derivative of f(x)=\tan(x).  Then we have

\tan(x+h)=\dfrac{\tan(x)+\tan(h)}{1-\tan(x)\tan(h)}.

Now since \sin(h)\approx h and \cos(h)\approx 1, we have \tan(h)\approx h, and so we can replace to get

\tan(x+h)\approx\dfrac{\tan(x)+h}{1-h\tan(x)}.

Now what do we do?  Since we’re considering h near 0, h\tan(x) is small (as small as we like), and so we can consider

\dfrac1{1-h\tan(x)}

as the sum of the infinite geometric series

\dfrac1{1-h\tan(x)}=1+h\tan(x)+h^2\tan^2(x)+\cdots

Replacing this sum with its linear approximation, 1+h\tan(x), we get

\tan(x+h)\approx(\tan(x)+h)(1+h\tan(x)),

and so

\tan(x+h)\approx\tan(x)+h(1+\tan^2(x)).

This gives the derivative of \tan(x) to be

1+\tan^2(x)=\sec^2(x).

Neat!
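Here is a small Python sketch that compares the stages of this calculation numerically.  The function name tan_stages and the sample point x=0.5 are assumptions made for illustration.

```python
import math


def tan_stages(x, h):
    """Return tan(x+h) and the two successive approximations derived above."""
    t = math.tan(x)
    exact = math.tan(x + h)
    rational = (t + h) / (1 - h * t)   # after replacing tan(h) with h
    linear = t + h * (1 + t * t)       # after also linearizing 1/(1 - h tan x)
    return exact, rational, linear


for h in (1e-1, 1e-2, 1e-3):
    exact, rational, linear = tan_stages(0.5, h)
    print(f"h={h:g}  exact={exact:.8f}  rational={rational:.8f}  linear={linear:.8f}")
```

Both approximations close in on the exact value as h shrinks, with the linear one being the cruder of the two, as you’d expect from having thrown away more terms.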

Now this method takes a bit more work than just using the quotient rule (as is usually done).  But using the quotient rule is a purely mechanical process; this way, we are constantly thinking, “How do I replace this expression with a good linear approximation?”  Perhaps more is learned this way?

There are more interesting examples using this geometric series idea.  We’ll look at a few more next time, and then use this idea to prove the product, quotient, and chain rules.  Until then!

The Geometry of Polynomials

I recently needed to make a short demo lecture, and I thought I’d share it with you.  I’m sure I’m not the first one to notice this, but I hadn’t seen it before and I thought it was an interesting way to look at the behavior of polynomials where they cross the x-axis.

The idea is to give a geometrical meaning to an algebraic procedure:  factoring polynomials.  What is the geometry of the different factors of a polynomial?

Let’s look at an example in some detail:  f(x)=2(x-4)(x-1)^2.

[Figure: graph of f(x)=2(x-4)(x-1)^2]

Now let’s start looking at the behavior near the roots of this polynomial.

[Figure: the graph near the root x=1]

Near x=1, the graph of the cubic looks like a parabola — and that may not be so surprising given that the factor (x-1) occurs quadratically.

[Figure: the graph near the root x=4]

And near x=4, the graph passes through the x-axis like a line — and we see a linear factor of (x-4) in our polynomial.

But which parabola, and which line?  It’s actually pretty easy to figure out.  Here is an annotated slide which illustrates the idea.

[Figure: annotated slide showing the calculation at the root x=1]

All you need to do is set aside the quadratic factor of (x-1)^2, and substitute the root, x=1, in the remaining terms of the polynomial, then simplify.  In this example, we see that the cubic behaves like the parabola y=-6(x-1)^2 near the root x=1. Note the scales on the axes; if they were the same, the parabola would have appeared much narrower.

We perform a similar calculation at the root x=4.

[Figure: annotated slide showing the calculation at the root x=4]

Just isolate the linear factor (x-4), substitute x=4 in the remaining terms of the polynomial, and then simplify.  Thus, the line y=18(x-4) best describes the behavior of the graph of the polynomial as it passes through the x-axis.  Again, note the scale on the axes.
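The two approximations above are easy to check numerically.  In this sketch (the function names are made up for illustration), we look at the ratio of the cubic to each approximation a small distance d from the corresponding root.

```python
def f(x):
    return 2 * (x - 4) * (x - 1) ** 2


def parabola_at_1(x):
    # Set aside (x-1)^2 and substitute x=1 in the rest: 2(1-4) = -6.
    return -6 * (x - 1) ** 2


def line_at_4(x):
    # Set aside (x-4) and substitute x=4 in the rest: 2(4-1)^2 = 18.
    return 18 * (x - 4)


for d in (0.1, 0.01, 0.001):
    print(d, f(1 + d) / parabola_at_1(1 + d), f(4 + d) / line_at_4(4 + d))
```

Both ratios tend to 1 as d shrinks, which is one way of saying that the parabola and the line capture the behavior of the cubic near its roots.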

We can actually use this idea to help us sketch graphs of polynomials when they’re in factored form.  Consider the polynomial f(x)=x(x+1)^2(x-2)^3.  Begin by sketching the three approximations near the roots of the polynomial.  This slide also shows the calculation for the cubic approximation.

[Figure: slide showing the three approximations near the roots, with the calculation for the cubic approximation]

Now you can begin sketching the graph, starting from the left, being careful to closely follow the parabola as you bounce off the x-axis at x=-1.

[Figure: partial sketch of the graph near x=-1]

Continue, following the red line as you pass through the origin, and then the cubic as you pass through x=2.  Of course you’d need to plot a few points to know just where to start and end; this just shows how you would use the approximations near the roots to help you sketch a graph of a polynomial.

[Figure: completed sketch of the graph]
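The “set aside the factor and substitute the root” recipe is mechanical enough to automate.  Here is a minimal sketch, assuming the polynomial is given as a leading coefficient plus a dictionary mapping each root to its multiplicity; that representation and the name local_coefficient are choices made for this illustration.

```python
def local_coefficient(leading, roots, a):
    """Value at x=a of everything except the factor (x-a)^m.

    roots maps each root to its multiplicity.
    """
    coeff = leading
    for r, m in roots.items():
        if r != a:
            coeff *= (a - r) ** m
    return coeff


# f(x) = x (x+1)^2 (x-2)^3
roots = {0: 1, -1: 2, 2: 3}
for a, m in roots.items():
    c = local_coefficient(1, roots, a)
    print(f"near x={a}: y = {c} * (x - ({a}))^{m}")
```

For this polynomial the recipe gives the line y=-8x at the origin, the parabola y=27(x+1)^2 at x=-1, and the cubic y=18(x-2)^3 at x=2.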

Why does this work?  It is not difficult to see, but here we need a little calculus.  Let’s look, in general, at the behavior of f(x)=p(x)(x-a)^n near the root x=a.  Given what we’ve just been observing, we’d guess that the best approximation near x=a would just be y=p(a)(x-a)^n.

Just what does “best approximation” mean?  One way to think about approximating, calculuswise, is matching derivatives — just think of Maclaurin or Taylor series.  My claim is that the first n derivatives of f(x)=p(x)(x-a)^n and y=p(a)(x-a)^n match at x=a.

First, observe that the first n-1 derivatives of both of these functions at x=a must be 0.  This is because (x-a) will always be a factor — since at most n-1 derivatives are taken, there is no way for the (x-a)^n term to completely “disappear.”

But what happens when the nth derivative is taken?  Clearly, the nth derivative of p(a)(x-a)^n at x=a is just n!p(a).  What about the nth derivative of f(x)=p(x)(x-a)^n?

Thinking about the product rule in general, we see that the form of the nth derivative must be f^{(n)}(x)=n!p(x)+ (x-a)(\text{terms involving derivatives of } p(x)).  Whenever a derivative of p(x) is taken, (x-a)^n is differentiated fewer than n times in that term, so at least one factor of (x-a) survives.

So when we take f^{(n)}(a), we also get n!p(a).  This makes the nth derivatives match as well.  And since the first n derivatives of p(x)(x-a)^n and p(a)(x-a)^n match, we see that p(a)(x-a)^n is the best nth degree approximation near the root x=a.
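For readers who want the product-rule step spelled out, here is one way to write it using the general Leibniz rule for the nth derivative of a product.  This is a sketch of the standard formula rather than anything specific to the slides above.

```latex
f^{(n)}(x)=\sum_{k=0}^{n}\binom{n}{k}\,p^{(k)}(x)\,\frac{n!}{k!}\,(x-a)^{k}
          =n!\,p(x)+(x-a)\sum_{k=1}^{n}\binom{n}{k}\,p^{(k)}(x)\,\frac{n!}{k!}\,(x-a)^{k-1}.
```

Here n!/k!\,(x-a)^k is the (n-k)th derivative of (x-a)^n.  Only the k=0 term is free of a factor of (x-a), so setting x=a leaves n!p(a).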

I might call this observation the geometry of polynomials. Well, perhaps not the entire geometry of polynomials….  But I find that any time algebra can be illustrated graphically, students’ understanding gets just a little deeper.

Those who have been reading my blog for a while will be unsurprised at my geometrical approach to algebra (or my geometrical approach to anything, for that matter).  Of course a lot of algebra was invented just to describe geometry — take the Cartesian coordinate plane, for instance.  So it’s time for algebra to reclaim its geometrical heritage.  I shall continue to be part of this important endeavor, for however long it takes….

The Problem with Calculus Textbooks

Simply put, most calculus textbooks are written in the wrong order.

Unfortunately, this includes the most popular textbooks used in colleges and universities today.

This problem has a long history, and will not be quickly solved for a variety of reasons. I think the solution lies ultimately with high quality, open source e-modules (that is, stand-alone tutorials on all calculus-related topics), but that discussion is for another time. Today, I want to address a more pressing issue: since many of us (including myself) must teach from such textbooks — now, long before the publishing revolution — how might we provide students a more engaging, productive calculus experience?

To be specific, I’ll describe some strategies I’ve used in calculus over the past several years. Once you get the idea, you’ll be able to look through your syllabus and find ways to make similar adaptations. There are so many different versions of calculus taught, there is no “one size fits all” solution. So here goes.

1. I now teach differentiation before limits. The reason is that very little intuition about limits is needed to differentiate quadratics, for example — but the idea of limits is naturally introduced in terms of slopes of secant lines. Once students have the general idea, I give them a list of the usual functions to differentiate. Now they generate the limits we need to study — completely opposite of introducing various limits out of context that “they will need later.”

Students routinely ask, “When am I ever going to use this?” At one time, I dismissed the question as irrelevant — surely students should know that the learning process is not one of immediate gratification. But when I really understood what they were asking — “How do I make sense of what you’re telling me when I have nothing to relate it to except the promise of some unknown future problem?” — I started to rethink how I presented concepts in calculus.

I also didn’t want to write my own calculus textbook from scratch — so I looked for ways to use the resources I already had. Simply doing the introductory section on differentiation before the chapter on limits takes no additional time in the classroom, and not much preparation on the part of the teacher. This point is crucial for the typical teacher — time is precious. What I’m advocating is just a reshuffling of the topics we (have to) teach anyway.

2. I no longer teach the chapter on techniques of integration as a “chapter.” In the typical textbook, nothing in this chapter is sufficiently motivated. So here’s what I do.

I teach the section on integration by parts when I discuss volumes. Finding volumes using cylindrical shells naturally gives rise to using integration by parts, so why wait? Incidentally, I also bring center of mass and Pappus’ theorem into play, as they also fit naturally here. The one-variable formulation of the center of mass gives rise to squares of functions, so I introduce integrating powers of trigonometric functions here. (Though I omit topics such as using integration by parts to integrate unfriendly powers of tangent and secant — I do not feel this is necessary given any mathematician I know would jump to Mathematica or similar software to evaluate such integrals.)

I teach trigonometric substitution (hyperbolic as well, but that’s for another blog post) when I cover arc length and surface area, again because integrals involving square roots arise naturally here.

Partial fractions can either be introduced when covering telescoping series, or when solving the logistic equation. (A colleague recommended doing series in the middle of the course rather than at the end (where it would naturally have fallen given the order of chapters in our text), since she found that students’ minds were fresher then, so I introduced partial fractions when doing telescoping series. I found this rearrangement to be a good suggestion, by the way. Thanks, Cornelia!)

3. I no longer begin Taylor series by introducing sequences and series in the conventional way. First, I motivate the idea by considering limits like

\displaystyle\lim_{x\to0}\dfrac{\sin x-x}{x^3}=-\dfrac16.

This essentially means that near 0, we can approximate \sin(x) by the cubic polynomial

\sin(x)\approx x-\dfrac{x^3}6.

In other words, the limits we often encounter while studying L’Hopital’s rule provide a good motivation for polynomial approximations. Once the idea is introduced, higher-order — eventually “infinite-order” — approximations can be brought in. Some algorithms approximate transcendental functions with polynomials — this provides food for thought as well. Natural questions arise: How far do we need to go to get a given desired accuracy? Will the process always work?
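Students can explore both the limit and the resulting approximation numerically before any series machinery is in place.  A minimal sketch (the function names are hypothetical, chosen for this illustration):

```python
import math


def cubic_sin(x):
    """Cubic approximation to sin(x) near 0 suggested by the limit above."""
    return x - x ** 3 / 6


def limit_ratio(x):
    """(sin x - x) / x^3, which should approach -1/6 as x -> 0."""
    return (math.sin(x) - x) / x ** 3


for x in (0.5, 0.1, 0.01):
    error = math.sin(x) - cubic_sin(x)
    print(f"x={x:g}  ratio={limit_ratio(x):.6f}  cubic error={error:.2e}")
```

Watching how quickly the error of the cubic drops as x approaches 0 is a natural lead-in to the question of how many terms are needed for a given accuracy.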

I won’t say more about this approach here, since I’ve written up a complete set of Taylor series notes.  They were written for an Honors-level class, so some sections won’t be appropriate for a typical calculus course. They were also intended for use in an inquiry-based learning environment, and so are not in the usual “text, examples, exercise” order. But I hope they at least convey an approach to the subject, which I have adapted to a more traditional university setting as well. For the interested instructor, I also have compiled a complete Solutions Manual.

I think this is enough to give you the idea of my approach to using a traditional textbook. Every calculus teacher has their own way of thinking about the subject — as it should be. There is no reason to think that every teacher should teach calculus in the same way — but there is every reason to think that calculus teachers should be contemplating how to make this beautiful subject more accessible to their students.
