Calculus: Hyperbolic Trigonometry, II

Now on to some calculus involving hyperbolic trigonometry!  Today, we’ll look at trigonometric substitutions involving hyperbolic functions.

Let’s start with a typical example:

\displaystyle\int\sqrt{1+x^2}\,dx.
The usual technique involving circular trigonometric functions is to put x=\tan(\theta), so that dx=\sec^2(\theta)\,d\theta, and the integral transforms to

\displaystyle\int\sec^3(\theta)\,d\theta.
In general, we note that when taking square roots, a negative sign is sometimes needed if the limits of the integral demand it.

This integral requires integration by parts, and ultimately evaluating the integral

\displaystyle\int\sec(\theta)\,d\theta.
And how is this done?  I shudder when calculus textbooks write

\displaystyle\int \sec(\theta)\cdot\dfrac{\sec(\theta)+\tan(\theta)}{\sec(\theta)+\tan(\theta)}\,d\theta=\ldots

How does one motivate that “trick” to aspiring calculus students?  Of course the textbooks never do.

Now let’s see how to approach the original integral using a hyperbolic substitution.  We substitute x=\sinh(u), so that dx=\cosh(u)\,du and \sqrt{1+x^2}=\cosh(u).  Note well that taking the positive square root is always correct, since \cosh(u) is always positive!

This results in the integral

\displaystyle\int\cosh^2(u)\,du,
which is quite simple to evaluate:

\displaystyle\int\cosh^2(u)\,du=\int\dfrac{1+\cosh(2u)}2\,du=\dfrac u2+\dfrac14\sinh(2u)+C=\dfrac12(u+\sinh(u)\cosh(u))+C.
Now u=\hbox{arcsinh}(x), and

\sinh(u)\cosh(u)=\sinh(u)\sqrt{1+\sinh^2(u)}=x\sqrt{1+x^2}.
Recall from last week that we derived an explicit formula for \hbox{arcsinh}(x), and so our integral finally becomes

\displaystyle\int\sqrt{1+x^2}\,dx=\dfrac12\left(x\sqrt{1+x^2}+\ln\left(x+\sqrt{1+x^2}\right)\right)+C.
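As a quick numerical sanity check (a sketch using only Python's standard library), we can verify that the antiderivative \dfrac12\left(x\sqrt{1+x^2}+\ln\left(x+\sqrt{1+x^2}\right)\right) really does differentiate back to \sqrt{1+x^2}:

```python
import math

def F(x):
    # the antiderivative (1/2)(x*sqrt(1+x^2) + ln(x + sqrt(1+x^2)))
    s = math.sqrt(1 + x * x)
    return 0.5 * (x * s + math.log(x + s))

def f(x):
    # the integrand sqrt(1+x^2)
    return math.sqrt(1 + x * x)

# a central-difference derivative of F should agree with f
h = 1e-6
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    dF = (F(x + h) - F(x - h)) / (2 * h)
    assert abs(dF - f(x)) < 1e-6
```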
You likely noticed that using a hyperbolic substitution is no more complicated than using the circular substitution x=\sin(\theta).  What this means is — no need to ever integrate

\displaystyle\int\sec^3(\theta)\,d\theta
again!  Frankly, I no longer teach integrals involving \tan(\theta) and \sec(\theta) which involve integration by parts.  Simply put, it is not a good use of time.  I think it is far better to introduce students to hyperbolic trigonometric substitution.

Now let’s take a look at the integral

\displaystyle\int\sqrt{x^2-1}\,dx.
The usual technique?  Substitute x=\sec(\theta), and transform the integral into

\displaystyle\int\tan^2(\theta)\sec(\theta)\,d\theta.
Sigh.  Those irksome tangents and secants.  A messy integration by parts again.

But not so using x=\cosh(u).  We get dx=\sinh(u)\,du and \sqrt{x^2-1}=\sinh(u) (here, a negative square root may be necessary).

We rewrite as

\displaystyle\int\sinh^2(u)\,du.
This results in

\dfrac14\sinh(2u)-\dfrac u2+C=\dfrac12(\sinh(u)\cosh(u)-u)+C.

All we need now is a formula for \hbox{arccosh}(x), which may be found using the same technique we used last week for \hbox{arcsinh}(x):

\hbox{arccosh}(x)=\ln\left(x+\sqrt{x^2-1}\right).
Thus, our integral evaluates to

\displaystyle\int\sqrt{x^2-1}\,dx=\dfrac12\left(x\sqrt{x^2-1}-\ln\left(x+\sqrt{x^2-1}\right)\right)+C.
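As a quick check (a sketch using only the standard library), the closed form \hbox{arccosh}(x)=\ln\left(x+\sqrt{x^2-1}\right) can be compared against Python's built-in math.acosh:

```python
import math

def arccosh(x):
    # closed form ln(x + sqrt(x^2 - 1)), valid for x >= 1
    return math.log(x + math.sqrt(x * x - 1))

for x in [1.0, 1.5, 2.0, 10.0, 100.0]:
    assert abs(arccosh(x) - math.acosh(x)) < 1e-12
    # and cosh really does invert it
    assert abs(math.cosh(arccosh(x)) - x) < 1e-9
```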
We remark that the integral

\displaystyle\int\sqrt{1-x^2}\,dx
is easily evaluated using the substitution x=\sin(\theta).  Thus, integrals involving \sqrt{1+x^2}, \sqrt{x^2-1}, and \sqrt{1-x^2} may be computed by using the substitutions x=\sinh(u), x=\cosh(u), and x=\sin(\theta), respectively.  It bears repeating:  no more integrals involving powers of tangents and secants!

One of the neatest applications of hyperbolic trigonometric substitution is using it to find

\displaystyle\int\sec(\theta)\,d\theta
without resorting to a completely unmotivated trick.  Yes, I saved the best for last….

So how do we proceed?  Let’s think by analogy.  Why did the substitution x=\sinh(u) work above?  For the same reason x=\tan(\theta) works: we can simplify \sqrt{1+x^2} using one of the following two identities:

1+\tan^2(\theta)=\sec^2(\theta)\ \hbox{  or  }\ 1+\sinh^2(u)=\cosh^2(u).

So \sinh(u) is playing the role of \tan(\theta), and \cosh(u) is playing the role of \sec(\theta).  What does that suggest?  Try using the substitution \sec(\theta)=\cosh(u)!

No, it’s not the first thing you’d think of, but it makes sense.  Comparing the use of circular and hyperbolic trigonometric substitutions, the analogy is fairly straightforward, in my opinion.  There’s much more motivation here than in calculus textbooks.

So with \sec(\theta)=\cosh(u), we have

\sec(\theta)\tan(\theta)\,d\theta=\sinh(u)\,du.
But notice that \tan(\theta)=\sinh(u) — just look at the above identities and compare. We remark that if \theta is restricted to the interval (-\pi/2,\pi/2), then as a result of the asymptotic behavior, the substitution \sec(\theta)=\cosh(u) gives a bijection between the graphs of \sec(\theta) and \cosh(u), and between the graphs of \tan(\theta) and \sinh(u). In this case, the signs are always correct — \tan(\theta) and \sinh(u) always have the same sign.

So this means that

\sec(\theta)\,d\theta=du.
What could be simpler?

Thus, our integral becomes

\displaystyle\int\sec(\theta)\,d\theta=\int du=u+C.

Since \cosh(u)=\sec(\theta) and \sinh(u)=\tan(\theta), the formula for \hbox{arccosh} gives u=\ln(\sec(\theta)+\tan(\theta)), and so

\displaystyle\int \sec(\theta)\,d\theta=\ln(\sec(\theta)+\tan(\theta))+C.


We note that if \theta is restricted to the interval (-\pi/2,\pi/2) as discussed above,  then we always have \sec(\theta)+\tan(\theta)>0, so there is no need to put the argument of the logarithm in absolute values.
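For the skeptical, here is a small numerical check (a sketch) that \ln(\sec(\theta)+\tan(\theta)) really does differentiate to \sec(\theta) on (-\pi/2,\pi/2):

```python
import math

def sec(t):
    return 1 / math.cos(t)

def F(t):
    # ln(sec(t) + tan(t)); the argument is positive on (-pi/2, pi/2)
    return math.log(sec(t) + math.tan(t))

h = 1e-6
for t in [-1.2, -0.3, 0.0, 0.7, 1.3]:
    dF = (F(t + h) - F(t - h)) / (2 * h)
    assert abs(dF - sec(t)) < 1e-5
```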

Well, I’ve done my best to convince you of the wonder of hyperbolic trigonometric substitutions!  If integrating \sec(\theta) didn’t do it, well, that’s the best I’ve got.

The next installment of hyperbolic trigonometry?  The Gudermannian function!  What’s that, you ask?  You’ll have to wait until next time — or I suppose you can just google it….

Calculus: Hyperbolic Trigonometry, I

I love hyperbolic trigonometry.  I always include it when I teach calculus, as I think it is important for students to see.  Why?

  1.  Many applications in the sciences use hyperbolic trigonometry; for example, the use of Laplace transforms in solving differential equations, various applications in physics, modeling population growth (the logistic model is a hyperbolic tangent curve);
  2. Hyperbolic trigonometric substitutions are, in many instances, easier than circular trigonometric substitutions, especially when a substitution involving \tan(x) or \sec(x) is involved;
  3. Students get to see another form of trigonometry, and compare the new form with the old;
  4. Hyperbolic trigonometry is fun.

OK, maybe that last reason is a bit of hyperbole (though not for me).

Not everyone thinks this way.  I once had a colleague who told me she did not teach hyperbolic trigonometry because it wasn’t on the AP exam.  What do you say to someone who says that?  I dunno….

In any case, I want to introduce the subject here for you, and show you some interesting aspects of hyperbolic trigonometry.  I’m going to stray from my habit of not discussing things you can find anywhere online, since in order to get to the better stuff, you need to know the basics.  I’ll move fairly quickly through the introductory concepts, though.

The hyperbolic cosine and sine are defined by

\cosh(x)=\dfrac{e^x+e^{-x}}2,\quad\sinh(x)=\dfrac{e^x-e^{-x}}2,\quad x\in\mathbb{R}.

I will admit that when I introduce this definition, I don’t have an accessible, simple motivation for doing so.  I usually say we’ll learn a lot more as we work with these definitions, so if anyone has a good idea in this regard, I’d be interested to hear it.

The graphs of these curves are shown below.


The graph of \cosh(x) is shown in blue, and the graph of \sinh(x) is shown in red.  The dashed orange graph is y=e^{x}/2, which is easily seen to be asymptotic to both graphs.

Parallels to the circular trigonometric functions are already apparent:  y=\cosh(x) is an even function, just like y=\cos(x).  Similarly, \sinh(x) is odd, just like \sin(x).

Another parallel which is only slightly less apparent is the fundamental relationship

\cosh^2(x)-\sinh^2(x)=1.
Thus, (\cosh(x),\sinh(x)) lies on a unit hyperbola, much like (\cos(x),\sin(x)) lies on a unit circle.

While there isn’t a simple parallel with circular trigonometry, there is an interesting way to characterize \cosh(x) and \sinh(x).  Recall that given any function f(x), we may define

E(x)=\dfrac{f(x)+f(-x)}2,\quad O(x)=\dfrac{f(x)-f(-x)}2

to be the even and odd parts of f(x), respectively.  So we might simply say that \cosh(x) and \sinh(x) are the even and odd parts of e^x, respectively.
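This characterization is easy to play with numerically; the sketch below extracts the even and odd parts of e^x and compares them with \cosh(x) and \sinh(x):

```python
import math

def even_part(f, x):
    return (f(x) + f(-x)) / 2

def odd_part(f, x):
    return (f(x) - f(-x)) / 2

# the even and odd parts of exp are cosh and sinh, respectively
for x in [-3.0, -1.0, 0.0, 0.5, 2.0]:
    assert abs(even_part(math.exp, x) - math.cosh(x)) < 1e-12
    assert abs(odd_part(math.exp, x) - math.sinh(x)) < 1e-12
```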

There are also many properties of the hyperbolic trigonometric functions which are reminiscent of their circular counterparts.  For example, we have


\sinh(x+y)=\sinh(x)\cosh(y)+\cosh(x)\sinh(y)\ \hbox{ and }\ \sinh(2x)=2\sinh(x)\cosh(x).

None of these are especially difficult to prove using the definitions.  It turns out that while there are many similarities, there are subtle differences.  For example,

\cosh(x+y)=\cosh(x)\cosh(y)+\sinh(x)\sinh(y),\ \hbox{ while }\ \cos(x+y)=\cos(x)\cos(y)-\sin(x)\sin(y).
That is, while some circular trigonometric formulas become hyperbolic just by changing \cos(x) to \cosh(x) and \sin(x) to \sinh(x), sometimes changes of sign are necessary.

These changes of sign from circular formulas are typical when working with hyperbolic trigonometry.  One particularly interesting place the change of sign arises is when considering differential equations, although given that I’m bringing hyperbolic trigonometry into a calculus class, I don’t emphasize this relationship.  But recall that \cos(x) is the unique solution to the differential equation

y''+y=0,\quad y(0)=1,\quad y'(0)=0.

Similarly, we see that \cosh(x) is the unique solution to the differential equation

y''-y=0,\quad y(0)=1,\quad y'(0)=0.

Again, the parallel is striking, and the difference subtle.
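We can confirm the hyperbolic version numerically with a second central difference (a quick sketch):

```python
import math

h = 1e-4

# initial conditions: y(0) = 1, y'(0) = 0
assert abs(math.cosh(0.0) - 1.0) < 1e-12
assert abs((math.cosh(h) - math.cosh(-h)) / (2 * h)) < 1e-8

# y'' - y = 0 at several sample points, using a central second difference
for x in [-2.0, 0.3, 1.5]:
    y2 = (math.cosh(x + h) - 2 * math.cosh(x) + math.cosh(x - h)) / (h * h)
    assert abs(y2 - math.cosh(x)) < 1e-6
```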

Of course it is straightforward to see from the definitions that (\cosh(x))'=\sinh(x) and (\sinh(x))'=\cosh(x).  Gone are the days of remembering signs when differentiating and integrating trigonometric functions!  This is one feature of hyperbolic trigonometric functions which students always appreciate….

Another nice feature is how well-behaved the hyperbolic tangent is (as opposed to needing to consider vertical asymptotes in the case of \tan(x)).  Below is the graph of y=\tanh(x)=\sinh(x)/\cosh(x).


The horizontal asymptotes are easily calculated from the definitions.  This looks suspiciously like the curves obtained when modeling logistic growth in populations; that is, finding solutions to

\dfrac{dP}{dt}=kP\left(1-\dfrac PL\right).
In fact, these logistic curves are hyperbolic tangents, which we will address in more detail in a later post.

One of the most interesting things about hyperbolic trigonometric functions is that their inverses have closed formulas — in striking contrast to their circular counterparts.  I usually have students work this out, either in class or as homework; the derivation is quite nice, so I’ll outline it here.

So let’s consider solving the equation x=\sinh(y) for y.  Begin with the definition:

x=\sinh(y)=\dfrac{e^y-e^{-y}}2.
The critical observation is that this is actually a quadratic in e^y:

(e^y)^2-2xe^y-1=0.
All that is necessary is to solve this quadratic equation to yield

e^y=x\pm\sqrt{1+x^2},
and note that x-\sqrt{1+x^2} is always negative, so that we must choose the positive sign.  Thus,

y=\hbox{arcsinh}(x)=\ln\left(x+\sqrt{1+x^2}\right).
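A quick numerical check (a sketch) that this closed form agrees with the built-in inverse, and that \sinh really does undo it:

```python
import math

def arcsinh(x):
    # the closed form ln(x + sqrt(1 + x^2)) derived above
    return math.log(x + math.sqrt(1 + x * x))

for x in [-5.0, -1.0, 0.0, 0.5, 10.0]:
    assert abs(arcsinh(x) - math.asinh(x)) < 1e-12
    assert abs(math.sinh(arcsinh(x)) - x) < 1e-9
```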
And this is just the beginning!  At this stage, I also offer more thought-provoking questions like, “Which is larger, \cosh(\ln(42)) or \ln(\cosh(42))?”  These get students working with the definitions and thinking about asymptotic behavior.

Next week, I’ll go into more depth about the calculus of hyperbolic trigonometric functions.  Stay tuned!

Calculus: The Geometry of Polynomials, II

The original post on The Geometry of Polynomials generated rather more interest than usual.  One reader, William Meisel, commented that he wondered if something similar worked for curves like the Folium of Descartes, given by the equation

x^3+y^3=3xy,
and whose graph looks like:


I replied that yes, I had success, and what I found out would make a nice follow-up post rather than just a reply to his comment.  So let’s go!

Just a brief refresher:  if, for example, we wanted to describe the behavior of y=2(x-4)(x-1)^2 where it crosses the x-axis at x=1, we simply retain the (x-1)^2 term and substitute the root x=1 into the other terms, getting

y=2(1-4)(x-1)^2=-6(x-1)^2
as the best-fitting parabola at x=1.

Now consider another way to think about this:

\displaystyle\lim_{x\to1}\dfrac y{(x-1)^2}=-6.

For examples like the polynomial above, this limit is always trivial, and is essentially a simple substitution.

What happens when we try to evaluate a similar limit with the Folium of Descartes?  It seems that a good approximation to this curve at x=0 (the U-shaped piece, since the sideways U-shaped piece involves writing x as a function of y) is y=x^2/3, as shown below.


To see this, we need to find

\displaystyle\lim_{x\to0}\dfrac y{x^2}.

After a little trial and error, I found it was simplest to use the substitution z=y/x^2, and so rewrite the equation for the Folium of Descartes by using the substitution y=x^2z, which results in

1+x^3z^3=3z.
Now it is easy to see that as x\to0, we have z\to1/3, giving us a good quadratic approximation at the origin.
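This limit can be checked numerically using the standard rational parametrization of the folium, x=3t/(1+t^3), y=3t^2/(1+t^3) (a sketch, assuming the normalization x^3+y^3=3xy used here):

```python
def folium(t):
    # rational parametrization of x^3 + y^3 = 3xy
    d = 1 + t ** 3
    return 3 * t / d, 3 * t ** 2 / d

for t in [0.5, 0.1, 1e-3, 1e-4]:
    x, y = folium(t)
    # the point lies on the curve
    assert abs(x ** 3 + y ** 3 - 3 * x * y) < 1e-9
    # z = y / x^2 = (1 + t^3)/3, which tends to 1/3 as t (and hence x) -> 0
    assert abs(y / x ** 2 - (1 + t ** 3) / 3) < 1e-9

x, y = folium(1e-4)
assert abs(y / x ** 2 - 1 / 3) < 1e-6
```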

Success!  So I thought I’d try some more examples, and see how they worked out.  I first just changed the exponent of x, looking at the curve

x^n+y^3=3xy,
shown below when n=6.


What would be a best approximation near the origin?  You can almost eyeball a fifth-degree approximation here, but let’s assume we don’t know the appropriate power and make the substitution y=x^kz, with k yet to be determined. This results in

x^n+x^{3k}z^3=3x^{k+1}z.
Now observe that when k=n-1, we have

1+x^{2n-3}z^3=3z,
so that \displaystyle\lim_{x\to0}z=1/3. Thus, in our case with n=6, we see that y=x^5/3 is a good approximation to the curve near the origin.  The graph below shows just how good an approximation it is.


OK, I thought to myself, maybe I just got lucky.  Maybe introduce a change which will really alter the nature of the curve, such as

x^3+y^3=3xy+1,
whose graph is shown below.


Here, the curve passes through the x-axis at x=1, with what appears to be a linear pass-through.  This suggests, given our previous work, the substitution y=(x-1)z, which results in

x^3+(x-1)^3z^3=3x(x-1)z+1.
We don’t have much luck with \displaystyle\lim_{x\to1}z here.  But if we move the 1 to the other side and factor, we get

(x-1)(x^2+x+1)+(x-1)^3z^3=3x(x-1)z.
Nice!  Just divide through by x-1 to obtain

x^2+x+1+(x-1)^2z^3=3xz.
Now a simple calculation reveals that \displaystyle\lim_{x\to1}z=1. And sure enough, the line y=x-1 does the trick:


Then I decided to change the exponent again by considering

x^n+y^3=3xy+1.
Here is the graph of the curve when n=6:


It seems we have two roots this time, with linear pass-throughs.  Let’s try the same idea again, making the substitution y=(x-1)z, moving the 1 over, factoring, and dividing through by x-1.  This results in

x^{n-1}+x^{n-2}+\cdots+x+1+(x-1)^2z^3=3xz.
It is not difficult to calculate that \displaystyle\lim_{x\to1}z=n/3.

Now things become a bit more interesting when n is even, since there is always a root at x=-1 in this case.  Here, we make the substitution y=(x+1)z, move the 1 over, and divide by x+1, resulting in

\dfrac{x^n-1}{x+1}+(x+1)^2z^3=3xz.
But since n is even, then x^2-1 is a factor of x^n-1, so we have

(x-1)\left(x^{n-2}+x^{n-4}+\cdots+x^2+1\right)+(x+1)^2z^3=3xz.
Substituting x=-1 in this equation gives

-2\left(\dfrac n2\right)=3(-1)z,

which immediately gives \displaystyle\lim_{x\to-1}z=n/3 as well!  This is a curious coincidence, for which I have no nice geometrical explanation.  The case when n=6 is illustrated below.


This is where I stopped — but I was truly surprised that everything I tried actually worked.  I did a cursory online search for Taylor series of implicitly defined functions, but this seems to be much less popular than series for y=f(x).

Anyone more familiar with this topic care to chime in?  I really enjoyed this brief exploration, and I’m grateful that William Meisel asked his question about the Folium of Descartes.  These are certainly instances of a larger phenomenon, but I feel the statement and proof of any theorem will be somewhat more complicated than the analogous results for explicitly defined functions.

And if you find some neat examples, post a comment!  I’d enjoy writing another follow-up post if there is continued interest in this topic.

Bay Area Mathematical Artists, VII

We had yet another amazing meeting of the Bay Area Mathematical Artists yesterday!  Just two speakers — but even so, we went a half-hour over our usual 5:00 ending time.

Our first presenter was Stan Isaacs.  There was no real title to his presentation, but he brought another set of puzzles from his vast collection to share.  He was highlighting puzzles created by Wayne Daniel.

Below you’ll see one of the puzzles disassembled.  The craftsmanship is simply remarkable.


If you look carefully, you’ll see what’s going on.  The outer pieces make an icosahedron, and when you take those off, a dodecahedron, then a cube…a wooden puzzle of nested Platonic solids!  The pieces all fit together so perfectly.  Stan is looking forward to an exhibition of Wayne’s work at the International Puzzle Party in San Diego later on this year.  For more information, contact Stan at


Our second speaker was Scott Kim, whose presentation was entitled Motley Dissections.  What is a motley dissection?  The most famous example is the problem of the squared square — that is, dissecting a square with an integer side length into smaller squares with integer side lengths, but with all the squares of different sizes.

One property of such a dissection is that no two edges of squares meet exactly corner to corner.  In other words, edges always overlap in some way.

But there are of course many other motley dissections.  For example, below you see a motley dissection of one rectangle into five, one pentagon into eleven, and finally, one hexagon into a triangle, square, pentagon and hexagon.

Look carefully, and you’ll see that no single edge in any of these dissections exactly matches any other.  For these decompositions, Scott has proved they are minimal — so, for example, there is no motley dissection of one pentagon into ten or fewer pieces.  The proofs are not exactly elegant, but they serve their purpose.  He also mentioned that he credits Donald Knuth with the term motley dissection, who used the term in a phone conversation not all that long ago.

Can you cube the cube?  That is, can you take a cube and subdivide it into cubes which are all different?  Scott showed us a simple proof that you can’t.  But, it turns out, you can box the box.  In other words, if the length, width, and height of the larger box and all the smaller boxes may be different, then it is possible to box the box.

Next week, Scott is off to the Gathering 4 Gardner in Atlanta, and will be giving his talk on Motley Dissections there.  He planned an activity where participants actually build a boxed box — and we were his test audience!


He created some very elaborate transparencies with detailed instructions for cutting out and assembling.  There were a very few suggestions for improvement, and Scott was happy to know about them — after all, it is rare that something works out perfectly the first time.  So now, his success at G4G in Atlanta is assured….

We were so into creating these boxed boxes that we happily stayed until 5:30, when we had two boxes completed.


I should mention that Scott also discussed something he terms pseudo-duals in two, three, and even four dimensions!  There isn’t room to go into those now, but you can contact him through his website for more information.

As usual, we went out to dinner afterwards — and we gravitated towards our favorite Thai place again.  The dinner conversation was truly exceptional this evening, revolving around an animated conversation between Scott Kim and magician Mark Mitton.

The conversation was concerned with the way we perceive mathematics here in the U.S., and how that influences the educational system.  Simply put, there is a lot to be desired.

One example Scott and Mark mentioned was the National Mathematics Festival.  Tens of thousands of kids and parents have fun doing mathematics.  Then the next week, they go back to their schools and keep learning math the same — usually, unfortunately, boring — way it’s always been learned.

So why does the National Mathematics Festival have to be a one-off event?  It doesn’t!  Scott is actively engaged in a program he’s created where he goes into an elementary school at lunchtime one day a week and lets kids play with math games and puzzles.

Why this model?  Teachers need no extra prep time, kids don’t need to stay after school, and so everyone can participate with very little needed in the way of additional resources.  He’s hoping to create a package that he can export to any school anywhere with minimal effort, so that children can be exposed to the joy of mathematics on a regular basis.

Mark was interested in Scott’s model:  consider your Needs (improving the perception of mathematics), be aware of the Forces at play (unenlightened administrators, for example, and many other subtle forces at work, as Mark explained), and then decide upon Actions to take to move the Work (applied, pure, and recreational mathematics) forward.

The bottom line:  we all know about this problem of attitudes toward mathematics and mathematics education, but no one really knows what to do about it.  For Scott, it’s just another puzzle to solve.  There are solutions.  And he is going to find one.

We talked for over two hours about these ideas, and everyone chimed in at one time or another.  Yes, my summary is very brief, I know, but I hope you get the idea of the type of conversation we had.

Stay tuned, since we are planning an upcoming meeting where we focus on Scott’s model and work towards a solution.  Another theme throughout the conversation was that mathematics is not an activity done in isolation — it is a communal activity.  So the Needs will not be addressed by a single individual, rather a group, and likely involving many members of many diverse communities.

A solution is out there.  It will take a lot of grit to find it.  But mathematicians have got grit in spades.


Calculus: Linear Approximations, II

As I mentioned last week, I am a fan of emphasizing the idea of a derivative as a linear approximation.  I ended that discussion by using this method to find the derivative of \tan(x).   Today, we’ll look at some more examples, and then derive the product, quotient and chain rules.

Differentiating \sec(x) is particularly nice using this method.  We first approximate

\sec(x+h)=\dfrac1{\cos(x+h)}\approx\dfrac1{\cos(x)-h\sin(x)}.
Then we factor out a \cos(x) from the denominator, giving

\dfrac1{\cos(x)-h\sin(x)}=\dfrac1{\cos(x)}\cdot\dfrac1{1-h\tan(x)}=\sec(x)\cdot\dfrac1{1-h\tan(x)}.
As we did at the end of last week’s post, we can make h as small as we like, and so approximate by considering 1/(1-h\tan(x)) as the sum of an infinite series:

\dfrac1{1-h\tan(x)}=1+h\tan(x)+h^2\tan^2(x)+\cdots\approx1+h\tan(x).
Finally, we have

\sec(x+h)\approx\sec(x)(1+h\tan(x))=\sec(x)+h\sec(x)\tan(x),
which gives the derivative of \sec(x) as \sec(x)\tan(x).
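The conclusion is easy to test numerically (a sketch):

```python
import math

def sec(x):
    return 1 / math.cos(x)

# central-difference derivative of sec should match sec * tan
h = 1e-6
for x in [-1.0, 0.2, 0.9]:
    numerical = (sec(x + h) - sec(x - h)) / (2 * h)
    assert abs(numerical - sec(x) * math.tan(x)) < 1e-5
```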

We’ll look at one more example involving approximating with geometric series before moving on to the product, quotient, and chain rules.  Consider differentiating x^{-n}. We first factor the denominator:

\dfrac1{(x+h)^n}=\dfrac1{x^n(1+h/x)^n}.
Now approximate

\dfrac1{1+h/x}\approx1-\dfrac hx,

so that, to first order,

\dfrac1{(1+h/x)^n}\approx \left(1-\dfrac hx\right)^{\!\!n}\approx 1-\dfrac{nh}x.

This finally results in

\dfrac1{(x+h)^n}\approx \dfrac1{x^n}\left(1-\dfrac{nh}x\right)=\dfrac1{x^n}+h\dfrac{-n}{x^{n+1}},

giving us the correct derivative.
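Again, a quick numerical sketch confirming that the derivative of x^{-n} is -nx^{-n-1}:

```python
# central-difference check of the power rule for a negative exponent
h = 1e-6
n = 4
for x in [0.5, 1.0, 2.0]:
    numerical = ((x + h) ** -n - (x - h) ** -n) / (2 * h)
    assert abs(numerical - (-n) * x ** (-n - 1)) < 1e-5
```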

Now let’s move on to the product rule:

(fg)'(x)=f(x)g'(x)+f'(x)g(x).
Here, and for the rest of this discussion, we assume that all functions have the necessary differentiability.

We want to approximate f(x+h)g(x+h), so we replace each factor with its linear approximation:

f(x+h)g(x+h)\approx (f(x)+hf'(x))(g(x)+hg'(x)).

Now expand and keep only the first-order terms:

f(x+h)g(x+h)\approx f(x)g(x)+h(f(x)g'(x)+f'(x)g(x)).

And there’s the product rule — just read off the coefficient of h.

There is a compelling reason to use this method.  The traditional proof begins by evaluating

\displaystyle\lim_{h\to0}\dfrac{f(x+h)g(x+h)-f(x)g(x)}h.
The next step?  Just add and subtract f(x)g(x+h) (or perhaps f(x+h)g(x)).  I have found that there is just no way to convincingly motivate this step.  Yes, those of us who have seen it crop up in various forms know to try such tricks, but the typical first-time student of calculus is mystified by that mysterious step.  Using linear approximations, there is absolutely no mystery at all.

The quotient rule is next:

\left(\dfrac fg\right)^{\!\!\!'}\!(x)=\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.

First approximate

\dfrac{f(x+h)}{g(x+h)}\approx\dfrac{f(x)+hf'(x)}{g(x)+hg'(x)}.
Now since h is small, we approximate

\dfrac1{g(x)+hg'(x)}=\dfrac1{g(x)}\cdot\dfrac1{1+hg'(x)/g(x)}\approx\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right),
so that

\dfrac{f(x+h)}{g(x+h)}\approx\left(f(x)+hf'(x)\right)\cdot\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right).
Multiplying out and keeping just the first-order terms results in

\dfrac{f(x+h)}{g(x+h)}\approx \dfrac{f(x)}{g(x)}+h\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.

Voila!  The quotient rule.  Now usual proofs involve (1) using the product rule with f(x) and 1/g(x), but note that this involves using the chain rule to differentiate 1/g(x);  or (2) the mysterious “adding and subtracting the same expression” in the numerator.  Using linear approximations avoids both.

The chain rule is almost ridiculously easy to prove using linear approximations.  Begin by approximating

f(g(x+h))\approx f(g(x)+hg'(x)).

Note that we’re replacing the argument to a function with its linear approximation, but since we assume that f is differentiable, it is also continuous, so this poses no real problem.  Yes, perhaps there is a little hand-waving here, but in my opinion, no rigor is really lost.

Since g is differentiable, then g'(x) exists, and so we can make hg'(x) as small as we like, so the “hg'(x)” term acts like the “h” term in our linear approximation.  Additionally, the “g(x)” term acts like the “x” term, resulting in

f(g(x+h))\approx f(g(x))+hg'(x)f'(g(x)).

Reading off the coefficient of h gives the chain rule:

(f\circ g)'(x)=f'(g(x))g'(x).
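All three rules can be spot-checked numerically at once; here is a sketch with f=\sin and g=\exp:

```python
import math

f, fp = math.sin, math.cos   # f and its derivative
g, gp = math.exp, math.exp   # g and its derivative

h = 1e-6
for x in [-1.0, 0.0, 0.7]:
    # product rule
    num = (f(x + h) * g(x + h) - f(x - h) * g(x - h)) / (2 * h)
    assert abs(num - (f(x) * gp(x) + fp(x) * g(x))) < 1e-6
    # quotient rule
    num = (f(x + h) / g(x + h) - f(x - h) / g(x - h)) / (2 * h)
    assert abs(num - (g(x) * fp(x) - f(x) * gp(x)) / g(x) ** 2) < 1e-6
    # chain rule
    num = (f(g(x + h)) - f(g(x - h))) / (2 * h)
    assert abs(num - fp(g(x)) * gp(x)) < 1e-6
```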

So I’ve said my piece.  By this time, you’re either convinced that using linear approximations is a good idea, or you’re not.  But I think these methods reflect more accurately the intuition behind the calculations — and reflect what mathematicians do in practice.

In addition, using linear approximations involves more than just mechanically applying formulas.  If all you ever do is apply the product, quotient, and chain rules, it’s just mechanics.  Using linear approximations requires a bit more understanding of what’s really going on underneath the hood, as it were.

If you find more neat examples of differentiation using this method, please comment!  I know I’d be interested, and I’m sure others would as well.

In my next installment (or two or three) in this calculus series, I’ll talk about one of my favorite topics — hyperbolic trigonometry.