Calculus VII: Approximations

Although I’ll have a very busy summer with consulting, I’ve taken some time to start reading more again.  You know, those books which have been sitting on your shelves for years….

So I’ve started Volume I of A Treatise on the Integral Calculus by Joseph Edwards.

Day150Cover

I include a picture of the cover page, since you can google it and download a copy online.  Between Volumes I and II, there’s about 1800 pages of integral calculus….

Since I’ll likely be working with a calculus curriculum later this year, I thought I’d look at some older books and see what calculus was like back in the day.  I’m continually surprised at how much there is to learn about elementary calculus, despite having taught it for over 25 years.

My approach will be a simple one — I’ll organize my posts by page number.  As I read through the books and solve interesting problems, I’ll share with you things I find novel and interesting.  The more I read books like these and think about calculus, the more I think most current textbooks simply are not up to the task of presenting calculus in any meaningful way.  Sigh.

This is not the time to be on my soapbox — this is the time for some fun!  So here is the first topic:  Weddle’s Rule, found on page 21.

Ever hear of it?  Bonus points if you have — but I never did.  It’s another approximation rule for integrals.  Here it is: given a function f on the interval [a,b], divide the interval into six equal subintervals with points x_0, x_1,\ldots x_6 and corresponding function values y_0=f(x_0),\ldots,y_6=f(x_6).  Then

\displaystyle\int_a^bf(x)\,dx\approx \dfrac{b-a}{20}\left(y_0+5y_1+y_2+6y_3+y_4+5y_5+y_6\right).

Yikes!  Where did that come from?  I’ll present my take on the idea, and offer a theory.  If there are any historians of mathematics out there, I’d be happy to hear if my theory is correct.

One reason most of us haven’t heard of Weddle’s Rule is that approximations aren’t as important as they were before calculators and computers.  So many exercises in this book involve approximation techniques.
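For readers who want to try the rule numerically, here is a minimal Python sketch. The function name `weddle` and the two test integrands are illustrative choices, not anything taken from Edwards; the weights 1, 5, 1, 6, 1, 5, 1 are applied to the seven function values at the endpoints of six equal subintervals.

```python
import math

def weddle(f, a, b):
    """Approximate the integral of f over [a, b] with Weddle's rule."""
    h = (b - a) / 6                        # six equal subintervals
    y = [f(a + i * h) for i in range(7)]   # function values at the seven points
    weights = [1, 5, 1, 6, 1, 5, 1]
    return (b - a) / 20 * sum(w * yi for w, yi in zip(weights, y))

approx_sine = weddle(math.sin, 0.0, math.pi / 2)      # exact value is 1
approx_quintic = weddle(lambda x: x**5, 0.0, 1.0)     # exact value is 1/6
print(approx_sine, approx_quintic)
```

On a longer interval one would presumably apply the rule to consecutive blocks of six subintervals and add the results, just as with Simpson’s Rule.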

So how would you come up with Weddle’s Rule?  I’ll share my (likely mythical) scenario with you.  It’s based on some notes I wrote up a while ago on Taylor series.  So before diving into Weddle’s Rule, I’ll show you how I’d derive Simpson’s Rule — the technique is the same, but the algebra is easier.  And by the way, if anyone has seen this technique before, please let me know!  I’m sure it must have been done before, but I’ve never been able to find a source illustrating it.

Let’s assume we want to approximate

F(x)=\displaystyle\int_a^xf(t)\,dt

by using three equally-spaced points on the interval [a,x].  In other words, we want to find weights p, q, and r such that

S(x)=\left(p f(a)+ q f\left(\dfrac{a+x}2\right)+rf(x)\right)(x-a)\approx F(x).

How might we approach this?  We can create Taylor series for F(x) and S(x) about the point a.  The first is easy using the Fundamental Theorem of Calculus, assuming sufficient differentiability:

F(x)=f(a)(x-a)+\dfrac{f'(a)}{2!}(x-a)^2+\dfrac{f''(a)}{3!}(x-a)^3+\cdots

Now to construct the Taylor series of S(x) about x=a, we need to evaluate several derivatives at a. This is not difficult to do by hand, but it is even easier using Mathematica and a command such as

Day150Mma

Doing so yields the following:

Day150derivs2

Now the problem becomes a simpler algebra problem — to force as many of the coefficients of the derivatives on the right-hand side to be 1 as possible.  This will make the derivatives of F and S match, and the Taylor polynomials will be equal up to some order.

Solving the first three such equations,

Day150eqns

yields, as we expect, p=1/6, q=2/3, and r=1/6. Note that these values also imply that

\dfrac12q+4r=1,

but

\dfrac5{16}q+5r=\dfrac{25}{24}.

This implies that

S(x)-F(x)=\dfrac1{24}f^{(4)}(a)\cdot\dfrac{(x-a)^5}{5!}+O((x-a)^6)

on each subinterval, so that

S(x)-F(x)=O((x-a)^5)

there.  Since the number of subintervals is proportional to 1/(x-a), adding up these errors shows that the composite Simpson’s rule is O((x-a)^4).
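The little linear system can also be solved exactly in a few lines of Python. This is only a sketch: the three equations appear in the post as an image, so the coefficient matrix below is a reconstruction obtained by differentiating S(x) by hand (an assumption), and `solve` is a hypothetical helper, not a library routine.

```python
from fractions import Fraction as Fr

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fr(v) for v in row] + [Fr(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# Reconstructed equations (assumption): the coefficients of f(a), f'(a), f''(a)
# in S'(a), S''(a), S'''(a) must each equal 1.
A = [[1, 1, 1],          # p + q + r = 1
     [0, 1, 2],          # q + 2r = 1
     [0, Fr(3, 4), 3]]   # (3/4)q + 3r = 1
p, q, r = solve(A, [1, 1, 1])

fourth = q / 2 + 4 * r          # coefficient of f'''(a) in the fourth derivative
fifth = Fr(5, 16) * q + 5 * r   # coefficient of f''''(a) in the fifth derivative
print(p, q, r, fourth, fifth)
```

The fourth-order coefficient matches for free, and the fifth is where S and F first disagree.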

So how do we apply this to derive Weddle’s rule?  We could try to find weights w_1,\ldots w_7 to create an approximation

W(x)=\left(w_1 f(a)+w_2f\left(\dfrac{5a+x}6\right)+\cdots+w_7f(x)\right)(x-a).

If we apply precisely the same procedure as we did with Simpson’s Rule, we get the following as the sequence of weights to create the best approximation:

\dfrac{41}{840},\ \dfrac9{35},\ \dfrac9{280},\ \dfrac{34}{105},\ \dfrac9{280},\ \dfrac9{35},\ \dfrac{41}{840}.

Not exactly easy to work with — remember, no calculators or computers.

So let’s make the approximation a little worse.  Recall how the weights were found — a system of seven equations in seven unknowns was solved, analogous to the three equations in three unknowns for Simpson’s rule.  Instead, we specify w_1, and solve the first six equations in terms of w_1.  This gives us

Day151Weddle.png

Now all weights must be positive; this gives the constraint

0.046\overline6\approx\dfrac7{150}<w_1<\dfrac{13}{200}=0.065.

Let’s put w_1=1/20, which is in the interval just described.  This gives the sequence of weights to be

\dfrac1{20},\ \dfrac5{20},\ \dfrac1{20},\ \dfrac6{20},\ \dfrac1{20},\ \dfrac5{20},\ \dfrac1{20},

where all fractions are written with the same denominator.  Now imagine factoring out the 1/2, and you notice that all divisions are by 10.  Can you see the advantage?  If you have a table of values for your functions, you just need to multiply function values by a single-digit number, and then move the decimal point over one place.  An approximator’s dream!

So Weddle’s approximation is exact for fifth-degree polynomials, even though it is possible to use six subintervals to get weights which are exact for sixth-degree polynomials.  Yes, we lose an order of accuracy — but now our computations are much easier to carry out.
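The one-parameter family of weights can be explored with exact arithmetic. The sketch below is not the Mathematica computation from the post (which appears only as an image). It assumes that matching the first six derivatives of W and F at a is the same as requiring the sum of w_i (i/6)^k over the seven points to equal 1/(k+1) for k = 0, ..., 5, which is what differentiating W as we did for S gives. The names `solve` and `weights_given_w1` are hypothetical.

```python
from fractions import Fraction as Fr

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fr(v) for v in row] + [Fr(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

def weights_given_w1(w1):
    """Fix w_1, then solve the first six matching conditions for w_2, ..., w_7."""
    nodes = [Fr(i, 6) for i in range(7)]        # positions of the points, as fractions of x - a
    A = [[x**k for x in nodes[1:]] for k in range(6)]
    b = [Fr(1, k + 1) - w1 * nodes[0]**k for k in range(6)]
    return [Fr(w1)] + solve(A, b)

weddle_weights = weights_given_w1(Fr(1, 20))
best_weights = weights_given_w1(Fr(41, 840))
print(weddle_weights)
print(best_weights)
```

Trying the endpoint values w_1 = 7/150 and w_1 = 13/200 in the same function makes one of the weights vanish, which is one way to see where the positivity constraint comes from.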

Was this Weddle’s thinking?  I can’t be sure; I wasn’t able to locate the original article online.  But it is a way for me to make sense out of Weddle’s rule.

I will admit that in a traditional calculus class, I don’t address approximations in this way.  There is a time crunch to get “everything” done — that is, everything the student is expected to know for the next course in the calculus sequence.

Should these concepts be taught?  I’ll make a brief observation:  in reading through the first 200 pages of this calculus book, it seems that all that has changed since 1954 is that content was pared down significantly, and more calculator exercises were added.

This is not the solution.  We need to rethink what students now need to know and how that material should be taught in light of emerging technology.  So let’s get started!

Calculus: Hyperbolic Trigonometry, IV

Of course, there is always more to say about hyperbolic trigonometry….  Next, we’ll look at what is usually called the logistic curve, which is the solution to the differential equation

\dfrac{dP}{dt}=kP(C-P),\quad P(0)\ \text{given}.

The logistic curve comes up in the usual chapter on differential equations, and is an example of population growth.  Without going into too many details (since the emphasis is on hyperbolic trigonometry), k is a constant which influences how fast the population grows, and C is called the carrying capacity of the environment.

Note that when P is very small, C-P\approx C, and so the population growth is almost exponential.  But when P(t) gets very close to C, then dP/dt\approx0, and so population growth slows down.  And of course when P(t)=C, growth stops, which is why C is called the carrying capacity of the environment.  It represents the largest population the environment can sustain.

Here is an example of such a curve where C=500, k=0.02, and P(0)=50.

Day146logistic.png

Notice the S shape, obtained from a curve rapidly growing when the population is small. It happens that the population grows fastest at half the carrying capacity, and then growth slows to zero as the carrying capacity is reached.

Skipping the details (simple separation of variables), the solution to this differential equation is given by

P(t)=\dfrac{C}{1+Ae^{-kCt}},\qquad A=\dfrac{C-P(0)}{P(0)}.

I will digress for a moment, however, to mention partial fractions (as I step on my calculus soapbox).  I have mentioned elsewhere that incomprehensible chapter in calculus textbooks:  Techniques of Integration.  Pedagogically a disaster for so many reasons.

The first time I address partial fractions is when summing telescoping series, such as

\displaystyle\sum_{n=1}^\infty\dfrac1{n(n+1)}.

It really is necessary.  But I only go so far as to be able to sum such series.  (Note:  I do series as the middle third of Calculus II, rather than the end.  A colleague suggested that students are more tired near the end of the course, which is better for a more technique-oriented discussion of the solution to differential equations, which typically comes before series.)

You also need partial fractions to solve the differential equation for the logistic curve, which is when I revisit the topic.  After finding the logistic curve, we talk about partial fractions in more detail.  The point is that students see some motivation for the method of partial fractions — which they decidedly don’t in a chapter on techniques of integration.

OK, time to step off the soapbox and talk about hyperbolic trigonometry….  The punch line is that the logistic curve is actually a scaled and shifted hyperbolic tangent curve!  Of course it looks like a hyperbolic tangent, but let’s take a moment to see why.

We first use the definitions of \sinh u and \cosh u to write

\tanh u=\dfrac{\sinh u}{\cosh u}=1-\dfrac2{1+e^{2u}}.

This results in

\dfrac2{1+e^{2u}}=1-\tanh u.

You can see the form of the equation of the logistic curve starting to take shape.  Since the hyperbolic tangent has horizontal asymptotes at y=-1 and y=1, we need to scale by a factor of C/2 so that the asymptotes of the logistic curve are C units apart:

\dfrac C{1+e^{2u}}=\dfrac{C}2\left(1-\tanh u\right).

Note that this puts the horizontal asymptotes of the function at y=0 and y=C.

To take into account the initial population, we need a horizontal shift, since otherwise the initial population would be C/2. We can accomplish this by replacing \tanh u with \tanh(u+\varphi):

\dfrac C{1+e^{2\varphi} e^{2u}}=\dfrac C2(1-\tanh(u+\varphi)).

We’re almost done at this point:  we simply need

e^{2\varphi}=A,\qquad 2u=-kCt.

Solving and substituting back results in

P(t)=\dfrac C2\left(1-\tanh\left(\dfrac{-kCt+\ln A}2\right)\right),

which, since \tanh is an odd function, becomes

P(t)=\dfrac C2\left(1+\tanh\left(\dfrac{kCt-\ln A}2\right)\right).

And there it is!  The logistic curve as a scaled, shifted hyperbolic tangent.
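A quick numerical sanity check of the two forms, as a minimal Python sketch. It uses the parameter values from the example curve (C=500, k=0.02, P(0)=50); the function names and the sampling grid on 0 ≤ t ≤ 2 are arbitrary illustrative choices.

```python
import math

C, k, P0 = 500.0, 0.02, 50.0
A = (C - P0) / P0

def logistic(t):
    """The usual form of the solution, C / (1 + A e^(-kCt))."""
    return C / (1 + A * math.exp(-k * C * t))

def logistic_tanh(t):
    """The same curve written as a scaled, shifted hyperbolic tangent."""
    return C / 2 * (1 + math.tanh((k * C * t - math.log(A)) / 2))

# Largest disagreement between the two forms on a grid of t-values in [0, 2].
max_gap = max(abs(logistic(i / 100) - logistic_tanh(i / 100)) for i in range(201))
print(max_gap)
```

The gap should be at the level of floating-point roundoff, since the two formulas are algebraically identical.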

Now what does showing this accomplish?  I can’t give you a definite answer from the point of view of the students.  But for me, it is a way to tie two seemingly unrelated concepts — hyperbolic trigonometry and solution of differential equations by separation of variables — together in a way that is not entirely contrived (as so many calculus textbook problems are).

I would love to perform the following experiment:  work out the solution to the differential equation together as a guided discussion, and then prompt students to suggest functions this curve “looks like.”  Of course the \arctan might be suggested, but how would we relate this to the exponential function?

Eventually we’d tease out the hyperbolic tangent, since this function actually does involve the exponential function.  Then I’d move into an inquiry-based lesson:  give the students the equation of a logistic curve, and have them work out the conversion to the hyperbolic tangent.

And as is typical in such an approach, I would put students into groups, and go around the classroom and nudge them along.  See what happens.

I say that yes, calculus students should be able to do this.  I recently sent an email about pedagogy in calculus which, among other things, addressed the question:  What do calculus students really need to know?

There is no room to adequately address that important question here, but in today’s context, I would say this:  I think it is more important for a student to be able to rewrite P(t) as a hyperbolic tangent than it is for them to know how to sketch the graph of P(t).

Why?  Because it is trivial to graph functions, now.  Type the formula into Desmos.  But how to interpret the graph?  Rewrite it?  Analyze it?  Draw conclusions from it?  We need to focus on what is no longer necessary, and what is now indispensable.  To my knowledge, no one has successfully done this.

I think it is about time for that to change….

Calculus: Hyperbolic Trigonometry, III

We continue where we left off on the last post about hyperbolic trigonometry.  Recall that we ended by finding an antiderivative for \sec(x) using the hyperbolic trigonometric substitution \sec(\theta)=\cosh(u).  Today, we’ll look at this substitution in more depth.

The functional relationship between \theta and u is described by the gudermannian function, defined by

\theta=\text{gd}\,u=2\arctan(e^u)-\dfrac\pi2.

This is not at all obvious, so we’ll look at the derivation of this rather surprising-looking formula.  It’s the only formula I’m aware of which involves both the arctangent and the exponential function.  We remark (as we did in the last post) that we restrict \theta to the interval (-\pi/2,\pi/2) so that this relationship is in fact invertible.

We use a technique similar to that used to derive a formula for the inverse hyperbolic cosine.  First, write

\sec\theta=\cosh u=\dfrac{e^u+e^{-u}}2,

and then multiply through by e^u to obtain the quadratic

(e^u)^2-2\sec(\theta)e^u+1=0.

This quadratic equation results in

e^u=\sec\theta\pm\tan\theta.

Which sign should we choose?  We note that \theta and u increase together, so that because e^u is an increasing function of u, then \sec\theta\pm\tan\theta must be an increasing function of \theta. It is not difficult to see that we must choose “plus,” so that e^u=\sec\theta+\tan\theta, and consequently

u=\ln(\sec\theta+\tan\theta).

We remark that no absolute values are required here; this point was discussed in the previous post.

Now to solve for \theta.  The trick is to use a lesser-known trigonometric identity:

\sec\theta+\tan\theta=\tan\left(\dfrac\pi4+\dfrac\theta2\right).

There is such a nice geometrical proof of this identity, I can’t help but include it.  Start with the usual right triangle, and extend the segment of length \tan\theta by \sec\theta in order to form an isosceles triangle.  Thus,

\tan(\theta+\alpha)=\sec\theta+\tan\theta.

Day146Figure

To find \alpha, observe that \beta is supplementary to both 2\alpha and \pi/2-\theta, so that

2\alpha=\dfrac\pi2-\theta,

which easily implies

\alpha=\dfrac\pi4-\dfrac\theta2.

Therefore

\theta+\alpha=\dfrac\pi4+\dfrac\theta2,

which is precisely what we need to prove the identity.

Now we substitute back into the previous expression for u, which results in

u=\ln\tan\left(\dfrac\pi4+\dfrac\theta2\right).

This may be solved for \theta, giving

\theta=\text{gd}\,u=2\arctan(e^u)-\dfrac\pi2.

So let’s see how to use this to relate circular and hyperbolic trigonometric functions.  We have

\sec(\text{gd}\,u)=\dfrac1{\cos(2\arctan(e^u)-\pi/2)},

which after using the usual circular trigonometric identities, becomes

\sec(\text{gd}\,u)=\dfrac{e^u+e^{-u}}2=\cosh u.

It is also an easy exercise to see that

\dfrac{d}{du}\,\text{gd}\,u=\text{sech}\, u.

So revisiting the integral

\displaystyle\int\sec\theta\,d\theta,

we may alternatively make the substitution \theta=\text{gd}\,u, giving

\displaystyle\int\sec\theta\,d\theta=\int\cosh u\,(\text{sech}\, u\,du)=\int du,

which is the same simple integral we saw in the previous post.

What about the other trigonometric functions?  Certainly we know that \cos(\text{gd}\,u)=\text{sech}\,u.  Again using the usual circular trigonometric identities, we can show that

\sin(\text{gd}\,u)=\tanh u.

Knowing these three relationships, the rest are easy to find: \tan(\text{gd}\,u)=\sinh u, \cot(\text{gd}\,u)=\text{csch}\,u, and \csc(\text{gd}\,u)=\text{coth}\,u.
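These relationships are easy to spot-check numerically. The following Python sketch is illustrative only: `gd` implements the formula above, `gd_gaps` is a hypothetical helper, and the derivative is estimated with a central difference of step 1e-5 (an arbitrary choice).

```python
import math

def gd(u):
    """The gudermannian function, 2 arctan(e^u) - pi/2."""
    return 2 * math.atan(math.exp(u)) - math.pi / 2

def gd_gaps(u, h=1e-5):
    """Differences that should all be near zero if the identities hold at u."""
    theta = gd(u)
    derivative = (gd(u + h) - gd(u - h)) / (2 * h)   # central-difference estimate
    return (
        1 / math.cos(theta) - math.cosh(u),   # sec(gd u) = cosh u
        math.sin(theta) - math.tanh(u),       # sin(gd u) = tanh u
        math.tan(theta) - math.sinh(u),       # tan(gd u) = sinh u
        derivative - 1 / math.cosh(u),        # (gd u)' = sech u
    )

worst = max(abs(g) for u in (-2.0, -0.5, 0.0, 0.7, 3.0) for g in gd_gaps(u))
print(worst)
```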

I think that the gudermannian function should be more widely known.  On the face of it, circular and hyperbolic trigonometric functions are very different beasts — but they relate to each other in very interesting ways, in my opinion.

I will admit that I don’t teach students about the gudermannian function as part of a typical calculus course.  Again, there is the issue of time:  as you are well aware, students finishing one course in the calculus sequence must be adequately prepared for the next course in the sequence.

So what I do is this:  I put the exercises on the gudermannian function as extra challenge problems.  Then, if a student is already familiar with hyperbolic trigonometry, they can push a little further to learn about the gudermannian.

Not many students take on the challenge — but there are always one or two who will visit my office hours with questions.  Such a treat for a mathematics professor!  But I feel it is always necessary to give something to the very best students to chew on, so they’re not bored.  The gudermannian does the trick as far as hyperbolic trigonometry is concerned….

As a parting note, I’d like to leave you with a few more exercises which I include in my “challenge” question on the gudermannian.  I hope you enjoy working them out!

  1. Show that \tanh\left(\dfrac x2\right)=\tan\left(\dfrac 12\text{gd}\,x\right).
  2. Show that e^x=\dfrac{1+\tan(\frac12\text{gd}\,x)}{1-\tan(\frac12\text{gd}\,x)}.
  3. Show that if h is the inverse of the gudermannian function, then h'(x)=\sec x.

Calculus: Hyperbolic Trigonometry, II

Now on to some calculus involving hyperbolic trigonometry!  Today, we’ll look at trigonometric substitutions involving hyperbolic functions.

Let’s start with a typical example:

\displaystyle\int\sqrt{1+x^2}\,dx.

The usual technique involving circular trigonometric functions is to put x=\tan(\theta), so that dx=\sec^2(\theta)\,d\theta, and the integral transforms to

\displaystyle\int\sec^3(\theta)\,d\theta.

In general, we note that when taking square roots, a negative sign is sometimes needed if the limits of the integral demand it.

This integral requires integration by parts, and ultimately evaluating the integral

\displaystyle\int\sec(\theta)\,d\theta.

And how is this done?  I shudder when calculus textbooks write

\displaystyle\int \sec(\theta)\cdot\dfrac{\sec(\theta)+\tan(\theta)}{\sec(\theta)+\tan(\theta)}\,d\theta=\ldots

How does one motivate that “trick” to aspiring calculus students?  Of course the textbooks never do.

Now let’s see how to approach the original integral using a hyperbolic substitution.  We substitute x=\sinh(u), so that dx=\cosh(u)\,du and \sqrt{1+x^2}=\cosh(u).  Note well that taking the positive square root is always correct, since \cosh(u) is always positive!

This results in the integral

\displaystyle\int\cosh^2(u)\,du=\displaystyle\int\dfrac{1+\cosh(2u)}2\,du,

which is quite simple to evaluate:

\dfrac12u+\dfrac14\sinh(2u)+C.

Now u=\hbox{arcsinh}(x), and

\sinh(2u)=2\sinh(u)\cosh(u)=2x\sqrt{1+x^2}.

Recall from last week that we derived an explicit formula for \hbox{arcsinh}(x), and so our integral finally becomes

\dfrac12\left(\ln(x+\sqrt{1+x^2})+x\sqrt{1+x^2}\right)+C.

You likely noticed that using a hyperbolic substitution is no more complicated than using the circular substitution x=\sin(\theta).  What this means is — no need to ever integrate

\displaystyle\int\tan^m(\theta)\sec^n(\theta)\,d\theta

again!  Frankly, I no longer teach integrals involving \tan(\theta) and \sec(\theta) which involve integration by parts.  Simply put, it is not a good use of time.  I think it is far better to introduce students to hyperbolic trigonometric substitution.

Now let’s take a look at the integral

\displaystyle\int\sqrt{x^2-1}\,dx.

The usual technique?  Substitute x=\sec(\theta), and transform the integral into

\displaystyle\int\tan^2(\theta)\sec(\theta)\,d\theta.

Sigh.  Those irksome tangents and secants.  A messy integration by parts again.

But not so using x=\cosh(u).  We get dx=\sinh(u)\,du and \sqrt{x^2-1}=\sinh(u) (here, a negative square root may be necessary).

We rewrite as

\displaystyle\int\sinh^2(u)\,du=\displaystyle\int\dfrac{\cosh(2u)-1}2\,du.

This results in

\dfrac14\sinh(2u)-\dfrac u2+C=\dfrac12(\sinh(u)\cosh(u)-u)+C.

All we need now is a formula for \hbox{arccosh}(x), which may be found using the same technique we used last week for \hbox{arcsinh}(x):

\hbox{arccosh}(x)=\ln(x+\sqrt{x^2-1}).

Thus, our integral evaluates to

\dfrac12(x\sqrt{x^2-1}-\ln(x+\sqrt{x^2-1}))+C.
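Both antiderivatives can be checked against a direct numerical integration. This is a minimal Python sketch: `G` and `H` are hypothetical names for the two closed forms found with x=\sinh(u) and x=\cosh(u), the intervals are arbitrary, and the comparison uses a composite Simpson’s rule with 1000 subintervals.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def G(x):
    """Antiderivative of sqrt(1 + x^2), from the substitution x = sinh(u)."""
    root = math.sqrt(1 + x * x)
    return 0.5 * (math.log(x + root) + x * root)

def H(x):
    """Antiderivative of sqrt(x^2 - 1) for x >= 1, from the substitution x = cosh(u)."""
    root = math.sqrt(x * x - 1)
    return 0.5 * (x * root - math.log(x + root))

gap_G = abs((G(2.0) - G(-1.0)) - simpson(lambda x: math.sqrt(1 + x * x), -1.0, 2.0))
gap_H = abs((H(3.0) - H(2.0)) - simpson(lambda x: math.sqrt(x * x - 1), 2.0, 3.0))
print(gap_G, gap_H)
```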

We remark that the integral

\displaystyle\int\sqrt{1-x^2}\,dx

is easily evaluated using the substitution x=\sin(\theta).  Thus, integrals of the forms \sqrt{1+x^2}, \sqrt{x^2-1}, and \sqrt{1-x^2} may be computed by using the substitutions x=\sinh(u), x=\cosh(u), and x=\sin(\theta), respectively.  It bears repeating:  no more integrals involving powers of tangents and secants!

One of the neatest applications of hyperbolic trigonometric substitution is using it to find

\displaystyle\int\sec(\theta)\,d\theta

without resorting to a completely unmotivated trick.  Yes, I saved the best for last….

So how do we proceed?  Let’s think by analogy.  Why did the substitution x=\sinh(u) work above?  For the same reason x=\tan(\theta) works: we can simplify \sqrt{1+x^2} using one of the following two identities:

1+\tan^2(\theta)=\sec^2(\theta)\ \hbox{  or  }\ 1+\sinh^2(u)=\cosh^2(u).

So \sinh(u) is playing the role of \tan(\theta), and \cosh(u) is playing the role of \sec(\theta).  What does that suggest?  Try using the substitution \sec(\theta)=\cosh(u)!

No, it’s not the first thing you’d think of, but it makes sense.  Comparing the use of circular and hyperbolic trigonometric substitutions, the analogy is fairly straightforward, in my opinion.  There’s much more motivation here than in calculus textbooks.

So with \sec(\theta)=\cosh(u), we have

\sec(\theta)\tan(\theta)\,d\theta=\sinh(u)\,du.

But notice that \tan(\theta)=\sinh(u) — just look at the above identities and compare. We remark that if \theta is restricted to the interval (-\pi/2,\pi/2), then as a result of the asymptotic behavior, the substitution \sec(\theta)=\cosh(u) gives a bijection between the graphs of \sec(\theta) and \cosh(u), and between the graphs of \tan(\theta) and \sinh(u). In this case, the signs are always correct — \tan(\theta) and \sinh(u) always have the same sign.

So this means that

\sec(\theta)\,d\theta=du.

What could be simpler?

Thus, our integral becomes

\displaystyle\int\,du=u+C.

But

u=\hbox{arccosh}(\sec(\theta))=\ln(\sec(\theta)+\tan(\theta)).

Thus,

\displaystyle\int \sec(\theta)\,d\theta=\ln(\sec(\theta)+\tan(\theta))+C.

Voila!

We note that if \theta is restricted to the interval (-\pi/2,\pi/2) as discussed above,  then we always have \sec(\theta)+\tan(\theta)>0, so there is no need to put the argument of the logarithm in absolute values.
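Here is a small numerical check of the substitution on (-\pi/2,\pi/2), sketched in Python. It picks u by solving \sinh(u)=\tan(\theta), so that u and \theta have the same sign, and then measures how far \cosh(u) is from \sec(\theta) and how far u is from \ln(\sec(\theta)+\tan(\theta)). The function name and sample angles are illustrative.

```python
import math

def check_substitution(theta):
    """Return u together with two differences that should be near zero."""
    sec, tan = 1 / math.cos(theta), math.tan(theta)
    u = math.asinh(tan)   # the u with sinh(u) = tan(theta)
    return u, math.cosh(u) - sec, u - math.log(sec + tan)

results = [check_substitution(t) for t in (-1.2, -0.3, 0.0, 0.5, 1.4)]
for u, gap_cosh, gap_log in results:
    print(u, gap_cosh, gap_log)
```

Negative angles are included on purpose: \sec(\theta)+\tan(\theta) stays positive there, so the logarithm needs no absolute value.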

Well, I’ve done my best to convince you of the wonder of hyperbolic trigonometric substitutions!  If integrating \sec(\theta) didn’t do it, well, that’s the best I’ve got.

The next installment of hyperbolic trigonometry?  The Gudermannian function!  What’s that, you ask?  You’ll have to wait until next time — or I suppose you can just google it….

Calculus: Hyperbolic Trigonometry, I

I love hyperbolic trigonometry.  I always include it when I teach calculus, as I think it is important for students to see.  Why?

  1. Many applications in the sciences use hyperbolic trigonometry; for example, the use of Laplace transforms in solving differential equations, various applications in physics, modeling population growth (the logistic model is a hyperbolic tangent curve);
  2. Hyperbolic trigonometric substitutions are, in many instances, easier than circular trigonometric substitutions, especially when a substitution involving \tan(x) or \sec(x) is involved;
  3. Students get to see another form of trigonometry, and compare the new form with the old;
  4. Hyperbolic trigonometry is fun.

OK, maybe that last reason is a bit of hyperbole (though not for me).

Not everyone thinks this way.  I once had a colleague who told me she did not teach hyperbolic trigonometry because it wasn’t on the AP exam.  What do you say to someone who says that?  I dunno….

In any case, I want to introduce the subject here for you, and show you some interesting aspects of hyperbolic trigonometry.  I’m going to stray from my habit of not discussing things you can find anywhere online, since in order to get to the better stuff, you need to know the basics.  I’ll move fairly quickly through the introductory concepts, though.

The hyperbolic cosine and sine are defined by

\cosh(x)=\dfrac{e^x+e^{-x}}2,\quad\sinh(x)=\dfrac{e^x-e^{-x}}2,\quad x\in\mathbb{R}.

I will admit that when I introduce this definition, I don’t have an accessible, simple motivation for doing so.  I usually say we’ll learn a lot more as we work with these definitions, so if anyone has a good idea in this regard, I’d be interested to hear it.

The graphs of these curves are shown below.

Day142Hyp1

The graph of \cosh(x) is shown in blue, and the graph of \sinh(x) is shown in red.  The dashed orange graph is y=e^{x}/2, which is easily seen to be asymptotic to both graphs.

Parallels to the circular trigonometric functions are already apparent:  y=\cosh(x) is an even function, just like y=\cos(x).  Similarly, \sinh(x) is odd, just like \sin(x).

Another parallel which is only slightly less apparent is the fundamental relationship

\cosh^2(x)-\sinh^2(x)=1.

Thus, (\cosh(x),\sinh(x)) lies on a unit hyperbola, much like (\cos(x),\sin(x)) lies on a unit circle.

While there isn’t a simple parallel with circular trigonometry, there is an interesting way to characterize \cosh(x) and \sinh(x).  Recall that given any function f(x), we may define

E(x)=\dfrac{f(x)+f(-x)}2,\quad O(x)=\dfrac{f(x)-f(-x)}2

to be the even and odd parts of f(x), respectively.  So we might simply say that \cosh(x) and \sinh(x) are the even and odd parts of e^x, respectively.

There are also many properties of the hyperbolic trigonometric functions which are reminiscent of their circular counterparts.  For example, we have

\sinh(2x)=2\sinh(x)\cosh(x)

and

\sinh(x+y)=\sinh(x)\cosh(y)+\sinh(y)\cosh(x).

None of these are especially difficult to prove using the definitions.  It turns out that while there are many similarities, there are subtle differences.  For example,

\cosh(x+y)=\cosh(x)\cosh(y)+\sinh(x)\sinh(y).

That is, while some circular trigonometric formulas become hyperbolic just by changing \cos(x) to \cosh(x) and \sin(x) to \sinh(x), sometimes changes of sign are necessary.

These changes of sign from circular formulas are typical when working with hyperbolic trigonometry.  One particularly interesting place the change of sign arises is when considering differential equations, although given that I’m bringing hyperbolic trigonometry into a calculus class, I don’t emphasize this relationship.  But recall that \cos(x) is the unique solution to the differential equation

y''+y=0,\quad y(0)=1,\quad y'(0)=0.

Similarly, we see that \cosh(x) is the unique solution to the differential equation

y''-y=0,\quad y(0)=1,\quad y'(0)=0.

Again, the parallel is striking, and the difference subtle.

Of course it is straightforward to see from the definitions that (\cosh(x))'=\sinh(x) and (\sinh(x))'=\cosh(x).  Gone are the days of remembering signs when differentiating and integrating trigonometric functions!  This is one feature of hyperbolic trigonometric functions which students always appreciate….

Another nice feature is how well-behaved the hyperbolic tangent is (as opposed to needing to consider vertical asymptotes in the case of \tan(x)).  Below is the graph of y=\tanh(x)=\sinh(x)/\cosh(x).

Day142Hyp2

The horizontal asymptotes are easily calculated from the definitions.  This looks suspiciously like the curves obtained when modeling logistic growth in populations; that is, finding solutions to

\dfrac{dP}{dt}=kP(C-P).

In fact, these logistic curves are hyperbolic tangents, which we will address in more detail in a later post.

One of the most interesting things about hyperbolic trigonometric functions is that their inverses have closed formulas — in striking contrast to their circular counterparts.  I usually have students work this out, either in class or as homework; the derivation is quite nice, so I’ll outline it here.

So let’s consider solving the equation x=\sinh(y) for y.  Begin with the definition:

x=\dfrac{e^y-e^{-y}}2.

The critical observation is that this is actually a quadratic in e^y:

(e^y)^2-2xe^y-1=0.

All that is necessary is to solve this quadratic equation to yield

e^y=x\pm\sqrt{1+x^2},

and note that x-\sqrt{1+x^2} is always negative, so that we must choose the positive sign.  Thus,

y=\hbox{arcsinh}(x)=\ln(x+\sqrt{1+x^2}).
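As a quick check, the closed formula can be compared with the inverse hyperbolic sine in Python’s standard library. A minimal sketch; the function name and sample points are arbitrary.

```python
import math

def arcsinh_closed_form(x):
    """ln(x + sqrt(1 + x^2)), the formula derived from the quadratic in e^y."""
    return math.log(x + math.sqrt(1 + x * x))

samples = (-3.0, -0.5, 0.0, 1.0, 10.0)
gaps = [abs(arcsinh_closed_form(x) - math.asinh(x)) for x in samples]

# The other root of the quadratic, x - sqrt(1 + x^2), can never equal e^y.
rejected_roots = [x - math.sqrt(1 + x * x) for x in samples]
print(gaps, rejected_roots)
```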

And this is just the beginning!  At this stage, I also offer more thought-provoking questions like, “Which is larger, \cosh(\ln(42)) or \ln(\cosh(42))?”  These get students working with the definitions and thinking about asymptotic behavior.

Next week, I’ll go into more depth about the calculus of hyperbolic trigonometric functions.  Stay tuned!

Calculus: The Geometry of Polynomials, II

The original post on The Geometry of Polynomials generated rather more interest than usual.  One reader, William Meisel, commented that he wondered if something similar worked for curves like the Folium of Descartes, given by the equation

x^3+y^3=3xy,

and whose graph looks like:

Day141Folium1

I replied that yes, I had success, and what I found out would make a nice follow-up post rather than just a reply to his comment.  So let’s go!

Just a brief refresher:  if, for example, we wanted to describe the behavior of y=2(x-4)(x-1)^2 where it crosses the x-axis at x=1, we simply retain the (x-1)^2 term and substitute the root x=1 into the other terms, getting

y=2(1-4)(x-1)^2=-6(x-1)^2

as the best-fitting parabola at x=1.

Now consider another way to think about this:

\displaystyle\lim_{x\to1}\dfrac y{(x-1)^2}=-6.

For examples like the polynomial above, this limit is always trivial, and is essentially a simple substitution.

What happens when we try to evaluate a similar limit with the Folium of Descartes?  It seems that a good approximation to this curve at x=0 (the U-shaped piece, since the sideways U-shaped piece involves writing x as a function of y) is y=x^2/3, as shown below.

Day141Folium2

To see this, we need to find

\displaystyle\lim_{x\to0}\dfrac y{x^2}.

After a little trial and error, I found it was simplest to use the substitution z=y/x^2, and so rewrite the equation for the Folium of Descartes by using the substitution y=x^2z, which results in

1+x^3z^3=3z.

Now it is easy to see that as x\to0, we have z\to1/3, giving us a good quadratic approximation at the origin.

Success!  So I thought I’d try some more examples, and see how they worked out.  I first just changed the exponent of x, looking at the curve

x^n+y^3=3xy,

shown below when n=6.

Day141Folium3.png

What would be a best approximation near the origin?  You can almost eyeball a fifth-degree approximation here, but let’s assume we don’t know the appropriate power and make the substitution y=x^kz, with k yet to be determined. This results in

x^{3k-n}z^3+1=3zx^{k+1-n}.

Now observe that when k=n-1, we have

x^{2n-3}z^3+1=3z,

so that \displaystyle\lim_{x\to0}z=1/3. Thus, in our case with n=6, we see that y=x^5/3 is a good approximation to the curve near the origin.  The graph below shows just how good an approximation it is.

Day141Folium4.png
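The substitution can be tested numerically as well. The Python sketch below solves x^{2n-3}z^3+1=3z for the branch with z near 1/3 by fixed-point iteration, which is my choice of method and not something from the original exploration, and then checks that the point (x, x^{n-1}z) really lies on the curve. The names `z_on_curve` and `residual` are hypothetical, and the iteration is only expected to converge for small |x|.

```python
def z_on_curve(x, n=3, iterations=50):
    """Branch of x^(2n-3) z^3 + 1 = 3z with z near 1/3, by fixed-point iteration."""
    z = 1.0 / 3.0
    for _ in range(iterations):
        z = (1.0 + x ** (2 * n - 3) * z ** 3) / 3.0
    return z

def residual(x, n=3):
    """How far the point (x, x^(n-1) z) is from satisfying x^n + y^3 = 3xy."""
    y = x ** (n - 1) * z_on_curve(x, n)
    return x ** n + y ** 3 - 3 * x * y

for x in (0.5, 0.1, 0.01):
    print(x, z_on_curve(x), z_on_curve(x, n=6), residual(x), residual(x, n=6))
```

With n=3 this is the Folium of Descartes itself, and the printed values of z drift toward 1/3 as x shrinks.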

OK, I thought to myself, maybe I just got lucky.  Maybe introduce a change which will really alter the nature of the curve, such as

x^3+y^3=3xy+1,

whose graph is shown below.

Day141Folium5

Here, the curve passes through the x-axis at x=1, with what appears to be a linear pass-through.  This suggests, given our previous work, the substitution y=(x-1)z, which results in

x^3+(x-1)^3z^3=3x(x-1)z+1.

We don’t have much luck with \displaystyle\lim_{x\to1}z here.  But if we move the 1 to the other side and factor, we get

(x-1)(x^2+x+1)+(x-1)^3z^3=3x(x-1)z.

Nice!  Just divide through by x-1 to obtain

x^2+x+1+(x-1)^2z^3=3xz.

Now a simple calculation reveals that \displaystyle\lim_{x\to1}z=1:  setting x=1 leaves 3=3z.  And sure enough, the line y=x-1 does the trick:

Day141Folium6

Then I decided to change the exponent again by considering

x^n+y^3=3xy+1.

Here is the graph of the curve when n=6:

Day141Folium7

It seems we have two roots this time, with linear pass-throughs.  Let’s try the same idea again, making the substitution y=(x-1)z, moving the 1 over, factoring, and dividing through by x-1.  This results in

x^{n-1}+x^{n-2}+\cdots+1+(x-1)^2z^3=3xz.

It is not difficult to calculate that \displaystyle\lim_{x\to1}z=n/3:  at x=1 the n powers of x on the left each equal 1 and the (x-1)^2z^3 term vanishes, leaving n=3z.
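To watch z approach n/3, here is a small sketch that solves x^{n-1}+\cdots+1+(x-1)^2z^3=3xz for z at values of x close to 1.  The function name z_near_one is hypothetical, and I am assuming fixed-point iteration converges, which it should when (x-1)^2 is small.

```python
def z_near_one(x, n, iterations=100):
    """Solve x^(n-1) + ... + 1 + (x-1)^2 z^3 = 3xz for z by fixed-point iteration."""
    geometric_sum = sum(x**j for j in range(n))  # x^(n-1) + ... + x + 1
    z = n / 3.0  # start at the conjectured limit
    for _ in range(iterations):
        z = (geometric_sum + (x - 1) ** 2 * z**3) / (3 * x)
    return z


for n in (3, 6):
    for x in (1.1, 1.01, 1.0001):
        print(f"n = {n}  x = {x:<7} z = {z_near_one(x, n):.6f}  n/3 = {n / 3:.6f}")
```

The case n=3 reproduces the earlier limit of 1, and n=6 should drift toward 2 as x approaches 1.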

Now things become a bit more interesting when n is even, since there is always a root at x=-1 in this case.  Here, we make the substitution y=(x+1)z, move the 1 over, and divide by x+1, resulting in

\dfrac{x^n-1}{x+1}+(x+1)^2z^3=3xz.

Since n is even, x^2-1 is a factor of x^n-1, so we have

(x-1)(x^{n-2}+x^{n-4}+\cdots+x^2+1)+(x+1)^2z^3=3xz.

Substituting x=-1 in this equation gives

-2\left(\dfrac n2\right)=3(-1)z,

which immediately gives \displaystyle\lim_{x\to-1}z=n/3 as well!  This is a curious coincidence, for which I have no nice geometrical explanation.  The case when n=6 is illustrated below.

Day141Folium8
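As an independent check that does not rely on the substitution, the sketch below works directly with x^n+y^3=3xy+1.  It steps a small distance h to the right of a root x_0, uses Newton's method in y to land back on the curve, and reports y/(x-x_0).  The name slope_at_root, the step h, and the starting guess y=0 are all assumptions made for this illustration.

```python
def slope_at_root(x0, n, h=1e-4, iterations=50):
    """Estimate y/(x - x0) on x^n + y^3 = 3xy + 1 just to the right of the root x0."""
    x = x0 + h
    y = 0.0  # the curve crosses the x-axis at x0, so y is small nearby
    for _ in range(iterations):
        f = x**n + y**3 - 3 * x * y - 1
        df_dy = 3 * y**2 - 3 * x
        y -= f / df_dy  # Newton step in y with x held fixed
    return y / (x - x0)


n = 6
print("slope near x =  1:", slope_at_root(1.0, n))
print("slope near x = -1:", slope_at_root(-1.0, n))
print("n/3              :", n / 3)
```

Both estimated slopes should land near n/3, which is the coincidence noted above.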

This is where I stopped — but I was truly surprised that everything I tried actually worked.  I did a cursory online search for Taylor series of implicitly defined functions, but this seems to be much less popular than series for y=f(x).

Anyone more familiar with this topic care to chime in?  I really enjoyed this brief exploration, and I’m grateful that William Meisel asked his question about the Folium of Descartes.  These are certainly instances of a larger phenomenon, but I feel the statement and proof of any theorem will be somewhat more complicated than the analogous results for explicitly defined functions.

And if you find some neat examples, post a comment!  I’d enjoy writing another follow-up post if there is continued interest in this topic.

Calculus: Linear Approximations, II

As I mentioned last week, I am a fan of emphasizing the idea of a derivative as a linear approximation.  I ended that discussion by using this method to find the derivative of \tan(x).   Today, we’ll look at some more examples, and then derive the product, quotient and chain rules.

Differentiating \sec(x) is particularly nice using this method.  We first approximate

\sec(x+h)=\dfrac1{\cos(x+h)}\approx\dfrac1{\cos(x)-h\sin(x)}.

Then we factor out a \cos(x) from the denominator, giving

\sec(x+h)\approx\dfrac1{\cos(x)(1-h\tan(x))}.

As we did at the end of last week’s post, we can make h as small as we like, so we treat 1/(1-h\tan(x)) as the sum of a geometric series and keep only its first two terms:

\dfrac1{1-h\tan(x)}\approx1+h\tan(x).

Finally, we have

\sec(x+h)\approx\dfrac{1+h\tan(x)}{\cos(x)}=\sec(x)+h\sec(x)\tan(x),

which gives the derivative of \sec(x) as \sec(x)\tan(x).
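A numerical sanity check, as a rough sketch:  if \sec(x)+h\sec(x)\tan(x) really is the linear approximation, then the error should shrink like h^2.  The sample point x=0.5 is an arbitrary choice of mine.

```python
import math


def sec(x):
    return 1.0 / math.cos(x)


x = 0.5  # arbitrary sample point where cos(x) is nonzero
for h in (1e-1, 1e-2, 1e-3):
    exact = sec(x + h)
    linear = sec(x) + h * sec(x) * math.tan(x)
    error = exact - linear
    # A first-order approximation should leave an error proportional to h^2.
    print(f"h = {h:<6} error = {error:.3e}  error/h^2 = {error / h**2:.4f}")
```

The last column should stay roughly constant as h shrinks, which is the signature of a correct first-order approximation.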

We’ll look at one more example involving approximating with geometric series before moving on to the product, quotient, and chain rules.  Consider differentiating x^{-n}. We first factor the denominator:

\dfrac1{(x+h)^n}=\dfrac1{x^n(1+h/x)^n}.

Now approximate

\dfrac1{1+h/x}\approx1-\dfrac hx,

so that, to first order,

\dfrac1{(1+h/x)^n}\approx \left(1-\dfrac hx\right)^{\!\!n}\approx 1-\dfrac{nh}x.

This finally results in

\dfrac1{(x+h)^n}\approx \dfrac1{x^n}\left(1-\dfrac{nh}x\right)=\dfrac1{x^n}+h\dfrac{-n}{x^{n+1}},

giving us the correct derivative.
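To see the two stages of approximation side by side, here is a short sketch comparing the exact value of 1/(x+h)^n with the geometric-series step and the final first-order formula.  The function name and the sample values x=2, n=3 are assumptions chosen only for illustration.

```python
def reciprocal_power_stages(x, n, h):
    """Return the exact value of 1/(x+h)^n and the two approximations from the text."""
    exact = 1.0 / (x + h) ** n
    geometric = (1.0 - h / x) ** n / x**n            # uses 1/(1 + h/x) ~ 1 - h/x
    first_order = 1.0 / x**n - n * h / x ** (n + 1)  # keeps only the term linear in h
    return exact, geometric, first_order


x, n = 2.0, 3
for h in (1e-1, 1e-2, 1e-3):
    exact, geometric, first_order = reciprocal_power_stages(x, n, h)
    print(f"h = {h:<6} exact = {exact:.8f}  "
          f"geometric = {geometric:.8f}  first order = {first_order:.8f}")
```

All three columns should agree more and more closely as h decreases.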

Now let’s move on to the product rule:

(fg)'(x)=f(x)g'(x)+f'(x)g(x).

Here, and for the rest of this discussion, we assume that all functions have the necessary differentiability.

We want to approximate f(x+h)g(x+h), so we replace each factor with its linear approximation:

f(x+h)g(x+h)\approx (f(x)+hf'(x))(g(x)+hg'(x)).

Now expand and keep only the first-order terms:

f(x+h)g(x+h)\approx f(x)g(x)+h(f(x)g'(x)+f'(x)g(x)).

And there’s the product rule — just read off the coefficient of h.
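The "expand and keep only the first-order terms" step is mechanical enough to hand to a computer.  In the sketch below, a pair (a, b) stands for a+bh; the function name multiply_first_order and the sample functions x^2 and \sin(x) are my own choices, not anything standard.

```python
import math


def multiply_first_order(p, q):
    """Multiply (a + b*h) by (c + d*h) and drop the h^2 term.

    Each argument is a pair (value, coefficient of h).
    """
    a, b = p
    c, d = q
    return (a * c, a * d + b * c)


x = 1.2  # arbitrary sample point
f = (x**2, 2 * x)               # f(x) = x^2 and f'(x)
g = (math.sin(x), math.cos(x))  # g(x) = sin(x) and g'(x)

value, derivative = multiply_first_order(f, g)

# Compare with a central difference quotient of x^2 sin(x).
step = 1e-6
numeric = ((x + step) ** 2 * math.sin(x + step)
           - (x - step) ** 2 * math.sin(x - step)) / (2 * step)
print("coefficient of h  :", derivative)
print("difference quotient:", numeric)
```

The coefficient of h is exactly f(x)g'(x)+f'(x)g(x), and it should match the difference quotient to several decimal places.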

There is a compelling reason to use this method.  The traditional proof begins by evaluating

\displaystyle\lim_{h\to0}\dfrac{f(x+h)g(x+h)-f(x)g(x)}h.

The next step?  Just add and subtract f(x)g(x+h) (or perhaps f(x+h)g(x)).  I have found that there is just no way to convincingly motivate this step.  Yes, those of us who have seen it crop up in various forms know to try such tricks, but the typical first-time student of calculus is simply mystified by it.  Using linear approximations, there is absolutely no mystery at all.

The quotient rule is next:

\left(\dfrac fg\right)^{\!\!\!'}\!(x)=\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.

First approximate

\dfrac{f(x+h)}{g(x+h)}\approx\dfrac{f(x)+hf'(x)}{g(x)+hg'(x)}.

Now since h is small, we approximate

\dfrac1{g(x)+hg'(x)}\approx\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right),

so that

\dfrac{f(x+h)}{g(x+h)}\approx(f(x)+hf'(x))\cdot\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right).

Multiplying out and keeping just the first-order terms results in

\dfrac{f(x+h)}{g(x+h)}\approx \dfrac{f(x)}{g(x)}+h\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.

Voilà!  The quotient rule.  The usual proofs involve either (1) using the product rule with f(x) and 1/g(x), which requires the chain rule to differentiate 1/g(x); or (2) the mysterious “adding and subtracting the same expression” in the numerator.  Using linear approximations avoids both.
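The quotient computation can be sketched in code the same way, with a pair (a, b) standing for a+bh.  The function name divide_first_order is hypothetical, and I have picked \sin(x)/\cos(x) as the test case so that the result can be compared with \tan(x) and \sec^2(x).

```python
import math


def divide_first_order(p, q):
    """Divide (a + b*h) by (c + d*h), keeping terms up to first order in h.

    Uses 1/(c + d*h) ~ (1/c) * (1 - h*d/c), which requires c != 0.
    """
    a, b = p
    c, d = q
    recip_value, recip_slope = 1.0 / c, -d / c**2  # (1/c) * (1 - h*d/c)
    return (a * recip_value, a * recip_slope + b * recip_value)


x = 0.7  # arbitrary sample point where cos(x) is nonzero
f = (math.sin(x), math.cos(x))   # sin(x) and its derivative
g = (math.cos(x), -math.sin(x))  # cos(x) and its derivative

value, derivative = divide_first_order(f, g)
print("value      :", value, " tan(x)  :", math.tan(x))
print("derivative :", derivative, " sec^2(x):", 1.0 / math.cos(x) ** 2)
```

The coefficient of h works out to (g(x)f'(x)-f(x)g'(x))/g(x)^2, so the second line should show two matching numbers.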

The chain rule is almost ridiculously easy to prove using linear approximations.  Begin by approximating

f(g(x+h))\approx f(g(x)+hg'(x)).

Note that we’re replacing the argument to a function with its linear approximation, but since we assume that f is differentiable, it is also continuous, so this poses no real problem.  Yes, perhaps there is a little hand-waving here, but in my opinion, no rigor is really lost.

Since g is differentiable, then g'(x) exists, and so we can make hg'(x) as small as we like, so the “hg'(x)” term acts like the “h” term in our linear approximation.  Additionally, the “g(x)” term acts like the “x” term, resulting in

f(g(x+h))\approx f(g(x))+hg'(x)f'(g(x)).

Reading off the coefficient of h gives the chain rule:

(f\circ g)'(x)=f'(g(x))g'(x).
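For anyone uneasy about the hand-waving, a numerical check may help.  This sketch takes f(u)=\sin(u) and g(x)=x^2 (both arbitrary choices of mine) and compares f(g(x+h)) with f(g(x))+hg'(x)f'(g(x)); if the chain rule coefficient is right, the error should shrink like h^2.

```python
import math


def f(u):
    return math.sin(u)


def f_prime(u):
    return math.cos(u)


def g(x):
    return x**2


def g_prime(x):
    return 2 * x


x = 0.5  # arbitrary sample point
for h in (1e-1, 1e-2, 1e-3):
    exact = f(g(x + h))
    linear = f(g(x)) + h * g_prime(x) * f_prime(g(x))
    error = exact - linear
    print(f"h = {h:<6} error = {error:.3e}  error/h^2 = {error / h**2:.4f}")
```

A roughly constant final column indicates that nothing of first order was lost in replacing the argument by its linear approximation.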

So I’ve said my piece.  By this time, you’re either convinced that using linear approximations is a good idea, or you’re not.  But I think these methods reflect more accurately the intuition behind the calculations — and reflect what mathematicians do in practice.

In addition, using linear approximations involves more than just mechanically applying formulas.  If all you ever do is apply the product, quotient, and chain rules, it’s just mechanics.  Using linear approximations requires a bit more understanding of what’s really going on underneath the hood, as it were.

If you find more neat examples of differentiation using this method, please comment!  I know I’d be interested, and I’m sure others would as well.

In my next installment (or two or three) in this calculus series, I’ll talk about one of my favorite topics — hyperbolic trigonometry.