Calculus: Hyperbolic Trigonometry, I

I love hyperbolic trigonometry.  I always include it when I teach calculus, as I think it is important for students to see.  Why?

  1.  Many applications in the sciences use hyperbolic trigonometry; for example, the use of Laplace transforms in solving differential equations, various applications in physics, modeling population growth (the logistic model is a hyperbolic tangent curve);
  2. Hyperbolic trigonometric substitutions are, in many instances, easier than circular trigonometric substitutions, especially when the circular substitution would involve \tan(x) or \sec(x);
  3. Students get to see another form of trigonometry, and compare the new form with the old;
  4. Hyperbolic trigonometry is fun.

OK, maybe that last reason is a bit of hyperbole (though not for me).

Not everyone thinks this way.  I once had a colleague who told me she did not teach hyperbolic trigonometry because it wasn’t on the AP exam.  What do you say to someone who says that?  I dunno….

In any case, I want to introduce the subject here for you, and show you some interesting aspects of hyperbolic trigonometry.  I’m going to stray from my habit of not discussing things you can find anywhere online, since in order to get to the better stuff, you need to know the basics.  I’ll move fairly quickly through the introductory concepts, though.

The hyperbolic cosine and sine are defined by

\cosh(x)=\dfrac{e^x+e^{-x}}2,\quad\sinh(x)=\dfrac{e^x-e^{-x}}2,\quad x\in\mathbb{R}.

I will admit that when I introduce this definition, I don’t have an accessible, simple motivation for doing so.  I usually say we’ll learn a lot more as we work with these definitions, so if anyone has a good idea in this regard, I’d be interested to hear it.

The graphs of these curves are shown below.

Day142Hyp1

The graph of \cosh(x) is shown in blue, and the graph of \sinh(x) is shown in red.  The dashed orange graph is y=e^{x}/2, which is easily seen to be asymptotic to both graphs.

Parallels to the circular trigonometric functions are already apparent:  y=\cosh(x) is an even function, just like y=\cos(x).  Similarly, \sinh(x) is odd, just like \sin(x).

Another parallel which is only slightly less apparent is the fundamental relationship

\cosh^2(x)-\sinh^2(x)=1.

Thus, (\cosh(x),\sinh(x)) lies on a unit hyperbola, much like (\cos(x),\sin(x)) lies on a unit circle.
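For completeness, here is the short calculation behind this identity, using nothing beyond the definitions of \cosh(x) and \sinh(x) given above:

```latex
\cosh^2(x)-\sinh^2(x)
  =\dfrac{e^{2x}+2+e^{-2x}}4-\dfrac{e^{2x}-2+e^{-2x}}4
  =\dfrac44
  =1.
```

The cross terms e^x\cdot e^{-x}=1 are what survive the subtraction.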

While there isn’t a simple parallel with circular trigonometry, there is an interesting way to characterize \cosh(x) and \sinh(x).  Recall that given any function f(x), we may define

E(x)=\dfrac{f(x)+f(-x)}2,\quad O(x)=\dfrac{f(x)-f(-x)}2

to be the even and odd parts of f(x), respectively.  So we might simply say that \cosh(x) and \sinh(x) are the even and odd parts of e^x, respectively.

There are also many properties of the hyperbolic trigonometric functions which are reminiscent of their circular counterparts.  For example, we have

\sinh(2x)=2\sinh(x)\cosh(x)

and

\sinh(x+y)=\sinh(x)\cosh(y)+\sinh(y)\cosh(x).

None of these are especially difficult to prove using the definitions.  It turns out that while there are many similarities, there are subtle differences.  For example,

\cosh(x+y)=\cosh(x)\cosh(y)+\sinh(x)\sinh(y).

That is, while some circular trigonometric formulas become hyperbolic just by changing \cos(x) to \cosh(x) and \sin(x) to \sinh(x), sometimes changes of sign are necessary.
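As one instance of how these proofs go, here is a sketch of the addition formula for \cosh(x+y), again straight from the definitions.  The mixed terms e^{x-y} and e^{-x+y} cancel, which is why the sign between the two products is a plus rather than the minus of the circular formula:

```latex
\cosh(x)\cosh(y)=\dfrac{e^{x+y}+e^{x-y}+e^{-x+y}+e^{-x-y}}4,
\qquad
\sinh(x)\sinh(y)=\dfrac{e^{x+y}-e^{x-y}-e^{-x+y}+e^{-x-y}}4,

\cosh(x)\cosh(y)+\sinh(x)\sinh(y)=\dfrac{e^{x+y}+e^{-(x+y)}}2=\cosh(x+y).
```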

These changes of sign from circular formulas are typical when working with hyperbolic trigonometry.  One particularly interesting place the change of sign arises is when considering differential equations, although given that I’m bringing hyperbolic trigonometry into a calculus class, I don’t emphasize this relationship.  But recall that \cos(x) is the unique solution to the differential equation

y''+y=0,\quad y(0)=1,\quad y'(0)=0.

Similarly, we see that \cosh(x) is the unique solution to the differential equation

y''-y=0,\quad y(0)=1,\quad y'(0)=0.

Again, the parallel is striking, and the difference subtle.

Of course it is straightforward to see from the definitions that (\cosh(x))'=\sinh(x) and (\sinh(x))'=\cosh(x).  Gone are the days of remembering signs when differentiating and integrating trigonometric functions!  This is one feature of hyperbolic trigonometric functions which students always appreciate….

Another nice feature is how well-behaved the hyperbolic tangent is (as opposed to needing to consider vertical asymptotes in the case of \tan(x)).  Below is the graph of y=\tanh(x)=\sinh(x)/\cosh(x).

Day142Hyp2

The horizontal asymptotes are easily calculated from the definitions.  This looks suspiciously like the curves obtained when modeling logistic growth in populations; that is, finding solutions to

\dfrac{dP}{dt}=kP(C-P).

In fact, these logistic curves are hyperbolic tangents, which we will address in more detail in a later post.
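A quick numerical sanity check of that claim, offered only as a sketch.  It assumes the solution is normalized so that P(t_0)=C/2, which covers solutions with 0<P<C; the function names and the sample values of k and C are illustrative choices, not anything from a particular model.

```python
import math

def logistic(t, k, C, t0=0.0):
    """Usual closed form of a solution to dP/dt = k P (C - P),
    assuming the normalization P(t0) = C/2."""
    return C / (1.0 + math.exp(-k * C * (t - t0)))

def logistic_as_tanh(t, k, C, t0=0.0):
    """The same curve written as a shifted, scaled hyperbolic tangent."""
    return 0.5 * C * (1.0 + math.tanh(0.5 * k * C * (t - t0)))

k, C = 0.3, 10.0                      # hypothetical sample parameters
times = [-3.0 + 0.25 * i for i in range(25)]

# Largest difference between the two formulas over the sample times.
max_gap = max(abs(logistic(t, k, C) - logistic_as_tanh(t, k, C)) for t in times)

# Central-difference check that the tanh form satisfies the differential equation.
t, step = 0.4, 1e-6
P = logistic_as_tanh(t, k, C)
dP_dt = (logistic_as_tanh(t + step, k, C) - logistic_as_tanh(t - step, k, C)) / (2 * step)
ode_gap = abs(dP_dt - k * P * (C - P))

print(max_gap, ode_gap)
```

Both gaps should be at the level of floating-point noise, which is consistent with the two expressions describing the same curve.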

One of the most interesting things about hyperbolic trigonometric functions is that their inverses have closed formulas — in striking contrast to their circular counterparts.  I usually have students work this out, either in class or as homework; the derivation is quite nice, so I’ll outline it here.

So let’s consider solving the equation x=\sinh(y) for y.  Begin with the definition:

x=\dfrac{e^y-e^{-y}}2.

The critical observation is that this is actually a quadratic in e^y:

(e^y)^2-2xe^y-1=0.

All that is necessary is to solve this quadratic equation to yield

e^y=x\pm\sqrt{1+x^2},

and note that x-\sqrt{1+x^2} is always negative, so that we must choose the positive sign.  Thus,

y=\hbox{arcsinh}(x)=\ln(x+\sqrt{1+x^2}).
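If you'd like to check the closed formula numerically, here is a minimal sketch comparing it with the standard library's math.asinh.  The helper name arcsinh_closed_form and the sample points are illustrative assumptions.

```python
import math

def arcsinh_closed_form(x):
    """ln(x + sqrt(1 + x^2)), the formula derived from the quadratic in e^y."""
    return math.log(x + math.sqrt(1.0 + x * x))

samples = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0, 10.0]

# Compare against the library implementation of the inverse hyperbolic sine.
max_error = max(abs(arcsinh_closed_form(x) - math.asinh(x)) for x in samples)

# Check that sinh really undoes the closed formula.
round_trip = max(abs(math.sinh(arcsinh_closed_form(x)) - x) for x in samples)

print(max_error, round_trip)
```

For large negative x the sum x+\sqrt{1+x^2} suffers from cancellation in floating point, so in practice math.asinh is the safer choice; the formula is the point here, not the numerics.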

And this is just the beginning!  At this stage, I also offer more thought-provoking questions like, “Which is larger, \cosh(\ln(42)) or \ln(\cosh(42))?”  These get students working with the definitions and thinking about asymptotic behavior.

Next week, I’ll go into more depth about the calculus of hyperbolic trigonometric functions.  Stay tuned!

Calculus: The Geometry of Polynomials, II

The original post on The Geometry of Polynomials generated rather more interest than usual.  One reader, William Meisel, commented that he wondered if something similar worked for curves like the Folium of Descartes, given by the equation

x^3+y^3=3xy,

and whose graph looks like:

Day141Folium1

I replied that yes, I had success, and what I found out would make a nice follow-up post rather than just a reply to his comment.  So let’s go!

Just a brief refresher:  if, for example, we wanted to describe the behavior of y=2(x-4)(x-1)^2 where it crosses the x-axis at x=1, we simply retain the (x-1)^2 term and substitute the root x=1 into the other terms, getting

y=2(1-4)(x-1)^2=-6(x-1)^2

as the best-fitting parabola at x=1.

Now consider another way to think about this:

\displaystyle\lim_{x\to1}\dfrac y{(x-1)^2}=-6.

For examples like the polynomial above, this limit is always trivial, and is essentially a simple substitution.

What happens when we try to evaluate a similar limit with the Folium of Descartes?  It seems that a good approximation to this curve at x=0 (the U-shaped piece, since the sideways U-shaped piece involves writing x as a function of y) is y=x^2/3, as shown below.

Day141Folium2

To see this, we need to find

\displaystyle\lim_{x\to0}\dfrac y{x^2}.

After a little trial and error, I found it was simplest to use the substitution z=y/x^2, and so rewrite the equation for the Folium of Descartes by using the substitution y=x^2z, which results in

1+x^3z^3=3z.

Now it is easy to see that as x\to0, we have z\to1/3, giving us a good quadratic approximation at the origin.
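To watch the limit happen, here is a small sketch that solves 1+x^3z^3=3z for z by fixed-point iteration, which is one convenient choice for small x and not the only way to do it.  The function name folium_z is hypothetical.

```python
def folium_z(x, iterations=50):
    """Solve 1 + x^3 z^3 = 3z by iterating z <- (1 + x^3 z^3) / 3.
    The iteration is a strong contraction when x is small."""
    z = 1.0 / 3.0
    for _ in range(iterations):
        z = (1.0 + x**3 * z**3) / 3.0
    return z

for x in (0.5, 0.1, 0.01):
    z = folium_z(x)
    y = x * x * z                        # undo the substitution y = x^2 z
    residual = x**3 + y**3 - 3 * x * y   # should vanish on the folium
    print(x, z, residual)
```

The printed values of z settle toward 1/3 as x shrinks, and the residual confirms that each point (x, x^2z) actually lies on the curve.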

Success!  So I thought I’d try some more examples, and see how they worked out.  I first just changed the exponent of x, looking at the curve

x^n+y^3=3xy,

shown below when n=6.

Day141Folium3.png

What would be a best approximation near the origin?  You can almost eyeball a fifth-degree approximation here, but let’s assume we don’t know the appropriate power and make the substitution y=x^kz, with k yet to be determined. This results in

x^{3k-n}z^3+1=3zx^{k+1-n}.

Now observe that when k=n-1, we have

x^{2n-3}z^3+1=3z,

so that \displaystyle\lim_{x\to0}z=1/3. Thus, in our case with n=6, we see that y=x^5/3 is a good approximation to the curve near the origin.  The graph below shows just how good an approximation it is.

Day141Folium4.png

OK, I thought to myself, maybe I just got lucky.  Maybe introduce a change which will really alter the nature of the curve, such as

x^3+y^3=3xy+1,

whose graph is shown below.

Day141Folium5

Here, the curve passes through the x-axis at x=1, with what appears to be a linear pass-through.  This suggests, given our previous work, the substitution y=(x-1)z, which results in

x^3+(x-1)^3z^3=3x(x-1)z+1.

We don’t have much luck with \displaystyle\lim_{x\to1}z here.  But if we move the 1 to the other side and factor, we get

(x-1)(x^2+x+1)+(x-1)^3z^3=3x(x-1)z.

Nice!  Just divide through by x-1 to obtain

x^2+x+1+(x-1)^2z^3=3xz.

Now a simple calculation reveals that \displaystyle\lim_{x\to1}z=1. And sure enough, the line y=x-1 does the trick:

Day141Folium6

Then I decided to change the exponent again by considering

x^n+y^3=3xy+1.

Here is the graph of the curve when n=6:

Day141Folium7

It seems we have two roots this time, with linear pass-throughs.  Let’s try the same idea again, making the substitution y=(x-1)z, moving the 1 over, factoring, and dividing through by x-1.  This results in

x^{n-1}+x^{n-2}+\cdots+1+(x-1)^2z^3=3xz.

It is not difficult to calculate that \displaystyle\lim_{x\to1}z=n/3.

Now things become a bit more interesting when n is even, since there is always a root at x=-1 in this case.  Here, we make the substitution y=(x+1)z, move the 1 over, and divide by x+1, resulting in

\dfrac{x^n-1}{x+1}+(x+1)^2z^3=3xz.

But since n is even, then x^2-1 is a factor of x^n-1, so we have

(x-1)(x^{n-2}+x^{n-4}+\cdots+x^2+1)+(x+1)^2z^3=3xz.

Substituting x=-1 in this equation gives

-2\left(\dfrac n2\right)=3(-1)z,

which immediately gives \displaystyle\lim_{x\to-1}z=n/3 as well!  This is a curious coincidence, for which I have no nice geometrical explanation.  The case when n=6 is illustrated below.
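Both limits can be checked numerically with the same fixed-point idea.  This is a sketch under my own assumptions: the helper z_near_root is hypothetical, it works with the divided-out equation (x^n-1)/(x-r)+(x-r)^2z^3=3xz at the root r=1 or r=-1, and n must be even when r=-1.

```python
def z_near_root(x, n, root, iterations=100):
    """Solve q(x) + (x - root)^2 z^3 = 3 x z for z, where
    q(x) = (x^n - 1) / (x - root), by iterating on z.
    Intended for x close to root, with root = 1 or root = -1 (n even)."""
    q = (x**n - 1.0) / (x - root)
    z = n / 3.0
    for _ in range(iterations):
        z = (q + (x - root) ** 2 * z**3) / (3.0 * x)
    return z

n = 6
for root, x in ((1, 1.001), (-1, -0.999)):
    z = z_near_root(x, n, root)
    y = (x - root) * z                       # undo the substitution
    residual = x**n + y**3 - 3 * x * y - 1   # should vanish on the curve
    print(root, z, residual)
```

With n=6, both values of z come out close to 2=n/3, matching the two linear pass-throughs.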

Day141Folium8

This is where I stopped — but I was truly surprised that everything I tried actually worked.  I did a cursory online search for Taylor series of implicitly defined functions, but this seems to be much less popular than series for y=f(x).

Anyone more familiar with this topic care to chime in?  I really enjoyed this brief exploration, and I’m grateful that William Meisel asked his question about the Folium of Descartes.  These are certainly instances of a larger phenomenon, but I feel the statement and proof of any theorem will be somewhat more complicated than the analogous results for explicitly defined functions.

And if you find some neat examples, post a comment!  I’d enjoy writing another follow-up post if there is continued interest in this topic.

Bay Area Mathematical Artists, VII

We had yet another amazing meeting of the Bay Area Mathematical Artists yesterday!  Just two speakers — but even so, we went a half-hour over our usual 5:00 ending time.

Our first presenter was Stan Isaacs.  There was no real title to his presentation, but he brought another set of puzzles from his vast collection to share.  He was highlighting puzzles created by Wayne Daniel.

Below you’ll see one of the puzzles disassembled.  The craftsmanship is simply remarkable.

Day140Stan1

If you look carefully, you’ll see what’s going on.  The outer pieces make an icosahedron, and when you take those off, a dodecahedron, then a cube…a wooden puzzle of nested Platonic solids!  The pieces all fit together so perfectly.  Stan is looking forward to an exhibition of Wayne’s work at the International Puzzle Party in San Diego later on this year.  For more information, contact Stan at stan@isaacs.com.

Day140Stan2

Our second speaker was Scott Kim (www.scottkim.com), whose presentation was entitled Motley Dissections.  What is a motley dissection?  The most famous example is the problem of the squared square: dissecting a square with an integer side length into smaller squares with integer side lengths, but with all the squares different sizes.

One property of such a dissection is that no two edges of squares meet exactly corner to corner.  In other words, edges always overlap in some way.

But there are of course many other motley dissections.  For example, below you see a motley dissection of one rectangle into five, one pentagon into eleven, and finally, one hexagon into a triangle, square, pentagon and hexagon.

Day140SK1

Look carefully, and you’ll see that no single edge in any of these dissections exactly matches any other.  For these decompositions, Scott has proved they are minimal; for example, there is no motley dissection of one pentagon into ten or fewer.  The proofs are not exactly elegant, but they serve their purpose.  He also mentioned that he credits the term motley dissection to Donald Knuth, who used it in a phone conversation not all that long ago.

Can you cube the cube?  That is, can you take a cube and subdivide it into cubes which are all different?  Scott showed us a simple proof that you can’t.  But, it turns out, you can box the box.  In other words, if the length, width, and height of the larger box and all the smaller boxes may be different, then it is possible to box the box.

Next week, Scott is off to the Gathering 4 Gardner in Atlanta, and will be giving his talk on Motley Dissections there.  He planned an activity where participants actually build a boxed box — and we were his test audience!

Day140Box1.jpg

He created some very elaborate transparencies with detailed instructions for cutting out and assembling.  There were a very few suggestions for improvement, and Scott was happy to know about them — after all, it is rare that something works out perfectly the first time.  So now, his success at G4G in Atlanta is assured….

We were so into creating these boxed boxes that we happily stayed until 5:30, by which time we had two boxes completed.

Day140Box2.jpg

I should mention that Scott also discussed something he terms pseudo-duals in two, three, and even four dimensions!  There isn’t room to go into those now, but you can contact him through his website for more information.

As usual, we went out to dinner afterwards — and we gravitated towards our favorite Thai place again.  The dinner conversation was truly exceptional this evening, revolving around an animated conversation between Scott Kim and magician Mark Mitton (www.markmitton.com).

The conversation was concerned with the way we perceive mathematics here in the U.S., and how that influences the educational system.  Simply put, there is a lot to be desired.

One example Scott and Mark mentioned was the National Mathematics Festival (http://www.nationalmathfestival.org).  Tens of thousands of kids and parents have fun doing mathematics.  Then the next week, they go back to their schools and keep learning math the same — usually, unfortunately, boring — way it’s always been learned.

So why does the National Mathematics Festival have to be a one-off event?  It doesn’t!  Scott is actively engaged in a program he’s created where he goes into an elementary school at lunchtime one day a week and lets kids play with math games and puzzles.

Why this model?  Teachers need no extra prep time, kids don’t need to stay after school, and so everyone can participate with very little needed as far as additional resources are concerned.  He’s hoping to create a package that he can export to any school anywhere with minimal effort, so that children can be exposed to the joy of mathematics on a regular basis.

Mark was interested in Scott’s model:  consider your Needs (improving the perception of mathematics), be aware of the Forces at play (unenlightened administrators, for example, and many other subtle forces at work, as Mark explained), and then decide upon Actions to take to move the Work (applied, pure, and recreational mathematics) forward.

The bottom line:  we all know about this problem of attitudes toward mathematics and mathematics education, but no one really knows what to do about it.  For Scott, it’s just another puzzle to solve.  There are solutions.  And he is going to find one.

We talked for over two hours about these ideas, and everyone chimed in at one time or another.  Yes, my summary is very brief, I know, but I hope you get the idea of the type of conversation we had.

Stay tuned, since we are planning an upcoming meeting where we focus on Scott’s model and work towards a solution.  Another theme throughout the conversation was that mathematics is not an activity done in isolation — it is a communal activity.  So the Needs will not be addressed by a single individual, rather a group, and likely involving many members of many diverse communities.

A solution is out there.  It will take a lot of grit to find it.  But mathematicians have got grit in spades.

 

Calculus: Linear Approximations, II

As I mentioned last week, I am a fan of emphasizing the idea of a derivative as a linear approximation.  I ended that discussion by using this method to find the derivative of \tan(x).   Today, we’ll look at some more examples, and then derive the product, quotient and chain rules.

Differentiating \sec(x) is particularly nice using this method.  We first approximate

\sec(x+h)=\dfrac1{\cos(x+h)}\approx\dfrac1{\cos(x)-h\sin(x)}.

Then we factor out a \cos(x) from the denominator, giving

\sec(x+h)\approx\dfrac1{\cos(x)(1-h\tan(x))}.

As we did at the end of last week’s post, we can make h as small as we like, and so approximate by considering 1/(1-h\tan(x)) as the sum of an infinite series:

\dfrac1{1-h\tan(x)}\approx1+h\tan(x).

Finally, we have

\sec(x+h)\approx\dfrac{1+h\tan(x)}{\cos(x)}=\sec(x)+h\sec(x)\tan(x),

which gives the derivative of \sec(x) as \sec(x)\tan(x).
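Here is a minimal numerical sketch of what "approximately" means in this derivation: the gap between \sec(x+h) and the linear model should shrink like h^2, so the gap divided by h goes to 0.  The sample point x=0.7 and the helper names are arbitrary illustrative choices.

```python
import math

def sec(x):
    return 1.0 / math.cos(x)

def linear_model(x, h):
    """sec(x) + h sec(x) tan(x), the approximation derived above."""
    return sec(x) + h * sec(x) * math.tan(x)

x = 0.7
steps = [1e-1, 1e-2, 1e-3]

# Absolute gap between the true value and the linear model for each h.
errors = [abs(sec(x + h) - linear_model(x, h)) for h in steps]

# Gap divided by h; this is what must tend to 0 for the derivative to be right.
ratios = [err / h for err, h in zip(errors, steps)]

for h, err, ratio in zip(steps, errors, ratios):
    print(h, err, ratio)
```

Each tenfold reduction in h should cut the error by a factor of roughly one hundred, the signature of a discarded h^2 term.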

We’ll look at one more example involving approximating with geometric series before moving on to the product, quotient, and chain rules.  Consider differentiating x^{-n}. We first factor the denominator:

\dfrac1{(x+h)^n}=\dfrac1{x^n(1+h/x)^n}.

Now approximate

\dfrac1{1+h/x}\approx1-\dfrac hx,

so that, to first order,

\dfrac1{(1+h/x)^n}\approx \left(1-\dfrac hx\right)^{\!\!n}\approx 1-\dfrac{nh}x.

This finally results in

\dfrac1{(x+h)^n}\approx \dfrac1{x^n}\left(1-\dfrac{nh}x\right)=\dfrac1{x^n}+h\dfrac{-n}{x^{n+1}},

giving us the correct derivative.

Now let’s move on to the product rule:

(fg)'(x)=f(x)g'(x)+f'(x)g(x).

Here, and for the rest of this discussion, we assume that all functions have the necessary differentiability.

We want to approximate f(x+h)g(x+h), so we replace each factor with its linear approximation:

f(x+h)g(x+h)\approx (f(x)+hf'(x))(g(x)+hg'(x)).

Now expand and keep only the first-order terms:

f(x+h)g(x+h)\approx f(x)g(x)+h(f(x)g'(x)+f'(x)g(x)).

And there’s the product rule — just read off the coefficient of h.
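Spelled out, the expansion contains exactly one term beyond first order, and that is the term being discarded:

```latex
(f(x)+hf'(x))(g(x)+hg'(x))
  =f(x)g(x)+h\bigl(f(x)g'(x)+f'(x)g(x)\bigr)+h^2f'(x)g'(x).
```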

There is a compelling reason to use this method.  The traditional proof begins by evaluating

\displaystyle\lim_{h\to0}\dfrac{f(x+h)g(x+h)-f(x)g(x)}h.

The next step?  Just add and subtract f(x)g(x+h) (or perhaps f(x+h)g(x)).  I have found that there is just no way to convincingly motivate this step.  Yes, those of us who have seen it crop up in various forms know to try such tricks, but the typical first-time student of calculus is mystified by that mysterious step.  Using linear approximations, there is absolutely no mystery at all.

The quotient rule is next:

\left(\dfrac fg\right)^{\!\!\!'}\!(x)=\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.

First approximate

\dfrac{f(x+h)}{g(x+h)}\approx\dfrac{f(x)+hf'(x)}{g(x)+hg'(x)}.

Now since h is small, we approximate

\dfrac1{g(x)+hg'(x)}\approx\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right),

so that

\dfrac{f(x+h)}{g(x+h)}\approx(f(x)+hf'(x))\cdot\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right).

Multiplying out and keeping just the first-order terms results in

\dfrac{f(x+h)}{g(x+h)}\approx \dfrac{f(x)}{g(x)}+h\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.

Voila!  The quotient rule.  Now usual proofs involve (1) using the product rule with f(x) and 1/g(x), but note that this involves using the chain rule to differentiate 1/g(x);  or (2) the mysterious “adding and subtracting the same expression” in the numerator.  Using linear approximations avoids both.
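For anyone who wants to see the multiplication carried out, here is the full expansion, with the second-order term that gets dropped shown explicitly:

```latex
(f(x)+hf'(x))\cdot\dfrac1{g(x)}\left(1-h\dfrac{g'(x)}{g(x)}\right)
  =\dfrac{f(x)}{g(x)}
   +h\left(\dfrac{f'(x)}{g(x)}-\dfrac{f(x)g'(x)}{g(x)^2}\right)
   -h^2\dfrac{f'(x)g'(x)}{g(x)^2},

\dfrac{f'(x)}{g(x)}-\dfrac{f(x)g'(x)}{g(x)^2}
  =\dfrac{g(x)f'(x)-f(x)g'(x)}{g(x)^2}.
```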

The chain rule is almost ridiculously easy to prove using linear approximations.  Begin by approximating

f(g(x+h))\approx f(g(x)+hg'(x)).

Note that we’re replacing the argument to a function with its linear approximation, but since we assume that f is differentiable, it is also continuous, so this poses no real problem.  Yes, perhaps there is a little hand-waving here, but in my opinion, no rigor is really lost.

Since g is differentiable, then g'(x) exists, and so we can make hg'(x) as small as we like, so the “hg'(x)” term acts like the “h” term in our linear approximation.  Additionally, the “g(x)” term acts like the “x” term, resulting in

f(g(x+h))\approx f(g(x))+hg'(x)f'(g(x)).

Reading off the coefficient of h gives the chain rule:

(f\circ g)'(x)=f'(g(x))g'(x).
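A small numerical sketch of this approximation, using the sample pair f(u)=\sin(u) and g(x)=x^2.  These two functions, the point x=1.2, and the helper names are purely illustrative choices.

```python
import math

def g(x):
    return x * x

def g_prime(x):
    return 2.0 * x

def f(u):
    return math.sin(u)

def f_prime(u):
    return math.cos(u)

def chain_linear_model(x, h):
    """f(g(x)) + h g'(x) f'(g(x)), the first-order model of f(g(x + h))."""
    return f(g(x)) + h * g_prime(x) * f_prime(g(x))

x = 1.2
steps = [1e-1, 1e-2, 1e-3]
gaps = [abs(f(g(x + h)) - chain_linear_model(x, h)) for h in steps]

for h, gap in zip(steps, gaps):
    print(h, gap, gap / h)
```

The last column, the gap divided by h, should shrink toward 0 as h does, which is all the argument requires.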

So I’ve said my piece.  By this time, you’re either convinced that using linear approximations is a good idea, or you’re not.  But I think these methods reflect more accurately the intuition behind the calculations — and reflect what mathematicians do in practice.

In addition, using linear approximations involves more than just mechanically applying formulas.  If all you ever do is apply the product, quotient, and chain rules, it’s just mechanics.  Using linear approximations requires a bit more understanding of what’s really going on underneath the hood, as it were.

If you find more neat examples of differentiation using this method, please comment!  I know I’d be interested, and I’m sure others would as well.

In my next installment (or two or three) in this calculus series, I’ll talk about one of my favorite topics — hyperbolic trigonometry.

Calculus: Linear Approximations, I

Last week’s post on the Geometry of Polynomials generated a lot of interest from folks who are interested in or teach calculus.  So I thought I’d start a thread about other ideas related to teaching calculus.

This idea is certainly not new.  But I think it is sorely underexploited in the calculus classroom.  I like it because it reinforces the idea of derivative as linear approximation.

The main idea is to rewrite

\displaystyle\lim_{h\to 0}\dfrac{f(x+h)-f(x)}h=f'(x)

as

f(x+h)\approx f(x)+hf'(x),

with the note that this approximation is valid when h\approx0.  Writing the limit in this way, we see that f(x+h), as a function of h, is linear in h in the sense of the limit in the definition actually existing — meaning there is a good linear approximation to f at x.

Moreover, in this sense, if

f(x+h)\approx f(x)+hg(x),

then it must be the case that f'(x)=g(x).  This is not difficult to prove.
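Here is a sketch of that proof, under the assumption that "\approx" is read as "equal up to a remainder r(h) that vanishes faster than h" (the symbol r(h) is introduced just for this argument):

```latex
f(x+h)=f(x)+hg(x)+r(h),\qquad \lim_{h\to0}\dfrac{r(h)}h=0
\quad\Longrightarrow\quad
\lim_{h\to0}\dfrac{f(x+h)-f(x)}h
  =\lim_{h\to0}\left(g(x)+\dfrac{r(h)}h\right)
  =g(x).
```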

Let’s look at a simple example, like finding the derivative of f(x)=x^2.  It’s easy to see that

f(x+h)=(x+h)^2=x^2+h(2x)+h^2.

So it’s easy to read off the derivative: ignore higher-order terms in h, and then look at the coefficient of h as a function of x.

Note that this is perfectly rigorous.  It should be clear that ignoring higher-order terms in h is fine since when taking the limit as in the definition, only one h divides out, meaning those terms contribute 0 to the limit.  So the coefficient of h will be the only term to survive the limit process.

Also note that this is nothing more than a rearrangement of the algebra necessary to compute the derivative using the usual definition.  I just find it is more intuitive, and less cumbersome notationally.  But every step taken can be justified rigorously.

Moreover, this method is the one commonly used in more advanced mathematics, where  functions take vectors as input.  So if

f({\bf v})={\bf v}\cdot{\bf v},

we compute

f({\bf u}+h{\bf v})={\bf u}\cdot{\bf u}+2h{\bf u}\cdot{\bf v}+h^2{\bf v}\cdot{\bf v},

and read off

\nabla_{\bf v}f({\bf u})=2{\bf u}\cdot{\bf v}.

I don’t want to go into more details here, since such calculations don’t occur in beginning calculus courses.  I just want to point out that this way of computing derivatives is in fact a natural one, but one which you don’t usually encounter until graduate-level courses.
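For readers who would still like to see numbers, here is a minimal sketch of the vector computation with plain Python lists.  The vectors u and v are arbitrary sample values and the helper names are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def f(v):
    """f(v) = v . v"""
    return dot(v, v)

def directional_difference(u, v, h):
    """(f(u + h v) - f(u)) / h, which equals 2 u.v + h v.v for this f."""
    shifted = [ui + h * vi for ui, vi in zip(u, v)]
    return (f(shifted) - f(u)) / h

u = [1.0, -2.0, 0.5]
v = [0.3, 0.4, -1.0]
predicted = 2.0 * dot(u, v)   # the coefficient of h read off above

for h in (1e-1, 1e-3, 1e-6):
    print(h, directional_difference(u, v, h), predicted)
```

The difference quotient differs from the predicted value by h\,{\bf v}\cdot{\bf v}, which is exactly the second-order term ignored when reading off the derivative.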

Let’s take a look at another example:  the derivative of f(x)=\sin(x), and see how it looks using this rewrite.  We first write

\sin(x+h)=\sin(x)\cos(h)+\cos(x)\sin(h).

Now replace all functions of h with their linear approximations.  Since \cos(h)\approx1 and \sin(h)\approx h near h=0, we have

\sin(x+h)\approx\sin(x)+h\cos(x).

This immediately gives that \cos(x) is the derivative of \sin(x).

Now the approximation \cos(h)\approx1 is easy to justify geometrically by looking at the graph of \cos(x).  But how do we justify the approximation \sin(h)\approx h?

Of course there is no getting around this.  The limit

\displaystyle\lim_{h\to0}\dfrac{\sin(h)}h

is the one difficult calculation in computing the derivative of \sin(x).  So then you’ve got to provide your favorite proof of this limit, and then move on.  But this approximation helps to illustrate the essential point:  the differentiability of \sin(x) at x=0 does, in a real sense, imply the differentiability of \sin(x) everywhere else.

So computing derivatives in this way doesn’t save any of the hard work, but I think it makes the work a bit more transparent.  And as we continually replace functions of h with their linear approximations, this aspect of the derivative is regularly being emphasized.

How would we use this technique to differentiate f(x)=\sqrt x?  We need

\sqrt{x+h}\approx\sqrt x+hf'(x),

and so

x+h\approx \left(\sqrt x+hf'(x)\right)^2\approx x+2h\sqrt xf'(x).

Since the coefficient of h on the left is 1, so must be the coefficient on the right, so that

2\sqrt xf'(x)=1.

As a last example for this week, consider taking the derivative of f(x)=\tan(x).  Then we have

\tan(x+h)=\dfrac{\tan(x)+\tan(h)}{1-\tan(x)\tan(h)}.

Now since \sin(h)\approx h and \cos(h)\approx 1, we have \tan(h)\approx h, and so we can replace to get

\tan(x+h)\approx\dfrac{\tan(x)+h}{1-h\tan(x)}.

Now what do we do?  Since we’re considering h near 0, then h\tan(x) is small (as small as we like), and so we can consider

\dfrac1{1-h\tan(x)}

as the sum of the infinite geometric series

\dfrac1{1-h\tan(x)}=1+h\tan(x)+h^2\tan^2(x)+\cdots

Replacing this sum with its linear approximation, we get

\tan(x+h)\approx(\tan(x)+h)(1+h\tan(x)),

and so

\tan(x+h)\approx\tan(x)+h(1+\tan^2(x)).

This gives the derivative of \tan(x) to be

1+\tan^2(x)=\sec^2(x).

Neat!

Now this method takes a bit more work than just using the quotient rule (as usually done).  But using the quotient rule is a purely mechanical process; this way, we are constantly thinking, “How do I replace this expression with a good linear approximation?”  Perhaps more is learned this way?

There are more interesting examples using this geometric series idea.  We’ll look at a few more next time, and then use this idea to prove the product, quotient, and chain rules.  Until then!

The Geometry of Polynomials

I recently needed to make a short demo lecture, and I thought I’d share it with you.  I’m sure I’m not the first one to notice this, but I hadn’t seen it before and I thought it was an interesting way to look at the behavior of polynomials where they cross the x-axis.

The idea is to give a geometrical meaning to an algebraic procedure:  factoring polynomials.  What is the geometry of the different factors of a polynomial?

Let’s look at an example in some detail:  f(x)=2(x-4)(x-1)^2.

poly0b

Now let’s start looking at the behavior near the roots of this polynomial.

poly0c

Near x=1, the graph of the cubic looks like a parabola — and that may not be so surprising given that the factor (x-1) occurs quadratically.

poly0d

And near x=4, the graph passes through the x-axis like a line — and we see a linear factor of (x-4) in our polynomial.

But which parabola, and which line?  It’s actually pretty easy to figure out.  Here is an annotated slide which illustrates the idea.

Day137poly1

All you need to do is set aside the quadratic factor of (x-1)^2, and substitute the root, x=1, in the remaining terms of the polynomial, then simplify.  In this example, we see that the cubic behaves like the parabola y=-6(x-1)^2 near the root x=1. Note the scales on the axes; if they were the same, the parabola would have appeared much narrower.

We perform a similar calculation at the root x=4.

Day137poly2

Just isolate the linear factor (x-4), substitute x=4 in the remaining terms of the polynomial, and then simplify.  Thus, the line y=18(x-4) best describes the behavior of the graph of the polynomial as it passes through the x-axis.  Again, note the scale on the axes.
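The recipe is mechanical enough to sketch in a few lines of code.  The representation below, a leading coefficient plus a dictionary mapping each root to its multiplicity, and the function names are assumptions made for illustration.

```python
def local_coefficient(leading, roots, a):
    """For a factored polynomial leading * prod((x - r)^m), return (c, n)
    such that c (x - a)^n is the approximation near the root a:
    set aside the factor (x - a)^n and substitute x = a everywhere else."""
    coefficient = leading
    for r, m in roots.items():
        if r != a:
            coefficient *= (a - r) ** m
    return coefficient, roots[a]

def evaluate(leading, roots, x):
    """Evaluate the factored polynomial at x."""
    value = leading
    for r, m in roots.items():
        value *= (x - r) ** m
    return value

roots = {4: 1, 1: 2}                       # f(x) = 2 (x - 4) (x - 1)^2
print(local_coefficient(2, roots, 1))      # (-6, 2): parabola -6 (x - 1)^2
print(local_coefficient(2, roots, 4))      # (18, 1): line 18 (x - 4)

# Near x = 1 the ratio of f to its local model should be close to 1.
x = 1.001
print(evaluate(2, roots, x) / (-6 * (x - 1) ** 2))
```

The final ratio is a quick way to see how good the approximation is: it approaches 1 as x approaches the root.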

We can actually use this idea to help us sketch graphs of polynomials when they’re in factored form.  Consider the polynomial f(x)=x(x+1)^2(x-2)^3.  Begin by sketching the three approximations near the roots of the polynomial.  This slide also shows the calculation for the cubic approximation.

Day137poly3.png

Now you can begin sketching the graph, starting from the left, being careful to closely follow the parabola as you bounce off the x-axis at x=-1.

poly1d

Continue, following the red line as you pass through the origin, and then the cubic as you pass through x=2.  Of course you’d need to plot a few points to know just where to start and end; this just shows how you would use the approximations near the roots to help you sketch a graph of a polynomial.

poly1f

Why does this work?  It is not difficult to see, but here we need a little calculus.  Let’s look, in general, at the behavior of f(x)=p(x)(x-a)^n near the root x=a.  Given what we’ve just been observing, we’d guess that the best approximation near x=a would just be y=p(a)(x-a)^n.

Just what does “best approximation” mean?  One way to think about approximating, calculuswise, is matching derivatives — just think of Maclaurin or Taylor series.  My claim is that the first n derivatives of f(x)=p(x)(x-a)^n and y=p(a)(x-a)^n match at x=a.

First, observe that the first n-1 derivatives of both of these functions at x=a must be 0.  This is because (x-a) will always be a factor — since at most n-1 derivatives are taken, there is no way for the (x-a)^n term to completely “disappear.”

But what happens when the nth derivative is taken?  Clearly, the nth derivative of p(a)(x-a)^n at x=a is just n!p(a).  What about the nth derivative of f(x)=p(x)(x-a)^n?

Thinking about the product rule in general, we see that the form of the nth derivative must be f^{(n)}(x)=n!p(x)+ (x-a)(\text{terms involving derivatives of } p(x)). When a derivative of p(x) is taken, that means one factor of (x-a) survives.

So when we take f^{(n)}(a), we also get n!p(a).  This makes the nth derivatives match as well.  And since the first n derivatives of p(x)(x-a)^n and p(a)(x-a)^n match, we see that p(a)(x-a)^n is the best nth degree approximation near the root x=a.

I might call this observation the geometry of polynomials. Well, perhaps not the entire geometry of polynomials….  But I find that any time algebra can be illustrated graphically, students’ understanding gets just a little deeper.

Those who have been reading my blog for a while will be unsurprised at my geometrical approach to algebra (or my geometrical approach to anything, for that matter).  Of course a lot of algebra was invented just to describe geometry — take the Cartesian coordinate plane, for instance.  So it’s time for algebra to reclaim its geometrical heritage.  I shall continue to be part of this important endeavor, for however long it takes….

The Puzzle Archives, II

This week, I’ll continue with some more problems from the contests for the 2014 conference of the International Group for Mathematical Creativity and Giftedness.  We’ll look at problems from the Intermediate Contest today.  Recall that the first three problems on all contests were the same; you can find them here.

The first problem I’ll share is a “ball and urn” problem.  These are a staple of mathematical contests everywhere.

You have 20 identical red balls and 14 identical green balls. You wish to put them into two baskets — one brown basket, and one yellow basket. In how many different ways can you do this if the number of green balls in either basket is less than the number of red balls?

Another popular puzzle idea is to write a problem or two which involve the year of the contest — in this case, 2014.

A positive integer is said to be fortunate if it is either divisible by 14, or contains the two adjacent digits “14” (in that order). How many fortunate integers n are there between 1 and 2014, inclusive?

The other two problems from the contest I’ll share with you today are from other contests shared with me by my colleagues.

In the figure below, the perimeters of three rectangles are given. You also know that the shaded rectangle is in fact a square. What is the perimeter of the rectangle in the lower left-hand corner?

Day136problem

I very much like this last problem.  It’s one of those problems that when you first look at it, it seems totally impossible — how could you consider all multiples of 23?  Nonetheless, there is a way to look at it and find the correct solution.  Can you find it?

Multiples of 23 have various digit sums. For example, 46 has digit sum 10, while 8 x 23 = 184 has digit sum 13. What is the smallest possible digit sum among all multiples of 23?

You can read more to see the solutions to these puzzles.  Enjoy!
