-------------------------------------------------------------------------- Welcome to the Fourth Act of Lecture 4 of Notes on Financial Mathematics, by Scot Adams and Fernando Reitich. Remember -------------------------------------------------------------------------- Kyle who wants to buy a -------------------------------------------------------------------------- call option on -------------------------------------------------------------------------- 5,000 shares of ABC at a strike of -------------------------------------------------------------------------- 5,000 dollars with a term of -------------------------------------------------------------------------- 30 days. -------------------------------------------------------------------------- Gail sells this option. We -------------------------------------------------------------------------- assume a spot price of $1 per share. We also assume that -------------------------------------------------------------------------- the stock ticks up or down a small amount each second, with uptick and downtick factors -------------------------------------------------------------------------- as given. We also assume -------------------------------------------------------------------------- this one-second risk-free factor. Our -------------------------------------------------------------------------- goal is to find the right, or -------------------------------------------------------------------------- arbitrage-free price. As usual, we'll begin by working out the -------------------------------------------------------------------------- payoff function, which we'll denote by -------------------------------------------------------------------------- f(S), where S is the final share price. *If* Kyle exercises, then Gail will provide him with -------------------------------------------------------------------------- 5,000 shares for -------------------------------------------------------------------------- 5,000 dollars. 
At share price S, the 5,000 shares cost Gail -------------------------------------------------------------------------- 5,000 S dollars, but that cost is offset by the -------------------------------------------------------------------------- 5,000 dollar strike price that Kyle pays her to exercise. Of course, it's reasonable to assume Kyle will only exercise if the net amount he'll receive -------------------------------------------------------------------------- is positive. After all, who would choose to lose money, given the *option* not to? As usual, once we have a payoff function, we have an -------------------------------------------------------------------------- exercise: Graph it. The various numbers we're handling here are quite long, and so we'll give each a name. For example, there's the -------------------------------------------------------------------------- number of seconds in 30 days, which comes to -------------------------------------------------------------------------- more than 2 and a half million. We'll call this number -------------------------------------------------------------------------- N. Next, there's the -------------------------------------------------------------------------- uptick factor, which we'll call -------------------------------------------------------------------------- u, and the -------------------------------------------------------------------------- downtick factor, which we'll call -------------------------------------------------------------------------- d, and the -------------------------------------------------------------------------- risk-free factor, which we'll call -------------------------------------------------------------------------- rho. We'll denote the one-second interest rate by -------------------------------------------------------------------------- iota, which is just -------------------------------------------------------------------------- the risk-free factor minus 1. 
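
For listeners who like to check things on a computer, here is a minimal Python sketch of the payoff function and of the number N. The names `SHARES`, `STRIKE_TOTAL` and `payoff` are our own labels for this illustration, not notation from the slides.

```python
# Kyle's call option: 5,000 shares for a total strike of 5,000 dollars.
SHARES = 5_000          # number of shares covered by the option
STRIKE_TOTAL = 5_000.0  # total dollars Kyle pays Gail if he exercises


def payoff(S: float) -> float:
    """Payoff f(S) at final share price S.

    Kyle exercises only when 5,000 S - 5,000 is positive;
    otherwise he walks away and the payoff is zero.
    """
    return max(SHARES * S - STRIKE_TOTAL, 0.0)


# N is the number of one-second subperiods in the 30 day term.
N = 30 * 24 * 60 * 60

print(N)            # 2592000, more than 2 and a half million
print(payoff(1.5))  # 2500.0: Kyle exercises
print(payoff(0.5))  # 0.0: Kyle lets the option expire
```

Graphing `payoff` over a range of final prices gives the usual hockey-stick shape, flat at zero up to S = 1 and rising with slope 5,000 after that.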
We record that -------------------------------------------------------------------------- 1 plus iota is rho, that is, 1 plus the interest rate is the risk-free factor. Okay. We next work out the risk-neutral uptick and downtick probabilities. -------------------------------------------------------------------------- On a number line, we plot the -------------------------------------------------------------------------- uptick and downtick factors, and the -------------------------------------------------------------------------- risk-free factor. For pedagogical purposes, I chose these numbers so that -------------------------------------------------------------------------- rho is *exactly* halfway between $d$ and $u$, which means that -------------------------------------------------------------------------- the risk-neutral uptick and downtick probabilities are both -------------------------------------------------------------------------- 50 percent. In the -------------------------------------------------------------------------- 50-50 world, we can calculate expected returns of -------------------------------------------------------------------------- stock (pause...............) and -------------------------------------------------------------------------- bank. First, for the -------------------------------------------------------------------------- bank, the (pause...............) -------------------------------------------------------------------------- expected value is -------------------------------------------------------------------------- 50 percent of rho L plus -------------------------------------------------------------------------- 50 percent of rho L, which is (pause...............) -------------------------------------------------------------------------- 100 percent of rho L. 
A loan of -------------------------------------------------------------------------- $L$ will grow to -------------------------------------------------------------------------- rho L, after one second. The increase is -------------------------------------------------------------------------- rho L minus L, or -------------------------------------------------------------------------- (rho minus 1) times L, or -------------------------------------------------------------------------- iota times L, so the -------------------------------------------------------------------------- expected return is -------------------------------------------------------------------------- iota. For the (pause...............) -------------------------------------------------------------------------- stock, the -------------------------------------------------------------------------- expected value is -------------------------------------------------------------------------- 50% of uS plus -------------------------------------------------------------------------- 50% of dS. Because -------------------------------------------------------------------------- rho is exactly halfway between -------------------------------------------------------------------------- d and u, we get an -------------------------------------------------------------------------- expected value of rho S. 
So -------------------------------------------------------------------------- S dollars invested in the stock for one second can be *expected*, on average, in the risk-neutral world, to increase to -------------------------------------------------------------------------- rho S, so the expected amount of increase here is -------------------------------------------------------------------------- rho S minus S, or -------------------------------------------------------------------------- (rho minus 1) times S, or -------------------------------------------------------------------------- iota times S, which gives an -------------------------------------------------------------------------- expected return of -------------------------------------------------------------------------- iota. The risk-neutral world, in this problem, is a -------------------------------------------------------------------------- 50-50 world, and what's special about these probabilities is *not* that we necessarily believe that they're the real-world probabilities. What's special in this imaginary risk-neutral world is that -------------------------------------------------------------------------- these two expected returns are equal, so that any self-financed portfolio of -------------------------------------------------------------------------- stock and bank will have -------------------------------------------------------------------------- an expected return of iota each second. In other words, the expected value of any self-financed portfolio will grow by a factor of -------------------------------------------------------------------------- rho each second. Okay. Because of the size of the number N of subperiods, it's infeasible to use templates, as we did in the past, but we'll make a bit of an attempt anyway. 
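
As a sanity check on the 50-50 claim, here is a small Python sketch. The values of `u` and `d` below are hypothetical, chosen only for illustration; the actual tick factors are on the slide and are not reproduced here. The formula (rho minus d) over (u minus d) for the risk-neutral uptick probability is the standard one for a one-period model, and we assume it matches the number-line picture.

```python
def risk_neutral_uptick_probability(u: float, d: float, rho: float) -> float:
    """Position of rho between d and u, as a fraction of the way from d to u."""
    return (rho - d) / (u - d)


# Hypothetical tick factors (NOT the values from the slides).
u = 1.0003
d = 0.9999
rho = (u + d) / 2   # rho chosen exactly halfway between d and u
iota = rho - 1      # one-second interest rate

q = risk_neutral_uptick_probability(u, d, rho)

# Expected one-second values in the risk-neutral world.
S = 1.0  # dollars in stock
L = 1.0  # dollars in the bank
expected_stock = q * (u * S) + (1 - q) * (d * S)
expected_bank = q * (rho * L) + (1 - q) * (rho * L)

stock_return = (expected_stock - S) / S
bank_return = (expected_bank - L) / L

print(q)             # 0.5, up to floating-point rounding
print(stock_return)  # iota, up to floating-point rounding
print(bank_return)   # iota, up to floating-point rounding
```

With any other choice of `u` and `d`, the two expected returns still agree once `q` is computed this way; what the halfway choice buys us is that `q` is exactly one half.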
We'll render, in black, the evolution of the underlying -------------------------------------------------------------------------- ABC share price, which starts out at a spot price of -------------------------------------------------------------------------- 1 dollar per share, and then evolves by the uptick or downtick factor to -------------------------------------------------------------------------- $u$ or $d$, over the first second. Over the *second* second, it again changes by a factor of $u$ or $d$ and -------------------------------------------------------------------------- these are the possible share prices after two seconds. We invite the interested listener to -------------------------------------------------------------------------- review the values of $u$ and $d$ and to compute -------------------------------------------------------------------------- $u$ squared, $ud$ and $d$ squared. At the end of the third second, the price may hit one of -------------------------------------------------------------------------- four possible values. This -------------------------------------------------------------------------- continues on for N steps, and N is so large that there's -------------------------------------------------------------------------- NO WAY we can show you all of them, but, if we -------------------------------------------------------------------------- continue to the ending share prices, we have many possibilities ranging from -------------------------------------------------------------------------- u to the N, representing N upticks, down to -------------------------------------------------------------------------- d to the N, representing N downticks. Another two possible ending share prices are -------------------------------------------------------------------------- u to the (N minus 1) d, -------------------------------------------------------------------------- u to the (N minus 2) d squared. 
Of course, we -------------------------------------------------------------------------- can't list them all. We now move from the price of the underlying ABC, rendered in black, to the -------------------------------------------------------------------------- contingent claim, which we'll render in red. Remember that the contingent claim is the amount that Gail needs, in her hedge, to meet her obligation to Kyle, at the end of the 30 day term. That is, it's the ending value of the derivative. The connection between the underlying and the derivative is the payoff function -------------------------------------------------------------------------- f. That is, to get the contingent claim, we plug the -------------------------------------------------------------------------- ending underlying prices into the payoff function, f, -------------------------------------------------------------------------- like so. We want to price the derivative, so we want to move backward in time, to get from the contingent claim back to the initial price of the option. The inefficient way to do this is to solve huge numbers of equations in huge numbers of unknowns. The clever and efficient way to get the result is to move to -------------------------------------------------------------------------- the risk-neutral world, where any self-financed portfolio has an expected return of iota per second, and so its expected value grows by a factor of rho each second. The -------------------------------------------------------------------------- price, P, of the option is just the amount Gail needs in order to set up the hedge, so it's -------------------------------------------------------------------------- the initial value of the hedge. Gail never steals from the hedge, and never puts money into it. She simply adjusts it. That is, the hedge is self-financing. 
In the risk-neutral world, the expected value grows by a factor of rho each second, so, since there are N seconds in 30 days, it grows by a factor of -------------------------------------------------------------------------- rho to the N over the 30 day term. So -------------------------------------------------------------------------- rho to the N times P, -------------------------------------------------------------------------- is the expected final value of the hedge. The collection of possible final values of the hedge *is* the -------------------------------------------------------------------------- contingent claim, so we seek to compute -------------------------------------------------------------------------- the expected contingent claim, working in the -------------------------------------------------------------------------- risk-neutral world. To compute this expected value, we should find the probability of -------------------------------------------------------------------------- N upticks and 0 downticks, and multiply it by -------------------------------------------------------------------------- f of u to the N. We should then find the probability of -------------------------------------------------------------------------- N-1 upticks and 1 downtick, and multiply *it* by -------------------------------------------------------------------------- f of u to the (N minus 1) d. We should then find the probability of -------------------------------------------------------------------------- N-2 upticks and 2 downticks, and multiply *it* by -------------------------------------------------------------------------- f of u to the (N minus 2) d squared. 
We should then -------------------------------------------------------------------------- continue on, until we finally find the probability of -------------------------------------------------------------------------- 0 upticks and N downticks, and multiply *it* by -------------------------------------------------------------------------- f of d to the N. Adding all those results gives the -------------------------------------------------------------------------- expected contingent claim, from which we can get the -------------------------------------------------------------------------- price of the option. Remembering that "Coin-Flippers got Price", we seek to relate this huge pricing computation to a -------------------------------------------------------------------------- coin-flipping game. Since the risk-neutral world is, here, a 50-50 world, we -------------------------------------------------------------------------- flip a fair (or 50-50) coin, and we do it N times. The payoff rule for this coin-flipping game is as follows: If, in those N flips, we see -------------------------------------------------------------------------- H heads and T tails, then we -------------------------------------------------------------------------- pay out f of u to the H d to the T. Moreover, this payoff is done -------------------------------------------------------------------------- 30 days from now. We've set up this coin-flipping game so that its -------------------------------------------------------------------------- expected payout, which we call E, is exactly the same as the -------------------------------------------------------------------------- expected contingent claim, which, in turn, is equal to -------------------------------------------------------------------------- rho to the N P, and so we have -------------------------------------------------------------------------- rho to the N P here, as well. 
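
To make the coin-flipping game concrete, here is a Python sketch that computes the expected payout E by brute force for a tiny number of flips. Everything numerical here is an assumption for illustration: we use 3 flips rather than the real N, and hypothetical tick factors much larger than one-second ticks would be. The name `expected_payout` is ours.

```python
from math import comb


def payoff(S: float) -> float:
    """Payoff of Kyle's option at final share price S (spot price is 1 dollar)."""
    return max(5_000 * S - 5_000, 0.0)


def expected_payout(f, u: float, d: float, n_flips: int) -> float:
    """Expected payout of the fair-coin game with n_flips flips.

    With H heads and T = n_flips - H tails, the game pays f(u**H * d**T).
    The probability of exactly H heads is comb(n_flips, H) / 2**n_flips.
    """
    return sum(
        comb(n_flips, H) * 0.5**n_flips * f(u**H * d ** (n_flips - H))
        for H in range(n_flips + 1)
    )


# Hypothetical values (NOT the values from the slides), with rho halfway.
u, d = 1.10, 0.95
rho = (u + d) / 2
n_flips = 3

E = expected_payout(payoff, u, d, n_flips)
P = E / rho**n_flips  # since rho to the n_flips times P equals E

print(E)  # about 487.19
print(P)  # about 452.40
```

For the real N of more than 2 and a half million, this sum has millions of terms with enormous binomial coefficients, which is why a direct computation like this one is not the route we take.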
Our goal is to price this option, so we want to compute the price -------------------------------------------------------------------------- P, and now we see that it's equal to -------------------------------------------------------------------------- rho to the minus N times E. Remember that rho is -------------------------------------------------------------------------- 1 plus iota. Since -------------------------------------------------------------------------- (1 plus iota) to the minus N is the 30 day discount factor on the cost of money, we see that the option price is, as usual, -------------------------------------------------------------------------- the discounted expected payout for the coin-flipping game. All this is well and good, but we're still faced with the seemingly daunting problem of computing -------------------------------------------------------------------------- this expected payout, and there's the rub. Fortunately, we'll see that we have a nice little fact called the Central Limit Theorem that'll bring us home. We can highlight all the main ideas in the Central Limit Theorem by doing an -------------------------------------------------------------------------- easier problem than the pricing problem. We'll get back to pricing in a moment, but let's cut our teeth on -------------------------------------------------------------------------- computing the *probability* that -------------------------------------------------------------------------- the number of heads minus the number of tails is between -------------------------------------------------------------------------- minus root N and -------------------------------------------------------------------------- plus root N. 
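
For moderate numbers of flips, the easier problem can be done exactly by counting. The Python sketch below is one way to do that, under the assumption that "between" includes the endpoints; the function name is ours. It is only meant to build intuition, since the exact count becomes unwieldy as the number of flips grows.

```python
from math import comb


def prob_difference_within_root_n(n_flips: int) -> float:
    """Probability that heads minus tails lies between -sqrt(n) and +sqrt(n).

    With H heads out of n_flips, the difference is H - T = 2*H - n_flips.
    We compare squares so that the test uses exact integer arithmetic.
    """
    favorable = 0
    for H in range(n_flips + 1):
        difference = 2 * H - n_flips
        if difference * difference <= n_flips:
            favorable += comb(n_flips, H)
    return favorable / 2**n_flips


print(prob_difference_within_root_n(4))       # 0.875
print(prob_difference_within_root_n(100))
print(prob_difference_within_root_n(10_000))
```

As the number of flips grows, the printed values settle down to roughly 0.68, which is the kind of stability the Central Limit Theorem explains.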
Our plan is that, once we gain some experience with -------------------------------------------------------------------------- probability calculation, we can move on to calculating some easy expected values and then, gradually, work our way up to the expected value, E, of that coin-flipping payout from the last slide. Okay. Let -------------------------------------------------------------------------- X be (H minus T) over root N. We want to -------------------------------------------------------------------------- compute the probability that -------------------------------------------------------------------------- this is true, but, if we divide -------------------------------------------------------------------------- this inequality by root N, then we get -------------------------------------------------------------------------- this simpler-looking inequality. So we still have our -------------------------------------------------------------------------- easier problem, but it's been slightly restated. Next, let -------------------------------------------------------------------------- H_1 be the number of heads after the first flip, so H_1 may be 0 or 1 depending on whether the first flip comes up tails or heads. Let -------------------------------------------------------------------------- H_2 be the number of heads after the second flip, so H_2 may be 0 or 1 or 2. -------------------------------------------------------------------------- We continue on to -------------------------------------------------------------------------- H_N, which is the number of heads after the Nth flip, and it could be any integer from 0 to N. H_N is, in fact, the same as -------------------------------------------------------------------------- H, the number of heads after the N flips of the coin. 
So, -------------------------------------------------------------------------- for every integer, j, from 1 to N, we've let -------------------------------------------------------------------------- H_j be the number of heads after the jth flip, and we noted that -------------------------------------------------------------------------- H is H_N. We also define -------------------------------------------------------------------------- T_j to be the number of tails after the jth flip, -------------------------------------------------------------------------- so T is T_N. Let's also define -------------------------------------------------------------------------- D_j to be the difference H_j minus T_j. Now, -------------------------------------------------------------------------- X is (H minus T) over root N. It's therefore -------------------------------------------------------------------------- (H_N minus T_N) over root N, which is -------------------------------------------------------------------------- D_N over root N. Our -------------------------------------------------------------------------- easier problem involves -------------------------------------------------------------------------- X, but we propose to work our way up to X. In the next few slides, we'll first study -------------------------------------------------------------------------- D_1, (pause...............) then -------------------------------------------------------------------------- D_2. Then we'll boldly jump to -------------------------------------------------------------------------- D_N, and, finally, -------------------------------------------------------------------------- D_N over root N, which is -------------------------------------------------------------------------- X. 
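
To keep the notation straight, here is a short Python sketch that tracks H_j, T_j and D_j along one particular sequence of flips and then forms X. The flip sequence "HTHH" is an arbitrary example, and `running_counts` is a name made up for this illustration.

```python
from math import sqrt


def running_counts(flips: str):
    """Return the lists [H_1..H_N], [T_1..T_N], [D_1..D_N] for a flip string.

    `flips` is a string of 'H' and 'T' characters, one per flip.
    """
    heads_so_far = 0
    tails_so_far = 0
    H_list, T_list, D_list = [], [], []
    for flip in flips:
        if flip == "H":
            heads_so_far += 1
        else:
            tails_so_far += 1
        H_list.append(heads_so_far)
        T_list.append(tails_so_far)
        D_list.append(heads_so_far - tails_so_far)
    return H_list, T_list, D_list


flips = "HTHH"  # arbitrary example sequence, so N = 4
H_j, T_j, D_j = running_counts(flips)
N = len(flips)
X = D_j[-1] / sqrt(N)  # X is D_N over root N

print(H_j)  # [1, 1, 2, 3]
print(T_j)  # [0, 1, 1, 1]
print(D_j)  # [1, 0, 1, 2]
print(X)    # 1.0
```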
Actually, to understand what happens when we divide by a constant like -------------------------------------------------------------------------- root N, we'll throw in -------------------------------------------------------------------------- D_1 divided by a constant, -------------------------------------------------------------------------- say, by 7. Okay. We begin with the first difference, -------------------------------------------------------------------------- D_1, which is -------------------------------------------------------------------------- the difference between H_1 and T_1. After the first flip, we have either -------------------------------------------------------------------------- 1 head and 0 tails, or -------------------------------------------------------------------------- 0 heads and 1 tail. Moreover, there's a -------------------------------------------------------------------------- 50-50 chance of either. Note that we use an -------------------------------------------------------------------------- up-arrow for heads and a -------------------------------------------------------------------------- down-arrow for tails. We plot the possible results on -------------------------------------------------------------------------- a number line, and, half of the time, D_1 ends up -------------------------------------------------------------------------- 1, whereas the other half, it ends up -------------------------------------------------------------------------- minus 1. We record the ending probabilities -------------------------------------------------------------------------- here. In -------------------------------------------------------------------------- this box, we see data that, together, are called -------------------------------------------------------------------------- the distribution of D_1. 
One sometimes says -------------------------------------------------------------------------- probability distribution, (pause...............) or -------------------------------------------------------------------------- measure, (pause...............) or -------------------------------------------------------------------------- probability measure, but we'll say -------------------------------------------------------------------------- "distribution" in these lectures. -------------------------------------------------------------------------- D_1 is referred to as a -------------------------------------------------------------------------- random variable, because its value varies in a random way. One of our goals, for the next lecture, is to give a firm mathematical definition of a random variable, but, for now, we content ourselves with the following intuitive description: It's -------------------------------------------------------------------------- a variable whose value is determined by random events, like a coin flip. Incidentally, note that -------------------------------------------------------------------------- T_1 minus H_1 has *exactly* the same *distribution* as -------------------------------------------------------------------------- H_1 minus T_1. Both come up -------------------------------------------------------------------------- plus one, half of the time, and -------------------------------------------------------------------------- minus one, the other half. So -------------------------------------------------------------------------- two different random variables can have the same distribution. Next, we simplify the graphic for H_1 minus T_1 -------------------------------------------------------------------------- like this. 
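
The point that two different random variables can share a distribution is easy to see in code. In the Python sketch below, a random variable is modeled, informally, as a function of the outcome of the flip, and its distribution is a table of values and probabilities. This modeling choice and the helper names are ours, pending the firm definition in the next lecture.

```python
from collections import defaultdict


def distribution(random_variable, outcomes):
    """Tabulate value -> probability for a random variable.

    `outcomes` maps each outcome (here 'H' or 'T') to its probability,
    and `random_variable` is a function from outcomes to numbers.
    """
    table = defaultdict(float)
    for outcome, probability in outcomes.items():
        table[random_variable(outcome)] += probability
    return dict(table)


one_fair_flip = {"H": 0.5, "T": 0.5}


def H_1(outcome):
    """Number of heads after the first flip."""
    return 1 if outcome == "H" else 0


def T_1(outcome):
    """Number of tails after the first flip."""
    return 1 if outcome == "T" else 0


dist_heads_minus_tails = distribution(lambda o: H_1(o) - T_1(o), one_fair_flip)
dist_tails_minus_heads = distribution(lambda o: T_1(o) - H_1(o), one_fair_flip)

print(dist_heads_minus_tails)  # {1: 0.5, -1: 0.5}
print(dist_tails_minus_heads)  # {-1: 0.5, 1: 0.5}

# The two random variables disagree on every outcome...
print(all(H_1(o) - T_1(o) != T_1(o) - H_1(o) for o in one_fair_flip))  # True
# ...yet their distributions are the same table.
print(dist_heads_minus_tails == dist_tails_minus_heads)  # True
```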
From the -------------------------------------------------------------------------- distribution, we can proceed to what's called its -------------------------------------------------------------------------- generating function, or, sometimes, its -------------------------------------------------------------------------- "moment generating function". First, we choose a variable, and, in this lecture, we'll use the variable -------------------------------------------------------------------------- "z". For us, a generating function technically -------------------------------------------------------------------------- *won't* be a function, but, rather, an -------------------------------------------------------------------------- expression of z. However, if we start saying "generating expression" in public, we'll get funny looks, so we stick with the traditional term -------------------------------------------------------------------------- "generating function". To compute it, we place -------------------------------------------------------------------------- z to the 1 next to 1, and -------------------------------------------------------------------------- z to the minus 1 next to minus 1. We record that -------------------------------------------------------------------------- z to the 1 is z. By definition, to get the generating function of -------------------------------------------------------------------------- this distribution, we take -------------------------------------------------------------------------- this times -------------------------------------------------------------------------- this (pause...............) plus -------------------------------------------------------------------------- this times -------------------------------------------------------------------------- this, (pause...............) which gives -------------------------------------------------------------------------- this expression. 
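
Here is a small Python sketch of that recipe, assuming we store a distribution as a table of values and probabilities. The function name `generating_function` is ours; it simply multiplies each probability by z raised to the corresponding value and adds up the results.

```python
def generating_function(dist, z):
    """Evaluate the generating function of a distribution at the number z.

    `dist` maps each possible value x to its probability p,
    and the result is the sum of p * z**x over the table.
    """
    return sum(p * z**x for x, p in dist.items())


dist_D1 = {1: 0.5, -1: 0.5}  # distribution of D_1

# 0.5 * z + 0.5 * z**(-1), evaluated at z = 2.0:
print(generating_function(dist_D1, 2.0))  # 1.25

# At z = 1 every power of z is 1, so we just add up the probabilities.
print(generating_function(dist_D1, 1.0))  # 1.0
```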
From the generating function of a distribution, we can proceed to its -------------------------------------------------------------------------- Fourier transform. First, remember that -------------------------------------------------------------------------- i denotes the square root of minus one. We now choose another variable, and, in this lecture, we'll use -------------------------------------------------------------------------- "t". Our Fourier transforms will be expressions of t. Another common variable to use in Fourier transforms is the Greek letter -------------------------------------------------------------------------- "xi", but, in these lectures, we'll stick to -------------------------------------------------------------------------- "t". Remember that, in *this* context, "t" has nothing to do with time. To go from the generating function to the Fourier transform, we -------------------------------------------------------------------------- replace z by e to the minus i t, -------------------------------------------------------------------------- like so. This may all seem a bit desultory, and you may be wondering: Why do all this? Well, first off, we'll see in a few slides that this cookbook definition of Fourier transform is very useful for calculating some difficult probabilities and expected values. Still, cookbook definitions are always a little unsatisfying. Unfortunately, it's beyond the scope of these lectures to go into great detail about motivation, but maybe it'd be good to hint that there's an -------------------------------------------------------------------------- infinite-dimensional version of the Spectral Theorem from Linear Algebra, and all these manipulations are motivated by an infinite-dimensional generalization of simultaneous diagonalization of commuting matrices. 
In the case of -------------------------------------------------------------------------- generating functions and Fourier transforms, we are, in some sense, simultaneously diagonalizing all the translation operators on the L two space of the real numbers. If you should want to learn more about the ideas that underlie these cookbook definitions, the place to start would be a Functional Analysis course that includes the -------------------------------------------------------------------------- Infinite-dimensional Spectral Theorem. In *this* lecture, there's no need to worry about *any* of that. Suffice it to say that we'll see, in a few slides, that these cookbook definitions are *very* useful. First, though, let's do some algebraic simplification. Remember -------------------------------------------------------------------------- this formula, which is easily verified by power series expansion. Replacing t by minus t, we get -------------------------------------------------------------------------- this equation. Remember that cosine of minus t is -------------------------------------------------------------------------- cosine of t whereas sine of minus t is -------------------------------------------------------------------------- the negative of sine of t. Now take -------------------------------------------------------------------------- point five times the first equation, -------------------------------------------------------------------------- and add it (pause...............) to -------------------------------------------------------------------------- point five times the second. 
On the -------------------------------------------------------------------------- left hand side, we get -------------------------------------------------------------------------- this, and, on the -------------------------------------------------------------------------- right hand side, -------------------------------------------------------------------------- these cancel, and we get -------------------------------------------------------------------------- cosine t. So the Fourier transform of the -------------------------------------------------------------------------- distribution of D_1 *is* -------------------------------------------------------------------------- cosine t. (pause...............) Another important concept is the -------------------------------------------------------------------------- inverse Fourier transform. For example, if you tell me you're thinking of a distribution whose Fourier transform is -------------------------------------------------------------------------- cosine t, and if I'm smart enough to know that cosine t is equal to -------------------------------------------------------------------------- this, then I can work my way back to -------------------------------------------------------------------------- the generating function, from which I can see that there are probabilities of -------------------------------------------------------------------------- point 5 at 1 and point 5 at minus 1, thereby recovering, from the -------------------------------------------------------------------------- Fourier transform, -------------------------------------------------------------------------- the distribution from which it came. 
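
A quick numerical check of this computation may be reassuring. The Python sketch below evaluates the Fourier transform of the distribution of D_1 at a few values of t and compares it with cosine t. Replacing z by e to the minus i t turns z to the x into e to the minus i t x, which is what the code uses; the function name is ours.

```python
import cmath
import math


def fourier_transform(dist, t: float) -> complex:
    """Generating function of `dist` with z replaced by exp(-i t).

    Each term p * z**x becomes p * exp(-i * t * x).
    """
    return sum(p * cmath.exp(-1j * t * x) for x, p in dist.items())


dist_D1 = {1: 0.5, -1: 0.5}  # distribution of D_1

for t in (0.0, 0.5, 1.0, 2.0, 3.0):
    transform = fourier_transform(dist_D1, t)
    # The sine terms cancel, so the result should match cos(t).
    print(t, transform, math.cos(t))
```

The imaginary parts printed here are zero, up to rounding, which is the numerical shadow of the sine terms cancelling.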
Of course, I can't tell whether it came from -------------------------------------------------------------------------- H_1 minus T_1, or from -------------------------------------------------------------------------- T_1 minus H_1, or, for that matter, from any other random variable with the same -------------------------------------------------------------------------- distribution as D_1. So we can't work back from the -------------------------------------------------------------------------- Fourier transform all the way to the random variable; we can only get back to its distribution. Next, let's analyze what happens to the Fourier transform if we divide a random variable by a constant. For example, -------------------------------------------------------------------------- what about D_1 (over 7)? What happens to the Fourier transform? To answer this, on the next slide, we'll replace -------------------------------------------------------------------------- D_1 by (D_1) (over 7), and -------------------------------------------------------------------------- eliminate this text. D_1 (over 7) has possible values not -------------------------------------------------------------------------- 1 and minus 1, but one seventh and minus one seventh, so we'll divide these two numbers, -------------------------------------------------------------------------- 1 and minus 1, by 7. Therefore we'll also divide -------------------------------------------------------------------------- this 1 and this minus 1 by 7. It'll be convenient, -------------------------------------------------------------------------- in these two equations, to -------------------------------------------------------------------------- replace t by t over 7. 
Finally, we'll -------------------------------------------------------------------------- eliminate this text, to make room to work out the -------------------------------------------------------------------------- generating function and Fourier transform of (pause...............) -------------------------------------------------------------------------- (D_1) (over 7). To get the generating function, we take -------------------------------------------------------------------------- this times -------------------------------------------------------------------------- this (pause...............) plus -------------------------------------------------------------------------- this times -------------------------------------------------------------------------- this, (pause...............) -------------------------------------------------------------------------- like so. (pause...............) -------------------------------------------------------------------------- Replacing z by e to the minus i t, we get -------------------------------------------------------------------------- the Fourier transform. Averaging -------------------------------------------------------------------------- these two equations, we see that that Fourier transform is -------------------------------------------------------------------------- cosine t over 7. So, to answer the question -------------------------------------------------------------------------- "What about D_1 over 7", we now see that, in terms of the Fourier transform, the answer is -------------------------------------------------------------------------- replace t by t over 7. This, in fact, works quite generally: If you know the Fourier transform of the distribution of a random variable, and you then divide by a constant to get a new random variable, then the new Fourier transform is obtained from the old one by replacing t by t over the constant. (pause...............) 
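The rule "replace t by t over the constant" can be checked numerically. The following is a minimal sketch, assuming a finite distribution is stored as a dictionary from values to probabilities; the helper name `fourier_transform` is hypothetical and follows the convention used above of replacing z by e to the minus i t.

```python
import cmath
import math


def fourier_transform(dist, t):
    """Sum of p * e^{-i t x} over the (value x, probability p) pairs in dist."""
    return sum(p * cmath.exp(-1j * t * x) for x, p in dist.items())


# Distribution of D_1: probability 0.5 at 1 and 0.5 at -1.
d1 = {1: 0.5, -1: 0.5}

# Distribution of D_1 / 7: same probabilities, values divided by 7.
d1_over_7 = {x / 7: p for x, p in d1.items()}

t = 2.0
ft_d1 = fourier_transform(d1, t)
ft_scaled = fourier_transform(d1_over_7, t)

# Each pair of printed numbers should agree up to rounding error.
print(ft_d1.real, math.cos(t))
print(ft_scaled.real, math.cos(t / 7))
```

The value t = 2.0 is arbitrary; any real t should give the same agreement.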
We now move on to -------------------------------------------------------------------------- D_2. For D_2, one has -------------------------------------------------------------------------- two coin flips, and each flip has a -------------------------------------------------------------------------- fifty-fifty chance of heads or tails. We put in -------------------------------------------------------------------------- a number line. In the case of -------------------------------------------------------------------------- 2 heads and 0 tails, D_2 is -------------------------------------------------------------------------- 2. In the case of -------------------------------------------------------------------------- 1 head and 1 tail, D_2 is -------------------------------------------------------------------------- 0. In the case of -------------------------------------------------------------------------- 0 heads and 2 tails, D_2 is -------------------------------------------------------------------------- minus 2. The probability of -------------------------------------------------------------------------- heads-heads is (pause...............) -------------------------------------------------------------------------- point 2 5. The probability of -------------------------------------------------------------------------- heads-tails is (pause...............) -------------------------------------------------------------------------- point 2 5 and of -------------------------------------------------------------------------- tails-heads is (pause...............) -------------------------------------------------------------------------- point 2 5, as well. Then the total probability that D_2 is 0 is -------------------------------------------------------------------------- point 5. Finally, the probability of -------------------------------------------------------------------------- tails-tails is (pause...............) 
-------------------------------------------------------------------------- point 2 5. We therefore represent the distribution of D_2 graphically -------------------------------------------------------------------------- like this. To find the -------------------------------------------------------------------------- generating function, we place -------------------------------------------------------------------------- z to the 2 next to 2, -------------------------------------------------------------------------- z to the 0 next to 0 and -------------------------------------------------------------------------- z to the minus 2 next to minus 2. We record that -------------------------------------------------------------------------- z to the 0 is 1. The generating function is then -------------------------------------------------------------------------- this times this (pause...............) -------------------------------------------------------------------------- plus this times this (pause...............) -------------------------------------------------------------------------- plus this times this. Finally, we get to a key point: *This* generating function is -------------------------------------------------------------------------- a perfect square. In fact, it's the square of (pause...............) -------------------------------------------------------------------------- the generating function of the distribution of D_*1*. (pause...............) 
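Written out, the perfect-square observation reads as follows. This is reconstructed from the narration; G_{D_1} and G_{D_2} are names assumed here for the two generating functions.

```latex
\begin{align*}
G_{D_2}(z) &= \tfrac14 z^{2} + \tfrac12 z^{0} + \tfrac14 z^{-2}
            = \tfrac14 z^{2} + \tfrac12 + \tfrac14 z^{-2}, \\
\bigl(G_{D_1}(z)\bigr)^{2} &= \bigl(\tfrac12 z + \tfrac12 z^{-1}\bigr)^{2}
            = \tfrac14 z^{2} + 2\cdot\tfrac12\cdot\tfrac12 + \tfrac14 z^{-2}
            = G_{D_2}(z).
\end{align*}
```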
To get the -------------------------------------------------------------------------- Fourier transform, we -------------------------------------------------------------------------- replace z by e to the minus i t, and we already know that, when we do that -------------------------------------------------------------------------- here, we get cosine t, so the result for D_*2* will be -------------------------------------------------------------------------- the square of cosine t, or, -------------------------------------------------------------------------- cosine squared t. Okay. We now boldly jump to -------------------------------------------------------------------------- D_N. There is -------------------------------------------------------------------------- no way I can show you its distribution, since that'd involve plotting over 2 and a half million numbers and probabilities on a number line. Looking at the -------------------------------------------------------------------------- generating function, there's -------------------------------------------------------------------------- no way I can show it to you in expanded form, with over 2 and a half million terms. However, because of the magic of generating functions, we *can* see that the answer is -------------------------------------------------------------------------- the Nth power of -------------------------------------------------------------------------- the generating function of the distribution of D_*1*. The -------------------------------------------------------------------------- Fourier transform is, similarly, -------------------------------------------------------------------------- the Nth power of -------------------------------------------------------------------------- the Fourier transform of the distribution of D_*1*. It's -------------------------------------------------------------------------- cosine to the N of t. 
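For a small number of flips, the claim that the Fourier transform of the distribution of D_n is the nth power of cosine can be checked directly. This is a sketch under stated assumptions: the function names are hypothetical, and n = 10 stands in for the real N, which is far too large to tabulate this way.

```python
import cmath
import math
from collections import defaultdict


def add_fair_step(dist):
    """Distribution after one more fair flip: +1 or -1, each with probability 0.5."""
    out = defaultdict(float)
    for x, p in dist.items():
        out[x + 1] += 0.5 * p
        out[x - 1] += 0.5 * p
    return dict(out)


def distribution_of_D(n):
    """Distribution of D_n = heads minus tails after n fair flips."""
    dist = {0: 1.0}
    for _ in range(n):
        dist = add_fair_step(dist)
    return dist


def fourier_transform(dist, t):
    """Sum of p * e^{-i t x} over the (value x, probability p) pairs in dist."""
    return sum(p * cmath.exp(-1j * t * x) for x, p in dist.items())


n, t = 10, 0.7
lhs = fourier_transform(distribution_of_D(n), t)
rhs = math.cos(t) ** n

# The two printed numbers should agree up to rounding error.
print(lhs.real, rhs)
```

Calling `distribution_of_D(2)` reproduces the three-point distribution of D_2 described earlier.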
We're actually interested in the random variable -------------------------------------------------------------------------- X, which, remember, is -------------------------------------------------------------------------- D_N over root N. To find *its* Fourier transform, we -------------------------------------------------------------------------- replace t by t over root N -------------------------------------------------------------------------- here. Let's -------------------------------------------------------------------------- clear some room, and -------------------------------------------------------------------------- make the change. Again, D_N over root N is just -------------------------------------------------------------------------- X. Let's -------------------------------------------------------------------------- clear some room, and -------------------------------------------------------------------------- reformat. Here's the point: Even though it might seem very daunting to understand -------------------------------------------------------------------------- X in terms of its -------------------------------------------------------------------------- *distribution*, the trick of abandoning distributions in favor of -------------------------------------------------------------------------- their Fourier transforms *vastly* simplifies things. This may seem very clever, and you may wonder what mad scientist or mathematician came up with -------------------------------------------------------------------------- generating functions and Fourier transforms. Well, it only *appears* clever. In fact, as we suggested earlier, from the right perspective, mathematicians have simply followed their noses to develop -------------------------------------------------------------------------- Fourier analysis. 
They first understood that diagonal matrices are more easily manipulated than general matrices, and, applying this thinking to translation operators on L two of R, they were led, ineluctably, to develop Infinite-Dimensional -------------------------------------------------------------------------- Spectral Theory, and, in the process, the basics of -------------------------------------------------------------------------- Fourier analysis. Incidentally, this perspective, while right, (pause...............) is probably ahistorical. In any case, setting aside the question of how clever or historically accurate all of this is, you may also wonder what *value* there is in knowing the -------------------------------------------------------------------------- Fourier transform of the distribution of X. How will *that* help us to solve our -------------------------------------------------------------------------- problem, which was to -------------------------------------------------------------------------- compute the probability that -------------------------------------------------------------------------- X is between minus 1 and 1? To answer this, first note that -------------------------------------------------------------------------- this Fourier transform is the renormalized Nth power of cosine. Because N is so large, that's very -------------------------------------------------------------------------- close to the limit of the renormalized powers of cosine, and we know how to do that kind of limit. In this case, you may remember that we got -------------------------------------------------------------------------- e to the minus t squared over 2. This is *so* important that, for practice, let's pause to -------------------------------------------------------------------------- verify this for t equals 3.
Replacing t by 3, we -------------------------------------------------------------------------- want to show this, and we -------------------------------------------------------------------------- move it to the top to make room. We begin by taking logarithms. On the left, we have -------------------------------------------------------------------------- this. (pause...............) -------------------------------------------------------------------------- Copying it, and, taking its -------------------------------------------------------------------------- logarithm, we can move -------------------------------------------------------------------------- this n down front, -------------------------------------------------------------------------- like so. On the -------------------------------------------------------------------------- right, when we take its -------------------------------------------------------------------------- logarithm, we can use that log and exp are inverses, so we get -------------------------------------------------------------------------- minus 3 squared over 2, or -------------------------------------------------------------------------- minus 9 over 2. It therefore suffices to prove -------------------------------------------------------------------------- this. Next step is to substitute -------------------------------------------------------------------------- x for 1 over root n. Then -------------------------------------------------------------------------- 3 x is 3 over root n. Also, -------------------------------------------------------------------------- x squared is 1 over n and, reciprocating, we get that -------------------------------------------------------------------------- 1 over x squared is n. Substituting, we replace -------------------------------------------------------------------------- n (pause...............) 
by -------------------------------------------------------------------------- 1 over x squared and -------------------------------------------------------------------------- 3 over root n by -------------------------------------------------------------------------- 3 x, (pause...............) -------------------------------------------------------------------------- like so. As -------------------------------------------------------------------------- n approaches infinity, -------------------------------------------------------------------------- x approaches zero, so we get -------------------------------------------------------------------------- the limit as x approaches zero, and we now -------------------------------------------------------------------------- want to show that *this* limit is -------------------------------------------------------------------------- minus 9 over 2. Algebra changes -------------------------------------------------------------------------- this (pause...............) to -------------------------------------------------------------------------- this, and we leave it to you to see that this limit is a -------------------------------------------------------------------------- zero over zero indeterminate form, which just begs for an application of -------------------------------------------------------------------------- l'Hopital's Rule. We must calculate -------------------------------------------------------------------------- dee dee x of the numerator and -------------------------------------------------------------------------- dee dee x of the denominator. -------------------------------------------------------------------------- The denominator's easy. 
In the numerator, we have an -------------------------------------------------------------------------- expression inside a -------------------------------------------------------------------------- function, so we take the derivative of the -------------------------------------------------------------------------- function, (pause...............) -------------------------------------------------------------------------- like so, and plug in the -------------------------------------------------------------------------- expression, (pause...............) -------------------------------------------------------------------------- like so. It remains to multiply by the derivative of the -------------------------------------------------------------------------- expression, but this expression is itself an -------------------------------------------------------------------------- expression inside a -------------------------------------------------------------------------- function, so we take the derivative of the -------------------------------------------------------------------------- function, (pause...............) -------------------------------------------------------------------------- like so, and plug in the -------------------------------------------------------------------------- expression, (pause...............) -------------------------------------------------------------------------- like so. Finally, we multiply by the derivative of the -------------------------------------------------------------------------- expression, (pause...............) -------------------------------------------------------------------------- like so. L'Hopital's Rule now tells us to divide the derivative of -------------------------------------------------------------------------- this by the derivative of -------------------------------------------------------------------------- this, (pause...............)
-------------------------------------------------------------------------- like so. -------------------------------------------------------------------------- This simplifies. In the numerator, we put -------------------------------------------------------------------------- this, (pause...............) -------------------------------------------------------------------------- like so. In the denominator, we put -------------------------------------------------------------------------- this, (pause...............) -------------------------------------------------------------------------- like so. We again get a -------------------------------------------------------------------------- zero over zero indeterminate form, and, with the tenacity of the truly obsessed, we again apply -------------------------------------------------------------------------- l'Hopital's Rule, which tells us to take the derivative of -------------------------------------------------------------------------- this, and divide it by the derivative of -------------------------------------------------------------------------- this. We leave it to you to work through the computations, and get -------------------------------------------------------------------------- this. In those computations, your allies are the Product and Chain Rules, and powerful allies they are, yes. We finally arrive at a -------------------------------------------------------------------------- limit that is *not* an indeterminate form, so our l'Hopital's Rule days are, thankfully, over. As x tends to zero, -------------------------------------------------------------------------- this tends to 1, as does -------------------------------------------------------------------------- this. Because -------------------------------------------------------------------------- x is approaching zero, -------------------------------------------------------------------------- this term tends to zero. 
We arrive at -------------------------------------------------------------------------- minus three times three divided by -------------------------------------------------------------------------- two, and that's -------------------------------------------------------------------------- minus nine over two, so we get what we wanted, and -------------------------------------------------------------------------- this is now verified for t equals 3. Its verification for any other value of t is similar. This now leads us to the -------------------------------------------------------------------------- Key idea in *our* presentation of the Central Limit Theorem. The Fourier transform of the distribution of -------------------------------------------------------------------------- X (pause...............) is -------------------------------------------------------------------------- this, so the inverse Fourier transform of -------------------------------------------------------------------------- this is the distribution of -------------------------------------------------------------------------- X. If we could somehow take the inverse Fourier transform of -------------------------------------------------------------------------- e to the minus t squared over 2, we'd get some distribution. If we could then find a random variable with that distribution, then that random variable should be, distributionally, very close to X. Let's write out that thought. -------------------------------------------------------------------------- Let X tilde be some random variable that we someday, somehow manage to find, and suppose its distribution has Fourier transform -------------------------------------------------------------------------- e to the minus t squared over 2. -------------------------------------------------------------------------- Here's the Fourier transform of the distribution of -------------------------------------------------------------------------- X. 
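As an aside for readers without the slides, the t equals 3 limit verified above can be written out in full. This is a reconstruction from the narration, so the intermediate forms shown on the slides may differ in arrangement; the substitution x = 1 over root n and the two applications of l'Hopital's Rule are as described.

```latex
\begin{align*}
\lim_{n\to\infty} n \,\ln\!\Bigl(\cos\tfrac{3}{\sqrt n}\Bigr)
  &= \lim_{x\to 0} \frac{\ln(\cos 3x)}{x^{2}}
     && \text{($x = 1/\sqrt n$, a $0/0$ form)} \\
  &= \lim_{x\to 0} \frac{\dfrac{1}{\cos 3x}\cdot(-\sin 3x)\cdot 3}{2x}
     && \text{(l'Hopital, Chain Rule twice)} \\
  &= \lim_{x\to 0} \frac{-3\sin 3x}{2x\cos 3x}
     && \text{(again a $0/0$ form)} \\
  &= \lim_{x\to 0} \frac{-3\cdot 3\cos 3x}{2\cos 3x - 6x\sin 3x}
     && \text{(l'Hopital, Product and Chain Rules)} \\
  &= \frac{-3\cdot 3}{2} = -\frac{9}{2}.
\end{align*}
```

Exponentiating both sides gives the limit e to the minus 9 over 2 for the renormalized powers of cosine at t equals 3.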
Then, because X and -------------------------------------------------------------------------- X tilde have distributions with -------------------------------------------------------------------------- close Fourier transforms, if there's any justice in the world, -------------------------------------------------------------------------- X and X tilde should be close in the sense that their distributions are close. In fact, the theory of random variables is developed to the point where this notion of closeness is a completely rigorous concept called "closeness -------------------------------------------------------------------------- in distribution", or, sometimes, -------------------------------------------------------------------------- "weak-*" closeness. In this lecture, we content ourselves with -------------------------------------------------------------------------- intuition, but be aware that there is rigorous mathematics that underlies it. To learn more, you'd want graduate level courses on probability theory and functional analysis. Remember that our immediate goal was -------------------------------------------------------------------------- this and, if we can find X tilde, and if our -------------------------------------------------------------------------- approximations are good, then, after replacing X -------------------------------------------------------------------------- by X tilde, we should get the same answer, up to several decimal places of accuracy. Now, how are we to find this X tilde? Well, in a moment, we'll simply show you a random variable, traditionally called, not X tilde, but rather -------------------------------------------------------------------------- Z whose distribution has -------------------------------------------------------------------------- this as its Fourier transform, so we can just use Z for X tilde. However, that leaves open the question of how to find Z if you're not told up front what it is. 
In other words, how does the inverse Fourier transform work? This is also too advanced for this lecture, but you should note that Fourier Theory is quite well-developed, and, if you want to see more, you'll want to take a course on Fourier analysis. We want to describe the distribution of this random variable -------------------------------------------------------------------------- Z. First, though, let me mention that, up to now, every random variable we've studied has had only a finite number of possible values. For example, -------------------------------------------------------------------------- D_2 could have exactly three possible values: -------------------------------------------------------------------------- 2, 0 or minus 2, depending on whether we have two heads, a head and a tail, or two tails, in the first two coin flips. Even -------------------------------------------------------------------------- D_N, while much more complicated, has only -------------------------------------------------------------------------- finitely many possible values. The theory of random variables, which we'll describe in the next lecture, is robust enough to account for random variables with infinitely many possible values. As we'll soon see, this -------------------------------------------------------------------------- Z has an infinite range of possible values. In fact, for *each* number -------------------------------------------------------------------------- x on the real number line, we'll put an infinitesimal amount of probability at x. Specifically, we'll put -------------------------------------------------------------------------- e to the minus x squared over 2 dee x at x. Note that -------------------------------------------------------------------------- dee x is an infinitesimal, so the point x is *not* given a positive probability. Moreover, we -------------------------------------------------------------------------- do this for *all* real numbers x. 
Before we go on, the astute listener might have noticed a small -------------------------------------------------------------------------- mistake. You see, if we take -------------------------------------------------------------------------- the probability at x, and -------------------------------------------------------------------------- add up over -------------------------------------------------------------------------- *all* x, by integrating -------------------------------------------------------------------------- from minus infinity to infinity, you may remember that this integral computes to -------------------------------------------------------------------------- root 2 pi. We want for the sum of the probabilities in a probability distribution to be 1, *not* root 2 pi, so -------------------------------------------------------------------------- we divide by root 2 pi, and the mistake -------------------------------------------------------------------------- goes away. When we get to a more rigorous development of random variables, we'll describe Z more precisely, but, for now, we simply ask you to accept that a random variable with -------------------------------------------------------------------------- this intuitively described distribution exists, and to watch how we do computations with this Z. For example, let's -------------------------------------------------------------------------- compute the probability that Z is equal to 7. To -------------------------------------------------------------------------- solve this, we take the -------------------------------------------------------------------------- probability at x and -------------------------------------------------------------------------- add up for all x between 7 and 7. 
Of course, this integral -------------------------------------------------------------------------- is zero because the -------------------------------------------------------------------------- upper and lower limits of integration are the same. There's *no* positive probability that Z will be equal to any one *particular* number, like 7. By contrast, let's -------------------------------------------------------------------------- compute the probability that Z is between 2 and 3. As before, we -------------------------------------------------------------------------- take the probability at x, but now we -------------------------------------------------------------------------- add up, for all x between 2 and 3. Remembering that -------------------------------------------------------------------------- Phi of x is an antiderivative, we use the Fundamental Theorem of Calculus, and -------------------------------------------------------------------------- evaluate between 2 and 3, which gives -------------------------------------------------------------------------- Phi of 3 minus Phi of 2. Running to our Phi-enabled calculator, we see that, to four decimals, this is -------------------------------------------------------------------------- point 0 2 1 4. In other words, the probability that Z is between 2 and 3 is -------------------------------------------------------------------------- 2.14 percent, to two decimals. Next, let's find the -------------------------------------------------------------------------- generating function of the distribution of Z. For each x, we put -------------------------------------------------------------------------- z to the x next to x. We then multiply -------------------------------------------------------------------------- this (pause...............) -------------------------------------------------------------------------- by this, (pause...............) 
and -------------------------------------------------------------------------- add up over -------------------------------------------------------------------------- all values of x from minus infinity to infinity. We leave this integral as an -------------------------------------------------------------------------- exercise, and move on to the -------------------------------------------------------------------------- Fourier transform. We replace -------------------------------------------------------------------------- z (pause...............) by -------------------------------------------------------------------------- e to the minus i t. -------------------------------------------------------------------------- This is an integral of a type that we've worked out in the past. It's equal to -------------------------------------------------------------------------- e to the minus t squared over 2. This is *so* important that, for practice, let's pause to -------------------------------------------------------------------------- verify this for t equals 3. Replacing t by 3, we -------------------------------------------------------------------------- want to show this. We start -------------------------------------------------------------------------- at the left hand side, and copy it -------------------------------------------------------------------------- here. We pull a -------------------------------------------------------------------------- constant out of the integral. Because the -------------------------------------------------------------------------- linear term in the exponent has coefficient minus 3 i, we follow our muse, and -------------------------------------------------------------------------- replace x by x minus 3 i, which yields -------------------------------------------------------------------------- this.
Remember that, by Cauchy's Theorem, we do *not* need to subtract minus 3 i from -------------------------------------------------------------------------- these limits of integration. -------------------------------------------------------------------------- This expands (pause...............) -------------------------------------------------------------------------- like this, and to expand -------------------------------------------------------------------------- this, we start with -------------------------------------------------------------------------- the square of x minus 3 i, which is -------------------------------------------------------------------------- x squared *minus* 6 i x *minus* 9. Multiplying that by -------------------------------------------------------------------------- minus 1 over 2, we get -------------------------------------------------------------------------- *minus* x squared over 2 *plus* 3 i x *plus* 9 over 2, so the expansion of -------------------------------------------------------------------------- this exponent is done. By properties of exponentials, we get -------------------------------------------------------------------------- e to the *minus* x squared over 2 times e to the *plus* 3 i x times e to the *plus* 9 over 2, (pause...............) -------------------------------------------------------------------------- and the rest stays the same. By the magic of completing the square, -------------------------------------------------------------------------- these two factors cancel. -------------------------------------------------------------------------- These two constants multiply together to give -------------------------------------------------------------------------- e to the minus 9 over 2, -------------------------------------------------------------------------- and the rest stays the same. 
-------------------------------------------------------------------------- This is equal to 1, so we get -------------------------------------------------------------------------- e to the minus 9 over 2, which *is* -------------------------------------------------------------------------- e to the minus 3 squared over 2, which is -------------------------------------------------------------------------- the right hand side, and we've proved what we wanted to prove. (pause...............) So -------------------------------------------------------------------------- this is now verified for t equals 3, and a similar argument works for any value of t. Recall again the -------------------------------------------------------------------------- key idea, which involves finding some random variable whose distribution has Fourier transform -------------------------------------------------------------------------- e to the minus t squared over 2. Such a variable will be -------------------------------------------------------------------------- close to X in the sense that their distributions are close. Now we see that the -------------------------------------------------------------------------- Fourier transform of the distribution of -------------------------------------------------------------------------- Z *is* (pause...............) -------------------------------------------------------------------------- e to the minus t squared over 2, so -------------------------------------------------------------------------- Z is close to X in distribution, which we also record -------------------------------------------------------------------------- here. Our -------------------------------------------------------------------------- easier problem is to compute the probability that X is between minus 1 and 1. 
Since the distributions of Z and X are close, this should be -------------------------------------------------------------------------- about the same as the probability that -------------------------------------------------------------------------- *Z* is between minus 1 and 1. We take -------------------------------------------------------------------------- the probability at (little) x, -------------------------------------------------------------------------- like so, and -------------------------------------------------------------------------- add up, for all x between minus 1 and 1. This gives an -------------------------------------------------------------------------- approximate solution to our -------------------------------------------------------------------------- easier problem. An antiderivative is -------------------------------------------------------------------------- Phi of x, and we -------------------------------------------------------------------------- evaluate between minus 1 and 1. Our handy-dandy Phi-calculating calculator now tells us that, to two decimals, the answer is -------------------------------------------------------------------------- 68.27 percent. This is an approximate solution to the -------------------------------------------------------------------------- easier problem. Remember that N is over two and a half million. *Because* N is so large, we expect -------------------------------------------------------------------------- this answer to be accurate to several decimals. There are, in fact, ways of checking on accuracy, by bounding the error in our approximation. If you're interested in learning about how to do that kind of error-analysis, one uses a result called the -------------------------------------------------------------------------- Berry-Esseen Theorem. As usual, we omit such advanced topics in these lectures, for lack of time. 
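If you don't have a Phi-calculating calculator handy, here is a minimal sketch of the same computation in Python. It assumes that Phi is the standard normal cumulative distribution function, written in terms of the error function; the helper name `Phi` is our own choice for this illustration.

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Approximate probability that Z is between minus 1 and 1.
p = Phi(1.0) - Phi(-1.0)
print(f"{100 * p:.2f} percent")  # 68.27 percent
```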
We now work our way up from -------------------------------------------------------------------------- this easier (pause...............) -------------------------------------------------------------------------- probability problem to the original pricing problem, which was an -------------------------------------------------------------------------- expected value problem. Remember that our -------------------------------------------------------------------------- goal was to compute the expected value of -------------------------------------------------------------------------- f of u to the H d to the T, where -------------------------------------------------------------------------- H is the number of heads, after N flips of a fair coin, and -------------------------------------------------------------------------- T is the number of tails. Also, remember that the payoff function -------------------------------------------------------------------------- f(S) is this. This calculation is, perhaps, still too hard for us, so we again simplify, and come up with a -------------------------------------------------------------------------- new easier problem: We'll compute the expected value of a -------------------------------------------------------------------------- less complicated random variable, namely f(D_2). Remember the -------------------------------------------------------------------------- distribution of D_2. There's a -------------------------------------------------------------------------- 25 percent chance that D_2 is -------------------------------------------------------------------------- 2, in which case f(D_2) is -------------------------------------------------------------------------- f(2). 
There's a -------------------------------------------------------------------------- 50 percent chance that D_2 is -------------------------------------------------------------------------- 0, in which case f(D_2) is -------------------------------------------------------------------------- f(0). Finally, there's a -------------------------------------------------------------------------- 25 percent chance that D_2 is -------------------------------------------------------------------------- minus 2, in which case f(D_2) is -------------------------------------------------------------------------- f of minus 2. To get the -------------------------------------------------------------------------- expected value of f(D_2), we multiply -------------------------------------------------------------------------- this (pause...............) times -------------------------------------------------------------------------- this, (pause...............) then -------------------------------------------------------------------------- this (pause...............) times -------------------------------------------------------------------------- this, (pause...............) then -------------------------------------------------------------------------- this (pause...............) times -------------------------------------------------------------------------- this, (pause...............) and *then* -------------------------------------------------------------------------- add up. We leave it as an exercise for you to plug -------------------------------------------------------------------------- 2, 0 and minus 2 into -------------------------------------------------------------------------- f, and to compute that the expected value -------------------------------------------------------------------------- is 1 thousand 2 hundred 50 (1,250). (pause...............) 
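For those who'd like to check their work on this exercise, here is a short sketch in Python. It uses the payoff function from the start of the act, the positive part of 5,000 S minus 5,000, and the three-point distribution of D_2 described above; the names `f` and `dist` are just labels chosen for this illustration.

```python
def f(S):
    """Payoff: positive part of 5,000 S minus 5,000."""
    return max(5000 * S - 5000, 0)

# Distribution of D_2: value -> probability.
dist = {2: 0.25, 0: 0.50, -2: 0.25}

# Multiply each payoff by its probability, and then add up.
expected = sum(prob * f(value) for value, prob in dist.items())
print(expected)  # 1250.0
```

Only the outcome D_2 equals 2 contributes, because the payoff is zero at 0 and at minus 2.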
We -------------------------------------------------------------------------- clear some room, and note that there's nothing special about f. The same computation would work for any function -------------------------------------------------------------------------- g. For example, we can -------------------------------------------------------------------------- define g(S) to be, say, -------------------------------------------------------------------------- 5 (e to the S) plus S squared. We leave it as an -------------------------------------------------------------------------- exercise to compute the -------------------------------------------------------------------------- expected value of *g* of D_2, with -------------------------------------------------------------------------- this definition of g. Next, let's -------------------------------------------------------------------------- clear some room, -------------------------------------------------------------------------- change back to f and move on to another problem, namely, to -------------------------------------------------------------------------- compute the expected value of -------------------------------------------------------------------------- f(*Z*). Remember the -------------------------------------------------------------------------- distribution of Z. There's -------------------------------------------------------------------------- this infinitesimal probability that Z is -------------------------------------------------------------------------- x, in which case f(Z) is -------------------------------------------------------------------------- f(x). To get the -------------------------------------------------------------------------- expected value of f(Z), we multiply -------------------------------------------------------------------------- this (pause...............) times -------------------------------------------------------------------------- this, and (pause...............) 
*then* -------------------------------------------------------------------------- add up. Here, adding up means integrating over -------------------------------------------------------------------------- all real numbers x from minus infinity to infinity. We can -------------------------------------------------------------------------- pull the constant out of the integral. In -------------------------------------------------------------------------- this equation, we can change -------------------------------------------------------------------------- S (pause...............) to -------------------------------------------------------------------------- x. We can then plug -------------------------------------------------------------------------- f(x) into the integral, and use our newfound integration skills to find the answer, but I'm too lazy myself, so I leave this work as another -------------------------------------------------------------------------- exercise. Now let's move from -------------------------------------------------------------------------- f(Z) (pause...............) to -------------------------------------------------------------------------- f(*X*). Remembering that -------------------------------------------------------------------------- Z is close to X, we see that our -------------------------------------------------------------------------- last solution is a good -------------------------------------------------------------------------- approximate solution to the problem of finding the -------------------------------------------------------------------------- expected value of f(*X*). I continue my lazy ways, and leave that integral as an -------------------------------------------------------------------------- exercise. 
There's nothing special about the function -------------------------------------------------------------------------- f, so we drop it both -------------------------------------------------------------------------- here (pause...............) and -------------------------------------------------------------------------- here. The same logic works for any *reasonable* function -------------------------------------------------------------------------- g, and, in the last act of this lecture, the meaning of the word "reasonable" will be made more precise. Now, if -------------------------------------------------------------------------- this (pause...............) were -------------------------------------------------------------------------- equal to g of capital X, for some reasonable g, we'd be in a position to achieve our -------------------------------------------------------------------------- goal, to a high degree of accuracy. Unfortunately, -------------------------------------------------------------------------- this expression involves H and T, not X. However, there is a connection between H, T and X, namely -------------------------------------------------------------------------- the definition of X as H minus T over root N. Remember that -------------------------------------------------------------------------- N is the number of coin flips, which is the same as the number of heads plus the number of tails. Then -------------------------------------------------------------------------- H plus T is N. If we multiply -------------------------------------------------------------------------- X by root N, we get -------------------------------------------------------------------------- X root N. 
If we multiply -------------------------------------------------------------------------- this by root N, we get -------------------------------------------------------------------------- H minus T, so these are -------------------------------------------------------------------------- equal. Adding -------------------------------------------------------------------------- these two equations, -------------------------------------------------------------------------- these cancel, and we get -------------------------------------------------------------------------- this. We copy -------------------------------------------------------------------------- this equation, (pause...............) -------------------------------------------------------------------------- over here. We negate -------------------------------------------------------------------------- this equation, (pause...............) -------------------------------------------------------------------------- over here. Adding (pause...............) -------------------------------------------------------------------------- these two equations, -------------------------------------------------------------------------- these cancel, and we get -------------------------------------------------------------------------- this. We now take -------------------------------------------------------------------------- the equation on the left, and -------------------------------------------------------------------------- divide it by 2. We now take -------------------------------------------------------------------------- the equation on the right, and -------------------------------------------------------------------------- divide *it* by 2. We've now written -------------------------------------------------------------------------- H and T in terms of X. Remember that -------------------------------------------------------------------------- N is a specific number, a bit over 2 and a half million. 
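For readers following along without the slides, the algebra just described can be written out as follows. This is a reconstruction from the narration, using only the two facts stated above: H plus T is N, and X is H minus T over root N.

```latex
\begin{align*}
H + T &= N, & H - T &= X\sqrt{N}, \\
2H &= N + X\sqrt{N}, & 2T &= N - X\sqrt{N}, \\
H &= \frac{N + X\sqrt{N}}{2}, & T &= \frac{N - X\sqrt{N}}{2}.
\end{align*}
```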
We can plug these expressions for H and T in -------------------------------------------------------------------------- here, to turn this -------------------------------------------------------------------------- expression of H and T into -------------------------------------------------------------------------- an expression of X, for which we have an -------------------------------------------------------------------------- approximate solution, assuming $g$ isn't unreasonable. Okay -- let's go through this in small steps. First, -------------------------------------------------------------------------- u to the H is -------------------------------------------------------------------------- this, (pause...............) and -------------------------------------------------------------------------- d to the T is -------------------------------------------------------------------------- this. To get -------------------------------------------------------------------------- u to the H d to the T, we multiply -------------------------------------------------------------------------- u to the H by -------------------------------------------------------------------------- d to the T. -------------------------------------------------------------------------- This product is equal to -------------------------------------------------------------------------- u d to the N over 2. -------------------------------------------------------------------------- This product is equal to -------------------------------------------------------------------------- this, where, because of the -------------------------------------------------------------------------- minus sign, we -------------------------------------------------------------------------- divide. Next, we define -------------------------------------------------------------------------- C to *be* u d to the N over 2. 
Remember that -------------------------------------------------------------------------- u, d and N are all known, and we'll eventually use their values to calculate -------------------------------------------------------------------------- C. This definition allows us to replace -------------------------------------------------------------------------- this (pause...............) by -------------------------------------------------------------------------- C. Next, define -------------------------------------------------------------------------- k by this equation. Again, -------------------------------------------------------------------------- u, d and N are all known, and we'll eventually use their values to calculate -------------------------------------------------------------------------- k. (pause...............) -------------------------------------------------------------------------- Exponentiating k, and remembering that exp and -------------------------------------------------------------------------- log are inverses, we get -------------------------------------------------------------------------- u over d to the (root N) over 2. If we raise -------------------------------------------------------------------------- this to the -------------------------------------------------------------------------- capital X power, we get -------------------------------------------------------------------------- exactly this. So -------------------------------------------------------------------------- this underlined expression is -------------------------------------------------------------------------- e to the k raised to the -------------------------------------------------------------------------- capital X power. We're interested in -------------------------------------------------------------------------- this expression, which we put -------------------------------------------------------------------------- here. 
This is f of -------------------------------------------------------------------------- u to the H d to the T, which is f of -------------------------------------------------------------------------- C [e to the k (capital X)], which we put -------------------------------------------------------------------------- here. We now define a specific function g by -------------------------------------------------------------------------- this equation. This g is "reasonable", in a sense that we'll make precise in the last act of this lecture. Anyway, replacing -------------------------------------------------------------------------- little x by capital X, we see that -------------------------------------------------------------------------- f of C [e to the k (capital X)] is equal to -------------------------------------------------------------------------- g of *capital* X. Then -------------------------------------------------------------------------- these two are equal, which we can re-state -------------------------------------------------------------------------- here. At this point, we no longer have a -------------------------------------------------------------------------- new easier problem, but, rather, -------------------------------------------------------------------------- a restatement of our goal. So our goal now becomes to calculate -------------------------------------------------------------------------- this approximate expected value. We move the -------------------------------------------------------------------------- definition of g(x) -------------------------------------------------------------------------- over here. In the payoff function, -------------------------------------------------------------------------- factor out 5,000. 
Changing -------------------------------------------------------------------------- S to -------------------------------------------------------------------------- C e to the k x, we find that g(x) is given by -------------------------------------------------------------------------- this expression. Again, in the last act of this lecture, we'll address the question of whether this g(x) is -------------------------------------------------------------------------- reasonable. For now, though, let's focus on the -------------------------------------------------------------------------- approximate solution. We can now -------------------------------------------------------------------------- write out g(x), and -------------------------------------------------------------------------- leave the rest unchanged. We recall -------------------------------------------------------------------------- our definitions of C and k, as well as the values of -------------------------------------------------------------------------- u and d, and of -------------------------------------------------------------------------- N. (pause...............) Plugging -------------------------------------------------------------------------- u, d and N into -------------------------------------------------------------------------- the formula for C, we find that -------------------------------------------------------------------------- C (pause...............) is equal to -------------------------------------------------------------------------- this. (pause...............) Plugging -------------------------------------------------------------------------- u, d and N into -------------------------------------------------------------------------- the formula for k, we find that -------------------------------------------------------------------------- k (pause...............) is equal to -------------------------------------------------------------------------- this. 
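To make the definitions of C, k and g concrete, here is a small sketch in Python. The tick factors `u` and `d` below are hypothetical placeholders, chosen only for illustration; they are not the values on the slides, so the printed C and k will not match the lecture's numbers. The sketch checks, for one sample pair of head and tail counts, that u to the H d to the T agrees with C e to the k X.

```python
import math

N = 30 * 24 * 60 * 60        # seconds in 30 days: 2,592,000
u, d = 1.0001, 0.9999        # HYPOTHETICAL tick factors, for illustration only

C = (u * d) ** (N / 2)                      # C is (u d) to the N over 2
k = (math.sqrt(N) / 2) * math.log(u / d)    # k is (root N over 2) times log(u over d)

def g(x):
    """g(x) = f(C e^{kx}), with payoff f(S) = 5,000 times the positive part of S - 1."""
    return 5000 * max(C * math.exp(k * x) - 1, 0.0)

# A sample outcome (hypothetical): slightly more heads than tails.
H = N // 2 + 1000
T = N - H
X = (H - T) / math.sqrt(N)

lhs = u ** H * d ** T         # expression in H and T
rhs = C * math.exp(k * X)     # the same quantity, as an expression in X
print(C, k, lhs, rhs)
```

The two printed quantities `lhs` and `rhs` should agree up to floating-point rounding, which is the content of the substitution carried out above.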
We leave the computation of -------------------------------------------------------------------------- this approximate expected value for the next act, but we've already developed all the necessary integration skills. So, (pause...............) it -------------------------------------------------------------------------- remains to finish (pause...............) the -------------------------------------------------------------------------- pricing problem, which amounts to -------------------------------------------------------------------------- calculating the integral from the last slide. A reasonable complaint is that, while we've now exposed all the ideas in the Central Limit Theorem, we haven't actually stated it. We leave *that* for another -------------------------------------------------------------------------- act in this many-act lecture. Now, though, -------------------------------------------------------------------------- let's take a break. -------------------------------------------------------------------------- Welcome to the Fifth Act of Lecture 4 of Notes on Financial Mathematics, by Scot Adams and Fernando Reitich. -------------------------------------------------------------------------- Here's the approximate expected value problem that we left unsolved at the end of the last act. Remember that -------------------------------------------------------------------------- C and k are known constants. Let's -------------------------------------------------------------------------- move the problem up to the top of the page, -------------------------------------------------------------------------- bring the constant out of the integral, and -------------------------------------------------------------------------- leave the rest alone. -------------------------------------------------------------------------- This expression increases as x increases, and we need to figure out where it's zero. 
We therefore substitute -------------------------------------------------------------------------- $a$ for $x$, and set the resulting expression of $a$ -------------------------------------------------------------------------- to zero. We solve for $a$ by -------------------------------------------------------------------------- adding 1, (pause...............) -------------------------------------------------------------------------- dividing by C, (pause...............) -------------------------------------------------------------------------- taking logarithms, (pause...............) -------------------------------------------------------------------------- simplifying and (pause...............) -------------------------------------------------------------------------- dividing by k. Remember that -------------------------------------------------------------------------- C and k are -------------------------------------------------------------------------- known numbers, and we'll eventually use their values to calculate -------------------------------------------------------------------------- $a$. (pause...............) The -------------------------------------------------------------------------- positive part of a negative number is zero, so the -------------------------------------------------------------------------- integrand here is zero, for all x less than a, so we may as well -------------------------------------------------------------------------- integrate from *$a$* to infinity. In that range, -------------------------------------------------------------------------- this expression is positive, and the -------------------------------------------------------------------------- positive part of a positive number is itself, so -------------------------------------------------------------------------- this "plus" has no effect, and we -------------------------------------------------------------------------- eliminate it. 
We -------------------------------------------------------------------------- leave everything else alone. Next step is to distribute -------------------------------------------------------------------------- the exponentiated quadratic and -------------------------------------------------------------------------- the integral over -------------------------------------------------------------------------- this subtraction, giving (pause...............) -------------------------------------------------------------------------- this. Note that -------------------------------------------------------------------------- the exponentiated quadratic has been distributed, as has -------------------------------------------------------------------------- the integral. Also, we snuck the -------------------------------------------------------------------------- constant C outside the integral. We move this -------------------------------------------------------------------------- up to the top to make more room. -------------------------------------------------------------------------- This integral is equal to -------------------------------------------------------------------------- this, and note that, because -------------------------------------------------------------------------- $a$ is in the *lower* limit of the integral, it gets -------------------------------------------------------------------------- negated. Also, we -------------------------------------------------------------------------- remembered and didn't forget "root 2 pi". 
For the -------------------------------------------------------------------------- other integral, we identify the -------------------------------------------------------------------------- linear term in the exponent, and we see that the coefficient is k, so we -------------------------------------------------------------------------- replace x by x plus k, which has -------------------------------------------------------------------------- no effect on dee x, and only a small effect on -------------------------------------------------------------------------- the limits of integration. Specifically, to compensate for adding k to x, we subtract k from -------------------------------------------------------------------------- *a*, (pause...............) giving -------------------------------------------------------------------------- a minus k. Expanding -------------------------------------------------------------------------- this, we get -------------------------------------------------------------------------- e to the k squared, -------------------------------------------------------------------------- e to the minus x squared over 2, and -------------------------------------------------------------------------- e to the minus k squared over 2. If you were watching carefully, you may have noticed that we skipped the exponentiated linear terms -------------------------------------------------------------------------- e to the k x and -------------------------------------------------------------------------- e to the minus 2 k x over 2, or minus k x. Skipping them was okay because -------------------------------------------------------------------------- they cancel, which is really the whole point of adding $k$ to $x$. 
To compute -------------------------------------------------------------------------- this integral, we can pull -------------------------------------------------------------------------- these constants to the -------------------------------------------------------------------------- outside of the integral, but -------------------------------------------------------------------------- this depends on x, and so -------------------------------------------------------------------------- must remain inside. -------------------------------------------------------------------------- This integral is equal to -------------------------------------------------------------------------- this, and, as usual, because -------------------------------------------------------------------------- $a$ minus $k$ is in the *lower* limit, we -------------------------------------------------------------------------- negate it. Also, we -------------------------------------------------------------------------- remembered and didn't forget "root 2 pi". We -------------------------------------------------------------------------- clear some room, and move the result up, and then -------------------------------------------------------------------------- clear some more room, and move the result up again. -------------------------------------------------------------------------- These cancel and, on the outside of -------------------------------------------------------------------------- the brackets, we have -------------------------------------------------------------------------- 5,000. On the inside, we have -------------------------------------------------------------------------- C times (pause...............) -------------------------------------------------------------------------- this (pause...............) -------------------------------------------------------------------------- minus this. 
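For readers without the slides, the chain of steps in this act can be summarized as below. This is a reconstruction from the narration, assuming that Phi denotes the standard normal cumulative distribution function and that g(x) is 5,000 times the positive part of C e to the k x minus 1, as described in the last act.

```latex
\begin{align*}
a &= -\frac{\ln C}{k}, \\
E[g(Z)] &= \frac{5000}{\sqrt{2\pi}} \int_{a}^{\infty} \left( C e^{kx} - 1 \right) e^{-x^2/2} \, dx \\
&= 5000 \left[ \frac{C}{\sqrt{2\pi}} \int_{a}^{\infty} e^{kx} e^{-x^2/2} \, dx - \Phi(-a) \right] \\
&= 5000 \left[ \frac{C \, e^{k^2} e^{-k^2/2}}{\sqrt{2\pi}} \int_{a-k}^{\infty} e^{-x^2/2} \, dx - \Phi(-a) \right] \\
&= 5000 \left[ C \, e^{k^2/2} \, \Phi(k - a) - \Phi(-a) \right].
\end{align*}
```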
Remember that we earlier computed -------------------------------------------------------------------------- k and C, and we now plug them into the -------------------------------------------------------------------------- expression for $a$, and get -------------------------------------------------------------------------- this. Using these numbers, -------------------------------------------------------------------------- C e to the k squared over 2 is equal to -------------------------------------------------------------------------- this, (pause...............) -------------------------------------------------------------------------- k minus a is equal to -------------------------------------------------------------------------- this (pause...............) and -------------------------------------------------------------------------- minus a is equal to -------------------------------------------------------------------------- this. Finally, a calculator that can compute the -------------------------------------------------------------------------- error function yields our long-sought -------------------------------------------------------------------------- answer. (pause...............) Okay. Now that we have the -------------------------------------------------------------------------- answer, does anyone remember the question? You have to go all the way back to our -------------------------------------------------------------------------- coin-flipping game in Act 4, where our goal was to find -------------------------------------------------------------------------- the expected value E. All this work was directed toward that end, and we now have, -------------------------------------------------------------------------- up to a good approximation, -------------------------------------------------------------------------- E. 
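As a sanity check on the closed form, here is a sketch in Python that evaluates 5,000 times [C e to the k squared over 2 times Phi of (k minus a), minus Phi of minus a], and compares it against a direct numerical integration. The tick factors `u` and `d` are hypothetical placeholders, not the values on the slides, so the number printed here is not the lecture's E. The trapezoid rule and the cutoff at 12 are our own choices for the check.

```python
import math

N = 30 * 24 * 60 * 60        # seconds in 30 days
u, d = 1.0001, 0.9999        # HYPOTHETICAL tick factors, for illustration only

C = (u * d) ** (N / 2)
k = (math.sqrt(N) / 2) * math.log(u / d)
a = -math.log(C) / k         # where C e^{kx} - 1 crosses zero

def Phi(x):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Closed form derived in this act.
E_closed = 5000 * (C * math.exp(k * k / 2) * Phi(k - a) - Phi(-a))

def E_numeric(upper=12.0, steps=200_000):
    """Trapezoid rule for 5000 * integral from a to `upper` of (C e^{kx} - 1) phi(x) dx."""
    h = (upper - a) / steps
    total = 0.0
    for i in range(steps + 1):
        x = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (C * math.exp(k * x) - 1) * phi(x)
    return 5000 * total * h

print(E_closed, E_numeric())
```

The two printed values should agree to many decimals, which supports the algebra but says nothing about the accuracy of the Central Limit approximation itself.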
Again, the approximation is good to several decimals, because of the accuracy of the Central Limit Theorem, when -------------------------------------------------------------------------- N is over 2 and a half million. A bound on the error can be found through Berry-Esseen and tail estimates, but that's outside the scope of these lectures. We really want, not E, but the -------------------------------------------------------------------------- price, P, of Kyle's option, so, now that we have -------------------------------------------------------------------------- E, we need -------------------------------------------------------------------------- rho to the minus N. We look up the value of -------------------------------------------------------------------------- rho, and compute -------------------------------------------------------------------------- rho to the minus N. The price, -------------------------------------------------------------------------- P, of the option is just rho to the minus N times E, which is -------------------------------------------------------------------------- approximately this, and it should be accurate to several decimals. So Gail should charge Kyle -------------------------------------------------------------------------- 384 dollars and 87 cents for this option. (pause...............) Okay. -------------------------------------------------------------------------- Let's go back to our assumptions. No bank'll ever offer a -------------------------------------------------------------------------- per second rate. For this option, the term was 30 days, and one might get, from the bank, a statement of their -------------------------------------------------------------------------- 30-day risk-free factor. 
Of course, if we know -------------------------------------------------------------------------- this number, then, by taking its -------------------------------------------------------------------------- Nth root, we can work our way back to -------------------------------------------------------------------------- the one-second risk-free factor. In any real-life problem, -------------------------------------------------------------------------- this assumption would probably be stated as -------------------------------------------------------------------------- a 30-day risk-free factor, and we'd have to derive, from that, the -------------------------------------------------------------------------- per second risk-free factor. Similarly, it's unlikely that any market analyst would give Gail a -------------------------------------------------------------------------- per second volatility assumption the way it's shown here, and we want, eventually, to replace this assumption with a more reasonably stated one, but that requires some preliminary work. Remember that -------------------------------------------------------------------------- these numbers were called u and d, the one-second uptick and downtick factors. If -------------------------------------------------------------------------- $s$ is the ABC share price at the start of some second, then, -------------------------------------------------------------------------- at the end of that second, according to our model, it'll go either -------------------------------------------------------------------------- up to s u or -------------------------------------------------------------------------- down to s d, and we put in -------------------------------------------------------------------------- a multiplication sign to stress the fact that -------------------------------------------------------------------------- u and d are multiplicative factors. 
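Before moving on to logarithms, here's a small Python sketch of the risk-free bookkeeping just described: taking the Nth root of a 30-day risk-free factor to get the one-second factor, and discounting an expected value by rho to the minus N. The 30-day factor and the expected value used here are hypothetical numbers, chosen only for illustration.

```python
import math

N = 30 * 24 * 60 * 60      # number of seconds in 30 days
R30 = 1.004                # hypothetical 30-day risk-free factor

# One-second risk-free factor: the Nth root of the 30-day factor.
rho = R30 ** (1.0 / N)

# One-second interest rate, iota = rho - 1, computed stably via expm1.
iota = math.expm1(math.log(R30) / N)

def present_value(E, rho, N):
    """Discount a payoff expected after N subperiods: rho^(-N) * E."""
    return E * rho ** (-N)

E = 400.0                  # hypothetical expected payoff
print(N, rho, iota, present_value(E, rho, N))
```

Since rho to the N is just the 30-day factor again, the discounting step amounts to dividing E by the 30-day risk-free factor.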
Financial mathematicians typically study not $s$, but rather -------------------------------------------------------------------------- log s, which'll go either -------------------------------------------------------------------------- up to log s plus log u or -------------------------------------------------------------------------- down to log s plus log d, and we put in an -------------------------------------------------------------------------- addition sign to stress that -------------------------------------------------------------------------- these are (pause...............) -------------------------------------------------------------------------- additive terms, *not* multiplicative factors. The point is that taking logarithms causes -------------------------------------------------------------------------- multiplication to become -------------------------------------------------------------------------- addition, and addition is easier. Let's imagine that our real world is a -------------------------------------------------------------------------- 50-50 world, which is a little suspicious because, for this problem, the risk-neutral world was also 50-50. Still, let's suppose. In that case, Gail's market analyst, who studies the real world, and not some imaginary one, will notice that -------------------------------------------------------------------------- log s will increase, on average, by the average of -------------------------------------------------------------------------- these two numbers, log u and log d, each second. -------------------------------------------------------------------------- Here's the average of log u and log d. To get the average log price change over *30 days*, instead of one second, -------------------------------------------------------------------------- multiply by N, the number of seconds in 30 days. 
Plugging in -------------------------------------------------------------------------- these two numbers for -------------------------------------------------------------------------- $u$ and $d$, the expected change computes to -------------------------------------------------------------------------- this number. Since it's positive, the -------------------------------------------------------------------------- log price, and therefore the -------------------------------------------------------------------------- price, is trending up. In any one second, the price might go -------------------------------------------------------------------------- down by a factor of $d$, but the overall drift is upward, assuming upticks and downticks happen with -------------------------------------------------------------------------- 50-50 probabilities. We record the 30-day expected log price change -------------------------------------------------------------------------- as an assumption. This kind of assumption is usually called a -------------------------------------------------------------------------- drift assumption, and we might call -------------------------------------------------------------------------- this (pause...............) the -------------------------------------------------------------------------- 50-50 drift equation, stressing the fact that it uses -------------------------------------------------------------------------- 50-50 probabilities. Incidentally, what we've been calling a -------------------------------------------------------------------------- volatility assumption is really a combined volatility *and drift* assumption, so our terminology has been somewhat sloppy. 
To get at -------------------------------------------------------------------------- volatility *alone*, we *might* take the difference of -------------------------------------------------------------------------- u and d, the way we did back in Lecture 3, but a more sophisticated view is that we should measure how far apart -------------------------------------------------------------------------- log u and log d are, and so, change -------------------------------------------------------------------------- this plus sign (pause...............) -------------------------------------------------------------------------- to a minus sign. Also, for reasons that'll be explained in the next lecture, we'll multiply by some odd-looking factors, namely -------------------------------------------------------------------------- the square root of N and -------------------------------------------------------------------------- the geometric mean of 50% and 50%. Plugging in $u$ and $d$, we get -------------------------------------------------------------------------- this number. -------------------------------------------------------------------------- Let's clear some room. The -------------------------------------------------------------------------- number we just calculated is called -------------------------------------------------------------------------- the 30-day price volatility. Again, it was calculated using -------------------------------------------------------------------------- 50-50 probabilities and so, we might want to keep track of that, and call -------------------------------------------------------------------------- this formula the -------------------------------------------------------------------------- 50-50 volatility equation. Traditional notation is -------------------------------------------------------------------------- sigma for the volatility and -------------------------------------------------------------------------- mu for the drift. 
It's also traditional to use -------------------------------------------------------------------------- e to the r for the risk-free factor, so that -------------------------------------------------------------------------- $r$ is the nominal interest rate in continuous compounding. In this lecture, we use the word "drift" to mean -------------------------------------------------------------------------- expected log price change. Let's examine the -------------------------------------------------------------------------- 50-50 drift and volatility equations. Again, note the -------------------------------------------------------------------------- odd-looking factors in the volatility equation, which'll be explained in the next lecture, when we talk about standard deviation and independence. We'll call these two equations the -------------------------------------------------------------------------- 50-50 equations. -------------------------------------------------------------------------- This number is N, the number of 1-second subperiods in the 30 day term. To highlight that, we'll sometimes call these the 50-50 -------------------------------------------------------------------------- *N-subperiod* equations, and we may sometimes call each one individually, an -------------------------------------------------------------------------- N-subperiod equation.
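Here's a minimal Python sketch of the two N-subperiod equations as we read them from the narration. The drift is N times the probability-weighted average of log u and log d. The volatility is root N, times the geometric mean of the two probabilities, times the difference log u minus log d. The exact layout on the slide may differ, and the values of u, d and N are hypothetical.

```python
import math

def drift_and_volatility(u, d, N, p=0.5):
    """Return (mu, sigma) over N subperiods, for uptick probability p.

    mu    = N * (p*log u + q*log d)
    sigma = sqrt(N) * sqrt(p*q) * (log u - log d)
    (assumed form of the N-subperiod equations)
    """
    q = 1.0 - p
    mu = N * (p * math.log(u) + q * math.log(d))
    sigma = math.sqrt(N) * math.sqrt(p * q) * (math.log(u) - math.log(d))
    return mu, sigma

# Hypothetical uptick and downtick factors over 100 subperiods.
u = math.exp(0.001)
d = math.exp(-0.0005)
mu, sigma = drift_and_volatility(u, d, N=100)
print(mu, sigma)
```

With 50-50 probabilities, the geometric mean of 50% and 50% is just one half, so the volatility is half of root N times the gap between log u and log d.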
Here's the point: Since we somehow knew -------------------------------------------------------------------------- the one-second uptick and downtick factors, we were able to -------------------------------------------------------------------------- plug them in, and compute -------------------------------------------------------------------------- the 30-day drift and volatility, but bear in mind that it would be more typical to reverse this: Gail's market analyst would tell us -------------------------------------------------------------------------- the 30-day drift and volatility, and we'd solve for -------------------------------------------------------------------------- the one-second uptick and downtick factors, -------------------------------------------------------------------------- obtaining these numbers. Okay. Remember that -------------------------------------------------------------------------- N is a large number, the number of seconds in 30 days. The 30-day -------------------------------------------------------------------------- risk-free factor was calculated from the one-second risk-free factor, -------------------------------------------------------------------------- rho, but, in real life, we'd reverse that: Gail's banker would tell us that -------------------------------------------------------------------------- this is the 30-day risk free factor, then we'd take its Nth root, and get -------------------------------------------------------------------------- the one-second risk-free factor. 
Similarly, we calculated the -------------------------------------------------------------------------- 30-day drift and volatility from -------------------------------------------------------------------------- the one-second uptick and downtick factors, but, in real life, we'd reverse that: Gail's market analyst would tell us that -------------------------------------------------------------------------- these are the 30-day drift and volatility for ABC stock, and, by solving the 50-50 N-subperiod equations, we'd find the -------------------------------------------------------------------------- one-second uptick and downtick factors. By an *amazing* coincidence, -------------------------------------------------------------------------- rho turns out to be *exactly* halfway between -------------------------------------------------------------------------- u and d. Ordinarily, that wouldn't happen, but, in this situation, the *risk-neutral* uptick and downtick probabilities are *exactly* 50-50, same as the real-world uptick and downtick probabilities. Most of this lecture has been devoted to using ideas from the 50-50 Central Limit Theorem to show, via risk-neutral pricing, that the -------------------------------------------------------------------------- option price is approximately -------------------------------------------------------------------------- 384 dollars and 87 cents. We now propose a series of challenging -------------------------------------------------------------------------- exercises. First, we propose that you go through the same computations, but with -------------------------------------------------------------------------- N changed to 10, so Gail is planning to adjust her portfolio 10 times in 30 days, not once per second. Thus a subperiod of the 30 day term is now 3 days, not one second, and there are 10 subperiods.
We leave it as an exercise for you to take the 10th root of -------------------------------------------------------------------------- this, and get a new value of -------------------------------------------------------------------------- rho. Using -------------------------------------------------------------------------- these numbers, and solving the 50-50 *10*-subperiod equations, you can find the new values of the -------------------------------------------------------------------------- uptick and downtick factors. -------------------------------------------------------------------------- Here are the 50-50 N-subperiod equations, from back when N was -------------------------------------------------------------------------- this number. We find the number N -------------------------------------------------------------------------- here. We also find it -------------------------------------------------------------------------- in the odd-looking factors, -------------------------------------------------------------------------- under the square root. We change N to 10 -------------------------------------------------------------------------- like so, and you should now solve these two equations for -------------------------------------------------------------------------- u and d and then -------------------------------------------------------------------------- plug them in here. So, we're now assuming that, in each subperiod, the real-world probabilities of uptick and downtick are 50-50 and, under that assumption, we choose our -------------------------------------------------------------------------- uptick and downtick factors to conform to the known -------------------------------------------------------------------------- 30-day drift and volatility estimated by Gail's market analyst. Now, drift and volatility are based on empirical data, probably assuming that the future'll be something like the past.
By contrast, the assumption of a -------------------------------------------------------------------------- 50-50 chance of uptick-downtick once every three days is our -------------------------------------------------------------------------- *model* for the evolution of the price of the underlying stock, and a model is just an assumption. This model is called the -------------------------------------------------------------------------- binomial model, and the -------------------------------------------------------------------------- "bi" in "binomial" signifies that the price of the underlying has two choices for how it changes, in each subperiod, given by multiplying by -------------------------------------------------------------------------- $u$ or $d$. To be more specific, we'll call this model the 50-50 -------------------------------------------------------------------------- 10-subperiod binomial model. One of the major themes of financial mathematics is the analysis of how the derivative price changes as we change our model for the underlying price. Okay. When N was -------------------------------------------------------------------------- this number, it turned out that -------------------------------------------------------------------------- rho was *exactly* halfway between -------------------------------------------------------------------------- $u$ and $d$. In this new 10-subperiod model, the new -------------------------------------------------------------------------- rho you calculate will *not* turn out to be exactly halfway between the new -------------------------------------------------------------------------- $u$ and $d$. So, this time around, the risk-neutral world will *not* be a 50-50 world, even though, according to our model, the -------------------------------------------------------------------------- real world *is*.
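To see this concretely, here's a hedged Python sketch that solves the 10-subperiod equations for u and d, takes the 10th root to get rho, and then computes the risk-neutral uptick probability as (rho - d) / (u - d). That last formula is the standard one-subperiod risk-neutral probability, and we're assuming it matches the one used earlier in these lectures. The 30-day drift, volatility and risk-free factor are hypothetical inputs, not Gail's actual numbers.

```python
import math

def calibrate(mu, sigma, R, N, p=0.5):
    """Solve the N-subperiod equations for (rho, u, d).

    Assumed equations:
      mu    = N * (p*log u + q*log d)
      sigma = sqrt(N) * sqrt(p*q) * (log u - log d)
    R is the risk-free factor for the whole term.
    """
    q = 1.0 - p
    gap = sigma / math.sqrt(N * p * q)     # log u - log d
    mean = mu / N                          # p*log u + q*log d
    u = math.exp(mean + q * gap)
    d = math.exp(mean - p * gap)
    rho = R ** (1.0 / N)                   # Nth root of the term factor
    return rho, u, d

def risk_neutral_uptick_probability(rho, u, d):
    """Probability p* for which p*·u + (1 - p*)·d equals rho."""
    return (rho - d) / (u - d)

# Hypothetical 30-day inputs.
mu, sigma, R = 0.01, 0.10, 1.004
rho, u, d = calibrate(mu, sigma, R, N=10)
p_star = risk_neutral_uptick_probability(rho, u, d)
print(rho, u, d, p_star)
```

With these made-up inputs, the risk-neutral uptick probability comes out noticeably different from one half, even though the real-world probabilities used in calibration were 50-50.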
It'll be difficult for you to proceed to an -------------------------------------------------------------------------- approximate price, using risk-neutral pricing, because the new risk-neutral probabilities aren't 50-50, and we've so far only exposed ideas from the 50-50 Central Limit Theorem. In a future lecture, we'll talk about other Central Limit Theorems, and it'll become feasible to go through all the work in this lecture, and calculate an -------------------------------------------------------------------------- approximate price. If you enjoy a challenge, and maybe already know something of the Central Limit Theorem, perhaps you want to try to do that now. Anyway, to compute the approximate price, we need first to compute -------------------------------------------------------------------------- these three values, rho, $u$ and $d$. These are inputs to our -------------------------------------------------------------------------- model. These inputs are often called model parameters, and the process of going from -------------------------------------------------------------------------- real-world data to the -------------------------------------------------------------------------- values of the parameters is called -------------------------------------------------------------------------- calibration of the model. Models must be calibrated, or they're useless, so calibration is a major topic for financial analysts. Here, calibration is relatively easy, given that Gail's market analyst supplies the -------------------------------------------------------------------------- 30-day drift and volatility, and her banker supplies the -------------------------------------------------------------------------- 30-day risk-free factor. Next, we change the number of subperiods, -------------------------------------------------------------------------- N, to 100, -------------------------------------------------------------------------- changing the model. 
Here -------------------------------------------------------------------------- N is now 100. You should now -------------------------------------------------------------------------- recalibrate (pause...............) and -------------------------------------------------------------------------- calculate a new approximate price. Next we change -------------------------------------------------------------------------- N to 1,000, -------------------------------------------------------------------------- and go through the same thing again. Next, we change -------------------------------------------------------------------------- N to 10,000, -------------------------------------------------------------------------- and go through the same thing again. A reasonable question to ask is what happens to the price as -------------------------------------------------------------------------- N tends to infinity, that is, as Gail considers adjusting her hedge more and more often. The limiting price is sometimes called the -------------------------------------------------------------------------- continuous-hedging price, or Black-Scholes price. When we let N tend to infinity, we'll see in a later lecture that our -------------------------------------------------------------------------- 50-50 $N$-subperiod binomial model tends toward what's called the -------------------------------------------------------------------------- Black-Scholes model. This is a model that is driven by the Central Limit Theorem, and which therefore predicts -------------------------------------------------------------------------- thin tails in price distributions. Nowadays, it's commonly accepted that thin tails are actually relatively rare, and, in fact, fat-tailed distributions make the finance world go round. So we want, eventually, to move beyond Black-Scholes, but it pays to understand *well* that which we seek to transcend, so we'll spend a good deal of time on Black-Scholes.
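If you'd rather let a computer do the recalibrating and repricing as N grows, here's a minimal Python sketch. For each N, it calibrates the 50-50 binomial model and then prices the call exactly, by rolling the payoff back through the tree under the risk-neutral probabilities, rather than by the Central Limit Theorem approximation used in this lecture. The form of the N-subperiod equations is our reading of the narration, and the drift, volatility and risk-free factor are hypothetical.

```python
import math

def binomial_call_price(S0, K, R, mu, sigma, N, p=0.5):
    """Exact N-subperiod binomial price of a call, by backward induction.

    S0: initial dollar value of the promised stock, K: dollar strike,
    R: risk-free factor for the term, mu/sigma: drift and volatility
    for the term, p: real-world uptick probability used to calibrate.
    """
    q = 1.0 - p
    gap = sigma / math.sqrt(N * p * q)          # log u - log d
    log_u = mu / N + q * gap
    log_d = mu / N - p * gap
    u, d = math.exp(log_u), math.exp(log_d)
    rho = R ** (1.0 / N)
    p_star = (rho - d) / (u - d)                # risk-neutral uptick prob.
    # Payoffs at expiry, indexed by the number j of upticks.
    values = [max(S0 * math.exp(j * log_u + (N - j) * log_d) - K, 0.0)
              for j in range(N + 1)]
    # Roll back one subperiod at a time, discounting by rho.
    for _ in range(N):
        values = [(p_star * values[j + 1] + (1.0 - p_star) * values[j]) / rho
                  for j in range(len(values) - 1)]
    return values[0]

# Hypothetical 30-day inputs for an at-the-money call on $5,000 of stock.
prices = {}
for N in (10, 100, 1000):
    prices[N] = binomial_call_price(5000.0, 5000.0, 1.004, 0.01, 0.10, N)
    print(N, prices[N])
```

Backward induction takes on the order of N squared steps, so it's fine for N up to a few thousand but far too slow for one subperiod per second.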
The price of the option, in the Black-Scholes model, is exactly the price that's given by -------------------------------------------------------------------------- the Black-Scholes Option Pricing Formula, and we'll show you that formula on the next slide, though we won't derive it until a later lecture. On the next slide, we'll see that that formula -------------------------------------------------------------------------- gives a dollar price of 384 point 866 434. Note how close -------------------------------------------------------------------------- these two numbers are, and that's a reflection of the fact that, with such a large value of -------------------------------------------------------------------------- N, Gail has almost achieved continuous hedging by hedging once per second. Okay. Let's take a look at that Black-Scholes Option Pricing Formula. We first recall -------------------------------------------------------------------------- mu, sigma and e to the r. These are the 30-day drift, volatility and risk-free factor, respectively. Remember that Kyle's option had a strike price of 5,000 dollars, and -------------------------------------------------------------------------- K is a traditional notation for the dollar strike price. We -------------------------------------------------------------------------- divide K by the 30-day risk-free factor to get the present value of the strike price. We denote that present value by -------------------------------------------------------------------------- K', and it computes to -------------------------------------------------------------------------- this number. In Kyle's option, at the spot price of $1 per share, the initial cost of the promised 5,000 shares is -------------------------------------------------------------------------- 5,000 dollars. 
This initial dollar price of the promised underlying stock is typically denoted by -------------------------------------------------------------------------- S_0. By the way, notice that, in Kyle's option, -------------------------------------------------------------------------- S_0 and K are equal, and to indicate that, one says that Gail is selling -------------------------------------------------------------------------- "at the money". Two important quantities in the Black-Scholes formula are -------------------------------------------------------------------------- d plus and d minus, and they're defined here. The -------------------------------------------------------------------------- Black-Scholes price is then -------------------------------------------------------------------------- S_0 Phi of d plus minus K prime Phi of d minus. Plugging in the numbers above, we get -------------------------------------------------------------------------- this, which then gives -------------------------------------------------------------------------- these values for d plus and d minus. We grab our Phi-calculating calculator, and compute -------------------------------------------------------------------------- Phi of d plus and Phi of d minus. Plugging in, we get -------------------------------------------------------------------------- this, which calculates to -------------------------------------------------------------------------- 384 point 866 434. (pause...............) Okay. Let's -------------------------------------------------------------------------- clear the calculations and examine the -------------------------------------------------------------------------- formula more closely. We've made no attempt to derive this -------------------------------------------------------------------------- Black-Scholes Option Pricing Formula. That'll come later. For now, we simply examine it.
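As an aid to examining it, here's a short Python sketch of the formula. The price line, S_0 Phi(d plus) minus K' Phi(d minus), is as stated above. For d plus and d minus, we assume the standard definitions, log(S_0 / K') divided by sigma, plus or minus sigma over 2, where sigma is the volatility over the whole term. The slide's definitions aren't reproduced in these notes, so treat that part as an assumption. The numerical inputs are hypothetical.

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S0, K, R, sigma):
    """Black-Scholes call price, using term-length parameters.

    S0: initial dollar value of the promised stock
    K: dollar strike, R: risk-free factor e^r for the term
    sigma: volatility over the term
    """
    K_prime = K / R                                   # present value of strike
    d_plus = math.log(S0 / K_prime) / sigma + sigma / 2.0
    d_minus = d_plus - sigma
    return S0 * Phi(d_plus) - K_prime * Phi(d_minus)

# Hypothetical at-the-money example: S0 = K = 5,000 dollars.
price = black_scholes_call(5000.0, 5000.0, 1.004, 0.10)
print(price)
```

Notice that the function takes four inputs, and that the drift isn't among them.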
One interesting thing that we should notice right now about this formula, is that -------------------------------------------------------------------------- the drift mu isn't used, and Gail's market analyst need -------------------------------------------------------------------------- not bother estimating it. Strangely, even though, for each N, we need mu as a parameter in the N-subperiod binomial model, when we let N approach infinity, and we move to continuous hedging, mu becomes irrelevant. It may also seem strange that, in calculating the price of an option, we don't care whether the underlying is drifting up or down, nor do we care how quickly. We only care about the -------------------------------------------------------------------------- volatility, along with the -------------------------------------------------------------------------- risk-free factor, the -------------------------------------------------------------------------- spot price of the promised underlying and the -------------------------------------------------------------------------- strike price. Of these -------------------------------------------------------------------------- four Black-Scholes parameters, -------------------------------------------------------------------------- volatility is the *only* one whose value is difficult to know. In this lecture, we've pretended that there are clever market analysts who somehow just know how to estimate it. Perhaps there are, but, in fact, what often happens in practice is that -------------------------------------------------------------------------- volatility is unclear, but people who are looking to buy a certain option see, on their computer screens, that a -------------------------------------------------------------------------- price on that option is offered by a seller, with no explanation of how the price was calculated.
Looking at that price, these buyers then wonder what volatility would have produced it -------------------------------------------------------------------------- in the Black-Scholes formula. In the vernacular of the quantitative analyst, -------------------------------------------------------------------------- solving for sigma is called "backing out" the volatility from the price. *That* volatility is called the -------------------------------------------------------------------------- "implied volatility" of the option. We'll see, in a later lecture, that the -------------------------------------------------------------------------- Black-Scholes price increases as the volatility increases, so there's at most one volatility for any given price. Implied volatilities are used by options traders in the way that home-buyers use interest rates. To wit: If you're comparing several possible home purchase agreements, there may be different conditions and terms on each one, making them hard to compare. However the interest rate gives you one basic dimensionless number to which you'll likely pay close attention. In today's home finance market, if you see an interest rate of, say, 25%, you'll likely *not* take that home financing agreement. So it is with options and implied volatility. A buyer comparing an array of options may have difficulty making the comparison, because each contract has its own peculiarities. It helps a great deal to have a single number for each that shows how costly it is. In today's options market, if you see an implied volatility of, say, 45%, you'll likely *not* buy the option. Okay. You might wonder if -------------------------------------------------------------------------- this formula ought not be called the -------------------------------------------------------------------------- *50-50* Black-Scholes Option Pricing formula, since the price it gives is a limit of *50-50* binomial prices. 
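Before we take up that question, here's a brief Python sketch of "backing out" a volatility, as described a moment ago. Since the Black-Scholes price increases with the volatility, a simple bisection search finds the one volatility that reproduces a quoted price. The Black-Scholes function uses the standard definitions of d plus and d minus, which we assume match the slide, and the quoted price is a made-up number.

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S0, K, R, sigma):
    """Black-Scholes call price with term-length parameters (assumed form)."""
    K_prime = K / R
    d_plus = math.log(S0 / K_prime) / sigma + sigma / 2.0
    return S0 * Phi(d_plus) - K_prime * Phi(d_plus - sigma)

def implied_volatility(price, S0, K, R, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection search for the sigma whose Black-Scholes price is `price`."""
    if not (black_scholes_call(S0, K, R, lo) <= price
            <= black_scholes_call(S0, K, R, hi)):
        raise ValueError("price is outside the range bracketed by lo and hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if black_scholes_call(S0, K, R, mid) < price:
            lo = mid        # price too low: volatility must be higher
        else:
            hi = mid        # price too high (or equal): look lower
    return 0.5 * (lo + hi)

# Hypothetical quote: a price generated from a known volatility of 0.10.
quoted = black_scholes_call(5000.0, 5000.0, 1.004, 0.10)
backed_out = implied_volatility(quoted, 5000.0, 5000.0, 1.004)
print(quoted, backed_out)
```

Bisection is slow but hard to break, which is why we chose it here. A practitioner might well prefer a faster root-finder.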
To explain why -------------------------------------------------------------------------- *not*, let's assign a few more -------------------------------------------------------------------------- exercises. You'll now work on a limit of -------------------------------------------------------------------------- 65-35 binomial prices. We'll start at -------------------------------------------------------------------------- 10 subperiods. Taking the 10th root of -------------------------------------------------------------------------- this number (pause...............), -------------------------------------------------------------------------- you get rho. Solving the 65-35 10-subperiod equations with -------------------------------------------------------------------------- these numbers for 30-day drift and volatility, -------------------------------------------------------------------------- you get u and d. -------------------------------------------------------------------------- Here are the *50-50* 10-subperiod equations. We find the probabilities -------------------------------------------------------------------------- here, and also in the -------------------------------------------------------------------------- odd-looking factors -------------------------------------------------------------------------- down here. We change them to 65-35 -------------------------------------------------------------------------- like so, and you should now solve these two equations -------------------------------------------------------------------------- for u and d and then (pause) -------------------------------------------------------------------------- write them in here. This completes calibration. Next, find the risk-neutral probabilities, and do risk-neutral pricing. With the help of the Central Limit Theorem, find an -------------------------------------------------------------------------- approximate price. 
Then set -------------------------------------------------------------------------- N to 100 and -------------------------------------------------------------------------- go through the process again. Then set -------------------------------------------------------------------------- N to 1,000 and -------------------------------------------------------------------------- go through the process again. Then set -------------------------------------------------------------------------- N to 10,000 and -------------------------------------------------------------------------- go through the process again. Then set -------------------------------------------------------------------------- N to 1 million and -------------------------------------------------------------------------- go through the process again. Letting -------------------------------------------------------------------------- N tend to infinity, we'll show, in a later lecture, that these 65-35 -------------------------------------------------------------------------- prices converge, as N tends to infinity, to -------------------------------------------------------------------------- the *same* Black-Scholes price that we saw before. So, if you were to do the exercises on this slide correctly, then -------------------------------------------------------------------------- these two numbers would turn out to be very close. In fact, when I ran through the computations, I got -------------------------------------------------------------------------- exactly the same number, to 6 decimals. There's nothing special about -------------------------------------------------------------------------- 65-35 or 50-50. In fact, for *any* choice of real-world uptick-downtick probabilities, the -------------------------------------------------------------------------- N-subperiod binomial model tends to the -------------------------------------------------------------------------- Black-Scholes model.
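Here's a hedged Python sketch of that experiment. It calibrates the N-subperiod binomial model under 50-50 and under 65-35 real-world probabilities, and prices the call exactly as a discounted risk-neutral expected payoff, summing over the number of upticks. The form of the calibration equations is our reading of the narration, and the drift, volatility and risk-free factor are hypothetical, so the printed prices are not the ones from the slide.

```python
import math

def binomial_call_price(S0, K, R, mu, sigma, N, p):
    """Discounted risk-neutral expected payoff in the N-subperiod model,
    calibrated with real-world uptick probability p."""
    q = 1.0 - p
    gap = sigma / math.sqrt(N * p * q)            # log u - log d
    log_u = mu / N + q * gap
    log_d = mu / N - p * gap
    rho = R ** (1.0 / N)
    p_star = (rho - math.exp(log_d)) / (math.exp(log_u) - math.exp(log_d))
    log_p, log_q = math.log(p_star), math.log(1.0 - p_star)
    total = 0.0
    for j in range(N + 1):                        # j = number of upticks
        payoff = S0 * math.exp(j * log_u + (N - j) * log_d) - K
        if payoff > 0.0:
            # log of C(N, j) * p_star^j * (1 - p_star)^(N - j)
            log_prob = (math.lgamma(N + 1) - math.lgamma(j + 1)
                        - math.lgamma(N - j + 1)
                        + j * log_p + (N - j) * log_q)
            total += math.exp(log_prob) * payoff
    return total / R                              # rho^(-N) equals 1/R

# Hypothetical 30-day inputs; S0 = K = 5,000 dollars.
results = {}
for N in (10, 100, 1000, 10000):
    for p in (0.5, 0.65):
        results[(N, p)] = binomial_call_price(5000.0, 5000.0, 1.004,
                                              0.01, 0.10, N, p)
    print(N, results[(N, 0.5)], results[(N, 0.65)])
```

As N grows, the two columns should draw closer together, whichever real-world probabilities were used for calibration.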
This centrality and universality of Black-Scholes is part of what makes it such an attractive starting point, before moving on to more sophisticated models. Okay. For all this talk, we still haven't stated the Central Limit Theorem, which is, after all, the title of this lecture. In the -------------------------------------------------------------------------- remainder of this lecture, we'll -------------------------------------------------------------------------- state the 50-50 Central Limit Theorem. Time for the last -------------------------------------------------------------------------- intermission of this lecture. -------------------------------------------------------------------------- Welcome to the Sixth (and final) Act of Lecture 4 of Notes on Financial Mathematics, by Scot Adams and Fernando Reitich. Let's return -------------------------------------------------------------------------- to this slide, which we saw back in Act 4. Some of this -------------------------------------------------------------------------- we won't need. The point of what remains was that, if we have a "reasonable" function -------------------------------------------------------------------------- g, and if we want to compute the -------------------------------------------------------------------------- expected value of g(X), then the answer is closely -------------------------------------------------------------------------- approximated by the -------------------------------------------------------------------------- expression at the bottom of the slide. Remember that -------------------------------------------------------------------------- X was defined as H minus T over root N, where -------------------------------------------------------------------------- H is the number of heads and T is the number of tails, after -------------------------------------------------------------------------- N flips of a (pause...............) 
-------------------------------------------------------------------------- fair, or 50-50, coin. Eventually, it'll become important to talk about how to adjust this result when the coin is *not* fair, but, in this lecture, we content ourselves with stating the -------------------------------------------------------------------------- 50-50 Central Limit Theorem, which, roughly speaking, asserts that if -------------------------------------------------------------------------- N is a large number, if -------------------------------------------------------------------------- we flip a 50-50 coin N times, if we -------------------------------------------------------------------------- let H be the number of heads and T the number of tails, and if -------------------------------------------------------------------------- g is any reasonable function, -------------------------------------------------------------------------- then the expected value of -------------------------------------------------------------------------- g(X), which we write out as g of H minus T over root N, -------------------------------------------------------------------------- is close to -------------------------------------------------------------------------- this value. (pause...............) -------------------------------------------------------------------------- Expected value is usually abbreviated by a -------------------------------------------------------------------------- blackboard bold E and -------------------------------------------------------------------------- "is close to" is usually abbreviated by an -------------------------------------------------------------------------- approximately equal sign. Let's adjust the text in -------------------------------------------------------------------------- this sentence, and fit it on one line, -------------------------------------------------------------------------- like so.
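For readers without the slide, the one-line statement probably looks something like the following. The slide's "this value" is not reproduced in this transcript, so the right-hand side is an assumption: it is the standard normal average of g, which is the classical limit for a fair coin.

```latex
\[
\mathbb{E}\!\left[\, g\!\left(\frac{H-T}{\sqrt{N}}\right) \right]
\;\approx\;
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(x)\, e^{-x^{2}/2}\, dx
\]
```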
As it appears now, while good for conveying the *idea* of the Central Limit Theorem, this is *not* really a mathematical -------------------------------------------------------------------------- theorem because of its imprecision. How large is -------------------------------------------------------------------------- large? What's meant by -------------------------------------------------------------------------- reasonable? How close is -------------------------------------------------------------------------- close? Once we answer those questions, we'll have our -------------------------------------------------------------------------- theorem. Let's tackle the -------------------------------------------------------------------------- second question first. -------------------------------------------------------------------------- Here's the expression g(x) to which we applied the theorem earlier. Remember that -------------------------------------------------------------------------- C and k are known. We leave it as an -------------------------------------------------------------------------- exercise to show, -------------------------------------------------------------------------- for every real number x, that -------------------------------------------------------------------------- this expression of x is -------------------------------------------------------------------------- less than 5,000 C times e to the k absolute value x. Here's a hint about how to do this exercise: Dropping -------------------------------------------------------------------------- this minus 1, -------------------------------------------------------------------------- like so, only increases the LHS. Then we're -------------------------------------------------------------------------- here taking the positive part of a positive number which -------------------------------------------------------------------------- has no effect. 
Finally, adding in -------------------------------------------------------------------------- these absolute value bars can't decrease the value. So, we can get -------------------------------------------------------------------------- from the LHS to the RHS, by making three changes. -------------------------------------------------------------------------- The first increases the value, and -------------------------------------------------------------------------- the other two don't decrease it. We now make a -------------------------------------------------------------------------- definition. A function g is exponentially bounded, -------------------------------------------------------------------------- abbreviated as "e-x-p dash b-d-d" -------------------------------------------------------------------------- if there are constants A and B such that -------------------------------------------------------------------------- for all real numbers x, -------------------------------------------------------------------------- g(x) is less than A times e to the B absolute value x. Letting A be -------------------------------------------------------------------------- 5000 C and B be -------------------------------------------------------------------------- k, we see then that -------------------------------------------------------------------------- this expression is -------------------------------------------------------------------------- exponentially bounded in x. 
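The three changes in the hint can be written as one chain of inequalities. This is a sketch that assumes, from the spoken description, that the slide's expression is 5000 times the positive part of C e^{kx} minus 1, with C positive and k nonnegative. The slide itself is not reproduced in this transcript.

```latex
\[
5000\,\bigl(Ce^{kx}-1\bigr)_{+}
\;<\; 5000\,\bigl(Ce^{kx}\bigr)_{+}
\;=\; 5000\,Ce^{kx}
\;\le\; 5000\,Ce^{k|x|}
\qquad \text{for every real } x .
\]
\[
g \text{ is exp-bdd} \iff
\exists\, A, B \ \text{ such that } \ \forall x \in \mathbb{R}, \quad g(x) < A\,e^{B|x|} .
\]
```

The first step is strict because C e^{kx} is positive, and the last step uses kx ≤ k|x|, which is where the assumption that k is nonnegative enters.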
-------------------------------------------------------------------------- We clear some room, -------------------------------------------------------------------------- rewrite our definition at the top, and -------------------------------------------------------------------------- go back to the 50-50 Central Limit Theorem, *except* that -------------------------------------------------------------------------- "reasonable" has been replaced by "continuous and exponentially bounded", which are precise hypotheses. The meanings of -------------------------------------------------------------------------- large and (pause...............) -------------------------------------------------------------------------- close still need to be clarified. Let's replace -------------------------------------------------------------------------- this by -------------------------------------------------------------------------- "for each positive integer N". Then, -------------------------------------------------------------------------- here, we have an expected value that varies as N varies. For example, for -------------------------------------------------------------------------- N equals 1, let's work out a formula for -------------------------------------------------------------------------- this expected value. Note that -------------------------------------------------------------------------- root N is 1, and -------------------------------------------------------------------------- H minus T is either 1 or minus 1, with a -------------------------------------------------------------------------- 50-50 probability of each.
Taking -------------------------------------------------------------------------- g of H minus T over root N, we get either -------------------------------------------------------------------------- g(1) or -------------------------------------------------------------------------- g(-1), with probabilities -------------------------------------------------------------------------- point 5 and point 5. To get the expected value, we -------------------------------------------------------------------------- add the two results, each weighted by its probability. Moving on to -------------------------------------------------------------------------- N equals 2, let's work out a new formula. Note that -------------------------------------------------------------------------- root N is now root 2, and -------------------------------------------------------------------------- H minus T is now 2, 0 or minus 2, with probabilities -------------------------------------------------------------------------- point 25, point 5 and point 25. Taking -------------------------------------------------------------------------- g of H minus T over root N, we get -------------------------------------------------------------------------- g of 2 over root 2, -------------------------------------------------------------------------- g of 0 over root 2, or -------------------------------------------------------------------------- g of minus 2 over root 2, with probabilities -------------------------------------------------------------------------- point 25, point 5 and point 25. To get the expected value, we -------------------------------------------------------------------------- add the three results, each weighted by its probability. As an -------------------------------------------------------------------------- exercise, write out the formula for -------------------------------------------------------------------------- N equals 3.
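The formulas for N equals 1 and N equals 2 follow one pattern, which a short sketch can compute for any N. The function name expected_g and the test function g(x) = e^x are hypothetical choices for illustration, not the lecture's payoff. The test function is continuous and exponentially bounded, and its standard normal average is known to be e to the one half, which is assumed here to be the value the expected values approach.

```python
import math


def expected_g(g, n):
    """Exact expected value of g((H - T)/sqrt(n)) for n flips of a fair coin."""
    root_n = math.sqrt(n)
    total = 0.0
    for heads in range(n + 1):
        tails = n - heads
        prob = math.comb(n, heads) / 2 ** n   # chance of exactly `heads` heads
        total += prob * g((heads - tails) / root_n)
    return total


def g(x):
    # Hypothetical test function; its standard normal average is e^(1/2).
    return math.exp(x)


limit = math.exp(0.5)
for n in (1, 2, 3, 10, 100, 1000):
    print(n, expected_g(g, n), limit)
```

For N equals 1 the loop reduces to point 5 times g(1) plus point 5 times g(-1), matching the formula above, and the printed values give a feel for how quickly the sequence settles down.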
So -------------------------------------------------------------------------- this changes as N changes, giving a *sequence* of expected values, not just one. To state the 50-50 Central Limit Theorem precisely, we -------------------------------------------------------------------------- drop approximations, and say that the -------------------------------------------------------------------------- sequence of expected values -------------------------------------------------------------------------- tends to the expression on the right, -------------------------------------------------------------------------- as N tends to infinity. -------------------------------------------------------------------------- This is a precise mathematical theorem, and, while we won't give a formal mathematical proof of it, we've hinted a great deal at *how* it can be proved, using Fourier transforms and limits of renormalized powers of functions. In the version we're stating here, tail-estimation is another important ingredient. Now, (pause...............) it's not really professional-looking to phrase theorems -------------------------------------------------------------------------- in terms of flipping coins. The meaning is completely precise, and, to make that point, we've even written out exact formulas for the first -------------------------------------------------------------------------- two expected values. Nevertheless, one seeks a more formal mathematical model of coin-flipping. One seeks, it turns out, the concept of a -------------------------------------------------------------------------- sequence of independent random variables, as we'll see in the next lecture. For today, we've gone on long enough. This completes our lecture on the Central Limit Theorem, and we now outline a plan of seven -------------------------------------------------------------------------- future lectures. 
First, we need to have a better idea of what -------------------------------------------------------------------------- random variables are, in a mathematical context. Here, we'll also define and discuss -------------------------------------------------------------------------- independence and sequences of independent random variables. We'll also talk about -------------------------------------------------------------------------- expectation, -------------------------------------------------------------------------- variance and -------------------------------------------------------------------------- standard deviation. Our discussion of standard deviation will explain why the odd-looking factors appeared in the 50-50 volatility equation. Second, we want to -------------------------------------------------------------------------- return to the Central Limit Theorem. We'll state it more formally, and not just in the 50-50 case. Next'll come an important theorem called -------------------------------------------------------------------------- Girsanov's Theorem, part of which asserts that, in -------------------------------------------------------------------------- continuous hedging, the -------------------------------------------------------------------------- volatility in the risk-neutral world is -------------------------------------------------------------------------- the same as in the real world. This is important because it solves the problem of computing -------------------------------------------------------------------------- risk-neutral volatility from observable real-world data. In fact, anything that helps us to see into that imaginary risk-neutral world helps us to price. You may wonder if, under continuous hedging, -------------------------------------------------------------------------- risk-neutral drift is equal to -------------------------------------------------------------------------- real-world drift.
Well, in fact, it's typically not, but -------------------------------------------------------------------------- risk-neutral drift is nevertheless easy to calculate from real-world data. You see, by *definition* of the risk-neutral world, all portfolios have the same drift, so the drift of any asset is just the drift of the bank. So to find the -------------------------------------------------------------------------- risk-neutral drift, we simply ask our banker about the interest rate. Then -------------------------------------------------------------------------- risk-neutral drift is determined by the bank, and -------------------------------------------------------------------------- risk-neutral volatility is, by Girsanov, real-world volatility. Neither depends on -------------------------------------------------------------------------- real-world drift. That's why the real-world -------------------------------------------------------------------------- drift parameter, mu, is unused in continuous hedging, and it -------------------------------------------------------------------------- doesn't appear in the Black-Scholes formula. With Girsanov's Theorem and the Central Limit Theorem as our allies, we'll be in a position to -------------------------------------------------------------------------- derive the Black-Scholes Option Pricing Formula, a major milestone. We'll then devote a lecture to the formalism of -------------------------------------------------------------------------- stochastic processes, which are simply random variables that evolve over time. Whenever something is evolving, understanding its rate of change is important, and there's a whole -------------------------------------------------------------------------- calculus specifically designed for stochastic processes. 
Ordinary Differential Equations also have a stochastic analogue, called -------------------------------------------------------------------------- Stochastic Differential Equations, and we'll develop the basics of that subject. We'll separate out, into another lecture, the most important -------------------------------------------------------------------------- Stochastic Calculus result, which is a stochastic version of the -------------------------------------------------------------------------- Chain Rule. Remember that the ordinary Chain Rule is used when we see an expression inside a function, and we want to compute the derivative, or rate of change. There's something similar for stochastic processes: If you plug a stochastic process into a function, and you want to know the rate of change of the resulting stochastic process, there's a generalization of the Chain Rule that comes into play, and it's called -------------------------------------------------------------------------- Ito's lemma, which can -------------------------------------------------------------------------- also be called the Stochastic Chain Rule. With the -------------------------------------------------------------------------- Stochastic Calculus and Stochastic Differential Equations as our allies, we'll be ready to -------------------------------------------------------------------------- rederive the Black-Scholes Option Pricing formula in a simpler way, but using these sophisticated tools. One big advantage of this new derivation is that it points the way for how quantitative analysts can study *new* models that are more complicated than Black-Scholes. However, they must first learn how to handle the -------------------------------------------------------------------------- Stochastic Calculus and Stochastic Differential Equations. 
So, to do quantitative finance at the highest levels, one has to learn a lot of -------------------------------------------------------------------------- *rules* of advanced mathematics, and the bottom line is that you can do almost anything, if you know -------------------------------------------------------------------------- math rules. See you next time. --------------------------------------------------------------------------