\headline={\tenrm\lefthead\hfill \righthead\hfill Page \folio} \def\lefthead{{\bf Math 5467, Spring 2005}} \def\righthead{{\bf Intro. to inner product and Hilbert spaces } \quad\quad {\number\month/\number\day/\number\year}} \def\ip#1#2{\langle #1,\,#2\rangle} \def\ipv{$\langle v,\,w\rangle$ } \def\r{\real} \def\re{\app{\bf Re }} \def\im{\app{\bf Im }} \S 1 {\bf Introduction to inner product and Hilbert spaces } \gap ``Ordinary'' space, with its notions of length, distance and angle, is the source of the features that we use to \sla{define} {\bf inner product spaces,} a class of vector spaces with these properties. We also use the intuitive idea that, if points seem to be converging, then there ``is'' something for the points to converge \sla{to.} We then study a subclass of the inner product spaces that have this ``completeness'' property of convergence, the {\bf Hilbert spaces.} For us, ``vectors'' can usually be thought of as {\bf position} vectors. \gap \dft{1}{\bf Definition: } {\sl A {\it complex inner-product space\/} is a vector space $V$ with complex scalars, $\com,$ and a complex-valued function $\ip vw,$ called the {\it inner product\/} (defined on $V\times V)$ that has the five following properties: $(a)$ For all $v \in V,\ \ \ip vv\ge 0.$ $(b)$ If $\ip vv = 0$ then $v = 0.$ $(c)$ For all $v$ and $w$ in $V,$ $\ip vw=\overline{\ip wv}.$ $(d)$ For all $v_1,\ v_2\put{and} w$ in $V,\quad\ip {v_1+v_2}w=\ip {v_1}w+\ip {v_2}w.$ $(e)$ For all $v,\ w$ in $V,$ and all scalars $a,$ $\ip {av}w=a\ip vw.$} \gap In case the scalars are real, the axioms are the same, except that \ipv is assumed to be real-valued, so the complex conjugation is dropped in $(c)\colon\ \ip vw=\ip wv.$ We then call $V$ a \its{real inner product space.} \gap When $(c)$ is combined with $(d)$ and $(e)$ in turn, we have $(d')$ For all $v_1,\ v_2$ and $w$ in $V,\ \langle w,v_1 + v_2\rangle = \langle w,v_1\rangle + \langle w,v_2\rangle.$ $(e')$ For all $v,\ w \in V,$ and all scalars $a,\ \langle 
v,aw\rangle = \bar a\ip vw.$ Thus the inner product is linear in the first variable, and {\it conjugate linear\/} in the second. \gap \dft{1.1}{\bf Exercise: } Verify that the real part of the complex number $\ip vw,$ denoted $\re \ip vw,$ is given by $\re \ip vw={1\over 2}(\ip vw+\ip wv)$ and that the imaginary part of the complex number $\ip vw,$ denoted $\im \ip vw,$ is given by $\im \ip vw={1\over 2i}(\ip vw-\ip wv).$ \gap \dft{1.2}{\bf Exercise: } Suppose that $V$ is an inner product space with complex scalars and $v\bullet w:=\re \ip vw.$ Show that $V,$ now using real scalars only, is a real inner product space with inner product $v\bullet w.$ \gap The ``dot product'' in Euclidean space is the basic example of an inner product (the scalars are real in that case...). Note that $\re \ip vw$ is an inner product on the {\sl real\/} vector space obtained by restricting scalar multiplication to the real numbers. This will be important later. \gap \dft{2}{\bf Examples } include $\com$ itself, with $$\ip zw = z\bar w; $$ $C([0,\,1]),$ the complex-valued continuous functions on $[0,\,1]$ with $$ \ip fg := \int_0^1f(x)\overline{g(x)}\,dx; $$ and $L^2(\real^n),$ with $$ \ip fg := \int f(x)\overline{g(x)}\,dx. $$ The complex conjugation in the second factor is there so that the inner product of a ``vector'' with itself is non-negative -- that is, to make \ref{1(a)} be true. Other examples include the ``usual'' finite-dimensional spaces of vectors, such as three-space $(\real^3),$ with the ``dot product'' playing the r\^ole of inner product (in this example the scalars are real), $\real^n$ (or $n-\app{space}),$ consisting of column vectors $(x_1,\,x_2,\,\dots,\,x_n)$ or row vectors $(x_1\,x_2\,\dots\,x_n)$ with inner product (still a dot product, with real scalars) $$ \langle x,\,y\rangle=x\bullet y=\sum_{k=1}^n x_ky_k. 
$$ We can also work with the similar vector spaces $\com^n$ whose vectors $z$ have complex numbers as coordinates, with inner product $$ \langle z,\,w\rangle=z\bullet w=\sum_{k=1}^n z_k\overline{w_k}. $$ In passing, we notice that these last inner products can be expressed as products of matrices. We can write (in the context of $\real^n$ consisting of column vectors) $$ \langle x,\,y\rangle=x^T y=\sum_{k=1}^n x_ky_k, $$ where $x^T$ is the $1\times n$ matrix obtained by turning the column into a row. The matrix product $x^T y$ is well-defined, and its result is a $1\times 1$ matrix that we treat as a scalar. Similarly, in $\com^n,$ $$ \langle z,\,w\rangle=z^T \overline{w}=\sum_{k=1}^n z_k\overline{w}_k. $$ These things are assumed known by the user of standard computer ``packages'' that do the details of linear-algebra computations for us. \gap We next define the length of a vector, and call it the ``norm'' of the vector. The distance between two vectors $v$ and $w$ is then the length of (the norm of) the vector $v-w.$ In the context of vector spaces, ``norm'' is a technical term that picks out the essential features of the concept of length. \gap \dft{3}{\bf Definition: }\sla{A \its{norm} on a vector space $V$ is a real-valued function defined for each vector $v$ in $V,$ usually denoted $\|v\|,$ with properties (i) -- (iii) below, assumed true for all vectors $v$ and $w$ in $V$ and for all scalars $c$ (usually complex numbers for us, but the scalars are often real numbers): \nl (i) $\|v\|\ge 0,$ and $\|v\|= 0$ \iffi $v=0$ (the zero vector); \nl (ii) $\|cv\|=|c| \|v\|;$ \nl (iii) $\|v+w\|\le \|v\|+\|w\|,$ the \its{triangle inequality.}} \gap When a vector space has a norm, we call it a \its{normed space.} Thus a normed space consists of a vector space and a norm defined on the vector space. Definition \ref{3} was ``general;'' the next one is specific to inner product spaces. 
\gap \dft{4}{\bf Definition: } {\sl The {\it norm\/} of an element $v$ in an inner product space is denoted $\|v\|,$ and is given by taking the non-negative square root of $\|v\|^2 = \ip vv.$ That is, $\|v\|=\sqrt{\ip vv\,}.$} \gap Simply calling this a ``norm'' does not make it one! To prove this \sla{is} a norm, we'll use the very important {\it Schwarz inequality\/} in the proof of the triangle inequality. \gap \dft{5}{\bf Theorem (The Schwarz Inequality): } {\sl In an inner product space $V,$ for all vectors $v,\ w,$ $$ |\ip vw|\le \|v\| \|w\|, $$ and equality holds if and only if one of $v$ and $w$ is a multiple of the other.} \Pf This argument uses the quadratic formula! If one of $v$ and $w$ is zero, then equality holds, and the one that is zero is a multiple of the other. So suppose that neither of $v$ and $w$ is zero. Let $z\in\com.$ Consider $\|v - zw\|^2.$ Let us express $z$ in ``polar coordinates.'' For $\theta$ real and fixed (to be chosen later), and for $t \in \real,$ put $z = t e^{i\theta},$ and let $f(t) = \|v - zw\|^2.$ The following are typical ``expansion'' steps we use when working with inner products: $$\eqalign{ f(t) &= \langle v - zw,v - zw\rangle\cr &= \langle v,v - zw\rangle - \langle zw,v - zw\rangle \cr &= \langle v,v\rangle - \langle v,zw\rangle - \langle zw,v\rangle + \langle zw,zw\rangle \cr &= \langle v,v\rangle - (\bar z\ip vw + z\langle w,v\rangle) + |z|^2\langle w,w\rangle \cr &= \langle v,v\rangle - 2 \re \bar z\ip vw + |z|^2\langle w,w\rangle\cr &= \|v\|^2 - 2t \re e^{-i\theta}\ip vw + t^2 \|w\|^2.\cr }$$ Next, choose $\theta$ so that $e^{-i\theta}\ip vw = |\ip vw| .$ With this choice of $\theta,$ we can write $$ f(t) =\|v\|^2 - 2t |\ip vw| + t^2 \|w\|^2. $$ Then the quadratic polynomial $f(t),$ being non-negative for all real $t,$ has no real roots, or one, repeated, root. In either case, it has non-positive discriminant. That is, $ 4 |\ip vw|^2 \le 4 \|v\|^2 \|w\|^2 ,$ as desired. 
\gap If equality holds, then $f(t)$ has zero discriminant, hence a real root $t_o$ for the chosen value of $\theta.$ By the definition of $f,$ this means that $v = zw,$ with $z=t_oe^{i\theta}.$ \gap We'll use the Schwarz Inequality and a little algebra to show that $\|v\|=\sqrt{\ip vv\,}$ is really a norm. \gap \dft{6}{\bf Theorem: } {\sl An inner product space, with $\|v\|$ as norm, is indeed a normed space.\/} \gap \Pf By $(a)$ and $(b)$ in the definition of an inner product space, $\|v\|$ is non-negative, and is 0 \iffi $v = 0.$ When $c$ is a scalar, $\|cv\|^2 = \langle cv,cv\rangle = |c|^2\|v\|^2.$ The triangle inequality is an application of the Schwarz inequality: $\|v+w\|^2 = \|v\|^2 + 2 \re \ip vw + \|w\|^2 \le \|v\|^2 + 2 \|v\| \|w\| + \|w\|^2 = (\|v\|+\|w\|)^2;$ the inequality follows. \gap We express the \its{distance} between vectors $v$ and $w$ in terms of the norm: $\|v-w\|$ is the distance between $v$ and $w.$ Thus norm (expressed in terms of the inner product) and vector difference enter into the definition of distance. Distance is used to define \its{convergence} of sequences of vectors. This definition applies to \sla{all} normed spaces, not just inner product spaces. We say that $``v_n\to v\quot$ if $\dsp\lim_{n\to\infty}\|v_n-v\|=0.$ \gap If $S$ is a subset of an inner product space $V,$ we say that $S$ is \its{closed} if (and only if!) whenever $\{v_n\}$ is a \seq of points that are in $S,$ and there is a $v\in V$ \st $v_n\to v,$ then $v\in S$ as well. In other words: $``S$ is closed'' means that limits of \seqd s in $S$ are in $S.$ A set $S\sub V$ is said to be \its{open} if $S^c,$ the complement of $S,$ is closed. It is possible for a set to be neither open nor closed. The \its{complement} of a subset $S$ of $V$ is the set of all elements of $V$ that are not in $S.$ \gap An inner product space is a {\it Hilbert space\/} if it is \its{complete} with respect to the norm just defined (see \rend{4}). 
This means that, for every \seq $\{v_n\}$ of vectors in $V,$ if $\|v_m-v_n\|\to 0$ as $m$ and $n$ \sla{both} tend to infinity, then there is, in $V,$ a vector $v\in V$ \st $\|v_n-v\|\to 0$ as $n\to\infty.$ That is, $v_n\to v.$ Sequences with the property that $\lim_{m\to\infty,\ n\to\infty}\|v_m-v_n\|= 0$ are called \its{Cauchy sequences.} The definition of a Hilbert space can be compactly given by saying that ``an inner product space is a Hilbert space if all its Cauchy sequences converge.'' Usually we work with Hilbert spaces, since it's handy to have limits of Cauchy sequences available. The first and third of the examples are Hilbert spaces; the second is not. Finite dimensional inner product spaces are Hilbert spaces ``automatically.'' \gap The {\bf parallelogram identity} is useful (the squares on the diagonals of a parallelogram add to the sum of the squares on its four sides): \gap \dft{7}{\bf Theorem (The Parallelogram Identity):} \sla{For all $u,\ v$ in $V,\ \|u+v\|^2 + \|u-v\|^2 = 2\|u\|^2 + 2\|v\|^2 .$} \gap The proof is a direct calculation, by expansion of the left-hand side, as done in the proof of the Schwarz inequality. \gap The polarization formula (this is how we can find inner products, if we can measure enough ``energies''): \gap \dft{8}{\bf Theorem (The Polarization Formula):} \sla{$$ \put{For all} v,\ w \put{in} V,\quad \ip vw = {1\over 4}\sum_{k=0}^3 i^k\,\|v+i^kw\|^2 . $$} This is proved by expansion and simplification on the right-hand side, using $i^2=-1,\ i^3=-i,\ i^4=1.$ If the scalars are real, there is a similar, simpler formula, the discovery of which is left to the reader as an exercise. 
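\gap {\bf A sketch of the expansion for \ref{8}: } What follows is only a sketch of the calculation, written with $v$ and $w$ as the two vectors in the formula; each step should still be checked against the axioms. For $k=0,\,1,\,2,\,3,$ properties $(d),\ (d'),\ (e)$ and $(e')$ give $$ \|v+i^kw\|^2=\|v\|^2+\overline{i^k}\,\ip vw+i^k\,\ip wv+\|w\|^2. $$ Multiplying by $i^k,$ and using $i^k\,\overline{i^k}=1$ and $i^{2k}=(-1)^k,$ we get $$ i^k\,\|v+i^kw\|^2=i^k\big(\|v\|^2+\|w\|^2\big)+\ip vw+(-1)^k\,\ip wv. $$ Since $\sum_{k=0}^3 i^k=0$ and $\sum_{k=0}^3(-1)^k=0,$ summing over $k$ leaves $$ \sum_{k=0}^3 i^k\,\|v+i^kw\|^2=4\,\ip vw. $$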
\gap \dft{9}{\bf Definitions:} {\sl Vectors $v,\ w$ in an inner product space $V$ are {\it orthogonal\/} if $\ip vw = 0.$ Notation: $v\perp w.$ A set ${\cal O}$ in an inner product space is \its{orthogonal} if $v\perp w$ whenever $v$ and $w$ are unequal and both belong to ${\cal O}.$ An orthogonal set ${\cal O}$ is \its{orthonormal} if each of its elements has unit norm.} In particular, $ 0 $ is orthogonal to every vector $v.$ \gap \dft{10}{\bf Theorem (Pythagoras): } \sla{If, in an inner product space, $x\perp y,$ then $$ \|x\|^2+\|y\|^2=\|x+y\|^2=\|x-y\|^2 $$} The proof is a calculation, by expansion, as before. \gap {\bf The advantages of orthogonality} \gap {\bf Note: } the \its{span} of a set ${\cal S}$ of vectors is the set of all linear combinations of elements of ${\cal S},$ denoted $\app{span}({\cal S}).$ Linear combinations are always finite!! We are used to expressing vectors in 3-space in the form $$ v=\left(\matrix{x\cr y\cr z\cr}\right) =x\left(\matrix{1\cr 0\cr 0\cr}\right) +y\left(\matrix{0\cr 1\cr 0\cr}\right) +z\left(\matrix{0\cr 0\cr 1\cr}\right) :=xe_1+ye_2+ze_3,\put{so that} v\in \app{span}(\{e_1,\,e_2,\,e_3\}), $$ and we know that $e_i\perp e_j$ if $i\ne j.$ This means that every vector in 3-space can be represented this way in exactly one way, because we can recover the coordinates by taking dot products: if $v$ is a vector in 3-space, $$ v=(v\bullet e_1)e_1+(v\bullet e_2)e_2+(v\bullet e_3)e_3. $$ In other words, if we are given an orthonormal set ${\cal O},$ and we know that $v$ is in $\app{span}({\cal O}),$ then we can find the coefficients in the linear combination by taking inner products: $$ v=\sum_{v_n\in {\cal O}}\langle v,\,v_n\rangle v_n. $$ This is in contrast to what one has to do to find the coefficients when the vectors $v_n$ are not orthonormal. We usually have to find some kind of an ``inverse matrix,'' or solve a system of equations using Gaussian elimination, for example. 
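\gap {\bf A small illustration in $\real^2$: } The vectors $u_1,\ u_2,\ b_1,\ b_2$ and $v$ below are chosen only for illustration; they are not used elsewhere. Put $u_1:={1\over\sqrt 2}(1,\,1)$ and $u_2:={1\over\sqrt 2}(1,\,-1).$ Then $\{u_1,\,u_2\}$ is orthonormal. For $v=(3,\,1),$ $$ v\bullet u_1={4\over\sqrt 2}=2\sqrt 2,\qquad v\bullet u_2={2\over\sqrt 2}=\sqrt 2, $$ and indeed $$ (v\bullet u_1)u_1+(v\bullet u_2)u_2=(2,\,2)+(1,\,-1)=(3,\,1)=v. $$ By contrast, for the pair $b_1:=(1,\,0),\ b_2:=(1,\,1),$ which is not orthogonal, we have $v\bullet b_1=3$ and $v\bullet b_2=4,$ but $v=2b_1+b_2.$ The coefficients $2$ and $1$ come from solving the system $a+c=3,\ c=1,$ not from the dot products.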
\gap {\bf Orthogonal decomposition and projections} \gap We can ``drop a perpendicular'' in a Hilbert space. Put another way: if $d$ is the distance from a point $y$ to a closed \sla{convex} set $X$ in $H$, then the closed ball of radius $d,$ center $y,$ meets $X$ at exactly one point $x_o.$ With reference to that point, ``real'' angles between $y$ and points $x$ in $X$ are at least $90^\circ.$ I.e., $\re \ip {y-x_o}{x-x_o}\le 0.$ \gap {\bf Three definitions: Convex set, closed set, closure of a set: } A set $X$ in a vector space is \its{convex} if the line segments joining pairs of points in $X$ lie in $X$ also. A set $X$ in a vector space with a norm is \its{closed} if whenever $\{x_n\}$ is a \seq of points in $X$ that converges to a point $y$ [meaning that $\|x_n-y\|\to 0],$ then $y\in X.$ The \its{closure} of a set $X$ is denoted $\overline X$ and consists of all the points that are in $X$ as well as all the points that are the limits of \seqd s of points in $X.$ Example in $\r\colon$ $\overline{(0,\,1)}=[0,\,1].$ \gap \dft{11}{\bf Theorem: } {\sl If $X$ is a closed nonempty convex set in a Hilbert space $H,$ then for every $y$ in $H,$ there is a unique $\xi\in X$ such that $\re\langle y - \xi, x - \xi\rangle \le 0$ for all $x\in X.$ Indeed, $\xi$ is the element of $X$ closest to $y.$} \gap \Pf This classic argument exploits the parallelogram identity. Let $d := \app{dist}(y,X) = \inf_{x\in X}\|y-x\|.$ Then there is a ``minimizing sequence'' $\{x_n\}$ in $X$ such that $d =\lim_{n\to\infty}\|y-x_n\|.$ We define $\ep_n$ by $\ep_n^2 = \|y-x_n\|^2 - d^2.$ Since $\|y-x_n\|\to d,$ $\ep_n\to 0.$ By the parallelogram identity $$ \|(y-x_n) + (y-x_m)\|^2 + \|x_m-x_n\|^2 = 2\|y-x_n\|^2 + 2\|y-x_m\|^2, $$ or (making changes on both sides of this equation) $$ 4\left\|y - {x_n + x_m\over 2}\right\|^2 + \|x_m-x_n\|^2 = 2d^2 + 2\ep_n^2 + 2d^2 + 2\ep_m^2= 4d^2 + 2\ep_n^2+ 2\ep_m^2. 
$$ Since $X$ is convex, ${x_n + x_m\over 2}\in X,$ so $\|y - {x_n + x_m\over 2} \| \ge d.$ Hence $4d^2 + \|x_m-x_n\|^2 \le 4d^2 + 2\ep_n^2 + 2\ep_m^2,$ that is, $\|x_m-x_n\|^2\le 2\ep_n^2 + 2\ep_m^2.$ Thus $\{x_n\}$ is Cauchy, and so (since $H$ is complete and $X$ is closed) converges to an element $\xi$ of $X.$ This argument, applied with some other minimizing sequence $\{\widehat x_n\}$ in place of $\{x_m\}$ and ${\widehat{\ep_n}}^2$ in place of $\ep_m^2,$ shows the uniqueness of $\xi.$ \gap To verify the statement about angles in the ``real'' version of $H,$ let $x$ be \sla{any} element of $X.$ Then $$ d^2 \le \|y - x\|^2 = \|y - \xi\|^2 + 2 \re \langle y - \xi,\xi - x\rangle + \|\xi - x\|^2. $$ Since $d=\|y- \xi\|,$ we cancel $d^2$ and $\|y - \xi\|^2$ in the inequality above. The result is: $$ 0 \le 2 \re \langle y - \xi,\xi - x\rangle + \|\xi - x\|^2, \put{or}\re \langle y - \xi,x-\xi\rangle \le {1\over 2}\|\xi - x\|^2,\put{for \sla{every}}x\in X. $$ For $0 < r < 1,$ let $x^* := rx + (1-r)\xi.$ Then $x^*$ is in $X$ (because $X$ is convex) so we can put $x^*$ into the inequality above, in place of $x.$ Then $\re \langle y - \xi,x^*-\xi\rangle \le {1\over 2}\|\xi - x^*\|^2={1\over 2}\|x^*-\xi\|^2.$ When we subtract $\xi$ from (the equation that defined) $x^*$ we find that $x^* -\xi = r(x -\xi),$ so we have $\re r\langle y - \xi,x - \xi\rangle \le {r^2\over 2}\|\xi - x\|^2.$ We now cancel an $r$ from both sides and let $ r\to 0.$ The left-hand side does not change and the other side tends to zero, so in the limit we have $\re\langle y - \xi, x - \xi\rangle \le 0$ for all $x\in X.$ \gap {\bf Remarks } The argument just completed really took place in the three-dimensional {\sl real\/} vector space $Y$ spanned by $x,\ y$ and $\xi,$ using $\re\langle v, w\rangle$ as the inner product on $Y.$ We noticed that $d=\|y- \xi\|$ and that $\xi$ is \sla{unique.} This means that $\xi$ \sla{is the one and only element of $X$ that is \its{closest} to $y.$} \gap \dft{12}{\bf Corollary: } {\sl If $X$ is a closed subspace of $H,$ then $y- \xi\perp X.$ If $\xi'\in X$ and $y - \xi' \perp X,$ then 
$\xi' = \xi.$} \gap \Pf Let $x\in X.$ Because $X$ is a subspace, we also have $\widehat x := \xi -(x-\xi) \in X,$ so $\widehat x-\xi=-(x-\xi).$ Therefore by \ref{11} we have $\re \langle y - \xi,\widehat x-\xi\rangle \le 0.$ We can thus replace $\widehat x-\xi$ by $-(x-\xi)$ in the last inequality and we get $-\re \langle y - \xi,x - \xi\rangle \le 0.$ Hence $0\ge \re \langle y - \xi,x - \xi\rangle \ge 0,$ so $\re \langle y - \xi,x - \xi\rangle=0.$ The same argument works when we redefine $\widehat x$ by $\widehat x = \xi \pm i(x-\xi) \in X.$ This yields $\im \langle y - \xi,x - \xi\rangle = 0.$ Since $x-\xi$ runs over all of $X$ as $x$ does, $y- \xi\perp X.$ \gap If $ \xi'\in X$ and $y - \xi' \perp X$ then $y - \xi' \perp \xi' - \xi$ so $$ d^2 = \|y - \xi\|^2 = \|y - \xi' + \xi' - \xi\|^2 = \|y - \xi'\|^2 + \|\xi' - \xi\|^2 \ge d^2 + \|\xi' - \xi\|^2. $$ This implies that $\xi' = \xi.$ \gap \dft{12.1}{\bf Exercise: } Verify the part of the proof of \ref{12} about the imaginary part of $\langle y - \xi,x - \xi\rangle.$ \gap {\bf Definition of orthogonal complement and orthogonal projection} \gap If $X$ is a closed subspace of $H,$ set $X^\perp = \{y\in H: \langle y,x\rangle = 0 \put{for all} x\in X\}.$ Then $X^\perp$ is closed: if $y_k\in X^\perp$ and $y_k\to y$ then \fA $x\in X$ $|\ip{y}{x}|=|\ip{y-y_k}{x}+\ip{y_k}{x}|=|\ip{y-y_k}{x}| \le \|y-y_k\|\|x\|\to 0,$ so $\ip{y}{x}=0$ and $y\in X^\perp.$ It is left as an exercise to show that $X^\perp$ is a subspace and that $X^\perp\cap X =\{0\}.$ $X^\perp$ is called the \its{orthogonal complement} of $X.$ For $u\in H,$ let $P(u) (= P_X(u))$ denote the element of $X$ closest to $u.$ \sla{Recall that it is unique, and is the only element $\xi$ of $X$ such that $u-\xi\perp X.$} $P(u)$ is called the \its{projection of $u$ onto $X.$} \gap \dft{13}{\bf Theorem: } \sla{$P$ is a linear map.} \gap \Pf Suppose that $a,\ b$ are scalars, and that $u,\ v$ are elements of $H.$ Then $$ \langle au + bv - (aP(u) + bP(v)), x\rangle = \langle au - aP(u), x\rangle + \langle bv - bP(v), x\rangle = a\langle u - P(u), x\rangle + 
b\langle v - P(v), x\rangle = 0 $$ for all $x\in X.$ Since $P(au + bv)$ is the unique $\xi\in X$ \st $au + bv - \xi\perp X,$ $P(au + bv) = aP(u) + bP(v).$ \gap Now we can express $u = P_Xu + (u - P_Xu)$ as the sum of terms in $X$ and in $X^\perp.$ This implies too that $I - P_X = P_{X^\perp}.$ These are called the {\it orthogonal projections\/} onto $X$ and $X^\perp,$ respectively. It is routine to show that they are projections. Orthogonality shows that $\|u\|^2 = \|P_Xu\|^2 + \|u - P_Xu\|^2 \ge \|P_Xu\|^2,$ so $P_X$ is continuous (proof is an exercise). The formula $I - P_X = P_{X^\perp}$ leads easily to a proof of the relation $(X^\perp)^\perp = X.$ All this can be applied to deduce such things as: \sla{The span of a subset $S$ of $H$ is dense in $H$ if and only if $ y\perp S$ implies $y = 0.$} \gap \dft{13.1}{\bf Exercise: } A function $f:V\to V$ is \its{continuous} on a normed space $V$ if for every $v_o\in V$ and for every $\ep>0$ there is a $\del>0$ (which in general depends on $f,$ $v_o$ and $\ep)$ \st $\|f(v)-f(v_o)\|<\ep$ whenever $\|v-v_o\|<\del.$ Specialize the definition to functions that are continuous at just one point. Show that $P_X$ is continuous. \gap {\bf Existence and properties of an orthonormal basis } \gap A set $S$ in an inner product space is {\it orthogonal\/} if $v \perp w$ whenever $v$ and $w$ are two different elements of $S.$ If every element of an orthogonal set $S$ has norm $1,$ we say $S$ is {\it orthonormal.\/} We want to know that, in Hilbert spaces, orthonormal sets exist with the useful property that every element $x$ of the Hilbert space can be ``expanded'' as an infinite series of the form $x=\sum_{v\in S} \langle x,\,v\rangle v.$ The meaning of such a series has to be clarified, and we also want the formula $\|x\|^2=\sum_{v\in S} |\langle x,\,v\rangle|^2$ to be true (and properly explained). 
\gap \dft{13.2} {\bf Example: } \sla{If $\cal O$ is a finite \on\ set in an inner product space $V,$ then $\app{span}({\cal O})$ is a closed subspace of $V.$} To see that this is true we need to show that if $v_m\in \app{span}({\cal O})$ and $\|v_m-y\|\to 0$ as $m\to\infty,$ then $y\in \app{span}({\cal O}).$ Since $v_m\in \app{span}({\cal O})$ we know that $\dsp v_m=\sum_{x\in\cal O}\ip{v_m}{x}x.$ Then for every $w\in V$ $$ |\ip{v_m}{w}-\ip{y}{w}|=|\ip{v_m-y}{w}|\le\|v_m-y\|\|w\|\to 0. $$ Thus for every $x\in\cal O,$ $\ip{v_m}{x}\to\ip{y}{x}.$ Therefore $\dsp v_m=\sum_{x\in\cal O}\ip{v_m}{x}x\to \sum_{x\in\cal O}\ip{y}{x}x.$ But also $v_m\to y,$ so (by uniqueness of limits in a normed space) $\dsp y=\sum_{x\in\cal O}\ip{y}{x}x$ and therefore $y\in \app{span}({\cal O}).$ Thus $\app{span}({\cal O})$ is a closed set and is also a subspace, as claimed. \gap \dft{14}{\bf Theorem: } {\sl Every non-trivial inner product space has a maximal orthonormal subset.\/} \gap \Pf This argument uses one of the forms of the Axiom of Choice (called ``The Maximal Principle'' in [1, p. 33]). The collection of (non-empty) orthonormal subsets of an inner product space is non-empty, since for each non-zero $v\in V,\ \{v/\|v\|\}$ is a non-empty orthonormal set. If a collection of orthonormal sets is linearly ordered by inclusion, it is routine to show that the union of them all is an orthonormal set. Hence, there is a maximal such set. \gap \dft{15}{\bf Corollary: } {\sl Every orthonormal subset of a Hilbert space is contained in some maximal orthonormal set.\/} \gap \dft{16}{\bf Theorem: } {\sl The span of a maximal orthonormal set in a Hilbert space $H$ is dense.\/} \gap \Pf A subset $D$ of a normed space is \its{dense} in the normed space if every element of the normed space is the limit of a \seq of points in the set $D.$ The \its{closure} of a set $S$ is (unhappily!) 
denoted $\overline S$ and consists of all points $x$ in $H$ \st $x=\lim_{n\to\infty} s_n,$ where all the $s_n$ lie in $S.$ Suppose that $\cal O$ is a maximal \on\ set and that $\app{span}({\cal O})$ is not dense. Let $X$ denote the closure of the span of the maximal orthonormal set: $X:=\overline{\app{span}({\cal O})}.$ Let $y\in H\setminus X.$ Then $0 \ne v := y-P_Xy \in X^\perp,$ so the union of the given maximal orthonormal set and $\{v/\|v\|\}$ is a larger orthonormal set than $\cal O,$ contradicting the maximality of $\cal O.$ \gap \dft{16.1}{\bf Exercise: } Show that if the span of an \on\ set $\cal S$ in a Hilbert space $H$ is dense then $\cal S$ is a maximal \on\ set. \gap \dft{17}{\bf Definition: } {\sl A maximal orthonormal set in a Hilbert space is called an {\it orthonormal basis.\/}\/} \gap \dft{17.1}{\bf Exercise: } Verify that a closed subspace $X$ of a Hilbert space $H$ is itself a Hilbert space, using the inner product on $H$ as the inner product on $X$ (we use only elements of $X$ in the inner product ``inherited'' from $H).$ Apply this, \ref{14} and \ref{17} to show that if ${\cal O}$ is an \on\ set then $X:=\overline{\app{span}({\cal O})}$ \sla{has} an \onb. \gap \gap {\bf Remarks: } By \ref{14} \sla{every} Hilbert space has an \onb. An orthonormal basis ``of'' $H$ is not a basis in the usual sense unless it is finite! This curiosity is a consequence of completeness. See {\bf Deferred proofs, Item 2}. 
\gap One feature of orthonormal sets is: \gap \dft{18}{\bf Theorem (Bessel's inequality): } {\sl If ${\cal O}$ is an orthonormal set in an inner product space $V,$ then for each $v\in V,$ at most countably many of the numbers $\langle v,y\rangle$ can be non-zero, and $\sum_{y\in {\cal O}}|\langle v,y\rangle |^2 \le \|v\|^2.$} \gap \Pf Let ${\cal F}$ be a \sla{finite} subset of ${\cal O}.$ Let $w =\sum_{y\in {\cal F}}\langle v,y\rangle y.$ Then, by orthonormality, $$ \|w\|^2 = \Big\|\sum_{y\in {\cal F}}\langle v,y\rangle y\, \Big\|^2 = \sum_{y\in {\cal F}}|\langle v,y\rangle |^2 , $$ and $\widetilde y\perp v-w$ for each $\widetilde y\in{\cal F}.$ Thus, since $w$ is a linear combination of the $\widetilde y$ in $\cal F,$ $w\perp v-w.$ Hence $\|v\|^2= \|w\|^2 + \|v-w\|^2 \ge \|w\|^2,$ as claimed, at least for finite orthonormal sets. \gap It follows that there are only finitely many $y\in {\cal O}$ such that $|\langle v,y\rangle | \ge 1,\ \ |\langle v,y\rangle | \ge 1/2,\ \ |\langle v,y\rangle | \ge 1/3,$ and so on. This proves the countability assertion. We {\it define\/} $\sum_{y\in {\cal O}}|\langle v,y\rangle |^2$ as follows: $$ \sum_{y\in {\cal O}}|\langle v,y\rangle |^2 := \sup_{y\,\in {\cal F}\sub{\cal O},\ {\cal F} \app{\ \small{finite} }} \sum_{y\in {\cal F}}|\langle v,y\rangle |^2 . $$ We showed that each sum on the right is bounded by $\|v\|^2,$ so Bessel's Inequality holds. 
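\gap {\bf A small illustration of Bessel's inequality: } The particular vector $v$ below is chosen only for illustration. In $\real^3,$ with $e_1,\ e_2,\ e_3$ the usual coordinate vectors, let ${\cal O}=\{e_1,\,e_2\}$ and $v=(1,\,2,\,2).$ Then $$ \sum_{y\in{\cal O}}|\langle v,y\rangle|^2=1^2+2^2=5\le 9=\|v\|^2. $$ Here $w=\langle v,e_1\rangle e_1+\langle v,e_2\rangle e_2=(1,\,2,\,0)$ and $v-w=(0,\,0,\,2),$ so the deficit in the inequality is exactly $\|v\|^2-5=4=\|v-w\|^2.$ The inequality is strict in this case because ${\cal O}$ is not maximal: $e_3$ is missing.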
\gap {\bf A continuation valid in Hilbert spaces:} \gap Now $\|v\|^2 = \sum_{y\in {\cal F}}|\langle v,y\rangle |^2 + \|v-w\|^2 ,$ where $w = \sum_{y\in {\cal F}}\langle v,y\rangle y.$ Since $\|v-w\|^2=\dsp\inf_{\widehat w\in \app{\small{ span}}\,{\cal F}}\|v-\widehat w\|^2,$ we can show that $$ \|v\|^2 = \sum_{y\in {\cal O}}|\langle v,y\rangle |^2 + \inf_{w\in \app{\small{ span}}\,{\cal O}}\|v-w\|^2 =\sum_{y\in {\cal O}}|\langle v,y\rangle |^2 + d^2, $$ where $d^2$ denotes the square of the distance from $v$ to $\overline{\app{ span}\,{\cal O}} .$ Let us prove this (in {\bf Deferred proofs, Item 3}) after we look at some applications. If ${\cal O}$ is an orthonormal basis then $d^2 = 0,$ and so, in a Hilbert space, \gap \dft{19}{\bf Theorem (Parseval's relation): } {\sl If ${\cal O}$ is an orthonormal basis in a Hilbert space $H,$ then for all $x\in H,$ $$ \|x\|^2 =\sum_{y\in {\cal O}}|\langle x,y\rangle |^2. $$} \gap Polarization in $H$ and in $\com$ gives Plancherel's Theorem: \gap \dft{20}{\bf Theorem (Plancherel's Theorem): } {\sl Suppose ${\cal O}$ is an orthonormal basis in a Hilbert space $H.$ Then, for all $x\in H,\ y\in H,$ $$ \langle x,y\rangle = \sum_{u\in {\cal O}}\langle x,u\rangle \langle u,y\rangle =\sum_{u\in {\cal O}}\langle x,u\rangle \overline{\langle y,u\rangle}. $$} \gap An application of Parseval's relation: if $\langle x,y\rangle = \langle x',y\rangle$ for all $y$ in an orthonormal basis of a Hilbert space, then $x = x'$ (we replace $x$ by $x-x'$ in Parseval's relation). \gap Just as these numerical series converge, so do vector-valued series of the form $\sum_{y\in {\cal O}}c_yy,$ where ${\cal O}$ is an orthonormal set in a Hilbert space, whenever $\sum_{y\in {\cal O}}|c_y|^2 < \infty.$ Proof that the ``sum'' is independent of the order of the terms will be part of {\bf Deferred proofs (Item 4)}. 
Proof that a specific (as to order) such ``sum'' exists is part of the proof of the next Theorem, in which we change our point of view, starting there with a set of coefficients as ``givens.'' \gap \dft{21}{\bf Theorem of Fischer and Riesz: } {\sl If ${\cal O}$ is an orthonormal set in a Hilbert space $ H,$ and for each $y\in {\cal O},\ \ c_y$ is a given complex number such that $\sum_{y\in {\cal O}}|c_y|^2 < \infty,$ then there exists $x\in H$ such that $\langle x,y\rangle = c_y$ for all $y\in {\cal O}.$} \gap \Pf Since $\sum_{y\in {\cal O}}|c_y|^2 < \infty,$ the set of $y$ such that $c_y$ is not zero is countable. They can be enumerated in some way: $y_1,\ y_2,\ \dots$ Consider $x_n:= \sum_{k=1}^n c_ky_k,$ where $c_k$ denotes the cumbersome $c_{y_k}.$ If $m < n,$ then $\|x_n-x_m\|^2 = \sum_{k=m+1}^n|c_k|^2,$ so $\{x_n\}$ is Cauchy, hence has a limit $x$ in $H$. By continuity of the inner product, for $k$ fixed $\langle x,y_k\rangle = \lim_{n\to\infty}\langle x_n,y_k\rangle = c_k.$ If $ y\in {\cal O}$ is not one of the $y_k,$ then $\langle x_n,y\rangle = 0$ for every $n,$ so $\langle x,y\rangle = 0 = c_y.$ \gap The Theorem of Fischer and Riesz does not ``pin down'' the $x$ that it asserts the existence of. There could be infinitely many of them. Examples abound, even in $\real^2\colon$ we can let ${\cal O}=\{(1,\,0)\},$ put $y=(1,\,0),$ and let $c_y=1.$ Then every vector $x=(1,\,t)$ satisfies $\langle x,y\rangle =1=c_y.$ We can find an ``optimal'' $x$ though, and that is what the Corollary that follows is all about. 
\gap \dft{22}{\bf Corollary: } \sla{If ${\cal O}$ is an orthonormal set in a Hilbert space $ H,$ and for each $y\in {\cal O},\ \ c_y$ is a given complex number such that $\sum_{y\in {\cal O}}|c_y|^2 < \infty,$ then there exists exactly one $x_o\in \overline{\app{span}({\cal O})}$ such that $\langle x_o,y\rangle = c_y$ for all $y\in {\cal O}.$} \gap \Pf We can, for short, set $X:=\overline{\app{span}({\cal O})}.$ Then $X$ is a closed subspace of $H.$ We then apply the Fischer-Riesz Theorem to obtain $x\in H$ such that $\langle x,y\rangle = c_y$ for all $y\in {\cal O}.$ We next define $x_o:=P_Xx.$ Then $x-x_o=x-P_Xx\perp X,$ so $x-x_o\perp y$ for every $y\in {\cal O}.$ Hence for each $y\in{\cal O},$ $$ \langle x_o,\,y\rangle =\langle x_o-x+x,\,y\rangle =\langle x_o-x,\,y\rangle+\langle x,\,y\rangle=0+c_y=c_y. $$ This takes care of existence. \gap For uniqueness, we suppose that $x_b\in \overline{\app{span}({\cal O})}$ is such that $\langle x_b,y\rangle = c_y$ for all $y\in {\cal O}.$ Then $$ \langle x_o-x_b,y\rangle = c_y-c_y=0 $$ for all $y\in {\cal O}.$ Then $\langle x_o-x_b,v\rangle = 0$ for all $v\in \app{span}({\cal O}).$ It follows that $x_b=x_o.$ \gap \dft{22.1} {\bf Exercise: } Show that $x_b=x_o.$ \small{Hint: both of these vectors are in $\scr{\overline{\app{span}({\cal O})}}.$} \gap \S 2 {\bf An inner product space can be embedded into its dual space by a conjugate-linear isometry} \gap What if we have an inner product space that is \sla{not} a Hilbert space? This part of this introduction is devoted to showing how we can ``fill up'' an inner product space by adjoining the elements that ``ought to be there'' as limits of Cauchy sequences. This process is called \its{completion.} A specific example might show what this means. The space $C[-1,\,1]$ of continuous functions on $[-1,\,1]$ can be made into an inner product space by defining $$ \langle f,\,g \rangle:=\int_{-1}^1 f(x)\overline{g(x)}\,dx. 
$$ If we set $$ f_n(x):=\left\{ \matrix{ 0,&\put{if}-1\le x\le 0,\cr nx,&\put{if}0\le x\le 1/n,\cr 1,&\put{if}1/n\le x\le 1,\cr }\right. $$ then each of these functions is in $C[-1,\,1]$ and $\{f_n(x)\}$ converges \sla{pointwise} to the function that is zero if $-1\le x\le 0$ and is $1$ if $0< x\le 1.$ It takes some calculation (that you can ``easily'' do if you feel like it) to show that $\{f_n(x)\}$ is also a Cauchy \seq in this inner product space $C[-1,\,1]\colon$ for $m<n,$ the difference $f_n-f_m$ vanishes outside $[0,\,1/m]$ and satisfies $0\le f_n-f_m\le 1$ there, so $\|f_n-f_m\|^2\le 1/m.$ But the \seq cannot have a continuous limit \wrt the norm. This is intuitively obvious. Actually proving it mathematically is tough exactly because it is intuitively obvious! The process that follows, completion, is technical and is included here for completeness, and the pun is more or less intended. Since only the mathematically inclined or the brave will read it, it will be presented tersely. \gap If $V$ is an inner product space, we let $V^*$ denote the space of \sla{continuous} linear functionals on $V.$ We will let $v^*,\ w^*,$ etc. denote generic elements of $V^*.$ Thus, for all $v$ and $w$ in $V,$ and for all complex scalars $\alf$ and $\beta,$ $v^*(\alf v+\beta w) =\alf v^*(v)+\beta v^*(w)$ $(v^*$ is linear) and $\dsp\lim_{v\to v_o}v^*(v)=v^*(v_o)$ $(v^*$ is continuous). $V^*$ is called the {\it dual space of $V.$\/} $V^*$ is a \its{Banach space,} namely a normed vector space in which Cauchy sequences converge (the proof is not relevant to this course). The norm we will use on $V^*,$ at least at first, is $$ \|v^*\|_* :=\sup_{\|v\|\le 1} |v^*(v)|. $$ The proof that this is a norm will be omitted here. 
We will use the notation \ipv for the inner product of $v$ and $w$ in $V,$ and $\|v\|$ for the norm of $v$ in $V.$ \gap \dft{23}{\bf A conjugate-linear embedding of $V$ into $V^*$} \gap For each $v_o\in V,$ we define $Ev_o(v) := \langle v,v_o\rangle.$ Thus $Ev_o$ is a linear functional on $V.$ \gap \dft{24}{\bf Claim: } {\sl $Ev_o$ belongs to $V^*$, and $\| Ev_o \|_* =\|v_o\|.$} \gap \Pf $|Ev_o(v)| = |\langle v,v_o\rangle |\le \|v\| \|v_o\|,$ by the Schwarz inequality. Therefore $\|Ev_o\|_* \le \|v_o\|.$ If $v_o \ne 0,$ then $v := v_o/\|v_o\|$ yields $Ev_o(v)=\langle v_o/\|v_o\| , v_o\rangle = \|v_o\| \le \|Ev_o\|_*,$ so actually $\| Ev_o \|_* = \|v_o\|$ (if $v_o = 0,$ then $Ev_o =0$ (of $V^*)).$ \gap A pair of properties of the mapping $E\colon\ E(v_o + v_1) = Ev_o + Ev_1,$ and $Eav_o = \bar a Ev_o;$ that is, $E$ is {\it conjugate linear.\/} In particular, $E(v_o - v_1) = Ev_o - Ev_1.$ These properties are proved using the definition of $E.$ Therefore, \gap \centerline{$E$ is an isometry (a distance-preserving map) of $V$ onto a subset of $V^*.$} \gap {\bf The conjugate-linear embedding $E$ of $V$ into $V^*$ has dense range} \gap \dft{25}{\bf Theorem: } {\sl $E(V)$ is dense in $V^*.$} \gap \Pf What we have to prove is that, for each $v^*$ in $V^*,$ there exists a sequence $\{v_n\}$ of elements of $V$ such that $Ev_n \to v^*$ in the norm of $V^*.$ Let $v^* \in V^*.$ If $v^* = 0,$ then $E0 = v^*,$ so we may assume that $v^* \ne 0.$ Further, we may assume that $\|v^*\|_* = 1.$ Then, \gap {\sl there exists a sequence $\{v_n\}$ of elements of $V,$ with $\|v_n\| = 1$ for each $n,$ such that $0 \le v^*(v_n) \to 1,$ from below, as $n \to\infty.$\/} \gap The non-negativity of $v^*(v_n)$ is a useful convenience. It is assured by multiplying some ``original $v_n\app{'s''}$ by suitable complex numbers of unit length, as in the proof of the Schwarz inequality. See {\bf Deferred proofs, Item 1} for details. 
\gap We will show that, as $n \to \infty,$ $Ev_n \to v^*,$ in the norm of $V^*.$ If $v^*(v)$ is to act like $\langle v,v_n\rangle$ and $w\perp v_n,$ we expect $v^*(w)$ to be a smaller and smaller fraction of $\|w\|,$ as $n$ increases. First we will prove a Lemma to that effect, and a useful Corollary of it. The Lemma gives an estimate for $v^*(w)$ when $w \perp v_n.$ \gap \dft{26}{\bf Lemma: } {\sl If $v^*\in V^*,\ v\in V,$ and $\|v^*\|_*=1=\|v\|,$ then whenever $w\in V$ and $w\perp v,$ $$ |v^*(w)|\le\sqrt{1-|v^*(v)|^2\,}\,\|w\|. $$} {\bf Remark } If we knew that $v^*(w)=\langle w,\,\widehat v\rangle$ for some $\widehat v$ in $V,$ this would follow from the fact that the cosine of the angle between $\widehat v$ and $w$ is more or less equal (in absolute value) to the {\sl sine\/} of the angle between $\widehat v$ and $v.$ \gap \Pf We notice that $|v^*(v)|\le 1,$ so the square root makes sense. We'll use the quadratic formula to make the estimate. Let $a$ be a complex number. Since $v^*$ has norm one and $w \perp v,$ $$ |v^*(av + w)|\le\|v^*\|_*\|av + w\| =1\cdot\sqrt{|a|^2 + 2 \re \langle av, w\rangle + \|w\|^2 }=\sqrt{|a|^2 + \|w\|^2\, }. $$ Thus $|v^*(av + w)|^2 \le |a|^2 + \|w\|^2.$ \gap Here is another way to express $|v^*(av + w)|^2\colon$ $$ |v^*(av + w)|^2 = |v^*(av) + v^*(w)|^2 =|a|^2|v^*(v)|^2 + 2\re\bar a\,\overline{v^*(v)}\,v^*(w) + |v^*(w)|^2. $$ Therefore, $$ |a|^2|v^*(v)|^2 + 2\re\bar a\,\overline{v^*(v)}\,v^*(w) + |v^*(w)|^2 \le |a|^2 + \|w\|^2. $$ Now we let $a = re^{i\fie}$ where $r$ is an arbitrary real number -- it doesn't have to be non-negative -- and we choose $\fie,$ also real, so that $\bar a\,\overline{v^*(v)}\,v^*(w) = r|v^*(v)|\,|v^*(w)|.$ After we substitute the formula $a = re^{i\fie}$ into the last inequality and do some rearranging, we find that, for all real $r,$ $$ 0 \le r^2(1 - |v^*(v)|^2) - 2 r|v^*(v)|\,|v^*(w)| + (\|w\|^2 - |v^*(w)|^2). 
$$ The discriminant of this non-negative quadratic must therefore be non-positive; that is, $$ |v^*(v)|^2|v^*(w)|^2 \le (1 - |v^*(v)|^2) (\|w\|^2 - |v^*(w)|^2). $$ We add $(1 - |v^*(v)|^2)|v^*(w)|^2$ to both sides of this inequality and take square roots to complete the proof. \gap \dft{27}{\bf Corollary: } {\sl If $v^*\in V^*,\ v\in V,\quad \|v^*\|_*=1=\|v\|,$ and $0\le v^*(v),$ then $\|Ev-v^*\|_*\le\del+\del^2,$ where $\del$ is non-negative and $\del^2:=1-v^*(v)^2.$} \gap \Pf For any $u\in V,$ $(Ev - v^*)(u) = \langle u,v\rangle - v^*(u - \langle u,v\rangle v) - \langle u,v\rangle v^*(v).$ Now $w := u - \langle u,v\rangle v \perp v,$ so $\|w\| \le \|u\|,$ by Pythagoras' Theorem. Thus by the Lemma, and because $0\le 1-v^*(v)\le 1-v^*(v)^2=\del^2,$ $$| (Ev - v^*)(u) | \le | \langle u,v\rangle |(1 - v^*(v) ) + \del\|w\| \le \|u\|(\del^2 + \del). $$ This completes the proof (we actually got the smaller but uglier estimate $\|Ev-v^*\|_*\le\del+(1-v^*(v))\ ).$ \gap We can now quickly complete the proof of the Theorem. We had to show that as $n \to \infty,$ $Ev_n \to v^*$ in the norm of $V^*.$ We set $\del_n:=\sqrt{1-|v^*(v_n)|^2\,}.$ By the Corollary, $\|Ev_n - v^*\|_*\le \del_n(1 + \del_n)\to 0$ as $n \to\infty.$ We're done! \gap {\bf Consequences of the Theorem, and what preceded it} \gap 1. $\{Ev_n\}$ is a Cauchy sequence in $V^*$ because it converges. By the isometric property of $E,$ $\{v_n\}$ is Cauchy in $V,$ so if $V$ is already complete, and $v := \lim_{n\to\infty} v_n,$ then $v^* = Ev.$ Thus, if $V$ is a Hilbert space, $E$ is onto as well as one-to-one. This gives us the \gap \dft{28}{\bf Riesz Representation Theorem: } {\sl Let $H$ be a Hilbert space. If $\lam(x)$ is a continuous linear functional on $H,$ then there exists a unique $y\in H$ such that $\lam(x) = \langle x,y\rangle,$ and $\|\lam\|_* = \|y\|.$} That is, there is an isometric one-to-one correspondence between $H$ and $H^*.$ \gap 2. If $E$ is onto, the isometric property shows that, because $V^*$ is complete, so is $V.$ \gap 3. 
The inner product can be ``exported'' to $V^*.$ For $v^*,\ w^*$ in $V^*,$ let $$ \langle v^*,w^*\rangle^* :=\lim_{n\to\infty} {1\over 4}\sum_{k=0}^3 i^k\|w_n+i^kv_n\|^2 =\lim_{n\to\infty} \langle w_n,v_n\rangle , $$ where $Ev_n\to v^*,\ Ew_n\to w^*.$ \gap {\bf To show $\langle v^*,w^*\rangle^*$ is well-defined} \gap Suppose $E\tilde v_n\to v^*,\ \ E\tilde w_n\to w^*.$ Then $$ \|\tilde w_n+i^k\tilde v_n\|^2 =\|\tilde w_n - w_n + w_n +i^k\tilde v_n\|^2 =\|\tilde w_n - w_n\|^2+ 2\re \langle \tilde w_n - w_n,\,w_n + i^k\tilde v_n\rangle + \|w_n + i^k\tilde v_n\|^2. $$ The first two terms tend to zero. We repeat the calculation to replace $\tilde v_n$ by $v_n.$ Each sequence is Cauchy in $V$ because the mapping $E$ is isometric. \gap {\bf To show: the well-defined quantity $\langle v^*,w^*\rangle^*$ is an inner product} \gap It is immediate that $\langle v^*,w^*\rangle^*$ is additive in each argument. Conjugate symmetry and the properties of $\langle v^*,v^*\rangle^*$ are also immediate. Since $Ev_n\to v^*,$ for any scalar $a,\ \ E(\bar a v_n)\to av^*,$ so $$ \langle av^*,w^*\rangle^* =\lim_{n\to\infty} \langle w_n,\bar a v_n\rangle = a\langle v^*,w^*\rangle^*. $$ A similar argument shows $\langle v^*,aw^*\rangle^* = \bar a \langle v^*,w^*\rangle^*.$ The norm given by this inner product: $$ \|v^*\|^2=\lim_{n\to\infty}\langle v_n,v_n\rangle =\lim_{n\to\infty}Ev_n(v_n) =\lim_{n\to\infty}\|Ev_n\|^2_* $$ agrees with the standard norm $\|v^*\|_*$ in $V^*,$ and this completes the proof that $\langle v^*,w^*\rangle^*$ is an inner product on $V^*.$ We have shown: \gap {\sl If $V$ is an inner product space, then $V^*$ is a Hilbert space, that is homeomorphic to the completion of $V$ with respect to its norm. 
Moreover, the linear-functional norm on $V^*$ coincides with its Hilbert-space norm.\/} \gap A further note: if $V$ is a Hilbert space, we may use $\langle E^{-1}v^*, E^{-1}w^*\rangle $ in place of the limits used in the definition of the inner product on $V^*.$ \gap \S 3 {\bf Hilbert space isomorphism} \gap This section is devoted to a discussion of the question: ``When are two Hilbert spaces indistinguishable \sla{as} Hilbert spaces?''. This means that there is a one-to-one linear mapping of one of them onto the other that preserves inner products. It is a technical section, included for completeness. \gap Here we take up the question of when two Hilbert spaces are isomorphic in a way that preserves ``Hilbert space structure.'' The answer depends on {\bf the} cardinal number of an orthonormal basis. \gap \dft{29}{\bf Theorem: } {\sl Two orthonormal bases in a Hilbert space have the same cardinal number.\/} \gap \Pf Let ${\cal U},\ {\cal V}$ be orthonormal bases for a Hilbert space $H.$ If one is finite so is the other and they have the same number of elements, by the replacement theorem from linear algebra. Otherwise, without loss of generality we may assume $\app{card }{\cal V} \le \app{card } {\cal U}.$ For each $v\in {\cal V},$ let $U(v) = \{u\in {\cal U}: \langle u,v\rangle \ne 0\}.$ Each $U(v)$ is nonempty, countable, and $\bigcup_{v\in {\cal V}}U(v) = {\cal U}.$ In particular, if ${\cal V}$ is countable, so is ${\cal U}.$ If not, the cardinal number of the union is at most $\app{card } {\cal V}.$ Hence $\app{card } {\cal V} \ge \app{card } {\cal U},$ so $\app{card } {\cal V} = \app{card } {\cal U},$ as desired. 
\gap \dft{30}{\bf Definition: } {\sl The common cardinal number of the orthonormal bases of a Hilbert space $H$ is called the {\it Hilbert space dimension\/} of $H.$\/} \gap \small{I don't know how common this term is...} \gap Two Hilbert spaces are isomorphic as Hilbert spaces if there is a one-to-one correspondence between them that preserves inner products. It is straightforward to show that such correspondences are linear and continuous. They are thus ``operators,'' and these special operators are called {\it unitary operators.\/} \gap \dft{31}{\bf Theorem: } {\sl Hilbert spaces $H_1$ and $H_2$ are isomorphic as Hilbert spaces \iffi they have the same Hilbert space dimension.\/} \gap \Pf Let ${\cal O}_1,\ {\cal O}_2$ be orthonormal bases in $H_1,\ H_2$ respectively. If the Hilbert space dimensions are the same, let $\lam$ be a one-to-one correspondence between ${\cal O}_1$ and ${\cal O}_2 .$ Then $Ux := \sum_{y_1\in {\cal O}_1}\langle x,y_1\rangle \lam(y_1) $ is a unitary isomorphism. This is an application of previous theorems. Now suppose $ U: H_1 \to H_2$ is a unitary isomorphism. Then $U({\cal O}_1)$ is an orthonormal set in $H_2.$ Since $$ \overline{\app{span} U({\cal O}_1)} = \overline{U(\app{span} {\cal O}_1)} = U\left(\,\overline{\app{span} {\cal O}_1}\,\right) = H_2, $$ $U({\cal O}_1)$ is maximal, so $\app{dim }H_2 = \app{card } U({\cal O}_1) = \app{dim }H_1.$ \gap \dft{32}{\bf Theorem: } {\sl A Hilbert space $H$ is separable \iffi it has a countable orthonormal basis.\/} \gap \Pf If $H$ has a countable orthonormal basis then $H\simeq\ell^2,$ which is separable. 
\gap If $H$ is separable and ${\cal U}$ is an orthonormal basis of $H$ then there is a countable dense subset $\{y_k\}_{k=1}^\infty$ of $H.$ For each element $u\in{\cal U}$ there is some positive integer $k(u)$ \st $\|u-y_{k(u)}\|<1/2.$ If ${\cal U}$ were uncountable there would exist $u_1\ne u_2$ in ${\cal U}$ \st $k(u_1)=k(u_2)=:K.$ But then $2=\|u_1-u_2\|^2\le(\|u_1-y_K\|+\|y_K-u_2\|)^2<1.$ This contradiction shows that ${\cal U}$ is countable. \gap \S 4 {\bf Deferred proofs} \gap \dft{33}{\bf Item 1 }The following appeared in the proof that $E(V)$ is dense in $V^*.$ \gap We assumed that $\|v^*\|_* = 1.$ We want to show that there exists a sequence $\{v_n\}$ of elements of $V,$ with $\|v_n\| = 1$ for each $n,$ such that $0 \le v^*(v_n) \to 1,$ as $n \to\infty,$ with all $v^*(v_n)\le 1.$ \gap $\|v^*\|_* = 1$ means that there exist vectors $\tilde v_n\ne 0$ such that $\|\tilde v_n\|\le 1$ and $|v^*(\tilde v_n)|\to 1.$ We set $v_n:=e^{i\theta_n}\tilde v_n/\|\tilde v_n\|,$ where the numbers $\theta_n$ will be chosen in a moment. Then $\|v_n\|=1$ and, since $\|\tilde v_n\|\le 1,$ we have $$ 1\ge|v^*(v_n)|=\left|v^*({\tilde v_n\over \|\tilde v_n\|})\right| ={1\over \|\tilde v_n\|}|v^*(\tilde v_n)|\ge |v^*(\tilde v_n)|\to 1, $$ so $|v^*(v_n)|\to 1,$ by the Squeeze Principle. We now choose $\theta_n$ so that $v^*(e^{i\theta_n}\tilde v_n)=e^{i\theta_n}v^*(\tilde v_n)=|v^*(\tilde v_n)|.$ When we divide by $\|\tilde v_n\|$ we get what we wanted: $v^*(v_n)=|v^*(v_n)| \to 1.$ \gap \dft{34}{\bf Item 2 }An orthonormal basis is not a basis in the usual sense, unless it is finite. This is a consequence of completeness. \gap Suppose not, namely, we have an infinite orthonormal basis that is a basis in the usual sense. \gap We may select a denumerable set $\{y_n\}_{n=1}^\infty$ of members of the orthonormal basis. Then the following series (i.e. sequence of partial sums) converges in the Hilbert space to a non-zero vector $x\colon$ $$ \sum_{n=1}^\infty {y_n\over n^2}. 
$$ Proof that this is so is left to the reader. It involves straightforward checking that the definition of ``Cauchy sequence'' is satisfied by the partial sums. The limiting vector $x$ is non-zero because $\langle x,y_1\rangle=1.$ \gap Since our o.n. basis is a linear-algebra basis, we can also write $x=\sum_{y\in{\cal F}} c_y\,y,$ where ${\cal F}$ is a finite subset of our o.n. basis. Therefore $$ 0=\sum_{n=1}^\infty {y_n\over n^2}-\sum_{y\in{\cal F}} c_y\,y. $$ But ${\cal F}$ is finite, so for all $k$ sufficiently large we have to have $$ 0=\left\langle\sum_{n=1}^\infty {y_n\over n^2} -\sum_{y\in{\cal F}} c_y\,y,\,y_k\right\rangle =\left\langle\sum_{n=1}^\infty {y_n\over n^2} ,\,y_k\right\rangle=1/k^2, $$ which is a contradiction. \gap {\bf Remark }Completeness was really used in the last argument! Here is an example of a normed space with a countable basis in the linear-algebra sense. Let $V$ be the collection of all polynomials in $d$ real variables, with real coefficients. This means that a typical element of $V$ has the form $$ P(x)=\sum_{\alpha\ge0} p_\alpha x^\alpha, $$ where only finitely many of the coefficients $p_\alpha$ are non-zero, the quantities $\alpha$ are ``multi-indices'' belonging to $\nat^d,$ the collection of all $d\app{-tuples}$ of non-negative integers, and $x^\alpha:=x_1^{\alpha_1}\cdots x_d^{\alpha_d}.$ For example, $P(x):=|x|^2=\sum_{k=1}^d x^{2e_k}.$ \gap We define the norm of $P$ by $\|P\|^2:=\sum_{\alpha\ge0} p_\alpha^2.$ It can be shown (using polarization) that this norm is given by an inner product. Now the set $\{x^\alpha: \alpha\in \nat^d\}$ is a basis for $V$ in the sense of linear algebra. Of course, $V$ is not a Hilbert space. 
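\gap {\bf Example } Here is a sketch of why this $V$ fails to be complete. The particular sequence $P_n$ below is our own illustrative choice, not part of the argument above. Let $$ P_n(x):=\sum_{k=1}^n {1\over k}\,x^{ke_1}=\sum_{k=1}^n {x_1^k\over k}. $$ For $m<n,$ $$ \|P_n-P_m\|^2=\sum_{k=m+1}^n {1\over k^2}\le\sum_{k>m} {1\over k^2}\to 0\quad\hbox{as } m\to\infty, $$ so $\{P_n\}$ is a Cauchy sequence in $V.$ Suppose that some $P=\sum_{\alpha\ge0} p_\alpha x^\alpha$ in $V$ were its limit. Only finitely many of the coefficients $p_\alpha$ are non-zero, so there is some $k$ with $p_{ke_1}=0.$ For every $n\ge k,$ the coefficient of $x^{ke_1}$ in $P_n-P$ is $1/k,$ so $\|P_n-P\|^2\ge 1/k^2,$ which does not tend to zero. Hence $\{P_n\}$ has no limit in $V.$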
\gap \dft{35}{\bf Item 3 }We are to show that \fA $v\in H$ (and we will assume $v\ne0)$ $$ \|v\|^2 = \sum_{y\in {\cal O}}|\langle v,y\rangle |^2 + \inf_{w\in \app{\small{ span}}\,{\cal O}}\|v-w\|^2 =\sum_{y\in {\cal O}}|\langle v,y\rangle |^2 + d^2, $$ where $d^2$ denotes the square of the distance from $v$ to $\overline{\app{ span}\,{\cal O}} .$ \gap First, we know that the projection operator $P_o$ for $\overline{\app{ span}\,{\cal O}}$ is defined and continuous. We are given some $v\in H.$ Thus we know that $$ d^2=\inf_{w\in \app{\small{ span}}\,{\cal O}}\|v-w\|^2=\|v-P_ov\|^2. $$ For the given $v,$ we let $NZ:=\{y\in{\cal O}:\langle v,y\rangle\ne0\}.$ Then $NZ$ is countable, so we can enumerate the elements in $NZ,$ putting them into a sequence $\{y_k\}_{k=1}^\infty.$ Let us define $v_o:=\sum_{k=1}^\infty\langle v,y_k\rangle y_k.$ To show that this definition makes sense, we set $v_n:=\sum_{k=1}^n\langle v,y_k\rangle y_k$ and proceed as we did in the proof of the Theorem of Fischer and Riesz, to show that the sequence $\{v_n\}$ is Cauchy. We then set $v_o$ equal to the limit. In particular, we have $\|v_n-v_o\|\to 0.$ Now suppose that $y\in{\cal O}.$ Then $$ \langle v_o,y\rangle =\lim_{n\to\infty}\langle v_n,y\rangle =\lim_{n\to\infty} \sum_{k=1}^n\langle\langle v,y_k\rangle y_k,y\rangle =\left\{ \matrix{\langle v,y\rangle,\put{if}y\in NZ\cr \ \ 0,\quad\put{if}y\notin NZ.} \right. $$ Therefore \fA $y\in{\cal O},$ we have $\langle v-v_o,y\rangle=0.$ The same is true when $y$ is replaced by any element of $\app{ span}\,{\cal O}.$ Now let us suppose that $w\in \overline{\app{ span}\,{\cal O}}.$ Then there is a sequence $\{w_k\}$ of elements of $\app{ span}\,{\cal O}$ \st $w_k\to w.$ This gives us $$ \langle v-v_o,w\rangle=\lim_{k\to\infty}\langle v-v_o,w_k\rangle=0. 
$$ That is, $v-v_o\perp w$ \fA $w\in \overline{\app{ span}\,{\cal O}}.$ By the uniqueness of the projection, $v_o=P_ov.$ Therefore $$ \|v\|^2=\|v-v_o\|^2+\|v_o\|^2=\|v-P_ov\|^2 +\sum_{k=1}^\infty|\langle v,y_k\rangle |^2 =d^2+\sum_{y\in {\cal O}}|\langle v,y\rangle |^2, $$ as desired. \gap \dft{36}{\bf Item 4 }Proof that the ``sum'' $\sum_{y\in {\cal O}}c_yy,$ where ${\cal O}$ is an orthonormal set in a Hilbert space, and $\sum_{y\in {\cal O}}|c_y|^2 $ is finite, is independent of the order of the terms. \gap As in Item 3 and as in the proof of the Theorem of Fischer and Riesz, for every enumeration of the non-zero coefficients $c_y,$ we have a well-defined element of $H$ given by a Cauchy sequence. Let us choose one enumeration as the starting one. Then every other enumeration is a rearrangement of the chosen one. Let us distinguish them by the name of the mapping $\pi:\ints^+\to\ints^+,$ one-to-one and onto, that accomplishes the rearrangement. Thus we let $c_k$ denote the coefficients of the starting element, $x_o:=\sum_{k=1}^\infty c_k\,y_k,$ and we let $x_\pi:=\sum_{n=1}^\infty c_{\pi n}\,y_{\pi n}.$ We want to show that $x_\pi=x_o$ no matter which $\pi$ is used. We can do this by showing that, \fA $\ep>0,$ $\|x_\pi-x_o\|<\ep.$ We may choose $K$ so large that $$ \sum_{k>K}|c_k|^2<\ep^2/9. $$ We can then be sure that there is $N$ so large that for each $k\le K,$ it is true that $k\in\{\pi 1,\,\dots,\,\pi N\}.$ Then $$ x_o-x_\pi=\sum_{k=1}^K c_k\,y_k+R_{o,K} -\sum_{n=1}^N c_{\pi n}\,y_{\pi n}-R_{\pi,N}, $$ where the terms with $R$ denote the ``tails'' of the corresponding series. All the terms in the very first sum are cancelled by terms in the first ``negated'' sum. We can thus write $$ x_o-x_\pi=R_{o,K} -\sum_{n=1}^N [\pi n>K]c_{\pi n}\,y_{\pi n}-R_{\pi,N}. 
$$ Thus $\|x_o-x_\pi\|\le\|R_{o,K}\| +\|\sum_{n=1}^N [\pi n>K]c_{\pi n}\,y_{\pi n}\| +\|R_{\pi,N}\|.$ By construction, $\|R_{o,K}\|<\ep/3.$ Since we have made no use at all of rearrangement invariance, we can use Parseval's relation on the Hilbert space $\overline{\app{ span}\,{\cal O}}.$ Thus $$ \left\|\sum_{n=1}^N [\pi n>K]c_{\pi n}\,y_{\pi n}\right\|^2 =\sum_{n=1}^N [\pi n>K]|c_{\pi n}|^2 \le\sum_{k>K}|c_k|^2<\ep^2/9 $$ and (similarly) $\|R_{\pi,N}\|^2<\ep^2/9.$ Thus $\|x_\pi-x_o\|<\ep.$ Since $\ep>0$ was arbitrary, it follows that $x_\pi=x_o,$ which is what we had to show. \gap \gap {\bf References} \gap [1] J. L. Kelley, {\it General Topology,\/} \ D. Van Nostrand, 1955. \gap [2] K. Yosida, {\it Functional Analysis,\/} \ Springer Verlag, 1965.