Wald GR, Chapter 3 Notes

2014-01-17 physics relativity notes

3.0: Curvature

  • Extrinsic curvature of a manifold is a notion of curvature relying on embedding the manifold in a higher-dimensional space. It will be discussed in chapter 9.

  • Intrinsic curvature of a manifold, with no reference to a higher-dimensional space, can be defined in terms of parallel transport.

  • Parallel transport is roughly “to keep a tangent vector pointing in the same direction” as it is moved around the manifold (across the various different tangent spaces). If a vector can be parallel-transported around a closed loop on the manifold, and is not the same as when it started, then the manifold is curved. This will never happen on a plane, but it can happen on a sphere.

  • There is a comment about a geodesic being a curve “whose tangent is parallel-transported along itself”. I don’t see what this means right now. Then, a manifold is curved if and only if some initially-parallel geodesics fail to remain parallel. This is a violation of Euclid’s fifth postulate.

  • With only a bare manifold structure, there is no notion of parallel transport from the tangent space $V_p$ at $p$ to the tangent space $V_q$ at $q$. We need to know how to take derivatives of vector fields, so the vector parallel-transported from $V_p$ to $V_{p+\delta p}$ can be computed as a linear approximation. Also, we can say that a vector is being parallel-transported if its derivative along a curve is zero.

  • Vector taking a new value after parallel transport around a closed loop corresponds to a failure of derivatives to commute. Curvature can be encoded as some sort of commutator of derivatives.

  • A metric will be shown to provide the structure necessary for parallel transport in section 3.1.

3.1: Derivative Operators and Parallel Transport

  • A [covariant] derivative operator $\nabla$, on manifold $M$, takes a smooth tensor of type $(k,l)$ and produces a smooth tensor of type $(k,l+1)$ satisfying the conditions

    • linearity: $\nabla(\alpha A+\beta B)=\alpha\nabla A+\beta\nabla B$.

    • Leibniz: $\nabla(AB)=A(\nabla B)+(\nabla A)B$.

    • commutativity with contraction: $\nabla_d ({A^{a_1\cdots c\cdots a_k}}_{b_1\cdots c\cdots b_l})={(\nabla_d A)^{a_1\cdots c\cdots a_k}}_{b_1\cdots c\cdots b_l}$.

    • consistent with tangent vectors as directional derivatives on scalar fields: for $f\in{\cal F}$ and $v\in V_p$, $v(f)=v^a\nabla_a f$.

    • torsion free: for $f\in{\cal F}$, $\nabla_a\nabla_b f=\nabla_b\nabla_a f$.

    Recall the abstract index notation of chapter 2: the latin indices indicate the rank of the tensor but do not represent components in any particular basis. Why does the derivative operator bump up the rank of the tensor? Because it adds its own covariant (lower) index into the mix.

  • The commutator of two vector fields $v,w$ on $M$, with respect to derivative operator $\nabla$, can be shown (in the same way as exercise 2.4a) to be $$[v,w]^a=v^b\nabla_b w^a-w^b\nabla_b v^a.$$

  • Does such a derivative operator actually exist? Yes, given a coordinate system at $p$, $\partial_\mu$, defined as component-wise differentiation (with respect to $x^\mu$) of all the target tensor’s components, satisfies the axioms. However, because this derivative operator is coordinate-dependent, it is not a fundamental property of the manifold.

  • There is an involved discussion of uniqueness of derivative operators. The upshot is that if $\nabla'$ and $\nabla$ are derivative operators at $p\in M$, then $\nabla'_a-\nabla_a$ defines a tensor of type $(1,2)$ at $p$. More explicitly, there is a tensor field ${C^c}_{ab}$ with $$\nabla_a\omega_b=\nabla'_a\omega_b-{C^c}_{ab}\omega_c.$$

  • Plugging in the dual vector $\omega_b=\nabla_b f=\nabla'_b f$ (they are equal because they must agree on scalar fields) to the above gives $$\nabla_a\nabla_b f=\nabla'_a\nabla'_b f-{C^c}_{ab}\nabla_c f$$ so that ${C^c}_{ab}={C^c}_{ba}$ because the derivative operators are torsion free.

  • Another not-so-obvious argument shows that the difference between the derivative operators, when acting on arbitrary tensors, is also characterized by just the tensor ${C^c}_{ab}$, as in Eq. 3.1.14 (there are more terms than in the case written out above). Hence this tensor captures all of the freedom in defining derivative operators.

  • At point $p\in M$, a generic derivative operator looks like (e.g. when acting on a vector) $$\nabla_a v^b=\partial_a v^b+{\Gamma^b}_{ac}v^c.$$ The tensor $C$ is written as $\Gamma$ and called the Christoffel symbol in this case where the differences are with respect to the ordinary derivative.

  • Given a derivative operator $\nabla_a$, and given a curve with tangent vector field $t^a$, a vector $v$ is parallelly-transported along the curve if $$t^a\nabla_a v^b=0.$$ Note that this is for a given choice of $\nabla$. If, further, coordinates are chosen, then this is expressed as $$t^a\partial_a v^b+t^a{\Gamma^b}_{ac}v^c=0.$$ Given an initial vector $v$, this differential equation can be solved uniquely to determine the parallel-transported vector at each point on the curve. A curve connecting points $p,q\in M$, and a choice of derivative operator, then provides a way of interpreting a “parallel transport” of vector $v_p\in V_p$ to $v_q\in V_q$. This is a connection (between the different tangent spaces).

  • Now, a major thing: if we are given a metric (in addition to a bare manifold structure), then the condition that [the inner product $g(v,w)$ of two parallel-transported vectors] remains the same gives us a unique choice of derivative operator. This is the condition that $$t^a\nabla_a\left(g_{bc}v^b w^c\right)=0$$ where $t^a$ is a tangent vector along some curve. Because $v$ and $w$ are parallelly transported, the directional derivatives acting on them vanish. This leaves us with $$t^a v^b w^c\nabla_a g_{bc}=0$$ for all curves $t$ and all vectors $v,w$. The generality allows us to say $\nabla_a g_{bc}=0$.

  • Theorem 3.1.1 states that the condition $\nabla_a g_{bc}=0$ uniquely determines a derivative operator $\nabla_a$. The connection tensor $C$ is solved explicitly to give $${C^c}_{ab}=\frac{1}{2} g^{cd}\left[\tilde\nabla_a g_{bd}+\tilde\nabla_b g_{ad}-\tilde\nabla_d g_{ab}\right]$$ where $\tilde\nabla$ is any choice of derivative operator. With the $\tilde\nabla$ chosen to be the ordinary derivative $\partial$, the Christoffel symbol is given by $${\Gamma^c}_{ab}=\frac{1}{2} g^{cd}\left[\partial_a g_{bd}+\partial_b g_{ad}-\partial_d g_{ab}\right].$$
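
  • A minimal numerical sketch of the parallel-transport equation $t^a\partial_a v^b+t^a{\Gamma^b}_{ac}v^c=0$, assuming the unit 2-sphere with metric $d\theta^2+\sin^2\theta\,d\phi^2$ (this example is not from the text). The Christoffel formula above gives ${\Gamma^\theta}_{\phi\phi}=-\sin\theta\cos\theta$ and ${\Gamma^\phi}_{\theta\phi}=\cot\theta$. The function names `transport_around_latitude` and `norm_squared` are made up for illustration, and the Runge-Kutta integrator is just one convenient choice.

```python
import math


def transport_around_latitude(theta0, v_theta, v_phi, steps=20000):
    """Parallel-transport (v^theta, v^phi) once around the circle theta = theta0.

    Assumes the unit sphere with metric d(theta)^2 + sin^2(theta) d(phi)^2 and
    the curve phi = t for t in [0, 2*pi], so the tangent is t^a = (0, 1).
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        vt, vp = v
        # dv^b/dt = -Gamma^b_{phi c} v^c, with
        # Gamma^theta_{phi phi} = -sin*cos and Gamma^phi_{phi theta} = cos/sin.
        return (s * c * vp, -(c / s) * vt)

    h = 2.0 * math.pi / steps
    v = (v_theta, v_phi)
    for _ in range(steps):  # classical fourth-order Runge-Kutta step
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v


def norm_squared(theta0, v):
    """g_ab v^a v^b at colatitude theta0 on the unit sphere."""
    return v[0] ** 2 + math.sin(theta0) ** 2 * v[1] ** 2
```

    In orthonormal components the equation is a rotation at rate $\cos\theta_0$, so after one loop the vector has turned by $2\pi\cos\theta_0$. Calling `transport_around_latitude(math.pi / 3, 1.0, 0.0)` should therefore return the starting vector reversed (up to integration error), while the same call at the equator, `math.pi / 2`, should return it unchanged. The norm computed by `norm_squared` should stay fixed in both cases, as metric compatibility requires.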

3.2: Curvature

  • The aim of this section is to use parallel transport of a vector, around a loop, to define/quantify intrinsic curvature of a manifold. The idea is that if a vector can be parallel-transported around any closed curve and end up the same as it started, then there is no curvature.

  • The Riemann curvature tensor ${R_{abc}}^d$ is defined by its action on a test dual vector $\omega$: $${R_{abc}}^d\omega_d=\nabla_a\nabla_b\omega_c-\nabla_b\nabla_a\omega_c.$$ This seems like a hard definition to use. There is an argument to show that $R$ (i.e. the right hand side) is indeed a tensor of type $(1,3)$. I don’t really follow the argument (it’s a rehash of an argument from section 3.1, top of page 33), but I will leave it alone for now.

  • “${R_{abc}}^d$ is directly related to the failure of a vector to return to its initial value when parallel transported around a small closed curve.” A Taylor expansion argument then computes the change in the vector going around a small closed curve. The change in the vector vanishes to first order, but is proportional to $R$ in its leading order term (quadratic in the dimensions of the loop).

  • The action of a commutator of derivatives on an arbitrary tensor field can always be found as a sum of terms involving $R$.

  • Some properties of the Riemann curvature tensor $R$:

    1. ${R_{abc}}^d=-{R_{bac}}^d$.

    2. ${R_{[abc]}}^d=0$ where the notation refers to the totally anti-symmetric sum over the tensor’s indices (cf. section 2.4).

    3. For the special derivative operator $\nabla_a$ with $\nabla_a g_{bc}=0$, $R_{abcd}=-R_{abdc}$.

    4. Bianchi identity: $\nabla_{[a}{R_{bc]d}}^e=0$.

    These are discussed/proved in the text. It’s probably worthwhile to study those proofs, but I am skipping for now.

  • “By the antisymmetry properties (1) and (3), the trace of the Riemann tensor over its first two or last two indices vanishes.” This isn’t clear to me. For instance, what does it mean to trace over the first two indices? ${R_{aac}}^d$ looks wrong. Does one compute things like that? However, supposing it’s reasonable to compute that sum over $a$, then I should still figure out why that sum is said to vanish.

  • The Ricci tensor is defined as the trace over the second and fourth indices (or, equivalently, the first and third indices) of the Riemann tensor: $$R_{ac}={R_{abc}}^b.$$ It is symmetric: $R_{ac}=R_{ca}$.

  • The scalar curvature is defined as the trace of the Ricci tensor: $R={R_a}^a$. Now this gives some sense to the “trace over the first two indices” of the Riemann tensor that I was wondering about just above here.

  • What is the point of taking these traces? Trace the Riemann tensor to get the Ricci tensor. Trace the Ricci tensor to get the scalar curvature. The book mentions wanting to decompose the Riemann tensor into traceless and traceful parts, but, beyond that, there is probably some fundamental physical content to the trace. What is it?

  • The traceless part of the Riemann tensor is the Weyl tensor or conformal tensor, $C_{abcd}$. Given by eq. 3.2.28. Some more comments are made about this tensor and its symmetry properties, but I will ignore these until I understand the usefulness of this traceless/traceful decomposition.

  • “Contraction of the Bianchi identity”: $$0=\nabla_a{R_{bcd}}^a+\nabla_b R_{cd}-\nabla_c R_{bd}.$$ I think this is setting the index $e$ of the Bianchi identity equal to $a$, and summing over $a$. When the contraction takes place over $R$, the Riemann tensor simplifies to the Ricci tensor. Multiplying by $g^{db}$ gives $$0=\nabla_a{R_c}^a+\nabla_b{R_c}^b-\nabla_c R.$$ Defining the Einstein tensor $G$ by $$G_{ab}=R_{ab}-\frac{1}{2}Rg_{ab},$$ we have $\nabla^a G_{ab}=0$.
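
  • A sketch of the last step from the twice-contracted Bianchi identity to the divergence of the Einstein tensor (the bookkeeping here is an assumption about how the step goes, not quoted from the text). The first two terms of the twice-contracted identity differ only in the name of the summed index, so $$0=2\nabla_a{R_c}^a-\nabla_c R\quad\Longrightarrow\quad\nabla^a R_{ac}-\frac{1}{2}\nabla_c R=0,$$ using the symmetry $R_{ac}=R_{ca}$. Because $\nabla_a g_{bc}=0$, the metric passes through the derivative, $\nabla_c R=\nabla^a\left(g_{ac}R\right)$, and therefore $$\nabla^a\left(R_{ac}-\frac{1}{2}Rg_{ac}\right)=\nabla^a G_{ac}=0.$$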

3.3: Geodesics

  • A geodesic is a curve which “curves as little as possible”, meaning that its tangent vector is parallel-transported along the curve itself. If the tangent vector of the curve is $T^a$, this is the condition that $T^a\nabla_a T^b=0$; there is freedom in choosing the derivative operator $\nabla_a$.

  • The condition is sometimes relaxed to $T^a\nabla_a T^b=\alpha T^b$, but (it is claimed that) such a curve can always be parametrized to satisfy the stronger condition. Such a parametrization is called an affine parametrization.

  • What is my intuition about parallel transport, and a geodesic, supposed to be? I guess the parallel transport condition is that the vector’s component along the curve stays the same. I want to understand it in terms of a Taylor expansion, like $v(x+\delta x)-v(x)=\delta x\cdot\nabla v$. The actual condition (3.1.16) is $t^a\nabla_a v^b=0$ with $t^a$ the tangent to the curve. The vector curves as the curve curves – its orientation with respect to the curve stays the same. Wouldn’t this condition be $\frac{d}{dt} t^a v_a=0$ with $t$ the parameter along the curve?

  • Spoke with Dad: consider an arc on a line of latitude of a sphere (or whichever one runs west-east). These aren’t geodesics. From a “least distance” standpoint, we can understand this: $d\theta^2+\sin^2\,\theta d\phi^2$ incentivizes reducing $\sin\theta$ rather than going directly along the constant $\theta$ path. What about not being a geodesic in the parallel transport sense? The tangent vector is always $e_\phi$; the tangent plane might be changing its $\theta$-orientation? I can’t really visualize it.

  • If coordinates are chosen such that the curve is $x^\mu(t)\in{\Bbb R}^n$, then the parallel-transport geodesic equation becomes $$\frac{d^2 x^\mu}{dt^2}+{\Gamma^\mu}_{\sigma\nu}\frac{dx^\sigma}{dt}\frac{dx^\nu}{dt}=0.$$ Given initial values $x^\mu$ and $\frac{dx^\mu}{dt}$ at $t=0$, this equation has a unique solution.

  • With $p\in M$ and a tangent vector $T^a\in V_p$, there exists a unique geodesic through $p$ with tangent $T^a$.

  • Define exponential map $V_p\to M$ by taking $T^a$ to the unique geodesic (described above) at parameter $t=1$. This mapping can be somewhat degenerate, but there are ways to keep things well behaved (choose a restricted domain). Because $V_p\cong{\Bbb R}^n$, this is a set of $n$ coordinates parametrizing $M$; Riemann normal coordinates at $p$.

  • In Riemann normal coordinates, the geodesics through $p$ are mapped to straight lines in ${\Bbb R}^n$. What is mapping the geodesics? I thought the normal coordinates mapped ${\Bbb R}^n$ to $M$ (i.e. parametrized the manifold).

  • In Riemann normal coordinates at $p$, the Christoffel symbol ${\Gamma^\mu}_{\sigma\nu}$ vanishes at $p$.

  • Discussion of Gaussian normal coordinates (also synchronous coordinates) where the derivative operator comes from the metric.

  • Geodesics of a derivative operator arising from a metric also extremize the length of the curves. Arbitrary geodesics (under the definition of Wald) do not?

  • With a Lorentzian metric, a curve is timelike where $g_{ab}T^a T^b<0$, spacelike where $>0$, and null where $=0$. The length of a spacelike curve is $$\ell=\int dt\sqrt{g_{ab}T^a T^b}$$ (a null curve has zero length). In the case of a timelike curve, we instead consider the “proper time” $$\tau=\int dt\sqrt{-g_{ab}T^a T^b}.$$ If a curve switches between spacelike and timelike, its length/proper time are not defined.

  • A geodesic [in a Lorentz manifold] cannot switch from timelike to spacelike or null (because its norm is constant). This isn’t obvious to me. Also, does this mean a geodesic always entirely falls under one of the three classifications?

  • Length of a curve is independent of parametrization.

  • Wald works out the condition for extremizing the length of a curve, and the result (3.3.13) is the same as the defining geodesic equation (3.3.5). “Thus, a curve extremizes the length between its endpoints if and only if it is a geodesic”.

  • The geodesic equation can be obtained by extremizing the lagrangian $L=g_{ab}T^a T^b$. What’s the physical content of this?

  • Often most convenient to compute the Christoffel symbol ${\Gamma^\mu}_{\sigma\nu}$ by starting with the lagrangian $L$, computing the Euler-Lagrange equations, and comparing with equation 3.3.5.

  • A curve of minimal length is always a geodesic (because it extremizes length), but a geodesic is not always a curve of minimal length. Recall computing the geodesics on a cylinder; there is a straight geodesic, and then another geodesic for any given winding number.

  • Ditto for timelike curves on a manifold with a Lorentzian metric, but “minimal length” swaps with “maximal proper time”.

  • Geodesic deviation equation (I am omitting the details of the derivation). Describes the spatial relations among a one-parameter family of geodesics on a manifold. “Some geodesics will accelerate towards or away from each other if and only if ${R_{abc}}^d\ne 0$.” I don’t understand what’s meant by geodesics moving or accelerating, so I will revisit this later.
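
  • A minimal numerical sketch of the coordinate geodesic equation above, again assuming the unit 2-sphere with ${\Gamma^\theta}_{\phi\phi}=-\sin\theta\cos\theta$ and ${\Gamma^\phi}_{\theta\phi}=\cot\theta$ (an illustrative example, not from the text). It starts on a line of latitude heading due east, which is the situation from the latitude discussion earlier in this section. The names `geodesic_on_sphere` and `speed_squared` are hypothetical helpers.

```python
import math


def geodesic_on_sphere(theta0, dtheta0, phi0, dphi0, t_final, steps=20000):
    """Integrate the geodesic equation on the unit sphere with RK4.

    The state is (theta, phi, dtheta/dt, dphi/dt); the list of all states
    visited is returned. Assumes the path stays away from the poles.
    """
    def rhs(y):
        th, ph, dth, dph = y
        # d^2 x^mu/dt^2 = -Gamma^mu_{sigma nu} (dx^sigma/dt)(dx^nu/dt)
        ddth = math.sin(th) * math.cos(th) * dph ** 2
        ddph = -2.0 * (math.cos(th) / math.sin(th)) * dth * dph
        return (dth, dph, ddth, ddph)

    def shifted(y, k, a):
        return tuple(yi + a * ki for yi, ki in zip(y, k))

    h = t_final / steps
    y = (theta0, phi0, dtheta0, dphi0)
    path = [y]
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(shifted(y, k1, 0.5 * h))
        k3 = rhs(shifted(y, k2, 0.5 * h))
        k4 = rhs(shifted(y, k3, h))
        y = tuple(yi + h * (a + 2 * b + 2 * c + d) / 6.0
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        path.append(y)
    return path


def speed_squared(y):
    """g_ab T^a T^b for a state on the unit sphere."""
    th, _, dth, dph = y
    return dth ** 2 + math.sin(th) ** 2 * dph ** 2
```

    Starting at $\theta=\pi/3$ with $\dot\theta=0$, $\dot\phi=1$, the solution should leave the latitude circle immediately, because $\ddot\theta=\sin\theta\cos\theta\,\dot\phi^2\neq 0$ there. It should trace the great circle that swings between $\theta=\pi/3$ and $\theta=2\pi/3$. The value of `speed_squared` should stay constant along the path, which is the “norm is constant” property of geodesics mentioned above.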

3.4: Methods for Computing Curvature

  • This section is about computing the Riemann curvature tensor ${R_{abc}}^d$, which was proven to exist in section 3.2.

3.4a: Coordinate Component Method

  • Choosing coordinates, we have that $\nabla_b\omega_c=\partial_b\omega_c-{\Gamma^d}_{bc}\omega_d$. The curvature tensor then turns out to be $${R_{abc}}^d=-2\partial_{[a}{\Gamma^d}_{b]c}+2{\Gamma^e}_{c[a}{\Gamma^d}_{b]e}.$$ For an explicit version in terms of the chosen coordinates, see equation 3.4.4.

  • Define $g=\det(g_{\mu\nu})$, the determinant of the metric components in the chosen coordinate basis. Its value depends on the coordinates, but the natural volume element on the manifold, for integration, $\sqrt{|g|}d^n x$, does not.

  • Identity ${\Gamma^a}_{a\mu}=\frac{\partial}{\partial x^\mu}\ln\sqrt{|g|}$. This is useful for computing the Ricci tensor $R_{ab}$ and for taking the divergence of arbitrary vector fields.

  • Calculations done by the coordinate component method are laborious and “non-geometrical”. So, while they are straightforward, they don’t foster insight and they are not particularly quick.
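
  • A quick sanity check of the contracted Christoffel identity above, assuming the unit 2-sphere with metric $d\theta^2+\sin^2\theta\,d\phi^2$ for $0<\theta<\pi$ (an illustrative example, not from the text). The nonzero Christoffel symbols are ${\Gamma^\theta}_{\phi\phi}=-\sin\theta\cos\theta$ and ${\Gamma^\phi}_{\theta\phi}={\Gamma^\phi}_{\phi\theta}=\cot\theta$, and $\sqrt{|g|}=\sin\theta$. Then $${\Gamma^a}_{a\theta}={\Gamma^\theta}_{\theta\theta}+{\Gamma^\phi}_{\phi\theta}=\cot\theta=\frac{\partial}{\partial\theta}\ln\sin\theta,\qquad{\Gamma^a}_{a\phi}={\Gamma^\theta}_{\theta\phi}+{\Gamma^\phi}_{\phi\phi}=0=\frac{\partial}{\partial\phi}\ln\sin\theta.$$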

3.4b: Orthonormal Basis (Tetrad) Methods

  • This method uses an orthonormal basis. A coordinate basis $\{\partial_\mu\}$ “is not orthonormal except for the trivial case of flat spacetime in Cartesian coordinates”. What? What about cylindrical coordinates? Spherical coordinates? Am I missing something here?

  • Choose a set of vectors $\{(e_\mu)^a\mid\mu=1,\ldots,n\}$ (the $\mu$ is an index in our collection while the $a$ indicates that each $e$ is a vector) such that $(e_\mu)^a(e_\nu)_a=\eta_{\mu\nu}$, with $\eta={\rm diag}(-1,\ldots,-1,1,\ldots,1)$. The basis $\{e_\mu\}$ is called a tetrad, especially in the context of four dimensions. Does $\eta$ necessarily have any relation to the metric?

  • We have $$\eta^{\mu\nu}(e_\mu)^a(e_\nu)_b={\delta^a}_b.$$

  • Three points regarding the tetrad method for computing curvature:

    • $\nabla_a g_{bc}=0$, the derivative operator is “compatible” with the metric.

    • The derivative operator is torsion free.

    • The Riemann tensor is related to the derivative operator by equation 3.2.3.

  • The connection 1-forms are $\omega_{a\mu\nu}=(e_\mu)^b\nabla_a(e_\nu)_b$. Their components $\omega_{\lambda\mu\nu}=(e_\lambda)^a\omega_{a\mu\nu}$ are the Ricci rotation coefficients.

  • By orthonormality of the $\{e_\mu\}$, we get $\omega_{a\mu\nu}=-\omega_{a\nu\mu}$. Not clear to me how this works.

  • The Christoffel symbol, on the other hand, is symmetric. It has $n^2(n+1)/2$ free components while the Ricci rotation coefficients have $n^2(n-1)/2$ (because some “diagonal” entries are forced to be zero).

  • The Riemann curvature tensor can be computed as $$R_{\rho\sigma\mu\nu}=(e_\rho)^a\nabla_a\omega_{\sigma\mu\nu}-(e_\sigma)^a\nabla_a\omega_{\rho\mu\nu}-\eta^{\alpha\beta}\left[\omega_{\rho\beta\mu}\omega_{\sigma\alpha\nu}-\omega_{\sigma\beta\mu}\omega_{\rho\alpha\nu}+\omega_{\rho\beta\sigma}\omega_{\alpha\mu\nu}-\omega_{\sigma\beta\rho}\omega_{\alpha\mu\nu}\right].$$

  • The Ricci tensor can be computed as $$R_{\rho\mu}=\eta^{\sigma\nu}R_{\rho\sigma\mu\nu}.$$

  • The torsion-free condition gives the conditions $$(e_\sigma)_a[e_\mu,e_\nu]^a=\omega_{\mu\sigma\nu}-\omega_{\nu\sigma\mu}$$ or, alternately, $$\partial_{[a}(e_\sigma)_{b]}=\eta^{\mu\nu}(e_\mu)_{[a}\omega_{b]\sigma\nu}.$$

  • There is a nice form of the torsion-free condition in terms of differential forms. I will skip for now.

  • Summary of applying the tetrad method on page 52.
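
  • A sketch of why orthonormality makes the connection 1-forms antisymmetric in their last two indices, assuming the definition $\omega_{a\mu\nu}=(e_\mu)^b\nabla_a(e_\nu)_b$ and metric compatibility $\nabla_a g_{bc}=0$ (so indices can be raised and lowered through $\nabla_a$). The $\eta_{\mu\nu}$ are constants, so differentiating the orthonormality condition gives $$0=\nabla_a\eta_{\mu\nu}=\nabla_a\left[(e_\mu)^b(e_\nu)_b\right]=(e_\nu)^b\nabla_a(e_\mu)_b+(e_\mu)^b\nabla_a(e_\nu)_b=\omega_{a\nu\mu}+\omega_{a\mu\nu}.$$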

Problem 1: Relaxation of torsion-free condition

  1. Drop the torsion-free condition from the definition of the derivative operator $\nabla_a$. Show that there exists a tensor ${T^c}_{ab}$ (torsion tensor) such that, for all smooth functions $f$, we have $\nabla_a\nabla_b f-\nabla_b\nabla_a f=-{T^c}_{ab}\nabla_c f$.

    I’m glad to have this exercise, because two such arguments were presented in chapter 3 (“a tensor exists such that…”) which I could not internalize. Now that I’ve read the argument again (page 32), I still don’t understand the statement that $\tilde\nabla_a\omega_b-\nabla_a\omega_b$ depends only on the value of $\omega_b$ at the given point $p$. However, the rest of the argument makes sense.

    I can build on the given argument to show the desired result: $\tilde\nabla_a-\nabla_a$ defines a tensor ${C^c}_{ab}$ of type $(1,2)$ through $\nabla_a\omega_b=\tilde\nabla_a\omega_b-{C^c}_{ab}\omega_c$. Then consider $\omega_b=\nabla_b f=\tilde\nabla_b f$ for a smooth scalar field $f$ (the equality is because all derivatives agree on scalar fields, axiomatically). The hint suggests taking $\tilde\nabla$ to be a torsion-free derivative operator (Why are we free to make such a choice? Is the existence guaranteed?), so we have:

    \begin{multline} -{C^c}_{ab}\nabla_c f=\nabla_a\nabla_b f-\tilde\nabla_b\tilde\nabla_a f=\nabla_a\nabla_b f-\tilde\nabla_b\nabla_a f \\
    =\nabla_a\nabla_b f-\left(\nabla_b\nabla_a f+{C^d}_{ba}\nabla_d f\right) \end{multline}


    $$\nabla_a\nabla_b f-\nabla_b\nabla_a f=\left({C^c}_{ba}-{C^c}_{ab}\right)\nabla_c f,$$

    proving the claim with ${T^c}_{ab}={C^c}_{ab}-{C^c}_{ba}$.

  2. Given smooth vector fields $X^a, Y^a$, show we have ${T^c}_{ab}X^a Y^b=X^a\nabla_a Y^c-Y^a\nabla_a X^c-[X,Y]^c$.

    Compute the most non-trivial term in the equation: the commutator. This parallels exercise 2.4, but this time without torsion. Expanding $X$ and $Y$ in the coordinate basis $\{\nabla_a\}$, we have

    \begin{multline} [X,Y]=X^a(\nabla_a Y^b)\nabla_b+X^a Y^b\nabla_a\nabla_b-Y^b(\nabla_b X^a)\nabla_a-Y^b X^a\nabla_b\nabla_a \\
    =X^a Y^b(\nabla_a\nabla_b-\nabla_b\nabla_a)+(X^a\nabla_a Y^b-Y^a\nabla_a X^b)\nabla_b. \end{multline}

    Now using the result of part 1 to re-express the commutator of the derivative operator in terms of $T$, and relabeling some indices, this yields

    $$[X,Y]=\left[X^a\nabla_a Y^c-Y^a\nabla_a X^c-X^a Y^b {T^c}_{ab}\right]\nabla_c.$$

    Of course, $[X,Y]=[X,Y]^c\nabla_c$; reading off the components from here gives the result.

  3. Given a metric $g_{ab}$, show there is a unique derivative operator $\nabla_a$ with torsion ${T^c}_{ab}$ such that $\nabla_c g_{ab}=0$. Derive the analogue of equation 3.1.29.

