
The textbook (Introduction to the Classical Theory of Particles and Fields, by Boris Kosyakov) defines a hypersurface by $$F(x)~=~c,$$ where $F\in C^\infty[\mathbb M_4,\mathbb R]$. Differentiating gives $$dF~=~(\partial_\mu F)dx^\mu~=~0.$$ The text then says $dx^\mu$ is a covector and $\partial_\mu F$ a vector. I learnt from another book that the $dx^\mu$ are 4 dual vectors (in Minkowski space), where $\mu$ indexes the dual vectors themselves, not the components of a single dual vector. So I think $\partial_\mu F$ should also be 4 vectors, each being the directional derivative along a coordinate axis. But this book later states that $(\partial_\mu F)dx^\mu=0$ describes a hyperplane $\Sigma$ with normal $\partial_\mu F$, spanned by the vectors $dx^\mu$, and calls $\Sigma$ a tangent plane (pages 33-34). This time it seems to treat $\partial_\mu F$ as a single vector and the $dx^\mu$ as vectors. But I think the $dx^\mu$ should span a cotangent space.

I need some help to clarify these things.

[edit by Ben Crowell] The following appears to be the text the question refers to, from Appendix A (which Amazon let me see through its peephole):

Elie Cartan proposed to use differential coordinates $dx^i$ as a convenient basis of 1-forms. The differentials $dx^i$ transform like covectors [...] Furthermore, when used in the directional derivative $dx^i \partial F/\partial x^i$, $dx^i$ may be viewed as a linear functional which takes real values on vectors $\partial F/\partial x^i$. The line elements $dx^i$ are called [...] 1-forms.

  • Related: physics.stackexchange.com/q/79013/2451
    – Qmechanic
    Commented Oct 31, 2014 at 16:41
  • After writing an answer and then succeeding in getting a look at what Kosyakov wrote, I'm just as confused as elflyao. I would be interested in hearing from others who might have broader experience or be able to explain whether Kosyakov has an unusual point of view.
    – user4552
    Commented Oct 31, 2014 at 19:36
  • Cross-posted here: physicsforums.com/threads/…
    – user4552
    Commented Nov 1, 2014 at 21:22
  • $\partial_\mu F$ is a function, not a vector. Perhaps you mean the four-tuple of the $\partial_\mu F$'s?
    – MBN
    Commented Nov 2, 2014 at 19:46
  • For short, simple answers see physicsoverflow.org/24735
    Commented Nov 11, 2014 at 14:38

5 Answers


Below follows a handful of excerpts from the book Introduction to the Classical Theory of Particles and Fields (2007) by B. Kosyakov.

Controversial/misleading/wrong statements are marked in $\color{Red}{\rm red}$. We agree with OP that the statements marked in $\color{Red}{\rm red}$ run counter to standard terminology/conventions. Some (not all) correct statements are marked in $\color{Green}{\rm green}$.

1.2 Affine and Metric Structures

[...] Let ${\bf e}_1$, $\ldots$, ${\bf e}_n$ and ${\bf e}^{\prime}_1$, $\ldots$, ${\bf e}^{\prime}_n$ be two arbitrary bases. Each $\color{Green}{\rm vector}$ of the latter basis can be expanded in terms of $\color{Green}{\rm vectors}$ of the former basis: $$ {\bf e}^{\prime}_i ~=~ {\bf e}_j~L^j{}_i .\tag{1.37} $$

[...] Thus, linear functionals form the dual vector space $V^{\prime}$. If $V$ is $n$-dimensional, so is $V^{\prime}$. Indeed, let ${\bf e}_1$, $\ldots$, ${\bf e}_n$ be a basis in $V$. Then any $\omega\in V^{\prime}$ is specified by $n$ real numbers $\omega_1=\omega({\bf e}_1)$, $\ldots$, $\omega_n=\omega({\bf e}_n)$, and the value of $\omega$ on ${\bf a} = a^i {\bf e}_i$ is given by $$\omega({\bf a}) ~=~ \omega_i a^i .\tag{1.52} $$ We see that $V^{\prime}$ is isomorphic to $V$. That is why we sometimes refer to linear functionals as $\color{Green}{covectors}$. A closer look at (1.52) shows that a $\color{Green}{\rm vector}$ ${\bf a}$ can be regarded as a linear functional on $V^{\prime}$. One can show (Problem 1.2.3) that changing the basis (1.37) implies the transformation of $\omega_i$ according to the same law: $$ \omega^{\prime}_i ~=~\omega_j ~L^j{}_i .\tag{1.53} $$ We will usually suppress the argument of $\omega({\bf a})$, and identify $\omega$ with its components $\omega_i$. [...]

1.3 Vectors, Tensors, and $n$-Forms

[...] A simple generalization of vectors and covectors are tensors. Algebraically, a tensor $T$ of rank $\color{Green}{(m,n)}$ is a multilinear mapping $$\color{Green}{T: \underbrace{V^{\prime} \times\ldots\times V^{\prime}}_{m\text{ times}} \times \underbrace{V \times\ldots\times V}_{n\text{ times}} \to \mathbb{R}}. \tag{1.112} $$ We have already encountered examples of tensors in the previous section: a scalar is a rank $(0,0)$ tensor, a $\color{Green}{\rm vector}$ is a rank $\color{Green}{(1,0)}$ tensor, a $\color{Green}{\rm covector}$ is a rank $\color{Green}{(0,1)}$ tensor, the metric $g_{ij}$ is a rank $(0,2)$, while $g^{ij}$ is a rank $(2,0)$ tensor, and the Kronecker delta $\delta^i{}_j$ is a rank $(1,1)$ tensor. Just as $\color{Green}{\rm four~vectors}$ can be regarded as objects which transform according to the law $$ a^{\prime \mu} ~=~ \color{Green}{\Lambda^{\mu}{}_{\nu}} ~a^{\nu} ,\tag{1.113} $$ where $\Lambda^{\mu}{}_{\nu}$ is the Lorentz transformation matrix relating the two frames of reference, so tensors of rank $(m,n)$ can be described in terms of Lorentz group representations by the requirement that their transformation law be $$T^{\prime\mu_1\cdots \mu_m}{}_{\nu_1\cdots \nu_n} ~=~\color{Green}{\Lambda^{\mu_1}{}_{\alpha_1}\ldots\Lambda^{\mu_m}{}_{\alpha_m}}~ T^{\alpha_1\cdots \alpha_m}{}_{\beta_1\cdots \beta_n}~ \color{Red}{\Lambda^{\beta_1}{}_{\nu_1}\ldots\Lambda^{\beta_n}{}_{\nu_n}}. \tag{1.114} $$

[...]The differential operator $$ \partial_{\mu}~=~\frac{\partial}{\partial x^{\mu}} \tag{1.140} $$ transforms like a $\color{Green}{\rm covariant~vector}$. To see this, we use the chain rule for differentiation: $$ \frac{\partial}{\partial x^{\mu}}~=~\frac{\partial x^{\prime \nu}}{\partial x^{\mu}} \frac{\partial}{\partial x^{\prime \nu}}, \tag{1.141} $$ and note that, for linear coordinate transformations $x^{\prime\mu} = \color{Green}{\Lambda^{\mu}{}_{\nu}}~x^{\nu} + a^{\mu}$ $$\frac{\partial x^{\prime \mu}}{\partial x^{\nu}} ~=~\color{Green}{\Lambda^{\mu}{}_{\nu}}. \tag{1.142} $$ We will always use the shorthand notation $\partial_{\mu}$, and treat this differential operator as an ordinary $\color{Green}{\rm vector}$. [...]

1.4 Lines and Surfaces

[...] We define a hypersurface $M_{n-1}$ by $$F(x) ~=~ C , \tag{1.176} $$ where $F$ is an arbitrary smooth function $\mathbb{M}_4 \to \mathbb{R}$. Differentiating (1.176) gives $$ (\partial_{\mu}F) dx^{\mu} ~=~ 0 . \tag{1.177} $$ One may view $dx^{\mu}$ as a $\color{Green}{\rm covector}$, and $\partial_{\mu}F$ as a $\color{Red}{\rm vector}$. Indeed, $dx^{\mu}$ transforms like a $\color{Red}{\rm covector}$ under linear coordinate transformations $x^{\prime\mu} = \color{Green}{\Lambda^{\mu}{}_{\nu}}~x^{\nu} + a^{\mu}$, $$dx^{\prime\mu} ~=~ \frac{\partial x^{\prime\mu}}{\partial x^{\nu}}dx^{\nu} ~=~\color{Green}{\Lambda^{\mu}{}_{\nu}}~dx^{\nu}, \tag{1.178} $$ and $\partial_{\mu}F$ transforms like a $\color{Red}{\rm vector}$: $$\frac{\partial F}{\partial x^{\prime\mu}} ~=~\frac{\partial F}{\partial x^{\nu}} \frac{\partial x^{\nu}}{\partial x^{\prime\mu}} ~=~\frac{\partial F}{\partial x^{\nu}}\color{Red}{\Lambda^{\nu}{}_{\mu}}. \tag{1.179} $$ In Minkowski space, vectors and covectors can be converted to each other according to (1.121). For this reason, we will often regard $dx^{\mu}$ as vectors. [...]

A. Differential Forms

[...] Elie Cartan proposed to use differential coordinates $dx^i$ as a convenient basis of $\color{Green}{\rm one~forms}$. The differentials $dx^i$ transform like $\color{Red}{\rm covectors}$ under a local coordinate change, $$dx^{\prime j}~=~ \frac{\partial x^{\prime j}}{\partial x^i}dx^i. \tag{A.1} $$ [If the coordinate change is specialized to Euclidean transformations $x^{\prime j} =\color{Red}{L^j{}_i} ~x^i + c^j$, then $\partial x^{\prime j} /\partial x^i$ reduces to $\color{Red}{L^j{}_i}$, an orthogonal matrix with constant entries, and (A.1) $\color{Red}{\rm becomes}$ (1.53), the transformation law for $\color{Green}{\rm covectors}$.] [...]

Notes:

  1. The corrected eq. (1.114) reads $$T^{\prime\mu_1\cdots \mu_m}{}_{\nu_1\cdots \nu_n} ~=~\Lambda^{\mu_1}{}_{\alpha_1}\ldots\Lambda^{\mu_m}{}_{\alpha_m}~ T^{\alpha_1\cdots \alpha_m}{}_{\beta_1\cdots \beta_n}~ (\Lambda^{-1})^{\beta_1}{}_{\nu_1}\ldots(\Lambda^{-1})^{\beta_n}{}_{\nu_n}. \tag{1.114} $$

  2. The corrected eq. (1.179) reads $$\frac{\partial F}{\partial x^{\prime\mu}} ~=~\frac{\partial F}{\partial x^{\nu}} \frac{\partial x^{\nu}}{\partial x^{\prime\mu}} ~=~\frac{\partial F}{\partial x^{\nu}}(\Lambda^{-1})^{\nu}{}_{\mu}. \tag{1.179} $$

  3. To explain why (A.1) does not become (1.53), let ${\bf e}^1$, $\ldots$, ${\bf e}^n$, be a (dual) basis in $V^{\prime}$. In light of (1.53), in order for a covector $\omega=\omega_i{\bf e}^i\in V^{\prime}$ to be independent of the choice of basis, the dual basis must transform as $$ {\bf e}^{\prime i} ~=~ M^i{}_j ~{\bf e}^j, \tag{*}$$ where $$ M~=~L^{-1}. \tag{1.45}$$ Identifying the dual bases ${\bf e}^i\leftrightarrow dx^i$, the above eq. (*) becomes (A.1). Moreover, in the sentence below eq. (A.1), the $L$ matrix should be replaced with the $M$ matrix in two places.

  4. Finally, let us answer OP's title question: A partial derivative $\partial_{\mu}F$ (of a scalar function $F$) is a component of a cotangent vector $dF=(\partial_\mu F)dx^\mu$, while the un-applied partial derivative $\partial_{\mu}$ is a local basis element of a tangent vector. Both $\partial_{\mu}F$ and $\partial_{\mu}$ transform as covectors.
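The corrected transformation laws in notes 1 and 2 can be sanity-checked numerically in a few lines of NumPy. This is just an illustrative sketch: the random matrix below stands in for any invertible $\Lambda$, and the point is only that pairing vector components (transforming with $\Lambda$) against covector components (transforming with $\Lambda^{-1}$) gives a basis-independent scalar.

```python
import numpy as np

rng = np.random.default_rng(0)

# A stand-in for the transformation matrix Lambda: any invertible matrix
# illustrates the linear-algebra point (no metric is needed here).
Lam = rng.normal(size=(4, 4))
Lam_inv = np.linalg.inv(Lam)

a = rng.normal(size=4)   # vector components a^mu
w = rng.normal(size=4)   # covector components w_mu

# Vector components transform with Lambda; covector components transform
# with Lambda^{-1} acting from the right, as in the corrected eq. (1.179):
a_prime = Lam @ a        # a'^mu = Lam^mu_nu a^nu
w_prime = w @ Lam_inv    # w'_mu = w_nu (Lam^{-1})^nu_mu

# The pairing w(a) = w_mu a^mu is then invariant:
print(np.allclose(w @ a, w_prime @ a_prime))  # True
```

Had both sets of components transformed with $\Lambda$, as in the book's uncorrected (1.114)/(1.179), the pairing would not be invariant.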

  • Well, point 4 is nicely put, but I don't like the red/green tags so much, as this is more a matter of convention than anything else (any book errata excepted, of course).
    – Nikos M.
    Commented Nov 5, 2014 at 21:03
  • @NikosM.: Qmechanic's points 1 and 2 are not matters of convention; Kosyakov has made mistakes in those spots.
    – user4552
    Commented Nov 6, 2014 at 1:08

I took a quick look at pages 59 and 60 of "Gravitation", section 2.6 "Gradients and Directional Derivatives", to see if there's anything there we can use to clarify this issue.

In this section, the gradient of $f$ is $\mathbf df$, the directional derivative along the vector $\mathbf v$ is $\partial_{\mathbf v}f$ and the following relationship holds:

$$\partial_{\mathbf v}f = \langle\mathbf df, \mathbf v \rangle$$

Then assuming a set of basis forms $\mathbf dx^{\mu}$ and dual basis vectors $\mathbf e_{\mu}$ we have

$$\partial_{\mu} f \equiv \partial_{\mathbf e_{\mu}}f = \langle\mathbf df, \mathbf e_{\mu} \rangle = \frac{\partial f}{\partial x^{\mu}}$$

So, according to MTW in this section, $\partial_{\mu} f$ are the components of $\mathbf df$ on this basis.

Thus, it must be that, per the 2nd equation in the question,

$$\mathbf df = (\partial_{\mu} f) \mathbf dx^{\mu} $$

which is just the expansion of the form $\mathbf df$ on the basis forms $\mathbf dx^{\mu}$.

As to why Kosyakov would identify this as a contraction of a form and a vector I haven't a clue.
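The MTW relations above can be sketched in a few lines of sympy. This is only an illustration: the scalar function $F$ below is a made-up example, not taken from either book. Its partial derivatives are exactly the components of $\mathbf dF$ on the basis $\mathbf dx^\mu$, and pairing them with vector components reproduces the directional derivative.

```python
import sympy as sp

# Coordinates on M_4 (names are illustrative)
t, x, y, z = sp.symbols('t x y z')
coords = (t, x, y, z)

# A hypothetical scalar function F, chosen only for concreteness
F = t**2 - x**2 - y**2 - z**2

# Components of dF on the basis dx^mu are the partial derivatives d_mu F
dF = [sp.diff(F, q) for q in coords]
print(dF)  # [2*t, -2*x, -2*y, -2*z]

# The directional derivative along v is the pairing <dF, v> = (d_mu F) v^mu
v = sp.symbols('v0:4')
directional = sum(c * vi for c, vi in zip(dF, v))
```

So $\partial_\mu F$ appears here only as the $\mu$-th component of the single one-form $\mathbf dF$, in agreement with the MTW reading.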


I believe this is just imprecise use of language by the author; there is nothing mysterious happening, it is simply not well stated:

As stated in the question, for a hypersurface $\Sigma$ defined by

$$ F(x) = c \in \mathbb{R}$$

we find that

$$ \mathrm{d}F = 0$$

must hold on $\Sigma$. This is crucial - it means that the 1-form $\mathrm{d}F$ acting upon tangent vectors of $\Sigma$ must vanish identically:

$$ \forall v \in T_x\Sigma : \mathrm{d}F(v) = (\partial_\mu F)v^\mu = 0$$

But we can recognize $(\partial_\mu F)v^\mu$ as the scalar product of the vectors $v$ and $g(\mathrm{d}F,\cdot)$, the latter being the usual dual of $\mathrm{d}F$ with components $\partial^\mu F$. Since $T_x\Sigma \subset T_x\mathbb{M}^4$ naturally, this means that $\mathrm{d}F = 0$ indeed sweeps out a hyperplane in the tangent space that has, in sloppy diction, the gradient as its normal (although it is really its dual).
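A small numerical illustration of this picture, in a Euclidean example for simplicity (the sphere and the point below are hypothetical, chosen only to make the level surface concrete): vectors tangent to the surface annihilate $\mathrm{d}F$, while generic vectors do not.

```python
import numpy as np

# Level surface F(x) = c, with F a simple quadratic (Euclidean example)
def F(x):
    return x @ x

def dF_components(x):
    # components d_mu F of the one-form dF at the point x
    return 2.0 * x

p = np.array([1.0, 2.0, 2.0])   # a point on the surface F = 9

# A vector orthogonal to (2, 4, 4), hence tangent to the surface at p:
v = np.array([2.0, -1.0, 0.0])

# dF(v) = (d_mu F) v^mu vanishes on tangent vectors...
print(np.isclose(dF_components(p) @ v, 0.0))  # True

# ...but not on a generic, non-tangent vector such as p itself:
print(np.isclose(dF_components(p) @ p, 0.0))  # False
```

Note that the contraction $(\partial_\mu F)v^\mu$ above never needed a metric; the metric only enters if one wants to reinterpret $\partial^\mu F$ as a normal vector.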

  • I think the crucial point is in the sentence "But we can recognize...," and it's here that I don't follow you. Since the gradient $dF$ is a 1-form, its dual $\partial^\mu F$ is a vector. If so, then we can't take the scalar product of the vector $\partial^\mu F$ with a vector $v^\mu$. I don't see any reason for taking duals anywhere at all. Even if we didn't have a metric, and therefore couldn't take duals, we could simply have $(\partial_\mu F)v^\mu$, the scalar product of a covector with a vector.
    – user4552
    Commented Nov 2, 2014 at 20:51
  • @BenCrowell: Yes, indeed. That will still define a hypersurface in the tangent space, but we will not have a "normal vector" to describe it. I agree that there is no need to take duals, but I think Kosyakov implicitly does exactly that when he talks of $\partial_\mu F$ being a normal vector. Your nomenclature seems a bit unorthodox to me, though: a scalar product is between two vectors or two covectors (and usually induced by the metric); applying a covector to a vector is not a scalar product.
    – ACuriousMind
    Commented Nov 2, 2014 at 20:57
  • OK, by "scalar product" I simply meant a product that transforms as a scalar. So taking "scalar product" in your answer to mean $g(\cdot,\cdot)$, I don't understand why one would describe $(\partial_\mu F)v^\mu$ as a scalar product of $v^\mu$ with $\partial^\mu$. That might indicate that we take the gradient, raise its index, lower its index, and then contract. I don't see the point of raising an index and then immediately lowering it again. Or we could raise the gradient's index, lower $v^\mu$'s index, and contract. Again, why raise or lower at all?
    – user4552
    Commented Nov 2, 2014 at 21:13
  • @BenCrowell: Because it yields the geometric interpretation of $\partial^\mu F$ as the normal vector to the hyperplane of tangent vectors of $\Sigma$. I don't think there's anything deeper than that here.
    – ACuriousMind
    Commented Nov 2, 2014 at 21:26

I believe you are confused because you are mixing up related but slightly different quantities.

Yes, a partial derivative is a vector and yes, a vector is an object with an upper index.

The above statement may seem contradictory, but in fact it is not, for the following reason. A vector is an abstract quantity that is an element of a "vector space". In this case, the vector space being discussed is the tangent space. On a vector space, one can choose a basis, any basis. Once a basis has been chosen, any other vector in the vector space can be described by simply prescribing a set of numbers. For instance, in ${\mathbb R}^2$ (or rather the corresponding affine space), one can choose a basis of vectors ${\hat x}$ and ${\hat y}$. Once this has been done, any other vector can be described by just 2 numbers. For instance, the numbers $(1,2)$ really mean that we are talking about the vector ${\hat x} + 2 {\hat y}$.

How does the discussion above apply here?

On the tangent space, a natural choice of basis is the set of partial derivatives $\partial_\mu = \{ \partial_0 , \partial_1 , \partial_2 , \partial_3 \}$ (assuming we are in $M_4$). Each partial derivative is in itself a vector.

Now, once this basis has been chosen, every other vector can be described by a set of 4 numbers $v^\mu = (v^0 , v^1 , v^2 , v^3)$ which corresponds to the vector $v^\mu \partial_\mu$. It is this sense, that the bold statement above is true. Often, since the basis of partial derivatives is obvious, one simply describes a vector as an object with an upper index $v^\mu$.

Next, let us discuss covectors (quantities with a lower index). These are elements of the dual vector space of the tangent space (the space of linear functions on the tangent space). Given the partial-derivative basis on the tangent space, one then has a natural basis in the cotangent space denoted by $dx^\mu = \{ dx^0 , dx^1 , dx^2 , dx^3 \}$. Note that each differential is itself a covector. This natural basis is defined by the relation $dx^\mu (\partial_\nu ) = \delta^\mu_\nu$. As before, once this natural basis has been chosen, any element of the cotangent space can be described by 4 numbers, namely $v_\mu = \{ v_0, v_1 , v_2 , v_3 \}$, which corresponds to the covector $v_\mu dx^\mu$.

In summary, $\partial_\mu$ for each $\mu$ corresponds to a 4-dimensional vector whereas $v^\mu$ for each $\mu$ corresponds to 4 components of a single vector. Similarly, $dx^\mu$ for each $\mu$ corresponds to a 4-dimensional covector whereas $v_\mu$ for each $\mu$ corresponds to 4 components of a single covector.
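The duality relation $dx^\mu(\partial_\nu) = \delta^\mu{}_\nu$ underlying this summary can be sketched in a couple of lines of sympy, reading $dx^\mu(\partial_\nu)$ as $\partial x^\mu/\partial x^\nu$ (a minimal illustration; the coordinate names are arbitrary):

```python
import sympy as sp

# Four coordinate functions x^0, ..., x^3
x = sp.symbols('x0:4')

# dx^mu applied to the basis vector d_nu is d x^mu / d x^nu = delta^mu_nu,
# which is exactly the statement that the two bases are dual to each other.
pairing = sp.Matrix(4, 4, lambda mu, nu: sp.diff(x[mu], x[nu]))
print(pairing == sp.eye(4))  # True
```

Applying the same pairing to a general vector $v^\nu \partial_\nu$ extracts the component $v^\mu$, which is why components carry the opposite index placement from the basis elements.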

PS 1 - Sometimes people like to use bases other than $\partial_\mu$ and $dx^\mu$ on the tangent and cotangent spaces respectively. These are known as non-coordinate bases.

PS 2 - Just to be clear, $\partial_\mu$ is a vector, but $\partial_\mu F$ is a function


It is interesting to note that Elie Cartan did not write $dx^i$ as we do today. He had not yet adopted the careful, Ricci-like distinctions with indices. For him, $x_i$ was not a component or coordinate of anything: it was a function on the manifold, or at least on a small neighbourhood, so it was a degree-zero form; it had no components at all and was neither covariant nor contravariant, just the same as $x$ or $X$ or $Z$. Now "d" is an operator which, when applied to $p$-forms, yields $(p+1)$-forms, so, when applied to $x_1$, it yields a 1-form, which is a cotangent vector at each point of the manifold. So it is a covector, hence covariant. So $dx_i$ has covariant components, but it is itself a covector; its components are $\frac{\partial x_i}{\partial x_1}$, etc. In the same way, $\partial_{x_i}$ is a vector (field, i.e., a vector that depends on the point of the manifold). It is not the component of anything, but it has components. This is to correct point 4 of @Qmechanic's answer.

Your puzzlement would be cleared up if you didn't use indices for the coordinate functions, but called them x, y, and z. Or even just x and y; two dimensions is often enough to clear up most things. And forget the Einstein convention or summation signs: write out the sums explicitly. Using indices to label various objects, in this case functions, can cause serious confusion with the use of indices to label the various components of one object. Cartan put the indices below the x, as $x_i$, because he was thinking of $n$ different objects, not the $n$ components of one object. Scalar-valued functions don't have any components at all.
