July 24, 2015
Characterizing the trace of a matrix
I have been studying for my finals lately, and so I decided to put together a proof of a nice exercise I found in some book. The trace function \(tr : \mathbb{K}^{n \times n} \to \mathbb{K}\) is defined as

\[ tr(A) = \sum_{i = 1}^{n} a_{ii} \]
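As a quick sanity check (a pure-Python sketch of mine, not part of the original exercise), the trace is just the sum of the diagonal entries:

```python
# Minimal sketch: the trace of a square matrix is the sum of its
# diagonal entries.
def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2],
     [3, 4]]
print(trace(A))  # 1 + 4 = 5
```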
First of all, the proof of additivity:

\[ tr(A + B) = \sum_{i = 1}^{n} (a_{ii} + b_{ii}) = \sum_{i = 1}^{n} a_{ii} + \sum_{i = 1}^{n} b_{ii} = tr(A) + tr(B) \]
Afterwards, the proof of homogeneity:

\[ tr(c A) = \sum_{i = 1}^{n} c\, a_{ii} = c \sum_{i = 1}^{n} a_{ii} = c\, tr(A) \]
Hence, \(tr\) is a linear transform from the vector space \(\mathbb{K}^{n \times n}\) into \(\mathbb{K}\). The cool thing about the trace is that it has many more interesting properties which are not difficult to prove. First of all, it is invariant under transposition:

\[ tr(A^T) = \sum_{i = 1}^{n} (A^T)_{ii} = \sum_{i = 1}^{n} a_{ii} = tr(A) \]
And it doesn't change when the two factors of a product are swapped:

\[ tr(AB) = \sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{ij} b_{ji} = \sum_{j = 1}^{n} \sum_{i = 1}^{n} b_{ji} a_{ij} = tr(BA) \]
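The identity \(tr(AB) = tr(BA)\) is easy to check numerically; a small pure-Python sketch (the helper functions and example matrices are mine, not from the post):

```python
# Sketch: verify tr(AB) = tr(BA) on a concrete pair of 2x2 matrices.
def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
print(trace(matmul(A, B)), trace(matmul(B, A)))  # both are 55
```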
Therefore, we get a rather non-intuitive property of the trace: it is invariant under cyclic permutations of a product,

\[ tr(ABC) = tr(BCA) = tr(CAB) \]
This comes from applying the previous property to different parenthesizations of the matrix product (notice that this works for any number of matrices). Going further, suppose that \(P \in \mathbb{K}^{n \times n}\) is an invertible matrix. Then

\[ tr(P^{-1} A P) = tr\left((A P) P^{-1}\right) = tr(A) \]
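Similarity invariance can also be checked on a concrete example (a sketch of mine; the \(2 \times 2\) inverse is written out by hand):

```python
# Sketch: tr(P^{-1} A P) = tr(A) for an invertible P.
def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
P = [[2, 1], [1, 1]]        # det(P) = 1, so P is invertible
Pinv = [[1, -1], [-1, 2]]   # inverse of P, written out by hand

similar = matmul(Pinv, matmul(A, P))
print(trace(similar), trace(A))  # both are 5
```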
Meaning that if two matrices \(A\) and \(B\) are similar, their traces are the same! Additionally, writing \(A\) in any given basis will not change its trace. Last of all, the reason I decided to write this post:
Suppose that we have a linear functional \(f : \mathbb{K}^{n \times n} \to \mathbb{K}\) such that \(f(AB) = f(BA)\) for all \(A, B\). Then there exists \(\lambda \in \mathbb{K}\) such that \(f(A) = \lambda\, tr(A)\). That is, linearity and the product property completely determine the trace function, up to a constant factor. The proof is fairly easy. Let

\[ \{E^{ij}\}_{1 \le i, j \le n} \]
be the canonical ordered basis for \(\mathbb{K}^{n \times n}\) (each \(E^{ij}\) has a \(1\) in position \((i, j)\) and zeros elsewhere). Notice that:

\[ E^{ij} E^{lm} = \delta_{jl} E^{im} \]
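The multiplication rule \(E^{ij} E^{lm} = \delta_{jl} E^{im}\) for these basis matrices can be verified exhaustively for a small \(n\) (a sketch, assuming nothing beyond plain Python):

```python
# Sketch: exhaustively check E^{ij} E^{lm} = delta_{jl} E^{im} for n = 3.
n = 3

def E(i, j):
    # Basis matrix with a 1 in position (i, j) and zeros elsewhere.
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

zero = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        for l in range(n):
            for m in range(n):
                expected = E(i, m) if j == l else zero
                assert matmul(E(i, j), E(l, m)) == expected
print("rule verified for all index combinations")
```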
Since \(f\) is a linear functional, we can write it as \(f(A) = \sum_{i, j = 1}^n \alpha_{ij} \phi_{ij}(A)\), where the \(\phi_{ij}\) are the corresponding functionals from the dual basis, meaning that \(\phi_{ij}(E^{i'j'}) = \delta_{ii'} \delta_{jj'}\) (equivalently, \(\phi_{ij}(A) = a_{ij}\)). Thus, we get that

\[ f(E^{ij} E^{lm}) = \delta_{jl}\, f(E^{im}) = \delta_{jl}\, \alpha_{im} \]
And

\[ f(E^{lm} E^{ij}) = \delta_{mi}\, f(E^{lj}) = \delta_{mi}\, \alpha_{lj} \]
The additional condition that \(f(AB) = f(BA)\) means that \(\delta_{jl}\, \alpha_{im} = \delta_{mi}\, \alpha_{lj}\) for all \(i, j, l, m \in \{1, \dots, n\}\). In particular, if we take \(m = i\) and \(l = j\), this turns into \(\alpha_{ii} = \alpha_{jj}\) for all \(i, j \in \{1, \dots, n\}\); we can call this common value \(\lambda\). Finally, taking \(m = i\) and \(l \neq j\), we get that \(\alpha_{lj} = 0\) for all \(l, j \in \{1, \dots, n\}\) with \(l \neq j\). This completely determines every single one of the \(\alpha\)s. Hence:

\[ f(A) = \sum_{i = 1}^{n} \lambda\, \phi_{ii}(A) = \lambda \sum_{i = 1}^{n} a_{ii} = \lambda\, tr(A) \]
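To see this concretely, here is a sketch where \(f(A) = \sum_{i, j} \alpha_{ij} a_{ij}\): with \(\alpha\) a scalar multiple of the identity, the commutation property holds, while a non-diagonal \(\alpha\) breaks it (the example matrices are arbitrary choices of mine):

```python
# Sketch: f(A) = sum_{ij} alpha[i][j] * A[i][j].  alpha = lambda*I gives
# f = lambda*tr, which satisfies f(AB) = f(BA); a non-diagonal alpha does not.
n = 2

def f(alpha, A):
    return sum(alpha[i][j] * A[i][j] for i in range(n) for j in range(n))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]

alpha_trace = [[3, 0], [0, 3]]   # alpha = 3*I, i.e. f = 3*tr
alpha_other = [[1, 1], [0, 1]]   # not a scalar multiple of I

print(f(alpha_trace, matmul(A, B)) == f(alpha_trace, matmul(B, A)))  # True
print(f(alpha_other, matmul(A, B)) == f(alpha_other, matmul(B, A)))  # False
```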
By this point, we have pretty much characterized the trace function, in the sense that we know that any linear functional that does not change when the order of a matrix product is swapped is a scalar multiple of the trace. There is only one last property needed to uniquely determine the trace. Given \(f\) with the previous properties, \(f(I) = n \iff f = tr\), since

\[ f(I) = \lambda\, tr(I) = \lambda n \]
Which happens if and only if \(\lambda = 1\).