The task of a feedforward neural network is to transform input data into output data; that is, it is a function from the input space to the output space. We may want the network to be equivariant when the data carry symmetry information that we want to preserve through the transformation.
One example is image classification. When classifying images, the labels we assign are usually rotationally invariant. Hence, if the classifier network is equivariant to rotations, one can potentially save a significant amount of computational power and make the model more generalizable.
Now let's look at an implementation of an SO(3)-equivariant network from this paper. Here, we consider spherical functions in $L^2(S^2)$ as data. Under the integrability assumption, we can perform the spherical Fourier transform (SFT)
Here, $\{Y_\ell^m\}$ are the spherical harmonics, where $\ell$ is the degree of the corresponding homogeneous polynomials and $m$ is the order. Given a kernel function $h \in L^2(S^2)$ and the north pole $N = (0,0,1) \in S^2$, we can write (3) as
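$$(f * h)(x) = \int_{SO(3)} f(g \cdot N)\, h(g^{-1} \cdot x)\, d\mu(g).$$
Using the fact that the push-forward of the Haar measure $\mu$ under $g \mapsto g \cdot N$ is a multiple of the uniform measure on $S^2$, we can derive
$$(f * h)_\ell^m = \sqrt{\frac{16\pi^3}{2\ell+1}}\; f_\ell^m\, h_\ell^0.$$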
The key observation here is that the only information we need about the kernel $h$ is its order-0 coefficients $\{h_\ell^0\}_{\ell=0}^{\infty}$.
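For readers who want the intermediate steps, here is a sketch of one way to reach (7). It goes through the Schur orthogonality relations for the Wigner D-matrices $D^\ell_{mn}$ instead of the measure argument, and it rests on several assumptions that are not stated in the text: the spherical harmonics are orthonormal, $\mu$ has total mass $8\pi^2$, and rotations act on harmonics as in the first line below. Other conventions change the constant.

```latex
\begin{aligned}
h(g^{-1}\cdot x) &= \sum_{\ell,n} h_\ell^n \sum_{|m|\le\ell} D^\ell_{mn}(g)\, Y_\ell^m(x), \\
f(g\cdot N) &= \sum_{\ell,m} f_\ell^m \sqrt{\tfrac{2\ell+1}{4\pi}}\; \overline{D^\ell_{m0}(g)},
  \qquad \text{since } Y_\ell^k(N) = \delta_{k0}\sqrt{\tfrac{2\ell+1}{4\pi}}, \\
\int_{SO(3)} D^\ell_{mn}(g)\, \overline{D^{\ell'}_{m'n'}(g)}\, d\mu(g)
  &= \frac{8\pi^2}{2\ell+1}\, \delta_{\ell\ell'}\, \delta_{mm'}\, \delta_{nn'}, \\
(f*h)_\ell^m &= \sqrt{\tfrac{2\ell+1}{4\pi}} \cdot \frac{8\pi^2}{2\ell+1}\; f_\ell^m\, h_\ell^0
  = \sqrt{\frac{16\pi^3}{2\ell+1}}\; f_\ell^m\, h_\ell^0 .
\end{aligned}
```

The factor $\delta_{n0}$ coming from the orthogonality relation is what removes every kernel coefficient except those of order 0.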
Suppose that $f$ has bandwidth $b > 0$, that is, $f_\ell^m = 0$ for all $\ell \ge b$. Then by (7), $f * h$ also has bandwidth $b$, and thus it suffices to keep track of $(h_0^0, \dots, h_{b-1}^0)$.
where $w_j^{(b)}$ are predetermined weights on $\{x_{j,k}\}$. Hence, to implement $\mathrm{Cov}_{h,\eta}$, we can first find the coefficients $f_\ell^m$ with (8). Then we apply the pointwise product in (7) to get the coefficients of $\mathrm{Cov}_{h,\eta}(f)$.
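The second step, the pointwise product in the spectral domain, can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's implementation: the function name `zonal_spectral_conv`, the list-of-lists storage (entry `l` holds the `2l + 1` coefficients of degree `l`), and the toy values are all assumptions made here, and the constant assumes the Haar measure is normalized to total mass 8π².

```python
import math

def zonal_spectral_conv(f_coeffs, h_zonal):
    """Apply the spectral product degree by degree.

    f_coeffs[l] is the list [f_l^{-l}, ..., f_l^{l}] (length 2l + 1);
    h_zonal[l] is the order-0 kernel coefficient h_l^0.
    """
    out = []
    for ell, f_ell in enumerate(f_coeffs):
        # Per-degree scalar: sqrt(16 pi^3 / (2l + 1)) * h_l^0
        scale = math.sqrt(16.0 * math.pi ** 3 / (2 * ell + 1)) * h_zonal[ell]
        out.append([scale * c for c in f_ell])
    return out

# Toy input with bandwidth b = 3 (degrees 0, 1, 2); the values are arbitrary.
f_coeffs = [[1.0 + 0.0j],
            [0.5j, 2.0, -0.5j],
            [0.0, 1.0, 0.0, -1.0, 0.0]]
h_zonal = [1.0, 0.5, 0.0]
g_coeffs = zonal_spectral_conv(f_coeffs, h_zonal)
```

Each degree is scaled by a single number, so a kernel of bandwidth `b` contributes only `b` learnable parameters, and the output has the same bandwidth as the input.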
In practice, the nonlinear layer is the standard pointwise operation:
$$\mathrm{NL}_\sigma : f \mapsto \sigma \circ f.$$
One can easily check that $\mathrm{NL}_\sigma$ is equivariant.
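The check is a one-line computation. Writing $(\Lambda_g f)(x) = f(g^{-1} \cdot x)$ for the action of $g \in SO(3)$ on functions (the symbol $\Lambda_g$ is notation introduced here for illustration only), we have

```latex
\begin{aligned}
\mathrm{NL}_\sigma(\Lambda_g f)(x)
  &= \sigma\big((\Lambda_g f)(x)\big) = \sigma\big(f(g^{-1}\cdot x)\big) \\
  &= (\sigma \circ f)(g^{-1}\cdot x) = \big(\Lambda_g\, \mathrm{NL}_\sigma(f)\big)(x).
\end{aligned}
```

The argument only uses that $\sigma$ acts on function values and never on the point $x$.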
Warning!
The operation $\mathrm{NL}_\sigma$ does not preserve the bandwidth of the data. In fact, $\mathrm{NL}_\sigma(f)$ can have infinite bandwidth regardless of the bandwidth of $f$. Therefore, computing the Fourier coefficients with (8) after a non-linearity will introduce errors (see the equivariance error analysis).
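A small worked example makes the bandwidth growth concrete. Take the zonal function $f(\theta, \varphi) = \cos\theta$, which is a multiple of $Y_1^0$ and so has bandwidth 2, and assume $\sigma$ is ReLU (a choice made here for illustration). For a zonal function only the order-0 coefficients are nonzero, and they are proportional to the Legendre coefficients of the profile in $t = \cos\theta$. The sketch below computes those coefficients exactly with rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def legendre_polys(max_degree):
    """Coefficient lists (lowest power first) of P_0, ..., P_max_degree."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, max_degree):
        # Bonnet recursion: (n + 1) P_{n+1} = (2n + 1) t P_n - n P_{n-1}
        shifted = [Fraction(0)] + polys[n]           # coefficients of t * P_n
        prev = polys[n - 1] + [Fraction(0)] * 2      # P_{n-1}, padded to equal length
        polys.append([((2 * n + 1) * a - n * b) / (n + 1)
                      for a, b in zip(shifted, prev)])
    return polys[:max_degree + 1]

def relu_of_cos_coeffs(max_degree):
    """Exact Legendre coefficients of ReLU(t) on [-1, 1].

    c_l = (2l + 1)/2 * integral_0^1 t * P_l(t) dt, because ReLU(t) = 0 for t < 0.
    """
    coeffs = []
    for ell, poly in enumerate(legendre_polys(max_degree)):
        # integral_0^1 t^(k + 1) dt = 1 / (k + 2)
        integral = sum(a / (k + 2) for k, a in enumerate(poly))
        coeffs.append(Fraction(2 * ell + 1, 2) * integral)
    return coeffs

coeffs = relu_of_cos_coeffs(6)
```

Before the non-linearity the only nonzero coefficient is at degree 1. Afterwards the coefficients at degrees 2, 4 and 6 are 5/16, -3/32 and 13/256, so the output is no longer band-limited to degree below 2. The code only inspects the first few degrees; it illustrates the growth rather than proving that the bandwidth is infinite.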
Here we introduce a pooling layer that acts as a low-pass filter with cutoff frequency $b/2$. In practice, we can simply set $f_\ell^m$ to zero for all $b/2 < \ell < b$.
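In the spectral domain this pooling is just a truncation. A minimal sketch, assuming the coefficients are stored as a list whose entry `l` holds the `2l + 1` coefficients of degree `l` (the storage layout and the name `low_pass_pool` are assumptions made for illustration):

```python
def low_pass_pool(f_coeffs):
    """Zero every coefficient f_l^m with b/2 < l < b, where b = len(f_coeffs)."""
    b = len(f_coeffs)
    return [list(c) if 2 * ell <= b else [0.0] * len(c)
            for ell, c in enumerate(f_coeffs)]

# Toy coefficients with bandwidth b = 4; the values are arbitrary.
f_coeffs = [[1.0], [2.0, 3.0, 4.0], [5.0] * 5, [6.0] * 7]
pooled = low_pass_pool(f_coeffs)
```

With `b = 4`, degrees 0 through 2 are kept and degree 3 is zeroed. Since the zeroed degrees carry no information, an implementation could also drop them and store only the lower degrees.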
In tasks such as image classification, SO(3) acts trivially on the output space, so we would like the output of the network to be SO(3)-invariant. One way to achieve this is to use the following operation to produce an output vector:
The fact that each $f_\ell$ is SO(3)-invariant follows from the fact that the action of SO(3) on $\mathcal{Y}_\ell := \mathrm{span}\{Y_\ell^m : |m| \le \ell\}$ is represented by the Wigner D-matrices, which are unitary.
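The role of unitarity can be checked numerically. The sketch below makes two assumptions that should be read as such: it takes the invariant of degree $\ell$ to be the Euclidean norm of the degree-$\ell$ coefficient vector, and it uses a random Householder reflection as a stand-in for a Wigner D-matrix. A Householder reflection is unitary but is not an actual D-matrix; the point is only that any unitary map on the $2\ell + 1$ coefficients leaves the norm unchanged.

```python
import math
import random

def householder_apply(v, c):
    """Apply the unitary reflection U = I - 2 v v^* / (v^* v) to the vector c."""
    vv = sum(abs(x) ** 2 for x in v)
    vc = sum(x.conjugate() * y for x, y in zip(v, c))  # inner product v^* c
    return [y - 2.0 * x * vc / vv for x, y in zip(v, c)]

def degree_norm(c):
    """Euclidean norm of one degree's coefficient vector."""
    return math.sqrt(sum(abs(y) ** 2 for y in c))

rng = random.Random(0)

def random_vector(n):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

b = 4
f_coeffs = [random_vector(2 * ell + 1) for ell in range(b)]
# One independent unitary per degree, mimicking the block structure of a rotation.
rotated = [householder_apply(random_vector(len(c)), c) for c in f_coeffs]
before = [degree_norm(c) for c in f_coeffs]
after = [degree_norm(c) for c in rotated]
```

The lists `before` and `after` agree up to floating-point error even though the coefficients themselves change.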
The non-linearity layers are the only ones that introduce equivariance errors. To see this, we define the distribution
$$s = \frac{\sqrt{2\pi}}{2b} \sum_{j=0}^{2b-1} \sum_{k=0}^{2b-1} w_j^{(b)}\, \delta_{x_{j,k}},$$
given the equiangular grid $\{x_{j,k}\}$.
We note that, given a function $f$ (or a nice enough distribution) in $L^2(S^2)$, when we use the sampling algorithm to approximate the Fourier coefficients, we are implicitly performing an orthogonal projection of the atomic distribution
$$f_s = \frac{\sqrt{2\pi}}{2b} \sum_{j=0}^{2b-1} \sum_{k=0}^{2b-1} w_j^{(b)}\, f(x_{j,k})\, \delta_{x_{j,k}}$$
onto the subspace $\mathcal{Y}_{[b]} := \mathrm{span}\{Y_\ell^m : |m| \le \ell < b\}$. We can see that the operation $f \mapsto f_s$ is not SO(3)-equivariant as