Convolution Layer#

Convolution#

In one dimension, the input data \(I\) is convolved with a mask function \(G\) (a vector of weights in this case, also called a kernel), which can be written as

\[ (I*G)[t] = \sum_{k=-\infty}^\infty I[k]G[t-k] \]
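For finite sequences the infinite sum collapses to the indices where both \(I[k]\) and \(G[t-k]\) are defined. The following is a minimal sketch of that sum in plain Python, assuming both sequences are zero outside their stored range; the function name `conv1d` and the sample values are illustrative only.

```python
def conv1d(signal, mask):
    """Full discrete convolution of two finite sequences.

    Entries outside either list are treated as zero (an assumption of
    this sketch), so only the overlapping indices contribute.
    """
    n_out = len(signal) + len(mask) - 1
    out = [0] * n_out
    for t in range(n_out):
        for k in range(len(signal)):
            j = t - k  # index into the mask, i.e. G[t - k]
            if 0 <= j < len(mask):
                out[t] += signal[k] * mask[j]
    return out


result = conv1d([1, 2, 3], [0, 1, 0.5])
print(result)  # [0, 1, 2.5, 4.0, 1.5]
```

The output has length `len(signal) + len(mask) - 1` because every shift \(t\) with at least one overlapping term is kept.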

In three dimensions, for instance when \(I\) is an RGB image, the convolution may be written as

\[ (I*G)[x,y] = \sum_{a=0}^{w-1}\sum_{b=0}^{h-1}\sum_{c\in\{\text{r,g,b}\}} I_c[s\cdot x+a,\, s\cdot y+b]\, G_c[a,b] \]
\[ (I*G) \in \mathbb R^{\lfloor 1+(W-w)/s \rfloor \times \lfloor 1+(H-h)/s \rfloor \times c} \]

where
  • \(W, H\) : Width and height of the input data.

  • \(w, h\) : Width and height of the mask matrix.

  • \(s\) : Stride of the convolution, \(s\ge1\).
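As a rough illustration of how the stride and the mask size determine the spatial output size, here is a small pure-Python sketch. It assumes the common convention in which the stride multiplies the output index and the mask offsets run over the full mask, with no padding. The channel-first nested-list layout and the names `output_size` and `conv2d_multichannel` are choices made for this example, not part of the text above. The sketch returns only the two spatial dimensions produced by a single mask.

```python
def output_size(n_in, n_mask, stride):
    """Valid mask positions along one axis: floor(1 + (n_in - n_mask) / stride)."""
    return (n_in - n_mask) // stride + 1


def conv2d_multichannel(image, mask, stride=1):
    """Strided convolution of a multi-channel image with one mask.

    image[c][x][y] and mask[c][a][b] are nested lists indexed by
    channel first (a layout assumed for this sketch).
    """
    channels = len(image)
    W, H = len(image[0]), len(image[0][0])
    w, h = len(mask[0]), len(mask[0][0])
    out_w = output_size(W, w, stride)
    out_h = output_size(H, h, stride)
    out = [[0] * out_h for _ in range(out_w)]
    for x in range(out_w):
        for y in range(out_h):
            total = 0
            for c in range(channels):
                for a in range(w):
                    for b in range(h):
                        total += image[c][stride * x + a][stride * y + b] * mask[c][a][b]
            out[x][y] = total
    return out


# A 5x5 image of ones with 3 channels, a 3x3 mask of ones, stride 2.
image = [[[1] * 5 for _ in range(5)] for _ in range(3)]
mask = [[[1] * 3 for _ in range(3)] for _ in range(3)]
feature_map = conv2d_multichannel(image, mask, stride=2)
print(feature_map)  # [[27, 27], [27, 27]]
```

Each output entry sums \(3 \cdot 3 \cdot 3 = 27\) products, and the output is \(2 \times 2\) because \(\lfloor 1 + (5-3)/2 \rfloor = 2\).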

Architecture#

Given some data \(I\) of size \((W, H, C)\), a convolution layer with a mask function \(G\) of size \((w,h,c)\):

  • Trains \(w \cdot h \cdot c\) parameters

  • Outputs an array of size \((\lfloor 1+(W-w)/s \rfloor, \lfloor 1+(H-h)/s \rfloor, c)\) for stride \(s\)
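The two quantities above can be checked with a few lines of arithmetic. This sketch simply evaluates the formulas as stated, assuming no padding and no bias term; the helper name `conv_layer_summary` and the sample sizes are hypothetical.

```python
def conv_layer_summary(input_shape, mask_shape, stride=1):
    """Return (parameter count, output shape) from the formulas above.

    Assumes no padding and no bias term; both are assumptions of this
    sketch rather than statements from the text.
    """
    W, H, _C = input_shape
    w, h, c = mask_shape
    n_params = w * h * c
    out_shape = ((W - w) // stride + 1, (H - h) // stride + 1, c)
    return n_params, out_shape


print(conv_layer_summary((32, 32, 3), (5, 5, 3), stride=1))  # (75, (28, 28, 3))
print(conv_layer_summary((32, 32, 3), (5, 5, 3), stride=2))  # (75, (14, 14, 3))
```

Doubling the stride roughly halves each spatial dimension while leaving the number of trained parameters unchanged.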