Alright, let’s talk about quantization. I’ve started optimizing my deep learning pipeline, and since I’m using YOLOv5 as the base model (because it’s standard, widely used for benchmarking, and popular in AIoT applications), the first challenge I ran into was the SiLU activation function. Linear operations are straightforward to quantize, but SiLU? Not so much.

Why SiLU is a Problem for Quantization

The SiLU activation function is defined as:

\[ \text{SiLU}(x) = x \cdot \sigma(x) \]

where \( \sigma(x) \) is the sigmoid function:

\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]

The issue? The exponential inside the sigmoid is expensive to evaluate and maps poorly onto the SIMD instruction sets found on MCUs. Look-up tables (LUTs) are an option, but table lookups require data-dependent memory accesses, which don’t vectorize well on embedded hardware.

Finding an Efficient Approximation

To solve this, I researched efficient approximations for the sigmoid function and found a great approach in this paper (see page 20).

The approximation is built from a quadratic helper function:

\[ g(x) = 0.5 \cdot (0.25x - 1)^2 \]

With this, the approximated sigmoid function is defined as:

\[ \sigma_{\text{approx}}(x) = \begin{cases} 0, & x < -4 \\ g(-x), & -4 \leq x \leq 0 \\ 1 - g(x), & 0 \leq x \leq 4 \\ 1, & x > 4 \end{cases} \]

(Figure: sigmoid approximation)
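To make the piecewise cases concrete, here is a minimal float reference implementation (the function names `g` and `sigmoid_approx` are my own, not from the paper):

```c
/* Quadratic helper: g(x) = 0.5 * (0.25*x - 1)^2 */
static float g(float x) {
    float t = 0.25f * x - 1.0f;
    return 0.5f * t * t;
}

/* Piecewise sigmoid approximation, clamped outside [-4, 4] */
static float sigmoid_approx(float x) {
    if (x < -4.0f) return 0.0f;
    if (x <= 0.0f) return g(-x);       /* mirrored quadratic on the negative side */
    if (x <= 4.0f) return 1.0f - g(x); /* 1 minus the quadratic on the positive side */
    return 1.0f;
}
```

Note that the two middle branches agree at \( x = 0 \): both evaluate to \( g(0) = 0.5 \), matching the exact sigmoid.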

Approximating SiLU

Since SiLU is simply \( x \cdot \sigma(x) \), the approximated SiLU function becomes:

\[ \text{SiLU}_{\text{approx}}(x) = \begin{cases} 0, & x < -4 \\ x \cdot g(-x), & -4 \leq x \leq 0 \\ x \cdot (1 - g(x)), & 0 \leq x \leq 4 \\ x, & x > 4 \end{cases} \]

(Figure: SiLU approximation)
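Multiplying the piecewise sigmoid by \( x \) gives the full activation. A float sketch (again with names of my own choosing):

```c
/* Quadratic helper: g(x) = 0.5 * (0.25*x - 1)^2 */
static float g(float x) {
    float t = 0.25f * x - 1.0f;
    return 0.5f * t * t;
}

/* Approximated SiLU: x * sigmoid_approx(x), with the clamps folded in.
   For x < -4 the output is 0; for x > 4 it is simply x. */
static float silu_approx(float x) {
    if (x < -4.0f) return 0.0f;
    if (x <= 0.0f) return x * g(-x);
    if (x <= 4.0f) return x * (1.0f - g(x));
    return x;
}
```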

Optimized Computation for MCUs

To avoid costly division operations, we rewrite \( g(x) \) using bit shifts:

\[ g(x) = \left( (x \gg 2) - 1 \right)^2 \gg 1 \]

Shifts replace the two power-of-two multiplications (by 0.25 and 0.5) with single-cycle operations that vectorize cleanly with the SIMD instructions available on MCUs. (In fixed-point arithmetic, the \( 1 \) above stands for the fixed-point representation of 1.0.)
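As a sketch of what this looks like in integer code, here is the helper in Q8 fixed point. The format choice (Q8) and names are my own assumptions; the key detail is that squaring doubles the fractional bits, so we shift right by an extra `Q` on top of the `>> 1` from the formula:

```c
#include <stdint.h>

#define Q   8            /* Q8 fixed point: 1.0 is represented as 1 << Q == 256 */
#define ONE (1 << Q)

/* g(x) = ((x >> 2) - 1)^2 >> 1, evaluated in Q8 fixed point.
   In the sigmoid approximation this is only called with non-negative
   arguments (we pass g(-x) on the negative side), so x >> 2 is
   well-defined and exact up to the dropped fractional bits. */
static int32_t g_fixed(int32_t x) {
    int32_t t = (x >> 2) - ONE;  /* 0.25*x - 1.0, still in Q8 */
    return (t * t) >> (Q + 1);   /* square gives Q16; >> Q back to Q8, >> 1 halves */
}
```

For example, an input of 0.0 (`0` in Q8) gives `g_fixed(0) == 128`, which is 0.5 in Q8, matching \( g(0) = 0.5 \).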

Why This Works So Well

Our approximation is limited to the range \([-4,4]\), but this isn’t an issue because:

  • For \( x < -4 \), SiLU naturally approaches 0, and our approximation does the same.
  • For \( x > 4 \), SiLU behaves like \( x \), which our approximation also captures.
  • Within \([-4,4]\), the quadratic approximation closely follows the real function.

So we maintain high accuracy while optimizing for embedded hardware!

What’s Next?

In the next post, I’ll dive into making this approximation work with quantized inputs instead of floating-point numbers. Stay tuned!

GitHub Implementation

You can find the full implementation here: GitHub Repository