Alright, let’s talk about quantization. I’ve started optimizing my deep learning pipeline, and since I’m using YOLOv5 as the base model (because it’s standard, widely used for benchmarking, and popular in AIoT applications), the first challenge I ran into was the SiLU activation function. Linear operations are straightforward to quantize, but SiLU? Not so much.
Why SiLU is a Problem for Quantization
The SiLU activation function is defined as:
\[ \text{SiLU}(x) = x \cdot \sigma(x) \]
where \( \sigma(x) \) is the sigmoid function:
\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]
The issue? The sigmoid function is non-linear and expensive to compute: evaluating it requires an exponential and a division, neither of which maps well to MCUs with SIMD instructions. Look-up tables (LUTs) are an option, but per-element table lookups don’t vectorize well with SIMD operations on embedded hardware.
Finding an Efficient Approximation
To solve this, I researched efficient approximations for the sigmoid function and found a great approach in this paper (see page 20).
The approximation is built from a quadratic helper function:
\[ g(x) = 0.5 \cdot (0.25x - 1)^2 \]
With this, the approximated sigmoid function is defined as:
\[ \sigma_{\text{approx}}(x) = \begin{cases} 0, & x < -4 \\ g(-x), & -4 \leq x \leq 0 \\ 1 - g(x), & 0 \leq x \leq 4 \\ 1, & x > 4 \end{cases} \]
Approximating SiLU
Since SiLU is simply \( x \cdot \sigma(x) \), the approximated SiLU function becomes:
\[ \text{SiLU}_{\text{approx}}(x) = \begin{cases} 0, & x < -4 \\ x \cdot g(-x), & -4 \leq x \leq 0 \\ x \cdot (1 - g(x)), & 0 \leq x \leq 4 \\ x, & x > 4 \end{cases} \]
Optimized Computation for MCUs
To avoid costly multiplications and divisions, we rewrite \( g(x) \) using bit shifts (this assumes \( x \) is held as a fixed-point integer, since shifts operate on integers):
\[ g(x) = \left( (x \gg 2) - 1 \right)^2 \gg 1 \]
Here \( \gg 2 \) replaces the multiplication by 0.25 and \( \gg 1 \) replaces the multiplication by 0.5. Bit shifts are single-cycle operations on most MCUs and vectorize cleanly with SIMD instructions.
Why This Works So Well
Our approximation is limited to the range \([-4,4]\), but this isn’t an issue because:
- For \( x < -4 \), SiLU naturally approaches 0, and our approximation does the same.
- For \( x > 4 \), SiLU behaves like \( x \), which our approximation also captures.
- Within \([-4,4]\), the quadratic approximation closely follows the real function.
So we maintain high accuracy while optimizing for embedded hardware!
What’s Next?
In the next post, I’ll dive into making this approximation work with quantized inputs instead of floating-point numbers. Stay tuned!
GitHub Implementation
You can find the full implementation here: GitHub Repository