AI Computing in Edge Devices: More Than Just TOPS

2025-07-15 18:01:06

As AI becomes a core part of modern smart devices—from AI cameras and service robots to industrial gateways—the way we think about computing performance needs to evolve. For many engineers and product teams, the focus still falls on one headline number: TOPS (Tera Operations Per Second). But in the real world, delivering AI at the edge is about more than just raw compute—it’s about achieving fast, reliable, and efficient intelligence within strict system constraints.



When TOPS Isn't Enough

While TOPS gives a theoretical measure of a chip's AI performance, it doesn't reflect what truly matters during deployment. A 10-TOPS processor may look impressive on paper, but if the model is too large for available memory, or the hardware doesn’t support essential layers or quantization formats, you’ll never see that full performance in the field.

 

In practice, developers often face bottlenecks caused by memory bandwidth, software compatibility, or thermal throttling. For AI-enabled devices like cameras or robots, what counts is how well the module runs your model under real conditions, with stable frame rates, low latency, and minimal power draw.
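What "runs well under real conditions" means can be measured directly: sustained per-inference latency after warm-up, not a single cold run. Below is a minimal, framework-agnostic sketch; the `invoke` callable is a placeholder for whatever runs one inference on your module (for example, a TFLite interpreter's `invoke` method).

```python
import time

def measure_latency(invoke, warmup=10, runs=100):
    """Time repeated calls to `invoke` and return (mean_ms, p95_ms).

    Warm-up iterations absorb one-off costs such as delegate
    initialization and cache warming, so the numbers reflect the
    sustained behavior that matters for cameras and robots.
    """
    for _ in range(warmup):
        invoke()
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - t0) * 1e3)  # milliseconds
    samples.sort()
    mean_ms = sum(samples) / len(samples)
    p95_ms = samples[int(0.95 * (len(samples) - 1))]
    return mean_ms, p95_ms

# Hypothetical usage on a device:
# mean_ms, p95_ms = measure_latency(interpreter.invoke)
```

Tracking the 95th percentile as well as the mean matters because a camera pipeline drops frames on its worst-case latency, not its average.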

 

Real-Time AI Needs Low Latency, Not Just High Throughput

Edge AI applications require both low latency and high computing power. From autonomous driving and real-time translation to smart manufacturing and medical imaging, these scenarios rely on fast and efficient processing for accurate and timely decision-making. Whether it's enabling responsive robotics or performing high-precision analysis, the demand for scalable AI computing at the edge is growing rapidly across industries.


To meet these diverse demands, SIMCom's AI module portfolio offers scalable compute performance ranging from 1 to 48 TOPS, enabling developers to tailor solutions for various real-world scenarios at the edge.


| SIMCom Module | cDSP/NPU | GPU                  |
|---------------|----------|----------------------|
| SIM9850       | 48 TOPS  | Adreno 740           |
| SIM9650L-W    | 14 TOPS  | Adreno 643 @ 812 MHz |
| SIM9630L-W    | 3-9 TOPS | Adreno A642L         |
| SIM8666       | 1 TOPS   | Mali-G52             |
| SIM8668       | 1 TOPS   | Mali-G52             |

Unlike cloud-based AI, where high throughput is prioritized for batch inference, edge systems must respond quickly to individual inputs. Reducing latency involves optimizing models, minimizing preprocessing, and using hardware accelerators like NPUs that are designed for low-latency inference.
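The throughput-versus-latency tension can be made concrete with a toy cost model. The numbers below are illustrative, not measurements of any module: assume each accelerator call pays a fixed overhead plus a per-item cost.

```python
# Toy cost model: one accelerator call costs a fixed overhead plus a
# per-item cost. All numbers are illustrative, not measured.
FIXED_OVERHEAD_MS = 4.0
PER_ITEM_MS = 1.0

def batch_time_ms(batch_size):
    return FIXED_OVERHEAD_MS + PER_ITEM_MS * batch_size

def throughput_fps(batch_size):
    return batch_size / (batch_time_ms(batch_size) / 1000.0)

def single_frame_latency_ms(batch_size):
    # An individual camera frame cannot finish before its whole batch does.
    return batch_time_ms(batch_size)

for b in (1, 8, 32):
    print(f"batch {b:2d}: {throughput_fps(b):6.0f} FPS, "
          f"{single_frame_latency_ms(b):4.1f} ms per frame")
```

Batching raises throughput, which is why it suits cloud batch inference, but it inflates the latency of every individual frame. That is why edge pipelines typically run batch size 1 on the NPU.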

 

Precision and Quantization: Trade-Offs That Make a Difference

Cloud-trained AI models are typically built in high precision (e.g. FP32), which ensures strong accuracy but demands more power and memory. For edge devices, quantization, i.e. converting models to lower-precision formats such as INT8 or INT16, is a widely used technique to reduce model size and compute cost.

 

| SIMCom Module | cDSP/NPU | Supported Quantization |
|---------------|----------|------------------------|
| SIM9850       | 48 TOPS  | INT8 / INT16           |
| SIM9650L-W    | 14 TOPS  | INT8 / INT16           |
| SIM9630L-W    | 3-9 TOPS | INT8 / INT16           |
| SIM8666       | 1 TOPS   | INT8 / INT16           |
| SIM8668      | 1 TOPS   | INT8 / INT16           |

However, quantization is not without risk. Poorly quantized models may lose accuracy, especially in visually complex scenes or under varied lighting conditions. Developers should use quantization-aware training or post-training calibration tools to ensure that the drop in precision doesn't significantly impact performance. Choosing a SIMCom AI computing module that supports mixed-precision computing also gives flexibility in balancing speed and accuracy.
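The accuracy cost of INT8 can be reasoned about directly: per-tensor affine quantization rounds every value to one of 256 levels, so the reconstruction error per value is bounded by roughly half a quantization step. A self-contained sketch of that arithmetic (real toolchains add per-channel scales and activation calibration on top of this):

```python
import numpy as np

def quantize_int8(x):
    """Per-tensor affine quantization: x ~= (q - zero_point) * scale."""
    scale = (x.max() - x.min()) / 255.0
    zero_point = -128 - int(round(x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal(1000).astype(np.float32)  # stand-in weights
q, scale, zp = quantize_int8(weights)
max_err = float(np.abs(dequantize(q, scale, zp) - weights).max())
# max_err stays on the order of scale/2 -- small for well-behaved weights,
# but it compounds across layers, which is exactly what calibration and
# quantization-aware training keep in check.
```

A single outlier weight widens `scale` and therefore the error on every other value, which is one reason per-channel quantization and calibration on representative data matter in practice.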

 

Software Support Can Make or Break a Project

Hardware is only half the equation. Without a robust software stack, even the best AI chip can become a roadblock. Developers often face challenges when trying to convert models, optimize them for inference, or integrate them into a broader system.

 

This is why choosing AI modules with a mature SDK, toolchain, and framework support is critical. Whether you're using TensorFlow Lite, ONNX, or PyTorch Mobile, the platform must support smooth model conversion, quantization, and runtime inference. SIMCom's AI computing modules come with debugging tools, profiling utilities, and example code, all of which accelerate development and reduce deployment risk.

 

| SIMCom Module | cDSP/NPU | Framework Support                    |
|---------------|----------|--------------------------------------|
| SIM9850       | 48 TOPS  | TensorFlow / TFLite / PyTorch / ONNX |
| SIM9650L-W    | 14 TOPS  | TensorFlow / TFLite / PyTorch / ONNX |
| SIM9630L-W    | 3-9 TOPS | TensorFlow / TFLite / PyTorch / ONNX |
| SIM8666       | 1 TOPS   | TensorFlow / MXNet / PyTorch / Caffe |
| SIM8668       | 1 TOPS   | TensorFlow / MXNet / PyTorch / Caffe |

For developers of AI cameras, robots, and edge IoT systems, this means selecting modules that offer the right combination of compute, power efficiency, latency, and ecosystem support. With SIMCom's AI computing modules, you can build AI-enabled products that are not only intelligent but also practical, reliable, and ready for the real world.
