
Meet the Jetsons

Craig Nielsen
March 3, 2026
A closer look at what the Nvidia Jetson line has to offer: architecture, costs, and uses, along with references for further reading.
Nvidia Jetson Edge AI
AI at the edge requires hardware that is highly capable at number crunching while operating under constraints such as reduced power consumption, limited cooling, size restrictions, and likely network limitations.
The Raspberry Pi has been very popular in robotics, but Nvidia has entered the game with offerings customised for AI models. These options come at a premium, but their processing power and power efficiency make them the strongest candidates for production edge AI deployments. These are professional-grade SoC boards that run ARM CPUs and can be used in a variety of serious professional settings: logistics, health and safety, production lines, robotics applications, and healthcare. In fact, wherever privacy is a big driver for adoption, edge AI devices will be found. For example, medical data is heavily regulated, and keeping inference on-device means raw patient data never leaves the room.
The Jetson Options
Nvidia offers three options, in order of power: AGX, NX, and Nano. Processing power is measured in TOPS (trillions of operations per second), calculated as operations per cycle × clock frequency (cycles per second) × number of cores. For these variants the figures are roughly: AGX, 200 TOPS (32GB) and 275 TOPS (64GB); NX, 100 TOPS; and Orin Nano Super, 67 TOPS.
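As a back-of-the-envelope illustration, the TOPS formula above can be sketched in a few lines of Python (the accelerator figures in the example are made up for illustration, not actual Jetson specs):

```python
def peak_tops(ops_per_cycle: float, clock_hz: float, num_cores: int) -> float:
    """Peak throughput in TOPS: operations per cycle x clock frequency x cores,
    scaled down by 10^12 (one trillion)."""
    return ops_per_cycle * clock_hz * num_cores / 1e12

# Hypothetical accelerator: 512 ops/cycle, 1.3 GHz clock, 16 cores -> ~10.65 TOPS
print(peak_tops(512, 1.3e9, 16))
```

Note these are theoretical peak numbers; real workloads rarely keep every unit busy every cycle, and the quoted precision (INT8 vs FP16) changes the ops-per-cycle figure substantially.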
Another great benefit of the AGX is its dedicated hardware for the VPI (Vision Programming Interface), such as the OFA (Optical Flow Accelerator). In a dual-camera setup, for example, depth maps can be generated outside the GPU, freeing up compute for your AI workloads while you still integrate depth into your application. Depth can help with occlusions during tracking, avoidance problems, and more.
The other two, the NX and Nano, can still run the VPI, but they rely on CUDA and the GPU, so they take some cycles away from the AI model.
Here is a table breaking down the differences:
The Jetson Architecture
The Jetsons combine an ARM CPU with a CUDA-capable GPU. ARM cores are most commonly used in mobile and edge devices because they offer efficient compute and handle repeated math operations well. They don't match the peak single-thread performance of x86 systems, but low power and heat are the name of the game in edge AI. The main reason ARM dominates here is the SoC (System on a Chip) model. Historically, ARM licenses its CPU designs to other manufacturers, whereas Intel and AMD design and build their own flavours of x86 chips. Manufacturers can therefore integrate ARM cores into SoCs where the silicon is shared with other components such as custom NPUs, reducing power consumption, latency, and physical footprint while enabling better performance from specialised accelerators.
x86 vendors have now started down the SoC road too, with products such as Intel Core Ultra (Meteor Lake) and AMD Ryzen AI chips, which integrate CPU, GPU, and NPU on one die for consumer and edge devices.
Jetson Deployment Options
The AGX has SoC components that offload vision tasks to dedicated units, whereas the NX and Nano need to share the GPU when running tools from the Nvidia VPI (Vision Programming Interface). This doesn't always have to be the case: in certain applications you could use a camera that computes depth maps on-board and interface with it over USB, though this comes with limitations. When using depth maps from a device such as the OAK-D, you would not be able to make use of AI vision depth models like Foundation Stereo or CREStereo, which are very popular in robotics. There are also several trade-offs to weigh between USB and CSI:
- Latency and overhead: CSI is a direct point-to-point hardware connection - data goes straight from the image sensor into the Jetson's ISP (Image Signal Processor) with minimal overhead. USB is a shared protocol with handshaking, packetisation, error checking, and a host controller managing the bus, and each of those steps adds latency. For a single camera the difference might be 1–5 ms, but in a tight real-time control loop that matters.
- CPU involvement: CSI data arrives via DMA (Direct Memory Access, not to be confused with RDMA) - it goes straight into memory without the CPU touching it, and the Jetson's ISP picks it up directly. USB transfers require the CPU to manage the transfer, consuming CPU cycles and adding jitter.
- Firmware dependency: your depth pipeline is tied to the camera vendor's firmware updates. If there's a bug in their depth algorithm, or they change behaviour in an update, you have limited ability to fix or control it.
- Pre-calibrated fixed baseline: the stereo baseline (distance between the two lenses) is fixed by the physical hardware design, whereas with two separate CSI cameras you can choose your own baseline. As anyone with a geomatics background knows, a wider baseline gives better intersection accuracy at longer ranges, while narrower baselines work better up close.
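The baseline trade-off above can be made concrete with the standard stereo geometry relations, Z = f·B/d and its error approximation ΔZ ≈ Z²·Δd/(f·B). A minimal sketch (the focal length and baseline values below are hypothetical, not taken from any particular camera):

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Z = f * B / d: depth of a point from its stereo disparity
    (rectified pinhole pair, focal length in pixels, baseline in metres)."""
    return focal_px * baseline_m / disparity_px

def depth_error(focal_px: float, baseline_m: float, depth_m: float,
                disparity_error_px: float = 1.0) -> float:
    """Approximate depth uncertainty: dZ ~= Z^2 * dd / (f * B).
    Error grows with the square of range and shrinks with a wider baseline."""
    return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)

# Same camera, same 5 m target: doubling the baseline halves the depth error.
narrow = depth_error(focal_px=800, baseline_m=0.05, depth_m=5.0)  # 0.625 m
wide = depth_error(focal_px=800, baseline_m=0.10, depth_m=5.0)    # 0.3125 m
```

This is why a fixed-baseline USB module can be the wrong fit: if your working range changes, two CSI cameras let you re-space the rig instead of buying different hardware.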
Conclusion
As you can see, there are a few points to consider when selecting a hardware device that fits your purpose, and it becomes especially important when making this decision at scale. The price differences between the AGX, NX, and Nano can be significant, and knowing the application requirements will help you decide what compute is needed. It is similar to monitoring distributed systems in the cloud: when tuning CPU and memory limits for microservice applications in Kubernetes, your ultimate goal is to keep utilisation high, which in turn keeps costs low in the long run. Overspecifying hardware leads to low utilisation - a cost that compounds quickly when deploying at scale (think hundreds of units across a warehouse).
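The utilisation point can be made concrete with a toy cost model (all prices and figures below are illustrative placeholders, not real Jetson pricing):

```python
def cost_per_used_tops(unit_price: float, rated_tops: float, utilisation: float) -> float:
    """Effective hardware cost per TOPS actually used.
    Low utilisation inflates the real price of every operation you run."""
    return unit_price / (rated_tops * utilisation)

# A cheaper board running near capacity can beat a bigger one sitting mostly idle.
big_idle = cost_per_used_tops(unit_price=2000, rated_tops=200, utilisation=0.2)   # 50.0
small_busy = cost_per_used_tops(unit_price=500, rated_tops=67, utilisation=0.8)   # ~9.33
```

Multiply the difference by a few hundred units and the sizing decision quickly dominates the budget.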
I'll keep this post updated, so feel free to bookmark it and check back periodically.