Robust FPGA Centric CNN Visual Localisation for GPS Denied UAVs
End to end development of a compact CNN for absolute visual localisation, quantised and deployed on an AMD Kria KV260 FPGA DPU for real time, low power inference in GPS denied UAV scenarios.
Overview
A compact CNN based localisation pipeline designed for UAVs operating in GPS denied environments, deployed on an FPGA to achieve real time inference under tight power constraints.
Abstract
Unmanned aerial vehicles (UAVs) often operate in environments where GPS signals are unavailable or unreliable, creating significant localisation challenges. Traditional feature based methods, such as ORB feature matching combined with RANSAC outlier rejection, can run in real time but are highly sensitive to changes in lighting, viewpoint and scene structure.
This project investigated an alternative approach based on lightweight convolutional neural networks (CNNs) deployed on field programmable gate array (FPGA) hardware. Using the AMD Kria KV260 Vision AI platform, a custom CNN model was trained, quantised and compiled through the Vitis AI tool flow for execution on a deep learning processing unit (DPU). The model directly regresses image coordinates from monocular input, enabling on board visual localisation without reliance on external compute resources.
Preliminary results demonstrate real time inference at approximately 32 frames per second with total board power consumption below 11 watts, highlighting the efficiency of FPGA acceleration. Accuracy remains below the full precision baseline, with the gap driven mainly by unoptimised datasets and to a lesser extent by quantisation effects; ongoing work is directed towards quantisation aware training and improved data handling.
In preliminary real world testing, localisation accuracy dropped noticeably, most likely because of differences between the training datasets and the imaging characteristics of the deployed camera system. Ideally, datasets would be calibrated and augmented to match the imaging hardware, including its field of view, expected flight altitude, and colour or lighting variation, so that training and deployment conditions remain consistent.
The outcomes of this research contribute towards practical, energy efficient localisation pipelines for small UAVs operating in GPS denied scenarios.
Problem
UAVs operating in GPS denied environments require reliable localisation under tight compute and power constraints.
Key constraints
- On board compute only (no cloud dependency)
- Real time throughput with stable latency
- Limited power budget suitable for small UAV platforms
- Robustness under viewpoint, lighting, and scene variation
Approach
The work was structured around three progressive studies, each addressing a stage of the pipeline from data design through to deployment and real time integration.
Study 1: Dataset diversity and augmentation
Localisation performance was strongly governed by the diversity and realism of the training data.
- Models trained on a single static image displayed overfitting and poor generalisation
- Augmented datasets incorporating AI generated imagery and altitude variation improved accuracy and robustness (a sampling and augmentation sketch follows this list)
- Data centric strategies were as influential as architecture choices in this visual regression task
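The sketch below illustrates one way such samples can be generated: random crops of a reference orthophoto, with the crop centre as the regression target, scale jitter standing in for altitude variation and colour jitter for lighting changes. Function names, transform parameters and crop ranges are illustrative assumptions rather than the exact configuration used in the study.

```python
# Minimal sketch of crop-based sample generation from a large reference orthophoto.
# The crop centre (normalised to the reference image) is the regression target;
# scale jitter approximates altitude variation, colour jitter approximates lighting.
# All names and ranges are illustrative assumptions.
import random
from PIL import Image
import torchvision.transforms as T

photometric = T.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.2, hue=0.05)

def sample_patch(reference: Image.Image, out_size=224):
    W, H = reference.size                      # assumes a large orthophoto
    # Vary crop size to emulate different flight altitudes over the same scene
    crop = int(out_size * random.uniform(1.0, 2.0))
    x = random.randint(0, W - crop)
    y = random.randint(0, H - crop)
    patch = reference.crop((x, y, x + crop, y + crop)).resize((out_size, out_size))
    patch = photometric(patch)
    # Normalised centre coordinates serve as the (x, y) regression target
    target = ((x + crop / 2) / W, (y + crop / 2) / H)
    return patch, target
```

Deriving the label directly from the crop geometry keeps geometric augmentation consistent with the regression targets, which a naive image-only augmentation pipeline would not.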
Study 2: Quantisation and hardware deployment
This study evaluated 8 bit integer quantisation and FPGA execution via the AMD Vitis AI tool flow.
- Trained a compact CoordinateCNN style architecture for coordinate regression
- Applied post training quantisation and compiled for DPU execution on the Kria KV260 (a representative flow is sketched after this list)
- Observed only minor degradation in mean and median error (approximately 1 to 2 percent) while enabling efficient deployment
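A minimal sketch of this step is shown below, assuming the Vitis AI PyTorch quantiser (pytorch_nndct) and the vai_c_xir compiler. The model stub, file names and arch.json path are placeholders, and exact arguments vary between Vitis AI releases.

```python
# Hedged sketch of post training quantisation with the Vitis AI PyTorch quantiser.
# The CoordinateCNN stub below is a stand-in for the trained network, not the
# actual architecture used in the study.
import torch
import torch.nn as nn
from pytorch_nndct.apis import torch_quantizer

class CoordinateCNN(nn.Module):
    """Small stand-in for the compact coordinate regression network."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)   # (x, y) in normalised image coordinates

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = CoordinateCNN().eval()         # in practice, load the trained weights here
dummy = torch.randn(1, 3, 224, 224)

# 1. Calibration pass over a small, representative set of images
quantizer = torch_quantizer("calib", model, (dummy,), output_dir="quant_out")
quant_model = quantizer.quant_model
quant_model(dummy)                     # replace with a loop over calibration images
quantizer.export_quant_config()

# 2. A second pass in "test" mode exports the deployable xmodel, which is then
#    compiled for the KV260 DPU, for example:
#    vai_c_xir -x quant_out/CoordinateCNN_int.xmodel -a arch.json \
#              -o compiled -n coordinate_cnn
```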
Study 3: Real time inference and emulation
This study validated end to end behaviour in a live streaming pipeline.
- Integrated the quantised model into a real time video streaming pipeline (see the runtime sketch after this list)
- Confirmed deterministic and repeatable behaviour across runs
- Verified stable latency and coherent spatial error patterns aligned with scene structure
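The emulation loop can be approximated as below, assuming a compiled xmodel served through the VART runner with OpenCV supplying frames. File names, input resolution and preprocessing are illustrative and would follow the compiled model's actual tensor shapes and fixed point scaling.

```python
# Hedged sketch of the real time emulation loop on the KV260: OpenCV frames fed
# to the compiled xmodel through the VART runner. Paths and preprocessing are
# illustrative assumptions.
import cv2
import numpy as np
import vart
import xir

graph = xir.Graph.deserialize("compiled/coordinate_cnn.xmodel")
dpu_subgraph = [s for s in graph.get_root_subgraph().toposort_child_subgraph()
                if s.has_attr("device") and s.get_attr("device").upper() == "DPU"][0]
runner = vart.Runner.create_runner(dpu_subgraph, "run")

in_t, out_t = runner.get_input_tensors()[0], runner.get_output_tensors()[0]
in_scale = 2 ** in_t.get_attr("fix_point")        # float -> int8 input scaling
out_scale = 2 ** -out_t.get_attr("fix_point")     # int8 -> float output scaling

cap = cv2.VideoCapture("flight_emulation.mp4")    # emulated camera stream
while True:
    ok, frame = cap.read()
    if not ok:
        break
    img = cv2.resize(frame, (224, 224)).astype(np.float32) / 255.0
    inp = np.expand_dims((img * in_scale).astype(np.int8), 0)
    out = np.empty(tuple(out_t.dims), dtype=np.int8)
    job = runner.execute_async([inp], [out])
    runner.wait(job)
    x, y = out.reshape(-1)[:2] * out_scale        # predicted normalised coordinates
```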
Results
Throughput and power
- Inference: approximately 32 to 33 fps on Kria KV260
- Total board power: below 11 W
- Latency: tightly clustered around 22 to 24 ms in real time emulation (a simple profiling sketch follows)
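A minimal sketch of how per frame latency figures of this kind can be collected; run_inference is a hypothetical stand in for a single DPU call such as the execute_async and wait pair shown above.

```python
# Minimal latency profiling sketch; run_inference is a stand-in for one DPU call.
import time
import numpy as np

def profile(run_inference, frames, warmup=20):
    latencies = []
    for i, frame in enumerate(frames):
        start = time.perf_counter()
        run_inference(frame)
        if i >= warmup:                            # discard warm-up iterations
            latencies.append((time.perf_counter() - start) * 1e3)
    lat = np.array(latencies)
    print(f"mean {lat.mean():.1f} ms, p50 {np.percentile(lat, 50):.1f} ms, "
          f"p99 {np.percentile(lat, 99):.1f} ms, fps {1e3 / lat.mean():.1f}")
```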
Accuracy and generalisation
- Post training quantisation introduced only minor accuracy degradation compared to the full precision baseline
- Real world testing showed a noticeable drop in localisation accuracy, likely due to dataset mismatch with the deployed camera system
- This highlights the importance of camera aligned calibration and augmentation (field of view, altitude distribution, colour and lighting variation); a field of view matching sketch follows this list
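One concrete form of camera aligned augmentation is to size reference image crops so their ground footprint matches what the deployed camera sees at the expected altitude. The sketch below shows the geometry; all parameter names and values are illustrative assumptions, not measured system values.

```python
# Hedged sketch: choose the reference-image crop size so training patches match
# the ground footprint of the deployed camera. Values are illustrative.
import math

def fov_matched_crop_px(altitude_m, fov_deg, gsd_m_per_px):
    """Side length (pixels) of a reference orthophoto crop whose ground
    footprint matches the camera's view from the given altitude."""
    footprint_m = 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)
    return int(round(footprint_m / gsd_m_per_px))

# Example: 60 m altitude, 70 degree horizontal FoV, 0.1 m/px orthophoto
print(fov_matched_crop_px(60.0, 70.0, 0.1))   # ~840 px before resizing to 224
```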
Discussion
Overall interpretation of findings
Across all studies, the results show that dataset diversity is the dominant driver of localisation performance. Training on single or overly narrow datasets led to poor generalisation, while augmented datasets improved robustness. Quantised deployment retained high accuracy at 8 bit precision while delivering more than 30 fps under 11 W on the KV260, demonstrating FPGA inference as a practical pathway for on board visual localisation.
Real time emulation confirmed predictable behaviour, latency stable at around 22 to 24 ms, and repeatable error distributions, supporting use in navigation pipelines where consistency is essential.
Critical evaluation
Key limitations and considerations:
- Training data was limited to a single geographic region (UQ and surrounding suburbs)
- Absolute regression assumes a fixed reference map and limits adaptability in evolving environments
- Single frame localisation does not leverage temporal coherence that could smooth estimates
- DPU profiling was coarse, limiting precise co design optimisation
- AI generated imagery can introduce bias and should be supported by automated diversity and verification metrics
Practical implications
- Demonstrates high performance visual localisation without GPUs or cloud connectivity
- Shows compact FPGA systems can deliver real time performance within strict energy budgets
- Provides a reproducible pipeline from data generation through to hardware deployment
- Deterministic latency and repeatable behaviour suit sensor fusion frameworks (for example visual inertial fusion)
Future work
- Quantisation aware training to close the remaining accuracy gap
- Mixed precision inference to improve accuracy in sensitive layers while retaining efficiency
- Expanded datasets across multiple regions, seasons, and environments with automated quality metrics
- Sensor fusion with inertial, barometric, and magnetometer data for continuous navigation
- Temporal modelling through lightweight filtering or recurrent components to exploit frame continuity (a minimal filtering sketch follows this list)
- Hardware co design including pruning, pipeline tuning, and DPU scheduling optimisation
- Field validation on a UAV platform with KV260 or equivalent Zynq UltraScale+ hardware
- Real time mapping or incremental map adaptation beyond static reference assumptions
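As an illustration of the lightweight temporal filtering mentioned above, a constant velocity alpha beta filter over the per frame (x, y) predictions could look like the sketch below; the gains and frame interval are illustrative assumptions rather than tuned values.

```python
# Illustrative sketch of lightweight temporal smoothing for per-frame (x, y)
# predictions: a constant velocity alpha-beta filter. Gains are hypothetical.
class AlphaBetaFilter:
    def __init__(self, alpha=0.5, beta=0.1):
        self.alpha, self.beta = alpha, beta
        self.pos = None                 # smoothed position estimate
        self.vel = (0.0, 0.0)           # estimated velocity in image coordinates

    def update(self, measurement, dt=1.0 / 30.0):
        if self.pos is None:
            self.pos = measurement
            return self.pos
        # Predict with constant velocity, then correct towards the measurement
        pred = tuple(p + v * dt for p, v in zip(self.pos, self.vel))
        resid = tuple(m - p for m, p in zip(measurement, pred))
        self.pos = tuple(p + self.alpha * r for p, r in zip(pred, resid))
        self.vel = tuple(v + (self.beta / dt) * r for v, r in zip(self.vel, resid))
        return self.pos
```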