Subject
With the increasing adoption of foundation models in vision tasks, the high-dimensional nature of their representations presents challenges in distributed and embedded systems. Many real-world applications, such as autonomous vehicles and smart surveillance, rely on transmitting image and point-cloud data across distributed networks, often constrained by limited bandwidth and computational resources. Reducing communication overhead is critical in these settings to ensure efficient and timely data exchange.
Learned compression and coding techniques offer a promising way to address this challenge. By optimizing data representations with deep learning, they can reduce the amount of data that must be transmitted while largely preserving the performance of downstream tasks such as object detection. This thesis aims to explore and implement learned compression methods tailored to distributed computer vision applications that use images and radar point cloud data.
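To make the trade-off between transmission cost and fidelity concrete, the following minimal sketch measures rate and distortion for a feature vector. It is an illustration under stated assumptions, not the method to be developed in the thesis. A fixed uniform quantizer stands in for a learned encoder, empirical entropy stands in for the rate of an ideal entropy coder, and the Gaussian "features" are synthetic placeholders. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer bin index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct approximate values from the bin indices."""
    return [i * step for i in indices]


def empirical_entropy_bits(symbols):
    """Empirical entropy in bits per symbol (proxy for an ideal entropy coder's rate)."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rate_distortion(values, step):
    """Return (rate in bits per value, distortion as MSE) for a given step size."""
    indices = quantize(values, step)
    reconstruction = dequantize(indices, step)
    return empirical_entropy_bits(indices), mse(values, reconstruction)


if __name__ == "__main__":
    # Synthetic stand-in for a feature vector (assumption: unit Gaussian values).
    rng = random.Random(0)
    features = [rng.gauss(0.0, 1.0) for _ in range(4096)]
    for step in (0.1, 0.5, 2.0):
        rate, dist = rate_distortion(features, step)
        print(f"step={step:<4} rate={rate:.2f} bits/value  mse={dist:.4f}")
```

A coarser step lowers the rate and raises the distortion. A learned codec would typically replace the fixed quantizer with a trained transform and could measure distortion by downstream task performance (for example, detection accuracy) rather than MSE.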
Kind of work
The student will gain hands-on experience with foundation models for vision tasks and will learn and apply fundamental concepts from information theory related to compression. The student will also implement and evaluate learned compression methods for distributed vision applications.
Framework of the Thesis
The thesis will be structured into several phases. It will begin with a brief literature review of existing work on foundation models for vision tasks, distributed systems, and learned compression techniques. The student will then set up a computer vision pipeline (e.g., object detection) using images and/or point cloud data. The next step is to develop learned compression techniques that improve communication efficiency. Once these components are in place, the student will combine the computer vision pipeline with the developed compression method and evaluate the combined system in a distributed setting. Finally, the findings, experimental results, and conclusions will be documented in the thesis.
Number of Students
1
Expected Student Profile
The ideal candidate has a strong background in Python programming. Familiarity with a deep learning framework such as PyTorch or TensorFlow and with computer vision applications is required. An interest in information theory, particularly compression and coding, is also essential.
This thesis offers an opportunity to work at the intersection of deep learning, computer vision, and distributed computing, making it an excellent fit for students interested in applied machine learning and communication-efficient AI systems.