

Doctoral Defense - James Mariani


About the Event

The Department of Computer Science & Engineering

Michigan State University

Ph.D. Dissertation Defense

 

July 7th, 2025 at 8:00 AM EDT

EB 3105 & https://msu.zoom.us/j/91436339283  

Passcode: available upon request from Vincent Mattison or the advisor

 

Computation Caching for Efficient Mobile Convolutional Neural Network Inference

By: James Mariani

Advisor: Dr. Li Xiao

 

Computer vision on smartphones is commonly achieved with convolutional neural networks (CNNs). CNNs offer accurate image classification, but they suffer from high latency when run under resource constraints, such as on low-powered mobile devices. Older CNN models also tend to struggle with on-device image classification more than newer ones. This is a concern: many older CNN models have already been trained, and without performance improvements they would be unviable to run in a mobile environment. This dissertation addresses these concerns with real-time image classification on smartphones through an intuitive, systems-based approach that requires no re-training of any model. Because our approach provides faster CNN inference without any model modifications, our advancements can be adopted widely, even by practitioners with little CNN experience.

In this dissertation, I introduce a variety of improvements to mobile CNN inference latency. These include: (1) a system that takes advantage of the predictability of CNNs and the inherent mobility of smartphones, allowing users in the same physical area to share patterns of CNN execution with one another and thereby reuse computation; (2) a caching system for CNN early-exit strategies that uses known patterns in CNN execution to predict the class of an image confidently without computing the entire CNN; (3) a class-aware caching scheme that improves on traditional filter-pruning techniques for CNNs; and (4) novel caching strategies that make online cache replacement feasible for mobile CNNs while enabling CNN computation reuse and early exit for significant latency reduction. This work improves the inference latency of CNNs on mobile devices while maintaining the strong accuracy that defines image classification with CNNs.
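To make the early-exit idea in contribution (2) concrete, the following is a minimal, hypothetical PyTorch sketch of a CNN with one intermediate exit head: if the early classifier is confident enough, the remaining layers are skipped. It illustrates only the general early-exit mechanism, not the dissertation's actual caching system; the architecture, class names, and confidence threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitCNN(nn.Module):
    """Toy CNN with one intermediate exit head (illustrative only)."""

    def __init__(self, num_classes=10, confidence_threshold=0.9):
        super().__init__()
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),            # 32x32 -> 16x16
        )
        # Early-exit head attached after the first block.
        self.exit1 = nn.Linear(16 * 16 * 16, num_classes)
        self.block2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),            # 16x16 -> 8x8
        )
        # Final classifier after the full network.
        self.exit_final = nn.Linear(32 * 8 * 8, num_classes)
        self.confidence_threshold = confidence_threshold

    def forward(self, x):
        # x is assumed to be a single 3x32x32 image (batch size 1).
        h = self.block1(x)
        early_logits = self.exit1(h.flatten(1))
        early_probs = F.softmax(early_logits, dim=1)
        # If the early head is confident, exit without running block2.
        if early_probs.max().item() >= self.confidence_threshold:
            return early_logits, "early"
        h = self.block2(h)
        return self.exit_final(h.flatten(1)), "full"

model = EarlyExitCNN().eval()
with torch.no_grad():
    logits, exit_point = model(torch.randn(1, 3, 32, 32))
print(exit_point, logits.argmax(dim=1).item())
```

In a caching-oriented design like the one the dissertation describes, the decision to take an early exit could additionally consult cached patterns of past CNN executions rather than a fixed softmax threshold.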

 

Tags

Doctoral Defenses

Date

Monday, July 07, 2025

Time

8:00 AM

Location

3105 Engineering Building and Zoom

Organizer

James Mariani