Date
Tuesday, November 26, 2024
The Department of Computer Science and Engineering
Michigan State University
Ph.D. Dissertation Defense
November 26, 2024 at 8:30AM EST
3405A/B Engineering Building
ABSTRACT
Proactive Schemes: Adversarial Attacks for Social Good
Vishal Asnani
Advisor: Dr. Xiaoming Liu
Adversarial attacks in computer vision typically exploit vulnerabilities in deep learning models, generating deceptive inputs that can lead AI systems to incorrect decisions. However, proactive schemes, approaches designed to embed purposeful signals into visual data, can serve as “adversarial attacks for social good,” harnessing similar principles to enhance the robustness, security, and interpretability of AI systems. This research explores the application of proactive schemes in computer vision, diverging from conventional passive methods by embedding auxiliary signals known as “templates” into input data, fundamentally improving model performance, attribution capabilities, and detection accuracy across diverse tasks. This includes novel techniques for image manipulation detection and localization, which introduce learned templates to accurately identify and pinpoint alterations made by multiple, previously unseen Generative Models (GMs). The Manipulation Localization Proactive scheme (MaLP), for example, not only detects manipulations but also localizes the specific pixel changes they cause, showing resilient performance across a broad range of GMs. Extending this approach, the Proactive Object Detection (PrObeD) scheme utilizes encoder-decoder architectures to embed task-specific templates within images, enhancing the efficacy of object detectors, even under challenging conditions like camouflaged environments.
This research further expands proactive schemes into generative models and video analysis, enabling attribution and action detection solutions. ProMark, for instance, introduces a novel attribution framework by embedding imperceptible watermarks within training data, allowing generated images to be traced back to specific training concepts (such as objects, motifs, or styles) while preserving image quality. Building on ProMark, CustomMark offers selective and efficient concept attribution, allowing artists to opt into watermarking specific styles and easily add new styles over time, without the need to retrain the entire model. Inspired by the proactive structure of PrObeD for 2D object detection, PiVoT introduces a video-based proactive wrapper that enhances action recognition and spatio-temporal action detection. By integrating action-specific templates through a template-enhanced Low-Rank Adaptation (LoRA) framework, PiVoT seamlessly augments various action detectors, preserving computational efficiency while significantly boosting detection performance. Lastly, the thesis presents a model parsing framework that estimates “fingerprints” for generative models, extracting unique characteristics from generated images to predict the architecture and loss functions of the underlying networks, a particularly valuable tool for deepfake detection and model attribution. Collectively, these proactive schemes offer significant advancements over passive methods, establishing robust, accurate, and generalizable solutions for diverse computer vision challenges. By addressing key limitations of conventional passive approaches across different vision applications, this research lays the groundwork for a future where proactive frameworks can improve AI-driven applications.
Organizer
Vishal Asnani