Advisory Committee Chair
Advisory Committee Members
Date of Award
Degree Name by School
Doctor of Philosophy (PhD) College of Arts and Sciences
Visual-appearance-based person re-identification (re-ID) is the task of assigning the same identifier to all instances of a particular individual captured in images or videos, even after significant gaps in time or space. State-of-the-art methods fall into two main approaches. Given a set of gallery images with known IDs, they infer either the ID label of each probe image individually (person re-ID via image retrieval) or the ID labels of all probe images collectively (person re-ID via a highly crafted re-ID structure). This dissertation primarily explores the following question: without crafting a predefined re-ID structure, is it possible to learn the re-ID structure among probe images automatically and thereby infer their ID labels collectively? It formulates person re-ID, for the first time, as an energy-based structured prediction problem that still operates on the feature embeddings of all nodes but constructs the re-ID structure in the output label space. Without assuming a predefined structure, it takes a generative approach, approximating the unknown re-ID structure by generating ‘snapshot’ structure samples. The baseline formulation is as follows: to infer the unknown IDs of all probe images collectively, allowable uncertainty is introduced into the feature embeddings and the associated intermediate labelings of all probe images. Each pair of a ‘snapshot’ structure and its intermediate labeling forms a structure sample. These samples are fed into a structured prediction model that reasons about their commonality, thereby approximating the unknown true re-ID structure, which better captures the label interactions among all probe images. With this baseline formulation, this dissertation instantiates two families of structure sampling and learning paradigms.
The first generates structure samples by randomized dropout, with structured prediction performed by a pairwise Conditional Random Field (CRF) over an unknown general graph. The second generates structure samples by neural-style-transferring the bias of known gallery images, with structured prediction modeling possible higher-arity interactions among probe images using Structured Prediction Energy Networks (SPENs). As of the end of 2018, results from the latter approach outperformed all competing methods on benchmark datasets.
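The baseline formulation summarized in the abstract (perturb the probe embeddings, record an intermediate labeling for each perturbation, then look for what the samples have in common) can be illustrated with a toy sketch. This is not the dissertation's implementation. The function names, the nearest-gallery labeling rule, and the co-labeling frequency summary are all assumptions made for illustration, and the frequency summary is only a simple stand-in for the structured prediction model (CRF or SPEN) that the dissertation actually uses.

```python
import random


def dropout(vec, rate, rng):
    """Zero each feature with probability `rate`; rescale survivors (inverted dropout)."""
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vec]


def nearest_id(vec, gallery):
    """Return the gallery ID whose embedding is closest in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(gallery, key=lambda gid: sqdist(vec, gallery[gid]))


def sample_structures(probes, gallery, n_samples=200, rate=0.3, seed=0):
    """Draw `n_samples` intermediate labelings, one per dropout perturbation.

    Hypothetical simplification: each 'snapshot' is summarized only by the
    labeling it induces on the probe set.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        labeling = [nearest_id(dropout(p, rate, rng), gallery) for p in probes]
        samples.append(labeling)
    return samples


def co_label_frequency(samples):
    """Fraction of samples in which probes i and j received the same ID.

    Assumption: this matrix stands in for the learned pairwise structure.
    """
    n = len(samples[0])
    counts = [[0] * n for _ in range(n)]
    for labeling in samples:
        for i in range(n):
            for j in range(n):
                if labeling[i] == labeling[j]:
                    counts[i][j] += 1
    total = len(samples)
    return [[c / total for c in row] for row in counts]


if __name__ == "__main__":
    # Toy data: two gallery identities and three probe embeddings.
    gallery = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
    probes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    samples = sample_structures(probes, gallery)
    for row in co_label_frequency(samples):
        print([round(v, 2) for v in row])
```

High off-diagonal entries mark probe pairs that are labeled alike under most perturbations. In this sketch that agreement is only counted, whereas the abstract describes learning it with a structured prediction model.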
Liao, Xinpeng, "Person Re-Identification By Deep Structured Prediction: A Generative Approach" (2019). All ETDs from UAB. 2282.