Subject
Text-to-image diffusion models achieve high-quality image synthesis from natural language prompts, but their generation processes remain opaque. Explainability methods aim to reveal how the text prompt affects the model's image output. This thesis investigates how to adapt the explainability techniques developed in our research group to text-conditional diffusion models.
Kind of work
The student will first study the architecture of text-conditional diffusion models and our explainability methods. The explainability method will then be integrated into the reverse diffusion loop, where it generates and analyzes perturbed inputs and outputs, as sketched below. The method will be evaluated on vision-language datasets, using qualitative examples and faithfulness metrics to assess how text prompts influence generation.
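
To make the perturbation step concrete, the sketch below shows one possible realization: occlusion-style token attribution around the reverse diffusion loop, where each prompt token is dropped in turn and the change in the output is measured under a fixed noise seed. The generator here is a toy stand-in, and all names (encode_prompt, reverse_diffusion, token_attribution) are hypothetical placeholders for the components of the actual model and explainability method, not an existing API.

import numpy as np

def encode_prompt(tokens):
    """Toy text encoder: map each token to a fixed pseudo-random embedding.
    (Deterministic within a run, which is all the comparison below needs.)"""
    dim = 32
    return np.stack([
        np.random.default_rng(abs(hash(t)) % 2**32).standard_normal(dim)
        for t in tokens
    ])

def reverse_diffusion(cond, steps=20, seed=0):
    """Toy stand-in for a reverse diffusion loop: starting from fixed noise,
    iteratively pull a latent toward the conditioning. A real implementation
    would call the model's noise predictor with the text conditioning here."""
    g = np.random.default_rng(seed)
    x = g.standard_normal(cond.shape[1])
    target = cond.mean(axis=0)
    for _ in range(steps):
        x = x + 0.2 * (target - x)  # crude "denoising" update
    return x

def token_attribution(tokens, seed=0):
    """Occlusion-style attribution: remove one token at a time, regenerate
    with the *same* noise seed so only the prompt changes, and score each
    token by how much the output moves."""
    base = reverse_diffusion(encode_prompt(tokens), seed=seed)
    scores = []
    for i, tok in enumerate(tokens):
        perturbed = tokens[:i] + tokens[i + 1:]
        out = reverse_diffusion(encode_prompt(perturbed), seed=seed)
        scores.append((tok, float(np.linalg.norm(base - out))))
    return scores

if __name__ == "__main__":
    for tok, score in token_attribution(["a", "red", "car", "on", "a", "beach"]):
        print(f"{tok:>6}: {score:.3f}")

Fixing the noise seed across the original and perturbed generations isolates the effect of the prompt perturbation from sampling randomness, which is essential for a faithful comparison.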
Framework of the Thesis
X. Kong, O. Liu, H. Li, D. Yogatama, and G. Ver Steeg, “Interpretable Diffusion via Information Decomposition,” in International Conference on Learning Representations, 2024.
B. Joukovsky, F. Sammani, and N. Deligiannis, “Model-Agnostic Visual Explanations via Approximate Bilinear Models,” in IEEE International Conference on Image Processing, 2023.
Number of Students
1
Expected Student Profile
Good knowledge of Python programming and machine learning.