Publications
This page lists all publications in reverse chronological order. Each entry links to its detail page and, when available, to external resources. You can also browse citations on Google Scholar.
Conference (full paper)
Training-free Mixed-Resolution Latent Upsampling for Spatially Accelerated Diffusion Transformers
This paper is about accelerating Text-to-Image diffusion models with region-adaptive upsampling.
Perceptually Consistent Low-Resolution Previews with High-Resolution Image for Efficient Workflows of Diffusion Models
This paper is about generating perceptually consistent low-resolution images for workflow efficiency.
Erasing Thousands of Concepts: Towards Scalable and Practical Concept Erasure for Text-to-Image Diffusion Model
This paper is about erasing thousands of concepts in Text-to-Image diffusion models.
On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models
This paper is about mitigating hallucinations in Large Vision-Language Models via a training-free method.
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
This paper is about pruning the large text encoders in T2I diffusion models for memory-efficient image synthesis.
Efficient Personalization of Quantized Diffusion Model without Backpropagation
This paper is about personalizing Text-to-Image diffusion models with extreme memory efficiency via zeroth-order optimization of a quantized model.
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding
This paper is about training models to tackle affordance grounding with weakly supervised learning.
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
This paper is about generating human-centric ultra-high-resolution images with a pre-trained diffusion model.
Conference (short papers, abstracts, or workshop papers)
Harmonization for a black-box deep learning model
This paper is about harmonizing MR images for a black-box deep learning model.
Technical Report
On Geometrical Properties of Text Token Embeddings for Strong Semantic Binding in Text-to-Image Generation
This paper is about analyzing the geometrical properties of text token embeddings in cross-attention and leveraging those properties for strong semantic binding.
DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
This paper is about generating diverse 3D objects with a pre-trained 2D diffusion model.
