Irrelevant Region Preserving for Counterfactual Image Manipulation

doi:10.21203/rs.3.rs-4980747/v1

Article

https://doi.org/10.21203/rs.3.rs-4980747/v1

This work is licensed under a CC BY 4.0 License

Image manipulation is one of the most significant and promising research topics in multimodal learning. Several recent methods based on Contrastive Language-Image Pre-training (CLIP) have achieved high-resolution image editing, but the challenging problems of complex editing and attribute disentanglement remain unsolved. In this paper, we propose an image editing method that combines a strong capability for complex editing with accurate protection of irrelevant attributes, addressing both of these challenges simultaneously. To obtain a more comprehensive semantic representation, we design a simple but effective structure based on the cross-attention mechanism, which allows better fusion of text and image features. In addition, a mask-controlled method keeps the semantics of irrelevant regions unchanged after editing. We conduct extensive experiments and analysis to evaluate the generative capability of our method. The results demonstrate that our design achieves semantic representation and accurate editing, and outperforms the compared methods in image quality.
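The two mechanisms named in the abstract, cross-attention fusion of text and image features and mask-controlled preservation of irrelevant regions, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture, which the abstract does not specify. The function names (`cross_attention`, `blend_with_mask`), the projection matrices `w_q`, `w_k`, `w_v`, the residual connection, and the choice to blend in pixel space with a binary mask are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Generic scaled dot-product cross-attention (illustrative only).

    Image tokens act as queries; text tokens supply keys and values.
    Shapes (all hypothetical):
      image_tokens: (n_img, d_img)   text_tokens: (n_txt, d_txt)
      w_q: (d_img, d)   w_k: (d_txt, d)   w_v: (d_txt, d_img)
    """
    q = image_tokens @ w_q                       # (n_img, d)
    k = text_tokens @ w_k                        # (n_txt, d)
    v = text_tokens @ w_v                        # (n_txt, d_img)
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (n_img, n_txt)
    weights = softmax(scores, axis=-1)           # each row sums to 1
    fused = image_tokens + weights @ v           # residual fusion (assumed)
    return fused, weights


def blend_with_mask(original, edited, mask):
    """Keep the original wherever mask == 0, take the edit where mask == 1.

    original, edited: (H, W, C) arrays; mask: (H, W) with values in [0, 1].
    """
    m = mask[..., None]                          # broadcast over channels
    return m * edited + (1.0 - m) * original
```

In this sketch the mask marks the region that the edit is allowed to change; everything outside it is copied from the original image, which is one simple way to keep irrelevant regions untouched.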

Physical sciences/Mathematics and computing/Applied mathematics

Physical sciences/Mathematics and computing/Computer science

No competing interests reported.

Reviewers invited by journal: 22 Sep, 2024
Editor assigned by journal: 22 Sep, 2024
Editor invited by journal: 06 Sep, 2024
Submission checks completed at journal: 04 Sep, 2024
First submitted to journal: 26 Aug, 2024

Status:

Version 1
