Let’s Take a Look at TokenFlow’s Ablation Study

Table of Links

Abstract and 1. Introduction

2 Related Work

3 Preliminaries

4 Method

4.1 Key Sample and Joint Editing

4.2 Edit Propagation Via TokenFlow

5 Results

5.1 Qualitative Evaluation and 5.2 Quantitative Evaluation

5.3 Ablation Study

6 Discussion

7 Acknowledgement and References

A Implementation Details

5.3 ABLATION STUDY

First, we ablate the use of TokenFlow (Sec. 4.2) for enforcing temporal consistency. In this experiment, we replace TokenFlow with extended attention (Eq. 3) and compute it between each frame of the edited video and the keyframes (w joint attention). Second, we ablate the randomization of the keyframe selection at each generation step (w/o random keyframes). In this experiment, we use the same keyframe indices (evenly spaced in time) throughout the generation. Table 1 (bottom) shows the quantitative results of our ablations; the resulting videos can be found in the supplementary material (SM). As seen, TokenFlow ensures a higher degree of temporal consistency, indicating that relying solely on the extension of self-attention to multiple frames is insufficient for achieving fine-grained temporal consistency. Additionally, fixing the keyframes creates an artificial partition of the video into short clips between the fixed keyframes, which degrades the consistency of the result.
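To make the second ablation concrete, the following is a minimal sketch of the two keyframe-selection strategies being compared. It is illustrative only: the function names, the `stride` parameter, and the choice to randomize by shifting an evenly spaced grid with a per-step offset are assumptions made here, not details stated in the text above.

```python
import random


def fixed_keyframes(num_frames, stride):
    """Ablated variant (w/o random keyframes): the same evenly spaced
    indices are returned at every generation step."""
    return list(range(0, num_frames, stride))


def random_keyframes(num_frames, stride, rng):
    """Hypothetical randomized variant: an evenly spaced grid shifted by
    an offset that is redrawn each time the function is called."""
    offset = rng.randrange(stride)  # integer in [0, stride)
    return list(range(offset, num_frames, stride))


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    num_frames, stride, num_steps = 40, 8, 5
    for step in range(num_steps):
        fixed = fixed_keyframes(num_frames, stride)
        rand = random_keyframes(num_frames, stride, rng)
        print(f"step {step}: fixed={fixed} random={rand}")
```

With fixed indices, the clip boundaries fall on the same frames at every step, which matches the "artificial partition" described above; redrawing the offset moves those boundaries from step to step.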


Table 2: We reconstruct the video using the TokenFlow pipeline, excluding keyframe editing. We evaluate the TokenFlow representation with PSNR and LPIPS metrics. Our reconstruction improves on vanilla DDIM inversion, highlighting the robustness of the TokenFlow representation.
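For readers unfamiliar with the first metric, here is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), applied to flat lists of pixel values. The function name, the `max_value` default of 1.0, and the flat-list input are assumptions for illustration; this is not the evaluation code behind Table 2, and LPIPS (a learned perceptual metric) is not sketched here.

```python
import math


def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of pixel values in the range [0, max_value]."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the reconstruction is closer to the original frame in a per-pixel sense.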

:::info
This paper is available on arXiv under the CC BY 4.0 DEED license.

:::

:::info
Authors:

(1) Michal Geyer, Weizmann Institute of Science (equal contribution);

(2) Omer Bar-Tal, Weizmann Institute of Science (equal contribution);

(3) Shai Bagon, Weizmann Institute of Science;

(4) Tali Dekel, Weizmann Institute of Science.

:::
