Given first-frame, intermediate-frame, and last-frame images supplied by the user, generate a video whose content matches all three frames.

Update: Alleviates abnormal flickering at the points where the video connects at the intermediate frame.