Generate a video from user input that matches the supplied first-frame, middle-frame, and last-frame images.
Update: Reduced abnormal flickering in the video at the middle-frame connection point.
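As an illustration only, here is one way a visible hitch can arise at a middle-frame connection point and how it might be avoided. This sketch assumes the video is assembled from two segments (first to middle, middle to last) that both contain the middle keyframe. That is an assumption, not something this description confirms, and the function name `join_at_middle` is hypothetical. If both segments keep their copy of the middle frame, the frame is shown twice in a row, which can read as a stutter.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def join_at_middle(first_half: Sequence[Frame], second_half: Sequence[Frame]) -> List[Frame]:
    """Concatenate two clips that share the middle keyframe (hypothetical helper).

    Assumes first_half ends on the middle keyframe and second_half starts on it.
    """
    if not first_half or not second_half:
        raise ValueError("both segments must contain at least one frame")
    # Drop the duplicated middle frame from the second segment so it appears once.
    return list(first_half) + list(second_half[1:])


# Frames are stand-in strings here; real frames would be image arrays.
clip_a = ["first", "a1", "a2", "middle"]
clip_b = ["middle", "b1", "b2", "last"]
video = join_at_middle(clip_a, clip_b)
```

This only covers a duplicated-frame seam. Flicker caused by brightness or color shifts between segments would need a different treatment, which is not sketched here.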