

https://github.com/HM-RunningHub/ComfyUI_RH_Step1XEdit
https://github.com/stepfun-ai/Step1X-Edit
We have released Step1X-Edit, a state-of-the-art image editing model whose performance is comparable to that of closed-source models such as GPT-4o and Gemini2 Flash. Specifically, a multimodal LLM processes the reference image together with the user's editing instruction and extracts a latent embedding, which is then fed to a diffusion image decoder to produce the target image. To train the model, we built a data generation pipeline that produces a high-quality dataset. For evaluation, we developed GEdit-Bench, a new benchmark built from real user instructions. Experimental results on GEdit-Bench show that Step1X-Edit outperforms existing open-source baselines by a substantial margin and approaches the performance of leading proprietary models. For more details, please refer to our technical report.
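The data flow described above (multimodal encoder, then latent embedding, then diffusion decoder) can be illustrated with a toy sketch. This is not the Step1X-Edit implementation: every name below (`EditRequest`, `toy_mllm_encode`, `toy_diffusion_decode`, `edit_image`) is hypothetical, images are stood in for by short lists of floats, and a fixed interpolation rule replaces the learned denoiser. It only shows how a conditioning embedding derived from the image and instruction steers an iterative decoding loop.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


@dataclass
class EditRequest:
    """Toy edit request: a reference image (as floats) plus a text instruction."""
    reference_image: Vector
    instruction: str


def toy_mllm_encode(request: EditRequest, dim: int = 4) -> Vector:
    """Hypothetical stand-in for the multimodal LLM.

    Fuses the image and the instruction into one conditioning embedding.
    A real model would use learned weights; this uses simple arithmetic.
    """
    # Toy text features: character codes accumulated into `dim` slots.
    text_features = [0.0] * dim
    for i, ch in enumerate(request.instruction):
        text_features[i % dim] += ord(ch) / 1000.0
    # Toy image feature: the mean value of the reference image.
    image_mean = sum(request.reference_image) / len(request.reference_image)
    return [t + image_mean for t in text_features]


def toy_diffusion_decode(condition: Vector, start: Vector, steps: int = 10) -> Vector:
    """Hypothetical stand-in for the diffusion image decoder.

    Iteratively refines a starting latent toward the conditioning embedding.
    Each step halves the remaining distance (an assumption made for
    illustration, not a real denoising schedule).
    """
    latent = list(start)
    for _ in range(steps):
        latent = [x + 0.5 * (c - x) for x, c in zip(latent, condition)]
    return latent


def edit_image(request: EditRequest, start: Vector) -> Vector:
    """Run the two-stage toy pipeline: encode the request, then decode."""
    condition = toy_mllm_encode(request, dim=len(start))
    return toy_diffusion_decode(condition, start)


if __name__ == "__main__":
    request = EditRequest(
        reference_image=[0.2, 0.4, 0.6, 0.8],
        instruction="make the sky purple",
    )
    print(edit_image(request, start=[0.0, 0.0, 0.0, 0.0]))
```

In this sketch the decoder sees the reference image and the instruction only through the conditioning embedding, mirroring the separation between understanding the edit and generating the result. For the actual architecture and training details, refer to the technical report.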