Access APIs for industry-leading standard models (Qwen-Image, Kling, Hailuo, etc.)
LLM API
Compatible with popular LLM call formats, ready for AI agents
AI Application API
Professional node-based creation system for complex video customization needs
ComfyUI Workflow API
Built for developers with fast generation APIs and large-scale concurrency support
nano-banana-pro image to image
Natural-language, context-aware editing, multilingual on-image text with auto translation, camera-style controls, aspect ratio flexibility, consistent character and style rendering
ComfyUI Plugin
RH_Skills
AI Developer Kit
Recently Added
7RH Art
volc-subtitle-erase/video(Refined version)
Upload a video to Volcengine VOD, run refined subtitle erase, and return the subtitle-erased video.
Gemini Omni Flash Image-to-Video turns reference images into dynamic short videos. It supports single-image video generation with 1 image and reference fusion with 3 images, offering 720p, 1080p, and 4k outputs with 4, 6, 8, and 10-second durations for character animation, product showcases, creative shots, and multi-reference video creation.
221RH Art
gemini-omni-flash/text-to-video/channel-low-price
Gemini Omni Flash is a unified video generation model that creates high-quality short videos from text prompts. It supports 720p, 1080p, and 4k outputs with 4, 6, 8, and 10-second durations, making it suitable for creative clips, advertising assets, social videos, and visual concept demos.
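The resolution and duration options listed above can be checked on the client before a request is sent. The following is only a minimal sketch: the allowed values come from the description, while the function name and the idea of validating locally are assumptions, not part of any documented API.

```python
# Allowed values are taken from the description above.
# The function name is hypothetical and not part of any official SDK.
ALLOWED_RESOLUTIONS = {"720p", "1080p", "4k"}
ALLOWED_DURATIONS_S = {4, 6, 8, 10}


def validate_video_options(resolution: str, duration_s: int) -> None:
    """Raise ValueError if the options fall outside the advertised set."""
    if resolution.lower() not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_s not in ALLOWED_DURATIONS_S:
        raise ValueError(f"unsupported duration: {duration_s!r} s")


validate_video_options("1080p", 8)  # passes silently
```

Rejecting an unsupported combination locally avoids spending a request on a job that the service would likely refuse.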
29RH Art
gemini-omni-flash/video-edit/channel-low-price
Gemini Omni Flash all-in-one video generation, powered by reference images and videos. It supports 720p, 1080p, and 4K resolutions with 4, 6, 8, and 10-second durations, and suits character motion, product display, creative shots, and multi-image reference videos.
23RH Art
volc-subtitle-erase/video(Standard Version)
Upload a video to Volcengine VOD, run standard subtitle erase, and return the subtitle-erased video.
93RH Art
bytedance/jimeng-4.6/image-to-image
Jimeng Image 4.6 Image-to-Image is a high-quality image generation and editing model from Volcengine Jimeng AI. It creates new images from reference images and text prompts, supporting multi-reference fusion, portrait enhancement, image stylization, product visuals, and creative design. The model supports up to 14 reference images and output size control with size or width/height, covering image generation from around 1K to 4K.
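The limits described above (up to 14 reference images, output size given either as a size value or as width/height) could be enforced when assembling a request. This is a hedged sketch: every key name in the payload is hypothetical, and treating size and width/height as mutually exclusive is an assumption drawn from the wording "size or width/height".

```python
from typing import Optional

MAX_REFERENCE_IMAGES = 14  # limit stated in the description above


def build_image_to_image_payload(
    prompt: str,
    reference_images: list[str],
    size: Optional[str] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> dict:
    """Assemble a request body. All key names here are hypothetical."""
    if not 1 <= len(reference_images) <= MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, "
            f"got {len(reference_images)}"
        )
    has_dims = width is not None or height is not None
    # Assumption: size and width/height are alternative ways to set output size.
    if size is not None and has_dims:
        raise ValueError("pass either size or width/height, not both")
    if has_dims and (width is None or height is None):
        raise ValueError("width and height must be given together")

    payload: dict = {"prompt": prompt, "images": list(reference_images)}
    if size is not None:
        payload["size"] = size
    elif has_dims:
        payload["width"] = width
        payload["height"] = height
    return payload
```

The actual field names and accepted size values should be taken from the endpoint documentation.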
46RH Art
bytedance/jimeng-4.6/text-to-image
Jimeng Image 4.6 Text-to-Image is a high-quality image generation model from Volcengine Jimeng AI. Built on the upgraded Seedream 4.0 foundation, it generates high-resolution images from text prompts and is suitable for portraits, graphic design, creative posters, product visuals, and image stylization. Output size can be controlled with size or width/height, supporting image generation from around 1K to 4K.
Nano Banana Models
872.5kRH Art
nano-banana-pro/text-to-image-channel-low-price
Google's Nano Banana Pro (Gemini 3.0 Pro Image) is a leading text-to-image model that generates high-definition 4K images and is optimized to run smoothly on mobile devices. It comes with a ready-to-use REST inference API, zero cold starts, and affordable pricing. Channel-low-price: priced significantly lower than the Official Stable Edition, but stability is not guaranteed.
10.28MRH Art
nano-banana-pro/edit-channel-low-price
Google Nano Banana Pro (Gemini 3.0 Pro Image) Edit supports professional image editing with high-quality output up to 4K, driven by the Gemini 3.0 Pro Image model. It offers an out-of-the-box REST inference API, no cold-start latency, and cost-effective pricing. Channel-low-price: priced significantly lower than the Official Stable Edition, but stability is not guaranteed.
An image-to-image and editing endpoint powered by a highly efficient visual engine. It enables rapid style transfer, inpainting, or background replacement via image and text inputs. Nano Banana 2 maintains the core structure and reference features of the original image while applying extensive modifications, making it ideal for dynamic, interactive design tools. Channel-low-price: priced significantly lower than the Official Stable Edition, but stability is not guaranteed.
A lightweight text-to-image endpoint engineered for high-concurrency and rapid response. As the core of Nano Banana 2, it balances visual fidelity with high throughput, instantly transforming text prompts into high-quality assets. Ideal for modern API platforms requiring real-time previews, rapid iteration, or large-scale batch generation. Channel-low-price: priced significantly lower than the Official Stable Edition, but stability is not guaranteed.
13.8kRH Art
nano-banana/edit-official-stable
Google Nano-Banana Edit is an AI-powered image editing model that turns complex visual manipulation into natural language commands. It interprets spatial relationships to execute precise edits, such as object replacement or color tuning, while preserving the original lighting, texture, and tone. It delivers professional-grade results for concept art, photography, and everyday design. Official-stable: stable and highly efficient, with pricing lower than purchasing directly from the official model provider.
882RH Art
nano-banana/text-to-image-official-stable
Google Nano Banana Text-to-Image is a lightweight yet powerful AI model designed for creators needing rapid, high-quality visuals. In seconds, it transforms simple text prompts into expressive, realistic images with remarkable clarity and composition. Supporting diverse styles from photorealism to illustration, it accurately interprets subject-background relationships and produces clean, balanced lighting. Optimized for speed and low compute costs, it is ideal for rapid prototyping and generating social content efficiently without requiring design skills. Official-stable: stable and highly efficient, with pricing lower than purchasing directly from the official model provider.
Nano Banana 2 Edit redefines visual manipulation by merging Google’s advanced CV research with intuitive semantic control. Capable of professional 4K outputs, it excels at translating natural language into precise pixel-level modifications. With a unique capacity for 14-image multi-reference compositing, it ensures seamless subject consistency and intelligent localization. From complex re-lighting to sophisticated text translation within visuals, it offers a fast, flexible, and context-aware editing workflow for modern creators. Official-stable: stable and highly efficient, with pricing lower than purchasing directly from the official model provider.
Seedance Models
7RH Art
volc-subtitle-erase/video(Refined version)
Upload a video to Volcengine VOD, run refined subtitle erase, and return the subtitle-erased video.
23RH Art
volc-subtitle-erase/video(Standard Version)
Upload a video to Volcengine VOD, run standard subtitle erase, and return the subtitle-erased video.
30.1kRH Art
seedance-2.0-global-fast/multimodal-video
Seedance 2.0 Fast Multimodal Video is optimized for speed and cost efficiency. It supports multimodal reference generation, video editing, and video extension with flexible multimodal inputs, and its @-reference system gives precise control over each asset. Inference is faster than the standard version. Please note: generated video files are retained in the cloud for only 24 hours. Download and transfer them promptly after the task completes.
5.84kRH Art
seedance-2.0-global-fast/image-to-video
Seedance 2.0 Fast Image-to-Video is optimized for rapid image animation workflows. It supports both first-frame and first-last-frame modes, generating 4-15 second dynamic videos from 1-2 input images. It offers faster inference and lower cost than the standard version, with audio generation and adaptive aspect ratios. Please note: generated video files are retained in the cloud for only 24 hours. Download and transfer them promptly after the task completes.
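The two modes described above map naturally onto the number of input images: one image for first-frame, two for first-last-frame. A small sketch of that mapping follows. The mode labels and the function name are hypothetical, and the real API may select the mode differently.

```python
def select_image_to_video_mode(image_count: int) -> str:
    """Map the number of input images to a mode label (labels are hypothetical)."""
    if image_count == 1:
        return "first_frame"       # the single image is the opening frame
    if image_count == 2:
        return "first_last_frame"  # images are the opening and closing frames
    raise ValueError(f"expected 1 or 2 input images, got {image_count}")
```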
859RH Art
seedance-2.0-global-fast/text-to-video
Seedance 2.0 Fast Text-to-Video is designed for rapid text-driven video generation. Generate 4-15 second videos with audio output from text prompts alone, with optional web search enhancement for more up-to-date results. Inference is faster than the standard version, which makes it well suited to quick prototyping and batch production. Please note: generated video files are retained in the cloud for only 24 hours. Download and transfer them promptly after the task completes.
36.7kRH Art
seedance-2.0-global/multimodal-video
Seedance 2.0 Multimodal Video delivers the highest generation quality. It supports multimodal reference, video editing, and extension. Combine text, images (up to 9), videos (up to 3), and audio (up to 3) to produce 4-15 second high-quality videos with native audio. The @-reference system enables precise control over character consistency, camera movement replication, and lip synchronization. Up to 4K resolution is supported. Please note: generated video files are retained in the cloud for only 24 hours. Download and transfer them promptly after the task completes.
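The input limits above (up to 9 images, 3 videos, 3 audio clips, and a 4-15 second duration) can be checked before submission. The sketch below is illustrative only: the class, field names, and default duration are assumptions, and only the numeric limits come from the description.

```python
from dataclasses import dataclass, field

# Per-type limits quoted from the description above.
LIMITS = {"images": 9, "videos": 3, "audio": 3}
MIN_DURATION_S, MAX_DURATION_S = 4, 15


@dataclass
class MultimodalInputs:
    """Hypothetical container for one generation request."""
    prompt: str
    images: list[str] = field(default_factory=list)
    videos: list[str] = field(default_factory=list)
    audio: list[str] = field(default_factory=list)
    duration_s: int = 5  # arbitrary default chosen for this sketch


def check_inputs(inputs: MultimodalInputs) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, limit in LIMITS.items():
        count = len(getattr(inputs, name))
        if count > limit:
            problems.append(f"{name}: {count} exceeds limit of {limit}")
    if not MIN_DURATION_S <= inputs.duration_s <= MAX_DURATION_S:
        problems.append(
            f"duration_s: {inputs.duration_s} outside "
            f"{MIN_DURATION_S}-{MAX_DURATION_S}"
        )
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every violation at once.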
1.8kRH Art
seedance-2.0-global/image-to-video
Seedance 2.0 Image-to-Video targets the highest quality. It supports first-frame and first-last-frame modes: upload one image or two (start and end), and the model generates the intermediate motion for 4-15 second videos with native audio. It performs strongly in character consistency, physical motion realism, and frame stability, and supports multiple aspect ratios. Please note: generated video files are retained in the cloud for only 24 hours. Download and transfer them promptly after the task completes.
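Because generated files are kept for only 24 hours, a client may want to track a download deadline for each finished task. The following sketch assumes the retention window starts at task completion, which the description does not state explicitly. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(hours=24)  # retention period stated in the note above


def download_deadline(completed_at: datetime) -> datetime:
    """Latest time the file is expected to be available.

    Assumption: the 24-hour window starts when the task completes.
    """
    return completed_at + RETENTION


def is_expired(completed_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the assumed retention window has passed."""
    now = now or datetime.now(timezone.utc)
    return now >= download_deadline(completed_at)
```

Using timezone-aware UTC timestamps throughout avoids off-by-hours errors when the client and the service run in different time zones.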