How Does Whisk Work?
Whisk gives users the flexibility to provide images as prompts in three categories:
- Subject: The main object or focus of the image.
- Scene: The setting or environment.
- Style: The artistic tone or visual aesthetic.
Text prompts are optional, but users can refine their creations by typing specific instructions into a text box at any stage of the process. After generating an image, Whisk also produces a corresponding text prompt.
Iterating and Refining Images
If the initial result isn’t perfect, Whisk provides tools to improve it:
- Download or favorite an image you like.
- Click on an image to edit the text prompt and refine the output.
- Add further text descriptions for more precision.
Powered by Google’s Imagen 3 Model
Whisk uses what Google describes as the “latest” version of its Imagen 3 model, which promises advancements in AI image generation. The updated model is meant to allow more creative outputs and greater flexibility when iterating on designs.
Google Expands Its AI Tools: Veo 2 and Beyond
Alongside Whisk, Google also announced an update to its video generation model with the release of Veo 2. Designed to understand the “unique language of cinematography,” Veo 2 improves on previous models by reducing errors such as hallucinated elements (extra fingers, for example).
Key Highlights of Veo 2:
- It will be available first through Google’s VideoFX platform, which is currently behind a Google Labs waitlist.
- Expansion to YouTube Shorts and other products is expected in 2025.
A Fun New Tool for Visual Creativity
In practice, Whisk provides a playful and engaging way to explore AI-generated visuals. While the generation process takes a few seconds, users can quickly iterate on ideas and refine outputs with minimal effort.
Google’s Whisk is a step forward in accessible AI tools, inviting users to experiment with visual prompts and enjoy a new level of creative freedom without relying heavily on text descriptions.
Image: Google via Whisk