<h3>A Visual Remix : Swap Objects with Ease</h3>Artificial intelligence (AI) has changed the way we create marketing images. With a simple text prompt, we can generate visuals tailored to our campaigns. The following image, generated using <a href="https://cloud.google.com/vertex-ai/generative-ai/docs/image/quickstart-image-generate-console">Imagen on Vertex AI</a>, shows a sample marketing image for a targeted campaign.
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*WJ0fbhfKcffRBx0ptsjA9A.png" /><figcaption>AI-generated image from Google’s imagegeneration@005 model in Vertex AI Studio</figcaption></figure>
<h4>The Missing Piece: Real Products</h4>AI image generators are good at producing synthetic images from prompts like “A young woman standing in a gym and holding a ‘Specific Brand’ sneaker in her hand”. The problem? Even if the brand logo appears correctly, that sneaker may not be an actual, purchasable product from the brand. These images, while visually appealing, lack the link to real merchandise, which makes them less useful for personalized marketing campaigns. Techniques like subject tuning are promising but not yet refined enough to consistently produce marketing-grade content.
<h4>The Solution: Product Replacement</h4>What if we could easily replace generic products in AI-generated images with the specific products we want to promote?
Imagine you have a stock image of a shoe from a particular brand, as shown below.
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1004/1*ZxaVsfEvp5AtT8RAhz8Pxw.jpeg" /><figcaption>Sample AI-generated imaginary product [replace with your branded product stock image]</figcaption></figure>
We can replace the object without manually drawing bounding boxes or marking segmentation masks by combining four components:
<ol><li><a href="https://ai.google.dev/?gad_source=1&amp;gclid=CjwKCAjw7-SvBhB6EiwAwYdCAZ_hrzfyMCDhNj_y3_EXPb2nwTbkGBDHvfzvqIA6_Bu2KI0QyIoU9RoCFnQQAvD_BwE">Gemini Model</a>: an LLM that identifies the name (subject) of the product to be replaced.</li><li><a href="https://cloud.google.com/vision?hl=en">Google Cloud Vision API</a>: object detection to find the subject in the source image.</li><li><a href="https://segment-anything.com/">Segment Anything (SAM) Model</a>: image segmentation.</li><li><a href="https://github.com/huggingface/diffusers">Diffusers Model</a>: image inpainting.</li></ol>
Let’s look at each step in some detail.
<h4>(1) Gemini for Text Processing:</h4>Imagine that the end user of your tool will use a simple English command such as: replace ‘object name’ in the given image with target. As a first step, you need to find which object is to be replaced.
<h4>(2) Google Cloud Vision API:</h4>Use the Google Cloud Vision API to detect the desired object in the given image.
<figure><img alt="" src="https://cdn-images-1.medium.com/max/980/1*Pral-XFsh2m95co2Z9gXHA.png" /><figcaption>Object detection using Google Cloud Vision API and segmentation with SAM model</figcaption></figure>
<h4>(3) Segment Anything (SAM) Model:</h4>Once the object is detected by the Google Cloud Vision API, we use the SAM model to segment it.
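One way to connect the detection and segmentation steps is to turn the detected bounding polygon into a box prompt for SAM. Cloud Vision object localization reports each polygon as normalized vertices in the range 0 to 1, while SAM's predictor accepts a box prompt in pixel coordinates in (x0, y0, x1, y1) order. The sketch below shows only that conversion. The helper name <code>normalized_poly_to_box</code> and the sample vertices are hypothetical, and this is not necessarily how the original pipeline passes the detection to SAM.

```python
from typing import Sequence, Tuple


def normalized_poly_to_box(
    vertices: Sequence[Tuple[float, float]],
    image_width: int,
    image_height: int,
) -> Tuple[int, int, int, int]:
    """Convert normalized (x, y) vertices to a pixel box (x0, y0, x1, y1)."""
    if not vertices:
        raise ValueError("at least one vertex is required")
    # Clamp to [0, 1] so a slightly out-of-range vertex cannot leave the image.
    xs = [min(max(x, 0.0), 1.0) for x, _ in vertices]
    ys = [min(max(y, 0.0), 1.0) for _, y in vertices]
    return (
        round(min(xs) * image_width),
        round(min(ys) * image_height),
        round(max(xs) * image_width),
        round(max(ys) * image_height),
    )


# Hypothetical detection: a shoe in the lower-middle part of an 800x600 image.
shoe_vertices = [(0.25, 0.5), (0.75, 0.5), (0.75, 1.0), (0.25, 1.0)]
box = normalized_poly_to_box(shoe_vertices, image_width=800, image_height=600)
print(box)  # (200, 300, 600, 600)
```

The resulting tuple can be converted to an array and supplied as the box prompt, which removes the need for a hand-drawn bounding box.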
Perform similar operations on the target image, as shown below.
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Eas2gVkGfSwZnXui9zOPDA.png" /><figcaption>Auto segmentation using SAM model</figcaption></figure>
<h4>(4) Diffusers Model:</h4>In most cases, the object in the AI-generated image (or in the given input image) and the target product will differ in shape and size. Hence, if we just resize the target image and superimpose it on the given image, the result looks like the image below.
<figure><img alt="" src="https://cdn-images-1.medium.com/max/990/1*W9QBKCE_1wwmMzRjap5rFQ.png" /><figcaption>Superimposed masked images of the object and the target product</figcaption></figure>
The black portion in the above image represents the part of the original object that the target image does not cover. This region needs to be filled intelligently. We use image inpainting with a Diffusers model to perform this task. The following image shows the final output.
<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dz9bs5ygWOkOBQ6iDLMpGA.png" /><figcaption>A Visual Remix output</figcaption></figure>
We can also use methods like the Hough transform to calculate the angle of rotation. However, this works only if the desired objects are rotated within the 2D image plane. As a future step, we need to incorporate 3D rotations.
<hr><a href="https://medium.com/google-cloud/a-visual-remix-swap-objects-with-ease-8718b040723c">A Visual Remix : Swap Objects with Ease</a> was originally published in <a href="https://medium.com/google-cloud">Google Cloud - Community</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.
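The black region described under the Diffusers Model heading can be expressed as a mask operation: pixels that belong to the original object but are not covered by the pasted target. The sketch below assumes both masks are already aligned to the same image size and stored as NumPy arrays. The function name <code>residual_inpaint_mask</code> and the toy masks are hypothetical, so treat it as an illustration and not as the article's actual implementation.

```python
import numpy as np


def residual_inpaint_mask(source_mask: np.ndarray, target_mask: np.ndarray) -> np.ndarray:
    """Return True where the original object is visible but the target is not."""
    if source_mask.shape != target_mask.shape:
        raise ValueError("masks must have the same shape")
    # Inside the original object AND outside the superimposed target.
    return source_mask.astype(bool) & ~target_mask.astype(bool)


# Toy 2x4 masks: 1 marks object pixels, 0 marks background.
source = np.array([[0, 1, 1, 0],
                   [0, 1, 1, 0]])
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 0]])
residual = residual_inpaint_mask(source, target)
print(residual.astype(np.uint8))
```

Inpainting pipelines commonly take such a mask as an image in which white marks the pixels to repaint, so the boolean result would typically be scaled to 0 and 255 before use.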

--- !ruby/object:Feedjira::Parser::RSSEntry
title: 'A Visual Remix : Swap Objects with Ease'
url: https://medium.com/google-cloud/a-visual-remix-swap-objects-with-ease-8718b040723c?source=rss----e52cf94d98af---4
author: Bhushan Garware
categories:
- imagen
- machine-learning
- vertex-ai
- generative-ai
- google-cloud-platform
published: 2024-03-31 02:52:00.000000000 Z
entry_id: !ruby/object:Feedjira::Parser::GloballyUniqueIdentifier
 is_perma_link: 'false'
 guid: https://medium.com/p/8718b040723c
carlessian_info:
 news_filer_version: 2
 newspaper: Google Cloud - Medium
 macro_region: Blogs
rss_fields:
- title
- url
- author
- categories
- published
- entry_id
- content
