♊️ GemiNews 🗞️

🗞️A Visual Remix : Swap Objects with Ease

2024-03-31 - Bhushan Garware (from Google Cloud - Medium)

Artificial intelligence (AI) has revolutionized the way we create marketing images. With a simple text prompt, we can generate stunning visuals tailored to our campaigns. The following image, generated using Imagen on Vertex AI, shows a sample marketing image for a targeted campaign.

[Image: AI-generated image with Google’s Imagegeneration@005 model from Vertex AI Studio]

The Missing Piece: Real Products

AI image generators are very good at producing synthetic images from prompts like “A young woman standing in a gym and holding a ‘Specific Brand’ sneaker in her hand”. The problem? Even if the brand logo appears correctly, that sneaker may not be an actual, purchasable product from the brand. These images, while visually appealing, lack the crucial link to real merchandise, which makes them less useful for personalized marketing campaigns. Techniques like subject tuning are promising but not yet refined enough to consistently produce marketing-grade content.

The Solution: Product Replacement

What if we could easily replace generic products in AI-generated images with the specific products we want to promote? Imagine you have a stock image of a shoe from a particular brand, as shown below.

[Image: Sample AI-generated imaginary product (replace with your branded product stock image)]

We can replace the object without manually drawing bounding boxes or marking segmentation masks by combining the following steps:

(1) Gemini Model: LLM for understanding the name (subject) of the product to be replaced.
(2) Google Cloud Vision API: Object detection to find the subject in the source images.
(3) Segment Anything (SAM) Model: Image segmentation.
(4) Diffuser Model: Image inpainting.

Let’s look at each step in some detail.

(1) Gemini for Text Processing:

Imagine that the end user of your tool will use a simple English command like: replace ‘object name’ in the given image with target.
As a first step, you need to find which object is to be replaced.

(2) Google Cloud Vision API:

Use the Google Cloud Vision API to detect the desired object in the given image.

[Image: Object detection using Google Cloud Vision API and segmentation with SAM model]

(3) Segment Anything (SAM) Model:

Once the object has been detected by the Google Cloud Vision API, we use the SAM model to segment it. Perform the same operations on the target images, as shown below.

[Image: Auto segmentation using SAM model]

(4) Diffuser Model:

In most cases, the object in the AI-generated image (or in the given input image) and the target product will differ in shape and size. Hence, if we simply resize the target image and superimpose it on the given image, the result looks like the image below.

[Image: Superimposed masked images of the object and the target product]

The black portion in the above image is the part of the original object that the target image does not cover. This region needs to be filled intelligently. We use image inpainting with a diffusers model to perform this task. The following image shows the final output.

[Image: A Visual Remix output]

We can also use methods like the Hough transform to calculate the angle of rotation. However, this method works only if the desired objects are rotated within the 2D plane. As a future step, we need to incorporate 3D rotations.

A Visual Remix : Swap Objects with Ease was originally published in Google Cloud - Community on Medium, where people are continuing the conversation by highlighting and responding to this story.
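
Example for step (1): a minimal sketch of how the object name could be pulled out of a free-form command. The article does not publish its prompt, so the prompt wording, the JSON keys `source_object` and `target`, and the helper names below are assumptions for illustration. `generate` stands for any function that sends a prompt to Gemini and returns the text of its reply.

```python
import json
import re

# Hypothetical prompt; the original tool's prompt is not given in the article.
PROMPT_TEMPLATE = (
    "Extract the object to replace and its replacement from the command below. "
    'Reply with JSON only, using the keys "source_object" and "target".\n'
    "Command: {command}"
)


def parse_reply(reply_text):
    """Pull the JSON object out of a model reply and normalize its two fields."""
    # Models often wrap JSON in a code fence or extra prose, so search for it.
    match = re.search(r"\{.*\}", reply_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(match.group(0))
    return {
        "source_object": data["source_object"].strip().lower(),
        "target": data["target"].strip().lower(),
    }


def extract_swap(command, generate):
    """Ask the model (through the `generate` callable) which object to swap."""
    prompt = PROMPT_TEMPLATE.format(command=command)
    return parse_reply(generate(prompt))
```

In practice `generate` could be a thin wrapper around a Gemini call, for example `GenerativeModel(...).generate_content(prompt).text` from the Vertex AI SDK. Keeping it as a parameter lets the parsing logic be tested without network access.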
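
Example for step (2): a sketch of object detection with the Cloud Vision client library, followed by the conversion from normalized vertices to a pixel bounding box. `object_localization` and `localized_object_annotations` come from the `google-cloud-vision` library. The helper names and the substring match on the object name are assumptions, since the label returned by the API (for example "Shoe") may not match the user's word exactly.

```python
def to_pixel_box(normalized_vertices, width, height):
    """Convert normalized (0..1) polygon vertices to an integer pixel box.

    Returns (x_min, y_min, x_max, y_max), clamped to the image bounds.
    """
    xs = [min(max(x, 0.0), 1.0) * width for x, _ in normalized_vertices]
    ys = [min(max(y, 0.0), 1.0) * height for _, y in normalized_vertices]
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


def find_object(image_bytes, wanted_name):
    """Return (name, score, normalized_vertices) for the best match, or None."""
    # Imported lazily; requires google-cloud-vision and configured credentials.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    response = client.object_localization(image=vision.Image(content=image_bytes))
    matches = [
        obj for obj in response.localized_object_annotations
        if wanted_name.lower() in obj.name.lower()  # assumed matching rule
    ]
    if not matches:
        return None
    best = max(matches, key=lambda obj: obj.score)
    vertices = [(v.x, v.y) for v in best.bounding_poly.normalized_vertices]
    return best.name, best.score, vertices
```

The pixel box returned by `to_pixel_box` is the form that a box-prompted segmentation model expects in the next step.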
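
Example for step (3): a sketch of prompting SAM with the detected bounding box, using the `segment_anything` package. The article does not say whether it uses box prompts or fully automatic mask generation, so the box-prompt approach here is an assumption, as are the function names and the checkpoint path argument.

```python
def best_mask_index(scores):
    """Index of the candidate mask with the highest predicted quality score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def segment_box(image_rgb, box, checkpoint_path, model_type="vit_h"):
    """Segment the object inside `box` = (x_min, y_min, x_max, y_max).

    `image_rgb` is an HxWx3 uint8 RGB array. Returns a boolean HxW mask.
    """
    # Imported lazily; requires numpy, torch and the segment-anything package.
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)
    # SAM returns several candidate masks plus a quality score for each.
    masks, scores, _ = predictor.predict(box=np.array(box), multimask_output=True)
    return masks[best_mask_index(list(scores))]
```

The same function can be applied to the target product image, given a box around the product, to obtain the second mask used in step (4).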
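
Example for step (4): the black region described above is the set of pixels that belong to the original object but are not covered by the resized target. A minimal sketch of computing that region, plus a hedged call to a diffusers inpainting pipeline. Masks are plain nested lists here to keep the example dependency-free; with NumPy arrays the same operation is `object_mask & ~target_mask`. The function names and the `model_id` parameter are assumptions, and the article does not name the checkpoint it uses.

```python
def uncovered_region(object_mask, target_mask):
    """Pixels of the original object that the pasted target does not cover.

    Both masks are equally sized 2D lists of booleans.
    """
    return [
        [o and not t for o, t in zip(obj_row, tgt_row)]
        for obj_row, tgt_row in zip(object_mask, target_mask)
    ]


def inpaint(image, hole_mask_image, prompt, model_id):
    """Fill the white area of `hole_mask_image` with generated content.

    `image` and `hole_mask_image` are PIL images of the same size.
    """
    # Imported lazily; requires torch and diffusers, and a GPU in practice.
    import torch
    from diffusers import AutoPipelineForInpainting

    pipe = AutoPipelineForInpainting.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    return pipe(prompt=prompt, image=image, mask_image=hole_mask_image).images[0]
```

Diffusers inpainting pipelines repaint the white pixels of the mask and keep the black ones, so the uncovered region has to be rendered as white before it is passed in.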
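
Example for the rotation estimate: a small pure-Python Hough line transform that finds the orientation of the strongest straight line in a set of edge points. This is an illustrative sketch, not the author's implementation; the function name and the 1-degree, 1-pixel accumulator resolution are assumptions.

```python
import math
from collections import Counter


def dominant_line_angle(points, angle_step_deg=1, rho_step=1.0):
    """Estimate the direction of the strongest straight line through `points`.

    Each (x, y) point votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it; the (theta, rho) cell with the most votes wins.
    theta is the angle of the line's normal, so the line direction returned
    is (theta + 90) mod 180, in degrees within [0, 180).
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, angle_step_deg):
            theta = math.radians(theta_deg)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(theta_deg, round(rho / rho_step))] += 1
    (theta_deg, _), _ = votes.most_common(1)[0]
    return (theta_deg + 90) % 180
```

Running this on the edge points of the source object and of the target product gives two angles whose difference is the in-plane rotation to apply before superimposing. In image coordinates the y axis points down, so the sign of the angle is flipped relative to the usual mathematical convention. For real images, OpenCV's `cv2.HoughLines` provides an optimized version of the same voting scheme.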

🏷️ imagen 🏷️ machine-learning 🏷️ vertex-ai 🏷️ generative-ai 🏷️ google-cloud-platform

[Blogs] 🌎 https://medium.com/google-cloud/a-visual-remix-swap-objects-with-ease-8718b040723c?source=rss----e52cf94d98af---4

