"title"=>"A Visual Remix : Swap Objects with Ease",
"summary"=>nil,
"content"=>"
A Visual Remix : Swap Objects with Ease
Artificial intelligence (AI) has revolutionized the way we create marketing images. With a simple text prompt, we can generate stunning visuals tailored to our campaigns. The following image, generated using Imagen on Vertex AI, illustrates a sample marketing image for a targeted campaign.
The Missing Piece: Real Products
AI image generators are very good at producing synthetic images from prompts like “A young woman standing in a gym and holding a ‘Specific Brand’ sneaker in her hand”. The problem? Even if the brand logo appears correctly, that sneaker may not be an actual, purchasable product from the brand. These images, while visually appealing, lack the crucial link to real merchandise, which makes them less useful for personalized marketing campaigns. Techniques like subject tuning are promising but not yet refined enough to consistently produce marketing-grade content.
The Solution: Product Replacement
What if we could easily replace generic products in AI-generated images with the specific products we want to promote? Imagine you have a stock image of a shoe from a particular brand, as shown below.
We can achieve easy object replacement, without manually drawing bounding boxes or marking segmentation masks, with the help of the following steps:
(1) Gemini Model: an LLM that identifies the name (subject) of the product to be replaced.
(2) Google Cloud Vision API: object detection to locate the subject in the source image.
(3) Segment Anything Model (SAM): image segmentation.
(4) Diffuser Model: image inpainting.
Let’s look at each step in some detail below.
(1) Gemini for Text Processing:
Imagine that the end user of your tool will use a simple English command such as: replace ‘object name’ in the given image with target. As a first step, you need to find which object is to be replaced.
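A minimal sketch of this step, assuming the model is asked to reply with only the object name. The prompt wording and the helper names `build_extraction_prompt` and `parse_object_name` are illustrative and not taken from the original implementation. The actual call to Gemini is left out, so the model reply is passed in as a plain string.

```python
def build_extraction_prompt(command):
    '''Build a prompt asking the LLM to name the object to be replaced.'''
    # Hypothetical prompt wording; tune it for the model you actually use.
    return (
        'You are given an image-editing instruction. '
        'Reply with only the name of the object to be replaced, '
        'with no extra words or punctuation. '
        'Instruction: ' + command
    )


def parse_object_name(reply):
    '''Normalise the model reply to a lowercase object name.'''
    # Keep letters, digits, spaces and hyphens; drop quotes and punctuation.
    cleaned = ''.join(ch for ch in reply if ch.isalnum() or ch in ' -')
    return ' '.join(cleaned.lower().split())


prompt = build_extraction_prompt('replace the sneaker in the given image with target')
subject = parse_object_name(' ‘Sneaker’. ')  # a reply the model might plausibly give
```

Cleaning the reply before using it keeps the next step robust to stray quotes or a trailing period in the model output.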
(2) Google Cloud Vision API:
Use the Google Cloud Vision API to detect the desired object in the given image.
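The Vision API's object localization feature returns, for each detected object, a name, a confidence score, and a bounding polygon whose vertices are normalized to the range 0 to 1. The sketch below assumes those annotations have already been copied into plain dictionaries; the `name`, `score` and `vertices` keys are an illustrative layout, not the API's response type. It picks the detection that matches the subject and converts it to a pixel box.

```python
def pick_detection(annotations, subject):
    '''Return the highest-scoring annotation whose name contains the subject.'''
    wanted = subject.lower()
    matches = [a for a in annotations if wanted in a['name'].lower()]
    if not matches:
        return None
    return max(matches, key=lambda a: a['score'])


def to_pixel_box(vertices, width, height):
    '''Convert normalized (x, y) vertices to an (x0, y0, x1, y1) pixel box.'''
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (
        int(min(xs) * width),
        int(min(ys) * height),
        int(max(xs) * width),
        int(max(ys) * height),
    )


# Made-up detections for an 800 x 600 image, for illustration only.
detections = [
    {'name': 'Person', 'score': 0.95,
     'vertices': [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]},
    {'name': 'Shoe', 'score': 0.9,
     'vertices': [(0.25, 0.5), (0.75, 0.5), (0.75, 1.0), (0.25, 1.0)]},
]
best = pick_detection(detections, 'shoe')
box = to_pixel_box(best['vertices'], 800, 600)  # (200, 300, 600, 600)
```

A plain substring match is only a starting point: if the user says sneaker and the detector reports Shoe, nothing matches, so a real tool would need a synonym list or a second LLM call to reconcile the names.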
(3) Segment Anything (SAM) Model:
Once the object is detected by the Google Cloud Vision API, we use SAM to segment it. We perform the same operations on the target image, as shown below.
(4) Diffuser Model:
In most cases, the object in the AI-generated image (or in the given input image) will differ in shape and size from the target object. Hence, if we simply resize the target image and superimpose it on the given image, the result looks like the image below.
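The resize-and-superimpose step can be sketched as follows. This assumes the target is scaled to fit inside the detected box with its aspect ratio preserved and is then centred; the article does not state how the resize is done, so `fit_into_box` is a hypothetical helper illustrating one reasonable choice.

```python
def fit_into_box(target_size, box):
    '''Scale a target of size (w, h) to fit inside box (x0, y0, x1, y1).

    Returns the new (w, h) and the top-left (x, y) paste position.
    '''
    tw, th = target_size
    x0, y0, x1, y1 = box
    bw, bh = x1 - x0, y1 - y0
    scale = min(bw / tw, bh / th)  # largest scale that keeps the aspect ratio
    new_w, new_h = round(tw * scale), round(th * scale)
    left = x0 + (bw - new_w) // 2  # centre horizontally
    top = y0 + (bh - new_h) // 2   # centre vertically
    return (new_w, new_h), (left, top)


# A 400 x 200 target placed into the box (200, 300, 600, 600).
size, position = fit_into_box((400, 200), (200, 300, 600, 600))
# size == (400, 200), position == (200, 350)
```

Because the two objects rarely share the same outline, the pasted target will not cover every pixel of the original object, whichever placement rule is used.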
The black portion in the above image represents the part of the original object that the target image does not cover. This region needs to be filled intelligently. We use image inpainting with a diffusers model to perform this task. The following image shows the final output.
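The region to fill can be written as a mask: every pixel that belongs to the original object but is not covered by the pasted target. The sketch below computes it on small nested lists of 0 and 1 so the logic is easy to follow; `hole_mask` is a hypothetical helper, and a real pipeline would do the same thing on image arrays.

```python
def hole_mask(source_mask, pasted_mask):
    '''Mark pixels of the original object that the pasted target leaves uncovered.'''
    return [
        [1 if s and not p else 0 for s, p in zip(src_row, pst_row)]
        for src_row, pst_row in zip(source_mask, pasted_mask)
    ]


# 1 marks object pixels, 0 marks background.
source = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]
pasted = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
holes = hole_mask(source, pasted)
# holes == [[0, 1, 0, 0], [0, 1, 0, 1]]
```

Inpainting pipelines such as those in the Hugging Face Diffusers library take a mask image in which the marked pixels are repainted and the rest are preserved, so a mask like this one is the natural input for the filling step.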
We can also use methods like the ‘Hough Transform’ to calculate the angle of rotation. However, this method works only if the desired objects are rotated within the 2D image plane. As a future step, we need to incorporate 3D rotations.
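To illustrate the idea, here is a minimal pure-Python Hough vote that estimates the orientation of the dominant straight line through a set of edge points. It is a sketch under stated assumptions, not the article's implementation: `dominant_angle` is a hypothetical helper, the edge points are assumed to come from an edge detector run on the object outline, and in practice a library routine such as OpenCV's `cv2.HoughLines` would be used instead.

```python
import math


def dominant_angle(points, angle_step=1):
    '''Estimate the angle, in degrees from the x-axis, of the dominant line
    through a list of (x, y) edge points using a basic Hough accumulator.'''
    votes = {}
    for theta_deg in range(0, 180, angle_step):
        theta = math.radians(theta_deg)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        for x, y in points:
            # Each point votes for the line (theta, rho) passing through it.
            rho = round(x * cos_t + y * sin_t)
            key = (theta_deg, rho)
            votes[key] = votes.get(key, 0) + 1
    (theta_deg, _), _ = max(votes.items(), key=lambda kv: kv[1])
    # theta is the direction of the line's normal; the line is 90 degrees away.
    return (theta_deg + 90) % 180


diagonal = [(i, i) for i in range(100)]    # points on the line y = x
horizontal = [(i, 5) for i in range(100)]  # points on the line y = 5
rotation = dominant_angle(diagonal) - dominant_angle(horizontal)  # 45 degrees
```

Comparing the dominant angle of the source object with that of the target gives an in-plane rotation to apply before pasting. Note that in image coordinates the y-axis points down, so the sign of the angle needs to be checked against the convention of the imaging library in use.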
A Visual Remix : Swap Objects with Ease was originally published in Google Cloud - Community on Medium, where people are continuing the conversation by highlighting and responding to this story.
","author"=>"Bhushan Garware",
"link"=>"https://medium.com/google-cloud/a-visual-remix-swap-objects-with-ease-8718b040723c?source=rss----e52cf94d98af---4",
"published_date"=>Sun, 31 Mar 2024 02:52:00.000000000 UTC +00:00,
"image_url"=>nil,
"feed_url"=>"https://medium.com/google-cloud/a-visual-remix-swap-objects-with-ease-8718b040723c?source=rss----e52cf94d98af---4",
"language"=>nil,
"active"=>true,
"ricc_source"=>"feedjira::v1",
"created_at"=>Sun, 31 Mar 2024 21:41:06.634193000 UTC +00:00,
"updated_at"=>Mon, 13 May 2024 18:38:03.861171000 UTC +00:00,
"newspaper"=>"Google Cloud - Medium",
"macro_region"=>"Blogs"}