Google Imagen
Imagen on Vertex AI brings Google's state-of-the-art image generative AI capabilities to application developers. With Imagen on Vertex AI, application developers can build next-generation AI products that transform their users' imagination into high-quality visual assets, in just seconds.
Using Imagen in LangChain, you can perform the following tasks:
- VertexAIImageGeneratorChat: generate novel images using only a text prompt (text-to-image AI generation).
- VertexAIImageEditorChat: edit an entire uploaded or generated image with a text prompt.
- VertexAIImageCaptioning: get text descriptions of images with visual captioning.
- VertexAIVisualQnAChat: get answers to a question about an image with Visual Question Answering (VQA).
- Note: currently we only support single-turn conversation for Visual Question Answering (VQA).
Image Generation
Generate novel images using only a text prompt (text-to-image AI generation).
from langchain_core.messages import AIMessage, HumanMessage
from langchain_google_vertexai.vision_models import VertexAIImageGeneratorChat
# Create Image Generation model object
generator = VertexAIImageGeneratorChat()
messages = [HumanMessage(content=["a cat at the beach"])]
response = generator.invoke(messages)
# To view the generated Image
generated_image = response.content[0]
import base64
import io
from PIL import Image
# Parse response object to get base64 string for image
img_base64 = generated_image["image_url"]["url"].split(",")[-1]
# Convert base64 string to Image
img = Image.open(io.BytesIO(base64.decodebytes(bytes(img_base64, "utf-8"))))
# view Image
img
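The base64 parsing above is repeated in each section of this page. As a minimal sketch (the `data_url_to_bytes` helper name is my own, not part of langchain_google_vertexai), the data-URL handling can be factored out:

```python
import base64


def data_url_to_bytes(data_url: str) -> bytes:
    # A data URL looks like "data:image/png;base64,<payload>";
    # everything after the last comma is the base64-encoded image.
    b64_payload = data_url.split(",")[-1]
    return base64.b64decode(b64_payload)


# Example with a dummy payload (a real response carries PNG bytes):
url = "data:image/png;base64," + base64.b64encode(b"raw-bytes").decode("utf-8")
print(data_url_to_bytes(url))  # b'raw-bytes'
```

With this helper, the snippet above reduces to `Image.open(io.BytesIO(data_url_to_bytes(generated_image["image_url"]["url"])))`.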
Image Editing
Edit an entire uploaded or generated image with a text prompt.
Edit a generated image
from langchain_core.messages import AIMessage, HumanMessage
from langchain_google_vertexai.vision_models import (
VertexAIImageEditorChat,
VertexAIImageGeneratorChat,
)
# Create Image Generation model object
generator = VertexAIImageGeneratorChat()
# Provide a text input for image
messages = [HumanMessage(content=["a cat at the beach"])]
# call the model to generate an image
response = generator.invoke(messages)
# read the image object from the response
generated_image = response.content[0]
# Create Image Editor model Object
editor = VertexAIImageEditorChat()
# Write prompt for editing and pass the "generated_image"
messages = [HumanMessage(content=[generated_image, "a dog at the beach"])]
# Call the model for editing Image
editor_response = editor.invoke(messages)
import base64
import io
from PIL import Image
# Parse response object to get base64 string for image
edited_img_base64 = editor_response.content[0]["image_url"]["url"].split(",")[-1]
# Convert base64 string to Image
edited_img = Image.open(
io.BytesIO(base64.decodebytes(bytes(edited_img_base64, "utf-8")))
)
# view Image
edited_img
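To persist the edited result instead of only displaying it, the decoded bytes can be written straight to disk. A hedged sketch; `save_data_url_image` is an illustrative helper, not a library API:

```python
import base64
import pathlib


def save_data_url_image(data_url: str, out_path: str) -> pathlib.Path:
    # Strip the "data:image/...;base64," prefix and decode the payload,
    # then write the raw image bytes to the given path.
    payload = data_url.split(",")[-1]
    path = pathlib.Path(out_path)
    path.write_bytes(base64.b64decode(payload))
    return path


# e.g. save_data_url_image(editor_response.content[0]["image_url"]["url"], "edited.png")
```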
Image Captioning
from langchain_google_vertexai import VertexAIImageCaptioning
# Initialize the Image Captioning Object
model = VertexAIImageCaptioning()
API Reference: VertexAIImageCaptioning
Note: we are using the image generated in the Image Generation section above.
# use the image generated in the Image Generation section
img_base64 = generated_image["image_url"]["url"]
response = model.invoke(img_base64)
print(f"Generated Caption: {response}")
# Convert base64 string to Image
img = Image.open(
io.BytesIO(base64.decodebytes(bytes(img_base64.split(",")[-1], "utf-8")))
)
# display Image
img
Generated Caption: a cat sitting on the beach looking at the camera
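The captioning example reuses a generated image, but the model can equally be invoked on a local file once it is encoded as a data URL, the same format the generator returns. A sketch assuming a PNG input; `image_file_to_data_url` is my own helper name, not a library function:

```python
import base64


def image_file_to_data_url(path: str, mime: str = "image/png") -> str:
    # Read the raw image bytes and wrap them in a base64 data URL.
    with open(path, "rb") as f:
        payload = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{payload}"


# e.g. model.invoke(image_file_to_data_url("cat.png"))
```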
Visual Question Answering (VQA)
from langchain_google_vertexai import VertexAIVisualQnAChat
model = VertexAIVisualQnAChat()
API Reference: VertexAIVisualQnAChat
Note: we are using the image generated in the Image Generation section above.
question = "What animal is shown in the image?"
response = model.invoke(
input=[
HumanMessage(
content=[
{"type": "image_url", "image_url": {"url": img_base64}},
question,
]
)
]
)
print(f"question : {question}\nanswer : {response.content}")
# Convert base64 string to Image
img = Image.open(
io.BytesIO(base64.decodebytes(bytes(img_base64.split(",")[-1], "utf-8")))
)
# display Image
img
question : What animal is shown in the image?
answer : cat
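Since only single-turn conversation is supported, each new question needs a freshly built message. A small sketch of the content structure used in the call above (`build_vqa_content` is an illustrative name, not a library function):

```python
def build_vqa_content(image_data_url: str, question: str) -> list:
    # Content list mixing an image_url block with a plain-text question,
    # matching the HumanMessage content shape used in the VQA example.
    return [
        {"type": "image_url", "image_url": {"url": image_data_url}},
        question,
    ]


# e.g. HumanMessage(content=build_vqa_content(img_base64, "What animal is shown in the image?"))
```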
Related
- Tool conceptual guide
- Tool how-to guides