Trying out Google's new AI, Gemini • snuow's brain

Overview

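I played around with Gemini, mostly following the official reference.
Loading an image raised an error when I followed the reference as written, so I changed that part slightly.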

Source code

#%%
import os
import pathlib
import textwrap

import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown


def to_markdown(text):
    text = text.replace('•', '  *')
    return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

#%%
# Read the API key from the environment variable GOOGLE_API_KEY
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)

#%%
for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)

#%%
model = genai.GenerativeModel('gemini-pro')

#%%
%%time
response = model.generate_content("What is the fastest way to get better at programming?")
to_markdown(response.text)
#%%

#%%
import PIL.Image #pip install pillow

img = PIL.Image.open('sample.png').convert("RGB")
#%%
model = genai.GenerativeModel('gemini-pro-vision')

response = model.generate_content(img)

to_markdown(response.text)
#%% md
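An illustration of a man and a woman celebrating a birthday. The man is holding a cake with lit candles, and the woman is looking at it excitedly.
Both are wearing party hats, and there are stars and confetti in the background.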
#%%
response = model.generate_content(["What is likely to happen next to the man and woman in this image?", img])

to_markdown(response.text)
#%%

Walkthrough (text only)

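os is needed to read the environment variable.
textwrap is needed by to_markdown to indent the response as a block quote.
google.generativeai is needed to work with Gemini.
Markdown is needed to render the response nicely in Jupyter.

import os
import textwrap

import google.generativeai as genai

from IPython.display import Markdown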

def to_markdown(text):
    text = text.replace('•', '  *')
    return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
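
To see what to_markdown does to the text before it is handed to Markdown, the string handling can be tried on its own. The following is a minimal sketch that uses only the standard library; `quote_block` is a hypothetical name introduced here, and it mirrors the bullet replacement and the `predicate=lambda _: True` argument of the function above.

```python
import textwrap


def quote_block(text: str) -> str:
    """Mirror the string handling in to_markdown, without IPython."""
    # Turn bullet characters into indented Markdown list markers.
    text = text.replace("•", "  *")
    # predicate=lambda _: True prefixes every line, including blank ones,
    # so empty lines do not split the block quote into pieces.
    return textwrap.indent(text, "> ", predicate=lambda _: True)


sample = "Tips:\n\n• practice daily"
print(quote_block(sample))
```

Without the predicate, textwrap.indent skips lines that contain only whitespace, which would leave unprefixed blank lines inside the quote.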

Here, the API key is read from an environment variable.
The key is then passed to configure as api_key.

GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)
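
Note that os.environ.get returns None when the variable is not set, so a missing key may only surface later as a less obvious error. A minimal sketch of an early check, assuming you want to fail fast; `load_api_key` and its `env` parameter are hypothetical helpers for illustration and are not part of the SDK.

```python
import os


def load_api_key(name: str = "GOOGLE_API_KEY", env=None) -> str:
    """Return the API key from the environment, or raise a clear error."""
    # env can be replaced with a plain dict, which makes the check easy to try out.
    env = os.environ if env is None else env
    key = env.get(name)
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set")
    return key


# A dict stands in for the real environment here.
print(load_api_key(env={"GOOGLE_API_KEY": "dummy-key"}))
```

With this in place, the call above could become genai.configure(api_key=load_api_key()).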

You can get an API key by clicking the blue button on the page below.
- https://ai.google.dev/tutorials/python_quickstart

This prints the models you can use. At the time of writing there are two: "gemini-pro" and "gemini-pro-vision".

for m in genai.list_models():
    if 'generateContent' in m.supported_generation_methods:
        print(m.name)
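
The filter in this loop can be tried without an API key. Below is a rough sketch that runs the same check against stand-in objects; the model names and the `names_supporting` helper are hypothetical, and the only assumption about the real objects is that they expose `name` and `supported_generation_methods`, as the loop above uses.

```python
from types import SimpleNamespace


def names_supporting(models, method="generateContent"):
    """Return the names of models whose supported methods include `method`."""
    return [m.name for m in models if method in m.supported_generation_methods]


# Hypothetical stand-ins for the objects yielded by genai.list_models().
fake_models = [
    SimpleNamespace(
        name="models/text-model",
        supported_generation_methods=["generateContent", "countTokens"],
    ),
    SimpleNamespace(
        name="models/embedding-model",
        supported_generation_methods=["embedContent"],
    ),
]

print(names_supporting(fake_models))
```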

Create a gemini-pro model and pass it text to get a response from Gemini.
- For example, "What is the fastest way to get better at programming?"

model = genai.GenerativeModel('gemini-pro')

%%time
response = model.generate_content("What is the fastest way to get better at programming?")
to_markdown(response.text)
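
Reading response.text can itself fail when the response carries no text, for example if the prompt was blocked. The sketch below assumes that the accessor raises ValueError in that case, which is my understanding of the SDK rather than something this post verifies. `text_or_none` and the two fake response classes are hypothetical and exist only so the example runs offline.

```python
def text_or_none(response):
    """Return response.text, or None if the accessor raises ValueError."""
    try:
        return response.text
    except ValueError:
        return None


class FakeOkResponse:
    """Hypothetical stand-in for a normal response."""
    text = "hello"


class FakeEmptyResponse:
    """Hypothetical stand-in for a response that carries no text."""

    @property
    def text(self):
        raise ValueError("no text part in response")


print(text_or_none(FakeOkResponse()))
print(text_or_none(FakeEmptyResponse()))
```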

Walkthrough (text and image)

Pillow is required to load images.

pip install Pillow

Unlike the official reference, the image is converted to RGB here.

import PIL.Image #pip install pillow

# Without converting to RGB, generativeai raises an error, so convert the image.
img = PIL.Image.open('sample.png').convert("RGB")
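
The post does not say why the error occurs. One plausible cause, stated here only as an assumption, is that PNG files are often stored in a mode other than RGB, such as RGBA (with an alpha channel) or P (palette). A small sketch that converts only when needed, using an in-memory image instead of sample.png; `ensure_rgb` is a hypothetical helper.

```python
from PIL import Image  # pip install pillow


def ensure_rgb(img):
    """Return img unchanged if it is already RGB, otherwise an RGB copy."""
    if img.mode == "RGB":
        return img
    # convert("RGB") discards any alpha channel.
    return img.convert("RGB")


# An in-memory RGBA image stands in for a PNG with transparency.
rgba = Image.new("RGBA", (4, 4), (255, 0, 0, 128))
rgb = ensure_rgb(rgba)
print(rgba.mode, "->", rgb.mode)
```

Printing img.mode on your own file before converting is a quick way to check whether the mode is actually the issue.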

Input image

After that, it works the same way as the text-only case.

model = genai.GenerativeModel('gemini-pro-vision')

response = model.generate_content(img)

to_markdown(response.text)

You can also pass text and an image together as a list.

response = model.generate_content(["What is likely to happen next to the man and woman in this image?", img])

to_markdown(response.text)