ChatGPT Plugin Tutorial: How to build ChatGPT Plugin for image generation using Stable Diffusion

Tuesday, May 30, 2023 by abdibrokhim

Introduction

ChatGPT Plugins are tools that connect ChatGPT to external applications. These plugins expand ChatGPT's abilities by allowing it to interact with specific APIs created by developers. With the help of these plugins, ChatGPT can fetch up-to-date information such as sports scores, stock prices, or the latest news, and assist users with tasks such as booking a flight or ordering food.

Stable Diffusion is a text-to-image latent diffusion model. It generates an image from a text prompt by starting from random noise and removing that noise over a series of steps in a compressed latent space.

What are we going to do?

I'll walk you through, step by step, the simple and straightforward process of building a ChatGPT Plugin for image generation using Stable Diffusion. As a bonus at the end of this tutorial, I'll also show you how to integrate your plugin with ChatGPT and test it out. So sit back, relax, and let's get started!

Prerequisites

Download Visual Studio Code for your operating system, or use any other code editor such as IntelliJ IDEA or PyCharm. To use the ChatGPT Plugins API, you need to join the plugins waitlist. To use Stable Diffusion, you need an API key: go to DreamStudio, create an account, and grab your API key.

Stable Diffusion API Key

Getting started

Step 1 - Create a new project

Let's start by creating a new folder for our project. Open Visual Studio Code and create a new folder named chatgpt-plugin-stable-diffusion by running the following commands in the terminal:

mkdir chatgpt-plugin-stable-diffusion
cd chatgpt-plugin-stable-diffusion

Quick Note: For the plugin to work properly, we need to go through the following steps:

  • Building an API (Flask, FastAPI, etc.) that implements the OpenAPI specification
  • Creating the JSON manifest file that will define relevant metadata for the plugin
  • Documenting the API in the OpenAPI yaml or JSON format

Step 2 - API implementation

As a first step, we should implement the API. In this tutorial, I will use Flask, a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. However, you are free to use any other framework, such as FastAPI, Django, or Starlette.

Let's create a new file named app.py. Open the terminal and run the following command:

touch app.py

It's time to install the required dependencies. Run the following command in the terminal:

pip install flask 
pip install stability-sdk

Now, we can start implementing the API. First, we need to import the required dependencies:

from flask import Flask, request, jsonify, send_file, Response
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
import base64

Next, we need to set up the Flask app and the Stable Diffusion API client:

# Set up the Flask app
app = Flask(__name__)

# Set up the Stable Diffusion API client
stability_api = client.StabilityInference(
    key='',  # Replace with your Stability API key
    verbose=True,  # Set to True to enable verbose logging
    engine="stable-diffusion-xl-beta-v2-2-2"  # Replace with the engine you want to use
)
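Hardcoding the key in app.py makes it easy to leak by accident, for example when the file is pushed to a public repository. A minimal sketch of reading it from an environment variable instead is shown here. The variable name STABILITY_API_KEY and the helper load_api_key are hypothetical choices for this tutorial, not something the Stability SDK requires.

```python
import os

# Hypothetical variable name; any name works as long as it matches your shell setup
ENV_VAR = "STABILITY_API_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable first")
    return key
```

With this in place, the client setup could pass key=load_api_key() instead of an empty string, after exporting the variable in the terminal that runs the app.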

And last but not least, we need to define the API endpoint for generating images and the entry point that starts a local development server:

# Define the API endpoint for generating images
@app.route('/generate-image', methods=['POST'])
def generate_image():

    # Get the prompt and other parameters from the request
    data = request.json  # Get the request payload
    prompt = data.get('prompt')  # Prompt to generate the image from
    seed = data.get('seed', None)  # Set to None to use a random seed
    steps = data.get('steps', 30)  # Number of steps to run the diffusion for
    cfg_scale = data.get('cfg_scale', 8.0)  # Guidance scale: how strongly the image should follow the prompt
    width = data.get('width', 512)  # Width of the generated image
    height = data.get('height', 512)  # Height of the generated image
    samples = data.get('samples', 1)  # Number of samples to generate

    # Generate the image using Stable Diffusion
    answers = stability_api.generate(  # Call the generate() method
        prompt=prompt,
        seed=seed,
        steps=steps,
        cfg_scale=cfg_scale,
        width=width,
        height=height,
        samples=samples,
        sampler=generation.SAMPLER_K_DPMPP_2M
    )

    # Retrieve the generated image(s) from the response
    # Extract and encode the generated image(s) so that they can be easily transmitted in the API response as a JSON object
    generated_images = []
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.type == generation.ARTIFACT_IMAGE:
                encoded_image = base64.b64encode(artifact.binary).decode('utf-8')  # Base64-encode the image bytes, then decode the result to a UTF-8 string
                generated_images.append(encoded_image)

    # Return the generated image(s) as the API response
    return jsonify(images=generated_images)


# Run the Flask app locally
if __name__ == '__main__':
    # Set debug=True to enable auto-reloading when you make changes to the code
    app.run(debug=True, host='127.0.0.1', port=5000)

Quick Note: The debug=True argument enables automatic reloading of the server when changes are made to the code, which is useful during the development process. The host='127.0.0.1' argument sets the IP address for the server to localhost, and the port=5000 argument sets the port number for the server to listen on.
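The endpoint above trusts whatever the client sends, so a request without a prompt would only fail once it reaches the Stability API. A small validation helper can reject bad payloads early. The sketch below is one possible approach: the function name parse_generation_params and the upper bound on samples are assumptions made for illustration, not limits taken from the API.

```python
def parse_generation_params(data):
    """Validate a request payload and fill in the tutorial's defaults.

    Returns a (params, error) tuple; exactly one of the two is None.
    """
    if not isinstance(data, dict):
        return None, "Request body must be a JSON object"

    prompt = data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        return None, "'prompt' is required and must be a non-empty string"

    params = {
        "prompt": prompt.strip(),
        "seed": data.get("seed"),  # None means a random seed
        "steps": data.get("steps", 30),
        "cfg_scale": data.get("cfg_scale", 8.0),
        "width": data.get("width", 512),
        "height": data.get("height", 512),
        "samples": data.get("samples", 1),
    }

    # Illustrative bound (an assumption, not an API limit): keep samples small
    samples = params["samples"]
    if not isinstance(samples, int) or not 1 <= samples <= 4:
        return None, "'samples' must be an integer between 1 and 4"

    return params, None
```

Inside generate_image(), this could be called as params, error = parse_generation_params(request.get_json(silent=True)), returning jsonify(error=error) with status 400 when error is set.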

Perfect! Now we can run the Flask app and test the API endpoint.

Create a new file named test.py and copy/paste the following code:

import requests
import json
import base64

# Define the API endpoint URL
url = 'http://127.0.0.1:5000/generate-image'

# Set up the request payload
payload = {
    'prompt': 'a donut',
    'seed': 992446758,
    'steps': 30,
    'cfg_scale': 8.0,
    'width': 512,
    'height': 512,
    'samples': 1
}

# Send the POST request to generate the image
response = requests.post(url, json=payload)

# Check the response status code
if response.status_code == 200:
    # Get the generated images from the response
    data = response.json()
    generated_images = data['images']

    # Process the generated images
    for i, encoded_image in enumerate(generated_images):
        # Decode the base64-encoded image
        decoded_image = base64.b64decode(encoded_image)

        # Save the image to a file
        image_filename = f'generated_image_{i}.png'
        with open(image_filename, 'wb') as image_file:
            image_file.write(decoded_image)

        print(f'Saved generated image {i+1} as {image_filename}')
else:
    print(f'Request failed with status code {response.status_code}')

Don't forget to install the requests library by running pip install requests in your terminal.

Now, run the Flask app with python app.py and make sure it is running correctly. Then open a new terminal window and run the test script with python test.py. After a couple of seconds, you should see the generated image in your folder.
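If the saved file does not open, it helps to check that the base64 round trip worked and that the bytes really are an image. The sketch below assumes the API returns PNG data, as the .png file name in test.py suggests. The helper decode_image is hypothetical, and the sample bytes are a stand-in rather than a real image.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # First 8 bytes of every PNG file


def decode_image(encoded_image):
    """Decode a base64 string and report whether the bytes look like a PNG."""
    raw = base64.b64decode(encoded_image)
    return raw, raw.startswith(PNG_SIGNATURE)


# Round trip with stand-in bytes (just the signature plus filler)
fake_png = PNG_SIGNATURE + b"placeholder"
encoded = base64.b64encode(fake_png).decode("utf-8")
raw, looks_like_png = decode_image(encoded)
print(raw == fake_png, looks_like_png)
```

In test.py, the same check could run on each entry of generated_images before writing it to disk.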

AI generated art by Stable Diffusion

Congratulations! We've successfully built the API for image generation using Stable Diffusion. Feel free to explore further and experiment with different prompts and parameters to unleash the full potential of image generation using Stable Diffusion.

Step 3 - Plugin manifest

Every plugin requires an ai-plugin.json manifest file, which will later be hosted on the API's domain. Read more here.

Create a new file named ai-plugin.json and paste the following JSON. No authentication is required here, so auth.type is set to "none" (other authentication types are available). The api.url field points to the OpenAPI specification and logo_url points to the plugin logo, both served by the Flask app. JSON does not allow comments, so keep the file free of them:

{
    "schema_version": "v1",
    "name_for_human": "Image Generation Plugin",
    "name_for_model": "image-generation",
    "description_for_human": "Plugin for generating high-quality images using Stable Diffusion.",
    "description_for_model": "Plugin for generating high-quality images using Stable Diffusion.",
    "auth": {
        "type": "none"
    },
    "api": {
        "type": "openapi",
        "url": "http://127.0.0.1:5000/openapi.yaml"
    },
    "logo_url": "http://127.0.0.1:5000/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal"
}
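A manifest that is not valid JSON will fail to load, and strict JSON allows neither comments nor trailing commas. A quick sanity check before hosting the file might look like the sketch below. The helper check_manifest is hypothetical, and EXPECTED_FIELDS simply lists the top-level keys used in this tutorial's manifest rather than an official schema.

```python
import json

# Top-level fields used by the manifest in this tutorial
EXPECTED_FIELDS = [
    "schema_version",
    "name_for_human",
    "name_for_model",
    "description_for_human",
    "description_for_model",
    "auth",
    "api",
    "logo_url",
    "contact_email",
    "legal_info_url",
]


def check_manifest(text):
    """Parse manifest text and return the list of missing top-level fields."""
    manifest = json.loads(text)  # Raises json.JSONDecodeError on invalid JSON
    return [field for field in EXPECTED_FIELDS if field not in manifest]


# Example with an incomplete manifest: every field except schema_version is reported
missing = check_manifest('{"schema_version": "v1"}')
print(missing)
```

To check the real file, read it first, for example with open('ai-plugin.json') as f: print(check_manifest(f.read())). An empty list means all expected fields are present.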

Step 4 - OpenAPI specification

The OpenAPI specification is a standard for describing REST APIs. ChatGPT reads it to learn which endpoints the plugin exposes and how to call them. Read more here.

Create a new file named openapi.yaml and copy/paste the following code:

openapi: 3.0.1
info:
  title: Image Generation Plugin
  description: A plugin that generates high-quality images using Stable Diffusion.
  version: "v1"
servers:
  - url: http://127.0.0.1:5000  # URL to the Flask app
paths:
  /generate-image:
    post:
      operationId: generateImage
      summary: Generate an image
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/generateImageRequest"
      responses:
        "200":
          description: OK
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/generateImageResponse"

components:
  schemas:
    generateImageRequest:
      type: object
      required:
        - prompt
      properties:
        prompt:
          type: string
          description: The prompt for image generation.
        seed:
          type: integer
          description: The seed for deterministic generation.
    generateImageResponse:
      type: object
      properties:
        images:
          type: array
          items:
            type: string
          description: The generated image(s) in base64 format.

Now, go back to the app.py file and add endpoints for the plugin's logo, manifest, and OpenAPI specification:

# Define the API endpoint for the plugin logo
@app.route('/logo.png', methods=['GET'])
def plugin_logo():
    filename = 'logo.png'
    return send_file(filename, mimetype='image/png')


# Define the API endpoint for the plugin manifest
# ChatGPT looks for the manifest under /.well-known/ on the plugin's domain
@app.route('/.well-known/ai-plugin.json', methods=['GET'])
def plugin_manifest():
    with open('./ai-plugin.json') as f:
        text = f.read()
        return Response(text, mimetype='text/json')


# Define the API endpoint for the OpenAPI specification
@app.route('/openapi.yaml', methods=['GET'])
def openapi_spec():
    with open('./openapi.yaml') as f:
        text = f.read()
        return Response(text, mimetype='text/yaml')
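When the plugin runs on localhost, requests to these endpoints come from the ChatGPT page in your browser, so the server may need to send CORS headers that allow that origin. The sketch below shows the idea using a plain helper. The function cors_headers is hypothetical, and the allowed origin is an assumption based on the ChatGPT URL used in this tutorial.

```python
# Assumption: the ChatGPT web UI is served from this origin
ALLOWED_ORIGIN = "https://chat.openai.com"


def cors_headers(request_origin):
    """Return CORS headers for the allowed origin, or an empty dict otherwise."""
    if request_origin != ALLOWED_ORIGIN:
        return {}
    return {
        "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }


# One way to wire this into app.py (untested sketch):
# @app.after_request
# def add_cors(response):
#     response.headers.update(cors_headers(request.headers.get("Origin")))
#     return response
```

A ready-made extension such as flask-cors would do the same job with less code. The helper is only meant to show which headers are involved.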

All done! You have now completed all four steps and have a server up and running, with your functionality ready to be called. Next comes the step of actually integrating it with ChatGPT.

Here is the full code of app.py file:

from flask import Flask, request, jsonify, send_file, Response
from stability_sdk import client
import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation
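import base64


# Set up the Flask app
app = Flask(__name__)

# Set up the Stable Diffusion API client
stability_api = client.StabilityInference(
    key='',  # Replace with your Stability API key
    verbose=True,  # Set to True to enable verbose logging
    engine="stable-diffusion-xl-beta-v2-2-2"  # Replace with the engine you want to use
)


# Define the API endpoint for generating images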
@app.route('/generate-image', methods=['POST'])
def generate_image():

    # Get the prompt and other parameters from the request
    data = request.json  # Get the request payload
    prompt = data.get('prompt')  # Prompt to generate the image from
    seed = data.get('seed', None)  # Set to None to use a random seed
    steps = data.get('steps', 30)  # Number of steps to run the diffusion for
    cfg_scale = data.get('cfg_scale', 8.0)  # Guidance scale: how strongly the image should follow the prompt
    width = data.get('width', 512)  # Width of the generated image
    height = data.get('height', 512)  # Height of the generated image
    samples = data.get('samples', 1)  # Number of samples to generate

    # Generate the image using Stable Diffusion
    answers = stability_api.generate(  # Call the generate() method
        prompt=prompt,
        seed=seed,
        steps=steps,
        cfg_scale=cfg_scale,
        width=width,
        height=height,
        samples=samples,
        sampler=generation.SAMPLER_K_DPMPP_2M
    )

    # Retrieve the generated image(s) from the response
    generated_images = []
    for resp in answers:
        for artifact in resp.artifacts:
            if artifact.type == generation.ARTIFACT_IMAGE:
                encoded_image = base64.b64encode(artifact.binary).decode('utf-8')
                generated_images.append(encoded_image)

    # Return the generated image(s) as the API response
    return jsonify(images=generated_images)


# Define the API endpoint for the plugin logo
@app.route('/logo.png', methods=['GET'])
def plugin_logo():
    filename = 'logo.png'
    return send_file(filename, mimetype='image/png')


# Define the API endpoint for the plugin manifest
# ChatGPT looks for the manifest under /.well-known/ on the plugin's domain
@app.route('/.well-known/ai-plugin.json', methods=['GET'])
def plugin_manifest():
    with open('./ai-plugin.json') as f:
        text = f.read()
        return Response(text, mimetype='text/json')


# Define the API endpoint for the OpenAPI specification
@app.route('/openapi.yaml', methods=['GET'])
def openapi_spec():
    with open('./openapi.yaml') as f:
        text = f.read()
        return Response(text, mimetype='text/yaml')


# Run the Flask app locally
if __name__ == '__main__':
    # Set debug=True to enable auto-reloading when you make changes to the code
    app.run(debug=True, host='127.0.0.1', port=5000)

Open https://chat.openai.com/, go to Plugin store > Develop your own plugin, and then click My manifest is ready. Copy/paste the base URL of our app, which is http://127.0.0.1:5000 in our case (the Flask development server serves plain HTTP). Then click Find manifest file > Next > Install for me > Continue > Install plugin.

Wow! Our plugin is installed and ready to use. Now use it however you like!

Conclusion!

AI generated art by Midjourney

In this tutorial, we have explored the process of building a ChatGPT Plugin for image generation using Stable Diffusion. By leveraging the power of Stable Diffusion, we can enhance ChatGPT's capabilities to generate realistic and diverse images from a single text prompt.

By the way, plugins play a crucial role in expanding the functionality of ChatGPT, allowing it to interact with external applications and APIs.

Thank you for following along with this tutorial, and I hope you found it valuable.

made with 💜 by abdibrokhim for lablab.ai tutorials.