GitHub

🎨 Image Generation & Editing

LibreChat comes with built-in image tools that you can add to an Agent.

Each has its own look, price-point, and setup step (usually just an API key or URL).

Tool	Best for	Needs
OpenAI Image Tools	Cutting-edge results (GPT-Image-1). Can also edit the images you upload.	OpenAI API
DALL·E (3 / 2)	Legacy OpenAI Image models.	OpenAI API
Stable Diffusion	Local or self-hosted generation, endless community models.	Automatic1111 API
Flux	Fast cloud renders, optional fine-tunes.	Flux API
MCP	Bring-your-own-Image-Generators	MCP server with image output support

Notes:

API keys can be omitted in favor of allowing the user to enter their own key from the UI.
Image Outputs are directly sent to the LLM as part of the immediate chat context following generation.
- The LLM will only get vision context from images attached to user messages, and not from generations/edits, except for immediately after generation.
- See Image Storage and Handling for more details.
MCP Server tool image outputs are supported, which may output images similarly to LC’s built-in tools.
- Note: MCP servers may or may not use the correct format when outputting images. See details in the MCP section below.

1 · OpenAI Image Tools (recommended)

Features

”OpenAI Image Tools” are an agent toolkit made up of 2 separate tools.

Image Generation:
- Create brand-new images from text prompts (no upload required).
Image Editing:
- Edit or remix the images you just uploaded—change colours, add objects, extend the canvas, etc.
Both use OpenAI’s latest image generation model, GPT-Image-1, for superior instruction following, text rendering, detailed editing, real-world knowledge
See OpenAI’s Image Generation documentation for more details.

Generation vs. Editing

Use-case	Invokes
”Start from scratch”	Image Generation
”Use existing image(s)“	Image Editing

The agent decides which tool to use based on the context:

Image Generation creates brand new images from text descriptions only
Image Editing modifies or remixes existing images using their image IDs
- These can be images from the current message or previously generated/referenced images
- The LLM keeps track of image IDs as long as they remain in the context window
- Includes the referenced image IDs in the tool output
Both tools are always available, but the LLM will choose the appropriate one based on the user’s request
Both tools will include the generated image ID in the tool output

⚠️ Important

Image editing relies on image IDs, which are retained in the chat history.
When files are uploaded to the current request, their image IDs are added to the context of the LLM before any tokens are generated.
Previously referenced or generated image IDs can be used for editing, as long as they remain within the context window.
The LLM can include any relevant image IDs in the image_ids array when calling the image editing tool.
You can also attach previously uploaded images from the side panel without needing to upload them again.
- This also has the added benefit of providing a vision model with the image context, which can be useful for informing the prompt for the image editing tool.

Parameters

Image Generation

• prompt – text description (required)
• size – auto (default), 1024x1024 (square), 1536x1024 (landscape), or 1024x1536 (portrait)
• quality – auto (default), high, medium, or low
• background – auto (default), transparent, or opaque (transparent requires PNG or WebP format)

Image Editing

• image_ids – array of image IDs to use as reference for editing (required) • prompt – text description of the changes (required)
• size – auto (default), 1024x1024, 1536x1024, 1024x1536, 256x256, or 512x512
• quality – auto (default), high, medium, or low

Setup

Create or reuse an OpenAI key and add to .env:

IMAGE_GEN_OAI_API_KEY=sk-...
# optional extras
IMAGE_GEN_OAI_BASEURL=https://...

For Azure OpenAI deployments, you will first need access: https://aka.ms/oai/gptimage1access

Then, add your corresponding credentials to your .env file:

IMAGE_GEN_OAI_API_KEY=your-api-key
# optional extras
IMAGE_GEN_OAI_BASEURL=https://deploymentname.openai.azure.com/openai/deployments/gpt-image-1/
IMAGE_GEN_OAI_AZURE_API_VERSION=2025-04-01-preview

Then add “OpenAI Image Tools” to your Agent’s Tools list.

Advanced Configuration

You can customize the tool descriptions and prompt guidance by setting these environment variables:

# Image Generation Tool Descriptions
IMAGE_GEN_OAI_DESCRIPTION=...
IMAGE_GEN_OAI_PROMPT_DESCRIPTION=...
 
# Image Editing Tool Descriptions
IMAGE_EDIT_OAI_DESCRIPTION=...
IMAGE_EDIT_OAI_PROMPT_DESCRIPTION=...

Pricing

See the GPT-Image-1 pricing page and Image Generation Documentation for details on costs associated with image generation.

2 · DALL·E (legacy)

DALL·E provides high-quality image generation using OpenAI’s legacy image models.

Parameters

• prompt – Text description of the desired image (required, up to 4000 characters)
• style – vivid (hyper-real, dramatic - default) or natural (less hyper-real)
• quality – standard (default) or hd
• size – 1024x1024 (default/square), 1792x1024 (wide), or 1024x1792 (tall)

Setup

# Required
DALLE_API_KEY=sk-...  # or DALLE3_API_KEY=sk-...
 
# Optional
DALLE_REVERSE_PROXY=https://...  # Alternative endpoint
DALLE3_BASEURL=https://...  # For Azure or custom endpoints
DALLE3_AZURE_API_VERSION=2023-12-01-preview  # For Azure deployments
DALLE3_SYSTEM_PROMPT=...  # Custom system prompt for DALL·E

Advanced Configuration

For Azure OpenAI deployments, configure both the base URL and API version:

DALLE3_BASEURL=https://your-resource-name.openai.azure.com/openai/deployments/your-deployment-name
DALLE3_AZURE_API_VERSION=2023-12-01-preview
DALLE3_API_KEY=your-azure-api-key

Enable the DALL·E tool for the Agent and start prompting.

Pricing

See the DALL-E pricing page and Image Generation Documentation for details on costs associated with image generation.

3 · Stable Diffusion (local)

Run images entirely on your own machine or server.
Point LibreChat at any Automatic1111 (or compatible) endpoint and you’re set.

Parameters

• prompt – Detailed keywords describing desired elements in the image (required)
• negative_prompt – Keywords describing elements to exclude from the image (required)

The Stable Diffusion implementation uses these default parameters:

cfg_scale: 4.5
steps: 22
width: 1024
height: 1024

These values are currently fixed but provide good results for most use cases.

Setup

SD_WEBUI_URL=http://127.0.0.1:7860  # URL to your Automatic1111 WebUI

No API key required—just the reachable URL.

More details on setting up Automatic1111 can be found in the dedicated Stable Diffusion guide.

4 · Flux

Cloud generator with an emphasis on speed and optional fine-tuned models.

Features

Fast cloud-based image generation
Support for fine-tuned models
Multiple quality levels and aspect ratios
Raw mode for less processed, more natural-looking images

Parameters

The Flux tool supports three main actions:

generate - Create a new image from a text prompt
generate_finetuned - Create an image using a fine-tuned model
list_finetunes - List available custom models for the user

More details can be found in the dedicated Flux guide.

Setup

FLUX_API_KEY=flux_live_...
FLUX_API_BASE_URL=https://api.us1.bfl.ai   # default is fine for most users

Choose the Flux tool inside the Agent. Prompts are plain text; one call produces one image.

Pricing

See the Flux pricing page for details on costs associated with image generation.

5 · Model Context Protocol (MCP)

Image outputs are supported from MCP servers.

For example, the Puppeteer MCP Server can be used to generate screenshots of web pages, which correctly output the image in the expected format and is treated the same as LC’s built-in image tools.

The examples below assume LibreChat is running outside of Docker, directly using Node.js. The Model Context Protocol is a relatively new framework, and many developers are still learning how to properly serve their systems with uv/node for scalable distribution.

As this technology is still emerging, there are currently few image-generating servers available, and many existing ones have yet to adopt the correct response format for images.

While many MCP servers do function well within Docker, the following examples do not, or not without more advanced configurations, showcasing some of the current inconsistency between MCP servers.

mcpServers:
  puppeteer:
    command: npx
    args:
      - -y
      - "@modelcontextprotocol/server-puppeteer"

The following is an example of an Image Generation server that outputs images using Replicate API, but returns URLs of the images, which doesn’t conform to MCP’s image response standard.

Note: for this particular server, you need to install the @gongrzhe/image-gen-server package globally using npm, i.e. npm install -g @gongrzhe/image-gen-server, then point to the package’s compiled files as shown below.

mcpServers:
  image-gen:
    command: "node"
    # First, install the package globally using npm:
    # `npm install -g @gongrzhe/image-gen-server`
    # Then, point to the location of the installed package,
    # which you can find by running `npm root -g`
    args:
      - "{REPLACE_WITH_NODE_MODULES_LOCATION}/@gongrzhe/image-gen-server/build/index.js"
      # Example with output from `npm root -g`:
      # - "/home/danny/.nvm/versions/node/v20.19.0/lib/node_modules/@gongrzhe/image-gen-server/build/index.js"
    env:
      # Do not hardcode the API token here, use the environment variable instead
      # The following will pick up the token from your .env file or environment
      REPLICATE_API_TOKEN: "${REPLICATE_API_TOKEN}"
      MODEL: "google/imagen-3"

Image Storage and Handling

All generated images are:

Saved according to the configured fileStrategy
Displayed directly in the chat interface
Image tool outputs are directly sent to the LLM as part of the immediate chat context following generation.

This may create issues if you are using an LLM that does not support image inputs.
There will be an option to disable this behavior on a per-agent-basis in the future.
These outputs are only directly sent to the LLM upon generation, not on every message.
To include the image in the chat, you can directly attach it to the message from the side panel.
To summarize, the LLM will only get vision context from images attached to user messages, and not from generations/edits, except for immediately after generation.

Proxy Support

All image generation tools support proxy configuration through the PROXY environment variable:

PROXY=http://proxy-url:port

Error Handling

If any of the tools encounter an error, they will return an error message explaining what went wrong. Common issues include:

Invalid API key
API unavailability
Content policy violations
Proxy/network issues
Invalid parameters
Unsupported image payload (see Image Storage and Handling above)

Prompting

Though you can customize the prompts for OpenAI Image Tools and DALL·E, the following tips inform the default prompts supplied by the tools, which is helpful to know for your own writing/prompting.

Start with the subject and style (photo, oil painting, etc.).
Add composition and camera / medium (“wide-angle shot of…”, “watercolour…”).
Mention lighting & mood (“golden hour”, “dramatic shadows”).
Finish with detail keywords (textures, colours, expressions).
Keep negatives positive—describe what should be included, not what to avoid.

Example:

A cinematic photo of an antique library bathed in warm afternoon sunlight. Tall wooden shelves overflow with leather-bound books, and dust particles shimmer in the light. A single green-shaded banker’s lamp illuminates an open atlas on a polished mahogany desk in the foreground. 85 mm lens, shallow depth of field, rich amber tones, ultra-high detail.

Memory Upload as Text