Documentation: GPT Action (AI Analysis with OpenAI)

Overview

The GPT Action is an automation node that allows you to use OpenAI (GPT) artificial intelligence models to process and analyze different types of content: text, images, audio, and video. It is part of the AI Models node family and shares the same structure as actions from other providers (Gemini, Claude, Grok).

In IoT and security environments, this node can, for example, automatically analyze a camera image when an alarm is triggered (detecting people, vehicles, or objects), generate descriptions, or process texts and reports.


When to use this action?

Use this action when you need to:

  • Analyze images from cameras automatically (detect people/objects, read text, describe the scene).
  • Generate images from text descriptions.
  • Analyze video to extract information or detect events.
  • Process text (summarize, translate, extract data) with AI.
  • Integrate OpenAI capabilities into your automations.

Node Configuration

The configuration is divided into two sections, switchable with the top selector: Basic Configuration and Prompt Configuration. It also includes the JSON Editor tab.

Empty basic configuration of the GPT node

Section: Basic Configuration

1. API Key *Required

Select the OpenAI credential that authenticates access. The credential is managed in a centralized and secure way (the API key is not written directly in the node).

2. Resource Type *Required

The type of content to process: Text, Image, Audio, or Video. Determines which additional fields appear.

3. Model *Required

The OpenAI model to use:

Model | Value
GPT-4o | gpt-4o
GPT-4o Mini | gpt-4o-mini
GPT-4 Turbo | gpt-4-turbo
GPT-4 | gpt-4
GPT-3.5 Turbo | gpt-3.5-turbo
o1 | o1
o1-mini | o1-mini

Basic configuration of the GPT node with Image resource

Section: Prompt Configuration

4. Operation *For Image and Video

  • For Image: Generate Image (generate) or Analyze Image (analyze).
  • For Video: Analyze Video (analyze_video).
  • For Text and Audio it does not appear (only one operation is available).

5. Image / Video URLs

For image or video analysis, enter the URLs (one per line). Supports template expressions (e.g., {{get_snapshot_node.url}} to analyze the snapshot obtained by a previous node).
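As an illustration of how the one-URL-per-line field might look in the JSON Editor, the sketch below assumes that each line becomes one element of the image_urls array. The second URL is a hypothetical placeholder, not a real camera endpoint, and the exact serialization produced by the editor may differ.

```json
{
  "api_key": "",
  "resource": "image",
  "operation": "analyze",
  "model_id": "gpt-4o",
  "image_urls": [
    "{{get_snapshot_node.url}}",
    "https://example.com/cameras/entrance.jpg"
  ],
  "video_urls": [],
  "prompt": "Are there people in this security image?"
}
```

Mixing a template expression with a static URL, as shown here, is an assumption made for illustration; if in doubt, test each kind of URL separately.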

6. Prompt *Required

The instruction or question for the model (e.g., "Are there people in this security image?"). Supports template expressions.

Prompt configuration of the GPT node


JSON Editor View

JSON Editor view of the GPT node


JSON Structure (Input Parameters)

{
  "api_key": "",
  "resource": "image",
  "operation": "analyze",
  "model_id": "gpt-4o",
  "image_urls": [
    "{{get_snapshot_node.url}}"
  ],
  "video_urls": [],
  "prompt": "Analyze this security image. Are there people? Describe their position and whether there is suspicious activity."
}

JSON Fields

Field | Type | Description
api_key | string | Reference to the OpenAI credential (managed securely).
resource | string | Resource type: text, image, audio, video.
operation | string | Operation (for image: generate/analyze; for video: analyze_video).
model_id | string | GPT model ID (e.g. gpt-4o).
image_urls | array (string) | URLs of images to analyze.
video_urls | array (string) | URLs of videos to analyze.
prompt | string | The instruction/question for the model.
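To show how the same fields might be combined for a video resource, here is a minimal sketch. It assumes a previous node with the hypothetical key get_clip_node that exposes a url value, and it picks gpt-4o arbitrarily; neither choice comes from the platform's documentation.

```json
{
  "api_key": "",
  "resource": "video",
  "operation": "analyze_video",
  "model_id": "gpt-4o",
  "image_urls": [],
  "video_urls": [
    "{{get_clip_node.url}}"
  ],
  "prompt": "Describe the events in this video and note whether any person enters the frame."
}
```

The empty image_urls array mirrors the structure of the image example, where the unused video_urls array is also left empty.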

Output: Where the node's data comes from

The analysis result (the text generated by the model, such as a description or extracted data, or the resulting URL for image generation) is available in the node's output. Downstream nodes can reference it with a template expression of the form {{node_key}}, where node_key is the key of this node.


Usage Examples

Example 1: Intelligent alarm verification with a camera

Use case: When a motion alarm triggers, a Get snapshot node captures the camera image and this node analyzes it with GPT-4o to confirm whether there is actually a person.

  • Resource Type: Image | Model: gpt-4o
  • Operation: Analyze Image
  • Image URLs: {{get_snapshot_node.url}}
  • Prompt: Analyze this security image. Are there people? Describe their position and whether there is suspicious activity.

(see JSON structure above)

Example 2: Summary of a text report

Use case: Process a long text (e.g., a log) to extract the key ideas.

  • Resource Type: Text
  • Prompt: Summarize the following text in 3 key points: {{trigger.body.text}}
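A possible JSON equivalent of this example is sketched below. It assumes that the operation field is simply omitted for text (since the Operation selector does not appear for that resource type) and uses gpt-4o-mini as an arbitrary model choice; the example itself does not specify a model.

```json
{
  "api_key": "",
  "resource": "text",
  "model_id": "gpt-4o-mini",
  "image_urls": [],
  "video_urls": [],
  "prompt": "Summarize the following text in 3 key points: {{trigger.body.text}}"
}
```

The JSON Editor may serialize unused fields differently, so treat this as a starting point and compare it with what the editor generates.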

Validation and Errors

Condition | Common cause / fix
Authentication error | The OpenAI credential is invalid or lacks permissions/balance.
URLs not working | Make sure the image/video URLs are publicly accessible (convert internal Docker URLs to their public path).
Usage limit exceeded | Check the limits/quota on your OpenAI account; consider a lighter model (gpt-4o-mini).

Best Practices

  • Use centralized credentials: Do not write the API key in the node; select a credential managed securely.
  • Choose the right model: Use gpt-4o or gpt-4o-mini for vision; prefer a lighter model such as gpt-4o-mini to reduce costs.
  • Specific prompts: Ask concrete questions to get actionable responses (ideal for confirming or dismissing alarms).
  • Chain with snapshot capture: The typical pattern is Get snapshot → GPT (Analyze Image) → condition/notification based on the result.