Documentation: Grok Action (AI Analysis with xAI)

Overview

The Grok Action is an automation node that allows you to use xAI's Grok artificial intelligence models to process and analyze content: text, images, audio, and video. It belongs to the AI Models node family and shares the same structure as actions from other providers (Gemini, GPT, Claude).

In IoT and security environments, it is useful for analyzing camera images with AI when an event occurs, generating structured descriptions (e.g., in JSON format), or processing text.

When to use this action?

Use this action when you need to:

Analyze images from cameras (detect objects/people, describe the scene).
Generate images from text descriptions.
Analyze video to extract information or detect events.
Process text with Grok models.
Integrate xAI capabilities into your automations.

Node Configuration

The configuration is split into two sections, Basic Configuration and Prompt Configuration, which you switch between with the selector at the top. A JSON Editor tab is also available.

Empty basic configuration of the Grok node

Section: Basic Configuration

1. API Key *Required

Select the Grok (xAI) credential that authenticates access (managed in a centralized and secure way).

2. Resource Type *Required

The type of content to process: Text, Image, Audio, or Video.

3. Model *Required

The Grok model to use:

| Model | Value |
| --- | --- |
| Grok 2 | `grok-2-1212` |
| Grok 2 Vision | `grok-2-vision-1212` |
| Grok Beta | `grok-beta` |
| Grok Vision Beta | `grok-vision-beta` |

Basic configuration of the Grok node with Image resource

Section: Prompt Configuration

4. Operation *For Image and Video

For Image: Generate Image or Analyze Image.
For Video: Analyze Video.

5. Image / Video URLs

For analysis, enter the URLs (one per line). Supports template expressions (e.g., {{get_snapshot_node.url}}).
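To illustrate how a one-URL-per-line field with template expressions could end up as a list of concrete URLs, here is a minimal Python sketch. It is not the platform's actual parser or template engine: the helper names `parse_url_field` and `render_templates`, the nested-dictionary context, and the `example.com` URLs are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for {{node_key.field}} placeholders (an assumption,
# not the platform's documented template syntax beyond the example shown).
TEMPLATE_RE = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def parse_url_field(raw: str) -> list[str]:
    """Split a multi-line field into trimmed, non-empty entries."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def render_templates(value: str, context: dict) -> str:
    """Replace each {{a.b}} placeholder with the value found in a nested dict."""
    def lookup(match: re.Match) -> str:
        current = context
        for part in match.group(1).split("."):
            current = current[part]
        return str(current)

    return TEMPLATE_RE.sub(lookup, value)


# Example field content: one template expression and one literal URL.
raw_field = "{{get_snapshot_node.url}}\nhttps://example.com/cam2.jpg\n"
context = {"get_snapshot_node": {"url": "https://example.com/cam1.jpg"}}

urls = [render_templates(u, context) for u in parse_url_field(raw_field)]
print(urls)
```

In this sketch an unknown placeholder raises `KeyError`, which makes a mistyped node key visible early rather than sending an unresolved expression to the model.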

6. Prompt *Required

The instruction/question for the model. Supports template expressions.

Prompt configuration of the Grok node

JSON Editor View

JSON Editor view of the Grok node

JSON Structure (Input Parameters)

{
  "api_key": "",
  "resource": "image",
  "operation": "analyze",
  "model_id": "grok-2-vision-1212",
  "image_urls": [
    "{{get_snapshot_node.url}}"
  ],
  "video_urls": [],
  "prompt": "Describe in JSON format the objects detected in this image from the industrial camera."
}

JSON Fields

| Field | Type | Description |
| --- | --- | --- |
| `api_key` | string | Reference to the Grok/xAI credential (managed securely). |
| `resource` | string | Resource type: `text`, `image`, `audio`, `video`. |
| `operation` | string | Operation (image: `generate`/`analyze`; video: `analyze_video`). |
| `model_id` | string | Grok model ID (e.g. `grok-2-vision-1212`). |
| `image_urls` | array (string) | URLs of images to analyze. |
| `video_urls` | array (string) | URLs of videos to analyze. |
| `prompt` | string | The instruction/question for the model. |
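When editing the node through the JSON Editor, it can help to sanity-check the parameters before saving. The following Python sketch checks a configuration against the field descriptions above. The function `validate_config` is hypothetical and not part of the product, and the rule that image analysis needs at least one URL is an assumption drawn from the field descriptions rather than a documented constraint.

```python
ALLOWED_RESOURCES = {"text", "image", "audio", "video"}

# Operation values copied from the field table; no operations are
# listed there for text or audio, so those resources are not checked.
OPERATIONS_BY_RESOURCE = {
    "image": {"generate", "analyze"},
    "video": {"analyze_video"},
}


def validate_config(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    resource = config.get("resource")
    operation = config.get("operation")

    if resource not in ALLOWED_RESOURCES:
        problems.append(f"unknown resource: {resource!r}")

    allowed_ops = OPERATIONS_BY_RESOURCE.get(resource)
    if allowed_ops is not None and operation not in allowed_ops:
        problems.append(f"operation {operation!r} is not valid for {resource}")

    for field in ("model_id", "prompt"):
        if not config.get(field):
            problems.append(f"missing required field: {field}")

    # Assumption: analyzing an image requires at least one image URL.
    if resource == "image" and operation == "analyze" and not config.get("image_urls"):
        problems.append("image analysis needs at least one entry in image_urls")

    return problems


example = {
    "api_key": "",
    "resource": "image",
    "operation": "analyze",
    "model_id": "grok-2-vision-1212",
    "image_urls": ["{{get_snapshot_node.url}}"],
    "video_urls": [],
    "prompt": "Describe in JSON format the objects detected in this image.",
}
print(validate_config(example))
```

The `api_key` field is deliberately left unchecked here, since it holds a reference to a centrally managed credential rather than the key itself.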

Output: Where the node's data comes from

The analysis result is available in the node's output. Downstream nodes can reference it with a template expression such as {{node_key}}, where node_key is the key of this Grok node.
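If the prompt asks for JSON, the reply still arrives as text, and models sometimes wrap JSON in a Markdown code fence. The Python sketch below shows one defensive way a downstream step could parse such a reply. It assumes the node's output is a plain string; the exact output structure is not specified here, and `extract_json` is a hypothetical helper.

```python
import json
import re

# Build the three-backtick fence marker programmatically so this
# example can itself live inside a fenced block.
FENCE = "`" * 3
FENCED_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(text: str):
    """Return parsed JSON from a model reply, or None if none can be parsed."""
    fenced = FENCED_RE.search(text)
    candidate = fenced.group(1) if fenced else text
    try:
        return json.loads(candidate.strip())
    except json.JSONDecodeError:
        return None


# A reply wrapped in a fence, as models often produce.
reply = FENCE + 'json\n{"objects": ["forklift", "pallet"]}\n' + FENCE
print(extract_json(reply))
print(extract_json("No objects could be identified."))
```

Returning `None` instead of raising lets a condition node branch on "no structured result" without failing the whole flow.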

Usage Examples

Example 1: Structured description of objects on an industrial camera

Use case: Analyze the image from a plant camera and obtain in JSON the detected objects, to feed into downstream logic.

Resource Type: Image | Model: grok-2-vision-1212
Operation: Analyze Image
Image URLs: {{get_snapshot_node.url}}
Prompt: Describe in JSON format the objects detected in this image from the industrial camera.

(see JSON structure above)

Example 2: Process text

Use case: Summarize or classify a text received in the flow.

Resource Type: Text
Prompt: Summarize the following text: {{trigger.body.text}}

Validation and Errors

| Condition | Common cause / fix |
| --- | --- |
| Authentication error | The Grok/xAI credential is invalid or lacks permissions/balance. |
| URLs not working | Make sure the URLs are publicly accessible. |
| Model without vision | For image analysis, use a vision-capable model (e.g. `grok-2-vision-1212`). |
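Two of these conditions can be caught before the node runs. The Python sketch below is a hypothetical pre-flight check, not a feature of the node: it treats the two vision entries from the model table as the vision-capable set, and it checks only that each URL is syntactically an http(s) URL. Whether a URL is actually publicly reachable cannot be verified this way.

```python
from urllib.parse import urlparse

# Vision-capable model IDs taken from the model table in this document.
VISION_MODELS = {"grok-2-vision-1212", "grok-vision-beta"}


def preflight_image_analysis(model_id: str, urls: list[str]) -> list[str]:
    """Return warnings for an image-analysis configuration (empty if none)."""
    warnings = []
    if model_id not in VISION_MODELS:
        warnings.append(f"model {model_id!r} is not listed as vision-capable")

    for url in urls:
        # Template expressions are resolved at run time, so skip them here.
        if url.startswith("{{"):
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            warnings.append(f"not an http(s) URL: {url}")

    return warnings


print(preflight_image_analysis("grok-2-1212", ["ftp://camera.local/snap.jpg"]))
print(preflight_image_analysis("grok-2-vision-1212", ["{{get_snapshot_node.url}}"]))
```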

Best Practices

Use centralized credentials: Do not write the API key in the node.
Vision model for images: Select a Vision model when analyzing images.
Specific prompts: Request structured outputs (JSON) when you plan to process the result in downstream nodes.
Chain with snapshot capture: A typical pattern is Get snapshot → Grok (Analyze Image) → condition/notification.