Documentation: Grok Action (AI Analysis with xAI)
Overview
The Grok Action is an automation node that allows you to use xAI's Grok artificial intelligence models to process and analyze content: text, images, audio, and video. It belongs to the AI Models node family and shares the same structure as actions from other providers (Gemini, GPT, Claude).
In IoT and security environments, it is useful for analyzing camera images with AI when an event occurs, generating structured descriptions (e.g., in JSON format), or processing text.
When to use this action?
Use this action when you need to:
- Analyze images from cameras (detect objects/people, describe the scene).
- Generate images from text descriptions.
- Analyze video to extract information or detect events.
- Process text with Grok models.
- Integrate xAI capabilities into your automations.
Node Configuration
The configuration is divided into two sections, Basic Configuration and Prompt Configuration, which you switch between with the selector at the top. A JSON Editor tab is also available.

Section: Basic Configuration
1. API Key *Required
Select the Grok (xAI) credential used to authenticate requests. Credentials are managed centrally and stored securely.
2. Resource Type *Required
The type of content to process: Text, Image, Audio, or Video.
3. Model *Required
The Grok model to use:
| Model | Value |
|---|---|
| Grok 2 | grok-2-1212 |
| Grok 2 Vision | grok-2-vision-1212 |
| Grok Beta | grok-beta |
| Grok Vision Beta | grok-vision-beta |

Section: Prompt Configuration
4. Operation *For Image and Video
- For Image: Generate Image or Analyze Image.
- For Video: Analyze Video.
5. Image / Video URLs
For analysis, enter the URLs (one per line). Supports template expressions (e.g., {{get_snapshot_node.url}}).
6. Prompt *Required
The instruction/question for the model. Supports template expressions.
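To illustrate how template expressions in the URL and Prompt fields behave, here is a minimal sketch of placeholder resolution. The platform's real template engine is not described in this document; `render_template` and the `context` layout are hypothetical stand-ins that only mimic the `{{node.field}}` lookup.

```python
import re

# Hypothetical stand-in for the platform's template engine (an assumption,
# not the real implementation): resolves {{node.field.subfield}} placeholders
# against a dict of upstream node outputs.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_template(text: str, context: dict) -> str:
    """Replace each {{dotted.path}} in text with the matching context value."""

    def resolve(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            if not isinstance(value, dict) or part not in value:
                # Leave unknown placeholders untouched so the problem stays visible.
                return match.group(0)
            value = value[part]
        return str(value)

    return _PLACEHOLDER.sub(resolve, text)


# Example upstream outputs (illustrative values only).
context = {
    "get_snapshot_node": {"url": "https://example.com/snapshots/cam1.jpg"},
    "trigger": {"body": {"text": "Conveyor 3 stopped at 14:02."}},
}

rendered_url = render_template("{{get_snapshot_node.url}}", context)
rendered_prompt = render_template(
    "Summarize the following text: {{trigger.body.text}}", context
)
print(rendered_url)
print(rendered_prompt)
```

A placeholder that cannot be resolved is left as-is in this sketch, which makes a misspelled node key easy to spot in the rendered prompt.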

JSON Editor View

JSON Structure (Input Parameters)
{
  "api_key": "",
  "resource": "image",
  "operation": "analyze",
  "model_id": "grok-2-vision-1212",
  "image_urls": [
    "{{get_snapshot_node.url}}"
  ],
  "video_urls": [],
  "prompt": "Describe in JSON format the objects detected in this image from the industrial camera."
}
JSON Fields
| Field | Type | Description |
|---|---|---|
| api_key | string | Reference to the Grok/xAI credential (managed securely). |
| resource | string | Resource type: text, image, audio, video. |
| operation | string | Operation (image: generate/analyze; video: analyze_video). |
| model_id | string | Grok model ID (e.g. grok-2-vision-1212). |
| image_urls | array (string) | URLs of images to analyze. |
| video_urls | array (string) | URLs of videos to analyze. |
| prompt | string | The instruction/question for the model. |
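If you generate the JSON for this node from a script, the field table can be turned into a small builder that rejects invalid resource/operation pairs. This is a sketch only: `build_grok_params` is a hypothetical helper, not part of the platform, and it assumes that text and audio resources take an empty `operation`, which this document does not state.

```python
import json

# Allowed operations per resource, taken from the field table.
# Assumption: text and audio need no operation (none is documented).
ALLOWED_OPERATIONS = {
    "text": set(),
    "audio": set(),
    "image": {"generate", "analyze"},
    "video": {"analyze_video"},
}


def build_grok_params(resource, model_id, prompt, operation="",
                      image_urls=None, video_urls=None, api_key=""):
    """Return a dict shaped like the node's JSON Editor input (hypothetical helper)."""
    if resource not in ALLOWED_OPERATIONS:
        raise ValueError(f"unknown resource: {resource!r}")
    allowed = ALLOWED_OPERATIONS[resource]
    if allowed and operation not in allowed:
        raise ValueError(
            f"operation {operation!r} is not valid for {resource!r}; "
            f"expected one of {sorted(allowed)}"
        )
    if not model_id:
        raise ValueError("model_id is required")
    if not prompt.strip():
        raise ValueError("prompt is required")
    return {
        "api_key": api_key,  # credential reference, left empty here
        "resource": resource,
        "operation": operation,
        "model_id": model_id,
        "image_urls": list(image_urls or []),
        "video_urls": list(video_urls or []),
        "prompt": prompt,
    }


params = build_grok_params(
    resource="image",
    operation="analyze",
    model_id="grok-2-vision-1212",
    prompt="Describe in JSON format the objects detected in this image "
           "from the industrial camera.",
    image_urls=["{{get_snapshot_node.url}}"],
)
print(json.dumps(params, indent=2))
```

Failing early on a mismatched pair such as `video` with `analyze` is cheaper than discovering it when the flow runs.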
Output: Where the node's data comes from
The analysis result is available in the node's output. Downstream nodes can reference it with {{node_key}}, where node_key is the key of this Grok node in the flow.
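When the prompt asks for JSON, the model's reply is still text and may wrap the JSON in a Markdown code fence or surround it with a sentence. The sketch below shows one way a downstream script step could recover the object. It assumes the node's output reaches the script as a plain string, which this document does not specify; `extract_json` is a hypothetical helper.

```python
import json
import re

# Built from single backticks so this example does not contain a literal fence.
FENCE = "`" * 3
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def extract_json(reply: str):
    """Best-effort extraction of a JSON value from a model reply.

    Returns the parsed value, or None if nothing parseable is found.
    """
    match = _FENCED.search(reply)
    candidate = match.group(1) if match else reply
    try:
        return json.loads(candidate.strip())
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span in the text.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(candidate[start:end + 1])
        except json.JSONDecodeError:
            return None
    return None


# Illustrative reply, not real model output.
reply = (
    "Here is the result:\n"
    + FENCE + "json\n"
    + '{"objects": ["forklift", "pallet"]}\n'
    + FENCE
)
parsed = extract_json(reply)
print(parsed)
```

Returning `None` instead of raising lets a condition node branch on "no usable result" without extra error handling.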
Usage Examples
Example 1: Structured description of objects on an industrial camera
Use case: Analyze the image from a plant camera and obtain the detected objects as JSON to feed into downstream logic.
- Resource Type: Image | Model: grok-2-vision-1212
- Operation: Analyze Image
- Image URLs: {{get_snapshot_node.url}}
- Prompt: Describe in JSON format the objects detected in this image from the industrial camera.

(see JSON structure above)
Example 2: Process text
Use case: Summarize or classify a text received in the flow.
- Resource Type: Text
- Prompt: Summarize the following text: {{trigger.body.text}}
Validation and Errors
| Condition | Common cause / fix |
|---|---|
| Authentication error | The Grok/xAI credential is invalid or lacks permissions/balance. |
| URLs not working | Make sure the URLs are publicly accessible. |
| Model without vision | For image analysis, use a vision-capable model (e.g. grok-2-vision-1212). |
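Two of the conditions in the table can be caught before the node runs. The following sketch checks a parameter dict for a non-vision model on image analysis and for malformed URLs. `preflight_issues` is a hypothetical helper, and testing for "vision" in the model ID is a heuristic based on the model names listed in this document, not an official capability check. It does not verify that a URL is publicly reachable, since that requires a network request.

```python
from urllib.parse import urlparse


def preflight_issues(params: dict) -> list:
    """Return a list of human-readable problems found in the node parameters."""
    issues = []
    resource = params.get("resource")
    operation = params.get("operation")
    model_id = params.get("model_id", "")

    # Heuristic (assumption): vision-capable model IDs contain "vision".
    if resource == "image" and operation == "analyze" and "vision" not in model_id:
        issues.append(f"model {model_id!r} does not look vision-capable")

    urls = list(params.get("image_urls", [])) + list(params.get("video_urls", []))
    for url in urls:
        if "{{" in url:
            continue  # template expression, resolved when the flow runs
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            issues.append(f"not an absolute http(s) URL: {url}")
    return issues


issues = preflight_issues({
    "resource": "image",
    "operation": "analyze",
    "model_id": "grok-2-1212",
    "image_urls": ["ftp://cameras.local/cam1.jpg", "{{get_snapshot_node.url}}"],
    "video_urls": [],
})
for issue in issues:
    print(issue)
```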
Best Practices
- Use centralized credentials: Do not hard-code the API key in the node.
- Vision model for images: Select a Vision model when analyzing images.
- Specific prompts: Request structured outputs (JSON) when you plan to process the result in downstream nodes.
- Chain with snapshot capture: Typical pattern: Get snapshot → Grok (Analyze Image) → condition/notification.
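The condition step at the end of that chain could look like the sketch below, which decides whether to send a notification from the parsed analysis. The result schema (an `objects` list whose items carry `label` and `confidence`) is purely hypothetical: it is whatever your prompt asks the model to return, and nothing in this document defines it. `should_notify` is likewise an illustrative name.

```python
def should_notify(analysis: dict, watched_labels: set, min_confidence: float = 0.5) -> bool:
    """Return True if any watched object appears with enough confidence.

    Assumes a hypothetical schema requested in the prompt:
    {"objects": [{"label": str, "confidence": float}, ...]}
    """
    for obj in analysis.get("objects", []):
        label = str(obj.get("label", "")).lower()
        confidence = float(obj.get("confidence", 0.0))
        if label in watched_labels and confidence >= min_confidence:
            return True
    return False


# Illustrative analysis result, not real model output.
analysis = {
    "objects": [
        {"label": "Person", "confidence": 0.91},
        {"label": "forklift", "confidence": 0.4},
    ]
}
notify = should_notify(analysis, watched_labels={"person"})
print(notify)
```

Stating the exact schema in the prompt keeps this kind of condition simple, because the downstream check does not have to guess field names.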