Media Generation Nodes

Media nodes generate, transcribe, and convert images, audio, video, and speech using provider-hosted AI models. Store media output with Object Storage Upload before referencing it in downstream nodes, because in-memory blobs are not preserved across workflow steps.

| Node | Type | Output |
| --- | --- | --- |
| Text to Image | text_to_image | image |
| Image to Image | image_to_image | image |
| Image to Audio | image_to_audio | audio |
| Text to Speech | text_to_speech | audio |
| Speech to Text | speech_to_text | text |
| Text to Video | text_to_video | video |
| Image Upscale | image_upscale | image |
| Media Convert | media_convert | file |

Text to Image

Type: text_to_image · Category: Media Generation

Generate an image from a text prompt using the configured provider and model.

Inputs

| Name | Type | Required |
| --- | --- | --- |
| text | text | Yes (the generation prompt) |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| image | image | { url: text, mimeType: text } |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| model | string | Model identifier (e.g. fal-ai/flux/dev). |

Optional Config

| Field | Type | Default |
| --- | --- | --- |
| provider | select | fal-ai |
| promptPath | string | $.prompt |
| size | string | 1024x1024 |

Example

json
{
  "provider": "fal-ai",
  "model": "fal-ai/flux/dev",
  "promptPath": "$.prompt",
  "size": "1024x1024"
}

Expected output: { "image": { "url": "blob:image", "mimeType": "image/png" } }
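The promptPath value $.prompt reads like a JSONPath-style selector into the upstream payload. The sketch below illustrates that idea under a narrow assumption: only dot-separated keys are supported. resolve_path and the sample payload are hypothetical, and the path syntax the platform actually accepts is not specified on this page.

```python
from typing import Any


def resolve_path(payload: dict[str, Any], path: str) -> Any:
    """Resolve a simple '$.a.b' style path against a nested dict.

    Hypothetical helper: supports dot-separated keys only.
    """
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    current: Any = payload
    # Drop the leading '$' and walk the remaining keys one at a time.
    for key in filter(None, path[1:].split(".")):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"{path!r} not found at segment {key!r}")
        current = current[key]
    return current


# Sample upstream payload (made up for illustration).
upstream = {"prompt": "A polished arena card", "image": {"url": "blob:image"}}
prompt = resolve_path(upstream, "$.prompt")
image_url = resolve_path(upstream, "$.image.url")
```

The same lookup shape would apply to imagePath, textPath, and audioPath, which use the same $-prefixed form.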

Downstream: object_storage_upload

Execution Notes

  • Media generation requires user_byok_cloud or a platform plan with media credits.
  • Use object_storage_upload immediately afterward to persist the generated file, because in-memory blobs are not preserved across workflow steps.
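The persistence rule can be checked mechanically. The sketch below assumes a workflow can be represented as a linear list of node type names, which is a simplification made for illustration; the real workflow definition format is not shown on this page, and unpersisted_media_nodes is a hypothetical helper.

```python
# Node types from the overview table whose output is a media blob
# (image, audio, video, or file). speech_to_text is excluded: it outputs text.
MEDIA_NODE_TYPES = {
    "text_to_image",
    "image_to_image",
    "image_to_audio",
    "text_to_speech",
    "text_to_video",
    "image_upscale",
    "media_convert",
}


def unpersisted_media_nodes(node_types: list[str]) -> list[str]:
    """Return media nodes that are not immediately followed by object_storage_upload."""
    missing = []
    for index, node_type in enumerate(node_types):
        if node_type not in MEDIA_NODE_TYPES:
            continue
        has_next = index + 1 < len(node_types)
        next_type = node_types[index + 1] if has_next else None
        if next_type != "object_storage_upload":
            missing.append(node_type)
    return missing


ok = unpersisted_media_nodes(
    ["text_to_image", "object_storage_upload", "image_upscale", "object_storage_upload"]
)
bad = unpersisted_media_nodes(["text_to_image", "image_upscale", "object_storage_upload"])
```

In the second call, text_to_image is flagged because its blob is handed straight to image_upscale without being uploaded first.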

See also: Image to Image · Object Storage Upload · Image Upscale


Image to Image

Type: image_to_image · Category: Media Generation

Transform an input image with a text prompt (inpainting, style transfer, editing).

Inputs

| Name | Type | Required |
| --- | --- | --- |
| image | image | Yes |
| prompt | text | No |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| model | string | Model identifier. |

Optional Config

| Field | Type | Default |
| --- | --- | --- |
| provider | string | fal-ai |
| imagePath | string | $.image.url |
| prompt | string | |

Example

json
{
  "provider": "fal-ai",
  "model": "fal-ai/flux-pro/kontext",
  "imagePath": "$.image.url",
  "prompt": "Create a polished arena card"
}

Downstream: object_storage_upload


Image to Audio

Type: image_to_audio · Category: Media Generation

Generate audio from an image description using a text-to-audio model.

Inputs

| Name | Type | Required |
| --- | --- | --- |
| image | image | Yes |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| model | string | Model identifier. |

Example

json
{
  "provider": "fal-ai",
  "model": "fal-ai/stable-audio",
  "imagePath": "$.image.url",
  "prompt": "Ambient intro for the battle replay"
}

Expected output: { "audio": { "url": "blob:audio", "mimeType": "audio/mpeg" } }

Downstream: object_storage_upload
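The expected outputs for image, audio, and video nodes on this page share one shape: a key named after the output kind, holding url and mimeType. A minimal sketch of a shape check follows. check_media_output is a hypothetical helper, and the rule that mimeType begins with the output kind is an assumption drawn from the examples here (image/png, audio/mpeg, video/mp4), not a documented guarantee.

```python
def check_media_output(output: dict, kind: str) -> bool:
    """Return True if output[kind] has a non-empty url and a mimeType starting with '<kind>/'."""
    media = output.get(kind)
    if not isinstance(media, dict):
        return False
    url = media.get("url")
    mime_type = media.get("mimeType")
    return (
        isinstance(url, str)
        and bool(url)
        and isinstance(mime_type, str)
        and mime_type.startswith(kind + "/")
    )


audio_ok = check_media_output(
    {"audio": {"url": "blob:audio", "mimeType": "audio/mpeg"}}, "audio"
)
audio_missing_mime = check_media_output({"audio": {"url": "blob:audio"}}, "audio")
```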


Text to Speech

Type: text_to_speech · Category: Media Generation

Generate spoken audio from text.

Inputs

| Name | Type | Required |
| --- | --- | --- |
| text | text | Yes |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| voice | string | Voice identifier (e.g. alloy, nova). |

Optional Config

| Field | Type | Default |
| --- | --- | --- |
| provider | string | openai |
| model | string | gpt-4o-mini-tts |
| textPath | string | $.summary |

Example

json
{
  "provider": "openai",
  "model": "gpt-4o-mini-tts",
  "voice": "alloy",
  "textPath": "$.summary"
}

Expected output: { "audio": { "url": "blob:tts", "mimeType": "audio/mpeg" } }

Downstream: object_storage_upload
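Because voice is the only required field, a Text to Speech config can be as small as a single key once the documented defaults are applied. The sketch below shows that merge. build_tts_config is a hypothetical helper, and how the platform itself applies defaults is an assumption.

```python
# Defaults and required fields as listed in the Text to Speech tables.
TTS_DEFAULTS = {
    "provider": "openai",
    "model": "gpt-4o-mini-tts",
    "textPath": "$.summary",
}
TTS_REQUIRED = ("voice",)


def build_tts_config(user_config: dict) -> dict:
    """Overlay user settings on the documented defaults and check required fields."""
    missing = [field for field in TTS_REQUIRED if field not in user_config]
    if missing:
        raise ValueError(f"missing required config: {', '.join(missing)}")
    # Later keys win, so user values override defaults.
    return {**TTS_DEFAULTS, **user_config}


config = build_tts_config({"voice": "nova"})
```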

See also: Speech to Text · Summarizer


Speech to Text

Type: speech_to_text · Category: Media Generation

Transcribe speech audio into text with timestamps.

Inputs

| Name | Type | Required |
| --- | --- | --- |
| audio | audio | Yes |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| model | string | Transcription model. |

Optional Config

| Field | Type | Default |
| --- | --- | --- |
| provider | string | openai |
| audioPath | string | $.audio.url |

Example

json
{
  "provider": "openai",
  "model": "gpt-4o-transcribe",
  "audioPath": "$.audio.url"
}

Expected output: { "text": "The battle winner is..." }

Downstream: summarizer

Valid Connections

summarizer, prompt_template, classifier, translator
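The valid connections listed above can be expressed as a lookup table. This sketch covers only speech_to_text, since that is the only node on this page with a stated connection list. is_valid_connection is a hypothetical helper, not a platform API.

```python
# Valid downstream node types, as listed for Speech to Text.
VALID_CONNECTIONS = {
    "speech_to_text": {"summarizer", "prompt_template", "classifier", "translator"},
}


def is_valid_connection(source_type: str, target_type: str) -> bool:
    """Return True if target_type is a listed downstream node for source_type."""
    if source_type not in VALID_CONNECTIONS:
        # No connection list is known for this node, so refuse to guess.
        raise KeyError(f"no connection list known for {source_type!r}")
    return target_type in VALID_CONNECTIONS[source_type]


to_summarizer = is_valid_connection("speech_to_text", "summarizer")
to_upscale = is_valid_connection("speech_to_text", "image_upscale")
```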

See also: Text to Speech · Audio Transcribe · Summarizer


Text to Video

Type: text_to_video · Category: Media Generation

Generate a video from a text prompt.

Inputs

| Name | Type | Required |
| --- | --- | --- |
| text | text | Yes |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| model | string | Video generation model. |

Optional Config

| Field | Type | Default |
| --- | --- | --- |
| provider | string | fal-ai |
| promptPath | string | $.prompt |
| durationSeconds | number | 8 |
| aspectRatio | string | 16:9 |

Example

json
{
  "provider": "fal-ai",
  "model": "fal-ai/veo3",
  "promptPath": "$.prompt",
  "durationSeconds": 8,
  "aspectRatio": "16:9"
}

Expected output: { "video": { "url": "blob:video", "mimeType": "video/mp4" } }

Downstream: object_storage_upload

Execution Notes

  • Video generation is the most resource-intensive media operation. Use retry with generous backoffMs.
  • Always persist with object_storage_upload immediately after.
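To make the retry advice concrete, here is a minimal sketch of a retry loop with a fixed wait between attempts. It is not the platform's retry implementation: run_with_retry, max_attempts, the 30-second value, and the fixed-delay behavior are all assumptions, since this page names only retry and backoffMs. The sleep function is injectable so the example runs without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    task: Callable[[], T],
    max_attempts: int = 3,
    backoff_ms: int = 30_000,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task until it succeeds, waiting backoff_ms between failed attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(backoff_ms / 1000)  # convert milliseconds to seconds
    raise AssertionError("unreachable")


# Usage: a stand-in generator that fails twice before succeeding.
attempts = {"count": 0}
waits: list[float] = []


def flaky_generate() -> dict:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("provider busy")
    return {"video": {"url": "blob:video", "mimeType": "video/mp4"}}


# waits.append records each delay instead of sleeping.
result = run_with_retry(flaky_generate, max_attempts=3, backoff_ms=30_000, sleep=waits.append)
```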

Image Upscale

Type: image_upscale · Category: Media Generation

Upscale an image using a super-resolution model.

Inputs

| Name | Type | Required |
| --- | --- | --- |
| image | image | Yes |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| scale | number | Upscale factor (e.g. 2, 4). |

Optional Config

| Field | Type | Default |
| --- | --- | --- |
| provider | string | fal-ai |
| model | string | fal-ai/esrgan |
| imagePath | string | $.image.url |

Example

json
{
  "provider": "fal-ai",
  "model": "fal-ai/esrgan",
  "imagePath": "$.image.url",
  "scale": 2
}

Downstream: object_storage_upload
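As a quick way to reason about scale, the sketch below computes the resulting dimensions from a WIDTHxHEIGHT string such as the size value used by Text to Image. It assumes the factor multiplies each dimension, which is the usual meaning of an upscale factor but is not stated on this page. upscaled_size is a hypothetical helper.

```python
def upscaled_size(size: str, scale: int) -> str:
    """Multiply a 'WIDTHxHEIGHT' size string by an integer upscale factor."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    width_text, height_text = size.lower().split("x")
    width, height = int(width_text), int(height_text)
    return f"{width * scale}x{height * scale}"


doubled = upscaled_size("1024x1024", 2)      # "2048x2048"
quadrupled = upscaled_size("1024x1024", 4)   # "4096x4096"
```

Under that assumption, the pixel count grows with the square of the factor, so a scale of 4 yields 16 times as many pixels as the input.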


Media Convert

Type: media_convert · Category: Media Generation

Convert media between supported formats (e.g. WAV → MP3, MP4 → WebM).

Inputs

| Name | Type | Required |
| --- | --- | --- |
| file | file | Yes |

Required Config

| Field | Type | Description |
| --- | --- | --- |
| targetFormat | string | Output format extension (e.g. mp3, webm, png). |

Optional Config

| Field | Type | Description |
| --- | --- | --- |
| inputPath | string | Input file URL mapping. |
| audioBitrate | string | Audio bitrate (e.g. 128k). |

Example

json
{
  "targetFormat": "mp3",
  "inputPath": "$.file.url",
  "audioBitrate": "128k"
}

Downstream: object_storage_upload
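When choosing an audioBitrate, it can help to estimate the converted file size. The sketch below assumes the k suffix means thousands of bits per second, a common convention in encoding tools that this page does not confirm. parse_bitrate and estimated_audio_bytes are hypothetical helpers, and the estimate ignores container overhead.

```python
def parse_bitrate(value: str) -> int:
    """Convert a bitrate string such as '128k' into bits per second."""
    text = value.strip().lower()
    multipliers = {"k": 1_000, "m": 1_000_000}
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


def estimated_audio_bytes(bitrate: str, duration_seconds: float) -> int:
    """Estimate encoded size in bytes: bits per second x seconds / 8."""
    return int(parse_bitrate(bitrate) * duration_seconds / 8)


bits_per_second = parse_bitrate("128k")
one_minute_bytes = estimated_audio_bytes("128k", 60)
```

Under these assumptions, one minute of audio at 128k comes to 960,000 bytes, a little under 1 MB.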


See also: Node Catalog Index · Storage Nodes · AI Primitive Nodes · Workflow Studio