# LLMWhisperer

LLMWhisperer is a technology that presents data from complex documents to LLMs in a way that they can best understand.

- **Category:** ai document extraction
- **Auth:** API_KEY
- **Composio Managed App Available?** N/A
- **Tools:** 11
- **Triggers:** 0
- **Slug:** `LLMWHISPERER`
- **Version:** 20260312_00

## Tools

### Convert Document to Text (v2)

**Slug:** `LLMWHISPERER_CONVERT_DOCUMENT_TO_TEXT_V2`

Tool to convert PDF/scanned documents to text format for LLM consumption. Supports file upload or URL processing with multiple modes (native_text, low_cost, high_quality, form, table). Use when you need to extract text from documents for LLM processing. Returns whisper_hash for status checking and text retrieval.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `tag` | string | No | Custom tag for usage tracking and auditing purposes. |
| `url` | string | No | URL of the document to convert. Provide either url or file, not both. |
| `file` | object | No | File to upload and convert. Provide either file or url, not both. |
| `lang` | string | No | Language hint for OCR. The language is currently auto-detected, so this hint is optional. |
| `mode` | string ("native_text" | "low_cost" | "high_quality" | "form" | "table") | No | Processing mode for document conversion. |
| `filename` | string | No | Filename for usage reports and auditing. Auto-populated from file.name if not provided. |
| `output_mode` | string ("layout_preserving" | "text") | No | Output format mode. |
| `url_in_post` | boolean | No | If true, the URL is sent in POST body instead of query param. Only used when url is provided. |
| `use_webhook` | string | No | Name of registered webhook to call after processing completes. Webhook must be pre-registered. |
| `add_line_nos` | boolean | No | Add line numbers to extracted text and save line metadata for highlights API. |
| `page_seperator` | string | No | String to use as page separator in output. Supports dynamic placeholders like '<<< {{page_no}} >>>' to include page numbers. |
| `pages_to_extract` | string | No | Specify which pages to extract. Format: '1-5,7,21-' extracts pages 1,2,3,4,5,7,21 to last page. |
| `webhook_metadata` | string | No | Custom metadata to pass to webhook callback when processing completes. |
| `median_filter_size` | integer | No | Median filter size for noise removal. Only works in low_cost mode. Must be non-negative. |
| `mark_vertical_lines` | boolean | No | Whether to reproduce vertical lines in the document. Not applicable for native_text mode. |
| `gaussian_blur_radius` | integer | No | Gaussian blur radius for noise removal. Only works in low_cost mode. Must be non-negative. |
| `mark_horizontal_lines` | boolean | No | Whether to reproduce horizontal lines. Requires mark_vertical_lines=true to work. |
| `line_splitter_strategy` | string | No | Line splitter strategy for customizing line splitting behavior in multi-column layouts. |
| `include_line_confidence` | boolean | No | Include line confidence scores in metadata. Requires add_line_nos=true. |
| `line_splitter_tolerance` | number | No | Factor to decide when to move text to next line. Default 0.4 means 40% of average character height. Range: 0.0 to 1.0. |
| `horizontal_stretch_factor` | number | No | Horizontal stretch factor for multi-column layouts. 1.1 = 10% stretch. Must be positive. |
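
The `pages_to_extract` format can be easier to reason about when expanded into explicit page numbers. The following is a minimal sketch of that expansion, not the service's own parser; the function name `expand_page_spec` and the `last_page` argument are hypothetical and exist only for illustration.

```python
def expand_page_spec(spec: str, last_page: int) -> list[int]:
    """Expand a spec such as '1-5,7,21-' into explicit 1-based page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)  # raises ValueError if the start is missing
            # An open-ended range such as '21-' runs to the last page.
            end = int(end_text) if end_text else max(start, last_page)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Pages beyond the end of the document are silently dropped.
        pages.update(range(start, min(end, last_page) + 1))
    return sorted(pages)


print(expand_page_spec("1-5,7,21-", last_page=23))
# [1, 2, 3, 4, 5, 7, 21, 22, 23]
```

Checking a spec locally like this before submitting a document can catch malformed ranges early, under the assumption that the service interprets the format as described in the table.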

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |
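
Several input parameters of this tool depend on one another (for example, `url` versus `file`, or the noise-removal options that only work in `low_cost` mode). The sketch below collects those documented constraints into a local pre-flight check. It is an illustration only: `check_convert_arguments` is a hypothetical helper, not part of LLMWhisperer or Composio, and the service may enforce these rules differently.

```python
VALID_MODES = {"native_text", "low_cost", "high_quality", "form", "table"}
VALID_OUTPUT_MODES = {"layout_preserving", "text"}


def check_convert_arguments(args: dict) -> list[str]:
    """Return the problems found in the arguments; an empty list means none."""
    problems = []
    # Exactly one document source must be present.
    if ("url" in args) == ("file" in args):
        problems.append("provide exactly one of url or file")
    mode = args.get("mode")
    if mode is not None and mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
    output_mode = args.get("output_mode")
    if output_mode is not None and output_mode not in VALID_OUTPUT_MODES:
        problems.append(f"unknown output_mode: {output_mode!r}")
    # Noise-removal options are documented as low_cost only.
    for name in ("median_filter_size", "gaussian_blur_radius"):
        if name in args:
            if args[name] < 0:
                problems.append(f"{name} must be non-negative")
            if mode not in (None, "low_cost"):
                problems.append(f"{name} only works in low_cost mode")
    if args.get("mark_horizontal_lines") and not args.get("mark_vertical_lines"):
        problems.append("mark_horizontal_lines requires mark_vertical_lines=true")
    if args.get("mark_vertical_lines") and mode == "native_text":
        problems.append("mark_vertical_lines is not applicable in native_text mode")
    if args.get("include_line_confidence") and not args.get("add_line_nos"):
        problems.append("include_line_confidence requires add_line_nos=true")
    tolerance = args.get("line_splitter_tolerance")
    if tolerance is not None and not 0.0 <= tolerance <= 1.0:
        problems.append("line_splitter_tolerance must be between 0.0 and 1.0")
    stretch = args.get("horizontal_stretch_factor")
    if stretch is not None and stretch <= 0:
        problems.append("horizontal_stretch_factor must be positive")
    return problems


for problem in check_convert_arguments(
    {"file": {}, "mode": "form", "gaussian_blur_radius": 2, "mark_horizontal_lines": True}
):
    print(problem)
```

When `mode` is omitted the check does not flag the `low_cost`-only options, because the default mode is not stated in this reference.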

### Get Highlights Metadata

**Slug:** `LLMWHISPERER_GET_HIGHLIGHTS`

Tool to get line metadata for highlighting extracted text in the original document. Returns bounding box coordinates (x, y, width, height) and page number for each line. Use when you need to create text overlays on document images or highlight specific lines in the source document.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `lines` | string | No | Lines to retrieve metadata for. Format: ranges and individual lines (e.g., '1-5,7,21-' retrieves lines 1,2,3,4,5,7,21 to end). Required if extract_all_lines is false. |
| `whisper_hash` | string | Yes | The whisper hash returned from the /whisper endpoint. Format: 'hash1\|hash2' |
| `extract_all_lines` | boolean | No | If true, extract metadata for all lines. If false, the 'lines' parameter must be provided. |
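
The relationship between `lines` and `extract_all_lines` can be sketched as a small argument builder. The helper name `build_highlight_arguments` is hypothetical; it only assembles the dictionary of inputs described in the table and does not call the service.

```python
from typing import Optional


def build_highlight_arguments(
    whisper_hash: str,
    lines: Optional[str] = None,
    extract_all_lines: bool = False,
) -> dict:
    """Assemble inputs for LLMWHISPERER_GET_HIGHLIGHTS from the documented rules."""
    if not whisper_hash:
        raise ValueError("whisper_hash is required")
    if extract_all_lines:
        # When all lines are requested, a line selection is not needed.
        return {"whisper_hash": whisper_hash, "extract_all_lines": True}
    if not lines:
        raise ValueError("lines is required when extract_all_lines is false")
    return {"whisper_hash": whisper_hash, "lines": lines, "extract_all_lines": False}


print(build_highlight_arguments("hash1|hash2", lines="1-5,7,21-"))
```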

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Register Webhook

**Slug:** `LLMWHISPERER_REGISTER_WEBHOOK`

Tool to register a new webhook endpoint for LLMWhisperer async notifications. Use when you need to set up a callback URL to receive processing results. The webhook URL is validated during registration to ensure it's reachable.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | The webhook endpoint URL that will receive callback notifications from LLMWhisperer after document processing is complete. Must be a valid HTTP/HTTPS URL. |
| `auth_token` | string | No | Bearer token for authenticating callbacks to your webhook endpoint. Provide the token value without the 'Bearer' prefix. Use an empty string if no authentication is required. |
| `webhook_name` | string | Yes | A unique identifier/name for this webhook registration. Must be unique across all your webhooks. |
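
As an illustration of the input rules above, the following sketch normalizes registration arguments before they are submitted. `build_register_webhook_arguments` is a hypothetical helper, and stripping an accidental `Bearer` prefix is a convenience chosen for this example rather than documented service behavior.

```python
from urllib.parse import urlparse


def build_register_webhook_arguments(url: str, webhook_name: str, auth_token: str = "") -> dict:
    """Validate and normalize inputs for LLMWHISPERER_REGISTER_WEBHOOK."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be a valid HTTP/HTTPS URL")
    if not webhook_name:
        raise ValueError("webhook_name is required")
    token = auth_token.strip()
    # The token is expected without the 'Bearer' prefix, so drop it if present.
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):].strip()
    return {"url": url, "webhook_name": webhook_name, "auth_token": token}


print(build_register_webhook_arguments("https://example.com/hook", "invoices", "Bearer abc123"))
# {'url': 'https://example.com/hook', 'webhook_name': 'invoices', 'auth_token': 'abc123'}
```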

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Get Usage Information

**Slug:** `LLMWHISPERER_USAGE_GET_INFO`

Tool to check usage metrics of your LLMWhisperer account. Use when you need to monitor API consumption, verify quotas, or check remaining page limits.

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Get Usage Statistics

**Slug:** `LLMWHISPERER_USAGE_GET_STATS`

Tool to retrieve usage statistics for your LLMWhisperer account based on a specific tag. Use when you need to check consumption metrics for a given tag and optional date range. Returns usage data for the preceding 30 days when date parameters are omitted.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `tag` | string | Yes | Tag to filter usage data. Required parameter to identify the specific usage metrics to retrieve. |
| `to_date` | string | No | End date for usage period in YYYY-MM-DD format. If omitted, defaults to current date. |
| `from_date` | string | No | Start date for usage period in YYYY-MM-DD format. If omitted, defaults to 30 days before to_date. |
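
The date defaults described above can be sketched locally to see which window a request would cover. `resolve_usage_window` is a hypothetical helper that mirrors the documented defaults; the `today` argument exists only to make the example deterministic.

```python
from datetime import date, timedelta
from typing import Optional


def resolve_usage_window(
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    today: Optional[date] = None,
) -> tuple[str, str]:
    """Return the (from_date, to_date) pair in YYYY-MM-DD format."""
    # to_date defaults to the current date.
    end = date.fromisoformat(to_date) if to_date else (today or date.today())
    # from_date defaults to 30 days before to_date.
    start = date.fromisoformat(from_date) if from_date else end - timedelta(days=30)
    if start > end:
        raise ValueError("from_date must not be later than to_date")
    return start.isoformat(), end.isoformat()


print(resolve_usage_window(to_date="2024-03-31"))
# ('2024-03-01', '2024-03-31')
```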

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Delete Webhook

**Slug:** `LLMWHISPERER_WEBHOOK_DELETE`

Tool to delete a registered webhook from the LLMWhisperer system. Use when you need to remove a webhook that is no longer needed.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `webhook_name` | string | Yes | The name of the webhook to delete. This is the unique identifier used when the webhook was registered. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Get Webhook Details

**Slug:** `LLMWHISPERER_WEBHOOK_GET_DETAILS`

Tool to retrieve registered webhook details for LLMWhisperer. Use when you need to get the configuration of a specific webhook including its URL and authentication token.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `webhook_name` | string | Yes | The name of the webhook to retrieve details for |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Update Webhook Configuration

**Slug:** `LLMWHISPERER_WEBHOOK_UPDATE`

Tool to update an existing webhook configuration for document conversion callbacks. Use when you need to modify the callback URL, authentication token, or webhook identifier. The system validates the webhook by sending a test payload and requires a 200 status response.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | The callback URL that will receive notifications after document conversion completion |
| `auth_token` | string | No | Bearer token for authenticating requests to the callback URL. Leave empty if the endpoint requires no authentication |
| `webhook_name` | string | Yes | Unique identifier for the webhook configuration |
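
Because validation requires the endpoint to answer the test payload with a 200 status, the receiving side needs to accept authenticated callbacks. The sketch below shows one way an endpoint might decide its response code. It assumes the token arrives in an `Authorization: Bearer <token>` header, which is an assumption not confirmed by this reference, and `callback_status` is a hypothetical name.

```python
import hmac


def callback_status(headers: dict, expected_token: str) -> int:
    """Return the HTTP status an endpoint might send for an incoming callback."""
    if not expected_token:
        # No authentication configured: accept every callback.
        return 200
    supplied = headers.get("Authorization", "")
    expected = f"Bearer {expected_token}"
    # Constant-time comparison avoids leaking token contents through timing.
    if hmac.compare_digest(supplied.encode(), expected.encode()):
        return 200
    return 401


print(callback_status({"Authorization": "Bearer s3cret"}, "s3cret"))  # 200
print(callback_status({}, "s3cret"))  # 401
```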

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Check Whisper Status

**Slug:** `LLMWHISPERER_WHISPER_CHECK_STATUS`

Tool to check the status of a text extraction process in LLMWhisperer. Use when the conversion is done in async mode to poll for completion status.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `whisper_hash` | string | Yes | The whisper hash returned while starting the whisper process. This is used to identify and track the specific conversion job. |
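
Polling for completion could look roughly like the loop below. This is a sketch under stated assumptions: `check_status` is a hypothetical callable that wraps `LLMWHISPERER_WHISPER_CHECK_STATUS` and returns a status string, and the terminal status names `processed` and `error` are assumed rather than taken from this reference.

```python
import time
from typing import Callable, Iterable


def wait_for_whisper(
    whisper_hash: str,
    check_status: Callable[[str], str],
    terminal: Iterable[str] = ("processed", "error"),  # assumed status names
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    terminal = set(terminal)
    deadline = time.monotonic() + timeout
    while True:
        status = check_status(whisper_hash)
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"whisper {whisper_hash!r} still {status!r} after {timeout}s")
        sleep(interval)


# Usage with a stand-in status source instead of a real API call.
fake_statuses = iter(["processing", "processing", "processed"])
print(wait_for_whisper("hash1|hash2", lambda _hash: next(fake_statuses), interval=0.0))
# processed
```

Passing `sleep` and `check_status` as arguments keeps the loop independent of any particular client library and makes it straightforward to exercise without network access.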

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Get Whisper Detail

**Slug:** `LLMWHISPERER_WHISPER_GET_DETAIL`

Tool to retrieve comprehensive details about an ongoing or completed text extraction process. Use when you need to monitor the status and progress metrics of a text extraction job.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `whisper_hash` | string | Yes | Identifier returned when initiating extraction. This is used to retrieve the status and details of a specific extraction job. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Retrieve Whisper Text

**Slug:** `LLMWHISPERER_WHISPER_RETRIEVE_TEXT`

Tool to retrieve extracted text from asynchronous whisper processing. Use when the conversion process was initiated in async mode and you need to retrieve the results using the whisper_hash identifier. Note that retrieval is single-use for security: once the text has been retrieved, the same whisper_hash cannot be used again.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `text_only` | boolean | No | When true, returns only the extracted text. When false (default), returns text with metadata including confidence scores and webhook metadata. |
| `whisper_hash` | string | Yes | Unique identifier returned when initiating the whisper process. Format: 'hash1\|hash2' |
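
Since a `whisper_hash` can only be used for retrieval once, callers that need the text more than once may want to keep their own copy. The following sketch caches the first result in memory. `WhisperTextStore` and the injected `retrieve` callable are hypothetical; `retrieve` stands in for whatever code invokes `LLMWHISPERER_WHISPER_RETRIEVE_TEXT`.

```python
from typing import Callable


class WhisperTextStore:
    """Keep retrieved text locally so each whisper_hash is fetched only once."""

    def __init__(self, retrieve: Callable[[str], str]) -> None:
        self._retrieve = retrieve
        self._cache: dict[str, str] = {}

    def get(self, whisper_hash: str) -> str:
        # Only the first request for a hash reaches the remote service.
        if whisper_hash not in self._cache:
            self._cache[whisper_hash] = self._retrieve(whisper_hash)
        return self._cache[whisper_hash]


# Usage with a stand-in retriever that records how often it is called.
calls = []

def fake_retrieve(whisper_hash: str) -> str:
    calls.append(whisper_hash)
    return f"text for {whisper_hash}"

store = WhisperTextStore(fake_retrieve)
store.get("hash1|hash2")
store.get("hash1|hash2")
print(len(calls))  # 1
```

An in-memory dictionary is lost when the process exits, so a longer-lived workflow would likely persist the text elsewhere.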

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |
