# Tavily

Tavily offers search and data retrieval solutions that help teams quickly locate and filter relevant information from documents, databases, and web sources.

- **Category:** AI web scraping
- **Auth:** API_KEY
- **Composio Managed App Available?** N/A
- **Tools:** 5
- **Triggers:** 0
- **Slug:** `TAVILY`
- **Version:** 20260316_00

## Tools

### Tavily crawl

**Slug:** `TAVILY_CRAWL`

Tool to perform intelligent graph-based website crawling with parallel path exploration and content extraction. Use when you need to traverse and extract content from multiple pages of a website following specific patterns or instructions. Supports depth/breadth controls, domain filtering, and natural language instructions for guided crawling.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | Root URL to begin the crawl. Can be provided with or without protocol (e.g., 'docs.tavily.com' or 'https://docs.tavily.com'). |
| `limit` | integer | No | Total number of links to process before stopping the crawl. |
| `format` | string ("markdown" \| "text") | No | Format of the extracted content: 'markdown' or 'text'. |
| `timeout` | integer | No | Maximum wait time in seconds for the crawl operation. Range: 10-150. |
| `max_depth` | integer | No | Maximum crawl depth from base URL. Range: 1-5. Depth of 1 means only direct links from the root URL. |
| `max_breadth` | integer | No | Maximum number of links to follow per page level. |
| `instructions` | string | No | Natural language guidance for the crawler to find specific pages or content. Using instructions increases cost to 2 credits per 10 pages. Example: 'Find all pages about the Python SDK'. |
| `select_paths` | array | No | List of regex patterns for specific URL paths to include. Example: ['/docs/.*', '/api/.*']. |
| `exclude_paths` | array | No | List of regex patterns to skip certain URL paths. Example: ['/admin/.*', '/private/.*']. |
| `extract_depth` | string ("basic" \| "advanced") | No | Extraction level for content: 'basic' for standard extraction or 'advanced' for deeper analysis. |
| `include_usage` | boolean | No | If true, includes credit usage information in the response. |
| `allow_external` | boolean | No | If true, includes links to external domains in the crawl. |
| `include_images` | boolean | No | If true, includes images in the crawl results. |
| `select_domains` | array | No | List of regex patterns for domain filtering. Only URLs matching these patterns will be crawled. |
| `exclude_domains` | array | No | List of regex patterns to exclude certain domains from the crawl. |
| `include_favicon` | boolean | No | If true, includes favicon URLs in the results. |
| `chunks_per_source` | integer | No | Maximum content snippets per source (max 500 chars each). Range: 1-5. |
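Several of the numeric parameters above have documented ranges (`max_depth` 1-5, `timeout` 10-150, `chunks_per_source` 1-5) that are easy to violate silently. A small helper that validates inputs before invoking the tool can fail fast. This is an illustrative sketch, not part of any Tavily or Composio SDK; `build_crawl_params` and its checks simply mirror the ranges in the table.

```python
def build_crawl_params(url, *, max_depth=None, timeout=None,
                       chunks_per_source=None, **extra):
    """Validate TAVILY_CRAWL inputs against the documented ranges
    and return a parameter dict ready to pass to the tool."""
    if max_depth is not None and not 1 <= max_depth <= 5:
        raise ValueError("max_depth must be in the range 1-5")
    if timeout is not None and not 10 <= timeout <= 150:
        raise ValueError("timeout must be in the range 10-150 seconds")
    if chunks_per_source is not None and not 1 <= chunks_per_source <= 5:
        raise ValueError("chunks_per_source must be in the range 1-5")

    params = {"url": url, **extra}  # extra covers select_paths, format, etc.
    for key, value in (("max_depth", max_depth), ("timeout", timeout),
                       ("chunks_per_source", chunks_per_source)):
        if value is not None:
            params[key] = value
    return params
```

Remember that supplying `instructions` doubles the per-page cost, so it is worth validating that trade-off in the same place.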

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |
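Every tool in this app returns the same three-field envelope (`data`, `error`, `successful`), so result handling can be shared across them. A minimal sketch; the `unwrap` helper is illustrative, not part of any SDK:

```python
def unwrap(result):
    """Return the data payload from a TAVILY_* tool result, or raise
    if the execution reported failure.

    Expects the {data, error, successful} envelope documented above.
    """
    if not result.get("successful"):
        raise RuntimeError(result.get("error") or "Tavily tool execution failed")
    return result["data"]
```

Note that `data` is typed as a string; if it carries JSON, parse it after unwrapping.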

### Tavily extract

**Slug:** `TAVILY_EXTRACT`

Tool to extract and parse web page content from specified URLs using Tavily's extract endpoint. Use when you need to retrieve clean, structured content from web pages with optional image extraction and content reranking based on query relevance.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `urls` | string | Yes | URL(s) to extract content from. Can be a single URL string or a list of URL strings. |
| `query` | string | No | User intent for reranking extracted content chunks. Helps prioritize the most relevant extracted content based on the query. |
| `format` | string ("markdown" \| "text") | No | Content format for extraction: 'markdown' or 'text'. Default is 'markdown'. |
| `timeout` | number | No | Maximum wait time in seconds for the extraction request. Must be between 1.0 and 60.0 seconds. Default is 30.0. |
| `extract_depth` | string ("basic" \| "advanced") | No | Extraction depth level: 'basic' for standard extraction or 'advanced' for more in-depth extraction. Default is 'basic'. |
| `include_usage` | boolean | No | If true, includes credit usage information in the response. |
| `include_images` | boolean | No | If true, includes a list of image URLs found in the extracted content. |
| `include_favicon` | boolean | No | If true, includes the favicon URL for each result. |
| `chunks_per_source` | integer | No | Maximum number of relevant chunks to extract per source. Must be between 1 and 5. Default is 3. |
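As with crawl, the extract parameters carry documented defaults and ranges (`timeout` 1.0-60.0, `chunks_per_source` 1-5), and `urls` accepts either a single string or a list. A hedged sketch of a parameter builder that normalizes and validates these before invocation; the helper name is illustrative, not an SDK function:

```python
def build_extract_params(urls, *, query=None, timeout=30.0,
                         chunks_per_source=3, extract_depth="basic"):
    """Validate TAVILY_EXTRACT inputs against the documented
    defaults and ranges, normalizing a single URL into a list."""
    if isinstance(urls, str):
        urls = [urls]  # a single URL string is also accepted
    if not 1.0 <= timeout <= 60.0:
        raise ValueError("timeout must be between 1.0 and 60.0 seconds")
    if not 1 <= chunks_per_source <= 5:
        raise ValueError("chunks_per_source must be between 1 and 5")
    if extract_depth not in ("basic", "advanced"):
        raise ValueError("extract_depth must be 'basic' or 'advanced'")

    params = {"urls": urls, "timeout": timeout,
              "chunks_per_source": chunks_per_source,
              "extract_depth": extract_depth}
    if query is not None:
        params["query"] = query  # enables relevance-based reranking
    return params
```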

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Get Tavily usage

**Slug:** `TAVILY_GET_USAGE`

Tool to retrieve API key and account usage statistics from Tavily. Use when you need to check credit consumption, limits, and per-endpoint usage for search, extract, crawl, map, and research operations.

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Tavily map website

**Slug:** `TAVILY_MAP`

Tool to map a website and discover its pages. Use when you need to scan a website and get a structured list of URLs/pages it contains without extracting full content.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | The root URL to begin mapping (e.g., 'docs.tavily.com'). This is the starting point from which the crawler will discover and map pages. |
| `limit` | integer | No | Total number of links to process before stopping. Minimum is 1. Default is 50. |
| `timeout` | integer | No | Maximum number of seconds to wait for the mapping to complete. Range: 10-150. Default is 150. |
| `max_depth` | integer | No | How far from the base URL the crawler explores. Range: 1-5. Default is 1. |
| `max_breadth` | integer | No | The number of links to follow per page level. Minimum is 1. Default is 20. |
| `instructions` | string | No | Natural language directions for the crawler to guide its exploration. Using this parameter increases cost to 2 credits per 10 pages instead of 1. |
| `select_paths` | array | No | List of regex patterns for specific URL paths to include (e.g., '/docs/.*' to only include documentation paths). |
| `exclude_paths` | array | No | List of regex patterns to skip certain URL paths (e.g., '/admin/.*' to exclude admin pages). |
| `include_usage` | boolean | No | If true, includes credit usage details in the response. Default is false. |
| `allow_external` | boolean | No | If true, includes external domain links in the results. Default is true. |
| `select_domains` | array | No | List of regex patterns for domain targeting. Only URLs matching these domain patterns will be included. |
| `exclude_domains` | array | No | List of regex patterns to exclude certain domains from the mapping results. |
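The `instructions` parameter changes billing: 2 credits per 10 pages instead of 1. A rough cost estimator can help decide whether guided mapping is worth it before launching a large job. This sketch assumes the per-10-pages billing described above and nothing more; it is not an official pricing calculator.

```python
import math

def estimate_map_credits(pages, *, uses_instructions=False):
    """Rough credit estimate for a TAVILY_MAP run, assuming
    1 credit per 10 pages, or 2 when natural language
    instructions are supplied (as documented above)."""
    per_block = 2 if uses_instructions else 1
    return math.ceil(pages / 10) * per_block
```

For example, mapping at the default `limit` of 50 pages costs about 5 credits without instructions and about 10 with them.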

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |

### Tavily search

**Slug:** `TAVILY_SEARCH`

Use this tool to perform a web search via the Tavily API. It offers controls for search depth, content types, result count, and domain filtering. Requires an active Tavily connection (HTTP 401 indicates an auth failure). The rate limit is roughly 2 requests/second; rapid bursts can trigger HTTP 429, so apply exponential backoff when retrying. Note that results are nested under `response_data.results`, not returned as a flat list.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | Yes | The search query string to find relevant information online. No native date-filter exists; embed time hints directly in the query string. For broad coverage, issue multiple focused queries rather than one broad query. |
| `max_results` | integer | No | Maximum number of search results to return. Large values combined with include_raw_content=true produce very large payloads. |
| `search_depth` | string ("basic" \| "advanced") | No | Specifies search depth: 'basic' (standard, 1 API Credit) or 'advanced' (in-depth, 2 API Credits). |
| `include_answer` | boolean | No | If true, attempts to include a direct answer to the query (suitable for factual questions) in search results. The answer field can be null; treat response_data.results array as primary evidence. |
| `include_images` | boolean | No | If true, includes links to relevant images in search results. |
| `exclude_domains` | array | No | A list of domain names (e.g., `['exclude.com', 'othersite.net']`) to exclude from search results; results from these domains will be filtered out. |
| `include_domains` | array | No | A list of specific domain names (e.g., `['example.com', 'website.org']`) to restrict the search to; only results from these domains are returned. |
| `include_raw_content` | boolean | No | If true, includes raw content from visited websites (e.g., unprocessed HTML or text) in search results. Without this, results may be short snippets that omit critical detail. |
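The rate-limit and result-nesting notes above suggest a retry wrapper around whatever executes the search. A minimal sketch with exponential backoff on HTTP 429; the `call` argument stands in for your actual executor (e.g. a Composio client wrapper), and `RateLimitedError` is a hypothetical exception your executor would raise on 429, not a real library class:

```python
import time

class RateLimitedError(Exception):
    """Raised by the executor when the API returns HTTP 429."""

def search_with_backoff(call, params, *, max_attempts=5, base_delay=1.0):
    """Run a TAVILY_SEARCH executor, retrying on rate limits with
    exponential backoff (1s, 2s, 4s, ... by default)."""
    for attempt in range(max_attempts):
        try:
            result = call(params)
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the 429
            time.sleep(base_delay * (2 ** attempt))
            continue
        # Per the notes above, results sit under response_data.results,
        # not at the top level of the response.
        return result["response_data"]["results"]
```

Combined with multiple focused queries (as recommended for `query` above), this keeps burst traffic under the ~2 req/s limit.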

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether the action execution was successful |
