# Honeyhive

HoneyHive is a modern AI observability and evaluation platform that enables developers and domain experts to collaboratively build reliable AI applications faster.

- **Category:** artificial intelligence
- **Auth:** API_KEY
- **Composio Managed App Available?** N/A
- **Tools:** 42
- **Triggers:** 0
- **Slug:** `HONEYHIVE`
- **Version:** 20260223_00

## Tools

### Add datapoints to dataset

**Slug:** `HONEYHIVE_ADD_DATAPOINTS_TO_DATASET`

Tool to add datapoints to a dataset. Use when you need to append multiple entries with specified input, ground truth, and history mappings.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | array | Yes | List of JSON objects representing datapoints to add |
| `mapping` | object | Yes | Mapping of data fields to inputs, ground truth, and history |
| `project` | string | Yes | Project name associated with the dataset |
| `dataset_id` | string | Yes | Dataset identifier to which datapoints will be added |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Compare Experiment Runs

**Slug:** `HONEYHIVE_COMPARE_RUNS`

Tool to retrieve experiment comparison between two evaluation runs. Use when you need to analyze the differences in metrics, datapoints, and events between two runs.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `filters` | string | No | Optional filters to apply (JSON string or array of filter objects) |
| `new_run_id` | string | Yes | New experiment run ID to compare (UUIDv4) |
| `old_run_id` | string | Yes | Old experiment run ID to compare against (UUIDv4) |
| `project_id` | string | Yes | Project ID to scope the comparison |
| `aggregate_function` | string ("average" | "min" | "max" | "median" | "p95" | "p99" | "p90" | "sum" | "count") | No | Aggregation function to apply to metrics. Defaults to 'average' if not specified |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Compare Runs Events

**Slug:** `HONEYHIVE_COMPARE_RUNS_EVENTS`

Tool to compare events between two experiment runs side-by-side. Use when analyzing differences in model behavior, performance metrics, or outputs between evaluation runs. Returns matched event pairs with their respective data from both runs for comparison.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `page` | integer | No | Page number for pagination (default 1) |
| `limit` | integer | No | Maximum number of event pairs to return (default 1000) |
| `filter` | string | No | Optional additional filter criteria as JSON string or query object to narrow comparison results |
| `run_id_1` | string | Yes | First experiment run ID (UUIDv4) to compare. Get this from Start Evaluation Run or list runs endpoint. |
| `run_id_2` | string | Yes | Second experiment run ID (UUIDv4) to compare. Get this from Start Evaluation Run or list runs endpoint. |
| `event_name` | string | No | Optional filter to only compare events with this specific name |
| `event_type` | string | No | Optional filter to only compare events of this type |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Batch Create Datapoints

**Slug:** `HONEYHIVE_CREATE_BATCH_DATAPOINTS`

Tool to create multiple datapoints in a single batch operation. Use when you need to bulk-import events into a dataset or create many datapoints at once. Supports filtering by date range, event IDs, or custom criteria. Efficient for migrating large numbers of events to evaluation datasets.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `events` | array | No | List of event IDs to convert into datapoints. Use to import events as datapoints. |
| `filters` | string | No | Filter criteria to select events or datapoints. Can be an object or array depending on the filtering logic. |
| `mapping` | object | No | Mapping configuration for datapoint fields |
| `dateRange` | object | No | Date range filter for batch datapoint creation |
| `selectAll` | boolean | No | If true, selects all matching items based on filters. Use to create datapoints from all events in a date range. |
| `checkState` | object | No | State flags to control batch operation behavior. Keys represent state attributes. |
| `dataset_id` | string | Yes | Unique identifier of the target dataset where datapoints will be created. Required. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Batch Model Events

**Slug:** `HONEYHIVE_CREATE_BATCH_MODEL_EVENTS`

Tool to create multiple model events in a single request. Use when you need to log a batch of event interactions to HoneyHive.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model_events` | array | Yes | Array of model event objects to create |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Batch Tool Events

**Slug:** `HONEYHIVE_CREATE_BATCH_TOOL_EVENTS`

Tool to log a batch of external API calls as tool events. Use when you need to record multiple tool events in one request—use after gathering all event data.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `events` | array | Yes | Array of tool event objects to create |
| `is_single_session` | boolean | No | If true, all events in the batch are associated with the same session. Defaults to false. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Configuration

**Slug:** `HONEYHIVE_CREATE_CONFIGURATION`

Creates a new configuration in HoneyHive for managing LLM or pipeline settings. Use this to define reusable configurations with specific models, prompts, and parameters that can be deployed across different environments (dev, staging, prod). Configurations enable version control and environment-specific management of your AI application settings.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `env` | array | No | List of environments where this configuration should be active. If not specified, the configuration will not be tied to any specific environment. |
| `name` | string | Yes | Name of the configuration. Must be unique within the project and no longer than 200 characters. |
| `tags` | array | No | List of tags for categorizing and filtering configurations. |
| `type` | string ("LLM" | "pipeline") | No | Type of configuration. |
| `provider` | string | Yes | Provider name such as 'openai', 'anthropic', 'cohere', etc. |
| `parameters` | object | Yes | Configuration parameters including model, call_type, and optional settings. |
| `user_properties` | object | No | Additional custom properties for tracking metadata about the configuration. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Datapoint

**Slug:** `HONEYHIVE_CREATE_DATAPOINT`

Tool to create a new datapoint with input-output pairs. Use when you need to add a single datapoint with inputs, ground truth, conversation history, and metadata.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `inputs` | object | No | Input data for the datapoint as key-value pairs (e.g., {'prompt': 'What is AI?', 'context': '...'}) |
| `history` | array | No | Conversation history as a list of message objects with 'role' and 'content' fields |
| `metadata` | object | No | Additional metadata for the datapoint (e.g., token counts, custom fields) |
| `ground_truth` | object | No | Expected output/ground truth as key-value pairs (e.g., {'response': 'Paris'}) |
| `linked_event` | string | No | ID of the event this datapoint is created from, if any |
| `linked_datasets` | array | No | List of dataset IDs to link this datapoint to |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Dataset

**Slug:** `HONEYHIVE_CREATE_DATASET`

Tool to create a dataset. Use when you need to initialize a new dataset within a project.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | Yes | Name of the dataset |
| `type` | string ("evaluation" | "fine-tuning") | No | What the dataset is used for |
| `saved` | boolean | No | Flag indicating if the dataset is saved |
| `project` | string | Yes | Name of the project associated with this dataset |
| `metadata` | object | No | Any helpful metadata to track for the dataset |
| `datapoints` | array | No | List of unique datapoint IDs to include in the dataset |
| `description` | string | No | A description for the dataset |
| `linked_evals` | array | No | List of unique evaluation run IDs to associate with the dataset |
| `pipeline_type` | string ("event" | "session") | No | Type of data pipeline |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Event

**Slug:** `HONEYHIVE_CREATE_EVENT`

Tool to create a new event in HoneyHive to track execution of different parts of your application. Use when you need to log a model call, tool execution, or chain step. Events can be grouped into sessions and nested hierarchically using parent_id and children_ids.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `event` | object | Yes | Event data object containing all event properties |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Metric

**Slug:** `HONEYHIVE_CREATE_METRIC`

Tool to create a new metric in HoneyHive. Use when you need to define how to evaluate model outputs, whether through code (PYTHON), AI evaluation (LLM), human review (HUMAN), or combining multiple metrics (COMPOSITE). Important: LLM metrics require both model_provider and model_name to be specified.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | Yes | Name of the metric. Must be unique within the project and descriptive of what is being measured. |
| `type` | string ("PYTHON" | "LLM" | "HUMAN" | "COMPOSITE") | Yes | Type of metric evaluation. PYTHON: code-based, LLM: AI-evaluated, HUMAN: manually evaluated, COMPOSITE: combination of other metrics. |
| `scale` | integer | No | Scale for rating-type metrics (e.g., 1-5 scale, 1-10 scale) |
| `filters` | object | No | Filter conditions to apply before computing the metric |
| `criteria` | string | Yes | Evaluation criteria that defines what this metric measures and how it should be assessed. Required for all metric types. |
| `threshold` | object | No | Threshold settings for determining when a metric passes or fails. |
| `categories` | array | No | List of categories with scores for categorical metrics. Required when return_type is 'categorical'. |
| `model_name` | string | No | Specific model name for LLM-evaluated metrics (e.g., 'gpt-4', 'claude-3-opus'). Required when type is LLM. |
| `description` | string | No | Detailed description of what the metric measures and when to use it |
| `return_type` | string ("float" | "boolean" | "string" | "categorical") | No | Data type of the metric's return value |
| `child_metrics` | array | No | List of child metrics with weights for composite metrics. Required when type is 'COMPOSITE'. |
| `model_provider` | string | No | Model provider for LLM-evaluated metrics (e.g., 'openai', 'anthropic'). Required when type is LLM. |
| `enabled_in_prod` | boolean | No | Whether to automatically compute this metric on production events |
| `needs_ground_truth` | boolean | No | Whether ground truth data is required to compute this metric |
| `sampling_percentage` | number | No | Percentage of events to sample for this metric (0-100) |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Model Event

**Slug:** `HONEYHIVE_CREATE_MODEL_EVENT`

Tool to create a new model event to log LLM call data. Use when you need to track a single model interaction including messages, responses, usage, and metadata.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model_event` | object | Yes | Model event object containing LLM call data |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Create Tool

**Slug:** `HONEYHIVE_CREATE_TOOL`

Creates a new tool definition in a HoneyHive project. Use this to register functions or plugins that can be invoked and tracked within HoneyHive. Tools are defined with a JSON Schema for their parameters, allowing HoneyHive to validate inputs and track tool usage in your AI workflows. Tool names must be unique within a project.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | Yes | Unique name for the tool within the project. Must contain only alphanumeric characters and underscores (no special characters or unicode). |
| `task` | string | Yes | Project name or ID where this tool will be registered. Must match an existing HoneyHive project. |
| `type` | string ("function" | "tool") | Yes | Type of the tool. Use 'function' for callable functions or 'tool' for plugins/integrations. |
| `parameters` | object | Yes | JSON Schema object defining the tool's input parameters. Use standard JSON Schema format with 'type', 'properties', and optionally 'required' fields. |
| `description` | string | No | Human-readable description explaining what the tool does and when to use it. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Delete Datapoint

**Slug:** `HONEYHIVE_DELETE_DATAPOINT`

Tool to delete a specific datapoint by its ID. Use when you need to remove a datapoint from HoneyHive after confirming its identifier.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | Yes | Unique identifier of the datapoint to delete. Can be obtained from retrieve_datapoints or add_datapoints_to_dataset responses. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Delete Dataset

**Slug:** `HONEYHIVE_DELETE_DATASET`

Tool to delete a dataset by ID. Use when you need to remove a dataset after confirming its ID.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `dataset_id` | string | Yes | Unique identifier of the dataset to delete |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### End Evaluation Run

**Slug:** `HONEYHIVE_END_EVALUATION_RUN`

Tool to update an evaluation run's status and metadata. Use to mark a run as completed after finishing evaluations, or update run properties like name, metadata, configuration, and associated event/datapoint IDs.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | No | Display name for the evaluation run |
| `run_id` | string | Yes | Unique identifier of the evaluation run to update |
| `status` | string ("pending" | "completed") | Yes | Status of the evaluation run. Set to 'completed' to finalize the run, or 'pending' to keep it open for further updates |
| `metadata` | object | No | Arbitrary metadata fields to attach to the run |
| `event_ids` | array | No | List of session/event UUIDs to associate with this run |
| `dataset_id` | string | No | The UUID of the dataset this run is associated with |
| `configuration` | object | No | Configuration parameters used in this run |
| `datapoint_ids` | array | No | List of datapoint UUIDs to associate with this run |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Configurations

**Slug:** `HONEYHIVE_GET_CONFIGURATIONS`

Tool to retrieve a list of configurations. Use when you need to fetch all configurations for a specific project before making changes.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `env` | string ("dev" | "staging" | "prod") | No | Environment to filter by. Allowed values: dev, staging, prod. |
| `name` | string | No | The name of the configuration to filter by, e.g. 'v0'. |
| `project` | string | Yes | Project name for configuration like 'Example Project'. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Datasets

**Slug:** `HONEYHIVE_GET_DATASETS`

Retrieve datasets from HoneyHive for a specified project. Use this tool when you need to: - List all datasets within a project - Find datasets by type (evaluation or fine-tuning) - Retrieve a specific dataset by its ID Returns dataset details including name, description, datapoints count, type, and timestamps.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `type` | string ("evaluation" | "fine-tuning") | No | Optional filter by dataset type. Use 'evaluation' for evaluation datasets or 'fine-tuning' for fine-tuning datasets. |
| `project` | string | Yes | Project ID to filter datasets. Obtain project IDs using the Get Projects action. |
| `dataset_id` | string | No | Optional unique dataset ID to retrieve a specific dataset. When provided, returns only the matching dataset. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Events

**Slug:** `HONEYHIVE_GET_EVENTS`

Tool to query events with filters and projections from HoneyHive. Use this action when you need to retrieve events with lightweight filtering (limit 1000 results). For bulk exports or more complex queries, use the Retrieve Events action instead. Supports filtering by date range, event properties, and field projections.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `page` | integer | No | Page number for pagination. Default is 1 |
| `limit` | integer | No | Maximum number of results to return. Default is 1000 |
| `filters` | string | No | Array of filter objects as JSON string. Each filter should have 'field', 'operator', and 'value' properties (e.g., '[{"field":"event_type","operator":"is","value":"completion"}]') |
| `dateRange` | string | No | Date range filter as ISO string or JSON object with $gte/$lte fields (e.g., '{"$gte":"2023-01-01T00:00:00Z","$lte":"2023-12-31T23:59:59Z"}') |
| `ignoreOrder` | string | No | Whether to ignore ordering in the results. Set to 'true' or 'false' as string |
| `projections` | string | No | Fields to include in response as JSON array string (e.g., '["event_id","event_type","metadata.cost"]'). If not specified, all fields are returned |
| `evaluationId` | string | No | Filter events by evaluation ID |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Events By Session ID

**Slug:** `HONEYHIVE_GET_EVENTS_BY_SESSION_ID`

Tool to retrieve the complete tree of nested events for a specific session. Use when you need to analyze all events (model calls, tool calls, chains) that occurred within a session, including their hierarchical relationships, inputs, outputs, metrics, and costs. Returns a tree structure with recursive children.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | Yes | Session ID (UUIDv4 format) to retrieve all events for. This should be the session_id returned from start_session or retrieved from other event queries. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Events Chart

**Slug:** `HONEYHIVE_GET_EVENTS_CHART`

Tool to retrieve charting and analytics data for events over time. Use when you need aggregated metrics (duration, cost, token usage) grouped by time buckets or fields. Supports percentile analysis (p50, p95, p99) for latency monitoring and custom filters for targeted analytics.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `bucket` | string ("minute" | "minutes" | "1m" | "hour" | "hours" | "1h" | "day" | "days" | "1d" | "week" | "weeks" | "1w" | "month" | "months" | "1M") | No | Time bucket options for aggregating chart data. |
| `metric` | string | No | Metric to aggregate for charting (default 'duration'). Common metrics: duration, metadata.cost, metadata.tokens |
| `filters` | string | No | Array of filter objects as JSON string to narrow results by field, operator, and value |
| `groupBy` | string | No | Field to group chart data by (e.g., event_type, event_name, metadata.model) |
| `dateRange` | string | No | Date range filter as ISO string or JSON object with $gte/$lte fields for start/end timestamps |
| `aggregation` | string ("avg" | "average" | "mean" | "p50" | "p75" | "p90" | "p95" | "p99" | "count" | "sum" | "min" | "max" | "median") | No | Aggregation function options for chart metrics. |
| `evaluation_id` | string | No | Filter chart data to a specific evaluation run by evaluation ID |
| `only_experiments` | string | No | Filter to include only experiment events (set to 'true' or '1' to enable) |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Metrics

**Slug:** `HONEYHIVE_GET_METRICS`

Retrieves all metrics associated with a HoneyHive project. Returns a list of metrics including their configuration (name, type, description, thresholds, evaluator details) and metadata (creation/update timestamps, sampling settings). Use this tool when you need to: - List all metrics configured for a project - Get metric IDs for updating metrics via HONEYHIVE_UPDATE_METRIC - Understand what evaluations are set up for a project Prerequisites: Obtain a valid project_name using HONEYHIVE_GET_PROJECTS first.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `project_name` | string | Yes | The name of the project to retrieve metrics for. This is a required filter that scopes which metrics are returned. Use HONEYHIVE_GET_PROJECTS to find valid project names. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Projects

**Slug:** `HONEYHIVE_GET_PROJECTS`

Tool to retrieve all projects in the HoneyHive account. Use when you need to list available projects, get project IDs for use in other API calls, or search for a specific project by name.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | No | Optional filter to return only projects whose name contains this string (case-insensitive) |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Evaluation Run Details

**Slug:** `HONEYHIVE_GET_RUN`

Tool to get details of an evaluation run by its UUID. Use when you need to check the status, configuration, results, or metadata of a specific evaluation run.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `run_id` | string | Yes | UUID of the evaluation run to retrieve |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Run Metrics

**Slug:** `HONEYHIVE_GET_RUN_METRICS`

Tool to get event metrics for an experiment run. Use when you need to retrieve metrics computed on events within a specific experiment run. Returns an array of event objects with their associated metrics, which can be filtered by date range or custom filters.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `run_id` | string | Yes | Experiment run ID (UUIDv4) to retrieve metrics for |
| `filters` | string | No | Optional filters to apply as JSON string or array of filter objects (e.g., '[{"field": "event_type", "operator": "is", "value": "completion"}]') |
| `dateRange` | string | No | Date range filter as JSON string (e.g., '{"$gte": "2023-01-01T00:00:00Z", "$lte": "2023-12-31T23:59:59Z"}') |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Evaluation Runs

**Slug:** `HONEYHIVE_GET_RUNS`

Tool to retrieve a list of evaluation runs from HoneyHive. Use when you need to: - List all evaluation runs for analysis - Find runs by status, name, or dataset - Get specific runs by their IDs - Paginate through large sets of evaluation runs Returns evaluation details including status, results, configuration, and timestamps.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | No | Filter by run name. Returns runs whose name contains this string (case-sensitive). |
| `page` | integer | No | Page number for pagination (default: 1) |
| `limit` | integer | No | Number of results per page (default: 20) |
| `status` | string ("pending" | "completed" | "failed" | "cancelled" | "running") | No | Enumeration of possible evaluation run statuses. |
| `run_ids` | array | No | List of specific run IDs to fetch. When provided, only returns the specified runs. |
| `sort_by` | string ("created_at" | "updated_at" | "name" | "status") | No | Enumeration of fields that can be used for sorting runs. |
| `dateRange` | string | No | Filter by date range. Format depends on API implementation (e.g., ISO date range or relative time). |
| `dataset_id` | string | No | Filter by dataset ID. Get dataset IDs from the Get Datasets action. |
| `sort_order` | string ("asc" | "desc") | No | Enumeration of sort order options. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Runs Schema

**Slug:** `HONEYHIVE_GET_RUNS_SCHEMA`

Tool to retrieve the schema for experiment runs in HoneyHive. Use when you need to understand available fields, datasets, and mappings for experiment runs.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `dateRange` | string | No | Filter by date range |
| `evaluation_id` | string | No | Filter by evaluation/run ID |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Get Session

**Slug:** `HONEYHIVE_GET_SESSION`

Retrieve a complete session tree by session ID from HoneyHive. Use this tool to fetch the full session hierarchy including all nested events (model calls, tool calls, chains) with their inputs, outputs, durations, and metadata. Returns a recursive tree structure with aggregated metrics. Prerequisites: You need a valid session ID from HONEYHIVE_START_SESSION or HONEYHIVE_RETRIEVE_EVENTS.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `session_id` | string | Yes | Session ID (UUIDv4) to retrieve. Obtain this from HONEYHIVE_START_SESSION or HONEYHIVE_RETRIEVE_EVENTS. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### List Tools

**Slug:** `HONEYHIVE_LIST_TOOLS`

Tool to list all available Honeyhive tools. Use when you need to discover which functions or plugins are registered for use.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `page` | integer | No | Page number for paginated results |
| `limit` | integer | No | Maximum number of tools to return per page |
| `project` | string | No | Project name (task) to filter tools by. If not provided, returns tools from all projects. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Retrieve Datapoint

**Slug:** `HONEYHIVE_RETRIEVE_DATAPOINT`

Retrieve a specific datapoint by its ID from HoneyHive. Use this tool when you need the full details of a single datapoint, including its inputs, ground truth, conversation history, linked datasets, and metadata. Prerequisites: You need a valid datapoint ID. Get datapoint IDs from: - HONEYHIVE_RETRIEVE_DATAPOINTS (list datapoints by project/dataset) - HONEYHIVE_ADD_DATAPOINTS_TO_DATASET (returns IDs of newly created datapoints)

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | Yes | The unique datapoint ID to retrieve. Obtain this from HONEYHIVE_RETRIEVE_DATAPOINTS or HONEYHIVE_ADD_DATAPOINTS_TO_DATASET. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Retrieve Datapoints

**Slug:** `HONEYHIVE_RETRIEVE_DATAPOINTS`

Retrieve datapoints from a HoneyHive project. Use this tool to fetch evaluation datapoints containing inputs, ground truth, and metadata. Supports filtering by specific datapoint IDs or dataset name. Commonly used to: - Review existing test cases before running evaluations - Export datapoints for analysis - Verify datapoint contents after adding them to a dataset

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `project` | string | Yes | Name or ID of the HoneyHive project to retrieve datapoints from. Required parameter. |
| `dataset_name` | string | No | Filter by dataset name. When provided, returns only datapoints belonging to the specified dataset. |
| `datapoint_ids` | array | No | Specific datapoint IDs to retrieve. When provided, only datapoints with matching IDs are returned. Omit to retrieve all datapoints. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Retrieve Events

**Slug:** `HONEYHIVE_RETRIEVE_EVENTS`

Retrieve and export events from a HoneyHive project. Use this tool to query traced events (model calls, tool calls, sessions, chains) with optional filters by event_type, metadata, feedback scores, or date range. Returns events with their inputs, outputs, duration, and metrics. Supports pagination for large result sets (max 7500 per page).

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `page` | integer | No | Page number for pagination (default 1) |
| `limit` | integer | No | Limit number of events returned (default 1000; max 7500) |
| `filters` | array | No | Array of filter objects to narrow results. Pass empty list to retrieve all events. |
| `project` | string | Yes | Name of the project to query |
| `dateRange` | object | No | Date range filter for events, using ISO 8601 timestamps. |
| `projections` | array | No | Fields to include in the response |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Retrieve Experiment Result

**Slug:** `HONEYHIVE_RETRIEVE_EXPERIMENT_RESULT`

Tool to retrieve the result of a specific experiment run. Use when you need the status, metrics, and datapoint-level details of a completed experiment.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `run_id` | string | Yes | UUID of the experiment run to retrieve results for |
| `project_id` | string | Yes | Project identifier (ID or name) that the experiment run belongs to |
| `aggregate_function` | string ("average" | "min" | "max" | "median" | "p95" | "p99" | "p90" | "sum" | "count") | No | Aggregation function for metrics calculation. Defaults to 'average' if not specified |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Start Evaluation Run

**Slug:** `HONEYHIVE_START_EVALUATION_RUN`

Creates a new evaluation run to group and track multiple session events for analysis. Use this action when you want to: - Compare model performance across multiple sessions - Create evaluation batches for quality assurance - Link existing events to datasets for structured evaluation Prerequisites: - Get project ID using Get Projects action - Get event IDs from Start Session or Retrieve Events actions - (Optional) Get dataset ID from Get Datasets action

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | Yes | Human-readable name for identifying this evaluation run in the HoneyHive dashboard |
| `status` | string ("pending" | "completed") | No | Initial status of the run. Use 'pending' for runs that will receive more data, 'completed' for finished runs. |
| `project` | string | Yes | Project ID to associate this evaluation run with. Get this from the Get Projects action. |
| `metadata` | object | No | Optional key-value metadata to attach to this run (e.g., environment, version) |
| `event_ids` | array | Yes | List of session/event UUIDs to include in this evaluation. Get these from Start Session or Retrieve Events actions. |
| `dataset_id` | string | No | Optional dataset ID to link with this run. Get this from the Get Datasets action. |
| `configuration` | object | No | Optional configuration object for the run (e.g., model settings, parameters) |
| `datapoint_ids` | array | No | Optional list of specific datapoint IDs from the linked dataset to evaluate |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Start Session

**Slug:** `HONEYHIVE_START_SESSION`

Start a new HoneyHive session for tracing and observability. Use this tool to initiate a tracking session that groups together related model, tool, and chain events. Returns a session_id that should be used to link subsequent events to this session. Common use cases: - Start tracing a user conversation - Begin logging an LLM pipeline execution - Initialize observability for a batch processing job

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `session` | object | Yes | Session configuration object containing all session properties |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Configuration

**Slug:** `HONEYHIVE_UPDATE_CONFIGURATION`

Tool to update an existing HoneyHive configuration. Use when you need to modify a configuration's name, provider, model parameters, environments, or other settings. You must provide the configuration ID (obtainable via Get Configurations action) and the name field. All other fields are optional and will only update if provided.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | Yes | Configuration ID to update, e.g., '6638187d505c6812e4043f24' |
| `env` | array | No | List of environments where this configuration should be active |
| `name` | string | Yes | Name of the configuration |
| `tags` | array | No | Tags to categorize the configuration |
| `type` | string ("LLM" | "pipeline") | No | Type of configuration. |
| `provider` | string | No | Provider name, e.g., 'openai', 'anthropic', 'google' |
| `parameters` | object | No | Parameters for the configuration. |
| `user_properties` | object | No | Additional user-defined properties for the configuration |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Datapoint

**Slug:** `HONEYHIVE_UPDATE_DATAPOINT`

Update an existing datapoint by ID. Use this to modify any combination of inputs, ground_truth, history, metadata, linked_datasets, or linked_evals for a datapoint. Requires a valid datapoint ID obtained from retrieve_datapoints or add_datapoints_to_dataset.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | Yes | Datapoint ID to update (string) |
| `inputs` | object | No | Arbitrary JSON object containing the inputs for the datapoint |
| `history` | array | No | Conversation history associated with the datapoint |
| `metadata` | object | No | Additional metadata for the datapoint |
| `ground_truth` | object | No | Expected output JSON object for the datapoint |
| `linked_evals` | array | No | IDs of evaluations where the datapoint is included |
| `linked_datasets` | array | No | IDs of datasets that include the datapoint |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Dataset

**Slug:** `HONEYHIVE_UPDATE_DATASET`

Tool to update an existing dataset. Use when you need to modify a dataset's details (name, description, datapoints, linked evaluations, or metadata) after confirming its ID.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | No | Updated name for the dataset |
| `metadata` | object | No | Arbitrary metadata to attach to the dataset |
| `datapoints` | array | No | Full list of datapoint IDs to associate with the dataset |
| `dataset_id` | string | Yes | Unique identifier of the dataset to update |
| `description` | string | No | Updated description for the dataset |
| `linked_evals` | array | No | List of evaluation run IDs to link to this dataset |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Event

**Slug:** `HONEYHIVE_UPDATE_EVENT`

Update an existing HoneyHive event by ID. Use to attach feedback, metrics, metadata, outputs, config, user properties, or update duration on events created via start_session or batch event creation. At least one optional field must be provided alongside the event_id.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `config` | object | No | Configuration to apply/update |
| `metrics` | object | No | Metrics to record/update |
| `outputs` | object | No | Outputs to set/update for the event |
| `duration` | number | No | Duration of the event in seconds |
| `event_id` | string | Yes | UUID of the event to update. Can be obtained from retrieve_events or start_session actions. |
| `feedback` | object | No | Feedback payload to attach/update |
| `metadata` | object | No | Additional metadata to set/update for the event |
| `user_properties` | object | No | Custom user properties to set/update |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Metric

**Slug:** `HONEYHIVE_UPDATE_METRIC`

Tool to update an existing metric. Use when you need to modify a metric’s properties after creation. Ensure you retrieve the metric first to verify its current state.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | No | Updated name of the metric |
| `type` | string ("custom" | "model" | "human") | No | Type of the metric |
| `prompt` | string | No | Updated evaluator prompt for the metric |
| `criteria` | string | No | Criteria for human-evaluated metrics |
| `metric_id` | string | Yes | Unique identifier of the metric to update |
| `pass_when` | boolean | No | Expected pass condition for boolean metrics |
| `threshold` | object | No | Threshold settings for numeric metrics. |
| `event_name` | string | No | Name of the event the metric is computed on |
| `event_type` | string ("model" | "tool" | "chain" | "session") | No | Type of event the metric is computed on |
| `description` | string | No | Short description of what the metric does |
| `return_type` | string ("boolean" | "float" | "string") | No | Expected return type of the metric |
| `code_snippet` | string | No | Updated code block for the metric |
| `enabled_in_prod` | boolean | No | Whether to compute this metric automatically on production events |
| `needs_ground_truth` | boolean | No | Whether a ground truth value is required to compute this metric |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Project

**Slug:** `HONEYHIVE_UPDATE_PROJECT`

Updates an existing HoneyHive project's name or description. Use this action to modify project metadata after creation. You must provide the project_id and at least one field to update (name or description). To find project IDs, use the Get Projects action first.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | string | No | The new name for the project. If not provided, the existing name is preserved. |
| `project_id` | string | Yes | The unique identifier of the project to update. Can be obtained from the Get Projects action. |
| `description` | string | No | The new description for the project. If not provided, the existing description is preserved. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |

### Update Tool

**Slug:** `HONEYHIVE_UPDATE_TOOL`

Tool to update an existing tool in HoneyHive. Use when you need to modify a tool's name, description, parameters, or type after confirming its ID. At least one optional field must be provided alongside the required tool ID.

#### Input Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `id` | string | Yes | Unique identifier of the tool to update. Must match an existing tool ID in HoneyHive. |
| `name` | string | No | Updated name for the tool. Must contain only alphanumeric characters and underscores (no special characters or unicode). |
| `tool_type` | string ("function" | "tool") | No | Type of tool in HoneyHive. |
| `parameters` | object | No | Updated JSON Schema object defining the tool's input parameters. Use standard JSON Schema format with 'type', 'properties', and optionally 'required' fields. |
| `description` | string | No | Updated human-readable description explaining what the tool does and when to use it. |

#### Output

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `data` | string | Yes | Data from the action execution |
| `error` | string | No | Error if any occurred during the execution of the action |
| `successful` | boolean | Yes | Whether or not the action execution was successful or not |
