[Feature]: cache salting for multi-turn #312

### Motivation

Multi-turn conversations rely on KV-cache reuse (prefix caching, radix attention) to reduce the cost of prefilling a context that grows with every turn. Full KV reuse between identical samples (e.g., between copies of the same dataset entry), however, is not the intended behavior.

### Proposed Solution

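Add a cache salt to identical samples so that their KV entries are not shared across copies, while reuse across turns within a single conversation is preserved.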
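A minimal sketch of the idea, assuming a block-hash style prefix cache in which each block key chains the hash of the previous block. All names here (`block_keys`, `sample_salt`, `shared_prefix_blocks`, `BLOCK_SIZE`) are hypothetical and for illustration only; they are not this project's API, and the real cache may key its entries differently.

```python
import hashlib
from typing import List, Optional

BLOCK_SIZE = 4  # tokens per cache block (toy value, an assumption)


def block_keys(token_ids: List[int], salt: Optional[str] = None,
               block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one chained hash key per full block of tokens.

    The salt seeds the chain, so every key in the sequence depends on it.
    """
    parent = hashlib.sha256((salt or "").encode("utf-8")).hexdigest()
    keys = []
    full = len(token_ids) - len(token_ids) % block_size
    for start in range(0, full, block_size):
        block = token_ids[start:start + block_size]
        payload = parent + ":" + ",".join(map(str, block))
        parent = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        keys.append(parent)
    return keys


def shared_prefix_blocks(a: List[str], b: List[str]) -> int:
    """Count how many leading block keys two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def sample_salt(sample_id: str, copy_index: int) -> str:
    """Hypothetical salt: unique per dataset copy, stable across turns."""
    return f"{sample_id}#{copy_index}"


# Two turns of one conversation: turn 2 extends turn 1.
turn1 = list(range(8))
turn2 = turn1 + [100, 101, 102, 103]

salt_a = sample_salt("sample-0", 0)
salt_b = sample_salt("sample-0", 1)  # second copy of the same sample

# Same copy, later turn: the prefix is still reusable.
reuse_within = shared_prefix_blocks(block_keys(turn1, salt_a),
                                    block_keys(turn2, salt_a))
# Different copies of the identical sample: nothing is shared.
reuse_across = shared_prefix_blocks(block_keys(turn1, salt_a),
                                    block_keys(turn1, salt_b))
# Without a salt, identical copies share every block.
reuse_unsalted = shared_prefix_blocks(block_keys(turn1), block_keys(turn1))
```

Because the salt is mixed into the first block's hash and every later key chains from its parent, a different salt changes all keys of the sequence, while a salt held constant across the turns of one conversation leaves multi-turn prefix reuse intact.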

### Alternatives Considered

_No response_

### Additional Context

_No response_
