Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
143 changes: 143 additions & 0 deletions integrations/vaultak.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@
---
layout: integration
name: Vaultak
description: Runtime security monitoring for Haystack pipelines — risk-score inputs, enforce policies, and mask PII before responses reach your users
authors:
- name: Vaultak
socials:
github: samueloladji-beep
twitter: VaultakAI
linkedin: https://www.linkedin.com/company/vaultak/
pypi: https://pypi.org/project/haystack-vaultak/
repo: https://github.com/samueloladji-beep/haystack-vaultak
type: Monitoring Tool
report_issue: https://github.com/samueloladji-beep/haystack-vaultak/issues
version: Haystack 2.0
toc: true
---

### **Table of Contents**

- [Overview](#overview)
- [Installation](#installation)
- [Usage](#usage)
- [VaultakSecurityChecker](#vaultaksecuritychecker)
- [VaultakPIIMasker](#vaultakpiimasker)
- [Full RAG pipeline example](#full-rag-pipeline-example)
- [License](#license)

## Overview

[Vaultak](https://vaultak.com) is a runtime security platform for AI pipelines. It intercepts
inputs and outputs in real time — scoring risk on a 0–10 scale, enforcing policy rules, and masking
PII — so that dangerous or sensitive content never reaches your LLM or your users.

This integration ships two Haystack components that can be dropped into any pipeline:

| Component | Position in pipeline | What it does |
|---|---|---|
| `VaultakSecurityChecker` | Before the LLM / retriever | Risk-scores the input; raises `RuntimeError` if above threshold; checks against policy rules |
| `VaultakPIIMasker` | After the LLM | Scans LLM replies for PII (names, emails, phone numbers, etc.) and masks them before they reach your users |

## Installation

```bash
pip install haystack-vaultak
```

Sign up at [vaultak.com](https://vaultak.com) to get your API key.

## Usage

### VaultakSecurityChecker

Insert `VaultakSecurityChecker` before your retriever or LLM to intercept and score every user
query before it enters your pipeline. Queries whose risk score exceeds your threshold raise a
`RuntimeError` so the pipeline halts cleanly.

```python
from haystack import Pipeline
from haystack_vaultak import VaultakSecurityChecker

pipeline = Pipeline()
checker = VaultakSecurityChecker(
api_key="YOUR_VAULTAK_API_KEY",
threshold=7.0,
verbose=True,
)

pipeline.add_component("security", checker)
pipeline.connect("security.query", "retriever.query")
Comment on lines +60 to +70

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code doesn't work. It misses the pipeline object. Also, when I initialize the VaultakSecurityChecker, I get "Vaultak.init() got an unexpected keyword argument 'agent_name'" error

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed both issues. The missing pipeline = Pipeline() is now added before add_component() in the VaultakSecurityChecker example. The agent_name bug was a mismatch between our wrapper's param and the SDK — Vaultak.__init__() takes agent_id, not agent_name. Fixed in components.py and published as haystack-vaultak==0.1.1 on PyPI.

```

### VaultakPIIMasker

Insert `VaultakPIIMasker` after your LLM generator to scan and redact PII from every reply
before it reaches your users.

```python
from haystack_vaultak import VaultakPIIMasker

masker = VaultakPIIMasker(api_key="YOUR_VAULTAK_API_KEY")

pipeline.add_component("pii_masker", masker)
pipeline.connect("llm.replies", "pii_masker.replies")
Comment on lines +79 to +84

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same error here

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same fix applied here — Vaultak(api_key=..., agent_id=agent_name) in both VaultakSecurityChecker and VaultakPIIMasker. Published as haystack-vaultak==0.1.1.

```

### Full RAG pipeline example

The example below adds both components to a standard RAG pipeline. `VaultakSecurityChecker`
gates every incoming query; `VaultakPIIMasker` cleans every outgoing reply.

```python
import os
from haystack import Pipeline, Document
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_vaultak import VaultakSecurityChecker, VaultakPIIMasker

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
VAULTAK_API_KEY = "YOUR_VAULTAK_API_KEY"

# --- Build the document store ---
document_store = InMemoryDocumentStore()
document_store.write_documents([
Document(content="Acme Corp Q3 revenue was $4.2M with 312 active customers."),
Document(content="Support contact: support@acme.com, phone 555-867-5309."),
])

messages = [
ChatMessage.from_system("Answer the question using the provided documents only."),
ChatMessage.from_user(
"Documents:\n{% for doc in documents %}\n {{ doc.content }}\n{% endfor %}\nQuestion: {{ query }}"
),
]

# --- Assemble the pipeline with Vaultak components ---
pipeline = Pipeline()
pipeline.add_component("security_checker", VaultakSecurityChecker(api_key=VAULTAK_API_KEY))
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=messages, required_variables=["query", "documents"]))
pipeline.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))
pipeline.add_component("pii_masker", VaultakPIIMasker(api_key=VAULTAK_API_KEY))

pipeline.connect("security_checker.query", "retriever.query")
pipeline.connect("security_checker.query", "prompt_builder.query")
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm.messages")
pipeline.connect("llm.replies", "pii_masker.replies")

# --- Run ---
result = pipeline.run({"security_checker": {"query": "What is the support email?"}})
print(result["pii_masker"]["replies"])
# Email addresses and phone numbers are masked in the output
```

Every scored query and masked reply is visible in your [Vaultak dashboard](https://app.vaultak.com).

## License

`haystack-vaultak` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.