Skip to content

A unified binary metadata protocol for encoding emotional, contextual, and semantic signals directly into text streams.

License

Notifications You must be signed in to change notification settings

vibexcode/vibe-x

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

49 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

VIBE-X Protocol

Vector-Integrated Binary Extension — "Encode Once, Query Infinitely."

DOI PyPI version Python versions License: MIT

🎭 Try Interactive Demo | 📓 Jupyter Tutorial | 💡 Real-World Examples

VIBE-X is a lightweight binary encoding protocol that attaches sentiment and emotional metadata directly to or alongside UTF-8 text streams. By adding a compact 14-bit MetaBlock to sentiment-bearing spans, VIBE-X enables instant emotional queries without repeatedly calling large NLP models.

The goal is simple:

Turn expensive, repeated model inference into a one-time encoding step,
then enable near-zero-cost emotional queries for the rest of the data’s life cycle.


1. What Is VIBE-X?

VIBE-X (Vector-Integrated Binary Extension) is a compact, token-level emotional vector that travels with the text:

  • V — Vector: Multi-dimensional sentiment representation (polarity, intensity, emotion class, context, span, etc.)
  • I — Integrated: Aligned with UTF-8 text and modern subword tokenizers (BPE, SentencePiece)
  • B — Binary: Bit-level, storage- and compute-efficient representation
  • E — Extension: Extends existing text systems; does not replace UTF-8
  • X — Extensible: Designed for future modalities and fields (audio, video, emergency flags, etc.)

Traditional approach:
[TEXT] + [Separate Sentiment Database]

VIBE-X approach:
[TEXT + EMBEDDED / LINKED SENTIMENT BITS]

Emotion becomes a native attribute of data – portable, queryable, and secure.


2. Core Design: The 14-bit SPICE-R MetaBlock

At the heart of VIBE-X is a 14-bit MetaBlock called SPICE-R, which encodes the emotional state of a token span.

High-level fields:

  • S – Span Flag + Length
    • 1 bit flag (Has_SPAN)
    • Optional 3-bit span length (0–7 → up to 8 tokens total using 1+N logic)
  • P – Polarity (2 bits)
    • 00 = Neutral
    • 01 = Negative
    • 10 = Positive
    • 11 = Ironic / sarcasm-aware
  • I – Intensity (3 bits)
    • 0–7 scale for emotional strength (from flat to maximum intensity)
  • C – Context (1 bit)
    • 0 = Static / literal
    • 1 = Dynamic / context-dependent (sarcasm, rhetorical use, contrastive meaning)
  • E – Emotion Class (3 bits)
    • 8-way mapping (e.g., Joy, Trust, Fear, Surprise, Sadness, Disgust, Anger, Anticipation, etc.)
  • R – Reserved / Emergency (1 bit)
    • Reserved for forward compatibility
    • Can be used as an Emergency Flag (0 = normal, 1 = urgent/critical)

Total: 14 bits per MetaBlock

This vector is compact enough for petabyte-scale storage, yet expressive enough to cover nuanced, context-aware emotional states (including irony, multi-token phrases, and emergency cases).


3. Architecture: Inline vs Sidecar Modes

VIBE-X supports two complementary integration modes:

3.1 Inline Mode (Embedded in Text Stream)

MetaBlocks are embedded directly into the text near the anchor token, using non-printable markers (e.g., Unicode Private Use Area characters).

  • Text + metadata are stored as a single atomic stream
  • No risk of desynchronization between text and emotion
  • Ideal for:
    • Messaging protocols
    • Embedded / IoT devices
    • Offline and portable files
    • Real-time streaming scenarios

3.2 Sidecar Mode (External Metadata Layer)

The original UTF-8 text remains untouched. Emotional metadata is stored in a separate sidecar file or binary structure, referencing token indices.

  • 100% UTF-8 compatible
  • Easy redaction (delete or rotate sidecar without touching original text)
  • Ideal for:
    • Large-scale analytics
    • Archival / regulatory systems
    • Batch processing pipelines

3.3 Hybrid & Convertible

Tools can convert between Inline and Sidecar representations:

  • Inline → Sidecar: strip markers, export all MetaBlocks
  • Sidecar → Inline: inject markers for portable, single-file use

This flexibility lets integrators choose atomicity or strict UTF-8 compatibility per use case.


4. Encoding–Decoding Pipeline

4.1 Encoding: “Analyze Once, Encode Infinitely”

  1. Master Analyzer (e.g., a transformer-based sentiment model) processes the text once.
  2. It emits structured sentiment for each span: polarity, intensity, context, emotion class, span length, optional lifecycle info (timestamp, model version, confidence).
  3. The VIBE-X compiler:
    • Packs these attributes into the 14-bit SPICE-R MetaBlock.
    • Either embeds it into the text (Inline Mode) or writes a compact sidecar file (Sidecar Mode).

From that moment on, no further model inference is needed for emotional queries.

4.2 Decoding: Instant Query-Time Retrieval

  1. The VIBE-X decoder reads the text (and optionally sidecar).
  2. It parses each 14-bit MetaBlock and reconstructs:
    • Emotional valence
    • Intensity level
    • Emotion class
    • Context flag (literal vs ironic / dynamic)
    • Span coverage
    • Optional Emergency flag
  3. Applications can answer complex emotional queries using only cheap bit operations:
    • “Show all negatively intense comments with irony.”
    • “Find all emergency spans in this log stream.”
    • “Filter for high-intensity joy in the last 24 hours.”

No GPU, no large model required at query time.


5. Efficiency & Benchmark Highlights

A Grok-based benchmark on 100,000 real-world social media comments (Turkish + English) shows:

  • Decoding latency: ≈ 0.045 µs per MetaBlock
  • Query latency: < 0.3 ms per query
  • Sentiment accuracy:
    • ~96% (Turkish)
    • ~94% (English)
    • Outperforms typical traditional NLP baselines (~85%)
  • Compute savings: >92% reduction in compute cost for repeated queries
  • Storage savings:
    • 99% vs JSON-style sentiment metadata

    • ≈ 1–3% overhead when stored inline with text
  • Energy & CO₂:
    • ≈ 99.9% lower energy use vs re-running BERT per query on the same data
    • From ~15 kg to ~0.01 kg CO₂ per 10M queries (illustrative scenario)

At scale (e.g., 10M–1B queries), VIBE-X converts sentiment analysis from a recurring OPEX cost into a one-time CAPEX encoding step, then amortizes the benefit over all future reads.


6. Real-World Applications

Because VIBE-X is designed for read-heavy workloads, it is especially valuable wherever the same data is queried many times:

  • Content Moderation & Safety

    • Sub-ms detection of hostility, self-harm, hate, harassment
    • Emergency flagging for critical events and distress signals
  • Personalization & Recommendation

    • Mood-aware feeds, playlists, and news ranking
    • Filter by emotional fit instead of recomputing sentiment
  • Customer Support & CX

    • Instant escalation for angry or distressed customers
    • Sentiment-aware agent routing and dashboards
  • Search & Information Retrieval

    • “Only positive reviews” filters for search results
    • Emotion-aware legal, compliance, or e-discovery search
  • Business Intelligence & Analytics

    • Brand perception tracking
    • Market/employee sentiment over time, at very low cost
  • Compliance & Archiving

    • Long-term archives searchable by emotional state
    • GDPR/CCPA-aligned design via sidecar separation and anonymization
  • Emergency-Aware Systems

    • Call centers, IT incident management, smart cities, healthcare triage
    • Bit-level emergency flag for fast prioritization

7. Extensibility: VIBE-A and VIBE-V

VIBE-X is defined as a core text protocol, but the same logic extends to other modalities:

  • VIBE-A (Audio)

    • Encodes prosody, stress, tempo, intensity, and emotional class from audio into a Meta-A* block.
    • Use cases: call centers, voice assistants, mental health monitoring.
  • VIBE-V (Video / Visual Media)

    • Attaches affective metadata to video frames or segments.
    • Use cases: content moderation, gaming, interactive narratives, surveillance analytics.

Together, VIBE-X, VIBE-A, and VIBE-V form a multi-modal emotional metadata layer for next-generation AI and analytics systems.


8. Status & Implementation Notes

  • This repository hosts the Python reference implementation of the VIBE-X protocol (encoder/decoder and utilities).
  • The protocol is designed to be:
    • Language-agnostic (C/C++/Rust/Go bindings are straightforward)
    • Tokenizer-agnostic (supports subword tokenization)
    • Storage-agnostic (inline, sidecar, or hybrid)

Production deployments are expected to:

  • Use a binary sidecar format for maximum performance.
  • Optionally support JSON/CBOR/Protobuf wrappers for debugging and interoperability.
  • Maintain lifecycle metadata (timestamp, model version, confidence scores) outside the 14-bit MetaBlock to keep the core encoding compact.

9. Installation

You can install the official VIBE-X Protocol implementation directly from PyPI:

pip install vibex-protocol

This will install the core library, including the encoder, decoder, metadata models, and CLI tool (vibex-cli).


9.1 Try the Interactive Demo

We provide interactive demos to help you quickly understand and experiment with VIBE-X:

🎭 Streamlit Web Demo

# Install demo dependencies
pip install -r requirements-demo.txt

# Run the interactive web demo
streamlit run demo/streamlit_app.py

Open your browser at http://localhost:8501 to access:

  • Interactive encoder with real-time configuration
  • Metadata decoder with visualization
  • Batch analysis tools
  • Comprehensive tutorial

📓 Jupyter Notebook Tutorial

For a deeper dive with step-by-step explanations:

# Install dependencies
pip install -r requirements-demo.txt

# Launch Jupyter
jupyter notebook demo/interactive_tutorial.ipynb

Or try it online:

  • Open In Colab
  • Binder

💡 Real-World Examples

Check out practical implementations in the examples/ directory:

  • chat_moderation.py — Live chat moderation system
  • social_media_analysis.py — Social media sentiment tracking
  • customer_reviews.py — E-commerce review analysis

---------------------------------------------

10. Project Structure

vibe-x/ │ ├── src/ │ └── vibex/ │ ├── init.py │ ├── inline_encoder.py │ ├── inline_decoder.py │ ├── metablock.py │ ├── tokenizer.py │ └── exceptions.py │ ├── examples/ │ ├── basic_encode.py │ ├── basic_decode.py │ ├── multi_annotation.py │ └── error_handling.py │ ├── tests/ │ └── test_basic_flow.py │ ├── docs/ │ └── (overview or specifications) │ ├── LICENSE ├── README.md └── pyproject.toml

11. USAGE

---------------------------------------------

Basic Encoding Example

---------------------------------------------

from vibex.inline_encoder import InlineEncoder, SentimentAnnotation from vibex.tokenizer import Tokenizer

Initialize encoder with default tokenizer

encoder = InlineEncoder(tokenizer=Tokenizer())

text = "The movie was absolutely amazing"

Create an annotation:

anchor = token index

length = number of tokens spanned

annotation = SentimentAnnotation( anchor=3, # "absolutely" length=2, # covers "absolutely amazing" polarity=2, # positive intensity=5, # strong emotion context=0, # literal emotion=1 # Joy )

encoded_text = encoder.encode(text, [annotation]) print("Encoded:", encoded_text)

---------------------------------------------

Basic Decoding Example

---------------------------------------------

from vibex.inline_decoder import InlineDecoder from vibex.tokenizer import Tokenizer

decoder = InlineDecoder(tokenizer=Tokenizer())

decoded = decoder.decode(encoded_text)

print("Clean text:", decoded.clean_text) print("Clean tokens:", decoded.clean_tokens) print("\nDecoded MetaBlocks:") for block in decoded.blocks: print(" - HEX:", block.block.to_hex()) print(" Span:", block.span) print(" Polarity:", block.block.polarity) print(" Intensity:", block.block.intensity) print(" Emotion:", block.block.emotion)

---------------------------------------------

Multiple Annotation Example

---------------------------------------------

from vibex.inline_encoder import InlineEncoder, SentimentAnnotation from vibex.tokenizer import Tokenizer

text = "I loved the performance but the ending felt rushed"

encoder = InlineEncoder(Tokenizer())

annotations = [ SentimentAnnotation( anchor=1, # "loved" length=1, polarity=2, # positive intensity=6, context=0, emotion=1 ), SentimentAnnotation( anchor=7, # "felt" length=2, # "felt rushed" polarity=1, # negative intensity=5, context=1, # context-dependent emotion=4 ), ]

encoded = encoder.encode(text, annotations) print("Encoded:", encoded)

---------------------------------------------

Error Handling Example

---------------------------------------------

from vibex.inline_encoder import InlineEncoder, SentimentAnnotation from vibex.exceptions import MetaBlockEncodingError from vibex.tokenizer import Tokenizer

encoder = InlineEncoder(Tokenizer()) text = "This is a short text"

try: # Intentional mistake: anchor index out of token range annotation = SentimentAnnotation( anchor=50, length=1, polarity=1, intensity=3, context=0, emotion=2 ) encoder.encode(text, [annotation]) except MetaBlockEncodingError as e: print("Encoding error caught:", e)

12. License

This project is released under the MIT License.
See the LICENSE file for details. Uğur KANDEMİŞ (2025)

13. Citation

If you use VIBE-X in academic or industrial work, please cite:

Kandemiş, U. (2025). VIBE-X: Vector-Integrated Binary Extension for Sentiment-Aware Communication Systems. Zenodo.
DOI: https://doi.org/10.5281/zenodo.17228992

Contributing

Contributions, issues, and feature requests are welcome. Feel free to open an issue or submit a pull request.

Star on GitHub

About

A unified binary metadata protocol for encoding emotional, contextual, and semantic signals directly into text streams.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •  

Languages