# 💡 Poisson Process AutoDecoder (PPAD) for Signal Detection in Event Reporting Systems

A deep learning approach for unsupervised representation learning and anomaly detection in safety event data streams, suited to post-market surveillance (PMS) of adverse drug reactions (ADRs) and medical device incidents.

This project provides a Python implementation of the Poisson Process AutoDecoder (PPAD) model [https://arxiv.org/abs/2502.01627], adapted for Spontaneous Reporting System (SRS) data such as FAERS or VAERS. It includes modules for data simulation, data loading, and the core PPAD model, along with demonstration notebooks.
## 🎯 Project Overview

The Poisson Process AutoDecoder (PPAD) is a neural field method for modeling and encoding the non-homogeneous (time-varying) rates of Poisson processes. Because the arrival of safety reports over time is naturally modeled as a Poisson process, PPAD offers a principled way to analyze and compress this data.
This repository demonstrates how PPAD can turn complex event histories into actionable intelligence:

- **Encode** a product's entire reporting history into a fixed-length vector, a "risk fingerprint."
- **Detect** subtle or complex deviations (signals) in the event rate that simple count-based methods often miss.
- **Cluster** similar products by the shape of their underlying risk-over-time profiles.
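As background on the underlying process model: a non-homogeneous Poisson process can be simulated by thinning (the Lewis–Shedler algorithm). The sketch below is purely illustrative, with an assumed Gaussian-bump rate function; it is not the API of the repo's `simulation.py`.

```python
import numpy as np

def simulate_nhpp(rate_fn, t_max, lam_max, rng=None):
    """Simulate event times of a non-homogeneous Poisson process on [0, t_max]
    by thinning: draw candidate arrivals from a homogeneous process at rate
    lam_max (an upper bound on rate_fn), then keep each candidate at time t
    with probability rate_fn(t) / lam_max."""
    rng = np.random.default_rng(rng)
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)  # next candidate arrival
        if t > t_max:
            break
        if rng.uniform() < rate_fn(t) / lam_max:
            events.append(t)
    return np.array(events)

# Illustrative rate with a mid-window surge (a crude "signal"); peak value is 5,
# so lam_max = 5 is a valid upper bound.
rate = lambda t: 1.0 + 4.0 * np.exp(-((t - 5.0) ** 2))
times = simulate_nhpp(rate, t_max=10.0, lam_max=5.0, rng=0)
```

Thinning is exact as long as `lam_max` truly bounds the rate function over the window; histories like `times` are the kind of input a PPAD-style model consumes.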
## 🧠 PPAD Methodology: Modeling the Continuous Rate

PPAD is an autodecoder trained in an unsupervised framework, so the model's output is optimized directly against the actual recorded event times:

- **Latent vector (z):** every event history is assigned a fixed-length, low-dimensional vector z, which serves as its compressed representation and risk fingerprint.
- **Neural field decoder (f):** a shared neural network f takes the latent vector z and a time input t and outputs the expected continuous Poisson rate for that product at that time:

  λ(t) = f(z, t)

- **Optimization objective:** the model minimizes the negative log-likelihood of the observed event times under the predicted rate function λ(t). Minimizing this loss across a large dataset makes the latent space Z group product histories with similar event-arrival patterns.
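To make the objective concrete, the negative log-likelihood of a Poisson process with rate λ(t) over a window [0, T] is NLL = ∫₀ᵀ λ(t) dt − Σᵢ log λ(tᵢ). Below is a minimal NumPy sketch of this quantity for an arbitrary rate function; the function name is illustrative, and the repo's `losses.py` implements its own version (PPAD additionally parameterizes λ as f(z, t) and backpropagates through both z and f).

```python
import numpy as np

def poisson_process_nll(rate_fn, event_times, t_max, n_quad=256):
    """Negative log-likelihood of observed event times under rate lambda(t):
    NLL = integral_0^T lambda(t) dt - sum_i log lambda(t_i).
    The integral is approximated by trapezoidal quadrature."""
    grid = np.linspace(0.0, t_max, n_quad)
    integral = np.trapz(rate_fn(grid), grid)
    log_term = np.sum(np.log(rate_fn(np.asarray(event_times)) + 1e-12))
    return integral - log_term

# Sanity check: for a constant rate lam on [0, T] with n events,
# NLL = lam * T - n * log(lam), minimized at the MLE lam = n / T.
nll = poisson_process_nll(lambda t: 2.0 * np.ones_like(t), [1.0, 2.0, 3.0], t_max=5.0)
# nll ≈ 2.0 * 5.0 - 3 * log(2.0) ≈ 7.92
```

The integral term penalizes predicting high rates where nothing happened, while the log term rewards high rates exactly at the observed event times.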
## 🚨 Signal Detection and Anomaly Analysis

PPAD's power lies in its ability to quantify how well new data fits the learned "normal" risk space.
### Anomaly Scoring for Batch Analysis

When a new batch of reports arrives (e.g., as part of a scheduled reporting cycle):

1. The core network weights f are frozen.
2. A new latent vector z_new is optimized to represent the new, updated event history.
3. The final loss value (negative log-likelihood) for that optimized z_new is the **anomaly score**.
**Signal trigger:** an unusually high anomaly score means the product's updated history is a poor fit for the patterns the model has learned, i.e., the rate function λ(t) has undergone a significant, uncharacteristic shift. This is what the included demonstration verifies: an injected surge of reports still produces a high loss after optimizing z_new, confirming a strong anomaly signal.
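The batch-scoring procedure can be sketched as follows. Here a simple two-parameter rate family stands in for the frozen neural decoder, and a coarse grid search stands in for gradient-based optimization of z_new; all names, the rate family, and the candidate grid are illustrative assumptions, not the repo's implementation.

```python
import numpy as np

def nll(rate_fn, times, t_max, n_quad=256):
    """Poisson-process NLL: integral of the rate minus sum of log-rates."""
    grid = np.linspace(0.0, t_max, n_quad)
    return np.trapz(rate_fn(grid), grid) - np.sum(np.log(rate_fn(np.asarray(times)) + 1e-12))

def anomaly_score(frozen_decoder, times, t_max, z_grid):
    """With decoder weights frozen, fit a new latent z to the updated
    history and return the best (lowest) NLL as the anomaly score."""
    scores = [nll(lambda t: frozen_decoder(z, t), times, t_max) for z in z_grid]
    best = int(np.argmin(scores))
    return scores[best], z_grid[best]

# Toy "frozen decoder": z = (base_rate, spike_height), bump centered at t = 5.
def decoder(z, t):
    base, spike = z
    return base + spike * np.exp(-((t - 5.0) ** 2))

z_grid = [(b, s) for b in (0.5, 1.0, 2.0) for s in (0.0, 2.0)]
steady = np.linspace(0.5, 9.5, 10)  # ~1 report per time unit, no surge
score, z_new = anomaly_score(decoder, steady, 10.0, z_grid)
```

A history the rate family explains well yields a low score (the steady history above is best fit by a constant rate of 1 with no spike); a surge the family cannot reproduce leaves the score high, which is the signal trigger.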
### Risk Space Clustering

The resulting latent vectors z for all products support deeper structural analysis:
- **Pattern-based classification:** analysts can visualize and cluster products by the shape of their reporting trends (e.g., all products with a sudden transient spike versus all products with a slow, logarithmic increase).
- **Identification of novelty:** a product whose z vector falls outside the established clusters in the latent space is flagged as a novel risk profile, helping to focus safety-investigation resources.
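A minimal sketch of the novelty check, assuming cluster centroids have already been estimated (e.g., by running k-means over the learned latent vectors); the function name, threshold, and all values here are illustrative.

```python
import numpy as np

def flag_novel(latents, centroids, threshold):
    """Flag products whose latent vector z sits far (Euclidean distance)
    from every established cluster centroid in the risk space."""
    d = np.linalg.norm(latents[:, None, :] - centroids[None, :, :], axis=-1)
    nearest = d.min(axis=1)  # distance to the closest centroid
    return nearest > threshold, nearest

# Toy latent space: two learned clusters plus one outlier product.
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
z = np.array([[0.1, -0.2], [4.8, 5.1], [9.0, -9.0]])
novel, dist = flag_novel(z, centroids, threshold=2.0)
# novel -> [False, False, True]
```

Products flagged `True` fall outside every known cluster and would be routed to a focused safety review.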
## 🛠️ Implementation Note: Operational Fit

PPAD fits naturally into systems that rely on scheduled batch processing (e.g., weekly, monthly, or quarterly reviews). While the method is highly sensitive to changes, optimizing a new latent vector z for each update has a nontrivial computational cost, so PPAD is best used for in-depth anomaly fingerprinting rather than instantaneous, millisecond-latency monitoring.
The project is organized into the following directories:
- `ppad_lib/`: a Python library containing the core logic.
  - `model.py`: the `PPADDecoder` model implementation.
  - `losses.py`: custom loss functions (Poisson NLL, TV penalty).
  - `simulation.py`: functions for generating synthetic SRS data.
  - `data_loader.py`: functions for loading and parsing data from CSV files.
  - `utils.py`: utility functions such as positional encoding.
- `notebooks/`: Jupyter notebooks demonstrating the library.
  - `demo.ipynb`: a basic demo of the core components.
  - `anomaly_detection_demo.ipynb`: an advanced, end-to-end workflow for training the model and detecting anomalies.
- `data/`: sample data files.
- `*.py` (root level): test scripts.
Follow these steps to set up and run the project.

First, clone the repository and install the required Python packages from `requirements.txt`:

```bash
# It is recommended to use a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
pip install -r requirements.txt
```

This project includes two main demonstrations.
To explore the basic components of the library, use the `demo.ipynb` notebook. First, launch Jupyter:

```bash
jupyter notebook
```

Then navigate to `notebooks/demo.ipynb` and run the cells.
To run the full, end-to-end anomaly detection workflow, execute the notebook runner script from the root directory:

```bash
python run_anomaly_demo.py
```

This script programmatically executes the `anomaly_detection_demo.ipynb` notebook, which trains the model, detects anomalies, and saves the following files to the `results/` directory:

- `anomaly_detection_summary.json`: the numerical results.
- `anomaly_detection_plot.png`: a plot visualizing the detected anomaly.
- `plot_interpretation.txt`: a text file explaining the results.
The `notebooks/anomaly_detection_demo.ipynb` notebook itself contains the full implementation and detailed explanations of each step.
To verify that the environment is set up correctly and the code is functional, run the included test scripts from the root directory:

```bash
# Run the core logic test
python run_demo_test.py

# Run the data loader unit tests
python test_data_loader.py

# Run the simulation unit tests
python test_simulation.py
```

A successful run should complete without errors.