diff --git a/SUMMARY.md b/SUMMARY.md
index c77bc79f3..2fb58cd47 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -636,15 +636,6 @@
 * [Aidbox Form Workflow](deprecated/deprecated/forms/aidbox-code-editor/aidbox-form-workflow.md)
 * [Changing data after form signing](deprecated/deprecated/forms/aidbox-code-editor/changing-data-after-form-signing.md)
 * [FHIR Questionnaire to Aidbox forms and back conversion](deprecated/deprecated/forms/aidbox-code-editor/fhir-questionnaire-to-aidbox-forms-and-back-conversion.md)
- * [MDM — Master Data Management](deprecated/deprecated/mdm/README.md)
- * [Run MDM locally](deprecated/deprecated/mdm/run-mdm-locally.md)
- * [Configure MDM module](deprecated/deprecated/mdm/configure-mdm-module.md)
- * [Find duplicates: $match](deprecated/deprecated/mdm/find-duplicates-match.md)
- * [Merging and Unmerging Records: $merge and $unmerge](deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md)
- * [RBAC configuration](deprecated/deprecated/mdm/rbac.md)
- * [Matching Model Explanation](deprecated/deprecated/mdm/matching-model-explanation.md)
- * [Mathematical Details](deprecated/deprecated/mdm/mathematical-details.md)
- * [MDM Module Resources](deprecated/deprecated/mdm/mdm-module-resources.md)
 * [AidboxDB](deprecated/deprecated/aidboxdb/README.md)
 * [HA AidboxDB](deprecated/deprecated/aidboxdb/ha-aidboxdb.md)
 * [Migrate to AidboxDB 16](deprecated/deprecated/aidboxdb/migrate-to-aidboxdb-16.md)
diff --git a/docs/deprecated/deprecated/mdm/configure-mdm-module.md b/docs/deprecated/deprecated/mdm/configure-mdm-module.md
deleted file mode 100644
index 0791b6e79..000000000
--- a/docs/deprecated/deprecated/mdm/configure-mdm-module.md
+++ /dev/null
@@ -1,444 +0,0 @@
----
-description: Configure MDM module with matching models for deduplication including examples and tuning notes.
----
-
-# Configure MDM module
-
-{% hint style="warning" %}
-The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager.
-{% endhint %}
-
-{% hint style="info" %}
-The matching model defines the target resource type via the `resource` field (for example, `Patient`, `Practitioner`, `Organization`, or any custom FHIR resource).
-{% endhint %}
-
-The example in the "Add model to MDM backend" section below provides a **basic model** that allows you to **start the MDM module** and **test** its functionality. For a detailed explanation of all model elements and matching logic, see [Matching Model Explanation](matching-model-explanation.md).
-
-## Create OAuth Client for MDM frontend
-
-To enable authentication for the MDM frontend, create an OAuth client in Aidbox:
-
-```yaml
-PUT /fhir/Client/mpi-dev
-content-type: text/yaml
-accept: text/yaml
-
-id: mpi-dev
-auth:
-  authorization_code:
-    redirect_uri: https://mdm.example.com/api/auth/callback/aidbox
-    token_format: jwt
-    refresh_token: true
-    secret_required: true
-    access_token_expiration: 36000
-    refresh_token_expiration: 864000
-secret: pass
-first_party: true
-grant_types:
-- code
-```
-
-## Add admin privileges to your user
-
-Navigate to: **Aidbox → IAM → Users → Your Admin**
-
-1. Open the Aidbox dashboard.
-2. Go to the IAM (Identity and Access Management) section.
-3. Select Users.
-4. Find and open your Admin user profile.
-
-Add the following section to the user configuration JSON:
-
-```json
-{
-  "data": {
-    "groups": [
-      "SIT_EMPI_ADMIN_DEV"
-    ]
-  }
-}
-```
-
-## Create SQL functions
-
-Create the following SQL functions in your Aidbox database:
-
-```sql
-CREATE OR REPLACE FUNCTION public.immutable_unaccent(x text)
-  RETURNS text
-  LANGUAGE sql
-  IMMUTABLE
-  AS $function$
-    SELECT
-      unaccent($1);
-$function$;
-
-CREATE OR REPLACE FUNCTION public.immutable_unaccent_upper(text)
-  RETURNS text
-  LANGUAGE plpgsql
-  IMMUTABLE
-  AS $function$
-BEGIN
-  RETURN upper(public.unaccent($1));
-END;
-$function$;
-
-CREATE OR REPLACE FUNCTION public.immutable_remove_spaces_unaccent_upper(text)
-  RETURNS text
-  LANGUAGE plpgsql
-  IMMUTABLE
-  AS $function$
-BEGIN
-  RETURN replace(upper(public.unaccent($1)), ' ', '');
-END;
-$function$;
-```
-
-## Create database indexes
-
-{% hint style="info" %}
-The indexes below are **recommendations** that work well with the example **Patient** model from the "Add model to MDM backend" section. If your model targets a different resource, adapt table names and expressions accordingly.
-{% endhint %}
-
-Create the following indexes to optimize matching performance and resource reference lookups:
-
-```sql
--- Patient indexes for matching and search
-CREATE INDEX IF NOT EXISTS patient_full_name_idx_mdm ON public.patient USING btree ((((immutable_unaccent_upper((resource #>> '{name,0,family}'::text[])) || ' '::text) || immutable_unaccent_upper((resource #>> '{name,0,given,0}'::text[]))))); -- match blocks
-CREATE INDEX IF NOT EXISTS patient_given_gin_idx_mdm ON public.patient USING gin (((resource #>> '{name,0,given,0}'::text[])) gin_trgm_ops); -- search by partial given
-CREATE INDEX IF NOT EXISTS patient_family_gin_idx_mdm ON public.patient USING gin (((resource #>> '{name,0,family}'::text[])) gin_trgm_ops); -- search by partial family
-CREATE INDEX IF NOT EXISTS patient_given_btree_idx_mdm ON public.patient USING btree (immutable_unaccent_upper((resource #>> '{name,0,given,0}'::text[]))); -- search by exact given
-CREATE INDEX IF NOT EXISTS patient_family_btree_idx_mdm ON public.patient USING btree (immutable_unaccent_upper((resource #>> '{name,0,family}'::text[]))); -- search by exact family
-CREATE INDEX IF NOT EXISTS patient_email_idx_mdm ON public.patient USING gin (jsonb_path_query_array(resource, '$."telecom"[*]?(@."system" == "email")."value"'::jsonpath) jsonb_path_ops); -- search by email
-CREATE INDEX IF NOT EXISTS patient_identifier_idx_mdm ON public.patient USING gin (jsonb_path_query_array(resource, '$."identifier"[*]."value"'::jsonpath) jsonb_path_ops); -- search by identifier
-CREATE INDEX IF NOT EXISTS patient_phone_idx_mdm ON public.patient USING gin (jsonb_path_query_array(resource, '$."telecom"[*]?(@."system" == "phone")."value"'::jsonpath) jsonb_path_ops); -- search by phone
-CREATE INDEX IF NOT EXISTS patient_address_line_btree_idx_mdm ON public.patient USING btree (immutable_remove_spaces_unaccent_upper((resource #>> '{address,0,line,0}'::text[]))); -- match blocks
-CREATE INDEX IF NOT EXISTS patient_identifier_idx2_mdm ON public.patient USING gin (((resource #> '{identifier}'::text[]))); -- for second model, review needed
-CREATE INDEX IF NOT EXISTS patient_birthdate_idx_mdm ON public.patient USING btree (((resource #>> '{birthDate}'::text[]))); -- match blocks
-
--- Observation indexes for merge/unmerge operations
-CREATE INDEX IF NOT EXISTS observation_encounter_references_idx_mdm ON public.observation USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge
-CREATE INDEX IF NOT EXISTS observation_patient_references_idx_mdm ON public.observation USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge
-
--- Specimen indexes for merge operations
-CREATE INDEX IF NOT EXISTS specimen_patient_references_idx_mdm ON public.specimen USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge
-
--- DiagnosticReport indexes for merge/unmerge operations
-CREATE INDEX IF NOT EXISTS diagnosticreport_patient_references_idx_mdm ON public.diagnosticreport USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge
-CREATE INDEX IF NOT EXISTS diagnosticreport_encounter_references_idx_mdm ON public.diagnosticreport USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge
-
--- Encounter indexes for merge operations
-CREATE INDEX IF NOT EXISTS encounter_patient_references_idx_mdm ON public.encounter USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge
-CREATE INDEX IF NOT EXISTS encounter_identifier_idx_mdm ON public.encounter USING gin ((jsonb_path_query_array(resource, '$."identifier".**."value"')) jsonb_path_ops);
-
--- Condition indexes for merge/unmerge operations
-CREATE INDEX IF NOT EXISTS condition_patient_references_idx_mdm ON public.condition USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge
-CREATE INDEX IF NOT EXISTS condition_encounter_references_idx_mdm ON public.condition USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge
-
--- Media indexes for merge/unmerge operations
-CREATE INDEX IF NOT EXISTS media_patient_references_idx_mdm ON public.media USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge
-CREATE INDEX IF NOT EXISTS media_encounter_references_idx_mdm ON public.media USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge
-
--- SourceMessage indexes for merge/unmerge operations
-CREATE INDEX IF NOT EXISTS sourcemessage_patient_references_idx_mdm ON public.sourcemessage USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath));
-CREATE INDEX IF NOT EXISTS sourcemessage_encounter_references_idx_mdm ON public.sourcemessage USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath));
-```
-
-## Add model to MDM backend
-
-Matching models are stored in the **MDM server (backend)**, not in Aidbox. You can manage them via:
-
-* **Admin UI**: `https://mdm.example.com/admin`
-* **API**: `POST /MatchingModel`, `PUT /MatchingModel`, `GET /MatchingModel`
-
-Authentication is **optional**. If it is enabled, MDM uses **Aidbox OAuth** for access control.
-
-Example of creating a **MatchingModel** via the MDM backend API:
-
-```http
-POST /MatchingModel
-Content-Type: application/json
-
-{
-  "id": "model",
-  "resource": "Patient",
-  "thresholds": {
-    "auto": 25,
-    "manual": 16
-  },
-  "blocks": {
-    "fn": {
-      "var": "name"
-    },
-    "dob": {
-      "var": "dob"
-    },
-    "addr": {
-      "sql": "(l.#address = r.#address)"
-    }
-  },
-  "vars": {
-    "dob": "(#.resource#>>'{birthDate}')",
-    "name": "((#.#family) || ' ' || (#.#given))",
-    "given": "(immutable_unaccent_upper(#.resource#>>'{name,0,given,0}'))",
-    "family": "(immutable_unaccent_upper(#.resource#>>'{name,0,family}'))",
-    "gender": "(#.resource#>>'{gender}')",
-    "address": "(immutable_remove_spaces_unaccent_upper(#.resource#>>'{address,0,line,0}'))",
-    "telecomArray": "array(select jsonb_array_elements_text(jsonb_path_query_array( #.resource, '$.telecom[*] ? (@.value != \"\").value')))",
-    "addressLength": "(length(#.resource#>>'{address,0,line,0}'))"
-  },
-  "features": {
-    "fn": [
-      {
-        "bf": 0,
-        "expr": "( l.resource->'name' IS NULL OR r.resource->'name' IS NULL )"
-      },
-      {
-        "bf": 13.336495228175629,
-        "expr": "l.#name = r.#name"
-      },
-      {
-        "bf": 13.104401641242227,
-        "expr": "r.#given = l.#family AND l.#given = r.#family"
-      },
-      {
-        "bf": 5.36329167966839,
-        "expr": "r.#family = l.#family AND length(l.#given) <= 5 AND length(r.#given) <= 5 AND levenshtein(l.#given, r.#given) <= 2"
-      },
-      {
-        "bf": 9.288385498954133,
-        "expr": "levenshtein(l.#name, r.#name) <= 2"
-      },
-      {
-        "bf": 10.36329167966839,
-        "expr": "r.#given = l.#given AND string_to_array(l.#family, ' ') && string_to_array(r.#family, ' ')"
-      },
-      {
-        "bf": 10.36329167966839,
-        "expr": "r.#family = l.#family AND string_to_array(l.#given, ' ') && string_to_array(r.#given, ' ')"
-      },
-      {
-        "bf": 2.402276401131933,
-        "expr": "r.#given = l.#given"
-      },
-      {
-        "else": -12.37233293924643
-      }
-    ],
-    "dob": [
-      {
-        "bf": 0,
-        "expr": "( l.#dob IS NULL OR r.#dob IS NULL )"
-      },
-      {
-        "bf": 10.59415069916466,
-        "expr": "l.#dob = r.#dob"
-      },
-      {
-        "bf": 3.9911610470417744,
-        "expr": "levenshtein(l.#dob, r.#dob) <= 1"
-      },
-      {
-        "bf": 0.5164298695732575,
-        "expr": "levenshtein(l.#dob, r.#dob) <= 2"
-      },
-      {
-        "else": -10.322063538772698
-      }
-    ],
-    "telecom": [
-      {
-        "bf": 0,
-        "expr": "( l.#telecomArray IS NULL OR r.#telecomArray IS NULL OR array_length(l.#telecomArray, 1) IS NULL OR array_length(r.#telecomArray, 1) IS NULL )"
-      },
-      {
-        "bf": 6.465648574292063,
-        "expr": "l.#telecomArray && r.#telecomArray"
-      },
-      {
-        "else": -10.517360697819983
-      }
-    ],
-    "address": [
-      {
-        "bf": 0,
-        "expr": "( l.#address IS NULL OR r.#address IS NULL )"
-      },
-      {
-        "bf": 9.236771286242664,
-        "expr": "((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address))"
-      },
-      {
-        "bf": 7.465648574292063,
-        "expr": "(l.#addressLength = r.#addressLength) and (l.#address = r.#address)"
-      },
-      {
-        "else": -10.517360697819983
-      }
-    ],
-    "sex": [
-      {
-        "bf": 0,
-        "expr": "( l.#gender IS NULL OR r.#gender IS NULL )"
-      },
-      {
-        "bf": 1.8504082299552485,
-        "expr": "l.#gender = r.#gender"
-      },
-      {
-        "else": -4.842034404727677
-      }
-    ]
-  }
-}
-```
-
-### Matching Model Tuning
-
-The example model is intended for **testing and demonstration purposes** and may not deliver optimal results out of the box.
-
-For production use and reliable, accurate matching on your data, you should:
-
-* **Adapt the model** to reflect your data specifics and your definition of a correct match.
-* **Calibrate feature weights** using your real-world data. This step typically involves **machine learning** and **manual expert tuning**.
-
-{% hint style="success" %}
-We offer a **professional service** for model training and expert tuning.\
-If you need assistance, please [contact us](../../overview/contact-us.md).
-{% endhint %}
-
-### Performance considerations
-
-For fast and accurate matching, consider the following:
-
-* **Database indexes:** If you are working with large volumes of records, ensure proper database indexes are created to keep matching fast and scalable.
-* **Data normalization:** Matching quality depends heavily on well‑normalized input data. Avoid using placeholders like `"UNKNOWN"` or `"not provided"` for names, addresses, or birthdates, as they negatively impact results.
-
-## Configure Audit Events (Optional)
-
-The MDM module can track and export audit events for compliance and monitoring purposes. When enabled, the system generates FHIR AuditEvent resources for operations like:
-
-* Merge/unmerge operations
-* Search and matching
-* Marking/unmarking duplicates
-* Record creation and viewing
-
-### Enable Audit Worker
-
-To enable audit event collection and export, configure the following environment variables in your backend service:
-
-```bash
-# Enable audit worker
-MPI_AUDIT_WORKER_ENABLE=true
-
-# URL where audit events will be sent (FHIR Bundle endpoint)
-MPI_AUDIT_CONSUMER_URL=http://your-audit-repository:8080/fhir/Bundle
-
-# Polling interval in milliseconds (how often to check for pending events)
-MPI_AUDIT_INTERVAL=1000
-
-# Number of events to process per batch
-MPI_AUDIT_BATCH_SIZE=10
-
-# PostgreSQL advisory lock ID (prevents concurrent workers)
-MPI_AUDIT_LOCK_ID=54321
-```
-
-### How it works
-
-1. **Event Collection**: The system creates FHIR AuditEvent resources for auditable operations and stores them in the `mpi.audit_event` table with `send_status = 'pending'`.
-
-2. **Worker Processing**: The audit worker periodically:
-   - Fetches pending audit events (up to `MPI_AUDIT_BATCH_SIZE` per run)
-   - Bundles them into a FHIR Bundle (type: "collection")
-   - POSTs the bundle to the URL configured in `MPI_AUDIT_CONSUMER_URL`
-   - Marks events as `delivered` on successful response (HTTP 2xx)
-
-3. **Event Format**: Each audit event includes:
-   - Operation type and outcome (success/failure)
-   - User information (from Aidbox IAM)
-   - Affected resources (primary resources, related resources)
-   - Timestamp and source system details
-
-### Audit Repository Requirements
-
-The audit events are sent as FHIR AuditEvent resources following the [BALP (Basic Audit Log Patterns)](https://profiles.ihe.net/ITI/BALP/) specification. You can use any FHIR-compliant audit repository, but we recommend **Auditbox** for optimal integration and audit log management.
-
-{% hint style="success" %}
-**Recommended**: Use [Auditbox](https://www.health-samurai.io/auditbox) for comprehensive audit event storage, querying, and compliance reporting with built-in FHIR AuditEvent support.
-{% endhint %}
-
-Your audit consumer endpoint should:
-
-- Accept FHIR Bundle resources via HTTP POST
-- Support `application/json` content type
-- Return HTTP 2xx status for successful processing
-- Handle Bundle resources with `type: "collection"` containing AuditEvent entries
-- Support FHIR AuditEvent resources (R4 specification)
-
-## Configure Merge/Unmerge Notifications (Optional)
-
-The notification worker sends real-time alerts when merge or unmerge operations occur, allowing external systems to react to record changes.
-
-### Enable Notification Worker
-
-Configure the following environment variables in your backend service:
-
-```bash
-# Enable notification worker
-MPI_NOTIFICATION_WORKER_ENABLE=true
-
-# URL where notifications will be sent
-MPI_NOTIFICATION_CONSUMER_URL=http://your-consumer-service:9876/notifications
-
-# Polling interval in milliseconds
-MPI_NOTIFICATION_INTERVAL=1000
-
-# Number of notifications to process per batch
-MPI_NOTIFICATION_BATCH_SIZE=10
-
-# PostgreSQL advisory lock ID (prevents concurrent workers)
-MPI_NOTIFICATION_LOCK_ID=12345
-```
-
-### How it works
-
-1. **Event Tracking**: When merge/unmerge operations complete, they are marked with `notification_status = 'not_delivered'` in the database.
-
-2. **Worker Processing**: The notification worker periodically:
-   - Fetches undelivered merge and unmerge operations (up to `MPI_NOTIFICATION_BATCH_SIZE` per run)
-   - POSTs them to the URL configured in `MPI_NOTIFICATION_CONSUMER_URL`
-   - Marks them as `delivered` on successful response (HTTP 2xx)
-
-3. **Notification Payload**: The worker sends a JSON payload containing (Patient example):
-
-```json
-{
-  "merges": [
-    {
-      "id": "merge-id",
-      "target-patient-id": "Patient/123",
-      "source-patient-id": "Patient/456",
-      "related-resources-refs": ["Observation/789", "Encounter/012"],
-      "result-patient": { /* FHIR Patient resource */ }
-    }
-  ],
-  "unmerges": [
-    {
-      "id": "unmerge-id",
-      "merge-id": "original-merge-id",
-      "source-patient": { /* Restored Patient resource */ },
-      "user-id": "user-123",
-      "related-resources": ["Observation/789", "Encounter/012"]
-    }
-  ]
-}
-```
-
-### Consumer Endpoint Requirements
-
-Your notification consumer endpoint should:
-
-- Accept HTTP POST requests with `Content-Type: application/json`
-- Process the payload containing `merges` and `unmerges` arrays
-- Return HTTP 2xx status for successful processing
diff --git a/docs/deprecated/deprecated/mdm/find-duplicates-match.md b/docs/deprecated/deprecated/mdm/find-duplicates-match.md
deleted file mode 100644
index 255ee83ab..000000000
--- a/docs/deprecated/deprecated/mdm/find-duplicates-match.md
+++ /dev/null
@@ -1,191 +0,0 @@
----
-description: Use $match operation to find potential duplicate records with configurable query parameters and scoring.
----
-
-# Find duplicates: $match
-
-{% hint style="warning" %}
-The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager.
-{% endhint %}
-
-{% hint style="warning" %}
-To use the `$match` operation, you need to set up an MDM module. Read the [MDM manual](./) to learn how to run and use it.
-{% endhint %}
-
-The `$match` operation is used to **find potential duplicate records**.
-
-It performs a probabilistic search based on a **matching model** that compares the record you provide with other records in the system across multiple features and estimates how similar they are. The structure of the matching model and its parameters are described on the [Matching Model Explanation](matching-model-explanation.md) page.
-
-The **result is a list of potential duplicates**, each with a calculated match score and a detailed breakdown of feature similarity.
-
-Below we use **Patient** as an example, but the same flow works for any resource type your matching model targets.
-
-This page provides key information about using `$match`. For full API details, refer to our [Swagger documentation](https://dev.mdm.health-samurai.io/backend/static/swagger.html).
-
-## $match
-
-The match operation can be initiated either through the **MDM user interface** or by using the **API**.
-
-The `$match` operation supports several **query parameters** that let you control how matching is performed and how results are returned:
-
-| Name | Type | Default | Description | Example |
-|---|---|---|---|---|
-| `model` | string | `model` | Matching model ID to be used for matching | `model` |
-| `threshold` | integer | `0` | Minimum score threshold for a candidate to appear in the match results | `0` |
-| `page` | integer | `1` | Page number of results | `1` |
-| `size` | integer | `10` | Number of results per page | `10` |
-
-```http
-POST /fhir/Patient/$match?model=model&threshold=10&page=1&size=10
-Content-Type: application/json
-
-{
-  "resourceType": "Parameters",
-  "parameter": [
-    {
-      "name": "resource",
-      "resource": {
-        "name": [
-          {
-            "given": [
-              "Freya"
-            ],
-            "family": "Shah"
-          }
-        ],
-        "address": [
-          {
-            "city": "London"
-          }
-        ],
-        "birthDate": "1970-12-17"
-      }
-    }
-  ]
-}
-```
-
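The request above can also be assembled programmatically. The following is a minimal sketch, not the MDM client API: the helper names `build_match_url` and `build_match_body` and the base URL `https://mdm.example.com` are assumptions made for illustration; only the path, the four query parameters, and the `Parameters` wrapper come from this page.

```python
from urllib.parse import urlencode


def build_match_url(base_url, resource_type="Patient", model="model",
                    threshold=0, page=1, size=10):
    """Build a $match URL; the defaults mirror the query-parameter table."""
    query = urlencode({
        "model": model,
        "threshold": threshold,
        "page": page,
        "size": size,
    })
    return f"{base_url}/fhir/{resource_type}/$match?{query}"


def build_match_body(resource):
    """Wrap the record to be matched in a FHIR Parameters resource."""
    return {
        "resourceType": "Parameters",
        "parameter": [{"name": "resource", "resource": resource}],
    }


# Hypothetical base URL; replace it with the address of your own deployment.
url = build_match_url("https://mdm.example.com", threshold=10)
body = build_match_body({
    "name": [{"given": ["Freya"], "family": "Shah"}],
    "address": [{"city": "London"}],
    "birthDate": "1970-12-17",
})
print(url)
```

The body can then be sent with any HTTP client as JSON with `Content-Type: application/json`.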
-As a result, you will receive the following:
-
-* A **list of candidate duplicate records**
-* For each candidate record:
- * `match_weight` — an overall similarity score calculated by the matching model
- * `match_details` — per-feature similarity contributions (e.g., name similarity, date of birth match, address closeness, etc.)
- * `resource` — the full FHIR resource for that candidate
-
-The response is sorted by `match_weight` in descending order so that the most similar records appear first.
-
-For example:
-
-```json
-[
- {
- "match_details": {
- "fn": 13.336495228175629,
- "dob": 10.59415069916466,
- "ext": -10.517360697819983,
- "sex": 0
- },
- "match_weight": 13.413285229520307,
- "resource": {
- "id": "236",
- "resourceType": "Patient",
- "name": [
- {
- "given": [
- "Freya"
- ],
- "family": "Shah"
- }
- ],
- "address": [
- {
- "city": "Londodn"
- }
- ],
- "birthDate": "1970-12-17",
- "identifier": [
- {
- "value": "62",
- "system": "cluster"
- }
- ]
- }
- },
- {
- "match_details": {
- "fn": 13.336495228175629,
- "dob": 10.59415069916466,
- "ext": -10.517360697819983,
- "sex": 0
- },
- "match_weight": 13.413285229520307,
- "resource": {
- "id": "242",
- "resourceType": "Patient",
- "name": [
- {
- "given": [
- "Freya"
- ],
- "family": "Shah"
- }
- ],
- "address": [
- {
- "city": "Lonnod"
- }
- ],
- "birthDate": "1970-12-17",
- "identifier": [
- {
- "value": "62",
- "system": "cluster"
- }
- ]
- }
- },
- {
- "match_details": {
- "fn": 13.104401641242227,
- "dob": 10.59415069916466,
- "ext": -10.517360697819983,
- "sex": 0
- },
- "match_weight": 13.181191642586905,
- "resource": {
- "id": "238",
- "resourceType": "Patient",
- "name": [
- {
- "given": [
- "Shah"
- ],
- "family": "Freya"
- }
- ],
- "address": [
- {
- "city": "London"
- }
- ],
- "telecom": [
- {
- "value": "f.s@flynn.com",
- "system": "email"
- }
- ],
- "birthDate": "1970-12-17",
- "identifier": [
- {
- "value": "62",
- "system": "cluster"
- }
- ]
- }
- }
-]
-```
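In the example response, each `match_weight` equals the sum of the values in `match_details`. The sketch below checks this for the first and third candidates and reproduces the descending sort. It assumes the relation holds in general, which this page does not state explicitly; `total_weight` is a hypothetical helper name.

```python
import math


def total_weight(match_details):
    """Sum the per-feature contributions of one candidate."""
    return sum(match_details.values())


# Scores copied from the example response (resources omitted for brevity).
candidates = [
    {
        "id": "238",
        "match_details": {"fn": 13.104401641242227, "dob": 10.59415069916466,
                          "ext": -10.517360697819983, "sex": 0},
        "match_weight": 13.181191642586905,
    },
    {
        "id": "236",
        "match_details": {"fn": 13.336495228175629, "dob": 10.59415069916466,
                          "ext": -10.517360697819983, "sex": 0},
        "match_weight": 13.413285229520307,
    },
]

for c in candidates:
    # isclose tolerates floating-point rounding in the last digit.
    assert math.isclose(total_weight(c["match_details"]), c["match_weight"])

# Most similar records first, as in the response.
ranked = sorted(candidates, key=lambda c: c["match_weight"], reverse=True)
print([c["id"] for c in ranked])
```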
diff --git a/docs/deprecated/deprecated/mdm/matching-model-explanation.md b/docs/deprecated/deprecated/mdm/matching-model-explanation.md
deleted file mode 100644
index c0f36691c..000000000
--- a/docs/deprecated/deprecated/mdm/matching-model-explanation.md
+++ /dev/null
@@ -1,321 +0,0 @@
----
-description: >-
- This page explains how the MDM matching model works, describing its structure,
- scoring logic, and configurable elements with an example.
----
-
-# Matching Model Explanation
-
-{% hint style="warning" %}
-The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager.
-{% endhint %}
-
-{% hint style="info" %}
-This page provides the **matching model code** and explains its elements.\
-For an overview of probabilistic matching concepts and match score calculation, see our article [Master Patient Index and Record Linkage](https://www.health-samurai.io/articles/master-patient-index-and-record-linkage).
-{% endhint %}
-
-The example model on this page matches **Patient** records, but the same approach can be adapted to detect duplicates for any type of resource.\
-If you are interested in applying this approach to your use case, please [contact us](../../overview/contact-us.md).
-
-Matching models are stored in the **MDM server (backend)** and managed via the `/MatchingModel` API or the `/admin` UI.
-
-Below we use **Patient** as an example to illustrate the model structure.
-
-## Core Idea
-
-The model compares selected fields from records and evaluates predefined comparison rules.\
-Each rule in the **features** section contains an expression `expr` and an associated weight `bf` (Bayes Factor), indicating how strongly a match or mismatch on that field affects the total score.
-
-All weights are summed into a **total score**. If the score is at or above the defined threshold, the record pair is included in the match results; if it is below, it is excluded.
-
-## Model Structure
-
-**Which fields to compare** and **how to compare** them are described in the example model:
-
-{
- "id": "model",
- "vars": {
- "dob": "(#.resource#>>'{birthDate}')",
- "name": "((#.#family) || ' ' || (#.#given))",
- "given": "(immutable_unaccent_upper(#.resource#>>'{name,0,given,0}'))",
- "family": "(immutable_unaccent_upper(#.resource#>>'{name,0,family}'))",
- "gender": "(#.resource#>>'{gender}')",
- "address": "(#.resource#>>'{address,0,line,0}')",
- "addressLength": "(length(#.resource#>>'{address,0,line,0}'))",
- "telecomArray": "array(select jsonb_array_elements_text(jsonb_path_query_array( #.resource, '$.telecom[*] ? (@.value != \"\").value')))"
- },
- "blocks": {
- "fn": {
- "var": "name"
- },
- "dob": {
- "var": "dob"
- },
- "addr": {
- "sql": "(l.#address % r.#address)"
- }
- },
- "features": {
- "fn": [
- {
- "bf": 0,
- "expr": " ( l.resource->'name' IS NULL OR r.resource->'name' IS NULL )"
- },
- {
- "bf": 13.336495228175629,
- "expr": "l.#name = r.#name"
- },
- {
- "bf": 13.104401641242227,
- "expr": "r.#given = l.#family AND l.#given = r.#family"
- },
- {
- "bf": 9.288385498954133,
- "expr": "levenshtein(l.#name, r.#name) <= 2"
- },
- {
- "bf": 10.36329167966839,
- "expr": "r.#given = l.#given AND string_to_array(l.#family, ' ') && string_to_array(r.#family, ' ')"
- },
- {
- "bf": 10.36329167966839,
- "expr": "r.#family = l.#family AND string_to_array(l.#given, ' ') && string_to_array(r.#given, ' ')"
- },
- {
- "bf": 2.402276401131933,
- "expr": "r.#given = l.#given"
- },
- {
- "else": -12.37233293924643
- }
- ],
- "dob": [
- {
- "bf": 0,
- "expr": " ( l.#dob IS NULL OR r.#dob IS NULL )"
- },
- {
- "bf": 10.59415069916466,
- "expr": "l.#dob = r.#dob"
- },
- {
- "bf": 3.9911610470417744,
- "expr": "levenshtein(l.#dob, r.#dob) <= 1"
- },
- {
- "bf": 0.5164298695732575,
- "expr": "levenshtein(l.#dob, r.#dob) <= 2"
- },
- {
- "else": -10.322063538772698
- }
- ],
- "ext": [
- {
- "bf": 9.236771286242664,
- "expr": "((l.#telecomArray && r.#telecomArray) AND (((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address))))"
- },
- {
- "bf": 7.465648574292063,
- "expr": "(((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address)))"
- },
- {
- "bf": 6.465648574292063,
- "expr": "l.#telecomArray && r.#telecomArray"
- },
- {
- "else": -10.517360697819983
- }
- ],
- "sex": [
- {
- "bf": 0,
- "expr": " ( l.#gender IS NULL OR r.#gender IS NULL )"
- },
- {
- "bf": 1.8504082299552485,
- "expr": " l.#gender = r.#gender"
- },
- {
- "else": -4.842034404727677
- }
- ]
- },
- "resource": "Patient",
- "thresholds": {
- "auto": 25,
- "manual": 16
- },
- "resourceType": "MatchingModel"
-}
-
-
-### **Variables (`vars`)**
-
-**Variables** defined in the model can **reference resource fields** directly or be composed from them using expressions (e.g., concatenating values, applying normalization, or calculating derived values). These variables are used in feature expressions and blocking rules.
-
-* `dob` – birth date (if applicable)
-* `name` – concatenation of family and given names
-* `given` – normalized first name (accents removed, uppercase)
-* `family` – normalized last name (accents removed, uppercase)
-* `gender` – gender value
-* `address` – first address line
-* `addressLength` – length of the first address line
-* `telecomArray` – non-empty contact values (phone, email)
-
-### **Comparison Blocks (`blocks`)**
-
-Blocking rules **limit** the number of candidate record pairs by selecting only those that **share key characteristics** (e.g., similar names, matching birth dates, or addresses).\
-This **reduces** the number of comparisons, which significantly **speeds up processing**, while still preserving potential matches for scoring.
-
-* `fn`: blocks by name
-* `dob`: blocks by date of birth
-* `addr`: blocks by address
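The effect of blocking can be illustrated outside the database. This is a minimal in-memory sketch, not how the MDM backend implements it (the model expresses blocks as SQL); `candidate_pairs` and the sample records are hypothetical, and only exact-key blocks such as `fn` and `dob` are modelled.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, key_funcs):
    """Return the id pairs that share at least one blocking key."""
    pairs = set()
    for key in key_funcs:
        blocks = defaultdict(list)
        for rec in records:
            value = key(rec)
            if value is not None:  # records without a key join no block
                blocks[value].append(rec["id"])
        for ids in blocks.values():
            pairs.update(combinations(sorted(ids), 2))
    return pairs


records = [
    {"id": "a", "name": "SHAH FREYA", "dob": "1970-12-17"},
    {"id": "b", "name": "SHAH FREIA", "dob": "1970-12-17"},
    {"id": "c", "name": "DOE JOHN", "dob": "1985-01-01"},
]

pairs = candidate_pairs(records, [lambda r: r["name"], lambda r: r["dob"]])
print(pairs)
```

Three records give three possible pairs, but only the pair that shares a birth date is passed on to scoring.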
-
-### **Matching Features and Scoring**
-
-Features describe **how resource fields are compared** and **how much each comparison influences** the overall **match score**.
-
-Each feature contains:
-
-* `expr` – a logical expression that compares values of specific fields or variables between two records.
-* `bf` (Bayes factor / weight) – a numeric value representing how strongly a match or mismatch on that feature affects the total score.
-
-When records are compared, all satisfied feature expressions **add their weights** to the total score. If none of a feature's expressions is satisfied, its `else` weight (negative in this model) is applied instead. The result is an aggregated score reflecting the likelihood that two records refer to the same entity.
-
-{% hint style="info" %}
-The model uses **Levenshtein distance** to tolerate typos and small text differences. It counts how many single‑character edits (insertions, deletions, substitutions) are needed to make two strings equal.\
-For example, `levenshtein('Jonathan', 'Jonatan') = 1`.
-{% endhint %}
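For readers who want to experiment with the distance outside the database, here is a plain-Python sketch of the classic dynamic-programming algorithm. It illustrates the metric only; the model itself calls a SQL `levenshtein` function (in PostgreSQL one is provided by the `fuzzystrmatch` extension, and whether this setup relies on that extension is an assumption).

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("Jonathan", "Jonatan"))
print(levenshtein("1970-12-17", "1970-12-71"))
```

Note that two swapped digits in a birth date count as two edits, so such a pair only satisfies the `<= 2` rule of the `dob` feature.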
-
-#### **Name Matching (`fn`)**:
-
-* Exact match: 13.34 points
-* Swapped first/last names: 13.10 points
-* Levenshtein distance ≤ 2: 9.29 points
-* Partial matches (same given name and overlapping family-name parts, or same family name and overlapping given-name parts): 10.36 points
-* Same first name only: 2.40 points
-* No match: -12.37 points
-
-```json
-"fn": [
- {
- "bf": 0,
- "expr": " ( l.resource->'name' IS NULL OR r.resource->'name' IS NULL )"
- },
- {
- "bf": 13.336495228175629,
- "expr": "l.#name = r.#name"
- },
- {
- "bf": 13.104401641242227,
- "expr": "r.#given = l.#family AND l.#given = r.#family"
- },
- {
- "bf": 9.288385498954133,
- "expr": "levenshtein(l.#name, r.#name) <= 2"
- },
- {
- "bf": 10.36329167966839,
- "expr": "r.#given = l.#given AND string_to_array(l.#family, ' ') && string_to_array(r.#family, ' ')"
- },
- {
- "bf": 10.36329167966839,
- "expr": "r.#family = l.#family AND string_to_array(l.#given, ' ') && string_to_array(r.#given, ' ')"
- },
- {
- "bf": 2.402276401131933,
- "expr": "r.#given = l.#given"
- },
- {
- "else": -12.37233293924643
- }
-]
-```
-
-#### **Date of Birth Matching (`dob`)**:
-
-* Exact match: 10.59 points
-* Levenshtein distance ≤ 1: 3.99 points
-* Levenshtein distance ≤ 2: 0.52 points
-* No match: -10.32 points
-
-```json
-"dob": [
- {
- "bf": 0,
- "expr": " ( l.#dob IS NULL OR r.#dob IS NULL )"
- },
- {
- "bf": 10.59415069916466,
- "expr": "l.#dob = r.#dob"
- },
- {
- "bf": 3.9911610470417744,
- "expr": "levenshtein(l.#dob, r.#dob) <= 1"
- },
- {
- "bf": 0.5164298695732575,
- "expr": "levenshtein(l.#dob, r.#dob) <= 2"
- },
- {
- "else": -10.322063538772698
- }
-]
-```
-
-#### **Address Matching (`ext`)**:
-
-* Matching contact information and similar address: 9.24 points
-* Similar address only: 7.47 points
-* Matching contact information only: 6.47 points
-* No match: -10.52 points
-
-```json
-"ext": [
- {
- "bf": 9.236771286242664,
- "expr": "((l.#telecomArray && r.#telecomArray) AND (((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address))))"
- },
- {
- "bf": 7.465648574292063,
- "expr": "(((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address)))"
- },
- {
- "bf": 6.465648574292063,
- "expr": "l.#telecomArray && r.#telecomArray"
- },
- {
- "else": -10.517360697819983
- }
-]
-```
-
-#### **Gender Matching (`sex`)**:
-
-* Exact match: 1.85 points
-* No match: -4.84 points
-
-```json
-"sex": [
- {
- "bf": 0,
- "expr": " ( l.#gender IS NULL OR r.#gender IS NULL )"
- },
- {
- "bf": 1.8504082299552485,
- "expr": " l.#gender = r.#gender"
- },
- {
- "else": -4.842034404727677
- }
-]
-```
-
-### **Thresholds**
-
-Thresholds define the **decision boundaries** for match results.\
-After the total score is calculated based on all feature comparisons, it is compared against threshold values:
-
-* `auto`: matching score ≥ 25 → automatic merge can be processed
-* `manual`: 16 ≤ matching score < 25 → manual review required
-* Below `manual` – score < 16 → non‑match
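-
-As a worked illustration of how the thresholds apply, the sums below add one weight per feature from the example model above, rounded to two decimals. This is a sketch under the assumption that exactly one rule fires per feature (the first rule whose `expr` holds); both record pairs are hypothetical.
-
-Pair A: exact name, exact date of birth, similar address only, same gender:
-
-$$
-13.34 + 10.59 + 7.47 + 1.85 = 33.25 \ge 25 \;\Rightarrow\; \text{auto}
-$$
-
-Pair B: name within Levenshtein distance 2, date of birth within distance 1, shared telecom only, same gender:
-
-$$
-9.29 + 3.99 + 6.47 + 1.85 = 21.60, \qquad 16 \le 21.60 < 25 \;\Rightarrow\; \text{manual}
-$$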
diff --git a/docs/deprecated/deprecated/mdm/mathematical-details.md b/docs/deprecated/deprecated/mdm/mathematical-details.md
deleted file mode 100644
index 2c1321d93..000000000
--- a/docs/deprecated/deprecated/mdm/mathematical-details.md
+++ /dev/null
@@ -1,51 +0,0 @@
----
-description: MDM Mathematical Details for matching and deduplication in FHIR.
----
-
-# Mathematical Details
-
-{% hint style="warning" %}
-The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager.
-{% endhint %}
-
-See the [fastlink](https://imai.fas.harvard.edu/research/files/linkage.pdf) paper for more details.
-
-The algorithm is based on pairwise comparisons of records.
-
-We will use the term _record_ instead of resource here (this term is used in the record linkage articles).
-
-Define a set of comparison functions over pairs of records. Each comparison function returns a single category, such as:
-
-* null
-* significantly different
-* slightly different
-* exactly equal
-
-Different comparison functions can have different sets of possible categories (i.e., codomains are not necessarily equal).
-
-For example, a comparison function on surnames could return:
-
-* \-1, if the surname of one of the records is missing
-* 0, if Levenshtein distance between surnames is greater than 2
-* 1, if Levenshtein distance is 2
-* 2, if Levenshtein distance is 1
-* 3, if surnames are equal
-
-We say that two records match if they refer to the same entity. For example, there can be two records for a single person or organization. These records can differ (e.g., after a name change).
-
-We are going to use Bayes' theorem. The prior probability is the probability that two randomly chosen records match.
-
-Then we define two conditional probabilities for each comparison function value:
-
-* m-probability: probability of the specific comparison function value, given that records match
-* u-probability: probability of the specific comparison function value, given that records don't match
-
-Then m-probability divided by u-probability is a Bayes factor.
-
-To calculate the match score, multiply the Bayes factors of all comparison results together, then multiply that product by the prior odds p/(1 − p), where p is the prior probability.
-
-To convert the score into a match probability, compute x/(1+x), where x is the score. Equivalently, apply Bayes' theorem directly.
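-
-The relationships above can be written compactly. The notation is introduced here for illustration only: p is the prior probability, and m_i and u_i are the m- and u-probabilities of the value returned by the i-th comparison function. The prior enters as prior odds p/(1 − p), which is what makes x/(1+x) a probability.
-
-$$
-K_i = \frac{m_i}{u_i}, \qquad x = \frac{p}{1-p}\,\prod_{i} K_i, \qquad P(\text{match} \mid \text{comparisons}) = \frac{x}{1+x}
-$$
-
-A small example with hypothetical numbers: take p = 0.001 and two comparison results with (m, u) = (0.9, 0.01) and (0.8, 0.02).
-
-$$
-K_1 = \frac{0.9}{0.01} = 90, \quad K_2 = \frac{0.8}{0.02} = 40, \quad x = \frac{0.001}{0.999} \cdot 90 \cdot 40 \approx 3.60, \quad \frac{x}{1+x} \approx 0.78
-$$
-
-Taking logarithms turns the product into a sum, so Bayes factors can equivalently be stored as additive log-weights.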
-
-Note that the comparison functions are assumed to be conditionally independent given the match status. However, in practice the algorithm is quite robust to violations of this assumption.
-
-The m- and u-probabilities are estimated using the EM algorithm. It is discussed in detail in the fastlink paper.
diff --git a/docs/deprecated/deprecated/mdm/mdm-module-resources.md b/docs/deprecated/deprecated/mdm/mdm-module-resources.md
deleted file mode 100644
index 7a4ebc66a..000000000
--- a/docs/deprecated/deprecated/mdm/mdm-module-resources.md
+++ /dev/null
@@ -1,72 +0,0 @@
----
-description: Aidbox MDM module resources for master data management and record linkage.
----
-
-# MDM Module Resources
-
-Resources for the MDM module.
-
-## AidboxLinkageModel
-
-MDM (Master Data Management) Linkage Model resource for probabilistic record matching.
-
-```fhir-structure
-[ {
- "path" : "blocks",
- "name" : "blocks",
- "lvl" : 0,
- "min" : 1,
- "max" : 1,
- "type" : "Object",
- "desc" : ""
-}, {
- "path" : "features",
- "name" : "features",
- "lvl" : 0,
- "min" : 1,
- "max" : 1,
- "type" : "Object",
- "desc" : ""
-}, {
- "path" : "resource",
- "name" : "resource",
- "lvl" : 0,
- "min" : 1,
- "max" : 1,
- "type" : "string",
- "desc" : ""
-}, {
- "path" : "thresholds",
- "name" : "thresholds",
- "lvl" : 0,
- "min" : 0,
- "max" : 1,
- "type" : "BackboneElement",
- "desc" : ""
-}, {
- "path" : "thresholds.auto",
- "name" : "auto",
- "lvl" : 1,
- "min" : 0,
- "max" : 1,
- "type" : "decimal",
- "desc" : ""
-}, {
- "path" : "thresholds.manual",
- "name" : "manual",
- "lvl" : 1,
- "min" : 0,
- "max" : 1,
- "type" : "decimal",
- "desc" : ""
-}, {
- "path" : "vars",
- "name" : "vars",
- "lvl" : 0,
- "min" : 0,
- "max" : 1,
- "type" : "Object",
- "desc" : ""
-} ]
-```
-
diff --git a/docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md b/docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md
deleted file mode 100644
index 817007d70..000000000
--- a/docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md
+++ /dev/null
@@ -1,253 +0,0 @@
----
-description: >-
- This page explains how to use merge and unmerge operations in MDM and how they
- work, with practical examples.
----
-
-# Merging and Unmerging Records: $merge and $unmerge
-
-{% hint style="warning" %}
-The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager.
-{% endhint %}
-
-## Overview
-
-We use a **hybrid merge strategy** that combines elements of both the **golden record** and **survivor record** approaches:
-
-* You have to select one of the existing records as the **survivor**.
-* You can optionally **edit its data** using fields from the other records before completing the merge.
-
-Currently, only **manual merging** is supported. However, the system is designed to support **automatic merging** in the future.
-
-The **unmerge** operation allows reversing a previous merge by restoring the original source record and its relationships based on audit data, ensuring no permanent data loss if a merge was done by mistake.
-
-This page provides key information about using `$merge` and `$unmerge`. For full API details, refer to our [Swagger documentation](https://dev.mdm.health-samurai.io/backend/static/swagger.html).
-
-{% hint style="success" %}
-If you need **alternative merge and unmerge approaches** to adjust MDM to your specific workflows and requirements, please [contact us](../../overview/contact-us.md).
-{% endhint %}
-
-## Merge Operation
-
-### **$merge**
-
-The merge operation can be initiated either through the **MDM user interface** or by using the **API**.
-
-To perform it via the API, send a `$merge` request. For example, for the Patient resource type:
-
-```http
-POST /fhir/Patient/$merge
-Content-Type: application/json
-
-{
- "targetPatient": {
- "reference": "Patient/0"
- },
- "sourcePatient": {
- "reference": "Patient/3"
- },
- "resultPatient": {
- "name": [
- {
- "given": ["Robert"],
- "family": "Alan"
- }
- ]
- }
-}
-```
-
-Where:
-
-* `targetPatient` – the resource record selected as the **survivor**. After the merge, this record remains active and contains the resulting data.
-* `sourcePatient` – the resource record being merged into the survivor. After the merge, this record will be removed.
-* `resultPatient` – optional. Provides updated data for the survivor (for example, if you want to correct a name, address, or any other field during the merge). If omitted, the survivor record remains unchanged, except for linked resources and identifiers that are merged automatically.
-
-The **response** returns a `merge-id` used for audit and tracking:
-
-```json
-{
- "merge-id": "b71dc614-7a1c-44b8-a727-027cbee3466a"
-}
-```
-
-### How Merging Works
-
-When a merge request is processed, the system performs several steps to ensure data consistency and auditability:
-
-1. The existence of a match weight for the two records is verified
-2. All related resources are found and updated
-3. A merge record is created with status and match weight
-4. All changes are logged for audit purposes
-5. The target record is updated with merged data
-6. The source record is deleted
-7. The merge ID is returned to the client
-
-The diagram below shows the full process flow:
-
-```mermaid
-sequenceDiagram
- participant Client
- participant MDM as MDM Service
- participant DB as Database
-
- Client->>MDM: merge-record(source, target, result)
- MDM->>DB: find-match-weight(source, target)
- alt No match weight found
- MDM-->>Client: Error: No match weight
- else Match weight exists
- MDM->>DB: find-resources-by-record-id(source)
- Note over MDM,DB: Find all related resources