From 4ca75e429b941a1b51632629d7e16bcd1ce5a743 Mon Sep 17 00:00:00 2001 From: kuzmordas Date: Thu, 4 Jun 2026 12:40:12 +0300 Subject: [PATCH 1/2] docs: removed mdm --- SUMMARY.md | 9 - docs/deprecated/deprecated/mdm/README.md | 100 ---- .../deprecated/mdm/configure-mdm-module.md | 444 ------------------ .../deprecated/mdm/find-duplicates-match.md | 191 -------- .../mdm/matching-model-explanation.md | 321 ------------- .../deprecated/mdm/mathematical-details.md | 51 -- .../deprecated/mdm/mdm-module-resources.md | 72 --- ...merging-records-usdmerge-and-usdunmerge.md | 253 ---------- docs/deprecated/deprecated/mdm/rbac.md | 63 --- .../deprecated/mdm/run-mdm-locally.md | 218 --------- redirects.yaml | 37 -- 11 files changed, 1759 deletions(-) delete mode 100644 docs/deprecated/deprecated/mdm/README.md delete mode 100644 docs/deprecated/deprecated/mdm/configure-mdm-module.md delete mode 100644 docs/deprecated/deprecated/mdm/find-duplicates-match.md delete mode 100644 docs/deprecated/deprecated/mdm/matching-model-explanation.md delete mode 100644 docs/deprecated/deprecated/mdm/mathematical-details.md delete mode 100644 docs/deprecated/deprecated/mdm/mdm-module-resources.md delete mode 100644 docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md delete mode 100644 docs/deprecated/deprecated/mdm/rbac.md delete mode 100644 docs/deprecated/deprecated/mdm/run-mdm-locally.md diff --git a/SUMMARY.md b/SUMMARY.md index c77bc79f3..2fb58cd47 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -636,15 +636,6 @@ * [Aidbox Form Workflow](deprecated/deprecated/forms/aidbox-code-editor/aidbox-form-workflow.md) * [Changing data after form signing](deprecated/deprecated/forms/aidbox-code-editor/changing-data-after-form-signing.md) * [FHIR Questionnaire to Aidbox forms and back conversion](deprecated/deprecated/forms/aidbox-code-editor/fhir-questionnaire-to-aidbox-forms-and-back-conversion.md) - * [MDM — Master Data Management](deprecated/deprecated/mdm/README.md) 
- * [Run MDM locally](deprecated/deprecated/mdm/run-mdm-locally.md) - * [Configure MDM module](deprecated/deprecated/mdm/configure-mdm-module.md) - * [Find duplicates: $match](deprecated/deprecated/mdm/find-duplicates-match.md) - * [Merging and Unmerging Records: $merge and $unmerge](deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md) - * [RBAC configuration](deprecated/deprecated/mdm/rbac.md) - * [Matching Model Explanation](deprecated/deprecated/mdm/matching-model-explanation.md) - * [Mathematical Details](deprecated/deprecated/mdm/mathematical-details.md) - * [MDM Module Resources](deprecated/deprecated/mdm/mdm-module-resources.md) * [AidboxDB](deprecated/deprecated/aidboxdb/README.md) * [HA AidboxDB](deprecated/deprecated/aidboxdb/ha-aidboxdb.md) * [Migrate to AidboxDB 16](deprecated/deprecated/aidboxdb/migrate-to-aidboxdb-16.md) diff --git a/docs/deprecated/deprecated/mdm/README.md b/docs/deprecated/deprecated/mdm/README.md deleted file mode 100644 index 42d12b20f..000000000 --- a/docs/deprecated/deprecated/mdm/README.md +++ /dev/null @@ -1,100 +0,0 @@ ---- -description: >- - This page introduces the Aidbox MDM module, its core capabilities, and guides - for deployment, configuration, matching, and merge/unmerge operations. ---- - -# MDM — Master Data Management - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. -{% endhint %} - -**Master Data Management (MDM)** is a module in Aidbox that ensures **accurate entity identification** by detecting and removing duplicate records. It helps maintain consistent and reliable data across healthcare systems. 
- -**MDM enables:** - -* accurate [**matching**](find-duplicates-match.md) of records across different systems and facilities, -* [**merging**](merging-and-unmerging-records-usdmerge-and-usdunmerge.md#merge-operation) of duplicate records into a single record, -* [**unmerging**](merging-and-unmerging-records-usdmerge-and-usdunmerge.md#unmerge-operation) of incorrectly linked records, -* maintaining the **integrity** of clinical data and treatment history. - -Using MDM **reduces the risk** of lost or duplicated data, errors, and issues with data exchange. This is especially critical in complex ecosystems with many sources — such as clinics, labs, and telemedicine platforms. - -The MDM module utilizes a **probabilistic** (score-based or Fellegi-Sunter) method. It is more flexible and can provide better results than rule-based approaches, but at the cost of simplicity. - -## MDM Capabilities Overview - -### Technical Capabilities - -* FHIR R4 support -* Seamless integration with the Aidbox platform -* API-first architecture with a user-friendly web-based UI -* Notifications for external systems via webhooks (non-FHIR format) -* Unlimited scalability — supports any number of records -* Can be deployed in the cloud or on-premises - -### Data Safety, Transparency and Consistency - -* Role-based access control -* Full traceability of all operations, user actions and API calls -* Supports compliance with security and regulatory standards - -### Core Feature set - -* Search for records -* Flexible matching using a probabilistic algorithm - * Fully configurable for specific data and use cases - * Handles typos and incomplete data -* Manual record merging with unique merge strategy combining golden record and survivor record approaches -* Unmerge capability -* Ability to mark record pairs as non-duplicates to exclude them from future match results - -## Run MDM locally - -{% content-ref url="run-mdm-locally.md" %} -[run-mdm-locally.md](run-mdm-locally.md) -{% endcontent-ref 
%} - -## Configure MDM module - -Configure the MDM module to use a matching model stored in the MDM server (backend) - -{% content-ref url="configure-mdm-module.md" %} -[configure-mdm-module.md](configure-mdm-module.md) -{% endcontent-ref %} - -## Find Duplicates - -Use `$match` operation to find duplicates - - -## Merge and Unmerge Records - -Use `$merge` and `$unmerge` operations to manage duplicate records - -{% content-ref url="merging-and-unmerging-records-usdmerge-and-usdunmerge.md" %} -[merging-and-unmerging-records-usdmerge-and-usdunmerge.md](merging-and-unmerging-records-usdmerge-and-usdunmerge.md) -{% endcontent-ref %} - -## How It Works - -Learn more about: - -1. How our matching model works - -{% content-ref url="matching-model-explanation.md" %} -[matching-model-explanation.md](matching-model-explanation.md) -{% endcontent-ref %} - -2. How record merge and unmerge operations work - -{% content-ref url="merging-and-unmerging-records-usdmerge-and-usdunmerge.md" %} -[merging-and-unmerging-records-usdmerge-and-usdunmerge.md](merging-and-unmerging-records-usdmerge-and-usdunmerge.md) -{% endcontent-ref %} - -3. Mathematics behind probabilistic matching - -{% content-ref url="mathematical-details.md" %} -[mathematical-details.md](mathematical-details.md) -{% endcontent-ref %} diff --git a/docs/deprecated/deprecated/mdm/configure-mdm-module.md b/docs/deprecated/deprecated/mdm/configure-mdm-module.md deleted file mode 100644 index 0791b6e79..000000000 --- a/docs/deprecated/deprecated/mdm/configure-mdm-module.md +++ /dev/null @@ -1,444 +0,0 @@ ---- -description: Configure MDM module with matching models for deduplication including examples and tuning notes. ---- - -# Configure MDM module - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. 
If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. -{% endhint %} - -{% hint style="info" %} -The matching model defines the target resource type via the `resource` field (for example, `Patient`, `Practitioner`, `Organization`, or any custom FHIR resource). -{% endhint %} - -The example in the next section provides a **basic model** that allows you to **start the MDM module** and **test** its functionality. For a detailed explanation of all model elements and matching logic, see [Matching Model Explanation](matching-model-explanation.md). - -## Create OAuth Client for MDM frontend - -To enable authentication for the MDM frontend, create an OAuth client in Aidbox: - -```yaml -PUT /fhir/Client/mpi-dev -content-type: text/yaml -accept: text/yaml - -id: mpi-dev -auth: - authorization_code: - redirect_uri: https://mdm.example.com/api/auth/callback/aidbox - token_format: jwt - refresh_token: true - secret_required: true - access_token_expiration: 36000 - refresh_token_expiration: 864000 -secret: pass -first_party: true -grant_types: -- code -``` - -## Add admin privileges to your user - -Navigate to: **Aidbox → IAM → Users → Your Admin** - -1. Open the Aidbox dashboard. -2. Go to the IAM (Identity and Access Management) section. -3. Select Users. -4. Find and open your Admin user profile. 
- -Add the following section to the user configuration JSON: - -```json -{ - "data": { - "groups": [ - "SIT_EMPI_ADMIN_DEV" - ] - } -} -``` - -## Create SQL functions - -Create the following SQL functions in your Aidbox database: - -```sql -CREATE OR REPLACE FUNCTION public.immutable_unaccent(x text) - RETURNS text - LANGUAGE sql - IMMUTABLE - AS $function$ - SELECT - unaccent($1); -$function$; - -CREATE OR REPLACE FUNCTION public.immutable_unaccent_upper(text) - RETURNS text - LANGUAGE plpgsql - IMMUTABLE - AS $function$ -BEGIN - RETURN upper(public.unaccent($1)); -END; -$function$; - -CREATE OR REPLACE FUNCTION public.immutable_remove_spaces_unaccent_upper(text) - RETURNS text - LANGUAGE plpgsql - IMMUTABLE - AS $function$ -BEGIN - RETURN replace(public.upper(public.unaccent($1)), ' ', ''); -END; -$function$; -``` - -## Create database indexes - -{% hint style="info" %} -The indexes below are **recommendations** that work well with the example **Patient** model from the "Add model to MDM backend" section. If your model targets a different resource, adapt table names and expressions accordingly. 
-{% endhint %} - -Create the following indexes to optimize matching performance and resource reference lookups: - -```sql --- Patient indexes for matching and search -CREATE INDEX IF NOT EXISTS patient_full_name_idx_mdm ON public.patient USING btree ((((immutable_unaccent_upper((resource #>> '{name,0,family}'::text[])) || ' '::text) || immutable_unaccent_upper((resource #>> '{name,0,given,0}'::text[]))))); -- match blocks -CREATE INDEX IF NOT EXISTS patient_given_gin_idx_mdm ON public.patient USING gin (((resource #>> '{name,0,given,0}'::text[])) gin_trgm_ops); -- search by partial given -CREATE INDEX IF NOT EXISTS patient_family_gin_idx_mdm ON public.patient USING gin (((resource #>> '{name,0,family}'::text[])) gin_trgm_ops); -- search by partial family -CREATE INDEX IF NOT EXISTS patient_given_btree_idx_mdm ON public.patient USING btree (immutable_unaccent_upper((resource #>> '{name,0,given,0}'::text[]))); -- search by exact given -CREATE INDEX IF NOT EXISTS patient_family_btree_idx_mdm ON public.patient USING btree (immutable_unaccent_upper((resource #>> '{name,0,family}'::text[]))); -- search by exact family -CREATE INDEX IF NOT EXISTS patient_email_idx_mdm ON public.patient USING gin (jsonb_path_query_array(resource, '$."telecom"[*]?(@."system" == "email")."value"'::jsonpath) jsonb_path_ops); -- search by email -CREATE INDEX IF NOT EXISTS patient_identifier_idx_mdm ON public.patient USING gin (jsonb_path_query_array(resource, '$."identifier"[*]."value"'::jsonpath) jsonb_path_ops); -- search by identifier -CREATE INDEX IF NOT EXISTS patient_phone_idx_mdm ON public.patient USING gin (jsonb_path_query_array(resource, '$."telecom"[*]?(@."system" == "phone")."value"'::jsonpath) jsonb_path_ops); -- search by phone -CREATE INDEX IF NOT EXISTS patient_address_line_btree_idx_mdm ON public.patient USING btree (immutable_remove_spaces_unaccent_upper((resource #>> '{address,0,line,0}'::text[]))); -- match blocks -CREATE INDEX IF NOT EXISTS patient_identifier_idx2_mdm ON 
public.patient USING gin (((resource #> '{identifier}'::text[]))); -- for second model, review needed -CREATE INDEX IF NOT EXISTS patient_birthdate_idx_mdm ON public.patient USING btree (((resource #>> '{birthDate}'::text[]))); -- match blocks - --- Observation indexes for merge/unmerge operations -CREATE INDEX IF NOT EXISTS observation_encounter_references_idx_mdm ON public.observation USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge -CREATE INDEX IF NOT EXISTS observation_patient_references_idx_mdm ON public.observation USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge - --- Specimen indexes for merge operations -CREATE INDEX IF NOT EXISTS specimen_patient_references_idx_mdm ON public.specimen USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge - --- DiagnosticReport indexes for merge/unmerge operations -CREATE INDEX IF NOT EXISTS diagnosticreport_patient_references_idx_mdm ON public.diagnosticreport USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge -CREATE INDEX IF NOT EXISTS diagnosticreport_encounter_references_idx_mdm ON public.diagnosticreport USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge - --- Encounter indexes for merge operations -CREATE INDEX IF NOT EXISTS encounter_patient_references_idx_mdm ON public.encounter USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge -CREATE INDEX IF NOT EXISTS encounter_identifier_idx_mdm ON public.encounter USING gin ((jsonb_path_query_array(resource, '$."identifier".**."value"')) jsonb_path_ops); - --- Condition indexes for merge/unmerge operations -CREATE INDEX IF NOT EXISTS condition_patient_references_idx_mdm ON public.condition USING gin 
(jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge -CREATE INDEX IF NOT EXISTS condition_encounter_references_idx_mdm ON public.condition USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge - --- Media indexes for merge/unmerge operations -CREATE INDEX IF NOT EXISTS media_patient_references_idx_mdm ON public.media USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -- merge -CREATE INDEX IF NOT EXISTS media_encounter_references_idx_mdm ON public.media USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -- unmerge - --- SourceMessage indexes for merge/unmerge operations -CREATE INDEX IF NOT EXISTS sourcemessage_patient_references_idx_mdm ON public.sourcemessage USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Patient")."id"'::jsonpath)); -CREATE INDEX IF NOT EXISTS sourcemessage_encounter_references_idx_mdm ON public.sourcemessage USING gin (jsonb_path_query_array(resource, '$.**?(@."resourceType" == "Encounter")."id"'::jsonpath)); -``` - -## Add model to MDM backend - -Matching models are stored in the **MDM server (backend)**, not in Aidbox. You can manage them via: - -* **Admin UI**: `https://mdm.example.com/admin` -* **API**: `POST /MatchingModel`, `PUT /MatchingModel`, `GET /MatchingModel` - -Authentication is **optional**. If it is enabled, MDM uses **Aidbox OAuth** for access control. 
- -Example of creating a **MatchingModel** via the MDM backend API: - -```http -POST /MatchingModel -Content-Type: application/json - -{ - "id": "model", - "resource": "Patient", - "thresholds": { - "auto": 25, - "manual": 16 - }, - "blocks": { - "fn": { - "var": "name" - }, - "dob": { - "var": "dob" - }, - "addr": { - "sql": "(l.#address = r.#address)" - } - }, - "vars": { - "dob": "(#.resource#>>'{birthDate}')", - "name": "((#.#family) || ' ' || (#.#given))", - "given": "(immutable_unaccent_upper(#.resource#>>'{name,0,given,0}'))", - "family": "(immutable_unaccent_upper(#.resource#>>'{name,0,family}'))", - "gender": "(#.resource#>>'{gender}')", - "address": "(immutable_remove_spaces_unaccent_upper(#.resource#>>'{address,0,line,0}'))", - "telecomArray": "array(select jsonb_array_elements_text(jsonb_path_query_array( #.resource, '$.telecom[*] ? (@.value != \"\").value')))", - "addressLength": "(length(#.resource#>>'{address,0,line,0}'))" - }, - "features": { - "fn": [ - { - "bf": 0, - "expr": "( l.resource->'name' IS NULL OR r.resource->'name' IS NULL )" - }, - { - "bf": 13.336495228175629, - "expr": "l.#name = r.#name" - }, - { - "bf": 13.104401641242227, - "expr": "r.#given = l.#family AND l.#given = r.#family" - }, - { - "bf": 5.36329167966839, - "expr": "r.#family = l.#family AND length(l.#given) <= 5 AND length(r.#given) <= 5 AND levenshtein(l.#given, r.#given) <= 2" - }, - { - "bf": 9.288385498954133, - "expr": "levenshtein(l.#name, r.#name) <= 2" - }, - { - "bf": 10.36329167966839, - "expr": "r.#given = l.#given AND string_to_array(l.#family, ' ') && string_to_array(r.#family, ' ')" - }, - { - "bf": 10.36329167966839, - "expr": "r.#family = l.#family AND string_to_array(l.#given, ' ') && string_to_array(r.#given, ' ')" - }, - { - "bf": 2.402276401131933, - "expr": "r.#given = l.#given" - }, - { - "else": -12.37233293924643 - } - ], - "dob": [ - { - "bf": 0, - "expr": "( l.#dob IS NULL OR r.#dob IS NULL )" - }, - { - "bf": 10.59415069916466, - "expr": "l.#dob 
= r.#dob" - }, - { - "bf": 3.9911610470417744, - "expr": "levenshtein(l.#dob, r.#dob) <= 1" - }, - { - "bf": 0.5164298695732575, - "expr": "levenshtein(l.#dob, r.#dob) <= 2" - }, - { - "else": -10.322063538772698 - } - ], - "telecom": [ - { - "bf": 0, - "expr": "( l.#telecomArray IS NULL OR r.#telecomArray IS NULL OR array_length(l.#telecomArray, 1) IS NULL OR array_length(r.#telecomArray, 1) IS NULL )" - }, - { - "bf": 6.465648574292063, - "expr": "l.#telecomArray && r.#telecomArray" - }, - { - "else": -10.517360697819983 - } - ], - "address": [ - { - "bf": 0, - "expr": "( l.#address IS NULL OR r.#address IS NULL )" - }, - { - "bf": 9.236771286242664, - "expr": "((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address))" - }, - { - "bf": 7.465648574292063, - "expr": "(l.#addressLength = r.#addressLength) and (l.#address = r.#address)" - }, - { - "else": -10.517360697819983 - } - ], - "sex": [ - { - "bf": 0, - "expr": "( l.#gender IS NULL OR r.#gender IS NULL )" - }, - { - "bf": 1.8504082299552485, - "expr": "l.#gender = r.#gender" - }, - { - "else": -4.842034404727677 - } - ] - } -} -``` - -### Matching Model Tuning - -The example model is intended for **testing and demonstration purposes** and may not deliver optimal results out of the box. - -For production use and reliable, accurate matching on your data, you should: - -* **Adapt the model** to reflect your data specifics and your definition of a correct match. -* **Calibrate feature weights** using your real-world data. This step typically involves **machine learning** and **manual expert tuning**. - -{% hint style="success" %} -We offer a **professional service** for model training and expert tuning.\ -If you need assistance, please [contact us](../../overview/contact-us.md). 
-{% endhint %} - -### Performance considerations - -For fast and accurate matching, consider the following: - -* **Database indexes:** If you are working with large volumes of records, ensure proper database indexes are created to keep matching fast and scalable. -* **Data normalization:** Matching quality depends heavily on well‑normalized input data. Avoid using placeholders like `"UNKNOWN"` or `"not provided"` for names, addresses, or birthdates, as they negatively impact results. - -## Configure Audit Events (Optional) - -The MDM module can track and export audit events for compliance and monitoring purposes. When enabled, the system generates FHIR AuditEvent resources for operations like: - -* Merge/unmerge operations -* Search and matching -* Marking/unmarking duplicates -* Record creation and viewing - -### Enable Audit Worker - -To enable audit event collection and export, configure the following environment variables in your backend service: - -```bash -# Enable audit worker -MPI_AUDIT_WORKER_ENABLE=true - -# URL where audit events will be sent (FHIR Bundle endpoint) -MPI_AUDIT_CONSUMER_URL=http://your-audit-repository:8080/fhir/Bundle - -# Polling interval in milliseconds (how often to check for pending events) -MPI_AUDIT_INTERVAL=1000 - -# Number of events to process per batch -MPI_AUDIT_BATCH_SIZE=10 - -# PostgreSQL advisory lock ID (prevents concurrent workers) -MPI_AUDIT_LOCK_ID=54321 -``` - -### How it works - -1. **Event Collection**: The system creates FHIR AuditEvent resources for auditable operations and stores them in the `mpi.audit_event` table with `send_status = 'pending'`. - -2. **Worker Processing**: The audit worker periodically: - - Fetches pending audit events (up to `batch-size`) - - Bundles them into a FHIR Bundle (type: "collection") - - POSTs the bundle to the configured `audit-repository-url` - - Marks events as `delivered` on successful response (HTTP 2xx) - -3. 
**Event Format**: Each audit event includes: - - Operation type and outcome (success/failure) - - User information (from Aidbox IAM) - - Affected resources (primary resources, related resources) - - Timestamp and source system details - -### Audit Repository Requirements - -The audit events are sent as FHIR AuditEvent resources following the [BALP (Basic Audit Log Patterns)](https://profiles.ihe.net/ITI/BALP/) specification. You can use any FHIR-compliant audit repository, but we recommend **Auditbox** for optimal integration and audit log management. - -{% hint style="success" %} -**Recommended**: Use [Auditbox](https://www.health-samurai.io/auditbox) for comprehensive audit event storage, querying, and compliance reporting with built-in FHIR AuditEvent support. -{% endhint %} - -Your audit consumer endpoint should: - -- Accept FHIR Bundle resources via HTTP POST -- Support `application/json` content type -- Return HTTP 2xx status for successful processing -- Handle Bundle resources with `type: "collection"` containing AuditEvent entries -- Support FHIR AuditEvent resources (R4 specification) - -## Configure Merge/Unmerge Notifications (Optional) - -The notification worker sends real-time alerts when merge or unmerge operations occur, allowing external systems to react to record changes. - -### Enable Notification Worker - -Configure the following environment variables in your backend service: - -```bash -# Enable notification worker -MPI_NOTIFICATION_WORKER_ENABLE=true - -# URL where notifications will be sent -MPI_NOTIFICATION_CONSUMER_URL=http://your-consumer-service:9876/notifications - -# Polling interval in milliseconds -MPI_NOTIFICATION_INTERVAL=1000 - -# Number of notifications to process per batch -MPI_NOTIFICATION_BATCH_SIZE=10 - -# PostgreSQL advisory lock ID (prevents concurrent workers) -MPI_NOTIFICATION_LOCK_ID=12345 -``` - -### How it works - -1. 
**Event Tracking**: When merge/unmerge operations complete, they are marked with `notification_status = 'not_delivered'` in the database. - -2. **Worker Processing**: The notification worker periodically: - - Fetches undelivered merge and unmerge operations (up to `batch-size`) - - POSTs them to the configured `consumer-url` - - Marks as `delivered` on successful response (HTTP 2xx) - -3. **Notification Payload**: The worker sends a JSON payload containing (Patient example): - -```json -{ - "merges": [ - { - "id": "merge-id", - "target-patient-id": "Patient/123", - "source-patient-id": "Patient/456", - "related-resources-refs": ["Observation/789", "Encounter/012"], - "result-patient": { /* FHIR Patient resource */ } - } - ], - "unmerges": [ - { - "id": "unmerge-id", - "merge-id": "original-merge-id", - "source-patient": { /* Restored Patient resource */ }, - "user-id": "user-123", - "related-resources": ["Observation/789", "Encounter/012"] - } - ] -} -``` - -### Consumer Endpoint Requirements - -Your notification consumer endpoint should: - -- Accept HTTP POST requests with `Content-Type: application/json` -- Process the payload containing `merges` and `unmerges` arrays -- Return HTTP 2xx status for successful processing diff --git a/docs/deprecated/deprecated/mdm/find-duplicates-match.md b/docs/deprecated/deprecated/mdm/find-duplicates-match.md deleted file mode 100644 index 255ee83ab..000000000 --- a/docs/deprecated/deprecated/mdm/find-duplicates-match.md +++ /dev/null @@ -1,191 +0,0 @@ ---- -description: Use $match operation to find potential duplicate records with configurable query parameters and scoring. ---- - -# Find duplicates: $match - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. 
-{% endhint %} - -{% hint style="warning" %} -To use the `$match` operation, you need to set up an MDM module. Read the [MDM manual](./) to learn how to run and use it. -{% endhint %} - -The `$match` operation is used to **find potential duplicate records**. - -It performs a probabilistic search based on a **matching model** that compares the record you provide with other records in the system across multiple features and estimates how similar they are. The structure of the matching model and its parameters are described on the [Matching Model Explanation](matching-model-explanation.md) page. - -The **result is a list of potential duplicates**, each with a calculated match score and a detailed breakdown of feature similarity. - -Below we use **Patient** as an example, but the same flow works for any resource type your matching model targets. - -This page provides key information about using `$match`. For full API details, refer to our [Swagger documentation](https://dev.mdm.health-samurai.io/backend/static/swagger.html). - -## $match - -The match operation can be initiated either through the **MDM user interface** or by using the **API**. - -The `$match` operation supports several **query parameters** that let you control how matching is performed and how results are returned: - -
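| Name | Type | Default | Description | Example |
-| --- | --- | --- | --- | --- |
-| `model` | string | `model` | Matching model ID to be used for matching | `model` |
-| `threshold` | integer | `0` | Minimum score threshold for a candidate to appear in the match results | `0` |
-| `page` | integer | `1` | Page number of results | `1` |
-| `size` | integer | `10` | Number of results per page | `10` |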
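As a rough illustration of how the parameters in the table combine into a request URL, the Python sketch below builds the query string from the defaults listed above. The helper name `build_match_url` and its arguments are assumptions made for this example; they are not part of the MDM API.

```python
from urllib.parse import urlencode

# Defaults copied from the query parameter table above.
MATCH_DEFAULTS = {"model": "model", "threshold": 0, "page": 1, "size": 10}


def build_match_url(resource_type: str = "Patient", **overrides) -> str:
    """Return a relative $match URL (hypothetical helper, for illustration only)."""
    unknown = set(overrides) - set(MATCH_DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    # Overrides replace defaults; the key order of MATCH_DEFAULTS is preserved.
    params = {**MATCH_DEFAULTS, **overrides}
    return f"/fhir/{resource_type}/$match?{urlencode(params)}"


print(build_match_url(threshold=10))
# prints /fhir/Patient/$match?model=model&threshold=10&page=1&size=10
```

Rejecting unknown keys is a design choice of this sketch: it surfaces typos such as `treshold` early instead of silently sending a parameter the server would ignore.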
- -To call the `$match` operation, send a FHIR `Parameters` resource that includes the **record** for which you want to find potential duplicates. Typically, this record contains identifying data such as: - -* Name (given and family) -* Address (e.g., city, state) -* Birth date -* Other identifying attributes if available (e.g., telecom, identifiers) - -For example, the request can look like this (Patient example): - -
POST /fhir/Patient/$match?model=model&threshold=10&page=1&size=10
-Content-Type: application/json
-
-{
-  "resourceType": "Parameters",
-  "parameter": [
-    {
-      "name": "resource",
-      "resource": {
-        "name": [
-          {
-            "given": [
-              "Freya"
-            ],
-            "family": "Shah"
-          }
-        ],
-        "address": [
-          {
-            "city": "London"
-          }
-        ],
-        "birthDate": "1970-12-17"
-      }
-    }
-  ]
-}
-
- -As a result, you will receive the following: - -* A **list of candidate duplicate records** -* For each candidate record: - * `match_weight` — an overall similarity score calculated by the matching model - * `match_details` — per-feature similarity contributions (e.g., name similarity, date of birth match, address closeness, etc.) - * `resource` — the full FHIR resource for that candidate - -The response is sorted by `match_weight` in descending order so that the most similar records appear first. - -For example: - -```json -[ - { - "match_details": { - "fn": 13.336495228175629, - "dob": 10.59415069916466, - "ext": -10.517360697819983, - "sex": 0 - }, - "match_weight": 13.413285229520307, - "resource": { - "id": "236", - "resourceType": "Patient", - "name": [ - { - "given": [ - "Freya" - ], - "family": "Shah" - } - ], - "address": [ - { - "city": "Londodn" - } - ], - "birthDate": "1970-12-17", - "identifier": [ - { - "value": "62", - "system": "cluster" - } - ] - } - }, - { - "match_details": { - "fn": 13.336495228175629, - "dob": 10.59415069916466, - "ext": -10.517360697819983, - "sex": 0 - }, - "match_weight": 13.413285229520307, - "resource": { - "id": "242", - "resourceType": "Patient", - "name": [ - { - "given": [ - "Freya" - ], - "family": "Shah" - } - ], - "address": [ - { - "city": "Lonnod" - } - ], - "birthDate": "1970-12-17", - "identifier": [ - { - "value": "62", - "system": "cluster" - } - ] - } - }, - { - "match_details": { - "fn": 13.104401641242227, - "dob": 10.59415069916466, - "ext": -10.517360697819983, - "sex": 0 - }, - "match_weight": 13.181191642586905, - "resource": { - "id": "238", - "resourceType": "Patient", - "name": [ - { - "given": [ - "Shah" - ], - "family": "Freya" - } - ], - "address": [ - { - "city": "London" - } - ], - "telecom": [ - { - "value": "f.s@flynn.com", - "system": "email" - } - ], - "birthDate": "1970-12-17", - "identifier": [ - { - "value": "62", - "system": "cluster" - } - ] - } - } -] -``` diff --git 
a/docs/deprecated/deprecated/mdm/matching-model-explanation.md b/docs/deprecated/deprecated/mdm/matching-model-explanation.md deleted file mode 100644 index c0f36691c..000000000 --- a/docs/deprecated/deprecated/mdm/matching-model-explanation.md +++ /dev/null @@ -1,321 +0,0 @@ ---- -description: >- - This page explains how the MDM matching model works, describing its structure, - scoring logic, and configurable elements with an example. ---- - -# Matching Model Explanation - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. -{% endhint %} - -{% hint style="info" %} -This page provides the **matching model code** and explains its elements.\ -For an overview of probabilistic matching concepts and match score calculation, see our article [Master Patient Index and Record Linkage](https://www.health-samurai.io/articles/master-patient-index-and-record-linkage). -{% endhint %} - -This model is used for **record matching**, but the same approach can be adapted to detect duplicates for any type of resource.\ -If you are interested in applying this approach to your use case, please [contact us](../../overview/contact-us.md). - -Matching models are stored in the **MDM server (backend)** and managed via the `/MatchingModel` API or the `/admin` UI. - -Below we use **Patient** as an example to illustrate the model structure. - -## Core Idea - -The model compares selected fields from records and evaluates predefined comparison rules.\ -Each rule in the **features** section contains an expression `expr` and an associated weight `bf` (Bayes Factor), indicating how strongly a match or mismatch on that field affects the total score. - -All weights are summed into a **total score**. 
If the score is above the defined threshold, the record pair is included in the match results; if it is below, it is excluded. - -## Model Structure - -**Which fields to compare** and **how to compare** them are described in the example model: - -
{
-    "id": "model",
-    "vars": {
-        "dob": "(#.resource#>>'{birthDate}')",
-        "name": "((#.#family) || ' ' || (#.#given))",
-        "given": "(immutable_unaccent_upper(#.resource#>>'{name,0,given,0}'))",
-        "family": "(immutable_unaccent_upper(#.resource#>>'{name,0,family}'))",
-        "gender": "(#.resource#>>'{gender}')",
-        "address": "(#.resource#>>'{address,0,line,0}')",
-        "addressLength": "(length(#.resource#>>'{address,0,line,0}'))",
-        "telecomArray": "array(select jsonb_array_elements_text(jsonb_path_query_array( #.resource, '$.telecom[*] ? (@.value != \"\").value')))"
-    },
-    "blocks": {
-        "fn": {
-            "var": "name"
-        },
-        "dob": {
-            "var": "dob"
-        },
-        "addr": {
-            "sql": "(l.#address % r.#address)"
-        }
-    },
-    "features": {
-        "fn": [
-            {
-                "bf": 0,
-                "expr": " ( l.resource->'name' IS NULL OR r.resource->'name' IS NULL )"
-            },
-            {
-                "bf": 13.336495228175629,
-                "expr": "l.#name = r.#name"
-            },
-            {
-                "bf": 13.104401641242227,
-                "expr": "r.#given = l.#family AND l.#given = r.#family"
-            },
-            {
-                "bf": 9.288385498954133,
-                "expr": "levenshtein(l.#name, r.#name) <= 2"
-            },
-            {
-                "bf": 10.36329167966839,
-                "expr": "r.#given = l.#given AND string_to_array(l.#family, ' ') && string_to_array(r.#family, ' ')"
-            },
-            {
-                "bf": 10.36329167966839,
-                "expr": "r.#family = l.#family AND string_to_array(l.#given, ' ') && string_to_array(r.#given, ' ')"
-            },
-            {
-                "bf": 2.402276401131933,
-                "expr": "r.#given = l.#given"
-            },
-            {
-                "else": -12.37233293924643
-            }
-        ],
-        "dob": [
-            {
-                "bf": 0,
-                "expr": " ( l.#dob  IS NULL OR r.#dob IS NULL )"
-            },
-            {
-                "bf": 10.59415069916466,
-                "expr": "l.#dob = r.#dob"
-            },
-            {
-                "bf": 3.9911610470417744,
-                "expr": "levenshtein(l.#dob, r.#dob) <= 1"
-            },
-            {
-                "bf": 0.5164298695732575,
-                "expr": "levenshtein(l.#dob, r.#dob) <= 2"
-            },
-            {
-                "else": -10.322063538772698
-            }
-        ],
-        "ext": [
-            {
-                "bf": 9.236771286242664,
-                "expr": "((l.#telecomArray && r.#telecomArray) AND (((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address))))"
-            },
-            {
-                "bf": 7.465648574292063,
-                "expr": "(((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address)))"
-            },
-            {
-                "bf": 6.465648574292063,
-                "expr": "l.#telecomArray && r.#telecomArray"
-            },
-            {
-                "else": -10.517360697819983
-            }
-        ],
-        "sex": [
-            {
-                "bf": 0,
-                "expr": " ( l.#gender IS NULL OR r.#gender IS NULL )"
-            },
-            {
-                "bf": 1.8504082299552485,
-                "expr": " l.#gender = r.#gender"
-            },
-            {
-                "else": -4.842034404727677
-            }
-        ]
-    },
-    "resource": "Patient",
-    "thresholds": {
-        "auto": 25,
-        "manual": 16
-    },
-    "resourceType": "MatchingModel"
-}
-
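As a rough sketch of how a total score could be turned into a decision, the snippet below uses the `thresholds` values from the example model (`auto`: 25, `manual`: 16). It assumes that exactly one weight is taken from each feature group and that both boundaries are inclusive at the lower end. The names `total_score` and `classify` are hypothetical and are not part of the MDM API.

```python
# Illustrative only: these helper names are hypothetical, not MDM API.
THRESHOLDS = {"auto": 25, "manual": 16}


def total_score(weights):
    """Sum the weight contributed by each feature group."""
    return sum(weights.values())


def classify(score, thresholds=THRESHOLDS):
    """Map a total score to 'auto', 'manual', or 'non-match'."""
    if score >= thresholds["auto"]:
        return "auto"
    if score >= thresholds["manual"]:
        return "manual"
    return "non-match"


# Exact name, exact birth date, same gender, but no address or telecom match.
# Weights are copied from the example model above.
example = {
    "fn": 13.336495228175629,
    "dob": 10.59415069916466,
    "sex": 1.8504082299552485,
    "ext": -10.517360697819983,
}

print(classify(total_score(example)))  # prints "non-match" (total is about 15.26)
```

Under these assumptions, a pair that agrees on name, birth date, and gender still falls just below the `manual` threshold when the `ext` group contributes its `else` weight.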
- -### **Variables (`vars`)** - -**Variables** defined in the model can **reference resource fields** directly or be composed from them using expressions (e.g., concatenating values, applying normalization, or calculating derived values). These variables are used in feature expressions and blocking rules. - -* `dob` – birth date (if applicable) -* `name` – concatenation of family and given names -* `given` – normalized first name (accents removed, uppercase) -* `family` – normalized last name (accents removed, uppercase) -* `gender` – gender value -* `address` – normalized address line -* `telecomArray` – contact information (phone, email) - -### **Comparison Blocks (`blocks`)** - -Blocking rules **limit** the number of candidate record pairs by selecting only those that **share key characteristics** (e.g., similar names, matching birth dates, or addresses).\ -This **reduces** the number of comparisons, which significantly **speeds up processing**, while still preserving potential matches for scoring. - -* `fn`: blocks by name -* `dob`: blocks by date of birth -* `addr`: blocks by address - -### **Matching Features and Scoring** - -Features describe **how resource fields are compared** and **how much each comparison influences** the overall **match score**. - -Each feature contains: - -* `expr` – a logical expression that compares values of specific fields or variables between two records. -* `bf` (Bayes factor / weight) – a numeric value representing how strongly a match or mismatch on that feature affects the total score. - -When records are compared, all satisfied feature expressions **add their weights** to the total score. If a mismatch is detected, **negative weights** may be applied. The result is an aggregated score reflecting the likelihood that two records refer to the same entity. - -{% hint style="info" %} -The model uses **Levenshtein distance** to tolerate typos and small text differences. 
It counts how many single‑character edits (insertions, deletions, substitutions) are needed to make two strings equal.\ -For example, levenshtein('Jonathan', 'Jonatan') = 1. -{% endhint %} - -#### **Name Matching (`fn`)**: - -* Exact match: 13.34 points -* Swapped first/last names: 13.10 points -* Levenshtein distance ≤ 2: 9.29 points -* Partial matches (same first name and overlapping parts of the last name, or same last name and overlapping parts of the first name): 10.36 points -* Same first name only: 2.40 points -* No match: -12.37 points - -```json -"fn": [ - { - "bf": 0, - "expr": " ( l.resource->'name' IS NULL OR r.resource->'name' IS NULL )" - }, - { - "bf": 13.336495228175629, - "expr": "l.#name = r.#name" - }, - { - "bf": 13.104401641242227, - "expr": "r.#given = l.#family AND l.#given = r.#family" - }, - { - "bf": 9.288385498954133, - "expr": "levenshtein(l.#name, r.#name) <= 2" - }, - { - "bf": 10.36329167966839, - "expr": "r.#given = l.#given AND string_to_array(l.#family, ' ') && string_to_array(r.#family, ' ')" - }, - { - "bf": 10.36329167966839, - "expr": "r.#family = l.#family AND string_to_array(l.#given, ' ') && string_to_array(r.#given, ' ')" - }, - { - "bf": 2.402276401131933, - "expr": "r.#given = l.#given" - }, - { - "else": -12.37233293924643 - } -] -``` - -#### **Date of Birth Matching (`dob`)**: - -* Exact match: 10.59 points -* Levenshtein distance ≤ 1: 3.99 points -* Levenshtein distance ≤ 2: 0.52 points -* No match: -10.32 points - -```json -"dob": [ - { - "bf": 0, - "expr": " ( l.#dob IS NULL OR r.#dob IS NULL )" - }, - { - "bf": 10.59415069916466, - "expr": "l.#dob = r.#dob" - }, - { - "bf": 3.9911610470417744, - "expr": "levenshtein(l.#dob, r.#dob) <= 1" - }, - { - "bf": 0.5164298695732575, - "expr": "levenshtein(l.#dob, r.#dob) <= 2" - }, - { - "else": -10.322063538772698 - } -] -``` - -#### **Address and Contact Matching (`ext`)**: - -* Shared contact value (`telecomArray` overlap) and similar address: 9.24 points -* Similar address only: 7.47 points -* Shared contact value only: 6.47 points -* No match: -10.52 points - -```json -"ext": [ - { - "bf": 9.236771286242664, - "expr": 
"((l.#telecomArray && r.#telecomArray) AND (((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address))))" - }, - { - "bf": 7.465648574292063, - "expr": "(((l.#addressLength > r.#addressLength) and (l.#address %>> r.#address)) or ((l.#addressLength <= r.#addressLength) and (l.#address <<% r.#address)))" - }, - { - "bf": 6.465648574292063, - "expr": "l.#telecomArray && r.#telecomArray" - }, - { - "else": -10.517360697819983 - } -] -``` - -#### **Gender Matching (`sex`)**: - -* Exact match: 1.85 points -* No match: -4.84 points - -```json -"sex": [ - { - "bf": 0, - "expr": " ( l.#gender IS NULL OR r.#gender IS NULL )" - }, - { - "bf": 1.8504082299552485, - "expr": " l.#gender = r.#gender" - }, - { - "else": -4.842034404727677 - } -] -``` - -### **Thresholds** - -Thresholds define the **decision boundaries** for match results.\ -After the total score is calculated based on all feature comparisons, it is compared against threshold values: - -* `auto`: matching score ≥ 25 → automatic merge can be processed -* `manual`: 16 ≤ matching score < 25 → manual review required -* Below `manual` – score < 16 → non‑match diff --git a/docs/deprecated/deprecated/mdm/mathematical-details.md b/docs/deprecated/deprecated/mdm/mathematical-details.md deleted file mode 100644 index 2c1321d93..000000000 --- a/docs/deprecated/deprecated/mdm/mathematical-details.md +++ /dev/null @@ -1,51 +0,0 @@ ---- -description: MDM Mathematical Details for matching and deduplication in FHIR. ---- - -# Mathematical Details - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. 
-{% endhint %} - -See the [fastlink](https://imai.fas.harvard.edu/research/files/linkage.pdf) paper for more details. - -The algorithm is based on comparisons. - -We will use the term _record_ instead of resource here (this term is used in the record linkage articles). - -Define a set of comparison functions over pairs of records. Each comparison function returns a single category, such as: - -* null -* significantly different -* slightly different -* exactly equal - -Different comparison functions can have different sets of possible categories (i.e., codomains are not necessarily equal). - -An example of a comparison function on surnames returns: - -* \-1, if the surname of one of the records is missing -* 0, if Levenshtein distance between surnames is greater than 2 -* 1, if Levenshtein distance is 2 -* 2, if Levenshtein distance is 1 -* 3, if surnames are equal - -We will say that two records match if they refer to the same entity. For example, there can be two records for a single person or organization. These records can differ (e.g., after a name change). - -We are going to use Bayes' theorem. Prior probability is the probability that two random records match. - -Then we define two conditional probabilities for each comparison function value: - -* m-probability: probability of the specific comparison function value, given that records match -* u-probability: probability of the specific comparison function value, given that records don't match - -Then m-probability divided by u-probability is a Bayes factor. - -To calculate the match score, multiply the Bayes factors of all comparison results, and multiply that value by the prior expressed as odds, p/(1-p), where p is the prior probability. - -To estimate the match probability, compute x/(1+x), where x is the score. Equivalently, apply Bayes' theorem directly. - -Note that this calculation assumes the comparison functions are independent of each other given the match status. However, in practice the algorithm is quite robust to independence violations. - -The m- and u-probabilities are estimated using the EM algorithm. It is discussed in detail in the fastlink paper.
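The calculation described above can be sketched in a few lines. This is a minimal illustration, not the MDM implementation: the function names and the sample m- and u-probabilities are made up for the example, and the prior is converted to odds so that x/(1+x) yields a probability.

```python
from math import prod


def bayes_factor(m_probability, u_probability):
    """Bayes factor for one comparison result: m-probability / u-probability."""
    return m_probability / u_probability


def match_probability(prior_probability, factors):
    """Combine a prior with Bayes factors and return the match probability.

    Assumes the comparison results are independent given the match status.
    """
    prior_odds = prior_probability / (1 - prior_probability)
    score = prior_odds * prod(factors)  # x in the text above
    return score / (1 + score)


# Hypothetical m/u values for two comparison results (not taken from the model).
factors = [
    bayes_factor(0.9, 0.01),  # surnames are equal
    bayes_factor(0.8, 0.05),  # birth dates are equal
]
print(match_probability(0.001, factors))
```

With a prior of 0.5 the prior odds equal 1, so the score reduces to the product of the Bayes factors. For example, a single factor of 3 gives a probability of 0.75.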
diff --git a/docs/deprecated/deprecated/mdm/mdm-module-resources.md b/docs/deprecated/deprecated/mdm/mdm-module-resources.md deleted file mode 100644 index 7a4ebc66a..000000000 --- a/docs/deprecated/deprecated/mdm/mdm-module-resources.md +++ /dev/null @@ -1,72 +0,0 @@ ---- -description: Aidbox MDM module resources for master data management and record linkage. ---- - -# MDM Module Resources - -Resources for MDM module. - - ## AidboxLinkageModel - -MDM (Master Data Management) Linkage Model resource for probabilistic record matching - -```fhir-structure -[ { - "path" : "blocks", - "name" : "blocks", - "lvl" : 0, - "min" : 1, - "max" : 1, - "type" : "Object", - "desc" : "" -}, { - "path" : "features", - "name" : "features", - "lvl" : 0, - "min" : 1, - "max" : 1, - "type" : "Object", - "desc" : "" -}, { - "path" : "resource", - "name" : "resource", - "lvl" : 0, - "min" : 1, - "max" : 1, - "type" : "string", - "desc" : "" -}, { - "path" : "thresholds", - "name" : "thresholds", - "lvl" : 0, - "min" : 0, - "max" : 1, - "type" : "BackboneElement", - "desc" : "" -}, { - "path" : "thresholds.auto", - "name" : "auto", - "lvl" : 1, - "min" : 0, - "max" : 1, - "type" : "decimal", - "desc" : "" -}, { - "path" : "thresholds.manual", - "name" : "manual", - "lvl" : 1, - "min" : 0, - "max" : 1, - "type" : "decimal", - "desc" : "" -}, { - "path" : "vars", - "name" : "vars", - "lvl" : 0, - "min" : 0, - "max" : 1, - "type" : "Object", - "desc" : "" -} ] -``` - diff --git a/docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md b/docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md deleted file mode 100644 index 817007d70..000000000 --- a/docs/deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md +++ /dev/null @@ -1,253 +0,0 @@ ---- -description: >- - This page explains how to use merge and unmerge operations in MDM and how they - work, with practical examples. 
---- - -# Merging and Unmerging Records: $merge and $unmerge - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. -{% endhint %} - -## Overview - -We use a **hybrid merge strategy** that combines elements of both the **golden record** and **survivor record** approaches: - -* You have to select one of the existing records as the **survivor**. -* You can optionally **edit its data** using fields from the other records before completing the merge. - -Currently, only **manual merging** is supported. However, the system is designed to support **automatic merging** in the future. - -The **unmerge** operation allows reversing a previous merge by restoring the original source record and its relationships based on audit data, ensuring no permanent data loss if a merge was done by mistake. - -This page provides key information about using `$merge` and `$unmerge`. For full API details, refer to our [Swagger documentation](https://dev.mdm.health-samurai.io/backend/static/swagger.html). - -{% hint style="success" %} -If you need **alternative merge and unmerge approaches** to adjust MDM to your specific workflows and requirements, please [contact us](../../overview/contact-us.md). -{% endhint %} - -## Merge Operation - -### **$merge** - -The merge operation can be initiated either through the **MDM user interface** or by using the **API**. 
- -To perform it via the API, send the `$merge` request, for example (Patient example): - -```http -POST /fhir/Patient/$merge -Content-Type: application/json - -{ - "targetPatient": { - "reference": "Patient/0" - }, - "sourcePatient": { - "reference": "Patient/3" - }, - "resultPatient": { - "name": [ - { - "given": ["Robert"], - "family": "Alan" - } - ] - } -} -``` - -Where: - -* `targetPatient` – the resource record selected as the **survivor**. After the merge, this record remains active and contains the resulting data. -* `sourcePatient` – the resource record being merged into the survivor. After the merge, this record will be removed. -* `resultPatient` – optional. Provides updated data for the survivor (for example, if you want to correct a name, address, or any other field during the merge). If omitted, the survivor record remains unchanged, except for linked resources and identifiers that are merged automatically. - -The **response** returns a `merge-id` used for audit and tracking: - -```json -{ - "merge-id": "b71dc614-7a1c-44b8-a727-027cbee3466a" -} -``` - -### How Merging Works - -When a merge request is processed, the system performs several steps to ensure data consistency and auditability: - -1. Verifies the match weight exists -2. All related resources are found and updated -3. A merge record is created with status and match weight -4. All changes are logged for audit purposes -5. The target record is updated with merged data -6. The source record is deleted -7. The merge ID is returned to the client - -The diagram below shows the full process flow: - -```mermaid -sequenceDiagram - participant Client - participant MDM as MDM Service - participant DB as Database - - Client->>MDM: merge-record(source, target, result) - MDM->>DB: find-match-weight(source, target) - alt No match weight found - MDM-->>Client: Error: No match weight - else Match weight exists - MDM->>DB: find-resources-by-record-id(source) - Note over MDM,DB: Find all related resources
(Encounters, Specimens, etc.) - - MDM->>DB: create merge record - Note over MDM,DB: Generate merge_id<br>Set status=COMPLETED<br>Store match_weight - - loop For each related resource - MDM->>DB: log resource update - MDM->>DB: update resource reference - end - - MDM->>DB: log target record update - MDM->>DB: log source record deletion - - MDM->>DB: update target record - Note over MDM,DB: Update with merged identifiers
and result resource - - MDM->>DB: delete source record - MDM->>DB: delete non-duplicate record - - MDM-->>Client: Return merge_id - end -``` - - - -## Unmerge Operation - -If a merge was performed incorrectly, it can be reversed using the **unmerge** operation. - -This operation **restores** the original **source record** and **re‑links all related resources** based on audit information collected during the merge. - -If any **linked resources were created after the merge**, the operation also provides a way to **manually reassign them** to the appropriate record. - -### $unmerge-preview/{id} - -Before performing an unmerge, you should preview what will be changed using the **Unmerge Review** endpoint. This endpoint returns a list of **encounters created after the merge**, which may need to be manually reassigned during the unmerge process. - -To get the list of after-merge linked resources via API, send the `$unmerge-preview` request, for example (Patient example): - -```http -POST /fhir/Patient/$unmerge_preview/1 -Content-Type: application/json -``` - -Where `1` is the **merge ID** returned by the `$merge` operation you want to undo. - -**Example response:** - -```json -{ - "target-id": "patient-123", - "source-id": "patient-456", - "new-encounters": [ - { - "id": "encounter-789", - "resource": { - "resourceType": "Encounter", - "status": "finished", - "class": { - "system": "http://terminology.hl7.org/CodeSystem/v3-ActCode", - "code": "AMB", - "display": "ambulatory" - } - } - } - ] -} -``` - -Where: - -* `target-id` – the ID of the **survivor** record after the merge. -* `source-id` – the ID of the **merged (source)** record that will be restored. -* `new-encounters` – list of encounters created **after the merge** that may need to be manually reassigned when performing an unmerge. - -### $unmerge - -Like the merge operation, the unmerge operation can be initiated either through the **MDM user interface** or by using the **API**. 
- -To perform the unmerge via the API, send the `$unmerge` request, for example (Patient example): - -```http -POST /fhir/Patient/$unmerge -Content-Type: application/json - -{ - "merge-id": "1", - "encounters-and-patient-ids": [ - { - "patient-id": "patient-456", - "encounter-ids": [ - "encounter-123" - ] - } - ] -} -``` - -Where: - -* `merge-id` – the unique identifier of the merge operation to be reversed. -* `encounters-and-patient-ids` – optional mapping of **encounters created after the merge** that need to be manually reassigned: - * `patient-id` – the record to whom encounters should be linked after the unmerge. - * **`encounter-ids`** – list of encounters that should be re‑linked to this record.\ - This ensures that any encounters created post‑merge are correctly redirected when restoring the original records. You can get the list of post-merge encounters via **$unmerge-preview/{id}** request. - -### How Unmerging Works - -Processing an unmerge request follows a specific workflow designed to safely restore previously merged data: - -1. The system retrieves the original merge logs -2. Source and target record information is extracted -3. Encounters are grouped by record ID -4. Resources that need to be reverted are identified -5. Resource references are updated for selected encounters and other linked resources -6. Original resources are restored from merge logs -7. The merge record is removed -8. Records are marked as non-duplicates -9. 
The operation is completed - -The diagram below shows the full process flow: - -```mermaid -sequenceDiagram - participant Client - participant MDM as MDM Service - participant DB as Database - - Client->>MDM: unmerge(merge_id, encounters_and_record_ids) - MDM->>DB: find-merge-log(merge_id) - Note over MDM,DB: Get source and target record logs - - MDM->>DB: get-source-and-target-logs(merge_logs) - MDM->>DB: group-by-record-id(encounters_and_record_ids) - Note over MDM,DB: Group encounters by record ID - - MDM->>DB: get-resources-logs-to-revert(merge_logs) - Note over MDM,DB: Get all resources that need to be reverted - - MDM->>DB: update-encounters-and-related-resources - Note over MDM,DB: Update resource references
for selected encounters - - MDM->>DB: revert-old-resources - Note over MDM,DB: Restore original resources<br>from merge logs - - MDM->>DB: remove-merge(merge_id) - Note over MDM,DB: Delete merge record - - MDM->>DB: mark-as-non-duplicate - Note over MDM,DB: Prevent future matches
between these records - - MDM-->>Client: Unmerge completed - -``` diff --git a/docs/deprecated/deprecated/mdm/rbac.md b/docs/deprecated/deprecated/mdm/rbac.md deleted file mode 100644 index 4a008b8d6..000000000 --- a/docs/deprecated/deprecated/mdm/rbac.md +++ /dev/null @@ -1,63 +0,0 @@ -# RBAC configuration - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. -{% endhint %} - -> **Note:** RBAC in MPI is minimal and early-stage. It supports only two roles with no granular permissions. We are open to feedback and suggestions on how access control should evolve. - -## How it works - -MPI uses a simple **admin / basic** two-tier model. A user is either an **admin** (full access) or a **basic user** (patient search and duplicate matching only). - -The role is determined by checking the user's `data.groups` array in Aidbox against the configured `MPI_ADMIN_ROLE` environment variable. - -## Configuration - -### 1. Environment variables - -| Variable | Service | Purpose | Example | -|----------|---------|---------|---------| -| `MPI_ADMIN_ROLE` | Backend + Frontend | Group name that grants admin access | `SIT_EMPI_ADMIN_DEV` | -| `MPI_ENABLE_AUTHENTICATION` | Backend | Enable authentication (`true`/`false`) | `true` | -| `MPI_ENABLE_AUTHORIZATION` | Backend | Enable authorization (`true`/`false`) | `true` | -| `AUTH_DISABLED` | Frontend | Disable auth entirely, dev mode (`true`/`false`) | `false` | - -### 2. Aidbox User setup - -We use `data.groups` (not `data.roles`) because it maps naturally to **Active Directory / LDAP groups**. 
When Aidbox is connected to an external IdP (Azure AD, ADFS, Okta, etc.), AD group memberships are propagated into `data.groups` automatically — so adding a user to the AD group is enough, no manual Aidbox edits needed. - -Add the role string to the `data.groups` array of the Aidbox User resource: - -```json -{ - "resourceType": "User", - "id": "my-user", - "data": { - "groups": [ - "SIT_EMPI_ADMIN_DEV" - ] - } -} -``` - -The value in `groups` must match `MPI_ADMIN_ROLE` exactly. If it doesn't match, the user is treated as a basic user. - -> For Aidbox **Client** resources (service accounts), the check looks at `details.roles` instead of `data.groups`. - -## What each role can see - -| Feature | Admin | Basic user | -|---------|:-----:|:----------:| -| Patient search & details | Yes | Yes | -| Duplicate matching | Yes | Yes | -| Select matching model | Yes | No | -| Merges page | Yes | No | -| Non-duplicates page | Yes | No | -| Audit logs page | Yes | No | -| Unmerge operations | Yes | No | -| Aidbox Resource Browser link | Yes | Hidden | -| REST API (merge, unmerge, model CRUD, bulk match) | Yes | 403 Forbidden | - -Basic users see only the **Patients** tab in the navigation. All other tabs are hidden. - \ No newline at end of file diff --git a/docs/deprecated/deprecated/mdm/run-mdm-locally.md b/docs/deprecated/deprecated/mdm/run-mdm-locally.md deleted file mode 100644 index d94dfe339..000000000 --- a/docs/deprecated/deprecated/mdm/run-mdm-locally.md +++ /dev/null @@ -1,218 +0,0 @@ ---- -description: Follow these steps to launch Aidbox MDM module locally using Docker Compose ---- - -# Run MDM locally - -{% hint style="warning" %} -The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager. 
-{% endhint %} - -## Prerequisites - -{% hint style="warning" %} -Please **make sure** that both [Docker & Docker Compose](https://docs.docker.com/engine/install/) are installed. -{% endhint %} - -{% hint style="info" %} -Replace example hosts like `mdm.example.com` and `aidbox.example.com` with your actual domains. -{% endhint %} - -## Steps - -### 1. Create a directory - -```sh -mkdir aidbox-mdm && cd aidbox-mdm -``` - -### 2. Create Docker Compose file - -Create a file named `mdm-compose.yml` with the following content: - -```yaml -volumes: - postgres_data: {} -services: - aidbox-db: - image: docker.io/library/postgres:18 - volumes: - - postgres_data:/var/lib/postgresql/18/docker:delegated - environment: - PGDATA: /data - POSTGRES_USER: postgres - POSTGRES_PORT: "5432" - POSTGRES_DB: aidbox - POSTGRES_PASSWORD: postgres - POSTGRES_INITDB_ARGS: "--data-checksums" - command: - - postgres - - -c - - shared_preload_libraries=pg_stat_statements - healthcheck: - test: ["CMD-SHELL", "pg_isready -U postgres"] - interval: 5s - timeout: 5s - retries: 5 - start_period: 10s - - aidbox: - image: healthsamurai/aidboxone:latest - pull_policy: always - depends_on: - aidbox-db: - condition: service_healthy - ports: - - 8889:8080 - environment: - BOX_ADMIN_PASSWORD: password - BOX_BOOTSTRAP_FHIR_PACKAGES: hl7.fhir.r4.core#4.0.1 - BOX_COMPATIBILITY_VALIDATION_JSON__SCHEMA_REGEX: "#{:fhir-datetime}" - BOX_DB_DATABASE: aidbox - BOX_DB_HOST: aidbox-db - BOX_DB_PASSWORD: postgres - BOX_DB_PORT: "5432" - BOX_DB_USER: postgres - BOX_FHIR_COMPLIANT_MODE: true - BOX_FHIR_CORRECT_AIDBOX_FORMAT: true - BOX_FHIR_CREATEDAT_URL: https://aidbox.app/ex/createdAt - BOX_FHIR_SCHEMA_VALIDATION: true - BOX_FHIR_SEARCH_AUTHORIZE_INLINE_REQUESTS: true - BOX_FHIR_SEARCH_CHAIN_SUBSELECT: true - BOX_FHIR_SEARCH_COMPARISONS: true - BOX_FHIR_TERMINOLOGY_SERVICE_BASE_URL: https://tx.health-samurai.io/fhir - BOX_ROOT_CLIENT_ID: root - BOX_ROOT_CLIENT_SECRET: H_ZuMewXLL - BOX_SEARCH_INCLUDE_CONFORMANT: true - 
BOX_SECURITY_AUDIT_LOG_ENABLED: true - BOX_SECURITY_DEV_MODE: true - BOX_SETTINGS_MODE: read-write - BOX_WEB_BASE_URL: https://aidbox.example.com - BOX_WEB_PORT: 8080 - healthcheck: - test: curl -f https://aidbox.example.com/health || exit 1 - interval: 5s - timeout: 5s - retries: "90" - start_period: 30s - - backend: - pull_policy: always - image: healthsamurai/mpi-backend:edge - environment: - - MPI_URI=https://mdm.example.com - - MPI_PG_HOST=aidbox-db - - MPI_PG_PORT=5432 - - MPI_PG_USER=postgres - - MPI_PG_PASSWORD=postgres - - MPI_PG_DATABASE=aidbox - - MPI_HTTP_PORT=3003 - - MPI_HTTP_BINDING=0.0.0.0 - - MPI_LOG_LEVEL=2 - - MPI_ENABLE_AUTHENTICATION=true - - MPI_ENABLE_AUTHORIZATION=false - - AIDBOX_BASE_URL=http://aidbox:8080 - - MPI_PG_TRGM_SIMILARITY_THRESHOLD=0.9 - - MPI_PG_TRGM_STRICT_WORD_SIMILARITY_THRESHOLD=0.9 - - MPI_NOTIFICATION_WORKER_ENABLE=false - - MPI_NOTIFICATION_CONSUMER_URL=https://notifications.example.com - - MPI_NOTIFICATION_INTERVAL=1000 - - MPI_NOTIFICATION_BATCH_SIZE=10 - - MPI_NOTIFICATION_LOCK_ID=12345 - - MPI_AUDIT_WORKER_ENABLE=false - - MPI_AUDIT_CONSUMER_URL=https://audit.example.com - - MPI_AUDIT_INTERVAL=1000 - - MPI_AUDIT_BATCH_SIZE=10 - - MPI_AUDIT_LOCK_ID=54321 - ports: - - "3003:3003" - depends_on: - aidbox-db: - condition: service_healthy - aidbox: - condition: service_healthy - restart: unless-stopped - healthcheck: - test: curl -f https://mdm.example.com/health || exit 1 - interval: 5s - timeout: 5s - retries: "90" - start_period: 30s - - frontend: - pull_policy: always - image: healthsamurai/mpi-frontend:edge - environment: - - NEXT_PUBLIC_API_BASE_URL=http://backend:3003 - - NEXTAUTH_SECRET=your-very-strong-random-secret-here - - BASE_URL=https://mdm.example.com - - AIDBOX_BASE_URL_EXTERNAL=https://aidbox.example.com - - AIDBOX_BASE_URL_INTERNAL=http://aidbox:8080 - - AIDBOX_CLIENT_ID=mpi-dev - - AIDBOX_CLIENT_SECRET=pass - - NEXTAUTH_URL=https://mdm.example.com - - MPI_BASIC_ROLE=SIT_EMPI_USER_DEV - - 
MPI_ADMIN_ROLE=SIT_EMPI_ADMIN_DEV - - PATIENT_PORTAL_IDENTIFIER_TYPE_CODE=LUMID.PROD - - MPI_PG_HOST=aidbox-db - - MPI_PG_PORT=5432 - - MPI_PG_USER=postgres - - MPI_PG_PASSWORD=postgres - - MPI_PG_DATABASE=aidbox - - DEV_MODE=false - - DEFAULT_PATIENT_MODEL=model - ports: - - "3000:3000" - depends_on: - - backend - restart: unless-stopped -``` - -### 3. Start the MDM module - -```bash -docker compose -f mdm-compose.yml up -``` - -This command starts all required services: - -* **aidbox-db**: PostgreSQL database -* **aidbox**: Aidbox FHIR server -* **backend**: MDM backend service -* **frontend**: MDM frontend interface - -### 4. Access Aidbox - -Open in your browser [https://aidbox.example.com/](https://aidbox.example.com) - -### 5. Activate your Aidbox instance - -Click "Continue with Aidbox account" and create a free Aidbox account in [Aidbox user portal](https://aidbox.app/). - -More about Aidbox licenses [here](../../overview/aidbox-user-portal/licenses.md). - -### 6. Configure the MDM module - -Follow the [configuration guide](configure-mdm-module.md) to set up OAuth authentication, user privileges, SQL functions, and **create the matching model in the MDM backend** via `/MatchingModel` or `https://mdm.example.com/admin`. - -### 7. Access the MDM Frontend - -Once all services are running and configured, access the MDM frontend at [https://mdm.example.com](https://mdm.example.com) - -You can now log in using OAuth authentication through Aidbox. 
- -## Service URLs - -After successful startup, the following services will be available: - -| Service | URL | Description | -| ------------ | --------------------- | ----------------------------------- | -| Aidbox UI | https://aidbox.example.com | FHIR server and admin interface | -| MDM Frontend | https://mdm.example.com | MDM user interface | -| MDM Backend | https://mdm.example.com | MDM REST API | - -## Next steps - -* [Configure the MDM matching model](configure-mdm-module.md) to start matching records -* Learn about [matching algorithms](matching-model-explanation.md) -* Explore the [MDM API documentation](https://dev.mdm.health-samurai.io/backend/static/swagger.html) for integration diff --git a/redirects.yaml b/redirects.yaml index 2ce8a13c4..0802a2df1 100644 --- a/redirects.yaml +++ b/redirects.yaml @@ -352,7 +352,6 @@ redirects: modules-1/integration-toolkit/hl7-v2-integration/mappings-with-lisp-mapping: modules/integration-toolkit/hl7-v2-integration/mappings-with-lisp-mapping.md modules-1/integration-toolkit/mappings: modules/integration-toolkit/mappings.md modules-1/integration-toolkit/x12-message-converter: modules/integration-toolkit/x12-message-converter.md - modules-1/mdm: deprecated/deprecated/mdm/README.md modules-1/observability/getting-started/how-to-export-telemetry-to-the-otel-collector: modules/observability/getting-started/how-to-export-telemetry-to-the-otel-collector.md modules-1/observability/getting-started/run-aidbox-with-opentelemetry-locally: modules/observability/getting-started/run-aidbox-with-opentelemetry-locally.md modules-1/observability/logs/extending-aidbox-logs: modules/observability/logs/extending-aidbox-logs.md @@ -376,9 +375,6 @@ redirects: modules-1/observability/traces/how-to-use-tracing: modules/observability/traces/how-to-use-tracing.md modules-1/observability/traces/otel-traces-exporter-parameters: modules/observability/traces/otel-traces-exporter-parameters.md modules-1/other-modules/mcp: modules/other-modules/mcp.md 
-  modules-1/other-modules/mdm/configure-mdm-module: deprecated/deprecated/mdm/configure-mdm-module.md
-  modules-1/other-modules/mdm/find-duplicates-usdmatch: deprecated/deprecated/mdm/find-duplicates-match.md
-  modules-1/other-modules/mdm/mathematical-details: deprecated/deprecated/mdm/mathematical-details.md
   modules-1/profiling-and-validation/asynchronous-resource-validation: modules/profiling-and-validation/asynchronous-resource-validation.md
   modules-1/profiling-and-validation/fhir-schema-validator/setup-aidbox-with-fhir-schema-validation-engine: modules/profiling-and-validation/fhir-schema-validator/setup-aidbox-with-fhir-schema-validation-engine.md
   modules-1/profiling-and-validation/fhir-schema-validator/tutorials/how-to-create-fhir-npm-package: tutorials/artifact-registry-tutorials/how-to-create-fhir-npm-package.md
@@ -431,15 +427,6 @@ redirects:
   modules/observability/logging-and-audit/readme-1/how-to-explore-and-visualize-aidbox-logs-with-kibana-and-grafana: modules/observability/logs/tutorials/log-analysis-and-visualization-tutorial.md
   modules/observability/logging-and-audit/technical-reference: modules/observability/logs/technical-reference/README.md
   modules/observability/logging-and-audit: access-control/audit-and-logging.md
-  modules/other-modules/mdm/find-duplicates-usdmatch.md: deprecated/deprecated/mdm/find-duplicates-match.md
-  modules/other-modules/mdm: deprecated/deprecated/mdm/README.md
-  modules/other-modules/mpi/find-duplicates-match: deprecated/deprecated/mdm/find-duplicates-match.md
-  modules/other-modules/mpi/get-started/configure-mpi-module: deprecated/deprecated/mdm/configure-mdm-module.md
-  modules/other-modules/mpi/get-started: deprecated/deprecated/mdm/README.md
-  modules/other-modules/mpi/matching-model-explanation: deprecated/deprecated/mdm/matching-model-explanation.md
-  modules/other-modules/mpi/mathematical-details: deprecated/deprecated/mdm/mathematical-details.md
-  modules/other-modules/mpi/merging-and-unmerging-records-usdmerge-and-usdunmerge: deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md
-  modules/other-modules/mpi: deprecated/deprecated/mdm/README.md
   modules/profiling-and-validation/fhir-schema-validator/upload-fhir-implementation-guide/environment-variable: tutorials/artifact-registry-tutorials/upload-fhir-implementation-guide/environment-variable.md
   modules/security-and-access-control/overview: access-control/access-control.md
   modules/security-and-access-control/patient-data-access-api/how-to-enable-patient-data-access-api: tutorials/security-access-control-tutorials/how-to-enable-patient-data-access-api.md
@@ -506,7 +493,6 @@ redirects:
   aidbox-configuration/aidbox-zen-lang-project: deprecated/deprecated/zen-related/aidbox-zen-lang-project/README.md
   modules-1/custom-resources/getting-started-with-custom-resources: tutorials/artifact-registry-tutorials/custom-resources/README.md
   modules-1/topic-based-subscriptions/topic-based-subscriptions: modules/topic-based-subscriptions/README.md
-  modules/other-modules/mdm/find-duplicates-match: deprecated/deprecated/mdm/find-duplicates-match.md
   modules/profiling-and-validation/fhir-schema-validator/upload-fhir-implementation-guide: tutorials/artifact-registry-tutorials/upload-fhir-implementation-guide/README.md
   modules/security-and-access-control/set-up-external-identity-provider: access-control/authentication/sso-with-external-identity-provider.md
   api: api/api-overview.md
@@ -517,15 +503,6 @@ redirects:
   modules-1/security-and-access-control/how-to-guides/token-introspection: tutorials/security-access-control-tutorials/set-up-token-introspection.md
   modules/compliance/aidbox-+-fhir-app-portal/getting-started/run-aidbox-+-fhir-portal-locally: solutions/aidbox-+-fhir-app-portal/getting-started/run-aidbox-+-fhir-portal-locally.md
   modules/integration-toolkit/ccda-converter/sections/planoftreatmentsectionv2: modules/integration-toolkit/ccda-converter/sections/plan-of-treatment-section-docs-v2.md
-  modules/mdm: deprecated/deprecated/mdm/README.md
-  modules/mdm/run-mdm-locally: deprecated/deprecated/mdm/run-mdm-locally.md
-  modules/mdm/configure-mdm-module: deprecated/deprecated/mdm/configure-mdm-module.md
-  modules/mdm/find-duplicates-match: deprecated/deprecated/mdm/find-duplicates-match.md
-  modules/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge: deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md
-  modules/mdm/rbac: deprecated/deprecated/mdm/rbac.md
-  modules/mdm/matching-model-explanation: deprecated/deprecated/mdm/matching-model-explanation.md
-  modules/mdm/mathematical-details: deprecated/deprecated/mdm/mathematical-details.md
-  reference/system-resources-reference/mdm-module-resources: deprecated/deprecated/mdm/mdm-module-resources.md
   modules/profiling-and-validation/fhir-schema-validator/tutorials: tutorials/artifact-registry-tutorials/how-to-create-fhir-npm-package.md
   modules/security-and-access-control/multitenancy/organization-based-hierarchical-access-control: access-control/authorization/scoped-api/organization-based-hierarchical-access-control.md
   modules/security-and-access-control/security: access-control/authorization/access-policies.md
@@ -600,7 +577,6 @@ redirects:
   readme-1/readme-1-1/sample-patient-can-see-its-own-data: tutorials/security-access-control-tutorials/accesspolicy-examples.md
   readme-1/validation-tutorials/davinci-pdex: tutorials/validation-tutorials/davinci-pdex.md
   reference/aidbox-forms/aidbox-sdc-api: reference/aidbox-forms-reference/aidbox-sdc-api.md
-  reference/zen-schema-reference/aidbox/mdm/model: deprecated/deprecated/mdm/README.md
   storage-1/aidboxdb-image: deprecated/deprecated/aidboxdb/README.md
   storage-1/aidboxdb-image/migration-to-aidboxdb-16.1-handling-the-removal-of-jsonknife-extension: deprecated/deprecated/aidboxdb/migrate-to-aidboxdb-16.md
   storage-1/custom-resources/migrate-to-fhirschema/migrate-custom-resources-defined-with-zen-to-fhir-schema: tutorials/artifact-registry-tutorials/custom-resources/migrate-to-fhirschema/migrate-custom-resources-defined-with-zen-to-fhir-schema.md
@@ -673,16 +649,6 @@ redirects:
   modules-1/smartbox/how-to-guides/setup-email-provider: deprecated/deprecated/smartbox/how-to-guides/setup-email-provider.md
   modules-1/smartbox/the-b11-decision-support-interventions/source-attributes: deprecated/deprecated/smartbox/the-b11-decision-support-interventions/source-attributes.md
 
-  # modules/mpi/* → modules/mdm/*
-  modules/mpi: deprecated/deprecated/mdm/README.md
-  modules/mpi/find-duplicates-match: deprecated/deprecated/mdm/find-duplicates-match.md
-  modules/mpi/get-started: deprecated/deprecated/mdm/README.md
-  modules/mpi/get-started/configure-mpi-module: deprecated/deprecated/mdm/configure-mdm-module.md
-  modules/mpi/get-started/deploy-mpi-with-kubernetes: deprecated/deprecated/mdm/README.md
-  modules/mpi/get-started/run-mdm-locally: deprecated/deprecated/mdm/run-mdm-locally.md
-  modules/mpi/matching-model-explanation: deprecated/deprecated/mdm/matching-model-explanation.md
-  modules/mpi/mathematical-details: deprecated/deprecated/mdm/mathematical-details.md
-  modules/mpi/merging-and-unmerging-records-usdmerge-and-usdunmerge: deprecated/deprecated/mdm/merging-and-unmerging-records-usdmerge-and-usdunmerge.md
 
   # modules/aidbox-forms/aidbox-code-editor/* → deprecated/deprecated/forms/aidbox-code-editor/*
   modules/aidbox-forms/aidbox-code-editor: deprecated/deprecated/forms/aidbox-code-editor/README.md
@@ -769,9 +735,6 @@ redirects:
   getting-started/run-aidbox-in-kubernetes/high-available-aidbox: deployment-and-maintenance/deploy-aidbox/run-aidbox-in-kubernetes/highly-available-aidbox.md
   getting-started/use-aidbox-with-react: developer-experience/use-aidbox-with-react.md
 
-  # mdm/*
-  mdm/mdm-module: deprecated/deprecated/mdm/README.md
-
   # modules-1/aidbox-forms/*
 
   # modules-1/ccda-converter/*
From a7a1b224d706966283fd6872c6e42732d5df62f7 Mon Sep 17 00:00:00 2001
From: kuzmordas
Date: Thu, 4 Jun 2026 12:59:45 +0300
Subject: [PATCH 2/2] docs: revert README for mdm

---
 docs/deprecated/deprecated/mdm/README.md | 100 +++++++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100644 docs/deprecated/deprecated/mdm/README.md

diff --git a/docs/deprecated/deprecated/mdm/README.md b/docs/deprecated/deprecated/mdm/README.md
new file mode 100644
index 000000000..42d12b20f
--- /dev/null
+++ b/docs/deprecated/deprecated/mdm/README.md
@@ -0,0 +1,100 @@
+---
+description: >-
+  This page introduces the Aidbox MDM module, its core capabilities, and guides
+  for deployment, configuration, matching, and merge/unmerge operations.
+---
+
+# MDM — Master Data Management
+
+{% hint style="warning" %}
+The MDM module is currently available for **testing and evaluation purposes only**. If you plan to use it with real data in a production environment, please [contact us](https://www.health-samurai.io/#contact-form) or reach out to your Aidbox customer success manager.
+{% endhint %}
+
+**Master Data Management (MDM)** is a module in Aidbox that ensures **accurate entity identification** by detecting and removing duplicate records. It helps maintain consistent and reliable data across healthcare systems.
+
+**MDM enables:**
+
+* accurate [**matching**](find-duplicates-match.md) of records across different systems and facilities,
+* [**merging**](merging-and-unmerging-records-usdmerge-and-usdunmerge.md#merge-operation) of duplicate records into a single record,
+* [**unmerging**](merging-and-unmerging-records-usdmerge-and-usdunmerge.md#unmerge-operation) of incorrectly linked records,
+* maintaining the **integrity** of clinical data and treatment history.
+
+Using MDM **reduces the risk** of lost or duplicated data, errors, and issues with data exchange. This is especially critical in complex ecosystems with many sources — such as clinics, labs, and telemedicine platforms.
+
+The MDM module utilizes a **probabilistic** (score-based or Fellegi-Sunter) method. It is more flexible and can provide better results than rule-based approaches, but at the cost of simplicity.
+
+## MDM Capabilities Overview
+
+### Technical Capabilities
+
+* FHIR R4 support
+* Seamless integration with the Aidbox platform
+* API-first architecture with a user-friendly web-based UI
+* Notifications for external systems via webhooks (non-FHIR format)
+* Unlimited scalability — supports any number of records
+* Can be deployed in the cloud or on-premises
+
+### Data Safety, Transparency and Consistency
+
+* Role-based access control
+* Full traceability of all operations, user actions and API calls
+* Supports compliance with security and regulatory standards
+
+### Core Feature set
+
+* Search for records
+* Flexible matching using a probabilistic algorithm
+  * Fully configurable for specific data and use cases
+  * Handles typos and incomplete data
+* Manual record merging with unique merge strategy combining golden record and survivor record approaches
+* Unmerge capability
+* Ability to mark record pairs as non-duplicates to exclude them from future match results
+
+## Run MDM locally
+
+{% content-ref url="run-mdm-locally.md" %}
+[run-mdm-locally.md](run-mdm-locally.md)
+{% endcontent-ref %}
+
+## Configure MDM module
+
+Configure the MDM module to use a matching model stored in the MDM server (backend)
+
+{% content-ref url="configure-mdm-module.md" %}
+[configure-mdm-module.md](configure-mdm-module.md)
+{% endcontent-ref %}
+
+## Find Duplicates
+
+Use `$match` operation to find duplicates
+
+
+## Merge and Unmerge Records
+
+Use `$merge` and `$unmerge` operations to manage duplicate records
+
+{% content-ref url="merging-and-unmerging-records-usdmerge-and-usdunmerge.md" %}
+[merging-and-unmerging-records-usdmerge-and-usdunmerge.md](merging-and-unmerging-records-usdmerge-and-usdunmerge.md)
+{% endcontent-ref %}
+
+## How It Works
+
+Learn more about:
+
+1. How our matching model works
+
+{% content-ref url="matching-model-explanation.md" %}
+[matching-model-explanation.md](matching-model-explanation.md)
+{% endcontent-ref %}
+
+2. How record merge and unmerge operations work
+
+{% content-ref url="merging-and-unmerging-records-usdmerge-and-usdunmerge.md" %}
+[merging-and-unmerging-records-usdmerge-and-usdunmerge.md](merging-and-unmerging-records-usdmerge-and-usdunmerge.md)
+{% endcontent-ref %}
+
+3. Mathematics behind probabilistic matching
+
+{% content-ref url="mathematical-details.md" %}
+[mathematical-details.md](mathematical-details.md)
+{% endcontent-ref %}