From 17dda5059ebd2e9e33533c7d1f65c3926cdd87fe Mon Sep 17 00:00:00 2001
From: JK <jk@chequer.io>
Date: Fri, 13 Mar 2026 19:57:42 +0900
Subject: [PATCH 1/5] confluence-mdx: add the reverse-sync full reconstruction design document
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

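Analyzes the complexity problems of the current patch-based Reverse Sync
and proposes a design for switching to direct MDX → XHTML reconstruction.

Includes the design document and the review document from the v1 to v4 review iterations.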

Co-Authored-By: Atlas <atlas@jk.agent>
---
 ...verse-sync-reconstruction-design-review.md |  139 ++
 ...3-13-reverse-sync-reconstruction-design.md | 1270 +++++++++++++++++
 2 files changed, 1409 insertions(+)
 create mode 100644 confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
 create mode 100644 confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
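
Note on the anchor-position mapping (Section 3.1.1 of the design document): the idea of carrying anchor offsets from the old text to the new text can be tried out in isolation. The sketch below is only an illustration under stated assumptions. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the `myers_diff()` helper that the design assumes, and the names `map_anchors` and `splice_anchors` are hypothetical. Anchors that fall inside a replaced or deleted range are snapped to the end of the replacement, which is a simplification and not necessarily the rule the design defines.

```python
from difflib import SequenceMatcher


def map_anchors(old_text: str, new_text: str, old_positions: list[int]) -> list[int]:
    """Map anchor offsets in old_text to offsets in new_text.

    An anchor at offset p sits right after old_text[p - 1].
    """
    opcodes = SequenceMatcher(None, old_text, new_text, autojunk=False).get_opcodes()
    mapped = []
    for pos in old_positions:
        # Offset 0 stays at the very start; anything past the end goes to the end.
        new_pos = 0 if pos == 0 else len(new_text)
        for tag, i1, i2, j1, j2 in opcodes:
            if i1 < pos <= i2:
                # Unchanged run: keep the same distance from the start of the run.
                # Replaced or deleted run: snap to the end of the replacement.
                new_pos = j1 + (pos - i1) if tag == "equal" else j2
                break
        mapped.append(new_pos)
    return mapped


def splice_anchors(new_text: str, new_positions: list[int], anchors: list[str]) -> str:
    """Insert each anchor string into new_text at its mapped offset (offsets ascending)."""
    out, prev = [], 0
    for pos, anchor in zip(new_positions, anchors):
        out.append(new_text[prev:pos])
        out.append(anchor)
        prev = pos
    out.append(new_text[prev:])
    return "".join(out)
```

With `old_text = "textA  textB"`, `new_text = "newtextA  newtextB"`, and anchors at old offsets 6 and 12, `map_anchors` returns `[9, 18]`, and splicing placeholder anchors at those offsets puts the first one between the two spaces and the second one at the end of the text.
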
diff --git a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
new file mode 100644
index 000000000..3d18d0e3f
--- /dev/null
+++ b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
@@ -0,0 +1,139 @@
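+# Reverse Sync Full Reconstruction Design: Review Results (v4)
+
+> Review target: `2026-03-13-reverse-sync-reconstruction-design.md`
+> Review date: 2026-03-13
+> Reviewer: Claude Sonnet 4.6
+> Changes since the previous review (v3): confirmed that all earlier findings were addressed, and identified new issues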
+
+---
+
+## Status of Previous Review (v3) Findings
+
+| Item | Finding | Resolution |
+|------|------|-----------|
+| C-1 | Section 5 Phase 1 described the v1 design (nested `list_items` field) | ✅ Rewritten around flat mapping + children ref; `_process_paragraph()` added |
+| C-2 | `zip()` silent truncation: new items dropped when items are added | ✅ Post-processing loop `for node in mdx_nodes[len(sidecar_refs):]` added |
+| W-1 | ParagraphEditSequence creation (sidecar creation) logic undefined | ✅ `_process_paragraph()` fully defined (XHTML fragment accumulation + `convert_inline()`) |
+| W-2 | Level 0 tests for `reconstruct_ul/li_entry()` not specified | ✅ Test cases added for each function |
+| W-3 | Section 4.1 referenced the old function name `reconstruct_list_with_trailing()` | ✅ Replaced with `reconstruct_ul_entry()` |
+| S-1 | Handling of `<li>` without `<p>` not specified | ✅ Code comment states that this does not occur in Confluence |
+| S-2 | Multi-paragraph `<li>` handling not mentioned | ✅ TODO added to investigate whether it exists (Section 3.1 TODO 6) |
+
+All 7 previous findings have been addressed. The issues below are new findings against the latest design document.
+
+---
+
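+## Overall Assessment
+
+The design has reached an implementable level. The flat sidecar + children ref, ParagraphEditSequence, and the five-level test structure are now consistent with one another, and the complete definition of the creation side in `_process_paragraph()` is a particularly important improvement.
+
+The remaining issues are that Section 6 still references the earlier design (v1), and that an expected value in the Level 0 tests is wrong.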
+
+---
+
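+## Warning: Issues That May Cause Confusion During Implementation
+
+### W-1. Section 6 Risk 1 describes a removed design
+
+**Location:** Section 6 Risk 1
+
+```
+- In _match_mdx_inline_item(), prefer the order (index) based fallback
+  - Match the order (inline_ptr) of kind: inline items in the list_items sequence against the order of MDX items
+  - Fall back in the order: exact text match → 20-character prefix match → order-based match
+```
+
+The current design does not use `_match_mdx_inline_item()`, the `list_items` field, or text matching. All of these were replaced by the positional `zip()` matching in `reconstruct_ul_entry()`. The "symptom" in Risk 1 (matching failure caused by text changes) also no longer occurs.
+
+In the current design, the actual risk corresponding to Risk 1 is **loss of `inline_trailing_html` when the number of sidecar items and the number of MDX items differ**, which is already described in the case classification table. Section 6 Risk 1 should be replaced to reflect the current design.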
+
+---
+
+### W-2. Section 6 Risk 3 describes the removed `list_items` field
+
+**Location:** Section 6 Risk 3
+
+```
+- Declare list_items as an optional field (default [])
+- If an old-version sidecar has no list_items, reconstruct without trailing → same as the existing behavior
+```
+
+`list_items` is a v1 design field and is not part of the current design. The actual backward-compat targets are `children`, `plain_text`, and `inline_trailing_html`, and the `if not entry.children` fallback in `reconstruct_paragraph()` handles them. Section 6 Risk 3 should be updated to match the current schema.
+
+---
+
+### W-3. Wrong expected value in the `_process_paragraph()` Level 0 test
+
+**Location:** Section 8.3.1 Level 0 `_process_paragraph()` test cases
+
+```
+| <p><strong>bold</strong></p> | children = [{'kind': 'text', 'text': 'bold'}]
+                                 (one TextSegment; inline elements handled via get_text()) |
+```
+
+However, the `_process_paragraph()` implementation does the following:
+
+```python
+cursor_xhtml += str(child)   # → accumulates "<strong>bold</strong>"
+children.append({'kind': 'text', 'text': convert_inline(cursor_xhtml)})
+```
+
+`convert_inline("<strong>bold</strong>")` = `"**bold**"` (MDX bold), so TextSegment.text should be `'**bold**'`, not `'bold'`. The comment `inline elements handled via get_text()` also differs from the implementation.
+
+The design document states that "TextSegment.text is an MDX text fragment", so the expected value in the test case should be changed to `'**bold**'` and the comment to `convert the XHTML fragment with convert_inline()`.
+
+---
+
+## Suggestion
+
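+### S-1. The TODO under Section 6 Risk 1 references a removed design
+
+**Location:** bottom of Section 6 Risk 1
+
+```
+> TODO (W-3): Check the collision likelihood of the 20-character prefix match by surveying all existing testcases.
+```
+
+The 20-character prefix match has been removed entirely from the current design. This TODO can simply be deleted.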
+
+---
+
+### S-2. Not stated that Level 1 tests do not cover the sidecar-aware path
+
+**Location:** Section 8.4 Level 1 test code
+
+```python
+reconstructed = mdx_block_to_xhtml_element(block)
+```
+
+Because it is called without `sidecar_entry`, the `inline_trailing_html` re-injection path and the callout macro format selection path are not covered at Level 1. If the intent is to verify these paths at Level 3, stating so in Section 8.4 is sufficient:
+
+> "The sidecar-aware paths (list `inline_trailing_html`, callout macro format) are verified at Level 3."
+
+---
+
+## Evaluation Summary
+
+| Item | Result |
+|------|------|
+| All previous findings addressed | ✅ |
+| Flat mapping + children ref structure | ✅ |
+| Creation side of `_process_paragraph()` fully defined | ✅ |
+| Post-processing loop for items added beyond `zip()` | ✅ |
+| Five-level test structure and execution flow | ✅ |
+| Level 0 tests for `reconstruct_ul/li_entry()` + `_process_paragraph()` | ✅ |
+| Section 5 Phase 1 ↔ Section 3.1 consistency | ✅ |
+| Section 6 Risk 1 describes the old design (W-1) | ⚠️ |
+| Section 6 Risk 3 references `list_items` (W-2) | ⚠️ |
+| Wrong expected value in `_process_paragraph()` Level 0 test (W-3) | ⚠️ |
+| Leftover TODO in Section 6 Risk 1 (S-1) | 💡 |
+| Level 1 non-coverage of the sidecar-aware path not stated (S-2) | 💡 |
+
+---
+
+## Next Steps
+
+The three Warnings (W-1 to W-3) should be fixed before implementation.
+
+- **W-1, W-2**: Rewrite Section 6 to match the current design
+- **W-3**: Fix the expected value and comment of the `_process_paragraph()` Level 0 test to match `str(child)` + `convert_inline()`
diff --git a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
new file mode 100644
index 000000000..a035108ff
--- /dev/null
+++ b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
@@ -0,0 +1,1270 @@
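+# Reverse Sync Full Reconstruction Design
+
+> Date written: 2026-03-13
+> Related analysis: `analysis-reverse-sync-refactoring.md`
+
+> **Design scope principle**
+> Every design and implementation decision in this document is based on XHTML/MDX cases that actually exist in `tests/testcases/`.
+> Exception handling for hypothetical cases that are not confirmed by real examples is outside the scope of this design.
+> When a new case is discovered, a testcase is added first, and the design and implementation are then extended in that cycle.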
+
+---
+
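+## 1. Background and Motivation
+
+### 1.1 Problems with the Current Architecture
+
+The current Reverse Sync approach is a strategy of **"touching the XHTML as little as possible and transplanting only the text differences"**. This strategy was chosen to protect Confluence-specific elements (`<ac:image>`, `<ac:link>`, and so on), but as the implementation has progressed the following technical debt has accumulated.
+
+- The strategy branches in `patch_builder.py` have grown to five (direct / containing / list / table / skip), and exception cases keep being added to each branch.
+- `_resolve_child_mapping()` has a 4-step fallback chain and `_resolve_mapping_for_change()` a 6-step one.
+- A pattern repeats in which each bug fix exposes new edge cases (PR #852, #853, #866, #888, #903).
+- The character-level position alignment in `text_transfer.py` maps between two coordinate systems (MDX ↔ XHTML) and is inherently unstable.
+
+### 1.2 Alternative Approach: Full MDX → XHTML Reconstruction
+
+If changed MDX blocks are **reconstructed directly as XHTML** and only the lost Confluence elements are selectively restored from the original, the complexity above can be removed at its root.
+
+```
+[Current]  MDX diff → extract text changes → map text positions inside XHTML → character-level replacement
+[Proposed] MDX diff → reconstruct changed blocks → re-inject lossy elements → replace XHTML
+```
+
+`mdx_block_to_xhtml_element()` is already used in `_build_insert_patch()`. Applying it **to modified blocks as well** is the core of this design.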
+
+---
+
+## 2. Two Fundamental Problems That Block Reconstruction
+
+### Problem A: The position of lossy elements inside list items is unknown
+
+Elements such as `<ac:image>`, `<ac:link>`, and `<span style=...>` inside a list item are not represented in MDX. During reconstruction there is no way to know which `<li>`, and which position within it, these elements belong to.
+
+```xml
+<!-- Original XHTML -->
+<ul>
+  <li><p>item 1</p><ac:image><ri:attachment ri:filename="img.png"/></ac:image></li>
+  <li><p>item 2</p></li>
+</ul>
+
+<!-- MDX: no ac:image -->
+* item 1
+* item 2
+```
+
+The current sidecar has no information indicating where `<ac:image>` should be placed after reconstruction.
+
+### Problem B: Compound structures inside a Callout cannot be reconstructed
+
+In MDX the inside of a callout is a plain text block, but in XHTML it is a nested structure of `<p>`, `<ul>`, `<ac:structured-macro ac:name="code">`, and so on. The current `_convert_callout_inner()` converts the inside into a single paragraph only, so the structure collapses when it contains a list or code block.
+
+```mdx
+<Callout type="info">
+  Paragraph text
+
+  * List item 1
+  * List item 2
+</Callout>
+```
+
+→ Current: `<p>Paragraph text * List item 1 * List item 2</p>` (wrong)
+→ Target: `<p>Paragraph text</p><ul><li><p>List item 1</p></li>...` (correct)
+
+---
+
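+## 3. Solutions
+
+### 3.1 Solving Problem A: Flat Sidecar Mapping + ref Reference Structure
+
+#### Core Principle
+
+Both `<ul>`/`<ol>` and `<li>` are listed as **top-level sidecar entries**, and parent-child relationships are expressed as `children` ref lists.
+
+- `ul`/`ol` entry → `children: [list of li xhtml_xpath]`
+- `li` entry → `plain_text`, `inline_trailing_html`, `children: [list of block child xhtml_xpath]`
+- Block elements inside `<li>` (`<ul>`, `<ol>`, `<ac:structured-macro>`, and so on) are also independent entries, referenced from the `children` of the `li` entry
+
+Because this structure handles the `ul`-`li` relationship and the `li`-block child relationship with the same pattern, nesting of any depth is covered without additional design. The double-insertion problem for nested lists is eliminated structurally.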
+
+#### Sidecar Schema Change
+
+```yaml
+# Before
+mappings:
+  - xhtml_xpath: ul[1]
+    xhtml_type: list
+    mdx_blocks: [5]
+
+# After: ul and li are both top-level entries, and relationships are expressed via children ref
+# xhtml_type uses the XHTML tag name as is (ul, ol, li, p, ac:image, ...)
+mappings:
+  - xhtml_xpath: ul[1]
+    xhtml_type: ul
+    mdx_blocks: [5]
+    children:
+      - ref: "ul[1]/li[1]"
+      - ref: "ul[1]/li[2]"
+
+  - xhtml_xpath: ul[1]/li[1]
+    xhtml_type: li
+    plain_text: "item 1 description text"      # inline matching key
+    inline_trailing_html: >-                   # only non-block lossy elements right after <p>
+      <ac:image><ri:attachment
+      ri:filename="img.png"/></ac:image>
+    children:
+      - ref: "ul[1]/li[1]/ul[1]"               # block child reference
+
+  - xhtml_xpath: ul[1]/li[1]/ul[1]
+    xhtml_type: ul
+    children:
+      - ref: "ul[1]/li[1]/ul[1]/li[1]"
+
+  - xhtml_xpath: ul[1]/li[1]/ul[1]/li[1]
+    xhtml_type: li
+    plain_text: "sub-item A"
+    inline_trailing_html: ""
+    children: []
+
+  - xhtml_xpath: ul[1]/li[2]
+    xhtml_type: li
+    plain_text: "item 2 text"
+    inline_trailing_html: ""
+    children: []
+```
+
+`inline_trailing_html` stores only the **non-block** elements that directly follow `<p>`. Block elements such as `<ul>`, `<ol>`, and `<ac:structured-macro>` are split out as `children` refs and are therefore not stored here.
+
+#### Creation: Recursive Processing with `_process_element()`
+
+```python
+def _process_element(elem, xpath: str) -> list[SidecarEntry]:
+    """Recursively process ul/ol/li elements and return a list of independent SidecarEntry objects.
+
+    Every element becomes a top-level entry; parent-child relationships are expressed as children refs.
+    """
+    entries = []
+
+    if elem.name in ('ul', 'ol'):
+        children_refs = []
+        for li_idx, li in enumerate(elem.find_all('li', recursive=False), start=1):
+            li_xpath = f"{xpath}/li[{li_idx}]"
+            children_refs.append({'ref': li_xpath})
+            entries.extend(_process_element(li, li_xpath))
+
+        entries.insert(0, SidecarEntry(
+            xhtml_xpath=xpath,
+            xhtml_type=elem.name,   # 'ul' or 'ol': the XHTML tag name as is
+            children=children_refs,
+        ))
+
+    elif elem.name == 'li':
+        # In Confluence storage format <li> content is always wrapped in <p>; a <li> without <p> does not occur, so no special handling.
+        p_elem = elem.find('p')
+        plain_text = p_elem.get_text(separator=' ', strip=True) if p_elem else ''
+
+        # inline trailing: non-block sibling elements right after <p>
+        inline_trailing, block_children_refs, block_counters = [], [], {}
+
+        if p_elem:
+            for sib in p_elem.next_siblings:
+                if getattr(sib, 'name', None) is None:
+                    continue  # skip text nodes, which have no tag name
+                if sib.name in ('ul', 'ol') or _is_block_macro(sib):
+                    tag = sib.name
+                    block_counters[tag] = block_counters.get(tag, 0) + 1
+                    child_xpath = f"{xpath}/{tag}[{block_counters[tag]}]"
+                    block_children_refs.append({'ref': child_xpath})
+                    entries.extend(_process_element(sib, child_xpath))
+                else:
+                    inline_trailing.append(str(sib))
+
+        entries.insert(0, SidecarEntry(
+            xhtml_xpath=xpath,
+            xhtml_type='li',        # the XHTML tag name as is
+            plain_text=plain_text,
+            inline_trailing_html=''.join(inline_trailing),
+            children=block_children_refs,
+        ))
+
+    else:  # other block children (e.g. block macros): keep the fragment that reconstruct_li_entry() emits verbatim
+        entries.append(SidecarEntry(xhtml_xpath=xpath, xhtml_type=elem.name, xhtml_fragment=str(elem)))
+
+    return entries
+```
+
+#### Consumption: `reconstruct_ul_entry()` / `reconstruct_li_entry()`, Position-Based Matching on the `_ListNode` Tree
+
+The MDX parser (`parse_mdx_blocks()`) returns an entire nested list as a single `list` block and does not structure it. The existing `_parse_list_items()` + `_build_list_tree()` in `emitter.py` are reused to build a `_ListNode` tree, which is matched against the sidecar `children` refs **by position (zip)**.
+
+Because it operates purely by position, without a text-based queue (`pop_inline_item()`), there are no collisions even when several items share the same text.
+
+```python
+def reconstruct_ul_entry(
+    entry: SidecarEntry,
+    sidecar_index: dict,
+    mdx_nodes: list[_ListNode],    # MDX items (_ListNode) at this ul/ol level
+) -> str:
+    """Reconstruct a ul/ol entry.
+
+    Matches the sidecar children refs against mdx_nodes by position (zip).
+    Takes the _ListNode tree built by emitter._parse_list_items() + _build_list_tree() as an argument.
+
+    Handling of item count mismatches:
+      - More MDX items (added): the mdx_nodes left over after zip are reconstructed without a sidecar
+      - Fewer MDX items (deleted): zip stops at the shorter side, so deleted items are omitted automatically
+    """
+    tag = entry.xhtml_type  # 'ul' or 'ol': the XHTML tag name as is
+    sidecar_refs = entry.children or []
+    parts = []
+
+    # Items that have a sidecar: match ref + node by position
+    for ref_dict, node in zip(sidecar_refs, mdx_nodes):
+        li_entry = sidecar_index.get(ref_dict['ref'])
+        if li_entry:
+            parts.append(reconstruct_li_entry(li_entry, sidecar_index, node))
+
+    # More MDX items than sidecar refs: newly added items, reconstructed without a sidecar
+    for node in mdx_nodes[len(sidecar_refs):]:
+        parts.append(f'<li><p>{convert_inline(node.text)}</p></li>')
+
+    return f'<{tag}>{"".join(parts)}</{tag}>'
+
+
+def reconstruct_li_entry(
+    entry: SidecarEntry,
+    sidecar_index: dict,
+    node: _ListNode,               # MDX _ListNode corresponding to this li
+) -> str:
+    """Reconstruct a li entry.
+
+    node.text: MDX text of the li body
+    node.children: nested list items of this li (passed on to the nested ul/ol reconstruction)
+    """
+    li_inner = f'<p>{convert_inline(node.text)}</p>'
+    li_inner += entry.inline_trailing_html or ''
+
+    for ref_dict in (entry.children or []):
+        child = sidecar_index.get(ref_dict['ref'])
+        if child is None:
+            continue
+        if child.xhtml_type in ('ul', 'ol'):
+            # node.children = nested list items of this li
+            li_inner += reconstruct_ul_entry(child, sidecar_index, node.children)
+        else:
+            # Other block children such as block macros: emit the original xhtml_fragment as is
+            li_inner += child.xhtml_fragment or ''
+    return f'<li>{li_inner}</li>'
+
+
+# Entry point (when patch_builder.py handles a list block)
+# Reuses the existing functions in emitter.py
+items = _parse_list_items(new_block.content)   # bin/mdx_to_storage/emitter.py
+roots = _build_list_tree(items)                # bin/mdx_to_storage/emitter.py
+xhtml = reconstruct_ul_entry(sidecar_entry, sidecar_index, roots)
+```
+
+
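+#### Classification of Supported Cases
+
+| Case | Handling |
+|--------|-----------|
+| Item content changed, no lossy elements | Reconstruct the `li` entry (clean) |
+| Item content changed, inline lossy elements present | Reconstruct + re-inject `inline_trailing_html` |
+| Nested list inside an item | `children` of the `li` entry → reconstruct the `ul`/`ol` entry (double insertion is structurally impossible) |
+| Item added (MDX > sidecar) | Reconstruct the `mdx_nodes` left after `zip()` without a sidecar (no `inline_trailing_html`) |
+| Item deleted (MDX < sidecar) | `zip()` stops at the shorter side, so deleted items are omitted automatically |
+| Nesting of any depth | Handled identically through the mutual recursion of `reconstruct_ul_entry()` and `reconstruct_li_entry()` |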
+
+---
+
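+> **TODO: Investigation required before implementation**
+>
+> 1. **[Phase 1 prerequisite] Whether the current `generate_sidecar_mapping()` handles `li`**: Check whether independent entries are already created for `<li>`, or whether they are contained inside the `<ul>`/`<ol>` entry. The extent of conflict with the existing entry creation logic when `_process_element()` is introduced needs to be determined.
+>
+> 2. **[Phase 1 prerequisite] Confirm `xhtml_type` tag name consistency**: `xhtml_type` uses the XHTML tag name as is (`ul`, `ol`, `li`, `p`, `ac:image`, and so on). If existing sidecars store abstract types (`list`, `list_item`), check whether migration or conversion during deserialization is needed.
+>
+> 3. **[Phase 1 prerequisite] `xhtml_xpath` format**: Check whether the nested-element xpath form `ul[1]/li[1]/ul[1]` matches the way `mapping_recorder.py` currently generates xpaths.
+>
+> 4. **[Phase 1 prerequisite] Accessibility of `_parse_list_items()` / `_build_list_tree()`**: Check whether the two functions in `emitter.py` can be imported from outside the module. Since they are private functions (`_` prefix), consider whether they need to be exposed in a form callable from the `reverse_sync` package.
+>
+> 5. **[Phase 1 prerequisite] Criteria for `_is_block_macro()`**: The criteria for distinguishing cases where `<ac:structured-macro>` should be classified as a block child from cases where it should remain inline trailing need to be confirmed.
+>
+> 6. **[Investigation] Whether multi-paragraph `<li>` exists**: Survey all testcases to determine whether the form `<li><p>para1</p><p>para2</p></li>` occurs in real Confluence XHTML. The working guess is that it does not, and that only `<br/>` line breaks inside a single `<p>` are used. If its absence is confirmed, treat it as the simple case with no special handling. If it does exist, consider preserving the second `<p>` in `inline_trailing_html`.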
+
+---
+
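+### 3.1.1 Handling Inline-Block Elements Inside a Paragraph: ParagraphEditSequence
+
+When text and lossy inline-block elements (`<ac:image>` and the like) are mixed inside a `<p>`, the same principle as for list items applies: lossy elements are split out as **independent sidecar entries** and referenced by ref from the `children` of the `paragraph` entry.
+
+When a text edit occurs, the **ParagraphEditSequence** (Myers diff based) from old MDX → new MDX is computed, and the insertion position of each `AnchorSegment` is mapped exactly from old coordinates to new coordinates. Fuzzy matching is not used. Cases the design does not cover fail explicitly and are resolved through the cycle of adding testcases.
+
+#### Terminology
+
+```
+ParagraphEditSequence = list[InlineSegment]
+
+InlineSegment
+├── TextSegment(text: str)      # text fragment subject to diff
+└── AnchorSegment(ref: str)     # lossy inline-block such as ac:image: a fixed-position anchor
+```
+
+A single `<p>` is represented by one `ParagraphEditSequence`. It is a sequence in which `TextSegment`s and `AnchorSegment`s alternate, and the edit (Myers diff) applies only to `TextSegment`s. Each `AnchorSegment` is mapped to new coordinates by following the position change of its adjacent `TextSegment`.
+
+#### Sidecar Schema
+
+The inside of `<p>` is expressed as an alternating sequence of `TextSegment` (`kind: text`) and `AnchorSegment` (`kind: ref`). `TextSegment.text` is an MDX text fragment (output of `convert_inline()`) and serves as the coordinate basis that defines `AnchorSegment` positions. Concatenating the TextSegments yields exactly the MDX text of the paragraph (`old_mdx_text`).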
+
+```yaml
+- xhtml_xpath: p[1]
+  xhtml_type: p
+  children:
+    - kind: text
+      text: "textA "              # TextSegment: MDX text fragment (convert_inline output)
+    - kind: ref
+      ref: "p[1]/ac:image[1]"    # AnchorSegment: positioned right after the preceding TextSegment
+    - kind: text
+      text: " textB"              # TextSegment: MDX text fragment
+    - kind: ref
+      ref: "p[1]/ac:image[2]"    # AnchorSegment
+
+- xhtml_xpath: p[1]/ac:image[1]
+  xhtml_type: ac:image
+  html: "<ac:image><ri:attachment ri:filename='img1.png'/></ac:image>"
+
+- xhtml_xpath: p[1]/ac:image[2]
+  xhtml_type: ac:image
+  html: "<ac:image><ri:attachment ri:filename='img2.png'/></ac:image>"
+```
+
+A new sidecar **always** builds `children`, whether or not `ac:image` is present. When there is no `ac:image`, it takes the form `children = [{'kind': 'text', 'text': '<full text>'}]`. An entry without `children` is used only as the backward-compat fallback for old sidecars.
+
+#### Creation: `_process_paragraph()`, Always Built as an Alternating Sequence
+
+`TextSegment.text` stores an **MDX text fragment**, not XHTML plain text. XHTML fragments are accumulated and then passed through `convert_inline()`, so that the concatenation of the TextSegments matches the MDX text of the paragraph exactly. As a result, `old_text == old_mdx_text` should hold in `reconstruct_paragraph()` without any separate normalization.
+
+```python
+def _process_paragraph(elem, xpath: str) -> list[SidecarEntry]:
+    """Process a <p> element and create sidecar entries in the ParagraphEditSequence structure.
+
+    children is always built as an alternating TextSegment/AnchorSegment sequence, with or without ac:image.
+      - No ac:image: children = [{'kind': 'text', 'text': '<full MDX text>'}]
+      - With ac:image: alternating sequence [text, ref, text, ref, ..., text]
+
+    TextSegment.text is an MDX text fragment: the result of converting an XHTML fragment
+    (strong, em, code, ...) with convert_inline(). Concatenated TextSegments = MDX text of the paragraph.
+    """
+    entries = []
+    children = []
+    image_counters = {}
+    cursor_xhtml = ''  # accumulated XHTML fragment of text/inline elements inside <p>
+
+    for child in elem.children:
+        if hasattr(child, 'name') and child.name == 'ac:image':
+            # Flush the preceding TextSegment: convert accumulated XHTML to MDX text (empty strings included)
+            children.append({'kind': 'text', 'text': convert_inline(cursor_xhtml)})
+            cursor_xhtml = ''
+            # AnchorSegment
+            image_counters['ac:image'] = image_counters.get('ac:image', 0) + 1
+            img_xpath = f"{xpath}/ac:image[{image_counters['ac:image']}]"
+            children.append({'kind': 'ref', 'ref': img_xpath})
+            # Independent entry for ac:image
+            entries.append(SidecarEntry(
+                xhtml_xpath=img_xpath,
+                xhtml_type='ac:image',
+                html=str(child),
+            ))
+        else:
+            # Text node (NavigableString) or inline element (strong, em, code, ...): accumulate the XHTML fragment
+            cursor_xhtml += str(child)
+
+    # Flush the last TextSegment (always added; without ac:image the whole text ends up here)
+    children.append({'kind': 'text', 'text': convert_inline(cursor_xhtml)})
+
+    entries.insert(0, SidecarEntry(
+        xhtml_xpath=xpath,
+        xhtml_type='p',
+        children=children,
+    ))
+    return entries
+```
+
+#### Mapping AnchorSegment Positions with the ParagraphEditSequence
+
+```
+old_seq = ParagraphEditSequence from sidecar:
+  [TextSegment("textA "), AnchorSegment(ref[1]), TextSegment(" textB"), AnchorSegment(ref[2])]
+
+old_text = concatenated TextSegments: "textA  textB"
+new_text = new MDX: "newtextA  newtextB"
+
+AnchorSegment positions (old coordinates):
+  ref[1]: old_pos = len("textA ")  = 6   (end of TextSegment[0])
+  ref[2]: old_pos = len("textA  textB") = 12  (end of TextSegment[1])
+
+Myers diff edit ops (over the concatenated TextSegment text):
+  INSERT  "new"                   (new 0..2)
+  RETAIN  "textA  "   (old 0..6)
+                                  ← ref[1] old_pos=6 lies inside this run → new_pos = 3 + 6 = 9
+  INSERT  "new"                   (new 10..12)
+  RETAIN  "textB"     (old 7..11)
+                                  ← ref[2] old_pos=12 is the end of this run → new_pos = 13 + 5 = 18
+
+Mapping result:
+  ref[1]: new_pos = 9   (after "newtextA ")
+  ref[2]: new_pos = 18  (after " newtextB")
+
+Reconstruction:
+  new_text[:9] + image[1].html + new_text[9:18] + image[2].html + new_text[18:]
+  = "newtextA " + <ac:image[1]> + " newtextB" + <ac:image[2]>
+```
+
+#### Implementation
+
+```python
+def map_anchor_positions(
+    old_text: str,
+    new_text: str,
+    old_positions: list[int],   # unit: Python str index (Unicode code point)
+) -> list[int]:                 # unit: Python str index (Unicode code point)
+    """Map AnchorSegment old coordinates to new coordinates using the Myers diff edit sequence.
+
+    Coordinate unit: Python str index (Unicode code point).
+    For Korean/English text a code point corresponds to a grapheme cluster in practice (assuming NFC-normalized text).
+    Edge cases such as emoji combining sequences (ZWJ sequences) are handled by adding a testcase if they occur.
+
+    edit op:
+      ('retain', n): keep n old code points → old_ptr += n, new_ptr += n
+      ('delete', n): delete n old code points → old_ptr += n
+      ('insert', s): insert s into new → new_ptr += len(s)  (len = number of code points)
+
+    An AnchorSegment position sits at the end of (right after) a TextSegment, so:
+      record new_ptr at the moment old_ptr reaches old_pos.
+    """
+    ops = myers_diff(old_text, new_text)  # → list of (op, value)
+    old_ptr = 0
+    new_ptr = 0
+    pos_iter = iter(sorted(old_positions))
+    next_pos = next(pos_iter, None)
+    new_positions = []
+
+    for op, value in ops:
+        if next_pos is None:
+            break
+        if op == 'retain':
+            n = value
+            while next_pos is not None and old_ptr + n >= next_pos:
+                offset = next_pos - old_ptr
+                new_positions.append(new_ptr + offset)
+                next_pos = next(pos_iter, None)
+            old_ptr += n
+            new_ptr += n
+        elif op == 'delete':
+            n = value
+            while next_pos is not None and old_ptr + n >= next_pos:
+                # AnchorSegment inside the deleted range → map to new_ptr right after the delete
+                new_positions.append(new_ptr)
+                next_pos = next(pos_iter, None)
+            old_ptr += n
+        elif op == 'insert':
+            new_ptr += len(value)
+
+    # Remaining positions lie past the end of old_text → map to the end of new_text
+    while next_pos is not None:
+        new_positions.append(len(new_text))
+        next_pos = next(pos_iter, None)
+
+    return new_positions
+
+
+def reconstruct_paragraph(
+    old_mdx_text: str,
+    new_mdx_text: str,
+    entry: SidecarEntry,
+    sidecar_index: dict,
+) -> str:
+    """Reconstruct a paragraph based on its ParagraphEditSequence.
+
+    Without children, the new MDX text is converted as is (backward-compat fallback for old sidecars).
+    With children:
+      - Concatenated TextSegments = MDX text of the paragraph → compared directly with old_mdx_text (no normalization)
+      - No AnchorSegment (no ac:image): apply convert_inline
+      - With AnchorSegments: map positions with map_anchor_positions(), then insert into new_text
+    Raises SidecarMismatchError if old_mdx_text does not match the concatenated TextSegments.
+    """
+    if not entry.children:
+        return f'<p>{convert_inline(new_mdx_text)}</p>'
+
+    # Restore the ParagraphEditSequence
+    text_segments = [c['text'] for c in entry.children if c['kind'] == 'text']  # TextSegment
+    anchor_refs = [c['ref'] for c in entry.children if c['kind'] == 'ref']      # AnchorSegment
+    old_text = ''.join(text_segments)
+
+    if old_text != old_mdx_text:
+        raise SidecarMismatchError(
+            f"paragraph TextSegment mismatch:\n"
+            f"  sidecar: {old_text!r}\n"
+            f"  actual:  {old_mdx_text!r}\n"
+            f"  → sidecar must be regenerated (update the testcase fixture)"
+        )
+
+    # Compute AnchorSegment old coordinates (cumulative TextSegment length)
+    old_positions = []
+    cursor = 0
+    seg_iter = iter(text_segments)
+    for child in entry.children:
+        if child['kind'] == 'text':
+            cursor += len(next(seg_iter))
+        elif child['kind'] == 'ref':
+            old_positions.append(cursor)
+
+    # Map AnchorSegment new coordinates with the Myers diff
+    new_positions = map_anchor_positions(old_text, new_mdx_text, old_positions)
+
+    # Split new_mdx_text in MDX coordinates first, then apply convert_inline() to each piece
+    result = ''
+    prev = 0
+    for ref, new_pos in zip(anchor_refs, new_positions):
+        mdx_piece = new_mdx_text[prev:new_pos]          # split the MDX text in MDX coordinates
+        result += convert_inline(mdx_piece)              # convert to XHTML after splitting
+        ref_entry = sidecar_index.get(ref)
+        if ref_entry is None:
+            raise SidecarMismatchError(f"AnchorSegment ref not found in sidecar_index: {ref!r}")
+        result += ref_entry.html
+        prev = new_pos
+    result += convert_inline(new_mdx_text[prev:])       # last piece
+
+    return f'<p>{result}</p>'
+```
+
+#### Explicit Failure Cases
+
+| Case | Behavior |
+|--------|------|
+| `old_mdx_text` != concatenated TextSegments | `SidecarMismatchError`: regenerate the sidecar, then update the testcase |
+| `AnchorSegment ref` missing from `sidecar_index` | `SidecarMismatchError`: sidecar structure error |
+| `AnchorSegment` inside a deleted range | Mapped to `new_ptr` (right after the deletion); this is defined behavior |
+| Paragraph without children | Existing path: `convert_inline(new_mdx_text)` is used as is |
+
+When a failure occurs:
+1. Add a testcase fixture that reproduces the failing case
+2. If it is a `SidecarMismatchError`, fix the sidecar creation logic and regenerate the fixture
+3. If it is an undefined structural case, extend the design and run the cycle again
+
+---
+
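+> **TODO: Investigation required before implementation**
+>
+> 1. **[Decided] Coordinate unit of `myers_diff()`**: Fixed as Python str index (Unicode code point). Stated in the comments on the function signature. Edge cases such as emoji combining sequences are handled by adding a testcase when they occur.
+>
+> 2. **[Phase 1 prerequisite] How to obtain `old_mdx_text`**: Check how `patch_builder.py` obtains the old MDX text of a paragraph block, and whether `Change.old_block.content` in `block_diff.py` serves this role.
+>
+> 3. **[Phase 1 prerequisite] Distinguishing `<ac:image>` position types**: Check how `mapping_recorder.py` distinguishes inline images inside `<p>` from standalone block images outside `<p>`. Standalone blocks are outside the scope of this design.
+>
+> 4. **[Decided] When to apply `convert_inline()`**: Split the MDX text in MDX coordinates first, then apply `convert_inline()` to each piece, which structurally removes the coordinate-system mismatch. Reflected in the `reconstruct_paragraph()` implementation.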
+
+---
+
+### 3.2 Solving Problem B: Recursive Parsing of Callout Content
+
+#### Idea
+
+Instead of converting the inner text to a paragraph in `_convert_callout_inner()`, parse the inside into a block sequence with `parse_mdx_blocks()` and apply `mdx_block_to_xhtml_element()` recursively.
+
+#### Implementation
+
+```python
+def _convert_callout_inner_full(text: str) -> str:
+    """Recursively block-parse the callout content and reconstruct the XHTML."""
+    from mdx_to_storage.parser import parse_mdx_blocks
+
+    # Remove the <Callout> wrapper, then fix the indentation
+    inner = _strip_callout_wrapper(text)
+    inner = _dedent_callout_body(inner)
+
+    # Parse the inside as MDX blocks (reusing the existing parser)
+    inner_blocks = [b for b in parse_mdx_blocks(inner)
+                    if b.type not in ('frontmatter', 'import_statement', 'empty')]
+
+    if not inner_blocks:
+        return ''
+
+    # Reconstruct each block recursively
+    parts = [mdx_block_to_xhtml_element(b) for b in inner_blocks]
+    return ''.join(parts)
+
+
+def _dedent_callout_body(text: str) -> str:
+    """Remove the callout body indentation (common leading whitespace)."""
+    lines = text.splitlines()
+    non_empty = [line for line in lines if line.strip()]
+    if not non_empty:
+        return text
+    indent = min(len(line) - len(line.lstrip()) for line in non_empty)
+    return '\n'.join(line[indent:] for line in lines)
+```
+
+#### Outer Callout Wrapper: Using the Sidecar xhtml_type
+
+When reconstructing a callout, the original macro format (`ac:structured-macro` vs `ac:adf-extension`) must be preserved. `sidecar_entry.xhtml_type` already carries this information.
+
+```python
+def mdx_callout_to_xhtml(block, sidecar_entry) -> str:
+    """Reconstruct an MDX callout block as an XHTML macro."""
+    callout_type = _extract_callout_type(block.content)   # "info" | "warning" | ...
+    inner_xhtml = _convert_callout_inner_full(block.content)
+
+    if sidecar_entry and sidecar_entry.xhtml_type == 'adf_extension':
+        return _wrap_adf_callout(inner_xhtml, callout_type)
+    else:
+        return _wrap_structured_macro_callout(inner_xhtml, callout_type)
+```
+
+#### Safety of Recursive Parsing
+
+`parse_mdx_blocks()` is currently written for top-level MDX documents. Points to watch when applying it inside a callout:
+
+- frontmatter and import_statement do not occur inside a callout, so they can be ignored (the filter is already applied)
+- Nested case where a `<Callout>` appears inside a callout: the callout branch of `mdx_block_to_xhtml_element()` is called recursively, so it is handled naturally
+- HTML blocks such as `<figure>`, `<details>`, and `<Badge>` inside a callout: they are parsed as the `html_block` type and passed through as is, so no problem arises
+
+---
+
+## 4. The New Reverse Sync Flow
+
+Once the two problems are solved, the logic of `build_patches()` simplifies as follows.
+
+### 4.1 Handling Changed Blocks (Modified)
+
+```
+[Current]
+modified block
+  → _resolve_mapping_for_change() (6-step fallback)
+    → strategy: direct | containing | list | table | skip
+      → (separate handling per strategy)
+
+[Proposed]
+modified block
+  → sidecar O(1) lookup → BlockMapping
+  → mdx_block_to_xhtml_element(new_block, sidecar_entry)
+    → heading, paragraph, code_block: direct reconstruction
+    → list: reconstruct_ul_entry() (including inline_trailing_html re-injection)
+    → callout: mdx_callout_to_xhtml() (recursive parsing + macro wrap)
+    → otherwise: existing fallback
+  → replace the content at xhtml_xpath with new_inner_xhtml
+```
+
+### 4.2 Code to Be Removed
+
+| Module/Function | Reason |
+|-----------|------|
+| `_resolve_mapping_for_change()` | Replaced by a simple sidecar O(1) lookup |
+| `_find_containing_mapping()` | The containing strategy is no longer needed |
+| `_resolve_child_mapping()` 4-step fallback | Unnecessary when the sidecar is accurate |
+| `text_transfer.py` (most of it) | Text position mapping is no longer needed |
+| `has_inline_format_change()` | Reconstruction is the default, so detection is unnecessary |
+| `has_inline_boundary_change()` | Same as above |
+| Block-level heuristics in `lost_info_patcher.py` | Replaced by `inline_trailing_html` re-injection |
+| Matching logic in `build_list_item_patches()` | Replaced by ref traversal in `reconstruct_ul_entry()` / `reconstruct_li_entry()` |
+| `_convert_callout_inner()` → paragraph fallback | Replaced by recursive parsing |
+
+### 4.3 Code to Be Kept
+
+| Module/Function | Reason |
+|-----------|------|
+| `block_diff.py` | Diff logic used as is |
+| `sidecar.py` O(1) index | Used as is (children/plain_text/inline_trailing_html fields added) |
+| `mdx_to_xhtml_inline.py` | Core of reconstruction; to be extended |
+| `xhtml_patcher.py` `_replace_inner_html()` | XHTML replacement mechanism |
+| `roundtrip_verifier.py` | Verification logic |
+| `table_patcher.py` | Tables are handled separately (table structure is complex) |
+
+---
+
+## 5. Implementation Plan
+
+### Phase 1: Introduce Flat Sidecar Mapping + children ref Structure
+
+**Goal:** Create `ul`/`ol`/`li` and `ac:image` inside `<p>` as top-level SidecarEntry objects, and express their relationships as `children: [ref]`
+
+**Tasks:**
+1. Add `children`, `plain_text`, and `inline_trailing_html` fields to the `SidecarEntry` dataclass
+   - `children: List[ChildRef]`: list of `{'ref': xhtml_xpath}` (ul→li, li→block child, p→ac:image)
+   - `plain_text: str`: MDX item text of the li entry (zip matching key)
+   - `inline_trailing_html: str`: original HTML of the non-block lossy elements right after `<p>`
+2. Implement `_process_element(elem, xpath)`: recursively create `ul`/`ol`/`li` as top-level entries
+   - `xhtml_type` is the XHTML tag name as is (`ul`, `ol`, `li`)
+   - Block elements inside `<li>` (`<ul>`, `<ol>`, block macro) → independent entry + `children` ref
+   - Non-block lossy elements inside `<li>` (`<ac:image>` and the like) → stored in `inline_trailing_html`
+3. Implement `_process_paragraph(elem, xpath)`: create `ac:image` inside `<p>` as an independent entry + `children` ref
+   - Record the `ParagraphEditSequence` structure (`kind: text` / `kind: ref`) in the sidecar
+4. Replace the list/paragraph handling in `generate_sidecar_mapping()` with calls to the two functions above
+5. Deserialize `children`, `plain_text`, and `inline_trailing_html` in `load_sidecar_mapping()`
+6. Bump the `mapping.yaml` schema version (`version: 2`)
+
+**Verification testcases:**
+
+| testcase ID | Verifies | Rationale |
+|-------------|-----------|------|
+| `544145591` | 9 li with `ac:image`, 21 nested lists | Rich in both li+image and nesting |
+| `880181257` | 12 li with `ac:image` | Focused verification of li with ac:image |
+| `883654669` | 16 li with `ac:image` | Most li with ac:image |
+
+> **TODO:** Before starting Phase 1, confirm items 1 to 5 of the pre-implementation investigation TODO in Section 3.1, then finalize the implementation direction
+
+---
+
+### Phase 2: `_convert_callout_inner` 재귀 파싱
+
+**목표:** callout 내부 리스트/코드 블록을 올바르게 재구성
+
+**작업:**
+1. `_strip_callout_wrapper()` 및 `_dedent_callout_body()` 유틸리티 추가
+2. `_convert_callout_inner_full()` 구현 (재귀 파싱)
+3. `mdx_block_to_xhtml_element()`의 callout 분기에서 sidecar_entry를 인자로 받아 macro 포맷 결정
+4. `mdx_block_to_xhtml_element()` 시그니처에 optional `sidecar_entry` 추가
+
+**검증 testcase:**
+
+| testcase ID | 검증 대상 | 근거 |
+|-------------|-----------|------|
+| `1454342158` | callout 내부 list 4개 | callout+list 가장 많음 |
+| `880181257` | callout 내부 list 2개 | Phase 1+2 복합 케이스 (ac:image 포함 li도 존재) |
+
+---
+
+### Phase 3: `build_patches()` 재구성 경로 전환
+
+**목표:** modified 블록 처리를 재구성 기반으로 전환
+
+**작업:**
+1. `reconstruct_ul_entry()` / `reconstruct_li_entry()` 구현 및 단위 테스트
+2. `build_patches()`에서 modified 블록의 처리를 `mdx_block_to_xhtml_element()` 기반으로 전환
+3. 기존 전략 분기(containing, list, text_transfer) 단계적 제거
+4. 전체 테스트 케이스 통과 확인 (`make test-reverse-sync`)
+
+**검증 기준:** `tests/reverse-sync/pages.yaml`의 모든 `expected_status: pass` 케이스 유지
+
+---
+
+## 6. 위험 및 대응
+
+### 위험 1: inline 항목 텍스트 변경으로 `inline_trailing_html` 매칭 실패
+
+**증상:** 리스트 항목 내용이 크게 바뀌면 `plain_text` 키가 일치하지 않아 `inline_trailing_html`을 찾지 못함
+
+**대응:**
+- `_match_mdx_inline_item()`에서 순서(index) 기반 폴백 우선 사용
+  - `list_items` 시퀀스에서 `kind: inline` 항목의 순서(inline_ptr)와 MDX 항목의 순서를 매칭
+  - 텍스트 완전 일치 → prefix 20자 매칭 → 순서 기반 매칭 순으로 폴백
+- 매칭 실패 시 `inline_trailing_html` 없이 재구성 → lossy 요소 손실이지만 구조 파괴보다 안전
+
+> **TODO (W-3):** prefix 20자 매칭의 충돌 가능성을 기존 testcase 전수 조사로 확인. 충돌이 빈번하면 prefix 폴백을 제거하고 순서 기반만 유지하는 방향 검토.
+
+### 위험 2: Callout 내부 들여쓰기 처리 오류
+
+**증상:** `_dedent_callout_body()`가 내부 코드 블록의 들여쓰기를 과도하게 제거
+
+**대응:**
+- code_block 내부는 `parse_mdx_blocks()`가 펜스 마커 기준으로 파싱하므로, 들여쓰기 제거가 코드 내용에 영향 없음
+- 단위 테스트로 코드 블록 포함 callout 케이스 커버
+
+### 위험 3: Sidecar 버전 비호환
+
+**증상:** `list_items` 필드가 없는 구 버전 `mapping.yaml`을 읽을 때 오류
+
+**대응:**
+- `list_items`를 optional 필드로 선언 (기본값 `[]`)
+- 구 버전 sidecar에서 `list_items`가 없으면 trailing 없이 재구성 → 기존 동작과 동일
+
+---
+
+## 7. 기대 효과 요약
+
+| 지표 | 현재 | 개선 후 |
+|------|------|---------|
+| `patch_builder.py` 전략 수 | 5 (direct / containing / list / table / skip) | 2 (reconstruct / skip) |
+| `_resolve_child_mapping()` 폴백 단계 | 4 | 0 (삭제) |
+| 인라인 변경 감지 함수 | 2 (`has_inline_format_change`, `has_inline_boundary_change`) | 0 (삭제) |
+| callout 내부 리스트 처리 | text_transfer 우회 | 재귀 재구성 |
+| 신규 edge case 시 대응 | 전략 분기 추가 | `mdx_to_xhtml_inline.py` 개선 |
+| 기술부채 방향 | 분기 누적 | 단일 재구성 경로 개선으로 집중 |
+
+---
+
+## 8. 테스트 설계
+
+### 8.1 테스트 목표: 재구성의 "완전함"을 증명하는 방법
+
+"완전함"을 다음 두 가지 명제로 구분하여 증명한다.
+
+**명제 1 — 재구성 정확성:** `mdx_block_to_xhtml_element(block)`이 각 블록 타입에 대해 XHTML을 올바르게 생성한다.
+
+**명제 2 — 손실 복원 완전성:** `inline_trailing_html` 재주입 후 재구성된 XHTML이 원본 `page.xhtml`과 블록 수준에서 등가이다.
+
+두 명제 모두 기존 `tests/testcases/`와 `tests/reverse-sync/`의 데이터로 검증할 수 있다. **새로운 테스트 입력 파일을 만들 필요가 없다.**
+
+---
+
+### 8.2 테스트 가능성 분류: 블록 타입별 재구성 가능성
+
+주어진 testcase 내의 블록을 재구성 가능성에 따라 세 범주로 나눈다.
+
+| 범주 | 설명 | 재구성 후 기대 결과 | 예시 |
+|------|------|---------------------|------|
+| **가역 블록** | MDX로 완전히 표현 가능한 블록 | 원본 XHTML과 byte-equal | heading, paragraph, code_block, 단순 list |
+| **손실 복원 블록** | MDX에 표현 안 되는 요소가 있지만 `inline_trailing_html`로 복원 가능 | `inline_trailing_html` 포함 시 원본과 등가 | `<ac:image>` 포함 list item, `<span style>` 포함 항목 |
+| **비가역 블록** | 정보 손실이 불가역적 | 기능적으로 무해한 변환만 허용 | `ac:adf-extension` callout, `ac:link` 포함 paragraph |
+
+비가역 블록은 이미 `architecture.md`의 "정보 손실 카테고리"에 문서화된 항목들이다. 이 범주는 재구성 목표 밖이며, 테스트에서 skip 처리한다.
+
+---
+
+### 8.3 테스트 수준 구조 (5단계)
+
+```
+Level 0: 보조 함수 단위 테스트              (unit)
+    ↓
+Level 1: 블록 단위 재구성 정확성            (unit)
+    ↓
+Level 2: 전체 문서 재구성 + block-level 비교  (integration)
+    ↓
+Level 3: lossy 요소 재주입 후 byte-equal    (integration)
+    ↓
+Level 4: E2E reverse-sync 회귀 방지         (e2e)
+```
+
+---
+
+### 8.3.1 Level 0: 보조 함수 단위 테스트
+
+**목적:** 신규 추가되는 보조 함수 각각이 독립적으로 올바르게 동작하는지 확인한다. Level 1보다 먼저 실행하여 버그 위치를 좁힌다.
+
+**실행 방법:**
+```bash
+python3 -m pytest tests/test_reconstruction_helpers.py -v --tb=short
+```
+
+#### 테스트 대상 함수 및 케이스
+
+**`_process_element(ul_or_ol, xpath)`** — sidecar entry 생성
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| 단순 `<ul><li><p>text</p></li></ul>` | `xhtml_type: ul` entry + `xhtml_type: li` entry 생성, children ref 정확성 |
+| `<li>` 내부 `<ac:image>` 포함 | `inline_trailing_html` 추출, block child가 아님을 확인 |
+| `<li>` 내부 nested `<ul>` | 부모 li의 `children`에 ref, nested ul의 독립 entry 생성 (`xhtml_type: ul`) |
+| 빈 `<li>` (`<li></li>`) | `plain_text=""`, `children=[]` |
+| `<li>` 내부 block macro (`<ac:structured-macro>`) | `children`에 ref, 독립 entry 생성 |
+| `<ol>` | `xhtml_type: ol`로 생성 확인 |
+
+**`map_anchor_positions(old_text, new_text, old_positions)`** — AnchorSegment 위치 매핑
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| `old_text == new_text` | 모든 AnchorSegment position 그대로 유지 |
+| 앞부분 삽입 (`"AB"` → `"XAB"`) | position이 삽입 길이만큼 뒤로 이동 |
+| 앞부분 삭제 (`"AB"` → `"B"`) | position이 삭제 길이만큼 앞으로 이동 |
+| 삭제 구간 안에 AnchorSegment (`"AB"` → `"B"`, position=1) | 삭제 직후 위치(0)로 매핑 |
+| 전체 교체 | AnchorSegment가 new_text 끝으로 매핑 |
+| AnchorSegment 2개, 중간 TextSegment 수정 | 각각 독립적으로 정확히 매핑 |
+
+**`reconstruct_paragraph(old_mdx_text, new_mdx_text, entry, sidecar_index)`**
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| `children` 없음 | `convert_inline(new_mdx_text)` 그대로 반환 |
+| `old_mdx_text != TextSegment 연결` | `SidecarMismatchError` 발생 |
+| AnchorSegment 1개, 텍스트 변경 없음 | AnchorSegment가 정확한 위치에 삽입 |
+| AnchorSegment 2개, TextSegment 수정 | Myers diff로 두 AnchorSegment 위치 모두 정확히 매핑 |
+| ref가 `sidecar_index`에 없음 | `SidecarMismatchError` 발생 |
+
+**`_process_paragraph(elem, xpath)`** — ParagraphEditSequence sidecar entry 생성
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| `<p>text only</p>` | `children = [{'kind': 'text', 'text': 'text only'}]` — TextSegment 1개 |
+| `<p><strong>bold</strong></p>` | `children = [{'kind': 'text', 'text': 'bold'}]` — TextSegment 1개 (inline element는 get_text() 처리) |
+| `<p>텍스트A <ac:image/> 텍스트B</p>` | `children = [text, ref, text]` 교차 시퀀스, `ac:image` 독립 entry 생성 |
+| `<p><ac:image/><ac:image/></p>` | `children = [text(''), ref, text(''), ref, text('')]` — 빈 TextSegment 포함 |
+| `<p>텍스트A <ac:image/></p>` | `children = [text, ref, text('')]` — 마지막 빈 TextSegment 포함 |
+
+**`reconstruct_ul_entry(entry, sidecar_index, mdx_nodes)`** — ul/ol 재구성
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| 단순 ul (sidecar 2개, mdx 2개 항목) | `<ul><li>...</li><li>...</li></ul>` 정상 재구성 |
+| ol 재구성 | `xhtml_type: ol` → `<ol>` 출력 |
+| sidecar 2개, mdx 3개 (항목 추가) | 새 항목이 `<li><p>...</p></li>` 로 출력에 포함 |
+| sidecar 3개, mdx 2개 (항목 삭제) | 삭제 항목 생략, 2개 `<li>` 출력 |
+| nested ul (li → ul → li 2단계) | 재귀 재구성 정확성 — `<ul><li><p>...</p><ul><li>...</li></ul></li></ul>` |
+
+**`reconstruct_li_entry(entry, sidecar_index, node)`** — li 재구성
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| 단순 li (텍스트만) | `<li><p>텍스트</p></li>` |
+| `inline_trailing_html` 있는 li | `<li><p>텍스트</p><ac:image.../></li>` — trailing html 재주입 확인 |
+| nested ul children 있는 li | `<li><p>텍스트</p><ul>...</ul></li>` |
+| block macro children 있는 li | `xhtml_fragment` 그대로 삽입 |
+
+**`normalize_xhtml(xhtml)`**
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| 속성 순서 다른 두 XHTML | normalize 후 equal |
+| `<br/>` vs `<br />` | normalize 후 equal |
+| `<tag />` vs `<tag></tag>` (빈 속성값 형식: self-closing vs 여닫는 태그 쌍) | normalize 후 equal |
+| trailing 공백 다른 텍스트 노드 | normalize 후에도 **not equal** |
+| 다른 namespace prefix | normalize 후에도 **not equal** |
+
+**`_strip_callout_wrapper(text)` / `_dedent_callout_body(text)`**
+
+| 테스트 케이스 | 검증 내용 |
+|---------------|-----------|
+| `<Callout type="info">...</Callout>` | wrapper 제거 후 내부 텍스트만 반환 |
+| 들여쓰기 2칸 callout body | 공통 선행 공백 제거 |
+| 내부 코드 블록 포함 | 코드 블록 내 들여쓰기 보존 |
+
+---
+
+### 8.4 Level 1: 블록 단위 재구성 정확성 테스트
+
+**목적:** `mdx_block_to_xhtml_element()`이 각 블록 타입에 대해 올바른 XHTML을 생성하는지 확인한다.
+
+**데이터 소스:** `tests/testcases/{page_id}/mapping.yaml` + `expected.mdx`
+
+**방법:**
+
+`mapping.yaml`은 각 MDX 블록의 `xhtml_xpath`와 원본 XHTML 내 `xhtml_text`를 갖고 있다. `expected.mdx`를 파싱하여 각 블록을 재구성하면, `mapping.yaml`의 `xhtml_text`와 비교할 수 있다.
+
+```python
+# tests/test_reconstruction_unit.py
+@pytest.mark.parametrize("page_id", list_testcase_ids())
+def test_block_reconstruction(page_id):
+    """각 MDX 블록을 재구성하여 원본 XHTML 텍스트와 비교한다."""
+    mapping = load_mapping(f"testcases/{page_id}/mapping.yaml")
+    mdx_blocks = parse_mdx_blocks(open(f"testcases/{page_id}/expected.mdx").read())
+
+    for entry in mapping.entries:
+        mdx_idx = entry.mdx_blocks[0] if entry.mdx_blocks else None
+        if mdx_idx is None:
+            continue  # 비가역 블록 skip
+
+        block = mdx_blocks[mdx_idx]
+        reconstructed = mdx_block_to_xhtml_element(block)
+
+        # normalize_xhtml() 정규화 범위:
+        #   정규화 O — 속성 순서, 빈 태그 형식(<br/> vs <br />), 빈 속성값 형식
+        #   정규화 X — 텍스트 노드 trailing 공백, 네임스페이스 prefix
+        # trailing 공백 차이나 namespace prefix 차이는 실제 버그로 판정한다.
+        assert normalize_xhtml(reconstructed) == normalize_xhtml(entry.xhtml_text), \
+            f"Block {mdx_idx} ({block.type}) mismatch"
+```
+
+#### `normalize_xhtml()` 스펙
+
+| 항목 | 정규화 여부 | 이유 |
+|------|------------|------|
+| 속성 순서 | ✅ 정렬 | `mdx_block_to_xhtml_element()` 출력 순서가 원본과 다를 수 있음 |
+| 빈 태그 형식 (`<br/>` vs `<br />`) | ✅ 통일 | XML 파서/직렬화 도구마다 출력이 다름 |
+| 빈 속성값 형식 (`<ac:parameter ... />` vs `<ac:parameter ...></ac:parameter>`) | ✅ 통일 | BeautifulSoup 출력 방식에 따라 달라짐 |
+| 텍스트 노드 trailing 공백 | ❌ 유지 | 공백 차이는 실제 버그로 판정 |
+| 네임스페이스 prefix (`ac:`, `ri:`) | ❌ 유지 | prefix는 항상 동일하게 유지되어야 함 — 차이 발생 시 실제 버그 |
+
+```python
+def normalize_xhtml(xhtml: str) -> str:
+    """비교용 XHTML 정규화.
+
+    정규화 O: 속성 순서, 빈 태그 형식, 빈 속성값 형식
+    정규화 X: 텍스트 노드 공백, 네임스페이스 prefix
+    """
+    from lxml import etree
+    root = etree.fromstring(f"<root>{xhtml}</root>")
+    for elem in root.iter(tag=etree.Element):
+        # 속성 순서 정렬: elem.attrib은 재대입할 수 없으므로 비운 뒤 정렬 순서로 다시 채운다
+        sorted_items = sorted(elem.attrib.items())
+        elem.attrib.clear()
+        elem.attrib.update(sorted_items)
+    # 직렬화(빈 태그/속성값 형식은 lxml 기본 출력으로 통일) 후 <root>...</root> 래퍼 제거
+    return etree.tostring(root, encoding="unicode")[6:-7]
+```
+
+이 함수 자체에 대한 단위 테스트(Level 0)가 필요하다. Section 8.3.1의 `normalize_xhtml(xhtml)` 테스트 케이스 참고.
+
+**측정 지표:** `passed_blocks / total_blocks` — 목표 80% 이상 (비가역 블록 제외)
+
+**실행 방법:**
+```bash
+python3 -m pytest tests/test_reconstruction_unit.py -v \
+    --tb=short --no-header 2>&1 | tail -20
+```
+
+---
+
+### 8.5 Level 2: 전체 문서 재구성 커버리지 테스트
+
+**목적:** `tests/testcases/`의 모든 페이지에 대해 재구성이 가능한 블록의 비율(커버리지)을 측정한다.
+
+**방법:** 각 페이지의 `expected.mdx`를 전체 재구성한 뒤, 재구성 결과와 원본 `page.xhtml`을 각각 beautify하여 diff로 비교한다. 비가역 블록 위치에서 발생하는 diff만 허용한다.
+
+```
+tests/testcases/{page_id}/
+    page.xhtml                ← 원본
+    expected.mdx              ← MDX 입력
+    mapping.yaml              ← 블록 매핑
+    output.reconstruct.xhtml  ← 재구성 결과 (신규 생성)
+    output.reconstruct.diff   ← beautify-diff (신규 생성)
+```
+
+#### 핵심 함수 인터페이스
+
+```python
+def reconstruct_full_xhtml(
+    mdx_text: str,
+    mapping: SidecarMapping,
+    page_xhtml: str,              # 비가역 블록 원본 보존용
+) -> str:
+    """MDX 전체를 sidecar mapping 기반으로 재구성한다.
+
+    처리 순서:
+      1. mapping의 각 entry를 순회
+      2. 가역 블록 → mdx_block_to_xhtml_element()로 재구성
+      3. 비가역 블록(ac:link 포함, adf_extension 등) → page_xhtml에서 원본 fragment 추출하여 그대로 사용
+      4. document envelope(prefix/suffix) → sidecar에서 복원 (RoundtripSidecar.reassemble_xhtml() 참고)
+    """
+
+
+def compare_reversible_blocks(
+    original: str,
+    reconstructed: str,
+    mapping: SidecarMapping,
+) -> list[str]:
+    """가역 블록에서 발생한 diff 목록을 반환한다.
+
+    반환값이 [] 이면 전원 일치.
+    mapping의 각 entry를 순회하며:
+      - 비가역 블록(ac:link 포함, adf_extension 등) → skip
+      - 가역 블록 → original vs reconstructed의 해당 xhtml_xpath fragment 비교
+      - 불일치 시 f"{xhtml_xpath}: {diff}" 형태의 문자열을 목록에 추가
+    normalize_xhtml()로 정규화 후 비교한다.
+    """
+```
+
+**실행 스크립트 (기존 run-tests.sh 확장):**
+
+```bash
+# run-tests.sh에 추가할 타입
+# --type reconstruct
+# page.xhtml의 모든 블록을 expected.mdx로 재구성하여 diff를 생성한다.
+```
+
+```python
+# tests/test_reconstruction_coverage.py
+
+@pytest.mark.parametrize("page_id", list_testcase_ids())
+def test_reconstruction_coverage(page_id):
+    """MDX → XHTML 재구성 커버리지: 가역 블록은 원본과 일치해야 한다."""
+    page_xhtml = open(f"testcases/{page_id}/page.xhtml").read()
+    mdx_text = open(f"testcases/{page_id}/expected.mdx").read()
+    mapping = load_mapping(f"testcases/{page_id}/mapping.yaml")
+
+    reconstructed_xhtml = reconstruct_full_xhtml(mdx_text, mapping, page_xhtml)
+
+    reversible_diffs = compare_reversible_blocks(
+        original=page_xhtml,
+        reconstructed=reconstructed_xhtml,
+        mapping=mapping,
+    )
+    # 가역 블록에서 diff가 없어야 함
+    assert reversible_diffs == [], \
+        f"Reversible block diff found:\n" + "\n".join(reversible_diffs)
+```
+
+**`compare_reversible_blocks()`의 동작:**
+
+1. `mapping.yaml`의 각 엔트리를 순회
+2. 비가역 블록(ac:link 포함, adf_extension 등) → skip
+3. 가역 블록 → `original_block_xhtml` vs `reconstructed_block_xhtml` 비교
+4. 불일치 시 diff를 반환
+
+---
+
+### 8.6 Level 3: 재구성 후 byte-equal 테스트
+
+**목적:** 손실 복원 블록(`ac:image`, `inline_trailing_html` 포함 항목 등)에 대해 재구성 결과가 원본 XHTML fragment와 byte-equal임을 증명한다.
+
+#### 기존 인프라 (신규 파일 불필요)
+
+다음 구현이 이미 존재한다:
+
+| 항목 | 구현 위치 | 설명 |
+|------|-----------|------|
+| `expected.roundtrip.json` 파일 | `tests/testcases/{page_id}/` | `SidecarBlock.xhtml_fragment` (byte-exact), `SidecarBlock.mdx_content_hash` (SHA-256) 포함 |
+| `SidecarBlock.mdx_content_hash` | `bin/reverse_sync/sidecar.py` L53 | MDX 블록 content의 SHA-256 — MDX 블록 식별 키 |
+| `RoundtripSidecar`, `load_sidecar()` | `bin/reverse_sync/sidecar.py` L59, L233 | JSON 역직렬화 |
+| `build_sidecar()` | `bin/reverse_sync/sidecar.py` L159 | `page.xhtml` + `expected.mdx` → `RoundtripSidecar` 생성 |
+| 생성 CLI | `bin/mdx_to_storage_roundtrip_sidecar_cli.py` | `batch-generate` 서브커맨드로 전체 testcase 일괄 생성 |
+| splice 경로 | `bin/reverse_sync/rehydrator.py` L62 `splice_rehydrate_xhtml()` | `mdx_content_hash` 기반 블록 매칭 — `find_mdx_block_by_hash()` 역할 수행 |
+
+**데이터 소스:** `tests/testcases/{page_id}/expected.roundtrip.json` + `tests/testcases/{page_id}/expected.mdx`
+
+#### 테스트 코드
+
+```python
+# tests/test_reconstruction_lossless.py
+from pathlib import Path
+import pytest
+from reverse_sync.sidecar import load_sidecar, sha256_text, load_sidecar_mapping
+from reverse_sync.mdx_block_parser import parse_mdx_blocks
+@pytest.mark.parametrize("page_id", list_testcase_ids())
+def test_lossless_reconstruction(page_id):
+    """재구성 결과가 원본 XHTML fragment와 byte-equal인지 검증한다."""
+    sidecar = load_sidecar(Path(f"testcases/{page_id}/expected.roundtrip.json"))
+    # load_sidecar: bin/reverse_sync/sidecar.py L233
+    mapping_entries = load_sidecar_mapping(f"testcases/{page_id}/mapping.yaml")
+    # load_sidecar_mapping: bin/reverse_sync/sidecar.py L257
+    xpath_index = {e.xhtml_xpath: e for e in mapping_entries}
+
+    mdx_text = open(f"testcases/{page_id}/expected.mdx").read()
+    mdx_blocks = parse_mdx_blocks(mdx_text)
+    # parse_mdx_blocks: bin/reverse_sync/mdx_block_parser.py
+
+    # mdx_content_hash → MDX 블록 인덱스 (splice 경로와 동일 방식)
+    # rehydrator.py L96: content_hash == sb.mdx_content_hash 비교와 동일
+    hash_to_block = {sha256_text(b.content): b for b in mdx_blocks if b.content}
+
+    for sb in sidecar.blocks:
+        if not sb.mdx_content_hash:
+            continue  # MDX 대응 없는 블록 skip (image, TOC 등)
+            # SidecarBlock.mdx_content_hash: sidecar.py L53
+
+        mdx_block = hash_to_block.get(sb.mdx_content_hash)
+        if mdx_block is None:
+            continue  # hash에 대응하는 MDX 블록 없음: 이 블록만 건너뛴다 (pytest.skip은 테스트 전체를 중단시킴)
+
+        sidecar_entry = xpath_index.get(sb.xhtml_xpath)
+        reconstructed = mdx_block_to_xhtml_element(mdx_block, sidecar_entry)
+
+        assert reconstructed == sb.xhtml_fragment, (
+            f"Fragment mismatch at {sb.xhtml_xpath}:\n"
+            f"  expected: {sb.xhtml_fragment!r}\n"
+            f"  got:      {reconstructed!r}"
+        )
+```
+
+#### 기존 `byte_verify`와의 관계
+
+| 검증 | 구현 | 목적 |
+|------|------|------|
+| 기존 `byte_verify` | `bin/reverse_sync/byte_verify.py` | MDX 무변경 시 XHTML byte-equal 보장 (fast path) |
+| Level 3 `test_reconstruction_lossless` | 신규 | 재구성 경로로 변환해도 byte-equal임을 보장 |
+
+Level 3이 통과하면 testcase 범위 안에서 재구성 경로의 출력이 fast path의 출력과 byte-equal임이 확인된다.
+
+**측정 목표: failed = 0** (mdx_content_hash 없는 블록 skip 제외)
+
+---
+
+### 8.7 Level 4: E2E 회귀 방지 테스트
+
+**목적:** 재구성 기반 reverse-sync가 기존 테스트케이스를 회귀시키지 않음을 보장한다.
+
+**데이터 소스:** `tests/reverse-sync/{page_id}/` (기존 인프라 그대로 사용)
+
+**방법:** 기존 `run-tests.sh --type reverse-sync-verify`를 그대로 사용하되, reverse-sync 내부 경로가 재구성 기반으로 전환된 후 동일하게 실행한다.
+
+```bash
+# 기존 명령 그대로
+cd tests && ./run-tests.sh --type reverse-sync-verify
+
+# 검증 기준: pages.yaml의 expected_status와 일치
+# pass 26건은 pass 유지 필수, fail 16건은 fail 유지 또는 pass 전환만 허용
+```
+
+**회귀 판정 기준:**
+
+| 상태 전환 | 판정 |
+|-----------|------|
+| `pass` → `pass` | ✅ 유지 |
+| `fail` → `pass` | ✅ 개선 (expected_status 업데이트 필요) |
+| `pass` → `fail` | ❌ 회귀 — PR 차단 |
+| `fail` → `fail` | ✅ 유지 |
+
+**Phase 3 구현 완료 기준:** 26건 `expected_status: pass` 전원 유지 + 신규 pass 전환 확인
+
+---
+
+### 8.8 테스트 실행 순서와 피드백 루프
+
+구현 변경 후 아래 순서로 테스트를 실행한다. 각 단계는 이전 단계가 전원 통과한 후 진행한다.
+
+#### Step 1 — Level 1 실행
+
+```bash
+python3 -m pytest tests/test_reconstruction_unit.py -v --tb=short
+```
+
+**무엇을 확인하는가:** 블록 하나를 재구성했을 때 XHTML이 올바른가.
+
+**실패 시 수정 위치:** `bin/reverse_sync/mdx_to_xhtml_inline.py` — 해당 블록 타입의 변환 로직.
+
+---
+
+#### Step 2 — Level 2 실행
+
+```bash
+python3 -m pytest tests/test_reconstruction_coverage.py -v --tb=short
+```
+
+**무엇을 확인하는가:** 문서 전체를 재구성했을 때 블록 조립 순서와 envelope(문서 앞뒤 고정 텍스트)가 올바른가.
+
+**실패 시 수정 위치:** 블록 조립 순서 오류라면 `reconstruct_entry()`, envelope 오류라면 `RoundtripSidecar.reassemble_xhtml()` (`bin/reverse_sync/sidecar.py` L70).
+
+---
+
+#### Step 3 — Level 3 실행
+
+```bash
+python3 -m pytest tests/test_reconstruction_lossless.py -v --tb=short
+```
+
+**무엇을 확인하는가:** `ac:image` 등 lossy 요소를 재주입한 후 원본 XHTML fragment와 byte-equal인가.
+
+**실패 시 수정 위치:** sidecar 생성 로직(`_process_element()`) 또는 재구성 로직(`reconstruct_entry()`)의 `inline_trailing_html` / `children` ref 처리.
+
+---
+
+#### Step 4 — Level 4 실행
+
+```bash
+cd tests && ./run-tests.sh --type reverse-sync-verify
+```
+
+**무엇을 확인하는가:** 기존에 통과하던 reverse-sync E2E 케이스가 재구성 경로 전환 후에도 동일하게 통과하는가.
+
+**실패 시 수정 위치:** `bin/reverse_sync/patch_builder.py`의 재구성 경로 — Level 1/2/3에서 놓친 블록 타입이 있다는 신호이므로, 해당 케이스를 Level 1 단위 테스트로 먼저 재현하고 수정한다.
+
+---
+
+#### 판정 기준 요약
+
+| 단계 | 통과 기준 | 실패 의미 |
+|------|-----------|-----------|
+| Level 1 | 모든 블록 타입 재구성 정확 | 변환 로직 버그 |
+| Level 2 | 문서 단위 조립 정확 | 블록 순서 또는 envelope 버그 |
+| Level 3 | lossy 요소 재주입 후 byte-equal | sidecar children/trailing 추출 버그 |
+| Level 4 | 기존 pass 케이스 전원 유지 | Level 1~3에서 놓친 케이스 존재 |
+
+---
+
+### 8.9 기존 인프라와의 관계 정리
+
+| 기존 테스트 | 역할 | 재구성 후 변화 |
+|-------------|------|----------------|
+| `run-tests.sh --type convert` | XHTML → MDX forward 변환 검증 | 변화 없음 |
+| `run-tests.sh --type reverse-sync` | expected 파일 비교 | Phase 3 완료 후 expected 파일 재생성 필요 |
+| `run-tests.sh --type reverse-sync-verify` | `expected_status` 검증 | 그대로 사용 (회귀 게이트) |
+| `byte_verify` | roundtrip sidecar byte-equal | 변화 없음 (fast path 그대로) |
+| `test_reconstruction_unit.py` | **신규** — 블록 단위 재구성 | Level 1 |
+| `test_reconstruction_lossless.py` | **신규** — trailing_html byte-equal | Level 3 |

From 4f6ac62b7235d245a1e41f39abd5c0beb45c9b9d Mon Sep 17 00:00:00 2001
From: JK <jk@chequer.io>
Date: Fri, 13 Mar 2026 20:05:33 +0900
Subject: [PATCH 2/5] =?UTF-8?q?confluence-mdx:=20=EC=9E=AC=EA=B5=AC?=
 =?UTF-8?q?=EC=84=B1=20=EC=84=A4=EA=B3=84=20=EA=B2=80=ED=86=A0=20=ED=8F=89?=
 =?UTF-8?q?=EA=B0=80=20=EB=AC=B8=EC=84=9C=EB=A5=BC=20v5=EB=A1=9C=20?=
 =?UTF-8?q?=EC=97=85=EB=8D=B0=EC=9D=B4=ED=8A=B8=ED=95=A9=EB=8B=88=EB=8B=A4?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 ...verse-sync-reconstruction-design-review.md | 394 ++++++++++++++----
 1 file changed, 320 insertions(+), 74 deletions(-)

diff --git a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
index 3d18d0e3f..2e487ac83 100644
--- a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
+++ b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
@@ -1,139 +1,385 @@
-# Reverse Sync 전면 재구성 설계 — 검토 평가 결과 (v4)
+# Reverse Sync 전면 재구성 설계 — 검토 평가 결과 (v5)
 
 > 검토 대상: `2026-03-13-reverse-sync-reconstruction-design.md`
 > 검토일: 2026-03-13
-> 검토자: Claude Sonnet 4.6
-> 이전 검토(v3) 대비 변경: 이전 지적사항 전체 반영 확인 + 신규 이슈 도출
+> 검토 기준 버전: PR #913 head `33fa095e56cb26766995b3930d3616a58559685e`
+> 검토 관점: 설계 타당성, 코드베이스 정합성, TDD 관점의 테스트 확보 가능성
 
 ---
 
-## 이전 검토(v3) 반영 결과
+## 결론
 
-| 항목 | 내용 | 반영 결과 |
-|------|------|-----------|
-| C-1 | Section 5 Phase 1이 v1 설계(`list_items` 중첩 필드)를 기술 | ✅ 플랫 매핑 + children ref 기준으로 재작성, `_process_paragraph()` 추가 |
-| C-2 | `zip()` silent truncation — 항목 추가 시 새 항목 누락 | ✅ `for node in mdx_nodes[len(sidecar_refs):]` 후처리 루프 추가 |
-| W-1 | ParagraphEditSequence 생성(sidecar creation) 로직 미정의 | ✅ `_process_paragraph()` 완전 정의 (XHTML 조각 누적 + `convert_inline()`) |
-| W-2 | `reconstruct_ul/li_entry()` Level 0 테스트 미명시 | ✅ 각 함수별 테스트 케이스 추가 |
-| W-3 | Section 4.1이 구 함수명 `reconstruct_list_with_trailing()` 참조 | ✅ `reconstruct_ul_entry()`로 교체 |
-| S-1 | `<p>` 없는 `<li>` 처리 미명시 | ✅ Confluence에서 존재하지 않음을 코드 주석에 명시 |
-| S-2 | 다단락 `<li>` 처리 미언급 | ✅ 존재 여부 조사 TODO 추가 (Section 3.1 TODO 6) |
+문서가 제시하는 **큰 방향**은 타당하다.
 
-이전 지적사항 7건 전부 반영되었습니다. 아래는 최신 설계 문서 기준 신규 이슈입니다.
+- patch heuristic을 계속 누적하는 대신 MDX → XHTML 재구성 경로로 수렴시키려는 방향은 맞다
+- list를 sidecar flat entry + `children` ref 구조로 재설계한 점도 기존 `list_items` 계열 설계보다 낫다
+- callout 내부를 paragraph fallback이 아니라 재귀 파싱으로 처리하겠다는 판단도 적절하다
 
----
+하지만 현재 문서는 **최종 설계 승인 가능 상태는 아니다**.
 
-## 총평
+핵심 이유는 두 가지다.
 
-설계가 구현 가능한 수준에 도달했습니다. 플랫 sidecar + children ref, ParagraphEditSequence, 5단계 테스트 구조가 일관성을 갖추었고, `_process_paragraph()` 생성 측이 완전히 정의된 것이 특히 중요한 개선입니다.
+1. paragraph 재구성의 핵심 불변식 하나가 현재 코드베이스 기준으로 성립하지 않는다
+2. 테스트 설계가 일부 레벨에서 실제 fixture/loader/API와 맞지 않아, TDD 게이트로 바로 작동하지 않는다
 
-남은 이슈는 Section 6가 이전 설계(v1)를 참조하는 불일치와, Level 0 테스트의 예상값 오류입니다.
+즉, 방향은 맞지만 아직 "구현 전에 바로 들어가도 되는 설계" 수준은 아니다.
 
 ---
 
-## Warning — 구현 시 혼란을 유발할 수 있는 문제
+## 주요 지적사항
+
+### C-1. ParagraphEditSequence의 핵심 불변식이 현재 설계대로는 성립하지 않는다
+
+**심각도:** Critical
+
+**문서 위치:**
+
+- Section 3.1.1 `_process_paragraph()`
+- Section 3.1.1 `reconstruct_paragraph()`
+
+**문서의 주장:**
+
+- XHTML 조각을 누적한 뒤 `convert_inline()`을 적용하면 `TextSegment.text`가 MDX 텍스트 조각이 된다
+- 따라서 `TextSegments`를 이어 붙인 값이 `old_mdx_text`와 정확히 일치한다
+- 그 결과 `old_text == old_mdx_text`를 강한 불변식으로 둘 수 있다
+
+**문제:**
 
-### W-1. Section 6 위험 1이 삭제된 설계를 기술
+현재 코드베이스의 `convert_inline()`은 **MDX → XHTML 변환기**다. XHTML 조각을 넣었을 때 XHTML을 MDX로 역변환해주지 않는다.
 
-**위치:** Section 6 위험 1
+예를 들어:
 
+```python
+convert_inline("<strong>bold</strong>") == "<strong>bold</strong>"
 ```
-- _match_mdx_inline_item()에서 순서(index) 기반 폴백 우선 사용
-  - list_items 시퀀스에서 kind: inline 항목의 순서(inline_ptr)와 MDX 항목의 순서를 매칭
-  - 텍스트 완전 일치 → prefix 20자 매칭 → 순서 기반 매칭 순으로 폴백
+
+즉 `_process_paragraph()`가 아래처럼 동작하면:
+
+```python
+cursor_xhtml += str(child)   # "<strong>bold</strong>"
+children.append({
+    "kind": "text",
+    "text": convert_inline(cursor_xhtml),
+})
 ```
 
-현재 설계는 `_match_mdx_inline_item()`, `list_items` 필드, 텍스트 매칭을 사용하지 않습니다. `reconstruct_ul_entry()`의 `zip()` 위치 기반 매칭으로 전부 대체되었습니다. 위험 1의 "증상"(텍스트 변경으로 matching 실패)도 더 이상 발생하지 않습니다.
+`TextSegment.text`는 `"**bold**"`가 아니라 `"<strong>bold</strong>"`로 남는다.
+
+그러면 문서가 전제한:
+
+```python
+old_text == old_mdx_text
+```
 
-현재 설계에서 위험 1에 해당하는 실제 위험은 **sidecar 항목 수와 MDX 항목 수 불일치 시 `inline_trailing_html` 손실**이며, 이는 케이스 분류 표에 이미 기술되어 있습니다. Section 6 위험 1을 현재 설계 기준으로 교체해야 합니다.
+는 서식이 포함된 paragraph에서 곧바로 깨진다.
+
+**왜 중요한가:**
+
+이 불변식이 깨지면 `map_anchor_positions()` 이전 단계에서 좌표계가 이미 무너진다. 즉 paragraph 설계의 중심축이 성립하지 않는다.
+
+**필요한 수정 방향:**
+
+둘 중 하나를 명확히 선택해야 한다.
+
+1. `TextSegment.text`를 "MDX 텍스트"가 아니라 "XHTML/normalized plain 기준 텍스트"로 바꾸고, `reconstruct_paragraph()` 비교식도 그 좌표계에 맞게 다시 설계한다
+2. XHTML inline fragment를 실제로 MDX inline text로 역변환하는 별도 함수/규칙을 설계하고, 그 함수의 지원 범위를 테스트로 고정한다
+
+현재 문서는 2번을 이미 해결된 것처럼 쓰고 있지만, 실제로는 해결되지 않았다.
 
 ---
 
-### W-2. Section 6 위험 3이 삭제된 `list_items` 필드를 기술
+### C-2. Level 1 / Level 2 테스트의 oracle 정의가 현재 fixture와 맞지 않는다
 
-**위치:** Section 6 위험 3
+**심각도:** High
 
-```
-- list_items를 optional 필드로 선언 (기본값 [])
-- 구 버전 sidecar에서 list_items가 없으면 trailing 없이 재구성 → 기존 동작과 동일
+**문서 위치:**
+
+- Section 8.4 Level 1
+- Section 8.5 Level 2
+
+**문서의 주장:**
+
+- `mapping.yaml`에서 각 블록의 `xhtml_text`를 읽어 재구성 결과와 비교할 수 있다
+- `load_mapping()`으로 이를 로드할 수 있다
+
+**문제:**
+
+현재 실제 `mapping.yaml`은 `xhtml_xpath`, `xhtml_type`, `mdx_blocks` 중심이며, 문서가 예제로 사용하는 `entry.xhtml_text`는 존재하지 않는다. 실제 loader인 `load_sidecar_mapping()`도 그 필드를 읽지 않는다.
+
+즉 문서의 예제:
+
+```python
+mapping = load_mapping(...)
+assert normalize_xhtml(reconstructed) == normalize_xhtml(entry.xhtml_text)
 ```
 
-`list_items`는 v1 설계 필드로, 현재 설계에 없습니다. 실제 backward compat 대상은 `children`, `plain_text`, `inline_trailing_html`이며, `reconstruct_paragraph()`의 `if not entry.children` 폴백이 이를 처리합니다. Section 6 위험 3을 현재 스키마 기준으로 업데이트해야 합니다.
+는 현재 리포지토리 기준으로 성립하지 않는다.
+
+**왜 중요한가:**
+
+TDD에서 가장 중요한 것은 "무엇을 expected로 비교할 것인가"다. 그런데 Level 1/2의 expected source가 현재 정의되지 않았다.
+
+이 상태에서는:
+
+- 테스트 파일 이름은 정할 수 있어도
+- 실제 assertion이 무엇을 비교해야 하는지
+- 어느 loader를 쓸지
+- 원본 fragment를 어디서 읽어올지
+
+가 정해지지 않은 셈이다.
+
+**필요한 수정 방향:**
+
+둘 중 하나를 선택해야 한다.
+
+1. `mapping.yaml` 스키마를 확장해 테스트 oracle로 쓸 원본 XHTML fragment를 명시적으로 넣는다
+2. Level 1/2의 oracle을 `mapping.yaml`이 아니라 `page.xhtml` 또는 `expected.roundtrip.json`에서 xpath 기반으로 추출하는 방식으로 다시 설계한다
+
+현재 리포지토리 자산을 고려하면 2번이 더 현실적이다.
 
 ---
 
-### W-3. `_process_paragraph()` Level 0 테스트 예상값 오류
+### C-3. `normalize_xhtml()` 설계가 현재 저장소 전제와 맞지 않는다
 
-**위치:** Section 8.3.1 Level 0 `_process_paragraph()` 테스트 케이스
+**심각도:** High
 
-```
-| <p><strong>bold</strong></p> | children = [{'kind': 'text', 'text': 'bold'}]
-                                 — TextSegment 1개 (inline element는 get_text() 처리) |
-```
+**문서 위치:**
+
+- Section 8.4 `normalize_xhtml()`
+
+**문서의 주장:**
+
+- `lxml.etree.fromstring()`으로 fragment를 파싱해 정규화한다
+
+**문제 1: 의존성 누락**
+
+현재 저장소의 `requirements.txt`에는 `lxml`이 없다.
+
+**문제 2: Confluence fragment 파싱 전제 미정의**
+
+문서의 테스트 대상에는 `ac:image`, `ri:attachment` 같은 namespace prefix가 포함된 XHTML fragment가 많다. 이들은 namespace 선언 없이 `lxml.etree.fromstring()`에 그대로 넣으면 기본 파서 설정에서 정의되지 않은 prefix 오류(`XMLSyntaxError`)로 파싱에 실패한다.
+
+이는 단순 구현 디테일이 아니라, Level 1/2 비교 함수가 실제 테스트 데이터에 적용 가능한지 여부를 가르는 전제다.
+
+**왜 중요한가:**
+
+비교 함수가 실제 fragment를 파싱하지 못하면 Level 1/2는 시작 자체가 안 된다.
+
+**필요한 수정 방향:**
+
+다음 중 하나를 문서에 명시해야 한다.
+
+1. `lxml`을 새 테스트 의존성으로 도입하고, Confluence namespace wrapper를 어떻게 붙일지 명확히 규정한다
+2. XML 정규화 대신 BeautifulSoup 기반 canonicalization 혹은 byte/string comparison 규칙으로 축소한다
+
+현재 문서는 이 전제를 해결하지 않은 채 테스트가 가능한 것처럼 서술하고 있다.
+
+---
+
+## Warning
+
+### W-1. Level 3의 `mdx_content_hash` 단독 매칭은 중복 content 페이지에서 오검증 위험이 있다
+
+**문서 위치:**
 
-그러나 `_process_paragraph()` 구현은:
+- Section 8.6 Level 3
+
+문서는 아래 방식으로 MDX 블록을 찾는다.
 
 ```python
-cursor_xhtml += str(child)   # → "<strong>bold</strong>" 누적
-children.append({'kind': 'text', 'text': convert_inline(cursor_xhtml)})
+hash_to_block = {sha256_text(b.content): b for b in mdx_blocks if b.content}
+mdx_block = hash_to_block.get(sb.mdx_content_hash)
 ```
 
-`convert_inline("<strong>bold</strong>")` = `"**bold**"` (MDX bold)이므로 TextSegment.text는 `'bold'`가 아니라 `'**bold**'`여야 합니다. 주석의 `inline element는 get_text() 처리`도 구현과 다릅니다.
+하지만 실제 testcase에는 **동일한 non-empty content를 가진 블록이 반복되는 페이지가 이미 존재한다**.
+
+예:
+
+- `tests/testcases/1454342158`
+- `tests/testcases/544375741`
+- `tests/testcases/544145591`
+
+이 경우 dict comprehension은 마지막 블록으로 덮어쓰므로, 다른 위치의 동일 content 블록과 잘못 매칭될 수 있다.
 
-설계 문서는 "TextSegment.text는 MDX 텍스트 조각"으로 명시하므로, 테스트 케이스 예상값을 `'**bold**'`로, 주석을 `XHTML 조각을 convert_inline()으로 변환`으로 수정해야 합니다.
+**영향:**
+
+- 테스트가 실패해야 할 케이스를 통과시키거나
+- 반대로 맞는 구현을 오검증으로 실패시킬 수 있다
+
+**권장 수정:**
+
+- key를 `mdx_content_hash` 단독이 아니라 `(mdx_content_hash, mdx_line_range)` 또는 "hash -> list of candidate blocks"로 바꾸고
+- sidecar block의 순서나 line range를 함께 사용해 disambiguation 하도록 문서화해야 한다
 
 ---
 
-## Suggestion
+### W-2. 테스트 실행 순서가 Level 0를 건너뛰고 있어 TDD 루프가 약하다
 
-### S-1. Section 6 위험 1 TODO가 삭제된 설계를 참조
+**문서 위치:**
 
-**위치:** Section 6 위험 1 하단
+- Section 8.3
+- Section 8.8
 
+문서는 테스트 수준 구조에서 Level 0를 가장 먼저 둔다.
+
+```text
+Level 0 -> Level 1 -> Level 2 -> Level 3 -> Level 4
 ```
-> TODO (W-3): prefix 20자 매칭의 충돌 가능성을 기존 testcase 전수 조사로 확인.
+
+그런데 실제 실행 순서 섹션은 Step 1을 Level 1부터 시작한다.
+
+이렇게 되면 helper 단위의 red/green이 빠지고, 실패 원인이:
+
+- helper 버그인지
+- block renderer 버그인지
+- document assembly 버그인지
+
+를 뒤늦게 분리하게 된다.
+
+**권장 수정:**
+
+Section 8.8의 Step 1은 Level 0여야 한다.
+
+```bash
+python3 -m pytest tests/test_reconstruction_helpers.py -v --tb=short
 ```
 
-prefix 20자 매칭은 현재 설계에서 완전히 제거되었습니다. 이 TODO는 삭제하면 됩니다.
+그 다음에 Level 1로 넘어가야 문서가 말하는 TDD 순서와 실제 실행 절차가 일치한다.
 
 ---
 
-### S-2. Level 1 테스트가 sidecar-aware 경로를 커버하지 않음을 미명시
+### W-3. "새로운 테스트 입력 파일이 필요 없다"는 표현은 과도하다
 
-**위치:** Section 8.4 Level 1 테스트 코드
+**문서 위치:**
 
-```python
-reconstructed = mdx_block_to_xhtml_element(block)
-```
+- Section 8.1
 
-`sidecar_entry` 없이 호출하므로 `inline_trailing_html` 재주입 경로와 callout macro 포맷 선택 경로가 Level 1에서 커버되지 않습니다. 이 경로들이 Level 3에서 검증되는 의도라면 Section 8.4에 명시하면 충분합니다:
+기존 testcase를 최대한 재사용하겠다는 방향은 좋다. 다만 현재 확인된 공백을 메우려면 최소한 다음 중 일부는 새 fixture 또는 기존 fixture 파생 샘플이 필요할 가능성이 높다.
 
-> "sidecar-aware 경로(list `inline_trailing_html`, callout macro format)는 Level 3에서 검증한다."
+- formatted paragraph + inline image 혼합 unit fixture
+- namespace-bearing XHTML fragment normalization fixture
+- duplicate-content Level 3 disambiguation fixture
+
+즉 "대부분 기존 fixture 재사용"은 맞지만, "새로운 테스트 입력 파일이 전혀 필요 없다"는 문장은 너무 강하다.
 
 ---
 
-## 평가 요약
+## Suggestion
 
-| 항목 | 평가 |
-|------|------|
-| 이전 지적사항 전체 반영 | ✅ |
-| 플랫 매핑 + children ref 구조 | ✅ |
-| `_process_paragraph()` 생성 측 완전 정의 | ✅ |
-| `zip()` 항목 추가 후처리 루프 | ✅ |
-| 5단계 테스트 구조 및 실행 흐름 | ✅ |
-| Level 0 `reconstruct_ul/li_entry()` + `_process_paragraph()` 테스트 | ✅ |
-| Section 5 Phase 1 ↔ Section 3.1 일치 | ✅ |
-| Section 6 위험 1 구 설계 기술 (W-1) | ⚠️ |
-| Section 6 위험 3 `list_items` 참조 (W-2) | ⚠️ |
-| `_process_paragraph()` Level 0 예상값 오류 (W-3) | ⚠️ |
-| Section 6 위험 1 TODO 잔존 (S-1) | 💡 |
-| Level 1 sidecar-aware 경로 미커버 미명시 (S-2) | 💡 |
+### S-1. Section 8을 "설계 검증 테스트"와 "회귀 방지 테스트"로 분리하면 더 명확하다
+
+현재 Section 8은 unit/integration/e2e를 모두 포함하지만, 성격이 다른 두 가지 테스트가 섞여 있다.
+
+- 설계 자체의 타당성을 입증하는 테스트
+- 구현 후 회귀를 막는 테스트
+
+다음을 분리하면 읽는 사람이 덜 헷갈린다.
+
+- Part A: 설계 검증 테스트
+  - ParagraphEditSequence
+  - list children ref
+  - callout recursive parsing
+- Part B: 회귀 방지 테스트
+  - reconstruction coverage
+  - lossless fragment compare
+  - reverse-sync-verify
 
 ---
 
-## 다음 단계
+### S-2. 승인 기준을 "Phase 진입 가능"과 "구현 완료 가능"으로 나누는 편이 낫다
+
+현재 문서는 구현 계획과 테스트 계획은 자세하지만, "지금 당장 Phase 1에 착수 가능한가"를 가르는 gate가 약하다.
 
-Warning 3건(W-1~3)은 구현 전에 수정하는 것이 좋습니다.
+다음처럼 나누면 더 실용적이다.
+
+- Phase 1 착수 전 필수 해소
+  - C-1 paragraph invariant
+  - C-2 Level 1/2 oracle
+  - C-3 normalization strategy
+- Phase 3 머지 전 필수 해소
+  - W-1 Level 3 duplicate hash
+  - W-2 실행 순서 정합성
+
+---
 
-- **W-1, W-2**: Section 6를 현재 설계 기준으로 재작성
-- **W-3**: `_process_paragraph()` Level 0 테스트 예상값과 주석을 `str(child)` + `convert_inline()` 기준으로 수정
+## TDD 관점 평가
+
+### 좋은 점
+
+- 실제 testcase 기반 설계 원칙을 명시한 점은 좋다
+- helper → block → document → byte-equal → E2E로 내려가는 계층적 테스트 구조는 적절하다
+- E2E `reverse-sync-verify`를 최종 회귀 게이트로 유지한 판단도 맞다
+
+### 부족한 점
+
+- 가장 위험한 설계 가정(paragraph 좌표계)이 unit red test로 먼저 고정되어 있지 않다
+- Level 1/2 expected source가 불명확해 테스트를 바로 쓸 수 없다
+- Level 3 block identity가 불안정해 "pass = correct"라는 신뢰를 주지 못한다
+
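+### Current verdict
+
+To the question **"Does the plan secure enough test cases to solve the problem?"** the answer is:
+
+**Not yet.**
+
+The number of test levels is sufficient, but at least three core premises are undecided.
+
+1. The coordinate system of the paragraph sidecar
+2. The oracle source for Level 1/2
+3. Block identity at Level 3
+
+Only once these three are resolved does the TDD plan become a system that actually guarantees the problem gets solved.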
+
+---
+
+## Recommended follow-up
+
+### 1. Fix the design document first
+
+The following items must be settled in the document first.
+
+- The reference coordinate system for paragraph `TextSegment.text`
+- The implementation strategy for `normalize_xhtml()`, or an alternative comparison strategy
+- Where Level 1/2 reads the original fragment from
+- How Level 3 disambiguates duplicate content
+
+### 2. Define the minimal red tests for entering TDD
+
+Before implementation, it is advisable to pin down the four cases below as failing tests.
+
+1. formatted paragraph + inline image
+2. nested list + `inline_trailing_html`
+3. callout + nested list + code block
+4. Level 3 fragment identity preserved on a page with duplicate MDX content
+
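+### 3. Redefine the approval criteria
+
+The current document is at the "direction is right" level.
+
+Implementation can be considered ready to start once the following are met.
+
+- C-1, C-2, C-3 resolved
+- Execution order redefined to include Level 0
+- Level 3 identity strategy settled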
+
+---
+
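+## Evaluation summary
+
+| Item | Assessment |
+|------|------|
+| Reconstruction-centered direction | ✅ Appropriate |
+| list flat mapping + `children` ref | ✅ Appropriate |
+| Direction of recursive callout parsing | ✅ Appropriate |
+| Completeness of paragraph design | ❌ Critical |
+| Test oracle definition for Level 1/2 | ❌ High |
+| XHTML normalization strategy | ❌ High |
+| Hash-based block identification at Level 3 | ⚠️ Warning |
+| Consistency of TDD execution order | ⚠️ Warning |
+| Claim that "existing fixtures alone are enough" | ⚠️ Warning |
+
+Final verdict:
+
+- **Design direction:** approvable
+- **Current completeness of the design document:** needs revision
+- **Plan for securing tests from a TDD perspective:** needs reinforcement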

From 63ca57269428301ecc7f48464cc4c735f9d88457 Mon Sep 17 00:00:00 2001
From: JK <jk@chequer.io>
Date: Fri, 13 Mar 2026 20:34:34 +0900
Subject: [PATCH 3/5] docs: rewrite reverse sync reconstruction design

---
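
Note on Section 3.1 / 5.3-B of the rewritten design: anchors (for example an
inline `ac:image`) are re-injected in normalized plain-text coordinates, by
mapping each `old_plain_offset` from `old_plain_text` to the plain text of the
newly emitted fragment. The following is only a minimal sketch of that offset
mapping. The function name `map_anchor_offsets` is hypothetical, and the use of
the standard-library `difflib.SequenceMatcher` is an assumption made for
illustration; the design does not prescribe a diff algorithm.

```python
from difflib import SequenceMatcher


def map_anchor_offsets(old_plain: str, new_plain: str, old_offsets: list[int]) -> list[int]:
    """Map anchor offsets from old plain-text coordinates to new ones.

    Hypothetical helper. An offset ``o`` means the anchor sits between
    ``old_plain[o - 1]`` and ``old_plain[o]``.
    """
    matcher = SequenceMatcher(None, old_plain, new_plain, autojunk=False)
    opcodes = matcher.get_opcodes()
    mapped = []
    for offset in old_offsets:
        # Default: the anchor trails all surviving text.
        new_offset = len(new_plain)
        for tag, i1, i2, j1, j2 in opcodes:
            if tag == "equal" and i1 <= offset <= i2:
                # Anchor touches unchanged text: shift by the same distance.
                new_offset = j1 + (offset - i1)
                break
            if tag in ("replace", "delete") and i1 <= offset < i2:
                # Anchor at the start of rewritten text stays in front of it;
                # an anchor strictly inside moves to the end of the rewrite.
                new_offset = j1 if offset == i1 else j2
                break
        mapped.append(new_offset)
    return mapped


if __name__ == "__main__":
    # Text is prepended before the paragraph; the anchor keeps its place
    # relative to the unchanged "A  B" run.
    print(map_anchor_offsets("A  B", "Hello A  B", [2]))
```

Placing an anchor that falls inside rewritten text at the end of the rewrite is
one possible reading of `"affinity": "after"`; the real policy would need to be
fixed by the red tests the review asks for.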
 ...3-13-reverse-sync-reconstruction-design.md | 1555 ++++++-----------
 1 file changed, 497 insertions(+), 1058 deletions(-)

diff --git a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
index a035108ff..fe5d8da7c 100644
--- a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
+++ b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
@@ -1,1270 +1,709 @@
 # Reverse Sync 전면 재구성 설계
 
 > 작성일: 2026-03-13
-> 연관 분석: `analysis-reverse-sync-refactoring.md`
+> Target PR: #913
+> Related documents:
+> - `docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md`
+> - `docs/analysis-reverse-sync-refactoring.md`
 
-> **설계 범위 원칙**
-> 이 문서의 모든 설계와 구현은 `tests/testcases/`에 실제로 존재하는 XHTML/MDX 사례에 기반한다.
-> 실제 사례로 확인되지 않은 가설적 케이스에 대한 예외 처리는 이 설계의 커버 범위 밖이다.
-> 새로운 케이스가 발견되면 testcase를 먼저 추가하고, 그 이후 설계·구현을 보완하는 사이클로 진행한다.
+## 1. Purpose of this document
 
----
+This document is a complete rewrite of the reverse-sync design for PR #913.
 
-## 1. 배경 및 동기
+It has three goals.
 
-### 1.1 현재 아키텍처의 문제
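+1. Reconstruct MDX changes as XHTML so Confluence documents are updated reliably.
+2. Replace the current chain of heuristic text patches with a structural reconstruction path.
+3. Build a test system that supports implementation, regression checking, and maintenance, centered on the `tests/testcases/` and `tests/reverse-sync/` assets that already exist in the repository.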
 
-Reverse Sync의 현재 접근 방식은 **"XHTML을 최대한 건드리지 않고 텍스트 차이만 이식"** 하는 전략이다. 이 전략은 Confluence 전용 요소(`<ac:image>`, `<ac:link>` 등)를 보호하기 위해 선택되었지만, 구현이 진행될수록 다음과 같은 기술부채가 쌓이고 있다.
+This document is not a "proposal of direction"; it settles, against the codebase, the design premises and test gates needed before implementation starts.
 
-- `patch_builder.py`의 전략 분기가 5가지(direct / containing / list / table / skip)로 늘어났고, 각 분기마다 예외 케이스가 추가되고 있다.
-- `_resolve_child_mapping()`이 4단계, `_resolve_mapping_for_change()`가 6단계 폴백 체인을 갖는다.
-- 버그를 수정할수록 새로운 edge case가 발견되는 패턴이 반복된다 (PR #852, #853, #866, #888, #903).
-- `text_transfer.py`의 문자 단위 위치 정렬은 두 좌표계(MDX ↔ XHTML) 사이의 매핑으로, 본질적으로 불안정하다.
+## 2. Current problems and redesign goals
 
-### 1.2 대안적 접근: MDX → XHTML 전체 재구성
+The current reverse-sync pipeline flows as follows.
 
-변경된 MDX 블록을 **XHTML로 직접 재구성**하고, 소실된 Confluence 요소만 원본에서 선택적으로 복원하면 위의 복잡도를 근본적으로 제거할 수 있다.
+`MDX diff -> mapping inference -> text/plain-based patch -> patched XHTML -> forward convert -> MDX re-verification`
 
-```
-[현재] MDX diff → 텍스트 변경 추출 → XHTML 내 텍스트 위치 매핑 → 문자 단위 치환
-[제안] MDX diff → 변경 블록 재구성 → lossy 요소 재주입 → XHTML 교체
-```
+The core bottleneck lies in `patch_builder.py` and `text_transfer.py`.
 
-`mdx_block_to_xhtml_element()`는 이미 `_build_insert_patch()`에서 사용되고 있다. 이를 **수정(modified) 블록에도 적용**하는 것이 이 설계의 핵심이다.
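+- Instead of rebuilding the changed block as XHTML, only the text inside the existing XHTML is transplanted.
+- The strategy branches (list, table, callout, containing block, direct replacement, and so on) keep growing.
+- Confluence-specific elements (`<ac:image>`, `<ac:link>`, `<ac:structured-macro>`, `ac:adf-extension`) live outside the text coordinate system, so an approach that moves only text is structurally unstable.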
 
----
+The goals of this redesign are clear.
 
-## 2. 재구성이 막히는 두 가지 근본 문제
+- Replace changed MDX blocks with a "re-emitted XHTML fragment" wherever possible.
+- Preserve only the Confluence-specific information the emitter cannot reproduce as sidecar metadata, then re-inject it.
+- Change the default strategy for modified blocks from `transfer_text_changes()` to `reconstruct_fragment()`.
 
-### 문제 A: 리스트 항목 내 lossy 요소의 위치를 모름
+In other words, the new default path is as follows.
 
-리스트 항목 안에 `<ac:image>`, `<ac:link>`, `<span style=...>` 같은 요소가 있어도 MDX에는 표현되지 않는다. 재구성 시 이 요소들을 어느 `<li>`의 어느 위치에 넣어야 할지 알 수 없다.
+`MDX diff -> changed block identify -> emit XHTML fragment -> restore preserved anchors/lost info -> replace top-level fragment -> forward verify`
 
-```xml
-<!-- 원본 XHTML -->
-<ul>
-  <li><p>item 1</p><ac:image><ri:attachment ri:filename="img.png"/></ac:image></li>
-  <li><p>item 2</p></li>
-</ul>
+## 3. Revisions required by the review
 
-<!-- MDX — ac:image 없음 -->
-* item 1
-* item 2
-```
+Of the issues raised in the review document, the four below must be settled before design work begins.
 
-재구성 후 `<ac:image>`를 어디에 넣어야 할지 알려주는 정보가 현재 sidecar에 없다.
+### 3.1 Paragraph coordinate system
 
-### 문제 B: Callout 내부 복합 구조를 재구성하지 못함
+The previous document effectively assumed `convert_inline()` was an "XHTML -> MDX inverse converter". That does not hold in the actual code.
 
-MDX에서 callout 내부는 단순한 텍스트 블록이지만, XHTML에서는 `<p>`, `<ul>`, `<ac:structured-macro ac:name="code">` 등이 중첩된 구조다. 현재 `_convert_callout_inner()`는 내부를 단일 paragraph로만 변환하므로, 내부에 리스트나 코드 블록이 있으면 구조가 무너진다.
+- `convert_inline()` is `mdx_to_storage.inline.convert_inline`
+- Its role is MDX inline -> XHTML inline conversion
+- Feeding it an XHTML fragment does not give back MDX
 
-```mdx
-<Callout type="info">
-  단락 텍스트
+The new design therefore follows these principles.
 
-  * 리스트 항목 1
-  * 리스트 항목 2
-</Callout>
-```
+- The reference coordinate system for paragraph/list-item anchor mapping is not the "MDX literal"
+- The reference coordinate system is "normalized plain text extracted from the XHTML DOM"
+- old/new comparison runs on `old_plain_text -> new_plain_text`, not on `old_mdx_text`
 
-→ 현재: `<p>단락 텍스트 * 리스트 항목 1 * 리스트 항목 2</p>` (틀림)
-→ 목표: `<p>단락 텍스트</p><ul><li><p>리스트 항목 1</p></li>...` (맞음)
-
----
-
-## 3. 해결 방안
-
-### 3.1 문제 A 해결: Sidecar 플랫 매핑 + ref 참조 구조
-
-#### 핵심 원칙
-
-`<ul>`/`<ol>`과 `<li>` 모두 **최상위 sidecar entry**로 나열하고, 부모-자식 관계는 `children` ref 목록으로 표현한다.
-
-- `ul`/`ol` entry → `children: [li xhtml_xpath 목록]`
-- `li` entry → `plain_text`, `inline_trailing_html`, `children: [block child xhtml_xpath 목록]`
-- `<li>` 내부의 block 요소 (`<ul>`, `<ol>`, `<ac:structured-macro>` 등)도 독립 entry — `li` entry의 `children`에서 참조
-
-이 구조는 `ul`-`li` 관계와 `li`-block child 관계를 동일한 패턴으로 처리하므로, 어떤 깊이의 nesting도 추가 설계 없이 커버된다. nested list의 이중 삽입 문제가 구조적으로 제거된다.
-
-#### Sidecar 스키마 변경
-
-```yaml
-# 기존
-mappings:
-  - xhtml_xpath: ul[1]
-    xhtml_type: list
-    mdx_blocks: [5]
-
-# 변경 후 — ul, li 모두 최상위 entry, 관계는 children ref 로 표현
-# xhtml_type 은 XHTML 태그명 그대로 사용 (ul, ol, li, p, ac:image 등)
-mappings:
-  - xhtml_xpath: ul[1]
-    xhtml_type: ul
-    mdx_blocks: [5]
-    children:
-      - ref: "ul[1]/li[1]"
-      - ref: "ul[1]/li[2]"
-
-  - xhtml_xpath: ul[1]/li[1]
-    xhtml_type: li
-    plain_text: "item 1 설명 텍스트"           # inline 매칭 키
-    inline_trailing_html: >-
-      <ac:image><ri:attachment
-      ri:filename="img.png"/></ac:image>        # <p> 직후 non-block lossy 요소만
-    children:
-      - ref: "ul[1]/li[1]/ul[1]"               # block child 참조
-
-  - xhtml_xpath: ul[1]/li[1]/ul[1]
-    xhtml_type: ul
-    children:
-      - ref: "ul[1]/li[1]/ul[1]/li[1]"
-
-  - xhtml_xpath: ul[1]/li[1]/ul[1]/li[1]
-    xhtml_type: li
-    plain_text: "sub-item A"
-    inline_trailing_html: ""
-    children: []
-
-  - xhtml_xpath: ul[1]/li[2]
-    xhtml_type: li
-    plain_text: "item 2 텍스트"
-    inline_trailing_html: ""
-    children: []
-```
+This decision removes the XHTML -> MDX inverse assumption.
 
-`inline_trailing_html`은 `<p>` 직후의 **non-block** 요소만 저장한다. `<ul>`, `<ol>`, `<ac:structured-macro>` 등 block 요소는 `children` ref로 분리하므로 저장 대상이 아니다.
-
-#### 생성: `_process_element()` 재귀 처리
-
-```python
-def _process_element(elem, xpath: str) -> list[SidecarEntry]:
-    """ul/ol/li 요소를 재귀 처리하여 독립 SidecarEntry 목록을 반환한다.
-
-    모든 요소는 최상위 entry 로 생성되고, 부모-자식 관계는 children ref 로 표현된다.
-    """
-    entries = []
-
-    if elem.name in ('ul', 'ol'):
-        children_refs = []
-        for li_idx, li in enumerate(elem.find_all('li', recursive=False), start=1):
-            li_xpath = f"{xpath}/li[{li_idx}]"
-            children_refs.append({'ref': li_xpath})
-            entries.extend(_process_element(li, li_xpath))
-
-        entries.insert(0, SidecarEntry(
-            xhtml_xpath=xpath,
-            xhtml_type=elem.name,   # 'ul' 또는 'ol' — XHTML 태그명 그대로
-            children=children_refs,
-        ))
-
-    elif elem.name == 'li':
-        # Confluence storage format에서 <li> 내부는 항상 <p>로 래핑됨.
-        # <p> 없는 <li>는 실제로 존재하지 않는 케이스이므로 별도 처리 불필요.
-        p_elem = elem.find('p')
-        plain_text = p_elem.get_text(separator=' ', strip=True) if p_elem else ''
-
-        # inline trailing: <p> 직후 non-block 형제 요소
-        inline_trailing = []
-        block_children_refs = []
-        block_counters = {}
-
-        if p_elem:
-            for sib in p_elem.next_siblings:
-                if not hasattr(sib, 'name'):
-                    continue
-                if sib.name in ('ul', 'ol') or _is_block_macro(sib):
-                    tag = sib.name
-                    block_counters[tag] = block_counters.get(tag, 0) + 1
-                    child_xpath = f"{xpath}/{tag}[{block_counters[tag]}]"
-                    block_children_refs.append({'ref': child_xpath})
-                    entries.extend(_process_element(sib, child_xpath))
-                else:
-                    inline_trailing.append(str(sib))
-
-        entries.insert(0, SidecarEntry(
-            xhtml_xpath=xpath,
-            xhtml_type='li',        # XHTML 태그명 그대로
-            plain_text=plain_text,
-            inline_trailing_html=''.join(inline_trailing),
-            children=block_children_refs,
-        ))
-
-    return entries
-```
+### 3.2 Test oracle
 
-#### 소비: `reconstruct_ul_entry()` / `reconstruct_li_entry()` — `_ListNode` tree 위치 기반 매칭
-
-MDX 파서(`parse_mdx_blocks()`)는 nested list 전체를 하나의 `list` 블록으로 반환하고, 구조화는 하지 않는다. `emitter.py`에 이미 존재하는 `_parse_list_items()` + `_build_list_tree()`를 재사용하여 `_ListNode` tree를 생성하고, sidecar `children` refs와 **위치 기반(zip)** 으로 매칭한다.
-
-텍스트 기반 큐(`pop_inline_item()`) 없이 위치 기반으로만 동작하므로, 동일 텍스트 항목이 여럿 있어도 충돌이 없다.
-
-```python
-def reconstruct_ul_entry(
-    entry: SidecarEntry,
-    sidecar_index: dict,
-    mdx_nodes: list[_ListNode],    # 이 ul/ol 레벨의 MDX 항목들 (_ListNode)
-) -> str:
-    """ul/ol entry를 재구성한다.
-
-    sidecar children refs와 mdx_nodes를 위치(zip)로 매칭한다.
-    emitter._parse_list_items() + _build_list_tree()로 생성한 _ListNode tree를 인자로 받는다.
-
-    항목 수 불일치 처리:
-      - MDX 항목이 더 많음(추가): zip 이후 남은 mdx_nodes를 sidecar 없이 재구성
-      - MDX 항목이 더 적음(삭제): zip이 짧은 쪽에서 멈추므로 삭제 항목은 자동 생략
-    """
-    tag = entry.xhtml_type  # 'ul' or 'ol' — XHTML 태그명 그대로
-    sidecar_refs = entry.children or []
-    parts = []
-
-    # sidecar가 있는 항목: ref + node 위치 매칭
-    for ref_dict, node in zip(sidecar_refs, mdx_nodes):
-        li_entry = sidecar_index.get(ref_dict['ref'])
-        if li_entry:
-            parts.append(reconstruct_li_entry(li_entry, sidecar_index, node))
-
-    # sidecar보다 MDX 항목이 많으면 — 새로 추가된 항목, sidecar 없이 재구성
-    for node in mdx_nodes[len(sidecar_refs):]:
-        parts.append(f'<li><p>{convert_inline(node.text)}</p></li>')
-
-    return f'<{tag}>{"".join(parts)}</{tag}>'
-
-
-def reconstruct_li_entry(
-    entry: SidecarEntry,
-    sidecar_index: dict,
-    node: _ListNode,               # 이 li에 대응하는 MDX _ListNode
-) -> str:
-    """li entry를 재구성한다.
-
-    node.text — li 본문 MDX 텍스트
-    node.children — 이 li의 nested list 항목들 (nested ul/ol 재구성에 전달)
-    """
-    li_inner = f'<p>{convert_inline(node.text)}</p>'
-    li_inner += entry.inline_trailing_html or ''
-
-    for ref_dict in (entry.children or []):
-        child = sidecar_index.get(ref_dict['ref'])
-        if child is None:
-            continue
-        if child.xhtml_type in ('ul', 'ol'):
-            # node.children = 이 li의 nested list 항목들
-            li_inner += reconstruct_ul_entry(child, sidecar_index, node.children)
-        else:
-            # block macro 등 기타 block children — 원본 xhtml_fragment 그대로
-            li_inner += child.xhtml_fragment or ''
-    return f'<li>{li_inner}</li>'
-
-
-# 진입점 (patch_builder.py에서 list 블록 처리 시)
-# emitter.py의 기존 함수 재사용
-items = _parse_list_items(new_block.content)   # bin/mdx_to_storage/emitter.py
-roots = _build_list_tree(items)                # bin/mdx_to_storage/emitter.py
-xhtml = reconstruct_ul_entry(sidecar_entry, sidecar_index, roots)
-```
-```
+`mapping.yaml` is for runtime lookup, not for use as a fragment oracle. The repository's actual `load_sidecar_mapping()` does not read fragment bodies either.
 
-#### 처리 가능한 케이스 분류
+The new design uses its oracles in the following order.
 
-| 케이스 | 처리 방법 |
-|--------|-----------|
-| 항목 내용 변경 + lossy 없음 | `li` entry 재구성 (clean) |
-| 항목 내용 변경 + inline lossy 있음 | 재구성 + `inline_trailing_html` 재주입 |
-| 항목 내 nested list 있음 | `li` entry의 `children` → `ul`/`ol` entry 재구성 (이중 삽입 구조적 불가) |
-| 항목 추가 (MDX > sidecar) | `zip()` 이후 남은 `mdx_nodes`를 sidecar 없이 재구성 — `inline_trailing_html` 없음 |
-| 항목 삭제 (MDX < sidecar) | `zip()`이 짧은 쪽에서 멈추므로 삭제 항목 자동 생략 |
-| 깊이 무관한 nesting | `reconstruct_entry()` 재귀로 동일하게 처리 |
+1. `expected.roundtrip.json`
+   - Present in all 21 `tests/testcases/*` cases
+   - Provides the top-level `xhtml_fragment` as an exact oracle
+2. `page.xhtml`
+   - Used to compare nested fragments and sub-xpaths that are not in the sidecar
+3. `expected.reverse-sync.patched.xhtml`
+   - Golden page oracle for the 16 change scenarios
 
----
+In other words, unit/integration tests do not depend on `mapping.yaml`.
 
-> **TODO — 구현 전 조사 필요**
->
-> 1. **[Phase 1 선결] 현재 `generate_sidecar_mapping()`의 `li` 처리 여부**: `<li>`에 대해 독립 entry를 이미 생성하는지, 아니면 `<ul>`/`<ol>` entry 안에 포함하는지 확인. `_process_element()` 도입 시 기존 entry 생성 로직과의 충돌 범위 파악 필요.
->
-> 2. **[Phase 1 선결] `xhtml_type` 태그명 일치 확인**: `xhtml_type`은 XHTML 태그명 그대로 사용한다 (`ul`, `ol`, `li`, `p`, `ac:image` 등). 기존 sidecar에 추상 타입(`list`, `list_item`)이 저장된 경우 마이그레이션 또는 역직렬화 시 변환 필요 여부 확인.
->
-> 3. **[Phase 1 선결] `xhtml_xpath` 포맷**: nested element의 xpath `ul[1]/li[1]/ul[1]` 형식이 현재 `mapping_recorder.py`의 xpath 생성 방식과 일치하는지 확인 필요.
->
-> 4. **[Phase 1 선결] `_parse_list_items()` / `_build_list_tree()` 접근성 확인**: `emitter.py`의 두 함수가 모듈 외부에서 import 가능한지 확인. private 함수(`_`prefix)이므로 `reverse_sync` 패키지에서 호출 가능한 형태로 노출 필요 여부 검토.
->
-> 5. **[Phase 1 선결] `_is_block_macro()` 판별 기준**: `<ac:structured-macro>`가 block child로 분류되어야 할 케이스와 inline trailing으로 남아야 할 케이스 구분 기준 확인 필요.
->
-> 6. **[조사] 다단락 `<li>` 존재 여부**: `<li><p>para1</p><p>para2</p></li>` 형태가 실제 Confluence XHTML에 존재하는지 testcase 전수 조사. 추측으로는 존재하지 않으며, 단일 `<p>` 내에서 `<br/>` 줄바꿈만 사용하는 것으로 보임. 존재하지 않음이 확인되면 단순 케이스로 간주하고 별도 처리 불필요. 존재하면 두 번째 `<p>`를 `inline_trailing_html`에 보존하는 방향 검토.
+### 3.3 XHTML normalization
 
----
+The new comparison strategy does not introduce `lxml`.
 
-### 3.1.1 Paragraph 내 inline-block 요소 처리 — ParagraphEditSequence 기반
+The repository already has the following assets.
 
-`<p>` 내부에 텍스트와 lossy inline-block 요소(`<ac:image>` 등)가 혼재하는 경우, list-item과 동일한 원칙을 적용한다: lossy 요소는 **독립 sidecar entry**로 분리하고, `paragraph` entry의 `children`에서 ref로 참조한다.
+- `bin/reverse_sync/mdx_to_storage_xhtml_verify.py`
+- `xhtml_beautify_diff.py`
+- BeautifulSoup-based attribute stripping / layout stripping / macro stripping
 
-텍스트 편집이 발생한 경우, old MDX → new MDX의 **ParagraphEditSequence**(Myers diff 기반)를 구하여 각 `AnchorSegment`의 삽입 위치를 old 좌표에서 new 좌표로 정확히 매핑한다. fuzzy 매칭은 사용하지 않는다. 설계가 커버하지 못하는 케이스는 명시적으로 실패시키고, 테스트케이스를 보강하는 사이클로 해결한다.
+The new design promotes this path to a shared normalizer.
 
-#### 용어 정의
+- New shared module: `reverse_sync/xhtml_normalizer.py`
+- Implementation basis: BeautifulSoup + reuse of the existing ignored-attribute rules
+- Comparison unit: supports both whole pages and fragments
 
-```
-ParagraphEditSequence = list[InlineSegment]
+This secures testability without new dependencies.
 
-InlineSegment
-├── TextSegment(text: str)      # diff 대상 텍스트 조각
-└── AnchorSegment(ref: str)     # ac:image 등 lossy inline-block — 위치 고정 앵커
-```
+### 3.4 block identity
 
-`<p>` 하나는 하나의 `ParagraphEditSequence`로 표현된다. `TextSegment`와 `AnchorSegment`가 교차하는 시퀀스이며, edit(Myers diff)는 `TextSegment`에만 적용된다. `AnchorSegment`는 인접 `TextSegment`의 위치 변화를 따라 new 좌표로 매핑된다.
+Matching on `mdx_content_hash` alone is not sufficient.
 
-#### Sidecar 스키마
+Duplicate content already exists in the real data. In particular, under `reverse_sync.mdx_block_parser`, there are already cases where identical blocks such as `</Callout>` are captured multiple times.
 
-`<p>` 내부를 `TextSegment`(`kind: text`)와 `AnchorSegment`(`kind: ref`) 교차 시퀀스로 표현한다. `TextSegment.text`는 MDX 텍스트 조각(`convert_inline()` 출력)이며, `AnchorSegment`의 위치를 정의하는 좌표 기준이 된다. TextSegments를 이어 붙이면 해당 단락의 MDX 텍스트(`old_mdx_text`)와 정확히 일치한다.
+The new design's block identity uses the following together.
 
-```yaml
-- xhtml_xpath: p[1]
-  xhtml_type: p
-  children:
-    - kind: text
-      text: "텍스트A "            # TextSegment — MDX 텍스트 조각 (convert_inline 출력)
-    - kind: ref
-      ref: "p[1]/ac:image[1]"    # AnchorSegment — 이 TextSegment 직후 위치
-    - kind: text
-      text: " 텍스트B"            # TextSegment — MDX 텍스트 조각
-    - kind: ref
-      ref: "p[1]/ac:image[2]"    # AnchorSegment
+- `block_index`
+- `mdx_line_range`
+- `mdx_content_hash`
+- Relative order within the set of same-hash candidates, when needed
 
-- xhtml_xpath: p[1]/ac:image[1]
-  xhtml_type: ac:image
-  html: "<ac:image><ri:attachment ri:filename='img1.png'/></ac:image>"
+That is, the lookup key is not "a single hash" but "hash + line range + order".
 
-- xhtml_xpath: p[1]/ac:image[2]
-  xhtml_type: ac:image
-  html: "<ac:image><ri:attachment ri:filename='img2.png'/></ac:image>"
-```
+## 4. Analysis of the current codebase and assets
 
-신규 sidecar는 `ac:image` 존재 여부와 무관하게 **항상** `children`을 구성한다. `ac:image`가 없는 경우 `children = [{'kind': 'text', 'text': '전체텍스트'}]` 형태가 된다. `children`이 없는 entry는 구 sidecar에 대한 backward compat 폴백으로만 사용된다.
-
-#### 생성: `_process_paragraph()` — 항상 교차 시퀀스로 구성
-
-`TextSegment.text`는 XHTML plain text가 아닌 **MDX 텍스트 조각**을 저장한다. XHTML 조각을 누적한 뒤 `convert_inline()`을 적용하면, TextSegments를 이어 붙인 결과가 해당 단락의 MDX 텍스트와 정확히 일치한다. 이로써 `reconstruct_paragraph()`에서 별도 normalization 없이 `old_text == old_mdx_text`가 항상 성립한다.
-
-```python
-def _process_paragraph(elem, xpath: str) -> list[SidecarEntry]:
-    """<p> 요소를 처리하여 ParagraphEditSequence 구조의 sidecar entry를 생성한다.
-
-    ac:image 존재 여부와 무관하게 항상 TextSegment/AnchorSegment 교차 시퀀스로 children을 구성한다.
-      - ac:image 없음: children = [{'kind': 'text', 'text': '전체MDX텍스트'}]
-      - ac:image 있음: [text, ref, text, ref, ..., text] 교차 시퀀스
-
-    TextSegment.text는 MDX 텍스트 조각이다 — XHTML 조각(strong, em, code 등)을
-    convert_inline()으로 변환한 결과. TextSegments 연결 = 해당 단락의 MDX 텍스트.
-    """
-    entries = []
-    children = []
-    image_counters = {}
-    cursor_xhtml = ''  # <p> 내 텍스트/인라인 요소의 XHTML 조각 누적
-
-    for child in elem.children:
-        if hasattr(child, 'name') and child.name == 'ac:image':
-            # 직전 TextSegment flush — 누적 XHTML을 MDX 텍스트로 변환 (빈 문자열도 포함)
-            children.append({'kind': 'text', 'text': convert_inline(cursor_xhtml)})
-            cursor_xhtml = ''
-            # AnchorSegment
-            image_counters['ac:image'] = image_counters.get('ac:image', 0) + 1
-            img_xpath = f"{xpath}/ac:image[{image_counters['ac:image']}]"
-            children.append({'kind': 'ref', 'ref': img_xpath})
-            # ac:image 독립 entry
-            entries.append(SidecarEntry(
-                xhtml_xpath=img_xpath,
-                xhtml_type='ac:image',
-                html=str(child),
-            ))
-        else:
-            # 텍스트 노드(NavigableString) 또는 인라인 요소(strong, em, code 등) — XHTML 조각 누적
-            cursor_xhtml += str(child)
-
-    # 마지막 TextSegment flush (항상 추가 — ac:image 없어도 전체 텍스트가 여기에 들어감)
-    children.append({'kind': 'text', 'text': convert_inline(cursor_xhtml)})
-
-    entries.insert(0, SidecarEntry(
-        xhtml_xpath=xpath,
-        xhtml_type='p',
-        children=children,
-    ))
-    return entries
-```
+### 4.1 Axes to reuse from the codebase
 
-#### ParagraphEditSequence를 이용한 AnchorSegment 위치 매핑
+The existing implementations this design will use as-is are the following.
 
-```
-old_seq = ParagraphEditSequence from sidecar:
-  [TextSegment("텍스트A "), AnchorSegment(ref[1]), TextSegment(" 텍스트B"), AnchorSegment(ref[2])]
-
-old_text = TextSegment 연결: "텍스트A  텍스트B"
-new_text = 새 MDX: "수정된텍스트A 수정된텍스트B"
-
-AnchorSegment 위치 (old 좌표):
-  ref[1]: old_pos = len("텍스트A ")  = 5   (TextSegment[0] 끝)
-  ref[2]: old_pos = len("텍스트A  텍스트B") = 11  (TextSegment[1] 끝)
-
-Myers diff edit ops (TextSegment 연결 텍스트 기준):
-  DELETE  "텍스트A"   (old 0..4)
-  INSERT  "수정된텍스트A"
-  RETAIN  " "         (old 5)    ← ref[1] old_pos=5 → new_pos 여기서 결정
-  DELETE  " 텍스트B"  (old 6..11)
-  INSERT  " 수정된텍스트B"
-                                 ← ref[2] old_pos=11 → new_pos 여기서 결정
-
-매핑 결과:
-  ref[1]: new_pos = 8   ("수정된텍스트A " 이후)
-  ref[2]: new_pos = 18  (" 수정된텍스트B" 이후)
-
-재구성:
-  new_text[:8] + image[1].html + new_text[8:18] + image[2].html + new_text[18:]
-  = "수정된텍스트A " + <ac:image[1]> + " 수정된텍스트B" + <ac:image[2]>
-```
+- `reverse_sync_cli.py`
+  - verify / push orchestration
+  - strict roundtrip verification after forward convert
+- `reverse_sync.sidecar`
+  - `RoundtripSidecar`, `SidecarBlock`, `expected.roundtrip.json`
+- `reverse_sync.mapping_recorder`
+  - extraction of XHTML top-level / callout child mappings
+- `mdx_to_storage.parser`, `mdx_to_storage.emitter`
+  - MDX structure parsing and XHTML emission
+  - recursive emission of callout children is possible
+  - has functions for building nested list trees
+- `reverse_sync.lost_info_patcher`
+  - restoration logic for links, emoticons, filenames, images, and ADF extensions
 
-#### 구현
-
-```python
-def map_anchor_positions(
-    old_text: str,
-    new_text: str,
-    old_positions: list[int],   # unit: Python str index (Unicode code point)
-) -> list[int]:                 # unit: Python str index (Unicode code point)
-    """Myers diff edit sequence로 AnchorSegment의 old 좌표를 new 좌표로 매핑한다.
-
-    좌표 단위: Python str index (Unicode code point).
-    한국어/영어 텍스트에서 code point = grapheme cluster이므로 실용상 문제 없음.
-    이모지 결합 문자(ZWJ sequence) 등 edge case가 발생하면 그때 testcase 추가 후 대응한다.
-
-    edit op:
-      ('retain', n) — old n code points 유지 → old_ptr += n, new_ptr += n
-      ('delete', n) — old n code points 삭제 → old_ptr += n
-      ('insert', s) — new s 삽입 → new_ptr += len(s)  (len = code point 수)
-
-    AnchorSegment position은 TextSegment 끝(직후)에 위치하므로:
-      old_ptr 이 old_pos 에 도달하는 시점의 new_ptr 를 기록한다.
-    """
-    ops = myers_diff(old_text, new_text)  # → list of (op, value)
-    old_ptr = 0
-    new_ptr = 0
-    pos_iter = iter(sorted(old_positions))
-    next_pos = next(pos_iter, None)
-    new_positions = []
-
-    for op, value in ops:
-        if next_pos is None:
-            break
-        if op == 'retain':
-            n = value
-            while next_pos is not None and old_ptr + n >= next_pos:
-                offset = next_pos - old_ptr
-                new_positions.append(new_ptr + offset)
-                next_pos = next(pos_iter, None)
-            old_ptr += n
-            new_ptr += n
-        elif op == 'delete':
-            n = value
-            while next_pos is not None and old_ptr + n >= next_pos:
-                # 삭제 구간 안에 AnchorSegment → delete 직후 new_ptr 로 매핑
-                new_positions.append(new_ptr)
-                next_pos = next(pos_iter, None)
-            old_ptr += n
-        elif op == 'insert':
-            new_ptr += len(value)
-
-    # 남은 position은 old_text 끝 이후 → new_text 끝으로 매핑
-    while next_pos is not None:
-        new_positions.append(len(new_text))
-        next_pos = next(pos_iter, None)
-
-    return new_positions
-
-
-def reconstruct_paragraph(
-    old_mdx_text: str,
-    new_mdx_text: str,
-    entry: SidecarEntry,
-    sidecar_index: dict,
-) -> str:
-    """paragraph를 ParagraphEditSequence 기반으로 재구성한다.
-
-    children이 없으면 새 MDX 텍스트를 그대로 변환한다 (구 sidecar backward compat 폴백).
-    children이 있으면:
-      - TextSegment 연결 = 해당 단락의 MDX 텍스트 → old_mdx_text와 직접 비교 (normalization 불필요)
-      - AnchorSegment가 없는 경우(ac:image 없음): convert_inline 적용
-      - AnchorSegment가 있는 경우: map_anchor_positions()로 위치 매핑 후 new_text에 삽입
-    old_mdx_text가 TextSegment 연결과 일치하지 않으면 SidecarMismatchError를 발생시킨다.
-    """
-    if not entry.children:
-        return f'<p>{convert_inline(new_mdx_text)}</p>'
-
-    # ParagraphEditSequence 복원
-    text_segments = [c['text'] for c in entry.children if c['kind'] == 'text']  # TextSegment
-    anchor_refs = [c['ref'] for c in entry.children if c['kind'] == 'ref']      # AnchorSegment
-    old_text = ''.join(text_segments)
-
-    if old_text != old_mdx_text:
-        raise SidecarMismatchError(
-            f"paragraph TextSegment mismatch:\n"
-            f"  sidecar: {old_text!r}\n"
-            f"  actual:  {old_mdx_text!r}\n"
-            f"  → sidecar 재생성 필요 (testcase fixture 업데이트)"
-        )
-
-    # AnchorSegment old 좌표 계산 (TextSegment 누적 길이)
-    old_positions = []
-    cursor = 0
-    seg_iter = iter(text_segments)
-    for child in entry.children:
-        if child['kind'] == 'text':
-            cursor += len(next(seg_iter))
-        elif child['kind'] == 'ref':
-            old_positions.append(cursor)
-
-    # Myers diff로 AnchorSegment new 좌표 매핑
-    new_positions = map_anchor_positions(old_text, new_mdx_text, old_positions)
-
-    # new_mdx_text를 MDX 좌표로 먼저 분할한 뒤, 각 조각에 convert_inline() 적용
-    result = ''
-    prev = 0
-    for ref, new_pos in zip(anchor_refs, new_positions):
-        mdx_piece = new_mdx_text[prev:new_pos]          # MDX 좌표로 MDX 텍스트 분할
-        result += convert_inline(mdx_piece)              # 분할 후 XHTML 변환
-        ref_entry = sidecar_index.get(ref)
-        if ref_entry is None:
-            raise SidecarMismatchError(f"AnchorSegment ref not found in sidecar_index: {ref!r}")
-        result += ref_entry.html
-        prev = new_pos
-    result += convert_inline(new_mdx_text[prev:])       # 마지막 조각
-
-    return f'<p>{result}</p>'
-```
+Conversely, the following must no longer be central axes in this redesign.
 
-#### 명시적 실패 케이스
+- Patching modified blocks via `transfer_text_changes()`
+- Using `mapping.yaml` as if it were a fragment oracle
+- Inferring a block's internal structure from content text alone
 
-| 케이스 | 동작 |
-|--------|------|
-| `old_text != TextSegment 연결` | `SidecarMismatchError` — sidecar 재생성 후 testcase 업데이트 |
-| `AnchorSegment ref`가 `sidecar_index`에 없음 | `SidecarMismatchError` — sidecar 구조 오류 |
-| 삭제 구간 안에 `AnchorSegment` | `new_ptr`(삭제 직후) 로 매핑 (정의된 동작) |
-| children이 없는 paragraph | 기존 경로 — `convert_inline(new_mdx_text)` 그대로 사용 |
+### 4.2 Current state of test assets
 
-실패가 발생하면:
-1. 실패 케이스를 재현하는 testcase fixture를 추가한다
-2. `SidecarMismatchError`면 sidecar 생성 로직을 수정하고 fixture를 재생성한다
-3. 정의되지 않은 구조적 케이스면 설계를 보완하고 다시 사이클을 돈다
+The test assets currently available are strong enough to validate the design.
 
----
+#### `tests/testcases/`
 
-> **TODO — 구현 전 조사 필요**
->
-> 1. **[확정] `myers_diff()` 좌표 단위**: Python str index (Unicode code point) 단위로 확정. 함수 시그니처 주석에 명시됨. 이모지 결합 문자 등 edge case는 발생 시 testcase 추가 대응.
->
-> 2. **[Phase 1 선결] `old_mdx_text` 추출 방법**: `patch_builder.py`에서 paragraph 블록의 old MDX 텍스트를 어떻게 가져오는지 확인. `block_diff.py`의 `Change.old_block.content`가 이 역할을 하는지 확인.
->
-> 3. **[Phase 1 선결] `<ac:image>`의 위치 유형 구분**: `<p>` 내부 inline vs `<p>` 외부 독립 block 케이스를 `mapping_recorder.py`가 어떻게 구분하는지 확인. 독립 block이면 이 설계 대상 밖.
->
-> 4. **[확정] `convert_inline()` 적용 시점**: MDX 좌표로 MDX 텍스트를 먼저 분할한 뒤 각 조각에 `convert_inline()` 적용 — 좌표계 불일치 구조적 해소. `reconstruct_paragraph()` 구현에 반영됨.
+- 21 cases in total
+- `page.xhtml`: 21
+- `expected.mdx`: 21
+- `expected.roundtrip.json`: 21
+- `original.mdx` + `improved.mdx` + `expected.reverse-sync.*`: 16
+- `attachments.v1.yaml`: 19
+- `page.v1.yaml`: 19
+- `page.v2.yaml`, `children.v2.yaml`: 19 each
+- `page.adf`: 18
 
----
+Structural coverage:
 
-### 3.2 문제 B 해결: Callout 내부 재귀 파싱
+- list: 20
+- table: 9
+- image: 13
+- callout macro/ADF panel: 12
+- `ac:adf-extension`: 3
+- links: 12
+- code macro: 4
 
-#### 아이디어
+Representative cases:
 
-`_convert_callout_inner()`에서 내부 텍스트를 paragraph로 변환하는 대신, `parse_mdx_blocks()`로 내부를 블록 시퀀스로 파싱한 뒤 `mdx_block_to_xhtml_element()`를 재귀 적용한다.
+- list item + image: `544113141`, `544145591`, `692355151`, `880181257`, `883654669`
+- callout + nested list: `1454342158`, `544145591`, `692355151`, `880181257`, `883654669`
+- callout + code macro: `544112828`
+- ADF panel: `1454342158`, `544379140`, `panels`
 
-#### 구현
+#### `tests/reverse-sync/`
 
-```python
-def _convert_callout_inner_full(text: str) -> str:
-    """callout 내부를 재귀적으로 블록 파싱하여 XHTML을 재구성한다."""
-    from mdx_to_storage.parser import parse_mdx_blocks
+- 42 real reverse-sync regression cases in total
+- Per `pages.yaml`: `pass` 28, `fail` 14, `catalog_only` 24
 
-    # <Callout> 래퍼 제거 후 들여쓰기 보정
-    inner = _strip_callout_wrapper(text)
-    inner = _dedent_callout_body(inner)
+Structural coverage:
 
-    # 내부를 MDX 블록으로 파싱 (이미 존재하는 파서 재사용)
-    inner_blocks = [b for b in parse_mdx_blocks(inner)
-                    if b.type not in ('frontmatter', 'import_statement', 'empty')]
+- list: 42
+- image: 38
+- callout: 28
+- table: 10
+- links: 19
+- code macro: 7
+
+Particularly important real cases:
+
+- inline image inside a paragraph/list item: `544376004`
+- callout + code: `544112828`
+- pages mixing many images/links/callouts: `544145591`, `1454342158`
+
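+#### Conclusion
+
+The new design is not in a state where "fixtures are lacking, so the design has to be abstract". It is the opposite.
+
+- unchanged fragment oracle: already sufficient
+- changed-page golden oracle: 16 exist
+- failure reproduction corpus: 42 exist
+
+What is lacking is not the quantity of fixtures but the work of rearranging these assets across the design validation stages.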
+
+## 5. Proposed architecture
+
+### 5.1 Top-level principles
+
+1. For a modified block, whole-fragment replacement is the default
+2. Preserved information is handled as a "raw XHTML preservation unit", not as "text"
+3. Anchor re-injection happens in normalized plain-text coordinates, not MDX coordinates
+4. list / callout / details / ADF panel are reconstructed based on child order
+5. Structures outside the supported scope are not fuzzy-patched; they fail explicitly
+
+### 5.2 Sidecar strategy
+
+The existing `RoundtripSidecar` is promoted to the primary runtime artifact.
+
+- `mapping.yaml`
+  - Reduced role: top-level routing and human-readable debugging
+- `expected.roundtrip.json`
+  - Expanded role: exact fragment oracle + reconstruction metadata
+
+The new schema is defined as `RoundtripSidecar schema_version = 3`.
+
+Key changes:
+
+- Reconstruction metadata added to each `SidecarBlock`
+- Preserved anchors/units needed to reconstruct a modified block are stored per block
+
+Example:
+
+```json
+{
+  "block_index": 12,
+  "xhtml_xpath": "p[3]",
+  "xhtml_fragment": "<p>A <ac:image ... /> B</p>",
+  "mdx_content_hash": "...",
+  "mdx_line_range": [40, 40],
+  "lost_info": {},
+  "reconstruction": {
+    "kind": "paragraph",
+    "old_plain_text": "A  B",
+    "anchors": [
+      {
+        "anchor_id": "p[3]/ac:image[1]",
+        "raw_xhtml": "<ac:image ... />",
+        "old_plain_offset": 2,
+        "affinity": "after"
+      }
+    ]
+  }
+}
+```
 
-    if not inner_blocks:
-        return ''
+List example:
 
-    # 각 블록을 재귀 재구성
-    parts = [mdx_block_to_xhtml_element(b) for b in inner_blocks]
-    return ''.join(parts)
+```json
+{
+  "block_index": 8,
+  "xhtml_xpath": "ul[1]",
+  "xhtml_fragment": "<ul>...</ul>",
+  "reconstruction": {
+    "kind": "list",
+    "ordered": false,
+    "items": [
+      {
+        "item_xpath": "ul[1]/li[1]",
+        "old_plain_text": "item 1",
+        "anchors": [],
+        "child_blocks": []
+      },
+      {
+        "item_xpath": "ul[1]/li[2]",
+        "old_plain_text": "item 2",
+        "anchors": [
+          {
+            "anchor_id": "ul[1]/li[2]/ac:image[1]",
+            "raw_xhtml": "<ac:image ... />",
+            "old_plain_offset": 6,
+            "affinity": "after"
+          }
+        ],
+        "child_blocks": [
+          {
+            "kind": "list",
+            "xpath": "ul[1]/li[2]/ol[1]"
+          }
+        ]
+      }
+    ]
+  }
+}
+```
 
+The intent of this structure is simple.
 
-def _dedent_callout_body(text: str) -> str:
-    """callout 내부 들여쓰기(공통 선행 공백)를 제거한다."""
-    lines = text.splitlines()
-    non_empty = [l for l in lines if l.strip()]
-    if not non_empty:
-        return text
-    indent = min(len(l) - len(l.lstrip()) for l in non_empty)
-    return '\n'.join(l[indent:] for l in lines)
-```
+- `xhtml_fragment` is responsible for the top-level fragment
+- `reconstruction` is responsible for preserved information inside list/paragraph/container blocks
+- The test oracle and the runtime metadata live in the same artifact
 
-#### Callout 외부 wrapper: sidecar xhtml_type 활용
+### 5.3 Block classification
 
-callout을 재구성할 때 원래 macro 포맷(`ac:structured-macro` vs `ac:adf-extension`)을 유지해야 한다. `sidecar_entry.xhtml_type`이 이미 이 정보를 갖고 있다.
+The new reconstructor divides top-level blocks into four kinds.
 
-```python
-def mdx_callout_to_xhtml(block, sidecar_entry) -> str:
-    """MDX callout 블록을 XHTML macro로 재구성한다."""
-    callout_type = _extract_callout_type(block.content)   # "info" | "warning" | ...
-    inner_xhtml = _convert_callout_inner_full(block.content)
+#### A. Clean block
 
-    if sidecar_entry and sidecar_entry.xhtml_type == 'adf_extension':
-        return _wrap_adf_callout(inner_xhtml, callout_type)
-    else:
-        return _wrap_structured_macro_callout(inner_xhtml, callout_type)
-```
+Targets:
 
-#### 재귀 파싱의 안전성
+- heading
+- code macro
+- table
+- hr
+- paragraph without preserved anchors
 
-`parse_mdx_blocks()`는 현재 최상위 MDX 문서를 대상으로 작성되어 있다. callout 내부에 적용할 때 주의할 점:
+Handling:
 
-- frontmatter, import_statement는 callout 내부에 없으므로 무시해도 됨 (이미 필터 적용)
-- callout 내부에 다시 `<Callout>`이 있는 중첩 케이스: `mdx_block_to_xhtml_element()`의 callout 분기가 재귀 호출됨 → 자연스럽게 처리됨
-- callout 내부의 `<figure>`, `<details>`, `<Badge>` 등 HTML 블록: `html_block` 타입으로 파싱되어 그대로 통과됨 → 문제 없음
+- emit a new fragment with `mdx_to_storage.emit_block()` or `mdx_block_to_xhtml_element()`
+- apply block-level `lost_info`
+- replace the existing fragment as a whole
 
----
+#### B. Inline-anchor block
 
-## 4. 새로운 Reverse Sync 흐름
+Targets:
 
-두 가지 문제가 해결되면 `build_patches()`의 로직이 다음과 같이 단순해진다.
+- preservation units such as `ac:image` and `ac:link` inside a paragraph
+- an inline image or trailing preserved node inside a list item
 
-### 4.1 변경된 블록 처리 (수정)
+Handling:
 
-```
-[현재]
-modified 블록
-  → _resolve_mapping_for_change() (6단계 폴백)
-    → strategy: direct | containing | list | table | skip
-      → (각 strategy마다 별도 처리)
-
-[제안]
-modified 블록
-  → sidecar O(1) 조회 → BlockMapping
-  → mdx_block_to_xhtml_element(new_block, sidecar_entry)
-    → heading, paragraph, code_block: 직접 재구성
-    → list: reconstruct_ul_entry() (trailing_html 재주입 포함)
-    → callout: mdx_callout_to_xhtml() (재귀 파싱 + macro wrap)
-    → 그 외: 기존 폴백
-  → xhtml_xpath에 new_inner_xhtml 교체
-```
+1. emit the improved MDX block to XHTML first
+2. extract plain text from the emitted result
+3. map old offsets to new offsets using the sidecar's `old_plain_text` and anchor offsets
+4. insert the raw anchor XHTML at the mapped position
 
-### 4.2 삭제되는 코드
+Key points:
 
-| 모듈/함수 | 이유 |
-|-----------|------|
-| `_resolve_mapping_for_change()` | 단순 sidecar O(1) 조회로 대체 |
-| `_find_containing_mapping()` | containing 전략 불필요 |
-| `_resolve_child_mapping()` 4단계 폴백 | sidecar가 정확하면 불필요 |
-| `text_transfer.py` (대부분) | 텍스트 위치 매핑 불필요 |
-| `has_inline_format_change()` | 재구성이 기본이므로 감지 불필요 |
-| `has_inline_boundary_change()` | 동상 |
-| `lost_info_patcher.py` 블록 레벨 heuristic | `inline_trailing_html` 재주입으로 대체 |
-| `build_list_item_patches()` 매칭 로직 | `reconstruct_entry()` ref 순회로 대체 |
-| `_convert_callout_inner()` → paragraph 폴백 | 재귀 파싱으로 대체 |
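+- old and new are compared in plain-text coordinates
+- the insertion target is the generated XHTML DOM
+- insertion is done by walking the DOM, not by splicing into the raw string at a character position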
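+
+The DOM-walk insertion described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: it uses the standard-library `xml.dom.minidom` as a stand-in for BeautifulSoup, the namespace URIs are placeholders, and `insert_anchor_sketch` is a hypothetical name that takes a fragment string instead of a soup object.
+
+```python
+from xml.dom import minidom
+
+# Placeholder namespace URIs so that ac:/ri: prefixes parse (assumption).
+_NS = 'xmlns:ac="urn:example:ac" xmlns:ri="urn:example:ri"'
+
+
+def _parse(fragment: str):
+    """Parse a fragment inside a throwaway root that declares the prefixes."""
+    return minidom.parseString(f"<root {_NS}>{fragment}</root>")
+
+
+def _text_nodes(node):
+    """Yield text nodes in document order."""
+    for child in node.childNodes:
+        if child.nodeType == child.TEXT_NODE:
+            yield child
+        else:
+            yield from _text_nodes(child)
+
+
+def insert_anchor_sketch(fragment: str, raw_xhtml: str, offset: int) -> str:
+    """Insert raw_xhtml at a plain-text offset by walking text nodes."""
+    doc = _parse(fragment)
+    root = doc.documentElement
+    anchor = doc.importNode(_parse(raw_xhtml).documentElement.firstChild, True)
+    consumed = 0
+    for text in list(_text_nodes(root)):
+        if offset <= consumed + len(text.data):
+            # Split the text node at the local offset and put the anchor
+            # between the two halves; markup around it is never touched.
+            tail = text.splitText(offset - consumed)
+            tail.parentNode.insertBefore(anchor, tail)
+            return "".join(child.toxml() for child in root.childNodes)
+        consumed += len(text.data)
+    raise ValueError("offset lies beyond the fragment's plain text")
+```
+
+With the fragment `<p>A <strong>bold</strong> B</p>`, plain-text offset 3 falls inside the word "bold", whereas character 3 of the raw string is still inside the `<p>` tag. That gap is why the offset is resolved against text nodes rather than string positions.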
 
-### 4.3 유지되는 코드
+#### C. Ordered child block
 
-| 모듈/함수 | 이유 |
-|-----------|------|
-| `block_diff.py` | diff 로직 그대로 사용 |
-| `sidecar.py` O(1) 인덱스 | 그대로 사용 (children/plain_text/inline_trailing_html 필드 추가) |
-| `mdx_to_xhtml_inline.py` | 재구성의 핵심 — 확장 |
-| `xhtml_patcher.py` `_replace_inner_html()` | XHTML 교체 메커니즘 |
-| `roundtrip_verifier.py` | 검증 로직 |
-| `table_patcher.py` | 테이블은 별도 처리 (표 구조가 복잡) |
+Targets:
 
----
+- nested list
+- callout / details / ADF panel body
 
-## 5. 구현 계획
+Handling:
 
-### Phase 1: Sidecar 플랫 매핑 + children ref 구조 도입
+- store the child order of the original XHTML in the sidecar
+- parse the improved MDX into child blocks with `mdx_to_storage.parser.parse_mdx()`
+- reconstruct recursively, keyed on child type and order
 
-**목표:** `ul`/`ol`/`li` 및 `<p>` 내 `ac:image`를 최상위 SidecarEntry로 생성하고, 관계를 `children: [ref]`로 표현
+No text matching is done here. Position and child slot are the keys.
 
-**작업:**
-1. `SidecarEntry` dataclass에 `children`, `plain_text`, `inline_trailing_html` 필드 추가
-   - `children: List[ChildRef]` — `{'ref': xhtml_xpath}` 목록 (ul→li, li→block child, p→ac:image)
-   - `plain_text: str` — li entry의 MDX 항목 텍스트 (zip 매칭 키)
-   - `inline_trailing_html: str` — `<p>` 직후 non-block lossy 요소 원본 HTML
-2. `_process_element(elem, xpath)` 구현 — `ul`/`ol`/`li`를 최상위 entry로 재귀 생성
-   - `xhtml_type`은 XHTML 태그명 그대로 (`ul`, `ol`, `li`)
-   - `<li>` 내 block 요소(`<ul>`, `<ol>`, block macro) → 독립 entry + `children` ref
-   - `<li>` 내 non-block lossy 요소(`<ac:image>` 등) → `inline_trailing_html`에 저장
-3. `_process_paragraph(elem, xpath)` 구현 — `<p>` 내 `ac:image` 를 독립 entry + `children` ref로 생성
-   - `ParagraphEditSequence` 구조(`kind: text` / `kind: ref`) sidecar에 기록
-4. `generate_sidecar_mapping()`의 list/paragraph 처리를 위 두 함수 호출로 교체
-5. `load_sidecar_mapping()`에서 `children`, `plain_text`, `inline_trailing_html` 역직렬화
-6. `mapping.yaml` 스키마 버전 업 (`version: 2`)
+#### D. Opaque block
 
-**검증 testcase:**
+Targets:
 
-| testcase ID | 검증 대상 | 근거 |
-|-------------|-----------|------|
-| `544145591` | `ac:image` 포함 li 9개, nested list 21개 | li+image, nested 모두 풍부 |
-| `880181257` | `ac:image` 포함 li 12개 | ac:image 포함 li 집중 검증 |
-| `883654669` | `ac:image` 포함 li 16개 | ac:image 포함 li 최다 |
+- custom macros the emitter cannot reconstruct
+- structures that are absent from the current testcases or have no defined metadata rules
 
-> **TODO:** Phase 1 시작 전 "TODO — 구현 전 조사 필요" 항목(Section 3.1) 중 1~5번 확인 후 구현 방향 확정
+Handling:
 
----
+- raise `UnsupportedReconstructionError`
+- verify fails
+- promote the page to a testcase, then extend the design scope
 
-### Phase 2: `_convert_callout_inner` 재귀 파싱
+This fail-closed policy matters: silent corruption on an unsupported structure is the most dangerous outcome.
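+
+The four-way classification and the fail-closed rule could be expressed as a small dispatcher. The sketch below is only an illustration: the kind names, the strategy labels, and `choose_strategy` are hypothetical, and only `UnsupportedReconstructionError` is a name taken from this document.
+
+```python
+class UnsupportedReconstructionError(Exception):
+    """Raised when no reconstruction rule is defined for a block."""
+
+
+# Hypothetical kind names, chosen for this sketch only.
+CLEAN_KINDS = {"heading", "code", "table", "hr"}
+CONTAINER_KINDS = {"list", "callout", "details", "adf_panel"}
+
+
+def choose_strategy(kind: str, has_anchors: bool = False) -> str:
+    """Pick a reconstruction strategy, refusing anything unknown."""
+    if kind in CLEAN_KINDS:
+        return "clean"
+    if kind == "paragraph":
+        # A paragraph is clean unless the sidecar recorded preserved anchors.
+        return "inline_anchor" if has_anchors else "clean"
+    if kind in CONTAINER_KINDS:
+        return "ordered_child"
+    # Fail closed: never fall back to a best-effort text patch.
+    raise UnsupportedReconstructionError(f"no reconstruction rule for {kind!r}")
+```
+
+Raising instead of returning a default keeps an unknown macro from being silently rewritten; the caller is expected to surface the error as a verify failure.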
 
-**목표:** callout 내부 리스트/코드 블록을 올바르게 재구성
+### 5.4 Anchor reinsertion for paragraphs and list items
 
-**작업:**
-1. `_strip_callout_wrapper()` 및 `_dedent_callout_body()` 유틸리티 추가
-2. `_convert_callout_inner_full()` 구현 (재귀 파싱)
-3. `mdx_block_to_xhtml_element()`의 callout 분기에서 sidecar_entry를 인자로 받아 macro 포맷 결정
-4. `mdx_block_to_xhtml_element()` 시그니처에 optional `sidecar_entry` 추가
+The key differentiator of this design is that anchors are pinned to plain-text offsets.
 
-**검증 testcase:**
+#### Coordinate system
 
-| testcase ID | 검증 대상 | 근거 |
-|-------------|-----------|------|
-| `1454342158` | callout 내부 list 4개 | callout+list 가장 많음 |
-| `880181257` | callout 내부 list 2개 | Phase 1+2 복합 케이스 (ac:image 포함 li도 존재) |
+- `old_plain_text`: the DOM text extracted from the original XHTML fragment, then normalized
+- `new_plain_text`: the text extracted by the same rules from the XHTML fragment emitted from the improved MDX
+- `old_plain_offset`: the anchor position in the original plain text
+- `new_plain_offset`: the insertion position computed from the old -> new diff
 
----
+#### Algorithm
 
-### Phase 3: `build_patches()` 재구성 경로 전환
+1. build the old and new plain text with `extract_plain_text(fragment)`
+2. compute the new offsets with `map_offsets(old_plain, new_plain, offsets)`
+3. insert into the DOM with `insert_raw_anchor_at_plain_offset(soup, raw_xhtml, offset)`
 
-**목표:** modified 블록 처리를 재구성 기반으로 전환
+This approach removes the problem raised in the review: whether an XHTML inline fragment has to be converted back to MDX text.
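+
+Step 2 of the algorithm can be illustrated with a diff-based mapping. The sketch below assumes `difflib.SequenceMatcher` as the diff engine and ignores the `affinity` field; the document fixes only the `map_offsets(old_plain, new_plain, offsets)` signature, so the body is an assumption rather than the intended implementation.
+
+```python
+from difflib import SequenceMatcher
+
+
+def map_offsets(old_plain: str, new_plain: str, offsets: list) -> list:
+    """Map plain-text offsets in old_plain to offsets in new_plain."""
+    matcher = SequenceMatcher(None, old_plain, new_plain, autojunk=False)
+    opcodes = matcher.get_opcodes()
+    mapped = []
+    for off in offsets:
+        target = len(new_plain)  # fallback: end of the new text
+        for tag, i1, i2, j1, j2 in opcodes:
+            if tag == "equal" and i1 <= off <= i2:
+                # Inside (or at the edge of) unchanged text: shift by the delta.
+                target = j1 + (off - i1)
+                break
+            if tag != "equal" and i1 <= off < i2:
+                # Inside deleted or replaced text: snap to the end of
+                # whatever replaced it.
+                target = j2
+                break
+        mapped.append(target)
+    return mapped
+```
+
+For the paragraph example above (`old_plain_text` of `"A  B"`, anchor at offset 2), prepending `"Hello "` in the improved MDX moves the anchor to offset 8, so it still sits between the two spaces.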
 
-**작업:**
-1. `reconstruct_ul_entry()` / `reconstruct_li_entry()` 구현 및 단위 테스트
-2. `build_patches()`에서 modified 블록의 처리를 `mdx_block_to_xhtml_element()` 기반으로 전환
-3. 기존 전략 분기(containing, list, text_transfer) 단계적 제거
-4. 전체 테스트 케이스 통과 확인 (`make test-reverse-sync`)
+### 5.5 List reconstruction
 
-**검증 기준:** `tests/reverse-sync/pages.yaml`의 모든 `expected_status: pass` 케이스 유지
+Lists are reconstructed by tree + order matching, not by a text queue.
 
----
+Reusable assets:
 
-## 6. 위험 및 대응
+- `mdx_to_storage.emitter._parse_list_items()`
+- `mdx_to_storage.emitter._build_list_tree()`
 
-### 위험 1: inline 항목 텍스트 변경으로 `inline_trailing_html` 매칭 실패
+Direct imports of private functions are avoided, however. This work promotes them to public helpers.
 
-**증상:** 리스트 항목 내용이 크게 바뀌면 `plain_text` 키가 일치하지 않아 `inline_trailing_html`을 찾지 못함
+Proposal:
 
-**대응:**
-- `_match_mdx_inline_item()`에서 순서(index) 기반 폴백 우선 사용
-  - `list_items` 시퀀스에서 `kind: inline` 항목의 순서(inline_ptr)와 MDX 항목의 순서를 매칭
-  - 텍스트 완전 일치 → prefix 20자 매칭 → 순서 기반 매칭 순으로 폴백
-- 매칭 실패 시 `inline_trailing_html` 없이 재구성 → lossy 요소 손실이지만 구조 파괴보다 안전
+- new public API: `mdx_to_storage.emitter.parse_list_tree(content: str) -> list[ListNode]`
 
-> **TODO (W-3):** prefix 20자 매칭의 충돌 가능성을 기존 testcase 전수 조사로 확인. 충돌이 빈번하면 prefix 폴백을 제거하고 순서 기반만 유지하는 방향 검토.
+Reconstruction logic:
 
-### 위험 2: Callout 내부 들여쓰기 처리 오류
+1. build a list tree from the improved MDX list block
+2. zip it by index with the sidecar list item sequence
+3. for each item
+   - emit the item text
+   - reinsert item-level anchors
+   - recursively reconstruct child lists and block children
+4. regenerate the top-level `<ul>` / `<ol>` wrapper
 
-**증상:** `_dedent_callout_body()`가 내부 코드 블록의 들여쓰기를 과도하게 제거
+This makes the following possible.
 
-**대응:**
-- code_block 내부는 `parse_mdx_blocks()`가 펜스 마커 기준으로 파싱하므로, 들여쓰기 제거가 코드 내용에 영향 없음
-- 단위 테스트로 코드 블록 포함 callout 케이스 커버
+- stable even when items with identical text appear several times
+- no duplicate insertion of nested lists
+- list items containing images are handled without text patches
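+
+The index-based pairing in step 2 might look like the sketch below. The `ListNode` shape and `pair_items` are assumptions made for illustration; the document fixes only the `ListNode` name and the idea of zipping by index.
+
+```python
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from itertools import zip_longest
+
+
+@dataclass
+class ListNode:
+    """Hypothetical list-tree node: item text plus nested child items."""
+    text: str
+    children: list[ListNode] = field(default_factory=list)
+
+
+def pair_items(sidecar_items: list[dict], nodes: list[ListNode]) -> list[tuple]:
+    """Pair sidecar item metadata with MDX list nodes by position only."""
+    pairs = []
+    for meta, node in zip_longest(sidecar_items, nodes):
+        if node is None:
+            break  # the remaining sidecar items were deleted in the MDX
+        # meta is None for items that were added in the improved MDX.
+        pairs.append((meta, node))
+    return pairs
+```
+
+Because the text is never used as a key, two items with the same text keep their own anchors, and a newly appended item simply arrives with no metadata.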
 
-### 위험 3: Sidecar 버전 비호환
+### 5.6 Callout / details / ADF panel reconstruction
 
-**증상:** `list_items` 필드가 없는 구 버전 `mapping.yaml`을 읽을 때 오류
+In this design a callout is not handled by transplanting only text into the containing block.
 
-**대응:**
-- `list_items`를 optional 필드로 선언 (기본값 `[]`)
-- 구 버전 sidecar에서 `list_items`가 없으면 trailing 없이 재구성 → 기존 동작과 동일
+Existing assets:
 
----
+- `mapping_recorder.record_mapping()` generates child xpaths for callouts
+- `mdx_to_storage.parser.parse_mdx()` and `_emit_callout()` support recursive emission of child blocks
 
-## 7. 기대 효과 요약
+The new path is therefore as follows.
 
-| 지표 | 현재 | 개선 후 |
-|------|------|---------|
-| `patch_builder.py` 전략 수 | 5 (direct / containing / list / table / skip) | 2 (reconstruct / skip) |
-| `_resolve_child_mapping()` 폴백 단계 | 4 | 0 (삭제) |
-| 인라인 변경 감지 함수 | 2 (`has_inline_format_change`, `has_inline_boundary_change`) | 0 (삭제) |
-| callout 내부 리스트 처리 | text_transfer 우회 | 재귀 재구성 |
-| 신규 edge case 시 대응 | 전략 분기 추가 | `mdx_to_xhtml_inline.py` 개선 |
-| 기술부채 방향 | 분기 누적 | 단일 재구성 경로 개선으로 집중 |
+1. store the child order of the original callout body in the sidecar metadata
+2. parse the improved MDX callout body into a child block sequence with `parse_mdx()`
+3. reconstruct per child slot
+4. reassemble the final body under `<ac:rich-text-body>` or `ac:adf-content`
 
----
+Notes:
 
-## 8. 테스트 설계
+- `macro-panel` and `ac:adf-extension` share the same body structure but differ in their outer wrapper
+- preserving the outer wrapper is the job of the reconstruction metadata, not of `lost_info_patcher`
+- when the raw outer fragment of an ADF panel is needed, the raw wrapper is stored in the sidecar
 
-### 8.1 테스트 목표: 재구성의 "완전함"을 증명하는 방법
+### 5.7 Patch application unit
 
-"완전함"을 다음 두 가지 명제로 구분하여 증명한다.
+For a modified block, replacing `new_element_xhtml` is the default rather than replacing `new_inner_xhtml`.
 
-**명제 1 — 재구성 정확성:** `mdx_block_to_xhtml_element(block)`이 각 블록 타입에 대해 XHTML을 올바르게 생성한다.
+Reasons:
 
-**명제 2 — 손실 복원 완전성:** `trailing_html` 재주입 후 재구성된 XHTML이 원본 `page.xhtml`과 블록 수준에서 등가이다.
+- only by replacing the whole top-level element can the wrapper, attributes, and child structure be controlled in one step
+- replacing innerHTML alone makes it hard to enforce consistency of the callout outer wrapper, the list root tag, and the table root tag
 
-두 명제 모두 기존 `tests/testcases/`와 `tests/reverse-sync/`의 데이터로 검증할 수 있다. **새로운 테스트 입력 파일을 만들 필요가 없다.**
+A new action is therefore added to `xhtml_patcher.py`.
 
----
+- `replace_fragment`
+  - input: `xhtml_xpath`, `new_element_xhtml`
+  - meaning: replace the entire top-level element at the xpath with the new fragment
 
-### 8.2 테스트 가능성 분류: 블록 타입별 재구성 가능성
+The existing `insert` / `delete` actions are kept.
 
-주어진 testcase 내의 블록을 재구성 가능성에 따라 세 범주로 나눈다.
+### 5.8 Block identity and planner
 
-| 범주 | 설명 | 재구성 후 기대 결과 | 예시 |
-|------|------|---------------------|------|
-| **가역 블록** | MDX로 완전히 표현 가능한 블록 | 원본 XHTML과 byte-equal | heading, paragraph, code_block, 단순 list |
-| **손실 복원 블록** | MDX에 표현 안 되는 요소가 있지만 trailing_html로 복원 가능 | trailing_html 포함 시 원본과 등가 | `<ac:image>` 포함 list item, `<span style>` 포함 항목 |
-| **비가역 블록** | 정보 손실이 불가역적 | 기능적으로 무해한 변환만 허용 | `ac:adf-extension` callout, `ac:link` 포함 paragraph |
+The existing `patch_builder.py` has many strategy branches and fallbacks. The new design splits out a planner.
 
-비가역 블록은 이미 `architecture.md`의 "정보 손실 카테고리"에 문서화된 항목들이다. 이 범주는 재구성 목표 밖이며, 테스트에서 skip 처리한다.
+Proposed modules:
 
----
+- `reverse_sync/reconstruction_planner.py`
+  - decides the reconstruction strategy for each changed block
+- `reverse_sync/reconstruction_sidecar.py`
+  - loads and builds sidecar schema v3
+- `reverse_sync/reconstructors.py`
+  - rebuilds fragments per paragraph/list/container
+- `reverse_sync/xhtml_normalizer.py`
+  - shared normalization / plain-text extraction
 
-### 8.3 테스트 수준 구조 (5단계)
+`patch_builder.py` eventually becomes a thin orchestration layer.
 
-```
-Level 0: 보조 함수 단위 테스트              (unit)
-    ↓
-Level 1: 블록 단위 재구성 정확성            (unit)
-    ↓
-Level 2: 전체 문서 재구성 + block-level 비교  (integration)
-    ↓
-Level 3: lossy 요소 재주입 후 byte-equal    (integration)
-    ↓
-Level 4: E2E reverse-sync 회귀 방지         (e2e)
-```
+## 6. Scope and non-scope
 
----
+### In scope for this design
 
-### 8.3.1 Level 0: 보조 함수 단위 테스트
+- whole-fragment reconstruction of modified top-level blocks
+- inline anchor reinsertion for paragraphs and list items
+- nested list reconstruction
+- callout/details/ADF panel body reconstruction
+- stabilizing block identity
+- building a golden/oracle-based test suite
 
-**목적:** 신규 추가되는 보조 함수 각각이 독립적으로 올바르게 동작하는지 확인한다. Level 1보다 먼저 실행하여 버그 위치를 좁힌다.
+### Out of scope for this design
 
-**실행 방법:**
-```bash
-python3 -m pytest tests/test_reconstruction_helpers.py -v --tb=short
-```
+- a large refactoring that unifies the whole sidecar/rehydrator under a single parser
+- generalizing to custom macros not present in the testcases
+- a generic DOM diff engine for all of Confluence storage
 
-#### 테스트 대상 함수 및 케이스
+Parser unification is left as follow-up work. This work focuses on moving reverse-sync onto a structural reconstruction path.
 
-**`_process_element(ul_or_ol, xpath)`** — sidecar entry 생성
+## 7. Test design
 
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| 단순 `<ul><li><p>text</p></li></ul>` | `xhtml_type: ul` entry + `xhtml_type: li` entry 생성, children ref 정확성 |
-| `<li>` 내부 `<ac:image>` 포함 | `inline_trailing_html` 추출, block child가 아님을 확인 |
-| `<li>` 내부 nested `<ul>` | 부모 li의 `children`에 ref, nested ul의 독립 entry 생성 (`xhtml_type: ul`) |
-| 빈 `<li>` (`<li></li>`) | `plain_text=""`, `children=[]` |
-| `<li>` 내부 block macro (`<ac:structured-macro>`) | `children`에 ref, 독립 entry 생성 |
-| `<ol>` | `xhtml_type: ol`로 생성 확인 |
+Tests are split into two groups.
 
-**`map_anchor_positions(old_text, new_text, old_positions)`** — AnchorSegment 위치 매핑
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| `old_text == new_text` | 모든 AnchorSegment position 그대로 유지 |
-| 앞부분 삽입 (`"AB"` → `"XAB"`) | position이 삽입 길이만큼 뒤로 이동 |
-| 앞부분 삭제 (`"AB"` → `"B"`) | position이 삭제 길이만큼 앞으로 이동 |
-| 삭제 구간 안에 AnchorSegment (`"AB"` → `"B"`, position=1) | 삭제 직후 위치(0)로 매핑 |
-| 전체 교체 | AnchorSegment가 new_text 끝으로 매핑 |
-| AnchorSegment 2개, 중간 TextSegment 수정 | 각각 독립적으로 정확히 매핑 |
-
-**`reconstruct_paragraph(old_mdx_text, new_mdx_text, entry, sidecar_index)`**
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| `children` 없음 | `convert_inline(new_mdx_text)` 그대로 반환 |
-| `old_mdx_text != TextSegment 연결` | `SidecarMismatchError` 발생 |
-| AnchorSegment 1개, 텍스트 변경 없음 | AnchorSegment가 정확한 위치에 삽입 |
-| AnchorSegment 2개, TextSegment 수정 | Myers diff로 두 AnchorSegment 위치 모두 정확히 매핑 |
-| ref가 `sidecar_index`에 없음 | `SidecarMismatchError` 발생 |
-
-**`_process_paragraph(elem, xpath)`** — ParagraphEditSequence sidecar entry 생성
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| `<p>text only</p>` | `children = [{'kind': 'text', 'text': 'text only'}]` — TextSegment 1개 |
-| `<p><strong>bold</strong></p>` | `children = [{'kind': 'text', 'text': 'bold'}]` — TextSegment 1개 (inline element는 get_text() 처리) |
-| `<p>텍스트A <ac:image/> 텍스트B</p>` | `children = [text, ref, text]` 교차 시퀀스, `ac:image` 독립 entry 생성 |
-| `<p><ac:image/><ac:image/></p>` | `children = [text(''), ref, text(''), ref, text('')]` — 빈 TextSegment 포함 |
-| `<p>텍스트A <ac:image/></p>` | `children = [text, ref, text('')]` — 마지막 빈 TextSegment 포함 |
-
-**`reconstruct_ul_entry(entry, sidecar_index, mdx_nodes)`** — ul/ol 재구성
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| 단순 ul (sidecar 2개, mdx 2개 항목) | `<ul><li>...</li><li>...</li></ul>` 정상 재구성 |
-| ol 재구성 | `xhtml_type: ol` → `<ol>` 출력 |
-| sidecar 2개, mdx 3개 (항목 추가) | 새 항목이 `<li><p>...</p></li>` 로 출력에 포함 |
-| sidecar 3개, mdx 2개 (항목 삭제) | 삭제 항목 생략, 2개 `<li>` 출력 |
-| nested ul (li → ul → li 2단계) | 재귀 재구성 정확성 — `<ul><li><p>...</p><ul><li>...</li></ul></li></ul>` |
-
-**`reconstruct_li_entry(entry, sidecar_index, node)`** — li 재구성
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| 단순 li (텍스트만) | `<li><p>텍스트</p></li>` |
-| `inline_trailing_html` 있는 li | `<li><p>텍스트</p><ac:image.../></li>` — trailing html 재주입 확인 |
-| nested ul children 있는 li | `<li><p>텍스트</p><ul>...</ul></li>` |
-| block macro children 있는 li | `xhtml_fragment` 그대로 삽입 |
-
-**`normalize_xhtml(xhtml)`**
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| 속성 순서 다른 두 XHTML | normalize 후 equal |
-| `<br/>` vs `<br />` | normalize 후 equal |
-| `<tag />` vs `<tag></tag>` (빈 속성값) | normalize 후 equal |
-| trailing 공백 다른 텍스트 노드 | normalize 후에도 **not equal** |
-| 다른 namespace prefix | normalize 후에도 **not equal** |
-
-**`_strip_callout_wrapper(text)` / `_dedent_callout_body(text)`**
-
-| 테스트 케이스 | 검증 내용 |
-|---------------|-----------|
-| `<Callout type="info">...</Callout>` | wrapper 제거 후 내부 텍스트만 반환 |
-| 들여쓰기 2칸 callout body | 공통 선행 공백 제거 |
-| 내부 코드 블록 포함 | 코드 블록 내 들여쓰기 보존 |
-
----
-
-### 8.4 Level 1: 블록 단위 재구성 정확성 테스트
-
-**목적:** `mdx_block_to_xhtml_element()`이 각 블록 타입에 대해 올바른 XHTML을 생성하는지 확인한다.
-
-**데이터 소스:** `tests/testcases/{page_id}/mapping.yaml` + `expected.mdx`
-
-**방법:**
-
-`mapping.yaml`은 각 MDX 블록의 `xhtml_xpath`와 원본 XHTML 내 `xhtml_text`를 갖고 있다. `expected.mdx`를 파싱하여 각 블록을 재구성하면, `mapping.yaml`의 `xhtml_text`와 비교할 수 있다.
-
-```python
-# tests/test_reconstruction_unit.py
-
-def test_block_reconstruction(page_id, block_idx):
-    """각 MDX 블록을 재구성하여 원본 XHTML 텍스트와 비교한다."""
-    mapping = load_mapping(f"testcases/{page_id}/mapping.yaml")
-    mdx_blocks = parse_mdx_blocks(open(f"testcases/{page_id}/expected.mdx").read())
-
-    for entry in mapping.entries:
-        mdx_idx = entry.mdx_blocks[0] if entry.mdx_blocks else None
-        if mdx_idx is None:
-            continue  # 비가역 블록 skip
-
-        block = mdx_blocks[mdx_idx]
-        reconstructed = mdx_block_to_xhtml_element(block)
-
-        # normalize_xhtml() 정규화 범위:
-        #   정규화 O — 속성 순서, 빈 태그 형식(<br/> vs <br />), 빈 속성값 형식
-        #   정규화 X — 텍스트 노드 trailing 공백, 네임스페이스 prefix
-        # trailing 공백 차이나 namespace prefix 차이는 실제 버그로 판정한다.
-        assert normalize_xhtml(reconstructed) == normalize_xhtml(entry.xhtml_text), \
-            f"Block {block_idx} ({block.type}) mismatch"
-```
+1. design verification tests
+2. regression tests
 
-#### `normalize_xhtml()` 스펙
-
-| 항목 | 정규화 여부 | 이유 |
-|------|------------|------|
-| 속성 순서 | ✅ 정렬 | `mdx_block_to_xhtml_element()` 출력 순서가 원본과 다를 수 있음 |
-| 빈 태그 형식 (`<br/>` vs `<br />`) | ✅ 통일 | XML 파서/직렬화 도구마다 출력이 다름 |
-| 빈 속성값 형식 (`<ac:parameter ... />` vs `<ac:parameter ...></ac:parameter>`) | ✅ 통일 | BeautifulSoup 출력 방식에 따라 달라짐 |
-| 텍스트 노드 trailing 공백 | ❌ 유지 | 공백 차이는 실제 버그로 판정 |
-| 네임스페이스 prefix (`ac:`, `ri:`) | ❌ 유지 | prefix는 항상 동일하게 유지되어야 함 — 차이 발생 시 실제 버그 |
-
-```python
-def normalize_xhtml(xhtml: str) -> str:
-    """비교용 XHTML 정규화.
-
-    정규화 O: 속성 순서, 빈 태그 형식, 빈 속성값 형식
-    정규화 X: 텍스트 노드 공백, 네임스페이스 prefix
-    """
-    from lxml import etree
-    root = etree.fromstring(f"<root>{xhtml}</root>")
-    for elem in root.iter():
-        # 속성 순서 정렬
-        elem.attrib = dict(sorted(elem.attrib.items()))
-    # 직렬화 — 빈 태그/속성값 형식은 lxml 기본 출력으로 통일
-    result = etree.tostring(root, encoding="unicode")
-    # <root>...</root> 제거
-    return result[6:-7]
-```
+### 7.1 Design verification tests
 
-이 함수 자체에 대한 단위 테스트(Level 0)가 필요하다 — Section 8.3 Level 0 참고.
+#### Level 0. Helper / invariant
 
-**측정 지표:** `passed_blocks / total_blocks` — 목표 80% 이상 (비가역 블록 제외)
+Proposed new files:
 
-**실행 방법:**
-```bash
-python3 -m pytest tests/test_reconstruction_unit.py -v \
-    --tb=short --no-header 2>&1 | tail -20
-```
+- `tests/test_reverse_sync_xhtml_normalizer.py`
+- `tests/test_reverse_sync_reconstruction_offsets.py`
+- `tests/test_reverse_sync_reconstruction_insert.py`
 
----
+Checks:
 
-### 8.5 Level 2: 전체 문서 재구성 커버리지 테스트
+- whether plain-text extraction follows the same rules on original and emitted fragments
+- whether old -> new offset mapping is stable under insertion, deletion, and replacement
+- whether raw anchor insertion completes without breaking the DOM
+- whether `hash + line_range` disambiguation is stable on duplicate content
 
-**목적:** `tests/testcases/`의 모든 페이지에 대해 재구성이 가능한 블록의 비율(커버리지)을 측정한다.
+The Critical issues from the review are pinned down here first as red tests.
 
-**방법:** 각 페이지의 `expected.mdx`를 전체 재구성한 뒤, 원본 `page.xhtml`의 beautified diff와 비교한다. 비가역 블록 위치에서 발생하는 diff만 허용한다.
+Required red cases:
 
-```
-tests/testcases/{page_id}/
-    page.xhtml                ← 원본
-    expected.mdx              ← MDX 입력
-    mapping.yaml              ← 블록 매핑
-    output.reconstruct.xhtml  ← 재구성 결과 (신규 생성)
-    output.reconstruct.diff   ← beautify-diff (신규 생성)
-```
+1. paragraph + inline image
+2. list item + image
+3. duplicate hash candidate
+4. namespace-bearing fragment normalization
 
-#### 핵심 함수 인터페이스
-
-```python
-def reconstruct_full_xhtml(
-    mdx_text: str,
-    mapping: SidecarMapping,
-    page_xhtml: str,              # 비가역 블록 원본 보존용
-) -> str:
-    """MDX 전체를 sidecar mapping 기반으로 재구성한다.
-
-    처리 순서:
-      1. mapping의 각 entry를 순회
-      2. 가역 블록 → mdx_block_to_xhtml_element()로 재구성
-      3. 비가역 블록(ac:link 포함, adf_extension 등) → page_xhtml에서 원본 fragment 추출하여 그대로 사용
-      4. document envelope(prefix/suffix) → sidecar에서 복원 (RoundtripSidecar.reassemble_xhtml() 참고)
-    """
-
-
-def compare_reversible_blocks(
-    original: str,
-    reconstructed: str,
-    mapping: SidecarMapping,
-) -> list[str]:
-    """가역 블록에서 발생한 diff 목록을 반환한다.
-
-    반환값이 [] 이면 전원 일치.
-    mapping의 각 entry를 순회하며:
-      - 비가역 블록(ac:link 포함, adf_extension 등) → skip
-      - 가역 블록 → original vs reconstructed의 해당 xhtml_xpath fragment 비교
-      - 불일치 시 f"{xhtml_xpath}: {diff}" 형태의 문자열을 목록에 추가
-    normalize_xhtml()로 정규화 후 비교한다.
-    """
-```
+#### Level 1. Block reconstruction against exact fragment oracle
 
-**실행 스크립트 (기존 run-tests.sh 확장):**
+Proposed new files:
 
-```bash
-# run-tests.sh에 추가할 타입
-# --type reconstruct
-# page.xhtml의 모든 블록을 expected.mdx로 재구성하여 diff를 생성한다.
-```
+- `tests/test_reverse_sync_reconstruct_paragraph.py`
+- `tests/test_reverse_sync_reconstruct_list.py`
+- `tests/test_reverse_sync_reconstruct_container.py`
 
-```python
-# tests/test_reconstruction_coverage.py
-
-@pytest.mark.parametrize("page_id", list_testcase_ids())
-def test_reconstruction_coverage(page_id):
-    """MDX → XHTML 재구성 커버리지: 가역 블록은 원본과 일치해야 한다."""
-    page_xhtml = open(f"testcases/{page_id}/page.xhtml").read()
-    mdx_text = open(f"testcases/{page_id}/expected.mdx").read()
-    mapping = load_mapping(f"testcases/{page_id}/mapping.yaml")
-
-    reconstructed_xhtml = reconstruct_full_xhtml(mdx_text, mapping, page_xhtml)
-
-    reversible_diffs = compare_reversible_blocks(
-        original=page_xhtml,
-        reconstructed=reconstructed_xhtml,
-        mapping=mapping,
-    )
-    # 가역 블록에서 diff가 없어야 함
-    assert reversible_diffs == [], \
-        f"Reversible block diff found:\n" + "\n".join(reversible_diffs)
-```
+oracle:
 
-**`compare_reversible_blocks()`의 동작:**
+- default: `expected.roundtrip.json.blocks[].xhtml_fragment`
+- nested child: xpath extraction from `page.xhtml`
 
-1. `mapping.yaml`의 각 엔트리를 순회
-2. 비가역 블록(ac:link 포함, adf_extension 등) → skip
-3. 가역 블록 → `original_block_xhtml` vs `reconstructed_block_xhtml` 비교
-4. 불일치 시 diff를 반환
+Verification method:
 
----
+- reconstructing an unchanged MDX block yields a result that is normalize-equal to the oracle fragment
 
-### 8.6 Level 3: 재구성 후 byte-equal 테스트
+Representative parameters:
 
-**목적:** 손실 복원 블록(`ac:image`, `inline_trailing_html` 포함 항목 등)에 대해 재구성 결과가 원본 XHTML fragment와 byte-equal임을 증명한다.
+- list item + image: `544113141`, `544145591`, `692355151`, `880181257`, `883654669`
+- callout + list: `1454342158`, `544145591`, `692355151`, `880181257`, `883654669`
+- callout + code: `544112828`
+- ADF panel: `1454342158`, `544379140`, `panels`
+- inline paragraph image: `tests/reverse-sync/544376004`
 
-#### 기존 인프라 (신규 파일 불필요)
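+`544376004` is not under `tests/testcases`, so a unit test may add a minimal fixture containing only the relevant fragment extracted from that page. This is a practical response to the review's point that the design should not claim new fixtures are entirely unnecessary.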
 
-다음 구현이 이미 존재한다:
+#### Level 2. Changed block golden reconstruction
 
-| 항목 | 구현 위치 | 설명 |
-|------|-----------|------|
-| `expected.roundtrip.json` 파일 | `tests/testcases/{page_id}/` | `SidecarBlock.xhtml_fragment` (byte-exact), `SidecarBlock.mdx_content_hash` (SHA-256) 포함 |
-| `SidecarBlock.mdx_content_hash` | `bin/reverse_sync/sidecar.py` L53 | MDX 블록 content의 SHA-256 — MDX 블록 식별 키 |
-| `RoundtripSidecar`, `load_sidecar()` | `bin/reverse_sync/sidecar.py` L59, L233 | JSON 역직렬화 |
-| `build_sidecar()` | `bin/reverse_sync/sidecar.py` L159 | `page.xhtml` + `expected.mdx` → `RoundtripSidecar` 생성 |
-| 생성 CLI | `bin/mdx_to_storage_roundtrip_sidecar_cli.py` | `batch-generate` 서브커맨드로 전체 testcase 일괄 생성 |
-| splice 경로 | `bin/reverse_sync/rehydrator.py` L62 `splice_rehydrate_xhtml()` | `mdx_content_hash` 기반 블록 매칭 — `find_mdx_block_by_hash()` 역할 수행 |
+Proposed new files:
 
-**데이터 소스:** `tests/testcases/{page_id}/expected.roundtrip.json` + `tests/testcases/{page_id}/expected.mdx`
+- `tests/test_reverse_sync_reconstruction_goldens.py`
 
-#### 테스트 코드
+oracle:
 
-```python
-# tests/test_reconstruction_lossless.py
+- `expected.reverse-sync.patched.xhtml`
+- where needed, `expected.reverse-sync.mapping.original.yaml` / `expected.reverse-sync.result.yaml`
 
-from reverse_sync.sidecar import load_sidecar, sha256_text, load_sidecar_mapping
-from reverse_sync.mdx_block_parser import parse_mdx_blocks
+Targets:
 
-@pytest.mark.parametrize("page_id", list_testcase_ids())
-def test_lossless_reconstruction(page_id):
-    """재구성 결과가 원본 XHTML fragment와 byte-equal인지 검증한다."""
-    sidecar = load_sidecar(Path(f"testcases/{page_id}/expected.roundtrip.json"))
-    # load_sidecar: bin/reverse_sync/sidecar.py L233
-    mapping_entries = load_sidecar_mapping(f"testcases/{page_id}/mapping.yaml")
-    # load_sidecar_mapping: bin/reverse_sync/sidecar.py L257
-    xpath_index = {e.xhtml_xpath: e for e in mapping_entries}
+- the 16 `tests/testcases` that have `original.mdx` + `improved.mdx` + `expected.reverse-sync.*`
 
-    mdx_text = open(f"testcases/{page_id}/expected.mdx").read()
-    mdx_blocks = parse_mdx_blocks(mdx_text)
-    # parse_mdx_blocks: bin/reverse_sync/mdx_block_parser.py
+Verification method:
 
-    # mdx_content_hash → MDX 블록 인덱스 (splice 경로와 동일 방식)
-    # rehydrator.py L96: content_hash == sb.mdx_content_hash 비교와 동일
-    hash_to_block = {sha256_text(b.content): b for b in mdx_blocks if b.content}
+- reconstruct only the changed blocks, assemble the page, and require normalize-equality with `expected.reverse-sync.patched.xhtml`
+- cases with `status: pass` in `expected.reverse-sync.result.yaml` must also pass forward verify exactly
 
-    for sb in sidecar.blocks:
-        if not sb.mdx_content_hash:
-            continue  # MDX 대응 없는 블록 skip (image, TOC 등)
-            # SidecarBlock.mdx_content_hash: sidecar.py L53
+### 7.2 Regression tests
 
-        mdx_block = hash_to_block.get(sb.mdx_content_hash)
-        if mdx_block is None:
-            pytest.skip(f"hash not found: {sb.mdx_content_hash[:8]}...")
+#### Level 3. Existing sidecar / byte-equal gates
 
-        sidecar_entry = xpath_index.get(sb.xhtml_xpath)
-        reconstructed = mdx_block_to_xhtml_element(mdx_block, sidecar_entry)
+Existing tests are kept and extended for schema v3.
 
-        assert reconstructed == sb.xhtml_fragment, (
-            f"Fragment mismatch at {sb.xhtml_xpath}:\n"
-            f"  expected: {sb.xhtml_fragment!r}\n"
-            f"  got:      {reconstructed!r}"
-        )
-```
+- `tests/test_reverse_sync_sidecar_v2.py`
+- `tests/test_reverse_sync_rehydrator.py`
+- `tests/test_reverse_sync_byte_verify.py`
 
-#### 기존 `byte_verify`와의 관계
+Changes:
 
-| 검증 | 구현 | 목적 |
-|------|------|------|
-| 기존 `byte_verify` | `bin/reverse_sync/byte_verify.py` | MDX 무변경 시 XHTML byte-equal 보장 (fast path) |
-| Level 3 `test_reconstruction_lossless` | 신규 | 재구성 경로로 변환해도 byte-equal임을 보장 |
+- the `expected.roundtrip.json` builder/loader must read and write reconstruction metadata
+- unchanged cases still keep 21/21 byte-equal
 
-Level 3이 통과하면 "재구성 = fast path"임이 증명된다.
+#### Level 4. CLI / E2E
 
-**측정 목표: failed = 0** (mdx_content_hash 없는 블록 skip 제외)
+Existing tests are kept, but the reconstruction path becomes the default path.
 
----
+- `tests/test_reverse_sync_cli.py`
+- `tests/test_reverse_sync_e2e.py`
+- `tests/test_reverse_sync_structural.py`
 
-### 8.7 Level 4: E2E 회귀 방지 테스트
+The following is added.
 
-**목적:** 재구성 기반 reverse-sync가 기존 테스트케이스를 회귀시키지 않음을 보장한다.
+- `expected_status: pass` cases in `tests/reverse-sync/pages.yaml` continue to pass on the new path
+- `expected_status: fail` cases are turned red -> green one per failure type first
 
-**데이터 소스:** `tests/reverse-sync/{page_id}/` (기존 인프라 그대로 사용)
+The priority order is as follows.
 
-**방법:** 기존 `run-tests.sh --type reverse-sync-verify`를 그대로 사용하되, reverse-sync 내부 경로가 재구성 기반으로 전환된 후 동일하게 실행한다.
+1. list/image
+2. callout/code
+3. callout/list
+4. ADF panel
 
-```bash
-# 기존 명령 그대로
-cd tests && ./run-tests.sh --type reverse-sync-verify
+### 7.3 Summary of how current assets are used
 
-# 검증 기준: pages.yaml의 expected_status와 일치
-# pass 26건 유지, fail 16건 유지 (신규 pass 전환만 허용)
-```
+| Asset | Count | Role in the new design |
+|------|------|--------------------|
+| `tests/testcases/*/page.xhtml` | 21 | exact source page, nested fragment extraction |
+| `tests/testcases/*/expected.roundtrip.json` | 21 | unchanged top-level fragment oracle |
+| `tests/testcases/*/original.mdx` | 16 | reverse-sync original input |
+| `tests/testcases/*/improved.mdx` | 16 | reverse-sync changed input |
+| `tests/testcases/*/expected.reverse-sync.patched.xhtml` | 16 | changed-page golden oracle |
+| `tests/testcases/*/expected.reverse-sync.result.yaml` | 16 | expected verify outcome |
+| `tests/testcases/*/attachments.v1.yaml` | 19 | image filename / asset context |
+| `tests/testcases/*/page.v1.yaml`, `page.v2.yaml`, `children.v2.yaml`, `page.adf` | 18~19 | forward converter context, ADF/callout/link validation |
+| `tests/reverse-sync/*` | 42 | real-world regression and failure reproduction |
 
-**회귀 판정 기준:**
+## 8. Phased implementation plan
 
-| 상태 전환 | 판정 |
-|-----------|------|
-| `pass` → `pass` | ✅ 유지 |
-| `fail` → `pass` | ✅ 개선 (expected_status 업데이트 필요) |
-| `pass` → `fail` | ❌ 회귀 — PR 차단 |
-| `fail` → `fail` | ✅ 유지 |
+### Phase 0. Extract shared helpers
 
-**Phase 3 구현 완료 기준:** 26건 `expected_status: pass` 전원 유지 + 신규 pass 전환 확인
+- add `xhtml_normalizer.py`
+- implement `extract_plain_text()`, `normalize_fragment()`, `extract_fragment_by_xpath()`
+- promote the list tree helper to a public API
 
----
+Gate:
 
-### 8.8 테스트 실행 순서와 피드백 루프
+- Level 0 helper tests green
 
-구현 변경 후 아래 순서로 테스트를 실행한다. 각 단계는 이전 단계가 전원 통과한 후 진행한다.
+### Phase 1. sidecar schema v3
 
-#### Step 1 — Level 1 실행
+- add reconstruction metadata to `RoundtripSidecar`
+- implement builder/load/write/update
+- introduce a `hash + line_range`-based identity helper
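+
+One way a `hash + line_range` identity helper could disambiguate duplicate content is sketched below. The document does not define the matching rule, so the "nearest start line among equal hashes" tie-break, `BlockIdentity`, `make_identity`, and `resolve_block` are all assumptions made for illustration.
+
+```python
+from __future__ import annotations
+
+import hashlib
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class BlockIdentity:
+    """Hypothetical identity: content hash plus MDX line range."""
+    content_hash: str
+    line_range: tuple[int, int]  # (start, end) lines in the MDX source
+
+
+def _digest(content: str) -> str:
+    return hashlib.sha256(content.encode("utf-8")).hexdigest()
+
+
+def make_identity(content: str, start: int, end: int) -> BlockIdentity:
+    return BlockIdentity(_digest(content), (start, end))
+
+
+def resolve_block(candidates: list[BlockIdentity], content: str,
+                  start_line: int) -> BlockIdentity:
+    """Pick the candidate with the same hash and the nearest start line."""
+    matches = [c for c in candidates if c.content_hash == _digest(content)]
+    if not matches:
+        raise LookupError("no block with a matching content hash")
+    # Assumed tie-break: identical blocks are told apart by position.
+    return min(matches, key=lambda c: abs(c.line_range[0] - start_line))
+```
+
+With two identical `- item` blocks at lines 3 and 10, a lookup hinted at line 9 resolves to the second one, which a hash-only key cannot do.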
 
-```bash
-python3 -m pytest tests/test_reconstruction_unit.py -v --tb=short
-```
+Gate:
 
-**무엇을 확인하는가:** 블록 하나를 재구성했을 때 XHTML이 올바른가.
+- existing sidecar tests green
+- roundtrip preserved for the 21 unchanged `expected.roundtrip.json` files
 
-**실패 시 수정 위치:** `bin/reverse_sync/mdx_to_xhtml_inline.py` — 해당 블록 타입의 변환 로직.
+### Phase 2. clean block whole-fragment replacement
 
----
+- move modified heading/code/table/simple paragraph blocks onto the reconstruction path
+- add the `replace_fragment` patch
 
-#### Step 2 — Level 2 실행
+Gate:
 
-```bash
-python3 -m pytest tests/test_reconstruction_coverage.py -v --tb=short
-```
+- simple modified golden cases green
+- clean block changes can be handled without the `transfer_text_changes()` path
 
-**무엇을 확인하는가:** 문서 전체를 재구성했을 때 블록 조립 순서와 envelope(문서 앞뒤 고정 텍스트)가 올바른가.
+### Phase 3. paragraph/list anchor reconstruction
 
-**실패 시 수정 위치:** 블록 조립 순서 오류라면 `reconstruct_entry()`, envelope 오류라면 `RoundtripSidecar.reassemble_xhtml()` (`bin/reverse_sync/sidecar.py` L70).
+- inline anchor metadata builder
+- offset mapping + DOM insertion helper
+- list item + nested list reconstruction
 
----
+Gate:
 
-#### Step 3 — Level 3 실행
+- `544113141`, `544145591`, `692355151`, `880181257`, `883654669`
+- `544376004` helper/unit case
 
-```bash
-python3 -m pytest tests/test_reconstruction_lossless.py -v --tb=short
-```
+### Phase 4. container reconstruction
 
-**무엇을 확인하는가:** `ac:image` 등 lossy 요소를 재주입한 후 원본 XHTML fragment와 byte-equal인가.
+- callout/details/ADF panel body reconstruction
+- recursive rebuild based on child slot order
 
-**실패 시 수정 위치:** sidecar 생성 로직 — `_process_element()` 또는 `reconstruct_entry()`의 `inline_trailing_html` / `children ref` 처리.
+Gate:
 
----
+- `544112828`
+- `1454342158`
+- `544379140`
+- `panels`
 
-#### Step 4 — Level 4 실행
+### Phase 5. Planner switchover and batch regression
 
-```bash
-cd tests && ./run-tests.sh --type reverse-sync-verify
-```
+- delegate the `patch_builder.py` modified path to the reconstruction planner
+- keep the legacy text-transfer path as a fallback, or remove it
+
+Gate:
+
+- the 16 reverse-sync goldens in `tests/testcases` green
+- pass cases in `tests/reverse-sync/pages.yaml` preserved
 
-**무엇을 확인하는가:** 기존에 통과하던 reverse-sync E2E 케이스가 재구성 경로 전환 후에도 동일하게 통과하는가.
+## 9. Acceptance criteria
 
-**실패 시 수정 위치:** `bin/reverse_sync/patch_builder.py`의 재구성 경로 — Level 1/2/3에서 놓친 블록 타입이 있다는 신호이므로, 해당 케이스를 Level 1 단위 테스트로 먼저 재현하고 수정한다.
+이 설계는 아래를 만족해야 구현 완료로 본다.
 
----
+1. The default path for modified blocks is whole-fragment reconstruction
+2. Paragraph/list anchor handling is implemented against the plain-text coordinate system
+3. The test oracle is fixed as `expected.roundtrip.json` / `page.xhtml` / `expected.reverse-sync.patched.xhtml`, not `mapping.yaml`
+4. XHTML normalization is unified in a shared BeautifulSoup-based helper
+5. `hash + line_range` based identity works even for duplicate content
+6. The existing `tests/testcases` / `tests/reverse-sync` assets can be used as regression gates as-is
 
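-#### Summary of pass criteria
+## 10. Final assessment
 
-| Stage | Pass criterion | What a failure means |
-|------|-----------|-----------|
-| Level 1 | Every block type is reconstructed correctly | Bug in the conversion logic |
-| Level 2 | Document-level assembly is correct | Bug in block order or envelope |
-| Level 3 | Byte-equal after lossy elements are re-injected | Bug in sidecar children/trailing extraction |
-| Level 4 | Every previously passing case still passes | A case was missed in Levels 1-3 |
+The original direction of PR #913 is right. However, relative to its declaration that "we are moving to reconstruction," the previous document under-specified the coordinate system, the oracle, and the separation of sidecar responsibilities that the implementation would depend on.
 
----
+The new design differs in three key ways.
 
-### 8.9 Relationship to existing infrastructure
+- Drop the `convert_inline()` inverse-conversion assumption and adopt the plain-text coordinate system
+- Demote `mapping.yaml` from the oracle role and promote `expected.roundtrip.json` to the central artifact
+- Split the existing testcase assets into design-validation tests and regression tests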
 
-| Existing test | Role | Change after reconstruction |
-|-------------|------|----------------|
-| `run-tests.sh --type convert` | Verifies the XHTML → MDX forward conversion | No change |
-| `run-tests.sh --type reverse-sync` | Compares against expected files | Expected files must be regenerated after Phase 3 completes |
-| `run-tests.sh --type reverse-sync-verify` | Verifies `expected_status` | Used as-is (regression gate) |
-| `byte_verify` | Roundtrip sidecar byte-equality | No change (fast path unchanged) |
-| `test_reconstruction_unit.py` | **New**: block-level reconstruction | Level 1 |
-| `test_reconstruction_lossless.py` | **New**: trailing_html byte-equality | Level 3 |
+Implementing against these criteria makes the end goal, "reconstruct MDX changes into XHTML and update the Confluence document," more stable to build and maintain on top of the current codebase.
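
Note appended after the hunk (not part of the committed document): acceptance criterion 5 above requires `hash + line_range` identity to keep working for duplicate content. The following is a minimal sketch of that idea, assuming SHA-256 as the hash and a `(start_line, end_line)` tuple as the line range. `BlockIdentity`, `block_identity()`, and `index_blocks()` are hypothetical names for illustration and do not exist in the codebase.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class BlockIdentity:
    """Hypothetical identity key: content hash plus MDX line range."""
    content_hash: str
    line_range: Tuple[int, int]


def block_identity(text: str, start_line: int, end_line: int) -> BlockIdentity:
    # Assumption: SHA-256 over the UTF-8 text; the design does not name the hash.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return BlockIdentity(digest, (start_line, end_line))


def index_blocks(blocks: Iterable[Tuple[str, int, int]]) -> Dict[BlockIdentity, str]:
    """Index (text, start_line, end_line) triples by their identity key."""
    index: Dict[BlockIdentity, str] = {}
    for text, start_line, end_line in blocks:
        key = block_identity(text, start_line, end_line)
        if key in index:
            # Same hash AND same line range: identity is genuinely ambiguous.
            raise ValueError(f"ambiguous block identity at lines {key.line_range}")
        index[key] = text
    return index
```

Two blocks with identical text share a hash but still receive distinct keys because their line ranges differ, which a hash-only lookup cannot guarantee.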

From 2fed45c52b107a04b8e2f6a8fe8c17d559984b7a Mon Sep 17 00:00:00 2001
From: JK <jk@chequer.io>
Date: Fri, 13 Mar 2026 21:10:20 +0900
Subject: [PATCH 4/5] =?UTF-8?q?confluence-mdx:=20reverse-sync=20=EC=9E=AC?=
 =?UTF-8?q?=EA=B5=AC=EC=84=B1=20=EC=84=A4=EA=B3=84=20=EA=B2=80=EC=A6=9D=20?=
 =?UTF-8?q?=EA=B2=B0=EA=B3=BC=EB=A5=BC=20=EB=A6=AC=EB=B7=B0=20=EB=AC=B8?=
 =?UTF-8?q?=EC=84=9C=EC=97=90=20=EC=B6=94=EA=B0=80=ED=95=A9=EB=8B=88?=
 =?UTF-8?q?=EB=8B=A4?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

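This verifies the design's soundness, the feasibility of its goals, its architecture, its risks, and its cost versus benefit.
Phases 0-2 can start immediately; before starting Phase 3, we recommend documenting how list item
mismatches are handled and which basis anchor offsets use.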

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 ...verse-sync-reconstruction-design-review.md | 108 ++++++++++++++++++
 1 file changed, 108 insertions(+)

diff --git a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
index 2e487ac83..a8b02493e 100644
--- a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
+++ b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design-review.md
@@ -383,3 +383,111 @@ python3 -m pytest tests/test_reconstruction_helpers.py -v --tb=short
 - **Design direction:** Approvable
 - **Current completeness of the design document:** Needs revision
 - **Plan for securing tests from a TDD standpoint:** Needs strengthening
+
+---
+
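+## Design verification (Claude Code review, 2026-03-13)
+
+### Verification summary
+
+`2026-03-13-reverse-sync-reconstruction-design.md` resolves all three Critical/High issues raised in the v5 review (confusion over the paragraph coordinate system, the unclear source of the oracle, and the lxml dependency) and has been rewritten to a level where implementation can begin. It is also well aligned with the current codebase. However, a few assumptions inside the design still need verification, and practical difficulties are expected in Phases 3-4.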
+
+---
+
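+### 1. Accuracy of the problem diagnosis
+
+Checked against the codebase, the design document's diagnosis of the current system's problems matches the actual code.
+
+**Accurate diagnoses:**
+
+- The accumulated strategy branches in `patch_builder.py` (`direct` / `containing` / `list` / `table` / `skip`), each with its own fallback, do exist (patch_builder.py:88-156).
+- Modification based on `transfer_text_changes()` is confirmed to be the central path. Both the `containing` strategy and delete+add pair handling depend on this function (patch_builder.py:76, 221).
+- There is explicit code that falls back to text transfer instead of regenerating when it finds Confluence-specific elements (`<ac:image>`, `<ac:link>`, and so on) (patch_builder.py:300-309).
+- `SidecarBlock` in `sidecar.py` is confirmed to have no `reconstruction` field. The design document's diagnosis that "the current schema v2 has no reconstruction metadata" is accurate.
+
+**Additional findings:**
+
+The fourth-pass prefix match (`[:20]`) in `generate_sidecar_mapping()` (sidecar.py:306) can produce false positives when text similarity is low. This pattern already caused a bug in the PR history (#853), and the design document is right to state that this path "must no longer be the central axis."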
+
+---
+
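+### 2. Feasibility of the goals
+
+**Achievable goals:**
+
+The design's three core shifts are achievable on the basis of existing codebase assets.
+
+- Removing the `convert_inline()` inverse-conversion assumption → the design explicitly switches to "the reference coordinate system is normalized plain text extracted from the XHTML DOM" (§3.1). `mapping_recorder.py` already generates plain text via `get_text()`, so the foundation for this coordinate system exists.
+- Promoting `expected.roundtrip.json` to primary oracle → we directly confirmed that the `xhtml_fragment` field is present in all 21 current testcases. The precondition for the oracle switch is met.
+- Building a BeautifulSoup-based normalizer → `mapping_recorder.py`, `xhtml_patcher.py`, and `xhtml_beautify_diff.py` already use BeautifulSoup. Consolidating them into a shared `xhtml_normalizer.py` is realistic.
+
+**Goals whose achievement is uncertain:**
+
+- **Paragraph + inline anchor re-injection (§5.4)**: Offset mapping based on `old_plain_offset` is the core. The design does not specify the ordering guarantee when there are two or more anchors, or when a change makes an earlier anchor's offset affect a later anchor. This is likely to become an edge case during implementation.
+- **Zip matching in list reconstruction (§5.5)**: "Index-based zip with the sidecar list item sequence" does not make clear what happens when MDX and XHTML have different numbers of list items (items added or deleted). If the design implicitly treats this case as "child slot count mismatch → fail," practical coverage may be low.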
+
+---
+
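+### 3. Soundness of the architecture
+
+**Sound design decisions:**
+
+- Adding a `replace_fragment` action to `xhtml_patcher.py` is a natural approach. It is consistent with the existing `insert` / `delete` pattern, and its meaning, replacing the DOM fragment wholesale, is clear.
+- Splitting out `reconstruction_planner.py` and turning `patch_builder.py` into a thin orchestration layer directly addresses the current problem that a single function in `patch_builder.py` handles strategy branching, fallback, and grouping all at once.
+- Using `hash + line_range + order` together for block identity precisely resolves finding W-1 from the v5 review.
+- The schema v3 design that adds a `reconstruction` field to `SidecarBlock` is a good one: it puts the test oracle and the runtime metadata in the same artifact. `SidecarBlock.lost_info` already exists with a similar pattern, so the extension is natural.
+
+**Design decisions that need review:**
+
+- **Preserving the callout outer wrapper (§5.6)**: The principle that "preserving the outer wrapper is the responsibility of reconstruction metadata, not `lost_info_patcher`" is right, but `lost_info_patcher.py` already has code that handles callout macros via `_STRUCTURED_MACRO_RE`. The boundary between the two responsibilities must be made explicit during implementation.
+- **Fail-closed policy for opaque blocks (§5.3-D)**: Failing explicitly with `UnsupportedReconstructionError` is the right direction, but today `build_patches()` silently passes with `skip` when it cannot find a mapping. Switching to fail-closed may turn some currently passing cases into failures, which is a regression risk at the Phase 5 switchover.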
+
+---
+
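+### 4. Risks and unreviewed edge cases
+
+**Edge cases the design does not explicitly address:**
+
+1. **Paragraph anchor affinity conflicts**: When `affinity: "after"` anchors appear consecutively, it is unclear whether the second anchor's `old_plain_offset` is relative to the DOM shifted by the first insertion or to the original. The implementation must pin this down as "all offsets are relative to the original XHTML."
+
+2. **Handling list item count mismatches**: Design §5.5 says to "reconstruct recursively based on child type and order," but does not specify what happens when list items are added or deleted in MDX. Given that this is one of the most common change patterns in real reverse-sync, a decision is needed before the Level 3 tests (the Phase 3 gate).
+
+3. **When the sidecar v3 builder generates old_plain_text**: `reconstruction.old_plain_text` is recorded at sidecar build time (during forward convert). If page.xhtml is later modified independently in Confluence, `old_plain_text` can diverge from the actual XHTML. This is currently detectable via `source_xhtml_sha256`, but the design has no handling path for a mismatch.
+
+4. **Scope of promoting `_parse_list_items()` and `_build_list_tree()` to public**: The design states that these private functions will be promoted to a public `parse_list_tree()` API. `_parse_list_items` currently has logic that attaches continuation lines (lines without a marker) to the previous item (emitter.py:182-183); whether this behavior is also intended in the reconstruction context needs to be confirmed.
+
+5. **Attribute order with BeautifulSoup `html.parser`**: When `xhtml_normalizer.py` parses with BeautifulSoup and re-serializes, attribute order can change. To prevent false negatives (spurious mismatches) in fragment comparison, an attribute sorting rule must be specified. `mdx_to_storage_xhtml_verify.py` may already contain this logic, but how to integrate it needs to be checked.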
+
+---
+
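+### 5. Implementation complexity vs. expected benefit
+
+**Costs:**
+
+- Phases 0-2 (normalizer, schema v3, clean block replacement): relatively low complexity. Reuse of existing assets is clear-cut and the test oracle is already in place.
+- Phase 3 (paragraph/list anchor reconstruction): medium-to-high complexity. Much new logic must be written, including the offset mapping algorithm, DOM insertion ordering guarantees, and handling of list item count mismatches.
+- Phase 4 (container reconstruction): high complexity. Callout, details, and ADF panel each have a different outer wrapper structure, and the split of responsibilities with `lost_info_patcher` must be exact.
+- Phase 5 (planner switchover): medium complexity. However, the fail-closed switch carries regression risk, so a careful rollout is needed.
+
+**Expected benefits:**
+
+- Structurally eliminates the current system's fundamental weakness (loss of Confluence elements that fall outside the text coordinates).
+- Removes the silent corruption path that depends on the `transfer_text_changes()` fallback.
+- Moves the test oracle from `mapping.yaml` to `expected.roundtrip.json`, making it clear what the tests are measured against.
+
+**Assessment:**
+
+The ROI of Phases 0-2 is high. Phases 3-4 still offer a good benefit for their complexity, but the edge-case handling decisions must be settled before each phase starts so that implementation effort is not wasted.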
+
+---
+
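+### Overall assessment
+
+All three Critical/High issues from the v5 review are explicitly resolved in the new design document. The key decisions are fixing the paragraph coordinate system as "normalized plain text extracted from the XHTML DOM" and naming `expected.roundtrip.json.xhtml_fragment` as the oracle. Alignment with the current codebase is also good.
+
+**Recommendations:**
+
+1. **Decision required before starting Phase 3**: Specify how list item count mismatches (additions/deletions) are handled. If the rule is "a count mismatch always fails," the test design must explicitly include that case.
+2. **Document the anchor offset basis**: State in the design document that `reconstruction.anchors[].old_plain_offset` is a "cumulative offset relative to the original XHTML." If implementers do not follow this assumption, multi-anchor cases will produce bugs.
+3. **Decide the boundary between `lost_info_patcher` and reconstruction metadata**: Before starting Phase 4, add a code-level boundary to the design document that states which side owns the callout outer wrapper.
+4. **Current completeness of the design document**: It is ready for implementation to begin. Phases 0-2 can start immediately, and Phases 3-4 should start after the items above are addressed.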
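
Note appended after the hunk (not part of the committed document): recommendation 2 above asks that every `old_plain_offset` be interpreted against the original XHTML text. The following is a minimal sketch of why that matters, under the simplifying assumption that the target is a plain string rather than a DOM. `insert_anchors()` is a hypothetical helper, not an existing function in `bin/reverse_sync`.

```python
from typing import List, Tuple


def insert_anchors(plain_text: str, anchors: List[Tuple[int, str]]) -> str:
    """Insert each (offset, fragment) pair; every offset refers to the ORIGINAL text."""
    for offset, _ in anchors:
        if not 0 <= offset <= len(plain_text):
            raise ValueError(f"offset {offset} is outside the original text")

    # Order by (offset, original index) so anchors sharing an offset keep their order.
    ordered = sorted(enumerate(anchors), key=lambda item: (item[1][0], item[0]))

    result = plain_text
    # Apply right-to-left: an insertion never shifts the offsets still to be processed.
    for _, (offset, fragment) in reversed(ordered):
        result = result[:offset] + fragment + result[offset:]
    return result
```

Because insertions are applied from the highest offset downward, no offset needs to be recomputed after an earlier insertion, which is the property the "original coordinates only" rule is meant to guarantee.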

From abe90b3ce3b8698b92c3c35932786b3ac15d4fcd Mon Sep 17 00:00:00 2001
From: JK <jk@chequer.io>
Date: Fri, 13 Mar 2026 21:18:11 +0900
Subject: [PATCH 5/5] docs: add reverse sync cleanup scope

---
 ...3-13-reverse-sync-reconstruction-design.md | 239 ++++++++++++++++++
 1 file changed, 239 insertions(+)

diff --git a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
index fe5d8da7c..c3010079d 100644
--- a/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
+++ b/confluence-mdx/docs/plans/2026-03-13-reverse-sync-reconstruction-design.md
@@ -707,3 +707,242 @@ PR #913의 원래 방향은 맞다. 다만 기존 문서는 "재구성으로 간
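 - Split the existing testcase assets into design-validation tests and regression tests
 
 Implementing against these criteria makes the end goal, "reconstruct MDX changes into XHTML and update the Confluence document," more stable to build and maintain on top of the current codebase.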
+
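+## 11. Scope of legacy code deletion and cleanup
+
+Once the new implementation becomes the default path, a substantial part of the "legacy heuristic text patch path" remains in the codebase as duplication. This section spells out what can be deleted and what must be scaled back in stages.
+
+The deletion principles are simple.
+
+1. Nothing is deleted until the 16 reverse-sync goldens in `tests/testcases` and the `expected_status: pass` gate in `tests/reverse-sync/pages.yaml` are stable on the new path.
+2. Once the new path is green, no legacy implementation with the same responsibility is left behind.
+3. Debug artifacts are removed from default behavior and, if needed, moved behind an explicit debug flag.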
+
+### 11.1 Modules to delete entirely
+
+The modules below are fully superseded once the new design is complete.
+
+#### `bin/reverse_sync/text_transfer.py`
+
+Reasons for deletion:
+
+- It aligns MDX plain text and XHTML plain text character by character and transplants only the modified parts.
+- The new design changes the default path for modified blocks to whole-fragment reconstruction, so this is no longer the central path.
+- Paragraph/list anchor handling is also owned by the plain-text offset + DOM insertion helper, not by this module.
+
+Tests to delete or replace along with it:
+
+- `tests/test_reverse_sync_text_transfer.py`
+- Tests related to `align_chars`, `find_insert_pos`, and `transfer_text_changes` in `tests/test_reverse_sync_cli.py`
+
+#### `bin/reverse_sync/list_patcher.py`
+
+Reasons for deletion:
+
+- It is a dedicated heuristic module that handles lists either by item-level text patches or by regenerating the entire innerHTML.
+- In the new design it is replaced by a reconstructor based on the list tree and sidecar reconstruction metadata.
+- In particular, the `transfer_text_changes()` fallback in `_regenerate_list_from_parent()` conflicts with the philosophy of the new path.
+
+Tests to delete or replace along with it:
+
+- Tests related to `build_list_item_patches` and `split_list_items` in `tests/test_reverse_sync_patch_builder.py`
+- Tests that call `build_list_item_patches` directly in `tests/test_reverse_sync_cli.py`
+
+#### `bin/reverse_sync/table_patcher.py`
+
+Reasons for deletion:
+
+- It is the legacy path that accumulates per-row plain text patches onto the containing block.
+- In the new design, a table is also a clean block subject to whole-fragment replacement.
+- Row-level text patches are no longer worth maintaining.
+
+Tests to delete or replace along with it:
+
+- Tests related to `build_table_row_patches`, `split_table_rows`, and `normalize_table_row` in `tests/test_reverse_sync_patch_builder.py`
+
+#### `bin/reverse_sync/inline_detector.py`
+
+Reasons for deletion:
+
+- It is an auxiliary module that detects inline marker changes in order to select among the existing heuristic branches.
+- The new path does not branch on inline marker change detection; it branches on block kind and reconstruction metadata.
+- Therefore `has_inline_format_change()` / `has_inline_boundary_change()` are no longer needed in the planner design.
+
+Tests to delete or replace along with it:
+
+- Tests related to `has_inline_format_change` and `has_inline_boundary_change` in `tests/test_reverse_sync_patch_builder.py`
+
+### 11.2 Code to partially delete or substantially shrink
+
+Rather than deleting these files outright, shrink their internals after the new path becomes the default.
+
+#### `bin/reverse_sync/patch_builder.py`
+
+Scope to delete:
+
+- `_flush_containing_changes()`
+- `_resolve_mapping_for_change()`
+- The `'direct' | 'containing' | 'list' | 'table' | 'skip'` strategy branches
+- The modified path that calls `transfer_text_changes()`
+- The `new_inner_xhtml` patch path based on `mdx_block_to_inner_xhtml()`
+
+Scope that may remain:
+
+- insert/delete orchestration
+- Insert anchor computation using alignment
+
+Recommended final form:
+
+- `patch_builder.py` effectively becomes a thin wrapper, or
+- it is deleted after its responsibilities move to the new `reconstruction_planner.py` / `reconstructors.py`.
+
+#### `bin/reverse_sync/xhtml_patcher.py`
+
+Scope to delete:
+
+- The `old_plain_text` + `new_plain_text` modify path
+- `_apply_text_changes()` and its associated text-only patch logic
+
+Scope to keep:
+
+- XPath resolve
+- `insert`
+- `delete`
+- The newly added `replace_fragment`
+- CDATA restoration
+
+In other words, this module should shrink from a "text patcher" to a "fragment-level DOM patcher."
+
+Tests to clean up along with it:
+
+- In `tests/test_reverse_sync_xhtml_patcher.py`, remove the `new_plain_text`-centric tests or replace them with `replace_fragment`-centric tests.
+
+#### `bin/reverse_sync/mdx_to_xhtml_inline.py`
+
+Candidate for deletion:
+
+- `mdx_block_to_inner_xhtml()`
+
+Reasons to consider deletion:
+
+- The default emitter in the new design is `mdx_to_storage.emit_block()`.
+- `mdx_block_to_inner_xhtml()` is a legacy layer optimized for innerHTML-level patches.
+
+Recommended direction:
+
+- Short term: only `mdx_block_to_xhtml_element()` may be kept as a compatibility wrapper
+- Final: once the planner uses `emit_block()` directly, the whole module can be deleted
+
+Tests to clean up along with it:
+
+- Shrink the innerHTML-centric tests in `tests/test_reverse_sync_mdx_to_xhtml_inline.py`, or replace them with new reconstruction helper tests.
+
+#### The mapping.yaml layer in `bin/reverse_sync/sidecar.py`
+
+To shrink or delete:
+
+- `SidecarEntry`
+- `SidecarChildEntry`
+- `load_sidecar_mapping()`
+- `build_mdx_to_sidecar_index()`
+- `build_xpath_to_mapping()`
+- `generate_sidecar_mapping()`
+- `find_mapping_by_sidecar()`
+
+Conditions for deletion:
+
+- After `RoundtripSidecar schema v3` has absorbed the top-level routing and reconstruction metadata responsibilities
+- When `reverse_sync_cli.py` and `converter/cli.py` no longer read or write `mapping.yaml`
+
+In other words, this layer is not a target for "immediate deletion" but for "removal once sidecar v3 has settled."
+
+#### Debug artifact paths in `bin/reverse_sync_cli.py`
+
+To shrink or delete:
+
+- `reverse-sync.mapping.original.yaml`
+- `reverse-sync.mapping.patched.yaml`
+- `mapping.yaml` as a runtime intermediate artifact
+
+Recommended direction:
+
+- Do not generate them in the default verify/push path
+- If needed, move them behind an explicit flag such as `--debug-mapping`
+
+#### Automatic `mapping.yaml` generation in `bin/converter/cli.py`
+
+To delete or make optional:
+
+- The entire block that calls `generate_sidecar_mapping()`
+
+Reasons:
+
+- Forward convert success and mapping.yaml generation should be fundamentally decoupled
+- The central artifact of the new design is `expected.roundtrip.json` / sidecar v3, not `mapping.yaml`
+
+### 11.3 Modules to keep
+
+The modules below are kept in the new design.
+
+#### Keep: `bin/reverse_sync/mapping_recorder.py`
+
+Reasons:
+
+- callout / ADF panel child xpath extraction
+- Debugging and fixture analysis
+- nested fragment extraction helper
+
+However, its role is limited to "XHTML analysis/auxiliary tool," not "runtime truth."
+
+#### Keep: `bin/reverse_sync/lost_info_patcher.py`
+
+Reasons:
+
+- Restoring links, emoticons, filenames, images, and adf-extensions is still needed
+- However, it is applied only in the post-process step after the modified fragment is emitted
+
+#### Keep: the roundtrip core of `bin/reverse_sync/sidecar.py`
+
+Scope to keep:
+
+- `RoundtripSidecar`
+- `SidecarBlock`
+- `build_sidecar()`
+- `load_sidecar()`
+- `write_sidecar()`
+- `verify_sidecar_integrity()`
+
+In other words, the direction is not to delete the whole sidecar module but to strip out only its `mapping.yaml` sub-layer.
+
+### 11.4 Scope of test code deletion
+
+After the switch to the new implementation, the following test groups should be removed or replaced.
+
+- `tests/test_reverse_sync_text_transfer.py`
+- Heuristic branch tests in `tests/test_reverse_sync_patch_builder.py`
+- `new_plain_text`-based modify tests in `tests/test_reverse_sync_xhtml_patcher.py`
+- Direct tests of text-transfer helpers inside `tests/test_reverse_sync_cli.py`
+- `mapping.yaml`-only tests in `tests/test_reverse_sync_sidecar.py`
+
+The groups below become the new default set instead.
+
+- `tests/test_reverse_sync_xhtml_normalizer.py`
+- `tests/test_reverse_sync_reconstruction_offsets.py`
+- `tests/test_reverse_sync_reconstruction_insert.py`
+- `tests/test_reverse_sync_reconstruct_paragraph.py`
+- `tests/test_reverse_sync_reconstruct_list.py`
+- `tests/test_reverse_sync_reconstruct_container.py`
+- `tests/test_reverse_sync_reconstruction_goldens.py`
+
+### 11.5 Actual deletion order
+
+Deletion proceeds in the following order.
+
+1. Implement the new reconstruction path
+2. New helper / block / golden / E2E tests are green
+3. Confirm that the legacy heuristic path in `patch_builder.py` is unreachable
+4. Delete `text_transfer.py`, `list_patcher.py`, `table_patcher.py`, and `inline_detector.py`
+5. Delete the related tests
+6. Finally, remove the `mapping.yaml` layer and the debug artifact generation path
+
+Following this order minimizes the time spent in the "new path added + old path lingering" state and quickly reduces maintenance cost.
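
Note appended after the hunk (not part of the committed document): before step 4 of the deletion order above, it can help to check that nothing still imports the four modules slated for removal. The following is a minimal sketch using the standard `ast` module. `find_legacy_imports()` is a hypothetical helper, and a clean result only shows the absence of static imports; it does not by itself prove the heuristic path inside `patch_builder.py` is unreachable.

```python
import ast
from typing import FrozenSet, List

# Module names taken from section 11.1 of the design document.
LEGACY_MODULES: FrozenSet[str] = frozenset(
    {"text_transfer", "list_patcher", "table_patcher", "inline_detector"}
)


def find_legacy_imports(source: str, legacy: FrozenSet[str] = LEGACY_MODULES) -> List[str]:
    """Return the sorted legacy module names that `source` imports statically."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            # Cover both `from pkg.legacy import x` and `from pkg import legacy`.
            names = [module] + [f"{module}.{alias.name}".lstrip(".") for alias in node.names]
        else:
            continue
        for name in names:
            found.update(legacy.intersection(name.split(".")))
    return sorted(found)
```

Running this over each file under `bin/` and failing when the returned list is non-empty would turn step 3 into a repeatable check rather than a manual review.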