From 2cea765218116fb833630622b006dc96fa779dc0 Mon Sep 17 00:00:00 2001
From: ywuenthought
Date: Tue, 30 Dec 2025 16:55:38 -0600
Subject: [PATCH 1/2] fix: replace `mask_value` with `mask_token`

---
 docs/examples/dcn.ipynb | 108 ++++++++++++++++++++--------------------
 1 file changed, 54 insertions(+), 54 deletions(-)

diff --git a/docs/examples/dcn.ipynb b/docs/examples/dcn.ipynb
index e3125d47..613fa588 100644
--- a/docs/examples/dcn.ipynb
+++ b/docs/examples/dcn.ipynb
@@ -37,22 +37,22 @@
    "id": "ikhIvrku-i-L"
   },
   "source": [
-    "# Deep \u0026 Cross Network (DCN)\n",
+    "# Deep & Cross Network (DCN)\n",
     "\n",
-    "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
-    "  \u003ctd\u003e\n",
-    "    \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/recommenders/examples/dcn\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
-    "  \u003c/td\u003e\n",
-    "  \u003ctd\u003e\n",
-    "    \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
-    "  \u003c/td\u003e\n",
-    "  \u003ctd\u003e\n",
-    "    \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
-    "  \u003c/td\u003e\n",
-    "  \u003ctd\u003e\n",
-    "    \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/recommenders/docs/examples/dcn.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
-    "  \u003c/td\u003e\n",
-    "\u003c/table\u003e"
+    "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
+    "  <td>\n",
+    "    <a target=\"_blank\" href=\"https://www.tensorflow.org/recommenders/examples/dcn\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
+    "  </td>\n",
+    "  <td>\n",
+    "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
+    "  </td>\n",
+    "  <td>\n",
+    "    <a target=\"_blank\" href=\"https://github.com/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
+    "  </td>\n",
+    "  <td>\n",
+    "    <a href=\"https://storage.googleapis.com/tensorflow_docs/recommenders/docs/examples/dcn.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
+    "  </td>\n",
+    "</table>"
    ]
   },
   {
@@ -61,16 +61,16 @@
    "id": "Q-rOX95bAye4"
   },
   "source": [
-    "This tutorial demonstrates how to use Deep \u0026 Cross Network (DCN) to effectively learn feature crosses.\n",
+    "This tutorial demonstrates how to use Deep & Cross Network (DCN) to effectively learn feature crosses.\n",
     "\n",
     "##Background\n",
     "\n",
     "**What are feature crosses and why are they important?** Imagine that we are building a recommender system to sell a blender to customers. Then, a customer's past purchase history such as `purchased_bananas` and `purchased_cooking_books`, or geographic features, are single features. If one has purchased both bananas **and** cooking books, then this customer will more likely click on the recommended blender. The combination of `purchased_bananas` and `purchased_cooking_books` is referred to as a **feature cross**, which provides additional interaction information beyond the individual features.\n",
-    "\u003cdiv\u003e\n",
-    "\u003ccenter\u003e\n",
-    "\u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/cross_features.gif?raw=true\" width=\"600\"/\u003e\n",
-    "\u003c/center\u003e\n",
-    "\u003c/div\u003e\n",
+    "<div>\n",
+    "<center>\n",
+    "<img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/cross_features.gif?raw=true\" width=\"600\"/>\n",
+    "</center>\n",
+    "</div>\n",
     "\n",
     "\n",
     "\n",
@@ -78,28 +78,28 @@
     "**What are the challenges in learning feature crosses?** In Web-scale applications, data are mostly categorical, leading to large and sparse feature space. Identifying effective feature crosses in this setting often requires\n",
     "manual feature engineering or exhaustive search. Traditional feed-forward multilayer perceptron (MLP) models are universal function approximators; however, they cannot efficiently approximate even 2nd or 3rd-order feature crosses [[1](https://arxiv.org/pdf/2008.13535.pdf), [2](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/18fa88ad519f25dc4860567e19ab00beff3f01cb.pdf)].\n",
     "\n",
-    "**What is Deep \u0026 Cross Network (DCN)?** DCN was designed to learn explicit and bounded-degree cross features more effectively. It starts with an input layer (typically an embedding layer), followed by a *cross network* containing multiple cross layers that models explicit feature interactions, and then combines\n",
+    "**What is Deep & Cross Network (DCN)?** DCN was designed to learn explicit and bounded-degree cross features more effectively. It starts with an input layer (typically an embedding layer), followed by a *cross network* containing multiple cross layers that models explicit feature interactions, and then combines\n",
     "with a *deep network* that models implicit feature interactions.\n",
     "\n",
     "\n",
     "* Cross Network. This is the core of DCN. It explicitly applies feature crossing at each layer, and the highest\n",
     "polynomial degree increases with layer depth. The following figure shows the $(i+1)$-th cross layer.\n",
-    "\u003cdiv class=\"fig figcenter fighighlight\"\u003e\n",
-    "\u003ccenter\u003e\n",
-    "  \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/feature_crossing.png?raw=true\" width=\"50%\" style=\"display:block\"\u003e\n",
-    "  \u003c/center\u003e\n",
-    "\u003c/div\u003e\n",
+    "<div class=\"fig figcenter fighighlight\">\n",
+    "<center>\n",
+    "  <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/feature_crossing.png?raw=true\" width=\"50%\" style=\"display:block\">\n",
+    "  </center>\n",
+    "</div>\n",
     "* Deep Network. It is a traditional feedforward multilayer perceptron (MLP).\n",
     "\n",
     "The deep network and cross network are then combined to form DCN [[1](https://arxiv.org/pdf/2008.13535.pdf)]. Commonly, we could stack a deep network on top of the cross network (stacked structure); we could also place them in parallel (parallel structure). \n",
     "\n",
     "\n",
-    "\u003cdiv class=\"fig figcenter fighighlight\"\u003e\n",
-    "\u003ccenter\u003e\n",
-    "  \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/parallel_deep_cross.png?raw=true\" hspace=\"40\" width=\"30%\" style=\"margin: 0px 100px 0px 0px;\"\u003e\n",
-    "  \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_deep_cross.png?raw=true\" width=\"20%\"\u003e\n",
-    "  \u003c/center\u003e\n",
-    "\u003c/div\u003e"
+    "<div class=\"fig figcenter fighighlight\">\n",
+    "<center>\n",
+    "  <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/parallel_deep_cross.png?raw=true\" hspace=\"40\" width=\"30%\" style=\"margin: 0px 100px 0px 0px;\">\n",
+    "  <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_deep_cross.png?raw=true\" width=\"20%\">\n",
+    "  </center>\n",
+    "</div>"
    ]
   },
   {
@@ -621,7 +621,7 @@
     "      vocabulary = vocabularies[feature_name]\n",
     "      self._embeddings[feature_name] = tf.keras.Sequential(\n",
     "          [tf.keras.layers.IntegerLookup(\n",
-    "              vocabulary=vocabulary, mask_value=None),\n",
+    "              vocabulary=vocabulary, mask_token=None),\n",
     "           tf.keras.layers.Embedding(len(vocabulary) + 1,\n",
     "                                     self.embedding_dimension)\n",
     "      ])\n",
@@ -758,11 +758,11 @@
   },
   "source": [
     "**DCN (stacked).** We first train a DCN model with a stacked structure, that is, the inputs are fed to a cross network followed by a deep network.\n",
-    "\u003cdiv\u003e\n",
-    "\u003ccenter\u003e\n",
-    "\u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_structure.png?raw=true\" width=\"140\"/\u003e\n",
-    "\u003c/center\u003e\n",
-    "\u003c/div\u003e\n"
+    "<div>\n",
+    "<center>\n",
+    "<img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_structure.png?raw=true\" width=\"140\"/>\n",
+    "</center>\n",
+    "</div>\n"
    ]
   },
   {
@@ -785,11 +785,11 @@
   "source": [
     "**Low-rank DCN.** To reduce the training and serving cost, we leverage low-rank techniques to approximate the DCN weight matrices. The rank is passed in through argument `projection_dim`; a smaller `projection_dim` results in a lower cost. Note that `projection_dim` needs to be smaller than (input size)/2 to reduce the cost. In practice, we've observed using low-rank DCN with rank (input size)/4 consistently preserved the accuracy of a full-rank DCN.\n",
     "\n",
-    "\u003cdiv\u003e\n",
-    "\u003ccenter\u003e\n",
-    "\u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/low_rank_dcn.png?raw=true\" width=\"400\"/\u003e\n",
-    "\u003c/center\u003e\n",
-    "\u003c/div\u003e\n"
+    "<div>\n",
+    "<center>\n",
+    "<img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/low_rank_dcn.png?raw=true\" width=\"400\"/>\n",
+    "</center>\n",
+    "</div>\n"
    ]
   },
   {
@@ -872,14 +872,14 @@
     "\n",
     "* *Concatenating cross layers.* The inputs are fed in parallel to multiple cross layers to capture complementary feature crosses.\n",
     "\n",
-    "\u003cdiv class=\"fig figcenter fighighlight\"\u003e\n",
-    "\u003ccenter\u003e\n",
-    "  \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/alternate_dcn_structures.png?raw=true\" hspace=40 width=\"600\" style=\"display:block;\"\u003e\n",
-    "  \u003cdiv class=\"figcaption\"\u003e\n",
-    "  \u003cb\u003eLeft\u003c/b\u003e: DCN with a parallel structure; \u003cb\u003eRight\u003c/b\u003e: Concatenating cross layers. \n",
-    "  \u003c/div\u003e\n",
-    "  \u003c/center\u003e\n",
-    "\u003c/div\u003e"
+    "<div class=\"fig figcenter fighighlight\">\n",
+    "<center>\n",
+    "  <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/alternate_dcn_structures.png?raw=true\" hspace=40 width=\"600\" style=\"display:block;\">\n",
+    "  <div class=\"figcaption\">\n",
+    "  <b>Left</b>: DCN with a parallel structure; <b>Right</b>: Concatenating cross layers. \n",
+    "  </div>\n",
+    "  </center>\n",
+    "</div>"
    ]
   },
   {
@@ -952,11 +952,11 @@
     "\n",
     "\n",
     "##References\n",
-    "[DCN V2: Improved Deep \u0026 Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https://arxiv.org/pdf/2008.13535.pdf). \\\n",
+    "[DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https://arxiv.org/pdf/2008.13535.pdf). \\\n",
     "*Ruoxi Wang, Rakesh Shivanna, Derek Zhiyuan Cheng, Sagar Jain, Dong Lin, Lichan Hong, Ed Chi. (2020)*\n",
     "\n",
     "\n",
-    "[Deep \u0026 Cross Network for Ad Click Predictions](https://arxiv.org/pdf/1708.05123.pdf). \\\n",
+    "[Deep & Cross Network for Ad Click Predictions](https://arxiv.org/pdf/1708.05123.pdf). \\\n",
     "*Ruoxi Wang, Bin Fu, Gang Fu, Mingliang Wang. (AdKDD 2017)*"
    ]
  }

From d8bbd1fe78bb4d3756399b8e634e5cb2b0fba8eb Mon Sep 17 00:00:00 2001
From: ywuenthought
Date: Tue, 30 Dec 2025 16:57:42 -0600
Subject: [PATCH 2/2] fix: avoid mutating `features` dict

---
 docs/examples/dcn.ipynb | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/docs/examples/dcn.ipynb b/docs/examples/dcn.ipynb
index 613fa588..d0c3028c 100644
--- a/docs/examples/dcn.ipynb
+++ b/docs/examples/dcn.ipynb
@@ -663,12 +663,11 @@
     "    return self._logit_layer(x)\n",
     "\n",
     "  def compute_loss(self, features, training=False):\n",
-    "    labels = features.pop(\"user_rating\")\n",
-    "    scores = self(features)\n",
-    "    return self.task(\n",
-    "        labels=labels,\n",
-    "        predictions=scores,\n",
-    "    )"
+    "    labels = features[\"user_rating\"]\n",
+    "    feats_no_label = {k: v for k, v in features.items() if k != \"user_rating\"}\n",
+    "\n",
+    "    scores = self(feats_no_label)\n",
+    "    return self.task(labels=labels, predictions=scores)"
    ]
   },
   {
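Reviewer's note on the second patch: the motivation can be shown without TensorFlow. `dict.pop` removes the key from the very dict the caller passed in, so a second loss computation over the same `features` mapping would raise a `KeyError`; building a filtered copy leaves the caller's input intact. A minimal pure-Python sketch of the before/after behavior (the function shapes and the feature keys other than `user_rating` are illustrative, not the TFRS API):

```python
def compute_loss_mutating(features):
    # Mirrors the old notebook code: pop() removes the label in place,
    # mutating the dict the caller handed us.
    labels = features.pop("user_rating")
    return labels, features

def compute_loss_pure(features):
    # Mirrors the patched code: read the label, then build a shallow
    # copy that excludes it, leaving the caller's dict untouched.
    labels = features["user_rating"]
    feats_no_label = {k: v for k, v in features.items() if k != "user_rating"}
    return labels, feats_no_label

features = {"movie_id": 42, "user_id": 7, "user_rating": 4.0}

labels, feats = compute_loss_pure(features)
print("user_rating" in features)  # True: the input dict is unchanged

compute_loss_mutating(features)
print("user_rating" in features)  # False: the label was removed in place
```

The dict comprehension costs one shallow copy per call, which is negligible next to the model's forward pass, and it makes `compute_loss` safe to call repeatedly on the same batch dict.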