119 changes: 59 additions & 60 deletions docs/examples/dcn.ipynb
Author
After fixing the `mask_value` issue, a new error shows up:

/usr/local/lib/python3.12/dist-packages/keras/src/backend/tensorflow/trainer.py in multi_step_on_iterator(iterator)
    131             if self.steps_per_execution == 1:
    132                 return tf.experimental.Optional.from_value(
--> 133                     one_step_on_data(iterator.get_next())
    134                 )
    135 

ValueError: TensorFlowTrainer._make_function.<locals>.one_step_on_data(data) should not modify its Python input arguments. Modifying a copy is allowed. The following parameter(s) were modified: data

This is due to `features.pop("user_rating")` directly mutating the input `features` dict in the `compute_loss` method of `DCN`.
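The failure mode can be reproduced without TensorFlow. A minimal plain-Python sketch of the two patterns (the dict keys are stand-ins for the notebook's features):

```python
# Sketch of why the original compute_loss fails: popping the label key
# mutates the dict that the training step passed in, which the compiled
# train function forbids. Building a filtered copy leaves the input intact.

def split_mutating(features):
    labels = features.pop("user_rating")  # modifies the caller's dict
    return labels, features

def split_copying(features):
    labels = features["user_rating"]
    feats_no_label = {k: v for k, v in features.items() if k != "user_rating"}
    return labels, feats_no_label

batch = {"movie_id": 1, "user_id": 2, "user_rating": 4.0}

labels, feats = split_copying(batch)
# `batch` still contains "user_rating"; the copy `feats` does not.
```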

@@ -37,22 +37,22 @@
"id": "ikhIvrku-i-L"
},
"source": [
"# Deep \u0026 Cross Network (DCN)\n",
"# Deep & Cross Network (DCN)\n",
"\n",
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/recommenders/examples/dcn\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/recommenders/docs/examples/dcn.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/recommenders/examples/dcn\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/recommenders/blob/main/docs/examples/dcn.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/recommenders/docs/examples/dcn.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
@@ -61,45 +61,45 @@
"id": "Q-rOX95bAye4"
},
"source": [
"This tutorial demonstrates how to use Deep \u0026 Cross Network (DCN) to effectively learn feature crosses.\n",
"This tutorial demonstrates how to use Deep & Cross Network (DCN) to effectively learn feature crosses.\n",
"\n",
"## Background\n",
"\n",
"**What are feature crosses and why are they important?** Imagine that we are building a recommender system to sell a blender to customers. Then, a customer's past purchase history such as `purchased_bananas` and `purchased_cooking_books`, or geographic features, are single features. If one has purchased both bananas **and** cooking books, then this customer will more likely click on the recommended blender. The combination of `purchased_bananas` and `purchased_cooking_books` is referred to as a **feature cross**, which provides additional interaction information beyond the individual features.\n",
"\u003cdiv\u003e\n",
"\u003ccenter\u003e\n",
"\u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/cross_features.gif?raw=true\" width=\"600\"/\u003e\n",
"\u003c/center\u003e\n",
"\u003c/div\u003e\n",
"<div>\n",
"<center>\n",
"<img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/cross_features.gif?raw=true\" width=\"600\"/>\n",
"</center>\n",
"</div>\n",
"\n",
"\n",
"\n",
"\n",
"**What are the challenges in learning feature crosses?** In Web-scale applications, data are mostly categorical, leading to a large and sparse feature space. Identifying effective feature crosses in this setting often requires\n",
"manual feature engineering or exhaustive search. Traditional feed-forward multilayer perceptron (MLP) models are universal function approximators; however, they cannot efficiently approximate even 2nd or 3rd-order feature crosses [[1](https://arxiv.org/pdf/2008.13535.pdf), [2](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/18fa88ad519f25dc4860567e19ab00beff3f01cb.pdf)].\n",
"\n",
"**What is Deep \u0026 Cross Network (DCN)?** DCN was designed to learn explicit and bounded-degree cross features more effectively. It starts with an input layer (typically an embedding layer), followed by a *cross network* containing multiple cross layers that models explicit feature interactions, and then combines\n",
"**What is Deep & Cross Network (DCN)?** DCN was designed to learn explicit and bounded-degree cross features more effectively. It starts with an input layer (typically an embedding layer), followed by a *cross network* containing multiple cross layers that models explicit feature interactions, and then combines\n",
"with a *deep network* that models implicit feature interactions.\n",
"\n",
"\n",
"* Cross Network. This is the core of DCN. It explicitly applies feature crossing at each layer, and the highest\n",
"polynomial degree increases with layer depth. The following figure shows the $(i+1)$-th cross layer.\n",
"\u003cdiv class=\"fig figcenter fighighlight\"\u003e\n",
"\u003ccenter\u003e\n",
" \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/feature_crossing.png?raw=true\" width=\"50%\" style=\"display:block\"\u003e\n",
" \u003c/center\u003e\n",
"\u003c/div\u003e\n",
"<div class=\"fig figcenter fighighlight\">\n",
"<center>\n",
" <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/feature_crossing.png?raw=true\" width=\"50%\" style=\"display:block\">\n",
" </center>\n",
"</div>\n",
"* Deep Network. It is a traditional feedforward multilayer perceptron (MLP).\n",
"\n",
"The deep network and cross network are then combined to form DCN [[1](https://arxiv.org/pdf/2008.13535.pdf)]. Commonly, we could stack a deep network on top of the cross network (stacked structure); we could also place them in parallel (parallel structure). \n",
"\n",
"\n",
"\u003cdiv class=\"fig figcenter fighighlight\"\u003e\n",
"\u003ccenter\u003e\n",
" \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/parallel_deep_cross.png?raw=true\" hspace=\"40\" width=\"30%\" style=\"margin: 0px 100px 0px 0px;\"\u003e\n",
" \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_deep_cross.png?raw=true\" width=\"20%\"\u003e\n",
" \u003c/center\u003e\n",
"\u003c/div\u003e"
"<div class=\"fig figcenter fighighlight\">\n",
"<center>\n",
" <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/parallel_deep_cross.png?raw=true\" hspace=\"40\" width=\"30%\" style=\"margin: 0px 100px 0px 0px;\">\n",
" <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_deep_cross.png?raw=true\" width=\"20%\">\n",
" </center>\n",
"</div>"
]
},
{
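The cross layer described in the hunk above computes `x_{l+1} = x0 * (W @ x_l + b) + x_l`, where `*` is elementwise. A minimal NumPy sketch with random stand-ins for the learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                         # embedding/input width
x0 = rng.normal(size=d)       # the layer-0 input, reused at every layer
W = rng.normal(size=(d, d))   # stand-in for a learned kernel
b = np.zeros(d)               # stand-in for a learned bias

def cross_layer(x0, x, W, b):
    # x_{l+1} = x0 * (W @ x_l + b) + x_l  (elementwise product with x0)
    return x0 * (W @ x + b) + x

x1 = cross_layer(x0, x0, W, b)  # degree-2 interactions of x0
x2 = cross_layer(x0, x1, W, b)  # stacking raises the polynomial degree
```

With `W` and `b` zeroed the layer reduces to the identity, which is the residual connection visible in the figure.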
@@ -621,7 +621,7 @@
" vocabulary = vocabularies[feature_name]\n",
" self._embeddings[feature_name] = tf.keras.Sequential(\n",
" [tf.keras.layers.IntegerLookup(\n",
" vocabulary=vocabulary, mask_value=None),\n",
" vocabulary=vocabulary, mask_token=None),\n",
" tf.keras.layers.Embedding(len(vocabulary) + 1,\n",
" self.embedding_dimension)\n",
" ])\n",
@@ -663,12 +663,11 @@
" return self._logit_layer(x)\n",
"\n",
" def compute_loss(self, features, training=False):\n",
" labels = features.pop(\"user_rating\")\n",
" scores = self(features)\n",
" return self.task(\n",
" labels=labels,\n",
" predictions=scores,\n",
" )"
" labels = features[\"user_rating\"]\n",
" feats_no_label = {k: v for k, v in features.items() if k != \"user_rating\"}\n",
"\n",
" scores = self(feats_no_label)\n",
" return self.task(labels=labels, predictions=scores)"
]
},
{
@@ -758,11 +757,11 @@
},
"source": [
"**DCN (stacked).** We first train a DCN model with a stacked structure, that is, the inputs are fed to a cross network followed by a deep network.\n",
"\u003cdiv\u003e\n",
"\u003ccenter\u003e\n",
"\u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_structure.png?raw=true\" width=\"140\"/\u003e\n",
"\u003c/center\u003e\n",
"\u003c/div\u003e\n"
"<div>\n",
"<center>\n",
"<img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/stacked_structure.png?raw=true\" width=\"140\"/>\n",
"</center>\n",
"</div>\n"
]
},
{
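The stacked structure in the hunk above (and the parallel structure mentioned earlier) can be sketched as plain function composition, with NumPy stand-ins for the learned layers:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
x0 = rng.normal(size=d)
cross_params = [(rng.normal(size=(d, d)), np.zeros(d)) for _ in range(2)]
deep_params = [(rng.normal(size=(d, d)), np.zeros(d)) for _ in range(2)]

def cross_net(x0, params):
    x = x0
    for W, b in params:
        x = x0 * (W @ x + b) + x   # one cross layer
    return x

def deep_net(x, params):
    for W, b in params:
        x = np.maximum(W @ x + b, 0.0)  # one ReLU MLP layer
    return x

# stacked: the cross network's output feeds the deep network
stacked = deep_net(cross_net(x0, cross_params), deep_params)
# parallel: both networks see the input; outputs are concatenated
parallel = np.concatenate([cross_net(x0, cross_params),
                           deep_net(x0, deep_params)])
```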
@@ -785,11 +784,11 @@
"source": [
"**Low-rank DCN.** To reduce the training and serving cost, we leverage low-rank techniques to approximate the DCN weight matrices. The rank is passed in through argument `projection_dim`; a smaller `projection_dim` results in a lower cost. Note that `projection_dim` needs to be smaller than (input size)/2 to reduce the cost. In practice, we've observed using low-rank DCN with rank (input size)/4 consistently preserved the accuracy of a full-rank DCN.\n",
"\n",
"\u003cdiv\u003e\n",
"\u003ccenter\u003e\n",
"\u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/low_rank_dcn.png?raw=true\" width=\"400\"/\u003e\n",
"\u003c/center\u003e\n",
"\u003c/div\u003e\n"
"<div>\n",
"<center>\n",
"<img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/low_rank_dcn.png?raw=true\" width=\"400\"/>\n",
"</center>\n",
"</div>\n"
]
},
{
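The low-rank approximation described above replaces the dense kernel `W` with two factors of width `projection_dim`, cutting `d*d` parameters to `2*d*r`. A NumPy sketch with random stand-ins for the learned factors:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                  # input size d, projection_dim r (r < d/2)
x0 = rng.normal(size=d)
U = rng.normal(size=(d, r))  # stand-ins for the two learned factors
V = rng.normal(size=(d, r))
b = np.zeros(d)

def low_rank_cross(x0, x, U, V, b):
    # W @ x is approximated by U @ (V.T @ x): d*d -> 2*d*r parameters
    return x0 * (U @ (V.T @ x) + b) + x

def full_rank_cross(x0, x, W, b):
    return x0 * (W @ x + b) + x

x1 = low_rank_cross(x0, x0, U, V, b)
# identical to a full-rank layer whose kernel happens to equal U @ V.T
same = full_rank_cross(x0, x0, U @ V.T, b)
```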
@@ -872,14 +871,14 @@
"\n",
"* *Concatenating cross layers.* The inputs are fed in parallel to multiple cross layers to capture complementary feature crosses.\n",
"\n",
"\u003cdiv class=\"fig figcenter fighighlight\"\u003e\n",
"\u003ccenter\u003e\n",
" \u003cimg src=\"https://github.com/tensorflow/recommenders/blob/main/assets/alternate_dcn_structures.png?raw=true\" hspace=40 width=\"600\" style=\"display:block;\"\u003e\n",
" \u003cdiv class=\"figcaption\"\u003e\n",
" \u003cb\u003eLeft\u003c/b\u003e: DCN with a parallel structure; \u003cb\u003eRight\u003c/b\u003e: Concatenating cross layers. \n",
" \u003c/div\u003e\n",
" \u003c/center\u003e\n",
"\u003c/div\u003e"
"<div class=\"fig figcenter fighighlight\">\n",
"<center>\n",
" <img src=\"https://github.com/tensorflow/recommenders/blob/main/assets/alternate_dcn_structures.png?raw=true\" hspace=40 width=\"600\" style=\"display:block;\">\n",
" <div class=\"figcaption\">\n",
" <b>Left</b>: DCN with a parallel structure; <b>Right</b>: Concatenating cross layers. \n",
" </div>\n",
" </center>\n",
"</div>"
]
},
{
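The "concatenating cross layers" variant above feeds the same input to several independent cross layers and concatenates their outputs. A small NumPy sketch, with random stand-ins for each layer's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 3                   # input width d, number of parallel cross layers k
x0 = rng.normal(size=d)
# k independent cross layers, each with its own stand-in parameters
layers = [(rng.normal(size=(d, d)), np.zeros(d)) for _ in range(k)]

def cross_layer(x0, x, W, b):
    return x0 * (W @ x + b) + x

# every layer sees the same input; outputs are concatenated
concat = np.concatenate([cross_layer(x0, x0, W, b) for W, b in layers])
```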
@@ -952,11 +951,11 @@
"\n",
"\n",
"## References\n",
"[DCN V2: Improved Deep \u0026 Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https://arxiv.org/pdf/2008.13535.pdf). \\\n",
"[DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems](https://arxiv.org/pdf/2008.13535.pdf). \\\n",
"*Ruoxi Wang, Rakesh Shivanna, Derek Zhiyuan Cheng, Sagar Jain, Dong Lin, Lichan Hong, Ed Chi. (2020)*\n",
"\n",
"\n",
"[Deep \u0026 Cross Network for Ad Click Predictions](https://arxiv.org/pdf/1708.05123.pdf). \\\n",
"[Deep & Cross Network for Ad Click Predictions](https://arxiv.org/pdf/1708.05123.pdf). \\\n",
"*Ruoxi Wang, Bin Fu, Gang Fu, Mingliang Wang. (AdKDD 2017)*"
]
}