DIEN Tutorial
1. Model Overview and Use Cases
DIEN (Deep Interest Evolution Network), proposed by Alibaba at AAAI 2019, is an evolution of DIN. DIEN introduces an Interest Extractor (GRU) and an Interest Evolution module (AUGRU) to model how user interests change over time, instead of only doing a weighted sum over historical behaviors.
Paper: Deep Interest Evolution Network for Click-Through Rate Prediction
Model Architecture
Note: DIEN contains dynamic GRU-based computation, so torchview cannot fully trace its internal graph at the moment.
- Interest Extractor Layer: uses GRU to model user behavior sequences
- Auxiliary Loss: uses the next-step behavior as an auxiliary supervision signal
- Interest Evolution Layer: uses AUGRU with target-aware attention to model interest evolution relevant to the current target item
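The AUGRU idea in the last bullet can be sketched as a GRU cell whose update gate is scaled by the target-aware attention score, so behaviors irrelevant to the candidate item barely move the hidden state. This is a minimal, self-contained sketch of the gate equations from the DIEN paper, not torch-rechub's actual implementation:

```python
import torch
import torch.nn as nn

class AUGRUCell(nn.Module):
    """Minimal AUGRU sketch: a GRU cell whose update gate z is
    multiplied by an attention score before the state update."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.x2h = nn.Linear(input_size, 3 * hidden_size)
        self.h2h = nn.Linear(hidden_size, 3 * hidden_size)

    def forward(self, x, h, att_score):
        # x: (B, input_size), h: (B, H), att_score: (B, 1) in [0, 1]
        gx_r, gx_z, gx_n = self.x2h(x).chunk(3, dim=-1)
        gh_r, gh_z, gh_n = self.h2h(h).chunk(3, dim=-1)
        r = torch.sigmoid(gx_r + gh_r)      # reset gate
        z = torch.sigmoid(gx_z + gh_z)      # update gate
        n = torch.tanh(gx_n + r * gh_n)     # candidate state
        z = att_score * z                   # attention scales the update gate
        return (1 - z) * h + z * n

cell = AUGRUCell(8, 16)
x = torch.randn(4, 8)
h = torch.randn(4, 16)
h_next = cell(x, h, att_score=torch.full((4, 1), 0.5))
```

With att_score = 0 the state is unchanged, which is exactly how AUGRU suppresses behaviors unrelated to the target item.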
Suitable Scenarios
- CTR prediction
- News / e-commerce scenarios where user interests change clearly over time
- Sequential recommendation tasks with temporally ordered behavior data
2. Data Preparation and Preprocessing
DIEN uses almost the same data pipeline as DIN and also relies on the sampled Amazon Electronics dataset. The additional input is history_labels, which marks whether each historical behavior is a positive signal.
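Conceptually, generate_seq_feature sorts each user's behaviors by timestamp and turns them into history/target pairs. A minimal pandas sketch of that ordering step, using a toy log with made-up values (not the real Amazon data):

```python
import pandas as pd

# Toy behavior log (values are illustrative only).
log = pd.DataFrame({
    "user_id":   [1, 1, 1, 2, 2],
    "item_id":   [10, 11, 12, 20, 21],
    "timestamp": [3, 1, 2, 1, 2],
})

# Sort by time, then collect each user's items in chronological order,
# which is the raw material for hist_item_id sequences.
seqs = (log.sort_values("timestamp")
           .groupby("user_id")["item_id"]
           .apply(list))
print(seqs[1])  # [11, 12, 10]
```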
import pandas as pd
import torch
from torch_rechub.basic.features import SparseFeature, SequenceFeature
from torch_rechub.utils.data import DataGenerator, df_to_dict, generate_seq_feature
# Load data
data = pd.read_csv("examples/ranking/data/amazon-electronics/amazon_electronics_sample.csv")
# Automatically generate sequence features ordered by timestamp.
train, val, test = generate_seq_feature(
data=data,
user_col="user_id",
item_col="item_id",
time_col="timestamp",
item_attribute_cols=["cate_id"],
min_item=0,
shuffle=True,
)
user_num = data["user_id"].max() + 1
item_num = data["item_id"].max() + 1
cate_num = data["cate_id"].max() + 1
Define Features
# Base features (target item + user profile features)
features = [
SparseFeature("user_id", vocab_size=user_num, embed_dim=16),
SparseFeature("gender", vocab_size=data["gender"].max() + 1, embed_dim=8),
SparseFeature("age", vocab_size=data["age"].max() + 1, embed_dim=8),
]
# History sequence features
history_features = [
SequenceFeature("hist_item_id", vocab_size=item_num, embed_dim=16, pooling="concat", shared_with="item_id"),
SequenceFeature("hist_cate_id", vocab_size=cate_num, embed_dim=16, pooling="concat", shared_with="cate_id"),
]
target_features = [
SparseFeature("item_id", vocab_size=item_num, embed_dim=16),
SparseFeature("cate_id", vocab_size=cate_num, embed_dim=16),
]
Build DataLoaders
x_train, y_train = df_to_dict(train), train["label"].values
x_val, y_val = df_to_dict(val), val["label"].values
x_test, y_test = df_to_dict(test), test["label"].values
dg = DataGenerator(x_train, y_train)
train_dl, val_dl, test_dl = dg.generate_dataloader(
x_val=x_val,
y_val=y_val,
x_test=x_test,
y_test=y_test,
batch_size=1024,
)
3. Model Configuration and Parameter Notes
3.1 Create the Model
from torch_rechub.models.ranking import DIEN
# `history_labels` marks whether each historical behavior is a positive signal.
# In this simplified tutorial we use all ones. In real applications this should
# come from business labels such as click / purchase / watch / favorite.
history_labels = torch.ones(len(history_features))
model = DIEN(
features=features,
history_features=history_features,
target_features=target_features,
mlp_params={"dims": [256, 128], "dropout": 0.2, "activation": "dice"},
history_labels=history_labels,
alpha=0.2,
)
3.2 Parameter Details
- history_labels: fixed model-level configuration used by the auxiliary loss
- alpha: weight of the auxiliary loss term
- pooling="concat" is still required because DIEN needs the raw sequence
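The role of alpha can be sketched with stand-in tensors: the auxiliary loss asks each GRU hidden state to score the user's real next behavior higher than a randomly sampled one, and alpha weights that term against the main CTR loss. The tensors below are hypothetical placeholders for DIEN internals, not torch-rechub's actual code path:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, T, H = 4, 5, 16  # batch size, sequence length, hidden size

h = torch.randn(B, T - 1, H)    # GRU hidden states h_1 .. h_{T-1}
pos = torch.randn(B, T - 1, H)  # embeddings of the real next behaviors
neg = torch.randn(B, T - 1, H)  # embeddings of sampled negatives

# Binary cross-entropy on inner products: positives toward 1, negatives toward 0.
pos_logit = (h * pos).sum(-1)
neg_logit = (h * neg).sum(-1)
aux_loss = (F.binary_cross_entropy_with_logits(pos_logit, torch.ones_like(pos_logit))
            + F.binary_cross_entropy_with_logits(neg_logit, torch.zeros_like(neg_logit)))

ctr_loss = torch.tensor(0.3)    # placeholder for the main CTR loss
alpha = 0.2
total_loss = ctr_loss + alpha * aux_loss
```

A large alpha makes the sequence model fit next-step prediction at the expense of the click objective, which is why the tuning section recommends adjusting it carefully.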
4. Training Process and Code Example
import os
from torch_rechub.trainers import CTRTrainer
os.makedirs("./saved/dien", exist_ok=True)
trainer = CTRTrainer(
model,
optimizer_params={"lr": 1e-3, "weight_decay": 1e-5},
n_epoch=3,
earlystop_patience=2,
device="cpu",
model_path="./saved/dien",
)
trainer.fit(train_dl, val_dl)
5. Evaluation and Result Analysis
auc = trainer.evaluate(trainer.model, test_dl)
print(f"DIEN test AUC: {auc:.4f}")
- DIEN is often more suitable than DIN when long-term sequential evolution matters.
- It is also heavier than DIN, so the extra gain depends on sequence quality and data scale.
6. Tuning Suggestions
- Tune alpha carefully. If the auxiliary loss is too strong, the main objective can suffer.
- DIEN is more computationally expensive than DIN, so start from smaller hidden sizes and shorter sequences.
- If the dataset is small, DIN may already be a better trade-off.
7. FAQ and Troubleshooting
Q1: What is the core difference between DIEN and DIN?
DIN uses target attention over historical behaviors, while DIEN further models how interests evolve with GRU / AUGRU.
Q2: How should history_labels be obtained?
They should be derived from the business definition of positive and negative feedback for each step in the historical sequence.
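One common pattern is to map behavior types to labels with a lookup table. The event log and the positive/negative mapping below are illustrative, not part of torch-rechub; the actual mapping depends on your business definition:

```python
import pandas as pd

# Hypothetical event log with behavior types.
events = pd.DataFrame({
    "item_id": [3, 7, 7, 9],
    "behavior": ["view", "click", "purchase", "view"],
})

# Illustrative mapping: strong engagement counts as positive feedback.
POSITIVE = {"click": 1, "purchase": 1, "favorite": 1, "view": 0}
events["history_label"] = events["behavior"].map(POSITIVE)
print(events["history_label"].tolist())  # [0, 1, 1, 0]
```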
8. Model Visualization
DIEN contains GRU-based dynamic computation, so automatic graph visualization is limited compared with plain MLP models.
9. ONNX Export
trainer.export_onnx(
"./saved/dien/dien.onnx",
data_loader=test_dl,
dynamic_batch=True,
)
Full Example
The snippets above form a complete runnable example. Use them together with the same Amazon Electronics preprocessing flow shown in examples/ranking/run_amazon_electronics.py.
