Multi-Task Learning Tutorial
This tutorial will introduce how to use multi-task learning models in Torch-RecHub. We'll use Alibaba's e-commerce dataset as an example.
Data Preparation
First, we need to prepare data for multi-task learning:
import pandas as pd
import numpy as np
from rechub.utils import DataGenerator
from rechub.models import *
from rechub.trainers import *
# Load data
df = pd.read_csv("ali_ccp_data.csv")
# Feature definitions
user_features = ['user_id', 'age', 'gender', 'occupation']
item_features = ['item_id', 'category_id', 'shop_id', 'brand_id']
features = user_features + item_features
# Multi-task labels
tasks = ['click', 'conversion'] # CTR and CVR tasks
SharedBottom Model
The most basic multi-task learning model with shared parameters in bottom layers:
# Model configuration
model = SharedBottom(
features=features,
hidden_units=[256, 128],
task_hidden_units=[64, 32],
num_tasks=2,
task_types=['binary', 'binary'])
# Training configuration
trainer = MTLTrainer(
model=model,
optimizer_params={'lr': 0.001},
n_epochs=10)
# Train model
trainer.fit(train_dataloader, val_dataloader)
ESMM (Entire Space Multi-Task Model)
A multi-task model that addresses sample selection bias:
# Model configuration
model = ESMM(
features=features,
hidden_units=[256, 128, 64],
tower_units=[32, 16],
embedding_dim=16)
# Training configuration
trainer = MTLTrainer(
model=model,
optimizer_params={'lr': 0.001},
n_epochs=10)
MMoE (Multi-gate Mixture-of-Experts)
Implements soft parameter sharing between tasks through expert mechanism:
# Model configuration
model = MMoE(
features=features,
expert_units=[256, 128],
num_experts=8,
num_tasks=2,
expert_activation='relu',
gate_activation='softmax')
# Training configuration
trainer = MTLTrainer(
model=model,
optimizer_params={'lr': 0.001},
n_epochs=10)
PLE (Progressive Layered Extraction)
Better models task relationships through layered extraction:
# Model configuration
model = PLE(
features=features,
expert_units=[256, 128],
num_experts=4,
num_layers=3,
num_shared_experts=2,
task_types=['binary', 'binary'])
# Training configuration
trainer = MTLTrainer(
model=model,
optimizer_params={'lr': 0.001},
n_epochs=10)
Task Weight Optimization
GradNorm
Use GradNorm algorithm to dynamically adjust task weights:
# Configure GradNorm
trainer = MTLTrainer(
model=model,
optimizer_params={'lr': 0.001},
task_weights_strategy='gradnorm',
gradnorm_alpha=1.5)
MetaBalance
Use MetaBalance optimizer to balance task gradients:
from rechub.utils import MetaBalance
# Configure MetaBalance optimizer
optimizer = MetaBalance(
model.parameters(),
relax_factor=0.7,
beta=0.9)
trainer = MTLTrainer(
model=model,
optimizer=optimizer)
Model Evaluation
Use appropriate evaluation metrics for different tasks:
# Evaluate model
results = evaluate_multi_task(model, test_dataloader)
for task, metrics in results.items():
print(f"Task: {task}")
print(f"AUC: {metrics['auc']:.4f}")
print(f"LogLoss: {metrics['logloss']:.4f}")
Advanced Applications
-
Custom Task Loss Weights
trainer = MTLTrainer( model=model, task_weights=[1.0, 0.5]) # Set fixed task weights
-
Get Shared and Task-Specific Layers
from rechub.utils import shared_task_layers shared_params, task_params = shared_task_layers(model)
-
Task-Specific Learning Rates
trainer = MTLTrainer( model=model, task_specific_lr={'click': 0.001, 'conversion': 0.0005})
Important Notes
- Choose appropriate multi-task learning architecture
- Pay attention to task correlations
- Handle data imbalance between tasks
- Set reasonable task weights
- Monitor training progress for each task
- Prevent negative transfer between tasks
- Consider computational resource constraints