Configuring Out-of-Network Retrieval
Out-of-network retrieval (Phoenix Retrieval) allows the algorithm to surface content from the global corpus, even when the viewing user does not follow the author. It uses a two-tower architecture, powered by the Grok-based transformer, to match user interests with candidate posts in a shared embedding space.
This guide walks you through configuring the Phoenix Retrieval model, setting up the hash-based embeddings, and initializing the two-tower system.
1. Configure Hash-based Embeddings
The retrieval model uses hash-based embeddings to handle the massive scale of users and posts without maintaining a fixed vocabulary. You first need to define a HashConfig to specify how many hash functions to use for different entity types.
```python
from recsys_model import HashConfig

# Configure the number of hash functions for each entity
hash_config = HashConfig(
    num_user_hashes=2,
    num_item_hashes=2,
    num_author_hashes=2,
)
```
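The embedding lookup that HashConfig drives is internal to the model, but the multi-hash idea can be sketched: each entity ID is hashed with several independent functions, and the resulting bucket embeddings are combined, here by summing. All names and sizes below are illustrative, not the library's actual implementation.

```python
import numpy as np

NUM_BUCKETS = 1024  # illustrative table size, not from the source
EMB_SIZE = 64

# One embedding table per hash function (two here, matching num_user_hashes=2)
rng = np.random.default_rng(0)
tables = [rng.normal(size=(NUM_BUCKETS, EMB_SIZE)) for _ in range(2)]

def hash_bucket(entity_id: int, seed: int) -> int:
    # Stand-in multiplicative hash; the real system would use a proper hash family
    return (entity_id * 2654435761 + seed) % NUM_BUCKETS

def multi_hash_embedding(entity_id: int) -> np.ndarray:
    # Sum the bucket embeddings produced by each hash function
    return sum(tables[i][hash_bucket(entity_id, seed=i)] for i in range(2))

vec = multi_hash_embedding(123456789)  # shape (64,)
```

Because each ID draws from several independent buckets, two entities rarely collide in every table, so the model can distinguish a huge ID space without a fixed vocabulary.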
2. Define the Retrieval Model Configuration
The PhoenixRetrievalModelConfig is the primary interface for tuning the retrieval system. It requires a TransformerConfig (based on the Grok architecture) and sequence length parameters.
```python
from grok import TransformerConfig
from recsys_retrieval_model import PhoenixRetrievalModelConfig

retrieval_config = PhoenixRetrievalModelConfig(
    emb_size=64,           # Shared embedding dimension
    history_seq_len=128,   # Number of historical actions to consider
    candidate_seq_len=32,  # Number of candidates to process in a batch
    hash_config=hash_config,
    model=TransformerConfig(
        emb_size=64,
        widening_factor=2,
        key_size=32,
        num_q_heads=2,
        num_kv_heads=2,
        num_layers=1,
        attn_output_multiplier=0.125,
    ),
).initialize()
```
Key Parameters:
- emb_size: The dimensionality of the vector space where users and posts are compared.
- history_seq_len: Determines how much of the user's recent engagement history is fed into the User Tower.
- model: The underlying Grok transformer configuration that processes the user sequence.
3. Initialize the Two-Tower Model
The model follows a "Two-Tower" design:
- User Tower: Encodes the user's features and history into a single vector.
- Candidate Tower: Projects post and author data into the same vector space.
You can initialize the model using Haiku.
```python
import haiku as hk

def retrieval_forward_fn(batch, embeddings):
    model = retrieval_config.make()
    return model(batch, embeddings)

# Transform the function for JAX usage
retrieval_model = hk.transform(retrieval_forward_fn)
```
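The geometry of the two-tower design can be illustrated without the library. A minimal numpy sketch, assuming simple linear towers and made-up feature sizes (nothing here reflects the real feature layout):

```python
import numpy as np

rng = np.random.default_rng(0)
EMB = 64  # shared embedding dimension, matching emb_size above

# Hypothetical linear towers: each projects its inputs into the shared space
W_user = rng.normal(size=(256, EMB))  # user features      -> embedding
W_cand = rng.normal(size=(128, EMB))  # candidate features -> embedding

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

user_feats = rng.normal(size=(1, 256))     # one user
cand_feats = rng.normal(size=(1000, 128))  # 1000 candidate posts

user_vec = l2_normalize(user_feats @ W_user)   # [1, EMB]
cand_vecs = l2_normalize(cand_feats @ W_cand)  # [1000, EMB]

# Retrieval reduces to dot products in the shared space
scores = (user_vec @ cand_vecs.T).ravel()  # [1000] cosine similarities
top_k = np.argsort(-scores)[:10]           # ten best candidates
```

Because both towers normalize their outputs, every score lands in [-1, 1], and ranking candidates by dot product is the same as ranking by cosine similarity.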
4. Understanding Retrieval Output
When you run a forward pass, the model returns a RetrievalOutput. This contains the normalized user representation which you can then use for Approximate Nearest Neighbor (ANN) search against your post corpus.
```python
# The model produces L2-normalized representations,
# enabling efficient dot-product similarity search.
output = retrieval_model.apply(params, rng, batch, embeddings)

user_vec = output.user_representation  # [Batch, EmbSize]
scores = output.top_k_scores           # Top-K scores for candidates in the batch
```
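In production the user representation feeds an ANN index; on small data, a brute-force top-k search over a hypothetical pre-embedded corpus behaves the same way and shows what the ANN service computes:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pre-embedded post corpus, L2-normalized ahead of time
corpus = rng.normal(size=(10000, 64))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

# A single normalized user representation, as returned by the model
user_vec = rng.normal(size=(64,))
user_vec /= np.linalg.norm(user_vec)

k = 100
scores = corpus @ user_vec  # cosine similarity, since both sides are unit-norm

# argpartition finds the k best in O(n); then sort just those k by score
top_k_idx = np.argpartition(-scores, k)[:k]
top_k_idx = top_k_idx[np.argsort(-scores[top_k_idx])]
```

An ANN index (e.g. HNSW or IVF-based) replaces the exhaustive matrix product with an approximate search, trading a small amount of recall for large speedups at corpus scale.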
5. Configuring the Candidate Tower
The Candidate Tower projects post and author embeddings into the shared space and L2-normalizes them. This normalization is critical: it is what makes the dot-product calculation equivalent to cosine similarity.
If you are extending the system, you can configure the CandidateTower independently for offline embedding generation:
```python
from recsys_retrieval_model import CandidateTower

def candidate_projection_fn(post_author_embedding):
    tower = CandidateTower(emb_size=64)
    return tower(post_author_embedding)

candidate_tower = hk.transform(candidate_projection_fn)
```
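The normalization argument can be checked directly: the dot product of two L2-normalized vectors equals the cosine similarity of the originals.

```python
import numpy as np

a = np.array([3.0, 4.0, 0.0])
b = np.array([1.0, 2.0, 2.0])

# Cosine similarity of the raw vectors: (a . b) / (|a| |b|) = 11 / (5 * 3)
cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Dot product after normalizing each vector to unit length
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
dot = a_n @ b_n

# cos and dot agree: normalizing up front lets retrieval use plain dot products
```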
6. Integration with Home Mixer
Once configured, the Phoenix Retrieval model is plugged into the Candidate Pipeline. The Home Mixer orchestration layer calls this model to fetch a set of global candidates, which are then combined with the in-network results from Thunder before being passed to the final ranking stage.
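The exact orchestration lives in Home Mixer and is out of scope for this guide, but the merge step can be sketched as a dedup-and-concatenate over the two candidate sources. The Candidate class and both input lists below are hypothetical, purely for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    post_id: int
    source: str

def merge_candidates(out_of_network, in_network):
    """Combine Phoenix (out-of-network) and Thunder (in-network) results,
    dropping duplicate post IDs while preserving first-seen order."""
    seen, merged = set(), []
    for cand in out_of_network + in_network:
        if cand.post_id not in seen:
            seen.add(cand.post_id)
            merged.append(cand)
    return merged

oon = [Candidate(1, "phoenix"), Candidate(2, "phoenix")]
inn = [Candidate(2, "thunder"), Candidate(3, "thunder")]
merged = merge_candidates(oon, inn)  # posts 1, 2, 3; post 2 kept once
```

The merged pool is what the final ranking stage scores; deduplication matters because a followed author's post can also surface through global retrieval.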
Best Practices
- Normalization: The system relies on L2 normalization. Ensure that any custom candidate embeddings are normalized before performing similarity searches.
- Sequence Length: If you increase history_seq_len, expect higher memory usage in the Grok transformer, but potentially better long-term interest matching.
- Hashing: Use different seeds for the hash functions in HashConfig to ensure diverse feature representations.
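The seeding advice can be illustrated with a toy seeded hash. The function below is a made-up stand-in, not the hash family HashConfig actually uses; it only shows why distinct seeds matter:

```python
def seeded_bucket(entity_id: int, seed: int, num_buckets: int = 1024) -> int:
    # Illustrative multiplicative hash; the real hash family may differ
    return ((entity_id * 2654435761) ^ (seed * 40503)) % num_buckets

# Distinct seeds route the same entity to different buckets, so each
# embedding table sees an independent partition of the ID space
buckets = [seeded_bucket(123456789, seed) for seed in range(2)]
```

If every hash function shared one seed, all tables would collide on exactly the same entity pairs, and the extra tables would add capacity without adding any disambiguation.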