Mobile Engineer · Senior-Level · Phone Screen
The first round involved debugging a Transformer model with the following issues:
- Position embedding initialization
- Mask not set to -inf
- Missing loss.backward()
- Incorrect dimensions in the projection layer's nn.Linear module

The follow-up involved KV c...
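
To illustrate the "mask not set to -inf" bug from the first round, here is a minimal pure-Python sketch. The post does not show the interview code, so everything here is an assumption: that the mask is a causal attention mask, that the buggy version filled masked positions with 0, and the names `softmax` and `causal_attention_weights` are hypothetical. It only shows why the fill value matters: a score of 0 still gets a positive weight after softmax, while -inf gets exactly zero.

```python
import math


def softmax(scores):
    # Subtract the row max for numerical stability; math.exp(-inf) is 0.0.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention_weights(scores, fill_value):
    """Overwrite future positions (j > i) with fill_value, then softmax each row.

    Hypothetical helper for illustration; not the code from the interview.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [scores[i][j] if j <= i else fill_value for j in range(n)]
        weights.append(softmax(row))
    return weights


# Made-up attention scores for a 3-token sequence.
scores = [
    [0.5, 1.0, 2.0],
    [0.1, 0.3, 0.9],
    [1.2, 0.4, 0.7],
]

# Assumed buggy variant: masked positions filled with 0 still receive weight.
buggy = causal_attention_weights(scores, 0.0)

# Fixed variant: masked positions filled with -inf receive exactly zero weight.
fixed = causal_attention_weights(scores, float("-inf"))

print("buggy row 0:", buggy[0])
print("fixed row 0:", fixed[0])
```

In a PyTorch model the same fix is usually a one-line change on the score tensor before softmax, for example with `masked_fill` and `float("-inf")`, though the exact form depends on how the mask is represented in the code under test.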