Clubhouse is now using a machine learning model to rank rooms in the hallway. The model makes more nuanced predictions and improves the overall quality of personalized recommendations.
Previously, Clubhouse ranked rooms in the hallway with simple matching rules. Rooms were picked based on how many of a user's friends were in them or how closely they matched the user's selected topics; the recommendations were not generated by a learned model.
The Gradient Boosted Decision Tree (GBDT) ranking model is trained on features that quantify various attributes of a user's activity in past rooms. For instance, the model looks at whether a user spends more time in small private rooms or large public rooms, and whether they speak often or prefer to listen, and uses these signals to rank rooms for that user.
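As a rough illustration of what such activity features might look like, the sketch below turns a user's past room visits into two ratios: the share of time spent in small private rooms and the share of time spent speaking. The `PastRoomVisit` record, the feature names, and the `small_room_max` threshold are all hypothetical; the article does not describe Clubhouse's actual feature definitions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PastRoomVisit:
    """One past room visit by a user (hypothetical schema)."""
    participants: int        # room size while the user was present
    is_private: bool         # private vs. public room
    seconds_in_room: float   # total time the user stayed
    seconds_speaking: float  # portion of that time spent speaking


def user_activity_features(
    visits: List[PastRoomVisit], small_room_max: int = 10
) -> Dict[str, float]:
    """Summarize past activity as time-share features.

    small_room_max is an assumed cutoff for what counts as a "small" room.
    """
    total = sum(v.seconds_in_room for v in visits)
    if total == 0:
        # No history: fall back to neutral zero-valued features.
        return {"small_private_time_share": 0.0, "speaking_time_share": 0.0}

    small_private = sum(
        v.seconds_in_room
        for v in visits
        if v.is_private and v.participants <= small_room_max
    )
    speaking = sum(v.seconds_speaking for v in visits)
    return {
        "small_private_time_share": small_private / total,
        "speaking_time_share": speaking / total,
    }


visits = [
    PastRoomVisit(participants=5, is_private=True,
                  seconds_in_room=600, seconds_speaking=120),
    PastRoomVisit(participants=200, is_private=False,
                  seconds_in_room=200, seconds_speaking=0),
]
features = user_activity_features(visits)
```

Ratios like these are one plausible way to express "prefers small private rooms" or "prefers to listen" as numbers a tree-based model can split on.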
The model also looks at features of the room itself, like its duration and the number of participants, as well as features of the club if it's a club room. The ranking model also accepts user and club embedding vectors as inputs. These embeddings are trained on user-club interaction data to build dense representations of users and clubs.
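One way to picture how these inputs could be combined is a single flat feature vector per (user, room) pair. The sketch below is only an assumption about the layout: the function name, the embedding size, the `club_member_count` feature, and the choice to zero-fill club fields for non-club rooms are all hypothetical.

```python
from typing import List, Optional, Sequence

EMBEDDING_DIM = 4  # hypothetical size; the real dimensionality is not stated


def build_model_input(
    room_duration_min: float,
    participant_count: int,
    user_embedding: Sequence[float],
    club_embedding: Optional[Sequence[float]] = None,
    club_member_count: Optional[int] = None,
) -> List[float]:
    """Concatenate room features, club features, and embeddings."""
    if len(user_embedding) != EMBEDDING_DIM:
        raise ValueError("user_embedding has the wrong dimension")

    is_club_room = club_embedding is not None
    if is_club_room and len(club_embedding) != EMBEDDING_DIM:
        raise ValueError("club_embedding has the wrong dimension")

    # Assumption: non-club rooms get zero-filled club fields plus a flag,
    # so the vector has the same length for every room.
    club_vec = list(club_embedding) if is_club_room else [0.0] * EMBEDDING_DIM
    club_size = (
        float(club_member_count)
        if is_club_room and club_member_count is not None
        else 0.0
    )

    room_features = [
        float(room_duration_min),
        float(participant_count),
        1.0 if is_club_room else 0.0,
        club_size,
    ]
    return room_features + list(user_embedding) + club_vec


club_row = build_model_input(
    35.0, 120, [0.1, -0.2, 0.3, 0.0],
    club_embedding=[0.2, 0.1, -0.4, 0.5], club_member_count=900,
)
open_row = build_model_input(12.0, 8, [0.1, -0.2, 0.3, 0.0])
```

Keeping a fixed-length vector with an explicit club flag lets one model score both club rooms and non-club rooms.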
The model is trained as a classifier, with the target being room joins / non-joins from past room impressions. The classifier score for each room, a value between 0 and 1, is used to determine the relative ordering of the room in the hallway.
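The two halves of this setup, labels derived from impressions and ordering by score, can be sketched as follows. The `Impression` record and the example scores are made up for illustration, and the scores stand in for the output of a trained classifier, which is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Impression:
    """A room shown to a user in the hallway (hypothetical schema)."""
    user_id: str
    room_id: str
    joined: bool  # True if the user joined after seeing the room


def to_training_labels(impressions: List[Impression]) -> List[int]:
    """Binary classification target: 1 for a join, 0 for a non-join."""
    return [1 if imp.joined else 0 for imp in impressions]


def rank_rooms(scores: Dict[str, float]) -> List[str]:
    """Order room IDs by classifier score, highest first."""
    for room_id, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {room_id} is outside [0, 1]")
    return sorted(scores, key=lambda room_id: scores[room_id], reverse=True)


impressions = [
    Impression("u1", "room_a", joined=False),
    Impression("u1", "room_b", joined=True),
    Impression("u1", "room_c", joined=False),
]
labels = to_training_labels(impressions)

# Placeholder scores standing in for a trained model's predictions.
example_scores = {"room_a": 0.12, "room_b": 0.87, "room_c": 0.45}
hallway_order = rank_rooms(example_scores)
```

Only the relative order of the scores matters for the hallway, so the ranking is a sort on the predicted join probability.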
Further, the model training and inference system looks like a traditional ranking system but has some key differences that arise from the fact that the platform is ranking live rooms.