Thompson Sampling · Embedding Bandits
Shared linear model over arm embeddings — pulls on one arm improve estimates for similar arms.
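A minimal sketch of how a shared linear model can spread information across arms, assuming Gaussian rewards and a conjugate Gaussian prior on the weight vector. The class name LinearTS and its methods are hypothetical, and the page does not confirm that the demo uses this exact update.

```python
import numpy as np

class LinearTS:
    """Bayesian linear regression shared by all arms, with Thompson sampling."""

    def __init__(self, dim, prior_var=1.0, noise_sigma=0.3, seed=0):
        self.noise_var = noise_sigma ** 2
        self.precision = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)                    # precision-weighted mean
        self.rng = np.random.default_rng(seed)

    def posterior(self):
        """Return the posterior mean and covariance of the weight vector."""
        cov = np.linalg.inv(self.precision)
        cov = (cov + cov.T) / 2                   # remove numerical asymmetry
        return cov @ self.b, cov

    def choose(self, embeddings):
        """Sample weights from the posterior and pick the best-looking arm."""
        mean, cov = self.posterior()
        theta = self.rng.multivariate_normal(mean, cov)
        return int(np.argmax(embeddings @ theta))

    def update(self, x, reward):
        """Fold one observed (embedding, reward) pair into the posterior."""
        self.precision += np.outer(x, x) / self.noise_var
        self.b += x * reward / self.noise_var

# One pull on arm [1, 0] shifts the estimate for a nearby arm,
# while an orthogonal arm keeps its prior estimate of zero.
model = LinearTS(dim=2)
model.update(np.array([1.0, 0.0]), reward=1.0)
mean, _ = model.posterior()
near = np.array([0.9, 0.1]) @ mean
far = np.array([0.0, 1.0]) @ mean
```

Because every arm's estimate is the dot product of its embedding with the same weight vector, a single observation updates all arms at once, in proportion to how closely their embeddings align with the pulled arm.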
Parameters
Arms: 200
Embedding dim: 6
Noise σ: 0.30
Feedback rate: 0.30 (probability of observing the reward after each pull)
Prior variance: 1.00
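The parameters above could drive a simulation along the following lines. This is a sketch under stated assumptions: the function simulate is hypothetical, arm embeddings are taken to be unit-norm Gaussian vectors, and the true weights are drawn from the prior, none of which the page specifies. The feedback rate is modeled as an independent coin flip per pull that decides whether the reward reaches the learner.

```python
import numpy as np

def simulate(rounds, n_arms=200, dim=6, noise_sigma=0.30,
             feedback_rate=0.30, prior_var=1.00, seed=0):
    """Linear Thompson sampling with randomly missing feedback.

    Returns (cumulative_regret, n_observed).
    """
    rng = np.random.default_rng(seed)
    # Assumption: unit-norm Gaussian embeddings, true weights from the prior.
    arms = rng.normal(size=(n_arms, dim))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    theta_true = rng.normal(scale=np.sqrt(prior_var), size=dim)
    true_means = arms @ theta_true
    best = true_means.max()

    precision = np.eye(dim) / prior_var   # posterior precision
    b = np.zeros(dim)                     # precision-weighted mean
    noise_var = noise_sigma ** 2

    regret = np.zeros(rounds)
    n_observed = 0
    for t in range(rounds):
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2           # remove numerical asymmetry
        theta = rng.multivariate_normal(cov @ b, cov)
        a = int(np.argmax(arms @ theta))
        # Regret accrues on every pull, whether or not the reward is seen.
        regret[t] = best - true_means[a]
        if rng.random() < feedback_rate:
            r = true_means[a] + rng.normal(scale=noise_sigma)
            x = arms[a]
            precision += np.outer(x, x) / noise_var
            b += x * r / noise_var
            n_observed += 1
    return np.cumsum(regret), n_observed

cum_regret, n_observed = simulate(rounds=500)
avg_regret = cum_regret[-1] / len(cum_regret)
```

With a feedback rate of 0.30, only about three pulls in ten update the posterior, so the learner keeps acting on a stale posterior between observations.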
Controls
Reset · +1 · +10 · +100 · +500 · ▶ Auto-Run
Statistics
Round: 0
Observed: 0
TS avg regret: 0.00
Best arm %: —
Cumulative Regret
Embedding Space (PCA → 2D)
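One common way to obtain a 2D view like this is to project the embeddings onto their top two principal components. The sketch below does so with an SVD; the function pca_2d and the random embeddings are hypothetical stand-ins, since the page does not show how its projection is computed.

```python
import numpy as np

def pca_2d(embeddings):
    """Project the rows of `embeddings` onto their top two principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
points = pca_2d(rng.normal(size=(200, 6)))  # hypothetical 200 arms, dim 6
```

Each row of points is an (x, y) coordinate for one arm. Arms that are close in the original six dimensions tend to stay close in the plot, although a 2D projection cannot preserve all distances.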
Top Arms (by posterior estimate)
# | Est vs True | True | Est | Pulls
Event Log