Touch Clarity (www.touchclarity.com) provides real-time optimisation of websites. From a number of candidate options, Touch Clarity chooses the most popular content to display on a page. The decision is made by tracking how many visitors respond to each option by clicking on it. This is a direct commercial application of the multi-armed bandit problem: each item that might be shown is a separate bandit with its own response rate.
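The click-tracking idea can be illustrated with a minimal Python sketch. It is not Touch Clarity's implementation: the class name `ResponseTracker` and its methods are hypothetical, and the response rate is assumed to be estimated simply as clicks divided by impressions.

```python
from collections import defaultdict


class ResponseTracker:
    """Hypothetical helper: counts impressions and clicks per item."""

    def __init__(self):
        self.shown = defaultdict(int)    # times each item was displayed
        self.clicked = defaultdict(int)  # times each item was clicked

    def record(self, item, clicked):
        """Log one impression of `item` and whether the visitor clicked."""
        self.shown[item] += 1
        if clicked:
            self.clicked[item] += 1

    def rate(self, item):
        """Estimated response rate; 0.0 for an item never shown (an assumption)."""
        n = self.shown[item]
        return self.clicked[item] / n if n else 0.0

    def most_popular(self, items):
        """Item with the highest estimated response rate so far."""
        return max(items, key=self.rate)


tracker = ResponseTracker()
for item, was_clicked in [("a", True), ("a", False), ("b", False), ("b", False)]:
    tracker.record(item, was_clicked)
best = tracker.most_popular(["a", "b"])
```

Always serving `best` is pure exploitation; the estimate for an item that is never shown never improves.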
As in the multi-armed bandit problem, there is a trade-off between exploration and exploitation: items other than the current most popular must sometimes be served so that their response rates are measured precisely enough to identify correctly which item is the most popular. This application has a further complication, however. The response rate of each item typically varies over time, so exploration must continue indefinitely to track this variation as old knowledge becomes out of date. An extreme example is choosing which news story to serve as the main story on a news page: interest in one story will decrease over time while interest in another increases. In addition, interest in several stories might vary in a similar, coherent way, for example a general increase in interest in sports stories at weekends, or in political stories near an election. So there are typically two types of variation to consider: response rates that vary together, and response rates that vary completely independently.
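One simple way to combine continuous exploration with the forgetting of old data is an epsilon-greedy rule over exponentially discounted click counts. The sketch below is only an illustration under those assumptions and is not the method used by Touch Clarity. The class name `DiscountedEpsilonGreedy` and the parameters `epsilon` and `discount` are hypothetical. It treats every item independently, so it does not model response rates that vary together.

```python
import random


class DiscountedEpsilonGreedy:
    """Hypothetical policy: epsilon-greedy over discounted response rates."""

    def __init__(self, items, epsilon=0.1, discount=0.99, rng=None):
        self.items = list(items)
        self.epsilon = epsilon    # probability of serving a random item
        self.discount = discount  # per-impression weight applied to old data
        self.rng = rng if rng is not None else random.Random()
        self.shown = {i: 0.0 for i in self.items}
        self.clicked = {i: 0.0 for i in self.items}

    def rate(self, item):
        """Discounted response-rate estimate; 0.0 if there is no evidence yet."""
        n = self.shown[item]
        return self.clicked[item] / n if n > 0 else 0.0

    def choose(self):
        """Explore with probability epsilon, otherwise exploit the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.items)
        return max(self.items, key=self.rate)

    def update(self, item, clicked):
        """Age all evidence, then add the newest observation at full weight."""
        for i in self.items:
            self.shown[i] *= self.discount
            self.clicked[i] *= self.discount
        self.shown[item] += 1.0
        if clicked:
            self.clicked[item] += 1.0


policy = DiscountedEpsilonGreedy(["story_a", "story_b"], epsilon=0.1, discount=0.99)
item = policy.choose()
policy.update(item, clicked=True)
```

A `discount` closer to 1 gives steadier estimates but slower reaction to change; a smaller value tracks changes faster at the cost of noisier estimates.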