A Tutorial on Thompson Sampling (Foundations and Trends® in Machine Learning)

Russo, Daniel J./ Van Roy, Benjamin/ Kazerouni, Abbas

Web store price: ¥13,552 (¥12,320 excluding tax)
Publisher: now publishers Inc (published 2018/07)
List price (foreign currency): US$ 80.00

Binding: Paperback / Pages: 112 p.
Language: English
Product code: 9781680834703

Full Description

Thompson sampling is an algorithm for online decision problems in which actions are taken sequentially and each action must balance exploiting what is known, to maximize immediate performance, against investing in new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use.

A Tutorial on Thompson Sampling covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product recommendation, assortment, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, in which the information revealed by taking one action informs beliefs about other actions. The tutorial also discusses when and why Thompson sampling is or is not effective, and how it relates to alternative algorithms.
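
To make the Bernoulli bandit case mentioned above concrete, here is a minimal sketch of Thompson sampling with Beta posteriors. It is an illustration under stated assumptions, not code from the tutorial: the function name `thompson_sampling_bernoulli`, the uniform Beta(1, 1) priors, and the success probabilities passed in are all hypothetical, and the environment is simulated.

```python
import random


def thompson_sampling_bernoulli(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a simulated bandit.

    true_probs: hypothetical success probabilities, hidden from the agent
    and used only to simulate rewards.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    # Assumed prior: Beta(1, 1), i.e. uniform, on each arm's success probability.
    alpha = [1.0] * k
    beta = [1.0] * k
    pulls = [0] * k
    total_reward = 0

    for _ in range(n_rounds):
        # Sample one plausible success probability per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        # Act greedily with respect to the sampled model.
        arm = max(range(k), key=lambda a: samples[a])
        # Simulated environment: reward is 1 with probability true_probs[arm].
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate update: successes increment alpha, failures increment beta.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward, alpha, beta


pulls, total_reward, alpha, beta = thompson_sampling_bernoulli(
    [0.3, 0.5, 0.7], n_rounds=2000
)
print("pulls per arm:", pulls)
print("total reward:", total_reward)
```

Because each action is chosen greedily with respect to a posterior sample rather than the posterior mean, arms with uncertain estimates still get tried occasionally, which is the exploration the description refers to.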

Contents

1. Introduction
2. Greedy Decisions
3. Thompson Sampling for the Bernoulli Bandit
4. General Thompson Sampling
5. Approximations
6. Practical Modeling Considerations
7. Further Examples
8. Why it Works, When it Fails, and Alternative Approaches
Acknowledgements
References