Grokking Deep Reinforcement Learning

Morales, Miguel

Web store price: ¥10,926 (¥9,933 before tax)
Manning Publications (released 2021/01)
List price (foreign currency): UK£ 39.99
Points: 99 pt

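In stock at our partner overseas book distributor. Usually ships in about two weeks.
[Important notes]
1. Delivery may occasionally be delayed, or the item may become unavailable.
2. If you order multiple copies, they will be shipped together once the full quantity has arrived.
3. We cannot accept requests for copies in pristine condition.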

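[Arrival delays]
Due to the global situation, foreign books and used foreign books ordered from overseas may arrive later than the standard delivery times shown.
We apologize for the inconvenience and appreciate your understanding.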

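◆ The cover, wraparound band, and other details shown in the image may differ from the actual item.

◆ Web store prices for foreign books differ from the prices at our physical stores.
The price of a foreign book is its Japanese yen price at the time the order is confirmed.
If the price of the same book changes after the order is confirmed, the change is not reflected in the order.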

Binding: Paperback / Pages: 465 p.
Language: ENG
Product code: 9781617295454
DDC classification: 006.3

Full Description

Written for developers with some understanding of deep learning algorithms. Experience with reinforcement learning is not required.

Grokking Deep Reinforcement Learning introduces this powerful machine learning approach, using examples, illustrations, exercises, and crystal-clear teaching. You'll love the perfectly paced teaching and the clever, engaging writing style as you dig into this awesome exploration of reinforcement learning fundamentals, effective deep learning techniques, and practical applications in this emerging field.

We all learn through trial and error. We avoid the things that cause us to experience pain and failure. We embrace and build on the things that give us reward and success. This common pattern is the foundation of deep reinforcement learning: building machine learning systems that explore and learn based on the responses of the environment.

• Foundational reinforcement learning concepts and methods

• The most popular deep reinforcement learning agents solving high-dimensional environments

• Cutting-edge agents that emulate human-like behavior and techniques for artificial general intelligence

Deep reinforcement learning is a form of machine learning in which AI agents learn optimal behavior on their own from raw sensory input. The system perceives the environment, interprets the results of its past decisions and uses this information to optimize its behavior for maximum long-term return.
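
The loop described above (perceive the environment, interpret the result of a past decision, adjust behavior to maximize long-term return) can be sketched in a few lines of Python. This is only an illustrative sketch and is not taken from the book: the five-state corridor environment, the `step` and `train` helpers, and all hyperparameter values are assumptions made for the example, and it uses a lookup table where a deep reinforcement learning agent would use a neural network.

```python
import random


def discounted_return(rewards, gamma):
    """Long-term return: each reward is discounted by gamma per time step."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def step(state, action, n_states):
    """Hypothetical corridor environment: action 0 moves left, 1 moves right.

    The episode ends with reward 1.0 when the rightmost state is reached.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore at random with probability epsilon, or when the two
            # action values are tied; otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action, n_states)
            # Interpret the outcome: move the estimate toward the observed
            # reward plus the discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q = train()
    policy = ["right" if q[s][1] > q[s][0] else "left" for s in range(4)]
    print(policy)
    print(discounted_return([0.0, 0.0, 0.0, 1.0], 0.9))
```

In this toy setting the learned value of moving right from the start state approaches the discounted return of the shortest path, which is the quantity the agent is trying to maximize.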