Accelerators for Convolutional Neural Networks

  • Ebook price: ¥19,184 (ebook edition available)

  • Web store price: ¥25,216 (base price ¥22,924)
  • Wiley-IEEE Press (released October 2023)
  • Overseas list price: US$145.00
  • Binding: Hardcover / 304 p.
  • Language: ENG
  • Product code: 9781394171880
  • DDC classification: 006.32

Full Description

Accelerators for Convolutional Neural Networks: a comprehensive and thorough resource exploring different types of convolutional neural networks and complementary accelerators

Accelerators for Convolutional Neural Networks provides the basic deep learning knowledge and instructive content that Internet of Things (IoT) and edge computing practitioners need to build convolutional neural network (CNN) accelerators. The book elucidates compressive coding for CNNs, presents a two-step lossless compression method for input feature maps, discusses an arithmetic coding-based lossless weight compression method and the design of an associated decoding method, describes contemporary sparse CNNs that consider sparsity in both weights and activation maps, and discusses hardware/software co-design and co-scheduling techniques that can lead to better optimization and utilization of the available hardware resources for CNN acceleration.

The first part of the book provides an overview of CNNs along with the composition and parameters of different contemporary CNN models. Later chapters focus on compressive coding for CNNs and the design of dense CNN accelerators. The book also provides directions for future research and development for CNN accelerators.

Other sample topics covered in Accelerators for Convolutional Neural Networks include:

How to apply arithmetic coding and decoding with range scaling to losslessly compress 5-bit CNN weights, enabling CNN deployment in extremely resource-constrained systems
State-of-the-art research surrounding dense CNN accelerators, which are mostly based on systolic arrays or parallel multiply-accumulate (MAC) arrays
The iMAC dense CNN accelerator, which combines image-to-column (im2col) and general matrix multiplication (GEMM) hardware acceleration
A multi-threaded, low-cost, log-based processing element (PE) core, instances of which are stacked in a spatial grid to form the NeuroMAX dense accelerator
Sparse-PE, a multi-threaded and flexible CNN PE core that exploits sparsity in both weights and activation maps, instances of which can be stacked in a spatial grid to build sparse CNN accelerators
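The im2col + GEMM approach named for the iMAC accelerator lowers convolution to a matrix multiplication so that a GEMM engine can execute it. A minimal NumPy sketch of the idea, for a single channel with stride 1 and no padding (an illustration of the general technique, not the book's hardware implementation):

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold every kh x kw patch of a 2-D input into one column."""
    h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1          # output spatial dims
    cols = np.empty((kh * kw, oh * ow))
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[:, idx] = x[i:i + kh, j:j + kw].ravel()
            idx += 1
    return cols

# Convolve a 4x4 input with a 3x3 all-ones kernel via im2col + GEMM:
# the kernel becomes a row vector, the input becomes a patch matrix,
# and one matrix product yields all output pixels at once.
x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))
out = (k.ravel() @ im2col(x, 3, 3)).reshape(2, 2)
print(out)  # each entry is the sum of one 3x3 patch
```

The payoff is that all the irregular sliding-window indexing is pushed into the data layout step, leaving a single dense GEMM that maps well onto MAC arrays.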

For researchers in AI, computer vision, computer architecture, and embedded systems, along with graduate and senior undergraduate students in related programs of study, Accelerators for Convolutional Neural Networks is an essential resource for understanding the many facets of the subject and relevant applications.

Contents

About the Authors xiii

Preface xv

Part I Overview 1

1 Introduction 3

1.1 History and Applications 5

1.2 Pitfalls of High-Accuracy DNNs/CNNs 6

1.2.1 Compute and Energy Bottleneck 6

1.2.2 Sparsity Considerations 9

1.3 Chapter Summary 11

2 Overview of Convolutional Neural Networks 13

2.1 Deep Neural Network Architecture 13

2.2 Convolutional Neural Network Architecture 15

2.3 Popular CNN Models 26

2.4 Popular CNN Datasets 30

2.5 CNN Processing Hardware 31

2.6 Chapter Summary 37

Part II Compressive Coding for CNNs 39

3 Contemporary Advances in Compressive Coding for CNNs 41

3.1 Background of Compressive Coding 41

3.2 Compressive Coding for CNNs 43

3.3 Lossy Compression for CNNs 43

3.4 Lossless Compression for CNNs 44

3.5 Recent Advancements in Compressive Coding for CNNs 48

3.6 Chapter Summary 50

4 Lossless Input Feature Map Compression 51

4.1 Two-Step Input Feature Map Compression Technique 52

4.2 Evaluation 55

4.3 Chapter Summary 57

5 Arithmetic Coding and Decoding for 5-Bit CNN Weights 59

5.1 Architecture and Design Overview 60

5.2 Algorithm Overview 63

5.3 Weight Decoding Algorithm 67

5.4 Encoding and Decoding Examples 69

5.5 Evaluation Methodology 74

5.6 Evaluation Results 75

5.7 Chapter Summary 84

Part III Dense CNN Accelerators 85

6 Contemporary Dense CNN Accelerators 87

6.1 Background on Dense CNN Accelerators 87

6.2 Representation of the CNN Weights and Feature Maps in Dense Format 87

6.3 Popular Architectures for Dense CNN Accelerators 89

6.4 Recent Advancements in Dense CNN Accelerators 92

6.5 Chapter Summary 93

7 iMAC: Image-to-Column and General Matrix Multiplication-Based Dense CNN Accelerator 95

7.1 Background and Motivation 95

7.2 Architecture 97

7.3 Implementation 99

7.4 Chapter Summary 100

8 NeuroMAX: A Dense CNN Accelerator 101

8.1 Related Work 102

8.2 Log Mapping 103

8.3 Hardware Architecture 105

8.4 Data Flow and Processing 108

8.5 Implementation and Results 118

8.6 Chapter Summary 124

Part IV Sparse CNN Accelerators 125

9 Contemporary Sparse CNN Accelerators 127

9.1 Background of Sparsity in CNN Models 127

9.2 Background of Sparse CNN Accelerators 128

9.3 Recent Advancements in Sparse CNN Accelerators 131

9.4 Chapter Summary 133

10 CNN Accelerator for In Situ Decompression and Convolution of Sparse Input Feature Maps 135

10.1 Overview 135

10.2 Hardware Design Overview 135

10.3 Design Optimization Techniques Utilized in the Hardware Accelerator 140

10.4 FPGA Implementation 141

10.5 Evaluation Results 143

10.6 Chapter Summary 149

11 Sparse-PE: A Sparse CNN Accelerator 151

11.1 Related Work 155

11.2 Sparse-PE 156

11.3 Implementation and Results 174

11.4 Chapter Summary 184

12 Phantom: A High-Performance Computational Core for Sparse CNNs 185

12.1 Related Work 189

12.2 Phantom 190

12.3 Phantom-2D 201

12.4 Experiments and Results 209

12.5 Chapter Summary 218

Part V HW/SW Co-Design and Co-Scheduling for CNN Acceleration 221

13 State-of-the-Art in HW/SW Co-Design and Co-Scheduling for CNN Acceleration 223

13.1 HW/SW Co-Design 223

13.2 HW/SW Co-Scheduling 228

13.3 Chapter Summary 230

14 Hardware/Software Co-Design for CNN Acceleration 231

14.1 Background of iMAC Accelerator 231

14.2 Software Partition for iMAC Accelerator 232

14.3 Experimental Evaluations 235

14.4 Chapter Summary 237

15 CPU-Accelerator Co-Scheduling for CNN Acceleration 239

15.1 Background and Preliminaries 240

15.2 CNN Acceleration with CPU-Accelerator Co-Scheduling 242

15.3 Experimental Results 251

15.4 Chapter Summary 257

16 Conclusions 259

References 265

Index 285