A 1 TOPS/W analog deep machine-learning engine with floating-gate storage in 0.13 μm CMOS

Junjie Lu, Steven Young, Itamar Arel, Jeremy Holleman

Research output: Contribution to journal › Article › peer-review

97 Scopus citations

Abstract

An analog implementation of a deep machine-learning system for efficient feature extraction is presented in this work. It features online unsupervised trainability and non-volatile floating-gate analog storage. It utilizes a massively parallel reconfigurable current-mode analog architecture to realize efficient computation, and leverages algorithm-level feedback to provide robustness to circuit imperfections in analog signal processing. A 3-layer, 7-node analog deep machine-learning engine was fabricated in a 0.13 μm standard CMOS process, occupying 0.36 mm² active area. At a processing speed of 8300 input vectors per second, it consumes 11.4 μW from the 3 V supply, achieving a peak energy efficiency of 1×10¹² operations per second per watt. Measurements demonstrate real-time cluster analysis and feature extraction for pattern recognition with 8-fold dimension reduction, with accuracy comparable to the floating-point software simulation baseline.
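The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the four input values are taken from the abstract, while the function name `derived_figures` and the derived quantities (energy per vector, supply current, implied operation counts) are assumptions of this example and are not stated by the authors. In particular, it assumes the peak efficiency applies at the same operating point as the quoted power and throughput.

```python
# Figures quoted in the abstract.
VECTORS_PER_SECOND = 8300          # processing speed, input vectors/s
POWER_WATTS = 11.4e-6              # power consumption, W
SUPPLY_VOLTS = 3.0                 # supply voltage, V
OPS_PER_SECOND_PER_WATT = 1e12     # peak energy efficiency, ops/s/W


def derived_figures(rate, power, volts, efficiency):
    """Return quantities implied by the quoted figures (hypothetical helper).

    Assumes the peak efficiency holds at the quoted power and rate,
    which the abstract suggests but does not state explicitly.
    """
    return {
        "energy_per_vector_joules": power / rate,
        "supply_current_amps": power / volts,
        "implied_ops_per_second": efficiency * power,
        "implied_ops_per_vector": efficiency * power / rate,
    }


figures = derived_figures(
    VECTORS_PER_SECOND, POWER_WATTS, SUPPLY_VOLTS, OPS_PER_SECOND_PER_WATT
)
for name, value in figures.items():
    print(f"{name}: {value:.3e}")
```

Under these assumptions the numbers work out to roughly 1.37 nJ per input vector, a supply current of about 3.8 μA, and on the order of 1.1×10⁷ operations per second, or about 1.4×10³ operations per input vector.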

Original language: English
Article number: 6919341
Pages (from-to): 270-281
Number of pages: 12
Journal: IEEE Journal of Solid-State Circuits
Volume: 50
Issue number: 1
DOIs
State: Published - Jan 1 2015
Externally published: Yes

Keywords

  • Analog signal processing
  • current mode arithmetic
  • deep machine learning
  • floating gate
  • neuromorphic engineering
  • translinear circuits
