Counting Cards: Exploiting Weight and Variance Distributions for Robust Compute In-Memory
Abstract: Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data-intensive applications. It has found widespread adoption in accelerating neural networks for machine learning applications.
Crossbar architectures built from emerging non-volatile memories (eNVM), such as dense resistive random access memory (RRAM) or phase change random access memory (PCRAM), can implement various forms of neural networks while greatly reducing power and increasing on-chip memory capacity.
However, compute in-memory faces its own limitations at both the circuit and the device levels.
In this work, we explore the impact of device variation and peripheral circuit design constraints. Furthermore, we propose a new algorithm based on device variance and neural network weight distributions to increase both performance and accuracy for CIM-based designs.
We demonstrate a 23% performance improvement and 27% power reduction for low- and high-variance eNVM, while satisfying a programmable threshold for a target error tolerance.
Bio: Brian Crafton received his B.S. in Computer Engineering from Northeastern University in Boston. He spent two and a half years working for AMD and Intel before starting his graduate studies at Georgia Tech in Fall 2017. In August 2018 he joined ICSRL as a PhD student advised by Arijit Raychowdhury. His research focuses on circuit-level algorithms for machine learning and compute in-memory.
2776.043. Fine and Coarse-Grained Logic-in-Memory Compute Kernels.
This meeting is available only to members of the JUMP research community, including Principal Investigators, Postdoc researchers, Students, and Industry/Government liaisons.