TY - JOUR
T1 - MIOpen: An Open Source Library for Deep Learning Primitives
T2 - 30th International Conference on Computer Graphics and Machine Vision, GraphiCon 2020
AU - Khan, Jehandad
AU - Fultz, Paul
AU - Tamazov, Artem
AU - Lowell, Daniel
AU - Liu, Chao
AU - Melesse, Michael
AU - Nandhimandalam, Murali
AU - Nasyrov, Kamil
AU - Perminov, Ilya
AU - Shah, Tejash
AU - Filippov, Vasilii
AU - Zhang, Jing
AU - Zhou, Jing
AU - Natarajan, Bragadeesh
AU - Daga, Mayank
N1 - Publisher Copyright:
© 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2020
Y1 - 2020
N2 - Deep Learning has established itself as a common occurrence in the business lexicon. Its unprecedented success in recent years can be attributed to an abundance of data, the availability of gargantuan compute capabilities offered by GPUs, and the adoption of an open-source philosophy by researchers and industry. Deep neural networks can be decomposed into a series of different operators. MIOpen, AMD's open-source deep learning primitives library for GPUs, provides highly optimized implementations of such operators, shielding researchers from internal implementation details and hence accelerating the time to discovery. This paper introduces MIOpen and provides details about the internal workings of the library and its supported features. MIOpen innovates on several fronts, such as implementing fusion to optimize for memory bandwidth and GPU launch overheads, providing an auto-tuning infrastructure to overcome the large design space of problem configurations, and implementing different algorithms to optimize convolutions for different filter and input sizes. MIOpen is one of the first libraries to publicly support the bfloat16 data type for convolutions, allowing efficient training at lower precision without loss of accuracy.
KW - Convolution
KW - Deep Learning
KW - GPU
KW - HIP
KW - MIOpen
KW - Machine Learning
KW - OpenCL
KW - Performance
UR - http://www.scopus.com/inward/record.url?scp=85098199184&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85098199184
SN - 1613-0073
VL - 2744
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
Y2 - 22 September 2020 through 25 September 2020
ER -