Efficient convolution optimisation by composing micro-kernels - Systèmes Répartis, Calcul Parallèle et Réseaux
Preprint, Working Paper. Year: 2021

Abstract

Tiling is a key loop transformation for optimizing tensor computations such as CNNs (Convolutional Neural Networks). Tile optimization involves an explosively large search space for multi-level tiling, including all possible permutations of the tiling loops and all possible valid tile sizes. In this paper, we develop a comprehensive methodology for finding optimized tile configurations with imperfectly nested micro-kernels ("beyond perfect") and outer tile loops optimized via analytical modeling. Experimental results on over 30 CNN benchmarks from three popular DNN pipelines demonstrate the effectiveness of the presented optimization approach by comparing with the Intel oneDNN library.
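
To make the notion of tiling concrete, here is a minimal sketch of a direct convolution and a one-level tiled variant of it. It is an illustration only, not the method of the paper: the function names `conv2d_naive` and `conv2d_tiled`, the tensor layouts, and the tile-size parameters `tile_k` and `tile_x` are all assumptions made for this example. The tile sizes are plain arguments rather than values chosen by an analytical model, and the inner block is a perfectly nested loop pair, whereas the paper targets imperfectly nested micro-kernels.

```python
def conv2d_naive(inp, ker):
    """Direct convolution, stride 1, no padding.

    Assumed layouts: inp[c][y][x], ker[k][c][r][s], result out[k][y][x].
    """
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    K, R, S = len(ker), len(ker[0][0]), len(ker[0][0][0])
    OH, OW = H - R + 1, W - S + 1
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(K)]
    for k in range(K):
        for y in range(OH):
            for x in range(OW):
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            out[k][y][x] += inp[c][y + r][x + s] * ker[k][c][r][s]
    return out


def conv2d_tiled(inp, ker, tile_k=2, tile_x=4):
    """Same computation with one level of tiling on the k and x loops.

    tile_k and tile_x are hypothetical parameters; any positive values
    give the same result, only the traversal order changes.
    """
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    K, R, S = len(ker), len(ker[0][0]), len(ker[0][0][0])
    OH, OW = H - R + 1, W - S + 1
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(K)]
    for k0 in range(0, K, tile_k):            # outer tile loop over output channels
        for x0 in range(0, OW, tile_x):       # outer tile loop over output columns
            for y in range(OH):
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            # Innermost block: a small k-by-x tile that reuses
                            # each weight across tile_x outputs. The min()
                            # bounds handle tile sizes that do not divide K or OW.
                            for k in range(k0, min(k0 + tile_k, K)):
                                w = ker[k][c][r][s]
                                for x in range(x0, min(x0 + tile_x, OW)):
                                    out[k][y][x] += inp[c][y + r][x + s] * w
    return out
```

Even in this toy version there are several free choices: which loops to tile, the size of each tile, and the order of the resulting loops. With multiple tiling levels these choices multiply, which is the search space the abstract refers to.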
Main file
HAL_Ttile_CNNopt_bibtex_updated.pdf (634.61 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03149553, version 1 (23-02-2021)
hal-03149553, version 2 (09-04-2021)
hal-03149553, version 3 (14-10-2021)

Identifiers

  • HAL Id: hal-03149553, version 2

Cite

Nicolas Tollenaere, Auguste Olivry, Guillaume Iooss, Hugo Brunie, Albert Cohen, et al. Efficient convolution optimisation by composing micro-kernels. 2021. ⟨hal-03149553v2⟩
715 views
1527 downloads