Title

NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning

Authors

Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Sophia Shao, Krste Asanovic, Ion Stoica

Motivation

Vectorization is critical to enhancing the performance of compute-intensive workloads in modern computers. Today's compilers rely on fixed cost models based on heuristics to make vectorization decisions on loops. However, these models are unable to capture the data dependency, the computation graph, or the organization of instructions.

Contribution

A comprehensive dataset of more than 10,000 synthetic loop examples.
An end-to-end deep reinforcement learning (RL) based automatic loop-vectorization method.

Personal Impressions

The search space is so small that I honestly doubt whether this is meaningful.

The Proposed Framework Architecture

Code Embedding

A code snippet and its predicted labels as computed by code2vec.
The architecture of the code2vec path-attention network. A fully connected layer learns to combine the embeddings of each path-context with itself; attention weights are learned using the combined context vectors and are used to compute a code vector. The code vector is used to predict the label.
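
The attention step described in the caption above can be sketched in a few lines. This is a minimal illustration, not the code2vec or NeuroVectorizer implementation: the function names, the use of `tanh` as the fully connected layer's nonlinearity, and the toy dimensions are assumptions made for the example.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def code_vector(contexts: List[Vector], fc_weights: Matrix,
                attention: Vector) -> Tuple[Vector, Vector]:
    """Combine path-context embeddings into a single code vector.

    contexts   : one embedding per path-context
    fc_weights : fully connected layer (assumed tanh activation)
    attention  : learned attention vector, same size as the layer output
    Returns (code_vector, attention_weights).
    """
    # 1. Fully connected layer produces a combined context vector per path.
    combined = [[math.tanh(v) for v in matvec(fc_weights, ctx)]
                for ctx in contexts]
    # 2. Score each combined vector against the attention vector.
    scores = [sum(c * a for c, a in zip(vec, attention)) for vec in combined]
    # 3. Softmax over scores (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # 4. The code vector is the attention-weighted sum of combined vectors.
    dim = len(attention)
    vector = [sum(w * vec[k] for w, vec in zip(weights, combined))
              for k in range(dim)]
    return vector, weights
```

In a trained model `fc_weights` and `attention` would be learned parameters; here they are plain inputs so the data flow is easy to follow.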

Automatic Vectorization Example
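
A minimal sketch of how a chosen vectorization factor (VF) and interleaving factor (IF) might be injected into C source as Clang loop pragmas. The helper name `inject_pragma`, the naive regex used to find loops, and the sample loop are illustrative assumptions, not the authors' tooling; only the `#pragma clang loop vectorize_width(...) interleave_count(...)` syntax is Clang's own.

```python
import re

# Hypothetical sample input: a single, simple C loop.
EXAMPLE_LOOP = (
    "void add(int *a, const int *b, int n) {\n"
    "    for (int i = 0; i < n; i++) {\n"
    "        a[i] += b[i];\n"
    "    }\n"
    "}"
)


def inject_pragma(c_source: str, vf: int, interleave: int) -> str:
    """Insert a Clang loop pragma before every line that starts a `for` loop.

    This is a naive line-based approach: it does not parse C, and it does
    not distinguish inner loops from outer loops.
    """
    pragma = (f"#pragma clang loop vectorize_width({vf}) "
              f"interleave_count({interleave})")
    out = []
    for line in c_source.splitlines():
        match = re.match(r"^(\s*)for\s*\(", line)
        if match:
            # Keep the loop's indentation so the output stays readable.
            out.append(match.group(1) + pragma)
        out.append(line)
    return "\n".join(out)
```

Calling `inject_pragma(EXAMPLE_LOOP, 4, 2)` places the pragma on its own line directly above the `for` statement, which is where Clang expects a loop hint.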

The RL Environment Definition

The reward is defined in terms of two execution times: baseline, the execution time when compiled with the baseline cost model currently implemented in LLVM, and RL, the execution time when compiled with the pragmas injected by the RL agent.
The action space is bounded by MAX_VF and MAX_IF, which are respectively the maximum vectorization factor (VF) and interleaving factor (IF) supported by the underlying architecture.
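
A minimal sketch of an environment definition built from these quantities. Two things here are assumptions rather than statements from this summary: that the reward is the improvement in execution time normalized by the baseline, and that VF and IF are restricted to powers of two up to their maxima. The function names are hypothetical.

```python
from typing import List, Tuple


def powers_of_two(max_value: int) -> List[int]:
    """Return [1, 2, 4, ...] up to and including max_value."""
    values = []
    v = 1
    while v <= max_value:
        values.append(v)
        v *= 2
    return values


def action_space(max_vf: int, max_if: int) -> List[Tuple[int, int]]:
    """All (VF, IF) pairs the agent may pick (assumed: powers of two only)."""
    return [(vf, interleave)
            for vf in powers_of_two(max_vf)
            for interleave in powers_of_two(max_if)]


def reward(t_baseline: float, t_rl: float) -> float:
    """Assumed reward: relative speedup over the baseline cost model.

    Positive when the agent's pragmas make the program faster than the
    baseline, negative when they make it slower.
    """
    if t_baseline <= 0:
        raise ValueError("baseline execution time must be positive")
    return (t_baseline - t_rl) / t_baseline
```

Under these assumptions, halving the execution time (`reward(2.0, 1.0)`) gives 0.5, and doubling it (`reward(1.0, 2.0)`) gives -1.0. With hypothetical limits `MAX_VF = 8` and `MAX_IF = 4` the action space has 4 x 3 = 12 entries.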

Dataset Description

To speed up training and make it more efficient, we built a dataset that includes loops only. We built generators that automatically produce more than 10,000 synthetic loop examples from the LLVM vectorization test-suite.

Handling Long Compilation Time

Results: Reward mean and training loss for different action space definitions

Results: The performance of the proposed vectorizer

The performance is normalized to the baseline (VF = 4, IF = 2).

Results: Normalized average performance of supervised FCNN and deep RL

Results: The performance of the proposed vectorizer on MiBench compared to Polly and the baseline cost model

The performance is normalized to the baseline (VF = 4, IF = 2).

References

https://arxiv.org/abs/1909.13639

License

Author: Jaehun Ryu

Link: https://jaehun.me/posts/%EB%85%BC%EB%AC%B8-%EC%A0%95%EB%A6%AC-neurovectorizer-end-to-end-vectorization-with-deep-reinforcement-learning-cgo-20/

License: CC BY 4.0

This work is available under the Creative Commons Attribution 4.0 International License. It may be used freely, including for commercial purposes, as long as the source is credited.