Pypy 砰呸 - Part 1. 使用 Rpython Translation Toolchain 來做最佳化
在本章節,目的有二個 :
- 教大家怎麼使用 Rpython 撰寫 Brainfuck Interpreter
- 怎麼利用 Rpython Translation Toolchain 把一個慢到不行的 Interpreter 最佳化
# 一、用 Rpython 撰寫 Brainfuck Interpreter
在 此系列文章 (opens new window),就有清楚闡明 brainfuck 語言以及用 C 語言去撰寫 Brainfuck Interpreter,因此不再贅述 brainfuck 的背景知識。
首先,先用 Rpython 撰寫 Interpreter,這裡參考這個連結 (opens new window)
這裡使用一點之前提過的跳轉表的加速技巧
brainfuck-rpython.py
import os
def precompute_jumps(program):
stack = []
ret = {}
pc = 0
while not pc == len(program):
opcode = program[pc]
if opcode == "[":
stack.append(pc)
elif opcode == "]":
target = stack.pop()
ret[target] = pc
ret[pc] = target
pc += 1
return ret
def run(program):
buffer = [0]
jump_map = precompute_jumps(program)
ptr = 0
pc = 0
while not pc == len(program):
opcode = program[pc]
if opcode == ">":
ptr += 1
if ptr == len(buffer): buffer.append(0)
elif opcode == "<": ptr -= 1
elif opcode == "+": buffer[ptr] += 1
elif opcode == "-": buffer[ptr] -= 1
elif opcode == ".":
os.write(1, chr(buffer[ptr]))
elif opcode == ",":
buffer[ptr] = ord(os.read(0, 1)[0])
elif opcode == "[":
if buffer[ptr] == 0: pc = jump_map[pc]
elif opcode == "]":
if buffer[ptr] != 0: pc = jump_map[pc]
pc += 1
def main(argv):
if not len(argv) > 1:
print("Missing input filename")
return 1
fd = os.open(argv[1], os.O_RDONLY, 0777)
size = os.stat(argv[1]).st_size
contents = os.read(fd, size)
os.close(fd)
program = []
for c in contents:
if c in "<>-+[],.":
program.append(c)
run(program)
return 0
def target(driver, args):
return main, None
if __name__ == "__main__":
import sys
main(sys.argv)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
再來將方才 brainfuck-rpython.py
加入 JIT 的引擎。
brainfuck-rpython-jit.py
import os
from rpython.rlib.jit import JitDriver, purefunction
def precompute_jumps(program):
stack = []
ret = {}
pc = 0
while not pc == len(program):
opcode = program[pc]
if opcode == "[":
stack.append(pc)
elif opcode == "]":
target = stack.pop()
ret[target] = pc
ret[pc] = target
pc += 1
return ret
def jitpolicy(driver):
from rpython.jit.codewriter.policy import JitPolicy
return JitPolicy()
jitdriver = JitDriver(
greens=['pc', 'program', 'jump_map'],
reds=['ptr', 'buffer']
)
@purefunction
def get_matching_bracket(bracket_map, pc):
return bracket_map[pc]
def run(program):
buffer = [0]
jump_map = precompute_jumps(program)
ptr = 0
pc = 0
while not pc == len(program):
jitdriver.jit_merge_point(
pc=pc,
program=program,
jump_map=jump_map,
ptr=ptr,
buffer=buffer
)
opcode = program[pc]
if opcode == ">":
ptr += 1
if ptr == len(buffer): buffer.append(0)
elif opcode == "<": ptr -= 1
elif opcode == "+": buffer[ptr] += 1
elif opcode == "-": buffer[ptr] -= 1
elif opcode == ".":
os.write(1, chr(buffer[ptr]))
elif opcode == ",":
buffer[ptr] = ord(os.read(0, 1)[0])
elif opcode == "[":
if buffer[ptr] == 0: pc = get_matching_bracket(jump_map, pc)
elif opcode == "]":
if buffer[ptr] != 0: pc = get_matching_bracket(jump_map, pc)
pc += 1
def main(argv):
if not len(argv) > 1:
print "Missing input filename"
return 1
fd = os.open(argv[1], os.O_RDONLY, 0777)
size = os.stat(argv[1]).st_size
contents = os.read(fd, size)
os.close(fd)
program = []
for c in contents:
if c in "<>-+[],.":
program.append(c)
run(program)
return 0
def target(driver, args):
return main, None
if __name__ == "__main__":
import sys
main(sys.argv)
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
有了這兩個檔案,我們就可以開始進行效能比較
# 二、下載 pypy
拜訪這個網站 (opens new window),下載帶有 src 的壓縮檔, 這樣才有 Rpython 的執行檔。本篇是下載pypy3.8-v7.3.11-src.tar.bz2 (opens new window)。待會我們會用到的路徑在 pypy3.8-v7.3.11-src/rpython/bin/rpython
# 三、開始做效能比較吧
# 3-1. 先來執行慢吞吞的 brainfuck-rpython.py
首先我們先用一般慢吞吞的 Cpython 執行腳本,因為這是用 interpreter language 寫的 interpreter,跑的速度會超級慢。
python2 brainfuck-rpython.py mandelbrot.bf
總共花了快兩小時 = = (超級久😡)
再來我們使用 pypy 直譯器,來執行這個腳本
pypy brainfuck-rpython.py mandelbrot.bf
總共花了 2 分 50 秒
# 3-2. 使用 Rpython 編譯工具
我們可以將 brainfuck-rpython.py
利用 Rpython 編譯成執行檔
pypy3.8-v7.3.11-src/rpython/bin/rpython brainfuck-rpython.py
中間會輸出很多最佳化的訊息
[translation:info] 2.7.13 (7.3.1+dfsg-2, Apr 21 2020, 05:05:41)
[PyPy 7.3.1 with GCC 9.3.0]
[platform:msg] Set platform with 'host' cc=None, using cc='gcc', version='Unknown'
[translation:info] Translating target as defined by rpy
[translation] translate.py configuration:
[translation] [translate]
targetspec = rpy
[translation] translation configuration:
[translation] [translation]
gc = incminimark
gctransformer = framework
list_comprehension_operations = True
rpython_translate = True
withsmallfuncsets = 5
[translation:info] Annotating&simplifying...
[202] {translation-task
starting annotate
...
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
中間的過程將程式碼最佳化後轉成 C Code,並再使用 GCC 編譯成執行檔 (再一次最佳化)
並將編譯好的執行檔執行 brainfuck testcase
./brainfuck-rpython mandelbrot.bf
總共花了 2 分 15 秒 (有快一點)
接下來,我們把剛剛有加入 JIT hint 的 brainfuck-rpython-jit.py
,使用 Rpython 編譯,並給他 jit 的 hint。看到底可以快多快
pypy3.8-v7.3.11-src/rpython/bin/rpython --opt=jit brainfuck-rpython-jit.py
記得 --opt=jit 要放在檔名前==,不然 rpython 會吃不到這個最佳化選項
./brainfuck-rpython-jit mandelbrot.bf
總共花了 15 秒 (快好多😋)
# 四、比較和小結
Cpython2
: 用 Cpython (2.X) 執行brainfuck-rpython.py
pypy2
: 用 pypy (2.X) 執行brainfuck-rpython.py
Rpython
:brainfuck-rpython.py
用 Rpython 編譯成執行檔執行Rpython(JIT)
:brainfuck-rpython.py
用 Rpython 編譯成執行檔執行
可以發現,由一般 Cpython 直譯器執行是最慢的,但經由加入一些 JIT Hint 以及經過 Rpython Translation Toolchain 的幫助,效能雖然不能像用之前的文章一樣到一兩秒等級,但是至少執行效率是有顯著提升的。
# 五、參考資源
# 5-1. 投影片
- Two-level Just-in-Time Compilation with One Interpreter and One Engine (opens new window)
- Recent Performance Results with PyPy High-speed Python for data analysis (opens new window)
- Just in Time Compilation (opens new window)
- The PyPy translation tool chain (opens new window)
- PyPy JIT under the hood (opens new window)
# 5-2. 參考網站
- Pypy Document (opens new window)
- Unlocking Python Performance: An In-Depth Guide to PyPy and RPython (opens new window)
- 第23篇 深入理解RPython(入门篇) (opens new window)
- Pypy 中文介紹 (opens new window)
- Tutorial: Writing an Interpreter with PyPy, Part 1 (opens new window)
- Tutorial Part 2: Adding a JIT (opens new window)
- Getting Started with PyPy (opens new window)
- RPython is not a Language: An introduction to the RPython language (opens new window)