Changes to defer in Go 1.13
Go 1.13 has been released, and the Runtime section of the release notes mentions that defer is about 30% faster in most cases. So where does that 30% come from?
As we know, a defer func statement used to be split into two steps: runtime.deferproc and runtime.deferreturn.
Now the deferproc step has a new alternative, deferprocStack, and the compiler chooses between deferproc and deferprocStack.
Of course, since the release notes say that most uses are optimized, the compiler must be picking deferprocStack in most cases.
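To get a feel for the overhead being discussed, here is a rough benchmark sketch. It is not from the Go source or the release notes: the functions withDefer and withoutDefer and the mutex workload are hypothetical, chosen only because a top-level defer mu.Unlock() is a very common pattern. The numbers depend on the machine and the Go version, so none are quoted here.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

var (
	mu      sync.Mutex
	counter int
)

// withDefer unlocks through a top-level defer, the common case
// that the release notes say got faster.
func withDefer() {
	mu.Lock()
	defer mu.Unlock()
	counter++
}

// withoutDefer unlocks explicitly and serves as the baseline.
func withoutDefer() {
	mu.Lock()
	counter++
	mu.Unlock()
}

func main() {
	// testing.Benchmark lets a plain program run a benchmark function
	// without going through "go test".
	d := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			withDefer()
		}
	})
	n := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			withoutDefer()
		}
	})
	fmt.Println("with defer:   ", d)
	fmt.Println("without defer:", n)
}
```

Running the same program with Go 1.12 and Go 1.13 should show how much the gap between the two lines shrinks on a given machine.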
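Here is the code, from runtime/panic.go in the Go runtime. Note that this listing sets fields (openDefer, framepc, varp, fd) that do not appear in the 1.13 _defer struct shown later, so it appears to have been taken from a newer source tree than 1.13.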
// deferprocStack queues a new deferred function with a defer record on the stack.
// The defer record must have its siz and fn fields initialized.
// All other fields can contain junk.
// The defer record must be immediately followed in memory by
// the arguments of the defer.
// Nosplit because the arguments on the stack won't be scanned
// until the defer record is spliced into the gp._defer list.
//go:nosplit
func deferprocStack(d *_defer) {
	gp := getg()
	if gp.m.curg != gp {
		// go code on the system stack can't defer
		throw("defer on system stack")
	}
	// siz and fn are already set.
	// The other fields are junk on entry to deferprocStack and
	// are initialized here.
	d.started = false
	d.heap = false
	d.openDefer = false
	d.sp = getcallersp()
	d.pc = getcallerpc()
	d.framepc = 0
	d.varp = 0
	// The lines below implement:
	//   d.panic = nil
	//   d.fd = nil
	//   d.link = gp._defer
	//   gp._defer = d
	// But without write barriers. The first three are writes to
	// the stack so they don't need a write barrier, and furthermore
	// are to uninitialized memory, so they must not use a write barrier.
	// The fourth write does not require a write barrier because we
	// explicitly mark all the defer structures, so we don't need to
	// keep track of pointers to them with a write barrier.
	*(*uintptr)(unsafe.Pointer(&d._panic)) = 0
	*(*uintptr)(unsafe.Pointer(&d.fd)) = 0
	*(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
	*(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))

	return0()
	// No code can go here - the C return register has
	// been set and must not be clobbered.
}
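The last two writes in deferprocStack splice the record onto the head of the goroutine's defer list. The sketch below is a toy model of that list, not runtime code: deferRecord, goroutine, push, and runAll are hypothetical names, and only the fn and link fields are modeled. It shows why head insertion yields the familiar last-in, first-out order.

```go
package main

import "fmt"

// deferRecord is a toy stand-in for the runtime's _defer.
// Only the function to run and the link to the next record are modeled.
type deferRecord struct {
	fn   func()
	link *deferRecord
}

// goroutine is a toy stand-in for the runtime's g.
// deferHead plays the role of gp._defer.
type goroutine struct {
	deferHead *deferRecord
}

// push mirrors the two writes "d.link = gp._defer; gp._defer = d".
func (gp *goroutine) push(d *deferRecord) {
	d.link = gp.deferHead
	gp.deferHead = d
}

// runAll pops records from the head one at a time and runs them,
// so the most recently pushed record runs first.
func (gp *goroutine) runAll() {
	for gp.deferHead != nil {
		d := gp.deferHead
		gp.deferHead = d.link
		d.fn()
	}
}

func main() {
	gp := &goroutine{}
	// The caller owns the records and only hands over pointers,
	// loosely like the compiler reserving the record in the caller's frame.
	a := deferRecord{fn: func() { fmt.Println("pushed first") }}
	b := deferRecord{fn: func() { fmt.Println("pushed second") }}
	c := deferRecord{fn: func() { fmt.Println("pushed third") }}
	gp.push(&a)
	gp.push(&b)
	gp.push(&c)
	gp.runAll() // prints third, second, first
}
```

In this model push never allocates; it only links a record the caller already owns, which is the idea behind letting the caller's stack frame hold the record.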
Let's see what the compiler chooses.
package main

func main() {
	defer println(1)
}
0x001d 00029 (./main.go:4) PCDATA $0, $0
0x001d 00029 (./main.go:4) PCDATA $1, $0
0x001d 00029 (./main.go:4) MOVL $8, ""..autotmp_1+8(SP)
0x0025 00037 (./main.go:4) PCDATA $0, $1
0x0025 00037 (./main.go:4) LEAQ "".wrap·1·f(SB), AX
0x002c 00044 (./main.go:4) PCDATA $0, $0
0x002c 00044 (./main.go:4) MOVQ AX, ""..autotmp_1+32(SP)
0x0031 00049 (./main.go:4) MOVQ $1, ""..autotmp_1+56(SP)
0x003a 00058 (./main.go:4) PCDATA $0, $1
0x003a 00058 (./main.go:4) LEAQ ""..autotmp_1+8(SP), AX
0x003f 00063 (./main.go:4) PCDATA $0, $0
0x003f 00063 (./main.go:4) MOVQ AX, (SP)
0x0043 00067 (./main.go:4) CALL runtime.deferprocStack(SB)
0x0048 00072 (./main.go:4) TESTL AX, AX
0x004a 00074 (./main.go:4) JNE 9
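It does call the new deferprocStack.
What about the old deferproc? Looking at the code for the defer struct, a new heap field has been added to record whether the record was allocated on the heap or on the stack.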
type _defer struct {
	siz     int32 // includes both arguments and results
	started bool
	heap    bool // <-- this field is new
	sp      uintptr // sp at time of defer
	pc      uintptr
	fn      *funcval
	_panic  *_panic // panic that is running defer
	link    *_defer
}
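Before 1.13, every defer went through deferproc. There was a deferpool for reusing records, but it was not enough. The community had long complained that defer was slow, and this release finally responds to that.
How does the compiler decide whether a defer record goes on the heap or on the stack?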
case ODEFER:
	d := callDefer
	if n.Esc == EscNever {
		d = callDeferStack
	}
	s.call(n.Left, d)
This n.Esc is the escape-analysis result stored on the compiler's AST node (Node). It is set to EscNever mainly by the following code:
case ODEFER:
	if e.loopdepth == 1 { // top level
		n.Esc = EscNever // force stack allocation of defer record (see ssa.go)
		break
	}
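Judging from this, if the defer sits inside at least one level of for loop, it is most likely no longer EscNever. Let's try it by changing the earlier code to add a for loop: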
package main

func main() {
	for i := 0; i < 10; i++ {
		defer println(1)
	}
}
Looking at the assembly, the familiar recipe is back.
0x0035 00053 (./main.go:5) MOVL $8, (SP)
0x003c 00060 (./main.go:5) PCDATA $0, $1
0x003c 00060 (./main.go:5) LEAQ "".wrap·1·f(SB), AX
0x0043 00067 (./main.go:5) PCDATA $0, $0
0x0043 00067 (./main.go:5) MOVQ AX, 8(SP)
0x0048 00072 (./main.go:5) MOVQ $1, 16(SP)
0x0051 00081 (./main.go:5) CALL runtime.deferproc(SB)
0x0056 00086 (./main.go:5) TESTL AX, AX
0x0058 00088 (./main.go:5) JNE 92
0x005a 00090 (./main.go:5) JMP 33
0x005c 00092 (./main.go:5) XCHGL AX, AX
0x005d 00093 (./main.go:5) CALL runtime.deferreturn(SB)
0x0062 00098 (./main.go:5) MOVQ 32(SP), BP
0x0067 00103 (./main.go:5) ADDQ $40, SP
0x006b 00107 (./main.go:5) RET
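So that is why defer got faster: the defer record is now allocated on the stack. And since in most cases we really do not call defer inside a loop, the release notes are accurate.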