Go 1.12 — Performance updates for POWER

Now that the source tree is closed, here are some of the updates I made for the IBM POWER architecture that you can expect in Go 1.12:

  • Atomics: implemented a new memory model for the Go runtime internals that includes lightweight atomics (load-acquire, store-release, CAS-release) for ppc64x. This increases performance by approximately 10% in producer-consumer scenarios. Details in commit 5c472132b.
  • Byte/string algorithms: improved performance of IndexByte by removing some loop dependencies (+1.6x performance). See commit 23f75541.
  • Crypto: added a VSX (vector-scalar) implementation of xorBytes for ppc64x that increases throughput by up to 290%. See commit 0ff6e5f1b for details.
  • Compiler SSA: implemented some small functions as intrinsics:
    • Multiprecision math (math/big): mulWW (up to 30% performance increase). See commit 1e8ecefcd for details.
    • Bitwise operations (math/bits): TrailingZeros16, OnesCount{8,16} (up to 2x performance increase), Mul (+9x performance). See commits 23578f9d, 9aed4cc39.
  • Timing: added Linux VDSO support (22x better performance). See commit dbd8af74.
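
None of these changes require new code on your side: they sit behind standard library calls you may already be using. As a rough illustration, here is a minimal sketch that exercises a few of the affected entry points (bytes.IndexByte, math/bits, and time.Now). The helper names firstNewline, bitStats and mulWide are made up for this example, and I use bits.Mul64 as the fixed-width form of Mul. Whether a given call is actually lowered to an intrinsic or routed through the VDSO depends on the compiler version and the target, so treat this as a usage sketch and not as a benchmark.

```go
package main

import (
	"bytes"
	"fmt"
	"math/bits"
	"time"
)

// firstNewline (hypothetical helper) returns the index of the first '\n'
// in buf, or -1 if there is none. It is a thin wrapper over bytes.IndexByte.
func firstNewline(buf []byte) int {
	return bytes.IndexByte(buf, '\n')
}

// bitStats (hypothetical helper) returns the number of trailing zero bits
// and the number of set bits in x, using math/bits.
func bitStats(x uint16) (trailing, ones int) {
	return bits.TrailingZeros16(x), bits.OnesCount16(x)
}

// mulWide (hypothetical helper) returns the full 128-bit product of a and b
// as a (hi, lo) pair of 64-bit words. bits.Mul64 is available from Go 1.12.
func mulWide(a, b uint64) (hi, lo uint64) {
	return bits.Mul64(a, b)
}

func main() {
	start := time.Now() // clock read; may be served by the VDSO on Linux

	fmt.Println(firstNewline([]byte("POWER\nppc64le"))) // 5

	tz, oc := bitStats(0x00F0)
	fmt.Println(tz, oc) // 4 4

	hi, lo := mulWide(1<<63, 4) // 2^63 * 4 = 2^65
	fmt.Println(hi, lo)         // 2 0

	// Elapsed time varies from run to run, so only its sign is checked here.
	fmt.Println(time.Since(start) >= 0)
}
```

To see the effect on your own hardware, you would wrap calls like these in testing.B benchmarks and compare Go 1.11 against Go 1.12 on the same ppc64le machine.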

In addition, our team made other performance improvements in SSA and several bug fixes (including for gccgo). I also reviewed the port to IBM AIX, which will also be released in Go 1.12.

Happy New Year! I hope you all have a great 2019!

by Carlos Eduardo Seo