
Go 1.13 — Performance updates for POWER

 

Better late than never: the Go 1.13 updates. Here are my performance highlights for the POWER architecture:

  • Implementation of math/bits RotateLeft{32|64} as intrinsics in SSA reduces execution time by up to 2%.
  • Added new POWER9 instructions for counting trailing zeros in SSA. This reduces execution time of math/bits TrailingZeros by up to 9%.
  • Implementation of math/bits Add64 as an intrinsic in SSA reduces execution time by up to 63%.
  • New chacha20 algorithm implementation gives a performance boost of up to 254% on ppc64le.
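For readers who have not used these math/bits functions, here is a minimal sketch of what they compute. The `add128` helper is a hypothetical name introduced only for this illustration; it shows the carry chaining that `Add64` is designed for. Nothing in the code is specific to POWER: the same source compiles everywhere, and the intrinsics above only change which instructions the compiler emits on ppc64le.

```go
package main

import (
	"fmt"
	"math/bits"
)

// add128 is a hypothetical helper (not part of the standard library).
// It adds two 128-bit unsigned integers, each given as (hi, lo) 64-bit
// halves, by chaining the carry from the low half into the high half.
func add128(aHi, aLo, bHi, bLo uint64) (hi, lo uint64) {
	var carry uint64
	lo, carry = bits.Add64(aLo, bLo, 0)
	hi, _ = bits.Add64(aHi, bHi, carry) // overflow out of 128 bits is dropped
	return hi, lo
}

func main() {
	// Rotating left moves the top bit around to the bottom.
	fmt.Printf("%#x\n", bits.RotateLeft32(0x80000001, 1)) // 0x3
	fmt.Printf("%#x\n", bits.RotateLeft64(1, 63))         // 0x8000000000000000

	// Number of zero bits below the lowest set bit.
	fmt.Println(bits.TrailingZeros64(8)) // 3
	fmt.Println(bits.TrailingZeros64(0)) // 64 (defined result for zero)

	// The low halves overflow, so a carry of 1 propagates into the high half.
	hi, lo := add128(0, ^uint64(0), 0, 1)
	fmt.Println(hi, lo) // 1 0
}
```

Code written this way picks up the intrinsic versions automatically when rebuilt with Go 1.13 or later; no source changes are needed.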

In addition to these, I also added a new environment variable (GOPPC64) to allow processor level selection for instruction generation (e.g., GOPPC64=power9 tells the compiler it is allowed to generate POWER9 instructions). This enables us to generate new instructions and add performance optimizations in SSA for POWER9 and beyond, starting in Go 1.13.
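As a rough sketch of how GOPPC64 might be used from a build script, the program below prepares a `go build` invocation with the variable set. The helper name `power9BuildCmd` and the `./...` package pattern are assumptions made for this example. In Go 1.13 the accepted values are `power8` (the default) and `power9`. The sketch only constructs the command and does not run it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// power9BuildCmd is a hypothetical helper for this example. It prepares
// (but does not run) a "go build" command that targets linux/ppc64le and
// lets the compiler emit POWER9 instructions.
func power9BuildCmd(pkg string) *exec.Cmd {
	cmd := exec.Command("go", "build", pkg)
	// Entries appended last override earlier duplicates of the same key.
	cmd.Env = append(os.Environ(),
		"GOOS=linux",
		"GOARCH=ppc64le",
		"GOPPC64=power9",
	)
	return cmd
}

func main() {
	cmd := power9BuildCmd("./...")
	fmt.Println(cmd.Args) // [go build ./...]
	// Call cmd.Run() to actually perform the build.
}
```

Keep in mind that a binary built with GOPPC64=power9 may contain instructions that a POWER8 machine cannot execute, so the default is the safer choice when the target hardware is mixed.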

I also added POWER9 builders to the Go server farm, so we can test the POWER9-specific optimizations. From Go 1.13 onward, you will see two ppc64le builders in the Go Builder Dashboard: one for POWER8 and another for POWER9.

Finally, there are some bug fixes as well, mostly in the assembler.

Go 1.13 was the last release I worked on actively. Since I have moved to a new job within IBM, I will not be as active as before. However, I will still be around posting and reviewing patches, and helping maintain the ppc64le port.

by Carlos Eduardo Seo