Go 1.10 updates for Power

The Go 1.10 source is frozen now and I have some updates on Power architecture support.

In addition to the updates I’ve already given here regarding performance and POWER9 support, I also added:

  • New implementations of bytes.IndexByte and strings.IndexByte that use vector registers and instructions (commit be943df). The speedup can be up to 2.75x, depending on the slice size. For Go 1.11, I will probably look for more loops in basic functions that can be optimized with vector instructions.
  • Fixed the implementation of Data Cache instructions in the Go assembler.
  • Updated the syscall package to use newselect.
  • Fixed a very bad performance regression in both bytes.Compare and runtime.cmpstring introduced in Go 1.8.
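
As a rough illustration of what the IndexByte change affects, the sketch below calls the two standard library functions next to a plain byte-at-a-time loop. The names indexByteNaive and agree are hypothetical helpers written for this example, not part of the Go source; the loop only shows the kind of scalar search that a vectorized implementation replaces, and callers need no code changes to benefit.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// indexByteNaive is a hypothetical scalar reference: it checks one byte
// per iteration and returns the index of the first match, or -1.
func indexByteNaive(b []byte, c byte) int {
	for i, v := range b {
		if v == c {
			return i
		}
	}
	return -1
}

// agree reports whether the standard library functions and the scalar
// reference return the same index for the given input.
func agree(s string, c byte) bool {
	want := indexByteNaive([]byte(s), c)
	return bytes.IndexByte([]byte(s), c) == want &&
		strings.IndexByte(s, c) == want
}

func main() {
	s := "Go 1.10 on ppc64le"
	fmt.Println(bytes.IndexByte([]byte(s), 'p'))
	fmt.Println(strings.IndexByte(s, 'p'))
	fmt.Println(agree(s, 'p'), agree(s, 'z'))
}
```

The public API is unchanged, so the same program runs on any architecture; only the implementation behind bytes.IndexByte and strings.IndexByte differs on ppc64/ppc64le.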

Also, Go should be almost fully functional on ppc64le / Alpine Linux in 1.10. There are some timeout (hanging) problems when C code is linked with Go code, but pure Go programs should work properly now.

by Carlos Eduardo Seo

Go updates for POWER9 / ISA 3.0

I recently pushed two patches upstream related to the enablement of ISA 3.0 (POWER9) instructions. The first one, commit 526f342, enables instructions in the assembler. This includes new compares, loads, math operations, register moves, the new random number generator and the new copy/paste facility.

The second one, commit 6661cf6, enables the new internal/cpu package for Power. This package provides runtime CPU identification and capability detection. We will use it to write new runtime performance optimizations specific to POWER9 without breaking the code for POWER8.
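
The general pattern that runtime detection enables can be sketched as follows. Because internal/cpu can only be imported from inside the standard library, this example uses a hypothetical features struct in its place; the field name IsPOWER9 and the functions sumGeneric, sumPOWER9 and selectSum are assumptions made for illustration, and the "POWER9" variant here is ordinary Go standing in for what would really be ISA 3.0 assembly.

```go
package main

import "fmt"

// features is a hypothetical stand-in for the capability flags that a
// CPU detection package would fill in at program start.
type features struct {
	IsPOWER9 bool // assumed flag name, for illustration only
}

// sumGeneric is the baseline implementation that is valid on POWER8.
func sumGeneric(xs []uint64) uint64 {
	var total uint64
	for _, x := range xs {
		total += x
	}
	return total
}

// sumPOWER9 is a placeholder for a variant that would use ISA 3.0
// instructions. It must return the same result as sumGeneric.
func sumPOWER9(xs []uint64) uint64 {
	var total uint64
	for i := 0; i < len(xs); i++ {
		total += xs[i]
	}
	return total
}

// selectSum picks an implementation once, based on the detected
// features, so that POWER8 never executes the POWER9-only code path.
func selectSum(f features) func([]uint64) uint64 {
	if f.IsPOWER9 {
		return sumPOWER9
	}
	return sumGeneric
}

func main() {
	sum := selectSum(features{IsPOWER9: false})
	fmt.Println(sum([]uint64{1, 2, 3}))
}
```

The point of the design is that both variants ship in the same binary and the choice is made at run time, which is what keeps POWER8 systems working.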

Both will be available in Go 1.10. Alternatively, you can grab the current upstream code and build it yourself to use them.

by Carlos Eduardo Seo

Go 1.9 released

Go 1.9 is released and has some new features.

For the Power architecture, the minimum requirement for big endian is now POWER8. This unifies the hardware requirements for little endian and big endian and will make it easier for us to provide new features and optimizations for both.

In addition, we have many performance optimizations, including:

  • POWER8 performance optimizations for the math/big, strings and bytes packages
  • Compiler optimizations in the SSA backend for better instruction sequence generation
  • POWER8 performance optimization for AES

For more information, please check the release notes.

by Carlos Eduardo Seo

Go performance updates for Power

I rewrote most of the math/big arithmetic implementations so they perform better on Power. Commit 9459c03, which I mentioned here before, added new implementations for addVV/subVV with up to ~3x improvement. Now, commit 3cb41be adds better implementations for addMulVVW and mulAddVVW, which matter because they are used in the big-number multiplication functions. Speedups are up to ~1.5x.
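
For readers unfamiliar with these routines, here is a minimal pure-Go sketch of what addVV and addMulVVW compute. It is not the Power assembly from the commits above, and it uses uint with math/bits in place of math/big's internal Word type, which is an assumption made to keep the example self-contained. Both functions assume x (and y for addVV) are at least as long as z.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVV sets z = x + y, one machine word at a time with the least
// significant word first, and returns the final carry.
func addVV(z, x, y []uint) (c uint) {
	for i := range z {
		z[i], c = bits.Add(x[i], y[i], c)
	}
	return c
}

// addMulVVW sets z = z + x*y, where y is a single word, and returns
// the carry word left over after the most significant position.
func addMulVVW(z, x []uint, y uint) (c uint) {
	for i := range z {
		hi, lo := bits.Mul(x[i], y)
		lo, cc := bits.Add(lo, z[i], 0)
		hi += cc // cannot overflow: x*y + z + c fits in two words
		lo, cc = bits.Add(lo, c, 0)
		hi += cc
		z[i] = lo
		c = hi
	}
	return c
}

func main() {
	z := make([]uint, 2)
	carry := addVV(z, []uint{^uint(0), 0}, []uint{1, 0})
	fmt.Println(z, carry)
}
```

Multi-word multiplication calls a routine like addMulVVW once per word of the multiplier, which is why speeding up this inner loop shows up directly in big-number multiplication.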

My colleague Lynn also added more performance optimizations in the compiler, with new intrinsics for math operations (floor, ceil, trunc) in commit 0f19e24. Those show a drastic performance improvement, with an 88% reduction in benchmark time.

Finally, we fixed the compiler on ppc64le-alpine with commit 9aea0e8. There are still some testsuite failures on Alpine Linux on Power, but we are working quickly to fix those.

In the short term, we should have runtime CPU capability detection on Power, with the new internal/cpu package ported to ppc64/ppc64le, and Go assembler support for the new ISA 3.0 (POWER9) instructions.

by Carlos Eduardo Seo