Go performance updates for Power

I rewrote most of the math/big arithmetic implementations so they perform better on Power. Commit 9459c03, which I mentioned here before, added new implementations of addVV/subVV with up to ~3x improvement. Now commit 3cb41be adds better implementations of addMulVVW and mulAddVVW, which matter because they are used by the big-number multiplication functions. Speedups are up to ~1.5x.
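To make the role of these helpers concrete, here is a minimal pure-Go sketch of what an addMulVVW-style routine computes: z += x*y for a single-word multiplier y, returning the final carry. It is an illustration only, not the assembly from commit 3cb41be. It uses uint instead of math/big's Word type and the math/bits helpers, both of which are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addMulVVW is an illustrative stand-in (not the math/big code):
// it computes z[i] += x[i]*y with carry propagation and returns
// the carry out of the most significant word.
// x and z are assumed to have the same length.
func addMulVVW(z, x []uint, y uint) (carry uint) {
	for i := range z {
		hi, lo := bits.Mul(x[i], y) // full double-word product
		lo, c := bits.Add(lo, carry, 0)
		hi += c // cannot overflow: hi is at most 2^w - 2
		lo, c = bits.Add(lo, z[i], 0)
		hi += c
		z[i] = lo
		carry = hi
	}
	return carry
}

func main() {
	z := []uint{10, 20}
	x := []uint{3, 5}
	c := addMulVVW(z, x, 2)
	fmt.Println(z, c) // [16 30] 0
}
```

Schoolbook multiplication calls a loop like this once per word of the multiplier, which is why speeding it up helps big-number multiplication as a whole.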

My colleague Lynn also added more performance optimizations to the compiler, with new intrinsics for math operations (floor, ceil, trunc) in commit 0f19e24. These show a drastic improvement, reducing benchmark time by 88%.

Finally, we fixed the compiler on ppc64le-alpine with commit 9aea0e8. There are still some test suite failures on Alpine Linux on Power, but we are working to fix them quickly.

In the short term, we should have runtime CPU capability detection on Power, with the new internal/cpu package ported to ppc64/ppc64le, and support in the Go assembler for the new ISA 3.0 (POWER9).

by Carlos Eduardo Seo

Advance Toolchain for Linux on Power 10.0-4 released

A new update for the Advance Toolchain for Linux on Power 10.0 has been released.

Advance Toolchain for Linux on Power 10.0-4 new features

The complete list of bug and performance fixes, with details, is available at the official IBM website for the Advance Toolchain.

For more information about Power architecture and the OpenPOWER ecosystem, please visit the official OpenPOWER Foundation website. You can also follow our Linux on Power Community blog.

by Carlos Eduardo Seo


Go: Changed function alignment to 16 bytes for Power

I just added a simple but important change for Power. To eliminate inefficiencies in the iBuffer, all functions are now aligned to 16 bytes.

This opens up a new line of future work: adding an alignment directive to the assembler, so that loops can be properly aligned both by the compiler and in hand-written assembly.

Committed as 09b71d5.

by Carlos Eduardo Seo

Go: Performance optimization for addVV for Power

I added a new implementation of addVV (math/big package) for the Power architecture. The new assembly implementation uses Power-specific instructions and provides a speedup of ~3x over the generic Go implementation. It works on both little-endian and big-endian ppc64 and will be available in the upcoming Go 1.9 release.
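For context, the operation being optimized is a word-by-word vector addition with carry. The sketch below shows that operation in plain Go. It is not the generic math/big code or the new Power assembly: it uses uint in place of math/big's Word type and relies on bits.Add, which was added to math/bits in a later Go release, so treat both as assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVV is an illustrative stand-in: it sets z = x + y word by
// word and returns the carry out of the most significant word.
// All three slices are assumed to have the same length.
func addVV(z, x, y []uint) (carry uint) {
	for i := range z {
		z[i], carry = bits.Add(x[i], y[i], carry)
	}
	return carry
}

func main() {
	x := []uint{^uint(0), 1} // low word is all ones
	y := []uint{1, 2}
	z := make([]uint, 2)
	c := addVV(z, x, y)
	fmt.Println(z, c) // [0 4] 0
}
```

The loop is a single dependent chain of add-with-carry steps, which is the kind of pattern that hand-written assembly can map directly onto the processor's carry instructions.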

In addition, for Go 1.10, I plan to add math/big optimizations that use POWER9 instructions, which will help some of the multiply-and-add functions.

Committed as 9459c03.

by Carlos Eduardo Seo