TensorFlow SSE 4.1 and other CPU flags: recompiling is easier than it seems

When you run TensorFlow from the official binaries, which will happen more often now that the macOS builds no longer ship with GPU support, you get a recurring and annoying warning that it could run faster if only it had been compiled with a few more CPU instruction sets enabled. Fixing this seems daunting: you have to keep track of every instruction set the warning names, then work out which compiler flag corresponds to each. There are guides online that list all the flags that may or may not apply to your CPU.

This sounded familiar, though, and circuitous. I shouldn't need to spell all of this out. Sure enough, I tested both routes: passing the explicit compiler flags, and simply following the official install-from-source instructions, which default to -march=native (that is, target whatever the build machine's CPU supports). Both lead to the same result: a TensorFlow library that is more than happy to use all the fancy CPU instructions available. Yay, no need to complicate things. Be aware that compiling may take a while.
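
For comparison, here is a rough sketch of what the explicit route amounts to: collect the instruction sets named in the warnings and translate each into a compiler flag. The warning wording in the regular expression and the sample log are assumptions based on how the message typically reads, and the helper name flags_from_warnings is made up for illustration. The flags themselves (-msse4.1, -mavx and so on) are standard GCC/Clang options.

```python
import re

# Instruction-set names as they appear in the warning, mapped to the
# corresponding GCC/Clang flag. Extend this if your warnings name others.
FLAG_FOR_INSTRUCTION = {
    "SSE4.1": "-msse4.1",
    "SSE4.2": "-msse4.2",
    "AVX": "-mavx",
    "AVX2": "-mavx2",
    "FMA": "-mfma",
}

# Assumed wording of the warning; adjust the pattern if yours differs.
WARNING_PATTERN = re.compile(r"wasn't compiled to use (\S+) instructions")


def flags_from_warnings(log_text):
    """Return the compiler flags implied by the warnings, in order, without duplicates."""
    flags = []
    for name in WARNING_PATTERN.findall(log_text):
        flag = FLAG_FOR_INSTRUCTION.get(name)
        if flag and flag not in flags:
            flags.append(flag)
    return flags


# Hypothetical log excerpt, for illustration only.
sample_log = """
The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine.
The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine.
The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine.
"""

print(" ".join(flags_from_warnings(sample_log)))
```

With -march=native none of this bookkeeping is needed, since the compiler works out the supported instruction sets from the build machine itself.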
