Skip to content

Update OpenBLAS to v0.3.5#30583

Merged
ararslan merged 5 commits into
masterfrom
aa/openblas-0.3.5
Jan 6, 2019
Merged

Update OpenBLAS to v0.3.5#30583
ararslan merged 5 commits into
masterfrom
aa/openblas-0.3.5

Conversation

@ararslan

@ararslan ararslan commented Jan 4, 2019

Copy link
Copy Markdown
Member

Note that we're currently using v0.3.3 and this PR goes directly to the newest version, v0.3.5. See the release notes for both 0.3.4 and 0.3.5 for a more complete summary of the changes: https://github.com/xianyi/OpenBLAS/releases.

@staticfloat, we'll need some kind of BinaryBuilder magic for this as well, right? How do?

@ararslan ararslan added building Build system, or building Julia or its dependencies external dependencies Involves LLVM, OpenBLAS, or other linked libraries labels Jan 4, 2019
@ararslan ararslan requested a review from staticfloat January 4, 2019 08:35
@vchuravy

vchuravy commented Jan 4, 2019

Copy link
Copy Markdown
Member

@staticfloat

Copy link
Copy Markdown
Member

I've started a build of v0.3.5; we'll see what's broken this time around. ;)

@ViralBShah

Copy link
Copy Markdown
Member

We should increase the number of MAX THREADS to 40 or perhaps even 64 (on x86-64).

https://github.com/JuliaPackaging/Yggdrasil/blob/2b1576aeb8cdac4e783bd35f58e77fad97c09225/OpenBLAS/build_tarballs.jl#L46

@vtjnash

vtjnash commented Jan 4, 2019

Copy link
Copy Markdown
Member

We set that option low because it seemed to improve performance on small matrices and limits virtual memory usage (iirc, it reserves 256MB * MAX_THREADS). We can reconsider (esp. if OpenBLAS is improved now to initialize this lazily), but that should be a separate PR.

edit: cf 716151d

@ViralBShah

Copy link
Copy Markdown
Member

Yes, but the flip side now is that people complain they have bigger machines and Julia does not use all the cores.

@ViralBShah

Copy link
Copy Markdown
Member

The small matrix performance is driven by the gemm threshold (which is 50, so multi-threading is not used for smaller matrices). The main driver was memory when we did that change in 2013. I say that core counts and memories have grown enough that doubling the max threads now should be ok (ideally only on linux x86-64).

@ViralBShah

Copy link
Copy Markdown
Member

From the openblas 0.3.4 release notes:

OpenBLAS will now provide enough buffer space for at least 50 threads by default.

I am sure we override this with our setting.

@staticfloat

Copy link
Copy Markdown
Member

BB tarballs are good to go.

@ViralBShah let's open a separate issue, do some performance measurements, then decide the new value to default to.

@ViralBShah

Copy link
Copy Markdown
Member

Yes, agree that the max threads should not hold up the version update.

@ararslan

ararslan commented Jan 4, 2019

Copy link
Copy Markdown
Member Author

Thanks for taking care of the BinaryBuilder stuff, Elliot!

@ararslan

ararslan commented Jan 4, 2019

Copy link
Copy Markdown
Member Author

Travis macOS is trying to build GCC from source which filled up the max log length so Travis killed the build.

@ararslan

ararslan commented Jan 5, 2019

Copy link
Copy Markdown
Member Author

The macOS failure should be fixed by #30599.

@ararslan ararslan merged commit ffb6f1e into master Jan 6, 2019
@ararslan ararslan deleted the aa/openblas-0.3.5 branch January 6, 2019 00:48
@vtjnash

vtjnash commented Jan 8, 2019

Copy link
Copy Markdown
Member

I'm seeing a substantial number of BLAS tests failing now on Intel Xeon Silver 4114 (none are present if I revert this PR): https://gist.github.com/vtjnash/c4f09f3019b335a690862134807da41c

julia> using LinearAlgebra

julia> LinearAlgebra.versioninfo()
BLAS: libopenblas (OpenBLAS 0.3.5  USE64BITINT DYNAMIC_ARCH NO_AFFINITY SkylakeX MAX_THREADS=16)
LAPACK: libopenblas64_

julia> versioninfo()
Julia Version 1.2.0-DEV.131
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

@staticfloat

Copy link
Copy Markdown
Member

Yep, looks like an OpenBLAS bug. Let's open an issue on OpenBLAS. I'll take a look to see if there's anything suspicious they've merged against SkylakeX recently and try reverting those patches.

@ViralBShah

ViralBShah commented Jan 8, 2019

Copy link
Copy Markdown
Member

Why not just move to an older version of openblas and adopt a new version when they fix things upstream? Would be lesser work overall.

@andreasnoack

Copy link
Copy Markdown
Member

There might be some important bug fixes in the latest release #30460 (comment)

@staticfloat

Copy link
Copy Markdown
Member

Sure, we could revert back to 0.3.3 (0.3.4 won't build on all our platforms) but there are some nice things to have in 0.3.5, lots of bugs fixed, so it's better IMO to disable a few optimized kernels instead.

@ViralBShah ViralBShah added the linear algebra Linear algebra label Jan 9, 2019
@ViralBShah

ViralBShah commented Jan 9, 2019

Copy link
Copy Markdown
Member

I see pinv, bunchkaufman, qr, and triangular failing on skylake. I've been trying to find a reproducible test case for OpenMathLib/OpenBLAS#1955

The openblas failure is strange. On skylake, the tests that fail in triangular, if I run them on the command line, they are fine. Perhaps some buffer overruns somewhere in openblas?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

building Build system, or building Julia or its dependencies external dependencies Involves LLVM, OpenBLAS, or other linked libraries linear algebra Linear algebra

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants