BLAS Performance Comparisons

Discussions center on benchmarking new matrix libraries against optimized BLAS implementations such as OpenBLAS and MKL, and against C++ libraries such as Eigen. Commenters debate whether the newcomers are actually faster on various hardware and question the value of alternatives to established linear algebra libraries.

📉 Falling 0.4x Programming Languages
Comments: 2,629
Years Active: 19
Top Authors: 5
Topic ID: #9586

Activity Over Time

2008: 1
2009: 17
2010: 28
2011: 28
2012: 55
2013: 127
2014: 61
2015: 110
2016: 157
2017: 178
2018: 192
2019: 174
2020: 279
2021: 211
2022: 290
2023: 220
2024: 312
2025: 176
2026: 13

Keywords

SKX ATLAS e.g CPU TR98751 HOW softlib.rice ASSEMBLY MATLAB godoc.org matrix performance benchmarks library matlab libraries numpy julia intel tuned

Sample Comments

go_elmo • Sep 12, 2021 • View on HN

Why exactly is this better than atlas / blas / any library using it, e.g. Eigen?

MobiusHorizons • May 12, 2023 • View on HN

BLAS is a very well optimized library. I think a lot of it is in Fortran, which can be faster than c. It is very heavily used in scientific compute. BLAS also has methods that have been hand tuned in assembly. It's not magic, but the amount of work that has gone into it is not something you would probably want to replicate.

skidrow • Jul 11, 2024 • View on HN

are OpenBLAS and MKL not well optimized lol? They literally compared against OpenBLAS/MKL and posted the results in the article. As someone already mentioned, this implementation is faster than MKL even on an Intel Xeon with 96 cores. Maybe you missed the point, but the purpose of the article was to show HOW to implement matmul without FORTRAN/ASSEMBLY code with NumPy-like performance. NOT how to write a BLIS-competitive library. So the article and the code seem to be LGTM.

p1esk • Sep 1, 2018 • View on HN

It would help to see performance benchmarks against blas or armadillo, etc.

martopix • Dec 8, 2022 • View on HN

I always hear this "fast matrix operations" argument from Matlab users, but don't they both use BLAS? The difference can only be marginal

agibsonccc • Dec 16, 2013 • View on HN

Change my matrix library to a BLAS binding for machine learning. 150x speedup

tycho01 • Aug 28, 2015 • View on HN

Uses BLAS but no mention of cuBLAS to speed things up? Does that mean the linear algebra wasn't big enough a component to merit optimizing on?

StefanKarpinski • Jan 30, 2022 • View on HN

Eagerly awaiting matrix libraries written in pure Python that outperform BLAS.

jey • Nov 17, 2015 • View on HN

You don't just call DGEMM from vendor BLAS?

willis936 • Apr 29, 2022 • View on HN

Don't all high performance math libraries have the option of LAPACK interfaces?
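
Illustrative sketch

Most of the comparisons debated above come down to timing a matrix multiply and converting the result to GFLOP/s. The following is a minimal sketch of such a harness, not taken from any of the linked articles. The names `naive_matmul`, `gemm_flops`, and `benchmark` are hypothetical, and the pure-Python triple loop stands in for whatever library is being tested. It uses only the standard library, so it measures no BLAS implementation by itself.

```python
import time


def naive_matmul(a, b):
    """Textbook triple-loop multiply on lists of lists (no BLAS involved)."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        row_a, row_out = a[i], out[i]
        for p in range(k):
            a_ip = row_a[p]
            row_b = b[p]
            for j in range(m):
                row_out[j] += a_ip * row_b[j]
    return out


def gemm_flops(n, k, m):
    """Operation count for (n x k) @ (k x m): one multiply and one add per term."""
    return 2 * n * k * m


def benchmark(matmul, a, b, repeats=3):
    """Return (best wall-clock seconds, GFLOP/s) over `repeats` runs of matmul(a, b)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)
    flops = gemm_flops(len(a), len(b), len(b[0]))
    return best, flops / best / 1e9


if __name__ == "__main__":
    n = 64  # small on purpose; the pure-Python loop is slow
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    seconds, gflops = benchmark(naive_matmul, a, b)
    print(f"naive {n}x{n}: {seconds:.4f} s, {gflops:.4f} GFLOP/s")
```

To compare against a BLAS-backed routine, one could pass `numpy.matmul` and two NumPy arrays to the same `benchmark` function, assuming NumPy is installed and linked against OpenBLAS or MKL. A fair comparison would also fix the thread count and the matrix sizes, since those are the points the commenters dispute most often.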