mir-glas 0.2.3
Generic Linear Algebra Subprograms
glas
LLVM-accelerated Generic Linear Algebra Subprograms (GLAS)
Description
GLAS is a C library written in Dlang. It requires neither the C++ runtime nor the D runtime, only libc, which is available everywhere.
The library provides
 BLAS (Basic Linear Algebra Subprograms) API.
 GLAS (Generic Linear Algebra Subprograms) API.
The CBLAS API can be provided by linking with Netlib's CBLAS library.
dub
GLAS can be used with both DMD and LDC, but LDC (LLVM D Compiler) >= 1.1.0 beta 6 must be installed in a common path in either case.
Note the performance issue: https://github.com/libmir/mir-glas/issues/18.
GLAS can be included automatically in a project using dub (the D package manager). DUB will build GLAS and CPUID manually with LDC.
{
    ...
    "dependencies": {
        "mir-glas": "~><current_mir-glas_version>",
        "mir-cpuid": "~><current_mir-cpuid_version>"
    },
    "lflags": ["-L$MIR_GLAS_PACKAGE_DIR", "-L$MIR_CPUID_PACKAGE_DIR"]
}
DUB automatically replaces $MIR_GLAS_PACKAGE_DIR and $MIR_CPUID_PACKAGE_DIR with the appropriate directories.
Usage
mir-glas can be used like any ordinary C library. It must be linked with mir-cpuid.
Some compilers, for example GCC, may require mir-cpuid to be passed after mir-glas: -lmir-glas -lmir-cpuid.
GLAS API and Documentation
Documentation can be found at http://docs.glas.dlang.io/.
The GLAS API is based on the new ndslice from mir-algorithm.
Other languages can use a simple structure definition.
Examples are available for C and for Dlang.
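To make the idea of a "simple structure definition" concrete, here is a minimal C sketch of a strided two-dimensional view. It assumes such a view is described by lengths, strides and a data pointer. The type name `matrix_view`, its field order, the `double` element type and the helper functions are all hypothetical and are not copied from glas/ndslice.h, so check the real header before relying on any layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D strided view; NOT the struct from glas/ndslice.h. */
typedef struct {
    size_t    lengths[2];  /* number of rows, number of columns */
    ptrdiff_t strides[2];  /* element step between rows, between columns */
    double   *ptr;         /* first element of the view */
} matrix_view;

/* Wrap a dense row-major buffer without copying it. */
static matrix_view view_row_major(double *data, size_t rows, size_t cols)
{
    matrix_view v = { { rows, cols }, { (ptrdiff_t)cols, 1 }, data };
    return v;
}

/* Address of element (i, j), computed from the strides. */
static double *view_at(matrix_view v, size_t i, size_t j)
{
    assert(i < v.lengths[0] && j < v.lengths[1]);
    return v.ptr + (ptrdiff_t)i * v.strides[0] + (ptrdiff_t)j * v.strides[1];
}

/* Transpose by swapping lengths and strides; no data is moved. */
static matrix_view view_transposed(matrix_view v)
{
    matrix_view t = { { v.lengths[1], v.lengths[0] },
                      { v.strides[1], v.strides[0] }, v.ptr };
    return t;
}
```

Because a view of this kind stores only lengths and strides, a transpose is just a new descriptor over the same memory, which is one way an API built on such views can avoid extra data copies.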
Headers
C/C++ headers are located in include/.
D headers are located in source/.
There are two files:
 glas/fortran.h / glas/fortran.d for Netlib's BLAS API
 glas/ndslice.h / glas/ndslice.d for the GLAS API
Manual Compilation
Compiler installation
LDC (LLVM D Compiler) >= 1.1.0 beta 6 is required to build a project.
Version 1.1.0 has not been released yet.
You may want to build LDC from source or use LDC 1.1.0 beta 6.
Beta 2 generates a lot of warnings that can be ignored. Beta 3 is not supported.
LDC binary packages contain two compilers: ldc2 and ldmd2. It is recommended to use ldmd2 with mir-glas.
Recent LDC packages come with the dub package manager. dub is used to build the project.
Mir CPUID
Mir CPUID provides CPU identification routines.
Download mir-cpuid
dub fetch mir-cpuid --cache=local
Change the directory
cd mir-cpuid-<current-mir-cpuid-version>/mir-cpuid
Build mir-cpuid
dub build --build=release-nobounds --compiler=ldmd2 --build-mode=singleFile --parallel --force
You may need to add --arch=x86_64 if you use Windows.
Copy libmir-cpuid.a to your project or add its directory to the library path.
Mir GLAS
Download mir-glas
dub fetch mir-glas --cache=local
Change the directory
cd mir-glas-<current-mir-glas-version>/mir-glas
Build mir-glas
dub build --config=static --build=target-native --compiler=ldmd2 --build-mode=singleFile --parallel --force
You may need to add --arch=x86_64 if you use Windows.
Copy libmir-glas.a to your project or add its directory to the library path.
Status
Contributions are welcome! The hardest part (GEMM) is already implemented.
 [x] CI testing with Netlib's BLAS test suite.
 [x] CI testing with Netlib's CBLAS test suite.
 [ ] CI testing with Netlib's LAPACK test suite.
 [ ] CI testing with Netlib's LAPACKE test suite.
 [ ] Multithreading
 [ ] GPU backend
 [ ] Shared library support - requires only DUB configuration fixes.
 [ ] Level 3 - matrix-matrix operations
   [x] GEMM - matrix matrix multiply
   [x] SYMM - symmetric matrix matrix multiply
   [x] HEMM - hermitian matrix matrix multiply
   [ ] SYRK - symmetric rank-k update to a matrix
   [ ] HERK - hermitian rank-k update to a matrix
   [ ] SYR2K - symmetric rank-2k update to a matrix
   [ ] HER2K - hermitian rank-2k update to a matrix
   [ ] TRMM - triangular matrix matrix multiply
   [ ] TRSM - solving triangular matrix with multiple right hand sides
 [ ] Level 2 - matrix-vector operations
   [ ] GEMV - matrix vector multiply
   [ ] GBMV - banded matrix vector multiply
   [ ] HEMV - hermitian matrix vector multiply
   [ ] HBMV - hermitian banded matrix vector multiply
   [ ] HPMV - hermitian packed matrix vector multiply
   [ ] TRMV - triangular matrix vector multiply
   [ ] TBMV - triangular banded matrix vector multiply
   [ ] TPMV - triangular packed matrix vector multiply
   [ ] TRSV - solving triangular matrix problems
   [ ] TBSV - solving triangular banded matrix problems
   [ ] TPSV - solving triangular packed matrix problems
   [ ] GERU - performs the rank 1 operation A := alpha*x*y' + A
   [ ] GERC - performs the rank 1 operation A := alpha*x*conjg(y') + A
   [ ] HER - hermitian rank 1 operation A := alpha*x*conjg(x') + A
   [ ] HPR - hermitian packed rank 1 operation A := alpha*x*conjg(x') + A
   [ ] HER2 - hermitian rank 2 operation
   [ ] HPR2 - hermitian packed rank 2 operation
 [x] Level 1 - vector-vector and scalar operations. Note: Mir already provides a generic implementation.
   [x] ROTG - setup Givens rotation
   [x] ROTMG - setup modified Givens rotation
   [x] ROT - apply Givens rotation
   [x] ROTM - apply modified Givens rotation
   [x] SWAP - swap x and y
   [x] SCAL - x = a*x. Note: requires additional optimization for complex numbers.
   [x] COPY - copy x into y
   [x] AXPY - y = a*x + y. Note: requires additional optimization for complex numbers.
   [x] DOT - dot product
   [x] DOTU - dot product. Note: requires additional optimization for complex numbers.
   [x] DOTC - dot product, conjugating the first vector. Note: requires additional optimization for complex numbers.
   [x] DSDOT - dot product with extended precision accumulation and result
   [x] SDSDOT - dot product with extended precision accumulation
   [x] NRM2 - Euclidean norm
   [x] ASUM - sum of absolute values
   [x] IAMAX - index of max abs value
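For readers unfamiliar with the routine names in the list, the following C sketch spells out what GEMM computes, C := alpha*A*B + beta*C, for dense row-major matrices. It is a plain triple loop written for clarity. `ref_gemm` is a hypothetical helper, not the GLAS or BLAS entry point, and it makes no attempt at the blocking and vectorization a real implementation needs.

```c
#include <assert.h>
#include <stddef.h>

/* Reference (unoptimized) GEMM: C := alpha*A*B + beta*C.
   A is m x k, B is k x n, C is m x n, all dense row-major.
   Illustrative helper only; not part of the GLAS or BLAS API. */
static void ref_gemm(size_t m, size_t n, size_t k,
                     double alpha, const double *a, const double *b,
                     double beta, double *c)
{
    assert(a && b && c);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Dot product of row i of A with column j of B. */
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

A loop like this is mainly useful as a correctness oracle when testing an optimized routine on small inputs.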
Porting to a new target
Five steps
 Implement the cpuid_init function for mir-cpuid. This function should be implemented per platform or OS. Already implemented targets are
   x86, any OS
   x86_64, any OS
 Verify that source/glas/internal/memory.d contains an implementation for the OS. Already implemented targets are
   Posix (Linux, macOS, and others)
   Windows
 Add a new configuration for register blocking to source/glas/internal/config.d. Configurations are already available for
   x87
   SSE2
   AVX / AVX2
   AVX512 (requires LLVM bug fixes).
 Create a Pull Request.
 Coordinate with the LDC team in case of compiler bugs.
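The register blocking mentioned in the third step can be illustrated with a small C sketch: a tile of the result is accumulated in local variables, which the compiler can keep in registers, and written back once. The 2x2 block size and the names `MR`, `NR`, `micro_kernel_2x2` and `blocked_gemm` are assumptions made for this illustration. They are not the kernels or the configuration format used in source/glas/internal/config.d.

```c
#include <assert.h>
#include <stddef.h>

enum { MR = 2, NR = 2 };  /* hypothetical register-block sizes */

/* Compute one MR x NR tile: C_tile += A_panel * B_panel.
   a: MR x k panel, row-major with leading dimension lda
   b: k x NR panel, row-major with leading dimension ldb
   c: MR x NR tile, row-major with leading dimension ldc */
static void micro_kernel_2x2(size_t k,
                             const double *a, size_t lda,
                             const double *b, size_t ldb,
                             double *c, size_t ldc)
{
    /* The four accumulators form the register block. */
    double c00 = 0.0, c01 = 0.0, c10 = 0.0, c11 = 0.0;
    for (size_t p = 0; p < k; ++p) {
        double a0 = a[0 * lda + p], a1 = a[1 * lda + p];
        double b0 = b[p * ldb + 0], b1 = b[p * ldb + 1];
        c00 += a0 * b0; c01 += a0 * b1;
        c10 += a1 * b0; c11 += a1 * b1;
    }
    /* Write the tile back to memory once, after the inner loop. */
    c[0 * ldc + 0] += c00; c[0 * ldc + 1] += c01;
    c[1 * ldc + 0] += c10; c[1 * ldc + 1] += c11;
}

/* C += A*B for row-major A (m x k), B (k x n), C (m x n).
   m and n must be multiples of the block sizes in this sketch. */
static void blocked_gemm(size_t m, size_t n, size_t k,
                         const double *a, const double *b, double *c)
{
    assert(m % MR == 0 && n % NR == 0);
    for (size_t i = 0; i < m; i += MR)
        for (size_t j = 0; j < n; j += NR)
            micro_kernel_2x2(k, a + i * k, k, b + j, n, c + i * n + j, n);
}
```

Under this reading, a configuration for a new target would mostly come down to choosing block sizes that fit the number and width of that target's vector registers.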
Questions & Answers
Why is GLAS called "Generic ..."?
 GLAS has a generic internal implementation, which can be easily ported to any other architecture with minimal effort (5 minutes).
 The GLAS API provides more functionality than BLAS.
 It is written in Dlang using generic programming.
Why is it better than other open-source BLAS libraries such as OpenBLAS and Eigen?
 GLAS is faster.
 The GLAS API is more user-friendly and does not require additional data copying.
 Unlike Eigen, GLAS does not require the C++ runtime.
 GLAS does not require platform-specific optimizations such as Eigen's intrinsics micro-kernels or OpenBLAS's assembler macro-kernels.
 GLAS has a simple implementation, which can be easily ported and extended.
Why does GLAS not have Lazy Evaluation and Aliasing like Eigen?
GLAS is a lower-level library than Eigen. For example, GLAS can be an Eigen BLAS backend in the future.
Lazy Evaluation and Aliasing can be easily implemented in D.
Explicit composition of operations can be done using mir.ndslice.algorithm and the multidimensional map from mir.ndslice.topology, which is a generic way to perform any lazy operations you want.
 Registered by Ilya Yaroshenko
 libmir/mir-glas
 BSL-1.0
 Copyright © 2016, Ilya Yaroshenko
 Authors:
 Ilya Yaroshenko
 Dependencies:
 mir-algorithm
 Versions:
 0.2.3 2017-May-18
 0.2.2 2017-Mar-31
 0.2.1 2017-Mar-31
 0.2.0 2017-Mar-29
 0.1.5 2017-Mar-20