Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding matrix norms to the stdlib_linalg module. #820

Open
loiseaujc opened this issue May 17, 2024 · 11 comments
Open

Adding matrix norms to the stdlib_linalg module. #820

loiseaujc opened this issue May 17, 2024 · 11 comments
Labels
idea Proposition of an idea and opening an issue to discuss it

Comments

@loiseaujc
Copy link

Motivation

I'm currently updating my package LightKrylov to make use as much as possible of the new linalg features offered in the stdlib_linalg module. Among the things I'll need eventually are functions to compute matrix norms. Is there any on-going work at the moment from @perazz, @jvdp1, or @jalvesz on this subject or could I make this my first task with stdlib ?

Prior Art

  • SciPy: scipy.linalg.norm(A, ord=None, axis=None, keepdims=False, check_finite=True). It handles both standard vector norms as well as a variety of vector-induced and non-induced matrix norms.
  • Julia: LinearAlgebra.norm(x, p) where x is an n-vector returns the $p$-norm of this vector.
  • Julia: LinearAlgebra.norm(A, p) where A is an m x n matrix returns the "entry-wise" $p$-norm of A.
  • Julia : LinearAlgebra.opnorm(A, p) where A is an m x n matrix returns the vector-induced $p$-norm of A.

Additional Information

While most people might be accustomed to the SciPy standard, my personal preference would still go to the Julia principle of separating the true vector-induced norms (i.e. 1-norm, 2-norm, $\infty$-norm) from the entrywise ones (e.g. Frobenius) although I'm obviously open to discussion.

@loiseaujc loiseaujc added the idea Proposition of an idea and opening an issue to discuss it label May 17, 2024
@perazz
Copy link
Contributor

perazz commented May 17, 2024

Great idea @loiseaujc. Rn I'm trying to tackle the decompositions (pseudo-inverse, Cholesky, QR) and I was planning to address norms and condition number next. So, your contribution would be very welcome!

As a way to separate between the two approaches to norm, would there be a way to use the same norm interface for all, but with different arguments maybe? I was loosely thinking that something like norm(A,2) vs. norm(A,'Frobenius') could work.

@loiseaujc
Copy link
Author

loiseaujc commented May 17, 2024

We probably could. Julia is more exhaustive than SciPy for this. Having norm and opnorm allows them to have both the element wise p-norm (for abitrary p) and the induced one (for p=1, 2 and infinity). SciPy is restricted to the standard operator induced norms + the Frobenius one. I think it mostly depends on how exhaustive we want to be (and how often the non-Frobenius element wise norms are actually being used which I think is not much).

The more I think about it, the more it feels like the Julia way may actually cover more use-cases than are actually used in practice. I still like the difference between norm and opnorm though just to recall the user that these are fundamentally different mathematical objects. Having the two interfaces might also make it more transparent when a 2D array represent an actual matrix or a collection of vectors instead of having to play with the axis keyword.

@loiseaujc
Copy link
Author

loiseaujc commented May 17, 2024

I took some time to think about all the different norms (both for vectors and matrices) I've ever needed in my scientific computing adventure over the past 15 years or so. Here is a pretty standard list (+ whatever scipy/Julia are offering).

Vector norms

  • norm(x, 1): 1-norm of a vector (i.e. sum of absolute values),
  • norm(x, 2): 2-norm of a vector (i.e. standard Euclidean norm),
  • norm(x, inf): $\infty$-norm of a vector (i.e. maximum absolute value),
  • norm(x, p): p-norm of a vector.

Among these four, the Euclidean 2-norm is by far the most popular and, more often than not, is what people mean when discussing the norm of a vector $\mathbf{x}$.

I've used the other three quite regularly as well but more from a convex optimization perspective. I hardly ever had to actually compute them but they're simple enough to implement and I believe are something everybody expects from any standard linear algebra module.

Matrix norms

  • opnorm(A, 1): 1-norm of a matrix (i.e. maximum absolute column sum)
  • opnorm(A, 2): 2-norm of a matrix (i.e. largest singular value)
  • opnorm(A, inf): $\infty$-norm of a matrix (i.e. maximum absolute row sum)
  • opnorm(A, "nuclear"): nuclear norm (i.e. sum of singular values)
  • opnorm(A, "fro"): Frobenius norm.

Like everyone, I use the Frobenius norm all the time. Similarly, the 2-norm and the nuclear norm are quite extensively used in model order reduction. As before, the 1-norm and the $\infty$-norm, although I don't quite know where they are being used, are some things everybody would expect from a standard linalg module.

There are a bunch of other matrix-related norms that I have used or seen being used over the years but they are more of a niche thing. These include:

  • schatten_norm(A, p): Schatten p-norm (i.e. the vector p-norm applied to the singular values of A).
  • lognorm(A, 2): Log-norm of a matrix, also known as its numerical abcissa and defined as $\lambda_{\max} ( \frac{\mathbf{A} + \mathbf{A}^H}{2})$.

Among these two, the lognorm is probably the one I've used most often. It is related to the non-normality of the matrix and proves important when studying linear time invariant systems of the form $\dot{x} = Ax + Bu$.


Here are my thoughts (in no particular order) on the matter:

  • Even if it goes against the scipy notation, I kind of prefer the norm/opnorm dichotomy as it makes norm(x, 2) and opnorm(x, 2) very clear from just reading the code and not having to check the declarations to see if x is rank-1 or rank-2 array.
  • norm and opnorm should return the Euclidean and Frobenius norms by default, respectively.
  • The schatten_norm and lognorm could be added although they wouldn't be very high in my priority list.
  • As a starting point, the implementations should focus on general matrices. Specialized computations for say Hessenberg matrices, Circulant matrices etc could be included later on. Since they're all standard rank-2 arrays, we can't naturally dispatch based on the matrix type and, the simplest, would be to write a master subroutine with is_symmetric(A), is_hessenberg(A) etc dispatching to the specialized solvers. If you know of any better/more fortranic way, I'm all ears.
  • On the development and mathematical sides, I would restrict norm to take only a rank-1 array as argument. This would make less code bloat to handle all the corner cases scipy is handling (i.e. whether the norm is applied on the whole array, only along the rows, or only the columns). On the other hand, this would force the user to write a for loop to compute the norm of a collection of vectors represented as a rank-2 array and would be a major departure from the numpy/scipy standard.

I tend to think that being strict about the rank-1 array argument for norm is a better scientific/programming practice than what scipy currently offers. It doesn't leave any room for interpretation. Forcing the user to write a for loop for a collection of vectors might also make the code more readable and less error-prone (although it may possibly cause some performance loss (?), I don't know).

@perazz : On a side note, do you have a publicly available roadmap for the development of the stdlib_linalg module? I have seen some discussions on the fortran-lang discourse but nothing centralized. Translating/adapting everything scipy.linalg is offering is too much for a single person, even a one-man team. I suppose however that we all have bits and pieces scattered around our different code bases (e.g. I have some sketch for schur, expm, hankel, toeplitz, etc).

@jalvesz
Copy link
Contributor

jalvesz commented May 18, 2024

@loiseaujc I basically agree with everything you said! I just have one question regarding the naming opnorm, is it for "operator norm" as to say the matrix is an operator on a vector space? could there be a more "explicit" name to make the distinction? in any case, totally agree that it is good to have 2 well distinguished interfaces. If not, opnorm is just fine.

Regarding your query about the type of matrix, I think stdlib has in the linalg module https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html there are several checks such https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html#is_hermitian-checks-if-a-matrix-is-hermitian, https://stdlib.fortran-lang.org/page/specs/stdlib_linalg.html#is_hessenberg-checks-if-a-matrix-is-hessenberg among others.

I tend to think that being strict about the rank-1 array argument for norm is a better scientific/programming practice than what scipy currently offers. It doesn't leave any room for interpretation. Forcing the user to write a for loop for a collection of vectors might also make the code more readable and less error-prone (although it may possibly cause some performance loss (?), I don't know).

Agree, and I don't think that in Fortran you would have a performance loss (that would be true in python). If the implementations and interfaces are well designed the compilers might even be able to properly vectorize nicely. One thing to consider though, would be to have at least dimensions 2, 3 and 4 with explicit unrolled implementations and then a generic one for d>4.

@loiseaujc
Copy link
Author

I've used norm and opnorm in the discussion because that is what is being used in Julia. opnorm indeed stands for operator norm. While using norm for vectors is pretty obvious, I don't have a strong opinion regarding opnorm (although I don't have a better alternative to propose).

@perazz
Copy link
Contributor

perazz commented May 18, 2024

Thanks @loiseaujc for the detailed comments. I very mostly agree with your ideas and I would summarize the consensus so far:

  • let's have two separate functions for vector (norm) and matrix (opnorm) norms
  • let's start from general matrix types, and then we can extend them later.

Here is some further thoughts I'm having about the design.

  • the Fortran standard has norm2 that works for both arrays and matrices with the optional dim specification. We should find a way to be sure our notation is not easily confused with this.
  • We should find a unique way to express the norm type. For example, could it be that we require a character input for all norms, instead of a numeric one? I'm saying so because I'd personally like more having to deal with something like norm(x, '2'), norm(x, 'L2'), norm(x, 'inf') rather than having to do norm(x, ieee_value(0.0, ieee_positive_inf))). So I think the code would be clear for both the user and the developer:
character(*), optional, intent(in) :: norm_type
select case (norm_type)
   case ('2','Euclidean')
      ...
   case ('inf','Inf','Infinity')
     ...
end select   
  • For the (later) specialized matrix norms, I would think that an optional additional flag like is_hessenberg=.true. or matrix_form='Hessenberg' is probably the most performant design, because we can't really check if the matrix complies to any special properties at all times, it would be too expensive.

@loiseaujc
Copy link
Author

  • the Fortran standard has norm2 that works for both arrays and matrices with the optional dim specification. We should find a way to be sure our notation is not easily confused with this.

I shall be able to start working on this by the end of next week. I'll start with the baseline implementation and will see from there how we can improve and make sure it is not confounded with norm2.

  • We should find a unique way to express the norm type. For example, could it be that we require a character input for all norms, instead of a numeric one? I'm saying so because I'd personally like more having to deal with something like norm(x, '2'), norm(x, 'L2'), norm(x, 'inf') rather than having to do norm(x, ieee_value(0.0, ieee_positive_inf))). So I think the code would be clear for both the user and the developer:
character(*), optional, intent(in) :: norm_type
select case (norm_type)
   case ('2','Euclidean')
      ...
   case ('inf','Inf','Infinity')
     ...
end select   

I like this.

  • For the (later) specialized matrix norms, I would think that an optional additional flag like is_hessenberg=.true. or matrix_form='Hessenberg' is probably the most performant design, because we can't really check if the matrix complies to any special properties at all times, it would be too expensive.

Agreed. I prefer the matrix_form = "Hessenberg" rather than is_hessenberg = .true.. I may be naïve but the second would incur a lot of optional args (is_hessenberg, is_symmetric, is_triangular, etc) of which only one can be set to .true.. The matrix_form = string seems easier to handle and if not specified, opnorm defaults to the computation using the general matrix approach.

@perazz
Copy link
Contributor

perazz commented May 20, 2024

I prefer the matrix_form = "Hessenberg"

Absolutely agree, and it would be more in line with the norm_type argument

@loiseaujc
Copy link
Author

Here is a draft/hacky implementation of norm I'd like to discuss a bit before drafting the PR.

    #:for rk, rt, ri in RC_KINDS_TYPES
    #:if rk !="xdp"
    module function stdlib_linalg_norm_${ri}$(x, which) result(out)
        !> Input vector x[n]
        ${rt}$, intent(in) :: x(:)
        !> [optional] Which vector norm is being computed.
        character(len=*), optional, intent(in) :: which
        !> Norm of the vector.
        real(${rk}$) :: out

        !> Internal variables.
        integer(ilp) :: n, idx
        character(len=:), allocatable :: which_

        !> Dimension of the vector.
        n = size(x, kind=ilp)

        !> Default: Euclidean 2-norm.
        which_ = optval(which, "2") ; which_ = to_lower(which_)

        !> Dispatch computation to the appropriate driver.
        select case (which_)
            case ("2", "euclidean", "l2")
                out = nrm2(n, x, 1_ilp)
            case ("1", "l1")
                #:if rt[0] == "r"
                out = stdlib_${rk[0]}$asum(n, x, 1_ilp)
                #:elif rt[0] == "c" and rk[0] == "s"
                out = stdlib_scsum1(n, x, 1_ilp)
                #:elif rt[0] == "c" and rk[0] == "d"
                out = stdlib_dzsum1(n, x, 1_ilp)
                #:elif rt[0] == "c" and rk[0] == "q"
                out = stdlib_qzsum1(n, x, 1_ilp)
                #:endif
            case ("inf")
                #:if rt[0] == "r"
                idx = stdlib_i${rk[0]}$amax(n, x, 1_ilp)
                #:else
                idx = stdlib_i${ri[0]}$max1(n, x, 1_ilp)
                #:endif
                out = abs(x(idx))
            case default
                stop
        end select
    end function stdlib_linalg_norm_${ri}$

    #:endif
    #:endfor

After having extended the interface for nrm2 for complex-valued vectors, the code compiles correctly. I haven't yet implemented the tests though, but I don't see any obvious errors. I do have a few questions though to polish it and make it more fypp-/stdlib-compliant. There are a couple of things I'm not quite sure how to handle properly:

  • At the moment, we have the $\ell_2$ norm (default), the $\ell_1$ norm as well as the infinity norm. In my original post, I discussed the possibility also to have a generic $p$-norm. Since which is of type character(len=*), we would need to check that it does not correspond to a case already handled by the dispatch, that which is one character-long (assuming we only consider integer $p$), convert the character to a float (either single or double) and then compute $\sqrt[p]{\sum \vert x_i \vert^p}$. This seems unnecessarily complicated for a relatively edge case. A simple option however would be to have a dedicated stdlib_linalg_pnorm_${ri}$ function for which which is actually a non-negative float or integer and put it under the same interface as `stdlib_linalg_norm_${ri}$. What would be your take on this?
  • Second point has to do with the 1-norm computation. Is there a more fypp-esque way to handle the different types? The problem I see is that while asum is defined for both real and complex, sum1 only is defined for complex. I hesitate on whether I should keep the code as it is or if we should create an interface for asum, as well as one for sum1 which under the hood also calls asum if x is real. Basically something like this (don't pay attention to the syntax)
interface asum
    module function rsasum
    module function csasum
    ...
end interface

and

interface sum1
    module function rsasum
    module function csum1
    ...
end interface

asum for real-valued vectors would basically end up being referenced by two different interfaces. I'm not sure whether this is the cleanest thing to do although it would somehow make some sense from a mathematical point of view.

  • Last point finally is that I'm not sure how to properly handle the case where which is none of the cases covered by the dispatch (e.g. user made a typo).

I'll wait for your feedback at the moment and will start implementing the tests and documentation in the mean time.

@jalvesz
Copy link
Contributor

jalvesz commented Jun 29, 2024

A simple option however would be to have a dedicated stdlib_linalg_pnorm_${ri}$ function for which which is actually a non-negative float or integer and put it under the same interface as `stdlib_linalg_norm_${ri}$. What would be your take on this?

This sounds good! another option, if you prefere to keep a single interface:

use stdlib_str2num, only: to_num
use stdlib_error, only: error_stop
...
case default
    p = to_num( which_ , p ) 
    if( p > 2 .and. p < p_limit ) then !> define a limit p value
       !> compute the p-norm
    else
       call error_stop( "Invalid p-norm",code )
    end if

If you prefere to have an interface that takes an integer p value also exposed, I would then suggest to make this the base implementation that is also exposed to the user, and that the interface taking the character "just" handles the conversion and then call the one taking an integer value.

@perazz
Copy link
Contributor

perazz commented Jun 29, 2024

Here is a draft/hacky implementation of norm I'd like to discuss a bit before drafting the PR.

looks good so far!

  • At the moment, we have the ℓ2 norm (default), the ℓ1 norm as well as the infinity norm. In my original post, I discussed the possibility also to have a generic p-norm. Since which is of type character(len=*), we would need to check that it does not correspond to a case already handled by the dispatch, that which is one character-long (assuming we only consider integer p), convert the character to a float (either single or double) and then compute ∑|xi|pp. This seems unnecessarily complicated for a relatively edge case. A simple option however would be to have a dedicated stdlib_linalg_pnorm_${ri}$ function for which which is actually a non-negative float or integer and put it under the same interface as `stdlib_linalg_norm_${ri}$. What would be your take on this?

What I like the most is to have a separate function to do the option parsing, so the input could be either integer or character, as @jalvesz suggested. Something like:

pure subroutine parse_norm(which, norm_type, err)
    character(*), intent(in) :: which
    integer(ilp), intent(out) :: norm_type
    type(linalg_state_type), intent(out) :: err

   select case (to_lower(which))
        case ("euclidean","2","l2")
              norm_type = NORM_TYPE_2
        case ("inf","infinity")
              norm_type = NORM_TYPE_INF
        case default
              p = to_num(which, p)
              if (p>2 .and. p<=NORM_TYPE_MAX) then 
                  norm_type = p
              else
                  norm_type = NORM_TYPE_ERROR
                  err = linalg_state_type("norm",LINALG_VALUE_ERROR,"invalid norm type: ",which)
              endif
    end select
end subroutine 

The rule could be to define parameters that are >1 for p-norms and <=0 for ther norms, i.e.:

integer(ilp), parameter :: NORM_TYPE_ERROR = -huge(0_ilp)
integer(ilp), parameter :: NORM_TYPE_FROB = -1
integer(ilp), parameter :: NORM_TYPE_INF = -2
integer(ilp), parameter :: NORM_TYPE_L1 = 1
integer(ilp), parameter :: NORM_TYPE_L2 = 2
integer(ilp), parameter :: NORM_TYPE_MAX = 100
  • Second point has to do with the 1-norm computation. Is there a more fypp-esque way to handle the different types? The problem I see is that while asum is defined for both real and complex, sum1 only is defined for complex. I hesitate on whether I should keep the code as it is or if we should create an interface for asum, as well as one for sum1 which under the hood also calls asum if x is real. Basically something like this (don't pay attention to the syntax)

For the BLAS/LAPACK functions that have different names for real/complex types, I've been using inline fypp declarations, i.e.:

work_dummy,lwork,#{if rt.startswith('complex')}#rwork,#{endif}#iwork,info)

  • Last point finally is that I'm not sure how to properly handle the case where which is none of the cases covered by the dispatch (e.g. user made a typo).

You can look at the linalg_state_type model, that is the common way that all linear algebra routines handle exceptions, this is typically an optional, intent(out) argument that is internally handlead by a final call to:

call linalg_error_handling(err0,err)

This forces the code to either return an error code (hopefully LINALG_SUCCESS) to the user, or stop the program on an exception.

@perazz perazz mentioned this issue Sep 13, 2024
6 tasks
@perazz perazz mentioned this issue Nov 6, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
idea Proposition of an idea and opening an issue to discuss it
Projects
None yet
Development

No branches or pull requests

3 participants