Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check if correlation matrix is consistent #675

Merged
merged 1 commit into from
Dec 6, 2024
Merged

Conversation

dafeda
Copy link
Contributor

@dafeda dafeda commented Dec 6, 2024

Previous versions of fmudesign let the users know that a nearest correlation matrix is being used.
Bringing this functionality back in.
Set seed in design_input_onebyone to be able to check stdout in test.

@dafeda dafeda self-assigned this Dec 6, 2024
Copy link
Contributor

@tommyod tommyod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it - very useful for the user to get feedback like this. In fact, maybe feedback should be even more detailed? See comments.

src/semeio/fmudesign/create_design.py Outdated Show resolved Hide resolved
np.linalg.cholesky(matrix)
except np.linalg.LinAlgError:
print("Warning: Correlation matrix is not positive semidefinite")
return False
Copy link
Contributor

@tommyod tommyod Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the others checks raise, but this returns False. raise here too?

actually, the other conditions here would raise, and the matrix would not be fixed. there seems to be some inconsistencies here. if we can fix it we should, but we should print warnings. if there is no way to fix it, then we should raise on the outside.

we can also consider:

  • print on "small errors"
  • raise on "big errors" (diagonal is 2.0 => raise, diagonal is 1.000001 => print and fix)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have instead opted to just return True or False. I think it's cleaner. What do you think?

Added sensitivity : sens6
Warning: Correlation matrix is not positive semidefinite
Input correlation matrix:
[[1. 0.9 0. 0. ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would the prints be prettier if the same number of digits was always used?
that way the columns will align.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed the output to look like the following. What do you think?

    expected_output = """Added sensitivity : seed
    Added sensitivity : faults
    Added sensitivity : velmodel
    Added sensitivity : contacts
    Added sensitivity : multz
    Added sensitivity : sens6

    Warning: Correlation matrix is not consistent
    Requirements:
    - Ones on the diagonal
    - Positive semi-definite matrix

    Input correlation matrix:
    [[1.00 0.90 0.00 0.00]
    [0.90 1.00 0.90 0.00]
    [0.00 0.90 1.00 0.00]
    [0.00 0.00 0.00 1.00]]

    Adjusted to nearest consistent correlation matrix:
    [[1.00 0.74 0.11 0.00]
    [0.74 1.00 0.74 0.00]
    [0.11 0.74 1.00 0.00]
    [0.00 0.00 0.00 1.00]]

    Added sensitivity : sens7
    Added sensitivity : sens8
    Provided number of background values (11) is smaller than number of realisations for sensitivity ('sens7', 'p10_p90') and parameter PARAM13. Will be filled with default values.
    Provided number of background values (11) is smaller than number of realisations for sensitivity ('sens7', 'p10_p90') and parameter PARAM14. Will be filled with default values.
    Provided number of background values (11) is smaller than number of realisations for sensitivity ('sens7', 'p10_p90') and parameter PARAM15. Will be filled with default values.
    Provided number of background values (11) is smaller than number of realisations for sensitivity ('sens7', 'p10_p90') and parameter PARAM16. Will be filled with default values.
    A total of 91 realizations were generated
    Designmatrix written to generateddesignmatrix.xlsx"""

raise ValueError("Correlation matrix diagonal elements must be 1")

# Check elements are between -1 and 1
if np.any(matrix > 1 + 1e-8) or np.any(matrix < -1 - 1e-8):
Copy link
Contributor

@tommyod tommyod Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps these errors should be propagated out to print with a more detailed description on the element level?

element C[row=1, column=3] = 1.001, but every correlation should be in range [0, 1]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this test as I think the definition of a consisten correlation matrix is that it's positive semi-definite and has ones on the diagonal. This is a different type of check I think. Might still be useful though. What do you think?

Previous versions of fmudesign let the users know that
a nearest correlation matrix is being used.
Bringing this functionality back in.
Set seed in design_input_onebyone to be able to check stdout
in test.
@dafeda dafeda force-pushed the check-is-consistent branch from 145738e to a247015 Compare December 6, 2024 13:41
Copy link
Collaborator

@berland berland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@dafeda dafeda merged commit 9b347db into main Dec 6, 2024
8 checks passed
@dafeda dafeda deleted the check-is-consistent branch December 6, 2024 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants