Bugfix/use status log #38

mnlevy1981 · 2016-03-22T20:44:55Z

This branch resolves issue #23 -- marbl_interface_types.F90 now provides output to the MARBL status log rather than writing to stdout (allowing errors to be trapped and returned to the GCM as well).

Ran the aux_pop_MARBL test suite, and all 9 tests were bit-for-bit with MARBL0.0.0

Working towards solution to issue #23 -- errors returned from diagnostic init are now added to the status log instead of printed to stdout.

When adding forcing fields, errors are logged properly rather than printed to stdout. Also cleaned up some of the diagnostic logging from the prior commit. This should let us close issue #23

klindsay28 · 2016-04-05T15:25:56Z

For src/marbl_diagnostics_mod.F90, github states

1,805 additions, 227 deletions not shown because the diff is too large. Please use a local Git client to view these changes.

I don't know how to do this.

klindsay28 · 2016-04-05T16:01:04Z

The string "error code returned" confuses me. When I see that, I'm expecting to see a code, but there is none. Perhaps I'm not understanding the semanatics of marbl_logging.F90 or how it is to be understood. Unfortunately, examining the source code of marbl_logging.F90 is helping me out. It would be useful if there were documentation at the top of marbl_logging.F90 describing usage.

mnlevy1981 · 2016-04-05T16:13:26Z

@klindsay28 -- to see all the diffs locally, do the following:

Add my fork as a remote to your local clone of MARBL

$ git remote add mlevy [email protected]:mnlevy1981/MARBL.git
Make all the branches / tags from the mlevy remote available locally (this will only update your .git/ directory, it will not change any of the files outside that directory; i.e. it will not alter any of the files in your src/ directory)

$ git fetch mlevy
Compare the revision I branched from (bce81dc) to my mlevy/bugfix/use_status_log branch

$ git diff bce81dc mlevy/bugfix/use_status_log

Instead of "error code returned ..." I can change the text to "error reported from [module]". I wasn't sure of the best way to word the messages, but consider the situation where the GCM calls MARBL_A(), which calls MARBL_B(), which calls MARBL_C(), which reports an error.

Ideally, the message log would show the following (in this order):

MARBL_C: [description of error]
MARBL_B: [tell user error came from call to MARBL_C()]
MARBL_A: [tell user error came from call to MARBL_B()]
GCM: [tell user error came from call to MARBL_A()]

The GCM would write the log to stdout or a file and then call MPI_Abort(). Like I said, I wasn't sure of the best way to actually word the error messages and am happy to change the text.

I'm happy to add documentation to marbl_logging.F90 describing this usage.

klindsay28 · 2016-04-05T17:01:22Z

I'm concerned about using a module variable in marbl_logging.F90 as a temporary for message creation. It smells like a potential thread race condition waiting to bite us. Our calls to marbl_init, we are calling it for each thread, are in an iblock loop in ecosys_driver_init. If someone were to put on openmp do directive around the loop, say to speed things up, I think we'd intermittently be in trouble.

Would it be better for users of marbl_logging to either use their own temp string, or to move the temp string into one of the marbl_logging data types?

mnlevy1981 · 2016-04-05T17:08:05Z

I didn't think about the threading / race issues, but that's a great point. The safest fix would be to make error_msg and status_msg local variables in every routine that uses the status log. I'll probably just use one variable (log_msg or log_message) and pass it to both log_error() and log_noerror().

Resolved one conflict -- was using marbl_logging for both this branch and the surface_forcing_output branch (which was recently merged), but had slightly different formatting between the two.

As Keith pointed out, having these module variables creates the potential for a race condition if multiple OpenMP threads in the GCM are writing log messages at the same time. Every routine that calls log_error() or log_noerror() now has a local variable named "log_message" that we write messages into before calling one of the two aforementioned routines.

I created a new subroutine to standardize the error traceback in MARBL. When first encountering an error, call LOG%log_error() and return; every subsequent call should be to LOG%log_error_trace, which takes the name of the routine that produced the error as an argument rather than a log message. Note that this is just a wrapper script - log_error_trace creates the log message "Error reported from [routine name]" and then calls LOG%log_error with that message. I say "first pass" because I still need to adapt the errors from add_diagnostics() and add_forcing_field(), where I want to include the name of the diagnostic [or forcing] field that produced the error. I was originally thinking that I should add an optional argument for field name, but instead I might treat add_forcing_field([FIELDNAME]) as the routine name.

I created a wrapper routine in marbl_diagnostics_mod that calls LOG%log_error_trace when labort_marbl is detecting from an add_diags call. Something similar should be used in marbl_mod for the forcing fields.

mnlevy1981 · 2016-04-05T23:06:42Z

This is what the log looks like when I make num_forcing_diags too small:

 MARBL ERROR (marbl_interface_types:marbl_diagnostics_add): not enough memory al
 located for this number of diagnostics!
 MARBL ERROR (marbl_diagnostics_mod:marbl_diagnostics_init): Error reported from
  diags%add_diagnostic(O2SAT)
 MARBL ERROR (marbl_interface:marbl_init): Error reported from marbl_diagnostics
 _init()
 MARBL ERROR (ecosys_driver:ecosys_driver_init): error code returned from marbl(
 1)%init()

Obviously I need to update the "error code returned" message in the driver, but otherwise I think this is a pretty informative set of messages.

This commit only affects the comments in .F90 files, but there is now a header in marbl_logging.F90 that explains how marbl_instance%StatusLog works and how the GCM should interact with it. Also, there was a "TODO" comment in marbl_interface_types.F90 that had already been done, so I deleted it.

Much like the diagnostics in e257a8c there is now a routine in marbl_mod to handle the log_error_trace calls when forcing_fields%add_forcing_field() returns an error.

mnlevy1981 · 2016-04-06T05:07:49Z

Add when I make num_surface_forcing_fields too small:

 MARBL ERROR (marbl_interface_types:marbl_forcing_fields_add): not enough memory
  allocated for number of forcing fields!
 MARBL ERROR (marbl_mod:marbl_sflux_forcing_fields_init): Error reported from fo
 rcing_fields%add(xco2_alt_co2)
 MARBL ERROR (marbl_interface:marbl_init): Error reported from marbl_init_surfac
 e_forcing_fields()
 MARBL ERROR (ecosys_driver:ecosys_driver_init): error code returned from marbl(
 1)%init()

Again, I need to clean up the error message coming from the driver.

@klindsay28 -- at this point, I think I've addressed everything you've brought up today. I also brought this branch up to date with master (which was updated when I brought in the surface_forcing_output branch), so to see the full set of diffs you need to run

$ git fetch mlevy
$ git diff mlevy/master mlevy/bugfix/use_status_log

If there's more work to be done on this branch, I'll put it off until after I finish the tracer index clean-up... but I think this is pretty close.

I was trying to write a message combining strings and integers using the "(4A)" formatting string, which needed to be changed to "(A,I0,A,I0)"

klindsay28 · 2016-04-08T15:46:17Z

subname in subroutines of marbl_co2calc_mod.F90 are not consistent with the full module name and/or the subroutine name, or how many colons are used to separate the module name from the subroutine name. Suggested changes are:

character(*), parameter :: subname = 'marbl_co2calc:marbl_co2calc_surf'
character(*), parameter :: subname = 'marbl_co2calc_mod:marbl_co2calc_surf'
character(*), parameter :: subname = 'marbl_co2calc:marbl_comp_CO3terms'
character(*), parameter :: subname = 'marbl_co2calc_mod:marbl_comp_CO3terms'
character(*), parameter :: subname = 'marbl_co2calc:marbl_comp_htotal'
character(*), parameter :: subname = 'marbl_co2calc_mod:comp_htotal'
character(*), parameter :: subname = 'marbl_co2calc::drtsafe'
character(*), parameter :: subname = 'marbl_co2calc_mod:drtsafe'

I now see that this is the case in numerous other instances across MARBL source code. Running the command
grep -Ei 'subname =|^ *subroutine|^ *function' src/.F90 | grep -B1 'subname *='
picks up problems in marbl_restore_mod.F90, marbl_mod.F90, marbl_diagnostics_mod.F90, and marbl_ciso_mod.F90

Perhaps fixing these mismatches in files not already changed on this branch could be postponed to another tag/checkin. Perhaps a bug should be filed for this.

mnlevy1981 · 2016-04-08T16:08:53Z

src/marbl_logging.F90

@@ -1,5 +1,51 @@
 module marbl_logging


Notes from a conversation with @klindsay28

Documentation at top of routine would be more readable in a bullet-point format

Routines used to write log entries

Routines used by the driver
-- Include psuedo-code for marl_log_print!

Documentation in marbl_log_[noerror|error|error_trace] should define CodeLoc and whatnot
"CodeLoc is the name of the subroutine that has called marbl_log_error_trace, RoutineName is the name of the routine that was called by CodeLoc and now labort_marbl is true"

horizontal rule between subroutines (!====================)

Need better documentation on OutputOptions (mention that it is included in the code as a hook for future development), remove namelist reference from comment at top of marbl_log_error

klindsay28 · 2016-04-08T16:15:21Z

everything else looks fine to me
FYI, I learned from google that adding ?w=1 to the URL when examining files changed will show the diff with whitespace ignored (it mostly works in my experience)

The parameter subname appears in many subroutines, and should be declared as character(*), parmameter :: subname = '[module name]:[subroutine name]' This was not being done consistently throughout the code.

Per the comments in Pull Request #38, I did the following: 1. Used more stream-lined approach to documenting module use 2. Included basic print_marbl_log code 3. Defined the various strings that are intent(in) to the different marbl_log_* routines 4. Added a horizontal rule between subroutines 5. Added a brief blurb about the OutputOptions variable, mentioning that it is a place-holder for future development.

mnlevy1981 · 2016-04-22T21:34:37Z

Note that this update also requires an update to ecosys_driver.F90 (which currently uses two variables from marbl_logging that have been removed from the module); I realize that I haven't committed those changes to a branch of POP yet, so I need to do that and then I'll post a link with the POP changes + test results.

mnlevy1981 · 2016-04-25T16:37:34Z

The POP branch needed to run with this MARBL tag is

https://svn-ccsm-models.cgd.ucar.edu/pop2/branch_tags/marbl_dev_status_log_tags/marbl_dev_status_log_n01_marbl_dev_n03_cesm_pop_2_1_20160407/

on yellowstone_intel, all aux_pop and aux_pop_MARBL passed and also passed baseline comparisons (using cime 4.4.8 and cice 20160309; baseline POP was the March 17 trunk tag)

Added visual spacer between subroutines and made whitespace consistent with previous commit.

mnlevy1981 added 2 commits March 22, 2016 11:00

Add status log to diagnostic calls

7230e1e

Working towards solution to issue #23 -- errors returned from diagnostic init are now added to the status log instead of printed to stdout.

Add status log to forcing field calls

c3cdc12

When adding forcing fields, errors are logged properly rather than printed to stdout. Also cleaned up some of the diagnostic logging from the prior commit. This should let us close issue #23

mnlevy1981 added 4 commits April 5, 2016 11:10

Merge branch 'master' into bugfix/use_status_log

026155c

Resolved one conflict -- was using marbl_logging for both this branch and the surface_forcing_output branch (which was recently merged), but had slightly different formatting between the two.

Remove 'error code returned' from diagnostics mod

8213853

I created a wrapper routine in marbl_diagnostics_mod that calls LOG%log_error_trace when labort_marbl is detecting from an add_diags call. Something similar should be used in marbl_mod for the forcing fields.

mnlevy1981 added 2 commits April 5, 2016 22:28

Remove 'error code returned' from forcing field

3344b3c

Much like the diagnostics in e257a8c there is now a routine in marbl_mod to handle the log_error_trace calls when forcing_fields%add_forcing_field() returns an error.

Use proper formatting when writing to string

2c42d81

I was trying to write a message combining strings and integers using the "(4A)" formatting string, which needed to be changed to "(A,I0,A,I0)"

mnlevy1981 reviewed Apr 8, 2016
View reviewed changes

mnlevy1981 added In Progress ready and removed ready In Progress labels Apr 21, 2016

mnlevy1981 added 2 commits April 22, 2016 14:28

Standardize formatting for subname

1bd3def

The parameter subname appears in many subroutines, and should be declared as character(*), parmameter :: subname = '[module name]:[subroutine name]' This was not being done consistently throughout the code.

mnlevy1981 added the Priority #2 label Apr 29, 2016

mnlevy1981 added Priority #1 In Progress and removed Priority #2 Priority #1 labels May 2, 2016

mnlevy1981 added 2 commits May 2, 2016 16:55

Merge branch 'master' into bugfix/use_status_log

bcc9497

Formatting clean-up

29dc48a

Added visual spacer between subroutines and made whitespace consistent with previous commit.

mnlevy1981 merged commit 29dc48a into marbl-ecosys:master May 3, 2016

mnlevy1981 removed the In Progress label May 3, 2016

mnlevy1981 deleted the bugfix/use_status_log branch May 3, 2016 18:23

mnlevy1981 mentioned this pull request May 17, 2016

Add one more status_log call #65

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bugfix/use status log #38

Bugfix/use status log #38

mnlevy1981 commented Mar 22, 2016

klindsay28 commented Apr 5, 2016

klindsay28 commented Apr 5, 2016

mnlevy1981 commented Apr 5, 2016

klindsay28 commented Apr 5, 2016

mnlevy1981 commented Apr 5, 2016

mnlevy1981 commented Apr 5, 2016

mnlevy1981 commented Apr 6, 2016

klindsay28 commented Apr 8, 2016

mnlevy1981 Apr 8, 2016

klindsay28 commented Apr 8, 2016

mnlevy1981 commented Apr 22, 2016

mnlevy1981 commented Apr 25, 2016

Bugfix/use status log #38

Bugfix/use status log #38

Conversation

mnlevy1981 commented Mar 22, 2016

klindsay28 commented Apr 5, 2016

klindsay28 commented Apr 5, 2016

mnlevy1981 commented Apr 5, 2016

klindsay28 commented Apr 5, 2016

mnlevy1981 commented Apr 5, 2016

mnlevy1981 commented Apr 5, 2016

mnlevy1981 commented Apr 6, 2016

klindsay28 commented Apr 8, 2016

mnlevy1981 Apr 8, 2016

Choose a reason for hiding this comment

klindsay28 commented Apr 8, 2016

mnlevy1981 commented Apr 22, 2016

mnlevy1981 commented Apr 25, 2016