Document Number: | P2192R4 |
Date | 2021-03-13 |
Audience | SG18 LEWG Incubator |
Author | Dusan B. Jovanovic ( [email protected] ) |
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. -- C.A.R. Hoare
- 1. Abstract
- 2. From error to returned information handling
- 4. valstat protocol C++ definition
- 5. Usage
- 6. Conclusions
- 7. References
- 8. Appendix A
- 9. Appendix B: Requirements Common Across Domains
R4: The valstat protocol is now a separate project. This paper is its C++ definition.
R3: Two stages returns handling clarification. Better examples.
R2: More elaborate motivation. Better valstat section. Cleaner Appendix examples. Title changed from "std::valstat - transparent return type" to "std::valstat - Transparent Returns Handling".
R1: Marketing blurb taken out. Focused and short proposal. valstat in the front.
R0: "Everything is numbered" style. A lot of story telling and self marketing. Too long.
This is a proposal about logical, feasible, lightweight and effective handling of information returned from functions, based on the valstat protocol.
valstat is not an error handling idiom. Please take a short detour and read the valstat protocol document first.
Implemented in standard C++, this would be a tiny std library citizen without any language change required.
In standard C++ there is no unanimously adopted standard call/response handling idiom or return type, unlike in Rust, Go, Swift, to some extent JavaScript, and many other languages.
As of today the std lib contains more than a few error handling paradigms, idioms and return types, accumulated through decades, from ancient to contemporary. Together they have inevitably contributed to a rising technical debt inside the C++ std lib.
In order to achieve the required wide scope of the valstat protocol, the implementation has to be simple. The actual programming language shape of valstat has to be completely free of any problem domain or context. Put simply, the C++ std lib implementation must not influence or dictate the usage, beyond supporting the protocol.
Please see Appendix B: Requirements Common Across Domains for a slightly more detailed but quick overview.
The valstat protocol defines information as made of state and data. The valstat structure is the response information carrier. It is made of two fields: value and status.
structure valstat
field value
field status
A valstat protocol field has two occupancy states:
field_state ::= empty | occupied
A valstat enabled API returns a valstat structure instance, made by the encapsulated application logic, in order to pass information to the caller. The valstat protocol users (callers) have the opportunity to decode (aka capture) one of four states carried over from the call responder: OK (value occupied, status empty), ERROR (value empty, status occupied), INFO (both occupied) and EMPTY (both empty).
That is wider functionality than simple error handling through some special returned value.
Information carried over by valstat structure is handled in two steps:
- Step One
- decode the state
- Step Two
- use the data
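The two steps can be sketched in C++ as follows. This is only an illustration, not part of the proposal: the `sketch` namespace, the `state` enumeration and the `decode` and `describe` helpers are hypothetical names, and `sketch::valstat` is a stand-in for the proposed template.

```cpp
#include <optional>
#include <string>

namespace sketch {

// Hypothetical stand-in for the proposed std::valstat.
template <typename V, typename S>
struct [[nodiscard]] valstat {
    V value;
    S status;
};

// The four states, named as in this paper.
enum class state { ok, error, info, empty };

// Step one: decode the state from the occupancy of the two fields.
template <typename V, typename S>
constexpr state decode(const valstat<V, S>& vs) {
    const bool has_value  = static_cast<bool>(vs.value);
    const bool has_status = static_cast<bool>(vs.status);
    if (has_value && has_status) return state::info;
    if (has_value)               return state::ok;
    if (has_status)              return state::error;
    return state::empty;
}

using int_valstat = valstat<std::optional<int>, std::optional<std::string>>;

// Step two: use only the data the decoded state says is present.
inline std::string describe(const int_valstat& vs) {
    switch (decode(vs)) {
        case state::ok:    return "ok: " + std::to_string(*vs.value);
        case state::info:  return "info: " + std::to_string(*vs.value) + " (" + *vs.status + ")";
        case state::error: return "error: " + *vs.status;
        case state::empty: return "empty";
    }
    return {};  // not reached; all enumerators are handled above
}

} // namespace sketch
```

A field is dereferenced only after step one has established that it is occupied. Callers that care about fewer states can skip the enumeration and test the two fields directly.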
"std::valstat" is a name of the C++ template, offering the greatest possible degree of freedom for valstat protocol adopters. Implementation is simplest possible, resilient, lightweight and feasible. Almost transparent.
std::valstat<T,S> is a generic interface whose aliases and definitions allow easy "step one" state decoding, by examining the state of occupancy of the 'value' and 'status' fields.
// Synopsis
// std lib header: <valstat>
namespace std
{
    template< typename V, typename S >
    struct [[nodiscard]] valstat
    {
        // both types must be able to
        // simply and resiliently
        // exhibit state of occupancy
        // "empty" or "occupied"
        using value_field_type = V ;
        using status_field_type = S ;
        // valstat state is deduced by combining
        // state of occupancy of these two fields
        value_field_type value;
        status_field_type status;
    };
} // std
std::valstat expresses the valstat protocol in the realm of ISO C++ as a recommendation only. It will not mandate its usage in any way. It should be in a separate header <valstat>, to allow for complete decoupling from any of the std lib types and headers.
Let us repeat: std::valstat is a recommendation. valstat protocol can be implemented in many ways in C++; and many other languages as a matter of fact.
Both value and status field types should offer a simple mechanism that reveals their occupancy state. A readily available type offering that "field like" behavior is std::optional.
In specific contexts a native pointer or any other type can serve the same purpose, as will be explained shortly. What "empty" means for a particular C++ type, and what it does not, depends on the context. Please see an example in the appendix.
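As a minimal sketch of the native pointer case, assume that in some context `nullptr` means "empty" for both fields. The `sketch` namespace, `cstr_valstat` and `find_text` are hypothetical names used only for this illustration.

```cpp
#include <cstring>

namespace sketch {

// Hypothetical stand-in for the proposed std::valstat.
template <typename V, typename S>
struct [[nodiscard]] valstat {
    V value;
    S status;
};

// Both fields are native pointers; in this context nullptr means "empty".
using cstr_valstat = valstat<const char*, const char*>;

// Hypothetical lookup: locate needle inside haystack.
inline cstr_valstat find_text(const char* haystack, const char* needle) noexcept {
    if (haystack == nullptr || needle == nullptr)
        return {nullptr, "null argument"};   // ERROR: status occupied
    const char* hit = std::strstr(haystack, needle);
    if (hit == nullptr)
        return {nullptr, nullptr};           // EMPTY: nothing found, not an error
    return {hit, nullptr};                   // OK: value occupied
}

} // namespace sketch
```

Callers decode the state with the same `if ( value && ! status )` tests they would use with std::optional fields, since a pointer converts to `bool` in the same position.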
How does this differ from the two other "current" proposals, which have persevered through years of selection?
- std::status_code
- std::expected
Differences are:
- the above are based on special types purposely designed and developed to be returned from functions.
- std::valstat is not a type. It is a template implementing the valstat protocol
- type based on std::valstat can use any types appropriate
- the above serve the purpose inside the standard C++ only
- valstat protocol is language agnostic
- valstat protocol has no exceptions related parts
- std::valstat based types solve the no-exception requirement too
- None of the above carries same kind of information as valstat protocol does.
- Fields on the valstat structure can contain values of any type
A template is not a type. "my::valstat" is a descriptive name of a template alias we will use for illustration purposes in most examples in this proposal. We will solve the occupancy requirement imposed on valstat fields by simply using std::optional. We will show that we do not need the hundreds (thousands?) of lines of non-trivial C++ required to implement a special "error handling" type. There is no need to be concerned about implementation complexity[13].
// 'my' is adopters namespace
namespace my {
    // ready to operate on almost any type
    // std::optional rules permitting
    template<typename T, typename S>
    using valstat = std::valstat<
        std::optional<T>,
        std::optional<S> >;
} // my
In standard C++ world, it is not wrong to relax a valstat structure definition, down to an "AND combination" of two std::optional's.
Now both API responders and API callers (in the my namespace) have the universal, readily applicable my::valstat as a simple template alias. Most of the time, C++ users of my::valstat will use a structured binding. Let's see some ad-hoc C++ examples of direct my::valstat usage (no calls involved yet):
// OK valstat created
// both fields are std::optional<int> instances
auto [ value, status ] = my::valstat< int, int >{ 42, {} };
// step one: compare the fields occupancy
// OK state captured
// there is a value but no status returned
if ( value && ! status ) {
    /* step two:
       use the data taken from the occupied field instance */
    std::cout << "OK valstat captured, value is: " << *value ;
}
If required, the other three valstat protocol states will be created like so:
// both fields are std::optional<int> instances
auto [ value, status ] = my::valstat< int, int >{ 42, 42 }; // INFO
auto [ value, status ] = my::valstat< int, int >{ {}, {} }; // EMPTY
auto [ value, status ] = my::valstat< int, int >{ {}, 42 }; // ERROR
Which states are produced, and the exact logic of their usage, depends completely on the adopter's domain.
After all this postulating, protocol and field theory and such, it might come as a surprise that in some circumstances it is quite OK, and enough, to use fundamental types for both the value and status fields. We can implement the "field" paradigm by using just fundamental types.
Let us consider some very strict embedded system platform.
// valstat type but not as we know it
// note: this might be defined in some C code too
struct valstat_int_int final {
    int value;
    int status;
};
// both value and status fields in here are just integers
// consuming is same as with my::valstat
// difference is completely transparent
auto [ value, status ] = valstat_int_int{ 42, {} }; // OK
// step one
// OK state decoding
// (42 && !0 ) yields true
if ( value && ! status ) { uplink( value ) ; }
// other three states, but only if required
// again, to callers the difference vs. my::valstat
// is completely transparent
auto [ value, status ] = valstat_int_int{ 42, 42 }; // INFO
auto [ value, status ] = valstat_int_int{ {}, {} }; // EMPTY
auto [ value, status ] = valstat_int_int{ {}, 42 }; // ERROR
That is still the valstat protocol in action. The only difference is that, in some situations, the valstat field types can be two simple integers.
// in some specific narrow context integer is "empty" if it is zero
bool is_empty( int val_ ) { return ! val_ ; }
The above shows a rather important ability of valstat: it can be transparently adopted by very different projects. That solution does not use the std lib and works under extremely strict pre-conditions. The already mentioned example in the appendix shows something different but similar.
It is admittedly hard to immediately see the connection between my::valstat or std::valstat and the somewhat bold promises about a wide spectrum of benefits, presented in the motivation section.
There are many equally simple and convincing examples of valstat usage benefits. In order to keep this core proposal short we will first observe just one, but illustrative, use-case. Appendix A contains a few more.
Recap: a my::valstat struct instance carries information out of a function: state and payload, to be utilized by callers. How the valstat state decoding algorithm is shaped, and why (or why not), depends completely on the project, the API logic and many other requirements dictated by the adopters' architects and developers.
The example below is used by valstat adopters operating on some database. In this illustration, adopters use the valstat to pass back (to the caller) full information (state + data), obtained after the database field fetching operation. Again, please notice there is no 'special' over-elaborated return type required. That is a good thing. valstat is a protocol; there is no complex C++ type, just clean and repeatable idioms of two step returns handling.
// declaration of a valstat emitting function
template<typename T>
// we use my::valstat type from above
// `my::stat` is 'code' from some internal code/message mechanism.
my::valstat<T, my::stat >
full_field_info
(database::row /*row_*/ , std::string_view /* field_name */ )
// valstat protocol naturally allows no exception throwing
noexcept ;
The primary objective is to let callers comprehend the full information passed out of the function: state and data. Full returns, not just error handling.
// full return handling after
// the attempted field content retrieval
auto [ value, status ] = full_field_info<int>( db_row, field_name ) ;
When designing a solution, adopters have decided they will utilise all four valstat states. Calling code is capturing all.
// step one
// capturing: info
if ( value && status ) {
    // step two
    std::cout << "\nSpecial value found: " << *value ;
    // *status type is my::stat
    std::cout << "\nStatus is: " << my::status_message(*status) ;
}
// capturing: ok
if ( value && ! status ) {
    std::cout << "\nOK: Retrieved value: " << *value ;
}
// capturing: error
if ( ! value && status ) {
    // in this design status contains an error code
    std::cout << "\nRead error: " << my::status_message(*status) ;
}
// capturing: empty
if ( ! value && ! status ) {
    // empty field is not an error
    std::cout << "\nField is empty." ;
}
Please do note, using the same paradigm it is almost trivial to imagine that same code above, in e.g. JavaScript, calling the module written in C++ returning valstat protocol structure that JavaScript will understand.
Let us emphasize: Not all possible states need to be captured by the caller each and every time. It entirely depends on the API design, on the logic of the calling site, on application requirements and such.
Requirements permitting, API implementers are free to choose whether they will use and return all four valstat states, or just one, two or three of them.
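As a small illustration of an API that uses only two of the four states, here is a sketch that emits OK or INFO and never ERROR or EMPTY. Everything in it is assumed for the example: `basic_valstat` is a stand-in for the proposed std::valstat, and `my::stat::defaulted`, `port_or_default` and the 8080 default are hypothetical.

```cpp
#include <map>
#include <optional>
#include <string>

namespace my {

// Hypothetical stand-in for the proposed std::valstat.
template <typename V, typename S>
struct [[nodiscard]] basic_valstat {
    V value;
    S status;
};

template <typename T, typename S>
using valstat = basic_valstat<std::optional<T>, std::optional<S>>;

// Hypothetical status code for this sketch.
enum class stat { defaulted };

// Emits only two states: OK when the setting exists,
// INFO when a default value had to be supplied.
inline valstat<int, stat> port_or_default(const std::map<std::string, int>& settings) {
    const auto found = settings.find("port");
    if (found == settings.end())
        return {8080, stat::defaulted};   // INFO: value and status both occupied
    return {found->second, {}};           // OK: value only
}

} // namespace my
```

A caller of this API needs just one test after the call: `if ( status )` tells it the value is a default. The value field can be used in both states.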
// API implementation using valstat protocol
template<typename T>
my::valstat<T, my::stat >
full_field_info
(database::row row_, std::string_view field_name )
// platform requirements do not allow
// throwing exceptions
noexcept
{
    // sanity check
    if ( field_name.size() < 1)
        // return ERROR state
        return { {}, my::stat::name_empty };
    // using some hypothetical database API
    // where row is made of fields
    database::field_descriptor field = row_.fetch( field_name ) ;
    // error can be anything not
    // just database related
    if ( field.in_error() )
        // return ERROR valstat
        // status is some internal code of my::stat type
        return { {}, my::stat::db_api_error( field.error() ) };
    // empty field is not an error
    // return an EMPTY valstat structure
    if ( field.is_empty() )
        return { {}, {} };
    // db type will have to be cast into the type T
    // type T handles value semantics
    T field_value{} ;
    // try getting the value from a database field
    // and casting it into T
    if ( false == field.data( field_value ) )
        // failed, return ERROR state
        return { {}, my::stat::type_cast_failed( field.error() ) };
    // solving business requirement
    // API contract requires signalling if 'special' value is found
    if ( special_value( field_value ) )
        // return INFO state and both fields populated
        return { field_value, my::stat::special_value };
    // value is obtained and ready
    // status field is empty
    // OK state signalled back
    return { field_value, {} };
}
Basically, a function returning valstat state + data is simply returning a two-field structure, with all the advantages and disadvantages imposed by the core language rules. Any kind of home grown but functional valstat type will work there too, as long as callers can capture the states and data in two steps by using the two fields returned.
Using thread safe abstractions or asynchronous processing does not stop adopters from returning valstat states from their APIs either.
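One way this could look for a thread safe abstraction is sketched below, under stated assumptions: `guarded_queue`, `queue_stat::closed` and the mapping of states to queue conditions are hypothetical design choices for this illustration, and `sketch::valstat` is a stand-in for the proposed template.

```cpp
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

namespace sketch {

// Hypothetical stand-in for the proposed std::valstat.
template <typename V, typename S>
struct [[nodiscard]] valstat {
    V value;
    S status;
};

// Hypothetical status code for this sketch.
enum class queue_stat { closed };

// A mutex guarded queue whose try_pop() reports three states.
template <typename T>
class guarded_queue {
public:
    using result = valstat<std::optional<T>, std::optional<queue_stat>>;

    void push(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(item));
    }

    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
    }

    // OK:    an item was popped
    // EMPTY: nothing queued right now, try again later
    // ERROR: the queue is closed and drained
    result try_pop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!items_.empty()) {
            T item = std::move(items_.front());
            items_.pop();
            return {std::move(item), {}};
        }
        if (closed_) return {{}, queue_stat::closed};
        return {{}, {}};
    }

private:
    std::mutex mutex_;
    std::queue<T> items_;
    bool closed_ = false;
};

} // namespace sketch
```

The EMPTY state lets a consumer distinguish "nothing yet" from "no more items will ever arrive" without exceptions or a separate query call.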
Fundamentally, the burden of proof is on the proposers. -- B. Stroustrup, [11]
The "valstat" protocol is multilingual in nature. Thus adopters from any imperative language are free to implement it in any way they wish to. The key protocol benefit is interoperability.
Using the same protocol implementation it is feasible to develop standard C++ code using the standard library, even in restricted environments. The author is certain the readership knows quite well why that situation is considered unresolved in the domain of ISO C++.
The author's primary aim is to encourage widespread adoption of this paradigm. As shown, the valstat protocol implemented in C++ is more than just a solution to the "error-signalling problem"[11]. It is a paradigm shift, instrumental in solving the often hard and orthogonal set of platform requirements described in the motivation section.
The valstat protocol and its C++ definition impose extremely little on adopters, while leaving non-adopters free to "proceed as before".
Obstacles to paradigm adoption are far from just technical. But here is at least an immediately usable attempt to chart the way out.
- [0] B. Stroustrup (2018), P0976: The Evils of Paradigms Or Beware of one-solution-fits-all thinking, https://www.stroustrup.com/P0976-the-evils-of-paradigms.pdf
- [1] Ben Craig, Ben Saks, Leaving no room for a lower-level language: A C++ Subset, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1105r1.html#p0709
- [2] Lawrence Crowl, Chris Mysen, A Class for Status and Optional Value, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0262r1.html
- [3] Herb Sutter, Zero-overhead deterministic exceptions, https://wg21.link/P0709
- [4] Niall Douglas, SG14 status_code and standard error object for P0709 Zero-overhead deterministic exceptions, https://wg21.link/P1028; Niall Douglas, Zero overhead deterministic failure – A unified mechanism for C and C++, https://wg21.link/P1095
- [5] Vicente Botet, JF Bastien, std::expected, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0323r8.html
- [6] Ben Craig, Error size benchmarking: Redux, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1640r1.html
- [7] Vicente J. Botet Escribá, JF Bastien, Utility class to represent expected object, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0323r3.pdf
- [8] Kirk Shoop, Cancellation is not an Error, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1677r0.pdf
- [9] Wikipedia, Empty String, https://en.wikipedia.org/wiki/Empty_string
- [10] "Your Dictionary", Definition of empty, https://www.yourdictionary.com/empty
- [11] Bjarne Stroustrup, P1947: C++ exceptions and alternatives, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1947r0.pdf
- [12] A Conversation with Anders Hejlsberg, Part II: The Trouble with Checked Exceptions, https://www.artima.com/intv/handcuffs.html
- [13] Niall Douglas, Concerns about expected<T, E> from the Boost.Outcome peer review, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0762r0.pdf
- [14] Library Evolution Working Group, Summary of SG14 discussion on <system_error>, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0824r1.html
- [15] Joel Spolsky, Joel On Software -- 13: Exceptions, https://www.joelonsoftware.com/2003/10/13/13/
To me, one of the hallmarks of good programming is that the code looks so simple that you are tempted to dismiss the skill of the author. Writing good clean understandable code is hard work whatever language you are using -- Francis Glassborow
The value of a programming paradigm is best understood by seeing code that uses it. The more the merrier. Here are a few more simple examples illustrating the applicability of the valstat protocol and its implementations.
A perhaps elegant solution to the "index out of bounds" problem, using my::valstat as already defined above.
// inside some sequence like container
// note how we use pre existing std type
// for the status field
my::valstat< T , std::errc >
operator [] ( size_t idx_ ) noexcept
{
    if ( ! ( idx_ < size_ ) )
        /* ERROR state + data */
        return { {}, std::errc::invalid_argument };
    /* OK state + data */
    return { data_[idx_] , {} };
}
That usage of my::valstat alone resolves a few difficult and well known design issues.
auto [ value, status ] = my_vector[42] ;
// first step: check the states
if ( value ) { /* second step: we are here just if state is OK */ }
if ( status ) { /* second step: we are here just if state is ERROR */ }
No exceptions, no assert() and no exit().
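For readers who want to compile the indexing example, a self-contained sketch follows. The `checked_array` container and the `basic_valstat` stand-in for std::valstat are assumptions made for this illustration only.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <system_error>

namespace my {

// Hypothetical stand-in for the proposed std::valstat.
template <typename V, typename S>
struct [[nodiscard]] basic_valstat {
    V value;
    S status;
};

template <typename T, typename S>
using valstat = basic_valstat<std::optional<T>, std::optional<S>>;

// Hypothetical fixed size sequence with valstat returning indexing.
template <typename T, std::size_t N>
struct checked_array {
    std::array<T, N> data_{};

    // Returns a copy of the element, so T is assumed to be cheap
    // to copy and its copy constructor is assumed not to throw.
    valstat<T, std::errc> operator[](std::size_t idx) const noexcept {
        if (!(idx < N))
            return {{}, std::errc::invalid_argument};   // ERROR
        return {data_[idx], {}};                        // OK
    }
};

} // namespace my
```

Because the operator returns by value, this sketch covers reading only. Writing through the index would need a different value field type, for example a pointer to the element.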
Perhaps the key reasons for the appearance of C++ dialects are to be found in the perceived inability of the std lib to be used for components required to operate in environments with limited resources. That essentially means developing with the C++ core language but without the std lib. [1]
One motivation of this paper is to try to offer an "over arching", but simple enough, returns handling paradigm applicable across the C++ landscape, including across a growing number of C++ dialects which fragment the industry and markets relying on the existence of standard C++.
Minimal list of requirements
(for ISO C++ projects, producing components for restricted environments)
For details, authoritative references are provided. The author will be so bold as not to delve into the reasons and background of this list, in order to keep this paper simple and focused. Gaming, embedded systems, high performance and mission critical computing are just the tip of the iceberg.
Each traditional solution to strict platform requirements is one more nail in the coffin of interoperability. At the risk of sounding like an offer of a panacea, the author will also draw attention to the malleability of the valstat paradigm: it can be implemented in the wide variety of languages used for developing components of a modern distributed system.
Usability of an API is measured on all levels: from the code level to the distributed system level. For an API to be feasibly usable, it has to be interoperable. That leads to three core requirements, listed below.
Interoperable API core requirements (to start with)
- no "error code" as return value
  - Complexity arises from "special" error codes multiplied with different types multiplied with different contexts
  - In turn increasing the learning curve for every function in every API
  - How to decode the error code becomes a prominent issue
  - Think Windows: NTSTATUS, HRESULT, GetCode(), errno
- no "return arguments" aka "reference arguments" in C++.
  - language specific mutable argument solutions are definitely not interoperable.
- no special globals
  - Think errno legacy
  - pure functions can not use globals
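The three requirements can be illustrated by wrapping a legacy style function. The sketch below hides the `errno` global and the out-argument of `std::strtol` behind a valstat return. The `sketch` namespace, `parse_long`, and the decision to report "no digits found" as EMPTY are assumptions of this example, not part of the proposal.

```cpp
#include <cerrno>
#include <cstdlib>
#include <optional>
#include <system_error>

namespace sketch {

// Hypothetical stand-in for the proposed std::valstat.
template <typename V, typename S>
struct [[nodiscard]] valstat {
    V value;
    S status;
};

using long_valstat = valstat<std::optional<long>, std::optional<std::errc>>;

// Callers see neither errno nor the end pointer out-argument;
// everything they need arrives in the two returned fields.
inline long_valstat parse_long(const char* text) noexcept {
    if (text == nullptr)
        return {{}, std::errc::invalid_argument};       // ERROR
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(text, &end, 10);
    if (errno == ERANGE)
        return {{}, std::errc::result_out_of_range};    // ERROR
    if (end == text)
        return {{}, {}};                                // EMPTY: no digits found
    return {parsed, {}};                                // OK
}

} // namespace sketch
```

The wrapper still touches `errno` internally, so it is not a pure function itself. The point is only that the interface presented to callers needs no special return value, no mutable argument and no global.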
Some of the designed-in simplicity in this paper is the result of a deliberate attempt to increase interoperability (also with other run-time environments and languages).
It is important to understand that there are inter-domain interoperability requirements, beyond using just standard C++. Examples: WASM, Node.js, Android and such.