From 3906ef734a109c2654a3dcc120be2f14efe21c79 Mon Sep 17 00:00:00 2001 From: Martijn Dekker Date: Sat, 10 Feb 2024 18:40:17 +0000 Subject: [PATCH] Add SHOPT_PRINTF_LEGACY: non-compliant printf option parsing On some systems, the external printf(1) allows formatters that begin with a '-' without a preceding '--' options terminator argument. The acceptance of this obsolete syntax violates POSIX because printf falls under the regular utility syntax guidelines. This situation makes it difficult to add options to printf as an extension (which POSIX allows) while maintaining backward compatibility. But some systems come with old sh scripts that depend on the obsolete printf behaviour, so ksh needs to adapt if it is to be accpeptable as a replacement system shell on these. Illumos, in particular, ships a patched ksh 93u+ 2012-08-01 that completely disables option parsing in printf and replaces it with a kludge that only recognises the POSIX '--' options terminator. This matches some external printf commands but kills 'printf --man' and friends and makes the addition of 'printf -v' impossible. This commit adds a compile-time option that provides good-enough compatibility with the obsolete printf behaviour, while keeping our own built-in printf options usable. By default, this is only compiled in if the external printf(1) command on the system where ksh is built exhibits the obsolete behaviour. It can be manually (de)activated regardless of this by setting SHOPT_PRINTF_LEGACY to 0 or 1 src/cmd/ksh93/SHOPT.sh. src/cmd/ksh93/SHOPT.sh, src/cmd/ksh93/Mamfile: - Add SHOPT_PRINTF_LEGACY compile-time option. - When no value is given, probe if external printf(1) accepts a formatter starting with - without preceding --, and emit a #define for SHOPT_PRINTF_LEGACY if found. src/cmd/ksh93/bltins/print.c: b_print(): - If SHOPT_PRINTF_LEGACY is on and b_print() is called from printf, maximise backward compatibility without killing our options using a two-pronged approach: - If the first argument contains a % or a \, disable option parsing: that is very likely a format operand even if it starts with '-', and it cannot be a valid printf option. - When about to throw an error on an invalid option (c==':'), check if the error occurred on the first argument (note that optget leaves opt_info.index==1 on error in a short option, but 2 on error in a long option). If so, ignore the error and treat the argument as a format operand regardless of content. (Note that for --man, etc., c=='?', which is not affected.) src/cmd/ksh93/sh/shtests: - For the regression tests, we need to check of SHOPT_PRINTF_LEGACY was enabled by default at compile time, which we cannot tell from the SHOPT.sh file. Scan the generated shopt.h file and use its definitions as variables as well, overriding any from SHOPT.sh. src/cmd/ksh93/tests/builtins.sh: - Do not check for correct handling of an invalid option in printf if SHOPT_PRINTF_LEGACY is on. --- NEWS | 9 +++++++++ src/cmd/ksh93/Mamfile | 7 +++++++ src/cmd/ksh93/README | 5 +++++ src/cmd/ksh93/SHOPT.sh | 1 + src/cmd/ksh93/bltins/print.c | 17 +++++++++++++++++ src/cmd/ksh93/data/builtins.c | 33 +++++++++++++++------------------ src/cmd/ksh93/include/version.h | 2 +- src/cmd/ksh93/sh.1 | 28 +++++++++++++++++++++++++++- src/cmd/ksh93/tests/builtins.sh | 2 ++ src/cmd/ksh93/tests/shtests | 24 +++++++++++++++++++++--- 10 files changed, 105 insertions(+), 23 deletions(-) diff --git a/NEWS b/NEWS index fde00d44fe95..9db1b4de5161 100644 --- a/NEWS +++ b/NEWS @@ -2,6 +2,15 @@ This documents significant changes in the 1.0 branch of ksh 93u+m. For full details, see the git log at: https://github.com/ksh93/ksh/tree/1.0 Uppercase BUG_* IDs are shell bug IDs as used by the Modernish shell library. +2024-02-11: + +- Added SHOPT_PRINTF_LEGACY compile-time option for compatibility with + POSIX-ignorant external printf(1) commands that process a format operand + starting with '-' without the standard preceding '--' options terminator. + This is for backward compatibility with local system scripts. The + behaviour of the system's external printf(1) determines whether this is + compiled in or out by default. Normal ksh printf options keep working. + 2024-02-09: - Fixed multiple inaccurate or missing values in the /opt/ast/bin/getconf diff --git a/src/cmd/ksh93/Mamfile b/src/cmd/ksh93/Mamfile index 3053ff07af70..e5fb9a828630 100644 --- a/src/cmd/ksh93/Mamfile +++ b/src/cmd/ksh93/Mamfile @@ -71,6 +71,13 @@ make install virtual exec - then writedef TEST_L 1 exec - fi 2>/dev/null exec - ${STDRM} -f "$EXECROOT/link$$" ;; + exec - 'PRINTF_LEGACY=') + exec - # if external 'printf' allows POSIX-ignorant syntax (i.e., a format operand + exec - # starting with '-' without prior '--'), enable code for backward compat + exec - case $(env printf '-zut%s\n' alors 2>/dev/null) in + exec - -zutalors) + exec - writedef PRINTF_LEGACY 1 ;; + exec - esac ;; exec - 'SYSRC=') exec - # if one of these exists, make SHOPT_SYSRC load /etc/ksh.kshrc by default exec - if test -f /etc/ksh.kshrc || test -f /etc/bash.bashrc diff --git a/src/cmd/ksh93/README b/src/cmd/ksh93/README index 89766fe01b67..ca4803de0928 100644 --- a/src/cmd/ksh93/README +++ b/src/cmd/ksh93/README @@ -131,6 +131,11 @@ The options have the following defaults and meanings: OPTIMIZE on Optimize loop invariants for with for and while loops. + PRINTF_LEGACY The printf built-in accepts a format operand that starts + with '-' without the standard preceding '--' options + terminator. This is for compatibility with local scripts. + Enabled by default if the OS's printf(1) is POSIX-ignorant. + P_SUID 0 If set, all real UIDs greater than or equal to this value will require the -p option to run the shell setuid/setgid. diff --git a/src/cmd/ksh93/SHOPT.sh b/src/cmd/ksh93/SHOPT.sh index 27a8b88f1fdc..f291134afa62 100644 --- a/src/cmd/ksh93/SHOPT.sh +++ b/src/cmd/ksh93/SHOPT.sh @@ -32,6 +32,7 @@ SHOPT NOECHOE=0 # turn off 'echo -e' when SHOPT_ECHOPRINT is disabled SHOPT OLDTERMIO= # support both TCGETA and TCGETS SHOPT OPTIMIZE=1 # optimize loop invariants SHOPT P_SUID=0 # real UIDs >= this value require -p for set[ug]id (to turn off, use empty, not 0) +SHOPT PRINTF_LEGACY= # allow noncompliant printf(1) syntax (format arg starting with '-' without prior '--') SHOPT REGRESS= # enable __regress__ builtin and instrumented intercepts for testing SHOPT REMOTE= # enable --rc if running as a remote shell SHOPT SCRIPTONLY=0 # build ksh for running scripts only; compile out the interactive shell diff --git a/src/cmd/ksh93/bltins/print.c b/src/cmd/ksh93/bltins/print.c index bf8bede562bf..389bab40726d 100644 --- a/src/cmd/ksh93/bltins/print.c +++ b/src/cmd/ksh93/bltins/print.c @@ -197,6 +197,14 @@ int b_print(int argc, char *argv[], Shbltin_t *context) } } opt_info.disc = &disc; +#if SHOPT_PRINTF_LEGACY + /* POSIX-ignorant printf(1) compat, prong 1: prevent option parsing if first arg looks like format operand */ + if(argc<0 && argv[1] && (strchr(argv[1],'%') || strchr(argv[1],'\\'))) + { + opt_info.index = 1; + goto skipopts; + } +#endif /* SHOPT_PRINTF_LEGACY */ while((n = optget(argv,options))) switch(n) { case 'n': @@ -264,6 +272,14 @@ int b_print(int argc, char *argv[], Shbltin_t *context) vflag='C'; break; case ':': +#if SHOPT_PRINTF_LEGACY + /* POSIX-ignorant printf(1) compat, prong 2: treat erroneous first option as operand */ + if(argc<0 && (opt_info.index==1 || opt_info.index==2 && argv[1][0]=='-' && argv[1][1]=='-')) + { + opt_info.index = 1; + goto skipopts; + } +#endif /* SHOPT_PRINTF_LEGACY */ /* The following is for backward compatibility */ if(strcmp(opt_info.name,"-R")==0) { @@ -290,6 +306,7 @@ int b_print(int argc, char *argv[], Shbltin_t *context) errormsg(SH_DICT,ERROR_usage(2), "%s", opt_info.arg); UNREACHABLE(); } +skipopts: opt_info.disc = NULL; argv += opt_info.index; if(error_info.errors || (argc<0 && !(format = *argv++))) diff --git a/src/cmd/ksh93/data/builtins.c b/src/cmd/ksh93/data/builtins.c index c2f6fe6491da..2bd08c3f6c95 100644 --- a/src/cmd/ksh93/data/builtins.c +++ b/src/cmd/ksh93/data/builtins.c @@ -1252,28 +1252,25 @@ const char sh_optprint[] = ; const char sh_optprintf[] = -"[-1c?\n@(#)$Id: printf (ksh 93u+m) 2023-03-23 $\n]" +"[-1c?\n@(#)$Id: printf (ksh 93u+m) 2024-02-11 $\n]" "[--catalog?" SH_DICT "]" "[+NAME?printf - write formatted output]" "[+DESCRIPTION?\bprintf\b writes each \astring\a operand to " "standard output using \aformat\a to control the output format.]" -"[+?The \aformat\a operands supports the full range of ANSI C formatting " +#if SHOPT_PRINTF_LEGACY +"[+?For backward compatibility with this system's external \bprintf\b(1) " + "command, this built-in version allows the \aformat\a operand to " + "start with a \b-\b without a prior \b--\b options terminator " + "argument, provided no \aoptions\a are given. " + "This is not portable and should be avoided in new scripts.]" +#else +"[+?Note that the \aformat\a operand cannot start with a \b-\b unless it is " + "preceded by a \b--\b options terminator argument.]" +#endif +"[+?The \aformat\a operand supports the full range of ANSI C formatting " "specifiers plus the following additional specifiers:]{" "[+%b?Each character in the \astring\a operand is processed " - "specially as follows:]{" - "[+\\a?Alert character.]" - "[+\\b?Backspace character.]" - "[+\\c?Terminate output without appending newline. " - "The remaining \astring\a operands are ignored.]" - "[+\\f?Formfeed character.]" - "[+\\n?Newline character.]" - "[+\\t?Tab character.]" - "[+\\v?Vertical tab character.]" - "[+\\\\?Backslash character.]" - "[+\\E?Escape character (ASCII octal 033).]" - "[+\\0\ax\a?The 8-bit character whose ASCII code is " - "the 1-, 2-, or 3-digit octal number \ax\a.]" - "}" + "for \b\\\b codes as in \bprint\b(1).]" "[+%q?Output \astring\a quoted in a manner that it can be read in " "by the shell to get back the same string. However, empty " "strings resulting from missing \astring\a operands will " @@ -1415,8 +1412,8 @@ const char sh_optprintf[] = "specifiers will be treated as if empty strings were supplied, " "numeric conversions will be treated as if 0 were supplied, and " "time conversions will be treated as if \bnow\b were supplied.]" -"[+?\bprintf\b is equivalent to \bprint -f\b which allows additional " - "options to be specified.]" +"[+?Except for the \b-v\b option, \bprintf\b is equivalent to \bprint -f\b " + "which allows additional options to be specified.]" "[v]:[name?Put the output in the variable \aname\a instead of writing to " "standard output. \aname\a may include an array subscript (note that " "the square brackets should be quoted to avoid pathname expansion).]" diff --git a/src/cmd/ksh93/include/version.h b/src/cmd/ksh93/include/version.h index 816dbea9f370..e7b8496f8f03 100644 --- a/src/cmd/ksh93/include/version.h +++ b/src/cmd/ksh93/include/version.h @@ -18,7 +18,7 @@ #define SH_RELEASE_FORK "93u+m" /* only change if you develop a new ksh93 fork */ #define SH_RELEASE_SVER "1.0.9-beta" /* semantic version number: https://semver.org */ -#define SH_RELEASE_DATE "2024-02-09" /* must be in this format for $((.sh.version)) */ +#define SH_RELEASE_DATE "2024-02-11" /* must be in this format for $((.sh.version)) */ #define SH_RELEASE_CPYR "(c) 2020-2024 Contributors to ksh " SH_RELEASE_FORK /* Scripts sometimes field-split ${.sh.version}, so don't change amount of whitespace. */ diff --git a/src/cmd/ksh93/sh.1 b/src/cmd/ksh93/sh.1 index e4b89923c82d..9340f8dc2f8b 100644 --- a/src/cmd/ksh93/sh.1 +++ b/src/cmd/ksh93/sh.1 @@ -7356,8 +7356,34 @@ command substitution and avoids the latter's stripping of final linefeed characters (\fB\\n\fR). The \f2vname\fP argument should be a valid variable name, optionally with one or more array subscripts in square brackets. Note that square brackets should be quoted to avoid pathname expansion. +.PP +On some systems, the external +.BR printf (1) +command +allows the +.I format\^ +operand to begin with a +.B -\^ +without any preceding +.B --\^ +option terminator argument +for compatibility with ancient scripts, +provided no options are given. +On such systems, +.if \nZ=0 .BR sh 's +.if \nZ=1 .BR ksh 's +.if \nZ=2 .BR ksh93 's +built-in +.B printf\^ +may have been built to be as compatible with +this as possible while still allowing its options to be usable. +However, the POSIX standard requires adding the preceding +.B --\^ +to keep such a +.I format\^ +operand from being misinterpreted as options. +The obsolete syntax is not portable and should be avoided in new scripts. .RE - .TP \f3pwd\fP \*(OK \f3\-LP\fP \*(CK Outputs the value of the current working diff --git a/src/cmd/ksh93/tests/builtins.sh b/src/cmd/ksh93/tests/builtins.sh index a6865c178132..e45ac02172af 100755 --- a/src/cmd/ksh93/tests/builtins.sh +++ b/src/cmd/ksh93/tests/builtins.sh @@ -1206,6 +1206,8 @@ function test_usage do case $bltin in fc | hist ) ((SHOPT_SCRIPTONLY)) && continue ;; + printf ) + ((SHOPT_PRINTF_LEGACY)) && continue ;; echo | test | true | false | \[ | : | expr | */expr | getconf | */getconf | uname | */uname | catclose | catgets | catopen | Dt* | _Dt* | X* | login | newgrp ) continue ;; /*/*) expect="Usage: ${bltin##*/} " diff --git a/src/cmd/ksh93/tests/shtests b/src/cmd/ksh93/tests/shtests index 6bbe2ef2740b..51ef3b30c1ed 100755 --- a/src/cmd/ksh93/tests/shtests +++ b/src/cmd/ksh93/tests/shtests @@ -8,7 +8,7 @@ valgrindflags='--xml=yes --log-file=/dev/null --track-origins=yes --read-var-inf USAGE=$' [-s8? -@(#)$Id: shtests (ksh 93u+m) 2023-05-18 $ +@(#)$Id: shtests (ksh 93u+m) 2024-02-11 $ ] [-author?David Korn ] [-author?Glenn Fowler ] @@ -274,9 +274,12 @@ if [[ -d /usr/ucb ]] then PATH=$PATH:/usr/ucb fi PATH=$PATH:$d -if [[ $INSTALLROOT && -r $INSTALLROOT/bin/.paths ]] -then PATH=$INSTALLROOT/bin:$PATH +if [[ ! -v INSTALLROOT || $INSTALLROOT != /* ]] +then echo "$0: cannot continue: INSTALLROOT must be set" >&2 + echo "$0: invoke me after 'bin/package use', or via 'bin/shtests' or 'bin/package test'" >&2 + exit 1 fi +PATH=$INSTALLROOT/bin:$PATH if [[ ${SHELL%/*} != $INSTALLROOT/bin ]] then PATH=${SHELL%/*}:$PATH fi @@ -308,12 +311,27 @@ fi export HOME=$tmp # make the SHOPT_* macros available to the tests as environment variables +command unset "${!SHOPT_@}" 2>/dev/null SHOPT() { [[ $1 == *=* ]] && eval "export SHOPT_${ printf %q "${1%%=*}"; }=${ printf %q "${1#*=}"; }" } . "${SHOPTFILE:-../SHOPT.sh}" unset -f SHOPT +# override with probed values from shopt.h: +# change '#define SHOPT_foo = bar' to 'export SHOPT_foo=bar' and evaluate +if [[ -f $INSTALLROOT/src/cmd/ksh93/shopt.h ]] && + i=$(sed -n $'/^#[ \t]*define[ \t][ \t]*SHOPT_/ { + s/.*define[ \t][ \t]*// + s/[ \t][ \t]*/=/ + s/^/export / + p + }' "$INSTALLROOT"/src/cmd/ksh93/shopt.h) +then eval "$i" || exit +else echo "$0: cannot grep shopt.h" >&2 + exit 1 +fi + [[ -n $SHOPT_MULTIBYTE ]] || SHOPT_MULTIBYTE=$( LC_ALL=C.UTF-8; x=$'\xc3\xa9'; print $(( ${#x}==1 )) ) if (( !SHOPT_MULTIBYTE && utf8 && !posix && !compile )) then echo "-u/--utf8 is unavailable because multibyte support was not compiled in." >&2