Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiboot2: Follow C behavior for string parsing #172

Merged
merged 1 commit into from
Sep 21, 2023

Conversation

A0lson
Copy link
Contributor

@A0lson A0lson commented Jul 18, 2023

The various strings in the Multiboot2 spec are specified as C-style zero-terminated strings.

Previously, if the multiboot2 crate is handed a string with non-NUL characters present between the terminating NUL (within the tag), they were returned as well.

This revision instead terminates the strings on the first NUL character, as would be typical in C handling. Although unterminated strings in Multiboot2 are technically an error, current unterminated string behavior is maintained.

@A0lson
Copy link
Contributor Author

A0lson commented Jul 18, 2023

FYI, in Rust 1.69 a new method was stabilized for converting nul-terminated strings. However, it returns a different type of error for non-terminated strings.

Thus, I think using it in the multiboot2 crate will require refactoring of the Utf8Error return results into a new error that encompasses both the non-terminated error and the Utf8Error.

(Functionally, it seems to also terminate on the string on the first NUL character too.)

The following seems to work too, but is probably a breaking change due to the crate's API changes:

main...A0lson:multiboot2:cstring_from_bytes_until_nul

@phip1611
Copy link
Collaborator

Hi @A0lson ,
thanks for the bug report and the fix! I'll look into it very soon but I'm busy at the moment.

@phip1611
Copy link
Collaborator

phip1611 commented Jul 21, 2023

Can you please tell me a real-world example where this causes a problem? If a Multiboot2 bootloader puts foo\0bar\0 into the tag and the size properly reflects that, why shouldn't Rust support that? You still have the option to truncate the string slice further to the first NUL on a higher level, i.e., in your application code.

@A0lson
Copy link
Contributor Author

A0lson commented Jul 21, 2023

Here's an example that fails in main but I think should pass. Here simply truncating the returned value at the NUL byte isn't an possibility as the result is a Utf8Error instead of a string.

assert_eq!(Ok("foobar"), Tag::get_dst_str_slice(b"foobar\0\xff\xff"));

This is not entirely hypothetical either. For example, when the "tboot" module boots an ELF-based kernel (such as Xen), it has to replace the Multiboot2 command-line string with that of the first module (holding the Xen kernel). In the case where the new command-line is shorter than the existing one, it simply relies on C-style termination and doesn't shorten the Multiboot2 tag. Of course, this may be arguable...

To me, the Multiboot2 spec is clear that strings end on the NUL byte (and that it must be present). Even their example C code relies on this and doesn't care about the length of the Tag. (Even though libc is capable of dealing with non-terminated strings).
https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html

While I would generally hope that protocol specifications would be language agnostic, the Multiboot specs seem to be strongly focused on the C language. Although the spec is somewhat quiet on what (if anything) may come after a NUL byte, their example code and current practical implementations (Grub, Xen, etc.) are all C-based and rely on typical NUL-terminated C string behavior.

Thus, for interoperability, I think the multiboot2 crate should follow the same behavior.

@phip1611 phip1611 self-assigned this Sep 3, 2023
@phip1611 phip1611 self-requested a review September 3, 2023 17:53
@phip1611
Copy link
Collaborator

Hi. Sorry, it took a while.

You are right, we should follow the spec and not try to be feature-creep and support every obscure variant of wrong string encodings.

@phip1611 phip1611 enabled auto-merge September 21, 2023 13:24
The various strings in the Multiboot2 spec are specified as C-style
zero-terminated strings.

Previously, if the multiboot2 crate is handed a string with non-NUL characters
present between the terminating NUL (within the tag), they were returned as
well.

This revision instead terminates the strings on the first NUL character,
as would be typical in C handling. Although unterminated strings in Multiboot2
are technically an error, current unterminated string behavior is maintained.
@phip1611 phip1611 merged commit 2cf6266 into rust-osdev:main Sep 21, 2023
14 checks passed
@phip1611 phip1611 added this to the multiboot2: v0.19 milestone Sep 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants