Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(fmt): Preserve multiple newlines between elements (#374) #919

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

AlexanderArvidsson
Copy link

@AlexanderArvidsson AlexanderArvidsson commented Sep 12, 2024

Preserves multiple lines between elements, to allow organizing code better.
For reference and discussion, see #374.

templ fmt is inconsistent with itself, sometimes preserving newlines (collapsing) and sometimes not.
This aligns templ fmt with gofmt, which also preserves newlines (by collapsing multiple standalone newlines into one), which templ fmt also does in go functions.

This change should be backwards compatible (as in, it shouldn't modify existing code bases since they all should already strip all newlines). But the best way to guarantee that is to run this fork on a big codebase and observe what it does.
Other than that, this PR also includes a couple of tests for this, both for the parser and the formatter output. I can add more formatter tests if needed, in case we find some edge cases.

parser/v2/types.go Show resolved Hide resolved
@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 13, 2024

I found a case that I'd like to include before potentially accepting this PR:

temp x() {
	<header></header>

	@c.Hero()
	<main class="max-w-7xl mx-auto px-8 mt-24 overflow-hidden"></main>
			
	<a></a>

	if true {
		<span></span>
	}
	<a></a>
}

The space should have been preserved between c.Hero and main and the if statement and a tag, but they were stripped.

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 13, 2024

I also found that this throws an error in my own project, due to the first <a></a>. Says test_templ.go:32:61: string literal not terminated (and 8 more errors), but doesn't seem to throw anything (and formats correctly) when I run it as a format test.

package components

templ Test() {
  <div class="flex items-center gap-4">
    <a></a>



    <a>
    </a>
  </div>
}

EDIT: I've found that I've missed adding checks where there's normally a n == parser.SpaceVertical, it doesn't account for SpaceVerticalDouble. For example, this strips newlines into horizontal space:

	// Normalize whitespace for minified output. In HTML, a single space is equivalent to
	// any number of spaces, tabs, or newlines.
	if n == parser.SpaceVertical {
		n = parser.SpaceHorizontal
	}

This should also do the same for parser.SpaceVerticalDouble. This will fix the issue I was facing, since this affects the minified generated go code, and thus the unterminated string.

Also, this switch statement must include a parser.SpaceVerticalDouble to preserve indent:

		switch trailing {
		case SpaceNone:
			level = 0
		case SpaceHorizontal:
			level = 0
		case SpaceVertical:
			level = startLevel
		}

@AlexanderArvidsson
Copy link
Author

@joerdav Before I go and preserve newlines for statements like if and for, I've decided to only focus on Elements and TemplElements.
Adding the trailing space logic to every statement is a bigger change that I think would only be fair if I got your opinion on first.

You can see how I added the logic to TemplElementExpression and let me know if I should do the same for if, for, etc.

I believe consistency is important, so I think preserving newlines should work as you expect for all types of statements or expressions, but let me know your thoughts.

@joerdav
Copy link
Collaborator

joerdav commented Sep 16, 2024

Hey @AlexanderArvidsson , thanks for all the time on this so far, I've looked at the code here and am happy with how you have approached this. I agree with you that due to consistency I think that we should support trailing spaces for all elements.

I think going ahead with this pattern is a solid approach.

One to think about is that the following code block may end up repeated a bit so could maybe be made into a function (or not since I suppose it isn't a lot of code):


		// Parse trailing whitespace after expression.
		ws, _, err := parse.Whitespace.Parse(pi)
		if err != nil {
			return r, false, err
		}
		r.TrailingSpace, err = NewTrailingSpace(ws, true)
		if err != nil {
			return r, false, err
		}

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 16, 2024

One to think about is that the following code block may end up repeated a bit so could maybe be made into a function (or not since I suppose it isn't a lot of code):


		// Parse trailing whitespace after expression.
		ws, _, err := parse.Whitespace.Parse(pi)
		if err != nil {
			return r, false, err
		}
		r.TrailingSpace, err = NewTrailingSpace(ws, true)
		if err != nil {
			return r, false, err
		}

@joerdav The elementparser has a function addTrailingSpaceAndValidate which does almost the same thing, it also checks for voidElementCloser and then validates. I can put the addTrailingSpace portion of it in its own function and use that everywhere!

So far, I just followed the "don't repeat yourself more than twice" philosophy, and just copied that small piece of code twice. But for more uses, I would definitely put it in a function. I'll get on adding support for the rest of the elements and try and find all cases.

Also, tests seem to be failing, even locally (I think I had only ran the parser tests, not generate ones), so I'll get on fixing that as well.

@AlexanderArvidsson
Copy link
Author

@joerdav I've went ahead and added trailing space logic to most parsers and added relevant tests.
It's quite a lot of additions, but they're all very similar.

Please take an extra look at how I unified the trailing space logic in addTrailingSpace via the interface TrailingSpaceSetter. I kept them separate to the WhitespaceTrailer due to the required pointer receiver, but there might be a better way.

Other than that, from my point of view, this is ready to be tested!

@joerdav
Copy link
Collaborator

joerdav commented Sep 20, 2024

Amazing, thanks! I've tested this out on a fairly large repo at my company and as you said there is no effect to existing code.

I also have read the trailing space interface logic, I think it is done sensibly!

One note I have is on the addTrailingSpace function. I can't see any usages of 2 of the return values. Which could also mean that this no longer needs to be a generic function: func addTrailingSpace(e TrailingSpaceSetter, pi *parse.Input, allowMulti bool) error.

Going further, I wonder if we need the interface at all? If the trailing space function looked like: func parseTrailingSpace(pi *parse.Input, allowMulti bool) (types.TrailingSpace, err error)

It could be used as:

	// Parse trailing whitespace after closing brace.
	r.TrailingSpace, err = parseTrailingSpace(pi, true)
	if err != nil {
		return r, false, err
	}

What do you think?

Also in general, I think it would be worth @a-h casting his eyes over this.

@AlexanderArvidsson
Copy link
Author

Amazing, thanks! I've tested this out on a fairly large repo at my company and as you said there is no effect to existing code.

I also have read the trailing space interface logic, I think it is done sensibly!

One note I have is on the addTrailingSpace function. I can't see any usages of 2 of the return values. Which could also mean that this no longer needs to be a generic function: func addTrailingSpace(e TrailingSpaceSetter, pi *parse.Input, allowMulti bool) error.

Going further, I wonder if we need the interface at all? If the trailing space function looked like: func parseTrailingSpace(pi *parse.Input, allowMulti bool) (types.TrailingSpace, err error)

It could be used as:

	// Parse trailing whitespace after closing brace.
	r.TrailingSpace, err = parseTrailingSpace(pi, true)
	if err != nil {
		return r, false, err
	}

What do you think?

Also in general, I think it would be worth @a-h casting his eyes over this.

You're absolutely right! I originally did have uses for the other values, but as I learned more about the codebase I started removing them. I think your way is much cleaner. I'll make those changes :)
I wasn't a big fan of the interface anyway but at the time was the quickest way I thought of.

Good to hear that you didn't see any effect on the existing code! :)

@joerdav
Copy link
Collaborator

joerdav commented Sep 20, 2024

All good fair enough, I know the feeling! You go on a journey with the code and end up with designs based off previous iterations more than your current understanding.

And yep no issues with the existing code! That's 165 templates to be exact :)

@AlexanderArvidsson
Copy link
Author

@joerdav @a-h This should be ready for a final look-through now! I've applied the previous suggestions, even went a bit further for the statements that enforce a breaking space afterwards (like if, for, switch, etc).

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 29, 2024

@joerdav I've fixed the tests, although I would highly suggest checking the last commit, because I'm not sure if it's correct.
If, switch and for are treated as inline, which to me is weird. To fix the tests introducing double whitespace, I had to only write whitespace when the current node is not if, switch or for, but keep whitespace logic if the next node is if, switch or for. Weird one, but it fixed the tests.

Templ elements seem to be special too, there should always be a horizontal space between a Templ block element and a subsequent Templ element (block or inline).
I was only able to fix this by adding a specific check for this, I'm not sure that's the correct approach, but all tests are passing now.

Copy link
Owner

@a-h a-h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive changes. Thanks a lot for all the effort you've put in to this.

It sounds easy to get the formatting spacing right, but somehow, it isn't at all! The tests, in particular, look great.

The changes I've asked for are really just to make the spacing algorithm clearer to understand for anyone that needs to work in this area in the future.

If you're not able to make the changes due to time constraints, or you're just fed up, let me know and I can refactor from your starting position and merge your PR.

Thanks again!

err = g.writeForExpression(indentLevel, n, next)
case parser.CallTemplateExpression:
err = g.writeCallTemplateExpression(indentLevel, n)
case parser.TemplElementExpression:
err = g.writeTemplElementExpression(indentLevel, n)

// TemplElementExpression with block should always have whitespace if the next element is also
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment. It doesn't make grammatical sense.

  • TemplElementExpression with block... block what? Are we saying that a templ element expression that is a block HTML element the it should always have following whitespace?
  • "if the next element is also..." what? What is the next element?

The code doesn't check whether the element is a block element, from what I can see it checks whether the current element has children, AND that the next child is also a templ element.

An element can have children, and still be an inline element, e.g. <i>child text node</i>. I haven't tested it out, but if the comment is accurate, with <i>child text node</i><b>next node</b>, we might end up with a line break between those inline nodes, since whitespace would be "forced".

There's also no nil check here on next. I haven't checked the call sites, but it looks like next could be nil, and this might panic if there's an node with children at the end of a list of nodes.

Suggest explaining the expected behaviour (why of the comment), adding a nil check, and adding a test to make sure that subsequent inline elements don't inappropriately get space between them.

Copy link
Author

@AlexanderArvidsson AlexanderArvidsson Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great feedback! Let me explain:

The full comment on that line is:

		// TemplElementExpression with block should always have whitespace if the next element is also
		// a TemplElementExpression

I wrapped it to keep within a sensible line width. Upon reading your other comments I see your convention seems to have 1 sentence per line, with a period.

Templ elements can contain children, which is then a Go block (not HTML block), and is called as such in the code base in tests. I merely used the same terminology, but I see the confusion with HTML blocks.

		{
			name: "templelement: simple, block with text",
			input: `@Other(p.Test) {
	some words
}`,

When I didn't add this special case, tests were failing. See my comment in one of the outdated threads.
There are tests that expect horizontal spacing between 2 TemplElementExpressions that contain children.
See test test-import, which includes multiple Templ elements.
This test already tests this case (albeit without a self-explanatory name), so not necessary to add another one. I'd argue to rename it, since I don't even understand what its testing, there are no imports in the test.

The nil check was my bad, missed that one, sorry.

Copy link
Author

@AlexanderArvidsson AlexanderArvidsson Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the comment is accurate, with child text nodenext node, we might end up with a line break between those inline nodes, since whitespace would be "forced".

Let me clarify that what the failing tests were checking for here is a space, not a newline.
I think you're right though that this change could be adding a line break. Let me add some tests for this.

@@ -566,8 +580,10 @@ func (g *generator) writeNode(indentLevel int, current parser.Node, next parser.
// Write trailing whitespace, if there is a next node that might need the space.
// If the next node is inline or text, we might need it.
// If the current node is a block element, we don't need it.
needed := (isInlineOrText(current) && isInlineOrText(next))
if ws, ok := current.(parser.WhitespaceTrailer); ok && needed {
// If, switch and for as current node skip whitespace, but not always when next node.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment either.

Copy link
Author

@AlexanderArvidsson AlexanderArvidsson Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If, switch and for statements are marked as inline elements. I don't agree with this, but I'm sure there are reasons for this. The thing is, when these statements got their own TrailingSpace, they no longer satisfy the tests which were written to handle "2 inline elements following each other", which the previous code did here.

These statements should not add a whitespace if they're the current node, but if they're the "next" node (and the current is inline), they should add a whitespace. At least, that's what the failing tests indicated to me. Fixing the issue in any other way caused other tests to fail, indicating some weird inconsistency that was only solved via this edge-case.

Again, as mentioned in my previous comment, I'm not happy with any of these solutions and if you have a better plan for this, feel free. These were really the last pieces of failing tests that I tried to fix without making modifications like this, but were unsuccessful.

needed := (isInlineOrText(current) && isInlineOrText(next))
if ws, ok := current.(parser.WhitespaceTrailer); ok && needed {
// If, switch and for as current node skip whitespace, but not always when next node.
neededWhitespace := forceWhitespace || (maybeWhitespace && isInlineOrText(current) && isInlineOrText(next))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neededWhitespace puts the variable into past tense - i.e. that we needed whitespace in the past.

However, I now see that the algorithm for deciding whether to add trailing whitespace is mixed in with the logic for writing out various nodes, plus this final statement.

I think it would be clearer if this was a function called, e.g. shouldAllowTrailingWhitespace(current, next) and the logic applied there, e.g. something that looks like this (but not this!):

func shouldAllowTrailingWhitespace(current, next) bool {
   if _, isTemplElement := parser.TemplElementExpression; isTemplElement {
		if len(n.Children) > 0 {
			if _, ok := next.(parser.TemplElementExpression); ok {
				return true
		}
	}
   }
   switch n := current.(type) {
      case x:
        return false
      case y:
         return false     
   }
   
}

That way, the logic for testing trailing whitespace can be explained more easily with some tests.

Copy link
Author

@AlexanderArvidsson AlexanderArvidsson Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this whole function was extremely confusion. I tried to do the absolute minimal changes to it to allow the existing tests to pass. While I am not happy with it, the truth is that this function needs a huge refactoring. It generalizes adding whitespaces after elements, but I don't think that its something that can be generalized like this.

I probably don't have the full knowledge of the codebase to write a better one, or argue why a different solution is better. Your function makes it much clearer though, so that is a good start.

@@ -528,6 +528,9 @@ func (g *generator) writeNodes(indentLevel int, nodes []parser.Node, next parser
}

func (g *generator) writeNode(indentLevel int, current parser.Node, next parser.Node) (err error) {
maybeWhitespace := true
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be whitespace implies that something might be whitespace, but with this variable, I think you're trying to say that trailing whitespace is not be allowed if this set to true, therefore, a better variable name would be isTrailingWhitespaceAllowed.

And forceWhitespace might be better off named forceTrailingWhitespace.

Copy link
Author

@AlexanderArvidsson AlexanderArvidsson Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is the opposite; we if this is true, we might add a whitespace. If it's false, we definitely won't add a whitespace. It's only there to prevent If, For and Switch expressions from adding a whitespace, since that's handled elsewhere.

But I agree, your variable names are better.

@@ -29,5 +29,11 @@ func (p callTemplateExpressionParser) Parse(pi *parse.Input) (n Node, ok bool, e
return
}

// Parse trailing whitespace.
r.TrailingSpace, err = parseTrailingSpace(pi, true, false)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not ideal that from reading the file, I have no idea what the true and false arguments mean. I'd have to find parseTrailingSpace and read the args.

A better design would use an enum to define the behaviour, e.g. something like this:

type TrailingSpaceParseOpts int
const (
  ParseTrailingAllowVerticalAndHorizontal TrailingSpaceParseOpts = iota
  ParseTrailingAllowVerticalOnly
)
parseTrailingSpace(pi, ParseTrailingAllowVerticalAndHorizontal)

That way, it's obvious what the intent of the code is.

@@ -10,7 +10,10 @@ var switchExpression parse.Parser[Node] = switchExpressionParser{}
type switchExpressionParser struct{}

func (switchExpressionParser) Parse(pi *parse.Input) (n Node, ok bool, err error) {
var r SwitchExpression
r := SwitchExpression{
// Default behavior is always a trailing space
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really add anything to the code, I'd strip it.

Suggested change
// Default behavior is always a trailing space

@@ -13,7 +13,11 @@ func (p templElementExpressionParser) Parse(pi *parse.Input) (n Node, ok bool, e
return
}

var r TemplElementExpression
r := TemplElementExpression{
// Default behavior is always a trailing space
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Default behavior is always a trailing space

@@ -80,7 +80,7 @@ func TestTextParser(t *testing.T) {
},
},
{
name: "Multiline text is colected line by line",
name: "Multiline text is collected line by line",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the typos!

)

var ErrNonSpaceCharacter = errors.New("non space character found")

func NewTrailingSpace(s string) (ts TrailingSpace, err error) {
func NewTrailingSpace(s string, allowMulti bool) (ts TrailingSpace, err error) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the other conversation, the bool flag makes it harder to understand what's going on without referring to the function definition. Would prefer an alternative solution, e.g. an enum style for allow multi, or functional params.

if r == '\n' {
if allowMulti && i < len(runes)-1 {
next := runes[i+1]
if next == '\n' {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect Windows users would like a check for \r\n.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had assumed this was normalized before entering this part of the code. Seems dangerous to have 2 code-paths based on OS everywhere in the code. I'd check, but it'd be difficult for me to test it since I don't have Windows 😅

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 30, 2024

If you're not able to make the changes due to time constraints, or you're just fed up, let me know and I can refactor from your starting position and merge your PR.

@a-h I'll go through and make the changes you've requested, no problem!

I've found an alternative way to satisfy the tests that I believe is much clearer. The "issue" is that If, For and Switch expressions are inline in HTML, but Block in Templ. So the "TrailingSpace" needs to contain the Templ formatting, while it also at the same time needs to collapse its trailing space into an empty string during generation.

@AlexanderArvidsson
Copy link
Author

AlexanderArvidsson commented Sep 30, 2024

Upon adding some more generator tests, I found that the htmldiff.Diff incorrectly diffs HTMl:
Consider this expected HTML:

<i>child text node</i> <b>next node</b>

With this actual HTML:

<i>child text node</i><b>next node</b>

These two HTMLs are not the same. If you open a browser with these two, there is a difference. In HTML, spacing is preserved if 2 inline elements are on the same line. If they're on different lines, an implicit space is added.
However, when htmldiff.Diff compares these, it formats them like this:

actual:
<i>
 child text node
</i>
<b>
 next node
</b>

expected:
<i>
 child text node
</i>
<b>
 next node
</b>

Which are incorrectly marked as equal. I'm not too sure why there's a formatting step here, as the point of generator tests should be to compare the real output with the expected output, no middle-hand formatting.

For this specific test I'm writing, I won't be using htmldiff and instead comparing the exact string, just to get the test right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants