Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure headers are strings, not NULL if not set #95

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented Aug 26, 2024

This is a patch we have been carrying for a while - in our tests we have some examples of emails where these headers are not set and the attached file is (e.g. an .ics). I think that is possible for real emails. This was our original issue https://lab.civicrm.org/dev/core/-/issues/940

@derickr
Copy link
Member

derickr commented Aug 26, 2024 via email

@eileenmcnaughton
Copy link
Contributor Author

@derickr thanks for your super quick response - I think we need to do a bit more work on our side to figure out if we have hacked your package for good reasons or bad .... so tomorrow is a bit premature.

I'll update this when I'm a bit clearer as to whether there is a solid use case for another option here or whether we did something weird a long time ago & have been carrying it every since

@seamuslee001
Copy link
Contributor

@derickr @eileenmcnaughton

It seems as though CiviCRM's use case where we want the body of the email processed as text no matter what but if there is .txt files or similar or basically any files attached to be processed as attachments is different to https://github.com/zetacomponents/Mail/blob/master/tests/parser/parser_test.php#L1769. I'm not sure if its something about how kmail works but maybe this is just an incompatibility between Civi's use case and this library's expectations.

@eileenmcnaughton
Copy link
Contributor Author

@derickr it would be great to get your feedback on the right approach - we are working with the sample data below - ie a multipart message with

  • body text
  • body html
  • a text attachment
  • a pdf attachment
  • a word attachment

The goal is to have the text attachment (which could be a data file or an ical I believe) to still be an attachment without turning the body text & body html into attachments

If we set

$parser->options->parseTextAttachmentsAsFiles = TRUE;

Then we wind up with 5 attachments & no body text

If we set

$parser->options->parseTextAttachmentsAsFiles = FALSE;

We wind up with 2 attachments and our attachment file appended to the body

My suspicion is that we are hitting behaviour in the multipart flow that was intended for the non-multipart




MIME-Version: 1.0
Date: Tue, 27 Aug 2024 09:28:55 +1000
Message-ID: <[email protected]>
Subject: test txt attachment
From: Sam Lee <[email protected]>
To: [email protected]
Content-Type: multipart/mixed; boundary="0000000000000f1e9806209e7d06"

--0000000000000f1e9806209e7d06
Content-Type: multipart/alternative; boundary="0000000000000f1e9606209e7d04"

--0000000000000f1e9606209e7d04
Content-Type: text/plain; charset="UTF-8"

test

--
Sam Lee


Big Business

He/Him/His

--0000000000000f1e9606209e7d04
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--0000000000000f1e9606209e7d04--
--0000000000000f1e9806209e7d06
Content-Type: text/plain; charset="UTF-8"; name="2_load_xss.html.txt"
Content-Disposition: attachment; filename="2_load_xss.html.txt"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_m0bmqmih2
Content-ID: <f_m0bmqmih2>

77u/PGh0bWw+DQo8Zm9ybSBlbmN0eXBlPSJhcHBsaWNhdGlvbi94LXd3dy1mb3JtLXVybGVuY29k
ZWQiIG1ldGhvZD0iUE9TVCIgYWN0aW9uPSJodHRwOi8vbG9jYWxob3N0L3dwLWFkbWluL2FkbWlu
LnBocD9wYWdlPUNpdmlDUk0mcT1jaXZpY3JtJTJGYWRtaW4lMkZja2VkaXRvciI+DQo8aW5wdXQg
dHlwZT0idGV4dCIgdmFsdWU9Im1vb25vIiBuYW1lPSJjb25maWdfc2tpbiI+IDxicj4NCjxpbnB1
dCB0eXBlPSJ0ZXh0IiB2YWx1ZT0iQ0tFRElUT1IuZWRpdG9yQ29uZmlnID0gZnVuY3Rpb24oIGNv
bmZpZyApIHt9OyIgbmFtZT0iY29uZmlnIj4gPGJyPg0KPGlucHV0IHR5cGU9InRleHQiIHZhbHVl
PSIiIG5hbWU9ImNvbmZpZ19leHRyYVBsdWdpbnMiPiA8YnI+DQo8aW5wdXQgdHlwZT0idGV4dCIg
dmFsdWU9Ii4uLy4uLy4uLy4uLy4uL3VwbG9hZHMvY2l2aWNybS9wZXJzaXN0L2NybS1ja2VkaXRv
ci14c3MuanMiIG5hbWU9ImNvbmZpZ19jdXN0b21Db25maWciPiA8YnI+DQo8aW5wdXQgdHlwZT0i
c3VibWl0IiB2YWx1ZT0iUlVOIFBPQyI+DQo8L2Zvcm0+DQo8L2h0bWw+
--0000000000000f1e9806209e7d06
Content-Type: application/msword; name="form_03.doc"
Content-Disposition: attachment; filename="form_03.doc"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_m0bmqhqx1
Content-ID: <f_m0bmqhqx1>


--0000000000000f1e9806209e7d06
Content-Type: application/pdf; name="2932_1 Ward Grouped Mayor-Lismore.pdf"
Content-Disposition: attachment; filename="2932_1 Ward Grouped Mayor-Lismore.pdf"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_m0bmqhpv0
Content-ID: <f_m0bmqhpv0>


--0000000000000f1e9806209e7d06--

@derickr
Copy link
Member

derickr commented Sep 3, 2024

OK. Here I need to dive in a lot deeper :-). It's been a while since I've really looked in-depth at this code.

@eileenmcnaughton
Copy link
Contributor Author

@derickr did you have a chance to think any more about this?

@derickr
Copy link
Member

derickr commented Nov 5, 2024

I have indeed, and I think the best way here would be to introduce another flag. I'll see if I can come up with something.

derickr added a commit that referenced this pull request Nov 5, 2024
@derickr
Copy link
Member

derickr commented Nov 5, 2024

I have had a look again, and created a branch with two test cases, and I don't see your "We wind up with 2 attachments and our attachment file appended to the body" allegation: https://github.com/zetacomponents/Mail/blob/test-alternative-text-attachment/tests/parser/parser_test.php#L1866

In both cases, I see the following structure:

  • ezcMailMultipartAlternative
    • ezcMailText / ezcMailFile (the text body)
    • ezcMailText / ezcMailFile (the HTML text body)
  • ezcMailText / ezcMailFile (the 2_load_xss.html.txt attachment)
  • ezcMailFile (the form_03.doc attachment)
  • ezcMailFile (the 2932_1 Ward Grouped Mayor-Lismore.pdf attachment)

The $parser->options->parseTextAttachmentsAsFiles=true option correctly makes the three ezcMailText elements change to ezcMailFile, as it is supposed to do.

Do you do any other processing, or any other way of parsing perhaps?

@seamuslee001
Copy link
Contributor

@derickr I think the expectation differences is shown in these lines fc981b4#diff-657e3b25123d0680407b0c9651027d42fe4f46bb7199ff757550900b030ef008R1936 and specifically I think in CiviCRM's case we want to always have the HTML and Text Bodies as ezcMailText but the txt file attachment a ezcMailFile

@eileenmcnaughton
Copy link
Contributor Author

Right - the body of an email is not an 'attachment' from a user point of view - and if the body is bundled up as an attachment the same as the 'real' attachments it's hard to present that information in a way that is meaningful to a user

@derickr
Copy link
Member

derickr commented Nov 6, 2024

OK. I added a new option to the branch in PR #96, which allows for creation of ezcMailFile objects for multipart/mixed, but leaves the sub-parts of multipart/alternative alone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants