improve MAPIMessage.guess7BitEncoding, improve MAPIMessage.getHtmlBody #149

Closed

@dhoelzl dhoelzl commented May 23, 2019

improve MAPIMessage.guess7BitEncoding

  • code page sources for general properties: PR_MESSAGE_CODEPAGE, the default ANSI code page of PR_MESSAGE_LOCALE_ID, and the charset from the Content-Type header
  • code page source for PR_BODY and PR_BODY_HTML: PR_INTERNET_CPID (as documented)
    no need to read it from the HTML; Outlook skips that too
    special case: Outlook uses code page CP1252 instead of UTF-8 for PR_BODY
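
The lookup order described above could be sketched roughly as follows. This is an illustration only, not the code in this patch: the class `CodePageFallbackSketch` and all of its methods are hypothetical, the property values are passed in as plain arguments instead of being read from a message, and it assumes the three general sources are tried in the order listed. It also reads the special case as "when PR_INTERNET_CPID says UTF-8 (65001), decode PR_BODY as CP1252".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Apache POI: shows one possible fallback order.
class CodePageFallbackSketch {
    static final int CP_UTF8 = 65001; // Windows code page number for UTF-8
    static final int CP_1252 = 1252;  // Windows Western European code page

    /** Maps a Windows code page number to a Java Charset, or null if there is no match. */
    static Charset toCharset(Integer codePage) {
        if (codePage == null) return null;
        if (codePage == CP_UTF8) return StandardCharsets.UTF_8;
        return lookup("windows-" + codePage);
    }

    /** Looks up a charset by name, returning null instead of throwing. */
    static Charset lookup(String name) {
        if (name == null) return null;
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) { // illegal or unsupported charset name
            return null;
        }
    }

    /** General properties: the first usable source wins (assumed order: as listed in the PR). */
    static Charset generalCharset(Integer messageCodepage, Integer localeAnsiCodepage,
                                  String contentTypeCharset) {
        Charset cs = toCharset(messageCodepage);
        if (cs == null) cs = toCharset(localeAnsiCodepage);
        if (cs == null) cs = lookup(contentTypeCharset);
        return cs; // may be null: the PR text does not name a final default
    }

    /** PR_BODY: PR_INTERNET_CPID decides, with UTF-8 swapped for CP1252 (special case). */
    static Charset bodyCharset(Integer internetCpid) {
        if (internetCpid != null && internetCpid == CP_UTF8) return toCharset(CP_1252);
        return toCharset(internetCpid);
    }

    public static void main(String[] args) {
        System.out.println(generalCharset(null, 1252, "utf-8"));
        System.out.println(bodyCharset(65001));
    }
}
```

Returning `null` when no source yields a charset leaves the final default to the caller, since the description does not state one for general properties.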

improve MAPIMessage.getHtmlBody:

  • for a binary property, use the code page defined in PR_INTERNET_CPID, defaulting to CP1252 (no need to read it from the HTML; Outlook skips that too)
  • for a string property, rely on the code page set by MAPIMessage.guess7BitEncoding
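
For the binary case, a minimal sketch of the decoding step might look like this. It is not the patch itself: `HtmlBodyDecodeSketch` and its methods are hypothetical names, the raw bytes and the PR_INTERNET_CPID value are passed in directly, and falling back to CP1252 for an unrecognised code page number is an assumption beyond what the description states (it only names CP1252 as the default).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Apache POI.
class HtmlBodyDecodeSketch {
    // Default named in the PR description for the binary HTML body.
    static final Charset DEFAULT = Charset.forName("windows-1252");

    /** Picks the charset from PR_INTERNET_CPID, or CP1252 when it is absent. */
    static Charset charsetFor(Integer internetCpid) {
        if (internetCpid == null) return DEFAULT;
        if (internetCpid == 65001) return StandardCharsets.UTF_8; // Windows code page for UTF-8
        try {
            return Charset.forName("windows-" + internetCpid);
        } catch (IllegalArgumentException e) {
            return DEFAULT; // assumption: unknown code pages also fall back to CP1252
        }
    }

    /** Decodes the raw body bytes without inspecting any charset declared in the HTML. */
    static String decodeBinaryHtml(byte[] raw, Integer internetCpid) {
        return new String(raw, charsetFor(internetCpid));
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "<p>\u00e4</p>".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeBinaryHtml(utf8Bytes, 65001));
        System.out.println(decodeBinaryHtml(utf8Bytes, null));
    }
}
```

The second call in `main` decodes UTF-8 bytes as CP1252, which shows why a correct PR_INTERNET_CPID matters when the body contains non-ASCII characters.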

asfgit commented May 23, 2019

Can one of the admins verify this patch?

Contributor

@pjfanning pjfanning left a comment

Do you have any sample files to try the new code on? If so, can you use them in unit tests (for regression purposes)?

 - code page sources for general properties: PR_MESSAGE_CODEPAGE, the default ANSI code page of PR_MESSAGE_LOCALE_ID, and the charset from the Content-Type header
 - code page source for PR_BODY and PR_BODY_HTML: PR_INTERNET_CPID (as documented)
   no need to read it from the HTML; Outlook skips that too
   special case: Outlook uses code page CP1252 instead of UTF-8 for PR_BODY

* improve MAPIMessage.getHtmlBody:
 - for a binary property, use the code page defined in PR_INTERNET_CPID, defaulting to CP1252 (no need to read it from the HTML; Outlook skips that too)
 - for a string property, rely on the code page set by MAPIMessage.guess7BitEncoding

* add unit tests
Author

dhoelzl commented May 24, 2019

I have added a unit test to the PR.

@asfgit asfgit closed this in 721180d May 26, 2019
@pjfanning
Contributor

Closed using 721180d

asfgit pushed a commit that referenced this pull request Oct 6, 2019
…ge.getHtmlBody. Thanks to Dominik Hölzl. This closes #149

git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1860043 13f79535-47bb-0310-9956-ffa450edef68
Alain-Bearez pushed a commit to cuali/poi that referenced this pull request Dec 12, 2019
…ge.getHtmlBody. Thanks to Dominik Hölzl. This closes apache#149

git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1860043 13f79535-47bb-0310-9956-ffa450edef68