Hello, I am trying to have MetaMap process some translated German texts that include words with the letter 'ß'.
After analyzing why the JSON output breaks, I found that the character 'ß' seems to cause an error when it is part of a word (not when it is a standalone character).
Example request:
from skr_web_api import Submission, METAMAP_INTERACTIVE_URL
args = "-AI -R SNOMEDCT_US_2022_03_01 --JSONf 2 -V USAbase -Z 2022AA"
inst = Submission(email, apikey)
inst.init_mm_interactive('This is a test with Straße', args=args)
response = inst.submit()
When I decode the content of the response via response.content.decode(), I get a broken JSON string (broken because it is not closed at the end and appears to be cut off).
A partial fix would be to replace the character 'ß' with 'ss', but I am not sure the results would be the same as with the online version of MetaMap, where words containing 'ß' are not a problem.
"Breaking" here refers to the incomplete JSON, which ends at "UttText": [
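To tell a cut-off response from a complete one programmatically, a check along these lines may help. This is only a minimal sketch: the helper name parse_metamap_json is hypothetical and not part of skr_web_api, and it assumes the response body is meant to be a single UTF-8 encoded JSON document.

```python
import json


def parse_metamap_json(raw: bytes):
    """Return the parsed JSON document, or None if it is truncated or invalid.

    Hypothetical helper, not part of skr_web_api. Assumes the response
    body is supposed to be one UTF-8 encoded JSON document.
    """
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # A body that stops at '"UttText": [' never closes its
        # brackets, so json.loads raises JSONDecodeError here.
        return None
```

With the request from this issue, parse_metamap_json(response.content) would be expected to return None for the truncated output, so the caller can log or retry the affected input instead of failing later.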
So this can also be worked around by removing the "é", but in some cases that may lose valuable information.
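As a stopgap, the replacement idea can be sketched as a small pre-processing step applied before the text is passed to init_mm_interactive. The function name to_ascii_fallback and the replacement table are assumptions made for illustration. As noted above, altering the input may change what MetaMap returns compared with the online version.

```python
import unicodedata

# Characters with no Unicode decomposition need an explicit mapping.
# This table is an illustrative assumption, not an official list.
EXPLICIT_REPLACEMENTS = {"ß": "ss"}


def to_ascii_fallback(text: str) -> str:
    """Replace common non-ASCII characters with ASCII look-alikes."""
    for char, replacement in EXPLICIT_REPLACEMENTS.items():
        text = text.replace(char, replacement)
    # NFKD splits 'é' into 'e' plus a combining accent and turns '²' into '2'.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks that the decomposition left behind.
    return "".join(c for c in decomposed if not unicodedata.combining(c))
```

Replacing "é" with "e" rather than deleting it keeps the word intact, which may limit the information loss mentioned above, although the original spelling is still lost.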
KimBenjaminTang changed the title from "MetaMap API breaks when special character 'ß' occurs in a word" to "MetaMap API breaks when special characters (e.g. 'ß') occurs in a word" on Dec 6, 2022
It also breaks with the string "m² T", because the character ² is followed by another character/word. If the ² is at the end of the string, followed by nothing other than whitespace, the string is processed.
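Since 'ß', 'é' and '²' are all outside the ASCII range, one way to find inputs that might trigger the problem is to list the non-ASCII characters in a text before submitting it. The helper below is a hypothetical sketch. Whether every non-ASCII character actually causes the truncation is an assumption that has not been verified here.

```python
def find_non_ascii(text: str) -> list[tuple[int, str]]:
    """Return (index, character) pairs for every non-ASCII character."""
    return [(i, c) for i, c in enumerate(text) if ord(c) > 127]


# Report suspicious characters for the inputs mentioned in this issue.
for sample in ("This is a test with Straße", "m² T"):
    print(sample, "->", find_non_ascii(sample))
```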
For comparison, the online version of MetaMap has no problem with the same input, including the word containing 'ß':
Request:
User Information: [email protected]
Run Time: 12/06/2022 06:12:29
MetaMap Version Used: metamap20
MetaMap Options: -A+ -R SNOMEDCT_US_2022_03_01 --JSONf 2 -V USAbase
Knowledge Source Used: 2022AA
Input Text:
This is a test with Straße
Output:
Can this be fixed by adjusting the MetaMap API to match the procedure of the MetaMap Online version?