Using multiple models in NER #1334
That's not an error, though. It should work just fine with that warning.
…On Sun, Jan 21, 2024, 11:41 PM linlinloo ***@***.***> wrote:
I want to run the following code, but an error occurred.
import stanza
pipe = stanza.Pipeline("en", processors="tokenize,ner", package={"ner": ["ncbi_disease", "ontonotes"]})
doc = pipe("John Bauer works at Stanford and has hip arthritis. He works for Chris Manning")
print(doc.ents)
WARNING: Language en package default expects mwt, which has been added
I have downloaded ncbi_disease.pt and placed it in site-packages\stanza\stanza_resources\en\ner. What's the problem, and why?
|
However, running it did not produce any results, only a series of errors: ConnectTimeout, MaxRetryError, ...
|
If it's giving a timeout error, I would guess the most likely culprit is that it's trying to download missing resources and isn't able to connect. You can add download_method=None to the Pipeline to stop it from downloading.
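A minimal sketch of that suggestion, reusing the pipeline from the original post with download_method=None as the only change. The helper names offline_pipeline_kwargs and build_offline_pipeline are made up for illustration, and the sketch assumes the requested models are already present in the local resources directory.

```python
def offline_pipeline_kwargs():
    """Keyword arguments for stanza.Pipeline with resource downloading turned off."""
    return {
        "lang": "en",
        "processors": "tokenize,ner",
        # Same NER packages as in the original post
        "package": {"ner": ["ncbi_disease", "ontonotes"]},
        # None stops the Pipeline from trying to download missing resources
        "download_method": None,
    }


def build_offline_pipeline():
    """Create the pipeline; stanza is imported lazily so the helper above has no dependencies."""
    import stanza

    return stanza.Pipeline(**offline_pipeline_kwargs())
```

Calling build_offline_pipeline() and then running the returned pipeline on a sentence should behave like the original snippet, minus the network access. If a model file is missing locally, expect a load error rather than a timeout.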
…On Mon, Jan 22, 2024 at 12:13 AM linlinloo ***@***.***> wrote:
When I run other code, ncbi_disease does not appear under ner. Did I put the wrong package there?
Loading these models for language: en (English):
| Processor | Package |
| tokenize | combined |
| mwt | combined |
| pos | combined_charlm |
| lemma | combined_nocharlm |
| constituency | ptb3-revised_charlm |
| depparse | combined_charlm |
| sentiment | sstplus |
| ner | ontonotes-ww-multi_charlm |
|
Also, I should note that as of version 1.7.0, the default NER model is now "ontonotes-ww-multi_charlm". There's also "ontonotes_charlm". They are named this way so that you can get "nocharlm" models if you want faster processing. If there's some stale documentation, please let me know and I'll update it.
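A rough sketch of how that naming scheme could be used to pick an NER package by name. The helpers ner_package_name and ner_pipeline_kwargs are hypothetical, and the "_nocharlm" names are inferred from the pattern described above rather than checked against the model list, so verify that a given name exists before relying on it.

```python
def ner_package_name(base, use_charlm=True):
    """Build a package name following the <base>_charlm / <base>_nocharlm pattern."""
    suffix = "charlm" if use_charlm else "nocharlm"
    return f"{base}_{suffix}"


def ner_pipeline_kwargs(base="ontonotes", use_charlm=True):
    """Keyword arguments for stanza.Pipeline that select a single NER package by name."""
    return {
        "lang": "en",
        "processors": "tokenize,ner",
        # List form, as in the original post; here it holds one package
        "package": {"ner": [ner_package_name(base, use_charlm)]},
    }
```

For example, ner_pipeline_kwargs("ontonotes-ww-multi") asks for the 1.7.0 default by its full name, while use_charlm=False would request the faster variant, assuming one is published under that name.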
|
I found ontonotes_charlm.pt, and I can download it. Do you mean that I should replace ontonotes-ww-multi_charlm with it?
You can do whatever you like, of course. The ww-multi model was trained on both OntoNotes and the dataset described in this paper
Yes, exactly. I suggest that because it's the most likely reason you're getting timeouts. If the problem is somewhere else, please include the complete stack trace.