Swallowing up text in the parser #122
Comments
In Python regexes
|
BTW, here is the right place to ask questions about parglare. |
Ah, now, I tried
import parglare
grammar = r"""
Program: al=AuthorLine sentences=Sentences;
AuthorLine: title=Identifier "by" author=Identifier DOT;
Sentences: Sentence*;
Sentence: Anything DOT;
Identifier: IdentifierWord*;
terminals
IdentifierWord: /\w+/;
DOT: ".";
Anything: /(?s).*?/;
"""
text = """
Program by Stuart.
This is sentence one.
This is sentence two
which has newlines in.
"""
g = parglare.Grammar.from_string(grammar)
p = parglare.Parser(g, debug=True)
result = p.parse(text)
This fails with error:
I don't know how to tell parglare "just swallow up the rest of the document, I don't care about parsing it", or "please only detect an |
The problem is that
which means
Another feature you might find useful, depending on what you are trying to achieve, is incomplete parsing. |
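One regex behavior that is easy to trip over with a terminal like `/(?s).*?/` can be checked outside parglare. The sketch below uses only Python's `re` module and assumes (this is an assumption, not something stated in this thread) that a regex terminal is tried on its own at the current input position, with nothing after it to force a lazy quantifier to expand. The `[^.]+` pattern is a hypothetical alternative for comparison, not a confirmed fix.

```python
import re

# Body text taken from the example above, without the author line.
text = "This is sentence one.\nThis is sentence two\nwhich has newlines in.\n"

# Lazy "match anything": tried in isolation, it is satisfied by zero characters.
lazy = re.compile(r"(?s).*?")

# Hypothetical alternative: one or more characters that are not a dot.
# A negated character class also matches newlines, so no (?s) flag is needed.
up_to_dot = re.compile(r"[^.]+")

lazy_match = lazy.match(text)
dot_match = up_to_dot.match(text)

print(repr(lazy_match.group()))  # ''
print(repr(dot_match.group()))   # 'This is sentence one'
```

The lazy pattern succeeds with an empty match, while the negated class consumes everything up to, but not including, the next dot.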
Incomplete parsing looks like exactly what I want! Thank you! |
I have a document which contains a heading, which is a quoted string, and then a series of "sentences" which end with a "." and may have newlines in. I'd like to parse the document into Heading and Sentences. I tried to do it this way:
However, this fails with
parglare.exceptions.ParseError: Error at 6:0:"ence one.\n **> This is se" => Expected: DOT but found <Anything(This is sentence two)>
All I care about is the Heading, and parsing the Body into separate sentences, but I can't work out how to do that; what's the best way to express this in a parglare grammar? The sentences can contain anything at all; I don't need a structure or parsing for them at this stage, just a list with
["This is sentence one.", "This is sentence two which has newlines in."]
as the return value; sentences might contain any characters at all.
(Apologies if this isn't actually an issue, but I hope it's the best place to ask questions about parglare. I'm happy to ask it somewhere else if that's better.)
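To pin down the expected result independently of any grammar, here is a minimal plain-Python sketch of the sentence splitting described above. It is not a parglare solution. The helper name `split_sentences` is hypothetical, and the sketch assumes that a sentence is any run of non-dot characters followed by a dot, and that newlines inside a sentence should be collapsed to single spaces, as the desired list suggests.

```python
import re


def split_sentences(body):
    """Split body text into dot-terminated sentences (hypothetical helper)."""
    # Each sentence: one or more non-dot characters (newlines included), then a dot.
    raw = re.findall(r"[^.]+\.", body)
    # Collapse internal whitespace, including newlines, to single spaces.
    return [" ".join(sentence.split()) for sentence in raw]


body = """
This is sentence one.
This is sentence two
which has newlines in.
"""

sentences = split_sentences(body)
print(sentences)
```

Text after the last dot (here just a trailing newline) is ignored, which matches the "I don't care about the rest" requirement only loosely; a real grammar would need to decide how to treat it.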