This is great, @raindrum, sorry for the slow reply. So your list of supported materials looks pretty great. Some replies:
cc: @jcushman, @mattdahl, @flooie for any additional thoughts.
-
To chime in, I agree that making this play nicely with our existing parsing logic would be ideal. As @mlissner says, doing so would do wonders for helping CL resolve Id. citations. Right now we just use a dummy class to keep track of when possible "non-opinion citations" appear (https://github.com/freelawproject/eyecite/blob/master/eyecite/models.py#L241) -- so we don't erroneously link Id. citations to them -- but this implementation is very brittle.
@raindrum, your code looks very nice to me and has some nice abstractions. However, I fear that it is paradigmatically different from the approach that
As I wrote in #931 (comment), I once attempted to do something similar, by way of integrating all these regexes here: https://github.com/freelawproject/citation-regexes/. The approach I took was to make two passes through each opinion, one using our existing citation logic (now
I hope that's helpful! There are surely other strategies to pursue as well. Your package looks really interesting and I would love to see it used on CL!
-
Thanks for the feedback, all! I just put out an update to CiteURL that addresses some of @mlissner's concerns, and others:
I'm open to the idea of merging CiteURL into
The purpose of the recursive id. schemas is to let each id. schema match only once, and create the context for the next one. However, that feature doesn't seem to be working, so right now CiteURL just fudges it and uses the context from the most recent full or shortform citation. Unfortunately, that leads to errors: if someone cites 42 USC 1983, then id. at 1988, then id. at (b), it'll generate a link to § 1983(b) instead of § 1988(b). I'll try to fix the issue soon.
As for performance, it might be possible to migrate CiteURL to use Hyperscan, but I'm not really familiar enough with the latter to know.
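To make the chained id. behavior concrete, here is a minimal Python sketch of the intended outcome, where each id. citation becomes the context for the next one. This is not CiteURL's code or its schema format: the names `Cite`, `TOKEN`, and `resolve` are hypothetical, and the regex only handles simple U.S.C. citations of the kind in the example above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cite:
    """Hypothetical container for a resolved U.S.C. citation."""
    title: str
    section: str
    subsection: str = ""

    def label(self) -> str:
        return f"{self.title} U.S.C. § {self.section}{self.subsection}"


# One pattern with two branches: a full citation, or an "id. at ..." citation.
# Illustrative only; real citation formats are far more varied.
TOKEN = re.compile(
    r"(?P<full>(?P<title>\d+) U\.?S\.?C\.? (?:§ ?)?(?P<sec>\d+)"
    r"(?P<sub>(?:\([a-z0-9]+\))*))"
    r"|(?P<id>[Ii]d\. at (?:§ ?)?(?P<id_sec>\d+)?"
    r"(?P<id_sub>(?:\([a-z0-9]+\))*))"
)


def resolve(text: str) -> List[str]:
    """Resolve citations in order, updating the context after every match."""
    context: Optional[Cite] = None
    resolved: List[str] = []
    for m in TOKEN.finditer(text):
        if m["full"]:
            context = Cite(m["title"], m["sec"], m["sub"])
        elif context is not None and (m["id_sec"] or m["id_sub"]):
            if m["id_sec"]:
                # "id. at 1988": new section, same title.
                context = Cite(context.title, m["id_sec"], m["id_sub"])
            else:
                # "id. at (b)": new subsection of the *previous* id. target.
                context = Cite(context.title, context.section, m["id_sub"])
        else:
            # An id. with no usable pincite or no prior context: skip it.
            continue
        resolved.append(context.label())
    return resolved


print(resolve("See 42 USC 1983; id. at 1988; id. at (b)."))
```

The key design choice is that `context` is overwritten after every id. match, not only after full citations, so "id. at (b)" attaches to § 1988 rather than to the last full citation.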
-
I just want to write a quick comment, since I'm mostly on vacation this week. I think we have two really interesting big things happening simultaneously right now, and I think we should make them into just one. Over in freelawproject/eyecite#10, @jcushman has a big PR that I've briefly looked at which overhauls the approach of
I haven't thought carefully yet about how this project could get folded into eyecite, but I'm happy to hear that you're willing and interested, @raindrum. I think my suggestion would be that we merge Jack's big PR, and that we then do the work of integrating this. Between the two, I think we could have something wildly neat.
-
Hi everyone! I recently wrote a Python library, CiteURL, which can detect (theoretically) all kinds of legal citations in text, and insert hyperlinks to view them on sites like Cornell.
By default, the library recognizes an incomplete mishmash of federal bodies of law, but it's easy to extend it with YAML, including replacing the built-in schemas altogether.
I'd love to help integrate it into CourtListener, but there are a few big decisions to make first.
First, what bodies of law should it support before it's worth running it against the whole database? You can find the currently-supported formats here. Where multiple sites publish a body of law, which one is preferred? By default, CiteURL uses mostly the Cornell website, but a lot of federal materials are available on government sites, for instance.
The biggest question is, if we want to implement CiteURL for statutory and other citations, is it worth using it for linking intra-site court case citations as well? This would be a much more complicated change, but it could be worthwhile, since CiteURL supports shortform citations and pincites.
What do folks think?
EDIT: After reading through #1521 and learning a bit more about how CourtListener already handles court opinion matching, I hesitate to suggest replacing it. But I do think CiteURL would be very useful for other kinds of citations.