Reference text #1
Comments
If you want to test, it should be something like:
You will notice that the diffs corresponding to I-1° and II are correct, but the one for I-2° is not. I guess there is some issue with the cursor, but if the source text is always the original text, then the fix will be different.
{"children":[{"children":[{"children":[{"children":[{"children":[{"children":[{"children":[{"order":1,"type":"alinea-reference"}],"id":"41","type":"article-reference"}],"id":"constitution","type":"code-reference"},{"children":[{"type":"quote","words":"Les propositions de loi ou les amendements qui ne sont pas du domaine de la loi ou qui, hors le cas des lois de programmation, sont dépourvus de portée normative, et les amendements qui sont sans lien direct avec le texte déposé ou transmis en première lecture ne sont pas recevables.\nS'il apparaît au cours de la procédure législative qu'une proposition de loi ou un amendement est contraire à une habilitation accordée en vertu de l'article 38, le Gouvernement ou le président de l'assemblée saisie peut opposer l'irrecevabilité."}],"type":"word-definition"}],"editType":"replace","type":"edit"}],"order":1,"type":"header2"},{"children":[{"children":[{"children":[{"children":[{"children":[{"children":[{"children":[{"type":"quote","words":"intéressée"}],"position":"after","type":"word-reference"}],"order":2,"type":"alinea-reference"}],"id":"41","type":"article-reference"}],"id":"constitution","type":"code-reference"},{"children":[{"type":"quote","words":"sur une irrecevabilité au titre de l'un des cas prévus aux deux alinéas précédents"}],"type":"word-definition"}],"editType":"add","type":"edit"}],"order":1,"type":"header3"},{"children":[{"children":[{"children":[{"children":[{"children":[{"children":[{"type":"quote","words":"huit jours"}],"type":"word-reference"}],"order":2,"type":"alinea-reference"}],"id":"41","type":"article-reference"}],"id":"constitution","type":"code-reference"},{"children":[{"type":"quote","words":"trois jours pour les amendements et de huit jours pour les propositions de loi, dans les conditions fixées par la loi organique"}],"type":"word-definition"}],"editType":"replace","type":"edit"}],"order":2,"type":"header3"}],"order":2,"type":"header2"}],"order":1,"type":"header1"},{"children":[{"children":[{"children":[{"children":[{"children":[{"order":2,"type":"sentence-reference"}],"order":1,"type":"alinea-reference"}],"id":"45","type":"article-reference"}],"id":"constitution","type":"code-reference"}],"editType":"delete","type":"edit"}],"order":2,"type":"header1"}],"content":"I. - L'article 41 de la Constitution est ainsi modifié :\n1° Le premier alinéa est remplacé par les dispositions suivantes :\n\"Les propositions de loi ou les amendements qui ne sont pas du domaine de la loi ou qui, hors le cas des lois de programmation, sont dépourvus de portée normative, et les amendements qui sont sans lien direct avec le texte déposé ou transmis en première lecture ne sont pas recevables.\n\"S'il apparaît au cours de la procédure législative qu'une proposition de loi ou un amendement est contraire à une habilitation accordée en vertu de l'article 38, le Gouvernement ou le président de l'assemblée saisie peut opposer l'irrecevabilité.\" ;\n2° Le deuxième alinéa est ainsi modifié :\na) Après le mot : \"intéressée\" sont insérés les mots : \"sur une irrecevabilité au titre de l'un des cas prévus aux deux alinéas précédents\" ;\nb) Les mots : \"huit jours\" sont remplacés par les mots : \"trois jours pour les amendements et de huit jours pour les propositions de loi, dans les conditions fixées par la loi organique\".\nII. - La seconde phrase du premier alinéa de l'article 45 est supprimée.","isNew":false,"order":3,"type":"bill-article"}],"date":"2018-5-9","id":911,"legislature":15,"place":"assemblée nationale","type":"projet de loi","url":null} |
At least in some texts (XVe-911, constitutional law), the source text is always the original text. Issue: #1
@promethe42: this could interest you: interesting and difficult issues in sight! With commit eb0aa26 (always restart from the original text), article 3 of project 911 works entirely (and I am preparing some changes to improve typography, such as orphan spaces at the end of sentences).
In the longer term this poses a difficult problem: we have to merge each individual change, so we need a robust merge tool. Instead of a text-based merge, we can merge DuraLex trees. (I just tried git-merge on this article 3, and I had to resolve the conflict manually :-/) The source text of the DuraLex tree is always the same (more or less the text in force), but after each individual change applied by SedLex (each verb) the DuraLex tree needs to be "rebased": e.g. when an alinea is added at the beginning of an article, we must increment the alinea counter [of this specific article] in the subsequent changes; the diffs generated by SedLex could then differ from the original ones. If we do something like this, applying a set of changes to a text requires a loop: FOR 1/ SedLex-diff ; 2/ SedLex-rebase ROF.
PS: perhaps I’m a bit enthusiastic, but in the very long term such an infrastructure should enable a git-blame (or pijul-credit) at the character level, even leading back to the amendment if we have enough good-quality data. E.g. we would be able to see that the last "e" in "menacées" in article 16 of the Constitution has been introduced (or not) by amendment 1924 of this constitutional law (wisdom of Commission and Government on this change ;-). |
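To make the diff/rebase loop described above more concrete, here is a minimal Python sketch. It is not SedLex or DuraLex code: the `Change` record, the dict-of-alineas text model and the function names are all assumptions made for illustration. Every change is addressed against the original alinea numbers, and after each change is applied the remaining ones are renumbered.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple

# Hypothetical text model: article id -> list of alineas (not the DuraLex tree).
Text = Dict[str, List[str]]


@dataclass(frozen=True)
class Change:
    """One elementary change (one 'verb'); all names here are illustrative."""
    article: str
    alinea: int            # 1-based, numbered against the ORIGINAL text
    count: int             # number of alineas removed (0 = pure insertion)
    new: Tuple[str, ...]   # alineas put in their place (() = pure deletion)


def apply_change(text: Text, change: Change) -> Text:
    """Step 1 ("diff"): apply a single change and return a new text."""
    alineas = list(text[change.article])
    start = change.alinea - 1
    alineas[start:start + change.count] = change.new
    return {**text, change.article: alineas}


def rebase(pending: List[Change], applied: Change) -> List[Change]:
    """Step 2 ("rebase"): renumber the changes that have not been applied yet."""
    shift = len(applied.new) - applied.count
    end = applied.alinea + applied.count  # first alinea after the touched span
    rebased = []
    for c in pending:
        if c.article != applied.article or c.alinea + c.count <= applied.alinea:
            rebased.append(c)  # other article, or entirely before: unchanged
        elif c.alinea >= end:
            rebased.append(replace(c, alinea=c.alinea + shift))
        else:
            raise ValueError(f"conflict: {c} overlaps {applied}")
    return rebased


def apply_all(text: Text, changes: List[Change]) -> Text:
    """The FOR ... ROF loop: apply one change, rebase the rest, repeat."""
    pending = list(changes)
    while pending:
        current, pending = pending[0], pending[1:]
        text = apply_change(text, current)
        pending = rebase(pending, current)
    return text
```

With an article of two alineas, replacing the first alinea by two new ones and then editing "the second alinea" (as numbered in the original text) ends up editing the third alinea of the intermediate text, which mirrors the I-1° / I-2° situation of article 3. Overlapping changes raise an error here rather than being merged; a real tool would need a policy for that case.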
About your cursor question, I guess it's like the pastilles system: at first all the alineas are marked with their number, and they keep it until the end. +1 for pijul character-based credit, would love to see it everywhere! I applied the changes (with |
Ok, thanks for the clarification about the pastilles. Even if the individual changes in articles 1 to 10 work (with a small fix in the DuraLex trees of articles 6 and 7), the merges may need a rebase operation, depending on the articles of the Constitution concerned. |
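As an alternative to renumbering, the pastilles idea mentioned above (each alinea keeps its original number as a label until the end) can be sketched as follows. This is only an illustration under assumptions: the tuple representation and the function names are hypothetical, and the choice that newly written alineas carry no pastille is a guess, not a documented rule.

```python
from typing import Callable, List, Optional, Tuple

# (pastille, text): the pastille is the alinea's number in the original text,
# or None for alineas created by a later change (an assumption of this sketch).
Alinea = Tuple[Optional[int], str]


def label(alineas: List[str]) -> List[Alinea]:
    """Mark every alinea of the original article with its number."""
    return [(number, text) for number, text in enumerate(alineas, start=1)]


def find(article: List[Alinea], pastille: int) -> int:
    """Return the current position of the alinea carrying this pastille."""
    for index, (mark, _) in enumerate(article):
        if mark == pastille:
            return index
    raise KeyError(f"no alinea with pastille {pastille}")


def replace_alinea(article: List[Alinea], pastille: int,
                   new_texts: List[str]) -> List[Alinea]:
    """Replace one labelled alinea by any number of new, unlabelled alineas."""
    i = find(article, pastille)
    return article[:i] + [(None, t) for t in new_texts] + article[i + 1:]


def edit_alinea(article: List[Alinea], pastille: int,
                edit: Callable[[str], str]) -> List[Alinea]:
    """Change the wording of one labelled alinea; it keeps its pastille."""
    i = find(article, pastille)
    return article[:i] + [(pastille, edit(article[i][1]))] + article[i + 1:]
```

Because later changes look alineas up by pastille rather than by position, "the second alinea" still resolves to the right paragraph after the first alinea has been replaced by two new ones, and no rebase step is needed for this particular case.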
Ola, |
While reviewing the adopted amendments to the constitutional law, I found another issue related to this one: amendments 773 and 1047 add two different expressions (« , des mers et des océans. » and « , de la biodiversité. ») after the same word (« environnement »). Here the two expressions can simply be added as a list without any difficulty, but that will not always be the case; I’m also not sure whether there is a canonical order in which to apply these conflicting patches, or whether this is a case where SedLex should trigger a warning. If the merge is automatic, typography has to be handled as well (the full stop is wrong here in both amendments). Do you have any experience with the state of the art for this case? And do you have an opinion on how SedLex should behave? |
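One possible behaviour for the "trigger a warning" option is to group insertions by their anchor word and report every anchor targeted more than once. The Python sketch below only illustrates that idea; the `Insertion` record and `find_conflicts` are hypothetical names, not SedLex API. It also assumes that all insertions target the same alinea, and the pairing of amendment numbers with wordings simply follows the order in the comment above and has not been verified.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Insertion:
    """A word-level insertion (hypothetical record, for illustration only)."""
    amendment: str   # identifier of the amendment
    anchor: str      # word after which the new words are inserted
    words: str       # inserted words, exactly as written in the amendment


def find_conflicts(insertions: List[Insertion]) -> Dict[str, List[Insertion]]:
    """Return the anchors that are targeted by more than one insertion."""
    by_anchor: Dict[str, List[Insertion]] = defaultdict(list)
    for insertion in insertions:
        by_anchor[insertion.anchor].append(insertion)
    return {a: group for a, group in by_anchor.items() if len(group) > 1}


# Pairing of numbers and wordings assumed from the order in the comment.
amendments = [
    Insertion("773", "environnement", ", des mers et des océans."),
    Insertion("1047", "environnement", ", de la biodiversité."),
]
for anchor, group in find_conflicts(amendments).items():
    ids = ", ".join(i.amendment for i in group)
    print(f"warning: amendments {ids} all insert after '{anchor}'")
```

This only detects the conflict; choosing an order and repairing the punctuation (the trailing full stops) would remain a separate, probably manual, decision.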
In fact the issue mentioned above arises twice: in the projet de loi, two new articles will be added after article 2 in some order (defined or not?), and when the projet de loi is applied with these two new articles, the issue mentioned above will occur again. Or possibly the two amendments will be merged into a single article by the services de l’Assemblée, who will resolve the conflict manually. This is a subtle issue that only occurs when merging two amendments; it is not important in the short to medium term. |
Interesting case indeed! I don't think the National Assembly services will merge the two articles: they should remain two extra articles after article 2, and I guess that in the end (if the text goes any further...) it would be the SGG that decides how to implement the two additions in the Constitution. |
I just thought about this issue. Currently the diffs generated by SedLex are only at the level of the verbs, roughly each sentence in each article of a pjl/ppl or of an amendment. It might be better to create diffs at the level of each article and globally for the pjl/ppl, but that requires "rebasing" the later references that are affected. For instance, you add an alinea at the beginning, you add a sentence after the second sentence of the second alinea, and you add some words at the end of the third sentence of the second alinea (everything referenced from the initial text). To apply all these operations in a standalone manner (without relying on an external text-merging program):
(*) or, to be extra careful, one could think of some marker to better manage word-level insertions, e.g. if your amendment adds some words after the word "intéressé" but a previous amendment added those words before the initial occurrence… Thinking about this, the current logic in AddDiffVisitor could be extended to larger scales, but you would need to manage the states of all unprocessed patches at all scales, which would either create headaches if you want an arbitrary-scale algorithm or require hardcoded scales. And in either case you need to recompute everything that came before, because it would be difficult to store these states. Or, probably better, this issue #1 and the exact diffs of #3 could be solved together:
In this way you can easily store half-computed projected texts, you can apply (almost) any combination of patches (assuming the patches are not dependent), you can easily construct the dependencies between patches by searching for tags within tags, you can have the exact diff of a combination of patches, you have a (git-)blame at the word level, and your word-level git-blame is the exact diff (not created externally by an independent program). |
Also, a git-blame-like feature needs to uniquely identify a patch, roughly the amendment that last changed the text, but it could also be the initial pjl/ppl. I’m a bit lost in the numbering of the reference texts on the AN website, but either the URLs or the identifiers could serve as a first identifier, although I’m not sure they are really perennial (for the URLs). For an amendment I see http://www.assemblee-nationale.fr/15/amendements/0857/AN/98.asp for instance, with the législature, the reference text, the category (AN/commission) and the amendment identifier. For the reference text, I see http://www.assemblee-nationale.fr/15/textes/0857.asp and http://www.assemblee-nationale.fr/15/ta-commission/r0857-a0.asp; I’m not sure what the role of each text is (I haven’t compared these two texts (and others?) exactly as of now). |
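As a rough illustration of using these URLs as patch identifiers, the sketch below splits the amendment URL quoted above into its four parts and rebuilds the corresponding /textes/ URL. The pattern is inferred from that single example, so other URL shapes on the AN website may well not match; `AmendmentId`, `parse_amendment_url` and `text_url` are hypothetical names.

```python
import re
from typing import NamedTuple


class AmendmentId(NamedTuple):
    """Parts of an amendment URL (illustrative structure, not an official one)."""
    legislature: int
    text: str        # reference text number, kept as a string ("0857")
    category: str    # "AN" in the example above
    number: str      # amendment identifier within the category


# Pattern inferred from one example URL only; treat it as an assumption.
_AMENDMENT_URL = re.compile(
    r"^https?://www\.assemblee-nationale\.fr/"
    r"(\d+)/amendements/([^/]+)/([^/]+)/([^/]+)\.asp$"
)


def parse_amendment_url(url: str) -> AmendmentId:
    """Split an amendment URL into legislature, text, category and number."""
    match = _AMENDMENT_URL.match(url)
    if match is None:
        raise ValueError(f"unrecognised amendment URL: {url}")
    legislature, text, category, number = match.groups()
    return AmendmentId(int(legislature), text, category, number)


def text_url(amendment: AmendmentId) -> str:
    """Build the /textes/ URL of the reference text the amendment applies to."""
    return (
        "http://www.assemblee-nationale.fr/"
        f"{amendment.legislature}/textes/{amendment.text}.asp"
    )
```

The reference text number is kept as a string so that the leading zero of "0857" survives a round trip. Whether these URLs are stable enough to serve as long-term identifiers is exactly the open question raised above.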
The main difference is, IMHO, that |
I agree: /textes/ are probably the best (and easier to generate the urls generically). |
This is implemented, not in SedLex but in Durafront for now. Possibly it could be moved to SedLex, to be discussed. Compared to the discussion above, Durafront is currently not able to apply an amendment to an arbitrary text (e.g. an article of a pjl/ppl) but only to a code; this will be implemented soon, and I will probably use amendment identifiers as discussed above. |
@RouxRC @mdamien: I am testing with the French constitutional bill (see also Legilibre/DuraLex#7). Article 3 of the bill modifies articles 41 (in section I) and 45 (in section II). In section I, the 1° adds an alinea, and the 2° changes some words in the second alinea (which becomes the third once the 1° is applied). Given that the changed word appears only once, there is no doubt that the source text in I-2° is the original text.
Do you know of any rules about "when to restart the cursor", or whether they are written down somewhere? That is: are there cases where you modify text already modified in an earlier section, or is the source text always the original text (in force)?