Missing index.st (search plugin breaks parsing titles with " or \) #20

Closed
1 task done
maphew opened this issue Feb 2, 2023 · 8 comments

Comments


maphew commented Feb 2, 2023

  • I have searched the issues (including closed ones) and believe that this is not a duplicate.

Issue

When and how does the index get built? On two systems, Windows and Ubuntu, I've followed the instructions and verified that stork is installed and in PATH.

$ stork --version
Stork 1.6.0

...then launched pelican with pelican --autoreload --listen, and in the resultant preview browser window typed text in the search box.

The web page says "Error! Check the browser console.".
Browser console says "Uncaught (in promise) undefined ".
Shell console says:

Done: Processed 69 articles, 0 drafts, 0 hidden articles, 1 page, 0 hidden pages and 0 draft pages in 8.83 seconds.
Unable to find `/search-index.st` or variations:
/search-index.st.html
/search-index.st/index.html
/search-index.st 

I've searched the file system for those files and they don't exist, so it looks like the index is not being built. How do I test and/or ensure the index is being built?

@maphew maphew mentioned this issue Feb 2, 2023
3 tasks

maphew commented Feb 2, 2023

After reading #12 I added search to pelicanconf.py PLUGINS:

PLUGINS = ['m.htmlsanity', 'search']

Now Pelican exits immediately with the error:

 CRITICAL Exception: Search plugin reported Error: Couldn't read the          __init__.py:566
                    configuration file: Cannot parse config as TOML. Stork recieved
                    error: `expected newline, found an identifier at line 288 column
                    11`

The only toml file in the file system for my site is output/search.toml.
Line 288 is:

title = ""dir \*1" returns unexpected files"

I could see \* being parsed improperly, but it's not at position 11. Likewise the double quotes, but they're not at that position either. d is at 11 if 1-based, i if zero-based.

Lines 285 to 293 are:

[[input.files]]
path = "Other/dir_1_returns_unexpected_files.html"
url = "/Other/dir_1_returns_unexpected_files.html"
title = ""dir \*1" returns unexpected files"

[[input.files]]
path = "Linux/Fix_for_broken_wireless_after_suspend_resume.html"
url = "/Linux/Fix_for_broken_wireless_after_suspend_resume.html"
title = "Fix for broken wireless after suspend/resume"


maphew commented Feb 2, 2023

Ahhh, I realised the d at position 11 is correct. The parser thinks there should be no more chars on the line as it's just processed what it thinks is a closing quote at position 10.

Below is the source markdown that's tripping it up. If I remove this post then pelican-search works: it generates the index.st file (and related files) I was missing, and using the search box in the output web pages returns results.

---
title: "dir \*1" returns unexpected files
date:  17.10.2009
category: Other
tags:  other, cmd
summary: an unexpected quirk of `dir` in Windows CMD
---

These source lines crash pelican-search:

title: "dir \*1" returns unexpected files
title: `dir \*1` returns unexpected files
title: "'dir *1' returns unexpected files"

These source lines are okay:

title: 'dir *1' returns unexpected files
title: dir \\*1 returns unexpected files

@maphew maphew changed the title Missing index.st Missing index.st (search plugin breaks parsing title with and \`) Feb 2, 2023
@maphew maphew changed the title Missing index.st (search plugin breaks parsing title with and \`) Missing index.st (search plugin breaks parsing titles with " and \) Feb 2, 2023
@maphew maphew changed the title Missing index.st (search plugin breaks parsing titles with " and \) Missing index.st (search plugin breaks parsing titles with " or \) Feb 2, 2023

justinmayer commented Feb 2, 2023 via email


maphew commented Feb 2, 2023

Yes it does seem to be stork itself:

» stork build -i output\search.toml -o x
Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `expected newline, found an identifier at line 288 column 11`


maphew commented Feb 2, 2023

Referring to this line:

title = "{striptags(page.title)}"

Would changing it to title = {dumps(striptags(page.title))} do the trick?

It seems to work in this standalone snippet anyway:

from json import dumps

from jinja2.filters import do_striptags as striptags

# Titles that break search.toml when interpolated without escaping
X1 = r'''""dir *1" returns unexpected files"'''
X2 = r'''`dir \*1` returns unexpected files'''
X3 = r'''"'dir *1' returns unexpected files"'''

def test(page_title):
    # dumps() supplies the surrounding quotes and escapes embedded " and \
    input_file = f"""
        [[input.files]]
        title = {dumps(striptags(page_title))}
    """
    return input_file

for x in (X1, X2, X3):
    print(test(x))

I got that json idea from https://stackoverflow.com/questions/17941109/escaping-quotes-in-jinja2 which was then reinforced by ChatGPT.

@justinmayer
Contributor

Can you try your change locally and see whether your fix works? That would entail temporarily uninstalling the current release package, cloning the repo, making your changes to it, and using Pip's "editable install" function to install the plugin:

python -m pip uninstall pelican-search
git clone https://github.com/pelican-plugins/search.git ~/pelican-plugins/search
# [Make the code changes]
python -m pip install -e ~/pelican-plugins/search/

maphew added a commit to maphew/search that referenced this issue Feb 3, 2023
maphew added a commit to maphew/search that referenced this issue Feb 3, 2023
Fixes pelican-plugins#20, according to local testing.

maphew commented Feb 3, 2023

Yes this does fix the problem on my machine! PR coming

This was referenced Feb 3, 2023

maphew commented Feb 9, 2023

Closing, since either the accepted PR #15 or the alternative proposal in #23 will address the issue.
