when I use the "qlever index" command, the following error occurs. How can I resolve this issue? #64

Open
biegekekeke opened this issue Aug 28, 2024 · 8 comments


@biegekekeke

2024-08-28 14:47:35.804 - INFO: QLever IndexBuilder, compiled on Tue Aug 27 16:54:45 UTC 2024 using git hash ac257c
2024-08-28 14:47:35.804 - INFO: You specified the input format: TTL
2024-08-28 14:47:35.804 - INFO: Processing input triples from /dev/stdin ...
2024-08-28 14:47:35.804 - INFO: You specified "locale = en_US" and "ignore-punctuation = 1"
2024-08-28 14:47:35.805 - INFO: You specified "parallel-parsing = true", which enables faster parsing for TTL files with a well-behaved use of newlines
2024-08-28 14:47:35.805 - INFO: You specified "num-triples-per-batch = 10,000,000", choose a lower value if the index builder runs out of memory
2024-08-28 14:47:35.805 - INFO: By default, integers that cannot be represented by QLever will throw an exception
2024-08-28 14:47:35.808 - ERROR: Operation not permitted

@joka921
Member

joka921 commented Aug 28, 2024

Hi @biegekekeke and sorry for the spam post above, these keep appearing since yesterday or so...
Can you please send us some context:

  • What type of system is this (x86 or ARM, which operating system, and which version)?
  • How do you run QLever (natively compiled, via Docker, via the qlever control script, from the command line)? [EDIT: It seems you are running the control script via qlever index. Do you have any custom settings in your Qleverfile, and what is the dataset?]
  • What are the permissions on the folder you are

@ad-freiburg ad-freiburg deleted a comment Aug 28, 2024
@biegekekeke
Author

biegekekeke commented Aug 28, 2024

> Hi @biegekekeke and sorry for the spam post above, these keep appearing since yesterday or so... Can you please send us some context:
>
>   • What type of system is this (X86 or ARM, which operating system in which version)
>   • how do you run QLever (natively compiled, via docker, via the qlever control script, from the command line) [EDIT: It seems you are running the control script via qlever index, do you have any custom settings in your QLeverfile, and what is the dataset?
>   • What are the permissions on the folder you are

Thank you for your response. I am using Ubuntu 20.04. I installed QLever by running pip install qlever and then ran the qlever index command.
The Qleverfile is as follows:

[data]
NAME = wikidata
GET_DATA_URL = https://dumps.wikimedia.org/wikidatawiki/entities
GET_DATA_CMD = curl -LO -C - ${GET_DATA_URL}/latest-truthy.nt.bz2 ${GET_DATA_URL}/latest-lexemes.nt.bz2
INDEX_DESCRIPTION = "Full Wikidata dump from ${GET_DATA_URL} (latest-truthy.nt.bz2 and latest-lexemes.nt.bz2)"

[index]
INPUT_FILES = wikidata-20231222-lexemes.nt.bz2 wikidata-20231222-truthy.nt.bz2
CAT_INPUT_FILES = bzcat ${INPUT_FILES}
SETTINGS_JSON = { "languages-internal": ["en"], "prefixes-external": [ "<http://www.wikidata.org/entity/statement", "<http://www.wikidata.org/value", "<http://www.wikidata.org/reference" ], "locale": { "language": "en", "country": "US", "ignore-punctuation": true }, "ascii-prefixes-only": false, "num-triples-per-batch": 10000000 }
WITH_TEXT_INDEX = false
STXXL_MEMORY = 10g

[server]
PORT = 7001
ACCESS_TOKEN = ${data:NAME}_832649627
MEMORY_FOR_QUERIES = 100G
CACHE_MAX_SIZE = 100G

[runtime]
SYSTEM = docker
IMAGE = docker.io/adfreiburg/qlever:latest

[ui]
PORT = 7000
CONFIG = wikidata
The dataset is based on Wikidata, and my folder permissions are already set to rwx.
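Before looking at permissions, it can help to rule out a malformed Qleverfile. The following is a minimal sketch, not part of QLever: it parses an abbreviated copy of the file above with Python's `configparser` and `ExtendedInterpolation`, which resolves the same `${...}` syntax (whether the control script reads the file exactly this way is an assumption), and checks that `SETTINGS_JSON` is valid JSON. The helper name `load_qleverfile` is hypothetical.

```python
import configparser
import json

# Abbreviated copy of the Qleverfile above; only the keys needed for this
# check are reproduced (assumption: the omitted keys do not matter here).
QLEVERFILE = """
[data]
NAME = wikidata
GET_DATA_URL = https://dumps.wikimedia.org/wikidatawiki/entities
GET_DATA_CMD = curl -LO -C - ${GET_DATA_URL}/latest-truthy.nt.bz2

[index]
INPUT_FILES = wikidata-20231222-lexemes.nt.bz2 wikidata-20231222-truthy.nt.bz2
CAT_INPUT_FILES = bzcat ${INPUT_FILES}
SETTINGS_JSON = { "locale": { "language": "en", "country": "US", "ignore-punctuation": true }, "num-triples-per-batch": 10000000 }

[server]
ACCESS_TOKEN = ${data:NAME}_832649627
"""


def load_qleverfile(text):
    """Parse Qleverfile text; ${KEY} and ${section:KEY} are resolved on access."""
    parser = configparser.ConfigParser(
        interpolation=configparser.ExtendedInterpolation()
    )
    parser.read_string(text)
    return parser


config = load_qleverfile(QLEVERFILE)

# json.loads raises json.JSONDecodeError if SETTINGS_JSON is malformed.
settings = json.loads(config["index"]["SETTINGS_JSON"])
input_files = config["index"]["INPUT_FILES"].split()

print("input files:", input_files)
print("cat command:", config["index"]["CAT_INPUT_FILES"])
print("triples per batch:", settings["num-triples-per-batch"])
```

If this runs without an exception, the settings are at least syntactically sound, which points the search toward the runtime environment rather than the configuration.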

@joka921
Member

joka921 commented Aug 28, 2024

  1. Do smaller datasets work for you when you use the provided Qleverfile without changes (best try olympics in a separate folder), or do they show the same issue?
  2. Can you run ls -al in the directory where you run the qlever index command and post the output?

@joka921
Member

joka921 commented Aug 28, 2024

Another idea:
What is your file system (a "normal" ext4 disk, some fancy network mount, or something else entirely)?
How did you install Docker, and do you have any special Docker configuration (security hardening etc.) that might lead to permission problems inside the container?

And can you also post the output of qlever index --show (this logs what qlever index does under the hood)?
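Most of the context requested in the comments above (architecture, operating system, user and group IDs, directory permissions) can be collected in one go. Here is a minimal sketch using only the Python standard library on a Unix-like system; `describe_environment` is a hypothetical helper name, not part of QLever or its control script. It does not detect the file system type or the Docker configuration, which still have to be reported separately.

```python
import os
import platform
import stat


def describe_environment(path="."):
    """Collect basic system and permission details for the directory `path`."""
    info = os.stat(path)
    return {
        "architecture": platform.machine(),       # e.g. x86_64 or aarch64
        "os": platform.platform(),                # kernel and distribution string
        "user": f"{os.getuid()}:{os.getgid()}",   # uid:gid of the current process
        "dir_owner": f"{info.st_uid}:{info.st_gid}",
        "dir_mode": stat.filemode(info.st_mode),  # same notation as `ls -l`
        "writable_by_user": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Run this in the directory where `qlever index` is executed.
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```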

@biegekekeke
Author

qlever index --show

When I use olympics, the same issue also occurs. These are the permissions of the directory where I run the command:

drwxrwxrwx  2 xxx xxx        4096 Aug  28 12:54 .
drwxrwxrwx 11 xxx xxx        4096 Aug  28 15:34 ..
-rwxrwxrwx  1 xxx xxx        1422 Aug  28 15:44 Qleverfile
-rwxrwxrwx  1 xxx xxx          38 Aug  28 15:14 .stxxl
-rwxrwxrwx  1 xxx xxx   867124662 Aug  28 10:26 wikidata-20231222-lexemes.nt.bz2
-rwxrwxrwx  1 xxx xxx 41030599679 Aug  28 10:29 wikidata-20231222-truthy.nt.bz2
-rwxrwxrwx  1 xxx xxx         830 Aug  28 15:14 wikidata.index-log.txt
-rwxrwxrwx  1 xxx xxx         317 Aug  28 15:14 wikidata.settings.json

The file system I am using is a "normal" ext4 disk. When I run qlever index --show, the output is as follows:

qlever index --show

Command: index

echo '{ "languages-internal": ["en"], "prefixes-external": [ "<http://www.wikidata.org/entity/statement", "<http://www.wikidata.org/value", "<http://www.wikidata.org/reference" ], "locale": { "language": "en", "country": "US", "ignore-punctuation": true }, "ascii-prefixes-only": false, "num-triples-per-batch": 10000000 }' > wikidata.settings.json
docker run --rm -u $(id -u):$(id -g) -v /etc/localtime:/etc/localtime:ro -v $(pwd):/index -w /index --init --entrypoint bash --name qlever.index.wikidata docker.io/adfreiburg/qlever:latest -c 'ulimit -Sn 1048576; bzcat wikidata-20231222-lexemes.nt.bz2 wikidata-20231222-truthy.nt.bz2 | IndexBuilderMain -F ttl -f - -i wikidata -s wikidata.settings.json --stxxl-memory 10g | tee wikidata.index-log.txt'

You called "qlever ... --show", therefore the command is only shown, but not executed (omit the "--show" to execute it)
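The command shown above raises the soft limit on open files with `ulimit -Sn 1048576` before it starts IndexBuilderMain. Nothing in this thread establishes that this limit is related to the error, but it is one setting that is cheap to check. The sketch below compares the requested value against the current hard limit; `soft_limit_fits` is a hypothetical helper name. The limits that matter are those inside the container, so a result from the host is only a rough indication.

```python
import resource

# Value passed to `ulimit -Sn` in the command shown above.
REQUESTED_SOFT_LIMIT = 1048576


def soft_limit_fits(requested, hard):
    """Return True if a soft limit of `requested` does not exceed `hard`."""
    return hard == resource.RLIM_INFINITY or requested <= hard


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open files: soft={soft}, hard={hard}")
    print("requested soft limit fits:",
          soft_limit_fits(REQUESTED_SOFT_LIMIT, hard))
```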

@hannahbast
Member

@biegekekeke Are you using Docker inside of WSL (Windows Subsystem for Linux)?

@biegekekeke
Author

> @biegekekeke Are you using Docker inside of WSL (Windows Subsystem for Linux)?

No

@github-staff github-staff deleted a comment Aug 28, 2024
@joka921
Member

joka921 commented Aug 28, 2024

Okay, this sounds like some debugging inside the Docker container is required to track down the concrete issue.
I won't have time for this in the coming week, but after that we can tackle it.
Are you proficient with GDB, Docker, etc.? If so, you could try debugging the call to IndexBuilderMain inside the container and send me a backtrace of the location where the error occurs, but this requires some computer science background.
