Double-Check Command: Unsupported integer size (0) #291

Open
bitahu opened this issue Nov 13, 2024 · 11 comments

@bitahu

bitahu commented Nov 13, 2024

What do you want to know?

RepeatMasker version: RepeatMasker-4.1.7-p1

When I execute the command:
~/RepeatMasker/famdb.py lineage -ad fusarium

it tells me:
Double-Check Command: Unsupported integer size (0)

I tested several species, such as maize, rice, and Homo sapiens, and got the same result.

Helpful context

  • Is there a particular genome assembly or organism your question is about? If possible, please provide a link to a publicly available assembly and/or a species name.

  • Have you installed RepBase RepeatMasker Edition for RepeatMasker?
    yes

@rmhubley
Member

rmhubley commented Dec 4, 2024

Hmm. I get:

./famdb.py lineage -ad fusarium
1 root(0) [9]
└─131567 cellular organisms(0) [0]
  └─2759 Eukaryota(0) [0]
    └─33154 Opisthokonta(0) [0]
      └─4751 Fungi(0) [0]
        └─451864 Dikarya(0) [0]
          └─4890 Ascomycota(0) [0]
            └─716545 saccharomyceta(0) [0]
              └─147538 Pezizomycotina(0) [0]
                └─716546 leotiomyceta(0) [0]
                  └─715989 sordariomyceta(0) [0]
                    └─147550 Sordariomycetes(0) [0]
                      └─222543 Hypocreomycetidae(0) [0]
                        └─5125 Hypocreales(0) [0]
                          └─110618 Nectriaceae(0) [0]
                            └─5506 Fusarium(0) [0]
                              ├─171631 Fusarium oxysporum species complex(0) [0]
                              │ └─5507 Fusarium oxysporum(0) [0]
                              └─232080 Fusarium solani species complex(0) [0]
                                └─984957 Fusarium haematococcum(0) [0]

Could you provide the output from the command "./famdb.py info"?

@wozixing

wozixing commented Dec 9, 2024

Hi rmhubley,
I get the same result as bitahu.
When I run ./famdb.py lineage -ad Cucurbitaceae, it gives this result:

1 root(0) [9]
└─131567 cellular organisms(0) [0]
 └─2759 Eukaryota(0) [0]
Double-Check Command: 'NoneType' object has no attribute 'keys'.

And when I run
python ./famdb.py -i /Libararies/famdb families -f embl --curated -a -d Cucurbitaceae > Cucurbitaceae.embl

it just tells me:
Double-Check Command: 'NoneType' object has no attribute 'keys'

Is there something wrong with my package or database?

@bitahu
Author

bitahu commented Dec 9, 2024

I think it's a version issue. You'd better use version RepeatMasker-4.1.6

@wozixing

wozixing commented Dec 9, 2024

Thank you for your quick reply. I changed the RepeatMasker version to 4.1.6, added the two RepBase data files to Libraries, and added the two files dfam38-1_full.0.h5 and dfam38-1_full.5.h5 to famdb, but the problem is still there.
After running configure, here is the output:

[Building FASTA version of RepeatMasker.lib ...........................................
Building RMBlast frozen libraries..
The program is installed with a the following repeat libraries:

FamDB Directory     : /gss1/home/lqr20200519/FXB/t2t/02.annotation/software/RepeatMasker/Libraries/famdb
FamDB Generator     : famdb.py v1.0
FamDB Format Version: 1.0
FamDB Creation Date : 2023-11-15 11:30:15.311827](url)

Database: Dfam
Version : 3.8
Date    : 2023-11-14

Dfam - A database of transposable element (TE) sequence alignments and HMMs.

2 Partitions Present
Total consensus sequences present: 485773
Total HMMs present               : 472219

When I run
./famdb.py lineage -a -d Cucurbitaceae
the following problem occurred:
Double-Check Command Expecting value: line 1 column 2 (char 1).

Looking forward to your reply and help!


@bitahu
Author

bitahu commented Dec 9, 2024

You can try deleting the data in famdb and uploading the dfam data package again. Since there will be some data modifications when creating the database, I'm not sure if this will work, but this is what I did.

@rmhubley
Member

rmhubley commented Dec 9, 2024

@wozixing Is that the complete listing of the info command? It's missing a large section that details each partition -- could you try and paste the entire result? Also, did you obtain the Dfam partition files from this site: https://www.dfam.org/releases/Dfam_3.8/families/FamDB/ and did you double check the size/md5 signatures on your downloads? It's possible the download was corrupt or incomplete.
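A minimal sketch of the size/md5 check suggested above, written in Python. The function names are hypothetical and are not part of RepeatMasker or famdb.py, and the expected checksum has to be whatever value is published alongside the download.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte partition files never sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file exists and its MD5 matches expected_md5.

    expected_md5 is assumed to be the published hex digest; surrounding
    whitespace and letter case are ignored.
    """
    path = Path(path)
    if not path.is_file():
        return False
    return md5_of_file(path) == expected_md5.strip().lower()
```

Calling `verify_download(path_to_partition_file, published_md5)` on each downloaded partition file before running configure should rule out a truncated or corrupt download.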

It's a bit odd that the lineage listing would work up until it needs to get data from partition 5 (Viridiplantae). It's also odd that your info command has this suffix '](url)' on the Creation Date:

FamDB Creation Date : 2023-11-15 11:30:15.311827](url)

And lastly, I wanted to point out that it appears that Repbase didn't get added to the FamDB files. Perhaps you untar'd the RepBase distribution inside the Libraries directory rather than inside the top-level folder of RepeatMasker. Check that the RepBase file 'RMRBSeqs.embl' is located in RepeatMasker/Libraries next to the RepeatMasker provided file 'RMRBMeta.embl'. The RepeatMasker configure script should change the FamDB 'Database' name and add additional descriptions to the FamDB files after a successful merger of RepBase. It should look like:

FamDB Directory     : /u3/local/RepeatMasker-4.1.7-p1-Dfam_3.8.1_RB/Libraries/famdb
FamDB Generator     : famdb.py v1.0
FamDB Format Version: 1.0
FamDB Creation Date : 2023-11-15 11:30:15.311827

Database: Dfam withRBRM
Version : 3.8
Date    : 2023-11-14

Dfam - A database of transposable element (TE) sequence alignments and HMMs.
RBRM - RepBase RepeatMasker Edition - version 20181026

Hopefully with a bit more sleuthing we can figure out what is going wrong with your installation.
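The file-location check described above can be sketched in Python as follows. The function name is hypothetical, and the "nested copy" heuristic is an assumption: it guesses that untarring RepBase inside Libraries would leave a second Libraries folder there. None of this ships with RepeatMasker.

```python
from pathlib import Path


def check_repbase_layout(repeatmasker_dir):
    """Report where the RepBase files sit relative to the RepeatMasker folder.

    Returns a dict of booleans:
      RMRBMeta.embl - the RepeatMasker-provided file is in Libraries/
      RMRBSeqs.embl - the RepBase file is next to it in Libraries/
      nested_copy   - RMRBSeqs.embl was found in Libraries/Libraries/,
                      which hints that the tarball was unpacked one level
                      too deep (assumed symptom, not confirmed)
    """
    libraries = Path(repeatmasker_dir) / "Libraries"
    return {
        "RMRBMeta.embl": (libraries / "RMRBMeta.embl").is_file(),
        "RMRBSeqs.embl": (libraries / "RMRBSeqs.embl").is_file(),
        "nested_copy": (libraries / "Libraries" / "RMRBSeqs.embl").is_file(),
    }
```

If `RMRBSeqs.embl` is reported missing while `nested_copy` is true, moving the file up one level and rerunning configure would be the first thing to try.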

@rmhubley
Member

rmhubley commented Dec 9, 2024

@bitahu, it shouldn't be a version issue. It works with both 4.1.6/4.1.7-p1 on my end.

@wozixing

@rmhubley
Thank you very much for your reply. After checking various files and reconfiguring, we solved the above problems. The output results are as follows:

FamDB Directory     : /gss1/home/lqr20200519/FXB/t2t/02.annotation/software/RepeatMasker/Libraries/famdb
FamDB Generator     : famdb.py v1.0
FamDB Format Version: 1.0
FamDB Creation Date : 2023-11-15 11:30:15.311827

Database: Dfam withRBRM
Version : 3.8
Date    : 2023-11-14

Dfam - A database of transposable element (TE) sequence alignments and HMMs.
RBRM - RepBase RepeatMasker Edition - version 20181026

2 Partitions Present
Total consensus sequences present: 498368
Total HMMs present               : 472219


Partition Details
-----------------
 Partition 0 [dfam38_full.0.h5]: root - Mammalia, Amoebozoa, Bacteria <bacteria>, Choanoflagellata, Rhodophyta, Haptista, Metamonada, Fungi, Sar, Placozoa, Ctenophora <comb jellies>, Filasterea, Spiralia, Discoba, Cnidaria, Porifera, Viruses
     Consensi: 308186, HMMs: 295552

 Partition 1 [ Absent ]: Obtectomera 

 Partition 2 [ Absent ]: Euteleosteomorpha 

 Partition 3 [ Absent ]: Sarcopterygii - Sauropsida, Coelacanthimorpha, Amphibia, Dipnomorpha

 Partition 4 [ Absent ]: Diptera 

 Partition 5 [dfam38_full.5.h5]: Viridiplantae 
     Consensi: 190182, HMMs: 176667

 Partition 6 [ Absent ]: Deuterostomia - Chondrichthyes, Hemichordata, Cladistia, Holostei, Tunicata, Cephalochordata, Cyclostomata <vertebrates>, Osteoglossocephala, Otomorpha, Elopocephalai, Echinodermata, Chondrostei

 Partition 7 [ Absent ]: Hymenoptera 

 Partition 8 [ Absent ]: Ecdysozoa - Nematoda, Gelechioidea, Yponomeutoidea, Incurvarioidea, Chelicerata, Collembola, Polyneoptera, Tineoidea, Apoditrysia, Monocondylia, Strepsiptera, Palaeoptera, Neuropterida, Crustacea, Coleoptera, Siphonaptera, Trichoptera, Paraneoptera, Myriapoda, Scalidophora
1 root(0) [9]
└─131567 cellular organisms(0) [0]
  └─2759 Eukaryota(0) [0]
    └─33090 Viridiplantae(5) [2]
      └─35493 Streptophyta(5) [0]
        └─131221 Streptophytina(5) [0]
          └─3193 Embryophyta(5) [25]
            └─58023 Tracheophyta(5) [0]
              └─78536 Euphyllophyta(5) [0]
                └─58024 Spermatophyta(5) [0]
                  └─3398 Magnoliopsida(5) [0]
                    └─1437183 Mesangiospermae(5) [0]
                      └─71240 eudicotyledons(5) [0]
                        └─91827 Gunneridae(5) [0]
                          └─1437201 Pentapetalae(5) [0]
                            └─71275 rosids(5) [0]
                              └─91835 fabids(5) [0]
                                └─71239 Cucurbitales(5) [0]
                                  └─3650 Cucurbitaceae(5) [0]
                                    ├─1003877 Benincaseae(5) [0]
                                    │ ├─3653 Citrullus(5) [0]
                                    │ │ └─3654 Citrullus lanatus(5) [1804]
                                    │ └─3655 Cucumis(5) [0]
                                    │   ├─3656 Cucumis melo(5) [1973]
                                    │   └─3659 Cucumis sativus(5) [1063]
                                    └─1003878 Cucurbiteae(5) [0]
                                      └─3660 Cucurbita(5) [0]
                                        ├─3661 Cucurbita maxima(5) [1]
                                        └─3663 Cucurbita pepo(5) [1]

Thanks again for your kind help!

@rmhubley
Member

I wonder if you learned anything that might help us or others with this problem? Was there a particular installation process that led to this error? Do you know how to reproduce it?

@wozixing

wozixing commented Dec 11, 2024

@rmhubley
Following your suggestion, we checked the integrity of the Dfam database files and found no problem. However, we did notice that after we added RepBase to the Dfam database, the words "Dfam withRBRM" were not displayed when the configuration completed. We then deleted all the Dfam and RepBase database files, uploaded them again, and untar'd the RepBase tar file in the RepeatMasker directory; after reconfiguring, the above problem was solved. I speculate that the two database files were modified during the earlier incorrect configuration attempts (for example, configuring without downloading the root file dfam38-1_full.0.h5, or without deleting min_init.0.h5), which prevented the Dfam and RepBase databases from merging normally.
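For anyone checking for the same symptom, here is a rough Python sketch that reads the text printed by ./famdb.py info and reports whether the Database line contains "withRBRM" and which partitions are present. It assumes the output looks like the listings pasted in this thread. The function name and the regular expressions are illustrative only and are not part of famdb.py.

```python
import re

# Matches lines such as " Partition 5 [dfam38_full.5.h5]: Viridiplantae"
# and " Partition 1 [ Absent ]: Obtectomera" (format assumed from this thread).
PARTITION_RE = re.compile(r"^\s*Partition\s+(\d+)\s+\[\s*(.*?)\s*\]:", re.MULTILINE)
# Matches the line "Database: Dfam withRBRM".
DATABASE_RE = re.compile(r"^Database:\s*(.+?)\s*$", re.MULTILINE)


def summarize_info(info_text):
    """Summarize the captured output of the famdb.py info command."""
    match = DATABASE_RE.search(info_text)
    database = match.group(1) if match else None
    present, absent = [], []
    for number, label in PARTITION_RE.findall(info_text):
        # A partition whose bracket reads "Absent" has no file installed.
        (absent if label == "Absent" else present).append(int(number))
    return {
        "database": database,
        "repbase_merged": bool(database and "withRBRM" in database),
        "present": present,
        "absent": absent,
    }
```

The text can be captured with something like `subprocess.run(["./famdb.py", "info"], capture_output=True, text=True).stdout`. A result with `repbase_merged` false after configuring with RepBase, or with partition 5 absent when working on plants, would match the situation described in this thread.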

@rmhubley
Member

Thank you, that is helpful, and sorry that you had so much trouble getting this installed. In the next release we have added an edit history and incomplete edit checks to FamDB, which should detect problems like this. Also, @bitahu, should you still be having this problem, please let us know. We wrote a script to validate the FamDB files this last week which might help us identify problems with your installation.
