Covalent bond #16

Open
kanghw0325 opened this issue Nov 19, 2024 · 4 comments

Comments

@kanghw0325

Hi, congratulations on the impressive work. Boltz-1 is huge.

I'd like to ask about specifying a covalent bond between a ligand and a residue.

Here is my YAML file:

version: 1 # Optional, defaults to 1

sequences:
  - protein:
      id: A
      sequence: GREDAELLVTVRGGRLRGIRLKTPGGPVSAFLGIPFAEPPMGPRRFLPPEPKQPWSGVVDATTFQSVCYQYVDTLYPGFEGTEMWNPNRELSEDCLYLNVWTPYPRPTSPTPVLVWIYGGGFYSGASSLDVYDGRFLVQAERTVLVSMNYRVGAFGFLALPGSREAPGNVGLLDQRLALQWVQENVAAFGGDPTSVTLFGESAGAASVGMHLLSPPSRGLFHRAVLQSGAPNGPWATVGMGEARRRATQLAHLVGCPPGGTGGNDTELVACLRTRPAQVLVNHEWHVLPQESVFRFSFVPVVDGDFLSDTPEALINAGDFHGLQVLVGVVKDEGSYFLVYGAPGFSKDNESLISRAEFLAGVRVGVPQVSDLAAEAVVLHYTDWLHPEDPARLREALSDVVGDHNVVCPVAQLAGRLAAQGARVYAYVFEHRASTLSWPLWMGVPHGYEIEFIFGIPLDPSRNYTAEEKIFAQRLMRYWANFARTGDPNEPRDPKAPQWPPYTAGAQQYVSLDLRPLEVRRGLRAQACAFWNRFLPKLLSAT
      msa: /content/6CQT/6CQT.a3m
  - ligand:
      id: B
      smiles: CC(O)NC1COC(COC2OC(C)C(O)C(O)C2O)C(OC2OC(CO)C(O)C(O)C2NC(C)O)C1O
constraints:
    - bond:
        atom1: [A, 349, ND2]
        atom2: [B, 1, C4]

With this file I got the error message below.
Is anything wrong in my YAML file?

Downloading data and model to /content/weights. You may change this by setting the --cache flag.
Checking input data.
Processing input data.
  0% 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/bin/boltz", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/boltz/main.py", line 346, in predict
    processed = process_inputs(data, out_dir, ccd)
  File "/usr/local/lib/python3.10/dist-packages/boltz/main.py", line 179, in process_inputs
    target = parse_yaml(path, ccd)
  File "/usr/local/lib/python3.10/dist-packages/boltz/data/parse/yaml.py", line 57, in parse_yaml
    return parse_boltz_schema(name, data, ccd)
  File "/usr/local/lib/python3.10/dist-packages/boltz/data/parse/schema.py", line 736, in parse_boltz_schema
    c1, r1, a1 = atom_idx_map[tuple(constraint["bond"]["atom1"])]
KeyError: ('A', 349, 'ND2')
@jwohlwend
Owner

Hi, thanks for giving Boltz a try! We do not support covalent bonds with SMILES ligands at this time, because we cannot guarantee that our atom naming scheme will match the user's expectation (i.e., what we name C4 might not be the atom you want). For now, this requires using a CCD code for your ligand. Do you happen to know if your ligand is in the CCD dictionary? If not, we can try to brainstorm a good way to achieve this!
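
For anyone hitting the same `KeyError`, a rough pre-flight check along the following lines may help before running a prediction. It is only a sketch and not part of Boltz: `check_bond_constraints`, `chain_ids`, and `UNIQUE_ATOMS` are hypothetical names, and it assumes the YAML has already been loaded into a plain dict and that residue indices in constraints are 1-based positions in the sequence as written in the file (which can differ from PDB numbering). It flags bond atoms that point at a SMILES ligand, and protein atoms such as `ND2` or `SG` whose residue letter does not match.

```python
# Hypothetical pre-flight check for an already-parsed config dict; not part of Boltz.

# Side-chain atom names found in only one standard amino acid (one-letter code):
# ND2 occurs only in asparagine, SG only in cysteine.
UNIQUE_ATOMS = {"ND2": "N", "SG": "C"}


def chain_ids(entity):
    """Return an entity's chain ids as a list (`id` may be a scalar or a list)."""
    ids = entity["id"]
    return list(ids) if isinstance(ids, (list, tuple)) else [ids]


def check_bond_constraints(config):
    """Return human-readable warnings for the bond constraints in `config`."""
    # Map each chain id to (entity kind, entity body), e.g. "A" -> ("protein", {...}).
    chains = {}
    for item in config.get("sequences", []):
        for kind, body in item.items():
            for cid in chain_ids(body):
                chains[cid] = (kind, body)

    warnings = []
    for constraint in config.get("constraints") or []:
        bond = constraint.get("bond")
        if bond is None:
            continue
        for key in ("atom1", "atom2"):
            cid, res_idx, atom = bond[key]
            if cid not in chains:
                warnings.append(f"{key}: unknown chain {cid!r}")
                continue
            kind, body = chains[cid]
            if kind == "ligand" and "smiles" in body:
                warnings.append(
                    f"{key}: chain {cid!r} is a SMILES ligand, "
                    f"so atom name {atom!r} may not exist"
                )
            elif kind == "protein":
                seq = body["sequence"]
                if not 1 <= res_idx <= len(seq):
                    warnings.append(
                        f"{key}: residue {res_idx} is outside the sequence "
                        f"(length {len(seq)})"
                    )
                    continue
                # Assumption: res_idx is a 1-based position in the YAML sequence.
                letter = seq[res_idx - 1]
                expected = UNIQUE_ATOMS.get(atom)
                if expected is not None and letter != expected:
                    warnings.append(
                        f"{key}: residue {res_idx} is {letter!r}, "
                        f"but atom {atom!r} belongs to {expected!r}"
                    )
    return warnings


# Toy config for illustration only (not the sequences from this thread).
example = {
    "sequences": [
        {"protein": {"id": "A", "sequence": "GNC"}},
        {"ligand": {"id": "B", "smiles": "CCO"}},
    ],
    "constraints": [{"bond": {"atom1": ["A", 1, "ND2"], "atom2": ["B", 1, "C4"]}}],
}
for warning in check_bond_constraints(example):
    print(warning)
```

The check is deliberately conservative: it only knows two atom names and does not look up CCD components, so a clean result does not guarantee that the parser will accept the file.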

@kanghw0325
Author

Thank you for the fast response!
I will try with a CCD code and let you know the result.
Thank you!

@xiongzhp

xiongzhp commented Nov 22, 2024

With the following input, the ligand ends up bonded to a cysteine residue at an undesired site, seemingly at random.

sequences:
    - protein:
        id: [A] 
        sequence: SMKQTGYLTIGGQRYQAEINDLENLGEMGSGTCGQVWKMRFRKTGHVIAVKQMRRSGNKEENKRILMDLDVVLKSHDCPYIVQCFGTFITNTDVFIAMELMGTCAEKLKKRMQGPIPERILGKMTVAIVKALYYLKEKHGVIHRDVKPSNILLDERGQIKLCDFGISGRLVDSKAKTRSAGCAAYMAPERIDPPDPTKPDYDIRADVWSLGISLVELATGQFPYKNCKTDFEVLTKVLQEEPPLLPGHMGFSGDFQSFVKDCLTKDHRKRPKYNKLLEHSFIKRYETLEVDVASWFKDVMAKTESPR    # only for protein, dna, rna
        # ccd: CCD              # only for ligand, exclusive with smiles
        msa: empty #msa: MSA_PATH         # only for protein
        # modifications:
        #   - position: RES_IDX   # index of residue, starting from 1
        #     ccd: CCD            # CCD code of the modified residue
        
    - ligand:
        id: [K]    # multiple ids in case of multiple identical entities
        ccd: 8E8

constraints:
    - bond:
        atom1: [A, 104, SG]
        atom2: [K, 1, CAA]

If I use an MSA, it works correctly. The misplacement seems to be because the constraints are soft. Is it possible to add hard constraints?
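
One way to check which cysteine a prediction actually attached the ligand to is to measure the distance from the ligand atom to every cysteine SG atom in the predicted structure. The sketch below assumes the coordinates have already been extracted with a structure parser of your choice; `nearest_sulfur`, the 2.0 Å cutoff, and the coordinates are all made up for illustration (a C-S single bond is roughly 1.8 Å).

```python
import math


def nearest_sulfur(ligand_atom_xyz, sg_coords):
    """Return (residue_index, distance) for the SG atom closest to the ligand atom.

    `sg_coords` maps a residue index to the (x, y, z) position of its SG atom.
    """
    return min(
        ((idx, math.dist(ligand_atom_xyz, xyz)) for idx, xyz in sg_coords.items()),
        key=lambda pair: pair[1],
    )


# Made-up coordinates in angstroms, for illustration only.
sg_coords = {33: (12.0, 4.0, -3.0), 104: (1.8, 0.0, 0.0)}
caa_xyz = (0.0, 0.0, 0.0)

residue, distance = nearest_sulfur(caa_xyz, sg_coords)
bonded = distance < 2.0  # assumed cutoff, slightly above a typical C-S bond length
print(residue, round(distance, 2), bonded)
```

Comparing the reported residue with the one named in the constraint shows whether the soft constraint was honoured in a given sample.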
