
RuntimeError: The size of tensor a (18294) must match the size of tensor b (18293) at non-singleton dimension 0 #110

Open
gcassone-cnr opened this issue Oct 25, 2024 · 2 comments


gcassone-cnr commented Oct 25, 2024

Dear Developers,

I'm a new Allegro user. I'm trying to run the simple input shown below:

```yaml
# general
root: results/water-tutorial
run_name: water
seed: 42
dataset_seed: 42
append: true
default_dtype: float32

# -- network --
model_builders:
 - allegro.model.Allegro
 # the typical model builders from `nequip` can still be used:
 - PerSpeciesRescale
 - ForceOutput
 - RescaleEnergyEtc

# cutoffs
r_max: 4.5
avg_num_neighbors: auto

# radial basis
BesselBasis_trainable: true
PolynomialCutoff_p: 48

# symmetry
l_max: 2
parity: o3_full

# Allegro layers:
num_layers: 2
env_embed_multiplicity: 8
embed_initial_edge: true

two_body_latent_mlp_latent_dimensions: [32, 64, 128]
two_body_latent_mlp_nonlinearity: silu
two_body_latent_mlp_initialization: uniform

latent_mlp_latent_dimensions: [128]
latent_mlp_nonlinearity: silu
latent_mlp_initialization: uniform
latent_resnet: true

env_embed_mlp_latent_dimensions: []
env_embed_mlp_nonlinearity: null
env_embed_mlp_initialization: uniform

# - end allegro layers -

# Final MLP to go from Allegro latent space to edge energies:
edge_eng_mlp_latent_dimensions: [32]
edge_eng_mlp_nonlinearity: null
edge_eng_mlp_initialization: uniform

include_keys:
  - user_label
key_mapping:
  user_label: label0

# -- data --
dataset: ase
dataset_file_name: /content/cp2k/colab/AIMD_data/conc_wat_pos_frc.extxyz   # path to data set file
ase_args:
  format: extxyz

# A mapping of chemical species to type indexes is necessary if the dataset
# is provided with atomic numbers instead of type indexes.
chemical_symbols:
  - H
  - O

# logging
wandb: false
#wandb_project: allegro-water-tutorial
verbose: info
log_batch_freq: 10

# training
n_train: 1000
n_val: 100
batch_size: 5
max_epochs: 100
learning_rate: 0.002
train_val_split: random
shuffle: true
metrics_key: validation_loss

# use an exponential moving average of the weights
use_ema: true
ema_decay: 0.99
ema_use_num_updates: true

# loss function
loss_coeffs:
  forces: 1.
  total_energy:
    - 1.
    - PerAtomMSELoss

# optimizer
optimizer_name: Adam
optimizer_params:
  amsgrad: false
  betas: !!python/tuple
  - 0.9
  - 0.999
  eps: 1.0e-08
  weight_decay: 0.

metrics_components:
  - - forces          # key
    - mae             # "rmse" or "mae"
  - - forces
    - rmse
  - - total_energy
    - mae
  - - total_energy
    - mae
    - PerAtom: True   # if true, energy is normalized by the number of atoms

# lr scheduler, drop lr if no improvement for 50 epochs
lr_scheduler_name: ReduceLROnPlateau
lr_scheduler_patience: 50
lr_scheduler_factor: 0.5

early_stopping_lower_bounds:
  LR: 1.0e-5

early_stopping_patiences:
  validation_loss: 100
```
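As an aside, a quick sanity check before training is to confirm that the extxyz file parses cleanly, holds at least `n_train + n_val` frames, and has a consistent atom count per frame. Here is a minimal sketch in plain Python; the tiny in-memory two-frame file below stands in for the real dataset path from the config, which I can't read here:

```python
# Parse an extxyz stream frame by frame: each frame is an atom-count line,
# a comment/metadata line, then that many per-atom lines.
import io

def count_frames(handle):
    """Return (n_frames, atom_counts) for an extxyz stream."""
    frames, atom_counts = 0, []
    line = handle.readline()
    while line:
        natoms = int(line)        # first line of each frame: atom count
        handle.readline()         # comment/metadata line
        for _ in range(natoms):   # skip the per-atom lines
            handle.readline()
        frames += 1
        atom_counts.append(natoms)
        line = handle.readline()
    return frames, atom_counts

# Two-frame toy extxyz (one water molecule per frame); a real run would
# open the dataset_file_name from the config instead.
toy = io.StringIO(
    "3\nLattice=\"10 0 0 0 10 0 0 0 10\"\n"
    "O 0.0 0.0 0.0\nH 0.96 0.0 0.0\nH -0.24 0.93 0.0\n"
    "3\nLattice=\"10 0 0 0 10 0 0 0 10\"\n"
    "O 0.0 0.0 0.1\nH 0.96 0.0 0.1\nH -0.24 0.93 0.1\n"
)
n_frames, counts = count_frames(toy)
print(n_frames, counts)  # → 2 [3, 3]
assert len(set(counts)) == 1, "frames have inconsistent atom counts"
```

On the real file one would also check `n_frames >= n_train + n_val` (1000 + 100 in the config above).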

but at the 10th epoch I get the following error:

```
Traceback (most recent call last):
  File "/home/user/anaconda3/bin/nequip-train", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/scripts/train.py", line 115, in main
    trainer.train()
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/train/trainer.py", line 784, in train
    self.epoch_step()
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/train/trainer.py", line 919, in epoch_step
    self.batch_step(
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/train/trainer.py", line 814, in batch_step
    out = self.model(data_for_loss)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_graph_model.py", line 112, in forward
    data = self.model(new_data)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_rescale.py", line 144, in forward
    data = self.model(data)
           ^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_grad_output.py", line 85, in forward
    data = self.func(data)
           ^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/nequip/nn/_graph_mixin.py", line 366, in forward
    input = module(input)
            ^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/anaconda3/lib/python3.12/site-packages/allegro/nn/_allegro.py", line 612, in forward
    new_latents = cutoff_coeffs[active_edges].unsqueeze(-1) * new_latents
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
RuntimeError: The size of tensor a (18294) must match the size of tensor b (18293) at non-singleton dimension 0
```
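For context, the RuntimeError is PyTorch's broadcasting check failing: in an element-wise multiply, the two operands are aligned from the trailing dimension, and each pair of sizes must either match or contain a 1. Here the operands disagree on the number of edges (18294 vs. 18293) at dimension 0, which is non-singleton, so the multiply cannot broadcast. A pure-Python sketch of that rule (the latent width of 128 is my assumption, for illustration only):

```python
# Minimal reimplementation of the broadcasting-compatibility rule that
# PyTorch applies before an element-wise op: aligned from the trailing
# dimension, each pair of sizes must be equal or contain a 1.
def broadcastable(shape_a, shape_b):
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

# Shapes from the traceback: cutoff_coeffs[active_edges].unsqueeze(-1)
# against new_latents.
print(broadcastable((18294, 1), (18293, 128)))  # → False: 18294 != 18293
print(broadcastable((18293, 1), (18293, 128)))  # → True once edge counts agree
```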

Can you please suggest what's wrong with my installation and how to fix this issue?

Many thanks in advance and best wishes,
Giuseppe Cassone

@gcassone-cnr (Author)

Dear developers, is this forum still active?

cw-tan (Collaborator) commented Nov 22, 2024

Hi Giuseppe,

Related to my answer here -- #111

If it's not too much trouble, I would advise using the new code, which is in the develop branches of both nequip and allegro.
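A sketch of how one might install those branches with pip, assuming the repositories are `mir-group/nequip` and `mir-group/allegro` on GitHub and that both have a branch named `develop` (check the repos before running):

```shell
# Install the development versions of nequip and allegro directly from
# their GitHub develop branches (repo locations/branch names assumed).
pip install "git+https://github.com/mir-group/nequip@develop"
pip install "git+https://github.com/mir-group/allegro@develop"
```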
