Do autoencode-only with latent layer of 3 neurons #55

Open
richelbilderbeek opened this issue May 20, 2022 · 6 comments
@richelbilderbeek
Contributor

No description provided.

@richelbilderbeek
Contributor Author

richel@N141CU:~$ python3 ~/.local/share/GenoCAE/run_gcae.py train --datadir /home/richel/GitHubs/gcaer/inst/extdata/ --data gcae_input_files_1 --model_id M1_3n --resume_from 0 --epochs 1 --save_interval 1 --train_opts_id ex3 --data_opts_id b_0_4 --trainedmodeldir /home/richel/.cache/gcaer/file1dc156c34037/
2022-05-20 13:34:45.815151: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-05-20 13:34:45.837919: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-05-20 13:34:45.837937: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (N141CU): /proc/driver/nvidia/version does not exist
2022-05-20 13:34:45.838170: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-05-20 13:34:45.843953: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2099940000 Hz
2022-05-20 13:34:45.844385: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc088000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-05-20 13:34:45.844425: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
tensorflow version 2.2.0

______________________________ arguments ______________________________
train : True
datadir : /home/richel/GitHubs/gcaer/inst/extdata/
data : gcae_input_files_1
model_id : M1_3n
train_opts_id : ex3
data_opts_id : b_0_4
save_interval : 1
epochs : 1
resume_from : 0
trainedmodeldir : /home/richel/.cache/gcaer/file1dc156c34037/
pheno_model_id : None
project : False
superpops : None
epoch : None
pdata : None
trainedmodelname : None
plot : False
animate : False
evaluate : False
metrics : None

______________________________ data opts ______________________________
sparsifies : [0.0, 0.1, 0.2, 0.3, 0.4]
norm_opts : {'flip': False, 'missing_val': -1.0}
norm_mode : genotypewise01
impute_missing : True
validation_split : 0.2

______________________________ train opts ______________________________
learning_rate : 0.00032
batch_size : 10
noise_std : 0.0032
n_samples : -1
loss : {'module': 'tf.keras.losses', 'class': 'CategoricalCrossentropy', 'args': {'from_logits': False}}
regularizer : {'reg_factor': 1e-07, 'module': 'tf.keras.regularizers', 'class': 'l2'}
lr_scheme : {'module': 'tf.keras.optimizers.schedules', 'class': 'ExponentialDecay', 'args': {'decay_rate': 0.96, 'decay_steps': 100, 'staircase': False}}
______________________________
Imputing originally missing genotypes to most common value.
Reading ind pop list from /home/richel/GitHubs/gcaer/inst/extdata/gcae_input_files_1.fam
Reading ind pop list from /home/richel/GitHubs/gcaer/inst/extdata/gcae_input_files_1.fam
Mapping files: 100%|███████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 241.18it/s]
Using learning rate schedule tf.keras.optimizers.schedules.ExponentialDecay with {'decay_rate': 0.96, 'decay_steps': 100, 'staircase': False}

______________________________ Data ______________________________
N unique train samples: 800
--- training on : 800
N valid samples: 200
N markers: 1


______________________________ Building model ______________________________
Traceback (most recent call last):
  File "/home/richel/.local/share/GenoCAE/run_gcae.py", line 1619, in <module>
    main()
  File "/home/richel/.local/share/GenoCAE/run_gcae.py", line 1004, in main
    autoencoder = Autoencoder(model_architecture, n_markers, noise_std, regularizer)
  File "/home/richel/.local/share/GenoCAE/run_gcae.py", line 87, in __init__
    layer_module = getattr(eval(first_layer_def["module"]), first_layer_def["class"])
TypeError: eval() arg 1 must be a string, bytes or code object
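The traceback above comes from calling `eval()` on a layer definition whose `"module"` entry is not a string. The following is a minimal sketch of that failure and of a check that reports it more clearly. The modified model file is not shown in this issue, so `bad_layer_def` is a hypothetical stand-in, `resolve_layer_class` is an illustrative helper that is not part of GenoCAE, and `collections.OrderedDict` replaces a TensorFlow layer class so the example runs without TensorFlow.

```python
import collections  # stand-in module; GenoCAE model files name TensorFlow modules instead


def resolve_layer_class(layer_def):
    """Look up a layer class, rejecting a non-string 'module' entry early."""
    module_name = layer_def.get("module")
    if not isinstance(module_name, str):
        raise ValueError(
            f"layer definition needs a string 'module', got {module_name!r}"
        )
    # Same lookup pattern as the line shown in the traceback.
    return getattr(eval(module_name), layer_def["class"])


# Hypothetical layer definitions; the real ones live in the model JSON file.
good_layer_def = {"module": "collections", "class": "OrderedDict"}
bad_layer_def = {"module": None, "class": "OrderedDict"}

print(resolve_layer_class(good_layer_def))

try:
    eval(bad_layer_def["module"])  # what the unchecked lookup does
except TypeError as err:
    print("unchecked lookup fails with TypeError:", err)

try:
    resolve_layer_class(bad_layer_def)
except ValueError as err:
    print("checked lookup fails with ValueError:", err)
```

A check like this would point at the offending layer definition rather than at `eval()`, which may make a layout difference in the model file easier to spot.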

@richelbilderbeek
Contributor Author

The modified file also has a different layout:

Screenshot from 2022-05-20 13-38-22

@richelbilderbeek
Contributor Author

Use the same layout:

Screenshot from 2022-05-20 13-39-11

@richelbilderbeek
Contributor Author

richel@N141CU:~/.cache/gcaer/file1dc115a5b4c8/ae.M1_3n.ex3.b_0_4.gcae_input_files_1$ 'python3' ~/.local/share/GenoCAE/run_gcae.py project --datadir /home/richel/GitHubs/gcaer/inst/extdata/ --data gcae_input_files_1 --model_id M1_3n --train_opts_id ex3 --data_opts_id b_0_4 --trainedmodeldir /home/richel/.cache/gcaer/file1dc13a3ab996/
2022-05-20 14:27:50.120315: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-05-20 14:27:50.146672: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-05-20 14:27:50.146739: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (N141CU): /proc/driver/nvidia/version does not exist
2022-05-20 14:27:50.147310: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-05-20 14:27:50.173128: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2099940000 Hz
2022-05-20 14:27:50.174602: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa588000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-05-20 14:27:50.174665: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
tensorflow version 2.2.0

______________________________ arguments ______________________________
train : False
datadir : /home/richel/GitHubs/gcaer/inst/extdata/
data : gcae_input_files_1
model_id : M1_3n
train_opts_id : ex3
data_opts_id : b_0_4
save_interval : None
epochs : None
resume_from : None
trainedmodeldir : /home/richel/.cache/gcaer/file1dc13a3ab996/
pheno_model_id : None
project : True
superpops : None
epoch : None
pdata : None
trainedmodelname : None
plot : False
animate : False
evaluate : False
metrics : None

______________________________ data opts ______________________________
sparsifies : [0.0, 0.1, 0.2, 0.3, 0.4]
norm_opts : {'flip': False, 'missing_val': -1.0}
norm_mode : genotypewise01
impute_missing : True
validation_split : 0.2

______________________________ train opts ______________________________
learning_rate : 0.00032
batch_size : 10
noise_std : 0.0032
n_samples : -1
loss : {'module': 'tf.keras.losses', 'class': 'CategoricalCrossentropy', 'args': {'from_logits': False}}
regularizer : {'reg_factor': 1e-07, 'module': 'tf.keras.regularizers', 'class': 'l2'}
lr_scheme : {'module': 'tf.keras.optimizers.schedules', 'class': 'ExponentialDecay', 'args': {'decay_rate': 0.96, 'decay_steps': 100, 'staircase': False}}
______________________________
Imputing originally missing genotypes to most common value.
Reading ind pop list from /home/richel/GitHubs/gcaer/inst/extdata/gcae_input_files_1.fam
Reading ind pop list from /home/richel/GitHubs/gcaer/inst/extdata/gcae_input_files_1.fam
Mapping files: 100%|███████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 219.27it/s]
Projecting epochs: [1, 2]
Already projected: []
In DG.get_train_set: number of -1.0 genotypes in train: 0
In DG.get_train_set: number of -9 genotypes in train: 0
In DG.get_train_set: number of 0 values in train mask: 0
Replacing dataset ind_pop_list_train in /home/richel/.cache/gcaer/file1dc13a3ab996/ae.M1_3n.ex3.b_0_4.gcae_input_files_1/gcae_input_files_1/encoded_data.h5

______________________________ Building model ______________________________
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'strides': 1}
Adding layer: BatchNormalization: {}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d  filters: 8 kernel_size: 5
--- batch normalization
--- conv1d  filters: 8 kernel_size: 5
--- batch normalization
Adding layer: MaxPooling1D: {'pool_size': 5, 'strides': 2, 'padding': 'same'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Flatten: {}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dense: {'units': 3, 'name': 'encoded'}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 75, 'activation': 'elu'}
Adding layer: Dropout: {'rate': 0.01}
Adding layer: Dense: {'units': 8}
Adding layer: Reshape: {'target_shape': (1, 8), 'name': 'i_msvar'}
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu'}
Adding layer: BatchNormalization: {}
Adding layer: Reshape: {'target_shape': (1, 1, 8)}
Adding layer: UpSampling2D: {'size': (2, 1)}
Adding layer: Reshape: {'target_shape': (2, 8)}
Adding layer: ResidualBlock2: {'filters': 8, 'kernel_size': 5}
--- conv1d  filters: 8 kernel_size: 5
--- batch normalization
--- conv1d  filters: 8 kernel_size: 5
--- batch normalization
Adding layer: Conv1D: {'filters': 8, 'kernel_size': 5, 'padding': 'same', 'activation': 'elu', 'name': 'nms'}
Adding layer: BatchNormalization: {}
Adding layer: Conv1D: {'filters': 1, 'kernel_size': 1, 'padding': 'same'}
Adding layer: Flatten: {'name': 'logits'}
########################### epoch 1 ###########################
Reading weights from /home/richel/.cache/gcaer/file1dc13a3ab996/ae.M1_3n.ex3.b_0_4.gcae_input_files_1/weights/1
Traceback (most recent call last):
  File "/home/richel/.local/share/GenoCAE/run_gcae.py", line 1619, in <module>
    main()
  File "/home/richel/.local/share/GenoCAE/run_gcae.py", line 1287, in main
    encoded_train = np.concatenate((encoded_train, encoded_train_batch), axis=0)
  File "<__array_function__ internals>", line 5, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 2
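The `ValueError` above is raised by `np.concatenate` when the arrays being stacked along axis 0 differ in their number of columns. The following is a minimal sketch that reproduces the same shape mismatch (3 columns against 2) with made-up arrays. The log does not show why GenoCAE ends up with one 2-column array, so this only illustrates the NumPy behaviour; `append_batch` is a hypothetical helper, not GenoCAE code.

```python
import numpy as np


def append_batch(accumulated, batch):
    """Append a batch of encoded samples after checking the latent width."""
    if accumulated.shape[1] != batch.shape[1]:
        raise ValueError(
            f"latent width mismatch: accumulated has {accumulated.shape[1]} "
            f"columns, batch has {batch.shape[1]}"
        )
    return np.concatenate((accumulated, batch), axis=0)


# Made-up data: an accumulator sized for 3 latent dimensions.
encoded_train = np.empty((0, 3))
batch_ok = np.zeros((10, 3))
batch_bad = np.zeros((10, 2))

encoded_train = append_batch(encoded_train, batch_ok)
print(encoded_train.shape)

try:
    # Unchecked concatenation, as in the traceback.
    np.concatenate((encoded_train, batch_bad), axis=0)
except ValueError as err:
    print("numpy refused:", err)
```

The error message reports that index 0 has 3 columns and index 1 has 2, so one side of the concatenation appears to assume a 2-dimensional latent space while the other follows the model's 3-neuron `encoded` layer.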

@richelbilderbeek richelbilderbeek self-assigned this Jun 14, 2022
@richelbilderbeek
Contributor Author

This is the warning:

Warning messages:                                                                                                 
1: In system2(command = run_args[1], args = run_args[-1], stdout = TRUE,  :
  running command ''python3' ~/.local/share/GenoCAE/run_gcae.py project --datadir /home/richel/GitHubs/gcaer/inst/extdata/ --data gcae_input_files_1 --model_id M1_3n --train_opts_id ex3 --data_opts_id b_0_4 --trainedmodeldir ~/.cache/gcaer/ae_out315275230761/ 2>&1' had status 1
2: In system2(command = run_args[1], args = run_args[-1], stdout = TRUE,  :
  running command ''python3' ~/.local/share/GenoCAE/run_gcae.py project --datadir /home/richel/GitHubs/gcaer/inst/extdata/ --data gcae_input_files_1 --model_id M1_3n --train_opts_id ex3 --data_opts_id b_0_4 --trainedmodeldir ~/.cache/gcaer/ae_out315275230761/ 2>&1' had status 1

Note that the problem occurs only in the project step.

@richelbilderbeek
Contributor Author

richelbilderbeek commented Jun 14, 2022

Fixed the warning by running project only when the latent layer has 2 neurons.
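The workaround described above can be sketched as a guard that builds the `project` arguments only for a 2-neuron latent layer. This is an illustration, not the actual gcaer change (which is not shown in this issue); `build_project_args` and its `n_latent_neurons` parameter are hypothetical names, while the command-line flags are the ones visible in the logs in this thread.

```python
def build_project_args(n_latent_neurons, datadir, data, model_id,
                       train_opts_id, data_opts_id, trainedmodeldir):
    """Return the run_gcae.py 'project' arguments, or None when projection is skipped."""
    if n_latent_neurons != 2:
        # Projection failed for a 3-neuron latent layer, so it is skipped here.
        return None
    return [
        "project",
        "--datadir", datadir,
        "--data", data,
        "--model_id", model_id,
        "--train_opts_id", train_opts_id,
        "--data_opts_id", data_opts_id,
        "--trainedmodeldir", trainedmodeldir,
    ]


# Values taken from the command shown in this thread.
args = build_project_args(
    n_latent_neurons=3,
    datadir="/home/richel/GitHubs/gcaer/inst/extdata/",
    data="gcae_input_files_1",
    model_id="M1_3n",
    train_opts_id="ex3",
    data_opts_id="b_0_4",
    trainedmodeldir="/home/richel/.cache/gcaer/file1dc13a3ab996/",
)
print("skip project" if args is None else args)
```

With 3 latent neurons the function returns `None`, so the caller would skip the project step and never trigger the concatenation error.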
