forked from davidecaroselli/marian-dev
Conflicts resolution #1
Closed
* Enable compute86 where supported
* Enable on-line packing/quantization
* Add half precision min/max quantization for model weights
* Change default quantization of B matrix to min/max, revert a false commit for AggregateAll
* Fixed missing half quantization
* Fix quantization range for A
* Set all default values for the quantize range to 0.f
* Use 7 bits clip for the weight matrix quantization to avoid an overflow of VPMADDUBSW
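The 7-bit clip in the last bullet can be checked with a little arithmetic. VPMADDUBSW multiplies unsigned 8-bit values by signed 8-bit values and adds adjacent pairs of products into a saturating signed 16-bit lane. The sketch below is only an illustration under assumptions: the function names are hypothetical, and the symmetric scaling by the largest absolute weight is a simplification, not Marian's actual quantization code.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def quantize_minmax(weights, bits=7):
    """Quantize floats to signed integers that fit in `bits` bits.

    Assumption: symmetric scaling by the largest absolute weight.
    """
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / qmax
    quantized = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return quantized, scale


def worst_case_pair_sum(bits):
    """Largest magnitude of a0*b0 + a1*b1 for a in [0, 255], b a `bits`-bit signed int."""
    return 2 * 255 * 2 ** (bits - 1)


def fits_int16(magnitude):
    """True if both +magnitude and -magnitude are representable in int16."""
    return INT16_MIN <= -magnitude and magnitude <= INT16_MAX
```

With 8-bit weights the worst-case pair sum has magnitude 65280, which exceeds the int16 range and would saturate; with 7-bit weights it is 32640, which fits.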
…-stalled
This fixes a recently discovered bug by checking whether a validator exists before resetting its stalled validations. The regression test for it is in marian-nmt/marian-regression-tests#80.
Update SPM module to include CMake changes.
Bumps [regression-tests](https://github.com/marian-nmt/marian-regression-tests) from `2a8bed3` to `89ce02e`.
- [Release notes](https://github.com/marian-nmt/marian-regression-tests/releases)
- [Commits](marian-nmt/marian-regression-tests@2a8bed3...89ce02e)

---
updated-dependencies:
- dependency-name: regression-tests
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Without these quotes, cmake fails in a confusing manner on systems whose cpuinfo output includes spaces. This arose in the context of attempting to compile natively on an m1 mac.

    $ /usr/sbin/sysctl -n machdep.cpu.features machdep.cpu.leaf7_features
    sysctl: unknown oid 'machdep.cpu.leaf7_features'

Obviously, this didn't work out well; there is still much more to do. Still, the quotes are cheap and eliminate a confusing failure mode. For this reason, I added them to the linux as well as the darwin path.
Bumps [src/3rd_party/fbgemm](https://github.com/marian-nmt/FBGEMM) from `6f45243` to `0e33146`.
- [Commits](marian-nmt/FBGEMM@6f45243...0e33146)

---
updated-dependencies:
- dependency-name: src/3rd_party/fbgemm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ed version during training
Adds option to replace current parameters with smoothed version during training. Could potentially help with convergence and training stability.
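As a rough illustration of the idea, the sketch below keeps an exponentially smoothed copy of the parameters and periodically overwrites the live parameters with it. Everything here is an assumption made for the example: the function name, the `replace_every` parameter, the decay value, and the plain SGD update are hypothetical and are not taken from Marian's implementation.

```python
def train_with_smoothing(params, grads_per_step, lr=0.1, decay=0.9, replace_every=2):
    """Plain SGD with an exponentially smoothed parameter copy.

    Every `replace_every` steps (hypothetical knob) the live parameters are
    replaced by the smoothed ones; 0 disables the replacement.
    """
    smoothed = list(params)
    for step, grads in enumerate(grads_per_step, start=1):
        # Ordinary gradient step on the live parameters.
        params = [p - lr * g for p, g in zip(params, grads)]
        # Exponential moving average of the live parameters.
        smoothed = [decay * s + (1 - decay) * p for s, p in zip(smoothed, params)]
        if replace_every and step % replace_every == 0:
            params = list(smoothed)
    return params, smoothed
```

Replacing the parameters pulls the live model back toward its running average, which is one plausible reading of why this might help stability.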
…g with fp16 fails
This PR adds a do-while loop to training. It should only repeat if an fp16 training run was interrupted by a DivergenceException thrown from training/scheduler.h and if both --throw-on-divergence and --fp16-fallback-to-fp32 are enabled. The repeated training run will continue from the last checkpoint (similar to a manually interrupted training) but attempt training in fp32. If that training run or any other fp32 training happens to diverge, training will exit with an unhandled DivergenceException. This is on purpose to indicate a fatal error.
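The control flow described above can be sketched as follows. This is a minimal Python model of the loop, not the C++ code from the PR: `run_training` and `train_once` are hypothetical names, and only `DivergenceException` and the two option names come from the description.

```python
class DivergenceException(Exception):
    """Stand-in for the exception thrown from training/scheduler.h."""


def run_training(train_once, fp16=True, throw_on_divergence=True,
                 fp16_fallback_to_fp32=True):
    """Run `train_once(precision)` and retry once in fp32 if fp16 diverges."""
    precision = "fp16" if fp16 else "fp32"
    while True:  # emulates the do-while loop: the body runs at least once
        try:
            return train_once(precision)
        except DivergenceException:
            can_retry = (precision == "fp16" and throw_on_divergence
                         and fp16_fallback_to_fp32)
            if not can_retry:
                raise  # a diverged fp32 run is treated as fatal
            precision = "fp32"  # resume from the last checkpoint in fp32
```

A second divergence propagates to the caller unhandled, matching the intent that a diverged fp32 run is a fatal error.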
Small simplification to create the correctly named tarball via `make marian_tgz`, resulting in e.g. `marian-2023-06-28-8390b1d.tgz`. This will be executed every time `make marian_tgz` is invoked, but depends on the correct targets and will update changed commit revisions etc. Uses PST time zone.
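The naming scheme can be illustrated with a short sketch. The actual logic lives in the CMake target; the Python function below is hypothetical, and the fixed UTC-8 offset is an assumption about what "PST" means here (it ignores daylight saving time).

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST is a fixed UTC-8 offset with no daylight saving adjustment.
PST = timezone(timedelta(hours=-8), "PST")


def tarball_name(short_rev, now=None):
    """Build a name of the form marian-YYYY-MM-DD-<short commit hash>.tgz."""
    now = now or datetime.now(timezone.utc)
    date = now.astimezone(PST).strftime("%Y-%m-%d")
    return f"marian-{date}-{short_rev}.tgz"
```

Because the date is taken in PST, a build started shortly after midnight UTC still carries the previous calendar day in its name.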
LSH vocab filtering for GPU. Speed is not competitive with non-LSH. Checking in for completeness and possible future use of LSH on GPU for non-filtering stuff, e.g. decoding $22k sentences, mini-batch 256, maxi-batch 10 using production SSRU model:
Without LSH: 53.86 sec.
With LSH: 108.27 sec.
This PR adds:
* An implementation of BLEURT with conversion script
* Some code refactoring for COMET models
* A more cleanly separated "evaluate" and "embed" functionality for COMET/COMET-QE/BLEURT
* A number of MBR-related scripts.
Fixes and extends unit test for layer norm. Previous version had a weird usage of Glorot Uniform.
Various small improvements, missing operators, missing gradient computations etc. The two most useful ones are probably:
* Working backward step (gradient) for scatter operation
* Possibility to use LayerNorm and RMSNorm without scale and bias vectors (especially in new layer framework)
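To make the second bullet concrete, here is a minimal sketch of both normalizations with optional scale and bias. It uses the standard textbook definitions on plain Python lists; the function names, the `eps` default, and the list-based interface are assumptions for illustration and do not mirror Marian's operators.

```python
import math


def _affine(y, scale, bias):
    """Apply the optional element-wise scale and bias."""
    if scale is not None:
        y = [v * s for v, s in zip(y, scale)]
    if bias is not None:
        y = [v + b for v, b in zip(y, bias)]
    return y


def layer_norm(x, scale=None, bias=None, eps=1e-9):
    """Subtract the mean and divide by the standard deviation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return _affine([(v - mean) / math.sqrt(var + eps) for v in x], scale, bias)


def rms_norm(x, scale=None, bias=None, eps=1e-9):
    """Divide by the root mean square; no mean subtraction."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return _affine([v / rms for v in x], scale, bias)
```

Calling either function without `scale` and `bias` returns the purely normalized vector, which is the case the bullet describes.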
Undoes the accidental renaming of the scale parameter in Norms layer back to "weight".
Reusing these YAML configs helps speed up coreleaf loading. The only consumers of this quicksand API are the leaf, and I think this small memory tradeoff of keeping these in cache is worth the speedup. Related work items: #146810
…ation number) when requested.
This PR adds the option `--overwrite-checkpoints` (by default true to mimic current behavior) which can be set to `false` to force full checkpoint saving and preservation at saving intervals. E.g. for a model named `rus.enu.generalnn.replica_1.model.iter37769.npz`, Marian will then also save `rus.enu.generalnn.replica_1.model.iter37769.npz.optimizer.npz` and `rus.enu.generalnn.replica_1.model.iter37769.npz.progress.yml`.
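The file naming in the example can be summarized in a few lines. The helper below is hypothetical and only mirrors the description: it does not show how Marian derives the iteration-specific model name, and it lists only the iteration-specific files.

```python
def files_saved(iter_model_file, overwrite_checkpoints=True):
    """List the iteration-specific files written for one saved model.

    Assumption: with the default (True) only the model file itself is kept
    under the iteration-specific name, as before this change.
    """
    files = [iter_model_file]
    if not overwrite_checkpoints:
        # Full checkpoint: optimizer state and training progress as well.
        files.append(iter_model_file + ".optimizer.npz")
        files.append(iter_model_file + ".progress.yml")
    return files
```

For `rus.enu.generalnn.replica_1.model.iter37769.npz` with `overwrite_checkpoints=False`, this yields the model file plus the `.optimizer.npz` and `.progress.yml` companions named in the description.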
…nmt#1000)
Bumps [src/3rd_party/sentencepiece](https://github.com/marian-nmt/sentencepiece) from `8dc9172` to `fb6f8e4`.
- [Commits](marian-nmt/sentencepiece@8dc9172...fb6f8e4)

---
updated-dependencies:
- dependency-name: src/3rd_party/sentencepiece
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This PR explicitly disables server compilation in macOS build with clang. It seems an update to the macos-12 environment provided openssl and boost, which when found by cmake, enables compilation of marian-server, which doesn't work with clang.
Set compatible versions of Python modules after Cython 3.0 release.
This PR adds `--custom-fallbacks` and generalizes the previous attempt at handling diverged trainings. Now we can specify any number of fallback options that get used in subsequent diverged trainings. E.g. we can restart a training from the last checkpoint by turning off fp16 training and if we still encounter a divergence, we can also lower the learning rate on the next attempt. This would be achieved by adding the following to a config file:

```
custom-fallbacks:
  - fp16: false
    precision: [float32, float32]
    cost-scaling: []
  - fp16: false
    precision: [float32, float32]
    cost-scaling: []
    learn-rate: 0.0001
```

On the command line we can specify JSON-style options like `--custom-fallbacks "{fp16: false, precision: [float32, float32], cost-scaling: []}" "{fp16: false, precision: [float32, float32], cost-scaling: [], learn-rate: 0.0001}"` where each string in `"..."` gets parsed to a YAML list entry.

The previous option `--fp16-fallback-to-fp32` is now just an alias for the corresponding `--custom-fallbacks` values (first entry above). Any number of fallbacks can be specified.
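How a list of fallbacks might be consumed can be sketched as below. The function name and the base configuration are hypothetical, and the sketch assumes that each fallback entry overrides the original configuration on its own rather than stacking on the previous entry (the example above repeats `fp16: false` in the second entry, which is consistent with that reading but does not prove it).

```python
# The alias --fp16-fallback-to-fp32 corresponds to this single entry.
FP16_FALLBACK_TO_FP32 = {
    "fp16": False,
    "precision": ["float32", "float32"],
    "cost-scaling": [],
}


def config_for_attempt(base, fallbacks, divergences):
    """Return the config to use after `divergences` diverged runs.

    Returns None once the fallbacks are exhausted, i.e. the next
    divergence should be treated as fatal.
    """
    if divergences == 0:
        return dict(base)
    if divergences > len(fallbacks):
        return None
    config = dict(base)
    config.update(fallbacks[divergences - 1])  # entry overrides base options
    return config
```

With the two entries from the YAML example, the first divergence switches to float32 and the second additionally lowers `learn-rate` to 0.0001; a third divergence leaves no fallback to try.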
This PR fixes fine-tuning a model trained with an older version of Marian by:
- adding the removed option `num-devices` to the list of deprecated options
- checking if `loss-{arg,var}-{slow,fast}` are present in .progress.yml file
…rgence
Make sure that the averaged loss is actually well-defined and not inf or nan.
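A check of this kind can be sketched in a few lines. The function name and the treatment of an empty count are assumptions for illustration; the commit message does not say how the result is used.

```python
import math


def average_is_well_defined(loss_sum, label_count):
    """True only if loss_sum / label_count is a finite number."""
    if label_count == 0:
        return False  # assumption: an average over nothing is not well-defined
    return math.isfinite(loss_sum / label_count)
```

`math.isfinite` rejects both infinities and NaN, so one call covers the two cases named in the message.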
Co-authored-by: Hieu Hoang <[email protected]>
marian-nmt#1003)
* Add an option to not encode sentencepiece during training/decoding allowing passing of spmIDs directly
* Update changelog
* numbers -> pieces
Description
Please add a clear and concise description of the changes.
This PR fixes a bug/adds a new feature/refactorizes the code/does something else.
It is related to issues: marian-nmt#998, marian-nmt#999, ...
List of changes:
Added dependencies: none
How to test
Describe how to test your changes, adding command line examples and sample input/output files if relevant.
Point to unit tests or regression tests covering the changes if they have been added.
Describe how you have tested your code, including OS and the cmake command.
Checklist