Commit
This commit adds a basic implementation of `vidyut-chandas` along with some starter documentation.

`vidyut-chandas` is pre-alpha code that I'm shipping as a scaffold for future work. Currently, `vidyut-chandas` supports a variety of vrtta meters, but it has no support for jAti meters, which is the next major feature to implement. Our documentation is also lacking, but it's a promising start.

This commit was prepared primarily by me but with major contributions from Pradyumna Malladi (@MSSRPRAD), who created a [first draft][1] of the code based on a design document we collaborated on [here][2].

My contributions were:

- some cleanup to the input data format to use simple TSVs instead of JSON files with serde. We currently have a serde dependency to support our frontend, but this is not a core part of the crate and can be removed in a future refactor.
- some simplification to the API, particularly the use of a single `name` instead of `names` for each meter and the avoidance of some intermediate structs during serde
- the `classify` method and its matching logic
- some tweaks to our logic for calculating meter ganas
- most of the documentation, including the README, docstrings, and sample code
- most unit tests, especially for classifying different meters
- our WebAssembly bindings and frontend

Pradyumna's contributions were:

- a partial first draft of an end-to-end solution, including the basic data model for meters and aksharas
- data management, including sourcing our original meters file and writing the logic to read this file into our data structures
- a first draft of logic to calculate meter ganas
- a partial port of some of Vidyut's utility functions for testing Sanskrit sounds
- suggestions to make the returned matching object richer in order to support future ranking use-cases

All errors in recollection are my own.

[1]: #82
[2]: https://paper.dropbox.com/doc/Metrical-recognizer-API-v0--CCgFzji5rgBZYEMJkWAt_j5bAg-pZGhBvlQQBFcPx7SeWSE9

Co-authored-by: Pradyumna Malladi <[email protected]>
Showing 22 changed files with 1,532 additions and 12 deletions.
@@ -0,0 +1,2 @@
www/static/wasm
www/static/data
@@ -0,0 +1,22 @@
[package]
name = "vidyut-chandas"
version = "0.1.0"
authors = ["Arun Prasad <[email protected]>"]
description = "A Sanskrit metrical classifier"
homepage = "https://github.com/ambuda-org/vidyut"
repository = "https://github.com/ambuda-org/vidyut"
categories = ["text-processing"]
keywords = ["sanskrit"]
license = "MIT"
edition = "2021"

[dependencies]
console_error_panic_hook = "0.1.7"
lazy_static = "1.4.0"
serde = { version = "1.0.150", features = ["derive"] }
serde-wasm-bindgen = "0.4"
serde_derive = "1.0.193"
wasm-bindgen = "0.2"

[lib]
crate-type = ["cdylib", "rlib"]
@@ -0,0 +1,2 @@
debugger:
	./scripts/run-debugger.sh
@@ -0,0 +1,59 @@
<div align="center">
<h1><code>vidyut-chandas</code></h1>
<p><i>A Sanskrit metrical classifier</i></p>
</div>

`vidyut-chandas` is an experimental classifier for Sanskrit meters.

This [crate][crate] is under active development as part of the [Ambuda][ambuda]
project. If you enjoy our work and wish to contribute to it, we encourage you
to [join our Discord server][discord], where you can meet other Sanskrit
programmers and enthusiasts.

An online demo is available [here][demo].

[crate]: https://doc.rust-lang.org/book/ch07-01-packages-and-crates.html
[ambuda]: https://ambuda.org
[discord]: https://discord.gg/7rGdTyWY7Z
[demo]: https://ambuda-org.github.io/vidyut-lipi/

- [Overview](#overview)
- [Usage](#usage)


Overview
--------

Sanskrit poetry uses a variety of *meters*, which specify the syllable patterns
that a verse must follow. `vidyut-chandas` aims to provide a simple API for
identifying the meter that a given verse uses.

`vidyut-chandas` is experimental code and follows in the footsteps of great
projects like [sanskritmetres][sm] and [Skrutable][skrutable].

[sm]: https://github.com/shreevatsa/sanskrit
[skrutable]: https://github.com/tylergneill/skrutable


Usage
-----

*(This API is unstable.)*

We recommend using `vidyut-chandas` through our `Chandas` API:

```rust
use vidyut_chandas::{Chandas, MatchType, Vrtta};

let vrttas: Vec<Vrtta> = vec![
    "vasantatilakA\tvrtta\tGGLGLLLGLLGLGG".try_into().unwrap(),
    "mandAkrAntA\tvrtta\tGGGGLLLLLGGLGGLGG".try_into().unwrap(),
    "puzpitAgrA\tvrtta\tLLLLLLGLGLGG/LLLLGLLGLGLGG".try_into().unwrap(),
    "udgatA\tvrtta\tLLGLGLLLGL/LLLLLGLGLG/GLLLLLLGLLG/LLGLGLLLGLGLG".try_into().unwrap()
];
let chandas = Chandas::new(vrttas);

let result = chandas.classify("mAtaH samastajagatAM maDukEwaBAreH");
assert_eq!(result.vrtta().as_ref().unwrap().name(), "vasantatilakA");
assert_eq!(result.match_type(), MatchType::Pada);
```
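
To make the `classify` example above more concrete, here is a minimal, std-only sketch of how a line of SLP1 text could be scanned into heavy (`G`) and light (`L`) syllable weights and then compared against pada patterns. This is not the crate's implementation. `scan_weights` and `find_meter` are hypothetical names, the heaviness rule is simplified (a long vowel, a following anusvara or visarga, or a following consonant cluster), and the sketch assumes that the input is SLP1 and that `/` separates the padas of a pattern.

```rust
/// Returns true if `c` is an SLP1 vowel.
fn is_vowel(c: char) -> bool {
    "aAiIuUfFxXeEoO".contains(c)
}

/// Returns true if `c` is a long SLP1 vowel (including e, E, o, O).
fn is_long_vowel(c: char) -> bool {
    "AIUFXeEoO".contains(c)
}

/// Returns true if `c` is an SLP1 consonant.
fn is_consonant(c: char) -> bool {
    "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzshL".contains(c)
}

/// Hypothetical helper: maps SLP1 text to a string of syllable weights.
/// Simplification: a word-final consonant at the end of the text does not
/// make the last syllable heavy here.
fn scan_weights(text: &str) -> String {
    // Keep only sounds; drop spaces, punctuation, and anything unrecognized.
    let sounds: Vec<char> = text
        .chars()
        .filter(|c| is_vowel(*c) || is_consonant(*c) || *c == 'M' || *c == 'H')
        .collect();

    let mut weights = String::new();
    for (i, &c) in sounds.iter().enumerate() {
        if !is_vowel(c) {
            continue;
        }
        // Everything between this vowel and the next one.
        let following: Vec<char> = sounds[i + 1..]
            .iter()
            .copied()
            .take_while(|x| !is_vowel(*x))
            .collect();
        let has_anusvara_or_visarga = following.iter().any(|x| *x == 'M' || *x == 'H');
        let has_cluster = following.iter().filter(|x| is_consonant(**x)).count() >= 2;
        let heavy = is_long_vowel(c) || has_anusvara_or_visarga || has_cluster;
        weights.push(if heavy { 'G' } else { 'L' });
    }
    weights
}

/// Hypothetical helper: returns the first meter with a pada equal to `weights`.
/// Each entry in `meters` is a (name, pattern) pair.
fn find_meter<'a>(weights: &str, meters: &[(&'a str, &str)]) -> Option<&'a str> {
    meters
        .iter()
        .find(|(_, pattern)| pattern.split('/').any(|pada| pada == weights))
        .map(|(name, _)| *name)
}
```

With the sample line from the README, `scan_weights` yields the same fourteen-syllable pattern that is listed for `vasantatilakA`, so `find_meter` returns that name. A real classifier would also need to handle full verses, partial matches, and edge cases at line ends, which this sketch leaves out.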
@@ -0,0 +1,21 @@
Data
====

We have designed `vidyut-chandas` so that it can run without any side data. But
in practice, it is useful to have a list of meters available already. For this
reason, the `vidyut-chandas` crate includes `meters.tsv`, which you can use to
get started.


Creating the data file
----------------------

`meters.tsv` was sourced from the [Sanskrit metres][s-m] project by Shreevatsa
Rajagopalan, which itself sources its data from a transcription of the
*Vṛttaratnākara* prepared by Dr. Dhaval Patel. Our version of this data
contains only *vṛtta* meters and does not currently support *jāti* meters.

We extracted this data using `extract_meter_data.py`, which you can find in the
`scripts` directory of this crate.

[s-m]: https://github.com/shreevatsa/sanskrit
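
As an illustration of what reading such a file might involve, here is a small std-only sketch that parses one tab-separated row into a record. It assumes that each row of `meters.tsv` has the same three fields as the strings in the README example (name, meter type, and a `G`/`L` weight pattern with `/` between padas); the actual file layout is not shown in this commit view. `MeterRow` and `parse_meter_row` are hypothetical names and are not part of the crate's API.

```rust
/// Hypothetical record for one row of the meters file.
#[derive(Debug, PartialEq)]
struct MeterRow {
    name: String,
    kind: String,
    /// One weight pattern per pada, using 'G' (heavy) and 'L' (light).
    padas: Vec<String>,
}

/// Parses a line of the assumed form `name<TAB>kind<TAB>pattern`.
fn parse_meter_row(line: &str) -> Result<MeterRow, String> {
    // Drop a trailing newline, if any, before splitting on tabs.
    let line = line.trim_end_matches(|c| c == '\r' || c == '\n');
    let fields: Vec<&str> = line.split('\t').collect();
    if fields.len() != 3 {
        return Err(format!(
            "expected 3 tab-separated fields, found {}",
            fields.len()
        ));
    }

    // Split the pattern into padas and check that only G and L appear.
    let padas: Vec<String> = fields[2].split('/').map(str::to_string).collect();
    for pada in &padas {
        if pada.is_empty() || !pada.chars().all(|c| c == 'G' || c == 'L') {
            return Err(format!("invalid weight pattern: {:?}", pada));
        }
    }

    Ok(MeterRow {
        name: fields[0].to_string(),
        kind: fields[1].to_string(),
        padas,
    })
}
```

Returning a `Result` rather than panicking lets a caller skip or report malformed rows while loading the rest of the file.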