[chandas] Add basic implementation
This commit adds a basic implementation of `vidyut-chandas` along with
some starter documentation. `vidyut-chandas` is pre-alpha code that I'm
shipping as a scaffold for future work.

Currently, `vidyut-chandas` supports a variety of vrtta meters, but it
has no support for jAti meters, which is the next major feature to
implement. Our documentation is also lacking, but it's a promising
start.

This commit was prepared primarily by me but with major contributions
from Pradyumna Malladi (@MSSRPRAD), who created a [first draft][1] of
the code based on a design document we collaborated on [here][2].

My contributions were:

- some cleanup to the input data format to use simple TSVs instead of
  JSON files with serde. We currently have a serde dependency to support
  our frontend, but this is not a core part of the crate and can be
  removed in a future refactor.

- some simplification to the API, particularly the use of a single
  `name` instead of `names` for each meter and the avoidance of some
  intermediate structs during deserialization

- the `classify` method and its matching logic

- some tweaks to our logic for calculating meter ganas

- most of the documentation, including the README, docstrings, and
  sample code

- most unit tests, especially for classifying different meters

- our WebAssembly bindings and frontend

Pradyumna's contributions were:

- a partial first draft of an end-to-end solution, including the basic
  data model for meters and aksharas

- data management, including sourcing our original meters file and
  writing the logic to read this file into our data structures

- a first draft of logic to calculate meter ganas

- a partial port of some of Vidyut's utility functions for testing
  Sanskrit sounds

- suggestions to make the returned matching object richer in order to
  support future ranking use-cases

All errors in recollection are my own.

[1]: #82
[2]: https://paper.dropbox.com/doc/Metrical-recognizer-API-v0--CCgFzji5rgBZYEMJkWAt_j5bAg-pZGhBvlQQBFcPx7SeWSE9

Co-authored-by: Pradyumna Malladi <[email protected]>
akprasad and MSSRPRAD committed Jan 1, 2024
1 parent 976baa7 commit 471a5d1
Showing 22 changed files with 1,532 additions and 12 deletions.
37 changes: 25 additions & 12 deletions Cargo.lock


2 changes: 2 additions & 0 deletions Cargo.toml
@@ -1,6 +1,7 @@
[workspace]

members = [
"vidyut-chandas",
"vidyut-cheda",
"vidyut-kosha",
"vidyut-lipi",
@@ -21,6 +22,7 @@ license = "MIT"
edition = "2021"

[dependencies]
vidyut-chandas = { path = "./vidyut-chandas" }
vidyut-cheda = { path = "./vidyut-cheda" }
vidyut-kosha = { path = "./vidyut-kosha" }
vidyut-lipi = { path = "./vidyut-lipi" }
8 changes: 8 additions & 0 deletions README.md
@@ -108,6 +108,13 @@ independently depending on your use case.
In Rust, components of this kind are called *crates*.


### [`vidyut-chandas`][vidyut-chandas]

`vidyut-chandas` is an experimental classifier for Sanskrit meters.

For details, see the [vidyut-chandas README][vidyut-chandas].


### [`vidyut-cheda`][vidyut-cheda]

`vidyut-cheda` segments Sanskrit expressions into words then annotates those
@@ -147,6 +154,7 @@ between words. It is fast, simple, and appropriate for most use cases.
For details, see the [vidyut-sandhi README][vidyut-sandhi].


[vidyut-chandas]: vidyut-chandas/README.md
[vidyut-cheda]: vidyut-cheda/README.md
[vidyut-kosha]: vidyut-kosha/README.md
[vidyut-prakriya]: vidyut-prakriya/README.md
2 changes: 2 additions & 0 deletions vidyut-chandas/.gitignore
@@ -0,0 +1,2 @@
www/static/wasm
www/static/data
22 changes: 22 additions & 0 deletions vidyut-chandas/Cargo.toml
@@ -0,0 +1,22 @@
[package]
name = "vidyut-chandas"
version = "0.1.0"
authors = ["Arun Prasad <[email protected]>"]
description = "A Sanskrit metrical classifier"
homepage = "https://github.com/ambuda-org/vidyut"
repository = "https://github.com/ambuda-org/vidyut"
categories = ["text-processing"]
keywords = ["sanskrit"]
license = "MIT"
edition = "2021"

[dependencies]
console_error_panic_hook = "0.1.7"
lazy_static = "1.4.0"
serde = { version = "1.0.150", features = ["derive"] }
serde-wasm-bindgen = "0.4"
serde_derive = "1.0.193"
wasm-bindgen = "0.2"

[lib]
crate-type = ["cdylib", "rlib"]
2 changes: 2 additions & 0 deletions vidyut-chandas/Makefile
@@ -0,0 +1,2 @@
debugger:
	./scripts/run-debugger.sh
59 changes: 59 additions & 0 deletions vidyut-chandas/README.md
@@ -0,0 +1,59 @@
<div align="center">
<h1><code>vidyut-chandas</code></h1>
<p><i>A Sanskrit metrical classifier</i></p>
</div>

`vidyut-chandas` is an experimental classifier for Sanskrit meters.

This [crate][crate] is under active development as part of the [Ambuda][ambuda]
project. If you enjoy our work and wish to contribute to it, we encourage you
to [join our Discord server][discord], where you can meet other Sanskrit
programmers and enthusiasts.

An online demo is available [here][demo].

[crate]: https://doc.rust-lang.org/book/ch07-01-packages-and-crates.html
[ambuda]: https://ambuda.org
[discord]: https://discord.gg/7rGdTyWY7Z
[demo]: https://ambuda-org.github.io/vidyut-lipi/

- [Overview](#overview)
- [Usage](#usage)


Overview
--------

Sanskrit poetry uses a variety of *meters*, which specify the syllable patterns
that a verse must follow. `vidyut-chandas` aims to provide a simple API for
identifying the meter that a given verse uses.
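
As a rough illustration of what matching a syllable pattern involves, the
sketch below compares one pada's sequence of light (`L`) and heavy (`G`)
syllables against a meter pattern. It is not the crate's implementation:
`fits_pada` is a hypothetical helper, and letting the final syllable take
either weight is an assumption based on the common convention that a
pada-final syllable may be treated as heavy.

```rust
/// Hypothetical helper (not part of `vidyut-chandas`): returns true if the
/// syllable weights of one pada fit a meter pattern.
///
/// Both arguments are strings of 'L' (light) and 'G' (heavy).
/// Assumption: the final syllable of the pada may take either weight.
fn fits_pada(weights: &str, pattern: &str) -> bool {
    let w = weights.as_bytes();
    let p = pattern.as_bytes();
    // A pada must have exactly as many syllables as the pattern.
    if w.is_empty() || w.len() != p.len() {
        return false;
    }
    let last = w.len() - 1;
    w.iter().zip(p.iter()).enumerate().all(|(i, (&a, &b))| {
        // Reject anything that is not a syllable weight.
        let is_weight = a == b'L' || a == b'G';
        // Every position except the last must match the pattern exactly.
        is_weight && (i == last || a == b)
    })
}
```

A real classifier also has to derive these weights from the text of the verse,
which this sketch leaves out.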

`vidyut-chandas` is experimental code and follows in the footsteps of great
projects like [sanskritmetres][sm] and [Skrutable][skrutable].

[sm]: https://github.com/shreevatsa/sanskrit
[skrutable]: https://github.com/tylergneill/skrutable


Usage
-----

*(This API is unstable.)*

We recommend using `vidyut-chandas` through our `Chandas` API:

```rust
use vidyut_chandas::{Chandas, MatchType, Vrtta};

let vrttas: Vec<Vrtta> = vec![
    "vasantatilakA\tvrtta\tGGLGLLLGLLGLGG".try_into().unwrap(),
    "mandAkrAntA\tvrtta\tGGGGLLLLLGGLGGLGG".try_into().unwrap(),
    "puzpitAgrA\tvrtta\tLLLLLLGLGLGG/LLLLGLLGLGLGG".try_into().unwrap(),
    "udgatA\tvrtta\tLLGLGLLLGL/LLLLLGLGLG/GLLLLLLGLLG/LLGLGLLLGLGLG".try_into().unwrap(),
];
let chandas = Chandas::new(vrttas);

let result = chandas.classify("mAtaH samastajagatAM maDukEwaBAreH");
assert_eq!(result.vrtta().as_ref().unwrap().name(), "vasantatilakA");
assert_eq!(result.match_type(), MatchType::Pada);
```
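
The strings passed to `try_into` above appear to follow a three-column,
tab-separated layout: a name, a kind, and one or more pada patterns joined by
`/`. The sketch below parses that layout using only the standard library.
`MeterRow` and `parse_meter_row` are hypothetical names for illustration and
are not part of the `vidyut-chandas` API; the validation rules are assumptions.

```rust
/// Hypothetical record for one meter row (not the crate's `Vrtta` type).
#[derive(Debug, PartialEq)]
struct MeterRow {
    name: String,
    kind: String,
    /// One 'L'/'G' pattern per distinct pada, in order.
    padas: Vec<String>,
}

/// Parses a line of the form "name<TAB>kind<TAB>pattern[/pattern...]".
/// Returns `None` if the line does not have exactly three fields or if a
/// pattern is empty or contains anything other than 'L' and 'G'.
fn parse_meter_row(line: &str) -> Option<MeterRow> {
    let mut fields = line.split('\t');
    let name = fields.next()?;
    let kind = fields.next()?;
    let pattern = fields.next()?;
    if fields.next().is_some() || name.is_empty() {
        return None;
    }

    let padas: Vec<String> = pattern.split('/').map(str::to_string).collect();
    let valid = padas
        .iter()
        .all(|p| !p.is_empty() && p.chars().all(|c| c == 'L' || c == 'G'));
    if !valid {
        return None;
    }

    Some(MeterRow {
        name: name.to_string(),
        kind: kind.to_string(),
        padas,
    })
}
```

For example, `parse_meter_row("puzpitAgrA\tvrtta\tLLLLLLGLGLGG/LLLLGLLGLGLGG")`
yields a row with two pada patterns, while a line with a missing column yields
`None`.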
21 changes: 21 additions & 0 deletions vidyut-chandas/data/README.md
@@ -0,0 +1,21 @@
Data
====

We have designed `vidyut-chandas` so that it can run without any side data. But
in practice, it is useful to have a list of meters available already. For this
reason, the `vidyut-chandas` crate includes `meters.tsv`, which you can use to
get started.


Creating the data file
----------------------

`meters.tsv` was sourced from the [Sanskrit metres][s-m] project by Shreevatsa
Rajagopalan, which itself sources its data from a transcription of the
*Vṛttaratnākara* prepared by Dr. Dhaval Patel. Our version of this data
contains only *vṛtta* meters and does not currently support *jāti* meters.

We extracted this data using `extract_meter_data.py`, which you can find in the
`scripts` directory of this crate.

[s-m]: https://github.com/shreevatsa/sanskrit