scan GRETIL
One way you can help improve the world's stock of digital Sanskrit data
is to use skrutable's whole-file meter identification
to detect and correct certain kinds of textual problems
that show up during metrical analysis.
GRETIL texts include some of the highest-quality ones around and are relatively easy to work with,
so they're a good starting point.
See the cumulative results so far.
how-to
Step 1: Identify and obtain an electronic text you'd like to work on.
For example: the Bhagavadgītā (BhG) raw GRETIL file
Step 2: Clean up the file (using search-and-replace, etc.) so that it is a text file with exactly one (potential) whole verse per line. Half an anuṣṭubh śloka is ok, too.
For example:
BhG input cleaned
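If you'd rather script the cleanup than do it by hand, here is a minimal sketch of the idea in Python. It assumes, purely for illustration, that a half-verse line ends in a single "|" and that a verse-final line ends in "||" followed by an optional identifier (such as "1.1") and a closing "||". Real files vary, so check the raw file first and adjust the patterns. The name clean_lines is hypothetical and not part of skrutable.

```python
import re

# Assumed markup (for illustration only; adjust to the actual file):
#   half-verse line:   "... |"
#   verse-final line:  "... || 1.1 ||"  or  "... ||1||"  or  "... ||"
VERSE_END = re.compile(r"\|\|\s*[\w.,\s]*\|*\s*$")
HALF_END = re.compile(r"\|\s*$")

def clean_lines(raw_lines):
    """Join half-verses so that each returned string is one whole verse."""
    verses, parts = [], []
    for line in raw_lines:
        line = line.strip()
        if not line:
            continue
        if VERSE_END.search(line):
            # Drop the closing daṇḍas and identifier, then emit the verse.
            parts.append(VERSE_END.sub("", line).strip())
            verses.append(" ".join(parts))
            parts = []
        elif HALF_END.search(line):
            # Drop the single daṇḍa and wait for the rest of the verse.
            parts.append(HALF_END.sub("", line).strip())
        # Lines without daṇḍas (headers, speaker labels, prose) are skipped.
    return verses
```

Write the returned strings to a new text file, one per line, and skim the result before submitting it, since anything the patterns fail to match is silently dropped.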
Step 3: Submit the cleaned file to skrutable's meter identifier using the "whole file" button.
(See the help page for instructions.)
skrutable is fast, so the 700 verses of the Gītā will only take a second:
BhG raw output and
BhG output cleaned.
Step 4: Read through the results (use regexes to search more effectively),
and use detected metrical irregularities to identify potential textual issues in need of attention:
hyper- or hypo-metrical lines, typos that result in different syllable weights, and so on.
The BhG text is in relatively good shape. But consider the huge text of the Rāmāyaṇa:
(raw in — Dec 2018 version with semicolons), (clean in), (raw out), (clean out), (tallies of meter types), (breakdown and problems).
Some tips on skrutable's categories:
- "asamīcīna": for invalid parts of anuṣṭubhs
- "x yuktāḥ pādāḥ": for samavṛttas with fewer than four valid pādas
- "na kiṃcid adhyavasitam": for material not successfully determined to be a particular meter
- "ajñātasamavṛtta": for verses recognized as having the shape of samavṛttas (also ardhasamavṛttas) but not yet known by name
- "upajāti xyz": for upajātis (= pādas all having the same length) other than the standard indravajrā-upendravajrā type
- "ajñātam": for "upajātis" in name only, with pādas of equal lengths but not standardly recognized as such (usually spurious)
- "atha vā": for verses that could be one thing or another
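The category labels above make good search targets. As a rough sketch, the following Python pulls out every line of an output file that carries one of them and tallies the hits. It assumes only that each line of the cleaned output contains the label as plain text somewhere; the exact output layout is not specified here, so treat that as an assumption. The names flag_problems and tally are hypothetical, and "x" in "x yuktāḥ pādāḥ" is matched as any run of non-space characters.

```python
import re
import unicodedata
from collections import Counter

def nfc(s):
    # Normalize so that precomposed and decomposed diacritics compare equal.
    return unicodedata.normalize("NFC", s)

# Labels quoted from the tips above.
PROBLEM = re.compile(nfc(
    r"asamīcīna|\S+ yuktāḥ pādāḥ|na kiṃcid adhyavasitam"
    r"|ajñātasamavṛtta|ajñātam|atha vā"
))

def flag_problems(output_lines):
    """Return (line_number, label, line) for each line with a problem label."""
    hits = []
    for n, line in enumerate(output_lines, start=1):
        line = nfc(line.rstrip("\n"))
        m = PROBLEM.search(line)
        if m:
            hits.append((n, m.group(0), line))
    return hits

def tally(hits):
    """Count how often each label occurs among the flagged lines."""
    return Counter(label for _, label, _ in hits)
```

Only the first label found on a line is reported, which is enough to build a worklist; the line numbers let you jump back to the same verse in the cleaned input.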
Step 5: Fix or otherwise account for the irregularities you find:
- Correct the digital text
- Make a note of how to correct the edition itself
- Come up with an explanation in terms of poetic license, etc.
Step 6: Share your results with the community.
Questions? Email me (see the about page) and we'll talk.