scan GRETIL

One way to help improve the world's stock of digital Sanskrit data is to use skrutable's whole-file meter identification to detect and correct the kinds of textual problems that surface during metrical analysis. GRETIL texts are among the highest-quality ones available and are relatively easy to work with, so they make a good starting point.

See the cumulative results so far.

how-to

Step 1: Identify and obtain an electronic text you'd like to work on. For example: the Bhagavadgītā (BhG) raw GRETIL file

Step 2: Clean up the file (using search-and-replace, etc.) so that it is a text file with exactly one (potential) whole verse per line. Half an anuṣṭubh śloka is ok, too. For example: BhG input cleaned
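If search-and-replace gets tedious, the clean-up can be scripted. Here is a minimal sketch in Python, under assumptions that you should check against your own file: that a verse ends at a double daṇḍa ("||" or "//"), possibly followed by a reference such as "1.1||", and that a half-verse ends at a single "|" or "/". The function name one_verse_per_line is made up for this illustration and is not part of skrutable. File headers and prose comments still need to be removed by hand first.

```python
import re

# Assumed conventions (verify against your file): a verse ends at "||" or "//",
# optionally followed by a reference such as "1.1||"; a half-verse ends at "|" or "/".
VERSE_END = re.compile(r"(\|\||//).*$")
HALF_END = re.compile(r"\s*[|/]\s*$")

def one_verse_per_line(raw_text):
    """Join the lines of each verse and strip daṇḍas and verse references."""
    verses, parts = [], []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between verses
        is_end = VERSE_END.search(line) is not None
        if is_end:
            line = VERSE_END.sub("", line)  # drop "||" and anything after it
        else:
            line = HALF_END.sub("", line)   # drop a trailing single daṇḍa
        line = line.strip()
        if line:
            parts.append(line)
        if is_end and parts:
            verses.append(" ".join(parts))
            parts = []
    if parts:  # keep any trailing material that never reached a verse end
        verses.append(" ".join(parts))
    return "\n".join(verses)
```

Read the raw file as UTF-8, pass its contents to the function, and write the result to a new file. Then skim the output for stray lines before moving on.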

Step 3: Submit the cleaned file to skrutable's meter identifier using the "whole file" button. (See the help page for instructions.) skrutable is fast, so the 700 verses of the Gītā will only take a second: BhG raw output and BhG output cleaned.

Step 4: Read through the results (use regexes to search more effectively), and use detected metrical irregularities to identify potential textual issues in need of attention: hyper- or hypo-metrical lines, typos that result in different syllable weights, and so on.
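When a line is flagged as hyper- or hypo-metrical, it can help to double-check the syllable count independently. The sketch below is not how skrutable works internally. It is just a rough cross-check that counts vowel nuclei in IAST text and flags lines whose count is neither 16 (half an anuṣṭubh) nor 32 (a whole one). The function names and the default expected counts are assumptions made for this illustration, so pass other counts for other meters. A hiatus written as adjacent "a" + "i" or "a" + "u" inside a word would be miscounted as a diphthong.

```python
import re
import unicodedata

# Vowel nuclei in IAST; "ai" and "au" come first so each diphthong counts once.
VOWEL = re.compile(r"ai|au|[aāiīuūṛṝḷḹeo]")

def count_syllables(iast_line):
    """Approximate the syllable count of an IAST line by counting vowels."""
    text = unicodedata.normalize("NFC", iast_line.lower())
    return len(VOWEL.findall(text))

def flag_lines(lines, expected=(16, 32)):
    """Return (line number, syllable count, line) for lines with unexpected counts."""
    flagged = []
    for n, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        count = count_syllables(line)
        if count not in expected:
            flagged.append((n, count, line))
    return flagged
```

For instance, count_syllables("dharmakṣetre kurukṣetre samavetā yuyutsavaḥ") gives 16, whereas the same half-verse with a stray extra "ca" gives 17 and would be flagged.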

The BhG text is in relatively good shape. But consider the huge text of the Rāmāyaṇa: (raw in — Dec 2018 version with semicolons), (clean in), (raw out), (clean out), (tallies of meter types), (breakdown and problems).

Some tips on skrutable's categories:

Step 5: Fix or otherwise account for the irregularities you find.

Step 6: Share your results with the community.

Questions? Email me (see the about page) and we'll talk.