License

sohamhamso is published under two licenses, depending on what you are reusing. The full legal text lives in LICENSE in the repository; this page summarises it and gives practical examples.

The split

Source code — MIT

Everything under src/, pipeline/, scripts/, tests/, db/schema.sql, and configuration files. Free to use, modify, and redistribute, provided the copyright and permission notice is retained; no copyleft.

Corpus content — CC-BY-SA 4.0

Everything under data/corpus/, the generated dataset/ outputs (CSV, JSON shards, TEI-XML), translations, glosses, and prose documentation under docs/. Free to share and adapt, including commercially, provided you (a) credit the source and (b) release derivative works under the same CC-BY-SA 4.0 license.

When in doubt about a file, check its header comment first, then the LICENSE field in the metadata of the nearest containing directory. Where an upstream source carries a stricter license than ours (e.g., Muktabodha pending-permission status), the most restrictive applicable license governs that file.
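The split above can be sketched as a path lookup. This is a minimal illustration, not part of the sohamhamso codebase: the function name default_license and the "UNKNOWN" fallback are assumptions, and the sketch deliberately does not model per-file header comments, directory metadata, or stricter upstream licenses, all of which take precedence over the default.

```python
from pathlib import PurePosixPath

# Prefixes copied from the split described above. "Configuration files" are
# not listed by path in the text, so they fall through to UNKNOWN here.
CODE_PREFIXES = ("src/", "pipeline/", "scripts/", "tests/")
CODE_FILES = ("db/schema.sql",)
CONTENT_PREFIXES = ("data/corpus/", "dataset/", "docs/")


def default_license(path: str) -> str:
    """Return the default SPDX identifier for a repository-relative path.

    Hypothetical helper: it ignores header comments, directory metadata,
    and upstream restrictions, which override this default.
    """
    p = PurePosixPath(path).as_posix()  # normalises "./src/x" to "src/x"
    if p in CODE_FILES or p.startswith(CODE_PREFIXES):
        return "MIT"
    if p.startswith(CONTENT_PREFIXES):
        return "CC-BY-SA-4.0"
    return "UNKNOWN"  # check the file header or LICENSE by hand
```

A result of "UNKNOWN" means the path is not covered by the summary on this page and the file header or LICENSE should be consulted directly.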

Attributing the corpus

For any republication of verses, translations, or glosses, CC-BY-SA 4.0 requires you to give appropriate credit, provide a link to the license, and indicate whether changes were made.

Inline credit (short form):

Sanskrit text from sohamhamso (https://sohamhamso.org), released under CC-BY-SA 4.0. Source: GRETIL, Muktabodha Indological Research Institute. Translations AI-generated by sohamhamso; see methodology.

Full credit (long form, recommended for academic use):

Sanskrit verse and word-by-word glosses from sohamhamso (https://sohamhamso.org), dataset release vYYYY.MM.DD, DOI 10.5281/zenodo.PLACEHOLDER. Primary source: GRETIL (CC-BY 4.0). English translation AI-generated under the sohamhamso methodology (https://sohamhamso.org/about/methodology) and released under CC-BY-SA 4.0. Modifications: [describe].
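If you generate credits programmatically, the long form can be filled in from three values. The sketch below is illustrative only: full_credit is a hypothetical helper, and the release version, DOI, and description of modifications must come from the release you actually used.

```python
def full_credit(version: str, doi: str, modifications: str) -> str:
    """Fill in the long-form credit; every argument is supplied by the caller.

    Hypothetical helper, not shipped with sohamhamso. The wording follows
    the long-form template on this page.
    """
    return (
        "Sanskrit verse and word-by-word glosses from sohamhamso "
        f"(https://sohamhamso.org), dataset release {version}, DOI {doi}. "
        "Primary source: GRETIL (CC-BY 4.0). "
        "English translation AI-generated under the sohamhamso methodology "
        "(https://sohamhamso.org/about/methodology) and released under "
        f"CC-BY-SA 4.0. Modifications: {modifications}."
    )
```

Passing a concrete description of your changes as modifications covers the license requirement to indicate whether changes were made.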

BibTeX (placeholder)

The first Zenodo release will publish a real DOI; until then the following template uses a placeholder. The canonical citation block is regenerated on each tagged release and mirrored on /cite.

@dataset{sohamhamso_vYYYY_MM_DD,
  author       = {sohamhamso contributors},
  title        = {sohamhamso: Tantric Sanskrit canon dataset},
  year         = {YYYY},
  version      = {vYYYY.MM.DD},
  doi          = {10.5281/zenodo.PLACEHOLDER},
  publisher    = {Zenodo},
  url          = {https://doi.org/10.5281/zenodo.PLACEHOLDER},
  license      = {CC-BY-SA-4.0}
}
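For readers who want to fill in the template themselves once a DOI exists, the following sketch renders the same fields from a release version and DOI. It is an assumption about how such a block could be produced, not the project's actual release tooling; bibtex_entry is a hypothetical name, and the version is assumed to follow the vYYYY.MM.DD pattern shown above.

```python
def bibtex_entry(version: str, doi: str) -> str:
    """Render the citation template above for one release.

    Hypothetical helper. `version` is assumed to look like "vYYYY.MM.DD";
    the year and the citation key are derived from it.
    """
    year = version.lstrip("v").split(".")[0]
    key = "sohamhamso_" + version.replace(".", "_")
    fields = {
        "author": "sohamhamso contributors",
        "title": "sohamhamso: Tantric Sanskrit canon dataset",
        "year": year,
        "version": version,
        "doi": doi,
        "publisher": "Zenodo",
        "url": f"https://doi.org/{doi}",
        "license": "CC-BY-SA-4.0",
    }
    # Doubled braces emit literal "{" and "}" around each value.
    body = ",\n".join(
        f"  {name:<12} = {{{value}}}" for name, value in fields.items()
    )
    return f"@dataset{{{key},\n{body}\n}}"
```

Until the first Zenodo release, the DOI argument can only be the placeholder; prefer the canonical block on /cite once it is published.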

No endorsement

Attribution as required by the license does not constitute or imply endorsement by sohamhamso, by any upstream source, or by any contributor.

Last revised: 2026-05-31 · Full text: LICENSE · CC-BY-SA 4.0 legalcode: creativecommons.org.