This is a running list of my published research papers (some of which are non-archival). You can also see my profile on Semantic Scholar.

My research interests are broadly in computer science, linguistics, and their intersection. I have mostly worked on South Asian languages and English. In computational linguistics, I am particularly interested in meaning representations, whether through semantic supersense tagging and parsing or through applying syntactic formalisms to NLP for various languages. I also work on formal language theory, phonology, and (computational) historical linguistics.

: paper at a major NLP/CL conference or journal (e.g. *CL, LREC, TACL).
: paper at an NLP/CL workshop or smaller conference (e.g. SIGMORPHON, SyntaxFest).
: non-archival talk (e.g. SIGTYP abstract).
(none): preprint.


  1. Brett Reynolds, Aryaman Arora, Nathan Schneider. CGELBank: CGEL as a framework for English syntax annotation.
  2. Jordan Kodner, ..., Aryaman Arora, ..., Ekaterina Vylomova. SIGMORPHON–UniMorph 2022 Shared Task 0: Generalization and Typologically Diverse Morphological Inflection. SIGMORPHON.
  3. Khuyagbaatar Batsuren, Gábor Bella, Aryaman Arora, ..., Ekaterina Vylomova. The SIGMORPHON 2022 Shared Task on Morpheme Segmentation. SIGMORPHON.
  4. Aryaman Arora. Universal Dependencies for Punjabi. LREC.
  5. Aryaman Arora, Nitin Venkateswaran, Nathan Schneider. MASALA: Modelling and analysing the semantics of adpositions in linguistic annotation of Hindi. LREC. [code]
  6. Khuyagbaatar Batsuren*, Omer Goldman*, ..., Aryaman Arora, ..., Ekaterina Vylomova. UniMorph 4.0: Universal Morphology. LREC.
  7. Aryaman Arora, Clara Isabel Meister, Ryan Cotterell. Estimating the entropy of linguistic distributions. ACL. [code]
  8. Aryaman Arora, Adam Farris, Samopriya Basu, Suresh Kolichala. Computational historical linguistics and language diversity in South Asia. ACL.
  9. Aryaman Arora, Nathan Schneider, Brett Reynolds. A CGEL-formalism English treebank. MASC-SLL, Philadelphia, USA (April 30, 2022). [code]
  10. Adam Farris*, Aryaman Arora*. DIPI: Dependency Parsing for Ashokan Prakrit Historical Dialectology. Towards a comparative historical dialectology: evidence from morphology and syntax @ DGfS, Tübingen, Germany (February 23–25, 2022). [code]


  1. Adam Farris*, Aryaman Arora*. For the purpose of curry: A UD Treebank for Ashokan Prakrit. Universal Dependencies Workshop. [code]
  2. Aryaman Arora, Adam Farris, Gopalakrishnan R, Samopriya Basu. Bhāṣācitra: Visualising the dialect geography of South Asia. Workshop on Computational Approaches to Historical Language Change. [code]
  3. Aryaman Arora, Ahmed Etebari. Kholosi Dictionary.
  4. Aryaman Arora, Nitin Venkateswaran, Nathan Schneider. Adposition and case supersenses v1.0: Guidelines for Hindi–Urdu.
  5. Aryaman Arora, Nitin Venkateswaran, Nathan Schneider. SNACS annotation of case markers and adpositions in Hindi. SCiL. [code]


  1. Michael Kranzlein, Emma Manning, Siyao Peng, Shira Wein, Aryaman Arora, Nathan Schneider. PASTRIE: A corpus of prepositions annotated with supersense tags in Reddit International English. Linguistic Annotation Workshop. [code]
  2. Aryaman Arora, Nathan Schneider. SNACS annotation of case markers and adpositions in Hindi. SIGTYP. [code]
  3. Aryaman Arora, Luke Gessler, Nathan Schneider. Supervised grapheme-to-phoneme conversion of orthographic schwas in Hindi and Punjabi. ACL. [code] [slides]
  4. Aryaman Arora, John R. McIntyre. Quasi-passive lower and upper extremity robotic exoskeleton for strengthening human locomotion. Sustainable Innovation.