My research interests are broadly in computer science, linguistics, and their intersection. In computational linguistics/NLP, I am particularly interested in meaning representations, whether that is semantic supersense tagging and parsing or applying syntactic formalisms to NLP. There is something incredibly satisfying about structured representations of meaning inspired by linguistics that I find lacking in our deep-learning-dominated field.

This is a running list of my published research papers (some of which are non-archival). You can also see my profile on Semantic Scholar.

: paper at a major NLP/CL conference or journal (e.g. *CL, LREC, TACL).
: paper at an NLP/CL workshop or smaller conference (e.g. SIGMORPHON, Syntaxfest).
: non-archival talk (e.g. SIGTYP abstract).


  1. Khuyagbaatar Batsuren, Gábor Bella, Aryaman Arora, ..., Ekaterina Vylomova. The SIGMORPHON 2022 Shared Task on Morpheme Segmentation. SIGMORPHON.
  2. Aryaman Arora. Universal Dependencies for Punjabi. LREC.
  3. Aryaman Arora, Nitin Venkateswaran, Nathan Schneider. MASALA: Modelling and Analysing the Semantics of Adpositions in Linguistic Annotation of Hindi. LREC. [code]
  4. Khuyagbaatar Batsuren*, Omer Goldman*, ..., Aryaman Arora, ..., Ekaterina Vylomova. UniMorph 4.0: Universal Morphology. LREC.
  5. Aryaman Arora, Clara Isabel Meister, Ryan Cotterell. Estimating the entropy of linguistic distributions. ACL. [code]
  6. Aryaman Arora, Adam Farris, Samopriya Basu, Suresh Kolichala. Computational historical linguistics and language diversity in South Asia. ACL.
  7. Aryaman Arora, Nathan Schneider, Brett Reynolds. A CGEL-formalism English treebank. MASC-SLL, Philadelphia, USA (April 30, 2022). [code]
  8. Adam Farris*, Aryaman Arora*. DIPI: Dependency Parsing for Ashokan Prakrit Historical Dialectology. Towards a comparative historical dialectology: evidence from morphology and syntax @ DGfS, Tübingen, Germany (February 23–25, 2022). [code]


  1. Adam Farris*, Aryaman Arora*. For the Purpose of Curry: A UD Treebank for Ashokan Prakrit. Universal Dependencies Workshop. [code]
  2. Aryaman Arora, Adam Farris, Gopalakrishnan R, Samopriya Basu. Bhāṣācitra: Visualising the dialect geography of South Asia. Workshop on Computational Approaches to Historical Language Change. [code]
  3. Aryaman Arora, Ahmed Etebari. Kholosi Dictionary.
  4. Aryaman Arora, Nitin Venkateswaran, Nathan Schneider. Adposition and Case Supersenses v1.0: Guidelines for Hindi–Urdu.
  5. Aryaman Arora, Nitin Venkateswaran, Nathan Schneider. SNACS Annotation of Case Markers and Adpositions in Hindi. SCiL. [code]


  1. Michael Kranzlein, Emma Manning, Siyao Peng, Shira Wein, Aryaman Arora, Nathan Schneider. PASTRIE: A Corpus of Prepositions Annotated with Supersense Tags in Reddit International English. Linguistic Annotation Workshop. [code]
  2. Aryaman Arora, Nathan Schneider. SNACS Annotation of Case Markers and Adpositions in Hindi. SIGTYP. [code]
  3. Aryaman Arora, Luke Gessler, Nathan Schneider. Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi. ACL. [code] [slides]
  4. Aryaman Arora, John R. McIntyre. Quasi-Passive Lower and Upper Extremity Robotic Exoskeleton for Strengthening Human Locomotion. Sustainable Innovation.