ACL 2020

ACL 2020 was the most exciting event in my life so far as a proto-computational linguist. This was my first conference, and my first published research paper. I never thought I would be doing something like this when I started out with computer science (something that was fun and exciting), and when I started out with linguistics (something that became a hobby as I sought to improve my Hindi). I learned, I grew, all while meeting some fantastic people who gave me advice that I will pay heed to as I continue forward.

I had a great time livetweeting in Hindi and English as well as joining the myriad extra-conference sessions.

And despite the conference being virtual and all the hassles that entailed, I felt I was able to get to know a lot of people and felt welcomed by the *CL community.

I decided, in order to protect the important memories that my brain will discard eventually, to write up some of the things I have learned as well as my thoughts on the conference itself. Future me will certainly find it useful, but I hope others do too.

My Paper, Getting There

Our paper (my coauthors were Luke Gessler and Dr. Nathan Schneider) at the conference was "Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi". Basically, we focused on the problem of schwa deletion in Hindi and Punjabi. That is the primary issue in text-to-speech for those languages, and we proposed a highly successful machine learning solution.

I've been very lucky to get to attend a conference of this magnitude and quality even before my freshman year of college. The reality is that lots of things aligned for me to be where I am. I'm especially thankful to have a great mentor in Dr. Nathan Schneider, who has made me super excited to be in this field and been very willing to teach me, an absolute beginner, as well as great people to work with like Luke.

So when I got the chance to see the brightest people in the field gathered together sharing all their ground-breaking work? Of course I wanted to make the most of it!

Papers, Talks

My original idea about the conference was that the paper talks would be the most important part of it. I don't think that's entirely true anymore—socializing and talking to interesting people is the real benefit. Regardless, I learned immeasurably from the paper talks. There was no talk that did not teach me something new, and often I learned about whole subfields that I had not heard of before. (I admittedly gave up some family Netflix time to binge-watch the talks.) It was honestly the most exciting thing to me.

For the talks I enjoyed the most I tried to livetweet them in Hindi under #acl2020hi. (I was assigned the Lexical Semantics track for that but I livetweeted pretty much anything interesting.) This was beneficial in a couple ways:

  1. It forced me to pay attention to what was going on and take notes. Surprisingly, I don't like saying stupidly wrong things publicly!
  2. It let me practice my Hindi in an engaging way. I'm somewhere between a native and heritage speaker so I always take the opportunity to get better. I also got to come up with some cool words like अंतःस्थापन "embedding".

The combined mental effort of both of these meant I could get the most out of every talk and I didn't descend into mindless NLP Netflix. On the other hand, it made me very tired and I'm not sure it would be as feasible at a physical conference.

Nevertheless, I would advise future me to always take notes at conferences. I think it's important to maximize how much I get out of events like these, especially ACL because it's only a couple days of the year. And obviously, it's super exciting to watch (hopefully "attend" in future years) the talks but remembering what was exciting is pretty great too.

As for the contents of the talks themselves... they were just so cool! I hope to be as excited for future conferences as I was for this one. Some of my favourite papers were:

I generally liked the way the virtual site was organised. It was very easy to navigate, and the attached RocketChat channels made asking questions very easy. The only real disappointment of the conference system was the Q&A Zoom session format, which seemed redundant and clunky, especially relative to the RocketChat. It's still unclear to me what an hour-long Q&A was supposed to do. The ones I went to were sparsely attended, and no one showed up to one of mine. Maybe some kind of live presentation would have been more interesting?

Meetings, People

The people I met at ACL were awesome and very friendly. I can't think of any community I've been in that is as welcoming, and *CL is certainly what I would want any community that I am a part of to emulate. Everyone I've talked to has been genuinely interested; even in the RocketChat, people would drop by and say nice things about our paper (as well as ask great questions). We even got livetweeted about in Nepali and Hindi!

रिसर्च शीर्षक - Supervised grapheme-to-phoneme conversion of orthographic schwas in Hindi and Punjabi

अनुसन्धान कर्ता - @aryaman2020 @LukeGessler @complingy

यहाँ पुरा पढनुहोस् - https://t.co/NpnqO5iSC1 #acl2020nlp #acl2020ne @aclmeeting

— Oyashi (@oya163) July 9, 2020

#acl2020nlp #acl2020hi अरोरा, आर्यमन : Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi

शोध पत्र : https://t.co/y6Y2T4nXYy #acl2020nlp #acl2020hi (1/n)

— Rahul Mittal (@rahul14mittal) July 6, 2020

The Birds of a Feather session that I was able to attend was on Lexical Semantics, organized by Tiago Torrent. It was quite interesting to see how meaning and structure cooperate and all the research questions that arise from that (as well as to meet the people who work on it), and I'm sorry I wasn't able to attend more of those sessions. I also enjoyed meeting the extended academic family of Noah Smith; wonderful people at the University of Chicago and McGill University at a meeting organized by Jackie C. K. Cheung; the very cool people who attended the ACL Undergraduate Panel led by Sasha Rush, Suchin Gururangan, and Sabrina J. Mielke (although I was too shy to ask questions in that one, it was still very informative); other people interested in Low-Resource NLP at the session organized by Ibrahim Sharaf; and Kyle Gorman at SIGMORPHON. It was nice to see all the varied experiences and interests in the community.

Some of the people who gave me really great advice were Rishi Bommasani, Nelson Liu, Tiago Torrent, and Luca Soldaini. I had many great short conversations with several other people as well. I learned a lot from them and these conversations will without a doubt inform the future steps in my career. It really made me happy to be part of a community with these kinds of people.

So, to future me, I say talk to people and don't be afraid to start the conversation. It's definitely the most rewarding part of a conference, to be surrounded by people with the same interests yet with different (read: more) experiences and knowledge to give.

Future, Present

Where do I stand now? What do I take away from this? If there's been any consistent thread tying all my conversations, all the advice I got, all that I learned here, it's that trying new things is important. Diversity of experience in the present moment is the most valuable thing I can have. It's too early to worry about what form my career will take in a decade, what subfield I'll specialize in, or even what will happen after my undergrad. I have four formative years in front of me in which I can figure out what is exciting for me and what I want to do in the years that follow, so it makes sense that I explore all my options. I'm sure research is the path I will stay on, but that doesn't preclude other fields like software engineering, data science, historical linguistics, and (of course!) all the non-academic portions of a college education.

One thing to consider is how much I want to remain working on South Asian languages. It seems the cutting edge of the field is squarely in the courtyard of English, which is concerning. But I foresee my work being the most important when it deals with languages like my own, because they sorely need it. This is probably something I will continue having to figure out.

And what about the field? It seems, as someone very new, that there is a reckoning arriving in NLP/CL. People are frustrated with black box models, the increasingly massive data that powers them, and just a general feeling of not knowing what is actually going on in the field. Sure, the metrics are up and there is money to be made from those performance gains. But the fundamental questions of linguistics have not been addressed at all by NLP/CL. How do we, as human beings, generalize the rules of language so well? How do we understand languages? How does language even exist? And, most unnervingly, what are computational systems even learning that makes these metrics go up? For a new researcher, this was a great time to have a Theme track at ACL that examines this from above. I think it's an exciting time to be in the field, to be grappling with these questions that don't have any clear answers awaiting us.

And so, I say ACL 2020 was a success, not only in how it occurred but in what I personally got from it. I really look forward to attending more conferences in the field (and out of the field!) and to growing as a researcher. I believe that the future is generally bright. I think good things are happening in the field. And I'm excited beyond measure at the prospect of having something to contribute to it.