I am currently a fifth-year Ph.D. candidate in Computational Linguistics at Georgetown University, where I belong to NERT, Corpling, PICoL, and the larger GUCL community. My research interests center on computational psycholinguistics, in particular (1) language acquisition and pretraining dynamics, (2) language processing, and (3) interpretability. I find it very exciting to apply these lines of work to modeling human language cognition.
I am also interested in linguistic formalisms that capture linguistic phenomena at various levels. For example, I am developing a novel approach to Combinatory Categorial Grammar (CCG) supertagging with Nathan Schneider, and expanding Rhetorical Structure Theory (RST) by creating and examining an anchored RST corpus with Amir Zeldes.
Prior to joining Georgetown, I obtained a B.A. in Liberal Arts from Soka University of America, where my main focus was Psychology and Economics (although I wrote my thesis on a topic related to Second Language Acquisition!). I then pursued an M.A. in Teaching English to Speakers of Other Languages (TESOL) at Michigan State University, where I was co-advised by Sandra Deshors and Kristen Johnson. My M.A. thesis quantified how Chinese and Japanese speakers' use of English articles deviates from that of native speakers, using a modeling technique called Multifactorial Prediction and Deviation Analysis with Regressions (MuPDAR).
Most recent publications on Google Scholar.
Modeling Nonnative Sentence Processing with L2 Language Models.
Tatsuya Aoyama and Nathan Schneider
In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024, main long).
GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains.
Yang Janet Liu*, Tatsuya Aoyama*, Wesley Scivetti*, Yilun Zhu*, Shabnam Behzad, Lauren Levine, Jessica Lin, Devika Tiwari and Amir Zeldes.
In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024, main long).
eRST: A Signaled Graph Theory of Discourse Relations and Organization.
Amir Zeldes, Tatsuya Aoyama, Yang Janet Liu, Siyao Peng, Debopam Das, and Luke Gessler
Computational Linguistics, 1-47.
Identifying Fairness Issues in Automatically Generated Testing Content.
Kevin Stowe, Benny Longwill, Alyssa Francis, Tatsuya Aoyama, Debanjan Ghosh, Swapna Somasundaran
The 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA), the North American Chapter of the Association for Computational Linguistics (NAACL) 2024.
J-SNACS: Adposition and Case Supersenses for Japanese Joshi.
Tatsuya Aoyama, Chihiro Taguchi, and Nathan Schneider
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
DISRPT: A Multilingual, Multi-domain, Cross-framework Benchmark for Discourse Processing.
Chloé Braud, Amir Zeldes, Laura Rivière, Yang Janet Liu, Philippe Muller, Damien Sileo, and Tatsuya Aoyama
The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
What’s Hard in RST Parsing? Predictive Models for Error Analysis.
Yang Janet Liu, Tatsuya Aoyama, and Amir Zeldes
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), Prague, Czechia. Association for Computational Linguistics.
GENTLE: A Genre-Diverse Multilayer Challenge Set for English NLP and Linguistic Evaluation.
Tatsuya Aoyama, Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Janet Liu, Siyao Peng, Yilun Zhu, and Amir Zeldes
The 17th Linguistic Annotation Workshop (LAW-XVII), Association for Computational Linguistics (ACL) 2023
Corpus-Based Investigation of the Markedness and Frequency of Japanese Passives in Contemporary Written Japanese.
Tatsuya Aoyama
Society for Computation in Linguistics (SCiL) 2023
Comparing Native and Learner Englishes Using a Large Pre-trained Language Model.
Tatsuya Aoyama
11th Natural Language Processing for Computer-Assisted Language Learning (NLP4CALL) workshop 2022
Probe-Less Probing of BERT’s Layer-Wise Linguistic Knowledge with Masked Word Prediction.
Tatsuya Aoyama, Nathan Schneider
North American Chapter of the Association for Computational Linguistics Student Research Workshop (NAACL-SRW 2022)
Revisiting Layer-Wise Linguistic Knowledge with Masked Word Prediction.
Tatsuya Aoyama, Nathan Schneider
Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2022)
International Students’ Willingness to Communicate in English as a Second Language: Effects of L2 Self-Confidence, Acculturation, and Motivational Types.
Tatsuya Aoyama, Tomoko Takahashi
Journal of International Students 10(3)
A Corpus-based Multifactorial Analysis of Japanese and Chinese Speakers’ English Article Use: Quantifying the Deviation Using MuPDAR.
Tatsuya Aoyama
Unpublished M.A. Thesis
Second Language Research Forum (SLRF 2020) (October 2020): A Corpus-based Multifactorial Analysis of Japanese and Chinese Learners’ English Article Use: Quantifying the Deviation Using MuPDAR
The Asian Conference on Language (ACL 2020) (March 2020): Japanese ESL Students’ Willingness to Communicate in English: The Effects of L2 Self-Confidence, Acculturation, and Motivational Types
MSU LLT 860 Guest Lecture (February 2020): Universal Grammar and Second Language Acquisition
LING/COSC-572 (Spring 2023, Spring 2024): Empirical Methods in Natural Language Processing
LING-001 (Fall 2021, Spring 2022): Introduction to Language
JPN-102 (Spring 2019): Elementary Japanese II
JPN-101 (Fall 2018, Fall 2019): Elementary Japanese I
This website uses the design and template by Martin Saveski.