Tatsuya Aoyama (青山達也)

Research Scientist, Meta

ta571 [AT] georgetown.edu

Bio

Welcome to my website! My name is Tatsuya Aoyama (青山達也), and I am a research scientist at Meta. I obtained a Ph.D. in Computational Linguistics from Georgetown University, where I was a member of NERT, Corpling, PICoL, and the larger GUCL community. I was very fortunate to have worked with wonderful professors at Georgetown: Nathan, Amir, and Ethan. Broadly, my research interests are centered around language modeling. In particular, I am interested in language models' (1) pretraining dynamics, (2) (mechanistic) interpretability, and (3) application to cognitive science.

Previously, I obtained a B.A. in Liberal Arts from Soka University of America, where my main focus was Psychology and Economics. I then received an M.A. in Teaching English to Speakers of Other Languages (TESOL) from Michigan State University.

Publications

Most recent publications on Google Scholar.

2025
EMNLP

Unpacking Let Alone: Human-Scale Models Generalize to a Rare Construction in Form but not Meaning. PDF

Wesley Scivetti, Tatsuya Aoyama, Ethan Wilcox and Nathan Schneider

In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025, main long).

ACL

Language Models Grow Less Humanlike beyond Phase Transition. PDF Code

Tatsuya Aoyama and Ethan Wilcox

In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025, main long).

ACL

Anything Goes? A Crosslinguistic Study of (Im)possible Language Learning in LMs. PDF Code

Xiulin Yang, Tatsuya Aoyama, Yuekun Yao and Ethan Wilcox

In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025, main long).

2024
EMNLP

Modeling Nonnative Sentence Processing with L2 Language Models. PDF Code

Tatsuya Aoyama and Nathan Schneider

In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024, main long).

EMNLP

GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains. PDF Code

Yang Janet Liu*, Tatsuya Aoyama*, Wesley Scivetti*, Yilun Zhu*, Shabnam Behzad, Lauren Levine, Jessica Lin, Devika Tiwari and Amir Zeldes.

In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024, main long).

CL

eRST: A Signaled Graph Theory of Discourse Relations and Organization. PDF

Amir Zeldes, Tatsuya Aoyama, Yang Janet Liu, Siyao Peng, Debopam Das, and Luke Gessler

Computational Linguistics, 1-47.

BEA

Identifying Fairness Issues in Automatically Generated Testing Content. PDF Code

Kevin Stowe, Benny Longwill, Alyssa Francis, Tatsuya Aoyama, Debanjan Ghosh, Swapna Somasundaran

The 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA), the North American Chapter of the Association for Computational Linguistics (NAACL) 2024.

LREC-COLING

J-SNACS: Adposition and Case Supersenses for Japanese Joshi. PDF Code

Tatsuya Aoyama, Chihiro Taguchi, and Nathan Schneider

The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

LREC-COLING

DISRPT: A Multilingual, Multi-domain, Cross-framework Benchmark for Discourse Processing. PDF Code

Chloé Braud, Amir Zeldes, Laura Rivière, Yang Janet Liu, Philippe Muller, Damien Sileo, and Tatsuya Aoyama

The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

2023
SIGDIAL

What’s Hard in RST Parsing? Predictive Models for Error Analysis. PDF Code

Yang Janet Liu, Tatsuya Aoyama, and Amir Zeldes

Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), Prague, Czechia. Association for Computational Linguistics.

LAW

GENTLE: A Genre-Diverse Multilayer Challenge Set for English NLP and Linguistic Evaluation. PDF Code

Tatsuya Aoyama, Shabnam Behzad, Luke Gessler, Lauren Levine, Jessica Lin, Yang Janet Liu, Siyao Peng, Yilun Zhu, and Amir Zeldes

The 17th Linguistic Annotation Workshop (LAW-XVII), Association for Computational Linguistics (ACL) 2023

SCiL

Corpus-Based Investigation of the Markedness and Frequency of Japanese Passives in Contemporary Written Japanese. PDF

Tatsuya Aoyama

Society for Computation in Linguistics (SCiL) 2023

2022
NLP4CALL

Comparing Native and Learner Englishes Using a Large Pre-trained Language Model. PDF

Tatsuya Aoyama

11th Natural Language Processing for Computer-Assisted Language Learning (NLP4CALL) workshop 2022

NAACL-SRW

Probe-Less Probing of BERT’s Layer-Wise Linguistic Knowledge with Masked Word Prediction. PDF

Tatsuya Aoyama, Nathan Schneider

North American Chapter of the Association for Computational Linguistics Student Research Workshop (NAACL-SRW 2022)

MASC-SLL

Revisiting Layer-Wise Linguistic Knowledge with Masked Word Prediction.

Tatsuya Aoyama, Nathan Schneider

Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2022)

2020
JIS

International Students’ Willingness to Communicate in English as a Second Language: Effects of L2 Self-Confidence, Acculturation, and Motivational Types. PDF

Tatsuya Aoyama, Tomoko Takahashi

Journal of International Students 10(3)

MA thesis

A Corpus-based Multifactorial Analysis of Japanese and Chinese Speakers’ English Article Use: Quantifying the Deviation Using MuPDAR. PDF

Tatsuya Aoyama

Unpublished M.A. Thesis

Experience

  • Meta Platforms, Inc. 2025-
    Research Scientist
    GenAI
  • JPMorgan Chase & Co. Summer 2024
    Summer Associate, Data Science & AI Associate Program
    Machine Learning Center of Excellence (MLCOE)
  • ETS, AI Lab Summer 2023
    NLP Research Intern
  • Posh Technologies Summer 2022
    NLP Research Intern
  • Money Forward, Inc. Summer 2021
    NLP Engineer Intern
  • Georgetown University 2020 - 2025
    Ph.D. in Computational Linguistics
    Advised by Nathan Schneider and Amir Zeldes
  • Center for Applied Linguistics Summer 2019
    Research Intern
  • Michigan State University 2018 - 2020
    M.A. in Teaching English to Speakers of Other Languages (TESOL)
    Advised by Sandra Deshors and Kristen Johnson
  • Soka University of America 2014 - 2018
    B.A. in Liberal Arts
    Advised by Tomoko Takahashi

Talks

Second Language Research Forum (SLRF 2020) (October 2020): A Corpus-based Multifactorial Analysis of Japanese and Chinese Learners’ English Article Use: Quantifying the Deviation using MuPDAR

The Asian Conference on Language (ACL 2020) (March 2020): Japanese ESL Students’ Willingness to Communicate in English: The Effects of L2 Self-Confidence, Acculturation, and Motivational Types

MSU LLT 860 Guest Lecture (February 2020): Universal Grammar and Second Language Acquisition

Teaching

JPN-102 (Spring 2019): Elementary Japanese II

JPN-101 (Fall 2018, Fall 2019): Elementary Japanese I

Acknowledgement

This website uses the design and template by Martin Saveski.