Publications

2024

  1. arXiv
    Linear recency bias during training improves Transformers’ fit to reading times
    Christian Clark, Byung-Doh Oh, and William Schuler
    arXiv, 2024
  2. EMNLP
    Leading whitespaces of language models’ subword vocabulary pose a confound for calculating word probabilities
    Byung-Doh Oh and William Schuler
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
  3. EACL
    Frequency explains the inverse correlation of large language models’ size, training data amount, and surprisal’s fit to reading times
    Byung-Doh Oh, Shisen Yue, and William Schuler
    In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023

  1. EMNLP Findings
    Transformer-based language model surprisal predicts human reading times best with about two billion training tokens
    Byung-Doh Oh and William Schuler
    In Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
  2. ACL
    Token-wise decomposition of autoregressive language model hidden states for analyzing model predictions
    Byung-Doh Oh and William Schuler
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023
  3. TACL
    Why does surprisal from larger Transformer-based language models provide a poorer fit to human reading times?
    Byung-Doh Oh and William Schuler
    Transactions of the Association for Computational Linguistics, 2023
  4. HSP
    On the bigger-is-worse nature of pre-trained language model surprisal
    Byung-Doh Oh and William Schuler
    In the 36th Annual Conference on Human Sentence Processing, 2023
  5. HSP
    Memory-based predictors from GPT-2 attention predict reading times over surprisal
    Byung-Doh Oh and William Schuler
    In the 36th Annual Conference on Human Sentence Processing, 2023

2022

  1. EMNLP
    Entropy- and distance-based predictors from GPT-2 attention patterns predict reading times over and above GPT-2 surprisal
    Byung-Doh Oh and William Schuler
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
  2. DARPA Risers
    Unified unsupervised grammar induction for typologically diverse languages
    Byung-Doh Oh
    In DARPA Risers, 2022
  3. FAI
    Comparison of structural parsers and neural language models as surprisal estimators
    Byung-Doh Oh, Christian Clark, and William Schuler
    Frontiers in Artificial Intelligence, 2022

2021

  1. EMNLP Findings
    Character-based PCFG induction for modeling the syntactic acquisition of morphologically rich languages
    Lifeng Jin, Byung-Doh Oh, and William Schuler
    In Findings of the Association for Computational Linguistics: EMNLP 2021, 2021
  2. EMNLP Findings
    Coreference-aware surprisal predicts brain response
    Evan Jaffe, Byung-Doh Oh, and William Schuler
    In Findings of the Association for Computational Linguistics: EMNLP 2021, 2021
  3. ACL
    Surprisal estimators for human reading times need character models
    Byung-Doh Oh, Christian Clark, and William Schuler
    In Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
  4. CMCL
    Contributions of propositional content and syntactic category information in sentence processing
    Byung-Doh Oh and William Schuler
    In Proceedings of the 11th Workshop on Cognitive Modeling and Computational Linguistics, 2021
  5. CMCL
    Team Ohio State at CMCL 2021 shared task: Fine-tuned RoBERTa for eye-tracking data prediction
    Byung-Doh Oh
    In Proceedings of the 11th Workshop on Cognitive Modeling and Computational Linguistics, 2021
  6. CUNY
    Comparison of structural and neural language models as surprisal estimators
    Byung-Doh Oh, Christian Clark, and William Schuler
    In the 34th Annual CUNY Conference on Human Sentence Processing, 2021
  7. CUNY
    Contributions of propositional content and syntactic categories in sentence processing
    Byung-Doh Oh and William Schuler
    In the 34th Annual CUNY Conference on Human Sentence Processing, 2021

2019

  1. SIGMORPHON
    THOMAS: The hegemonic OSU morphological analyzer using seq2seq
    Byung-Doh Oh, Pranav Maneriker, and Nanjiang Jiang
    In Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2019
  2. JLM
    Modeling morphological learning, typology, and change: What can the neural sequence-to-sequence framework contribute?
    Micha Elsner, Andrea D. Sims, Alexander Erdmann, and 15 more authors
    Journal of Language Modelling, 2019
  3. AIMM
    The role of learnability in morphological change: A computational approach
    Evan Jaffe and Byung-Doh Oh
    In the Fourth American International Morphology Meeting, 2019

2018

  1. Eng Tea
    Exploring English online research and comprehension strategies of Korean college students
    Byung-Doh Oh and Youngsoon So
    English Teaching, 2018

2017

  1. FLER
    Predicting L2 writing proficiency with computational indices based on n-grams
    Byung-Doh Oh
    Foreign Language Education Research, 2017