Publications and presentations

For an up-to-date list, also check my Google Scholar page.

Preprints

2024

  • Bunzeck, B., Duran, D., Schade, L., & Zarrieß, S. (2024). Small Language Models Like Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas. https://arxiv.org/abs/2410.01487

Publications (peer-reviewed)

2024

  • Bunzeck, B., & Zarrieß, S. (2024). Fifty shapes of BLiMP: Syntactic learning curves in language models are not uniform, but sometimes unruly. Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning, 39–55. https://aclanthology.org/2024.clasp-1.7/

  • Bunzeck, B., & Zarrieß, S. (2024). The SlayQA benchmark of social reasoning: Testing gender-inclusive generalization with neopronouns. Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP, 42–53. https://aclanthology.org/2024.genbench-1.3/

2023

  • Bunzeck, B., & Zarrieß, S. (2023). GPT-wee: How Small Can a Small Language Model Really Get? Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, 7–18. https://aclanthology.org/2023.conll-babylm.2/

  • Bunzeck, B., & Zarrieß, S. (2023). Entrenchment matters: Investigating positional and constructional sensitivity in small and large language models. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), 25–37. https://aclanthology.org/2023.clasp-1.3

  • Druskat, S., Krause, T., Lachenmaier, C., & Bunzeck, B. (2023). Hexatomic: An extensible, OS-independent platform for deep multi-layer linguistic annotation of corpora. Journal of Open Source Software, 8(86), 4825. https://doi.org/10.21105/joss.04825

  • Wojcik, P., Bunzeck, B., & Zarrieß, S. (2023). The Wikipedia Republic of Literary Characters. Journal of Cultural Analytics, 8(2). https://doi.org/10.22148/001c.70251

Presentations

2024

  • Constructions in child-directed speech (with Holger Diessel) (peer-reviewed oral presentation), 10th International Conference of the German Cognitive Linguistics Association, Osnabrück University (Germany)

  • Generating authentic child speech from little data (poster presentation), NLG in the Lowlands 2024, Bielefeld University (Germany)

2023

  • GPT-wee: Experiments in downscaling and curriculum learning (poster presentation), SAIL Workshop on Fundamental Limits of Large Language Models, Bielefeld University (Germany)

  • From Byte to Babel: Large Language Models and the Tower of Linguistic Knowledge (peer-reviewed oral presentation), META-LING 2023 - Methodological Exploration and Technological Advances in Linguistics, University of Bamberg (Germany)

  • Where and How Do Literary Characters Figure in Wikipedia? (with Sina Zarrieß) (invited presentation), International Workshop | Wikipedia, Wikidata and Wikibase: Usage Scenarios for Literary Studies, Free University of Berlin (Germany)