Publications and presentations

For an up-to-date list, also check my Google Scholar page.

Bunzeck, B., Duran, D., Schade, L., & Zarrieß, S. (2024). Small Language Models Like Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas. https://arxiv.org/abs/2410.01487

Bunzeck, B., & Zarrieß, S. (2024). Fifty shapes of BLiMP: Syntactic learning curves in language models are not uniform, but sometimes unruly. Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning, 39–55. https://aclanthology.org/2024.clasp-1.7/
Bunzeck, B., & Zarrieß, S. (2024). The SlayQA benchmark of social reasoning: Testing gender-inclusive generalization with neopronouns. Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP, 42–53. https://aclanthology.org/2024.genbench-1.3/

Bunzeck, B., & Zarrieß, S. (2023). GPT-wee: How Small Can a Small Language Model Really Get? Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning, 7–18. https://aclanthology.org/2023.conll-babylm.2/
Bunzeck, B., & Zarrieß, S. (2023). Entrenchment matters: Investigating positional and constructional sensitivity in small and large language models. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), 25–37. https://aclanthology.org/2023.clasp-1.3
Druskat, S., Krause, T., Lachenmaier, C., & Bunzeck, B. (2023). Hexatomic: An extensible, OS-independent platform for deep multi-layer linguistic annotation of corpora. Journal of Open Source Software, 8(86), 4825. https://doi.org/10.21105/joss.04825
Wojcik, P., Bunzeck, B., & Zarrieß, S. (2023). The Wikipedia Republic of Literary Characters. Journal of Cultural Analytics, 8(2). https://doi.org/10.22148/001c.70251

Constructions in child-directed speech (with Holger Diessel), (peer-reviewed oral presentation), 10th International Conference of the German Cognitive Linguistics Association, Osnabrück University (Germany)
Generating authentic child speech from little data, (poster presentation), NLG in the Lowlands 2024, Bielefeld University (Germany)

GPT-wee: Experiments in downscaling and curriculum learning, (poster presentation), SAIL Workshop on Fundamental Limits of Large Language Models, Bielefeld University (Germany)
From Byte to Babel: Large Language Models and the Tower of Linguistic Knowledge, (peer-reviewed oral presentation), META-LING 2023 - Methodological Exploration and Technological Advances in Linguistics, University of Bamberg (Germany)
Where and How Do Literary Characters Figure in Wikipedia? (with Sina Zarrieß), (invited presentation), International Workshop | Wikipedia, Wikidata and Wikibase: Usage Scenarios for Literary Studies, Free University of Berlin (Germany)