|
1 |
| -# Lexical Inference dataset - previous version |
| 1 | +# Danish Lexical Inference Datasets |
2 | 2 |
|
3 |
| -This is the first un-curated version of the lexical inference dataset. This version constituted the basis for the lexical inference experiments performed in the papers: |
| 3 | +The entailment datasets consist of lists of statements, where each line contains:
| 4 | +- Two true statements encompassing features of hyponymy and inheritance
| 5 | +- An additional, similar statement that follows them
| 6 | +- A label denoting whether this last statement is *true* or *false*
| 7 | +- Information regarding the ontological types and/or relations being tested in the given set of statements
4 | 8 |
|
5 |
| -Bolette Pedersen, Nathalie Sørensen, Sussi Olsen, Sanni Nimb & Simon Gray. 2024. Towards a Danish Semantic Reasoning Benchmark - Compiled from Lexical-Semantic Resources for Assessing Selected Language Understanding Capabilities of Large Language Models. In *Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)*, p. 16353–16363, Torino, Italia. ELRA and ICCL |
| 9 | +The task intended for the language model is to answer whether the third statement is true or false. |
6 | 10 |
|
7 |
| -and |
| 11 | +This dataset was developed as part of the Danish Reasoning Benchmark. To cite, please use the following citation:
8 | 12 |
|
9 |
| -Bolette S. Pedersen, Nathalie C. Hau Sørensen, Sussi Olsen & Sanni Nimb. 2024. Evaluering af sprogforståelsen i danske sprogmodeller – med |
10 |
| -udgangspunkt i semantiske ordbøger. In *NyS – Nydanske Sprogstudier, vol. 65, p. 8-40. DOI 10.7146/nys.v1i65.143072*. |
| 13 | +Bolette Pedersen, Nathalie Sørensen, Sussi Olsen, Sanni Nimb, and Simon Gray. 2024. |
| 14 | +[Towards a Danish Semantic Reasoning Benchmark - Compiled from Lexical-Semantic Resources for Assessing Selected Language Understanding Capabilities of Large Language Models](https://aclanthology.org/2024.lrec-main.1421/). |
| 15 | +In *Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)*, pages 16353–16363, Torino, Italia. ELRA and ICCL. |
11 | 16 |
|
12 |
| -In the updated version, various errors present in this version have been corrected. Furthermore, the data in “inference-point-in-time.txt” was substituted with more concise terms for “points in time”. |
| 17 | + |
| 18 | + |
| 19 | +All data are derived from [the Danish WordNet, DanNet](https://wordnet.dk/dannet/page/frontpage). |
| 20 | + |
| 21 | +To cite, please use the following citation: |
| 22 | +Pedersen et al. 2009. DanNet: the challenge of compiling a wordnet for Danish by reusing a monolingual dictionary. *Language Resources and Evaluation*, 43, 269–299. [DOI: 10.1007/s10579-009-9092-1](https://doi.org/10.1007/s10579-009-9092-1). |
| 23 | + |
| 24 | +# Content |
| 25 | +The datasets are composed based on the four Qualia Roles defined by J. Pustejovsky (in *The Generative Lexicon*. 1998, Cambridge, MA: MIT Press): |
| 26 | +- Agentive role (how a concept came about) |
| 27 | +- Constitutive role (part-whole relation of a concept) |
| 28 | +- Formal role (the taxonomical classification of a concept) |
| 29 | +- Telic role (the function of a concept)
| 30 | + |
| 31 | +Test instances are generated from a generic template constructed for each ontological type under each qualia role. |
| 32 | +For instance, for the telic role (function) with the ontological type Instrument, we use the template |
| 33 | + |
| 34 | +*Man bruger en X til at Y med*\ |
| 35 | +(you use an X for Y-ing).
| 36 | + |
| 37 | +We negate a selected number of utterances and contrast them with examples from different parts of the ontology, while always keeping track of the truth value.
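
The template-based generation described above could be sketched roughly as follows. This is only an illustration, not the procedure used to build the datasets: the word pairs, the negated template with *ikke*, the `Instance` class, and the `generate` function are all assumptions introduced for this example. Only the affirmative template comes from the text.

```python
from dataclasses import dataclass

# Affirmative template quoted in the README (telic role, type Instrument).
TEMPLATE = "Man bruger en {x} til at {y} med"
# Assumed negated variant, for illustration only.
NEGATED_TEMPLATE = "Man bruger ikke en {x} til at {y} med"

# Hypothetical instrument/function pairs; not taken from the datasets.
PAIRS = [("hammer", "hamre"), ("nål", "sy")]


@dataclass(frozen=True)
class Instance:
    statement: str
    label: bool  # truth value tracked alongside the statement


def generate(pairs):
    """Build true, negated, and contrastive instances for each pair."""
    instances = []
    for i, (x, y) in enumerate(pairs):
        # True statement: the instrument with its own function.
        instances.append(Instance(TEMPLATE.format(x=x, y=y), True))
        # Negation of a true statement is labelled false.
        instances.append(Instance(NEGATED_TEMPLATE.format(x=x, y=y), False))
        # Contrast: borrow the function of the next pair in the list.
        _, other_y = pairs[(i + 1) % len(pairs)]
        if other_y != y:
            instances.append(Instance(TEMPLATE.format(x=x, y=other_y), False))
    return instances


if __name__ == "__main__":
    for inst in generate(PAIRS):
        print(inst.label, inst.statement)
```

Storing the label next to each generated statement is one simple way to keep track of the truth value when statements are negated or recombined.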
| 38 | + |
| 39 | +# License |
| 40 | +[CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) |
| 41 | + |
| 42 | +Credit: [Centre for Language Technology (CST), University of Copenhagen](https://cst.ku.dk/english/) |
| 43 | + |
| 44 | +Contact: Bolette Sandford Pedersen (bspedersen @ hum.ku.dk) |