Abstract: This repository contains the datasets and results for the paper "Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis".

GitHub repository for the code: Quantifying Language Confusion

DATA contains the following datasets:
i) raw language graphs;
ii) the language similarities calculated from the language graphs;
iii) MTEI: files from the experimental results of multilingual inversion attacks, and the language confusion entropy calculated from that data;
iv) LCB: files from the language confusion benchmark, and the language confusion entropy calculated from that data.

Results contains aggregated results for further analysis:
i) inversion_language_confusion: results from MTEI;
ii) prompting_language_confusion: results from LCB.