How to use dh-unibe/turmbuecher-ner-v1 with Flair:

```python
from flair.models import SequenceTagger

tagger = SequenceTagger.load("dh-unibe/turmbuecher-ner-v1")
```
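Once the tagger is loaded, tagging a text could look roughly like the following minimal sketch. The helper names `spans_to_tuples` and `tag_text` are hypothetical and not part of the model card. The sketch assumes a recent Flair version in which `Sentence`, `SequenceTagger.predict`, `Sentence.get_spans`, `Span.get_label`, and `SequenceTagger.label_type` are available.

```python
from typing import Iterable, List, Tuple


def spans_to_tuples(spans: Iterable, label_type: str) -> List[Tuple[str, str, float]]:
    """Convert Flair spans into plain (text, tag, confidence) tuples."""
    results = []
    for span in spans:
        label = span.get_label(label_type)  # Flair Label with .value and .score
        results.append((span.text, label.value, label.score))
    return results


def tag_text(text: str, model_name: str = "dh-unibe/turmbuecher-ner-v1"):
    """Hypothetical helper: load the tagger and return the entities found in `text`."""
    # Imported lazily so that spans_to_tuples can be used without Flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_name)  # downloads the model on first use
    sentence = Sentence(text)
    tagger.predict(sentence)  # annotates the sentence in place
    label_type = tagger.label_type  # avoids hard-coding the label name
    return spans_to_tuples(sentence.get_spans(label_type), label_type)
```

Calling `tag_text("Hans Müller von Bern")` (an invented example sentence) should return a list of tuples, one per predicted entity. For many sentences it is cheaper to load the tagger once and reuse it rather than calling `tag_text` repeatedly.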
Turmbücher NER
A named entity recognition (NER) model for historical German, developed by Ismail Prada Ziegler as part of a research project at the University of Bern, Digital Humanities.
Performance
| | PER | ORG | LOC | Micro-Avg |
|---|---|---|---|---|
| Precision | 82.46% | 28.81% | 88.51% | 81.21% |
| Recall | 88.51% | 44.74% | 83.02% | 83.99% |
| F1-Score | 85.38% | 35.05% | 85.67% | 82.57% |
Note: ORG tags were too inconsistent in the training data, so the model performs poorly on this class.
Initial experiments indicate that the model also performs reasonably well on automatically transcribed text (character error rate, CER, of around 5%).
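The F1 scores in the table are the harmonic mean of precision and recall. As a quick sanity check, the sketch below recomputes them from the reported precision and recall values. The function name `f1_score` and the `reported` dictionary are illustrative only, and the inputs are the rounded percentages from the table, not the raw counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same unit, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the table.
reported = {
    "PER": (82.46, 88.51, 85.38),
    "ORG": (28.81, 44.74, 35.05),
    "LOC": (88.51, 83.02, 85.67),
    "Micro-Avg": (81.21, 83.99, 82.57),
}

for label, (p, r, f1) in reported.items():
    print(f"{label}: recomputed F1 = {f1_score(p, r):.2f}% (reported {f1:.2f}%)")
```

The recomputed values agree with the table to within about 0.01 percentage points. Small differences of that size are what one would expect when starting from already rounded precision and recall.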
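For readers unfamiliar with CER (character error rate): it is commonly defined as the character-level edit distance between a transcription and its reference, divided by the length of the reference. The model card does not state how its CER was measured, so the following is only a sketch of that common definition, with hypothetical function names.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of hypothesis against reference, as a fraction."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Under this definition, a CER of around 5% corresponds to roughly one character error per 20 reference characters.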
Data Set
Main data set: Berner Turmbücher, early volumes from the 16th century, Early New High German, 61k tokens of training data.
Secondary data sets:
- SSRQ - Fribourg, language model + tagging, 59k tokens.
- Chorgerichtsmanuale (unpublished), language model + tagging, 76k tokens.
- Königsfelden Charters, language model, 623k tokens.
- Talgerichtsprotokolle (unpublished), language model, 438k tokens.
Notice
This project is still in progress.