IAHLT

Products & Services

Through our GitHub, you can experiment with IAHLT's open-source annotated content and decide whether you would like to become an IAHLT member to access our large Hebrew & Arabic datasets and models.

Our products are available on the IAHLT GitHub.
 

Services and Tools for Hebrew & Arabic


Universal Dependencies (UD) is a framework for consistent annotation of grammar (parts of speech, morphological features, and syntactic dependencies) across different human languages. UD is an open community effort with over 300 contributors producing nearly 200 treebanks in over 100 languages.
 

IAHLT public contribution 

The UD Hebrew-IAHLTWiki treebank consists of 5,000 contemporary Hebrew sentences representing a variety of texts originating from Wikipedia entries:

https://github.com/UniversalDependencies/UD_Hebrew-IAHLTwiki
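
UD treebanks such as this one are distributed as CoNLL-U files: one token per line, ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), and a blank line between sentences. The following is a minimal sketch of reading that format with the Python standard library only. The `Token` class, the `parse_conllu` function, and the two-word sample sentence are illustrative assumptions written for this example; the sample is hand-made and not taken from the treebank.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """A subset of the ten CoNLL-U columns (hypothetical helper class)."""
    id: int
    form: str
    lemma: str
    upos: str
    head: int
    deprel: str


def parse_conllu(text: str) -> List[List[Token]]:
    """Parse CoNLL-U text into sentences of syntactic words.

    Comment lines (starting with '#'), multiword-token ranges (IDs such as
    '1-2') and empty nodes (IDs such as '3.1') are skipped.
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        if "-" in cols[0] or "." in cols[0]:
            continue
        current.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3], int(cols[6]), cols[7])
        )
    if current:
        sentences.append(current)
    return sentences


# Hand-made illustration (an assumption, not treebank data). The first surface
# word is a multiword token (range 1-2) split into a determiner and a noun.
SAMPLE = "\n".join([
    "# text = הילד רץ",
    "\t".join(["1-2", "הילד", "_", "_", "_", "_", "_", "_", "_", "_"]),
    "\t".join(["1", "ה", "ה", "DET", "_", "_", "2", "det", "_", "_"]),
    "\t".join(["2", "ילד", "ילד", "NOUN", "_", "_", "3", "nsubj", "_", "_"]),
    "\t".join(["3", "רץ", "רץ", "VERB", "_", "_", "0", "root", "_", "_"]),
])

if __name__ == "__main__":
    for token in parse_conllu(SAMPLE)[0]:
        print(token.id, token.form, token.upos, token.head, token.deprel)
```

Skipping the range lines keeps only the syntactic words, which are the units that carry part-of-speech tags and dependency relations. For real work, an established CoNLL-U reader is likely a better choice than this sketch.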

Named-entity recognition (NER), also known as (named) entity identification, entity chunking, or entity extraction, is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
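
To make "locate and classify" concrete, NER output is often stored as one tag per token and then converted into labeled spans. The sketch below assumes the common BIO tagging convention (`B-` begins an entity, `I-` continues it, `O` is outside any entity). The scheme, the function name `bio_to_spans`, and the example sentence are assumptions for illustration and are not a description of IAHLT's annotation format.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert BIO tags into (label, start, end) spans; end is exclusive."""
    spans: List[Tuple[str, int, int]] = []
    start = None
    label = None
    for i, tag in enumerate(tags):
        # The open entity continues only on an I- tag with the same label.
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside:
            # Close the open entity, if there is one.
            if label is not None:
                spans.append((label, start, i))
                start, label = None, None
            # B- opens a new entity; a stray I- is leniently treated the same.
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


if __name__ == "__main__":
    # Invented example sentence and tags.
    tokens = ["Dana", "Cohen", "visited", "Tel", "Aviv", "yesterday"]
    tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
    for label, start, end in bio_to_spans(tags):
        print(label, " ".join(tokens[start:end]))
```

Token-index spans with an exclusive end map directly onto Python slices, which keeps the lookup of the entity text simple.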

IAHLT Automatic Annotations Demos

IAHLT Open Source Use

| Name | Description | Use | URL |
| --- | --- | --- | --- |
| Sourcehut | Forge with hosting for git/mailing list/CI and more | Code hosting/continuous integration/mailing list/issue tracking | https://sourcehut.org |
| NECKar | Wikidata entity extractor | Entity linking and NE preannotation | https://event.ifi.uni-heidelberg.de/?page_id=532 |
| sacr | Coreference annotation tool | Coreference annotation | http://boberle.com/projects/sacr |
| trankit | UD parser and NE recognizer | UD parsing and NE recognition | https://github.com/nlp-uoregon/trankit |
| udpipe | Classical UD parser | Sentence segmentation (HE + AR) and lemmatization | https://github.com/ufal/udpipe |
| arborator | Universal Dependencies annotation tool | Annotation for UD | https://github.com/Arborator |
| Doccano | Named entity annotation tool | Named entity annotation | https://github.com/doccano |
| Grew | Graph-based corpus search tool | Corpus search and validation for lemmatization and UD | https://grew.fr |

Hebrew LLM Project

A joint project for a large, open, and powerful generative language model in Hebrew

The Israeli Association of Human Language Technologies (IAHLT)

The Dicta Center, in collaboration with MAFAT / the National Program

Intel

 

Please contact us for any content creation or annotation needs (text and audio).
