skillNer.skill_extractor_class.SkillExtractor
- class skillNer.skill_extractor_class.SkillExtractor(nlp, skills_db, phraseMatcher, tranlsator_func=False)
Main class to annotate skills in a given text and visualize them.
Constructor of the class.
- Parameters
nlp ([type]) – NLP object loaded from spaCy.
skills_db ([type]) – A skill database used as a lookup table to annotate skills.
phraseMatcher ([type]) – A PhraseMatcher loaded from spaCy.
tranlsator_func (Callable, optional) – A function that translates text from the source language to English, with the signature def tranlsator_func(text_input: str) -> str, by default False
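As a minimal sketch of a callable that fits the tranlsator_func signature described above: the glossary and its word-by-word lookup are hypothetical stand-ins for a real translation service or model, and are not part of skillNer.

```python
# Hypothetical toy glossary (French -> English); an assumption for illustration only.
FR_TO_EN = {
    "maîtrise": "proficiency",
    "de": "in",
    "obligatoire": "mandatory",
}


def tranlsator_func(text_input: str) -> str:
    """Translate text to English word by word, leaving unknown words unchanged."""
    words = text_input.lower().split()
    return " ".join(FR_TO_EN.get(word, word) for word in words)


print(tranlsator_func("Maîtrise de Python obligatoire"))

# The callable would then be passed through the constructor's keyword argument:
# skill_extractor = SkillExtractor(nlp, SKILL_DB, PhraseMatcher,
#                                  tranlsator_func=tranlsator_func)
```

The constructor call is left commented out because it needs spaCy, a language model, and skillNer to be installed.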
Methods
__init__(nlp, skills_db, phraseMatcher[, ...]) – Constructor of the class.
annotate(text[, tresh]) – To annotate a given text and thereby extract skills from it.
describe(annotations) – To display more details about the annotated skills.
display(results) – To display the annotated skills.
- annotate(text: str, tresh: float = 0.5) → dict
To annotate a given text and thereby extract skills from it.
- Parameters
text (str) – The target text.
tresh (float, optional) – A threshold used to select skills in case of confusion, by default 0.5
- Returns
A dictionary containing the text that was used and the annotated skills (see example).
- Return type
dict
Examples
>>> import spacy
>>> from spacy.matcher import PhraseMatcher
>>> from skillNer.skill_extractor_class import SkillExtractor
>>> from skillNer.general_params import SKILL_DB
>>> nlp = spacy.load('en_core_web_sm')
>>> skill_extractor = SkillExtractor(nlp, SKILL_DB, PhraseMatcher)
loading full_matcher ...
loading abv_matcher ...
loading full_uni_matcher ...
loading low_form_matcher ...
loading token_matcher ...
>>> text = "Fluency in both english and french is mandatory"
>>> skill_extractor.annotate(text)
{'text': 'fluency in both english and french is mandatory',
 'results': {'full_matches': [],
             'ngram_scored': [{'skill_id': 'KS123K75YYK8VGH90NCS',
                               'doc_node_id': [3],
                               'doc_node_value': 'english',
                               'type': 'lowSurf',
                               'score': 1,
                               'len': 1},
                              {'skill_id': 'KS1243976G466GV63ZBY',
                               'doc_node_id': [5],
                               'doc_node_value': 'french',
                               'type': 'lowSurf',
                               'score': 1,
                               'len': 1}]}}
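The returned dictionary can be flattened into a simple list of skills. The sketch below works on the sample output shown in the example, so it runs without spaCy or skillNer; the helper name list_skills is hypothetical, and it assumes that entries under 'full_matches' carry the same 'doc_node_value', 'skill_id' and 'score' keys as those under 'ngram_scored'.

```python
# Sample output copied from the annotate() example in this page.
annotations = {
    "text": "fluency in both english and french is mandatory",
    "results": {
        "full_matches": [],
        "ngram_scored": [
            {"skill_id": "KS123K75YYK8VGH90NCS", "doc_node_id": [3],
             "doc_node_value": "english", "type": "lowSurf", "score": 1, "len": 1},
            {"skill_id": "KS1243976G466GV63ZBY", "doc_node_id": [5],
             "doc_node_value": "french", "type": "lowSurf", "score": 1, "len": 1},
        ],
    },
}


def list_skills(annotations: dict) -> list:
    """Collect (matched text, skill id, score) for every annotated skill.

    Hypothetical helper; assumes both result lists share the same keys.
    """
    results = annotations["results"]
    matches = results["full_matches"] + results["ngram_scored"]
    return [(m["doc_node_value"], m["skill_id"], m["score"]) for m in matches]


for value, skill_id, score in list_skills(annotations):
    print(f"{value}: {skill_id} (score={score})")
```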
- display(results: dict)
To display the annotated skills. This method renders the annotated text with spaCy's built-in visualizer, displacy.
- Parameters
results (dict) – The dictionary returned by applying .annotate() to a text.
- Returns
Renders the text with annotated skills.
- Return type
None
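To illustrate the kind of input displacy can render, the sketch below converts an annotation dictionary into displacy's manual entity format. This is not the library's implementation of display(): the helper name to_displacy_ents and the "SKILL" label are hypothetical, and it assumes that 'doc_node_id' holds contiguous, ascending indices of whitespace-separated tokens in 'text', which holds for the sample shown but may not match spaCy's tokenization in general.

```python
def to_displacy_ents(annotations: dict, label: str = "SKILL") -> dict:
    """Build a dict usable with displacy.render(..., style="ent", manual=True)."""
    text = annotations["text"]

    # Character span (start, end) of each whitespace-separated token.
    spans = []
    cursor = 0
    for token in text.split():
        start = text.index(token, cursor)
        end = start + len(token)
        spans.append((start, end))
        cursor = end

    results = annotations["results"]
    ents = []
    for match in results["full_matches"] + results["ngram_scored"]:
        ids = match["doc_node_id"]
        # Span runs from the first matched token's start to the last one's end.
        ents.append({"start": spans[ids[0]][0],
                     "end": spans[ids[-1]][1],
                     "label": label})

    ents.sort(key=lambda ent: ent["start"])
    return {"text": text, "ents": ents, "title": None}


# Trimmed sample based on the annotate() example in this page.
sample = {
    "text": "fluency in both english and french is mandatory",
    "results": {
        "full_matches": [],
        "ngram_scored": [
            {"skill_id": "KS123K75YYK8VGH90NCS", "doc_node_id": [3],
             "doc_node_value": "english", "score": 1},
            {"skill_id": "KS1243976G466GV63ZBY", "doc_node_id": [5],
             "doc_node_value": "french", "score": 1},
        ],
    },
}

doc = to_displacy_ents(sample)
print(doc["ents"])

# With spaCy installed, the dict could be rendered like this:
# from spacy import displacy
# html = displacy.render(doc, style="ent", manual=True)
```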