skillNer.skill_extractor_class.SkillExtractor

class skillNer.skill_extractor_class.SkillExtractor(nlp, skills_db, phraseMatcher, tranlsator_func=False)

Main class to annotate skills in a given text and visualize them.

Constructor of the class.

Parameters

nlp ([type]) – NLP object loaded from spaCy, for example with spacy.load().
skills_db ([type]) – A skill database used as a lookup table to annotate skills.
phraseMatcher ([type]) – The PhraseMatcher class imported from spacy.matcher.
tranlsator_func (Callable) – A function that translates text from the source language to English, with the signature def tranlsator_func(text_input: str) -> str. By default False.
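
The tranlsator_func parameter expects a plain callable that takes a string and returns a string. The sketch below only illustrates that shape: the glossary and the helper make_glossary_translator are hypothetical stand-ins, not part of skillNer, and a real setup would wrap an actual translation service instead.

```python
from typing import Callable, Dict

# Hypothetical word-level glossary, for illustration only.
FR_TO_EN: Dict[str, str] = {
    "maîtrise": "proficiency",
    "de": "of",
    "et": "and",
}


def make_glossary_translator(glossary: Dict[str, str]) -> Callable[[str], str]:
    """Build a str -> str callable that swaps known words and keeps the rest."""

    def translate(text_input: str) -> str:
        # Lowercase, split on whitespace, and replace each known word.
        words = text_input.lower().split()
        return " ".join(glossary.get(word, word) for word in words)

    return translate


translator = make_glossary_translator(FR_TO_EN)
translated = translator("Maîtrise de Python et SQL")
print(translated)

# Based on the constructor signature above, the callable would be passed as:
# SkillExtractor(nlp, SKILL_DB, PhraseMatcher, tranlsator_func=translator)
```

Any callable with the same str -> str shape should fit in the same slot; the closure is used here only to keep the glossary configurable.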

Methods

`__init__`(nlp, skills_db, phraseMatcher[, ...])	Constructor of the class.
`annotate`(text[, tresh])	To annotate a given text and thereby extract skills from it.
`describe`(annotations)	To display more details about the annotated skills.
`display`(results)	To display the annotated skills.

annotate(text: str, tresh: float = 0.5) → dict

To annotate a given text and thereby extract skills from it.

Parameters

text (str) – The target text.
tresh (float, optional) – A threshold used to select skills in case of confusion, by default 0.5.

Returns

A dictionary containing the text that was used and the annotated skills (see example).

Return type

dict

Examples

>>> import spacy
>>> from spacy.matcher import PhraseMatcher
>>> from skillNer.skill_extractor_class import SkillExtractor
>>> from skillNer.general_params import SKILL_DB
>>> nlp = spacy.load('en_core_web_sm')
>>> skill_extractor = SkillExtractor(nlp, SKILL_DB, PhraseMatcher)
loading full_matcher ...
loading abv_matcher ...
loading full_uni_matcher ...
loading low_form_matcher ...
loading token_matcher ...
>>> text = "Fluency in both english and french is mandatory"
>>> skill_extractor.annotate(text)
{'text': 'fluency in both english and french is mandatory',
 'results': {'full_matches': [],
  'ngram_scored': [{'skill_id': 'KS123K75YYK8VGH90NCS',
    'doc_node_id': [3],
    'doc_node_value': 'english',
    'type': 'lowSurf',
    'score': 1,
    'len': 1},
   {'skill_id': 'KS1243976G466GV63ZBY',
    'doc_node_id': [5],
    'doc_node_value': 'french',
    'type': 'lowSurf',
    'score': 1,
    'len': 1}]}}
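
The returned dictionary can be flattened into a simple list of skills with ordinary dictionary access. The sketch below works on the structure shown in the example output above. The helper collect_skills and its min_score parameter are hypothetical, and it assumes that entries under 'full_matches' carry the same keys as those under 'ngram_scored', which the example does not show.

```python
from typing import Dict, List


def collect_skills(annotations: Dict, min_score: float = 0.0) -> List[Dict]:
    """Flatten an annotate()-style result into a list of skill records."""
    results = annotations.get("results", {})
    collected = []
    for match_type in ("full_matches", "ngram_scored"):
        for match in results.get(match_type, []):
            # Assumption: an entry without a 'score' key counts as score 1.
            score = match.get("score", 1)
            if score >= min_score:
                collected.append({
                    "skill_id": match.get("skill_id"),
                    "value": match.get("doc_node_value"),
                    "match_type": match_type,
                    "score": score,
                })
    return collected


# Sample input copied from the example output above.
sample = {
    "text": "fluency in both english and french is mandatory",
    "results": {
        "full_matches": [],
        "ngram_scored": [
            {"skill_id": "KS123K75YYK8VGH90NCS", "doc_node_id": [3],
             "doc_node_value": "english", "type": "lowSurf", "score": 1, "len": 1},
            {"skill_id": "KS1243976G466GV63ZBY", "doc_node_id": [5],
             "doc_node_value": "french", "type": "lowSurf", "score": 1, "len": 1},
        ],
    },
}

skills = collect_skills(sample)
print([skill["value"] for skill in skills])
```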

display(results: dict)

To display the annotated skills. This method uses displacy, spaCy's built-in visualizer, to render the annotated text.

Parameters

results (dict) – The dictionary returned by applying .annotate() to a text.

Returns

Renders the text with annotated skills.

Return type

None
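
For readers who want custom rendering, the annotation dictionary can be converted into the input format that displacy accepts in manual mode. This is only a sketch and not how display() itself is implemented: the helper to_displacy_ents and the "SKILL" label are hypothetical, and it assumes that the token indices in 'doc_node_id' line up with whitespace-separated tokens of the returned text.

```python
import re
from typing import Dict


def to_displacy_ents(annotations: Dict, label: str = "SKILL") -> Dict:
    """Convert an annotate()-style result into displacy's manual 'ent' format."""
    text = annotations["text"]
    # Assumption: token i is the i-th whitespace-separated chunk of the text.
    spans = [m.span() for m in re.finditer(r"\S+", text)]
    ents = []
    for matches in annotations["results"].values():
        for match in matches:
            ids = match["doc_node_id"]
            ents.append({
                "start": spans[ids[0]][0],   # first character of the first token
                "end": spans[ids[-1]][1],    # one past the last token's final character
                "label": label,
            })
    ents.sort(key=lambda ent: ent["start"])
    return {"text": text, "ents": ents}


# Trimmed sample based on the annotate() example.
sample = {
    "text": "fluency in both english and french is mandatory",
    "results": {
        "full_matches": [],
        "ngram_scored": [
            {"doc_node_id": [3], "doc_node_value": "english"},
            {"doc_node_id": [5], "doc_node_value": "french"},
        ],
    },
}

doc_data = to_displacy_ents(sample)
print(doc_data["ents"])

# With spaCy installed, the result can be rendered with:
# from spacy import displacy
# displacy.render(doc_data, style="ent", manual=True)
```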