named_entity_recognition_features module

features.named_entity_recognition_features.built_spacy_ner(text, target, type)

Returns a training tuple containing the sentence, the named entity with its position in the sentence, and its label.

Inspired by https://dataknowsall.com/blog/ner.html

Parameters:
  • text (str) – The message (utterance) that contains the named entity.

  • target (str) – The named entity.

  • type (str) – The entity type (e.g. PERSON, ORG, LOC, PRODUCT, LANGUAGE, etc.)

Returns:

The message and a dictionary of its identified named entities, each with its start and end character positions and its entity type.

Return type:

Tuple
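
The return value described above resembles the (text, annotations) tuples commonly used as spaCy NER training data. The following is a minimal sketch of how such a tuple might be assembled, not the module's actual implementation. The helper name build_training_example, the use of str.find to locate the entity, and the "entities" dictionary key are all assumptions made for illustration.

```python
def build_training_example(text, target, entity_type):
    """Hypothetical helper: pair a sentence with one annotated entity span.

    Assumes the entity appears verbatim in the text; only the first
    occurrence is annotated.
    """
    start = text.find(target)  # character offset where the entity begins
    if start == -1:
        raise ValueError(f"{target!r} does not occur in the text")
    end = start + len(target)  # exclusive end offset
    # The "entities" key is an assumption based on the common spaCy format.
    return (text, {"entities": [(start, end, entity_type)]})


example = build_training_example("Alice works at Acme.", "Acme", "ORG")
print(example)
```

The end offset is exclusive, so text[start:end] recovers the entity string.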

features.named_entity_recognition_features.calculate_named_entities(text, cutoff)

Identifies the named entities in a message whose confidence scores exceed the cutoff.

Inspired by https://support.prodi.gy/t/accessing-probabilities-in-ner/94

Parameters:
  • text (str) – The message (utterance) for which we are counting named entities.

  • cutoff (int) – The confidence threshold for each named entity.

Returns:

The list of all named entities in a message and their confidence scores

Return type:

List

features.named_entity_recognition_features.named_entities(text, cutoff)

Returns a tuple of (named entity, confidence score) pairs for all named entities in a message.

Parameters:
  • text (str) – The message (utterance) for which we are counting named entities.

  • cutoff (int) – The confidence threshold for each named entity.

Returns:

A tuple of (named entity, confidence score) tuples.

Return type:

tuple

features.named_entity_recognition_features.num_named_entity(text, cutoff)

Returns the number of named entities in a message.

Parameters:
  • text (str) – The message (utterance) for which we are counting named entities.

  • cutoff (int) – The confidence threshold for each named entity.

Returns:

Number of named entities in a message

Return type:

int
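
The relationship between the cutoff, the tuple of (named entity, confidence score) pairs, and the entity count can be sketched as follows. This is an illustration only: the helper filter_entities, the strict "greater than" comparison, and the example scores are hypothetical, and the sketch assumes confidence scores are floats between 0 and 1. How the module actually obtains scores from the model is not shown here.

```python
def filter_entities(scored_entities, cutoff):
    """Hypothetical helper: keep the entities whose score exceeds the cutoff."""
    return tuple(
        (entity, score)
        for entity, score in scored_entities
        if score > cutoff  # strict comparison is an assumption
    )


# Made-up (entity, confidence) pairs standing in for model output.
scored = [("Alice", 0.98), ("Acme", 0.91), ("Monday", 0.42)]

kept = filter_entities(scored, cutoff=0.9)  # tuple of (entity, score) tuples
count = len(kept)                           # number of entities above the cutoff
print(kept, count)
```

Under these assumptions, the count is simply the length of the filtered tuple.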

features.named_entity_recognition_features.train_spacy_ner(training)

Trains a model on a user-provided dataframe of example sentences and the named entity that appears in each sentence.

Inspired by https://dataknowsall.com/blog/ner.html

Parameters:

training (pd.DataFrame) – The user-provided training dataframe.

Returns: