lexical_features_v2 module

file: lexical_features_v2.py — A faster version of the lexical_features.py file.

features.lexical_features_v2.get_liwc_count(regex, chat)

Count the number of LIWC lexicon words in a message.

Parameters:
  • regex (str) – The regular expression for the lexicon.

  • chat (str) – The message (utterance) being analyzed.

Returns:

The number of lexicon words present in the message

Return type:

float
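The counting step described above can be sketched as follows. This is a minimal illustration and not the module's actual implementation: the helper name `count_lexicon_matches`, the example regex, and the choice of `re.findall` are all assumptions made for the example.

```python
import re


def count_lexicon_matches(regex: str, chat: str) -> float:
    """Hypothetical stand-in for get_liwc_count: count regex matches in one message."""
    # Treat missing or empty messages as containing no lexicon words.
    if not isinstance(chat, str) or not chat:
        return 0.0
    # Each non-overlapping match of the lexicon regex counts as one lexicon word.
    return float(len(re.findall(regex, chat)))


# Illustrative lexicon regex (an assumption, not an actual LIWC category).
pronoun_regex = r"\b(?:i|we|you)\b"
count = count_lexicon_matches(pronoun_regex, "we think you and i agree")
```

The count is returned as a float to match the documented return type.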

features.lexical_features_v2.liwc_features(chat_df: DataFrame, message_col_original: str, custom_liwc_dictionary: dict = {}) → DataFrame

Takes the chat-level input dataframe and computes lexical features (the number of words in each message that belong to a given lexicon, such as LIWC).

Parameters:
  • chat_df (pd.DataFrame) – A pandas dataframe of the chat-level features. Should contain a ‘message’ column.

  • message_col_original (str) – The name of the column containing the message text.

  • custom_liwc_dictionary (dict) – The user’s custom LIWC dictionary.

Returns:

Dataframe of the lexical features stacked as columns.

Return type:

pd.DataFrame
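As a rough sketch of what "lexical features stacked as columns" could look like, the example below applies one regex per lexicon to a message column and returns one count column per lexicon. The function name `lexicon_feature_columns`, the `lexicons` mapping, and the output column names are hypothetical; the real `liwc_features` may structure its input lexicons and name its output columns differently.

```python
import re

import pandas as pd


def lexicon_feature_columns(
    chat_df: pd.DataFrame, message_col: str, lexicons: dict
) -> pd.DataFrame:
    """Hypothetical sketch: one count column per lexicon, aligned to chat_df's index."""
    features = {}
    for name, regex in lexicons.items():
        pattern = re.compile(regex)
        # Count regex matches in each message; non-string values are coerced to str.
        features[name] = chat_df[message_col].apply(
            lambda message, p=pattern: float(len(p.findall(str(message))))
        )
    return pd.DataFrame(features, index=chat_df.index)


# Illustrative input; the lexicon names and regexes are assumptions.
chat_df = pd.DataFrame({"message": ["we agree", "no thanks", "you and i"]})
lexicons = {
    "pronoun": r"\b(?:i|we|you)\b",
    "negation": r"\b(?:no|not|never)\b",
}
lexical_df = lexicon_feature_columns(chat_df, "message", lexicons)
```

Because the result shares the input's index, it can be joined back onto the chat-level dataframe with `pd.concat([chat_df, lexical_df], axis=1)`.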