lexical_features_v2 module
file: lexical_features_v2.py — A faster version of the lexical_features.py file.
- features.lexical_features_v2.get_liwc_count(regex, chat)
Count the number of LIWC lexicon words in a message.
- Parameters:
regex (str) – The regular expression for the lexicon.
chat (str) – The message (utterance) being analyzed.
- Returns:
The number of lexicon words present in the message
- Return type:
float
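As an illustration of the behavior described above, here is a minimal sketch of counting lexicon matches with a regular expression. The helper name `count_lexicon_matches`, the case-insensitive matching, and the example pattern are assumptions made for illustration; they are not taken from the module's source, and the real `get_liwc_count` may differ.

```python
import re


def count_lexicon_matches(regex: str, chat: str) -> float:
    """Hypothetical stand-in for get_liwc_count: count regex matches in one message."""
    # Case-insensitive matching is an assumption, not confirmed by the docs.
    matches = re.findall(regex, chat, flags=re.IGNORECASE)
    # The documented return type is float, so the count is cast accordingly.
    return float(len(matches))


# Made-up lexicon pattern: the word "happy" or any word starting with "joy".
example_regex = r"\b(?:happy|joy\w*)\b"
n_matches = count_lexicon_matches(example_regex, "I am happy and joyful, so Happy")
print(n_matches)
```

A non-capturing group `(?:...)` is used so that `re.findall` returns one entry per whole match.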
- features.lexical_features_v2.liwc_features(chat_df: DataFrame, message_col_original: str, custom_liwc_dictionary: dict = {}) DataFrame
Takes the chat-level input DataFrame and computes lexical features (the number of words in each message from a given lexicon, such as LIWC).
- Parameters:
chat_df (pd.DataFrame) – A pandas DataFrame of the chat-level features. Should contain a ‘message’ column.
message_col_original (str) – The name of the column containing the message text.
custom_liwc_dictionary (dict) – The user’s custom LIWC dictionary.
- Returns:
Dataframe of the lexical features stacked as columns.
- Return type:
pd.DataFrame
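To make the "stacked as columns" output concrete, the following sketch builds one count column per lexicon from a message column. It assumes pandas is installed. The helper `lexicon_count_columns`, the lexicon names, and the patterns are hypothetical and only approximate the idea; they are not the implementation of `liwc_features`, and the structure of a real `custom_liwc_dictionary` may be different.

```python
import re

import pandas as pd

# Hypothetical lexicons mapping a feature name to a regex (not real LIWC content).
lexicons = {
    "positive_words": r"\b(?:happy|great|love\w*)\b",
    "negative_words": r"\b(?:sad|bad|hate\w*)\b",
}


def lexicon_count_columns(chat_df: pd.DataFrame, message_col: str, lexicons: dict) -> pd.DataFrame:
    """Return a DataFrame with one match-count column per lexicon."""
    counts = {}
    for name, pattern in lexicons.items():
        compiled = re.compile(pattern, re.IGNORECASE)  # compile once per lexicon
        # Count the matches in every message; str() guards against non-string cells.
        counts[name] = chat_df[message_col].apply(
            lambda text, rx=compiled: len(rx.findall(str(text)))
        )
    # Each lexicon becomes a column, aligned with the input rows.
    return pd.DataFrame(counts, index=chat_df.index)


chat_df = pd.DataFrame(
    {"message": ["I love this, great work", "Bad idea, I hate it", "ok"]}
)
features = lexicon_count_columns(chat_df, "message", lexicons)
print(features)
```

The result has the same index as `chat_df`, so it can be joined back onto the chat-level data with `pd.concat([chat_df, features], axis=1)`.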