temporal_features module

features.temporal_features.get_time_diff(df, on_column, conversation_id_col, timestamp_unit)

Obtains the time difference between messages, assuming there is only a single timestamp column representing the time of each utterance.

Parameters:
  • df (pd.DataFrame) – A pandas DataFrame of chat-level features.

  • on_column (str) – The name of the timestamp column.

  • conversation_id_col (str) – A string representing the column name that should be selected as the unique conversation identifier.

  • timestamp_unit (str) – A string representing the unit of a timestamp. Defaults to ‘ms’.

Returns:

A column representing the time difference between messages.

Return type:

pd.Series
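
As a rough illustration of the single-timestamp case, the sketch below computes per-conversation gaps with pandas. It is not the library's implementation: the function name time_diff_sketch, the column names "conversation_num" and "timestamp", the choice to report gaps in seconds, and the choice to give the first message of each conversation a gap of 0 are all assumptions made for this example.

```python
import pandas as pd


def time_diff_sketch(df, on_column, conversation_id_col, timestamp_unit="ms"):
    """Hypothetical stand-in for get_time_diff; not the library's code."""
    # Interpret the numeric timestamps using the given unit (e.g. "ms").
    ts = pd.to_datetime(df[on_column], unit=timestamp_unit)
    # Gap to the previous message within the same conversation.
    gaps = ts.groupby(df[conversation_id_col]).diff()
    # Assumption: report seconds, and treat each conversation's first message as 0.
    return gaps.dt.total_seconds().fillna(0.0)


# Toy chat-level data with two conversations (column names are made up).
chats = pd.DataFrame({
    "conversation_num": ["a", "a", "a", "b", "b"],
    "timestamp": [0, 1500, 4000, 100, 600],
})
chats["time_diff"] = time_diff_sketch(chats, "timestamp", "conversation_num")
print(chats)
```

Grouping by the conversation identifier before differencing keeps the first message of one conversation from being compared against the last message of another.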

features.temporal_features.get_time_diff_startend(df, timestamp_start, timestamp_end, conversation_id_col, timestamp_unit)

Obtains the time difference between messages, assuming there are two timestamp columns, one representing the start of a message and one representing the end of a message.

Currently assumes that the start and end columns are named “timestamp_start” and “timestamp_end”, although this should be made more generalizable in a future commit.

Parameters:
  • df (pd.DataFrame) – A pandas DataFrame of chat-level features.

  • timestamp_start (str) – A string representing the column name that should be selected as the start timestamp.

  • timestamp_end (str) – A string representing the column name that should be selected as the end timestamp.

  • conversation_id_col (str) – A string representing the column name that should be selected as the conversation ID.

  • timestamp_unit (str) – A string representing the unit of a timestamp. Defaults to ‘ms’.

Returns:

A column representing the time difference between messages.

Return type:

pd.Series
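
One plausible reading of the start/end case is that the gap for a message runs from the end of the previous message to the start of the current one, within the same conversation. The sketch below follows that reading, which is an assumption rather than something the documentation confirms. The function name time_diff_startend_sketch, the "conversation_num" column, the output in seconds, and the 0 assigned to each conversation's first message are likewise assumptions for illustration only.

```python
import pandas as pd


def time_diff_startend_sketch(df, timestamp_start, timestamp_end,
                              conversation_id_col, timestamp_unit="ms"):
    """Hypothetical stand-in for get_time_diff_startend; not the library's code."""
    start = pd.to_datetime(df[timestamp_start], unit=timestamp_unit)
    end = pd.to_datetime(df[timestamp_end], unit=timestamp_unit)
    # End time of the previous message in the same conversation.
    prev_end = end.groupby(df[conversation_id_col]).shift(1)
    # Assumption: gap = current start minus previous end, reported in seconds.
    gaps = (start - prev_end).dt.total_seconds()
    # Assumption: the first message of each conversation has no predecessor, so use 0.
    return gaps.fillna(0.0)


# Toy data with explicit start and end times per message (names are made up).
turns = pd.DataFrame({
    "conversation_num": ["a", "a", "a", "b", "b"],
    "timestamp_start": [0, 2000, 5000, 100, 900],
    "timestamp_end": [1000, 4000, 6000, 400, 1200],
})
turns["time_diff"] = time_diff_startend_sketch(
    turns, "timestamp_start", "timestamp_end", "conversation_num"
)
print(turns)
```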