Summarize Text

Generate a shorter version of a text while preserving its important information. Use this enrichment to quickly understand the content of a long text, such as a news article or a research paper, or to reduce the number of words in your original text while retaining the most important points in a clear and concise manner. Use the Prompt Template to guide the text generation. In the template, the $field_source_column notation refers to the source column, and the $column_<column_name> notation refers to any other column in your dataset.
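The placeholder notation can be illustrated with a minimal Python sketch. It assumes the placeholders behave like plain string substitution and uses Python's string.Template as a stand-in; how Mantium actually fills the template is not documented here. The render_prompt helper and the sample row are hypothetical.

```python
from string import Template

# Hypothetical row; the column names are illustrative only.
row = {
    "title": "COVID-19 cases continue to rise",
    "content": "Health officials have reported an increase in cases.",
}


def render_prompt(template: str, row: dict, source_column: str) -> str:
    """Fill a prompt template for one row (assumed behaviour, not Mantium's code)."""
    # The source column is exposed as $field_source_column.
    values = {"field_source_column": row[source_column]}
    # Every other column is exposed as $column_<column_name>.
    for name, value in row.items():
        if name != source_column:
            values[f"column_{name}"] = value
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template).safe_substitute(values)


prompt = render_prompt(
    "Summarize this document titled '$column_title': $field_source_column Summary:",
    row,
    source_column="content",
)
print(prompt)
```

Here $column_title pulls in the title column next to the source text, which is one way a prompt might give the model extra context.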

Parameters

  • Source Column: The column name containing the text you want to summarize. Defaults to content.
  • Destination Column: The column name that holds the summary. Defaults to summary.
  • LLM: The large language model used for text summarization. Defaults to gpt-3.5-turbo.
  • Prompt Template: The template that guides the generation of the summary. Defaults to Summarize this document: $field_source_column Summary:.
  • Credential ID: The connector from your Mantium account. This is a required field.
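As a rough illustration, the parameters and their defaults could be modelled as a small Python data class. This is only a sketch: the class name SummarizeTextConfig and its field names are hypothetical and are not part of any Mantium API. The defaults are the ones listed above, and Credential ID is the only field without one.

```python
from dataclasses import dataclass


@dataclass
class SummarizeTextConfig:
    """Hypothetical container mirroring the parameters listed above."""

    credential_id: str  # required: no default value
    source_column: str = "content"
    destination_column: str = "summary"
    llm: str = "gpt-3.5-turbo"
    prompt_template: str = "Summarize this document: $field_source_column Summary:"

    def __post_init__(self) -> None:
        # Reject an empty credential, since the field is required.
        if not self.credential_id:
            raise ValueError("credential_id is required")


# Only the required field is supplied; everything else falls back to its default.
config = SummarizeTextConfig(credential_id="OpenAI")
print(config)
```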

Usage

To use the Summarize Text transformation, you need a valid API key configured in Mantium for the third-party service (e.g., OpenAI) you want to use. If you don't have one, see the guide here.

To use this Mantium Enrichment, follow these steps:

  1. Configure the Source Column parameter by selecting the column containing the text you want to summarize.
  2. Configure the Destination Column parameter by specifying the new name for the column that will hold the summarized text.
  3. Configure the LLM parameter by selecting the large language model to use for the summarization.
  4. Optional: To change how the text is summarized, edit the text in the Prompt Template field to request a specific summarization method, for example: Summarize this document _into bullet points_: $field_source_column Summary:
  5. Configure the Credential ID parameter by selecting the appropriate credential from the list of available credentials in your Mantium account.
  6. Run the transformation by clicking the Save and Run Transforms button. The resulting dataset will have a new column, named after the Destination Column parameter, that contains the summary of the text in the source column for each row.
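Conceptually, the steps above amount to a row-by-row operation: fill the prompt, send it to the model, and store the reply in the destination column. The Python sketch below shows that flow under stated assumptions. summarize_rows and fake_llm are hypothetical names, string.Template stands in for the real placeholder handling, and fake_llm merely returns the first sentence of the document so the example runs without an API key. It does not reflect Mantium's internals.

```python
from string import Template
from typing import Callable


def summarize_rows(
    rows: list[dict],
    source_column: str,
    destination_column: str,
    prompt_template: str,
    llm: Callable[[str], str],
) -> list[dict]:
    """Return copies of the rows with a new destination column added."""
    template = Template(prompt_template)
    result = []
    for row in rows:
        # Fill the placeholder with this row's source text.
        prompt = template.safe_substitute(field_source_column=row[source_column])
        new_row = dict(row)  # leave the input rows unmodified
        new_row[destination_column] = llm(prompt)
        result.append(new_row)
    return result


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: echo the first sentence of the document."""
    # Drop the instruction before the first colon and the trailing "Summary:" cue.
    document = prompt.split(":", 1)[1].rsplit("Summary:", 1)[0].strip()
    return document.split(". ")[0].rstrip(".") + "."


rows = [
    {"content": "A new study has found that regular exercise can help. "
                "It involved 1,000 participants."},
]
out = summarize_rows(
    rows,
    source_column="content",
    destination_column="summarized_text",
    prompt_template="Summarize this document into bullet points: $field_source_column Summary:",
    llm=fake_llm,
)
print(out[0]["summarized_text"])
```

Passing the model as a function keeps the sketch testable. A real call to a hosted LLM would replace fake_llm.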

Example 1: Summarizing news articles

Suppose you have a dataset of news articles that you want to summarize. Summarization can help you quickly get an idea of the key topics and information contained within the articles, allowing you to identify important news stories or trends. You can use the Summarize Text transformation for this.

| title | content |
| --- | --- |
| COVID-19 cases continue to rise | Health officials have reported an increase in the number of COVID-19 cases in the past week. The surge is believed to be linked to the new Delta variant of the virus, which is more contagious than previous strains. Officials are urging people to get vaccinated and to continue to follow public health guidelines to prevent the spread of the virus |
| New study shows benefits of exercise | A new study has found that regular exercise can help improve overall health and reduce the risk of chronic diseases. The study, which was conducted over a period of two years, involved over 1,000 participants. The researchers found that those who exercised regularly had lower rates of heart disease, diabetes, and other chronic conditions compared to those who did not exercise. |

To do this, you would configure the transformation as follows:

Source Column: content
Destination Column: summarized_text
LLM: gpt-3.5-turbo
Prompt Template: "Summarize this document: $field_source_column Summary:"
Credential ID: OpenAI

The resulting dataset would look like this:

| title | content | summarized_text |
| --- | --- | --- |
| COVID-19 cases continue to rise | Health officials have reported an increase in the number of COVID-19 cases in the past week. The surge is believed to be linked to the new Delta variant of the virus, which is more contagious than previous strains. Officials are urging people to get vaccinated and to continue to follow public health guidelines to prevent the spread of the virus | COVID-19 cases are increasing due to the more contagious Delta variant. Officials urge vaccination and adherence to public health guidelines. |
| New study shows benefits of exercise | A new study has found that regular exercise can help improve overall health and reduce the risk of chronic diseases. The study, which was conducted over a period of two years, involved over 1,000 participants. The researchers found that those who exercised regularly had lower rates of heart disease, diabetes, and other chronic conditions compared to those who did not exercise. | Regular exercise reduces the risk of chronic diseases, according to a two-year study of over 1,000 participants, which found lower rates of heart disease, diabetes, and other conditions in those who exercised regularly |
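To see how much shorter the summaries are than the originals, a quick word count over the example rows can help. The Python sketch below copies the content and summarized_text values from the table above. The word_count helper is hypothetical and simply splits on whitespace, which is only a rough measure.

```python
# Content and summary pairs copied from the example table, keyed by title.
examples = {
    "COVID-19 cases continue to rise": (
        "Health officials have reported an increase in the number of COVID-19 "
        "cases in the past week. The surge is believed to be linked to the new "
        "Delta variant of the virus, which is more contagious than previous "
        "strains. Officials are urging people to get vaccinated and to continue "
        "to follow public health guidelines to prevent the spread of the virus",
        "COVID-19 cases are increasing due to the more contagious Delta variant. "
        "Officials urge vaccination and adherence to public health guidelines.",
    ),
    "New study shows benefits of exercise": (
        "A new study has found that regular exercise can help improve overall "
        "health and reduce the risk of chronic diseases. The study, which was "
        "conducted over a period of two years, involved over 1,000 participants. "
        "The researchers found that those who exercised regularly had lower rates "
        "of heart disease, diabetes, and other chronic conditions compared to "
        "those who did not exercise.",
        "Regular exercise reduces the risk of chronic diseases, according to a "
        "two-year study of over 1,000 participants, which found lower rates of "
        "heart disease, diabetes, and other conditions in those who exercised "
        "regularly",
    ),
}


def word_count(text: str) -> int:
    """Count whitespace-separated words (a rough measure of length)."""
    return len(text.split())


for title, (content, summary) in examples.items():
    ratio = word_count(summary) / word_count(content)
    print(f"{title}: {word_count(content)} -> {word_count(summary)} words ({ratio:.0%})")
```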