Coalesce Columns

Coalesce one or more columns into a Destination Column. This Transformation iterates through a dataset to fill out a new column. For each row in the Source Columns, the first cell value found that is not None will be returned to the new Destination Column. If all cell values are None, the Transformation will return None to the Destination Column. Note that the order of the Source Columns in your dataset matters for this Transformation. This can be particularly useful in data analysis and reporting, where missing data can skew results or make it difficult to draw accurate conclusions.

Parameters

  • Source Columns: The column names to be coalesced. Defaults to [].
  • Destination Column: The column name that holds the coalesced content. Defaults to coalesced.

Usage

To use the Coalesce Columns transformation, you will need to follow these steps:

  1. Specify the Source Column parameter with the names of the columns to be coalesced, in the order in which they should be checked.
  2. Define the Destination Column parameter with the name of the new column that will hold the coalesced content.
  3. Run the transformation by clicking the Save and Run Transforms button.

Example 1: Coalescing Contact Information

Suppose you have a dataset of customer contact information with separate columns for home phone, work phone, and mobile phone numbers. You want to create a new column called "primary_phone" that contains the first available phone number for each customer.

IDNameHome_PhoneWork_PhoneMobile_Phone
1John555-123-4567NoneNone
2JaneNone555-987-6543555-987-0001

Parameters (YAML):

transform:
  name: Coalesce Columns
  parameters:
    source_columns:
      - Home_Phone
      - Work_Phone
      - Mobile_Phone
    destination_column: primary_phone

Expected Result Dataset:

IDNameHome_PhoneWork_PhoneMobile_PhonePrimary_Phone
1John555-123-4567NoneNone555-123-4567
2JaneNone555-987-6543555-987-0001555-987-6543

Example 2: Coalescing Preferred Contact Method

Suppose you have a dataset of customer contact information with separate columns for email, phone, and postal address. You want to create a new column called "preferred_contact" that contains the first available contact method for each customer.

IDNameEmailPhonePostal_Address
1John[email protected]None123 Main St, Anytown
2JaneNone555-987-6543456 Elm St, Sometown

Parameters:

transform:
  name: Coalesce Columns
  parameters:
    source_columns:
      - Email
      - Phone
      - Postal_Address
    destination_column: preferred_contact

Coalesce Colums - Address
IDNameEmailPhonePostal_AddressPreferred_Contact
1John[email protected]None123 Main St, Anytown[email protected]
2JaneNone555-987-6543456 Elm St, Sometown555-987-6543