Connect and Transform Data

Learning Objectives

After completing this unit, you’ll be able to:

  • Explain how to connect various data sources to Data 360, including ingesting data and accessing zero-copy data.
  • Describe ways to transform data and define when to use each, including batch and streaming data transforms and formula fields.
Note

As of October 14, 2025, Data Cloud has been rebranded to Data 360. During this transition, you may see references to Data Cloud in our application and documentation. While the name is new, the functionality and content remain unchanged.

Before You Start

This badge is part of the Data 360: Explore Setup to Activation trail, which is designed to give you hands-on experience with the core functionalities of Data 360. In this badge, you learn the fundamentals of ingesting data or accessing it through Zero Copy, explore data transformation and data mapping, and get an introduction to identity resolution for creating unified profiles.

Connect Data

Connecting your data is the first step to unlocking its value. Data can come in many forms and from various locations, and Data 360 provides a central place to unify, analyze, and act on it.

You have two options for connecting your data.

Ingest Data

If your data resides in CRM sources like Sales, Service, Industries, B2C Commerce, or Marketing Cloud Engagement, Data 360 offers flexible, zero-cost connectors to connect the CRM source to Data 360. You can then manually create a data stream to ingest your data. When data is ingested, it keeps its original structure and data types and is stored in a data lake object (DLO).

To save time, use a starter data bundle. These bundles contain preconfigured data streams that automatically map DLOs to associated data model objects (DMOs).

You can also ingest data from sources outside of CRM, including unstructured data, such as audio files and knowledge articles, and structured data, such as CSV files.

Access External Data with Zero Copy

With Zero Copy, you can bidirectionally connect Data 360 to external systems, including Google BigQuery, Snowflake, Amazon Redshift, and Databricks. This means you can freely access data from your external data source and use it in Data 360 without duplication, saving time and resources. This capability is known as Zero Copy Data Federation. Zero Copy Data Federation offers benefits such as:

  • Total data fluidity: Data 360 can overlay your existing IT architecture, creating a virtual operational layer that accesses data only when needed.
  • Real-time data access: You get the latest information without waiting for syncs or updates.
  • Simplified integration: Removes the need for complex and costly data pipelines.
  • Enhanced governance and compliance: Fewer copies of data mean less exposure to risk.

There are two types of zero-copy connectors.

  • Query Federation: Data 360 sends a query to the external system’s query engine, which retrieves records from its storage layer and returns the results to Data 360. All supported external systems offer Query Federation.
  • File Federation: Data 360 directly queries the external system’s storage layer using its own compute engines. This is optimized for accessing and analyzing large datasets and generally results in lower costs as you avoid external compute fees.

Think of Query Federation for live data lookups and File Federation for running analytics on massive, preexisting datasets.

Learn more in Data Cloud with Zero Copy.

Transform and Cleanse Data

Once your data is connected, you can clean, enrich, and manipulate it to suit your needs.

  • Batch data transforms: Use a batch data transform for complex data transformations or when you need data updated on a scheduled basis.

Unlike streaming transforms, which run continuously, batch transforms run at specified intervals. You can combine data from multiple DLOs, create calculated fields, and output results to multiple DLOs. Supported operations include joining two input nodes, aggregating data with functions like Average, Count, and Sum, appending rows from multiple datasets, and filtering out unneeded rows. The transform node itself lets you calculate values, modify string values, format dates, and drop columns.
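Although batch transforms are built in a visual editor rather than written by hand, their logic maps onto familiar relational operations. As a purely illustrative sketch (the DLO and field names below are hypothetical, not real objects), a join, filter, and aggregate step corresponds to:

```sql
-- Hypothetical sketch only: illustrates the relational logic behind a
-- batch transform's join, filter, and aggregate nodes. DLO and field
-- names are invented for illustration.
SELECT
    o.Customer_ID,
    COUNT(o.Order_ID)  AS Order_Count,     -- Count aggregation
    SUM(o.Order_Total) AS Lifetime_Spend,  -- Sum aggregation
    AVG(o.Order_Total) AS Avg_Order_Value  -- Average aggregation
FROM Orders_DLO o
JOIN Customers_DLO c
    ON o.Customer_ID = c.Customer_ID       -- join two input nodes
WHERE o.Status = 'Completed'               -- filter out unneeded rows
GROUP BY o.Customer_ID;                    -- aggregate per customer
```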

  • Streaming data transforms: If you need to clean and enrich data in near real-time as it enters the system, a streaming data transform is your go-to.

This is perfect for scenarios like detecting credit card fraud by aggregating and normalizing incoming data immediately to spot irregularities. New records are transformed and appended to the output object as they are ingested. A streaming data transform reads from a source DLO, runs a SQL query to modify data, and then maps the target DLO to a DMO.
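As a minimal sketch, the SQL a streaming transform runs might look like the following. The DLO and field names are assumptions for illustration, and real streaming transforms restrict which SQL functions are allowed:

```sql
-- Hypothetical streaming data transform query. Each newly ingested row
-- in the source DLO is transformed and appended to the output object.
SELECT
    Transaction_ID,
    UPPER(Merchant_Name) AS Merchant_Name_Norm,  -- standardize casing
    Amount_Cents / 100.0 AS Amount_Dollars,      -- normalize units
    Card_Last4
FROM Card_Transactions_DLO;
```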

Learn more in Batch Data Transforms in Data Cloud: Quick Look and Streaming Data Transforms in Data Cloud: Quick Look.

  • Formula Fields in Data Streams: You can create formula fields within a data stream to clean and adjust raw data.

These optional fields help you standardize formatting, add keys to join and map data, and add flags for data that meets specific criteria. They support text manipulation, type conversions, date calculations, and logical expressions.

For example, an order can have several products, and the same product can be included in multiple orders. You can create a formula field named Unique ID that identifies each product in an order by concatenating the order confirmation number with the product ID.
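As a sketch, that Unique ID formula could concatenate the two source fields. The CONCAT function and sourceField syntax shown here are assumptions, so check the formula expression reference for the exact names your org supports:

```
CONCAT(sourceField['Order_Confirmation_Number'], '-', sourceField['Product_ID'])
```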

Connecting and transforming your data sets the stage with comprehensive and cleansed data. The next step is to map and unify it to create customer profiles.
