How to set up transformations

While the data Datazoom collects is built on standards and common interaction models, customers often need to transform the standard data set to suit their own needs and avoid post-processing. Transformations conform data in transit, before it lands in a connector.

Save money by eliminating extra data before it is sent to the Connector data store. Save time by having the data in the desired schema as soon as it arrives in the Connector.

Contact Datazoom to learn more!

Overview

The Data Transformation service is built on a rules engine that can trigger a set of actions, altering a data message in real time before the data leaves Datazoom. The rules engine allows for complex rule definitions using multiple criteria and actions. The goal is to enable any possible manipulation of an event message.

Each connector instance can have a ruleset attached. To ensure data consistency in the final destination, the rules are applied to all data sent to that connector, regardless of how many data pipes the connector is attached to.

A ruleset consists of criteria, actions and results:  

  • The criteria define which messages a rule affects. They can match all messages or only very specific ones, and can combine conditions on multiple fields, using a variety of operators, to narrow the event stream to only the events that need to be transformed.

  • The actions define which transformations are applied to each message that passes the criteria stage. See the example use cases below.

  • The results are either simple value replacements or complex functions that can be executed to generate the result.
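To make the relationship between criteria, actions and results concrete, here is a minimal Python sketch of how such a ruleset could be modeled. It is an illustration only and not Datazoom's rule syntax or implementation: the Rule class, the apply_ruleset function and the sample message are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

Message = dict[str, Any]


@dataclass
class Rule:
    """Hypothetical rule: a criteria test plus an action that yields the result."""
    criteria: Callable[[Message], bool]    # which messages the rule affects
    action: Callable[[Message], Message]   # transformation that produces the result


def apply_ruleset(message: Message, ruleset: list[Rule]) -> Message:
    """Apply every matching rule in order and return the transformed message."""
    result = dict(message)  # shallow copy so the caller's message is not modified
    for rule in ruleset:
        if rule.criteria(result):
            result = rule.action(result)
    return result


# Example ruleset with a single rule: relabel OTT devices.
ruleset = [
    Rule(
        criteria=lambda m: m.get("device_type") == "ott device",
        action=lambda m: {**m, "device_type": "set-top box"},
    )
]

transformed = apply_ruleset(
    {"device_type": "ott device", "client_ip": "203.0.113.7"}, ruleset
)
```

Messages that do not satisfy the criteria pass through unchanged, which mirrors the idea that criteria limit a rule to only the events that need to be transformed.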

Use cases

Data point renaming

Customers may not want to use the Datazoom Data Dictionary as we have defined it. Transformations allow for field names or hierarchical locations to be updated in real time.

Example: key renaming

As a simple example, client_ip can become ip_address. 
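As a rough illustration of what a key rename does to a message, the following Python sketch renames client_ip to ip_address. The rename_key helper and the flat message layout are assumptions made for the example, not part of the Datazoom product.

```python
def rename_key(message: dict, old: str, new: str) -> dict:
    """Return a copy of message with key `old` renamed to `new`, if present."""
    result = dict(message)
    if old in result:
        result[new] = result.pop(old)
    return result


renamed = rename_key(
    {"client_ip": "203.0.113.7", "device_type": "phone"},
    "client_ip",
    "ip_address",
)
```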

Example: node location change

Change the node location of a specific key. For example, move from user_details.content_session_id to video.content_session_id to better fit data analysts’ needs.
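The effect of a node location change can be sketched in Python as moving a value between two dotted paths in a nested message. The move_node helper and the nested dictionary layout are hypothetical and only illustrate the idea.

```python
import copy


def move_node(message: dict, source: str, target: str) -> dict:
    """Move the value at dotted path `source` to dotted path `target`."""
    result = copy.deepcopy(message)

    # Walk down to the parent of the source key.
    *src_parents, src_key = source.split(".")
    node = result
    for part in src_parents:
        node = node.get(part)
        if not isinstance(node, dict):
            return result  # source path is absent: leave the message unchanged
    if src_key not in node:
        return result
    value = node.pop(src_key)

    # Walk down to the parent of the target key, creating nodes as needed.
    *dst_parents, dst_key = target.split(".")
    node = result
    for part in dst_parents:
        node = node.setdefault(part, {})
    node[dst_key] = value
    return result


moved = move_node(
    {"user_details": {"content_session_id": "abc123"}, "video": {"title": "Demo"}},
    "user_details.content_session_id",
    "video.content_session_id",
)
```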

Rename data points and values

Example: value renaming

Conform the values Datazoom defines in its dictionary to your own custom-defined value list.

Rule example: if device_type = “ott device”, then set device_type to “set-top box”
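One way to picture value renaming is as a lookup against a customer-defined value list. In the Python sketch below, DEVICE_TYPE_MAP and conform_device_type are hypothetical names, and the single mapping entry comes from the rule example above.

```python
# Hypothetical custom value list: Datazoom value -> customer value.
DEVICE_TYPE_MAP = {"ott device": "set-top box"}


def conform_device_type(message: dict) -> dict:
    """Replace device_type with the customer's value when a mapping exists."""
    result = dict(message)
    value = result.get("device_type")
    if value in DEVICE_TYPE_MAP:
        result["device_type"] = DEVICE_TYPE_MAP[value]
    return result


conformed = conform_device_type({"device_type": "ott device"})
untouched = conform_device_type({"device_type": "phone"})
```

Values without an entry in the map are left as collected, so the rule only affects the messages it is meant to.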

Example: Modify field value based on the value of another field   

Examine all of the data in a message and use it to change the value of a specific field. For example, Datazoom’s Data Dictionary field player_viewable is a boolean that is set to true when the player_viewable_percent field is greater than or equal to 50. What if a customer considers a video viewable when only 25% of it is in view? In this case, the customer could build their own rule to transform the originally collected value in player_viewable.

Rule Examples:

  • If player_viewable_percent >= 25, then set event.metrics.player_viewable to TRUE

  • If device_type = “ott device” and OS = “Roku”, then set device_type to “roku”
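The first rule above can be sketched in Python as follows. This assumes, purely for illustration, that both player_viewable_percent and player_viewable live under event.metrics in the message; the actual message layout may differ, and apply_viewable_rule is a hypothetical name.

```python
import copy

VIEWABLE_THRESHOLD = 25  # customer-specific threshold, in percent


def apply_viewable_rule(message: dict) -> dict:
    """Set event.metrics.player_viewable to True at or above the threshold."""
    result = copy.deepcopy(message)
    metrics = result.get("event", {}).get("metrics", {})
    percent = metrics.get("player_viewable_percent")
    if isinstance(percent, (int, float)) and percent >= VIEWABLE_THRESHOLD:
        metrics["player_viewable"] = True
    return result


# 30% in view: false under the default 50% rule, true under the custom rule.
partly_visible = apply_viewable_rule(
    {"event": {"metrics": {"player_viewable_percent": 30, "player_viewable": False}}}
)
# 10% in view: below the custom threshold, so the collected value is kept.
barely_visible = apply_viewable_rule(
    {"event": {"metrics": {"player_viewable_percent": 10, "player_viewable": False}}}
)
```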

Example: new metadata field

Generate new data to add to a message: create a new field using logic based on the values of other fields.

Rule Example: if device_type = “ott device” and OS = “Roku”, then create a new field device.platform and set the value to “Roku”
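A minimal Python sketch of this rule might look like the following. The add_platform function is hypothetical, and the lowercase os key stands in for the OS field named in the rule; the real field name in the message may differ.

```python
import copy


def add_platform(message: dict) -> dict:
    """Create device.platform when both criteria match; otherwise change nothing."""
    result = copy.deepcopy(message)
    if result.get("device_type") == "ott device" and result.get("os") == "Roku":
        # Create the device node if it does not exist yet, then add the field.
        result.setdefault("device", {})["platform"] = "Roku"
    return result


with_platform = add_platform({"device_type": "ott device", "os": "Roku"})
without_platform = add_platform({"device_type": "ott device", "os": "tvOS"})
```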

Example: new event types

Change the event name from Milestone to Quartile to comply with Connector expectations or to clarify for users of the data that only Quartile data will be available in this data set.

Rule Examples:

  • If milestone_percent = 25, then set event.type to first_quartile

  • If milestone_percent = 50, then set event.type to midpoint

  • If milestone_percent = 75, then set event.type to third_quartile
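Rules of this kind amount to a lookup from milestone value to event name. The Python sketch below assumes quartile milestones at 25, 50 and 75 percent and a message in which milestone_percent is a top-level field; rename_milestone and QUARTILE_EVENT_TYPES are hypothetical names.

```python
import copy

# Assumed mapping from milestone percentage to quartile event name.
QUARTILE_EVENT_TYPES = {
    25: "first_quartile",
    50: "midpoint",
    75: "third_quartile",
}


def rename_milestone(message: dict) -> dict:
    """Set event.type to the quartile name for a matching milestone_percent."""
    result = copy.deepcopy(message)
    new_type = QUARTILE_EVENT_TYPES.get(result.get("milestone_percent"))
    if new_type is not None:
        result.setdefault("event", {})["type"] = new_type
    return result


midpoint = rename_milestone({"milestone_percent": 50, "event": {"type": "milestone"}})
other = rename_milestone({"milestone_percent": 10, "event": {"type": "milestone"}})
```

Milestones that are not in the mapping keep their original event type.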

Calculations

Example: math operations

Change the unit of measurement for a numeric field.

Rule example: for all playhead_position values, multiply by 1000 to convert seconds to milliseconds
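The unit conversion is simple arithmetic: milliseconds = seconds × 1000. A hedged Python sketch, with seconds_to_milliseconds as a hypothetical helper operating on a flat message:

```python
def seconds_to_milliseconds(message: dict, field: str = "playhead_position") -> dict:
    """Multiply a numeric field by 1000 to convert seconds to milliseconds."""
    result = dict(message)
    value = result.get(field)
    # Skip missing or non-numeric values (bool is excluded explicitly).
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        result[field] = value * 1000
    return result


converted = seconds_to_milliseconds({"playhead_position": 12.5})
```

For example, a playhead_position of 12.5 seconds becomes 12500 milliseconds.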

Example: hashing for PII handling

Handle PII fields with built-in anonymization logic.

Rule Examples:

  • For all client_ip values, apply hash algorithm to anonymize value with a one-way hash

  • For latitude & longitude fields, reduce precision to anonymize value by reducing number of decimals
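The two rules above can be approximated in Python with the standard library. This sketch uses SHA-256 and a two-decimal precision purely as assumptions; the source does not state which hash algorithm or precision Datazoom applies, and hash_value and anonymize are hypothetical names.

```python
import hashlib


def hash_value(value: str, salt: str = "") -> str:
    """Return a one-way SHA-256 hex digest of the (optionally salted) value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def anonymize(message: dict, decimals: int = 2, salt: str = "") -> dict:
    """Hash client_ip and reduce the precision of latitude and longitude."""
    result = dict(message)
    if "client_ip" in result:
        result["client_ip"] = hash_value(str(result["client_ip"]), salt)
    for field in ("latitude", "longitude"):
        if isinstance(result.get(field), (int, float)):
            result[field] = round(result[field], decimals)
    return result


anonymized = anonymize(
    {"client_ip": "203.0.113.7", "latitude": 40.712776, "longitude": -74.005974}
)
```

Because the IPv4 address space is small, an unsalted hash can be reversed by hashing every possible address, so a secret salt or keyed hash is usually worth considering when hashing IP addresses.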

Get started with transformations

Contact your Datazoom account representative to create an implementation plan and build, test and verify the results of rulesets.