Introducing Omni Reporting: Omni CDI’s real-time business intelligence layer

Omni CDI, our customer data infrastructure, is designed to run in your private cloud. So far, we’ve introduced two key components that enable this: Omni Analytics, a Dockerized application for collecting, enriching, and structuring raw events; and Omni Infrastructure, which includes warehousing pipelines and various components.

Today, with the launch of Omni Reporting, we are introducing our approach to the real-time reporting layer of your customer data infrastructure. This marks another step in our mission to build a private customer data infrastructure system that any company can operate within its own private cloud while maintaining full control over its data and cutting hefty data processing costs.

What we require from the reporting system

With Omni Reporting, we’re focusing primarily on real-time reporting. Real-time dashboards show what’s happening right now, enabling you to act on changes immediately, without the delays common in ETL-based systems and dashboards that update at set intervals. Technically, the pipeline is near-real-time, but the millisecond-level delays that add up along it are insignificant in the marketing and growth context, so we treat it as real-time.

We have established three requirements for a real-time reporting system.

First, data freshness is key to real-time reporting. For a dashboard to be truly real-time, it must rely on a real-time data pipeline that captures and stores data immediately as it is generated. A high-performance dashboard or BI platform that refreshes quickly is not sufficient if the underlying data pipeline is a batch pipeline. This is the most crucial aspect of a real-time reporting system.

Second, a good reporting layer should handle complex queries quickly and support some level of concurrency to guarantee a good experience for users. Marketing queries can become complicated, involving multiple data sources, filters, and aggregations, and they need to deliver results instantly. Caching is a must: keeping frequently accessed data in memory significantly improves query performance.
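To illustrate the caching requirement, here is a minimal sketch of a time-limited in-memory cache for query results. It is not how Omni Reporting or Metabase implements caching; the class name `TTLQueryCache`, the `run_query` callback, and the TTL value are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLQueryCache:
    """Keeps query results in memory for a fixed number of seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps the query text to (time stored, result).
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, query: str, run_query: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # still fresh: serve from memory
        result = run_query(query)  # missing or expired: go to the warehouse
        self._entries[query] = (now, result)
        return result
```

In a real-time context the TTL would have to stay short, since a long-lived cache entry works against the freshness requirement described above.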

Third, a real-time reporting system needs continuous access to all historical data. This access is crucial for building comparison-based metrics and reports, as well as for enrichment. It differs from streaming or log analytics, which are also based on real-time pipelines but mainly focus on immediate data feed health and typically do not store much historical data.
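A comparison-based metric can be as simple as the relative change against the same day one week earlier, which is only computable if the older data is still available. The sketch below assumes daily totals are already aggregated into a dictionary; the function name and the 7-day default are illustrative choices, not part of Omni Reporting.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def period_over_period_change(
    daily_totals: Dict[date, float], day: date, lookback_days: int = 7
) -> Optional[float]:
    """Return the relative change between `day` and the day `lookback_days` earlier.

    Returns None when either value is missing or the baseline is zero.
    """
    current = daily_totals.get(day)
    previous = daily_totals.get(day - timedelta(days=lookback_days))
    if current is None or previous is None or previous == 0:
        return None
    return (current - previous) / previous
```

For example, a total of 125 today against 100 a week ago gives a change of 0.25, that is, 25% growth.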

What Omni Reporting is built on

The reporting layer of Omni CDI operates in the private cloud using the open-source Metabase as its core component. It connects to various data warehouses that store data processed by Omni Analytics and Omni Activation, which is ingested through end-to-end pipelines from Omni Infrastructure and Warehousing. All components are built with Docker and Infrastructure as Code (primarily Terraform), enabling quick implementations in private clouds.

The Omni Reporting system is already used in a significant number of our customer data infrastructure services, as well as in MarTech and analytics services, not just in our own customer data offering. For example, the GA4-based customer data infrastructure includes several real-time reports on a data model that integrates various data sources.

High-level overview of the real-time Omni Reporting layer

The real-time aspect of the Omni Reporting layer is built to help you stay on top of marketing and growth metrics and be alerted whenever something turns red.

To fully understand where the reporting layer fits into the bigger picture of Omni CDI, let’s look at a high-level walkthrough of the pipeline.

In the first step, the event arrives at the Omni Analytics collectors, where it undergoes several processes specific to Omni Analytics, including cleaning, transformation, and filtering. The most critical step is aggregation, which involves running queries on the event related to downstream metrics, such as average deal value or the lifetime value of a person performing the event. Finally, Omni Analytics standardizes the payload and prepares it for dispatch to the activation layer in the standardized event format used at Omni CDI. Omni Analytics doesn’t concern itself with how the event will be used downstream; its focus is on preparing the events and sending them downstream as quickly as possible.
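The stages described above (cleaning, filtering, aggregation, standardization) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the field names (`event_name`, `person_id`, `deal_value`), the `deal_won` event, and the shape of the standardized payload are hypothetical, and the history of earlier deal values is passed in as a list rather than queried from storage.

```python
from typing import Any, Dict, List, Optional


def clean(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Cleaning: trim string values and drop empty fields."""
    cleaned: Dict[str, Any] = {}
    for key, value in raw.items():
        if isinstance(value, str):
            value = value.strip()
        if value is None or value == "":
            continue
        cleaned[key] = value
    return cleaned


def is_relevant(event: Dict[str, Any]) -> bool:
    """Filtering: this sketch keeps only the hypothetical 'deal_won' event."""
    return event.get("event_name") == "deal_won"


def aggregate(event: Dict[str, Any], previous_deal_values: List[float]) -> Dict[str, Any]:
    """Aggregation: compute metrics that include the current event."""
    values = previous_deal_values + [float(event["deal_value"])]
    return {"average_deal_value": sum(values) / len(values), "deal_count": len(values)}


def process(raw: Dict[str, Any], previous_deal_values: List[float]) -> Optional[Dict[str, Any]]:
    """Run all stages and return a standardized payload, or None if filtered out."""
    event = clean(raw)
    if not is_relevant(event):
        return None
    return {
        "name": event["event_name"],
        "person_id": event["person_id"],
        "properties": {"deal_value": float(event["deal_value"])},
        "metrics": aggregate(event, previous_deal_values),
    }
```

Note that `process` knows nothing about dashboards or warehouses, which mirrors the separation of concerns described above: the event is prepared and handed downstream.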

What happens next with the event is a direct consequence of our understanding of reporting and its role in the customer data infrastructures we’re building. As described in our lessons learned from building customer data platforms, warehousing begins in the data activation layer, because this is where events are converted into value. The activation layer first captures events dispatched from Omni Analytics to an Omni Data Client, built in either GTM Server or AWS Lambda, and populates data fields specific to the container for later use by reporting tags. These data points are then picked up by dedicated GTM Server or AWS Lambda reporting tags, which compose the final payload for the warehousing collectors and dispatch the final event. The purpose of the dedicated reporting tags is to reference the relevant metrics schema used in the specific dashboard.
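As a rough sketch of the two roles in the activation layer, the code below separates a data client that populates container-level fields from a reporting tag that composes the warehouse payload against a metrics schema. It is written in plain Python for illustration and does not use the GTM Server or AWS Lambda APIs; the function names, the event shape, and the schema columns are all assumptions.

```python
from typing import Any, Dict, List

# Hypothetical metrics schema: the columns one specific dashboard expects.
DEALS_DASHBOARD_SCHEMA: List[str] = ["person_id", "deal_value", "average_deal_value"]


def data_client(event: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten an incoming standardized event into container-level fields."""
    fields: Dict[str, Any] = {
        "event_name": event["name"],
        "person_id": event["person_id"],
    }
    fields.update(event.get("properties", {}))
    fields.update(event.get("metrics", {}))
    return fields


def reporting_tag(fields: Dict[str, Any], schema: List[str]) -> Dict[str, Any]:
    """Compose the final warehouse payload using only the columns in the schema."""
    missing = [column for column in schema if column not in fields]
    if missing:
        raise ValueError(f"fields missing for schema: {missing}")
    return {column: fields[column] for column in schema}
```

Keeping the schema in the tag rather than in the client means a second dashboard with different metrics would only need another tag reading the same populated fields.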

The structured data is stored in your private cloud-based data warehouse, such as PostgreSQL, MySQL, or Redshift, typically deployed automatically using our warehousing pipelines. We are also working to incorporate columnar storage and expand the BI platforms we support to include additional solutions.

The reporting instance is deployed in your private cloud and configured to connect directly to your data warehouse, allowing it to query real-time event data. The warehousing tag in the activation layer prepares your data according to the specific metrics you plan to use in your dashboard. The reporting platform connects to the warehouse using secure credentials, queries data in real-time with SQL, and leverages your database’s indexing, caching, and query optimization.
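The kind of SQL a dashboard card might run against the warehouse can be sketched as follows. To keep the example self-contained it uses an in-memory SQLite database as a stand-in for PostgreSQL, MySQL, or Redshift; the table name `deal_events`, its columns, and the sample rows are made up for illustration.

```python
import sqlite3
from typing import Any, Dict

# In-memory SQLite stands in for the real warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deal_events (person_id TEXT, deal_value REAL, occurred_at TEXT)"
)
# An index on the timestamp column lets the database serve time-window filters efficiently.
conn.execute("CREATE INDEX idx_deal_events_occurred_at ON deal_events (occurred_at)")
conn.executemany(
    "INSERT INTO deal_events VALUES (?, ?, ?)",
    [
        ("p1", 100.0, "2024-01-01T09:00:00"),
        ("p2", 300.0, "2024-01-02T10:00:00"),
        ("p3", 500.0, "2024-01-02T11:30:00"),
    ],
)
conn.commit()


def deals_since(connection: sqlite3.Connection, since_iso: str) -> Dict[str, Any]:
    """Count deals and average their value from `since_iso` (ISO 8601 text) onward."""
    row = connection.execute(
        "SELECT COUNT(*), COALESCE(AVG(deal_value), 0) "
        "FROM deal_events WHERE occurred_at >= ?",
        (since_iso,),
    ).fetchone()
    return {"deal_count": row[0], "average_deal_value": row[1]}
```

Because the query reads the table directly, a row written by the warehousing collector is visible to the next dashboard refresh without an intermediate batch job.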

With features such as Pulse and Alerts, the dashboard can automatically notify stakeholders when key metrics reach certain thresholds, ensuring real-time monitoring of critical business metrics.
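The logic behind a threshold-based alert can be sketched in a few lines. This is not the BI platform's alerting API, only an illustration of the check itself; `ThresholdRule` and `check_threshold` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThresholdRule:
    metric: str
    threshold: float
    direction: str  # "above" or "below"


def check_threshold(rule: ThresholdRule, value: float) -> Optional[str]:
    """Return a notification message if `value` breaches the rule, else None."""
    if rule.direction == "above":
        breached = value > rule.threshold
    elif rule.direction == "below":
        breached = value < rule.threshold
    else:
        raise ValueError(f"unknown direction: {rule.direction!r}")
    if not breached:
        return None
    return f"{rule.metric} is {value}, {rule.direction} the threshold of {rule.threshold}"
```

A caller would evaluate the rule each time the metric is refreshed and forward any non-empty message to the stakeholders' channel of choice.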

Where can I see Omni Reporting in action?

Here is a step-by-step tutorial that guides you from theory to practice using our real-time Pipedrive dashboard template as an example.

What’s next for Omni Reporting

Our plan for Omni Reporting includes creating real-time dashboard templates for various data sources and cross-data source integrations. Check the Omni CDI documentation on Omni Reporting to stay updated on our available dashboards.

Photo attribution

As usual, the featured image of the article is a photograph that corresponds with the article’s topic. This time, the shoutout goes to Pawel Czerwinski via Unsplash.