Asian Y Combinator fintech enters growth phase with Datomni-built CDP running on Segment, dbt, GA4, BigQuery, Fivetran, and Looker Studio

A startup from Southern Asia partnered with Datomni to build a customer data platform. Their goal was to track user behavior throughout the customer journey and activate users at different stages of the marketing funnel—from awareness to long-term retention. In addition, the company wanted a complete overview of all parts of their business, from product to marketing, through a set of 30+ metrics calculated using custom business logic they devised.

Datomni assembled a 5-person internal team of data engineers, developers, and marketers to architect and set up our client’s CDP. The platform combines two predefined customer data infrastructure solutions, one Segment-based and one GA4-based, and spans multiple layers: data capture and collection, activation, warehousing, modeling, and reporting. We implemented tracking for around 80 events throughout the user journey, deployed BigQuery for data warehousing, and used dbt (data build tool) to build the data models behind a metrics dashboard created with Looker Studio. Real-time data is consumed by downstream marketing and activation tools through Segment’s destinations. Today, the client’s platforms process hundreds of thousands of events, helping them engage users and understand their growth.

Client profile: fast-growing Asian fintech startup focused on financial literacy

Our client is a fast-growing fintech startup from Asia, aimed at helping people manage their finances. Their mission is to improve financial literacy in the country by offering easy-to-use tools that allow users to take control of their financial future. The app combines traditional banking services with modern digital solutions, making financial planning accessible for all.

With an intuitive interface, the client’s mobile app lets users manage savings, investments, and expenses. Since launching in 2021, the client has raised over $2 million in seed funding from investors like AC Ventures, Vibe.VC, and Y Combinator. Despite being new, the client is already seen as a rising star in fintech.

Key challenges: complex product, extensive stack, resource limitations, and sophisticated metrics needs

Let’s walk through the key limitations and challenges we faced when venturing into this project.

Relatively complex business and product

The client’s product is complex because it’s an all-in-one application offering various financial services. The app provides an overview of finances, tracks spending, offers insights, and includes savings, budgeting, investments (like mutual funds and bonds), and personalized financial advice. This complexity required a detailed tracking plan and robust data strategy.

To track these services, we needed a strategy that went beyond standard activity tracking and addressed key moments throughout the long-term customer journey. We also had to design the event and user model so that dbt could calculate numerous engagement and profitability metrics. Additionally, to activate users across different ad channels, we carefully considered the essential attribution properties.

Extensive stack and data model vs limited resources

The client’s tech stack includes a React Native mobile app, a Java backend, and a React.js web app. This mix of systems made it challenging to track user activity across all touchpoints. The key was choosing the right CDP to unify all the data and build complete user profiles from multiple sources.

Besides collecting events, the stack also integrated many third-party services, further complicating tracking. The system needed to be scalable, ensuring accurate data collection while considering Segment’s performance and event volume limitations.

Finally, like any fast-growing startup, the client faced significant resource constraints. As a result, the CDP had to be designed to require minimal effort and attention from the team.

Sophisticated data and metrics needs

To measure app performance and gain insights into user engagement, the client needed a complex data model to track various metrics. The financial aspects of their business, such as asset growth, had to be closely linked with user behavior data. This required a dedicated metrics layer that would perform sessionization on the atomic events captured by Segment, in order to build around 30 metrics required by management on a single dashboard.
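To make the sessionization step concrete, here is a minimal Python sketch of the idea. It is an illustration only, not the client’s dbt model: the 30-minute inactivity gap, the function name, and the sample events are all assumptions.

```python
from datetime import datetime, timedelta

# Assumption: a session ends after 30 minutes of inactivity.
SESSION_GAP = timedelta(minutes=30)

def sessionize(events):
    """Assign a per-user session index to each (user_id, timestamp) event.

    Returns (user_id, timestamp, session_index) tuples sorted by user and time.
    """
    result = []
    last_seen = {}    # user_id -> timestamp of that user's previous event
    session_idx = {}  # user_id -> current session index
    for user_id, ts in sorted(events):
        prev = last_seen.get(user_id)
        if prev is None:
            session_idx[user_id] = 1   # first event starts the first session
        elif ts - prev > SESSION_GAP:
            session_idx[user_id] += 1  # long pause starts a new session
        last_seen[user_id] = ts
        result.append((user_id, ts, session_idx[user_id]))
    return result

# Hypothetical atomic events: (user_id, event timestamp)
events = [
    ("u1", datetime(2024, 1, 1, 9, 0)),
    ("u1", datetime(2024, 1, 1, 9, 10)),
    ("u1", datetime(2024, 1, 1, 10, 0)),
    ("u2", datetime(2024, 1, 1, 9, 5)),
]
sessions = sessionize(events)
```

Engagement metrics such as sessions per user can then be aggregated from the session index rather than from raw events.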

Optimizing for Monthly Tracked Users (MTU)

To minimize total cost of ownership, the client wanted a CDP infrastructure in which they pay for processing engaged users only, not all cold traffic. Because Segment pricing is tied to the MTU count, low-intent visitors had to be kept out of Segment, which complicated the architecture of the CDP.

Unified and permanent identity resolution

With many data sources and processes, and the need to support users throughout their journey, a strong identity layer was essential. This layer had to resolve each user’s identity to a single UUID and capture and forward the key user characteristics and metrics needed for effective user activation.

Solutions: comprehensive tracking plan, Segment & GA4, data models via dbt, and deep activation

Extensive data strategy and tracking plan

We started by working with the client to understand their business and data strategy. Together, we defined key questions for each stage of the customer journey: acquisition, activation, retention, and conversion. These questions guided how we designed the tracking plan (event model, user model, metrics model and more) and helped align the data infrastructure with the client’s growth goals.
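As an illustration of what an event model in a tracking plan can look like, the following Python sketch checks incoming events against a small plan. The event names, property names, and the `validate_event` helper are hypothetical and are not taken from the client’s actual plan.

```python
# Hypothetical excerpt of a tracking plan: event name -> required properties.
TRACKING_PLAN = {
    "Account Created": {"signup_method"},
    "Deposit Completed": {"amount", "currency"},
}

def validate_event(name, properties):
    """Return a list of problems; an empty list means the event matches the plan."""
    if name not in TRACKING_PLAN:
        return [f"unknown event: {name}"]
    missing = TRACKING_PLAN[name] - set(properties)
    return [f"missing property: {p}" for p in sorted(missing)]

# Usage: one valid event and one event with a missing property.
ok = validate_event("Account Created", {"signup_method": "email"})
bad = validate_event("Deposit Completed", {"amount": 50})
```

Keeping the plan as data makes it easy to run the same check in every source before an event is sent.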

Selecting and architecting the CDP in the correct proportions

After evaluating different options, we chose Datomni’s Segment-based customer data platform stack as the core of the client’s CDP. Segment allowed for quick implementation and iteration, which was essential for a fast-moving startup like the client’s. It made data management easier while supporting rapid user growth. Unlike self-hosted solutions that require a lot of maintenance, Segment provided a ready-to-use solution with strong data collection and activation capabilities.

We supported the client in implementing a dual-tracking approach: the GA4-based customer data platform handles less engaged users (top-of-funnel traffic), while Segment handles more valuable users. This ensured that the client paid Segment’s higher costs only for important interactions, making their budget go further.
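The routing decision behind this split could be sketched as follows. This is a simplified Python illustration under one stated assumption: a visitor counts as engaged once they have a `user_id` (for example, after sign-up). The actual rule used in the project is not described here, and the field names are hypothetical.

```python
def choose_pipeline(visitor):
    """Return "segment" for engaged visitors and "ga4" for everyone else."""
    # Assumption: having a user_id marks the visitor as engaged.
    if visitor.get("user_id"):
        return "segment"
    return "ga4"

# Hypothetical traffic sample: two anonymous visitors, one signed-up user.
visitors = [
    {"anonymous_id": "a1", "user_id": None},
    {"anonymous_id": "a2", "user_id": "u-42"},
    {"anonymous_id": "a3", "user_id": None},
]
routes = [choose_pipeline(v) for v in visitors]

# Only visitors routed to Segment count toward the billable user total.
billable = sum(1 for r in routes if r == "segment")
```

In this sample only one of three visitors reaches Segment, which is the effect the dual-tracking design aims for.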

Building a robust stack to achieve all data backend goals

After selecting the right CDP and outlining its architecture, we planned the entire stack, which had to not only support high-volume event processing on a complex data model but also turn that data into actual business value.

  • Pipelines: Segment, GA4
  • Data capture sources: JavaScript, Java, React Native
  • Warehouse: BigQuery
  • Modeling layer: dbt
  • Reporting layer: Looker Studio
  • Destinations: Amplitude, Intercom, Meta Conversions API, AppsFlyer, Firebase, and more.

Comprehensive real-time data capture and processing from all sources

We helped the client set up Segment to capture events from their mobile app (using the React Native sink), backend (using the Java sink), and web app (using the client-side JavaScript tracker), tracking user interactions in real time. We also used GA4 to capture traffic from less-engaged users on their web app, which reduced the volume, and therefore the cost, of data processed by Segment. The events tracked included key financial activities such as savings, portfolio performance, and investment transactions. Ultimately, the event model grew to over 80 events across all sources and pipelines.

Identity resolution and user model development

Identity resolution was implemented using Segment’s identify method. This not only assigned a central identifier to each user but also captured useful data points and user properties in Segment, which were then used in downstream activation tools.
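For readers unfamiliar with the identify call, the Python sketch below builds a payload in the general shape of a Segment identify message (`type`, `userId`, `traits`, `timestamp`). It does not send anything. The `build_identify` helper and the trait names are hypothetical, and using a UUID as the user ID is an assumption for the example.

```python
import uuid
from datetime import datetime, timezone

def build_identify(user_id, traits):
    """Build a dict shaped like a Segment identify message (not sent anywhere)."""
    return {
        "type": "identify",
        "userId": user_id,
        "traits": dict(traits),  # copy so the caller's dict is not shared
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Assumption: the central identifier is a UUID generated at sign-up.
user_id = str(uuid.uuid4())

# Hypothetical traits that downstream activation tools could use.
payload = build_identify(user_id, {"plan": "free", "has_investment_account": True})
```

Sending the same `userId` from the mobile app, backend, and web app is what lets events from all three sources be attached to one profile.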

Extensive real-time data activation to engage users along the full customer journey

We integrated several third-party tools into the client’s CDP to activate users effectively. These included Amplitude for product analytics, Intercom for customer engagement, and Meta Conversions API for server-side Meta tracking. These integrations ensured that the client’s data wasn’t just collected but also used to engage users throughout their journey. All of these destinations were enabled in Segment.

Metric model, dashboards, and KPIs

We implemented over 30 metrics across the client’s stack, covering key areas like customer acquisition cost (CAC), user retention, active app users, and financial performance. Each metric was designed to align with the client’s business goals, providing insights that drive growth. The metric model was developed using dbt. The dbt layer recalculates metrics daily and updates the BigQuery tables connected to the Looker Studio dashboards. Fivetran’s ETL tools sync ad spend data from Google Ads, Facebook Ads, and Apple Ads.
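As a worked example of one such metric, the Python sketch below computes CAC as total ad spend divided by the number of new customers in the same period. In the project the metric logic lives in dbt models; Python is used here only for illustration, and the channel keys and all numbers are made up.

```python
def customer_acquisition_cost(spend_by_channel, new_customers):
    """Return total ad spend divided by new customers, or None if there are none."""
    total_spend = sum(spend_by_channel.values())
    if new_customers == 0:
        return None  # CAC is undefined without any acquired customers
    return total_spend / new_customers

# Hypothetical monthly ad spend per channel, in one currency.
spend = {"google_ads": 1200.0, "facebook_ads": 800.0, "apple_ads": 500.0}

# 2500.0 total spend / 100 new customers
cac = customer_acquisition_cost(spend, 100)
```

With these sample figures the result is 25.0 per acquired customer.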

A visual excerpt of the dbt model lineage is presented below, showing the extent of data and data sources we’ve been working with, including both third-party batch sources and real-time event sinks ingested to BigQuery using built-in Segment warehousing pipelines. Please note that sensitive parts of the image were blurred.

To the moon!

Our collaboration with the client resulted in a scalable customer data platform that brings together data from various sources, activates users at different stages of the customer journey, and provides valuable insights. The system is flexible and future-proof, allowing the client to continue growing while optimizing for cost and efficiency.

Photo attribution

As usual, the featured image of the article is a photograph that corresponds with the article’s topic. This time, the shoutout goes to Tobias Keller on Unsplash.