How NimbleEdge enables optimized real-time data ingestion with on-device event stream processing
Introduction
In our previous blog, we covered how NimbleEdge helps capture event streams with our on-device data warehouse capabilities, with provisions to easily transfer them to cloud storage. We also discussed how these capabilities enable AI teams to quickly build training datasets for session-aware personalization models, which drive significant conversion and engagement uplift in apps across verticals.
In this blog, we continue building on that foundation and cover how event stream capture can be further optimized using on-device data processing and filtering before transferring to cloud storage.
Potential issues: Coarse-grained events and large event payloads
NimbleEdge’s on-device data warehouse solves many key challenges in data collection for session-aware models and is usually adequate on its own for customers with well-classified user event streams, where the size of each event payload is not very large.
However, for many apps, user events are coarsely defined and payloads can be very large. For example, when event payloads are responses from backend APIs, they may include extensive metadata or detailed product catalogues that are irrelevant to session-aware personalization yet inflate the payload size. In such cases, transferring event streams to cloud servers becomes both costly and inefficient, replicating some of the same issues involved in data transfer from CDPs: high costs (in this case, for storing large event payloads) and additional pre-processing steps before the data can be used to train session-aware models (e.g. filtering out irrelevant data points). In these pre-processing steps, AI teams often want to create new event types containing only the payload required for experimentation. The key is to give AI teams the flexibility to iterate quickly and deliver value, without forcing them through multiple hoops that slow down experimentation while end users continue to suffer degraded experiences.
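To make the problem concrete, here is a minimal, hypothetical sketch of such a coarse-grained event: a single event whose payload is a full backend API response, of which only a few user-context fields matter for session-aware modeling. The event name and fields below are illustrative placeholders, not taken from any specific app.

```python
# Hypothetical coarse-grained event: the payload is an entire backend API
# response, but only a few user-context fields are relevant for
# session-aware personalization.
raw_event = {
    "eventType": "menu_api_response",
    "timestamp": 1718000000,
    "payload": {
        # Large, irrelevant sections that inflate transfer and storage costs:
        "requestMetadata": {"traceId": "abc-123", "experimentFlags": ["v2_ranking"]},
        "restaurantCatalogue": [
            {"itemId": "item_1", "description": "...", "imageUrl": "...", "price": 249.0},
            # ...hundreds of additional catalogue entries...
        ],
        # The handful of fields actually useful for personalization:
        "userContext": {
            "userId": "u_123",
            "lastViewedItemId": "item_42",
            "cartValue": 349.0,
        },
    },
}
```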
New NimbleEdge feature: On-device Event Stream Processing & Filtering
To address this challenge, NimbleEdge introduces its latest feature: event stream processing & filtering on-device before relay using Python scripts!
For apps with large event payloads, NimbleEdge now also enables on-device processing of event streams before transfer to cloud storage. This feature ensures that only the essential data points are transmitted, reducing data transfer volumes significantly and resulting in lower cloud storage and processing costs. Additionally, preprocessing data on-device simplifies the downstream workflow for AI teams by providing structured, analysis-ready datasets.
With the introduction of this new feature, AI teams can now:
- Use customizable Python scripts to process and filter event streams on-device before transmitting them to cloud storage, defining tailored preprocessing rules that run directly on user devices. A snippet showcasing a sample script in the context of a food delivery app is shared after this list.
- Easily access and explore captured events on the NimbleEdge user portal, enabling intuitive navigation and streamlined data discovery
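Below is a sketch of what such a processing script could look like for a food delivery app. This is an illustrative example only: the event names, payload fields, and the process_event entry-point convention are assumptions made for this post, not the exact NimbleEdge scripting interface.

```python
# Illustrative on-device processing script for a food delivery app.
# Event names, payload fields, and the process_event() entry-point convention
# are hypothetical; the actual NimbleEdge script interface may differ.

RELEVANT_EVENT_TYPES = {
    "item_viewed",
    "item_added_to_cart",
    "order_placed",
    "menu_api_response",
}

def process_event(event: dict):
    """Filter and slim down a raw event before it is relayed to cloud storage.

    Returns a compact event dict, or None to drop the event entirely.
    """
    event_type = event.get("eventType")
    if event_type not in RELEVANT_EVENT_TYPES:
        # Drop events that are not useful for session-aware modeling.
        return None

    payload = event.get("payload", {})

    if event_type == "menu_api_response":
        # Keep only the user-context fields; discard request metadata and the
        # full restaurant catalogue, which dominate the payload size.
        user_context = payload.get("userContext", {})
        return {
            "eventType": event_type,
            "timestamp": event.get("timestamp"),
            "userId": user_context.get("userId"),
            "lastViewedItemId": user_context.get("lastViewedItemId"),
            "cartValue": user_context.get("cartValue"),
        }

    # Fine-grained interaction events are already compact; keep key fields only.
    return {
        "eventType": event_type,
        "timestamp": event.get("timestamp"),
        "userId": payload.get("userId"),
        "itemId": payload.get("itemId"),
        "price": payload.get("price"),
    }
```

Applied to the oversized event from the earlier example, a script like this would relay only the timestamp and user-context fields, so a few hundred bytes leave the device instead of the full catalogue response.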
Together, these capabilities maximize data collection flexibility for AI teams and significantly reduce the effort and cost of building training datasets for session-aware models.
Impact
Processing user events on-device before sending them to cloud storage delivers several advantages for AI teams, including:
- Session-aware modeling: With ready access to processed user event streams, AI teams can quickly build session-aware models, enabling 8-10% improvement in ranking model performance and driving uplift in conversion and order value
- Faster iteration and lower engineering bandwidth: Data ingested through NimbleEdge is analysis-ready, with no further processing needed to convert it into a usable format. Session-aware use cases can therefore be brought into production much faster and with far less data-preparation effort
- Transfer and processing costs minimized: Directly transferring clickstream data from user devices to cloud servers already circumvents the massive transfer costs charged by CDPs. On-device event stream processing additionally avoids the large cost of storing unprocessed event streams, as well as the cost of processing those streams into the formats required for training session-aware models
This new functionality underscores NimbleEdge's commitment to providing highly scalable, cost-efficient solutions for session-aware personalization, which is essential for building AI-powered experiences, and to empowering AI teams to experiment and build models rapidly.
To learn more about how NimbleEdge drives real-time, AI-driven personalized experiences at scale, visit nimbleedge.com or reach out to contact@nimbleedge.com.