JUMP Lakehouse

A unified data repository combining the best of a data lake and a data warehouse

Ingest the data generated across your business from a myriad of data sources to get a unified source of truth

Easily load data from all your video service data sources using connectors and third-party pre-integrations.

Pull data connector

Pull data from any data source that exposes it through an API or through data dump files accessible over any medium (FTP, shared cloud storage, etc.)

Push data connectors

Post real-time and batch data into your data repository through an FTP server with unlimited storage or by sending JSON-structured data files
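As an illustration of the JSON push path, a batch of events could be serialized as newline-delimited JSON before being sent. This is only a sketch: the field names (`event`, `user_id`, `ts`), the helper `to_ndjson` and the newline-delimited layout are assumptions made for the example, not JUMP's documented schema or API.

```python
import json
from datetime import datetime, timezone


def to_ndjson(events):
    """Serialize a list of dicts as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(event, sort_keys=True) for event in events)


# Hypothetical playback events; the field names are illustrative only.
events = [
    {
        "event": "play",
        "user_id": "u1",
        "ts": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc).isoformat(),
    },
    {
        "event": "pause",
        "user_id": "u1",
        "ts": datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc).isoformat(),
    },
]

# The resulting text could be written to a file and uploaded as a batch.
payload = to_ndjson(events)
```

One event per line keeps each record independently parseable, which suits both batch files and streamed posts.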


An API collects data through JUMP's SDK, installed in your digital media frontend applications

Apache Spark ready

Data ingestion uses an Apache Spark pipeline to transform unstructured raw data into structured, partitioned data ready for advanced analytics
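The idea of turning raw input into structured, partitioned data can be sketched in plain Python. This is a stand-in for illustration, not Spark code and not JUMP's pipeline: the function `structure_and_partition`, the `ts` field and the `date=YYYY-MM-DD` partition naming are all assumptions.

```python
import json
from collections import defaultdict


def structure_and_partition(raw_lines):
    """Parse raw JSON lines and group the valid records by event date."""
    partitions = defaultdict(list)
    for line in raw_lines:
        try:
            record = json.loads(line)
            day = record["ts"][:10]  # assumes an ISO 8601 timestamp string
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # skip lines that are not usable records
        partitions[f"date={day}"].append(record)
    return dict(partitions)


# Hypothetical raw input, including one malformed line.
raw = [
    '{"event": "play", "ts": "2024-01-01T12:00:00Z"}',
    "not json",
    '{"event": "stop", "ts": "2024-01-02T08:30:00Z"}',
]
partitions = structure_and_partition(raw)
```

In a real Spark job the same steps would typically be expressed as a read, a derived date column and a partitioned write.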

Third party connectors

Connectors are available for AWS S3, AWS Redshift, Apache Kafka, Google BigQuery and many more

Learn more in 30 seconds


Data Ingestion strategies for video businesses Grappling with Analytics

Store your video service data in a centralized repository

Harmonize your diverse data sources and create a “single source of truth” for your video service data.

Landing data layer

Includes all data in its raw format, exactly as it arrived

Cleaned data layer

Includes all the raw data with its content otherwise unchanged, but already cleansed (missing-value imputation, data structure normalization, etc.) and stored as Parquet files
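A minimal sketch of the kind of cleansing described here, assuming missing values are imputed with fixed defaults. The function `cleanse`, the field names and the `"unknown"` default are hypothetical; the actual cleansing rules are not specified on this page.

```python
def cleanse(records, defaults):
    """Return copies of the records with missing or None fields filled from defaults."""
    cleaned = []
    for record in records:
        row = dict(defaults)  # start from the default for every expected field
        row.update({key: value for key, value in record.items() if value is not None})
        cleaned.append(row)
    return cleaned


# Hypothetical landing-layer records with gaps.
landing = [
    {"user_id": "u1", "device": None},
    {"user_id": "u2", "device": "tv", "country": "ES"},
]
cleaned = cleanse(landing, {"device": "unknown", "country": "unknown"})
```

The landing records are left untouched, which mirrors keeping the raw data as it arrived while the cleaned copy is stored separately.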

Warehouse data layer

Data has already gone through JUMP’s extraction, transformation and loading processes and has been transformed into a known data model to enable further data exploitation

Presentation data layer

Data stored in this layer is ready to be consumed by final applications

Learn more in 30 seconds
Blogpost: Data Lake or Lake house? Pros and cons for consumer digital services tech stack

Give any process or person access to your data at any stage of the data journey

The access module provides access to data at any processing stage, such as the cleaned data layer or the warehouse data layer.

An API is available to access your data and run queries

Use standard ANSI SQL to run complex analytic queries over your data
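As an example of the kind of analytic query this allows, the sketch below runs an aggregate over a small table. It uses Python's built-in `sqlite3` module purely as a local stand-in for the query engine; the `playback` table and its columns are hypothetical.

```python
import sqlite3

# In-memory database used only to make the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playback (user_id TEXT, title TEXT, minutes REAL)")
conn.executemany(
    "INSERT INTO playback VALUES (?, ?, ?)",
    [("u1", "Show A", 30.0), ("u2", "Show A", 45.0), ("u1", "Show B", 10.0)],
)

# Viewers and total watch time per title, most watched first.
query = """
SELECT title,
       COUNT(DISTINCT user_id) AS viewers,
       SUM(minutes) AS total_minutes
FROM playback
GROUP BY title
ORDER BY total_minutes DESC
"""
rows = conn.execute(query).fetchall()
conn.close()
```

The query itself uses only standard constructs (`GROUP BY`, `COUNT(DISTINCT ...)`, `SUM`), so it should carry over to other ANSI SQL engines with little or no change.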

JDBC drivers available

All your queries are stored indefinitely

The results of every query are stored as CSV files and can be retrieved at any time
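Storing query results as CSV and reading them back later can be sketched with Python's standard `csv` module. The helpers `results_to_csv` and `csv_to_results` are hypothetical and work on in-memory text; how and where the platform keeps its CSV files is not described here.

```python
import csv
import io


def results_to_csv(header, rows):
    """Render a header and result rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def csv_to_results(text):
    """Parse CSV text back into a header and a list of rows (all values as strings)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, list(reader)


# Hypothetical query result.
stored = results_to_csv(["title", "viewers"], [("Show A", 2), ("Show B", 1)])
header, rows = csv_to_results(stored)
```

Note that CSV carries no type information, so numeric values come back as strings and need converting by the consumer.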

Apache Hive and Presto

Connectors for BI

Use JUMP Deep Insights to visualize your data, or use a third-party visualization tool (Tableau, Amazon QuickSight, Google Data Studio, etc.)

Learn more in 30 seconds
Whitepaper: Batch vs real time data visualizations. Infrastructure and differences
