
Media

Apache Iceberg Merge-On-Read: Streaming CDC
Crunch Conference 2022

What is Apache Iceberg, and how can you use its V2 spec to do streaming upserts? This talk tackles the age-old problem of building type-1 dimensions while delivering data as fast as possible. We will go over how Shopify consumes change data capture (CDC) events from our relational databases and how we use Iceberg to stream upserts to our datasets, giving data scientists a fast and accurate representation of our production data stores.

Conference Link: https://crunchconf.com/2022/speaker/victoria-bukta

Data Journey With Victoria Bukta (Shopify)
Apache Iceberg and Data Ingestion
Radio DaTa - GetInData

Victoria works as a senior data engineer at Shopify. Shopify is one of the best-known e-commerce companies and a very early adopter of big data and cloud technologies. We talk with Victoria about how her team ingests data at Shopify using a mix of open-source and cloud-native technologies such as Apache Iceberg, Debezium, Kafka, and GCP. Hosted on Acast.

Scaling your data lake with Apache Iceberg
Big Data Technology Warsaw Summit - 2022

- Common issues with data lakes
- What is Apache Iceberg, and what problems does it solve?
- Building CDC archive at Shopify using Iceberg
- Management when using Iceberg
- Brief intro to what's next on deck for Shopify + Iceberg (type-1 dimensions using Iceberg's V2 spec with row-level deletion)

#iceberg #datalake #columnardata #dataplatform

Conference Link: https://bigdatatechwarsaw.eu/prelegenci/victoria-bukta/

Recording: https://bigdatatechwarsaw.eu/top-3-presentations

DataOps - Continuous Monitoring & Deployment Of Our Streaming Pipelines

DevOps@Enterprise Forum - 2022
