Thanks for your interest in an Apache Iceberg deep-dive workshop, Jean!
We’re excited to help you architect, optimize, and deploy Apache Iceberg at scale in your mixed GCP, AWS, and on-prem Spark environment. Based on our conversation:
- Data Scale & SLAs: You’re managing ~3 PB of data with a 99.9% SLA and sub-2s query targets.
- Multi-Cloud & On-Prem: You have Spark workloads across GCP (Dataproc), AWS (EMR/Glue), and on-prem clusters.
- Open Table Formats: You’re interested in leveraging Iceberg’s time-travel, partition evolution, schema enforcement, and compaction features.
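To give a concrete flavor of those features ahead of the workshop, here is a minimal Spark SQL sketch. The catalog name `demo` and table `db.events` are illustrative placeholders, not names from our conversation, and the syntax assumes Spark 3.3+ with the Iceberg runtime and a configured Iceberg catalog:

```sql
-- Time travel: query the table as it looked at a past point in time
SELECT * FROM demo.db.events TIMESTAMP AS OF '2024-01-01 00:00:00';

-- Partition evolution: change the partition spec in place,
-- without rewriting existing data files
ALTER TABLE demo.db.events ADD PARTITION FIELD days(event_ts);
```

We walk through variants of both (snapshot-ID time travel, dropping and replacing partition fields) in the hands-on portion.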
Why ITMAGINATION?
Our Data & AI team has over 400 experts delivering petabyte-scale data platforms with strict SLAs for clients in finance, retail, and manufacturing. We’ve optimized Spark pipelines on Databricks, EMR, and on-prem Hadoop clusters, tuning Iceberg and Delta Lake for sub-second query responses.
Workshop Agenda (Draft)
- Iceberg Architecture & Table Layout Best Practices
- Partition Strategies & Predicate Pushdown
- Schema Evolution & Rollbacks (Time-Travel)
- Performance Tuning: Caching & Compaction
- CI/CD & Governance with Iceberg in Multi-Cloud
- Q&A and Next Steps
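To ground the performance-tuning and rollback agenda items, a short sketch of Iceberg’s built-in Spark procedures (again, catalog `demo`, table `db.events`, and the snapshot ID are placeholders for illustration):

```sql
-- Compaction: rewrite small data files into larger ones to keep scans fast
CALL demo.system.rewrite_data_files(table => 'db.events');

-- Rollback: restore the table to an earlier snapshot
-- (snapshot IDs come from the table's `snapshots` metadata table)
CALL demo.system.rollback_to_snapshot('db.events', 1234567890);
```

In the session we’ll cover how to schedule these procedures and how they interact with your SLA targets.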
Next Steps
Please confirm which of these proposed time slots works for you (all CEST):
- Monday at 8:00 AM
- Tuesday at 3:00 AM
Once confirmed, we’ll send a calendar invite with session details. After the workshop, we’ll provide a detailed project schedule and high-level budget estimate based on your feedback.
Key Case Studies
- Unified Analytics Platform (Under NDA): Petabyte-scale Spark workloads on Databricks with 99.9% SLA.
- Social Media Monitoring: Real-time Spark inference pipelines delivering sub-second responses.
- High-Accuracy Forecasting: Hourly forecasts on hundreds of TBs/day with <2s query times.
We look forward to helping you unlock Iceberg’s full potential!