Channel: Reg Whitepapers

How WeChat's Data Lakehouse Architecture Enhances Efficiency for Trillions of Daily Records

With over a billion users, it comes as no surprise that WeChat manages extremely large data volumes. In some cases, single tables are growing by trillions of records daily and queries regularly scan over 1 billion records.

WeChat's business scenarios demand rapid end-to-end response times, with a query latency P90 target of under 5 seconds, and data freshness requirements that vary from seconds to minutes. The challenge is compounded by queries that often span more than 50 dimensions and 100 metrics at a time.
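To make the P90 target concrete: a P90 of under 5 seconds means at least 90% of queries must complete within 5 seconds. The sketch below (not from the article; the latency values are hypothetical) shows a simple nearest-rank percentile check against such an SLO.

```python
# Illustrative sketch: checking a "P90 under 5 seconds" latency SLO
# against a sample of query latencies. Values are hypothetical.

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value that is >= pct% of samples."""
    ordered = sorted(samples)
    # Ceiling division gives the 1-based nearest-rank index.
    rank = -(-pct * len(ordered) // 100)
    return ordered[max(rank - 1, 0)]

# Hypothetical per-query latencies in seconds
latencies = [0.8, 1.2, 2.5, 3.1, 0.9, 4.2, 1.7, 6.5, 2.2, 3.8]

p90 = percentile(latencies, 90)
meets_slo = p90 < 5.0  # True here: 90% of queries finished within 5 s
```

In practice, systems at WeChat's scale would compute such percentiles over streaming approximations (e.g. sketches) rather than sorting raw samples, but the SLO semantics are the same.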

WeChat's legacy data architecture combined a Hadoop-based data lake with a variety of data warehouses. This resulted in significant operational overhead and data governance challenges, including:

  • Juggling multiple systems across separate real-time and batch analytics pipelines
  • Maintaining data ingestion pipelines for data warehouses
  • Governance challenges from managing multiple copies of the same data
  • Managing incompatible APIs of different systems
  • Challenges in standardizing data analysis processes
