Data engineering and analytics for leaders

S1E8 Data quality on Modern Data Stack

4 min · Mar 15, 2023

Description

In a modern data stack, data is collected from various sources, such as databases, APIs, and third-party applications, and then processed and transformed into a usable format for analysis. Data quality can suffer at every stage of this process, leading to unreliable insights and flawed decision-making. One of the biggest challenges is the sheer volume and variety of data: with so much arriving from different sources, it is hard to ensure that all of it is accurate, complete, and consistent. Another challenge is data lineage. When data flows through multiple systems, it is difficult to track where it originated and how it has been transformed over time, and that lack of transparency makes data quality issues harder to identify and fix.

All episodes

8 episodes

S1E8 Data quality on Modern Data Stack

Mar 15, 2023 · 4 min

S1E7 Modern Data Stack

A modern data stack combines the tools, technologies, and processes that businesses use to collect, store, analyze, and visualize data. It is designed to provide a unified, streamlined approach to data management so that organizations can make data-driven decisions quickly and efficiently. The modern data stack differs from the traditional one in several ways. Traditional data stacks were built on a monolithic architecture that relied on expensive hardware and software licenses. They were hard to manage, slow to scale, and often produced data silos that hindered collaboration between teams. The modern data stack, by contrast, is built on a modular architecture that leverages cloud computing, open-source software, and APIs. This approach lets organizations use best-of-breed tools for each step of the data pipeline, resulting in a more flexible, scalable, and cost-effective solution.

Mar 14, 2023 · 8 min