Data engineering and analytics for leaders

S1E8 Data quality on Modern Data Stack

4 min · 15 March 2023

Description

In a modern data stack, data is collected from various sources, such as databases, APIs, and third-party applications, and then processed and transformed into a usable format for analysis. Data quality can suffer at every stage of this process, leading to unreliable insights and flawed decision-making. One of the biggest challenges is the sheer volume and variety of data: with so much arriving from different sources, it is hard to ensure that all of it is accurate, complete, and consistent. Another challenge is data lineage. With data flowing through multiple systems, it can be difficult to track where it originated and how it has been transformed over time, and this lack of transparency makes data quality issues harder to identify and address.
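
As a rough illustration of what checking for complete and consistent data can look like, here is a minimal Python sketch. It is not taken from the episode: the sample rows, the field names (`order_id`, `amount`, `currency`, `source`) and the `check_quality` helper are all hypothetical, and a real stack would more likely rely on a dedicated testing tool than on hand-written checks.

```python
# Hypothetical sample rows, as they might arrive from a database,
# an API, and a third-party application (all values are made up).
ROWS = [
    {"order_id": 1, "amount": 120.0, "currency": "NOK", "source": "database"},
    {"order_id": 2, "amount": None, "currency": "NOK", "source": "api"},
    {"order_id": 2, "amount": 75.5, "currency": "nok", "source": "third_party"},
]

# Assumption: these are the fields every row must carry.
REQUIRED_FIELDS = ("order_id", "amount", "currency")


def check_quality(rows):
    """Return a list of (row_index, issue) tuples for three simple checks."""
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Completeness: every required field must be present and non-null.
        for field in REQUIRED_FIELDS:
            if row.get(field) is None:
                issues.append((i, f"missing {field}"))

        # Consistency: currency codes should share one format (upper case here).
        currency = row.get("currency")
        if currency is not None and currency != currency.upper():
            issues.append((i, f"inconsistent currency format: {currency}"))

        # Uniqueness: the same order_id should not appear twice.
        order_id = row.get("order_id")
        if order_id is not None:
            if order_id in seen_ids:
                issues.append((i, f"duplicate order_id {order_id}"))
            seen_ids.add(order_id)
    return issues


if __name__ == "__main__":
    for index, issue in check_quality(ROWS):
        # The "source" field gives a crude hint of lineage for each problem row.
        print(f"row {index} (from {ROWS[index]['source']}): {issue}")
```

Keeping a `source` value on each row is a very simple stand-in for lineage: when a check fails, it at least shows which upstream system the problematic record came from.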

All episodes

8 episodes

S1E8 Data quality on Modern Data Stack

15 March 2023 · 4 min

S1E7 Modern Data Stack

A modern data stack combines the tools, technologies, and processes that businesses use to collect, store, analyze, and visualize data. It is designed to provide a unified and streamlined approach to data management, allowing organizations to make data-driven decisions quickly and efficiently. The modern data stack differs from the traditional one in several ways. Traditional data stacks were built on a monolithic architecture that relied on expensive hardware and software licenses. They were hard to manage, slow to scale, and often produced data silos that hindered collaboration between teams. The modern data stack, in contrast, is built on a modular architecture that leverages cloud computing, open-source software, and APIs. This approach lets organizations use best-of-breed tools for each step of the data pipeline, resulting in a more flexible, scalable, and cost-effective solution.

14 March 2023 · 8 min