Why Traditional Testing Fails for AI Systems - Dušanka Lečić

24 min · May 28, 2026

Description

From prompt failures to hallucinations: what breaks in AI testing

🚨 Are we actually testing too much sometimes? Just because we run a lot of tests doesn’t mean we’ll find a lot of bugs. Here’s how we can solve this: Free Online Workshop [https://tul.fm/team]

"For the same input we have a lot of different outputs, some of them can be similar, but yeah still non-determinism is completely there." - Dušanka Lečić

This time I talk with Dušanka Lečić about why testing chatbots breaks everything we know about traditional QA. She explains how chatbot bugs are invisible – they hide in prompts, retrieval logic, and chunks, not in code – and why the same input can produce dozens of valid outputs. Dušanka shares her framework for testing context retention, hallucination control, and accuracy, and reveals why stress testing a chatbot means checking for typos and user frustration, not system load.

Dušanka Lečić [https://www.linkedin.com/in/dusanka-lecic/] is a dynamic leader and technical expert with nearly a decade of experience steering software testing initiatives across international teams. As a Test Lead and Department Manager at Levi9, she specializes in performance testing, agile methodologies, and engineering excellence. Holding a Ph.D. in Technical Sciences, Dušanka blends academic insight with real-world execution, and is a frequent contributor to industry conferences, mentoring programs, and expert communities. Her sessions offer a rich perspective on quality assurance, innovation, and leadership in fast-paced development environments.

Highlights:
* Chatbot testing requires multiple valid test cases, unlike traditional testing's single pass scenario.
* Bugs in chatbots are invisible: hidden in prompts, retrieval logic, or generation, not code.
* Context retention across conversations matters more than isolated correct answers in chatbot testing.
* Stress testing chatbots means checking typos and frustration wording, not performance loads.
* Manual testing remains essential; no single tool automates complete chatbot quality verification yet.


All episodes

55 episodes

Why Traditional Testing Fails for AI Systems - Dušanka Lečić

May 28, 2026 · 24 min

Why Testers Are Safe Despite AI Hype - Mitko Mitev

From test planning to defect clustering: where AI already saves you 30% effort

🚨 Are we actually testing too much sometimes? Just because we run a lot of tests doesn’t mean we’ll find a lot of bugs. Here’s how we can solve this: Free Online Workshop [https://tul.fm/team]

"People should stop asking on interviews what's the difference between class and object. You should probably ask: What is MCP?" - Mitko Mitev

This time I talk to Mitko Mitev about how AI is reshaping our work as testers, without replacing us. Mitko shows exactly where AI tools save real time across test planning, test case generation, and exploratory testing, and why human expertise remains non-negotiable for context, business logic, and validation. We go into the shift from writing scripts to instructing agents in plain language, how ISTQB's new AI syllabi prepare testers for what's coming, and why waiting another year to explore AI might already be too late.

With over 30 years in software quality assurance and more than 20 years as a Project and Test Manager, Mitko Mitev [https://www.linkedin.com/in/mitko-mitev-030522/] is recognized as one of South East Europe’s leading software testing experts. A dedicated advocate for the QA and testing professions, he has been instrumental in establishing and promoting international standards through his work with the ISTQB and as President of the South East European Testing Board (SEETB). Mitko also serves as Chief Editor of Quality Matters magazine and Chair of the SEETEST conference, both focused on advancing global best practices in software quality. Today, Mitko continues to develop and refine educational materials, books, and articles that help professionals deepen their expertise in software testing. He is also the founder and owner of Quality House, a leading outsourcing and consultancy company with offices in Bulgaria, Serbia and Romania, proudly celebrating 21 years on the market and delivering world-class independent testing services.

Highlights:
* AI won't replace testers but shifts focus from mechanical tasks to creative thinking, validation, and training.
* AI generates test cases and data in days instead of months, saving 20-40% effort time.
* Plain language AI interfaces let business people write tests, expanding who can do test automation.
* Test automation coding skills still matter: someone must verify AI-generated scripts are correct and functional.
* Start learning AI testing now, not next year: software development speed now demands AI-assisted testing.

May 21, 2026 · 23 min

How to Build QA Culture in Your Company - Filip Barszcz

Why your stakeholders, devs and PMs all mean something different by "quality"

🚨 Are we actually testing too much sometimes? Just because we run a lot of tests doesn’t mean we’ll find a lot of bugs. Here’s how we can solve this: Free Online Workshop [https://tul.fm/team]

"The truth is that we are feedback givers for all the development teams." - Filip Barszcz

In this episode, I talk with Filip Barszcz about what most companies get wrong when they claim to have a quality culture. Filip reveals why stakeholders, developers, and product owners all speak different languages when they say "quality" and how he translates between them to build actual buy-in for testing strategy. He walks through his playbook for introducing change without burning out the team: small wins first, honesty about short-term productivity drops, and color-coded tables that make executives eager to invest in QA. If you've ever struggled to get testing taken seriously beyond "just click through it before release," this conversation gives you the roadmap.

Filip Barszcz [https://www.linkedin.com/in/filip-barszcz/] is a full-time QA Chapter Leader with over 10 years of experience in the IT industry. Throughout his career, he has collaborated with renowned organisations such as SCIB (Santander Corporate & Investment Banking), T-Mobile, Capital.Com, and IQVIA. He specialises in building and refining quality assurance processes, mentoring QA professionals, and fostering close collaboration between QA and development teams. As a strategic QA leader, he has driven major organisational transformations, from building QA departments from the ground up to restructuring teams for greater efficiency and alignment with business goals. He has successfully defined test strategy, designed automation architecture, and implemented multi-level testing, from unit to end-to-end coverage.

Highlights:
* QA's real job is being brutally honest feedback givers, not just testers stuck at the end.
* Share colored tables weekly: stakeholders love metrics and invest when they see QA's actual value.
* Start change with quick wins: better communication and information flow costs just hours weekly.
* Engage everyone impacted before big changes: consultancy beats shock therapy, even when you want speed.
* One major change per quarter maximum: tired teams resist everything when stability disappears.

May 14, 2026 · 29 min

Why Quality Engineers Fail at Business Thinking - Marta Firlej

How to prove your testing work in money - before the next budget cut hits

🚨 Are we actually testing too much sometimes? Just because we run a lot of tests doesn’t mean we’ll find a lot of bugs. Here’s how we can solve this: Free Online Workshop [https://tul.fm/team]

"That's the goal of every company. Every company, government and country make money." - Marta Firlej

In this episode, I talk with Marta Firlej about a topic most testers avoid: money. Marta explains why understanding how your company actually makes money is crucial for QA professionals, and walks through the real costs behind salaries, automation projects, and test activities that stakeholders care about. She shares a practical calculation method to assess whether test automation is worth the investment, and challenges us to translate testing value into business numbers.

Marta Firlej [https://www.linkedin.com/in/firlejmarta/] is the inventor and organizer of the testing conference test:fest [https://www.testfest.pl] in Wrocław, Poland. She is a proud member of the Polish and European testing community: she organizes various events, shares her knowledge and experience as a speaker, and participates as an attendee. She currently works as Head of FS Testing Practice at Capgemini in Poland. Throughout her career, she has worked in different positions, always with quality at heart, across industries such as finance, healthcare, edutech, etc. Her favorite part is working with people.

Highlights:
* Testers must calculate testing's business value in money, not just report quality information.
* Employee true cost is roughly double their salary when including taxes, benefits, and overhead.
* Automation isn't worth it for rapidly changing features, POCs, or non-critical small-audience functions.
* Companies exist to make money: testers ignoring business fundamentals risk being cut first during crises.
* Test reports aren't optional because clients don't ask: experts must proactively show stakeholder value.

More Links with Insights:
* Testwarez Conference [https://testwarez.pl]

May 7, 2026 · 19 min

Building Trust with AI Agents - Henri Terho

How to build trust into AI systems when they constantly change underneath you

🚨 Are we actually testing too much sometimes? Just because we run a lot of tests doesn’t mean we’ll find a lot of bugs. Here’s how we can solve this: Free Online Workshop [https://tul.fm/team]

"AI doesn't think, it doesn't analyze, it predicts." - Henri Terho

In this episode, I talk with Henri Terho, senior consultant and AI enthusiast, about why building trust in AI systems requires the same rigor we've always applied to software, just now at a whole new level. Henri explains how AI agents multiply both our successes and our mistakes, why prompting is harder than it looks, and why testers are uniquely positioned to thrive in this shift. We dig into the oracle problem, the communication trap, and why your test suite might soon matter more than your codebase.

Henri Terho [https://www.linkedin.com/in/henriterho/] is a Senior AI Consultant at Eficode with broad experience spanning regulated industries (automotive, banking, aerospace, and beyond) alongside a deep commitment to open-source collaboration. He has played a key role in fostering community-driven innovation, having served as chairman of Tampere Entrepreneurship Society and co-founding Tampere Tribe to support local startup culture. Henri’s passion for AI, quality assurance, and rapid software development is evident in both his industry work and ongoing PhD research on agile product innovation. He frequently shares his expertise on stage and in publications, championing lean practices and the latest AI advances to empower organizations worldwide.

Highlights:
* AI models only predict, don't think: trust requires building validation systems and guardrails around them.
* Testing must shift from deterministic green/red checks to monitoring trends and statistical validation over time.
* Communication problems with AI mirror human ones: vague prompts fail like vague requirements always did.
* Testers' skill set (writing specs, defining criteria, verification) is perfectly positioned for AI-driven development.
* AI democratizes software creation but surfaces old problems: conflicting documents, unclear specs, poor documentation.

April 30, 2026 · 21 min
