<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Datacraft]]></title><description><![CDATA[Charting the confluence of data, AI, and business – insights for novices and experts as we explore this ever-evolving landscape.]]></description><link>https://www.datacraft.wiki</link><image><url>https://substackcdn.com/image/fetch/$s_!B9-O!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc339250a-b353-458c-bb0b-ac92bda74ae5_1280x1280.png</url><title>Datacraft</title><link>https://www.datacraft.wiki</link></image><generator>Substack</generator><lastBuildDate>Thu, 23 Apr 2026 12:17:01 GMT</lastBuildDate><atom:link href="https://www.datacraft.wiki/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Fenil Dedhia]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[datacraft@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[datacraft@substack.com]]></itunes:email><itunes:name><![CDATA[Fenil Dedhia]]></itunes:name></itunes:owner><itunes:author><![CDATA[Fenil Dedhia]]></itunes:author><googleplay:owner><![CDATA[datacraft@substack.com]]></googleplay:owner><googleplay:email><![CDATA[datacraft@substack.com]]></googleplay:email><googleplay:author><![CDATA[Fenil Dedhia]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The Definitive Guide to Implementing Data Governance]]></title><description><![CDATA[Breaking down modern data governance into a practical, actionable framework that adapts to your organization's maturity level and unique 
challenges.]]></description><link>https://www.datacraft.wiki/p/the-definitive-guide-to-implementing-data-governance</link><guid isPermaLink="false">https://www.datacraft.wiki/p/the-definitive-guide-to-implementing-data-governance</guid><dc:creator><![CDATA[Fenil Dedhia]]></dc:creator><pubDate>Fri, 11 Apr 2025 02:17:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!cMGh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I spent over 3 months finding and studying the best data governance resources, and reading up on implementation stories and case studies from popular data management forums, Slack communities, and even Reddit and LinkedIn.</p><p>And one thing was very clear:&nbsp;<br><strong>Most "how to" data governance resources fail at the implementation stage.</strong></p><p>Why?</p><p>Because they were built for a different era.&nbsp;Traditional frameworks like <a href="https://www.dama.org/cpages/body-of-knowledge">DAMA-DMBOK</a> and <a href="https://datagovernance.com/the-dgi-data-governance-framework/">DGI's Data Governance Framework</a> suffer from critical limitations.</p><p>They emerged when data lived in centralized data warehouses, when regulatory pressures were the primary drivers, and when risk management concerns trumped value creation through data enablement.</p><p>Today's data ecosystem demands something different.</p><p>This is why, if you've tried implementing data governance before, you've likely faced one or more of these challenges:</p><ul><li><p>Resistance from teams who see governance as bureaucracy rather than enablement</p></li><li><p>Difficulty measuring and demonstrating the tangible value of data governance</p></li><li><p>Frameworks that collapse under real-world complexity</p></li><li><p>Technology that promises solutions but creates new 
problems (low/negative ROI)</p></li></ul><p>This guide takes a practical approach.</p><p>If you want to <em>start implementing data governance</em> in your organization, this resource is for you.</p><p>Building on our previous discussions about <a href="https://datacraft.substack.com/p/the-rise-of-modern-data-governance?r=6121o">the rise of modern data governance</a> and the <a href="https://datacraft.substack.com/p/crafting-a-data-governance-strategy-the-defensive-offensive-framework?r=6121o">offensive-defensive governance forces</a>, we're now moving from theory to practice.</p><p>Let&#8217;s start with the bottom line up front.</p><h1>Executive Summary: Implementing Modern Data Governance</h1><p>If you're looking to build effective governance in your organization, here are the key principles to guide your journey:</p><p><strong>Core Principles of Modern Data Governance:</strong></p><ul><li><p>Balance value creation (offensive) with value protection (defensive)</p></li><li><p>Align governance directly with business objectives and product goals</p></li><li><p>Start small with high-value domains, then scale methodically</p></li><li><p>Build governance into workflows rather than imposing it as a separate process</p></li><li><p>Measure success through business outcomes, not governance activities</p></li></ul><p><strong>Getting Started - Your First 90 Days:</strong></p><ol><li><p>First 30 days: Measure your current state, identify 1-2 high-priority domains, establish initial roles</p></li><li><p>Days 31-60: Implement basic quality monitoring, enhance discovery capabilities, develop essential policies</p></li><li><p>Days 61-90: Measure improvements, expand to additional domains, refine based on feedback</p></li></ol><p><strong>The Framework at a Glance:&nbsp;</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!1mPY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1mPY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 424w, https://substackcdn.com/image/fetch/$s_!1mPY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 848w, https://substackcdn.com/image/fetch/$s_!1mPY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 1272w, https://substackcdn.com/image/fetch/$s_!1mPY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1mPY!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp" width="1200" height="675" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:56918,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.datacraft.wiki/i/162129901?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1mPY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 424w, https://substackcdn.com/image/fetch/$s_!1mPY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 848w, https://substackcdn.com/image/fetch/$s_!1mPY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 1272w, https://substackcdn.com/image/fetch/$s_!1mPY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdca23e9-de53-4be2-a9c0-1c619da73159_3000x1688.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">High-level framework for implementing Data Governance</figcaption></figure></div><p>Our framework consists of three interconnected layers that build upon your existing technology:</p><ol><li><p>Foundation: Strategic alignment with business objectives and product vision</p></li><li><p>Offensive Governance: Data reliability, data democratization, and operational excellence</p></li><li><p>Defensive Governance: Compliance, security, and risk management</p></li></ol><p>Throughout this guide, you'll learn how to customize this framework for your specific organizational context and maturity level.</p><p>Remember: effective governance isn't built overnight! 
It's an evolving practice that grows with your organization.</p><h1>How to Use This Guide</h1><p>This guide is designed to be useful whether you're just learning about data governance or actively implementing it in your organization. Here's how to navigate this content and find the most relevant starting points for your specific needs.</p><h3>For Different Reader Journeys</h3><p><strong>If you're exploring data governance </strong><em><strong>concepts</strong></em><strong>:</strong></p><ul><li><p>Start with the sections "<em>Executive Summary</em>" and "<em>The Modern Data Governance Framework: An Overview</em>"</p></li><li><p>Focus on understanding the balance between offensive and defensive governance. Part 2 of this series might be particularly helpful: <a href="https://datacraft.substack.com/p/crafting-a-data-governance-strategy-the-defensive-offensive-framework?r=6121o">Balancing a Data Governance Strategy: The Defensive-Offensive Framework</a></p></li></ul><p><strong>If you're implementing governance </strong><em><strong>now</strong></em><strong>:</strong></p><ul><li><p>Use the Data Governance Implementation Navigator below to identify your starting point</p></li></ul><p><strong>If you're enhancing </strong><em><strong>existing</strong></em><strong> governance:</strong></p><ul><li><p>Focus on balancing offensive and defensive capabilities. 
Use the Data Governance Implementation Navigator below to identify a starting point</p></li><li><p>Read the "<em>Building the Tech Foundation: Choosing the Right Toolkit for Governance</em>" section to evaluate your current tools</p></li></ul><h3>&#127775; Data Governance Implementation Navigator: Start Here</h3><p>Use this decision tree to identify where to focus your governance efforts first:</p><p><strong>Step 1: What's your primary motivation?</strong></p><ul><li><p>Regulatory pressure &#8594; Start with "<em>Value Protection: Managing Risks and Enforcing Regulatory Compliance</em>"</p></li><li><p>Data reliability &#8594; Start with "<em>Value Creation: Enforce Business Rules through Data Quality and Observability</em>"</p></li><li><p>Data literacy and democratization &#8594; Start with "<em>Democratization: Making Data Work for Everyone</em>"</p></li><li><p>Optimizing governance operations &#8594; Start with "<em>Operational Excellence: Optimizing Regular Governance Operations</em>"</p></li><li><p>Building data foundation &#8594; Start with "<em>Building Your Foundation: Aligning Governance with Business Strategy</em>"</p></li></ul><p><strong>Step 2: What's your organization's data maturity?</strong></p><ul><li><p>Early stage (undefined data practices, limited tooling) &#8594; Focus on basic documentation, core quality rules, and essential access controls</p></li><li><p>Growing (some practices established, inconsistent application) &#8594; Implement domain ownership, standard quality metrics, and self-service discovery</p></li><li><p>Mature (established practices, seeking optimization) &#8594; Focus on integration across domains, advanced quality automation, and enhanced democratization</p></li></ul><p><strong>Step 3: What resources do you have available?</strong></p><ul><li><p>Limited (part-time roles, minimal budget) &#8594; Begin with high-value quick wins that require minimal investment</p></li><li><p>Moderate (dedicated roles, some budget) &#8594; 
Implement a balanced approach across offensive and defensive capabilities</p></li><li><p>Substantial (dedicated team, significant budget) &#8594; Build comprehensive capabilities with advanced tooling</p></li></ul><p><strong>Step 4: Which domain should you tackle first? Choose a domain that:</strong></p><ol><li><p>Is high-value to the business</p></li><li><p>Is experiencing painful data issues</p></li><li><p>Has engaged stakeholders</p></li><li><p>Has a manageable scope</p></li></ol><p>Common domains to start with include customer data, product data, or financial data, depending on your specific business priorities.</p><h2>Moving Beyond Traditional Frameworks: What Actually Works</h2><p>Traditional frameworks like DAMA-DMBOK, DCAM, and DGI's Data Governance Framework suffer from critical limitations:</p><ol><li><p><strong>Rigid processes with centralization bias</strong>: They prescribe fixed processes instead of adaptive approaches. They assume centralized control when today's reality is distributed data ownership</p></li><li><p><strong>Compliance-first mentality</strong>: They prioritize risk management over value creation</p></li><li><p><strong>Technology lag</strong>: They don't account for modern data stack capabilities. 
They focus on structured data when most organizations now manage complex, multi-modal data (think data lakes and lakehouses)</p></li></ol><p>How can you tell if a framework will work for your context?</p><p>Ask yourself:<br>"Does this framework acknowledge the reality of how my organization actually works?"</p><h3>Evaluation criteria for selecting the right approach</h3><p>When evaluating governance frameworks, look for these qualities:</p><ol><li><p><strong>Business alignment</strong>: Connects governance activities to your specific business objectives</p></li><li><p><strong>Balance</strong>: Addresses both offensive (value-creating) and defensive (value-protecting, risk-mitigating) needs</p></li><li><p><strong>Adaptability</strong>: Allows customization to fit your organizational structure and culture. It should offer different implementation paths based on your current capabilities. Modern data governance must be adaptive by design.</p></li></ol><p>The most effective frameworks don't prescribe specific solutions but provide decision frameworks that help you make the right choices for your context.</p><p>Reality? Data governance is never "done." It's an ongoing effort that evolves with your organization.</p><h2>The Modern Data Governance Framework: An Overview</h2><p>Establishing a clear framework early on is critical. It clarifies what data governance is and what it is not, helping to avoid confusion, set expectations, and drive adoption.</p><p>The framework we&#8217;re about to explore addresses both the defensive aspects of governance (compliance, security, risk management) and the offensive elements (data reliability, data democratization, operational excellence) that create tangible business value.</p><p>The framework is simple. 
At its core,</p><ul><li><p>Defensive Data Governance <em>protects</em> business value.</p></li><li><p>Offensive Data Governance <em>creates</em> business value.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cMGh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp" data-component-name="Image2ToDOM"><div class="image2-inset image2-full-screen"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cMGh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 424w, https://substackcdn.com/image/fetch/$s_!cMGh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 848w, https://substackcdn.com/image/fetch/$s_!cMGh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 1272w, https://substackcdn.com/image/fetch/$s_!cMGh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cMGh!,w_5760,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;full&quot;,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:556274,&quot;alt&quot;:&quot;Modern Data Governance Framework&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.substack.com/i/162129901?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-fullscreen" alt="Modern Data Governance Framework" title="Modern Data Governance Framework" srcset="https://substackcdn.com/image/fetch/$s_!cMGh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 424w, https://substackcdn.com/image/fetch/$s_!cMGh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 848w, https://substackcdn.com/image/fetch/$s_!cMGh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 1272w, https://substackcdn.com/image/fetch/$s_!cMGh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0b7a434-a8f9-454e-a337-8bd161bc83b4_10000x5625.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">A practical framework to implement modern data governance</figcaption></figure></div><p>The framework consists of three interconnected layers:</p><ol><li><p><strong>The Foundation</strong></p><ol><li><p><strong>Strategic Governance Inputs</strong>: The foundation that aligns governance with business goals</p></li><li><p><strong>Technology Toolkit</strong>: The framework sits atop a technology layer that provides the tools and infrastructure to implement the required capabilities effectively.</p></li></ol></li><li><p><strong>Offensive Data Governance</strong></p><ol><li><p><strong>Data Reliability</strong>: Enables trust and usage of 
data</p></li><li><p><strong>Data Democratization</strong>: Makes data accessible to all appropriate stakeholders</p></li><li><p><strong>Operational Excellence</strong>: Optimizes day-to-day practices that sustain governance</p></li></ol></li><li><p><strong>Defensive Data Governance</strong></p><ol><li><p><strong>Compliance</strong>: Monitors the regulatory landscape and ensures compliance standards are met.</p></li><li><p><strong>Security</strong>: The foundation of protection - implementing controls, managing access, and safeguarding data throughout its lifecycle. While compliance tells you what to protect, security determines how to protect it.</p></li><li><p><strong>Risk Management</strong>: Beyond immediate security concerns, this involves anticipating threats, managing vulnerabilities, and maintaining data integrity across the organization.</p></li></ol></li></ol><p>Each layer contains specific capabilities and activities that organizations develop according to their maturity level and needs.</p><p>The framework deliberately incorporates both offensive and defensive forces, building on the balance we explored in depth in our previous guide on the offensive-defensive framework. For specific strategies on calibrating this balance for your organization's unique context, refer back to that resource.</p><p><strong>Feeling Overwhelmed? You're Closer Than You Think!</strong></p><p>Most organizations already have foundational elements of governance in place, even if they don't call it "data governance." 
Look for these existing capabilities you can build upon:</p><ul><li><p>Documentation: Product specs, API documentation, and data dictionaries</p></li><li><p>Quality checks: Validation rules in applications, data pipeline tests</p></li><li><p>Access controls: User permissions, role-based security</p></li><li><p>Data ownership: Tribal knowledge of who manages key systems</p></li></ul><p>Instead of building from scratch, inventory these existing capabilities and formalize them within your governance framework. This approach can accelerate implementation and increase adoption by building on familiar practices.</p><p>In the following sections, we'll explore each layer of the framework in detail, with practical guidance on implementation.</p><h2>Building Your Foundation: Aligning Governance with Business Strategy</h2><p>Effective governance doesn't start with policies or tools - it starts with strategy.</p><p>The foundational layer of the framework ensures your governance efforts address the right problems and deliver the right value for your specific context.</p><h3>Step 1: Aligning with company mission and business objectives</h3><p>Generic governance programs fail. Full stop.</p><p>Your governance approach must <em>explicitly</em> connect to your organization's strategic priorities. Here are four things to quickly document first:</p>
      <p>
          <a href="https://www.datacraft.wiki/p/the-definitive-guide-to-implementing-data-governance">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Balancing a Data Governance Strategy: The Defensive-Offensive Framework]]></title><description><![CDATA[This practical framework for viewing data governance takes you through defensive and offensive forces. Discover how to strike the right balance for your strategy. This is Part 2 of "The Datacraft Guide to Data Governance."]]></description><link>https://www.datacraft.wiki/p/crafting-a-data-governance-strategy-the-defensive-offensive-framework</link><guid isPermaLink="false">https://www.datacraft.wiki/p/crafting-a-data-governance-strategy-the-defensive-offensive-framework</guid><dc:creator><![CDATA[Fenil Dedhia]]></dc:creator><pubDate>Wed, 12 Feb 2025 23:10:48 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/116b2452-859c-4f70-862b-0447d4d8b2bd_960x540.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bc8B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bc8B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!Bc8B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 848w, 
https://substackcdn.com/image/fetch/$s_!Bc8B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!Bc8B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bc8B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png" width="960" height="540" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:82155,&quot;alt&quot;:&quot;Crafting a balanced governance strategy&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://datacraft.substack.com/i/162129899?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Crafting a balanced governance strategy" title="Crafting a balanced governance strategy" srcset="https://substackcdn.com/image/fetch/$s_!Bc8B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 424w, 
https://substackcdn.com/image/fetch/$s_!Bc8B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!Bc8B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!Bc8B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17f1f7dc-a549-475e-a08c-1acc59523b38_960x540.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">The Dual Forces of Data Governance: A push-pull framework</figcaption></figure></div><p>In today's data-driven landscape, organizations face a tough balancing act of protecting and securing data while simultaneously making it accessible and valuable. The best way to internalize this inherent duality is to view data governance through a defensive-offensive framework.</p><p><strong>Why does this matter now?</strong><br>The stakes for getting data governance right have never been higher. Organizations are grappling with an exponential increase in data volume and complexity, while facing intensifying regulatory pressures and cyber threats. Simultaneously, the role of data in driving competitive advantage has become even more critical. Building on our previous discussion about <a href="https://nohellofenil.com/the-rise-of-modern-data-governance/">The Rise of Modern Data Governance</a>, we can now understand the practical challenge of balancing governance with accessibility, and security with democratization.</p><blockquote><p>The companies that master the balance between protection and value creation will separate themselves from those that get stalled by either excessive controls or unmanaged risks.</p></blockquote><p>Think of data governance as being shaped by two opposing forces. 
The defensive force is pushing, while the offensive force is pulling.</p><p>Understanding these forces - and how they interact - is kind of a cheat code for making informed governance decisions in any given situation.</p><p>In this second installment of the <em>The Datacraft Guide to Data Governance</em> series, I'll break down:</p><ul><li><p>The defensive aspects: What they encompass and why they matter</p></li><li><p>The offensive aspects: Their role in driving business value</p></li><li><p>How these forces interact in different organizational contexts</p></li><li><p>Ultimately, the million-dollar question: How do you craft a balanced governance strategy?</p></li></ul><p>This post is not about prescribing <em>specific implementations</em>. Rather, it's about <em>understanding the framework</em> so you can better evaluate and balance competing demands in your own governance initiatives.</p><p>That holds whether you're revamping your data governance strategy or building one from scratch.</p><h2>The Dual Forces of Data Governance: An Overview</h2><p>The framework is simple. 
At its core,</p><ul><li><p>Defensive Data Governance <em>protects</em> business value.</p></li><li><p>Offensive Data Governance <em>creates</em> business value.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r6cD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r6cD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!r6cD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!r6cD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!r6cD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r6cD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png" width="960" height="540" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:58402,&quot;alt&quot;:&quot;Defensive Governance protects business value and Offensive Governance creates business value&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.substack.com/i/162129899?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Defensive Governance protects business value and Offensive Governance creates business value" title="Defensive Governance protects business value and Offensive Governance creates business value" srcset="https://substackcdn.com/image/fetch/$s_!r6cD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!r6cD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!r6cD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 1272w, 
https://substackcdn.com/image/fetch/$s_!r6cD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2f1d8c0-6e82-4bc2-a6ce-745aa7eed407_960x540.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Dual Forces of Data Governance: A push-pull framework</figcaption></figure></div><h3>The 'Defensive' Force of Data Governance</h3><p>The defensive force emerges from the need to protect data assets and ensure compliance. 
This force <em>pushes</em> back against risks and uncontrolled data usage, acting as a necessary constraint on how data is handled and accessed.</p><p>The defensive force protects business value through three primary channels:</p><ol><li><p><strong>Compliance:</strong></p><ol><li><p><em>Industry-specific Regulations</em>: From HIPAA in healthcare to BCBS 239 in banking, each industry brings its own regulatory demands</p></li><li><p><em>Privacy Compliance Laws</em>: Global requirements like GDPR and regional ones like CCPA shape how we handle personal data</p></li><li><p><em>Audit and Reporting Requirements</em>: Organizations must maintain detailed audit trails and documentation to demonstrate ongoing regulatory adherence</p></li><li><p><em>Cross-Border Data Transfer</em>: Strict protocols govern how data moves between jurisdictions, especially for transfers involving the EU and US</p></li><li><p><em>Emerging Compliance Requirements</em>: As data capabilities evolve, new compliance frameworks emerge to address novel risks</p></li></ol></li><li><p><strong>Security:</strong> The foundation of protection - implementing controls, managing access, and safeguarding data throughout its lifecycle. While compliance tells you what to protect, security determines how to protect it.</p></li><li><p><strong>Risk Management:</strong> Beyond immediate security concerns, this involves anticipating threats, managing vulnerabilities, and maintaining data integrity across the organization.</p></li></ol><h3>The 'Offensive' Force of Data Governance</h3><p>The offensive force emerges from the need to create value from data. 
This force <em>pulls</em> organizations toward innovation and broader data utilization, driving the expansion of how data is leveraged across the business.</p><p>The offensive force drives business value through three primary channels:</p><ol><li><p><strong>Data Reliability:</strong> At the core of business value creation sits a comprehensive <em>Data Reliability Framework</em>. Business value is created through outcomes like:</p><ol><li><p>Establishing comprehensive <em>data quality</em> and <em>data observability</em> capabilities through structured monitoring and remediation processes, while maintaining appropriate guardrails</p></li><li><p>Enabling both reliable decisions and innovative initiatives (like AI-driven data pipelines and workflows) through trusted data</p></li><li><p>Creating opportunities for data-driven growth and transformation through a <em>Data-as-a-Product</em> mindset</p></li></ol></li><li><p><strong>Data Democratization:</strong> Making data accessible to those who need it while maintaining appropriate controls to ensure only those with proper approvals can access it. The goal is empowerment without chaos or misuse, pursued in three primary ways:</p><ol><li><p>Building trust through transparent quality metrics and certification</p></li><li><p>Enabling collaboration between domain experts and technical teams</p></li><li><p>Creating a shared terminology and understanding of data assets</p></li></ol></li><li><p><strong>Operational Excellence:</strong> Achieved mainly by transforming how work gets done through well-governed data flows. This goes beyond efficiency to fundamentally improve how organizations operate, making processes more effective while reducing friction in data-driven activities. 
Some of the ways operational excellence can be scaled:</p><ol><li><p>Implementing continuous monitoring of data quality against business-defined rules</p></li><li><p>Establishing remediation workflows that turn each issue into a learning opportunity</p></li><li><p>Creating feedback loops between data quality incidents and process improvements</p></li></ol></li></ol><p>Each channel reinforces the others. Better-governed data enables more users to make better decisions, leading to improved operations and new opportunities for value creation. The idea is to maintain this forward momentum while ensuring appropriate controls remain in place.</p><div><hr></div><p>&#128161; Now, <strong>let's examine each force in detail</strong>, starting with the defensive force that forms the foundation of any robust governance framework. Understanding these components will help you identify where to strengthen protection without creating unnecessary restrictions.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JlZ2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JlZ2!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 424w, https://substackcdn.com/image/fetch/$s_!JlZ2!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 848w, 
https://substackcdn.com/image/fetch/$s_!JlZ2!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 1272w, https://substackcdn.com/image/fetch/$s_!JlZ2!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JlZ2!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif" width="498" height="389" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:389,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!JlZ2!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 424w, https://substackcdn.com/image/fetch/$s_!JlZ2!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 848w, https://substackcdn.com/image/fetch/$s_!JlZ2!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 1272w, 
https://substackcdn.com/image/fetch/$s_!JlZ2!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8d37555-a915-4e9c-9058-fca4866909e6_498x389.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Defensive Data Governance</h2><p>Think of defensive data governance as your organization's <strong>immune system </strong>- your data's very own <a href="https://dc-fanbase-snyderverse.fandom.com/wiki/Justice_League">Justice League</a>.</p><p>It <em>pushes</em> back against risks and uncontrolled data usage by establishing the protection mechanisms that keep your data 
assets secure while ensuring you remain compliant with an ever-evolving regulatory landscape.</p><h3>Compliance</h3><p><em>1. Industry-Specific Compliance</em></p><p>Data governance helps organizations navigate a maze of industry-specific regulations. Whether it's HIPAA in healthcare, BCBS 239 in banking, or FERPA in education, each industry comes with its own regulatory demands.</p><p>In healthcare, HIPAA mandates strict controls over Protected Health Information (PHI). Consider a medical research institution that needs to share patient data with external researchers. Every data point must be carefully de-identified, access must be strictly controlled and logged, and any data transfers must be encrypted. This isn't just about having the right data; it's about proving it's being handled correctly.</p><p><em>2. Privacy Compliance Laws</em></p><p>Privacy regulations have evolved from simple guidelines to complex, enforceable laws with serious teeth. The GDPR set a new global standard when it took effect in 2018, but it's just one piece of a growing privacy puzzle. From CCPA in California to the UAE's Personal Data Protection Law, organizations face an expanding web of privacy requirements.</p><p>In January 2024, France's privacy watchdog CNIL fined Amazon France Logistique &#8364;32 million for what it deemed an "excessively intrusive" employee surveillance system. The regulator found issues with how the company tracked employee scanner inactivity time and item scanning speeds, and with how long it retained this data. This case demonstrates how compliance extends beyond customer data privacy to encompass employee privacy rights as well.</p><p><em>3. Audit and Reporting Requirements</em></p><p>Compliance often requires organizations to maintain detailed audit trails and generate reports demonstrating adherence to regulations. This includes documenting data access patterns, changes to sensitive information, and proof of required security controls.</p><p><em>4. 
Cross-Border Data Transfer Compliance</em></p><p>With global operations becoming the norm, organizations must navigate complex requirements for international data transfers. This includes understanding and implementing appropriate data transfer mechanisms, maintaining required documentation, and ensuring continued compliance as regulations evolve.</p><p>Example: When European companies store or process EU resident data in cloud services hosted in the United States, they must follow strict protocols. They typically need to rely on a transfer mechanism such as Standard Contractual Clauses (SCCs), conduct transfer impact assessments, and maintain detailed documentation about their data protection measures. This often requires mapping data flows, assessing risks in the destination country, and implementing additional safeguards where necessary.</p><p><em>5. Emerging Compliance Requirements</em></p><p>Many compliance regulations already exist, but data-driven industries change quickly enough that new ones keep emerging to combat new threats, like advanced ransomware, and to cover new data methodologies. We'll get to those later.</p><p>Example: The proposed Social Media Privacy Protection and Consumer Rights Act of 2021, a bill introduced in the U.S. Senate, would require social media platform operators to provide users with information about data collection and usage before they create an account.</p><h3>Data Security</h3><p>Consider a healthcare startup implementing role-based access control to manage patient data access. Their governance framework can ensure clinical staff see full patient records, billing staff access only financial information, and research teams work with anonymized data. The system can use data masking to show only the last four digits of social security numbers to billing staff while completely hiding them from research teams. 
Each access attempt can be logged and monitored, creating an audit trail of who accessed what data and when.</p><p>Strong security is about creating secure pathways for legitimate use. It's about implementing the technical and organizational controls that protect data throughout its lifecycle. When security controls are well-designed, they protect sensitive data while enabling authorized users to work efficiently.</p><h3>Risk Management</h3><p>A good risk management practice will address these three types of risks:</p><ul><li><p><em>Data Integrity Risks</em>: Ensuring data remains accurate and untampered throughout its lifecycle</p></li><li><p><em>Operational Risks</em>: Managing the risks of data unavailability or system failures</p></li><li><p><em>Reputational Risks</em>: Protecting against data breaches and misuse that could damage trust</p></li></ul><p>Consider a financial services firm addressing all three risk types through a comprehensive approach. For data integrity, they could implement validation checks and audit trails to ensure market data accuracy. For operational risks, they might discover analysts using personal email accounts to circumvent slow systems - leading them to improve system performance and provide secure file-sharing alternatives. For reputational risks, they could establish clear data handling protocols and monitoring systems to prevent unauthorized data sharing. This comprehensive approach can address both the immediate security risks and the underlying operational issues driving risky behavior. <br><br><em>Tackling risks holistically &gt; just implementing restrictions.</em></p><p>Risk management should extend beyond immediate security concerns to encompass a broader view of potential threats to data assets. 
This includes identifying vulnerabilities, assessing their potential impact, and implementing controls to mitigate them.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OTsj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OTsj!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 424w, https://substackcdn.com/image/fetch/$s_!OTsj!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 848w, https://substackcdn.com/image/fetch/$s_!OTsj!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 1272w, https://substackcdn.com/image/fetch/$s_!OTsj!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OTsj!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif" width="498" height="293" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:293,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!OTsj!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 424w, https://substackcdn.com/image/fetch/$s_!OTsj!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 848w, https://substackcdn.com/image/fetch/$s_!OTsj!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 1272w, https://substackcdn.com/image/fetch/$s_!OTsj!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9476fc98-fa48-4b10-8643-9140cc393297_498x293.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Offensive Data Governance</h2><p>Think of offensive data governance as your organization's <strong>growth engine </strong>- your data's <a href="https://en.wikipedia.org/wiki/X-Force">X-Force</a>.</p><p>It <em>pulls</em> organizations toward innovation and broader data utilization by establishing the frameworks and capabilities that transform data from a raw material into business value, while maintaining appropriate controls.</p><h3>Data Reliability</h3><p>At the core of business value creation sits a comprehensive <em>Data Reliability Framework</em>. This framework connects data quality with systematic observability, monitoring, and remediation capabilities.</p><p>Imagine a manufacturing company that transforms its sensor data from a compliance burden into a competitive advantage. By implementing a comprehensive data quality framework, they can enable plant managers to make reliable decisions about maintenance and resource allocation. 
This same governed data foundation could then enable their data science team to develop AI-powered predictive maintenance. Because the data is already well documented, validated, and accessible, users can focus on innovation rather than data preparation. The AI-powered predictive maintenance system can help prevent equipment failures and identify optimization opportunities that significantly reduce energy consumption.</p><p>In healthcare contexts, companies could implement observability tools that continuously monitor patient demographic data quality. When the system detects a pattern of missing or inconsistent information, it could automatically trigger a remediation workflow. These quality improvements might then enable more accurate patient matching across systems, potentially reducing duplicate testing and improving care coordination. The organization could track how these quality improvements directly impact clinical outcomes and operational efficiency.</p><p>These examples illustrate how well-governed data becomes a trusted foundation for both everyday decisions and innovative initiatives. When data is high quality, standardized, and certified, teams can use it as-is to accelerate the path from idea to value. Easier said than done.</p><h3>Data Democratization</h3><p>Data democratization will amplify the impact of data literacy and data governance, and it's one of the most rewarding transformations that you can pull off in any organization. 
This is because data democratization goes beyond efficiency gains &#8211; it can transform how teams collaborate and serve customers.</p><p>For instance, product and consulting teams can iterate faster on customer insights, make feature prioritization more data-driven, and quantify customer success metrics without constantly requiring support from their data team.</p><p>Consider an educational institution implementing a self-service analytics platform where faculty can access student performance data with clear quality indicators. The system could provide transparency into data lineage and quality metrics, building trust among users who might otherwise question the reliability of insights. When faculty members spot potential issues, they could contribute to quality improvement through collaborative feedback mechanisms.</p><p>Government agencies could implement data quality certification processes that visibly indicate which datasets meet rigorous quality standards. This approach could encourage broader use of high-quality public data while maintaining appropriate governance. Agency staff without technical backgrounds could confidently use certified data, knowing that quality monitoring and remediation processes are actively maintaining its accuracy.</p><p>For builders and consultants, data democratization is about striking the right balance between enablement and control. It's not just about providing access &#8211; it's about creating an environment where teams can confidently discover, understand, and use data in their daily work while maintaining security and compliance. This requires a mix of technology, processes, and culture.</p><h3>Operational Excellence</h3><p>Operational excellence focuses on establishing robust processes that maintain data quality throughout the data lifecycle. 
The more mature your operational excellence, the more of the benefits of offensive data governance you can realize.</p><p>Consider a global enterprise standardizing its customer interaction data across regions. By implementing a governance framework that ensures consistent data capture and classification across different teams and channels, from sales to support, it can build a unified view of customer relationships. This can reduce duplicated effort and help identify opportunities that were previously hidden in siloed systems.</p><p>Well-governed data enables both efficiency and predictability in operations. When data flows smoothly and reliably between systems, teams spend less time fighting data issues and more time creating value.</p><div><hr></div><blockquote><p>The sweet spot in data governance is where defensive guardrails enable offensive initiatives, creating an environment where innovation thrives within secure boundaries.</p></blockquote><div><hr></div><p>By treating data as a product with quality at its center, offensive data governance can transform how organizations create value from their data assets. The continuous quality improvement cycle - observe, monitor, remediate, improve business rules - creates a virtuous circle that progressively enhances data's business impact.</p><h2>Striking the Right Balance</h2><p>The key to effective data governance isn't choosing between these forces but orchestrating their interplay. 
Each force serves an essential purpose, and neither should dominate completely.</p><p>Consider these scenarios:</p><ul><li><p>When defensive forces overpower offensive ones, you get a "data vault" that stifles innovation and value creation</p></li><li><p>When offensive forces run unchecked in pursuit of business value and operational goals, you risk security breaches and compliance violations</p></li><li><p>The sweet spot is where defensive guardrails enable rather than restrict offensive initiatives</p></li></ul><p>This balance isn't static - it shifts based on your organization's evolving context. You need to recognize which force needs strengthening in your specific context, rather than prescribing a one-size-fits-all approach.</p><blockquote><p>Good data governance looks much like a world-class nightclub - tight security at the door without the excessive wait, freedom to dance inside, and savvy bartenders who keep the right drinks flowing quickly to the right people. No bottlenecks, just smooth service and a great experience.<br>- Fenil Dedhia</p></blockquote><h2>&#127775; Crafting Your Defensive-Offensive Strategy</h2><p>Consider the following key questions to assess your current situation (not meant to be a comprehensive list):</p><h3>Business Objectives</h3><ol><li><p><strong>What's your primary goal?</strong> Growth needs offensive focus, risk mitigation needs defensive strength. Growth phases might require amplifying offensive forces. Crisis periods could demand stronger defensive measures. A startup focusing on market expansion might emphasize offensive capabilities, while a company recovering from a data breach might need to strengthen defensive measures. A new market entry might need both forces working in concert.</p></li><li><p><strong>What's your time horizon?</strong> Short-term compliance needs might require immediate defensive action. 
Companies preparing for an IPO or entering new regulated markets need to prioritize defensive capabilities to meet compliance requirements.</p></li><li><p><strong>What resources do you have?</strong> Limited resources might require you to prioritize one force initially. An early-stage startup, for example, might focus first on defensive essentials before investing in risky bets (as is their nature) that involve sophisticated data initiatives requiring more complex governance structures.</p></li></ol><h3>Industry Context</h3><ol><li><p><strong>How regulated is your industry?</strong> More regulations typically demand stronger defensive controls. For instance, healthcare and finance lean more heavily on defensive governance because of regulations such as HIPAA and BCBS 239.</p></li><li><p><strong>What's your competitive landscape?</strong> High competition might require stronger offensive capabilities. Technology companies often lean toward offensive forces to drive rapid innovation and maintain market position.</p></li><li><p><strong>What's your data sensitivity level?</strong> Higher sensitivity needs robust defensive measures. Companies handling personal health information or financial data need stronger protective measures than those dealing with public data.</p></li></ol><h3>Organizational Maturity</h3><ol><li><p><strong>How established are your data practices?</strong> New initiatives might need defensive foundations first. Organizations just beginning their data journey should focus on establishing secure data handling practices before pursuing advanced analytics. Early-stage companies will likely need to establish defensive foundations first. Mature organizations can push offensive initiatives with established controls. Legacy companies often need to strengthen offensive forces to modernize.</p></li><li><p><strong>What's your current data infrastructure?</strong> Legacy systems might need offensive modernization. 
Organizations with siloed, outdated systems often need to strengthen offensive capabilities to modernize and integrate their data landscape.</p></li><li><p><strong>How data-literate are your teams?</strong> Lower literacy might call for guided democratization. Teams new to data-driven decision making need structured access and training before adopting full self-service analytics.</p></li></ol><div><hr></div><p>The path to effective data governance isn't about choosing between protection and value creation &#8211; it's about orchestrating their interplay to serve your specific context. As data continues to grow in both volume and strategic importance, the organizations that thrive will be those that master this balance.</p><p>Whether you're establishing new data initiatives or evolving existing ones, the defensive-offensive framework provides a practical lens for making informed decisions.</p><p>Execution is everything. Start by understanding where you stand today, identify which force needs attention, and remember that this balance requires constant, thoughtful adjustment as your context and organization evolve.</p>]]></content:encoded></item><item><title><![CDATA[The Rise of Modern Data Governance]]></title><description><![CDATA[How Data Governance evolved from being the corporate equivalent of eating your vegetables to becoming rocket fuel for business growth.]]></description><link>https://www.datacraft.wiki/p/the-rise-of-modern-data-governance</link><guid isPermaLink="false">https://www.datacraft.wiki/p/the-rise-of-modern-data-governance</guid><dc:creator><![CDATA[Fenil Dedhia]]></dc:creator><pubDate>Mon, 10 Feb 2025 05:21:12 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a 
class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ecjo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ecjo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!ecjo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!ecjo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!ecjo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ecjo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png" width="1024" height="608" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/852e46c6-5255-4e78-9285-a41cff098860_1024x608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:608,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ecjo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 424w, https://substackcdn.com/image/fetch/$s_!ecjo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 848w, https://substackcdn.com/image/fetch/$s_!ecjo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 1272w, https://substackcdn.com/image/fetch/$s_!ecjo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F852e46c6-5255-4e78-9285-a41cff098860_1024x608.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"></figcaption></figure></div><p>Data governance as a practice has had a rough journey &#8211; from being a helpful way to manage data, to becoming buried in red tape, and now emerging as a crucial enabler of innovation.</p><p>Data governance means different things to different people, making clear definitions essential. As I covered in <a href="https://datacraft.wiki/p/speaking-data-fluently-a-guide-for-modern-data-ai-practitioners?r=6121o">Speaking Data Fluently: A Guide for Modern Data &amp; AI Practitioners</a>, understanding the terminology in your domain is critical, particularly for knowledge workers in technical fields like Data &amp; AI.</p><p>The evolution and ambiguity around Data Governance, along with its fundamental importance today, are why I'm dedicating a comprehensive series to the topic.</p><div><hr></div><p>This is Part 1 of The Datacraft Guide to Data Governance. 
In this installment, we'll understand how a practice meant to help teams work with data became synonymous with bureaucracy, and why it's now re-emerging as a critical foundation for innovation.</p><div><hr></div><h2>A Brief History: The Origins of Data Governance</h2><blockquote><p>The story of data governance begins with a simple but powerful premise: helping teams work better with data.</p></blockquote><p>Before it acquired its corporate undertones, data governance emerged from data stewardship &#8211; where dedicated professionals acted as bridges between people and processes, bringing order to the chaos of early data management.</p><p>But you won't find much if you search for "the history of data governance." Why? Because somewhere along the way, data governance lost its appeal. It became synonymous with restrictions, regulations, and red tape &#8211; hardly the stuff of viral LinkedIn posts.</p><p><strong>The 2008 financial crisis marked a turning point</strong>. As the mortgage market collapsed, regulators scrambled to prevent future disasters. This sparked the rise of compliance-focused governance frameworks and platforms like Collibra. Just like that, data governance became less about enabling teams and more about checking boxes and avoiding fines.</p><p>This shift created a persistent misconception. Today, many view data governance as the corporate equivalent of eating your vegetables &#8211; necessary but not particularly exciting. A task undertaken solely to avoid fines and security breaches. But this completely misses the original vision: data democratization and collaborative data management.</p><p>The irony? 
As organizations drown in more data than ever, we desperately need to return to those foundational principles.</p><blockquote><p>True data governance isn't about control &#8211; it's about creating an environment where data can be trusted, found, and used effectively by everyone who needs it.</p></blockquote><h2>Data Governance Today: The Evolution</h2><p>Data Governance has evolved beyond its compliance-focused roots to become the backbone of data strategy. Where organizations once viewed governance primarily through a compliance lens, it now encompasses a broader framework that ensures data remains valuable, accessible, useful, and credible - whether you're building AI models or analyzing market trends.</p><p>The numbers validate this evolution: <a href="https://www.gartner.com/peer-community/oneminuteinsights/omi-data-governance-frameworks-challenges-hbo">Gartner</a> reports 71% of organizations had Data Governance programs in 2024 &#8211; an 11% increase from 2023. The <a href="https://www.precisely.com/blog/data-integrity/2025-planning-insights-data-governance-adoption-has-risen-dramatically">benefits are clear</a>: 58% of organizations with data governance programs reported improved quality of data analytics and insights, as well as improved data quality.</p><p>But here's where things get interesting. Traditional data governance &#8211; you know, the kind with rigid top-down control and a lone data steward playing data police &#8211; is becoming obsolete. In its place, a more flexible and inclusive approach is emerging.</p><p>Modern Data Governance is evolving in three fundamental ways: scope, ownership, and timing.</p><ol><li><p><strong>From data governance to "data and analytics" governance</strong>: Think bigger. &#8220;Data&#8221; isn&#8217;t the only asset that needs to be governed anymore. Today's data assets include dashboards, machine learning models, code repositories, and more. 
Modern governance needs to handle it all.</p></li><li><p><strong>From a rigid centralized model to flexible governance models</strong>: The old model of centralized control through a lone data steward is evolving. Modern organizations typically adopt either a federated approach &#8211; where central teams set standards that business units implement locally &#8211; or a fully decentralized model where teams operate autonomously. Federated governance has emerged as a popular middle ground, offering the benefits of consistent standards while enabling teams to adapt practices to their specific needs. This evolution reflects today's workplace reality (especially among Gen Z and millennials): teams want both clarity and autonomy in how they handle data. Top-down cultures are eroding, and employees crave purpose in everything they do, so just telling people to do something won&#8217;t work anymore. Modern data governance is fundamentally practitioner-led. (Here's a <a href="https://www.youtube.com/watch?v=82ceHr-9JH0">short 2-min video</a> to understand the differences between Centralized, Decentralized, and Federated.)</p></li><li><p><strong>From an afterthought to a part of daily workflows</strong>: Over the past decade, data governance was typically applied as an afterthought. Data practitioners would ship projects as they were, then go back later and add data governance requirements dictated by top-down mandates. Forward-thinking organizations are baking governance into their daily workflows from day one instead of applying it retroactively.</p></li></ol><p>Despite this evolution, organizations struggle to implement modern governance effectively. 
<a href="https://www.gartner.com/en/newsroom/press-releases/2024-02-28-gartner-predicts-80-percent-of-data-and-analytics-governance-initiatives-will-fail-by-2027-due-to-a-lack-of-a-real-or-manufactured-crisis-">Gartner</a> predicts that by 2027, 80% of data and analytics governance initiatives will fail due to a lack of a real or manufactured crisis.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vtT3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vtT3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 424w, https://substackcdn.com/image/fetch/$s_!vtT3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 848w, https://substackcdn.com/image/fetch/$s_!vtT3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 1272w, https://substackcdn.com/image/fetch/$s_!vtT3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vtT3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png" width="2000" height="1078" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1078,&quot;width&quot;:2000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!vtT3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 424w, https://substackcdn.com/image/fetch/$s_!vtT3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 848w, https://substackcdn.com/image/fetch/$s_!vtT3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 1272w, https://substackcdn.com/image/fetch/$s_!vtT3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6fc00dcc-56b2-4a01-bd77-2b1c518154a2_2000x1078.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: Gartner Press Release (STAMFORD, Conn., February 28, 2024)</figcaption></figure></div><p>The good news? The <a href="https://www.fortunebusinessinsights.com/data-governance-market-108640">global data governance market</a> is expected to grow to $19.86 billion by 2032, reflecting substantial investment in this critical area. This underscores the need for organizations to approach data governance strategically and with a sense of urgency. Modern tools like <a href="https://www.actian.com/zeenea/data-discovery-platform/">Zeenea</a> and <a href="https://atlan.com/">Atlan</a> are making governance less about bureaucracy and more about enablement. The transformation is real &#8211; data governance is becoming cool again.</p><h2>Understanding the Basics of Data Governance</h2><p>Before diving into specific aspects of data governance, let's understand it through a familiar analogy. 
I firmly believe that if you can't explain something <em>simply</em>, you don't understand it well enough yourself.</p><h3>Data Governance in Layman's Terms</h3><p>Imagine moving into a new house with your family. You need systems for tracking documents, managing expenses, controlling access, and ensuring smooth operations. Some rules focus on security (who has spare keys), while others enable collaboration (shared calendars). That's essentially what data governance is for organizations &#8211; a system to manage and use data effectively.</p><h3>What exactly is Data Governance?</h3><p>Building on this analogy, we can define Data Governance as the internal framework that dictates how an organization manages, uses, and protects data assets through policies, standards, and controls to ensure compliance, data quality, and security.</p><p>It is fundamentally about creating order from chaos. Think of it like a constitution for your data: it provides the fundamental principles and rules that guide how your organization works with data.</p><p>The most important thing to understand about data governance is that it's not a technology solution or a one-time project - it's an ongoing discipline that requires commitment from across the organization.</p><blockquote><p>Data governance relies on three interconnected elements: compliance as an essential outcome, data management operations to execute the rules, and data contracts to formalize how data should be exchanged and used.</p></blockquote><h3>Governance and Compliance: So, they aren&#8217;t the same?</h3><p>Data governance and compliance are terms often used interchangeably, but they serve fundamentally different purposes in your organization's data strategy. 
While compliance focuses on meeting specific regulatory requirements, governance encompasses a broader strategic framework that includes compliance as one of its key outcomes.</p><p>Think of data governance as your organization's internal playbook for managing data effectively, while data compliance is about meeting external rules set by regulators and industry standards. In fact, compliance requirements often represent just a subset of the controls and policies that a robust governance framework puts in place.</p><p>The main difference? Data governance is always proactive&#8212;you create internal frameworks and policies that dictate how your organization handles data. Data compliance requires proactive planning too, but because regulations continuously evolve and new ones emerge, organizations must remain responsive to changing external requirements. Even with established regulations like GDPR and HIPAA as a foundation, compliance practices need to adapt as interpretations change and new standards develop.</p><p>Returning to our household analogy helps clarify the relationship between governance and compliance. Your family's system for managing the household represents governance &#8212; it's your internal playbook for running things smoothly and achieving your goals. Compliance, by contrast, is like meeting building codes or homeowners' association rules. While these external requirements are important, they don't drive your fundamental decisions about how to organize and run your home. Similarly, effective Data Governance naturally enables compliance, but its primary purpose is creating value through better data management.</p><p>Here's the key: data compliance is actually an outcome of good data governance, not a separate process.</p><blockquote><p>Data governance without compliance is ineffective; compliance without governance is impossible. 
They're two sides of the same coin, but governance is the side that determines the coin's value.<br>- Fenil Dedhia</p></blockquote><h3>Role of Data Compliance in Governance</h3><p>Data compliance serves as a critical foundation within the broader governance framework. Here's how they work together:</p><ol><li><p><strong>Governance Enables Compliance</strong></p><ul><li><p>Creates clear accountability and ownership structures</p></li><li><p>Establishes policies and procedures that embed compliance requirements into daily operations</p></li><li><p>Implements controls and monitoring mechanisms</p></li><li><p>Provides documentation and audit trails</p></li></ul></li><li><p><strong>Compliance Strengthens Governance</strong></p><ul><li><p>Provides clear requirements that inform governance policies</p></li><li><p>Helps identify gaps in existing governance frameworks</p></li><li><p>Drives continuous improvement in data handling practices</p></li><li><p>Creates measurable standards for governance effectiveness</p></li></ul></li></ol><p>For a deeper dive into the relationship between governance and compliance, explore my dedicated article on <a href="https://www.actian.com/blog/data-governance/untangling-data-governance-from-compliance/">Untangling Data Governance from Compliance</a>.</p><h3>Role of Data Management in Governance</h3><p>While governance defines the rules, data management executes the day-to-day operations that handle data according to those rules. 
Think of it this way:</p><ul><li><p>Governance says "customer data must be encrypted at rest"</p></li><li><p>Data Management handles the actual encryption, storage, and maintenance</p></li></ul><p>Key management activities include:</p><ul><li><p>Data storage and archival operations</p></li><li><p>Data processing and transformation</p></li><li><p>Database maintenance and optimization</p></li><li><p>Backup and recovery procedures</p></li><li><p>Implementation of access controls</p></li><li><p>Execution of data quality checks</p></li></ul><h3>Role of Data Contracts in Governance</h3><p>Data contracts formalize the technical specifications for data exchange between providers and consumers. They serve as the detailed implementation guide for governance policies, specifically defining:</p><ul><li><p>Data structure and format requirements</p></li><li><p>Quality standards and validation rules</p></li><li><p>Access patterns and usage terms</p></li><li><p>Service level expectations</p></li><li><p>Integration specifications</p></li></ul><p>For example, if governance policy requires "high-quality customer data," the data contract specifies exactly what that means:</p><ul><li><p>Required fields and their formats</p></li><li><p>Acceptable values and ranges</p></li><li><p>Quality metrics and thresholds</p></li><li><p>Update frequency and freshness requirements</p></li><li><p>Technical integration details</p></li></ul><p>Think of data contracts as the bridge between governance principles and technical implementation. They translate high-level policies into specific, measurable, and implementable requirements that data management teams can execute.</p><h2>Data Mesh: Rethinking Domain-Driven Data Governance</h2><p>Domain-driven design has long guided how we structure software applications. 
<em>Data Mesh</em> extends these principles to data governance by recognizing a simple truth: the teams closest to business domains should govern their own data while adhering to organization-wide standards. Lack of domain know-how usually means that a centralized data management team will likely "mesh" it up ;).</p><p>A central team handling all data pipelines, quality, and access becomes a bottleneck, disconnecting domain experts from data ownership. Data Mesh solves this by distributing responsibility to domain teams who understand their data best.</p><p>This distributed approach naturally aligns with federated governance. Think of a modern retail chain: corporate headquarters provides standardized systems and tools (like point-of-sale and inventory management), while store managers make local decisions about inventory and customer service. Similarly, in Data Mesh, a central platform team provides infrastructure and governance frameworks, while domain teams maintain autonomy over their specific data assets.</p><p>While this might sound similar to microservices architecture, the focus differs fundamentally. Microservices isolate application functionality with separate databases, but Data Mesh focuses on making domain data discoverable and usable across the organization. Where microservices manage transactional boundaries, Data Mesh enables cross-domain data sharing with clear ownership and governance standards.</p><p>The Data Mesh approach supports modern Data Governance through several key principles:</p><ol><li><p>Domain teams own both their operational processes and data quality</p></li><li><p>Each domain provides data in standardized, consumable formats</p></li><li><p>Governance standards apply consistently across domains while accommodating domain-specific needs</p></li><li><p>Common infrastructure enables this distributed but connected approach</p></li></ol><p>However, successful Data Mesh implementation requires careful evaluation. 
Organizations should assess whether their domain boundaries are well-defined and whether their teams can handle expanded data quality and governance responsibilities. The goal isn't to distribute data management for its own sake, but to align data ownership with domain expertise while maintaining consistent governance standards.</p><p>Teams that handle a large number of data sources and need to experiment with data (in other words, transform data at a rapid rate) would be wise to consider a data mesh. Several calculators are available to help determine whether it makes sense for your organization to invest in one.</p><h2>Legacy vs Modern Data Governance: 7 Key Differences</h2><p>This evolution from traditional to modern Data Governance reflects broader shifts in how organizations work with data. Traditional approaches, born in an era of centralized data warehouses and strict regulatory responses, often created bottlenecks that hindered innovation. Modern governance recognizes that in today's data-driven world, governance must enable rather than restrict, while maintaining appropriate controls.</p><ol><li><p><em>Approach to Control</em></p></li></ol><ul><li><ul><li><p>Traditional: Centralized control with emphasis on hierarchical control over data access</p></li><li><p>Modern: Federated or decentralized control, with emphasis on self-service and democratization</p></li></ul></li></ul><ol start="2"><li><p><em>Primary Goal</em></p></li></ol><ul><li><ul><li><p>Traditional: Compliance-first, treating governance as a necessary burden that often sacrifices innovation for control</p></li><li><p>Modern: Value-first, viewing governance as a strategic enabler through data discovery and analytics</p></li></ul></li></ul><ol start="3"><li><p><em>Accountability</em></p></li></ol><ul><li><ul><li><p>Traditional: Overseen by select individuals or committees (hierarchical org structure)</p></li><li><p>Modern: Distributed responsibility - either through domain ownership with 
central guidance or fully autonomous domain teams</p></li></ul></li></ul><ol start="4"><li><p><em>Process Management</em></p></li></ol><ul><li><ul><li><p>Traditional: Manual processes requiring heavy documentation and approval chains</p></li><li><p>Modern: Semi-automated or fully automated workflows with embedded governance controls that are either shared or domain-specific</p></li></ul></li></ul><ol start="5"><li><p><em>Collaboration</em></p></li></ol><ul><li><ul><li><p>Traditional: Limited collaboration, typically within specialized data teams</p></li><li><p>Modern: Cross-functional collaboration where teams actively participate in data decisions, encouraging input and shared accountability</p></li></ul></li></ul><ol start="6"><li><p><em>Technology Integration</em></p></li></ol><ul><li><ul><li><p>Traditional: Standalone governance tools disconnected from data platforms</p></li><li><p>Modern: Governance capabilities integrated into daily workflows, either through shared platforms with domain customization or through domain-specific tooling</p></li></ul></li></ul><ol start="7"><li><p><em>Change Management</em></p></li></ol><ul><li><ul><li><p>Traditional: Reactive adaptation to new requirements, often resulting in outdated policies and missed opportunities</p></li><li><p>Modern: Proactive evolution allowing domains to adapt quickly while maintaining organizational alignment through either shared frameworks or domain-specific approaches</p></li></ul></li></ul><p>The transformation from traditional to modern Data Governance mirrors the broader evolution in how organizations operate in the digital age. While traditional approaches emerged from an era of centralized control and regulatory compliance, they often created bottlenecks that stifled innovation and limited data value. 
Modern Data Governance recognizes a fundamental truth: in today's data-driven world, governance must act as an enabler of innovation while maintaining appropriate guardrails.</p><p>Success in modern Data Governance requires striking a delicate balance. Organizations must implement controls that ensure compliance and data quality without sacrificing the speed and flexibility that modern business demands. This balance isn't just about avoiding problems&#8212;it's about creating competitive advantage.</p>]]></content:encoded></item><item><title><![CDATA[Speaking Data Fluently: A Guide for Modern Data & AI Practitioners]]></title><description><![CDATA[Miscommunication costs more than time. With this guide help yourself and your team spend less time untangling terms and more time delivering value.]]></description><link>https://www.datacraft.wiki/p/speaking-data-fluently-a-guide-for-modern-data-ai-practitioners</link><guid isPermaLink="false">https://www.datacraft.wiki/p/speaking-data-fluently-a-guide-for-modern-data-ai-practitioners</guid><dc:creator><![CDATA[Fenil Dedhia]]></dc:creator><pubDate>Sun, 26 Jan 2025 05:33:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/714bd758-3299-4aa5-a022-1728ff07b3b5_1344x896.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!S9rl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!S9rl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 424w, 
https://substackcdn.com/image/fetch/$s_!S9rl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 848w, https://substackcdn.com/image/fetch/$s_!S9rl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 1272w, https://substackcdn.com/image/fetch/$s_!S9rl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!S9rl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png" width="1344" height="896" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:896,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1613146,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129898?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!S9rl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 
424w, https://substackcdn.com/image/fetch/$s_!S9rl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 848w, https://substackcdn.com/image/fetch/$s_!S9rl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 1272w, https://substackcdn.com/image/fetch/$s_!S9rl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11c26a12-51a7-443d-9ae3-b5e167b58d4a_1344x896.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><em>For busy data/AI builders, marketers, and sellers: Keep this guide handy and share it - the clarity of your strategic, tactical, and operational initiatives starts with clear terminology.</em></p><div><hr></div><h2>The Data Language Barrier</h2><p>Naming is a mysterious science.</p><p>Take compute service - a fundamental cloud offering that every provider has. Amazon Web Services (AWS) calls it EC2, Microsoft Azure and Oracle stick with 'virtual machine', while Google opts for 'compute engine' &#8211; all for essentially the same concept of a compute service.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!u4HB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!u4HB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 424w, https://substackcdn.com/image/fetch/$s_!u4HB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 848w, https://substackcdn.com/image/fetch/$s_!u4HB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!u4HB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!u4HB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg" width="1280" height="1977" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1977,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:350366,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129898?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!u4HB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 424w, https://substackcdn.com/image/fetch/$s_!u4HB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!u4HB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!u4HB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21365650-a760-425f-a73c-2bab87a78cc8_1280x1977.jpeg 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">Source: <a href="https://blog.bytebytego.com/p/ep70-cloud-services-cheat-sheet">ByteByteGo</a></figcaption></figure></div><p>This 
naming divergence extends beyond compute - you'll find it in data storage, databases, and AI services. Most providers build their own branded dictionary that practitioners must learn.</p><p>You might think, "Why not use fundamental concept names instead of branded terms to simplify communication?" The answer is that in the data and AI domain, even basic terms like "pipeline," "product," or "model" carry different meanings across contexts.</p><p>Just google "Data Observability definition" and see how market leaders define the same concept differently.</p><blockquote><p><strong>Reality:</strong> Definitions from technology vendors often <em>conveniently</em> align with their product offerings. This isn't surprising &#128518; &#8211; vendors naturally frame concepts in ways that highlight their solutions' strengths.</p></blockquote><p>Then there are the analyst firms, like Gartner, that maintain their own definitions. For innovative 0-1/0-10 work, these firms offer little value or insight &#8211; they document established markets, not emerging ones. <strong>They're followers, not leaders.</strong> Case in point: the "Active Metadata" market existed in practice long before Gartner created its definition and a Magic Quadrant (MQ) for it.</p><p>The problem isn't just vendor or analyst-firm branding - it's how technical terms evolve and get repurposed across domains.</p><p>This leads us to the bounded context principle from Domain-Driven Design (DDD), an important concept to grasp.</p><h2>The Bounded Context Principle</h2><p>While established organizations rely on <a href="https://www.haystackteam.com/blog/10-problems-a-business-glossary-can-solve">business glossaries</a> as their terminology source of truth, maintaining one is impractical in many situations (especially in 0-1 innovative work). 
Teams must actively align on technical definitions to move faster and build better solutions.</p><p>When innovating in the data and AI domain, you'll encounter terms that seem misleading or imprecise. Take "feature store" - it might feel like an odd name for a system that manages machine learning (ML) model inputs, but it's embedded in how ML practitioners communicate.</p><blockquote><p>Unless you're shaping industry standards, adapting to established terminology is more practical than fighting it. Moreover, different domains may use identical terms to mean different things, <strong>and that's okay</strong>. <strong>What really matters is clear context mapping</strong> -<strong> </strong>explicitly defining how terms translate across domain boundaries.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eRCB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eRCB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!eRCB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!eRCB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 1272w, 
https://substackcdn.com/image/fetch/$s_!eRCB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eRCB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png" width="960" height="540" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22832,&quot;alt&quot;:&quot;Conceptual Diagram: Bounded Context principle from Domain-Driven Design (DDD)&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129898?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Conceptual Diagram: Bounded Context principle from Domain-Driven Design (DDD)" title="Conceptual Diagram: Bounded Context principle from Domain-Driven Design (DDD)" srcset="https://substackcdn.com/image/fetch/$s_!eRCB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 424w, 
https://substackcdn.com/image/fetch/$s_!eRCB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!eRCB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!eRCB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F123c25b8-5a35-429f-a33a-0dcef5a9d0b3_960x540.png 1456w" sizes="100vw" loading="lazy"></picture></div></a>
<figcaption class="image-caption">Bounded Context principle from Domain-Driven Design (DDD)</figcaption></figure></div><ol><li><p>Domain-Driven Design (DDD) says: Organize complex systems into separate domains, each with its own terminology and rules.</p></li><li><p>The Bounded Context principle says: Let each domain use its own language internally, but define clear translations for cross-domain communication. Within each domain's boundary, a shared "ubiquitous language" is used, where terms have consistent meanings that everyone in that domain understands.</p></li></ol><p>Clear translation is crucial for effective collaboration. When you're collaborating across different bounded contexts, remember to explicitly map the translations in your communication!</p><h3>Real-World Examples</h3><p><strong>Business example</strong></p><p>Different teams in product organizations interpret &#8220;customer&#8221; and &#8220;user&#8221; differently based on their context. 
The product team typically focuses on feature usage, sales team on licensing, and support team on issue resolution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oR-F!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oR-F!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!oR-F!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!oR-F!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!oR-F!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oR-F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png" width="960" height="540" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:36136,&quot;alt&quot;:&quot;Bounded Context Example: \&quot;Customer\&quot; and \&quot;User\&quot; in Software Business context&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129898?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bounded Context Example: &quot;Customer&quot; and &quot;User&quot; in Software Business context" title="Bounded Context Example: &quot;Customer&quot; and &quot;User&quot; in Software Business context" srcset="https://substackcdn.com/image/fetch/$s_!oR-F!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!oR-F!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!oR-F!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 1272w, 
https://substackcdn.com/image/fetch/$s_!oR-F!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F114ef43d-ad8e-45ea-9917-9552ec41d31d_960x540.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">"Customer" and "User" in Software Business context</figcaption></figure></div><p>When product and sales teams collaborate on enterprise features:</p><ul><li><p>Product managers might say "Users need admin controls"</p></li><li><p>Sales reps might say "Customers want flexible licensing"</p></li><li><p>Translation: "Admin features let 
customers control user access and licensing"</p></li></ul><p><strong>Technical example</strong></p><p>In the data and AI domain, "Pipeline" means different things across different teams - data movement for data engineering, model lifecycle for ML, and deployment automation for DevOps. Each context keeps its specific meaning, so make the translation between contexts explicit whenever teams communicate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SUH4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SUH4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!SUH4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!SUH4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!SUH4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SUH4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png" width="960" 
height="540" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46235,&quot;alt&quot;:&quot;Bounded Context Example: \&quot;Pipeline\&quot; term in Data Platform context&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129898?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Bounded Context Example: &quot;Pipeline&quot; term in Data Platform context" title="Bounded Context Example: &quot;Pipeline&quot; term in Data Platform context" srcset="https://substackcdn.com/image/fetch/$s_!SUH4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 424w, https://substackcdn.com/image/fetch/$s_!SUH4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 848w, https://substackcdn.com/image/fetch/$s_!SUH4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 1272w, https://substackcdn.com/image/fetch/$s_!SUH4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63b439fa-39d7-413d-8224-27e5b04a8451_960x540.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a><figcaption class="image-caption">"Pipeline" term in Data Platform context</figcaption></figure></div><p>Each team's focus differs when it comes to 'pipeline':</p><ul><li><p>Data Engineering focus: Data flows, transformations, scheduling</p></li><li><p>ML Engineering focus: Model lifecycle, experiment tracking, versioning</p></li><li><p>DevOps focus: Build, test, deployment automation</p></li></ul><p>When ML and data teams plan a new feature:</p><ul><li><p>An ML team member might say "We need a pipeline for model retraining"</p></li><li><p>A Data Engineering team member might say "We'll need a pipeline for feature 
engineering"</p></li><li><p>Translation: "The ML pipeline will consume outputs from the data pipeline through defined interfaces"</p></li></ul><h2>Best way to view, interpret, and communicate terminologies</h2><p>When working with data terminology in your organization, understanding bounded contexts matters more than agreeing on universal definitions. Successful teams focus less on fighting over the "correct" terminology and more on clearly mapping how terms translate across domains. Different teams can use the same words differently as long as everyone understands the translation when they collaborate.</p><p>As a data and AI practitioner, look beyond definitions to examine actual capabilities and technical implementation details. Don't let terminology obscure what really matters: how the technology actually works and what problems it solves.</p><p>Prioritizing clear context mapping over rigid definitions helps teams move faster and focus on delivering value rather than debating semantics.</p><h2>&#127775; Not the Same: Analogous Terms You're Using Interchangeably, But Shouldn't</h2><p>Similar &#8800; Same.</p><p>Unlike a glossary, where you look up definitions one by one, I've taken a different approach: I've organized terms into logical groupings because I've found it's most effective to learn new concepts by comparing them with similar terms. 
Use your judgment to skip terms or to commit to learning the ones most relevant to the technical depth your role requires.</p><h3><em>Category 1: Most Commonly Misinterpreted Terms</em></h3><p><strong>'Data Product' vs 'Data as a Product' vs 'Data Asset' vs 'Data Application' vs 'Data Platform'</strong></p><p>Note: Given how often these terms are confused, we'll explore this distinction in more depth than others.</p><p>Let's start by reviewing some formal definitions of Data Products:</p><ul><li><p><a href="https://www.ibm.com/think/topics/data-product">IBM</a>: A data product is a reusable, self-contained package that combines data, metadata, semantics and templates to support diverse business use cases. It can include components such as datasets, dashboards, reports, machine learning (ML) models, pre-built queries or data pipelines.</p></li><li><p><a href="https://lakefs.io/blog/data-products/">lakeFS</a>: A data product is any tool or application that processes data and generates insights. These insights help businesses make better decisions for the future.</p></li></ul><p>In simple terms?</p><p><em>Data Product</em> is a solution that delivers data-driven value to end users - it has clear users, use cases, and measurable value. <br>Examples: <br>- Stock trading app with real-time market insights<br>- Customer churn prediction system<br>- Credit scoring system</p><p>Now, let's review the other terms that are often conflated with Data Products!</p><p><em>Data as a Product</em> is the methodology of treating data like a product - with quality standards, documentation, and governance. 
<br>Example: Setting SLAs and ownership for customer data across an organization.</p><p><em>Data Platform</em> provides infrastructure and services for core data operations.<br>Examples:<br>Storage and Compute Platforms:<br>- Cloud-native: Snowflake, Databricks<br>- On-premises: Hadoop<br>- Cloud provider: AWS (S3/Redshift), Google Cloud (GCS/BigQuery)<br>Metadata Management &amp; Governance Platforms:<br>- Cloud-native: Actian&#8217;s <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;The Data Intelligence Platform&quot;,&quot;id&quot;:4767383,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/dataintelligenceplatform&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37bf8067-3579-405b-95be-2859b4abed3a_1042x1042.png&quot;,&quot;uuid&quot;:&quot;0628bd26-47af-4018-a285-a26e09debd3c&quot;}" data-component-name="MentionToDOM"></span>, Atlan, Collibra<br>- Cloud provider: AWS Glue, Microsoft Purview (formerly Azure Purview)</p><p><em>Data Assets</em> are the raw or processed datasets that have business value. 
<br>Examples: Customer transaction history, product inventory data, sales metrics, structured/semi-structured data, streaming data feeds.</p><p><em>Data Applications</em> are software solutions built primarily to interact with data.<br>Examples: A dashboard builder, a report generator, a data cleaning tool, ML model registries, data integration pipelines.</p><ul><li><p>Common confusion: These terms overlap because they're interconnected - platforms host assets, which power applications, which can be products, all potentially managed using data-as-a-product principles.</p></li><li><p>Key difference: <strong>Think of it as layers!</strong></p><ul><li><p>Data platform provides the foundation (like Snowflake)</p></li><li><p>Data assets are the raw materials (like structured/unstructured datasets)</p></li><li><p>Data applications are tools to work with data (like analysis tools)</p></li><li><p>Data products deliver specific value (like a customer insights portal, or a recommendation engine)</p></li><li><p>Data as a Product is how you manage it all (like product development practices)</p></li></ul></li></ul><p>Let's apply this layered way of thinking about these terms to Netflix:</p><ul><li><p>Netflix Data Platform: Cloud infrastructure</p></li><li><p>Netflix Data Assets: User viewing history, content metadata</p></li><li><p>Netflix Data Applications: Content management system</p></li><li><p>Netflix Data Product: Recommendation engine</p></li><li><p>Netflix Data as a Product: Quality metrics for recommendations, clear ownership, user feedback loops</p></li></ul><blockquote><p>Important context: The same component can play different roles depending on context.</p></blockquote><ul><li><p>An ML model could be a data asset when used as a reusable component, but becomes a data product when deployed to solve specific user problems.</p></li><li><p>A dashboard could be a data application when it's a generic tool, but becomes a data product when customized to deliver 
specific business insights.</p></li><li><p>Integration pipelines could be data applications when they're tools for moving data, but become data products when packaged with a clear value proposition.</p></li></ul><p>The key test is to ask:</p><ul><li><p><em>Does it store/hold data?</em> &#8594; Data Asset</p></li><li><p><em>Does it primarily process/manage data?</em> &#8594; Data Application</p></li><li><p><em>Does it deliver specific value to users?</em> &#8594; Data Product</p></li><li><p><em>Is it infrastructure that enables data work?</em> &#8594; Data Platform</p></li><li><p><em>Is it a methodology for managing any of the above?</em> &#8594; Data as a Product</p></li></ul><h3><em>Category 2: Core Data Concepts</em></h3><p><strong>Data Governance vs Data Compliance vs Data Management</strong></p><p><em>Data Governance</em> is a framework that dictates how an organization manages, uses, and protects data assets through internal policies, standards, and controls. It ensures compliance, data quality, and security.</p><p><em>Data Management</em> is the operational execution of data handling activities like storage, processing, and maintenance.</p><p><em>Data Compliance</em> is adherence to external regulations and standards for data privacy and security (like GDPR, HIPAA).</p><ul><li><p>Common confusion: All three terms involve organizational data handling, often leading to unclear boundaries between setting internal policies (governance), executing them (management), and meeting external requirements (compliance). Governance and compliance are often conflated because governance is explicitly designed to ensure compliance - it's the proactive framework that enables compliance.</p></li><li><p>Key difference: Think of it as: governance dictates the internal playbook, management executes it through tools and processes, and compliance validates against external requirements. 
Internal framework drives external compliance through operational execution.</p><ul><li><p>Example: For sensitive healthcare data, governance defines access policies (internal rules), management implements encryption and access controls (execution), and compliance ensures HIPAA requirements are met (external validation).</p></li></ul></li></ul><p><strong>ETL vs ELT</strong></p><p><em>ETL</em> (Extract, Transform, Load) transforms data <em>before</em> loading it into the target system.</p><p><em>ELT</em> (Extract, Load, Transform) transforms data <em>after</em> loading it into the target system.</p><ul><li><p>Common confusion: Similar acronyms and both handle data pipeline processes.</p></li><li><p>Key difference: Timing and location of transformation. Think of ETL as pre-processing ingredients before cooking (like chopping vegetables), while ELT is adding raw ingredients to the pot and transforming them during cooking. Example: ETL transforms customer addresses into a standardized format before loading into a data warehouse, while ELT loads raw addresses and standardizes them using warehouse computing power.</p></li></ul><p><strong>Data Mesh vs Data Fabric vs Microservices</strong></p><p><em>Data Mesh</em> is an organizational and cultural approach treating data as a product, with domain teams owning their data assets and delivery.</p><p><em>Data Fabric</em> is a technical architecture using metadata and automation to integrate and manage distributed data sources.</p><p><em>Microservices</em> is an architectural style that structures an application as a collection of small, loosely coupled services. Each service is designed to perform a specific business function and can be developed, deployed, and scaled independently. 
While primarily known as an application development approach, microservices share interesting parallels with data mesh and fabric in their approach to distributed systems.</p><ul><li><p>Common confusion: All three approaches aim to address complexity in distributed environments, but they approach this challenge from different angles. They all fundamentally aim to make complex systems more manageable, flexible, and efficient. The choice between them, or more likely a combination, depends on an organization's specific technical and cultural needs.</p></li><li><p>Key difference: Data mesh tackles organizational complexity by restructuring how teams own and manage data. Data fabric is about technical integration: it creates a smart, adaptive data network that provides a unifying layer across different data sources. Microservices is about modular design: it breaks monolithic systems down into focused, autonomous services. In short, mesh transforms how data ownership is organized, fabric automates system connections, and microservices breaks application architecture into independent, specialized services.</p><ul><li><p>Example: A retail company implements mesh by making each department responsible for its customer data products, fabric automates how these products connect and share data, and microservices creates independent services for user authentication, product catalog, and order processing.</p></li></ul></li></ul><p><strong>Business Glossary vs Taxonomy vs Ontology vs Data Dictionary</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!C4Vs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!C4Vs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 424w, https://substackcdn.com/image/fetch/$s_!C4Vs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 848w, https://substackcdn.com/image/fetch/$s_!C4Vs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 1272w, https://substackcdn.com/image/fetch/$s_!C4Vs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!C4Vs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png" width="1542" height="1156" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1156,&quot;width&quot;:1542,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" 
srcset="https://substackcdn.com/image/fetch/$s_!C4Vs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 424w, https://substackcdn.com/image/fetch/$s_!C4Vs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 848w, https://substackcdn.com/image/fetch/$s_!C4Vs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 1272w, https://substackcdn.com/image/fetch/$s_!C4Vs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc12a4ae2-e93e-4cf5-859c-f4f7cfd252cb_1542x1156.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p><em>Data Dictionary</em> is a technical reference that documents database structure - tables, columns, data types, and constraints.</p><p><em>Business Glossary</em> is a curated dictionary of business terms with standardized definitions and ownership information.</p><p><em>Taxonomy</em> is a hierarchical classification system organizing concepts into parent-child relationships, much like a filing cabinet.</p><p><em>Ontology</em> is a comprehensive knowledge framework that defines concepts, relationships, and rules within a domain, similar to a network diagram.</p><ul><li><p>Common confusion: All four relate to organizing information, but serve different purposes and audiences.</p></li><li><p>Key difference: Audience and complexity of relationships. Data dictionary specifies technical implementation, glossary defines business terms, taxonomy categorizes them, ontology establishes relationships. 
Think of it like: glossary is a dictionary (definitions), data dictionary is an engineering blueprint (technical specs), taxonomy is a filing cabinet (categories), and ontology is a network diagram (relationships).</p><ul><li><p>Example: In healthcare data, a data dictionary defines PatientID as INT PRIMARY KEY, a glossary explains what "Active Patient" means to the business, taxonomy classifies procedures into types (diagnostic, therapeutic, preventive), and ontology establishes that "patients receive procedures from providers at facilities" with rules governing these relationships.</p></li></ul></li></ul><p><strong>Data Lake vs Data Lakehouse</strong></p><p><em>Data Lake</em> is a storage system for raw data in native format without schema enforcement.</p><p><em>Data Lakehouse</em> is a hybrid architecture adding database features (ACID, schema enforcement) on top of a data lake.</p><ul><li><p>Common confusion: Similar names and base capabilities - both store large volumes of raw data and are often positioned as modern data storage solutions.</p></li><li><p>Key difference: Lakes prioritize flexibility; lakehouses add structure and reliability. 
Think of a lake as a vast reservoir that accepts any type of data, while a lakehouse is like adding plumbing and filters to that reservoir.</p><ul><li><p>Example: Raw IoT sensor data goes into a lake, but when you need that data to be transaction-safe for business reporting, you'd use a lakehouse.</p></li></ul></li></ul><p><strong>Data Infrastructure vs Data Architecture</strong></p><p><em>Data Infrastructure</em> is the physical components layer (hardware, software, networks) that stores and processes data.</p><p><em>Data Architecture</em> is the blueprint defining how data assets are organized, integrated, and used.</p><ul><li><p>Common confusion: Both are foundational elements often discussed together in data strategy conversations.</p></li><li><p>Key difference: Infrastructure <em>implements</em>; architecture <em>designs</em>. Think of architecture as the building blueprints, while infrastructure is the actual construction materials and systems.</p><ul><li><p>Example: Architecture specifies that real-time analytics needs a streaming pipeline, while infrastructure provides the actual Apache Kafka clusters to implement it.</p></li></ul></li></ul><p><strong>Data Quality vs Data Reliability</strong></p><p><em>Data Quality</em> is the measure of data's accuracy, completeness, consistency, and fitness for intended use.</p><p><em>Data Reliability</em> is the consistency and stability of data over time, including system uptime and data delivery.</p><ul><li><p>Common confusion: Both relate to data trustworthiness, leading to quality metrics being confused with system reliability measures.</p></li><li><p>Key difference: Quality measures correctness; reliability measures consistency. 
Think of quality as checking if ingredients are fresh and properly measured, while reliability is ensuring the kitchen operates without interruption.</p><ul><li><p>Example: Data quality ensures customer addresses are accurate, while reliability ensures the address database is consistently available.</p></li></ul></li></ul><p><strong>Data Catalog vs Data Discovery vs Metadata Management</strong></p><p><em>Data Catalog</em> is an organized inventory of data assets that includes descriptions, ownership, and lineage information.</p><p><em>Data Discovery</em> is the process of finding and understanding available data assets across an organization.</p><p><em>Metadata Management</em> is the systematic organization and maintenance of metadata (data about data) across systems.</p><ul><li><p>Common confusion: These terms often overlap because they work together - catalogs use metadata to help with discovery, leading to unclear boundaries between the inventory system (catalog), the search process (discovery), and how the supporting information is maintained (metadata management).</p></li><li><p>Key difference: Catalogs provide the interface, discovery is the user activity, metadata management maintains the supporting information. 
Think of it as a library where the catalog is the searchable index system, discovery is how you find books you need, and metadata management keeps track of all book details like authors, locations, and categories.</p><ul><li><p>Example: When a data scientist needs customer information, they use the catalog interface to search (catalog), explore available datasets to find what they need (discovery), while metadata management ensures all the dataset descriptions and relationships are accurate and up-to-date.</p></li></ul></li></ul><p><strong>Data Marketplace vs Data Exchange vs Data Catalog</strong></p><p><em>Data Marketplace</em> is a platform for buying and selling data products with standardized terms and monetization.</p><p><em>Data Exchange</em> is a controlled environment for peer-to-peer data sharing between trusted partners.</p><p><em>Data Catalog</em> is an organized inventory of data assets that includes descriptions, ownership, and lineage information.</p><ul><li><p>Common confusion: All facilitate finding and accessing data, but serve different purposes in data commerce and discovery.</p></li><li><p>Key difference: Marketplaces enable commerce, exchanges facilitate sharing, catalogs provide inventory. 
Think of it like: marketplace is an e-commerce site (commercial transactions), exchange is a trading platform (partner sharing), catalog is a library index (asset discovery).</p><ul><li><p>Example: A marketplace sells consumer insights data, an exchange shares supply chain data between partners, a catalog helps employees find internal datasets.</p></li></ul></li></ul><p><strong>Data Observability vs Data Monitoring</strong></p><p><em>Data Observability</em> is the ability to understand the health and state of data in your systems by measuring quality, reliability, and lineage.</p><p><em>Data Monitoring</em> is the continuous tracking of specific data metrics and system performance against predefined thresholds.</p><ul><li><p>Common confusion: Both involve watching data systems, leading to observability being seen as just another word for monitoring.</p></li><li><p>Key difference: Observability enables understanding; monitoring tracks specifics. Think of observability as a full health checkup that helps diagnose issues, while monitoring is checking specific vital signs.</p><ul><li><p>Example: Observability helps you understand why data quality dropped by showing lineage and dependencies, while monitoring alerts you when quality scores fall below 90%.</p></li></ul></li></ul><p><strong>Data Security vs Data Privacy</strong></p><p><em>Data Security</em> is a framework that dictates how an organization protects data against unauthorized access, corruption, or theft through technical controls and measures.</p><p><em>Data Privacy</em> is adherence to appropriate data use and handling according to user consent rights and regulatory requirements.</p><ul><li><p>Common confusion: Similar to the earlier discussion of data governance and data compliance, security and privacy are interconnected - security provides the protective framework that enables privacy, which is why they are often conflated.</p></li><li><p>Key difference: Data security dictates and implements data protection; 
data privacy ensures appropriate use. Think of it like our earlier data governance model: security provides the internal controls (governance), implementation happens through tools and processes, and privacy ensures compliance with user rights and regulations (compliance).</p><ul><li><p>Example: For user data, security implements encryption and access controls (protection), while privacy ensures data collection and use aligns with user consent and regulations (appropriate use).</p></li></ul></li></ul><p><strong>Data Contract vs Data SLA</strong></p><p><em>Data Contract</em> is a formal specification that defines data structure, format, semantics, quality, and terms of use between data providers and consumers.</p><p><em>Data SLA (Service Level Agreement)</em> specifies measurable targets for data service performance like availability, latency, freshness, and support response times.</p><ul><li><p>Common confusion: Data Contracts are often mistaken for mere performance agreements, when they're actually broader specifications that may include SLAs as one component.</p></li><li><p>Key difference: Contracts define what's being delivered; SLAs measure how well it's delivered. 
Think of it like ordering a meal - the contract specifies the dish ingredients and preparation (what you get), while the SLA guarantees delivery time and temperature (how well you get it).</p><ul><li><p>Example: A data contract specifies customer data fields and access controls, while the SLA guarantees 99.9% availability and max 24-hour data freshness.</p></li></ul></li></ul><h3><em>Category 3: Data Movement &amp; Processing</em></h3><p><strong>Data Integration vs Data Pipeline vs Data Ingestion vs Data Workflow</strong></p><p><em>Data Integration</em> combines different data sources into a unified view while maintaining data quality and relationships.</p><p><em>Data Pipeline</em> is a series of processing steps that transform and move data from source to destination (including the technical act of loading data into target systems, often called 'Data Loading').</p><p><em>Data Ingestion</em> is the initial process of bringing data into a system from external sources.</p><p><em>Data Workflow</em> orchestrates and schedules the execution of data-related tasks and their dependencies.</p><ul><li><p>Common confusion: These terms are often used interchangeably because they all involve data movement, but each serves a distinct purpose in the data lifecycle.</p></li><li><p>Key difference: Each term represents a different scope and purpose. 
Think of building a house: ingestion brings raw materials (initial import), pipelines are the conveyor systems moving and transforming materials (data flow and loading), integration combines materials into cohesive structures (unified view), and workflow is the construction schedule ensuring everything happens in the right order.</p><ul><li><p>Example: In an e-commerce system - ingestion pulls raw data from various sources like sales and inventory, pipelines transform and load it into the data warehouse, integration combines it into a unified customer view, and workflow orchestrates the entire process including dependencies and scheduling.</p></li></ul></li></ul><p><strong>Data Sharing vs Data Replication vs Data Collaboration</strong></p><p><em>Data Sharing</em> is providing controlled access to data while maintaining a single source of truth.</p><p><em>Data Replication</em> is creating and maintaining copies of data across different locations or systems.</p><p><em>Data Collaboration</em> is enabling multiple parties to work together on shared data assets with governance controls.</p><ul><li><p>Common confusion: All involve multiple parties accessing data, leading to confusion about when to use each approach.</p></li><li><p>Key difference: Sharing controls access, replication duplicates data, collaboration enables joint work. 
Think of it like: sharing is giving someone view access to your document (controlled access), replication is making copies of the document (distributed copies), and collaboration is using a platform where multiple people can work on the document together (joint workspace).</p><ul><li><p>Example: For customer data, sharing gives partners read access to your database, replication copies data to their system, collaboration lets you jointly analyze and enrich the data.</p></li></ul></li><li><p>Let's examine data sharing through modern tooling:</p><ul><li><p><em>Snowflake's data sharing feature:</em></p><ul><li><p>What it does: Direct sharing of read-only data without copying/moving data</p></li><li><p>Common confusion: Often confused with data replication or ETL-based sharing.</p></li><li><p>Key difference: Live access vs copying data. Think of it like streaming a movie (data sharing) versus downloading it (data replication). Example: A retailer shares daily sales data with suppliers - suppliers query live data through Snowflake shares instead of receiving copied data dumps.</p></li><li><p>Provider maintains single source of truth while consumers query live data.</p></li><li><p>Key mechanisms: Shares (datasets), Reader Accounts (access control), Secure Data Sharing (governance)</p></li></ul></li><li><p><em>Atlan's metadata sharing capability:</em></p><ul><li><p>What it does: Shares context, governance, and knowledge about data across teams</p></li><li><p>Common confusion: Often mixed up with sharing the actual data.</p></li><li><p>Key difference: Shares knowledge about data vs sharing data itself. 
Example: Data teams share dataset descriptions, quality metrics, and usage patterns through Atlan while actual data remains in source systems.</p></li><li><p>Enables collaboration on metadata like descriptions, classifications, and lineage.</p></li><li><p>Key mechanisms: Asset sharing, active metadata, business glossary</p></li></ul></li></ul></li></ul><p><strong>Streaming Data vs Real-time Data</strong></p><p><em>Streaming Data</em> is the continuous flow of data processed incrementally as it arrives, analyzed through stateful operations like windowing and aggregations using tools such as Apache Flink or Spark Streaming.</p><p><em>Real-time Data</em> is data processed and made available for use immediately after creation, using in-memory processing for instant analysis through tools like ksqlDB (built on Apache Kafka) or Redis.</p><ul><li><p>Common confusion: Both terms suggest immediate data processing, leading them to be used interchangeably. They are also confused in an analytics context where streaming requires state management while real-time focuses on immediate event processing.</p></li><li><p>Key difference: Streaming defines the data flow pattern; real-time defines the speed requirement. 
Think of it like: streaming is a river's continuous flow requiring monitoring over time (constant data with windowed analysis), while real-time is instant messaging needing immediate responses (immediate event processing).</p><ul><li><p>Example: IoT sensor data streams continuously with rolling analytics over 5-minute windows, while payment processing requires real-time transaction validation for instant fraud detection.</p></li></ul></li></ul><p><strong>Data Transformation vs Data Processing</strong></p><p><em>Data Transformation</em> converts data from one format, structure, or value to another.</p><p><em>Data Processing</em> is the broader sequence of operations performed on data, including cleaning, transformation, analysis, and storage.</p><ul><li><p>Common confusion: Transformation is often treated as synonymous with all processing activities.</p></li><li><p>Key difference: Transformation is a specific operation; processing is the complete workflow. Think of it like: transformation is translating a document (format change), while processing is the entire editorial workflow.</p><ul><li><p>Examples: Converting timestamps to standardized UTC format is transformation, while the full ETL pipeline is processing. In retail data, transformation is converting product SKUs to standardized formats or euros to dollars, while processing includes data ingestion from point-of-sale (POS) systems, cleaning returns data, joining with inventory, aggregating sales, and loading into the data warehouse.</p></li></ul></li></ul><h3><em>Category 4: ML/AI</em></h3><p><strong>AI vs ML vs Deep Learning</strong></p><p><em>AI (Artificial Intelligence)</em> is the broad field of creating systems that can simulate human intelligence.</p><p><em>ML (Machine Learning)</em> is a subset of AI focused on algorithms that learn from data without being explicitly programmed. 
It is more specifically a <a href="https://www.datacraft.wiki/i/162129891/learning-approaches">Learning Approach</a> in the field of AI.</p><p><em>Deep Learning</em> is a specialized ML implementation strategy using neural networks with multiple layers. It is one of the many&nbsp;<em>methodologies/strategies</em>&nbsp;that can be used to implement a specific Learning Approach within the field of AI.</p><blockquote><p>Want to master AI fundamentals? <a href="https://www.datacraft.wiki/p/decomposing-ai-development">Decomposing AI Development</a> is your fast track.</p></blockquote><ul><li><p>Key difference: Scope and specialization. <br>Think of it like: AI is transportation (many ways to get somewhere), ML is vehicles (learning from data to navigate), and deep learning is a specific type of vehicle design (layered neural networks) best suited for complex tasks.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!31CA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!31CA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 424w, https://substackcdn.com/image/fetch/$s_!31CA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 848w, https://substackcdn.com/image/fetch/$s_!31CA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 1272w, 
https://substackcdn.com/image/fetch/$s_!31CA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!31CA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png" width="500" height="817" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:817,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!31CA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 424w, https://substackcdn.com/image/fetch/$s_!31CA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 848w, https://substackcdn.com/image/fetch/$s_!31CA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 1272w, 
https://substackcdn.com/image/fetch/$s_!31CA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c60fd6-10e6-4da0-a724-fbfd960f89f9_500x817.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>AI Automation vs AI Agents</strong></p><p><em>AI Automation</em> refers to systems that follow predefined rules for repetitive tasks, using rule-based logic and pre-set workflows. 
Examples: automated email marketing, lead scoring systems, data cleansing tools, routine pipeline updates.</p><p><em>AI Agents</em> are autonomous entities capable of learning, adapting, and making decisions in real time using advanced machine learning or deep learning algorithms. Examples: conversational chatbots, predictive forecasting tools, dynamic personalization systems, autonomous decision-making assistants.</p><ul><li><p>Common confusion: Both terms involve artificial intelligence, leading to misconceptions that automation is always "intelligent" or that agents are simply advanced automation.</p></li><li><p>Key difference: Automation follows fixed rules; agents learn and adapt. Think of automation as a thermostat that turns on/off at set temperatures (fixed, rule-based logic), while agents are like smart thermostats that learn preferences and adjust proactively (adaptive learning).</p></li></ul><p><strong>Model Training vs Model Fine-tuning</strong></p><p><em>Model Training</em> builds a model from scratch using a full dataset.</p><p><em>Model Fine-tuning</em> adjusts a pre-trained model for specific tasks or domains, typically with a much smaller dataset.</p><ul><li><p>Common confusion: Both involve learning from data but serve different purposes.</p></li><li><p>Key difference: Starting point and resource requirements. Think of it like: training is building a car from parts (full build), fine-tuning is customizing an existing car (adjustments). Example: Training a GPT-style model from scratch vs fine-tuning a pre-trained one for medical terminology.</p></li></ul><p><strong>Model Serving vs Model Deployment</strong></p><p><em>Model Serving</em> exposes trained machine learning models through APIs for real-time predictions.</p><p><em>Model Deployment</em> is the broader process of making models available in production environments.</p><ul><li><p>Common confusion: Often used interchangeably in MLOps discussions.</p></li><li><p>Key difference: Serving is a component of deployment. 
Think of it like: serving is the restaurant's kitchen (prediction service), deployment is running the entire restaurant (full production system).</p><ul><li><p>Example: Model serving via a REST API vs full deployment including monitoring and scaling. A fraud detection model's serving layer provides real-time predictions via a REST API, while deployment includes the entire production system: API endpoints, monitoring dashboards, A/B testing framework, automated retraining pipeline, and rollback procedures.</p></li></ul></li></ul><p><strong>Feature Engineering vs Feature Selection</strong></p><p><em>Feature Engineering</em> creates new meaningful features from raw data.</p><p><em>Feature Selection</em> chooses the most relevant features for a model.</p><ul><li><p>Common confusion: Both modify feature sets but serve different purposes.</p></li><li><p>Key difference: Creation vs reduction. Think of it like: engineering is crafting new ingredients (creation), selection is choosing the best ingredients (reduction).</p><ul><li><p>Example: Creating interaction terms from variables vs selecting top predictive features. In e-commerce, engineering creates features like "days_since_last_purchase" and "avg_cart_value" from raw transaction data, while selection identifies that "avg_cart_value" is more predictive than "browser_type" for customer churn prediction.</p></li></ul></li></ul><p><strong>Model Registry vs Model Store</strong></p><p><em>Model Registry</em> tracks model versions, metadata, and lineage with governance.</p><p><em>Model Store</em> is a simpler repository for saving model artifacts.</p><ul><li><p>Common confusion: Both store models but with different capabilities.</p></li><li><p>Key difference: A registry provides governance and tracking; a store provides only storage. 
Think of it like: registry is a bank vault with detailed records (tracked storage), store is a safety deposit box (basic storage).</p><ul><li><p>Example: MLflow Model Registry with versioning vs simple S3 bucket storage.</p></li></ul></li></ul><h3><em>Category 5: Analytics</em></h3><p><strong>Data Analytics vs Business Analytics vs Data Science</strong></p><p><em>Data Analytics</em> extracts insights from data using statistical methods and tools.</p><p><em>Business Analytics</em> applies data analysis specifically to business problems and decisions.</p><p><em>Data Science</em> combines analytics, programming, and domain expertise to build predictive models and data products.</p><ul><li><p>Common confusion: Overlapping skills and tools lead to unclear boundaries.</p></li><li><p>Key difference: Focus and scope of analysis. <br>Think of it like: analytics answers "what happened," business analytics answers "what should we do," data science answers "what will happen."</p><ul><li><p>Example: Analytics shows customer churn rates, business analytics recommends retention strategies, data science builds prediction models.</p></li></ul></li></ul><p><strong>Data Visualization vs Business Intelligence (BI)</strong></p><p><em>Data Visualization</em> is the graphical representation of data and insights.</p><p><em>Business Intelligence</em> is a comprehensive approach to collecting, analyzing, and presenting business data for decision-making.</p><ul><li><p>Common confusion: Visualization is often seen as equivalent to BI because it's the most visible component.</p></li><li><p>Key difference: Visualization is a communication tool; BI is an end-to-end solution. 
Think of it like: visualization is the presentation slides, BI is the entire quarterly business review process.</p><ul><li><p>Example: A scatter plot showing customer segments is visualization, while a set of Tableau dashboards with drill-downs, KPIs, and automated reporting is BI.</p></li></ul></li></ul><p><strong>Streaming Analytics vs Real-time Analytics</strong></p><p><em>Streaming Analytics</em> processes continuous data flows using time windows and stateful operations.</p><p><em>Real-time Analytics</em> delivers instant insights on individual events as they occur. It performs immediate analysis on data as it's generated, often at the edge.</p><ul><li><p>Common confusion: Both handle immediate data but differ in processing patterns.</p></li><li><p>Key difference: Streaming analytics handles continuous flows with windowing or micro-batching, whereas real-time analytics handles discrete events. <br>Think of it like: streaming is monitoring highway traffic patterns (continuous), real-time is detecting individual speeders (immediate events).</p></li><li><p>Examples:</p><ul><li><p>Streaming: In a manufacturing plant, analyzing 15-minute windows of production line sensor data (50,000 readings/minute) to detect quality trends.</p></li><li><p>Current "Real-time": Instant machine failure detection from individual anomalous readings (near real-time).</p></li><li><p>True Real-time (still technologically challenging): Simultaneously analyzing every sensor reading from every machine across multiple global plants (petabytes of data) with effectively zero latency.</p></li></ul></li></ul><h2>Appendix</h2><h3><strong>Who is this post for?</strong></h3><p>Non-engineers looking to build a strong understanding (and just enough data vocabulary) to communicate like seasoned data professionals. 
Whether you're in product management, marketing, sales, or UX, if you're working in data/AI, you need a strong technical understanding to successfully build, market, and sell your product.</p><h3><strong>How to make the best use of it?</strong></h3><p>There are levels to it, of course. Technical knowledge requirements vary by role: what a product manager needs differs from what an executive or product marketer needs. But terminology is ground zero, so prioritize building a shared understanding first to accelerate higher-value work.</p><p>I've organized terms in logical groupings because I've found it's most effective to learn new concepts by comparing them with similar terms. Use your judgment to prioritize learning terms relevant to your role.</p><p>Consider bookmarking and sharing if you find this useful. Feedback is always encouraged! It helps me provide more value to you and improve my writing process. You can reach out on LinkedIn (please include a note) or write to me at fenil.h.dedhia@gmail.com.</p><h3><strong>Why did I write this post?</strong></h3><p>As a Product Lead doing innovation/R&amp;D work for more than 6 years, I've learned that clear technical communication requires more than just choosing the right words. In a product manager or similar role, you need to constantly bridge the gap between what you say and what others hear. Just as a writing culture requires a reading culture, building a shared understanding around terminology demands active participation from all key stakeholders.</p><p>I wish I'd had this resource over the past two years. Not finding a practical, comprehensive resource covering common terms across the full data lifecycle motivated me to create this guide.</p><blockquote><p>Technical terminology can silently drag down team productivity and velocity.</p></blockquote><p>Experience proves this point: spend just a few years working full-time on technical products/projects and you'll see it firsthand. 
If you're interested in reading more about it from a research perspective, check out the links below.</p><p>Sources:</p><ol><li><p>Research from <a href="https://www.haystackteam.com/blog/10-problems-a-business-glossary-can-solve">Haystack</a> demonstrates how lack of shared context and terminology directly impacts productivity through miscommunication and conflict.</p></li><li><p><a href="https://www.nature.com/articles/s41599-024-04018-w">Mental health research</a> highlights that maintaining expertise - including domain terminology - through continuous learning is crucial for preventing burnout in rapidly evolving tech fields. Self-efficacy through reskilling and upskilling helps professionals stay effective.</p></li><li><p>Technical domains (like Data &amp; AI) face the greatest impact, especially in fast-paced environments like startups and innovation/R&amp;D teams. The impact multiplies when team members are new to the domain - varying technical depth and terminology gaps create daily friction that slows progress.</p></li></ol><ul><li><ul><li><p>The <a href="https://www.accenture.com/content/dam/accenture/final/a-com-migration/r3-3/pdf/pdf-118/accenture-the-human-impact-data-literacy.pdf">Accenture and Qlik research</a> highlights that a lack of data skills contributes to employee stress and productivity loss.&nbsp;</p></li><li><p>According to&nbsp;<a href="https://computhink.com/wp-content/uploads/2015/10/IDC20on20The20High20Cost20Of20Not20Finding20Information.pdf">IDC</a>, the average knowledge worker spends about 2.5 hours per day, or roughly 30% of their workday, searching for information.</p></li></ul></li></ul>]]></content:encoded></item><item><title><![CDATA[Decomposing AI Development: A Practitioner's Guide to Navigate Decision-Making in AI System Development]]></title><description><![CDATA[Welcome to the second installment of 'AI Development Demystified' series for builders. 
After digesting this deep dive, you'll join the minority 1% who truly grasp the AI development landscape. No fluff, just hard-earned insights.]]></description><link>https://www.datacraft.wiki/p/decomposing-ai-development</link><guid isPermaLink="false">https://www.datacraft.wiki/p/decomposing-ai-development</guid><dc:creator><![CDATA[Fenil Dedhia]]></dc:creator><pubDate>Wed, 23 Oct 2024 01:00:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ffec6090-dcbd-4d3f-bd3f-ae4acbfe98ac_960x540.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There's no shortage of information on building AI solutions. But most of it lacks nuance.</p><p>That's why I wanted to create a <strong>foundational resource</strong> for new data and AI professionals aiming to commit and gain <em>practitioner-level knowledge</em>.</p><p>In this second installment, you will:</p><ul><li><p>Learn how to decompose AI development into its core components</p></li><li><p>Understand the interplay between the core components</p></li><li><p>Be able to demonstrate expert-level knowledge in workplace discussions that will earn you respect from the AI subject matter experts on the team</p></li></ul><p>If you want to <strong>master</strong> the <em>technical foundations</em> of AI, this resource is for you.</p><p>About my AI background: </p><ul><li><p>I've been in the trenches of AI development since 2018, well before ChatGPT made AI "cool" and AI became table stakes in innovation. </p></li><li><p>As a product lead for Symbolic AI and Machine Learning (ML) based software products, I've led the development of multiple AI products from scratch. </p></li><li><p>My technical expertise spans NLP, deep learning, KR&amp;R, ML lifecycle management, and big data processing frameworks (like Spark), applied to solving customer problems in the domains of text summarization, speech, personalization, and large-scale MLOps. 
</p></li><li><p>I've made many mistakes in my AI journey, and now you don't have to.</p></li></ul><div><hr></div><p>Here's what we'll cover in this guide:</p><ul><li><p>The Three Technical Paradigms</p></li><li><p>Difference Between Symbolic AI, Adaptive AI, and Hybrid AI</p></li><li><p>Decomposing AI Development</p><ul><li><p>The AI Development Framework</p></li><li><p>High-level Processes and Key Relationships</p></li><li><p>The Problem Domain of AI development</p></li><li><p>The Solution Domain of AI development</p></li><li><p>Framing the Mathematical Formula of AI Development</p></li><li><p>Example 1: Decomposing "Generative AI"</p></li><li><p>Example 2: Decomposing "Natural Language Processing (NLP)"</p></li></ul></li><li><p>Decision-Making Guide for AI Development</p><ul><li><p>Which AI Development Paradigm to Choose?</p></li><li><p>Which Learning Approach to Use?</p></li><li><p>Which Learning Strategy to Apply?</p></li><li><p>Which AI Model Architecture to Implement?</p></li></ul></li><li><p>FAQs</p></li></ul><div><hr></div><blockquote><p><em>Heads-up: This is a deep dive. And unlike my other posts, <strong>my deep dives are optimized for knowledge sharing, not length</strong>. This is a resource for "builder" archetype individual contributors (ICs) and leaders who want to go from a surface-level understanding to a seasoned practitioner-level understanding in one weekend. </em><br><br>&#128161;<em> I recommend committing at least 12 hours to digesting this installment.</em></p></blockquote><p><em>If you're serious about mastering the fundamentals of AI development, start with Part 1 of this series and work your way through to the end. 
Each installment builds on the last, providing you with the practical knowledge to discern reality from hype, separate fact from misinformation, and ultimately think and talk like a true AI practitioner.</em></p><div><hr></div><p></p><p>Everyone's heard the golden rule of AI development: "Garbage IN, Garbage OUT."</p><blockquote><p>An AI system becomes what it eats, quite literally. It lives and learns within its own data bubble. The larger and more diverse the bubble, the more "intelligent" it gets.</p></blockquote><p>The boundaries of this data bubble are rapidly expanding. While <a href="https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/">storage costs have decreased</a>, the sheer scale of data required for sophisticated AI systems keeps overall storage expenses significant. Similarly, while processing power has become more efficient, the <a href="https://openai.com/index/ai-and-compute/">computational demands</a> of advanced AI models have significantly increased typical training and operational costs. For example, OpenAI's large language model GPT-3 drew on roughly 45TB of raw text data (before filtering), and the cost to train it was estimated to be around $4.6 million in 2020.</p><p>As covered in Part 1 of this series, "<a href="https://datacraft.wiki/p/whats-the-difference-between-narrow-ai-and-agi">From Narrow AI to Superintelligence: What's the Difference and When Will We Get There?</a>", the natural progression of AI's evolution is quite ambitious and we're only getting started. According to Stanford University's 2023 AI Index Report, global private investment in AI reached $91.9 billion in 2022, while federal government investment in AI R&amp;D in the U.S. is expected to reach $1.5 billion in 2025. 
These investments in AI demonstrate a strong commitment from both the private and public sectors to push AI's potential for transformative impact.</p><p>We're on the brink of Narrow AI disrupting many industries, poised to transform how we think, work, and live in the coming decades. But:</p><ol><li><p><em>How does one develop an AI solution?</em></p></li><li><p><em>What's a practical framework for approaching AI development?</em></p></li><li><p><em>What are the key decisions to be made, and what critical trade-offs must we consider in building AI systems that meet our goals?</em></p></li></ol><p>In this second installment, <strong>we'll answer these questions by zooming out and in, zigging and zagging to provide both the big picture and essential details.</strong></p><p>Let's dive into the trenches.</p><h2>The Three Technical Paradigms</h2><p>Understanding the foundational paradigms of AI is crucial for grasping the full spectrum of AI development.</p><ol><li><p>Symbolic AI: Uses logic and knowledge representation</p></li><li><p>Adaptive AI: Learns from data and adapts to improve performance over time</p></li><li><p>Hybrid AI: Combines Symbolic and Adaptive approaches</p></li></ol><p>Each paradigm has its own set of learning approaches that can be implemented in an AI system to help it gain new knowledge.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RjjJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RjjJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!RjjJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RjjJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RjjJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RjjJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg" width="960" height="540" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:43467,&quot;alt&quot;:&quot;The Three Technical Paradigms in AI Development&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129891?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="The Three Technical Paradigms in AI Development" title="The Three Technical Paradigms in AI Development" 
srcset="https://substackcdn.com/image/fetch/$s_!RjjJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 424w, https://substackcdn.com/image/fetch/$s_!RjjJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 848w, https://substackcdn.com/image/fetch/$s_!RjjJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!RjjJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131cc8cf-030a-4c9a-8869-4ffceadd46eb_960x540.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Three Technical Paradigms in AI Development</figcaption></figure></div><p>Most AI development guides jump past the basics and dive straight into specific, popular branches of AI, like the neural networks used in Deep Learning. However, the <strong>evolution of <a href="https://nohellofenil.com/whats-the-difference-between-narrow-ai-and-agi/">Narrow AI</a> is progressing towards an era of Hybrid AI systems</strong>. This shift is driven by the growing importance of Explainable AI (XAI) and Ethical AI, both crucial for any hope of achieving Artificial General Intelligence (AGI).</p><p>As a Data and AI practitioner, you need a working knowledge of all three technical paradigms. This isn't just necessary today; it's table stakes.</p><h2>Difference Between Symbolic AI, Adaptive AI, and Hybrid AI</h2><p><br><strong>Symbolic AI</strong>, also known as <em>Classical AI</em>, <em>Rule-Based AI</em>, or <em>Good Old-Fashioned AI (GOFAI)</em>, uses human-readable symbols and rules to represent knowledge and solve problems through logical reasoning. It relies on explicit programming of knowledge and rules, similar to how we use language or symbols in mathematics, making it particularly useful in domains where expert knowledge can be clearly <em>codified</em>.</p><p>This approach was the first official attempt at creating AI. 
It grew in popularity between the 1950s and 1980s.</p><ul><li><p>Expert systems for medical diagnosis, early Interactive Voice Response (IVR) systems, automated theorem provers, and chess-playing programs like Deep Blue are all examples of Symbolic AI.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qk2W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qk2W!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 424w, https://substackcdn.com/image/fetch/$s_!Qk2W!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 848w, https://substackcdn.com/image/fetch/$s_!Qk2W!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 1272w, https://substackcdn.com/image/fetch/$s_!Qk2W!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qk2W!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif" width="373" height="498" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:498,&quot;width&quot;:373,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Qk2W!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 424w, https://substackcdn.com/image/fetch/$s_!Qk2W!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 848w, https://substackcdn.com/image/fetch/$s_!Qk2W!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 1272w, https://substackcdn.com/image/fetch/$s_!Qk2W!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ee61c56-291a-44bb-b5b3-de13d8fe1c30_373x498.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Pac-Man's ghosts follow rule-based chasing patterns, exemplifying the codified if-then-else logic characteristic of Symbolic AI systems</figcaption></figure></div><ul><li><p>Note that the absence of adaptive learning alone is not sufficient to classify a system as a 'Symbolic AI' system. For example, a simple calculator operates on predefined rules and doesn't learn, but it's not considered AI. 
Similarly, <em>Pac-Man</em> isn't Symbolic AI, despite its rule-based nature.</p></li><li><p>The main identifiers of Symbolic AI are:</p><ul><li><p>Operates primarily based on predefined rules and knowledge representations</p></li><li><p>Has explicit, human-readable rules and knowledge bases</p></li><li><p>Has a built-in capacity to explain its decision-making process, which can be accessed for troubleshooting or when a deeper understanding of the system's decisions or reasoning is needed</p></li></ul></li></ul><p><strong>Adaptive AI</strong> (machine learning) is a type of AI that can learn from data and improve its performance over time <em>without</em> being explicitly programmed. It uses statistical techniques to find patterns in data and make decisions or predictions based on these patterns.</p><p>Under this technical paradigm, deep learning (or "Connectionist AI") is a specific <em>Learning Method</em> and Generative AI is an <em>Application Domain</em>. We'll explore these concepts in more depth in the upcoming sections.</p><ul><li><p>Image and speech recognition systems, chatbots, recommendation engines, spam filters, and predictive maintenance systems in manufacturing are a few examples of Adaptive AI.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9ZGy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9ZGy!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 424w, 
https://substackcdn.com/image/fetch/$s_!9ZGy!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 848w, https://substackcdn.com/image/fetch/$s_!9ZGy!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 1272w, https://substackcdn.com/image/fetch/$s_!9ZGy!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9ZGy!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif" width="498" height="373" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:373,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!9ZGy!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 424w, https://substackcdn.com/image/fetch/$s_!9ZGy!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 848w, 
https://substackcdn.com/image/fetch/$s_!9ZGy!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 1272w, https://substackcdn.com/image/fetch/$s_!9ZGy!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6429684-8c35-4c31-9abe-5bc5c44a1904_498x373.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Facial recognition uses machine learning algorithms to identify and verify faces</figcaption></figure></div><p><strong>Hybrid AI</strong> 
combines elements of both Symbolic AI and Adaptive AI (machine learning). It aims to integrate the logical reasoning and explainability of Symbolic AI with the learning capabilities and flexibility of machine learning.</p><ul><li><p>Self-driving cars, modern voice assistants like Siri and Alexa, humanoid robots, etc., are all good examples of Hybrid AI. They all involve ML-based systems to understand their surroundings (like recognizing objects, text, speech, etc.) and rule-based systems to make safe, predictable decisions.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RGkP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RGkP!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 424w, https://substackcdn.com/image/fetch/$s_!RGkP!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 848w, https://substackcdn.com/image/fetch/$s_!RGkP!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 1272w, https://substackcdn.com/image/fetch/$s_!RGkP!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!RGkP!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif" width="498" height="281" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:281,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!RGkP!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 424w, https://substackcdn.com/image/fetch/$s_!RGkP!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 848w, https://substackcdn.com/image/fetch/$s_!RGkP!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 1272w, https://substackcdn.com/image/fetch/$s_!RGkP!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36c55cc-a59b-4125-9582-940c33d23ee0_498x281.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Modern humanoid robots like "Atlas" by Boston Dynamics combine rule-based programming for basic movement and safety with ML for complex tasks like balancing on uneven terrain</figcaption></figure></div><h2>&#127775; Decomposing AI Development</h2><p>Decomposing AI development into a core functional framework helps us visualize both the problem space and the technical implementation, making it easier to systematically track the key decisions and critical trade-offs involved in building AI systems that meet our goals.<br><br>In the following sections, we'll first digest the key components of our AI development framework and then illustrate its application through examples in Generative AI and Natural Language Processing.</p><h3>The AI Development Framework</h3><div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9jMa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9jMa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 424w, https://substackcdn.com/image/fetch/$s_!9jMa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 848w, https://substackcdn.com/image/fetch/$s_!9jMa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 1272w, https://substackcdn.com/image/fetch/$s_!9jMa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9jMa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:236844,&quot;alt&quot;:&quot;Framework for 
AI System Development&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/webp&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129891?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Framework for AI System Development" title="Framework for AI System Development" srcset="https://substackcdn.com/image/fetch/$s_!9jMa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 424w, https://substackcdn.com/image/fetch/$s_!9jMa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 848w, https://substackcdn.com/image/fetch/$s_!9jMa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 1272w, https://substackcdn.com/image/fetch/$s_!9jMa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3215e3a-8626-4001-9914-736f6eef793a_10000x5625.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">My AI Development Framework</figcaption></figure></div><p><em>Important note: Digesting the terminology and definitions is critical to applying this framework in the real world. The nomenclature is deliberate and carefully crafted. You might want to review this section multiple times to fully internalize it.</em></p><ol><li><p><strong>Framework Domain</strong></p><ol><li><p>Problem Domain: Focuses on the "what" and the "why" of AI development, defining the context, requirements, and objectives.</p></li><li><p>Solution Domain (AI): Focuses on the "how" of AI development, encompassing the technical approaches, methods, and models used to create an AI solution that addresses the Problem Domain.</p></li></ol></li><li><p><strong>Learning Approach</strong>: This is the <em>overall</em> <em>process</em> or <em>structure</em> of "how" the learning happens. 
The Learning Approach answers the fundamental question: How will the AI system learn from the available data?</p></li><li><p><strong>Learning Implementation: </strong>The process of implementing the Learning Approach to develop a tailored AI solution for a particular application domain or use case domain.</p><ol><li><p>Learning Strategy: The specific <em>methodology</em> or <em>strategy</em> chosen to implement the Learning Approach.</p></li><li><p>Learning Techniques: The specific <em>techniques</em> or <em>algorithms</em> used to execute the Learning Strategy.</p></li><li><p>AI Model Architecture: The <em>architectural blueprint</em> of "how" an AI model processes data to learn, also known as the model architecture "type".</p></li><li><p>Trained AI Model: The <em>outcome</em> of applying learning strategies and techniques to an AI model architecture, resulting in a model that is <em>fine-tuned</em> for a specific application domain or use case domain.</p></li></ol></li><li><p><strong>Application Domain (AI)</strong>: The <em>specialized</em> areas within AI that drive the development of new AI models. It represents broad AI fields such as Generative AI, NLP, Autonomous Systems, and Explainable AI (XAI).</p></li><li><p><strong>Use Case Domain</strong>: The real-world problems to which AI solutions are applied. It represents the concrete application of AI: the specific scenarios, tasks, or problems. Use Case Domains are generally industry-agnostic, but can be industry-specific when unique tasks or problems warrant an AI solution.</p></li></ol><h2>High-level Processes and Key Relationships</h2><p>This section is the key aha moment. 
Are you ready?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1hRM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1hRM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 424w, https://substackcdn.com/image/fetch/$s_!1hRM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 848w, https://substackcdn.com/image/fetch/$s_!1hRM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 1272w, https://substackcdn.com/image/fetch/$s_!1hRM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1hRM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png" width="1456" height="1445" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1445,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:260365,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129891?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1hRM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 424w, https://substackcdn.com/image/fetch/$s_!1hRM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 848w, https://substackcdn.com/image/fetch/$s_!1hRM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 1272w, https://substackcdn.com/image/fetch/$s_!1hRM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd64a63a4-2351-474f-934a-9c438515fa37_1840x1826.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>The Learning Approach, such as supervised or unsupervised learning, dictates how learning will be executed through Learning Implementation and is the <em>first decision</em> to make in the AI development process.</p></li><li><p>After selecting the Learning Approach, Learning Implementation begins, involving the selection of the Learning Strategy, Learning Techniques, and AI Model Architecture to create a tailored AI solution.</p></li><li><p>The Learning Strategy is chosen to align with the Learning Approach and the <em>target application domain or use case domain</em>, defining which strategy or methodology, such as deep learning or transfer learning, will be used to optimize the learning process.</p></li><li><p>As part of the Learning Strategy, the AI Model Architecture, such as a CNN or a Transformer, is selected based on the type of <em>available data</em> and the <em>target task</em>, serving as the blueprint for how the model processes data.</p></li><li><p>Once the Learning Strategy and AI Model Architecture are in place, Learning Techniques, such as optimization algorithms and backpropagation, are applied to train and fine-tune the model.</p></li><li><p>The result of the Learning Implementation process is the Trained AI Model, a model that has been optimized using the chosen strategies, techniques, and architecture, and is now tailored to the Application Domain and Use Case Domain.</p></li><li><p>The Trained AI Model is then deployed within a specific Application Domain, such as NLP or Computer Vision, which is the broader AI field relevant to the problem at hand.</p></li><li><p>Finally, the Trained AI Model, by itself or as part of a larger AI solution, is applied to real-world Use Case Domains, solving specific problems like medical diagnosis assistance or code generation.</p></li></ul><div><hr></div><p>&#128273; <strong>You've completed the foundation section!</strong> </p><p>What's ahead: We're about to explore the Problem and Solution domains of AI development, the core concepts that will help you connect all those buzzwords you've been hearing. Get ready to see how everything fits together in the AI landscape.</p><div><hr></div>
      <p>
          <a href="https://www.datacraft.wiki/p/decomposing-ai-development">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[From Narrow AI to Superintelligence: What's the Difference and When Will We Get There?]]></title><description><![CDATA[Part 1 of "AI Development Demystified" series for builders: Start your AI knowledge journey here. Learn about the three main types of AI and their potential future. Understand where we are now and the challenges that lie ahead.]]></description><link>https://www.datacraft.wiki/p/whats-the-difference-between-narrow-ai-and-agi</link><guid isPermaLink="false">https://www.datacraft.wiki/p/whats-the-difference-between-narrow-ai-and-agi</guid><dc:creator><![CDATA[Fenil Dedhia]]></dc:creator><pubDate>Sat, 05 Oct 2024 04:31:51 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/458ef309-36fb-423b-871c-80b554adcd87_1000x521.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>To understand the current state of AI, it&#8217;s best to understand how an AI program compares to a human brain.</p><p>In this visual, the green box shows the human brain comprising all its various functions:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Dd3k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dd3k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Dd3k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Dd3k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Dd3k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dd3k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg" width="1000" height="521" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:521,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:69601,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://datacraft.wiki/i/162129894?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Dd3k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Dd3k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Dd3k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Dd3k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45265d62-55c2-49ff-8d7d-628592078a06_1000x521.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>A single AI system can manage only a fraction of the numerous functions of the human brain, but it can perform that fraction much better. It is much faster and is capable of handling massive amounts of data and uncovering hidden patterns. In short, each AI system can, generally speaking, perform a specific task far better than any person on Earth. This scenario, where AI systems excel at specific, narrow tasks, is called <strong>Narrow AI</strong> or <em>Weak AI</em>.</p><p>Artificial General Intelligence (AGI), also known as <em>Strong AI</em>, represents a future where one AI system encompasses <em>all</em> cognitive functions of a human brain. Even though innovation isn&#8217;t linear, this scenario is still decades away. But as we witness Weak AI performing seemingly magical feats, we tend to read them as signs of Strong AI. Remember, our current AI systems, as impressive as they are, remain limited to excelling at <em>specific</em> domains or tasks.</p><blockquote><p>The progression from Weak AI to Strong AI represents a significant leap in itself. If an AI system should ever surpass human intelligence across virtually all domains, it would be another, and truly incredible, leap beyond Strong AI.</p></blockquote><p>This (theoretical) leap would be something we call Artificial Superintelligence (ASI). 
Despite claims from CEOs of popular AI companies, both AGI and ASI remain theoretical concepts; AGI is a major research goal that has yet to be achieved.</p><div class="pullquote"><p>Despite what some CEOs of for-profit AI companies claim, we remain <strong>far</strong> from <strong>real</strong> Artificial General Intelligence (AGI), and even further from Artificial Superintelligence (ASI).</p></div><h2>Difference Between Narrow AI, AGI, and ASI</h2><p><strong>Narrow AI</strong>, or <em>Weak AI</em>, is a type of artificial intelligence trained for, and capable of, a <em>specific</em> task or <em>narrow</em> domain.</p><ul><li><p>Self-driving cars, chatbots, facial recognition, AI-driven weather prediction, and AI-assisted medical diagnosis are all examples of Narrow AI.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZWRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZWRL!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 424w, https://substackcdn.com/image/fetch/$s_!ZWRL!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 848w, https://substackcdn.com/image/fetch/$s_!ZWRL!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 1272w, 
https://substackcdn.com/image/fetch/$s_!ZWRL!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZWRL!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif" width="498" height="280" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f4b08a86-f641-4efa-b80c-cab502287941_498x280.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!ZWRL!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 424w, https://substackcdn.com/image/fetch/$s_!ZWRL!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 848w, https://substackcdn.com/image/fetch/$s_!ZWRL!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 1272w, https://substackcdn.com/image/fetch/$s_!ZWRL!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff4b08a86-f641-4efa-b80c-cab502287941_498x280.gif 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p><strong>Artificial General Intelligence (AGI)</strong>, also called <em>Strong AI</em> or <em>General AI</em>, is a type of artificial intelligence aimed at matching or exceeding human-level intelligence across a <em>wide range</em> of cognitive tasks.</p><ul><li><p>Because it doesn&#8217;t actually exist yet, the only true examples of AGI are found in works of science fiction, such as the J.A.R.V.I.S. and F.R.I.D.A.Y. 
assistants from <em>Marvel's Iron Man</em> and HAL 9000 from <em>2001: A Space Odyssey</em>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MExc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MExc!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 424w, https://substackcdn.com/image/fetch/$s_!MExc!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 848w, https://substackcdn.com/image/fetch/$s_!MExc!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 1272w, https://substackcdn.com/image/fetch/$s_!MExc!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MExc!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif" width="498" height="280" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!MExc!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 424w, https://substackcdn.com/image/fetch/$s_!MExc!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 848w, https://substackcdn.com/image/fetch/$s_!MExc!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 1272w, https://substackcdn.com/image/fetch/$s_!MExc!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F579b4e5e-958d-46b3-9ee2-028d7dc66dbd_498x280.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><blockquote><p>Despite what CEOs of popular AI companies, like OpenAI, might have claimed about AGI, the AI research community has generally avoided making specific predictions about when (real) AGI might be achieved.&nbsp;</p><p><strong>You might be thinking: who gets to define what "real" AGI is?</strong> The short answer is that the path to AGI is not just a technical challenge; it also involves complex ethical, philosophical, and societal considerations. </p></blockquote><p>We'll go through the hurdles in achieving real AGI in a later section of this article.</p><p>The next leap beyond AGI is ASI.</p><p><strong>Artificial Superintelligence (ASI)</strong> is a type of artificial intelligence aimed at surpassing human intelligence and capabilities across virtually <em>all</em> domains. If research into&nbsp;AGI produced sufficiently intelligent software, that software might be able to&nbsp;reprogram and improve itself. 
ASI is even more speculative than AGI and is often discussed in the context of science fiction, as well as the potential long-term development challenges and risks associated with building an ASI.</p><ul><li><p>ASI would be something like the <em>Skynet</em> from the <em>Terminator</em> movie series or the superintelligent AI from the movie <em>Transcendence</em>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1PnU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1PnU!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 424w, https://substackcdn.com/image/fetch/$s_!1PnU!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 848w, https://substackcdn.com/image/fetch/$s_!1PnU!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 1272w, https://substackcdn.com/image/fetch/$s_!1PnU!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1PnU!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif" width="498" height="498" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:498,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!1PnU!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 424w, https://substackcdn.com/image/fetch/$s_!1PnU!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 848w, https://substackcdn.com/image/fetch/$s_!1PnU!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 1272w, https://substackcdn.com/image/fetch/$s_!1PnU!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F614fdb7b-59a4-43ff-9f4d-383ed0173980_498x498.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The infamous science fiction concept of 'AI taking over' requires ASI, not just AGI, and we're still far from achieving even AGI.</figcaption></figure></div><p>Now that we've covered the basics, let&#8217;s dive into more interesting questions and theories to solidify our understanding of the current state of AI and its potential future evolution.</p><h2>What Do Leading AI Companies Say About AGI Development?</h2><p>It's not uncommon or surprising when CEOs of prominent AI companies claim they've achieved AGI or will soon. These statements often serve to attract investors rather than reflect genuine technological progress. 
We should evaluate such claims critically, considering the actual technical challenges in developing true artificial general intelligence.</p><p>Among developers at the leading companies, the prevailing view is that while AGI is a long-term goal, the immediate focus should be on developing AI systems that are <em>safe</em>, <em>ethical</em>, and <em>beneficial</em> to humanity.</p><p>Top AI researchers recognize that the path to AGI is not just a technical challenge, but also involves complex ethical, philosophical, and societal considerations.</p><p>Some critics worry that commercial pressures could influence research priorities. <strong>The race for market dominance and profit maximization could sideline safety and ethical considerations in AI development</strong>. <a href="https://www.wsj.com/tech/ai/openais-complex-path-to-becoming-a-for-profit-company-bad21a42">OpenAI&#8217;s shift to a for-profit model</a> has raised questions about the company's profit motives and the ethical development of AGI.</p><p>Despite rapid progress in AI, many fundamental problems remain unsolved, such as common-sense reasoning, transfer learning, and true language understanding. While we've made significant strides in narrow AI, the leap to AGI remains a formidable challenge &#8211; arguably the greatest technological hurdle of our time.&nbsp;</p><h2>Hurdles in Achieving AGI</h2><blockquote><p>To be clear: Building AGI (strong AI) is like building a human brain from scratch &#8211; that's kind of what we're up against.</p></blockquote><p>It's not just about making faster computers or developing more sophisticated algorithms. We're talking about recreating the essence of human-like thinking in a machine.</p><p>Here are some significant hurdles on the way to true AGI:</p><ol><li><p>AI systems don't really "get" the world like we do. They struggle with basic cause-and-effect relationships that toddlers grasp easily. 
This challenge concerns the <em>development of common sense reasoning</em> in AI. It is closely tied to the frame problem in artificial intelligence, which deals with representing and reasoning about the effects of actions in a complex, dynamic world. It also touches on the development of causal inference capabilities in AI systems.</p></li><li><p>We may need computers as powerful as entire data centers just to match one human brain. That would be one colossal energy bill!</p></li><li><p>Today's hardware falls short of the computational power necessary for AGI. <em><a href="https://www.youtube.com/watch?v=TetLY4gPDpo">Neuromorphic Computing</a></em> and <em><a href="https://www.youtube.com/watch?v=JhHMJCUmq28">Quantum Computing</a></em> are being explored as potential solutions, but they're still in the early stages of development. Even with sufficient raw computing power, using that power efficiently for AGI-level tasks remains a significant challenge.</p></li><li><p>Making sure AI plays nice and doesn't go rogue isn't just sci-fi anymore &#8211; it's a real head-scratcher for researchers. Solving both the control problem (ensuring we can maintain control over a superintelligent AI) and the value alignment problem (ensuring AI systems act in accordance with human values and ethics) is extremely important.&nbsp;<em>Ethical AI</em> aims to create AI models and algorithms that are fair and respectful of human values. It tries to address ethical concerns such as accountability, transparency, and data privacy.</p></li><li><p>Current AI systems are savants of imitation &#8211; they can produce human-like outputs, but do they truly understand what they're saying? This is like the difference between a person fluent in a language and someone just really good at using a phrasebook. <strong>The Chinese Room thought experiment</strong>, proposed by philosopher John Searle, illustrates this: imagine a person who doesn't know Chinese locked in a room with a big book of rules for responding to Chinese messages. 
They could produce seemingly intelligent Chinese responses <em>without understanding Chinese</em>. Similarly, our AI might be really good at producing human-like text <em>without genuine comprehension</em>. Closing this gap means overcoming three key challenges in AGI development:</p><ul><li><p>Creating <em>genuine language understanding</em> in AI, beyond mere statistical pattern matching.</p></li><li><p>Developing <em><a href="https://thedecisionlab.com/reference-guide/design/explainable-ai-xai">Explainable AI (XAI)</a></em> &#8211; systems that can not only provide answers but also explain their reasoning in a way humans can understand. Deep learning models, such as Large Language Models (LLMs), function as "black boxes," with internal processes that are opaque even to their creators. Current language models can generate impressive responses, but they often can't provide clear, <em>consistent</em> explanations for how they arrived at those responses.</p></li><li><p>Solving, in practice, fundamental issues like the <a href="https://en.wikipedia.org/wiki/Symbol_grounding_problem">symbol grounding problem</a> in knowledge representation and the <a href="https://www.ibm.com/topics/transfer-learning">disadvantages of current transfer learning techniques</a>, and developing general reasoning capabilities that mirror human cognition.&nbsp;Despite Luc Steels' compelling argument that the <a href="https://researchportal.vub.be/en/publications/the-symbol-grounding-problem-has-been-solved-so-whats-next">symbol grounding problem is solved in principle</a>, debate persists in AI circles, as practical implementation across all systems remains unproven.</p></li></ul></li></ol><h2>Timeline Predictions for Achieving AGI and ASI</h2><h3>AGI</h3><p>Ben Goertzel, CEO of SingularityNET, has estimated in various interviews and conferences, including the AI for Good Global Summit, that AGI could be achieved between 2027 and 2030 (<a 
href="https://www.livescience.com/technology/artificial-intelligence/ai-agi-singularity-in-2027-artificial-super-intelligence-sooner-than-we-think-ben-goertzel">Source</a>).</p><p>Ray Kurzweil, futurist and Google engineer, predicted AGI by 2029 in his book <em>The Singularity Is Near</em> (2005). His reported track record is an 86% accuracy rate across 147 written predictions (<a href="https://en.wikipedia.org/wiki/Ray_Kurzweil">Source</a>).</p><p>Elon Musk predicts AGI by 2029 as well (<a href="https://x.com/elonmusk/status/1767738797276451090?lang=en">Source</a>).</p><p>More conservative estimates from AI researchers like Yoshua Bengio and Stuart Russell suggest it could take several decades.</p><h3>ASI</h3><p>Nick Bostrom suggests in Chapter 4 of <em><a href="https://global.oup.com/academic/product/superintelligence-9780198739838?cc=us&amp;lang=en&amp;">Superintelligence</a></em> (2014) that ASI could emerge relatively quickly after AGI, anywhere from days to years later.</p><p>Eliezer Yudkowsky, in various writings for the Machine Intelligence Research Institute, has suggested ASI could emerge rapidly after AGI, potentially within hours (Chapter 8 of <a href="https://intelligence.org/files/AIPosNegFactor.pdf">Source</a>).</p><h3>What Should We Make of These Predictions?</h3><p>These predictions are speculative and based on current understanding. Innovation is non-linear, so these timelines could change significantly. 
But the key point is that many experts expect the window between AGI and ASI to be far shorter than the road to AGI itself.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!brcI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!brcI!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 424w, https://substackcdn.com/image/fetch/$s_!brcI!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 848w, https://substackcdn.com/image/fetch/$s_!brcI!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 1272w, https://substackcdn.com/image/fetch/$s_!brcI!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!brcI!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif" width="498" height="280" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cd00c6a2-e235-4531-9500-07738ee17757_498x280.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:280,&quot;width&quot;:498,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!brcI!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 424w, https://substackcdn.com/image/fetch/$s_!brcI!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 848w, https://substackcdn.com/image/fetch/$s_!brcI!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 1272w, https://substackcdn.com/image/fetch/$s_!brcI!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd00c6a2-e235-4531-9500-07738ee17757_498x280.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>In my opinion, these predictions are way too optimistic.</strong> Having been in the trenches of AI development, I do not think we will achieve ASI in the 21st century. Here&#8217;s why:</p><ul><li><p>We often underestimate the true complexity of AGI and ASI. As we'll examine in more detail below, the challenges involved are in fact far more complex and numerous than our optimistic minds assume, which leads to timelines that are more wishful thinking than reasoned approximations.</p></li><li><p>We should be wary of falling into the trap of the <em><a href="https://thedecisionlab.com/biases/planning-fallacy">Planning Fallacy</a></em> cognitive bias. We all tend to underestimate the time required to complete future tasks, despite knowing that similar past tasks have taken longer than anticipated. This bias, often coupled with <em><a href="https://thedecisionlab.com/biases/optimism-bias">Optimism Bias</a></em>, leads us to make overly optimistic predictions about complex technological developments. 
History is littered with examples of this in the AI field.</p></li><li><p>In the 1950s and 60s, AI pioneers confidently predicted human-level machine intelligence within a generation. Fast forward to 1970, and Marvin Minsky was quoted as saying we'd achieve it in three to eight years. Here we are, more than five decades later, still working towards that goal.</p></li></ul><p>While AGI isn't impossible, its timeline deserves far more skepticism than it usually gets. Our optimistic brains, influenced by these cognitive biases, might be painting a future that's closer than reality suggests. The path to AGI and beyond is probably longer and more winding than current predictions imply.</p><h3>The Plausibility of the Rapid Emergence of ASI from AGI</h3><p>As noted above, a commonly held theory is that the innovation window to achieve ASI would be significantly shorter once AGI is achieved.</p><p>The key arguments <em>supporting</em> this theory:</p><ul><li><p>AGI might be able to improve its own code, kicking off a rapid cycle of self-enhancement (recursive self-improvement). This process could theoretically occur very quickly, possibly within days or even hours.</p></li><li><p>With access to all digital knowledge, an AGI could quickly become a know-it-all. It might use advanced NLP techniques to synthesize information at a rate that would make current LLMs look like snails. Additionally, since it&#8217;s a machine, it could work non-stop at full capacity.</p></li><li><p>Each improvement in AGI could lead to even bigger leaps, snowballing its capabilities. This exponential growth might follow a pattern similar to Moore's Law, but for intelligence rather than transistor density. AGI could multitask across countless systems, learning everything at once. It could potentially leverage quantum computing for certain tasks.</p></li></ul><p>The key arguments <em>against</em> this theory:</p><ul><li><p>The smarter AGI gets, the harder it might become to make significant improvements. 
We might hit a complexity ceiling where returns diminish, similar to the challenges in scaling current deep learning models.</p></li><li><p>Just because an AGI can crunch numbers faster doesn't mean it'll suddenly become <em>super</em> intelligent. Fundamental algorithmic breakthroughs are needed beyond mere computational power.</p></li><li><p>We might build in some ethical stop signs that prevent AGI from going full throttle on self-improvement. Think of it as a super-sophisticated (and more practical) version of <a href="https://www.brookings.edu/articles/isaac-asimovs-laws-of-robotics-are-wrong/">Asimov's Three Laws of Robotics</a>.</p></li><li><p>There could be unforeseen hurdles in creating superintelligence that we can't predict yet. The <a href="https://www.youtube.com/watch?v=VyHbd6sx5Po">halting problem</a> or <a href="https://www.youtube.com/watch?v=I4pQbo5MQOs">G&#246;del's incompleteness theorems</a> might pose unexpected limitations.</p></li><li><p>Intelligence might have a ceiling &#8211; there may be limits to how smart something can actually become. Perhaps there's a theoretical maximum to intelligence, just as there's a speed limit in physics. AI researchers and developers simply haven't come close enough to find out.</p></li></ul><h2>How AGI Might Transform Data Management and Cloud Infrastructure Products</h2><p>If and when we achieve AGI, it will disrupt life as we know it. But the data management industry? It will win the tech lottery. I firmly believe it is poised to be the first to reap the benefits of AGI innovation.</p><p>So what exactly would this AGI-enhanced data management space look like? Let's dive into some potential scenarios.</p><h3>Self-Optimizing, Self-Healing Systems</h3><p>Imagine systems that redesign themselves on the fly. AGI could continuously optimize infrastructure, making real-time decisions about resource allocation, storage solutions, and processing power. 
AGI might use advanced reinforcement learning algorithms, far beyond current AutoML capabilities, to evolve system architectures in real time based on usage patterns and performance metrics.</p><p>Forget manual troubleshooting. AGI-powered systems could diagnose and fix most issues before your team even noticed a problem. This could be achieved through advanced anomaly detection algorithms combined with automated root cause analysis and solution generation, far beyond current AIOps capabilities.</p><h3>Intelligent Data Organization</h3><p>AGI could understand the context and relationships within your data, organizing it in ways that make human-designed databases look primitive.</p><p>We're talking about potential breakthroughs in semantic data models and self-evolving ontologies that could redefine how we structure and query information.</p><h3>Superhuman Cybersecurity</h3><p>AGI could predict and neutralize security threats we haven't even thought of yet, making current cybersecurity measures look like child's play.</p><p>This might involve real-time analysis of global network patterns, predictive modeling of potential exploits, and automatic generation and deployment of security patches. Definitely easier said than done, even for AGI, but it's entirely within the realm of possibility.</p><h2>Looking Ahead: The Future Landscape of AI Development</h2><p>As we journey from Narrow AI towards the (currently theoretical) realms of AGI and ASI, we face numerous fundamental challenges, including knowledge representation, the limitations of transfer learning, and the development of human-like reasoning capabilities.</p><p>The AI research community continues to debate the merits of <em>Symbolic AI</em> versus <em>Connectionist AI</em> approaches, with some exploring hybrid systems that combine the strengths of both. 
<strong>Learn more about these different approaches in part 2 of this series: </strong><a href="https://datacraft.wiki/p/decomposing-ai-development">Decomposing AI Development</a><strong>.</strong></p><p>On the hardware front, emerging technologies like neuromorphic computing and quantum neural networks offer exciting possibilities, though they're still in their infancy.</p><div class="pullquote"><p>While the timeline for achieving AGI remains uncertain, <strong>one thing is clear</strong>: the journey itself is <em>pushing</em> the boundaries of our knowledge and capabilities. The development of Strong AI also intersects with complex philosophical questions about ethics, intelligence, and what it means to BE human vs BUILD machines that act and think like humans.</p></div><p>As we continue this exploration, it's crucial to approach AI development with both optimism and caution. The potential benefits of AGI are enormous, but so too are the ethical considerations and potential risks. By maintaining a balanced perspective and fostering interdisciplinary collaboration, we can work towards a future where AI enhances human capabilities while respecting human values. This is why the current focus on <em>Explainable AI (XAI)</em> and <em>Ethical AI</em> is of paramount importance and will continue for decades to come.</p><p>The road to AGI and beyond is long and winding, but undoubtedly exciting, filled with challenges and opportunities. What a time to be alive!</p>]]></content:encoded></item></channel></rss>