{"id":82501,"date":"2025-07-21T12:13:20","date_gmt":"2025-07-21T12:13:20","guid":{"rendered":"https:\/\/mycryptomania.com\/?p=82501"},"modified":"2025-07-21T12:13:20","modified_gmt":"2025-07-21T12:13:20","slug":"eu-ai-act-understanding-data-and-data-governance-in-article-10","status":"publish","type":"post","link":"https:\/\/mycryptomania.com\/?p=82501","title":{"rendered":"EU AI Act: Understanding Data and Data Governance in Article 10"},"content":{"rendered":"<p>The European Union\u2019s Artificial Intelligence Act (EU AI Act) establishes a framework to regulate AI, particularly for \u201chigh-risk\u201d systems\u200a\u2014\u200athose that could impact health, safety, or fundamental rights. One element of this framework is Article 10, which focuses on data and data governance. This provision mandates strict standards for the datasets used in training, validating, and testing high-risk AI systems to prevent issues like bias, errors, or discrimination.<\/p>\n<p>If you\u2019re an AI provider, or just curious about how the Act regulates data and data governance, understanding Article 10 is important. In this post, I conceptualize the data and data governance requirements as outlined in the Act. We\u2019ll explore what data governance means, its key elements, and why it matters for compliance.<\/p>\n<p><strong>What is Data Governance in the Context of\u00a0AI?<\/strong><\/p>\n<p>Data governance refers to the set of practices, policies, and processes that ensure data is handled responsibly, accurately, and in line with ethical and legal standards. For high-risk AI systems, poor data practices can lead to amplified biases or unreliable outcomes, which is why the AI Act emphasizes governance to mitigate risks and ensure systems perform as intended.<\/p>\n<p>Think of data governance as a conceptual framework:<\/p>\n<p>It covers everything from how data is collected and prepared to how biases are detected and corrected. The goal? 
To make AI systems not just functional, but also fair and compliant with regulations like the General Data Protection Regulation (GDPR) and\u00a0others. In Article 10, this governance applies specifically to training, validation, and testing datasets, ensuring they\u2019re suitable for the AI\u2019s purpose and free from flaws that could harm\u00a0users.<\/p>\n<p><strong>The Five Pillars of Data Governance<\/strong><\/p>\n<p>Beyond the general obligation in paragraph 1, Article 10 is structured around <strong>five main paragraphs<\/strong> (paragraphs 2 to 6, as conceptualized in this post and shown in the figure below), each building on the last to create a robust data management ecosystem. They apply to datasets for high-risk AI systems, with some exceptions for non-training-based systems. Let\u2019s dive into each\u00a0one.<\/p>\n<p><strong>1. Data Governance and Management Practices (Article\u00a010(2))<\/strong><\/p>\n<p>Datasets must be subject to data governance and management practices appropriate to the AI system\u2019s intended purpose. It\u2019s not a one-size-fits-all approach; practices should reflect the system\u2019s design and real-world application.<\/p>\n<p>Key elements\u00a0include:<\/p>\n<ul>\n<li>Design Choices: Strategic decisions during development to align the AI with its goals. This involves selecting technical, procedural, and organizational elements, incorporating stakeholder input, and adhering to data principles like minimization, adequacy, necessity, and proportionality. Regular reviews ensure the system stays on track throughout its lifecycle.<\/li>\n<li>Data Collection Processes: Document the origins of data, how it was gathered, and (for personal data) its original purpose. 
Transparency here prevents misuse and builds\u00a0trust.<\/li>\n<li>Data Preparation Operations: Handle tasks like annotation, labeling, cleaning, updating, enrichment, and aggregation to maintain high\u00a0quality.<\/li>\n<li>Formulation of Assumptions: Clearly define what the data represents and measures\u200a\u2014\u200aavoid vague interpretations that could lead to\u00a0errors.<\/li>\n<li>Assessment of Data Suitability: Evaluate whether datasets are available, sufficient in quantity, and fit for\u00a0purpose.<\/li>\n<li>Bias Examination: Scrutinize data for biases that could affect health and safety, harm fundamental rights, or lead to discrimination, especially in feedback loops where outputs influence future\u00a0inputs.<\/li>\n<li>Bias Mitigation: Implement measures to detect, prevent, and correct\u00a0biases.<\/li>\n<li>Addressing Data Gaps and Shortcomings: Identify and fix any deficiencies that might hinder compliance with the AI\u00a0Act.<\/li>\n<\/ul>\n<p><strong>2. Dataset Characteristics (Article\u00a010(3))<\/strong><\/p>\n<p>Once governance practices are in place, the datasets themselves must meet quality benchmarks. They need to\u00a0be:<\/p>\n<ul>\n<li>Relevant and Sufficiently Representative: Mirror the real-world scenarios where the AI will be deployed, capturing diverse populations or contexts to avoid skewed\u00a0results.<\/li>\n<li>Free of Errors and Complete: To the greatest extent possible, minimize inaccuracies, duplicates, or missing values that could distort AI performance.<\/li>\n<li>Statistically Appropriate: Ensure the data\u2019s statistical properties align with the target population or group the AI serves, promoting reliability and generalizability.<\/li>\n<\/ul>\n<p><strong>3. Contextual Considerations (Article\u00a010(4))<\/strong><\/p>\n<p>Data doesn\u2019t exist in a vacuum. This paragraph requires datasets to take into account, to the extent the intended purpose requires, the characteristics particular to the specific geographical, contextual, behavioral, or functional setting in which the AI will be used. Why? 
To ensure the AI operates effectively, fairly, and safely in its intended environment.<\/p>\n<p>Benefits and rationale:<\/p>\n<ul>\n<li>Promotes Fairness and Non-Discrimination: Representative data reduces biases that could disadvantage certain\u00a0groups.<\/li>\n<li>Enhances Accuracy and Integrity: Tailored data improves completeness and reliability.<\/li>\n<li>Aligns with Legal Standards: Complies with GDPR principles like data minimization and purpose limitation.<\/li>\n<li>Reduces Risks: Matches data to operational contexts, avoiding mismatches that could lead to failures (e.g., historical issues like inaccuracies in Google\u2019s Gemini\u00a0AI).<\/li>\n<li>Compliance Workflow: Providers must assess the AI\u2019s purpose, curate relevant data, balance fairness with accuracy, document decisions, and conduct regular evaluations for ongoing bias mitigation.<\/li>\n<\/ul>\n<p><strong>4. Processing Special Categories of Personal Data (Article\u00a010(5))<\/strong><\/p>\n<p>Special categories of personal data\u200a\u2014\u200athink health records, biometric info, or racial\/ethnic details\u200a\u2014\u200aare highly sensitive. 
Providers can only process them exceptionally, and only for bias detection and correction when strictly necessary (and when alternatives like synthetic or anonymized data won\u2019t suffice).<\/p>\n<p>Strict conditions must all be\u00a0met:<\/p>\n<ul>\n<li>No viable alternative data exists for the\u00a0task.<\/li>\n<li>Technical limitations on reuse, combined with state-of-the-art security and privacy-preserving measures, including pseudonymization.<\/li>\n<li>Strict, documented access controls, with access limited to authorized persons under confidentiality obligations.<\/li>\n<li>Data must not be transferred to or accessed by third\u00a0parties.<\/li>\n<li>Data must be deleted once the bias is corrected or the retention period ends (whichever comes\u00a0first).<\/li>\n<li>Processing records must explain why processing special categories of data was strictly necessary and why other data could not achieve the objective.<\/li>\n<\/ul>\n<p>These safeguards, layered on top of GDPR and related directives, protect fundamental rights while allowing limited use for critical improvements.<\/p>\n<p><strong>5. Testing Datasets for Non-Training Systems (Article\u00a010(6))<\/strong><\/p>\n<p>Not all high-risk AI systems rely on machine learning models that \u201ctrain\u201d on data. For those that don\u2019t, the full governance requirements (Paragraphs 2\u20135) apply only to testing datasets. This streamlines compliance without skimping on quality for evaluation phases.<\/p>\n<p><strong>Why Does This Matter? The Bigger\u00a0Picture<\/strong><\/p>\n<p>Article 10 isn\u2019t just regulatory fine print; it\u2019s a blueprint for compliance. By enforcing rigorous data governance, the EU AI Act helps prevent AI from perpetuating inequalities or causing unintended harm. For providers, compliance means investing in robust processes\u200a\u2014\u200abut the payoff is AI that\u2019s more innovative, trustworthy, and market-ready.<\/p>\n<p>If you\u2019re building AI, start auditing your data practices against these pillars. 
As AI integrates deeper into society, remember: Great AI starts with great data governance.<\/p>\n<p>What challenges have you faced with data in AI projects? Share in the comments\u200a\u2014\u200aI\u2019d love to hear your thoughts!<\/p>\n<p><a href=\"https:\/\/medium.com\/coinmonks\/understanding-data-and-data-governance-in-the-eu-ai-act-9aeb60bc97bd\">EU AI Act: Understanding Data and Data Governance in Article 10<\/a> was originally published in <a href=\"https:\/\/medium.com\/coinmonks\">Coinmonks<\/a> on Medium.<\/p>","protected":false},"excerpt":{"rendered":"<p>The European Union\u2019s Artificial Intelligence Act (EU AI Act) establishes a framework to regulate AI, particularly for \u201chigh-risk\u201d systems\u200a\u2014\u200athose that could impact health, safety, or fundamental rights. One element of this framework is Article 10, which focuses on data and data governance. This provision mandates strict standards for the datasets used in training, validating, and 
[&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-82501","post","type-post","status-publish","format-standard","hentry","category-interesting"],"_links":{"self":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/posts\/82501"}],"collection":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=82501"}],"version-history":[{"count":0,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/posts\/82501\/revisions"}],"wp:attachment":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=82501"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=82501"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=82501"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}