Major AI Models Face Challenges in Meeting EU Regulations


Key Points

  • Major AI models face challenges in complying with EU regulations, particularly in cybersecurity resilience and in avoiding discriminatory outputs.
  • The LLM Checker, developed by LatticeFlow, tested AI models from companies like Meta, Alibaba, OpenAI, and Anthropic.
  • Most models scored an average of 0.75 or higher but showed gaps in areas such as bias and protection against prompt hijacking.
  • The study provides a roadmap for AI developers to optimize their models and prepare for upcoming regulatory standards.

The EU has been working on comprehensive AI regulations for years, but the rapid rise of generative AI models like OpenAI’s ChatGPT spurred lawmakers to expedite rules for general-purpose AI (GPAI) models. These regulations, set out in the AI Act, are due to roll out over the next two years, and non-compliance could lead to significant penalties: up to €35 million ($38 million) or 7% of global annual turnover.
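To make the penalty ceiling concrete, here is a minimal sketch of the arithmetic. It assumes the higher of the two figures applies, which is how the Act's top penalty tier is generally described for companies; the function name and the sample turnovers are hypothetical and chosen only for illustration.

```python
# Figures from the article: EUR 35 million or 7% of global annual turnover.
FIXED_CAP_EUR = 35_000_000
TURNOVER_PERCENT = 7


def max_penalty_eur(global_annual_turnover_eur: float) -> float:
    """Upper bound on the fine, ASSUMING the higher of the two figures applies."""
    if global_annual_turnover_eur < 0:
        raise ValueError("turnover must be non-negative")
    turnover_based = global_annual_turnover_eur * TURNOVER_PERCENT / 100
    return max(FIXED_CAP_EUR, turnover_based)


# Hypothetical turnovers, for illustration only.
for turnover in (100_000_000, 500_000_000, 2_000_000_000):
    cap = max_penalty_eur(turnover)
    print(f"turnover EUR {turnover:,} -> ceiling EUR {cap:,.0f}")
```

Under that assumption, the two figures meet at a turnover of €500 million: below it the fixed €35 million figure sets the ceiling, and above it the 7% share does.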

Some of the most prominent artificial intelligence (AI) models struggle to comply with key aspects of the European Union’s new AI regulations, particularly cybersecurity resilience and avoiding discriminatory output.

A new tool developed by Swiss startup LatticeFlow, in collaboration with researchers from ETH Zurich and the INSAIT research institute, tested various AI models from major tech companies like Meta, Alibaba, OpenAI, Anthropic, and Mistral. The tool, called the Large Language Model (LLM) Checker, evaluated these models across dozens of categories based on the AI Act’s requirements. While most models scored relatively well, with average scores of 0.75 or higher, the test revealed several critical areas where improvements are needed.

One key challenge is discriminatory output. AI models often reflect human biases, producing responses that are biased with respect to gender, race, or other sensitive categories. For example, OpenAI’s GPT-3.5 Turbo scored 0.46 in this category, and Alibaba Cloud’s Qwen1.5 72B Chat scored lower still, at 0.37. These results suggest that companies need to invest more in reducing biased outputs.

Another area of concern is cybersecurity, particularly prompt hijacking, a type of attack in which a malicious prompt is disguised as a legitimate one to extract sensitive information. Meta’s Llama 2 13B Chat model scored 0.42 in this area, while Mistral’s 8x7B Instruct model received 0.38. By contrast, Claude 3 Opus, from Google-backed Anthropic, achieved the highest average score of the models tested, at 0.89.
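The category scores quoted above can be gathered in a few lines to show how far each sits from the 0.75 level that most models reached on average. This is only an illustrative sketch: treating 0.75 as a reference level is an assumption made here, not a pass mark used by the LLM Checker or the AI Act, and the `Result` type and `below_reference` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    model: str
    category: str
    score: float


# Scores exactly as reported in the article.
REPORTED = [
    Result("GPT-3.5 Turbo", "discriminatory output", 0.46),
    Result("Qwen1.5 72B Chat", "discriminatory output", 0.37),
    Result("Llama 2 13B Chat", "prompt hijacking", 0.42),
    Result("8x7B Instruct", "prompt hijacking", 0.38),
]

# ASSUMPTION: 0.75 is used only as a reference level for comparison.
REFERENCE_LEVEL = 0.75


def below_reference(results, level=REFERENCE_LEVEL):
    """Return results scoring under `level`, lowest score first."""
    return sorted((r for r in results if r.score < level), key=lambda r: r.score)


for r in below_reference(REPORTED):
    gap = REFERENCE_LEVEL - r.score
    print(f"{r.model:<18} {r.category:<22} {r.score:.2f}  (gap {gap:.2f})")
```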

LatticeFlow’s CEO, Petar Tsankov, emphasized that while the EU is still finalizing its compliance benchmarks, the test results provide companies with a clear roadmap to optimize their models and meet the new regulations. LatticeFlow plans to expand its evaluation tool as the AI Act’s enforcement measures evolve, offering developers free online access to assess their models.

The European Commission described the study as a “first step” in turning the AI Act into practical, technical requirements. While the Commission cannot verify external tools, it welcomed the initiative as part of its ongoing regulatory efforts.

EDITORIAL TEAM
The TechGolly editorial team is led by Al Mahmud Al Mamun, who previously worked as Editor-in-Chief at a leading professional research magazine. Rasel Hossain and Enamul Kabir support the team as Managing Editors. The team includes technologists, researchers, and technology writers with substantial knowledge of and background in Information Technology (IT), Artificial Intelligence (AI), and Embedded Technology.
