Microsoft’s AI Security Team Reveals How Hidden Training Backdoors Can Quietly Persist in Enterprise Language Models


  • Microsoft launches scanner to detect poisoned language models before deployment
  • Hijacked LLMs can hide malicious behavior until specific trigger phrases appear
  • Scanner identifies abnormal attention patterns linked to hidden backdoor triggers

Microsoft announced the development of a new scanner designed to detect hidden backdoors in open large language models used in enterprise environments.

The company says its tool aims to identify cases of model poisoning, a form of tampering in which malicious behavior is directly built into the model weights during training.
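Microsoft has not published the scanner's internals, but the general idea of flagging abnormal attention drawn by candidate trigger tokens can be illustrated with a small, hypothetical sketch. The code below is not Microsoft's tool: the toy attention matrix, the z-score heuristic, the threshold, and the invented trigger token "xq_unlock" are all assumptions made for illustration only.

```python
import numpy as np

def flag_suspicious_tokens(attention, tokens, z_threshold=2.0):
    """Flag tokens that attract abnormally concentrated attention.

    attention: (num_queries, num_keys) matrix of attention weights,
               averaged over heads/layers, with each row summing to 1.
    tokens:    list of token strings at the key positions.
    Returns (token, z_score) pairs whose total received attention is a
    statistical outlier relative to the rest of the sequence.
    """
    received = attention.sum(axis=0)                    # attention mass each token attracts
    z = (received - received.mean()) / received.std()   # simple outlier score (illustrative only)
    return [(tok, round(float(s), 2)) for tok, s in zip(tokens, z) if s > z_threshold]

# Toy example: a hypothetical trigger token ("xq_unlock") that a poisoned
# model attends to far more heavily than ordinary words in the prompt.
tokens = ["please", "summarize", "the", "report", "xq_unlock", "today"]
rng = np.random.default_rng(0)
attention = rng.dirichlet(np.ones(len(tokens)), size=len(tokens))
attention[:, 4] += 0.9                                  # simulate the anomalous pull of the trigger
attention /= attention.sum(axis=1, keepdims=True)       # re-normalize rows to valid distributions

print(flag_suspicious_tokens(attention, tokens))        # flags only 'xq_unlock'
```

A production detector would of course operate on real model activations across many prompts and layers rather than a single toy matrix; the sketch only shows why a backdoor trigger can stand out as a statistical anomaly in attention behavior.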
