Microsoft was unable to withdraw its new WizardLM 2 model from public access


The "GPT-4 killer" was hastily pulled from the web, but the Internet remembers everything…

Last week, Microsoft researchers introduced WizardLM 2, one of the most powerful open-source large language models (LLMs). Shortly after the announcement, however, the company hastily took the model offline. According to the developers themselves, this happened because they had skipped the required toxicity testing before releasing the model to the public.

A since-deleted announcement from the developers of WizardLM 2 stated that the model represents "the next generation of advanced large language models, improved for complex chats, multilingual systems, reasoning, and agency tasks." Unlike other models, which were trained on publicly available data from the Internet or scientific journals, WizardLM 2 was trained on synthetic data created by other AI models.

In theory, this approach was supposed to make Microsoft's new LLM safer. But because the developers had not tested it as the company's policies require, the model was "withdrawn" from public access.

Despite the quick removal, some users managed to download the model and re-upload it to GitHub and Hugging Face. As a result, a model that Microsoft itself considered not ready for wide distribution remains freely available. As the saying goes, what once gets on the Internet stays there forever. Microsoft, as is customary, declined to comment on the incident.

The developers themselves posted the following message on April 16: "We are very sorry. It's been a long time since the model was last released, so we weren't familiar with the new release process: we accidentally missed the toxicity testing phase. We will soon complete this test and re-release the model. Don't worry, thank you for your concern and understanding."

The official WizardLM 2 pages on GitHub and Hugging Face are still unavailable, but the model is easy to find in several mirrored copies on the same platforms.

Using MT-Bench, a benchmark that automatically evaluates the performance of large language models, Microsoft researchers found that WizardLM 2 shows highly competitive results compared with the latest proprietary models, such as GPT-4-Turbo and Claude-3. In theory, the new model could indeed become a "killer" of the high-profile models from OpenAI and Anthropic.

It is not yet known whether Microsoft's new model is actually capable of generating harmful or "toxic" responses, or whether the company removed it only because this had not been verified. The fact remains, however, that the company failed to control the spread of an AI model that it considered not ready for public use.