xAI published the source code of the Grok-1 chatbot

The company xAI, founded by Elon Musk, which received about a billion dollars in funding to develop artificial-intelligence technologies, has announced the open-sourcing of the Grok large language model, used in the chatbot integrated into the social network X (Twitter). The model weights, the neural network architecture, and usage examples are published under the Apache 2.0 license. A ready-to-use archive with the 296 GB model (magnet) is available for download.

The Grok model was pre-trained on a large collection of text data using xAI's proprietary training stack and contains about 314 billion parameters, making it the largest openly available large language model. For comparison, Google's recently released Gemma model has 7 billion parameters, Sber's GigaChat 29 billion, Meta's LLaMA 65 billion, Yandex's YaLM 100 billion, and OpenAI's GPT-3.5 175 billion, while the market leader, the GPT-4 model, is presumed to include 1.76 trillion parameters.

The open version of the Grok-1 model is published as a base model and does not include fine-tuning for specific applications, such as building dialog systems. Testing requires a GPU with a large amount of memory (exactly how much is not specified). What is publicly available is a static snapshot of the model, whereas one of the features of the Grok chatbot developed for Twitter is dynamic adaptation to new content (integration with the X/Twitter platform is used to access up-to-date knowledge).
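Since the announcement does not say how much GPU memory is needed, a rough estimate can be made from the parameter count alone. The sketch below is a back-of-envelope calculation, not an official figure: it assumes memory is dominated by the weights and ignores activations and runtime overhead, so real requirements are somewhat higher.

```python
# Rough weights-only GPU memory estimate for a 314B-parameter model.
# Assumption (not from the announcement): memory is dominated by the
# weights; activation and cache overheads are ignored.

PARAMS = 314e9  # ~314 billion parameters, as stated for Grok-1

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weights-only memory in decimal gigabytes."""
    return params * bytes_per_param / 1e9

for precision, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{precision:>10}: ~{weight_memory_gb(PARAMS, nbytes):,.0f} GB")
```

Even at 8 bits per parameter (~314 GB, roughly in line with the 296 GB archive size mentioned above), the model does not fit in the memory of any single commodity GPU, which is consistent with the vague "large amount of memory" wording.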

The Grok-based chatbot outperforms GPT-3.5 in benchmarks for solving high-school math problems (GSM8k), answering interdisciplinary questions (MMLU), generating Python code (HumanEval), and solving math problems written in LaTeX format (MATH).
