Tailored news hub
Category

Large Language Models

Large language models — their releases, architectures, capabilities, and head-to-head comparisons.

4 articles
What is MiniCPM5-1B and How Does Its Dual-Mode Architecture Work?
Explore MiniCPM5-1B, a 1B-parameter LLM designed for on-device deployment, featuring state-of-the-art performance and a unique 'Think'/'No Think' dual-mode chat template.

Discover MiniCPM5-1B, an efficient 1B-parameter causal language model optimized for local and resource-constrained environments. Learn about its Llama-based architecture, 131K-token context window, and 'Think' and 'No Think' modes, which let a single checkpoint act as either a fast assistant or a deliberate reasoner.

Understanding Uncensored LLMs: A Deep Dive into Qwen3.5-35B-A3B-Heretic-V2
Explore the technical innovations, ethical considerations, and practical applications of uncensored large language models, focusing on a community-driven variant of Qwen3.5.

Learn about the architecture and capabilities of uncensored language models, specifically Qwen3.5-35B-A3B-Heretic-V2. Discover how multi-token prediction and various quantization formats enhance performance and accessibility, while understanding the implications of removing safety filters for research and development.

xAI Completes Grok V9-Medium Training, June Release Expected
Elon Musk confirms the 1.5T-parameter model, triple the size of its predecessor, is now entering fine-tuning ahead of a public launch within weeks, with enhanced coding capabilities.

xAI has finished training its Grok V9-Medium foundation model, a 1.5-trillion-parameter model with significant improvements over its predecessor, v8-small. The model, which heavily emphasizes coding tasks through Cursor data, is now undergoing fine-tuning and reinforcement learning, with a public release anticipated in early to mid-June 2026.

Inside Talkie: The 13B LM Trained Only on Pre-1931 Text
Exploring the motivations, training data, capabilities, and community reactions to a language model that only knows the world before 1931.

Talkie is a 13B-parameter language model trained exclusively on 260 billion tokens of text published before 1931. Built by Nick Levine, Alec Radford, and David Duvenaud to study AI generalization, it sparks discussion on historical perspective and anachronistic outputs. This deep dive covers data sources, processing, limitations, and public release plans.