"Deep dive into DeepSeek-V3 model. Its architecture combines MLA and DeepSeekMoE with innovative load balancing. Trained on 14.8T tokens, powered by HAI-LLM framework and FP8 technology. Enhanced by innovations like MTP, performance surpasses open-source and approaches closed-source models. Cost-effective with low training and API costs, a key reference in AI advancing language models."