packages/model-bank/src/aiModels/moonshot.ts
56 additions & 5 deletions
@@ -2,6 +2,34 @@ import { AIChatModelCard } from '../types/aiModel';
 
 // https://platform.moonshot.cn/docs/pricing/chat
 const moonshotChatModels: AIChatModelCard[] = [
+  {
+    abilities: {
+      functionCall: true,
+      reasoning: true,
+      structuredOutput: true,
+      vision: true,
+    },
+    contextWindowTokens: 262_144,
+    description:
+      'Kimi K2.5 is Kimi\'s most versatile model to date, featuring a native multimodal architecture that supports both vision and text inputs, "thinking" and "non-thinking" modes, and both conversational and agent tasks.',
packages/model-bank/src/aiModels/ollamacloud.ts
14 additions & 0 deletions
@@ -1,6 +1,20 @@
 import { AIChatModelCard } from '../types/aiModel';
 
 const ollamaCloudModels: AIChatModelCard[] = [
+  {
+    abilities: {
+      functionCall: true,
+      reasoning: true,
+      vision: true,
+    },
+    contextWindowTokens: 262_144,
+    description:
+      'Kimi K2.5 is an open-source, native multimodal agentic model that seamlessly integrates vision and language understanding with advanced agentic capabilities, instant and thinking modes, as well as conversational and agentic paradigms.',
'Gemini 2.0 Flash Experimental is Google’s latest experimental multimodal AI model with quality improvements over prior versions, especially in world knowledge, code, and long context.',
'Kimi K2.5 is the most capable Kimi model, delivering open-source SOTA in agent tasks, coding, and vision understanding. It supports multimodal inputs and both thinking and non-thinking modes.',
'MiniMax-M2.1 is a flagship open-source large model from MiniMax, focusing on solving complex real-world tasks. Its core strengths are multi-language programming capabilities and the ability to solve complex tasks as an Agent.',
'Qwen3 Max models deliver large gains over the 2.5 series in general ability, Chinese/English understanding, complex instruction following, subjective open tasks, multilingual ability, and tool use, with fewer hallucinations. The latest qwen3-max improves agentic programming and tool use over qwen3-max-preview. This release reaches field SOTA and targets more complex agent needs.',
'Kimi K2.5 is an open-source native multimodal agent model, built on Kimi-K2-Base, trained on approximately 1.5 trillion mixed vision and text tokens. The model adopts an MoE architecture with 1T total parameters and 32B active parameters, supporting a 256K context window, seamlessly integrating vision and language understanding capabilities.',
'PaddleOCR-VL-1.5 is an upgraded version of the PaddleOCR-VL series, achieving 94.5% accuracy on the OmniDocBench v1.5 document parsing benchmark, surpassing leading general large models and specialized document parsing models. It innovatively supports irregular bounding box localization for document elements, handling scanned, tilted, and screen-captured images effectively.',
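The added cards set contextWindowTokens to 262_144, while one of the descriptions above advertises a "256K context window". The sketch below shows how the two figures line up (262,144 = 256 × 1024). It is only an illustration: ModelCardSketch and contextWindowInK are hypothetical names covering just the fields visible in this diff, and the real AIChatModelCard type in '../types/aiModel' is not shown here and may differ.

```typescript
// Hypothetical, simplified stand-in for the fields visible in this diff.
// The real AIChatModelCard type lives in '../types/aiModel' and may differ.
interface ModelCardSketch {
  abilities: {
    functionCall?: boolean;
    reasoning?: boolean;
    structuredOutput?: boolean;
    vision?: boolean;
  };
  contextWindowTokens: number;
  description: string;
}

// Values copied from the moonshot.ts hunk; the description is shortened here.
const kimiK25Sketch: ModelCardSketch = {
  abilities: {
    functionCall: true,
    reasoning: true,
    structuredOutput: true,
    vision: true,
  },
  contextWindowTokens: 262_144,
  description: 'Kimi K2.5 (description shortened for this sketch)',
};

// Convert a raw token count to "K" units, assuming 1K = 1024 tokens.
function contextWindowInK(tokens: number): number {
  return tokens / 1024;
}

console.log(contextWindowInK(kimiK25Sketch.contextWindowTokens)); // 256
```

Under the 1K = 1024 assumption, the same value in moonshot.ts and ollamacloud.ts matches the 256K figure quoted in the description.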