Canadian AI startup Cohere — cofounded by one of the authors of the original transformer paper that kickstarted the large language model (LLM) revolution back in 2017 — today unveiled Command A, its latest generative AI model designed for enterprise applications.
As the successor to Command-R, which debuted in March 2024, and Command R+ after it, Command A builds on Cohere’s focus on retrieval-augmented generation (RAG), external tool use and enterprise AI efficiency, especially with regard to compute costs and the speed at which it serves up answers.
That’s going to make it an attractive option for enterprises looking to gain an AI advantage without breaking the bank, and for applications where prompt responses are needed — such as finance, health, medicine, science and law.
With faster speeds, lower hardware requirements and expanded multilingual capabilities, Command A positions itself as a strong alternative to models such as GPT-4o and DeepSeek-V3 — classic LLMs, not the new reasoning models that have taken the AI industry by storm lately.
Unlike its predecessor, which supported a context length of 128,000 tokens (the amount of information the LLM can handle in one input/output exchange, roughly equivalent to a 300-page novel), Command A doubles that to 256,000 tokens (about 600 pages of text) while improving overall efficiency and enterprise readiness.
It also comes on the heels of Cohere for AI, the company’s non-profit subsidiary, releasing an open-source (research use only) multilingual vision model called Aya Vision earlier this month.
A step up from Command-R
When Command-R launched in early 2024, it introduced key innovations like optimized RAG performance, better knowledge retrieval and lower-cost AI deployments.
It gained traction with enterprises, integrating into business solutions from companies like Oracle, Notion, Scale AI, Accenture and McKinsey. Still, a November 2024 report from Menlo Ventures surveying enterprise adoption put Cohere’s market share among enterprises at a slim 3%, far below OpenAI (34%) and Anthropic (24%), and even behind small startups like Mistral (5%).
Now, in a bid to become a bigger enterprise draw, Command A pushes these capabilities even further. According to Cohere, it:
- Matches or outperforms OpenAI’s GPT-4o and DeepSeek-V3 in business, STEM and coding tasks
- Operates on just two GPUs (A100 or H100), a major efficiency improvement compared to models that require up to 32 GPUs
- Achieves faster token generation, producing 156 tokens per second — 1.75x faster than GPT-4o and 2.4x faster than DeepSeek-V3
- Reduces latency, with a 6,500ms time-to-first-token, compared to 7,460ms for GPT-4o and 14,740ms for DeepSeek-V3
- Strengthens multilingual AI capabilities, with improved Arabic dialect matching and expanded support for 23 global languages
Cohere notes in its developer documentation online that: “Command A is Chatty. By default, the model is interactive and optimized for conversation, meaning it is verbose and uses markdown to highlight code. To override this behavior, developers should use a preamble which asks the model to simply provide the answer and to not use markdown or code block markers.”
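In practice, that preamble is passed as a system message ahead of the user prompt. The following is a minimal sketch assuming Cohere’s Python SDK and a `command-a-03-2025` model identifier; both should be verified against the current developer documentation:

```python
# Sketch: overriding Command A's default "chatty" behavior with a preamble,
# per Cohere's developer docs. The `cohere` SDK usage and the model ID
# "command-a-03-2025" are assumptions; verify against current documentation.
import os

PREAMBLE = (
    "Simply provide the answer. Do not use markdown or code block markers."
)

def build_messages(user_prompt: str) -> list[dict]:
    """Prepend the terse-answer preamble as a system message."""
    return [
        {"role": "system", "content": PREAMBLE},
        {"role": "user", "content": user_prompt},
    ]

if __name__ == "__main__" and os.environ.get("COHERE_API_KEY"):
    import cohere  # pip install cohere

    co = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])
    resp = co.chat(
        model="command-a-03-2025",
        messages=build_messages("Write a Python one-liner to reverse a list."),
    )
    print(resp.message.content[0].text)
```

Without the system message, the model defaults to the verbose, markdown-heavy conversational style Cohere describes.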
Built for the enterprise
Cohere has continued its enterprise-first strategy with Command A, ensuring that it integrates seamlessly into business environments. Key features include:
- Advanced retrieval-augmented generation (RAG): Enables verifiable, high-accuracy responses for enterprise applications
- Agentic tool use: Supports complex workflows by integrating with enterprise tools
- North AI platform integration: Works with Cohere’s North AI platform, allowing businesses to automate tasks using secure, enterprise-grade AI agents
- Scalability and cost efficiency: Private deployments are up to 50% cheaper than API-based access
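For the RAG feature, Cohere’s chat API accepts grounding documents directly with the request, and grounded responses carry citations back to those documents. This is a sketch only; the exact document schema and response fields are assumptions to check against Cohere’s API reference:

```python
# Sketch: a grounded (RAG) request to Command A via Cohere's chat API.
# The `documents` parameter shape and citation behavior are assumptions
# based on Cohere's published API; confirm against current documentation.
import os

def build_documents(snippets: dict[str, str]) -> list[dict]:
    """Convert {id: text} snippets into the document format the API expects."""
    return [
        {"id": doc_id, "data": {"text": text}}
        for doc_id, text in snippets.items()
    ]

if __name__ == "__main__" and os.environ.get("COHERE_API_KEY"):
    import cohere  # pip install cohere

    co = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])
    resp = co.chat(
        model="command-a-03-2025",
        messages=[{"role": "user", "content": "What is our refund window?"}],
        documents=build_documents(
            {"policy-1": "Refunds are accepted within 30 days of purchase."}
        ),
    )
    # Grounded responses include citations referencing the document ids above.
    print(resp.message.content[0].text)
```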
Multilingual and highly performant in Arabic
A standout feature of Command A is its ability to generate accurate responses across 23 of the most spoken languages around the world, including improved handling of Arabic dialects. Supported languages (according to the developer documentation on Cohere’s website) are:
- English
- French
- Spanish
- Italian
- German
- Portuguese
- Japanese
- Korean
- Chinese
- Arabic
- Russian
- Polish
- Turkish
- Vietnamese
- Dutch
- Czech
- Indonesian
- Ukrainian
- Romanian
- Greek
- Hindi
- Hebrew
- Persian
In benchmark evaluations:
- Command A scored 98.2% accuracy in responding in Arabic to English prompts — higher than both DeepSeek-V3 (94.9%) and GPT-4o (92.2%).
- It significantly outperformed competitors in dialect consistency, achieving an ADI2 score of 24.7, compared to 15.9 (GPT-4o) and 15.7 (DeepSeek-V3).
Built for speed and efficiency
Speed is a critical factor for enterprise AI deployment, and Command A has been engineered to deliver results faster than many of its competitors.
- Token streaming speed for 100K context requests: 73 tokens/sec (compared to GPT-4o at 38/sec and DeepSeek-V3 at 32/sec)
- Faster first token generation: Reduces response time significantly compared to other large-scale models
Pricing and availability
Command A is now available on the Cohere platform and with open weights for research use only on Hugging Face under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license, with broader cloud provider support coming soon.
- Input tokens: $2.50 per million
- Output tokens: $10.00 per million
Private and on-prem deployments are available upon request.
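At those rates, per-request cost is easy to estimate. For example, a request with 50,000 input tokens and 5,000 output tokens would run about $0.18 (token counts here are illustrative, not from Cohere):

```python
# Cost estimate at Cohere's listed Command A API pricing:
# $2.50 per million input tokens, $10.00 per million output tokens.
INPUT_PER_M = 2.50
OUTPUT_PER_M = 10.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one API request at the listed rates."""
    return (
        (input_tokens / 1_000_000) * INPUT_PER_M
        + (output_tokens / 1_000_000) * OUTPUT_PER_M
    )

print(round(request_cost(50_000, 5_000), 3))  # → 0.175
```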
Industry reactions
Several AI researchers and Cohere team members have shared their enthusiasm for Command A.
Dwaraknath Ganesan, who works on pretraining at Cohere, commented on X: “Extremely excited to reveal what we have been working on for the last few months! Command A is amazing. Can be deployed on just 2 H100 GPUs! 256K context length, expanded multilingual support, agentic tool use… very proud of this one.”
Pierre Richemond, AI researcher at Cohere, added: “Command A is our new GPT-4o/DeepSeek v3 level, open-weights 111B model sporting a 256K context length that has been optimized for efficiency in enterprise use cases.”
Building on the foundation of Command-R, Cohere’s Command A represents the next step in scalable, cost-efficient enterprise AI.
With faster speeds, a larger context window, improved multilingual handling and lower deployment costs, it offers businesses a powerful alternative to existing AI models.
The post Cohere targets global enterprises with new highly multilingual Command A model requiring only 2 GPUs appeared first on Venture Beat.