What is Shisa V2 405B?
We're thrilled to introduce Shisa V2 405B, the latest and most powerful addition to the Shisa V2 family of open-source, bilingual large language models. Built on the Llama 3.1 405B Instruct base, Shisa V2 405B is engineered to deliver exceptional performance in both Japanese and English, addressing the need for high-quality, culturally aware, and capable AI within Japan and globally. It sets a new standard for LLMs trained in Japan and competes effectively with leading global models on key benchmarks, giving you a powerful tool for diverse and demanding language tasks.
Key Features
Shisa V2 405B is designed to provide you with cutting-edge language processing power, particularly for Japanese and English applications. Here are its core strengths:
🌐 Leading Japanese & English Performance: Based on extensive evaluations using industry-standard and custom benchmarks (like Japanese MT-Bench, ELYZA Tasks 100, MixEval), Shisa V2 405B demonstrates performance competitive with global leaders such as GPT-4o and DeepSeek-V3 in Japanese, and strong capabilities in English. This means you can expect highly accurate, nuanced, and contextually relevant responses in both languages.
🇯🇵 Deep Japanese Language Mastery: Leveraging a significantly refined, high-quality Japanese/English SFT dataset, Shisa V2 405B exhibits a profound understanding of Japanese grammar, linguistics, and cultural context. We've specifically developed new evaluations, including shisa-jp-ifeval (instruction following) and shisa-jp-rp-bench (role-playing), to ensure the model excels in real-world Japanese use cases that standard benchmarks might miss.
📈 Built on Superior Data Quality: Our intensive focus on improving dataset quality, particularly for synthetic data generation and filtering, has been the single most important factor driving Shisa V2 405B's performance. By using one of the best core JA/EN SFT datasets available, the model learns from cleaner, more relevant data, resulting in more reliable and higher-quality outputs for you.
🌍 Enhanced CJK Multilingual Support: While primarily focused on Japanese and English, Shisa V2 405B incorporates additional Korean (KO) and Traditional Chinese (ZH-TW) language data. This explicit inclusion makes it more capable for CJK (Chinese, Japanese, Korean) multilingual applications, broadening its utility for regional tasks.
💡 Large Scale, Advanced Training: At 405B parameters, Shisa V2 405B is massive, requiring substantially more compute (>50x vs. Shisa V2 70B) and full-parameter fine-tuning. This scale and training intensity contribute directly to its ability to handle complex instructions, maintain coherence over long conversations, and generate sophisticated text.
Use Cases
Shisa V2 405B empowers you to tackle a wide range of advanced language challenges:
High-Quality Bilingual Content Generation: Generate accurate and natural-sounding text, articles, marketing copy, or creative content in both Japanese and English, maintaining linguistic nuances and cultural appropriateness.
Advanced Japanese/English Conversation & Role-Playing: Develop sophisticated chatbots or AI assistants capable of engaging in fluid, multi-turn conversations, including persona-based interactions and complex instruction following in Japanese.
Precise Bilingual Translation & Understanding: Improve the accuracy and quality of translations between Japanese and English, or deeply analyze and summarize content in either language (a minimal API sketch follows this list).
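To make these use cases concrete, here is a minimal sketch of calling the model for bilingual chat and translation. It assumes you have the model hosted behind an OpenAI-compatible endpoint (for example, a self-hosted vLLM server); the localhost URL, API key, and sampling settings below are placeholders for your own deployment, not an official Shisa endpoint.

```python
# Minimal sketch: bilingual (JA/EN) chat and translation through an
# OpenAI-compatible endpoint, e.g. a self-hosted vLLM server.
# The base_url and api_key are placeholders for your own deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "shisa-ai/shisa-v2-llama3.1-405b"  # model name as registered on your server

messages = [
    {"role": "system", "content": "You are a helpful bilingual Japanese/English assistant."},
    {"role": "user", "content": "次の文を自然な英語に翻訳してください:「四季折々の風景が楽しめます。」"},
]

# First turn: translation request in Japanese.
reply = client.chat.completions.create(model=MODEL, messages=messages, temperature=0.7)
answer = reply.choices[0].message.content
print(answer)

# Second turn: follow-up in English, reusing the conversation history.
messages += [
    {"role": "assistant", "content": answer},
    {"role": "user", "content": "Now explain the nuance of 四季折々 in English."},
]
followup = client.chat.completions.create(model=MODEL, messages=messages, temperature=0.7)
print(followup.choices[0].message.content)
```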
Conclusion
Shisa V2 405B represents a significant leap forward for high-performance, bilingual AI, particularly for Japanese and English users and developers. Its top-tier performance, rooted in exceptional data quality and extensive training, makes it an ideal choice for demanding applications.
Ready to experience Japan's highest-performing LLM? You can chat with Shisa V2 405B (FP8) right now or explore download options:
Chat with Shisa V2 405B: chat.shisa.ai
Download the model: shisa-ai/shisa-v2-llama3.1-405b on Hugging Face
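If you prefer to fetch the weights programmatically rather than through the website, the sketch below uses the huggingface_hub client; the destination directory is just an example, and keep in mind that the full-precision checkpoint occupies hundreds of gigabytes of disk space.

```python
# Minimal sketch: download the Shisa V2 405B weights from Hugging Face.
# local_dir is an example path; the target volume needs hundreds of
# gigabytes of free space for the full-precision checkpoint.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="shisa-ai/shisa-v2-llama3.1-405b",
    local_dir="./shisa-v2-405b",  # example destination, adjust to your storage
)
```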
FAQ
What kind of hardware is required to run Shisa V2 405B? Running the full FP16 model requires roughly 800GB of memory, necessitating multi-GPU setups (e.g., 2x H100 nodes or 1x MI300X node). However, we also provide quantized versions (FP8, INT8, and various GGUF quants starting at roughly 100GB) that reduce memory requirements significantly, making the model accessible on less extreme hardware. For easy testing, you can simply use the web demo at chat.shisa.ai.
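As a rough illustration of what a self-hosted deployment looks like, the sketch below loads the model with vLLM for offline inference. The tensor-parallel degree and the checkpoint choice are assumptions: per the memory figures above, the full FP16 weights need a multi-node or MI300X-class setup, so on a single 8-GPU node you would point this at an FP8 or otherwise quantized variant instead.

```python
# Minimal sketch: offline inference with vLLM on a single 8-GPU node.
# tensor_parallel_size and the checkpoint choice are assumptions; the
# full FP16 weights will not fit here, so substitute the FP8/quantized
# variant you actually downloaded.
from vllm import LLM, SamplingParams

llm = LLM(
    model="shisa-ai/shisa-v2-llama3.1-405b",  # replace with your quantized variant
    tensor_parallel_size=8,                   # shard the weights across 8 GPUs
)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(["日本で一番高い山について教えてください。"], params)
print(outputs[0].outputs[0].text)
```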
How does Shisa V2 405B compare to other open-source models? Based on our evaluations, Shisa V2 405B significantly outperforms previous leading open models trained in Japan, including our own Shisa V2 70B. We observed its performance on industry-standard Japanese benchmarks like JA MT-Bench to be competitive with major global models like GPT-4o and DeepSeek-V3.
What is "Sovereign AI" and how does Shisa V2 405B relate to it? Sovereign AI refers to a nation's ability to develop and control its own AI systems. While Shisa.AI's team is international, they have chosen Japan as their home and share a deep appreciation for Japanese culture and language. Developing high-performing models like Shisa V2 405B within Japan contributes to linguistic preservation, cultural diversity, data privacy/security, and geopolitical resilience, aligning with the principles of Sovereign AI through an open-source approach.
Shisa V2 405B Alternatives
- EXAONE 3.5 by LG AI Research: a suite of bilingual (English & Korean) instruction-tuned generative models ranging from 2.4B to 32B parameters, supporting long contexts of up to 32K tokens with strong performance in real-world scenarios.
- Jamba 1.5 Open Model Family by AI21: built on an SSM-Transformer architecture, it combines long-text processing with high speed and quality, positioned as a leader among comparable products and suited to enterprise users dealing with large data volumes and long documents.
- C4AI Aya Vision 8B: an open-source multilingual vision model for image understanding, offering OCR, captioning, and reasoning in 23 languages.
