FireRed Team (Xiaohongshu Intelligent Creation Audio Team)




Introduction: FireRed is the model series developed by Xiaohongshu's Super Intelligence team, featuring FireRed-ASR, FireRed-TTS, FireRed-Chat, FireRed-OCR, FireRed-Image, and FireRed-OpenStoryline.

Derived from the saying "A single spark can start a prairie fire", the name FireRed represents our vision: to spread our SOTA capabilities, honed in real-world scenarios, like sparks across the wilderness, igniting the imagination of developers worldwide to reshape the future of AI together.

The Super Intelligence team comprises fundamental technology laboratories, including the Audio Lab, Vision Lab, and Foundation Lab.

As the company's core technical engine for future content forms and General Intelligence, the team aims to build an industry-leading multimodal foundation model system with sustainable, evolving capabilities. We consistently benchmark against the industry SOTA across content understanding, vision and multimodal, image generation and editing, speech understanding and generation, Omni Models, and effect rendering, while prioritizing the scalability and practical deployment of these models in complex business scenarios. Responsible for core R&D in content creation and publishing, we empower business lines such as Recommendation, Search, Video & Live Streaming, E-commerce, Advertising, and International Growth, establishing cutting-edge technology as the foundation for long-term growth and innovation.

Over the past two years, we have excelled in both academia and industry, publishing over 30 top-tier papers and releasing influential open-source projects like InstantID, StoryMaker, FireRedTTS, and FireRedASR. On the product side, we successfully incubated hit features such as Audio Comments, Text Posters, Long Articles, and Full-Screen HD. We build frontier models while translating technology into impactful products. If you are passionate about general intelligence, cutting-edge models, and creating real-world impact, we welcome you to join us!

News

[2025.4.14] We are releasing a new streamable TTS model [demo]

[2025.1.24] We are releasing a new ASR model [demo]

[2024.9.5] We are releasing a new TTS model [demo]

[2024.1.16] An official introduction to the audio intelligence team of Xiaohongshu [link]

[2023.10.13] We gave a live technical talk at SpeechHome and Redtech [link]

[2023.9.1] We ranked 1st in the spoke task measuring speaker similarity at Blizzard Challenge 2023 [link]