mirror of https://github.com/CherryHQ/cherry-studio.git synced 2025-12-26 03:31:24 +08:00

🍒 Cherry Studio is a desktop client that supports multiple LLM providers.

agent anthropic assistant chatbot chatbotai electron llm mcp-client openai

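Phantom 0af5a85f67 feat: Image OCR (#9409)	2025-08-26 00:13:24 +08:00

* build: add tesseract.js and its type-definition dependencies
* feat(ocr): add an OCR type definition file to support extending OCR functionality
* feat(ocr): add Tesseract OCR provider configuration
* feat(ocr): add the Tesseract.js logo
* refactor(settings): restructure the document preprocessing settings module. Rename PreprocessSettings to DocProcessSettings and adjust the file structure; update related routes and component references to keep behavior consistent
* refactor(config): rename OCR_PROVIDER_CONFIG to BUILTIN_OCR_PROVIDERS to describe its purpose more accurately
* refactor(ocr): rename file
* refactor(ocr): move the OCR provider logo lookup to the utils directory. Move getOcrProviderLogo from config/ocr.ts to utils/ocr.ts to keep the functionality in one place
* refactor(ocr): restructure the OCR configuration to support default providers. Refactor the built-in OCR provider array into separately defined constants and add a default OCR provider map. This improves maintainability and supports future extension
* feat(store): add an OCR state management slice. Implement create, read, update, and delete for OCR providers, using Redux Toolkit to manage OCR state
* feat(types): add an image file type guard. Add the ImageFileMetadata type and the isImageFile type guard for checking whether a file is an image
* feat(ocr): add type definitions and validators for OCR-supported file types. Add the SupportedOcrFileType type and the isSupportedOcrFileType validator; add the SupportedOcrFile type and the isSupportedOcrFile validator
* feat(ocr): add OCR support. Implement Tesseract-based OCR, including file type checks, the service interface, and IPC communication; add OCR type definitions and the service implementation
* refactor(OcrService): update the log context to 'main:OcrService'
* feat(ocr): add basic OCR service functionality. Implement the basics of the OCR service, handling supported file types by calling the window.api.ocr interface
* feat(store): add the ocr module to the redux store
* feat(ocr): add OCR support and file type validation. Add the useOcr hook, which supports image file recognition; add i18n error text for unsupported file types
* refactor(ocr): rename updatePreprocessProvider to updateOcrProvider for naming consistency
* feat(ocr): add the ability to set the image OCR provider
* refactor(ocr): unify OCR type import paths. Import all OCR types from '@renderer/types' or '@types' instead of '@renderer/types/ocr'; improve the DEFAULT_OCR_PROVIDER type definition
* feat(store): bump the persisted store version and add an OCR configuration migration. Add migration logic for version 137 that initializes the OCR providers and the default image provider configuration
* feat(ocr): add the OCR service settings UI and provider selection. Implement the OCR service settings UI, including image OCR provider selection; fix the imageProvider type definition in ocr.ts; add related i18n text
* fix(ocr): add an image size check and improve error handling. Check whether the image file exceeds the 50MB limit; read the file into a buffer instead of recognizing directly from the path; simplify error handling by rethrowing the original error
* feat(OCR service): support base64 strings as OCR input. Extend the tesseractOcr function to accept either a base64 string or an image file
* build: move tesseract.js from devDependencies to dependencies so that tesseract.js works correctly in production
* refactor(ocr): move the Tesseract service file into the tesseract subdirectory and update the configuration
* refactor(TesseractService): add logging and update the worker configuration. Add loggerService to record worker logs, and update the createWorker configuration to use a custom logger
* feat(i18n): add multi-language support for the OCR feature
* refactor(preload): move OCR type definitions to the shared types file. Move the OCR type definitions (OcrProvider, OcrResult, SupportedOcrFile) from the renderer types file to the shared types file @types to improve reuse and maintainability
* refactor(ocr): make tesseractOcr return the full recognition result instead of text only. Return the full result so later processing can use more OCR information, and simplify the conditional logic in imageOcr
* fix(ocr): fix where the error is thrown when the file type does not match the OCR provider's capabilities. Move the throw statement into the else branch
* refactor(ocr): simplify the DEFAULT_OCR_PROVIDER type definition
* fix(ocr): improve message management and error handling during OCR processing. Manage OCR progress messages centrally in the useOcr hook and complete the error handling logic; remove the duplicated message management code from TranslatePage to simplify the OCR flow
* feat(i18n): add translations for OCR errors and statuses
* fix(useOcr): fix where the unsupported file type error is thrown. Move the logic that throws the unsupported OCR file type error inside the conditional
* refactor(ocr): implement ocrImage with OcrService and update the log context. Move the ocrImage function from the useOcr hook into OcrService to improve reuse; change the logger context from 'main' to 'renderer' to reflect the module's location more accurately
* style(TabContainer): remove extra blank lines and keep the code tidy
* refactor(ocr): simplify the OCR file type check. Use the existing isImageFile function in place of redundant type checks to improve reuse
* fix: update the migration error log from 136 to 137
* feat(ocr): enhance Tesseract service with language support and worker management
  - Added support for multiple Tesseract languages: Chinese (Simplified and Traditional) and English.
  - Refactored Tesseract worker management into a class for better encapsulation and reuse.
  - Introduced methods to dynamically determine language path based on IP country and manage worker lifecycle.
* update cn url
* support cn data
* change to async
* use register design mode
* add type
* use bind function
* refactor(ipc): simplify the OCR handler parameters
* refactor(ocr): change the ocrProviderCapabilityRecord type definition to allow defining only some capabilities
* refactor(ocr): move Tesseract-related configuration into the service. Move the language list and download URL constants from the shared configuration into the Tesseract service; define the image size threshold as a constant for readability
* refactor(ocr): use the SupportedOcrFile type in place of FileMetadata throughout. Update the OCR service and its Tesseract implementation to use SupportedOcrFile instead of FileMetadata for better type safety and consistency. Also add a warning log in OcrService for duplicate registration
* refactor(ocr): restructure the OCR type definitions to support model and API configuration. Split the OCR provider configuration into separate types and add model capability records and API configuration type checks; add an OCR handler type definition to give future extensions better type support
* refactor(OcrService): remove the duplicate OcrHandler type definition. OcrHandler is already defined in @types; remove the duplicate for consistency
* refactor(ocr): move OcrService into the ocr directory and update import paths
* feat(ocr): add an OCR API client factory and an example implementation. Implement the factory pattern for OCR API clients so a client can be created per provider; add OcrBaseApiClient as the base class providing common functionality; add OcrExampleApiClient as an example implementation; update OcrService to use the new client factory
* refactor(ocr): add logging to track OCR file processing. Add logging to the OCR service to make file processing easier to trace
* fix(deps): update the tesseract.js dependency and add a patch file. Fix tesseract.js type definition problems and add language constant support
* refactor(ocr): remove the commented-out tesseract language mapping code. Use the Tesseract.js LanguageCode type in place of the hard-coded language list to improve type safety
* feat(ocr): add the Tesseract OCR configuration type
* refactor(OCR settings): rename OcrImageProviderSettings to OcrImageSettings and tidy the code structure
* refactor(ocr): move Tesseract-related types to the bottom of the file to improve code organization
* feat(ocr): add a type check function for the Tesseract OCR provider
* feat(ocr): add the ability to update OCR provider configuration
* feat: add an OCR provider hook. Implement the useOcrProvider hook for reading and updating OCR provider configuration
* refactor(ocr): change the removeOcrProvider parameter to a string id. Simplify the parameter type of removeOcrProvider and filter directly by string id for more concise code
* refactor(ocr): change the built-in OCR providers from an array to a map. Restructure the OCR configuration module to store built-in OCR providers in a map for easier extension and maintenance
* refactor(ocr): make BUILTIN_OCR_PROVIDERS a read-only array. Use Object.freeze to keep the array immutable and improve code safety
* feat(ocr): add OCR provider management and improve error handling. Add the useOcrProviders hook for adding and removing OCR providers; automatically restore the default configuration when a built-in OCR provider is missing; improve error messages and add i18n support
* Revert "refactor(ocr): make BUILTIN_OCR_PROVIDERS a read-only array". This reverts commit `f23e37941a`.
* feat(ocr): add multi-language configuration for Tesseract OCR. Add language support configuration for Simplified Chinese, Traditional Chinese, and English, extending OCR to cover multi-language recognition
* refactor(types): rename Tesseract.LanguageCode to TesseractLangCode for readability
* feat(OCR settings): add the OCR provider settings component and state management. Add an OCR provider settings component that shows the currently selected OCR provider; add state management to the OCR image settings to sync the provider selection to the parent component; add a Tesseract OCR settings component with multi-language selection (not yet available)
* fix(DocProcessSettings): fix the default value of the OCR language selection
* feat(i18n): add translations for OCR provider errors and warnings
* fix(ocr): make the Tesseract language configuration type partial
* fix(ocr): fix the problem caused by the ocrImage function not using await
* fix(ocr): fix how the ocr state is initialized in the migration. Assign the object as a whole instead of setting properties one by one, avoiding potential property loss
* chore: remove the unused @types/tesseract.js dependency
* refactor(OCR settings): add error boundary handling and remove unused comments. Add an ErrorBoundary to the OCR settings component to handle potential errors; remove the TODO comment in OcrTesseractSettings
* build: add the sharp dependency to support image processing
* refactor(ocr): add OCR image preprocessing and optimize TesseractService. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* refactor(ocr): remove the standalone grayscale module and improve the preprocessing pipeline. Integrate grayscale conversion directly into OCR preprocessing so a separate image module is no longer needed; add normalise and threshold steps to improve OCR recognition
* improve image preprocess

Co-authored-by: beyondkmp <beyondkmp@gmail.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
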
.github	chore(ci): refine pr ci steps (#9429)	2025-08-22 22:52:03 +08:00
.husky	chore(pre-commit): add pre-commit hook to enforce code style (#3351)	2025-03-15 11:09:11 +08:00
.vscode	chore(vscode): improve VSCode launch configurations for debugging (#9483)	2025-08-25 10:46:45 +08:00
.yarn	feat: Image OCR (#9409)	2025-08-26 00:13:24 +08:00
build	feat(installer): add architecture compatibility check to NSIS installer (#8587)	2025-07-28 21:41:05 +08:00
docs	feat(translate): brand new translate feature (#8513)	2025-08-11 13:33:31 +08:00
packages	feat: Image OCR (#9409)	2025-08-26 00:13:24 +08:00
resources	feat: add code tools (#9043)	2025-08-12 11:54:38 +08:00
scripts	feat(translate): brand new translate feature (#8513)	2025-08-11 13:33:31 +08:00
src	feat: Image OCR (#9409)	2025-08-26 00:13:24 +08:00
tests	refactor: Unified Logger / 统一日志管理 (#8207)	2025-07-18 09:40:56 +08:00
.editorconfig	style: set eol to lf, code formatting (#7923)	2025-07-08 09:50:33 +08:00
.env.example	fix: support gpt-5 (#8945)	2025-08-10 14:27:26 +08:00
.git-blame-ignore-revs	chore: git blame ignore (#7925)	2025-07-08 14:23:55 +08:00
.gitattributes	style: set eol to lf, code formatting (#7923)	2025-07-08 09:50:33 +08:00
.gitignore	chore: add `CLAUDE.local.md` to .gitignore	2025-08-06 19:43:46 +08:00
.npmrc	opt: optimise local dev with fixed yarn (#3456)	2025-03-19 13:18:11 +08:00
.prettierignore	feat: nutstore integration (#3461)	2025-03-25 11:40:11 +08:00
.prettierrc	Revert "feat(cherry-store): add cherry store (#8683)"	2025-08-06 14:29:55 +08:00
.yarnrc.yml	chore: upgrade yarn version to v4.9.1	2025-05-22 15:48:20 +08:00
CLAUDE.md	feat: add OpenAI o3 model support with enhanced tool calling (#8253)	2025-08-04 23:19:21 +08:00
CODE_OF_CONDUCT.md	docs: update documentation for a more inclusive environment and added japanese and chinese documentation	2024-10-25 00:09:01 +08:00
CONTRIBUTING.md	docs: add `testplan` md (#7854)	2025-07-05 17:19:25 +08:00
dev-app-update.yml	feat: add after-build script for renaming files and updating latest.yml	2025-04-14 17:14:45 +08:00
electron-builder.yml	chore: release v1.5.7-rc.1	2025-08-19 17:38:24 +08:00
electron.vite.config.ts	Revert "feat(cherry-store): add cherry store (#8683)"	2025-08-06 14:29:55 +08:00
eslint.config.mjs	Revert "feat(cherry-store): add cherry store (#8683)"	2025-08-06 14:29:55 +08:00
LICENSE	Update LICENSE (#4744)	2025-04-13 08:00:41 +08:00
package.json	feat: Image OCR (#9409)	2025-08-26 00:13:24 +08:00
playwright.config.ts	test: more unit tests (#5130)	2025-05-26 16:50:26 +08:00
README.md	refactor: match provider and model using a consistent method (#7933)	2025-07-23 10:45:09 +08:00
SECURITY.md	refactor: model list and health check (#7997)	2025-07-21 15:57:08 +08:00
tsconfig.json	Revert "feat(cherry-store): add cherry store (#8683)"	2025-08-06 14:29:55 +08:00
tsconfig.node.json	Revert "feat(cherry-store): add cherry store (#8683)"	2025-08-06 14:29:55 +08:00
tsconfig.web.json	chore(tsconfig): adjust the path order (#8769)	2025-08-02 00:05:15 +08:00
vitest.config.ts	Revert "feat(cherry-store): add cherry store (#8683)"	2025-08-06 14:29:55 +08:00
yarn.lock	feat: Image OCR (#9409)	2025-08-26 00:13:24 +08:00

README.md

🍒 Cherry Studio

Cherry Studio is a desktop client that supports multiple LLM providers, available on Windows, Mac and Linux.

👏 Join Telegram Group | Discord | QQ Group (575014769)

❤️ Like Cherry Studio? Give it a star 🌟 or Sponsor to support the development!

🌠 Screenshot

🌟 Key Features

Diverse LLM Provider Support:

☁️ Major LLM Cloud Services: OpenAI, Gemini, Anthropic, and more
🔗 AI Web Service Integration: Claude, Perplexity, Poe, and others
💻 Local Model Support with Ollama, LM Studio

AI Assistants & Conversations:

📚 300+ Pre-configured AI Assistants
🤖 Custom Assistant Creation
💬 Multi-model Simultaneous Conversations

Document & Data Processing:

📄 Supports Text, Images, Office, PDF, and more
☁️ WebDAV File Management and Backup
📊 Mermaid Chart Visualization
💻 Code Syntax Highlighting

Practical Tools Integration:

🔍 Global Search Functionality
📝 Topic Management System
🔤 AI-powered Translation
🎯 Drag-and-drop Sorting
🔌 Mini Program Support
⚙️ MCP (Model Context Protocol) Server

Enhanced User Experience:

🖥️ Cross-platform Support for Windows, Mac, and Linux
📦 Ready to Use - No Environment Setup Required
🎨 Light/Dark Themes and Transparent Window
📝 Complete Markdown Rendering
🤲 Easy Content Sharing

📝 Roadmap

We're actively working on the following features and improvements:

🎯 Core Features

Selection Assistant with smart content selection enhancement
Deep Research with advanced research capabilities
Memory System with global context awareness
Document Preprocessing with improved document handling
MCP Marketplace for Model Context Protocol ecosystem

🗂 Knowledge Management

Notes and Collections
Dynamic Canvas visualization
OCR capabilities
TTS (Text-to-Speech) support

📱 Platform Support

HarmonyOS Edition (PC)
Android App (Phase 1)
iOS App (Phase 1)
Multi-Window support
Window Pinning functionality

🔌 Advanced Features

Plugin System
ASR (Automatic Speech Recognition)
Assistant and Topic Interaction Refactoring

Track our progress and contribute on our project board.

Want to influence our roadmap? Join our GitHub Discussions to share your ideas and feedback!

🌈 Theme

Theme Gallery: https://cherrycss.com
Aero Theme: https://github.com/hakadao/CherryStudio-Aero
PaperMaterial Theme: https://github.com/rainoffallingstar/CherryStudio-PaperMaterial
Claude dynamic-style: https://github.com/bjl101501/CherryStudio-Claudestyle-dynamic
Maple Neon Theme: https://github.com/BoningtonChen/CherryStudio_themes

PRs for more themes are welcome.

🤝 Contributing

We welcome contributions to Cherry Studio! Here are some ways you can contribute:

Contribute Code: Develop new features or optimize existing code.
Fix Bugs: Submit fixes for any bugs you find.
Maintain Issues: Help manage GitHub issues.
Product Design: Participate in design discussions.
Write Documentation: Improve user manuals and guides.
Community Engagement: Join discussions and help users.
Promote Usage: Spread the word about Cherry Studio.

Refer to the Branching Strategy for contribution guidelines.

Getting Started

Fork the Repository: Fork and clone it to your local machine.
Create a Branch: Make a new branch for your changes.
Submit Changes: Commit and push your changes.
Open a Pull Request: Describe your changes and reasons.

For more detailed guidelines, please refer to our Contributing Guide.

Thank you for your support and contributions!

🔧 Developer Co-creation Program

We are launching the Cherry Studio Developer Co-creation Program to foster a healthy, positive feedback loop within the open-source ecosystem. We believe that great software is built collaboratively, and every merged pull request breathes new life into the project.

We sincerely invite you to join our ranks of contributors and shape the future of Cherry Studio with us.

Contributor Rewards Program

To give back to our core contributors and create a virtuous cycle, we have established the following long-term incentive plan.

The inaugural tracking period for this program will be Q3 2025 (July, August, September). Rewards for this cycle will be distributed on October 1st.

Within any tracking period (e.g., July 1st to September 30th for the first cycle), any developer who contributes more than 30 meaningful commits to any of Cherry Studio's open-source projects on GitHub will be eligible for the following benefits:

Cursor Subscription Sponsorship: Receive a $70 USD credit or reimbursement for your Cursor subscription, making AI your most efficient coding partner.
Unlimited Model Access: Get unlimited API calls for the DeepSeek and Qwen models.
Cutting-Edge Tech Access: Enjoy occasional perks, including API access to models like Claude, Gemini, and OpenAI, keeping you at the forefront of technology.

Growing Together & Future Plans

A vibrant community is the driving force behind any sustainable open-source project. As Cherry Studio grows, so will our rewards program. We are committed to continuously aligning our benefits with the best-in-class tools and resources in the industry. This ensures our core contributors receive meaningful support, creating a positive cycle where developers, the community, and the project grow together.

Moving forward, the project will also embrace an increasingly open stance to give back to the entire open-source community.

How to Get Started?

We look forward to your first Pull Request!

You can start by exploring our repositories, picking up a good first issue, or proposing your own enhancements. Every commit is a testament to the spirit of open source.

Thank you for your interest and contributions.

Let's build together.

🏢 Enterprise Edition

Building on the Community Edition, we are proud to introduce Cherry Studio Enterprise Edition—a privately-deployable AI productivity and management platform designed for modern teams and enterprises.

The Enterprise Edition addresses core challenges in team collaboration by centralizing the management of AI resources, knowledge, and data. It empowers organizations to enhance efficiency, foster innovation, and ensure compliance, all while maintaining 100% control over their data in a secure environment.

Core Advantages

Unified Model Management: Centrally integrate and manage various cloud-based LLMs (e.g., OpenAI, Anthropic, Google Gemini) and locally deployed private models. Employees can use them out-of-the-box without individual configuration.
Enterprise-Grade Knowledge Base: Build, manage, and share team-wide knowledge bases. Ensures knowledge retention and consistency, enabling team members to interact with AI based on unified and accurate information.
Fine-Grained Access Control: Easily manage employee accounts and assign role-based permissions for different models, knowledge bases, and features through a unified admin backend.
Fully Private Deployment: Deploy the entire backend service on your on-premises servers or private cloud, ensuring your data remains 100% private and under your control to meet the strictest security and compliance standards.
Reliable Backend Services: Provides stable API services and enterprise-grade data backup and recovery mechanisms to ensure business continuity.

✨ Online Demo

🚧 Public Beta Notice

The Enterprise Edition is currently in its early public beta stage, and we are actively iterating and optimizing its features. We are aware that it may not be perfectly stable yet. If you encounter any issues or have valuable suggestions during your trial, we would be very grateful if you could contact us via email to provide feedback.

🔗 Cherry Studio Enterprise

Version Comparison

Feature	Community Edition	Enterprise Edition
Open Source	✅ Yes	⭕️ Partially released to customers
Cost	Free for Personal Use / Commercial License	Buyout / Subscription Fee
Admin Backend	—	● Centralized Model Access ● Employee Management ● Shared Knowledge Base ● Access Control ● Data Backup
Server	—	✅ Dedicated Private Deployment

Get the Enterprise Edition

We believe the Enterprise Edition will become your team's AI productivity engine. If you are interested in Cherry Studio Enterprise Edition and would like to learn more, request a quote, or schedule a demo, please feel free to contact us.

For Business Inquiries & Purchasing: 📧 bd@cherry-ai.com

🔗 Related Projects

one-api: LLM API management and distribution system supporting mainstream models like OpenAI, Azure, and Anthropic. Features a unified API interface, suitable for key management and secondary distribution.
ublacklist: Blocks specific sites from appearing in Google search results.
