feat: System (MacOS & Windows) OCR (#9572)

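* build: add macOS system OCR as an optional dependency

* refactor: move TesseractService

* feat(ocr): add MacOS Vision OCR support and improve type definitions

Add support for MacOS Vision OCR and refactor the OCR-related type definitions to improve maintainability. Add a PDF file metadata type in preparation for later features.

* refactor(types): rename isImageFile to isImageFileMetadata to describe its purpose more accurately

* refactor(ocr): update imports

* feat(ocr): implement the MacOS Vision OCR service and refactor the OCR base structure

Add MacOcrService to support MacOS Vision OCR
Create OcrBaseService as the base class for OCR services
Remove redundant fields from the MacOS OCR configuration

* fix(store): bump the persisted store version to 138 and add the MAC OCR provider

Add built-in OCR provider support and clear the translation input box

* chore: update the @cherrystudio/mac-system-ocr dependency to version 0.2.4

* feat(ocr): add macOS native OCR service support

Add the macOS native OCR service as a built-in OCR provider
Show a "not configurable" notice on the settings page
Add the related logo and translation text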

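* build: move @cherrystudio/mac-system-ocr from optional dependencies to regular dependencies

* fix(ocr): temporarily use the any type in place of the platform-specific dependency's type definitions

To avoid errors when the type-check CI runs on Linux, temporarily change the type of the MacOCR property from the platform-specific dependency's type definition to any

* refactor(build): move mac-system-ocr to optionalDependencies and update the vite config

Move @cherrystudio/mac-system-ocr from dependencies to optionalDependencies
Update the external configuration in electron.vite.config.ts to include this dependency

* feat(OCR settings): filter OCR provider options by platform

Add platform detection logic to hide the Mac built-in OCR provider option on non-Mac platforms

* feat(OCR): add an error notice for non-MacOS systems

Add an error notice for non-MacOS systems to the OCR image settings; an error tag is shown when a user tries to use the OCR feature on a non-Mac system

* feat(i18n): add OCR-related translations

Add translations for the OCR error messages and configuration items, including the non-MacOS notice and the "no configuration items" notice

* fix(MacOcrService): ignore type-check errors for the macOS-only module

Add a @ts-ignore comment to avoid type-check errors on non-macOS platforms; the module is only available on macOS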

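* build: add the @napi-rs/system-ocr dependency to support OCR

* chore: remove the unused mac-system-ocr dependency

* refactor(ocr): refactor MacOS OCR into cross-platform system OCR

Refactor the OCR service, extending the formerly MacOS-only OCR feature into a system OCR that supports Windows and MacOS
Update the related type definitions, configuration, and UI

* feat(hooks): add the ability to set the image OCR provider

* refactor(ocr): refactor the OCR provider logic and improve the code structure

- Merge the OCR provider utility functions and hooks into useOcrProvider
- Replace the mac provider with the system provider
- Improve error handling and UI presentation in the OCR settings screen
- Delete the no-longer-used ocr.ts utility file

* refactor(OCR settings): remove the redundant SettingGroup wrapper and improve the provider settings logic

Remove the redundant SettingGroup wrapper in OcrSettings and apply the theme style directly to the OcrProviderSettings component
Improve the OcrProviderSettings logic to return null directly for the system provider

* fix(i18n): remove the translation for non-configurable items in the OCR service and update the system OCR support notice

* fix(ocr): set the default OCR provider according to the system platform

Use system OCR as the default provider on Windows and Mac; other platforms continue to use Tesseract

* build: remove @cherrystudio/mac-system-ocr from the external dependencies

* fix(i18n): update the OCR-related translations

* fix(store): remove the clearing of the translation input from the migration config

* refactor(hooks): rename getOcrProviderLogo to OcrProviderLogo and turn it into a component

Refactor the getOcrProviderLogo function in useOcrProviders into the OcrProviderLogo component
Update the corresponding call sites in OcrProviderSettings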

* support jpg

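* refactor(ocr): refactor the OCR service base structure and support multi-language configuration

Refactor the OCR base service class, extracting the common interface into an abstract class
Add multi-language configuration support to the system OCR and Tesseract services

* refactor(ocr): refactor the OCR type definitions to improve maintainability

Split OcrProviderConfig into a base config type and implementation-specific config types
Restructure the types to distinguish the configurations of different OCR providers more clearly

* feat(components): add the ErrorTag error tag component

* refactor(ocr): replace the custom tag component with ErrorTag to simplify the code

* fix(ocr): ignore the language parameter on macOS

* feat(components): add a warning tag component for displaying warnings

* feat(ocr): add system OCR support and improve language configuration

- Add a system OCR settings component that supports Windows and MacOS
- Add language selection for system OCR; Windows requires language packs to be configured
- Create the SuccessTag component for displaying configuration status
- Unify the translation key names for OCR language settings
- Fix the display of system OCR on platforms other than Windows/Mac

* feat(i18n): add multi-language support for the OCR settings page

Add new translations for the OCR settings page, including the supported-languages list and the system OCR notices

* feat(ocr): support custom Tesseract OCR language selection

Add the Tesseract OCR language mapping configuration and dynamic language selection
Implement a multi-language selector in the settings UI so that users can customize the OCR languages
Update the related type definitions and tooltip text

* docs(i18n): add tooltip text about custom language support for Tesseract OCR

* fix(i18n): remove the temporary language-support notice from the OCR service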

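* fix(ocr): fix the OCR service not passing the provider config

* fix(ocr): fix the OCR service not passing the provider config

* fix(TesseractService): fix the worker not being explicitly disposed

* feat(drag): expose the setIsDragging method from the useDrag hook

Allow external components to control the drag state directly; used in TranslatePage to reset the drag state when handling file drops

* feat(i18n): update the input placeholder text to reflect OCR support

* fix(ocr): add error handling and logging to improve the Tesseract service

Add an error handler callback in TesseractService that catches and rethrows errors raised during worker creation
Also add debug logging to trace the language array and the worker creation process

* refactor(ocr): refactor OCR state management to reference the image provider by ID and add a selector

Change the imageProvider field to imageProviderId to simplify state management
Add the getImageProvider selector for retrieving the current image provider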

* update cn data

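* refactor(ocr): refactor OCR provider management to go through a single custom hook

- Move OCR provider state management from Redux into the custom hook useOcrProviders
- Fix the initialization of the default OCR provider
- Improve the OCR image recognition logic, using useCallback for better performance

* fix(ocr): fix the error handling in Tesseract worker initialization

Refactor the worker initialization flow to handle errors with a Promise instead of a global variable
Fix the issue with the empty language pack download URL for non-CN regions

* fix(ocr): fix url

* feat(OCR settings): add custom tag rendering to the Tesseract language selector

Add a CustomTag component to disable the default close action

* refactor(translate): improve the call order of the drag-and-drop upload hooks

Move the useDrag hook declaration closer to where it is used to improve readability

* perf(ocr): remove an unnecessary await to improve image preprocessing performance

* feat(translate): add text file type checking and improve file handling

Add a check for text file types on the translation page to avoid processing non-text files. Also improve the file handling flow, including error handling and loading-state management.

* feat(i18n): add translations for the file type check error

* docs(i18n): update the input placeholder text to describe the supported features more clearly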

---------

Co-authored-by: beyondkmp <beyondkmp@gmail.com>
Phantom 2025-08-28 15:28:27 +08:00 committed by GitHub
parent 168cc36410
commit f95b9cef77
GPG Key ID: B5690EEEBB952194
42 changed files with 1099 additions and 331 deletions

View File

@ -72,6 +72,7 @@
"dependencies": {
"@libsql/client": "0.14.0",
"@libsql/win32-x64-msvc": "^0.4.7",
"@napi-rs/system-ocr": "^1.0.2",
"@strongtz/win32-arm64-msvc": "^0.4.7",
"graceful-fs": "^4.2.11",
"jsdom": "26.1.0",

View File

@ -1,7 +1,8 @@
import { loggerService } from '@logger'
import { BuiltinOcrProviderIds, OcrHandler, OcrProvider, OcrResult, SupportedOcrFile } from '@types'
-import { tesseractService } from './tesseract/TesseractService'
+import { systemOcrService } from './builtin/SystemOcrService'
+import { tesseractService } from './builtin/TesseractService'
const logger = loggerService.withContext('OcrService')
@ -24,7 +25,7 @@ export class OcrService {
if (!handler) {
throw new Error(`Provider ${provider.id} is not registered`)
}
-return handler(file)
+return handler(file, provider.config)
}
}
@ -32,3 +33,4 @@ export const ocrService = new OcrService()
// Register built-in providers
ocrService.register(BuiltinOcrProviderIds.tesseract, tesseractService.ocr.bind(tesseractService))
ocrService.register(BuiltinOcrProviderIds.system, systemOcrService.ocr.bind(systemOcrService))
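
The change above forwards `provider.config` to the registered handler. A minimal, self-contained sketch of that dispatch pattern follows. The type shapes (`SketchFile`, `SketchProvider`, `SketchHandler`) and the class name `SketchOcrRegistry` are simplified stand-ins assumed for illustration, not the project's actual `@types` definitions.

```typescript
// Simplified stand-ins for the project's types (assumptions for illustration only).
type SketchFile = { path: string }
type SketchResult = { text: string }
type SketchConfig = Record<string, unknown>
type SketchProvider = { id: string; config?: SketchConfig }
type SketchHandler = (file: SketchFile, config?: SketchConfig) => Promise<SketchResult>

class SketchOcrRegistry {
  private readonly handlers = new Map<string, SketchHandler>()

  register(providerId: string, handler: SketchHandler): void {
    this.handlers.set(providerId, handler)
  }

  // Look up the handler by provider id and forward the provider's config to it.
  ocr(file: SketchFile, provider: SketchProvider): Promise<SketchResult> {
    const handler = this.handlers.get(provider.id)
    if (!handler) {
      throw new Error(`Provider ${provider.id} is not registered`)
    }
    return handler(file, provider.config)
  }
}

// Usage: the handler receives the config of the provider it was invoked for.
const sketchRegistry = new SketchOcrRegistry()
let receivedConfig: SketchConfig | undefined
sketchRegistry.register('system', async (file, config) => {
  receivedConfig = config
  return { text: `recognized ${file.path}` }
})
const pendingResult = sketchRegistry.ocr({ path: '/tmp/a.png' }, { id: 'system', config: { langs: ['en-us'] } })
```

Keeping the config on the provider object and passing it at call time means a handler does not need to read settings from global state.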

View File

@ -0,0 +1,5 @@
import { OcrHandler } from '@types'
export abstract class OcrBaseService {
abstract ocr: OcrHandler
}
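
`OcrBaseService` declares `ocr` as an abstract property rather than an abstract method, and the concrete services in this commit implement it as an arrow function. One consequence, sketched below with hypothetical names (`SketchBaseService`, `SketchEchoService`), is that the handler keeps its `this` binding even when it is passed around as a bare function reference.

```typescript
type SketchOcrHandler = (input: string) => string

abstract class SketchBaseService {
  // Abstract property: subclasses must provide a function value.
  abstract ocr: SketchOcrHandler
}

class SketchEchoService extends SketchBaseService {
  private readonly prefix = 'text:'

  // Arrow-function property: `this` is captured lexically when the instance is created.
  ocr: SketchOcrHandler = (input) => `${this.prefix}${input}`
}

const sketchEcho = new SketchEchoService()
// Detach the handler from its instance, as a registry that stores callbacks would.
const detachedHandler: SketchOcrHandler = sketchEcho.ocr
const detachedOutput = detachedHandler('hello')
```

With this shape, calling `.bind(...)` at registration time is harmless but should not be strictly required.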

View File

@ -0,0 +1,39 @@
import { isMac, isWin } from '@main/constant'
import { loadOcrImage } from '@main/utils/ocr'
import { OcrAccuracy, recognize } from '@napi-rs/system-ocr'
import {
ImageFileMetadata,
isImageFileMetadata as isImageFileMetadata,
OcrResult,
OcrSystemConfig,
SupportedOcrFile
} from '@types'
import { OcrBaseService } from './OcrBaseService'
// const logger = loggerService.withContext('SystemOcrService')
export class SystemOcrService extends OcrBaseService {
constructor() {
super()
if (!isWin && !isMac) {
throw new Error('System OCR is only supported on Windows and macOS')
}
}
private async ocrImage(file: ImageFileMetadata, options?: OcrSystemConfig): Promise<OcrResult> {
const buffer = await loadOcrImage(file)
const langs = isWin ? options?.langs : undefined
const result = await recognize(buffer, OcrAccuracy.Accurate, langs)
return { text: result.text }
}
public ocr = async (file: SupportedOcrFile, options?: OcrSystemConfig): Promise<OcrResult> => {
if (isImageFileMetadata(file)) {
return this.ocrImage(file, options)
} else {
throw new Error('Unsupported file type, currently only image files are supported')
}
}
}
export const systemOcrService = new SystemOcrService()
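
The service above forwards the configured languages only on Windows and rejects other platforms. The sketch below restates that rule as a pure function so it can be read in isolation. `resolveSystemOcrLangs`, `SketchPlatform`, and `SketchSystemConfig` are hypothetical names. Unlike the listing, where the platform check sits in the constructor of a module-level singleton and therefore appears to run at import time, this sketch performs the check at call time.

```typescript
type SketchPlatform = 'win32' | 'darwin' | 'linux'
type SketchSystemConfig = { langs?: string[] }

// Returns the language list to hand to the recognizer, or undefined to let the OS decide.
function resolveSystemOcrLangs(platform: SketchPlatform, config?: SketchSystemConfig): string[] | undefined {
  if (platform !== 'win32' && platform !== 'darwin') {
    throw new Error('System OCR is only supported on Windows and macOS')
  }
  // Mirrors `isWin ? options?.langs : undefined`: languages are forwarded on Windows only.
  return platform === 'win32' ? config?.langs : undefined
}

const winLangs = resolveSystemOcrLangs('win32', { langs: ['en-us', 'zh-cn'] })
const macLangs = resolveSystemOcrLangs('darwin', { langs: ['en-us'] })
```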

View File

@ -0,0 +1,115 @@
import { loggerService } from '@logger'
import { getIpCountry } from '@main/utils/ipService'
import { loadOcrImage } from '@main/utils/ocr'
import { MB } from '@shared/config/constant'
import { ImageFileMetadata, isImageFileMetadata, OcrResult, OcrTesseractConfig, SupportedOcrFile } from '@types'
import { app } from 'electron'
import fs from 'fs'
import { isEqual } from 'lodash'
import path from 'path'
import Tesseract, { createWorker, LanguageCode } from 'tesseract.js'
import { OcrBaseService } from './OcrBaseService'
const logger = loggerService.withContext('TesseractService')
// config
const MB_SIZE_THRESHOLD = 50
const defaultLangs = ['chi_sim', 'chi_tra', 'eng'] satisfies LanguageCode[]
enum TesseractLangsDownloadUrl {
CN = 'https://gitcode.com/beyondkmp/tessdata-best/releases/download/1.0.0/'
}
export class TesseractService extends OcrBaseService {
private worker: Tesseract.Worker | null = null
private previousLangs: OcrTesseractConfig['langs']
constructor() {
super()
this.previousLangs = {}
}
async getWorker(options?: OcrTesseractConfig): Promise<Tesseract.Worker> {
let langsArray: LanguageCode[]
if (options?.langs) {
// TODO: use type safe objectKeys
langsArray = Object.keys(options.langs) as LanguageCode[]
if (langsArray.length === 0) {
logger.warn('Empty langs option. Fallback to defaultLangs.')
langsArray = defaultLangs
}
} else {
langsArray = defaultLangs
}
logger.debug('langsArray', langsArray)
if (!this.worker || !isEqual(this.previousLangs, langsArray)) {
if (this.worker) {
await this.dispose()
}
logger.debug('use langsArray to create worker', langsArray)
const langPath = await this._getLangPath()
const cachePath = await this._getCacheDir()
const promise = new Promise<Tesseract.Worker>((resolve, reject) => {
createWorker(langsArray, undefined, {
langPath,
cachePath,
logger: (m) => logger.debug('From worker', m),
errorHandler: (e) => {
logger.error('Worker Error', e)
reject(e)
}
})
.then(resolve)
.catch(reject)
})
this.worker = await promise
}
return this.worker
}
private async imageOcr(file: ImageFileMetadata, options?: OcrTesseractConfig): Promise<OcrResult> {
const worker = await this.getWorker(options)
const stat = await fs.promises.stat(file.path)
if (stat.size > MB_SIZE_THRESHOLD * MB) {
throw new Error(`This image is too large (max ${MB_SIZE_THRESHOLD}MB)`)
}
const buffer = await loadOcrImage(file)
const result = await worker.recognize(buffer)
return { text: result.data.text }
}
public ocr = async (file: SupportedOcrFile, options?: OcrTesseractConfig): Promise<OcrResult> => {
if (!isImageFileMetadata(file)) {
throw new Error('Only image files are supported currently')
}
return this.imageOcr(file, options)
}
private async _getLangPath(): Promise<string> {
const country = await getIpCountry()
return country.toLowerCase() === 'cn' ? TesseractLangsDownloadUrl.CN : ''
}
private async _getCacheDir(): Promise<string> {
const cacheDir = path.join(app.getPath('userData'), 'tesseract')
// use access to check if the directory exists
if (
!(await fs.promises
.access(cacheDir, fs.constants.F_OK)
.then(() => true)
.catch(() => false))
) {
await fs.promises.mkdir(cacheDir, { recursive: true })
}
return cacheDir
}
async dispose(): Promise<void> {
if (this.worker) {
await this.worker.terminate()
this.worker = null
}
}
}
export const tesseractService = new TesseractService()
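
`getWorker` above rebuilds the Tesseract worker when the requested languages differ from the previous ones. In the listing, `previousLangs` starts as `{}` and is compared against an array, and no later assignment to it is visible in this diff, so the comparison may never report a match. The sketch below shows one way such a cache could remember its languages explicitly. `LangKeyedCache` and its callbacks are hypothetical names, and the factory is synchronous for brevity, whereas `createWorker` is asynchronous.

```typescript
// Hypothetical generic cache: holds one resource and rebuilds it when the language set changes.
class LangKeyedCache<T> {
  private resource: T | null = null
  private cachedLangs: string[] = []
  private readonly create: (langs: string[]) => T
  private readonly dispose: (resource: T) => void

  constructor(create: (langs: string[]) => T, dispose: (resource: T) => void) {
    this.create = create
    this.dispose = dispose
  }

  get(langs: string[]): T {
    // Sort a copy so that ['eng', 'chi_sim'] and ['chi_sim', 'eng'] count as the same set.
    const normalized = [...langs].sort()
    const unchanged =
      normalized.length === this.cachedLangs.length &&
      normalized.every((lang, i) => lang === this.cachedLangs[i])
    if (unchanged && this.resource !== null) {
      return this.resource
    }
    if (this.resource !== null) {
      this.dispose(this.resource)
    }
    this.resource = this.create(normalized)
    // Remember the languages so that the next call can detect "no change".
    this.cachedLangs = normalized
    return this.resource
  }
}

let createdCount = 0
let disposedCount = 0
const sketchCache = new LangKeyedCache<{ langs: string[] }>(
  (langs) => {
    createdCount += 1
    return { langs }
  },
  () => {
    disposedCount += 1
  }
)
const firstWorker = sketchCache.get(['eng', 'chi_sim'])
const sameWorker = sketchCache.get(['chi_sim', 'eng'])
const otherWorker = sketchCache.get(['jpn'])
```

The second call reuses the first resource because the language set is unchanged; the third call disposes it and creates a new one.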

View File

@ -1,82 +0,0 @@
import { loggerService } from '@logger'
import { getIpCountry } from '@main/utils/ipService'
import { loadOcrImage } from '@main/utils/ocr'
import { MB } from '@shared/config/constant'
import { ImageFileMetadata, isImageFile, OcrResult, SupportedOcrFile } from '@types'
import { app } from 'electron'
import fs from 'fs'
import path from 'path'
import Tesseract, { createWorker, LanguageCode } from 'tesseract.js'
const logger = loggerService.withContext('TesseractService')
// config
const MB_SIZE_THRESHOLD = 50
const tesseractLangs = ['chi_sim', 'chi_tra', 'eng'] satisfies LanguageCode[]
enum TesseractLangsDownloadUrl {
CN = 'https://gitcode.com/beyondkmp/tessdata/releases/download/4.1.0/',
GLOBAL = 'https://github.com/tesseract-ocr/tessdata/raw/main/'
}
export class TesseractService {
private worker: Tesseract.Worker | null = null
async getWorker(): Promise<Tesseract.Worker> {
if (!this.worker) {
// for now, only support limited languages
this.worker = await createWorker(tesseractLangs, undefined, {
langPath: await this._getLangPath(),
cachePath: await this._getCacheDir(),
gzip: false,
logger: (m) => logger.debug('From worker', m)
})
}
return this.worker
}
async imageOcr(file: ImageFileMetadata): Promise<OcrResult> {
const worker = await this.getWorker()
const stat = await fs.promises.stat(file.path)
if (stat.size > MB_SIZE_THRESHOLD * MB) {
throw new Error(`This image is too large (max ${MB_SIZE_THRESHOLD}MB)`)
}
const buffer = await loadOcrImage(file)
const result = await worker.recognize(buffer)
return { text: result.data.text }
}
async ocr(file: SupportedOcrFile): Promise<OcrResult> {
if (!isImageFile(file)) {
throw new Error('Only image files are supported currently')
}
return this.imageOcr(file)
}
private async _getLangPath(): Promise<string> {
const country = await getIpCountry()
return country.toLowerCase() === 'cn' ? TesseractLangsDownloadUrl.CN : TesseractLangsDownloadUrl.GLOBAL
}
private async _getCacheDir(): Promise<string> {
const cacheDir = path.join(app.getPath('userData'), 'tesseract')
// use access to check if the directory exists
if (
!(await fs.promises
.access(cacheDir, fs.constants.F_OK)
.then(() => true)
.catch(() => false))
) {
await fs.promises.mkdir(cacheDir, { recursive: true })
}
return cacheDir
}
async dispose(): Promise<void> {
if (this.worker) {
await this.worker.terminate()
this.worker = null
}
}
}
export const tesseractService = new TesseractService()

View File

@ -2,11 +2,12 @@ import { ImageFileMetadata } from '@types'
import { readFile } from 'fs/promises'
import sharp from 'sharp'
-const preprocessImage = async (buffer: Buffer) => {
-return await sharp(buffer)
+const preprocessImage = async (buffer: Buffer): Promise<Buffer> => {
+return sharp(buffer)
.grayscale() // convert to grayscale
.normalize()
.sharpen()
+.png({ quality: 100 })
.toBuffer()
}
@ -23,5 +24,5 @@ const preprocessImage = async (buffer: Buffer) => {
*/
export const loadOcrImage = async (file: ImageFileMetadata): Promise<Buffer> => {
const buffer = await readFile(file.path)
-return await preprocessImage(buffer)
+return preprocessImage(buffer)
}

View File

@ -0,0 +1,20 @@
import { Popover, PopoverProps } from 'antd'
import { Info } from 'lucide-react'
type InheritedPopoverProps = Omit<PopoverProps, 'children'>
interface InfoPopoverProps extends InheritedPopoverProps {
iconColor?: string
iconSize?: string | number
iconStyle?: React.CSSProperties
}
const InfoPopover = ({ iconColor = 'var(--color-text-3)', iconSize = 14, iconStyle, ...rest }: InfoPopoverProps) => {
return (
<Popover {...rest}>
<Info size={iconSize} color={iconColor} style={{ ...iconStyle }} role="img" aria-label="Information" />
</Popover>
)
}
export default InfoPopover

View File

@ -0,0 +1,16 @@
import { CircleXIcon } from 'lucide-react'
import CustomTag from './CustomTag'
type Props = {
iconSize?: number
message: string
}
export const ErrorTag = ({ iconSize: size = 14, message }: Props) => {
return (
<CustomTag icon={<CircleXIcon size={size} color="var(--color-status-error)" />} color="var(--color-status-error)">
{message}
</CustomTag>
)
}

View File

@ -0,0 +1,16 @@
import { CheckIcon } from 'lucide-react'
import CustomTag from './CustomTag'
type Props = {
iconSize?: number
message: string
}
export const SuccessTag = ({ iconSize: size = 14, message }: Props) => {
return (
<CustomTag icon={<CheckIcon size={size} color="var(--color-status-success)" />} color="var(--color-status-success)">
{message}
</CustomTag>
)
}

View File

@ -0,0 +1,18 @@
import { AlertTriangleIcon } from 'lucide-react'
import CustomTag from './CustomTag'
type Props = {
iconSize?: number
message: string
}
export const WarnTag = ({ iconSize: size = 14, message }: Props) => {
return (
<CustomTag
icon={<AlertTriangleIcon size={size} color="var(--color-status-warning)" />}
color="var(--color-status-warning)">
{message}
</CustomTag>
)
}

View File

@ -1,12 +1,16 @@
import {
BuiltinOcrProvider,
BuiltinOcrProviderId,
-ImageOcrProvider,
OcrProviderCapability,
-OcrTesseractProvider
+OcrSystemProvider,
+OcrTesseractProvider,
+TesseractLangCode,
+TranslateLanguageCode
} from '@renderer/types'
-const tesseract: BuiltinOcrProvider & ImageOcrProvider & OcrTesseractProvider = {
+import { isMac, isWin } from './constant'
+const tesseract: OcrTesseractProvider = {
id: 'tesseract',
name: 'Tesseract',
capabilities: {
@ -19,14 +23,132 @@ const tesseract: BuiltinOcrProvider & ImageOcrProvider & OcrTesseractProvider =
eng: true
}
}
-} as const satisfies OcrTesseractProvider
+} as const
const systemOcr: OcrSystemProvider = {
id: 'system',
name: 'System',
config: {
langs: isWin ? ['en-us'] : undefined
},
capabilities: {
image: true
// pdf: true
}
} as const satisfies OcrSystemProvider
export const BUILTIN_OCR_PROVIDERS_MAP = {
-tesseract
+tesseract,
+system: systemOcr
} as const satisfies Record<BuiltinOcrProviderId, BuiltinOcrProvider>
export const BUILTIN_OCR_PROVIDERS: BuiltinOcrProvider[] = Object.values(BUILTIN_OCR_PROVIDERS_MAP)
export const DEFAULT_OCR_PROVIDER = {
-image: tesseract
+image: isWin || isMac ? systemOcr : tesseract
} as const satisfies Record<OcrProviderCapability, BuiltinOcrProvider>
export const TESSERACT_LANG_MAP: Record<TranslateLanguageCode, TesseractLangCode> = {
'af-za': 'afr',
'am-et': 'amh',
'ar-sa': 'ara',
'as-in': 'asm',
'az-az': 'aze',
'az-cyrl-az': 'aze_cyrl',
'be-by': 'bel',
'bn-bd': 'ben',
'bo-cn': 'bod',
'bs-ba': 'bos',
'bg-bg': 'bul',
'ca-es': 'cat',
'ceb-ph': 'ceb',
'cs-cz': 'ces',
'zh-cn': 'chi_sim',
'zh-tw': 'chi_tra',
'chr-us': 'chr',
'cy-gb': 'cym',
'da-dk': 'dan',
'de-de': 'deu',
'dz-bt': 'dzo',
'el-gr': 'ell',
'en-us': 'eng',
'enm-gb': 'enm',
'eo-world': 'epo',
'et-ee': 'est',
'eu-es': 'eus',
'fa-ir': 'fas',
'fi-fi': 'fin',
'fr-fr': 'fra',
'frk-de': 'frk',
'frm-fr': 'frm',
'ga-ie': 'gle',
'gl-es': 'glg',
'grc-gr': 'grc',
'gu-in': 'guj',
'ht-ht': 'hat',
'he-il': 'heb',
'hi-in': 'hin',
'hr-hr': 'hrv',
'hu-hu': 'hun',
'iu-ca': 'iku',
'id-id': 'ind',
'is-is': 'isl',
'it-it': 'ita',
'ita-it': 'ita_old',
'jv-id': 'jav',
'ja-jp': 'jpn',
'kn-in': 'kan',
'ka-ge': 'kat',
'kat-ge': 'kat_old',
'kk-kz': 'kaz',
'km-kh': 'khm',
'ky-kg': 'kir',
'ko-kr': 'kor',
'ku-tr': 'kur',
'la-la': 'lao',
'la-va': 'lat',
'lv-lv': 'lav',
'lt-lt': 'lit',
'ml-in': 'mal',
'mr-in': 'mar',
'mk-mk': 'mkd',
'mt-mt': 'mlt',
'ms-my': 'msa',
'my-mm': 'mya',
'ne-np': 'nep',
'nl-nl': 'nld',
'no-no': 'nor',
'or-in': 'ori',
'pa-in': 'pan',
'pl-pl': 'pol',
'pt-pt': 'por',
'ps-af': 'pus',
'ro-ro': 'ron',
'ru-ru': 'rus',
'sa-in': 'san',
'si-lk': 'sin',
'sk-sk': 'slk',
'sl-si': 'slv',
'es-es': 'spa',
'spa-es': 'spa_old',
'sq-al': 'sqi',
'sr-rs': 'srp',
'sr-latn-rs': 'srp_latn',
'sw-tz': 'swa',
'sv-se': 'swe',
'syr-sy': 'syr',
'ta-in': 'tam',
'te-in': 'tel',
'tg-tj': 'tgk',
'tl-ph': 'tgl',
'th-th': 'tha',
'ti-er': 'tir',
'tr-tr': 'tur',
'ug-cn': 'uig',
'uk-ua': 'ukr',
'ur-pk': 'urd',
'uz-uz': 'uzb',
'uz-cyrl-uz': 'uzb_cyrl',
'vi-vn': 'vie',
'yi-us': 'yid'
}
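
`TESSERACT_LANG_MAP` translates UI language codes into Tesseract language codes, and the Tesseract provider config stores languages as a `{ code: true }` record. The sketch below shows how such a map could be applied. `toTesseractLangs` is a hypothetical helper, `SKETCH_LANG_MAP` is a three-entry excerpt typed loosely as `Record<string, string>`, and skipping unmapped codes is an assumption rather than the project's documented behavior.

```typescript
// A three-entry excerpt of the mapping above, typed loosely for this sketch.
const SKETCH_LANG_MAP: Record<string, string> = {
  'zh-cn': 'chi_sim',
  'en-us': 'eng',
  'ja-jp': 'jpn'
}

// Hypothetical helper: convert UI language codes into a `{ [tesseractCode]: true }` record,
// skipping codes that have no Tesseract equivalent in the map.
function toTesseractLangs(codes: string[], map: Record<string, string>): Record<string, boolean> {
  const langs: Record<string, boolean> = {}
  for (const code of codes) {
    const mapped = map[code.toLowerCase()]
    if (mapped !== undefined) {
      langs[mapped] = true
    }
  }
  return langs
}

const sketchLangs = toTesseractLangs(['zh-CN', 'en-us', 'xx-yy'], SKETCH_LANG_MAP)
const sketchLangCodes = Object.keys(sketchLangs)
```

`Object.keys` on the resulting record yields the language array that a worker factory would consume, in the same way `getWorker` derives `langsArray` from `options.langs`.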

View File

@ -39,5 +39,5 @@ export const useDrag = <T extends HTMLElement>(onDrop?: (e: React.DragEvent<T>)
[onDrop]
)
-return { isDragging, handleDragOver, handleDragEnter, handleDragLeave, handleDrop }
+return { isDragging, setIsDragging, handleDragOver, handleDragEnter, handleDragLeave, handleDrop }
}

View File

@ -1,16 +1,18 @@
import { loggerService } from '@logger'
import * as OcrService from '@renderer/services/ocr/OcrService'
-import { useAppSelector } from '@renderer/store'
-import { ImageFileMetadata, isImageFile, SupportedOcrFile } from '@renderer/types'
+import { ImageFileMetadata, isImageFileMetadata, SupportedOcrFile } from '@renderer/types'
import { uuid } from '@renderer/utils'
import { formatErrorMessage } from '@renderer/utils/error'
import { useCallback } from 'react'
import { useTranslation } from 'react-i18next'
import { useOcrProviders } from './useOcrProvider'
const logger = loggerService.withContext('useOcr')
export const useOcr = () => {
const { t } = useTranslation()
-const imageProvider = useAppSelector((state) => state.ocr.imageProvider)
+const { imageProvider } = useOcrProviders()
/**
* Run OCR on an image
@ -18,9 +20,13 @@ export const useOcr = () => {
* @returns A Promise that resolves to the OCR result
* @throws If OCR fails
*/
-const ocrImage = async (image: ImageFileMetadata) => {
-return OcrService.ocr(image, imageProvider)
-}
+const ocrImage = useCallback(
+async (image: ImageFileMetadata) => {
+logger.debug('ocrImage', { config: imageProvider.config })
+return OcrService.ocr(image, imageProvider)
+},
+[imageProvider]
+)
/**
* Run OCR.
@ -33,7 +39,7 @@ export const useOcr = () => {
window.message.loading({ content: t('ocr.processing'), key, duration: 0 })
// await to keep show loading message
try {
-if (isImageFile(file)) {
+if (isImageFileMetadata(file)) {
return await ocrImage(file)
} else {
// @ts-expect-error all types should be covered

View File

@ -1,84 +0,0 @@
import { loggerService } from '@logger'
import { BUILTIN_OCR_PROVIDERS_MAP } from '@renderer/config/ocr'
import { useAppSelector } from '@renderer/store'
import { addOcrProvider, removeOcrProvider, updateOcrProviderConfig } from '@renderer/store/ocr'
import { isBuiltinOcrProviderId, OcrProvider, OcrProviderConfig } from '@renderer/types'
import { useTranslation } from 'react-i18next'
import { useDispatch } from 'react-redux'
const logger = loggerService.withContext('useOcrProvider')
export const useOcrProviders = () => {
const providers = useAppSelector((state) => state.ocr.providers)
const dispatch = useDispatch()
const { t } = useTranslation()
/**
* Add an OCR provider
* @param provider - The OCR provider object, including its id and other configuration
* @throws {Error} If a provider with the same ID already exists
*/
const addProvider = (provider: OcrProvider) => {
if (providers.some((p) => p.id === provider.id)) {
const msg = `Provider with id ${provider.id} already exists`
logger.error(msg)
window.message.error(t('ocr.error.provider.existing'))
throw new Error(msg)
}
dispatch(addOcrProvider(provider))
}
/**
* Remove an OCR provider
* @param id - The OCR provider ID
* @throws {Error} If the provider is built-in
*/
const removeProvider = (id: string) => {
if (isBuiltinOcrProviderId(id)) {
const msg = `Cannot remove builtin provider ${id}`
logger.error(msg)
window.message.error(t('ocr.error.provider.cannot_remove_builtin'))
throw new Error(msg)
}
dispatch(removeOcrProvider(id))
}
return { providers, addProvider, removeProvider }
}
export const useOcrProvider = (id: string) => {
const { t } = useTranslation()
const dispatch = useDispatch()
const { providers, addProvider } = useOcrProviders()
let provider = providers.find((p) => p.id === id)
// safely fallback
if (!provider) {
logger.error(`Ocr Provider ${id} not found`)
window.message.error(t('ocr.error.provider.not_found'))
if (isBuiltinOcrProviderId(id)) {
try {
addProvider(BUILTIN_OCR_PROVIDERS_MAP[id])
} catch (e) {
logger.warn(`Add ${BUILTIN_OCR_PROVIDERS_MAP[id].name} failed. Just use temp provider from config.`)
window.message.warning(t('ocr.warning.provider.fallback', { name: BUILTIN_OCR_PROVIDERS_MAP[id].name }))
} finally {
provider = BUILTIN_OCR_PROVIDERS_MAP[id]
}
} else {
logger.warn(`Fallback to tesseract`)
window.message.warning(t('ocr.warning.provider.fallback', { name: 'Tesseract' }))
provider = BUILTIN_OCR_PROVIDERS_MAP.tesseract
}
}
const updateConfig = (update: Partial<OcrProviderConfig>) => {
dispatch(updateOcrProviderConfig({ id: provider.id, update }))
}
return {
provider,
updateConfig
}
}

View File

@ -0,0 +1,148 @@
import { loggerService } from '@logger'
import TesseractLogo from '@renderer/assets/images/providers/Tesseract.js.png'
import { BUILTIN_OCR_PROVIDERS_MAP, DEFAULT_OCR_PROVIDER } from '@renderer/config/ocr'
import { getBuiltinOcrProviderLabel } from '@renderer/i18n/label'
import { useAppSelector } from '@renderer/store'
import { addOcrProvider, removeOcrProvider, setImageOcrProviderId, updateOcrProviderConfig } from '@renderer/store/ocr'
import {
ImageOcrProvider,
isBuiltinOcrProvider,
isBuiltinOcrProviderId,
isImageOcrProvider,
OcrProvider,
OcrProviderConfig
} from '@renderer/types'
import { Avatar } from 'antd'
import { FileQuestionMarkIcon, MonitorIcon } from 'lucide-react'
import { useCallback, useEffect, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { useDispatch } from 'react-redux'
const logger = loggerService.withContext('useOcrProvider')
export const useOcrProviders = () => {
const providers = useAppSelector((state) => state.ocr.providers)
const imageProviders = providers.filter(isImageOcrProvider)
const imageProviderId = useAppSelector((state) => state.ocr.imageProviderId)
const [imageProvider, setImageProvider] = useState<ImageOcrProvider>(DEFAULT_OCR_PROVIDER.image)
const dispatch = useDispatch()
const { t } = useTranslation()
/**
* Add an OCR provider
* @param provider - The OCR provider object, including its id and other configuration
* @throws {Error} If a provider with the same ID already exists
*/
const addProvider = useCallback(
(provider: OcrProvider) => {
if (providers.some((p) => p.id === provider.id)) {
const msg = `Provider with id ${provider.id} already exists`
logger.error(msg)
window.message.error(t('ocr.error.provider.existing'))
throw new Error(msg)
}
dispatch(addOcrProvider(provider))
},
[dispatch, providers, t]
)
/**
* Remove an OCR provider
* @param id - The OCR provider ID
* @throws {Error} If the provider is built-in
*/
const removeProvider = (id: string) => {
if (isBuiltinOcrProviderId(id)) {
const msg = `Cannot remove builtin provider ${id}`
logger.error(msg)
window.message.error(t('ocr.error.provider.cannot_remove_builtin'))
throw new Error(msg)
}
dispatch(removeOcrProvider(id))
}
const setImageProviderId = useCallback(
(id: string) => {
dispatch(setImageOcrProviderId(id))
},
[dispatch]
)
const getOcrProviderName = (p: OcrProvider) => {
return isBuiltinOcrProvider(p) ? getBuiltinOcrProviderLabel(p.id) : p.name
}
const OcrProviderLogo = ({ provider: p, size = 14 }: { provider: OcrProvider; size?: number }) => {
if (isBuiltinOcrProvider(p)) {
switch (p.id) {
case 'tesseract':
return <Avatar size={size} src={TesseractLogo} />
case 'system':
return <MonitorIcon size={size} />
}
}
return <FileQuestionMarkIcon size={size} />
}
useEffect(() => {
const actualImageProvider = imageProviders.find((p) => p.id === imageProviderId)
if (!actualImageProvider) {
if (isBuiltinOcrProviderId(imageProviderId)) {
logger.warn(`Builtin ocr provider ${imageProviderId} not exist. Will add it to providers.`)
addProvider(BUILTIN_OCR_PROVIDERS_MAP[imageProviderId])
}
setImageProviderId(DEFAULT_OCR_PROVIDER.image.id)
setImageProvider(DEFAULT_OCR_PROVIDER.image)
} else {
setImageProviderId(actualImageProvider.id)
setImageProvider(actualImageProvider)
}
}, [addProvider, imageProviderId, imageProviders, setImageProviderId])
return {
providers,
imageProvider,
addProvider,
removeProvider,
setImageProviderId,
getOcrProviderName,
OcrProviderLogo
}
}
export const useOcrProvider = (id: string) => {
const { t } = useTranslation()
const dispatch = useDispatch()
const { providers, addProvider } = useOcrProviders()
let provider = providers.find((p) => p.id === id)
// safely fallback
if (!provider) {
logger.error(`Ocr Provider ${id} not found`)
window.message.error(t('ocr.error.provider.not_found'))
if (isBuiltinOcrProviderId(id)) {
try {
addProvider(BUILTIN_OCR_PROVIDERS_MAP[id])
} catch (e) {
logger.warn(`Add ${BUILTIN_OCR_PROVIDERS_MAP[id].name} failed. Just use temp provider from config.`)
window.message.warning(t('ocr.warning.provider.fallback', { name: BUILTIN_OCR_PROVIDERS_MAP[id].name }))
} finally {
provider = BUILTIN_OCR_PROVIDERS_MAP[id]
}
} else {
logger.warn(`Fallback to tesseract`)
window.message.warning(t('ocr.warning.provider.fallback', { name: 'Tesseract' }))
provider = BUILTIN_OCR_PROVIDERS_MAP.tesseract
}
}
const updateConfig = (update: Partial<OcrProviderConfig>) => {
dispatch(updateOcrProviderConfig({ id: provider.id, update }))
}
return {
provider,
updateConfig
}
}
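
The `useEffect` in `useOcrProviders` resolves the stored `imageProviderId` to a provider object and falls back to the platform default when no match exists. The lookup itself can be sketched as two pure functions, shown below. `resolveImageProvider`, `defaultImageProvider`, and `SketchOcrProvider` are hypothetical names, and the platform strings are assumed to follow Node's `process.platform` values.

```typescript
type SketchOcrProvider = { id: string; name: string }

// Pure version of the lookup: return the stored provider with the given id, else the fallback.
function resolveImageProvider(
  providers: SketchOcrProvider[],
  selectedId: string,
  fallback: SketchOcrProvider
): SketchOcrProvider {
  return providers.find((p) => p.id === selectedId) ?? fallback
}

// Platform-dependent default, mirroring `isWin || isMac ? systemOcr : tesseract`.
function defaultImageProvider(platform: string): SketchOcrProvider {
  const system: SketchOcrProvider = { id: 'system', name: 'System' }
  const tesseract: SketchOcrProvider = { id: 'tesseract', name: 'Tesseract' }
  return platform === 'win32' || platform === 'darwin' ? system : tesseract
}

const storedProviders: SketchOcrProvider[] = [{ id: 'tesseract', name: 'Tesseract' }]
const resolvedExisting = resolveImageProvider(storedProviders, 'tesseract', defaultImageProvider('linux'))
const resolvedMissing = resolveImageProvider(storedProviders, 'unknown', defaultImageProvider('darwin'))
```

Keeping the resolution pure makes the fallback rule easy to test without rendering a component or touching the store.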

View File

@ -5,8 +5,7 @@
*/
import { loggerService } from '@logger'
-import { BuiltinMCPServerName, BuiltinMCPServerNames } from '@renderer/types'
-import { ThinkingOption } from '@renderer/types'
+import { BuiltinMCPServerName, BuiltinMCPServerNames, BuiltinOcrProviderId, ThinkingOption } from '@renderer/types'
import i18n from './index'
@ -322,3 +321,13 @@ const builtInMcpDescriptionKeyMap: Record<BuiltinMCPServerName, string> = {
export const getBuiltInMcpServerDescriptionLabel = (key: string): string => {
return getLabel(key, builtInMcpDescriptionKeyMap, t('settings.mcp.builtinServersDescriptions.no'))
}
const builtinOcrProviderKeyMap = {
system: 'ocr.builtin.system',
tesseract: ''
} as const satisfies Record<BuiltinOcrProviderId, string>
export const getBuiltinOcrProviderLabel = (key: BuiltinOcrProviderId) => {
if (key === 'tesseract') return 'Tesseract'
else return getLabel(key, builtinOcrProviderKeyMap)
}
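
`getBuiltinOcrProviderLabel` returns the proper noun "Tesseract" untranslated and looks up every other built-in provider through an i18n key. A self-contained sketch of that rule follows. `builtinProviderLabel`, `sketchTranslate`, and `sketchMessages` are hypothetical stand-ins for the project's `getLabel` helper and i18n instance.

```typescript
type SketchBuiltinId = 'system' | 'tesseract'

// i18n keys per built-in provider; Tesseract is a proper noun and has no key.
const sketchLabelKeys: Record<SketchBuiltinId, string> = {
  system: 'ocr.builtin.system',
  tesseract: ''
}

// `translate` is a hypothetical stand-in for the i18n lookup function.
function builtinProviderLabel(id: SketchBuiltinId, translate: (key: string) => string): string {
  if (id === 'tesseract') return 'Tesseract'
  return translate(sketchLabelKeys[id])
}

const sketchMessages: Record<string, string> = { 'ocr.builtin.system': 'System OCR' }
const sketchTranslate = (key: string): string => sketchMessages[key] ?? key
const systemLabel = builtinProviderLabel('system', sketchTranslate)
const tesseractLabel = builtinProviderLabel('tesseract', sketchTranslate)
```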

View File

@ -1587,6 +1587,9 @@
"tip": "If the response is successful, then only messages exceeding 30 seconds will trigger a reminder"
},
"ocr": {
"builtin": {
"system": "System OCR"
},
"error": {
"provider": {
"cannot_remove_builtin": "Cannot delete built-in provider",
@ -3531,17 +3534,30 @@
"title": "Settings",
"tool": {
"ocr": {
"common": {
"langs": "Supported languages"
},
"error": {
"not_system": "System OCR only supports Windows and MacOS"
},
"image": {
"error": {
"provider_not_found": "The provider does not exist"
},
-"tesseract": {
-"langs": "Supported languages",
-"temp_tooltip": "Currently only Chinese and English are supported"
+"system": {
+"no_need_configure": "MacOS requires no configuration"
},
"title": "Image"
},
"image_provider": "OCR service provider",
"system": {
"win": {
"langs_tooltip": "Dependent on Windows to provide services, you need to download language packs in the system to support the relevant languages."
}
},
"tesseract": {
"langs_tooltip": "Read the documentation to learn which custom languages are supported"
},
"title": "OCR service"
},
"preprocess": {
@ -3787,6 +3803,7 @@
"files": {
"drag_text": "Drop here",
"error": {
"check_type": "An error occurred while checking the file type",
"multiple": "Multiple file uploads are not allowed",
"too_large": "File too large",
"unknown": "Failed to read file content"
@ -3811,7 +3828,7 @@
"aborted": "Translation aborted"
},
"input": {
-"placeholder": "Enter text to translate"
+"placeholder": "Text, files, or images (OCR supported) can be pasted or dragged in"
},
"language": {
"not_pair": "Source language is different from the set language",

View File

@ -1587,6 +1587,9 @@
"tip": "応答が成功した場合、30秒を超えるメッセージのみに通知を行います"
},
"ocr": {
"builtin": {
"system": "システム OCR"
},
"error": {
"provider": {
"cannot_remove_builtin": "組み込みプロバイダーは削除できません",
@ -3531,17 +3534,30 @@
"title": "設定",
"tool": {
"ocr": {
"common": {
"langs": "サポートされている言語"
},
"error": {
"not_system": "システムOCRはWindowsとMacOSのみをサポートしています"
},
"image": {
"error": {
"provider_not_found": "該提供者は存在しません"
},
-"tesseract": {
-"langs": "サポートされている言語",
-"temp_tooltip": "現在のところ、中国語と英語のみをサポートしています"
+"system": {
+"no_need_configure": "MacOS は設定不要"
},
"title": "画像"
},
"image_provider": "OCRサービスプロバイダー",
"system": {
"win": {
"langs_tooltip": "Windows が提供するサービスに依存しており、関連する言語をサポートするには、システムで言語パックをダウンロードする必要があります。"
}
},
"tesseract": {
"langs_tooltip": "ドキュメントを読んで、どのカスタム言語がサポートされているかを確認してください。"
},
"title": "OCRサービス"
},
"preprocess": {
@ -3787,6 +3803,7 @@
"files": {
"drag_text": "ここにドラッグ&ドロップしてください",
"error": {
"check_type": "ファイルタイプの確認中にエラーが発生しました",
"multiple": "複数のファイルのアップロードは許可されていません",
"too_large": "ファイルが大きすぎます",
"unknown": "ファイルの内容を読み取るのに失敗しました"
@ -3811,7 +3828,7 @@
"aborted": "翻訳中止"
},
"input": {
-"placeholder": "翻訳するテキストを入力"
+"placeholder": "テキスト、ファイル、画像OCR対応を貼り付けたりドラッグアンドドロップしたりできます"
},
"language": {
"not_pair": "ソース言語が設定された言語と異なります",

View File

@ -1587,6 +1587,9 @@
"tip": "Если ответ успешен, уведомление выдается только по сообщениям, превышающим 30 секунд"
},
"ocr": {
"builtin": {
"system": "Системное распознавание текста"
},
"error": {
"provider": {
"cannot_remove_builtin": "Не удается удалить встроенного поставщика",
@ -3531,17 +3534,30 @@
"title": "Настройки",
"tool": {
"ocr": {
"common": {
"langs": "Поддерживаемые языки"
},
"error": {
"not_system": "Системный OCR поддерживается только в Windows и MacOS"
},
"image": {
"error": {
"provider_not_found": "Поставщик не существует"
},
-"tesseract": {
-"langs": "Поддерживаемые языки",
-"temp_tooltip": "На данный момент поддерживаются только китайский и английский языки"
+"system": {
+"no_need_configure": "MacOS не требует настройки"
},
"title": "Изображение"
},
"image_provider": "Поставщик услуг OCR",
"system": {
"win": {
"langs_tooltip": "Для предоставления служб Windows необходимо загрузить языковой пакет в системе для поддержки соответствующего языка."
}
},
"tesseract": {
"langs_tooltip": "Ознакомьтесь с документацией, чтобы узнать, какие пользовательские языки поддерживаются"
},
"title": "OCR-сервис"
},
"preprocess": {
@ -3787,6 +3803,7 @@
"files": {
"drag_text": "Перетащите сюда",
"error": {
"check_type": "Ошибка при проверке типа файла",
"multiple": "Не разрешается загружать несколько файлов",
"too_large": "Файл слишком большой",
"unknown": "Ошибка при чтении содержимого файла"
@ -3811,7 +3828,7 @@
"aborted": "Перевод прерван"
},
"input": {
"placeholder": "Введите текст для перевода"
"placeholder": "Можно вставить или перетащить текст, файлы, изображения (поддержка OCR)"
},
"language": {
"not_pair": "Исходный язык отличается от настроенного",

@ -1587,6 +1587,9 @@
"tip": "如果响应成功则只针对超过30秒的消息进行提醒"
},
"ocr": {
"builtin": {
"system": "系统 OCR"
},
"error": {
"provider": {
"cannot_remove_builtin": "不能删除内置提供商",
@ -3531,17 +3534,30 @@
"title": "设置",
"tool": {
"ocr": {
"common": {
"langs": "支持的语言"
},
"error": {
"not_system": "系统 OCR 仅支持 Windows 与 MacOS"
},
"image": {
"error": {
"provider_not_found": "该提供商不存在"
},
"tesseract": {
"langs": "支持的语言",
"temp_tooltip": "目前暂时只支持中文和英文"
"system": {
"no_need_configure": "MacOS 无需配置"
},
"title": "图片"
},
"image_provider": "OCR 服务提供商",
"system": {
"win": {
"langs_tooltip": "依赖 Windows 提供服务,您需要在系统中下载语言包来支持相关语言。"
}
},
"tesseract": {
"langs_tooltip": "阅读文档以了解哪些自定义语言是受支持的"
},
"title": "OCR 服务"
},
"preprocess": {
@ -3787,6 +3803,7 @@
"files": {
"drag_text": "拖放到此处",
"error": {
"check_type": "检查文件类型时发生错误",
"multiple": "不允许上传多个文件",
"too_large": "文件过大",
"unknown": "读取文件内容失败"
@ -3811,7 +3828,7 @@
"aborted": "翻译中止"
},
"input": {
"placeholder": "输入文本进行翻译"
"placeholder": "可粘贴或拖入文本、文件、图片支持OCR"
},
"language": {
"not_pair": "源语言与设置的语言不同",

@ -1587,6 +1587,9 @@
"tip": "如果回應成功則只針對超過30秒的訊息發出提醒"
},
"ocr": {
"builtin": {
"system": "系統 OCR"
},
"error": {
"provider": {
"cannot_remove_builtin": "不能刪除內建提供者",
@ -3531,17 +3534,30 @@
"title": "設定",
"tool": {
"ocr": {
"common": {
"langs": "支援的語言"
},
"error": {
"not_system": "系統 OCR 僅支援 Windows 與 MacOS"
},
"image": {
"error": {
"provider_not_found": "該提供商不存在"
},
"tesseract": {
"langs": "支援的語言",
"temp_tooltip": "目前暫時只支援中文和英文"
"system": {
"no_need_configure": "MacOS 無需配置"
},
"title": "圖片"
},
"image_provider": "OCR 服務提供商",
"system": {
"win": {
"langs_tooltip": "依賴 Windows 提供服務,您需要在系統中下載語言包來支援相關語言。"
}
},
"tesseract": {
"langs_tooltip": "閱讀文件以了解哪些自訂語言受支援"
},
"title": "OCR 服務"
},
"preprocess": {
@ -3787,6 +3803,7 @@
"files": {
"drag_text": "拖放到此处",
"error": {
"check_type": "檢查檔案類型時發生錯誤",
"multiple": "不允许上传多个文件",
"too_large": "文件過大",
"unknown": "读取文件内容失败"
@ -3811,7 +3828,7 @@
"aborted": "翻譯中止"
},
"input": {
"placeholder": "輸入文字進行翻譯"
"placeholder": "可粘貼或拖入文字、檔案、圖片支援OCR"
},
"language": {
"not_pair": "源語言與設定的語言不同",

@ -1587,6 +1587,9 @@
"tip": "Εάν η απάντηση είναι επιτυχής, η ειδοποίηση εμφανίζεται μόνο για μηνύματα που υπερβαίνουν τα 30 δευτερόλεπτα"
},
"ocr": {
"builtin": {
"system": "σύστημα OCR"
},
"error": {
"provider": {
"cannot_remove_builtin": "Δεν είναι δυνατή η διαγραφή του ενσωματωμένου παρόχου",
@ -2727,6 +2730,7 @@
"title": "Αυτόματη ενημέρωση"
},
"avatar": {
"builtin": "Ενσωματωμένο προφίλ",
"reset": "Επαναφορά εικονιδίου"
},
"backup": {
@ -3530,17 +3534,30 @@
"title": "Ρυθμίσεις",
"tool": {
"ocr": {
"common": {
"langs": "Υποστηριζόμενες γλώσσες"
},
"error": {
"not_system": "Το σύστημα OCR υποστηρίζει μόνο Windows και MacOS"
},
"image": {
"error": {
"provider_not_found": "Ο πάροχος δεν υπάρχει"
},
"tesseract": {
"langs": "Υποστηριζόμενες γλώσσες",
"temp_tooltip": "Προς το παρόν υποστηρίζονται μόνο η κινεζική και η αγγλική γλώσσα"
"system": {
"no_need_configure": "MacOS δεν απαιτεί ρύθμιση"
},
"title": "Εικόνα"
},
"image_provider": "Πάροχοι υπηρεσιών OCR",
"system": {
"win": {
"langs_tooltip": "Εξαρτάται από τα Windows για την παροχή υπηρεσιών, πρέπει να κατεβάσετε το πακέτο γλώσσας στο σύστημα για να υποστηρίξετε τις σχετικές γλώσσες."
}
},
"tesseract": {
"langs_tooltip": "Διαβάστε την τεκμηρίωση για να μάθετε ποιες προσαρμοσμένες γλώσσες υποστηρίζονται"
},
"title": "Υπηρεσία OCR"
},
"preprocess": {
@ -3786,6 +3803,7 @@
"files": {
"drag_text": "Σύρετε και αφήστε εδώ",
"error": {
"check_type": "Παρουσιάστηκε σφάλμα κατά τον έλεγχο του τύπου αρχείου",
"multiple": "Δεν επιτρέπεται η μεταφόρτωση πολλαπλών αρχείων",
"too_large": "Το αρχείο είναι πολύ μεγάλο",
"unknown": "Αποτυχία ανάγνωσης του περιεχομένου του αρχείου"
@ -3810,7 +3828,7 @@
"aborted": "Η μετάφραση διακόπηκε"
},
"input": {
"placeholder": "Εισαγάγετε κείμενο για μετάφραση"
"placeholder": "Μπορείτε να επικολλήσετε ή να σύρετε κείμενο, αρχεία, εικόνες (με υποστήριξη OCR)"
},
"language": {
"not_pair": "Η γλώσσα πηγής διαφέρει από την οριζόμενη γλώσσα",

@ -1587,6 +1587,9 @@
"tip": "Si la respuesta es exitosa, solo se enviará un recordatorio para mensajes que excedan los 30 segundos"
},
"ocr": {
"builtin": {
"system": "OCR del sistema"
},
"error": {
"provider": {
"cannot_remove_builtin": "No se puede eliminar el proveedor integrado",
@ -2727,6 +2730,7 @@
"title": "Actualización automática"
},
"avatar": {
"builtin": "Avatares integrados",
"reset": "Restablecer avatar"
},
"backup": {
@ -3530,17 +3534,30 @@
"title": "Configuración",
"tool": {
"ocr": {
"common": {
"langs": "Idiomas compatibles"
},
"error": {
"not_system": "El OCR del sistema solo admite Windows y MacOS"
},
"image": {
"error": {
"provider_not_found": "El proveedor no existe"
},
"tesseract": {
"langs": "Idiomas compatibles",
"temp_tooltip": "Actualmente solo se admiten chino e inglés."
"system": {
"no_need_configure": "MacOS no requiere configuración"
},
"title": "Imagen"
},
"image_provider": "Proveedor de servicios OCR",
"system": {
"win": {
"langs_tooltip": "Dependiendo de Windows para proporcionar servicios, necesita descargar el paquete de idioma en el sistema para admitir los idiomas correspondientes."
}
},
"tesseract": {
"langs_tooltip": "Lea la documentación para conocer qué idiomas personalizados son compatibles"
},
"title": "Servicio OCR"
},
"preprocess": {
@ -3786,6 +3803,7 @@
"files": {
"drag_text": "Arrastrar y soltar aquí",
"error": {
"check_type": "Se produjo un error al verificar el tipo de archivo",
"multiple": "No se permite cargar varios archivos",
"too_large": "El archivo es demasiado grande",
"unknown": "Error al leer el contenido del archivo"
@ -3810,7 +3828,7 @@
"aborted": "Traducción cancelada"
},
"input": {
"placeholder": "Ingrese el texto para traducir"
"placeholder": "Se puede pegar o arrastrar texto, archivos e imágenes (compatible con OCR)"
},
"language": {
"not_pair": "El idioma de origen es diferente al idioma configurado",

@ -1587,6 +1587,9 @@
"tip": "Si la réponse est réussie, un rappel est envoyé uniquement pour les messages dépassant 30 secondes"
},
"ocr": {
"builtin": {
"system": "OCR système"
},
"error": {
"provider": {
"cannot_remove_builtin": "Impossible de supprimer le fournisseur intégré",
@ -2727,6 +2730,7 @@
"title": "Mise à jour automatique"
},
"avatar": {
"builtin": "Avatar intégré",
"reset": "Réinitialiser l'avatar"
},
"backup": {
@ -3530,17 +3534,30 @@
"title": "Paramètres",
"tool": {
"ocr": {
"common": {
"langs": "Langues prises en charge"
},
"error": {
"not_system": "L'OCR système prend uniquement en charge Windows et MacOS"
},
"image": {
"error": {
"provider_not_found": "Ce fournisseur n'existe pas"
},
"tesseract": {
"langs": "Langues prises en charge",
"temp_tooltip": "Pour le moment, seuls le chinois et l'anglais sont pris en charge."
"system": {
"no_need_configure": "MacOS ne nécessite aucune configuration"
},
"title": "Image"
},
"image_provider": "Fournisseur de service OCR",
"system": {
"win": {
"langs_tooltip": "Dépendre de Windows pour fournir des services, vous devez télécharger des packs linguistiques dans le système afin de prendre en charge les langues concernées."
}
},
"tesseract": {
"langs_tooltip": "Lisez la documentation pour connaître les langues personnalisées prises en charge"
},
"title": "Service OCR"
},
"preprocess": {
@ -3786,6 +3803,7 @@
"files": {
"drag_text": "Glisser-déposer ici",
"error": {
"check_type": "Une erreur s'est produite lors de la vérification du type de fichier",
"multiple": "Impossible de téléverser plusieurs fichiers",
"too_large": "Fichier trop volumineux",
"unknown": "Échec de la lecture du contenu du fichier"
@ -3810,7 +3828,7 @@
"aborted": "Traduction annulée"
},
"input": {
"placeholder": "entrez le texte à traduire"
"placeholder": "Peut coller ou glisser du texte, des fichiers, des images (avec reconnaissance optique de caractères)"
},
"language": {
"not_pair": "La langue source est différente de la langue définie",

@ -1587,6 +1587,9 @@
"tip": "Se a resposta for bem-sucedida, lembrete apenas para mensagens que excedam 30 segundos"
},
"ocr": {
"builtin": {
"system": "OCR do sistema"
},
"error": {
"provider": {
"cannot_remove_builtin": "Não é possível excluir o provedor integrado",
@ -2727,6 +2730,7 @@
"title": "Atualização automática"
},
"avatar": {
"builtin": "Avatares embutidos",
"reset": "Redefinir avatar"
},
"backup": {
@ -3530,17 +3534,30 @@
"title": "Configurações",
"tool": {
"ocr": {
"common": {
"langs": "Idiomas suportados"
},
"error": {
"not_system": "O OCR do sistema suporta apenas Windows e MacOS"
},
"image": {
"error": {
"provider_not_found": "O provedor não existe"
},
"tesseract": {
"langs": "Idiomas suportados",
"temp_tooltip": "No momento, apenas chinês e inglês são suportados."
"system": {
"no_need_configure": "MacOS não requer configuração"
},
"title": "Imagem"
},
"image_provider": "Provedor de serviços OCR",
"system": {
"win": {
"langs_tooltip": "Dependendo do Windows para fornecer serviços, você precisa baixar pacotes de idiomas no sistema para dar suporte aos idiomas relevantes."
}
},
"tesseract": {
"langs_tooltip": "Leia a documentação para saber quais idiomas personalizados são suportados"
},
"title": "Serviço OCR"
},
"preprocess": {
@ -3786,6 +3803,7 @@
"files": {
"drag_text": "Arraste e solte aqui",
"error": {
"check_type": "Ocorreu um erro ao verificar o tipo de arquivo",
"multiple": "Não é permitido fazer upload de vários arquivos",
"too_large": "Arquivo muito grande",
"unknown": "Falha ao ler o conteúdo do arquivo"
@ -3810,7 +3828,7 @@
"aborted": "Tradução interrompida"
},
"input": {
"placeholder": "Digite o texto para traduzir"
"placeholder": "Pode colar ou arrastar e soltar texto, arquivos e imagens (suporte a OCR)"
},
"language": {
"not_pair": "O idioma de origem é diferente do idioma definido",

@ -1,11 +1,11 @@
import { loggerService } from '@logger'
import { useAppSelector } from '@renderer/store'
import { setImageOcrProvider } from '@renderer/store/ocr'
import { isImageOcrProvider, OcrProvider } from '@renderer/types'
import { ErrorTag } from '@renderer/components/Tags/ErrorTag'
import { isMac, isWin } from '@renderer/config/constant'
import { useOcrProviders } from '@renderer/hooks/useOcrProvider'
import { BuiltinOcrProviderIds, ImageOcrProvider, isImageOcrProvider, OcrProvider } from '@renderer/types'
import { Select } from 'antd'
import { useEffect } from 'react'
import { useEffect, useMemo } from 'react'
import { useTranslation } from 'react-i18next'
import { useDispatch } from 'react-redux'
import { SettingRow, SettingRowTitle } from '..'
@ -17,17 +17,16 @@ type Props = {
const OcrImageSettings = ({ setProvider }: Props) => {
const { t } = useTranslation()
const providers = useAppSelector((state) => state.ocr.providers)
const imageProvider = useAppSelector((state) => state.ocr.imageProvider)
const { providers, imageProvider, getOcrProviderName, setImageProviderId } = useOcrProviders()
const imageProviders = providers.filter((p) => isImageOcrProvider(p))
const dispatch = useDispatch()
// Update the external state on mount
useEffect(() => {
setProvider(imageProvider)
}, [imageProvider, setProvider])
const updateImageProvider = (id: string) => {
const setImageProvider = (id: string) => {
const provider = imageProviders.find((p) => p.id === id)
if (!provider) {
logger.error(`Failed to find image provider by id: ${id}`)
@ -36,22 +35,29 @@ const OcrImageSettings = ({ setProvider }: Props) => {
}
setProvider(provider)
dispatch(setImageOcrProvider(provider))
setImageProviderId(id)
}
const platformSupport = isMac || isWin
const options = useMemo(() => {
const platformFilter = platformSupport ? () => true : (p: ImageOcrProvider) => p.id !== BuiltinOcrProviderIds.system
return imageProviders.filter(platformFilter).map((p) => ({
value: p.id,
label: getOcrProviderName(p)
}))
}, [getOcrProviderName, imageProviders, platformSupport])
return (
<>
<SettingRow>
<SettingRowTitle>{t('settings.tool.ocr.image_provider')}</SettingRowTitle>
<div style={{ display: 'flex', gap: '8px' }}>
<div style={{ display: 'flex', gap: '8px', alignItems: 'center' }}>
{!platformSupport && <ErrorTag message={t('settings.tool.ocr.error.not_system')} />}
<Select
value={imageProvider.id}
style={{ width: '200px' }}
onChange={(id: string) => updateImageProvider(id)}
options={imageProviders.map((p) => ({
value: p.id,
label: p.name
}))}
onChange={(id: string) => setImageProvider(id)}
options={options}
/>
</div>
</SettingRow>
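
Review note: the option filtering in this hunk can be read in isolation as a pure function. The following is a minimal sketch, not the PR's code. The `Provider` and `Option` shapes, the `SYSTEM_ID` constant, and the `buildOptions` helper are hypothetical stand-ins; the real component uses `useMemo`, `getOcrProviderName`, and the `isMac`/`isWin` constants.

```typescript
// Hypothetical, simplified stand-ins for the PR's types; not the real definitions.
type Provider = { id: string; name: string }
type Option = { value: string; label: string }

// Assumed to mirror BuiltinOcrProviderIds.system from the diff.
const SYSTEM_ID = 'system'

// Drop the system provider when the platform has no system OCR,
// then map the remaining providers to Select options.
function buildOptions(providers: Provider[], platformSupport: boolean): Option[] {
  return providers
    .filter((p) => platformSupport || p.id !== SYSTEM_ID)
    .map((p) => ({ value: p.id, label: p.name }))
}

const providers: Provider[] = [
  { id: 'tesseract', name: 'Tesseract' },
  { id: 'system', name: 'System OCR' }
]

// With platform support both providers are offered; without it only Tesseract is.
console.log(buildOptions(providers, true).map((o) => o.value))
console.log(buildOptions(providers, false).map((o) => o.value))
```

Keeping the filter as a single predicate avoids allocating a separate `() => true` function for the supported case, though the behavior should be the same as in the hunk above.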

@ -1,11 +1,14 @@
// import { loggerService } from '@logger'
import { ErrorBoundary } from '@renderer/components/ErrorBoundary'
import { isBuiltinOcrProvider, OcrProvider } from '@renderer/types'
import { getOcrProviderLogo } from '@renderer/utils/ocr'
import { Avatar, Divider, Flex } from 'antd'
import { isMac, isWin } from '@renderer/config/constant'
import { useTheme } from '@renderer/context/ThemeProvider'
import { useOcrProviders } from '@renderer/hooks/useOcrProvider'
import { isBuiltinOcrProvider, isOcrSystemProvider, OcrProvider } from '@renderer/types'
import { Divider, Flex } from 'antd'
import styled from 'styled-components'
import { SettingTitle } from '..'
import { SettingGroup, SettingTitle } from '..'
import { OcrSystemSettings } from './OcrSystemSettings'
import { OcrTesseractSettings } from './OcrTesseractSettings'
// const logger = loggerService.withContext('OcrTesseractSettings')
@ -15,12 +18,22 @@ type Props = {
}
const OcrProviderSettings = ({ provider }: Props) => {
// const { t } = useTranslation()
const getProviderSettings = () => {
const { theme: themeMode } = useTheme()
const { OcrProviderLogo, getOcrProviderName } = useOcrProviders()
if (!isWin && !isMac && isOcrSystemProvider(provider)) {
return null
}
const ProviderSettings = () => {
if (isBuiltinOcrProvider(provider)) {
switch (provider.id) {
case 'tesseract':
return <OcrTesseractSettings />
case 'system':
return <OcrSystemSettings />
default:
return null
}
} else {
throw new Error('Not supported OCR provider')
@ -28,16 +41,18 @@ const OcrProviderSettings = ({ provider }: Props) => {
}
return (
<>
<SettingGroup theme={themeMode}>
<SettingTitle>
<Flex align="center" gap={8}>
<ProviderLogo shape="square" src={getOcrProviderLogo(provider.id)} size={16} />
<ProviderName> {provider.name}</ProviderName>
<OcrProviderLogo provider={provider} />
<ProviderName> {getOcrProviderName(provider)}</ProviderName>
</Flex>
</SettingTitle>
<Divider style={{ width: '100%', margin: '10px 0' }} />
<ErrorBoundary>{getProviderSettings()}</ErrorBoundary>
</>
<ErrorBoundary>
<ProviderSettings />
</ErrorBoundary>
</SettingGroup>
)
}
@ -45,8 +60,5 @@ const ProviderName = styled.span`
font-size: 14px;
font-weight: 500;
`
const ProviderLogo = styled(Avatar)`
border: 0.5px solid var(--color-border);
`
export default OcrProviderSettings

@ -1,7 +1,7 @@
import { PictureOutlined } from '@ant-design/icons'
import { ErrorBoundary } from '@renderer/components/ErrorBoundary'
import { useTheme } from '@renderer/context/ThemeProvider'
import { useAppSelector } from '@renderer/store'
import { useOcrProviders } from '@renderer/hooks/useOcrProvider'
import { OcrProvider } from '@renderer/types'
import { Tabs, TabsProps } from 'antd'
import { FC, useState } from 'react'
@ -14,7 +14,7 @@ import OcrProviderSettings from './OcrProviderSettings'
const OcrSettings: FC = () => {
const { t } = useTranslation()
const { theme: themeMode } = useTheme()
const imageProvider = useAppSelector((state) => state.ocr.imageProvider)
const { imageProvider } = useOcrProviders()
const [provider, setProvider] = useState<OcrProvider>(imageProvider) // since default to image provider
const tabs: TabsProps['items'] = [
@ -33,9 +33,9 @@ const OcrSettings: FC = () => {
<SettingDivider />
<Tabs defaultActiveKey="image" items={tabs} />
</SettingGroup>
<SettingGroup theme={themeMode}>
<ErrorBoundary>
<OcrProviderSettings provider={provider} />
</SettingGroup>
</ErrorBoundary>
</ErrorBoundary>
)
}

@ -0,0 +1,78 @@
// import { loggerService } from '@logger'
import InfoTooltip from '@renderer/components/InfoTooltip'
import { SuccessTag } from '@renderer/components/Tags/SuccessTag'
import { isMac, isWin } from '@renderer/config/constant'
import { useOcrProvider } from '@renderer/hooks/useOcrProvider'
import useTranslate from '@renderer/hooks/useTranslate'
import { BuiltinOcrProviderIds, isOcrSystemProvider, TranslateLanguageCode } from '@renderer/types'
import { Flex, Select } from 'antd'
import { startTransition, useCallback, useMemo, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { SettingRow, SettingRowTitle } from '..'
// const logger = loggerService.withContext('OcrSystemSettings')
export const OcrSystemSettings = () => {
const { t } = useTranslation()
// Coupled with the custom translation languages, which should be acceptable
const { translateLanguages } = useTranslate()
const { provider, updateConfig } = useOcrProvider(BuiltinOcrProviderIds.system)
if (!isOcrSystemProvider(provider)) {
throw new Error('Not system provider.')
}
if (!isWin && !isMac) {
throw new Error('Only Windows and MacOS are supported.')
}
const [langs, setLangs] = useState<TranslateLanguageCode[]>(provider.config?.langs ?? [])
// currently static
const options = useMemo(
() =>
translateLanguages.map((lang) => ({
value: lang.langCode,
label: lang.emoji + ' ' + lang.label()
})),
[translateLanguages]
)
const onChange = useCallback((value: TranslateLanguageCode[]) => {
startTransition(() => {
setLangs(value)
})
}, [])
const onBlur = useCallback(() => {
updateConfig({ langs })
}, [langs, updateConfig])
return (
<>
<SettingRow>
<SettingRowTitle>
<Flex align="center" gap={4}>
{t('settings.tool.ocr.common.langs')}
{isWin && <InfoTooltip title={t('settings.tool.ocr.system.win.langs_tooltip')} />}
</Flex>
</SettingRowTitle>
<div style={{ display: 'flex', gap: '8px' }}>
{isMac && <SuccessTag message={t('settings.tool.ocr.image.system.no_need_configure')} />}
{isWin && (
<Select
mode="multiple"
style={{ width: '100%', minWidth: 200 }}
value={langs}
options={options}
onChange={onChange}
onBlur={onBlur}
maxTagCount={1}
/>
)}
</div>
</SettingRow>
</>
)
}

@ -1,8 +1,12 @@
// import { loggerService } from '@logger'
import InfoTooltip from '@renderer/components/InfoTooltip'
import CustomTag from '@renderer/components/Tags/CustomTag'
import { TESSERACT_LANG_MAP } from '@renderer/config/ocr'
import { useOcrProvider } from '@renderer/hooks/useOcrProvider'
import { BuiltinOcrProviderIds, isOcrTesseractProvider } from '@renderer/types'
import useTranslate from '@renderer/hooks/useTranslate'
import { BuiltinOcrProviderIds, isOcrTesseractProvider, TesseractLangCode } from '@renderer/types'
import { Flex, Select } from 'antd'
import { useCallback, useMemo, useState } from 'react'
import { useTranslation } from 'react-i18next'
import { SettingRow, SettingRowTitle } from '..'
@ -11,38 +15,70 @@ import { SettingRow, SettingRowTitle } from '..'
export const OcrTesseractSettings = () => {
const { t } = useTranslation()
const { provider } = useOcrProvider(BuiltinOcrProviderIds.tesseract)
const { provider, updateConfig } = useOcrProvider(BuiltinOcrProviderIds.tesseract)
if (!isOcrTesseractProvider(provider)) {
throw new Error('Not tesseract provider.')
}
// const [langs, setLangs] = useState<OcrTesseractConfig['langs']>(provider.config?.langs ?? {})
const [langs, setLangs] = useState<Partial<Record<TesseractLangCode, boolean>>>(provider.config?.langs ?? {})
const { translateLanguages } = useTranslate()
// currently static
const options = [
{ value: 'chi_sim', label: t('languages.chinese') },
{ value: 'chi_tra', label: t('languages.chinese-traditional') },
{ value: 'eng', label: t('languages.english') }
]
const options = useMemo(
() =>
translateLanguages
.map((lang) => ({
value: TESSERACT_LANG_MAP[lang.langCode],
label: lang.emoji + ' ' + lang.label()
}))
.filter((option) => option.value),
[translateLanguages]
)
// TODO: type safe objectKeys
const value = useMemo(
() =>
Object.entries(langs)
.filter(([, enabled]) => enabled)
.map(([lang]) => lang) as TesseractLangCode[],
[langs]
)
const onChange = useCallback((values: TesseractLangCode[]) => {
setLangs(() => {
const newLangs = {}
values.forEach((v) => {
newLangs[v] = true
})
return newLangs
})
}, [])
const onBlur = useCallback(() => {
updateConfig({ langs })
}, [langs, updateConfig])
return (
<>
<SettingRow>
<SettingRowTitle>
<Flex align="center" gap={4}>
{t('settings.tool.ocr.image.tesseract.langs')}
<InfoTooltip title={t('settings.tool.ocr.image.tesseract.temp_tooltip')} />
{t('settings.tool.ocr.common.langs')}
<InfoTooltip title={t('settings.tool.ocr.tesseract.langs_tooltip')} />
</Flex>
</SettingRowTitle>
<div style={{ display: 'flex', gap: '8px' }}>
<Select
mode="multiple"
disabled
style={{ width: '100%' }}
placeholder="Please select"
value={['chi_sim', 'chi_tra', 'eng']}
style={{ minWidth: 200 }}
value={value}
options={options}
maxTagCount={1}
onChange={onChange}
onBlur={onBlur}
// use tag render to disable default close action
// don't modify this, because close action won't trigger onBlur to update state
tagRender={(props) => <CustomTag color="var(--color-text)">{props.label}</CustomTag>}
/>
</div>
</SettingRow>
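
Review note: the Tesseract settings store enabled languages as a record of flags, while the `Select` works with an array of codes. The conversion in both directions can be sketched as below. This is an illustration under assumptions: `LangCode` is a hypothetical narrow union (the PR uses `Tesseract.LanguageCode`), and `enabledLangs` and `toLangRecord` are made-up helper names.

```typescript
// Hypothetical narrow code type for illustration only.
type LangCode = 'eng' | 'chi_sim' | 'chi_tra'
type LangRecord = Partial<Record<LangCode, boolean>>

// Record -> array of enabled codes (the shape the Select takes as `value`).
function enabledLangs(langs: LangRecord): LangCode[] {
  return (Object.keys(langs) as LangCode[]).filter((code) => langs[code] === true)
}

// Array -> record (the shape written back to the provider config).
function toLangRecord(codes: LangCode[]): LangRecord {
  const record: LangRecord = {}
  for (const code of codes) {
    record[code] = true
  }
  return record
}

const stored: LangRecord = { eng: true, chi_sim: false }
const selected = enabledLangs(stored)
console.log(selected)
console.log(toLangRecord(selected))
```

Typing the accumulator as `LangRecord` keeps the index assignment type-checked, which an untyped `{}` accumulator would not under strict settings.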

@ -27,7 +27,7 @@ import {
type TranslateHistory,
type TranslateLanguage
} from '@renderer/types'
import { getFileExtension, runAsyncFunction, uuid } from '@renderer/utils'
import { getFileExtension, isTextFile, runAsyncFunction, uuid } from '@renderer/utils'
import { abortCompletion } from '@renderer/utils/abortController'
import { isAbortError } from '@renderer/utils/error'
import { formatErrorMessage } from '@renderer/utils/error'
@ -465,7 +465,7 @@ const TranslatePage: FC = () => {
// Unified file handling
const processFile = useCallback(
async (file: FileMetadata) => {
// extensible
// extensible, only image for now
const shouldOCR = isSupportedOcrFile(file)
if (shouldOCR) {
@ -473,23 +473,45 @@ const TranslatePage: FC = () => {
const ocrResult = await ocr(file)
setText(ocrResult.text)
} finally {
// do nothing when failed.
// do nothing when failed. because error should be handled inside
}
} else {
// the threshold may be too large
if (file.size > 5 * MB) {
window.message.error(t('translate.files.error.too_large') + ' (0 ~ 5 MB)')
} else {
try {
window.message.loading({ content: t('translate.files.reading'), key: 'translate_files_reading', duration: 0 })
let isText: boolean
try {
const result = await window.api.fs.readText(file.path)
setText(result)
// Check whether the file is a text file
isText = await isTextFile(file.path)
} catch (e) {
logger.error('Failed to read text file.', e as Error)
window.message.error(t('translate.files.error.unknown') + ': ' + formatErrorMessage(e))
} finally {
window.message.destroy('translate_files_reading')
logger.error('Failed to check if file is text.', e as Error)
window.message.error(t('translate.files.error.check_type') + ': ' + formatErrorMessage(e))
throw e
}
if (!isText) {
window.message.error({
key: 'file_not_supported',
content: t('common.file.not_supported', { type: getFileExtension(file.path) })
})
logger.error('Unsupported file type.')
throw new Error('Unsupported file type')
}
// the threshold may be too large
if (file.size > 5 * MB) {
window.message.error(t('translate.files.error.too_large') + ' (0 ~ 5 MB)')
} else {
try {
const result = await window.api.fs.readText(file.path)
setText(result)
} catch (e) {
logger.error('Failed to read text file.', e as Error)
window.message.error(t('translate.files.error.unknown') + ': ' + formatErrorMessage(e))
}
}
} finally {
// do nothing when failed because error should be handled inside
window.message.destroy('translate_files_reading')
}
}
},
@ -533,9 +555,19 @@ const TranslatePage: FC = () => {
)
// Upload files by drag and drop
const {
isDragging,
setIsDragging,
handleDragEnter,
handleDragLeave,
handleDragOver,
handleDrop: preventDrop
} = useDrag<HTMLDivElement>()
const onDrop = useCallback(
async (e: React.DragEvent<HTMLDivElement>) => {
setIsProcessing(true)
setIsDragging(false)
// const supportedFiles = await filterSupportedFiles(_files, extensions)
const data = await getTextFromDropEvent(e).catch((err) => {
logger.error('getTextFromDropEvent', err)
@ -566,16 +598,9 @@ const TranslatePage: FC = () => {
}
setIsProcessing(false)
},
[getSingleFile, processFile, setText, t, text]
[getSingleFile, processFile, setIsDragging, setText, t, text]
)
const {
isDragging,
handleDragEnter,
handleDragLeave,
handleDragOver,
handleDrop: preventDrop
} = useDrag<HTMLDivElement>()
const {
isDragging: isDraggingOnInput,
handleDragEnter: handleDragEnterInput,

@ -16,7 +16,7 @@ export const ocr = async (file: SupportedOcrFile, provider: OcrProvider): Promis
logger.info(`ocr file ${file.path}`)
if (isOcrApiProvider(provider)) {
const client = OcrApiClientFactory.create(provider)
return client.ocr(file)
return client.ocr(file, provider.config)
} else {
return window.api.ocr.ocr(file, provider)
}

@ -64,7 +64,7 @@ const persistedReducer = persistReducer(
{
key: 'cherry-studio',
storage,
version: 137,
version: 138,
blacklist: ['runtime', 'messages', 'messageBlocks', 'tabs'],
migrate
},

@ -3,7 +3,7 @@ import { nanoid } from '@reduxjs/toolkit'
import { DEFAULT_CONTEXTCOUNT, DEFAULT_TEMPERATURE, isMac } from '@renderer/config/constant'
import { DEFAULT_MIN_APPS } from '@renderer/config/minapps'
import { isFunctionCallingModel, isNotSupportedTextDelta, SYSTEM_MODELS } from '@renderer/config/models'
import { BUILTIN_OCR_PROVIDERS, DEFAULT_OCR_PROVIDER } from '@renderer/config/ocr'
import { BUILTIN_OCR_PROVIDERS, BUILTIN_OCR_PROVIDERS_MAP, DEFAULT_OCR_PROVIDER } from '@renderer/config/ocr'
import { TRANSLATE_PROMPT } from '@renderer/config/prompts'
import {
isSupportArrayContentProvider,
@ -17,6 +17,7 @@ import i18n from '@renderer/i18n'
import { DEFAULT_ASSISTANT_SETTINGS } from '@renderer/services/AssistantService'
import {
Assistant,
BuiltinOcrProvider,
isSystemProvider,
Model,
Provider,
@ -78,6 +79,13 @@ function addProvider(state: RootState, id: string) {
}
}
// add ocr provider
function addOcrProvider(state: RootState, provider: BuiltinOcrProvider) {
if (!state.ocr.providers.find((p) => p.id === provider.id)) {
state.ocr.providers.push(provider)
}
}
function updateProvider(state: RootState, id: string, provider: Partial<Provider>) {
if (state.llm.providers) {
const index = state.llm.providers.findIndex((p) => p.id === id)
@ -2180,7 +2188,7 @@ const migrateConfig = {
try {
state.ocr = {
providers: BUILTIN_OCR_PROVIDERS,
imageProvider: DEFAULT_OCR_PROVIDER.image
imageProviderId: DEFAULT_OCR_PROVIDER.image.id
}
state.translate.translateInput = ''
return state
@ -2188,6 +2196,15 @@ const migrateConfig = {
logger.error('migrate 137 error', error as Error)
return state
}
},
'138': (state: RootState) => {
try {
addOcrProvider(state, BUILTIN_OCR_PROVIDERS_MAP.system)
return state
} catch (error) {
logger.error('migrate 138 error', error as Error)
return state
}
}
}
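
Review note: the `138` migration relies on `addOcrProvider` being idempotent, so running it against a state that already contains the system provider changes nothing. A minimal sketch of that property follows. The `OcrProvider` and `PersistedState` shapes are simplified assumptions, and the sample provider objects are made up for the example.

```typescript
// Simplified, hypothetical shapes; the real RootState is much larger.
type OcrProvider = { id: string; name: string }
type PersistedState = { ocr: { providers: OcrProvider[] } }

// Append the provider only if no provider with the same id exists yet.
function addOcrProvider(state: PersistedState, provider: OcrProvider): void {
  if (!state.ocr.providers.find((p) => p.id === provider.id)) {
    state.ocr.providers.push(provider)
  }
}

const systemProvider: OcrProvider = { id: 'system', name: 'System' }
const persisted: PersistedState = {
  ocr: { providers: [{ id: 'tesseract', name: 'Tesseract' }] }
}

addOcrProvider(persisted, systemProvider)
addOcrProvider(persisted, systemProvider) // second call is a no-op
console.log(persisted.ocr.providers.map((p) => p.id))
```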

@ -1,20 +1,25 @@
import { createSlice, PayloadAction } from '@reduxjs/toolkit'
import { BUILTIN_OCR_PROVIDERS, DEFAULT_OCR_PROVIDER } from '@renderer/config/ocr'
import { ImageOcrProvider, OcrProvider, OcrProviderConfig } from '@renderer/types'
import { OcrProvider, OcrProviderConfig } from '@renderer/types'
export interface OcrState {
providers: OcrProvider[]
imageProvider: ImageOcrProvider
imageProviderId: string
}
const initialState: OcrState = {
providers: BUILTIN_OCR_PROVIDERS,
imageProvider: DEFAULT_OCR_PROVIDER.image
imageProviderId: DEFAULT_OCR_PROVIDER.image.id
}
const ocrSlice = createSlice({
name: 'ocr',
initialState,
selectors: {
getImageProvider(state) {
return state.providers.find((p) => p.id === state.imageProviderId)
}
},
reducers: {
setOcrProviders(state, action: PayloadAction<OcrProvider[]>) {
state.providers = action.payload
@ -43,8 +48,8 @@ const ocrSlice = createSlice({
Object.assign(state.providers[index].config, action.payload.update)
}
},
setImageOcrProvider(state, action: PayloadAction<ImageOcrProvider>) {
state.imageProvider = action.payload
setImageOcrProviderId(state, action: PayloadAction<string>) {
state.imageProviderId = action.payload
}
}
})
@ -55,7 +60,9 @@ export const {
removeOcrProvider,
updateOcrProvider,
updateOcrProviderConfig,
setImageOcrProvider
setImageOcrProviderId
} = ocrSlice.actions
export const { getImageProvider } = ocrSlice.selectors
export default ocrSlice.reducer
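
Review note: the slice now stores `imageProviderId` and resolves the provider on read. The sketch below illustrates the consequence with a plain function instead of Redux Toolkit. The types are simplified assumptions and the sample state is made up; only the lookup logic mirrors the `getImageProvider` selector in the diff.

```typescript
// Simplified, hypothetical shapes; the PR's real types live in '@renderer/types'.
type OcrProvider = { id: string; name: string; config?: { langs?: string[] } }
type OcrState = { providers: OcrProvider[]; imageProviderId: string }

// Resolve the active image provider from its id on every read.
function getImageProvider(state: OcrState): OcrProvider | undefined {
  return state.providers.find((p) => p.id === state.imageProviderId)
}

const state: OcrState = {
  providers: [
    { id: 'tesseract', name: 'Tesseract' },
    { id: 'system', name: 'System OCR', config: { langs: [] } }
  ],
  imageProviderId: 'system'
}

// A config update in the list is visible through the selector at once,
// because no second copy of the provider is stored.
state.providers[1].config = { langs: ['en-us'] }
console.log(getImageProvider(state)?.config?.langs)

// An id that matches nothing yields undefined, so callers need a fallback.
state.imageProviderId = 'missing'
console.log(getImageProvider(state))
```

The second case is worth checking in the hook that consumes the selector, since a persisted id could in principle refer to a provider that is no longer in the list.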

@ -105,11 +105,15 @@ export type ImageFileMetadata = FileMetadata & {
type: FileTypes.IMAGE
}
export type PdfFileMetadata = FileMetadata & {
ext: '.pdf'
}
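/**
* Checks whether a FileMetadata describes an image file.
* @param file - the file metadata to check
* @returns true if the file type is FileTypes.IMAGE
*/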
export const isImageFile = (file: FileMetadata): file is ImageFileMetadata => {
export const isImageFileMetadata = (file: FileMetadata): file is ImageFileMetadata => {
return file.type === FileTypes.IMAGE
}

@ -666,6 +666,7 @@ export type GenerateImageResponse = {
}
// Aliased to string in order to support custom languages
/** zh-cn, en-us, etc. */
export type TranslateLanguageCode = string
// A langCode should uniquely identify one language

@ -1,9 +1,10 @@
import Tesseract from 'tesseract.js'
import { FileMetadata, ImageFileMetadata, isImageFile } from '.'
import { FileMetadata, ImageFileMetadata, isImageFileMetadata, TranslateLanguageCode } from '.'
export const BuiltinOcrProviderIds = {
tesseract: 'tesseract'
tesseract: 'tesseract',
system: 'system'
} as const
export type BuiltinOcrProviderId = keyof typeof BuiltinOcrProviderIds
@ -15,6 +16,7 @@ export const isBuiltinOcrProviderId = (id: string): id is BuiltinOcrProviderId =
// extensible
export const OcrProviderCapabilities = {
image: 'image'
// pdf: 'pdf'
} as const
export type OcrProviderCapability = keyof typeof OcrProviderCapabilities
@ -63,7 +65,7 @@ export const isOcrProviderApiConfig = (config: unknown): config is OcrProviderAp
*
* Extend this type to define provider-specific config types.
*/
export type OcrProviderConfig = {
export type OcrProviderBaseConfig = {
/** Not used for now. Could safely remove. */
api?: OcrProviderApiConfig
/** Not used for now. Could safely remove. */
@ -72,17 +74,21 @@ export type OcrProviderConfig = {
enabled?: boolean
}
export type OcrProviderConfig = OcrApiProviderConfig | OcrTesseractConfig | OcrSystemConfig
export type OcrProvider = {
id: string
name: string
capabilities: OcrProviderCapabilityRecord
config?: OcrProviderConfig
config?: OcrProviderBaseConfig
}
export type OcrApiProviderConfig = OcrProviderBaseConfig & {
api: OcrProviderApiConfig
}
export type OcrApiProvider = OcrProvider & {
config: OcrProviderConfig & {
api: OcrProviderApiConfig
}
config: OcrApiProviderConfig
}
export const isOcrApiProvider = (p: OcrProvider): p is OcrApiProvider => {
@ -108,6 +114,12 @@ export type ImageOcrProvider = OcrProvider & {
}
}
// export type PdfOcrProvider = OcrProvider & {
// capabilities: OcrProviderCapabilityRecord & {
// [OcrProviderCapabilities.pdf]: true
// }
// }
export const isImageOcrProvider = (p: OcrProvider): p is ImageOcrProvider => {
return p.capabilities.image === true
}
@ -115,28 +127,46 @@ export const isImageOcrProvider = (p: OcrProvider): p is ImageOcrProvider => {
export type SupportedOcrFile = ImageFileMetadata
export const isSupportedOcrFile = (file: FileMetadata): file is SupportedOcrFile => {
return isImageFile(file)
return isImageFileMetadata(file)
}
export type OcrResult = {
text: string
}
export type OcrHandler = (file: SupportedOcrFile) => Promise<OcrResult>
export type OcrHandler = (file: SupportedOcrFile, options?: OcrProviderBaseConfig) => Promise<OcrResult>
export type OcrImageHandler = (file: ImageFileMetadata) => Promise<OcrResult>
export type OcrImageHandler = (file: ImageFileMetadata, options?: OcrProviderBaseConfig) => Promise<OcrResult>
// Tesseract Types
export type OcrTesseractConfig = OcrProviderConfig & {
langs: Partial<Record<TesseractLangCode, boolean>>
export type OcrTesseractConfig = OcrProviderBaseConfig & {
langs?: Partial<Record<TesseractLangCode, boolean>>
}
export type OcrTesseractProvider = BuiltinOcrProvider & {
export type OcrTesseractProvider = {
id: 'tesseract'
config: OcrTesseractConfig
}
} & ImageOcrProvider &
BuiltinOcrProvider
export const isOcrTesseractProvider = (p: OcrProvider): p is OcrTesseractProvider => {
return p.id === BuiltinOcrProviderIds.tesseract
}
export type TesseractLangCode = Tesseract.LanguageCode
// System Types
export type OcrSystemConfig = OcrProviderBaseConfig & {
langs?: TranslateLanguageCode[]
}
export type OcrSystemProvider = {
id: 'system'
config: OcrSystemConfig
} & ImageOcrProvider &
// PdfOcrProvider &
BuiltinOcrProvider
export const isOcrSystemProvider = (p: OcrProvider): p is OcrSystemProvider => {
return p.id === BuiltinOcrProviderIds.system
}
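
The id-based guards above (`isOcrTesseractProvider`, `isOcrSystemProvider`) let callers read the provider-specific `langs` field without casts. The sketch below illustrates that narrowing under simplified assumptions: the types are trimmed stand-ins for the ones in this diff (language codes are plain `string`, the capability record is reduced to one flag), and `enabledLangs` is a hypothetical helper that does not appear in the PR.

```typescript
// Simplified stand-ins for the types in this diff. Field lists are trimmed and
// language codes are plain strings; the real code uses TesseractLangCode and
// TranslateLanguageCode.
type OcrProviderBaseConfig = { enabled?: boolean }
type OcrTesseractConfig = OcrProviderBaseConfig & { langs?: Partial<Record<string, boolean>> }
type OcrSystemConfig = OcrProviderBaseConfig & { langs?: string[] }

type OcrProvider = {
  id: string
  name: string
  capabilities: { image?: boolean }
  config?: OcrProviderBaseConfig
}
type OcrTesseractProvider = OcrProvider & { id: 'tesseract'; config: OcrTesseractConfig }
type OcrSystemProvider = OcrProvider & { id: 'system'; config: OcrSystemConfig }

// Same shape as the guards in the diff: the provider id is the discriminant.
const isOcrTesseractProvider = (p: OcrProvider): p is OcrTesseractProvider => p.id === 'tesseract'
const isOcrSystemProvider = (p: OcrProvider): p is OcrSystemProvider => p.id === 'system'

// Hypothetical helper (not in the PR): normalise both `langs` shapes to a list.
function enabledLangs(p: OcrProvider): string[] {
  if (isOcrSystemProvider(p)) {
    // Narrowed to OcrSystemProvider: `langs` is an optional array.
    return p.config.langs ?? []
  }
  if (isOcrTesseractProvider(p)) {
    // Narrowed to OcrTesseractProvider: `langs` is an optional code -> flag map.
    return Object.entries(p.config.langs ?? {})
      .filter(([, on]) => on === true)
      .map(([code]) => code)
  }
  return []
}
```

Because `langs` is optional in both config types after this change, the sketch falls back to an empty collection before reading it.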


@@ -1,5 +1,5 @@
import { FileMetadata } from '@renderer/types'
import { KB, MB } from '@shared/config/constant'
import { KB, MB, textExts } from '@shared/config/constant'
/**
*
@@ -82,6 +82,11 @@ export async function isSupportedFile(filePath: string, supportExts: Set<string>
}
}
export async function isTextFile(filePath: string): Promise<boolean> {
const set = new Set(textExts)
return isSupportedFile(filePath, set)
}
export async function filterSupportedFiles(files: FileMetadata[], supportExts: string[]): Promise<FileMetadata[]> {
const extensionSet = new Set(supportExts)
const validationResults = await Promise.all(
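
The new `isTextFile` wraps `isSupportedFile` with a `Set` built from `textExts` on every call. The body of `isSupportedFile` is not shown in this hunk, so the sketch below is only an approximation: it assumes an extension-only check, assumes the extension list uses leading dots, and uses hypothetical names (`sampleTextExts`, `extensionOf`, `hasSupportedExt`, `isTextFileSketch`). It also hoists the `Set` to module scope, which is an optional variation rather than what the PR does.

```typescript
// Assumed sample values; the real list is `textExts` from '@shared/config/constant'.
const sampleTextExts = ['.txt', '.md', '.json']

// Built once instead of on every call (a variation on the PR's per-call Set).
const sampleTextExtSet = new Set(sampleTextExts)

// Returns the lower-cased extension including the dot, or '' when there is none.
// Dotfiles such as '.gitignore' are treated as having no extension.
function extensionOf(filePath: string): string {
  const lastSep = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'))
  const base = filePath.slice(lastSep + 1)
  const dot = base.lastIndexOf('.')
  return dot > 0 ? base.slice(dot).toLowerCase() : ''
}

// Hypothetical stand-in for isSupportedFile: checks the extension only.
function hasSupportedExt(filePath: string, supportExts: Set<string>): boolean {
  const ext = extensionOf(filePath)
  return ext !== '' && supportExts.has(ext)
}

// Keeps the async signature used by isTextFile in the diff.
async function isTextFileSketch(filePath: string): Promise<boolean> {
  return hasSupportedExt(filePath, sampleTextExtSet)
}
```

If the real `isSupportedFile` also inspects file contents, only the `Set` hoisting carries over.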


@@ -1,12 +0,0 @@
import TesseractLogo from '@renderer/assets/images/providers/Tesseract.js.png'
import { isBuiltinOcrProviderId } from '@renderer/types'
export function getOcrProviderLogo(providerId: string) {
if (isBuiltinOcrProviderId(providerId)) {
switch (providerId) {
case 'tesseract':
return TesseractLogo
}
}
return undefined
}


@@ -4674,6 +4674,55 @@ __metadata:
languageName: node
linkType: hard
"@napi-rs/system-ocr-darwin-arm64@npm:1.0.2":
version: 1.0.2
resolution: "@napi-rs/system-ocr-darwin-arm64@npm:1.0.2"
conditions: os=darwin & cpu=arm64
languageName: node
linkType: hard
"@napi-rs/system-ocr-darwin-x64@npm:1.0.2":
version: 1.0.2
resolution: "@napi-rs/system-ocr-darwin-x64@npm:1.0.2"
conditions: os=darwin & cpu=x64
languageName: node
linkType: hard
"@napi-rs/system-ocr-win32-arm64-msvc@npm:1.0.2":
version: 1.0.2
resolution: "@napi-rs/system-ocr-win32-arm64-msvc@npm:1.0.2"
conditions: os=win32 & cpu=arm64
languageName: node
linkType: hard
"@napi-rs/system-ocr-win32-x64-msvc@npm:1.0.2":
version: 1.0.2
resolution: "@napi-rs/system-ocr-win32-x64-msvc@npm:1.0.2"
conditions: os=win32 & cpu=x64
languageName: node
linkType: hard
"@napi-rs/system-ocr@npm:^1.0.2":
version: 1.0.2
resolution: "@napi-rs/system-ocr@npm:1.0.2"
dependencies:
"@napi-rs/system-ocr-darwin-arm64": "npm:1.0.2"
"@napi-rs/system-ocr-darwin-x64": "npm:1.0.2"
"@napi-rs/system-ocr-win32-arm64-msvc": "npm:1.0.2"
"@napi-rs/system-ocr-win32-x64-msvc": "npm:1.0.2"
dependenciesMeta:
"@napi-rs/system-ocr-darwin-arm64":
optional: true
"@napi-rs/system-ocr-darwin-x64":
optional: true
"@napi-rs/system-ocr-win32-arm64-msvc":
optional: true
"@napi-rs/system-ocr-win32-x64-msvc":
optional: true
checksum: 10c0/170f89051d2b9da52648ef933ab5e73bbafcb7ffb98948d877d3c9718e308ebf258b7b947b0d4f2bfe65fd8d2adf5acf47e5f7efd59ac0535ca99562e41b833a
languageName: node
linkType: hard
"@napi-rs/wasm-runtime@npm:^1.0.1":
version: 1.0.1
resolution: "@napi-rs/wasm-runtime@npm:1.0.1"
@@ -8692,6 +8741,7 @@ __metadata:
"@mistralai/mistralai": "npm:^1.7.5"
"@modelcontextprotocol/sdk": "npm:^1.17.0"
"@mozilla/readability": "npm:^0.6.0"
"@napi-rs/system-ocr": "npm:^1.0.2"
"@notionhq/client": "npm:^2.2.15"
"@opentelemetry/api": "npm:^1.9.0"
"@opentelemetry/core": "npm:2.0.0"