docs(architecture): restructure and enhance data management documentation

- Removed outdated sections on database architecture and data access patterns for clarity.
- Introduced a new table format for data management systems, detailing use cases and APIs.
- Updated references to the database schema and migration processes for better guidance.
- Consolidated key architectural components to streamline the documentation structure.
fullex 2025-12-29 17:16:07 +08:00
parent 819c209821
commit 6feb322be8


@@ -48,34 +48,17 @@ When creating a Pull Request, you MUST:
### Key Architectural Components
#### Main Process Services (`src/main/services/`)
- **MCPService**: Model Context Protocol server management
- **KnowledgeService**: Document processing and knowledge base management
- **FileStorage/S3Storage/WebDav**: Multiple storage backends
- **WindowService**: Multi-window management (main, mini, selection windows)
- **ProxyManager**: Network proxy handling
- **SearchService**: Full-text search capabilities
#### AI Core (`src/renderer/src/aiCore/`)
- **Middleware System**: Composable pipeline for AI request processing
- **Client Factory**: Supports multiple AI providers (OpenAI, Anthropic, Gemini, etc.)
- **Stream Processing**: Real-time response handling
#### Data Management
- **Cache System**: Three-layer caching (memory/shared/persist) with React hooks integration
- **Preferences**: Type-safe configuration management with multi-window synchronization
- **User Data**: SQLite-based storage with Drizzle ORM for business data
**MUST READ**: [docs/en/references/data/README.md](docs/en/references/data/README.md) for system selection, architecture, and patterns.
#### Knowledge Management
| System | Use Case | APIs |
|--------|----------|------|
| Cache | Temporary data (can be lost) | `useCache`, `useSharedCache`, `usePersistCache` |
| Preference | User settings | `usePreference` |
| DataApi | Business data (**critical**) | `useQuery`, `useMutation` |
- **Embeddings**: Vector search with multiple providers (OpenAI, Voyage, etc.)
- **OCR**: Document text extraction (system OCR, Doc2x, Mineru)
- **Preprocessing**: Document preparation pipeline
- **Loaders**: Support for various file formats (PDF, DOCX, EPUB, etc.)
Database: SQLite + Drizzle ORM, schemas in `src/main/data/db/schemas/`, migrations via `yarn db:migrations:generate`
### Build System
@@ -106,50 +89,6 @@ The project is in the process of migrating from antd & styled-components to Tail
UI Library: `@packages/ui`
### Database Architecture
- **Database**: SQLite (`cherrystudio.sqlite`) + libsql driver
- **ORM**: Drizzle ORM with comprehensive migration system
- **Schemas**: Located in `src/main/data/db/schemas/` directory
#### Database Standards
- **Table Naming**: Use singular form with snake_case (e.g., `topic`, `message`, `app_state`)
- **Schema Exports**: Export using `xxxTable` pattern (e.g., `topicTable`, `appStateTable`)
- **Field Definition**: Drizzle infers field names automatically, so there is no need to add default field names
- **JSON Fields**: For JSON support, add `{ mode: 'json' }`; see the `preference.ts` table definition for an example
- **JSON Serialization**: Do not manually serialize or deserialize JSON fields when reading from or writing to the database; Drizzle handles this automatically
- **Timestamps**: Use existing `crudTimestamps` utility
- **Migrations**: Generate via `yarn run db:migrations:generate`
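
The table-naming and export-naming rules above can be illustrated with a small sketch. The helpers `isSnakeCase` and `toExportName` are hypothetical names introduced here for illustration and are not part of the project; the sketch only checks the snake_case shape and cannot verify that a name is singular.

```typescript
// Hypothetical helpers illustrating the naming standards listed above.
// Neither function exists in the project; both are assumptions for illustration.

/** True when a table name is lowercase snake_case (e.g. `topic`, `app_state`). */
function isSnakeCase(tableName: string): boolean {
  return /^[a-z]+(_[a-z]+)*$/.test(tableName);
}

/** Derives the `xxxTable` export name from a snake_case table name. */
function toExportName(tableName: string): string {
  if (!isSnakeCase(tableName)) {
    throw new Error(`Table name must be snake_case: ${tableName}`);
  }
  // Convert each `_x` pair to an uppercase `X` to get camelCase.
  const camel = tableName.replace(/_([a-z])/g, (_match: string, letter: string) =>
    letter.toUpperCase()
  );
  return `${camel}Table`;
}

console.log(toExportName("topic")); // prints "topicTable"
console.log(toExportName("app_state")); // prints "appStateTable"
```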
## Data Access Patterns
The application uses three distinct data management systems. Choose the appropriate system based on data characteristics:
### Cache System
- **Purpose**: Temporary data that can be regenerated
- **Lifecycle**: Component-level (memory), window-level (shared), or persistent (survives restart)
- **Use Cases**: API response caching, computed results, temporary UI state
- **APIs**: `useCache`, `useSharedCache`, `usePersistCache` hooks, or `cacheService`
### Preference System
- **Purpose**: User configuration and application settings
- **Lifecycle**: Permanent until user changes
- **Use Cases**: Theme, language, editor settings, user preferences
- **APIs**: `usePreference`, `usePreferences` hooks, or `preferenceService`
### User Data API
- **Purpose**: Core business data (conversations, files, notes, etc.)
- **Lifecycle**: Permanent business records
- **Use Cases**: Topics, messages, files, knowledge base, user-generated content
- **APIs**: `useDataApi` hook or `dataApiService` for direct calls
### Selection Guidelines
- **Use Cache** for data that can be lost without impact (computed values, API responses)
- **Use Preferences** for user settings that affect app behavior (UI configuration, feature flags)
- **Use User Data API** for irreplaceable business data (conversations, documents, user content)
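
The guidelines above can be read as a simple decision rule. The following is a minimal sketch, not project code: the `DataTraits` shape, the `chooseDataSystem` function, and the order in which the checks run are all assumptions made for illustration.

```typescript
// Hypothetical decision helper mirroring the selection guidelines above.
// The type names and the precedence of the checks are assumptions.

type DataSystem = "Cache" | "Preferences" | "User Data API";

interface DataTraits {
  /** The value is a user setting that affects app behavior. */
  userSetting: boolean;
  /** The value can be regenerated or lost without impact. */
  regenerable: boolean;
}

function chooseDataSystem(traits: DataTraits): DataSystem {
  if (traits.userSetting) return "Preferences";
  if (traits.regenerable) return "Cache";
  // Anything else is treated as irreplaceable business data.
  return "User Data API";
}

// A cached API response can be lost without impact.
console.log(chooseDataSystem({ userSetting: false, regenerable: true })); // prints "Cache"
// A conversation message is irreplaceable.
console.log(chooseDataSystem({ userSetting: false, regenerable: false })); // prints "User Data API"
```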
## Logging Standards
### Usage