cherry-studio/docs/technical/how-to-write-middlewares.md
MyPrototypeWhat 5864c7e17b feat: add middleware support for provider (#6176)
* feat: add middleware support for OpenAIProvider with logging capabilities

- Introduced middleware functionality in OpenAIProvider to enhance completions processing.
- Created AiProviderMiddlewareTypes for defining middleware interfaces and contexts.
- Implemented sampleLoggingMiddleware for logging message content and processing times.
- Updated OpenAIProvider constructor to accept middleware as an optional parameter.
- Refactored completions method to utilize middleware for improved extensibility and logging.
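
The pattern described in the list above, a completions call wrapped by a logging middleware, can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `CompletionsFn`, `Middleware`, `createLoggingMiddleware`, and `applyMiddlewares` are hypothetical names, and the real interfaces are the ones defined in AiProviderMiddlewareTypes.

```typescript
// Hypothetical, simplified shapes for illustration only.
type CompletionsParams = { messages: string[] }
type CompletionsFn = (params: CompletionsParams) => Promise<string>
type Middleware = (next: CompletionsFn) => CompletionsFn

// Logs the message count before the call and the elapsed time after it.
function createLoggingMiddleware(log: (line: string) => void): Middleware {
  return (next) => async (params) => {
    const startedAt = Date.now()
    log(`completions: ${params.messages.length} message(s)`)
    try {
      return await next(params)
    } finally {
      log(`completions finished in ${Date.now() - startedAt} ms`)
    }
  }
}

// Wraps `base` so that the first middleware in the array runs outermost.
function applyMiddlewares(base: CompletionsFn, middlewares: Middleware[]): CompletionsFn {
  return middlewares.reduceRight<CompletionsFn>((next, middleware) => middleware(next), base)
}
```

Because each middleware only sees `next`, the provider's completions method does not need to know which middlewares are attached.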

* refactor: streamline OpenAIProvider initialization and middleware application

- Removed optional middleware parameter from OpenAIProvider constructor for simplicity.
- Refactored ProviderFactory to create instances of providers and apply logging middleware consistently.
- Enhanced completions method visibility by changing it from private to public.
- Cleaned up unused code related to middleware handling in OpenAIProvider.

* feat: enhance AiProvider with new middleware capabilities and completion context

- Added public getter for provider info in BaseProvider.
- Introduced finalizeSdkRequestParams hook for middleware to modify SDK-specific request parameters.
- Refactored completions method in OpenAIProvider to accept a context object, improving middleware integration.
- Updated middleware types to include new context structure and callback functions for better extensibility.
- Enhanced logging middleware to utilize new context structure for improved logging capabilities.

* refactor: enhance middleware structure and context handling in AiProvider

- Updated BaseProvider and AiProvider to utilize AiProviderMiddlewareCompletionsContext for completions method.
- Introduced new utility functions for middleware context creation and execution.
- Refactored middleware application logic to improve extensibility and maintainability.
- Replaced sampleLoggingMiddleware with a more robust LoggingMiddleware implementation.
- Added new context management features for better middleware integration.

* refactor: update AiProvider and middleware structure for improved completions handling

- Refactored BaseProvider and AiProvider to change completions method signature from context to params.
- Removed unused AiProviderMiddlewareCompletionsContext and related code for cleaner implementation.
- Enhanced middleware configuration by introducing a dedicated middleware registration file.
- Implemented logging middleware for completions to improve observability during processing.
- Streamlined middleware application logic in ProviderFactory for better maintainability.

* docs: add middleware authoring guide

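- Added the document "How to Write Middleware for AI Providers", which describes the middleware architecture and types and includes authoring examples.
- Explained middleware execution order, registration, and best practices to help developers create and maintain middleware effectively.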

* refactor: update completions method signatures and introduce CompletionsResult type

- Changed the completions method signature in BaseProvider and AiProvider to return CompletionsResult instead of void.
- Added CompletionsResult type definition to encapsulate streaming and usage metrics.
- Updated middleware and related components to handle the new CompletionsResult structure, ensuring compatibility with existing functionality.
- Introduced new middleware for stream adaptation to enhance chunk processing during completions.

* refactor: enhance AiProvider middleware and streaming handling

- Updated CompletionsResult type to support both OpenAI SDK stream and ReadableStream.
- Modified CompletionsMiddleware to return CompletionsResult, improving type safety.
- Introduced StreamAdapterMiddleware to adapt OpenAI SDK streams to application-specific chunk streams.
- Enhanced logging in CompletionsLoggingMiddleware to capture and return results from next middleware calls.

* refactor: update AiProvider and middleware for OpenAI completions handling

- Renamed CompletionsResult to CompletionsOpenAIResult for clarity and updated its structure to support both OpenAI SDK and application-specific streams.
- Modified completions method signatures in AiProvider and OpenAIProvider to return CompletionsOpenAIResult.
- Enhanced middleware to process and adapt OpenAI SDK streams into standard chunk formats, improving overall streaming handling.
- Introduced new middleware components: FinalChunkConsumerAndNotifierMiddleware and OpenAISDKChunkToStandardChunkMiddleware for better chunk processing and logging.

* Deleted ExtractReasoningCompletionsMiddleware.ts, cleaning up unused middleware code to improve code tidiness and maintainability.

* refactor: consolidate middleware types and improve imports

- Replaced references to AiProviderMiddlewareTypes with the new middlewareTypes file across various middleware components for better organization.
- Introduced TextChunkMiddleware to enhance chunk processing from OpenAI SDK streams.
- Cleaned up imports in multiple files to reflect the new structure, improving code clarity and maintainability.

* feat: enhance abort handling with AbortController in middleware chain

- Update CompletionsOpenAIResult interface to use AbortController instead of AbortSignal
- Modify OpenAIProvider to pass abortController in completions method return
- Update AbortHandlerMiddleware to use controller from upstream result
- Improve abort handling flexibility by exposing full controller capabilities
- Enable middleware to actively control abort operations beyond passive monitoring

This change provides better control over request cancellation and enables
more sophisticated abort handling patterns in the middleware pipeline.
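
The difference between exposing an `AbortSignal` and an `AbortController` can be illustrated with a small sketch. The result shapes and the helper names `onAborted` and `cancelRequest` are assumptions made for this example; the actual interface is CompletionsOpenAIResult.

```typescript
// Hypothetical result shapes for illustration only.
interface ResultWithSignal {
  signal: AbortSignal
}
interface ResultWithController {
  controller: AbortController
}

// With only a signal, a middleware can observe cancellation but not cause it.
function onAborted(result: ResultWithSignal, listener: () => void): void {
  result.signal.addEventListener('abort', listener, { once: true })
}

// With the controller, a middleware can also cancel the request itself.
function cancelRequest(result: ResultWithController): void {
  if (!result.controller.signal.aborted) {
    result.controller.abort()
  }
}
```

A middleware that receives the controller can still hand `controller.signal` to code that should only observe.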

* refactor: enhance AiProvider and middleware for improved completions handling

- Updated BaseProvider to expose additional methods and properties, including getMessageParam and createAbortController.
- Modified OpenAIProvider to streamline completions processing and integrate new middleware for tool handling.
- Introduced TransformParamsBeforeCompletions middleware to standardize parameter transformation before completions.
- Added McpToolChunkMiddleware for managing tool calls within the completions stream.
- Enhanced middleware types to support new functionalities and improve overall structure.

These changes improve the flexibility and maintainability of the AiProvider and its middleware, facilitating better handling of OpenAI completions and tool interactions.

* refactor: enhance middleware for recursive handling and internal state management

- Introduced internal state management in middleware to support recursive calls, including enhanced dispatch functionality.
- Updated middleware types to include new internal fields for managing recursion depth and call status.
- Improved logging for better traceability of recursive calls and state transitions.
- Adjusted various middleware components to utilize the new internal state, ensuring consistent behavior during recursive processing.

These changes enhance the middleware's ability to handle complex scenarios involving recursive calls, improving overall robustness and maintainability.
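
One way to picture the recursion bookkeeping described above is a dispatcher that tracks depth and call status and refuses to go past a limit. This is a sketch under assumptions: `InternalState`, its field names, and `createRecursiveDispatch` are illustrative and are not taken from the commit.

```typescript
// Hypothetical internal state; field names are illustrative.
interface InternalState {
  recursionDepth: number
  isRecursiveCall: boolean
}

type Handler<T> = (state: InternalState, redispatch: () => T) => T

// Returns a function that runs `handler`, giving it a `redispatch` callback that
// re-enters the handler one level deeper and fails once `maxDepth` is reached.
function createRecursiveDispatch<T>(handler: Handler<T>, maxDepth: number): () => T {
  const run = (state: InternalState): T => {
    const redispatch = (): T => {
      if (state.recursionDepth >= maxDepth) {
        throw new Error(`Maximum recursion depth (${maxDepth}) exceeded`)
      }
      return run({ recursionDepth: state.recursionDepth + 1, isRecursiveCall: true })
    }
    return handler(state, redispatch)
  }
  return () => run({ recursionDepth: 0, isRecursiveCall: false })
}
```

Creating a fresh state object per level keeps each call's view of its own depth stable while nested calls run.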

* fix(OpenAIProvider): return empty object for missing sdkParams in completions handling

- Updated OpenAIProvider to return an empty object instead of undefined when sdkParams are not found, ensuring consistent return types.
- Enhanced TransformParamsBeforeCompletions middleware to include a flag for built-in web search functionality based on assistant settings.

* refactor(OpenAIProvider): enhance completions handling and middleware integration

- Updated the completions method in OpenAIProvider to include an onChunk callback for improved streaming support.
- Enabled the ThinkChunkMiddleware in the middleware registration for better handling of reasoning content.
- Increased the maximum recursion depth in McpToolChunkMiddleware to prevent infinite loops.
- Refined TextChunkMiddleware to directly enqueue chunks without unnecessary type checks.
- Improved the ThinkChunkMiddleware to better manage reasoning tags and streamline chunk processing.

These changes enhance the overall functionality and robustness of the AI provider and middleware components.

* feat(WebSearchMiddleware): add web search handling and integration

- Introduced WebSearchMiddleware to process various web search results, including annotations and citations, and generate LLM_WEB_SEARCH_COMPLETE chunks.
- Enhanced TextChunkMiddleware to support link conversion based on the model and assistant settings, improving the handling of TEXT_DELTA chunks.
- Updated middleware registration to include WebSearchMiddleware for comprehensive search result processing.

These changes enhance the AI provider's capabilities in handling web search functionalities and improve the overall middleware architecture.

* fix(middleware): improve optional chaining for chunk processing

- Updated McpToolChunkMiddleware and ThinkChunkMiddleware to use optional chaining for accessing choices, enhancing robustness against undefined values.
- Removed commented-out code in ThinkChunkMiddleware to streamline the chunk handling process.

These changes improve the reliability of middleware when processing OpenAI API responses.

* feat(middleware): enhance AbortHandlerMiddleware with recursion handling

- Added logic to detect and handle recursive calls, preventing unnecessary creation of AbortControllers.
- Improved logging for better visibility into middleware operations, including recursion depth and cleanup processes.
- Streamlined cleanup process for non-stream responses to ensure resources are released promptly.

These changes enhance the robustness and efficiency of the AbortHandlerMiddleware in managing API requests.

* docs(middleware): migration steps

* feat(middleware): implement FinalChunkConsumerMiddleware for usage and metrics accumulation

- Introduced FinalChunkConsumerMiddleware to replace the deprecated FinalChunkConsumerAndNotifierMiddleware.
- This new middleware accumulates usage and metrics data from OpenAI API responses, enhancing tracking capabilities.
- Updated middleware registration to utilize the new FinalChunkConsumerMiddleware, ensuring proper integration.
- Added support for handling recursive calls and improved logging for better debugging and monitoring.

These changes enhance the middleware's ability to manage and report usage metrics effectively during API interactions.

* refactor(migrate): update API request and response structures to TypeScript types

- Changed the definitions of `CoreCompletionsRequest` and `Chunk` to use TypeScript types instead of Zod Schemas for better type safety and clarity.
- Updated middleware and service classes to handle the new `Chunk` type, ensuring compatibility with the revised API client structure.
- Enhanced the response processing logic to standardize the handling of raw SDK chunks into application-level `Chunk` objects.
- Adjusted middleware to consume the new `Chunk` type, streamlining the overall architecture and improving maintainability.

These changes facilitate a more robust and type-safe integration with AI provider APIs.

* feat(AiProvider): implement API client architecture

- Introduced ApiClientFactory for creating instances of API clients based on provider configuration.
- Added BaseApiClient as an abstract class to provide common functionality for specific client implementations.
- Implemented OpenAIApiClient for OpenAI and Azure OpenAI, including request and response handling.
- Defined types and interfaces for API client operations, enhancing type safety and clarity.
- Established middleware schemas for standardized request processing across AI providers.

These changes lay the groundwork for a modular and extensible API client architecture, improving the integration of various AI providers.

* refactor(StreamAdapterMiddleware): simplify stream adaptation logic

- Updated StreamAdapterMiddleware to directly use AsyncIterable instead of wrapping it with rawSdkChunkAdapter, streamlining the adaptation process.
- Modified asyncGeneratorToReadableStream to accept AsyncIterable, enhancing its flexibility and usability.

These changes improve the efficiency of stream handling in the middleware.
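
Adapting an `AsyncIterable` directly into a `ReadableStream` can look roughly like the sketch below. It is not the project's asyncGeneratorToReadableStream; the name `asyncIterableToReadableStream` and the pull-based design are assumptions for illustration.

```typescript
// A sketch only; the project's own helper may be implemented differently.
function asyncIterableToReadableStream<T>(source: AsyncIterable<T>): ReadableStream<T> {
  const iterator = source[Symbol.asyncIterator]()
  return new ReadableStream<T>({
    // Pull one item per request so a slow consumer applies backpressure.
    async pull(controller) {
      const result = await iterator.next()
      if (result.done) {
        controller.close()
      } else {
        controller.enqueue(result.value)
      }
    },
    // Let the source release its resources if the consumer cancels early.
    async cancel() {
      await iterator.return?.()
    }
  })
}
```

Accepting any `AsyncIterable` means both async generators and SDK stream objects that implement `Symbol.asyncIterator` can be passed without an extra wrapper.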

* refactor(AiProvider): simplify ResponseChunkTransformer interface and streamline OpenAIApiClient response handling

- Changed ResponseChunkTransformer from an interface to a type for improved clarity and simplicity.
- Refactored OpenAIApiClient to streamline the response transformation logic, reducing unnecessary complexity in handling tool calls and reasoning content.
- Enhanced type safety by ensuring consistent handling of optional properties in response processing.

These changes improve the maintainability and readability of the codebase while ensuring robust response handling in the API client.

* doc(technicalArchitecture): add comprehensive documentation for AI Provider architecture

* feat(architecture): introduce AI Core Design documentation and middleware specification

- Added a comprehensive technical architecture document for the new AI Provider (`aiCore`), outlining core design principles, component details, and execution flow.
- Established a middleware specification document to define the design, implementation, and usage of middleware within the `aiCore` module, promoting a flexible and maintainable system.
- These additions provide clarity and guidance for future development and integration of AI functionalities within Cherry Studio.

* refactor(middleware): consolidate and enhance middleware architecture

- Removed deprecated extractReasoningMiddleware and integrated its functionality into existing middleware.
- Streamlined middleware registration and improved type definitions for better clarity and maintainability.
- Introduced new middleware components for handling chunk processing, web search, and reasoning tags, enhancing overall functionality.
- Updated various middleware to utilize the new structures and improve logging for better debugging.

These changes enhance the middleware's efficiency and maintainability, providing a more robust framework for API interactions.

* refactor(AiProvider): enhance API client and middleware integration

- Updated ApiClientFactory to include new SDK types for improved type safety and clarity.
- Refactored BaseApiClient to support additional parameters in the completions method, enhancing flexibility for processing states.
- Streamlined OpenAIApiClient to better handle tool calls and responses, including the introduction of new chunk types for tool management.
- Improved middleware architecture by integrating processing states and refining message handling, ensuring a more robust interaction with the API.

These changes enhance the overall maintainability and functionality of the API client and middleware, providing a more efficient framework for AI interactions.

* fix(McpToolChunkMiddleware): remove redundant logging in recursion state update

* refactor(McpToolChunkMiddleware): update tool call handling and type definitions

- Replaced ChatCompletionMessageToolCall with SdkToolCall for improved type consistency.
- Updated return types of executeToolCalls and executeToolUses functions to SdkMessage[], enhancing clarity in message handling.
- Removed unused import to streamline the code.

These changes enhance the maintainability and type safety of the middleware, ensuring better integration with the SDK.

* refactor(middleware): enhance middleware structure and type handling

- Updated middleware components to utilize new SDK types, improving type safety and clarity across the board.
- Refactored various middleware to streamline processing logic, including enhanced handling of SDK messages and tool calls.
- Improved logging and error handling for better debugging and maintainability.
- Consolidated middleware functions to reduce redundancy and improve overall architecture.

These changes enhance the robustness and maintainability of the middleware framework, ensuring a more efficient interaction with the API.

* refactor(middleware): unify type imports and enhance middleware structure

- Updated middleware components to import types from a unified 'types' file, improving consistency and clarity across the codebase.
- Removed the deprecated 'type.ts' file to streamline the middleware structure.
- Enhanced middleware registration and export mechanisms for better accessibility and maintainability.

These changes contribute to a more organized and efficient middleware framework, facilitating easier future development and integration.

* refactor(AiProvider): enhance API client and middleware integration

- Updated AiProvider components to support new SDK types, improving type safety and clarity.
- Refactored middleware to streamline processing logic, including enhanced handling of tool calls and responses.
- Introduced new middleware for tool use extraction and raw stream listening, improving overall functionality.
- Improved logging and error handling for better debugging and maintainability.

These changes enhance the robustness and maintainability of the API client and middleware, ensuring a more efficient interaction with the API.

* feat(middleware): add new middleware components for raw stream listening and tool use extraction

- Introduced RawStreamListenerMiddleware and ToolUseExtractionMiddleware to enhance middleware capabilities.
- Updated MiddlewareRegistry to include new middleware entries, improving overall functionality and extensibility.

These changes expand the middleware framework, facilitating better handling of streaming and tool usage scenarios.

* refactor(AiProvider): integrate new API client and middleware architecture

- Replaced BaseProvider with ApiClientFactory to enhance API client instantiation.
- Updated completions method to utilize new middleware architecture for improved processing.
- Added TODOs for refactoring remaining methods to align with the new API client structure.
- Removed deprecated middleware wrapping logic from ApiClientFactory for cleaner implementation.

These changes improve the overall structure and maintainability of the AiProvider, facilitating better integration with the new middleware system.

* refactor(middleware): update middleware architecture and documentation

- Revised middleware naming conventions and introduced a centralized MiddlewareRegistry for better management and accessibility.
- Enhanced MiddlewareBuilder to support named middleware and streamline the construction of middleware chains.
- Updated documentation to reflect changes in middleware usage and structure, improving clarity for future development.

These changes improve the organization and usability of the middleware framework, facilitating easier integration and maintenance.

* refactor(AiProvider): enhance completions middleware logic and API client handling

- Updated the completions method to conditionally remove middleware based on parameters, improving flexibility in processing.
- Refactored the response chunk transformer in OpenAIApiClient and AnthropicAPIClient to utilize a more streamlined approach with TransformStream.
- Simplified middleware context handling by removing unnecessary custom state management.
- Improved logging and error handling across middleware components for better debugging and maintainability.

These changes enhance the efficiency and clarity of the AiProvider's middleware integration, ensuring a more adaptable and robust processing framework.

* refactor(AiProvider, middleware): clean up logging and improve method naming

- Removed unnecessary logging of parameters in AiProvider to streamline the code.
- Updated method name assignment in middleware to enhance clarity and consistency.

These changes contribute to a cleaner codebase and improve the readability of the middleware and provider components.

* feat(middleware): enhance middleware types and add RawStreamListenerMiddleware

- Introduced RawStreamListenerMiddleware to the MiddlewareName enum for improved middleware capabilities.
- Updated type definitions across middleware components to enhance type safety and clarity, including the addition of new SDK types.
- Refactored context and middleware API interfaces to support more specific type parameters, improving overall maintainability.

These changes expand the middleware framework, facilitating better handling of streaming scenarios and enhancing type safety across the codebase.

* refactor(messageThunk): convert callback functions to async and handle errors during database updates

This commit updates several callback functions in the messageThunk to be asynchronous, ensuring that block transitions are awaited properly. Additionally, error handling is added for the database update function to log any failures when saving blocks. This improves the reliability and responsiveness of the message processing flow.

* refactor: enhance message block handling in messageThunk

This commit refactors the message processing logic in messageThunk to improve the management of message blocks. Key changes include the introduction of dedicated IDs for different block types (main text, thinking, tool, and image) to streamline updates and transitions. The handling of placeholder blocks has been improved, ensuring that they are correctly converted to their respective types during processing. Additionally, error handling has been enhanced for better reliability in database updates.

* feat(AiProvider): add default timeout configuration and enhance API client abort handling

- Introduced a default timeout constant to the configuration for improved API client timeout management.
- Updated BaseApiClient and its derived classes to utilize the new timeout setting, ensuring consistent timeout behavior across different API clients.
- Enhanced middleware to pass the timeout value during API calls, improving error handling and responsiveness.

These changes improve the overall robustness and configurability of the API client interactions, facilitating better control over request timeouts.

* feat(GeminiProvider): implement Gemini API client and enhance file handling

- Introduced GeminiAPIClient to facilitate interactions with the Gemini API, replacing the previous GoogleGenAI integration.
- Refactored GeminiProvider to utilize the new API client, improving code organization and maintainability.
- Enhanced file handling capabilities, including support for PDF uploads and retrieval of file metadata.
- Updated message processing to accommodate new SDK types and improve content generation logic.

These changes significantly enhance the functionality and robustness of the GeminiProvider, enabling better integration with the Gemini API and improving overall user experience.

* refactor(AiProvider, middleware): streamline API client and middleware integration

- Removed deprecated methods and types from various API clients, enhancing code clarity and maintainability.
- Updated the CompletionsParams interface to support messages as a string or array, improving flexibility in message handling.
- Refactored middleware components to eliminate unnecessary state management and improve type safety.
- Enhanced the handling of streaming responses and added utility functions for better stream management.

These changes contribute to a more robust and efficient architecture for the AiProvider and its associated middleware, facilitating improved API interactions and user experience.

* refactor(middleware): adapt for translation

- Deleted SdkCallMiddleware to streamline middleware architecture and improve maintainability.
- Commented out references to SdkCallModule in examples and registration files to prevent usage.
- Enhanced logging in AbortHandlerMiddleware for better debugging and tracking of middleware execution.
- Updated parameters in ResponseTransformMiddleware to improve flexibility in handling response settings.

These changes contribute to a cleaner and more efficient middleware framework, facilitating better integration and performance.

* refactor(ApiCheck): streamline API validation and error handling

- Updated the API check logic to simplify validation processes and improve error handling across various components.
- Refactored the `checkApi` function to throw errors directly instead of returning validation objects, enhancing clarity in error management.
- Improved the handling of API key checks in `checkModelWithMultipleKeys` to provide more informative error messages.
- Added a new method `getEmbeddingDimensions` in the `AiProvider` class to facilitate embedding dimension retrieval, enhancing model compatibility checks.

These changes contribute to a more robust and maintainable API validation framework, improving overall user experience and error reporting.

* refactor(HealthCheckService, ModelService): improve error handling and performance metrics

- Updated error handling in `checkModelWithMultipleKeys` to truncate error messages for better readability.
- Refactored `performModelCheck` to remove unnecessary error handling, focusing on performance metrics by returning only latency.
- Enhanced the `checkModel` function to ensure consistent return types, improving clarity in API interactions.

These changes contribute to a more efficient and user-friendly error reporting and performance tracking system.

* refactor(AiProvider, models): enhance model handling and API client integration

- Updated the `listModels` method in various API clients to improve model retrieval and ensure consistent return types.
- Refactored the `EditModelsPopup` component to handle model properties more robustly, including fallback options for `id`, `name`, and other attributes.
- Enhanced type definitions for models in the SDK to support new integrations and improve type safety.

These changes contribute to a more reliable and maintainable model management system within the AiProvider, enhancing overall user experience and API interactions.

* refactor(AiProvider, clients): implement image generation functionality

- Refactored the `generateImage` method in the `AiProvider` class to utilize the `apiClient` for image generation, replacing the previous placeholder implementation.
- Updated the `BaseApiClient` to include an abstract `generateImage` method, ensuring all derived clients implement this functionality.
- Implemented the `generateImage` method in `GeminiAPIClient` and `OpenAIAPIClient`, providing specific logic for image generation based on the respective SDKs.
- Added type definitions for `GenerateImageParams` across relevant files to enhance type safety and clarity in image generation parameters.

These changes enhance the image generation capabilities of the AiProvider, improving integration with various API clients and overall user experience.

* refactor(AiProvider, clients): restructure API client architecture and remove deprecated components

- Refactored the `ProviderFactory` and removed the `AihubmixProvider` to streamline the API client architecture.
- Updated the import paths for `isOpenAIProvider` to reflect the new structure.
- Introduced `AihubmixAPIClient` and `OpenAIResponseAPIClient` to enhance client handling based on model types.
- Improved the `AiProvider` class to utilize the new clients for better model-specific API interactions.
- Enhanced type definitions and error handling across various components to improve maintainability and clarity.

These changes contribute to a more efficient and organized API client structure, enhancing overall integration and user experience.

* fix: update system prompt handling in API clients to use await for asynchronous operations

- Modified the `AnthropicAPIClient`, `GeminiAPIClient`, `OpenAIAPIClient`, and `OpenAIResponseAPIClient` to ensure `buildSystemPrompt` is awaited, improving the handling of system prompts.
- Adjusted the `fetchMessagesSummary` function to utilize the last five user messages for better context in API calls and added a utility function to clean up topic names.

These changes enhance the reliability of prompt generation and improve the overall API interaction experience.
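
Selecting the last five user messages as summary context can be sketched as below. The `Message` shape and the helper name `lastUserMessages` are assumptions for this example and do not reflect the actual fetchMessagesSummary code.

```typescript
// Hypothetical message shape for illustration.
type Message = { role: 'user' | 'assistant' | 'system'; content: string }

// Returns up to `count` of the most recent user messages, oldest first.
function lastUserMessages(messages: Message[], count = 5): Message[] {
  if (count <= 0) return []
  return messages.filter((message) => message.role === 'user').slice(-count)
}
```

The guard for `count <= 0` matters because `slice(-0)` would return the whole array rather than an empty one.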

* refactor(middleware): remove examples.ts to streamline middleware documentation

- Deleted the `examples.ts` file containing various middleware usage examples to simplify the middleware structure and documentation.
- This change contributes to a cleaner codebase and focuses on essential middleware components, enhancing maintainability.

* refactor(AiProvider, middleware): enhance middleware handling and error management

- Updated the `CompletionsParams` interface to include a new `callType` property for better middleware decision-making based on the context of the API call.
- Introduced `ErrorHandlerMiddleware` to standardize error handling across middleware, allowing errors to be captured and processed as `ErrorChunk` objects.
- Modified the `AbortHandlerMiddleware` to conditionally remove itself based on the `callType`, improving middleware efficiency.
- Cleaned up logging in `AbortHandlerMiddleware` to reduce console output and enhance performance.
- Updated middleware registration to include the new `ErrorHandlerMiddleware`, ensuring comprehensive error management in the middleware pipeline.

These changes contribute to a more robust and maintainable middleware architecture, improving error handling and overall API interaction efficiency.
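
The idea of capturing errors and surfacing them as `ErrorChunk` objects can be sketched as follows. Only the `ErrorChunk` name comes from this commit message; the chunk fields, the `onChunk` parameter, and the choice to swallow the error after reporting it are assumptions, and the real ErrorHandlerMiddleware may behave differently.

```typescript
// Hypothetical shapes for illustration only.
type ErrorChunk = { type: 'error'; error: { message: string } }
type Params = { onChunk?: (chunk: ErrorChunk) => void }
type Next<R> = (params: Params) => Promise<R>

// Wraps `next` so that a thrown error is reported through `onChunk` as an ErrorChunk.
// Assumption: the error is swallowed afterwards and `undefined` is returned.
function withErrorHandler<R>(next: Next<R>): Next<R | undefined> {
  return async (params) => {
    try {
      return await next(params)
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error)
      params.onChunk?.({ type: 'error', error: { message } })
      return undefined
    }
  }
}
```

Routing failures through the same callback as regular chunks lets the consumer handle errors in one place instead of wrapping every call in its own try/catch.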

* feat: implement token estimation for message handling

- Added an abstract method `estimateMessageTokens` to the `BaseApiClient` class for estimating token usage based on message content.
- Implemented the `estimateMessageTokens` method in `AnthropicAPIClient`, `GeminiAPIClient`, `OpenAIAPIClient`, and `OpenAIResponseAPIClient` to calculate token consumption for various message types.
- Enhanced middleware to accumulate token usage for new messages, improving tracking of API call costs.

These changes improve the efficiency of message processing and provide better insights into token usage across different API clients.

* feat: add support for image generation and model handling

- Introduced `SUPPORTED_DISABLE_GENERATION_MODELS` to manage models that disable image generation.
- Updated `isSupportedDisableGenerationModel` function to check model compatibility.
- Enhanced `Inputbar` logic to conditionally enable image generation based on model support.
- Modified API clients to handle image generation calls and responses, including new chunk types for image data.
- Updated middleware and service layers to incorporate image generation parameters and improve overall processing.

These changes enhance the application's capabilities for image generation and improve the handling of various model types.

* feat: enhance GeminiAPIClient for image generation support

- Added `getGenerateImageParameter` method to configure image generation parameters.
- Updated request handling in `GeminiAPIClient` to include image generation options.
- Enhanced response processing to handle image data and enqueue it correctly.

These changes improve the GeminiAPIClient's capabilities for generating and processing images, aligning with recent enhancements in image generation support.

* feat: enhance image generation handling in OpenAIResponseAPIClient and middleware

- Updated OpenAIResponseAPIClient to improve user message processing for image generation.
- Added handling for image creation events in TransformCoreToSdkParamsMiddleware.
- Adjusted ApiService to streamline image generation event handling.
- Modified messageThunk to reflect changes in image block status during processing.

These enhancements improve the integration and responsiveness of image generation features across the application.

* refactor: remove unused AI provider classes

- Deleted `AihubmixProvider`, `AnthropicProvider`, `BaseProvider`, `GeminiProvider`, and `OpenAIProvider` as they are no longer utilized in the codebase.
- This cleanup reduces code complexity and improves maintainability by removing obsolete components related to AI provider functionality.

* chore: remove obsolete test files for middleware

- Deleted test files for `AbortHandlerMiddleware`, `LoggingMiddleware`, `TextChunkMiddleware`, `ThinkChunkMiddleware`, and `WebSearchMiddleware` as they are no longer needed.
- This cleanup helps streamline the codebase and reduces maintenance overhead by removing outdated tests.

* chore: remove Suggestions component and related functionality

- Deleted the `Suggestions` component from the home page as it is no longer needed.
- Removed associated imports and functions related to suggestion fetching, streamlining the codebase.
- This cleanup helps improve maintainability by eliminating unused components.

* feat: enhance OpenAIAPIClient and StreamProcessingService for tool call handling

- Updated OpenAIAPIClient to conditionally include tool calls in the assistant message, improving message processing logic.
- Enhanced tool call handling in the response transformer to correctly manage and enqueue tool call data.
- Added a new callback for LLM response completion in StreamProcessingService, allowing better integration of response handling.

These changes improve the functionality and responsiveness of the OpenAI API client and stream processing capabilities.

* fix: copilot error

* fix: improve chunk handling in TextChunkMiddleware and ThinkChunkMiddleware

- Updated TextChunkMiddleware to enqueue LLM_RESPONSE_COMPLETE chunks based on accumulated text content.
- Refactored ThinkChunkMiddleware to generate THINKING_COMPLETE chunks when receiving non-THINKING_DELTA chunks, ensuring proper handling of accumulated thinking content.
- These changes enhance the middleware's responsiveness and accuracy in processing text and thinking chunks.

* chore: update dependencies and improve styling

- Updated `selection-hook` dependency to version 0.9.23 in `package.json` and `yarn.lock`.
- Removed unused styles from `container.scss` and adjusted padding in `index.scss`.
- Enhanced message rendering and layout in various components, including `Message`, `MessageHeader`, and `MessageMenubar`.
- Added tooltip support for message divider settings in `SettingsTab`.
- Improved handling of citation display in `CitationsList` and `CitationBlock`.

These changes streamline the codebase and enhance the user interface for better usability.

* feat: implement image generation middleware and enhance model handling

- Added `ImageGenerationMiddleware` to handle dedicated image generation models, integrating image processing and OpenAI's image generation API.
- Updated `AiProvider` to utilize the new middleware for dedicated image models, ensuring proper middleware chaining.
- Introduced constants for dedicated image models in `models.ts` to streamline model identification.
- Refactored error handling in `ErrorHandlerMiddleware` to use a utility function for better error management.
- Cleaned up imports and removed unused code in various files for improved maintainability.

* fix: update dedicated image models identification logic

- Modified the `DEDICATED_IMAGE_MODELS` array to include 'grok-2-image' for improved model handling.
- Enhanced the `isDedicatedImageGenerationModel` function to use a more robust check for model identification, ensuring better accuracy in middleware processing.

* refactor: remove OpenAIResponseProvider class

- Deleted the `OpenAIResponseProvider` class from the `AiProvider` module, streamlining the codebase by eliminating unused code.
- This change enhances maintainability and reduces complexity in the provider architecture.

* fix: usermessage

* refactor: simplify AbortHandlerMiddleware for improved abort handling

- Removed direct dependency on ApiClient for creating AbortController, enhancing modularity.
- Introduced utility functions to manage abort controllers, streamlining the middleware's responsibilities.
- Delegated abort signal handling to downstream middlewares, allowing for cleaner separation of concerns.

* refactor(aiCore): Consolidate AI provider and middleware architecture

This commit refactors the AI-related modules by unifying the `clients` and `middleware` directories under a single `aiCore` directory. This change simplifies the project structure, improves modularity, and makes the architecture more cohesive.

Key changes:
- Relocated provider-specific clients and middleware into the `aiCore` directory, removing the previous `providers/AiProvider` structure.
- Updated the architectural documentation (`AI_CORE_DESIGN.md`) to accurately reflect the new, streamlined directory layout and execution flow.
- The main `AiProvider` class is now the primary export of `aiCore/index.ts`, serving as the central access point for AI functionalities.

* refactor: update imports and enhance middleware functionality

- Adjusted import statements in `AnthropicAPIClient` and `GeminiAPIClient` for better organization.
- Improved `AbortHandlerMiddleware` to handle abort signals more effectively, including the conversion of streams to handle abort scenarios.
- Enhanced `ErrorHandlerMiddleware` to differentiate between abort errors and other types, ensuring proper error handling.
- Cleaned up commented-out code in `FinalChunkConsumerMiddleware` for better readability and maintainability.

* refactor: streamline middleware logging and improve error handling

- Removed excessive debug logging from various middleware components, including `AbortHandlerMiddleware`, `FinalChunkConsumerMiddleware`, and `McpToolChunkMiddleware`, to enhance readability and performance.
- Updated logging levels to use warnings for potential issues in `ResponseTransformMiddleware`, `TextChunkMiddleware`, and `ThinkChunkMiddleware`, ensuring better visibility of important messages.
- Cleaned up commented-out code and unnecessary debug statements across multiple middleware files for improved maintainability.

---------

Co-authored-by: suyao <sy20010504@gmail.com>
Co-authored-by: eeee0717 <chentao020717Work@outlook.com>
Co-authored-by: lizhixuan <zhixuan.li@banosuperapp.com>
2025-06-12 16:01:19 +08:00


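# How to Write Middleware for AI Providers

This document explains how to create and integrate custom middleware for our AI Provider framework. Middleware offers a powerful, flexible way to augment, modify, or observe calls to Provider methods, for example for logging, caching, request/response transformation, and error handling.

## Architecture overview

Our middleware architecture borrows Redux's three-layer design and uses a JavaScript Proxy to apply middleware to Provider methods dynamically.

- **Proxy**: intercepts calls to Provider methods and routes them into the middleware chain.
- **Middleware chain**: a series of middleware functions executed in order. Each middleware can process the request/response and then pass control to the next middleware in the chain, or in some cases terminate the chain early.
- **Context**: an object passed between middlewares that carries information about the current call (such as the method name, the original arguments, the Provider instance, and data defined by middlewares).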
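
As a rough illustration of how these three pieces fit together, the sketch below wraps a provider in a `Proxy` and sends each method call through a composed chain. It is not the framework's actual implementation: `Context`, `Handler`, `compose`, `wrapProvider`, and the demo provider are hypothetical names, and the `api` layer is left out to keep the example short.

```typescript
// Simplified stand-ins for illustration; the real types live in AiProviderMiddlewareTypes.
type Context = { methodName: string; originalArgs: unknown[] }
type Handler = (context: Context, ...args: unknown[]) => Promise<unknown>
type Middleware = (next: Handler) => Handler

// Fold the middlewares around the core handler so the first array entry runs first.
function compose(middlewares: Middleware[], core: Handler): Handler {
  return middlewares.reduceRight<Handler>((next, middleware) => middleware(next), core)
}

// Wrap a provider in a Proxy that routes every method call through the chain.
function wrapProvider<T extends object>(provider: T, middlewares: Middleware[]): T {
  return new Proxy(provider, {
    get(target, prop, receiver) {
      const original = Reflect.get(target, prop, receiver)
      if (typeof original !== 'function') return original // plain properties pass through
      return (...args: unknown[]) => {
        const context: Context = { methodName: String(prop), originalArgs: args }
        // The end of the chain calls the real method on the real provider.
        const core: Handler = async (_context, ...coreArgs) => original.apply(target, coreArgs)
        return compose(middlewares, core)(context, ...args)
      }
    }
  })
}

// Demo: a tracing middleware around a fake provider.
const events: string[] = []
const tracing: Middleware = (next) => async (context, ...args) => {
  events.push(`before ${context.methodName}`)
  const result = await next(context, ...args)
  events.push(`after ${context.methodName}`)
  return result
}

const provider = {
  async completions(prompt: string): Promise<string> {
    events.push('core')
    return `echo: ${prompt}`
  }
}
const wrapped = wrapProvider(provider, [tracing])
```

Calling `await wrapped.completions('hi')` resolves to `'echo: hi'` and leaves `events` as `['before completions', 'core', 'after completions']`.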
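## Types of middleware

Two types of middleware are currently supported. They share a similar structure but target different scenarios:

1. **`CompletionsMiddleware`**: designed specifically for the `completions` method. This is the most commonly used type, because it allows fine-grained control over the AI model's core chat/text-generation functionality.
2. **`ProviderMethodMiddleware`**: general-purpose middleware that can be applied to any other Provider method (for example `translate` or `summarize`, provided those methods are also wrapped by the middleware system).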
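
This document does not spell out the signature of `ProviderMethodMiddleware`, so the following is only a sketch under the assumption that it mirrors the same three-layer shape with a generic argument list. `MethodContext`, `MethodApi`, `MethodNext`, and `createCallRecorder` are hypothetical names; check `AiProviderMiddlewareTypes` for the real definitions.

```typescript
// Hypothetical, simplified shapes; not the framework's real type definitions.
interface MethodContext {
  methodName: string
  providerId: string
}
interface MethodApi {
  getContext(): MethodContext
  getProviderId(): string
}
type MethodNext = (context: MethodContext, args: unknown[]) => Promise<unknown>
type ProviderMethodMiddleware = (api: MethodApi) => (next: MethodNext) => MethodNext

// A generic middleware: it knows nothing about the wrapped method except its name and arguments.
const calls: string[] = []
const createCallRecorder = (): ProviderMethodMiddleware => (api) => (next) => async (context, args) => {
  calls.push(`${api.getProviderId()}.${context.methodName}(${args.length} args)`)
  return next(context, args)
}

// Demo wiring for a `translate` call, with a stub standing in for the real method.
const context: MethodContext = { methodName: 'translate', providerId: 'demo' }
const api: MethodApi = { getContext: () => context, getProviderId: () => context.providerId }
const translate: MethodNext = async (_context, args) => `translated: ${String(args[0])}`
const handler = createCallRecorder()(api)(translate)
```

Because the middleware only touches `context.methodName` and the raw argument array, the same factory could in principle be registered for several methods.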
## Writing a `CompletionsMiddleware`

The basic signature (TypeScript type) of `CompletionsMiddleware` is as follows:
```typescript
import { AiProviderMiddlewareCompletionsContext, CompletionsParams, MiddlewareAPI } from './AiProviderMiddlewareTypes' // assumed path of the type definition file

export type CompletionsMiddleware = (
  api: MiddlewareAPI<AiProviderMiddlewareCompletionsContext, [CompletionsParams]>
) => (
  // next returns Promise<any>: the raw SDK response or the result of a downstream middleware
  next: (context: AiProviderMiddlewareCompletionsContext, params: CompletionsParams) => Promise<any>
  // the innermost function usually returns Promise<void>, because results are delivered through onChunk or context side effects
) => (context: AiProviderMiddlewareCompletionsContext, params: CompletionsParams) => Promise<void>
```
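Let's break down this three-layer structure:

1. **First layer, `(api) => { ... }`**:
   - Receives an `api` object.
   - The `api` object provides the following methods:
     - `api.getContext()`: returns the context object of the current call (`AiProviderMiddlewareCompletionsContext`).
     - `api.getOriginalArgs()`: returns the original argument array passed to the `completions` method (that is, `[CompletionsParams]`).
     - `api.getProviderId()`: returns the ID of the current Provider.
     - `api.getProviderInstance()`: returns the original Provider instance.
   - This function is typically used for one-time setup or to obtain the services/configuration you need. It returns the second-layer function.
2. **Second layer, `(next) => { ... }`**:
   - Receives a `next` function.
   - `next` represents the next link in the middleware chain. Calling `next(context, params)` passes control to the next middleware or, if the current middleware is the last one in the chain, invokes the core Provider method logic (for example, the actual SDK call).
   - `next` receives the current `context` and `params` (which may already have been modified by upstream middlewares).
   - **Importantly**, the return type of `next` is usually `Promise<any>`. For the `completions` method, if `next` invokes the actual SDK, it returns the raw SDK response (for example, an OpenAI stream object or a JSON object). You are responsible for handling this response.
   - This function returns the third-layer (and most important) function.
3. **Third layer, `(context, params) => { ... }`**:
   - This is where the middleware's main logic runs.
   - It receives the current `context` (`AiProviderMiddlewareCompletionsContext`) and `params` (`CompletionsParams`).
   - Inside this function you can:
     - **Before calling `next`**:
       - Read or modify `params`, for example to add default parameters or convert the message format.
       - Read or modify `context`, for example to set a timestamp used later to compute latency.
       - Run checks and, if a condition is not met, return or throw without calling `next` (for example, when parameter validation fails).
     - **Call `await next(context, params)`**:
       - This is the key step that passes control downstream.
       - The return value of `next` is the raw SDK response or the result of a downstream middleware, and you need to handle it accordingly (for example, start consuming it if it is a stream).
     - **After calling `next`**:
       - Process the result returned by `next`. For example, if `next` returned a stream, you can start iterating over it here and send chunks through `context.onChunk`.
       - Perform further work based on changes to `context` or on the result of `next`, for example computing the total elapsed time or writing a log entry.
       - Modify the final result (although for `completions` the result is usually emitted as a side effect through `onChunk`).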
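
The three phases of the innermost function can be sketched roughly as follows. This is a minimal example under stated assumptions, not the framework's code: `Chunk`, `Params`, `Ctx`, `Api`, and `streamToChunks` are simplified, hypothetical stand-ins, the default `temperature` of 0.7 is arbitrary, and `next` is assumed to resolve to an async iterable of text fragments.

```typescript
// Simplified stand-ins for the real context, params and chunk types.
type Chunk = { type: 'text'; text: string } | { type: 'complete' }
interface Params {
  messages: { role: string; content: string }[]
  temperature?: number
}
interface Ctx {
  methodName: string
  onChunk?: (chunk: Chunk) => void
  startedAt?: number
}
interface Api {
  getProviderId(): string
}
type Next = (context: Ctx, params: Params) => Promise<AsyncIterable<string>>
type Middleware = (api: Api) => (next: Next) => (context: Ctx, params: Params) => Promise<void>

const streamToChunks: Middleware = (api) => {
  // Layer 1: one-time setup.
  const providerId = api.getProviderId()
  return (next) => async (context, params) => {
    // Before next: validate, and stop early when the request is unusable.
    if (params.messages.length === 0) {
      throw new Error(`[${providerId}] completions called without messages`)
    }
    // Before next: fill in a default and stamp the context.
    const effectiveParams: Params = { ...params, temperature: params.temperature ?? 0.7 }
    context.startedAt = Date.now()

    // Hand control downstream.
    const stream = await next(context, effectiveParams)

    // After next: consume the stream and emit chunks through context.onChunk.
    for await (const text of stream) {
      context.onChunk?.({ type: 'text', text })
    }
    context.onChunk?.({ type: 'complete' })
  }
}

// Demo: a fake SDK call that streams two fragments.
const received: Chunk[] = []
const fakeSdkCall: Next = async (_context, params) => {
  async function* tokens(): AsyncGenerator<string> {
    yield 'Hello'
    yield `, temperature=${params.temperature}`
  }
  return tokens()
}
const handler = streamToChunks({ getProviderId: () => 'demo' })(fakeSdkCall)
```

Running `handler` with one user message and `onChunk: (chunk) => received.push(chunk)` yields two text chunks followed by a `complete` chunk. Running it with an empty `messages` array rejects before `next` is ever called.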
### Example: a simple logging middleware
```typescript
import {
  AiProviderMiddlewareCompletionsContext,
  CompletionsMiddleware, // assumes the CompletionsMiddleware type is exported from the same module
  CompletionsParams,
  MiddlewareAPI
} from './AiProviderMiddlewareTypes' // adjust the path
import { ChunkType } from '@renderer/types' // adjust the path

export const createSimpleLoggingMiddleware = (): CompletionsMiddleware => {
  return (api: MiddlewareAPI<AiProviderMiddlewareCompletionsContext, [CompletionsParams]>) => {
    // console.log(`[LoggingMiddleware] Initialized for provider: ${api.getProviderId()}`);
    return (next: (context: AiProviderMiddlewareCompletionsContext, params: CompletionsParams) => Promise<any>) => {
      return async (context: AiProviderMiddlewareCompletionsContext, params: CompletionsParams): Promise<void> => {
        const startTime = Date.now()
        // Get onChunk from the context (it originally comes from params.onChunk)
        const onChunk = context.onChunk
        console.log(
          `[LoggingMiddleware] Request for ${context.methodName} with params:`,
          params.messages?.[params.messages.length - 1]?.content
        )
        try {
          // Call the next middleware or the core logic.
          // `rawSdkResponse` is the raw response from downstream (for example an OpenAIStream or ChatCompletion object)
          const rawSdkResponse = await next(context, params)
          // This simple example does not process rawSdkResponse; it assumes a downstream middleware
          // (such as StreamingResponseHandler) handles it and sends the data through onChunk.
          // If this logging middleware comes after StreamingResponseHandler, the stream has already been processed.
          // If it comes before, it must process rawSdkResponse itself or make sure something downstream does.
          const duration = Date.now() - startTime
          console.log(`[LoggingMiddleware] Request for ${context.methodName} completed in ${duration}ms.`)
          // This assumes downstream has already sent all data through onChunk.
          // If this middleware were the end of the chain and had to guarantee that BLOCK_COMPLETE is sent,
          // it might need more elaborate logic to track when all data has been sent.
        } catch (error) {
          const duration = Date.now() - startTime
          console.error(`[LoggingMiddleware] Request for ${context.methodName} failed after ${duration}ms:`, error)
          // If onChunk is available, try to send an error chunk
          if (onChunk) {
            onChunk({
              type: ChunkType.ERROR,
              error: { message: (error as Error).message, name: (error as Error).name, stack: (error as Error).stack }
            })
            // Consider whether BLOCK_COMPLETE also needs to be sent to end the stream
            onChunk({ type: ChunkType.BLOCK_COMPLETE, response: {} })
          }
          throw error // Rethrow so an upstream or global error handler can catch it
        }
      }
    }
  }
}
```
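### The importance of `AiProviderMiddlewareCompletionsContext`

`AiProviderMiddlewareCompletionsContext` is the core mechanism for passing state and data between middlewares. It typically contains:

- `methodName`: the name of the method being called (always `'completions'`).
- `originalArgs`: the original argument array passed to `completions`.
- `providerId`: the Provider's ID.
- `_providerInstance`: the Provider instance.
- `onChunk`: the callback passed in through the original `CompletionsParams`, used to stream chunks. **All middlewares should send data through `context.onChunk`.**
- `messages`, `model`, `assistant`, `mcpTools`: commonly used fields extracted from the original `CompletionsParams` for convenient access.
- **Custom fields**: a middleware can add custom fields to the context for later middlewares to use. For example, a caching middleware might set `context.cacheHit = true`.

**Key point**: when you modify `params` or `context` in a middleware, the changes propagate to downstream middlewares, provided you make them before calling `next`.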
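
To make the propagation rule concrete, here is a small sketch in which one middleware writes custom fields and a downstream middleware reads them. The `requestId` field, the two middlewares, and the reduced two-layer `Middleware` type (the `api` layer is omitted) are all hypothetical and only serve the illustration.

```typescript
// Simplified stand-ins; `requestId` and `cacheHit` are custom fields added by middlewares.
interface Ctx {
  methodName: string
  requestId?: string
  cacheHit?: boolean
}
interface Params {
  messages: string[]
}
type Next = (context: Ctx, params: Params) => Promise<unknown>
type Middleware = (next: Next) => Next // the `api` layer is omitted for brevity

const seen: string[] = []

// Upstream middleware: writes one field before next and one after.
const tagRequest: Middleware = (next) => async (context, params) => {
  context.requestId = 'req-1' // set before next: visible downstream
  const result = await next(context, params)
  context.cacheHit = false // set after next: downstream has already run
  return result
}

// Downstream middleware: records what it can see when it runs.
const readTag: Middleware = (next) => async (context, params) => {
  seen.push(`requestId=${context.requestId}, cacheHit=${context.cacheHit}`)
  return next(context, params)
}

const core: Next = async () => 'done'
const handler = tagRequest(readTag(core))
```

After `await handler(ctx, { messages: [] })`, `seen[0]` is `'requestId=req-1, cacheHit=undefined'`: the downstream middleware saw the field written before `next` but not the one written after it, although `ctx.cacheHit` is `false` once the whole call has finished.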
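### Middleware order

The order in which middlewares execute matters a great deal. They run in the order in which they are listed in the `AiProviderMiddlewareConfig` arrays.

- The request passes through the first middleware, then the second, and so on.
- The response (or the result of calling `next`) then "bubbles" back in the reverse order.

For example, if the chain is `[AuthMiddleware, CacheMiddleware, LoggingMiddleware]`:

1. `AuthMiddleware` runs its "before calling `next`" logic first.
2. Then `CacheMiddleware` runs its "before calling `next`" logic.
3. Then `LoggingMiddleware` runs its "before calling `next`" logic.
4. The core logic runs (the SDK call, or the end of the chain).
5. `LoggingMiddleware` receives the result first and runs its "after calling `next`" logic.
6. Then `CacheMiddleware` receives the result (with a context that LoggingMiddleware may have modified) and runs its "after calling `next`" logic (for example, storing the result).
7. Finally, `AuthMiddleware` receives the result and runs its "after calling `next`" logic.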
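
The seven steps above can be reproduced with a toy chain. The sketch below assumes a `reduceRight`-style composition, which is one common way to get this ordering and not necessarily how the framework composes its chain; `named`, `compose`, and the `trace` field are hypothetical.

```typescript
// Minimal two-layer middleware shape, enough to observe ordering.
type Next = (context: { trace: string[] }) => Promise<string>
type Middleware = (next: Next) => Next

// Build a middleware that only records when it runs.
const named = (name: string): Middleware => (next) => async (context) => {
  context.trace.push(`${name}: before next`)
  const result = await next(context)
  context.trace.push(`${name}: after next`)
  return result
}

// Wrap from the last middleware inward so the first array entry ends up outermost.
const compose = (middlewares: Middleware[], core: Next): Next =>
  middlewares.reduceRight<Next>((next, middleware) => middleware(next), core)

const core: Next = async (context) => {
  context.trace.push('core')
  return 'ok'
}

const chain = compose([named('Auth'), named('Cache'), named('Logging')], core)
```

Calling `await chain({ trace })` with an empty `trace` array fills it with `Auth: before next`, `Cache: before next`, `Logging: before next`, `core`, `Logging: after next`, `Cache: after next`, `Auth: after next`, in that order.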
### Registering middleware

Middleware is registered in `src/renderer/src/providers/middleware/register.ts` (or a similar configuration file).
```typescript
// register.ts
import { AiProviderMiddlewareConfig } from './AiProviderMiddlewareTypes'
import { createSimpleLoggingMiddleware } from './common/SimpleLoggingMiddleware' // assuming you created this file
import { createCompletionsLoggingMiddleware } from './common/CompletionsLoggingMiddleware' // already exists

const middlewareConfig: AiProviderMiddlewareConfig = {
  completions: [
    createSimpleLoggingMiddleware(), // the middleware you just added
    createCompletionsLoggingMiddleware() // the existing logging middleware
    // ... other completions middlewares
  ],
  methods: {
    // translate: [createGenericLoggingMiddleware()],
    // ... middlewares for other methods
  }
}

export default middlewareConfig
```
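### Best practices

1. **Single responsibility**: each middleware should focus on one specific concern (for example logging, caching, or transforming a particular kind of data).
2. **No side effects (as far as possible)**: apart from the explicit side effects made through `context` or `onChunk`, avoid modifying global state or introducing other hidden side effects.
3. **Error handling**:
   - Use `try...catch` inside the middleware to handle errors that may occur.
   - Decide whether to handle the error yourself (for example, by sending an error chunk through `onChunk`) or to rethrow it upstream.
   - If you rethrow, make sure the error object carries enough information.
4. **Performance**: middleware adds overhead to request processing. Avoid long-running synchronous operations in middleware, and make sure I/O-bound operations are asynchronous.
5. **Configurability**: make the middleware's behavior adjustable through parameters or configuration. For example, a logging middleware can accept a log-level parameter.
6. **Context management**:
   - Add data to `context` with care. Avoid polluting `context` or attaching very large objects.
   - Be clear about the purpose and lifetime of the fields you add to `context`.
7. **Calling `next`**:
   - Unless you have a good reason to terminate the request early (for example a cache hit or an authorization failure), **always make sure you call `await next(context, params)`**. Otherwise, downstream middlewares and the core logic will not run.
   - Understand the return value of `next` and handle it correctly, especially when it is a stream. You are responsible for consuming the stream or passing it to another component/middleware that can.
8. **Clear naming**: give your middlewares, and the functions that create them, descriptive names.
9. **Documentation and comments**: comment complex middleware logic to explain how it works and why it exists.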
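
Several of these practices (configurability, error handling, always calling `next`) can be combined in one small factory. The sketch below is one possible shape rather than an existing middleware in the codebase: `LogLevel`, `LoggingOptions`, and `createLeveledLogging` are hypothetical, and the types are reduced to a two-layer form without the `api` layer.

```typescript
// Hypothetical options for a configurable logging middleware.
type LogLevel = 'debug' | 'info' | 'silent'
interface LoggingOptions {
  level: LogLevel
  log: (line: string) => void // injected so callers (and tests) choose the sink
}
interface Ctx {
  methodName: string
}
interface Params {
  messages: string[]
}
type Next = (context: Ctx, params: Params) => Promise<unknown>
type Middleware = (next: Next) => Next

const createLeveledLogging = (options: LoggingOptions): Middleware => (next) => async (context, params) => {
  if (options.level === 'debug') {
    options.log(`-> ${context.methodName} with ${params.messages.length} message(s)`)
  }
  try {
    // Always hand control downstream; this middleware only observes.
    const result = await next(context, params)
    if (options.level !== 'silent') options.log(`<- ${context.methodName} ok`)
    return result
  } catch (error) {
    if (options.level !== 'silent') {
      const reason = error instanceof Error ? error.message : String(error)
      options.log(`<- ${context.methodName} failed: ${reason}`)
    }
    throw error // rethrow the original error so upstream handlers keep all its information
  }
}

// Example configuration: log outcomes only, to the console.
const outcomeLogging = createLeveledLogging({ level: 'info', log: console.log })
```

Injecting `log` instead of calling `console.log` directly keeps the middleware free of hidden side effects and makes it straightforward to assert on its output in a test.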
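### Debugging tips

- Use `console.log` or a debugger at key points in the middleware to inspect the state of `params` and `context` and the return value of `next`.
- Temporarily simplify the middleware chain, keeping only the middleware you are debugging and the simplest possible core logic, to isolate the problem.
- Write unit tests to verify the behavior of each middleware in isolation.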
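
A middleware can be tested in isolation by handing it a stub `next` that records its inputs instead of calling an SDK. The sketch below uses plain `throw` checks so that it does not depend on any particular test runner; `createKeepLastMessages` is a hypothetical middleware invented for the example, and the types are simplified stand-ins.

```typescript
interface Ctx {
  methodName: string
}
interface Params {
  messages: string[]
}
type Next = (context: Ctx, params: Params) => Promise<unknown>
type Middleware = (next: Next) => Next

// Hypothetical middleware under test: keep only the most recent `limit` messages.
const createKeepLastMessages = (limit: number): Middleware => (next) => (context, params) =>
  next(context, { ...params, messages: limit > 0 ? params.messages.slice(-limit) : [] })

// A stub `next` that records what it was called with and returns a canned result.
function createStubNext(result: unknown): { next: Next; calls: Params[] } {
  const calls: Params[] = []
  const next: Next = async (_context, params) => {
    calls.push(params)
    return result
  }
  return { next, calls }
}

async function testKeepLastMessages(): Promise<void> {
  const stub = createStubNext('stubbed response')
  const handler = createKeepLastMessages(2)(stub.next)

  const result = await handler({ methodName: 'completions' }, { messages: ['a', 'b', 'c'] })

  if (result !== 'stubbed response') throw new Error('result should pass through unchanged')
  if (stub.calls.length !== 1) throw new Error('next should be called exactly once')
  if (stub.calls[0].messages.join(',') !== 'b,c') {
    throw new Error('only the last two messages should reach next')
  }
}
```

The same stub works for checking early termination: if a middleware is supposed to short-circuit, assert that `stub.calls` stays empty.
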
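
By following these guidelines, you should be able to create powerful, maintainable middleware for our system. If you have questions or need further help, ask the team.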