feat(migration): enhance ChatMigrator for comprehensive chat data migration

- Implemented detailed preparation, execution, and validation phases for migrating chat topics and messages from Dexie to SQLite.
- Added robust logging and error handling to track migration progress and issues.
- Introduced data transformation strategies to convert old message structures into a new tree format, ensuring data integrity and consistency.
- Updated migration guide documentation to reflect changes in migrator registration and detailed comments for maintainability.
fullex 2026-01-01 23:13:43 +08:00
parent 4fcf047fa9
commit 4f4785396a
5 changed files with 1921 additions and 63 deletions


@@ -31,9 +31,10 @@ src/main/data/migration/v2/
- `execute(ctx)`: perform inserts/updates; manage your own transactions; report progress via `reportProgress`
- `validate(ctx)`: verify counts and integrity; return `ValidateResult` with stats (`sourceCount`, `targetCount`, `skippedCount`) and any `errors`
- Registration: list migrators (in order) in `migrators/index.ts` so the engine can sort and run them.
- Current migrators (see `migrators/README-<name>.md` for detailed documentation):
- `PreferencesMigrator` (implemented): maps ElectronStore + Redux settings to the `preference` table using `mappings/PreferencesMappings.ts`.
- `ChatMigrator` (implemented): migrates topics and messages from Dexie to SQLite. See [`README-ChatMigrator.md`](../../../src/main/data/migration/v2/migrators/README-ChatMigrator.md).
- `AssistantMigrator`, `KnowledgeMigrator` (placeholders): scaffolding and TODO notes for future tables.
- Conventions:
- All logging goes through `loggerService` with a migrator-specific context.
- Use `MigrationContext.sources` instead of accessing raw files/stores directly.
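A migrator satisfying this contract can be sketched as follows. The result shapes and the `BaseMigrator` stub below are simplified assumptions for illustration only; the real definitions live in `@shared/data/migration/v2/types` and `migrators/BaseMigrator.ts`:

```typescript
// Simplified stand-ins for the project's migration types (assumed shapes).
interface PrepareResult { success: boolean; itemCount: number; warnings?: string[] }
interface ExecuteResult { success: boolean; processedCount: number; error?: string }
interface ValidateResult {
  success: boolean
  errors: { key: string; message: string }[]
  stats: { sourceCount: number; targetCount: number; skippedCount: number }
}

abstract class BaseMigrator {
  abstract readonly id: string
  abstract readonly order: number
  abstract prepare(): Promise<PrepareResult>
  abstract execute(): Promise<ExecuteResult>
  abstract validate(): Promise<ValidateResult>
}

// Minimal example: prepare counts source items, execute processes them,
// validate compares source and target counts.
class ExampleMigrator extends BaseMigrator {
  readonly id = 'example'
  readonly order = 99
  private itemCount = 0

  async prepare(): Promise<PrepareResult> {
    this.itemCount = 3 // real migrators count rows in the source export here
    return { success: true, itemCount: this.itemCount }
  }

  async execute(): Promise<ExecuteResult> {
    return { success: true, processedCount: this.itemCount }
  }

  async validate(): Promise<ValidateResult> {
    return {
      success: true,
      errors: [],
      stats: { sourceCount: this.itemCount, targetCount: this.itemCount, skippedCount: 0 }
    }
  }
}
```

The engine sorts registered migrators by `order` and runs the three phases in sequence, so a phase can rely on state cached by the previous one (as `ChatMigrator` does with its lookup maps).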
@@ -62,3 +63,10 @@ src/main/data/migration/v2/
- [ ] Wire progress updates through `reportProgress` so UI shows per-migrator progress.
- [ ] Register the migrator in `migrators/index.ts` with the correct `order`.
- [ ] Add any new target tables to `MigrationEngine.verifyAndClearNewTables` once those tables exist.
- [ ] Include detailed comments for maintainability (file-level, function-level, logic blocks).
- [ ] **Create/update `migrators/README-<MigratorName>.md`** with detailed documentation including:
- Data sources and target tables
- Key transformations
- Field mappings (source → target)
- Dropped fields and rationale
- Code quality notes


@@ -5,7 +5,9 @@
import { dbService } from '@data/db/DbService'
import { appStateTable } from '@data/db/schemas/appState'
import { messageTable } from '@data/db/schemas/message'
import { preferenceTable } from '@data/db/schemas/preference'
import { topicTable } from '@data/db/schemas/topic'
import { loggerService } from '@logger'
import type {
MigrationProgress,
@@ -24,8 +26,6 @@ import { createMigrationContext } from './MigrationContext'
// TODO: Import these tables when they are created in user data schema
// import { assistantTable } from '../../db/schemas/assistant'
// import { fileTable } from '../../db/schemas/file'
// import { knowledgeBaseTable } from '../../db/schemas/knowledgeBase'
@@ -197,12 +197,13 @@ export class MigrationEngine {
const db = dbService.getDb()
// Tables to clear - add more as they are created
// Order matters: child tables must be cleared before parent tables
const tables = [
{ table: messageTable, name: 'message' }, // Must clear before topic (FK reference)
{ table: topicTable, name: 'topic' },
{ table: preferenceTable, name: 'preference' }
// TODO: Add these when tables are created
// { table: assistantTable, name: 'assistant' },
// { table: fileTable, name: 'file' },
// { table: knowledgeBaseTable, name: 'knowledge_base' }
]
@@ -216,14 +217,15 @@
}
}
// Clear tables in dependency order (children before parents)
// Messages reference topics, so delete messages first
await db.delete(messageTable)
await db.delete(topicTable)
await db.delete(preferenceTable)
// TODO: Add these when tables are created (in correct order)
// await db.delete(fileTable)
// await db.delete(knowledgeBaseTable)
// await db.delete(assistantTable)
logger.info('All new architecture tables cleared successfully')
}


@@ -1,81 +1,623 @@
/**
* Chat Migrator - Migrates topics and messages from Dexie to SQLite
*
* ## Overview
*
* This migrator handles the largest data migration task: transferring all chat topics
* and their messages from the old Dexie/IndexedDB storage to the new SQLite database.
*
* ## Data Sources
*
* | Data | Source | File/Path |
* |------|--------|-----------|
* | Topics with messages | Dexie `topics` table | `topics.json` `{ id, messages[] }` |
* | Message blocks | Dexie `message_blocks` table | `message_blocks.json` |
* | Assistants (for meta) | Redux `assistants` slice | `ReduxStateReader.getCategory('assistants')` |
*
* ## Target Tables
*
* - `topicTable` - Stores conversation topics/threads
* - `messageTable` - Stores chat messages with tree structure
*
* ## Key Transformations
*
* 1. **Linear Tree Structure**
* - Old: Messages stored as linear array in `topic.messages[]`
* - New: Tree via `parentId` + `siblingsGroupId`
*
* 2. **Multi-model Responses**
* - Old: `askId` links responses to user message, `foldSelected` marks active
* - New: Shared `parentId` + non-zero `siblingsGroupId` groups siblings
*
* 3. **Block Inlining**
* - Old: `message.blocks: string[]` (IDs) + separate `message_blocks` table
* - New: `message.data.blocks: MessageDataBlock[]` (inline JSON)
*
* 4. **Citation Migration**
* - Old: Separate `CitationMessageBlock`
* - New: Merged into `MainTextBlock.references` as ContentReference[]
*
* 5. **Mention Migration**
* - Old: `message.mentions: Model[]`
* - New: `MentionReference[]` in `MainTextBlock.references`
*
* ## Performance Considerations
*
* - Uses streaming JSON reader for large data sets (potentially millions of messages)
* - Processes topics in batches to control memory usage
* - Pre-loads all blocks into memory map for O(1) lookup (blocks table is smaller)
* - Uses database transactions for atomicity and performance
*
* @since v2.0.0
*/
import { messageTable } from '@data/db/schemas/message'
import { topicTable } from '@data/db/schemas/topic'
import { loggerService } from '@logger'
import type { ExecuteResult, PrepareResult, ValidateResult, ValidationError } from '@shared/data/migration/v2/types'
import { eq, sql } from 'drizzle-orm'
import { v4 as uuidv4 } from 'uuid'
import type { MigrationContext } from '../core/MigrationContext'
import { BaseMigrator } from './BaseMigrator'
import {
buildBlockLookup,
buildMessageTree,
type NewMessage,
type NewTopic,
type OldAssistant,
type OldBlock,
type OldTopic,
type OldTopicMeta,
resolveBlocks,
transformMessage,
transformTopic
} from './mappings/ChatMappings'
const logger = loggerService.withContext('ChatMigrator')
/**
* Batch size for processing topics
* Chosen to balance memory usage and transaction overhead
*/
const TOPIC_BATCH_SIZE = 50
/**
* Batch size for inserting messages
* SQLite has limits on the number of parameters per statement
*/
const MESSAGE_INSERT_BATCH_SIZE = 100
/**
* Assistant data from Redux for generating AssistantMeta
*/
interface AssistantState {
assistants: OldAssistant[]
}
/**
* Prepared data for execution phase
*/
interface PreparedTopicData {
topic: NewTopic
messages: NewMessage[]
}
export class ChatMigrator extends BaseMigrator {
readonly id = 'chat'
readonly name = 'ChatData'
readonly description = 'Migrate chat topics and messages'
readonly order = 4
// Prepared data for execution
private topicCount = 0
private messageCount = 0
private blockLookup: Map<string, OldBlock> = new Map()
private assistantLookup: Map<string, OldAssistant> = new Map()
// Topic metadata from Redux (name, pinned, etc.) - Dexie only has messages
private topicMetaLookup: Map<string, OldTopicMeta> = new Map()
// Topic → AssistantId mapping from Redux (Dexie topics don't store assistantId)
private topicAssistantLookup: Map<string, string> = new Map()
private skippedTopics = 0
private skippedMessages = 0
// Track seen message IDs to handle duplicates across topics
private seenMessageIds = new Set<string>()
// Block statistics for diagnostics
private blockStats = { requested: 0, resolved: 0, messagesWithMissingBlocks: 0, messagesWithEmptyBlocks: 0 }
/**
* Prepare phase - validate source data and count items
*
* Steps:
* 1. Check if topics.json and message_blocks.json exist
* 2. Load all blocks into memory for fast lookup
* 3. Load assistant data for generating meta
* 4. Count topics and estimate message count
* 5. Validate sample data for integrity
*/
async prepare(ctx: MigrationContext): Promise<PrepareResult> {
const warnings: string[] = []
try {
// Step 1: Verify export files exist
const topicsExist = await ctx.sources.dexieExport.tableExists('topics')
if (!topicsExist) {
logger.warn('topics.json not found, skipping chat migration')
return {
success: true,
itemCount: 0,
warnings: ['topics.json not found - no chat data to migrate']
}
}
const blocksExist = await ctx.sources.dexieExport.tableExists('message_blocks')
if (!blocksExist) {
warnings.push('message_blocks.json not found - messages will have empty blocks')
}
// Step 2: Load all blocks into lookup map
// Blocks table is typically smaller than messages, safe to load entirely
if (blocksExist) {
logger.info('Loading message blocks into memory...')
const blocks = await ctx.sources.dexieExport.readTable<OldBlock>('message_blocks')
this.blockLookup = buildBlockLookup(blocks)
logger.info(`Loaded ${this.blockLookup.size} blocks into lookup map`)
}
// Step 3: Load assistant data for generating AssistantMeta
// Also extract topic metadata from assistants (Redux stores topic metadata in assistants.topics[])
const assistantState = ctx.sources.reduxState.getCategory<AssistantState>('assistants')
if (assistantState?.assistants) {
for (const assistant of assistantState.assistants) {
this.assistantLookup.set(assistant.id, assistant)
// Extract topic metadata from this assistant's topics array
// Redux stores topic metadata (name, pinned, etc.) but with messages: []
// Also track topic → assistantId mapping (Dexie doesn't store assistantId)
if (assistant.topics && Array.isArray(assistant.topics)) {
for (const topic of assistant.topics) {
if (topic.id) {
this.topicMetaLookup.set(topic.id, topic)
this.topicAssistantLookup.set(topic.id, assistant.id)
}
}
}
}
logger.info(
`Loaded ${this.assistantLookup.size} assistants and ${this.topicMetaLookup.size} topic metadata entries`
)
} else {
warnings.push('No assistant data found - topics will have null assistantMeta and missing names')
}
// Step 4: Count topics and estimate messages
const topicReader = ctx.sources.dexieExport.createStreamReader('topics')
this.topicCount = await topicReader.count()
logger.info(`Found ${this.topicCount} topics to migrate`)
// Estimate message count from sample
if (this.topicCount > 0) {
const sampleTopics = await topicReader.readSample<OldTopic>(10)
const avgMessagesPerTopic =
sampleTopics.reduce((sum, t) => sum + (t.messages?.length || 0), 0) / sampleTopics.length
this.messageCount = Math.round(this.topicCount * avgMessagesPerTopic)
logger.info(`Estimated ${this.messageCount} messages based on sample`)
}
// Step 5: Validate sample data
if (this.topicCount > 0) {
const sampleTopics = await topicReader.readSample<OldTopic>(5)
for (const topic of sampleTopics) {
if (!topic.id) {
warnings.push(`Found topic without id - will be skipped`)
}
if (!topic.messages || !Array.isArray(topic.messages)) {
warnings.push(`Topic ${topic.id} has invalid messages array`)
}
}
}
logger.info('Prepare phase completed', {
topics: this.topicCount,
estimatedMessages: this.messageCount,
blocks: this.blockLookup.size,
assistants: this.assistantLookup.size
})
return {
success: true,
itemCount: this.topicCount,
warnings: warnings.length > 0 ? warnings : undefined
}
} catch (error) {
logger.error('Prepare failed', error as Error)
return {
success: false,
itemCount: 0,
warnings: [error instanceof Error ? error.message : String(error)]
}
}
}
/**
* Execute phase - perform the actual data migration
*
* Processing strategy:
* 1. Stream topics in batches to control memory
* 2. For each topic batch:
* a. Transform topics and their messages
* b. Build message tree structure
* c. Insert topics in single transaction
* d. Insert messages in batched transactions
* 3. Report progress throughout
*/
async execute(ctx: MigrationContext): Promise<ExecuteResult> {
if (this.topicCount === 0) {
logger.info('No topics to migrate')
return { success: true, processedCount: 0 }
}
let processedTopics = 0
let processedMessages = 0
try {
const db = ctx.db
const topicReader = ctx.sources.dexieExport.createStreamReader('topics')
// Process topics in batches
await topicReader.readInBatches<OldTopic>(TOPIC_BATCH_SIZE, async (topics, batchIndex) => {
logger.debug(`Processing topic batch ${batchIndex + 1}`, { count: topics.length })
// Transform all topics and messages in this batch
const preparedData: PreparedTopicData[] = []
for (const oldTopic of topics) {
try {
const prepared = this.prepareTopicData(oldTopic)
if (prepared) {
preparedData.push(prepared)
} else {
this.skippedTopics++
}
} catch (error) {
logger.warn(`Failed to transform topic ${oldTopic.id}`, { error })
this.skippedTopics++
}
}
// Insert topics in a transaction
if (preparedData.length > 0) {
await db.transaction(async (tx) => {
// Insert topics
const topicValues = preparedData.map((d) => d.topic)
await tx.insert(topicTable).values(topicValues)
// Collect all messages, handling duplicate IDs by generating new ones
const allMessages: NewMessage[] = []
for (const data of preparedData) {
for (const msg of data.messages) {
if (this.seenMessageIds.has(msg.id)) {
const newId = uuidv4()
logger.warn(`Duplicate message ID found: ${msg.id}, assigning new ID: ${newId}`)
msg.id = newId
}
this.seenMessageIds.add(msg.id)
allMessages.push(msg)
}
}
// Insert messages in batches (SQLite parameter limit)
for (let i = 0; i < allMessages.length; i += MESSAGE_INSERT_BATCH_SIZE) {
const batch = allMessages.slice(i, i + MESSAGE_INSERT_BATCH_SIZE)
await tx.insert(messageTable).values(batch)
}
processedMessages += allMessages.length
})
processedTopics += preparedData.length
}
// Report progress
const progress = Math.round((processedTopics / this.topicCount) * 100)
this.reportProgress(
progress,
`Migrated ${processedTopics}/${this.topicCount} topics, ${processedMessages} messages`
)
})
logger.info('Execute completed', {
processedTopics,
processedMessages,
skippedTopics: this.skippedTopics,
skippedMessages: this.skippedMessages
})
// Log block statistics for diagnostics
logger.info('Block migration statistics', {
blocksRequested: this.blockStats.requested,
blocksResolved: this.blockStats.resolved,
blocksMissing: this.blockStats.requested - this.blockStats.resolved,
messagesWithEmptyBlocks: this.blockStats.messagesWithEmptyBlocks,
messagesWithMissingBlocks: this.blockStats.messagesWithMissingBlocks
})
return {
success: true,
processedCount: processedTopics
}
} catch (error) {
logger.error('Execute failed', error as Error)
return {
success: false,
processedCount: processedTopics,
error: error instanceof Error ? error.message : String(error)
}
}
}
/**
* Validate phase - verify migrated data integrity
*
* Validation checks:
* 1. Topic count matches source (minus skipped)
* 2. Message count is within expected range
* 3. Sample topics have correct structure
* 4. Foreign key integrity (messages belong to existing topics)
*/
async validate(ctx: MigrationContext): Promise<ValidateResult> {
const errors: ValidationError[] = []
const db = ctx.db
try {
// Count topics in target
const topicResult = await db.select({ count: sql<number>`count(*)` }).from(topicTable).get()
const targetTopicCount = topicResult?.count ?? 0
// Count messages in target
const messageResult = await db.select({ count: sql<number>`count(*)` }).from(messageTable).get()
const targetMessageCount = messageResult?.count ?? 0
logger.info('Validation counts', {
sourceTopics: this.topicCount,
targetTopics: targetTopicCount,
skippedTopics: this.skippedTopics,
targetMessages: targetMessageCount
})
// Validate topic count
const expectedTopics = this.topicCount - this.skippedTopics
if (targetTopicCount < expectedTopics) {
errors.push({
key: 'topic_count',
message: `Topic count mismatch: expected ${expectedTopics}, got ${targetTopicCount}`
})
}
// Sample validation: check a few topics have messages
const sampleTopics = await db.select().from(topicTable).limit(5).all()
for (const topic of sampleTopics) {
const msgCount = await db
.select({ count: sql<number>`count(*)` })
.from(messageTable)
.where(eq(messageTable.topicId, topic.id))
.get()
if (msgCount?.count === 0) {
// This is a warning, not an error - some topics may legitimately have no messages
logger.warn(`Topic ${topic.id} has no messages after migration`)
}
}
// Check for orphan messages (messages without valid topic)
// This shouldn't happen due to foreign key constraints, but verify anyway
const orphanCheck = await db
.select({ count: sql<number>`count(*)` })
.from(messageTable)
.where(sql`${messageTable.topicId} NOT IN (SELECT id FROM ${topicTable})`)
.get()
if (orphanCheck && orphanCheck.count > 0) {
errors.push({
key: 'orphan_messages',
message: `Found ${orphanCheck.count} orphan messages without valid topics`
})
}
return {
success: errors.length === 0,
errors,
stats: {
sourceCount: this.topicCount,
targetCount: targetTopicCount,
skippedCount: this.skippedTopics
}
}
} catch (error) {
logger.error('Validation failed', error as Error)
return {
success: false,
errors: [
{
key: 'validation',
message: error instanceof Error ? error.message : String(error)
}
],
stats: {
sourceCount: this.topicCount,
targetCount: 0,
skippedCount: this.skippedTopics
}
}
}
}
/**
* Prepare a single topic and its messages for migration
*
* @param oldTopic - Source topic from Dexie (has messages, may lack metadata)
* @returns Prepared data or null if topic should be skipped
*
* ## Data Merging
*
* Topic data comes from two sources:
* - Dexie `topics` table: Has `id`, `messages[]`, `assistantId`
* - Redux `assistants[].topics[]`: Has metadata (`name`, `pinned`, `prompt`, etc.)
*
* We merge Redux metadata into the Dexie topic before transformation.
*/
private prepareTopicData(oldTopic: OldTopic): PreparedTopicData | null {
// Validate required fields
if (!oldTopic.id) {
logger.warn('Topic missing id, skipping')
return null
}
// Merge topic metadata from Redux (name, pinned, etc.)
// Dexie topics may have stale or missing metadata; Redux is authoritative for these fields
const topicMeta = this.topicMetaLookup.get(oldTopic.id)
if (topicMeta) {
// Merge Redux metadata into Dexie topic
// Note: Redux topic.name can also be empty from ancient version migrations (see store/migrate.ts:303-305)
oldTopic.name = topicMeta.name || oldTopic.name
oldTopic.pinned = topicMeta.pinned ?? oldTopic.pinned
oldTopic.prompt = topicMeta.prompt ?? oldTopic.prompt
oldTopic.isNameManuallyEdited = topicMeta.isNameManuallyEdited ?? oldTopic.isNameManuallyEdited
// Use Redux timestamps if available and Dexie lacks them
if (topicMeta.createdAt && !oldTopic.createdAt) {
oldTopic.createdAt = topicMeta.createdAt
}
if (topicMeta.updatedAt && !oldTopic.updatedAt) {
oldTopic.updatedAt = topicMeta.updatedAt
}
}
// Fallback: If name is still empty after merge, use a default name
// This handles cases where both Dexie and Redux have empty names (ancient version bug)
if (!oldTopic.name) {
oldTopic.name = 'Unnamed Topic' // Default fallback for topics with no name
}
// Get assistantId from Redux mapping (Dexie topics don't store assistantId)
// Fall back to oldTopic.assistantId in case Dexie did store it (defensive)
const assistantId = this.topicAssistantLookup.get(oldTopic.id) || oldTopic.assistantId
if (assistantId && !oldTopic.assistantId) {
oldTopic.assistantId = assistantId
}
// Get assistant for meta generation
const assistant = this.assistantLookup.get(assistantId) || null
// Get messages array (may be empty or undefined)
const oldMessages = oldTopic.messages || []
// Build message tree structure
const messageTree = buildMessageTree(oldMessages)
// === First pass: identify messages to skip (no blocks) ===
const skippedMessageIds = new Set<string>()
const messageParentMap = new Map<string, string | null>() // messageId -> parentId
for (const oldMsg of oldMessages) {
const blockIds = oldMsg.blocks || []
const blocks = resolveBlocks(blockIds, this.blockLookup)
// Track block statistics for diagnostics
this.blockStats.requested += blockIds.length
this.blockStats.resolved += blocks.length
if (blockIds.length === 0) {
this.blockStats.messagesWithEmptyBlocks++
} else if (blocks.length < blockIds.length) {
this.blockStats.messagesWithMissingBlocks++
if (blocks.length === 0) {
logger.warn(`Message ${oldMsg.id} has ${blockIds.length} block IDs but none found in message_blocks`)
}
}
// Store parent info from tree
const treeInfo = messageTree.get(oldMsg.id)
messageParentMap.set(oldMsg.id, treeInfo?.parentId ?? null)
// Mark for skipping if no blocks
if (blocks.length === 0) {
skippedMessageIds.add(oldMsg.id)
this.skippedMessages++
}
}
// === Helper: resolve parent through skipped messages ===
// If parentId points to a skipped message, follow the chain to find a non-skipped ancestor
const resolveParentId = (parentId: string | null): string | null => {
let currentParent = parentId
const visited = new Set<string>() // Prevent infinite loops
while (currentParent && skippedMessageIds.has(currentParent)) {
if (visited.has(currentParent)) {
// Circular reference, break out
return null
}
visited.add(currentParent)
currentParent = messageParentMap.get(currentParent) ?? null
}
return currentParent
}
// === Second pass: transform messages that have blocks ===
const newMessages: NewMessage[] = []
for (const oldMsg of oldMessages) {
// Skip messages marked for skipping
if (skippedMessageIds.has(oldMsg.id)) {
continue
}
try {
const treeInfo = messageTree.get(oldMsg.id)
if (!treeInfo) {
logger.warn(`Message ${oldMsg.id} not found in tree, skipping`)
this.skippedMessages++
continue
}
// Resolve blocks for this message (we know it has blocks from first pass)
const blockIds = oldMsg.blocks || []
const blocks = resolveBlocks(blockIds, this.blockLookup)
// Resolve parentId through any skipped messages
const resolvedParentId = resolveParentId(treeInfo.parentId)
// Get assistant for this message (may differ from topic's assistant)
const msgAssistant = this.assistantLookup.get(oldMsg.assistantId) || assistant
const newMsg = transformMessage(
oldMsg,
resolvedParentId, // Use resolved parent instead of original
treeInfo.siblingsGroupId,
blocks,
msgAssistant,
oldTopic.id
)
newMessages.push(newMsg)
} catch (error) {
logger.warn(`Failed to transform message ${oldMsg.id}`, { error })
this.skippedMessages++
}
}
// Calculate activeNodeId based on migrated messages (not original messages)
// If no messages were migrated, set to null
let activeNodeId: string | null = null
if (newMessages.length > 0) {
// Use the last migrated message as active node
activeNodeId = newMessages[newMessages.length - 1].id
}
// Transform topic with correct activeNodeId
const newTopic = transformTopic(oldTopic, assistant, activeNodeId)
return {
topic: newTopic,
messages: newMessages
}
}
}


@@ -0,0 +1,138 @@
# ChatMigrator
The `ChatMigrator` handles the largest data migration task: topics and messages from Dexie/IndexedDB to SQLite.
## Data Sources
| Data | Source | File/Path |
|------|--------|-----------|
| Topics with messages | Dexie `topics` table | `topics.json` |
| Topic metadata (name, pinned, etc.) | Redux `assistants[].topics[]` | `ReduxStateReader.getCategory('assistants')` |
| Message blocks | Dexie `message_blocks` table | `message_blocks.json` |
| Assistants (for meta) | Redux `assistants` slice | `ReduxStateReader.getCategory('assistants')` |
### Topic Data Split (Important!)
The old system stores topic data in **two separate locations**:
1. **Dexie `topics` table**: Contains only `id` and `messages[]` array (NO `assistantId`!)
2. **Redux `assistants[].topics[]`**: Contains metadata (`name`, `pinned`, `prompt`, `isNameManuallyEdited`) and implicitly the `assistantId` (from parent assistant)
Redux deliberately clears `messages[]` to reduce storage size. The migrator merges these sources:
- Messages come from Dexie
- Metadata (name, pinned, etc.) comes from Redux
- `assistantId` comes from Redux structure (each assistant owns its topics)
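The merge can be sketched as a small pure function. The field names match the tables above, but the `DexieTopic`/`ReduxTopicMeta` shapes are simplified assumptions for illustration; the real types (and the in-place merge) live in `ChatMigrator.ts` and `mappings/ChatMappings.ts`:

```typescript
// Simplified shapes (assumptions for illustration).
interface DexieTopic { id: string; name?: string; pinned?: boolean; prompt?: string; messages: unknown[] }
interface ReduxTopicMeta { id: string; name?: string; pinned?: boolean; prompt?: string }

// Messages stay from Dexie; metadata fields prefer Redux; a fallback name
// covers the ancient empty-name bug where both sources have name === ''.
function mergeTopicMeta(dexie: DexieTopic, meta: ReduxTopicMeta | undefined): DexieTopic {
  const merged: DexieTopic = { ...dexie }
  if (meta) {
    merged.name = meta.name || merged.name
    merged.pinned = meta.pinned ?? merged.pinned
    merged.prompt = meta.prompt ?? merged.prompt
  }
  if (!merged.name) merged.name = 'Unnamed Topic' // fallback for empty names
  return merged
}
```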
## Key Transformations
1. **Linear → Tree Structure**
- Old: Messages stored as linear array in `topic.messages[]`
- New: Tree via `parentId` + `siblingsGroupId`
2. **Multi-model Responses**
- Old: `askId` links responses to user message, `foldSelected` marks active
- New: Shared `parentId` + non-zero `siblingsGroupId` groups siblings
3. **Block Inlining**
- Old: `message.blocks: string[]` (IDs) + separate `message_blocks` table
- New: `message.data.blocks: MessageDataBlock[]` (inline JSON)
4. **Citation Migration**
- Old: Separate `CitationMessageBlock` with `response`, `knowledge`, `memories`
- New: Merged into `MainTextBlock.references` as `ContentReference[]`
5. **Mention Migration**
- Old: `message.mentions: Model[]`
- New: `MentionReference[]` in `MainTextBlock.references`
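The first two transformations can be sketched together: walk the linear array, chain ordinary messages to the previous message, and give assistant replies that share an `askId` the same `parentId` (the asking user message) plus a common non-zero `siblingsGroupId`. This is an illustrative simplification of the idea behind `buildMessageTree` in `mappings/ChatMappings.ts`; the real implementation also uses `foldSelected` to decide which sibling the conversation continues from:

```typescript
interface OldMsg { id: string; role: 'user' | 'assistant'; askId?: string }
interface TreeInfo { parentId: string | null; siblingsGroupId: number }

function buildTreeSketch(messages: OldMsg[]): Map<string, TreeInfo> {
  const tree = new Map<string, TreeInfo>()
  const groupByAskId = new Map<string, number>() // askId -> siblingsGroupId
  let nextGroupId = 1
  let lastNodeId: string | null = null // parent for the next chained message

  for (const msg of messages) {
    if (msg.role === 'assistant' && msg.askId) {
      // Multi-model siblings: shared parent, shared non-zero group id.
      let groupId = groupByAskId.get(msg.askId)
      if (groupId === undefined) {
        groupId = nextGroupId++
        groupByAskId.set(msg.askId, groupId)
      }
      tree.set(msg.id, { parentId: msg.askId, siblingsGroupId: groupId })
      lastNodeId = msg.id
    } else {
      // Plain linear chaining: parent is the previous message.
      tree.set(msg.id, { parentId: lastNodeId, siblingsGroupId: 0 })
      lastNodeId = msg.id
    }
  }
  return tree
}
```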
## Data Quality Handling
The migrator handles potential data inconsistencies from the old system:
| Issue | Detection | Handling |
|-------|-----------|----------|
| **Duplicate message ID** | Same ID appears in multiple topics | Generate new UUID, log warning |
| **TopicId mismatch** | `message.topicId` ≠ parent `topic.id` | Use correct parent topic.id (silent fix) |
| **Missing blocks** | Block ID not found in `message_blocks` | Skip missing block (silent) |
| **Invalid topic** | Topic missing required `id` field | Skip entire topic |
| **Missing topic metadata** | Topic not found in Redux `assistants[].topics[]` | Use Dexie values, fallback name if empty |
| **Missing assistantId** | Topic not in any `assistant.topics[]` | `assistantId` and `assistantMeta` will be null |
| **Empty topic name** | Both Dexie and Redux have empty `name` (ancient bug) | Use fallback "Unnamed Topic" |
| **Message with no blocks** | `blocks` array is empty after resolution | Skip message, re-link children to parent's parent |
| **Topic with no messages** | All messages skipped (no blocks) | Keep topic, set `activeNodeId` to null |
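The re-linking in the last two rows follows the parent chain past skipped messages until it reaches a surviving ancestor, with a visited set guarding against circular references. A self-contained sketch mirroring the `resolveParentId` helper in `ChatMigrator.ts`:

```typescript
// parentOf maps messageId -> parentId (null for roots); skipped holds the
// IDs of messages dropped because they had no resolvable blocks.
function resolveParent(
  parentId: string | null,
  skipped: Set<string>,
  parentOf: Map<string, string | null>
): string | null {
  let current = parentId
  const visited = new Set<string>()
  while (current && skipped.has(current)) {
    if (visited.has(current)) return null // circular chain: detach from tree
    visited.add(current)
    current = parentOf.get(current) ?? null
  }
  return current
}
```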
## Field Mappings
### Topic Mapping
Topic data is merged from Dexie + Redux before transformation:
| Source | Target (topicTable) | Notes |
|--------|---------------------|-------|
| Dexie: `id` | `id` | Direct copy |
| Redux: `name` | `name` | Merged from Redux `assistants[].topics[]` |
| Redux: `isNameManuallyEdited` | `isNameManuallyEdited` | Merged from Redux |
| Redux: (parent assistant.id) | `assistantId` | From `topicAssistantLookup` mapping |
| (from Assistant) | `assistantMeta` | Generated from assistant entity |
| Redux: `prompt` | `prompt` | Merged from Redux |
| (computed) | `activeNodeId` | Last message ID or foldSelected |
| (none) | `groupId` | null (new field) |
| (none) | `sortOrder` | 0 (new field) |
| Redux: `pinned` | `isPinned` | Merged from Redux, renamed |
| (none) | `pinnedOrder` | 0 (new field) |
| `createdAt` | `createdAt` | ISO string → timestamp |
| `updatedAt` | `updatedAt` | ISO string → timestamp |
**Dropped fields**: `type` ('chat' | 'session')
### Message Mapping
| Source (OldMessage) | Target (messageTable) | Notes |
|---------------------|----------------------|-------|
| `id` | `id` | Direct copy (new UUID if duplicate) |
| (computed) | `parentId` | From tree building algorithm |
| (from parent topic) | `topicId` | Uses parent topic.id for consistency |
| `role` | `role` | Direct copy |
| `blocks` + `mentions` + citations | `data` | Complex transformation |
| (extracted) | `searchableText` | Extracted from text blocks |
| `status` | `status` | Normalized to success/error/paused |
| (computed) | `siblingsGroupId` | From multi-model detection |
| `assistantId` | `assistantId` | Direct copy |
| `modelId` | `modelId` | Direct copy |
| (from Message.model) | `modelMeta` | Generated from model entity |
| `traceId` | `traceId` | Direct copy |
| `usage` + `metrics` | `stats` | Merged into single stats object |
| `createdAt` | `createdAt` | ISO string → timestamp |
| `updatedAt` | `updatedAt` | ISO string → timestamp |
**Dropped fields**: `type`, `useful`, `enabledMCPs`, `agentSessionId`, `providerMetadata`, `multiModelMessageStyle`, `askId` (replaced by parentId), `foldSelected` (replaced by siblingsGroupId)
### Block Type Mapping
| Old Type | New Type | Notes |
|----------|----------|-------|
| `main_text` | `MainTextBlock` | Direct, references added from citations/mentions |
| `thinking` | `ThinkingBlock` | `thinking_millsec` → `thinkingMs` |
| `translation` | `TranslationBlock` | Direct copy |
| `code` | `CodeBlock` | Direct copy |
| `image` | `ImageBlock` | `file.id` → `fileId` |
| `file` | `FileBlock` | `file.id` → `fileId` |
| `video` | `VideoBlock` | Direct copy |
| `tool` | `ToolBlock` | Direct copy |
| `citation` | (removed) | Converted to `MainTextBlock.references` |
| `error` | `ErrorBlock` | Direct copy |
| `compact` | `CompactBlock` | Direct copy |
| `unknown` | (skipped) | Placeholder blocks are dropped |
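The citation fold-in (last transformation above) can be sketched with simplified block shapes. The field names below (`citations`, `references`, `kind`) are assumptions for illustration only; the actual shapes are defined in `mappings/ChatMappings.ts`:

```typescript
// Assumed, simplified shapes (not the project's real types).
interface OldBlockSketch { type: string; content?: string; citations?: { url: string; title?: string }[] }
interface ReferenceSketch { kind: 'content'; url: string; title?: string }
interface NewMainText { type: 'MainTextBlock'; content: string; references: ReferenceSketch[] }

// Fold standalone citation blocks into the main text block's references
// array; the citation block itself is dropped.
function inlineCitations(blocks: OldBlockSketch[]): NewMainText | null {
  const main = blocks.find((b) => b.type === 'main_text')
  if (!main) return null
  const references: ReferenceSketch[] = []
  for (const b of blocks) {
    if (b.type === 'citation') {
      for (const c of b.citations ?? []) {
        references.push({ kind: 'content', url: c.url, title: c.title })
      }
    }
  }
  return { type: 'MainTextBlock', content: main.content ?? '', references }
}
```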
## Implementation Files
- `ChatMigrator.ts` - Main migrator class with prepare/execute/validate phases
- `mappings/ChatMappings.ts` - Pure transformation functions and type definitions
## Code Quality
All implementation code includes detailed comments:
- File-level comments: Describe purpose, data flow, and overview
- Function-level comments: Purpose, parameters, return values, side effects
- Logic block comments: Step-by-step explanations for complex logic
- Data transformation comments: Old field → new field mapping relationships

File diff suppressed because it is too large.