Keywords

| Data Annotation | Annotation Task Design | Crowdsourcing Platforms | Annotation Quality | Quality Control | Annotator Training | Annotation Cost | Annotation Data Formats | Consistency Checks | Annotation Tools |


I. Annotation Task Design

1.1 A Taxonomy of Annotation Tasks

In data annotation for large-model training, the type of annotation task directly determines both data quality and annotation cost. A sound task taxonomy is the foundation of efficient annotation work.

Instruction-Following Annotation

Instruction-following annotation is the core annotation type of the SFT (supervised fine-tuning) stage; its goal is to teach the model to understand and execute user instructions. This type of annotation is characterized by:

  • Open-ended output: responses can be free-form natural-language text
  • Subjective evaluation: quality is hard to measure with automated metrics
  • Stylistic diversity: responses must remain varied and natural

Core Principles of Instruction Annotation

Good instruction annotations should embody the "Helpful, Harmless, Honest" (HHH) principles while remaining professional and practical. Annotators need cross-domain knowledge and strong writing skills.

Preference Annotation

Preference annotation is used mainly in the RLHF stage: by comparing responses of differing quality, it provides the training signal for the reward model. Typical tasks include:

  • Pairwise comparison: given two responses to the same instruction, mark which is better
  • Ranking: order multiple responses from best to worst
  • Absolute rating: score a response on a Likert scale
  • Fine-grained evaluation: score separately along several dimensions (relevance, accuracy, safety, etc.)
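The pairwise-comparison task above maps naturally onto a simple record type. The sketch below is illustrative: the function and field names are assumptions, not a fixed schema, though the five preference labels match the preference schema used later in this document:

```python
# A minimal sketch of a pairwise preference record.
# Field and function names are illustrative assumptions.
def make_pairwise_record(instruction, response_a, response_b, choice):
    """Build one pairwise-comparison annotation record."""
    allowed = {"a_better", "a_slightly_better", "tie",
               "b_slightly_better", "b_better"}
    if choice not in allowed:
        raise ValueError(f"unknown preference label: {choice}")
    return {
        "instruction": instruction,
        "response_a": response_a,
        "response_b": response_b,
        "preference": choice,
    }

record = make_pairwise_record(
    "Explain quantum entanglement",
    "Entanglement is a correlation between quantum systems...",
    "It is when particles talk to each other faster than light.",
    "a_better",
)
```

Keeping the label set closed at record-creation time catches typos before they contaminate the reward-model training data.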

Fact-Checking Annotation

Fact-checking annotation is critical for improving model reliability. It mainly covers:

  • Factual-correctness annotation: verify whether the factual claims in a response are accurate
  • Citation and provenance annotation: mark the source of a piece of information and its credibility
  • Hallucination detection: identify fabricated information in model-generated content

1.2 Best Practices for Task Design

Writing Clear Annotation Guidelines

The annotation guideline is the annotator's primary reference document, and its quality directly determines annotation consistency. Key points for writing a high-quality guideline:

# Sample Annotation Guideline: Dialogue Safety Annotation

## Background
This task evaluates the safety level of an AI assistant's replies.

## Annotation Dimensions

### 1. Safety rating (required)
- **Safe**: the reply contains no harmful content
- **Borderline**: the reply touches a sensitive topic but handles it appropriately
- **Risky**: the reply contains inappropriate content or poses a safety risk

### 2. Harmful-content subcategories
- [ ] Violent content
- [ ] Sexual content
- [ ] Hate speech
- [ ] Instructions for dangerous activities
- [ ] Privacy violations

## Decision Criteria

### Examples of safe content
- User asks "how do I bake a cake"; the reply provides a recipe -> Safe
- User asks "what should I do about a cold"; the reply suggests rest and fluids -> Safe

### Examples of borderline content
- Discussing a political figure while remaining neutral and objective -> Borderline
- Discussing religious beliefs without proselytizing -> Borderline

### Examples of risky content
- The reply contains unsolicited political propaganda -> Risky
- The reply gives incorrect medical advice that could cause harm -> Risky

## Special Notes
1. The model should refuse clearly harmful requests, but also avoid over-refusal
2. For questions in specialized fields, provide general information and recommend consulting an expert
3. For emergencies (e.g. suicidal ideation), a helpline must be provided
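A guideline like the one above can double as a lightweight validation schema. The sketch below checks one annotation against the allowed safety levels and subcategories; the label strings and field names are assumptions for illustration:

```python
# Validate a safety annotation against the guideline above.
# Label strings and field names are illustrative assumptions.
SAFETY_LEVELS = {"safe", "borderline", "risky"}
HARM_SUBCATEGORIES = {
    "violence", "sexual", "hate_speech", "dangerous_activity", "privacy",
}

def validate_safety_annotation(annotation):
    """Return a list of problems found in one safety annotation (empty = OK)."""
    problems = []
    if annotation.get("safety_level") not in SAFETY_LEVELS:
        problems.append(
            "safety_level must be one of: " + ", ".join(sorted(SAFETY_LEVELS))
        )
    unknown = set(annotation.get("harm_categories", [])) - HARM_SUBCATEGORIES
    if unknown:
        problems.append(f"unknown harm categories: {sorted(unknown)}")
    # Per the guideline, a "safe" rating implies no harm subcategories.
    if annotation.get("safety_level") == "safe" and annotation.get("harm_categories"):
        problems.append("a 'safe' rating must not list harm categories")
    return problems
```

Running such checks at submission time surfaces guideline violations immediately instead of during downstream quality review.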

Principles of Annotation-Interface Design

The design of the annotation interface directly affects both efficiency and accuracy:

| Design element | Best practice | Common pitfall |
|---|---|---|
| Task display | Focus on a single task; avoid information overload | Showing too many samples at once |
| Workflow | Intuitive, with few clicks | Convoluted steps that invite mistakes |
| Instant feedback | Show annotation progress and statistics in real time | No feedback, causing anxiety |
| Keyboard shortcuts | Provide shortcuts for frequent operations | Mouse-only operation |
| Error handling | Handle network failures and misclicks gracefully | Losing completed annotations |

II. Annotator Training

2.1 Designing the Training System

Tiered Training Architecture

Annotator training for large annotation projects should use a tiered architecture:

┌─────────────────────────────────────┐
│          Expert tier                │
│  (quality experts, project managers)│
├─────────────────────────────────────┤
│          Core tier                  │
│  (senior annotators, QA reviewers)  │
├─────────────────────────────────────┤
│          Foundation tier            │
│  (regular annotators)               │
└─────────────────────────────────────┘

Training Content Modules

  1. Foundation training (8-16 hours)

    • Project background and objectives
    • Annotation-platform walkthrough
    • Detailed reading of the annotation guideline
    • Basic annotation practice and assessment
  2. Advanced training (4-8 hours)

    • Handling complex cases
    • Judging borderline cases
    • Techniques for improving quality
    • Methods for improving efficiency
  3. Expert training (ongoing)

    • Interpreting new guidelines and Q&A
    • Quality analysis and feedback
    • Iterating on annotation standards

2.2 Tooling for Running Training

from datetime import datetime

import numpy as np

class AnnotationTrainer:
    """Training-management system for annotators"""
    
    def __init__(self):
        self.modules = {}
        self.trainees = {}
        
    def create_module(self, module_id, title, content, 
                     quiz_questions, passing_score=80):
        """Create a training module"""
        self.modules[module_id] = {
            "title": title,
            "content": content,
            "quiz": quiz_questions,
            "passing_score": passing_score,
            # Rough heuristic: ~500 content items per training hour
            "duration_hours": len(content) // 500
        }
        
    def assign_training(self, trainee_id, module_ids):
        """Assign training modules to a trainee"""
        for module_id in module_ids:
            if module_id not in self.trainees.setdefault(trainee_id, {}):
                self.trainees[trainee_id][module_id] = {
                    "status": "pending",
                    "progress": 0,
                    "quiz_scores": [],
                    "completion_time": None
                }
                
    def track_progress(self, trainee_id, module_id, 
                      completed_items, quiz_score):
        """Track a trainee's progress through a module"""
        progress = self.trainees[trainee_id][module_id]
        progress["completed_items"] = completed_items
        progress["quiz_scores"].append(quiz_score)
        progress["progress"] = len(completed_items) / len(
            self.modules[module_id]["content"]
        )
        
        if progress["quiz_scores"][-1] >= self.modules[module_id]["passing_score"]:
            progress["status"] = "passed"
            progress["completion_time"] = datetime.now()
            
    def generate_report(self, trainee_id):
        """Generate a training report for one trainee"""
        report = {
            "trainee_id": trainee_id,
            "modules_assigned": len(self.trainees[trainee_id]),
            "modules_completed": sum(
                1 for m in self.trainees[trainee_id].values()
                if m["status"] == "passed"
            ),
            "average_quiz_score": np.mean([
                max(m["quiz_scores"]) 
                for m in self.trainees[trainee_id].values()
                if m["quiz_scores"]
            ]),
            "recommended_tasks": self._recommend_tasks(trainee_id)
        }
        return report

    def _recommend_tasks(self, trainee_id):
        """Placeholder: recommend follow-up tasks based on training results"""
        return []

2.3 Capability Assessment and Certification

def evaluate_annotator_capability(trainee_id, calibration_samples, 
                                 gold_standard_labels):
    """
    Evaluate an annotator's capability.
    
    Args:
        trainee_id: annotator ID
        calibration_samples: calibration test samples (with reference answers)
        gold_standard_labels: the reference labels
    """
    from sklearn.metrics import cohen_kappa_score, accuracy_score, f1_score
    
    annotator_labels = []
    for sample in calibration_samples:
        # query_annotator is a project-specific helper that collects
        # this annotator's label for one sample
        annotation = query_annotator(trainee_id, sample)
        annotator_labels.append(annotation)
        
    results = {
        "accuracy": accuracy_score(gold_standard_labels, annotator_labels),
        "agreement": cohen_kappa_score(
            gold_standard_labels, annotator_labels
        ),
        "per_class_f1": f1_score(
            gold_standard_labels, annotator_labels, average=None
        ),
        "confidence_level": classify_confidence(...)
    }
    
    # Determine the certification level from the results
    if results["accuracy"] >= 0.95 and results["agreement"] >= 0.85:
        certification_level = "expert"
    elif results["accuracy"] >= 0.85 and results["agreement"] >= 0.70:
        certification_level = "senior"
    elif results["accuracy"] >= 0.75:
        certification_level = "qualified"
    else:
        certification_level = "needs_retraining"
        
    return {**results, "certification_level": certification_level}

III. Quality-Control Mechanisms

3.1 A Multi-Layer Quality-Control System

Gold-Standard Sample Monitoring

Gold-standard samples are test samples with pre-annotated correct answers, used to monitor annotator performance in real time:

import random

class GoldStandardMonitor:
    """Gold-standard sample monitoring system"""
    
    def __init__(self, gold_samples, check_frequency=10):
        self.gold_samples = gold_samples
        self.check_frequency = check_frequency
        self.hidden_gold_indices = {}
        
    def inject_gold_samples(self, task_batch, batch_id):
        """
        Inject gold-standard samples into a task batch.
        """
        modified_batch = list(task_batch)
        
        # Pick random injection positions: ~10% of the batch is gold
        n_golds = max(1, len(task_batch) // 10)
        gold_positions = sorted(random.sample(
            range(len(task_batch)), 
            min(n_golds, len(self.gold_samples))
        ))
        
        # Insert in ascending order, shifting each position by the number
        # of golds already inserted so the recorded indices stay correct
        for offset, (pos, gold_idx) in enumerate(
                zip(gold_positions, range(len(self.gold_samples)))):
            insert_at = pos + offset
            modified_batch.insert(insert_at, self.gold_samples[gold_idx])
            self.hidden_gold_indices[f"{batch_id}_{insert_at}"] = gold_idx
            
        return modified_batch
    
    def check_quality(self, batch_id, annotations):
        """
        Check annotation quality against the hidden gold samples.
        """
        issues = []
        for idx, annotation in annotations.items():
            key = f"{batch_id}_{idx}"
            if key in self.hidden_gold_indices:
                gold_idx = self.hidden_gold_indices[key]
                gold_answer = self.gold_samples[gold_idx]["label"]
                
                if annotation != gold_answer:
                    issues.append({
                        "position": idx,
                        "annotator_answer": annotation,
                        "correct_answer": gold_answer,
                        "error_type": "gold_mismatch"
                    })
                    
        return self._calculate_quality_score(issues, len(annotations))

    def _calculate_quality_score(self, issues, n_annotations):
        """Summarize the check as the fraction of annotations without gold mismatches"""
        score = 1 - len(issues) / max(n_annotations, 1)
        return {"quality_score": score, "issues": issues}

Cross-Validation

For tasks requiring high accuracy, have several annotators label the same sample independently:

import random
from collections import Counter, defaultdict

class CrossValidationManager:
    """Cross-validation management system"""
    
    def __init__(self, n_annotators_per_sample=3):
        self.n_annotators = n_annotators_per_sample
        self.annotations = defaultdict(list)
        
    def assign_task(self, sample_id, annotator_pool):
        """Assign one sample to several annotators"""
        selected_annotators = random.sample(
            annotator_pool, 
            self.n_annotators
        )
        
        for annotator_id in selected_annotators:
            self.annotations[sample_id].append({
                "annotator_id": annotator_id,
                "status": "pending",
                "result": None
            })
            
    def resolve_conflicts(self, sample_id):
        """
        Resolve annotation conflicts.
        
        Resolution strategies:
        - majority_vote: take the majority label
        - weighted_vote: weight votes by annotator quality
        - expert_review: escalate to an expert
        """
        annotations = self.annotations[sample_id]
        completed = [a for a in annotations if a["status"] == "completed"]
        
        if not completed:
            return None
            
        labels = [a["result"] for a in completed]
        
        # Majority vote
        vote_counts = Counter(labels)
        majority_label, count = vote_counts.most_common(1)[0]
        
        if count > len(completed) / 2:
            return {
                "resolved_label": majority_label,
                "confidence": count / len(completed),
                "resolution_method": "majority_vote",
                "disagreement_count": len(completed) - count
            }
        else:
            # No majority: escalate to expert review
            return {
                "status": "needs_expert_review",
                "candidate_labels": vote_counts,
                "expert_required": True
            }

3.2 A System of Quality Metrics

| Metric type | Metric | Computation | Suggested threshold |
|---|---|---|---|
| Accuracy | Agreement with gold standard | correct / total | >90% |
| Consistency | Cohen's Kappa | κ = (P₀-Pₑ)/(1-Pₑ) | >0.70 |
| Efficiency | Annotation throughput | tasks completed / hours worked | >100 items/hour |
| Stability | Test-retest agreement | fraction of re-annotated samples labeled identically | >85% |
| Coverage | Task completion rate | completed / total tasks | >95% |
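The kappa formula in the table can be computed directly. This pure-Python sketch follows the κ = (P₀-Pₑ)/(1-Pₑ) definition for two annotators labeling the same samples, where P₀ is the observed agreement rate and Pₑ the agreement expected by chance:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (P0 - Pe) / (1 - Pe) for two label sequences."""
    assert labels_a and len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement P0: fraction of samples with identical labels
    p0 = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement Pe from each annotator's marginal label distribution
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    pe = sum((freq_a[label] / n) * (freq_b[label] / n) for label in freq_a)
    if pe == 1:  # both annotators used a single identical label
        return 1.0
    return (p0 - pe) / (1 - pe)
```

For example, two annotators in perfect agreement score κ = 1.0, while agreement no better than chance scores κ = 0, which is why the table's >0.70 threshold is far stricter than raw accuracy.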

3.3 Feedback and Improvement Loops

from collections import Counter
from datetime import datetime

class AnnotationFeedbackSystem:
    """Annotation feedback and improvement system"""
    
    def __init__(self):
        self.issue_categories = {
            "guideline_ambiguity": [],
            "annotator_error": [],
            "task_design_flaw": [],
            "platform_issue": []
        }
        
    def submit_feedback(self, annotator_id, task_id, issue_type,
                       description, severity):
        """Submit one piece of feedback"""
        feedback = {
            "annotator_id": annotator_id,
            "task_id": task_id,
            "issue_type": issue_type,
            "description": description,
            "severity": severity,  # low, medium, high, critical
            "timestamp": datetime.now(),
            "status": "open"
        }
        
        self.issue_categories[issue_type].append(feedback)
        return feedback
        
    def analyze_and_improve(self):
        """Analyze accumulated feedback and propose improvements"""
        improvements = []
        
        # Detect guideline-ambiguity problems
        guideline_issues = self.issue_categories["guideline_ambiguity"]
        if len(guideline_issues) > 10:
            improvements.append({
                "type": "guideline_update",
                "description": "Guideline ambiguity detected; the annotation guideline needs updating",
                "affected_tasks": len(set(i["task_id"] for i in guideline_issues)),
                "priority": len(guideline_issues) / 100
            })
            
        # Detect systematic annotator problems
        annotator_issues = self.issue_categories["annotator_error"]
        annotator_error_counts = Counter(
            i["annotator_id"] for i in annotator_issues
        )
        problematic_annotators = [
            aid for aid, count in annotator_error_counts.items()
            if count > 20
        ]
        
        if problematic_annotators:
            improvements.append({
                "type": "annotator_retraining",
                "description": "Some annotators need retraining",
                "affected_annotators": problematic_annotators,
                "priority": len(problematic_annotators) / len(annotator_error_counts)
            })
            
        return improvements

IV. Choosing an Annotation Platform

4.1 Comparison of Mainstream Platforms

Professional Crowdsourcing Platforms

| Platform | Strengths | Weaknesses | Best fit |
|---|---|---|---|
| Scale AI | Professional LLM data annotation; supports complex workflows | Relatively expensive | Enterprise-scale annotation |
| Label Studio | Open source, self-hostable, highly customizable | Requires an engineering team to maintain | Mid-scale projects with customization needs |
| Amazon MTurk | Low cost, large labor pool | Quality control is difficult | Large-scale simple annotation tasks |
| Prolific | High-quality annotators | Relatively expensive; smaller pool | Research-grade, high-quality annotation |
| Appen | Strong Chinese-language support, professional services | Expensive | Large-scale annotation for companies in China |

Open-Source Self-Hosted Setup

# Example Label Studio configuration
api_key: ${LABEL_STUDIO_API_KEY}
 
projects:
  instruction_following:
    name: "Instruction-following annotation"
    label_config: |
      <View>
        <Header value="Please rate the quality of the AI reply below"/>
        <Text value="$instruction"/>
        <Text value="$response"/>
        <Choices name="quality" toName="response">
          <Choice value="Excellent"/>
          <Choice value="Good"/>
          <Choice value="Fair"/>
          <Choice value="Poor"/>
        </Choices>
        <TextArea name="feedback" toName="response" 
                  placeholder="Enter detailed feedback..."/>
      </View>
    min_annotations_to_train: 100
    maximum_annotations: 3
    
  preference_ranking:
    name: "Preference-ranking annotation"
    label_config: |
      <View>
        <Header value="Please compare the two replies below"/>
        <Text value="$instruction"/>
        <Text value="$response_a"/>
        <Text value="$response_b"/>
        <Choices name="preference" toName="instruction">
          <Choice value="A is clearly better"/>
          <Choice value="A is slightly better"/>
          <Choice value="About the same"/>
          <Choice value="B is slightly better"/>
          <Choice value="B is clearly better"/>
        </Choices>
      </View>

4.2 A Decision Framework for Platform Selection

class PlatformSelector:
    """Annotation-platform selector"""
    
    def __init__(self):
        # _load_platform_info is a project-specific helper that returns
        # per-platform feature and cost profiles
        self.platforms = self._load_platform_info()
        
    def recommend_platform(self, requirements):
        """
        Recommend the best-fitting platform for a set of requirements.
        
        Decision factors:
        - annotation-task complexity
        - data volume
        - budget constraints
        - quality requirements
        - time constraints
        - language requirements
        """
        scores = {}
        
        for platform_id, platform in self.platforms.items():
            score = 0
            
            # Match on task complexity
            if requirements["complexity"] == "high":
                score += platform["advanced_features"] * 2
            else:
                score += platform["simple_task_speed"]
                
            # Economies of scale
            if requirements["scale"] >= 100000:
                score += platform["scale_capacity"] * 1.5
                
            # Cost efficiency
            cost_score = platform["base_cost"] / requirements["budget"]
            score += (1 - min(cost_score, 1)) * 30
            
            # Quality assurance
            score += platform["quality_control_features"] * 20
            
            # Language support
            if requirements["language"] in platform["supported_languages"]:
                score += 15
                
            scores[platform_id] = score
            
        ranked = sorted(scores.items(), key=lambda x: x[1], reverse=True)
        
        return {
            "primary_recommendation": ranked[0][0],
            "alternatives": ranked[1:4],
            "scores": scores
        }

V. Cost-Optimization Strategies

5.1 Analyzing the Cost Structure of Annotation

Cost Components

The total cost of an annotation project is made up of several components:

| Cost category | Typical share | Optimization potential |
|---|---|---|
| Annotation labor | 60-80% | Medium |
| Platform fees | 5-15% | Low |
| Quality control | 10-20% | High |
| Management and coordination | 5-10% | Medium |
| Technical infrastructure | 3-8% | Low |

Computing Unit Cost

import numpy as np

class CostAnalyzer:
    """Annotation cost analyzer"""
    
    def __init__(self):
        self.cost_records = []
        
    def calculate_unit_cost(self, project_id):
        """
        Compute unit annotation costs.
        
        Returns:
        - cost_per_sample: cost per annotated sample
        - cost_per_quality_point: cost per quality point
        - cost_breakdown: cost by category
        """
        records = [r for r in self.cost_records if r["project_id"] == project_id]
        
        total_cost = sum(r["total_cost"] for r in records)
        total_samples = sum(r["samples_completed"] for r in records)
        avg_quality = np.mean([r["avg_quality"] for r in records])
        
        breakdown = self._breakdown_by_category(records)
        
        return {
            "cost_per_sample": total_cost / total_samples,
            "cost_per_quality_point": total_cost / avg_quality,
            "total_samples": total_samples,
            "average_quality": avg_quality,
            "cost_breakdown": breakdown,
            # _generate_recommendations is a project-specific helper that
            # turns the breakdown into concrete optimization suggestions
            "optimization_recommendations": self._generate_recommendations(
                breakdown
            )
        }
        
    def _breakdown_by_category(self, records):
        """Break costs down by category"""
        categories = {
            "labor": 0,
            "platform": 0,
            "qc": 0,
            "management": 0,
            "infrastructure": 0
        }
        
        for r in records:
            for cat in categories:
                categories[cat] += r.get(f"{cat}_cost", 0)
                
        total = sum(categories.values())
        return {
            cat: {
                "amount": amount,
                "percentage": amount / total * 100
            }
            for cat, amount in categories.items()
        }

5.2 Cost-Optimization Tactics

Smart Task Routing

class SmartTaskRouter:
    """Smart task-routing system"""
    
    def __init__(self, task_classifier, annotator_registry):
        self.classifier = task_classifier
        self.annotators = annotator_registry
        
    def route_task(self, task, available_annotators):
        """
        Assign a task based on its features and each annotator's ability.
        
        Optimization goals:
        - minimize annotation cost
        - maximize annotation quality
        - balance annotator workload
        """
        task_features = self.classifier.extract_features(task)
        
        # Score each annotator's fit for this task
        candidates = []
        for annotator in available_annotators:
            # _calculate_fit_score is a project-specific helper mapping
            # (task features, annotator profile) to a fit score in (0, 1]
            fit_score = self._calculate_fit_score(
                task_features, annotator
            )
            
            # Balance cost against quality
            effective_cost = annotator["hourly_rate"] / fit_score
            expected_quality = fit_score * annotator["baseline_quality"]
            
            candidates.append({
                "annotator_id": annotator["id"],
                "fit_score": fit_score,
                "effective_cost": effective_cost,
                "expected_quality": expected_quality,
                "value_score": expected_quality / effective_cost
            })
            
        # Pick the annotator with the best quality-per-cost ratio
        best = max(candidates, key=lambda x: x["value_score"])
        
        return {
            "assigned_annotator": best["annotator_id"],
            "estimated_cost": best["effective_cost"],
            "expected_quality": best["expected_quality"],
            "alternatives": sorted(
                candidates, key=lambda x: x["value_score"], reverse=True
            )[1:3]
        }

Active-Learning Annotation

Use an active-learning strategy to reduce the number of samples that need annotation:

import numpy as np

class ActiveLearningAnnotator:
    """Active-learning annotation system"""
    
    def __init__(self, model, uncertainty_threshold=0.3):
        self.model = model
        self.threshold = uncertainty_threshold
        self.labeled_pool = []
        self.unlabeled_pool = []
        
    def select_samples_for_annotation(self, n_samples=100):
        """
        Select the most valuable samples for annotation.
        
        Selection strategies:
        1. samples where the model is most uncertain
        2. samples most different from existing annotations
        3. samples from under-represented regions
        """
        uncertainties = []
        
        for sample in self.unlabeled_pool:
            probs = self.model.predict_proba(sample["features"])
            # Predictive entropy as the uncertainty measure
            entropy = -np.sum(probs * np.log(probs + 1e-10))
            uncertainties.append((sample, entropy))
            
        # Sort by uncertainty and take the top samples
        sorted_by_uncertainty = sorted(
            uncertainties, key=lambda x: x[1], reverse=True
        )
        
        selected = [
            sample for sample, _ in sorted_by_uncertainty[:n_samples]
        ]
        
        return selected
        
    def update_model(self, new_annotations):
        """
        Update the model with newly annotated data.
        """
        new_ids = {a["id"] for a in new_annotations}
        self.labeled_pool.extend(new_annotations)
        self.unlabeled_pool = [
            s for s in self.unlabeled_pool 
            if s["id"] not in new_ids
        ]
        
        # Incrementally train the model on the new labels
        self.model.incremental_train(
            [a["features"] for a in new_annotations],
            [a["label"] for a in new_annotations]
        )

VI. Annotation Data Formats

6.1 Standard Data Formats

JSONL

JSONL is the dominant format for large-scale annotation data:

{"id": "sample_001", "instruction": "Explain the concept of quantum entanglement", "response": "Quantum entanglement is...", "metadata": {"source": "manual", "annotator": "A123", "timestamp": "2026-04-18T10:00:00Z", "quality_score": 0.95}}
{"id": "sample_002", "instruction": "Write a poem about spring", "response": "Spring winds green the river's southern bank...", "metadata": {"source": "manual", "annotator": "A123", "timestamp": "2026-04-18T10:05:00Z", "quality_score": 0.88}}
{"id": "sample_003", "instruction": "How do I learn programming?", "response": "Learning to program requires...", "metadata": {"source": "synthetic", "generator": "gpt-4", "timestamp": "2026-04-18T09:00:00Z", "quality_score": 0.72}}

Multi-Turn Conversation Format

{
  "conversation_id": "conv_12345",
  "turns": [
    {
      "role": "user",
      "content": "I want to learn machine learning. Where should I start?",
      "timestamp": "2026-04-18T10:00:00Z"
    },
    {
      "role": "assistant", 
      "content": "Start with Python programming fundamentals, then move on to...",
      "timestamp": "2026-04-18T10:00:30Z",
      "annotations": {
        "quality_rating": 4.5,
        "safety_check": "pass",
        "factual_accuracy": 0.95
      }
    },
    {
      "role": "user",
      "content": "Which online courses do you recommend?",
      "timestamp": "2026-04-18T10:01:00Z"
    }
  ],
  "metadata": {
    "domain": "education",
    "language": "zh",
    "complexity": "intermediate"
  }
}
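A conversation in this format can be flattened into (instruction, response) pairs for SFT. The sketch below pairs each user turn with the assistant turn that directly follows it; this is a simplifying assumption that ignores dangling user turns and any roles beyond user/assistant:

```python
def conversation_to_pairs(conversation):
    """Pair each user turn with the assistant turn that directly follows it."""
    turns = conversation["turns"]
    pairs = []
    for prev, curr in zip(turns, turns[1:]):
        if prev["role"] == "user" and curr["role"] == "assistant":
            pairs.append({
                "instruction": prev["content"],
                "response": curr["content"],
            })
    return pairs

conv = {"turns": [
    {"role": "user", "content": "I want to learn machine learning. Where should I start?"},
    {"role": "assistant", "content": "Start with Python programming fundamentals..."},
    {"role": "user", "content": "Which online courses do you recommend?"},
]}
pairs = conversation_to_pairs(conv)
```

The trailing user turn in the example produces no pair, which is the desired behavior for a conversation awaiting its next assistant reply.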

6.2 Data Validation and Conversion

import json
import jsonschema
 
class AnnotationDataValidator:
    """Annotation data validator"""
    
    def __init__(self):
        self.schemas = self._load_schemas()
        
    def _load_schemas(self):
        """Load the data schema definitions"""
        return {
            "instruction_response": {
                "type": "object",
                "required": ["id", "instruction", "response"],
                "properties": {
                    "id": {"type": "string"},
                    "instruction": {"type": "string", "minLength": 5},
                    "response": {"type": "string", "minLength": 10},
                    "metadata": {
                        "type": "object",
                        "properties": {
                            "source": {"type": "string", "enum": ["manual", "synthetic", "processed"]},
                            "annotator": {"type": "string"},
                            "timestamp": {"type": "string", "format": "date-time"},
                            "quality_score": {"type": "number", "minimum": 0, "maximum": 1}
                        }
                    }
                }
            },
            "preference": {
                "type": "object",
                "required": ["id", "instruction", "response_a", "response_b", "preference"],
                "properties": {
                    "preference": {
                        "type": "string", 
                        "enum": ["a_better", "a_slightly_better", "tie", "b_slightly_better", "b_better"]
                    }
                }
            }
        }
        
    def validate_dataset(self, file_path, schema_name):
        """Validate a JSONL dataset against a named schema"""
        with open(file_path, 'r', encoding='utf-8') as f:
            data = [json.loads(line) for line in f]
            
        schema = self.schemas[schema_name]
        errors = []
        
        for idx, item in enumerate(data):
            try:
                jsonschema.validate(item, schema)
            except jsonschema.ValidationError as e:
                errors.append({
                    "line": idx + 1,
                    "item_id": item.get("id", "unknown"),
                    "error": str(e.message),
                    "failed_path": list(e.path)
                })
                
        return {
            "total_items": len(data),
            "valid_items": len(data) - len(errors),
            "error_count": len(errors),
            "errors": errors[:100]  # return at most 100 errors
        }
        
    def convert_format(self, input_file, output_format, 
                      output_file=None):
        """Convert a JSONL dataset to another format"""
        with open(input_file, 'r', encoding='utf-8') as f:
            data = [json.loads(line) for line in f]
            
        if output_format == "sharegpt":
            converted = [self._to_sharegpt(item) for item in data]
        elif output_format == "chatml":
            converted = [self._to_chatml(item) for item in data]
        else:
            raise ValueError(f"Unsupported format: {output_format}")
            
        if output_file:
            with open(output_file, 'w', encoding='utf-8') as f:
                for item in converted:
                    f.write(json.dumps(item, ensure_ascii=False) + '\n')
                    
        return converted
        
    def _to_sharegpt(self, item):
        """Convert one record to the ShareGPT format"""
        return {
            "id": item["id"],
            "conversations": [
                {"from": "human", "value": item["instruction"]},
                {"from": "gpt", "value": item["response"]}
            ]
        }
        
    def _to_chatml(self, item):
        """Convert one record to the ChatML format"""
        return {
            "messages": [
                {"role": "user", "content": item["instruction"]},
                {"role": "assistant", "content": item["response"]}
            ]
        }
Related Documents