zhouyang.xie 54948f9ffa Switch to GitHub jwjohns/unsloth-GRPO-qwen2.5 to validate GRPO model training 4 months ago
..
conf_train.yaml 54948f9ffa Switch to GitHub jwjohns/unsloth-GRPO-qwen2.5 to validate GRPO model training 4 months ago