Here is the full log — could you help me figure out what is causing this failure?
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - ============================================================
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - [flagged_words_filter] Filter Summary Statistics
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - ============================================================
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - Total samples: 2
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - Kept samples: 2 (100.00%)
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - Filtered samples: 0 (0.00%)
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 -
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - No samples filtered. All samples passed the filter.
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 -
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - Filter parameters:
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - - Language: zh
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - - Tokenization: False
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - - Max ratio: 0.001 (0.10%)
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - - Use words augmentation: True
2026-04-13 15:19:10 | INFO | data_engine.ops.base_op:635 - ============================================================
2026-04-13 15:19:10 | INFO | data_engine.core.data:202 - OP [flagged_words_filter] Done in 0.808s. Left 2 samples.
2026-04-13 15:19:10 | DEBUG | data_engine.utils.process_utils:30 - Setting multiprocess start method to 'fork'
2026-04-13 15:19:10 | DEBUG | data_engine.ops.base_op:182 - Op [text_length_filter] running with number of procs:3
2026-04-13 15:19:10 | DEBUG | data_engine.ops.base_op:182 - Op [text_length_filter] running with number of procs:3
text_length_filter_compute_stats (num_proc=2): 0%| | 0/2 [00:00<?, ? examples/s]
text_length_filter_compute_stats (num_proc=2): 50%|##### | 1/2 [00:00<00:00, 9.21 examples/s]
text_length_filter_compute_stats (num_proc=2): 100%|##########| 2/2 [00:00<00:00, 8.82 examples/s]
2026-04-13 15:19:11 | DEBUG | data_engine.ops.base_op:182 - Op [text_length_filter] running with number of procs:3
text_length_filter_process (num_proc=2): 0%| | 0/2 [00:00<?, ? examples/s]
text_length_filter_process (num_proc=2): 100%|##########| 2/2 [00:00<00:00, 10.29 examples/s]
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - ============================================================
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - [text_length_filter] Filter Summary Statistics
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - ============================================================
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - Total samples: 2
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - Kept samples: 2 (100.00%)
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - Filtered samples: 0 (0.00%)
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 -
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - No samples filtered. All samples passed the filter.
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 -
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - Filter parameters:
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - - Min length: 10 characters
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - - Max length: 999999 characters
2026-04-13 15:19:11 | INFO | data_engine.ops.base_op:635 - ============================================================
2026-04-13 15:19:11 | INFO | data_engine.core.data:202 - OP [text_length_filter] Done in 0.692s. Left 2 samples.
2026-04-13 15:19:11 | INFO | data_engine.tools.legacies.analyzer:101 - Exporting dataset to disk...
2026-04-13 15:19:11 | INFO | data_engine.exporter.base_exporter:130 - Exporting computed stats into a single file...
Creating json from Arrow format: 0%| | 0/1 [00:00<?, ?ba/s]
Creating json from Arrow format: 100%|##########| 1/1 [00:00<00:00, 144.71ba/s]
2026-04-13 15:19:11 | ERROR | data_server.job.JobExecutor:147 - Job 54 execution failed with error: [Errno 2] No such file or directory: '/data/dataflow_data/dwb-test222_edba79c2-c1f0-4fe3-8925-1616158e803d/output/_df_dataset_stats.jsonl/_data/x_stats.jsonl'
2026-04-13 15:19:11 | INFO | data_server.job.JobExecutor:157 - Job 54 marked as FAILED
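The actual failure is the `[Errno 2] No such file or directory` raised while exporting stats. Two things stand out in the failing path `.../output/_df_dataset_stats.jsonl/_data/x_stats.jsonl`: a `.jsonl` name (`_df_dataset_stats.jsonl`) is being used as a directory, and on a write `ENOENT` almost always means a parent directory of the target file was never created. A minimal sketch of that failure mode and the usual fix, creating parents before opening the file — note that `export_single_file` is a hypothetical helper for illustration, not the real `data_engine` exporter:

```python
import os
import tempfile

def export_single_file(records, export_path):
    # open(path, "w") raises [Errno 2] No such file or directory when any
    # parent directory of export_path is missing. Creating the parents
    # first avoids the failure seen in the log above.
    # (Hypothetical helper, not the actual data_engine exporter code.)
    os.makedirs(os.path.dirname(export_path), exist_ok=True)
    with open(export_path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(rec + "\n")

tmp = tempfile.mkdtemp()
# Mirror the nested, not-yet-created directories from the error message,
# including the ".jsonl" component that is treated as a directory.
path = os.path.join(tmp, "output", "_df_dataset_stats.jsonl",
                    "_data", "x_stats.jsonl")
export_single_file(['{"text_len": 42}'], path)
print(os.path.exists(path))  # → True
```

If the exporter is supposed to produce a single file (the log says "Exporting computed stats into a single file..."), the shape of the path also suggests a path-construction bug: the configured single-file target is being used as a shard directory with `_data/x_stats.jsonl` appended underneath it, so checking the exporter's single-file vs. sharded code path (and any stale file already occupying `_df_dataset_stats.jsonl`) would be the next step.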