tianyaogavin committed on
Commit
87b8a8a
·
1 Parent(s): 1bf36cc

ct2 translator

Framework.md CHANGED
@@ -1,71 +1,77 @@
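- ## [Pseudo-Streaming Audio Transcription + LLM Optimization: System Architecture Diagram]
 
- ### 🌊 Overall Flow
 
  ```mermaid
  graph TD
- A[Audio stream input] --> B[VAD]
- B --> C[Transcribe]
- C --> D[Semantic aggregation controller]
 
- D --> E[Instant output module]
- D --> F[LLM optimization scheduler]
 
- F --> G[Optimization backfill module]
- G --> E
- E --> H[Translation module]
  ```
-
  ---
 
- ### 🧱 Module Breakdown (pseudo-streaming at the core)
-
- #### **Module A: Audio stream input**
- - **Responsibility**: Receive audio from the user's microphone or a remote voice stream (e.g. WebRTC, WebSocket) and split the continuous audio into frames (e.g. 20 ms per frame).
- - **Characteristics**: A continuously running listener that pushes PCM / numpy arrays downstream.
- - **Real-time guarantee**: Cap the frame buffer length (to prevent blocking); asynchronous I/O (works for both web and local use).
-
- #### **Module B: VAD segmenter (vad)**
- - **Responsibility**: Identify speech segments from signals such as energy, silence, and speech boundaries.
- - **Output**: Segment audio chunks + timestamps.
- - **Characteristics**: Collects frames with a sliding window and supports overlapping frames (convenient for Whisper feature extraction).
- - **Real-time guarantee**: The segmentation logic is low-latency; each segment is pushed as soon as it is complete.
-
- #### **Module C: Whisper transcription module (transcribe)**
- - **Responsibility**: Transcribe each VAD segment with Whisper, producing text + timestamps.
- - **Characteristics**: Whisper is called per segment, so the work parallelizes naturally; GPU inference can accelerate it.
- - **Real-time guarantee**: Keep segment length within 1 to 5 s; support multiple transcription workers running asynchronously.
-
- #### **Module D: Semantic aggregation controller** (⚠️ core controller)
- - **Responsibility**: Maintain a buffer pool of N segments, decide whether the segments "form a complete semantic unit", and push to two downstream consumers:
-   - the instant display module (raw or optimized transcript)
-   - the fine-tune queue (asynchronous LLM optimization)
- - **Decision logic**: Based on punctuation, pauses, time gaps, a small model, or rules.
- - **Real-time guarantee**: Set a maximum delay window to keep sentences from running together.
-
- #### **Module E: Instant output module (display)**
- - **Responsibility**: Show the aggregated transcript to the user immediately, whether it is the raw or the optimized text.
- - **Characteristics**: No waiting, no dependencies, low-latency output; supports sentence updates.
- - **Real-time guarantee**: Shortest path; displayed as the first-response version.
-
- #### **Module F: LLM optimization scheduler (optimizer)**
- - **Responsibility**: Receive sentences awaiting optimization and add them to the optimization task queue.
- - **Characteristics**: Task scheduling, parallel execution, load balancing; supports multiple models and controllable timeouts.
- - **Real-time guarantee**: Asynchronous and non-blocking; does not affect the main flow.
-
- #### **Module G: Optimization backfill module**
- - **Responsibility**: Match the LLM-optimized result to the original sentence ID, backfill it in place of the original, and push it to the instant output module.
- - **Characteristics**: Not a forced overwrite; differential updates are possible; the UI distinguishes raw from optimized versions.
- - **Real-time guarantee**: Backfill runs asynchronously and does not disturb the main subtitles.
-
- #### **Module H: Translation module (translator)**
- - **Responsibility**: Receive every sentence from the instant output module (raw or optimized) and translate it into the target language.
- - **Characteristics**: A single translation module that handles text of varying quality; can translate into several languages in parallel, or cache and retranslate after optimization.
- - **Real-time guarantee**: Translate-as-transcribed and backfill updates can be decoupled, which supports a pseudo-streaming experience.
 
  ---
 
  ### 🔧 Module Function Notes
 
- (Details omitted; they match the above and can be synced later as needed)
 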
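+ # [Pseudo-Streaming Audio Transcription + LLM Optimization: System Architecture Diagram]
 
+ ## 🌊 Overall Flow
 
  ```mermaid
  graph TD
+ A[Audio stream input] --> B["VAD (20ms)"]
+ B --> C["Transcribe (200ms)"]
+ C --> D["Fast translation module (200ms)"]
+ D --> E["Instant output module (unconfirmed state)"]
 
+ C --> F["Translation confirmation module (optional optimization)"]
+ F --> G["Optimized translation module (LLM or re-transcription) (500ms)"]
+ G --> H["Asynchronous output module (confirmed state)"]
 
  ```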
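  ---
 
+ ## 🧱 Module Breakdown (pseudo-streaming at the core)
+
+ ### **Module A: Audio stream input**
+ - **Responsibility**: Receive audio from the user's microphone or a remote voice stream (e.g. WebRTC, WebSocket) and split the continuous audio into frames (e.g. 20 ms per frame).
+ - **Characteristics**: A continuously running listener that pushes PCM frames or numpy arrays downstream.
+ - **Real-time guarantee**: Limit the frame buffer length (to prevent blocking); implemented with asynchronous I/O (supports local and web scenarios).
+
+ ### **Module B: VAD segmenter**
+ - **Responsibility**: Split the audio into speech segments using speech energy, silence detection, and speech-boundary logic.
+ - **Output**: Segment audio data chunks with timestamps.
+ - **Characteristics**: Sliding-window based with support for frame overlap, which helps Whisper feature extraction.
+ - **Real-time guarantee**: Very low latency; each segment is pushed downstream as soon as it is produced.
+
+ ### **Module C: Whisper transcription module**
+ - **Responsibility**: Run Whisper inference on each segment output by the VAD to produce the transcript text.
+ - **Output**: Raw text passages (with timestamps).
+ - **Characteristics**: Segments are processed in parallel as independent units; can be GPU-accelerated.
+ - **Real-time guarantee**: Each segment is 1 to 5 seconds long; asynchronous workers transcribe in parallel.
+
+ ### **Module D: Fast translation module**
+ - **Responsibility**: Machine-translate the text immediately after transcription completes (e.g. using CTranslate2 with an NLLB model).
+ - **Output**: Translated text (for first-pass display).
+ - **Characteristics**: A lightweight translation module suited to real-time requirements.
+ - **Real-time guarantee**: Translation completes and is handed to the display module within 200 ms.
+
+ ### **Module E: Instant output module (unconfirmed state)**
+ - **Responsibility**: Receive translation results and show them to the user right away.
+ - **Characteristics**: No waiting and no confirmation; this is only a first-draft output.
+ - **Real-time guarantee**: The primary response path for the user-facing UI, kept at very low latency.
+
+ ### **Module F: Translation confirmation module (controller)**
+ - **Responsibility**: Decide whether the current sentence needs LLM optimization or a deeper re-transcription.
+ - **Characteristics**: Analyzes content quality, punctuation, or contextual completeness to trigger the optimization flow.
+ - **Real-time guarantee**: Decision latency is bounded and does not block the main flow.
+
+ ### **Module G: Optimized translation module (LLM or re-transcription)**
+ - **Responsibility**: Use an LLM or re-transcription to improve sentence quality; suited to more complex expressions, user-configured optimization, and similar cases.
+ - **Characteristics**: Runs asynchronously with task queuing and timeout handling; high-quality output.
+ - **Real-time guarantee**: Does not affect the main path; optimized output is delivered through a backfill strategy.
+
+ ### **Module H: Asynchronous output module (confirmed state)**
+ - **Responsibility**: Replace the displayed text with the optimized result or apply a differential update, for the user to confirm or review.
+ - **Characteristics**: Supports display strategies that distinguish raw from optimized versions.
+ - **Real-time guarantee**: Updates asynchronously without disrupting the current interaction.
 
  ---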
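Module B's energy-and-silence segmentation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration (the function name `segment_by_energy`, the mean-square energy measure, the threshold, and the hangover of three silent frames); it is not the repository's VAD, which may rely on WebRTC VAD or Silero VAD instead.

```python
from typing import List, Sequence, Tuple


def segment_by_energy(
    frames: Sequence[Sequence[float]],
    frame_ms: int = 20,            # frame length, matching the 20 ms example above
    threshold: float = 0.01,       # hypothetical mean-square energy threshold
    min_silence_frames: int = 3,   # hypothetical hangover before closing a segment
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) pairs for runs of frames whose energy exceeds the threshold."""
    segments: List[Tuple[float, float]] = []
    start = None   # index of the first speech frame of the open segment
    silence = 0    # consecutive silent frames seen since the last speech frame

    for i, frame in enumerate(frames):
        energy = sum(x * x for x in frame) / len(frame) if frame else 0.0
        if energy >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:
                end = i - silence + 1  # index just past the last speech frame
                segments.append((start * frame_ms / 1000, end * frame_ms / 1000))
                start, silence = None, 0

    if start is not None:  # the stream ended while a segment was still open
        end = len(frames) - silence
        segments.append((start * frame_ms / 1000, end * frame_ms / 1000))
    return segments
```

Each closed segment can be pushed downstream immediately, which is what keeps the segmenter off the critical path.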
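 
  ### 🔧 Module Function Notes
 
+ The modules above can each be deployed as a standalone microservice or combined into a single local streaming inference program, to suit different devices and scenarios.
+
+ - The Whisper module supports switching between CUDA and CPU;
+ - The translation module supports quantized NLLB models, with response times on the order of a hundred milliseconds;
+ - The VAD module can be swapped for solutions based on WebRTC VAD, Silero VAD, and so on.
 
+ Possible future extensions include:
+ - Multi-user call stream recognition (speaker separation);
+ - Automatic recognition of cross-lingual dialogue and response generation;
+ - A controllable LLM slot for personalized error correction, terminology optimization, and similar scenarios.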
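The two paths in the flow chart (C to D to E for the immediate draft, and C to F to G to H for the asynchronous confirmation) can be sketched with `asyncio`. This is a sketch under stated assumptions, not the project's implementation: `fast_translate`, `refine_translation`, and `needs_refinement` are hypothetical stand-ins, the sleeps merely imitate the fast and slow latencies, and the punctuation rule is just one of the checks Module F might use.

```python
import asyncio
from typing import List, Tuple

SENTENCE_END = (".", "?", "!", "。", "？", "！")
Event = Tuple[int, str, str]  # (segment id, state, text)


async def fast_translate(text: str) -> str:
    """Hypothetical stand-in for the fast translation module (D)."""
    await asyncio.sleep(0.01)
    return f"draft({text})"


async def refine_translation(text: str) -> str:
    """Hypothetical stand-in for the LLM / re-transcription module (G)."""
    await asyncio.sleep(0.05)
    return f"refined({text})"


def needs_refinement(text: str) -> bool:
    """Toy rule for the confirmation controller (F): no sentence-final punctuation."""
    return not text.rstrip().endswith(SENTENCE_END)


async def handle_segment(seg_id: int, text: str, events: List[Event]) -> None:
    async def fast_path() -> None:
        # Main path: show an unconfirmed draft as soon as it is ready.
        events.append((seg_id, "unconfirmed", await fast_translate(text)))

    async def confirm_path() -> None:
        # Side path: runs concurrently and backfills a confirmed result later.
        if needs_refinement(text):
            events.append((seg_id, "confirmed", await refine_translation(text)))

    await asyncio.gather(fast_path(), confirm_path())


async def run_pipeline(texts: List[str]) -> List[Event]:
    events: List[Event] = []
    await asyncio.gather(*(handle_segment(i, t, events) for i, t in enumerate(texts)))
    return events
```

Calling `asyncio.run(run_pipeline(["Hello.", "and then we"]))` yields the unconfirmed drafts first and the confirmed backfill afterwards, so the slow path never delays the draft shown to the user.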
dataset/audio/metadata/test1_segments_20250506_141232.json ADDED
@@ -0,0 +1,126 @@
+ {
+   "audio_file": "dataset/audio/test1.wav",
+   "timestamp": "20250506_141232",
+   "total_segments": 17,
+   "segments": [
+     {
+       "index": 0,
+       "start_time": 3.26,
+       "end_time": 3.92,
+       "duration": 0.6600000000000001,
+       "is_speech": true
+     },
+     {
+       "index": 1,
+       "start_time": 4.34,
+       "end_time": 5.56,
+       "duration": 1.2199999999999998,
+       "is_speech": true
+     },
+     {
+       "index": 2,
+       "start_time": 7.1,
+       "end_time": 7.8,
+       "duration": 0.7000000000000002,
+       "is_speech": true
+     },
+     {
+       "index": 3,
+       "start_time": 8.8,
+       "end_time": 12.44,
+       "duration": 3.639999999999999,
+       "is_speech": true
+     },
+     {
+       "index": 4,
+       "start_time": 12.8,
+       "end_time": 16.74,
+       "duration": 3.9399999999999977,
+       "is_speech": true
+     },
+     {
+       "index": 5,
+       "start_time": 17.32,
+       "end_time": 18.76,
+       "duration": 1.4400000000000013,
+       "is_speech": true
+     },
+     {
+       "index": 6,
+       "start_time": 19.76,
+       "end_time": 21.1,
+       "duration": 1.3399999999999999,
+       "is_speech": true
+     },
+     {
+       "index": 7,
+       "start_time": 21.62,
+       "end_time": 25.68,
+       "duration": 4.059999999999999,
+       "is_speech": true
+     },
+     {
+       "index": 8,
+       "start_time": 26.28,
+       "end_time": 28.2,
+       "duration": 1.9199999999999982,
+       "is_speech": true
+     },
+     {
+       "index": 9,
+       "start_time": 28.56,
+       "end_time": 31.6,
+       "duration": 3.0400000000000027,
+       "is_speech": true
+     },
+     {
+       "index": 10,
+       "start_time": 31.98,
+       "end_time": 33.2,
+       "duration": 1.2200000000000024,
+       "is_speech": true
+     },
+     {
+       "index": 11,
+       "start_time": 33.54,
+       "end_time": 36.52,
+       "duration": 2.980000000000004,
+       "is_speech": true
+     },
+     {
+       "index": 12,
+       "start_time": 37.82,
+       "end_time": 38.94,
+       "duration": 1.1199999999999974,
+       "is_speech": true
+     },
+     {
+       "index": 13,
+       "start_time": 39.34,
+       "end_time": 40.34,
+       "duration": 1.0,
+       "is_speech": true
+     },
+     {
+       "index": 14,
+       "start_time": 40.86,
+       "end_time": 42.4,
+       "duration": 1.5399999999999991,
+       "is_speech": true
+     },
+     {
+       "index": 15,
+       "start_time": 43.04,
+       "end_time": 46.6,
+       "duration": 3.5600000000000023,
+       "is_speech": true
+     },
+     {
+       "index": 16,
+       "start_time": 47.5,
+       "end_time": 49.8,
+       "duration": 2.299999999999997,
+       "is_speech": true
+     }
+   ]
+ }
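A segment metadata file of this shape is easy to sanity-check before feeding it to the transcriber. The sketch below is illustrative only: `check_segments` and `tidy_durations` are hypothetical helpers, and the sample holds just the first three segments of the file above (so `total_segments` is set to 3 for the sample). The long decimal tails in `duration` look like ordinary binary floating-point subtraction noise, which rounding removes.

```python
import math
from typing import Any, Dict, List

# First three segments from the metadata file above; total_segments adjusted for the sample.
SAMPLE = {
    "audio_file": "dataset/audio/test1.wav",
    "total_segments": 3,
    "segments": [
        {"index": 0, "start_time": 3.26, "end_time": 3.92,
         "duration": 0.6600000000000001, "is_speech": True},
        {"index": 1, "start_time": 4.34, "end_time": 5.56,
         "duration": 1.2199999999999998, "is_speech": True},
        {"index": 2, "start_time": 7.1, "end_time": 7.8,
         "duration": 0.7000000000000002, "is_speech": True},
    ],
}


def check_segments(meta: Dict[str, Any], tol: float = 1e-6) -> List[str]:
    """Return a list of human-readable problems; an empty list means the file is consistent."""
    problems: List[str] = []
    segments = meta["segments"]
    if meta["total_segments"] != len(segments):
        problems.append("total_segments does not match the number of segments")
    prev_end = 0.0
    for pos, seg in enumerate(segments):
        if seg["index"] != pos:
            problems.append(f"segment at position {pos} has index {seg['index']}")
        span = seg["end_time"] - seg["start_time"]
        if not math.isclose(seg["duration"], span, abs_tol=tol):
            problems.append(f"segment {pos}: duration differs from end_time - start_time")
        if seg["start_time"] < prev_end:
            problems.append(f"segment {pos} overlaps the previous segment")
        prev_end = seg["end_time"]
    return problems


def tidy_durations(meta: Dict[str, Any], ndigits: int = 2) -> List[float]:
    """Recompute durations rounded to ndigits, dropping floating-point noise."""
    return [round(s["end_time"] - s["start_time"], ndigits) for s in meta["segments"]]
```

Loading the real file with `json.load` and passing the result to `check_segments` would apply the same checks to all 17 segments.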
translator/README.md ADDED
@@ -0,0 +1,165 @@
+ # Test results
+
+ ```bash
+ 2025-05-07 20:23:02,565 - translator - DEBUG - 使用设备: cuda
+ 2025-05-07 20:23:04,366 - translator - INFO -
+ ==== 测试用例 1 ====
+ 2025-05-07 20:23:04,367 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,367 - translator - INFO - [翻译原文] 请问这附近有地铁站吗?
+ 2025-05-07 20:23:04,367 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:04,515 - translator - DEBUG - 输出分词: ['eng_Latn', '▁Please', '▁ask', ',', '▁is', '▁there', '▁a', '▁rail', 'way', '▁station', '▁near', 'by', '?']
+ 2025-05-07 20:23:04,516 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 146.86ms
+ 2025-05-07 20:23:04,516 - translator - INFO - [翻译结果] Please ask, is there a railway station nearby?
+ 2025-05-07 20:23:04,516 - translator - INFO - 最终翻译结果: Please ask, is there a railway station nearby?
+ 2025-05-07 20:23:04,516 - translator - INFO - 总耗时: 148.93ms
+ 2025-05-07 20:23:04,516 - translator - INFO -
+ ==== 测试用例 2 ====
+ 2025-05-07 20:23:04,517 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,517 - translator - INFO - [翻译原文] 我们今天要讨论人工智能的发展趋势。
+ 2025-05-07 20:23:04,517 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:04,628 - translator - DEBUG - 输出分词: ['eng_Latn', '▁We', '▁are', '▁going', '▁to', '▁discuss', '▁today', '▁the', '▁tr', 'ends', '▁in', '▁the', '▁development', '▁of', '▁artificial', '▁intelligence', '.']
+ 2025-05-07 20:23:04,628 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 111.20ms
+ 2025-05-07 20:23:04,628 - translator - INFO - [翻译结果] We are going to discuss today the trends in the development of artificial intelligence.
+ 2025-05-07 20:23:04,628 - translator - INFO - 最终翻译结果: We are going to discuss today the trends in the development of artificial intelligence.
+ 2025-05-07 20:23:04,628 - translator - INFO - 总耗时: 111.20ms
+ 2025-05-07 20:23:04,628 - translator - INFO -
+ ==== 测试用例 3 ====
+ 2025-05-07 20:23:04,628 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,628 - translator - INFO - [翻译原文] 他的回答令人非常失望。
+ 2025-05-07 20:23:04,628 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:04,684 - translator - DEBUG - 输出分词: ['eng_Latn', '▁His', '▁answer', '▁was', '▁very', '▁disappoint', 'ing', '.']
+ 2025-05-07 20:23:04,684 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 55.06ms
+ 2025-05-07 20:23:04,684 - translator - INFO - [翻译结果] His answer was very disappointing.
+ 2025-05-07 20:23:04,684 - translator - INFO - 最终翻译结果: His answer was very disappointing.
+ 2025-05-07 20:23:04,684 - translator - INFO - 总耗时: 56.07ms
+ 2025-05-07 20:23:04,684 - translator - INFO -
+ ==== 测试用例 4 ====
+ 2025-05-07 20:23:04,684 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,684 - translator - INFO - [翻译原文] 这个项目已经进行了三个月,还需要更多资源支持。
+ 2025-05-07 20:23:04,684 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:04,787 - translator - DEBUG - 输出分词: ['eng_Latn', '▁The', '▁project', '▁has', '▁been', '▁running', '▁for', '▁three', '▁months', '▁and', '▁requires', '▁more', '▁resources', '▁to', '▁support', '▁it', '.']
+ 2025-05-07 20:23:04,788 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 102.36ms
+ 2025-05-07 20:23:04,788 - translator - INFO - [翻译结果] The project has been running for three months and requires more resources to support it.
+ 2025-05-07 20:23:04,788 - translator - INFO - 最终翻译结果: The project has been running for three months and requires more resources to support it.
+ 2025-05-07 20:23:04,788 - translator - INFO - 总耗时: 104.35ms
+ 2025-05-07 20:23:04,788 - translator - INFO -
+ ==== 测试用例 5 ====
+ 2025-05-07 20:23:04,788 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,788 - translator - INFO - [翻译原文] 天气预报说明天会有暴雨,请大家注意安全。
+ 2025-05-07 20:23:04,788 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:04,898 - translator - DEBUG - 输出分词: ['eng_Latn', '▁Weather', '▁fore', 'cas', 'ts', '▁indicate', '▁that', '▁there', '▁will', '▁be', '▁heavy', '▁rain', ',', '▁please', '▁pay', '▁attention', '▁to', '▁safety', '.']
+ 2025-05-07 20:23:04,898 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 109.08ms
+ 2025-05-07 20:23:04,898 - translator - INFO - [翻译结果] Weather forecasts indicate that there will be heavy rain, please pay attention to safety.
+ 2025-05-07 20:23:04,899 - translator - INFO - 最终翻译结果: Weather forecasts indicate that there will be heavy rain, please pay attention to safety.
+ 2025-05-07 20:23:04,899 - translator - INFO - 总耗时: 110.14ms
+ 2025-05-07 20:23:04,899 - translator - INFO -
+ ==== 测试用例 6 ====
+ 2025-05-07 20:23:04,899 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,899 - translator - INFO - [翻译原文] 是时候重新思考我们的计划了。
+ 2025-05-07 20:23:04,899 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:04,976 - translator - DEBUG - 输出分词: ['eng_Latn', '▁It', "'", 's', '▁time', '▁to', '▁r', 'eth', 'ink', '▁our', '▁plans', '.']
+ 2025-05-07 20:23:04,976 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 77.24ms
+ 2025-05-07 20:23:04,976 - translator - INFO - [翻译结果] It's time to rethink our plans.
+ 2025-05-07 20:23:04,976 - translator - INFO - 最终翻译结果: It's time to rethink our plans.
+ 2025-05-07 20:23:04,976 - translator - INFO - 总耗时: 77.76ms
+ 2025-05-07 20:23:04,976 - translator - INFO -
+ ==== 测试用例 7 ====
+ 2025-05-07 20:23:04,976 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:04,977 - translator - INFO - [翻译原文] 我对这个结果非常满意,感谢你的努力。
+ 2025-05-07 20:23:04,977 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:05,076 - translator - DEBUG - 输出分词: ['eng_Latn', '▁I', "'", 'm', '▁very', '▁happy', '▁with', '▁this', '▁result', ',', '▁thank', '▁you', '▁for', '▁your', '▁efforts', '.']
+ 2025-05-07 20:23:05,076 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 98.25ms
+ 2025-05-07 20:23:05,076 - translator - INFO - [翻译结果] I'm very happy with this result, thank you for your efforts.
+ 2025-05-07 20:23:05,076 - translator - INFO - 最终翻译结果: I'm very happy with this result, thank you for your efforts.
+ 2025-05-07 20:23:05,076 - translator - INFO - 总耗时: 99.88ms
+ 2025-05-07 20:23:05,076 - translator - INFO -
+ ==== 测试用例 8 ====
+ 2025-05-07 20:23:05,076 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,077 - translator - INFO - [翻译原文] 她穿着一件红色的连衣裙,在人群中格外显眼。
+ 2025-05-07 20:23:05,077 - translator - DEBUG - 源语言: zho_Hans, 目标语言: eng_Latn
+ 2025-05-07 20:23:05,178 - translator - DEBUG - 输出分词: ['eng_Latn', '▁She', '▁we', 'ars', '▁a', '▁red', '▁dress', ',', '▁which', '▁is', '▁very', '▁prom', 'inent', '▁among', '▁the', '▁crowd', '.']
+ 2025-05-07 20:23:05,178 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 100.78ms
+ 2025-05-07 20:23:05,178 - translator - INFO - [翻译结果] She wears a red dress, which is very prominent among the crowd.
+ 2025-05-07 20:23:05,178 - translator - INFO - 最终翻译结果: She wears a red dress, which is very prominent among the crowd.
+ 2025-05-07 20:23:05,179 - translator - INFO - 总耗时: 102.00ms
+ 2025-05-07 20:23:05,179 - translator - INFO -
+ ==== 测试用例 9 ====
+ 2025-05-07 20:23:05,179 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,179 - translator - INFO - [翻译原文] Can you help me find the nearest bus station?
+ 2025-05-07 20:23:05,179 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,271 - translator - DEBUG - 输出分词: ['zho_Hans', '▁你', '能', '帮', '我', '找到', '最近', '的', '公共', '汽', '车', '站', '吗', '?']
+ 2025-05-07 20:23:05,271 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 91.77ms
+ 2025-05-07 20:23:05,271 - translator - INFO - [翻译结果] 你能帮我找到最近的公共汽车站吗?
+ 2025-05-07 20:23:05,271 - translator - INFO - 最终翻译结果: 你能帮我找到最近的公共汽车站吗?
+ 2025-05-07 20:23:05,272 - translator - INFO - 总耗时: 91.77ms
+ 2025-05-07 20:23:05,272 - translator - INFO -
+ ==== 测试用例 10 ====
+ 2025-05-07 20:23:05,272 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,272 - translator - INFO - [翻译原文] The machine learning model achieved an accuracy of 95%.
+ 2025-05-07 20:23:05,272 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,368 - translator - DEBUG - 输出分词: ['zho_Hans', '▁', '机', '器', '学习', '模型', '达到', '9', '5%', '的', '准', '确', '性', '.']
+ 2025-05-07 20:23:05,368 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 95.58ms
+ 2025-05-07 20:23:05,369 - translator - INFO - [翻译结果] 机器学习模型达到95%的准确性.
+ 2025-05-07 20:23:05,369 - translator - INFO - 最终翻译结果: 机器学习模型达到95%的准确性.
+ 2025-05-07 20:23:05,369 - translator - INFO - 总耗时: 96.62ms
+ 2025-05-07 20:23:05,369 - translator - INFO -
+ ==== 测试用例 11 ====
+ 2025-05-07 20:23:05,370 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,370 - translator - INFO - [翻译原文] He was overwhelmed by the unexpected response from the audience.
+ 2025-05-07 20:23:05,370 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,471 - translator - DEBUG - 输出分词: ['zho_Hans', '▁他', '被', '观', '众', '的', '意', '想', '不', '到', '的', '反应', '压', '倒', '了', '.']
+ 2025-05-07 20:23:05,471 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 100.42ms
+ 2025-05-07 20:23:05,472 - translator - INFO - [翻译结果] 他被观众的意想不到的反应压倒了.
+ 2025-05-07 20:23:05,472 - translator - INFO - 最终翻译结果: 他被观众的意想不到的反应压倒了.
+ 2025-05-07 20:23:05,472 - translator - INFO - 总耗时: 102.39ms
+ 2025-05-07 20:23:05,472 - translator - INFO -
+ ==== 测试用例 12 ====
+ 2025-05-07 20:23:05,472 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,472 - translator - INFO - [翻译原文] It’s important to stay hydrated during hot summer days.
+ 2025-05-07 20:23:05,473 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,557 - translator - DEBUG - 输出分词: ['zho_Hans', '▁在', '炎', '热', '的', '夏', '天', '保持', '水', '分', '很', '重要', '.']
+ 2025-05-07 20:23:05,557 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 84.14ms
+ 2025-05-07 20:23:05,557 - translator - INFO - [翻译结果] 在炎热的夏天保持水分很重要.
+ 2025-05-07 20:23:05,557 - translator - INFO - 最终翻译结果: 在炎热的夏天保持水分很重要.
+ 2025-05-07 20:23:05,557 - translator - INFO - 总耗时: 85.14ms
+ 2025-05-07 20:23:05,557 - translator - INFO -
+ ==== 测试用例 13 ====
+ 2025-05-07 20:23:05,557 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,557 - translator - INFO - [翻译原文] Although she was tired, she continued working late into the night.
+ 2025-05-07 20:23:05,557 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,649 - translator - DEBUG - 输出分词: ['zho_Hans', '▁', '虽然', '她', '很', '累', ',', '但', '她', '继续', '工作', '直到', '深', '夜', '.']
+ 2025-05-07 20:23:05,650 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 92.03ms
+ 2025-05-07 20:23:05,650 - translator - INFO - [翻译结果] 虽然她很累,但她继续工作直到深夜.
+ 2025-05-07 20:23:05,650 - translator - INFO - 最终翻译结果: 虽然她很累,但她继续工作直到深夜.
+ 2025-05-07 20:23:05,650 - translator - INFO - 总耗时: 93.03ms
+ 2025-05-07 20:23:05,650 - translator - INFO -
+ ==== 测试用例 14 ====
+ 2025-05-07 20:23:05,650 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,650 - translator - INFO - [翻译原文] The concert was amazing, and the crowd was full of energy.
+ 2025-05-07 20:23:05,650 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,747 - translator - DEBUG - 输出分词: ['zho_Hans', '▁', '音乐', '会', '是', '惊', '人的', ',', '群', '众', '充', '满', '了', '能量', '.']
+ 2025-05-07 20:23:05,747 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 95.60ms
+ 2025-05-07 20:23:05,747 - translator - INFO - [翻译结果] 音乐会是惊人的,群众充满了能量.
+ 2025-05-07 20:23:05,748 - translator - INFO - 最终翻译结果: 音乐会是惊人的,群众充满了能量.
+ 2025-05-07 20:23:05,748 - translator - INFO - 总耗时: 97.54ms
+ 2025-05-07 20:23:05,748 - translator - INFO -
+ ==== 测试用例 15 ====
+ 2025-05-07 20:23:05,748 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,748 - translator - INFO - [翻译原文] Please make sure to submit your application before the deadline.
+ 2025-05-07 20:23:05,748 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,817 - translator - DEBUG - 输出分词: ['zho_Hans', '▁请', '确保', '在', '截', '止', '日', '期', '之前', '提交', '申请', '.']
+ 2025-05-07 20:23:05,817 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 69.40ms
+ 2025-05-07 20:23:05,817 - translator - INFO - [翻译结果] 请确保在截止日期之前提交申请.
+ 2025-05-07 20:23:05,817 - translator - INFO - 最终翻译结果: 请确保在截止日期之前提交申请.
+ 2025-05-07 20:23:05,817 - translator - INFO - 总耗时: 69.40ms
+ 2025-05-07 20:23:05,817 - translator - INFO -
+ ==== 测试用例 16 ====
+ 2025-05-07 20:23:05,817 - translator - DEBUG - 开始翻译
+ 2025-05-07 20:23:05,817 - translator - INFO - [翻译原文] After months of preparation, the product was finally launched.
+ 2025-05-07 20:23:05,817 - translator - DEBUG - 源语言: eng_Latn, 目标语言: zho_Hans
+ 2025-05-07 20:23:05,920 - translator - DEBUG - 输出分词: ['zho_Hans', '▁', '经', '过', '数', '月', '的', '准', '备', ',', '该', '产', '品', '最终', '推', '出', '.']
+ 2025-05-07 20:23:05,920 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 102.10ms
+ 2025-05-07 20:23:05,921 - translator - INFO - [翻译结果] 经过数月的准备,该产品最终推出.
+ 2025-05-07 20:23:05,921 - translator - INFO - 最终翻译结果: 经过数月的准备,该产品最终推出.
+ 2025-05-07 20:23:05,921 - translator - INFO - 总耗时: 104.10ms
+ ```
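The per-sentence latencies in a log like this can be pulled out and summarized per translation direction with a short script. This is only a sketch: the helper names are hypothetical, the regular expression assumes the `翻译完成: <src> -> <tgt>, 耗时: <n>ms` line format shown above, and the sample contains three lines copied from test cases 1, 3, and 9.

```python
import re
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Matches the "translation finished" lines, e.g.
# "翻译完成: zho_Hans -> eng_Latn, 耗时: 146.86ms"
DONE_RE = re.compile(r"翻译完成: (\w+) -> (\w+), 耗时: ([\d.]+)ms")

Direction = Tuple[str, str]


def latencies_by_direction(log_text: str) -> Dict[Direction, List[float]]:
    """Group inference latencies (in ms) by (source, target) language pair."""
    buckets: Dict[Direction, List[float]] = defaultdict(list)
    for src, tgt, ms in DONE_RE.findall(log_text):
        buckets[(src, tgt)].append(float(ms))
    return dict(buckets)


def summarize(buckets: Dict[Direction, List[float]]) -> Dict[Direction, Tuple[int, float, float]]:
    """Return (count, mean ms, max ms) for each direction."""
    return {d: (len(v), mean(v), max(v)) for d, v in buckets.items()}


SAMPLE_LOG = """\
2025-05-07 20:23:04,516 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 146.86ms
2025-05-07 20:23:04,684 - translator - DEBUG - 翻译完成: zho_Hans -> eng_Latn, 耗时: 55.06ms
2025-05-07 20:23:05,271 - translator - DEBUG - 翻译完成: eng_Latn -> zho_Hans, 耗时: 91.77ms
"""
```

Running `summarize(latencies_by_direction(text))` over the full log text gives a quick check of the pipeline's 200 ms budget for the fast translation path.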
translator/translator.py CHANGED
@@ -1,8 +1,9 @@
- """
- Translation module - multilingual translation with the NLLB model
- """
 
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
  from langdetect import detect
  import torch
  import time
@@ -10,140 +11,95 @@ import logging
 
  # Configure logging
  def setup_logger(name, level=logging.INFO):
-     """Set up the logger."""
      logger = logging.getLogger(name)
-     # Clear any existing handlers to avoid duplicates
      if logger.handlers:
          logger.handlers.clear()
-
-     # Add a new handler
      handler = logging.StreamHandler()
      formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
      handler.setFormatter(formatter)
      logger.addHandler(handler)
      logger.setLevel(level)
-     # Do not propagate to the parent logger, to avoid duplicate log lines
      logger.propagate = False
      return logger
 
- # Create the logger
  logger = setup_logger("translator")
 
  class NLLBTranslator:
-     """
-     NLLB translator: multilingual translation with Facebook's NLLB model
-     """
-
-     def __init__(self, model_name="facebook/nllb-200-distilled-600M", default_target="eng_Latn"):
-         """
-         Initialize the NLLB translator
-
-         :param model_name: model name
-         :param default_target: default target language code
-         """
-         self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
          logger.debug(f"使用设备: {self.device}")
-
-         if self.device.type == "cuda":
-             logger.debug(f"GPU设备: {torch.cuda.get_device_name(0)}")
-             total_mem = torch.cuda.get_device_properties(0).total_memory / 1024**3
-             logger.debug(f"GPU显存: {total_mem:.1f} GB")
-
-         # Load the model and tokenizer
-         logger.debug(f"加载模型: {model_name}")
-         self.tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
-         self.model = AutoModelForSeq2SeqLM.from_pretrained(
-             model_name,
-             torch_dtype=torch.float16 if self.device.type == "cuda" else torch.float32
-         ).to(self.device)
 
          self.default_target = default_target
-         logger.debug(f"翻译器初始化完成,默认目标语言: {default_target}")
-
-     def detect_lang_code(self, text: str) -> str:
-         """
-         Detect the language of the text and return its NLLB language code
-
-         :param text: text to inspect
-         :return: NLLB language code
-         """
-         try:
-             lang = detect(text)
-             logger.debug(f"检测到语言: {lang}")
-         except Exception:
-             logger.debug("语言检测失败,默认使用中文(zh)")
-             lang = "zh-cn"
-
-         # Language code mapping
-         lang_map = {
-             "zh-cn": "zho_Hans", "zh": "zho_Hans", "en": "eng_Latn", "fr": "fra_Latn",
-             "de": "deu_Latn", "ja": "jpn_Jpan", "ko": "kor_Hang", "ar": "arb_Arab"
-         }
-
-         lang_code = lang_map.get(lang.lower(), "eng_Latn")
-         logger.debug(f"映射语言代码: {lang} -> {lang_code}")
-         return lang_code
 
-     def translate(self, text: str, target_lang_code: str = None) -> str:
-         """
-         Translate text into the target language
-
-         :param text: text to translate
-         :param target_lang_code: target language code; if None, the default target language is used
-         :return: translated text
-         """
          logger.debug("开始翻译")
-
-         # Log the source text (INFO level)
          logger.info(f"[翻译原文] {text}")
 
-         # Detect the source language
-         src_lang = self.detect_lang_code(text)
          tgt_lang = target_lang_code or self.default_target
-
-         # Prepare the input
-         self.tokenizer.src_lang = src_lang
-         inputs = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True).to(self.device)
-         inputs["forced_bos_token_id"] = self.tokenizer.convert_tokens_to_ids(tgt_lang)
-
-         # Run the translation
-         start = time.time()
-         with torch.no_grad():
-             output = self.model.generate(**inputs, max_new_tokens=80)
 
-         # Decode the result
-         result = self.tokenizer.decode(output[0], skip_special_tokens=True)
 
-         # Log the elapsed time and the result
          duration = time.time() - start
-         logger.debug(f"翻译完成: {src_lang} -> {tgt_lang}, 耗时: {duration:.2f}秒")
 
-         # Log the translation result (INFO level)
          logger.info(f"[翻译结果] {result}")
-
-         return result
 
 
  if __name__ == "__main__":
-     # Set the log level to DEBUG to see detailed output
      logger.setLevel(logging.DEBUG)
-
-     # Create the translator
      translator = NLLBTranslator()
 
-     # Test Chinese to English
-     zh_text = "你会学习到如何使用音频数据集"
-     logger.info("\n==== 中文 → 英文 ====")
-     result = translator.translate(zh_text, target_lang_code="eng_Latn")
-     logger.info(f"测试完成: {result}")
-
-     # Test English to French
-     en_text = "This audio processing pipeline is fast and accurate."
-     logger.info("\n==== 英文 → 法语 ====")
-     result = translator.translate(en_text, target_lang_code="fra_Latn")
-     logger.info(f"测试完成: {result}")
-
-     # Test English to Arabic
-     logger.info("\n==== 英文 阿拉伯语 ====")
-     result = translator.translate(en_text, target_lang_code="arb_Arab")
-     logger.info(f"测试完成: {result}")
+ '''
+ Translation module - multilingual translation with an NLLB model accelerated by CTranslate2
+ '''
 
+ from ctranslate2 import Translator
+ from transformers import AutoTokenizer
  from langdetect import detect
  import torch
  import time
  import logging
 
  # Configure logging
  def setup_logger(name, level=logging.INFO):
      logger = logging.getLogger(name)
      if logger.handlers:
          logger.handlers.clear()
      handler = logging.StreamHandler()
      formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
      handler.setFormatter(formatter)
      logger.addHandler(handler)
      logger.setLevel(level)
      logger.propagate = False
      return logger
 
  logger = setup_logger("translator")
 
  class NLLBTranslator:
+     def __init__(self, model_dir="nllb-600m-ct2-int8-fp16", default_target="eng_Latn"):
+         self.device = "cuda" if torch.cuda.is_available() else "cpu"
          logger.debug(f"使用设备: {self.device}")  # "Using device"
 
+         self.tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
+         # int8_float16 targets GPU execution; use plain int8 when running on CPU
+         compute_type = "int8_float16" if self.device == "cuda" else "int8"
+         self.translator = Translator(model_dir, device=self.device, compute_type=compute_type)
          self.default_target = default_target
 
+     def translate(self, text: str, source_lang_code: str, target_lang_code: str = None) -> str:
          logger.debug("开始翻译")  # "Starting translation"
          logger.info(f"[翻译原文] {text}")  # "[Source text]"
 
+         src_lang = source_lang_code
          tgt_lang = target_lang_code or self.default_target
 
+         logger.debug(f"源语言: {src_lang}, 目标语言: {tgt_lang}")  # "Source language / target language"
 
+         # Tokenize in NLLB's standard format; set src_lang first so the input carries
+         # the caller's source-language tag rather than the tokenizer's default
+         self.tokenizer.src_lang = src_lang
+         source = self.tokenizer.convert_ids_to_tokens(self.tokenizer.encode(text))
+
+         start = time.time()
+         target_prefix = [tgt_lang]
+         results = self.translator.translate_batch(
+             [source],
+             #beam_size=6,
+             length_penalty=1.2,
+             target_prefix=[target_prefix]
+         )
          duration = time.time() - start
+
+         output_tokens = results[0].hypotheses[0]
+         logger.debug(f"输出分词: {output_tokens}")  # "Output tokens"
 
+         # Convert the output tokens to text and strip special tokens and language codes
+         result = self.tokenizer.convert_tokens_to_string(output_tokens)
+         result = result.replace("<pad>", "").replace("</s>", "").replace("<s>", "").strip()
+         for lang_code in ["kor_Hang", "eng_Latn", "zho_Hans", "jpn_Jpan", "fra_Latn", "deu_Latn", "arb_Arab"]:
+             result = result.replace(lang_code, "").strip()
+
+         logger.debug(f"翻译完成: {src_lang} -> {tgt_lang}, 耗时: {duration * 1000:.2f}ms")  # "Translation finished ..., elapsed"
          logger.info(f"[翻译结果] {result}")  # "[Translation result]"
 
+         return result
 
  if __name__ == "__main__":
      logger.setLevel(logging.DEBUG)
      translator = NLLBTranslator()
 
+     test_cases = [
+         # Chinese -> English
+         ("请问这附近有地铁站吗?", "zho_Hans", "eng_Latn"),
+         ("我们今天要讨论人工智能的发展趋势。", "zho_Hans", "eng_Latn"),
+         ("他的回答令人非常失望。", "zho_Hans", "eng_Latn"),
+         ("这个项目已经进行了三个月,还需要更多资源支持。", "zho_Hans", "eng_Latn"),
+         ("天气预报说明天会有暴雨,请大家注意安全。", "zho_Hans", "eng_Latn"),
+         ("是时候重新思考我们的计划了。", "zho_Hans", "eng_Latn"),
+         ("我对这个结果非常满意,感谢你的努力。", "zho_Hans", "eng_Latn"),
+         ("她穿着一件红色的连衣裙,在人群中格外显眼。", "zho_Hans", "eng_Latn"),
+
+         # English -> Chinese
+         ("Can you help me find the nearest bus station?", "eng_Latn", "zho_Hans"),
+         ("The machine learning model achieved an accuracy of 95%.", "eng_Latn", "zho_Hans"),
+         ("He was overwhelmed by the unexpected response from the audience.", "eng_Latn", "zho_Hans"),
+         ("It’s important to stay hydrated during hot summer days.", "eng_Latn", "zho_Hans"),
+         ("Although she was tired, she continued working late into the night.", "eng_Latn", "zho_Hans"),
+         ("The concert was amazing, and the crowd was full of energy.", "eng_Latn", "zho_Hans"),
+         ("Please make sure to submit your application before the deadline.", "eng_Latn", "zho_Hans"),
+         ("After months of preparation, the product was finally launched.", "eng_Latn", "zho_Hans")
+     ]
+
+
+     for i, (text, src_lang, tgt_lang) in enumerate(test_cases):
+         logger.info(f"\n==== 测试用例 {i + 1} ====")  # "Test case N"
+         start_total = time.time()
+         result = translator.translate(text, source_lang_code=src_lang, target_lang_code=tgt_lang)
+         end_total = time.time()
+         logger.info(f"最终翻译结果: {result}")  # "Final translation result"
+         logger.info(f"总耗时: {(end_total - start_total) * 1000:.2f}ms")  # "Total elapsed"
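Two details of the new `translate` signature lend themselves to small helpers, sketched here under stated assumptions. First, callers must now supply `source_lang_code` themselves; the mapping below is copied from the `lang_map` in the removed `detect_lang_code` method and turns a detector's ISO-style code into an NLLB code. Second, the logged output tokens always begin with the target-language token, so it can be dropped by position rather than by string replacement across the whole sentence. The names `to_nllb_code`, `strip_target_prefix`, and `pieces_to_text` are hypothetical, and `pieces_to_text` only approximates what the tokenizer's `convert_tokens_to_string` does for SentencePiece pieces.

```python
from typing import List, Sequence

# Same pairs as lang_map in the removed detect_lang_code method.
ISO_TO_NLLB = {
    "zh-cn": "zho_Hans", "zh": "zho_Hans", "en": "eng_Latn", "fr": "fra_Latn",
    "de": "deu_Latn", "ja": "jpn_Jpan", "ko": "kor_Hang", "ar": "arb_Arab",
}


def to_nllb_code(iso_code: str, fallback: str = "eng_Latn") -> str:
    """Map a detector code such as 'zh-cn' to an NLLB code such as 'zho_Hans'."""
    return ISO_TO_NLLB.get(iso_code.lower(), fallback)


def strip_target_prefix(tokens: Sequence[str], tgt_lang: str) -> List[str]:
    """Drop the leading target-language token if present; leave other tokens untouched."""
    if tokens and tokens[0] == tgt_lang:
        return list(tokens[1:])
    return list(tokens)


def pieces_to_text(pieces: Sequence[str]) -> str:
    """Approximate SentencePiece detokenization: join pieces, turn '▁' into spaces."""
    return "".join(pieces).replace("▁", " ").strip()
```

For example, `pieces_to_text(strip_target_prefix(output_tokens, tgt_lang))` would leave a sentence that legitimately contains the text "eng_Latn" intact, which the chain of `replace` calls would not.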