GB/Z 177.8-2026 Intelligence grading of artificial intelligence terminal - Part 8: Speaker
This is a draft translation for reference among interested stakeholders. The finalized translation (after draft translation, self-check, revision and verification) will be delivered upon order.
ICS
CCS
National Standard of the People's Republic of China
GB/Z 177.8-2026
Intelligence grading of artificial intelligence terminal - Part 8: Speaker
人工智能终端智能化分级 第8部分:音箱
Issue date: 2026-03-31 Implementation date: 2026-10-01
Issued by the General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China and the Standardization Administration of the People's Republic of China
Contents
Foreword
Introduction
1 Scope
2 Normative References
3 Terms and Definitions
4 Abbreviations
5 Key Capabilities
5.1 Overview
5.2 L1 Response Level
5.3 L2 Tool Level
5.4 L3 Assistance Level
6 Level Determination
Annex A (Normative) Test Methods
A.1 Test Environment
A.2 L1 Response Level
A.3 L2 Tool Level
A.4 L3 Assistance Level
Annex B (Informative) Typical Usage Scenarios
Annex C (Informative) Test Scenario Design Methods
Bibliography
Artificial intelligence terminal intelligence classification — Part 8: Speakers
1 Scope
This document specifies the classification levels and level determination of key intelligence capabilities for speakers, and provides test methods.
This document is intended to guide the upgrading of speaker intelligence, and serves as a reference for the design, development, application, selection and testing of artificial intelligence speakers.
2 Normative References
The following documents are essential for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition (including any amendments) applies.
GB/Z 177.1-2026 Artificial intelligence terminal intelligence classification — Part 1: Reference framework
GB/Z 177.2-2026 Artificial intelligence terminal intelligence classification — Part 2: General requirements
3 Terms and Definitions
For the purposes of this document, the terms and definitions given in GB/Z 177.1-2026, GB/Z 177.2-2026 and the following apply.
3.1 speech wakeup; voice trigger
The process in which a speech interaction system, while in an audio stream monitoring state, switches to another processing state, such as command word recognition or continuous speech recognition, after detecting a specific feature or event.
[Source: GB/T 36464.2-2018, 3.13]
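NOTE: The state switch described in 3.1 can be illustrated as a simple state transition. The following is a minimal sketch for illustration only and is not part of the cited definition; the state names and the function next_state are hypothetical, and the detection of the feature or event is assumed to be supplied by some other component.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical processing states; the names are illustrative only."""
    MONITORING = auto()              # audio stream monitoring state
    COMMAND_RECOGNITION = auto()     # command word recognition
    CONTINUOUS_RECOGNITION = auto()  # continuous speech recognition


def next_state(current, event_detected, target=State.CONTINUOUS_RECOGNITION):
    """Return the state after one detection step.

    A switch occurs only when the system is in the monitoring state and
    the specific feature or event (for example, a wakeup word) has been
    detected. In every other case the current state is kept.
    """
    if current is State.MONITORING and event_detected:
        return target
    return current


# No detection: the system stays in the monitoring state.
print(next_state(State.MONITORING, False).name)
# Detection: the system switches to the chosen processing state.
print(next_state(State.MONITORING, True, State.COMMAND_RECOGNITION).name)
```

In this sketch the target state is a parameter because the definition names command word recognition and continuous speech recognition only as examples of the state switched to.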
3.2 wakeup word
A specific word or phrase used by a user to wake up a device and initiate speech interaction.
3.3 speech recognition
The process of converting human voice signals into text or commands.
[Source: GB/T 21023-2007, 3.1]
3.4 speech synthesis
The process of synthesising human speech by mechanical or electronic methods.
NOTE: The speech produced by this process is called synthetic speech, as distinguished from natural speech produced by human vocal organs, and is sometimes also called artificial speech.
[Source: GB/T 21024-2007, 3.1]
4 Abbreviations
The following abbreviations apply to this document.
App: Application Software
MOS: Mean Opinion Score