GB/T 45257-2025 Press and publication - Knowledge services - Extraction and marking up of knowledge element (English version)
This is a draft translation for reference among interested stakeholders. The finalized translation (which passes through draft translation, self-check, revision and verification) will be delivered upon order.
ICS 01.140.40
CCS A 19
National Standard of the People's Republic of China
GB/T 45257-2025
Press and publication - Knowledge services - Extraction and marking up of knowledge element
新闻出版 知识服务 知识元提取与标引
(English Translation)
Issue date: 2025-02-28 Implementation date: 2025-06-01
Issued by the State Administration for Market Regulation and the Standardization Administration of the People's Republic of China
Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations
5 Objects of extraction and marking up
6 Methods of extraction and marking up
7 Rules and specifications for extraction
8 Rules and specifications for marking up
9 Processes of extraction and marking up
10 Quality of extraction and marking up
Bibliography
Press and publication - Knowledge services - Extraction and marking up of knowledge element
1 Scope
This document specifies the objects, methods, rules, processes and quality requirements for the extraction and marking up of knowledge elements.
This document applies to the construction and management of knowledge bases and knowledge resource libraries in the press and publication field.
2 Normative references
The following documents contain requirements which, through reference in this text, constitute provisions of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
GB/T 45256 Press and publication - Knowledge services - Processes for knowledge ontology construction
3 Terms and definitions
The following terms and definitions apply to this document.
3.1
marking-up
process of representing the content or form of a resource with words or phrases in accordance with the rules of an indexing language
[Source: CY/T 101.1-2014, 6.2.7, modified]
3.2
fixed-layout document
type of document generated after typesetting that contains all data required for the fixed presentation of the layout
[Source: CY/T 101.1-2014, 6.3.10]
3.3
stream document
type of document whose content presentation can adapt to changes in the screen or window of terminal devices in accordance with the logical order of the content
[Source: CY/T 101.1-2014, 6.3.11]
3.4
morpheme
smallest meaningful unit of a language
3.5
knowledge element
independent knowledge unit that expresses a complete thing or concept and cannot be further divided under application requirements
[Source: GB/T 38377-2019, 2.3]
4 Abbreviations
The following abbreviations apply to this document.
ePub: Electronic Publication
HTML: HyperText Markup Language
ID: Identity
JPEG: Joint Photographic Experts Group
MP3: Moving Picture Experts Group Audio Layer III
MP4: Moving Picture Experts Group 4
PDF: Portable Document Format
PNG: Portable Network Graphics
TIFF: Tagged Image File Format
TXT: Text
WMA: Windows Media Audio
WMV: Windows Media Video
5 Objects of extraction and marking up
5.1 Objects of extraction
The objects for extracting knowledge elements in the press and publication field mainly include materials with extraction value, such as books, periodicals, newspapers, research reports, conference papers and standards. Priority should be given to formal publications.
5.2 Objects of marking up
The objects of marking up are computer files in the press and publication field, including but not limited to the following types:
a) Stream documents, with file formats including ePub, HTML, TXT, etc.;
b) Fixed-layout documents, with file formats including PDF, etc.;
c) Image files, with file formats including JPEG, PNG, TIFF, etc.;
d) Audio files, with file formats including MP3, WMA, etc.;
e) Video files, with file formats including MP4, WMV, etc.
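The classification in 5.2 can be illustrated with a small lookup from file extension to object type. This is only a sketch: the function name `classify_markup_object`, the category strings and the extra extension variants (`.htm`, `.jpg`, `.tif`) are assumptions made for illustration and are not defined by this document, and the list in 5.2 is explicitly open-ended.

```python
from pathlib import PurePath
from typing import Optional

# Hypothetical mapping built from the formats named in 5.2.
# ".htm", ".jpg" and ".tif" are assumed common variants, not listed in the text.
MARKUP_OBJECT_TYPES = {
    "stream document": {".epub", ".html", ".htm", ".txt"},
    "fixed-layout document": {".pdf"},
    "image file": {".jpeg", ".jpg", ".png", ".tiff", ".tif"},
    "audio file": {".mp3", ".wma"},
    "video file": {".mp4", ".wmv"},
}


def classify_markup_object(filename: str) -> Optional[str]:
    """Return the 5.2 object type for a file name, or None if it is not covered."""
    suffix = PurePath(filename).suffix.lower()  # case-insensitive match
    for category, suffixes in MARKUP_OBJECT_TYPES.items():
        if suffix in suffixes:
            return category
    return None  # format not named in 5.2; the standard allows other types
```

A file whose extension is not in the table returns `None` rather than raising an error, since 5.2 does not limit marking up to the listed formats.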
6 Methods of extraction and marking up
6.1 Principles and methods of extraction
6.1.1 Principles of extraction
The principles for extracting knowledge elements are as follows:
a) Completeness: Extract the types, constituent elements and descriptions completely, so that they meet the essential requirements of the knowledge ontology model.
b) Accuracy: Extract concepts, descriptions and relationship descriptions accurately. When extracting content from the original text, the following conditions shall be met:
1) References or associations used in the original text shall be supplemented as much as possible;
2) Connecting descriptions in the original text shall be included after editing and improvement to meet the needs of independent expression;
3) Figure and table serial numbers involved in the original text shall be renumbered as needed;
4) Any other editing work required to ensure the independence of the described content shall be carried out.
c) Practicality: Extract with practical use in mind, so that the results are readily usable.
d) Standardization: Strive for standardization and unification in the specifications, quality, granularity and degree of extracted expressions.
e) Relevance: Extract knowledge elements related to the target scope, strive to construct various relationships between knowledge elements, and avoid isolated knowledge elements as much as possible.
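The relevance principle in e) can be checked mechanically once knowledge elements carry explicit relationships. The following is a minimal sketch under stated assumptions: the `KnowledgeElement` record, its field names and the `find_isolated` helper are hypothetical and are not prescribed by this document, which leaves the data model to the knowledge ontology.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class KnowledgeElement:
    """Hypothetical record for one extracted knowledge element."""
    element_id: str
    name: str
    description: str
    related_ids: Set[str] = field(default_factory=set)  # IDs this element links to


def find_isolated(elements: List[KnowledgeElement]) -> List[str]:
    """Return IDs of elements with no incoming or outgoing relationship."""
    known = {e.element_id for e in elements}
    linked: Set[str] = set()
    for e in elements:
        for target in e.related_ids:
            # Ignore self-links and links to elements outside the collection.
            if target in known and target != e.element_id:
                linked.add(e.element_id)
                linked.add(target)
    return sorted(known - linked)
```

Elements reported by `find_isolated` are candidates for review: either a relationship to another element can be constructed, or the element may fall outside the target scope.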