Case Study

Informing ISO 24617-2:2012: An International Standard for Dialogue Act Analysis

1. Summary of the impact

Dr Alex Fang co-authored ISO 24617-2, the international standard that governs dialogue act analysis, approved and certified by the International Organization for Standardization (ISO). The standard has been adopted by member countries across Europe and Asia as national guidelines to enable interoperable practice in public and private sectors for research and development of language technologies. The standard, first published in 2012, was rigorously reviewed by ISO in the assessment period and has been certified as an international standard for another 5 years starting in 2017.

2. Underpinning research

ISO 24617-2:2012 provides a set of empirically and theoretically motivated concepts and guidelines for the analysis of pragmatic meanings, communicative functions and speaker intentions identifiable in contextual utterances in interactive speech (Fang et al 2011, Bunt et al 2012). This international standard serves as a foundation for localized national standards addressing the needs of specific languages such as Chinese (Fang et al 2019), German and Japanese.

Dr Fang’s contribution to the underpinning research falls within the area of the corpus-based empirical approach to lexico-grammatical analysis of authentic data. His research determines the structural analysis of the spoken units, which includes the segmentation of speech units, the parts of speech of the component lexical items and the syntactic constituents at phrasal and clausal levels. There exists a multitude of analytical frameworks within the lexico-grammatical paradigm, some simple and some complex, but they do not always readily lend themselves to the meaningful analysis of spoken utterances. Dr Fang’s work, arising from corpus linguistics, focused on the linguistic annotation of spoken data and the identification of a set of lexical and grammatical features that maximally allow for the reliable prediction of functions and intentions (Fang & Cao 2015), laying a solid foundation for its application in artificial intelligence in general and in automatic man-machine dialogue systems in particular. This research proposed a theoretical framework for the interactions between the internal, linguistic dimension and the external, contextual dimension of language communication. This framework provides essential guidance for the generation of a language model that automatically determines the communicative function of a contextual utterance according to its lexico-grammatical features. The language model was applied to annotate a large corpus of transcribed interactive speech (Fang et al 2012, Bunt et al 2018). The annotated corpus is now publicly available as an important language resource for linguistic research and has been broadly cited within the international research community in dialogue analysis.

Dr Fang’s expertise lies in corpus linguistics, which aims to use large quantities of text data to validate linguistic theories. His past and ongoing research in this particular area has covered linguistic analysis at lexical, grammatical and syntactic levels, which serves as an important instrument to automatically pinpoint the communicative functions of spoken utterances.

3. References to the research

[R1] Fang, A. C., Bunt, H., Cao, J., & Liu, X. (2011). Relating the Semantics of Dialogue Acts to Linguistic Properties: A machine learning perspective through lexical cues. In Proceedings of the 5th IEEE International Conference on Semantic Computing, Stanford University, Palo Alto, California, USA, September 18-21, 2011.

[R2] Bunt, H., Alexandersson, J., Choe, J.-W., Fang, A. C., Hasida, K., Petukhova, V., Popescu-Belis, A., & Traum, D. (2012). ISO 24617-2:2012 Language resource management – Semantic annotation framework (SemAF) – Part 2: Dialogue acts. Geneva: The International Organization for Standardization. 104 pages.

[R3] Fang, A. C., Cao, J., Liu, X., & Bunt, H. (2012). Lexical Characteristics of Dialogue Acts in the Switchboard Corpus of Telephone Conversations. Journal of Foreign Language Teaching and Research, Vol 44, No 3. pp 28-40.

[R4] Fang, A. C., & Cao, J. (2015). Text Genres and Registers: The Computation of Linguistic Features. Berlin and Heidelberg: Springer. 267+xiii pages.

[R5] Bunt, H., Petukhova, V., Malchanau, A., Fang, A. C., & Wijnhoven, K. (2018). The DialogBank: Dialogues with Interoperable Annotations. Language Resources and Evaluation. DOI 10.1007/s10579-018-9436-9

[R6] Fang, A. C., Li, Y., Cao, J., & Bunt, H. (2019). Chinese Multimodal Resources for Dialogue Act Analysis. In C.-R. Huang, Z. Jing-Schmidt, & B. Meisterernst (Eds.), The Routledge Handbook of Chinese Applied Linguistics. Routledge. pp 256-275.

4. Details of the impact

Impacting everyone, everywhere, ISO has published over 22,100 international standards and related documents, covering every technological area including language and communication. These standards specify the requirements for state-of-the-art products, services, processes, materials and systems, and for good conformity assessment, managerial and organizational practice. They range from the coding of character sets for languages such as Chinese, Japanese, and Korean to the highly technical specifications for MPEG (Moving Picture Experts Group) requirements.

As a non-governmental organization, ISO works closely with governments through its national members (most of whom are part of the governmental structure of their countries) and with its worldwide partners. These partners in international standardization include IEC (International Electrotechnical Commission) and ITU (International Telecommunication Union). It also cooperates with the World Trade Organization (WTO) to promote a free and fair global trading system and with UNESCO, for instance, to save endangered languages and preserve cultural heritage.

Because of his expertise in corpus linguistics, Dr Fang was invited to join international Working Group 2 (Semantic Annotation) of Sub-Committee 4 (Language Resource Management) of ISO Technical Committee 37 – Language and Terminology [S7] in 2007. This working group was developing a standard that would provide a set of standardized concepts for the analysis of dialogue acts, and it included nationally certified experts from the USA, China, Russia, Germany, the Netherlands, Japan, South Korea and Norway, countries that represent the most dynamic regions, where a growing market is driving demand for standardized analysis of speech and language. The standard was urgently needed because man-machine dialogue systems were rapidly becoming prevalent in both commercial and research settings without a standard to guide them. Since the new standard was intended for use with automatic computer systems that understand human language, it had to be much more rigorous than previous language standards, which tended to be abstract and conceptual. Dr Fang’s expertise in the corpus-based approach to language made this level of rigour possible. Dr Fang pioneered the application of the standard to a large corpus of transcribed conversations in English and Dutch, which is now publicly accessible for academic research ([R1], [R3], [R5]). This work demonstrated the applicability of the international standard and empirically tested its suitability for automatic application. Dr Fang’s research has thus indirectly impacted on the development of international standards in terms of their empirical validation, in addition to his direct impact on the development of ISO 24617-2:2012. The standard was published in 2012, and its suitability and relevance were the subject of a five-year systematic review in 2017. This review approved its republication as an international standard for the next 5 years, a measure of its success.

ISO 24617-2:2012 is the first international standard for the annotation of communicative functions in human speech and serves as an ISO recommendation to member states for adoption as national standards. Since its publication, it has been adopted in Europe and Asia by the UK, Denmark, the Netherlands, Slovenia, and South Korea and published locally as their national standards. See items [S1]-[S4] for sources of information. Other ISO member states usually localize an international standard by adapting it to their own standard based on considerations related to their language, research and market specifics.

ISO 24617-2:2012 has provided commercial entities with a sound basis for the development of speech and language applications. Its description of a fundamental set of communicative functions allows for the automatic, as well as manual, analysis of communicative functions and speaker intentions, a quality valued in the research and development of speech products. Alibaba, for instance, has a special standards department that works with product development teams to ensure conformity with international standards such as ISO 24617-2:2012. The same can be expected of similar entities such as Apple Inc in their development of intelligent applications like Siri, which aims to achieve life-like man-machine interactions on the basis of speaker intention understanding.

ISO promotes its standards and works closely with academic conferences or workshops such as LREC, ACL, and COLING. In 2018, for instance, Working Group 2 met in early May in Miyazaki, Japan, where LREC 2018 was held. It met again in late August in Santa Fe, New Mexico, USA, co-located with COLING 2018. City University of Hong Kong has been particularly active and, through Dr Fang’s influence, organized several such events. It also invited some of the core members to the university as visiting professors [S7].

5. Sources to corroborate the impact

[S1] The International Organization for Standardization: https://www.iso.org/standard/51967.html

[S2] The Swedish Standards Institute: https://www.sis.se/en/produkter/standardization/terminology-/iso2461722012/

[S3] The Norwegian Standardization Organization: https://www.standard.no/no/Nettbutikk/produktkatalogen/Produktpresentasjon/?ProductID=590108

[S4] Standards New Zealand: https://shop.standards.govt.nz/catalog/24617-2%3A2012%28BS+ISO%29/view

[S5] Standards Australia: https://www.standards.org.au/standards-catalogue/international/iso-slash-tc--37-slash-sc--4/iso--24617-2-colon-2012

[S6] The British Standards Institution: https://shop.bsigroup.com/ProductDetail?pid=000000000030196984

[S7] Letter from Kiyong Lee, Convenor of ISO/TC 37/SC 4 Language Resource Management/WG 2 Semantic Annotation, confirming Dr Fang’s involvement