National cybersecurity strategies: Advanced text analytics

15 Jun 2017 13:15h - 14:00h

Event report

[Read more session reports from WSIS Forum 2017]

This session, moderated by Mr Jorge Martinez Navarrete (Analytics Partnerships & Innovation, United Nations Office of Information and Communications Technology), presented a text analytics tool that automates the comparison of national cybersecurity strategies. According to Navarrete, the project goals were to gain insights through data exploration, to expedite the work of human researchers, and to make information more appealing through visualisations. The project was developed as a partnership among the International Telecommunication Union (ITU), the United Nations Office of Information and Communications Technology (OICT), Fordham University, and New York University, with the ITU responsible for business requirements, the OICT for project management, and the universities for solution development.

Prof. W. ‘RP’ Raghupathi (Professor and Director, Center for Digital Transformation, Gabelli School of Business, Fordham University) talked about Fordham University's commitment to solving challenging problems affecting the world and to mobilizing the skills, experience, and time of faculty and students in support of the sustainable development goals. Currently, more than 42 students are involved in 5 ongoing projects with UN system entities. In his opinion, these real-life projects motivate faculty and students. At the end of his presentation, Raghupathi said that text analytics is commonplace in the private Internet sector but less so in policy research and in government offices in developing countries, and that this kind of analytical tool could be applied to other types of policies as well.

Mr Youwei Xiao and Mr Chuanze Cai (both MS in Business Analytics programme, Gabelli School of Business, Fordham University) explained the methodology used to analyse the national cybersecurity strategies of more than 60 countries. Starting with two data sources (cybersecurity strategy documents and cyberwellness profiles), the machine learning algorithm selects keywords, classifies them, and creates a dictionary. Using this dictionary, the information in the documents is classified and output. The process is iterative and can be repeated. They presented improvements achieved by creating subcategories based on the current corpora, which increased the tagging coverage rate. They concluded that the tool is limited when it handles sentences that contain keywords from mixed categories. According to Cai, future plans include updating the model with more restrictive categorizing/clustering criteria to optimize the results.
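The dictionary-based classification step described above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the categories, keywords, and function names below are hypothetical, and the keyword dictionary is written by hand here rather than learned by a machine learning algorithm. The sketch tags each sentence with every category whose keywords it contains, computes a tagging coverage rate (the share of sentences that receive at least one tag), and lists the sentences that match several categories, which is the mixed-category case the presenters identified as a limitation.

```python
import re

# Hypothetical dictionary (category -> keywords); the real project dictionary
# was derived from the strategy documents and is not reproduced here.
DICTIONARY = {
    "legal": {"legislation", "law", "regulation"},
    "technical": {"cert", "incident", "standards"},
    "capacity building": {"training", "education", "awareness"},
}


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_sentence(sentence, dictionary=DICTIONARY):
    """Return the sorted list of categories whose keywords occur in the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return sorted(cat for cat, keywords in dictionary.items() if words & keywords)


def tag_document(text, dictionary=DICTIONARY):
    """Tag every sentence; return the tags, the coverage rate, and mixed sentences."""
    tagged = [(s, tag_sentence(s, dictionary)) for s in split_sentences(text)]
    covered = sum(1 for _, cats in tagged if cats)
    coverage = covered / len(tagged) if tagged else 0.0
    # Sentences matching more than one category are the ambiguous, mixed case.
    mixed = [s for s, cats in tagged if len(cats) > 1]
    return tagged, coverage, mixed


sample = (
    "The government will pass new legislation. "
    "A national CERT handles every incident. "
    "Training supports the new law. "
    "The strategy covers five years."
)
tagged, coverage, mixed = tag_document(sample)
for sentence, categories in tagged:
    print(categories, "|", sentence)
print("coverage:", coverage)
print("mixed:", mixed)
```

In this toy example, adding keywords or subcategories to the dictionary raises the coverage rate, while stricter matching criteria would reduce the number of sentences that land in more than one category.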

In his presentation, Mr Christian Felix (Computer Science PhD programme, Tandon School of Engineering, New York University) said that New York University has worked on UN challenges for more than 2 years. He demonstrated the tool (available at https://nyuvis.github.io/revex/ict4sd/) and its functionalities, which include searching by term, category, subcategory, keyword, or country, and viewing the frequency of a specific category or subcategory in a country's document.

At the end of the session, Mr Luc Dandurand (Head, ICT Applications and Cybersecurity Division, International Telecommunication Union) noted how difficult it is to carry out an accurate analysis and comparison by reading the chosen documents alone, and stressed the importance of a tool like the one presented in the session.

 

by Nathalia Sautchuk Patrício