A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis

Dongo, Irvin; Cardinale, Yudith; Aguilera, Ana; Martinez, Fabiola; Quintero, Yuni; Robayo, German; Cabeza, David

dc.contributor.author	Dongo, Irvin
dc.contributor.author	Cardinale, Yudith
dc.contributor.author	Aguilera, Ana
dc.contributor.author	Martinez, Fabiola
dc.contributor.author	Quintero, Yuni
dc.contributor.author	Robayo, German
dc.contributor.author	Cabeza, David
dc.date.accessioned	2022-11-30T02:46:18Z
dc.date.available	2022-11-30T02:46:18Z
dc.date.issued	2021
dc.description.abstract	Purpose – This paper aims to perform an exhaustive review of relevant and recent related studies, which reveals that both extraction methods are currently used to analyze credibility on Twitter. Thus, there is clear evidence of the need for different options to extract different data for this purpose. Nevertheless, none of these studies perform a comparative evaluation of both extraction techniques. Moreover, the authors extend a previous comparison, which uses a recently developed framework that offers both data extraction alternatives and implements a previously proposed credibility model, by adding a qualitative evaluation and a Twitter Application Programming Interface (API) performance analysis from different locations. Design/methodology/approach – As one of the most popular social platforms, Twitter has been the focus of recent research aimed at analyzing the credibility of the shared information. To do so, several proposals use either the Twitter API or Web scraping to extract the data to perform the analysis. Qualitative and quantitative evaluations are performed to discover the advantages and disadvantages of both extraction methods. Findings – The study demonstrates the differences in accuracy and efficiency between the two extraction methods and draws attention to further problems in this area that must be addressed in the pursuit of true transparency and legitimacy of information on the Web. Originality/value – Results report that some Twitter attributes cannot be retrieved by Web scraping. Both methods produce identical credibility values when a robust normalization process is applied to the text (i.e. the tweet). Moreover, concerning time performance, Web scraping is faster than the Twitter API and is more flexible in terms of obtaining data; however, Web scraping is very sensitive to website changes. Additionally, the response time of the Twitter API is proportional to the distance from the central server in San Francisco.	en_ES
dc.facultad	Facultad de Ingeniería	en_ES
dc.file.name	Dongo_Int2021.pdf
dc.identifier.citation	Dongo, I., Cardinale, Y., Aguilera, A., Martinez, F., Quintero, Y., Robayo, G. and Cabeza, D. (2021), "A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis", International Journal of Web Information Systems, Vol. 17 No. 6, pp. 580-606. https://doi.org/10.1108/IJWIS-03-2021-0037	en_ES
dc.identifier.doi	https://doi.org/10.1108/IJWIS-03-2021-0037
dc.identifier.uri	http://repositoriobibliotecas.uv.cl/handle/uvscl/7312
dc.language	en
dc.publisher	Emerald
dc.source	International Journal of Web Information Systems
dc.subject	API, WEB SCRAPING	en_ES
dc.subject	TWITTER	en_ES
dc.subject	CREDIBILITY	en_ES
dc.subject	QUALITATIVE ANALYSIS	en_ES
dc.title	A qualitative and quantitative comparison between Web scraping and API methods for Twitter credibility analysis
dc.type	Article
uv.departamento	Escuela de Ingeniería Informática
uv.notageneral	Not available for download

Collections

UV researcher articles