Utilizing Elbow Method for Text Clustering Optimization in Analyzing Social Media Marketing Content of Indonesian e-Commerce



E-commerce, Twitter, marketing, text mining, K-means, elbow


The rapid growth of textual data on Twitter, together with advances in text analytics, has driven organizations to mine that data for hidden insights that can inform their marketing strategies. The volume of information generated on Twitter has led many e-commerce businesses to use the platform for social media marketing. One of those businesses is Blibli Indonesia. Intense competition pushes such companies to build marketing strategies around an understanding of consumer tendencies. Focusing marketing on consumer preferences can increase consumer interest in Blibli and, with it, the opportunity to reach new consumers. This research aims to identify the content themes in the tweets of @bliblidotcom by clustering them with k-means. The optimal number of clusters is determined with the elbow method by selecting the point of sharpest curvature, which yields three clusters. The results suggest that Twitter users like content related to Park Seo Jun. Hence, Blibli can focus on that content in its marketing strategy on Twitter.
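The elbow selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the tweets have already been converted to numeric vectors (toy 2-D points stand in for them here), it uses a deterministic farthest-first initialization in place of the usual random start, and it reads "sharpest curvature" as the largest discrete second difference of the within-cluster sum of squares (WCSS). All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic start (an assumption of this sketch): begin with the
    first point, then repeatedly add the point farthest from those chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def elbow_k(wcss):
    """wcss[i] holds the WCSS for k = i + 1. Return the k whose discrete
    second difference (a simple proxy for curvature) is largest."""
    curvature = {k: wcss[k - 2] - 2 * wcss[k - 1] + wcss[k]
                 for k in range(2, len(wcss))}
    return max(curvature, key=curvature.get)


# Toy data: three tight groups of five points each (not real tweet vectors).
centers = [(0, 0), (10, 0), (0, 10)]
offsets = [(-0.5, 0), (0.5, 0), (0, -0.5), (0, 0.5), (0, 0)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

wcss = [kmeans_wcss(points, k) for k in range(1, 7)]
best_k = elbow_k(wcss)
print(best_k)
```

On real tweets, the points would instead be high-dimensional term-weight vectors, and the elbow is often less sharp than in this toy example, so the WCSS curve is usually inspected visually as well.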

Author Biographies

Aisyah Larasati, State University of Malang

Aisyah is an Associate Professor in the Department of Industrial Engineering at Universitas Negeri Malang.

Raretha Maren, State University of Malang

Raretha is a senior-year student in the Department of Industrial Engineering, State University of Malang.

Retno Wulandari, State University of Malang

Retno is an Assistant Professor in the Department of Mechanical Engineering, State University of Malang.

