Description
Digital Economy and Management explores how digital technologies, online platforms, and data-driven practices influence economic behavior and management strategies. It examines how digital communication, social structures, and cultural norms intersect with economic actions, particularly in a globalized environment. Within this context, the study analyzes the use of English-derived words in Turkish social media texts as a form of code-mixing, focusing on YouTube and Twitter. It investigates how code-mixing functions in digital communication and what role it plays in shaping socio-economic interactions online. The study uses Natural Language Processing (NLP) techniques and modern language models to explore the frequency, context, and sociocultural implications of this linguistic phenomenon.
Data were collected from two primary sources: YouTube and Twitter. For YouTube, the YouTube API was used to retrieve video transcripts by querying specific keywords and hashtags (e.g., #codeMixing, #Türkçeİngilizce). For Twitter, pre-existing datasets were filtered by specific keywords and hashtags to identify posts featuring Turkish-English code-mixing. Real-time collection through the Twitter API was also attempted, but platform-imposed limits restricted the volume of data that could be gathered.
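The hashtag filtering applied to the pre-existing Twitter datasets can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function names `fold_for_matching` and `filter_posts` and the sample posts are hypothetical. The one deliberate choice is to fold the Turkish dotted and dotless "i" variants together before comparing, since Python's default `str.lower()` turns "İ" into "i" plus a combining dot, which would prevent `#Türkçeİngilizce` from matching `#türkçeingilizce`.

```python
import re

# Matches a hashtag: '#' followed by Unicode word characters (covers ç, ü, İ, ...).
HASHTAG_RE = re.compile(r"#\w+")


def fold_for_matching(text: str) -> str:
    """Lowercase text while treating İ, I, ı and i as the same letter.

    Assumption for this sketch: for hashtag matching we only need a
    consistent key, not linguistically correct Turkish lowercasing.
    """
    return text.replace("İ", "i").replace("ı", "i").lower()


def filter_posts(posts, hashtags):
    """Keep posts that contain at least one of the target hashtags."""
    wanted = {fold_for_matching(h) for h in hashtags}
    kept = []
    for post in posts:
        tags = {fold_for_matching(t) for t in HASHTAG_RE.findall(post)}
        if tags & wanted:
            kept.append(post)
    return kept


if __name__ == "__main__":
    # Hypothetical sample posts, invented for illustration only.
    sample = [
        "Bu video çok cool #Türkçeİngilizce",
        "sadece türkçe bir cümle",
        "yeni vlog #CODEMIXING",
    ]
    for post in filter_posts(sample, ["#codeMixing", "#Türkçeİngilizce"]):
        print(post)
```

Folding all "i" variants together trades some precision for recall, which seems a reasonable default when the goal is to collect candidate posts for later analysis.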
The raw data underwent several preprocessing steps to ensure consistency: cleaning to remove unnecessary characters, lowercasing, tokenization, and stop-word removal. Several NLP techniques were then applied to the preprocessed data, including frequency analysis, n-gram analysis, and language identification algorithms to detect code-mixed instances.
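The preprocessing and analysis steps described above can be illustrated with a short sketch. Everything named here is an assumption made for illustration: the stop-word list and English lexicon are tiny hypothetical stand-ins, and the lexicon lookup in `english_share` is a simple substitute for whatever language identification algorithm the study actually used.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny resources; a real analysis would need fuller lists.
TURKISH_STOP_WORDS = {"ve", "bir", "bu", "çok", "de", "da"}
ENGLISH_LEXICON = {"cool", "like", "follow", "content", "trend"}

NOISE_RE = re.compile(r"https?://\S+|[@#]\w+")  # URLs, mentions, hashtags
TOKEN_RE = re.compile(r"[^\W\d_]+")             # runs of Unicode letters


def preprocess(text):
    """Clean, lowercase, tokenize, and drop stop words."""
    text = NOISE_RE.sub(" ", text)
    # Map "İ" first so lower() does not emit "i" plus a combining dot.
    # Simplification: uppercase "I" is lowercased the English way.
    text = text.replace("İ", "i").lower()
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in TURKISH_STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-grams as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def english_share(tokens):
    """Fraction of tokens found in the (hypothetical) English lexicon."""
    if not tokens:
        return 0.0
    return sum(t in ENGLISH_LEXICON for t in tokens) / len(tokens)


if __name__ == "__main__":
    # Invented example sentence, not taken from the study's data.
    tokens = preprocess("Bu video çok COOL, like atmayı unutmayın! https://example.com #vlog")
    print(Counter(tokens).most_common(3))     # frequency analysis
    print(Counter(ngrams(tokens, 2)))         # bigram counts
    print(english_share(tokens))              # share of lexicon-matched tokens
```

A lexicon lookup of this kind misses inflected forms such as an English stem carrying a Turkish suffix, which is one reason contextual models are attractive for this task.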
BERT (Bidirectional Encoder Representations from Transformers), a transformer-based language model, was employed to detect code-mixed instances and analyze the contextual meaning of Turkish-English interactions. Because BERT represents each word in light of its surrounding context, it was central to interpreting how English-derived words function within Turkish sentences.
The findings show that English-derived words are used frequently in Turkish social media texts, where they serve communicative and identity-expressive functions. Code-mixing reflects social and cultural motivations, such as signaling cultural alignment. It also poses challenges for NLP systems, which struggle to handle mixed-language data. One suggested way to address these challenges is to adapt multilingual models to code-mixed text.
This study offers insights into the role of code-mixing in digital communication, examining both its linguistic patterns and its social implications. Integrating advanced language models such as BERT deepens the understanding of code-mixing and highlights its socio-economic impact on digital communication. The research contributes both to theoretical frameworks on language contact and to practical advances in NLP technologies for multilingual contexts.