Open access peer-reviewed chapter

Language Context in the Future of Television and Video Industry: Exploring Trends and Opportunities

Written By

Arkin Haris

Submitted: 30 May 2023 Reviewed: 28 September 2023 Published: 12 June 2024

DOI: 10.5772/intechopen.113309

From the Edited Volume

The Future of Television and Video Industry

Edited by Yasser Ismail

Abstract

This chapter explores the importance of language context in the future of the television and video industry. With a growing global audience and increasing demand for content in multiple languages, the sector is turning to new innovations and strategies to meet this demand. The chapter examines the impact of language on content creation, distribution, and consumption and how industry players are responding to this changing landscape. It also discusses the role of computational technology, machine learning, and artificial intelligence in shaping linguistic tasks in multimedia. The trends and opportunities considered include the increased use of multilingual content, advances in automatic translation, the integration of natural language processing, the creation of language-inclusive content, and the use of AI technology to aid language learners. Multilingual content caters to diverse linguistic needs, expands reach, and promotes cultural exchange. Automatic translation improves language accessibility and facilitates smooth communication across different languages. The integration of natural language processing improves the interaction between viewers and technology. Language inclusivity fosters cultural appreciation and representation. AI technology supports effective and personalized language-learning experiences. Embracing these trends ensures relevance and connection with viewers in a globalized world.

Keywords

  • language
  • television industry
  • video industry
  • multilingual content
  • natural language
  • language inclusive content
  • artificial intelligence
  • language learning
  • computational linguistics

1. Introduction

The television and video industry has undergone significant changes in recent years, driven by advancements in technology and changes in consumer behavior. More broadly, the telecommunications, media, and entertainment sectors are experiencing substantial transformations due to evolving consumer preferences and the emergence of new business models. Within these industries, the video entertainment sector serves as a prime example of how the conventional linear value chain is evolving with the increasing influence of platform-based business models [1]. Digital innovations have revolutionized the way content is produced, distributed, and consumed. This has given rise to a new ecosystem of content creators, ranging from independent filmmakers to established studios, who can reach a global audience without the need for traditional broadcast infrastructure. Viewers, for their part, now have more control over what they watch, when they watch it, and how they engage with content. The revenue model has also evolved in the online era. Subscription-based services have become the norm, with viewers paying a monthly fee for unlimited access to a platform’s content library. This shift has led to a decrease in traditional advertising revenue for broadcasters, prompting them to explore new advertising strategies, such as targeted ads, digital marketing, and copywriting.

Smart television and video streaming services are increasingly favored by consumers. Video streaming is a term applied to the compression and buffering techniques that allow one to transmit and view video in real-time through the internet [2]. Streaming services offer unparalleled convenience to consumers. Users can access a vast library of movies, TV shows, and original content from anywhere and at any time, using various devices such as smartphones, tablets, smart TVs, or computers. The emergence of platform-based business models has played a pivotal role in reshaping the television and video industry. Platforms like Netflix, Amazon Prime Video, and Disney+ have gained significant market share by providing a direct-to-consumer streaming experience. These platforms provide an extensive selection of content across different genres, catering to diverse interests. They leverage sophisticated algorithms and user data to offer personalized recommendations based on individual viewing habits and preferences. By analyzing a user’s viewing history, search keywords, ratings, and interactions, these platforms can suggest relevant content, helping users discover new shows and movies tailored to their interests. Moreover, many platforms incorporate social features that enable users to engage with content and share their viewing experiences with friends and family. The proliferation of streaming platforms has disrupted traditional media, offering viewers greater control, personalized experiences, and a vast array of content choices. As the industry continues to evolve, content creators, broadcasters, and technology companies must adapt to these changes and explore innovative strategies to thrive in the digital era of television and video.

One trend that has emerged is the increasing importance of language context in the industry. With a growing global audience and an increasing demand for content in multiple languages, the industry is turning to new technologies and strategies to meet this demand. Language plays a crucial role in determining the success and reach of television and video content. By considering language context, content creators can tailor their offerings to specific linguistic markets, effectively adapting to diverse audiences [3]. One of the key considerations in utilizing language is understanding the cultural nuances and preferences of different linguistic communities. It is not just about translating content from one language to another, but rather about adapting the content to resonate with the target audience’s cultural and linguistic sensibilities. This involves considering linguistic variations such as dialects, accents, and idiomatic expressions that are unique to specific regions or communities. Differences arise within a language as a result of how individuals utilize it or through interactions with other languages and cultures. Languages exhibit variations in various aspects of their structure, encompassing syntax, morphology, phonology, lexicon, semantics, pragmatics, and more [4].

Against this background, this chapter explores trends and opportunities related to language context in the future of the television and video industry. It examines the impact of language on content creation, distribution, and consumption and how industry players are responding to this changing landscape. The discussion also delves into the role of computational technology, machine learning, and artificial intelligence in shaping the future of language-related tasks in this area. These technologies enable the automation of translation, subtitling, and dubbing processes, making it easier and more efficient to produce multilingual content [5]. Machine-learning algorithms can analyze language patterns and preferences, allowing content providers to personalize recommendations and improve user experiences. The emergence of multilingual search engine optimization has opened new avenues for content discovery and audience engagement. By optimizing content for multilingual search queries, television and video providers improve visibility and attract viewers from different language backgrounds. This presents an opportunity for content creators to expand their reach and capture niche markets (Figure 1).

Figure 1. AI and TV.

The development of AI technology in language has significant implications for the video and television industry. AI-powered voice recognition and natural language processing capabilities are improving the interaction between viewers and video platforms. Voice commands and conversational interfaces enable users to search for content, navigate interfaces, and control playback using natural language, making the viewing experience more intuitive and convenient [6]. Localization project management systems are also gaining importance in this domain. These systems streamline the process of adapting content for different linguistic and cultural contexts. They facilitate efficient translation, localization, and quality assurance, ensuring that the content resonates with local audiences and maintains cultural sensitivity. Content creators are increasingly tailoring their offerings to cater to specific linguistic markets. This includes customizing visuals, references, and storytelling techniques to make the content more relatable and appealing to different cultural backgrounds.

Language education and language learning have also found a place in the future of the television and video industry. As content becomes more accessible globally, there is an opportunity to leverage video platforms for language-learning purposes. Educational programs, language tutorials, and interactive language-learning content can be incorporated into streaming services, providing users with stimulating and compelling language-learning experiences. Video-based language-learning courses can be developed to provide a structured and supportive way for learners to develop their language skills. These courses can cover different levels and topics, and they can include interactive exercises, quizzes, and assessments to help learners track their progress. In parallel, computer-assisted language learning (CALL) programs have advanced consistently alongside the rapid growth in computing power and speed. Presently, the integration of multimedia applications, speech recognition, and the combination of artificial intelligence with machine learning has made a significant impact on language education [7]. CALL can also be used to create interactive language-learning content that engages learners and encourages them to practice their language skills. For example, interactive video content can be designed to prompt learners to respond to questions or complete tasks related to the language being taught. Moreover, CALL enables the creation of tailored language-learning experiences that cater to the specific requirements of learners. For instance, language-learning applications can utilize artificial intelligence algorithms to assess learner progress and offer customized feedback and suggestions [8]. The language landscape in the television and video industry is therefore poised to undergo further transformations in response to technological advancements and evolving audience demands. Several trends and opportunities have the potential to shape the future of language in these mediums (Figure 2).

Figure 2. Language, TV, and video.

2. Multilingual content

Multilingualism refers to the ability or practice of using and understanding multiple languages [9]. It encompasses individuals who can speak, read, write, or comprehend more than one language. Multilingualism can exist at various levels, ranging from basic proficiency in multiple languages to native-like fluency. It can be acquired through various means, such as growing up in a multilingual environment, formal language education, immersion programs, or self-study. Multilingualism is a common phenomenon in many parts of the world, especially in regions with diverse linguistic communities or where multiple languages are officially recognized [10]. It offers numerous benefits, including enhanced communication abilities, improved cognitive skills, increased cultural understanding, and expanded opportunities in education, employment, and social interactions.

In the context of the television and video industry, multilingualism plays a crucial role in catering content to the diverse linguistic needs and preferences of the audience. The increased use of multiple languages in television and video content reflects a growing recognition of the diverse global audience and the need to cater to its language preferences. By incorporating multiple languages into their content, producers and broadcasters can reach a broad spectrum of viewers and create a more inclusive viewing experience. Here are a few ways in which the industry is encouraging multilingual content:

2.1 Subtitles and closed captioning

Subtitles and closed captioning are textual representations of the dialog or spoken words in a video or television program [11]. They provide a written transcription of the audio content, allowing viewers to read and understand the dialog even if they cannot hear or understand the original language. Subtitles are typically used to translate the dialog from one language to another, allowing viewers who do not speak the language of the video to follow along and comprehend the content. They are commonly used for foreign films, documentaries, and television shows that are distributed internationally. Closed captioning, by contrast, provides a text-based representation of not only the dialog but also any relevant audio information such as sound effects or music cues. Closed captions are primarily designed for viewers with hearing impairments, as they provide a way for them to access and understand the audio content. Both subtitles and closed captioning improve accessibility and inclusivity in the television and video industry. They ensure that a broader range of viewers, irrespective of their language skills or hearing ability, can fully engage with and enjoy the content.
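The distinction between a translated subtitle and a closed caption that also carries sound information can be illustrated with a short sketch. It assumes the widely used SubRip (SRT) text layout, which the chapter itself does not name; the `Cue` class, the helper functions, and the sample lines are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One timed text entry (hypothetical structure for illustration)."""
    index: int
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Render milliseconds in the SRT timestamp layout HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def format_cue(cue: Cue) -> str:
    """Render one cue as an SRT block: index, time range, then the text."""
    return (
        f"{cue.index}\n"
        f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}\n"
        f"{cue.text}\n"
    )


# A subtitle translates the spoken line into the viewer's language.
subtitle = Cue(1, 1_500, 4_000, "Selamat datang.")
# A closed caption also describes relevant non-speech audio.
caption = Cue(1, 1_500, 4_000, "[door creaks]\nWelcome.")

print(format_cue(subtitle))
print(format_cue(caption))
```

Both cues share the same timing; only the text differs, which is why a single video can ship with several subtitle and caption files side by side.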

2.2 Dubbing

Dubbing refers to the process of replacing the original language dialog in a video or film with a translated version in another language. It involves recording the dialog in the target language by voice actors who match the lip movements and overall timing of the original actors on screen [12]. The new dialog is then mixed with the original video, creating a synchronized version in the target language. Dubbing is commonly used to make content accessible and understandable to viewers who do not speak the original language. It allows audiences to watch foreign films, television shows, or other video content in their native language, without relying on subtitles or understanding the original language spoken by the actors. It is especially prevalent in animated films and series, as well as in international content that aims to reach a broader audience across different language markets. It also requires skilled voice actors who can accurately convey the emotions, nuances, and delivery of the original performances while ensuring a seamless integration with the visual elements of the video. Overall, dubbing plays an essential role in making content more approachable and culturally relevant for viewers who prefer to watch and listen in their own language.

2.3 Language options and audio tracks

Language options and audio tracks refer to the features available in video platforms and streaming services that allow viewers to choose different languages for the audio content of a video. They give viewers the flexibility to watch and listen to content in their preferred language. With language options, viewers can select from a range of available languages for the audio soundtrack of a video. This feature is particularly useful for international movies, TV series, dramas, talk shows, or other content that is available in multiple languages. By selecting their preferred language, viewers can enjoy the content in a language they understand or feel more comfortable with. Audio tracks, in turn, are the specific audio recordings available in different languages for a video. Each audio track represents a different language version of the original audio. Viewers can switch between these tracks to listen to the dialog and other audio elements in the language of their choice. Language options and audio tracks improve the availability and diversity of video content by catering to a wide range of viewers with varying linguistic backgrounds. They give viewers the freedom to enjoy content in their native language, making the viewing experience more immersive and accessible.
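A minimal sketch of how a player might choose among audio tracks is given below. The function name, the use of short language codes, and the fallback to the original-language track are assumptions made for illustration; they do not describe any particular platform.

```python
from typing import Sequence


def pick_audio_track(available: Sequence[str],
                     preferred: Sequence[str],
                     original: str) -> str:
    """Return the first preferred language that has an audio track.

    Falls back to the original-language track when none of the viewer's
    preferred languages is offered (an assumed policy).
    """
    offered = {code.lower() for code in available}
    for code in preferred:
        if code.lower() in offered:
            return code.lower()
    return original.lower()


# Hypothetical title with English (original), Spanish, and Indonesian audio.
tracks = ["en", "es", "id"]
print(pick_audio_track(tracks, ["fr", "id"], original="en"))
print(pick_audio_track(tracks, ["fr"], original="en"))
```

The viewer's preferences are treated as an ordered list, so a second-choice language is used before the player falls back to the original audio.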

2.4 Multilingual programming

Television networks and online platforms are increasingly producing multilingual programming that caters to diverse language audiences. Multilingual programming refers to television or video content that incorporates multiple languages in its production, presentation, or dialog. It involves the use of different languages to cater to a diverse audience and provide a more multifaceted viewing experience. Multilingual programming can encompass various genres, such as talk shows, variety shows, and news programs, among others. This approach allows broadcasters and content creators to reach viewers from different linguistic and cultural backgrounds, fostering cultural exchange and understanding. It may involve incorporating subtitles, dubbing, or multiple language options and audio tracks to accommodate viewers who prefer different languages.

2.5 Language-specific channels and networks

Language-specific channels and networks refer to television channels or networks that are dedicated to broadcasting content in a particular language. These channels focus on catering to the specific linguistic needs and preferences of viewers who prefer content in that language. They provide an extensive variety of programming, including TV shows, game shows, sports, children’s programs, and other forms of entertainment, all in the targeted language. These channels aim to create a sense of community and cultural connection for viewers who share a common language and enable them to access content that is relevant and meaningful to them.

By integrating multilingualism into the content, the television and video industry can break down language barriers and connect with audiences on a global scale. It not only expands the reach of content but also promotes cultural diversity. It allows for more authentic and relatable storytelling, as content creators can cater to the unique linguistic and cultural preferences of their viewers. Furthermore, this approach opens up new opportunities for collaboration, partnerships, and content distribution in international markets.

3. Automatic translation

Translation is another area where language plays a crucial role in the television and video industry. As content is distributed globally, producers and networks must ensure that their programs are adapted to the language and cultural context of different regions and audiences. Translation involves converting content from one language to another while preserving the meaning and intent of the original message [13]. Effective translation requires an understanding of both the source and target languages, as well as an awareness of the cultural context. The use of translation in the television and video industry can range from the translation of subtitles and closed captioning to the translation of entire programs and series for international distribution. In recent years, advances in translation technology have made it easier and more efficient to translate content for global viewers.

The increased use of automatic translation, specifically Neural Machine Translation (NMT), in television and video content has revolutionized the way multilingual content is created and distributed. NMT utilizes artificial intelligence and deep learning techniques to improve the accuracy and speed of translation. NMT models have significantly improved translation quality compared to traditional rule-based or statistical machine translation approaches. The neural networks in NMT systems can capture complex linguistic patterns and generate more accurate translations, resulting in a more natural and fluent output [14]. NMT also enables faster translation turnaround times, allowing content producers to reach global audiences more quickly. Automated translation processes save time and resources by reducing the need for manual translation or subtitling, especially for large volumes of content. Here is how automatic translation is transforming the industry:

3.1 Enhanced translation quality

Enhanced translation quality refers to the improvement and optimization of the accuracy, acceptability, readability, fluency, and overall quality of the translated content [15]. It involves utilizing advanced translation technologies, linguistic expertise, and quality assurance processes to ensure that translations are faithful to the original meaning, culturally appropriate, and linguistically polished. Enhanced translation quality can be achieved through various methods, such as the use of machine translation systems trained on large amounts of high-quality multilingual data, post-editing by human translators to refine and improve machine-generated translations, and the implementation of rigorous quality control measures to catch and correct any errors or inconsistencies. By focusing on translation quality, organizations can provide more accurate and natural translations, leading to better communication and understanding between languages. This is particularly important in industries such as localization, television, video, film, and content creation, where accurate and high-quality translations are crucial for effective global communication and interaction with diverse audiences.

3.2 Faster translation turnaround

Faster translation turnaround refers to the ability to complete translation projects within shorter timeframes compared to traditional translation processes [16]. It involves streamlining and optimizing translation workflows to increase efficiency and reduce the time required for translation tasks. With advancements in technology and translation tools, such as machine translation, computer-assisted translation (CAT) tools, and cloud-based translation platforms, translation processes have become more automated and collaborative. These tools enable faster processing of translation projects by automating repetitive tasks, providing instant access to translation memories and terminology databases, and facilitating real-time collaboration among translators, editors, and project managers. Faster translation turnaround offers several benefits. It allows businesses to meet tight deadlines, launch products and services in multiple markets more quickly, and respond promptly to customer needs. It also improves overall project management efficiency, as shorter turnaround times enable faster project completion and delivery. However, faster turnaround should not compromise the quality of translations. It is crucial to maintain high translation standards, accuracy, and linguistic consistency while working within accelerated timelines.
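The idea of a translation memory, mentioned above as one source of faster turnaround, can be sketched as a lookup that reuses an approved translation when a new segment closely resembles a stored one. The sketch below uses Python's standard `difflib` module for the fuzzy match; the stored segments, the `lookup` function, and the 0.8 similarity cutoff are assumptions for illustration and are not taken from any specific CAT tool.

```python
import difflib
from typing import Optional

# Hypothetical translation memory: English source segment -> approved Spanish.
MEMORY = {
    "To be continued": "Continuará",
    "Next episode": "Próximo episodio",
    "Subtitles available": "Subtítulos disponibles",
}


def lookup(segment: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the stored translation of the closest source segment.

    Returns None when no stored segment is similar enough, in which case
    the segment would go to machine or human translation instead.
    """
    matches = difflib.get_close_matches(segment, list(MEMORY), n=1, cutoff=cutoff)
    return MEMORY[matches[0]] if matches else None


print(lookup("To be continued..."))   # near match, translation is reused
print(lookup("Opening credits"))      # no close match in the memory
```

Only segments that miss the memory need fresh translation, which is where the time saving comes from; the cutoff controls how aggressively near matches are reused.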

3.3 Cost-effectiveness

Cost-effectiveness describes the ability to achieve desired outcomes at a reasonable cost or with maximum efficiency. In the context of translation, it means obtaining high-quality translations while minimizing expenses and optimizing resources. Automatic translation offers cost advantages by reducing the dependency on human translators or dubbing specialists. These savings are achieved by implementing streamlined translation processes and by combining translation technologies with post-editing, translation memory tools, and terminology management systems. While human translation is still necessary for certain sensitive or creative content, automatic translation can handle the bulk of translation needs, providing a more cost-effective solution for multilingual content production.

3.4 Customization and adaptation

Customization and adaptation in the context of automatic translation refer to the ability to modify and adjust the translation process to better suit specific requirements or preferences. They involve tailoring the translation system or algorithm to a particular domain, language pair, or style of content to improve the accuracy and acceptability of the translation. This customization can include training the system with domain-specific data, incorporating language rules or constraints, or fine-tuning the output based on user feedback. The goal is to enhance the suitability of the automated translations for specific use cases or target audiences. NMT systems in particular can be customized and fine-tuned to accommodate specific domains, genres, or language pairs. This adaptability allows content creators to train the translation models on their own data, ensuring translations align with their specific industry terminology or stylistic preferences.

3.5 Human contribution and post-editing

Human contribution and post-editing refer to the involvement of human translators or editors in the process of refining and improving machine-generated translations. After an initial automatic translation is generated by a machine translation system, human translators or editors review and edit the translation to ensure accuracy, fluency, and adherence to specific requirements or guidelines. This human involvement helps to refine and enhance the quality of the translation, addressing any errors, ambiguities, or cultural nuances that may not have been accurately captured by the machine translation system alone. Translate Community platforms, where bilingual speakers can contribute to improving machine translations, have emerged as valuable resources. Post-editing polishes the output of automatic translation, resulting in higher-quality translations for television and video content.
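One simple way to see how much a human editor changed a machine translation is to compare the two versions word by word. The sketch below does this with Python's standard `difflib.SequenceMatcher`; it is an illustrative proxy for post-editing effort, not a standard evaluation metric, and the function names and example sentences are hypothetical.

```python
import difflib
from typing import List, Tuple


def post_edit_similarity(machine: str, edited: str) -> float:
    """Word-level similarity between machine output and its edited version.

    A value of 1.0 means the editor changed nothing.
    """
    return difflib.SequenceMatcher(None, machine.split(), edited.split()).ratio()


def changed_spans(machine: str, edited: str) -> List[Tuple[str, List[str], List[str]]]:
    """List the word spans the editor replaced, deleted, or inserted."""
    a, b = machine.split(), edited.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical case: a literally rendered idiom corrected by the editor.
machine_output = "He kicked the bucket last night"
human_edit = "He passed away last night"

print(round(post_edit_similarity(machine_output, human_edit), 2))
print(changed_spans(machine_output, human_edit))
```

Tracking such figures over many segments could indicate which kinds of content, for example idiomatic dialog, still depend most heavily on human review.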

3.6 Real-time subtitling and localization

Real-time subtitling and localization refer to the process of providing live subtitles or captions during an event [17]. This involves the simultaneous translation and display of spoken dialog or audio content in the viewer’s preferred language. It allows viewers who are deaf or hard of hearing, as well as those who are not fluent in the original language, to follow along with the program or event as it happens. Automatic translation supports real-time subtitling and localization, breaking down language barriers for live television broadcasts, conferences, news, sports, and other live events where immediate accessibility is required. This technology allows viewers to follow live content in their preferred language, improving accessibility and connection.

3.7 Multilingual distribution

Multilingual distribution refers to the process of distributing content, such as television programs or videos, in multiple languages to reach a diverse audience. It involves translating or adapting the content into different languages and making it available to viewers who prefer or require content in their native language. Multilingual distribution aims to cater to the linguistic diversity of the target audience and enhance accessibility and engagement with the content. It may involve various methods such as dubbing, subtitling, or providing different audio tracks in different languages. Automatic translation facilitates the distribution of multilingual content on various platforms and streaming services. It enables the rapid translation of metadata, video descriptions, and user-generated content, making the content more discoverable and accessible to a global audience.

The use of automatic translation, particularly NMT, in television and video content brings improved translation quality, with more accurate and contextually appropriate renditions of the original content, together with faster turnaround times, cost-effectiveness, and the ability to engage with global audiences. By combining the power of artificial intelligence and human contribution, the industry can leverage automatic translation to create multilingual content that is accurate, efficient, and culturally relevant.

4. Natural language processing

Natural language refers to the language that humans use to communicate with each other, such as Bahasa Indonesia, English, Spanish, or French. It is the form of communication that humans use to express their thoughts, ideas, and emotions in everyday conversations and written texts [18]. It is characterized by its complexity, variability, and the ability to convey meaning through words, sentences, and discourse, and it encompasses vocabulary, grammar, syntax, semantics, and pragmatics [19]. Natural language processing (NLP) is a field of study that focuses on enabling computers to understand, interpret, and generate human language in a way that is similar to how humans communicate with each other. NLP typically aims to construct a structured representation of unstructured natural language, drawing on insights from linguistics to add organization and coherence to the text [20]. To this end, researchers and engineers develop algorithms and systems that apply linguistic and computational techniques to process and interpret text or speech data.

The growing use of natural language processing in video content and AI interactions has transformed the way humans interact with technology. AI technology has made significant advancements in understanding and generating human-like language, enabling more natural and seamless interactions [21]. It has also improved content discovery and recommendation systems: AI algorithms analyze video metadata, transcripts, subtitles, and user behavior to understand both the content and user preferences, which allows for personalized recommendations based on users’ interests and viewing history. NLP can also analyze video transcripts, extract key information, and generate concise summaries or highlight reels. This capability is useful for video platforms, sports broadcasts, news programs, and other content where users may want a quick overview or highlights without watching the entire video. The following subsections describe how NLP shapes the interaction between humans and technology in the video and television industry:

4.1 Voice-based assistants

Voice-based assistants in the video and television industry refer to the integration of voice-controlled features and functionalities within video streaming platforms, smart TVs, and other media devices [22]. These assistants improve the user experience by allowing viewers to control and interact with their video content using voice commands. They provide a hands-free and convenient way for users to navigate through menus, search for specific shows or movies, and control playback options. Viewers can simply speak their commands, such as “Play the latest episode of my favorite series” or “Find action movies starring Vin Diesel,” and the voice-based assistant will execute the requested action. Voice-based assistants also support personalized recommendations based on user preferences and viewing history. By analyzing the user’s voice commands and interactions, these assistants can suggest relevant content that aligns with the viewer’s interests, leading to a more tailored and engaging viewing experience. Furthermore, voice-based assistants can provide additional information and context while viewers watch video content. For example, viewers can ask for real-time information about the actors on screen, background details about a particular scene, or even trivia about the show or movie they are watching. Natural language commands thus simplify the interaction between viewers and the content, allowing for a more immersive and interactive experience. Popular examples of AI-powered voice assistants include Amazon’s Alexa, Apple’s Siri, Google Assistant, and Microsoft’s Cortana.
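Once speech has been transcribed to text, an assistant has to map the words to an action. The sketch below shows a deliberately simple, rule-based version of that step using the two example commands quoted above. The intent names, the patterns, and the `parse_command` function are hypothetical; commercial assistants rely on far more capable language models, and the speech recognition step is assumed to have already happened.

```python
import re
from typing import Dict, Optional

# Hypothetical intent patterns, checked in order.
PATTERNS = [
    ("search", re.compile(
        r"^find (?P<genre>\w+) movies(?: starring (?P<actor>.+))?$", re.IGNORECASE)),
    ("play", re.compile(r"^play (?P<title>.+)$", re.IGNORECASE)),
    ("pause", re.compile(r"^pause$", re.IGNORECASE)),
]


def parse_command(transcript: str) -> Optional[Dict[str, str]]:
    """Map a transcribed voice command to an intent plus its details.

    Returns None when no pattern matches the transcript.
    """
    text = transcript.strip().rstrip(".!?")
    for intent, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            # Keep only the named groups that actually captured something.
            slots = {name: value for name, value in match.groupdict().items() if value}
            return {"intent": intent, **slots}
    return None


print(parse_command("Find action movies starring Vin Diesel"))
print(parse_command("Play the latest episode of my favorite series"))
```

The resulting dictionary is what a playback or search component would act on; resolving a phrase such as "my favorite series" to a specific title would additionally require the viewing history discussed above.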

4.2 Chatbots and virtual agents

Chatbots and virtual agents are computer programs designed to simulate human conversation and provide automated responses to user queries and requests. In the context of the video and television industry, chatbots and virtual agents are used to interact with viewers, assist them with inquiries, and enhance their overall experience. Chatbots are typically implemented through messaging platforms or embedded within websites or applications. They can engage in text-based conversations with users, answer frequently asked questions, provide recommendations, and offer customer support. Chatbots are programmed with predefined responses based on common queries, and they can use natural language processing techniques to understand and interpret user input [23]. Virtual agents, on the other hand, are more sophisticated versions of chatbots. They often incorporate artificial intelligence and machine-learning algorithms to understand and respond to user queries in a more context-aware and personalized manner. Virtual agents can handle more complex tasks, such as providing detailed information about shows, helping users navigate content catalogs, and sustaining longer conversational interactions.
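The predefined-response behavior of a simple chatbot can be illustrated with a minimal sketch. It assumes a tiny, hypothetical FAQ table (`FAQ`), a keyword-overlap score, and a fallback message (`FALLBACK`); none of these come from a specific product, and a deployed chatbot would normally replace the keyword matching with trained natural language processing models.

```python
import re

# Hypothetical FAQ entries: each pairs a set of trigger keywords with a canned reply.
FAQ = [
    ({"subtitle", "subtitles", "caption", "captions"},
     "Open the playback menu and choose a subtitle language."),
    ({"cancel", "subscription", "billing"},
     "You can manage your subscription under Account settings."),
    ({"recommend", "suggest", "watch"},
     "Tell me a genre you like and I will suggest a few titles."),
]
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"


def tokenize(text):
    """Lowercase the input and return its words as a set."""
    return set(re.findall(r"[a-z']+", text.lower()))


def reply(user_input):
    """Return the canned answer whose keywords overlap the input the most."""
    words = tokenize(user_input)
    best_answer, best_score = FALLBACK, 0
    for keywords, answer in FAQ:
        score = len(words & keywords)  # number of shared keywords
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer
```

Calling `reply("How do I turn on subtitles?")` selects the subtitle answer because that entry shares a keyword with the question, while an unrelated question falls through to the fallback message. A virtual agent, as described above, would go further by keeping track of context across turns.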

The use of chatbots and virtual agents in the video and television industry offers several benefits. They can provide immediate assistance to users, addressing their queries and concerns in a timely manner. They can also handle a large volume of interactions simultaneously, ensuring efficient customer service and support. Additionally, chatbots and virtual agents can collect valuable data and insights about user preferences, behavior, and feedback. This data can be used to improve content recommendations, personalize the user experience, and optimize marketing strategies. Overall, chatbots and virtual agents play a vital role in the video and television industry by automating customer interactions, improving user engagement, and providing efficient support services. They contribute to a more seamless and interactive viewing experience.

4.3 Natural language understanding

Natural Language Understanding (NLU) refers to the capability of a computer system to understand and interpret human language in a way that is similar to how humans understand it [24]. It is a subfield of artificial intelligence and computational linguistics that focuses on enabling machines to comprehend and derive meaning from natural language inputs. NLU uses algorithms and techniques to process and analyze text or speech data and extract relevant information from it. The goal is to enable computers to understand the context, semantics, and intent behind human language so that they can respond appropriately, take relevant actions, and support more natural and meaningful interactions. NLU techniques encompass various tasks: categorizing text into predefined categories or topics based on its content (text classification), identifying and extracting specific entities (entity recognition), determining the sentiment or opinion expressed in text (sentiment analysis), understanding the purpose or intention behind a user’s input (intent detection), and building statistical models that capture the patterns and structure of natural language (language modeling) [25]. These capabilities underpin voice, chat, and search features and thus contribute substantially to the development of the video and television industry.
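Two of these tasks, determining the user’s intent and extracting entities, can be sketched for a viewer command such as “Find action movies starring Vin Diesel.” The sketch below is rule-based and purely illustrative: the intent patterns, the genre vocabulary, and the `understand` function are hypothetical names introduced here, and real NLU systems typically learn such mappings from data rather than from hand-written rules.

```python
import re

# Hypothetical intent patterns; a production system would learn these from data.
INTENT_PATTERNS = {
    "play": re.compile(r"\b(play|watch|resume)\b"),
    "search": re.compile(r"\b(find|search|show me)\b"),
    "pause": re.compile(r"\b(pause|stop)\b"),
}
# Hypothetical genre vocabulary used for entity extraction.
GENRES = ["action", "comedy", "documentary", "drama"]


def understand(utterance):
    """Return the detected intent and any genre or actor entities."""
    text = utterance.lower()
    # Intent detection: the first pattern that matches wins.
    intent = "unknown"
    for name, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            intent = name
            break
    entities = {}
    # Entity extraction: look for a known genre word.
    for genre in GENRES:
        if re.search(rf"\b{genre}\b", text):
            entities["genre"] = genre
            break
    # Entity extraction: capitalized words after "starring" are taken as a name.
    actor = re.search(r"\bstarring ([A-Z][a-z]+(?: [A-Z][a-z]+)*)", utterance)
    if actor:
        entities["actor"] = actor.group(1)
    return {"intent": intent, "entities": entities}
```

For the command above, the function reports a `search` intent with a genre of `action` and an actor of `Vin Diesel`. Such a structured result is what a streaming platform would then pass to its catalog search or playback controls.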

4.4 Voice-over technologies

Voice-over technologies are the tools and techniques used to record and synchronize spoken narration or dialog with visual content, such as videos, films, animations, or presentations. In voice-over, a voice actor or narrator provides an audio track that complements or boosts the visual elements of the content. These technologies facilitate the recording, editing, and mixing of voice recordings to integrate the spoken words seamlessly with the visual media. They often involve professional recording studios, high-quality microphones, audio editing software, and audio mixing equipment to achieve optimal sound quality and clarity. Furthermore, they provide flexibility and versatility in adapting content for different audiences, languages, and cultural contexts. They play an important role in delivering effective communication, enhancing storytelling, and creating immersive audiovisual experiences.

4.5 Language generation and content creation

Language generation and content creation refer to the process of using artificial intelligence and natural language processing techniques to generate human-like text and create various forms of content [26]. This includes generating written articles, blog posts, product descriptions, social media posts, and other forms of textual content using algorithms and machine-learning models. These technologies are designed to understand human language, mimic human writing styles, and produce coherent and relevant content for applications such as marketing, advertising, and content production. In the video and television industry, they streamline content production processes, allowing engaging and informative video content to be created more efficiently.

4.6 Multimodal interactions

Multimodal interactions refer to communication that involves multiple modes of input and output, such as speech, gestures, facial expressions, touch, and visuals [27]. They allow users to engage with technology or systems using various modalities simultaneously or interchangeably. For example, a multimodal interaction system may allow users to give commands through voice while also using gestures or touch to navigate or manipulate content. This approach improves the naturalness and effectiveness of human-computer interactions, as it leverages multiple channels of communication to provide a more comprehensive and intuitive user experience. It also lets users communicate with technology in a holistic and natural manner, improving the user experience and making technology accessible to a wider audience, especially in the video and television industry.

The use of natural language in video content and AI interactions improves human-technology interaction by enabling more intuitive, conversational, and personalized experiences. As AI continues to advance, we can expect even more sophisticated language models and systems that integrate seamlessly into our daily lives and enrich our interactions with technology.

5. Language inclusive content

Language is far from being impartial or isolated; it stands as a potent instrument at our disposal, encompassing personal presumptions, societal standards, and cultural ideologies [28]. Consequently, it is crucial to approach language with a critical mindset and remain vigilant for any biases embedded within its usage. Language both mirrors the realities of the world it operates in and actively contributes to upholding or reshaping that world. Modifying our language to incorporate more inclusive terminology provides us with an opportunity to develop and evolve as communicators who genuinely value and consider the individuals we interact with [29]. In response, modern technology is striving to improve language inclusivity, particularly in content usage.

The increased use of language-inclusive television and video content recognizes the importance of catering to diverse audience needs, including variations in dialects, accents, and cultural backgrounds. By creating content that is more language-inclusive, the television and video industry can better engage with audiences and foster a sense of representation and inclusivity. Such content seeks to ensure that viewers from different linguistic communities and language proficiency levels can access, understand, and relate to it. It also takes into account cultural nuances, customs, and references, so it avoids relying solely on a single cultural perspective and incorporates diverse cultural elements, stories, and themes. Some approaches also provide audio description services for visually impaired viewers, sign language interpretation for certain programming, and user-friendly interfaces to accommodate individuals with different accessibility needs. The following subsections elaborate on these aspects.

5.1 Dialect and accent representation

Television and video content can incorporate a variety of dialects and accents to reflect the linguistic diversity within a particular language. By featuring characters or presenters with different dialects or accents, content creators can authentically represent various regional or cultural identities. This helps viewers feel represented and acknowledges the linguistic diversity within a language. It can also contribute to a richer storytelling experience and add authenticity to characters. Furthermore, exposure to varied speech supports language learning and helps viewers appreciate different speech patterns. As a result, it can help break down stereotypes and misconceptions associated with particular dialects.

5.2 Cultural references and context

Language-inclusive content goes beyond linguistic variations and encompasses cultural references and context. It acknowledges that languages are deeply intertwined with cultural nuances, customs, and traditions. Television shows and videos can incorporate cultural references, celebrations, and specific cultural practices. By incorporating cultural elements, such as traditional clothing, music, festivals, or rituals, they can create a sense of familiarity and make the content feel more coherent to the communities it portrays. This also has the potential to foster a sense of belonging and representation for viewers from various cultural backgrounds.

5.3 Subtitling and captioning options

Providing subtitles and captions in different languages or dialects enables viewers who are not proficient in the spoken language to fully understand the content. This feature is particularly valuable for international films, series, documentaries, or TV shows where the use of different languages is integral to the narrative. Subtitles are the transcription or translation of movie or television dialog presented simultaneously at the bottom of the screen. By offering subtitles in multiple languages, content creators ensure accessibility and inclusivity for a worldwide audience. On the other hand, captions provide text descriptions of all the audio elements in a video, including dialog, sound effects, and music, which is particularly helpful for viewers with hearing impairments.
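The distinction between subtitles and captions can be made concrete with a short sketch that writes timed text in the widely used SubRip (SRT) layout: a sequence number, a time range in the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and the text. The `Cue` class, the `is_speech` flag, and the bracket convention for non-speech sounds are assumptions made for illustration, not a description of any particular platform’s pipeline.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float            # seconds from the start of the video
    end: float
    text: str
    is_speech: bool = True  # False for sound effects or music


def timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues, captions):
    """Render cues as SRT text.

    Subtitles (captions=False) keep dialog only; captions (captions=True)
    also keep non-speech audio, shown here in square brackets.
    """
    kept = [cue for cue in cues if captions or cue.is_speech]
    blocks = []
    for index, cue in enumerate(kept, start=1):
        text = cue.text if cue.is_speech else f"[{cue.text}]"
        blocks.append(
            f"{index}\n{timestamp(cue.start)} --> {timestamp(cue.end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"
```

Given one spoken line and one sound effect, `to_srt(cues, captions=False)` emits only the dialog, whereas `to_srt(cues, captions=True)` also emits the sound effect as a bracketed cue. Offering subtitles in another language would then amount to translating the `text` of the speech cues while keeping the timings.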

5.4 Representation of underrepresented languages

Representation of underrepresented languages refers to the inclusion and portrayal of languages that are not widely spoken or have limited visibility in mainstream media. It involves showcasing the linguistic diversity and cultural richness of communities that speak these underrepresented languages. In the context of television and video content, this representation can involve featuring characters, dialog, or subtitles in these languages, thus giving visibility and recognition to these linguistic communities. It promotes inclusivity and helps individuals who speak underrepresented languages see themselves and their cultures reflected in the media they consume.

5.5 Language learning and educational content

Television and video content can incorporate language-learning elements by providing language lessons, vocabulary explanations, or cultural insights within the programming. This appeals to viewers who are interested in language learning or exploring different languages and encourages language education and appreciation. Language lessons can be integrated into the storyline, where characters engage in conversations or interactions that showcase language usage and learning [30]. Vocabulary explanations can be provided through on-screen graphics or pop-up captions, helping viewers expand their language skills. Additionally, cultural insights can be shared to provide context and deepen understanding of the language and its associated customs and traditions.

5.6 Audience participation and engagement

Language-inclusive content can actively involve the audience by encouraging participation and engagement through user-generated content or interactive features. This includes inviting viewers to contribute their own dialects, accents, or cultural stories, fostering a sense of belonging and shared experiences among diverse audiences. It can be achieved through different means, such as interactive features, social media platforms, live events, contests, and surveys. One example of audience participation is interactive television shows or game shows where viewers can vote, answer questions, or make decisions that influence the outcome of the program. In addition, television networks and content creators often encourage viewers to use hashtags or participate in discussions online, creating a community around the content.

By adopting language-inclusive television and video content, the industry can create a more inclusive and representative media landscape. It acknowledges the diversity of languages, dialects, accents, and cultural backgrounds and ensures that viewers from different linguistic communities feel seen, heard, and included. Ultimately, language-inclusive content fosters a stronger connection between the audience and the content, leading to increased engagement and viewer satisfaction.

6. Artificial intelligence in language learning

Artificial intelligence (AI) encompasses machines that imitate the actions and intelligence of living beings. It is a multidisciplinary field of study and research that seeks to comprehend the workings of the human mind and apply those principles to the design of technology. The field that handles the representation of linguistic knowledge in AI is sometimes known as computational linguistics [31]. Within language learning and teaching, AI can replicate the actions and characteristics of both educators and students. Language learners can acquire a language by repeatedly practicing phrase patterns that demonstrate the connections between words. An example of AI employed for language learning is a learner engaging in English conversations with Siri, the voice assistant built into Apple’s iOS operating system. More precisely, such practice can help learners improve their oral communication abilities, with a specific focus on pronunciation and listening [32].

The increased use of AI technology in language learning has changed the way users acquire new languages, and it has significant implications for the video and television industry. It enables the creation of language-specific content, supports subtitles and translations, facilitates personalized content recommendations for language learners, provides analysis of pronunciation and fluency, promotes language learning through authentic video content, and offers language-learning support through virtual assistants. The following subsections describe how it connects with the industry.

6.1 Language-learning video content

The video and television industry can create dedicated language-learning video content that uses AI technology to improve the learning experience. These videos can provide language instruction, cultural insights, vocabulary lessons, grammar explanations, conversation practice, pronunciation guidance, and interactive exercises to engage learners [33]. AI algorithms can personalize the content based on learners’ proficiency levels and track their progress, supporting an effective and tailored language-learning experience. The content may be presented in a structured format, following a curriculum or language proficiency levels, or it can be tailored to specific language-learning goals or interests. These videos often incorporate visual aids, such as subtitles, graphics, and animations, to improve comprehension and make learning more enjoyable. They may also include interactive elements, quizzes, and exercises to test and reinforce understanding. This format caters to individuals who prefer visual and auditory learning methods, allowing them to practice listening, speaking, and reading skills in a multimedia format. With the rise of online platforms and streaming services, learners can access a wide range of language courses, tutorials, and resources in various languages, making it convenient to learn at their own pace and according to their specific needs.
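Personalizing content by proficiency level and tracked progress can be sketched as a simple rule. The example assumes the CEFR level labels (A1 to C2), a small hypothetical video catalog (`CATALOG`), and an arbitrary promotion threshold of 0.8 on recent quiz scores; an AI-driven platform would estimate proficiency with far richer models than this averaging rule.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]  # CEFR proficiency labels

# Hypothetical catalog of language-learning videos.
CATALOG = [
    {"title": "Greetings and introductions", "level": "A1"},
    {"title": "Ordering food", "level": "A2"},
    {"title": "Talking about past events", "level": "B1"},
    {"title": "Debating current affairs", "level": "B2"},
]


def next_level(current, recent_scores, promote_at=0.8):
    """Move up one level when the average recent quiz score reaches the threshold."""
    if not recent_scores:
        return current  # no progress data yet, so stay at the current level
    average = sum(recent_scores) / len(recent_scores)
    index = LEVELS.index(current)
    if average >= promote_at and index < len(LEVELS) - 1:
        return LEVELS[index + 1]
    return current


def recommend(current, recent_scores):
    """List the catalog titles that match the learner's updated level."""
    level = next_level(current, recent_scores)
    return [video["title"] for video in CATALOG if video["level"] == level]
```

A learner at A1 whose recent quiz scores average above the threshold is offered A2 material, while a learner with lower scores keeps receiving A1 videos. The threshold and the one-step promotion are design choices made only to keep the sketch small.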

6.2 Subtitling and captioning

Subtitling and captioning are processes of adding text to video content to provide viewers with a written representation of the spoken words or audio elements [34]. Subtitling is primarily used for translating dialog or narration from one language to another, allowing viewers who do not understand the original language to follow along. Subtitles are typically displayed at the bottom of the screen and provide a written version of the spoken content. Captioning, by contrast, provides a textual representation of all the audio elements in a video, including spoken dialog, sound effects, and other relevant audio information. Captions are beneficial for individuals who are deaf or hard of hearing, as they enable them to read and understand the audio content. AI-powered language translation and natural language processing can improve the accuracy and efficiency of subtitling and captioning services. Television shows and video content can leverage AI technology to generate subtitles in multiple languages, enabling language learners to watch their favorite programs with subtitles in their target language. This supports their language comprehension and listening skills.

6.3 Multilingual voice-overs and dubbing

Multilingual voice-overs and dubbing refer to the process of replacing the original audio of a video with a translated version in another language. In voice-overs, a narrator or voice actor provides a translated voice track that is synchronized with the original video [35]. Dubbing, on the other hand, involves re-recording the dialog of the characters in the target language while keeping the original video intact. Both methods allow viewers who are not familiar with the original language of the content to understand and enjoy it in their preferred language. AI technology can facilitate the process of multilingual voice-overs and dubbing for television and video content. Language-learning applications can utilize AI-generated voices to provide accurate pronunciation models for learners. Additionally, AI-driven voice modification can allow learners to practice speaking in their target language by imitating native speakers, helping them develop more authentic accents and intonation patterns.

6.4 Interactive language-learning experiences

Interactive language-learning experiences involve the use of technology and multimedia to provide engaging and immersive language-learning activities [36]. These experiences aim to facilitate language acquisition by incorporating interactive elements such as quizzes, games, interactive exercises, and real-life scenarios. Learners can actively participate in the learning process, practice their language skills, receive immediate feedback, and track their progress. Interactive language-learning experiences promote active engagement, motivation, and effective language-learning outcomes. Television and video content can incorporate interactive language-learning experiences that leverage AI technology. For example, interactive quizzes, virtual language tutors, or chatbots can be integrated into educational programs or language-focused shows, providing viewers with opportunities to actively engage in language learning while watching video content.

6.5 Language accessibility and inclusivity

Language accessibility and inclusivity relate to the efforts made to ensure that language is not a barrier for individuals to access and engage with content or participate in communication [37]. Language accessibility can be achieved through various means, such as providing subtitles or closed captions in different languages, offering audio descriptions for visually impaired individuals, using sign language interpreters, and providing translated materials or multilingual options. The goal is to create an inclusive environment where people of all linguistic abilities can fully participate, understand, and benefit from various forms of communication, including television and video content. AI-powered language-learning solutions can support these efforts, which fosters inclusivity and expands the reach of video and television programming to diverse language communities.

6.6 Enhanced language-related content creation

Enhanced language-related content creation refers to the process of developing and producing content that specifically focuses on language-related aspects. This includes creating content that incorporates language-learning elements, cultural insights, and linguistic diversity. AI technology can assist content creators in developing language-related content for television and video platforms. AI-driven language analysis can provide insights into language trends, audience preferences, and language-learning needs, helping content creators make informed decisions about the creation of language-focused programming. Such programming can include language lessons, vocabulary explanations, cultural immersion experiences, or interactive activities that promote language learning and engagement.

The integration of AI technology and language learning in the video and television industry offers opportunities to create more engaging, interactive, and inclusive language-learning experiences. It allows the industry to cater to language learners’ needs, facilitate language acquisition, and expand the reach of educational content to a wider audience. One significant advantage is the ability to cater to learners’ individual needs. AI-powered platforms can analyze learners’ proficiency levels, learning styles, and progress and provide personalized recommendations and content. This personalized approach improves the effectiveness and efficiency of language learning, allowing learners to focus on areas where they need improvement and progress at their own pace.

7. Conclusion

The future of language in the television and video industry is dynamic and promising, driven by technological advancements and evolving audience expectations. The identified trends provide a glimpse into what lies ahead and the opportunities that can shape the industry’s language landscape. The increased use of multilingualism signifies the industry’s recognition of global audiences and the importance of catering to diverse linguistic needs. This trend opens up avenues for content creators to expand their reach to viewers from various language backgrounds. The use of multilingualism encourages content creators to tap into new markets and expand their global presence. By providing content in different languages, they can effectively target and engage audiences from various regions and cultures. This not only increases their viewership but also fosters cultural exchange and appreciation. Multilingualism also adds richness and authenticity to storytelling. It allows for the portrayal of diverse characters and settings, reflecting the linguistic diversity found in real-life communities. By featuring characters who speak different languages or incorporating multilingual dialog, content creators can create a more realistic experience for viewers.

Meanwhile, automatic translation, powered by neural machine translation and complemented by human contribution, is revolutionizing language accessibility in television and video content. It offers enhanced accuracy and faster translation capabilities, enabling seamless communication across languages. Neural machine translation has surpassed traditional rule-based approaches by learning patterns and structures from vast amounts of multilingual data, which enables it to generate more natural and contextually accurate translations. Viewers can access and understand content in languages that were previously inaccessible to them, broadening their horizons and promoting cross-cultural understanding. Moreover, the speed at which automatic translation operates significantly reduces turnaround times, allowing content creators to reach global audiences more quickly. The integration of natural language, which closely resembles human speech, brings a more intuitive and authentic interaction between viewers and technology. This trend paves the way for immersive experiences that foster deeper engagement and connection. By incorporating Natural Language Processing (NLP) and artificial intelligence (AI) technologies, television and video platforms can understand and interpret human language in a more sophisticated manner. This allows viewers to interact with the content using voice commands, natural conversations, or text inputs, mimicking the way they would communicate with another person. Natural language processing analyzes the context, meaning, and intent behind the user’s language, enabling a fluid interaction.

Furthermore, language inclusivity emerges as a significant consideration, encompassing dialects, accents, and cultural variations. By incorporating diverse linguistic expressions, television and video content can better represent and resonate with audiences, fostering inclusivity and cultural appreciation. When viewers see and hear their own language, it creates a sense of belonging and connection. AI technology also plays a pivotal role in language learning. It empowers the creation of effective and affordable language-learning applications and platforms, providing users with personalized and interactive experiences to improve their language skills. It can adapt to the user’s proficiency level, learning style, and specific goals. AI algorithms provide real-time feedback and adaptive recommendations, allowing learners to track their performance and focus on areas that require improvement. Such systems support multimedia learning of listening, speaking, reading, and writing skills in a dynamic way. Ultimately, the future of language in the television and video industry relies on understanding and responding to the diverse needs and preferences of audiences. By welcoming these trends and opportunities, the industry can create language-rich content that enriches the viewing experience and connects people across linguistic boundaries.

References

  1. Gimpel G. The future of video platforms: Key questions shaping the TV and video industry. International Journal on Media Management. 2015;17(1):25-46. DOI: 10.1080/14241277.2015.1014039
  2. Hartsell T, Yuen SC. Video streaming in online learning. European Financial Services Law. 2019;14:612-613. DOI: 10.5771/9783845279893-612-1
  3. de Brito RF, Pereira ATC. A model to support sign language content development for digital television. In: 2009 IEEE International Workshop on Multimedia Signal Processing. Rio de Janeiro, Brazil; 2009. pp. 1-6. DOI: 10.1109/MMSP.2009.5293266
  4. da Rosa E. Linguistic variation in English. Revista de Letras. 2017;19(25):35-50, [Online]. Available from: https://onlinelibrary.wiley.com/doi/epdf/10.1002/pits.20214
  5. Lyu MR, Song J, Cai M. A comprehensive method for multilingual video text detection, localization, and extraction. IEEE Transactions on Circuits and Systems for Video Technology. 2005;15(2):243-255. DOI: 10.1109/TCSVT.2004.841653
  6. Zhang S, Peng H, Fu J, Lu Y, Luo J. Multi-scale 2D temporal adjacency networks for moment localization with natural language. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022;44(12):9073-9087. DOI: 10.1109/TPAMI.2021.3120745
  7. Kai M. AI and Machine Learning in Language Education. 2019. [Online]. Available from: https://www.academia.edu/download/61970107/AI-and-Machine-Learning-in-Language-Education20200202-61217-w1qkn4.pdf#page=49
  8. Janah KEN. Reviewing the use of television shows for learning and teaching of English language and literature: Potentials and limitations. In: English Language and Literature International Conference (ELLiC) Proceedings. Vol. 4. 2021. pp. 76-83
  9. Kelly-Holmes H. Multilingualism and technology: A review of developments in digital communication from monolingualism to idiolingualism. Annual Review of Applied Linguistics. 2019;39:24-39. DOI: 10.1017/S0267190519000102
  10. Chaud MR, Di Felippo A. Exploring content selection strategies for multilingual multi-document summarization based on the universal network language (UNL). Revista de Estudos da Linguagem. 2017;26(1):45. DOI: 10.17851/2237-2083.26.1.45-71
  11. Danan M. Captioning and subtitling: Undervalued language learning strategies. Meta. 2004;49(1):67-77. DOI: 10.7202/009021ar
  12. Lu D. Dubbing with deepfakes. New Scientist. 2019;244:8. DOI: 10.1016/S0262-4079(19)31987-6
  13. Nandasara ST, Mikami Y, Mohideen AIC, Tharangie KGD. Automated language translation: Opportunities and impact on the society. International Journal of Computers and Applications. 2019;178(34):43-50. DOI: 10.5120/ijca2019919232
  14. Tan Z et al. Neural machine translation: A review of methods, resources, and tools. AI Open. 2020;1:5-21. DOI: 10.1016/j.aiopen.2020.11.001
  15. Waibel A et al. Face-dubbing++: LIP-synchronous, voice preserving translation of videos. In: 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). Rhodes Island, Greece; 2023. pp. 1-5. DOI: 10.1109/ICASSPW59220.2023.10193719
  16. Felten E, Raj M, Seamans R. How Will Language Modelers like ChatGPT Affect Occupations and Industries? 2023. DOI: 10.48550/arXiv.2303.01157
  17. Liu J, Gao G, Ku X, Cheng J. Real-time caption detection and localization in TV programs via stroke width transform and morphological transform. In: The Fourth International Conference on Informatics & Applications (ICIA2015). Jul 2015. pp. 130-135
  18. Ponti EM et al. Modeling language variation and universals: A survey on typological linguistics for natural language processing. Computational Linguistics. 2019;45(3):559-601. DOI: 10.1162/coli_a_00357
  19. Indurkhya N, Damerau FJ. Classical Approaches to Natural Language Processing. Cambridge UK: Microsoft Research Ltd.; 2010
  20. Verspoor K, Cohen KB. Natural language processing. In: Encyclopedia of Systems Biology. New York, NY: Springer; 2013. pp. 1495-1498. DOI: 10.1007/978-1-4419-9863-7_158
  21. Lesia Viktorivna K, Andrii Oleksandrovych V, Iryna Oleksandrivna K, Nadia Oleksandrivna K. Artificial intelligence in language learning: What are we afraid of. Arab World English Journal. 2022;8:262-273. DOI: 10.24093/awej/call8.18
  22. Sundari SP et al. Artificial Intelligence and Voice Assistants. India: Royal Book Publishing; 2022. DOI: 10.26524/royal.109
  23. Mekni M. An artificial intelligence based virtual assistant using conversational agents. Journal of Software Engineering and Applications. 2021;14(9):455-473. DOI: 10.4236/jsea.2021.149027
  24. Roukos S. Natural Language Understanding. Springer Handbooks. Berlin, Heidelberg: Springer; 2008. pp. 617-626. DOI: 10.1007/978-3-540-49127-9_31
  25. Wang Z et al. Language models with image descriptors are strong few-shot video-language learners. In: 36th Conference on Neural Information Processing Systems 35 (NeurIPS). 2022. pp. 8483-8497
  26. Li W et al. Faithfulness in natural language generation: A systematic survey of analysis, evaluation and optimization methods. Ver 1. DOI: 10.48550/arXiv.2203.05227
  27. Chen Y, Wei F, Sun X, Wu Z, Lin S. A simple multi-modality transfer learning baseline for sign language translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022. pp. 5120-5130
  28. Taheri P. Using inclusive language in the applied-science academic environments. Journal of Technology and Social Science. 2020;9(July):151-162. DOI: 10.47577/tssj.v9i1.1082
  29. Ackah-Jnr FR, Appiah J, Kwao A. Inclusive language as a pedagogical and motivational tool in early childhood settings: Some observations. Open Journal of Social Sciences. 2020;8(09):176-184. DOI: 10.4236/jss.2020.89012
  30. Schmidt T, Strassner T. Artificial intelligence in foreign language learning and teaching. Anglistik. 2022;33(1):165-184. DOI: 10.33675/angl/2022/1/14
  31. Dodigovic M. Artificial intelligence and second language learning: An efficient approach to error remediation. Language Awareness. 2007;16(2):99-113. DOI: 10.2167/la416.0
  32. Ali Z. Artificial intelligence (AI): A review of its uses in language teaching and learning. IOP Conference Series: Materials Science and Engineering. 2020;769(1):012043. DOI: 10.1088/1757-899X/769/1/012043
  33. Jayanthiladevi A, Raj AG, Narmadha R, Chandran S, Shaju S, Krishna Prasad K. AI in video analysis, production and streaming delivery. Journal of Physics Conference Series. 2020;1712(1):012014. DOI: 10.1088/1742-6596/1712/1/012014
  34. Alonzo O, Shin HV, Li D. Beyond subtitles: Captioning and visualizing non-speech sounds to improve accessibility of user-generated videos. In: Proceedings of the 24th International ACM SIGACCESS Conference on Computers and Accessibility for Computing Machinery, New York, NY, USA. 2022. pp. 1-12. Article 26. DOI: 10.1145/3517428.3544808
  35. Bhat AR, Abhishek SV, Acharya A, Amruth PS. Voice over internet protocol (VoIP) - a review. International Journal of Innovative Science and Research Technology. 2022;7(8):850-853
  36. 36. Omar SF, Nawi HSA, Shahdan TST, Mee RWM, Pek LS, Yob FSC. Interactive language learning activities for learners’ communicative ability. International Journal of Evaluation and Research in Education. 2020;9(4):1010-1016. DOI: 10.11591/ijere.v9i4.20605
  37. 37. Khanuja S, Ruder S, Talukdar P. Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages. In Findings of the Association for Computational Linguistics: EACL. Dubrovnik, Croatia: Association for Computational Linguistics; 2023. pp. 1763-1777
