Most current research on the information analysis of social media (SM) for public emergencies has focused on a single dimension, such as emotion, while neglecting the interaction between multidimensional information. Therefore, in this study, an information dispersing–superimposing model is proposed to explain the implicit regularity of the mutual impact among symbol, sentiment, and context information and their dependent evolution on SM. Information hue, saturation, and flux (HSF) are defined to measure the interaction process. An online event was selected to verify the concept and hypothesis of this study. The results showed that interaction among multidimensional information did exist on SM for a public emergency. The turning points of information dispersing–superimposing often emerged when the number of online users involved changed significantly, and sentiment and context information were shown to interact strongly and tended to spread at the same time. The dominant information component also varied at each stage of the emergency. This paper is one of the first to study the interaction of multidimensional information on SM by drawing on optical scattering. The findings offer a theoretical explanation for why certain information components may be enhanced during online dissemination and suggest practical support for information prediction and interface design for SM.
- multidimensional information interaction
- social media
- HSF information space
- public emergency
- optical scattering
Social media (SM) has provided an effective channel for the diffusion of multidimensional information of public emergencies (An, Zhou, Ou, Li, Yu, & Wang, 2021), which may amplify the impact of an emergency. In previous studies, the evolution paths of information on the public emergency were viewed from a decentralized perspective for every single component such as emotional or network information (Kim, Bae, & Hastak, 2018). In fact, multidimensional information in public emergencies is mixed, and the mixing of specific dimensional information may produce differentiated diffusion and infection capabilities, thus affecting the trend of events. Therefore, it is necessary to grasp the interaction and diffusion mechanism among these information components from the perspective of integration. This will help to explain the formation and dissemination rules of emergency information and provide insights for emergency response.
This study reexamines the composition of information in online public emergencies, and the ways its components interact, through the optical concept of dispersing–superimposing. In the dispersion model of light, polychromatic light is decomposed into three monochromatic lights: red, green, and blue (Granier & Heidrich, 2003). All other colors can be produced by mixing these three monochromatic lights. Multidimensional information interaction is assumed to follow a similar dispersing–superimposing pattern. The information finally presented online can be regarded as the superimposition of three information components (symbol, sentiment, and context), which act as three monochromatic information dimensions.
These three information components are supposed to interact dependently with each other. It is hypothesized that, during multidimensional information interaction, some information components will restrict the spread of others, and that the relationship among the three components will change under the interference of stimuli (Khan, 2017). This is assumed to cause the final presentation of information content on SM to vary across spread stages. For example, after an unexpected policy is introduced, the online dissatisfaction of the public will overlap the content information of the policy itself, causing the sentiment information component to be stronger in intensity than the other dimensions. Therefore, in this study, we aim to explain, from an information composition perspective, why one item of information shows varied presentations, as light does in the dispersion model, and to investigate the interaction between multidimensional information through information analysis methods. Furthermore, one purpose of this study is to verify whether a common regularity exists among previous research findings on content, emotion, and context for SM. If so, an earlier conclusion about online emotional tendency, for example, would not be an independent feature in itself but would also indicate changes in the content information. In this case, a better understanding of how multidimensional information interacts on new media could offer a way to predict certain information components from a given information dimension.
On the whole, this research differs from previous studies on information analysis for SM in two respects. First, unlike former studies that focused on the features of a single kind of information, this study treats information on SM, especially in SM events, as a combination of multidimensional information components. Inspired by light dispersion theory, the fusion of information can be reflected by the spread intensity and breadth of information components at different key temporal points and spatial coordinates. The varied presentation of information is also assumed to result from different ratios among these information components, as with light. Second, this study tries to connect various research findings on information analysis for SM. It argues that the interaction of multidimensional information components leads to dependent changes in the results of online content analysis, and it emphasizes this interaction and its changing trend. This argument helps to combine the currently separate lines of research on content, emotion, and context on SM and contributes to understanding the essence of online information composition. Practically, this study also has implications for predicting information on SM, whether for information recommendation or marketing forecasting: the acquisition and analysis of target information components should be carried out at a more suitable propagation node.
More specifically, the research questions needing to be evaluated in this study are as follows.
Research question 1: according to the theory of optical dispersion, are three information components (symbol, sentiment, and context) dependently transmitted in a similar “dispersing–superimposing” way on the SM? When will the dispersing state likely emerge?
Research question 2: how do these information components interact with each other during the dispersing–superimposing process? What are the features of this interaction?
To model the process of SM information dispersing–superimposing during a public emergency, two steps are considered. First, the three information components and their features are defined and explained. Then variables are proposed to describe and assess the interaction process among these components.
Section 2 reviews the related studies on content, emotion, and context analysis on SM. Section 3 explains the details of the information dispersing–superimposing model, including the features of the information components and the evaluation variables. In Section 4, an event on a micro-blog platform is selected and traced to test our hypothesis of the dispersing–superimposing model; the communication modes of each component and the interaction between them are both verified in detail. Section 5 then discusses, based on the significant results, how information components separate and fuse. Finally, the conclusion presents the theoretical and practical implications of the information dispersing–superimposing model and outlines further research plans.
SM consists of tools that enable an open and online exchange of information through conversation, interaction, and exchange of user-generated content (Howe, 2006), which presents a greater potential in providing information in emergencies due to its instant nature. Crane and Sornette (2008) divided the SM information into endogenous and exogenous based on its causes. Endogenous causes refer to phenomena in which an idea or “meme” gains popularity by a process of viral contagion or information cascade, while exogenous causes refer to large-scale events, usually happening in the physical world. Skinner (2013) listed some information types on SM including updates regarding the writers’ status, links to news sites, and emotional messages to those affected by the event, as well as humoristic messages. In recent years, researchers have mostly extracted the topic, emotion, user attributes, and location information of public emergencies by machine learning methods based on SM big data (Pourebrahim, Sultana, Edwards, Gochanour, & Mohanty, 2019; Kim et al., 2018; Cvetojevic & Hochmair, 2018).
SM information analysis has received wide attention in the context of public emergencies. Relevant research can be divided into two types according to its purpose: extracting the trend of public emergencies, and providing behavioral help for organization and management departments. In terms of extracting public emergency trends, most studies focused on public attitudes and online behavior toward the events. An et al. (2018) analyzed the diffusion of topic and emotion information from the perspective of the network during the emergency, and they discovered that the stakeholder types with high topical influence usually have low sentiment influence and vice versa. Bica, Palen, and Bopp (2017) analyzed the Twitter pictures of the two earthquakes in Nepal, and they found that the popularity of pictures varied regionally and globally, with local people emphasizing casualties and heavy losses, while international users emphasized recovery and relief work. In terms of providing behavioral help, Fan, Jiang, and Mostafavi (2020) combined text, picture, and location information in SM to monitor building damage in emergencies. Alam, Ofli, and Imran (2020) used natural language processing and computer vision methods to extract the supplementary information generated during disasters, which was proposed to assist humanitarian organizations in their relief work.
At present, the analysis of text topics on SM platforms shows a trend of diversification. Two research purposes were mainly involved: identification and tracking of topic content as well as making inference from topic content.
The identification and tracking of topic content referred to discovering specific and associated topics on the SM platform. Cumbraos-Sánchez et al. (2019) applied feature words of health information onto Twitter by tracking antimicrobial topics. Lee, Lee, and Choi (2020) tracked information posted by politicians in the SM politics arena to study user’s reactions to blog posts and the spread of political topics. Pourebrahim et al. (2019) proposed a method for identifying emergency blog content based on the annotations of storm blogs by experts. Srijith, Hepple, Bontcheva, and Preotiuc-Pietro (2017) suggested the concept of sub-story based on multiple stages during the propagation of accidents to identify the multiple sub-events from a major one. Although the involved information fields and contents were varied, the common essential work was to monitor and track the features and content for a specific information topic by applying linguistic analysis to the text.
The other research goal of topic analysis on the SM is reasoning from the text for mining the implicit information behind the topic. Two kinds of information were often inferred.
First, topic analysis could contribute to user-related information reasoning, including the role of the user, user behavior, and the user’s cognitive information. Information on the role of the user was often extracted from topic analysis to identify specific users and their relationships. For example, a method was proposed for identifying witnesses in emergencies based on tracking the linguistic features of blog information for an incident on Twitter (Zahra, Imran, & Ostermann, 2020). User behavior on new media could be inferred from topic analysis to show how users react to each other and which interactive behaviors they exhibit. Collective sense-making in times of crisis was proposed to study users’ information behavior in the context of terrorist attacks through the thematic analysis of related tweets (Fischer-Preßler, Schwemmer, & Fischbach, 2020). The user’s cognitive information often referred to the user’s feelings and personal information derived from new media. This kind of information was similar to, but different from, emotional information. Such research did not focus on analyzing emotions or how emotions affected the user’s behavior, which was the goal of emotion analysis. Instead, it took the subjective information of users as an object and studied its relevance characteristics. For example, Yen, Huang, and Chen (2019) built up a user’s knowledge base relying on their daily tweets. Other authors focused on users’ loneliness on Twitter and discussed the association features of tweets containing the keyword “loneliness” (Mahoney et al., 2019). Besides, a study was presented to detect tweets related to depression and their spatial-temporal patterns at the scale of the Metropolitan Statistical Area. It found that the relationship between depression rates, climate risk factors, and seasonality was varied and geographically localized (Yang, Mu, & Shen, 2015).
Second, topic analysis also served environmental information reasoning on SM. This type of research covered two kinds of environmental information: real environmental information analysis and information analysis for virtual communities. Real environmental information analysis aimed to obtain real-world environmental information from tweets. Khodabandeh, Fatemi, and Tabatabaee Malazi (2019) conducted a study based on users’ blog posts on SM to infer the location of a traffic accident for which the explicit geographic data was lost. In terms of information analysis for virtual communities, a study on Adverse Drug Reaction (ADR) was proposed based on information posted by users on “Askapatents.com” (Li, Lee, & Yang, 2019). This research suggested a system for real-time monitoring of ADR based on the mining and inference of information released by relevant communities. Another study, focused on vulnerable communities, investigated the application of hate speech detection to vulnerable community identification on SM (Mossie & Wang, 2020).
Typical topic analysis approaches on new media included word segmentation, topic modeling, and similarity calculation (Imran, Castillo, Diaz, & Vieweg, 2015). For example, Lei, Zhang, Zhang, and Xu (2015) extracted new words from Sina’s blog posts and comments based on users’ data clustering. In the topic modeling process, Lim, Karunasekera, and Harwood (2017) built a cluster-based topic model (ClusTop) to identify topics on Twitter that automatically adapted to the number of topics without requiring parameter settings. Ostrowski (2015) built a topic recognition model based on the Latent Dirichlet Allocation (LDA) topic model for filtering Twitter messages. The main methods for the similarity calculation of information were Euclidean distance, cosine similarity, and Jaccard distance (Likavec, Lombardi, & Cena, 2019).
The studies on emotion analysis for SM concerned two aspects with respect to the research issues. One was the emotional study of public attitude. Kušen and Strembeck (2018) analyzed public sentiment on Twitter during the 2016 Austrian presidential election. Plunz et al. (2019) measured well-being in the urban green spaces of New York City by analyzing emotion data on Twitter. Corea (2016) applied Twitter data to analyze customers’ sentiment on stocks as an indicator of future stock market movement. Another study identified malicious fake reviews by analyzing the emotions of normal users in reviews and the emotions in fake reviews (Martinez-Torres & Toral, 2019).
Since sentiment analysis was partly derived from the linguistic analysis of text, there were several studies emphasizing on an integrated point of view to both consider the topic and emotion implications at the same time on the SM. Some of them focused on a mixed analysis of short text from online production comments between sentiment and topic (Xiong, Wang, Ji, & Wang, 2018). Liu, Liu, Gedeon, Zhao, Wei, and Yang (2019) revealed that the impact of user ranking on online purchase behavior relied on the emotion and content analysis of online comments. Meanwhile, a study on the factors for the sharing of information during a crisis was proposed. In that study, the author used emotional and topical information (called “relevance” in that study) as two factors affecting information sharing (Xu & Zhang, 2018). The fusion of subject analysis and sentiment analysis in the real-time monitoring of elections was proposed as well (Bansal & Srivastava, 2018). This research created a hybrid topic-based sentiment model to analyze the sentiment of each topic in order to monitor the election status.
Different sentiment analysis approaches have also been developed and applied, including dictionary-based analysis, machine learning-based analysis, deep learning-based analysis, and hybrid analysis (Agrawal & An, 2012). In addition to these commonly used methods, a few scholars have proposed novel sentiment analysis methods. Lee, Chen, and Huang (2010) demonstrated a sentiment analysis method based on grammatical rules. Their study identified seven groups of linguistic cues in Chinese and generalized two sets of linguistic rules for the detection of emotional causes. In terms of integrated analysis of topic and emotion, Mei, Ling, Wondra, Su, and Zhai (2007) suggested a Topic Sentiment Mixture model that could trace topic life cycles and sentiment dynamics. An interactive exploration method was also proposed to cluster data according to a specified dimension of topic or emotion (Dasgupta & Ng, 2009). The Joint Sentiment/Topic (JST) model was proposed to add a sentiment layer between the text and topic layers based on LDA (Lin & He, 2009). Differing from the JST model, Jo and Oh (2011) used an aspect and sentiment unification model to discover pairs of (aspect, sentiment), which they called senti-aspects. One important advantage of the aspect and sentiment unification model was that it did not require any sentiment labels for the reviews, which were often difficult to obtain from data.
All the existing studies of emotion models could be sorted into “Categorical” or “Dimensional” models (Calvo & Kim, 2013). “Categorical” emotion models categorized all the human emotions in several major classes (i.e., Anger, Disgust, Love, etc.); while “Dimensional” emotion models classified the emotions in details into multiple dimensions (i.e., valence, arousal, dominance, etc.) and intensities (i.e., basic, mild, intense, etc.).
“Context” has been illustrated by many scholars. Schilit, Adams, and Want (1994) described context through three aspects: user’s location, user’s companion, and nearby resources. Abowd suggested applying the classic “5W” model (who, says what, in which channel, to whom and with what effect) to illustrate the context (Abowd & Mynatt, 2000). Villegas summarized the situation according to the definition of previous scholars by individuality, time, location, activity, and relational (Villegas & Müller, 2010).
Two kinds of research interests often occurred in the existing context analysis for new media: user context and event context.
In terms of event context on the SM, Sandhu, Vinson, Mago, and Giabbanelli (2019) defined the context from the perspective of time and duration. They extracted the change of online public opinion onto the court before and after the incident. Cvetojevic and Hochmair (2018) demonstrated the spatial-temporal pattern of tweets through SM in the context of a terrorist attack. Rudat and Buder (2015) defined the context of an event as a source of informative value and agent awareness to study the impact of context on the user’s forwarded information. Veenstra, Iyer, Hossain, and Park (2014) defined the context of the event through time, place, and technology when studying Twitter as an information source in the Wisconsin labor protests.
As seen from the literature review, previous research analyzed the information on the SM from three dimensions: topic, emotion, and context. The purposes and methods of analysis were diversified. However, except for the intersection of topic and emotion, none of these studies investigated the interrelationship among multidimensional information, and the three dimensions were still considered separately. Although some studies emphasized the integration of topic and emotion, they only showed a parallel coordination of two information dimensions instead of revealing the mutual influence with each other. On the other hand, the three information dimensions have not been regarded as a whole in the existing literature. For example, when a text on the SM does not contain emotional or situational information, it will be treated as a general topic without considering that the emotional or situational information is only temporarily suppressed. Therefore, the existing research focuses were relatively scattered, rather than looking for the relationship between these different information dimensions from a deeper level to reflect the common implications by neglecting specific scenarios.
Therefore, our research will strengthen the integration degree of topic, emotion, and context information analysis for the SM. It is hypothesized that the information flowing is actually composite information composed of topic, emotion, and situation. The different elements of information we perceive and obtain are only due to different states at different dissemination points. And the different states are also assumed to depend on the interaction between the three information components, which is caused by the varied proportion among them. This hypothesis is inspired by the dispersion phenomenon in optics. Relying on the hypothesis, it is interesting to make further exploration on whether dispersing–superimposing phenomena also exist in multidimensional information on the SM from an integrated perspective and on the task of identifying the kinds of factors that are expected to cause different information dimensions to scatter and fuse. If this assumption is valid, it will be able to better address through information analysis the question of why the same information in the SM environment will produce different content effects and why different dimensions of the same information will be more easily perceived and acquired by users at some time. Meanwhile, the answers to these research issues will also seek the general rules and implicit relevance of research conclusions for current SM information analysis. From the perspective of practice, this study will propose the means to represent information at different key nodes of multidimensional information interaction, so as to regulate the proportion of different information components and to produce different information perception and acquisition results.
The analytical framework of the information dispersing–superimposing model is consistent with the dispersion model of light. The common dispersion model of light divided monochromatic lights into seven kinds, in which red, green, and blue form a basic color-coding system (Granier & Heidrich, 2003). By superimposing these three kinds of light in various proportions, it is possible to synthesize the light into other colors (seen in Figure 1). Similarly, when users receive the polychromatic information, these items of information are supposed to experience the following dispersing–superimposing process:
Optically, the polychromatic light is scattered through a prism and converted into three values of red, green, and blue relying on the RGB color-coding system. Likewise, users involved in the information transmitted via SM could also be regarded as a prism (as can be seen in Figure 1). When a user is more interested in a certain information dimension, he or she will pay more attention to this dimension and ignore others. For example, for a recently released movie preview on the SM, some users focus more on the story narrated in the movie, which will lead to the comments mainly on the topic; while some others could be more concerned about the emotional expression and the metaphor of movie, which refers to the sentiment information. Therefore, in the SM information dissemination, the user will strengthen the information dimension which he or she is more interested in and filter what he or she concerns less. The whole information could be dispersed into several information components.
Optically, different dimensions of light are superimposed to form a new composite color of light. Similarly, the three-dimensional information is finally superimposed and transmitted to other users under varied proportions among them. The information disseminated is the integration of three information components rather than a single dimension, in which we define that some of the information components are implicit only because they are restrained by other components under certain conditions.
Three information components are taken into consideration during the communication process: symbol information, sentiment information, and context information (corresponding to Red, Green, and Blue for the RGB color-coding system). Symbol information represents the nature of information items under a completely neutral and objective perspective. For example, the symbol information for an item of traveling information on the SM could be the location, transportation, and cost. Sentiment information refers to the feelings and emotions experienced from the information item. For example, if a user likes the place of this travel, he or she will describe this traveling information item by using a smiling emoji or an emotional adjective. It is regarded as a kind of sentiment information differing from symbol information by expressing subjective attitude. Context information is explained by the situation of the event as well as the characteristics of information disseminators. For example, if a user has been to the traveling place before, he or she is willing to share his or her experience when describing the traveling information on the SM. Later information readers could also receive the contextual information and determine whether to transmit it to other users or not.
The three information components on SM do not always spread with the same scope, intensity, or speed. In the beginning, the three are integrated. As the dissemination process moves forward, some components become more explicit than others. For example, suppose a user publishes an item of information talking about a new TV show on SM. If this information is commented on and shared by the fans of an actor who participated in the show, their sentiment will cover the original information about the show. In this case, the sentiment information component will be separated from the symbol component and will spread faster and with higher intensity. As another example, when a person who has suffered from an earthquake reads a piece of information announcing a recent earthquake disaster, he or she will comment on the information and encourage the victims by introducing his or her own context. In this situation, the negative context information will be restrained by the positive emotion of active rescue.
The similarity of an information component between two user nodes signifies whether, and to what extent, that information dimension is disseminated. Through this similarity, the trend of each information dimension can be made manifest. We call it the transmission value, which ranges from 0 to 1. The following sections define and illustrate how each information dimension and its transmission value are calculated.
The theme discovery model of LDA was used to assess the topic features of text (Towne, Rosé, & Herbsleb, 2016). According to the user’s comments, the theme vector
Theme weight vector of each user comment text.
Similarity calculation denotes the similarity of the topics between two items of information text. In this paper, the similarity between symbol information represented the transmission value of symbol information.
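For reference, the cosine similarity between two theme weight vectors (the measure that the paper later says is shared with Eq. (2)) can be written as below. This is the standard definition; the symbols $\theta_a$, $\theta_b$ (theme weight vectors of two comment texts) and $K$ (number of themes) are illustrative assumptions, not necessarily the paper's own notation.

$$
\mathrm{sim}(\theta_a, \theta_b) = \frac{\sum_{k=1}^{K} \theta_{a,k}\,\theta_{b,k}}{\sqrt{\sum_{k=1}^{K} \theta_{a,k}^{2}}\;\sqrt{\sum_{k=1}^{K} \theta_{b,k}^{2}}}
$$

Because LDA theme weights are non-negative, this value lies between 0 and 1, which is consistent with the stated range of the transmission value.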
The natural language processing module trained on a multi-scale Chinese corpus of Baidu Company was used as the analysis method of emotional information, and the accuracy of emotional orientation analysis of network texts reaches more than 95%. The module was based on machine learning-based naive Bayesian classification, in which emotions (
Context information could not be directly obtained from the information posted by users on the new media. The definition of the context information component in this study was illustrated by two aspects. The first aspect was the context of the user. This part included the basic information of the user who participated in the communication. The user-related context information involved in this study is shown in Table 1.
| Variable | Name | Description |
| --- | --- | --- |
| Fan | Fans | Number of fans |
| Fol | Follows | Number of follows |
| W | Weibo-blog | Number of blogs |
| Lv | User-level | Level of the user account |
Another aspect of context information concerned the features of event information itself. This part included users’ information behavior and the time when the information behavior occurred. The related variables are as follows (shown in Table 2).
| Variable | Name | Description |
| --- | --- | --- |
| T | Blog time | Time of blog posting (minute) |
| S | Forward and share | Number of forwards |
| Up | Thumbs up | Number of thumbs up |
| C | Comments | Number of comments |
What has to be emphasized is that we only chose the variables which were able to be obtained directly online to represent the context information of the user in this study. These variables characterized the context of information dissemination on the SM, and so far, contextual information has not been mined from the text. For example, when a travel enthusiast shares his or her tourist attraction online, the context considered in this study is the context of the event itself (such as the heat of the event) and the context of the user’s related information (such as the user’s account level). The characteristics of the scenic spots and other implicit information mentioned by the user in the blog post are undoubtedly also kinds of context information. However, in order to make the measurement method concise and distinguish the context information from symbol information, we only consider the context information revealed by the dissemination action in this study. In the later research, we could also consider the implicit context information by deeper analysis implying the user’s own experience and knowledge.
Finally, the context information component was described by the variables shown in Table 3.
The context information component is assessed by Eq. (10). Unlike the other components, context information cannot be measured directly. In our study, we define it as follows: if the context of the users and the context of the event information are similar between two users, the context information is considered to have spread. The similarity is again calculated with the cosine similarity method, so the similarity between context information gives the transmission value of context information.
The similarity of the context vector after normalization is used to represent the transmission value. The similarity calculation method used here is the same as Eq. (2).
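As a minimal sketch of this step, the following Python code normalizes a few context vectors and computes their cosine similarity. The min-max normalization, the function names, and the sample values are assumptions made for illustration; the study only states that the vectors are normalized and compared with cosine similarity.

```python
import math


def min_max_normalize(vectors):
    """Scale each feature column to [0, 1] across all vectors (assumed normalization)."""
    columns = list(zip(*vectors))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0 for x, lo, hi in zip(vec, lows, highs)]
        for vec in vectors
    ]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical event-context vectors in the order (T, S, Up, C).
contexts = [
    [10.0, 5.0, 20.0, 3.0],
    [12.0, 6.0, 25.0, 4.0],
    [300.0, 0.0, 1.0, 0.0],
]
norm = min_max_normalize(contexts)

# Transmission value of context information between the first two users.
context_transmission = cosine_similarity(norm[0], norm[1])
```

Because the normalized features are non-negative, the resulting transmission value stays between 0 and 1, which matches the range used for the other information dimensions.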
To understand the characteristics of the three information components and their evolution patterns during the information dispersing–superimposing process, several factors are defined to describe and verify the scattering scope, intensity, and speed. Again, we draw on optical research. To represent the features of color, the Hue, Saturation, and Value (HSV) color space was proposed (Smith, 1978). HSV is a function of the RGB values.
In the HSV color space shown in Figure 2, the colors under each hue are arranged in a radial slice from black at the bottom to white at the top. The distance from the central axis to the edge of the circle represents the saturation of the color. The HSV color space represents color from a statistical perspective: hue represents the type of color, saturation its purity, and value its brightness.
Similarly, our study defines three elements for describing the information dispersing–superimposing process on the SM: information hue, information saturation, and information flux. Information hue addresses whether the three information dimensions are experiencing dispersion and which information component is dominant among the three. Information saturation refers to the intensity of the dispersion of the information dimensions. Information flux signifies the maximum value among the three information components. These three variables construct a Hue, Saturation, and Flux (HSF) information space. Just as HSV is a function of RGB, HSF is a function of the transmission values of the symbol, sentiment, and context information dimensions. The HSF information space is assumed to describe the multidimensional interaction of information and the information dispersing–superimposing process on the SM comprehensively and clearly. There are two reasons for choosing the transmission value of each information dimension as the independent variable. On the one hand, the representation and unit of each dimension differ: vectors are used for symbol and context information, while a number is used for sentiment. Unlike RGB, which shares a unified unit, the raw dimensions cannot meaningfully be inserted into one function. By contrast, the transmission value ranges from 0 to 1 and can be calculated in the same way for all three dimensions. On the other hand, in the SM environment, whether information is transmitted and the extent to which it is transmitted are more meaningful than the information content itself. Thus, the transmission value can represent the interaction process of the three information dimensions more accurately.
The parameter Hue in the HSV color space represents the spectral position at which the color is located. In this study, information hue is used to indicate the dispersing–superimposing state of information, that is, whether the information disseminated in a given period is experiencing dispersion or superimposition. For example, as shown in Figure 3, the information hue value between symbol and sentiment refers to the state of dispersion or superimposition of symbol and sentiment.
Information hue denotes the information dispersing–superimposing state among the three information components on the new media.
Its calculation takes the classic formula for converting RGB to HSV as a reference.
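For orientation, the classic hexagonal RGB-to-HSV hue formula can be written with the three transmission values in place of R, G, and B. This is a sketch under stated assumptions rather than the study's own equation: the symbols $I_{sym}$, $I_{sen}$, and $I_{con}$ for the transmission values of symbol, sentiment, and context information are introduced here for illustration, with symbol taking the role of R (0°), sentiment of G (120°), and context of B (240°).

```latex
\begin{aligned}
\max &= \max(I_{sym}, I_{sen}, I_{con}), \qquad \min = \min(I_{sym}, I_{sen}, I_{con}), \\
H &=
\begin{cases}
\text{undefined}, & \max = \min, \\
60^\circ \times \left( \dfrac{I_{sen} - I_{con}}{\max - \min} \bmod 6 \right), & \max = I_{sym}, \\
60^\circ \times \left( \dfrac{I_{con} - I_{sym}}{\max - \min} + 2 \right), & \max = I_{sen}, \\
60^\circ \times \left( \dfrac{I_{sym} - I_{sen}}{\max - \min} + 4 \right), & \max = I_{con}.
\end{cases}
\end{aligned}
```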
Information hue was defined to signify whether the interaction is in a dispersing or a superimposing state and which information component is dominant at the moment. Six situations, based on the area in which the angle (the calculated information hue) is located, correspond to six states of multidimensional information interaction (see Figure 4): 1 – superimposition of symbol and sentiment information; 2 – dispersion as sentiment information (dominant information component); 3 – superimposition of sentiment and context information; 4 – dispersion as context information (dominant information component); 5 – superimposition of symbol and context information; and 6 – dispersion as symbol information (dominant information component).
Two extreme cases were excluded from our model for numerical reasons. First, the case in which the transmission values of all three information dimensions are 1 is not considered. It would mean that each information dimension contains the same amount of content and is perfectly transmitted, which is rare in reality. This case corresponds to pure white in the color system and has no practical significance in our model. Second, the case in which the transmission values of all three information dimensions are 0 is also excluded. This case corresponds to pure black and means that no information component is transmitted on the SM.
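The mapping from a hue angle to one of the six states can be sketched in Python as follows. The sector boundaries are an assumption: the code places each state in a 60° sector centered on 0°, 60°, ..., 300°, with symbol at 0°, sentiment at 120°, and context at 240°. The actual boundaries are those of Figure 4, which may differ. The function names are hypothetical.

```python
def information_hue(symbol, sentiment, context):
    """Hue angle in degrees via the classic RGB-to-HSV formula.

    Returns None when all three transmission values are equal, which covers
    the excluded all-0 and all-1 cases (the hue is undefined there).
    """
    hi = max(symbol, sentiment, context)
    lo = min(symbol, sentiment, context)
    delta = hi - lo
    if delta == 0:
        return None
    if hi == symbol:
        return (60.0 * ((sentiment - context) / delta)) % 360.0
    if hi == sentiment:
        return 60.0 * ((context - symbol) / delta + 2.0)
    return 60.0 * ((symbol - sentiment) / delta + 4.0)


def interaction_state(hue_deg):
    """Map a hue angle to states 1..6, assuming 60-degree sectors centered on
    0, 60, 120, 180, 240, and 300 degrees (an assumed reading of Figure 4)."""
    sector = int(((hue_deg + 30.0) % 360.0) // 60.0)
    # Sector 0 is centered on 0 degrees: dispersion as symbol information (state 6).
    return 6 if sector == 0 else sector


# Hypothetical transmission values: symbol and sentiment are both high.
state = interaction_state(information_hue(0.9, 0.9, 0.1))
```

Under these assumptions, odd-numbered states fall midway between two components (superimposition) and even-numbered states fall on a single component (dispersion).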
The parameter Saturation in the HSV color space refers to the purity of color. The higher the saturation, the purer and more vivid the color. Its value lies between 0% and 100%. In our study, information saturation is a similar concept that indicates the intensity of information dispersing and superimposing. The higher the saturation, the more explicit the dispersing–superimposing state.
According to the definition of saturation in the HSV color space, the saturation of information is defined as follows:
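For reference, the standard HSV saturation can be written with the three transmission values in place of R, G, and B. This is a sketch that assumes the study follows the standard form; the symbols $I_{sym}$, $I_{sen}$, and $I_{con}$ for the transmission values of symbol, sentiment, and context information are introduced here for illustration.

```latex
S =
\begin{cases}
0, & \max(I_{sym}, I_{sen}, I_{con}) = 0, \\[4pt]
\dfrac{\max(I_{sym}, I_{sen}, I_{con}) - \min(I_{sym}, I_{sen}, I_{con})}{\max(I_{sym}, I_{sen}, I_{con})}, & \text{otherwise.}
\end{cases}
```

With this form, saturation is 0 when the three transmission values are equal and approaches 1 as the smallest value falls toward 0 relative to the largest.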
The parameter Value in the HSV color space represents the brightness of the color. However, this variable is too abstract for assessing information dispersing. Thus, we use information flux instead of value. Information flux denotes the maximum of the transmission values of symbol, sentiment, and context information during the communication process, which implies the maximum flow of information.
The three elements construct the HSF information space. Following the HSV color space, a regular hexagon is created to represent the three information components, where symbol starts at 0°, sentiment at 120°, and context at 240° (see Figure 3).
Information flux is defined as follows.
The variable max is the same as that defined in Eq. (12).
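Since the HSF space is modeled on HSV, all three quantities can be obtained in one call with Python's standard `colorsys` module. This is a sketch that assumes the study's definitions coincide with standard HSV, with flux taking the place of value; the function name `hsf` and the sample inputs are hypothetical.

```python
import colorsys


def hsf(symbol, sentiment, context):
    """Map three transmission values in [0, 1] to (hue in degrees, saturation, flux).

    Symbol, sentiment, and context take the roles of R, G, and B, so that
    symbol sits at 0 degrees, sentiment at 120, and context at 240.
    """
    for value in (symbol, sentiment, context):
        if not 0.0 <= value <= 1.0:
            raise ValueError("transmission values are expected in [0, 1]")
    # colorsys returns the hue as a fraction of a full turn; the third
    # component (HSV value) is the maximum input, used here as flux.
    hue_fraction, saturation, flux = colorsys.rgb_to_hsv(symbol, sentiment, context)
    return hue_fraction * 360.0, saturation, flux


# Hypothetical transmission values with sentiment as the largest component.
hue, saturation, flux = hsf(0.2, 0.8, 0.4)
```

Note that `colorsys` reports a hue of 0 for the undefined case in which all three inputs are equal, so the two excluded extreme cases need to be filtered out separately.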
To verify and clarify the information dispersing–superimposing process on the SM, an experimental evaluation based on a public emergency event was conducted by tracing the interaction paths of the three information components. The “Tianjin Binhai warehouse explosion” event was chosen. On August 12, 2015, a fire and explosion occurred in the dangerous goods warehouse of Ruihai Company in Tianjin Port, Binhai New Area, Tianjin, leaving 165 people dead and 8 missing. The hashtag “#Tianjin Binhai warehouse explosion#” related to the event has been read over 60 million times. This was a typical emergency event with clear propagation stages, so the dispersion and superimposition of the different information dimensions at each stage were expected to be readily identified. Other types of events could also be analyzed in future research, and it would be meaningful to classify events according to their dispersing–superimposing modes.
The event selected on micro-blog was the “Tianjin Binhai warehouse explosion”. To collect as much related data as possible from the platform, we used a two-step collection method. First, all tweets under the label “#Tianjin Binhai warehouse explosion#” were selected from micro-blog. These tweets also exposed associated labels such as #Tianjin Binhai explosion#, #Tianjin Binhai warehouse explosion#, #Explosion of Tianjin Binhai Wharf#, #Explosion of Tianjin Binhai New Area wharf#, and #Tianjin Port explosion accident#. Second, we collected further data under all these related labels to ensure that as many useful tweets as possible were included in our analysis. Some filtering was applied during data collection: users who had no communication with others were excluded, as were tweets consisting purely of emoji. In the end, 128,870 tweets were collected along with the user information related to these tweets.
However, our data collection had the following limitations: (a) we did not collect unlabelled tweets that might also be related to this event; although these tweets were few in number, their absence could still affect the analysis results; (b) emojis were excluded from the analysis, although they are sometimes informative and representative.
Four steps were considered to conduct our experiments. (a) An index of users and their content was created from both comments and re-posts. The content behind these index links was used to calculate symbol information and sentiment information. (b) Following Section 3.1.3, the collected user data was used to measure the user’s context information. (c) Finally, users’ operations over the whole timeline of the event were selected and calculated for event context analysis. In this way, we could obtain the key time point of user participation in the event followed by
Notably, the entire corpus was not used as a single input to the LDA model in step (a). Instead, texts linked by a repost or comment relationship were aggregated into a set, and the LDA model was then applied to this set to build the symbol information vector. Since users who had no communication with others were excluded from our analysis, every tweet belonged to a set with its own symbol information vector.
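The aggregation step can be sketched as follows, before any topic model is applied. The record layout (`id`, `parent_id`, `text`) and the choice to group by the root post of each repost or comment chain are assumptions for illustration; the study does not specify how the sets were built.

```python
from collections import defaultdict


def aggregate_threads(posts):
    """Group texts linked by repost/comment relationships into sets.

    posts: list of dicts with hypothetical keys 'id', 'parent_id'
    (None for an original post), and 'text'.
    Returns {root_id: [texts]} and drops posts with no interaction.
    """
    parent = {p["id"]: p["parent_id"] for p in posts}

    def root(post_id):
        seen = set()
        # Follow parent links upward; the 'seen' set guards against cycles.
        while parent.get(post_id) is not None and post_id not in seen:
            seen.add(post_id)
            post_id = parent[post_id]
        return post_id

    threads = defaultdict(list)
    for p in posts:
        threads[root(p["id"])].append(p["text"])
    # Keep only sets in which at least two texts are connected.
    return {rid: texts for rid, texts in threads.items() if len(texts) > 1}


# Hypothetical sample: post 2 comments on 1, post 3 reposts 2, post 4 is isolated.
sample = [
    {"id": 1, "parent_id": None, "text": "original report"},
    {"id": 2, "parent_id": 1, "text": "comment on the report"},
    {"id": 3, "parent_id": 2, "text": "repost of the comment"},
    {"id": 4, "parent_id": None, "text": "isolated post"},
]
sets = aggregate_threads(sample)
```

Each resulting list of texts would then serve as one input set for the topic model that produces the symbol information vector.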
First, the overall development of the event is shown in Figure 5. The diagram shows the growth in the number of users involved in communicating the event. Previous studies have pointed out that events concerning public opinion may pass through seven development stages (dawning awareness, greater urgency, reaching for solutions, wishful thinking, weighing the choices, taking a stand intellectually, and making a responsible judgment morally and emotionally) (Yankelovich, 1991). However, in an actual public emergency event, some development stages are often short and their boundaries fuzzy. Three representative phases (dawning awareness, weighing the choices, and making a responsible judgment) were therefore chosen as the criteria for identifying event propagation in this evaluation. The collected data showed three distinct stages according to the peaks in the number of users involved (see Figure 5). In the lower sub-graph of Figure 5, the abscissa indicates the number of minutes between the time of the event and 0:00 on January 1, while the ordinate is the density. The curve represents the change in the number of users over time. Between 350,000 min and about 370,000 min, the number of users had a distinct peak, a clear sign of the beginning of the event. Between 370,000 min and 380,000 min, the number of users peaked several times, which indicated the intermediate stage of the event. After 380,000 min, the number of users did not change significantly over time, which marked the end of the event. Accordingly, the development of the event in our test was simplified into three phases: dawning awareness, weighing the choices, and making a responsible judgment. In Figure 5, the unit of the time axis is minutes, and 300,000 is the initial time at which data was collected.
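The time axis described above can be reproduced with a small helper that converts a posting time into minutes elapsed since 0:00 on January 1 of the same year. This is a sketch; the function name and the sample timestamp are hypothetical, and time zones are ignored.

```python
from datetime import datetime


def minutes_since_year_start(ts):
    """Whole minutes between 0:00 on January 1 of ts.year and the timestamp ts."""
    year_start = datetime(ts.year, 1, 1)
    return int((ts - year_start).total_seconds() // 60)


# Hypothetical posting time of a single tweet.
posted_at = datetime(2015, 9, 1, 12, 30)
t_minutes = minutes_since_year_start(posted_at)
```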
The descriptive statistics of the three information components are shown in Table 4. Sentiment information had the highest mean and variance (mean = 0.5212), while the average values of context information and symbol information were 0.4654 and 0.3878, respectively. In addition, sentiment information had the largest median.
The correlation results for the three information components are shown in Table 5. The correlation between symbol information and sentiment information is the smallest and is negative (correlation = 0.008). A positive correlation exists between sentiment information and context information (correlation = 0.043). The correlation between symbol information and context information is the largest (correlation = 0.304).
Correlation is significant at 0.01 level (2-tailed)
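A pairwise correlation between two series of transmission values could be computed as sketched below. The use of the Pearson coefficient is an assumption, since the type of correlation is not stated here, and the sample series are hypothetical rather than taken from the data set.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical transmission values for two dimensions over five time windows.
symbol_values = [0.30, 0.35, 0.42, 0.38, 0.50]
context_values = [0.40, 0.44, 0.47, 0.45, 0.55]
r = pearson(symbol_values, context_values)
```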
These data indicate that, in this event, the transmission value of symbol information was generally smaller than that of the other dimensions. This implies that on the SM the topic of information was easily affected by other information dimensions during the online discussion of the event. Sentiment information showed a bipolar trend with a higher overall average transmission value, which means that sentiment information was more easily transmitted on the SM. The overall transmission value of context information was low, with a peak around 0.3, which indicates that the contexts of the users participating in the online event varied greatly.
The above results show that the three information components had their own features and different interactive relationships during dissemination on micro-blog. The next step is to examine more precisely how these multidimensional information components interact in a dispersing–superimposing way.
According to the experimental results, five turning points of information dispersing and superimposing were obtained by analyzing information hue. Four periods in which the number of users changed rapidly are also marked for comparison with the trend of information hue (shown in Figure 6). The horizontal axis in Figure 6 is the timestamp (the number of minutes between the time of the event and 0:00 on January 1). The ordinate is the percentage of each labeled component in the figure.
Over the whole process of information dissemination, information superimposition accounted for the larger percentage at the first stage of the event. By contrast, at the second stage, information dispersion was dominant, while at the final stage there was no significant difference in percentage between dispersion and superimposition. At this final stage, the changes in information hue were also relatively flat.
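The percentage comparison between superimposition and dispersion can be sketched as a simple tally over the six interaction states, where states 1, 3, and 5 are superimposition and states 2, 4, and 6 are dispersion. The function name and the per-stage sample sequences are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

SUPERIMPOSITION_STATES = {1, 3, 5}
DISPERSION_STATES = {2, 4, 6}


def state_shares(states):
    """Return the share of superimposition and dispersion states in a sequence."""
    if not states:
        raise ValueError("no states given")
    for s in states:
        if s not in SUPERIMPOSITION_STATES | DISPERSION_STATES:
            raise ValueError(f"unknown state: {s}")
    counts = Counter(
        "superimposition" if s in SUPERIMPOSITION_STATES else "dispersion"
        for s in states
    )
    total = len(states)
    return {kind: counts[kind] / total for kind in ("superimposition", "dispersion")}


# Hypothetical state sequences observed in two stages of an event.
stage_states = {
    "first stage": [5, 5, 1, 6],
    "second stage": [2, 2, 3, 2],
}
shares = {stage: state_shares(seq) for stage, seq in stage_states.items()}
```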
More specifically, the information dispersing–superimposing process among the three information components can be seen in Figure 7. The dominant information component is the component that carries most of the weight at each stage of communication on the SM. Dominant information components show a larger spread scope and intensity, reflecting which information dimension users focus on most at different stages of the event. At the first stage, symbol and context information were dominant. At the second stage, sentiment information became dominant. Finally, context information was dominant at the last stage.
The results for information saturation and information flux during the dispersing–superimposing process are shown in Figures 8 and 9 for the six states of multidimensional information interaction. The abscissas of Figures 8 and 9 represent the six interaction states defined in Figure 4.
Figure 8 shows that, regardless of whether the information was in a state of dispersion or superimposition, information saturation did not differ significantly overall. A significant difference in information saturation appeared only between states No. 5 and No. 6.
This means that when symbol information was one of the components in superimposition, information saturation was higher than when symbol information was in dispersion. In other words, when symbol information was in a state of dispersion, the difference between the transmission values of the three information dimensions was small.
As shown in Figure 9, information flux differed greatly in the states related to sentiment information, which is consistent with the data in Table 5. Apart from states No. 5 and No. 6, the flux of the other states did not differ significantly, as the symmetrical shapes in Figure 9 show.
This implies that when symbol information was in the dispersion state, its maximum value was significantly lower than that of the others. Furthermore, during this online event, users may have paid less attention to symbol information than to information in the other dimensions on the SM.
The experimental results show that the three information components had their own dissemination features and interacted with each other in a dispersing and superimposing way (answering research question 1). Information was shown to be composed of varied dimensions, and certain dimensions became explicit and spread with higher intensity during some periods. Several further findings emerged from the experimental results, some of which were interpreted in the section above. In this part, the three most typical findings are discussed in more detail in order to respond to the two research questions. First, the information turning points had an implicit association with the number of users participating in the event discussion on the SM (research question 1). Second, sentiment and context information were strongly correlated in terms of transmission value (research question 2). Finally, the dominant information component changed during the multidimensional information interaction and reflected users’ information demands (research question 2).
In a micro-blog event, information passes through several turning points (turning points of information dispersing and superimposing) over time. In the experimental results, the most important feature of the turning points was that they often occurred when the number of users involved increased dramatically. Conversely, a sharp increase in the number of users involved did not necessarily bring about a turning point of information dispersing–superimposing. This phenomenon can be explained by the following points.
First, a turning point of information dispersing and superimposing essentially reflected a change in the number of users involved in the online communication of the event. In the experimental results (see Figure 6), each of the three stages of the event corresponded to one or more leaps in the number of users, each of which could be regarded as a small climax of the event. There were two leaps in the number of users participating in the dissemination during the dawning awareness phase (periods a and b), one during the weighing the choices phase (period c), and one during the making a responsible judgment phase (period d). As for turning points, there were two in the dawning awareness phase (points A and B), one in the weighing the choices phase (point C), and two in the making a responsible judgment phase (points D and E). During the first two stages of the event, new users joined and the number of participants rose sharply within a short period of time. These new users brought new ideas and arguments to the event, which set the three information components on a new course of evolution and resulted in turning points of dispersion and superimposition among the information components. These changes could either raise or reduce the transmission value of information relative to the original state. The former occurred when a large number of supporters of the existing opinion joined, which magnified and intensified the original state; the latter occurred in the opposite case. In either case, the turning point represented a change in the number of users involved in the dissemination of the event. In the making a responsible judgment phase, the incident gradually subsided and the number of users fluctuated less. The number of newly entering users was thus significantly lower than in the other stages.
Second, the turning points of information dispersing and superimposing reveal the evolution of events on the SM. In the experimental results, turning point A was located in the dawning awareness phase, where the number of people participating in the discussion increased dramatically. However, the content of the information consisted mainly of objective reporting and diffusion, so the information remained in a state of superimposition between the symbol and context components. Turning point B was located in the weighing the choices phase, in which the event spread further and attracted more users, yet the information as a whole was still dominated by the superimposition of symbol and context information. Turning point C was also located in the weighing the choices phase, at the second development peak of the event. A great number of users joined the event with personal opinions that conflicted with the original ideas, which led to the turning point toward the dispersion of sentiment information. Turning points D and E were in the making a responsible judgment phase. At this point, the event gradually subsided and the information was in a state of superimposition. No further turning point occurred in this phase, which signified that the information remained in a superimposing state among the three information components. The experimental results thus show that the turning points of information dispersing and superimposing conformed to the regularity of event evolution while reflecting the changes in the number of users.
Third, the turning points of information dispersing and superimposing have implications for identifying the stages of events. The turning point can serve as a criterion for dividing event development stages in addition to the number of users. Most research on SM events uses the “number of participants” as the criterion for identifying the stages of an event. In fact, one of the most influential determinants in the new media environment is the participation of “Internet celebrities” (Zsila, McCutcheon, & Demetrovics, 2018). An “Internet celebrity” is not restricted by the official media or the recommendation mechanism of the online platform, yet has sufficient social influence. In this case, the change in the number of users brought about by an “Internet celebrity” cannot fully reflect the true propagation variation of an event, such as the turning points of information. Therefore, it is necessary to apply the perspective of information dispersing and superimposing to identify both the growth in users brought about by “Internet celebrities” and the turning points of the various information components, which in turn allows a more logical division of the development stages of SM events.
From the experimental results, sentiment and context information had higher transmission values (see Figure 7). If a threshold of 0.5 is set for the transmission value of each information component, and a dimension is considered to be propagated when its transmission value exceeds this threshold, the above results can also be understood to mean that sentiment and context information tend to be propagated at the same time and that their existence is interdependent. This phenomenon can be explained in the following aspects.
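The threshold rule can be illustrated with a short sketch that flags a dimension as propagated when its transmission value exceeds 0.5 and then measures how often two dimensions are propagated together. The record layout, the function names, and the sample values are hypothetical; only the 0.5 threshold comes from the text.

```python
THRESHOLD = 0.5


def propagated(value, threshold=THRESHOLD):
    """A dimension counts as propagated when its transmission value exceeds the threshold."""
    return value > threshold


def co_propagation_rate(records, a, b, threshold=THRESHOLD):
    """Among records where dimension a is propagated, the share where b is propagated too."""
    a_hits = [r for r in records if propagated(r[a], threshold)]
    if not a_hits:
        return 0.0
    both = sum(1 for r in a_hits if propagated(r[b], threshold))
    return both / len(a_hits)


# Hypothetical transmission values for four information items.
records = [
    {"symbol": 0.2, "sentiment": 0.8, "context": 0.7},
    {"symbol": 0.6, "sentiment": 0.9, "context": 0.6},
    {"symbol": 0.3, "sentiment": 0.4, "context": 0.2},
    {"symbol": 0.1, "sentiment": 0.7, "context": 0.3},
]
rate = co_propagation_rate(records, "sentiment", "context")
```

A rate close to 1 for the sentiment and context pair, compared with lower rates for pairs involving symbol information, would be one way to express the simultaneous propagation described above.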
First, users are more proactive in spreading sentiment and context information simultaneously. In the SM environment, short texts and fast reading have become the mainstream way of transmitting information. When the transmission values of sentiment and context information exceed the threshold, the opinions and emotions in a short text are highly conspicuous, and users can express and obtain clear viewpoints and attitudes in a limited number of words. Users participating in the communication need only support or reject the opinions reflected in the composite information without understanding its details in depth. This greatly reduces the cost of reading and interaction, which makes users more active in disseminating the information. In addition, each user in the SM environment can freely express his or her own opinions and emotions. As the most subjective information component, sentiment information provokes enough response for it to spread widely and quickly. The case selected for the experiment belongs to the accident-disaster type of emergency. Judging from the experimental results, the sentiment information of this event mainly revolved around the two keywords “sympathy” and “encouragement.” Users spontaneously carried out “blessing delivery” in the process of information transmission. This had a great impact on the dissemination of the event and prompted a quick response from the relevant departments. This transmission direction was similar to that of sentiment and context information, which is also why the information saturation and flux of sentiment and context information in Figures 8 and 9 were much larger than those of symbol information.
Second, the simultaneous transmission of sentiment and context information signifies the potential social relationships of users. An important feature of the new media environment is that it reflects virtual social relationships (Sun, Dong, Tang, Xu, Qi, & Cai, 2015), in which an explicit relationship network and an implicit relationship network are interlaced. The explicit social network is the relationship between “fans and bloggers.” The essence of this kind of social relationship is that users pay attention to others who have similar interests or belong to the same user type, forming a radiating virtual society. In the information dispersing and superimposing model proposed in this study, context information is divided into the context of the event and the context of the user, where the latter refers to the user’s profile on the SM. Composite information is propagated together with context information between relevant users in a similar context, which is consistent with the explicit relationship network in the new media environment. The implicit social relationship network on the SM is based on content recommendation algorithms, which provide users with relevant content based on their query history. The context of the event used in our model defines several attributes that could serve a recommendation algorithm as an index of event features. Therefore, the dissemination of sentiment and context information represents the implicit social relationship embodied by content recommendation.
Finally, the dissemination of information together with sentiment and context information signifies the public’s attitude toward the event. In the SM environment, each user may act as a creator of content and a publisher of information (Tsugawa & Kimura, 2018). Therefore, the followers of an event are no longer limited to its stakeholders, and the evaluation and discussion of the event are ultimately decided by the public. Thus, the sentiment and context information components can reveal the public’s attitude and emotions toward the event.
In the case of #Tianjin Binhai warehouse explosion# selected for the experiment, the three development stages of the event corresponded to different dominant information components (see Figure 7). In the dawning awareness phase, the dominant component was symbol information, accompanied by the passive propagation of context information, for example reports on the number of casualties, rescue measures, and predictions of liability for the accident. Figure 7 shows that sentiment information was silent at this time. In the weighing the choices phase, attention centered on the public discussion of the accident and on participants’ emotions, such as dissatisfaction with chemical safety hazards and praise for firefighters. Sentiment information therefore served as the dominant component at this stage. Finally, in the making a responsible judgment phase, the information mainly consisted of follow-up reports and did not attract a large number of new followers. Subsequent information was transmitted only between users who had a similar context, so context information became dominant.
The shift between dominant information components reflected the shift in users’ focus on the event at different stages. In the dawning awareness phase, the information spread by users was essentially symbol and context information. In this period, users focused on the current situation of the event, such as casualties and the attribution of responsibility. In the weighing the choices phase, the disseminated information changed to sentiment information. Users’ emotion was conveyed to the relevant departments to push for the event to be resolved, and users were concerned with the solution to the event and responsibility for it. During the making a responsible judgment phase, the incident gradually subsided and context information became the dominant component. Users on the SM at this time paid attention to follow-up reports and reflection on the accident. In general, the event no longer drew a large number of new users into the discussion, and information spread only between users in a similar context.
The shift between dominant information components also signified the relationship between the desire for objective information and the burst of subjective information. From the perspective of information flux, a burst of subjective information from users on the SM meant that sentiment information dominated and its transmission value kept peaking. In the experiment, users in the dawning awareness phase were collecting and trying to understand information about the incident online. Their demand for objective information exceeded their own emotion (as shown in Figure 7). At this time, symbol and context information served the need for objective information, while sentiment information stayed silent. In the weighing the choices phase, users were disappointed by the unclear responsibility for the disaster and felt sympathy for the people who suffered from the incident. Their subjective information burst out more strongly than their desire for objective information, and sentiment information thus dominated at this stage. In the making a responsible judgment phase, the information disseminated on the SM concerned the subsequent development of the event and self-reflection. The relationship between users’ desire for objective information and their subjective information gradually evened out.
This paper aimed to verify the interaction ways of multidimensional information involved in the new media. It was assumed that the three information components including symbol, sentiment, and context were transmitted dependently under a dispersing–superimposing path. Therefore, an interaction model was proposed and implemented to explain this dispersing–superimposing process according to the concept of the RGB color framework. Besides defining the calculations of three information components, an HSF information space was also suggested to explain the details during the interaction relied on the HSV color space for optics. Six states of dispersing–superimposing interaction were also extracted from the model. When the dispersing–superimposing interaction emerged, the intensity of dispersing–superimposing interaction and the value of the currently dominant information component at each stage all could be illustrated through this approach.
Then, using data collected from a micro-blog event, the three information components and their interaction process were analyzed based on the variables defined in the model and the HSF information space. A reasonable amount of data was collected, which supported the hypotheses underlying the study and our conjecture that the information dispersing–superimposing phenomenon did occur on the SM. In particular, the experimental results carried some useful implications for the interaction of multidimensional information components. They revealed that the three information dimensions had different transmission features and that different dimensions could be considered the dominant information component at different stages of the online event.
From a theoretical perspective, this study explored information in the SM environment from a multidimensional and interactive point of view. The current conceptual model explains the relationship between information components and why, under the influence of other components, some information components could be more attractive online for the same event; it could also be extended. Although the experiment was conducted for one event, the results and research findings indicate that interaction did exist between these dimensions. For example, they implied that symbol information was easily affected by sentiment information. According to the experimental results, sentiment and context information were transmitted simultaneously, and symbol information could be the dominant component at the beginning, while context information obtained a higher transmission value at the last stage online. All these implications respond to the second research question of the paper. It is better to consider the interaction of these information components, rather than a single dimension, in order to find more meaningful regularities of information on the SM.
From a practical perspective, this model has practical significance for explaining the diversity of online information items. According to the experimental results and research findings, the information disseminated on the SM was actually a composition of varied information components, each of which showed different transmission values and features during the whole process. The information perceived and understood also differed among users. Thus, when creating and posting an information item online, one has to consider the interaction between these information components. This means that the growth of the transmission value of sentiment is determined not only by the emotion involved in the information but also by symbol and context. Besides, this study could help identify more accurately the key monitoring points and predict the evolution trends of the information components on the SM.
This study also has some limitations and could be extended and improved in depth. First, the measurement of the transmission value for the three information components (symbol, sentiment, and context) could be more accurate. Optimized variables and methods need to be applied for a better description of these three information components, for example, the definition of context information. Moreover, the case tested in the experiment is classified under the topic of emergency. In the future, the classification of SM events could also be explored based on the varied interaction paths of information.
Attributes for the Context Information of the User

|Symbol|Attribute|Description|
|---|---|---|
|Fan|Fans|Number of fans|
|Fol|Follows|Number of follows|
|W|Weibo-blog|Number of blogs|
|Lv|User-level|Level of the user account|

Variables to Represent the Context Information Components on the SM
Descriptive Statistics of the Three Information Components
Correlation Matrix Between Three Information Components
Attributes for the Context Information of the Event

|Symbol|Attribute|Description|
|---|---|---|
|T|Blog time|Time of blog posting (minute)|
|S|Forward and share|Number of forwarding|
|Up|Thumbs up|Number of thumbs up|
|C|Comments|Number of comments|
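
The user and event attributes listed in the two attribute tables could be held in a simple record type when processing collected posts. The following is a minimal sketch, assuming each post carries all eight attributes as integers. The class names and the helper `total_engagement` are hypothetical, and the helper is only a convenience sum, not the paper's definition of context information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:
    fan: int  # Fan: number of fans
    fol: int  # Fol: number of follows
    w: int    # W: number of blogs
    lv: int   # Lv: level of the user account


@dataclass(frozen=True)
class EventContext:
    t: int   # T: time of blog posting (minute)
    s: int   # S: number of forwarding
    up: int  # Up: number of thumbs up
    c: int   # C: number of comments


@dataclass(frozen=True)
class PostContext:
    user: UserContext
    event: EventContext


def total_engagement(event: EventContext) -> int:
    """Hypothetical helper: sum of forwards, thumbs up, and comments."""
    return event.s + event.up + event.c


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the dataset.
    post = PostContext(
        user=UserContext(fan=1200, fol=300, w=450, lv=12),
        event=EventContext(t=35, s=10, up=25, c=7),
    )
    print(total_engagement(post.event))
```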