[1] |
Jing Guo, Peng Zhang, Jianlong Tan, and Li Guo.
Mining hot topics from twitter streams.
Procedia Computer Science, 9:2008 - 2011, 2012.
Proceedings of the International Conference on Computational Science,
ICCS 2012.
Mining hot topics from twitter streams has attracted a lot of attention in recent years. Traditional hot topic mining from Internet Web pages was mainly based on text clustering. However, compared to the texts in Web pages, twitter texts are relatively short with sparse attributes. Moreover, twitter data often increase rapidly with fast spreading speed, which poses a great challenge to existing topic mining models. To this end, we propose, in this paper, a flexible stream mining approach for hot twitter topic detection. Specifically, we propose to use the Frequent Pattern stream mining algorithm (i.e. FP-stream) to detect hot topics from twitter streams. Empirical studies on real world twitter data demonstrate the utility of the proposed method. Keywords: Data stream mining |
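The abstract names FP-stream without describing it. As a rough illustration of the general idea of frequent-pattern counting over a tweet stream, the sketch below counts co-occurring term pairs inside a fixed-size sliding window. It is not FP-stream and not the authors' implementation; the function name `frequent_pairs`, the window size, and the support threshold are all assumptions made for this example.

```python
from collections import Counter, deque
from itertools import combinations


def frequent_pairs(tweets, window_size=3, min_support=2):
    """After each tweet, yield the term pairs that occur in at least
    `min_support` of the last `window_size` tweets (hypothetical helper)."""
    window = deque()    # pair sets of the tweets currently in the window
    counts = Counter()  # pair -> number of window tweets containing it
    for text in tweets:
        terms = sorted(set(text.lower().split()))
        pairs = set(combinations(terms, 2))
        window.append(pairs)
        counts.update(pairs)
        if len(window) > window_size:
            # Forget the oldest tweet so stale topics fade out.
            counts.subtract(window.popleft())
        yield {pair for pair, n in counts.items() if n >= min_support}


# Made-up example stream, not data from the paper.
stream = [
    "earthquake hits city",
    "city earthquake damage",
    "football match tonight",
    "football match result",
]
snapshots = list(frequent_pairs(stream))
```

Each snapshot is the set of "hot" pairs at that point in the stream; once the first tweet leaves the window, the earthquake pair drops below the threshold and the football pair takes its place.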
[2] |
Se Jung Park, Yon Soo Lim, and Han Woo Park.
Comparing twitter and youtube networks in information diffusion: The
case of the “occupy wall street” movement.
Technological Forecasting and Social Change, 95:208 - 217,
2015.
Grounded by the micro approach to network theory, information diffusion theory, and the web ecology model, this study comparatively explores the network structure, interaction pattern, and geographic distribution of users involved in communication networks of the Occupy Wall Street movement on Twitter and YouTube. The results show that Twitter users generated a loosely connected hub-and-spoke network, suggesting that information was likely to be organized by several central users in the network and that these users bridged small communities. On YouTube, homogeneously themed videos formed a dense mesh network, reinforcing shared ideas and meanings. According to the geographic distribution, both Twitter and YouTube networks were actively organized by U.S. users, but the YouTube network was activated mainly by anonymous users. These results highlight differing roles of social media in political information diffusion in which the Twitter network not only organizes and coordinates information but also facilitates the exchange of ideas between different groups. YouTube is suitable for disseminating ideas and reinforcing solidarity among members. The results demonstrate useful analytical techniques for data mining and analyzing Twitter and YouTube networks and have important implications for distinct roles of social media platforms in organizing collective action. Keywords: Information diffusion |
[3] |
YoonKyung Cha and Craig A. Stow.
Mining web-based data to assess public response to environmental
events.
Environmental Pollution, 198:97 - 99, 2015.
We explore how the analysis of web-based data, such as Twitter and Google Trends, can be used to assess the social relevance of an environmental accident. The concept and methods are applied in the shutdown of drinking water supply at the city of Toledo, Ohio, USA. Toledo's notice, which persisted from August 1 to 4, 2014, is a high-profile event that directly influenced approximately half a million people and received wide recognition. The notice was given when excessive levels of microcystin, a byproduct of cyanobacteria blooms, were discovered at the drinking water treatment plant on Lake Erie. Twitter mining results illustrated an instant response to the Toledo incident, the associated collective knowledge, and public perception. The results from Google Trends, on the other hand, revealed how the Toledo event raised public attention on the associated environmental issue, harmful algal blooms, in a long-term context. Thus, when jointly applied, Twitter and Google Trend analysis results offer complementary perspectives. Web content aggregated through mining approaches provides a social standpoint, such as public perception and interest, and offers context for establishing and evaluating environmental management policies. Keywords: Twitter |
[4] |
Pete Burnap, Omer F. Rana, Nick Avis, Matthew Williams, William Housley, Adam
Edwards, Jeffrey Morgan, and Luke Sloan.
Detecting tension in online communities with computational twitter
analysis.
Technological Forecasting and Social Change, 95:96 - 108,
2015.
The growing number of people using social media to communicate with others and document their personal opinion and action is creating a significant stream of data that provides the opportunity for social scientists to conduct online forms of research, providing an insight into online social formations. This paper investigates the possibility of forecasting spikes in social tension – defined by the UK police service as “any incident that would tend to show that the normal relationship between individuals or groups has seriously deteriorated” – through social media. A number of different computational methods were trialed to detect spikes in tension using a human coded sample of data collected from Twitter, relating to an accusation of racial abuse during a Premier League football match. Conversation analysis combined with syntactic and lexicon-based text mining rules; sentiment analysis; and machine learning methods was tested as a possible approach. Results indicate that a combination of conversation analysis methods and text mining outperforms a number of machine learning approaches and a sentiment analysis tool at classifying tension levels in individual tweets. Keywords: Opinion mining |
[5] |
Chien-wen Shen and Chin-Jin Kuo.
Learning in massive open online courses: Evidence from social media
mining.
Computers in Human Behavior, pages -, 2015.
Because many massive open online courses (MOOCs) have adopted social media tools for large student audiences to co-create knowledge and engage in collective learning processes, this study adopted various social media mining approaches to investigate Twitter messages related to MOOC learning. The first approach adopted in this study was calculating the important descriptive statistics of MOOC-related tweets and examining the daily, weekly, and monthly trends of MOOC that appeared on Twitter. This information can enable MOOC practitioners to observe participants’ temporal activities on social media and ascertain the most effective time to post or analyze tweets. Secondly, we investigated how public sentiment toward MOOC learning can be assessed according to related tweets. Because the availability and popularity of opinion-rich social networking services are increasing for MOOC communities, our findings from the sentiment analysis of Twitter data can afford substantial insights into participant perceptions of MOOC learning. Third, we analyzed the positive and negative retweets related to MOOCs and identified the influencers of these retweets. Social network diagrams were also developed to reveal how sentimental messages about MOOCs on Twitter were disseminated from the top influencers with the highest number of positive/negative retweets about MOOCs. Analyzing the relationships among top retweet users is vital to MOOC practitioners because they can use this information to filter or recommend MOOC-related messages to the influencers. In short, the findings pertaining to social media mining in this study afford a holistic understanding of MOOC trends, public sentiment toward MOOC learning, and the influencers of MOOC-related retweets. Keywords: MOOC |
[6] |
Michelle Odlum and Sunmoo Yoon.
What can we learn about the ebola outbreak from tweets?
American Journal of Infection Control, 43(6):563 - 571, 2015.
Background: Twitter can address the challenges of the current Ebola outbreak surveillance. The aims of this study are to demonstrate the use of Twitter as a real-time method of Ebola outbreak surveillance to monitor information spread, capture early epidemic detection, and examine content of public knowledge and attitudes. Methods: We collected tweets mentioning Ebola in English during the early stage of the current Ebola outbreak from July 24-August 1, 2014. Our analysis for this observational study includes time series analysis with geologic visualization to observe information dissemination and content analysis using natural language processing to examine public knowledge and attitudes. Results: A total of 42,236 tweets (16,499 unique and 25,737 retweets) mentioning Ebola were posted and disseminated to 9,362,267,048 people, 63 times higher than the initial number. Tweets started to rise in Nigeria 3-7 days prior to the official announcement of the first probable Ebola case. The topics discussed in tweets include risk factors, prevention education, disease trends, and compassion. Conclusion: Because of the analysis of a unique Twitter dataset captured in the early stage of the current Ebola outbreak, our results provide insight into the intersection of social media and public health outbreak surveillance. Findings demonstrate the usefulness of Twitter mining to inform public health education. Keywords: Ebola outbreak |
[7] |
Haji Mohammad Saleem, Yishi Xu, and Derek Ruths.
Effects of disaster characteristics on twitter event signature.
Procedia Engineering, 78:165 - 172, 2014.
Humanitarian Technology: Science, Systems and Global Impact 2014,
HumTech2014.
Twitter has emerged as a platform that is heavily used during disasters. Therefore, as an event unfolds, it generates varying levels of online engagement from victims as well as onlookers (both physical and virtual). Because methods for mining disaster-related content at scale must contend with the problem of filtering out vast numbers of unrelated posts, any prior knowledge about the characteristics of disaster-related content in the live Twitter feed may help improve the recovery of relevant posts. In this study, we consider the relative abundance of a disaster's Twitter content over time (both relative to total event-related content and relative to the overall volume of content generated on Twitter). We refer to this time-varying abundance as the event's signature. In an analysis of three different disasters, we find that event signatures are qualitatively different. These differences can be explained in terms of several characteristics of disasters: foreknowledge, duration, severity, and news media engagement. Keywords: Twitter event analysis |
[8] |
Bongsug (Kevin) Chae.
Insights from hashtag #supplychain and twitter analytics: Considering
twitter and twitter data for supply chain practice and research.
International Journal of Production Economics, 165:247 - 259,
2015.
Recently, businesses and research communities have paid a lot of attention to social media and big data. However, the field of supply chain management (SCM) has been relatively slow in studying social media and big data for research and practice. In these contexts, this research contributes to the SCM community by proposing a novel, analytical framework (Twitter Analytics) for analyzing supply chain tweets, highlighting the current use of Twitter in supply chain contexts, and further developing insights into the potential role of Twitter for supply chain practice and research. The proposed framework combines three methodologies – descriptive analytics (DA), content analytics (CA) integrating text mining and sentiment analysis, and network analytics (NA) relying on network visualization and metrics – for extracting intelligence from 22,399 #supplychain tweets. Some of the findings are: supply chain tweets are used by different groups of supply chain professionals and organizations (e.g., news services, IT companies, logistic providers, manufacturers) for information sharing, hiring professionals, and communicating with stakeholders, among others; diverse topics are being discussed, ranging from logistics and corporate social responsibility, to risk, manufacturing, SCM IT and even human rights; some tweets carry strong sentiments about companies' delivery services, sales performance, and environmental standards, and risk and disruption in supply chains. Based on these findings, this research presents insights into the use and potential role of Twitter for supply chain practices (e.g., professional networking, stakeholder engagement, demand shaping, new product/service development, supply chain risk management) and the implications for research. Finally, the limitations of the current study and suggestions for future research are presented. Keywords: Supply chain management |
[9] |
Andrei P. Kirilenko and Svetlana O. Stepchenkova.
Public microblogging on climate change: One year of twitter
worldwide.
Global Environmental Change, 26:171 - 182, 2014.
Public perceptions of climate change are traditionally measured through surveys. The exploding popularity of social networks, however, presents a new opportunity to research the spatiotemporal pattern of public discourse in relation to natural and/or socio-economic events. Among the social networks, Twitter is one of the largest microblogging services. The architecture of Twitter makes the question “what's happening?” the cornerstone of information exchange. This inspired the notion of using Twitter users as distributed sensors, which has been successfully employed in both the natural and social sciences. In 2012 and 2013, we collected 1.8 million tweets on “climate change” and “global warming” in five major languages (English, German, Russian, Portuguese, and Spanish). We discuss the geography of tweeting, weekly and daily patterns, major news events that affected tweeting on climate change, changes in the central topics of discussion over time, the most authoritative traditional media, blogging, and the most authoritative organizational sources of information on climate change referenced by Twitter users in different countries. We anticipate that social network mining will become a major source of data in the public discourse on climate change. Keywords: Climate change |
[10] |
Wei Yang and Lan Mu.
GIS analysis of depression among twitter users.
Applied Geography, 60:217 - 223, 2015.
Depression is a common chronic disorder. It often goes undetected due to limited diagnosis methods and brings serious results to public and personal health. Former research detected geographic patterns for depression using questionnaires or self-reported measures of mental health, which may induce same-source bias. Recent studies use social media for depression detection but none of them examines the geographic patterns. In this paper, we apply GIS methods to social media data to provide new perspectives for public health research. We design a procedure to automatically detect depressed users in Twitter and analyze their spatial patterns using GIS technology. This method can improve diagnosis techniques for depression. It is faster at collecting data and more prompt at analyzing and providing results. Also, this method can be expanded to detect other major events in real-time, such as disease outbreaks and earthquakes. Keywords: Depression |
[11] |
Wei Yang, Lan Mu, and Ye Shen.
Effect of climate and seasonality on depressed mood among twitter
users.
Applied Geography, 63:184 - 191, 2015.
Location-based social media provide an enormous stream of data about humans' life and behavior. With geospatial methods, those data can offer rich insights into public health. In this research, we study the effect of climate and seasonality on the prevalence of depression in Twitter users in the U.S. Text mining and geospatial methods are used to detect tweets related to depression and their spatiotemporal patterns at the scale of Metropolitan Statistical Area. We find that the relationships between depression rates, climate risk factors and seasonality are varied and geographically localized. The same climate measure may have opposite associations with depression rates at different places. Relative humidity, temperature, sea level pressure, precipitation, snowfall, wind speed, global solar radiation, and length of day all contribute to the geographic variations of depression rates. A conceptual compact map is designed to visualize scattered geographic phenomena in a large area. We also propose a three-stage framework that semi-automatically detects and analyzes geographically distributed health issues using location-based social media data. Keywords: Climate |
[12] |
J. del Campo-Ávila, N. Moreno-Vergara, and M. Trella-López.
Bridging the gap between the least and the most influential twitter
users.
Procedia Computer Science, 19:437 - 444, 2013.
The 4th International Conference on Ambient Systems, Networks and
Technologies (ANT 2013), the 3rd International Conference on Sustainable
Energy Information Technology (SEIT-2013).
Social networks play an increasingly important role in shaping the behaviour of users of the Web. Conceivably Twitter stands out from the others, not only for the platform's simplicity but also for the great influence that the messages sent over the network can have. The impact of such messages determines the influence of a Twitter user and is what tools such as Klout, PeerIndex or TwitterGrader aim to calculate. Reducing all the factors that make a person influential into a single number is not an easy task, and the effort involved could become useless if the Twitter users do not know how to improve it. In this paper we identify what specific actions should be carried out for a Twitterer to increase their influence in each of the above-mentioned tools, applying, for this purpose, data mining techniques based on classification and regression algorithms to the information collected from a set of Twitter users. Keywords: Social-Media networking |
[13] |
Luca Cagliero, Tania Cerquitelli, Paolo Garza, and Luigi Grimaudo.
Twitter data analysis by means of strong flipping generalized
itemsets.
Journal of Systems and Software, 94:16 - 29, 2014.
Twitter data has recently been considered to perform a large variety of advanced analysis. Analysis of Twitter data imposes new challenges because the data distribution is intrinsically sparse, due to a large number of messages posted every day by using a wide vocabulary. Aimed at addressing this issue, generalized itemsets – sets of items at different abstraction levels – can be effectively mined and used to discover interesting multiple-level correlations among data supplied with taxonomies. Each generalized itemset is characterized by a correlation type (positive, negative, or null) according to the strength of the correlation among its items. This paper presents a novel data mining approach to supporting different and interesting targeted analysis – topic trend analysis, context-aware service profiling – by analyzing Twitter posts. We aim at discovering contrasting situations by means of generalized itemsets. Specifically, we focus on comparing itemsets discovered at different abstraction levels and we select large subsets of specific (descendant) itemsets that show correlation type changes with respect to their common ancestor. To this aim, a novel kind of pattern, namely the Strong Flipping Generalized Itemset (SFGI), is extracted from Twitter messages and contextual information supplied with taxonomy hierarchies. Each SFGI consists of a frequent generalized itemset X and the set of its descendants showing a correlation type change with respect to X. Experiments performed on both real and synthetic datasets demonstrate the effectiveness of the proposed approach in discovering interesting and hidden knowledge from Twitter data. Keywords: Social network analysis and mining |
[14] |
Kim B. Stevens and Dirk U. Pfeiffer.
Sources of spatial animal and human health data: Casting the net wide
to deal more effectively with increasingly complex disease problems.
Spatial and Spatio-temporal Epidemiology, 13:15 - 29, 2015.
During the last 30 years it has become commonplace for epidemiological studies to collect locational attributes of disease data. Although this advancement was driven largely by the introduction of handheld global positioning systems (GPS), and more recently, smartphones and tablets with built-in GPS, the collection of georeferenced disease data has moved beyond the use of handheld GPS devices and there now exist numerous sources of crowdsourced georeferenced disease data such as that available from georeferencing of Google search queries or Twitter messages. In addition, cartography has moved beyond the realm of professionals to crowdsourced mapping projects that play a crucial role in disease control and surveillance of outbreaks such as the 2014 West Africa Ebola epidemic. This paper provides a comprehensive review of a range of innovative sources of spatial animal and human health data including data warehouses, mHealth, Google Earth, volunteered geographic information and mining of internet-based big data sources such as Google and Twitter. We discuss the advantages, limitations and applications of each, and highlight studies where they have been used effectively. Keywords: Big data |
[15] |
Takamu Kaneko and Keiji Yanai.
Event photo mining from twitter using keyword bursts and image
clustering.
Neurocomputing, pages -, 2015.
Twitter is a unique microblogging service which enables people to post and read not only short messages but also photos from anywhere. Since microblogs are different from traditional blogs in terms of timeliness and on-the-spot-ness, they include much information on various events over the world. In particular, photos posted to microblogs are useful to understand what happens in the world visually and intuitively. In this paper, we propose a system to discover events and related photos from the Twitter stream. We make use of “geo-photo tweets”, which are tweets including both geotags and photos, in order to mine various events visually and geographically. Some works on event mining which utilize geotagged tweets have been proposed so far. However, they used no images but only textual analysis of tweet message texts. In this work, we detect events using visual information as well as textual information. In the experiments, we analyzed 17 million geo-photo tweets posted in the United States and 3 million geo-photo tweets posted in Japan with the proposed method, and evaluated the results. We show some examples of detected events and their photos such as “rainbow”, “fireworks”, “Tokyo firefly festival” and “Halloween”. Keywords: Twitter |
[16] |
Abd. Samad Hasan Basari, Burairah Hussin, I. Gede Pramudya Ananta, and Junta
Zeniarja.
Opinion mining of movie review using hybrid method of support vector
machine and particle swarm optimization.
Procedia Engineering, 53:453 - 462, 2013.
Malaysian Technical Universities Conference on Engineering
& Technology 2012, MUCET 2012.
Nowadays, online social media is online discourse where people contribute to create content, share it, bookmark it, and network at an impressive rate. Among social media today, Twitter stands out for fast messaging and ease of use. The messages on Twitter include reviews and opinions on certain topics such as movies, books, products, politics, and so on. Based on this condition, this research attempts to use the messages of Twitter to review a movie by using opinion mining or sentiment analysis. Opinion mining refers to the application of natural language processing, computational linguistics, and text mining to identify or classify whether the movie is good or not based on message opinion. Support Vector Machine (SVM) is a supervised learning method that analyzes data and recognizes the patterns that are used for classification. This research concerns binary classification into two classes, positive and negative. The positive class shows good message opinion; otherwise the negative class shows the bad message opinion of certain movies. This justification is based on the accuracy level of SVM, with a validation process that uses 10-fold cross validation and a confusion matrix. The hybrid Particle Swarm Optimization (PSO) is used to improve the selection of the best parameter in order to solve the dual optimization problem. The result shows the improvement of accuracy level from 71.87% to 77%. Keywords: Opinion |
[17] |
Mohamed M. Mostafa.
More than words: Social networks’ text mining for consumer brand
sentiments.
Expert Systems with Applications, 40(10):4241 - 4251, 2013.
Blogs and social networks have recently become a valuable resource for mining sentiments in fields as diverse as customer relationship management, public opinion tracking and text filtering. In fact knowledge obtained from social networks such as Twitter and Facebook has been shown to be extremely valuable to marketing research companies, public opinion organizations and other text mining entities. However, Web texts have been classified as noisy as they represent considerable problems both at the lexical and the syntactic levels. In this research we used a random sample of 3516 tweets to evaluate consumers’ sentiment towards well-known brands such as Nokia, T-Mobile, IBM, KLM and DHL. We used an expert-predefined lexicon including around 6800 seed adjectives with known orientation to conduct the analysis. Our results indicate a generally positive consumer sentiment towards several famous brands. By using both a qualitative and quantitative methodology to analyze brands’ tweets, this study adds breadth and depth to the debate over attitudes towards cosmopolitan brands. Keywords: Consumer behavior |
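Lexicon-based sentiment scoring of the kind this abstract mentions can be sketched in a few lines. The example below is only an illustration under stated assumptions: the two word sets are a tiny made-up stand-in for the study's expert-predefined lexicon (which is not reproduced here), and the scoring rule (positive hits minus negative hits) is a common simple choice, not necessarily the one the author used.

```python
# Tiny illustrative lexicon (assumption); the study's lexicon of about
# 6800 seed adjectives is not reproduced here.
POSITIVE = {"good", "great", "reliable", "fast"}
NEGATIVE = {"bad", "slow", "broken", "late"}


def sentiment_score(tweet):
    """Return (number of positive words) - (number of negative words)."""
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(tweet):
    """Map the score to a coarse sentiment label."""
    score = sentiment_score(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For instance, `label("Great phone, fast delivery!")` counts two positive words and no negative ones, so it falls in the positive class, while a tweet with no lexicon words is labeled neutral.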
[18] |
PhridviRaj, Chintakindi Srinivas, and C.V. GuruRao.
Clustering text data streams – a tree based approach with ternary
function and ternary feature vector.
Procedia Computer Science, 31:976 - 984, 2014.
2nd International Conference on Information Technology and
Quantitative Management, ITQM 2014.
Data is the primary concern in data mining. Data Stream Mining is gaining a lot of practical significance with the huge online data generated from Sensors, Internet Relay Chats, Twitter, Facebook, Online Bank or ATM Transactions. The primary constraint in finding frequent patterns in data streams is that the data can be scanned only once, with limited memory and little processing time. Dynamically changing data, which we call data streams, is becoming a key challenge. In our present work, the algorithm finds frequent patterns in the data streams using a tree-based approach and continuously clusters the generated text data streams using a newly defined ternary similarity measure. Keywords: similarity |
[19] |
M.S.B. PhridviRaj and C.V. GuruRao.
Data mining – past, present and future – a typical survey on data
streams.
Procedia Technology, 12:255 - 263, 2014.
The 7th International Conference Interdisciplinarity in Engineering,
INTER-ENG 2013, 10-11 October 2013, Petru Maior University of Tirgu Mures,
Romania.
Data Stream Mining is one of the areas gaining a lot of practical significance and is progressing at a brisk pace with new methods, methodologies and findings in various applications related to medicine, computer science, bioinformatics and stock market prediction, weather forecast, text, audio and video processing to name a few. Data happens to be the key concern in data mining. With the huge online data generated from several sensors, Internet Relay Chats, Twitter, Facebook, Online Bank or ATM Transactions, dynamically changing data, which we call data streams, is becoming a key challenge. In this paper, we give the algorithm for finding frequent patterns from data streams with a case study and identify the research issues in handling data streams. Keywords: Clustering |
[20] |
Wu He, Shenghua Zha, and Ling Li.
Social media competitive analysis and text mining: A case study in
the pizza industry.
International Journal of Information Management, 33(3):464 -
472, 2013.
Social media have been adopted by many businesses. More and more companies are using social media tools such as Facebook and Twitter to provide various services and interact with customers. As a result, a large amount of user-generated content is freely available on social media sites. To increase competitive advantage and effectively assess the competitive environment of businesses, companies need to monitor and analyze not only the customer-generated content on their own social media sites, but also the textual information on their competitors’ social media sites. In an effort to help companies understand how to perform a social media competitive analysis and transform social media data into knowledge for decision makers and e-marketers, this paper describes an in-depth case study which applies text mining to analyze unstructured text content on Facebook and Twitter sites of the three largest pizza chains: Pizza Hut, Domino's Pizza and Papa John's Pizza. The results reveal the value of social media competitive analysis and the power of text mining as an effective technique to extract business value from the vast amount of available social media data. Recommendations are also provided to help companies develop their social media competitive analysis strategy. Keywords: Social media |
[21] |
Zachary Miller, Brian Dickinson, William Deitrick, Wei Hu, and Alex Hai Wang.
Twitter spammer detection using data stream clustering.
Information Sciences, 260:64 - 73, 2014.
The rapid growth of Twitter has triggered a dramatic increase in spam volume and sophistication. The abuse of certain Twitter components such as “hashtags”, “mentions”, and shortened URLs enables spammers to operate efficiently. These same features, however, may be a key factor in identifying new spam accounts as shown in previous studies. Our study provides three novel contributions. Firstly, previous studies have approached spam detection as a classification problem, whereas we view it as an anomaly detection problem. Secondly, 95 one-gram features from tweet text were introduced alongside the user information analyzed in previous studies. Finally, to effectively handle the streaming nature of tweets, two stream clustering algorithms, StreamKM++ and DenStream, were modified to facilitate spam identification. Both algorithms clustered normal Twitter users, treating outliers as spammers. Each of these algorithms performed well individually, with StreamKM++ achieving 99% recall and a 6.4% false positive rate; and DenStream producing 99% recall and a 2.8% false positive rate. When used in conjunction, these algorithms reached 100% recall and a 2.2% false positive rate, meaning that our system was able to identify 100% of the spammers in our test while incorrectly detecting only 2.2% of normal users as spammers. Keywords: Twitter |
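The two metrics quoted in this abstract follow the standard definitions: recall is TP / (TP + FN) and the false positive rate is FP / (FP + TN), with spammers as the positive class. The sketch below shows the arithmetic; the confusion counts are hypothetical numbers chosen to reproduce the reported percentages and are not the paper's data.

```python
def recall(true_pos, false_neg):
    """Share of actual spammers that were flagged."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos, true_neg):
    """Share of normal users that were wrongly flagged as spammers."""
    return false_pos / (false_pos + true_neg)


# Hypothetical confusion counts (assumption, not from the paper):
# 50 spammers, all caught; 1000 normal users, 22 wrongly flagged.
tp, fn = 50, 0
fp, tn = 22, 978

combined_recall = recall(tp, fn)            # 50 / 50
combined_fpr = false_positive_rate(fp, tn)  # 22 / 1000
```

With these counts, recall is 100% and the false positive rate is 2.2%, which is how the abstract's closing sentence should be read: every spammer is found, at the cost of misflagging about 2 in 100 normal users.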
[22] |
Sunmoo Yoon, Noémie Elhadad, and Suzanne Bakken.
A practical approach for content mining of tweets.
American Journal of Preventive Medicine, 45(1):122 - 129,
2013.
Use of data generated through social media for health studies is gradually increasing. Twitter is a short-text message system developed 6 years ago, now with more than 100 million users generating over 300 million Tweets every day. Twitter may be used to gain real-world insights to promote healthy behaviors. The purposes of this paper are to describe a practical approach to analyzing Tweet contents and to illustrate an application of the approach to the topic of physical activity. The approach includes five steps: (1) selecting keywords to gather an initial set of Tweets to analyze; (2) importing data; (3) preparing data; (4) analyzing data (topic, sentiment, and ecologic context); and (5) interpreting data. The steps are implemented using tools that are publicly available and free of charge and designed for use by researchers with limited programming skills. Content mining of Tweets can contribute to addressing challenges in health behavior research.
|
[23] |
Chung-Hong Lee.
Mining spatio-temporal information on microblogging streams using a
density-based online clustering method.
Expert Systems with Applications, 39(10):9623 - 9641, 2012.
[ bib |
DOI |
http ]
Social networks have been regarded as a timely and cost-effective source of spatio-temporal information for many fields of application. However, while some research groups have successfully developed topic detection methods from the text streams for a while, and even some popular microblogging services such as Twitter did provide information of top trending topics for selection, it is still unable to fully support users for picking up all of the real-time event topics with a comprehensive spatio-temporal viewpoint to satisfy their information needs. This paper aims to investigate how microblogging social networks (i.e. Twitter) can be used as a reliable information source of emerging events by extracting their spatio-temporal features from the messages to enhance event awareness. In this work, we applied a density-based online clustering method for mining microblogging text streams, in order to obtain temporal and geospatial features of real-world events. By analyzing the events detected by our system, the temporal and spatial impacts of the emerging events can be estimated, for achieving the goals of situational awareness and risk management. Keywords: Topic detection |
[24] |
C.S. Lifna and M. Vijayalakshmi.
Identifying concept-drift in twitter streams.
Procedia Computer Science, 45:86 - 94, 2015.
International Conference on Advanced Computing Technologies and
Applications (ICACTA).
[ bib |
DOI |
http ]
We live in a Big Data society, where the dignity of data is like exchange of currency. What we produce as data affords us access to different applications, benefits, services, delivery etc… In today's world communication is mainly through social networking sites like Twitter, Facebook, and Google+. The huge amount of data that is being generated and shared across these micro-blogging sites serves as a good source of Big Data Streams for analysis. As the topic of discussion changes drastically, the relevance of data is temporal, which leads to concept-drift. Identification and handling of this concept-drift in such Big Data Streams is a present area of interest. The state-of-the-art techniques for identifying trending topics in such data streams mainly concentrate on the frequency of the topic as the key parameter. Concentrating on such a weak indicator reduces the precision of mining. This study puts forward a novel approach towards identifying concept-drift by initially grouping topics into classes and assigning weightage for each class, using a sliding window processing model upon Twitter streams. Keywords: Big Data |
[25] |
Michael J. Widener and Wenwen Li.
Using geolocated twitter data to monitor the prevalence of healthy
and unhealthy food references across the US.
Applied Geography, 54:189 - 197, 2014.
[ bib |
DOI |
http ]
Mining the social media outlet Twitter for geolocated messages provides a rich database of information on people's thoughts and sentiments about myriad topics, like public health. Examining this spatial data has been particularly useful to researchers interested in monitoring and mapping disease outbreaks, like influenza. However, very little has been done to utilize this massive resource to examine other public health issues. This paper uses an advanced data-mining framework with a novel use of social media data retrieval and sentiment analysis to understand how geolocated tweets can be used to explore the prevalence of healthy and unhealthy food across the contiguous United States. Additionally, tweets are associated with spatial data provided by the US Department of Agriculture (USDA) of low-income, low-access census tracts (e.g. food deserts), to examine whether tweets about unhealthy foods are more common in these disadvantaged areas. Results show that these disadvantaged census tracts tend to have both a lower proportion of tweets about healthy foods with a positive sentiment, and a higher proportion of unhealthy tweets in general. These findings substantiate the methods used by the USDA to identify regions that are at risk of having low access to healthy foods. Keywords: Food deserts |
[26] |
Paola Velardi, Giovanni Stilo, Alberto E. Tozzi, and Francesco Gesualdo.
Twitter mining for fine-grained syndromic surveillance.
Artificial Intelligence in Medicine, 61(3):153 - 163, 2014.
Text Mining and Information Analysis of Health Documents.
[ bib |
DOI |
http ]
Background Digital traces left on the Internet by web users, if properly aggregated and analyzed, can represent a huge information dataset able to inform syndromic surveillance systems in real time with data collected directly from individuals. Since people use everyday language rather than medical jargon (e.g. runny nose vs. respiratory distress), knowledge of patients’ terminology is essential for the mining of health related conversations on social networks. Objectives In this paper we present a methodology for early detection and analysis of epidemics based on mining Twitter messages. In order to reliably trace messages of patients that actually complain of a disease, first, we learn a model of naïve medical language, second, we adopt a symptom-driven, rather than disease-driven, keyword analysis. This approach represents a major innovation compared to previous published work in the field. Method We first developed an algorithm to automatically learn a variety of expressions that people use to describe their health conditions, thus improving our ability to detect health-related “concepts” expressed in non-medical terms and, in the end, producing a larger body of evidence. We then implemented a Twitter monitoring instrument to finely analyze the presence and combinations of symptoms in tweets. Results We first evaluate the algorithm's performance on an available dataset of diverse medical condition synonyms, then, we assess its utility in a case study of five common syndromes for surveillance purposes. We show that, by exploiting physicians’ knowledge on symptoms positively or negatively related to a given disease, as well as the correspondence between patients’ “naïve” terminology and medical jargon, not only can we analyze large volumes of Twitter messages related to that disease, but we can also mine micro-blogs with complex queries, performing fine-grained tweets classification (e.g. those reporting influenza-like illness (ILI) symptoms vs. common cold or allergy). Conclusions Our approach yields a very high level of correlation with flu trends derived from traditional surveillance systems. Compared with Google Flu, another popular tool based on query search volumes, our method is more flexible and less sensitive to changes in web search behaviors. Keywords: Terminology clustering |
[27] |
Robert Micieli and Jonathan A. Micieli.
Twitter as a tool for ophthalmologists.
Canadian Journal of Ophthalmology / Journal Canadien
d'Ophtalmologie, 47(5):410 - 413, 2012.
[ bib |
DOI |
http ]
Twitter is a social media web site created in 2006 that allows users to post Tweets, which are text-based messages containing up to 140 characters. It has grown exponentially in popularity; now more than 340 million Tweets are sent daily, and there are more than 140 million users. Twitter has become an important tool in medicine in a variety of contexts, allowing medical journals to engage their audiences, conference attendees to interact with one another in real time, and physicians to have the opportunity to interact with politicians, organizations, and the media in a manner that can be freely observed. There are also tremendous research opportunities since Twitter contains a database of public opinion that can be mined by keywords and hashtags. This article serves as an introduction to Twitter and surveys the peer-reviewed literature concerning its various uses and original studies. Opportunities for use in ophthalmology are outlined, and a recommended list of ophthalmology feeds on Twitter is presented. Overall, Twitter is an underutilized resource in ophthalmology and has the potential to enhance professional collegiality, advocacy, and scientific research.
|
[28] |
Ayelet Gal-Tzur, Susan M. Grant-Muller, Tsvi Kuflik, Einat Minkov, Silvio
Nocera, and Itay Shoor.
The potential of social media in delivering transport policy goals.
Transport Policy, 32:115 - 123, 2014.
[ bib |
DOI |
http ]
Information flow plays a central role in the development of transport policy, transport planning and the effective operation of the transport system. The recent upsurge in web enabled and pervasive technologies offers the opportunity of a new route for dynamic information flow that captures the views, needs and experiences of the travelling public in a timely and direct fashion through social media text posts. To date there is little published research, however, on how to realize this opportunity for the sector by capturing and analysing the text data. This paper provides an overview of the different categories of social media, the characteristics of its content and how these characteristics are reflected in transport-related posts. The research described in this paper includes a formulation of the goals for harvesting transport-related information from social media, the hypotheses to be tested to demonstrate that such information can provide valuable input to transport policy development or delivery and the challenges this involves. A hierarchical approach for categorizing transport-related information harvested from social media is presented. An explanatory study was designed, based on the understanding of the nature of social media content, the goals in harvesting it for transport planning and management purposes and existing text mining techniques. An exploratory case study is used to illustrate the process based on Twitter posts associated with particular UK sporting fixtures (i.e. football matches). The results demonstrate both the volume and pertinence of the information obtained.
Whilst text-mining techniques have been applied in a number of other sectors (notably entertainment, business and the political arena), the use of information in the transport sector has some unique features that stem from both day-to-day operational practices and the longer term decision making processes surrounding the transport system – hence the significance and novelty of the results reported here. Many challenges in refining the methodology and techniques remain for future research, however the outcomes presented here are of relevance to a wide range of stakeholders in the transport and text mining fields. Keywords: Social media |
[29] |
Leon Derczynski, Diana Maynard, Giuseppe Rizzo, Marieke van Erp, Genevieve
Gorrell, Raphaël Troncy, Johann Petrak, and Kalina Bontcheva.
Analysis of named entity recognition and linking for tweets.
Information Processing & Management, 51(2):32 - 49, 2015.
[ bib |
DOI |
http ]
Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art. Keywords: Information extraction |
[30] |
Efstratios Kontopoulos, Christos Berberidis, Theologos Dergiades, and Nick
Bassiliades.
Ontology-based sentiment analysis of twitter posts.
Expert Systems with Applications, 40(10):4065 - 4074, 2013.
[ bib |
DOI |
http ]
The emergence of Web 2.0 has drastically altered the way users perceive the Internet, by improving information sharing, collaboration and interoperability. Micro-blogging is one of the most popular Web 2.0 applications and related services, like Twitter, have evolved into a practical means for sharing opinions on almost all aspects of everyday life. Consequently, micro-blogging web sites have since become rich data sources for opinion mining and sentiment analysis. Towards this direction, text-based sentiment classifiers often prove inefficient, since tweets typically do not consist of representative and syntactically consistent words, due to the imposed character limit. This paper proposes the deployment of original ontology-based techniques towards a more efficient sentiment analysis of Twitter posts. The novelty of the proposed approach is that posts are not simply characterized by a sentiment score, as is the case with machine learning-based classifiers, but instead receive a sentiment grade for each distinct notion in the post. Overall, our proposed architecture results in a more detailed analysis of post opinions regarding a specific topic. Keywords: Micro-blogging |
[31] |
Felipe Bravo-Marquez, Marcelo Mendoza, and Barbara Poblete.
Meta-level sentiment models for big social data analysis.
Knowledge-Based Systems, 69:86 - 99, 2014.
[ bib |
DOI |
http ]
People react to events, topics and entities by expressing their personal opinions and emotions. These reactions can correspond to a wide range of intensities, from very mild to strong. An adequate processing and understanding of these expressions has been the subject of research in several fields, such as business and politics. In this context, Twitter sentiment analysis, which is the task of automatically identifying and extracting subjective information from tweets, has received increasing attention from the Web mining community. Twitter provides an extremely valuable insight into human opinions, as well as new challenging Big Data problems. These problems include the processing of massive volumes of streaming data, as well as the automatic identification of human expressiveness within short text messages. In that area, several methods and lexical resources have been proposed in order to extract sentiment indicators from natural language texts at both syntactic and semantic levels. These approaches address different dimensions of opinions, such as subjectivity, polarity, intensity and emotion. This article is the first study of how these resources, which are focused on different sentiment scopes, complement each other. With this purpose we identify scenarios in which some of these resources are more useful than others. Furthermore, we propose a novel approach for sentiment classification based on meta-level features. This supervised approach boosts existing sentiment classification of subjectivity and polarity detection on Twitter. Our results show that the combination of meta-level features provides significant improvements in performance. However, we observe that there are important differences that rely on the type of lexical resource, the dataset used to build the model, and the learning strategy. Experimental results indicate that manually generated lexicons are focused on emotional words, being very useful for polarity prediction. 
On the other hand, lexicons generated with automatic methods include neutral words, introducing noise in the detection of subjectivity. Our findings indicate that polarity and subjectivity prediction are different dimensions of the same problem, but they need to be addressed using different subspace features. Lexicon-based approaches are recommendable for polarity, and stylistic part-of-speech based approaches are meaningful for subjectivity. With this research we offer a more global insight of the resource components for the complex task of classifying human emotion and opinion. Keywords: Sentiment classification |
[32] |
Willyan D. Abilhoa and Leandro N. de Castro.
A keyword extraction method from twitter messages represented as
graphs.
Applied Mathematics and Computation, 240:308 - 325, 2014.
[ bib |
DOI |
http ]
Twitter is a microblog service that generates a huge amount of textual content daily. All this content needs to be explored by means of text mining, natural language processing, information retrieval, and other techniques. In this context, automatic keyword extraction is a task of great usefulness. A fundamental step in text mining techniques consists of building a model for text representation. The model known as vector space model, VSM, is the most well-known and used among these techniques. However, some difficulties and limitations of VSM, such as scalability and sparsity, motivate the proposal of alternative approaches. This paper proposes a keyword extraction method for tweet collections that represents texts as graphs and applies centrality measures for finding the relevant vertices (keywords). To assess the performance of the proposed approach, three different sets of experiments are performed. The first experiment applies TKG to a text from the Time magazine and compares its performance with that of the literature. The second set of experiments takes tweets from three different TV shows, applies TKG and compares it with TFIDF and KEA, having human classifications as benchmarks. Finally, these three algorithms are applied to tweets sets of increasing size and their computational running time is measured and compared. Altogether, these experiments provide a general overview of how TKG can be used in practice, its performance when compared with other standard approaches, and how it scales to larger data instances. The results show that TKG is a novel and robust proposal to extract keywords from texts, particularly from short messages, such as tweets. Keywords: Knowledge discovery |
[33] |
Guanhua Yan.
Peri-watchdog: Hunting for hidden botnets in the periphery of online
social networks.
Computer Networks, 57(2):540 - 555, 2013.
Botnet Activity: Analysis, Detection and Shutdown.
[ bib |
DOI |
http ]
In order to evade detection of ever-improving defense techniques, modern botnet masters are constantly looking for new communication platforms for delivering C&C (Command and Control) information. Attracting their attention is the emergence of online social networks such as Twitter, as the information dissemination mechanism provided by these networks can naturally be exploited for spreading botnet C&C information, and the enormous amount of normal communications co-existing in these networks makes it a daunting task to tease out botnet C&C messages. Against this backdrop, we explore graph-theoretic techniques that aid effective monitoring of potential botnet activities in large open online social networks. Our work is based on extensive analysis of a Twitter dataset that contains more than 40 million users and 1.4 billion following relationships, and mines patterns from the Twitter network structure that can be leveraged for improving efficiency of botnet monitoring. Our analysis reveals that the static Twitter topology contains a small-sized core subgraph, after removing which, the Twitter network breaks down into small connected components, each of which can be handily monitored for potential botnet activities. Based on this observation, we propose a method called Peri-Watchdog, which computes the core of a large online social network and derives the set of nodes that are likely to pass botnet C&C information in the periphery of online social network. We analyze the time complexity of Peri-Watchdog under its normal operations. We further apply Peri-Watchdog on the Twitter graph injected with synthetic botnet structures and investigate the effectiveness of Peri-Watchdog in detecting potential C&C information from these botnets.
To verify whether patterns observed from the static Twitter graph are common to other online social networks, we analyze another online social network dataset, BrightKite, which contains evolution of social graphs formed by its users in half a year. We show not only that there exists a similarly relatively small core in the BrightKite network, but also this core remains stable over the course of BrightKite evolution. We also find that to accommodate the dynamic growth of BrightKite, the core has to be updated about every 18 days under a constrained monitoring capacity. Keywords: Twitter |
[34] |
Chung-Hong Lee.
Unsupervised and supervised learning to evaluate event relatedness
based on content mining from social-media streams.
Expert Systems with Applications, 39(18):13338 - 13356, 2012.
[ bib |
DOI |
http ]
Due to the explosive growth of social-media applications, enhancing event-awareness by social mining has become extremely important. The contents of microblogs preserve valuable information associated with past disastrous events and stories. To learn the experiences from past events for tackling emerging real-world events, in this work we utilize the social-media messages to characterize real-world events through mining their contents and extracting essential features for relatedness analysis. On one hand, we established an online clustering approach on Twitter microblogs for detecting emerging events, and meanwhile we performed event relatedness evaluation using an unsupervised clustering approach. On the other hand, we developed a supervised learning model to create extensible measure metrics for offline evaluation of event relatedness. By means of supervised learning, our developed measure metrics are able to compute relatedness of various historical events, allowing the event impacts on specified domains to be quantitatively measured for event comparison. By combining the strengths of both methods, the experimental results showed that the combined framework in our system is sensible for discovering more unknown knowledge about event impacts and enhancing event awareness. Keywords: Stream mining |
[35] |
Svetlana Mansmann, Nafees Ur Rehman, Andreas Weiler, and Marc H. Scholl.
Discovering OLAP dimensions in semi-structured data.
Information Systems, 44:120 - 133, 2014.
[ bib |
DOI |
http ]
OLAP cubes enable aggregation-centric analysis of transactional data by shaping data records into measurable facts with dimensional characteristics. A multidimensional view is obtained from the available data fields and explicit relationships between them. This classical modeling approach is not feasible for scenarios dealing with semi-structured or poorly structured data. We propose to extend the data warehouse design methodology with a content-driven discovery of measures and dimensions in the original dataset. Our approach is based on introducing a data enrichment layer responsible for detecting new structural elements in the data using data mining and other techniques. Discovered elements can be of type measure, dimension, or hierarchy level and may represent static or even dynamic properties of the data. This paper focuses on the challenge of generating, maintaining, and querying discovered elements in OLAP cubes. We demonstrate the power of our approach by providing OLAP to the public stream of user-generated content on the Twitter platform. We have been able to enrich the original set with dynamic characteristics, such as user activity, popularity, messaging behavior, as well as to classify messages by topic, impact, origin, method of generation, etc. Knowledge discovery techniques coupled with human expertise enable structural enrichment of the original data beyond the scope of the existing methods for obtaining multidimensional models from relational or semi-structured data. Keywords: Data warehousing |
[36] |
Harry A. Hassard, Joshua K.Y. Swee, Moustafa Ghanem, and Hironobu Unesaki.
Assessing the impact of the fukushima nuclear disaster on policy
dynamics and the public sphere.
Procedia Environmental Sciences, 17:566 - 575, 2013.
The 3rd International Conference on Sustainable Future for Human
Security, SUSTAIN 2012, 3-5 November 2012, Clock Tower Centennial Hall,
Kyoto University, Japan.
[ bib |
DOI |
http ]
Social and political fallout following the March 2011 Fukushima-Daiichi nuclear disaster permanently altered the zeitgeist of global public attitude towards nuclear power and towards energy technology in general. This area of public policy, which in Japan is particularly opaque and stagnant, was forced into a period of energy sector review amid domestic and worldwide debate. This study explores novel methodologies for measuring these developments, covering the 1) framing effects of traditional media and the 2) user-sourced content of social media. This quantitative approach yielded the following hypothesis verifications: 1) in an AHP-style online survey, exposure to real and simulated nuclear-related disaster headlines reduced collective partiality towards nuclear power by 3% and 4% respectively, and 2) retrospective opinion mining of Twitter procured a relative increase in negative nuclear-related posts of 38% and 134% in Japanese and English respectively, from the pre to post-Fukushima world. Using nuclear power and Fukushima as a case study, this paper attempts to elucidate both the influence of media on the public sphere, and the influence of the public sphere on policy and policymakers. From the results it is possible to make the conjecture that a lack of scientific education with regard to energy issues increases the former influence, and similarly reduces the latter. Keywords: Fukushima |
[37] |
Andrea L. Kavanaugh, Edward A. Fox, Steven D. Sheetz, Seungwon Yang, Lin Tzy
Li, Donald J. Shoemaker, Apostol Natsev, and Lexing Xie.
Social media use by government: From the routine to the critical.
Government Information Quarterly, 29(4):480 - 491, 2012.
Social Media in Government - Selections from the 12th Annual
International Conference on Digital Government Research (dg.o2011).
[ bib |
DOI |
http ]
Social media and online services with user-generated content (e.g., Twitter, Facebook, Flickr, YouTube) have made a staggering amount of information (and misinformation) available. Government officials seek to leverage these resources to improve services and communication with citizens. Significant potential exists to identify issues in real time, so emergency managers can monitor and respond to issues concerning public safety. Yet, the sheer volume of social data streams generates substantial noise that must be filtered in order to detect meaningful patterns and trends. Important events can then be identified as spikes in activity, while event meaning and consequences can be deciphered by tracking changes in content and public sentiment. This paper presents findings from an exploratory study we conducted between June and December 2010 with government officials in Arlington, VA (and the greater National Capitol Region around Washington, D.C.), with the broad goal of understanding social media use by government officials as well as community organizations, businesses, and the public at large. A key objective was also to understand social media use specifically for managing crisis situations from the routine (e.g., traffic, weather crises) to the critical (e.g., earthquakes, floods). Keywords: Digital government |
[38] |
Eleanor Clark and Kenji Araki.
Text normalization in social media: Progress, problems and
applications for a pre-processing system of casual english.
Procedia - Social and Behavioral Sciences, 27:2 - 11, 2011.
Computational Linguistics and Related Fields.
[ bib |
DOI |
http ]
The rapid expansion in user-generated content on the Web of the 2000s, characterized by social media, has led to Web content featuring somewhat less standardized language than the Web of the 1990s. User creativity and individuality of language creates problems on two levels. The first is that social media text is often unsuitable as data for Natural Language Processing tasks such as Machine Translation, Information Retrieval and Opinion Mining, due to the irregularity of the language featured. The second is that non-native speakers of English, older Internet users and non-members of the “in-group” often find such texts difficult to understand. This paper discusses problems involved in automatically normalizing social media English, various applications for its use, and our progress thus far in a rule-based approach to the issue. Particularly, we evaluate the performance of two leading open source spell checkers on data taken from the microblogging service Twitter, and measure the extent to which their accuracy is improved by pre-processing with our system. We also present our database rules and classification system, results of evaluation experiments, and plans for expansion of the project. Keywords: Natural Language Processing |
[39] |
Simon Price, Peter A. Flach, Sebastian Spiegler, Christopher Bailey, and Nikki
Rogers.
Subsift web services and workflows for profiling and comparing
scientists and their published works.
Future Generation Computer Systems, 29(2):569 - 581, 2013.
Special section: Recent advances in e-Science.
[ bib |
DOI |
http ]
Scientific researchers, laboratories, organisations and research communities can be profiled and compared by analysing their published works, including documents ranging from academic papers to web sites, blog posts and Twitter feeds. This paper describes how the vector space model from information retrieval, more normally associated with full text search, has been employed in the open source SubSift software to support workflows to profile and compare such collections of documents. SubSift was originally designed to match submitted conference or journal papers to potential peer reviewers based on the similarity between the paper’s abstract and the reviewer’s publications as found in online bibliographic databases such as Google Scholar. The software is implemented as a family of RESTful web services that, composed into a re-useable workflow, have already been used to support several major data mining conferences. Alternative workflows and service compositions are now enabling other interesting applications, such as expert finding for the press and media, organisational profiling, and suggesting potential interdisciplinary research partners. This work is a useful generalisation and proof-of-concept realisation of an engineering solution to enable RESTful services to be assembled in workflows to analyse general content in a way that is not immediately available elsewhere. The challenges and lessons learned in the implementation and use of SubSift are discussed. Keywords: Workflows |
[40] |
Sandra Servia-Rodríguez, Rebeca P. Díaz-Redondo, Ana Fernández-Vilas,
Yolanda Blanco-Fernández, and José J. Pazos-Arias.
A tie strength based model to socially-enhance applications and its
enabling implementation: mysocialsphere.
Expert Systems with Applications, 41(5):2582 - 2594, 2014.
[ bib |
DOI |
http ]
The growing omnipresence of the Social Web and the increasing number of services in the Cloud have created a favourable atmosphere to develop socially-enhanced services, that is, services which are aware of the social dimension of the users to improve their experience in the Cloud. This paper introduces a model and an architecture for socially-enhanced services by mining interaction information across different Social Web sites. Most of the existing social applications require knowing who are the users socially-linked to each individual by exploring topological connections in social networks or, even, calculating the interactions network that underlies social sites. However these approaches are, on the one hand, hardly scalable when the number of users grows in the interaction network and, on the other hand, tightly coupled to the social application and so hardly reusable. The key contribution of this paper is a user-centred model whose goal is not to infer the aforementioned interaction network, but to build users’ social spheres. That is, assessing the strength and the context of the user’s ties by using signs of interaction available from social sites' APIs (private messages, retweets, mentions, …) with user’s permission. To this aim, contrary to previous approaches, we take into account (i) different interaction types and contexts, (ii) the time in which interactions occur, (iii) the people involved in them and (iv) the interaction rhythms with the rest of user’s contacts. A prototype of this service has been implemented in order to, not only validate the tie strength model, but also to deploy some pilot experiences. Keywords: Social Web |
[41] |
Daniel Scanfeld, Vanessa Scanfeld, and Elaine L. Larson.
Dissemination of health information through social networks: Twitter
and antibiotics.
American Journal of Infection Control, 38(3):182 - 188, 2010.
[ bib |
DOI |
http ]
Background This study reviewed Twitter status updates mentioning “antibiotic(s)” to determine overarching categories and explore evidence of misunderstanding or misuse of antibiotics. Methods One thousand Twitter status updates mentioning antibiotic(s) were randomly selected for content analysis and categorization. To explore cases of potential misunderstanding or misuse, these status updates were mined for co-occurrence of the following terms: “cold + antibiotic(s),” “extra + antibiotic(s),” “flu + antibiotic(s),” “leftover + antibiotic(s),” and “share + antibiotic(s)” and reviewed to confirm evidence of misuse or misunderstanding. Results Of the 1000 status updates, 971 were categorized into 11 groups: general use (n = 289), advice/information (n = 157), side effects/negative reactions (n = 113), diagnosis (n = 102), resistance (n = 92), misunderstanding and/or misuse (n = 55), positive reactions (n = 48), animals (n = 46), other (n = 42), wanting/needing (n = 19), and cost (n = 8). Cases of misunderstanding or abuse were identified for the following combinations: “flu + antibiotic(s)” (n = 345), “cold + antibiotic(s)” (n = 302), “leftover + antibiotic(s)” (n = 23), “share + antibiotic(s)” (n = 10), and “extra + antibiotic(s)” (n = 7). Conclusion Social media sites offer means of health information sharing. Further study is warranted to explore how such networks may provide a venue to identify misuse or misunderstanding of antibiotics, promote positive behavior change, disseminate valid information, and explore how such tools can be used to gather real-time health data. Keywords: Antibiotic |
This file was generated by bibtex2html 1.96 and compiled by Subasish Das.