2.1. Telecommuting
Telecommuting, also known as remote work or telework, is a work arrangement in which employees work from a location other than the traditional office, usually their home or another remote location, using digital tools to communicate and collaborate with colleagues and complete their work tasks (Nilles 1988). Telecommuting has become increasingly popular in recent years, driven by technological advancements, changing work cultures, and the need for flexibility and work-life balance. In particular, the COVID-19 pandemic caused an overnight change in work culture by making telecommuting a necessity for social distancing, and many acknowledge that the pandemic accelerated the long-term shift toward telecommuting (Guyot and Sawhill 2020; Parker, Horowitz, and Minkin 2022). However, the sudden shift from in-person collaboration to remote teamwork left many employers and employees struggling to adopt telecommuting. As a result, the literature on telecommuting has only recently started to grow significantly.
To gain a deeper understanding of how to improve workplace culture when many employees telecommute, it is necessary to examine both employers’ and employees’ attitudes toward telecommuting. However, due to the difficulty of collecting data from managers, the literature has focused on whether employees perceive telecommuting as a desirable option and what factors influence this attitude. Employees’ preferences for telecommuting have been shown to be influenced by constraints, opportunities, and socio-demographic factors. Previous studies define constraints as factors that might limit an employee's ability to work remotely, such as the suitability of their home environment or the nature of their job tasks. Opportunities, on the other hand, are factors that make telecommuting a desirable option, such as saving commute time, which helps maintain a better work-life balance (Balbontin, Hensher, and Beck 2022; Mohammadi et al. 2022; Nguyen 2021; Nguyen and Armoogum 2021; Salon et al. 2022).
Nguyen and Armoogum (2021) conducted a study during the pandemic to investigate the effects of constraints and opportunities on attitudes and perceptions towards telecommuting and how these differ between male and female employees. The research revealed that females generally have a more favorable attitude towards telecommuting than males, but this is negatively affected by household chores such as childcare. In contrast, male employees' preferences are influenced by prior telecommuting experience, perceived productivity at home, and income. In another study using the same dataset, Nguyen (2021) investigated attitudes toward full-time and part-time telecommuting. This study found that demographic factors such as age, gender, education, and income, as well as prior telecommuting experience and home environment constraints, influence telecommuting preferences. In addition, job-related factors such as employer policy, job type, data accessibility, commute distance, and other home environment attributes such as employee productivity at home, were identified as significant factors.
The successful implementation of telecommuting depends on various factors, such as the differences in profiles between telecommuters and non-telecommuters, the importance of employer telecommuting policies, employee productivity, remote collaboration, and work-life balance challenges. These factors have been emphasized in the literature on telecommuting since the COVID-19 pandemic (Balbontin, Hensher, and Beck 2022; Beck and Hensher 2022; Mohammadi et al. 2022; Nguyen 2021; Nguyen and Armoogum 2021; Tahlyan et al. 2022). For example, Beck and Hensher (2022) conducted a study in Australia to investigate the positive and negative impacts of working from home on employees, employers, and society. The study revealed that the rapid adoption of telecommuting caused many employees to face challenges related to remote collaboration and technology, such as reduced collaboration and innovation, poor internet connectivity, inadequate home office setups, and difficulties accessing company systems remotely. The authors also highlighted the difficulty of maintaining a boundary between work and life and the feeling of social isolation as consequences of full-time telecommuting.
Tahlyan et al. (2022) studied the factors that affect employee satisfaction with telecommuting, using a dataset of 318 employees in the US. The study revealed that certain groups, including those with children at home, disabilities, and lower income and education levels, had less success with telecommuting. Conversely, the authors found that satisfaction with telecommuting improved when employees had job autonomy (such as flexibility in their work schedule, workplace, and how they performed their tasks), task variety, and connectivity with co-workers.
Some authors of this article were part of a team that conducted a 3-phase nationwide survey in the US from early 2020 to late 2021 to explore changes in household activities, such as work behavior, due to the COVID-19 outbreak (Capasso Da Silva et al. 2021; Chauhan et al. 2022; Chauhan, Bhagat-Conway, et al. 2021; Chauhan, Capasso Da Silva, et al. 2021; Javadinasr et al. 2022; Mirtich et al. 2021; Mohammadi et al. 2022; Nafakh et al. 2022a; 2022b; Salon et al. 2022; 2021). Based on the first wave of the survey conducted throughout 2020, Salon et al. (2022) examined which factors influence the ability to telecommute and how frequently workers do so. They found that higher educational attainment and income, together with specific job categories, largely determine whether workers have the option to telecommute.
Mohammadi et al. (2022) utilized data from the first and second waves of the same survey to investigate employees' preferences for telecommuting, considering the impact of unobserved behaviors such as productivity at home and COVID-19 risk perception. The study found that risk perception and productivity at home positively influence preferences for telecommuting. Factors such as home environment attributes (e.g., childcare, distractions), job-related factors (e.g., job type, lack of required technology), and work-life balance opportunities (e.g., saving commute time) also impacted job productivity. The study also found that preferences for telecommuting are affected by education, age, income, commute trip features, and prior telecommuting experience.
This study contributes to the existing literature in four ways. First, it comprehensively examines US employees' thoughts and opinions on telecommuting by analyzing their social media posts. This approach results in a larger sample size due to the high activity of people on social media, leading to a more accurate representation of employees. Second, the study explores sentiments toward telecommuting along both temporal and spatial dimensions. Third, the study analyzes telecommuting data collected in all three waves of COVID Future, a recent dataset on activity-travel behavior collected across the US. Fourth, the study compares the sentiments toward telecommuting shared on social media with those obtained through surveys to determine the similarities and differences between the information collected from the two sources.
2.2. Sentiment Analysis on Twitter data
Natural Language Processing (NLP) is a field of Machine Learning (ML) and linguistics that aims to model language using computational methods. NLP offers much promise given its ability to process large amounts of text automatically in a fraction of the time a human would need for the same task. One of the most prominent applications of NLP is sentiment analysis, which processes opinions and subjective text, mainly for classification purposes. It can be applied to large numbers of user reviews, public opinions, and social media posts, which in turn help assess product performance, elections, and major public events (Tul et al. 2017). The main benefit of sentiment analysis is that it offers an interface between large amounts of unstructured text and the structured data mined from these text sources. For this reason, sentiment analysis methods have gained popularity among researchers, who can now incorporate insights from different text sources in qualitative and quantitative research. One of the most prominent sources of text data is Twitter. Twitter posts offer real-time access to public plans and perceptions on all sorts of topics, not only from individuals but also from corporations, government organizations and figures, NGOs, and all kinds of public figures. This has made Twitter a natural candidate for text mining in various applications such as public policy evaluation, stock prediction, and advertisement.
The first attempts to extract sentiment from tweets used Naïve Bayes (NB), Maximum Entropy (ME), and Support Vector Machines (SVM) to classify tweets as either positive or negative, reaching 83.0% accuracy (Go, Bhayani, and Huang 2019). Pak and Paroubek (2010) extracted sentiment from tweets using NB and ME algorithms, with the addition of a neutral class that extended the problem from binary to multi-class classification. Other approaches tackled the problem by considering the relevance of the tweets. For example, Barbosa and Feng (2010) introduced a first binary classification model to determine whether the tweet itself reflected an opinion that could be labeled as positive or negative. A second binary classification model was then used to determine its sentiment. The study achieved a maximum accuracy of 81.9% with SVM. A substantially different approach was taken by Davidov et al. (2010), who included emoticons, hashtags, and punctuation as essential features in both the data collection process and the binary sentiment classification. This approach achieved an F1 score of 86.0%.
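As a rough illustration of the kind of NB classifier used in these early studies, the following minimal sketch trains a multinomial Naïve Bayes model with add-one smoothing on a handful of labeled tweets. It is not the implementation of any cited study: the toy tweets, the whitespace tokenizer, and the function names (train_naive_bayes, predict_sentiment) are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_tweets):
    """Count class and per-class word frequencies from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        for word in text.lower().split():  # naive whitespace tokenizer (assumption)
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict_sentiment(text, model):
    """Return the label with the highest log-posterior under the NB model."""
    class_counts, word_counts, vocabulary = model
    total_tweets = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_tweets)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocabulary:  # words unseen in training are ignored
                # Add-one (Laplace) smoothed word likelihood.
                likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, invented for this sketch.
toy_tweets = [
    ("love working from home", "positive"),
    ("great day no commute", "positive"),
    ("hate slow internet at home", "negative"),
    ("terrible video calls today", "negative"),
]
model = train_naive_bayes(toy_tweets)
label_a = predict_sentiment("love no commute", model)
label_b = predict_sentiment("terrible slow internet", model)
```

Adding a neutral class, as in Pak and Paroubek (2010), would only require a third label in the training data; the scoring loop itself is unchanged.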
SVM played a significant role in developing the first sentiment analysis (SA) approaches for Twitter data. Bakliwal et al. (2012) trained an SVM classifier on 11 Twitter-specific features and features that arose from NLP pre-processing techniques (e.g., stemming and spelling correction). Other authors also obtained good results by combining SVM with proper feature selection (CBalabantaray, mohd, and Sharma 2012; Gokulakrishnan et al. 2012; Kiritchenko, Zhu, and Mohammad 2014). Several authors further investigated the impact of feature definition and selection (Agarwal and Sabharwal 2012; Aisopos, Papadakis, and Varvarigou 2011; Aston, Liddle, and Hu 2014; Hamdan, Béchet, and Bellot 2013; Kouloumpis, Wilson, and Moore 2011; Saif, He, and Alani 2012).
The second half of the 2010s saw a rapid increase in the capabilities and applications of neural networks and the boom of deep learning. The ability of neural networks to extract abstract features enabled a new generation of approaches. Convolutional Neural Networks (CNN) were introduced for sentiment analysis on tweets by Severyn and Moschitti (2015), whose architecture contained a single convolutional layer to extract features from individual sentences inside the tweets and classify them individually. Another approach using complex neural network architectures was developed by Tang et al. (2014), in which three neural networks were employed to learn word embeddings specific to sentiment analysis, which were then used as features.
An adaptive Recursive Neural Network (RNN) was proposed for Twitter SA by Dong et al. (2014), in which a dependency tree was used to track and propagate sentiment across words and their corresponding syntactically related targets. The authors also introduced their own manually labeled dataset for this task. This dataset was later used by Vo and Zhang (2015), whose approach separately modeled the context before and after a specific target in each tweet. Additionally, rich features were extracted automatically.
Different combinations and architectures of these deep learning methods were also implemented to extract sentiment from tweets. In 2017, the seminal work on Transformers was published by Vaswani et al. (2017). Transformer models arose mainly to overcome the limitations of previous language models in capturing context-dependent variations in word representations. For instance, even though the word “left” can have completely different meanings depending on the context, previous models offered only a single embedding per word. Transformers consist of two main modules: an encoder and a decoder. Both leverage the attention mechanism, which assigns a score to every word in a sentence to determine its relevance in building context-aware representations of the other words in the sentence. The encoder module generates an embedding for each word, which is then refined by aggregating information from the context words. Similarly, the decoder generates sequential outputs by attending to previously generated outputs and the encoded embeddings.
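The attention mechanism described above can be sketched as scaled dot-product attention, the core operation in Vaswani et al. (2017). The snippet below is a minimal, single-head illustration without the learned projection matrices, masking, or multi-head structure of a full Transformer; the two-dimensional toy embeddings and the function names are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, weight every value by softmax(q . k / sqrt(d_k))."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Relevance score of every word (key) for the current word (query).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        # Context-aware representation: weighted sum of the value vectors.
        out = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical embeddings for a three-word sentence (values invented for this sketch).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Self-attention: queries, keys, and values all come from the same sentence.
contextual, attention = scaled_dot_product_attention(embeddings, embeddings, embeddings)
```

Each row of attention sums to one, so every output vector is a convex combination of the input embeddings. This is how the representation of a word such as “left” can differ depending on the words around it.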
One of the most important applications of Twitter text mining and sentiment analysis is prediction. For instance, Yao and Qian (2021) analyzed people's work and rest patterns to predict next-day morning traffic congestion, showing promising signs of improved prediction accuracy compared to traditional methods. Another study employed a deep Bi-directional Long Short-Term Memory (LSTM) stacked autoencoder model using Twitter, traffic, and weather data for short-term urban traffic prediction, showing improved accuracy over classical and machine learning models (Essien et al. 2021). Ratnani and Kumar (2021) investigated how Twitter data related to sporting events correlate with passenger flow, aiming to improve metro transit system management and traffic control. Tweetluenza, a linear regression model utilizing cross-lingual Twitter data, effectively predicted influenza prevalence and hospital visits in the UAE, with improved accuracy when combining English and Arabic tweets (Alkouz, Aghbari, and Abawajy 2019). Valencia et al. (2019) explored the use of neural networks (NN), SVM, and random forest (RF) with Twitter and market data to predict cryptocurrency market movements, finding that neural networks outperformed the other models. Another study developed an algorithm for analyzing flood-related disaster tweets, categorizing them by priority and predicting user locations using a Markov model, achieving 81% classification accuracy and 87% location prediction accuracy (Singh et al. 2019). Using Twitter data and sentiment analysis, Yavari et al. (2022) predicted election results based on positive-to-negative message ratios, achieving high accuracy in predicting the 2020 US presidential election. More recently, Sun et al. (2023) showed that Twitter sentiment can be used to explain variations in mobility and activity participation, using Kyoto, Japan as a case study.