Small Wars Journal

Catalonia: Secession, Sentiment Scoring, and State Media in Irregular Warfare

Tue, 02/22/2022 - 3:55pm



By Bryce Vincent Johnston





Digital technology has created new ways for states to influence events across the world. This paper explores a new open-source toolset that analysts can use to understand these threats by applying sentiment analysis to the Catalan Independence movement. On October 27th, 2017, the eyes of the world turned to Catalonia. Earlier that month, the region held a referendum where 90% of voters chose independence from Spain.[1] Although the Spanish government had declared the referendum illegal, the Catalan Parliament chose to uphold the result on October 27th and issued an official declaration of independence from Spain. The Spanish government dissolved the Catalan Parliament and instituted direct rule.[2] Police clashed with protestors on the street, leading to bloody conflicts.[3]

The Catalan Independence movement is just one example of how competing narratives can hijack separatist movements. Since WWII, the character of war has changed from being defined by interstate conflict to being defined by intrastate conflict. Since the end of the Cold War, the number of civil wars has increased almost three-fold.[4]


Fig 1


While the information space surrounding separatist movements is often chaotic, modern textual-analysis tools may help researchers and policymakers cut through the noise to understand the underlying goals of state actors. These tools are typically used in business settings to make sense of the uncertainty surrounding quarterly earnings reports, but they may be able to perform the same function for separatist movements.

This form of analysis dates to WWII. Under the direction of the RAND Corporation, Alexander George studied the predictions the Federal Communications Commission (FCC) made about the future political actions of Nazi Germany based on German propaganda. While George believed that the FCC’s form of propaganda analysis was useful in inferring Nazi Germany’s underlying motives, its textual analysis was limited by human bias. Now that open-source machine learning tools can reduce human subjectivity in text classification, it is time to revisit these methods to see if they can help us make sense of the confusing information space surrounding separatist movements.

Using resources available for free on the internet, this paper conducts a sentiment analysis of 500 articles concerning the Catalan Independence movement from the state media outlets of the five permanent members of the United Nations Security Council. The analysis indicates that policy preferences do influence sentiment, but this influence is moderated by the editorial standards of the publications. Perhaps the more important finding from the study, however, is the validation of open-source tools as a viable method of analyzing irregular warfare.


The information space in the Catalan Revolution was shaped by propaganda. The Center for Propaganda Analysis at Princeton defines propaganda as an “expression of opinion or action by individuals or groups deliberately designed to influence opinions or actions of other individuals or groups with reference to predetermined ends.”[5] Propaganda plays an especially important role in independence movements due to the cultural and political fissures that separate the two sides. George’s school of propaganda analysis attempts to use messaging to understand the policy preferences of the propagandist.[6]

The key to propaganda analysis was the idea that the propaganda was not “spontaneous” or “reactive”.[7] George assumes that propaganda is aimed at producing effects in target populations.  George lists two populations: the domestic population, and the enemy camp. Elites then attempt to use propaganda to have a depriving or an indulging effect on these populations. George uses the word indulging to indicate propaganda that has a positive effect on the population. For the domestic audience, this is used to build morale. For the enemy camp, this can be used to create unrealistic expectations. Deprivation has the opposite effect. When used on a domestic audience, it is meant to temper expectations. For the enemy camp, it is used to demoralize the members of the camp.[8] The matrix below offers an example of his methodology:


Table 1

The effectiveness of this strategy has been subject to debate. George attempted to evaluate the accuracy of predictions made by the Federal Communications Commission (FCC) during WWII. George and the FCC stated that one of the issues with propaganda analysis at the time was the lack of quantitative tools to study content. For this reason, analysts could influence their inferences with personal biases. This same issue was also common in George’s study of FCC data. In a review of the book, Lee states that the quality of the data used in the study makes it difficult to know if his results are significant or not.[9] He contends that George and the FCC did not fully grasp “the highly relative and processual nature of content data”.[10] At the time, there was no objective way to categorize his data which made his framework impractical for future use. For this reason, George’s methods have not received much attention in recent years.

Open-Source Sentiment Analysis Tools

New quantitative tools could solve many of the problems associated with deriving political inferences from propaganda. Sentiment analysis, a tool mainly used by marketers and investors, can address the issue of objectivity present in George’s work.[11] This method works by analyzing pieces of text and assigning values to words. These values are usually associated with positive or negative traits such as optimism or uncertainty, but they can also be set to study emotions or other criteria chosen by the analyst. Once each word has been assigned a value, the program tallies up the score for the entire piece. Sentiment analysis is useful for three reasons. First, it can provide real-time analysis of text such as a social media feed. Second, it can sort data at scale. Finally, it reduces the human bias involved in textual analysis.[12]
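The word-scoring procedure described above can be illustrated with a minimal Python sketch. It is not the tool used in this study: the two word lists are hypothetical stand-ins for a published sentiment dictionary, and the scoring rule (positive hits minus negative hits, divided by total words) is one common convention that is assumed here for illustration.

```python
import re

# Hypothetical mini-dictionaries for illustration only; a real analysis
# would load a published sentiment lexicon instead.
POSITIVE = {"peaceful", "support", "progress", "victory", "agreement"}
NEGATIVE = {"violence", "illegal", "crisis", "clash", "unrest"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return (positive hits - negative hits) / total words, or 0.0 if empty."""
    words = tokenize(text)
    if not words:
        return 0.0
    positive_hits = sum(1 for word in words if word in POSITIVE)
    negative_hits = sum(1 for word in words if word in NEGATIVE)
    return (positive_hits - negative_hits) / len(words)


if __name__ == "__main__":
    # Two of four words are in the positive list, so the score is 0.5.
    print(sentiment_score("Leaders reach peaceful agreement"))
```

Dividing by the total word count keeps long and short articles comparable, which is also why scores tend to sit close to zero: most words in a news article appear in neither list.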

Results from Open-Source Analysis

The dataset consists of 500 articles that referenced the Catalan Independence movement from January 2017 to May 2021. This dataset included articles from the British Broadcasting Corporation (BBC), France 24, RT, Voice of America, and The Global Times. The articles were last updated on May 27th, 2021. The results show that publications used positive language in their articles regardless of their political affiliation. As mentioned before, the overall scores were low compared to the total possible scores, but plotting each article reveals a wide distribution within publications.


Fig 2


After the Global Times, Voice of America had the tightest distribution, which was very close to zero. While the publication was not neutral in its coverage of the independence movement, there is evidence that its editorial guidelines reined in some of the bias. The distribution within the dataset is evidence that Voice of America has lived up to its promise to be a non-partisan, independent publication within the U.S. government.

France 24 had a tight distribution that was interrupted by the presence of several outliers. Both positive and negative outliers for France 24 stretch its distribution away from a neutral outlook. This may be evidence of the difficulties of its recent digital migration. As the publication restructured itself, it may have relinquished some control over its journalists, which allowed positive and negative articles to seep through.

The BBC is the only publication whose distribution skews negative. While its distribution is not as tight as that of Voice of America or the Global Times, its articles use consistently negative language to describe Catalan Independence. This negative bias is not as strong as the positive bias observed in the other four publications. In this sense, the BBC may be one of the most neutral sources despite its wider distribution of scores. This is aligned with the charter of the BBC, which sets up the publication as a source of truth that reflects the culture of the United Kingdom.

RT has the widest distribution among the five publications, as its coverage includes the most positive articles and one of the most negative articles. While the overall trend of its bias is positive, it has many articles that use negative language as well. This may be evidence of a strategy to fill the information space with as much polarizing material as possible. Of the five publications, RT had the largest number of articles referencing the Catalan Independence movement.

Among the publications, The Global Times was the only one whose articles were entirely positive. Their distribution is also the tightest of the five publications. While this could be due to the large word count included in the HTML files, a manual test showed that isolating the article text did not change the score. Strict adherence to the editorial guidelines may be responsible for the tight distribution. This would limit the noise introduced by the views of individual authors and create a tighter message in support of Chinese policy.
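The comparison of "tight" and "wide" distributions above can be made concrete with a short Python sketch. The outlet names and scores below are invented placeholders, not values from this study, and the sample standard deviation is assumed here as the measure of spread.

```python
from statistics import mean, stdev

# Hypothetical per-article sentiment scores; these are NOT the study's data.
EXAMPLE_SCORES = {
    "Outlet A": [0.01, 0.02, 0.015, 0.02],
    "Outlet B": [-0.05, 0.08, 0.01, 0.12],
    "Outlet C": [-0.02, -0.01, -0.03, 0.0],
}


def summarize(scores_by_outlet):
    """Compute the mean, spread, and range of scores for each outlet."""
    summary = {}
    for outlet, scores in scores_by_outlet.items():
        summary[outlet] = {
            "mean": mean(scores),
            "spread": stdev(scores),  # sample standard deviation
            "min": min(scores),
            "max": max(scores),
        }
    return summary


def rank_by_tightness(summary):
    """Return outlet names ordered from tightest to widest distribution."""
    return sorted(summary, key=lambda outlet: summary[outlet]["spread"])


if __name__ == "__main__":
    stats = summarize(EXAMPLE_SCORES)
    for outlet in rank_by_tightness(stats):
        print(outlet, stats[outlet])
```

Reporting the minimum and maximum alongside the spread helps separate an outlet with a few extreme outliers from one whose articles vary widely across the board.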

Plotting the points in a quadrant reveals interesting mismatches between the publications’ stated preferences and their sentiment. Despite its tepid public support for a unified Spain, the 95% confidence interval for France 24 does not include zero, showing that on average its articles do not display neutral sentiment. The average positive sentiment may be an overcorrection to the negative news that came out of Catalonia during this time. Whatever the cause, the preference to stay neutral did not influence how its journalists wrote their articles. Voice of America did not seem to support the policy preferences of the United States either.


Fig 3


On average, they displayed a positive sentiment that was just below RT. One reason for this may be that editorial standards for neutrality may exclude some negative words while including more positive words. Because negative words have a greater emotional impact, the default for writers may have been to use words that have a positive sentiment regardless of the topic.

The Global Times also had a positive sentiment on average. Despite the measurement issues associated with the text-mining process, it is fair to say that their articles have at least a slight positive bias. This may be due to the publication attempting to signal support for the Chinese Communist Party within their articles. In this case, the positive language used for the Party would wash out the negative language used to describe the policy at hand.
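The check of whether a publication's 95% confidence interval includes zero can be sketched as follows. This is an illustrative Python example rather than the procedure used in the study: it assumes a normal approximation for the mean, which is reasonable for large samples, while a t-distribution would be more appropriate for small ones. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(scores, confidence=0.95):
    """Return (low, high) bounds for the mean using a normal approximation."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate spread")
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(scores)
    margin = z * stdev(scores) / sqrt(n)
    return center - margin, center + margin


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


if __name__ == "__main__":
    # Hypothetical scores for illustration only.
    example = [0.02, 0.03, 0.025, 0.035, 0.03]
    interval = mean_confidence_interval(example)
    print(interval, excludes_zero(interval))
```

An interval that excludes zero suggests the outlet's average sentiment is not neutral, although it says nothing about whether the dictionary itself measures the right vocabulary.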


The findings from this study should be of interest to practitioners and scholars who are analyzing politicized information spaces. Because separatist movements can often descend into violence and conflict, this analysis is especially relevant to policymakers as they try to navigate the difficult geopolitical landscape surrounding these movements. As the world sees an increase in foreign intervention in these movements, understanding the motives behind their messaging could prove crucial to limiting their effects on the target population. George’s attempt to infer political actions from propaganda serves as a prototype for a deeper understanding of propaganda analysis.

Three conclusions follow from the analysis. First, policy preferences influence publication sentiment when there are weak commitments to independence and impartiality. Second, analysts can use open-source tools to improve upon George’s propaganda analysis framework, but dictionaries must be updated to reflect language typically used to describe secessionist movements. Third, analysts must use new methods to study regimes that have tight control over their digital content. Overall, open-source tools for text analysis reveal interesting, if noisy, patterns within state media. For sites that have strong commitments to neutrality and independence, publishing guidelines have a moderating effect that decreases the influence of policy preferences on the sentiment of the article. The reverse is true of publications that state a commitment to supporting the policies of their government. These publications often displayed their policy preferences in unsubtle ways.

At the same time, there are major limitations to open-source tools. First, even Google’s advanced algorithms have a difficult time sifting relevant articles from irrelevant articles. This becomes especially challenging for sites that cross-promote articles on their platforms. This makes it difficult to predict what kinds of articles are available to citizens while limiting the power of our analysis. While I observed a normal distribution of articles, there is a correlation between relevance and sentiment. Second, open-source sentiment analysis tools are not optimized for propaganda. Because they are used in the business world, their dictionaries do not pick up on vocabulary that is specific to propaganda. Third, this analysis did not account for articles that were not written. In propaganda, what publications choose to omit is just as important as what they choose to include. While the Global Times covered many aspects of the Independence Movement, an article covering the referendum itself did not appear within my survey of their most relevant articles. This will always be an issue with propaganda analysis and cannot be solved with better tools.

Finally, open-source tools do not work as well on publications that represent governments with heavy internet restrictions. This limitation is perhaps the most unfortunate for Western policymakers, as it makes it difficult for them to understand the goals of authoritarian regimes. The Global Times was difficult to study due to its anti-scraping measures. While I was able to save its data to my local computer, this method added noise to the analysis in the form of excess characters. Because of the large file sizes, it is also not feasible for researchers who need to sift through hundreds of documents.
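One way to reduce the noise from saved HTML files is to strip markup, scripts, and style rules before scoring. The Python sketch below uses only the standard library and is an assumption about how such cleaning could be done, not a description of the method used in this study. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Collect visible text while skipping script and style contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def extract_text(html):
    """Return the visible text of an HTML document as a single string."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

This does not separate the article body from navigation menus or cross-promoted headlines, so a real pipeline would still need site-specific rules to isolate the article itself.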

Even with these limitations, the study showed that current tools could improve George’s methods for propaganda analysis. By improving upon these tools, future analysts will be able to create a much clearer picture of the information environment surrounding secessionist movements. Hopefully, this will allow policymakers to make better decisions surrounding foreign intervention and reduce the likelihood of conflict within these regions.




Bosetti, Louise, James Cockayne, Cale Salih, and Wilfred Wan. “Civil War Trends and the Changing Nature of Armed Conflict,” n.d., 10.

Buckley, Nicole, Morgan Wack, Joey Schafer, and Martin Zhang. “Inconsistencies in State-Controlled Media Labeling.” Election Integrity Partnership. Accessed May 17, 2021.

Cantril, Hadley. “Propaganda Analysis.” The English Journal 27, no. 3 (March 1938): 217–21.

D’Andrea, Alessia, Fernando Ferri, Patrizia Grifoni, and Tiziana Guzzo. “Approaches, Tools and Applications for Sentiment Analysis Implementation.” International Journal of Computer Applications 125, no. 3 (September 17, 2015): 26–33.

Dolak, Kevin. “Hundreds Injured in Catalonia as Spanish Police Crack down on Referendum.” ABC News, October 1, 2017.

George, Alexander L. “Propaganda Analysis.” Evanston and New York, 1959.

George, Alexander. “Prediction of Political Action By Means of Propaganda Analysis.” Rand Corporation, December 22, 1955.

Lee, Alfred McClung. “Reviewed Work: Propaganda Analysis: A Study of Inferences Made from Nazi Propaganda in World War II.” Edited by Alexander L. George. American Sociological Review 25, no. 3 (1960): 432–33.

Mejova, Yelena. “Sentiment Analysis: An Overview.” University of Iowa Department of Computer Science, November 16, 2009, 34.

Proellochs, Nicolas, and Stefan Feuerriegel. “Package ‘SentimentAnalysis.’” CRAN, February 18, 2021.

“The Latest: Catalonia: 90 Percent Vote for Independence.” AP News, May 1, 2021.

Uppsala Conflict Data Program. “Fatalities Since 1989.” Uppsala University, 2021.



[1] “The Latest: Catalonia: 90 Percent Vote for Independence,” AP News, May 1, 2021.

[2] Jeannette Neumann and Marina Force, “Spain Seizes Power in Catalonia After Region Declares Independence,” Wall Street Journal, October 27, 2017.

[3] Kevin Dolak, “Hundreds Injured in Catalonia as Spanish Police Crack down on Referendum,” ABC News, October 1, 2017.

[4] Louise Bosetti, James Cockayne, Cale Salih, and Wilfred Wan, “Civil War Trends and the Changing Nature of Armed Conflict,” n.d., 10.

[5] Hadley Cantril, “Propaganda Analysis,” The English Journal 27, no. 3 (March 1938): 217.

[6] Alexander George, “Prediction of Political Action By Means of Propaganda Analysis” (Rand Corporation, December 22, 1955).

[7] Ibid., 7.

[8] Alexander L. George, “Propaganda Analysis,” Evanston and New York, 1959.

[9] Alexander L. George, “Propaganda Analysis.”

[10] Alfred McClung Lee, “Reviewed Work: Propaganda Analysis: A Study of Inferences Made from Nazi Propaganda in World War II,” ed. Alexander L. George, American Sociological Review 25, no. 3 (1960): 432–33.

[11] Alessia D’Andrea et al., “Approaches, Tools and Applications for Sentiment Analysis Implementation,” International Journal of Computer Applications 125, no. 3 (September 17, 2015): 29.

[12] Yelena Mejova, “Sentiment Analysis: An Overview,” University of Iowa Department of Computer Science, November 16, 2009, 34.

About the Author(s)

Bryce Johnston is an Intelligence Officer in the United States Army. He holds a BS in American Politics from the United States Military Academy and an MSc in International Development from the IE School of Global and Public Affairs where he studied as a Fulbright Scholar.