Economic Insights from Reddit Discussions: A Text-to-Data Sentiment Analysis on Financial Strain

R | Sentiment Analysis | Text Mining | Data Preprocessing | Machine Learning | K-means Clustering | API Integration | Quanteda | ggplot2 | Data Visualization | Economic Forecasting | Risk Management | Parallel Processing | NLP | Business Analytics
Project Description
In this project, I used sentiment analysis and text mining in R to analyze discussions of financial strain on Reddit. Using the Reddit API, I collected a large volume of user-generated content from multiple subreddits, focusing on topics related to economic hardship in America. The project shows how sentiment analysis, combined with data science and business analytics methods, can surface public perceptions of current economic challenges.

I applied a dictionary-based sentiment analysis model and custom text-cleaning methods to extract, tokenize, and analyze discussions of financial strain, wage stagnation, credit score problems, and loan defaults. K-means clustering then uncovered underlying patterns in user sentiment, showing how economic stressors affect various demographic groups. The results can inform decision-makers about the public’s financial sentiment and guide policy responses to economic issues such as unemployment and loan defaults.
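As a rough illustration of dictionary-based scoring, the sketch below cleans a comment, tokenizes it, and averages the polarity of matched terms over all tokens. It uses base R only. The tiny `lexicon`, the helper names `clean_tokens` and `score_sentiment`, and the sample comments are hypothetical stand-ins, not the dictionary, code, or data used in the project.

```r
# Hypothetical mini-lexicon (assumption); a real analysis would use a
# published sentiment dictionary with far more entries.
lexicon <- c(debt = -1, default = -1, broke = -1, stress = -1,
             raise = 1, saved = 1, relief = 1, approved = 1)

# Lowercase the text, drop URLs and non-letter characters, split on whitespace.
clean_tokens <- function(text) {
  text <- tolower(text)
  text <- gsub("https?://\\S+", " ", text, perl = TRUE)
  text <- gsub("[^a-z' ]", " ", text)
  tokens <- strsplit(text, "\\s+")[[1]]
  tokens[nzchar(tokens)]
}

# Score = sum of polarities of matched tokens / total number of tokens.
score_sentiment <- function(text, lexicon) {
  tokens <- clean_tokens(text)
  if (length(tokens) == 0) return(0)
  hits <- lexicon[tokens]  # NA for tokens that are not in the lexicon
  sum(hits, na.rm = TRUE) / length(tokens)
}

# Made-up example comments (assumption), not real Reddit data.
comments <- c(
  "Missed another payment, the debt stress is unreal",
  "Finally got a raise and saved enough for rent!",
  "Check https://example.com for details"
)
scores <- vapply(comments, score_sentiment, numeric(1),
                 lexicon = lexicon, USE.NAMES = FALSE)
print(round(scores, 3))
```

Dividing by the token count keeps long and short comments comparable. A production pipeline would likely also handle negation and stop words, which this sketch ignores.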
Project Skills
Data Mining & Text Analysis: Used the Reddit API to extract large datasets of user discussions related to financial strain, then preprocessed the text extensively to clean and prepare it for analysis.

Sentiment Analysis & Business Analytics: Implemented dictionary-based sentiment analysis models to analyze user comments and identify key economic themes, including unemployment, mortgage stress, and personal finance crises.

Clustering & Visualization: Employed K-means clustering to uncover latent patterns in the discussions, and visualized the results with ggplot2, including bubble charts that highlight the most relevant terms and their economic implications.

Risk Management & Financial Insights: Developed an analytical framework to evaluate public sentiment on sensitive financial topics, offering insights into wage stagnation, loan defaults, and credit score issues that can support economic forecasting.

Parallel Processing & API Integration: Used parallel processing for efficient data extraction and API integration to capture real-time data from Reddit, enabling economic analysis based on social discussions.
Project Demonstrates
This project demonstrates my ability to turn raw Reddit discussions into actionable insights on financial strain in America through text mining and data preprocessing. Translating social media discussions into business strategy and economic insight showed how economic stress shapes household financial behavior and highlighted areas for strategic intervention. On the machine learning side, I applied K-means clustering and silhouette analysis to group financial discussions into meaningful clusters, giving a clearer picture of public concerns about economic inequality. The resulting insights reflect public sentiment on financial hardship and economic stress and offer useful data for businesses and policymakers. My experience with R programming, sentiment analysis, and data visualization allowed me to analyze complex datasets and produce visuals that communicate the key findings to stakeholders.
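One common way to pair K-means with silhouette analysis is to fit several candidate cluster counts and keep the one with the highest average silhouette width. The sketch below shows that idea on synthetic two-dimensional data. The `features` matrix, the candidate range `2:6`, and the `nstart` value are assumptions chosen for illustration and do not reflect the features or settings used in the project.

```r
library(cluster)  # recommended package shipped with R; provides silhouette()

set.seed(42)

# Synthetic stand-in (assumption) for per-comment numeric features,
# drawn from three well-separated groups of 30 points each.
features <- rbind(
  matrix(rnorm(60, mean = -2, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean =  0, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean =  2, sd = 0.3), ncol = 2)
)

# Pairwise Euclidean distances, reused for every silhouette computation.
d <- dist(features)

# Average silhouette width for each candidate number of clusters.
candidate_k <- 2:6
avg_width <- vapply(candidate_k, function(k) {
  fit <- kmeans(features, centers = k, nstart = 25)
  mean(silhouette(fit$cluster, d)[, "sil_width"])
}, numeric(1))

# Keep the k with the best average silhouette width and refit.
best_k <- candidate_k[which.max(avg_width)]
final_fit <- kmeans(features, centers = best_k, nstart = 25)

print(data.frame(k = candidate_k, avg_silhouette = round(avg_width, 3)))
print(best_k)
```

With real text data, the rows of `features` would typically come from a weighted document-feature matrix, and the silhouette curve is usually flatter than in this clean synthetic case, so the chosen k deserves a manual sanity check.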