2023 Qualitative Datasets
Dataset #1: BTS English Translations
Link: Ask Jonathon
Format: CSV
Access: Free; please cite Jonathon Sun
Main variables: Album, Song, Lyrics
Dataset #2: ChatGPT Sentiment Tweets
Link: https://www.kaggle.com/datasets/charunisa/chatgpt-sentiment-analysis
Format: CSV
Access: Public Domain
Main variables: ID, Tweets, Labels
Dataset #3: Top 10 Songs from Billboard 2022
Link: Ask Jonathon
Format: CSV
Access: Free; please cite Jonathon Sun
Main variables: Line, Section, Song name, Artist name
Dataset #4: Australian Legal Cases from the Federal Court of Australia (2006–2009)
Format: CSV
Access: Public Domain
Main variables: Case ID, case outcomes, case title, case text
Dataset #5: Disney+ Movies and TV shows
Link: https://www.kaggle.com/datasets/shivamb/disney-movies-and-tv-shows
Format: CSV
Access: Public Domain
Main variables: showID, type, title, director, cast, country, date added, release year, rating, duration, listed in, description
Dataset #6: Netflix Movies and TV Shows
Link: https://www.kaggle.com/datasets/shivamb/netflix-shows?select=netflix_titles.csv
Format: CSV
Access: Public Domain
Main variables: showID, type, title, director, cast, country, date added, release year, rating, duration, listed in, description
Dataset #7: Avatar: The Last Airbender
Link: https://www.kaggle.com/datasets/ekrembayar/avatar-the-last-air-bender
Format: CSV
Access: Kaggle
Main variables: #, ID, Book, Book number, chapter, chapter number, character, full text, character words, writer, director, imdb rating
Dataset #8: Tweets with the hashtag #ChatGPT
Link: https://www.kaggle.com/datasets/konradb/chatgpt-the-tweets
Format: CSV
Access: Public Domain
Main variables: username, text, user location, user description, user created, user followers, user friends, user favorites, user verified, date, hashtags, source
Dataset #9: Turkey Earthquake Tweets
Link: https://www.kaggle.com/datasets/gpreda/turkey-earthquake-tweets
Format: CSV
Access: Public Domain
Main variables: ID, username, user location, user description, user created, user followers, user friends, user favorites, user verified, date, text, hashtags, source, retweets, favorites, is retweet
Dataset #10: BBC News
Link: https://www.kaggle.com/datasets/gpreda/bbc-news
Format: CSV
Access: Public Domain
Main variables: Title, pubDate, guid, link, description
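Since every dataset above is distributed as CSV, a quick header check after download can confirm that a file carries the variables listed for it. The following is a minimal sketch using only Python's standard library. The helper name missing_columns and the header-only sample are illustrative assumptions, not part of any dataset, and the column names come from the "Main variables" line for Dataset #10, so their exact spelling in the real file may differ.

```python
import csv
import io

# Column names as listed under "Main variables" for Dataset #10 (BBC News).
# The spelling in the actual file may differ; treat this list as an assumption.
EXPECTED_COLUMNS = ["Title", "pubDate", "guid", "link", "description"]


def missing_columns(csv_file, expected):
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    # Compare case-insensitively, since the catalog's capitalization
    # may not match the file's.
    present = {name.strip().lower() for name in header}
    return [name for name in expected if name.lower() not in present]


# Hypothetical header-only sample standing in for a downloaded file.
sample = io.StringIO("title,pubDate,guid,link,description\n")
print(missing_columns(sample, EXPECTED_COLUMNS))
```

For a real download, the same function can be given a file opened with open(path, newline="", encoding="utf-8"), where path is wherever the CSV was saved. An empty result means all listed variables were found in the header.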