Finding data for your visualization projects and assignments should be easy. Here are 20 resources for finding rich datasets.
General Datasets
- UCI Machine Learning Repository: A diverse collection of datasets (360 at the time of writing, and still growing) for analytics and machine learning. http://archive.ics.uci.edu/ml/
- Kaggle datasets: Perfect for exploring data through visualization. https://www.kaggle.com/datasets
- Amazon Public Datasets: Large datasets, typically gigabytes or terabytes in size. https://aws.amazon.com/public-datasets/
- Google Public Data: Datasets provided by Google, including a book corpus, US names, genome data, BigQuery datasets, and many more. https://cloud.google.com/public-datasets/
- Open Data by Socrata: Thousands of free datasets for exploration. https://opendata.socrata.com/
- Data.gov: A website dedicated to supplying datasets across different domains, e.g. education, nutrition, and sports. https://catalog.data.gov/dataset?res_format=CSV
- Datahub: Just as its tagline states, “The easy way to get, share data”. https://datahub.io/dataset?tags=weather
- Harvard Dataverse: Hosts many of the datasets used for research purposes and cited in publications. https://dataverse.harvard.edu/
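Many of these sources offer CSV downloads (the catalog.data.gov link above filters for that format). As a minimal sketch of a first look at such a file before plotting it, the snippet below counts how often each value appears in one column, using only Python's standard library. The `count_by_column` helper, the `SAMPLE` text, and its column names are made up for illustration and do not come from any of the listed sites.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text, column):
    """Count how often each value appears in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Hypothetical sample standing in for the contents of a downloaded CSV file.
SAMPLE = """state,domain
CA,Education
NY,Sports
CA,Sports
"""

counts = count_by_column(SAMPLE, "domain")
# Print values from most to least frequent, ready to feed into a bar chart.
for value, n in counts.most_common():
    print(value, n)
```

For a real file, the text could be read with `open(path, newline="").read()` and passed to the same helper.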
Challenge-Based Datasets
- KDD Data Center: Having trouble coming up with a problem statement? KDD provides both datasets and problem statements through its challenges. http://www.kdd.org/kdd-cup
- CrowdANALYTIX: More challenges to solve with datasets. https://www.crowdanalytix.com/community
- DrivenData: Problems for data scientists to solve. https://www.drivendata.org/competitions/
- Big Data Innovation Challenge: Tackle real problems with analytics, and compete to win a challenge. https://bigdatainnovationchallenge.org/challenges/food-security-nutrition/
Census Datasets
- Open Census Data: Population details for cities in different countries are just a click away with this open data. http://census.okfn.org/en/latest/
- Census.gov: Census data for the United States. http://www.census.gov/data.html
Weather/Climate Datasets
- Wunderground: Want to work with weather data? Use Wunderground’s API to get your own dataset. https://www.wunderground.com/weather/api/
- National Centers for Environmental Information: Climate datasets available for analytics. https://www.ncdc.noaa.gov/cdo-web/datasets
News Datasets
- BBC Dataset: It consists of documents from the BBC news website corresponding to stories in five topical areas. http://mlg.ucd.ie/datasets/bbc.html
- The Guardian: A collection of news datasets from The Guardian, updated regularly. https://www.theguardian.com/news/datablog/interactive/2013/jan/14/all-our-datasets-index
Food and Nutrition Datasets
- United States Department of Agriculture: Data provided by the Center for Nutrition Policy and Promotion, including a food prices dataset and the Healthy Eating Index. https://www.cnpp.usda.gov/data
- Nutritional Science Blog: A blog listing some datasets related to nutrition. http://nutsci.org/open-nutrition-food-data/