APA Style 7th Edition: Citing Your Sources


Standard Format

Formatting rules, various examples.


Adapted from American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). https://doi.org/10.1037/0000165-000

  • Provide a retrieval date only if the data set is designed to change over time
  • Date for published data is the year of publication
  • Date for unpublished data is the year(s) of collection
  • If a version number exists, include it in parentheses after the title

See Ch. 10, pp. 313-352, of the APA Manual for more examples and formatting rules.

  • Last Updated: Nov 1, 2023 3:17 PM
  • URL: https://libguides.usc.edu/APA7th


Cite a published dataset accessed online or in a database, or unpublished raw data received directly from the researcher or organization. Use other forms to cite theses and dissertations or journal articles.


Citing Sources: Citing Datasets


The minimum data required for an acceptable citation are the name(s) of the data creator(s), title of dataset, publisher, published date, and the URL/DOI where the data was found.
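The minimum elements listed above can be checked mechanically before a citation is formatted. The following is a minimal sketch in Python; the `DatasetCitation` class and the `missing_fields` helper are hypothetical names used for illustration and are not part of any citation library. The sample values are taken from the National Census of Ferry Operators examples in this guide.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DatasetCitation:
    """Minimum elements of a dataset citation (hypothetical container)."""
    creators: List[str]
    title: str
    publisher: str
    published: str  # publication date as given by the source
    url: str        # URL or DOI where the data was found


def missing_fields(citation: DatasetCitation) -> List[str]:
    """Return the names of required elements that are empty."""
    return [f.name for f in fields(citation) if not getattr(citation, f.name)]


ferry = DatasetCitation(
    creators=["Bureau of Transportation Statistics"],
    title="2020 National Census of Ferry Operators",
    publisher="United States Department of Transportation",
    published="March 1, 2022",
    url="https://www.bts.gov/NCFO",
)
print(missing_fields(ferry))  # an empty list means every minimum element is present
```

Any name returned by `missing_fields` points to an element that still has to be looked up before the citation meets the minimum described above.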

Examples of how to cite 2020 National Census of Ferry Operators using seven different style manuals:


  • National Census of Ferry Operators

ACS Guide to Scholarly Communication

Find more information in section: 4.3.5.14 Data & Datasets

Author names are listed in inverted form with periods and spaces: Surname, First Initial. Middle Initial., qualifier if applicable. See Section 4.3.4.1 Authorship for more information.

General Format: Author1; Author2; et al. Title of dataset, ver. ##. Publisher, Published date (format Month Date, Year). DOI/URL

Example: Bureau of Transportation Statistics. 2020 National Census of Ferry Operators. United States Department of Transportation, March 1, 2022. https://www.bts.gov/NCFO

AMA Manual of Style

The AMA Manual of Style 11th Edition does not specifically reference Datasets.

Author names are listed in inverted form without periods and without spaces: Surname, First Initial Middle Initial. See Section 3.7 Authors for more information.

Where the dataset is stored determines which format to use.

The National Census of Ferry Operators was found on a website, so the general format for a website applies: Author1, Author2. Title of dataset. Name of Website. Published date (format Month Date, Year). Updated date (format Month Date, Year). Accessed date (format Month Date, Year). DOI/URL

Example: Bureau of Transportation Statistics. 2020 National Census of Ferry Operators. United States Department of Transportation. Updated March 3, 2022. Accessed June 10, 2022. https://www.bts.gov/NCFO

APA Publication Manual

Find more information in section: 10.9 Data sets

Author names are listed in inverted form with periods and spaces, with an ampersand before the final author's name: Surname, First Initial. Middle Initial., qualifier if applicable. See Section 9.8 Format of the Author Element for more information.

General Format: Author 1, Author 2. (Year Published). Title of dataset (version #) [Data set]. Publisher Name. DOI/URL

Example: Bureau of Transportation Statistics. (2022). 2020 National Census of Ferry Operators [Data set]. United States Department of Transportation. https://www.bts.gov/NCFO
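The APA general format above can be expressed as a small string template. This is a minimal sketch, not an official APA tool: `apa_dataset` is a hypothetical helper, plain text cannot carry the italics APA applies to the title, and the optional `version` argument simply follows the "(version #)" slot in the general format.

```python
from typing import Optional


def apa_dataset(author: str, year: int, title: str, publisher: str,
                url: str, version: Optional[str] = None) -> str:
    """Assemble an APA-style data set reference from its elements."""
    # The version, when present, goes in parentheses right after the title.
    version_part = f" (Version {version})" if version else ""
    return (f"{author.rstrip('.')}. ({year}). {title}{version_part} "
            f"[Data set]. {publisher}. {url}")


print(apa_dataset(
    "Bureau of Transportation Statistics", 2022,
    "2020 National Census of Ferry Operators",
    "United States Department of Transportation",
    "https://www.bts.gov/NCFO",
))
```

Called with the ferry census values, the function reproduces the example reference shown above.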

Chicago Manual of Style

The Chicago Manual of Style does not specifically reference Datasets.

Authors' names are given as they appear in the source itself. If there is more than one author, the order is “Author1 Last Name, Author1 First Name and Author2 First Name Author2 Last Name.” See Sections 14.73 Form of author’s name and 15.12 Authors’ names in reference list entries for more information.

Author-Date References

Using the general format, an Author-Date citation can be constructed as: Author. Title of dataset. Place: Publisher, Year. URL/DOI.

Example: Bureau of Transportation Statistics. 2020 National Census of Ferry Operators. Washington DC: United States Department of Transportation, 2022. https://www.bts.gov/NCFO.

Notes and Bibliography

Using the general format, a Notes and Bibliography citation can be constructed as: Author, Title of dataset (Place: Publisher, Year), URL/DOI.

Example: Bureau of Transportation Statistics, 2020 National Census of Ferry Operators (Washington DC: United States Department of Transportation, 2022), https://www.bts.gov/NCFO.

Scientific Style and Format: The CSE Manual

The CSE Manual does not specifically reference Datasets.

Author names are listed in inverted form without periods and without spaces: Surname, First Initial Middle Initial. See Section 29.3.6.1.1 Personal Authors for more information.

Citation–sequence and citation–name:

Using the general format, a citation–sequence or citation–name citation can be constructed as: Author1, Author2, et al. Dataset title. Publisher Location City (Location State): Publisher; Year [accessed date (format YYYY MMM DD)]. URL/DOI.

Example: Bureau of Transportation Statistics. 2020 National Census of Ferry Operators. Washington DC: United States Department of Transportation; 2022 [accessed 2022 Jun 10]. https://www.bts.gov/NCFO.

Name–year:

Using the general format, a name–year citation can be constructed as: Author1, Author2, et al. Year. Dataset title. Publisher Location City (Location State): Publisher; [accessed date (format YYYY MMM DD)]. URL/DOI.

Example: Bureau of Transportation Statistics. 2022. 2020 National Census of Ferry Operators. Washington DC: United States Department of Transportation; [accessed 2022 Jun 10]. https://www.bts.gov/NCFO.
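The two CSE variants above differ mainly in where the year is placed. The sketch below makes that difference explicit; the function names are hypothetical, and the month abbreviations are hard-coded so the accessed date does not depend on the system locale.

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def cse_accessed(d: date) -> str:
    """Format an access date as [accessed YYYY MMM DD]."""
    return f"[accessed {d.year} {MONTHS[d.month - 1]} {d.day}]"


def cse_citation_sequence(author, title, place, publisher, year, accessed, url):
    """Citation-sequence / citation-name: the year follows the publisher."""
    return (f"{author}. {title}. {place}: {publisher}; "
            f"{year} {cse_accessed(accessed)}. {url}.")


def cse_name_year(author, year, title, place, publisher, accessed, url):
    """Name-year: the year follows the author."""
    return (f"{author}. {year}. {title}. {place}: {publisher}; "
            f"{cse_accessed(accessed)}. {url}.")


args = dict(
    author="Bureau of Transportation Statistics",
    title="2020 National Census of Ferry Operators",
    place="Washington DC",
    publisher="United States Department of Transportation",
    year=2022,
    accessed=date(2022, 6, 10),
    url="https://www.bts.gov/NCFO",
)
print(cse_citation_sequence(**args))
print(cse_name_year(**args))
```

Both calls use the same elements; only the position of the year changes between the two outputs.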

IEEE Guide to Writing in the Engineering & Technical Fields

The IEEE Guide to Writing does not specifically reference Datasets.

Author names are listed in order with periods and spaces for first and middle names: First Initial. Middle Initial. Surname. See Appendix: IEEE Style for References for more information.

Using the general format for IEEE citations, a citation can be constructed as: Author1, Author2, et al., “Title of dataset,” Source, Publication date (Mon. DD, YYYY). [Online]. Available: URL/DOI

Example: Bureau of Transportation Statistics, "2020 National Census of Ferry Operators," United States Department of Transportation, Mar. 01, 2022. [Online]. Available: https://www.bts.gov/NCFO
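The IEEE pattern above can likewise be sketched as a template. `ieee_dataset` is a hypothetical helper, and the month abbreviations in `IEEE_MONTHS` are an assumption based on common IEEE usage; only "Mar." is confirmed by the example above. Authors are joined with commas, as in the general format given here.

```python
from datetime import date

# Assumed IEEE-style month abbreviations; check the IEEE reference guide before relying on them.
IEEE_MONTHS = ("Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
               "July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec.")


def ieee_dataset(authors, title, source, published: date, url: str) -> str:
    """Assemble an IEEE-style dataset reference (without the [n] label)."""
    author_part = ", ".join(authors)
    # Publication date in the "Mon. DD, YYYY" form, with a zero-padded day.
    date_part = f"{IEEE_MONTHS[published.month - 1]} {published.day:02d}, {published.year}"
    return f'{author_part}, "{title}," {source}, {date_part}. [Online]. Available: {url}'


print(ieee_dataset(
    ["Bureau of Transportation Statistics"],
    "2020 National Census of Ferry Operators",
    "United States Department of Transportation",
    date(2022, 3, 1),
    "https://www.bts.gov/NCFO",
))
```

The numeric label such as [1] is left out because it depends on the order of citation in the paper.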

MLA Handbook 9th Ed.

The MLA Handbook does not specifically reference Datasets.

Author names are listed in inverted order with full first name and middle initial: Surname, First Name Middle Initial. See Chapter 5.5 Author: How to Style It of the MLA Handbook 9th Edition for more information.

Using the MLA Format Template, a citation can be constructed as: Author. Title of dataset. Publisher, Date of publication (format DD Month YYYY), Location. DOI/URL

Example: Bureau of Transportation Statistics. 2020 National Census of Ferry Operators. United States Department of Transportation, 01 March 2022, Washington DC. https://www.bts.gov/NCFO

  • Last Updated: Jan 24, 2024 3:01 PM
  • URL: https://guides.lib.uiowa.edu/citations


How to Cite Data: Dataset Citations


Style Manual Examples

APA 6th edition

For a complete description of citation guidelines refer to pp. 210-211 (data set) and p. 212 (unpublished raw data) of the Publication Manual of the American Psychological Association, 6th edition [Call Number: Reference BF76.7 .P83 2010 ].

Basic form: Author/Rightsholder. (Year). Title of data set (Version number) [Description of form]. Location: Name of producer.

or

Author/Rightsholder. (Year). Title of data set (Version number) [Description of form]. Retrieved from http://

Example: Pew Hispanic Center. (2008). 2007 Hispanic Healthcare Survey [Data file and code book]. Retrieved from http://pewhispanic.org/datasets/

Unpublished raw data from study, untitled work

Basic form: Author, F. N. (Year). [Description of study topic]. Unpublished raw data.

Example: Smith, J. A. (2006). [Personnel survey]. Unpublished raw data.

APA Style Guide to Electronic References

For a complete description of citation guidelines refer to p. 16 of the APA Style Guide to Electronic References (2007) [Call Number: Reference BF11 .A6722 2007 ].

Pew Hispanic Center. (2008). 2007 Hispanic Healthcare Survey [Data file and code book]. Available from Pew Hispanic Center Web site: http://pewhispanic.org/datasets/

Note: Available from, rather than Retrieved from, indicates that the URL takes you to a download site, rather than directly to the data set file itself.

Graphic Representation of Data

Centers for Disease Control and Prevention. (2005). [Interactive map showing percentage of respondents reporting "no" to, During the past month, did you participate in any physical activities?]. Behavioral Risk Factor Surveillance System. Retrieved from http://apps.nccd.cdc.gov/gisbrfss/default.aspx

APA 5th edition

For a complete description of citation guidelines refer to p. 264 (unpublished raw data) and p. 281 (data file, available from government agency and from NTIS website) of the Publication Manual of the American Psychological Association, 5th edition [Call Number: Reference BF11 .A672 2001 ].

APSA Style Manual for Political Science

For a complete description of citation guidelines refer to p. 30 of the APSA Style Manual for Political Science .

Data Archived and Available at the Inter-university Consortium for Political and Social Research (ICPSR)

Eldersveld, Samuel J., John E. Jackson, M. Kent Jennings, Kenneth Lieberthal, Melanie Manion, Michael Oksenberg, Zhefu Chen, Hefeng He, Mingming Shen, Qingkui Xie, Ming Yang, and Fengchun Yang. 1996. Four-County Study of Chinese Local Government and Political Economy, 1990 [computer file] (Study #6805). ICPSR version. Ann Arbor, MI: University of Michigan/Beijing, China: Beijing University [producers], 1994. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 1996.

American Sociological Association Style Guide

For a complete description of citation guidelines refer to p. 104 of the ASA Style Guide [Call Number: Reference HM569 .A54 2007 ]

Machine-Readable Data Files

CBS News. 2009. CBS News Poll: Energy USCBS2009-02A Version 2 [MRDF]. New York: CBS News [producer]. Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut [distributor].

Data Archive Examples

ICPSR Data Archive

Duncan, Otis D., and Howard Schuman. Detroit Area Study, 1971: Social Problems and Social Change in Detroit [Computer file]. ICPSR07325-v2. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 1997. doi:10.3886/ICPSR07325

Read the FAQ page, Why and How Should I Cite Data? , for additional information on citing ICPSR datasets.

Manuscripts and dissertations based on ICPSR data should be submitted for inclusion in the ICPSR Bibliography of Data-Related Literature .

Roper Center for Public Opinion Research Data Archive

Cable News Network & USA Today. Gallup/CNN/USA Today Poll: Aftermath of Hurricane Katrina [computer file]. 1st Roper Center for Public Opinion Research version. Lincoln, NE: Gallup Organization [producer], 2006. Storrs, CT: The Roper Center, University of Connecticut [distributor], 2006.

Read the How to Cite Roper Center data page for additional information.

Papers published based on Roper Center data may be submitted to the Bibliography of publications using data from the Roper Center .

Dataverse Network

Gary King; Langche Zeng, 2006, "Replication Data Set for 'When Can History be Our Guide? The Pitfalls of Counterfactual Inference'" hdl:1902.1/DXRXCFAWPK UNF:3:DaYlT6QSX9r0D50ye+tXpA== Murray Research Archive [distributor]

Read the Academic Credit page at Dataverse for additional information.

Publisher Examples

National Center for Education Statistics

Kroe, E. (2002). Data File (Public-Use): Public Libraries Survey, Fiscal Year 1994 (NCES 2003–304). U.S. Department of Education, National Center for Education Statistics. Washington, DC: 2002.

Holton, B., and George, A. (2007). Data File and Documentation, Public Use: Academic Libraries Survey (ALS): Fiscal Year 1996 (NCES 2008-318). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved [date] from http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2008318.

Centers for Disease Control/National Center for Health Statistics

National Center for Health Statistics. National Ambulatory Medical Survey, 1994. Public-use data file and documentation. ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/. 1996.

Read the Citations for NCHS Publications and Electronic Media page for more information.

  • Last Updated: Nov 17, 2021 11:09 AM
  • URL: https://libguides.lib.msu.edu/citedata


Kaggle Datasets for Research

Create a Kaggle Account

If you don’t already have a Kaggle account, the first step is to create one. Go to Kaggle’s website and sign up using your email address or social media accounts. Once you’re logged in, you’ll have access to a wide variety of datasets.

Explore Kaggle Datasets

Kaggle offers a vast collection of datasets on diverse topics, ranging from finance and healthcare to natural language processing and computer vision. Use the search bar and filters to find datasets that align with your research interests. You can also explore popular datasets and featured competitions.

Check Dataset Licenses

Before downloading any dataset, it’s crucial to check its licensing terms and usage restrictions. Some datasets are open and can be used for research, while others may have specific restrictions, such as for educational purposes only or non-commercial use. Always review the dataset’s description and licensing information provided by the dataset owner.

Download the Dataset

Once you’ve found a dataset that suits your research needs and complies with its licensing terms, you can download it directly from Kaggle. Most datasets are available in common formats like CSV or JSON. Click the “Download” button to save the dataset to your computer.

Understand the Data

Before diving into your research, take the time to understand the dataset thoroughly. Review any documentation or metadata provided with the dataset to gain insights into its structure, variables, and any preprocessing that may be required.
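One way to get a first look at a downloaded CSV file is to list its columns and count the empty cells in each. The sketch below uses only Python's standard library; `SAMPLE_CSV` and `summarize_csv` are hypothetical names, and the inline rows are made-up placeholder data standing in for a real Kaggle file.

```python
import csv
import io
from collections import Counter

# Placeholder content standing in for a downloaded CSV file (not real data).
SAMPLE_CSV = """id,category,value
1,a,10.0
2,b,
3,,14.0
"""


def summarize_csv(text: str) -> dict:
    """Report column names, row count, and number of empty cells per column."""
    reader = csv.DictReader(io.StringIO(text))
    missing = Counter()
    rows = 0
    for row in reader:
        rows += 1
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1
    return {"columns": reader.fieldnames, "rows": rows, "missing": dict(missing)}


print(summarize_csv(SAMPLE_CSV))
```

For a file on disk, the same function could be fed the text returned by reading the file. The summary shows which variables exist and where gaps will need attention.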

Clean and Preprocess the Data

Data from Kaggle may not always be in a ready-to-use format. Depending on your research goals, you may need to clean and preprocess the data. This can include handling missing values, encoding categorical variables, and scaling features. Tools like Python’s pandas and scikit-learn can be immensely helpful for this task.
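The three preprocessing steps named above can be illustrated without any third-party packages. This is a minimal standard-library sketch on made-up placeholder rows; `preprocess` is a hypothetical function. In practice, pandas (`fillna`, `get_dummies`) and scikit-learn (`StandardScaler`) offer equivalent operations.

```python
from statistics import mean, pstdev

# Placeholder rows standing in for a loaded dataset (not real data).
rows = [
    {"size": "small", "price": "10.0"},
    {"size": "large", "price": ""},
    {"size": "small", "price": "14.0"},
]


def preprocess(rows):
    # 1. Handle missing values: fill empty prices with the mean of observed prices.
    observed = [float(r["price"]) for r in rows if r["price"] != ""]
    fill = mean(observed)
    prices = [float(r["price"]) if r["price"] != "" else fill for r in rows]

    # 2. Scale the feature: subtract the mean, divide by the population standard deviation.
    mu, sigma = mean(prices), pstdev(prices)
    scaled = [(p - mu) / sigma if sigma else 0.0 for p in prices]

    # 3. Encode the categorical variable as one 0/1 column per category (one-hot).
    categories = sorted({r["size"] for r in rows})
    result = []
    for r, s in zip(rows, scaled):
        encoded = {f"size_{c}": int(r["size"] == c) for c in categories}
        encoded["price_scaled"] = s
        result.append(encoded)
    return result


for row in preprocess(rows):
    print(row)
```

Mean imputation is only one of several possible choices for missing values. Which one is appropriate depends on the research question and should be documented alongside the analysis.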

Conduct Your Research

With the dataset prepared, you can now conduct your research. Utilize data analysis techniques, statistical methods, machine learning algorithms, or any other research methods applicable to your study. Document your work thoroughly to ensure transparency and reproducibility.

Cite the Dataset

When publishing or presenting your research, it’s essential to give proper credit to the dataset’s creators. Include a citation to the Kaggle dataset in your research paper, thesis, or presentation. Provide information about the dataset’s name, source, and any relevant identifiers.
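As an illustration of gathering the dataset's name, source, and identifiers into one reference, the sketch below follows the APA data set pattern shown earlier in this document. It is not an official Kaggle or APA format. The functions are hypothetical, the URL layout `kaggle.com/datasets/<owner>/<dataset>` is an assumption about how Kaggle dataset pages are addressed, and every value in the usage example is a placeholder.

```python
from typing import Optional
from urllib.parse import urlparse


def kaggle_identifier(url: str) -> str:
    """Extract 'owner/dataset' from a Kaggle dataset URL (assumed layout)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Kaggle dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def cite_kaggle_dataset(creator: str, year: int, title: str, url: str,
                        doi: Optional[str] = None) -> str:
    """Build an APA-like reference; prefer the DOI when the dataset has one."""
    kaggle_identifier(url)  # raises ValueError if the URL is not a dataset page
    location = f"https://doi.org/{doi}" if doi else url
    return f"{creator}. ({year}). {title} [Data set]. Kaggle. {location}"


# Placeholder values for illustration only.
example_url = "https://www.kaggle.com/datasets/example-owner/example-dataset"
print(kaggle_identifier(example_url))
print(cite_kaggle_dataset("Example Owner", 2023, "Example Dataset", example_url))
```

If the dataset page or its creator recommends a specific citation, that recommendation should take precedence over a generated one.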

Ethical Considerations

Respect ethical guidelines and privacy concerns when using Kaggle datasets. Ensure that your research complies with data protection regulations and that you do not misuse or misrepresent the data. Be transparent about any limitations or biases in the dataset.

Share Your Findings

After completing your research, consider sharing your findings with the Kaggle community and the broader research community. You can write a Kaggle kernel or contribute to discussions related to the dataset. Sharing your insights can help others in their research endeavors.

Kaggle offers a wealth of datasets that can be a valuable resource for research projects across various domains. By following these steps and maintaining ethical standards, you can effectively use Kaggle datasets for your research and contribute to the advancement of knowledge in your field. Remember to respect dataset licensing terms and provide proper attribution to dataset creators to ensure a collaborative and ethical research environment.


IEEE - Referencing Guide


Standard format for citation

DOI available:

No DOI available:

[1] M. Ambrose, Air Infiltration Results for 129 Australian Dwellings, vol. 1, Canberra: CSIRO, 2018. [Dataset]. Available: https://doi.org/10.25919/5ca54346ef256. [Accessed: July 4, 2019].

Dataset repository

[2] United States. National Aeronautics and Space Administration., NASA Prognostics Data Repository, Washington, DC: NASA, 2019. [Online]. Available: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/. [Accessed: July 4, 2019].

Dataset deposit record

[3] J. Tucker Lima, et al., Data from: A Social-ecological Database to Advance Research on Infrastructure Development Impacts in the Brazilian Amazon - Hydrology Dataset, Dryad Digital Repository, 2016. [Dataset]. Available: https://doi.org/10.5061/dryad.20627. [Accessed: July 11, 2019]. Referenced in: https://doi.org/10.1038/sdata.2016.71.

Dataset description article

[4] A. Rakotomamonjy and V. Guigue, "BCI competition III: Dataset II - Ensemble of SVMs for BCI P300 Speller," IEEE Transactions on Biomedical Engineering, vol. 55, no. 3, pp. 1147-54, March 2008. [Online]. Available: IEEE Xplore, http://www.ieee.org. [Accessed: July 11, 2019].

See the  All Examples  page for examples of in-text and reference list entries for specific resources such as articles, books, and web pages.

  • Last Updated: Nov 14, 2023 3:24 PM
  • URL: https://libguides.murdoch.edu.au/IEEE

Citing sources: Cite data


Manage your references

Use these tools to help you organize and cite your references:

  • Citation Management and Writing Tools

If you have questions after consulting this guide about how to cite, please contact your advisor/professor or the writing and communication center .

Cite data in your paper/presentation so that you can:

  • Give the data producer appropriate credit
  • Enable readers of your work to access the data, for their own use and to replicate your results
  • Fulfill some publisher requirements

Include in your citation:

  • Year of publication
  • Publisher or distributor
  • URL, identifier, or other access location

Using citation software or style guides? In EndNote, use the reference type for "dataset." If you're using Mendeley or Zotero, make do with a more generic reference type template and fill in the essentials for your dataset.

Cite data: examples

Want detailed guidelines for citing data?  See:

  • Quick Guide to Data Citation (IASSIST)
  • How to Cite Data (MSU)
  • How to Cite Datasets and Link to Publications (DCC)

Examples of data citations include:

  • Bachman, Jerald G., Lloyd D. Johnston, and Patrick M. O'Malley. Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 1998 [Computer file]. Conducted by University of Michigan, Survey Research Center. ICPSR02751-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [producer and distributor], 2006-05-15. http://dx.doi.org/10.3886/ICPSR02751 .
  • ASTER Global Digital Elevation Model, version 1, ASTGTM_N11E122_num.tif, ASTGTM_N11E123_num.tif, Ministry of Economy, Trade, and Industry (METI) of Japan and NASA, downloaded from https://wist.echo.nasa.gov/api/ , October 27, 2009
  • Cite a subject archive entry, e.g.: Genbank accession number, available at: http://www.ncbi.nlm.nih.gov .

Data archives may provide guidelines on how to cite the data, e.g.:

  • Data catalogs like the Harvard Dataverse Network and ICPSR have standard citations included in the study record.
  • ICPSR: Why and how should I cite data?
  • How to Cite Roper Center Data
  • Dryad Good Data Practices
  • Earth Science Information Partner Federation Data Stewardship/Citations
  • NOAA Paleoclimatology Program: Data Citation
  • PANGEA Citation
  • Citing and linking to the Gene Expression Omnibus (NCBI) database

Cite data using Zotero

As Zotero lacks an "item type" for datasets, enter the citation in the system as a "Document," depending upon if/how the data producer provides a recommended citation; either:

  • Export an RIS file and import this file into Zotero
  • Copy and paste the information from a recommended citation into a new Zotero item with the type "Document"
  • Otherwise, use the "Document" item type to add the components of the citation
  • Last Updated: Jan 16, 2024 7:02 AM
  • URL: https://libguides.mit.edu/citing
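For the RIS route mentioned in the Zotero steps above, a record can also be written by hand when a data producer offers no export. The sketch below is a minimal illustration: `to_ris` is a hypothetical helper, the tags follow the RIS format as commonly documented (`TY  - DATA` for a data file), and how a given reference manager maps that type on import is not confirmed here. The values come from the ICPSR example citation earlier in this guide.

```python
def to_ris(authors, title, year, publisher, url) -> str:
    """Serialize the core elements of a dataset citation as one RIS record."""
    lines = ["TY  - DATA"]                      # reference type: data file
    lines += [f"AU  - {a}" for a in authors]    # one AU line per author
    lines += [
        f"TI  - {title}",
        f"PY  - {year}",
        f"PB  - {publisher}",
        f"UR  - {url}",
        "ER  - ",                               # end-of-record marker
    ]
    return "\n".join(lines) + "\n"


record = to_ris(
    ["Bachman, Jerald G.", "Johnston, Lloyd D.", "O'Malley, Patrick M."],
    "Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 1998",
    2006,
    "Inter-university Consortium for Political and Social Research",
    "http://dx.doi.org/10.3886/ICPSR02751",
)
print(record)
```

Saving the printed text to a file with an `.ris` extension gives something that can be tried with a reference manager's import function. Check the imported item type afterwards, since it may differ between tools and versions.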


Kaggle Tutorial Overview

Finding a dataset and understanding your dataset: it’s more than just a CSV file.


Free online resource

The Importance of Validity:

This is a great source for finding data, especially sports data. But it is up to you to assess the validity and authority of the data you find. This is a wonderful resource, but you should always ask the question, "Do I feel comfortable using this data for my project?"

  • Find a dataset using specific search terms.
  • Read and understand what’s in your dataset.
  • Download and check the validity of your dataset.
  • Head to Kaggle. Then click on Datasets in the top row.
  • You will need to make an account at some point if you want to download a dataset, comment, or create a kernel, but it’s not necessary yet.


In the picture, a few of the important search tools are highlighted.


File Types:

  • Most of you will want to stick to the CSV (comma separated values) file type. That will allow you to open the dataset in Microsoft Excel (or similar programs). The other file types are used for databases and web development. These are only recommended if you have past experience with them or are eager to do some extra work.
  • Supported File Types This page describes the file formats that Kaggle uses.
  • This is a typical tool that will allow you to sort by relevance, votes, date released or hotness. Relevance or votes (popularity) are probably what you’re looking for.

Search Datasets:

  • For a phrase: Put it in quotes e.g. “This is my phrase”
  • More Information About Searching This page displays more information on how you can search for datasets.
  • Datasets Documentation This is the documentation for the datasets feature in Kaggle. Much of it covers how to collaborate on Kaggle (its primary use), but there is some extra information on types of datasets and searching for datasets (linked above).

Once you’ve made your search, click on a result and it will take you to its page. Again, a few features are highlighted.


  • Both of the download buttons on the right will download all of the data sources. If you click to download, it will lead you into creating an account. It’s as simple as signing in with Google and accepting some terms. If you scroll down, you will see some visualizations of the current data sources.

Visualizations:

  • If you click on “2016.csv”, for example, it will show you some visualizations about that dataset. It does not show you visualizations over all of the sources.
  • If you want to contribute to the open source community and you know some Python, you can create a new kernel which can run a Python script on this dataset. Click on “ Kernels ” to see what some people have done with this dataset. This is a kernel for the World Happiness Report.
  • Click on “ Overview ” to see some information about the dataset and its source. On the popular datasets, some people have made interesting visualizations or created statistical models. If you go to “ Insights ”, you can see some statistics regarding the use of the dataset and kernels that people have made. If you click on the creator which in this case is “Sustainable Development Solutions Network” you can view their profile . In some cases, they will have their LinkedIn profile or their website attached. These are great ways to validate their data.

This tutorial helped you find a dataset on Kaggle using specific search tools. It also helped you understand the different features of your dataset, such as the kernels and visualizations that Kaggle provides.

  • Last Updated: Sep 8, 2023 11:34 AM
  • URL: https://gouldguides.carleton.edu/dataknowledgebase


Title: An Explainable Machine Learning-Based Approach for Analyzing Customers' Online Data to Identify the Importance of Product Attributes

Abstract: Online customer data provides valuable information for product design and marketing research, as it can reveal the preferences of customers. However, analyzing these data using artificial intelligence (AI) for data-driven design is a challenging task due to potential concealed patterns. Moreover, in these research areas, most studies are only limited to finding customers' needs. In this study, we propose a game theory machine learning (ML) method that extracts comprehensive design implications for product development. The method first uses a genetic algorithm to select, rank, and combine product features that can maximize customer satisfaction based on online ratings. Then, we use SHAP (SHapley Additive exPlanations), a game theory method that assigns a value to each feature based on its contribution to the prediction, to provide a guideline for assessing the importance of each feature for the total satisfaction. We apply our method to a real-world dataset of laptops from Kaggle, and derive design implications based on the results. Our approach tackles a major challenge in the field of multi-criteria decision making and can help product designers and marketers, to understand customer preferences better with less data and effort. The proposed method outperforms benchmark methods in terms of relevant performance metrics.


IMAGES

  1. Kaggle Datasets Tutorial: Kaggle Notebooks

  2. How we published a successful dataset on Kaggle

  3. Clinical Text Classification on Medical Transcription Kaggle Dataset #nlp #tutorial

  4. The Easy Approach to Access a Kaggle Dataset in Google Colab

  5. Getting Started on Kaggle: Uploading a dataset

  6. How To Upload Dataset On Kaggle

VIDEO

  1. use dataset from kaggle in google colab

  2. Machine Learning Part 1

  3. Data Analysis and Visualization using Google Data Studio (Google Developers Group Session, GDG)

  4. Tutorial XGBoost Classification Using Titanic Dataset

  5. Foundation of data science|week 2|Coursera weekly challenge Answers|Google Advanced Data Analytics

  6. Top 1% Kaggle Solution

COMMENTS

  1. How can i cite a kaggle dataset in ieee conference paper?

    Comment (Louic, Jul 16, 2021): A quick Google search will lead you to a forum post on Kaggle where citing datasets is discussed: kaggle.com/general/46091. Reply: Thank you, @Louic, but I am a little confused about the format of that citation, because the IEEE conference format for a Kaggle dataset might differ.

  2. How can I cite kaggle in my research paper?


  3. Data Sets

    Data Sets, from APA Style 7th Edition: Citing Your Sources (Research Guides at University of Southern California). Standard format adapted from American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.). https://doi.org/10.1037/0000165-000

  4. Cite a Dataset

    Cite a published dataset accessed online or in a database, or unpublished raw data received directly from the researcher or organization. Use other forms to cite theses and dissertations or journal articles.

  5. Citing Datasets

    Citation examples are given for ACS, AMA, APA, Chicago, CSE, IEEE, and MLA. The minimum data required for an acceptable citation are the name(s) of the data creator(s), title of dataset, publisher, published date, and the URL/DOI where the data was found. Examples of how to cite 2020 National Census of Ferry Operators using seven different style manuals:
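The minimum elements listed above can be checked mechanically before a citation is formatted. The following is a minimal sketch, not part of any style manual: the `DatasetCitation` class, its field names, and the sample values are hypothetical and only mirror the five elements named in the snippet.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetCitation:
    # Hypothetical container for the five minimum elements named above.
    creators: str
    title: str
    publisher: str
    published: str
    url_or_doi: str


def missing_fields(citation: DatasetCitation) -> list[str]:
    """Return the names of required elements that are empty or whitespace."""
    return [
        f.name
        for f in fields(citation)
        if not str(getattr(citation, f.name)).strip()
    ]


# Placeholder values for illustration only; the URL/DOI is left empty on purpose.
draft = DatasetCitation(
    creators="A. Author",
    title="Example Dataset",
    publisher="Example Publisher",
    published="2020",
    url_or_doi="",
)
print(missing_fields(draft))
```

Running the check on the draft above reports `url_or_doi` as the only missing element.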

  6. How to Cite Data: Dataset Citations

    For a complete description of citation guidelines refer to pp. 210-211 (data set) and p. 212 (unpublished raw data) of the Publication Manual of the American Psychological Association, 6th edition [Call Number: Reference BF76.7 .P83 2010]. Data set, basic form: Author/Rightsholder. (Year). Title of data set (Version number) [Description of form].
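The basic form quoted above can be assembled from its parts. This is a rough sketch covering only the elements visible in the snippet (the snippet is cut off, so anything that follows the bracketed description in the full guideline is not reproduced here). The function name, the "Version" label, and the sample values are assumptions for illustration.

```python
from typing import Optional


def apa_dataset_reference(
    author: str,
    year: str,
    title: str,
    description: str,
    version: Optional[str] = None,
) -> str:
    """Build 'Author. (Year). Title (Version n) [Description].'"""
    # Drop a trailing period so initials such as "Doe, J." do not yield "J..".
    author = author.rstrip(".")
    # The version is optional and goes in parentheses after the title.
    version_part = f" (Version {version})" if version else ""
    return f"{author}. ({year}). {title}{version_part} [{description}]."


# Placeholder values, not a real data set.
print(apa_dataset_reference("Doe, J.", "2020", "Example data set", "Data set", version="2"))
```

Leaving out `version` simply omits the parenthetical after the title.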

  7. citations

    I have cited these resources before in academic work. Usually I cite them in a general and holistic way, not a specific way. Several of the results presented in this paper were inspired by the discussion and resources on the Iowa Housing Project contest on Kaggle.com [citation with URL].

  8. Data set references

    Date created: February 2020. This page contains a reference example for a data set. This should be used when you have conducted secondary analyses of publicly archived data or archived your own data being presented for the first time.

  9. How to Use Kaggle Datasets for Research: A Step-by-Step Guide

    Use Kaggle datasets for research responsibly. Steps: Create an account, explore, check licenses, clean data, conduct research, cite, and share.

  10. Help and Support: IEEE

    Standard format for citation.
    DOI available: [#] A. Author, Title of Dataset, vol., Place of Publication: Publisher, Year of publication. [Format]. Available: DOI. [Accessed: Date of access].
    No DOI available: [#] A. Author, Title of Dataset, vol., Place of Publication: Publisher, Year of publication. [Format]. Available: internet address.
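The two IEEE templates above differ only in what follows "Available:" and in whether an access date is appended. A small sketch of a formatter that follows the templates literally is shown below. The function name, parameter names, and sample values are hypothetical, and the optional handling of the volume and access date is an assumption about how to treat elements a dataset may lack.

```python
from typing import Optional


def ieee_dataset_reference(
    number: int,
    author: str,
    title: str,
    place: str,
    publisher: str,
    year: str,
    fmt: str,
    location: str,
    accessed: Optional[str] = None,
    volume: Optional[str] = None,
) -> str:
    """Fill in the template; `location` is the DOI or the internet address."""
    parts = [author, title]
    if volume:
        parts.append(volume)
    parts.append(f"{place}: {publisher}")
    parts.append(f"{year}.")
    reference = f"[{number}] " + ", ".join(parts) + f" [{fmt}]. Available: {location}."
    # The access date appears only in the DOI variant of the template above.
    if accessed:
        reference += f" [Accessed: {accessed}]."
    return reference


# Placeholder values, not a real dataset.
print(
    ieee_dataset_reference(
        1, "A. Author", "Example Dataset", "City", "Example Publisher",
        "2020", "Dataset", "https://example.org/data",
    )
)
```

Passing `accessed="Jan. 1, 2024"` appends the bracketed access date shown in the DOI variant.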

  11. Cite data

    Using citation software or style guides? In EndNote use the reference type for "dataset." If you're using Mendeley or Zotero, make do with other more generic reference type templates and fill in the essentials for your dataset. Cite data: examples. Want detailed guidelines for citing data? See: Quick Guide to Data Citation (IASSIST)

  12. Correctly citing and referencing your dataset for maximum impact

    Where datasets are hosted in public repositories that provide datasets with persistent identifiers (such as Digital Object Identifiers (DOIs) or accession codes), we encourage formal citation of these datasets in reference lists. This allows your...

  13. Dataset reference style in BibLaTeX

    The journal I am submitting to wants datasets to be listed in the reference as [dataset] Authors; Year; Dataset title; Data repository or archive; Version (if any); Persistent identifier (e.g. DOI),
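One way to hold the elements the journal asks for is a BibLaTeX `@dataset` entry. The sketch below generates such an entry as text. It is an illustration rather than an answer to how the journal's exact "[dataset] ..." layout is produced: mapping the repository to the `publisher` field is an assumption, and the function name, entry key, and sample values are hypothetical.

```python
from typing import Optional


def biblatex_dataset_entry(
    key: str,
    authors: list[str],
    year: str,
    title: str,
    repository: str,
    doi: str,
    version: Optional[str] = None,
) -> str:
    """Return a @dataset entry with the requested elements as fields."""
    entry_fields = {
        "author": " and ".join(authors),
        "year": year,
        "title": title,
        # Assumption: the data repository or archive is stored as the publisher.
        "publisher": repository,
    }
    if version:
        entry_fields["version"] = version
    entry_fields["doi"] = doi
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in entry_fields.items())
    return f"@dataset{{{key},\n{body}\n}}"


# Placeholder values, not a real dataset or DOI.
print(
    biblatex_dataset_entry(
        "doe2020", ["Doe, Jane", "Roe, Richard"], "2020",
        "Example Dataset", "Example Repository", "10.0000/example", version="2",
    )
)
```

How the entry is rendered in the bibliography still depends on the citation style loaded in the document.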

  14. Using Kaggle in Machine Learning Projects

    To get started with Kaggle Notebooks, you'll need to create a Kaggle account either using an existing Google account or creating one using your email. Then, go to the "Code" page. Left Sidebar of Kaggle Home Page, Code Tab. You will then be able to see your own notebooks as well as public notebooks by others.

  15. Kaggle Datasets Tutorial: Kaggle Notebooks

    With these, you can narrow your search by entering dataset tags, file type, and other values like the minimum or maximum size of the dataset (Figure 4.3). Figure 4.1: Dataset Search Filters. Kaggle allows you to download any dataset for free, but depending on what you are going to use it for, you may need to pay attention to the license type of ...

  16. A Guide to Extracting Data from Kaggle for Your Data Science ...

    kaggle.api.dataset_download_files('username/dataset-name', path='./data', unzip=True) Replace 'username/dataset-name' with the Kaggle dataset you want. Step 6: Run the Kaggle Script. Execute your ...
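The download call quoted above can be wrapped in a small script. This is a sketch under several assumptions: the `kaggle` package is installed, an API token is already configured, and `username/dataset-name` is the same placeholder used in the snippet. The helper names `parse_dataset_slug` and `download_dataset` are hypothetical.

```python
from pathlib import Path


def parse_dataset_slug(slug: str) -> tuple[str, str]:
    """Split 'owner/dataset-name' into its two parts, rejecting other shapes."""
    owner, sep, name = slug.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/dataset-name', got {slug!r}")
    return owner, name


def download_dataset(slug: str, target: str = "./data") -> Path:
    """Download and unzip a Kaggle dataset into `target`."""
    parse_dataset_slug(slug)  # fail early on a malformed slug
    destination = Path(target)
    destination.mkdir(parents=True, exist_ok=True)
    # Imported here because, to my knowledge, importing the package
    # triggers authentication with the configured API token.
    import kaggle

    kaggle.api.dataset_download_files(slug, path=str(destination), unzip=True)
    return destination


if __name__ == "__main__":
    # Placeholder slug from the snippet; replace it with a real dataset.
    download_dataset("username/dataset-name")
```

Recording the slug and the download date at this step makes it easier to fill in the URL and access date when the dataset is cited later.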

  17. Discovering Datasets Through Machine Learning: An Ensemble Approach to

    The lack of a standardized citation methodology has thus far prevented the government from understanding dataset usage in a transparent, accessible way. In this work, we seek to build on recent successes in natural language processing techniques and a recent Kaggle competition to develop an extensible framework for extracting government dataset ...

  18. Research Guides: Gould Data Knowledge Base: Kaggle Tutorial

    This tutorial helped you find a dataset on Kaggle using specific search tools. It also helped you understand all of the different features of your dataset, such as the kernels and visualizations that Kaggle provides. ... Contact [email protected]. Gould Library Research Guides are licensed under a Creative Commons Attribution-NonCommercial 4 ...

  19. [2402.05949] An explainable machine learning-based approach for

    We apply our method to a real-world dataset of laptops from Kaggle, and derive design implications based on the results. Our approach tackles a major challenge in the field of multi-criteria decision making and can help product designers and marketers to understand customer preferences better with less data and effort.

  20. Find Open Datasets and Machine Learning Projects

    Download Open Datasets on 1000s of Projects + Share Projects on One Platform. Explore Popular Topics Like Government, Sports, Medicine, Fintech, Food, More. Flexible Data Ingestion.

  21. How to cite Kaggle?


  22. How to access datasets directly from Kaggle

    Generating the API Key. To generate the Kaggle API Key, follow the given steps: Login to your kaggle.com account. On the top right corner, you can see your profile. On clicking it, you will see an ...

  23. Citation Network Dataset

    4,894,081 papers and 45,564,149 citation relationships.