Data Management
When planning and conducting research, it is good practice to develop a system for managing your data. Research data management generally refers to how researchers structure, organize, document, use, preserve, and share data throughout the research life cycle. Good data management can help reduce errors, minimize confusion, and improve the quality of your data and the efficiency of your analysis. Well-organized and well-documented data are also easier to share and archive for future use, allowing others to understand and make use of your data and/or reproduce your research.
Principles and Best Practices
How to best organize and manage your data will depend on the type of data and analysis, as well as your (and your collaborators’) preferences. Here are some brief guides and primers on best practices in data management, to help you get started:
- Good Enough Research Data Management: A Very Brief Guide
- British Ecological Society (BES) guides to Data Management and Reproducible Code
- Directory structure and File naming guidelines
- Spreadsheet best practices from the University of Pennsylvania and UC Davis
- Kristin Briney’s blog post about README files and UBC’s Creating a README for your dataset
- Codebooks & Data Dictionaries and template
- How to make a data dictionary
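As a starting point for a data dictionary, a short script can draft one entry per column of a dataset and leave the descriptions and units to be filled in by hand. The following is a minimal sketch in Python using only the standard library. The function names, the output columns (`variable`, `example_value`, `n_missing`, `description`, `units`), and the sample data are all assumptions made for illustration; they are not a format required by the lab or by the guides listed above.

```python
import csv
import io


def draft_data_dictionary(csv_text):
    """Draft one data dictionary entry per column of a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Treat empty cells (and cells missing from short rows) as missing.
        values = [row[name] for row in rows if row[name] not in ("", None)]
        entries.append({
            "variable": name,
            "example_value": values[0] if values else "",
            "n_missing": len(rows) - len(values),
            "description": "",  # fill in by hand
            "units": "",        # fill in by hand
        })
    return entries


def to_csv_text(entries):
    """Render the drafted entries as CSV text, ready to save and edit."""
    fields = ["variable", "example_value", "n_missing", "description", "units"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return out.getvalue()


# Made-up example data, for illustration only.
SAMPLE = "site_id,date,count\nA1,2023-05-01,4\nA2,2023-05-02,\n"

if __name__ == "__main__":
    print(to_csv_text(draft_data_dictionary(SAMPLE)))
```

The draft only captures what can be read from the file itself. The definitions, units, allowed values, and collection details still need to be written by someone who knows the data.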
For more information about the principles of Research Data Management, check out the following resources:
- Online text on Research Data Management in the Canadian Context
- UBC Library Research Data Management guide and other online resources
- Tri-Agency Statement of Principles on Digital Data Management
Data Management Plans
Generally, data management plans provide information on:
- how data will be collected, documented, formatted, protected and preserved;
- how existing datasets will be used and what new data will be created over the course of the research project;
- whether and how data will be shared; and
- where data will be deposited.
Creating a data management plan is particularly useful when working collaboratively with others on a project and when sharing or preserving data for future use. Data management plans are also important components of research ethics applications. A formal data management plan may also be required for some grant applications, including those to NSERC or SSHRC.
See the following for additional resources and tools to help you develop a data management plan:
- Guide from Canada’s Social Sciences and Humanities Research Council (SSHRC)
- Creating a Data Management Plan Document guide from the Center for Open Science
- NCEAS Learning Hub’s coreR Course chapter on data management plans
- DMP Assistant online tool for developing data management plans that meet the requirements for Canada’s federal research funding agency grants. See the UBC Library Research Data Management website to learn more.
- Digital Research Alliance of Canada training resources.
Lab Archive
Please note that this section is a work in progress while we develop a plan for data archiving. Updates will be posted here; in the meantime, please contact Tara when you need to archive data.
Once your research has been completed, you are expected to leave an archive of your data and analysis code with the lab, along with any additional documentation needed for others to understand how your data was collected, processed, and analyzed, and how the data should be managed, shared, and/or used in the future.
Recommended Content
When archiving, we strongly recommend following best practices in data management and reproducible code, and ask that you include:
- a copy of the raw (unprocessed) data files, if possible. In many cases, this may consist simply of data collected from field sampling or compiled from expert elicitations or systematic literature reviews
- where applicable, the source of the data - e.g., image or video files from camera traps or bioacoustic monitoring devices, audio/video recordings of interviews/workshops, copies of survey responses, spatial data files, etc. If (physical) field samples were collected and stored in the lab for future analyses, include an inventory list of the samples, where and when they were collected, and where and how they are stored.
- the code used to process and analyze the data and generate any figures. If data cleaning/formatting was done manually (i.e., not using R or other programming software), then a detailed description of how input files for the analysis were derived from the raw data files should also be included
- a data dictionary/codebook that defines the variables and describes the structure, content, and layout of the dataset. Include the names of the individuals who collected the data and the date and location of data collection, if these are not already included in the dataset or in the report/manuscript (see next item)
- a copy of the final report, thesis or thesis chapter, or scientific manuscript that describes the data collection methods, equipment and materials used, and the analysis
- any additional documentation needed to allow for replication of the study
- permits or approvals needed for experimental or field data collection, or for any research involving humans (e.g., expert elicitations, interviews, workshops)
- data use/ownership agreements (if applicable), copyright licenses (e.g., Creative Commons), and protocols that govern how the data should be used, managed, stored, or shared in the future.
In addition, we encourage you to archive copies of field/workshop photos or videos, presentations about the project, and copies of (or links to) any media articles that featured the study. Documents related to project administration, such as grant applications/agreements and contracts, budgets, progress reports, invoices, and contact lists, should also be included in the archive in a separate ‘admin’ subfolder.
Make sure to include README files that describe the contents of each subfolder or group of subfolders. Consult the links provided above as well as this guide from Cornell University to learn more about README files.
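Before handing over an archive, it can help to check which folders do not yet have a README file. The sketch below is one possible way to do that in Python with the standard library. The function name and the accepted file names (`README`, `README.txt`, `README.md`, in any letter case) are assumptions for illustration, not a lab convention. Because a single README may describe a group of subfolders, treat the folders it reports as a list to review rather than a list of errors.

```python
from pathlib import Path

# Assumed set of acceptable README file names (compared in lower case).
README_NAMES = {"readme", "readme.txt", "readme.md"}


def folders_missing_readme(root):
    """Return the folders under root (root included) that contain no README file."""
    root = Path(root)
    folders = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for folder in folders:
        has_readme = any(
            f.is_file() and f.name.lower() in README_NAMES
            for f in folder.iterdir()
        )
        if not has_readme:
            missing.append(folder)
    return missing


# Example use, with a hypothetical archive folder name:
# for folder in folders_missing_readme("my_project_archive"):
#     print(folder)
```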
Directory Structure
Your data and other project documents should be archived on the Conservation Decision Lab’s shared network drive. See below for more information on where to store and how to structure your project archive.
(To be continued. Check with Tara before archiving data.)