Data Collection: Definition and Methods
Data Collection Definition
Data collection is a systematic approach to accurately collecting information from various sources to provide insights and answers, for example when testing a hypothesis or evaluating an outcome. The main driver of data collection is to gather quality information that can be analyzed and used to support decisions or provide evidence.
There are two types of data collected: quantitative data and qualitative data. Quantitative data collection is based on numbers and measurements, such as percentages and statistics. Qualitative data collection covers non-numerical information, such as descriptions and opinions.
Data collection methods are broken into two core categories—primary and secondary.
Primary data collection methods gather information directly, so it is source data. Secondary data collection methods pull information from existing repositories. This could be third-party source material or the output of earlier analysis.
Because it is essentially second-hand information, secondary data is typically less expensive to obtain than primary data.
Data Collection Methods: Interviews and Surveys
For primary data collection, it is important to start by identifying the types of data that are desired, the sources, and the basic methods, which can include in-person, online, phone, mail, or multi-mode.
In-Person
Data collection with in-person interviews yields very high-quality data for a few reasons. It allows interviewers to collect in-depth information while also capturing verbal and non-verbal cues, such as body language, tone of voice, and emotions. In-person data collection is well suited for engaging people at a point of service, conducting neighborhood surveys, and soliciting opinions.
In-person data collection should be recorded to evaluate the integrity of the answers. For instance, interview notes could say that the person gave one answer, but a review of the recording could reveal subtle details in body language that suggest a different conclusion.
Online
Online data collection has a number of benefits. It is relatively easy to roll out, and data is already digitized, so it is faster to start analysis and reporting.
Online surveys offer respondents the flexibility to answer the questions at their leisure and, depending on how the survey is conducted, provide anonymity, which can lead to more honest answers. A few good use cases for online surveys are product or service feedback, quick tests of messages, or creative assessments.
Online interviews can also assess the interest levels for a topic and gather data for reports or infographics. Three important tips for online surveys are:
1. Make them easy to use.
2. Keep them short.
3. Give clear options for responses.
Phone
Phone-based data collection is another excellent source of information, and it provides a way to reach almost any audience.
The trick is getting respondents to answer the phone and participate in the survey. When considering phone-based data collection, it is important to be sure that the target market is available by phone during survey times.
Customer satisfaction surveys, market research, political polls, and social research are all well suited for phone-based data collection. Questions asked during a phone survey should be closed-ended.
Mail
Mail-based data collection can be directed at nearly anyone. People are often willing to take time to engage with a paper-based survey as it is a change from the barrage of digital media.
Mail-based data collection is a very effective tool for reaching senior audiences who are less comfortable with online tools and sometimes reluctant to talk to strangers.
With mail-based data collection, it is important that the envelope is enticing and has a teaser to encourage the recipient to open it. In addition, the survey should be very easy to read, both visually and semantically.
Multi-Mode
A mixed-methodology approach to data collection can increase response rates and generate a broader pool of respondents.
Multi-mode data collection is a good option for more extensive surveys that include a wide range of participants. To maximize engagement, provide the survey questions in as many different formats as possible (e.g., mail, online, in-person).
Data Collection Examples
Primary Data Collection Examples
- Delphi technique
With the Delphi method, a panel of experts is approached, individually or as a group, and asked questions. The answers are then consolidated to provide an answer to the proposed question.
- Focus group interview
One of the most widely used in-person data collection methods, focus group interviews gather a half dozen to a dozen people together for a curated discussion of a topic or problem. Focus groups are used to gather in-depth information on perceptions, insights, attitudes, experiences, or beliefs.
- Interview method
Interviews are used for data collection amongst smaller groups of people. There are three types:
  - Structured interviews
  A questionnaire is verbally administered.
  - Semi-structured interviews
  In addition to a questionnaire, there are a few open-ended questions.
  - Unstructured interviews
  A set of questions is established, but the participant is allowed to answer independently without being given choices for the answers.
- Projective technique
With the projective technique, an unstructured, indirect interview method is used to pose ambiguous questions that are designed to reveal participants’ underlying motivations, attitudes, or opinions.
- Questionnaire or survey method
A survey is commonly used to gather information from a large group of people. This is an effective way to collect consistent data.
Surveys include a set of closed-ended or open-ended questions. Participants are asked to respond based on their knowledge or experience. This method of data collection can be performed in a number of ways, including:
  - In-person interviews
  - Mobile surveys
  - Online (web) surveys
  - Phone surveys
- Role playing
Role playing elicits data to predict future behaviors. Participants are presented with scenarios and are asked to explain or enact how they would respond if the situation were real.
- Sentence completion
Participants are presented with a partial sentence and asked to complete it. The objective of this data collection method is to understand more about the kinds of ideas the respondent has.
- Cartoon caption completion
Another method of data collection that can predict future behavior is cartoon caption completion. Cartoon pictures are presented to participants, who are asked to write a caption for the cartoon.
- Thematic apperception test (TAT)
The TAT uses images to uncover participants’ dominant drives, emotions, conflicts, and personality characteristics. Participants are shown multiple images and asked to describe what each one represents.
- Word association
Here the researcher provides a set of words to the respondent and asks them to say whatever comes to their mind when they hear each one. This data collection method gathers information about associated feelings that could be related to a topic or brand.
Secondary Data Collection Examples
While secondary data has limitations, a significant advantage is that it is readily available and at scale. Secondary data can be both qualitative and quantitative. Example sources of secondary data are as follows:
Quantitative secondary data collection sources
- Customer details, like name, age, contact details, etc.
- Financial statements
- Government censuses, like the population census, agriculture census, etc.
- Information from other government departments, like social security, tax records, etc.
- Management information systems
- Sales reports
Qualitative secondary data collection sources
- Company information
- Diaries
- Interviews
- Newspapers
- Reports on feedback from employees, customers, partners, and vendors
- Transcripts
Data Collection and Data Integrity
The main purpose of data collection is to gather information in a measured and systematic manner to ensure accuracy and facilitate data analysis. Since the data collected is meant to provide content for data analysis, the information gathered must be of the highest quality for it to be of value.
Regardless of the data collection methods selected, it is essential to maintain the data's neutrality, credibility, quality, and authenticity. The subsequent data analysis will only provide real insights and practical guidance if the data is genuine and free from errors, discrepancies, or loopholes.
Without proper processes in place to maintain data integrity during the data collection process, a number of negative outcomes can result, including:
- Data cannot be validated.
- Decisions based on data can be compromised.
- Further research can be skewed.
- Objectives are not achieved.
- Questions are not properly answered.
- Valuable resources are wasted.
The following measures can be incorporated into data collection processes to improve the efficacy and accuracy of the various methods.
Maintain Neutrality
Only by maintaining neutrality in all parts of data collection can the results be free of bias and avoid reflecting the background, position, or conditioning circumstances of the data collection team. It is this neutrality that results in collected data that is considered to be trustworthy and legitimate.
Monitor Staff Involved in Data Collection
Closely monitor data collection teams to avoid these issues:
- Errors in data collection processes
- Failure to follow protocols
- Misconduct
- Overall performance gaps
- Systemic mistakes
Validate Participants and Their Answers
Ask questions about the same information several times during surveys or interviews in different ways to ensure data quality. Consistent answers across the reworded questions are a sign of honest, reliable responses, while conflicting answers flag responses that need review.
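A consistency check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the question IDs, the 1-5 rating scale, and the one-point tolerance are all hypothetical and would need to match the actual survey design.

```python
from dataclasses import dataclass


@dataclass
class Response:
    participant_id: str
    answers: dict[str, int]  # question id -> rating on an assumed 1-5 scale


# Hypothetical pairs of questions that ask about the same thing in different words.
PAIRED_QUESTIONS = [("q2_satisfaction", "q9_satisfaction_reworded")]
MAX_GAP = 1  # assumed tolerance: ratings more than 1 point apart count as inconsistent


def inconsistent_pairs(response: Response) -> list[tuple[str, str]]:
    """Return the question pairs whose two answers differ by more than MAX_GAP."""
    flagged = []
    for first, second in PAIRED_QUESTIONS:
        if first in response.answers and second in response.answers:
            if abs(response.answers[first] - response.answers[second]) > MAX_GAP:
                flagged.append((first, second))
    return flagged


responses = [
    Response("p1", {"q2_satisfaction": 5, "q9_satisfaction_reworded": 4}),
    Response("p2", {"q2_satisfaction": 5, "q9_satisfaction_reworded": 1}),
]

# Participants with at least one inconsistent pair are set aside for review.
flagged_ids = [r.participant_id for r in responses if inconsistent_pairs(r)]
print(flagged_ids)
```

Flagged responses are candidates for follow-up rather than automatic removal, since a mismatch can also mean that one of the two questions was worded unclearly.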
Take Advantage of Automation and Digital Tools
The impact of human error on data entry and information recording should not be underestimated. Employing automation and any available digital tools removes layers between the participant and their answers, resulting in more accurate data with fewer errors. It also shortens time-to-analysis by eliminating time-consuming manual steps.
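One simple form of automation is checking each entry against the allowed response options at the moment it is captured. The sketch below assumes a hypothetical survey with two fields, `age_group` and `rating`; the field names and allowed values are illustrative only.

```python
# Hypothetical survey fields and the values each one may take.
ALLOWED_VALUES = {
    "age_group": {"18-34", "35-54", "55+"},
    "rating": {1, 2, 3, 4, 5},
}


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one entry (empty if it is clean)."""
    problems = []
    for field, allowed in ALLOWED_VALUES.items():
        if field not in entry:
            problems.append(f"{field}: missing")
        elif entry[field] not in allowed:
            problems.append(f"{field}: unexpected value {entry[field]!r}")
    return problems


print(validate_entry({"age_group": "35-54", "rating": 4}))
print(validate_entry({"age_group": "35-54", "rating": 7}))
```

Rejecting or flagging an entry as soon as it is recorded is usually cheaper than finding the same error during analysis.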
Use Reliable Data Resources
Taking time to select the right resources will have a direct impact on the quality of data collection. Both systems and staff should be evaluated against a high bar for reliability and credibility to provide quality data collection.
Data Collection Best Practices
The following data collection best practices will help drive high-quality information from the process.
Be specific about the information that is to be collected.
Decide which topics to use to get that information. For example, a data collection exercise could be used to determine customers’ favorite activities. This can be determined by asking the question directly and supported with questions about specific activities, such as related purchases.
Decide if the data will be qualitative, quantitative, or a blend.
The type of data that will be collected informs the method that should be used for data collection.
Have a clear goal or purpose for the data collection program.
Know what the end goal is, then start with a rough plan and refine as the program develops. This helps in marking the steps required to achieve the desired results.
Hone the audience for data collection.
Determining which audience type will yield the most valuable data is critical. The data collection program could target participants in a number of ways.
For example, horizontal audiences could be selected by age and no other filters. Vertical audiences can be selected by refining the group, such as by age, ethnicity, or geography.
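The difference between the two selections can be illustrated with a short Python sketch. The participant records, field names, and age range below are hypothetical and serve only to show a broad filter next to a refined one.

```python
# Hypothetical participant records; field names are illustrative only.
participants = [
    {"id": 1, "age": 29, "region": "North"},
    {"id": 2, "age": 41, "region": "South"},
    {"id": 3, "age": 33, "region": "South"},
    {"id": 4, "age": 67, "region": "North"},
]


def horizontal_audience(people, min_age, max_age):
    """Select by age alone, with no other filters."""
    return [p for p in people if min_age <= p["age"] <= max_age]


def vertical_audience(people, min_age, max_age, region):
    """Refine the age-based group further, here by geography."""
    return [
        p for p in horizontal_audience(people, min_age, max_age)
        if p["region"] == region
    ]


broad = horizontal_audience(participants, 25, 45)
narrow = vertical_audience(participants, 25, 45, "South")
print([p["id"] for p in broad])
print([p["id"] for p in narrow])
```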
Secure collected data.
Implement security controls to manage the integrity of the information that is collected. The levels of security should be commensurate with the sensitivity of the data. Data collection efforts can be compromised by inappropriate or unauthorized access.
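As one narrow example of an integrity control, a fingerprint can be stored with each collected record so that later changes are detectable. This is a minimal sketch using Python's standard `hashlib` and `json` modules; the record fields are hypothetical, and a plain hash only detects changes. It does not restrict access, and it offers no protection if the stored fingerprint can be altered along with the record.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a SHA-256 digest of the record in a canonical JSON form."""
    # Sorting keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical survey record captured at collection time.
record = {"participant_id": "p1", "question": "q2", "answer": "4"}
stored_digest = fingerprint(record)

# Later, recompute the digest and compare it with the stored one.
unchanged = dict(record)
tampered = dict(record, answer="5")
print(fingerprint(unchanged) == stored_digest)
print(fingerprint(tampered) == stored_digest)
```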
Select a data collection method or combination that can be effectively supported.
Data collection is only as effective as the execution. The method used for data collection must be sustainable in addition to meeting objectives.
If an aggressive plan is put together without the resources to support it, data quality will suffer and undermine the data collection effort. The goal is to find a balance between the available resources and the data collection methods used.
Set a timeframe for data collection.
Ultimately, the end goal and available resources drive decisions about the timeframe for data collection. Some programs run continuously, while others are designed to target a specific point in time.
In addition, the timeframe of data collection can influence the information as external influences will come into play. For instance, surveys taken during the summer could yield different results than those taken over holidays.
Spend time on the details.
Invest time in developing the data collection plan with particular care taken with the questions. Both must reflect the purpose of the exercise and the desired output.
Data Collection: Focus on Objectives and Quality
While there are a number of technical solutions available to support data collection efforts, attention must be paid to goals. Data collection objectives should direct the selection of methods. In some cases, online surveys are the best approach. In others, old-fashioned, in-person interviews or simple observation are the right methods.
After assessing the various methods and options for data collection, care must be taken at every step of the data collection process to ensure the integrity of the information—from how it is collected to how it is stored. Quality dictates the success of any data collection effort.
Last Updated: 25th August, 2021