Week 2 – Assignment: Analyze Secondary Data, Observation Research and Measurement of Variables
This assignment consists of three parts:
(1) Sketch a research design using observation research for each of the following. Be sure to explain the procedure you would use:
· A non-profit organization wants to know what Facebook posts are likely to be commented on, liked, or shared by stakeholders.
· An executive of a major fast food restaurant wants to know how long a customer has to wait to get their order. The order is placed inside the restaurant, not in the drive-thru.
· A researcher wants to know if the role portrayal of African American women in magazine advertisements has changed over the past 15 years.
(2) Go to the NCU library from your home page. Examine the guide to primary and secondary sources to see what is available. This can be found under LibGuides → Research Process → Primary and Secondary Resources (https://ncu.libguides.com/c.php?g=635502&p=4444798). Then go back to the home page and scroll through the databases available (see A-Z Databases, http://ncu.libguides.com/az.php?a=all) to locate data to answer the following questions:
(a) Does there appear to be a relationship between adult internet usage and age between 2000 and 2018?
(b) Has the incidence of data breaches in the health/medical sector increased in the last five years?
In each case, explain your rationale. Be specific. While you cannot use tables and/or figures from another source without copyright permission, you are free to create your own to support your conclusion. Be sure to cite the source of the table/figure using APA formatting.
(3) Comment on the appropriateness of observation and/or secondary data for the measurement of the concepts identified in your research questions. Explain your rationale.
Length: Your paper should be at least 5 pages, but may be as long as 10 pages if tables and/or figures are included. This does not include the title and reference pages. You are encouraged to make effective use of tables and/or figures in your presentation.
References: Include a minimum of three (3) scholarly sources.
Your presentation should demonstrate thoughtful consideration of the ideas and concepts presented in the course and provide new thoughts and insights relating directly to this topic. Your response should reflect scholarly writing and current APA standards.
Secondary vs. Primary Data and Observation Research
There are two main types of data in research: primary and secondary. Secondary data is information that has already been collected for a purpose other than the current study. This may have been compiled by other researchers, government bodies, marketing research organizations, and others. As you can imagine, secondary data plays a major role in this digital age and has the advantages of relatively low cost, ease of access, and timeliness. With the Internet, you now have access to more information than ever before. Of course, this also means that the researchers need to be cautious and only use quality data that will benefit the research design since it was not collected for the particular problem you are working on. This involves considering the source of the data, the timeliness, and how the research was designed and conducted.
In contrast to secondary data, primary data is information collected specifically for the given problem. There are essentially only two ways of collecting data: communication and observation. Communication is more versatile, allowing researchers to obtain information on past behavior, behavioral intentions, and a variety of cognitive phenomena such as attitudes, satisfaction, and opinions. Observation is more limited, capturing only current behavior, as the researcher watches for behavioral patterns of people, objects, and occurrences as they are witnessed. While there are both qualitative and quantitative methods that involve both communication and observation studies, the focus in this course is only on quantitative methods.
IASSIST Quarterly 2015
Bridging the Business Data Divide: Insights into Primary and
Secondary Data Use by Business Researchers
by Linda D. Lowry
Abstract
Academic librarians and data specialists use a variety
of approaches to gain insight into how researcher
data needs and practices vary by discipline, including
surveys, focus groups, and interviews. Some published
studies included small numbers of business school
faculty and graduate students in their samples, but
provided little, if any, insight into variations within
the business discipline. Business researchers employ
a variety of research designs and data collection
methods and engage in quantitative and qualitative
data analysis. The purpose of this paper is to provide
deeper insight into primary and secondary data use by
business graduate students at one Canadian university
based on a content analysis of a corpus of 32 Master
of Science in Management theses. This paper explores
variations in research designs and data collection
methods between and within business subfields (e.g.,
accounting, finance, operations and information
systems, marketing, or organization studies) in order
to better understand the extent to which these
researchers collect and analyze primary data or
secondary data sources, including commercial or open
data sources. The results of this analysis will inform
the work of data specialists and liaison librarians who
provide research data management services for business
school researchers.
Keywords: business, primary data, secondary data,
graduate students, research data management
Introduction
A bridge is an apt metaphor for the work of an academic
liaison librarian, who acts as a boundary spanner
between faculty, students, and the Library. Much of this
boundary spanning activity is driven by traditional liaison
responsibilities including reference service, information
literacy instruction, and collection development. As
Canadian academic libraries begin to develop new
research data management (RDM) services, liaison
librarians have been identified as ‘crucial intermediaries
between the library’s services and its researcher
community… [who] often have domain-specific expertise
and a network of department-specific relationships’
(Steeleworthy, 2014, p. 7). Like many of its Canadian peers,
Brock University Library has articulated a desire, through
its most recent strategic planning exercise, to explore
opportunities to support research data management and
curation (Brock University Library, 2012). As the liaison
librarian to the Goodman School of Business at Brock
University, I was quite familiar with the challenges of
working with complex, and often expensive, commercial
sources of numeric business data such as Compustat and
CRSP (Hong & Lowry, 2007), but less familiar with the data
practices of business scholars who generated primary data
as part of the research life cycle. In order to bridge the
business data divide, I needed to acquire evidence-based
insight into business researchers in their dual roles as data producers
and data consumers.
A key phase in the development of RDM services is the discovery phase,
which documents and analyses current researcher data practices that
may be shaped by a variety of factors such as discipline, funding source
requirements, research team composition, and career stage (Whyte,
2014). An independent assessment of the management, business, and
finance (MBF) research landscape in Canada, commissioned by the
Social Sciences and Humanities Research Council (SSHRC), provides
some insight into these factors (Council of Canadian Academies [CCA],
2009a). The number of business faculty in Canada was estimated at just
over 2,900 individuals working at 58 different academic institutions
(CCA, 2009a, p. 14). Business is a diverse discipline composed of many
subfields, some of which are more research-intensive than others. A
bibliometric analysis of Canadian MBF research output published
between 1997 and 2006 found that the accounting subfield
represented 14% of business school faculty but produced only 2%
of the research output, while the organizational studies and human
resources subfield represented 5% of total business school faculty but
produced 11% of the research output (CCA, 2009a, p.22). An analysis
of research grants administered by SSHRC between 2005 and 2008
calculated that just 1.7% of these grants went to MBF research (CCA,
2009a, p. 18). The Council of Canadian Academies also examined
the level of collaborative activity among MBF researchers and found
that (a) 40% of all papers published between 1996 and 2007 were
collaborative, and (b) among the top 25 Canadian universities, 45% of
collaborative papers had an international co-author. Data management
plans are not currently required for SSHRC-funded research, but
researchers who collaborate internationally may find themselves
subject to data management and sharing policies required by funding
agencies in other countries (Corti et al., 2014).
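The bibliometric comparison in this paragraph can be made concrete with a little arithmetic. The following is a minimal sketch that divides each subfield's share of research output by its share of faculty, using only the four percentages quoted from CCA (2009a, p. 22). The ratio itself and the names `SUBFIELD_SHARES` and `output_to_faculty_ratio` are illustrative assumptions, not measures taken from the report.

```python
# Shares quoted in the text from CCA (2009a, p. 22): percent of business
# school faculty and percent of research output, 1997-2006.
SUBFIELD_SHARES = {
    "accounting": {"faculty_pct": 14, "output_pct": 2},
    "organizational studies and human resources": {"faculty_pct": 5, "output_pct": 11},
}


def output_to_faculty_ratio(faculty_pct, output_pct):
    """Illustrative ratio (an assumption, not a CCA measure): values above 1
    mean a subfield's share of output exceeds its share of faculty."""
    return output_pct / faculty_pct


for name, shares in SUBFIELD_SHARES.items():
    ratio = output_to_faculty_ratio(**shares)
    print(f"{name}: {ratio:.2f}")
```

Read this way, a ratio well below 1 for accounting and above 2 for organizational studies and human resources restates the point that some subfields are more research-intensive than others.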
This study employs content analysis to investigate the research designs
and data collection methods found in one form of academic business
research output, the master’s thesis, in order to discover to what
extent graduate student business researchers collect primary data,
or rely on access to secondary data sources for their analysis, and to
explore variations within and between business subfields. In order
to distinguish between the terms research strategy (which was not
considered in this study), research design, and research method, the
following definitions were considered:
1 A research strategy refers to ‘a general orientation to the conduct
of social research’ (Bryman et al., 2011, p.579). Commonly cited
strategies are qualitative, quantitative, and mixed methods,
while other terms used to describe research strategies include
strategies of inquiry, traditions of inquiry, or methodologies
(Creswell, 2003, p. 13).
2 A research design refers to ‘a framework for the collection and
analysis of data’ (Bryman et al., 2011, p. 579). Examples of research
designs described in standard accounting, business, and social
science research textbooks include experimental, cross-sectional
(survey), fieldwork, case study, and archival (secondary analysis)
designs (Bryman et al., 2011; Neuman, 2003; Smith, 2011).
3 A research method can be defined as ‘simply a technique
for collecting data’ (Bryman et al., 2011, p. 77) such as self-
completion questionnaires, structured interviewing, focus groups,
structured observation, ethnography and participant observation,
content analysis, and secondary analysis. For consistency’s sake,
the term ‘data collection method’ will be used in this study when
discussing research methods.
Reliance on primary data collection has implications for the
development of research data management services, while
reliance on secondary data has implications for data reference
support and collection development planning, particularly due
to the high cost, proprietary nature, and complex interfaces of
many business data sets. This study sets a baseline measurement
for data practices at the master’s level of business research, and
can be used in future studies to compare current data practices
at other career stages, or at other institutions, at the disciplinary
or subdisciplinary level of analysis. This paper is structured as
follows: section 2 reviews the literature related to methods of
discovering researcher data practices; section 3 describes the
purpose of the study and the research questions I will be exploring;
section 4 describes the study’s procedures including the setting,
and methods of data collection; section 5 presents the findings
of the content analysis; section 6 discusses the implications of
the findings for research data management, reference support,
and collection development; section 7 discusses the limitations
of the study; and the final section presents suggestions for
future research.
Literature Review
Surveys and Interviews
Academic librarians and data specialists have used a variety of
approaches to gain insight into current research data management
practices such as case studies (e.g., Key Perspectives, 2010),
campus-wide questionnaires (e.g., Parham, Bodnar & Fuchs, 2012),
interviews (e.g., Carlson, 2012), and focus groups (e.g., McClure et
al., 2014). One study revealed statistically significant differences
in research data management practices and attitudes across
four research domains (but did not consider discipline-specific
distinctions), leading the authors to recommend tailoring data
management services using discipline-specific approaches (Akers
& Doty, 2013). Another study attempted to examine differences
in research data practices by discipline and by methodology
but the findings were of limited generalizability due to the low
response rate and a survey instrument which confounded research
strategies (e.g., qualitative, quantitative, and mixed methods) with
research designs (e.g., experimental, survey, field work), and data
collection methods (e.g., oral history, textual analysis) (Weller &
Monroe-Gulick, 2014).
Motivated by research funding agency requirements for data
management plans, most studies have focused on the data
curation behaviors and attitudes (such as data preservation
and data sharing) of science researchers (e.g., Scaramozzino,
Ramirez, & McGaughey, 2012), or if institution-wide, have grouped
business scholars with social science or professional schools
(e.g., Akers & Doty, 2013), thus providing little, if any, insight into
variations within the business discipline. In the next section, I
discuss how deeper insight into a researcher’s choice of data
collection methods and patterns of secondary data use within
a discipline can be acquired by conducting a content analysis
of scholarly research publications such as journal articles, theses,
and dissertations.
Content Analysis
Content analysis, which is a nonreactive or unobtrusive data
collection method, enables researchers to overcome some of
the weaknesses of survey research, such as low response rates,
sampling errors, or unclear question wording (Neuman, 2003).
Several studies provided insight into business research designs
at the disciplinary level of analysis. Researchers investigating
the prevalence of mixed methods research designs in business
and management dissertations conducted a systematic content
analysis of 186 Doctor of Business Administration theses and
confirmed the use of a diverse set of research designs and data
collection methods (Miller & Cameron, 2011). In a similar study,
McLennan, Moyle, and Weiler (2013) explored the role of
economics in tourism postgraduate research by conducting a
content analysis of 118 doctoral dissertations completed in the
United States, Canada, Australia, and New Zealand between 2000
and 2010. Their examination of the frequency of use of specific
research approaches and methodologies found that 60% of tourism
economics theses used quantitative approaches, 21% used
qualitative approaches, and 9% used mixed methods approaches,
while their analysis of the data collection methods employed
identified a diverse range of techniques including interviews,
surveys, case studies, econometric forecasting, observation, and
econometric modeling (McLennan, Moyle, & Weiler, 2013, p. 186).
Other content analysis studies explored research trends and
practices within specific business subfields (e.g., accounting,
logistics and supply chain management), thus providing insight
into primary and secondary data use at the subdisciplinary level of
analysis. An examination of trends in accounting research over a
50-year period found that archival research, defined as ‘papers using
data from historical market information [such as] stock prices’ (Oler,
Oler, & Skousen, 2010, p. 668), has been the dominant research
methodology in published accounting papers since the 1980s and
comprised more than 60% of all papers published between 2000
and 2007. A review of articles published over a two-year span in the
Journal of Business Logistics found that 62% of empirically-based
studies used primary data from surveys or case studies, while 21%
of studies used secondary data methods (Rabinovich & Cheon,
2011). While logistics and supply chain researchers appear to rely
less heavily on secondary sources than do accounting researchers,
a broader review of recent research in the logistics and supply
chain field identified extensive use of secondary data sources for
archival data collection, simulation, content analysis, event studies,
and meta-analysis, leading Rabinovich and Cheon to advocate for
the extension of traditional secondary data methods to include
logistics research.
Several studies conducted by librarians also illustrate the value
of the content analysis method in uncovering discipline-specific
data practices. Nicholson and Bennett (2009) explored the
nature of primary and secondary data use and availability within
business ethics research through a content analysis of 48 doctoral
dissertations. Their analysis revealed that 51% of the dissertations
contained only primary data, 12% relied exclusively on secondary
data, and 32% collected both primary and secondary data. A
review of primary data collection methods identified four main
categories: observations, surveys, experiments, and structured
interviews, while a review of the secondary data collected
identified a range of data types including numeric datasets,
corporate annual reports, government filings and regulatory
cases (Nicholson & Bennett, 2009). More recently, Williams (2013)
analysed the content of 124 journal articles published by 64 faculty
members in crop sciences for evidence of data usage and data
sharing, in order to identify faculty candidates for data services. An
advantage of the bibliographic study (sic) approach was that it
revealed a diversity of discipline-specific data practices, but it was
time consuming to conduct, because if data sets were used, they
were typically not cited in the bibliography, but within the text of
the article (Williams, 2013, p. 207).
In summary, content analysis is an unobtrusive discovery method
which can provide insight into the prevalence of various research
designs and data collection methods in order to determine
patterns of primary and secondary data use within specific
disciplines, but few studies have examined variations between or
within business subfields. This study attempts to fill that gap by
reporting on the findings of an exploratory content analysis study
of business master’s theses.
Purpose
The purpose of this study was to investigate the research designs
and data collection methods of students in a research-based
Master of Science in Management (MSCM) program in order to
better understand the extent to which these researchers collected
and analyzed primary or secondary data. A content analysis of a
corpus of 32 master’s theses explored differences between and
within subfields of business with respect to research designs and
data collection methods. In cases where a thesis used secondary
data, attempts were made to identify whether the data sources
could be considered open data, or commercial data. This study
explored the following research questions:
1 What is the distribution of theses by area of specialization and
how does it compare to the distribution of core (supervisory)
faculty?
2 What is the overall distribution of theses by research design
and by data collection method? What are the patterns of data
collection method use within each type of research design?
3 What is the distribution of research designs and data collection
methods by area of specialization?
4 What is the overall nature of primary and secondary data
collection and use (across all specializations)?
5 What types of secondary sources are used in business research?
Do these researchers use open data sources, proprietary/
commercial data sources, or both?
Procedures
Setting
Brock University is a large comprehensive university located in
Canada which offers a wide variety of undergraduate and graduate
programs across seven faculties. Brock University’s Goodman
School of Business (GSB) is accredited by AACSB International and
has undergraduate and graduate degree programs in accounting
and business administration, an enrollment of 2500 FTE students,
and a faculty complement of 95 (Brock University Institutional
Analysis & Planning, 2014). The GSB launched a research-based
Master of Science in Management program during the 2007/2008
academic year with two goals in mind: first, to prepare students
to conduct research in industry and government settings, and
second, to prepare students for doctoral level studies in business
(Brock University, 2007). The MSCM is a two-year program which
culminates in a thesis based on independent and original
research, and currently offers specializations in accounting,
finance, operations and information systems management (O & ISM,
formerly known as management science), marketing, and
organization studies. The organization studies stream was first
offered during the 2010/2011 academic year (Brock University,
2010). Although Brock University does not currently offer a
doctoral level degree in Business, at one point in time the GSB’s
medium to long term plan included the development of a
research-based doctoral degree in Business, perhaps jointly with
another university (Brock University Faculty of Business, 2005).
Method
In order to better understand the extent to which business student
researchers collect and analyze primary or secondary data, I
conducted a systematic content analysis of a corpus of 32 Master
of Science in Management theses which were deposited in Brock
University Library’s Digital Repository2. Master’s theses must be
published in the digital repository as a graduation requirement,
so this sample represented 100% of the MSCM degrees awarded
since the inception of the program. Each thesis was hand coded
using a hybrid approach of manifest and latent coding, similar to
the approach taken by Nicholson and Bennett (2009), in order to
identify the business subfield, research design, and data collection
method employed, and the extent and nature of secondary data
use (see Appendix A). The full text of each thesis was reviewed,
with particular attention paid to the title page, acknowledgements,
abstract, table of contents, methods, and data sections.
Brock University’s MSCM program offers five subfields (referred to
as areas of specialization): (a) accounting, (b) finance, (c)
O & ISM, (d) marketing, and (e) organization studies. If the area of
specialization was not specifically stated on the title page, a code
was assigned based on the topic of the thesis and the home
subject area of the student’s thesis advisor (who was often cited in
the acknowledgements).
Each thesis was coded according to the choice of research design
and data collection method and the coding form allowed for the
possible use of more than one research design and data collection
method, as might be the case in a mixed method research strategy.
A core list of research designs and data collection methods was
compiled after a review of accounting, business, and social science
research methods textbooks (see Appendices B and C).
Finally, each thesis was analyzed for evidence of secondary data
use. Each secondary data source was identified by name and by
type (i.e., open or commercial / proprietary). Further investigation
was required in some cases to determine if a secondary source was
a commercial or an open source.
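The coding procedure described in this section can be pictured as one record per thesis. The following is a minimal sketch under stated assumptions: the actual coding form (Appendix A) is not reproduced here, so the class names, field names, and category labels (`ThesisCoding`, `SecondarySource`, `PRIMARY_METHODS`, `data_use`, `source_mix`) are hypothetical, and the example record is invented for illustration rather than taken from the corpus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SecondarySource:
    name: str         # source identified by name, e.g. "Compustat"
    source_type: str  # "open" or "commercial" (commercial / proprietary)


@dataclass
class ThesisCoding:
    specialization: str                 # one of the five areas of specialization
    research_designs: List[str]         # the form allowed more than one design
    data_collection_methods: List[str]  # ...and more than one method
    secondary_sources: List[SecondarySource] = field(default_factory=list)


# Assumption for this sketch: questionnaires are the only primary method coded.
PRIMARY_METHODS = {"questionnaire"}


def data_use(coding):
    """Classify a coded thesis as primary only, secondary only, both, or none."""
    uses_primary = any(m in PRIMARY_METHODS for m in coding.data_collection_methods)
    uses_secondary = bool(coding.secondary_sources)
    if uses_primary and uses_secondary:
        return "both"
    if uses_primary:
        return "primary only"
    if uses_secondary:
        return "secondary only"
    return "none"


def source_mix(coding):
    """Summarize whether the secondary sources are open, commercial, or mixed."""
    types = {s.source_type for s in coding.secondary_sources}
    if not types:
        return "none"
    if types == {"open"}:
        return "open only"
    if types == {"commercial"}:
        return "commercial only"
    return "open and commercial"


# Hypothetical record, not an actual thesis from the corpus.
example = ThesisCoding(
    specialization="finance",
    research_designs=["archival"],
    data_collection_methods=["archival-quantitative"],
    secondary_sources=[
        SecondarySource("Compustat", "commercial"),
        SecondarySource("CRSP", "commercial"),
    ],
)
print(data_use(example), "/", source_mix(example))
```

Keeping designs and methods as lists mirrors the form's allowance for mixed-method theses, even though a list with a single entry would be the common case.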
Findings
Distribution of Theses by Area of Specialization
Table 1 presents a comparison of the distribution of theses and
core faculty by area of specialization. The largest proportion of
theses came from the finance area, followed by the marketing
area. The proportions for these two areas were larger than one
might expect, based on the distribution of core faculty by area
of specialization as currently listed on the program’s website
(Brock University Goodman School of Business, 2015). Three of
the five subject area specializations were under-represented
(when compared to the distribution of core faculty) including:
accounting, O & ISM, and organization studies. The differences
in proportions might be a result of several factors such as the
relative newness of the MSCM program, the growth of the
program over time, and variations in student interest in each of
the specialized streams. According to the Appraisal Brief for the
MSCM program (Brock University Faculty of Business, 2005), at
the time the degree program was proposed there were 21 core
faculty distributed across four areas of specialization: (a) eight
faculty in accounting, (b) five faculty in finance, (c) five faculty in
management science, and (d) three faculty in marketing. Given the
two-year length of the program, and the fact that the organization
studies specialization was not added until the 2010-2011 academic
year, it is not as surprising to have just two theses completed in
organization studies. The 2014-2015 Graduate Calendar notes that
the specialized streams may not be offered every year if there is
insufficient student interest (Brock University, 2014).
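The over- and under-representation comparison reported in Table 1 can be reproduced with a short calculation. The following is a minimal sketch that takes the thesis and core faculty counts from Table 1 and labels each area by comparing its share of theses with its share of core faculty. The names `TABLE_1` and `representation` are illustrative, and the simple share comparison is an assumption about how the labels were assigned.

```python
# (theses, core faculty) counts per area of specialization, from Table 1.
TABLE_1 = {
    "Accounting": (5, 13),
    "Finance": (15, 8),
    "O & ISM": (3, 8),
    "Marketing": (7, 7),
    "Organization Studies": (2, 16),
}


def representation(table):
    """Return {area: (thesis_share, faculty_share, label)} for each area."""
    total_theses = sum(theses for theses, _ in table.values())
    total_faculty = sum(faculty for _, faculty in table.values())
    result = {}
    for area, (theses, faculty) in table.items():
        thesis_share = theses / total_theses
        faculty_share = faculty / total_faculty
        if thesis_share > faculty_share:
            label = "Over"
        elif thesis_share < faculty_share:
            label = "Under"
        else:
            label = "Equal"
        result[area] = (thesis_share, faculty_share, label)
    return result


for area, (t_share, f_share, label) in representation(TABLE_1).items():
    print(f"{area}: theses {t_share:.0%}, core faculty {f_share:.0%}, {label}")
```

Under this assumption the labels match Table 1: finance and marketing come out over-represented, and the other three areas under-represented.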
Distribution of Theses by Research Design and Data
Collection Method
Three types of research designs were employed in MSCM theses:
archival / secondary analysis, survey, and experimental (see Figure
1). There were no examples of case study designs, and none of
the theses employed more than one research design. The analysis
of data collection method use, as shown in Figure 2, noted three
different types of data gathering methods: archival-empirical/
quantitative, questionnaires, and archival-content analysis. Patterns
of data collection method use within each type of research design
appear in Table 2. Both examples of theses with experimental
designs used questionnaires for data collection, as did all seven of
the theses with survey research designs. Of the 23 theses which
employed the archival / secondary analysis research design, only
one engaged in a qualitative content analysis, while the other 22
engaged in the empirical analysis of quantitative data. None of the
theses employed more than one type of method for the collection
of data.
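The counts reported in this paragraph amount to a cross-tabulation of research design by data collection method. The following is a minimal sketch that expands those counts into one (design, method) pair per thesis and tallies them with the standard library. The names `CODED_THESES`, `crosstab`, `design_totals`, and `method_totals` are illustrative, and the pairs are reconstructed from the reported totals rather than from the individual coding forms.

```python
from collections import Counter

# One (research design, data collection method) pair per thesis, expanded
# from the counts reported in the text (2 + 7 + 1 + 22 = 32 theses).
CODED_THESES = (
    [("experimental", "questionnaire")] * 2
    + [("survey", "questionnaire")] * 7
    + [("archival", "content analysis")] * 1
    + [("archival", "quantitative")] * 22
)


def crosstab(pairs):
    """Count theses for each (design, method) combination."""
    return Counter(pairs)


def design_totals(pairs):
    """Count theses per research design (row totals)."""
    return Counter(design for design, _ in pairs)


def method_totals(pairs):
    """Count theses per data collection method (column totals)."""
    return Counter(method for _, method in pairs)


cells = crosstab(CODED_THESES)
for (design, method), count in sorted(cells.items()):
    share = count / design_totals(CODED_THESES)[design]
    print(f"{design:<13} {method:<17} {count:>2} ({share:.1%} of design)")
```

Because every thesis used exactly one design and one method, the row totals and the column totals both sum to 32.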
Table 1
Distribution of Theses and Core Faculty by Area of Specialization
Area of Specialization   Theses (%)   Core Faculty (%)   Over- or Under-Represented
Accounting               5 (16%)      13 (25%)           Under
Finance                  15 (47%)     8 (15%)            Over
O & ISM                  3 (9%)       8 (15%)            Under
Marketing                7 (22%)      7 (13%)            Over
Organization Studies     2 (6%)       16 (30%)           Under
Total                    32 (100%)    52 (100%)
Figure 1 Overall patterns of research design use in MSCM theses (N=32).
Shares shown: archival 72%, survey 22%, experimental 6%.
Distributions of Research Designs and Data Collection Methods
by Area of Specialization
The distributions of research designs and data collection
methods by area of specialization are presented in Table 3
and Table 4. The archival / secondary analysis research design
was employed at least once within each area of specialization,
but was most heavily used within the finance and accounting
specializations. Survey designs were employed in four of the five
areas of specialization, while experimental designs were used in
two of the marketing theses. The marketing area exhibited the
widest variety of research designs, while finance used only one
type of research design. An analysis of data collection method
use by area of specialization revealed widespread use of the
archival – empirical/quantitative method, with evidence of use
within four of the five areas of specialization.
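The statements in this paragraph can be checked directly against the counts in Table 3. The following is a minimal sketch that stores those counts, computes row percentages, and lists which areas used each design. The names `DESIGNS`, `TABLE_3`, `row_percentages`, `areas_using`, and `designs_used` are illustrative.

```python
# Column order and per-area counts as reported in Table 3.
DESIGNS = ("experimental", "survey", "case", "archival")
TABLE_3 = {
    "Accounting": (0, 1, 0, 4),
    "Finance": (0, 0, 0, 15),
    "O & ISM": (0, 1, 0, 2),
    "Marketing": (2, 4, 0, 1),
    "Organization Studies": (0, 1, 0, 1),
}


def row_percentages(counts):
    """Percentage of an area's theses using each design, rounded to 0.1."""
    total = sum(counts)
    return tuple(round(100 * c / total, 1) for c in counts)


def areas_using(design):
    """Areas of specialization with at least one thesis using the design."""
    idx = DESIGNS.index(design)
    return [area for area, counts in TABLE_3.items() if counts[idx] > 0]


def designs_used(area):
    """Research designs that appear at least once within an area."""
    return [d for d, c in zip(DESIGNS, TABLE_3[area]) if c > 0]


for design in DESIGNS:
    print(f"{design}: used in {len(areas_using(design))} of {len(TABLE_3)} areas")
for area, counts in TABLE_3.items():
    print(area, row_percentages(counts))
```

The sketch rounds to one decimal place, so a few values differ by 0.1 from the printed table (for example 66.7 rather than 66.6 for two of three O & ISM theses); the printed percentages appear to be truncated rather than rounded.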
In order to make sense of the patterns of research design and
data collection method use by area of specialization, I also
examined the course descriptions for each area of specialization
in the MSCM program. Students in all specializations except
finance take a two-course research methodology sequence
which covers topics such as: multivariate statistical techniques,
advanced regression analysis, measurement and scaling,
survey research and questionnaire design, sampling methods,
qualitative research, and structural equation modeling (Brock
University, 2014). Students in the finance specialization take
a two-course sequence in empirical finance which covers
empirical research methods and econometric techniques in
investment finance (Brock University, 2014). Students in the
accounting stream also take additional courses which
cover accounting theory and research methods in
behavioural accounting research and market-based
research, while the O & ISM specialization includes
courses on modeling, data mining, mathematical
programming, simulation, and forecasting. Looking
again at Table 3 and Table 4, patterns of use begin
to emerge, with finance, accounting, and O & ISM
theses favouring archival designs and quantitative
analysis methods, while marketing and organization
studies theses used a variety of research designs and
data collection methods. Insights from a case study
of an MSc program in finance in the United Kingdom
confirmed that students were exposed to secondary
data and regression analysis as the model to follow in
their own research (Belghitar & Belghitar, 2010, p. 578).
Figure 2 Overall patterns of data collection method use in MSCM theses (N=32).
Shares shown: archival – quantitative 69%, questionnaire 28%, archival – content analysis 3%.
Table 2
Patterns of Data Collection Method Use by Research Design
Research Design   Questionnaire   Archival – Content Analysis   Archival – Quantitative   Total
Experimental      2 (100%)        0 (0%)                        0 (0%)                    2 (100%)
Survey            7 (100%)        0 (0%)                        0 (0%)                    7 (100%)
Archival          0 (0%)          1 (4.3%)                      22 (95.7%)                23 (100%)
Total             9 (28%)         1 (3.1%)                      22 (68.7%)                32 (100%)
Primary and secondary data collection
This study also explored the nature of primary and
secondary data collection and use across all areas of
specialization, and within each specialization. Table 5
presents the patterns of primary and secondary data
collection across all areas of specialization. MSCM
theses showed a greater reliance on secondary data
sources, less reliance on primary data collection, and
no evidence of combining primary and secondary
data collection, when compared to the Nicholson and
Bennett (2009) analysis of business ethics dissertations.
Primary data collection methods were used in four of
the five areas of specialization, and secondary data
sources were used in all five areas of specialization
(see Table 6). The Finance area relied exclusively
on secondary data sources, as did the majority of
accounting and operations and information systems management
theses. All the theses which collected primary data relied on some
form of a questionnaire for data collection (see Table 4). Although
not a focus of this study, a latent analysis revealed that a variety of
methods were used to collect questionnaire data including printed
questionnaires and online software packages (e.g., MediaLab,
SurveyMonkey, and Qualtrics). However, in some cases it was not
possible to determine if the questionnaires were administered
using paper or online instruments.

Table 3
Research Design Use by Area of Specialization, Percentage of Row Totals
Specialization   Experimental   Survey       Case      Archival      Total
Accounting       0 (0%)         1 (20%)      0 (0%)    4 (80%)       5 (100%)
Finance          0 (0%)         0 (0%)       0 (0%)    15 (100%)     15 (100%)
O & ISM          0 (0%)         1 (33.3%)    0 (0%)    2 (66.6%)     3 (100%)
Marketing        2 (28.5%)      4 (57.1%)    0 (0%)    1 (14.2%)     7 (100%)
Organ. Studies   0 (0%)         1 (50%)      0 (0%)    1 (50%)       2 (100%)
Total            2 (6.2%)       7 (21.8%)    0 (0%)    23 (71.8%)    32 (100%)

Table 4
Data Collection Method Use by Area of Specialization, Percentage of Row Totals
Specialization   Questionnaire   Archival – Content Analysis   Archival – Quantitative   Total
Accounting       1 (20%)         0 (0%)                        4 (80%)                   5 (100%)
Finance          0 (0%)          0 (0%)                        15 (100%)                 15 (100%)
O & ISM          1 (33.3%)       0 (0%)                        2 (66.6%)                 3 (100%)
Marketing        6 (85.7%)       0 (0%)                        1 (14.2%)                 7 (100%)
Organ. Studies   1 (50%)         1 (50%)                       0 (0%)                    2 (100%)
Total            9 (28%)         1 (3.1%)                      22 (68.7%)                32 (100%)
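
The row percentages reported in Table 3 can be recomputed from the raw counts. The following is a minimal sketch rather than the author's procedure: the function names and the dictionary layout are assumptions made for illustration, and only the counts come from the table. The published figures appear to be truncated rather than rounded, so the last digit may differ from the values computed here (e.g., 28.6 versus 28.5).

```python
# Counts transcribed from Table 3 (research design by area of specialization).
# Column order: Experimental, Survey, Case, Archival.
TABLE_3_COUNTS = {
    "Accounting": [0, 1, 0, 4],
    "Finance": [0, 0, 0, 15],
    "O & ISM": [0, 1, 0, 2],
    "Marketing": [2, 4, 0, 1],
    "Organ. Studies": [0, 1, 0, 1],
}


def row_percentages(counts):
    """Return each cell as a percentage of its row total, rounded to one decimal."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [round(100 * c / total, 1) for c in counts]


def column_totals(table):
    """Sum each column across all areas of specialization."""
    return [sum(column) for column in zip(*table.values())]


if __name__ == "__main__":
    for area, counts in TABLE_3_COUNTS.items():
        print(area, row_percentages(counts), "n =", sum(counts))
    totals = column_totals(TABLE_3_COUNTS)
    print("Total", row_percentages(totals), "n =", sum(totals))
```

The same two helpers apply to Table 4 if its counts are substituted for those of Table 3.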
Secondary source types used in business research
This study investigated both the nature and types of secondary
sources used and revealed that very few theses relied exclusively
on open data sources. 28% of theses used commercial data
sources exclusively, and 37% of theses used a combination of
open and commercial data sources (see Table 5). Many theses
used multiple secondary data sources, including four theses which
used five or more secondary data sources (Table 5). A detailed
listing of these open and commercial secondary data sources,
including source type, example names, and frequency
of use, is presented in Appendix D. The secondary
sources included many of the data types identified
in the Nicholson and Bennett (2009) study, but given
this study’s Canadian location, it was not surprising to
find some Canadian equivalents, such as SEDAR (for
corporate financial reports and filings) and CANSIM (for
socioeconomic data). Some, but not all, of the theses
that used secondary sources relied on commercial
numeric datasets hosted by the Library or Business
School (e.g., Bloomberg Professional, CFMRC, Compustat,
CRSP, Datastream). Other theses used datasets such as
the Internet Retailer’s Top 500 Guide, which may have
been purchased directly by the student or provided by the
student’s faculty advisor.
Discussion
Implications for research data management
Libraries planning research data services may use a
life cycle model to describe the real-world activities
of their researchers (Carlson, 2014). The research
data lifecycle consists of the following data related
activities: discovery and planning, data collection, data
processing and analysis, publishing and sharing, long
term management, and reusing data (Corti et al., 2014
p. 17). Viewed through the lens of the research data
lifecycle, this study’s analysis of business master’s theses
identified variations between and within business
subfields with respect to research design and data
collection method use which have implications for the
development of research data management services for
business researchers.
Over 70% of graduate student business researchers
could be categorized as data consumers, while less
than 30% could be considered data producers. Archival
/ secondary analysis research designs were employed
at least once within each business subfield, and
constituted the majority of theses in the accounting,
finance, and operations and information systems
management subfields. The discovery and acquisition
of secondary data sources are crucial real-world activities
for these business data consumers. A minority of theses
employed survey or experimental research designs and
collected research data via questionnaires. Due to the small sample
size, clear subdisciplinary patterns could not be found, but the
researchers collecting primary data in this study were more likely
to be from the marketing subfield. The real-world data activities
of these business data producers, such as planning data collection
protocols and obtaining informed consent, closely align with social
science researchers employing survey or experimental research
designs. This segment of researchers could be targeted for future
research data management services.
Unlike in the United States and the United
Kingdom, there is less urgency with respect to developing research
data management plans in Canada due to a lack of public funding
agency data sharing mandates. The Tri-Agency Open Access
Policy on Publication, which applies to peer-reviewed journal
publications, was announced on February 27, 2015, but the sharing
of publication-related research data is only required by Canadian
Institutes of Health Research funding recipients (Tri-Agency, 2015).
Barriers to publishing and sharing business data exist due to the
proprietary nature of many of the data sets used in accounting
and finance research that relies on econometric methods. While
some top economics journals have mandatory data availability
policies which require authors to submit the datasets used in their
research, most journals do allow exemptions for research based
on proprietary or confidential sources such as Thomson Reuters
Datastream (Vlaeminck, 2012, 2013).

Table 5
Primary and Secondary Data Collection Across All Areas of Specialization (N = 32)
Description                                               Number (%)
Theses with primary data collection                       9 (28%)
Theses with secondary data collection                     23 (72%)
  • Collected only open data sources                      2 (6%)
  • Collected only commercial data sources                9 (28%)
  • Collected both open and commercial sources            12 (37%)
Theses with both primary and secondary data collection    0 (0%)

Number of secondary data sources used                     Number (%)
1 Source                                                  3 (13%)
2 Sources                                                 6 (26%)
3 Sources                                                 3 (13%)
4 Sources                                                 4 (17%)
5 Sources or more                                         4 (17%)
Total                                                     23 (100%)

Table 6
Primary and Secondary Data Collection by Area of Specialization, Percentage of Row Totals
Specialization   Primary    Secondary    Total
Accounting       1 (20%)    4 (80%)      5 (100%)
Finance          0 (0%)     15 (100%)    15 (100%)
O & ISM          1 (25%)    3 (75%)      4 (100%)
Marketing        6 (85%)    1 (15%)      7 (100%)
Organ. Studies   1 (50%)    1 (50%)      2 (100%)
Total            9 (28%)    23 (72%)     32 (100%)
Implications for reference support and collection development
Liaison librarians and data specialists often provide data
reference support to novice researchers such as undergraduate
students (Partlo, 2010). According to the ‘Seven Ages of Research’
model, master’s degree students (who are in the first age of the
researcher’s lifecycle) are engaged in research for a limited period
of time and may not be seeking a career in academia (Bent,
Gannon-Leary, & Webb, 2007, p. 85). Researchers in this early career
stage often turn to thesis supervisors for guidance on the research
process, academic writing, and information retrieval, to varying
degrees of satisfaction (Bent, Gannon-Leary, & Webb, 2007, p. 89).
Librarians serving business schools with research-based master’s
degree programs with an accounting or finance emphasis, where
students are expected to engage in archival quantitative research,
may want to direct their energy toward providing higher levels
of support for secondary data discovery and extraction, and then
promoting this service to faculty and graduate students (e.g.,
Kellam, 2011).
Liaison librarians and data specialists who develop data collections
to support academic programs in their institutions will need to
work closely with disciplinary faculty in accounting, finance, or
other subfields that rely heavily on secondary data, to identify
key data sources. Thesis and dissertation content analysis can
provide much needed evidence of student usage, and strengthen
the argument that these databases support the curriculum (e.g.,
when writing the library or data services support portions of
program appraisal briefs or reaccreditation reviews). A content
analysis of faculty journal publications can uncover hidden data
sources (e.g., owned or licensed by faculty members for use by
their own research groups, but not listed on the Library website
or a Departmental resources web page), which may be missed
by traditional database benchmarking efforts that examine such
listings. One must also keep in mind that traditional citation
analysis studies which only look at bibliographies (for a review, see
Hoffman & Doucette, 2012), will drastically undercount the use
of secondary data sources which are often only cited within the
methods sections of dissertations and journal articles. In times
of Library materials budget cuts, expensive subscription-based
financial datasets with high cost per use may be prime targets
for cancellation, but are fundamental to the research process for
financial scholars.
Limitations
There were three aspects of this study which limited the
generalization of the findings to the broader population of
academic business researchers. The first limitation was due to
the small sample size which, while representative of 100% of the
population of Brock University’s MSCM students, did not reflect
the subdisciplinary distribution of core faculty supervisors in
the MSCM program. The second limitation was due to the focus
of the study on only one career stage of researchers. While a
graduate student’s choice of research designs and data collection
methods may mirror the standard protocols within a discipline,
the short duration of master’s degree programs may limit his or
her choices (see Belghitar & Belghitar, 2010, p. 579), and therefore
may not be representative of research designs employed within
doctoral theses or studies published by faculty in peer-reviewed
journal articles. The third limitation was due to the use of a single
coder, rather than multiple coders. However, the results were
strengthened by the nature of the coding scheme, which relied
primarily on manifest coding, and focused on measuring the
frequency of occurrence of each variable (Neuman, 2003). This
exploratory study served as a pilot project to test the thesis coding
scheme, which could easily be adapted for broader use at other
institutions by using a broader list of business subfields, such
as the list of 16 subfields employed by the Council of Canadian
Academies in their bibliometric analysis (CCA, 2009b, p. 11).
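
If the coding scheme were applied by two coders, as the single-coder limitation suggests, their agreement could be checked before pooling the codes. The sketch below uses Cohen's kappa, a common agreement statistic for two coders. It is offered as an assumption about how such a check might look, not as part of this study's method, and the example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per item."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of items.")
    n = len(coder_a)
    # Proportion of items on which the two coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal code frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical code throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical research design codes for four theses.
    first_coder = ["archival", "archival", "survey", "survey"]
    second_coder = ["archival", "archival", "survey", "archival"]
    print(cohens_kappa(first_coder, second_coder))
```

Because the scheme relies mainly on manifest coding, agreement would be expected to be high, but that remains an expectation until it is measured.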
Conclusion
This study used the content analysis method to provide insight
into primary and secondary data use by master’s level business
students at one institution. The content analysis found variations
in the choice of research designs and data collection methods
used across the business discipline, as well as variations within
business subfields. The results of this exploratory study can serve
as a benchmark for future discipline-specific content analysis
studies, and the content analysis method can be used to examine
variations in data practices within other social science disciplines.
The study could be extended to examine business research from
multiple academic institutions, multiple career stages (e.g., doctoral
students, faculty, and postdoctoral researchers), or a variety of
types of research output including doctoral dissertations and
peer-reviewed journal articles. The use of multiple coders would
allow for the examination of a larger and more representative
sample of the broader business research landscape. Given the
Canadian Association of Research Libraries’ interest in developing
a collaborative, networked approach to building a research
data management infrastructure (for a discussion see Canadian
Association of Research Libraries, 2013), a collaborative approach
to conducting future discipline-specific content analysis studies is
recommended.
References
Akers, K.G. & Doty, J. (2013). Disciplinary differences in faculty research
data management practices and perspectives. International Journal
of Digital Curation, 8(2), p. 5-26.
Belghitar, Y. & Belghitar, G.S. (2010). The role of critical evaluation in
finance education: Insights from an MSc Programme. Accounting
Education: An International Journal, 19(6), p. 569-586.
Bent, M., Gannon-Leary, P., & Webb, J. (2007). Information literacy in a
researcher’s learning life: The seven ages of research. New Review of
Information Networking, 13(2), p. 84-99.
Brock University (2014). Master of Science in Management Program
Description, Brock University Graduate Calendar, 2014-2015. [Online].
Available from: http://www.brocku.ca/webca/2014/graduate/mgmt.html
Brock University (2010). Master of Science in Management Program
Description, Brock University Graduate Calendar, 2010-2011. [Online].
Available from: http://www.brocku.ca/webcal/2010/graduate/mgmt.html
Brock University (2007). Master of Science in Management Program
Description, Brock University Graduate Calendar, 2007-2008. [Online].
Available from: http://brocku.ca/webcal/2007/graduate/mgmt.html
Brock University Faculty of Business (2005). Appraisal Brief Master of
Science (M.Sc.) in Management. [Online]. Available from:
http://www.brocku.ca/webfm_send/1289
Brock University Goodman School of Business (2015). Msc in
Management: Core Faculty/Supervisors. [Online]. Available from:
http://www.brocku.ca/business/future/graduate/researchdegrees/msc/core-faculty
Brock University Institutional Analysis & Planning (2014). Brock Facts
and Supplemental Dynamic Reports. [Online]. Available from:
http://brocku.ca/institutional-analysis
Brock University Library (2012). Brock University Library Strategic Plan.
[Online]. Available from: https://www.brocku.ca/webfm_send/23579
Bryman, A., Bell, E., Mills, A.J. & Yue, A.R. (2011). Business research
methods, Canadian Edition. Don Mills, ON: Oxford University Press.
Canadian Association of Research Libraries (2013). Facilitation,
collaboration, and cooperation: A Canadian research data
management network. [Online]. Available from:
http://www.carl-abrc.ca/uploads/SCC/Canadian_RDMN-Dec-2-2013-summary
Carlson, J. (2012). Demystifying the data interview: developing a
foundation for reference librarians to talk with researchers about
their data. Reference Services Review, 40(1), p. 7-23.
Carlson, J. (2014). The use of life cycle models in developing
and supporting data services. In Ray, J.M (ed.). Research data
management: practical strategies for information professionals. West
Lafayette: Purdue University Press.
Corti, L., Van den Eynden, V., Bishop, L. & Woollard, M. (2014). Managing
and sharing research data: A guide to good practice. Los Angeles:
Sage.
Council of Canadian Academies. Expert Panel on Management,
Business, and Finance Research (2009a). Better research for better
business. Ottawa: Council of Canadian Academies. [Online]. Available
from: http://www.scienceadvice.ca/en/assessments/completed/research-business.aspx
Council of Canadian Academies. Expert Panel on Management,
Business, and Finance Research (2009b). Better research for better
business. Report Appendices. [Online]. Ottawa: Council of Canadian
Academies. Available from:
http://www.scienceadvice.ca/en/assessments/completed/research-business.aspx
Creswell, J.W. (2003) Research design: Qualitative, quantitative, and
mixed methods, Second Edition. Thousand Oaks, CA: Sage.
Hoffman, K. & Doucette, L. (2012). A Review of Citation Analysis
Methodologies for Collection Management. College & Research
Libraries, 73(4), p. 321-335.
Hong, E. & Lowry, L. (2007). Business data: issues and challenges from
the Canadian perspective. IASSIST Quarterly, (Spring), p. 9-13.
Kellam, L.M. (2011). Numeric data services and sources for the general
reference librarian. Oxford: Chandos Publishing.
Key Perspectives (2010). Data dimensions: disciplinary differences in
research data sharing, reuse, and long term viability. SCARP Synthesis
study. [Online]. Edinburgh: Digital Curation Centre. Available from:
http://www.dcc.ac.uk
McClure, M., Level, A.V., Cranston, C.L., Oehlerts, B. & Culbertson, M.
(2014). Data curation: a study of researcher practices and needs.
Portal: Libraries and the Academy, 14(2), p. 139-164.
McLennan, C.J., Moyle, B.D. & Weiler, B.V. (2013). The role of economics
in tourism postgraduate research: An analysis of doctoral
dissertations completed between 2000 – 2010. Journal of Applied
Economics and Business Research, 3(4), p. 181-191.
Miller, P.J. & Cameron, R. (2011). Mixed method research designs: a
case study of their adoption in a doctor of business administration
program. International Journal of Multiple Research Approaches,
5(3), p. 382-402.
Neuman, W.L. (2003). Social research methods: qualitative and
quantitative approaches, Fifth Edition. Boston: Allyn and Bacon.
Nicholson, S.W. & Bennett, T.B. (2009). Transparent practices: primary
and secondary data in business ethics dissertations. Journal of
Business Ethics, 84(3), p. 417-425.
Oler, D.K, Oler, M.J. & Skousen, C.J. (2010). Characterizing accounting
research. Accounting Horizons, 24(4), p. 635-670.
Parham, S.W., Bodnar, J., & Fuchs, S. (2012). Supporting tomorrow’s
research: assessing faculty data curation needs at Georgia Tech.
College & Research Libraries News, 73(1), p. 10-13.
Partlo, K. (2010). The pedagogical data interview. IASSIST Quarterly
(Winter/Spring), p. 6-10.
Rabinovich, E. & Cheon, S. (2011). Expanding horizons and deepening
understanding via the use of secondary data sources. Journal of
Business Logistics, 32(4), p. 303-316.
Scaramozzino, J.M., Ramirez, M.L. & McGaughey, K.J. (2012). A study of
faculty data curation behaviors and attitudes at a teaching-centered
university. College & Research Libraries, 73(4), p. 349-365.
Smith, M. (2011). Research methods in accounting, 2nd edition. Los
Angeles: Sage.
Steeleworthy, M. (2014). Research data management and the
Canadian academic library: An organizational consideration of
data management and data stewardship. [Online]. Partnership: The
Canadian Journal of Library and Information Practice and Research,
9(1). Available from:
https://journal.lib.uoguelph.ca/index.php/perj/article/view/2990/3278
Tri-Agency (CIHR, NSERC, & SSHRC). (2015). Tri-Agency Open Access
Policy on Publications. [Online]. 27th February. Available from:
http://www.science.gc.ca/default.asp?lang=En&n=F6765465-1
Vlaeminck, S. (2012). Data policies of economics journals: research
data management in economic journals. [Online]. 10th
December. Available from:
http://openeconomics.net/resources/data-policies-of-economics-journals/
Vlaeminck, S. (2013). Data management in scholarly journals and
possible roles for libraries – some insights from EDaWaX. Liber
Quarterly, 23(1), p. 48-79.
Weller, T. & Monroe-Gulick, A. (2014). Understanding methodological
and disciplinary differences in the data practices of academic
researchers. Library Hi Tech, 32(3), p. 467-482.
Whyte, A. (2014). A pathway to sustainable research data services:
from scoping to sustainability. In Pryor, G., Jones, S. & Whyte, A. (eds.)
Delivering research data management services: Fundamentals of
good practice. London: Facet Publishing.
Williams, S.C. (2013). Using a bibliographic study to identify faculty
candidates for data services. Science & Technology Libraries, 32(2),
p. 202-209.
notes
1 Linda D. Lowry is the business and economics liaison librarian at
Brock University in St. Catharines, Ontario, Canada. She can be
reached by email: llowry@brocku.ca
2 https://dr.library.brocku.ca/
Appendix A
Content Analysis Coding Form
Variable Name: Area of Specialization (circle one)
Codes: Accounting (1); Finance (2); Operations and Information Systems
Management (3); Marketing (4); Organization Studies (5)

Variable Name: Research Design (circle all that apply)
Codes: Experimental (1); Survey (2); Case study/Field Research (3);
Archival / Secondary Analysis (5); Other (specify) (6)

Variable Name: Data Collection Method (circle all that apply)
Codes: Questionnaire (1); Structured interview (2); Focus group (3);
Ethnography/observation (4); Archival-content analysis (5);
Archival-empirical/quantitative (6); Other (specify) (7)

Variable Name: Type of Secondary Data Collected
Codes: Open only (specify sources or names of data sets) (1);
Proprietary only (specify sources or names of data sets) (2);
Both (3) (specify sources or names of data sets);
Not applicable (primary data only) (4)
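
The coding form above can be represented as a small data structure so that each thesis is recorded consistently. This is a sketch under stated assumptions: the class and field names are hypothetical, the numeric codes are those printed on the form (including the gap at research design code 4), and the example record is invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Area(Enum):
    ACCOUNTING = 1
    FINANCE = 2
    OPERATIONS_AND_ISM = 3
    MARKETING = 4
    ORGANIZATION_STUDIES = 5


class ResearchDesign(Enum):
    EXPERIMENTAL = 1
    SURVEY = 2
    CASE_STUDY_FIELD_RESEARCH = 3
    ARCHIVAL_SECONDARY_ANALYSIS = 5  # the printed form has no code 4
    OTHER = 6


class CollectionMethod(Enum):
    QUESTIONNAIRE = 1
    STRUCTURED_INTERVIEW = 2
    FOCUS_GROUP = 3
    ETHNOGRAPHY_OBSERVATION = 4
    ARCHIVAL_CONTENT_ANALYSIS = 5
    ARCHIVAL_QUANTITATIVE = 6
    OTHER = 7


class SecondaryDataType(Enum):
    OPEN_ONLY = 1
    PROPRIETARY_ONLY = 2
    BOTH = 3
    NOT_APPLICABLE = 4  # primary data only


@dataclass
class CodedThesis:
    """One completed coding form (hypothetical record layout)."""
    area: Area
    designs: set            # "circle all that apply"
    methods: set            # "circle all that apply"
    secondary_data: SecondaryDataType
    sources: list = field(default_factory=list)  # names of data sets, if any

    def uses_primary_data(self):
        return self.secondary_data is SecondaryDataType.NOT_APPLICABLE


# Invented example: a finance thesis using two proprietary databases.
example = CodedThesis(
    area=Area.FINANCE,
    designs={ResearchDesign.ARCHIVAL_SECONDARY_ANALYSIS},
    methods={CollectionMethod.ARCHIVAL_QUANTITATIVE},
    secondary_data=SecondaryDataType.PROPRIETARY_ONLY,
    sources=["Compustat", "CRSP"],
)
```

Using sets for research design and data collection method mirrors the form's "circle all that apply" instruction, while the single-valued fields mirror "circle one".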
Appendix B
Descriptions of Research Designs
Research Design: Experimental
Description: Includes classical experimental designs (random assignment,
pretest, post-test, experimental group, control group) or
quasi-experimental designs (Neuman, 2003, p. 247).
Examples from textbooks: Experimental (Bryman et al., 2011; Smith, 2011;
Neuman, 2003).

Research Design: Survey
Description: “Quantitative social research in which one systematically
asks many people the same questions, then records and analyzes their
answers” (Neuman, 2003, p. 546).
Examples from textbooks: Cross-sectional or Social Survey (Bryman et al.,
2011); Survey (Smith, 2011; Neuman, 2003).

Research Design: Case Study / Field Research
Description: Case study “entails the detailed and intensive analysis of a
single case” (Bryman et al., 2011, p. 571); Field research is “a type of
qualitative research in which a researcher directly observes the people
being studied in a natural setting for an extended period” (Neuman, 2003,
p. 535).
Examples from textbooks: Case Study (Bryman et al., 2011); Fieldwork
(Smith, 2011); Field Research (Neuman, 2003).

Research Design: Archival / Secondary Analysis
Description: Archival: research using secondary sources such as historical
documents, texts, journal articles, corporate annual reports, and company
disclosures to conduct time-series, cross-section data analysis, content
analysis or critical analysis (Smith, 2011); Secondary Analysis: “research
in which one does not gather data oneself, but re-examines data previously
gathered by someone else and asks new questions” (Neuman, 2003, p. 544).
Examples from textbooks: Archival (Smith, 2011); Secondary Analysis
(Neuman, 2003).
Appendix C
Descriptions of Data Collection Methods
Data Collection Method: Questionnaire
Description: “A collection of questions administered to respondents”
(Bryman et al., 2011, p. 579).
Examples from textbooks: Self-completion questionnaires (Bryman et al.,
2011); Mail and online surveys (Smith, 2011); Mail and self-administered
questionnaires (Neuman, 2003).

Data Collection Method: Structured Interview
Description: “A research interview in which all respondents are asked
exactly the same questions in the same order with the aid of a formal
interview schedule” (Bryman et al., 2011, p. 581).
Examples from textbooks: Structured interviewing (Bryman et al., 2011);
Interviews (Smith, 2011); Telephone and face-to-face interviews (Neuman,
2003).

Data Collection Method: Focus Group
Description: “A form of group interview in which: there are several
participants; there is an emphasis in the questioning on a particular
fairly tightly defined topic; and the emphasis is upon interaction with
the group and the joint construction of meaning” (Bryman et al., 2011,
p. 575).
Examples from textbooks: Focus groups (Bryman et al., 2011; Neuman, 2003).

Data Collection Method: Ethnography / Observation
Description: A composite category which includes ethnography (immersion in
a social setting) and structured observation (observing and recording
behaviour) (Bryman et al., 2011, pp. 574, 581).
Examples from textbooks: Structured observation, ethnography & participant
observation (Bryman et al., 2011); Complete participant, complete
observer, participant-observer (Smith, 2011); Nonreactive / unobtrusive
observation; Ethnography (Neuman, 2003).

Data Collection Method: Archival – Content Analysis
Description: “a systematic analysis of texts (which may be printed or
visual) to determine the presence, association, and meaning of images,
words, phrases, concepts, and/or themes” (Bryman et al., 2011, p. 375).
Examples from textbooks: Content analysis (Bryman et al., 2011; Smith,
2011; Neuman, 2003).

Data Collection Method: Archival – Empirical / Quantitative
Description: The empirical analysis of secondary data sources (such as
cross-sectional or time-series data).
Examples from textbooks: Secondary analysis (Bryman et al., 2011; Neuman,
2003); Archival (econometric analysis of cross-sectional or time series
data) (Smith, 2011).
Appendix D
Open and Commercial Secondary Data Sources Used in MSCM Theses
Classification: Open Data Sources
Data Type                                        Examples                                      Number
Canadian Public Company & Mutual Fund filings    SEDAR                                         5
Socioeconomic Data                               CANSIM; FRED                                  3
Stock Exchange websites                          FTSE; ME; NASDAQ                              3
Canadian regulatory filings                      IIROC                                         1
US Government websites                           EPA                                           1
Bankruptcy Cases                                 UCLA-LoPucki Bankruptcy Research Database     1
Other academic datasets                          Hasbrouck’s Liquidity Estimates               1
Organizational websites                          Ontario Winery websites                       1

Classification: Commercial / Proprietary Sources
Data Type: Library subscriptions, Numeric Databases
Examples                      Number
Compustat                     8
Datastream                    7
CFMRC                         5
Bloomberg                     5
CRSP                          4
Execucomp                     2
Fundata                       2
Risk Metrics                  2
TRACE                         1
IBES                          1
Carbon Disclosure Project     1

Data Type: Library subscriptions, Bibliographic Databases and Full Text Publications
Examples                      Number
Lexis-Nexis                   4
TSX E-Review                  2
CBCA                          1
Hoover’s                      1
Law Source                    1

Data Type: Other (not Library-hosted): Ecommerce data; web search analytics; financial trading and ownership data
Examples (in listed order): Internet Retailer Top 500 Guide; SpyFu; Keyword Spy; Econoday; Espeed; Global Hysales; 13f Spectrum
Number (in listed order): 3; 2; 1; 1; 1; 1; 1; 1
Video Title: Research Design: Observational and Correlational Studies
Originally Published: 2011
Publishing Company: SAGE Publications, Inc
City: Thousand Oaks, USA
ISBN: 9781483397108
DOI: https://dx.doi.org/10.4135/9781483397108
(c) SAGE Publications, Inc., 2011
NARRATOR: Research Design– Observational and Correlational Studies. Since the moment you
were born, you’ve been exploring the world around you. In a sense, you’ve been conducting research.
You’ve noticed the ways people interact with each other, the relative sizes of objects,
NARRATOR [continued]: and how the colors of nature change with the seasons. Each of us is an
amateur researcher, observing, analyzing, and drawing conclusions about everything we see. In order
to conduct a more formal study whose conclusions you can share with others, you need to apply
scientific methods to your research.
NARRATOR [continued]: Knowing about scientific research methods will also help you understand,
interpret, and be more analytical in your thinking about studies you read about in textbooks, journals,
newspapers, or online. To make sure your research is as strong as possible, let’s talk about designing
your study and interpreting your results.
NARRATOR [continued]: Specifically, we’ll focus on some overarching types of research studies,
when to use an observational design, along with some advantages and disadvantages, two different
types of observational design, those that you conduct in the field and those that you conduct in a
laboratory,
NARRATOR [continued]: analyzing data from an observational study, including some statistical
methods, when to use a correlational design, along with some advantages and disadvantages, how
to design and implement one, and analyzing data from a correlational study.
NARRATOR [continued]: Before we begin to explore research designs, it is important to understand
the terms “variable” and “construct.” These terms are used interchangeably and are found throughout
scientific literature.
NICOLE CAIN: A “construct,” which can also be called a “variable,” is a topic of interest that varies
from person to person. Some examples of constructs that researchers are often interested in would
include things like quality of life, IQ or intelligence, or anxiety.
EVELYN BEHAR: Another variable that a lot of people are interested in is marital quality. [Evelyn
Behar, PhD, Assistant Professor of Psychology, University of Illinois at Chicago] Obviously, some
people will have very, very high-quality marriages, some people will have extremely low-quality
marriages, and then there will be lots of people in between those two extremes. So again, a variable
is exactly what it sounds like. It’s something that varies across different people
EVELYN BEHAR [continued]: and we also call it a
construct.
NARRATOR: Types of studies–
EVELYN BEHAR: In general, there are three basic types of designs in research. The first type is what
we call the “observational design.” This is when we just want to know what is the basic nature of
a particular construct or a particular variable. So we might ask, what is the basic nature of marital
quality?
EVELYN BEHAR [continued]: The second type of design is the “correlational design.” And essentially,
what we’re asking here is, how do two variables or two constructs relate to one another? So you might
SAGE
2011 SAGE Publications, Ltd. All Rights Reserved.
SAGE Research Methods Video
Page 2 of 12 Research Design: Observational and Correlational Studies
be asking, what’s the relationship between marital quality and parenting behaviors? The third type of
design is what’s called an “experimental design,” and this is a little bit
EVELYN BEHAR [continued]: different. This really speaks to cause and effect and this is where you
manipulate one variable and you see the effect that it has on the other variable. So even though we
would never actually run this type of study, you might randomly assign people to have either very,
very good marriages or very, very bad marriages and then see what effect it has on their parenting
behaviors
EVELYN BEHAR [continued]: and on their parenting ability. We would never actually run that study
but that’s what an experiment would look like.
NARRATOR: Observational Studies. Let’s focus first on observational studies. In this type of study,
we’re learning new information about one variable by watching, listening, and in some way measuring
or recording data.
NARRATOR [continued]: This is especially useful when it would be immoral or impossible to cause
a phenomenon to occur or when we’re interested in getting preliminary information on a brand new
topic before investing heavily in other more expensive and time-consuming types of research.
EVELYN BEHAR: So let’s say that you’re a medical doctor and someone comes into your office, into
your practice, and the person has all of a sudden grown neon-green hair. And you have never heard
of this phenomenon before. And the next day, another person comes in with neon-green hair and you
think to yourself, there must be something here.
EVELYN BEHAR [continued]: Now, you might be tempted to run a clinical trial to try to treat the green
hair phenomenon. You might be tempted to go and do all this expensive research. But you’ve only
seen two cases of it so first, maybe you should run an observational study. So you might want to call
10,000 households in the United States. And when the person answers the phone, you’re going to
ask,
EVELYN BEHAR [continued]: has anyone in your home developed neon-green hair? Let’s say that
you find no other cases of neon-green hair. Well, now you’ve just saved yourself a lot of time and
money. You don’t have to go and run a clinical trial. You’re not going to go and do all this expensive
research. But let’s say that out of the 10,000 households you called,
EVELYN BEHAR [continued]: lo and behold, there are 100 households with people who have neon-
green hair. So you can ask lots of questions but keep in mind that they’re still observational. Even
if you find out that the age of onset was age 30 for every one of these cases who developed neon-
green hair, you didn’t run an experiment. So you have to be very careful in drawing your conclusions.
EVELYN BEHAR [continued]: You cannot go on to say, turning 30 causes individuals to develop neon-
green hair.
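
The arithmetic behind the phone survey in this example can be sketched as follows. The figures (100 of 10,000 households) come from the example itself; the function name is an assumption, and the result describes only how common the condition is in the sample. It says nothing about what causes it.

```python
def prevalence(cases, sample_size):
    """Proportion of sampled households that report at least one case."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= cases <= sample_size:
        raise ValueError("cases must be between 0 and sample_size")
    return cases / sample_size


households_called = 10_000
households_with_green_hair = 100

rate = prevalence(households_with_green_hair, households_called)
print(f"{rate:.1%} of households called reported neon-green hair")
```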
NARRATOR: So we’ve seen that one of the advantages of the observational design is that it requires
less of an investment of money and other resources than correlational and experimental studies.
Also, it gives us a great deal of information about a construct. This is especially valuable when there
is a brand new phenomenon
NARRATOR [continued]: that has just come into existence or that at least has never been studied in
the past. The observational design also allows us to generate hypotheses about the construct. In Dr.
Behar’s green hair example, there seems to be some connection between turning 30 and developing
this condition.
NARRATOR [continued]: We can formulate a hypothesis about the connection and then test it using
correlational or experimental methods. On the downside, when we use an observational design, we
cannot draw any conclusions about the relationship between one construct and another. We also do
not find out about causes or effects.
NARRATOR [continued]: Once we choose to conduct an observational study, we need to decide
whether it will take place in the field or in a laboratory.
EVELYN BEHAR: Running an observational study in the field makes the most sense when you have
what’s called a “naturalistic question,” basically a question about a construct as it exists in the natural
environment.
NARRATOR: No matter what kind of study we’re conducting, it is extremely important to develop a
systematic method for recording our data. For an observational study in the field, this can be achieved
by creating a check sheet with certain potential observations. Each time we see a certain behavior or
characteristic, we place a checkmark in the appropriate column.
NARRATOR [continued]: It is a good idea to develop a check sheet that lists as many types of
observations as may possibly interest us when we analyze our data.
EVELYN BEHAR: If you want to look at basic friendliness levels in society, you might literally go and
ride an elevator all day long. You could ride many different elevators in many different buildings.
NARRATOR: In this case, we probably will want to prepare ourselves for many levels of friendliness in
order to give ourselves a chance to see varying degrees of friendliness. Our check sheet may include
“Total Avoidance,” “Eye Contact,” “Nod,” “Greeting,” and “Compliment.” We also may want to reserve
our final column for “Other,”
NARRATOR [continued]: in case we observe something we didn’t expect. Not everything can be
observed and recorded in the field, however. If we need a more controlled environment or more
sophisticated measuring apparatus, we may need to bring our participants into the laboratory.
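The check sheet described above can be sketched in a few lines of Python. This is only an illustration: the column names come from the narrator's friendliness example, while the `tally` helper and the sample observations are hypothetical.

```python
from collections import Counter

# Columns of the check sheet, taken from the narrator's friendliness example.
CATEGORIES = ["Total Avoidance", "Eye Contact", "Nod", "Greeting", "Compliment", "Other"]


def tally(observations):
    """Count observations per check-sheet column; unexpected labels go to 'Other'."""
    counts = Counter({category: 0 for category in CATEGORIES})
    for label in observations:
        counts[label if label in CATEGORIES else "Other"] += 1
    return counts


# Hypothetical observations from one elevator ride (invented for illustration).
ride = ["Nod", "Eye Contact", "Nod", "Total Avoidance", "Handshake", "Greeting"]
sheet = tally(ride)

for category in CATEGORIES:
    print(f"{category:16} {sheet[category]}")
```

Anything not anticipated by the sheet, such as the invented "Handshake" entry, lands in the "Other" column, which is the role the narrator gives that final column.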
EVELYN BEHAR: If you were interested in looking at IQ levels, obviously, you can’t just go out into
the field, look at someone, and assess what their IQ is. You need to bring them into the lab. Have
them undergo an entire procedure where they take an IQ test because in order to do this type of
research, you have to have a controlled environment.
NARRATOR: When we conduct observational studies in the laboratory, it is somewhat easier to
record our measurements on a computer than it is when we are in the field, though sometimes, we
may be more comfortable working on paper and later transferring the data into a program that can
help us analyze it.
NICOLE CAIN: In an observational design that takes place out in the field, your participants are
anonymous. [Nicole Cain, PhD, Assistant Professor of Psychology, Long Island University Brooklyn
Campus] Oftentimes, they don’t even know that you’re observing their behavior. So it’s not necessary
to get their permission because they’re anonymous. This means, though, that you can’t have any
photographs of them or any video recordings and you cannot have any identifying information about
them at all.
NICOLE CAIN [continued]: In contrast, in an observational design that takes place in a laboratory,
you do have identifying information about them. So in this case, you do need to get their permission
in order to record and track their behavior. This often takes the form of what’s called an “informed
consent form for research.” This is an explanation of the study,
NICOLE CAIN [continued]: along with an explanation of the risks and benefits to participating in the
study and an explanation of how you plan to keep their data confidential.
NARRATOR: The more careful we are about systematically collecting our data, the easier things will
be when we are ready to do our analysis. Data analysis should help us describe our findings in ways
that improve our intuitive understanding of the construct. For observational studies, we usually focus
on five categories of analysis,
NARRATOR [continued]: measures of central tendency, measures of variability, kurtosis, skewness,
and shape of the distribution. Measures of central tendency and measures of variability are actual
values or numbers. Kurtosis, skewness, and shape of the distribution
NARRATOR [continued]: are often more visual in nature. Let’s look at each of these types of analysis.
NICOLE CAIN: The measure of central tendency is what a typical person looks like on a particular
construct.
NARRATOR: The three most common measures of central tendency are mean, median, and mode.
The mean is the average of all scores in a distribution. This is the most commonly used measure
of central tendency. We’re generally referring to the mean whenever we talk about “average” IQ or
“average” income.
NARRATOR [continued]: Occasionally, a problem arises from using the mean as the measure of
central tendency.
NICOLE CAIN: An outlier is an extreme score and an outlier can drive your mean up or down
artificially. So for example, if you were interested in looking at salaries of 300 people, you would ask
300 participants to record their salaries to get an average. But if you had one person who had a salary
of $100 million,
NICOLE CAIN [continued]: you could see how that would artificially drive your mean up. So it would
no longer be a good measure of central tendency.
EVELYN BEHAR: So the problem of the outlier is that it is an extreme score in either direction, either
too high or too low, that essentially is changing the entire look of your sample. It’s changing everything
that your sample is trying to reflect about itself. When this happens, you have two options. You can
either simply get rid of that extreme observation
EVELYN BEHAR [continued]: and that’s OK to do. There’s a good reason to do it. Your other option is
to simply not rely on the average or the mean as your measure of central tendency. You might choose
to rely on a different measure of central tendency, something like the median, which is defined as the
middle value in a distribution of values. So you can see that if you just line up everybody’s income
EVELYN BEHAR [continued]: from lowest to highest, you’re still going to have that person who’s
making $100 million all the way at the right. But it’s only one observation and if you’re just looking for
the middle observation, that one person is not going to affect your search for the middle observation.
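The salary example can be checked with a short Python sketch. The figures are assumptions made up for illustration: 299 people earning $50,000 each and one person earning $100 million. Only the sample size and the outlier come from the speakers.

```python
import statistics

# Hypothetical sample: 299 ordinary salaries plus the one $100 million outlier.
salaries = [50_000] * 299 + [100_000_000]

mean_salary = statistics.mean(salaries)      # pulled far upward by the outlier
median_salary = statistics.median(salaries)  # middle value, unaffected by the outlier

# Option one from the discussion above: drop the extreme observation.
trimmed_mean = statistics.mean(s for s in salaries if s < 100_000_000)

print(f"mean:         {mean_salary:,.2f}")
print(f"median:       {median_salary:,.2f}")
print(f"trimmed mean: {trimmed_mean:,.2f}")
```

In this invented sample the mean is more than seven times the salary that 299 of the 300 people actually earn, while the median and the trimmed mean both stay at $50,000.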
NARRATOR: Another measure of central tendency is mode. Mode is the most common score in a
distribution. For example, if we’re observing the number of jelly beans a child eats when he has a
bowl full of jelly beans in front of him, we may find that the numbers range from zero for a child who
doesn’t like jelly beans to 200 for a child
NARRATOR [continued]: who will keep eating beyond the point of getting a stomach ache but that
the most common number of jelly beans a child will eat is 20. The number that comes up the most
times is the mode.
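A minimal sketch of finding the mode, assuming an invented list of jelly bean counts for ten children (the values 0, 20, and 200 echo the narrator's example; the rest are made up):

```python
from collections import Counter

# Hypothetical number of jelly beans eaten by each of ten children.
beans_eaten = [0, 12, 20, 20, 35, 20, 200, 18, 20, 25]

# Count how often each value occurs, then take the most frequent one.
counts = Counter(beans_eaten)
mode_value, mode_count = counts.most_common(1)[0]

print(f"mode: {mode_value} (seen {mode_count} times)")
```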
EVELYN BEHAR: The other measure that we’re interested in is variability. So whereas measures of
central tendency tell you about the typicalness of a particular observation, measures of variability tell
you about how variable your sample is around that typicalness. We may know that the average IQ of
our sample is 100
EVELYN BEHAR [continued]: but now we want to know how variable is our sample around that 100.
We might have some people at 110, 120, 130. And also on the other side, we’re going to have some
people with an IQ of 90, of 80, of 70. So we’re going to have people falling on either end of that
average score of 100.
NARRATOR: Variability can be high or low for a given construct. High variability means that we are
seeing scores that are way above and way below the mean. Low variability means that all of the
scores are grouped closely around the mean. They don’t vary much. The most common measure of
variability
NARRATOR [continued]: is called “standard deviation.” Standard deviation is the average distance
from a score to the mean, in other words, the average amount that scores in the sample deviate
from the mean of that sample. To calculate standard deviation, we first calculate the mean. Then, we
subtract the mean from each of the individual scores
NARRATOR [continued]: to find what we call “difference scores.” Next, we square each difference
score. Then, we add up all of the squared difference scores. Then, we divide this number by the total
number of observations in our sample, called the “n.” Finally, we take the square root of the whole
thing
NARRATOR [continued]: and that is our standard deviation. Another thing we look at when analyzing
our data is called “kurtosis.” Kurtosis is the shape of the distribution. It refers to how peaked or flat
the distribution is. A platykurtic shape is short and fat.
NARRATOR [continued]: It indicates a great deal of variability. The scores are very spread out. For
example, we may have a class of 100 students with very different levels of ability. If we administer a
calculus exam with 10 questions, we may find the following distribution. This would be a platykurtic
distribution.
NARRATOR [continued]: As our intuition will tell us, there is no real tendency here. A leptokurtic
shape is tall and skinny. It indicates very little variability. There are many scores close to the mean.
For example, we may be measuring the number of hours per day newborns sleep in the first week
NARRATOR [continued]: of their lives. Most of the numbers could be around 17 hours, with many
babies sleeping 16 and others sleeping 18 but few sleeping much less or much more than that. A
mesokurtic shape is in between. It indicates a normal or medium amount of variability.
NARRATOR [continued]: This is also similar to a normal curve, which we will talk more about later. IQ
is a good example of a mesokurtic distribution. Most people will score around 100 and scores become
progressively less frequent as we move away from the mean in either direction.
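The standard deviation steps the narrator lists earlier (mean, difference scores, squares, sum, divide by n, square root) can be written out directly. This sketch follows those steps literally, so it divides by n; be aware that many statistics packages divide by n - 1 by default when working with a sample. The function name and the data are hypothetical.

```python
import math


def standard_deviation(scores):
    """Standard deviation computed with the steps from the transcript (divides by n)."""
    n = len(scores)
    mean = sum(scores) / n                           # 1. calculate the mean
    differences = [score - mean for score in scores]  # 2. difference scores
    squared = [d ** 2 for d in differences]           # 3. square each difference
    total = sum(squared)                              # 4. add up the squares
    variance = total / n                              # 5. divide by n
    return math.sqrt(variance)                        # 6. take the square root


# Hypothetical IQ scores centered on 100.
iq_scores = [70, 85, 100, 100, 100, 115, 130]
print(f"standard deviation: {standard_deviation(iq_scores):.2f}")
```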
NARRATOR [continued]: We may also want to look at our sample’s skewness. Skewness refers to
whether scores are distributed fairly evenly around the mean or whether there are many more scores
to the right or many more scores to the left of the mean. There are three types of distributions in terms
of skewness. A symmetrical distribution, known
NARRATOR [continued]: as a “normal distribution,” is when an equal number of cases fall to the left
and to the right of the mean. The most common example of a normal distribution is IQ. The mean is
100 and the standard deviation is 15. This means that the average IQ is 100
NARRATOR [continued]: and 34% of the population has an IQ within 15 points above the mean,
between 101 and 115, while another 34% has an IQ that is within 15 points below the mean, between
85 and 99. Then, if we take it out to another standard deviation,
NARRATOR [continued]: we have about 13% of the population having an IQ from 116 to 130 and
about 13% of the population having an IQ from 70 to 84. Finally, if we go even farther out into the tails
or the almost unpopulated extremes of our sample, we will find that there
NARRATOR [continued]: is a very small number of people with an IQ that is more than two standard
deviations above or below the mean. A left-skewed distribution is also called “negatively skewed.” It
means that some very low scores exist in the sample and that pushes the tail to the left, making the
mean lower.
NARRATOR [continued]: On the other hand, a right-skewed distribution, which is also called
“positively skewed,” means that there are some very high scores that pull the tail to the right and
make the mean higher.
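The IQ percentages quoted above for the normal distribution can be checked with Python's standard library, assuming an idealized normal curve with mean 100 and standard deviation 15. The narrator's figures are rounded; the exact areas come out slightly different.

```python
from statistics import NormalDist

# Idealized IQ distribution: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Share of the population in each band above the mean
# (the bands below the mean are mirror images).
within_one_above = iq.cdf(115) - iq.cdf(100)   # mean to +1 SD
second_band_above = iq.cdf(130) - iq.cdf(115)  # +1 SD to +2 SD
beyond_two_above = 1 - iq.cdf(130)             # more than +2 SD

print(f"100 to 115: {within_one_above:.1%}")
print(f"115 to 130: {second_band_above:.1%}")
print(f"above 130:  {beyond_two_above:.1%}")
```

The first band is close to the quoted 34%, the second is between 13% and 14%, and only a little over 2% of cases lie beyond two standard deviations on each side.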
EVELYN BEHAR: One really important step in analyzing your data is to graph your data. It’s an
opportunity for you to see what your distribution looks like. And that’s where you might see an outlier
in your data and you otherwise may not have been aware of it. You might see an interesting shape in
your distribution. So before analyzing anything, you
EVELYN BEHAR [continued]: should always graph your data and just take a visual look to see if
there’s anything interesting or odd or wrong that pops out at you.
NARRATOR: One final way of analyzing data and learning about the nature of our construct is looking
at the shapes of our distributions. There are three basic shapes, rectangle, bimodal, and normal. The
rectangular shape comes about when all of the scores occur with roughly the same frequency.
NARRATOR [continued]: For example, if we were eating M&Ms from a bag and we kept track of how
often we picked each color, we might end up with a rectangular shape of distribution. Let’s say our
bag contained 120 M&Ms, 20 red, 20 blue, 20 orange, 20 green, 20 yellow, and 20 brown.
NARRATOR [continued]: If we randomly pick one M&M 60 times, we’re likely to pick approximately
10 of each color and we would end up with a pretty flat distribution, which will look like a rectangle. A
second shape of distribution is bimodal. This occurs when two groups seem to emerge from the data.
NARRATOR [continued]: The bimodal shape often can inform us about the nature of the topic we’re
studying and can give us hints about sub-samples that might exist.
NICOLE CAIN: So one example of a bimodal distribution would be height. If we compare males to
females, males are always going to be a little bit on average taller than females. So you would have
one peak at the higher end of the spectrum for males and another peak on the lower end of the
spectrum for females.
NARRATOR: On a normal curve, many of the values we’ve measured are clustered around the mean,
with fewer and fewer cases appearing as the values spread to the right and to the left, away from the
mean. On a graph, the edges of the curve are so low that they look like tails.
EVELYN BEHAR: The age at which infants start to crawl or walk or speak exists on a normal
distribution. Intelligence exists on a normal distribution. There are many phenomena in the natural
environment and in the everyday world that exist on a normal distribution or along a normal curve.
NARRATOR: Sometimes, the information we gather about a construct in an observational study
leads us to create hypotheses that we wish to test further through correlational studies.
Correlational Studies.
NARRATOR [continued]: When we know something about our construct and we want to check on its
interplay with other constructs, we’ll want to use a correlational design.
NICOLE CAIN: One instance where you would want to use a correlational design is when you want
to know what other variables are related to your construct of interest and what might be the potential
cause and effect of your variable.
EVELYN BEHAR: So let’s say, for example, that you are interested in insomnia and you want to know
what causes individuals to have a lot of trouble falling asleep at night. You might first look to see what
it co-occurs with. What are some problems or some environmental situations that tend to go along
with insomnia?
EVELYN BEHAR [continued]: And if you see that as caffeine ingestion goes up, insomnia levels go
up, then you might start to think a little bit along the lines of causation. The correlational study can’t
tell you whether two constructs are causally related but it can give you an opportunity to draw some
hypotheses about causal relationships.
NICOLE CAIN: Another time that you would use a correlational design is when an experiment is not
possible. So for example, you would do this in a lot of clinical research.
EVELYN BEHAR: Let’s say that you want to study the effects of depression on marital quality.
This is a great question but obviously, it’s unethical to go out into the world and make some people
depressed and other people not depressed. And so in this situation, it makes more sense to run a
correlational study and simply see what happens to marital quality levels
EVELYN BEHAR [continued]: as depression levels go up or down as they naturally occur in the world.
But you are not going to run an experiment and actually go and make people depressed in order to
see the effect that it has on their marriages. You wouldn’t want to do that.
NARRATOR: As with observational studies, correlational studies have some distinct advantages
and disadvantages that we need to consider when choosing our research design. As we’ve seen,
correlational studies enable us to begin formulating hypotheses which we can then test by means of
an experiment. They’re also good to use when we can’t control a variable,
NARRATOR [continued]: in other words, when we can’t set up an experiment. Correlational designs
do have their disadvantages, however. It is impossible to determine based on a correlational study
whether one variable is causing the other to rise or fall. Even if we know there is a causal relationship,
the direction of that causality is unknown.
NARRATOR [continued]: When we’re running a correlational study, we’re measuring at least two
variables and seeing what happens to the value of one as the value of the other construct changes.
First, we want to select two variables that we believe will be related and whose relationship is
important for some theoretical or practical reason.
NARRATOR [continued]: For example, we may be interested in finding out whether the SAT test
actually predicts college aptitude. In this case, we will want to look at both SAT scores and success
in college, perhaps based on grade point averages in the student’s freshman year.
NICOLE CAIN: As with all research, you should keep careful track of all of the data that you’re
collecting. This includes keeping track of what condition participants are in, keeping track of their
scores on the various variables that you’re measuring, as well as any problems or issues that come
up in the course of collecting data. You’ll want to have this information available to you when you go
to analyze your results.
NARRATOR: The nuts and bolts of computing correlations require advanced knowledge and most
people use a statistical software package to accomplish this task. The most common type of
correlation is the bivariate correlation, which measures the degree of relationship between two
variables. The value that this calculation yields
NARRATOR [continued]: is called the “r statistic.” This correlation, r, ranges from negative 1.0 to
positive 1.0. For example, we could look for a correlation between the number of hours students spent
studying and their scores on a math exam by way of these steps.
NARRATOR [continued]: First, we would look for the sign of the correlation, whether it was positive
or negative. A correlation of positive 1.0 would indicate a perfect positive relationship between hours
spent studying and math score. That means that as the number of hours spent studying increases,
the math score also increases.
NARRATOR [continued]: A correlation of negative 1.0 would indicate a perfect negative relationship
between hours spent studying and math score. That means that as the number of hours spent
studying increases, the math score decreases. A correlation of zero would indicate no relationship at
all between the two variables.
NARRATOR [continued]: In reality, correlations are almost never exactly positive 1.0 or negative 1.0.
Variables are usually not that exactly related. Second, we would look at the absolute value of the
number, ignoring the sign, and see whether it is closer to one or to zero. The closer it is to a full
negative or positive 1.0,
NARRATOR [continued]: the stronger the relationship between our two variables. The closer it is to
zero, the weaker the relationship. Let’s apply these principles to three sets of numbers. Which is the
stronger correlation, negative 0.98 or positive 0.34?
NARRATOR [continued]: Negative 0.98. Which indicates that both variables increase together,
negative 0.45 or positive 0.21? Positive 0.21. Which indicates that as one variable increases, the
other decreases, positive 0.62 or negative 0.33?
NARRATOR [continued]: Negative 0.33. Once we have calculated our correlation, we need to
understand the implications of that correlation. It is extremely important to be careful at this stage in
order not to misinterpret the results of our correlational study. The most important rule remains that
correlation does not
NARRATOR [continued]: imply causation. This is the biggest mistake one can make when analyzing
data.
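Although most people rely on a statistical package, the r statistic for two variables can be computed by hand. The sketch below uses the standard Pearson formula, which the transcript does not spell out, and invented study-hours and exam-score data. The helper names `pearson_r` and `stronger` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


def stronger(r1, r2):
    """Return whichever correlation is stronger, ignoring the sign."""
    return max(r1, r2, key=abs)


# Hypothetical data: hours spent studying and math exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = pearson_r(hours, scores)
print(f"r = {r:.2f}")                   # sign gives direction, size gives strength
print(stronger(-0.98, 0.34))            # the narrator's first quiz question
```

The sign of r answers the direction questions in the narrator's quiz, and `stronger` answers the strength question by comparing absolute values.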
EVELYN BEHAR: So let’s talk about a specific example. Imagine that you ran a study looking at two
variables that you were interested in, number of ice cream cones sold in New York City and number
of violent crimes committed in New York City. And let’s say that you found a very strong correlation
EVELYN BEHAR [continued]: between these two variables. You found that as ice cream cone sales
increased, so did number of violent crimes, that it increased almost at a one to one ratio. Well, we
have to be very, very careful about drawing any causal conclusions here. It could be that ice cream
causes people to be violent.
EVELYN BEHAR [continued]: It doesn’t sound very plausible but it is a possibility and the correlational
study cannot tell you whether it’s true or not. It’s also possible that when people commit violent crimes,
it makes them somehow want ice cream. This also doesn’t really seem to be intuitively true and yet
we cannot rule it out with a correlational design.
EVELYN BEHAR [continued]: It’s still a possible hypothesis. The third possibility is that there is some
third variable that’s acting on these two variables. So in this example, I think there’s a fairly clear third
variable and that’s weather. It’s what the conditions are like outside. In the summertime, there are lots
EVELYN BEHAR [continued]: of people outside so ice cream sales go up and also the number of violent
crimes goes up. So remember, anytime you have a correlational study and you find some correlation
between your two variables that you’re interested in, there are three possibilities. It could be that
variable A is causing a change in variable B, it could be that variable B is causing a change in variable
A,
EVELYN BEHAR [continued]: or it could be that C, some third variable, is causing both A and B to
change and it’s influencing both of them. And again, the only way to distinguish between these three
possibilities and to draw solid conclusions about causality and direction of causality is by running a
true experiment.
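The third-variable possibility can be illustrated with a small simulation. Everything here is invented for illustration: daily temperature drives both ice cream sales and violent crime, and neither affects the other. The coefficients, noise levels, and helper names are assumptions, not data from the video.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Variable C: one year of hypothetical daily temperatures.
temperature = [rng.uniform(0, 35) for _ in range(365)]
# Variables A and B each depend on temperature plus independent noise.
ice_cream = [10 * t + rng.gauss(0, 20) for t in temperature]
crime = [2 * t + rng.gauss(0, 5) for t in temperature]

r_raw = pearson_r(ice_cream, crime)
r_adjusted = pearson_r(residuals(ice_cream, temperature),
                       residuals(crime, temperature))

print(f"ice cream vs. crime:                 r = {r_raw:.2f}")
print(f"after removing temperature's effect: r = {r_adjusted:.2f}")
```

In this simulated world the raw correlation is strong even though neither variable causes the other, and it largely disappears once temperature is accounted for. With real data we would not know the true causal structure in advance, which is why the speakers stress that only an experiment can settle it.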
NARRATOR: Conclusion. We all think about exciting science experiments in which we act upon one
variable in order to cause a change in another. However, we can also learn a great deal without
experimentation just by observing and taking note
NARRATOR [continued]: of what is happening around us. Let’s review the two basic types of
non-experimental research designs. In an observational study, we’re seeking to learn as much as we can
about a single construct or variable. This may be a brand new, never studied phenomenon, such as
the example of the neon-green hair.
NARRATOR [continued]: An observational study allows us to formulate hypotheses about a construct.
The advantages of this type of study include its relatively low cost when compared to other types
of research and its application in cases where it would be immoral or impossible to cause a
phenomenon to occur. The disadvantages of observational design
NARRATOR [continued]: include its inability to tell us anything conclusive about the relationship
between one construct and another. Correlational studies enable us to find positive and negative
correlations between variables, to see whether they both increase together or whether one increases
while the other decreases
NARRATOR [continued]: or whether they bear a very weak or nonexistent relationship. Correlational
studies are particularly well-suited in situations in which we cannot conduct an experiment, such as
when it would be unethical to cause one condition in order to see what its result would be. In other
cases, a correlational study is a precursor to an experiment, much as an observational study
NARRATOR [continued]: can be a precursor to either a correlational or an experimental study. It
enables us to formulate hypotheses which we can then research further. A drawback of correlational
designs, which is shared, of course, by observational designs, is that we cannot use them to
determine whether one phenomenon causes another.
NARRATOR [continued]: Even if we find a strong correlation, we’re left with the question of whether
the first influences the second, the second influences the first, or a separate third variable influences
them both. The only way to solve this dilemma is, when possible, through experimental research.
[MUSIC PLAYING]
- Research Design: Observational and Correlational Studies