Chapter 7: Finding the Data

This concluding chapter addresses what is logically the first stage of any data-based research project: before you present and interpret your data, you have to find them. This chapter comes at the end because a productive data search best begins with a clear understanding of what kinds of data are likely to be available and how other investigators have used the available data to define and address the critical questions you might wish to study. A subtext of the preceding chapters has been to share with the reader an understanding of the variety and wealth of social indicator data available from governmental and nongovernmental sources to those who would use numerical evidence to address questions of public policy and political affairs. This chapter takes that a step further, offering a more detailed description of the data sources used for this book and the other data available from these and related sources.

Almost all social indicators in this book—with the exception of data derived from agency records such as the FBI Uniform Crime Report and the U.S. Federal Budget—were originally derived from surveys, usually regular periodic surveys, conducted by either a governmental agency or a survey organization. Each indicator was first constructed from “raw” data files containing each individual response to the questions in the surveys. Thus, the Census Bureau calculates monthly employment indicators, annual poverty and income estimates, and biennial voter turnout measures from the raw responses to its monthly Current Population Survey.
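
As a rough illustration of how an indicator is built up from raw responses, the following Python sketch computes a weighted turnout rate for each state from individual survey records. The field names (state, voted, weight) and all of the values are invented for this example; they are not drawn from the Current Population Survey, and the agency's actual estimation procedures are considerably more involved.

```python
# Hypothetical raw records: one entry per survey respondent.
# Field names and values are invented for illustration only.
responses = [
    {"state": "A", "voted": True,  "weight": 1.5},
    {"state": "A", "voted": False, "weight": 0.5},
    {"state": "B", "voted": True,  "weight": 1.0},
    {"state": "B", "voted": True,  "weight": 1.0},
    {"state": "B", "voted": False, "weight": 2.0},
]


def turnout_by_state(records):
    """Return the weighted percentage of respondents who voted, by state."""
    totals = {}  # state -> (weighted voters, weighted respondents)
    for record in records:
        voted_weight, all_weight = totals.get(record["state"], (0.0, 0.0))
        if record["voted"]:
            voted_weight += record["weight"]
        all_weight += record["weight"]
        totals[record["state"]] = (voted_weight, all_weight)
    return {
        state: 100.0 * voted_weight / all_weight
        for state, (voted_weight, all_weight) in totals.items()
    }


rates = turnout_by_state(responses)
print(rates)
```

Each respondent's weight stands for the number of people in the population that the respondent is assumed to represent, so the indicator is a ratio of weighted counts and not a simple share of the sample.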

Most of the data used in the charts and tables in this book were obtained from primary sources, often from the Census Bureau or other statistical agency websites. In some cases secondary sources were used: to demonstrate how particular researchers used the data, because some secondary sources have improved on the raw data, or because the secondary source formatted the data more conveniently. Many publications and websites repackage primary source data, combining data from different sources on related topics in ways that provide added value for researchers. On education, for example, the journal Education Week produces an annual publication and database, Quality Counts, that reports a series of indicators on state educational achievement, school climate, state policies, and fiscal resources. The journal also scores states on several indexes related to school finance and student achievement.[i] Similarly, the annual Kids Count Data Book provides a comparable compilation of data on the status of American children.[ii]

Some data were not available on any statistical website but were obtained from published research articles and, in one case, by requesting that the author of a published study forward a copy of the data. Finding data on the timing of state and national laws is a particularly challenging task. For the analysis of Election Day registration in chapter 4, this information came from a published study and an internet search that revealed the timing of Oregon’s law. To fill in missing data on the civil registries in Greece and Luxembourg, I phoned the Luxembourg embassy in Washington and emailed a graduate school friend working with a Greek political campaign organization.

For most research endeavors, a good literature review is not only essential to defining the critical issues but also reveals crucial information on the kinds of data and the primary data sources that have been used to address the issue in the past. Using both library and internet search engines, I have often found it useful to include the words “table” or “figure” with the search terms to find data-based research studies.

Notes on Data Formats

The data from the statistical websites come in a variety of formats, even at a single agency website, and some agencies and websites make their social indicator data more accessible than others do. Ideally, the agency will provide the data in a spreadsheet or spreadsheet-compatible format (such as .csv), but data are also made available in web page (html) tables and in .pdf format. Data stored in a web page table transfer easily to a spreadsheet, either by cutting and pasting or by opening the URL from the spreadsheet program. Some data are available only in Adobe .pdf files, particularly data contained in agency reports and research studies. The newest versions of the Adobe document viewer have some special features for copying tables and columns of data that work well with some tables but not others. Sometimes, all the columns from a .pdf data table (or data in plain text format on a web page) will “paste” into a single spreadsheet column. Depending on the table format, the “text to columns” function can often sort these data out.
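
For readers who prefer a script to a spreadsheet menu, the same “text to columns” idea can be sketched in a few lines of Python. This is only a minimal sketch: it assumes that the pasted columns are separated by tabs or by runs of two or more spaces, and the sample table (its labels and its numbers) is invented for illustration. Tables that use other layouts will need a different splitting rule.

```python
import csv
import io
import re

# Hypothetical pasted text: each table row arrived as a single string.
# The labels and numbers are invented for illustration only.
pasted = """\
Region          Year    Rate
North Region    2004    58.1
South Region    2004    64.8
"""


def text_to_columns(text):
    """Split each non-empty line into fields on tabs or runs of 2+ spaces."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\t+|\s{2,}", line))
    return rows


rows = text_to_columns(pasted)

# Write the rows as CSV text, which any spreadsheet program can open.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Splitting on two or more spaces, and not on every single space, keeps multiword labels such as “North Region” together in one column. It will fail on tables whose columns are separated by only one space, which is one reason the spreadsheet function works well with some tables but not others.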