2 Methodology
21CC contracted Westat, Inc. to conduct a survey representative of residents of the Baltimore area, defined here as residents of Baltimore City and Baltimore County, Maryland.
2.1 Population Frame
The population frame consisted of English-speaking household residents in Baltimore City and Baltimore County, Maryland.
2.2 Sampling Plan
The overall sampling plan used a stratified address-based sample (ABS) of 10,000 addresses designed to include over-samples of Baltimore City and neighborhoods (Census tracts) with large shares of Black or Hispanic residents.
2.2.1 Stratification
The 10,000 addresses were sampled from two major strata defined by jurisdiction with 6,000 addresses in Baltimore City and 4,000 addresses in Baltimore County. Substrata were defined based on:
- Public Use Microdata Area regions (five in Baltimore City and seven in Baltimore County)
- Community Statistical Areas defined by the Baltimore Neighborhood Indicators Alliance (in Baltimore City only)
- Black/Hispanic strata based on Census tract shares of Black and Hispanic residents
The Black/Hispanic strata are based on the combined share of Black and Hispanic residents in each Census tract, using estimates from the 2017-2021 American Community Survey (ACS). The following six classifications were used for the combined share of Black and Hispanic residents:
- \(\leq\) 10% Black or Hispanic
- \(>\) 10% to \(\leq\) 25% Black or Hispanic
- \(>\) 25% to \(\leq\) 50% Black or Hispanic
- \(>\) 50% to \(\leq\) 75% Black or Hispanic
- \(>\) 75% to \(\leq\) 90% Black or Hispanic
- \(>\) 90% Black or Hispanic
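The six classifications above can be expressed as a simple threshold rule; the following is a minimal sketch (a hypothetical helper, not Westat's code):

```python
def black_hispanic_stratum(share: float) -> int:
    """Map a Census tract's combined Black/Hispanic share (0-1) to strata 1-6.

    Boundaries follow the classification above: each upper bound is inclusive.
    """
    bounds = [0.10, 0.25, 0.50, 0.75, 0.90]  # inclusive upper bounds for strata 1-5
    for stratum, upper in enumerate(bounds, start=1):
        if share <= upper:
            return stratum
    return 6  # > 90% Black or Hispanic
```

For example, a tract that is exactly 10% Black or Hispanic falls in stratum 1, while a tract at 10.5% falls in stratum 2.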
2.2.2 Oversample
Census tracts were over-sampled based on jurisdiction. In Baltimore County, addresses were over-sampled such that the ratio of sampled addresses in the highest-Black/Hispanic stratum was 1.62 times that of the lowest stratum. In Baltimore City, the addresses were over-sampled such that the ratio of sampled addresses in the highest-Black/Hispanic stratum was 1.27 times that of the lowest stratum. Tables 2.1 and 2.2 provide the relative sampling rates of different strata within each jurisdiction.
Table 2.1. Relative sampling rates by Black/Hispanic stratum, Baltimore County.

| Stratum | Percent HH in Jurisdiction | Relative Sampling Rate | HH Sample Percent |
|---|---|---|---|
| 1 | 25.4% | 83.1% | 21.1% |
| 2 | 26.5% | 93.5% | 24.8% |
| 3 | 22.9% | 103.9% | 23.8% |
| 4 | 13.9% | 114.3% | 14.9% |
| 5 | 9.6% | 124.6% | 12.0% |
| 6 | 2.6% | 135.0% | 3.5% |
Table 2.2. Relative sampling rates by Black/Hispanic stratum, Baltimore City.

| Stratum | Percent HH in Jurisdiction | Relative Sampling Rate | HH Sample Percent |
|---|---|---|---|
| 1 | 4.6% | 84.8% | 3.9% |
| 2 | 13.5% | 89.6% | 12.1% |
| 3 | 17.6% | 94.3% | 16.6% |
| 4 | 15.1% | 99.0% | 15.0% |
| 5 | 18.3% | 103.7% | 19.0% |
| 6 | 30.8% | 108.4% | 33.4% |
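The "HH Sample Percent" column appears to follow from multiplying each stratum's household share by its relative sampling rate and renormalizing; a sketch using the Baltimore City figures from the second table:

```python
pct_hh = [4.6, 13.5, 17.6, 15.1, 18.3, 30.8]     # Percent HH in jurisdiction (Baltimore City)
rate   = [84.8, 89.6, 94.3, 99.0, 103.7, 108.4]  # Relative sampling rate (%)

# Expected share of the sample in each stratum: share * rate, renormalized to 100%
raw = [p * r for p, r in zip(pct_hh, rate)]
sample_pct = [round(100 * x / sum(raw), 1) for x in raw]
print(sample_pct)  # [3.9, 12.1, 16.6, 15.0, 19.0, 33.4]
```

The result matches the published HH Sample Percent column to rounding.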
Westat divided the ABS frame into 145 substrata: 33 substrata in Baltimore County and 112 substrata in Baltimore City. The 33 cells in Baltimore County are all represented minority strata within the seven PUMA regions (note that not every minority stratum is represented in every PUMA region). The 112 cells in Baltimore City represent 56 community statistical areas (CSAs), sometimes divided across PUMA regions, and then all represented minority strata within the CSAs.
For more detailed information about the sample design, including lists of all substrata, refer to the Baltimore Area Survey Methods Report (Popick and Rizzo, 2023).
2.3 Data Collection
The 2023 BAS used a “push-to-web” mode in which respondents were invited by mail to complete a computer-assisted interview on the web. A paper version was offered as well. Two survey experiments were used and are described in more detail below.
2.3.2 Mailings
Sampled addresses received up to four letters inviting them to participate in the Baltimore Area Survey. Each mailing mentioned a promised $5 incentive for completing the survey. The contents of each mailing are described below. All mailings were sent by first-class USPS mail. All undeliverable mail and completed surveys were returned to Westat and documented in the sample file.
Mailing schedule and contents:
- Mailing 1, sent June 12, 2023, to all sampled addresses
  - #10 envelope with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo (determined by randomization; see Section 2.4 for more information)
  - One-page cover letter including the survey website and unique login credentials, with FAQ on reverse
  - $2 bill as an incentive, either visible through the envelope window or tucked inside the letter (determined by randomization; see Section 2.4 for more information)
- Mailing 2, sent June 20, 2023, to all sampled addresses
  - Folded postcard with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo
  - Content included the survey website and unique login credentials
- Mailing 3, sent June 27, 2023, to addresses whose previous mailings had not been returned as undeliverable and that had not completed the survey
  - #10 envelope with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo
  - One-page cover letter including the survey website and unique login credentials, with FAQ on reverse
- Mailing 4, sent July 12, 2023, to addresses whose previous mailings had not been returned as undeliverable and that had not completed the survey
  - #10 envelope with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo
  - One-page cover letter including the survey website and unique login credentials, with FAQ on reverse
Thank-you mailings were sent to respondents who completed the survey and requested cash rather than an Amazon.com gift card.
2.3.3 Computer-Assisted Personal Interview
Respondents who chose to complete the BAS via the web survey were directed to an individualized URL hosted at https://www.BaltimoreAreaSurvey.org. The individualized URL pointed to the questionnaire, which was designed using SurveyBuilder, Westat’s proprietary survey administration software.
Random assignment was used to control the order of the survey in two locations:

- The `bas23_org_gentrust` question (“Before we move onto the next set of questions, we want to know, generally speaking, how often do you feel people can be trusted?”) was randomly assigned to come before or after the questions on trust in different types of organizations. The variable `cantruspos` was loaded with the survey based on random assignment arranged in a geographical sort. If `cantruspos` was assigned a value of 1, the question came before that set of questions; if the value was 2, the question came after that set of questions.
- The variable `thinkfirst` was loaded based on random assignment arranged in a geographical sort. When `thinkfirst` was assigned a value of 1, questions were asked in the order `bas23_org_bthnk`, `bas23_org_nthnk`, `bas23_org_gthnk`, `bas23_org_bsrv`, `bas23_org_nsrv`, `bas23_org_gsrv`, with questions about how often different types of businesses “think about people like you” first and questions about how well these businesses “serve your family’s needs” second. When `thinkfirst` was assigned a value of 2, questions were asked in the order `bas23_org_bsrv`, `bas23_org_nsrv`, `bas23_org_gsrv`, `bas23_org_bthnk`, `bas23_org_nthnk`, `bas23_org_gthnk`, with the “think about” questions asked second.
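The `thinkfirst` ordering can be expressed compactly; a sketch (question IDs are from the survey, the function name is hypothetical):

```python
# "think about people like you" and "serve your family's needs" question IDs
THINK = ["bas23_org_bthnk", "bas23_org_nthnk", "bas23_org_gthnk"]
SERVE = ["bas23_org_bsrv", "bas23_org_nsrv", "bas23_org_gsrv"]

def question_order(thinkfirst: int) -> list:
    """Return the question order implied by the thinkfirst assignment (1 or 2)."""
    return THINK + SERVE if thinkfirst == 1 else SERVE + THINK
```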
2.3.4 Pen-and-Paper Personal Interview
The pen-and-paper version of the survey did not allow randomization or skip patterns. Data from respondents who requested and returned a paper copy of the survey were entered through the survey website by Westat project staff. Because instructions cannot be enforced on paper surveys as they are in the web instrument, the following rules were applied when entering paper surveys:
- Data were entered from beginning to end, with the web programming enforcing skips. This means that if a respondent answered a later question that the skip logic would have suppressed, that answer was not recorded.
- If there were multiple responses to a “select-one” question, or the mark fell between two response options, the less extreme response was recorded.
- If a response was selected along with written commentary, the selected response was recorded and the commentary ignored. If no response was selected, none was recorded based on the commentary.
- If the respondent scratched out a selected response, that response was considered empty, whether or not they provided an alternate response.
- If the respondent wrote “zero” for the number of adults in the household, this was entered as “1.”
- If the respondent initially indicated the presence of businesses or non-profits in their neighborhood, later selections of the paper-only response “There are no [businesses or non-profits] in my neighborhood” were treated as a non-response to that question.
- If more than one employment status was selected, “Retired” took precedence over “Keeping house.”
2.3.5 Respondent Incentives
All selected addresses received a $2 bill as a pre-incentive (with its visibility varied in the experiment described in Section 2.4.2), and respondents received $5 after completing the survey, paid via an Amazon.com electronic gift card or, if requested, cash.
2.4 Survey Field Experiments
Two field experiments were used in the mailings to ascertain their effect on response.
2.4.1 Logo Experiment
Households were selected to receive mailings with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo (see Figure 2.1). The assignment to the BAS logo or 21CC logo was made with addresses arranged in a geographical sort so that the two conditions were distributed evenly across all geographic areas. The assigned logo was included on both the envelope and the letter. Households were assigned to receive the same logo for all mailings throughout the field period. The website had a single landing page that displayed both logos.
There were 640 completed surveys from those receiving the Baltimore Area Survey logo, and 712 completed surveys from those receiving the Johns Hopkins 21st Century Cities Initiative logo.
2.4.2 Incentive Visibility Experiment
All selected addresses received a $2 pre-paid incentive with the first mailing. Addresses were randomly assigned for this $2 bill to be either visible through the envelope window next to the mailing address, or tucked inside the letter and found only on opening the envelope. Assignment used the same geographic sort as the logo experiment and ensured that equal numbers of each logo condition were assigned to each incentive condition.
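A crossed systematic assignment on a geographically sorted list can be sketched as follows; the exact scheme Westat used is not documented here, so the alternating pattern below is an assumption for illustration:

```python
def assign_conditions(addresses):
    """Alternate crossed experimental conditions down a geographically sorted
    address list, spreading each condition evenly across areas.

    `addresses` are assumed pre-sorted by geography. Logo alternates every
    address; incentive visibility cycles at half that rate, so each of the
    four logo-by-incentive combinations appears equally often.
    """
    assignments = {}
    for i, addr in enumerate(addresses):
        assignments[addr] = {
            "logo": "BAS" if i % 2 == 0 else "21CC",
            "incentive": "visible" if (i // 2) % 2 == 0 else "hidden",
        }
    return assignments
```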
There were 668 completed surveys from those receiving the $2 as a visible pre-incentive, and 684 completed surveys from those receiving the hidden pre-incentive.
As a check for possible irregularities with delivering visible cash, Westat also considered how many envelopes from the first mailing were returned undeliverable. For those sent visible cash, there were 376 returned as undeliverable, and for those sent hidden cash, there were 365 returned as undeliverable.
2.5 Response Rate
The response rate for this study was calculated using AAPOR’s RR3 formula, and was 15.3% for the overall sample. Table 2.3 shows each category of the calculation in detail for the overall sample, as well as experimental and geographical groups of interest.
Completed surveys were those that finished the final substantive question (i.e., they may not have provided contact information for future follow-up). All other surveys were considered break-offs.
Table 2.3. Response rate calculation (AAPOR RR3) for the overall sample and subgroups.

| Category | Disposition | Full sample | Visible pre-incentive | Hidden pre-incentive | BAS logo | JHU 21CC logo | City | County |
|---|---|---|---|---|---|---|---|---|
| Total sample used | | 10000 | 5000 | 5000 | 5000 | 5000 | 6000 | 4000 |
| Interview (Category 1) – I | 1.1 Complete | 1352 | 668 | 684 | 640 | 712 | 818 | 534 |
| Eligible, non-interview (Category 2) – R | 2.1 Refusal & Breakoff | 224 | 115 | 109 | 113 | 111 | 131 | 91 |
| Unknown eligibility, non-interview (Category 3) – UH | 3.19 Nothing ever returned | 7682 | 3841 | 3841 | 3879 | 3803 | 4503 | 3179 |
| Not eligible (Category 4) | 4.30 Housing Unit Ineligible | 741 | 376 | 365 | 368 | 373 | 547 | 194 |
| Not eligible (Category 4) | 4.70 No Eligible Respondent | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| e | | 0.9442 | 0.9452 | 0.9432 | 0.9436 | 0.9448 | 0.9613 | 0.9185 |
| AAPOR RR3 | | 15.3 | 15.1 | 15.5 | 14.5 | 16.1 | 15.5 | 15.1 |
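The RR3 figures can be reproduced from the disposition counts using AAPOR's formula RR3 = I / (I + R + e·UH), where e is the estimated share of unknown-eligibility cases that are eligible. A check against the full-sample column:

```python
# Full-sample dispositions from Table 2.3
I, R, UH = 1352, 224, 7682   # completes; refusals & breakoffs; nothing ever returned
e = 0.9442                   # estimated eligibility rate among unknown-eligibility cases

rr3 = 100 * I / (I + R + e * UH)
print(round(rr3, 1))  # 15.3
```

The same calculation with the City column (I=818, R=131, UH=4503, e=0.9613) reproduces the 15.5% shown there.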
2.6 Weighting
Weights were applied to ensure representation of adults in the Baltimore area. Three components were used to create sample weights to represent the population: base weights to account for unequal probability of selection, nonresponse adjustment to adjust for differential nonresponse, and calibration to ensure representation to population-level control values.
2.6.1 Base Weights
The ABS address probability of selection arises from the stratification structure determined by the Census Tract of the address. We assigned 12 sampling-rate strata with graduated sampling rates listed in Tables 2.1 and 2.2. The household-level base weight is the inverse of this sampling-rate stratum probability.
The second stage of sampling is the selection of one adult within each sampled household. Within-household sampling is based on the next-birthday method: the invitation letter requested that the adult with the next birthday respond to the survey. We treat this pseudo-random selection as equivalent to random sampling, so the probability of selection of the responding adult is based on the number of adults in the household. This information is collected as part of the questionnaire in the variable `bas23_dem_adults`. The within-household base-weight component is therefore `bas23_dem_adults` itself (e.g., if there are three adults in the household, the sampling probability is 1 in 3, so the base weight is 3). If `bas23_dem_adults` is missing on an otherwise completed questionnaire, it was imputed (see Section 2.6.4 below).
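Combining the two stages, a respondent's base weight is the inverse of the address's sampling probability times the number of adults in the household; a minimal sketch (function and argument names hypothetical):

```python
def base_weight(address_selection_prob: float, n_adults: int) -> float:
    """Base weight = (1 / P(address sampled)) * n_adults, since one adult
    is selected per household by the next-birthday method."""
    return (1.0 / address_selection_prob) * n_adults
```

For example, an address sampled at a 1-in-10 rate with three adults in the household receives a base weight of 30.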
2.6.2 Non-Response Adjustments
Non-response adjustments were made based on age, educational attainment, race/ethnicity, gender, and geography, with sampling strata used as the basis for the adjustment cells. In Baltimore County, the 33 sampling strata were based on PUMA region crossed with Black/Hispanic stratum (listed in Table 2.1). In Baltimore City, the 88 sampling strata were based on PUMA region crossed with Community Statistical Area, crossed with Black/Hispanic stratum (listed in Table 2.2). Cells with fewer than 10 respondents were collapsed across Black/Hispanic strata within geography first, and then by geography if necessary.
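A standard weighting-class adjustment of this kind inflates respondents' weights by the inverse of the weighted response rate within their cell; the sketch below illustrates that mechanism under simplified cell definitions (not Westat's code):

```python
from collections import defaultdict

def nonresponse_adjust(cases):
    """Weighting-class nonresponse adjustment.

    cases: list of (cell, base_weight, responded) tuples.
    Returns adjusted weights for respondents, in order: each respondent's
    base weight times (total base weight in cell / respondent base weight
    in cell), so the cell's full weight is carried by its respondents.
    """
    total = defaultdict(float)
    resp = defaultdict(float)
    for cell, w, responded in cases:
        total[cell] += w
        if responded:
            resp[cell] += w
    return [w * total[cell] / resp[cell] for cell, w, responded in cases if responded]
```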
2.6.3 Calibration Adjustments
Weights were then calibrated to match control totals from the ACS Public Use Microdata Sample (PUMS); these control totals are based on samples of the U.S. population within ‘PUMS regions’. The calibration adjusts both for nonresponse (beyond the adjustments in Section 2.6.2) and for undercoverage of the ABS frame as a representation of all adults. Weights were calibrated using raking (iterative proportional fitting, or IPF), raking the nonresponse-adjusted weights (the base weights after the non-response adjustments of Section 2.6.2) to control totals in the following dimensions, separately by jurisdiction:
- PUMS region within county
- Gender within each county (male and female; note that ACS only has control totals for male and female, so transgender and other gender from the questionnaire were collapsed for this calibration step)
- Race/ethnicity within county:
- Hispanic or non-Hispanic Other Race
- Non-Hispanic Black Only
- Non-Hispanic White Only
- Non-Hispanic Asian Only
- Age within county
- 18 to 34
- 35 to 54
- 55 to 69
- 70 or older
- Education level within county
- No high-school diploma or high-school diploma only
- 2-year associate’s degree only or 4-year college diploma only
- Professional degree
- Home-owners vs. renters within county
Excessive weights were trimmed using Westat’s RAKE-TRIM algorithm. The algorithm trims weights that fall outside a given range and then re-rakes to the control totals, iterating between raking and trimming until the control totals are achieved while the weights also remain within the specified constraints. When convergence did not occur, control cells were collapsed to relax constraints until the algorithm converged.
Prior to collapsing cells, the trimming constraint was adjusted. To preserve the unbiasedness of the weights, the trimming cells within which constraints were applied were the 12 sampling-rate strata listed in Tables 2.1 and 2.2. Within these 12 sampling-rate strata, weights greater than 3.5 times the median weight were trimmed to 3.5 times the median, and weights smaller than 1/3.5 times the median weight were raised to 1/3.5 times the median. These factors were changed to 4.5 and 1/4.5 when convergence could not be achieved.
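The iterate-rake-then-trim loop can be sketched generically; this is not Westat's proprietary RAKE-TRIM, and for simplicity it trims relative to the overall median weight rather than within the 12 sampling-rate strata:

```python
from statistics import median

def rake_and_trim(weights, dims, controls, ratio=3.5, iters=100):
    """Generic rake-and-trim sketch (simplified, not Westat's RAKE-TRIM).

    weights  : starting (nonresponse-adjusted) weights
    dims     : list of category-label lists, one per raking dimension
    controls : list of dicts mapping each category to its control total
    """
    w = list(map(float, weights))
    for _ in range(iters):
        # raking pass: scale weights in each category to hit its control total
        for labels, totals in zip(dims, controls):
            for cat, target in totals.items():
                idx = [i for i, lab in enumerate(labels) if lab == cat]
                s = sum(w[i] for i in idx)
                if s > 0:
                    for i in idx:
                        w[i] *= target / s
        # trimming pass: clip weights to [median/ratio, median*ratio]
        med = median(w)
        w = [min(max(x, med / ratio), med * ratio) for x in w]
    return w
```

Because trimming can pull the totals off their controls, the loop re-rakes after each trim, which mirrors the iterative behavior described above.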
2.6.4 Imputations
The weighting steps described in Sections 2.6.1-2.6.3 above required responses for the following variables from all respondents:
- Number of adults in household (`bas23_dem_adults`)
- Gender (`bas23_dem_gender`)¹
- Age group (derived from `bas23_dem_yearborn`)
- Education (derived from `bas23_dem_edattain`)
- Race (derived from responses to the race questions `bas23_dem_race*`)
- Ethnicity (`bas23_dem_latx`)
- Housing tenure (derived from `bas23_dem_own`)
When a variable was missing for fewer than five (5) respondents, the missing values were imputed as the modal response category for the derived variable. In other cases, imputations were calculated using the R package `mice` (multiple imputation by chained equations). The `mice` algorithm fits a predictive model for each variable based on all of the others, plus Census tract information (Black/Hispanic percentage), and iterates through the individual-variable chained equations one by one to convergence. Models were estimated separately for each jurisdiction.
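The low-missingness rule can be sketched as follows (the `mice` step itself is not reproduced here; the function name is hypothetical):

```python
from collections import Counter

def modal_impute(values, threshold=5):
    """If a variable is missing (None) for fewer than `threshold` respondents,
    fill the missing entries with the modal observed category; otherwise leave
    them for model-based imputation (mice in the actual pipeline)."""
    missing = [i for i, v in enumerate(values) if v is None]
    if 0 < len(missing) < threshold:
        mode = Counter(v for v in values if v is not None).most_common(1)[0][0]
        return [mode if v is None else v for v in values]
    return values
```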
¹ Values other than “male” or “female” for `bas23_dem_gender` were imputed for weighting purposes because the American Community Survey includes no responses other than “male” and “female”.