2 Methodology
21CC contracted Westat, Inc. to conduct a survey representative of residents of the Baltimore area, defined here as residents of Baltimore City and Baltimore County, Maryland.
2.1 Population Frame
The population frame consisted of English-speaking household residents in Baltimore City and Baltimore County, Maryland.
2.2 Sampling Plan
The overall sampling plan used a stratified address-based sample (ABS) of 10,000 addresses designed to include over-samples of Baltimore City and neighborhoods (Census tracts) with large shares of Black or Hispanic residents.
2.2.1 Stratification
The 10,000 addresses were sampled from two major strata defined by jurisdiction with 6,000 addresses in Baltimore City and 4,000 addresses in Baltimore County. Substrata were defined based on:
- Public Use Microdata Area regions (five in Baltimore City and seven in Baltimore County)
- Community Statistical Areas defined by the Baltimore Neighborhood Indicators Alliance (in Baltimore City only)
- Black/Hispanic strata based on Census tract shares of Black and Hispanic residents
The Black/Hispanic strata are based on the combined share of Black and Hispanic residents in each Census tract, using estimates from the 2017-2021 American Community Survey (ACS). The following six classifications were used for the combined share of Black and Hispanic residents:
- \(\leq\) 10% Black or Hispanic
- \(>\) 10% to \(\leq\) 25% Black or Hispanic
- \(>\) 25% to \(\leq\) 50% Black or Hispanic
- \(>\) 50% to \(\leq\) 75% Black or Hispanic
- \(>\) 75% to \(\leq\) 90% Black or Hispanic
- \(>\) 90% Black or Hispanic
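The six classifications above can be expressed as a simple threshold rule; the following is a minimal sketch (a hypothetical helper, not Westat's code):

```python
def black_hispanic_stratum(share: float) -> int:
    """Map a Census tract's combined Black/Hispanic share (0-1) to strata 1-6.

    Boundaries follow the classification above: each upper bound is inclusive.
    """
    bounds = [0.10, 0.25, 0.50, 0.75, 0.90]  # inclusive upper bounds for strata 1-5
    for stratum, upper in enumerate(bounds, start=1):
        if share <= upper:
            return stratum
    return 6  # > 90% Black or Hispanic
```

For example, a tract that is exactly 10% Black or Hispanic falls in stratum 1, while a tract at 10.5% falls in stratum 2.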
2.2.2 Oversample
Census tracts were over-sampled based on jurisdiction. In Baltimore County, addresses were over-sampled such that the ratio of sampled addresses in the highest-Black/Hispanic stratum was 1.62 times that of the lowest stratum. In Baltimore City, the addresses were over-sampled such that the ratio of sampled addresses in the highest-Black/Hispanic stratum was 1.27 times that of the lowest stratum. Tables 2.1 and 2.2 provide the relative sampling rates of different strata within each jurisdiction.
Table 2.1. Relative sampling rates by Black/Hispanic stratum, Baltimore County.

| Stratum | Percent HH in Jurisdiction | Relative Sampling Rate | HH Sample Percent |
|---|---|---|---|
| 1 | 25.4% | 83.1% | 21.1% |
| 2 | 26.5% | 93.5% | 24.8% |
| 3 | 22.9% | 103.9% | 23.8% |
| 4 | 13.9% | 114.3% | 14.9% |
| 5 | 9.6% | 124.6% | 12.0% |
| 6 | 2.6% | 135.0% | 3.5% |
Table 2.2. Relative sampling rates by Black/Hispanic stratum, Baltimore City.

| Stratum | Percent HH in Jurisdiction | Relative Sampling Rate | HH Sample Percent |
|---|---|---|---|
| 1 | 4.6% | 84.8% | 3.9% |
| 2 | 13.5% | 89.6% | 12.1% |
| 3 | 17.6% | 94.3% | 16.6% |
| 4 | 15.1% | 99.0% | 15.0% |
| 5 | 18.3% | 103.7% | 19.0% |
| 6 | 30.8% | 108.4% | 33.4% |
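The "HH Sample Percent" column appears to follow from multiplying each stratum's household share by its relative sampling rate and renormalizing; a sketch using the Baltimore City figures from the second table:

```python
pct_hh = [4.6, 13.5, 17.6, 15.1, 18.3, 30.8]     # Percent HH in jurisdiction (Baltimore City)
rate   = [84.8, 89.6, 94.3, 99.0, 103.7, 108.4]  # Relative sampling rate (%)

# Expected share of the sample in each stratum: share * rate, renormalized to 100%
raw = [p * r for p, r in zip(pct_hh, rate)]
sample_pct = [round(100 * x / sum(raw), 1) for x in raw]
print(sample_pct)  # [3.9, 12.1, 16.6, 15.0, 19.0, 33.4]
```

The result matches the published HH Sample Percent column to rounding.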
Westat divided the ABS frame into 145 substrata: 33 substrata in Baltimore County and 112 substrata in Baltimore City. The 33 cells in Baltimore County are all represented minority strata within the seven PUMA regions (note that not every minority stratum is represented in every PUMA region). The 112 cells in Baltimore City represent 56 community statistical areas (CSAs), sometimes divided across PUMA regions, and then all represented minority strata within the CSAs.
For more detailed information about the sample design, including lists of all substrata, refer to the Baltimore Area Survey Methods Report (Popick and Rizzo, 2023).
2.3 Data Collection
The 2023 BAS used a “push-to-web” mode in which respondents were invited by mail to complete a computer-assisted interview on the web. A paper version was offered as well. Two survey experiments were used and are described in more detail below.
2.3.2 Mailings
Sampled addresses received up to four letters inviting them to participate in the Baltimore Area Survey. Each mailing mentioned a promised $5 incentive for completing the survey. The contents of each mailing are described below. All mailings were sent by first-class USPS mail. All undeliverable mail and completed surveys were returned to Westat and documented in the sample file.
Mailing schedule and contents:
- Mailing 1, sent June 12, 2023, to all sampled addresses
  - #10 envelope with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo (determined by randomization; see Section 2.4 for more information)
  - One-page cover letter including the survey website and unique login credentials, with FAQ on reverse
  - $2 bill as an incentive, either visible through the envelope window or tucked inside the letter (determined by randomization; see Section 2.4 for more information)
- Mailing 2, sent June 20, 2023, to all sampled addresses
  - Folded postcard with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo
  - Content included the survey website and unique login credentials
- Mailing 3, sent June 27, 2023, to addresses whose previous mailings had not been returned as undeliverable and that had not completed the survey
  - #10 envelope with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo
  - One-page cover letter including the survey website and unique login credentials, with FAQ on reverse
- Mailing 4, sent July 12, 2023, to addresses whose previous mailings had not been returned as undeliverable and that had not completed the survey
  - #10 envelope with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo
  - One-page cover letter including the survey website and unique login credentials, with FAQ on reverse
Thank-you mailings were sent to respondents who completed the survey and requested cash rather than an Amazon.com gift card.
2.3.3 Computer-Assisted Personal Interview
Respondents who chose to complete the BAS via the web survey were directed to an individualized URL hosted at https://www.BaltimoreAreaSurvey.org. The individualized URL pointed to the questionnaire, which was designed using SurveyBuilder, Westat’s proprietary survey administration software.
Random assignment was used to control the order of the survey in two locations:

- The `bas23_org_gentrust` question (“Before we move onto the next set of questions, we want to know, generally speaking, how often do you feel people can be trusted?”) was randomly assigned to come before or after the questions on trust in different types of organizations. The variable `cantruspos` was loaded with the survey based on random assignment arranged in a geographical sort. If `cantruspos` was assigned a value of 1, the question came before that set of questions; if the value was 2, the question came after that set of questions.
- The variable `thinkfirst` was loaded based on random assignment arranged in a geographical sort. When `thinkfirst` was assigned a value of 1, questions were asked in the order `bas23_org_bthnk`, `bas23_org_nthnk`, `bas23_org_gthnk`, `bas23_org_bsrv`, `bas23_org_nsrv`, `bas23_org_gsrv`, with questions about how often different types of businesses “think about people like you” first and questions about how well these businesses “serve your family’s needs” second. When `thinkfirst` was assigned a value of 2, questions were asked in the order `bas23_org_bsrv`, `bas23_org_nsrv`, `bas23_org_gsrv`, `bas23_org_bthnk`, `bas23_org_nthnk`, `bas23_org_gthnk`, with the “think about” questions asked second.
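The `thinkfirst` ordering can be expressed compactly; a sketch (question IDs are from the survey, the function name is hypothetical):

```python
# "think about people like you" and "serve your family's needs" question IDs
THINK = ["bas23_org_bthnk", "bas23_org_nthnk", "bas23_org_gthnk"]
SERVE = ["bas23_org_bsrv", "bas23_org_nsrv", "bas23_org_gsrv"]

def question_order(thinkfirst: int) -> list:
    """Return the question order implied by the thinkfirst assignment (1 or 2)."""
    return THINK + SERVE if thinkfirst == 1 else SERVE + THINK
```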
2.3.4 Pen-and-Paper Personal Interview
The pen-and-paper version of the survey did not allow randomization or skip patterns. Data from respondents who requested and returned a paper copy of the survey were entered through the survey website by Westat project staff. Because instructions cannot be enforced on paper surveys as they are in the web instrument, the following rules were applied when entering paper surveys:
- Data were entered from beginning to end, with the web programming enforcing skips. This means that if a respondent answered a later question that the skip logic would have suppressed, that answer was not recorded.
- If there were multiple responses to a “select-one” question, or the mark fell between two response options, the less extreme response was recorded.
- If a response was selected along with written commentary, the selected response was recorded and the commentary ignored. If no response was selected, none was recorded based on the commentary.
- If the respondent scratched out a selected response, that response was considered empty, whether or not they provided an alternate response.
- If the respondent wrote “zero” for the number of adults in the household, this was entered as “1.”
- If the respondent initially indicated the presence of businesses or non-profits in their neighborhood, later selections of the paper-only response “There are no [businesses or non-profits] in my neighborhood” were treated as a non-response to that question.
- If more than one employment status was selected, “Retired” took precedence over “Keeping house.”
2.3.5 Respondent Incentives
All selected addresses received a $2 bill as a pre-incentive (with its visibility varied in the experiment described in Section 2.4.2), and respondents received $5 after completing the survey, paid via an Amazon.com electronic gift card or, if requested, cash.
2.4 Survey Field Experiments
Two field experiments were used in the mailings to ascertain their effect on response.
2.4.1 Logo Experiment
Households were selected to receive mailings with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo (see Figure 2.1). The assignment to the BAS logo or 21CC logo was made with addresses arranged in a geographical sort so that the two conditions were distributed evenly across all geographic areas. The assigned logo was included on both the envelope and the letter. Households were assigned to receive the same logo for all mailings throughout the field period. The website had a single landing page that displayed both logos.
There were 640 completed surveys from those receiving the Baltimore Area Survey logo, and 712 completed surveys from those receiving the Johns Hopkins 21st Century Cities Initiative logo.
2.4.2 Incentive Visibility Experiment
All selected addresses received a $2 pre-paid incentive with the first mailing. Addresses were randomly assigned for this $2 bill to be either visible through the envelope window next to the mailing address, or tucked inside the letter and found only on opening the envelope. Assignment used the same geographic sort as the logo experiment and ensured that equal numbers of each logo condition were assigned to each incentive condition.
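A crossed systematic assignment on a geographically sorted list can be sketched as follows; the exact scheme Westat used is not documented here, so the alternating pattern below is an assumption for illustration:

```python
def assign_conditions(addresses):
    """Alternate crossed experimental conditions down a geographically sorted
    address list, spreading each condition evenly across areas.

    `addresses` are assumed pre-sorted by geography. Logo alternates every
    address; incentive visibility cycles at half that rate, so each of the
    four logo-by-incentive combinations appears equally often.
    """
    assignments = {}
    for i, addr in enumerate(addresses):
        assignments[addr] = {
            "logo": "BAS" if i % 2 == 0 else "21CC",
            "incentive": "visible" if (i // 2) % 2 == 0 else "hidden",
        }
    return assignments
```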
There were 668 completed surveys from those receiving the $2 as a visible pre-incentive, and 684 completed surveys from those receiving the hidden pre-incentive.
As a check for possible irregularities with delivering visible cash, Westat also considered how many envelopes from the first mailing were returned undeliverable. For those sent visible cash, there were 376 returned as undeliverable, and for those sent hidden cash, there were 365 returned as undeliverable.
2.5 Response Rate
The response rate for this study was calculated using AAPOR’s RR3 formula, and was 15.3% for the overall sample. Table 2.3 shows each category of the calculation in detail for the overall sample, as well as experimental and geographical groups of interest.
Completed surveys were those that finished the final substantive question (i.e., they may not have provided contact information for future follow-up). All other surveys were considered break-offs.
Table 2.3. Response rate calculation (AAPOR RR3) for the overall sample and subgroups.

| Category | Disposition | Full sample | Visible pre-incentive | Hidden pre-incentive | BAS logo | JHU 21CC logo | City | County |
|---|---|---|---|---|---|---|---|---|
| Total sample used | | 10000 | 5000 | 5000 | 5000 | 5000 | 6000 | 4000 |
| Interview (Category 1) – I | 1.1 Complete | 1352 | 668 | 684 | 640 | 712 | 818 | 534 |
| Eligible, non-interview (Category 2) – R | 2.1 Refusal & Breakoff | 224 | 115 | 109 | 113 | 111 | 131 | 91 |
| Unknown eligibility, non-interview (Category 3) – UH | 3.19 Nothing ever returned | 7682 | 3841 | 3841 | 3879 | 3803 | 4503 | 3179 |
| Not eligible (Category 4) | 4.30 Housing Unit Ineligible | 741 | 376 | 365 | 368 | 373 | 547 | 194 |
| Not eligible (Category 4) | 4.70 No Eligible Respondent | 1 | 0 | 1 | 0 | 1 | 1 | 0 |
| e | | 0.9442 | 0.9452 | 0.9432 | 0.9436 | 0.9448 | 0.9613 | 0.9185 |
| AAPOR RR3 | | 15.3 | 15.1 | 15.5 | 14.5 | 16.1 | 15.5 | 15.1 |
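The RR3 figures can be reproduced from the disposition counts using AAPOR's formula RR3 = I / (I + R + e·UH), where e is the estimated share of unknown-eligibility cases that are eligible. A check against the full-sample column:

```python
# Full-sample dispositions from Table 2.3
I, R, UH = 1352, 224, 7682   # completes; refusals & breakoffs; nothing ever returned
e = 0.9442                   # estimated eligibility rate among unknown-eligibility cases

rr3 = 100 * I / (I + R + e * UH)
print(round(rr3, 1))  # 15.3
```

The same calculation with the City column (I=818, R=131, UH=4503, e=0.9613) reproduces the 15.5% shown there.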
2.6 Weighting
Weights were applied to ensure representation of adults in the Baltimore area. Three components were used to create sample weights to represent the population: base weights to account for unequal probability of selection, nonresponse adjustment to adjust for differential nonresponse, and calibration to ensure representation to population-level control values.
2.6.1 Base Weights
The ABS address probability of selection arises from the stratification structure determined by the Census Tract of the address. We assigned 12 sampling-rate strata with graduated sampling rates listed in Tables 2.1 and 2.2. The household-level base weight is the inverse of this sampling-rate stratum probability.
The second stage of sampling is the selection of one adult within each sampled household. Within-household sampling is based on the next-birthday method: the invitation letter requested that the adult with the next birthday respond to the survey. We treat this pseudo-random selection as equivalent to random sampling, so the probability of selection of the responding adult is based on the number of adults in the household. This information is collected as part of the questionnaire in the variable `bas23_dem_adults`. The within-household base-weight component is therefore `bas23_dem_adults` itself (e.g., if there are three adults in the household, the sampling probability is 1 in 3, so the base weight is 3). If `bas23_dem_adults` is missing on an otherwise completed questionnaire, it was imputed (see Section 2.6.4 below).
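Combining the two stages, a respondent's base weight is the inverse of the address's sampling probability times the number of adults in the household; a minimal sketch (function and argument names hypothetical):

```python
def base_weight(address_selection_prob: float, n_adults: int) -> float:
    """Base weight = (1 / P(address sampled)) * n_adults, since one adult
    is selected per household by the next-birthday method."""
    return (1.0 / address_selection_prob) * n_adults
```

For example, an address sampled at a 1-in-10 rate with three adults in the household receives a base weight of 30.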
2.6.2 Non-Response Adjustments
Non-response adjustments were made based on age, educational attainment, race/ethnicity, gender, and geography, with sampling strata used as the basis for the adjustment cells. In Baltimore County, the 33 sampling strata were based on PUMA region crossed with Black/Hispanic stratum (listed in Table 2.1). In Baltimore City, the 88 sampling strata were based on PUMA region crossed with Community Statistical Area, crossed with Black/Hispanic stratum (listed in Table 2.2). Cells with fewer than 10 respondents were collapsed across Black/Hispanic strata within geography first, and then by geography if necessary.
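A standard weighting-class adjustment of this kind inflates respondents' weights by the inverse of the weighted response rate within their cell; the sketch below illustrates that mechanism under simplified cell definitions (not Westat's code):

```python
from collections import defaultdict

def nonresponse_adjust(cases):
    """Weighting-class nonresponse adjustment.

    cases: list of (cell, base_weight, responded) tuples.
    Returns adjusted weights for respondents, in order: each respondent's
    base weight times (total base weight in cell / respondent base weight
    in cell), so the cell's full weight is carried by its respondents.
    """
    total = defaultdict(float)
    resp = defaultdict(float)
    for cell, w, responded in cases:
        total[cell] += w
        if responded:
            resp[cell] += w
    return [w * total[cell] / resp[cell] for cell, w, responded in cases if responded]
```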
2.6.3 Calibration Adjustments
Weights were then calibrated to match control totals from the ACS Public Use Microdata Sample (PUMS); these control totals are based on samples of the U.S. population within ‘PUMS regions’. The calibration adjusts both for nonresponse (beyond the adjustments in Section 2.6.2) and for undercoverage of the ABS frame as a representation of all adults. Weights were calibrated using raking (iterative proportional fitting, or IPF), raking the nonresponse-adjusted weights (the base weights after the non-response adjustments of Section 2.6.2) to control totals in the following dimensions, separately by jurisdiction:
- PUMS region within county
- Gender within each county (male and female; note that ACS only has control totals for male and female, so transgender and other gender from the questionnaire were collapsed for this calibration step)
- Race/ethnicity within county:
- Hispanic or non-Hispanic Other Race
- Non-Hispanic Black Only
- Non-Hispanic White Only
- Non-Hispanic Asian Only
- Age within county
- 18 to 34
- 35 to 54
- 55 to 69
- 70 or older
- Education level within county
- No high-school diploma or high-school diploma only
- 2-year associate’s degree only or 4-year college diploma only
- Professional degree
- Home-owners vs. renters within county
Excessive weights were trimmed using Westat’s RAKE-TRIM algorithm. The algorithm trims weights that fall outside a given range and then re-rakes to the control totals, iterating between raking and trimming until the control totals are achieved while the weights also remain within the specified constraints. When convergence did not occur, control cells were collapsed to relax constraints until the algorithm converged.
Prior to collapsing cells, the trimming constraint was adjusted. To preserve the unbiasedness of the weights, the trimming cells within which constraints were applied were the 12 sampling-rate strata listed in Tables 2.1 and 2.2. Within these 12 sampling-rate strata, weights greater than 3.5 times the median weight were trimmed to 3.5 times the median, and weights smaller than 1/3.5 times the median weight were raised to 1/3.5 times the median. These factors were changed to 4.5 and 1/4.5 when convergence could not be achieved.
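The iterate-rake-then-trim loop can be sketched generically; this is not Westat's proprietary RAKE-TRIM, and for simplicity it trims relative to the overall median weight rather than within the 12 sampling-rate strata:

```python
from statistics import median

def rake_and_trim(weights, dims, controls, ratio=3.5, iters=100):
    """Generic rake-and-trim sketch (simplified, not Westat's RAKE-TRIM).

    weights  : starting (nonresponse-adjusted) weights
    dims     : list of category-label lists, one per raking dimension
    controls : list of dicts mapping each category to its control total
    """
    w = list(map(float, weights))
    for _ in range(iters):
        # raking pass: scale weights in each category to hit its control total
        for labels, totals in zip(dims, controls):
            for cat, target in totals.items():
                idx = [i for i, lab in enumerate(labels) if lab == cat]
                s = sum(w[i] for i in idx)
                if s > 0:
                    for i in idx:
                        w[i] *= target / s
        # trimming pass: clip weights to [median/ratio, median*ratio]
        med = median(w)
        w = [min(max(x, med / ratio), med * ratio) for x in w]
    return w
```

Because trimming can pull the totals off their controls, the loop re-rakes after each trim, which mirrors the iterative behavior described above.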
2.6.4 Imputations
The weighting steps described in Sections 2.6.1-2.6.3 above required responses for the following variables from all respondents:
- Number of adults in household (`bas23_dem_adults`)
- Gender (`bas23_dem_gender`)¹
- Age group (derived from `bas23_dem_yearborn`)
- Education (derived from `bas23_dem_edattain`)
- Race (derived from responses to the race questions `bas23_dem_race*`)
- Ethnicity (`bas23_dem_latx`)
- Housing tenure (derived from `bas23_dem_own`)
When a variable was missing for fewer than five (5) respondents, the missing values were imputed as the modal response category for the derived variable. In other cases, imputations were calculated using the R package `mice` (multiple imputation by chained equations). The `mice` algorithm fits a predictive model for each variable based on all of the others, plus Census tract information (Black/Hispanic percentage), and iterates through the individual-variable chained equations one by one to convergence. Models were estimated separately for each jurisdiction.
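The low-missingness rule can be sketched as follows (the `mice` step itself is not reproduced here; the function name is hypothetical):

```python
from collections import Counter

def modal_impute(values, threshold=5):
    """If a variable is missing (None) for fewer than `threshold` respondents,
    fill the missing entries with the modal observed category; otherwise leave
    them for model-based imputation (mice in the actual pipeline)."""
    missing = [i for i, v in enumerate(values) if v is None]
    if 0 < len(missing) < threshold:
        mode = Counter(v for v in values if v is not None).most_common(1)[0][0]
        return [mode if v is None else v for v in values]
    return values
```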
¹ Values other than “male” or “female” for `bas23_dem_gender` were imputed for weighting purposes because the American Community Survey includes no responses other than “male” and “female”.