2 Methodology

21CC contracted Westat, Inc. to conduct a survey of a sample representative of residents of the Baltimore area, defined as residents of Baltimore City and Baltimore County, Maryland.

2.1 Population Frame

The population frame consisted of English-speaking adults living in households in Baltimore City and Baltimore County, Maryland.

2.2 Sampling Plan

The overall sampling plan used a stratified address-based sample (ABS) of 11,500 addresses divided into two components. The first component was a stratified random sample of 10,000 addresses with oversamples of addresses in Baltimore City and in neighborhoods with disproportionately large shares of Black or Hispanic residents. The sampling plan for this component closely followed the BAS 2023 sampling plan.

The second component was an additional sample of 1,500 addresses that augmented the first component.1 These 1,512 addresses were each located in one of the 55 census tracts that include the route proposed for the Red Line mass transit project, which traverses Baltimore from east to west.

2.2.1 Stratification

The 10,000 addresses in the first component were sampled from two major strata defined by jurisdiction with 6,000 addresses in Baltimore City and 4,000 addresses in Baltimore County. Substrata were defined based on:

  • Public Use Microdata Area regions (five in Baltimore City and seven in Baltimore County)
  • Community Statistical Areas defined by the Baltimore Neighborhood Indicators Alliance (in Baltimore City only)
  • Black/Hispanic strata based on Census tract shares of Black and Hispanic residents

The Black/Hispanic strata are based on the combined share of Black and Hispanic residents in the Census tract, using data from the 2017-2021 American Community Survey (ACS) estimates. The following six stratum classifications were used for the combined share of Black and Hispanic residents:

  1. \(\leq\) 10% Black or Hispanic
  2. \(>\) 10% to \(\leq\) 25% Black or Hispanic
  3. \(>\) 25% to \(\leq\) 50% Black or Hispanic
  4. \(>\) 50% to \(\leq\) 75% Black or Hispanic
  5. \(>\) 75% to \(\leq\) 90% Black or Hispanic
  6. \(>\) 90% Black or Hispanic
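The classification above can be sketched as a small lookup. This is only an illustration of the cut points listed; the function name `black_hispanic_stratum` is hypothetical and not part of the survey's codebase.

```python
def black_hispanic_stratum(share_pct: float) -> int:
    """Return the stratum (1-6) for a tract's combined Black/Hispanic share, in percent."""
    if not 0.0 <= share_pct <= 100.0:
        raise ValueError("share_pct must be between 0 and 100")
    # Upper bounds are inclusive, matching the "<=" cut points in the list above.
    upper_bounds = [10.0, 25.0, 50.0, 75.0, 90.0]
    for stratum, bound in enumerate(upper_bounds, start=1):
        if share_pct <= bound:
            return stratum
    return 6  # anything above 90%


if __name__ == "__main__":
    for share in (8.0, 25.0, 62.5, 93.0):
        print(share, "->", black_hispanic_stratum(share))
```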

2.2.2 Oversample

Oversampling rates differed by jurisdiction. In Baltimore County, addresses were over-sampled such that the sampling rate in the highest-Black/Hispanic stratum was 1.62 times that of the lowest stratum. In Baltimore City, addresses were over-sampled such that the sampling rate in the highest-Black/Hispanic stratum was 1.27 times that of the lowest stratum. Tables 2.1 and 2.2 provide the relative sampling rates of the strata within each jurisdiction (percentages may not sum to 100 due to rounding).

Table 2.1: Relative sampling rates of strata within Baltimore County

| Stratum | Percent of HH in jurisdiction | Relative sampling rate | Percent of HH sample |
|---|---|---|---|
| 1 | 22.4% | 82.5% | 18.5% |
| 2 | 26.5% | 92.8% | 24.6% |
| 3 | 25.3% | 103.1% | 26.0% |
| 4 | 12.1% | 113.5% | 13.7% |
| 5 | 12.4% | 123.8% | 15.3% |
| 6 | 1.4% | 134.1% | 1.9% |

Table 2.2: Relative sampling rates of strata within Baltimore City

| Stratum | Percent of HH in jurisdiction | Relative sampling rate | Percent of HH sample |
|---|---|---|---|
| 1 | 5.0% | 84.9% | 4.3% |
| 2 | 11.7% | 89.6% | 10.5% |
| 3 | 19.7% | 94.3% | 18.6% |
| 4 | 14.5% | 99.0% | 14.4% |
| 5 | 18.7% | 103.7% | 19.4% |
| 6 | 30.4% | 108.4% | 33.0% |
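As a rough check on how the columns of Tables 2.1 and 2.2 relate, the sketch below assumes that each stratum's share of the household sample is its share of households multiplied by its relative sampling rate, renormalized to 100. That relationship is our reading of the tables, not a formula stated by Westat, and the function names are hypothetical. Under that assumption the published values are reproduced to within rounding, and the highest-to-lowest rate ratios come out close to the 1.62 and 1.27 quoted above.

```python
# Values copied from Tables 2.1 and 2.2 (all in percent), strata 1 through 6.
COUNTY = {
    "pct_hh": [22.4, 26.5, 25.3, 12.1, 12.4, 1.4],
    "rel_rate": [82.5, 92.8, 103.1, 113.5, 123.8, 134.1],
    "sample_pct": [18.5, 24.6, 26.0, 13.7, 15.3, 1.9],
}
CITY = {
    "pct_hh": [5.0, 11.7, 19.7, 14.5, 18.7, 30.4],
    "rel_rate": [84.9, 89.6, 94.3, 99.0, 103.7, 108.4],
    "sample_pct": [4.3, 10.5, 18.6, 14.4, 19.4, 33.0],
}


def expected_sample_pct(pct_hh, rel_rate):
    """Household share times relative rate, renormalized to sum to 100 (assumed relationship)."""
    raw = [p * r for p, r in zip(pct_hh, rel_rate)]
    total = sum(raw)
    return [100.0 * x / total for x in raw]


def oversample_ratio(rel_rate):
    """Relative sampling rate of the highest stratum divided by that of the lowest."""
    return rel_rate[-1] / rel_rate[0]


if __name__ == "__main__":
    for name, table in (("County", COUNTY), ("City", CITY)):
        shares = expected_sample_pct(table["pct_hh"], table["rel_rate"])
        print(name, [round(s, 1) for s in shares], round(oversample_ratio(table["rel_rate"]), 3))
```

Small differences from the published figures are expected because the table entries are themselves rounded.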

2.2.3 Red Line Augmentation

The 1,512 extra households sampled in the Red Line corridor were built on top of the base sampling frame described above. Tracts in the Red Line corridor were sampled at 1.95 times the baseline rate. Crossing the racial/ethnic oversample strata with presence in the Red Line corridor resulted in 21 strata.

2.2.4 Final Design

The final design yielded 11,506 addresses in the sample, 7,337 in Baltimore City and 4,169 in Baltimore County. There were 3,092 sampled addresses in the Red Line corridor and 8,415 not in the Red Line corridor. The final household probability of being sampled ranged from 1.00% to 5.13% across the 21 strata. A list of sampling strata and sample probabilities is reported in Table 4.1.

Westat implicitly stratified the ABS frame into 145 substrata to ensure geographic representation: 33 substrata in Baltimore County and 112 substrata in Baltimore City. The 33 cells in Baltimore County are all minority strata present within the 7 PUMA regions (note that not every minority stratum is represented within each PUMA region). The 112 cells in Baltimore City represent 56 community statistical areas (CSAs), sometimes divided across PUMA regions, crossed with all minority strata represented within the CSAs.

For more detailed information about the sample design, including lists of all substrata, refer to the Baltimore Area Survey Methods Report (Popick and Rizzo, 2024).

2.3 Data Collection

The 2024 BAS used a “push-to-web” mode that allowed respondents to conduct a computer-assisted personal interview (CAPI). A paper version was offered as well. One survey experiment was used and is described in more detail below.

2.3.1 Field period

The field period for this survey was from September 23 through November 12, 2024.

2.3.2 Mailings

Sampled addresses received up to four letters inviting them to participate in the Baltimore Area Survey. Each mailing mentioned a promised $5 incentive for completing the survey. The contents of each mailing are described below. All mailings were sent by first class USPS mail. All undeliverable mail and completed surveys were returned to Westat and documented in the sample file.

Mailing schedule and contents:

  • Mailing 1, sent September 23, 2024 to all sampled addresses
    • #10 envelope with either a Baltimore Area Survey logo or a Johns Hopkins 21st Century Cities Initiative logo (determined by randomization, see Section 2.4 for more information)
    • One page cover letter including survey website and unique login credentials, with FAQ on reverse
    • $2 bill tucked inside the letter
  • Mailing 2, sent September 30, 2024 to all sampled addresses
    • Folded postcard with either a Baltimore Area Survey logo or a Johns Hopkins 21st Century Cities Initiative logo
    • Content included survey website and unique login credentials
  • Mailing 3, sent October 16, 2024 to addresses for which previous mailings had not been returned as undeliverable and that had not completed the survey
    • #10 envelope with either a Baltimore Area Survey logo, or a Johns Hopkins 21st Century Cities Initiative logo
    • One page cover letter including survey website and unique login credentials, with FAQ on reverse
  • Mailing 4, sent October 29, 2024 to addresses for which previous mailings had not been returned as undeliverable and that had not completed the survey
    • #10 envelope with either a Baltimore Area Survey logo, or a Johns Hopkins 21st Century Cities Initiative logo
    • One page cover letter including survey website and unique login credentials, with FAQ on reverse

If a respondent requested a paper survey, the respondent was sent:

  • 9x12 inch envelope with Baltimore Area Survey logo
  • One page cover letter including instructions for returning paper survey as well as survey website and unique login credentials, with FAQ on reverse
  • Survey booklet, 20 pages
  • 8.75x11.5 inch postage-paid return envelope with the Baltimore Area Survey logo

Thank-you mailings were sent to respondents who completed the survey and requested cash rather than an Amazon.com gift card.

2.3.3 Computer-Assisted Personal Interview

Respondents who chose to complete the BAS via a web survey were directed to an individualized URL hosted at https://www.BaltimoreAreaSurvey.org. The individualized URL pointed to a site with the questionnaire designed using SurveyBuilder, Westat’s proprietary survey administration software.

2.3.4 Survey Variations

Random assignment was used to adjust the survey in six locations:

  • Climate concern fill pattern (bas24_svy_clmcond); the value for a respondent was assigned randomly to either 1 or 2. The value determined the text used to fill the {{CLMCOND}} fill in the question bas24_clm_worry.
    • If the value equaled 1, {{CLMCOND}} was filled with the text “your personal well-being?”
    • If the value equaled 2, {{CLMCOND}} was filled with the text “the well-being of your loved ones?”
  • Key Bridge Collapse question split ballot (bas_svy_kbsplit); the value for a respondent was assigned randomly to either 1 or 2. The value determined which subset of questions the respondent was asked about the Key Bridge Collapse.
    • If the value equaled 1, respondents were asked about confidence in the response of governments to addressing the Key Bridge (bas24_con_kbconfl, bas24_con_kbconfs, bas24_con_kbconff, and bas24_con_kbdate1)
    • If the value equaled 2, respondents were asked to rate the job of governments in addressing the Key Bridge collapse (bas24_con_kbhelpl, bas24_con_kbhelps, bas24_con_kbhelpf, and bas24_con_kbdate2)
  • Complete Streets fill pattern (bas24_svy_cstcond); the value for a respondent was assigned randomly to either 1 or 2. The value determined the text used to fill the {{COMPSTRTXT}} text in questions about safe streets policy (bas24_tsp_cstknow, bas24_tsp_cstfo, and bas24_tsp_cstfoa). The introductory text to the questions was also varied to match.
    • If the value equaled 1, {{COMPSTRTXT}} was filled with the text “Complete Streets Policy”
    • If the value equaled 2, {{COMPSTRTXT}} was filled with the text “policy like this”
  • Question order for school integration questions (bas24_svy_schinto); the value for a respondent was assigned randomly to either 1 or 2.
    • If the value equaled 1, bas24_att_schintb was asked before bas24_att_schintw
    • If the value equaled 2, bas24_att_schintw was asked before bas24_att_schintb
  • Question order for school policy questions (bas24_svy_schpolo); the value for a respondent was assigned randomly to either 1 or 2.
    • If the value equaled 1, bas24_att_schrspnd was asked before bas24_att_schrmv
    • If the value equaled 2, bas24_att_schrmv was asked before bas24_att_schrspnd
  • Response order for limited government question (bas24_att_lmtgov); the value for a respondent was assigned randomly to either 1 or 2
    • If the value equaled 1, the response options were presented with “Much more” listed first
    • If the value equaled 2, the response options were presented with “Much less” listed first

2.3.5 Pen-and-Paper Personal Interview

The pen-and-paper version of the survey could not implement randomization or enforce skip patterns. Data from respondents who requested and returned a paper copy of the survey were entered through the survey website by Westat project staff. Because randomization could not be executed on paper, all six survey variations described in Section 2.3.4 were set to equal “1” in the pen-and-paper version.
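The assignment logic for the six survey variations can be sketched as follows. This is a minimal illustration, assuming that each variation was drawn independently with equal probability for web respondents; the report states only that assignment was random. The variable names are those listed in Section 2.3.4, while `assign_variations` and the `mode` argument are hypothetical.

```python
import random

# Variable names as listed in Section 2.3.4.
VARIATION_FLAGS = [
    "bas24_svy_clmcond",
    "bas_svy_kbsplit",
    "bas24_svy_cstcond",
    "bas24_svy_schinto",
    "bas24_svy_schpolo",
    "bas24_att_lmtgov",
]

# Fill text for {{CLMCOND}} in bas24_clm_worry, keyed by the assigned value.
CLMCOND_TEXT = {
    1: "your personal well-being?",
    2: "the well-being of your loved ones?",
}


def assign_variations(mode: str, rng: random.Random) -> dict:
    """Return a value of 1 or 2 for each variation; paper surveys always get 1."""
    if mode == "paper":
        return {flag: 1 for flag in VARIATION_FLAGS}
    # Assumption: independent, equal-probability draws for web respondents.
    return {flag: rng.choice([1, 2]) for flag in VARIATION_FLAGS}


if __name__ == "__main__":
    flags = assign_variations("web", random.Random(2024))
    print(flags)
    print(CLMCOND_TEXT[flags["bas24_svy_clmcond"]])
```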

Because instructions cannot be enforced on paper surveys as they are on the web instrument, project staff applied the following rules when entering paper surveys:

  • Entered the survey from beginning to end, with the web programming enforcing skips. This means that if respondents answered a later question that they should not have because of a skip, their answer was not recorded. The exception to this was the income questions: if the respondent filled the detailed income question, then the corresponding value for above or below $70,000 was entered and the detailed value was entered.
  • Recorded the less extreme response if there were multiple responses to a “select-one” question, or the mark was between two response options. If the question included an “Other-specify” option and the “specify” option was filled, that option took precedence.
  • If there was a provided response as well as written commentary, recorded the selected response and ignored the commentary. If no response was selected, none was recorded based on the commentary.
  • If the respondent scratched out a selected response, that response was considered empty, whether or not they provided an alternate response.
  • If the person wrote “zero” for number of adults in the household, this was submitted as “1”
  • If more than one employment status was selected, “Retired” took precedence over other responses
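Two of the data-entry rules above are mechanical enough to express in code. The sketch below is illustrative only: the function names are hypothetical, “Retired” is the only response label taken from the report, and cases the listed rules do not cover are returned as `None` rather than guessed.

```python
from typing import Optional, Sequence


def adults_for_entry(written_count: int) -> int:
    """A written count of zero adults in the household was submitted as 1."""
    return 1 if written_count == 0 else written_count


def employment_for_entry(selected: Sequence[str]) -> Optional[str]:
    """Resolve employment marks; "Retired" takes precedence when several are selected."""
    if len(selected) == 1:
        return selected[0]
    if len(selected) > 1 and "Retired" in selected:
        return "Retired"
    # No marks, or several marks without "Retired": not covered by the listed rules.
    return None


if __name__ == "__main__":
    print(adults_for_entry(0))
    # "Employed part time" is a made-up label used only for this example.
    print(employment_for_entry(["Employed part time", "Retired"]))
```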

2.3.6 Respondent Incentives

All respondents were given a $2 bill as a pre-incentive and $5 after completing the survey, delivered via an Amazon.com electronic gift card or, if requested, in cash.

2.4 Survey Field Experiment

Households were selected to receive mailings with either the Baltimore Area Survey logo or the Johns Hopkins 21st Century Cities Initiative logo (see Figure 2.1). The assignment to the BAS logo or 21CC logo was made with addresses arranged in a geographical sort so that the two conditions were distributed evenly across all geographic areas. The assigned logo was included on both the envelope and the letter. Households were assigned to receive the same logo for all mailings throughout the field period. The website had a single landing page that displayed both logos.
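One common way to spread two conditions evenly over a geographically sorted list is to alternate them down the list from a random starting point. The sketch below assumes that approach; the report does not describe the exact mechanism Westat used, and `assign_logos` and the geographic sort key are hypothetical.

```python
import random


def assign_logos(addresses, rng: random.Random) -> dict:
    """Map address_id to "BAS" or "21CC".

    addresses: list of (address_id, geo_key) pairs, where geo_key is any
    sortable geographic code (a hypothetical stand-in for the real sort).
    """
    ordered = sorted(addresses, key=lambda pair: pair[1])
    logos = ("BAS", "21CC")
    start = rng.randrange(2)  # random start so neither logo is favored
    # Assumption: alternate the two conditions down the sorted list.
    return {
        address_id: logos[(position + start) % 2]
        for position, (address_id, _) in enumerate(ordered)
    }


if __name__ == "__main__":
    demo = [(i, f"tract{i // 4:03d}") for i in range(10)]
    print(assign_logos(demo, random.Random(0)))
```

Alternating within the sorted order keeps the two groups the same size (to within one address) and balanced within small geographic areas.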

Figure 2.1: BAS and 21CC/JHU Logos used on mailings

There were 696 completed surveys from those receiving the Baltimore Area Survey logo, and 796 completed surveys from those receiving the Johns Hopkins 21st Century Cities Initiative logo.

2.5 Response Rate

The response rate for this study was calculated using AAPOR’s RR3 formula. The overall sample response rate was 19.9 percent. Table 2.3 shows each category of the calculation in detail for the overall sample as well as experimental and geographical groups of interest.

Completed surveys were those that finished the final substantive question (i.e., they may not have provided contact information for future follow-up). All other surveys were considered break-offs.

The variable \(e\), the estimated proportion of unknown-eligibility addresses that are eligible, is calculated as the number of eligible addresses (i.e., the sum of categories 1 and 2) divided by the sum of eligible and not eligible addresses.2

Table 2.3: AAPOR Response Rate 3 calculations for groups of interest
| Category | Disposition | Full sample | BAS logo | JHU 21CC logo | City | County |
|---|---|---|---|---|---|---|
| Total sample used | | 11,512 | 5,756 | 5,756 | 7,350 | 4,162 |
| Interview (Category 1) - I | 1.1 Complete | 1,492 | 696 | 796 | 934 | 558 |
| Eligible, non-interview (Category 2) - R | 2.1 Refusal & Breakoff | 289 | 162 | 177 | 226 | 113 |
| Unknown eligibility, non-interview (Category 3) - UH | 3.19 Nothing ever returned | 8,745 | 4,416 | 4,329 | 5,473 | 3,272 |
| Not eligible (Category 4) | 4.30 Housing Unit Ineligible | 936 | 482 | 454 | 717 | 219 |
| e | | 0.66 | 0.64 | 0.68 | 0.62 | 0.75 |
| AAPOR RR3 (%) | | 19.9 | 18.9 | 20.3 | 20.6 | 17.8 |
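The calculation behind Table 2.3 can be reproduced with a few lines. With only the disposition categories used here, AAPOR's RR3 reduces to \(I / (I + R + e \cdot UH)\), with \(e = (I + R) / (I + R + NE)\), where \(NE\) is the number of not-eligible addresses. The counts below are copied from the table; the function and dictionary names are hypothetical.

```python
# Disposition counts from Table 2.3: (I, R, UH, NE) for each group.
GROUPS = {
    "Full sample": (1492, 289, 8745, 936),
    "BAS logo": (696, 162, 4416, 482),
    "JHU 21CC logo": (796, 177, 4329, 454),
    "City": (934, 226, 5473, 717),
    "County": (558, 113, 3272, 219),
}


def eligibility_rate(interviews, refusals, not_eligible):
    """e: share of known-eligibility addresses that are eligible."""
    eligible = interviews + refusals
    return eligible / (eligible + not_eligible)


def rr3(interviews, refusals, unknown, not_eligible):
    """AAPOR Response Rate 3 for the categories present in Table 2.3."""
    e = eligibility_rate(interviews, refusals, not_eligible)
    return interviews / (interviews + refusals + e * unknown)


if __name__ == "__main__":
    for name, (i, r, uh, ne) in GROUPS.items():
        print(f"{name}: e = {eligibility_rate(i, r, ne):.2f}, RR3 = {100 * rr3(i, r, uh, ne):.1f}%")
```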

2.6 Weighting

Weights were applied to ensure representation of adults in the Baltimore area. Three components were used to create sample weights to represent the population: base weights to account for unequal probability of selection, a nonresponse adjustment to account for differential nonresponse, and calibration to align the weighted sample with population-level control totals.

2.6.1 Base Weights

The ABS address probability of selection arises from the stratification structure determined by the Census Tract of the address. The household-level base weight is the inverse of this sampling-rate stratum probability from the 21 sampling strata described in Section 2.2.4.

One adult was sampled from each sampled household. Within-household sampling is based on the next-birthday method. The invitation letter requested that the adult with the next birthday respond to the survey. We treat this pseudo-random selection as equivalent to random sampling, so the probability of selection of the final respondent adult within the household is based on the number of adults in the household. This information is collected as part of the questionnaire, in variable bas24_dem_adults. The within-household base weight component is therefore bas24_dem_adults itself (e.g., if there are three adults in the household, then the sampling probability is 1 in 3, so that the base weight is 3). If bas24_dem_adults was missing on an otherwise completed questionnaire, it was imputed (see Section 2.6.4 below).
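Putting the two components together, a respondent's base weight can be sketched as the inverse of the household's selection probability multiplied by the number of adults in the household. Treating the overall base weight as the product of the two components is the standard reading of the description above rather than a formula quoted from Westat; the function name is hypothetical.

```python
def base_weight(p_household: float, n_adults: int) -> float:
    """Inverse household selection probability times number of adults (bas24_dem_adults)."""
    if not 0.0 < p_household <= 1.0:
        raise ValueError("p_household must be in (0, 1]")
    if n_adults < 1:
        raise ValueError("n_adults must be at least 1")
    household_weight = 1.0 / p_household  # inverse of the stratum sampling probability
    within_household_weight = n_adults    # one adult selected out of n_adults
    return household_weight * within_household_weight


if __name__ == "__main__":
    # The extremes of the household probabilities reported in Section 2.2.4.
    print(base_weight(0.0100, 1))
    print(base_weight(0.0513, 3))
```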

2.6.2 Non-Response Adjustments

Non-response adjustments were made based on age, educational attainment, race/ethnicity, gender, and geography. Sampling strata were used as the basis of the non-response adjustment. In Baltimore County, the 34 sampling strata were based on oversample stratum (listed in Table 2.1) and Red Line status. In Baltimore City, the 120 sampling strata were based on Red Line status crossed with Community Statistical Area, crossed with Black/Hispanic stratum (listed in Table 2.2). Cells with fewer than 10 respondents were collapsed across Black/Hispanic strata, then by Red Line status, and then by geography if necessary. Collapsing led to 43 response cells with response adjustments ranging from a low of 1.0 to a high of 1.6.

2.6.3 Calibration Adjustments

Weights were then calibrated to match control totals from the ACS Public Use Microdata Sample (PUMS); because PUMS is itself a sample, these control totals are based on samples from the U.S. population within ‘PUMS regions’. The calibration adjusts both for nonresponse (beyond the adjustments in Section 2.6.2) and for undercoverage of the ABS frame as a representation of all adults. The nonresponse-adjusted weights (the base weights after the non-response adjustments specified in Section 2.6.2) were raked (iterative proportional fitting, or IPF) to control totals in the following dimensions, separately by jurisdiction:

  1. PUMS region within county
  2. Gender within each county (male and female; note that ACS only has control totals for male and female, so transgender and other gender from the questionnaire were collapsed for this calibration step)
  3. Race/ethnicity within county:
    • Hispanic or non-Hispanic Other Race
    • Non-Hispanic Black Only
    • Non-Hispanic White Only
    • Non-Hispanic Asian Only
  4. Age within county
    • 18 to 34
    • 35 to 54
    • 55 to 69
    • 70 or older
  5. Education level within county
    • No high-school diploma or high-school diploma only
    • 2-year associates degree only or 4-year college diploma only
    • Professional degree
  6. Home-owners vs. renters within county
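For readers unfamiliar with raking, the sketch below shows generic iterative proportional fitting on a toy example with two dimensions. It is not Westat's RAKE-TRIM algorithm, and the respondents, categories, and control totals are invented purely for illustration.

```python
def rake(weights, categories, targets, max_iter=100, tol=1e-8):
    """Generic IPF: scale weights until every margin matches its control total.

    weights: starting weight per respondent.
    categories: {dimension: [category label per respondent]}.
    targets: {dimension: {category label: control total}}.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_gap = 0.0
        for dim, labels in categories.items():
            # Current weighted total in each category of this dimension.
            sums = {}
            for wi, label in zip(w, labels):
                sums[label] = sums.get(label, 0.0) + wi
            for label, target in targets[dim].items():
                max_gap = max(max_gap, abs(sums[label] - target) / target)
            # Scale each respondent's weight by target / current total.
            w = [wi * targets[dim][label] / sums[label] for wi, label in zip(w, labels)]
        if max_gap < tol:
            return w
    raise RuntimeError("raking did not converge")


if __name__ == "__main__":
    # Toy data: four respondents, two raking dimensions (all values invented).
    cats = {
        "gender": ["male", "female", "male", "female"],
        "tenure": ["owner", "owner", "renter", "renter"],
    }
    controls = {
        "gender": {"male": 60.0, "female": 40.0},
        "tenure": {"owner": 70.0, "renter": 30.0},
    }
    print(rake([25.0, 25.0, 25.0, 25.0], cats, controls))
```

In production, each dimension would be raked within jurisdiction and the control totals would come from ACS PUMS rather than from invented values.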

Excessive weights were trimmed. Westat’s RAKE-TRIM algorithm was used for raking and trimming weights. Westat applied the algorithm to trim back weights that fell outside a given range and then re-raked to the control totals, repeating this raking and trimming process until the control totals were achieved while the weights also remained within the specified constraints. When convergence did not occur, control cells were collapsed to reduce constraints until the algorithm converged.

In order to achieve convergence, we needed to collapse the three education level cells into two cells for Baltimore County only (not Baltimore City, where all three cells were maintained). The final two education cells for Baltimore County were no professional degree vs. professional degree (collapsing the two lower-education-achievement cells). No other collapsing was necessary.

Prior to collapsing cells, the trimming constraint was adjusted. To preserve the unbiasedness of the weights, the trimming cells within which constraints were applied were the 12 sampling-rate strata listed in Tables 2.1 and 2.2. Within these 12 sampling-rate strata, weights greater than 3.5 times the median weight were trimmed to 3.5 times the median, and weights smaller than 1/3.5 times the median weight were trimmed to 1/3.5 times the median. The multipliers were changed to 4.5 and 1/4.5 when convergence could not be achieved.

There were a total of 14 trimming cells based on county, sampling stratum, and number of adults in the household. The final rake-trim algorithm allowed some slightly higher ratios than 3.5 within the final trimming cells to facilitate convergence of the algorithm. The maximum ratio between trimming cell maximum weight and the cell median weight was 3.553. The minimum ratio between trimming cell minimum weight and the cell median weight was 1/3.587.
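A single trimming pass of the kind described above can be sketched as clipping each weight to a band around its cell median. This shows only the trimming step, under the assumption that the bounds are the cell median multiplied and divided by the ratio; it does not reproduce Westat's RAKE-TRIM algorithm, in which trimming alternates with re-raking. The function name and toy data are hypothetical.

```python
from statistics import median


def trim_weights(weights, cells, ratio=3.5):
    """Clip each weight to [median / ratio, median * ratio] within its trimming cell."""
    by_cell = {}
    for weight, cell in zip(weights, cells):
        by_cell.setdefault(cell, []).append(weight)
    medians = {cell: median(values) for cell, values in by_cell.items()}
    return [
        min(max(weight, medians[cell] / ratio), medians[cell] * ratio)
        for weight, cell in zip(weights, cells)
    ]


if __name__ == "__main__":
    # Invented weights in a single trimming cell; the median is 10.
    print(trim_weights([1.0, 10.0, 10.0, 10.0, 100.0], ["a"] * 5))
```

Because re-raking after a trim can push weights slightly back outside the band, final ratios a little above 3.5, like those reported here, are not surprising.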

2.6.4 Imputations

The weighting steps described in Sections 2.6.1-2.6.3 above required responses for the following variables from all respondents:

  • Number of adults in household (bas24_dem_adults)
  • Gender (bas24_dem_gender)3
  • Age group (18 to 34, 35 to 54, 55 to 69, 70 and over; derived from bas24_dem_yearborn)
  • Education (no HS degree or HS degree without associate's or bachelor's degree; associate's degree or bachelor's degree only; master’s, doctorate, or professional degree; derived from bas24_dem_edattain)
  • Race (Asian-American Only, Black Only, White Only, Other; derived from responses to race questions bas24_dem_race*)
  • Ethnicity (Hispanic vs. non-Hispanic; bas24_dem_latx)
  • Housing tenure (homeowner vs. renter; derived from bas24_dem_own)

When a variable was missing for fewer than five (5) respondents, the missing values were imputed as the modal response category for the derived variable. In other cases, imputations were calculated using the R package mice (Multivariate Imputation by Chained Equations). The mice algorithm fit a predictive model for each variable based on all of the others plus Census tract information (Black/Hispanic percentage), and then iterated through the individual-variable chained equations one by one until convergence. Models were estimated separately for each jurisdiction.
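The simpler of the two imputation rules, modal imputation when fewer than five values are missing, can be sketched as below. The chained-equations step was run in R with mice and is not reproduced here; the function name and the use of `None` to mark missing values are assumptions made for the example.

```python
from collections import Counter


def impute_mode_if_few_missing(values, max_missing=4):
    """Fill missing entries (None) with the modal category when at most max_missing are missing."""
    n_missing = sum(v is None for v in values)
    if n_missing == 0:
        return list(values)
    if n_missing > max_missing:
        # Five or more missing: the report used model-based imputation (mice) instead.
        raise ValueError("too many missing values for modal imputation")
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to take a mode from")
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


if __name__ == "__main__":
    print(impute_mode_if_few_missing(["owner", "renter", "owner", None]))
```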


  1. The final implementation of the sample plan resulted in 1,512 addresses in the Red Line corridor in order to reach the desired sample size in individual strata.

  2. See Section 4.2 in the Appendix for response rates calculated to be comparable to the BAS 2023.

  3. Values other than “male” or “female” for bas24_dem_gender were imputed for weighting purposes because the American Community Survey includes no responses other than “male” and “female”.