California and Counties Population by Age, Race/Hispanics, and Gender: 2010–2020
Release Notes:
- 2025.1.29: Initial public data release.
Complete Race/Ethnicity and Sex by Age for California and Counties, Excel Compatible
Complete File Comma-Delimited, Database-Ready Format – comma-delimited dataset and data dictionary (compressed with zip). This is a large file that will NOT open successfully in Microsoft Excel.
Data Dictionary – text containing codes and correspondence tables
Intercensal population estimates are estimates made for the years between two completed census counts. The intercensal estimates supersede the 2010-2019 postcensal estimates to ensure consistency with the 2020 Census.
Contents
These data files provide intercensal population estimates by single year of age (0-100+), race/Hispanic origin, and gender for California and its counties. The estimates cover April 1, 2010; July 1 of each year from 2010 through 2019; and April 1, 2020.
Data Sources
The California Department of Public Health provided the vital statistics (births and deaths) used to develop these estimates. Data for April 1, 2010 come from the Census Bureau’s 2010 Census Modified Race Summary File (MARS). The file for April 1, 2020 was produced by DOF as described in the Population Projections methodology. Both files allocate the ‘Some Other Race’ category to other race/ethnic categories to comply with 1997 U.S. Office of Management and Budget standards.
Technical Notes
Assumptions
Base Population: The Department of Finance used the April 1, 2010 MARS file as the starting point for these intercensal estimates. The ending point of the intercensal estimates uses the DOF-produced 2020 projection-based file.
Births and Deaths: Birth and death data were derived from vital records and assigned characteristics using the approach described in the Population Projections methodology referenced above.
Migration: The migration component was generated using a survived population method. The 2010 population was aged forward using birth and death data and compared to the 2020 Census population. The differences were assumed to be migration.
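The residual logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the Department of Finance's implementation: it collapses the calculation to a single total for one area and ignores age, race, and sex detail. The function name and all figures are hypothetical.

```python
def residual_net_migration(pop_2010, births, deaths, pop_2020):
    """Net migration implied when the 2010 population is carried forward
    using only births and deaths and then compared with the 2020 count."""
    # Expected 2020 population if nobody had moved in or out.
    expected_2020 = pop_2010 + births - deaths
    # Whatever the vital events cannot explain is attributed to migration.
    return pop_2020 - expected_2020


# Illustrative numbers only, not actual estimates.
net_migration = residual_net_migration(
    pop_2010=100_000, births=12_000, deaths=7_000, pop_2020=108_500
)
print(net_migration)  # 3500
```

A positive result indicates net in-migration over the decade; a negative result indicates net out-migration.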
Methodology
The 2010-2020 intercensal estimates were generated by a cohort-component model starting with 2010 Census data. The cohort-component model uses three components: births, deaths, and migration.
Current Population = Previous Population + (Births - Deaths) + Net Migration
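As a rough illustration, the balancing equation can be applied year by year to roll a starting population forward. The sketch below assumes one population group and uses made-up annual figures; the function name is hypothetical and the code is not taken from the source.

```python
def roll_forward(start_pop, births, deaths, net_migration):
    """Apply Current = Previous + (Births - Deaths) + Net Migration
    once per year and return the population at each step."""
    populations = [start_pop]
    for b, d, m in zip(births, deaths, net_migration):
        populations.append(populations[-1] + (b - d) + m)
    return populations


# Two illustrative years of components (not actual data).
series = roll_forward(1000, births=[10, 12], deaths=[8, 9], net_migration=[5, -2])
print(series)  # [1000, 1007, 1008]
```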
Annual birth and death data were added to each year’s starting population. The ten-year migration total derived from the survived population method was annualized and divided by the total to derive a yearly migration component (migration was front-loaded to 2010-2015 to reflect migration patterns in the state). The migration component was grouped into ten-year age groups and randomly added to or subtracted from each year’s population group by county/race grouping.
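One way to picture the annualization is to split the ten-year migration total across years using weights that are heavier in the first half of the decade. The weights below (3 for each of the first five years, 2 for each of the last five) are purely an assumption for illustration; the source does not state the actual front-loading factors, and the function name is hypothetical.

```python
def annualize_migration(ten_year_total, weights):
    """Split a decade migration total into yearly amounts in
    proportion to the supplied weights."""
    total_weight = sum(weights)
    return [ten_year_total * w / total_weight for w in weights]


# Hypothetical front-loading: earlier years receive a larger share.
weights = [3] * 5 + [2] * 5
yearly = annualize_migration(3500, weights)
print(yearly[0], yearly[-1])  # 420.0 280.0
```

The yearly amounts always sum back to the ten-year total, so the decade-level residual is preserved regardless of the weights chosen.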
The resulting data set was smoothed using a bi-directional smoothing function for ages 0-89 in cases where the population was larger than 200. For ages 90 and older, the data were smoothed with a spline interpolation method using the Stata “mipolate” command. The smoothed dataset was then controlled to match the published population estimates by county.
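The final controlling step can be illustrated with simple proportional scaling, in which every detailed cell is multiplied by the same factor so that the cells sum to the published county total. Proportional scaling is an assumption here; the source does not specify how the controlling was performed, and the smoothing function is not reproduced because it is not described in detail. The function name and figures are hypothetical.

```python
def control_to_total(values, control_total):
    """Scale detailed values so that they sum to a control total."""
    current_total = sum(values)
    if current_total == 0:
        # Nothing to scale; return the values unchanged.
        return list(values)
    factor = control_total / current_total
    return [v * factor for v in values]


# Three illustrative age cells controlled to a county total of 600.
controlled = control_to_total([100, 200, 100], 600)
print(controlled)  # [150.0, 300.0, 150.0]
```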
Authority
These population estimates were prepared under the mandate of Government Code, Sections 13073 and 13073.5. In addition, the State Administrative Manual, Section 1100 on state plans, sets the general policy of “…(3) The use of the same population projections and demographic data that is provided by the State’s Demographic Research Unit.”
Acknowledgments
Andrés Gallardo prepared these estimates of population by age, gender and race/ethnicity. Technical and analytical expertise was provided by Walter Schwarm and Jim Miller.
Suggested Citation
State of California, Department of Finance, California and Counties Intercensal Population by Age, Race/Ethnicity, and Gender: 2010–2020. Sacramento, California, January 2025.