Part I: Boundary and population data
Discussion of data sources
The Latin America and Caribbean administrative boundaries and population
database was compiled from medium-scale maps at country and sub-national level,
national population censuses and United Nations data. The United Nations data
are for the smaller islands of the Caribbean. Population data for all
of mainland Latin America, Cuba, Puerto Rico, Jamaica, Trinidad and
Tobago, Haiti and the Dominican Republic are from population census data. The
administrative boundary maps for mainland Latin America and the large Caribbean
islands were digitized at the International Center for Tropical Agriculture
(Jones and Bell 1997). The smaller island nations of the Caribbean do not have
sub-national administrative units. The outlines of these countries are from the
Digital Chart of the World. None of the input boundary or population data
has been officially checked or endorsed by national statistical agencies or the
United Nations.
Boundaries
The scale of the source boundary maps varies from
1:50,000 to 1:1,125,000. More detailed information on the source maps is
available in the appendix. In order to ensure a close match between different national
coverages, and to obtain maximum compatibility with other standard medium
resolution data sets, all national boundaries and coastlines were replaced with
the political boundaries template (PONET) of the Digital Chart of the World
(DCW). The DCW is a set of basic digital GIS data layers with a nominal scale
of 1:1 million. The use of a very detailed international boundaries
template for, in some cases, relatively coarse resolution data can be
misleading, but was required to ensure a close match between the national
coverages. In any application, the smaller cartographic scale (i.e., coarser
resolution) of the administrative boundary data in comparison to the
international boundary and coastline template should be kept in mind.
For a few countries, very detailed boundary data were available, but their spatial
referencing information was not known. In light of the objectives of this
project, these were nevertheless incorporated in order to achieve maximum
resolution. However, the ad hoc transformation, projection change and
rubbersheeting required to make these data compatible with the DCW template
have no doubt introduced positional error, which may well reach a magnitude on
the order of 1-2 km.
Population data
With few exceptions, we used official census figures or official estimates,
which were taken from national publications (census reports or statistical
yearbooks) or from secondary data sources (yearbooks and gazetteers). The
specific sources are indicated for each country in the appendix. The accuracy
of censuses obviously varies by country. It was beyond the scope of this
project to evaluate the accuracy of every census used, or of any of the
official estimates. Such an evaluation would be possible in principle, since
most censuses are followed by a post-enumeration survey that provides an
accuracy estimate. We compared the country totals from this dataset with values
from the Population Reference Bureau (PRB) and the Economic Commission for
Latin America and the Caribbean (ECLAC). ECLAC, PRB and other sources that
publish country totals are likely to have values closer to the true national
figure, because their totals have been corrected to account for inaccuracy in
the census. Our data come from the original disaggregated censuses and do not
include these corrections. For countries whose total differed by more than 10
percent from the PRB value, we made a uniform adjustment of the population data
to bring it in line with the PRB country total. In countries with functioning
registration systems, population figures
reach an accuracy within a fraction of a percent. In the US, census counts have
been shown to have an accuracy of about 2 percent. With few exceptions, the
accuracy of Latin American censuses is likely to be considerably lower.
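The uniform adjustment to benchmark country totals mentioned in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the project: the function name, the dictionary layout and the reading of "more than 10 percent" as a relative difference measured against the benchmark total are all assumptions, and the figures are hypothetical.

```python
def adjust_to_benchmark(unit_populations, benchmark_total, tolerance=0.10):
    """Scale every unit by one common factor if the national total deviates
    from the benchmark by more than `tolerance` (relative difference)."""
    if benchmark_total <= 0:
        raise ValueError("benchmark_total must be positive")
    total = sum(unit_populations.values())
    if total <= 0:
        raise ValueError("unit populations must sum to a positive value")
    relative_diff = abs(total - benchmark_total) / benchmark_total
    if relative_diff <= tolerance:
        return dict(unit_populations)  # close enough: figures left unchanged
    factor = benchmark_total / total   # same factor for every unit
    return {unit: pop * factor for unit, pop in unit_populations.items()}


# Hypothetical country with two units; the census total (800,000) is
# 20 percent below a hypothetical benchmark total of 1,000,000.
census = {"unit_a": 300_000, "unit_b": 500_000}
adjusted = adjust_to_benchmark(census, 1_000_000)   # scaled by 1.25
unchanged = adjust_to_benchmark(census, 850_000)    # within 10 percent
```

Because one factor is applied to every unit, the relative distribution of population within the country is preserved; only the national total changes.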
The population data is generally from the early 1990s, with an average
census year of 1990. Costa Rica and Honduras carried out their last population
censuses in 1984 and 1988 respectively. The sources for population data are
listed in the appendix.
Population projections
In order to maximize comparability across
national boundaries, all sub-national population figures from the 1990s census
round were projected to 1995. The population for 1960, 1970, 1980, 1990 and
2000 was projected from the 1995 base year. These projections were based on
historical population growth rates for departments in Latin America and the
Caribbean. In some cases we used national level population growth rates for the
projections. The volume of papers and monographs on population projection
methods in the demographic literature is very large. It is matched, however, by
the number of publications that emphasize the continuing inability of these
methods to accurately forecast population figures over more than very short
time periods (see the interesting discussion in Cohen, 1995).
For this project, we used ECLAC figures for
population growth rates based on a mathematical trend forecast. In contrast to
previous estimates for the global demography project, the current figures for
each sub-national unit are based for most countries on a district-specific
inter-censal growth rate between the last and the next to last enumeration. The
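inter-censal growth rate was calculated as

$$ r = \frac{\ln(P_2 / P_1)}{t} $$

where r is the average rate of
growth, P1 and P2 are
the population totals, for example, in the first and second census, and t
is the number of years between the two enumerations. The 1995 estimate was then
derived using:

$$ P_{1995} = P_2 \, e^{r t'} $$

where t' is the number of years between the second census and 1995.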
See, for example, Rogers (1985). For
predictions over only a few years, mathematical trend projections are usually
fairly accurate, and the specific type of function used has little influence on
the results (Cohen 1995). A more elaborate estimation approach such as the
cohort survival method would result in more reliable estimates, but the data
requirements for this technique (age and sex distribution as well as age
specific birth, death and migration rates) were far beyond what was possible in
this project. Given the method used for the population forecasting, the
characteristics of the available source data obviously have a significant
impact.
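As an illustration of the trend projection described in this section, the following sketch computes an inter-censal growth rate and projects a population to a target year. It assumes an exponential growth model; the function names and the example district are hypothetical and are not taken from the database.

```python
import math


def intercensal_growth_rate(p1, p2, years_between):
    """Average annual exponential growth rate between two census counts."""
    if p1 <= 0 or p2 <= 0 or years_between <= 0:
        raise ValueError("populations and interval must be positive")
    return math.log(p2 / p1) / years_between


def project_population(p_base, rate, years_ahead):
    """Project a population forward (years_ahead > 0) or backward
    (years_ahead < 0) from the base year at a constant rate."""
    return p_base * math.exp(rate * years_ahead)


# Hypothetical district: 80,000 people in a 1981 census and 100,000 in 1991.
r = intercensal_growth_rate(80_000, 100_000, 1991 - 1981)
p1995 = project_population(100_000, r, 1995 - 1991)

# The same function projects backward, e.g. from 1991 to the 1981 census year.
p1981 = project_population(100_000, r, 1981 - 1991)
```

A negative `years_ahead` gives a backward projection, which is how earlier decades could be derived from a later base year under the same constant-rate assumption.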
The population estimates are at best rough
estimates, which should be interpreted within wide confidence margins. In general,
we can expect the reliability of the estimates to decrease with the age of the
census on which they are based; that is, the confidence
intervals around the point estimates become wider over time. The
estimates for countries where data were available only for the early 1980s
should be regarded as best guesses.
The figures included in the database are directly
taken from the estimation and thus show more significant digits than is
justified by their accuracy. During data manipulation and processing one should
preserve all significant digits, but for presentation purposes, the figures
should be rounded to reflect the uncertainty of the data. Even the use of
population numbers to the nearest thousand in the above table is clearly
optimistic.
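One way to follow this advice is to keep full precision during processing and round only when figures are presented. The sketch below rounds to a chosen number of significant digits; the function name and the choice of two or three digits are assumptions for illustration, since no specific rounding rule is prescribed here.

```python
import math


def round_significant(value, digits=2):
    """Round a figure to `digits` significant digits for presentation."""
    if value == 0:
        return 0
    exponent = math.floor(math.log10(abs(value)))
    factor = 10 ** (exponent - digits + 1)
    return round(value / factor) * factor


# Hypothetical projected figures, kept at full precision while processing
# and rounded only for display.
district_display = round_significant(109_336.2, digits=2)   # 110000
country_display = round_significant(1_148_400, digits=3)    # 1150000
```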
Given the limited amount and quality of the
base population data, we checked the resulting total national population
figures against standard benchmarks, the regularly published population
estimates produced by ECLAC and data from PRB. In the summary table in the
appendix, our total estimated population is compared with PRB and ECLAC figures
for 1995. Obviously, the ECLAC and PRB data are by themselves associated with a
considerable amount of uncertainty since the estimates are based on conditional
forecasts that make a number of assumptions regarding the most recent and
future fertility, mortality and migration rates. They are also based, for the
most part, on official census figures which sometimes prove to be highly
unreliable. In cases where our estimate was considerably different from the UN
estimate, the intercensal growth rates were adjusted uniformly such that the
resulting estimate was equal to or close to the UN estimate (United Nations
1998). Typically this is the case where the latest available population figures
were very old, or where a country experienced significant reductions in
fertility in recent years that are not sufficiently reflected in the population
dynamics between the last two censuses. The adjustments are indicated in the
specific country documentation below.
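The text does not specify how the growth rates were adjusted uniformly. One possible reading, sketched below, adds the same constant to every unit's growth rate so that the projected national total matches the benchmark. The function name and the figures are hypothetical, and the closed form assumes that all units in the country are projected over the same number of years.

```python
import math


def uniform_rate_shift(base_pops, rates, years_ahead, target_total):
    """Constant to add to every growth rate so that the projected national
    total equals `target_total` (same projection horizon for all units)."""
    if years_ahead <= 0 or target_total <= 0:
        raise ValueError("years_ahead and target_total must be positive")
    projected_total = sum(
        p * math.exp(r * years_ahead) for p, r in zip(base_pops, rates)
    )
    # Adding `shift` to every rate multiplies each projection by
    # exp(shift * years_ahead), so one logarithm solves for it.
    return math.log(target_total / projected_total) / years_ahead


# Hypothetical country with two units, projected five years ahead.
base_pops = [100_000, 200_000]
rates = [0.03, 0.02]
years_ahead = 5
target_total = 330_000  # hypothetical benchmark, below the unadjusted projection

shift = uniform_rate_shift(base_pops, rates, years_ahead, target_total)
adjusted_rates = [r + shift for r in rates]
adjusted_total = sum(
    p * math.exp(r * years_ahead) for p, r in zip(base_pops, adjusted_rates)
)
```

Under this reading, differences in growth between units are preserved while the national total is brought in line with the benchmark.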
The United Nations population projections
were used for all the Caribbean countries without sub-national boundaries,
mainly the smaller islands (United Nations 1998). This group of Caribbean
countries excludes Cuba, Jamaica, Puerto Rico, Trinidad and Tobago, Haiti and
the Dominican Republic. United Nations population figures were also used to
calculate population growth rates in three cases: (1) for Puerto Rico, where we
had no information on population growth rates at sub-national levels; (2) for
Suriname and Guyana, where census figures were available for only one point in
time or the next to last census was too long ago, and where we applied the
average annual national growth rate for a ten-year period centered on the
target date, which resulted in a uniform adjustment of population figures
across these two countries; and (3) for departments in Latin America with
missing population growth rate data in the ECLAC database. This last group
included departments in Argentina (1), Brazil (2), Colombia (5), Cuba (1),
Honduras (1), Mexico (1), Paraguay (4), and Venezuela (1).
Data Quality Estimates
Given our limited knowledge about the accuracy of the input data, it is
impossible to make an objective assessment of data quality. The development of
a qualitative index of boundary and population data quality was considered.
However, such an index would be associated with considerable subjective
judgment. The question "how good are the data?" is incomplete
unless we also ask "for what purpose?" Data that are
clearly inappropriate for high resolution applications at the province or
sub-province level may still be sufficiently accurate for regional or
continental scale applications (the prime motivation for this project), or for
the visualization of spatial patterns in a country. Thus, we only provide some
informal summary measures in the table below, and refer to the individual
country documentation that provides all known details about the lineage of the
data (admittedly, this knowledge is too often very limited). The user can
consider this information to make his or her own decision about whether the
data are appropriate for the specific tasks.
As in previous databases of this nature, we included two useful summary
measures of data resolution in the summary table in the appendix:
Mean resolution in km = sqrt(total_national_area / number_of_units),
i.e., the length of a side of an administrative unit, if all the units were
square.
Mean population per unit = total_national_population /
number_of_units.
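The two summary measures can be computed as in the following sketch. The function names and the example country are hypothetical; areas are assumed to be in square kilometers.

```python
import math


def mean_resolution_km(total_area_km2, number_of_units):
    """Side length in km of an average unit if all units were square."""
    if total_area_km2 <= 0 or number_of_units <= 0:
        raise ValueError("area and number of units must be positive")
    return math.sqrt(total_area_km2 / number_of_units)


def mean_population_per_unit(total_population, number_of_units):
    """Average number of people per administrative unit."""
    if number_of_units <= 0:
        raise ValueError("number of units must be positive")
    return total_population / number_of_units


# Hypothetical country: 1,000,000 sq km, 400 units, 20 million people.
resolution = mean_resolution_km(1_000_000, 400)            # 50.0 km
people_per_unit = mean_population_per_unit(20_000_000, 400)  # 50000.0
```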
These two measures complement each other well. In countries where large
areas are uninhabitable, the mean resolution in km gives a biased impression of
available detail. In such cases, the number of people per unit is a more
meaningful indicator. The following
table shows how these measures of resolution compare for Africa, Asia and Latin
America.
Region | Mean resolution in km | Mean population per administrative unit ('000)
Asia   | 117                   | 1148
Africa | 88                    | 209
LAC    | 26                    | 66
There are 10,666 administrative units with population information in the
data set. Much of the finer mean resolution in kilometers for Latin America
and the Caribbean is due to the high level of detail for Brazil, which accounts
for about one third of all the units in the data set. The population data for
all the countries of Latin America was collected at a finer level of detail. The
lower mean population per unit reflects this finer spatial resolution and, in
comparison with Asia, lower population densities.
Contact: udeichmann@worldbank.org