Household Economy Analysis (HEA) Baseline Database
Baseline Data Collection Guidance

Loading new Baseline Storage Sheets (BSS) into the database requires the import to recognize the data in the various worksheets and allocate the correct standard metadata. Data that cannot be recognized automatically because it uses different terminology or different cell locations to the norm must be reconciled manually, requiring additional effort.

Below are recommendations for data entry in a BSS that make it easy to load into the HEA Baseline Database. Departing from these norms forces manual reconciliation, which takes additional effort and risks introducing data entry errors.

Spatial files

Each BSS should be accompanied by a spatial file that contains the boundaries for the associated Livelihood Zone. This file can be specific to one Livelihood Zone or contain multiple Livelihood Zone boundaries (e.g., boundaries for an entire country) and can be in any geocoded format (shapefile, GeoJSON, etc.). Spatial files are essential for including the BSS information in the Livelihoods Explorer application.

BSS files

File formats

All new BSS should be maintained using current versions of Excel and saved in the .xlsx file format. The .xls format is obsolete and is not supported by many modern tools. The database does not support .csv files.
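
The file format rule above can be checked before a BSS is submitted. The following is a minimal sketch, not part of the database itself; the function name check_bss_file_format and the returned messages are hypothetical.

```python
from pathlib import Path


def check_bss_file_format(filename):
    """Return 'ok' for .xlsx files, otherwise a short reason for rejection."""
    # Compare the extension case-insensitively so 'ML14.XLSX' is accepted.
    suffix = Path(filename).suffix.lower()
    if suffix == ".xlsx":
        return "ok"
    if suffix == ".xls":
        return "obsolete format: re-save as .xlsx"
    if suffix == ".csv":
        return "unsupported format: .csv cannot be loaded"
    return "unexpected format: " + (suffix or "no extension")
```

For example, check_bss_file_format("ML14.xls") reports the obsolete format, prompting a re-save in current Excel.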

Livelihood Zone Codes

The database allows for comparison of a Livelihood Zone across time.

In order to ensure accurate comparison, Livelihood Zones that use the same code should be broadly the same geographically. While small changes are acceptable (see Burkina Faso or Mali, for example), a new code should be used when there is a major change in geography.

Example

A country is split into 40 Livelihood Zones when baselines are done in 2015. When they are redone in 2025, it is only split into 15 Livelihood Zones. The numbering for the 2025 set of Livelihood Zones should be 41-55, not 1-15, since all the boundaries have significantly shifted.
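
The numbering in this example can be worked out as follows. This is an illustrative sketch only; the function name next_zone_numbers is hypothetical.

```python
def next_zone_numbers(highest_existing, new_zone_count):
    """Return the numbers for a new set of zones, continuing after the old set."""
    start = highest_existing + 1
    return list(range(start, start + new_zone_count))


# 40 zones in 2015, 15 redrawn zones in 2025.
numbers_2025 = next_zone_numbers(40, 15)
```

Here numbers_2025 runs from 41 to 55, so none of the 2015 numbers is reused for a zone with different boundaries.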

Note that this has not been the historical practice for Livelihood Zones. Where we hold baseline data for a historic Livelihood Zone whose geography has since changed significantly, a reference year will be appended to the Livelihood Zone code (see the Nigeria baselines, for example).

In some cases, organizations collect baseline data intending to represent areas smaller than a livelihood zone, usually the intersection of a livelihood zone and one or more administrative units. In such cases, the baseline will retain the number of the larger zone and will be appended with a 3-letter geographic identifier. For example, the Niger agropastoral zone crosses the entire country from East to West. No organization has yet assessed the entire zone as a unit with a single BSS. Rather, multiple baselines collected over many years cover parts of the zone representing one or more administrative departments each (i.e., NE04(DTK), NE04(TAN), NE04(NMT), NE04(TAP)).
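
A sketch of how codes such as ML14 and NE04(DTK) might be split into their parts is shown below. The pattern is an assumption inferred from the examples on this page (two-letter country prefix, two-digit zone number, optional 3-letter identifier in parentheses); it does not handle codes with an appended reference year, whose exact format is not described here.

```python
import re

# Assumed pattern, inferred from the examples ML14 and NE04(DTK).
CODE_PATTERN = re.compile(
    r"^(?P<country>[A-Z]{2})(?P<number>\d{2})(?:\((?P<area>[A-Z]{3})\))?$"
)


def parse_zone_code(code):
    """Return the parts of a Livelihood Zone code as a dict, or None if it does not match."""
    match = CODE_PATTERN.match(code.strip())
    if match is None:
        return None
    return match.groupdict()
```

For a code without a geographic identifier, the "area" entry is None.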

Methods

The Methods sheet should contain sufficient information for the BSS to be understood with no need to look up key information in other files.

Important standard fields:

  • Dates of fieldwork

  • Reference year dates and description

  • Currency and exchange rate

We recommend adding entries for:

  • code: The Livelihood Zone Code (e.g. ML14) is normally part of the filename, but the format is not standard and the filename frequently contains dates or other information. Putting the code as a separate field in the Methods tab will make it system-readable.

  • alternate_code: In some countries, there is an alternative Livelihood Zone Code that is used locally. For example, ML11 is referred to within Mali as KOL.

  • name_en: The agreed name of the Livelihood Zone is a critical attribute, and should be available within the BSS. For the BSS to be loaded into the HEA Database the English name (name_en) must be specified even if the BSS was authored in another language. Please follow the naming conventions described in the Practitioners' Guide to HEA, Chapter 2.

  • name_fr, name_es, etc.: There should be separate name entries for each language in which the BSS will be used. Available options are as follows:

    • French: name_fr

    • Spanish: name_es

    • Portuguese: name_pt

    • Arabic: name_ar

  • main_livelihood_category_id: The main Livelihood Category is important for analyses that process data from multiple BSS. This entry should be one of the following:

    • Agricultural

    • Agropastoral

    • Pastoral

    • Irrigation

    • Peri_Urban

    • Urban

    • Fishing

  • description_en, description_fr, etc.: While the Livelihood Zone name must be relatively succinct for display, the description allows for a paragraph that describes the zone. There should be separate description entries for each language in which the BSS will be used. Description language options are the same as livelihood zone name languages.

  • Season definitions: Any Seasons referenced within the BSS should have an accompanying definition. Typically this will require a definition for Season 1 and Season 2, which are used by Milk Production. Additional definitions may be needed for other Seasons referenced in the BSS, such as deyr, belg, post-recolte, etc. Each definition should identify the type of Season (harvest, post-harvest, lean, etc.) and the typical start and end of the season (in months or in number of days from the start of the year). For example, Season 1: Main harvest season; October-December. Some countries have multiple different seasons in different parts of the country or for different types of livelihood. In this case, the season definition must also be associated with the relevant livelihood zones or other geographic boundaries.
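
The recommended Methods entries can be checked with a short script before the BSS is submitted. The following is a minimal sketch under stated assumptions: the Methods sheet has already been read into a dictionary of field name to value, and the function name validate_methods is hypothetical. It is not the database's own validation.

```python
ALLOWED_CATEGORIES = {
    "Agricultural", "Agropastoral", "Pastoral", "Irrigation",
    "Peri_Urban", "Urban", "Fishing",
}
# Language suffixes listed in this guidance.
LANGUAGES = ("en", "fr", "es", "pt", "ar")


def validate_methods(fields):
    """Return a list of problems found in the Methods entries (empty if none)."""
    problems = []
    if not fields.get("name_en"):
        problems.append("name_en is required")
    category = fields.get("main_livelihood_category_id")
    if category is not None and category not in ALLOWED_CATEGORIES:
        problems.append("unrecognized main_livelihood_category_id: %r" % category)
    for key in fields:
        # Split e.g. 'description_fr' into ('description', 'fr').
        prefix, _, language = key.rpartition("_")
        if prefix in ("name", "description") and language not in LANGUAGES:
            problems.append(key + " uses an unsupported language suffix")
    return problems
```

An entry set containing code, name_en, and a listed category returns an empty list; a missing name_en or a misspelled category is reported by name.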

Community Names

The Community Full Name is constructed from the admin unit name (typically labeled District) and the community name (typically labeled Village) in rows 4 and 5 of the WB, Data, Data2, and Data3 worksheets in a BSS. The database uses these names to ensure that the data from the different worksheets can be linked to the correct Community entry in the database. Consequently, the spelling of those names in all places where they appear is critical to successfully recognizing the data.

Ideally, either cell validation or conditional formatting should be used to ensure that the names are accurate, for example, by comparing them to the Form 3 interview results in ‘WB’!C4:L5.
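
The same comparison can be sketched outside Excel. The example below assumes the admin unit and community names from rows 4 and 5 have already been read into lists of pairs; the function names and the ", " separator used to build the full name are assumptions for illustration, not the database's actual format.

```python
def community_full_name(admin_unit, community):
    """Join the two names; the ', ' separator is an assumption for this sketch."""
    return community.strip() + ", " + admin_unit.strip()


def find_unmatched_communities(reference, other):
    """Return full names in `other` that do not appear in `reference`.

    Both arguments are lists of (admin_unit, community) pairs.
    """
    known = {community_full_name(a, c) for a, c in reference}
    return [
        community_full_name(a, c)
        for a, c in other
        if community_full_name(a, c) not in known
    ]


wb_names = [("District A", "Village One"), ("District A", "Village Two")]
data_names = [("District A", "Village One"), ("District A", "Vilage Two")]
unmatched = find_unmatched_communities(wb_names, data_names)
```

Here the misspelled "Vilage Two" is reported, because it would not link to the Community entry created from the WB sheet.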

Livelihood Strategies (WB, Data, Data2, Data3)

Adding rows and columns

One of the benefits of using a database for data collection and analysis is that you do not have to worry about adding rows and columns to your spreadsheet, because doing so cannot break anything.

Therefore, we recommend adding rows and columns where needed in order to keep data tidy in the spreadsheet. For example, if a section only has space for 3 items but you have 5 items to record, you do not need to make the data fit into 3 sets of rows. In fact, forcing the data to fit will make ingestion harder. Instead, add two additional sets of rows to accommodate all of your data.

This also means that you can have rows for both individual and summary data.

Example

Each village in a BSS produces a different vegetable: squash, tomatoes, chilies, etc. Traditionally, they might all fall under one set of rows with the label mixed vegetable production. However, now the best approach is to list each product in a different set of rows. If a summary of mixed vegetable production is still desired, it can be included in its own set of rows containing only the summary data and no community-level data.

IDPs

While it is rare for one BSS to contain data for both IDPs and the local community, it does happen on occasion. In this instance, use a single WB sheet and rename the wealth categories in column B. Likewise, use a single data sheet and rename the wealth categories at the top, usually in row 3. The labels should make it clear which wealth groups are IDPs. For example, a French BSS would use TP for the very poor in the community and TP-IDP for the very poor in the displaced population.

Metadata should only be in Column A

Don’t put information describing the livelihood strategy (metadata) in the per-community columns (column B onwards). For every row, the numbers for each community must refer to the same Livelihood Strategy. If the strategy refers to a particular type of food, then it should be the label in Column A, not in that row in Column B.  

For example, the following data:

Column A                       | Column B | Column C | Column D
WEALTH GROUP                   | V.Poor   | V.Poor   | V.Poor
WILD FOODS                     | termites |          | termites
Wild food type 1 - kg gathered | 2        |          | 10

Should properly be entered as:

Column A               | Column B | Column C | Column D
WEALTH GROUP           | V.Poor   | V.Poor   | V.Poor
WILD FOODS             |          |          |
Termites - kg gathered | 2        |          | 10

Notice that “termites” has been moved from the community cells (columns B and D) to the label in column A.
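
Misplaced metadata of this kind can be spotted by looking for text in cells where numbers are expected. The sketch below assumes the cells from column B onward for one row have been read into a list; the function name is hypothetical, and the simple letter arithmetic only works up to column Z. It should not be applied to the wealth group header row, where text is expected.

```python
def columns_with_text(values):
    """Return the column letters (starting at B) whose cells contain text."""
    flagged = []
    for offset, value in enumerate(values):
        if isinstance(value, str) and value.strip():
            # Offset 0 is column B, offset 1 is column C, and so on.
            flagged.append(chr(ord("B") + offset))
    return flagged


# The WILD FOODS row from the first table above.
misplaced = columns_with_text(["termites", None, "termites"])
```

For the first table this flags columns B and D; for the corrected table the same row yields no flagged columns.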

Column A should explicitly state the expected data values for the row

If data is expected in the row, then the label in column A should explicitly state what data is expected.

Example: Bambara nuts

If the expected data is the number of local units purchased, use bambara nuts: no. local meas.

If the expected data is the amount produced, use bambara nuts: kg produced.

This avoids the ambiguity inherent in labels such as Other Crop Type: bambara nuts, where it is unclear whether the values represent the amount in kg, the number of local measures, or just a label where no additional data is expected.
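
One way to see the difference is to check whether a label ends with a recognizable attribute. This is a sketch only: the set of attributes below is taken from the examples in this guidance and is not the database's full list, and the function name is hypothetical.

```python
# Attributes that appear in the examples on this page; not an exhaustive list.
KNOWN_ATTRIBUTES = ("kg produced", "kg gathered", "no. local meas.")


def label_attribute(label):
    """Return the recognized attribute at the end of a column A label, or None."""
    lowered = label.strip().lower()
    for attribute in KNOWN_ATTRIBUTES:
        if lowered.endswith(attribute):
            return attribute
    return None
```

The label bambara nuts: kg produced yields "kg produced", while Other Crop Type: bambara nuts yields None, which is the ambiguous case described above.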

Use zero purposefully

Only use zero when there is an opportunity for earning money, kcal, etc. If there is no opportunity, leave the cell blank. For example, labor should be left blank for kcal unless the field contains kcals from payment in kind. In contrast, you must enter the carcass weight for animals used for meat, even if the number of animals is zero.

There is one exception to this recommendation. In general, if the only values entered from any community for an activity are zero, it is better to leave all of the community values blank for that activity so that the system does not create a product for which the only data is zero.

Example

If all of the communities in a BSS would report 0 for Other crops - kg produced, it would be preferable to leave those entries null (blank) rather than create the baseline entry for Other crops - kg produced with only entries of 0 kg.
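
The exception can be expressed as a small rule applied to one row of community values. This is a minimal sketch, assuming blank cells are represented as None; the function name is hypothetical.

```python
def blank_all_zero_row(values):
    """Return the row with every value blanked if the only entries are zeros."""
    entered = [v for v in values if v is not None]
    if entered and all(v == 0 for v in entered):
        return [None] * len(values)
    # At least one non-zero entry: keep the zeros, they are meaningful.
    return list(values)
```

A row such as [0, None, 0] becomes entirely blank, while [0, 5, None] is left unchanged because the zero sits alongside a real value.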

Column A should use standardized product/crop names

The labels used in column A of the WB and Data sheets should use standardized and specific names and/or product codes where possible. The Livelihoods database uses a set of product codes based on CPC v2.1. Using a name and/or code from this list ensures that the correct item is associated with that data in the database.

Example: Cocoyam

Cocoyam can refer to a number of items including taro and yautia. Instead of using cocoyam in column A, use a more specific identifier such as 01550 Taro or 01591 Yautia.
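
A lookup of ambiguous names can help catch such labels during data entry. The sketch below contains only the cocoyam example from this page; the dictionary and function name are hypothetical, and a real list would need to be built from the product codes used by the database.

```python
# Ambiguous common names mapped to more specific labels (example from this page only).
SPECIFIC_ALTERNATIVES = {
    "cocoyam": ["01550 Taro", "01591 Yautia"],
}


def suggest_specific_labels(name):
    """Return more specific labels for an ambiguous name, or an empty list."""
    return SPECIFIC_ALTERNATIVES.get(name.strip().lower(), [])
```

Calling suggest_specific_labels("Cocoyam") returns both candidates, leaving the choice to the person who knows which crop was recorded.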

Links to other cells should be removed. They are not a good substitute for specific names as they are not readable by the database and can cause errors.

Example: Main cereal

In the BSS pictured below, column A is labeled as Main cereal - Belg: kg produced and is linked to another cell in the spreadsheet. This should instead use a specific product name/code and attribute, such as Maize:kg produced.

[Screenshot of a BSS in which the column A label Main cereal - Belg: kg produced is linked to another cell]

Green consumption

Green consumption should be reported in months, not weeks. The HEA field guide counts 4.3 weeks in a month, so 1 week = 1/4.3 months, or approximately 0.23 months.
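
The conversion is a single division. A minimal sketch, with a hypothetical function name:

```python
# Weeks per month, as given by the HEA field guide.
WEEKS_PER_MONTH = 4.3


def weeks_to_months(weeks):
    """Convert a duration in weeks to months using 4.3 weeks per month."""
    return weeks / WEEKS_PER_MONTH


one_week = round(weeks_to_months(1), 2)
```

One week rounds to 0.23 months, and 4.3 weeks is exactly one month.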

Add rows to the green consumption section if necessary to capture green consumption of each crop individually. In keeping with the spirit of CPC v2.1, crops consumed green will use a different product code than the main crop. For example, regular maize is R01122, but green maize is R01290HA.

Data should not be stored in label rows

Do not put data values, including 0, in columns B onward on rows where the value in column A is a heading or other label not associated with expected data. The BSS ingestion will refuse to load a BSS containing data in unexpected locations to ensure that no important data is lost.

This sometimes happens when the previous Livelihood Activity contained a 0 entry for many rows and they have been copied down one row too far, into the heading for the next section of the BSS.
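
A check for this mistake can be sketched as follows. It assumes each worksheet row has been read into a non-empty list whose first item is the column A label, and that the set of heading labels is supplied by the caller; the function name and the CROP PRODUCTION heading are hypothetical.

```python
def label_rows_with_data(rows, heading_labels):
    """Return the 1-based numbers of heading rows that contain values in column B onward."""
    offending = []
    for row_number, row in enumerate(rows, start=1):
        label, values = row[0], row[1:]
        # A zero counts as data here, so only None and empty strings are blank.
        has_data = any(v is not None and v != "" for v in values)
        if label in heading_labels and has_data:
            offending.append(row_number)
    return offending


rows = [
    ["WILD FOODS", 0, 0, None],
    ["Termites - kg gathered", 2, None, 10],
    ["CROP PRODUCTION", None, None, None],
]
flagged = label_rows_with_data(rows, {"WILD FOODS", "CROP PRODUCTION"})
```

Only the first row is flagged: its zeros were most likely copied one row too far from the preceding section.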