Skip to main content
Skip table of contents

Metadata Reconciliation

The BSS spreadsheets contain many similar items that use slightly different terminology. We want to eliminate these duplicates as we load the Baseline Storage Sheets (BSS), to ensure that they are comparable across countries and regions.

Metadata Reconciliation Tables

To enable this we have two additional metadata items: Wealth Characteristic Label and Activity Label. These items are used to translate the items in Column A of the ‘WB’ and ‘Data’ worksheets respectively, converting them into specific deduplicated items across many different metadata types.

These two tables are used by the data ingestion pipelines to recognize the data in the BSS and convert it to a standard form. Completing the review of these metadata items is critical to being able to automatically recognize the data in the BSS and automate the ingestion of it.

These metadata items contain some standard columns:

  • label: the exact text from Column A of the “WB” or “Data” worksheet

  • langs: the primary language(s) detected for the BSS where the label is used 

  • datapoint_count_sum: The number of data points (i.e. non-blank cells) that use that label

  • unique_filename_count: The number of BSS that contain data with this label

  • min_row_number: The smallest row number in any BSS that contains this label 

  • max_row_number: The largest row number in any BSS that contains this label

  • filename_for_min_row: The name of the BSS where the label has the smallest row number

  • filename_for_max_row: The name of the BSS where the label has the largest row number

  • status: Indicates whether the label has been reviewed and the correct lookups have been assigned and the record is ready to be used. 

The Wealth Characteristic Label worksheet contains the following additional columns:

  • wealth_characteristic_id: the name of the standard WealthGroupCharacteristic that this label represents. For example, Donkey number owned matches the number owned WealthGroupCharacteristic, and Land area cultivated (hectares) matches the land area cultivated  WealthGroupCharacteristic.

  • product_name: the common name of the standard Product that this label represents, if appropriate. For example, Donkey number owned matches the L02132: Donkeys Product. 

  • unit_of_measure_id: the name of the standard Unit Of Measure that this label represents. For example, Land area cultivated (hectares) matches the ha: Hectare Unit Of Measure.

The Activity Label worksheet contains the following additional columns:

  • is_start: Indicates whether this label indicates the start of a new Livelihood Strategy. In the BSS these cells often have a light green background. For example Sorghum Deyr: kg produced or any other crop name followed by kg produced indicates the start of a new Activity.

  • strategy_type: the name of the Livelihood Strategy subtype, such as MilkProduction, or OtherCashIncome.

  • attribute: the standard name for this attribute that matches the name of the field in the data model for the Livelihood Activity that will store the data from this row. For example Sorghum Deyr: kg produced maps to the quantity_produced field.

  • product_name: the common name of the standard Product that this label represents, if appropriate. For example, Donkey number owned matches the L02132: Donkeys Product. 

  • unit_of_measure_id: the name of the standard Unit Of Measure that this label represents. For example, Land area cultivated (hectares) matches the ha: Hectare Unit Of Measure.

  • currency_id: the iso4217a3 code for the currency if one is specified in the label. This field is not often used.

  • season: the name or alias of the season, if one is specified in the label.

  • additional_identifier: an additional identifier required to distinguish between Livelihood Strategies. For example, there may be two Strategies for growing maize, with labels Maize rainfed: kg produced and maize irrigated: kg produced. In this case the additional identifiers will be rainfed and irrigated, the Strategy Type will be CropProduction , the Unit Of Measure will be kg: Kilogram and the product will be R01122: Maize/corn grain

Metadata Reconciliation Process

  • Find the row in the ActivityLabel or WealthCharacteristicLabel worksheet in the Reference Data spreadsheet.

  • Open the files listed in the  filename_for_min_row and filename_for_max_row columns so that you can see the label in context to decide whether it is the start of a new Livelihood Activity, etc.

  • Fill in the attributes in the Reference Data spreadheet. Most attributes are constrained by validation to selecting from a pre-approved list of values. When you find the need for a Product, for example, that isn’t already available then reach out to the Hub and they add the name to the list of available products after checking the appropriate code, etc.  

  • Mark the metadata row as Complete.

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.