This is already solid โ letโs turn it into something a dev can execute step-by-step without ambiguity, and make a few key improvements (performance, correctness, extensibility).
๐ PRD 2 โ Results CSV (results.csv)¶
๐ 0. Implementation Plan (READ FIRST)¶
Follow these steps in order:
Step 1 โ Ensure mapping table is correct¶
- Table:
emission_type_mapping - Must contain:
id,namecategory_id,subcategory_idcategory_name,subcategory_name
๐ If not populated, run:
UPDATE emission_type_mapping
SET
category_id = (id / 10000) * 10000,
subcategory_id = CASE
WHEN id % 10000 >= 100 THEN (id / 100) * 100
ELSE NULL
END;
Step 2 โ Validate JSON structure¶
Ensure carbon_reports.stats contains:
{
"scope1": ...,
"scope2": ...,
"scope3": ...,
"total": ...,
"by_emission_type": {
"10000": 123.4,
"20000": 456.7
}
}
๐ If by_emission_type is missing โ STOP (data issue)
Step 3 โ Run base query (below)¶
๐ This generates the dataset
Step 4 โ Export¶
In pgAdmin:
- Run query
- Right-click โ Save as CSV
- File name:
results.csv
Step 5 (optional, recommended)¶
Create a materialized view for performance:
CREATE MATERIALIZED VIEW carbon_report_results AS
<query below>;
๐ฏ 1. Objective¶
Provide a dataset of emissions per report ร emission_type, enriched with hierarchy:
- category
- subcategory
- emission type
Used for:
- analytics
- dashboards
- exports
๐ฆ 2. Scope¶
- One row per (report_id ร emission_type_id)
- Source:
carbon_reports.stats - Enrichment:
emission_type_mapping - No recomputation from raw emissions
๐งฉ 3. Data Sources¶
| Table | Role |
|---|---|
carbon_reports | aggregated emissions (JSON) |
emission_type_mapping | hierarchy + labels |
๐ง 4. Data Model¶
Granularity¶
1 row = 1 report ร 1 emission_type
Columns¶
| Column | Type | Description |
|---|---|---|
| report_id | integer | report identifier |
| year | integer | reporting year |
| unit_id | integer | unit identifier |
| category_id | integer | derived category |
| subcategory_id | integer | derived subcategory |
| category_name | text | category label |
| subcategory_name | text | subcategory label |
| emission_type_id | integer | emission type |
| emission_type_name | text | emission label |
| kg_co2eq | float | emissions |
๐งฎ 5. SQL Specification (FINAL)¶
โ Recommended version (robust + explicit)¶
SELECT
cr.id AS report_id,
cr.year,
cr.unit_id,
etm.category_id,
etm.subcategory_id,
etm.category_name,
etm.subcategory_name,
etm.id AS emission_type_id,
etm.name AS emission_type_name,
(kv.value)::float AS kg_co2eq
FROM carbon_reports cr
-- ๐ฅ explode JSON (safer + faster than ? operator)
CROSS JOIN LATERAL jsonb_each(cr.stats->'by_emission_type') kv(key, value)
-- join mapping
JOIN emission_type_mapping etm
ON etm.id = kv.key::int
ORDER BY
cr.id,
etm.category_id,
etm.subcategory_id,
etm.id;
โ ๏ธ Why this version is better than original¶
| Change | Reason |
|---|---|
jsonb_each instead of ? | avoids repeated JSON lookup |
CROSS JOIN LATERAL | explicit row expansion |
direct join on kv.key | faster + cleaner |
๐ 6. Output¶
- File:
results.csv - Format: CSV
- Sorted by:
- report
- category
- subcategory
- emission type
โ๏ธ 7. Constraints¶
| Constraint | Behavior |
|---|---|
| Missing emission type | Not included |
| Missing mapping | Row dropped (JOIN) |
| JSON malformed | Query fails |
| No aggregation | Raw report-level values only |
โก 8. Performance Notes¶
- Query is O(number of emission types ร reports)
- Typically fast (JSON already aggregated)
- Add index:
CREATE INDEX idx_carbon_reports_stats_gin
ON carbon_reports USING GIN (stats);
๐ 9. Optional Improvements¶
Add scopes¶
(cr.stats->>'scope1')::float AS scope1,
(cr.stats->>'scope2')::float AS scope2,
(cr.stats->>'scope3')::float AS scope3,
(cr.stats->>'total')::float AS total,
Add institutional_id¶
JOIN units u ON u.id = cr.unit_id
Zero-fill missing emission types¶
๐ requires LEFT JOIN + full mapping scan (not current design)
Materialized view (recommended)¶
CREATE MATERIALIZED VIEW carbon_report_results AS
SELECT ...
Refresh:
REFRESH MATERIALIZED VIEW carbon_report_results;
๐ง 10. Design Decisions¶
| Decision | Reason |
|---|---|
| Use JSON stats | already aggregated |
| Use mapping table | decouples enum |
| No pivot | keeps flexibility |
| LATERAL join | proper JSON handling |
| No zero-fill | smaller dataset |
๐ฌ TL;DR for the dev¶
- Ensure
emission_type_mappingis correct - Use
jsonb_eachto explodestats.by_emission_type - Join mapping
- Export
- (Optional) materialize
Example not completed (missing category/subcategory depends on what we want to display in the csv) (can import direct in DB)
"id" "name" "category_id" "subcategory_id" "category_name" "subcategory_name" 10000 "food" 10000 10000 10001 "foodvegetarian" 10000 10000 10002 "foodnon_vegetarian" 10000 10000 20000 "waste" 20000 20000 20001 "wasteincineration" 20000 20000 20002 "wastecomposting" 20000 20000 20003 "wastebiogas" 20000 20000 20004 "wasterecycling" 20000 20000 30000 "commuting" 30000 30000 30001 "commutingwalking" 30000 30000 30002 "commutingcycling" 30000 30000 30003 "commutingpowered_two_wheeler" 30000 30000 30004 "commutingpublic_transport" 30000 30000 30005 "commutingcar" 30000 30000 50000 "professional_travel" 50000 50000 50100 "professional_traveltrain" 50000 50100 50101 "professional_traveltrainclass_1" 50000 50100 50102 "professional_traveltrainclass_2" 50000 50100 50200 "professional_travelplane" 50000 50200 50201 "professional_travelplanefirst" 50000 50200 50202 "professional_travelplanebusiness" 50000 50200 50203 "professional_travelplaneeco" 50000 50200 60000 "buildings" 60000 60000 60100 "buildingsrooms" 60000 60100 60101 "buildingsroomslighting" 60000 60100 60102 "buildingsroomscooling" 60000 60100 60103 "buildingsroomsventilation" 60000 60100 60104 "buildingsroomsheating_elec" 60000 60100 60105 "buildingsroomsheating_thermal" 60000 60100 60200 "buildingscombustion" 60000 60200 60201 "buildingscombustionnatural_gas" 60000 60200 60202 "buildingscombustionheating_oil" 60000 60200 60203 "buildingscombustionbiomethane" 60000 60200 60204 "buildingscombustionpellets" 60000 60200 60205 "buildingscombustionforest_chips" 60000 60200 60206 "buildingscombustionwood_logs" 60000 60200 60300 "buildingsembodied_energy" 60000 60300 70000 "process_emissions" 70000 70000 70100 "process_emissionsch4" 70000 70100 70200 "process_emissionsco2" 70000 70200 70300 "process_emissionsn2o" 70000 70300 70400 "process_emissionsrefrigerants" 70000 70400 80000 "equipment" 80000 80000 80100 "equipmentscientific" 80000 80100 80200 "equipmentit" 80000 80200 80300 "equipmentother" 80000 80300 90000 "purchases" 90000 90000 90100 "purchasesgoods_and_services" 90000 90100 90200 "purchasesscientific_equipment" 90000 90200 90300 "purchasesit_equipment" 90000 90300 90400 "purchasesconsumable_accessories" 90000 90400 90500 "purchasesbiological_chemical_gaseous" 90000 90500 90600 "purchasesservices" 90000 90600 90700 "purchasesvehicles" 90000 90700 90800 "purchasesother" 90000 90800 90900 "purchasesadditional" 90000 90900 90901 "purchasesadditionalln2" 90000 90900 100000 "research_facilities" 100000 100000 100100 "research_facilitiesfacilities" 100000 100100 100200 "research_facilitiesanimal" 100000 100200 110000 "external" 110000 110000 110100 "externalclouds" 110000 110100 110101 "externalcloudsvirtualisation" 110000 110100 110102 "externalcloudscalcul" 110000 110100 110103 "externalcloudsstockage" 110000 110100 110200 "externalai" 110000 110200 110201 "externalaiprovider_google" 110000 110200 110202 "externalaiprovider_mistral_ai" 110000 110200 110203 "externalaiprovider_anthropic" 110000 110200 110204 "externalaiprovider_openai" 110000 110200 110205 "externalaiprovider_cohere" 110000 110200 110206 "externalaiprovider_others" 110000 110200 2000301 "wastebiogasorganic_waste_food_leftovers" 2000000 2000300 2000302 "wastebiogascooking_vegetable_oil" 2000000 2000300 2000401 "wasterecyclingpaper" 2000000 2000400 2000402 "wasterecyclingcardboard" 2000000 2000400 2000403 "wasterecyclingplastics" 2000000 2000400 2000404 "wasterecyclingglass" 2000000 2000400 2000405 "wasterecyclingferrous_metals" 2000000 2000400 2000406 "wasterecyclingnon_ferrous_metals" 2000000 2000400 2000407 "wasterecyclingelectronics" 2000000 2000400 2000408 "wasterecyclingwood" 2000000 2000400 2000409 "wasterecyclingpet" 2000000 2000400 2000410 "wasterecyclingaluminum" 2000000 2000400 2000411 "wasterecyclingtextile" 2000000 2000400 2000412 "wasterecyclingtoner_and_ink_cartridges" 2000000 2000400 2000413 "wasterecyclinginert_waste" 2000000 2000400 6010101 "buildingsroomslightingoffice" 6010000 6010100 6010102 "buildingsroomslightinglaboratories" 6010000 6010100 6010103 "buildingsroomslightingarchives" 6010000 6010100 6010104 "buildingsroomslightinglibraries" 6010000 6010100 6010105 "buildingsroomslightingauditoriums" 6010000 6010100 6010106 "buildingsroomslightingmiscellaneous" 6010000 6010100 6010201 "buildingsroomscoolingoffice" 6010000 6010200 6010202 "buildingsroomscoolinglaboratories" 6010000 6010200 6010203 "buildingsroomscoolingarchives" 6010000 6010200 6010204 "buildingsroomscoolinglibraries" 6010000 6010200 6010205 "buildingsroomscoolingauditoriums" 6010000 6010200 6010206 "buildingsroomscoolingmiscellaneous" 6010000 6010200 6010301 "buildingsroomsventilationoffice" 6010000 6010300 6010302 "buildingsroomsventilationlaboratories" 6010000 6010300 6010303 "buildingsroomsventilationarchives" 6010000 6010300 6010304 "buildingsroomsventilationlibraries" 6010000 6010300 6010305 "buildingsroomsventilationauditoriums" 6010000 6010300 6010306 "buildingsroomsventilationmiscellaneous" 6010000 6010300 6010401 "buildingsroomsheating_elecoffice" 6010000 6010400 6010402 "buildingsroomsheating_eleclaboratories" 6010000 6010400 6010403 "buildingsroomsheating_elecarchives" 6010000 6010400 6010404 "buildingsroomsheating_eleclibraries" 6010000 6010400 6010405 "buildingsroomsheating_elecauditoriums" 6010000 6010400 6010406 "buildingsroomsheating_elecmiscellaneous" 6010000 6010400 6010501 "buildingsroomsheating_thermaloffice" 6010000 6010500 6010502 "buildingsroomsheating_thermallaboratories" 6010000 6010500 6010503 "buildingsroomsheating_thermalarchives" 6010000 6010500 6010504 "buildingsroomsheating_thermallibraries" 6010000 6010500 6010505 "buildingsroomsheating_thermalauditoriums" 6010000 6010500 6010506 "buildingsroomsheating_thermalmiscellaneous" 6010000 6010500