
DATA 503: Fundamentals of Data Engineering
April 6, 2026
crosstab() pivots from tablefunc and know when a table panel in Grafana is the right visualizationpg_restore using the public database URLgrafana database for persistence and connect it to pipeline data with a read-only roleThis ties ingestion and modeling work to stakeholder-facing delivery: live, shareable visuals over the same database your pipeline maintains.
Dashboards are the presentation layer. They turn vetted SQL into charts and tables other people can read without learning JOIN syntax.

Grafana does not replace your ETL jobs. It reads from PostgreSQL (ideally through views and restricted users) and renders results on a schedule or on demand.
Grafana is an open-source observability and dashboarding platform. It started in metrics and logs, but its PostgreSQL data source is mature: you write SQL, Grafana maps columns to charts.

A view is a named SELECT. It behaves like a read-only table for clients.
SELECT on views to a read-only role without exposing base tablesViews do not magically cache results unless you use a materialized view. For class-scale data, ordinary views are fine.
After you restore railway_dump.dump, you should see the census, business, temperature, and survey tables from the advanced-queries style curriculum, plus the views below (names may match exactly).
| Table | Approximate role |
|---|---|
us_counties_pop_est_2019 |
County population and components of change |
cbp_naics_72_establishments |
Food and accommodation establishments by area |
temperature_readings |
Daily readings for three stations |
ice_cream_survey |
Long-format survey responses |
v_state_population_summaryState-level totals and natural increase. Good for sorted bar charts or tables.
CREATE OR REPLACE VIEW v_state_population_summary AS
SELECT
state_name,
count(*) AS num_counties,
sum(pop_est_2019) AS total_pop,
round(avg(pop_est_2019), 0) AS avg_county_pop,
sum(births_2019) AS total_births,
sum(deaths_2019) AS total_deaths,
sum(births_2019) - sum(deaths_2019) AS natural_increase
FROM us_counties_pop_est_2019
GROUP BY state_name
ORDER BY total_pop DESC;v_state_migration_ratesRates per thousand and a direction label. Good for horizontal bar charts and color by category.
CREATE OR REPLACE VIEW v_state_migration_rates AS
WITH state_migration AS (
SELECT
state_name,
sum(international_migr_2019 + domestic_migr_2019) AS net_migration,
sum(pop_est_2018) AS base_pop
FROM us_counties_pop_est_2019
GROUP BY state_name
)
SELECT
state_name,
base_pop,
net_migration,
round((net_migration / base_pop::numeric) * 1000, 2) AS migration_rate_per_thousand,
CASE
WHEN net_migration > 0 THEN 'Gaining'
WHEN net_migration < 0 THEN 'Losing'
ELSE 'Stable'
END AS migration_direction
FROM state_migration
ORDER BY migration_rate_per_thousand DESC;v_county_growth_summaryAggregates counties by growth category. Natural pie or stat panels.
CREATE OR REPLACE VIEW v_county_growth_summary AS
WITH county_growth AS (
SELECT
county_name,
state_name,
pop_est_2019,
CASE
WHEN pop_est_2019 > pop_est_2018 THEN 'Growing'
WHEN pop_est_2019 < pop_est_2018 THEN 'Shrinking'
ELSE 'Stable'
END AS growth_category
FROM us_counties_pop_est_2019
)
SELECT
growth_category,
count(*) AS num_counties,
sum(pop_est_2019) AS total_pop,
round(avg(pop_est_2019), 0) AS avg_county_pop,
round(count(*)::numeric / (SELECT count(*) FROM us_counties_pop_est_2019) * 100, 1)
AS pct_of_counties
FROM county_growth
GROUP BY growth_category
ORDER BY num_counties DESC;v_state_estabs_per_capitaEstablishments per thousand residents by state. Good ranked bar chart.
CREATE OR REPLACE VIEW v_state_estabs_per_capita AS
WITH estabs AS (
SELECT st, sum(establishments) AS establishment_count
FROM cbp_naics_72_establishments
GROUP BY st
),
pop AS (
SELECT state_name, sum(pop_est_2018) AS total_pop
FROM us_counties_pop_est_2019
GROUP BY state_name
)
SELECT
pop.state_name,
pop.total_pop,
estabs.establishment_count,
round((estabs.establishment_count / pop.total_pop::numeric) * 1000, 1)
AS estabs_per_thousand
FROM estabs
JOIN pop ON estabs.st = pop.state_name
ORDER BY estabs_per_thousand DESC;v_seasonal_temperaturesSeason labels for each station. Ideal grouped bar chart.
CREATE OR REPLACE VIEW v_seasonal_temperatures AS
WITH labeled_readings AS (
SELECT
station_name,
CASE
WHEN date_part('month', observation_date) IN (12, 1, 2) THEN 'Winter'
WHEN date_part('month', observation_date) IN (3, 4, 5) THEN 'Spring'
WHEN date_part('month', observation_date) IN (6, 7, 8) THEN 'Summer'
WHEN date_part('month', observation_date) IN (9, 10, 11) THEN 'Fall'
END AS season,
max_temp,
min_temp
FROM temperature_readings
)
SELECT
station_name,
season,
round(avg(max_temp), 1) AS avg_max_temp,
round(avg(min_temp), 1) AS avg_min_temp,
count(*) AS num_readings
FROM labeled_readings
GROUP BY station_name, season
ORDER BY station_name,
CASE season
WHEN 'Winter' THEN 1
WHEN 'Spring' THEN 2
WHEN 'Summer' THEN 3
WHEN 'Fall' THEN 4
END;v_population_tiersCounty tiers by size. Strong population share story in one table.
CREATE OR REPLACE VIEW v_population_tiers AS
WITH tiers AS (
SELECT
county_name,
state_name,
pop_est_2019,
CASE
WHEN pop_est_2019 >= 500000 THEN 'Metro'
WHEN pop_est_2019 >= 100000 THEN 'Urban'
WHEN pop_est_2019 >= 50000 THEN 'Suburban'
ELSE 'Rural'
END AS tier
FROM us_counties_pop_est_2019
)
SELECT
tier,
count(*) AS num_counties,
sum(pop_est_2019) AS total_pop,
round(avg(pop_est_2019), 0) AS avg_county_pop,
round(sum(pop_est_2019)::numeric /
(SELECT sum(pop_est_2019) FROM us_counties_pop_est_2019) * 100, 1)
AS pct_of_total_pop
FROM tiers
GROUP BY tier
ORDER BY
CASE tier
WHEN 'Metro' THEN 1
WHEN 'Urban' THEN 2
WHEN 'Suburban' THEN 3
WHEN 'Rural' THEN 4
END;Why define views instead of pasting huge queries into every panel?
B. Non-materialized views still run fresh SQL. The win is naming, reuse, and security boundaries.
tablefunc and Long-to-Wide DataReporting often needs wide tables: one row per entity, many metric columns. Raw surveys and events are usually long: one row per observation.
Long format: one row per person and flavor choice. Wide format: offices on rows, flavors as columns.
Grafana note: This returns a table shape. Use a Table panel, not a bar chart, unless you unpivot again in SQL.

SELECT *
FROM crosstab(
'SELECT station_name,
date_part(''month'', observation_date),
percentile_cont(.5)
WITHIN GROUP (ORDER BY max_temp)
FROM temperature_readings
GROUP BY station_name,
date_part(''month'', observation_date)
ORDER BY station_name',
'SELECT month FROM generate_series(1,12) month'
)
AS (station text,
jan numeric(3,0), feb numeric(3,0), mar numeric(3,0),
apr numeric(3,0), may numeric(3,0), jun numeric(3,0),
jul numeric(3,0), aug numeric(3,0), sep numeric(3,0),
oct numeric(3,0), nov numeric(3,0), dec numeric(3,0));crosstab() ShapeThe source query passed to crosstab() must return how many columns?
B. Row id, category, value. The second query lists category labels for column order.
DATABASE_PUBLIC_URL, or shown under Public network / TCP Proxy on the Connect tab). You will use this same URL style for Beekeeper, pgAdmin, psql, and pg_restore on your laptop.Important: Anything you run on your own computer must use that public URL. Private hostnames such as *.railway.internal only resolve inside Railway’s network and will fail from your laptop.
From Canvas, download railway_dump.dump. This file is a custom-format archive intended for PostgreSQL 18 and pg_restore.
If you cannot install client tools, use the optional plain railway_dump.sql from Canvas instead (run it in Beekeeper or pgAdmin, or with psql -f).
pg_restoreIn a terminal on the machine that has the file:
--no-owner --no-acl avoid errors about roles and permissions that do not exist on Railway.railway; use whatever Railway shows.Connect with Beekeeper Studio or pgAdmin using the public URL (paste the full URL or fill host, port, user, password, and database from it). Confirm:
us_counties_pop_est_2019, ice_cream_survey, temperature_readingsv_state_population_summarytablefunc if the dump included it (needed for crosstab() examples)If pg_restore is not available:
railway_dump.sql from Canvas.psql "$DATABASE_PUBLIC_URL" -f railway_dump.sql).grafana DatabaseIn Beekeeper or pgAdmin, connected to the same Postgres server:
Grafana will store dashboards here. Skipping this step usually means lost dashboards after redeploys.
GF_SERVER_ROOT_URL to that full https://... value after the first deploy if the template asks for it.Admin login
| Key | Value |
|---|---|
GF_SECURITY_ADMIN_USER |
admin |
GF_SECURITY_ADMIN_PASSWORD |
strong password you keep |
Grafana config database
| Key | Value |
|---|---|
GF_DATABASE_TYPE |
postgres |
GF_DATABASE_HOST |
host:port from the same public Postgres URL you use in Beekeeper (not a private internal hostname) |
GF_DATABASE_NAME |
grafana |
GF_DATABASE_USER |
postgres (unless Railway uses another superuser name) |
GF_DATABASE_PASSWORD |
from Postgres service variables |
GF_DATABASE_SSL_MODE |
usually require when using the public endpoint (match Railway’s docs if the test fails) |
Enable Serverless on Grafana if you want the service to sleep when idle.
Still connected to the pipeline database (not grafana):
In Grafana, add the PostgreSQL data source for the pipeline database using the public host and port (same as Beekeeper), database name (often railway), user grafanareader, and TLS/SSL mode require unless Railway’s Connect instructions say otherwise.
Assume you are logged into Grafana as admin and the PostgreSQL data source Save & test succeeded against the restored database.
state_name and value field to total_pop.migration_direction (map Gaining, Losing, Stable to different colors).station_name with season as series or x-axis categories depending on the panel).crosstab query from Part 5 (ice cream example).Talking point: Views feed most charts; pivots often stay as tables unless you reshape data further for chart types.
Replicate the follow-along: Postgres 18, restore, Grafana with grafana database, read-only user, then build a dashboard with at least four panels. Mix views and at least one table (a view or a crosstab query).
Download the lab steps as grafana-lab-handout.pdf or the LaTeX source grafana-lab-handout.tex if you want to edit or recompile locally. Your instructor may mirror the PDF on Canvas.
| Topic | Takeaway |
|---|---|
| Pipeline role | Dashboards visualize vetted data; they are not the transform layer |
| Views | Named, reusable, permission-friendly interfaces for panels |
| Crosstabs | crosstab() builds wide tables; great for Table panels |
| Restore | pg_restore --no-owner --no-acl -d "$DATABASE_PUBLIC_URL" file.dump |
| Grafana persistence | Separate grafana database for config |
| Security | Read-only DB user for Grafana queries |