How it works
Env. config
- devcontainer (python + node)
- meltano then handles all the python environment stuff
Extract
- singer taps / meltano extractors
- using spreadsheets anywhere tap because it can grab data from basically anywhere
- leveraging meltano mappers to enhance data with timestamps (for later)
- invoked with
meltano run tap-spreadsheets-anywhere mapper-timestamps target-parquet
Load
- using singer target / meltano loaders
- using parquet target as for openness & portability
- considered target-duckdb but ran into a few issues
Transform
- using dbt-duckdb + external tables
- data can be consumed post transformation from either duckdb file or from the output parquet files
- in all other ways, it is a normal dbt-core project
- invoked with
meltano invoke dbt-duckdb build
Analyze
- using evidence.dev
- can handle some final transforms as well, queries are staged and pages are built out in markdown
- because evidence doesn't support pathing, have to copy files into the evidence directory
- invoked with
npm run dev
and soon from meltano as well
Other