Hypothesis

2 Matching Annotations

Mar 2023
dagster.io

How We Deploy 5X Faster with Warm Docker Containers | Dagster Blog

1
1. pyxelr 26 Mar 2023
  
  in Public
  
  Using pex in combination with S3 for storing the pex files, we built a system where the fast path avoids the overhead of building and launching Docker images.

  Our system works like this: when you commit code to GitHub, the GitHub action either does a full build or a fast build depending on if your dependencies have changed since the previous deploy. We keep track of the set of dependencies specified in setup.py and requirements.txt.

  For a full build, we build your project dependencies into a deps.pex file and your code into a source.pex file. Both are uploaded to Dagster cloud. For a fast build we only build and upload the source.pex file.

  In Dagster Cloud, we may reuse an existing container or provision a new container as the code server. We download the deps.pex and source.pex files onto this code server and use them to run your code in an isolated environment.
  
  Fast vs full deployments
  
  Docker dagster pex Python
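
The fast-versus-full decision quoted above can be sketched roughly as follows. This is a minimal illustration, not Dagster's actual GitHub action: the function names `dependency_fingerprint` and `choose_build` are hypothetical, and hashing the two files is only one assumed way of "keeping track" of the dependency set.

```python
import hashlib
from pathlib import Path
from typing import List, Optional

# Files whose contents define the dependency set, per the quoted post.
DEPENDENCY_FILES = ("setup.py", "requirements.txt")


def dependency_fingerprint(project_dir: Path) -> str:
    """Hash the dependency files (hypothetical helper, an assumption)."""
    digest = hashlib.sha256()
    for name in DEPENDENCY_FILES:
        path = project_dir / name
        digest.update(name.encode())
        digest.update(b"\0")
        if path.exists():
            # A missing file contributes only its name to the hash.
            digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def choose_build(project_dir: Path, previous: Optional[str]) -> List[str]:
    """Return the pex artifacts to build and upload for this deploy."""
    if previous != dependency_fingerprint(project_dir):
        # Full build: dependencies changed, or there is no previous deploy.
        return ["deps.pex", "source.pex"]
    # Fast build: only the source code needs to be rebuilt.
    return ["source.pex"]
```

A caller would store the fingerprint from the last deploy and pass it as `previous`; passing `None` forces a full build.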

URL

dagster.io/blog/fast-deploys-with-pex-and-docker
Nov 2020
dagster.io

Dagster

1
1. justEvan 13 Nov 2020
  
  in Public
  
  Orchestrating dbt models with other computations

  To see how this works in practice, let’s look at a simple Dagster pipeline that orchestrates dbt models together with other, heterogeneous processes operating on your data. (The full code for this example is available on Github.)

  In this pipeline, we’ll download a raw .csv dataset from the public internet, then load it into a database using Pandas. We’ll run a series of dbt models to transform the data, run our dbt tests on the resulting database artifacts, and produce some plots using the transformed data. Finally, we’ll upload those plots to Slack for visibility.

  While contrived, this example illustrates how dbt is often embedded into a larger context. Before dbt can operate on data, it has to be ingested from somewhere, using tools that lie outside of its purview. And after dbt has transformed data, that data is consumed by downstream users, who may use a wide range of technologies to do their work.

  The dbt solids execute alongside the other components of the pipeline. You can see below that logs emitted by the running dbt models are streamed back to a central view, along with logs produced by solids making use of other technologies.
  
  This kind of workflow, processing a CSV and then running database queries, is not far from some of the work we frequently need to do.
  
  [[dbt]] [[dagster]]
  
  dbt dagster
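
The first stages of the pipeline described above (ingest a CSV, load it into a database, then transform it) might look something like this in miniature. This is a dependency-free sketch under stated assumptions, not the blog's code: it swaps the standard-library `csv` and `sqlite3` modules in for Pandas, uses a plain SQL statement as a stand-in for a dbt model, and does not use the Dagster API. The sample data, table names, and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for the raw .csv downloaded from the internet.
RAW_CSV = "name,calories\nalpha,110\nbeta,70\n"


def load_csv(conn: sqlite3.Connection, text: str) -> int:
    """Ingest step: parse CSV text and load it into a `raw` table."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute("CREATE TABLE raw (name TEXT, calories INTEGER)")
    conn.executemany("INSERT INTO raw VALUES (:name, :calories)", rows)
    return len(rows)


def transform(conn: sqlite3.Connection) -> tuple:
    """Transform step: plain SQL standing in for a dbt model."""
    conn.execute(
        "CREATE TABLE summary AS "
        "SELECT COUNT(*) AS n, AVG(calories) AS avg_calories FROM raw"
    )
    return conn.execute("SELECT n, avg_calories FROM summary").fetchone()


def run_pipeline(text: str = RAW_CSV) -> tuple:
    """Run ingest, then transform, against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        load_csv(conn, text)
        return transform(conn)
    finally:
        conn.close()
```

The point of the sketch is the ordering: the transform step can only run once the ingest step has populated the table, which is the dependency an orchestrator would manage.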

URL

dagster.io/blog/dagster-dbt
