Develop data solutions like a software engineer would

Open-source, code-first, configuration-centric tools to build and operate your data platform.

dataform.yaml
analytics_hub.schema
name: analytics_hub
version: 1.0.0
db:
  type: snowflake
  database: analytics_db
  auth: ${DATASHELL_DWH_CREDENTIALS}
schema:
  exclude: INFORMATION_SCHEMA
model:
  use: dataform

General Documentation

An end-to-end data platform framework

Why DataShell

In the sprawling landscape of data tools, why do we need another one?

Getting started

Step-by-step guides to setting up your system and installing the DataShell CLI.

What is DataShell?

DataShell builds an abstraction and management layer on top of your data platform toolset. It does this by using a monorepo strategy to combine the code and configuration for multiple tools into one source code repository.

DataShell can be seen as a:

  1. Build system. DataShell installs and maintains your data products, handling things like system dependencies and configuration.
  2. Integrator. DataShell ensures that the supported data products play well together, and provides abstractions and meta-commands for actions that span multiple tools.
  3. GitOps operator. DataShell handles Git operations to run integration and validation tests across your platform toolset. It also creates self-contained, ephemeral deployments to inspect and evaluate pull requests.
  4. Environment manager. DataShell allows you to maintain different environments (e.g., dev, prod, integration), automatically detecting schema drift and facilitating migrations and updates.
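
To make "schema drift" concrete, here is a minimal Python sketch of what comparing two environments could look like. It is not DataShell's implementation, which this page does not describe. The `Schema` shape (table name mapped to column name and type), the `DriftReport` class, the `detect_drift` function, and the sample `prod` and `dev` schemas are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory schema shape: {table_name: {column_name: column_type}}
Schema = dict[str, dict[str, str]]


@dataclass
class DriftReport:
    """Differences found in a target schema relative to a reference schema."""
    missing_tables: list[str] = field(default_factory=list)
    extra_tables: list[str] = field(default_factory=list)
    missing_columns: list[tuple[str, str]] = field(default_factory=list)
    extra_columns: list[tuple[str, str]] = field(default_factory=list)
    # (table, column, reference_type, target_type)
    type_changes: list[tuple[str, str, str, str]] = field(default_factory=list)

    @property
    def has_drift(self) -> bool:
        return any([
            self.missing_tables, self.extra_tables,
            self.missing_columns, self.extra_columns,
            self.type_changes,
        ])


def detect_drift(reference: Schema, target: Schema) -> DriftReport:
    """Compare target against reference and collect every difference."""
    report = DriftReport()
    report.missing_tables = sorted(reference.keys() - target.keys())
    report.extra_tables = sorted(target.keys() - reference.keys())

    # Only tables present on both sides can be compared column by column.
    for table in sorted(reference.keys() & target.keys()):
        ref_cols, tgt_cols = reference[table], target[table]
        for col in sorted(ref_cols.keys() - tgt_cols.keys()):
            report.missing_columns.append((table, col))
        for col in sorted(tgt_cols.keys() - ref_cols.keys()):
            report.extra_columns.append((table, col))
        for col in sorted(ref_cols.keys() & tgt_cols.keys()):
            if ref_cols[col] != tgt_cols[col]:
                report.type_changes.append(
                    (table, col, ref_cols[col], tgt_cols[col])
                )
    return report


# Illustrative data only: treat prod as the reference and dev as the target.
prod: Schema = {
    "orders": {"id": "NUMBER", "total": "NUMBER"},
    "customers": {"id": "NUMBER", "email": "VARCHAR"},
}
dev: Schema = {
    "orders": {"id": "NUMBER", "total": "FLOAT", "note": "VARCHAR"},
    "tmp": {"id": "NUMBER"},
}

report = detect_drift(prod, dev)
```

In this sketch, `report.has_drift` is true because `dev` lacks the `customers` table, adds a `tmp` table and an `orders.note` column, and changes the type of `orders.total`. A real tool would read these schemas from the warehouse rather than from hard-coded dictionaries.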

Getting involved

We love to hear from you and want you to help shape the future of DataShell. Below are the ways in which you can get involved 🐔, or committed 🐷.

We use GitHub to track issues, make announcements, and interact with our users. Please check the discussion boards and our project roadmap.

You are also more than welcome to join our Discord server, where you can interact with our team in near real time, ask questions, and more.