Developer Best Practices on Databricks: Git, Tests, and Automated Deployment
Data engineers and data scientists benefit from best practices learned over years of software development. This video walks through three of the most important practices for building quality analytics solutions. It is meant as an overview of what following these practices looks like for a Databricks developer.
This video covers:
- Version control basics and a demo of Git integration with the Databricks workspace
- Automated tests with pytest for unit testing and Databricks Workflows for integration testing
- CI/CD including running tests prior to deployment with GitHub Actions
* All thoughts and opinions are my own, though this video was influenced by Databricks SMEs *
The intro video, which discusses the development process and the full list of best practices, is available here: https://www.youtube.com/watch?v=IWS2AzkTKl0
Blog post for Developer Best Practices on Databricks: https://dustinvannoy.com/2025/01/05/best-practices-for-data-engineers-on-databricks/
More from Dustin:
Website: https://dustinvannoy.com
LinkedIn: https://www.linkedin.com/in/dustinvannoy
GitHub: https://github.com/datakickstart
CHAPTERS
0:00 Intro
0:31 Version Control (Git)
7:57 Unit Tests + Integration Tests
28:00 Automated Deploy
36:35 Outro