Introduced in Spark 4.x, the Python Data Source API lets you create PySpark data sources that leverage long-standing Python libraries to handle unique file types or specialized interfaces through the Spark `read`, `readStream`, `write`, and `writeStream` APIs.
Data Source Name | Purpose |
---|---|
zipdcm | Read DICOM files from Zip file archives |
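
To illustrate the general shape of a Python data source like the one listed above, here is a minimal sketch. It is not the project's `zipdcm` implementation: the class names, the `zipmembers` format name, and the schema are hypothetical, and it only lists `.dcm` members of a Zip archive using the standard library rather than parsing DICOM content. It assumes the archive path arrives as the `path` option. The `ImportError` fallback exists only so the sketch can be exercised without Spark installed.

```python
import zipfile
from typing import Iterator, Tuple

try:
    # Base classes of the Python Data Source API (PySpark 4.x).
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Fallback for trying the reader logic without Spark installed.
    DataSource = DataSourceReader = object


class ZipMemberReader(DataSourceReader):
    """Hypothetical reader: yields one row per .dcm member of a Zip archive."""

    def __init__(self, path: str):
        self.path = path

    def read(self, partition=None) -> Iterator[Tuple[str, str, int]]:
        with zipfile.ZipFile(self.path) as archive:
            for info in archive.infolist():
                # Skip directories and anything that is not a .dcm file.
                if info.is_dir() or not info.filename.lower().endswith(".dcm"):
                    continue
                yield (self.path, info.filename, info.file_size)


class ZipMemberDataSource(DataSource):
    """Hypothetical data source that wires the reader into Spark."""

    def __init__(self, options):
        self.options = options

    @classmethod
    def name(cls) -> str:
        return "zipmembers"  # hypothetical format name, not "zipdcm"

    def schema(self) -> str:
        return "archive string, member string, size_bytes long"

    def reader(self, schema) -> ZipMemberReader:
        # Assumption: the archive path is supplied as the "path" option.
        return ZipMemberReader(self.options["path"])
```

With a Spark 4.x session, a class like this would typically be registered with `spark.dataSource.register(ZipMemberDataSource)` and then read with `spark.read.format("zipmembers").load(...)`. The real `zipdcm` source may expose different options and columns, so check its documentation before relying on any of the names used here.
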
Refer to the python-data-sources documentation for detailed information on how to use the supplied Python data sources, their features, and their configuration options.
Please see our installation guide.
- Clone this project locally with `git clone`
- Use the Databricks CLI to test your changes against a Databricks workspace of your choice
- Contribute to repositories through pull requests (PRs), and always get a second-party review from a capable teammate
© 2025 Databricks, Inc. All rights reserved. The source in this project is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.
Datasource | Package | Purpose | License | Source |
---|---|---|---|---|
zipdcm | pydicom | Python API for DICOM files | MIT | https://github.com/pydicom/pydicom |
zipdcm | pylibjpeg | Decoding / Encoding pixel formats | GPLv3 & MIT | https://github.com/pydicom/pylibjpeg |