Databricks Data Loss Prevention (DLP) Guide | Nightfall Developer Platform

Guide to Cloud Data Loss Prevention (DLP) on Databricks

Learn how to discover, classify, and protect sensitive data in Databricks with Nightfall’s APIs. Improve security & compliance with cloud data loss prevention.
The Challenge

Data Sprawl on Databricks

  • Databricks is a leading cloud-based data lakehouse. Sensitive data such as PII, credentials, and secrets sprawls into Databricks at an alarming rate, and it’s nearly impossible to know what types of data are in Databricks through manual effort alone.
  • Databricks doesn’t have enterprise-grade data protection, DLP, data classification, or content filtering capabilities built in.
  • Current data protection solutions are built for devices and networks, not cloud services like Databricks, so they are hard to implement. They aren’t flexible, accurate, or developer-friendly because they are primarily based on regular expressions and simple heuristics.
  • The result can be productivity loss, a higher risk of data breach, and compliance problems.
The Solution
  • Programmatically get structured results from Nightfall’s detectors for many types of sensitive data like credit card numbers and API keys. Nightfall maintains a growing library of detectors that include personally identifiable information (PII) and other data types as defined by data privacy regulations, such as GDPR, PCI-DSS, and HIPAA.
  • AI-based detectors go well beyond regexes, rules, and search strings, so you can make sense of your data without alert fatigue. These detection techniques improve over time.
  • Customizable detection engine to tailor detectors and detection rules to your needs.
  • Custom-defined data types using regular expressions or word lists, to discover proprietary or unique sensitive data specific to your use cases.
  • Scan a broad range of file types and MIME types, and perform AI-based optical character recognition (OCR) to extract text.
  • Nightfall Dashboard enables you to create, save, and manage detection rules easily and flexibly in the UI to reference in code.
  • Integration example & starter code for integrating directly with Databricks so you don’t need to write the integration glue from scratch.
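To make the idea of custom-defined data types concrete, here is a minimal local sketch of what regex-based and word-list-based matching looks like. It is an illustration only: it is not Nightfall's detection engine or API, and the function name `find_custom_matches`, the "Employee ID" pattern, and the "Codename" word list are all hypothetical.

```python
import re
from typing import Dict, List


def find_custom_matches(
    text: str,
    patterns: Dict[str, str],
    word_lists: Dict[str, List[str]],
) -> List[dict]:
    """Return matches for regex- and word-list-based custom data types.

    Illustrative local matcher only; not Nightfall's detection engine.
    """
    findings = []
    # Regex-based data types: every match of the pattern is a finding.
    for name, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            findings.append({
                "detectorName": name,
                "fragment": match.group(0),
                "start": match.start(),
                "end": match.end(),
            })
    # Word-list data types: whole-word, case-insensitive matches.
    for name, words in word_lists.items():
        for word in words:
            word_pattern = r"\b" + re.escape(word) + r"\b"
            for match in re.finditer(word_pattern, text, flags=re.IGNORECASE):
                findings.append({
                    "detectorName": name,
                    "fragment": match.group(0),
                    "start": match.start(),
                    "end": match.end(),
                })
    return sorted(findings, key=lambda f: f["start"])


# Hypothetical proprietary data types for the example.
example_text = "Ticket EMP-004217 mentions Project Bluebird."
example_findings = find_custom_matches(
    example_text,
    patterns={"Employee ID": r"\bEMP-\d{6}\b"},
    word_lists={"Codename": ["project bluebird"]},
)
```

In practice you would define such data types in Nightfall rather than matching locally. The sketch only shows the kind of input (a pattern or a word list) and output (a named fragment with offsets) involved.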
Instructions

Using Nightfall & Databricks APIs

  • Create an API key. Integrate with just a few lines of code.
  • Configure a detection rule. Set up detection rules as code or manage them in the Nightfall Console. Use our Playground to test detection rules easily.
  • Make your first API call. Scan text payloads or files with detectors trained via AI.
  • Read our integration example & starter code for Databricks.
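As a rough sketch of the glue between a Databricks table and the scan API, the snippet below flattens table rows into a list of strings suitable for the `payload` field and maps findings back to the cell they came from. It makes several assumptions: rows are plain dictionaries (for example, Spark rows converted with `Row.asDict()`), the response contains one list of findings per payload item in the same order, and `batch_size` is a value you choose yourself rather than a documented limit. The helper names `batch_cells` and `attach_findings` are hypothetical and are not taken from Nightfall's starter code.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Cell = Tuple[int, str]  # (row index, column name)


def batch_cells(
    rows: List[Dict[str, Optional[object]]],
    columns: List[str],
    batch_size: int,
) -> Iterator[Tuple[List[Cell], List[str]]]:
    """Yield (keys, payload) batches of non-empty cell values as strings."""
    keys: List[Cell] = []
    payload: List[str] = []
    for row_index, row in enumerate(rows):
        for column in columns:
            value = row.get(column)
            if value is None or value == "":
                continue  # nothing to scan in this cell
            keys.append((row_index, column))
            payload.append(str(value))
            if len(payload) == batch_size:
                yield keys, payload
                keys, payload = [], []
    if payload:
        yield keys, payload  # final, possibly smaller batch


def attach_findings(
    keys: List[Cell], response: List[List[dict]]
) -> Dict[Cell, List[dict]]:
    """Map each cell to its findings, assuming response[i] belongs to payload[i]."""
    return {key: findings for key, findings in zip(keys, response) if findings}


# Example rows standing in for a small Databricks table.
rows = [
    {"id": 1, "note": "4916-6734-7572-5015 is my credit card number", "email": None},
    {"id": 2, "note": "hello", "email": "a@example.com"},
]
batches = list(batch_cells(rows, columns=["note", "email"], batch_size=2))
```

Keeping the `(row index, column name)` keys alongside each payload batch is what lets you report which cell a finding came from, so a later step can flag or mask it.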
Nightfall example API call to classify text
REQUEST
curl --url https://api.nightfall.ai/v2/scan \
	--request POST \
	--header 'content-type: application/json' \
	--header "x-api-key: $NIGHTFALL_API_KEY" \
	--data '{
	    "payload": [
	      "4916-6734-7572-5015 is my credit card number"
	    ],
	    "config": {
	      "conditionSet": {
	        "conditions": [{
	          "minNumFindings": 1,
	          "minConfidence": "LIKELY",
	          "detector": {
	            "displayName": "Credit Card Number",
	            "detectorType": "NIGHTFALL_DETECTOR",
	            "nightfallDetector": "CREDIT_CARD_NUMBER"
	          }
	        }]
	      }}}'
RESPONSE
[
    [
        {
            "fragment": "4916-6734-7572-5015",
            "detectorName": "Credit Card Number",
            "confidence": "VERY_LIKELY",
            "location": {
                "byteRange": {
                    "start": 0,
                    "end": 19
                },
                "unicodeRange": {
                    "start": 0,
                    "end": 19
                }
            }
        }
    ]
]
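The same request and response can be handled from Python, for example in a Databricks notebook. The sketch below builds the request with the endpoint, headers, and JSON body from the curl example, then summarizes and masks the findings from the sample response. The helper names (`build_scan_request`, `summarize`, `redact`) are hypothetical. It also assumes that `unicodeRange` offsets are character positions with an exclusive end, which is inferred from the sample response (0 to 19 for a 19-character fragment) and not confirmed here.

```python
import json
import os
import urllib.request

SCAN_URL = "https://api.nightfall.ai/v2/scan"  # endpoint from the curl example


def build_scan_request(texts, api_key, min_confidence="LIKELY"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "payload": list(texts),
        "config": {"conditionSet": {"conditions": [{
            "minNumFindings": 1,
            "minConfidence": min_confidence,
            "detector": {
                "displayName": "Credit Card Number",
                "detectorType": "NIGHTFALL_DETECTOR",
                "nightfallDetector": "CREDIT_CARD_NUMBER",
            },
        }]}},
    }
    return urllib.request.Request(
        SCAN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def summarize(response):
    """Flatten findings to (payload index, detector, confidence, start, end)."""
    rows = []
    for index, findings in enumerate(response):
        for finding in findings:
            byte_range = finding["location"]["byteRange"]
            rows.append((index, finding["detectorName"], finding["confidence"],
                         byte_range["start"], byte_range["end"]))
    return rows


def redact(text, findings, mask="*"):
    """Mask each finding, assuming unicodeRange is [start, end) in characters."""
    chars = list(text)
    for finding in findings:
        unicode_range = finding["location"]["unicodeRange"]
        for i in range(unicode_range["start"], unicode_range["end"]):
            chars[i] = mask
    return "".join(chars)


text = "4916-6734-7572-5015 is my credit card number"
request = build_scan_request([text], os.environ.get("NIGHTFALL_API_KEY", ""))
# To send it: response = json.load(urllib.request.urlopen(request))

# Sample response copied from the example above, used here instead of a live call.
sample_response = [[{
    "fragment": "4916-6734-7572-5015",
    "detectorName": "Credit Card Number",
    "confidence": "VERY_LIKELY",
    "location": {"byteRange": {"start": 0, "end": 19},
                 "unicodeRange": {"start": 0, "end": 19}},
}]]
summary = summarize(sample_response)
redacted = redact(text, sample_response[0])
```

Reading the API key from the `NIGHTFALL_API_KEY` environment variable keeps the secret out of the notebook source. A Databricks secret scope would serve the same purpose.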

Case Study

“The Nightfall Developer Platform allows us to scan for certain patterns of information, like social security numbers or credit cards. We can ensure that our internal communication stays as work-appropriate as possible.”

Tim Alman, Enterprise Process Solutions Manager
Use case: DLP & content moderation
Employees: 12,000+
Industry: Retail & Ecommerce

Get Started

It's free to get started with the Developer Platform. Sign up now to start classifying and protecting sensitive data, or read our API docs to learn more.