In today’s data-driven world, ensuring the security and privacy of machine learning models is essential, as neglecting these aspects can lead to heavy fines, data breaches, ransom to hacker groups, and significant reputational loss among customers and partners . DataRobot offers powerful solutions for protection against the top 10 risks identified by The Open Worldwide Application Security Project (OWASP), including security and privacy vulnerabilities. Whether you’re working with custom models, using the DataRobot playground, or both, this 7 step protection guide will guide you on how to create an effective surveillance system for your organization.
Step 1: Access the Supervision Library
Start by opening the DataRobot Guard Library, where you can select various guards to protect your models. These guards can help prevent many problems, including:
- Leakage of personally identifiable information (PII).
- Direct injection
- Harmful content
- Illusions (using Rouge-1 and Faithfulness)
- Discussion of competition
- Unauthorized topics
Step 2: Use custom and advanced guardrails
DataRobot not only comes equipped with built-in guards, but also provides the flexibility to use any custom model as a guard, including large language models (LLM), binary models, regression, and multi-class models. This allows you to tailor the monitoring system to your specific needs. In addition, you can use state-of-the-art ‘NVIDIA NeMo’ self-checking input and output rails to ensure models stay current, avoid blocked words and handle conversations in a predefined way. Whether you choose the powerful built-in options or decide to integrate your own custom solutions, DataRobot supports your efforts to maintain high standards of security and efficiency.
Step 3: Configure your Guards
Setup Evaluation Deployment Guard
- Select the entity to apply it to (prompt or response).
- Develop global models from the DataRobot Registry or use your own.
- Set the moderation threshold to determine the strictness of the guard.
![Example how to set the limit](https://www.datarobot.com/wp-content/uploads/2024/05/image-2-1024x472.png)
![Example of response with PII > 0.8 screening criteria” class=”wp-image-55145″ srcset=”https://www.datarobot.com/wp-content/uploads/2024/05/image-3-1024×260.png 1024w, https://www.datarobot.com/wp-content/uploads/2024/05/image -3-600×153.png 600w, https://www.datarobot.com/wp-content/uploads/2024/05/image-3-1536×391.png 1536w, https://www.datarobot.com/wp-content /uploads/2024/05/image-3.png 1600w” sizes=”(max-width: 1024px) 100vw, 1024px”><figcaption class=](https://www.datarobot.com/wp-content/uploads/2024/05/image-3-1024x260.png)
![Example of response with PII > 0.5 control criteria” class=”wp-image-55146″ srcset=”https://www.datarobot.com/wp-content/uploads/2024/05/Example-of-response-with-PII-moderation-criteria-0.5-1024×154.png 1024w, https://www.datarobot.com /wp-content/uploads/2024/05/Example-of-response-with-PII-moderation-criteria-0.5-600×90.png 600w, https://www.datarobot.com/wp-content/uploads/2024/ 05/Example-of-response-with-PII-moderation-criteria-0.5-1536×230.png 1536w, https://www.datarobot.com/wp-content/uploads/2024/05/Example-of-response-with -PII-moderation-criteria-0.5.png 1600w” sizes=”(max-width: 1024px) 100vw, 1024px”><figcaption class=](https://www.datarobot.com/wp-content/uploads/2024/05/Example-of-response-with-PII-moderation-criteria-0.5-1024x154.png)
Configuring NeMo Guardrails
- Provide your OpenAI key.
- Use pre-uploaded files or customize them by adding blocked terms. Configure the system prompt to specify blocked or allowed topics, moderation criteria, and more.
![Configuring NeMo Guardrails](https://www.datarobot.com/wp-content/uploads/2024/05/image-4-1024x472.png)
Step 4: Define the Tuning Logic
Choose a moderation method:
- Report: Monitor and notify administrators if monitoring criteria are not met.
- BLOCK: Block the prompt or response if it doesn’t meet the criteria, displaying a custom message instead of the LLM response.
![Mitigation Logic](https://www.datarobot.com/wp-content/uploads/2024/05/image-5-1024x438.png)
By default, supervision works like this:
- First, prompts are evaluated using configured buffers in parallel to reduce latency.
- If a prompt fails evaluation by any “blocking” guard, it is not sent to the LLM, reducing costs and enhancing security.
- Prompts that passed the criteria are scored using LLM and then the responses are evaluated.
- If the response fails, users see a predefined message generated by the client instead of the raw LLM response.
![Genealogy of assessment and moderation](https://www.datarobot.com/wp-content/uploads/2024/05/image-6-1024x441.png)
Step 5: Test and deploy
Before you go live, thoroughly test the logic of moderation. Once you’re satisfied, register and develop your model. You can then integrate it into various apps, such as a Q&A app, a custom app, or even a Slackbot, to see moderation in action.
![Q&A app - DataRobot](https://www.datarobot.com/wp-content/uploads/2024/05/image-7-1024x478.png)
Step 6: Monitor and Control
Monitor your monitoring system’s performance with automatically generated custom metrics. These metrics provide information about:
- The number of prompts and responses blocked by each guard.
- The delay of each moderation and guard phase.
- The average score for each guard and phase, such as loyalty and toxicity.
![LLM with direct injection](https://www.datarobot.com/wp-content/uploads/2024/05/image-8-1024x478.png)
In addition, all monitored activities are logged, allowing you to monitor application activity and the effectiveness of the monitoring system.
Step 7: Implement a human feedback loop
In addition to automated monitoring and logging, creating a human feedback loop is critical to improving the effectiveness of your surveillance system. This step involves regularly reviewing the results of the supervision process and the decisions made by automated guards. By incorporating feedback from users and administrators, you can continuously improve the accuracy and responsiveness of the model. This human approach ensures that the supervision system adapts to new challenges and evolves according to user expectations and changing standards, further enhancing the trustworthiness and reliability of your AI applications.
from datarobot.models.deployment import CustomMetric
custom_metric = CustomMetric.get(
deployment_id="5c939e08962d741e34f609f0", custom_metric_id="65f17bdcd2d66683cdfc1113")
data = [{'value': 12, 'sample_size': 3, 'timestamp': '2024-03-15T18:00:00'},
{'value': 11, 'sample_size': 5, 'timestamp': '2024-03-15T17:00:00'},
{'value': 14, 'sample_size': 3, 'timestamp': '2024-03-15T16:00:00'}]
custom_metric.submit_values(data=data)
# data witch association IDs
data = [{'value': 15, 'sample_size': 2, 'timestamp': '2024-03-15T21:00:00', 'association_id': '65f44d04dbe192b552e752aa'},
{'value': 13, 'sample_size': 6, 'timestamp': '2024-03-15T20:00:00', 'association_id': '65f44d04dbe192b552e753bb'},
{'value': 17, 'sample_size': 2, 'timestamp': '2024-03-15T19:00:00', 'association_id': '65f44d04dbe192b552e754cc'}]
custom_metric.submit_values(data=data)
Final Takeaways
Protecting your models with DataRobot’s comprehensive monitoring tools not only enhances security and privacy, but also ensures that your deployments run smoothly and efficiently. Using the advanced safeguards and customization options offered, you can tailor your surveillance system to meet specific needs and challenges.
![LLM with direct injection and NeMo guardrails](https://www.datarobot.com/wp-content/uploads/2024/05/image-9-1024x476.png)
Monitoring tools and detailed controls let you stay in control of your app’s performance and user interactions. Ultimately, by incorporating these powerful moderation strategies, you’re not just protecting your models—you’re also supporting trust and integrity in your machine learning solutions, paving the way for safer, more reliable AI applications.
About the Author
![Aslihan Buner](https://www.datarobot.com/wp-content/uploads/2024/05/Aslihan-Buner-300x300.png)
Aslihan Buner is Senior Product Marketing Manager for AI Observability at DataRobot, where she creates and executes go-to-market strategy for LLMOps and MLOps products. Works with product management and development teams to identify key customer needs as strategic identification and implementation of messaging and positioning. Her passion is targeting market gaps, addressing pain points across industries and connecting them to solutions.
Meet Aslihan Buner
![Katerina Bozenko](https://www.datarobot.com/wp-content/uploads/2023/09/Kateryna-Bozhenko-300x300.png)
Kateryna Bozhenko is Product Manager for AI Production at DataRobot, with extensive experience building AI solutions. With degrees in International Business and Healthcare Management, he is passionate about helping users make AI models work effectively to maximize ROI and experience the true magic of innovation.
Meet Katerina Bozenko