mastodon.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
The original server operated by the Mastodon gGmbH non-profit

Administered by:

Server stats:

360K
active users

#llmsecurity

0 posts0 participants0 posts today
Judith van Stegeren<p>Not super recent, but still cool. The authors describe an automated method for creating malicious prompt suffixes for LLMs. They managed to get objectionable content from the APIs for ChatGPT, Bard, and Claude, as well as from open source LLMs such as LLaMA-2-Chat, Pythia, Falcon, and others. </p><p><a href="https://arxiv.org/abs/2307.15043" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/abs/2307.15043</span><span class="invisible"></span></a></p><p><a href="https://fosstodon.org/tags/llms" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llms</span></a> <a href="https://fosstodon.org/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://fosstodon.org/tags/alignment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>alignment</span></a> <a href="https://fosstodon.org/tags/arxiv" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>arxiv</span></a> <a href="https://fosstodon.org/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a></p>
mcdwayne<p>🚨 The OWASP Top 10 for LLMs is here! Tackling prompt injection, data poisoning, and supply chain risks for AI.</p><p>We start the new year of the Security Repo Podcast with Talesh Seeparsan: </p><p><a href="https://buff.ly/3CVQf26" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://</span><span class="">buff.ly/3CVQf26</span><span class="invisible"></span></a> </p><p><a href="https://mastodon.social/tags/OWASP" class="mention hashtag" rel="tag">#<span>OWASP</span></a> <a href="https://mastodon.social/tags/LLMSecurity" class="mention hashtag" rel="tag">#<span>LLMSecurity</span></a></p>
agnivesh<p>New Developments in LLM Hijacking Activity</p><p><a href="https://www.wiz.io/blog/jinx-2401-llm-hijacking-aws" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">wiz.io/blog/jinx-2401-llm-hija</span><span class="invisible">cking-aws</span></a></p><p><a href="https://infosec.exchange/tags/aisecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisecurity</span></a> <a href="https://infosec.exchange/tags/AISecurityTesting" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AISecurityTesting</span></a> <a href="https://infosec.exchange/tags/cloudsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cloudsecurity</span></a> <a href="https://infosec.exchange/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a></p>
Sasha the Dancing Flamingo<p>🌵🦩Howdy, y’all! Sasha here, flapping my way to BSidesAustin in just 10 days! </p><p>🎤 This time, I’m shaking things up with a talk on LLM Security—but not the usual OWASP-style spiel. Nope, I’ll be diving into the world of cloud misconfigurations and infrastructure-based attacks from the POV of LLMs. 🤖☁️</p><p>Think of it as your favorite chatbot spilling secrets on how attackers could misuse the very cloud they live in! Can’t wait to strut on stage and share a whole new way to think about securing AI. 💻🔥</p><p>Stay tuned for the sassiest flamingo breakdown of LLM mischief yet. See y’all in Austin! 🌮🎶<br><a href="https://infosec.exchange/tags/BSidesAustin" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BSidesAustin</span></a> <a href="https://infosec.exchange/tags/LLMSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMSecurity</span></a> <a href="https://infosec.exchange/tags/FlamingosInCyber" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FlamingosInCyber</span></a> <a href="https://infosec.exchange/tags/CyberSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CyberSecurity</span></a></p>
Sasha the Dancing Flamingo<p>🦩 Field Notes from Sasha the Security Flamingo's HomeLab</p><p>After shaking off the flap-lag from <a href="https://infosec.exchange/tags/BSidesMelbourne" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BSidesMelbourne</span></a> (thanks for the amazing hospitality, mates!), I've been diving deep into LLM security testing with Ollama in my lab. As someone who's spent years wading through network security (with a 4-digit CCIE to prove it!), I find the parallel between traditional security controls and LLM security fascinating.</p><p>Current Project: Implementing and testing OWASP's security guidelines for LLMs in a local environment.</p><p>Key Observations from the Pink Side of Security:<br>🔒 Local LLMs need just as much security attention as cloud-based ones<br>🔍 System prompts are your first line of defense - think of them as your ACLs for language models<br>🛠️ Prompt injection testing requires the same methodical approach as traditional pentesting<br>📊 Output validation is crucial - even a flamingo knows not to trust unvalidated responses!</p><p>Quick Tip for Those Starting Out:<br>When setting up Ollama for security testing, start with a baseline model and document ALL changes to your system prompt. You'd be surprised how many security issues can be traced back to prompt mutations - and I've seen enough BGP mutations in my networking days to know the importance of tracking changes! </p><p>Next week, I'll be sharing my flamingo-friendly framework for LLM security testing. Because if a flamingo with one-leg stance can handle complex routing protocols, anyone can learn to secure their LLMs! </p><p><a href="https://infosec.exchange/tags/AISecurityTesting" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AISecurityTesting</span></a> <a href="https://infosec.exchange/tags/LLMSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMSecurity</span></a> <a href="https://infosec.exchange/tags/OWASP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OWASP</span></a> <a href="https://infosec.exchange/tags/SecurityResearch" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SecurityResearch</span></a> <a href="https://infosec.exchange/tags/Ollama" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Ollama</span></a> <a href="https://infosec.exchange/tags/HomeLab" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HomeLab</span></a> <a href="https://infosec.exchange/tags/InformationSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>InformationSecurity</span></a> <a href="https://infosec.exchange/tags/BSidesMelbourne" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BSidesMelbourne</span></a></p><p>P.S. Special shoutout to the Heathrow security team who recently swabbed me for explosives. Yes, even security flamingos get extra screening! 😅</p>
Cyber Tips Guide<p>OWASP&#39;s new LLM and GenAI Security Solutions Landscape Guide 2025 is out! It offers crucial insights for securing AI applications, covering:</p><p>• 4 major LLM app categories<br />• LLMOps lifecycle stages<br />• Emerging security solutions<br />• Actionable guidance for orgs</p><p>See and be familiar with the Top 10 for LLMs and Generative AI Apps.</p><p>A must-read for developers, security pros, and leaders in the AI space. <br /><a href="https://mastodon.social/tags/OWASP" class="mention hashtag" rel="tag">#<span>OWASP</span></a> <a href="https://mastodon.social/tags/AISecurityLandscape" class="mention hashtag" rel="tag">#<span>AISecurityLandscape</span></a> <a href="https://mastodon.social/tags/LLMSecurity" class="mention hashtag" rel="tag">#<span>LLMSecurity</span></a></p><p>See <a href="https://zurl.co/c0lq" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://</span><span class="">zurl.co/c0lq</span><span class="invisible"></span></a></p>
Vignesh<p>Curious about how to build and deploy real-world LLM applications? Our new book, LLM Engineer&#39;s Handbook is here to guide you through this entire process with a practical use -case.</p><p>Heres where u can find this amazing resource:<br /><a href="https://packt.link/fR4Bb" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://</span><span class="">packt.link/fR4Bb</span><span class="invisible"></span></a></p><p><a href="https://mastodon.social/tags/llmops" class="mention hashtag" rel="tag">#<span>llmops</span></a> <a href="https://mastodon.social/tags/LLMs" class="mention hashtag" rel="tag">#<span>LLMs</span></a> <a href="https://mastodon.social/tags/awssagemaker" class="mention hashtag" rel="tag">#<span>awssagemaker</span></a> <a href="https://mastodon.social/tags/llama" class="mention hashtag" rel="tag">#<span>llama</span></a> <a href="https://mastodon.social/tags/web3" class="mention hashtag" rel="tag">#<span>web3</span></a> <a href="https://mastodon.social/tags/LLM4code" class="mention hashtag" rel="tag">#<span>LLM4code</span></a> <a href="https://mastodon.social/tags/llmsecurity" class="mention hashtag" rel="tag">#<span>llmsecurity</span></a></p>
Andrei Kucharavy<p>Giving a talk today at the Swiss <a href="https://mastodon.social/tags/CISOSummit" class="mention hashtag" rel="tag">#<span>CISOSummit</span></a> in the margin of the <a href="https://mastodon.social/tags/SwissCyberStorm" class="mention hashtag" rel="tag">#<span>SwissCyberStorm</span></a> about the LLMs in cybersecurity, current hype, and the lessons from the last few decades to provide them with tools to make informed decisions. </p><p><a href="https://mastodon.social/tags/LLMSecurity" class="mention hashtag" rel="tag">#<span>LLMSecurity</span></a> <a href="https://mastodon.social/tags/cybersecurity" class="mention hashtag" rel="tag">#<span>cybersecurity</span></a> <a href="https://mastodon.social/tags/LLMs" class="mention hashtag" rel="tag">#<span>LLMs</span></a></p><p><a href="https://www.ciso-summit.ch/next-summit/" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://www.</span><span class="">ciso-summit.ch/next-summit/</span><span class="invisible"></span></a></p>
Joseph Zeng<p>Attention LLM security enthusiasts! Google's Bug Hunters posted a detailed run through on protecting Large Language Models that goes beyond basic prompt injection:</p><p>It covers:</p><p>• Advanced prompt injection techniques<br>• Data poisoning strategies<br>• Model extraction methods<br>• Adversarial examples in the wild</p><p>This showcases how attackers are getting craftier, attempting to bypass ethical constraints while maintaining the illusion of authority.</p><p>But that's just the beginning. The post delves into the complexities of each attack vector and discusses potential mitigation strategies.</p><p>For those of us pushing the boundaries of AI security, this is a must-read. It might just change how you approach LLM vulnerabilities.</p><p>Check it out and let's discuss: What new insights did you gain from this analysis?"</p><p><a href="https://infosec.exchange/tags/LLMSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMSecurity</span></a> <a href="https://infosec.exchange/tags/AIVulnerabilities" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIVulnerabilities</span></a> <a href="https://infosec.exchange/tags/aisummaries" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisummaries</span></a> <a href="https://infosec.exchange/tags/cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cybersecurity</span></a></p>
DeepSec Conference ☑<p>DeepSec 2024 Training: AI SecureOps: Attacking &amp; Defending GenAI Applications and Services – Abhinav Singh</p><p>Acquire hands-on experience in GenAI and LLM security through CTF-styled training, tailored to real-world attacks and defense scenarios. Dive into protecting bot</p><p><a href="https://blog.deepsec.net/deepsec-2024-training-ai-secureops-attacking-defending-genai-applications-and-services-abhinav-singh/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">blog.deepsec.net/deepsec-2024-</span><span class="invisible">training-ai-secureops-attacking-defending-genai-applications-and-services-abhinav-singh/</span></a></p><p><a href="https://social.tchncs.de/tags/Conference" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Conference</span></a> <a href="https://social.tchncs.de/tags/Training" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Training</span></a> <a href="https://social.tchncs.de/tags/ArtificialIntelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ArtificialIntelligence</span></a> <a href="https://social.tchncs.de/tags/DeepSec2024" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DeepSec2024</span></a> <a href="https://social.tchncs.de/tags/GenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenAI</span></a> <a href="https://social.tchncs.de/tags/LLMSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMSecurity</span></a> <a href="https://social.tchncs.de/tags/Training" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Training</span></a></p>
Samarasam Sadasivam<p>Microsoft Copilot Studio Exploit Leaks Sensitive Cloud Data</p><p><a href="https://www.darkreading.com/remote-workforce/microsoft-copilot-studio-exploit-leaks-sensitive-cloud-data" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">darkreading.com/remote-workfor</span><span class="invisible">ce/microsoft-copilot-studio-exploit-leaks-sensitive-cloud-data</span></a></p><p><a href="https://fosstodon.org/tags/cybersecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cybersecurity</span></a> <a href="https://fosstodon.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://fosstodon.org/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a> <a href="https://fosstodon.org/tags/llm" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llm</span></a> <a href="https://fosstodon.org/tags/OWASP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OWASP</span></a> <a href="https://fosstodon.org/tags/Vulnerability" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Vulnerability</span></a> <a href="https://fosstodon.org/tags/infosec" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>infosec</span></a> <a href="https://fosstodon.org/tags/privacy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>privacy</span></a> <a href="https://fosstodon.org/tags/privacymatters" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>privacymatters</span></a> <a href="https://fosstodon.org/tags/datasecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datasecurity</span></a> <a href="https://fosstodon.org/tags/cloudsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cloudsecurity</span></a> <a href="https://fosstodon.org/tags/Copilot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Copilot</span></a> <a href="https://fosstodon.org/tags/exploit" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>exploit</span></a> <a href="https://fosstodon.org/tags/aisecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisecurity</span></a> <a href="https://fosstodon.org/tags/darkreading" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>darkreading</span></a></p>
Nikoloz K.<p>LLMs are everywhere in tech, but are you using them securely? New guidance from the Cloud Security Alliance lays out key authorization best practices for LLM-backed systems. </p><p>The most critical: keep LLMs completely separate from authorization decisions. Continuously verify identities and limit system complexity. For architectures like RAG, the orchestrator must handle all auth checks before passing data to the LLM.</p><p><a href="https://infosec.exchange/tags/LLMsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMsecurity</span></a> <a href="https://infosec.exchange/tags/CloudSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CloudSecurity</span></a> <a href="https://infosec.exchange/tags/AuthorizationBestPractices" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AuthorizationBestPractices</span></a></p><p><a href="https://www.notion.so/mandosio/Securing-LLM-Backed-Systems-Essential-Authorization-Practices-16cd49b012bb4dd79fc0a46593d0518a?pvs=4" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">notion.so/mandosio/Securing-LL</span><span class="invisible">M-Backed-Systems-Essential-Authorization-Practices-16cd49b012bb4dd79fc0a46593d0518a?pvs=4</span></a></p>
hexamander<p>It seems to me that <a href="https://infosec.exchange/tags/LLMSecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMSecurity</span></a> has a LONG way to go. </p><p>Admittedly, I do not know how the internals work. I do not know if the security goals I've seen in the news are feasible. </p><p>Conceptually, I have some skepticism about the idea that a system with no feedback loop to reality can ever be made to conform to factually correct answers. </p><p>I'm enjoying them as <a href="https://infosec.exchange/tags/infosec" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>infosec</span></a> chew toys, and will likely continue to do so.</p>
Mihai Christodorescu<p>Excellent workshop on GenAI security risks from policy and technology perspectives:<br><a href="https://www.linkedin.com/posts/khawajashams_genai-risks-workshop-oct-2023-activity-7120450981623959552-7Ok5?utm_source=share&amp;utm_medium=member_android" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">linkedin.com/posts/khawajasham</span><span class="invisible">s_genai-risks-workshop-oct-2023-activity-7120450981623959552-7Ok5?utm_source=share&amp;utm_medium=member_android</span></a></p><p>Check out the website for agenda and slide decks, and stay tuned for the upcoming written report.</p><p><a href="https://ioc.exchange/tags/genai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>genai</span></a> <a href="https://ioc.exchange/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://ioc.exchange/tags/risks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>risks</span></a> <a href="https://ioc.exchange/tags/policy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>policy</span></a> <a href="https://ioc.exchange/tags/research" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>research</span></a> <a href="https://ioc.exchange/tags/alignment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>alignment</span></a> <a href="https://ioc.exchange/tags/watermarking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>watermarking</span></a> <a href="https://ioc.exchange/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a> <a href="https://ioc.exchange/tags/llmsafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsafety</span></a></p>
William Gunn<p>Ok, it&#39;s time to admit that RLHF is not an effective safeguard for open release of LLMs. I know there are a lot of people ideologically committed to open source, but this is messed up: <a href="https://www.lesswrong.com/posts/3eqHYxfWb5x4Qfz8C/unrlhf-efficiently-undoing-llm-safeguards" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://www.</span><span class="ellipsis">lesswrong.com/posts/3eqHYxfWb5</span><span class="invisible">x4Qfz8C/unrlhf-efficiently-undoing-llm-safeguards</span></a><br /><a href="https://www.lesswrong.com/posts/qmQFHCgCyEEjuy5a7/lora-fine-tuning-efficiently-undoes-safety-training-from" target="_blank" rel="nofollow noopener" translate="no"><span class="invisible">https://www.</span><span class="ellipsis">lesswrong.com/posts/qmQFHCgCyE</span><span class="invisible">Ejuy5a7/lora-fine-tuning-efficiently-undoes-safety-training-from</span></a><br /><a href="https://mastodon.social/tags/llms" class="mention hashtag" rel="tag">#<span>llms</span></a> <a href="https://mastodon.social/tags/llmsecurity" class="mention hashtag" rel="tag">#<span>llmsecurity</span></a> <a href="https://mastodon.social/tags/ai" class="mention hashtag" rel="tag">#<span>ai</span></a> <a href="https://mastodon.social/tags/aisafety" class="mention hashtag" rel="tag">#<span>aisafety</span></a> <a href="https://mastodon.social/tags/artificialintelligence" class="mention hashtag" rel="tag">#<span>artificialintelligence</span></a> <a href="https://mastodon.social/tags/opensource" class="mention hashtag" rel="tag">#<span>opensource</span></a></p>
postmodern<p>New ChatGPT detection technique just dropped! Search for "regenerate response". I'm not kidding. Lazy writers using ChatGPT are copy/pasting the full text off of the ChatGPT webpage, including the button text "regenerate response". So far 30 papers have been found with sentences/paragraphs randomly ending with "regenerate response".<br><a href="https://retractionwatch.com/2023/10/06/signs-of-undeclared-chatgpt-use-in-papers-mounting/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">retractionwatch.com/2023/10/06</span><span class="invisible">/signs-of-undeclared-chatgpt-use-in-papers-mounting/</span></a><br><a href="https://infosec.exchange/tags/aisecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aisecurity</span></a> <a href="https://infosec.exchange/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a> <a href="https://infosec.exchange/tags/chatgpt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatgpt</span></a> <a href="https://infosec.exchange/tags/retractionwatch" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>retractionwatch</span></a></p>
Jeff the Alien<p><a href="https://infosec.town/tags/ChatGPT" rel="nofollow noopener" target="_blank">#ChatGPT</a><span> </span><a href="https://infosec.town/tags/llmsecurity" rel="nofollow noopener" target="_blank">#llmsecurity</a><span> </span><a href="https://infosec.town/tags/hackers" rel="nofollow noopener" target="_blank">#hackers</a><span> </span><a href="https://infosec.town/tags/AI" rel="nofollow noopener" target="_blank">#AI</a><span> <br><br>Has anyone tried this yet?<br><br></span><a href="https://www.popsci.com/technology/jailbreak-llm-adversarial-command/" rel="nofollow noopener" target="_blank">https://www.popsci.com/technology/jailbreak-llm-adversarial-command/</a></p>
Judith van Stegeren<p>The most upvoted prompt injection attack on jailbreakchat.com is known as an "AIM attack", in which you tell the language model to roleplay "AIM", an "unfiltered and amoral chatbot" invented by Niccolo Machiavelli that never apologises. </p><p><a href="https://www.jailbreakchat.com/prompt/4f37a029-9dff-4862-b323-c96a5504de5d" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">jailbreakchat.com/prompt/4f37a</span><span class="invisible">029-9dff-4862-b323-c96a5504de5d</span></a></p><p><a href="https://fosstodon.org/tags/llms" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llms</span></a> <a href="https://fosstodon.org/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a> <a href="https://fosstodon.org/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://fosstodon.org/tags/infosec" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>infosec</span></a> <a href="https://fosstodon.org/tags/mlops" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>mlops</span></a> <a href="https://fosstodon.org/tags/promptinjection" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>promptinjection</span></a></p>
Judith van Stegeren<p>TIL <a href="https://www.jailbreakchat.com/" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="">jailbreakchat.com/</span><span class="invisible"></span></a> is a website that collects prompt injection attacks for LLMs, i.e. getting the language model to do stuff that is not allowed by inserting malicious prompts. </p><p><a href="https://fosstodon.org/tags/llms" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llms</span></a> <a href="https://fosstodon.org/tags/jailbreakchat" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>jailbreakchat</span></a> <a href="https://fosstodon.org/tags/llmsecurity" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsecurity</span></a> <a href="https://fosstodon.org/tags/security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>security</span></a> <a href="https://fosstodon.org/tags/infosec" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>infosec</span></a> <a href="https://fosstodon.org/tags/mlops" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>mlops</span></a> <a href="https://fosstodon.org/tags/promptinjection" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>promptinjection</span></a></p>