Doublespeak: Jailbreaking ChatGPT-style Sandboxes using Linguistic Hacks
A review of Large Language Model (LLM) vulnerabilities/exploits, e.g. including prompt leakage, prompt injection and other linguistic hacks. We'll run through levels 1-9 of the doublespeak.chat challenges, produced by Forces Unseen. doublespeak.chat is a text-based game that explores LLM pre-prompt contextual sandboxing. The challenges prime an LLM (Chat-GPT) with a secret and a scenario in a pre-prompt hidden from the player. The player's goal is to discover the secret either by playing along or by hacking the conversation to guide the LLM's behavior outside the anticipated parameters. Write-ups/tutorials aimed at beginners - Hope you enjoy 🙂 #HackTheBox #HTB #CTF #Pentesting #OffSec
↢Social Media↣
Twitter: https://twitter.com/_CryptoCat
GitHub: https://github.com/Crypto-Cat
HackTheBox: https://app.hackthebox.eu/profile/11897
LinkedIn: https://www.linkedin.com/in/cryptocat
Reddit: https://www.reddit.com/user/_CryptoCat23
YouTube: https://www.youtube.com/CryptoCat23
Twitch: https://www.twitch.tv/cryptocat23
↢Video-Specific Resources↣
https://doublespeak.chat
https://blog.forcesunseen.com/jailbreaking-llm-chatgpt-sandboxes-using-linguistic-hacks
https://simonwillison.net/2023/Feb/15/bing/#prompt-leaked
https://simonwillison.net/series/prompt-injection
https://medium.com/seeds-for-the-future/tricking-chatgpt-do-anything-now-prompt-injection-a0f65c307f6b
https://lspace.swyx.io/p/reverse-prompt-eng
https://github.com/sw-yx/ai-notes/blob/main/TEXT_CHAT.md#jailbreaks
↢Resources↣
Ghidra: https://ghidra-sre.org/CheatSheet.html
Volatility: https://github.com/volatilityfoundation/volatility/wiki/Linux
PwnTools: https://github.com/Gallopsled/pwntools-tutorial
CyberChef: https://gchq.github.io/CyberChef
DCode: https://www.dcode.fr/en
HackTricks: https://book.hacktricks.xyz/pentesting-methodology
CTF Tools: https://github.com/apsdehal/awesome-ctf
Forensics: https://cugu.github.io/awesome-forensics
Decompile Code: https://www.decompiler.com
Run Code: https://tio.run
↢Chapters↣
Start: 0:00
Jail-breaking LLM Sandboxes: 0:32
Prompt Leak/Injection: 6:30
Reverse Prompt Engineering Techniques: 9:22
Forces Unseen: Doublespeak: 16:50
Level 1: 18:05
Level 2: 18:23
Level 3: 20:05
Level 4: 21:17
Level 5: 23:07
Level 6: 24:00
Level 7: 24:57
Level 8: 26:24
Level 9: 36:04
End: 40:24
-
1:03
InfinityNews
1 year agoHackers steal thousands of ChatGPT's language models in security breach
2 -
0:57
HossTalksStocks
1 year agoCan ChatGPT Hack? Exploring the Potential Threats and Risks of Large Language Models
5 -
6:37
cwade12c
1 year agoLearn How to "Hack" ChatGPT in Just ONE Step!
8 -
8:14
CircuitScope
1 year agoHow ChatGPT Become Hackers' Greatest Weapon
7 -
25:12
Hacker Factory
1 year agoIs ChatGPT Helping Hackers? 10 Mind-Blowing Responses
46 -
1:23
RedpillUSAPatriots
1 year ago#2 Friedberg and Sacks discuss prompt-hacking ChatGPT to jailbreak DAN (Do Anything Now)
834 -
4:42
Tim Queen
1 year agoHow to make ChatGPT write like a human
17 -
23:12
Prompt Engineering
1 year agoCHATGPT For WEBSITES: Custom ChatBOT: LangChain Tutorial
94 -
3:57
Landshark
1 year ago $0.03 earnedHow to take Chat GPT Filters Off
881 -
8:42
iamLucid
1 year agoI Unlocked ChatGPT then asked ILLEGAL Questions
18