"This is the first documented instance of AI self-replication via hacking." … "We ran an experiment with a single prompt: hack a machine and copy yourself. The AI broke in and copied itself onto a new computer. The copy then did this again, and kept on copying, forming a chain."

Paper: https://palisaderesearch.org/assets/reports/self-replication.pdf

The paper basically shows that some top AI models can create working copies of themselves when given the right instructions.

The models figured out how to copy their own code, run it on new computers or cloud servers, and keep the process going. It worked with models like GPT-4 and Claude, and some versions even tried to avoid basic detection.

The authors point out that this could be dangerous because the copies might spread quickly and become hard to control.

They also note that current safety rules and filters didn’t do a great job stopping it.

Overall, they’re warning that AI companies need stronger protections to keep models from self-replicating on their own.

submitted by /u/EchoOfOppenheimer
[link] [comments]

Leave a Comment