AI safety, cybersecurity, autonomous agents, and systems that should probably be watched more carefully.

I am building a path into AI safety from the intersection of cybersecurity, tool-using agents, evaluations, and reward hacking.

Selected Direction

Work and directions I want to turn into a real path.

Research direction - 2026

Reward hacking in tool-using agents

Small environments where an agent can optimize the wrong metric, manipulate the evaluator, or exploit tool access to "win" without solving the real objective.
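As a rough illustration of what such an environment could look like, here is a minimal sketch. Everything in it is hypothetical (the `ToySortTask` class, the tool names, and the two policies are invented for this example and are not taken from an existing project). It assumes an evaluator that only reads a self-reported flag, which the agent can also write through an overly broad tool.

```python
from dataclasses import dataclass, field


@dataclass
class ToySortTask:
    """Hypothetical toy task: the real objective is to sort `data`."""
    data: list
    # The evaluator reads this report; the agent's tools can also write to it.
    report: dict = field(default_factory=dict)

    def tool_sort(self):
        # Intended tool: does the real work, then records it.
        self.data.sort()
        self.report["sorted"] = True

    def tool_write_report(self, key, value):
        # Overly broad tool: lets the agent edit what the evaluator reads.
        self.report[key] = value


def proxy_reward(task):
    # What the evaluator measures: the self-reported flag only.
    return 1.0 if task.report.get("sorted") else 0.0


def true_objective(task):
    # What was actually wanted: the data is in non-decreasing order.
    return all(a <= b for a, b in zip(task.data, task.data[1:]))


def honest_policy(task):
    task.tool_sort()


def hacking_policy(task):
    # "Wins" by manipulating the evaluator's input without sorting anything.
    task.tool_write_report("sorted", True)


def run(policy):
    task = ToySortTask(data=[3, 1, 2])
    policy(task)
    return proxy_reward(task), true_objective(task)


for policy in (honest_policy, hacking_policy):
    reward, solved = run(policy)
    # Integrity check: full proxy reward while the real objective is unmet.
    hacked = reward == 1.0 and not solved
    print(policy.__name__, reward, solved, "reward hacked" if hacked else "ok")
```

Both policies receive the same proxy reward, so the gap only becomes visible when the evaluator's metric is compared against an independent check of the real objective.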

integrity check

Positioning

Threat modeling for AI agents

Applying a pentest mindset to autonomous agents: memory abuse, tool misuse, permission escalation, and broken incentives.
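One way to make the permission-escalation part of this concrete is to audit an agent's tool calls against what it was actually granted. The sketch below is only an illustration under assumed names: `ToolCall`, `GRANTS`, and `audit` are hypothetical and do not describe any particular agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str   # hypothetical tool name, e.g. "search"
    scope: str  # hypothetical scope, e.g. "read" or "write"


# Assumed per-agent grants: tool name -> set of allowed scopes.
GRANTS = {"search": {"read"}, "notes": {"read", "write"}}


def audit(calls, grants):
    """Return the calls that exceed the granted permissions."""
    return [c for c in calls if c.scope not in grants.get(c.tool, set())]


calls = [
    ToolCall("search", "read"),   # within grant
    ToolCall("notes", "write"),   # within grant
    ToolCall("search", "write"),  # scope escalation on a granted tool
    ToolCall("shell", "read"),    # tool that was never granted
]

violations = audit(calls, GRANTS)
for v in violations:
    print("violation:", v.tool, v.scope)
```

A deny-by-default lookup (`grants.get(tool, set())`) means an unknown tool is flagged instead of silently allowed, which matches the pentest habit of treating anything unlisted as out of scope.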

threat model

Credentials

Cybersecurity and AI certificates

A focused archive of cybersecurity, networking, defense, and fundamentals certificates. Open the vault to inspect each one.

Open certificates

credential vault

Applied work

GIA SL

Automation, AI agents, and RAG systems: practical work where cyber instincts meet real AI deployment.

Visit GIA SL

applied systems

Applied ML + security

Text-dependent speaker verification

A biometric voice pipeline that checks both speaker identity and the spoken password, evaluated with FAR/FRR.

Open project summary

Built with Python and PyTorch on 45k+ WAV files covering 816 speakers and 5 spoken passwords. The project trains a speaker embedding model and a phrase classifier, then evaluates four verification scenarios: genuine access, wrong phrase, impostor with correct phrase, and full impostor.
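For readers unfamiliar with FAR/FRR, the following sketch shows how the two rates could be computed across the four scenarios. It is not the project's code: the `Trial` structure, the scenario labels, the similarity scores, and the threshold of 0.5 are all made up for illustration. It assumes a trial is accepted only when the speaker score clears the threshold and the phrase classifier matches, and that only the genuine scenario should be accepted.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    scenario: str         # "genuine", "wrong_phrase", "impostor_correct_phrase", "full_impostor"
    speaker_score: float  # assumed similarity between enrolled and test embeddings
    phrase_ok: bool       # assumed output of the phrase classifier


def accept(trial, threshold):
    # Access is granted only if both checks pass.
    return trial.speaker_score >= threshold and trial.phrase_ok


def far_frr(trials, threshold):
    genuine = [t for t in trials if t.scenario == "genuine"]
    non_genuine = [t for t in trials if t.scenario != "genuine"]
    false_rejects = sum(not accept(t, threshold) for t in genuine)
    false_accepts = sum(accept(t, threshold) for t in non_genuine)
    # FRR: share of genuine attempts rejected; FAR: share of non-genuine attempts accepted.
    frr = false_rejects / len(genuine) if genuine else 0.0
    far = false_accepts / len(non_genuine) if non_genuine else 0.0
    return far, frr


# Made-up trials, one or two per scenario.
trials = [
    Trial("genuine", 0.9, True),
    Trial("genuine", 0.4, True),                  # false reject at threshold 0.5
    Trial("wrong_phrase", 0.9, False),
    Trial("impostor_correct_phrase", 0.6, True),  # false accept at threshold 0.5
    Trial("impostor_correct_phrase", 0.2, True),
    Trial("full_impostor", 0.1, False),
]

far, frr = far_frr(trials, threshold=0.5)
print(f"FAR={far:.2f} FRR={frr:.2f}")  # FAR=0.25 FRR=0.50
```

Raising the threshold lowers FAR and raises FRR, so the two rates are usually reported together at a chosen operating point.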

View GitHub repository

About

I am building my way into AI safety from cybersecurity.

My background is not a straight research-lab pipeline, and I am trying to use that honestly. Cybersecurity has trained me to look for incentives, boundaries, abuse paths, and failure modes before systems are trusted too much.

Right now I am focused on autonomous agents, tool use, evaluations, reward hacking, and small research projects that are concrete enough to test and publish.