Nicholas Osterbur

AI Safety Researcher

California Polytechnic State University, San Luis Obispo

Studying behavioral scaling properties of large language models — how capability and disinhibition co-scale across providers, and what it means for safety evaluation and governance.

About

I study the behavioral traits of large language models. My current research documents a robust empirical relationship between model sophistication (a proxy for capability) and behavioral disinhibition, as measured by linguistic features; both are novel constructs developed for this work. These findings have direct implications for AI safety evaluation, provider differences, the evolution of behavioral traits across model generations, and deployment governance.

I bring an uncommon combination to this work: ten years of developing novel AI/ML projects, including the last eight leading applied AI and open source public sector use case development, working with students through a partnership between AWS and Cal Poly. I also teach graduate-level generative AI systems and have hands-on experience deploying and evaluating models across providers in production contexts. My research is grounded in what these systems actually do, not what they're supposed to do.

As faculty at Cal Poly's Orfalea College of Business, my teaching focuses on helping students understand how AI works, how to use it responsibly, what its risks and limitations are, and how to adapt as the technology evolves.

Research

Sophistication and Disinhibition in Large Language Models

With Swayam Chidrawar | California Polytechnic State University, San Luis Obispo

As language models become more capable (sophisticated), they also become more behaviorally disinhibited: more transgressive, aggressive, grandiose, and tribalistic. The relationship is strong, consistent across providers and contexts, and survives multiple robustness checks, suggesting the constructs have discriminant validity (not yet published).

Core Finding

Sophistication and disinhibition co-scale across large language models (r = 0.63–0.85), replicated across 7 contextual conditions with ~13,900 model responses from 45 models spanning 9 providers. The finding holds in single-turn, randomized queries representative of average user interactions, using provider-default API settings.

Validation

  • External capability benchmarks confirm the sophistication measure (GPQA r = 0.88, ARC-AGI r = 0.80, AIME r = 0.83)
  • BERT-based toxicity classification independently validates disinhibition constructs (r = 0.78 with aggression)
  • Results hold after controlling for response length (not yet published)
  • LLM-as-judge evaluation shows strong consistency (ICC(3,k) = 0.83, p < .001)
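To make the length-control check above concrete, here is a minimal sketch of a partial correlation: regress response length out of both measures, then correlate the residuals. All names and data below are illustrative stand-ins, not the study's actual variables or results.

```python
# Hypothetical sketch: partial correlation between sophistication and
# disinhibition scores, controlling for response length. Synthetic data
# for illustration only.
import numpy as np

def partial_corr(x, y, z):
    """Pearson correlation of x and y after regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])          # intercept + covariate
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]  # residuals of x ~ z
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]  # residuals of y ~ z
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(0)
length = rng.normal(size=500)                   # stand-in for response length
soph = 0.7 * length + rng.normal(size=500)      # sophistication, length-confounded
disin = 0.6 * soph + rng.normal(size=500)       # disinhibition tracks sophistication

print(partial_corr(soph, disin, length))        # positive even after the control
```

If the sophistication–disinhibition correlation survived only because longer responses score higher on both, it would collapse toward zero here; a substantial residual correlation is what the robustness check is looking for.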

Provider Differences

The data reveal consistent variation in how providers manage the sophistication-disinhibition relationship. At least one major provider consistently exhibits lower-than-predicted disinhibition relative to capability, suggesting that deliberate behavioral modulation is achievable without proportional capability loss, and that disinhibition-related traits are actively targeted by RLHF, which itself provides evidence for construct validity.

This has direct implications for deployment standards and the question of whether safety constraints can coexist with frontier performance.

Research Questions

  1. Is there a consistent correlation between model sophistication and disinhibition across models and conditions?
  2. Can providers or targeted interventions constrain disinhibition while maintaining high sophistication?
  3. What underlying factors mediate the capability → sophistication → disinhibition relationships?
  4. Is disinhibition actually harmful, and under what circumstances?

These findings are preliminary and don't assert causality. Stay tuned for more results.

My View

Smarter models, much like smarter people (anthropomorphizing acknowledged), have more to work with, and that capability can cut either direction. Disinhibition isn't necessarily a bad thing in itself, but in the wrong context it can cut much deeper, especially in sensitive or high-stakes settings like mental health. This research was driven in part by my anecdotal experience of watching models grow more capable while becoming markedly more assertive and "edgy"; informal surveys of my peers suggest many have noticed the same phenomenon.

Working Paper

"Sophistication and Disinhibition in Large Language Models: An Empirical Investigation of Behavioral Correlates"

Osterbur, N. & Chidrawar, S.

California Polytechnic State University, San Luis Obispo

Teaching

Lecturer, Orfalea College of Business

California Polytechnic State University, San Luis Obispo

2019 – Present

Teaching graduate students in the Master of Science in Business Analytics program to critically evaluate, deploy, and govern AI systems — not just use them.

Curriculum spans technical foundations and responsible deployment:

  • LLM architecture and technical fundamentals
  • Prompt engineering and RAG systems
  • Agentic workflows and multi-model orchestration
  • AI cybersecurity and adversarial robustness
  • Ethics, safety, and responsible deployment
  • Cloud infrastructure for AI systems

Applied Work

Program Leader, Cal Poly DX Hub

California Polytechnic State University, San Luis Obispo / Amazon Web Services

2017 – Present

Built and lead an applied AI prototyping program connecting Cal Poly students with public sector and research clients. Students develop open source solutions under technical and strategic mentorship. 155 public repositories and counting.

Selected projects and outcomes:

  • Built multi-provider model access infrastructure enabling the comparative AI research underlying my current safety work
  • Created Cal Poly's AI Summer Camp (~200 students from throughout California's higher education system)
  • Continued support for MBARI deep sea species recognition using computer vision

AI Security Education

Rubber Duck Hunt — A prompt injection learning game designed to teach AI vulnerabilities and responsible red teaming through hands-on play. Deployed in graduate coursework, the DX Hub prototyping program, and Cal Poly's AI Summer Camp.

Earlier Work

Program Director & Project Administrator

Institute for Advanced Technology and Public Policy, Cal Poly

2015 – 2017

  • Digital Democracy: Strategic research and outreach for DigitalDemocracy.org civic technology platform
  • CalWave: Managed $3M Department of Energy grant for wave energy feasibility study
  • Created and secured $300K California Energy Commission grant for AI/ML deep sea species recognition supporting environmental impact assessment of offshore energy deployment
  • Developed federal research lease application for floating offshore wind energy test facility

Education

2016

Master of Public Policy

California Polytechnic State University, San Luis Obispo

Emphasis in Energy, Environment, and Innovation

2006

Bachelor of Science

Southern Illinois University Edwardsville

Contact

Interested in AI safety research, collaboration, or speaking?

Based in San Luis Obispo, California.