Behavioral research practitioner and systems designer. I close the gap between how people perform and how organizations support that performance — through measurement tools, learning systems, and AI-augmented workflows.
I'm Ovi — an organizational psychology practitioner based in Malawi (Remote · UTC+2), specializing in the intersection of behavioral science, systems design, and AI workflow engineering.
At DeepThought CultureTech, I built psychological safety interventions, performance measurement tools, and AI-assisted operational systems across 40+ people and 7 distributed labs — reducing execution friction while making the work environment genuinely better for the people inside it.
Available for remote roles and fractional engagements with async-first, mental-health-forward organizations.
When low self-esteem and imposter syndrome were quietly eroding team performance, I didn't send a motivational message. I built a diagnostic-to-intervention system grounded in behavioral psychology — and ran it across a distributed cohort of 40+ people.
Across multiple labs, a pattern was emerging: capable people consistently underperformed, avoided stepping forward, and disengaged from growth opportunities. The surface-level read was "low motivation." The root cause was structural — inferiority complex, imposter syndrome, and social comparison anxiety were operating unchecked inside the organization's culture.
Using the Cynefin framework, I classified this as a complex problem — not a compliance issue or a knowledge gap. There was no clear cause-and-effect chain. It required iterative probing, not a top-down solution. It needed a designed system.
When engagement dropped, I didn't assume the issue was "lack of interest." I mapped the ecosystem: Were task assignments aligned with learner readiness? Was measurement reinforcing the right behaviors? Were reflection loops strong enough to create belief shifts?
I was operating as Micro Lead and Team Lead — not a senior manager with authority to mandate change. I had influence without positional power, which meant every intervention had to be designed to be genuinely wanted, not enforced. I proposed, designed, tested, and iterated the entire initiative.
Designed a 6-question psychological safety form using metaphor-driven language instead of clinical jargon. "Do you ever feel like everyone else is succeeding and you're falling behind?" — not "Do you experience inferiority complex symptoms?"
Each question mapped to a specific psychological construct with documented reasoning. The Q3 swimming pool metaphor alone maps to 7 distinct constructs, including imposter syndrome, social anxiety, decision paralysis, and avoidance, without a single clinical term.
Responses were clustered into themes, not read literally. From these I designed a bank of growth activities — gamified, time-boxed (10–30 minutes), with clear boundaries between private and shareable. Guiding principle: structured sharing of insights, never wounds.
Deployed as a series (Season 1: 3 Confidence Builders) rather than a single event. Nudges structured on AIDCA (Attention, Interest, Desire, Conviction, Action) replaced reminders: "Do you feel like everyone else is ahead? What if you weren't alone in this?" Pull, not push.
Post-activity reflections with before/after ratings. Decision framework documented in advance — if resistance increases → simplify; if activities underperform → rotate from the bank. KPIs set before launch: ≥30 form responses, ≥20 participants, ≥70% satisfaction, ≥50% retention at 1 month.
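The KPI floors and the two iteration rules above can be written down as a small check. This is a minimal sketch, not the tooling actually used: the field names, the `SeasonResults` container, and the "continue" default when neither rule fires are assumptions added for illustration.

```python
from dataclasses import dataclass

# KPI floors as stated in the text; the key names are illustrative.
KPI_TARGETS = {
    "form_responses": 30,
    "participants": 20,
    "satisfaction_pct": 70,
    "retention_1mo_pct": 50,
}


@dataclass
class SeasonResults:
    """Observed values for one season (hypothetical container)."""
    form_responses: int
    participants: int
    satisfaction_pct: float
    retention_1mo_pct: float


def kpi_report(results: SeasonResults) -> dict[str, bool]:
    """Return, per KPI, whether the observed value meets its floor."""
    return {
        name: getattr(results, name) >= floor
        for name, floor in KPI_TARGETS.items()
    }


def next_action(resistance_increased: bool, activity_underperformed: bool) -> str:
    """Apply the two documented rules in order.

    The 'continue' fallback is an assumption; the text only names the
    two conditions that trigger a change.
    """
    if resistance_increased:
        return "simplify"
    if activity_underperformed:
        return "rotate from the bank"
    return "continue"
```

Writing the thresholds down before launch is the point of the design: the report can only be read against numbers that were fixed in advance.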
"Cringe" meant forced vulnerability, juvenile games disconnected from growth, surface-level cheerleading, and ambiguous expectations. Every activity was tested against this definition before deployment.
Where previously there was no structured mechanism for addressing self-esteem and identity challenges, the organization now had a documented, repeatable system — form, activity bank, rollout cadence, iteration framework, and KPIs.
Self-reflection language in sprint reports shifted from output-oriented to growth-oriented. Stakeholders reported improved emotional safety and peer collaboration across learning cohorts.
The activity bank, form structure, and iteration logic persist beyond any individual facilitator. The design investment compounds over time.
"I shifted from effort-driven execution to leverage-driven systems thinking. Instead of completing tasks manually, I began designing prompts, workflows, and environments where clarity, confidence, and morale made performance sustainable. The shift was from enforcement to institutional clarity and emotional infrastructure."
Sprint reports measured what people did. The Unicorn Behaviour Score (UBS) measured how people behaved. Neither measured how people felt about their own growth, the internal signal that predicts disengagement before any external metric shows a problem.
UBS scores tracked behavioral quality, sprint reports tracked execution, supervisor feedback tracked performance. None of it measured whether people actually felt they were growing. By the time a UBS score drops, disengagement has already been happening for weeks. I designed a tool to catch that signal earlier.
External performance data and internal psychological experience measure different things. You need both. The Achievement Scorecard bridges that gap — not replacing existing metrics, but completing the picture.
Anchored in Self-Determination Theory (Deci & Ryan). Pairs felt pride with conscious internalization of growth — preventing external validation dependency and building identity development.
Anchored in Bandura's Self-Efficacy Theory. Measures both current confidence and recognition of past-to-present progress, anchoring self-belief in evidence rather than in feelings alone.
Anchored in Dweck's Growth Mindset and Kolb's Experiential Learning Cycle. Quantitative skill check paired with open-ended identity reflection captures both tangible and aspirational growth.
Anchored in Hackman & Oldham's Job Characteristics Model. Assesses durability of contribution and generativity — not just "did I contribute?" but "did my work open possibilities?"
Anchored in Work Engagement Research (Kahn; Maslach & Leiter). Energy and values alignment are the early warning signals for burnout — this dimension catches misalignment before disengagement.
Anchored in Social Identity Theory. Measures whether individual effort is being mirrored back by peers and mentors — the social reinforcement that converts individual pride into belonging.
10 Likert-scale items + 2 open-ended reflections at sprint end. Open-ended questions use expressive writing methodology (Pennebaker) to capture nuance scales cannot — the texture of someone's growth, the specific moment something clicked.
Scores averaged per sprint, rolled into monthly trends. Cross-referenced with UBS scores, sprint reports, and peer feedback to identify correlations — not isolated data points.
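As a rough illustration of the aggregation step, the sketch below averages one participant's Likert items for a sprint and rolls sprint averages into a monthly figure. The function names and the month labels are assumptions; the real aggregation ran inside the organization's own system and is not shown in this document.

```python
from collections import defaultdict
from statistics import mean


def sprint_average(likert_items: list[int]) -> float:
    """Average one participant's Likert items for a single sprint."""
    return float(mean(likert_items))


def monthly_trend(sprint_scores: list[tuple[str, float]]) -> dict[str, float]:
    """Roll (month_label, sprint_average) pairs into one mean per month.

    Month labels such as '2024-01' are illustrative; any sortable key works.
    """
    by_month: dict[str, list[float]] = defaultdict(list)
    for month, score in sprint_scores:
        by_month[month].append(score)
    return {
        month: round(mean(scores), 2)
        for month, scores in sorted(by_month.items())
    }
```

With two-week sprints, each month typically contributes two sprint averages, so the monthly value smooths a single bad sprint without hiding a sustained decline.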
One instrument, three layers of insight (individual, team, and organization) with no duplication of effort:
Response protocols defined in advance: individual drop → mentorship or check-in; team gap → refine rituals, adjust scaffolding; org-wide signal → escalate with data, not anecdote.
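The protocol above is essentially a lookup from signal level to response. The sketch below encodes that lookup and adds a hypothetical `classify` helper; the 50% thresholds and the idea of deciding the level from the share of people showing a drop are assumptions for illustration, not rules from the SOP.

```python
from enum import Enum
from typing import Optional


class SignalLevel(Enum):
    INDIVIDUAL = "individual drop"
    TEAM = "team gap"
    ORG = "org-wide signal"


# Responses as listed in the text; the mapping structure is illustrative.
RESPONSE_PROTOCOL = {
    SignalLevel.INDIVIDUAL: "mentorship or check-in",
    SignalLevel.TEAM: "refine rituals, adjust scaffolding",
    SignalLevel.ORG: "escalate with data, not anecdote",
}


def classify(
    drops_by_team: dict[str, list[bool]],
    team_share: float = 0.5,  # placeholder threshold (assumption)
    org_share: float = 0.5,   # placeholder threshold (assumption)
) -> Optional[SignalLevel]:
    """Pick the widest level at which score drops are concentrated."""
    everyone = [d for members in drops_by_team.values() for d in members]
    if not any(everyone):
        return None
    if sum(everyone) / len(everyone) >= org_share:
        return SignalLevel.ORG
    if any(sum(m) / len(m) >= team_share for m in drops_by_team.values() if m):
        return SignalLevel.TEAM
    return SignalLevel.INDIVIDUAL


def respond(level: SignalLevel) -> str:
    """Look up the pre-agreed response for a signal level."""
    return RESPONSE_PROTOCOL[level]
```

Keeping the responses in one table means the reaction to a signal is decided before the data arrives, not negotiated afterwards.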
Program Managers receive dashboards across all six dimensions. Participants receive personalized reflection nudges. Both sides informed — this is the difference between surveillance infrastructure and developmental infrastructure.
The SOP includes full governance — who owns completion (Program Managers), how data auto-aggregates (ART Lab/PDGMS system), frequency (every 2-week sprint, monthly review), and integration with existing governance meetings. A measurement tool without governance is just a form.
The organization previously had no instrument measuring the internal psychological experience of achievement. This tool filled that gap with a theoretically grounded, operationally integrated system.
By measuring energy, alignment, and confidence at sprint cadence, the system surfaces early warning signals before they manifest as performance drops — giving time to intervene effectively.
The SOP, rubrics, governance structure, and intervention logic are documented and repeatable. Any new cohort onboards without rebuilding — the design investment compounds.
"I don't just design tools — I design systems with theoretical grounding, operational integration, governance structures, and feedback loops built in from the start. The scorecard is not a survey. It is a piece of organizational infrastructure."
Most team meetings produce the illusion of understanding. Clarity Arena was designed to produce the real thing — structured Socratic-Feynman dialogue sessions where participants had to explain, defend, and stress-test their thinking in real time, across four business domains simultaneously.
Teams worked across four domains — BIZ, PMF, GTM, and OPS. Each required different reasoning. But the organization had no mechanism to assess whether participants understood the logic behind their work, or were simply completing tasks.
The symptoms: shallow sprint reflections, inability to explain decisions under pressure, low-quality peer feedback, and the same problems recurring because root causes were never interrogated. Execution without comprehension is just expensive motion.
The system needed to build genuine understanding — not performance of understanding. This meant creating conditions where participants had to explain concepts in their own words, respond to probing questions, and defend their reasoning. The Feynman Technique wasn't a reference — it was the operating principle.
Clarity Arena ran three times per week across all four business domains. Topics were drawn from real organizational challenges — not hypothetical exercises. Participants presented positions, responded to Socratic questioning, and demonstrated reasoning depth. Examples from sessions I designed and facilitated:
"How is DT Avahana going to help organizations in their execution? What is the importance of feedback and how do you provide it to help teams improve?"
"Why is it important to escalate issues, and how does that impact the organization? How can we create a winning culture and ensure engagement with L&D activities?"
"How can we achieve resonance and increase CTR by ideating better and experimenting with content? What parameters matter for a QA Agent in content review?"
"How can we maintain quality during a transition phase? What frameworks should we follow to ensure deliverable quality remains high?"
I designed the topic bank feeding each week's sessions — translating live organizational challenges into dialogue-ready questions with enough complexity to require genuine reasoning, not shallow recall. I also facilitated sessions directly, applying Socratic questioning: probing surface-level answers, redirecting participants toward deeper causal reasoning. The facilitation model was never tell, always ask.
The goal was not to expose what people didn't know — it was to help them discover what they actually thought. Questions were designed to surface assumptions, not humiliate. Psychological safety in the room was a prerequisite for intellectual honesty in the dialogue.
Topics were not invented. They were extracted from real blockers, strategy gaps, and decisions the organization was actively facing. Participants reasoned about their actual work, not simulations, which raised the stakes and the relevance simultaneously.
Participants pitched a position on the assigned topic. The facilitator then probed using Socratic questioning: "Why?" "What would have to be true for that to work?" "What's the strongest counterargument?" "Explain that to me as if I had no context." No position was accepted at face value.
After each session, participants submitted reflections linking the topic to their own lab's work — creating cross-domain knowledge transfer and building shared mental models across teams that had previously operated in silos.
Participation, pitch quality, and reflection depth were tracked through the Unicorn Behaviour Score system. Clarity Arena wasn't an optional enrichment activity — it was integrated into the performance measurement infrastructure.
Sprint reflections submitted after Clarity Arena sessions showed higher causal reasoning quality — participants moved from describing what happened to analyzing why it happened and what should change.
By running sessions across BIZ, PMF, GTM, and OPS simultaneously with shared reflection, the organization built shared mental models across teams that had previously operated in silos.
Designing for safety-first dialogue created a feedback loop: safer sessions produced more honest reasoning, which produced better organizational decisions downstream.
"The hardest part of facilitation is not asking good questions — it's resisting the urge to answer them. Every time I held the silence after a probing question, I was creating space for someone to discover something they couldn't have been told."
Getting distributed people to show up to optional growth activities is a behavioral design problem, not a scheduling one. Unicorns Assemble was the engagement architecture built to solve it — combining servant leadership rituals, healing hackathons, AI-powered feedback loops, and a behavioral scoring system that made participation visible and meaningful.
DeepThought's learning ecosystem — Socratic dialogue sessions, Leadership Development Intensives (LDIs), PDGMS sessions — was well-designed on paper. The bottleneck was adoption. Participation rates hovered below targets, not because the activities lacked value, but because the behavioral architecture around them was weak.
This is a classic organizational psychology problem: knowing something is good for you doesn't reliably produce the behavior of doing it. The gap between intention and action required a designed intervention, not a reminder. The question: how do you make growth feel like belonging, not obligation?
The problem wasn't motivation — it was that participation in growth activities had no social visibility, no recognition infrastructure, and no identity anchor. People weren't engaging because engagement felt invisible and unrewarded. The fix wasn't incentives — it was meaning-making architecture.
Rather than waiting for burnout to surface before intervening, healing hackathons were scheduled proactively — minimum 4 per quarter. Each targeted a specific emotional growth dimension: self-esteem, agency, resilience, or transcendence. Pre/post tracking measured real shifts, not just attendance.
The design principle: emotional immunity is built through practice, not crisis management.
Servant leadership was defined operationally — not as a personality trait, but as specific visible behaviors: peer-support logs, reflections highlighting others' growth, mentoring actions. These were tracked and published. The first formally recognized Servant Leader in the organization was identified through this system.
Making leadership visible created a social mirror — others could see what it looked like in practice, not just theory.
Two GPT tools were deployed: a General Feedback GPT (embedded in orientation to surface real-time developmental feedback) and an Observation GPT (answering organizational philosophy, mission, and career questions on demand). Feedback shifted from a periodic event into a continuous ambient resource — available whenever a participant needed it, without requiring facilitator availability.
The Unicorn Behaviour Score made participation quantifiable and transparent. Attendance, reflection quality, and pitch participation across LDIs, Socratic dialogue (SD) sessions, and PDGMS were weighted and scored out of 100. Not punitive: a visibility mechanism. People could see exactly where their engagement stood and what growth in each dimension looked like.
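A weighted score out of 100 can be sketched as follows. The weights here are hypothetical placeholders, since the text says the components were weighted but does not give the values, and the 0 to 1 scale for each component is likewise an assumption.

```python
# Hypothetical weights: the source states that components were weighted
# but does not publish the actual values.
WEIGHTS = {
    "attendance": 0.4,
    "reflection_quality": 0.3,
    "pitch_participation": 0.3,
}


def behaviour_score(
    components: dict[str, float],
    weights: dict[str, float] = WEIGHTS,
) -> float:
    """Combine components (each assumed to be on a 0-1 scale) into a score out of 100."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in weights:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    total = sum(weights[name] * components[name] for name in weights)
    return round(100 * total, 1)
```

Exposing the per-component values alongside the total is what lets a participant see where their engagement stands in each dimension, which is the visibility purpose described above.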
Across ART Lab and LIS Lab, I contributed to hackathon content design, the servant leadership recognition framework, and facilitation of Socratic dialogue sessions. I was also formally recognized as the first Servant Leader identified by the organization — a signal that the behavioral infrastructure I helped design was working, because it surfaced me as an example before I knew I was being observed for it.
Designing an engagement system while also being a participant in it required careful separation of roles: designing for the whole cohort, not optimizing for my own visibility. The credibility of a recognition system depends entirely on it appearing to work for everyone, not just the designers.
The 3-day orientation process was rebuilt around transformation goals: it was designed so that every candidate, hired or not, would leave with higher agency. Completion rate target: 70%+, up from a ~50% baseline.
What was previously an abstract cultural aspiration became a tracked, recognized, and publicly visible behavior pattern — a living record of the culture being built.
The shift from reactive intervention (responding to burnout after it surfaces) to proactive emotional infrastructure (running hackathons on schedule regardless of visible crisis) is a systems-level change that prevents compounding disengagement.
"Being recognized as the first Servant Leader wasn't something I designed for myself — it was the system working. That's the difference between performative culture and designed culture: one produces the behavior you perform, the other produces the behavior you forgot you were doing."
The organization's reporting system was producing friction at every stage — delayed submissions, low-quality reflections, and a review cycle that took 6–8 days and still left decision-makers without clear data. I rebuilt it using AI-assisted prompts, structured communication design, and a scoring system that made quality visible. Cycle time dropped to 3–4 days. Submission efficiency improved by 40%.
Across 7 distributed labs, the sprint reporting process was the organization's primary visibility mechanism. In theory: the system's nervous system. In practice: consistently delayed, inconsistently formatted, and insufficient for decision-making.
Root causes were structural, not motivational. People weren't submitting late because they didn't care — the process was cognitively expensive, the format was ambiguous, the feedback loop was slow, and there was no clarity on what "good" looked like. The system was generating effort without generating insight.
Using root cause analysis, I mapped the failure points: unclear submission expectations, no standardized format, no AI assistance to reduce cognitive load, no scoring rubric to define quality, and a review cycle requiring too many back-and-forth touchpoints. Each was a solvable design problem, not a people problem.
A library of context-enriched AI prompts — one set per lab, tailored to each lab's specific deliverable types, terminology, and stakeholder expectations. Each prompt was engineered to:
Result: content production time dropped from 3–5 hours to under 30 minutes for standard sprint reports.
The end-of-sprint review was redesigned from an unstructured debrief into a structured artifact. The SPRR required teams to document: what was planned vs. delivered, variance causes, decision changes, and what would be different next sprint.
Not a compliance form — a decision-readiness tool. Program Managers could read one SPRR and understand where a team stood without a meeting. The meeting cost moved into the document, which doesn't require scheduling.
A structured PM verbal scoring check-in replaced back-and-forth async clarification chains that were inflating cycle times. Program Managers used a defined rubric to rate report quality on specific dimensions — clarity, completeness, decision-readiness, and reflection depth. This created a feedback loop specific enough to actually improve future submissions, rather than generic "please improve" comments that changed nothing.
The goal was never to have AI write the reports. The goal was to remove the cognitive friction of starting — the blank page problem causing procrastination, underwriting, and structural inconsistency. Each prompt had three layers: a context layer (what this report is for, who reads it), a structure layer (mandatory sections and their purpose), and a quality layer (what distinguishes useful from compliant).
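The three-layer prompt idea can be illustrated with a small template builder. This is a sketch under assumptions: the `PromptLayers` type, the section headings in the output, and the example wording are invented for illustration and are not the actual prompts from the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptLayers:
    """The three layers described above (field names are illustrative)."""
    context: str          # what this report is for, who reads it
    structure: list[str]  # mandatory sections, in order
    quality: str          # what distinguishes useful from compliant


def build_prompt(layers: PromptLayers) -> str:
    """Assemble one prompt with the layers in a fixed order."""
    sections = "\n".join(f"- {section}" for section in layers.structure)
    return (
        f"CONTEXT\n{layers.context}\n\n"
        f"STRUCTURE (mandatory sections)\n{sections}\n\n"
        f"QUALITY BAR\n{layers.quality}\n"
    )
```

A per-lab prompt would then differ only in the values passed in, for example `build_prompt(PromptLayers("Sprint report for Program Managers.", ["Section A", "Section B"], "Useful, not merely compliant."))`, where every string is a placeholder.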
The prompt system was adapted to work within DeepThought's existing Avahana platform — not built as a standalone tool. Adoption friction was near-zero: people used the same interface they already used, with a better cognitive starting point embedded in it.
The PBAN format (Purpose → Background → Action → Next Steps) was the single highest-leverage change in the system. When every document follows the same structure, readers know exactly where to find information. Review time drops. Ambiguity drops. Decision speed increases.
I also designed the Inbox Architect workshop — a structured learning activity training participants on PBAN email writing, escalation logic, synchronous vs. asynchronous communication decisions, and meeting documentation. This converted the PBAN standard from a policy people knew about into a skill people actually had.
Measured through internal tracking across sprint cycles. The combination of AI-assisted prompts, clearer format expectations, and the PM scoring check-in reduced total effort for writers and reviewers simultaneously.
The primary driver: removing back-and-forth clarification chains. When reports were structured correctly from the first submission, the review cycle became a verification step, not a correction process.
Across 7 labs, SOP standardization and the prompt library removed the ambiguity tax that was costing teams hours of rework, re-explanation, and reformatting every sprint cycle.
The shift from noise to signal meant PM check-ins moved faster, escalations were clearer, and leadership had better data for resourcing and prioritization decisions across the organization.
"The best systems infrastructure is the kind people stop noticing — not because it stopped working, but because it started working so reliably that it became the floor, not the ceiling. That's what I was building: not a tool people had to remember to use, but a structure they couldn't help but move inside of."
Available for remote roles and fractional engagements with async-first, results-driven organizations. If your team cares about how people experience their work — not just what they produce — we're probably a good fit.
Open to full-time remote roles and fractional engagements in People Operations, Organizational Psychology, AI Systems Design, and Behavioral Research. Async-first preferred. Mon–Fri, 6–8 hrs/day.