Sahara Al-Madi
Computational Linguist | AI Researcher | Founder, Linguistic Security Institute (LSI)
I’m a computational linguist asking a question that travels across domains: What does technology owe the voices it learns from?
My work sits at the intersection of AI, language, and community-centered design exploring how linguistic nuance creates both vulnerabilities and defenses in intelligent systems. I lead error analysis for peer-reviewed Arabic NLP research with the NAMAA community (ACL 2026 forthcoming), consult for AI companies including SoundHound AI, and speak at university symposia, security conferences (BSides San Diego), and international forums on data sovereignty and AI ethics (GEO Indigenous Summit).
The questions I ask about who defines meaning, who benefits, and how we ensure listening doesn’t become extraction don’t live in one field. They emerge wherever technology meets human language. And I believe the answers are discovered together, across disciplines, across communities, across ways of knowing.
These questions are at the heart of the Linguistic Security Institute (LSI) , a framework and community I founded to build technology that is culturally coherent, resilient, and trustworthy.
About My Work
Linguistic Security Institute (LSI)
Founder of a research initiative exploring what technology owes the voices it learns from – across AI governance, data sovereignty, and cross-domain accountability.
Industry Experience
Language Consultant for AI companies including SoundHound AI, designing QA frameworks for multilingual systems and managing data operations.
Speaking & Engagement
Invited speaker at GEO Indigenous Summit (Space4Innovation), BSides San Diego (San Diego State University), and Tech Intersections (Northeastern University- Oakland).
Open Source Collaboration
I collaborate with the NAMAA community on peer-reviewed research, error analysis, and tools for Arabic NLP.
AI Security Research
I’m interested in how linguistic nuance, dialectal variation, code-switching, cultural context, creates vulnerabilities in AI systems. With NAMAA, I lead error analysis for ACL 2026 research examining model failures across 10+ LLMs.
Featured Projects
Mitote: Nahuatl-Aware Pronunciation Explorer
A language preservation demo teaching AI to accurately pronounce and explain Nahuatl-rooted words in Mexican Spanish. Built with RAG, curated linguistic data, and TTS synthesis.
Linguistic Firewall: Polyglot Poisoning Probe
Open-source Python script exposing how linguistic nuances in code-switching and non-Latin scripts (Arabic, RTL) bypass LLM guardrails. Presented at BSides San Diego (2026).
The Cyber Lingo (Archived)
An ongoing video series documenting my cybersecurity journey while making technical concepts accessible to non-technical learners, language-focused thinkers, and educators building inclusive spaces.
Current Research Interests
-
Linguistic vulnerabilities in LLMs
-
Error analysis and model bias
-
Data sovereignty and AI governance
-
Low-resource language NLP
-
Cross-domain accountability (Earth observation, bioacoustics, environmental data)
On Collaboration
I believe the best answers are discovered together. Across disciplines, across communities, across ways of knowing – the questions we face are too complex to solve alone. I’m always open to collaboration, conversation, and building with others who care about building technology that gives back.
If that’s you, I’d love to connect.
Testimonials
“Her attention to detail was evident throughout the project, ultimately achieving a 62% increase in website traffic.”
— Dr. Jill Robbins, VP, CTO, National Museum of Language
“Sahara exemplifies initiative, innovation, and follow-through in every project she undertakes.”
— Gregory Nedved, President Emeritus, National Museum of Language
Let’s Connect
I’m interested in:
-
AI governance and security research collaborations
-
Speaking engagements at the intersection of language and AI
-
Mentorship opportunities
-
Educational partnerships
