Héctor Javier Vázquez Martínez

Contact
Send me an email
Location

Philadelphia, Pennsylvania, USA

About me

Ph.D. candidate in Computational Linguistics at UPenn (B.S./M.Eng. EECS, MIT), working on data-efficient, linguistically informed methods for speech and language processing, with applications to under-resourced settings and the interpretability of speech and language models. My applied background includes patented explainable-AI systems, NLP solutions deployed to the U.S. Department of Defense, and peer-reviewed publications across computational linguistics and speech venues.

What I do

Speech & Language Processing

Data-efficient, linguistically informed methods for speech segmentation and representation.
Technical AI Safety

Mechanistic interpretability and explainable-AI solutions where deployed AI systems need to be reliable and predictable.
Computational Linguistics

Computational methods for the linguistic analysis of both language models and the linguistic environments where they are deployed.
AI/ML Engineering

Production-grade ML and patented commercial AI products, including NLP systems deployed for the U.S. Department of Defense.

Advisors

Prof. Charles Yang

Professor of Linguistics, Computer Science, and Psychology at the University of Pennsylvania.
Prof. Mark Liberman

Christopher H. Browne Distinguished Professor of Linguistics and Professor of Computer and Information Science at the University of Pennsylvania.
Prof. Daniel Swingley

Professor of Psychology at the University of Pennsylvania.
Prof. Robert C. Berwick

Professor of Computational Linguistics and Computer Science and Engineering in the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society at the Massachusetts Institute of Technology.

Resume

Education

University of Pennsylvania
August 2022 — Present
PhD in Linguistics (GPA: 3.94/4.0).
Certificate in Language & Communication Sciences.
Certificate in College & University Teaching.
Massachusetts Institute of Technology
September 2019 — June 2021
M.Eng. and B.S. in Electrical Engineering & Computer Science (GPA: 5.0/5.0). M.Eng. concentration in Artificial Intelligence (GPA: 4.8/5.0).
Massachusetts Institute of Technology
June 2015 - May 2020
B.S. in Electrical Engineering and Computer Science.

Skills

Programming

Python (4+ yrs), C#, R, MATLAB
ML & Deep Learning

PyTorch, Hugging Face Transformers, scikit-learn, TensorFlow (basic)
Speech & NLP

Representation learning, fine-tuning, interpretability, data-efficient modeling, speech segmentation, TextGrid/Praat
Data & Scientific Computing

NumPy, Pandas, Matplotlib, Seaborn, Librosa

Tools & Workflow

Git/GitHub, Docker (basic), REST APIs, Jupyter, VSCode
Communication

Technical writing, grant & patent writing, client demos, stakeholder coordination
Languages

Spanish (native), English (native/bilingual), German (professional)

Publications

findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding

H. J. Vázquez Martínez. To appear, Interspeech 2026.
Measuring Form and Function in Language Models

H. J. Vázquez Martínez & C. Yang. Under review, 2026.
Evaluating the Existence Proof: LLMs as Cognitive Models of Language Acquisition

H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. Forthcoming. In Artificial Knowledge of Language, Vernon Press.
Acceptability Evaluation of Naturally Written Sentences

V. Daultani, H. J. Vázquez Martínez, N. Okazaki. Journal of Information Processing, 2024.
Evaluating Neural Language Models as Cognitive Models of Language Acquisition

H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. GenBench 2023 (EMNLP Workshop), Singapore, 2023.
The Acceptability Delta Criterion: Testing Knowledge of Language using Sentence Acceptability Gradience

H. J. Vázquez Martínez. BlackboxNLP 2021 (EMNLP Workshop), Punta Cana, 2021.
Using Natural Language Processing to Construct a Knowledge Graph of Test Incident Reports

S. Indurkhya, H. J. Vázquez Martínez, A. Indurkhya, C. Donalek. International Test & Evaluation Symposium, 2021.
XAI Methods in VIP bring us closer to interpretable Network Graphs

H. J. Vázquez Martínez et al. Caltech Explainable AI (XAI) Workshop, 2021.
The Acceptability Delta Criterion: Memorization is Not Enough

H. J. Vázquez Martínez. Master's Thesis, Massachusetts Institute of Technology, 2021.
BERT's Adaptability to Small Data

H. J. Vázquez Martínez, A. L. Heuser. BlackboxNLP 2020 (EMNLP Workshop), 2020.

Patents

Systems and Methods for Numeric Network Extraction

US-20230306044-A1, 2023. S. Indurkhya, H. J. Vázquez Martínez, A. Salimov, A. Indurkhya, G. Zanfardino, E. Sloan, C. Donalek, M. Amori.
Systems and Methods for Network Explainability

US-20230004557-A1, 2023. H. J. Vázquez Martínez, S. Indurkhya, G. Zanfardino, A. Indurkhya, S. Sahu, C. Donalek, M. Amori.
Systems and Methods for Natural Language Querying

US-20220342873-A1, 2022. S. Indurkhya, H. J. Vázquez Martínez, G. Zanfardino, C. Donalek.

Industry Experience

AI Innovation Intern
Virtualitics, Inc June 2024 — August 2024
Returned to lead a 5-person team on generative AI integration, automating key parts of the analytics pipeline and delivering demos to executives, investors, and clients that secured buy-in for a 12-month roadmap to scale client-facing AI features.
Natural Language Processing Engineer
Virtualitics, Inc February 2021 — August 2022
Generated $750K USD in revenue for Virtualitics as lead research engineer on an SBIR Phase II contract with the U.S. Air Force. Developed an NLP pipeline for operational analytics and collaborated with Air Force analysts to deliver all milestones and deploy to the production platform.

Brought in $500K USD in new client contracts with a patented explainable-AI capability for graph analytics that accelerated adoption of Virtualitics' core network-analysis routines.

Reduced time-to-first-insight for non-technical users from more than an hour to 10–20 minutes with a patented natural-language recommendation system that suggested relevant workflows from live data.

Teaching

LING 0001 - Introduction to Linguistics
University of Pennsylvania Spring 2025
LING 4000 - Tutorial in Linguistics
University of Pennsylvania Fall 2024
LING 0500 - Introduction to Formal Linguistics
University of Pennsylvania Spring 2024
LING 0001 - Introduction to Linguistics
University of Pennsylvania Fall 2023

6.034 - Artificial Intelligence
Massachusetts Institute of Technology Fall 2020
6.08 - Introduction to EECS via Interconnected Embedded Systems
Massachusetts Institute of Technology Spring 2020
6.034 - Artificial Intelligence
Massachusetts Institute of Technology Fall 2019
6.803/6.833 - The Human Intelligence Enterprise
Massachusetts Institute of Technology Spring 2019

Professional & Extracurricular Affiliations

Member, Association for Computational Linguistics
2021–present
Member, Linguistic Society of America
2024–present
Chair, The New Mind workshop & AI4GOOD Research Incubator
2024–present
Chair, 48th Annual Penn Linguistics Conference (PLC48)
2023–2024
Chair, 47th Annual Penn Linguistics Conference (PLC47)
2022–2023
Member, Fairmount Rowing Association
2023–present
Member, Wharton Crew Rowing Team
2022–present
Member, Treinta y Tres Delaware (Rueda de Casino) Dance and Performance Team
2022–present
Curriculum Director and Dance Instructor, MIT Casino Rueda
2018–2020
Team Member, NCAA Division I MIT Men's Lightweight Rowing Team
2015–2019
Member, Association of Puerto Rican Students at MIT
2015–2021

Awards & Honors

Harrison Graduate Fellowship (UPenn)
2025–2026
Google PhD Fellowship (UPenn Nominee)
May 2025
National GEM Consortium Employer Fellow (declined)
November 2024
UPenn Fontaine Fellow
August 2022
MIT Charles & Jennifer Johnson AI and Decision-Making MEng Thesis Award
June 2022
MIT Harold L. Hazen Teaching Award
May 2021