Héctor Javier Vázquez Martínez

Email
hjvm at upenn dot edu
Location

Philadelphia, Pennsylvania, USA

About me

PhD researcher in Computational Linguistics at UPenn with a B.S./M.Eng. in Electrical Engineering and Computer Science from MIT, specializing in deep learning and natural language processing (NLP). I have industry experience with patent-backed, client-facing explainable AI systems and NLP pipelines deployed to the U.S. Department of Defense, alongside a strong publication record and a proven track record of building scalable, production-ready Machine Learning (ML) systems.

My research bridges insights from human language acquisition and representation learning to develop data-efficient speech and language systems. I believe leveraging our human inductive bias toward spoken syllables may enable us to drastically reduce the amount of input data needed to train speech models, thereby unlocking their capabilities for more linguistic communities around the world.

What I do

Natural Language Processing

Customized data analysis and natural language processing pipelines, including OpenAI Assistants API integration.
LLM Benchmarking

In-depth linguistic evaluation of Large Language Models in comparison to traditional, statistical language models and human judgement data.
Linguistic Analysis

Application of multiple levels of linguistic analysis, including Pragmatics, Syntax, and Phonology, from different perspectives such as Theoretical, Historical, and Computational Linguistics.
Phonetics & DSP

Large-scale digital signal processing (DSP) and acoustic analyses of speech databases to test scientific hypotheses, answer theoretical questions, or discover patterns in the data.

Advisors

Prof. Charles Yang

Professor of Linguistics, Computer Science, and Psychology at the University of Pennsylvania.
Prof. Mark Liberman

Christopher H. Browne Distinguished Professor of Linguistics and Professor of Computer and Information Science at the University of Pennsylvania.
Prof. Daniel Swingley

Professor of Psychology at the University of Pennsylvania.
Prof. Robert C. Berwick

Professor of Computational Linguistics and Computer Science and Engineering in the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society at the Massachusetts Institute of Technology.

Resume

Education

University of Pennsylvania
August 2022 — Present
PhD in Linguistics (GPA: 3.94/4.0).
Certificate in Language & Communication Sciences.
Certificate in College & University Teaching.
Massachusetts Institute of Technology
September 2019 — June 2021
M.Eng. and B.S. in Electrical Engineering & Computer Science (GPA: 5.0/5.0). M.Eng. concentration in Artificial Intelligence (GPA: 4.8/5.0).
Massachusetts Institute of Technology
June 2015 - May 2020
B.S. in Electrical Engineering and Computer Science.

Skills

Programming

Python (4+ yrs), C#, R, MATLAB
Deep Learning Frameworks

PyTorch, Hugging Face Transformers, TensorFlow (basic)
Data & ML Tools

scikit-learn, NumPy, Pandas, Seaborn, Matplotlib, Librosa
Development

Git, GitHub, VSCode, Jupyter

Deployment/Systems

REST APIs, Docker (basic)
NLP Expertise

Representation learning, fine-tuning, explainability, speech segmentation, TextGrid/Praat, data-efficient modeling
Communication

Technical writing, client demos, grant proposals, patents, stakeholder coordination
Languages

Spanish (native), English (professional), German (B2–C1)

Publications

Evaluating the Existence Proof: LLMs as Cognitive Models of Language Acquisition

H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. Forthcoming. In Artificial Knowledge of Language, Vernon Press.
Acceptability Evaluation of Naturally Written Sentences

V. Daultani, H. J. Vázquez Martínez, N. Okazaki. Journal of Information Processing, 2024.
Evaluating Neural Language Models as Cognitive Models of Language Acquisition

H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. GenBench 2023 (EMNLP Workshop), Singapore, 2023.
The Acceptability Delta Criterion: Testing Knowledge of Language using Sentence Acceptability Gradience

H. J. Vázquez Martínez. BlackboxNLP 2021 (EMNLP Workshop), Punta Cana, 2021.
Using Natural Language Processing to Construct a Knowledge Graph of Test Incident Reports

S. Indurkhya, H. J. Vázquez Martínez, A. Indurkhya, C. Donalek. International Test & Evaluation Symposium, 2021.
XAI Methods in VIP bring us closer to interpretable Network Graphs

H. J. Vázquez Martínez et al. Caltech Explainable AI (XAI) Workshop, 2021.
The Acceptability Delta Criterion: Memorization is Not Enough

H. J. Vázquez Martínez. Master's Thesis, Massachusetts Institute of Technology, 2021.
BERT's Adaptability to Small Data

H. J. Vázquez Martínez, A. L. Heuser. BlackboxNLP 2020 (EMNLP Workshop), 2020.

Patents

Systems and Methods for Numeric Network Extraction

US-20230306044-A1, 2023. S. Indurkhya, H. J. Vázquez Martínez, A. Salimov, A. Indurkhya, G. Zanfardino, E. Sloan, C. Donalek, M. Amori.
Systems and Methods for Network Explainability

US-20230004557-A1, 2023. H. J. Vázquez Martínez, S. Indurkhya, G. Zanfardino, A. Indurkhya, S. Sahu, C. Donalek, M. Amori.
Systems and Methods for Natural Language Querying

US-20220342873-A1, 2022. S. Indurkhya, H. J. Vázquez Martínez, G. Zanfardino, C. Donalek.

Industry Experience

AI Innovation Intern
Virtualitics, Inc. June 2024 — August 2024
As an intern familiar with Virtualitics' software stack, I prototyped and led the integration of OpenAI's API into the company software. I produced demos of our generative AI capabilities for key stakeholders, including C-suite executives, shareholders, and potential new clients.
Natural Language Processing Engineer
Virtualitics, Inc. February 2021 — August 2022
Lead R&D engineer on a $750,000 (USD) SBIR Phase II contract (CONTRACT NUMBER: FA864922P0579) with the Air Force. Developed the Deficiency Report AI Monitor (DReAM) from the ground up, successfully completing all milestones. Managed Virtualitics' Artificial Intelligence engineering resources to enable full integration of DReAM within the Virtualitics AI Platform.

Led the R&D effort to design, prototype, patent (PATENT NUMBER: US-20230004557-A1), and deploy into production a series of custom Explainable AI algorithms that generate plain-English descriptions of the clusters or communities of nodes detected in Network Graphs loaded into Virtualitics Explore. Presented the final product at the Caltech Explainable AI (XAI) Virtual Workshop.

Prototyped, deployed, and patented (PATENT NUMBER: US-20220342873-A1) the VIP AI-Guided Suggestions feature, a statistical subsystem underlying the Natural Language Query system in VIP. It analyzes the dataset currently loaded into the software and, based on the dataset's statistics, recommends AI routines or specific visualizations that reveal crucial insights about the data.

Teaching

LING 0001 - Introduction to Linguistics
University of Pennsylvania Spring 2025
LING 4000 - Tutorial in Linguistics
University of Pennsylvania Fall 2024
LING 0500 - Introduction to Formal Linguistics
University of Pennsylvania Spring 2024
LING 0001 - Introduction to Linguistics
University of Pennsylvania Fall 2023

6.034 - Artificial Intelligence
Massachusetts Institute of Technology Fall 2020
6.08 - Introduction to EECS via Interconnected Embedded Systems
Massachusetts Institute of Technology Spring 2020
6.034 - Artificial Intelligence
Massachusetts Institute of Technology Fall 2019
6.803/6.833 - The Human Intelligence Enterprise
Massachusetts Institute of Technology Spring 2019

Professional & Extracurricular Affiliations

Member, Association for Computational Linguistics
2021–present
Member, Linguistic Society of America
2024–present
Chair, The New Mind workshop & AI4GOOD Research Incubator
2024–present
Chair, 48th Annual Penn Linguistics Conference (PLC48)
2023–2024
Chair, 47th Annual Penn Linguistics Conference (PLC47)
2022–2023
Member, Fairmount Rowing Association
2023–present
Member, Wharton Crew Rowing Team
2022–present
Member, Treinta y Tres Delaware (Rueda de Casino) Dance and Performance Team
2022–present
Curriculum Director and Dance Instructor, MIT Casino Rueda
2018–2020
Team Member, NCAA Division I MIT Men's Lightweight Rowing Team
2015–2019
Member, Association of Puerto Rican Students at MIT
2015–2021

Awards & Honors

Google PhD Fellowship (UPenn Nominee)
May 2025
National GEM Consortium Employer Fellow (declined)
November 2024
UPenn Fontaine Fellow
August 2022
MIT Charles & Jennifer Johnson AI and Decision-Making MEng Thesis Award
June 2022
MIT Harold L. Hazen Teaching Award
May 2021