About me

Ph.D. candidate in Computational Linguistics at UPenn (B.S./M.Eng. EECS, MIT), working on data-efficient, linguistically informed methods for speech and language processing, with applications to under-resourced settings and the interpretability of speech and language models. My applied background includes patented explainable-AI systems, NLP solutions deployed to the U.S. Department of Defense, and peer-reviewed publications across computational linguistics and speech venues.

What I do

  • Speech & Language Processing

    Data-efficient, linguistically informed methods for speech segmentation and representation.

  • Technical AI Safety

    Mechanistic interpretability and explainable-AI solutions for settings where deployed AI systems must be reliable and predictable.

  • Computational Linguistics

    Computational methods for the linguistic analysis of both language models and the linguistic environments where they are deployed.

  • AI/ML Engineering

    Production-grade ML and patented commercial AI products, including NLP systems deployed for the U.S. Department of Defense.

Advisors

  • Prof. Charles Yang

    Professor of Linguistics, Computer Science, and Psychology at the University of Pennsylvania.

  • Prof. Mark Liberman

    Christopher H. Browne Distinguished Professor of Linguistics and Professor of Computer and Information Science at the University of Pennsylvania.

  • Prof. Daniel Swingley

    Professor of Psychology at the University of Pennsylvania.

  • Prof. Robert C. Berwick

    Professor of Computational Linguistics and Computer Science and Engineering in the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society at the Massachusetts Institute of Technology.

Resume

Education

  1. University of Pennsylvania

    August 2022 — Present

    PhD in Linguistics (GPA: 3.94/4.0).
    Certificate in Language & Communication Sciences.
    Certificate in College & University Teaching.

  2. Massachusetts Institute of Technology

    September 2019 — June 2021

    M.Eng. and B.S. in Electrical Engineering & Computer Science (GPA: 5.0/5.0). M.Eng. concentration in Artificial Intelligence (GPA: 4.8/5.0).

  3. Massachusetts Institute of Technology

    June 2015 - May 2020

    B.S. in Electrical Engineering and Computer Science.

Skills

  1. Programming

    Python (4+ yrs), C#, R, MATLAB

  2. ML & Deep Learning

    PyTorch, Hugging Face Transformers, scikit-learn, TensorFlow (basic)

  3. Speech & NLP

    Representation learning, fine-tuning, interpretability, data-efficient modeling, speech segmentation, TextGrid/Praat

  4. Data & Scientific Computing

    NumPy, Pandas, Matplotlib, Seaborn, Librosa

  5. Tools & Workflow

    Git/GitHub, Docker (basic), REST APIs, Jupyter, VS Code

  6. Communication

    Technical writing, grant & patent writing, client demos, stakeholder coordination

  7. Languages

    Spanish (native), English (native/bilingual), German (professional)

Publications

  1. findsylls: A Language-Agnostic Toolkit for Syllable-Level Speech Tokenization and Embedding

    H. J. Vázquez Martínez. To appear, Interspeech 2026.

  2. Measuring Form and Function in Language Models

    H. J. Vázquez Martínez & C. Yang. Under review, 2026.

  3. Evaluating the Existence Proof: LLMs as Cognitive Models of Language Acquisition

    H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. Forthcoming. In Artificial Knowledge of Language, Vernon Press.

  4. Acceptability Evaluation of Naturally Written Sentences

    V. Daultani, H. J. Vázquez Martínez, N. Okazaki. Journal of Information Processing, 2024.

  5. Evaluating Neural Language Models as Cognitive Models of Language Acquisition

    H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. GenBench 2023 (EMNLP Workshop), Singapore, 2023.

  6. The Acceptability Delta Criterion: Testing Knowledge of Language using Sentence Acceptability Gradience

    H. J. Vázquez Martínez. BlackboxNLP 2021 (EMNLP Workshop), Punta Cana, 2021.

  7. Using Natural Language Processing to Construct a Knowledge Graph of Test Incident Reports

    S. Indurkhya, H. J. Vázquez Martínez, A. Indurkhya, C. Donalek. International Test & Evaluation Symposium, 2021.

  8. XAI Methods in VIP bring us closer to interpretable Network Graphs

    H. J. Vázquez Martínez et al. Caltech Explainable AI (XAI) Workshop, 2021.

  9. The Acceptability Delta Criterion: Memorization is Not Enough

    H. J. Vázquez Martínez. Master's Thesis, Massachusetts Institute of Technology, 2021.

  10. BERT's Adaptability to Small Data

    H. J. Vázquez Martínez, A. L. Heuser. BlackboxNLP 2020 (EMNLP Workshop), 2020.

Patents

  1. Systems and Methods for Numeric Network Extraction

    US-20230306044-A1, 2023. S. Indurkhya, H. J. Vázquez Martínez, A. Salimov, A. Indurkhya, G. Zanfardino, E. Sloan, C. Donalek, M. Amori.

  2. Systems and Methods for Network Explainability

    US-20230004557-A1, 2023. H. J. Vázquez Martínez, S. Indurkhya, G. Zanfardino, A. Indurkhya, S. Sahu, C. Donalek, M. Amori.

  3. Systems and Methods for Natural Language Querying

    US-20220342873-A1, 2022. S. Indurkhya, H. J. Vázquez Martínez, G. Zanfardino, C. Donalek.

Industry Experience

  1. AI Innovation Intern

    Virtualitics, Inc June 2024 — August 2024

    Returned to lead a 5-person team on generative AI integration, automating key parts of the analytics pipeline and delivering demos to executives, investors, and clients that secured buy-in for a 12-month roadmap to scale client-facing AI features.

  2. Natural Language Processing Engineer

    Virtualitics, Inc February 2021 — August 2022

    Generated $750K USD in revenue for Virtualitics as lead research engineer on an SBIR Phase II contract with the U.S. Air Force. Developed an NLP pipeline for operational analytics and collaborated with Air Force analysts to deliver all milestones and deploy to the production platform.

    Brought in $500K USD in new client contracts with a patented explainable-AI capability for graph analytics that accelerated adoption of Virtualitics' core network-analysis routines.

    Reduced time-to-first-insight for non-technical users from more than an hour to 10–20 minutes with a patented natural-language recommendation system that suggested relevant workflows from live data.

Teaching

  1. LING 0001 - Introduction to Linguistics

    University of Pennsylvania Spring 2025
  2. LING 4000 - Tutorial in Linguistics

    University of Pennsylvania Fall 2024
  3. LING 0500 - Introduction to Formal Linguistics

    University of Pennsylvania Spring 2024
  4. LING 0001 - Introduction to Linguistics

    University of Pennsylvania Fall 2023
  5. 6.034 - Artificial Intelligence

    Massachusetts Institute of Technology Fall 2020
  6. 6.08 - Introduction to EECS via Interconnected Embedded Systems

    Massachusetts Institute of Technology Spring 2020
  7. 6.034 - Artificial Intelligence

    Massachusetts Institute of Technology Fall 2019
  8. 6.803/6.833 - The Human Intelligence Enterprise

    Massachusetts Institute of Technology Spring 2019

Professional & Extracurricular Affiliations

  1. Member, Association for Computational Linguistics

    2021–present
  2. Member, Linguistic Society of America

    2024–present
  3. Chair, The New Mind workshop & AI4GOOD Research Incubator

    2024–present
  4. Chair, 48th Annual Penn Linguistics Conference (PLC48)

    2023–2024
  5. Chair, 47th Annual Penn Linguistics Conference (PLC47)

    2022–2023
  6. Member, Fairmount Rowing Association

    2023–present
  7. Member, Wharton Crew Rowing Team

    2022–present
  8. Member, Treinta y Tres Delaware (Rueda de Casino) Dance and Performance Team

    2022–present
  9. Curriculum Director and Dance Instructor, MIT Casino Rueda

    2018–2020
  10. Team Member, NCAA Division I MIT Men's Lightweight Rowing Team

    2015–2019
  11. Member, Association of Puerto Rican Students at MIT

    2015–2021

Awards & Honors

  1. Harrison Graduate Fellowship (UPenn)

    2025–2026
  2. Google PhD Fellowship (UPenn Nominee)

    May 2025
  3. National GEM Consortium Employer Fellow (declined)

    November 2024
  4. UPenn Fontaine Fellow

    August 2022
  5. MIT Charles & Jennifer Johnson AI and Decision-Making MEng Thesis Award

    June 2022
  6. MIT Harold L. Hazen Teaching Award

    May 2021
  7. Hispanic Scholarship Fund (HSF)

    2016–2020
  8. Zeno Karl Schindler Summer School Grant

    May 2019
  9. MIT Excellence Award: Albert G. Hill Prize

    May 2018
  10. MIT EECS RE: Research & Innovation Scholar

    Aug 2017 – Jan 2018
  11. Palantir Future Scholarship

    Oct 2017

Certifications

  1. Technical AI Safety

    BlueDot Impact 2026
  2. Technical AI Safety Project

    BlueDot Impact 2026
  3. Certificate in Language and Communication Sciences

    University of Pennsylvania
  4. Certificate in College and University Teaching

    University of Pennsylvania
