About me

PhD researcher in Computational Linguistics at UPenn with a B.S./M.Eng. in Electrical Engineering and Computer Science from MIT, specializing in deep learning and natural language processing (NLP). I have industry experience with patent-backed, client-facing explainable AI systems and NLP pipelines deployed to the U.S. Department of Defense, alongside a strong publication record and a proven track record of building scalable, production-ready Machine Learning (ML) systems.

My research bridges insights from human language acquisition and representation learning to develop data-efficient speech and language systems. I believe that leveraging our human inductive bias toward spoken syllables may enable us to drastically reduce the amount of input data needed to train speech models, thereby unlocking their capabilities for more linguistic communities around the world.

What I do

  • Natural Language Processing

    Customized data analysis and natural language processing pipelines, including OpenAI Assistants API integration.

  • LLM Benchmarking

    In-depth linguistic evaluation of Large Language Models in comparison to traditional, statistical language models and human judgement data.

  • Linguistic Analysis

    Application of multiple levels of linguistic analysis, including Pragmatics, Syntax, and Phonology, from different perspectives such as Theoretical, Historical, and Computational Linguistics.

  • Phonetics & DSP

    Large-scale digital signal processing (DSP) and acoustic analyses of speech databases to test scientific hypotheses, answer theoretical questions, or discover patterns in the data.

Advisors

  • Prof. Charles Yang

    Professor of Linguistics, Computer Science, and Psychology at the University of Pennsylvania.

  • Prof. Mark Liberman

    Christopher H. Browne Distinguished Professor of Linguistics and Professor of Computer and Information Science at the University of Pennsylvania.

  • Prof. Daniel Swingley

    Professor of Psychology at the University of Pennsylvania.

  • Prof. Robert C. Berwick

    Professor of Computational Linguistics and Computer Science and Engineering in the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society at the Massachusetts Institute of Technology.

Resume

Education

  1. University of Pennsylvania

    August 2022 — Present

    PhD in Linguistics (GPA: 3.94/4.0).
    Certificate in Language & Communication Sciences.
    Certificate in College & University Teaching.

  2. Massachusetts Institute of Technology

    September 2019 — June 2021

    M.Eng. and B.Sc. in Electrical Engineering & Computer Science (GPA: 5.0/5.0). M.Eng. concentration in Artificial Intelligence (GPA: 4.8/5.0).

  3. Massachusetts Institute of Technology

    June 2015 - May 2020

    B.S. in Electrical Engineering and Computer Science.

Skills

  1. Programming

    Python (4+ yrs), C#, R, MATLAB

  2. Deep Learning Frameworks

    PyTorch, Hugging Face Transformers, TensorFlow (basic)

  3. Data & ML Tools

    scikit-learn, NumPy, Pandas, Seaborn, Matplotlib, Librosa

  4. Development

    Git, GitHub, VSCode, Jupyter

  5. Deployment/Systems

    REST APIs, Docker (basic)

  6. NLP Expertise

    Representation learning, fine-tuning, explainability, speech segmentation, TextGrid/Praat, data-efficient modeling

  7. Communication

    Technical writing, client demos, grant proposals, patents, stakeholder coordination

  8. Languages

    Spanish (native), English (professional), German (B2–C1)

Publications

  1. Evaluating the Existence Proof: LLMs as Cognitive Models of Language Acquisition

    H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. Forthcoming. In Artificial Knowledge of Language, Vernon Press.

  2. Acceptability Evaluation of Naturally Written Sentences

    V. Daultani, H. J. Vázquez Martínez, N. Okazaki. Journal of Information Processing, 2024.

  3. Evaluating Neural Language Models as Cognitive Models of Language Acquisition

    H. J. Vázquez Martínez, A. Heuser, C. Yang & J. Kodner. GenBench 2023 (EMNLP Workshop), Singapore, 2023.

  4. The Acceptability Delta Criterion: Testing Knowledge of Language using Sentence Acceptability Gradience

    H. J. Vázquez Martínez. BlackboxNLP 2021 (EMNLP Workshop), Punta Cana, 2021.

  5. Using Natural Language Processing to Construct a Knowledge Graph of Test Incident Reports

    S. Indurkhya, H. J. Vázquez Martínez, A. Indurkhya, C. Donalek. International Test & Evaluation Symposium, 2021.

  6. XAI Methods in VIP bring us closer to interpretable Network Graphs

    H. J. Vázquez Martínez et al. Caltech Explainable AI (XAI) Workshop, 2021.

  7. The Acceptability Delta Criterion: Memorization is Not Enough

    H. J. Vázquez Martínez. Master's Thesis, Massachusetts Institute of Technology, 2021.

  8. BERT's Adaptability to Small Data

    H. J. Vázquez Martínez, A. L. Heuser. BlackboxNLP 2020 (EMNLP Workshop), 2020.

Patents

  1. Systems and Methods for Numeric Network Extraction

    US-20230306044-A1, 2023. S. Indurkhya, H. J. Vázquez Martínez, A. Salimov, A. Indurkhya, G. Zanfardino, E. Sloan, C. Donalek, M. Amori.

  2. Systems and Methods for Network Explainability

    US-20230004557-A1, 2023. H. J. Vázquez Martínez, S. Indurkhya, G. Zanfardino, A. Indurkhya, S. Sahu, C. Donalek, M. Amori.

  3. Systems and Methods for Natural Language Querying

    US-20220342873-A1, 2022. S. Indurkhya, H. J. Vázquez Martínez, G. Zanfardino, C. Donalek.

Industry Experience

  1. AI Innovation Intern

    Virtualitics, Inc June 2024 — August 2024

    As an intern familiar with Virtualitics' software stack, I prototyped and led the integration of OpenAI's API into the company's software. I produced demos of our generative AI capabilities for key stakeholders, including C-suite executives, shareholders, and potential new clients.

  2. Natural Language Processing Engineer

    Virtualitics, Inc February 2021 — August 2022

    Lead R&D engineer on a $750,000 (USD) SBIR Phase II contract (CONTRACT NUMBER: FA864922P0579) with the Air Force. Developed the Deficiency Report AI Monitory (DReAM) from the ground up, successfully completing all milestones. Managed Virtualitics' Artificial Intelligence engineering resources to enable full integration of DReAM within the Virtualitics AI Platform.

    Led the R&D effort to design, prototype, patent (PATENT NUMBER: US-20230004557-A1), and deploy into production a series of custom Explainable AI algorithms that produce plain-English descriptions of the clusters or communities of nodes detected in Network Graphs loaded into Virtualitics Explore. Presented the final product at the Caltech Explainable AI (XAI) Virtual Workshop.

    Prototyped, deployed, and patented (PATENT NUMBER: US-20220342873-A1) the VIP AI-Guided Suggestions feature, a statistical subsystem underlying the Natural Language Query system in VIP. It analyzes the dataset currently loaded into the software and, based on the dataset's statistics, recommends AI routines or specific visualizations that reveal crucial insights about the data.

Teaching

  1. LING 0001 - Introduction to Linguistics

    University of Pennsylvania Spring 2025
  2. LING 4000 - Tutorial in Linguistics

    University of Pennsylvania Fall 2024
  3. LING 0500 - Introduction to Formal Linguistics

    University of Pennsylvania Spring 2024
  4. LING 0001 - Introduction to Linguistics

    University of Pennsylvania Fall 2023
  5. 6.034 - Artificial Intelligence

    Massachusetts Institute of Technology Fall 2020
  6. 6.08 - Introduction to EECS via Interconnected Embedded Systems

    Massachusetts Institute of Technology Spring 2020
  7. 6.034 - Artificial Intelligence

    Massachusetts Institute of Technology Fall 2019
  8. 6.803/6.833 - The Human Intelligence Enterprise

    Massachusetts Institute of Technology Spring 2019

Professional & Extracurricular Affiliations

  1. Member, Association for Computational Linguistics

    2021–present
  2. Member, Linguistic Society of America

    2024–present
  3. Chair, The New Mind workshop & AI4GOOD Research Incubator

    2024–present
  4. Chair, 48th Annual Penn Linguistics Conference (PLC48)

    2023–2024
  5. Chair, 47th Annual Penn Linguistics Conference (PLC47)

    2022–2023
  6. Member, Fairmount Rowing Association

    2023–present
  7. Member, Wharton Crew Rowing Team

    2022–present
  8. Member, Treinta y Tres Delaware (Rueda de Casino) Dance and Performance Team

    2022–present
  9. Curriculum Director and Dance Instructor, MIT Casino Rueda

    2018–2020
  10. Team Member, NCAA Division 1 MIT Men's Lightweight Rowing Team

    2015–2019
  11. Member, Association of Puerto Rican Students at MIT

    2015–2021

Awards & Honors

  1. Google PhD Fellowship (UPenn Nominee)

    May 2025
  2. National GEM Consortium Employer Fellow (declined)

    November 2024
  3. UPenn Fontaine Fellow

    August 2022
  4. MIT Charles & Jennifer Johnson AI and Decision-Making MEng Thesis Award

    June 2022
  5. MIT Harold L. Hazen Teaching Award

    May 2021
  6. Hispanic Scholarship Fund (HSF)

    2016–2020
  7. Zeno Karl Schindler Summer School Grant

    May 2019
  8. MIT Excellence Award: Albert G. Hill Prize

    May 2018
  9. MIT EECS RE - Research & Innovation Scholar

    Aug 2017–Jan 2018
  10. Palantir Future Scholarship

    Oct 2017
