UG Opportunities in 2024

For student internships and short-term research positions (after BTech).

Topics

Audio LLMs

Present-day LLMs (such as ChatGPT) learn from text. Can we have them learn from audio instead, without text? We mean not just English speech but multilingual speech, and not just speech but also music.

To build your basics, read the following paper:

  • Baevski, A., Zhou, Y., Mohamed, A., & Auli, M. (2020). wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems, 33, 12449-12460.

Diffusion Models

Diffusion models generate high-quality images. Can we adapt them to generate images in a particular style more faithfully, given only a little data from that style?

To build your basics, read the following paper:

  • Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840-6851.

How to apply

If you are comfortable with the above papers, send an email to Prof. Arora with the subject line “[UG Project Appl] <Project Name>”. We are open to students from outside IITK too.

Guidelines

  • No PVF without tangible progress.
  • At least two semesters of work for LoR.