Welcome to the BUET CSE NLP Group. We are a group of researchers focusing on tackling problems in natural language processing and machine learning, specifically machine translation, multilingual NLP, and adapting NLP techniques for programming language and natural language understanding.
The rapid advancement of large language models, such as OpenAI's GPT-3, has raised concerns about their potential to generate biased or harmful outputs. Ensuring that these models align with human values is crucial for their responsible deployment in various domains. This research aims to investigate and develop techniques for aligning large language models with human values. The BUET CSE NLP Group is actively pretraining billion-scale GPT and T5 models in Bangla, which serve as the primary testbeds for the proposed research. The project explores directions such as instruction fine-tuning and reinforcement learning from human feedback (RLHF) to enhance the models' alignment with specific human values in the Bangla language context. It also focuses on developing interpretability frameworks that shed light on the decision-making processes of these models, enabling better understanding of and control over their outputs. In addition, we are exploring ideas from game theory, cognitive science, and behavioral economics to design objective/reward functions aligned with human rationales.
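As a rough illustration of the RLHF direction, the sketch below shows the pairwise (Bradley-Terry) ranking loss commonly used to train a reward model on human preference data; the function name and toy reward values are placeholders, not part of our implementation.

```python
import torch
import torch.nn.functional as F

def reward_ranking_loss(reward_chosen: torch.Tensor,
                        reward_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise preference loss: push the score of the human-preferred
    response above the score of the rejected response."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: scalar reward scores for a batch of 4 (chosen, rejected) pairs.
reward_chosen = torch.tensor([1.2, 0.3, 0.8, -0.1])
reward_rejected = torch.tensor([0.4, 0.5, -0.2, -0.3])
print(reward_ranking_loss(reward_chosen, reward_rejected).item())
```

The trained reward model then scores candidate generations during policy optimization; instruction fine-tuning, by contrast, uses the standard sequence-to-sequence cross-entropy on (instruction, response) pairs.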
This research explores the potential of retrieval-augmented large language models, aiming to enhance their performance in natural language understanding and generation tasks. Retrieval-based methods have shown promising results in improving the quality and relevance of generated responses in conversational AI systems. This project proposes techniques that combine the strengths of large language models with effective retrieval mechanisms to generate more contextually relevant and coherent responses. The research involves designing and training retrieval models that can efficiently retrieve relevant information from large knowledge bases or corpora to support the language model's generation process. Furthermore, the project explores methods for fine-tuning large language models with retrieval-based objectives, enabling them to leverage retrieved information for more accurate and informed responses. Potential applications include open-domain and/or cross-lingual question answering.
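A minimal retrieve-then-generate sketch is shown below, using TF-IDF similarity in place of a learned retriever; the corpus and question are toy placeholders, and the resulting prompt would be passed to a seq2seq language model for generation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy knowledge base; in practice this would be a large corpus or knowledge base.
corpus = [
    "The Padma Bridge spans the Padma River in Bangladesh.",
    "BanglaT5 is a sequence-to-sequence model pretrained on Bangla text.",
    "Retrieval-augmented generation conditions a language model on retrieved passages.",
]
question = "What does retrieval-augmented generation condition on?"

# Retrieve the top-k passages most similar to the question.
vectorizer = TfidfVectorizer().fit(corpus + [question])
scores = cosine_similarity(vectorizer.transform([question]),
                           vectorizer.transform(corpus))[0]
top_k = scores.argsort()[::-1][:2]

# Concatenate retrieved passages with the question as the generator's input.
context = " ".join(corpus[i] for i in top_k)
prompt = f"context: {context} question: {question}"
print(prompt)  # fed to a T5-style model for answer generation
```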
This research aims to investigate the integration of vision in large language models to enhance their capabilities in visual understanding and generation tasks. While large language models have achieved remarkable success in natural language processing, they often lack the ability to comprehend and generate content related to visual information. This project proposes to explore techniques that combine the power of large language models with computer vision methodologies, enabling the models to understand and generate text based on visual input. The research involves designing and training models that can effectively process and analyze images, extracting meaningful visual features to enrich the language model's understanding. Furthermore, the project explores methods for fine-tuning the language model using visual-based objectives, allowing it to generate text that is coherent and aligned with the visual content. The proposal has potential applications in areas such as visual question answering and multimodal summarization.
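One common way to integrate vision, sketched below under assumed dimensions, is to project features from a frozen vision encoder into the language model's embedding space and prepend them as soft "visual tokens"; the dimensions and variable names here are illustrative, not a description of our models.

```python
import torch
import torch.nn as nn

# Illustrative dimensions: a vision encoder emitting 768-d patch features,
# a language model with a 1024-d embedding space.
vision_dim, text_dim, num_patches, seq_len = 768, 1024, 49, 16

# A learned linear projection maps visual features into the text embedding space.
vision_to_text = nn.Linear(vision_dim, text_dim)

image_features = torch.randn(1, num_patches, vision_dim)   # from a frozen vision encoder
text_embeddings = torch.randn(1, seq_len, text_dim)        # embedded text prompt

# Prepend the projected visual tokens to the text tokens; the combined sequence
# is then processed by the language model's transformer layers.
visual_tokens = vision_to_text(image_features)
multimodal_input = torch.cat([visual_tokens, text_embeddings], dim=1)
print(multimodal_input.shape)  # torch.Size([1, 65, 1024])
```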
Synthetic paraphrase datasets are typically generated with round-trip machine translation, and these back-translation-based data generation approaches have been shown to produce appropriate paraphrases.
In this work, we directly distill the knowledge of translation models into a paraphrase generation model. We use two teachers, a forward translation model and a backward translation model, to distill two types of knowledge into the paraphrase model: the cross-attention distribution and the output distribution. In contrast to traditional knowledge distillation, here we have two teacher models instead of one, and the task of the student model differs from that of the teachers.
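For intuition, the sketch below shows the standard temperature-scaled KL term for distilling an output distribution from a teacher to a student, combined over two teachers by simple averaging; the shapes, weighting, and the analogous cross-attention term are illustrative assumptions, not our exact objective.

```python
import torch
import torch.nn.functional as F

def distill_kl(student_logits: torch.Tensor,
               teacher_logits: torch.Tensor,
               temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student output
    distributions (standard knowledge-distillation term, scaled by T^2)."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

# Toy shapes: batch of 2, target length 5, vocabulary of 100.
student_logits = torch.randn(2, 5, 100)
forward_teacher_logits = torch.randn(2, 5, 100)   # forward translation teacher
backward_teacher_logits = torch.randn(2, 5, 100)  # backward translation teacher

# Average the two teachers' distillation terms; an attention-distribution
# term would follow an analogous KL formulation over attention weights.
loss = 0.5 * (distill_kl(student_logits, forward_teacher_logits)
              + distill_kl(student_logits, backward_teacher_logits))
print(loss.item())
```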