Open to end-of-studies internship (stage de fin d'études) · 4-6 months

M2 Student in NLP.
Focused on Applied AI.

Final-year Master's (M2) student in NLP at Université de Lorraine (IDMC) with a focus on LLM fine-tuning, RAG pipelines, agentic workflows, and model interpretability. Work covers the full pipeline, from data to modeling, evaluation, and deployment.

Location Nancy, France · Open to relocation
Focus Applied NLP / ML Engineering

Technical Skills

GenAI & LLMs

  • Fine-tuning
  • RAG Pipelines
  • Agentic AI (LangChain)
  • Prompt Engineering

ML & NLP

  • PyTorch
  • Hugging Face
  • scikit-learn
  • SHAP / LIME

Engineering

  • Python
  • Docker
  • SQL / FAISS
  • AWS (S3)

Data & Viz

  • Tableau
  • Power BI
  • Excel
  • Gradio

AI Dev Tools

  • Claude Code / Codex
  • Antigravity
  • Cursor / Windsurf
  • n8n Automation

Featured Projects

Agentic Workflows

AnnotaLoop

Annotation time cut from 30 min to 5 min · Cross-platform desktop app

Problem

Manual data annotation is slow and becomes inconsistent at scale.

Solution

Desktop application that uses LLM suggestions to speed up labelling. Pick any provider (Mistral, OpenAI, Anthropic, or Ollama for offline use) and the app handles the rest while keeping the data local. Built with React and TypeScript on a Tauri backend.

Agentic AI · Ollama · React / TypeScript · Tauri

Multi-Agent Orchestrator

Visual task delegation with full execution trace

Problem

Multi-agent pipelines are hard to debug without visibility into what each agent is doing.

Solution

GUI for connecting a coordinator to sub-agents, assigning tools to each, and then observing task allocation and execution step by step.

LangChain · Agentic AI · Gemini API · Python

Fine-tuning

FLAN-T5 Fine-Tuning for News-to-Headline Generation

Fine-tuned FLAN-T5-Small on Gigaword for concise headline generation

Problem

Turning news text into headline-style summaries requires outputs that stay brief while preserving the main meaning.

Solution

Fine-tuned FLAN-T5-Small on the Gigaword dataset to rewrite news sentences as headline-style summaries, and built a Gradio interface for interactive testing.

FLAN-T5 · Hugging Face · PyTorch · Gradio · Gigaword

RAG & Interpretability

Retrieval-Augmented Document QA

Grounded answers with source citations · No hallucination

Problem

LLMs tend to hallucinate when asked about content that is not in their training data.

Solution

Document Q&A system that embeds uploaded files into a FAISS vector store and retrieves the most relevant chunks at query time, so answers are grounded in the actual source.

FAISS · LangChain · Gemini API · Python

French Embeddings Eval

85% accuracy using only 1% of dimensions

Problem

How language models encode grammatical features such as gender in their embeddings remains unclear.

Solution

Examined how FlauBERT encodes grammatical gender across 33K French nouns, using SHAP and LIME to identify which embedding dimensions carry that information. A small percentage of dimensions does most of the work.

FlauBERT · SHAP · LIME · PyTorch

Prompt Optimization

Prompt Optimization Playground

Side-by-side variant comparison with heuristic scoring

Problem

Manually comparing prompt variants across multiple LLM runs is tedious and error-prone.

Solution

Give it a task and it generates multiple prompt framings, runs them all on Gemini, and displays the results side by side with heuristic scores to help choose the variants that actually perform.

Python · Gradio · Gemini API · Prompt Engineering

Research & Analytics

Researchlytic

250M academic works · Interactive analytics dashboards

Problem

Analysing academic research trends by institution and author requires large-scale data retrieval and aggregation.

Solution

Web platform connected to the OpenAlex API that lets users filter trends by institution, author, or topic and view citation impact and open-access statistics in interactive dashboards. Built with PHP and JavaScript on WordPress.

PHP · JavaScript · WordPress · OpenAlex API · SQL

Experience

Pythosoft

Jun 2023 - Jul 2024

Web & NLP Developer · Full-time · On-site

  • Worked on client websites using WordPress, PHP, and JavaScript across a range of business projects.
  • Integrated chatbot and AI features into client websites to automate support and lead capture.
  • Contributed to data pipeline and analytics work on the Researchlytic platform.

Freelance Developer

Oct 2023 - Sep 2024

NLP & Web Projects · Self-employed

  • Registered freelancer with Pakistan Software Export Board (PSEB) · Reg. No. FL21/PSEB/2026/23160
  • Took on NLP and web automation projects for independent clients.

Education

MSc in NLP

2024 - 2026

Université de Lorraine (IDMC)

Generative AI, Prompt Engineering, Fine-tuning, IR, Deep Learning.

BSc in Computer Science

2020 - 2023

University of the People

Data Science, ML, Software Eng. (CGPA: 3.80/4.00)

Contact

Seeking an end-of-studies internship (stage de fin d'études) in Applied NLP or ML Engineering.

hello@tayyab.io

+33 7 45 75 05 83

Nancy, France · Open to relocation

Languages: English (Fluent), Urdu/Punjabi (Native), French (Basic)