Context and Objectives:
Recent advances in natural language processing and machine learning have led to the development of powerful large language models (LLMs). These models generate text exceptionally well and have shown potential across diverse applications. The objective of the project is to design and implement AI models and pipelines that automatically convert, extract and analyse data (speech, text) for specific real-world data classification use cases. The application is built on a large language model backbone.

Tasks include:

Processing & conversion of speech datasets (pre-processing, sound capturing, feature extraction)
Build, train & validate AI models that extract content information from speech/text data by using LLM prompt engineering
Includes:
- topic/intent classification, summarization, speaker identification, dialogue state tracking
- applying time series segmentation methods, preferably using LLMs
Analyse conversation phases and speech & text patterns/features, and perform statistical analysis

General requirements:

Self-motivated scientist/PhD graduate with a passion for AI-related projects on LLMs, natural language processing, speech recognition, and data science.
Holds a PhD or master's degree in a relevant field of AI, natural language processing, data science, or speech recognition.
Knowledge of and experience with time series analysis methods, LLMs, and dialogue state tracking.
Prior experience working on deep learning projects, speech recognition software, LLM-based applications (e.g. conversational agents), and data mining.

Specific technical requirements:

Excellent experience, knowledge, and skills in AI models for speech recognition & speech transcription (Whisper), machine learning, and data science (time series analysis)
Excellent experience, knowledge, and hands-on skills in programming languages, particularly Python and C++
Experienced in LangGraph, LangChain, RAG, Azure, Git, Docker, SQL, PyTorch

Musts:

Data Scientist
PhD preferred, or AI background
Speech Recognition - 2-3 years' experience
Deep Learning
LLMs

Pluses:

Python
Deep understanding of LLMs

Contract information:

Location: Fully remote, 40 hours per week - Client based in Belgium
Status: Open to Freelancers
LOA: 12 months + potential extension
Start: ASAP
