
AI Data Scientist

  • Location:

    Brussels, Belgium

  • Contact:

    Michelle Kenneally

  • Job type:

    Contract

  • Contact phone:

    +353 21 485 7249

  • Sectors:

    Technology

  • Contact email:

    michelle_kenneally@oxfordcorp.com

Context and Objectives:
Recent advances in natural language processing and machine learning have led to the development of powerful large language models (LLMs). These LLMs exhibit exceptional text generation capabilities and have shown potential for diverse applications. The objective of the project is to design and implement AI models and pipelines that automatically convert, extract and analyse data (speech, text) for specific real-world data classification use cases. The application is built on a large language model backbone.

Tasks include:

  • Processing & conversion of speech datasets (pre-processing, sound capturing, feature extraction)
  • Build, train & validate AI models that extract content information from speech/text data using LLM prompt engineering
  • This includes topic/intent classification, summarization, speaker identification and dialogue state tracking, as well as applying time series segmentation methods, preferably using LLMs
  • Analyse conversation phases and speech & text patterns/features, perform statistical analysis


General requirements:

  • Self-motivated scientist/PhD graduate with a passion for AI-related projects on LLMs, natural language processing, speech recognition and data science.
  • Holds a PhD or master's degree in a relevant field: AI, natural language processing, data science or speech recognition.
  • Knowledge of and experience with time series analysis methods, LLMs and dialogue state tracking.
  • Prior experience working with deep learning projects, speech recognition software, LLM-based applications (e.g. conversational agents) and data mining.


Specific technical requirements:

  • Excellent experience, knowledge, and skills in AI models for speech recognition & speech transcription (Whisper), machine learning and data science (time series analysis)
  • Excellent experience, knowledge, and hands-on skills in programming languages, particularly Python and C++
  • Experienced in LangGraph, LangChain, RAG, Azure, Git, Docker, SQL and PyTorch



Musts:

  • Data Scientist
  • PhD preferred, or AI background
  • Speech Recognition: 2-3 years' experience
  • Deep Learning
  • LLMs

Pluses:

  • Python
  • Deep understanding of LLMs


Contract information:

  • Location: Fully remote, 40 hrs per week; client based in Belgium
  • Status: Open to freelancers
  • LOA: 12 months + potential extension
  • Start: ASAP