The Intelligence Almanac


This is where I will tell my friends way too much about me
  • Cleaning text data with regex

    Why regex is a superhero in the LLM world

    By Rahul Baburajan
    Regular expressions (regex) are indispensable for cleaning and preparing data, particularly for Large Language Models (LLMs) in natural language processing. Much of the available data, especially from large sources like Project Gutenberg, is unstructured and cluttered with extraneous information. Regex excels in such environments, providing a...
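
To make the teaser concrete, here is a minimal sketch of the kind of cleanup it describes, written with Python's standard `re` module. It is not taken from the post itself: the function names (`strip_gutenberg_boilerplate`, `normalize_whitespace`, `clean_text`) are hypothetical, and the `*** START OF ... ***` / `*** END OF ... ***` marker wording is an assumption about how Project Gutenberg files are usually delimited, so check it against the files you actually download.

```python
import re

# Assumed marker wording; real files may differ, so these patterns are kept loose.
START_RE = re.compile(
    r"\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*",
    re.IGNORECASE,
)
END_RE = re.compile(
    r"\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*",
    re.IGNORECASE,
)


def strip_gutenberg_boilerplate(raw: str) -> str:
    """Keep only the text between the start and end markers, if they are present."""
    start = START_RE.search(raw)
    body_start = start.end() if start else 0
    end = END_RE.search(raw, body_start)
    body_end = end.start() if end else len(raw)
    return raw[body_start:body_end]


def normalize_whitespace(text: str) -> str:
    """Unify line endings, collapse runs of spaces/tabs, and limit blank lines."""
    text = text.replace("\r\n", "\n")
    text = re.sub(r"[ \t]+", " ", text)     # runs of spaces or tabs become one space
    text = re.sub(r" +\n", "\n", text)      # drop trailing spaces at line ends
    text = re.sub(r"\n{3,}", "\n\n", text)  # at most one blank line in a row
    return text.strip()


def clean_text(raw: str) -> str:
    """Hypothetical end-to-end helper combining the two steps above."""
    return normalize_whitespace(strip_gutenberg_boilerplate(raw))


if __name__ == "__main__":
    sample = (
        "Header junk\n"
        "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n\n\n\n"
        "Call  me   Ishmael.\r\nSome years ago.\n\n"
        "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
        "License text"
    )
    print(clean_text(sample))
```

If a file has no markers, the sketch falls back to cleaning the whole text rather than raising an error, which seemed the safer default for a mixed corpus.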