Hugging Face

a company that focuses on natural language processing (NLP) and artificial intelligence (AI) research.

Created: by Pradeep Gowda Updated: Nov 04, 2023 Tagged: huggingface · llm · ai

Hugging Face is a company that focuses on natural language processing (NLP) and artificial intelligence (AI) research. They are best known for their open-source library called “transformers,” which provides state-of-the-art general-purpose architectures for NLP tasks. These architectures include BERT, GPT, RoBERTa, T5, and others, which are pre-trained on large datasets and can be fine-tuned for specific tasks such as sentiment analysis, machine translation, question-answering, and more.

Hugging Face’s mission is to advance AI research and democratize access to powerful NLP tools for developers, researchers, and organizations. Their platform offers an ecosystem of resources, including pre-trained models, datasets, and community-contributed tools, making it easier for users to implement and experiment with cutting-edge NLP technologies.

Getting started instructions

2023-05-07: From their website, on creating a new account:

Getting started with our git and git-lfs interface

If you need to create a repo from the command line (skip if you created a repo from the website)

$ pip install huggingface_hub
$ huggingface-cli login
$ huggingface-cli repo create repo_name --type {model, dataset, space}
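
The same step can also be done from Python instead of the CLI. This is a minimal sketch, assuming `huggingface_hub` is installed and a login token is already saved. `create_repo` is the library's function; `check_repo_type` and `create_hub_repo` are helper names made up for this note, and the `{model, dataset, space}` in the command above means "pick one of the three".

```python
VALID_REPO_TYPES = ("model", "dataset", "space")


def check_repo_type(repo_type: str) -> str:
    """Return repo_type unchanged if it is one of the three kinds listed above."""
    if repo_type not in VALID_REPO_TYPES:
        raise ValueError(
            f"repo_type must be one of {VALID_REPO_TYPES}, got {repo_type!r}"
        )
    return repo_type


def create_hub_repo(repo_id: str, repo_type: str = "model", space_sdk=None):
    """Create repo_id on the Hub (needs network access and a saved login token)."""
    # Imported lazily so check_repo_type works without the library installed.
    from huggingface_hub import create_repo

    return create_repo(
        repo_id,
        repo_type=check_repo_type(repo_type),
        space_sdk=space_sdk,  # only relevant for Spaces, e.g. "gradio"
        exist_ok=True,  # do not fail if the repo already exists
    )
```

Calling `create_hub_repo("username/repo_name")` would then correspond to the CLI command with `--type model`; a Space would additionally need a `space_sdk` value.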

Clone your model or dataset locally

Make sure you have git-lfs installed (https://git-lfs.github.com)

$ git lfs install
$ git clone https://huggingface.co/username/repo_name
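
The clone URL above is the form used for a model repo. As far as I know, dataset and Space repos sit under a `datasets/` or `spaces/` prefix on the same host, so the URL depends on the repo type. A small sketch of that mapping follows; `clone_url` is an illustrative helper of mine, not a Hugging Face API.

```python
HUB = "https://huggingface.co"

# Model repos sit at the root; datasets and Spaces get a path prefix.
PREFIX = {"model": "", "dataset": "datasets/", "space": "spaces/"}


def clone_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the URL to pass to `git clone` for a Hub repo."""
    if repo_type not in PREFIX:
        raise ValueError(f"unknown repo_type: {repo_type!r}")
    return f"{HUB}/{PREFIX[repo_type]}{repo_id}"


print(clone_url("username/repo_name"))
print(clone_url("username/repo_name", "dataset"))
```

If only the files are needed and not a git working tree, `huggingface_hub.snapshot_download(repo_id=...)` is an alternative that downloads a snapshot into the local cache.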

Then add, commit and push any file you want, including large files

# save files via `.save_pretrained()` or move them here
$ git add .
$ git commit -m "commit from $USER"
$ git push
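
The add/commit/push sequence has a rough Python counterpart in `huggingface_hub`'s `upload_folder`, which uploads a local directory as a single commit. This is a hedged sketch, assuming the library is installed and you are logged in; `default_commit_message`, `push_folder`, and the `./my_model` path mentioned below are hypothetical names used only for illustration.

```python
import os


def default_commit_message() -> str:
    """Mirror the shell's "commit from $USER" message."""
    return f"commit from {os.environ.get('USER', 'unknown')}"


def push_folder(folder_path: str, repo_id: str, repo_type: str = "model"):
    """Upload every file under folder_path to the Hub repo in one commit."""
    # Lazy import: needs `pip install huggingface_hub`, network access, and a login.
    from huggingface_hub import upload_folder

    return upload_folder(
        folder_path=folder_path,
        repo_id=repo_id,
        repo_type=repo_type,
        commit_message=default_commit_message(),
    )
```

For instance, after `model.save_pretrained("./my_model")`, a call like `push_folder("./my_model", "username/repo_name")` would publish the saved files without going through git directly.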

In most cases, if you’re using one of the compatible libraries, your repo will then be accessible from code, through its identifier: username/repo_name.

For example, for a transformers model, anyone can load it with:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("username/repo_name")
model = AutoModel.from_pretrained("username/repo_name")

Source: https://huggingface.co/welcome