
All Questions

-1 votes
1 answer
42 views

Cannot install llama-index-embeddings-huggingface==0.1.3 because these package versions have conflicting dependencies

I am unable to install the HuggingFace embedding package. Getting the following error: ERROR: Cannot install llama-index-embeddings-huggingface==0.1.3, llama-index-embeddings-huggingface==0.1.4 and llama-index-...
Saurabh Verma
0 votes
1 answer
55 views

How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer?

I'm using the Hugging Face Trainer (or SFTTrainer) for fine-tuning, and I want to log the training loss at step 0 (before any training steps are executed). I know there's an eval_on_start option for ...
Charlie Parker
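A minimal sketch of one way to get that number, assuming a standard Trainer setup: run a single evaluation pass before train(), so the loss with untouched weights is logged before any optimizer step. Passing the training set as eval_dataset gives the pre-training loss on the training data specifically. The tiny model and toy dataset below are placeholders.

```python
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("prajjwal1/bert-tiny")  # placeholder model
data = Dataset.from_dict({"text": ["good", "bad"] * 8, "label": [1, 0] * 8})
data = data.map(lambda b: tok(b["text"], truncation=True, padding="max_length",
                              max_length=16), batched=True)

trainer = Trainer(
    model=AutoModelForSequenceClassification.from_pretrained("prajjwal1/bert-tiny"),
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=data,
    eval_dataset=data,  # evaluate on the training data for a "step 0" loss
)

trainer.log(trainer.evaluate())  # loss before any weight update
trainer.train()
```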
0 votes
0 answers
55 views

How to Compute Teacher-Forced Accuracy (TFA) for Hugging Face Models While Handling EOS Tokens?

I am trying to compute Teacher-Forced Accuracy (TFA) for Hugging Face models, ensuring the following: EOS Token Handling: The model should be rewarded for predicting the first EOS token. Ignoring ...
Charlie Parker
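For what this question describes, a hedged sketch of teacher-forced accuracy: run the gold sequence through the model, compare next-token argmaxes against the shifted labels, score the first EOS, and ignore everything after it. The masking scheme is one interpretation of the stated requirements, not a reference implementation; gpt2 is a placeholder model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def teacher_forced_accuracy(text: str) -> float:
    ids = tok(text + tok.eos_token, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits               # (1, seq, vocab)
    preds = logits[:, :-1].argmax(-1)            # prediction for token t+1
    targets = ids[:, 1:]
    eos_seen = (targets == tok.eos_token_id).cumsum(-1)
    # Score tokens up to and including the FIRST EOS; ignore everything after.
    keep = (eos_seen == 0) | ((targets == tok.eos_token_id) & (eos_seen == 1))
    correct = (preds == targets) & keep
    return correct.sum().item() / keep.sum().item()

print(teacher_forced_accuracy("The capital of France is Paris."))
```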
-1 votes
0 answers
34 views

Spring AI with Pinecone using ONNX Embedding Error

I am using Spring AI with Pinecone vector storage, with OpenAI embeddings / ONNX embeddings; in both cases I got the same issue. I referred to these documentations to implement things. Referred ...
mahes waran
0 votes
1 answer
66 views

Error ("bus error") running the simplest example on Hugging Face Transformers Pipeline (Macos M1)

I'm trying to follow the quick tour example here: https://huggingface.co/docs/transformers/quicktour and I'm getting a "bus error". My env is: macOS Sonoma 14.7, Apple M1 Max chip, Python 3....
Roy Ca • 491
2 votes
1 answer
107 views

HuggingFace model loaded from the disk generates gibberish

I trained a LongT5 model using Huggingface's tooling. When I use the trained model directly after training, inference works as expected: I get good quality output, as expected from the training ...
gphilip • 706
0 votes
0 answers
67 views

How to serve a bitsandbytes model with SGLang

I'm trying to serve the model unsloth/Llama-3.1-Nemotron-70B-Instruct-bnb-4bit, but I'm getting an error that I don't understand at all. This is my command: sudo docker run --gpus all \ ...
lauther27
0 votes
0 answers
39 views

Diffusers pipeline embeddings: not enough values to unpack

I wanted to generate an image using text embeddings instead of text as input, using CLIP to tokenize & embed. The code so far: from transformers import AutoTokenizer, CLIPTextModelWithProjection ...
Felox • 542
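One plausible cause, sketched below: StableDiffusionPipeline's prompt_embeds argument expects the text encoder's per-token hidden states of shape (batch, seq_len, hidden), whereas CLIPTextModelWithProjection returns a pooled projection, a classic source of "not enough values to unpack". The model id and CUDA device are assumptions.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

tokens = pipe.tokenizer(
    "an astronaut riding a horse", padding="max_length",
    max_length=pipe.tokenizer.model_max_length, return_tensors="pt",
).to("cuda")
with torch.no_grad():
    # Per-token hidden states, (1, 77, 768) -- what prompt_embeds expects.
    prompt_embeds = pipe.text_encoder(tokens.input_ids)[0]

image = pipe(prompt_embeds=prompt_embeds).images[0]
image.save("astronaut.png")
```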
2 votes
0 answers
62 views

How to Log Custom Metrics with Metadata in Hugging Face Trainer during Evaluation?

I'm working on a sentence regression task using Hugging Face’s Trainer. Each sample consists of: input_ids: The tokenized sentence. labels: A numerical scalar target (for regression). metadata: A ...
enter_thevoid
3 votes
1 answer
263 views

How to Load a 4-bit Quantized VLM Model from Hugging Face with Transformers?

I'm new to quantization and working with visual language models (VLM). I'm trying to load a 4-bit quantized version of the Ovis1.6-Gemma model from Hugging Face using the transformers library. I ...
meysam • 83
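A hedged sketch of the standard transformers route for 4-bit loading with bitsandbytes; whether it works for Ovis1.6-Gemma specifically depends on that repo's custom modeling code, and the checkpoint id below is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "AIDC-AI/Ovis1.6-Gemma2-9B",   # assumed checkpoint id
    quantization_config=bnb,
    trust_remote_code=True,        # Ovis ships custom modeling code
    device_map="auto",
)
```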
0 votes
0 answers
43 views

Why is there a sudden performance degradation of a model uploaded to Hugging Face and then reloaded?

I was practicing fine-tuning the llama3 model on Kaggle and then uploading the tuned model to Hugging Face. I didn't end the session after training on the fine-tuning dataset I created. The response ...
ggapsang
0 votes
1 answer
31 views

How can I adjust the performance of a tokenizer?

I'm working with the tokenizer from Hugging Face's transformers library. The tokenizer works fine in most cases, but in some cases it does not. I'm wondering if I can "adjust" (not train a ...
IMAPOTATO
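If "adjust" means teaching the tokenizer a handful of domain terms without retraining it, a minimal sketch: add them as whole tokens, then resize the model's embedding matrix to match. The example terms are placeholders.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Placeholder domain terms that should stay unsplit.
added = tokenizer.add_tokens(["hyperparam", "embeddings2d"])
model.resize_token_embeddings(len(tokenizer))  # grow embeddings to match

print(added, tokenizer.tokenize("hyperparam tuning"))  # stays one token
```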
0 votes
0 answers
23 views

I have a text-sql-text project and am facing a few issues: a max_token issue, sometimes wrong answers, and slow results

I am trying to create a text-sql-text project with a huggingface model and a sample DB. I'm getting a few issues and doubts, listed right after the code snippet. Code snippet: from transformers import ...
Abhra Sarkar
0 votes
1 answer
58 views

Not able to download and save huggingface model - jinaai/jina-reranker-v2-base-multilingual

I am trying to download and save the following model from huggingface for later use. Here is the snippet. from transformers import AutoModelForSequenceClassification, AutoTokenizer,...
Abhra Sarkar
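A hedged sketch of saving a local copy for later use; this particular repo ships custom modeling code, so trust_remote_code=True is typically required. The output directory is a placeholder.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "jinaai/jina-reranker-v2-base-multilingual"
model = AutoModelForSequenceClassification.from_pretrained(
    name, trust_remote_code=True   # repo uses custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(name)

model.save_pretrained("./jina-reranker-v2")      # placeholder local path
tokenizer.save_pretrained("./jina-reranker-v2")
```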
1 vote
1 answer
123 views

How to add EOS when training T5?

I'm a little puzzled about where (and if) EOS tokens are being added when using Huggingface's trainer classes to train a T5 (LongT5, actually) model. The data set contains pairs of text like this: from to ...
gphilip • 706
0 votes
0 answers
45 views

How to use Inception V3 as Backbone for Vision Transformer?

I’m looking to create a Vision Transformer (ViT) using Inception V3 as the backbone. For an input image of size 500x500x3, Inception V3 outputs feature maps with dimensions [1, 2048, 14, 14]. How can ...
Asif Khan • 1,278
0 votes
0 answers
20 views

AI responds differently after exporting to exe using PyInstaller

I had just finished the coding part for my AI girlfriend. But when I published it to my team's group page, I realized not all my friends had Python on their computers, and making them install ...
Galib Amir
0 votes
1 answer
123 views

To use a Hugging Face model, what are shards (the same as a checkpoint)? Do I need them all?

I want to use this for a zero-shot QA model: https://huggingface.co/nvidia/NVLM-D-72B/tree/main When I download it using the following, it starts downloading 46 shards. There are 46 bin files in the ...
Vexbeex
0 votes
0 answers
42 views

ValueError: If no `decoder_input_ids` or `decoder_inputs_embeds` are passed, `input_ids` cannot be `None`

I am trying to get the decoder hidden state of the florence 2 model. I was following this https://huggingface.co/microsoft/Florence-2-large/blob/main/modeling_florence2.py to understand the parameters ...
user10418143
0 votes
0 answers
57 views

Not able to predict using Hugging Face Transformers Trainer class

I am learning to finetune a pre-trained model, but when I try to test the finetuned model, I get this error: IndexError: tuple index out of range. This is my Google Colab notebook, and I am ...
Lijin Durairaj
0 votes
1 answer
49 views

AutoModelForSequenceClassification loss does not decrease

from datasets import load_dataset from torch.utils.data import DataLoader from transformers import AutoTokenizer, AutoModelForSequenceClassification import torch from tqdm import tqdm def ...
naivebird
2 votes
1 answer
214 views

How does the data splitting actually work in Multi GPU Inference for Accelerate when used in a batched inference setting?

I followed the code given in this GitHub issue and this Medium blog. I ran the batched experiment with process = 1 and process = 4; it gave me the result, but I'm confused right now because I thought the ...
Deshwal • 4,132
0 votes
0 answers
83 views

Getting an error in transformers.Trainer: TypeError: Object of type set is not JSON serializable

I want to fine-tune the model "FacebookAI/roberta-large" on the disaster-tweets dataset. I have a problem with transformers.Trainer and I got this error: TypeError ...
Zahra Reyhanian
0 votes
0 answers
55 views

BERTopic RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0, details in the post

I am trying to use the BERTopic library with a custom text generation model using the transformers library. However, I am getting this RuntimeError. I have tried to specify the device as 0 (GPU) in ...
Shivam Tawari
0 votes
0 answers
55 views

How to create a custom model with Huggingface's PreTrainedModel

I'm trying to create a simple model with the code below, taken almost directly from the documentation, and I'm receiving an error: import torch from transformers import PretrainedConfig, PreTrainedModel ...
Tobi • 1
0 votes
0 answers
67 views

Unexpected memory usage when training transformer models with LoRA and quantization

I am training models using transformers, accelerate, bitsandbytes and PEFT. I recently got a second graphics card (RTX 3090) and am seeing a decent increase in training speed. However when ...
gazm2k5 • 490
0 votes
1 answer
289 views

OutOfMemoryError: CUDA out of memory while using compute_metrics function in Hugging Face Trainer

I'm encountering a CUDA out of memory error when using the compute_metrics function with the Hugging Face Trainer during model evaluation. My GPU is running out of memory while trying to compute the ...
KainnT • 13
0 votes
0 answers
38 views

ValueError: Need either a `state_dict` or a `save_folder` containing offloaded weights while trying to run the transformer.pipeline for Llama-3.1-8B

I tried to run the "meta-llama/Meta-Llama-3.1-8B-Instruct" model on a Mac M2, but I'm getting an error while running the code below: model = "meta-llama/Meta-Llama-3.1-8B-Instruct" tokenizer = ...
kamal Dwivedi
0 votes
1 answer
82 views

How to reinitialize GPT-2 XL from scratch in HuggingFace?

I'm trying to confirm that my GPT-2 model is being trained from scratch, rather than using any pre-existing pre-trained weights. Here's my approach: Load the pre-trained GPT-2 XL model: I load a pre-...
Charlie Parker
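A minimal sketch of the usual way to get a randomly initialized copy: load only the config and build the model with from_config, which never touches the pretrained weights.

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gpt2-xl")     # architecture hyperparameters only
model = AutoModelForCausalLM.from_config(config)   # fresh random initialization
print(sum(p.numel() for p in model.parameters()))  # ~1.5B params, untrained
```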
1 vote
1 answer
213 views

How to ask multiple choice questions using pipeline

I want to ask multiple choice questions to the Hugging Face transformer pipeline; however, there does not seem to be a good task choice for this. I am referencing this: qa_pipeline = pipeline("...
θ_enthusiast
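One hedged workaround: treat the answer options as candidate labels for the zero-shot-classification task, since the stock QA pipelines are extractive rather than multiple-choice. The model and question below are examples.

```python
from transformers import pipeline

clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
question = "Which planet is known as the Red Planet?"
options = ["Mars", "Venus", "Jupiter"]

result = clf(question, candidate_labels=options)
print(result["labels"][0])   # highest-scoring option
```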
0 votes
0 answers
140 views

HuggingFace: Efficient Large-Scale Embedding Extraction for DNA Sequences Using Transformers

I have a very large dataframe (60+ million rows) and would like to use a transformer model to grab the embeddings for these rows (DNA sequences). Basically, this involves tokenizing first, then I ...
youtube • 404
0 votes
0 answers
169 views

What does the "AttributeError: 'NoneType' object has no attribute 'cget_managed_ptr'" mean?

I'm trying to train a model with very standard HF code I've used before: import os from transformers import Trainer, TrainingArguments, AutoModelForCausalLM, AutoTokenizer from datasets import ...
Charlie Parker
0 votes
0 answers
28 views

Why does loading llama2 take about 30 GB of GPU RAM?

I am trying to finetune llama2 on a self-created dataset. The text in the dataset is quite long, so I used rope_scaling=8 for llama initialization. I also used 4-bit and qlora to save memory. However, ...
Rain Gu
0 votes
1 answer
139 views

Received server error (500) while deploying HuggingFace model on SageMaker

I've successfully fine-tuned a sentence-transformers model, all-MiniLM-L12-v2, on our data in SageMaker Studio, and the model was saved in S3 as a model.tar.gz. I want to deploy this model for inference ...
Yoan B. M.Sc • 1,505
0 votes
1 answer
45 views

BertTokenizer vocab_size remains unchanged after adding tokens

I am using the HuggingFace BertTokenizer and adding some tokens to it. Here is the code: from transformers import BertTokenizer tokenizer = BertTokenizer.from_pretrained('fnlp/bart-base-chinese') print(...
Raptor • 54.1k
0 votes
0 answers
84 views

How do I run this model in HuggingFace from Nvidia and Mistral?

The model is nvidia/Mistral-NeMo-12B-Instruct, and the link on HuggingFace is nvidia/Mistral-NeMo-12B-Instruct. Most model pages on HuggingFace have example Python code, but this model page doesn't have ...
abbas-h • 440
1 vote
0 answers
30 views

BPE tokenizer add_tokens overlap with trained tokens

I am training a BPE tokenizer from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset. from datasets import load_dataset from tokenizers import models, ...
meliksahturker
-1 votes
1 answer
92 views

IndexError: list index out of range when trying to predict from the fine-tuned model using Huggingface

I am trying to learn how to fine-tune a pretrained model and use it. This is my code: from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer from ...
Lijin Durairaj
1 vote
0 answers
121 views

How to load LoRA weights for image classification model

I trained a model like below. model_name = 'owkin/phikon' model = AutoModelForImageClassification.from_pretrained( model_name, label2id=label2id, id2label=id2label, ...
Wtow • 108
0 votes
0 answers
67 views

What is the difference, if any, between model.half() and model.to(dtype=torch.float16) in huggingface-transformers?

Example: # pip install transformers from transformers import AutoModelForTokenClassification, AutoTokenizer # Load model model_path = 'huawei-noah/TinyBERT_General_4L_312D' model = ...
Franck Dernoncourt
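For parameter casting the two are documented as equivalent: Module.half() is shorthand for .to(torch.float16). A quick check on the model from the question:

```python
import torch
from transformers import AutoModelForTokenClassification

model_path = "huawei-noah/TinyBERT_General_4L_312D"
a = AutoModelForTokenClassification.from_pretrained(model_path).half()
b = AutoModelForTokenClassification.from_pretrained(model_path).to(
    dtype=torch.float16
)
print(all(p.dtype == torch.float16 for p in a.parameters()))  # True
print(all(p.dtype == torch.float16 for p in b.parameters()))  # True
```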
-2 votes
1 answer
1k views

I load a float32 Hugging Face model, cast it to float16, and save it. How can I load it as float16?

I load a huggingface-transformers float32 model, cast it to float16, and save it. How can I load it as float16? Example: # pip install transformers from transformers import ...
Franck Dernoncourt
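A minimal sketch of the round trip: save_pretrained writes the cast fp16 weights, and passing torch_dtype=torch.float16 (or "auto") to from_pretrained keeps them in fp16 instead of upcasting to the config's default dtype. The local path is a placeholder.

```python
import torch
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(
    "huawei-noah/TinyBERT_General_4L_312D"
).half()                                  # cast float32 -> float16
model.save_pretrained("./tinybert-fp16")  # weights saved in fp16

reloaded = AutoModelForTokenClassification.from_pretrained(
    "./tinybert-fp16", torch_dtype=torch.float16
)
print(next(reloaded.parameters()).dtype)  # torch.float16
```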
0 votes
1 answer
399 views

Size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint - Huggingface PyTorch

I want to finetune an LLM. I am able to successfully finetune the LLM, but when I reload the model after saving, I get an error. Below is the code: import argparse import numpy as np import torch from datasets ...
Masthan • 685
1 vote
0 answers
143 views

HuggingFace pipeline doesn't use multiple GPUs

I made a RAG app that answers user questions based on provided data; it works fine on a single GPU. I want to deploy it on multiple GPUs (4 T4s), but I always get CUDA out of memory ...
Cihan Yalçın
0 votes
1 answer
239 views

How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?

I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized Unfortunately, CXR-BERT-...
Pablo Messina
0 votes
0 answers
101 views

Issue with Loading BLIP Processor and Model for Image Captioning

I'm experiencing an issue with loading the BLIP processor and model for image captioning using the Salesforce/blip-image-captioning-base model. My script seems to get stuck while attempting to load ...
mad • 11
-1 votes
1 answer
302 views

Optimizing an LLM Using DPO: nan Loss Values During Evaluation

I want to optimize an LLM based on DPO. When I try to train and evaluate the model, there are nan values in the evaluation results. import torch from transformers import AutoModelForCausalLM, ...
Masthan • 685
0 votes
0 answers
426 views

Llama-3-Instruct with Langchain keeps talking to itself

I am trying to eliminate this self-chattiness of the Llama3-Instruct Model with Langchain implementation. I am following several methods found over the internet. But no solution yet. Can anyone please ...
Arif Hamim
0 votes
1 answer
173 views

Why aren't my metrics showing in SageMaker (CloudWatch)?

I'm training an S-BERT model in SageMaker, using the Hugging Face library. I've followed the HF tutorials on how to define metrics to be tracked in the huggingface_estimator, yet when my model is done ...
Yoan B. M.Sc • 1,505
0 votes
0 answers
54 views

MT-Bench evaluation of a model using pre-generated model answers

I want to find the MT-Bench score of an LLM (say EleutherAI/pythia-1b). I was able to run the command python gen_model_answer.py --model-path EleutherAI/pythia-1b --model-id pythia-1b to generate answers ...
Masthan • 685
1 vote
0 answers
122 views

Why do I get two different responses: one from the Inference API widget on Hugging Face and another when I run the API locally?

I recently discovered Hugging Face and have been trying to work with it. When using a text-to-text model, I ran into an issue: no matter which model, I would get a different response when trying the ...
Brown Canadian
