All Questions
Tagged with huggingface-transformers and huggingface
433 questions
-1
votes
1
answer
42
views
Cannot install llama-index-embeddings-huggingface==0.1.3 because these package versions have conflicting dependencies
I am unable to install the HuggingFace embedding package.
I'm getting the following error:
ERROR: Cannot install llama-index-embeddings-huggingface==0.1.3, llama-index-embeddings-huggingface==0.1.4 and llama-index-...
0
votes
1
answer
55
views
How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer?
I'm using the Hugging Face Trainer (or SFTTrainer) for fine-tuning, and I want to log the training loss at step 0 (before any training steps are executed). I know there's an eval_on_start option for ...
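One way to get an initial loss before any optimizer step is to run a single forward pass on the first training batch by hand. A minimal sketch, assuming a standard Trainer whose collator returns tensors (the helper name is illustrative, not part of the Trainer API):

    import torch

    def log_initial_loss(trainer):
        # grab the first batch from the Trainer's own dataloader
        batch = next(iter(trainer.get_train_dataloader()))
        batch = {k: v.to(trainer.model.device) for k, v in batch.items()}
        trainer.model.eval()
        with torch.no_grad():
            loss = trainer.model(**batch).loss  # forward pass only, no update
        trainer.model.train()
        print({"step": 0, "train_loss": loss.item()})

Calling trainer.evaluate() before trainer.train() is the other common workaround, but it reports the evaluation loss rather than the training loss.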
0
votes
0
answers
55
views
How to Compute Teacher-Forced Accuracy (TFA) for Hugging Face Models While Handling EOS Tokens?
I am trying to compute Teacher-Forced Accuracy (TFA) for Hugging Face models, ensuring the following:
EOS Token Handling: The model should be rewarded for predicting the first EOS token.
Ignoring ...
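For reference, a minimal sketch (not the asker's code) of teacher-forced accuracy with that EOS handling: positions up to and including the first EOS count, everything after it is ignored, and pad_id=-100 assumes the usual ignore-index convention.

    import torch

    def teacher_forced_accuracy(logits, labels, eos_id, pad_id=-100):
        preds = logits[:, :-1, :].argmax(-1)        # position t predicts token t+1
        targets = labels[:, 1:]
        is_eos = (targets == eos_id).int()
        eos_before = is_eos.cumsum(dim=1) - is_eos  # EOS tokens strictly before position t
        mask = (eos_before == 0) & (targets != pad_id)
        correct = (preds == targets) & mask
        return correct.sum().item() / mask.sum().item()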
-1
votes
0
answers
34
views
Spring AI with Pinecone using ONNX embeddings error
I am using Spring AI with Pinecone vector storage and OpenAI / ONNX embeddings.
In both cases I get the same issue.
I referred to these documentations to implement this:
Referred ...
0
votes
1
answer
66
views
Error ("bus error") running the simplest example on Hugging Face Transformers Pipeline (macOS M1)
I'm trying to follow the quick tour example here: https://huggingface.co/docs/transformers/quicktour
and I'm getting a "bus error".
My env is:
macOS Sonoma 14.7, Apple M1 Max chip
Python 3....
2
votes
1
answer
107
views
HuggingFace model loaded from the disk generates gibberish
I trained a LongT5 model using Hugging Face's tooling.
When I use the trained model directly after training, inference works as expected and I get good quality output, as expected from the training ...
0
votes
0
answers
67
views
How to serve a bitsandbytes model with SGLang
I'm trying to serve the model unsloth/Llama-3.1-Nemotron-70B-Instruct-bnb-4bit,
but I'm getting an error that I don't understand at all.
This is my command:
sudo docker run --gpus all \
...
0
votes
0
answers
39
views
Diffusers pipeline embeddings: not enough values to unpack
I want to generate an image using text embeddings instead of raw text as input, using CLIP to tokenize and embed.
The code so far:
from transformers import AutoTokenizer, CLIPTextModelWithProjection
...
2
votes
0
answers
62
views
How to Log Custom Metrics with Metadata in Hugging Face Trainer during Evaluation?
I'm working on a sentence regression task using Hugging Face’s Trainer. Each sample consists of:
input_ids: The tokenized sentence.
labels: A numerical scalar target (for regression).
metadata: A ...
3
votes
1
answer
263
views
How to Load a 4-bit Quantized VLM Model from Hugging Face with Transformers?
I'm new to quantization and working with visual language models (VLM). I'm trying to load a 4-bit quantized version of the Ovis1.6-Gemma model from Hugging Face using the transformers library. I ...
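The usual route for 4-bit loading in transformers is a BitsAndBytesConfig passed to from_pretrained. A sketch under stated assumptions: the model id below is a stand-in for whichever Ovis1.6-Gemma checkpoint is being loaded, and custom architectures like Ovis typically need trust_remote_code=True.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        "AIDC-AI/Ovis1.6-Gemma2-9B",  # assumed/placeholder id for the checkpoint in question
        quantization_config=bnb_config,
        device_map="auto",
        trust_remote_code=True,
    )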
0
votes
0
answers
43
views
Why is there a sudden performance degradation of a model uploaded to Hugging Face and then reloaded?
I was practicing fine-tuning the llama3 model on Kaggle and then uploading the tuned model to Hugging Face.
I didn't end the session after training on the fine-tuning dataset I created. The response ...
0
votes
1
answer
31
views
How can I adjust the performance of the tokenizer?
I'm working with the tokenizer from Hugging Face's transformers library. The tokenizer works fine in most cases, but in some cases, it does not.
I'm wondering if I can "adjust" (not train a ...
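If "adjust" means steering how particular strings are split without retraining, one lightweight option is registering extra tokens. A minimal sketch (the model name and token are placeholders):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    tokenizer.add_tokens(["mydomainterm"])  # now kept as a single token instead of subwords
    # any model paired with this tokenizer must then grow its embedding matrix:
    # model.resize_token_embeddings(len(tokenizer))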
0
votes
0
answers
23
views
Text-to-SQL-to-text project: max_token issues, occasional wrong answers, and slow results
I am trying to create a text-to-SQL-to-text project with a Hugging Face model and a sample DB. I'm running into a few issues and doubts, listed right after the code snippet.
Code Snippet:
from transformers import ...
0
votes
1
answer
58
views
Not able to download and save huggingface model - jinaai/jina-reranker-v2-base-multilingual
I am trying to download and save the following model from huggingface for later use. Here is the snippet.
from transformers import AutoModelForSequenceClassification,
AutoTokenizer,...
1
vote
1
answer
123
views
How to add EOS when training T5?
I'm a little puzzled about where (and whether) EOS tokens are added when using Hugging Face's trainer classes to train a T5 (LongT5, actually) model.
The data set contains pairs of text like this:
from
to
...
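A quick check that is easy to run: T5-family tokenizers append the EOS token (</s>) themselves, so no manual insertion is needed when the data is tokenized with the model's tokenizer. A sketch with t5-small standing in for LongT5:

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("t5-small")
    ids = tok("translate English to German: hello").input_ids
    print(ids[-1] == tok.eos_token_id)  # True: EOS is appended automatically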
0
votes
0
answers
45
views
How to use Inception V3 as Backbone for Vision Transformer?
I’m looking to create a Vision Transformer (ViT) using Inception V3 as the backbone. For an input image of size 500x500x3, Inception V3 outputs feature maps with dimensions [1, 2048, 14, 14].
How can ...
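For what it's worth, the standard bridge is the hybrid-ViT recipe: flatten the CNN feature map into a token sequence and project it to the transformer width. An illustrative sketch (768 is an assumed ViT hidden size, not from the question):

    import torch
    import torch.nn as nn

    feats = torch.randn(1, 2048, 14, 14)       # Inception V3 output shape from the question
    tokens = feats.flatten(2).transpose(1, 2)  # [1, 196, 2048]: one token per spatial cell
    proj = nn.Linear(2048, 768)                # project to the transformer hidden size
    tokens = proj(tokens)                      # [1, 196, 768], ready for a transformer encoder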
0
votes
0
answers
20
views
AI responds differently after exporting to exe using PyInstaller
I had just finished the coding part for my AI girlfriend. But when I published it to my team's group page, I realized not all my friends had Python on their computers, and making them install ...
0
votes
1
answer
123
views
To use a Hugging Face model, what are shards (are they the same as a checkpoint)? Do I need all of them?
I want to use this for a zero-shot QA model: https://huggingface.co/nvidia/NVLM-D-72B/tree/main
When I download it using the following, it starts downloading 46 shards. There are 46 bin files in the ...
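Shards are slices of a single checkpoint, split to keep individual files small; they are not alternative versions, so all of them are needed to materialize the full weights. A sketch of fetching the whole repository up front with huggingface_hub:

    from huggingface_hub import snapshot_download

    # downloads every shard (plus config and tokenizer files) into the local cache
    path = snapshot_download("nvidia/NVLM-D-72B")
    print(path)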
0
votes
0
answers
42
views
ValueError: If no `decoder_input_ids` or `decoder_inputs_embeds` are passed, `input_ids` cannot be `None`
I am trying to get the decoder hidden state of the Florence-2 model. I was following this https://huggingface.co/microsoft/Florence-2-large/blob/main/modeling_florence2.py to understand the parameters ...
0
votes
0
answers
57
views
Not able to predict using Hugging Face Transformers Trainer class
I am learning to fine-tune a pre-trained model, but when I try to test the fine-tuned model, I get this error:
IndexError: tuple index out of range
This is my Google Colab notebook and I am ...
0
votes
1
answer
49
views
AutoModelForSequenceClassification loss does not decrease
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
from tqdm import tqdm
def ...
2
votes
1
answer
214
views
How does the data splitting actually work in Multi GPU Inference for Accelerate when used in a batched inference setting?
I followed the code given in this github issue and this medium blog
I ran the batched experiment with process = 1 and process = 4. It gave me results, but I'm confused right now because I thought the ...
0
votes
0
answers
83
views
Error in transformers.Trainer: TypeError: Object of type set is not JSON serializable
I want to fine-tune the model "FacebookAI/roberta-large" on disaster-tweets dataset. I have problem in transformers.Trainer and I got this error:
TypeError ...
0
votes
0
answers
55
views
BERTopic RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0, details in the post
I am trying to use the BERTopic library with a custom text generation model using the transformers library. However, I am getting this RuntimeError. I have tried to specify the device as 0 (GPU) in ...
0
votes
0
answers
55
views
How to create a custom model with Hugging Face PreTrainedModel
I'm trying to create a simple model with the code below, taken almost directly from the documentation, and I'm receiving an error:
import torch
from transformers import PretrainedConfig, PreTrainedModel
...
0
votes
0
answers
67
views
Unexpected memory usage when training transformer models with LoRA and quantization
I am training models using transformers, accelerate, bitsandbytes and PEFT. I recently got a second graphics card (RTX 3090) and am seeing a decent increase in training speed. However when ...
0
votes
1
answer
289
views
OutOfMemoryError: CUDA out of memory while using compute_metrics function in Hugging Face Trainer
I'm encountering a CUDA out of memory error when using the compute_metrics function with the Hugging Face Trainer during model evaluation. My GPU is running out of memory while trying to compute the ...
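Two commonly suggested mitigations (not necessarily the accepted answer here) are to shrink the tensors the Trainer accumulates during evaluation and to flush them to CPU more often. A sketch:

    import torch
    from transformers import TrainingArguments

    def preprocess_logits_for_metrics(logits, labels):
        # keep only argmax predictions instead of accumulating full logits on GPU
        return torch.argmax(logits, dim=-1)

    args = TrainingArguments(
        output_dir="out",
        eval_accumulation_steps=16,  # move accumulated tensors to CPU every 16 eval steps
    )
    # then: Trainer(..., args=args, preprocess_logits_for_metrics=preprocess_logits_for_metrics)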
0
votes
0
answers
38
views
ValueError: Need either a `state_dict` or a `save_folder` containing offloaded weights while trying to run the transformer.pipeline for Llama-3.1-8B
I tried to run the "meta-llama/Meta-Llama-3.1-8B-Instruct" model on a Mac M2, but I get an error while running the code below:
model = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = ...
0
votes
1
answer
82
views
How to reinitialize from scratch GPT2 XL in HuggingFace?
I'm trying to confirm that my GPT-2 model is being trained from scratch, rather than using any pre-existing pre-trained weights. Here's my approach:
Load the pre-trained GPT-2 XL model: I load a pre-...
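The unambiguous way to start from random weights is to build the model from its config rather than from the checkpoint. A minimal sketch:

    from transformers import AutoConfig, AutoModelForCausalLM

    config = AutoConfig.from_pretrained("gpt2-xl")    # architecture/hyperparameters only
    model = AutoModelForCausalLM.from_config(config)  # freshly initialized, no pretrained weights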
1
vote
1
answer
213
views
How to ask multiple choice questions using pipeline
I want to ask multiple choice questions to the Hugging Face transformer pipeline; however, there does not seem to be a good task choice for this.
I am referencing this:
qa_pipeline = pipeline("...
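One common workaround (an assumption on my part, not necessarily the answer given) is to treat the answer options as candidate labels for zero-shot classification rather than using an extractive QA task:

    from transformers import pipeline

    clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    result = clf(
        "What is the capital of France?",
        candidate_labels=["Paris", "London", "Berlin"],
    )
    print(result["labels"][0])  # highest-scoring choice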
0
votes
0
answers
140
views
HuggingFace: Efficient Large-Scale Embedding Extraction for DNA Sequences Using Transformers
I have a very large dataframe (60+ million rows of DNA sequences) for which I would like to extract embeddings with a transformer model. Basically, this involves tokenizing first, then I ...
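The usual shape of the solution is a no-grad batched loop with pooled hidden states collected incrementally. A sketch under stated assumptions: a generic BERT checkpoint stands in for the DNA model, and mean pooling stands in for whatever pooling that model recommends.

    import torch
    from transformers import AutoTokenizer, AutoModel

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # placeholder checkpoint
    model = AutoModel.from_pretrained("bert-base-uncased").eval().to(device)

    @torch.no_grad()
    def embed(seqs, batch_size=256):
        chunks = []
        for i in range(0, len(seqs), batch_size):
            enc = tok(seqs[i:i + batch_size], padding=True, truncation=True,
                      return_tensors="pt").to(device)
            # mean-pool token embeddings into one vector per sequence, keep on CPU
            chunks.append(model(**enc).last_hidden_state.mean(dim=1).cpu())
        return torch.cat(chunks)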
0
votes
0
answers
169
views
What does the "AttributeError: 'NoneType' object has no attribute 'cget_managed_ptr'" mean?
I'm trying to train a model with very standard HF code I've used before:
import os
from transformers import Trainer, TrainingArguments, AutoModelForCausalLM, AutoTokenizer
from datasets import ...
0
votes
0
answers
28
views
Why does loading llama2 take about 30 GB of GPU RAM?
I am trying to finetune llama2 on a self-created dataset. The text in the dataset is quite long, so I used rope_scaling=8 for llama initialization. I also used 4-bit and qlora to save memory. However, ...
0
votes
1
answer
139
views
Received server error (500) while deploying HuggingFace model on SageMaker
I've successfully fine-tuned a sentence-transformers model (all-MiniLM-L12-v2) on our data in SageMaker Studio, and the model was saved in S3 as a model.tar.gz.
I want to deploy this model for inference ...
0
votes
1
answer
45
views
BertTokenizer vocab_size remains unchanged after adding tokens
I am using the HuggingFace BertTokenizer and adding some tokens to it. Here is the code:
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('fnlp/bart-base-chinese')
print(...
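This matches documented behavior: vocab_size reports only the base vocabulary, while len(tokenizer) includes added tokens. A quick check:

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
    tokenizer.add_tokens(["newtoken1", "newtoken2"])
    print(tokenizer.vocab_size)  # unchanged: base vocabulary only
    print(len(tokenizer))        # base vocabulary + the tokens just added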
0
votes
0
answers
84
views
How do I run this model in HuggingFace from Nvidia and Mistral?
The model is:
nvidia/Mistral-NeMo-12B-Instruct
And the link in HuggingFace nvidia/Mistral-NeMo-12B-Instruct
Most model pages in HuggingFace have example Python code.
But this model page doesn't have ...
1
vote
0
answers
30
views
BPE tokenizer add_tokens overlap with trained tokens
I am training a BPE from scratch. I want the vocabulary to include certain tokens that might or might not exist in the training dataset.
from datasets import load_dataset
from tokenizers import models,...
-1
votes
1
answer
92
views
IndexError: list index out of range when trying to predict from a fine-tuned model using Hugging Face
I am trying to learn how to fine-tune a pretrained model and use it. This is my code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer
from ...
1
vote
0
answers
121
views
How to load LoRA weights for image classification model
I trained a model like below.
model_name = 'owkin/phikon'
model = AutoModelForImageClassification.from_pretrained(
model_name,
label2id=label2id,
id2label=id2label,
...
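Assuming the adapter was saved with PEFT's save_pretrained, the usual reload path is to rebuild the base classifier and wrap it with PeftModel; the adapter directory below is a placeholder:

    from peft import PeftModel
    from transformers import AutoModelForImageClassification

    # pass the same num_labels/label2id used at training time so the head shapes match
    base = AutoModelForImageClassification.from_pretrained("owkin/phikon")
    model = PeftModel.from_pretrained(base, "path/to/lora_adapter")  # placeholder path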
0
votes
0
answers
67
views
What is the difference, if any, between model.half() and model.to(dtype=torch.float16) in huggingface-transformers?
Example:
# pip install transformers
from transformers import AutoModelForTokenClassification, AutoTokenizer
# Load model
model_path = 'huawei-noah/TinyBERT_General_4L_312D'
model = ...
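For floating-point parameters and buffers the two calls are equivalent in PyTorch: Module.half() is shorthand for Module.to(torch.float16). A quick check:

    import torch
    from transformers import AutoModelForTokenClassification

    name = "huawei-noah/TinyBERT_General_4L_312D"
    m1 = AutoModelForTokenClassification.from_pretrained(name).half()
    m2 = AutoModelForTokenClassification.from_pretrained(name).to(dtype=torch.float16)
    print(next(m1.parameters()).dtype, next(m2.parameters()).dtype)  # torch.float16 twice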
-2
votes
1
answer
1k
views
I load a float32 Hugging Face model, cast it to float16, and save it. How can I load it as float16?
I load a huggingface-transformers float32 model, cast it to float16, and save it. How can I load it as float16?
Example:
# pip install transformers
from transformers import ...
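from_pretrained loads in float32 by default regardless of the dtype stored in the checkpoint, so the dtype has to be requested at load time. A sketch ("./model-fp16" is a placeholder path):

    import torch
    from transformers import AutoModelForTokenClassification

    model = AutoModelForTokenClassification.from_pretrained(
        "./model-fp16", torch_dtype=torch.float16)  # or torch_dtype="auto" to use the saved dtype
    print(next(model.parameters()).dtype)           # torch.float16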
0
votes
1
answer
399
views
Size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint - Huggingface PyTorch
I want to fine-tune an LLM, and I am able to do so successfully. But when I reload the model after saving, I get an error. Below is the code:
import argparse
import numpy as np
import torch
from datasets ...
1
vote
0
answers
143
views
HuggingFace pipeline doesn't use multiple GPUs
I made a RAG app that basically answers user questions based on provided data; it works fine on a single GPU. I want to deploy it on multiple GPUs (4 T4s), but I always get CUDA out of Memory ...
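A frequently suggested approach (an assumption about the setup, not the asker's code) is to let Accelerate shard one copy of the model across all visible GPUs with device_map="auto", rather than replicating it per device:

    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    model_id = "placeholder/model-id"  # whichever checkpoint the RAG app serves
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto")   # requires accelerate; splits layers across the 4 T4s
    gen = pipeline("text-generation", model=model, tokenizer=tokenizer)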
0
votes
1
answer
239
views
How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?
I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized
Unfortunately, CXR-BERT-...
0
votes
0
answers
101
views
Issue with Loading BLIP Processor and Model for Image Captioning
I'm experiencing an issue with loading the BLIP processor and model for image captioning using the Salesforce/blip-image-captioning-base model. My script seems to get stuck while attempting to load ...
-1
votes
1
answer
302
views
Optimizing an LLM Using DPO: nan Loss Values During Evaluation
I want to optimize an LLM based on DPO. When I train and evaluate the model, there are nan values in the evaluation results.
import torch
from transformers import AutoModelForCausalLM, ...
0
votes
0
answers
426
views
Llama-3-Instruct with Langchain keeps talking to itself
I am trying to eliminate the self-chattiness of the Llama-3-Instruct model in a LangChain implementation. I have tried several methods found on the internet, but no solution yet. Can anyone please ...
0
votes
1
answer
173
views
Why aren't my metrics showing in SageMaker (CloudWatch)?
I'm training an S-BERT model in SageMaker, using the Hugging Face library. I've followed the HF tutorials on how to define metrics to be tracked in the huggingface_estimator, yet when my model is done ...
0
votes
0
answers
54
views
MT-Bench evaluation of a model using pre-generated model answers
I want to find the MT-Bench score of an LLM (say EleutherAI/pythia-1b). I was able to run the command
python gen_model_answer.py --model-path EleutherAI/pythia-1b --model-id pythia-1b
to generate answers ...
1
vote
0
answers
122
views
Why do I get different responses from the Inference API widget on Hugging Face compared to running the API locally?
I recently discovered Hugging Face and have been trying to work with it. When using a text-to-text model, I ran into an issue: no matter which model, I would get a different response when trying the ...