All Questions

-2 votes
0 answers
13 views

Where can I find datasets for medical document analysis and disease diagnosis using NLP? [closed]

I'm working on a healthcare-related project where I need to analyze medical documents, extract specific values (e.g., creatinine, glucose levels, etc.), and generate personalized paragraphs for ...
jlassi Mohamed Hani
-1 votes
0 answers
27 views

How to split and spell-correct Arabic text without spaces into a list of words

I'm looking for a way to split and spell-correct Arabic text written without spaces, just like Microsoft Word does. For example: تشرابالقطط الحليب Expected: [تشرب، القطط، الحليب] If there is no ready ...
Ali Suliman
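A dictionary-driven dynamic-programming split is one common starting point for this; below is a minimal sketch where the three-word lexicon is a stand-in for a real Arabic wordlist, and spelling correction (e.g. تشراب → تشرب) would need a separate edit-distance step:

```python
# Minimal dictionary-based segmentation sketch; the tiny lexicon is an
# assumption, and spelling correction is out of scope here.
def segment(text, lexicon):
    """Dynamic-programming split of `text` into lexicon words."""
    best = [None] * (len(text) + 1)
    best[0] = []
    for i in range(1, len(text) + 1):
        for j in range(max(0, i - 12), i):  # cap word length at 12 chars
            if best[j] is not None and text[j:i] in lexicon:
                best[i] = best[j] + [text[j:i]]
                break
    return best[len(text)]

print(segment("تشربالقططالحليب", {"تشرب", "القطط", "الحليب"}))
# ['تشرب', 'القطط', 'الحليب']
```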
0 votes
0 answers
11 views

How to log only the current script file to W&B code panel immediately?

How can I ensure that only the current script file (e.g., train.py) is logged to the W&B Code panel when running a script, without logging the entire directory? Currently, I'm using: wandb.run....
Charlie Parker
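A minimal sketch of one way to do this with the documented wandb Run.log_code API; the project name and filename filter are placeholders:

```python
import wandb

# Restrict the Code panel to a single file via log_code's include_fn filter.
run = wandb.init(project="my-project")  # placeholder project name
run.log_code(root=".", include_fn=lambda path: path.endswith("train.py"))
```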
0 votes
0 answers
34 views

Encoder-decoder Transformer model generates a repetitive token as output in text summarization

I implemented a transformer encoder-decoder (Bert2Bert) for a text summarization task. In the training phase the loss decreases, but in the prediction phase it generates a repetitive token as output, for example [2,...
rasoul mohammadi
-1 votes
0 answers
23 views

GPT2: `register_forward_hook` and `output_hidden_state` gave different outputs of an intermediate layer

I want to output the 20th GPT2Block in a GPT2 medium model (24 GPT2Block blocks in total). I have used register_forward_hook and output_hidden_state separately, but they give different results. My ...
FeiYiZhaiMenRen
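A sketch of how the two mechanisms line up, assuming Hugging Face's convention that hidden_states[0] is the embedding output, so the 20th block's output is hidden_states[20]; an off-by-one here is a common explanation for the mismatch:

```python
import torch
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2-medium").eval()

captured = {}
def hook(module, inputs, output):
    captured["h"] = output[0]  # GPT2Block returns a tuple; [0] is the hidden state

handle = model.h[19].register_forward_hook(hook)  # index 19 = 20th block
ids = torch.tensor([[464, 3290, 318]])            # arbitrary token ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)
handle.remove()

# hidden_states[k] is the input to block k, so the 20th block's output
# should be hidden_states[20] and match the hook's capture:
print(torch.allclose(captured["h"], out.hidden_states[20], atol=1e-5))
```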
2 votes
1 answer
313 views

Alternative to device_map = "auto" in Huggingface Pretrained

I have a model that I was reading from huggingface using the following code: from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained(model_path) model = ...
Omar
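A minimal sketch of the explicit alternative, assuming the whole model fits on a single GPU; "gpt2" stands in for the asker's model_path:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "gpt2"  # stand-in for the asker's model path
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Explicit alternative to device_map="auto": load normally, then move the
# whole model to one device (it must fit in that GPU's memory).
model = AutoModelForCausalLM.from_pretrained(model_path).to("cuda:0")
```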
0 votes
0 answers
39 views

Simple Transformers ClassificationModel freezes during training, making the terminal unresponsive

I am attempting to train a number of DistilBERT (Simple Transformers) classification models using 10-fold cross-validation. The process works as intended but freezes during training (at different ...
GandalfTheAlien
0 votes
1 answer
71 views

Layer expects 2 input(s), but it received 1 input tensors

I am trying to build a model to predict post likes; the model takes text and a content type, which is a one-hot-encoded column. I have made a TensorFlow dataset, but when trying to fit the model I got this ...
Abdulaziz Snobrah
1 vote
1 answer
212 views

How does OpenAIEmbeddings() work? Is it creating a single vector of size 1536 for whole text corpus?

I'm working with the OpenAIEmbeddings() class from OpenAI, which uses the text-embedding-3-small model. According to the documentation, it generates a 1536-dimensional vector for any input text. ...
MichaelScott
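A small check of the one-vector-per-input behaviour, using the plain openai v1 client with the model named in the question (this bypasses the LangChain wrapper; OPENAI_API_KEY is assumed to be set):

```python
from openai import OpenAI

client = OpenAI()
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["first document", "second document"],  # one vector *per* input text
)
print(len(resp.data), len(resp.data[0].embedding))  # 2 1536
```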
0 votes
0 answers
39 views

The Impact of Pretraining on Fine-tuning and Inference

I am working on a binary prediction classification task, primarily focusing on fine-tuning a BERT model to learn the association between CVEs and CWEs. I've structured my task into three phases: first,...
joehu
-1 votes
1 answer
42 views

LSTM forecasting predicts zeros

I'm building an LSTM model to forecast the future total number of COVID-19 cases using the OWID dataset. I use a multivariate series of 6 columns, including the date column. The problem is I get all zero ...
Muhamed Khaled
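One frequent cause of all-zero forecasts is feeding unscaled case counts to the network; a hedged sketch of scaling before training and inverting afterwards, with random data standing in for the OWID columns:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical frame: the date column dropped, 5 numeric columns remaining.
values = np.random.rand(300, 5) * 1e6   # stand-in for raw case counts
scaler = MinMaxScaler()
scaled = scaler.fit_transform(values)   # train the LSTM on `scaled`

# After predicting, map back to the original scale:
preds_scaled = scaled[-10:]             # placeholder for model output
preds = scaler.inverse_transform(preds_scaled)
```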
0 votes
0 answers
77 views

I am implementing transformers from scratch in PyTorch and getting an error in the addition in the positional-encoding part of the output layer

I am implementing a transformer in PyTorch and getting an error when the positional encoding is applied in the decoder layer, that is, in op_positional_encoding = self.positional_encoding(op_embed) ...
user8916969
0 votes
1 answer
129 views

Enhance model performance in text classification task

I tried to build a model for a multi-label text classification task in Chinese, but the performance of the model is not good enough (about 60% accuracy), so I'm asking how to improve it. I ...
tardis blue
0 votes
1 answer
67 views

Why does loading AutoTokenizer take so much RAM?

I was measuring the RAM used by my script and was surprised that it takes about 300 MB, while the tokenizer file itself is about 9 MB. Why is that? I tried: from transformers import ...
Yanina Kolotilova
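A quick way to see where the memory goes is to measure the process RSS at each step; importing transformers itself typically accounts for much of it. A sketch using psutil, with "bert-base-uncased" as a stand-in tokenizer:

```python
import os
import psutil

proc = psutil.Process(os.getpid())
print(proc.memory_info().rss // 2**20, "MB at start")

from transformers import AutoTokenizer  # importing the library alone is costly
print(proc.memory_info().rss // 2**20, "MB after import")

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # example model
print(proc.memory_info().rss // 2**20, "MB after loading the tokenizer")
```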
0 votes
0 answers
19 views

Model prediction differed after one year with the same config and dataset

I tried a TrOCR Hugging Face model with training data (400k) and got around 83% accuracy on the test data. After a year, I trained the same model with the same config and dataset. Now the ...
MohanGandhi
0 votes
0 answers
47 views

Text to OpenPose and weird RNN bugs

I want to create an AI that generates OpenPose output from a textual description. For example, if the input is "a man running", the output would be like the image I provided. Is there any recommended model architecture ...
Peemmaphat Sripongsai
-3 votes
1 answer
137 views

Unsupervised sentiment analysis in NLP [closed]

How do I do sentiment analysis on unlabeled data? I've looked all over the internet (it suggested clustering algorithms) and it was not effective. How to do sentiment analysis from scratch on unlabeled data,...
M ASHWIN
0 votes
0 answers
37 views

AssertionError with no description in vLLM with the DeepSeekMath 7b model

I'm working with the DeepSeekMath 7b model to generate synthetic data using Python, but I'm encountering an AssertionError with no description alongside warnings related to token length exceeding the ...
Charlie Parker
0 votes
0 answers
32 views

Is it valid to separate text generation and logits computation in reinforcement learning?

I'm working with a reinforcement learning setup using the REINFORCE algorithm for a text summarization task in NLP. Originally, I developed a setup by extending Huggingface's .generate() function, ...
inverted_index
0 votes
0 answers
86 views

OSError: [Errno -9996] Invalid input device (no default output device) using PyAudio in Google Colab

I am working on realtime speech emotion recognition using an LSTM. I have used the pydub library to capture the live input speech. I am working with Google Colab. I am not sure whether I am using ...
Krdo Bhai plz
0 votes
1 answer
72 views

NEFTune receiving 0 training loss on Transformers

I'm trying to fine-tune my model with NEFTune. The model is based on the Turkish language, but I'm receiving zero training loss. I've tried another model, Turkish-GPT2, and there is no issue ...
Zephyrus
0 votes
1 answer
276 views

Save and load a Keras transformer model

I am still new to deep learning. Right now I'm following this Keras tutorial to make a translation model using a transformer (here is the link). Everything works just fine, but I have no idea how to save the model,...
Alvalen Shafel
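A minimal sketch of the usual pattern: register custom layers so load_model can rebuild them without a custom_objects dict. Scale is a toy stand-in for the tutorial's transformer layers, and the .keras format assumes a recent TF/Keras:

```python
import tensorflow as tf

# Registering the class makes it discoverable when the saved model is reloaded.
@tf.keras.utils.register_keras_serializable()
class Scale(tf.keras.layers.Layer):
    def call(self, x):
        return 2.0 * x

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), Scale()])
model.save("translator_demo.keras")
loaded = tf.keras.models.load_model("translator_demo.keras")
```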
0 votes
0 answers
9 views

RNN InvalidArgumentError

I am training an RNN model for speech-to-text. While training the RNN, I am getting this weird error called InvalidArgumentError: indices[28,0] = -1 is not in [0,169) model = keras.Sequential() # ...
Saish Sawant
0 votes
1 answer
66 views

How to import tensorflow-text correctly

I have a bundle of errors when importing tensorflow-text. I first installed the versions below, which were working correctly: !pip install tensorflow==2.8 but now it says `import ...
Zahid Rahman
0 votes
0 answers
79 views

How to use BERT to identify words unrelated to the content of a sentence and replace them with suitable words?

For this question, consider the example below: A: Jack continues to pray well. B: Jack continues to play well. In the first sentence, the word "pray" is a typographical error, and the ...
Pourya0022
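A hedged sketch of the standard fill-mask approach: mask the suspect word and let BERT rank contextual replacements (whether "play" outranks "pray" in practice depends on the model):

```python
from transformers import pipeline

# Mask the suspect word and inspect BERT's contextual candidates:
fill = pipeline("fill-mask", model="bert-base-uncased")
for cand in fill("Jack continues to [MASK] well.", top_k=3):
    print(cand["token_str"], round(cand["score"], 3))
```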
0 votes
0 answers
40 views

Unable to import trax

I was trying to import Trax in a Jupyter Notebook, and it gave me an error message. I am running Python 3.11 and installed Trax using the pip install trax command. All the ...
user13045135
2 votes
2 answers
1k views

RuntimeError: Failed to import transformers.integrations.bitsandbytes

I am trying to load an llm model in 4 bits precision. However, I got RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):...
Mayor
1 vote
0 answers
75 views

How to create an Embedding Model for Recipe Dataset Using Deep Metric Learning?

I want to create an embedding model for my own dataset of recipes. The goal is to create a neural network that takes each recipe, represented by a list of its ingredients, a list of recipe tags, and ...
blight etc
0 votes
0 answers
23 views

LSTM Masking with Attention Graph Execution Error

I've created an LSTM encoder-decoder for a seq2seq task, but the attention layer is not working as expected even though my architecture is logically sound. An example of one of my ...
Subhan Malik
0 votes
1 answer
278 views

How to remove layers in Huggingface's transformers GPT2 pre-trained models?

My code: from transformers import GPT2Config, GPT2Model from transformers import AutoTokenizer, AutoModelForMaskedLM, AutoModelForCausalLM model = AutoModelForCausalLM.from_pretrained("openai-...
dark kk
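A minimal sketch of one way to drop blocks: truncate the ModuleList that holds them and keep the config consistent (the cut at 6 layers is arbitrary):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Keep only the first 6 of the 12 transformer blocks, and update the config:
model.transformer.h = torch.nn.ModuleList(model.transformer.h[:6])
model.config.n_layer = 6
print(len(model.transformer.h))  # 6
```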
0 votes
1 answer
226 views

Upgrading accelerate while using Trainer class

I am facing an issue while using the Trainer class with PyTorch on Google Colab, as it demands accelerate>=0.21.0 even though I have updated all the requirements. Is there any alternative? "...
ishaan
0 votes
0 answers
32 views

Inferring BERT fill-mask with tflite

There is no error, but I'm just not sure whether what I am doing is correct. I have model.tflite for the fill-mask task with BERT. I expect to get a list of output probabilities for a list of strings. The model....
Muhammad Ikhwan Perwira
0 votes
1 answer
224 views

Error training transformer with QLoRA and PEFT

I am trying to fine-tune the Google Gemma model using PEFT and QLoRA. Yesterday I successfully fine-tuned it for 1 epoch just as a test. However, when I opened the notebook today and ran the cell that ...
eneko valero
0 votes
2 answers
1k views

TypeError: Exception encountered when calling layer 'embeddings' (type TFBertEmbeddings)

My model was fully working two weeks back, but now it's showing the following error: TypeError ...
Tanjim Taharat Aurpa
0 votes
0 answers
69 views

Why am I not able to load and use the spaCy pipeline below properly?

I used the available pipeline en_core_web_lg and added concise_concepts to the pipeline, with some dummy NER data. Immediately after that, if I try NER, it works. But when I try to save the whole thing ...
Insouciant
0 votes
0 answers
109 views

Unusual behaviour with PyTorch transformer decoder layer

I was tuning the decoder model code with the PyTorch transformer decoder layer, and I am getting a different loss even though I tried to match the implementation; the tokens also get repetitive when ...
harsh
0 votes
1 answer
248 views

How to get the SHAP value per class?

I want to get the SHAP value per class. I checked the tutorial and found the example below of how to do this. However, the code does not work because shap_value.shape is (10, None, 6). 10 is the number of ...
Nemo
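A version-hedged sketch: depending on the shap release, shap_values comes back as a list with one array per class or as a single (samples, features, classes) array; iris and a random forest are stand-ins for the asker's model:

```python
import shap
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

sv = shap.TreeExplainer(clf).shap_values(X)
# Normalize both return conventions to a list of per-class arrays:
per_class = sv if isinstance(sv, list) else [sv[:, :, k] for k in range(sv.shape[2])]
print(len(per_class), per_class[0].shape)  # 3 classes, each (150, 4)
```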
1 vote
0 answers
60 views

pytorch softmax outputs several values

I'm trying to calculate the softmax to get the probability of the text being real or not. When I use the OpenAI weights and load the checkpoint for their detector, I get the following results using the ...
Jesper Ezra
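For reference, softmax over a two-class logit row always yields two values that sum to 1; a tiny sketch with made-up detector logits:

```python
import torch

logits = torch.tensor([[2.0, -1.0]])   # made-up detector logits [real, fake]
probs = torch.softmax(logits, dim=-1)  # one probability per class, sums to 1
print(probs)                           # tensor([[0.9526, 0.0474]])
```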
-1 votes
1 answer
400 views

Questions about training LLMs on large text datasets for text generation from scratch

I made a fully custom made GPT in Jax (with Keras 3), using Tensorflow for the data pipeline. I've trained the model on the Shakespeare dataset and got good results (so no problem with the model). Now ...
qmzp
0 votes
0 answers
32 views

How to concatenate input to an LSTM

I have one input for the embedding layer and another input for the sentiment score feature. How can I combine these two inputs in the model and feed them into an LSTM? I'm a novice in deep learning, ...
Lynn
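One common pattern is to repeat the scalar feature along the time axis and concatenate it with the embeddings before the LSTM; a Keras sketch where every size is an assumption:

```python
import tensorflow as tf
from tensorflow.keras import layers

tokens = layers.Input(shape=(50,), name="tokens")
sentiment = layers.Input(shape=(1,), name="sentiment")

emb = layers.Embedding(input_dim=10000, output_dim=64)(tokens)  # (batch, 50, 64)
rep = layers.RepeatVector(50)(sentiment)                        # (batch, 50, 1)
merged = layers.Concatenate(axis=-1)([emb, rep])                # (batch, 50, 65)
out = layers.Dense(1, activation="sigmoid")(layers.LSTM(32)(merged))

model = tf.keras.Model([tokens, sentiment], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```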
0 votes
0 answers
17 views

Looking for a topic modelling approach that will give more suitable topics for automobile-related complaint data

I have user complaint data for an automobile firm. I want to map each complaint to a specific topic based on the context. For example: Complaint description: As per telephonic conversation with ...
Ankit Rawat
2 votes
1 answer
273 views

How to calculate the weighted sum of last 4 hidden layers using Roberta?

The table from this paper explains various approaches to obtain the embedding; I think these approaches are also applicable to Roberta. I'm trying to calculate the weighted sum of the last 4 ...
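A sketch of the usual recipe, assuming roberta-base and output_hidden_states=True; the weights here are arbitrary placeholders:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base").eval()

inputs = tok("an example sentence", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states has 13 tensors (embeddings + 12 layers); stack the last 4
# and take a weighted sum (these weights are an arbitrary assumption):
last4 = torch.stack(out.hidden_states[-4:])        # (4, batch, seq, 768)
w = torch.tensor([0.1, 0.2, 0.3, 0.4]).view(4, 1, 1, 1)
weighted = (w * last4).sum(dim=0)                  # (batch, seq, 768)
```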
0 votes
0 answers
70 views

Issues training line-based OCR by dividing data from the IAM Handwriting Dataset

I am a beginner in deep learning with theoretical knowledge but little practical experience. I am currently working on a final-year OCR project using the IAM Handwriting dataset and looking to train ...
Anish Khatiwada
1 vote
0 answers
72 views

What is the correct method to calculate contextualized embeddings using Roberta?

I'm trying to calculate contextualized embeddings using the RobertaModel. However, I'm not sure about the correct approach; I have tried two methods. The first method is as follows: from transformers ...
0 votes
0 answers
522 views

How do I get the training loss from the callback function of the Trainer in huggingface

This is my code; it can be run directly. from datasets import load_dataset from transformers import AutoTokenizer, DataCollatorWithPadding from transformers import TrainingArguments from ...
一个路过的程序员
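A minimal sketch using the documented TrainerCallback.on_log hook, which receives the running training loss in its logs dict:

```python
from transformers import TrainerCallback

class LossLogger(TrainerCallback):
    """Collects the training loss values that Trainer reports via on_log."""
    def __init__(self):
        self.losses = []

    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs is not None and "loss" in logs:
            self.losses.append(logs["loss"])

# usage sketch: trainer = Trainer(..., callbacks=[LossLogger()])
```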
0 votes
0 answers
44 views

Use nn.TransformerEncoder for context-free grammar parsing (sequence classification)

I want to use a transformer to do context-free grammar parsing (to classify whether a sequence is in the grammar or not). The inputs are sequences like "abbaba"; the output is 0 or 1. Here's ...
Qi.
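A self-contained sketch of such a classifier with nn.TransformerEncoder; all sizes are assumptions, and "abbaba" is encoded with a=0, b=1:

```python
import torch
import torch.nn as nn

class GrammarClassifier(nn.Module):
    """Encode a character sequence and classify it; sizes are assumptions."""
    def __init__(self, vocab_size=2, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 2)  # in-grammar vs. not

    def forward(self, x):
        h = self.encoder(self.embed(x))
        return self.head(h.mean(dim=1))    # mean-pool over the sequence

model = GrammarClassifier()
seq = torch.tensor([[0, 1, 1, 0, 1, 0]])   # "abbaba" with a=0, b=1
print(model(seq).shape)                    # torch.Size([1, 2])
```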
0 votes
0 answers
20 views

How to run *.pt file trained using OpenNMT-py in a different Python script?

I have trained on Hindi-English parallel corpora using OpenNMT-py; after running for some epochs I got checkpoints such as 10k.pt, 5k.pt, etc. Now I want to use that .pt file in a different Python ...
repleeka
0 votes
1 answer
1k views

OutOfMemoryError: CUDA out of memory in LLM

I have a list of texts and I need to send each text to a large language model (llama2-7b). However, I am getting a CUDA out of memory error. I am running on an A100 on Google Colab. Here is my try: path = "...
grey
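Two mitigations that often help: half-precision weights and generating under no_grad while releasing cached memory between texts. A sketch; the model id is an assumption since the asker's path is truncated:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "meta-llama/Llama-2-7b-hf"  # assumption: the asker's path is truncated
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.float16, device_map="auto"  # halves weight memory
)

with torch.no_grad():                    # don't keep activations for backprop
    for text in ["example one", "example two"]:
        ids = tok(text, return_tensors="pt").to(model.device)
        out = model.generate(**ids, max_new_tokens=64)
        del ids, out
        torch.cuda.empty_cache()         # release cached blocks between texts
```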
1 vote
0 answers
49 views

How to install swarms; AssertionError: Error: Could not open 'optimum/version.py' due [Errno 2] No such file or directory: 'optimum/version.py'

I'm trying to install swarms but I cannot, and I get this error: pip install swarms Collecting swarms Using cached swarms-2.7.7-py3-none-any.whl.metadata (15 kB) Collecting Pillow (from swarms) ...
Charlie Parker
0 votes
2 answers
255 views

Attention Layer changing Batch Size at inference

I have trained a seq-to-seq model using an encoder-decoder architecture. I'm trying to produce an output sequence given an input context, and I am trying to do that on a batch of input context vectors. I ...
Krishnang K Dalal
