All Questions
Tagged with deep-learning nlp
907 questions
-2
votes
0
answers
13
views
Where can I find datasets for medical document analysis and disease diagnosis using NLP? [closed]
I'm working on a healthcare-related project where I need to analyze medical documents, extract specific values (e.g., creatinine, glucose levels, etc.), and generate personalized paragraphs for ...
-1
votes
0
answers
27
views
How to split and spell-correct Arabic text without spaces into a list of words
I'm looking for a way to split and spell-correct Arabic text written without spaces, as Microsoft Word does. For example:
تشرابالقطط الحليب
Expected:
[تشرب، القطط، الحليب]
If there is no ready ...
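One common way to approach the splitting part is dictionary-based segmentation with dynamic programming. The sketch below assumes a tiny hypothetical vocabulary (`VOCAB`) and a hypothetical helper `segment`. It only splits correctly spelled text and does not attempt spelling correction, which would need an extra step such as edit-distance matching against the vocabulary.

```python
# Hypothetical toy vocabulary; a real system would need a large Arabic word list.
VOCAB = {"تشرب", "القطط", "الحليب"}


def segment(text, vocab):
    """Split `text` into the fewest vocabulary words, or return None if impossible."""
    # best[i] holds the shortest known segmentation of text[:i], or None.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is not None and piece in vocab:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]


words = ["تشرب", "القطط", "الحليب"]
# Join the words without spaces, then recover them.
print(segment("".join(words), VOCAB))
```

Preferring the fewest words is only one possible tie-breaking rule; a frequency-weighted score would likely behave better on real text.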
0
votes
0
answers
11
views
How to log only the current script file to W&B code panel immediately?
How can I ensure that only the current script file (e.g., train.py) is logged to the W&B Code panel when running a script, without logging the entire directory?
Currently, I'm using:
wandb.run....
0
votes
0
answers
34
views
Encoder-decoder Transformer model generates a repetitive token as output in text summarization
I implemented a Transformer encoder-decoder (Bert2Bert) for a text summarization task. During training the loss decreases, but at prediction time the model generates a repetitive token as output, for example [2,...
-1
votes
0
answers
23
views
GPT2: `register_forward_hook` and `output_hidden_states` give different outputs for an intermediate layer
I want to get the output of the 20th GPT2Block in a GPT2 medium model (24 GPT2Block blocks in total). I have used register_forward_hook and output_hidden_states separately, but they give different results.
My ...
2
votes
1
answer
313
views
Alternative to device_map = "auto" in Huggingface Pretrained
I have a model that I was reading from huggingface using the following code:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = ...
0
votes
0
answers
39
views
Simple Transformers ClassificationModel freezes during training, making the terminal unresponsive
I am attempting to train a whole bunch of DistilBERT (Simple Transformers) Classification Models using 10-Fold validation. The process works as intended, but freezes during training (at different ...
0
votes
1
answer
71
views
Layer expects 2 input(s), but it received 1 input tensors
I am trying to build a model to predict the number of likes a post gets. The model takes the text and the content type, which is a one-hot encoded column.
I have made a TensorFlow dataset but when trying to fit the model I got this ...
1
vote
1
answer
212
views
How does OpenAIEmbeddings() work? Is it creating a single vector of size 1536 for whole text corpus?
I'm working with the OpenAIEmbeddings() class from LangChain, which uses the text-embedding-3-small model. According to the documentation, it generates a 1536-dimensional vector for any input text.
...
0
votes
0
answers
39
views
The Impact of Pretraining on Fine-tuning and Inference
I am working on a binary prediction classification task, primarily focusing on fine-tuning a BERT model to learn the association between CVEs and CWEs. I've structured my task into three phases: first,...
-1
votes
1
answer
42
views
LSTM forecasting predicts zeros
I'm building an LSTM model to forecast the future total number of COVID-19 cases using the OWID dataset.
I use a multivariate series of 6 columns, including the date column.
The problem is that I get all zero ...
0
votes
0
answers
77
views
Implementing a transformer from scratch in PyTorch: error in the addition step of the positional encoding in the output layer
I am implementing a transformer in PyTorch and getting an error when the positional encoding is applied in the decoder layer, that is, in the op_positional_encoding = self.positional_encoding(op_embed) ...
0
votes
1
answer
129
views
Enhance model performance in text classification task
I tried to build a model for a multi-label text classification task in Chinese, but its performance is not good enough (about 60% accuracy), and I am looking for help on how to improve it.
I ...
0
votes
1
answer
67
views
Why does loading AutoTokenizer take so much RAM?
I was measuring the RAM used by my script and was surprised that it takes about 300 MB, while the tokenizer file itself is about 9 MB. Why is that?
I tried:
from transformers import ...
0
votes
0
answers
19
views
Model predictions differ after one year with the same config and dataset
I trained a TrOCR model from Hugging Face on a training set of 400k samples and got around 83% accuracy on the test data. A year later, I trained the same model with the same config and dataset. Now the ...
0
votes
0
answers
47
views
Text to Openpose and Weird RNN bugs
I want to create an AI that generates OpenPose output from a textual description. For example, if the input is "a man running", the output would be like the image I provided. Is there any model architecture recommend ...
-3
votes
1
answer
137
views
Unsupervised Sentiment Analysis in NLP [closed]
How can I do sentiment analysis on unlabeled data? I've looked all over the internet (which suggested clustering algorithms), but that was not effective. How to do sentiment analysis from scratch on unlabeled data,...
0
votes
0
answers
37
views
AssertionError with no description in vLLM with DeepSeekMath 7b model
I'm working with the DeepSeekMath 7b model to generate synthetic data using Python, but I'm encountering an AssertionError with no description alongside warnings related to token length exceeding the ...
0
votes
0
answers
32
views
Is it valid to separate text generation and logits computation in reinforcement learning?
I'm working with a reinforcement learning setup using the REINFORCE algorithm for a text summarization task in NLP. Originally, I developed a setup by extending Huggingface's .generate() function, ...
0
votes
0
answers
86
views
OSError: [Errno -9996] Invalid input device (no default output device) using PyAudio in Google Colab
I am working on real-time speech emotion recognition using an LSTM. I have used the pydub library to capture the live input speech. I am working with Google Colab. I am not sure whether I am using ...
0
votes
1
answer
72
views
NefTune Receiving 0 Training Loss on Transformers
I'm trying to fine-tune my model with NEFTune. The model is based on the Turkish language, but I'm receiving zero training loss. I've tried another model, Turkish-GPT2, and there is no issue ...
0
votes
1
answer
276
views
Save and load a Keras transformer model
I am still new to deep learning. I am following this Keras tutorial to build a translation model using a transformer; here is the link. Everything works fine, but I have no idea how to save the model,...
0
votes
0
answers
9
views
RNN InvalidArgumentError
I am training an RNN model for speech-to-text.
While training the RNN, I get this invalid argument error: indices[28,0] = -1 is not in [0,169)
model = keras.Sequential()
# ...
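An error of this form usually indicates that an index of -1 reached a lookup (for example an embedding layer) that only accepts values in [0, 169). A minimal sketch for locating such values before training, assuming the indices are available as nested Python lists; `find_out_of_range` and the toy batch are hypothetical and not taken from the question:

```python
def find_out_of_range(batch, vocab_size):
    """Return (row, col, value) for every index outside [0, vocab_size)."""
    return [
        (r, c, value)
        for r, row in enumerate(batch)
        for c, value in enumerate(row)
        if not 0 <= value < vocab_size
    ]


# Hypothetical toy batch: the second row contains an invalid index of -1.
batch = [[3, 7, 168], [-1, 5, 2]]
print(find_out_of_range(batch, vocab_size=169))
```

If -1 is being used as a padding or "unknown" marker, remapping it to a valid reserved index is one possible fix, though the right choice depends on how the data was encoded.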
0
votes
1
answer
66
views
How to import tensorflow-text correctly
I get a bunch of errors when importing tensorflow-text. I was first using the versions below, which worked correctly.
!pip install tensorflow==2.8
But now it says this:
`import ...
0
votes
0
answers
79
views
How to use BERT to identify words unrelated to the content of a sentence and replace them with suitable words?
For this question, consider the example below:
A: Jack continues to pray well.
B: Jack continues to play well.
In the first sentence, the word "pray" is a typographical error and the ...
0
votes
0
answers
40
views
Unable to import trax
I was trying to import Trax on Jupyter Notebook, and it gave me an error message.
I am running Python 3.11.
I installed Trax using the "pip install Trax" command. All the ...
2
votes
2
answers
1k
views
RuntimeError: Failed to import transformers.integrations.bitsandbytes
I am trying to load an LLM in 4-bit precision. However, I got RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):...
1
vote
0
answers
75
views
How to create an Embedding Model for Recipe Dataset Using Deep Metric Learning?
I want to create an embedding model for my own dataset of recipes. The goal is to create a neural network that takes each recipe, represented by a list of its ingredients, a list of recipe tags, and ...
0
votes
0
answers
23
views
LSTM Masking with Attention Graph Execution Error
I've created an LSTM encoder-decoder for a Seq2Seq task, but the Attention layer is not working as expected even though my architecture is logically sound.
An example of one of my ...
0
votes
1
answer
278
views
How to remove layers in Huggingface's transformers GPT2 pre-trained models?
My code:
from transformers import GPT2Config, GPT2Model
from transformers import AutoTokenizer, AutoModelForMaskedLM, AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("openai-...
0
votes
1
answer
226
views
Upgrading accelerate while using Trainer class
I am facing an issue while using the Trainer class with PyTorch on Google Colab: it demands accelerate>=0.21.0 even though I have updated all the requirements. Is there any alternative?
"...
0
votes
0
answers
32
views
Inferring BERT fill-mask with tflite
There is no error, but I'm not sure whether what I am doing is correct. I have a model.tflite for the fill-mask task with BERT. I expect to be able to get the list of output probabilities as a list of strings. The model....
0
votes
1
answer
224
views
Error training transformer with QLoRA and Peft
I am trying to fine-tune Google's Gemma model using PEFT and QLoRA. Yesterday I successfully fine-tuned it for 1 epoch as a test. However, when I opened the notebook today and ran the cell that ...
0
votes
2
answers
1k
views
TypeError: Exception encountered when calling layer 'embeddings' (type TFBertEmbeddings)
My model was wholly workable two weeks back, but now it's showing the following error:
---------------------------------------------------------------------------
TypeError ...
0
votes
0
answers
69
views
Why am I not able to load and use the spaCy pipeline below properly?
I used the available pipeline en_core_web_lg and added concise_concepts to the pipeline, with some dummy NER data. Immediately after that, if I try NER, it works. But when I try to save the whole thing ...
0
votes
0
answers
109
views
Unusual behaviour with PyTorch transformer decoder layer
I was converting my decoder model code to use the PyTorch transformer decoder layer, and I am getting a different loss even though I tried to match the implementation, and the tokens are getting repetitive when ...
0
votes
1
answer
248
views
How to get the SHAP value per class?
I want to get the SHAP value per class. I checked the tutorial and found the example below showing how to do this. However, the code does not work because shap_value.shape is (10,None,6). 10 is the number of ...
1
vote
0
answers
60
views
PyTorch softmax outputs several values
I'm trying to calculate the softmax to get the probability of the text being real or not.
When I use the OpenAI weights and load the checkpoint for their detector, I get the following results using the ...
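For reference, softmax returns one probability per class rather than a single number, so a two-class detector yields two values that sum to 1. A minimal pure-Python sketch with made-up logits (the values and the class order are assumptions, not output from the detector in the question):

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, -1.0])  # hypothetical logits for two classes
print(probs)
```

Which of the two values corresponds to "real" depends on the label order the checkpoint was trained with, so that mapping has to be checked against the model's own documentation.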
-1
votes
1
answer
400
views
Questions about training LLMs on large text datasets for text generation from scratch
I built a fully custom GPT in JAX (with Keras 3), using TensorFlow for the data pipeline.
I've trained the model on the Shakespeare dataset and got good results (so no problem with the model).
Now ...
0
votes
0
answers
32
views
How to concatenate inputs to an LSTM
I have one input for the embedding layer and another input for the sentiment score feature.
How can I combine these two inputs in the model and feed them into an LSTM?
I'm a novice in deep learning, ...
0
votes
0
answers
17
views
Looking for a topic modelling approach that gives more suitable topics for automobile-related complaint data
I have user complaint data for an automobile firm. I want to map each complaint to a specific topic based on the context.
For example:
Complaint Description : As per telephonic conversation with ...
2
votes
1
answer
273
views
How to calculate the weighted sum of the last 4 hidden layers using RoBERTa?
The table from this paper explains various approaches to obtaining the embedding; I think these approaches are also applicable to RoBERTa:
I'm trying to calculate the weighted sum of last 4 ...
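The weighted-sum step itself can be sketched without any model. The example below assumes the per-layer hidden states have already been extracted as nested lists of shape [layers][tokens][dim] and that the four weights are chosen by the user. `weighted_sum_last_four` is a hypothetical helper, and the toy numbers are not real RoBERTa outputs.

```python
def weighted_sum_last_four(hidden_states, weights):
    """Combine the last four layers token by token using the given weights."""
    if len(weights) != 4:
        raise ValueError("expected exactly four weights")
    if len(hidden_states) < 4:
        raise ValueError("expected at least four layers")
    last_four = hidden_states[-4:]
    n_tokens = len(last_four[0])
    dim = len(last_four[0][0])
    return [
        [sum(w * layer[t][d] for w, layer in zip(weights, last_four)) for d in range(dim)]
        for t in range(n_tokens)
    ]


# Toy data: 5 layers, 2 tokens, 3 dimensions; layer k is filled with the value k.
hidden_states = [[[float(k)] * 3 for _ in range(2)] for k in range(5)]
weights = [0.1, 0.2, 0.3, 0.4]  # assumed weights, ordered from the earliest to the latest of the four layers
embeddings = weighted_sum_last_four(hidden_states, weights)
print(embeddings)
```

With real model outputs the same operation is normally done on tensors, and the weights may be fixed or learned; both choices are outside what the excerpt specifies.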
0
votes
0
answers
70
views
Issues in Training line-based OCR by dividing data on IAM Handwriting Dataset
I am a beginner in deep learning with theoretical knowledge but little practical experience. I am currently working on a final-year OCR project using the IAM Handwriting dataset and looking to train ...
1
vote
0
answers
72
views
What is the correct method to calculate contextualized embeddings using RoBERTa?
I'm trying to calculate contextualized embeddings using the RobertaModel. However, I'm not sure about the correct approach. I have tried two methods; the first is as follows:
from transformers ...
0
votes
0
answers
522
views
How do I get the training loss from the callback function of the Trainer in Hugging Face?
This is my code. It can be run directly.
from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding
from transformers import TrainingArguments
from ...
0
votes
0
answers
44
views
Use nn.TransformerEncoder for context-free grammar parsing (sequence classification)
I want to use a transformer to do context-free grammar parsing (to classify whether a sequence is in the grammar or not). The input are sequences like "abbaba", the output is 0 or 1. Here's ...
0
votes
0
answers
20
views
How to run *.pt file trained using OpenNMT-py in a different Python script?
I have trained a model on Hindi-English parallel corpora using OpenNMT-py. After training for some epochs, I got checkpoints such as 10k.pt, 5k.pt, etc.
Now I want to use that .pt file in a different Python ...
0
votes
1
answer
1k
views
OutOfMemoryError: CUDA out of memory in LLM
I have a list of texts and I need to send each text to a large language model (llama2-7b). However, I am getting a CUDA out of memory error. I am running on an A100 on Google Colab. Here is my attempt:
path = &...
1
vote
0
answers
49
views
How to install swarms; AssertionError: Error: Could not open 'optimum/version.py' due [Errno 2] No such file or directory: 'optimum/version.py'
I'm trying to install swarms but I cannot and get this error:
pip install swarms
Collecting swarms
Using cached swarms-2.7.7-py3-none-any.whl.metadata (15 kB)
Collecting Pillow (from swarms)
...
0
votes
2
answers
255
views
Attention Layer changing Batch Size at inference
I have trained seq-to-seq model using Encoder-Decoder architecture. I'm trying to produce an output sequence given an input context, and I am trying to do that on a batch of input context vectors. I ...