4,790 questions
0
votes
0
answers
15
views
Migrating grafana-loki data from binary installation to kubernetes installation
I currently have a Loki installation done using the Loki binary download and running as a service on a VM.
I am creating a Kubernetes cluster for the monitoring tools we use today (Loki, Grafana, ...
0
votes
0
answers
21
views
Uptime check validation and monitoring in GCP network load balancer [closed]
One of the critical components in our setup is a regional internal proxy Network Load Balancer that handles the communication between our application servers and the backend services. The challenge I'...
-3
votes
0
answers
27
views
What parameters are required for monitoring river water quality using IoT technology? [closed]
I am designing a river water quality monitoring system using IoT technology. My goal is to measure parameters such as temperature, pH, and electrical conductivity using IoT sensors.
I would like to ...
-1
votes
0
answers
10
views
Set default priority for all the alerts of a service in PagerDuty
I am trying to set a default priority per service.
For instance, we have some business services that are critical, and any alerts coming from them should have priority P1.
In comparison, some other ...
0
votes
0
answers
36
views
Customizing Slack Notifications for Uptime Kuma
I have set up Uptime Kuma to monitor some of my website URLs. If any URL goes down, it sends a notification to my Slack channel. I have also customized the Slack message format.
How I Customized the ...
0
votes
1
answer
61
views
How can the description in a rules file use node_exporter data in Prometheus Alertmanager monitoring?
When an alert is triggered, I expect a push notification to be sent.
I would like to add more information in the description, such as node_filesystem_files_free & node_cpu_seconds_total & ...
0
votes
1
answer
51
views
Logs Not Flowing from Function App deployed under ASE to its own Application Insights
We are experiencing an issue where logs are not flowing from our Azure Function App to Application Insights. The Function App has been deployed under an App Service Environment (ASE), and the ...
0
votes
1
answer
28
views
Pynput not logging alphanumerical keys on Mac
I want to detect when I press the "space" bar key using python and pynput.
I tried their documentation and my old code, but nothing works anymore on macOS 15.1.1. Even running the script using ...
0
votes
0
answers
16
views
How to delete latest data in Zabbix
def lambda_handler(event, context):
    token = get_zabbix_token()
    failed = False
    # Define metric functions to iterate over
    metric_functions = {
        # "ELB": ...
0
votes
1
answer
66
views
How to stop monitoring process in Flask/Python?
In my project I have a USB monitor running, but I see it interferes with other functions, so I want to start and stop it when necessary. In my code the start function works, but the stop one does not. I can't ...
-1
votes
0
answers
22
views
AWS EKS cluster PVC data to nagios
How could I monitor my PVC usage and send reports to Nagios Core from AWS EKS? I need a lightweight solution; Prometheus is not an option.
I haven't tried anything yet; I am currently in the research phase ...
0
votes
0
answers
24
views
Custom instrumentation in Sentry browser profiling?
I need some help understanding the "Custom Instrumentation" feature in Sentry browser profiling. As per their documentation, "Browser Profiling" is currently in beta, and "...
0
votes
0
answers
19
views
Zabbix nodeAddress hardcode
This is my first look at a powerful tool, Zabbix. I set it up with Helm, and after checking some widgets I see an error block reporting an incorrect NodeAddress, which simply gives me a hardcoded IP instead of ...
0
votes
0
answers
19
views
How to produce a very fast https empty reply from nginx?
I have an nginx load balancer in front of two application servers running Apache and Tomcat.
I need to create a very fast HTTPS response server to let the client monitor the speed of its connection by measuring ...
0
votes
1
answer
53
views
Prometheus Thanos Receiver Error: content deadline exceeded
I'm trying to set up a remote-write Thanos receiver with Docker Compose on a VirtualBox VM, but for some reason the receiver API endpoint crashes any time Prometheus sends a POST request:
Error 500 msg:...
0
votes
1
answer
45
views
Why is there no SQL statement in MON$STATEMENTS.MON$SQL_TEXT for all active transactions in Firebird 2.5
I am trying to find the SQL statements for my active transactions and specifically for OAT.
I am running this query:
select MT.MON$TRANSACTION_ID,MT.MON$TIMESTAMP,MS.MON$STATEMENT_ID,MS.MON$SQL_TEXT
...
2
votes
2
answers
58
views
Check if SMTP server is online
How to check if SMTP is online and running?
func (p *sys_ping) smtp(host string) string {
    client, err := smtp.Dial(host)
    if err != nil {
        time.Sleep(time.Duration(p.delay) * time.Second)...
1
vote
0
answers
59
views
Opentelemetry context propagation in nestjs app
I ran into a problem configuring OTEL for NestJS. I want to add additional information to the span attributes and propagate it across the whole trace. I use baggage. My entry point is ...
0
votes
0
answers
7
views
How to join in alerts?
I'm trying to set alerts whenever there is a pod restart.
As part of this alert, it would be nice to display the restart reason as well.
I tried the following:
First, I tried the inner join and join by ...
0
votes
1
answer
83
views
Error when creating a new item prototype (zabbix proxmox)
I'm using Zabbix 7.0.4 and I'm using the template Proxmox VE by HTTP that has predefined item prototypes like these:
I want to create a new item prototype for Proxmox VE by HTTP that calculates the % ...
4
votes
4
answers
121
views
Find the statement currently running in my PL/pgSQL code block
Is there a way to figure out which statement from the block is currently running in Postgres? (Even extra extensions or tracing might be an option)
Below is a quick way to reproduce, but in real ...
0
votes
1
answer
60
views
How can I monitor application performance?
I've encountered an issue where I'm not sure which direction to take. My application is using GCP. I want to monitor performance, mostly of requests rather than infrastructure. As far as I've looked, I ...
0
votes
0
answers
70
views
Monitoring of a directory and get notified for any new events like file creation in C++
I have a directory named data in the root (/data), and it will contain multiple subdirectories such as test, prod, and staging, like below:
/data/prod/
/data/test/
/data/staging/
New files will ...
2
votes
1
answer
150
views
Google Cloud alerting doesn't allow creating an alert for its own metric [closed]
I have a Firebase API on GCP, and I would like to get alerts when my API returns errors for more than 10% of requests.
Something which GCP already has:
It's exactly what I need. The first problem is that ...
0
votes
0
answers
62
views
Pyroscope agent failing to load after changing version
I encountered the following issue:
I was trying to dockerize some monitoring applications and this happened:
This first version of the dockerfile works fine:
RUN apt-get update && apt-get ...
0
votes
1
answer
101
views
Snyk container monitor target name is not showing an IMAGE_TAG
I have a question about the Snyk CLI container monitor for Docker images. We have a Docker image, for example reponame/image_name:image_tag. We are monitoring this image from the CLI like snyk container ...
0
votes
0
answers
30
views
With web page crashes, how does one determine if the client browser is crashing vs. the server side?
We self-host an internal website for colleagues that renders lots of data/images.
We suspect the internal website (hosted in AWS) is underprovisioned or has other bad practices and is the root cause ...
0
votes
1
answer
19
views
AutoTrigger to launch an action
I have a PowerShell script that monitors a folder to detect the creation of new PDF files and automatically triggers an action to print them. However, I want to avoid having the script running ...
0
votes
1
answer
97
views
How can I control the number of events a namespace can save?
I want to control the number of events my tenants can create.
My TTL is 1 hour, and that's OK, but sometimes we have spikes in several namespaces, and I am worried about etcd getting flooded if it happens ...
0
votes
1
answer
162
views
Loki is taking too long to load items in Explorer and on Grafana dashboards
I have been experiencing an issue with Loki as a datasource in Grafana. When I check the explorer page in Grafana, it often takes a long time to load filenames, jobs, labels, and other elements. ...
0
votes
1
answer
275
views
Status: 500. Message: Get "http://localhost:3100/loki/api/v1/query_range? EOF (End of File) Loki Grafana
I'm experiencing issues retrieving log data from Loki in my Grafana dashboard. I've configured Loki to collect logs from various instances using Promtail, but I'm encountering an error when trying to ...
0
votes
0
answers
53
views
Monitoring Incoming HTTP requests For Resource Utilization on server
Hope everyone is doing fine.
I have been trying to find a way to monitor my web server (nginx 1.24.0) for per-request resource utilisation, such as CPU and memory usage for each HTTP request ...
0
votes
0
answers
132
views
Context deadline exceeded while REMOTE WRITE from Prometheus to vminsert
Before I get started: this issue is related to remote write, not to scraping metrics from a server.
I am scraping metrics from more than 100 servers.
But when I remote write them to vminsert, I am ...
0
votes
1
answer
43
views
Monitoring dashboard showing free space in marklogic [closed]
On the MarkLogic Monitoring Dashboard, the free space shows as full. According to the MarkLogic documentation, this refers to the amount of free space remaining on the disk after accounting for the ...
0
votes
0
answers
31
views
Should I use PromQL's increase function as an alert rule expression for a resource quota breach?
I have this Prometheus alert expression which tries to capture if/when we exceed the monthly quota of a service by using the increase function on a counter metric over a 30-day period.
sum(increase(...
0
votes
1
answer
54
views
How to implement a custom composite meter and natively integrate it into Micrometer and Spring Boot Actuator?
I want to implement a new type of meter (HeartbeatMeter) to track the liveness of a heartbeat from an upstream source.
Interface would be very simple, something like:
private interface ...
0
votes
0
answers
31
views
Unable to send logs from fluent-bit to influxdb (cpu, mem, etc. work perfectly)
Any attempt to send logs from a machine running Debian with Fluent Bit v3.1.6 to a server running InfluxDB produces:
...
[ warn] [output:influxdb:influxdb.0] skip send record, since no record available or ...
0
votes
0
answers
62
views
How to create graph in kibana to visualize apis and their response times with respect to timestamps?
I have logs containing the fields api and response_time, and every log entry has an associated timestamp indicating when it was generated. Now I want to generate a graph to view the timestamps on the X-axis ...
0
votes
0
answers
39
views
Sending histogram data to grafana from statsd library of python
We are using the statsd library in Python to send some metrics to Grafana. I am struggling to send histograms to Grafana.
We use statsd-exporter and the VictoriaMetrics data store for the metrics in Grafana.
...
1
vote
0
answers
125
views
Zabbix Proxy 6.4 Cannot send heartbeat message to 'server'
I have configured a Zabbix Proxy version 6.4 and I am trying to add it to my zabbix server, but I am getting 'last seen age' as never.
The installation process was as follows:
wget https://...
0
votes
0
answers
101
views
Get more than 1000 records from Matillion Task History API
I am using Matillion Task History API to fetch the history of tasks that have been running in the Matillion ETL instance. Task history results are subject to a limit of 1,000 records for a single task ...
0
votes
0
answers
25
views
netdata agent does not visualise the chart
Good afternoon. I am new to netdata, so I may still be confused about the terminology.
On my production server, the netdata agent is already running and collecting data, and visualisation is working, ...
0
votes
0
answers
22
views
How to get a sequential x-ray map that shows task progression through different aws services
I have an aws x-ray map that looks like this:
The state machine goes lambda => s3 => lambda, but the X-Ray map does not show this. This is okay when it is one state machine, but when you ...
0
votes
1
answer
42
views
How to route alerts based on the duration the alert condition has been true
I would like to set up notification for alerts from alertmanager such that I receive:
Slack notifications immediately
OpsGenie notifications if a given alert is firing for at least 1 hour
Email ...
0
votes
1
answer
28
views
AWS Lambda monitor dependencies
I have an AWS Lambda Python application, let's call it A. It calls multiple services; let's call one of them B. I would like to know how I can monitor the connection between A and B in CloudWatch, namely:
...
0
votes
0
answers
35
views
How to detect the changes on selected text in chrome extension
In my Chrome extension I store the user's selected text in local storage, but the problem is that when I close and reopen the popup, the selected text in the text area is gone.
How can I ...
0
votes
0
answers
28
views
Kafka kafka_consumergroup_lag metric keeps increasing when application no longer consumes the topic
We are using the Spring Boot Kafka Streams integration to consume topics in our app.
Prometheus, Grafana, and a Kafka exporter are used for monitoring.
The problem is that when we change the code to stop ...
2
votes
3
answers
157
views
Write out the peak memory utilization of a Pyspark Job on EMR to a file
We run a lot of Pyspark jobs on EMR. The pipeline executed is the same, but the inputs can wildly change the peak memory utilization, and that utilization is growing over time. I would like to ...
0
votes
1
answer
385
views
Error: Job for telegraf.service failed because the control process exited with error code
I have already installed InfluxDB on my Ubuntu instance. After that, I installed Telegraf. But when I checked the status of Telegraf, it said: Failed to Start Telegraf
See the image below for better ...
1
vote
1
answer
85
views
Prometheus Scrape Interval Causing Time Difference in Metrics Monitoring
Timing is a crucial component in our business, so I need robust monitoring of our servers' time. For this purpose, I have set up the Prometheus node-exporter on our servers to get the ...