I'm using varnish as a web cache. I'm running on RHEL 9.2. My cache is sized at 1GB. My varnish process is using 3.7G memory.

$ ps -p 1163  -o %mem,rss
%MEM   RSS
15.7 3886108

I'm running varnish with the -s malloc,1024M argument, which should set the cache size to 1GB. The full command, being invoked by systemd, is:

/usr/sbin/varnishd -a 127.6.6.6:7480 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,1024M -P /var/run/varnish.pid -T 127.6.6.6:62

That setting appears to be taking effect, as varnishstat shows:

$ varnishstat -1 | grep s0.g
SMA.s0.g_alloc                            22573          .   Allocations outstanding
SMA.s0.g_bytes                       1073105012          .   Bytes outstanding
SMA.s0.g_space                           636812          .   Bytes available

I find it hard to believe there is 2.7GB of overhead for a 1GB cache. What am I missing here?
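The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes ps reports RSS in KiB (as procps does) and uses the values pasted in this question; the variable names are made up for illustration.

```python
# Values copied from the ps and varnishstat output in this question.
RSS_KIB = 3886108      # ps RSS column, assumed to be in KiB
G_BYTES = 1073105012   # SMA.s0.g_bytes (bytes outstanding)
G_SPACE = 636812       # SMA.s0.g_space (bytes available)

GIB = 1024 ** 3

# g_bytes + g_space is the total size of the malloc storage.
storage_total = G_BYTES + G_SPACE

# Whatever RSS remains after subtracting the storage is everything else.
rss_bytes = RSS_KIB * 1024
overhead = rss_bytes - storage_total

print(f"storage size: {storage_total / GIB:.2f} GiB")
print(f"process RSS:  {rss_bytes / GIB:.2f} GiB")
print(f"unexplained:  {overhead / GIB:.2f} GiB")
```

The storage total comes out at exactly 1024 MiB, so the -s malloc,1024M limit is being honoured; the remaining roughly 2.7 GiB sits outside the cache storage.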

Restarting with systemctl restart varnish drops the usage temporarily (as it empties the cache, but also the non-cache memory disappears), but it gradually builds up again to settle at 3.7GB.

This is a test system under somewhat heavy load. Is it possible that this memory is genuinely in use serving requests? Below are the counters for the transient space. To my eye they looked uninteresting, but after reading some other threads I'm no longer sure.

SMA.Transient.c_req                     9720800         7.90 Allocator requests
SMA.Transient.c_fail                          0         0.00 Allocator failures
SMA.Transient.c_bytes              324751584558    263852.57 Bytes allocated
SMA.Transient.c_freed              324751584558    263852.57 Bytes freed
SMA.Transient.g_alloc                         0          .   Allocations outstanding
SMA.Transient.g_bytes                         0          .   Bytes outstanding
SMA.Transient.g_space                         0          .   Bytes available
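To check whether transient storage is holding memory at the moment of the snapshot, the cumulative bytes allocated and freed can be compared. A minimal sketch, assuming the varnishstat -1 line format shown above; parse_varnishstat is a hypothetical helper written for this example, not part of Varnish.

```python
def parse_varnishstat(text):
    """Map counter name -> integer value from `varnishstat -1` style lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect: NAME VALUE RATE DESCRIPTION...; skip anything else.
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


# The Transient counters pasted in the question.
SAMPLE = """\
SMA.Transient.c_req                     9720800         7.90 Allocator requests
SMA.Transient.c_fail                          0         0.00 Allocator failures
SMA.Transient.c_bytes              324751584558    263852.57 Bytes allocated
SMA.Transient.c_freed              324751584558    263852.57 Bytes freed
SMA.Transient.g_alloc                         0          .   Allocations outstanding
SMA.Transient.g_bytes                         0          .   Bytes outstanding
SMA.Transient.g_space                         0          .   Bytes available
"""

stats = parse_varnishstat(SAMPLE)

# Bytes allocated minus bytes freed should equal the bytes still outstanding.
held = stats["SMA.Transient.c_bytes"] - stats["SMA.Transient.c_freed"]
consistent = held == stats["SMA.Transient.g_bytes"]

print(f"transient bytes still held: {held}")
print(f"agrees with g_bytes: {consistent}")
```

Everything allocated from transient storage had been freed again when this snapshot was taken. A single snapshot says nothing about peaks between samples, though.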

I've also seen a few suggestions that varnish is designed for use with jemalloc. I do not have jemalloc installed, and it looks like varnish is using glibc's malloc.

EDIT: Full varnishstat -1 output can be found here: https://pastebin.com/3uZ1kpy6, and varnishadm param.show here: https://pastebin.com/7jrLKdMx. At the time of this stat dump, memory usage was at 3GB.

$ ps -p 917285 -o %mem,vsz,rss
%MEM    VSZ   RSS
12.8 7278920 3136620

1 Answer

malloc cache storage only represents a part of Varnish's memory consumption.

Transient storage

As you mentioned in your question, the transient storage can also consume memory. Transient storage is there to buffer or stream non-cacheable and short-lived content. It is unbounded by default, which means it could cause an OOM in case of spikes.

FYI: the shortlived runtime parameter defines what shortlived content is and defaults to 10 seconds.

Threads

There is also the runtime cost of Varnish. For every thread there is some memory allocation taking place.

The number of threads is limited by the thread_pool_min and thread_pool_max runtime parameters, which apply per thread pool. By default these values are a minimum of 100 and a maximum of 5000 threads per pool.

Since there are 2 thread pools by default (as defined by thread_pools), the total will range between 200 and 10000 threads if you rely on default values.

The workspace_thread runtime parameter defines the initial amount of memory consumed per thread. The default value is 2 KB.

More workspace memory consumed

Besides the workspace_thread memory consumption per thread, there is additional memory consumption for each thread depending on the type of workload it is processing.

If the thread is used to manage the TCP session, an additional 0.75k of memory is consumed by default, as defined by the workspace_session runtime parameter.

If the thread is used to handle an HTTP request, 96k of memory is consumed by default, as defined by the workspace_client runtime parameter.

And for backend requests the memory consumption is defined by the workspace_backend runtime parameter, which also defaults to 96k.
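To get a feel for the scale of these numbers, here is a rough back-of-the-envelope sketch. It uses the default values quoted in this answer and pessimistically assumes every thread holds every workspace at once. It ignores thread stacks, allocator overhead, and anything else varnishd allocates, so treat it as an order-of-magnitude estimate only. The function name and the two thread counts (2 pools × 100 and 2 pools × 5000) are illustrative.

```python
# Defaults as quoted in this answer, converted to bytes.
WORKSPACE_THREAD = 2 * 1024     # workspace_thread: 2 KB
WORKSPACE_SESSION = 768         # workspace_session: 0.75k
WORKSPACE_CLIENT = 96 * 1024    # workspace_client: 96k
WORKSPACE_BACKEND = 96 * 1024   # workspace_backend: 96k


def estimate_workspace_bytes(threads):
    """Pessimistic estimate: each thread holds all four workspaces at once."""
    per_thread = (
        WORKSPACE_THREAD
        + WORKSPACE_SESSION
        + WORKSPACE_CLIENT
        + WORKSPACE_BACKEND
    )
    return threads * per_thread


# 2 pools x thread_pool_min and 2 pools x thread_pool_max.
for threads in (200, 10000):
    mib = estimate_workspace_bytes(threads) / 1024 ** 2
    print(f"{threads:>5} threads: about {mib:.0f} MiB of workspace")
```

Under these assumptions, 200 threads account for roughly 38 MiB and 10000 threads for about 1.9 GiB, so workspace memory only becomes significant when the thread count climbs well above the minimum.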

Tuning

If you altered any of the runtime parameters I mentioned, it's going to have an impact on the total memory consumption.

You are free to tune these parameters accordingly. It's very much a balancing act between having enough threads to handle all the incoming requests and backend requests, and having enough memory to handle all the logic that you specified in your VCL file.

An alternative solution is the Massive Storage Engine that is part of Varnish Enterprise, the commercial version of Varnish. It has a feature called the Memory Governor, which has the ability to limit the total memory consumption of Varnish, and autoscale the size of the cache depending on the runtime memory it requires.

  • Thanks for the details. I'm not sure this explains what I'm seeing though. To provide some more info, I have 200 threads. "varnishstat" as above shows I'm using 1GB in malloc cache and 0 in transient space. So what could that extra 2.7GB be coming from? I noticed when I took the load off, the memory usage also remains high. Feels like I must have some config setting wrong.
    – Alex
    Commented May 3 at 8:42
  • I should also mention that this has changed drastically for us since upgrading from 4.0.5 to 6.6.2. Before the upgrade we were using approx 1.3GB in total to support our 1GB cache.
    – Alex
    Commented May 3 at 12:58
  • @Alex Your thread count of 200 is very low, so I see no reason why varnishd would be consuming so much memory. Could you attach the complete output of varnishstat -1 to your question, along with your complete VCL file? Be sure to capture the stats while the high memory consumption is taking place. Please also attach the output of varnishadm param.show so I can see which runtime parameters you're using. I'd like to get the full picture.
    Commented May 3 at 15:00
  • I've updated the initial post with the full varnishstat output and the varnishadm command. Many thanks for your continued advice.
    – Alex
    Commented May 3 at 17:04
  • Thanks for the suggestion. I've installed version 6.0.13 and jemalloc (I suspect it was required, since ldd shows varnishd links against it). Memory usage for a 1GB cache has gone down from ~3.7GB to ~2GB. That's an awesome improvement, and I really wonder what's going on with the RHEL9 version. I'll most likely raise a support ticket with them. Many thanks for your help with this.
    – Alex
    Commented May 7 at 16:43
