Windows Server Patching


Windows Server Patching: Best Practices

Table of Contents
 Audience
 Downtime
 Scheduling
 Change Management
 Compliance and Reporting
 Additional Notes
 Microsoft zero days/Out-of-band patches
 Windows Fail-over Cluster patching
 Credits

Applying security and cumulative updates in a Windows environment is essential to protect it from external attacks.
Updates also fix bugs identified in previous versions and improve stability and performance.

I am writing this article because I have seen many administrators struggle to carry out patching in their own
organization or in a customer environment. Most of the struggle is due not to technical challenges but to operational
ones, for example server downtime, scheduling, and change management.

I am taking this opportunity to share the best practices I have followed while performing Windows patching. They
may help you organize and structure the patching activity in your organization.

Audience
Windows Server administrators and anyone who performs server patching with any of the patching tools available on the market.

Important
This article covers the Windows Server environment and applies to operating system patching
(for example Windows Server 2008, 2012, and 2016). It does not cover Microsoft or third-party
applications installed on the servers, such as Microsoft Exchange, SharePoint, or SQL Server.
For application patching, separate testing needs to be carried out by an application specialist
before deployment to the production environment.

Downtime
Downtime is the key factor in the entire patching activity. Many of us have trouble getting downtime approved by the
business. Every organization handles it differently, based on business approval. A few methods that can be used are
listed below.

Following one of these methods reduces the effort spent on getting downtime approved by the business.
1. Follow the standard maintenance window. Some organizations already have a standard maintenance window for the
environment, and the same window can be used for patching. Example: a standard maintenance window for the DEV
and UAT environments on weekdays after business hours (10 PM to 3 AM), and on weekends for the production and
DR environments. Servers can be restarted within the standard maintenance window.
2. Or use a standard four-hour patching window that is approved only for patching activity. This window
should be agreed on by all business units. For example: second Friday (DEV), second Saturday (UAT), third
Saturday (Production), and fourth Saturday (DR).
3. Or, as another example, start with the first week after the second Tuesday: (Week 1) DEV, (Week 2) UAT,
(Week 3) Production, (Week 4) DR (the DR week can fall in the first week of the following month).
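
Methods 2 and 3 above both come down to finding the n-th weekday of a month, so the dates can be calculated rather than looked up. The following is a minimal Python sketch, not a definitive implementation. The function names are hypothetical, and for method 3 it assumes each environment is patched on the Saturday of its week, which the text above does not specify.

```python
from datetime import date, timedelta

# Python weekday numbers: Monday = 0 ... Sunday = 6
TUESDAY, FRIDAY, SATURDAY = 1, 4, 5


def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of a weekday in the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first occurrence of the weekday.
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def fixed_day_windows(year, month):
    """Method 2: fixed days of the month per environment."""
    return {
        "DEV": nth_weekday(year, month, FRIDAY, 2),
        "UAT": nth_weekday(year, month, SATURDAY, 2),
        "Production": nth_weekday(year, month, SATURDAY, 3),
        "DR": nth_weekday(year, month, SATURDAY, 4),
    }


def rolling_week_windows(year, month):
    """Method 3: one environment per week, counted from the second Tuesday."""
    patch_tuesday = nth_weekday(year, month, TUESDAY, 2)
    # Assumption: patching happens on the Saturday of each week.
    first_saturday = patch_tuesday + timedelta(days=4)
    environments = ["DEV", "UAT", "Production", "DR"]
    return {
        env: first_saturday + timedelta(weeks=i)
        for i, env in enumerate(environments)
    }
```

One thing this sketch makes visible: in a month that starts on a Saturday, the second Friday falls after the second Saturday, so method 2 as written would schedule DEV after UAT. It may be worth checking the calculated dates each month before publishing them.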

Scheduling
As mentioned above, getting downtime is always a challenge for administrators. There are a few points to consider
when preparing a schedule that lines up with the agreed downtime.

The scenario below is based on downtime method (3) from the “Downtime” section above.

1. It’s recommended to perform Windows patching monthly rather than quarterly.
2. List the servers that are in scope for patching. If your organization has segregated environments such as
DEV/UAT/Production/DR, prepare the schedule starting with DEV, then UAT, Production, and DR. With this
schedule you can patch all servers within a four-week span. If you don’t have a tool to prepare schedules,
you can prepare it in an Excel sheet and share it with all the stakeholders (server and application owners)
via email notification. This notification is important because it reminds them of the upcoming patching
activity so that they can do pre-work on the application side if needed. It also helps if they want to
exclude a server from patching because of a scheduled application release. (Excluding servers from patching
is not recommended unless there is a valid business need.) Even if you have excluded a server, make sure you
obtain the next agreed downtime window from the application team to cover the patching activity.
3. Microsoft releases patches on the second Tuesday of every month. After that, you can identify the patches
and get Security Team/CISO approval (the approval process may vary by organization). After approval, perform
initial testing on all versions of the Windows OS in use (Windows Server 2008, Windows Server 2012, Windows
Server 2016). This confirms that the OS boots and comes up without any issue, MMC snap-ins work as expected,
no errors are reported in the Windows event logs, all Windows services set to start automatically are running,
server utilization/performance is normal, etc.
4. As shown in the schedule below, you can start your first week of patching from the second Saturday, covering
the DEV servers, followed by the second week for UAT, the third for Production, and the fourth for DR servers.
Each application team needs to carry out application-level testing after DEV/UAT patching is complete, before
patching proceeds to the Production and DR environments. This helps avoid production impact.
5. It is also the administrator’s responsibility to notify all stakeholders (server and application owners) via
email after patching is complete, so that they can carry out further application-level testing to make sure the
applications hosted on the server are working as expected.
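
If no scheduling tool is available, the sheet described in point 2 can be generated from a server list instead of typed by hand. The sketch below is one possible approach in Python. The field names (`name`, `environment`, `owner`) and the function names are hypothetical, and the patch dates per environment are assumed to be supplied by whichever downtime method your organization has agreed on.

```python
import csv
import io

# Patching order used throughout this article.
ENVIRONMENT_ORDER = ["DEV", "UAT", "Production", "DR"]
FIELDS = ["server", "environment", "owner", "patch_date"]


def build_schedule(servers, windows):
    """Build schedule rows sorted by environment order, then server name.

    servers: list of dicts with 'name', 'environment' and 'owner' keys.
    windows: dict mapping environment name to a datetime.date.
    """
    rows = [
        {
            "server": s["name"],
            "environment": s["environment"],
            "owner": s["owner"],
            "patch_date": windows[s["environment"]].isoformat(),
        }
        for s in servers
    ]
    rows.sort(key=lambda r: (ENVIRONMENT_ORDER.index(r["environment"]), r["server"]))
    return rows


def schedule_to_csv(rows):
    """Render the schedule as CSV text that opens directly in Excel."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The resulting CSV text can be saved to a file and attached to the notification email. Keeping the owner in each row makes it easy to see who has to confirm application-level testing for each server.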

Change Management
Change management is also an important factor in patching. It creates awareness of upcoming changes in the
environment and also helps from an audit point of view. Every organization will have a defined process based on its
business needs. Using a Standard Change template is recommended, since patching is a mandatory activity performed
on a monthly basis. A standard template minimizes the change initiator’s work of drafting the change description,
change tasks, etc.

Compliance and Reporting
It’s very important to carry out a compliance check after patching is complete. Measuring the implemented work is
always beneficial to the organization from a security audit point of view.

It’s recommended to start the patching compliance check immediately after patching is complete. For example, if you
have four hours of downtime, run the compliance scan in the second or third hour so that you can re-patch servers
within the same downtime under the approved change. If you miss checking compliance within the same downtime window,
you may need to request new downtime from the business and raise a separate change ticket.

If your compliance mechanism provides compliance data only after 24 to 48 hours, it’s recommended to patch the missed
servers in the upcoming downtime windows. Do not keep a backlog for long; it affects the overall compliance at the
end of the monthly cycle.
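
The compliance check described above can be reduced to a set comparison: the patches approved for the cycle versus the patches each server reports as installed. The following Python sketch illustrates the idea under stated assumptions. How the installed-patch data is collected depends on your tool and is not shown, and the function name and input shapes are hypothetical.

```python
def compliance_report(required, installed_by_server):
    """Return (percent_compliant, missing_patches_by_server).

    required: set of patch IDs approved for this cycle.
    installed_by_server: dict mapping server name to a set of installed patch IDs.
    """
    missing = {}
    for server, installed in installed_by_server.items():
        gap = required - installed
        if gap:
            # Sorted so the re-patch list is stable between runs.
            missing[server] = sorted(gap)

    total = len(installed_by_server)
    compliant = total - len(missing)
    # An empty scope is treated as fully compliant (an assumption).
    percent = 100.0 * compliant / total if total else 100.0
    return percent, missing
```

Running this in the second or third hour of the window gives a list of servers to re-patch while the approved change is still open. The same percentage can be tracked at the end of the monthly cycle.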

Additional Notes
1. Make sure you perform a daily health check of the patching tool agent (the agent depends on the patching
tool you are using, for example Microsoft SCCM, HPSA, etc.). All agents should report as healthy, and any
agent that is not healthy should be remediated immediately. An unhealthy agent may fail to patch the server,
which will affect patching compliance.
2. If any issue is encountered with an application during DEV/UAT testing, make sure to exclude the production
and DR servers from patching until the issue is fixed on DEV/UAT.
3. If patches are uninstalled because of a reported issue, make sure the application team consults the
application vendor about a solution and compatibility, because servers can’t be left unpatched for long.
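
Notes 1 and 2 above can each be expressed as a simple check. The Python sketch below is illustrative only. It assumes "healthy" means the agent has reported within the last 24 hours, which is a placeholder threshold rather than a rule from any particular tool, and the data shapes and function names are hypothetical.

```python
from datetime import datetime, timedelta


def unhealthy_agents(last_seen, now, max_age=timedelta(hours=24)):
    """Return servers whose agent never reported or has been silent too long.

    last_seen: dict mapping server name to a datetime, or None if never seen.
    """
    return sorted(
        server
        for server, seen in last_seen.items()
        if seen is None or now - seen > max_age
    )


def environments_cleared_for_patching(issues):
    """Hold Production and DR while any DEV/UAT issue is still unresolved.

    issues: list of dicts with 'environment' and 'resolved' keys.
    """
    blocked = any(
        issue["environment"] in ("DEV", "UAT") and not issue["resolved"]
        for issue in issues
    )
    if blocked:
        return ["DEV", "UAT"]
    return ["DEV", "UAT", "Production", "DR"]
```

The first function produces the daily remediation list. The second can be checked before the Production week starts, so that an open DEV/UAT issue stops the rollout by default rather than relying on someone remembering to exclude the servers.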

Microsoft zero days/Out-of-band patches


Microsoft zero-day/out-of-band (OOB) patches can be deployed once the internal security team has completed the risk
assessment. Microsoft recommends deploying OOB patches as soon as possible to protect against external attack.

If the security team decides that the patches must be deployed within the next 48 hours, define the scope by
identifying the servers that run the software or product affected by the vulnerability. For example, if the
vulnerability is in Internet Explorer 9, identify how many servers in the environment are running IE9.
This data can be fetched with the compliance tool you use in your environment. If you are using Microsoft SCCM,
you can create a custom report with a custom query to fetch it. If you don’t have any tool, you will have to use
a scripting method. The last option is a manual method, but fetching this information manually will be tedious
if you have many servers.
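
As an illustration of the scripting option, the scoping step is a filter over a software inventory. The Python sketch below assumes the inventory has already been exported as a mapping from server name to installed products and versions. How that export is produced is not covered here, and the product string, version format, and function name are hypothetical.

```python
def servers_in_scope(inventory, product, version_prefix):
    """Return the servers that run an affected version of a product.

    inventory: dict mapping server name to a list of (product, version) tuples.
    version_prefix: e.g. "9." to match any 9.x version but not 90.x.
    """
    return sorted(
        server
        for server, software in inventory.items()
        if any(
            name == product and version.startswith(version_prefix)
            for name, version in software
        )
    )
```

The returned list is the priority scope for the out-of-band patch. Servers not in the list stay on the standard schedule.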

Assume that after the assessment you find 100 servers running IE9 out of 4000 servers. In this case, plan to patch
these 100 servers as a priority. Since the timeline is short, you may need to contact the server owners/application
owners to get explicit approval for a server reboot. After approval, the servers can be patched and rebooted after
business hours to minimize business impact. If the standard change management process cannot meet the requirement,
you may need to raise an emergency change request.
Apart from these 100 impacted servers, the rest of the servers can be patched as per your standard patching schedule.

Sometimes the installed antivirus software can mitigate the vulnerability. In this situation, make the call together
with the security team. As long as the installed antivirus is protecting your environment, you can patch the servers
in the regular patching schedule. Make sure you have confirmation from the antivirus vendor about the security coverage.

Check the compliance status post completion of patching.

Windows Fail-over Cluster patching


You can use the method below to patch a Windows Failover Cluster, unless you are using the Cluster-Aware Updating
(CAU) feature available in Windows Server 2012.

Consider a two-node Windows failover cluster running the File Server role.

1. Move all the running resources from Node1 to Node2.
2. After moving the resources to Node2, make sure they are all online and all the shares are accessible.
3. Install patches on Node1 and restart Node1.
4. Move all the resources from Node2 to Node1. Make sure they are online and all the shares are accessible.
5. Install patches on Node2 and restart Node2.
6. Re-balance all the resources onto their preferred cluster nodes. Check the cluster log to make sure everything
is green.
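
The steps above follow a repeating pattern: drain a node, verify, patch, restart, then move on. The Python sketch below only generates the ordered runbook as text. It does not call any cluster API, and the function name is hypothetical. For clusters with more than two nodes it assumes resources are moved to the next node in the list, which is a simplification that the two-node example above does not address.

```python
def rolling_patch_plan(nodes):
    """Return the ordered steps for patching cluster nodes one at a time."""
    if len(nodes) < 2:
        raise ValueError("a failover cluster needs at least two nodes")

    steps = []
    for index, node in enumerate(nodes):
        # Assumption: drain to the next node in the list, wrapping around.
        target = nodes[(index + 1) % len(nodes)]
        steps.append(f"Move all resources from {node} to {target}")
        steps.append(f"Verify resources are online on {target} and shares are accessible")
        steps.append(f"Install patches on {node} and restart {node}")

    steps.append("Re-balance resources onto preferred nodes and check the cluster log")
    return steps
```

Calling `rolling_patch_plan(["Node1", "Node2"])` yields the same sequence as the list above, with the move and verify parts of step 4 listed separately. A generated plan like this can be pasted into the change ticket as the task list.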

Microsoft recommends running all the cluster nodes on the same patch level.

You can automate the patching and restart if you take care of the resource-movement pre-work before the patch
deployment schedule.

I hope this article is helpful to you. Your comments and feedback are important.

Important
I suggest you consult your senior staff/Technical Lead/Technical Manager before you follow
any of these approaches. The best practices above are shared based on my experience.
